CAP Theorem

The CAP theorem (Consistency, Availability, Partition tolerance) is a fundamental concept in distributed systems that describes the trade-offs among three key system properties. It states that a distributed system cannot guarantee all three properties at once: when a network partition occurs, the system has to give up either consistency or availability.

Here’s a detailed explanation of the CAP theorem with an example and a diagram:


The Three Properties

  1. Consistency (C):
    • Every read receives the most recent write or an error.
    • Clients see the same, up-to-date data no matter which node in the distributed system they contact.
    • Example: In a banking system, if you transfer money from one account to another, any subsequent read operation should reflect the updated balances immediately.
  2. Availability (A):
    • Every request (read or write) that reaches a working node receives a non-error response, with no guarantee that the response reflects the most recent write.
    • This means the system remains operational even if some nodes fail.
    • Example: If one node in a system goes down, the system should still handle requests using the remaining nodes.
  3. Partition Tolerance (P):
    • The system continues to operate despite network partitions (failures in communication between nodes).
    • This ensures the system is resilient to network issues and does not completely fail.
    • Example: If a network link between two nodes is broken, the system as a whole should keep operating instead of failing outright.

CAP Theorem Trade-Offs

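According to the CAP theorem, a distributed system can provide at most two of these properties at once. Because network partitions cannot be ruled out in a real distributed system, the practical choice is between consistency and availability while a partition lasts: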

  • CP (Consistency + Partition Tolerance):
    • Ensures data consistency across all nodes even during a partition, but some requests may be rejected or time out until the partition heals.
    • Example: HBase or MongoDB (with strict consistency).
  • AP (Availability + Partition Tolerance):
    • Ensures the system is always available even during a partition, but data consistency may be compromised.
    • Example: Cassandra or DynamoDB.
  • CA (Consistency + Availability):
    • Ensures consistency and availability only as long as there are no network partitions; it gives up partition tolerance.
    • Example: Traditional RDBMS systems like MySQL or PostgreSQL in a single-node setup, where there is no network between replicas to partition.
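The CP and AP choices above can be sketched with a toy model of two replicas. This is only an illustration under simplifying assumptions, not how HBase, Cassandra, or any real database is implemented: the names Replica, Unavailable, make_pair, set_link, and demo are hypothetical, replication is synchronous, and the "partition" is just a flag.

```python
class Unavailable(Exception):
    """Raised when a replica refuses a request to stay consistent."""


class Replica:
    def __init__(self, name, mode):
        self.name = name
        self.mode = mode      # "CP" or "AP" (hypothetical labels for this sketch)
        self.value = None
        self.peer = None
        self.link_up = True   # False while the simulated partition lasts

    def write(self, value):
        if not self.link_up and self.mode == "CP":
            # CP: refuse rather than let the two copies diverge.
            raise Unavailable(f"{self.name}: cannot reach peer, write refused")
        self.value = value
        if self.link_up:
            self.peer.value = value  # synchronous replication to the peer

    def read(self):
        if not self.link_up and self.mode == "CP":
            # CP: this copy might be stale, so refuse to answer.
            raise Unavailable(f"{self.name}: cannot reach peer, read refused")
        return self.value  # AP: answer with whatever this replica has


def make_pair(mode):
    a, b = Replica("a", mode), Replica("b", mode)
    a.peer, b.peer = b, a
    return a, b


def set_link(a, b, up):
    a.link_up = b.link_up = up


def demo(mode):
    a, b = make_pair(mode)
    a.write("v1")             # replicated to both nodes
    set_link(a, b, up=False)  # network partition begins
    try:
        a.write("v2")         # AP accepts locally, CP raises
        return b.read()
    except Unavailable as err:
        return f"error: {err}"


print(demo("AP"))  # v1  (b still answers, but with stale data)
print(demo("CP"))  # error: a: cannot reach peer, write refused
```

In AP mode both replicas keep answering, but they disagree until the link is restored. In CP mode the copies never diverge, at the cost of returning errors during the partition.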

Real-World Example

Scenario: Online Shopping System

  • Imagine a distributed system for an e-commerce platform.
  • A user adds a product to their cart. This operation is replicated across multiple servers to ensure availability and fault tolerance.
  1. Consistency:
    • If the system is consistent, all servers must have the same updated cart data before responding to the user.
    • During a network partition, the system may block updates until all servers synchronize, reducing availability.
  2. Availability:
    • The system remains available even if some servers are disconnected.
    • However, the cart data may differ between servers (inconsistency) during the partition.
  3. Partition Tolerance:
    • During a partition, the system must tolerate the failure and continue working.
    • It may prioritize either availability (AP) or consistency (CP), but not both.
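As a rough sketch of the availability-first choice in this scenario, the snippet below models each server's copy of one user's cart as a Python set. The item names, the cart_demo function, and the union-based reconciliation after the partition heals are assumptions made for illustration; the scenario above does not say how the servers resynchronize.

```python
def cart_demo():
    s1, s2 = set(), set()  # each server's copy of the same user's cart

    # Link is up: the add is applied on both servers.
    for cart in (s1, s2):
        cart.add("book")

    # Partition: each server accepts adds locally and cannot replicate them.
    s1.add("pen")
    s2.add("mug")
    diverged = s1 != s2    # the two copies now disagree

    # Partition heals: reconcile by union (an assumed policy for this sketch).
    merged = s1 | s2
    return diverged, merged


diverged, merged = cart_demo()
print(diverged)        # True
print(sorted(merged))  # ['book', 'mug', 'pen']
```

A union keeps every added item, which suits a cart where losing an item is worse than showing an extra one. A consistency-first design would instead have rejected the adds made during the partition.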

Diagram of CAP Theorem

Here is a conceptual representation of the CAP theorem:

          Partition Tolerance
                 /   \
                /     \
               /       \
          Consistency --- Availability

In practice:

  • CA systems work well without partitions but cannot keep both guarantees once one occurs.
  • CP systems ensure correctness but may reject requests during a partition.
  • AP systems prioritize availability but may return outdated or inconsistent data during a partition.

Detailed Example with CAP Trade-Offs

Banking System Example:

  • Consistency (C): All branches of the bank show the same account balance at all times.
  • Availability (A): ATMs and online banking always provide service.
  • Partition Tolerance (P): The system can handle communication breakdowns between branches.

Trade-Off:

  • During a network partition:
    • CP: The system refuses transactions it cannot confirm against the latest balance (availability is compromised) until the partition resolves, so account balances stay consistent.
    • AP: Transactions continue (availability is maintained), but balances may be inconsistent across branches.
    • CA: Without partition tolerance, the system cannot guarantee service during communication failures.
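To make the CP and AP outcomes concrete, here is a small sketch of two branches that each hold a local copy of one account with a balance of 100. The Branch class, the branch names, and the amounts are hypothetical, and a real banking system would use far more careful protocols; the point is only to show what each choice costs during a partition.

```python
class Branch:
    def __init__(self, mode, balance):
        self.mode = mode        # "CP" or "AP" (hypothetical labels)
        self.balance = balance  # this branch's local copy of the balance

    def withdraw(self, amount, link_up):
        """Return True if the withdrawal is accepted at this branch."""
        if not link_up and self.mode == "CP":
            return False  # cannot confirm the balance with the peer: refuse
        if amount > self.balance:
            return False  # insufficient funds according to the local copy
        self.balance -= amount
        return True


def withdrawn_during_partition(mode):
    """Each branch receives a withdrawal of 80 while the link is down."""
    north, south = Branch(mode, 100), Branch(mode, 100)
    requests = [(north, 80), (south, 80)]
    return sum(amt for branch, amt in requests
               if branch.withdraw(amt, link_up=False))


print(withdrawn_during_partition("CP"))  # 0: every request is refused
print(withdrawn_during_partition("AP"))  # 160: more than the 100 in the account
```

The CP run serves no one during the partition but never overdraws the account. The AP run serves both customers, and the branches have to resolve the 60 overdraft once they can communicate again.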
