[ 
https://issues.apache.org/jira/browse/KAFKA-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183382#comment-14183382
 ] 

Kyle Banker commented on KAFKA-1555:
------------------------------------

The new durability docs look great so far. Here are a couple more proposed 
edits to the patch:

EDIT #1:
min.insync.replicas: When a producer sets request.required.acks to -1, 
min.insync.replicas specifies the minimum number of replicas that must 
acknowledge a write for the write to be considered successful. If this minimum 
cannot be met, then the producer will raise an exception (either 
NotEnoughReplicas or NotEnoughReplicasAfterAppend).

When used together, min.insync.replicas and request.required.acks allow you to 
enforce greater durability guarantees. A typical scenario would be to create a 
topic with a replication factor of 3, set min.insync.replicas to 2, and produce 
with request.required.acks of -1. This will ensure that the producer raises an 
exception if a majority of replicas do not receive a write.
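
To make the interaction concrete, here is a rough sketch of the check described 
above. It is a toy model, not Kafka code: the function name and the exception 
class are made up for illustration, and it only covers a produce request sent 
with request.required.acks=-1.

```python
class NotEnoughReplicas(Exception):
    """Illustrative stand-in for the error the producer would see."""


def replicas_acking_write(isr_size, min_insync_replicas):
    """Model one produce request sent with request.required.acks=-1.

    Returns how many replicas hold the write when it is acknowledged,
    or raises NotEnoughReplicas if the ISR is below the configured minimum.
    """
    if isr_size < min_insync_replicas:
        raise NotEnoughReplicas(
            "ISR has %d replica(s), min.insync.replicas is %d"
            % (isr_size, min_insync_replicas)
        )
    # With acks=-1, every replica currently in the ISR must have the write.
    return isr_size


# Replication factor 3, min.insync.replicas=2:
print(replicas_acking_write(3, 2))  # all replicas in sync: write accepted
print(replicas_acking_write(2, 2))  # one replica out of sync: still accepted
try:
    replicas_acking_write(1, 2)     # only the leader left: write rejected
except NotEnoughReplicas as err:
    print("rejected:", err)
```

With a replication factor of 3 and min.insync.replicas of 2, the write is 
refused as soon as fewer than a majority of replicas are in sync.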

EDIT #2 (for request.required.acks):
  -1. The producer gets an acknowledgement after all in-sync replicas have 
received the data. This option provides the greatest level of durability. 
However, it does not completely eliminate the risk of message loss because the 
number of in-sync replicas may, in rare cases, shrink to 1. If you want to 
ensure that some minimum number of replicas (typically a majority) receive a 
write, then you must set the topic-level min.insync.replicas setting. Please 
read the Replication section of the design documentation for a more in-depth 
discussion.
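
The residual risk can be illustrated with a small simulation. This is only a 
sketch under simplifying assumptions (replica names, the function name, and the 
three outcome labels are hypothetical, and "failure" here means a replica 
permanently loses its data after the write is acknowledged):

```python
def produce_then_fail(isr, min_insync_replicas, failed):
    """Outcome of one acks=-1 write followed by data loss on some replicas.

    isr: names of replicas in sync at write time (leader included).
    failed: names of replicas that lose their data after the ack.
    Returns "rejected", "lost", or "durable".
    """
    if len(isr) < min_insync_replicas:
        # The producer gets an exception instead of an ack and can retry.
        return "rejected"
    # acks=-1: every in-sync replica holds the message when it is acked.
    holders = set(isr)
    survivors = holders - set(failed)
    return "durable" if survivors else "lost"


# ISR has shrunk to the leader A, and A's disk then fails.
print(produce_then_fail({"A"}, 1, {"A"}))       # lost
print(produce_then_fail({"A"}, 2, {"A"}))       # rejected
print(produce_then_fail({"A", "B"}, 2, {"A"}))  # durable
```

With the minimum left at 1, the write is acknowledged and then lost. With a 
minimum of 2, the same write is either refused up front or survives the loss of 
a single replica.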

> provide strong consistency with reasonable availability
> -------------------------------------------------------
>
>                 Key: KAFKA-1555
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1555
>             Project: Kafka
>          Issue Type: Improvement
>          Components: controller
>    Affects Versions: 0.8.1.1
>            Reporter: Jiang Wu
>            Assignee: Gwen Shapira
>             Fix For: 0.8.2
>
>         Attachments: KAFKA-1555-DOCS.0.patch, KAFKA-1555-DOCS.1.patch, 
> KAFKA-1555.0.patch, KAFKA-1555.1.patch, KAFKA-1555.2.patch, 
> KAFKA-1555.3.patch, KAFKA-1555.4.patch, KAFKA-1555.5.patch, 
> KAFKA-1555.5.patch, KAFKA-1555.6.patch, KAFKA-1555.8.patch, KAFKA-1555.9.patch
>
>
> In a mission critical application, we expect a kafka cluster with 3 brokers 
> can satisfy two requirements:
> 1. When 1 broker is down, no message loss or service blocking happens.
> 2. In worse cases, such as when two brokers are down, service can be blocked, 
> but no message loss happens.
> We found that the current kafka version (0.8.1.1) cannot achieve the 
> requirements due to three of its behaviors:
> 1. when choosing a new leader from 2 followers in ISR, the one with fewer 
> messages may be chosen as the leader.
> 2. even when replica.lag.max.messages=0, a follower can stay in ISR when it 
> has fewer messages than the leader.
> 3. ISR can contain only 1 broker, therefore acknowledged messages may be 
> stored in only 1 broker.
> The following is an analytical proof. 
> We consider a cluster with 3 brokers and a topic with 3 replicas, and assume 
> that at the beginning, all 3 replicas, leader A, followers B and C, are in 
> sync, i.e., they have the same messages and are all in ISR.
> According to the value of request.required.acks (acks for short), there are 
> the following cases.
> 1. acks=0, 1, 3. Obviously these settings do not satisfy the requirement.
> 2. acks=2. Producer sends a message m. It's acknowledged by A and B. At this 
> time, although C hasn't received m, C is still in ISR. If A is killed, C can 
> be elected as the new leader, and consumers will miss m.
> 3. acks=-1. B and C restart and are removed from ISR. Producer sends a 
> message m to A, and receives an acknowledgement. Disk failure happens in A 
> before B and C replicate m. Message m is lost.
> In summary, no existing configuration can satisfy the requirements.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
