[ 
https://issues.apache.org/jira/browse/KAFKA-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074460#comment-14074460
 ] 

Jun Rao commented on KAFKA-1555:
--------------------------------

Jiang,

If you use controlled shutdown, we will make sure that there is at least one 
other replica in ISR before shutting down the broker.

Two comments on your proposal.
1. In some corner cases, the replica with the most messages shouldn't be 
the leader. Consider the following example. Suppose that we have 2 replicas, A 
and B. At some point, the messages in the logs look like the following.

offset   A    B
0        m1   m1
1        m2
2        m3

Suppose that m1 is committed and m2 and m3 are not. At this point, A dies. B 
takes over as the new leader. B then accepts and commits message m4. The logs 
will now look like the following.

offset   A    B
0        m1   m1
1        m2   m4
2        m3

At this point, let's say both A and B go down and come up again. Now, you can't 
pick replica A as the new leader even though it has more messages. Otherwise, 
you will lose the committed message m4. Note that it's ok to lose messages m2 
and m3 since they were never committed. The correct way to pick the "longest" 
log is a bit more involved and would require knowing the leader generation id 
of each message. For details, see KAFKA-1211.
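
As a rough illustration of the example above, the sketch below tags each 
message with the generation id of the leader that wrote it and compares two 
ways of picking the new leader. The names (Entry, pick_by_length, 
pick_by_generation) and the comparison rule (newest generation of the last 
entry first, then log length, similar to the up-to-date check in Raft) are 
assumptions made for illustration, not the scheme described in KAFKA-1211.

```python
from typing import Dict, List, NamedTuple


class Entry(NamedTuple):
    generation: int  # hypothetical id of the leader generation that wrote the message
    message: str


# Logs after both A and B have gone down and come back up.
logs: Dict[str, List[Entry]] = {
    "A": [Entry(0, "m1"), Entry(0, "m2"), Entry(0, "m3")],  # m2, m3 never committed
    "B": [Entry(0, "m1"), Entry(1, "m4")],                  # m4 committed under B
}


def pick_by_length(logs: Dict[str, List[Entry]]) -> str:
    """Naive rule: the replica with the most messages wins."""
    return max(logs, key=lambda replica: len(logs[replica]))


def pick_by_generation(logs: Dict[str, List[Entry]]) -> str:
    """Assumed rule: newest generation of the last entry wins; ties go to the longer log."""
    def rank(replica: str):
        log = logs[replica]
        last_generation = log[-1].generation if log else -1
        return (last_generation, len(log))
    return max(logs, key=rank)


naive = pick_by_length(logs)
safer = pick_by_generation(logs)
print(naive, [e.message for e in logs[naive]])
print(safer, [e.message for e in logs[safer]])
```

Here the length-based rule picks A, whose log does not contain m4, while the 
generation-aware rule picks B and only drops the uncommitted m2 and m3.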

2. A second thing that you need to worry about is coordination. Suppose that we 
have the same scenario as above, where both A and B die. If B comes up first, we 
can actually elect B as the new leader without waiting for A. However, if you 
need to compare the lengths of the logs, you need to wait for A to come back, 
since A could have the longest log. Then the question is: if A never 
comes back, do you block writes forever, or do you risk losing committed 
messages?

ISR is our way of remembering the candidates for the new leader. Therefore, 
without coordination among replicas, we can easily figure out who can be the 
new leader.
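
The coordination difference can be sketched as follows. This is a simplified 
model, not Kafka's controller code: elect_from_isr and elect_longest_log are 
hypothetical helpers, and the ISR is assumed to have shrunk to {B} when A died 
in the scenario above.

```python
from typing import Dict, Optional, Set


def elect_from_isr(isr: Set[str], live: Set[str]) -> Optional[str]:
    """Any live replica recorded in the ISR can lead; no need to wait for others."""
    candidates = sorted(isr & live)
    return candidates[0] if candidates else None


def elect_longest_log(replicas: Set[str], live: Set[str],
                      log_length: Dict[str, int]) -> Optional[str]:
    """Comparing logs requires every replica to be back before a choice is made."""
    if not replicas <= live:
        return None  # blocked: a missing replica might hold the longest log
    return max(sorted(replicas), key=lambda r: log_length[r])


replicas = {"A", "B"}
isr = {"B"}                      # assumed: A was dropped from the ISR when it died
log_length = {"A": 3, "B": 2}

only_b_up = {"B"}
isr_choice = elect_from_isr(isr, only_b_up)
length_choice = elect_longest_log(replicas, only_b_up, log_length)
print(isr_choice, length_choice)
```

With only B back up, the ISR-based rule returns B immediately, while the 
length-based rule returns None and keeps the partition unavailable until A 
returns.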




> provide strong consistency with reasonable availability
> -------------------------------------------------------
>
>                 Key: KAFKA-1555
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1555
>             Project: Kafka
>          Issue Type: Improvement
>          Components: controller
>    Affects Versions: 0.8.1.1
>            Reporter: Jiang Wu
>            Assignee: Neha Narkhede
>
> In a mission critical application, we expect a kafka cluster with 3 brokers 
> can satisfy two requirements:
> 1. When 1 broker is down, no message loss or service blocking happens.
> 2. In worse cases, such as when two brokers are down, service can be blocked, 
> but no message loss happens.
> We found that the current kafka version (0.8.1.1) cannot achieve the 
> requirements due to three of its behaviors:
> 1. when choosing a new leader from 2 followers in ISR, the one with fewer 
> messages may be chosen as the leader.
> 2. even when replica.lag.max.messages=0, a follower can stay in ISR when it 
> has fewer messages than the leader.
> 3. ISR can contain only 1 broker, therefore acknowledged messages may be 
> stored in only 1 broker.
> The following is an analytical proof. 
> We consider a cluster with 3 brokers and a topic with 3 replicas, and assume 
> that at the beginning, all 3 replicas, leader A, followers B and C, are in 
> sync, i.e., they have the same messages and are all in ISR.
> According to the value of request.required.acks (acks for short), there are 
> the following cases.
> 1. acks=0, 1, 3. Obviously these settings do not satisfy the requirement.
> 2. acks=2. Producer sends a message m. It's acknowledged by A and B. At this 
> time, although C hasn't received m, C is still in ISR. If A is killed, C can 
> be elected as the new leader, and consumers will miss m.
> 3. acks=-1. B and C restart and are removed from ISR. Producer sends a 
> message m to A, and receives an acknowledgement. Disk failure happens in A 
> before B and C replicate m. Message m is lost.
> In summary, no existing configuration can satisfy the requirements.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
