[ 
https://issues.apache.org/jira/browse/KAFKA-1120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16114911#comment-16114911
 ] 

Onur Karaman commented on KAFKA-1120:
-------------------------------------

[~wushujames] I think Jun's comments and the redesign doc in KAFKA-5027 are 
sort of saying the same thing. The broker-generation concept has two use cases, 
which were only implied:
1. the controller using broker generations to distinguish events from a broker 
across generations.
2. controller-to-broker requests should include the broker generation so that 
a broker can ignore requests that applied to its former generation.
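
To make the 2nd use case concrete, here is a minimal sketch of the broker-side 
check. The class and method names are hypothetical and not taken from the Kafka 
codebase. It only assumes that a generation is some long value (a czxid or a 
session id would both fit) and that a request counts as stale when its 
generation does not match the broker's current one.

```java
/** Hypothetical broker-side check: drop controller requests meant for an earlier generation. */
final class BrokerGenerationFence {
    // Generation the broker obtained when it last registered itself.
    private long currentGeneration;

    BrokerGenerationFence(long initialGeneration) {
        this.currentGeneration = initialGeneration;
    }

    /** Called after the broker re-registers and obtains a new generation. */
    synchronized void onReRegistered(long newGeneration) {
        currentGeneration = newGeneration;
    }

    /** Returns true only if the request was addressed to the current generation. */
    synchronized boolean shouldProcess(long requestGeneration) {
        // Equality rather than ordering, so the check does not rely on
        // generations being monotonically increasing.
        return requestGeneration == currentGeneration;
    }
}
```

A broker would call shouldProcess() with the generation carried in each 
controller request and ignore the request when it returns false.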

While I think czxids will work for the 1st use case, I don't think we can 
naively reuse czxid for the 2nd use case. The reason is a bit silly: 
zookeeper's CreateResponse only provides the path. It doesn't provide the 
created znode's Stat, so you have to do a separate lookup afterwards to find 
out the znode's czxid.

If we want to solve both use cases with the same approach, I think we have a 
couple of options:
1. maybe we can get away with using czxids by doing a multi-op when registering 
brokers to transactionally create a znode and read back the czxid of the znode 
it just created.
2. we can instead use the session id as the broker generation. The controller 
can infer the broker's generation by observing the broker znode's 
ephemeralOwner property. Brokers can determine their generation id by looking 
up the underlying zookeeper client's session id which is just 
ZooKeeper.getSessionId(). The ephemeralOwner of an ephemeral znode is the 
session id of the client that created it, which is why this would work.
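
As a rough illustration of option 2 applied to the 1st use case, the sketch 
below shows how a controller could notice a bounced broker even when the set of 
broker ids looks unchanged. It is plain Java with no ZooKeeper dependency. The 
class name and the snapshot map are hypothetical. In a real implementation the 
map values would presumably come from Stat.getEphemeralOwner() on each broker 
znode.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical controller-side view: broker id -> generation (here, the znode's ephemeralOwner). */
final class ControllerBrokerView {
    private final Map<Integer, Long> knownGenerations = new HashMap<>();

    /**
     * Compares a fresh snapshot with the last known one and returns the ids of
     * brokers that are new or that re-registered under a different generation.
     */
    Set<Integer> updateAndFindChanged(Map<Integer, Long> snapshot) {
        Set<Integer> changed = new TreeSet<>();
        for (Map.Entry<Integer, Long> entry : snapshot.entrySet()) {
            Long previous = knownGenerations.get(entry.getKey());
            // Same broker id but a different session id means the broker
            // de-registered and re-registered since the last snapshot.
            if (previous == null || !previous.equals(entry.getValue())) {
                changed.add(entry.getKey());
            }
        }
        // Remember the snapshot for the next comparison.
        knownGenerations.clear();
        knownGenerations.putAll(snapshot);
        return changed;
    }
}
```

With only broker ids to compare, a broker that de-registered and re-registered 
between two snapshots would look unchanged. Comparing generations as well makes 
that bounce visible.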

> Controller could miss a broker state change 
> --------------------------------------------
>
>                 Key: KAFKA-1120
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1120
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 0.8.1
>            Reporter: Jun Rao
>              Labels: reliability
>             Fix For: 1.0.0
>
>
> When the controller is in the middle of processing a task (e.g., preferred 
> leader election, broker change), it holds a controller lock. During this 
> time, a broker could have de-registered and re-registered itself in ZK. After 
> the controller finishes processing the current task, it will start processing 
> the logic in the broker change listener. However, it will see no broker 
> change and therefore won't do anything to the restarted broker. This broker 
> will be in a weird state since the controller doesn't inform it to become the 
> leader of any partition. Yet, the cached metadata in other brokers could 
> still list that broker as the leader for some partitions. Client requests 
> routed to that broker will then get a TopicOrPartitionNotExistException. This 
> broker will continue to be in this bad state until it's restarted again.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
