[ 
https://issues.apache.org/jira/browse/KAFKA-5502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16060037#comment-16060037
 ] 

ASF GitHub Bot commented on KAFKA-5502:
---------------------------------------

Github user asfgit closed the pull request at:

    https://github.com/apache/kafka/pull/3413


> read current brokers from zookeeper upon processing broker change
> -----------------------------------------------------------------
>
>                 Key: KAFKA-5502
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5502
>             Project: Kafka
>          Issue Type: Sub-task
>            Reporter: Onur Karaman
>            Assignee: Onur Karaman
>             Fix For: 0.11.0.0
>
>
> [~lindong]'s testing of the 0.11.0 release revealed a controller-side 
> performance regression in clusters with many brokers and many partitions when 
> bringing up many brokers simultaneously.
> The regression is caused by KAFKA-5028. As background, a Watcher receives 
> WatchedEvent notifications from the raw ZooKeeper client EventThread. A 
> WatchedEvent only contains the following information:
> - KeeperState
> - EventType
> - path
> Note that it does not actually contain the current data or current set of 
> children associated with the data/child change notification. It is up to the 
> user to do this lookup to see the current data or set of children.
> ZkClient is itself a Watcher. When it receives a WatchedEvent, it puts a 
> ZkEvent into its own queue which its own ZkEventThread processes. Users of 
> ZkClient interact with these notifications through listeners 
> (IZkDataListener, IZkChildListener). IZkDataListener actually expects as 
> input the current data of the watched znode, and likewise IZkChildListener 
> actually expects as input the current set of children of the watched znode. 
> In order to provide this information to the listeners, the ZkEventThread, 
> when processing the ZkEvent in its queue, looks up the information (either 
> the current data or current set of children), simultaneously sets up the next 
> watch, and passes the result to the listener.
> The regression introduced in KAFKA-5028 lies in the time at which we look up 
> the information needed for the event processing.
> In the past, the result of the lookup done by the ZkEventThread during 
> ZkEvent processing was passed straight into the listener, which ran 
> immediately afterwards. For instance, in ZkClient.fireChildChangedEvents:
> {code}
> List<String> children = getChildren(path);
> listener.handleChildChange(path, children);
> {code}
> Now, however, there are multiple listeners that pass information looked up by 
> the ZkEventThread into a ControllerEvent which gets processed potentially 
> much later. For instance in BrokerChangeListener:
> {code}
> class BrokerChangeListener(controller: KafkaController) extends IZkChildListener with Logging {
>   override def handleChildChange(parentPath: String, currentChilds: java.util.List[String]): Unit = {
>     import JavaConverters._
>     controller.addToControllerEventQueue(controller.BrokerChange(currentChilds.asScala))
>   }
> }
> {code}
> In terms of impact, this:
> - increases the odds of working with stale information by the time the 
> ControllerEvent gets processed.
> - can cause the cluster to take a long time to stabilize if you bring up many 
> brokers simultaneously.
> In terms of how to solve it:
> - (short term) just ignore the ZkClient's information lookup and repeat the 
> lookup at the start of the ControllerEvent. This increases reads from 1 read 
> per change to 2 reads per change. This is the approach taken in this ticket.
> - (long term) try to remove a queue. This basically means getting rid of 
> ZkClient. This is likely the approach that will be taken in KAFKA-5501. Note 
> that with KAFKA-5501, we can revert this short term fix so that we reduce the 
> reads from 2 reads per change back down to 1 read per change.
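
The difference between the regressed behaviour and the short-term fix described above can be sketched with a small in-memory model. This is only an illustration under stated assumptions, not the code from the pull request: `BrokerChangeSketch`, `liveBrokers`, `controllerQueue`, `enqueueWithSnapshot`, `enqueueWithReread` and `processNext` are hypothetical names, and a sorted set stands in for the children of the brokers znode. It shows that an event carrying the listener-side snapshot can be stale by the time it is processed, while re-reading at the start of processing sees the current brokers at the cost of a second read.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Supplier;

// Hypothetical in-memory model; none of these names are Kafka or ZkClient APIs.
public class BrokerChangeSketch {
    // Stand-in for the children of the brokers znode.
    final Set<String> liveBrokers = new TreeSet<>();
    // Stand-in for the controller event queue; each event yields the broker list it acts on.
    final Queue<Supplier<List<String>>> controllerQueue = new ArrayDeque<>();
    // Number of simulated ZooKeeper reads.
    int reads = 0;

    // Simulated read of the current children.
    List<String> getChildren() {
        reads++;
        return new ArrayList<>(liveBrokers);
    }

    // Regressed behaviour: the listener-side lookup result is carried inside the event.
    void enqueueWithSnapshot() {
        List<String> snapshot = getChildren();
        controllerQueue.add(() -> snapshot);
    }

    // Short-term fix: the listener-side lookup still happens (first read) but is ignored;
    // the event reads again when it is processed (second read).
    void enqueueWithReread() {
        getChildren();
        controllerQueue.add(this::getChildren);
    }

    // Process the oldest queued event and return the broker list it worked with.
    List<String> processNext() {
        return controllerQueue.remove().get();
    }

    public static void main(String[] args) {
        BrokerChangeSketch stale = new BrokerChangeSketch();
        stale.liveBrokers.add("1");
        stale.enqueueWithSnapshot();   // notification for broker 1
        stale.liveBrokers.add("2");    // more brokers register before the event is processed
        stale.liveBrokers.add("3");
        System.out.println("snapshot: " + stale.processNext() + ", reads=" + stale.reads);

        BrokerChangeSketch fresh = new BrokerChangeSketch();
        fresh.liveBrokers.add("1");
        fresh.enqueueWithReread();
        fresh.liveBrokers.add("2");
        fresh.liveBrokers.add("3");
        System.out.println("re-read:  " + fresh.processNext() + ", reads=" + fresh.reads);
    }
}
```

In this model the snapshot variant processes the event with only broker 1 after a single read, whereas the re-read variant processes it with brokers 1, 2 and 3 after two reads, matching the "1 read per change to 2 reads per change" trade-off noted in the description.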



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
