[
https://issues.apache.org/jira/browse/KAFKA-5502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16060037#comment-16060037
]
ASF GitHub Bot commented on KAFKA-5502:
---------------------------------------
Github user asfgit closed the pull request at:
https://github.com/apache/kafka/pull/3413
> read current brokers from zookeeper upon processing broker change
> -----------------------------------------------------------------
>
> Key: KAFKA-5502
> URL: https://issues.apache.org/jira/browse/KAFKA-5502
> Project: Kafka
> Issue Type: Sub-task
> Reporter: Onur Karaman
> Assignee: Onur Karaman
> Fix For: 0.11.0.0
>
>
> [~lindong]'s testing of the 0.11.0 release revealed a controller-side
> performance regression in clusters with many brokers and many partitions when
> bringing up many brokers simultaneously.
> The regression is caused by KAFKA-5028: a Watcher receives WatchedEvent
> notifications from the raw ZooKeeper client EventThread. A WatchedEvent only
> contains the following information:
> - KeeperState
> - EventType
> - path
> Note that it does not actually contain the current data or current set of
> children associated with the data/child change notification. It is up to the
> user to do this lookup to see the current data or set of children.
> ZkClient is itself a Watcher. When it receives a WatchedEvent, it puts a
> ZkEvent into its own queue which its own ZkEventThread processes. Users of
> ZkClient interact with these notifications through listeners
> (IZkDataListener, IZkChildListener). IZkDataListener actually expects as
> input the current data of the watched znode, and likewise IZkChildListener
> actually expects as input the current set of children of the watched znode.
> In order to provide this information to the listeners, the ZkEventThread,
> when processing the ZkEvent in its queue, looks up the information (either
> the current data or current set of children) simultaneously sets up the next
> watch, and passes the result to the listener.
> The regression introduced in KAFKA-5028 is the time at which we lookup the
> information needed for the event processing.
> In the past, the lookup from the ZkEventThread during ZkEvent processing
> would be passed into the listener which is processed immediately after. For
> instance in ZkClient.fireChildChangedEvents:
> {code}
> List<String> children = getChildren(path);
> listener.handleChildChange(path, children);
> {code}
> Now, however, there are multiple listeners that pass information looked up by
> the ZkEventThread into a ControllerEvent which gets processed potentially
> much later. For instance in BrokerChangeListener:
> {code}
> class BrokerChangeListener(controller: KafkaController) extends
> IZkChildListener with Logging {
> override def handleChildChange(parentPath: String, currentChilds:
> java.util.List[String]): Unit = {
> import JavaConverters._
>
> controller.addToControllerEventQueue(controller.BrokerChange(currentChilds.asScala))
> }
> }
> {code}
> In terms of impact, this:
> - increases the odds of working with stale information by the time the
> ControllerEvent gets processed.
> - can cause the cluster to take a long time to stabilize if you bring up many
> brokers simultaneously.
> In terms of how to solve it:
> - (short term) just ignore the ZkClient's information lookup and repeat the
> lookup at the start of the ControllerEvent. This increases reads from 1 read
> per change to 2 reads per change. This is the approach taken in this ticket.
> - (long term) try to remove a queue. This basically means getting rid of
> ZkClient. This is likely the approach that will be taken in KAFKA-5501. Note
> that with KAFKA-5501, we can revert this short term fix so that we reduce the
> reads from 2 reads per change back down to 1 read per change.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)