[
https://issues.apache.org/jira/browse/KAFKA-574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13490835#comment-13490835
]
Prashanth Menon commented on KAFKA-574:
---------------------------------------
So I ran the system test on an Ubuntu box and two of the test cases fail
consistently for me, both with and without the patch:
_test_case_name: test_case_0001
_test_class_name: ReplicaBasicTest
arg : bounce_broker : false
arg : broker_type : leader
arg : message_producing_free_time_sec : 15
arg : num_iteration : 1
arg : num_messages_to_produce_per_producer_call : 50
arg : num_partition : 1
arg : replica_factor : 3
arg : sleep_seconds_between_producer_calls : 1
validation_status:
Leader Election latency MAX : None
Leader Election latency MIN : None
Validate leader election successful : FAILED
_test_case_name: test_case_1
_test_class_name: ReplicaBasicTest
arg : bounce_broker : true
arg : broker_type : leader
arg : message_producing_free_time_sec : 15
arg : num_iteration : 2
arg : num_messages_to_produce_per_producer_call : 50
arg : num_partition : 2
arg : replica_factor : 3
arg : sleep_seconds_between_producer_calls : 1
validation_status:
Validate leader election successful : FAILED
Any idea if this is happening for everyone else? I'll investigate on my end to
see what's causing it.
> KafkaController unnecessarily reads leaderAndIsr info from ZK
> -------------------------------------------------------------
>
> Key: KAFKA-574
> URL: https://issues.apache.org/jira/browse/KAFKA-574
> Project: Kafka
> Issue Type: Bug
> Components: core
> Affects Versions: 0.8
> Reporter: Jun Rao
> Assignee: Prashanth Menon
> Priority: Blocker
> Labels: bugs
> Attachments: KAFKA-574-v1.patch, KAFKA-574-v2.patch,
> KAFKA-574-v3.patch
>
> Original Estimate: 48h
> Remaining Estimate: 48h
>
> KafkaController calls updateLeaderAndIsrCache() in onBrokerFailure(). This is
> unnecessary since in onBrokerFailure(), we will change the leader and isr
> anyway, so there is no need to first read that information from ZK. Latency is
> critical in onBrokerFailure() since it determines how quickly a leader can be
> brought online.
> Similarly, updateLeaderAndIsrCache() is called in onBrokerStartup()
> unnecessarily. In this case, the controller does not change the leader or the
> isr. It just needs to send the current leader and the isr info to the newly
> started broker. We already cache the leader in the controller. In theory, the
> isr can be changed by the leader at any time, so reading from ZK doesn't
> guarantee that we get the latest isr anyway. Instead, we just need the isr
> last selected by the controller (which can be cached together with the leader
> in the controller). If the leader epoch in a broker is equal to or larger than
> the epoch in the leaderAndIsr request, the broker can just ignore it.
> Otherwise, the leader and the isr selected by the controller should be used.
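The caching and epoch rule described in the issue above can be sketched roughly as follows. This is a minimal Python illustration, not Kafka's controller code: the names LeaderAndIsr, ControllerCache, Broker, record_decision, request_for_new_broker and handle_leader_and_isr are all hypothetical, and the sketch assumes the controller bumps the leader epoch on every leader change.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

TopicPartition = Tuple[str, int]


@dataclass(frozen=True)
class LeaderAndIsr:
    """Hypothetical record of the controller's last decision for a partition."""
    leader: int              # broker id of the leader
    leader_epoch: int        # assumed to increase on every leader change
    isr: Tuple[int, ...]     # isr last selected by the controller


class ControllerCache:
    """Hypothetical controller-side cache, so no ZK read is needed on startup."""

    def __init__(self) -> None:
        self._cache: Dict[TopicPartition, LeaderAndIsr] = {}

    def record_decision(self, tp: TopicPartition, info: LeaderAndIsr) -> None:
        # Called whenever the controller itself picks a leader and isr.
        self._cache[tp] = info

    def request_for_new_broker(self, tp: TopicPartition) -> LeaderAndIsr:
        # Served from memory; the isr may be stale, which the epoch check tolerates.
        return self._cache[tp]


class Broker:
    """Hypothetical broker-side handling of a leaderAndIsr request."""

    def __init__(self) -> None:
        self.state: Dict[TopicPartition, LeaderAndIsr] = {}

    def handle_leader_and_isr(self, tp: TopicPartition, request: LeaderAndIsr) -> bool:
        current: Optional[LeaderAndIsr] = self.state.get(tp)
        # Ignore the request if the local epoch is equal to or larger than the request's.
        if current is not None and current.leader_epoch >= request.leader_epoch:
            return False
        # Otherwise adopt the leader and isr selected by the controller.
        self.state[tp] = request
        return True
```

With this shape, a request that carries an epoch the broker has already seen is dropped, so sending a cached (and possibly outdated) isr to a newly started broker does not override newer state.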