[jira] [Commented] (KAFKA-911) Bug in controlled shutdown logic in controller leads to controller not sending out some state change request

2013-07-03 Thread Sriram Subramanian (JIRA)

    [ https://issues.apache.org/jira/browse/KAFKA-911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13699500#comment-13699500 ]

Sriram Subramanian commented on KAFKA-911:
------------------------------------------

This has been fixed.

 Bug in controlled shutdown logic in controller leads to controller not
 sending out some state change request
 ---------------------------------------------------------------------

              Key: KAFKA-911
              URL: https://issues.apache.org/jira/browse/KAFKA-911
          Project: Kafka
       Issue Type: Bug
       Components: controller
 Affects Versions: 0.8
         Reporter: Neha Narkhede
         Assignee: Neha Narkhede
         Priority: Blocker
           Labels: kafka-0.8, p1
      Attachments: kafka-911-v1.patch, kafka-911-v2.patch


 The controlled shutdown logic in the controller first tries to move the
 leaders off the broker being shut down. Then it tries to remove the broker
 from the ISR list. During that operation, it does not synchronize on the
 controllerLock. This causes a race condition while dispatching data using the
 controller's channel manager.
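
 A minimal sketch of the locking pattern the description implies, written in
 Java rather than Kafka's Scala. Every name here (RequestBatch,
 ControllerSketch, shrinkIsrAndSend) is hypothetical and is not taken from the
 controller's code or from the attached patches. It only illustrates holding
 one lock across both queuing requests on a shared batch and sending them, so
 that another thread cannot interleave with a half-built batch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a shared, non-thread-safe batch of outgoing requests.
final class RequestBatch {
    private final List<String> pending = new ArrayList<>();

    void add(String request) {
        pending.add(request);
    }

    // Returns the queued requests and clears the batch.
    List<String> drain() {
        List<String> out = new ArrayList<>(pending);
        pending.clear();
        return out;
    }
}

// Hypothetical controller fragment; not Kafka's actual implementation.
final class ControllerSketch {
    private final Object controllerLock = new Object();
    private final RequestBatch batch = new RequestBatch();
    private final List<String> sent = new ArrayList<>();

    // Queue and send under one lock, so no other thread can drain or
    // append to the shared batch in the middle of this sequence.
    void shrinkIsrAndSend(int brokerId, List<String> partitions) {
        synchronized (controllerLock) {
            for (String partition : partitions) {
                batch.add("shrink-isr:" + partition + ":broker-" + brokerId);
            }
            sent.addAll(batch.drain());
        }
    }

    int sentCount() {
        synchronized (controllerLock) {
            return sent.size();
        }
    }
}
```

 Without the synchronized block, two threads sharing the batch could each
 drain requests the other had queued, which is one way some state change
 requests could fail to go out.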

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (KAFKA-911) Bug in controlled shutdown logic in controller leads to controller not sending out some state change request

2013-05-24 Thread Neha Narkhede (JIRA)

    [ https://issues.apache.org/jira/browse/KAFKA-911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13666445#comment-13666445 ]

Neha Narkhede commented on KAFKA-911:
-------------------------------------

You are right that we can send the reduced ISR request to the leader, but that
is independent of removing the shutting-down broker from the ISR in ZooKeeper.
I'm arguing that the ZooKeeper write is unnecessary. To handle the issue you
described, we can send a leader and ISR request, carrying the reduced ISR, to
the leader alone.
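
The proposal above can be sketched roughly as follows. This is an illustrative
Java fragment with hypothetical names (ReducedIsrSketch, LeaderIsrRequest,
forLeader), not the controller's real request types: it builds a single
request addressed to the leader only, with the shutting-down broker dropped
from the ISR, and performs no ZooKeeper write.

```java
import java.util.ArrayList;
import java.util.List;

final class ReducedIsrSketch {
    // Hypothetical message shape: the target broker plus the ISR it should adopt.
    static final class LeaderIsrRequest {
        final int targetBroker;
        final List<Integer> isr;

        LeaderIsrRequest(int targetBroker, List<Integer> isr) {
            this.targetBroker = targetBroker;
            this.isr = isr;
        }
    }

    // Build a request for the leader only, with the shutting-down broker removed.
    static LeaderIsrRequest forLeader(int leader, List<Integer> isr, int shuttingDown) {
        List<Integer> reduced = new ArrayList<>(isr);
        reduced.remove(Integer.valueOf(shuttingDown)); // remove by value, not by index
        return new LeaderIsrRequest(leader, reduced);
    }
}
```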



[jira] [Commented] (KAFKA-911) Bug in controlled shutdown logic in controller leads to controller not sending out some state change request

2013-05-24 Thread Joel Koshy (JIRA)

    [ https://issues.apache.org/jira/browse/KAFKA-911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13666532#comment-13666532 ]

Joel Koshy commented on KAFKA-911:
----------------------------------

I had to revisit the notes from KAFKA-340. I think this was touched upon
there, i.e., the fact that the current implementation's attempt to shrink the
ISR may be ineffective for partitions whose leadership has been moved from the
current broker:
https://issues.apache.org/jira/browse/KAFKA-340?focusedCommentId=13483478&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13483478

<quote>
> 3.4 What is the point of sending leader and isr request at the end of
> shutdownBroker, since the OfflineReplica state change would've taken care of
> that anyway. It seems like you just need to send the stop replica request
> with the delete partitions flag turned off, no?

I still need (as an optimization) to send the leader and isr request to the
leaders of all partitions that are present on the shutting down broker so it
can remove the shutting down broker from its inSyncReplicas cache (in
Partition.scala) so it no longer waits for acks from the shutting down broker
if a producer request's num-acks is set to -1. Otherwise, we have to wait for
the leader to organically shrink the ISR.

This also applies to partitions which are moved (i.e., partitions for which the
shutting down broker was the leader): the ControlledShutdownLeaderSelector
needs to send the updated leaderAndIsr request to the shutting down broker as
well (to tell it that it is no longer the leader) at which point it will start
up a replica fetcher and re-enter the ISR. So in fact, there is actually not
much point in removing the current leader from the ISR in the
ControlledShutdownLeaderSelector.selectLeader.
</quote>
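
To make the optimization in the quoted comment concrete, here is a small Java
sketch under stated assumptions. The class and method names (AckWaitSketch,
satisfied) are hypothetical and do not come from Partition.scala; the sketch
only models the rule described above, that with num-acks set to -1 the leader
waits for every replica in its cached ISR.

```java
import java.util.Set;

final class AckWaitSketch {
    // With num-acks = -1, treat a produce request as satisfied only once
    // every replica in the leader's cached ISR has acknowledged it.
    static boolean satisfied(Set<Integer> cachedIsr, Set<Integer> acked) {
        return acked.containsAll(cachedIsr);
    }
}
```

If broker 3 is shutting down but still in the cached ISR {1, 2, 3}, acks from
brokers 1 and 2 alone do not satisfy the request; once the leader's cache is
reduced to {1, 2}, the same acks do.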

and 

https://issues.apache.org/jira/browse/KAFKA-340?focusedCommentId=13484727&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13484727
(I don't think I actually filed that JIRA though.)

