[jira] [Commented] (KAFKA-1778) Create new re-elect controller admin function

2015-12-03 Thread Grant Henke (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15038651#comment-15038651
 ] 

Grant Henke commented on KAFKA-1778:


I broke this into its own task since it's independently tracked by [KIP-39: 
Pinning controller to broker 
|https://cwiki.apache.org/confluence/display/KAFKA/KIP-39+Pinning+controller+to+broker].

> Create new re-elect controller admin function
> -
>
> Key: KAFKA-1778
> URL: https://issues.apache.org/jira/browse/KAFKA-1778
> Project: Kafka
>  Issue Type: New Feature
>Reporter: Joe Stein
>Assignee: Abhishek Nigam
>
> kafka --controller --elect



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1778) Create new re-elect controller admin function

2015-08-11 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692289#comment-14692289
 ] 

Guozhang Wang commented on KAFKA-1778:
--

Chiming in late here. I think we are actually discussing two different, though 
somewhat overlapping, issues:

1. When a controller is in a bad state but not resigning, or if we just want to 
move controllers programmatically (i.e., not by deleting the znode or bouncing 
the broker), we want to trigger a re-election, and potentially force a 
particular broker to become the new controller during the re-election, so that 
the whole cluster can still move on without losing one broker.

2. For load-isolation scenarios, we want to start a broker while indicating 
whether or not it is a controller candidate. Controller elections will only be 
triggered among the candidates.

As the JIRA title suggests, I think we are targeting the first issue, for 
which the motivation is mainly operational convenience; hence the solution for 
the second issue may not really be preferred, since it still does not allow SREs 
to trigger a new election ([~charmalloc], correct me if I am wrong). 

 Create new re-elect controller admin function
 -

 Key: KAFKA-1778
 URL: https://issues.apache.org/jira/browse/KAFKA-1778
 Project: Kafka
  Issue Type: Sub-task
Reporter: Joe Stein
Assignee: Abhishek Nigam
 Fix For: 0.8.3


 kafka --controller --elect





[jira] [Commented] (KAFKA-1778) Create new re-elect controller admin function

2015-08-11 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692387#comment-14692387
 ] 

Guozhang Wang commented on KAFKA-1778:
--

Could you summarize your proposal from your 27/May/15 comment, so that people 
can then discuss its safety in corner cases and its efficiency? [~junrao] 
[~jjkoshy] [~charmalloc]



[jira] [Commented] (KAFKA-1778) Create new re-elect controller admin function

2015-08-11 Thread Gwen Shapira (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14682249#comment-14682249
 ] 

Gwen Shapira commented on KAFKA-1778:
-

Apparently I can't assign Reviewer if there is no patch, so [~guozhang], this 
is for you :)



[jira] [Commented] (KAFKA-1778) Create new re-elect controller admin function

2015-08-11 Thread Abhishek Nigam (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692370#comment-14692370
 ] 

Abhishek Nigam commented on KAFKA-1778:
---

Hi Guozhang,
I agree 100% with you. Can you tell me the best way to move forward
on this on the open source side?

-Abhishek






[jira] [Commented] (KAFKA-1778) Create new re-elect controller admin function

2015-08-11 Thread Abhishek Nigam (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692628#comment-14692628
 ] 

Abhishek Nigam commented on KAFKA-1778:
---

Thanks Guozhang,
I will write it up in a nice proposal.

-Abhishek






[jira] [Commented] (KAFKA-1778) Create new re-elect controller admin function

2015-05-30 Thread Joe Stein (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14565914#comment-14565914
 ] 

Joe Stein commented on KAFKA-1778:
--

Hey, sorry for the late reply. I have now seen, on a few dozen clusters, 
situations where the broker gets into a state where the controller is hung and 
the only recourse is to either delete the znode from Zookeeper (/controller) to 
force a re-election or shut down the broker. In the former case, I have seen 
one situation where the entire cluster went down. I am fairly certain this was 
because of the version of Zookeeper they were running (3.4.5); however, I have 
never tried to reproduce it. In the latter case, many folks don't want to shut 
down the broker because they are in high-traffic situations and doing so could 
be a lot worse than the controller not working... sometimes that changes and 
they shut the broker down so the controller can fail over and their partition 
reassignment can continue to the new brokers they just launched (as an example).

So, originally we were thinking of fixing this by having an admin call that 
could safely trigger another leader election. We have been finding, though, that 
just having the broker start without it ever being able to be the controller 
(can.be.controller = false) is preferable in *a lot* of cases. This way there 
are brokers that will never be the controller and some that could be, and one 
of the brokers that could be would become the controller.

~ Joestein



[jira] [Commented] (KAFKA-1778) Create new re-elect controller admin function

2015-05-29 Thread Abhishek Nigam (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14565837#comment-14565837
 ] 

Abhishek Nigam commented on KAFKA-1778:
---

I believe what you are suggesting is that we can have a group of brokers 
flagged as potential controllers, and all controller elections will be limited 
to that subset of brokers. Do I need to provide any failsafe in case none of 
the flagged brokers is able to participate in the required election and we are 
left controller-less?

-Abhishek



[jira] [Commented] (KAFKA-1778) Create new re-elect controller admin function

2015-05-27 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562053#comment-14562053
 ] 

Jun Rao commented on KAFKA-1778:


[~anigam], yes, you will need to configure multiple (e.g., 2 or 3) brokers 
eligible to be the controller in this approach. In this approach, the 
controller is guaranteed to be one of those eligible brokers. This seems a bit 
better for use case (b) since it allows one to completely isolate the 
controller from the other brokers that take produce/consume load. In your 
proposal, the controller may still be on a non-preferred broker.



[jira] [Commented] (KAFKA-1778) Create new re-elect controller admin function

2015-05-27 Thread Abhishek Nigam (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14561833#comment-14561833
 ] 

Abhishek Nigam commented on KAFKA-1778:
---

Joel,
What I was proposing was that all the brokers will watch the 
ready-to-serve-as-controller ephemeral node. In the scenario outlined, where the 
preferred controller dies after the election is over but before it can write to 
the /controller node, all the brokers will get this notification. Then there 
will be another round of elections in that case.

The controller is the one which pulls from the /admin/next_controller persistent 
zookeeper node and also keeps a watch on it. If it detects that this has been 
changed and the chosen broker id is different from its own, it will start the 
preferred controller move process.

On your question, "can we avoid the message from current controller to the 
preferred controller by having all brokers just watch the admin/next_controller 
znode?": this is definitely a better approach, where a zookeeper node can be 
used to achieve this messaging.

Jun,
In my opinion, static assignment suffers from some issues: what happens if the 
pre-determined controller goes down or runs into any other issue?









[jira] [Commented] (KAFKA-1778) Create new re-elect controller admin function

2015-05-21 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14554515#comment-14554515
 ] 

Jun Rao commented on KAFKA-1778:


[~anigam], for the two use cases that you mentioned, it seems that a simpler 
approach is what Joe said. Just configure a couple of brokers to be eligible 
for becoming the controller. Then, only those brokers will try to become the 
controller. I am not sure what the use case is to have an admin command to 
force the controller to move.



[jira] [Commented] (KAFKA-1778) Create new re-elect controller admin function

2015-05-20 Thread Joel Koshy (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14553429#comment-14553429
 ] 

Joel Koshy commented on KAFKA-1778:
---

I may be missing some detail, but (a), (b), (c) don't quite fit the scenario I 
was asking about:

Even if all brokers know that a specific broker is supposed to become the 
preferred controller, what happens if that preferred controller is about to 
become the controller but crashes before it can update the /controller path in 
ZooKeeper? No further zookeeper watches will be triggered.

Also, can we avoid the message from current controller to the preferred 
controller by having all brokers just watch the admin/next_controller znode?

Under changes in election code - (a) did you mean that brokers should watch 
admin/next_controller znode (and not ready-to-serve-as-controller znode)?



[jira] [Commented] (KAFKA-1778) Create new re-elect controller admin function

2015-05-19 Thread Joel Koshy (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550019#comment-14550019
 ] 

Joel Koshy commented on KAFKA-1778:
---

Jun - I think this is more for convenience/debugging and such. Right now there 
is no easy way to force a broker to become the controller.

Joe, since you filed this, you may be able to give some use case that you had 
in mind.

Abhishek, what would happen if the current controller A resigns and another 
broker B is pinned and about to become the new controller, but crashes before 
it can become the controller? Other brokers would not participate in controller 
re-election. So they should probably also watch the broker registrations, so 
that if the pinned controller goes down they know to proceed to participate in 
controller re-election anyway (to avoid a situation where you have no 
controller).
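
The fallback suggested above can be sketched as a small decision function. 
This is only an illustration in Python, not Kafka code: the function name, its 
parameters, and the idea of passing the live broker set in directly (rather 
than reading it from a watch on the broker registrations) are all assumptions 
made for the example.

```python
from typing import Optional, Set


def should_participate(my_id: int, pinned_id: Optional[int], live_brokers: Set[int]) -> bool:
    """Decide whether this broker should take part in a controller election.

    Hypothetical helper: `live_brokers` stands in for what a broker would
    learn by watching the broker registrations in ZooKeeper.
    """
    if pinned_id is None:
        return True  # nothing is pinned, so the election is open to everyone
    if pinned_id not in live_brokers:
        return True  # pinned broker is gone; take part so the cluster is not left controller-less
    return my_id == pinned_id  # pinned broker is alive; only it should stand


# Broker B (id 2) is pinned but has crashed, so broker 3 takes part anyway.
print(should_participate(my_id=3, pinned_id=2, live_brokers={1, 3}))
```

The point of the sketch is the second branch: without it, every broker would 
defer to a pinned broker that no longer exists.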



[jira] [Commented] (KAFKA-1778) Create new re-elect controller admin function

2015-05-19 Thread Abhishek Nigam (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550781#comment-14550781
 ] 

Abhishek Nigam commented on KAFKA-1778:
---

Jun,
The way I see it, pinning the controller gives us multiple benefits:
a) If SREs are doing rolling upgrades, they can set aside the broker on which 
the controller is pinned as the broker which they touch last.
This way there are only a limited number of controller moves, and we get 
more availability of the controller as a result, as opposed to an unpredictable 
number of controller moves.

b) More importantly, I think, if we do manual partition assignment we can set 
aside a broker to have very few partitions, and this would reduce the impact on 
the controller from serving too many produce and consume events. To summarize, 
it enables us to isolate the controller from the broker functionality, 
potentially enabling us to push the brokers harder. 

Joel,
You are spot on. Since all the brokers will now be watching for the preferred 
controller node, we can have the following situations:
a) All of them know about the preferred controller (zookeeper metadata has 
flowed to everyone). In this case the preferred controller would become the 
leader right away.

b) If only some of them know about the preferred controller, the rest will 
still participate in the election, and it is possible that somebody other than 
the preferred controller becomes the leader. What will happen in this case is 
that this new controller will eventually figure out (through a zookeeper watch) 
that the preferred controller is available to serve traffic; it will then 
resign and trigger another round of elections.
c) If none of them know about the preferred controller, the behavior will be 
similar to the above.
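
Situations (a) through (c) can be traced with a toy model. The sketch below is 
Python written for illustration only; `election_rounds`, its arguments, and the 
rule that the lowest broker id wins a contested election are assumptions (a 
real election is a race to create the /controller znode). It also assumes the 
preferred broker is alive and that every broker has learned about it by the 
second round.

```python
from typing import List, Set


def election_rounds(brokers: List[int], preferred: int, informed: Set[int]) -> List[int]:
    """Return the controller chosen in each election round.

    `informed` is the set of brokers that already know which broker is
    preferred when the first election starts. Assumes `preferred` is one
    of `brokers`.
    """
    # Round 1: a broker sits out only if it knows about the preferred
    # controller and is not that broker itself.
    candidates = [b for b in brokers if b == preferred or b not in informed]
    winner = min(candidates)  # assumption: lowest id stands in for "whoever wins the race"
    winners = [winner]
    if winner != preferred:
        # The interim controller's watch fires, it resigns, and by now all
        # brokers are informed, so the preferred broker wins the next round.
        winners.append(preferred)
    return winners


print(election_rounds([1, 2, 3], preferred=3, informed={1, 2, 3}))  # situation (a)
print(election_rounds([1, 2, 3], preferred=3, informed={2}))        # situation (b)
print(election_rounds([1, 2, 3], preferred=3, informed=set()))      # situation (c)
```

In (a) there is a single round; in (b) and (c) an interim controller may be 
elected first and then hand over.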

  



[jira] [Commented] (KAFKA-1778) Create new re-elect controller admin function

2015-05-18 Thread Jiangjie Qin (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14548343#comment-14548343
 ] 

Jiangjie Qin commented on KAFKA-1778:
-

Hey [~joestein], I'm a little bit confused here. In which cases would we want 
to exclude some brokers from becoming the controller?



[jira] [Commented] (KAFKA-1778) Create new re-elect controller admin function

2015-05-18 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14549117#comment-14549117
 ] 

Jun Rao commented on KAFKA-1778:


[~anigam], could you explain a bit the use case of pinning the controller to a 
broker?



[jira] [Commented] (KAFKA-1778) Create new re-elect controller admin function

2015-05-18 Thread Joe Stein (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14547838#comment-14547838
 ] 

Joe Stein commented on KAFKA-1778:
--

I was thinking that the broker, when starting up, would have another property: 
can.be.controller=false || can.be.controller=true

If a broker has this value set to true, then it can be the controller and the 
thread starts up for the KafkaController; otherwise it doesn't. It should be a 
few lines of change in KafkaServer plus a config mod.
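
The effect of such a flag on elections can be sketched as follows. This is a 
Python illustration, not the KafkaServer change itself: `can.be.controller` is 
the property proposed above and not an existing Kafka config, and `Broker`, 
`FakeZk`, and `elect` are hypothetical names. The in-memory `FakeZk` only 
models "create /controller if it does not exist yet".

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Broker:
    broker_id: int
    can_be_controller: bool  # mirrors the proposed can.be.controller property


class FakeZk:
    """In-memory stand-in for ZooKeeper; only supports create-if-absent."""

    def __init__(self) -> None:
        self.nodes: Dict[str, int] = {}

    def create_if_absent(self, path: str, value: int) -> bool:
        if path in self.nodes:
            return False  # somebody else already holds this znode
        self.nodes[path] = value
        return True


def elect(brokers: List[Broker], zk: FakeZk) -> Optional[int]:
    """Only eligible brokers try to claim /controller; the first writer wins."""
    for broker in brokers:
        if not broker.can_be_controller:
            continue  # this broker never starts its controller thread
        if zk.create_if_absent("/controller", broker.broker_id):
            break
    return zk.nodes.get("/controller")


zk = FakeZk()
cluster = [Broker(1, False), Broker(2, True), Broker(3, True)]
print(elect(cluster, zk))
```

If no broker is eligible, `elect` returns `None`, which corresponds to the 
controller-less cluster raised as a concern elsewhere in this thread.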



[jira] [Commented] (KAFKA-1778) Create new re-elect controller admin function

2015-03-19 Thread Abhishek Nigam (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14370385#comment-14370385
 ] 

Abhishek Nigam commented on KAFKA-1778:
---

I have a design for pinning the controller to a broker:
Suppose we want to pin the controller to broker id x.

Handling the admin request in the controller:
a) We send the admin request to the controller.
b) It will create a persistent zookeeper node /admin/next_controller with data 
x.
c) It will then pull the information about broker id x to see if it is up and 
running through the alive broker list.
d) If the broker is up and running it will start 3-way handshake with x.
e) It will start a watch on /admin/ready_to_serve_as_controller zookeeper node.
f) It will send a message to the broker to tell it that it should become ready 
to serve as next_controller.
g) Broker x on receiving this message will create ephemeral node 
/admin/ready_to_serve_as_controller.
h) Controller observes this change.
i) At this point the current controller will resign.

Changes in the election code:
a) All the brokers will pull from /admin/ready_to_serve_as_controller with a 
watch.
b) If the brokers find that this znode exists and their broker.id does not 
match the id specified in this ephemeral node, they will simply not participate 
in the leader election.
c) Broker x will rightfully take its place as the next controller.

c) The watches will be used in case broker x comes back to life.
d) In that case if I am the controller then I will resign.

Changes in the controller startup code:
a) Always pull from the /admin/next_controller for data changes as well as new 
data.
b) If there is any change, try to set up the next broker similar to what has 
been specified in handling the admin request in the controller.
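
The admin-request steps and the election change can be walked through with a 
small simulation. The Python below is a sketch under stated assumptions: 
`SimulatedCluster` and its methods are hypothetical, the znodes are a plain 
dict rather than ZooKeeper, the message exchange in steps (d) to (g) is 
collapsed into a direct write, and watches are replaced by straight-line calls. 
Only the znode paths come from the design above.

```python
from typing import Dict, Iterable, Optional

NEXT = "/admin/next_controller"
READY = "/admin/ready_to_serve_as_controller"
CONTROLLER = "/controller"


class SimulatedCluster:
    """Toy model of the pinning design; znodes are kept in a dict."""

    def __init__(self, alive_brokers: Iterable[int], controller_id: int) -> None:
        self.alive = set(alive_brokers)
        self.znodes: Dict[str, int] = {CONTROLLER: controller_id}

    def elect(self) -> Optional[int]:
        ready = self.znodes.get(READY)
        for broker_id in sorted(self.alive):
            # Election change (b): sit out if READY names a different broker.
            if ready is not None and broker_id != ready:
                continue
            self.znodes.setdefault(CONTROLLER, broker_id)  # first claim wins
        return self.znodes.get(CONTROLLER)

    def handle_pin_request(self, target: int) -> bool:
        self.znodes[NEXT] = target        # (b) persist the requested broker id
        if target not in self.alive:      # (c) check the alive broker list
            return False
        self.znodes[READY] = target       # (d)-(g) target signals it is ready
        del self.znodes[CONTROLLER]       # (h)-(i) current controller resigns
        self.elect()                      # only the target claims /controller
        return True


cluster = SimulatedCluster(alive_brokers=[1, 2, 3], controller_id=1)
cluster.handle_pin_request(3)
print(cluster.znodes[CONTROLLER])
```

The simulation does not model the crash window discussed elsewhere in this 
thread, where broker x dies after signalling readiness but before writing 
/controller.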



[jira] [Commented] (KAFKA-1778) Create new re-elect controller admin function

2015-03-19 Thread Mayuresh Gharat (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14370426#comment-14370426
 ] 

Mayuresh Gharat commented on KAFKA-1778:


I did not get:

c) Broker x will rightfully takes its place as the next controller.

c) The watches will be used in case broker x comes back to life.
d) In that case if I am the controller then I will resign.

Can you explain this?
