from:"Hitoshi Mitake \\$JIRA\\$"

[jira] [Commented] (ZOOKEEPER-2172) Cluster crashes when reconfig a new node as a participant

2015-07-28 Thread Hitoshi Mitake (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14643954#comment-14643954
 ] 

Hitoshi Mitake commented on ZOOKEEPER-2172:
---

But the attached logs (at DEBUG level) don't contain any messages from 
QuorumPeer.updateServerState(). Perhaps the leader's shutdown process is 
stopping the QuorumPeer main thread?

 Cluster crashes when reconfig a new node as a participant
 -

 Key: ZOOKEEPER-2172
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2172
 Project: ZooKeeper
  Issue Type: Bug
  Components: leaderElection, quorum, server
Affects Versions: 3.5.0
 Environment: Ubuntu 12.04 + java 7
Reporter: Ziyou Wang
Priority: Critical
 Attachments: ZOOKEEPER-2172.patch, history.txt, node-1.log, 
 node-2.log, node-3.log, zoo-1.log, zoo-2-1.log, zoo-2-2.log, zoo-2-3.log, 
 zoo-2.log, zoo-2212-1.log, zoo-2212-2.log, zoo-2212-3.log, zoo-3-1.log, 
 zoo-3-2.log, zoo-3-3.log, zoo-3.log, zoo-4-1.log, zoo-4-2.log, zoo-4-3.log, 
 zoo.cfg.dynamic.1005d, zoo.cfg.dynamic.next, zookeeper-1.log, 
 zookeeper-1.out, zookeeper-2.log, zookeeper-2.out, zookeeper-3.log, 
 zookeeper-3.out


 The operations are quite simple: start three zk servers one by one, then 
 reconfig the cluster to add the new one as a participant. When I add the  
 third one, the zk cluster may enter a weird state and cannot recover.
  
   I found “2015-04-20 12:53:48,236 [myid:1] - INFO  [ProcessThread(sid:1 
 cport:-1)::PrepRequestProcessor@547] - Incremental reconfig” in node-1 log. 
 So the first node received the reconfig cmd at 12:53:48. Later, it logged 
 “2015-04-20  12:53:52,230 [myid:1] - ERROR 
 [LearnerHandler-/10.0.0.2:55890:LearnerHandler@580] - Unexpected exception 
 causing shutdown while sock still open” and “2015-04-20 12:53:52,231 [myid:1] 
 - WARN  [LearnerHandler-/10.0.0.2:55890:LearnerHandler@595] - *** GOODBYE 
  /10.0.0.2:55890 ”. From then on, the first node and second node 
 rejected all client connections and the third node didn’t join the cluster as 
 a participant. The whole cluster was down.
  
  When the problem happened, all three nodes just used the same dynamic 
 config file zoo.cfg.dynamic.1005d which only contained the first two 
 nodes. But there was another unused dynamic config file in node-1 directory 
 zoo.cfg.dynamic.next  which already contained three nodes.
  
  When I extended the waiting time between starting the third node and 
 reconfiguring the cluster, the problem didn’t show again. So it should be a 
 race condition problem.
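
For readers unfamiliar with dynamic configuration files such as zoo.cfg.dynamic.1005d: each ensemble member is described by one line of the form server.<id>=<host>:<port1>:<port2>[:role];<client port>. The sketch below only shows how such lines are put together. The host names and port numbers are made up for illustration and are not taken from the attached files, and DynamicConfigSketch is a hypothetical helper, not a ZooKeeper class.

```java
import java.util.ArrayList;
import java.util.List;

class DynamicConfigSketch {
    // Builds one member line:
    // server.<id>=<host>:<quorumPort>:<electionPort>:<role>;<clientPort>
    static String serverLine(int id, String host, int quorumPort,
                             int electionPort, String role, int clientPort) {
        return "server." + id + "=" + host + ":" + quorumPort + ":"
                + electionPort + ":" + role + ";" + clientPort;
    }

    public static void main(String[] args) {
        // Hypothetical three-node ensemble; names and ports are assumptions.
        List<String> members = new ArrayList<>();
        for (int id = 1; id <= 3; id++) {
            members.add(serverLine(id, "node-" + id, 2888, 3888, "participant", 2181));
        }
        for (String line : members) {
            System.out.println(line);
        }
    }
}
```

In the reported state, the file in use would list only the first two such lines, while zoo.cfg.dynamic.next would already list all three.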



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (ZOOKEEPER-2172) Cluster crashes when reconfig a new node as a participant

2015-07-28 Thread Hitoshi Mitake (JIRA)

[
https://issues.apache.org/jira/browse/ZOOKEEPER-2172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Hitoshi Mitake updated ZOOKEEPER-2172:
--
Attachment: ZOOKEEPER-2172.patch

Hi [~ziyouw],

I found a slightly strange code path, as follows:
1. At the tail of Leader.shutdown(), the leader tries to remove all learner
handlers inside synchronized (learners). The loop calls LearnerHandler.shutdown().
2. In LearnerHandler.shutdown(), leader.removeLearnerHandler() is called.
3. In Leader.removeLearnerHandler(), the Leader member learners is locked
again by synchronized.

It seems that the above sequence can cause a deadlock.

I removed synchronized (learners) from removeLearnerHandler() in the attached patch.
Could you test it in your environment?
# the targeted version is 3.5.0


[jira] [Commented] (ZOOKEEPER-2172) Cluster crashes when reconfig a new node as a participant

2015-07-28 Thread Hitoshi Mitake (JIRA)

[
https://issues.apache.org/jira/browse/ZOOKEEPER-2172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14643952#comment-14643952
]

Hitoshi Mitake commented on ZOOKEEPER-2172:
---

Sorry, synchronized is reentrant, so the patch is wrong. Please ignore
it.
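
To illustrate why the shutdown() / removeLearnerHandler() path cannot deadlock on its own, here is a minimal sketch of the same locking shape. It is not ZooKeeper code: ReentrantMonitorSketch and the string "handlers" are hypothetical stand-ins. The only point is that a thread that already holds a monitor may enter another synchronized block on the same object.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ReentrantMonitorSketch {
    // Hypothetical stand-in for the leader's collection of learner handlers.
    private final Set<String> learners = new HashSet<>();

    ReentrantMonitorSketch() {
        learners.add("handler-1");
        learners.add("handler-2");
    }

    // Same shape as step 3: take the monitor on learners, then remove.
    void removeLearnerHandler(String handler) {
        synchronized (learners) {
            learners.remove(handler);
        }
    }

    // Same shape as step 1: take the monitor on learners, then shut down
    // every handler, which ends up calling removeLearnerHandler().
    int shutdown() {
        synchronized (learners) {
            // Iterate over a copy so that removal does not disturb the iteration.
            List<String> snapshot = new ArrayList<>(learners);
            for (String handler : snapshot) {
                removeLearnerHandler(handler); // re-enters the monitor already held
            }
            return learners.size();
        }
    }

    public static void main(String[] args) {
        int remaining = new ReentrantMonitorSketch().shutdown();
        System.out.println("remaining handlers: " + remaining);
    }
}
```

A deadlock would need a second thread and a second lock taken in the opposite order, which this sketch does not model.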


[jira] [Commented] (ZOOKEEPER-2172) Cluster crashes when reconfig a new node as a participant

2015-07-27 Thread Hitoshi Mitake (JIRA)

[
https://issues.apache.org/jira/browse/ZOOKEEPER-2172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14642415#comment-14642415
]

Hitoshi Mitake commented on ZOOKEEPER-2172:
---

Hi [~ziyouw],

Could you check whether my understanding is correct? IIUC, your situation is as follows:
1. server 1 boots
2. server 2 boots
3. client issues reconfig to server 1
4. server 2 tries to sync with server 1 with Learner.syncWithLeader()
5. server 3 boots
6. client issues reconfig to server 1

(the reconfig requests in 3 and 6 overlap)

If this is correct, I'll be able to reproduce the situation with earthquake.


[jira] [Commented] (ZOOKEEPER-2172) Cluster crashes when reconfig a new node as a participant

2015-07-27 Thread Hitoshi Mitake (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14642437#comment-14642437
 ] 

Hitoshi Mitake commented on ZOOKEEPER-2172:
---

[~ziyouw] BTW, if it is possible, could you share your dockerfile for your 
testing?



[jira] [Commented] (ZOOKEEPER-2172) Cluster crashes when reconfig a new node as a participant

2015-07-27 Thread Hitoshi Mitake (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14642436#comment-14642436
 ] 

Hitoshi Mitake commented on ZOOKEEPER-2172:
---

[~ziyouw] BTW, if it is possible, could you share your dockerfile for your 
testing?



[jira] [Commented] (ZOOKEEPER-2172) Cluster crashes when reconfig a new node as a participant

2015-07-27 Thread Hitoshi Mitake (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14642435#comment-14642435
 ] 

Hitoshi Mitake commented on ZOOKEEPER-2172:
---

[~ziyouw] BTW, if it is possible, could you share your dockerfile for your 
testing?



[jira] [Commented] (ZOOKEEPER-2172) Cluster crashes when reconfig a new node as a participant

2015-07-27 Thread Hitoshi Mitake (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14642439#comment-14642439
 ] 

Hitoshi Mitake commented on ZOOKEEPER-2172:
---

Sorry for bothering with duplicated replies...


[jira] [Commented] (ZOOKEEPER-2172) Cluster crashes when reconfig a new node as a participant

2015-07-21 Thread Hitoshi Mitake (JIRA)

[
https://issues.apache.org/jira/browse/ZOOKEEPER-2172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636344#comment-14636344
]

Hitoshi Mitake commented on ZOOKEEPER-2172:
---

Hi [~shralex], thanks for your reply.

As you pointed out, the crash is caused by the inspection layer (written in
Byteman). Sorry for the bother.

But the NullPointerException is a little bit odd. The exception is caused by
the byteman script like this:
RULE quorum packet receive in Follower
CLASS Learner
METHOD readPacket
HELPER net.osrg.earthquake.PBEQHelper
BIND argMap = new java.util.HashMap()
AT EXIT
IF $# == 1
DO
  argMap.put("quorumPacket",
    org.apache.zookeeper.server.quorum.LearnerHandler.packetToString($1));
  eventFuncReturn("Learner.readPacket", argMap);
ENDRULE

IIUC, the QuorumPacket will never be null in a follower. I'll look into the problem.


[jira] [Updated] (ZOOKEEPER-2172) Cluster crashes when reconfig a new node as a participant

2015-07-21 Thread Hitoshi Mitake (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitoshi Mitake updated ZOOKEEPER-2172:
--
Attachment: zookeeper-3.out
zookeeper-2.out
zookeeper-1.out
history.txt


[jira] [Commented] (ZOOKEEPER-2172) Cluster crashes when reconfig a new node as a participant

2015-07-21 Thread Hitoshi Mitake (JIRA)

[
https://issues.apache.org/jira/browse/ZOOKEEPER-2172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14634903#comment-14634903
]

Hitoshi Mitake commented on ZOOKEEPER-2172:
---

Hi [~ziyouw],

It seems that I was able to reproduce this problem: after simply adding new
servers with reconfig one by one, the ensemble rejects every client request.
(Of course, it is possible that I have misunderstood something.)

I used our distributed systems debugger,
[earthquake|https://github.com/osrg/earthquake]. It uses Byteman to inspect the
execution of the debuggee (the ZooKeeper server in this case). By reordering the
inspected method calls and returns, it tries to trigger corner cases that are
hard to produce in ordinary testing.

We are preparing a Docker image so that you can reproduce this easily in your
environment. Please wait for a while.

I'm analyzing the problem and would like to post the root cause and a patch, but
it may take some time because I'm new to ZooKeeper. So I attached the logs
(zookeeper-{1,2,3}.out) and the history of the ensemble (history.txt). The logs
seem to be similar to yours.
The history is in an earthquake-specific format, so it won't be easy to read.
But I think you can roughly interpret the event sequence (it is just a
sequence of method calls and returns plus their stack traces). It would be great
to hear your comments.


[jira] [Created] (ZOOKEEPER-2233) Invalid description in the comment of LearnerHandler.syncFollower()

2015-07-14 Thread Hitoshi Mitake (JIRA)

Hitoshi Mitake created ZOOKEEPER-2233:
-

 Summary: Invalid description in the comment of 
LearnerHandler.syncFollower()
 Key: ZOOKEEPER-2233
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2233
 Project: ZooKeeper
  Issue Type: Improvement
Reporter: Hitoshi Mitake
Assignee: Hitoshi Mitake
Priority: Trivial


LearnerHandler.syncFollower() has a comment like below:

When leader election is completed, the leader will set its
lastProcessedZxid to be (epoch < 32). There will be no txn associated
with this zxid.

However, IIUC, the expression epoch < 32 (comparison) should be epoch << 32 
(bitshift).

Of course the error is very trivial but it was a little bit confusing for me, 
so I'd like to fix it.
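
A small sketch of why the shift is the intended reading: a zxid is a 64-bit value with the epoch in the high 32 bits and a per-epoch counter in the low 32 bits, so the first zxid of an epoch is the epoch shifted left by 32 with a counter of zero. ZxidSketch and its method names are hypothetical and only illustrate the arithmetic.

```java
class ZxidSketch {
    // Packs the epoch into the high 32 bits and the counter into the low 32 bits.
    static long makeZxid(long epoch, long counter) {
        return (epoch << 32) | (counter & 0xffffffffL);
    }

    // Recovers the epoch from the high 32 bits.
    static long epochOf(long zxid) {
        return zxid >> 32;
    }

    // Recovers the counter from the low 32 bits.
    static long counterOf(long zxid) {
        return zxid & 0xffffffffL;
    }

    public static void main(String[] args) {
        long epoch = 5;
        long first = epoch << 32; // zxid right after election: counter is 0
        System.out.println(Long.toHexString(first));                 // 500000000
        System.out.println(epochOf(first) + " " + counterOf(first)); // 5 0
    }
}
```

By contrast, epoch < 32 is a boolean expression in Java and could not be stored in a long zxid at all.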




[jira] [Updated] (ZOOKEEPER-2233) Invalid description in the comment of LearnerHandler.syncFollower()

2015-07-14 Thread Hitoshi Mitake (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitoshi Mitake updated ZOOKEEPER-2233:
--
Attachment: ZOOKEEPER-2233.patch


[jira] [Commented] (ZOOKEEPER-2205) Log type of unexpected quorum packet in learner handler loop

2015-06-05 Thread Hitoshi Mitake (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14574101#comment-14574101
 ] 

Hitoshi Mitake commented on ZOOKEEPER-2205:
---

Hi,

The problem in the Observer class is similar but not directly related to this 
issue. Could you open your own issue and send a patch to the new one?

 Log type of unexpected quorum packet in learner handler loop
 

 Key: ZOOKEEPER-2205
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2205
 Project: ZooKeeper
  Issue Type: Improvement
  Components: server
Reporter: Hitoshi Mitake
Assignee: Hitoshi Mitake
Priority: Trivial
 Attachments: ZOOKEEPER-2205-v2.patch, ZOOKEEPER-2205-v3.patch, 
 ZOOKEEPER-2205-v4.patch, ZOOKEEPER-2205.patch


 The current learner handler loop doesn't log anything when it receives an 
 unexpected type of quorum packet from a learner.
 This patch makes the learner handler loop log the packet type as a defensive 
 measure. It should make debugging and troubleshooting a little easier.
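
The idea can be sketched as a dispatch loop whose default branch records the unexpected type instead of silently dropping the packet. This is only an illustration of the pattern, not the patch itself: HandlerLoopSketch, the two packet type constants, and the warnings list (standing in for a logger) are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class HandlerLoopSketch {
    // Hypothetical packet types; the real protocol defines its own constants.
    static final int ACK = 1;
    static final int PING = 2;

    // Stand-in for a logger so the behaviour is easy to inspect.
    final List<String> warnings = new ArrayList<>();

    void handle(int type) {
        switch (type) {
            case ACK:
                // process the acknowledgement
                break;
            case PING:
                // process the ping
                break;
            default:
                // Defensive branch: report the type rather than ignoring it.
                warnings.add("unexpected quorum packet, type: " + type);
                break;
        }
    }

    public static void main(String[] args) {
        HandlerLoopSketch loop = new HandlerLoopSketch();
        loop.handle(ACK);
        loop.handle(99);
        for (String warning : loop.warnings) {
            System.out.println(warning);
        }
    }
}
```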




[jira] [Updated] (ZOOKEEPER-2205) Log type of unexpected quorum packet in learner handler loop

2015-06-05 Thread Hitoshi Mitake (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitoshi Mitake updated ZOOKEEPER-2205:
--
Attachment: ZOOKEEPER-2205-v4.patch


[jira] [Updated] (ZOOKEEPER-2207) Enhance error logs with LearnerHandler.packetToString()

2015-06-05 Thread Hitoshi Mitake (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitoshi Mitake updated ZOOKEEPER-2207:
--
Attachment: ZOOKEEPER-2207-v2.patch

 Enhance error logs with LearnerHandler.packetToString()
 ---

 Key: ZOOKEEPER-2207
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2207
 Project: ZooKeeper
  Issue Type: Improvement
  Components: server
Reporter: Hitoshi Mitake
Assignee: Hitoshi Mitake
Priority: Trivial
 Attachments: ZOOKEEPER-2207-v2.patch, ZOOKEEPER-2207.patch


 This patch enhances error logs related to unexpected types of QuorumPacket 
 with LearnerHandler.packetToString().
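
As a rough picture of what a richer log message buys, the sketch below turns a packet's numeric type and zxid into a readable string. It is an assumption-laden stand-in and does not reproduce the output format of LearnerHandler.packetToString(): PacketDescriptionSketch, describe(), and the type numbers are hypothetical.

```java
class PacketDescriptionSketch {
    // Renders a packet as "<type name> 0x<zxid in hex>".
    // The type numbers here are made up for illustration.
    static String describe(int type, long zxid) {
        String name;
        switch (type) {
            case 1:
                name = "ACK";
                break;
            case 2:
                name = "PING";
                break;
            default:
                name = "UNKNOWN(" + type + ")";
                break;
        }
        return name + " 0x" + Long.toHexString(zxid);
    }

    public static void main(String[] args) {
        System.out.println(describe(1, 0x100000001L));
        System.out.println(describe(99, 0x100000002L));
    }
}
```

An error line that carries such a description tells the reader which packet triggered it, rather than only that something unexpected arrived.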




[jira] [Commented] (ZOOKEEPER-2205) Log type of unexpected quorum packet in learner handler loop

2015-06-05 Thread Hitoshi Mitake (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14574102#comment-14574102
 ] 

Hitoshi Mitake commented on ZOOKEEPER-2205:
---

Hi [~rgs],

Thanks for your review! I attached v4 patch based on your comments.

 Log type of unexpected quorum packet in learner handler loop
 

 Key: ZOOKEEPER-2205
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2205
 Project: ZooKeeper
  Issue Type: Improvement
  Components: server
Reporter: Hitoshi Mitake
Assignee: Hitoshi Mitake
Priority: Trivial
 Attachments: ZOOKEEPER-2205-v2.patch, ZOOKEEPER-2205-v3.patch, 
 ZOOKEEPER-2205-v4.patch, ZOOKEEPER-2205.patch


 The current learner handler loop doesn't log anything when it receives an 
 unexpected type of quorum packet from a learner.
 This patch lets the learner handler loop log the packet type as a defensive 
 measure. It would make debugging and troubleshooting a little bit easier.
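
A minimal sketch of the idea, not the attached patch: a dispatch loop whose
default branch reports the packet type instead of dropping it silently. The
class name, the type codes and the message text are assumptions made for
illustration only.

```java
// Hypothetical stand-in for the handler loop body; not ZooKeeper's LearnerHandler.
class HandlerLoopSketch {
    // Illustrative type codes; assumptions for this sketch, not taken from the patch.
    static final int ACK = 3;
    static final int PING = 5;

    /**
     * Handles one packet type. Returns the warning that would be logged for an
     * unexpected type, or null when the type is one the loop knows about.
     */
    static String handle(int packetType) {
        switch (packetType) {
        case ACK:
            // ... process the acknowledgement ...
            return null;
        case PING:
            // ... process the heartbeat ...
            return null;
        default:
            // Without this branch an unknown type would be dropped silently.
            return "Unexpected quorum packet, type: " + packetType;
        }
    }

    public static void main(String[] args) {
        for (int type : new int[] {ACK, PING, 42}) {
            String warning = handle(type);
            if (warning != null) {
                System.err.println(warning);
            }
        }
    }
}
```

Returning the message rather than calling a logger keeps the sketch free of
dependencies; real code would pass it to its logger at WARN level or similar.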




[jira] [Commented] (ZOOKEEPER-2207) Enhance error logs with LearnerHandler.packetToString()

2015-06-05 Thread Hitoshi Mitake (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14574109#comment-14574109
 ] 

Hitoshi Mitake commented on ZOOKEEPER-2207:
---

Hi [~rgs],

Thanks for your review! I attached a v2 patch based on your comments.

BTW, I fixed the unconditional return branch problem in the v4 patch of 
ZOOKEEPER-2205 (https://issues.apache.org/jira/browse/ZOOKEEPER-2205). Should I 
also remove the return branch in this issue, 2207? If so, I'll update the 
patches for both 2205 and 2207.


 Enhance error logs with LearnerHandler.packetToString()
 ---

 Key: ZOOKEEPER-2207
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2207
 Project: ZooKeeper
  Issue Type: Improvement
  Components: server
Reporter: Hitoshi Mitake
Assignee: Hitoshi Mitake
Priority: Trivial
 Attachments: ZOOKEEPER-2207-v2.patch, ZOOKEEPER-2207.patch


 This patch enhances error logs related to unexpected types of QuorumPacket 
 with LearnerHandler.packetToString().




[jira] [Commented] (ZOOKEEPER-2207) Enhance error logs with LearnerHandler.packetToString()

2015-06-05 Thread Hitoshi Mitake (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14575441#comment-14575441
 ] 

Hitoshi Mitake commented on ZOOKEEPER-2207:
---

Thanks, [~rgs]!

 Enhance error logs with LearnerHandler.packetToString()
 ---

 Key: ZOOKEEPER-2207
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2207
 Project: ZooKeeper
  Issue Type: Improvement
  Components: server
Affects Versions: 3.5.0
Reporter: Hitoshi Mitake
Assignee: Hitoshi Mitake
Priority: Trivial
 Fix For: 3.5.1, 3.6.0

 Attachments: ZOOKEEPER-2207-v2.patch, ZOOKEEPER-2207.patch


 This patch enhances error logs related to unexpected types of QuorumPacket 
 with LearnerHandler.packetToString().




[jira] [Commented] (ZOOKEEPER-2206) Add missing packet types to LearnerHandler.packetToString()

2015-06-05 Thread Hitoshi Mitake (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14575443#comment-14575443
 ] 

Hitoshi Mitake commented on ZOOKEEPER-2206:
---

Thanks, [~rgs]!

 Add missing packet types to LearnerHandler.packetToString()
 ---

 Key: ZOOKEEPER-2206
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2206
 Project: ZooKeeper
  Issue Type: Improvement
  Components: server
Affects Versions: 3.5.0
Reporter: Hitoshi Mitake
Assignee: Hitoshi Mitake
Priority: Trivial
 Fix For: 3.5.1, 3.6.0

 Attachments: ZOOKEEPER-2206.patch


 packetToString() is a method suitable for obtaining a string 
 representation of a QuorumPacket, but it lacks some QuorumPacket types. This 
 patch adds the missing types and enhances the method for friendlier logging.




[jira] [Commented] (ZOOKEEPER-2205) Log type of unexpected quorum packet in learner handler loop

2015-06-05 Thread Hitoshi Mitake (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14575439#comment-14575439
 ] 

Hitoshi Mitake commented on ZOOKEEPER-2205:
---

Thanks for merging, [~rgs]!

 Log type of unexpected quorum packet in learner handler loop
 

 Key: ZOOKEEPER-2205
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2205
 Project: ZooKeeper
  Issue Type: Improvement
  Components: server
Affects Versions: 3.4.6, 3.5.0
Reporter: Hitoshi Mitake
Assignee: Hitoshi Mitake
Priority: Trivial
 Fix For: 3.4.7, 3.5.1, 3.6.0

 Attachments: ZOOKEEPER-2205-v2.patch, ZOOKEEPER-2205-v3.patch, 
 ZOOKEEPER-2205-v4.patch, ZOOKEEPER-2205.patch


 The current learner handler loop doesn't log anything when it receives an 
 unexpected type of quorum packet from a learner.
 This patch lets the learner handler loop log the packet type as a defensive 
 measure. It would make debugging and troubleshooting a little bit easier.




[jira] [Commented] (ZOOKEEPER-2194) Let DataNode.getChildren() return an unmodifiable view of its children set

2015-06-04 Thread Hitoshi Mitake (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14573058#comment-14573058
 ] 

Hitoshi Mitake commented on ZOOKEEPER-2194:
---

Hi [~cnauroth],

Thanks a lot for your explanation! Now I understand both the rules and the 
situation of the ZooKeeper community. I'll wait for comments from committers.


 Let DataNode.getChildren() return an unmodifiable view of its children set
 --

 Key: ZOOKEEPER-2194
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2194
 Project: ZooKeeper
  Issue Type: Improvement
  Components: server
Reporter: Hitoshi Mitake
Assignee: Hitoshi Mitake
Priority: Trivial
 Attachments: ZOOKEEPER-2194-v2.patch, ZOOKEEPER-2194.patch


 The current DataNode.getChildren() directly returns a reference to its private 
 member, children. However, that member should be modified only through 
 addChild() and removeChild(). Callers of getChildren() shouldn't modify it 
 directly. To prevent direct modification by callers, this patch lets 
 getChildren() return an unmodifiable view of its children set. If a caller 
 tries to modify it directly, a runtime exception will be thrown.




[jira] [Commented] (ZOOKEEPER-2194) Let DataNode.getChildren() return an unmodifiable view of its children set

2015-06-04 Thread Hitoshi Mitake (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14573139#comment-14573139
 ] 

Hitoshi Mitake commented on ZOOKEEPER-2194:
---

Hi [~rgs],

Thanks a lot for your review and merging!

 Let DataNode.getChildren() return an unmodifiable view of its children set
 --

 Key: ZOOKEEPER-2194
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2194
 Project: ZooKeeper
  Issue Type: Improvement
  Components: server
Affects Versions: 3.4.6, 3.5.0
Reporter: Hitoshi Mitake
Assignee: Hitoshi Mitake
Priority: Trivial
 Fix For: 3.4.7, 3.5.1, 3.6.0

 Attachments: ZOOKEEPER-2194-v2.patch, ZOOKEEPER-2194.patch


 The current DataNode.getChildren() directly returns a reference to its private 
 member, children. However, that member should be modified only through 
 addChild() and removeChild(). Callers of getChildren() shouldn't modify it 
 directly. To prevent direct modification by callers, this patch lets 
 getChildren() return an unmodifiable view of its children set. If a caller 
 tries to modify it directly, a runtime exception will be thrown.




[jira] [Created] (ZOOKEEPER-2206) Add missing packet types to LearnerHandler.packetToString()

2015-06-04 Thread Hitoshi Mitake (JIRA)

Hitoshi Mitake created ZOOKEEPER-2206:
-

 Summary: Add missing packet types to 
LearnerHandler.packetToString()
 Key: ZOOKEEPER-2206
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2206
 Project: ZooKeeper
  Issue Type: Improvement
Reporter: Hitoshi Mitake
Priority: Trivial
 Attachments: ZOOKEEPER-2206.patch

packetToString() is a method suitable for obtaining a string 
representation of a QuorumPacket, but it lacks some QuorumPacket types. This 
patch adds the missing types and enhances the method for friendlier logging.
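
As a rough illustration of what such a method can look like (a sketch under
assumptions, not the attached patch): a switch that maps a numeric packet type
to a readable name and falls back to a marker for types it does not know. The
class name, the type codes, the zxid parameter and the output format are all
hypothetical.

```java
// Hypothetical sketch; not the real LearnerHandler.packetToString().
class PacketToStringSketch {
    // Illustrative type codes; assumptions for this sketch.
    static final int PROPOSAL = 2;
    static final int ACK = 3;
    static final int COMMIT = 4;
    static final int PING = 5;

    /** Builds a short, log-friendly description of a packet. */
    static String packetToString(int type, long zxid) {
        String name;
        switch (type) {
        case PROPOSAL:
            name = "PROPOSAL";
            break;
        case ACK:
            name = "ACK";
            break;
        case COMMIT:
            name = "COMMIT";
            break;
        case PING:
            name = "PING";
            break;
        default:
            // A type missing from the switch still yields a usable string.
            name = "UNKNOWN(" + type + ")";
            break;
        }
        return name + " 0x" + Long.toHexString(zxid);
    }

    public static void main(String[] args) {
        System.out.println(packetToString(ACK, 0x100L));
        System.out.println(packetToString(99, 1L));
    }
}
```

The default branch is the design point: a missing case degrades to a readable
placeholder rather than to null or an exception inside a logging call.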




[jira] [Updated] (ZOOKEEPER-2206) Add missing packet types to LearnerHandler.packetToString()

2015-06-04 Thread Hitoshi Mitake (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitoshi Mitake updated ZOOKEEPER-2206:
--
Attachment: ZOOKEEPER-2206.patch

 Add missing packet types to LearnerHandler.packetToString()
 ---

 Key: ZOOKEEPER-2206
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2206
 Project: ZooKeeper
  Issue Type: Improvement
Reporter: Hitoshi Mitake
Priority: Trivial
 Attachments: ZOOKEEPER-2206.patch


 packetToString() is a method suitable for obtaining a string 
 representation of a QuorumPacket, but it lacks some QuorumPacket types. This 
 patch adds the missing types and enhances the method for friendlier logging.




[jira] [Updated] (ZOOKEEPER-2205) Log type of unexpected quorum packet in learner handler loop

2015-06-04 Thread Hitoshi Mitake (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitoshi Mitake updated ZOOKEEPER-2205:
--
Summary: Log type of unexpected quorum packet in learner handler loop  
(was: Log type of unexpected quorum packet in learner loop)

 Log type of unexpected quorum packet in learner handler loop
 

 Key: ZOOKEEPER-2205
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2205
 Project: ZooKeeper
  Issue Type: Improvement
  Components: server
Reporter: Hitoshi Mitake
Assignee: Hitoshi Mitake
Priority: Trivial
 Attachments: ZOOKEEPER-2205.patch


 The current learner loop doesn't log anything when it receives an unexpected 
 type of quorum packet from the leader.
 This patch lets the learner loop log the packet type as a defensive 
 measure. It would make debugging and troubleshooting a little bit easier.




[jira] [Updated] (ZOOKEEPER-2205) Log type of unexpected quorum packet in learner handler loop

2015-06-04 Thread Hitoshi Mitake (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitoshi Mitake updated ZOOKEEPER-2205:
--
Description: 
The current learner handler loop doesn't log anything when it receives an 
unexpected type of quorum packet from a learner.

This patch lets the learner handler loop log the packet type as a defensive 
measure. It would make debugging and troubleshooting a little bit easier.

  was:
The current learner loop doesn't log anything when it receives an unexpected 
type of quorum packet from the leader.

This patch lets the learner loop log the packet type as a defensive measure. 
It would make debugging and troubleshooting a little bit easier.


 Log type of unexpected quorum packet in learner handler loop
 

 Key: ZOOKEEPER-2205
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2205
 Project: ZooKeeper
  Issue Type: Improvement
  Components: server
Reporter: Hitoshi Mitake
Assignee: Hitoshi Mitake
Priority: Trivial
 Attachments: ZOOKEEPER-2205.patch


 The current learner handler loop doesn't log anything when it receives an 
 unexpected type of quorum packet from a learner.
 This patch lets the learner handler loop log the packet type as a defensive 
 measure. It would make debugging and troubleshooting a little bit easier.




[jira] [Updated] (ZOOKEEPER-2205) Log type of unexpected quorum packet in learner handler loop

2015-06-04 Thread Hitoshi Mitake (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitoshi Mitake updated ZOOKEEPER-2205:
--
Attachment: ZOOKEEPER-2205-v2.patch

Version 2: uses packetToString() for friendlier log output.

 Log type of unexpected quorum packet in learner handler loop
 

 Key: ZOOKEEPER-2205
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2205
 Project: ZooKeeper
  Issue Type: Improvement
  Components: server
Reporter: Hitoshi Mitake
Assignee: Hitoshi Mitake
Priority: Trivial
 Attachments: ZOOKEEPER-2205-v2.patch, ZOOKEEPER-2205.patch


 The current learner handler loop doesn't log anything when it receives an 
 unexpected type of quorum packet from a learner.
 This patch lets the learner handler loop log the packet type as a defensive 
 measure. It would make debugging and troubleshooting a little bit easier.




[jira] [Updated] (ZOOKEEPER-2206) Add missing packet types to LearnerHandler.packetToString()

2015-06-04 Thread Hitoshi Mitake (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitoshi Mitake updated ZOOKEEPER-2206:
--
Component/s: server

 Add missing packet types to LearnerHandler.packetToString()
 ---

 Key: ZOOKEEPER-2206
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2206
 Project: ZooKeeper
  Issue Type: Improvement
  Components: server
Reporter: Hitoshi Mitake
Assignee: Hitoshi Mitake
Priority: Trivial
 Attachments: ZOOKEEPER-2206.patch


 packetToString() is a method suitable for obtaining a string 
 representation of a QuorumPacket, but it lacks some QuorumPacket types. This 
 patch adds the missing types and enhances the method for friendlier logging.




[jira] [Created] (ZOOKEEPER-2207) Enhance error logs with LearnerHandler.packetToString()

2015-06-04 Thread Hitoshi Mitake (JIRA)

Hitoshi Mitake created ZOOKEEPER-2207:
-

 Summary: Enhance error logs with LearnerHandler.packetToString()
 Key: ZOOKEEPER-2207
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2207
 Project: ZooKeeper
  Issue Type: Improvement
  Components: server
Reporter: Hitoshi Mitake
Priority: Trivial
 Attachments: ZOOKEEPER-2207.patch

This patch enhances error logs related to unexpected types of QuorumPacket with 
LearnerHandler.packetToString().




[jira] [Updated] (ZOOKEEPER-2207) Enhance error logs with LearnerHandler.packetToString()

2015-06-04 Thread Hitoshi Mitake (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitoshi Mitake updated ZOOKEEPER-2207:
--
Attachment: ZOOKEEPER-2207.patch

 Enhance error logs with LearnerHandler.packetToString()
 ---

 Key: ZOOKEEPER-2207
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2207
 Project: ZooKeeper
  Issue Type: Improvement
  Components: server
Reporter: Hitoshi Mitake
Priority: Trivial
 Attachments: ZOOKEEPER-2207.patch


 This patch enhances error logs related to unexpected types of QuorumPacket 
 with LearnerHandler.packetToString().




[jira] [Assigned] (ZOOKEEPER-2207) Enhance error logs with LearnerHandler.packetToString()

2015-06-04 Thread Hitoshi Mitake (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitoshi Mitake reassigned ZOOKEEPER-2207:
-

Assignee: Hitoshi Mitake

 Enhance error logs with LearnerHandler.packetToString()
 ---

 Key: ZOOKEEPER-2207
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2207
 Project: ZooKeeper
  Issue Type: Improvement
  Components: server
Reporter: Hitoshi Mitake
Assignee: Hitoshi Mitake
Priority: Trivial
 Attachments: ZOOKEEPER-2207.patch


 This patch enhances error logs related to unexpected types of QuorumPacket 
 with LearnerHandler.packetToString().




[jira] [Assigned] (ZOOKEEPER-2206) Add missing packet types to LearnerHandler.packetToString()

2015-06-04 Thread Hitoshi Mitake (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitoshi Mitake reassigned ZOOKEEPER-2206:
-

Assignee: Hitoshi Mitake

 Add missing packet types to LearnerHandler.packetToString()
 ---

 Key: ZOOKEEPER-2206
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2206
 Project: ZooKeeper
  Issue Type: Improvement
  Components: server
Reporter: Hitoshi Mitake
Assignee: Hitoshi Mitake
Priority: Trivial
 Attachments: ZOOKEEPER-2206.patch


 packetToString() is a method suitable for obtaining a string 
 representation of a QuorumPacket, but it lacks some QuorumPacket types. This 
 patch adds the missing types and enhances the method for friendlier logging.




[jira] [Commented] (ZOOKEEPER-2205) Log type of unexpected quorum packet in learner handler loop

2015-06-04 Thread Hitoshi Mitake (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14572339#comment-14572339
 ] 

Hitoshi Mitake commented on ZOOKEEPER-2205:
---

I found the following branch at the top of packetToString():

{code}
if (true)
    return null;
{code}

Is there a reason the method is disabled like this? The branch seems to 
have existed since the initial import commit.

 Log type of unexpected quorum packet in learner handler loop
 

 Key: ZOOKEEPER-2205
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2205
 Project: ZooKeeper
  Issue Type: Improvement
  Components: server
Reporter: Hitoshi Mitake
Assignee: Hitoshi Mitake
Priority: Trivial
 Attachments: ZOOKEEPER-2205-v2.patch, ZOOKEEPER-2205.patch


 The current learner handler loop doesn't log anything when it receives an 
 unexpected type of quorum packet from a learner.
 This patch lets the learner handler loop log the packet type as a defensive 
 measure. It would make debugging and troubleshooting a little bit easier.




[jira] [Commented] (ZOOKEEPER-2194) Let DataNode.getChildren() return an unmodifiable view of its children set

2015-06-04 Thread Hitoshi Mitake (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14572357#comment-14572357
 ] 

Hitoshi Mitake commented on ZOOKEEPER-2194:
---

Hi [~cnauroth],

To get the patch merged, should I just wait, or is there something I should do?
I'm very new to the ZooKeeper community, so I just want to know the required 
procedure. I'm not in a hurry at all :)

 Let DataNode.getChildren() return an unmodifiable view of its children set
 --

 Key: ZOOKEEPER-2194
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2194
 Project: ZooKeeper
  Issue Type: Improvement
  Components: server
Reporter: Hitoshi Mitake
Assignee: Hitoshi Mitake
Priority: Trivial
 Attachments: ZOOKEEPER-2194-v2.patch, ZOOKEEPER-2194.patch


 The current DataNode.getChildren() directly returns a reference to its private 
 member, children. However, that member should be modified only through 
 addChild() and removeChild(). Callers of getChildren() shouldn't modify it 
 directly. To prevent direct modification by callers, this patch lets 
 getChildren() return an unmodifiable view of its children set. If a caller 
 tries to modify it directly, a runtime exception will be thrown.




[jira] [Updated] (ZOOKEEPER-2205) Log type of unexpected quorum packet in learner loop

2015-06-03 Thread Hitoshi Mitake (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitoshi Mitake updated ZOOKEEPER-2205:
--
Issue Type: Improvement  (was: Bug)

 Log type of unexpected quorum packet in learner loop
 

 Key: ZOOKEEPER-2205
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2205
 Project: ZooKeeper
  Issue Type: Improvement
  Components: server
Reporter: Hitoshi Mitake
Assignee: Hitoshi Mitake
Priority: Trivial
 Attachments: ZOOKEEPER-2205.patch


 The current learner loop doesn't log anything when it receives an unexpected 
 type of quorum packet from the leader.
 This patch lets the learner loop log the packet type as a defensive 
 measure. It would make debugging and troubleshooting a little bit easier.




[jira] [Updated] (ZOOKEEPER-2205) Log type of unexpected quorum packet in learner loop

2015-06-03 Thread Hitoshi Mitake (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitoshi Mitake updated ZOOKEEPER-2205:
--
Attachment: ZOOKEEPER-2205.patch

 Log type of unexpected quorum packet in learner loop
 

 Key: ZOOKEEPER-2205
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2205
 Project: ZooKeeper
  Issue Type: Bug
  Components: server
Reporter: Hitoshi Mitake
Assignee: Hitoshi Mitake
Priority: Trivial
 Attachments: ZOOKEEPER-2205.patch


 The current learner loop doesn't log anything when it receives an unexpected 
 type of quorum packet from the leader.
 This patch lets the learner loop log the packet type as a defensive 
 measure. It would make debugging and troubleshooting a little bit easier.




[jira] [Created] (ZOOKEEPER-2205) Log type of unexpected quorum packet in learner loop

2015-06-03 Thread Hitoshi Mitake (JIRA)

Hitoshi Mitake created ZOOKEEPER-2205:
-

 Summary: Log type of unexpected quorum packet in learner loop
 Key: ZOOKEEPER-2205
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2205
 Project: ZooKeeper
  Issue Type: Bug
  Components: server
Reporter: Hitoshi Mitake
Assignee: Hitoshi Mitake
Priority: Trivial


The current learner loop doesn't log anything when it receives an unexpected 
type of quorum packet from the leader.

This patch lets the learner loop log the packet type as a defensive measure. 
It would make debugging and troubleshooting a little bit easier.




[jira] [Commented] (ZOOKEEPER-2193) reconfig command completes even if parameter is wrong obviously

2015-05-24 Thread Hitoshi Mitake (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14557955#comment-14557955
 ] 

Hitoshi Mitake commented on ZOOKEEPER-2193:
---

Hi [~Yasuhito Fukuda],

IIUC, addresses used for different purposes can also collide, 
e.g. the clientAddr of a new node == the electionAddr of an existing node.

To check for duplication, 9 comparisons per node pair would be required, I think.
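
A rough sketch of that check, assuming each node exposes three endpoints
(quorum, election and client) so that one pair of nodes needs 3 x 3 = 9
comparisons. ServerSpec and AddressConflictSketch are hypothetical names, not
ZooKeeper classes, and endpoints are compared as plain "host:port" strings, so
wildcard addresses such as 0.0.0.0 and hostname aliases are not handled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of one "server.N=..." entry; not ZooKeeper's own type.
class ServerSpec {
    final long id;
    final List<String> endpoints; // quorum, election and client "host:port"

    ServerSpec(long id, String quorumAddr, String electionAddr, String clientAddr) {
        this.id = id;
        this.endpoints = Arrays.asList(quorumAddr, electionAddr, clientAddr);
    }
}

class AddressConflictSketch {
    /** Compares every endpoint of a with every endpoint of b (9 comparisons). */
    static List<String> conflicts(ServerSpec a, ServerSpec b) {
        List<String> found = new ArrayList<>();
        for (String x : a.endpoints) {
            for (String y : b.endpoints) {
                if (x.equals(y)) {
                    found.add(x);
                }
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // Client ports below are made up for the example.
        ServerSpec s4 = new ServerSpec(4, "192.168.100.104:2888",
                "192.168.100.104:3888", "192.168.100.104:2181");
        ServerSpec s5 = new ServerSpec(5, "192.168.100.104:2888",
                "192.168.100.104:3888", "192.168.100.104:2182");
        System.out.println(conflicts(s4, s5));
    }
}
```

Comparing across purposes is what catches the clientAddr versus electionAddr
case; a check that only compares like with like would need 3 comparisons per
pair and would miss it.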

 reconfig command completes even if parameter is wrong obviously
 ---

 Key: ZOOKEEPER-2193
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2193
 Project: ZooKeeper
  Issue Type: Bug
  Components: leaderElection, server
Affects Versions: 3.5.0
 Environment: CentOS7 + Java7
Reporter: Yasuhito Fukuda
 Attachments: ZOOKEEPER-2193.patch


 Even if the reconfig parameter is wrong, the command was confirmed to complete.
 Refer to the following.
 - Ensemble consists of four nodes
 {noformat}
 [zk: vm-101:2181(CONNECTED) 0] config
 server.1=192.168.100.101:2888:3888:participant
 server.2=192.168.100.102:2888:3888:participant
 server.3=192.168.100.103:2888:3888:participant
 server.4=192.168.100.104:2888:3888:participant
 version=1
 {noformat}
 - add node by reconfig command
 {noformat}
 [zk: vm-101:2181(CONNECTED) 9] reconfig -add 
 server.5=192.168.100.104:2888:3888:participant;0.0.0.0:2181
 Committed new configuration:
 server.1=192.168.100.101:2888:3888:participant
 server.2=192.168.100.102:2888:3888:participant
 server.3=192.168.100.103:2888:3888:participant
 server.4=192.168.100.104:2888:3888:participant
 server.5=192.168.100.104:2888:3888:participant;0.0.0.0:2181
 version=30007
 {noformat}
 The IP addresses of server.4 and server.5 are duplicates.
 In this state, leader election will not work properly.
 Besides, the ensemble is presumably left in an undesirable state.
 I think parameter validation is needed on reconfig.




[jira] [Commented] (ZOOKEEPER-2194) Let DataNode.getChildren() return an unmodifiable view of its children set

2015-05-22 Thread Hitoshi Mitake (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14555757#comment-14555757
 ] 

Hitoshi Mitake commented on ZOOKEEPER-2194:
---

Thanks for submitting the test run!

I'll ask the committers on the ZooKeeper mailing list to list me as a 
contributor.

 Let DataNode.getChildren() return an unmodifiable view of its children set
 --

 Key: ZOOKEEPER-2194
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2194
 Project: ZooKeeper
  Issue Type: Improvement
  Components: server
Reporter: Hitoshi Mitake
Priority: Trivial
 Attachments: ZOOKEEPER-2194-v2.patch, ZOOKEEPER-2194.patch


 The current DataNode.getChildren() directly returns a reference to its private 
 member, children. However, that member should be modified only through 
 addChild() and removeChild(). Callers of getChildren() shouldn't modify it 
 directly. To prevent direct modification by callers, this patch lets 
 getChildren() return an unmodifiable view of its children set. If a caller 
 tries to modify it directly, a runtime exception will be thrown.




[jira] [Updated] (ZOOKEEPER-2194) Let DataNode.getChildren() return an unmodifiable view of its children set

2015-05-21 Thread Hitoshi Mitake (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitoshi Mitake updated ZOOKEEPER-2194:
--
Attachment: ZOOKEEPER-2194.patch

 Let DataNode.getChildren() return an unmodifiable view of its children set
 --

 Key: ZOOKEEPER-2194
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2194
 Project: ZooKeeper
  Issue Type: Improvement
Reporter: Hitoshi Mitake
Priority: Trivial
 Attachments: ZOOKEEPER-2194.patch


 The current DataNode.getChildren() directly returns a reference to its private 
 member, children. However, that member should be modified only through 
 addChild() and removeChild(). Callers of getChildren() shouldn't modify it 
 directly. To prevent direct modification by callers, this patch lets 
 getChildren() return an unmodifiable view of its children set. If a caller 
 tries to modify it directly, a runtime exception will be thrown.




[jira] [Created] (ZOOKEEPER-2194) Let DataNode.getChildren() return an unmodifiable view of its children set

2015-05-21 Thread Hitoshi Mitake (JIRA)

Hitoshi Mitake created ZOOKEEPER-2194:
-

 Summary: Let DataNode.getChildren() return an unmodifiable view of 
its children set
 Key: ZOOKEEPER-2194
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2194
 Project: ZooKeeper
  Issue Type: Improvement
Reporter: Hitoshi Mitake
Priority: Trivial


The current DataNode.getChildren() directly returns a reference to its private 
member, children. However, that member should be modified only through 
addChild() and removeChild(). Callers of getChildren() shouldn't modify it 
directly.

To prevent direct modification by callers, this patch lets 
getChildren() return an unmodifiable view of its children set. If a caller 
tries to modify it directly, a runtime exception will be thrown.
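
A minimal sketch of the approach, assuming the view is built with
Collections.unmodifiableSet(); SketchNode is a hypothetical stand-in for
DataNode and models only the children set, so details of the real patch may
differ.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for DataNode; only the children set is modeled.
class SketchNode {
    private final Set<String> children = new HashSet<>();

    // Mutations go through these two methods only.
    synchronized boolean addChild(String child) {
        return children.add(child);
    }

    synchronized boolean removeChild(String child) {
        return children.remove(child);
    }

    // Read-only view backed by the private set: later addChild() calls are
    // visible through it, but writes through the view throw.
    synchronized Set<String> getChildren() {
        return Collections.unmodifiableSet(children);
    }
}

class UnmodifiableChildrenDemo {
    /** Returns true if a write through the view is rejected. */
    static boolean viewRejectsWrites(SketchNode node) {
        try {
            node.getChildren().add("intruder");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        SketchNode node = new SketchNode();
        node.addChild("a");
        System.out.println(viewRejectsWrites(node));
        System.out.println(node.getChildren());
    }
}
```

Because the view wraps the set rather than copying it, the call stays cheap,
but a caller iterating over it still has to coordinate with concurrent
addChild() and removeChild() calls.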




[jira] [Updated] (ZOOKEEPER-2194) Let DataNode.getChildren() return an unmodifiable view of its children set

2015-05-21 Thread Hitoshi Mitake (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitoshi Mitake updated ZOOKEEPER-2194:
--
Attachment: ZOOKEEPER-2194-v2.patch

Version 2, modified based on the comments from [~cnauroth].

 Let DataNode.getChildren() return an unmodifiable view of its children set
 --

 Key: ZOOKEEPER-2194
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2194
 Project: ZooKeeper
  Issue Type: Improvement
Reporter: Hitoshi Mitake
Priority: Trivial
 Attachments: ZOOKEEPER-2194-v2.patch, ZOOKEEPER-2194.patch


 The current DataNode.getChildren() directly returns a reference to its private 
 member, children. However, that member should be modified only through 
 addChild() and removeChild(). Callers of getChildren() shouldn't modify it 
 directly. To prevent direct modification by callers, this patch lets 
 getChildren() return an unmodifiable view of its children set. If a caller 
 tries to modify it directly, a runtime exception will be thrown.




[jira] [Commented] (ZOOKEEPER-2194) Let DataNode.getChildren() return an unmodifiable view of its children set

2015-05-21 Thread Hitoshi Mitake (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14555401#comment-14555401
 ] 

Hitoshi Mitake commented on ZOOKEEPER-2194:
---

Hi [~cnauroth], thanks for your reply.

I'll fix the style of the conditional branch and follow your instructions for 
patch generation in v2.

Thanks a lot for your review!

 Let DataNode.getChildren() return an unmodifiable view of its children set
 --

 Key: ZOOKEEPER-2194
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2194
 Project: ZooKeeper
  Issue Type: Improvement
Reporter: Hitoshi Mitake
Priority: Trivial
 Attachments: ZOOKEEPER-2194.patch


 The current DataNode.getChildren() directly returns a reference to its private 
 member, children. However, that member should be modified only through 
 addChild() and removeChild(). Callers of getChildren() shouldn't modify it 
 directly. To prevent direct modification by callers, this patch lets 
 getChildren() return an unmodifiable view of its children set. If a caller 
 tries to modify it directly, a runtime exception will be thrown.



