[jira] Commented: (ZOOKEEPER-335) zookeeper servers should commit the new leader txn to their logs.

2010-11-20 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12934202#action_12934202
 ] 

Flavio Junqueira commented on ZOOKEEPER-335:


Radu, it sounds like the problem you mention was resolved in 
ZOOKEEPER-790. I'm not sure which version you're using, but perhaps you should 
consider moving to 3.3.2.

> zookeeper servers should commit the new leader txn to their logs.
> -
>
> Key: ZOOKEEPER-335
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-335
> Project: Zookeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.1.0
>Reporter: Mahadev konar
>Assignee: Mahadev konar
>Priority: Blocker
> Fix For: 3.4.0
>
> Attachments: faultynode-vishal.txt, zk.log.gz, zklogs.tar.gz, 
> ZOOKEEPER-790.travis.log.bz2
>
>
> Currently, ZooKeeper followers do not commit the new leader election to their 
> logs. In failure scenarios, a follower can therefore ack the same leader txn id 
> twice, possibly to two different intermittent leaders, allowing those leaders 
> to propose two different txns with the same zxid.
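
To make the duplicate-zxid hazard concrete, here is a minimal sketch. It assumes the usual layout in which the high 32 bits of a zxid hold the leader epoch and the low 32 bits hold a per-epoch counter. The class and method names are hypothetical and are not taken from the ZooKeeper source.

```java
// Illustrative only: hypothetical helper, not the ZooKeeper implementation.
final class ZxidSketch {
    // Combine a leader epoch (high 32 bits) and a counter (low 32 bits).
    static long makeZxid(long epoch, long counter) {
        return (epoch << 32) | (counter & 0xffffffffL);
    }

    // Extract the epoch part of a zxid.
    static long epochOf(long zxid) {
        return zxid >>> 32;
    }

    // Extract the per-epoch counter part of a zxid.
    static long counterOf(long zxid) {
        return zxid & 0xffffffffL;
    }
}
```

If two intermittent leaders both believe they own epoch 3, the first proposal from each is makeZxid(3, 1), so two different txns share one zxid. If the epoch change is durably recorded first, the second leader has to move on to epoch 4 and the two proposals no longer collide.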

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-880) QuorumCnxManager$SendWorker grows without bounds

2010-11-19 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12933928#action_12933928
 ] 

Flavio Junqueira commented on ZOOKEEPER-880:


I think we agree that monitoring alone was not causing the issue. However, your 
logs indicate that there were some orphan threads due to the monitoring, as can 
be seen in excerpts of your logs like the one I posted above. Without the 
monitoring, the same problem is still being triggered, apparently in a 
different way, and it is not clear why. You can see it from all the "Channel 
eof" messages in the log. 

To solve this issue, we need to understand the following:

# What's causing those IOExceptions?
# Why are we even starting a new connection if there is no leader election 
going on? 

Do you folks have any idea if there is anything in your environment that could 
be causing those TCP connections to break? 

> QuorumCnxManager$SendWorker grows without bounds
> 
>
> Key: ZOOKEEPER-880
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-880
> Project: Zookeeper
>  Issue Type: Bug
>Affects Versions: 3.2.2
>Reporter: Jean-Daniel Cryans
>Priority: Critical
> Attachments: hbase-hadoop-zookeeper-sv4borg12.log.gz, 
> hbase-hadoop-zookeeper-sv4borg9.log.gz, jstack, 
> TRACE-hbase-hadoop-zookeeper-sv4borg9.log.gz
>
>
> We're seeing an issue where one server in the ensemble has a steadily growing 
> number of QuorumCnxManager$SendWorker threads, up to the point where the OS runs 
> out of native threads, and at the same time we see a lot of exceptions in the 
> logs.  This is on 3.2.2 and our config looks like:
> {noformat}
> tickTime=3000
> dataDir=/somewhere_thats_not_tmp
> clientPort=2181
> initLimit=10
> syncLimit=5
> server.0=sv4borg9:2888:3888
> server.1=sv4borg10:2888:3888
> server.2=sv4borg11:2888:3888
> server.3=sv4borg12:2888:3888
> server.4=sv4borg13:2888:3888
> {noformat}
> The issue is on the first server. I'm going to attach thread dumps and logs 
> in a moment.
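
As a side note on the configuration quoted above, initLimit and syncLimit are expressed in ticks, so the effective limits depend on tickTime. The sketch below works that out for the quoted values and counts the server.N entries to get the majority size. The class and method names are hypothetical; this is not how the server itself parses its config.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Illustrative only: hypothetical helper for reading a zoo.cfg-style snippet.
final class EnsembleConfigSketch {
    // Parse key=value lines into a Properties object.
    static Properties parse(String text) {
        Properties p = new Properties();
        try {
            p.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return p;
    }

    // Convert a limit given in ticks (initLimit, syncLimit) to milliseconds.
    static int millis(Properties p, String limitKey) {
        int tickTime = Integer.parseInt(p.getProperty("tickTime"));
        int ticks = Integer.parseInt(p.getProperty(limitKey));
        return tickTime * ticks;
    }

    // Count the server.N entries.
    static int serverCount(Properties p) {
        int n = 0;
        for (String key : p.stringPropertyNames()) {
            if (key.startsWith("server.")) {
                n++;
            }
        }
        return n;
    }

    // Smallest number of servers that forms a majority.
    static int majority(int servers) {
        return servers / 2 + 1;
    }
}
```

For the quoted config this gives 30000 ms for initLimit, 15000 ms for syncLimit, and a majority of 3 out of 5 servers.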




[jira] Commented: (ZOOKEEPER-934) Add sanity check for server ID

2010-11-19 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12933719#action_12933719
 ] 

Flavio Junqueira commented on ZOOKEEPER-934:


One more comment. Looking at the logs for ZOOKEEPER-880, I remembered that in 
their case the RecvWorker thread was able to read a valid id from the 
connection with a Nagios server. I'm not exactly sure how that happened, but 
it essentially tells us that the simple check you proposed might not be enough. 
We don't want a Nagios box impersonating a ZooKeeper server! :-)

> Add sanity check for server ID
> --
>
> Key: ZOOKEEPER-934
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-934
> Project: Zookeeper
>  Issue Type: Sub-task
>Reporter: Vishal K
> Fix For: 3.4.0
>
>
> 2. Should I add a check to reject connections from peers that are not
> listed in the configuration file? Currently, we are not doing any
> sanity check for server IDs. I think this might fix ZOOKEEPER-851.
> The fix is simple. However, I am not sure if anyone in the community
> is relying on this ability.
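
The kind of check being proposed could look roughly like the sketch below: keep the set of server ids from the configuration and reject any connection whose announced id is not in it. The class and method names are hypothetical and the real QuorumCnxManager code is not shown here.

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative only: hypothetical membership check for announced server ids.
final class ServerIdCheckSketch {
    private final Set<Long> configuredIds;

    ServerIdCheckSketch(Set<Long> configuredIds) {
        // Copy defensively so later changes by the caller have no effect.
        this.configuredIds = new HashSet<>(configuredIds);
    }

    // Accept a connection only if the announced id is a configured server id.
    boolean accept(long remoteSid) {
        return configuredIds.contains(remoteSid);
    }
}
```

A membership test like this only filters out ids that are not configured. As the comment above points out, a stray connection that happens to present a configured id would still pass, so it may not be sufficient on its own.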




[jira] Commented: (ZOOKEEPER-880) QuorumCnxManager$SendWorker grows without bounds

2010-11-19 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12933718#action_12933718
 ] 

Flavio Junqueira commented on ZOOKEEPER-880:


One problem here is that we had some discussions over IRC and the information 
is not reflected here. 

If you have a look at the logs, you'll observe this:

{noformat}

2010-09-28 10:31:22,227 DEBUG 
org.apache.zookeeper.server.quorum.QuorumCnxManager: Connection request 
/10.10.20.5:41861
2010-09-28 10:31:22,227 DEBUG 
org.apache.zookeeper.server.quorum.QuorumCnxManager: Connection request: 0
2010-09-28 10:31:22,227 DEBUG 
org.apache.zookeeper.server.quorum.QuorumCnxManager: Address of remote peer: 0
2010-09-28 10:31:22,229 WARN 
org.apache.zookeeper.server.quorum.QuorumCnxManager: Connection broken:
java.io.IOException: Channel eof
at 
org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:595)
{noformat}

If I remember the discussion with J-D correctly, that node trying to connect is 
running Nagios. My conjecture at the time was that the IOException was killing 
the receiver thread, but not the sender thread (RecvWorker.finish() does not 
close its SendWorker counterpart).

Your point is good, but it sounds like the race you mention would have to 
be triggered continuously to cause the number of SendWorker threads to grow 
steadily. That sounds unlikely to me.
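
To illustrate the conjecture about the receiver dying without taking its sender along, here is a minimal sketch with two hypothetical worker classes. It is not the QuorumCnxManager code; it only contrasts a finish method that stops the receiver alone with one that also stops the paired sender.

```java
// Illustrative only: hypothetical stand-ins for a send/receive worker pair.
final class WorkerPairSketch {
    static final class Sender {
        private volatile boolean running = true;

        boolean isRunning() {
            return running;
        }

        void finish() {
            running = false;
        }
    }

    static final class Receiver {
        private final Sender peer;
        private volatile boolean running = true;

        Receiver(Sender peer) {
            this.peer = peer;
        }

        boolean isRunning() {
            return running;
        }

        // Leaky variant: the receiver stops, but its sender keeps running.
        void finishReceiverOnly() {
            running = false;
        }

        // Paired variant: stopping the receiver also stops its sender.
        void finish() {
            running = false;
            peer.finish();
        }
    }
}
```

With the leaky variant, every broken connection leaves one sender behind, which would match a thread count that only ever grows.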




[jira] Commented: (ZOOKEEPER-934) Add sanity check for server ID

2010-11-18 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12933713#action_12933713
 ] 

Flavio Junqueira commented on ZOOKEEPER-934:


I was not thinking about OBSERVER_ID. Good point; I think that should do it.




[jira] Commented: (ZOOKEEPER-933) Remove wildcard QuorumPeer.OBSERVER_ID

2010-11-18 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12933709#action_12933709
 ] 

Flavio Junqueira commented on ZOOKEEPER-933:


+1 for the idea, sounds right to me.

> Remove wildcard  QuorumPeer.OBSERVER_ID
> ---
>
> Key: ZOOKEEPER-933
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-933
> Project: Zookeeper
>  Issue Type: Sub-task
>Reporter: Vishal K
> Fix For: 3.4.0
>
>
> 1. I have a question about the following piece of code in QCM:
> if (remoteSid == QuorumPeer.OBSERVER_ID) {
>     /*
>      * Choose identifier at random. We need a value to identify
>      * the connection.
>      */
>     remoteSid = observerCounter--;
>     LOG.info("Setting arbitrary identifier to observer: " + remoteSid);
> }
> Should we allow this? The problem with this code is that if a peer
> connects twice with QuorumPeer.OBSERVER_ID, we will end up creating
> threads for this peer twice. This could result in redundant
> SendWorker/RecvWorker threads.
> I haven't used observers yet. The documentation
> http://hadoop.apache.org/zookeeper/docs/r3.3.0/zookeeperObservers.html
> says that just like followers, observers should have server IDs. In
> which case, why do we want to provide a wild-card?
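
The concern about redundant threads can be reproduced in a small sketch. It assumes that OBSERVER_ID is a sentinel value (Long.MAX_VALUE here) and that observerCounter starts at -1; both are assumptions for illustration, and the map stands in for whatever per-peer worker bookkeeping the real code keeps.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: hypothetical registry keyed by the (possibly rewritten) sid.
final class ObserverIdSketch {
    static final long OBSERVER_ID = Long.MAX_VALUE; // assumed sentinel value

    private long observerCounter = -1;               // assumed starting point
    private final Map<Long, String> workersBySid = new HashMap<>();

    // Register a connection and return the sid it ends up stored under.
    long register(long remoteSid, String remoteAddress) {
        if (remoteSid == OBSERVER_ID) {
            // Each wildcard connection gets a fresh arbitrary identifier.
            remoteSid = observerCounter--;
        }
        workersBySid.put(remoteSid, remoteAddress);
        return remoteSid;
    }

    // Number of distinct worker entries currently tracked.
    int workerPairs() {
        return workersBySid.size();
    }
}
```

A peer with a fixed id that connects twice overwrites its own entry, whereas a peer that connects twice with the wildcard ends up with two entries, which is the duplication the question is about.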




[jira] Commented: (ZOOKEEPER-934) Add sanity check for server ID

2010-11-17 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12933045#action_12933045
 ] 

Flavio Junqueira commented on ZOOKEEPER-934:


It sounds like we need to do it so that we don't get affected by port scanners 
or monitoring systems. However, I'm not sure if this impacts the observers 
feature we are discussing in the other jira (ZOOKEEPER-933). It sounds like it 
does, but I need to verify. Any thoughts?






[jira] Commented: (ZOOKEEPER-933) Remove wildcard QuorumPeer.OBSERVER_ID

2010-11-17 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12933034#action_12933034
 ] 

Flavio Junqueira commented on ZOOKEEPER-933:


Hi Vishal, The reason for the wildcard is explained in ZOOKEEPER-599. I'd 
rather keep this feature for the reasons explained before, but it would be good 
to prevent the case you mention.




[jira] Commented: (ZOOKEEPER-900) FLE implementation should be improved to use non-blocking sockets

2010-11-17 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12932999#action_12932999
 ] 

Flavio Junqueira commented on ZOOKEEPER-900:


Ok, there might have been some confusion. I saw the Patch Available flag set 
and interpreted it as ready to commit (after review, of course). If you still 
think there is work to be done on this jira, Vishal, please consider reopening 
it and creating sub-tasks. From your comments, I can extract at least 3 
possible tasks. 

Once you create sub-tasks (or new independent jiras), I will comment on your 
questions. I'd rather do that so that we don't mix up the discussion. Is that 
ok? 

> FLE implementation should be improved to use non-blocking sockets
> -
>
> Key: ZOOKEEPER-900
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-900
> Project: Zookeeper
>  Issue Type: Bug
>Reporter: Vishal K
>Assignee: Vishal K
>Priority: Critical
> Fix For: 3.4.0
>
> Attachments: ZOOKEEPER-900.patch, ZOOKEEPER-900.patch1, 
> ZOOKEEPER-900.patch2
>
>
> From earlier email exchanges:
> 1. Blocking connects and accepts:
> a) The first problem is in manager.toSend(). This invokes connectOne(), which 
> does a blocking connect. While testing, I changed the code so that 
> connectOne() starts a new thread called AsyncConnect. AsyncConnect.run() 
> does a socketChannel.connect(). After starting AsyncConnect, connectOne() 
> starts a timer. connectOne() continues with normal operations if the 
> connection is established before the timer expires; otherwise, when the timer 
> expires, it interrupts the AsyncConnect thread and returns. In this way, I can 
> have an upper bound on the amount of time we need to wait for connect to 
> succeed. Of course, this was a quick fix for my testing. Ideally, we should use 
> Selector to do non-blocking connects/accepts. I am planning to do that later, 
> once we at least have a quick fix for the problem and consensus from others on 
> the real fix (this problem is a big blocker for us). Note that it is OK to do 
> blocking IO in the SendWorker and RecvWorker threads, since they block only on 
> IO to their respective peer.
> b) The blocking IO problem is not restricted to connectOne(); it also occurs 
> in receiveConnection(). The Listener thread calls receiveConnection() for 
> each incoming connection request. receiveConnection() does blocking IO to get 
> the peer's info (s.read(msgBuffer)). Worse, it invokes connectOne() back to the 
> peer that sent the connection request. All of this happens in the Listener 
> thread. In short, if a peer fails after initiating a connection, the 
> Listener thread won't be able to accept connections from other peers, because 
> it will be stuck in read() or connectOne(). The code also has an inherent 
> cycle: initiateConnection() and receiveConnection() have to be very 
> carefully synchronized; otherwise, we could run into deadlocks. This code is 
> going to be difficult to maintain/modify.
> Also see: https://issues.apache.org/jira/browse/ZOOKEEPER-822
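
The quick fix described in (a), running the connect in a separate thread and bounding the wait, can be sketched generically as below. This is not the ZOOKEEPER-900 patch; the class and method names are hypothetical, and an executor with Future.get(timeout) stands in for the AsyncConnect thread plus timer.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Illustrative only: bound the time spent waiting for a blocking call.
final class BoundedCallSketch {
    // Run the task on a helper thread; return the fallback if it fails
    // or does not finish within timeoutMs.
    static <T> T callWithTimeout(Callable<T> task, long timeoutMs, T fallback) {
        ExecutorService exec = Executors.newSingleThreadExecutor();
        Future<T> future = exec.submit(task);
        try {
            return future.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupts the helper thread
            return fallback;
        } catch (ExecutionException e) {
            return fallback;     // the task itself threw
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            future.cancel(true);
            return fallback;
        } finally {
            exec.shutdownNow();
        }
    }
}
```

In a connect scenario the task would wrap something like socketChannel.connect(), which reacts to the interrupt because SocketChannel is an interruptible channel. This still costs a thread per attempt, which is why the description prefers Selector-based non-blocking connects as the real fix.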




[jira] Commented: (ZOOKEEPER-902) Fix findbug issue in trunk "Malicious code vulnerability"

2010-11-17 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12932989#action_12932989
 ] 

Flavio Junqueira commented on ZOOKEEPER-902:


Agreed, I've seen that 900 didn't include it. I'd rather let Pat take care of 
wrapping up this issue... 

> Fix findbug issue in trunk "Malicious code vulnerability"
> -
>
> Key: ZOOKEEPER-902
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-902
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum, server
>Affects Versions: 3.4.0
>Reporter: Patrick Hunt
>Priority: Minor
> Fix For: 3.4.0
>
>
> https://hudson.apache.org/hudson/view/ZooKeeper/job/ZooKeeper-trunk/970/artifact/trunk/findbugs/zookeeper-findbugs-report.html#Warnings_MALICIOUS_CODE
> Malicious code vulnerability Warnings
> Code  Warning
> MS    org.apache.zookeeper.server.quorum.LeaderElection.epochGen isn't final 
>       but should be
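
For context, this warning category is about static fields that other code could reassign. The sketch below contrasts the flagged shape with the usual remedy of declaring the field final. The field type (Random) and both class names are assumptions for illustration; the actual declaration in LeaderElection is not quoted in this thread.

```java
import java.lang.reflect.Modifier;
import java.util.Random;

// Illustrative only: hypothetical classes showing the flagged and fixed shapes.
final class FinalStaticSketch {
    static final class Flagged {
        // Mutable static: any code with access could replace the generator.
        public static Random epochGen = new Random();
    }

    static final class Fixed {
        // Final static: the reference can no longer be reassigned.
        public static final Random epochGen = new Random();
    }

    // Report whether the named field is declared both static and final.
    static boolean isStaticFinal(Class<?> owner, String fieldName) {
        try {
            int mods = owner.getDeclaredField(fieldName).getModifiers();
            return Modifier.isStatic(mods) && Modifier.isFinal(mods);
        } catch (NoSuchFieldException e) {
            return false;
        }
    }
}
```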




[jira] Updated: (ZOOKEEPER-900) FLE implementation should be improved to use non-blocking sockets

2010-11-17 Thread Flavio Junqueira (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flavio Junqueira updated ZOOKEEPER-900:
---

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed revision 1036071.




[jira] Commented: (ZOOKEEPER-900) FLE implementation should be improved to use non-blocking sockets

2010-11-17 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12932974#action_12932974
 ] 

Flavio Junqueira commented on ZOOKEEPER-900:


+1, great job, Vishal! On your question, the problem is that it is not easy to 
decide when a peer can close its connections, because it doesn't know which 
state the others are in and it might still need to receive and respond to 
notifications. In any case, if you have an idea for how to do it and want to 
discuss it further, we could create a new jira and work there, since this is a 
separate issue.




[jira] Commented: (ZOOKEEPER-922) enable faster timeout of sessions in case of unexpected socket disconnect

2010-11-17 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12932909#action_12932909
 ] 

Flavio Junqueira commented on ZOOKEEPER-922:


Hi Camille, say a client disconnects from server A and reconnects to server B 
with the same session. Server A believes the session should be expired because 
it received an exception. Server B believes the session should stay alive, since 
the client just reconnected. What should we do in this case? Kill the session 
or not?

Our suggestion is to have an option that enables fast expiration and disables 
clients moving sessions to other servers. We are certainly not proposing to 
remove the second functionality from ZooKeeper altogether.

> enable faster timeout of sessions in case of unexpected socket disconnect
> -
>
> Key: ZOOKEEPER-922
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-922
> Project: Zookeeper
>  Issue Type: Improvement
>  Components: server
>Reporter: Camille Fournier
>Assignee: Camille Fournier
> Fix For: 3.4.0
>
> Attachments: ZOOKEEPER-922.patch
>
>
> In the case when a client connection is closed due to socket error instead of 
> the client calling close explicitly, it would be nice to enable the session 
> associated with that client to time out faster than the negotiated session 
> timeout. This would enable a zookeeper ensemble that is acting as a dynamic 
> discovery provider to remove ephemeral nodes for crashed clients quickly, 
> while allowing for a longer heartbeat-based timeout for java clients that 
> need to do long stop-the-world GC. 
> I propose doing this by setting the timeout associated with the crashed 
> session to "minSessionTimeout".
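
The proposal amounts to a small decision rule, sketched below with hypothetical names. It assumes the documented default of minSessionTimeout being twice the tickTime; the socket-error flag is a stand-in for however the server would detect an unclean disconnect.

```java
// Illustrative only: hypothetical helper for the proposed timeout rule.
final class SessionTimeoutSketch {
    // Default lower bound for session timeouts: two ticks.
    static int defaultMinSessionTimeout(int tickTimeMs) {
        return 2 * tickTimeMs;
    }

    // After a socket error, shrink the timeout to the minimum;
    // otherwise keep the negotiated value.
    static int timeoutAfterDisconnect(int negotiatedMs, int minSessionTimeoutMs,
                                      boolean closedBySocketError) {
        if (closedBySocketError) {
            return Math.min(negotiatedMs, minSessionTimeoutMs);
        }
        return negotiatedMs;
    }
}
```

With tickTime=2000 and a negotiated timeout of 30000 ms, a socket error would shorten the remaining timeout to 4000 ms. The rule says nothing about a client that reconnects to a different server within that window, which is the case raised in the comment above.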




[jira] Commented: (ZOOKEEPER-918) Review of BookKeeper Documentation (Sequence flow and failure scenarios)

2010-11-17 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12932902#action_12932902
 ] 

Flavio Junqueira commented on ZOOKEEPER-918:


Amit, just to give you an update, we have been discussing switching to a new 
documentation system soon (ZOOKEEPER-925), so we were wondering if it would be 
a problem waiting until we have it. Assuming the new system is easier to work 
with, we can more easily introduce your notes to the release documentation. 
Does it sound ok? 

If we take too long, then we can rethink it and find another way, like creating 
a wiki page or committing the pdf directly and linking to the BK documentation.

> Review of BookKeeper Documentation (Sequence flow and failure scenarios)
> 
>
> Key: ZOOKEEPER-918
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-918
> Project: Zookeeper
>  Issue Type: Task
>  Components: documentation
>Reporter: Amit Jaiswal
>Assignee: Amit Jaiswal
>Priority: Minor
> Fix For: 3.3.3, 3.4.0
>
> Attachments: BookKeeperInternals.doc, BookKeeperInternals.pdf
>
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> I have prepared a document describing some of the internals of bookkeeper in 
> terms of:
> 1. Sequence of operations
> 2. Files layout
> 3. Failure scenarios
> The document was prepared mostly by reading the code. Can somebody who 
> understands the design review it?




[jira] Commented: (ZOOKEEPER-922) enable faster timeout of sessions in case of unexpected socket disconnect

2010-11-16 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12932512#action_12932512
 ] 

Flavio Junqueira commented on ZOOKEEPER-922:


I think I understand your motivation, but I'm not sure it will work the way you 
expect it to. I'm afraid that you might end up getting lots of false 
positives due to delays introduced by the environment (e.g., jvm gc). Let me 
clarify one thing first: when you refer to clients crashing, are you thinking 
about the jvm crashing or the whole machine becoming unavailable? Basically, my 
question is whether you really expect connections to be cleanly closed.





[jira] Commented: (ZOOKEEPER-922) enable faster timeout of sessions in case of unexpected socket disconnect

2010-11-16 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12932500#action_12932500
 ] 

Flavio Junqueira commented on ZOOKEEPER-922:


Hi! I'm confused by this proposal. What happens if the client disconnects from 
one server and moves to another? Or do you want to be able to disable that 
feature as well?




[jira] Commented: (ZOOKEEPER-900) FLE implementation should be improved to use non-blocking sockets

2010-11-15 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12932228#action_12932228
 ] 

Flavio Junqueira commented on ZOOKEEPER-900:


If we fix the findbugs issue here, then we should just close ZOOKEEPER-902 
stating that it was resolved in ZOOKEEPER-900.

> FLE implementation should be improved to use non-blocking sockets
> -
>
> Key: ZOOKEEPER-900
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-900
> Project: Zookeeper
>  Issue Type: Bug
>Reporter: Vishal K
>Assignee: Vishal K
>Priority: Critical
> Fix For: 3.4.0
>
> Attachments: ZOOKEEPER-900.patch1, ZOOKEEPER-900.patch2
>
>
> From earlier email exchanges:
> 1. Blocking connects and accepts:
> a) The first problem is in manager.toSend(). This invokes connectOne(), which 
> does a blocking connect. While testing, I changed the code so that 
> connectOne() starts a new thread called AsyncConnect. AsyncConnect.run() 
> does a socketChannel.connect(). After starting AsyncConnect, connectOne 
> starts a timer. connectOne continues with normal operations if the connection 
> is established before the timer expires; otherwise, when the timer expires, it 
> interrupts the AsyncConnect thread and returns. In this way, I can have an 
> upper bound on the amount of time we need to wait for connect to succeed. Of 
> course, this was a quick fix for my testing. Ideally, we should use Selector 
> to do non-blocking connects/accepts. I am planning to do that later once we 
> at least have a quick fix for the problem and consensus from others for the 
> real fix (this problem is a big blocker for us). Note that it is OK to do 
> blocking IO in the SendWorker and RecvWorker threads since they only block on 
> IO to their respective peer.
> b) The blocking IO problem is not restricted to connectOne(); it also occurs 
> in receiveConnection(). The Listener thread calls receiveConnection() for 
> each incoming connection request. receiveConnection does blocking IO to get 
> the peer's info (s.read(msgBuffer)). Worse, it invokes connectOne() back to the 
> peer that had sent the connection request. All of this is happening from the 
> Listener. In short, if a peer fails after initiating a connection, the 
> Listener thread won't be able to accept connections from other peers, because 
> it would be stuck in read() or connectOne(). Also, the code has an inherent 
> cycle. initiateConnection() and receiveConnection() will have to be very 
> carefully synchronized; otherwise, we could run into deadlocks. This code is 
> going to be difficult to maintain/modify.
> Also see: https://issues.apache.org/jira/browse/ZOOKEEPER-822
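
For illustration only, here is a minimal sketch of the Selector-based alternative the description asks for: a connect attempt with a hard upper bound on the wait. The class BoundedConnect and its methods are hypothetical names and are not taken from QuorumCnxManager or from any attached patch. The sketch only probes the peer and closes the channel again.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.SocketChannel;

// Hypothetical helper, not part of QuorumCnxManager.
final class BoundedConnect {

    /** Returns true if a TCP connection to addr completes within timeoutMs. */
    static boolean connectWithin(InetSocketAddress addr, long timeoutMs) {
        try (Selector selector = Selector.open();
             SocketChannel channel = SocketChannel.open()) {
            channel.configureBlocking(false);
            if (channel.connect(addr)) {
                return true; // completed immediately, which can happen on loopback
            }
            channel.register(selector, SelectionKey.OP_CONNECT);
            long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
            while (true) {
                long remainingMs = (deadline - System.nanoTime()) / 1_000_000L;
                if (remainingMs <= 0) {
                    return false; // upper bound reached; the caller is never stuck
                }
                // select(0) would block forever, so only positive waits are passed in.
                if (selector.select(remainingMs) > 0) {
                    return channel.finishConnect(); // throws if the connect failed
                }
            }
        } catch (IOException e) {
            return false; // refused, unreachable, or another I/O failure
        }
    }

    /** Connects to a throwaway listener on the loopback interface. */
    static boolean demo() {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            InetSocketAddress addr =
                new InetSocketAddress(server.getInetAddress(), server.getLocalPort());
            return connectWithin(addr, 2000);
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("connected within bound: " + demo());
    }
}
```

A real implementation would presumably keep the connected channel and hand it to the worker threads instead of closing it, and would share one Selector across peers.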




[jira] Updated: (ZOOKEEPER-900) FLE implementation should be improved to use non-blocking sockets

2010-11-15 Thread Flavio Junqueira (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flavio Junqueira updated ZOOKEEPER-900:
---

Status: Open  (was: Patch Available)

Hi Vishal, Good job, thanks! The patch looks pretty good to me. Just a few 
points:

# Findbugs complained about the fact that we are not checking if sock is null 
in line 674. It could be null if the previous catch block is executed. I was 
actually thinking that there should be a single try block followed by two catch 
blocks, no? 
# You may also consider fixing the other two issues Findbugs is complaining 
about. The statement declaring msgLength should be removed. It was probably 
there for debugging purposes;
# From the patch, it looks like the formatting of some of the log statements 
got messed up. I would appreciate it if you could fix those. I've seen just a 
couple of them.
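
As a rough sketch of the first point (one try block followed by two catch blocks), the following shows the general shape. It is not the patch code: the class SingleTryConnect, the method probe, and the choice of exception types are assumptions made for illustration. Because the socket is created and used inside the same try block, no later statement can observe it as null.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical example; names are not taken from the ZOOKEEPER-900 patch.
final class SingleTryConnect {

    /** Tries to connect and reports the outcome as a short string. */
    static String probe(InetSocketAddress addr, int timeoutMs) {
        // Creation and use of sock share one try block, so there is no
        // code path on which sock is dereferenced after a failed open.
        try (Socket sock = new Socket()) {
            sock.connect(addr, timeoutMs);
            sock.setTcpNoDelay(true);
            return "connected";
        } catch (SocketTimeoutException e) {
            // More specific exception first: it is a subclass of IOException.
            return "timeout";
        } catch (IOException e) {
            return "failed";
        }
    }

    /** Probes a throwaway listener on the loopback interface. */
    static String demo() {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            return probe(
                new InetSocketAddress(server.getInetAddress(), server.getLocalPort()), 2000);
        } catch (IOException e) {
            return "failed";
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```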

> FLE implementation should be improved to use non-blocking sockets
> -
>
> Key: ZOOKEEPER-900
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-900
> Project: Zookeeper
>  Issue Type: Bug
>Reporter: Vishal K
>Assignee: Vishal K
>Priority: Critical
> Fix For: 3.4.0
>
> Attachments: ZOOKEEPER-900.patch1, ZOOKEEPER-900.patch2
>
>



[jira] Commented: (ZOOKEEPER-900) FLE implementation should be improved to use non-blocking sockets

2010-11-12 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12931470#action_12931470
 ] 

Flavio Junqueira commented on ZOOKEEPER-900:


Sure, I can investigate a little further. Vishal, let us know if you find 
anything.

> FLE implementation should be improved to use non-blocking sockets
> -
>
> Key: ZOOKEEPER-900
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-900
> Project: Zookeeper
>  Issue Type: Bug
>Reporter: Vishal K
>Assignee: Vishal K
>Priority: Critical
> Fix For: 3.4.0
>
>



[jira] Commented: (ZOOKEEPER-880) QuorumCnxManager$SendWorker grows without bounds

2010-11-12 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12931464#action_12931464
 ] 

Flavio Junqueira commented on ZOOKEEPER-880:


Benoit, just to clarify, is this also due to monitoring or scanning?

> QuorumCnxManager$SendWorker grows without bounds
> 
>
> Key: ZOOKEEPER-880
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-880
> Project: Zookeeper
>  Issue Type: Bug
>Affects Versions: 3.2.2
>Reporter: Jean-Daniel Cryans
>Priority: Critical
> Attachments: hbase-hadoop-zookeeper-sv4borg12.log.gz, 
> hbase-hadoop-zookeeper-sv4borg9.log.gz, jstack, 
> TRACE-hbase-hadoop-zookeeper-sv4borg9.log.gz
>
>
> We're seeing an issue where one server in the ensemble has a steadily growing 
> number of QuorumCnxManager$SendWorker threads up to a point where the OS runs 
> out of native threads, and at the same time we see a lot of exceptions in the 
> logs.  This is on 3.2.2 and our config looks like:
> {noformat}
> tickTime=3000
> dataDir=/somewhere_thats_not_tmp
> clientPort=2181
> initLimit=10
> syncLimit=5
> server.0=sv4borg9:2888:3888
> server.1=sv4borg10:2888:3888
> server.2=sv4borg11:2888:3888
> server.3=sv4borg12:2888:3888
> server.4=sv4borg13:2888:3888
> {noformat}
> The issue is on the first server. I'm going to attach thread dumps and logs 
> in a moment.
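
As a side note on the configuration above: initLimit and syncLimit are expressed in ticks, so the effective time limits are their products with tickTime. The small sketch below works out that arithmetic for the quoted values. The class QuorumTimeouts is a hypothetical helper and not ZooKeeper's own config parser.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper; ZooKeeper parses its configuration with its own classes.
final class QuorumTimeouts {

    /** Returns { initLimit * tickTime, syncLimit * tickTime } in milliseconds. */
    static long[] timeoutsMs(String zooCfg) {
        Properties p = new Properties();
        try {
            p.load(new StringReader(zooCfg));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        long tickTime = Long.parseLong(p.getProperty("tickTime").trim());
        long initLimit = Long.parseLong(p.getProperty("initLimit").trim());
        long syncLimit = Long.parseLong(p.getProperty("syncLimit").trim());
        return new long[] { initLimit * tickTime, syncLimit * tickTime };
    }

    public static void main(String[] args) {
        long[] t = timeoutsMs("tickTime=3000\ninitLimit=10\nsyncLimit=5\n");
        // With the values from the report: 10 * 3000 ms and 5 * 3000 ms.
        System.out.println("init limit ms: " + t[0] + ", sync limit ms: " + t[1]);
    }
}
```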




[jira] Commented: (ZOOKEEPER-900) FLE implementation should be improved to use non-blocking sockets

2010-11-12 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12931460#action_12931460
 ] 

Flavio Junqueira commented on ZOOKEEPER-900:


That's a pretty strong statement. You're essentially suggesting that we 
shouldn't rely upon TCP to implement even its basic functionality. Also, my 
understanding is that Vishal is just reasoning about the code and he hasn't 
been able to reproduce that situation. Please correct me if I'm mistaken, 
Vishal.

> FLE implementation should be improved to use non-blocking sockets
> -
>
> Key: ZOOKEEPER-900
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-900
> Project: Zookeeper
>  Issue Type: Bug
>Reporter: Vishal K
>Assignee: Vishal K
>Priority: Critical
> Fix For: 3.4.0
>
>



[jira] Commented: (ZOOKEEPER-900) FLE implementation should be improved to use non-blocking sockets

2010-11-12 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12931353#action_12931353
 ] 

Flavio Junqueira commented on ZOOKEEPER-900:


Hi Vishal, This is a good question. I'm actually assuming that the behavior of 
TCP is such that if I send a message and then close the channel properly 
(calling close()), due to the reliability and order guarantees of the 
connection, the message will get through before the connection closes. 
Essentially, I'm relying upon the TCP ACK to do exactly what you're proposing. 
However, it might be a good idea to make sure that the assumption is correct or 
if you know the answer already, just let me know. Overall I do agree that 
having an ACK is important.  
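
One way to check the assumption in the simplest case is a loopback test like the sketch below: the sender writes a message and closes right away, and the receiver reads until end of stream. The class SendThenClose is hypothetical. The sketch only covers an orderly close between two healthy endpoints on one machine; it says nothing about a peer that crashes, or about what the application on the other side did with the bytes, which is where an application-level ACK would still matter.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical loopback experiment, not ZooKeeper code.
final class SendThenClose {

    /** Sends message from one socket, closes it at once, returns what the peer read. */
    static String roundTrip(String message) {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Thread sender = new Thread(() -> {
                // Write the message and close immediately (try-with-resources).
                try (Socket s = server.accept();
                     OutputStream out = s.getOutputStream()) {
                    out.write(message.getBytes(StandardCharsets.UTF_8));
                } catch (IOException ignored) {
                    // The reader will simply see fewer bytes.
                }
            });
            sender.start();
            ByteArrayOutputStream received = new ByteArrayOutputStream();
            try (Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
                 InputStream in = client.getInputStream()) {
                byte[] chunk = new byte[256];
                int n;
                // read() returns -1 only after all data sent before the close has arrived.
                while ((n = in.read(chunk)) != -1) {
                    received.write(chunk, 0, n);
                }
            }
            sender.join();
            return new String(received.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("received: " + roundTrip("notification"));
    }
}
```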




> FLE implementation should be improved to use non-blocking sockets
> -
>
> Key: ZOOKEEPER-900
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-900
> Project: Zookeeper
>  Issue Type: Bug
>Reporter: Vishal K
>Assignee: Vishal K
>Priority: Critical
> Fix For: 3.4.0
>
>



[jira] Commented: (ZOOKEEPER-928) Follower should stop following and start FLE if it does not receive pings from the leader

2010-11-11 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12930953#action_12930953
 ] 

Flavio Junqueira commented on ZOOKEEPER-928:


Good point, Pat. I should have remembered this, since our earlier hack to 
introduce the connection timeout in QCM went through the socket directly, so it 
makes sense that we would have to do the same for other blocking operations. In 
fact, I have quickly tried replacing the read call in receiveConnection with 
the following:

{noformat}
s.socket().getInputStream().read(msgBytes);
{noformat}

and I get a SocketTimeoutException after the specified timeout. 
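
A self-contained way to reproduce this observation might look like the sketch below. It connects a SocketChannel to a listener that never sends anything, sets SO_TIMEOUT on the channel's socket, and reads through the socket's input stream. The class AdaptorReadTimeout and the silent listener are assumptions for the experiment and are not QuorumCnxManager code.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.SocketTimeoutException;
import java.nio.channels.SocketChannel;

// Hypothetical experiment, not QuorumCnxManager code.
final class AdaptorReadTimeout {

    /** Returns true if reading via the socket's stream times out against a silent peer. */
    static boolean readTimesOut(int timeoutMs) {
        // The listener never accepts or writes; the kernel still completes the
        // TCP handshake from the backlog, so the client connects and then waits.
        try (ServerSocket silentPeer = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             SocketChannel s = SocketChannel.open(new InetSocketAddress(
                 silentPeer.getInetAddress(), silentPeer.getLocalPort()))) {
            s.socket().setSoTimeout(timeoutMs);
            byte[] msgBytes = new byte[8];
            try {
                s.socket().getInputStream().read(msgBytes);
                return false; // data or end of stream arrived, which is not expected here
            } catch (SocketTimeoutException expected) {
                return true;
            }
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("stream read timed out: " + readTimesOut(200));
    }
}
```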

> Follower should stop following and start FLE if it does not receive pings 
> from the leader
> -
>
> Key: ZOOKEEPER-928
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-928
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum, server
>Affects Versions: 3.3.2
>Reporter: Vishal K
>Priority: Critical
>
> In Follower.followLeader() after syncing with the leader, the follower does:
>     while (self.isRunning()) {
>         readPacket(qp);
>         processPacket(qp);
>     }
> It looks like it relies on socket timeout expiry to figure out if the 
> connection with the leader has gone down.  So a follower *with no clients* 
> may never notice a faulty leader if the Leader has a software hang but the TCP 
> connections with the peers are still valid. Since it has no clients, it won't 
> heartbeat with the Leader. If a majority of followers are not connected to any 
> clients, then FLE will fail even if other followers attempt to elect a new 
> leader.
> We should keep track of pings received from the leader and, if we haven't seen 
> a ping packet from the leader for (syncLimit * tickTime), give up following 
> the leader.
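
The proposal in the last paragraph of the description could be sketched roughly as follows. LeaderPingTracker is a hypothetical class, not something in Follower or Learner; timestamps are passed in explicitly so the logic can be exercised without real clocks or sockets.

```java
// Hypothetical sketch of the proposed ping tracking; not ZooKeeper code.
final class LeaderPingTracker {
    private final long limitMs;
    private long lastPingMs;

    /** tickTimeMs and syncLimitTicks correspond to tickTime and syncLimit in the config. */
    LeaderPingTracker(int tickTimeMs, int syncLimitTicks, long nowMs) {
        this.limitMs = (long) tickTimeMs * syncLimitTicks;
        this.lastPingMs = nowMs;
    }

    /** Call whenever a ping packet from the leader is processed. */
    void onPing(long nowMs) {
        lastPingMs = nowMs;
    }

    /** True once more than syncLimit * tickTime has passed since the last ping. */
    boolean leaderSilentTooLong(long nowMs) {
        return nowMs - lastPingMs > limitMs;
    }

    public static void main(String[] args) {
        LeaderPingTracker tracker = new LeaderPingTracker(3000, 5, 0L);
        System.out.println(tracker.leaderSilentTooLong(15_000L)); // still within the limit
        System.out.println(tracker.leaderSilentTooLong(15_001L)); // limit exceeded
    }
}
```

In the follower loop, a check of this kind would have to run even while the read is blocked, for example by combining it with a read timeout, which is the point the comments on this issue go on to discuss.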




[jira] Commented: (ZOOKEEPER-928) Follower should stop following and start FLE if it does not receive pings from the leader

2010-11-10 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12930851#action_12930851
 ] 

Flavio Junqueira commented on ZOOKEEPER-928:


The documentation refers to SocketInputStream.read(), but it doesn't mention 
SocketChannel.read(). I ran a quick test with QuorumCnxManager and it doesn't 
seem to work. So maybe it is true that setting SO_TIMEOUT has no effect on 
SocketChannel.read(), which is kind of surprising to me. 
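
If SO_TIMEOUT indeed does not apply to SocketChannel.read(), one commonly used alternative is to bound the wait with a Selector. The sketch below is a generic illustration of that technique under that assumption; TimedChannelRead, readWithin, and the TIMED_OUT sentinel are hypothetical names, and the channel is left in non-blocking mode afterwards.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.SocketChannel;

// Hypothetical helper, not QuorumCnxManager code.
final class TimedChannelRead {
    static final int TIMED_OUT = -2; // made-up sentinel; -1 already means end of stream

    /** Reads into buf, waiting at most timeoutMs (must be positive) for data. */
    static int readWithin(SocketChannel ch, ByteBuffer buf, long timeoutMs) throws IOException {
        if (timeoutMs <= 0) {
            throw new IllegalArgumentException("timeoutMs must be positive");
        }
        ch.configureBlocking(false); // required before registering with a Selector
        try (Selector selector = Selector.open()) {
            ch.register(selector, SelectionKey.OP_READ);
            if (selector.select(timeoutMs) == 0) {
                return TIMED_OUT; // nothing became readable in time
            }
            return ch.read(buf);
        }
    }

    /** Reads from a listener that never sends anything and reports whether we timed out. */
    static boolean silentPeerTimesOut(int timeoutMs) {
        try (ServerSocket silentPeer = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             SocketChannel ch = SocketChannel.open(new InetSocketAddress(
                 silentPeer.getInetAddress(), silentPeer.getLocalPort()))) {
            return readWithin(ch, ByteBuffer.allocate(8), timeoutMs) == TIMED_OUT;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("channel read timed out: " + silentPeerTimesOut(200));
    }
}
```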

> Follower should stop following and start FLE if it does not receive pings 
> from the leader
> -
>
> Key: ZOOKEEPER-928
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-928
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum, server
>Affects Versions: 3.3.2
>Reporter: Vishal K
>Priority: Critical
>



[jira] Commented: (ZOOKEEPER-928) Follower should stop following and start FLE if it does not receive pings from the leader

2010-11-10 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12930800#action_12930800
 ] 

Flavio Junqueira commented on ZOOKEEPER-928:


My understanding is that SO_TIMEOUT also affects SocketChannel, since it builds 
on top of a Socket object.

> Follower should stop following and start FLE if it does not receive pings 
> from the leader
> -
>
> Key: ZOOKEEPER-928
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-928
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum, server
>Affects Versions: 3.3.2
>Reporter: Vishal K
>Priority: Critical
>



[jira] Commented: (ZOOKEEPER-928) Follower should stop following and start FLE if it does not receive pings from the leader

2010-11-10 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12930788#action_12930788
 ] 

Flavio Junqueira commented on ZOOKEEPER-928:


Hi Vishal, My understanding is that the readRecord call in readPacket will 
time out, even if the TCP connection is still up. The documentation at: 
http://download.oracle.com/javase/6/docs/api/java/net/SocketOptions.html

says that:
{noformat}
static int  SO_TIMEOUT
  Set a timeout on blocking Socket operations:
{noformat}

> Follower should stop following and start FLE if it does not receive pings 
> from the leader
> -
>
> Key: ZOOKEEPER-928
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-928
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum, server
>Affects Versions: 3.3.2
>Reporter: Vishal K
>Priority: Critical
> Fix For: 3.3.3, 3.4.0
>
>



[jira] Commented: (ZOOKEEPER-928) Follower should stop following and start FLE if it does not receive pings from the leader

2010-11-10 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12930774#action_12930774
 ] 

Flavio Junqueira commented on ZOOKEEPER-928:


I've just seen the messages on zookeeper-dev, and I'm not sure this is right:

# readPacket is implemented in Learner.java, and the socket read is performed 
in this line: leaderIs.readRecord(pp, "packet");
# leaderIs is an InputArchive instance instantiated in Learner:connectToLeader;
# The socket used to instantiate leaderIs has its SO_TIMEOUT value set right 
before in connectToLeader: sock.setSoTimeout(self.tickTime * self.initLimit).

Consequently, the operation should not be delayed indefinitely and should 
return after self.tickTime * self.initLimit. This discussion on SO_TIMEOUT 
sounds familiar, huh? ;-)
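
The behaviour described in the three points can be tried in isolation with a plain java.net.Socket, as in the sketch below. LearnerReadTimeout and the silent listener are assumptions for the experiment; DataInputStream stands in for the InputArchive that Learner actually uses, and small values replace the real tickTime and initLimit so the run finishes quickly.

```java
import java.io.BufferedInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical experiment; it mimics the setSoTimeout call, not Learner itself.
final class LearnerReadTimeout {

    /** Returns true if a blocking read gives up after tickTimeMs * initLimit. */
    static boolean readTimesOut(int tickTimeMs, int initLimit) {
        // The "leader" here accepts nothing and sends nothing, like a hung process
        // whose TCP connection is still up.
        try (ServerSocket silentLeader = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket sock = new Socket(silentLeader.getInetAddress(),
                                      silentLeader.getLocalPort())) {
            sock.setSoTimeout(tickTimeMs * initLimit);
            DataInputStream in =
                new DataInputStream(new BufferedInputStream(sock.getInputStream()));
            try {
                in.readInt(); // blocks until data, end of stream, or SO_TIMEOUT
                return false;
            } catch (SocketTimeoutException expected) {
                return true;
            }
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("read timed out: " + readTimesOut(50, 4));
    }
}
```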

> Follower should stop following and start FLE if it does not receive pings 
> from the leader
> -
>
> Key: ZOOKEEPER-928
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-928
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum, server
>Affects Versions: 3.3.2
>Reporter: Vishal K
>Priority: Critical
> Fix For: 3.3.3, 3.4.0
>
>



[jira] Commented: (ZOOKEEPER-909) Extract NIO specific code from ClientCnxn

2010-11-10 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12930546#action_12930546
 ] 

Flavio Junqueira commented on ZOOKEEPER-909:


Thomas, Check the console output on hudson, close to the end of the page. The 
failure seems to be on the C tests.

> Extract NIO specific code from ClientCnxn
> -
>
> Key: ZOOKEEPER-909
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-909
> Project: Zookeeper
>  Issue Type: Sub-task
>  Components: java client
>Reporter: Thomas Koch
>Assignee: Thomas Koch
> Fix For: 3.4.0
>
> Attachments: ClientCnxnSocketNetty.java, ZOOKEEPER-909.patch, 
> ZOOKEEPER-909.patch, ZOOKEEPER-909.patch, ZOOKEEPER-909.patch, 
> ZOOKEEPER-909.patch
>
>
> This patch is mostly the same patch as my last one for ZOOKEEPER-823 minus 
> everything Netty related. This means this patch only extracts all NIO-specific 
> code into the class ClientCnxnSocketNIO, which extends ClientCnxnSocket.
> I've redone this patch from current trunk step by step now and couldn't find 
> any logical error. I've already done a couple of successful test runs and 
> will continue to do so this night.
> It would be nice, if we could apply this patch as soon as possible to trunk. 
> This allows us to continue to work on the netty integration without blocking 
> the ClientCnxn class. Adding Netty after this patch should be only a matter 
> of adding the ClientCnxnSocketNetty class with the appropriate test cases.
> You could help me by reviewing the patch and by running it on whatever test 
> server you have available. Please send me any complete failure log you should 
> encounter to thomas at koch point ro. Thx!
> Update: Until now, I've collected 8 successful builds in a row!




[jira] Assigned: (ZOOKEEPER-900) FLE implementation should be improved to use non-blocking sockets

2010-11-09 Thread Flavio Junqueira (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flavio Junqueira reassigned ZOOKEEPER-900:
--

Assignee: Vishal K  (was: Flavio Junqueira)

> FLE implementation should be improved to use non-blocking sockets
> -
>
> Key: ZOOKEEPER-900
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-900
> Project: Zookeeper
>  Issue Type: Bug
>Reporter: Vishal K
>Assignee: Vishal K
>Priority: Critical
> Fix For: 3.4.0
>
>



[jira] Commented: (ZOOKEEPER-925) Consider maven site generation to replace our forrest site and documentation generation

2010-11-09 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12930214#action_12930214
 ] 

Flavio Junqueira commented on ZOOKEEPER-925:


I was wondering if by getting away from checking in generated docs, you mean 
that anyone should be able to come and change docs freely.

> Consider maven site generation to replace our forrest site and documentation 
> generation
> ---
>
> Key: ZOOKEEPER-925
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-925
> Project: Zookeeper
>  Issue Type: Wish
>  Components: documentation
>Reporter: Patrick Hunt
>
> See WHIRR-19 for some background.
> In whirr we looked at a number of site/doc generation facilities. In the end 
> Maven site generation plugin turned out to be by far the best option. You can 
> see our nascent site here (no attempt at styling,etc so far):
> http://incubator.apache.org/whirr/
> In particular take a look at the quick start:
> http://incubator.apache.org/whirr/quick-start-guide.html
> which was generated from
> http://svn.apache.org/repos/asf/incubator/whirr/trunk/src/site/confluence/quick-start-guide.confluence
> notice this was standard wiki markup (confluence wiki markup, same as 
> available from apache)
> You can read more about mvn site plugin here:
> http://maven.apache.org/guides/mini/guide-site.html
> Notice that other formats are available, not just confluence markup, also 
> note that you can use different markup formats if you like in the same site 
> (although probably not a great idea, but in some cases might be handy, for 
> example whirr uses the confluence wiki, so we can pretty much copy/paste 
> source docs from wiki to our site (svn) if we like)
> Re maven vs our current ant based build. It's probably a good idea for us to 
> move the build to maven at some point. We could initially move just the doc 
> generation, and then incrementally move functionality from build.xml to mvn 
> over a longer time period.




[jira] Commented: (ZOOKEEPER-925) Consider maven site generation to replace our forrest site and documentation generation

2010-11-09 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12930201#action_12930201
 ] 

Flavio Junqueira commented on ZOOKEEPER-925:


I'm fine with moving to a different doc system and having our own look&feel, 
but my main concern is having a doc generation that is relatively easy to use. 
If it is difficult to use, then contributors won't feel very motivated to write 
documentation... It would be great to get folks to stop whining when they have 
to write documentation, and stop blaming Forrest. :-)

To be fair, I must say that my experience with Forrest hasn't been great. 
Having to insert tags by hand and not being able to find descriptions for tags 
easily made it hard for me to like Forrest. The output looks good to me, 
though. 




[jira] Commented: (ZOOKEEPER-925) Consider maven site generation to replace our forrest site and documentation generation

2010-11-08 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12929988#action_12929988
 ] 

Flavio Junqueira commented on ZOOKEEPER-925:


Pat, Any thoughts on what it would take to port from Forrest to Maven site 
generation?




[jira] Commented: (ZOOKEEPER-917) Leader election selected incorrect leader

2010-11-08 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12929678#action_12929678
 ] 

Flavio Junqueira commented on ZOOKEEPER-917:


Hi Vishal, There is possibly a misunderstanding here. Server 2 reported in this 
jira (the leader) does not go back to an earlier epoch, but the other two do, 
and they are following, so if I understand your argument correctly, the 
exception is being applied as you suggest.



> Leader election selected incorrect leader
> -
>
> Key: ZOOKEEPER-917
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-917
> Project: Zookeeper
>  Issue Type: Bug
>  Components: leaderElection, server
>Affects Versions: 3.2.2
> Environment: Cloudera distribution of zookeeper (patched to never 
> cache DNS entries)
> Debian lenny
>Reporter: Alexandre Hardy
>Priority: Critical
> Fix For: 3.3.3, 3.4.0
>
> Attachments: zklogs-20101102144159SAST.tar.gz
>
>
> We had three nodes running zookeeper:
>   * 192.168.130.10
>   * 192.168.130.11
>   * 192.168.130.14
> 192.168.130.11 failed, and was replaced by a new node 192.168.130.13 
> (automated startup). The new node had not participated in any zookeeper 
> quorum previously. The node 192.168.130.11 was permanently removed from 
> service and could not contribute to the quorum any further (powered off).
> DNS entries were updated for the new node to allow all the zookeeper servers 
> to find the new node.
> The new node 192.168.130.13 was selected as the LEADER, despite the fact that 
> it had not seen the latest zxid.
> This particular problem has not been verified with later versions of 
> zookeeper, and no attempt has been made to reproduce this problem as yet.




[jira] Commented: (ZOOKEEPER-914) QuorumCnxManager blocks forever

2010-11-08 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12929675#action_12929675
 ] 

Flavio Junqueira commented on ZOOKEEPER-914:


Hi Vishal, The Socket documentation does sound ambiguous, but my understanding 
is that SO_TIMEOUT is for blocking mode, not non-blocking mode. Non-blocking 
calls return immediately, so they shouldn't need a timeout value, no? 
Independent of using it or not, I would be curious to learn if my understanding 
is incorrect.
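
To make the SO_TIMEOUT point concrete, here is a minimal sketch that is not taken from the ZooKeeper code base. It assumes a plain java.net.Socket on the loopback interface whose peer never writes; the class name SoTimeoutSketch and the 200 ms value are made up for the example. It only shows that the option bounds a blocking read on the socket's input stream, and makes no claim about SocketChannel reads.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical demo class, not part of ZooKeeper.
final class SoTimeoutSketch {

    /** Returns true if a blocking read from a silent peer gives up after timeoutMs. */
    static boolean readTimesOut(int timeoutMs) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback);
             Socket client = new Socket(loopback, server.getLocalPort());
             Socket accepted = server.accept()) {
            // SO_TIMEOUT applies to blocking reads on this socket's stream.
            accepted.setSoTimeout(timeoutMs);
            try {
                // The client never writes, so without a timeout this blocks.
                accepted.getInputStream().read();
                return false;
            } catch (SocketTimeoutException e) {
                return true;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("read timed out: " + readTimesOut(200));
    }
}
```

A non-blocking channel would return from read() immediately with whatever is available, which is why a timeout would have nothing to bound there.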

About the release to include the fix, I think Mahadev later came and changed it 
to 3.3.3. It is fine with me, and we just need to check what the schedule for 
3.3.3 is. My preference is to work directly on ZOOKEEPER-900 (or 901, which I 
think might be a more significant change), if you think we can produce a patch 
in time for 3.3.3. 

> QuorumCnxManager blocks forever 
> 
>
> Key: ZOOKEEPER-914
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-914
> Project: Zookeeper
>  Issue Type: Bug
>  Components: leaderElection
>Reporter: Vishal K
>Assignee: Vishal K
>Priority: Blocker
> Fix For: 3.3.3, 3.4.0
>
>
> This was a disaster. While testing our application we ran into a scenario 
> where a rebooted follower could not join the cluster. Further debugging 
> showed that the follower could not join because the QuorumCnxManager on the 
> leader was blocked for an indefinite amount of time in receiveConnection()
> "Thread-3" prio=10 tid=0x7fa920005800 nid=0x11bb runnable 
> [0x7fa9275ed000]
>java.lang.Thread.State: RUNNABLE
> at sun.nio.ch.FileDispatcher.read0(Native Method)
> at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:21)
> at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:233)
> at sun.nio.ch.IOUtil.read(IOUtil.java:206)
> at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:236)
> - locked <0x7fa93315f988> (a java.lang.Object)
> at 
> org.apache.zookeeper.server.quorum.QuorumCnxManager.receiveConnection(QuorumCnxManager.java:210)
> at 
> org.apache.zookeeper.server.quorum.QuorumCnxManager$Listener.run(QuorumCnxManager.java:501)
> I had pointed out this bug along with several other problems in 
> QuorumCnxManager earlier in 
> https://issues.apache.org/jira/browse/ZOOKEEPER-900 and 
> https://issues.apache.org/jira/browse/ZOOKEEPER-822.
> I forgot to patch this one as a part of ZOOKEEPER-822. I am working on a fix 
> and a patch will be out soon. 
> The problem is that QuorumCnxManager is using SocketChannel in blocking mode. 
> It does a read() in receiveConnection() and a write() in initiateConnection().
> Sorry, but this is really bad programming. It also points to a lack of 
> failure tests for QuorumCnxManager.




[jira] Commented: (ZOOKEEPER-917) Leader election selected incorrect leader

2010-11-07 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12929354#action_12929354
 ] 

Flavio Junqueira commented on ZOOKEEPER-917:


Hi Vishal, I certainly understand that not having dedicated development time 
is an issue. I actually didn't know you were interested in cluster 
membership... I'm glad to hear it, though.

On your questions:
# Suppose we have an ensemble comprising 3 servers: A, B, and C. Now suppose 
that C is the leader, and both A and B follow C. If A disconnects from C for 
whatever reason (e.g., network partition) and it tries to elect a leader, it 
won't get any other process in the LOOKING state. It will actually receive a 
notification from C saying that it is leading and one from B saying that it is 
following C, both with an earlier leader election epoch. To avoid having A 
locked out (not able to elect C as leader), we implemented this exception: a 
process accepts going back to an earlier leader election only if it receives a 
notification from the leader saying that it is leading and from a quorum saying 
that it is following;
# I'm not sure if you are referring to the specific problem of this jira or if 
you are asking about my hypothetical example. Assuming it is the former, the 
follower (Follower:followLeader()) checks if the leader is proposing an earlier 
epoch, and if not, it accepts the leader snapshot. Because the epoch is the 
same, all followers will accept the leader snapshot and follow it. 
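
The exception described in the first point can be sketched as follows. This is an illustration only, not the FastLeaderElection code: the class, the Notification type, and the mayFollow method are hypothetical names, and counting the leader's own notification toward the quorum is an assumption made so that the A, B, C example works with three servers.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch, not ZooKeeper's actual election code.
final class EarlierEpochRuleSketch {
    enum State { LOOKING, FOLLOWING, LEADING }

    /** A notification as seen by the looking server. */
    static final class Notification {
        final long sender;
        final State state;
        final long votedLeader;

        Notification(long sender, State state, long votedLeader) {
            this.sender = sender;
            this.state = state;
            this.votedLeader = votedLeader;
        }
    }

    /**
     * True if the looking server may follow candidate although the
     * notifications carry an earlier election epoch: the candidate itself
     * says it is leading, and a majority (leader included, by assumption)
     * says it supports that candidate.
     */
    static boolean mayFollow(long candidate, List<Notification> received, int ensembleSize) {
        boolean leaderConfirmed = false;
        Set<Long> supporters = new HashSet<>();
        for (Notification n : received) {
            if (n.votedLeader != candidate) {
                continue; // supports someone else
            }
            if (n.sender == candidate && n.state == State.LEADING) {
                leaderConfirmed = true;
                supporters.add(n.sender);
            } else if (n.state == State.FOLLOWING) {
                supporters.add(n.sender);
            }
        }
        return leaderConfirmed && supporters.size() > ensembleSize / 2;
    }

    public static void main(String[] args) {
        // Servers A=1, B=2, C=3; A is looking, C leads, B follows C.
        List<Notification> seenByA = Arrays.asList(
                new Notification(3, State.LEADING, 3),
                new Notification(2, State.FOLLOWING, 3));
        System.out.println("A may follow C: " + mayFollow(3, seenByA, 3));
    }
}
```

Requiring both the leader's own message and a quorum of supporters is what keeps a single stale notification from pulling A back to an old epoch.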




[jira] Commented: (ZOOKEEPER-914) QuorumCnxManager blocks forever

2010-11-07 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12929345#action_12929345
 ] 

Flavio Junqueira commented on ZOOKEEPER-914:


Hi Vishal, I also appreciate your contributions and your comments. I also 
understand your frustration when you find issues with the code, but I think 
that it is possibly equally frustrating for the developer who thought that at 
least basic issues were covered, so please try to think that we don't introduce 
bugs on purpose (at least I don't) and our review process is not perfect. 

Regarding clover reports, we have agreed already that code coverage is not 
bulletproof, and in fact several other metrics have been proposed in the 
scientific literature, but it does indicate that some call path including a 
given piece of code was exercised. It certainly doesn't measure more complex 
cases, like race conditions, crashes and so on. In fact, if you have a better 
way of measuring test coverage, I'd be happy to hear about it.

I'm not sure if you agree, but it seems to me that we should close this jira 
because the technical discussion here seems to be similar to the one of 
ZOOKEEPER-900. I'll try to address the concerns you raised regardless of what 
will happen to this jira:

# My point about SO_TIMEOUT comes from here: 
http://download.oracle.com/javase/6/docs/api/java/net/Socket.html#setSoTimeout%28int%29
# I obviously prefer to go with real fixes instead of hacking, but we need to 
have release 3.3.2 out, and it sounded like introducing a configurable timeout 
would fix your problem until the next release;
# About testing beyond the handshake, I'm not sure what you're proposing. If 
the blocking calls are part of the handshake and this is what is failing for 
you, then this is what we should target now, no?   




[jira] Commented: (ZOOKEEPER-918) Review of BookKeeper Documentation (Sequence flow and failure scenarios)

2010-11-05 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12928566#action_12928566
 ] 

Flavio Junqueira commented on ZOOKEEPER-918:


This is really nice, Amit, thanks. I haven't had a chance to go carefully over 
the document, but my first reaction is that this should be a live document, and 
perhaps a wiki page would suit this purpose well. What do you think?

> Review of BookKeeper Documentation (Sequence flow and failure scenarios)
> 
>
> Key: ZOOKEEPER-918
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-918
> Project: Zookeeper
>  Issue Type: Task
>  Components: documentation
>Reporter: Amit Jaiswal
>Priority: Trivial
> Fix For: 3.3.3, 3.4.0
>
> Attachments: BookKeeperInternals.pdf
>
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> I have prepared a document describing some of the internals of bookkeeper in 
> terms of:
> 1. Sequence of operations
> 2. Files layout
> 3. Failure scenarios
> The document was prepared mostly by reading the code. Can somebody who 
> understands the design review it?




[jira] Resolved: (ZOOKEEPER-917) Leader election selected incorrect leader

2010-11-04 Thread Flavio Junqueira (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flavio Junqueira resolved ZOOKEEPER-917.


Resolution: Not A Problem

My pleasure to help. I'm marking it as not a problem for now, but feel free to 
come back and ask for more clarification if needed.




[jira] Commented: (ZOOKEEPER-917) Leader election selected incorrect leader

2010-11-04 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12928179#action_12928179
 ] 

Flavio Junqueira commented on ZOOKEEPER-917:


Hi Alexandre, It is a key premise of important replication algorithms, like 
Paxos, that there is a portion of the state that persists across crashes (and 
recoveries). By replacing server 2 with a fresh server, you simply got rid of 
the persistent state. In general, making the replacement you've made may lead 
to trouble due to the problem I described a few postings up. Of course, if 
you wait for a successful election, the problem is supposed to go away because 
you have reestablished a quorum and this quorum does not contain the faulty 
server, but then you have to make sure the election happens before you 
introduce the fresh server, perhaps through jmx or by inspecting the logs. 
Simply setting a reasonable timeout will work in most cases, but the leader 
election is not guaranteed to succeed, and there is a chance, likely to be 
small, that you'll end up with a corrupt state. 






[jira] Commented: (ZOOKEEPER-917) Leader election selected incorrect leader

2010-11-04 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12928171#action_12928171
 ] 

Flavio Junqueira commented on ZOOKEEPER-917:


The program I was using to open your logs was hiding some of the messages for 
some reason unknown to me. I now understand why the leader was elected in your 
case and the behavior is legitimate. Let me try to explain.

We currently repeat the last notification sent to a given server upon 
reconnecting to it. This is to avoid problems with messages partially sent, 
and, assuming no further bugs, the protocol is resilient to message 
duplicates. At the same time, a server A decides to follow another server B if 
it receives a message from B saying that B is leading and from a quorum saying 
that they are following, even if A is in a later election epoch. This mechanism 
is there to avoid A being locked out of the ensemble in the case it partitions 
away and comes back later. 

From your logs, what happens is:

# Fresh server 2 receives previous notifications from 0 and 1, and decides to 
lead;
# Server 1 receives the last message from server 0 saying that it is following 
2 (which was the previous leader), and the notification from 2 saying that it 
is leading. Server 1 consequently decides to follow 2;
# Server 0 receives the last message from server 1 saying that it is following 
2 (which was the previous leader), and the notification from 2 saying that it 
is leading. Server 0 consequently decides to follow 2.

Now the main problem I see is that the followers accept the snapshot from the 
leader, and they shouldn't given that they have moved to a later epoch. I 
suspect that we currently allow a server to come back to an epoch it has been 
in the past to again avoid having a server locked out after being partitioned 
away and healing, but I need to do some further inspection.

My overall take is that your case is unfortunately not legitimate, meaning that 
we don't currently provision for configuration changes. The case you expose in 
general constitutes a loss of quorum, and that violates one of our core 
assumptions. In more detail, a quorum supporting a leader must have a non-empty 
intersection with the quorum of servers that have accepted requests in the 
previous epoch. Wiping out the state of server 2, by replacing it with a fresh 
server, leads to the situation in which just one server contains all 
transactions accepted by a quorum (and possibly committed). If you hadn't 
replaced server 2 with a fresh server, then either server 2 would have been 
elected again just the same, and it would be fine because it was previously the 
leader, or it wouldn't have been elected because the leader was previously 
another server and the last notifications of 0 and 1 would be supporting a 
different server.
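
The quorum-intersection argument above can be checked with a small sketch. It is an illustration under stated assumptions, not ZooKeeper code: quorums are taken to be simple majorities, the class and method names are hypothetical, and the server ids 0, 1, 2 follow the numbering used in this comment, with {1, 2} picked as an example of the quorum that accepted requests in the previous epoch.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch, not part of ZooKeeper.
final class QuorumIntersectionSketch {

    /** True if every pair of majority subsets of an n-server ensemble shares a server. */
    static boolean everyMajorityPairIntersects(int n) {
        for (int a = 0; a < (1 << n); a++) {
            if (Integer.bitCount(a) <= n / 2) {
                continue; // a is not a majority
            }
            for (int b = 0; b < (1 << n); b++) {
                if (Integer.bitCount(b) > n / 2 && (a & b) == 0) {
                    return false;
                }
            }
        }
        return true;
    }

    /**
     * True if the electing quorum still contains a server that holds the
     * state accepted in the previous epoch, given that some servers had
     * their persistent state wiped.
     */
    static boolean electionSeesAcceptedState(Set<Integer> acceptedBy,
                                             Set<Integer> wiped,
                                             Set<Integer> electingQuorum) {
        Set<Integer> holders = new HashSet<>(acceptedBy);
        holders.removeAll(wiped);
        holders.retainAll(electingQuorum);
        return !holders.isEmpty();
    }

    /** Small helper to build a set of server ids. */
    static Set<Integer> setOf(int... ids) {
        Set<Integer> s = new HashSet<>();
        for (int id : ids) {
            s.add(id);
        }
        return s;
    }

    public static void main(String[] args) {
        System.out.println(everyMajorityPairIntersects(3));
        // Server 2 keeps its state: quorum {0, 2} overlaps the holders {1, 2}.
        System.out.println(electionSeesAcceptedState(setOf(1, 2), setOf(), setOf(0, 2)));
        // Server 2 replaced by a fresh one: quorum {0, 2} no longer sees the state.
        System.out.println(electionSeesAcceptedState(setOf(1, 2), setOf(2), setOf(0, 2)));
    }
}
```

The last call is the situation in this jira: the majority property still holds on paper, but the fresh server only counts toward the quorum by id and carries none of the accepted transactions.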

On reconfigurations, we have talked about it 
(http://wiki.apache.org/hadoop/ZooKeeper/ClusterMembership), but we haven't 
made enough progress recently and it is currently not implemented. It would be 
great to get some help here.

Let me know if this analysis makes any sense to you, please.




[jira] Commented: (ZOOKEEPER-882) Startup loads last transaction from snapshot

2010-11-03 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12928013#action_12928013
 ] 

Flavio Junqueira commented on ZOOKEEPER-882:


Ok, got it. I agree that it makes sense to submit both. Just name the patch 
file including the verification test differently and explain the difference in 
the jira.

> Startup loads last transaction from snapshot
> 
>
> Key: ZOOKEEPER-882
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-882
> Project: Zookeeper
>  Issue Type: Bug
>  Components: server
>Reporter: Jared Cantwell
>Assignee: Jared Cantwell
>Priority: Minor
> Fix For: 3.4.0
>
> Attachments: 882.diff, restore, ZOOKEEPER-882.patch
>
>
> On startup, the server first loads the latest snapshot, and then loads from 
> the log starting at the last transaction in the snapshot.  It should begin 
> from one past that last transaction in the log.  I will attach a possible 
> patch.




[jira] Commented: (ZOOKEEPER-900) FLE implementation should be improved to use non-blocking sockets

2010-11-03 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12928011#action_12928011
 ] 

Flavio Junqueira commented on ZOOKEEPER-900:


Hi Vishal, I like your proposal, it seems reasonable and not difficult to 
implement.

On your questions:

# I don't think it is necessary to kill a pair SenderWorker/RecvWorker every 
time, and I'd certainly support changing it;
# I'm not sure where you're suggesting to introduce a delay. In the FLE code, a 
server sends a new batch of notifications if it changes its vote or if it times 
out waiting for a new notification. This timeout value increases over time.  I 
was actually thinking that we should reset the timeout value upon receiving a 
notification. I think this is a bug.
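
A minimal sketch of the timeout behaviour suggested in the second point, with the reset on receiving a notification. It is not the FLE code: the class name, the method names, and the doubling policy with a cap are assumptions chosen for illustration, as are the 200 ms and 60000 ms values.

```java
// Hypothetical sketch, not ZooKeeper's FastLeaderElection.
final class NotificationTimeoutSketch {
    private final int initialMs;
    private final int maxMs;
    private int currentMs;

    NotificationTimeoutSketch(int initialMs, int maxMs) {
        this.initialMs = initialMs;
        this.maxMs = maxMs;
        this.currentMs = initialMs;
    }

    /** How long to wait for the next notification. */
    int currentMs() {
        return currentMs;
    }

    /** No notification arrived in time: wait longer next round (doubling is an assumption). */
    void onTimeout() {
        currentMs = Math.min(currentMs * 2, maxMs);
    }

    /** A notification arrived: go back to the initial timeout, as proposed above. */
    void onNotification() {
        currentMs = initialMs;
    }

    public static void main(String[] args) {
        NotificationTimeoutSketch t = new NotificationTimeoutSketch(200, 60000);
        t.onTimeout();
        t.onTimeout();
        System.out.println("after two timeouts: " + t.currentMs() + " ms");
        t.onNotification();
        System.out.println("after a notification: " + t.currentMs() + " ms");
    }
}
```

Without the reset, a server that has been waiting for a long time would keep a large timeout even after its peers become responsive again.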

Given that it is your proposal, I'd be happy to let you take a stab at it and 
help you out if you need a hand. Does that make sense to you?

> FLE implementation should be improved to use non-blocking sockets
> -
>
> Key: ZOOKEEPER-900
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-900
> Project: Zookeeper
>  Issue Type: Bug
>Reporter: Vishal K
>Assignee: Flavio Junqueira
>Priority: Critical
>
> From earlier email exchanges:
> 1. Blocking connects and accepts:
> a) The first problem is in manager.toSend(). This invokes connectOne(), which 
> does a blocking connect. While testing, I changed the code so that 
> connectOne() starts a new thread called AsyncConnect(). AsyncConnect.run() 
> does a socketChannel.connect(). After starting AsyncConnect, connectOne 
> starts a timer. connectOne continues with normal operations if the connection 
> is established before the timer expires, otherwise, when the timer expires it 
> interrupts AsyncConnect() thread and returns. In this way, I can have an 
> upper bound on the amount of time we need to wait for connect to succeed. Of 
> course, this was a quick fix for my testing. Ideally, we should use Selector 
> to do non-blocking connects/accepts. I am planning to do that later once we 
> at least have a quick fix for the problem and consensus from others for the 
> real fix (this problem is big blocker for us). Note that it is OK to do 
> blocking IO in SenderWorker and RecvWorker threads since they block IO to the 
> respective !
 peer.
> b) The blocking IO problem is not just restricted to connectOne(), but also 
> in receiveConnection(). The Listener thread calls receiveConnection() for 
> each incoming connection request. receiveConnection does blocking IO to get 
> peer's info (s.read(msgBuffer)). Worse, it invokes connectOne() back to the 
> peer that had sent the connection request. All of this is happening from the 
> Listener. In short, if a peer fails after initiating a connection, the 
> Listener thread won't be able to accept connections from other peers, because 
> it would be stuck in read() or connectOne(). Also, the code has an inherent 
> cycle. initiateConnection() and receiveConnection() will have to be very 
> carefully synchronized; otherwise, we could run into deadlocks. This code is 
> going to be difficult to maintain/modify.
> Also see: https://issues.apache.org/jira/browse/ZOOKEEPER-822
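
The Selector-based connect mentioned in the description could look roughly like the sketch below. It is only an illustration of bounding the connect time with a non-blocking channel; the class BoundedConnect and its method are hypothetical names, not part of QuorumCnxManager, and the sketch switches the channel back to blocking mode afterwards on the assumption that the worker threads keep doing blocking IO.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.SocketChannel;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: connects with an upper bound on the time spent waiting. */
final class BoundedConnect {
    /** Returns a connected, blocking channel, or null if the timeout expired first. */
    static SocketChannel connect(InetSocketAddress addr, long timeoutMillis) throws IOException {
        SocketChannel ch = SocketChannel.open();
        boolean connected = false;
        try {
            ch.configureBlocking(false);
            connected = ch.connect(addr);            // may complete immediately (e.g. loopback)
            if (!connected) {
                try (Selector selector = Selector.open()) {
                    ch.register(selector, SelectionKey.OP_CONNECT);
                    long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
                    while (!connected) {
                        long remainingMs = TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
                        if (remainingMs <= 0) {
                            break;                    // timed out; the channel is closed below
                        }
                        selector.select(remainingMs); // waits at most remainingMs
                        selector.selectedKeys().clear();
                        connected = ch.finishConnect(); // throws IOException if the connect failed
                    }
                }                                     // closing the selector deregisters the channel
            }
            if (connected) {
                ch.configureBlocking(true);           // workers may keep using blocking IO
            }
            return connected ? ch : null;
        } finally {
            if (!connected) {
                ch.close();
            }
        }
    }
}
```

Compared with the AsyncConnect thread plus timer, this keeps the waiting in the calling thread and needs no interrupt to give up.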




[jira] Commented: (ZOOKEEPER-917) Leader election selected incorrect leader

2010-11-03 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12927949#action_12927949
 ] 

Flavio Junqueira commented on ZOOKEEPER-917:


Even though the logs do not make a lot of sense to me at this point, I was 
thinking that your scenario is not supposed to work given our guarantees. Let's 
look at an example.

Suppose we have 3 servers: A, B, and C. Suppose that C is initially the leader 
and proposes operations that B is able to ack, but A is not. Now, suppose that 
I come and replace C with a fresh server, same id but empty state, and I do it 
before A and B are able to elect a new leader and recover. In this case, A and 
C may form a quorum and the state of the ZooKeeper ensemble would be empty. The 
replacement of server C with a fresh server violates our assumptions. 

It should work, though, if you add a fresh server to a working ensemble. That 
is, you let A and B elect a new leader, and then you start the new C server. In 
your case, I'm still not sure why it happens because the initial zxid of node 1 
is 4294967742 according to your excerpt. 
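
The A/B/C example can be played through with a toy model. The sketch below is a deliberate simplification, not the FLE algorithm: it assumes that any majority of servers may elect the participant with the highest last zxid (ties broken by server id), and the zxid values 0 and 5 are made up for illustration.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Toy model of an election among a majority; names and rules are illustrative assumptions. */
final class ElectionSketch {
    static final class Server {
        final String name;
        final long id;
        final long lastZxid;

        Server(String name, long id, long lastZxid) {
            this.name = name;
            this.id = id;
            this.lastZxid = lastZxid;
        }
    }

    /** A strict majority of the ensemble forms a quorum. */
    static boolean isQuorum(int participants, int ensembleSize) {
        return participants > ensembleSize / 2;
    }

    /** Assumed rule: highest last zxid wins, ties go to the highest server id. */
    static Server elect(List<Server> participants, int ensembleSize) {
        if (!isQuorum(participants.size(), ensembleSize)) {
            throw new IllegalStateException("no quorum");
        }
        return participants.stream()
                .max(Comparator.comparingLong((Server s) -> s.lastZxid)
                        .thenComparingLong(s -> s.id))
                .get();
    }

    public static void main(String[] args) {
        Server a = new Server("A", 1, 0);      // never acked the proposals
        Server b = new Server("B", 2, 5);      // acked the proposals
        Server freshC = new Server("C", 3, 0); // replaced: same id, empty state

        // A and the fresh C are a majority of 3, yet neither has B's transactions.
        System.out.println(elect(Arrays.asList(a, freshC), 3).lastZxid); // prints 0
        // If A and B elect first, the leader carries the acked state.
        System.out.println(elect(Arrays.asList(a, b), 3).lastZxid);      // prints 5
    }
}
```

In the first election the quorum is valid by count alone, which is why replacing C before A and B recover breaks the assumption that a majority always contains the latest acked state.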

> Leader election selected incorrect leader
> -
>
> Key: ZOOKEEPER-917
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-917
> Project: Zookeeper
>  Issue Type: Bug
>  Components: leaderElection, server
>Affects Versions: 3.2.2
> Environment: Cloudera distribution of zookeeper (patched to never 
> cache DNS entries)
> Debian lenny
>Reporter: Alexandre Hardy
>Priority: Critical
> Fix For: 3.3.3, 3.4.0
>
> Attachments: zklogs-20101102144159SAST.tar.gz
>
>
> We had three nodes running zookeeper:
>   * 192.168.130.10
>   * 192.168.130.11
>   * 192.168.130.14
> 192.168.130.11 failed, and was replaced by a new node 192.168.130.13 
> (automated startup). The new node had not participated in any zookeeper 
> quorum previously. The node 192.168.130.11 was permanently removed from 
> service and could not contribute to the quorum any further (powered off).
> DNS entries were updated for the new node to allow all the zookeeper servers 
> to find the new node.
> The new node 192.168.130.13 was selected as the LEADER, despite the fact that 
> it had not seen the latest zxid.
> This particular problem has not been verified with later versions of 
> zookeeper, and no attempt has been made to reproduce this problem as yet.




[jira] Commented: (ZOOKEEPER-882) Startup loads last transaction from snapshot

2010-11-03 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12927909#action_12927909
 ] 

Flavio Junqueira commented on ZOOKEEPER-882:


If I understand correctly what you're proposing, I think it won't be necessary 
to submit two separate patches. To verify that the test fails without the 
patch, I can simply add the test without applying any other modification in the 
patch file, and then run the test. After applying the modifications to the code 
base, I'd be able to verify that the test does not fail any longer. Does that 
sound right to you?

> Startup loads last transaction from snapshot
> 
>
> Key: ZOOKEEPER-882
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-882
> Project: Zookeeper
>  Issue Type: Bug
>  Components: server
>Reporter: Jared Cantwell
>Assignee: Jared Cantwell
>Priority: Minor
> Fix For: 3.4.0
>
> Attachments: 882.diff, restore, ZOOKEEPER-882.patch
>
>
> On startup, the server first loads the latest snapshot, and then loads from 
> the log starting at the last transaction in the snapshot.  It should begin 
> from one past that last transaction in the log.  I will attach a possible 
> patch.
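
The off-by-one in the description can be illustrated with a small sketch. It is not the patch attached to the issue; the class ReplaySketch and the representation of the log as a list of zxids are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: selects the log entries to apply on top of a snapshot. */
final class ReplaySketch {
    /**
     * Returns the zxids to replay. The comparison is strict, so the last
     * transaction already contained in the snapshot is not applied again.
     */
    static List<Long> toReplay(long lastZxidInSnapshot, List<Long> logZxids) {
        List<Long> result = new ArrayList<>();
        for (long zxid : logZxids) {
            if (zxid > lastZxidInSnapshot) {   // ">=" would re-apply the snapshot's last txn
                result.add(zxid);
            }
        }
        return result;
    }
}
```

With a snapshot whose last transaction is 3 and a log holding 1 through 5, only 4 and 5 are replayed.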




[jira] Commented: (ZOOKEEPER-917) Leader election selected incorrect leader

2010-11-03 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12927908#action_12927908
 ] 

Flavio Junqueira commented on ZOOKEEPER-917:


I downloaded your logs, but the out files are empty and I couldn't find the 
notification messages. By looking at the excerpts you posted, it sounds like 
node 1 tells 0 that it is following 2 and node says that it is following (this 
is fine as node 2 might have received some old messages), so node 0 must follow 
2. Now the question is why node 1 decided to follow 2, especially because it has 
a higher zxid and the follower code should have rejected an attempt to follow a 
leader from an earlier epoch. 

It would be nice to have a look at the output of node 1. 





[jira] Commented: (ZOOKEEPER-917) Leader election selected incorrect leader

2010-11-03 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12927869#action_12927869
 ] 

Flavio Junqueira commented on ZOOKEEPER-917:


Hi Alexandre, Could you please post your configuration parameters?

I noticed the following in both excerpts:
{noformat}
INFO org.apache.zookeeper.server.quorum.FastLeaderElection: Notification: 2, 
-1, 1, 2, LOOKING, LOOKING, 1
INFO org.apache.zookeeper.server.quorum.FastLeaderElection: Notification: 2, 
-1, 1, 2, LOOKING, LOOKING, 2
{noformat}

which implies that both servers, 1 and 2, were starting from scratch and in an 
ensemble of 3 servers they form a quorum.

> Leader election selected incorrect leader
> -
>
> Key: ZOOKEEPER-917
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-917
> Project: Zookeeper
>  Issue Type: Bug
>  Components: leaderElection, server
>Affects Versions: 3.2.2
> Environment: Cloudera distribution of zookeeper (patched to never 
> cache DNS entries)
> Debian lenny
>Reporter: Alexandre Hardy
>Priority: Critical
> Attachments: zklogs-20101102144159SAST.tar.gz
>
>
> We had three nodes running zookeeper:
>   * 192.168.130.10
>   * 192.168.130.11
>   * 192.168.130.14
> 192.168.130.11 failed, and was replaced by a new node 192.168.130.13 
> (automated startup). The new node had not participated in any zookeeper 
> quorum previously. The node 192.168.130.11 was permanently removed from 
> service and could not contribute to the quorum any further (powered off).
> DNS entries were updated for the new node to allow all the zookeeper servers 
> to find the new node.
> The new node 192.168.130.13 was selected as the LEADER, despite the fact that 
> it had not seen the latest zxid.
> This particular problem has not been verified with later versions of 
> zookeeper, and no attempt has been made to reproduce this problem as yet.




[jira] Commented: (ZOOKEEPER-702) GSoC 2010: Failure Detector Model

2010-11-02 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12927657#action_12927657
 ] 

Flavio Junqueira commented on ZOOKEEPER-702:


Thanks, Abmar. It looks good to me. I have one quick comment, though. Is there 
any configuration value that could be causing tests to run slower? I have the 
impression that tests are running slightly slower with your patch. One in 
particular that caught my attention was QuorumZxidSyncTest:

{noformat}

Trunk: [junit] Running org.apache.zookeeper.test.QuorumZxidSyncTest
[junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 94.55 sec

702: [junit] Running org.apache.zookeeper.test.QuorumZxidSyncTest
[junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 139.985 sec
{noformat}

and this seems to be pretty consistent.

> GSoC 2010: Failure Detector Model
> -
>
> Key: ZOOKEEPER-702
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-702
> Project: Zookeeper
>  Issue Type: Wish
>Reporter: Henry Robinson
>Assignee: Abmar Barros
> Fix For: 3.4.0
>
> Attachments: bertier-pseudo.txt, bertier-pseudo.txt, chen-pseudo.txt, 
> chen-pseudo.txt, phiaccrual-pseudo.txt, phiaccrual-pseudo.txt, 
> ZOOKEEPER-702-code.patch, ZOOKEEPER-702-doc.patch, ZOOKEEPER-702.patch, 
> ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, 
> ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, 
> ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, 
> ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, 
> ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, 
> ZOOKEEPER-702.patch, ZOOKEEPER-702.patch
>
>
> Failure Detector Module
> Possible Mentor
> Henry Robinson (henry at apache dot org)
> Requirements
> Java, some distributed systems knowledge, comfort implementing distributed 
> systems protocols
> Description
> ZooKeeper servers detect the failure of other servers and clients by 
> counting the number of 'ticks' for which they don't get a heartbeat from 
> other machines. This is the 'timeout' method of failure detection and works 
> very well; however it is possible that it is too aggressive and not easily 
> tuned for some more unusual ZooKeeper installations (such as in a wide-area 
> network, or even in a mobile ad-hoc network).
> This project would abstract the notion of failure detection to a dedicated 
> Java module, and implement several failure detectors to compare and contrast 
> their appropriateness for ZooKeeper. For example, Apache Cassandra uses a 
> phi-accrual failure detector (http://ddsg.jaist.ac.jp/pub/HDY+04.pdf) which 
> is much more tunable and has some very interesting properties. This is a 
> great project if you are interested in distributed algorithms, or want to 
> help re-factor some of ZooKeeper's internal code.
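
For readers unfamiliar with the accrual approach, the following is a minimal sketch of a phi-style detector. It is not the code in the attached patches: it assumes exponentially distributed heartbeat intervals, so that phi = elapsed / (mean interval * ln 10), and the class name ExponentialPhiDetector and the window size are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Illustrative accrual failure detector assuming exponentially distributed heartbeat intervals. */
final class ExponentialPhiDetector {
    private final int windowSize;
    private final Deque<Long> intervals = new ArrayDeque<>();
    private long intervalSum;
    private long lastHeartbeat = -1;

    ExponentialPhiDetector(int windowSize) {
        if (windowSize <= 0) {
            throw new IllegalArgumentException("windowSize must be positive");
        }
        this.windowSize = windowSize;
    }

    /** Records a heartbeat and keeps a sliding window of the most recent intervals. */
    void heartbeat(long nowMillis) {
        if (lastHeartbeat >= 0) {
            long interval = nowMillis - lastHeartbeat;
            intervals.addLast(interval);
            intervalSum += interval;
            if (intervals.size() > windowSize) {
                intervalSum -= intervals.removeFirst();
            }
        }
        lastHeartbeat = nowMillis;
    }

    /** phi = -log10(P(no heartbeat for this long)) = elapsed / (mean * ln 10) under the exponential model. */
    double phi(long nowMillis) {
        if (intervals.isEmpty()) {
            return 0.0;                         // not enough samples to suspect anyone
        }
        double mean = (double) intervalSum / intervals.size();
        double elapsed = nowMillis - lastHeartbeat;
        if (mean <= 0.0) {
            return elapsed > 0 ? Double.POSITIVE_INFINITY : 0.0;
        }
        return elapsed / (mean * Math.log(10.0));
    }

    /** Unlike a fixed tick count, the threshold expresses a confidence level that adapts to observed intervals. */
    boolean suspect(long nowMillis, double threshold) {
        return phi(nowMillis) > threshold;
    }
}
```

The tuning knob is the threshold on phi rather than a fixed number of missed ticks, which is what makes this family of detectors easier to adapt to slow or irregular links.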




[jira] Updated: (ZOOKEEPER-882) Startup loads last transaction from snapshot

2010-11-02 Thread Flavio Junqueira (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flavio Junqueira updated ZOOKEEPER-882:
---

Status: Open  (was: Patch Available)

Hi Jared, I was wondering if you can add a test case to your patch.





[jira] Updated: (ZOOKEEPER-876) Unnecessary snapshot transfers between new leader and followers

2010-11-02 Thread Flavio Junqueira (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flavio Junqueira updated ZOOKEEPER-876:
---

Status: Open  (was: Patch Available)

This is a nice catch, Diogo, and the patch looks good to me. I have a few very 
quick comments:

# Instead of returning a pair of longs in startForwarding, we could simply 
return maxZxid and read lastProposed directly from the leader object. Wouldn't 
that work?
# The first comment of startForwarding is not saying much. Could you please 
expand it?
# Could you please explain at the beginning of the test case what it is 
supposed to be testing? That will make it easier to remember later what the 
test does.

Good job!



> Unnecessary snapshot transfers between new leader and followers
> ---
>
> Key: ZOOKEEPER-876
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-876
> Project: Zookeeper
>  Issue Type: Bug
>Affects Versions: 3.3.1
>Reporter: Diogo
>Assignee: Diogo
>Priority: Minor
> Fix For: 3.4.0
>
> Attachments: ZOOKEEPER-876.patch
>
>
> When starting a new leadership, unnecessary snapshot transfers happen between 
> new leader and followers. This is so because of multiple small bugs. 
> 1) the comparison of zxids is done based on a new proposal, instead of the 
> last logged zxid. (LearnerFollower.java:310)
> 2) if follower is one zxid behind, the check of the interval of committed 
> logs excludes the follower. (LearnerFollower.java:269)
> 3) the bug reported in ZOOKEEPER-874 (commitLogs are empty after recover).




[jira] Updated: (ZOOKEEPER-702) GSoC 2010: Failure Detector Model

2010-10-28 Thread Flavio Junqueira (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flavio Junqueira updated ZOOKEEPER-702:
---

Status: Open  (was: Patch Available)

Hi Abmar, Thanks for the addition to the patch. I was wondering if it is really 
a good idea to have both options, normal and exponential, implemented. Since 
your experiments have shown that exponential performs better, why not use only 
that one? Also, I was wondering if you have posted experimental numbers showing 
that exponential performs better. 

If we go with exponential only, then we don't need the modification to 
ivy.xml, right?

One last comment: it doesn't look like the classes implementing 
PhiTimeoutEvaluator need to be public. Is this right?  

> GSoC 2010: Failure Detector Model
> -
>
> Key: ZOOKEEPER-702
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-702
> Project: Zookeeper
>  Issue Type: Wish
>Reporter: Henry Robinson
>Assignee: Abmar Barros
> Fix For: 3.4.0
>
> Attachments: bertier-pseudo.txt, bertier-pseudo.txt, chen-pseudo.txt, 
> chen-pseudo.txt, phiaccrual-pseudo.txt, phiaccrual-pseudo.txt, 
> ZOOKEEPER-702-code.patch, ZOOKEEPER-702-doc.patch, ZOOKEEPER-702.patch, 
> ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, 
> ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, 
> ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, 
> ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, 
> ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, ZOOKEEPER-702.patch
>
>
> Failure Detector Module
> Possible Mentor
> Henry Robinson (henry at apache dot org)
> Requirements
> Java, some distributed systems knowledge, comfort implementing distributed 
> systems protocols
> Description
> ZooKeeper servers detect the failure of other servers and clients by 
> counting the number of 'ticks' for which they don't get a heartbeat from 
> other machines. This is the 'timeout' method of failure detection and works 
> very well; however it is possible that it is too aggressive and not easily 
> tuned for some more unusual ZooKeeper installations (such as in a wide-area 
> network, or even in a mobile ad-hoc network).
> This project would abstract the notion of failure detection to a dedicated 
> Java module, and implement several failure detectors to compare and contrast 
> their appropriateness for ZooKeeper. For example, Apache Cassandra uses a 
> phi-accrual failure detector (http://ddsg.jaist.ac.jp/pub/HDY+04.pdf) which 
> is much more tunable and has some very interesting properties. This is a 
> great project if you are interested in distributed algorithms, or want to 
> help re-factor some of ZooKeeper's internal code.




[jira] Commented: (ZOOKEEPER-885) Zookeeper drops connections under moderate IO load

2010-10-28 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12925754#action_12925754
 ] 

Flavio Junqueira commented on ZOOKEEPER-885:


Sure, let's discuss over e-mail and we can post our findings here later. 

> Zookeeper drops connections under moderate IO load
> --
>
> Key: ZOOKEEPER-885
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-885
> Project: Zookeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.2.2, 3.3.1
> Environment: Debian (Lenny)
> 1Gb RAM
> swap disabled
> 100Mb heap for zookeeper
>Reporter: Alexandre Hardy
>Priority: Critical
> Attachments: benchmark.csv, tracezklogs.tar.gz, tracezklogs.tar.gz, 
> WatcherTest.java, zklogs.tar.gz
>
>
> A zookeeper server under minimum load, with a number of clients watching 
> exactly one node will fail to maintain the connection when the machine is 
> subjected to moderate IO load.
> In a specific test example we had three zookeeper servers running on 
> dedicated machines with 45 clients connected, watching exactly one node. The 
> clients would disconnect after moderate load was added to each of the 
> zookeeper servers with the command:
> {noformat}
> dd if=/dev/urandom of=/dev/mapper/nimbula-test
> {noformat}
> The {{dd}} command transferred data at a rate of about 4Mb/s.
> The same thing happens with
> {noformat}
> dd if=/dev/zero of=/dev/mapper/nimbula-test
> {noformat}
> It seems strange that such a moderate load should cause instability in the 
> connection.
> Very few other processes were running; the machines were set up to test the 
> connection instability we have experienced. Clients performed no other read 
> or mutation operations.
> Although the documents state that minimal competing IO load should be present 
> on the zookeeper server, it seems reasonable that moderate IO should not cause 
> problems in this case.




[jira] Commented: (ZOOKEEPER-914) QuorumCnxManager blocks forever

2010-10-28 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12925746#action_12925746
 ] 

Flavio Junqueira commented on ZOOKEEPER-914:


Like Pat, I would also appreciate some more constructive comments (and behavior). 

From the Clover reports, we exercise a significant part of the QCM code, but 
it is true, though, that we don't test the cases you have been exposing. Here 
is a way I believe we can reproduce this problem (I haven't implemented it, 
but seems to make sense). The high-level idea is to make sure that if some 
server stops responding before it completes the handshake protocol, then no 
instance of QCM across all servers will block and prevent other servers from 
joining the ensemble.

Suppose we configure an ensemble with 5 servers using QuorumBase. One of the 
servers will be a simple mock server, as we do in the CnxManagerTest tests. Now 
here is the sequence of steps to follow:

# Start three of the servers and confirm that they accept and execute 
operations;
# Start the mock server and execute the protocol partially. For the read case you 
mention, you can simply not send the server identifier. That will cause the 
read on the other end to block and to not accept more connections;
# Start a 5th server and check if it is able to join the ensemble.

A simple fix that would get it working for you soon, along the lines of what we 
did to make the connection timeout configurable, seems to be to set SO_TIMEOUT. 
But, if you have other ideas, please lay them out. Please bear in mind that we 
should leave the major modifications for ZOOKEEPER-901 because those will take 
more time to develop and get into shape.
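
As a rough sketch of the SO_TIMEOUT idea, the code below bounds the read of the peer's identifier during the handshake. It is not the QuorumCnxManager code: the class HandshakeTimeoutSketch is hypothetical, and it assumes the identifier is sent as a single 8-byte long. Note that SO_TIMEOUT applies to reads made through the socket's InputStream; a read issued directly on a blocking SocketChannel is not bounded by it, so the sketch reads from the stream.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.net.Socket;
import java.net.SocketTimeoutException;
import java.util.OptionalLong;

/** Hypothetical helper: reads the peer's server id without blocking forever. */
final class HandshakeTimeoutSketch {
    /**
     * Returns the id sent by the peer, or an empty value if the peer stays
     * silent for longer than timeoutMillis. The caller is expected to close
     * the socket in the empty case.
     */
    static OptionalLong readServerId(Socket peer, int timeoutMillis) throws IOException {
        peer.setSoTimeout(timeoutMillis);              // bounds each blocking stream read
        DataInputStream in = new DataInputStream(peer.getInputStream());
        try {
            return OptionalLong.of(in.readLong());     // assumed wire format: one 8-byte id
        } catch (SocketTimeoutException e) {
            return OptionalLong.empty();               // peer did not complete the handshake
        }
    }
}
```

With a bound like this, a mock server that connects and never sends its identifier would cost the Listener at most one timeout instead of blocking it indefinitely.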

> QuorumCnxManager blocks forever 
> 
>
> Key: ZOOKEEPER-914
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-914
> Project: Zookeeper
>  Issue Type: Bug
>  Components: leaderElection
>Reporter: Vishal K
>Assignee: Vishal K
>Priority: Blocker
> Fix For: 3.3.3, 3.4.0
>
>
> This was a disaster. While testing our application we ran into a scenario 
> where a rebooted follower could not join the cluster. Further debugging 
> showed that the follower could not join because the QuorumCnxManager on the 
> leader was blocked for an indefinite amount of time in receiveConnection():
> "Thread-3" prio=10 tid=0x7fa920005800 nid=0x11bb runnable [0x7fa9275ed000]
>    java.lang.Thread.State: RUNNABLE
>     at sun.nio.ch.FileDispatcher.read0(Native Method)
>     at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:21)
>     at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:233)
>     at sun.nio.ch.IOUtil.read(IOUtil.java:206)
>     at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:236)
>     - locked <0x7fa93315f988> (a java.lang.Object)
>     at org.apache.zookeeper.server.quorum.QuorumCnxManager.receiveConnection(QuorumCnxManager.java:210)
>     at org.apache.zookeeper.server.quorum.QuorumCnxManager$Listener.run(QuorumCnxManager.java:501)
> I had pointed out this bug along with several other problems in 
> QuorumCnxManager earlier in 
> https://issues.apache.org/jira/browse/ZOOKEEPER-900 and 
> https://issues.apache.org/jira/browse/ZOOKEEPER-822.
> I forgot to patch this one as a part of ZOOKEEPER-822. I am working on a fix 
> and a patch will be out soon. 
> The problem is that QuorumCnxManager is using SocketChannel in blocking mode. 
> It does a read() in receiveConnection() and a write() in initiateConnection().
> Sorry, but this is really bad programming. It also points to a lack of 
> failure tests for QuorumCnxManager.




[jira] Updated: (ZOOKEEPER-914) QuorumCnxManager blocks forever

2010-10-28 Thread Flavio Junqueira (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flavio Junqueira updated ZOOKEEPER-914:
---

Component/s: (was: server)
 (was: quorum)
 leaderElection





[jira] Updated: (ZOOKEEPER-893) ZooKeeper high cpu usage when invalid requests

2010-10-19 Thread Flavio Junqueira (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flavio Junqueira updated ZOOKEEPER-893:
---

Status: Patch Available  (was: Open)

> ZooKeeper high cpu usage when invalid requests
> --
>
> Key: ZOOKEEPER-893
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-893
> Project: Zookeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.3.1
> Environment: Linux 2.6.16
> 4x Intel(R) Xeon(R) CPU X3320  @ 2.50GHz
> java version "1.6.0_17"
> Java(TM) SE Runtime Environment (build 1.6.0_17-b04)
> Java HotSpot(TM) Server VM (build 14.3-b01, mixed mode)
>Reporter: Thijs Terlouw
>Assignee: Thijs Terlouw
>Priority: Critical
> Fix For: 3.3.2, 3.4.0
>
> Attachments: ZOOKEEPER-893-3.3.patch, ZOOKEEPER-893.patch, 
> ZOOKEEPER-893.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> When ZooKeeper receives certain illegally formed messages on the internal 
> communication port (:4181 by default), it's possible for ZooKeeper to enter 
> an infinite loop which causes 100% cpu usage. It's related to ZOOKEEPER-427, 
> but that patch does not resolve all issues.
> from: src/java/main/org/apache/zookeeper/server/quorum/QuorumCnxManager.java 
> the two affected parts:
> ===
> int length = msgLength.getInt();
> if(length <= 0) {
>     throw new IOException("Invalid packet length:" + length);
> }
> ===
> ===
> while (message.hasRemaining()) {
>     temp_numbytes = channel.read(message);
>     if(temp_numbytes < 0) {
>         throw new IOException("Channel eof before end");
>     }
>     numbytes += temp_numbytes;
> }
> ===
> How to replicate this bug:
> perform an nmap portscan against your zookeeper server: "nmap -sV -n 
> your.ip.here -p4181"
> Wait for a while until you see some messages in the logfile and then you 
> will see 100% cpu usage. It does not recover from this situation. With my 
> patch, it does not occur anymore.
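
The attached patch is not reproduced in this message. As a generic illustration of defending a length-prefixed read against garbage input such as a port scan, the sketch below rejects implausible lengths before allocating a buffer and treats a premature end of stream as an error. The class FrameReader and the MAX_FRAME_BYTES bound are assumptions for illustration, not names from QuorumCnxManager.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Illustrative reader for length-prefixed frames; not the QuorumCnxManager implementation. */
final class FrameReader {
    /** Assumed upper bound on a legitimate frame; any real limit depends on the protocol. */
    static final int MAX_FRAME_BYTES = 512 * 1024;

    static byte[] readFrame(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        int length = data.readInt();                 // 4-byte big-endian length prefix
        if (length <= 0 || length > MAX_FRAME_BYTES) {
            throw new IOException("Invalid packet length: " + length);
        }
        byte[] payload = new byte[length];
        data.readFully(payload);                     // throws EOFException if the stream ends early
        return payload;
    }
}
```

Random bytes interpreted as a length are very likely to be either negative or huge, so checking both ends of the range closes the connection early instead of waiting for a payload that never arrives.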




[jira] Updated: (ZOOKEEPER-893) ZooKeeper high cpu usage when invalid requests

2010-10-19 Thread Flavio Junqueira (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flavio Junqueira updated ZOOKEEPER-893:
---

Attachment: ZOOKEEPER-893-3.3.patch

Thanks, Thijs. Adding 3.3 patch. 





[jira] Updated: (ZOOKEEPER-893) ZooKeeper high cpu usage when invalid requests

2010-10-19 Thread Flavio Junqueira (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flavio Junqueira updated ZOOKEEPER-893:
---

Attachment: ZOOKEEPER-893.patch

Adding a test and removing from RecvWorker.run() an if statement that this 
patch makes unnecessary. I'll be adding a patch for the 3.3 branch shortly.

> ZooKeeper high cpu usage when invalid requests
> --
>
> Key: ZOOKEEPER-893
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-893
> Project: Zookeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.3.1
> Environment: Linux 2.6.16
> 4x Intel(R) Xeon(R) CPU X3320  @ 2.50GHz
> java version "1.6.0_17"
> Java(TM) SE Runtime Environment (build 1.6.0_17-b04)
> Java HotSpot(TM) Server VM (build 14.3-b01, mixed mode)
>Reporter: Thijs Terlouw
>Assignee: Thijs Terlouw
>Priority: Critical
> Fix For: 3.3.2, 3.4.0
>
> Attachments: ZOOKEEPER-893.patch, ZOOKEEPER-893.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> When ZooKeeper receives certain illegally formed messages on the internal 
> communication port (:4181 by default), it's possible for ZooKeeper to enter 
> an infinite loop which causes 100% cpu usage. It's related to ZOOKEEPER-427, 
> but that patch does not resolve all issues.
> from: src/java/main/org/apache/zookeeper/server/quorum/QuorumCnxManager.java 
> the two affected parts:
> ===
> int length = msgLength.getInt();
> if(length <= 0) {
>     throw new IOException("Invalid packet length:" + length);
> }
> ===
> ===
> while (message.hasRemaining()) {
>     temp_numbytes = channel.read(message);
>     if(temp_numbytes < 0) {
>         throw new IOException("Channel eof before end");
>     }
>     numbytes += temp_numbytes;
> }
> ===
> How to replicate this bug:
> Perform an nmap port scan against your ZooKeeper server: "nmap -sV -n 
> your.ip.here -p4181"
> Wait until some messages appear in the log file; CPU usage then goes to 
> 100%. The server does not recover from this situation. With my patch, the 
> problem no longer occurs.




[jira] Updated: (ZOOKEEPER-893) ZooKeeper high cpu usage when invalid requests

2010-10-19 Thread Flavio Junqueira (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flavio Junqueira updated ZOOKEEPER-893:
---

Status: Open  (was: Patch Available)

Missing a test.

> ZooKeeper high cpu usage when invalid requests
> --
>
> Key: ZOOKEEPER-893
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-893
> Project: Zookeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.3.1
> Environment: Linux 2.6.16
> 4x Intel(R) Xeon(R) CPU X3320  @ 2.50GHz
> java version "1.6.0_17"
> Java(TM) SE Runtime Environment (build 1.6.0_17-b04)
> Java HotSpot(TM) Server VM (build 14.3-b01, mixed mode)
>Reporter: Thijs Terlouw
>Assignee: Thijs Terlouw
>Priority: Critical
> Fix For: 3.3.2, 3.4.0
>
> Attachments: ZOOKEEPER-893.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> When ZooKeeper receives certain illegally formed messages on the internal 
> communication port (:4181 by default), it's possible for ZooKeeper to enter 
> an infinite loop which causes 100% cpu usage. It's related to ZOOKEEPER-427, 
> but that patch does not resolve all issues.
> from: src/java/main/org/apache/zookeeper/server/quorum/QuorumCnxManager.java 
> the two affected parts:
> ===
> int length = msgLength.getInt();
> if(length <= 0) {
>     throw new IOException("Invalid packet length:" + length);
> }
> ===
> ===
> while (message.hasRemaining()) {
>     temp_numbytes = channel.read(message);
>     if(temp_numbytes < 0) {
>         throw new IOException("Channel eof before end");
>     }
>     numbytes += temp_numbytes;
> }
> ===
> How to replicate this bug:
> Perform an nmap port scan against your ZooKeeper server: "nmap -sV -n 
> your.ip.here -p4181"
> Wait until some messages appear in the log file; CPU usage then goes to 
> 100%. The server does not recover from this situation. With my patch, the 
> problem no longer occurs.




[jira] Updated: (ZOOKEEPER-855) clientPortBindAddress should be clientPortAddress

2010-10-18 Thread Flavio Junqueira (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flavio Junqueira updated ZOOKEEPER-855:
---

Attachment: ZOOKEEPER-855.patch

I'm uploading the patch I committed. The original patch was modifying the html 
instead of the xml source.

> clientPortBindAddress should be clientPortAddress
> -
>
> Key: ZOOKEEPER-855
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-855
> Project: Zookeeper
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 3.3.0, 3.3.1
>Reporter: Jared Cantwell
>Assignee: Jared Cantwell
>Priority: Trivial
> Fix For: 3.3.2, 3.4.0
>
> Attachments: ZOOKEEPER-855.patch, ZOOKEEPER-855.patch
>
>
> The server documentation states that the configuration parameter for binding 
> to a specific IP address is clientPortBindAddress.  The code expects the 
> parameter to be clientPortAddress.  The documentation for 3.3.X versions needs 
> to be changed to reflect the correct parameter.  This parameter was added in 
> ZOOKEEPER-635.
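
To make the consequence of the naming mismatch concrete, here is a small Java sketch that reads configuration text and builds the client bind address from the clientPort and clientPortAddress keys. It is an illustration, not the server's actual parsing code; the class and method names are hypothetical. A file that sets the misdocumented key clientPortBindAddress falls through to the wildcard address, because that key is never looked up.

```java
import java.io.IOException;
import java.io.StringReader;
import java.net.InetSocketAddress;
import java.util.Properties;

public class ClientAddressConfig {
    /** Builds the client bind address from zoo.cfg-style text. */
    static InetSocketAddress clientAddress(String cfgText) throws IOException {
        Properties props = new Properties();
        props.load(new StringReader(cfgText));

        String portValue = props.getProperty("clientPort");
        if (portValue == null) {
            throw new IllegalArgumentException("clientPort is not set");
        }
        int port = Integer.parseInt(portValue.trim());

        // Only clientPortAddress is consulted; any other key is ignored.
        String address = props.getProperty("clientPortAddress");
        return (address == null)
                ? new InetSocketAddress(port)                  // wildcard address
                : new InetSocketAddress(address.trim(), port); // specific address
    }
}
```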




[jira] Updated: (ZOOKEEPER-855) clientPortBindAddress should be clientPortAddress

2010-10-18 Thread Flavio Junqueira (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flavio Junqueira updated ZOOKEEPER-855:
---

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Thanks, Jared, I have just committed this:

Branch 3.3: Committed revision 1024022.
Trunk: Committed revision 1024029.

> clientPortBindAddress should be clientPortAddress
> -
>
> Key: ZOOKEEPER-855
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-855
> Project: Zookeeper
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 3.3.0, 3.3.1
>Reporter: Jared Cantwell
>Assignee: Jared Cantwell
>Priority: Trivial
> Fix For: 3.3.2, 3.4.0
>
> Attachments: ZOOKEEPER-855.patch, ZOOKEEPER-855.patch
>
>
> The server documentation states that the configuration parameter for binding 
> to a specific IP address is clientPortBindAddress.  The code expects the 
> parameter to be clientPortAddress.  The documentation for 3.3.X versions needs 
> to be changed to reflect the correct parameter.  This parameter was added in 
> ZOOKEEPER-635.




[jira] Commented: (ZOOKEEPER-855) clientPortBindAddress should be clientPortAddress

2010-10-18 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12922238#action_12922238
 ] 

Flavio Junqueira commented on ZOOKEEPER-855:


+1, I'll commit this in a minute.

> clientPortBindAddress should be clientPortAddress
> -
>
> Key: ZOOKEEPER-855
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-855
> Project: Zookeeper
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 3.3.0, 3.3.1
>Reporter: Jared Cantwell
>Assignee: Jared Cantwell
>Priority: Trivial
> Fix For: 3.3.2, 3.4.0
>
> Attachments: ZOOKEEPER-855.patch
>
>
> The server documentation states that the configuration parameter for binding 
> to a specific IP address is clientPortBindAddress.  The code expects the 
> parameter to be clientPortAddress.  The documentation for 3.3.X versions needs 
> to be changed to reflect the correct parameter.  This parameter was added in 
> ZOOKEEPER-635.




[jira] Commented: (ZOOKEEPER-786) Exception in ZooKeeper.toString

2010-10-18 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12922216#action_12922216
 ] 

Flavio Junqueira commented on ZOOKEEPER-786:


Since this seems to be a minor issue and to avoid further delays with 3.3.2, I 
propose we move it to 3.4.0.

> Exception in ZooKeeper.toString
> ---
>
> Key: ZOOKEEPER-786
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-786
> Project: Zookeeper
>  Issue Type: Bug
>  Components: java client
>Affects Versions: 3.3.1
> Environment: Mac OS X, x86
>Reporter: Stephen Green
> Fix For: 3.4.0
>
>
> When trying to call ZooKeeper.toString during client disconnections, an 
> exception can be generated:
> [04/06/10 15:39:57.744] ERROR Error while calling watcher 
> java.lang.Error: java.net.SocketException: Socket operation on non-socket
>   at sun.nio.ch.Net.localAddress(Net.java:128)
>   at sun.nio.ch.SocketChannelImpl.localAddress(SocketChannelImpl.java:430)
>   at sun.nio.ch.SocketAdaptor.getLocalAddress(SocketAdaptor.java:147)
>   at java.net.Socket.getLocalSocketAddress(Socket.java:717)
>   at 
> org.apache.zookeeper.ClientCnxn.getLocalSocketAddress(ClientCnxn.java:227)
>   at org.apache.zookeeper.ClientCnxn.toString(ClientCnxn.java:183)
>   at java.lang.String.valueOf(String.java:2826)
>   at java.lang.StringBuilder.append(StringBuilder.java:115)
>   at org.apache.zookeeper.ZooKeeper.toString(ZooKeeper.java:1486)
>   at java.util.Formatter$FormatSpecifier.printString(Formatter.java:2794)
>   at java.util.Formatter$FormatSpecifier.print(Formatter.java:2677)
>   at java.util.Formatter.format(Formatter.java:2433)
>   at java.util.Formatter.format(Formatter.java:2367)
>   at java.lang.String.format(String.java:2769)
>   at com.echonest.cluster.ZooContainer.process(ZooContainer.java:544)
>   at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:488)
> Caused by: java.net.SocketException: Socket operation on non-socket
>   at sun.nio.ch.Net.localInetAddress(Native Method)
>   at sun.nio.ch.Net.localAddress(Net.java:125)
>   ... 15 more




[jira] Updated: (ZOOKEEPER-786) Exception in ZooKeeper.toString

2010-10-18 Thread Flavio Junqueira (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flavio Junqueira updated ZOOKEEPER-786:
---

 Priority: Minor  (was: Major)
Fix Version/s: (was: 3.3.2)

> Exception in ZooKeeper.toString
> ---
>
> Key: ZOOKEEPER-786
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-786
> Project: Zookeeper
>  Issue Type: Bug
>  Components: java client
>Affects Versions: 3.3.1
> Environment: Mac OS X, x86
>Reporter: Stephen Green
>Priority: Minor
> Fix For: 3.4.0
>
>
> When trying to call ZooKeeper.toString during client disconnections, an 
> exception can be generated:
> [04/06/10 15:39:57.744] ERROR Error while calling watcher 
> java.lang.Error: java.net.SocketException: Socket operation on non-socket
>   at sun.nio.ch.Net.localAddress(Net.java:128)
>   at sun.nio.ch.SocketChannelImpl.localAddress(SocketChannelImpl.java:430)
>   at sun.nio.ch.SocketAdaptor.getLocalAddress(SocketAdaptor.java:147)
>   at java.net.Socket.getLocalSocketAddress(Socket.java:717)
>   at 
> org.apache.zookeeper.ClientCnxn.getLocalSocketAddress(ClientCnxn.java:227)
>   at org.apache.zookeeper.ClientCnxn.toString(ClientCnxn.java:183)
>   at java.lang.String.valueOf(String.java:2826)
>   at java.lang.StringBuilder.append(StringBuilder.java:115)
>   at org.apache.zookeeper.ZooKeeper.toString(ZooKeeper.java:1486)
>   at java.util.Formatter$FormatSpecifier.printString(Formatter.java:2794)
>   at java.util.Formatter$FormatSpecifier.print(Formatter.java:2677)
>   at java.util.Formatter.format(Formatter.java:2433)
>   at java.util.Formatter.format(Formatter.java:2367)
>   at java.lang.String.format(String.java:2769)
>   at com.echonest.cluster.ZooContainer.process(ZooContainer.java:544)
>   at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:488)
> Caused by: java.net.SocketException: Socket operation on non-socket
>   at sun.nio.ch.Net.localInetAddress(Native Method)
>   at sun.nio.ch.Net.localAddress(Net.java:125)
>   ... 15 more




[jira] Resolved: (ZOOKEEPER-881) ZooKeeperServer.loadData loads database twice

2010-10-18 Thread Flavio Junqueira (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flavio Junqueira resolved ZOOKEEPER-881.


Resolution: Fixed

Committed to the 3.3 branch (Committed revision 1023935.)

> ZooKeeperServer.loadData loads database twice
> -
>
> Key: ZOOKEEPER-881
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-881
> Project: Zookeeper
>  Issue Type: Bug
>  Components: server
>Reporter: Jared Cantwell
>Assignee: Jared Cantwell
>Priority: Trivial
> Fix For: 3.3.2, 3.4.0
>
> Attachments: ZOOKEEPER-881.patch
>
>
> zkDb.loadDataBase() is called twice at the beginning of loadData().  It 
> shouldn't have any negative effects, but is unnecessary.   A patch should be 
> trivial.




[jira] Updated: (ZOOKEEPER-881) ZooKeeperServer.loadData loads database twice

2010-10-18 Thread Flavio Junqueira (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flavio Junqueira updated ZOOKEEPER-881:
---

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Ben forgot to close this issue.

> ZooKeeperServer.loadData loads database twice
> -
>
> Key: ZOOKEEPER-881
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-881
> Project: Zookeeper
>  Issue Type: Bug
>  Components: server
>Reporter: Jared Cantwell
>Assignee: Jared Cantwell
>Priority: Trivial
> Fix For: 3.3.2, 3.4.0
>
> Attachments: ZOOKEEPER-881.patch
>
>
> zkDb.loadDataBase() is called twice at the beginning of loadData().  It 
> shouldn't have any negative effects, but is unnecessary.   A patch should be 
> trivial.




[jira] Commented: (ZOOKEEPER-901) Redesign of QuorumCnxManager

2010-10-18 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12921997#action_12921997
 ] 

Flavio Junqueira commented on ZOOKEEPER-901:


It is a good point, Pat. It crossed my mind, but I thought it would be overkill 
to use netty. However, if it is simpler to have it for compatibility and 
uniformity purposes, then we should consider it.

> Redesign of QuorumCnxManager
> 
>
> Key: ZOOKEEPER-901
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-901
> Project: Zookeeper
>  Issue Type: Improvement
>  Components: leaderElection
>Affects Versions: 3.3.1
>Reporter: Flavio Junqueira
>Assignee: Flavio Junqueira
> Fix For: 3.4.0
>
>
> QuorumCnxManager manages TCP connections between ZooKeeper servers for leader 
> election in replicated mode. We have identified over time a couple of 
> deficiencies that we would like to fix. Unfortunately, fixing these issues 
> requires a little more than just generating a couple of small patches. More 
> specifically, I propose, based on previous discussions with the community, 
> that we reimplement QuorumCnxManager so that we achieve the following:
> # Establishing connections should not be a blocking operation, and perhaps 
> even more important, it shouldn't prevent the establishment of connections 
> with other servers;
> # Using a pair of threads per connection is a little messy, and we have seen 
> issues over time due to the creation and destruction of such threads. A more 
> reasonable approach is to have a single thread and a selector.  




[jira] Created: (ZOOKEEPER-901) Redesign of QuorumCnxManager

2010-10-17 Thread Flavio Junqueira (JIRA)
Redesign of QuorumCnxManager


 Key: ZOOKEEPER-901
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-901
 Project: Zookeeper
  Issue Type: Improvement
  Components: leaderElection
Affects Versions: 3.3.1
Reporter: Flavio Junqueira
Assignee: Flavio Junqueira
 Fix For: 3.4.0


QuorumCnxManager manages TCP connections between ZooKeeper servers for leader 
election in replicated mode. We have identified over time a couple of 
deficiencies that we would like to fix. Unfortunately, fixing these issues 
requires a little more than just generating a couple of small patches. More 
specifically, I propose, based on previous discussions with the community, that 
we reimplement QuorumCnxManager so that we achieve the following:

# Establishing connections should not be a blocking operation, and perhaps even 
more important, it shouldn't prevent the establishment of connections with 
other servers;
# Using a pair of threads per connection is a little messy, and we have seen 
issues over time due to the creation and destruction of such threads. A more 
reasonable approach is to have a single thread and a selector.  
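
As a rough illustration of the two goals above, the following Java NIO sketch starts non-blocking connects to several peers and completes them from a single thread with one Selector, so a slow or unreachable peer does not hold up the others. It is only a sketch of the general technique, not a proposed implementation; SelectorConnector and connectAll are hypothetical names, and a real connection manager would keep the channels open and register them for reads and writes instead of closing them.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.SocketChannel;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class SelectorConnector {
    /** Attempts all connects concurrently; returns how many completed in time. */
    static int connectAll(List<InetSocketAddress> peers, long timeoutMs) throws IOException {
        int connected = 0;
        int pending = 0;
        List<SocketChannel> channels = new ArrayList<>();
        try (Selector selector = Selector.open()) {
            for (InetSocketAddress peer : peers) {
                SocketChannel ch = SocketChannel.open();
                channels.add(ch);
                ch.configureBlocking(false);
                try {
                    if (ch.connect(peer)) {
                        connected++; // completed immediately
                    } else {
                        ch.register(selector, SelectionKey.OP_CONNECT);
                        pending++;
                    }
                } catch (IOException e) {
                    // Immediate failure affects this peer only.
                }
            }
            long deadline = System.currentTimeMillis() + timeoutMs;
            while (pending > 0) {
                long remaining = deadline - System.currentTimeMillis();
                if (remaining <= 0) {
                    break;
                }
                selector.select(remaining);
                Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                while (it.hasNext()) {
                    SelectionKey key = it.next();
                    it.remove();
                    pending--;
                    try {
                        if (((SocketChannel) key.channel()).finishConnect()) {
                            connected++;
                        }
                    } catch (IOException e) {
                        // Connection refused or reset; other peers are unaffected.
                    }
                    key.cancel();
                }
            }
        } finally {
            // The sketch only counts successes, so release every channel.
            for (SocketChannel ch : channels) {
                try {
                    ch.close();
                } catch (IOException ignored) {
                    // Nothing useful to do while cleaning up.
                }
            }
        }
        return connected;
    }
}
```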




[jira] Commented: (ZOOKEEPER-885) Zookeeper drops connections under moderate IO load

2010-10-15 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12921557#action_12921557
 ] 

Flavio Junqueira commented on ZOOKEEPER-885:


I've been running it and there is no traffic to the disk while the clients are 
watching. We generate a snapshot every snapCount transactions, but given that 
no transactions are generated, nothing is appended to the log and no new 
snapshot is written.  

> Zookeeper drops connections under moderate IO load
> --
>
> Key: ZOOKEEPER-885
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-885
> Project: Zookeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.2.2, 3.3.1
> Environment: Debian (Lenny)
> 1Gb RAM
> swap disabled
> 100Mb heap for zookeeper
>Reporter: Alexandre Hardy
>Priority: Critical
> Attachments: benchmark.csv, tracezklogs.tar.gz, tracezklogs.tar.gz, 
> WatcherTest.java, zklogs.tar.gz
>
>
> A zookeeper server under minimum load, with a number of clients watching 
> exactly one node will fail to maintain the connection when the machine is 
> subjected to moderate IO load.
> In a specific test example we had three zookeeper servers running on 
> dedicated machines with 45 clients connected, watching exactly one node. The 
> clients would disconnect after moderate load was added to each of the 
> zookeeper servers with the command:
> {noformat}
> dd if=/dev/urandom of=/dev/mapper/nimbula-test
> {noformat}
> The {{dd}} command transferred data at a rate of about 4Mb/s.
> The same thing happens with
> {noformat}
> dd if=/dev/zero of=/dev/mapper/nimbula-test
> {noformat}
> It seems strange that such a moderate load should cause instability in the 
> connection.
> Very few other processes were running; the machines were set up to test the 
> connection instability we have experienced. Clients performed no other read 
> or mutation operations.
> Although the documents state that minimal competing IO load should be present 
> on the zookeeper server, it seems reasonable that moderate IO should not cause 
> problems in this case.




[jira] Commented: (ZOOKEEPER-885) Zookeeper drops connections under moderate IO load

2010-10-15 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12921467#action_12921467
 ] 

Flavio Junqueira commented on ZOOKEEPER-885:


I'm not sure it is that simple, Dave. The problem is that pings do not require 
writes to disk, and in the scenario that Alexandre describes, there are only 
pings being processed. Why is the background I/O load affecting the processing 
of ZooKeeper? And in particular, why are sessions expiring as a consequence of 
this background I/O load?

> Zookeeper drops connections under moderate IO load
> --
>
> Key: ZOOKEEPER-885
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-885
> Project: Zookeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.2.2, 3.3.1
> Environment: Debian (Lenny)
> 1Gb RAM
> swap disabled
> 100Mb heap for zookeeper
>Reporter: Alexandre Hardy
>Priority: Critical
> Attachments: benchmark.csv, tracezklogs.tar.gz, tracezklogs.tar.gz, 
> WatcherTest.java, zklogs.tar.gz
>
>
> A zookeeper server under minimum load, with a number of clients watching 
> exactly one node will fail to maintain the connection when the machine is 
> subjected to moderate IO load.
> In a specific test example we had three zookeeper servers running on 
> dedicated machines with 45 clients connected, watching exactly one node. The 
> clients would disconnect after moderate load was added to each of the 
> zookeeper servers with the command:
> {noformat}
> dd if=/dev/urandom of=/dev/mapper/nimbula-test
> {noformat}
> The {{dd}} command transferred data at a rate of about 4Mb/s.
> The same thing happens with
> {noformat}
> dd if=/dev/zero of=/dev/mapper/nimbula-test
> {noformat}
> It seems strange that such a moderate load should cause instability in the 
> connection.
> Very few other processes were running; the machines were set up to test the 
> connection instability we have experienced. Clients performed no other read 
> or mutation operations.
> Although the documents state that minimal competing IO load should be present 
> on the zookeeper server, it seems reasonable that moderate IO should not cause 
> problems in this case.




[jira] Commented: (ZOOKEEPER-885) Zookeeper drops connections under moderate IO load

2010-10-14 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12921218#action_12921218
 ] 

Flavio Junqueira commented on ZOOKEEPER-885:


Hi Alexandre, When you load the machines running the zookeeper servers by 
running the dd command, how much time elapses between running dd and observing 
the connections expiring? I have not been able to reproduce it, and I wonder how 
long the problem takes to manifest.

> Zookeeper drops connections under moderate IO load
> --
>
> Key: ZOOKEEPER-885
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-885
> Project: Zookeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.2.2, 3.3.1
> Environment: Debian (Lenny)
> 1Gb RAM
> swap disabled
> 100Mb heap for zookeeper
>Reporter: Alexandre Hardy
>Priority: Critical
> Attachments: tracezklogs.tar.gz, tracezklogs.tar.gz, 
> WatcherTest.java, zklogs.tar.gz
>
>
> A zookeeper server under minimum load, with a number of clients watching 
> exactly one node will fail to maintain the connection when the machine is 
> subjected to moderate IO load.
> In a specific test example we had three zookeeper servers running on 
> dedicated machines with 45 clients connected, watching exactly one node. The 
> clients would disconnect after moderate load was added to each of the 
> zookeeper servers with the command:
> {noformat}
> dd if=/dev/urandom of=/dev/mapper/nimbula-test
> {noformat}
> The {{dd}} command transferred data at a rate of about 4Mb/s.
> The same thing happens with
> {noformat}
> dd if=/dev/zero of=/dev/mapper/nimbula-test
> {noformat}
> It seems strange that such a moderate load should cause instability in the 
> connection.
> Very few other processes were running; the machines were set up to test the 
> connection instability we have experienced. Clients performed no other read 
> or mutation operations.
> Although the documents state that minimal competing IO load should be present 
> on the zookeeper server, it seems reasonable that moderate IO should not cause 
> problems in this case.




[jira] Commented: (ZOOKEEPER-885) Zookeeper drops connections under moderate IO load

2010-10-13 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12920713#action_12920713
 ] 

Flavio Junqueira commented on ZOOKEEPER-885:


I remember a while back fixing an issue with CommitProcessor, which was being 
killed by a runtime exception. As Pat pointed out, it does look like the 
pipeline is stalling, but it is still unclear why and I couldn't find anything 
that can indicate the cause. 

Let me try to reproduce it.

> Zookeeper drops connections under moderate IO load
> --
>
> Key: ZOOKEEPER-885
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-885
> Project: Zookeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.2.2, 3.3.1
> Environment: Debian (Lenny)
> 1Gb RAM
> swap disabled
> 100Mb heap for zookeeper
>Reporter: Alexandre Hardy
>Priority: Critical
> Attachments: tracezklogs.tar.gz, tracezklogs.tar.gz, 
> WatcherTest.java, zklogs.tar.gz
>
>
> A zookeeper server under minimum load, with a number of clients watching 
> exactly one node will fail to maintain the connection when the machine is 
> subjected to moderate IO load.
> In a specific test example we had three zookeeper servers running on 
> dedicated machines with 45 clients connected, watching exactly one node. The 
> clients would disconnect after moderate load was added to each of the 
> zookeeper servers with the command:
> {noformat}
> dd if=/dev/urandom of=/dev/mapper/nimbula-test
> {noformat}
> The {{dd}} command transferred data at a rate of about 4Mb/s.
> The same thing happens with
> {noformat}
> dd if=/dev/zero of=/dev/mapper/nimbula-test
> {noformat}
> It seems strange that such a moderate load should cause instability in the 
> connection.
> Very few other processes were running; the machines were set up to test the 
> connection instability we have experienced. Clients performed no other read 
> or mutation operations.
> Although the documents state that minimal competing IO load should be present 
> on the zookeeper server, it seems reasonable that moderate IO should not cause 
> problems in this case.




[jira] Updated: (ZOOKEEPER-884) Remove LedgerSequence references from BookKeeper documentation and comments in tests

2010-10-05 Thread Flavio Junqueira (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flavio Junqueira updated ZOOKEEPER-884:
---

Status: Patch Available  (was: Open)

> Remove LedgerSequence references from BookKeeper documentation and comments 
> in tests 
> -
>
> Key: ZOOKEEPER-884
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-884
> Project: Zookeeper
>  Issue Type: Bug
>  Components: contrib-bookkeeper
>Affects Versions: 3.3.1
>Reporter: Flavio Junqueira
>Assignee: Flavio Junqueira
> Attachments: ZOOKEEPER-884.patch
>
>
> We no longer use LedgerSequence, so we need to remove references in 
> documentation and comments sprinkled throughout the code.




[jira] Updated: (ZOOKEEPER-884) Remove LedgerSequence references from BookKeeper documentation and comments in tests

2010-10-05 Thread Flavio Junqueira (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flavio Junqueira updated ZOOKEEPER-884:
---

Attachment: ZOOKEEPER-884.patch

This is a very simple patch, and it fixes mostly documentation and comments. 
Given the pace that patches are making progress in ZooKeeper these days, I'll 
+1 it myself (at the risk of not having any value :-) ).

> Remove LedgerSequence references from BookKeeper documentation and comments 
> in tests 
> -
>
> Key: ZOOKEEPER-884
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-884
> Project: Zookeeper
>  Issue Type: Bug
>  Components: contrib-bookkeeper
>Affects Versions: 3.3.1
>Reporter: Flavio Junqueira
>Assignee: Flavio Junqueira
> Attachments: ZOOKEEPER-884.patch
>
>
> We no longer use LedgerSequence, so we need to remove references in 
> documentation and comments sprinkled throughout the code.




[jira] Created: (ZOOKEEPER-884) Remove LedgerSequence references from BookKeeper documentation and comments in tests

2010-10-01 Thread Flavio Junqueira (JIRA)
Remove LedgerSequence references from BookKeeper documentation and comments in 
tests 
-

 Key: ZOOKEEPER-884
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-884
 Project: Zookeeper
  Issue Type: Bug
  Components: contrib-bookkeeper
Affects Versions: 3.3.1
Reporter: Flavio Junqueira


We no longer use LedgerSequence, so we need to remove references in 
documentation and comments sprinkled throughout the code.




[jira] Assigned: (ZOOKEEPER-884) Remove LedgerSequence references from BookKeeper documentation and comments in tests

2010-10-01 Thread Flavio Junqueira (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flavio Junqueira reassigned ZOOKEEPER-884:
--

Assignee: Flavio Junqueira

> Remove LedgerSequence references from BookKeeper documentation and comments 
> in tests 
> -
>
> Key: ZOOKEEPER-884
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-884
> Project: Zookeeper
>  Issue Type: Bug
>  Components: contrib-bookkeeper
>Affects Versions: 3.3.1
>Reporter: Flavio Junqueira
>Assignee: Flavio Junqueira
>
> We no longer use LedgerSequence, so we need to remove references in 
> documentation and comments sprinkled throughout the code.




[jira] Commented: (ZOOKEEPER-883) Idle cluster increasingly consumes CPU resources

2010-09-30 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12916476#action_12916476
 ] 

Flavio Junqueira commented on ZOOKEEPER-883:


I meant to say that there is an orphan SendWorker, not an orphan RecvWorker.

> Idle cluster increasingly consumes CPU resources
> 
>
> Key: ZOOKEEPER-883
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-883
> Project: Zookeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.3.1
>Reporter: Lars George
> Attachments: Archive.zip
>
>
> Monitoring the ZooKeeper nodes by polling the various ports using Nagios' 
> open port checks seems to cause a substantial rise in CPU usage by the 
> ZooKeeper daemons. Over the course of a week an idle cluster grew from a 
> baseline 2% to >10% CPU usage. Attached is a stack dump and logs showing the 
> occupied threads. Eventually the daemon starts failing on "too many open 
> files" errors as all handles are used up.




[jira] Commented: (ZOOKEEPER-883) Idle cluster increasingly consumes CPU resources

2010-09-30 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12916446#action_12916446
 ] 

Flavio Junqueira commented on ZOOKEEPER-883:


I think this issue is related to ZOOKEEPER-880. It seems that each connection 
Nagios creates starts a RecvWorker and a SendWorker, and once the connection 
closes, the RecvWorker is killed but the SendWorker is not, so for every 
notification sent there is an orphan RecvWorker.

I see two options:

# Patch it so that it also kills the SendWorker instance;
# Decline connection requests from unknown servers.
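
A minimal sketch of both options, assuming a much simplified registry in place of the real connection manager. WorkerRegistrySketch, Worker, accept() and onReceiveClosed() are hypothetical names used only for illustration; they are not the QuorumCnxManager API.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for the connection manager; not the real QuorumCnxManager. */
final class WorkerRegistrySketch {
    /** Minimal worker: something thread-like that can be asked to stop. */
    static final class Worker {
        private volatile boolean running = true;
        void finish() { running = false; }
        boolean isRunning() { return running; }
    }

    private final Set<Long> knownServerIds;
    private final Map<Long, Worker> sendWorkers = new ConcurrentHashMap<>();
    private final Map<Long, Worker> recvWorkers = new ConcurrentHashMap<>();

    WorkerRegistrySketch(Set<Long> knownServerIds) {
        this.knownServerIds = knownServerIds;
    }

    /** Option 2: decline peers whose id is not part of the configured ensemble. */
    boolean accept(long sid) {
        if (!knownServerIds.contains(sid)) {
            return false; // no workers are created for unknown peers
        }
        sendWorkers.put(sid, new Worker());
        recvWorkers.put(sid, new Worker());
        return true;
    }

    /** Option 1: when the receive side ends, stop the paired send side as well. */
    void onReceiveClosed(long sid) {
        Worker recv = recvWorkers.remove(sid);
        if (recv != null) {
            recv.finish();
        }
        Worker send = sendWorkers.remove(sid);
        if (send != null) {
            send.finish();
        }
    }

    int liveSendWorkers() {
        return sendWorkers.size();
    }
}
```

With either option in place, a port check that opens and closes a connection should leave no worker behind in this model.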

I'm also curious to understand why you guys are monitoring the election port.

> Idle cluster increasingly consumes CPU resources
> 
>
> Key: ZOOKEEPER-883
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-883
> Project: Zookeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.3.1
>Reporter: Lars George
> Attachments: Archive.zip
>
>
> Monitoring the ZooKeeper nodes by polling the various ports using Nagios' 
> open port checks seems to cause a substantial rise in the CPU used by the 
> ZooKeeper daemons. Over the course of a week an idle cluster grew from a 
> baseline 2% to >10% CPU usage. Attached is a stack dump and logs showing the 
> occupied threads. At the end the daemon starts failing on "too many open 
> files" errors as all handles are used up.




[jira] Commented: (ZOOKEEPER-882) Startup loads last transaction from snapshot

2010-09-30 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12916410#action_12916410
 ] 

Flavio Junqueira commented on ZOOKEEPER-882:


(I meant to post a comment yesterday, but JIRA decided to re-index right at 
that time.)

I like the way you structured the restore loop; it is simpler and easier to 
read, and I can't find any problem with it. As for the severity of the bug, my 
interpretation is that re-executing the transaction is harmless, but it is 
still worth proposing a patch.
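
For reference, a minimal sketch of the intended restore behavior, assuming transactions are reduced to bare zxids in an in-memory list. RestoreSketch and replay() are hypothetical names, not the actual restore code from the patch.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of replaying the log on top of a snapshot. */
final class RestoreSketch {
    /**
     * Returns the zxids that would be applied after loading a snapshot whose
     * last processed transaction is lastProcessedZxid.
     */
    static List<Long> replay(long lastProcessedZxid, List<Long> logZxids) {
        List<Long> applied = new ArrayList<>();
        for (long zxid : logZxids) {
            if (zxid <= lastProcessedZxid) {
                continue; // already reflected in the snapshot, skip it
            }
            applied.add(zxid);
        }
        return applied;
    }
}
```

In this model, replay(5, [3, 4, 5, 6, 7]) applies only 6 and 7, so the last transaction of the snapshot is not executed a second time.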

> Startup loads last transaction from snapshot
> 
>
> Key: ZOOKEEPER-882
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-882
> Project: Zookeeper
>  Issue Type: Bug
>  Components: server
>Reporter: Jared Cantwell
>Priority: Minor
> Attachments: 882.diff, restore
>
>
> On startup, the server first loads the latest snapshot, and then loads from 
> the log starting at the last transaction in the snapshot.  It should begin 
> from one past that last transaction in the log.  I will attach a possible 
> patch.




[jira] Commented: (ZOOKEEPER-882) Startup loads last transaction from snapshot

2010-09-29 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12916145#action_12916145
 ] 

Flavio Junqueira commented on ZOOKEEPER-882:


I agree with your description of the behavior of next(), and it sounds right 
to me that we should be setting hdr and calling "return next();" at the end of 
the catch block.

Regarding init(), we first use the value of zxid to determine which log files 
to read: all log files tagged with a value higher than zxid and the last log 
file that is less than zxid. Next we iterate over the log files until 
hdr.getZxid() is greater than or equal to zxid (it should really be equal to 
zxid). This guarantees that the next call to next(), after init() returns, will 
return zxid+1. Does it sound right to you?
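
A rough sketch of the file selection step described above, assuming each log file is represented only by the zxid it is tagged with. LogFileSelectionSketch and select() are hypothetical names, and treating a tag equal to zxid as part of the "last log file" is my assumption.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical model of choosing which log files to read for a given zxid. */
final class LogFileSelectionSketch {
    /**
     * Keeps the last file whose tag does not exceed zxid, plus every file
     * tagged with a higher value. If no tag is at or below zxid, all files
     * are kept.
     */
    static List<Long> select(List<Long> fileTags, long zxid) {
        List<Long> sorted = new ArrayList<>(fileTags);
        Collections.sort(sorted);
        int start = 0;
        for (int i = 0; i < sorted.size(); i++) {
            if (sorted.get(i) <= zxid) {
                start = i; // the latest file that may still contain zxid
            }
        }
        return new ArrayList<>(sorted.subList(start, sorted.size()));
    }
}
```

For files tagged 1, 10, 20 and 30 and zxid 25, this keeps the files tagged 20 and 30; the iteration inside the file tagged 20 then advances to the requested position.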

> Startup loads last transaction from snapshot
> 
>
> Key: ZOOKEEPER-882
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-882
> Project: Zookeeper
>  Issue Type: Bug
>  Components: server
>Reporter: Jared Cantwell
>Priority: Minor
> Attachments: 882.diff
>
>
> On startup, the server first loads the latest snapshot, and then loads from 
> the log starting at the last transaction in the snapshot.  It should begin 
> from one past that last transaction in the log.  I will attach a possible 
> patch.




[jira] Commented: (ZOOKEEPER-882) Startup loads last transaction from snapshot

2010-09-29 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12916070#action_12916070
 ] 

Flavio Junqueira commented on ZOOKEEPER-882:


I'm also not clear on your second point. If you check FileTxnIterator.init(), 
then it seems to me that the zxid passed as a parameter should be included, so 
not dt.lastProcessedZxid+1. What am I missing?

> Startup loads last transaction from snapshot
> 
>
> Key: ZOOKEEPER-882
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-882
> Project: Zookeeper
>  Issue Type: Bug
>  Components: server
>Reporter: Jared Cantwell
>Priority: Minor
> Attachments: 882.diff
>
>
> On startup, the server first loads the latest snapshot, and then loads from 
> the log starting at the last transaction in the snapshot.  It should begin 
> from one past that last transaction in the log.  I will attach a possible 
> patch.




[jira] Commented: (ZOOKEEPER-882) Startup loads last transaction from snapshot

2010-09-29 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12916065#action_12916065
 ] 

Flavio Junqueira commented on ZOOKEEPER-882:


Hi Jared, Thanks for bringing this up. It doesn't look like that extra call to 
next() is necessary. If there is another file to process, then the call to 
next() will return true and we will keep processing transactions, no? 

> Startup loads last transaction from snapshot
> 
>
> Key: ZOOKEEPER-882
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-882
> Project: Zookeeper
>  Issue Type: Bug
>  Components: server
>Reporter: Jared Cantwell
>Priority: Minor
> Attachments: 882.diff
>
>
> On startup, the server first loads the latest snapshot, and then loads from 
> the log starting at the last transaction in the snapshot.  It should begin 
> from one past that last transaction in the log.  I will attach a possible 
> patch.




[jira] Updated: (ZOOKEEPER-702) GSoC 2010: Failure Detector Model

2010-09-28 Thread Flavio Junqueira (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flavio Junqueira updated ZOOKEEPER-702:
---

Status: Open  (was: Patch Available)

I forgot to mention that the patch does not apply cleanly. I had to delete the 
first two lines (generated by Eclipse), but once I did, it applied cleanly. 
Abmar, could you upload a new patch? My +1 still holds...

> GSoC 2010: Failure Detector Model
> -
>
> Key: ZOOKEEPER-702
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-702
> Project: Zookeeper
>  Issue Type: Wish
>Reporter: Henry Robinson
>Assignee: Abmar Barros
> Attachments: bertier-pseudo.txt, bertier-pseudo.txt, chen-pseudo.txt, 
> chen-pseudo.txt, phiaccrual-pseudo.txt, phiaccrual-pseudo.txt, 
> ZOOKEEPER-702-code.patch, ZOOKEEPER-702-doc.patch, ZOOKEEPER-702.patch, 
> ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, 
> ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, 
> ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, 
> ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, 
> ZOOKEEPER-702.patch
>
>
> Failure Detector Module
> Possible Mentor
> Henry Robinson (henry at apache dot org)
> Requirements
> Java, some distributed systems knowledge, comfort implementing distributed 
> systems protocols
> Description
> ZooKeeper servers detect the failure of other servers and clients by 
> counting the number of 'ticks' for which they don't get a heartbeat from 
> other machines. This is the 'timeout' method of failure detection and works 
> very well; however it is possible that it is too aggressive and not easily 
> tuned for some more unusual ZooKeeper installations (such as in a wide-area 
> network, or even in a mobile ad-hoc network).
> This project would abstract the notion of failure detection to a dedicated 
> Java module, and implement several failure detectors to compare and contrast 
> their appropriateness for ZooKeeper. For example, Apache Cassandra uses a 
> phi-accrual failure detector (http://ddsg.jaist.ac.jp/pub/HDY+04.pdf) which 
> is much more tunable and has some very interesting properties. This is a 
> great project if you are interested in distributed algorithms, or want to 
> help re-factor some of ZooKeeper's internal code.
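
To make the contrast in the description concrete, here is a small sketch of a fixed timeout check next to a phi-style suspicion level. It assumes exponentially distributed heartbeat intervals, which is a simplification of my own and not the normal distribution used in the cited paper; FailureDetectorSketch, timedOut() and phi() are hypothetical names and not part of the patch.

```java
/** Hypothetical comparison of a fixed timeout and a phi-style detector. */
final class FailureDetectorSketch {
    /** Fixed timeout: suspect a peer after maxTicks ticks without a heartbeat. */
    static boolean timedOut(long lastHeartbeatMs, long nowMs, long tickMs, int maxTicks) {
        return nowMs - lastHeartbeatMs > tickMs * maxTicks;
    }

    /**
     * Suspicion level phi = -log10(P(next heartbeat arrives later than now)).
     * With exponentially distributed intervals that probability is
     * exp(-t / mean), so phi reduces to t / (mean * ln 10).
     */
    static double phi(double meanIntervalMs, long sinceLastHeartbeatMs) {
        return sinceLastHeartbeatMs / (meanIntervalMs * Math.log(10.0));
    }
}
```

The timeout check gives a yes/no answer at a fixed bound, while phi grows continuously with silence, so each caller can choose its own threshold. That is what makes this style of detector easier to tune for unusual installations.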




[jira] Updated: (ZOOKEEPER-822) Leader election taking a long time to complete

2010-09-28 Thread Flavio Junqueira (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flavio Junqueira updated ZOOKEEPER-822:
---

Status: Patch Available  (was: Open)

Thanks for the comments, Ben. I have modified zookeeperAdmin and added the 
"zookeeper." prefix to the code.

Regarding your question, initiateConnection is called from two methods: 
testInitiateConnection (used only in tests) and connectOne. connectOne is 
synchronized. Do you still see an issue?
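
To illustrate the locking argument, here is a simplified sketch in which an unsynchronized helper is only reachable through a synchronized caller. ConnectGuardSketch and hammer() are hypothetical, and the method bodies are placeholders rather than the real connectOne and initiateConnection.

```java
/** Hypothetical model: an unsynchronized helper reached only via a synchronized method. */
final class ConnectGuardSketch {
    private int attempts = 0;
    private boolean helperAlwaysLocked = true;

    /** Takes the object's monitor before delegating to the helper. */
    synchronized void connectOne(long sid) {
        initiateConnection(sid);
    }

    /** Not synchronized itself; it relies on the caller holding the monitor. */
    private void initiateConnection(long sid) {
        helperAlwaysLocked &= Thread.holdsLock(this);
        attempts++;
    }

    synchronized int attempts() {
        return attempts;
    }

    synchronized boolean helperAlwaysLocked() {
        return helperAlwaysLocked;
    }

    /** Calls connectOne from several threads and returns the shared instance. */
    static ConnectGuardSketch hammer(int threads, int callsPerThread) {
        ConnectGuardSketch guard = new ConnectGuardSketch();
        Thread[] pool = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final long sid = t;
            pool[t] = new Thread(() -> {
                for (int i = 0; i < callsPerThread; i++) {
                    guard.connectOne(sid);
                }
            });
            pool[t].start();
        }
        for (Thread thread : pool) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return guard;
    }
}
```

In this model every call into the helper runs under the monitor, so the counter stays consistent. A test-only entry point that calls the helper directly would bypass the lock, which only matters if tests call it concurrently.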

> Leader election taking a long time  to complete
> ---
>
> Key: ZOOKEEPER-822
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-822
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.3.0
>Reporter: Vishal K
>Assignee: Vishal K
>Priority: Blocker
> Fix For: 3.3.2, 3.4.0
>
> Attachments: 822.tar.gz, rhel.tar.gz, test_zookeeper_1.log, 
> test_zookeeper_2.log, zk_leader_election.tar.gz, zookeeper-3.4.0.tar.gz, 
> ZOOKEEPER-822-3.3.2.patch, ZOOKEEPER-822-3.3.2.patch, 
> ZOOKEEPER-822-3.3.2.patch, ZOOKEEPER-822-3.3.2.patch, ZOOKEEPER-822.patch, 
> ZOOKEEPER-822.patch, ZOOKEEPER-822.patch, ZOOKEEPER-822.patch, 
> ZOOKEEPER-822.patch, ZOOKEEPER-822.patch, ZOOKEEPER-822.patch_v1
>
>
> Created a 3 node cluster.
> 1. Fail the ZK leader
> 2. Let leader election finish. Restart the leader and let it join the 
> 3. Repeat 
> After a few rounds, leader election takes anywhere from 25 to 60 seconds to finish. 
> Note: we didn't have any ZK clients and no new znodes were created.
> zoo.cfg is shown below:
> #Mon Jul 19 12:15:10 UTC 2010
> server.1=192.168.4.12\:2888\:3888
> server.0=192.168.4.11\:2888\:3888
> clientPort=2181
> dataDir=/var/zookeeper
> syncLimit=2
> server.2=192.168.4.13\:2888\:3888
> initLimit=5
> tickTime=2000
> I have attached logs from two nodes that took a long time to form the cluster 
> after failing the leader. The leader was down anyway, so logs from that node 
> shouldn't matter.
> Look for "START HERE". Logs after that point should be of interest.
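
A note on the quoted zoo.cfg: the timestamp comment and the "\:" escapes look like the file was written in Java properties format, where "\:" is read back as a plain colon. The sketch below checks that and converts the two limits from ticks to milliseconds. ZooCfgSketch, parse() and limitMillis() are hypothetical helpers, not ZooKeeper's own config parser.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

/** Hypothetical helper for reading a zoo.cfg written in Java properties format. */
final class ZooCfgSketch {
    /** Parses the text with java.util.Properties, which unescapes "\:" to ":". */
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return props;
    }

    /** Converts a limit expressed in ticks (initLimit, syncLimit) to milliseconds. */
    static int limitMillis(Properties props, String limitKey) {
        int tickTime = Integer.parseInt(props.getProperty("tickTime"));
        int ticks = Integer.parseInt(props.getProperty(limitKey));
        return tickTime * ticks;
    }
}
```

With the values quoted above (tickTime=2000, initLimit=5, syncLimit=2), this gives 10000 ms for initLimit and 4000 ms for syncLimit.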




[jira] Updated: (ZOOKEEPER-822) Leader election taking a long time to complete

2010-09-28 Thread Flavio Junqueira (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flavio Junqueira updated ZOOKEEPER-822:
---

Attachment: ZOOKEEPER-822.patch

> Leader election taking a long time  to complete
> ---
>
> Key: ZOOKEEPER-822
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-822
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.3.0
>Reporter: Vishal K
>Assignee: Vishal K
>Priority: Blocker
> Fix For: 3.3.2, 3.4.0
>
> Attachments: 822.tar.gz, rhel.tar.gz, test_zookeeper_1.log, 
> test_zookeeper_2.log, zk_leader_election.tar.gz, zookeeper-3.4.0.tar.gz, 
> ZOOKEEPER-822-3.3.2.patch, ZOOKEEPER-822-3.3.2.patch, 
> ZOOKEEPER-822-3.3.2.patch, ZOOKEEPER-822-3.3.2.patch, ZOOKEEPER-822.patch, 
> ZOOKEEPER-822.patch, ZOOKEEPER-822.patch, ZOOKEEPER-822.patch, 
> ZOOKEEPER-822.patch, ZOOKEEPER-822.patch, ZOOKEEPER-822.patch_v1
>
>
> Created a 3 node cluster.
> 1. Fail the ZK leader
> 2. Let leader election finish. Restart the leader and let it join the 
> 3. Repeat 
> After a few rounds, leader election takes anywhere from 25 to 60 seconds to finish. 
> Note: we didn't have any ZK clients and no new znodes were created.
> zoo.cfg is shown below:
> #Mon Jul 19 12:15:10 UTC 2010
> server.1=192.168.4.12\:2888\:3888
> server.0=192.168.4.11\:2888\:3888
> clientPort=2181
> dataDir=/var/zookeeper
> syncLimit=2
> server.2=192.168.4.13\:2888\:3888
> initLimit=5
> tickTime=2000
> I have attached logs from two nodes that took a long time to form the cluster 
> after failing the leader. The leader was down anyway, so logs from that node 
> shouldn't matter.
> Look for "START HERE". Logs after that point should be of interest.




[jira] Updated: (ZOOKEEPER-822) Leader election taking a long time to complete

2010-09-28 Thread Flavio Junqueira (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flavio Junqueira updated ZOOKEEPER-822:
---

Attachment: ZOOKEEPER-822-3.3.2.patch

> Leader election taking a long time  to complete
> ---
>
> Key: ZOOKEEPER-822
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-822
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.3.0
>Reporter: Vishal K
>Assignee: Vishal K
>Priority: Blocker
> Fix For: 3.3.2, 3.4.0
>
> Attachments: 822.tar.gz, rhel.tar.gz, test_zookeeper_1.log, 
> test_zookeeper_2.log, zk_leader_election.tar.gz, zookeeper-3.4.0.tar.gz, 
> ZOOKEEPER-822-3.3.2.patch, ZOOKEEPER-822-3.3.2.patch, 
> ZOOKEEPER-822-3.3.2.patch, ZOOKEEPER-822-3.3.2.patch, ZOOKEEPER-822.patch, 
> ZOOKEEPER-822.patch, ZOOKEEPER-822.patch, ZOOKEEPER-822.patch, 
> ZOOKEEPER-822.patch, ZOOKEEPER-822.patch_v1
>
>
> Created a 3 node cluster.
> 1. Fail the ZK leader
> 2. Let leader election finish. Restart the leader and let it join the 
> 3. Repeat 
> After a few rounds, leader election takes anywhere from 25 to 60 seconds to finish. 
> Note: we didn't have any ZK clients and no new znodes were created.
> zoo.cfg is shown below:
> #Mon Jul 19 12:15:10 UTC 2010
> server.1=192.168.4.12\:2888\:3888
> server.0=192.168.4.11\:2888\:3888
> clientPort=2181
> dataDir=/var/zookeeper
> syncLimit=2
> server.2=192.168.4.13\:2888\:3888
> initLimit=5
> tickTime=2000
> I have attached logs from two nodes that took a long time to form the cluster 
> after failing the leader. The leader was down anyway, so logs from that node 
> shouldn't matter.
> Look for "START HERE". Logs after that point should be of interest.




[jira] Updated: (ZOOKEEPER-822) Leader election taking a long time to complete

2010-09-28 Thread Flavio Junqueira (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flavio Junqueira updated ZOOKEEPER-822:
---

Status: Open  (was: Patch Available)

> Leader election taking a long time  to complete
> ---
>
> Key: ZOOKEEPER-822
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-822
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.3.0
>Reporter: Vishal K
>Assignee: Vishal K
>Priority: Blocker
> Fix For: 3.3.2, 3.4.0
>
> Attachments: 822.tar.gz, rhel.tar.gz, test_zookeeper_1.log, 
> test_zookeeper_2.log, zk_leader_election.tar.gz, zookeeper-3.4.0.tar.gz, 
> ZOOKEEPER-822-3.3.2.patch, ZOOKEEPER-822-3.3.2.patch, 
> ZOOKEEPER-822-3.3.2.patch, ZOOKEEPER-822.patch, ZOOKEEPER-822.patch, 
> ZOOKEEPER-822.patch, ZOOKEEPER-822.patch, ZOOKEEPER-822.patch, 
> ZOOKEEPER-822.patch_v1
>
>
> Created a 3 node cluster.
> 1. Fail the ZK leader
> 2. Let leader election finish. Restart the leader and let it join the 
> 3. Repeat 
> After a few rounds, leader election takes anywhere from 25 to 60 seconds to finish. 
> Note: we didn't have any ZK clients and no new znodes were created.
> zoo.cfg is shown below:
> #Mon Jul 19 12:15:10 UTC 2010
> server.1=192.168.4.12\:2888\:3888
> server.0=192.168.4.11\:2888\:3888
> clientPort=2181
> dataDir=/var/zookeeper
> syncLimit=2
> server.2=192.168.4.13\:2888\:3888
> initLimit=5
> tickTime=2000
> I have attached logs from two nodes that took a long time to form the cluster 
> after failing the leader. The leader was down anyway, so logs from that node 
> shouldn't matter.
> Look for "START HERE". Logs after that point should be of interest.




[jira] Commented: (ZOOKEEPER-702) GSoC 2010: Failure Detector Model

2010-09-28 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12915825#action_12915825
 ] 

Flavio Junqueira commented on ZOOKEEPER-702:


+1, I'm pretty happy with the patch.

> GSoC 2010: Failure Detector Model
> -
>
> Key: ZOOKEEPER-702
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-702
> Project: Zookeeper
>  Issue Type: Wish
>Reporter: Henry Robinson
>Assignee: Abmar Barros
> Attachments: bertier-pseudo.txt, bertier-pseudo.txt, chen-pseudo.txt, 
> chen-pseudo.txt, phiaccrual-pseudo.txt, phiaccrual-pseudo.txt, 
> ZOOKEEPER-702-code.patch, ZOOKEEPER-702-doc.patch, ZOOKEEPER-702.patch, 
> ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, 
> ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, 
> ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, 
> ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, 
> ZOOKEEPER-702.patch
>
>
> Failure Detector Module
> Possible Mentor
> Henry Robinson (henry at apache dot org)
> Requirements
> Java, some distributed systems knowledge, comfort implementing distributed 
> systems protocols
> Description
> ZooKeeper servers detect the failure of other servers and clients by 
> counting the number of 'ticks' for which they don't get a heartbeat from 
> other machines. This is the 'timeout' method of failure detection and works 
> very well; however it is possible that it is too aggressive and not easily 
> tuned for some more unusual ZooKeeper installations (such as in a wide-area 
> network, or even in a mobile ad-hoc network).
> This project would abstract the notion of failure detection to a dedicated 
> Java module, and implement several failure detectors to compare and contrast 
> their appropriateness for ZooKeeper. For example, Apache Cassandra uses a 
> phi-accrual failure detector (http://ddsg.jaist.ac.jp/pub/HDY+04.pdf) which 
> is much more tunable and has some very interesting properties. This is a 
> great project if you are interested in distributed algorithms, or want to 
> help re-factor some of ZooKeeper's internal code.




[jira] Commented: (ZOOKEEPER-880) QuorumCnxManager$SendWorker grows without bounds

2010-09-27 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12915625#action_12915625
 ] 

Flavio Junqueira commented on ZOOKEEPER-880:


J-D, has it happened just once, or is it reproducible? Does it also happen with 
3.3?

> QuorumCnxManager$SendWorker grows without bounds
> 
>
> Key: ZOOKEEPER-880
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-880
> Project: Zookeeper
>  Issue Type: Bug
>Affects Versions: 3.2.2
>Reporter: Jean-Daniel Cryans
> Attachments: hbase-hadoop-zookeeper-sv4borg12.log.gz, 
> hbase-hadoop-zookeeper-sv4borg9.log.gz, jstack
>
>
> We're seeing an issue where one server in the ensemble has a steadily growing 
> number of QuorumCnxManager$SendWorker threads up to a point where the OS runs 
> out of native threads, and at the same time we see a lot of exceptions in the 
> logs.  This is on 3.2.2 and our config looks like:
> {noformat}
> tickTime=3000
> dataDir=/somewhere_thats_not_tmp
> clientPort=2181
> initLimit=10
> syncLimit=5
> server.0=sv4borg9:2888:3888
> server.1=sv4borg10:2888:3888
> server.2=sv4borg11:2888:3888
> server.3=sv4borg12:2888:3888
> server.4=sv4borg13:2888:3888
> {noformat}
> The issue is on the first server. I'm going to attach thread dumps and logs 
> in a moment.




[jira] Commented: (ZOOKEEPER-702) GSoC 2010: Failure Detector Model

2010-09-26 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12915034#action_12915034
 ] 

Flavio Junqueira commented on ZOOKEEPER-702:


In the previous comment, hopefully it was clear that I meant to say that the 
new tests are NOT working as expected. Apologies for the typo.

> GSoC 2010: Failure Detector Model
> -
>
> Key: ZOOKEEPER-702
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-702
> Project: Zookeeper
>  Issue Type: Wish
>Reporter: Henry Robinson
>Assignee: Abmar Barros
> Attachments: bertier-pseudo.txt, bertier-pseudo.txt, chen-pseudo.txt, 
> chen-pseudo.txt, phiaccrual-pseudo.txt, phiaccrual-pseudo.txt, 
> ZOOKEEPER-702-code.patch, ZOOKEEPER-702-doc.patch, ZOOKEEPER-702.patch, 
> ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, 
> ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, 
> ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, 
> ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, ZOOKEEPER-702.patch
>
>
> Failure Detector Module
> Possible Mentor
> Henry Robinson (henry at apache dot org)
> Requirements
> Java, some distributed systems knowledge, comfort implementing distributed 
> systems protocols
> Description
> ZooKeeper servers detect the failure of other servers and clients by 
> counting the number of 'ticks' for which they don't get a heartbeat from 
> other machines. This is the 'timeout' method of failure detection and works 
> very well; however it is possible that it is too aggressive and not easily 
> tuned for some more unusual ZooKeeper installations (such as in a wide-area 
> network, or even in a mobile ad-hoc network).
> This project would abstract the notion of failure detection to a dedicated 
> Java module, and implement several failure detectors to compare and contrast 
> their appropriateness for ZooKeeper. For example, Apache Cassandra uses a 
> phi-accrual failure detector (http://ddsg.jaist.ac.jp/pub/HDY+04.pdf) which 
> is much more tunable and has some very interesting properties. This is a 
> great project if you are interested in distributed algorithms, or want to 
> help re-factor some of ZooKeeper's internal code.




[jira] Updated: (ZOOKEEPER-822) Leader election taking a long time to complete

2010-09-26 Thread Flavio Junqueira (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flavio Junqueira updated ZOOKEEPER-822:
---

Status: Patch Available  (was: Open)

Thanks for reviewing it, Vishal. I have fixed the LOG.warn you pointed out and 
uploaded new patch files.

> Leader election taking a long time  to complete
> ---
>
> Key: ZOOKEEPER-822
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-822
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.3.0
>Reporter: Vishal K
>Assignee: Vishal K
>Priority: Blocker
> Fix For: 3.3.2, 3.4.0
>
> Attachments: 822.tar.gz, rhel.tar.gz, test_zookeeper_1.log, 
> test_zookeeper_2.log, zk_leader_election.tar.gz, zookeeper-3.4.0.tar.gz, 
> ZOOKEEPER-822-3.3.2.patch, ZOOKEEPER-822-3.3.2.patch, 
> ZOOKEEPER-822-3.3.2.patch, ZOOKEEPER-822.patch, ZOOKEEPER-822.patch, 
> ZOOKEEPER-822.patch, ZOOKEEPER-822.patch, ZOOKEEPER-822.patch, 
> ZOOKEEPER-822.patch_v1
>
>
> Created a 3 node cluster.
> 1. Fail the ZK leader
> 2. Let leader election finish. Restart the leader and let it join the 
> 3. Repeat 
> After a few rounds, leader election takes anywhere from 25 to 60 seconds to finish. 
> Note: we didn't have any ZK clients and no new znodes were created.
> zoo.cfg is shown below:
> #Mon Jul 19 12:15:10 UTC 2010
> server.1=192.168.4.12\:2888\:3888
> server.0=192.168.4.11\:2888\:3888
> clientPort=2181
> dataDir=/var/zookeeper
> syncLimit=2
> server.2=192.168.4.13\:2888\:3888
> initLimit=5
> tickTime=2000
> I have attached logs from two nodes that took a long time to form the cluster 
> after failing the leader. The leader was down anyway, so logs from that node 
> shouldn't matter.
> Look for "START HERE". Logs after that point should be of interest.




[jira] Updated: (ZOOKEEPER-822) Leader election taking a long time to complete

2010-09-26 Thread Flavio Junqueira (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flavio Junqueira updated ZOOKEEPER-822:
---

Attachment: ZOOKEEPER-822.patch

> Leader election taking a long time  to complete
> ---
>
> Key: ZOOKEEPER-822
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-822
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.3.0
>Reporter: Vishal K
>Assignee: Vishal K
>Priority: Blocker
> Fix For: 3.3.2, 3.4.0
>
> Attachments: 822.tar.gz, rhel.tar.gz, test_zookeeper_1.log, 
> test_zookeeper_2.log, zk_leader_election.tar.gz, zookeeper-3.4.0.tar.gz, 
> ZOOKEEPER-822-3.3.2.patch, ZOOKEEPER-822-3.3.2.patch, 
> ZOOKEEPER-822-3.3.2.patch, ZOOKEEPER-822.patch, ZOOKEEPER-822.patch, 
> ZOOKEEPER-822.patch, ZOOKEEPER-822.patch, ZOOKEEPER-822.patch, 
> ZOOKEEPER-822.patch_v1
>
>
> Created a 3 node cluster.
> 1. Fail the ZK leader
> 2. Let leader election finish. Restart the leader and let it join the 
> 3. Repeat 
> After a few rounds, leader election takes anywhere from 25 to 60 seconds to finish. 
> Note: we didn't have any ZK clients and no new znodes were created.
> zoo.cfg is shown below:
> #Mon Jul 19 12:15:10 UTC 2010
> server.1=192.168.4.12\:2888\:3888
> server.0=192.168.4.11\:2888\:3888
> clientPort=2181
> dataDir=/var/zookeeper
> syncLimit=2
> server.2=192.168.4.13\:2888\:3888
> initLimit=5
> tickTime=2000
> I have attached logs from two nodes that took a long time to form the cluster 
> after failing the leader. The leader was down anyway, so logs from that node 
> shouldn't matter.
> Look for "START HERE". Logs after that point should be of interest.




[jira] Updated: (ZOOKEEPER-822) Leader election taking a long time to complete

2010-09-26 Thread Flavio Junqueira (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flavio Junqueira updated ZOOKEEPER-822:
---

Attachment: ZOOKEEPER-822-3.3.2.patch

> Leader election taking a long time  to complete
> ---
>
> Key: ZOOKEEPER-822
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-822
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.3.0
>Reporter: Vishal K
>Assignee: Vishal K
>Priority: Blocker
> Fix For: 3.3.2, 3.4.0
>
> Attachments: 822.tar.gz, rhel.tar.gz, test_zookeeper_1.log, 
> test_zookeeper_2.log, zk_leader_election.tar.gz, zookeeper-3.4.0.tar.gz, 
> ZOOKEEPER-822-3.3.2.patch, ZOOKEEPER-822-3.3.2.patch, 
> ZOOKEEPER-822-3.3.2.patch, ZOOKEEPER-822.patch, ZOOKEEPER-822.patch, 
> ZOOKEEPER-822.patch, ZOOKEEPER-822.patch, ZOOKEEPER-822.patch_v1
>
>
> Created a 3 node cluster.
> 1. Fail the ZK leader
> 2. Let leader election finish. Restart the leader and let it join the 
> 3. Repeat 
> After a few rounds, leader election takes anywhere from 25 to 60 seconds to finish. 
> Note: we didn't have any ZK clients and no new znodes were created.
> zoo.cfg is shown below:
> #Mon Jul 19 12:15:10 UTC 2010
> server.1=192.168.4.12\:2888\:3888
> server.0=192.168.4.11\:2888\:3888
> clientPort=2181
> dataDir=/var/zookeeper
> syncLimit=2
> server.2=192.168.4.13\:2888\:3888
> initLimit=5
> tickTime=2000
> I have attached logs from two nodes that took a long time to form the cluster 
> after failing the leader. The leader was down anyway, so logs from that node 
> shouldn't matter.
> Look for "START HERE". Logs after that point should be of interest.




[jira] Updated: (ZOOKEEPER-822) Leader election taking a long time to complete

2010-09-26 Thread Flavio Junqueira (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flavio Junqueira updated ZOOKEEPER-822:
---

Status: Open  (was: Patch Available)

> Leader election taking a long time  to complete
> ---
>
> Key: ZOOKEEPER-822
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-822
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.3.0
>Reporter: Vishal K
>Assignee: Vishal K
>Priority: Blocker
> Fix For: 3.3.2, 3.4.0
>
> Attachments: 822.tar.gz, rhel.tar.gz, test_zookeeper_1.log, 
> test_zookeeper_2.log, zk_leader_election.tar.gz, zookeeper-3.4.0.tar.gz, 
> ZOOKEEPER-822-3.3.2.patch, ZOOKEEPER-822-3.3.2.patch, ZOOKEEPER-822.patch, 
> ZOOKEEPER-822.patch, ZOOKEEPER-822.patch, ZOOKEEPER-822.patch, 
> ZOOKEEPER-822.patch_v1
>
>
> Created a 3 node cluster.
> 1. Fail the ZK leader
> 2. Let leader election finish. Restart the leader and let it join the 
> 3. Repeat 
> After a few rounds, leader election takes anywhere from 25 to 60 seconds to finish. 
> Note: we didn't have any ZK clients and no new znodes were created.
> zoo.cfg is shown below:
> #Mon Jul 19 12:15:10 UTC 2010
> server.1=192.168.4.12\:2888\:3888
> server.0=192.168.4.11\:2888\:3888
> clientPort=2181
> dataDir=/var/zookeeper
> syncLimit=2
> server.2=192.168.4.13\:2888\:3888
> initLimit=5
> tickTime=2000
> I have attached logs from two nodes that took a long time to form the cluster 
> after failing the leader. The leader was down anyway, so logs from that node 
> shouldn't matter.
> Look for "START HERE". Logs after that point should be of interest.




[jira] Commented: (ZOOKEEPER-823) update ZooKeeper java client to optionally use Netty for connections

2010-09-25 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12914822#action_12914822
 ] 

Flavio Junqueira commented on ZOOKEEPER-823:


Here is another instance:

{noformat}
Testcase: testPathValidation took 1.865 sec
Caused an ERROR
KeeperErrorCode = ConnectionLoss for /chrootclienttest
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = 
ConnectionLoss for /chrootclienttest
at org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:640)
at 
org.apache.zookeeper.test.ChrootClientTest.setUp(ChrootClientTest.java:42)
{noformat}

I'm on Mac OS X 10.5.8, java build 1.6.0_20-b02-279-9M3165.

> update ZooKeeper java client to optionally use Netty for connections
> 
>
> Key: ZOOKEEPER-823
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-823
> Project: Zookeeper
>  Issue Type: New Feature
>  Components: java client
>Reporter: Patrick Hunt
>Assignee: Patrick Hunt
> Fix For: 3.4.0
>
> Attachments: NettyNettySuiteTest.rtf, 
> TEST-org.apache.zookeeper.test.NettyNettySuiteTest.txt.gz, 
> ZOOKEEPER-823.patch, ZOOKEEPER-823.patch, ZOOKEEPER-823.patch, 
> ZOOKEEPER-823.patch, ZOOKEEPER-823.patch, ZOOKEEPER-823.patch, 
> ZOOKEEPER-823.patch
>
>
> This jira will port the client side connection code to use netty rather than 
> direct nio.




[jira] Updated: (ZOOKEEPER-702) GSoC 2010: Failure Detector Model

2010-09-25 Thread Flavio Junqueira (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flavio Junqueira updated ZOOKEEPER-702:
---

Status: Open  (was: Patch Available)

Thanks for the updated patch, Abmar. The new tests, however, are not working 
as expected. More specifically, the methods in QuorumBase (createLearnersFD 
and createSessionsFD) are not being overridden as expected, which affects all 
new hammer tests. I haven't checked the other tests, but I suspect they suffer 
from the same problem.

I'm canceling the patch for now. 

> GSoC 2010: Failure Detector Model
> -
>
> Key: ZOOKEEPER-702
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-702
> Project: Zookeeper
>  Issue Type: Wish
>Reporter: Henry Robinson
>Assignee: Abmar Barros
> Attachments: bertier-pseudo.txt, bertier-pseudo.txt, chen-pseudo.txt, 
> chen-pseudo.txt, phiaccrual-pseudo.txt, phiaccrual-pseudo.txt, 
> ZOOKEEPER-702-code.patch, ZOOKEEPER-702-doc.patch, ZOOKEEPER-702.patch, 
> ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, 
> ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, 
> ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, 
> ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, ZOOKEEPER-702.patch
>
>
> Failure Detector Module
> Possible Mentor
> Henry Robinson (henry at apache dot org)
> Requirements
> Java, some distributed systems knowledge, comfort implementing distributed 
> systems protocols
> Description
> ZooKeeper servers detect the failure of other servers and clients by 
> counting the number of 'ticks' for which they don't get a heartbeat from 
> other machines. This is the 'timeout' method of failure detection and works 
> very well; however it is possible that it is too aggressive and not easily 
> tuned for some more unusual ZooKeeper installations (such as in a wide-area 
> network, or even in a mobile ad-hoc network).
> This project would abstract the notion of failure detection to a dedicated 
> Java module, and implement several failure detectors to compare and contrast 
> their appropriateness for ZooKeeper. For example, Apache Cassandra uses a 
> phi-accrual failure detector (http://ddsg.jaist.ac.jp/pub/HDY+04.pdf) which 
> is much more tunable and has some very interesting properties. This is a 
> great project if you are interested in distributed algorithms, or want to 
> help re-factor some of ZooKeeper's internal code.
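
To make the contrast in the description above concrete, here is a minimal sketch of the two approaches: a fixed 'timeout' detector that counts missed ticks, and an accrual detector that reports a suspicion level (phi) instead of a yes/no answer. This is an illustration under stated assumptions, not the patch attached to this issue and not ZooKeeper or Cassandra code. The class names are hypothetical, and phi is computed under the simplifying assumption that heartbeat inter-arrival times are exponentially distributed, whereas the referenced paper fits a normal distribution.

```python
import math
from collections import deque


class TimeoutDetector:
    """Fixed-timeout detection: suspect a peer after too many silent ticks."""

    def __init__(self, tick_ms, max_missed_ticks):
        self.limit_ms = tick_ms * max_missed_ticks
        self.last_heartbeat_ms = None

    def heartbeat(self, now_ms):
        self.last_heartbeat_ms = now_ms

    def suspects(self, now_ms):
        return now_ms - self.last_heartbeat_ms > self.limit_ms


class PhiAccrualDetector:
    """Accrual detection: report a suspicion level that grows with silence."""

    def __init__(self, window=100):
        # Sliding window of observed heartbeat inter-arrival times (ms).
        self.intervals = deque(maxlen=window)
        self.last_heartbeat_ms = None

    def heartbeat(self, now_ms):
        if self.last_heartbeat_ms is not None:
            self.intervals.append(now_ms - self.last_heartbeat_ms)
        self.last_heartbeat_ms = now_ms

    def phi(self, now_ms):
        if not self.intervals:
            return 0.0
        mean = sum(self.intervals) / len(self.intervals)
        elapsed = now_ms - self.last_heartbeat_ms
        # Assumed exponential model: P(next heartbeat later than elapsed)
        # = exp(-elapsed / mean), and phi = -log10 of that probability.
        return elapsed / (mean * math.log(10))


# Feed both detectors a heartbeat every 2000 ms, then let the peer go silent.
fixed = TimeoutDetector(tick_ms=2000, max_missed_ticks=2)
accrual = PhiAccrualDetector()
for t in range(0, 10001, 2000):
    fixed.heartbeat(t)
    accrual.heartbeat(t)

for now in (12000, 15000, 30000):
    print(now, fixed.suspects(now), round(accrual.phi(now), 2))
```

The fixed detector flips from "alive" to "suspected" at a single hard-coded limit, while the accrual detector leaves the threshold on phi to the caller, which is what makes it easier to tune for slow or variable networks.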




[jira] Updated: (ZOOKEEPER-823) update ZooKeeper java client to optionally use Netty for connections

2010-09-22 Thread Flavio Junqueira (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flavio Junqueira updated ZOOKEEPER-823:
---

Attachment: NettyNettySuiteTest.rtf

> update ZooKeeper java client to optionally use Netty for connections
> 
>
> Key: ZOOKEEPER-823
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-823
> Project: Zookeeper
>  Issue Type: New Feature
>  Components: java client
>Reporter: Patrick Hunt
>Assignee: Patrick Hunt
> Fix For: 3.4.0
>
> Attachments: NettyNettySuiteTest.rtf, 
> TEST-org.apache.zookeeper.test.NettyNettySuiteTest.txt.gz, 
> ZOOKEEPER-823.patch, ZOOKEEPER-823.patch, ZOOKEEPER-823.patch, 
> ZOOKEEPER-823.patch, ZOOKEEPER-823.patch, ZOOKEEPER-823.patch, 
> ZOOKEEPER-823.patch
>
>
> This jira will port the client side connection code to use netty rather than 
> direct nio.




[jira] Commented: (ZOOKEEPER-823) update ZooKeeper java client to optionally use Netty for connections

2010-09-22 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12913591#action_12913591
 ] 

Flavio Junqueira commented on ZOOKEEPER-823:


NettyNettySuiteTest is failing intermittently for me. I'm attaching logs for a 
run that failed.

> update ZooKeeper java client to optionally use Netty for connections
> 
>
> Key: ZOOKEEPER-823
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-823
> Project: Zookeeper
>  Issue Type: New Feature
>  Components: java client
>Reporter: Patrick Hunt
>Assignee: Patrick Hunt
> Fix For: 3.4.0
>
> Attachments: 
> TEST-org.apache.zookeeper.test.NettyNettySuiteTest.txt.gz, 
> ZOOKEEPER-823.patch, ZOOKEEPER-823.patch, ZOOKEEPER-823.patch, 
> ZOOKEEPER-823.patch, ZOOKEEPER-823.patch, ZOOKEEPER-823.patch, 
> ZOOKEEPER-823.patch
>
>
> This jira will port the client side connection code to use netty rather than 
> direct nio.




[jira] Updated: (ZOOKEEPER-822) Leader election taking a long time to complete

2010-09-20 Thread Flavio Junqueira (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flavio Junqueira updated ZOOKEEPER-822:
---

Status: Patch Available  (was: Open)

I have added a system property called "cnxtimeout" to change the timeout value 
in QuorumCnxManager. Tests pass for me.

> Leader election taking a long time  to complete
> ---
>
> Key: ZOOKEEPER-822
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-822
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.3.0
>Reporter: Vishal K
>Assignee: Vishal K
>Priority: Blocker
> Fix For: 3.3.2, 3.4.0
>
> Attachments: 822.tar.gz, rhel.tar.gz, test_zookeeper_1.log, 
> test_zookeeper_2.log, zk_leader_election.tar.gz, zookeeper-3.4.0.tar.gz, 
> ZOOKEEPER-822-3.3.2.patch, ZOOKEEPER-822-3.3.2.patch, ZOOKEEPER-822.patch, 
> ZOOKEEPER-822.patch, ZOOKEEPER-822.patch, ZOOKEEPER-822.patch, 
> ZOOKEEPER-822.patch_v1
>
>
> Created a 3 node cluster.
> 1. Fail the ZK leader.
> 2. Let leader election finish. Restart the leader and let it join the 
> 3. Repeat.
> After a few rounds, leader election takes anywhere from 25 to 60 seconds to finish. 
> Note: we didn't have any ZK clients and no new znodes were created.
> zoo.cfg is shown below:
> #Mon Jul 19 12:15:10 UTC 2010
> server.1=192.168.4.12\:2888\:3888
> server.0=192.168.4.11\:2888\:3888
> clientPort=2181
> dataDir=/var/zookeeper
> syncLimit=2
> server.2=192.168.4.13\:2888\:3888
> initLimit=5
> tickTime=2000
> I have attached logs from two nodes that took a long time to form the cluster 
> after failing the leader. The leader was down anyways so logs from that node 
> shouldn't matter.
> Look for "START HERE". Logs after that point should be of our interest.




[jira] Updated: (ZOOKEEPER-822) Leader election taking a long time to complete

2010-09-20 Thread Flavio Junqueira (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flavio Junqueira updated ZOOKEEPER-822:
---

Status: Open  (was: Patch Available)

Going to submit patches that introduce a system property.

> Leader election taking a long time  to complete
> ---
>
> Key: ZOOKEEPER-822
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-822
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.3.0
>Reporter: Vishal K
>Assignee: Vishal K
>Priority: Blocker
> Fix For: 3.3.2, 3.4.0
>
> Attachments: 822.tar.gz, rhel.tar.gz, test_zookeeper_1.log, 
> test_zookeeper_2.log, zk_leader_election.tar.gz, zookeeper-3.4.0.tar.gz, 
> ZOOKEEPER-822-3.3.2.patch, ZOOKEEPER-822-3.3.2.patch, ZOOKEEPER-822.patch, 
> ZOOKEEPER-822.patch, ZOOKEEPER-822.patch, ZOOKEEPER-822.patch, 
> ZOOKEEPER-822.patch_v1
>
>
> Created a 3 node cluster.
> 1. Fail the ZK leader.
> 2. Let leader election finish. Restart the leader and let it join the 
> 3. Repeat.
> After a few rounds, leader election takes anywhere from 25 to 60 seconds to finish. 
> Note: we didn't have any ZK clients and no new znodes were created.
> zoo.cfg is shown below:
> #Mon Jul 19 12:15:10 UTC 2010
> server.1=192.168.4.12\:2888\:3888
> server.0=192.168.4.11\:2888\:3888
> clientPort=2181
> dataDir=/var/zookeeper
> syncLimit=2
> server.2=192.168.4.13\:2888\:3888
> initLimit=5
> tickTime=2000
> I have attached logs from two nodes that took a long time to form the cluster 
> after failing the leader. The leader was down anyways so logs from that node 
> shouldn't matter.
> Look for "START HERE". Logs after that point should be of our interest.




[jira] Updated: (ZOOKEEPER-822) Leader election taking a long time to complete

2010-09-20 Thread Flavio Junqueira (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flavio Junqueira updated ZOOKEEPER-822:
---

Attachment: ZOOKEEPER-822.patch

> Leader election taking a long time  to complete
> ---
>
> Key: ZOOKEEPER-822
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-822
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.3.0
>Reporter: Vishal K
>Assignee: Vishal K
>Priority: Blocker
> Fix For: 3.3.2, 3.4.0
>
> Attachments: 822.tar.gz, rhel.tar.gz, test_zookeeper_1.log, 
> test_zookeeper_2.log, zk_leader_election.tar.gz, zookeeper-3.4.0.tar.gz, 
> ZOOKEEPER-822-3.3.2.patch, ZOOKEEPER-822-3.3.2.patch, ZOOKEEPER-822.patch, 
> ZOOKEEPER-822.patch, ZOOKEEPER-822.patch, ZOOKEEPER-822.patch, 
> ZOOKEEPER-822.patch_v1
>
>
> Created a 3 node cluster.
> 1. Fail the ZK leader.
> 2. Let leader election finish. Restart the leader and let it join the 
> 3. Repeat.
> After a few rounds, leader election takes anywhere from 25 to 60 seconds to finish. 
> Note: we didn't have any ZK clients and no new znodes were created.
> zoo.cfg is shown below:
> #Mon Jul 19 12:15:10 UTC 2010
> server.1=192.168.4.12\:2888\:3888
> server.0=192.168.4.11\:2888\:3888
> clientPort=2181
> dataDir=/var/zookeeper
> syncLimit=2
> server.2=192.168.4.13\:2888\:3888
> initLimit=5
> tickTime=2000
> I have attached logs from two nodes that took a long time to form the cluster 
> after failing the leader. The leader was down anyways so logs from that node 
> shouldn't matter.
> Look for "START HERE". Logs after that point should be of our interest.




[jira] Updated: (ZOOKEEPER-822) Leader election taking a long time to complete

2010-09-20 Thread Flavio Junqueira (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flavio Junqueira updated ZOOKEEPER-822:
---

Attachment: ZOOKEEPER-822-3.3.2.patch

> Leader election taking a long time  to complete
> ---
>
> Key: ZOOKEEPER-822
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-822
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.3.0
>Reporter: Vishal K
>Assignee: Vishal K
>Priority: Blocker
> Fix For: 3.3.2, 3.4.0
>
> Attachments: 822.tar.gz, rhel.tar.gz, test_zookeeper_1.log, 
> test_zookeeper_2.log, zk_leader_election.tar.gz, zookeeper-3.4.0.tar.gz, 
> ZOOKEEPER-822-3.3.2.patch, ZOOKEEPER-822-3.3.2.patch, ZOOKEEPER-822.patch, 
> ZOOKEEPER-822.patch, ZOOKEEPER-822.patch, ZOOKEEPER-822.patch, 
> ZOOKEEPER-822.patch_v1
>
>
> Created a 3 node cluster.
> 1. Fail the ZK leader.
> 2. Let leader election finish. Restart the leader and let it join the 
> 3. Repeat.
> After a few rounds, leader election takes anywhere from 25 to 60 seconds to finish. 
> Note: we didn't have any ZK clients and no new znodes were created.
> zoo.cfg is shown below:
> #Mon Jul 19 12:15:10 UTC 2010
> server.1=192.168.4.12\:2888\:3888
> server.0=192.168.4.11\:2888\:3888
> clientPort=2181
> dataDir=/var/zookeeper
> syncLimit=2
> server.2=192.168.4.13\:2888\:3888
> initLimit=5
> tickTime=2000
> I have attached logs from two nodes that took a long time to form the cluster 
> after failing the leader. The leader was down anyways so logs from that node 
> shouldn't matter.
> Look for "START HERE". Logs after that point should be of our interest.



