[jira] Commented: (ZOOKEEPER-1006) QuorumPeer "Address already in use" -- regression in 3.3.3

2011-03-08 Thread Patrick Hunt (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13003837#comment-13003837
 ] 

Patrick Hunt commented on ZOOKEEPER-1006:
-

I believe Vishal is working on addressing the ZOOKEEPER-880 issue (the origin of
this test) on trunk in a different JIRA; see:

https://issues.apache.org/jira/browse/ZOOKEEPER-880?focusedCommentId=12991286&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12991286

> QuorumPeer "Address already in use" -- regression in 3.3.3
> --
>
> Key: ZOOKEEPER-1006
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1006
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: tests
>Affects Versions: 3.3.3
>Reporter: Patrick Hunt
>Assignee: Patrick Hunt
>Priority: Minor
> Fix For: 3.3.4, 3.4.0
>
> Attachments: TEST-org.apache.zookeeper.test.CnxManagerTest.txt, 
> ZOOKEEPER-1006.patch, ZOOKEEPER-1006.patch, workerthreads_badtest.txt
>
>
> CnxManagerTest.testWorkerThreads 
> See the attachment; this is the first time I've seen this test fail, and it's 
> failed two of the last three test runs.
> Notice (attachment) once this happens the port never becomes available.
> {noformat}
> 2011-03-02 15:53:12,425 - INFO  
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11245:NIOServerCnxn$Factory@251] - 
> Accepted socket connection from /172.29.6.162:51441
> 2011-03-02 15:53:12,430 - WARN  
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11245:NIOServerCnxn@639] - Exception 
> causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not 
> running
> 2011-03-02 15:53:12,430 - INFO  
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11245:NIOServerCnxn@1435] - Closed 
> socket connection for client /172.29.6.162:51441 (no session established for 
> client)
> 2011-03-02 15:53:12,430 - WARN  
> [QuorumPeer:/0:0:0:0:0:0:0:0:11241:Follower@82] - Exception when following 
> the leader
> java.io.EOFException
>   at java.io.DataInputStream.readInt(DataInputStream.java:375)
>   at 
> org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
>   at 
> org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:84)
>   at 
> org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
>   at 
> org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:148)
>   at 
> org.apache.zookeeper.server.quorum.Learner.registerWithLeader(Learner.java:267)
>   at 
> org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:66)
>   at 
> org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:645)
> 2011-03-02 15:53:12,431 - INFO  
> [QuorumPeer:/0:0:0:0:0:0:0:0:11241:Follower@165] - shutdown called
> java.lang.Exception: shutdown Follower
>   at 
> org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:165)
>   at 
> org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:649)
> 2011-03-02 15:53:12,432 - INFO  
> [QuorumPeer:/0:0:0:0:0:0:0:0:11241:QuorumPeer@621] - LOOKING
> 2011-03-02 15:53:12,432 - INFO  
> [QuorumPeer:/0:0:0:0:0:0:0:0:11241:FastLeaderElection@663] - New election. My 
> id =  0, Proposed zxid = 0
> 2011-03-02 15:53:12,433 - INFO  [WorkerReceiver 
> Thread:FastLeaderElection@496] - Notification: 0 (n.leader), 0 (n.zxid), 2 
> (n.round), LOOKING (n.state), 0 (n.sid), LOOKING (my state)
> 2011-03-02 15:53:12,433 - INFO  [WorkerReceiver 
> Thread:FastLeaderElection@496] - Notification: 0 (n.leader), 0 (n.zxid), 2 
> (n.round), LOOKING (n.state), 0 (n.sid), LOOKING (my state)
> 2011-03-02 15:53:12,433 - INFO  [WorkerReceiver 
> Thread:FastLeaderElection@496] - Notification: 0 (n.leader), 0 (n.zxid), 2 
> (n.round), LOOKING (n.state), 0 (n.sid), LOOKING (my state)
> 2011-03-02 15:53:12,633 - INFO  [WorkerReceiver 
> Thread:FastLeaderElection@496] - Notification: 0 (n.leader), 0 (n.zxid), 2 
> (n.round), LOOKING (n.state), 0 (n.sid), LOOKING (my state)
> 2011-03-02 15:53:12,633 - INFO  
> [QuorumPeer:/0:0:0:0:0:0:0:0:11245:QuorumPeer@655] - LEADING
> 2011-03-02 15:53:12,636 - INFO  [QuorumPeer:/0:0:0:0:0:0:0:0:11245:Leader@54] 
> - TCP NoDelay set to: true
> 2011-03-02 15:53:12,638 - INFO  
> [QuorumPeer:/0:0:0:0:0:0:0:0:11245:ZooKeeperServer@151] - Created server with 
> tickTime 1000 minSessionTimeout 2000 maxSessionTimeout 2 datadir 
> /var/lib/hudson/workspace/CDH3-ZooKeeper-3.3.3_sles/build/test/tmp/test9001250572426375869.junit.dir/version-2
>  snapdir 
> /var/lib/hudson/workspace/CDH3-ZooKeeper-3.3.3_sles/build/test/tmp/test9001250572426375869.junit.dir/version-2
> 2011-03-02 15:53:12,639 - ERROR 
> [QuorumPeer:/0:0:0:0:0:0:0:0:11245:Leader@133] - Couldn't bind to port 11245
> java.net.BindException: Address already in use
>   at java.ne
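
The BindException above is the classic symptom of a listener racing a socket still in TIME_WAIT on the same port. A minimal sketch of the usual mitigation in test harnesses is to set SO_REUSEADDR before binding (illustrative only, shown here in Python; the actual ZOOKEEPER-1006 patch may take a different approach):

```python
import socket

def bind_listener(port: int) -> socket.socket:
    """Bind a TCP listener, tolerating a previous socket in TIME_WAIT."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # SO_REUSEADDR lets the bind succeed even if the old test/quorum
    # socket on this port has not fully left TIME_WAIT yet.
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.bind(("127.0.0.1", port))
    s.listen(5)
    return s
```

Without the option, a restarted peer can see "Address already in use" for up to the TIME_WAIT interval even though no live process holds the port.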

[jira] Created: (ZOOKEEPER-1009) BookKeeper as a sequencer

2011-03-08 Thread Flavio Junqueira (JIRA)
BookKeeper as a sequencer
-

 Key: ZOOKEEPER-1009
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1009
 Project: ZooKeeper
  Issue Type: New Feature
  Components: contrib-bookkeeper
Reporter: Flavio Junqueira
Assignee: Flavio Junqueira


It would be interesting to use BookKeeper as a sequencer for a number of 
parallel streams of events. The general idea is to use BookKeeper to establish 
a total order on the events of concurrent streams, and use the ordered sequence 
as the input of an architectural element that processes the events in the 
streams. This way the state of the element is reproducible.

Given that we currently interleave the entries of concurrent ledgers in a 
bookie for writing efficiently, I believe it wouldn't be difficult to extend 
the interface to enable such a feature.
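
The idea can be illustrated with a toy in-process sequencer (plain Python, nothing BookKeeper-specific; the `Sequencer` class and its methods are invented for illustration): concurrent streams append events, every event gets a position in one total order, and replaying the log reproduces the consumer's state.

```python
import threading

class Sequencer:
    """Toy stand-in for the proposed BookKeeper role: merge concurrent
    streams of events into a single totally ordered log."""
    def __init__(self):
        self._lock = threading.Lock()
        self._next = 0
        self.log = []          # [(seq, stream_id, event), ...]

    def append(self, stream_id, event):
        with self._lock:       # the total order is decided here
            seq = self._next
            self._next += 1
            self.log.append((seq, stream_id, event))
        return seq

seq = Sequencer()
seq.append("stream-a", "e1")
seq.append("stream-b", "e2")
seq.append("stream-a", "e3")
# Replaying seq.log in order reproduces the same consumer state every time.
```

In the actual proposal the durable, replicated ledger storage of BookKeeper would play the role of `self.log`.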

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] Commented: (ZOOKEEPER-1006) QuorumPeer "Address already in use" -- regression in 3.3.3

2011-03-08 Thread Vishal K (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13003950#comment-13003950
 ] 

Vishal K commented on ZOOKEEPER-1006:
-

Hi Patrick,

Thanks for the fix. I will port the test to the trunk.


Re: PMC member criteria for ZooKeeper.

2011-03-08 Thread Flavio Junqueira
Most discussions apart from issues like new committers are open, and anyone in
the community has the right to express an opinion, and I believe we do, in
general, take opinions and suggestions into account. Consequently, I don't see
much benefit in having a PMC member whose responsibilities are not a superset
of a committer's. At the same time, I don't see a reason for constraining PMC
members to be committers in the bylaws. I would much rather discuss each case
individually and evaluate the merit of the candidate accordingly.

-Flavio

Something about performance of Zookeeper

2011-03-08 Thread Qian Ye
Hi all:

These days my friend and I did some performance tests on ZooKeeper. We found
the performance of ZooKeeper is not as good as described in the
ZooKeeper Overview (
http://hadoop.apache.org/zookeeper/docs/r3.3.2/zookeeperOver.html). In the
ZooKeeper Overview, the "ZooKeeper Throughput as the Read-Write Ratio
Varies" figure shows that in an ensemble of 3 ZooKeeper servers, the throughput
can reach about 8, if the requests are all reads. However, we cannot get
results like that in our performance tests with the synchronous interface,
zkpython.

Here are some of our test results:
(3-zookeeper ensemble, 8-core CPU, 2.4 GHz, 16 RAM, Linux 2.6.9)

§ 1 client server, 1 process per client server, connect 1 zookeeper server,
all reads: cpu: 8%~9%, qps: 2208, latency: 0.000453s
§ 1 client server, 1 process per client server, connect all 3 zookeeper
servers, all reads: cpu: 8%~9%, qps: 2376.241573, latency: 0.000421s
§ 1 client server, 1 process per client server, connect all 3 zookeeper
servers, all reads: cpu: 10%~20%, qps: 15600, latency: 0.000764s
§ 1 client server, 30 processes per client server, connect all 3 zookeeper
servers, all reads: cpu: 10%~20%, qps: 15200, latency:
§ 2 client servers, 30 processes per client server, connect all 3 zookeeper
servers, all reads: cpu: 10%~20%, qps: 15800, latency: 0.003487

qps means "queries per second", that is, throughput. The results show that when
adding more client servers, the CPU utilization doesn't increase, and
the throughput doesn't increase much. It seems that the throughput won't reach
8, even if we add 28 more client servers to reach the number you
mentioned in the ZooKeeper Overview.

Maybe I've done the tests wrong. Is there any particular thing I should pay
attention to in this case? We set the max java heap size to 12GB in our
test.

Could you tell me the details of how you ran the performance test from
which you got the results shown in the ZooKeeper Overview?

-- 
With Regards!

Ye, Qian


Re: Something about performance of Zookeeper

2011-03-08 Thread Qian Ye
P.S. 1: we use ZooKeeper 3.3.2.
P.S. 2: all our testing processes get data from the same znode. The size of
the data on the znode is less than 1K.



-- 
With Regards!

Ye, Qian


Re: Something about performance of Zookeeper

2011-03-08 Thread Flavio Junqueira
Hi Qian,

If I understand your description correctly, you are using synchronous calls. To
get high throughput values, you need multiple outstanding requests, so you will
need to use asynchronous calls.

-Flavio
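
Flavio's explanation lines up with Qian's own numbers: a single synchronous client keeps exactly one request in flight, so its throughput is capped at 1/latency (Little's law with one outstanding request), no matter how much server capacity is available. A back-of-the-envelope check in Python, using the latency from Qian's first result line:

```python
# Little's law: throughput = requests_in_flight / latency.
# A synchronous client keeps exactly one request in flight, so its
# throughput is capped at 1 / latency regardless of server capacity.
latency_s = 0.000453   # per-read latency from Qian's single-process test
in_flight = 1          # synchronous zkpython call: one outstanding request
max_qps = in_flight / latency_s
print(round(max_qps))  # prints 2208, essentially the measured qps of 2208
```

To push throughput higher, the client must raise `in_flight` by pipelining asynchronous reads (or by running many more client processes), which is exactly what the Overview's benchmark clients do.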

release management

2011-03-08 Thread Benjamin Reed
we are already following the release management guidelines at:
https://cwiki.apache.org/confluence/display/ZOOKEEPER/ReleaseManagement

currently that page says that the document is not authoritative. i'd
like to remove that line. does anyone have a problem with that? should
we call a vote?

ben


Re: PMC member criteria for ZooKeeper.

2011-03-08 Thread Patrick Hunt
On Tue, Mar 8, 2011 at 7:00 AM, Flavio Junqueira  wrote:

> Most discussions apart from issues like new committers are open, and anyone
> in the community has the right to express an opinion, and I believe we in
> general do take opinions and suggestions into account. Consequently, I don't
> see much benefit in having a PMC member that does not have a set of
> responsabilities that is a superset of the of the ones of a committer.
>
>
Community members come and go; a sign of a healthy Apache project is adding
new committers and PMC members to ensure that the project continues to be
viable as this ebb and flow happens.


> At the same time, I don't see a reason for constraining PMC to be
> committers in the bylaws. I would much rather discuss each case
> individually, and evaluate the merit of the candidate accordingly.
>

We have clearly stated in the bylaws how one becomes a PMC member (voting),
so I agree with you that we don't need to update the bylaws. But it is a good
idea to outline how one becomes a PMC member and the criteria we (ZK) use to
judge, even if this is just a pointer to the links I sent earlier, similar
to what we have for committers:
https://cwiki.apache.org/confluence/display/ZOOKEEPER/CommitterCriteria
I think this is what Mahadev was shooting for: get everyone on the same page;
current PMC members, new members as they are elected, and the community at
large.

Patrick



>
> On Mar 8, 2011, at 12:12 AM, Benjamin Reed wrote:
>
> i would like to the pmc to have more of a project management view. i
> think it would be great to have pmc members come up through the
> committer ranks, but i also think there may be potential pmc members
> that are more project management oriented than code oriented.
>
> for me an ideal pmc member would:
>  - understand the project
>  - have a good understanding for where the project should and
> shouldn't go, and be able to express that understanding
>  - should vote on releases and be involved in release discussions
>  - should participate in the mailing lists
>  - have a good view of how zookeeper sits in the apache eco system
>  - know what work is going on and identify areas of needed work
>
> a committer will do many of these things, but you could be the ideal
> pmc member and not be heavily involved in the coding, so making the
> pmc members a subset of the committers seems overly restrictive.
> actually it may be nice to have some members who don't have their
> heads down in the code so that they can take a broader view.
>
> so i guess the one attribute i would take issue with from your list is
> the "patch reviews and contributions". a pmc member should be familiar
> with the work going on in the project, but "patch reviews and
> contributions" is squarely in the committers area of responsibility.
>
> ben
>
> On Mon, Mar 7, 2011 at 9:00 AM, Mahadev Konar  wrote:
>
> Hi all,
>
>  I have been thinking about what should be the criteria for PMC
>
> members for ZK. I do not have much experience with PMC member criteria
>
> for other projects except for Hadoop. In Hadoop we indirectly imply
>
> that a PMC member be a superset of a committer. Meaning more
>
> responsibilities than a committer, more responsibility towards project
>
> direction, more responsibilities towards projects day to day
>
> activities.
>
>
>
>  and here is what I had in mind for ZK (mostly explicitly stating what
>
> we have in Hadoop):
>
>
> A PMC member should be able to get involved in the day to day
>
> activities of the project
>
>   - by day to day activities I imply
>
>  -  release discussions
>
> -  code reviews/ could be any kind - documentation/ others (does
>
> not imply a deep understanding of the project), should be willing to
>
> contribute on any part of the project
>
> -  should be willing to work with new contributors and mentor
>
> them (mostly a superset of committer).
>
>  - works well with other PMC members
>
>
> By the above I imply that a PMC member has a greater set of
>
> responsibilities that a committer and should be able to review (any
>
> contribution) and contribute towards ZK releases.
>
>
> What do others think?
>
>
> thanks
>
> mahadev
>
>
>
> *flavio*
> *junqueira*
>
> research scientist
>
> f...@yahoo-inc.com
> direct +34 93-183-8828
>
> avinguda diagonal 177, 8th floor, barcelona, 08018, es
> phone (408) 349 3300fax (408) 349 3301
>
>
>


Re: PMC member criteria for ZooKeeper.

2011-03-08 Thread Patrick Hunt
On Mon, Mar 7, 2011 at 9:00 AM, Mahadev Konar  wrote:
>  I have been thinking about what should be the criteria for PMC
> members for ZK. I do not have much experience with PMC member criteria
> for other projects except for Hadoop. In Hadoop we indirectly imply
> that a PMC member be a superset of a committer. Meaning more
> responsibilities than a committer, more responsibility towards project
> direction, more responsibilities towards projects day to day
> activities.
>

Hey Mahadev, from an Apache perspective coding doesn't really come
into play; the PMC is more about governance/legal/community than coding:
http://www.apache.org/foundation/how-it-works.html#pmc
The key components are this:

"The role of the PMC from a Foundation perspective is oversight. The
main role of the PMC is not code and not coding - but to ensure that
all legal issues are addressed, that procedure is followed, and that
each and every release is the product of the community as a whole.
That is key to our litigation protection mechanisms.

Secondly the role of the PMC is to further the long term development
and health of the community as a whole, and to ensure that balanced
and wide scale peer review and collaboration does happen. Within the
ASF we worry about any community which centers around a few
individuals who are working virtually uncontested. We believe that
this is detrimental to quality, stability, and robustness of both code
and long term social structures."

Further there is no requirement that a PMC member even be a committer.
http://www.apache.org/foundation/how-it-works.html#pmc-members
"A PMC member is a developer or a committer that was elected due to
merit for the evolution of the project and demonstration of
commitment. They have write access to the code repository, an
apache.org mail address, the right to vote for the community-related
decisions and the right to propose an active user for committership.
The PMC as a whole is the entity that controls the project, nobody
else."

What you are describing about coding/review is more Committership and not PMC.

> By the above I imply that a PMC member has a greater set of
> responsibilities that a committer and should be able to review (any
> contribution) and contribute towards ZK releases.
>
> What do others think?
>

Wrt greater responsibilities, that's definitely true; however, PMC
responsibilities are around governance, while committer
responsibilities are coding/reviewing.

Patrick


Re: PMC member criteria for ZooKeeper.

2011-03-08 Thread Patrick Hunt
Ben, what you are detailing is similar to my response to Mahadev. One
note though, from an Apache perspective PMC members need not even be
familiar with the project, take Hadoop as an example where Ian was
largely unfamiliar with Hadoop prior to joining their PMC.
legal/procedure/community building, these are all things that can be
done by someone familiar with the apache way, but not necessarily
familiar with the individual project (not that I'm advocating we pull
in non-zk community into the pmc, but just to highlight).

Another example is the IPMC (incubator pmc), any Apache Member may be
an IPMC member just by asking, and they are charged with the oversight
of the individual podlings.

Patrick

On Mon, Mar 7, 2011 at 3:12 PM, Benjamin Reed  wrote:
> i would like to the pmc to have more of a project management view. i
> think it would be great to have pmc members come up through the
> committer ranks, but i also think there may be potential pmc members
> that are more project management oriented than code oriented.
>
> for me an ideal pmc member would:
>  - understand the project
>  - have a good understanding for where the project should and
> shouldn't go, and be able to express that understanding
>  - should vote on releases and be involved in release discussions
>  - should participate in the mailing lists
>  - have a good view of how zookeeper sits in the apache eco system
>  - know what work is going on and identify areas of needed work
>
> a committer will do many of these things, but you could be the ideal
> pmc member and not be heavily involved in the coding, so making the
> pmc members a subset of the committers seems overly restrictive.
> actually it may be nice to have some members who don't have their
> heads down in the code so that they can take a broader view.
>
> so i guess the one attribute i would take issue with from your list is
> the "patch reviews and contributions". a pmc member should be familiar
> with the work going on in the project, but "patch reviews and
> contributions" is squarely in the committers area of responsibility.
>
> ben
>
> On Mon, Mar 7, 2011 at 9:00 AM, Mahadev Konar  wrote:
>> Hi all,
>>  I have been thinking about what should be the criteria for PMC
>> members for ZK. I do not have much experience with PMC member criteria
>> for other projects except for Hadoop. In Hadoop we indirectly imply
>> that a PMC member be a superset of a committer. Meaning more
>> responsibilities than a committer, more responsibility towards project
>> direction, more responsibilities towards projects day to day
>> activities.
>>
>>
>>  and here is what I had in mind for ZK (mostly explicitly stating what
>> we have in Hadoop):
>>
>> A PMC member should be able to get involved in the day to day
>> activities of the project
>>   - by day to day activities I imply
>>      -  release discussions
>>     -  code reviews/ could be any kind - documentation/ others (does
>> not imply a deep understanding of the project), should be willing to
>> contribute on any part of the project
>>     -  should be willing to work with new contributors and mentor
>> them (mostly a superset of committer).
>>  - works well with other PMC members
>>
>> By the above I imply that a PMC member has a greater set of
>> responsibilities that a committer and should be able to review (any
>> contribution) and contribute towards ZK releases.
>>
>> What do others think?
>>
>> thanks
>> mahadev
>>
>


Re: PMC member criteria for ZooKeeper.

2011-03-08 Thread Benjamin Reed
yes, overall i agree with your response to mahadev. i think in the
case of zookeeper we want pmc members who are familiar with the
project since they are voting on releases and release planning. they
also vote on committers. this is a bit different than the IPMC which
has a large collection of unrelated projects. since zookeeper has a
much more focused scope familiarity with the project is important.

ben

On Tue, Mar 8, 2011 at 8:47 AM, Patrick Hunt  wrote:
> Ben, what you are detailing is similar to my response to Mahadev. One
> note though, from an Apache perspective PMC members need not even be
> familiar with the project, take Hadoop as an example where Ian was
> largely unfamiliar with Hadoop prior to joining their PMC.
> legal/procedure/community building, these are all things that can be
> done by someone familiar with the apache way, but not necessarily
> familiar with the individual project (not that I'm advocating we pull
> in non-zk community into the pmc, but just to highlight).
>
> Another example is the IPMC (incubator pmc), any Apache Member may be
> an IPMC member just by asking, and they are charged with the oversight
> of the individual podlings.
>
> Patrick
>
> On Mon, Mar 7, 2011 at 3:12 PM, Benjamin Reed  wrote:
>> i would like the pmc to have more of a project management view. i
>> think it would be great to have pmc members come up through the
>> committer ranks, but i also think there may be potential pmc members
>> that are more project management oriented than code oriented.
>>
>> for me an ideal pmc member would:
>>  - understand the project
>>  - have a good understanding for where the project should and
>> shouldn't go, and be able to express that understanding
>>  - should vote on releases and be involved in release discussions
>>  - should participate in the mailing lists
>>  - have a good view of how zookeeper sits in the apache eco system
>>  - know what work is going on and identify areas of needed work
>>
>> a committer will do many of these things, but you could be the ideal
>> pmc member and not be heavily involved in the coding, so making the
>> pmc members a subset of the committers seems overly restrictive.
>> actually it may be nice to have some members who don't have their
>> heads down in the code so that they can take a broader view.
>>
>> so i guess the one attribute i would take issue with from your list is
>> the "patch reviews and contributions". a pmc member should be familiar
>> with the work going on in the project, but "patch reviews and
>> contributions" is squarely in the committers area of responsibility.
>>
>> ben
>>
>> On Mon, Mar 7, 2011 at 9:00 AM, Mahadev Konar  wrote:
>>> Hi all,
>>>  I have been thinking about what should be the criteria for PMC
>>> members for ZK. I do not have much experience with PMC member criteria
>>> for other projects except for Hadoop. In Hadoop we indirectly imply
>>> that a PMC member be a superset of a committer. Meaning more
>>> responsibilities than a committer, more responsibility towards project
>>> direction, more responsibilities towards projects day to day
>>> activities.
>>>
>>>
>>>  and here is what I had in mind for ZK (mostly explicitly stating what
>>> we have in Hadoop):
>>>
>>> A PMC member should be able to get involved in the day to day
>>> activities of the project
>>>   - by day to day activities I imply
>>>      -  release discussions
>>>     -  code reviews/ could be any kind - documentation/ others (does
>>> not imply a deep understanding of the project), should be willing to
>>> contribute on any part of the project
>>>     -  should be willing to work with new contributors and mentor
>>> them (mostly a superset of committer).
>>>  - works well with other PMC members
>>>
>>> By the above I imply that a PMC member has a greater set of
>>> responsibilities that a committer and should be able to review (any
>>> contribution) and contribute towards ZK releases.
>>>
>>> What do others think?
>>>
>>> thanks
>>> mahadev
>>>
>>
>


Re: PMC member criteria for ZooKeeper.

2011-03-08 Thread Patrick Hunt
Yes, I agree, just giving some insight into Apache at large.

On Tue, Mar 8, 2011 at 11:10 AM, Benjamin Reed  wrote:
> yes, overall i agree with your response to mahadev. i think in the
> case of zookeeper we want pmc members who are familiar with the
> project since they are voting on releases and release planning. they
> also vote on committers. this is a bit different than the IPMC which
> has a large collection of unrelated projects. since zookeeper has a
> much more focused scope familiarity with the project is important.
>
> ben


[jira] Reopened: (ZOOKEEPER-880) QuorumCnxManager$SendWorker grows without bounds

2011-03-08 Thread Vishal K (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vishal K reopened ZOOKEEPER-880:



For some reason the patch didn't get committed to trunk. Reopening to submit 
patch to trunk.

> QuorumCnxManager$SendWorker grows without bounds
> 
>
> Key: ZOOKEEPER-880
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-880
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.3.3
>Reporter: Jean-Daniel Cryans
>Assignee: Vishal K
>Priority: Blocker
> Fix For: 3.3.3
>
> Attachments: TRACE-hbase-hadoop-zookeeper-sv4borg9.log.gz, 
> ZOOKEEPER-880-3.3.patch, ZOOKEEPER-880.patch, ZOOKEEPER-880.patch, 
> ZOOKEEPER-880.patch, hbase-hadoop-zookeeper-sv4borg12.log.gz, 
> hbase-hadoop-zookeeper-sv4borg9.log.gz, jstack
>
>
> We're seeing an issue where one server in the ensemble has a steady growing 
> number of QuorumCnxManager$SendWorker threads up to a point where the OS runs 
> out of native threads, and at the same time we see a lot of exceptions in the 
> logs.  This is on 3.2.2 and our config looks like:
> {noformat}
> tickTime=3000
> dataDir=/somewhere_thats_not_tmp
> clientPort=2181
> initLimit=10
> syncLimit=5
> server.0=sv4borg9:2888:3888
> server.1=sv4borg10:2888:3888
> server.2=sv4borg11:2888:3888
> server.3=sv4borg12:2888:3888
> server.4=sv4borg13:2888:3888
> {noformat}
> The issue is on the first server. I'm going to attach thread dumps and logs 
> in a moment.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
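
The runaway-SendWorker symptom reported above can be confirmed from thread
dumps. The sketch below is a hypothetical diagnostic, not part of ZooKeeper:
it counts QuorumCnxManager$SendWorker entries in a jstack dump, where a count
that keeps rising across successive dumps matches the leak described here.

```python
# Hypothetical diagnostic (assumed helper, not ZooKeeper code): count
# QuorumCnxManager$SendWorker threads in a jstack dump. In practice the
# dump text would come from running `jstack <zookeeper-server-pid>`.
def count_send_workers(dump_text):
    # jstack names each thread on a line that starts with a double quote
    return sum(1 for line in dump_text.splitlines()
               if line.startswith('"') and "SendWorker" in line)

# Synthetic dump fragment mirroring the thread names from the report:
dump = "\n".join([
    '"QuorumCnxManager$SendWorker-1" prio=10 tid=0x01 runnable',
    '"QuorumCnxManager$SendWorker-2" prio=10 tid=0x02 runnable',
    '"main" prio=10 tid=0x03 waiting on condition',
])
print(count_send_workers(dump))  # -> 2
```

A steadily increasing count across dumps, together with the exceptions in the
attached logs, is the pattern the reporter describes.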


[jira] Updated: (ZOOKEEPER-880) QuorumCnxManager$SendWorker grows without bounds

2011-03-08 Thread Vishal K (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vishal K updated ZOOKEEPER-880:
---

Affects Version/s: (was: 3.3.3)
   3.4.0
Fix Version/s: (was: 3.3.3)
   3.4.0
 Hadoop Flags:   (was: [Reviewed])

Changing version tags.

> QuorumCnxManager$SendWorker grows without bounds
> 
>
> Key: ZOOKEEPER-880
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-880
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.4.0
>Reporter: Jean-Daniel Cryans
>Assignee: Vishal K
>Priority: Blocker
> Fix For: 3.4.0
>
> Attachments: TRACE-hbase-hadoop-zookeeper-sv4borg9.log.gz, 
> ZOOKEEPER-880-3.3.patch, ZOOKEEPER-880.patch, ZOOKEEPER-880.patch, 
> ZOOKEEPER-880.patch, hbase-hadoop-zookeeper-sv4borg12.log.gz, 
> hbase-hadoop-zookeeper-sv4borg9.log.gz, jstack
>
>
> We're seeing an issue where one server in the ensemble has a steady growing 
> number of QuorumCnxManager$SendWorker threads up to a point where the OS runs 
> out of native threads, and at the same time we see a lot of exceptions in the 
> logs.  This is on 3.2.2 and our config looks like:
> {noformat}
> tickTime=3000
> dataDir=/somewhere_thats_not_tmp
> clientPort=2181
> initLimit=10
> syncLimit=5
> server.0=sv4borg9:2888:3888
> server.1=sv4borg10:2888:3888
> server.2=sv4borg11:2888:3888
> server.3=sv4borg12:2888:3888
> server.4=sv4borg13:2888:3888
> {noformat}
> The issue is on the first server. I'm going to attach thread dumps and logs 
> in a moment.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


Re: [DISCUSS] move hedwig/bookkeeper to subproject

2011-03-08 Thread Flavio Junqueira
I'm sorry for not replying before. I didn't feel that the message was for 
me, since it should be pretty obvious that I'm interested in those 
projects. Here are some thoughts, though:

- It would be really nice to have committers for bookkeeper/hedwig;
- It would be really nice to have independent releases for 
bookkeeper/hedwig;
- It sounds like bookkeeper and hedwig don't always go together, and hdfs 
is an instance in which it happens. But, hedwig builds on top of 
bookkeeper (and other components), so using hedwig implies using 
bookkeeper. Consequently, if we choose only one to be a main project, 
perhaps bookkeeper would be a better choice;
- I don't think we have anyone who could be a project lead for these 
projects right now, so it could be a problem to split up at this point. 
For this reason, a zookeeper subproject sounds like a better option 
compared to incubator, unless we are able to find a project lead.

-Flavio

On Mar 2, 2011, at 6:55 PM, Benjamin Reed wrote:

> i wanted to start a discussion about making hedwig and bookkeeper a 
> subproject. (actually pat started the discussion last month in general 
> about all of the contrib projects.) there are three questions, in my 
> mind, that we need to answer to move forward:
>
> 1) should it be a hedwig/bookkeeper subproject, or should there be two 
> separate projects? we need to build a developer community and i'm 
> wondering if we should try to build a single dev community or two. the 
> relationship is a bit asymmetrical: hedwig depends on bookkeeper, but 
> not vice-versa. i'm inclined to say we do a hedwig subproject and 
> include bookkeeper with it, but i don't feel strongly.
>
> 2) should we propose a subproject to zookeeper or to incubator? i'm a 
> bit more inclined to propose a zookeeper subproject simply because it 
> fits well with the zookeeper community, but it does introduce a bit 
> more overhead to the zookeeper PMC.
>
> 3) do we have the developer interest to make it happen in the first 
> place? i know we can get at least 3 initial committers from yahoo!, 
> but projects should be represented by multiple companies. (the goal is 
> at least 3.) so, is there interest in working on the project from 
> others?
>
> please comment. these are all open issues, so opinions are what i'm 
> looking for. if there isn't much discussion, i think that will 
> implicitly answer 3 :)
>
> thanx
> ben

flavio junqueira | research scientist | f...@yahoo-inc.com
direct +34 93-183-8828 | avinguda diagonal 177, 8th floor, barcelona, 
08018, es | phone (408) 349 3300 | fax (408) 349 3301

[jira] Created: (ZOOKEEPER-1010) Remove or move ManagedUtil to contrib, because it has direct log4j dependencies

2011-03-08 Thread Olaf Krische (JIRA)
Remove or move ManagedUtil to contrib, because it has direct log4j dependencies
---

 Key: ZOOKEEPER-1010
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1010
 Project: ZooKeeper
  Issue Type: Sub-task
  Components: java client
Affects Versions: 3.3.1
Reporter: Olaf Krische
 Fix For: 3.4.0


Please move ManagedUtil out of the way. It has direct dependencies on log4j api.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] Updated: (ZOOKEEPER-880) QuorumCnxManager$SendWorker grows without bounds

2011-03-08 Thread Vishal K (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vishal K updated ZOOKEEPER-880:
---

Attachment: ZOOKEEPER-trunk-880

Submitting patch for trunk.

> QuorumCnxManager$SendWorker grows without bounds
> 
>
> Key: ZOOKEEPER-880
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-880
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.4.0
>Reporter: Jean-Daniel Cryans
>Assignee: Vishal K
>Priority: Blocker
> Fix For: 3.4.0
>
> Attachments: TRACE-hbase-hadoop-zookeeper-sv4borg9.log.gz, 
> ZOOKEEPER-880-3.3.patch, ZOOKEEPER-880.patch, ZOOKEEPER-880.patch, 
> ZOOKEEPER-880.patch, ZOOKEEPER-trunk-880, 
> hbase-hadoop-zookeeper-sv4borg12.log.gz, 
> hbase-hadoop-zookeeper-sv4borg9.log.gz, jstack
>
>
> We're seeing an issue where one server in the ensemble has a steady growing 
> number of QuorumCnxManager$SendWorker threads up to a point where the OS runs 
> out of native threads, and at the same time we see a lot of exceptions in the 
> logs.  This is on 3.2.2 and our config looks like:
> {noformat}
> tickTime=3000
> dataDir=/somewhere_thats_not_tmp
> clientPort=2181
> initLimit=10
> syncLimit=5
> server.0=sv4borg9:2888:3888
> server.1=sv4borg10:2888:3888
> server.2=sv4borg11:2888:3888
> server.3=sv4borg12:2888:3888
> server.4=sv4borg13:2888:3888
> {noformat}
> The issue is on the first server. I'm going to attach thread dumps and logs 
> in a moment.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] Commented: (ZOOKEEPER-1006) QuorumPeer "Address already in use" -- regression in 3.3.3

2011-03-08 Thread Vishal K (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13004247#comment-13004247
 ] 

Vishal K commented on ZOOKEEPER-1006:
-

Patch submitted to ZOOKEEPER-880

> QuorumPeer "Address already in use" -- regression in 3.3.3
> --
>
> Key: ZOOKEEPER-1006
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1006
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: tests
>Affects Versions: 3.3.3
>Reporter: Patrick Hunt
>Assignee: Patrick Hunt
>Priority: Minor
> Fix For: 3.3.4, 3.4.0
>
> Attachments: TEST-org.apache.zookeeper.test.CnxManagerTest.txt, 
> ZOOKEEPER-1006.patch, ZOOKEEPER-1006.patch, workerthreads_badtest.txt
>
>
> CnxManagerTest.testWorkerThreads 
> See attachment, this is the first time I've seen this test fail, and it's 
> failed two of the last three test runs.
> Notice (attachment) once this happens the port never becomes available.
> {noformat}
> 2011-03-02 15:53:12,425 - INFO  
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11245:NIOServerCnxn$Factory@251] - 
> Accepted socket connection from /172.29.6.162:51441
> 2011-03-02 15:53:12,430 - WARN  
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11245:NIOServerCnxn@639] - Exception 
> causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not 
> running
> 2011-03-02 15:53:12,430 - INFO  
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11245:NIOServerCnxn@1435] - Closed 
> socket connection for client /172.29.6.162:51441 (no session established for 
> client)
> 2011-03-02 15:53:12,430 - WARN  
> [QuorumPeer:/0:0:0:0:0:0:0:0:11241:Follower@82] - Exception when following 
> the leader
> java.io.EOFException
>   at java.io.DataInputStream.readInt(DataInputStream.java:375)
>   at 
> org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
>   at 
> org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:84)
>   at 
> org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
>   at 
> org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:148)
>   at 
> org.apache.zookeeper.server.quorum.Learner.registerWithLeader(Learner.java:267)
>   at 
> org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:66)
>   at 
> org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:645)
> 2011-03-02 15:53:12,431 - INFO  
> [QuorumPeer:/0:0:0:0:0:0:0:0:11241:Follower@165] - shutdown called
> java.lang.Exception: shutdown Follower
>   at 
> org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:165)
>   at 
> org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:649)
> 2011-03-02 15:53:12,432 - INFO  
> [QuorumPeer:/0:0:0:0:0:0:0:0:11241:QuorumPeer@621] - LOOKING
> 2011-03-02 15:53:12,432 - INFO  
> [QuorumPeer:/0:0:0:0:0:0:0:0:11241:FastLeaderElection@663] - New election. My 
> id =  0, Proposed zxid = 0
> 2011-03-02 15:53:12,433 - INFO  [WorkerReceiver 
> Thread:FastLeaderElection@496] - Notification: 0 (n.leader), 0 (n.zxid), 2 
> (n.round), LOOKING (n.state), 0 (n.sid), LOOKING (my state)
> 2011-03-02 15:53:12,433 - INFO  [WorkerReceiver 
> Thread:FastLeaderElection@496] - Notification: 0 (n.leader), 0 (n.zxid), 2 
> (n.round), LOOKING (n.state), 0 (n.sid), LOOKING (my state)
> 2011-03-02 15:53:12,433 - INFO  [WorkerReceiver 
> Thread:FastLeaderElection@496] - Notification: 0 (n.leader), 0 (n.zxid), 2 
> (n.round), LOOKING (n.state), 0 (n.sid), LOOKING (my state)
> 2011-03-02 15:53:12,633 - INFO  [WorkerReceiver 
> Thread:FastLeaderElection@496] - Notification: 0 (n.leader), 0 (n.zxid), 2 
> (n.round), LOOKING (n.state), 0 (n.sid), LOOKING (my state)
> 2011-03-02 15:53:12,633 - INFO  
> [QuorumPeer:/0:0:0:0:0:0:0:0:11245:QuorumPeer@655] - LEADING
> 2011-03-02 15:53:12,636 - INFO  [QuorumPeer:/0:0:0:0:0:0:0:0:11245:Leader@54] 
> - TCP NoDelay set to: true
> 2011-03-02 15:53:12,638 - INFO  
> [QuorumPeer:/0:0:0:0:0:0:0:0:11245:ZooKeeperServer@151] - Created server with 
> tickTime 1000 minSessionTimeout 2000 maxSessionTimeout 2 datadir 
> /var/lib/hudson/workspace/CDH3-ZooKeeper-3.3.3_sles/build/test/tmp/test9001250572426375869.junit.dir/version-2
>  snapdir 
> /var/lib/hudson/workspace/CDH3-ZooKeeper-3.3.3_sles/build/test/tmp/test9001250572426375869.junit.dir/version-2
> 2011-03-02 15:53:12,639 - ERROR 
> [QuorumPeer:/0:0:0:0:0:0:0:0:11245:Leader@133] - Couldn't bind to port 11245
> java.net.BindException: Address already in use
>   at java.net.PlainSocketImpl.socketBind(Native Method)
>   at java.net.PlainSocketImpl.bind(PlainSocketImpl.java:365)
>   at java.net.ServerSocket.bind(ServerSocket.java:319)
>   at java.net.ServerSocket.<init>(ServerSocket.java:185)
>   at java.net.ServerSocket.<init>(ServerS

Python binding

2011-03-08 Thread Jesse Kempf
Hi,

I heard a rumor that there's a pure-Python binding for ZK. Is there any truth 
to this?

Thanks,
-Jesse

[jira] Commented: (ZOOKEEPER-975) new peer goes in LEADING state even if ensemble is online

2011-03-08 Thread Vishal K (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13004273#comment-13004273
 ] 

Vishal K commented on ZOOKEEPER-975:


Hi Flavio,

I have a patch for this, but I have it on top of the fix for ZOOKEEPER-932. 
We have 932 applied to our ZK code since we need it. Until ZOOKEEPER-932 is 
reviewed and committed, I will have to keep back porting patches (and do double 
testing). I will port my changes to trunk if someone requires a fix for the 
bug. Since this is not a blocker, I am going to hold off the patch until 932 is 
reviewed. That will reduce my testing and porting overhead. Does that sound ok?

The patch I have is good only for FLE.

{quote}
About maintenance, we have some time back talked about maintaining only the TCP 
version of FLE (FLE+QCM). There was never some real pressure to eliminate the 
others, and in fact previously some users were still using LE. I'm all for 
maintaining only FLE, but we need to hear the opinion of others. More thoughts?
{quote}

The documentation says: "The implementations of leader election 1 and 2 are 
currently not supported, and we have the intention of deprecating them in the 
near future. Implementations 0 and 3 are currently supported, and we plan to 
keep supporting them in the near future. To avoid having to support multiple 
versions of leader election unecessarily, we may eventually consider 
deprecating algorithm 0 as well, but we will plan according to the needs of the 
community."

Is there a significant advantage of using LE 0 vs LE 3?

> new peer goes in LEADING state even if ensemble is online
> -
>
> Key: ZOOKEEPER-975
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-975
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.3.2
>Reporter: Vishal K
> Fix For: 3.4.0
>
> Attachments: ZOOKEEPER-975.patch
>
>
> Scenario:
> 1. 2 of the 3 ZK nodes are online
> 2. Third node is attempting to join
> 3. Third node unnecessarily goes in "LEADING" state
> 4. Then third goes back to LOOKING (no majority of followers) and finally 
> goes to FOLLOWING state.
> While going through the logs I noticed that a peer C that is trying to
> join an already formed cluster goes in LEADING state. This is because
> QuorumCnxManager of A and B sends the entire history of notification
> messages to C. C receives the notification messages that were
> exchanged between A and B when they were forming the cluster.
> In FastLeaderElection.lookForLeader(), due to the following piece of
> code, C quits lookForLeader assuming that it is supposed to lead.
> // If have received from all nodes, then terminate
> if ((self.getVotingView().size() == recvset.size()) &&
>         (self.getQuorumVerifier().getWeight(proposedLeader) != 0)) {
>     self.setPeerState((proposedLeader == self.getId()) ?
>             ServerState.LEADING : learningState());
>     leaveInstance();
>     return new Vote(proposedLeader, proposedZxid);
> } else if (termPredicate(recvset,
> This can cause:
> 1.  C to unnecessarily go in LEADING state and wait for tickTime * initLimit 
> and then restart the FLE.
> 2. C waits for 200 ms (finalizeWait) and then considers whatever
> notifications it has received to make a decision. C could potentially
> decide to follow an old leader, fail to connect to the leader, and
> then restart FLE. See code below.
> if (termPredicate(recvset,
>         new Vote(proposedLeader, proposedZxid, logicalclock))) {
>
>     // Verify if there is any change in the proposed leader
>     while ((n = recvqueue.poll(finalizeWait,
>             TimeUnit.MILLISECONDS)) != null) {
>         if (totalOrderPredicate(n.leader, n.zxid,
>                 proposedLeader, proposedZxid)) {
>             recvqueue.put(n);
>             break;
>         }
>     }
> In general, this does not affect correctness of FLE since C will
> eventually go back to FOLLOWING state (A and B won't vote for
> C). However, this delays C from joining the cluster. This can in turn
> affect recovery time of an application.
> Proposal: A and B should send only the latest 
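
The early-termination condition quoted above can be illustrated with a toy
simulation (hypothetical names, not ZooKeeper's actual classes): a joining
peer that has collected one notification per voter terminates immediately,
even when those notifications are stale replays from an earlier round.

```python
# Toy model of the lookForLeader() termination check (illustrative only;
# all names here are invented for the sketch).
def should_terminate(recvset, voting_view, proposed_leader, weights):
    # Terminate once a vote from every member of the voting view is in
    # hand and the proposed leader has non-zero weight.
    return (len(recvset) == len(voting_view)
            and weights.get(proposed_leader, 0) != 0)

voting_view = {0, 1, 2}              # A=0, B=1, joining peer C=2
weights = {0: 1, 1: 1, 2: 1}
# C's recvset after A and B replay their old round-1 notifications:
recvset = {
    0: ("leader=0", "round=1"),      # stale replay from A
    1: ("leader=0", "round=1"),      # stale replay from B
    2: ("leader=2", "round=2"),      # C's own vote for itself
}
# C proposes itself, the check passes, and C briefly goes LEADING:
print(should_terminate(recvset, voting_view, 2, weights))  # -> True
```

This is why the proposal is to replay only the latest notification: with
fewer stale votes in recvset the condition would not fire prematurely.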

Re: Python binding

2011-03-08 Thread Mahadev Konar
Not that I know of :). There is the usual python using c (zkpython in
contrib) but I am not aware of anything other than that.

thanks
mahadev

On Tue, Mar 8, 2011 at 2:23 PM, Jesse Kempf  wrote:
> Hi,
>
> I heard a rumor that there's a pure-Python binding for ZK. Is there any truth 
> to this?
>
> Thanks,
> -Jesse


Review Request: FD options in ZooKeeper

2011-03-08 Thread Camille Fournier

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/483/
---

Review request for zookeeper.


Summary
---

https://issues.apache.org/jira/browse/ZOOKEEPER-702


Diffs
-

  trunk/src/docs/src/documentation/content/xdocs/index.xml 1065709 
  trunk/src/docs/src/documentation/content/xdocs/zookeeperFailureDetector.xml 
PRE-CREATION 
  trunk/src/java/main/org/apache/zookeeper/ClientCnxn.java 1065709 
  trunk/src/java/main/org/apache/zookeeper/ClientCnxnSocket.java 1065709 
  trunk/src/java/main/org/apache/zookeeper/ClientCnxnSocketNIO.java 1065709 
  trunk/src/java/main/org/apache/zookeeper/ZooKeeper.java 1065709 
  trunk/src/java/main/org/apache/zookeeper/ZooKeeperMain.java 1065709 
  
trunk/src/java/main/org/apache/zookeeper/common/fd/BertierFailureDetector.java 
PRE-CREATION 
  trunk/src/java/main/org/apache/zookeeper/common/fd/ChenFailureDetector.java 
PRE-CREATION 
  trunk/src/java/main/org/apache/zookeeper/common/fd/FailureDetector.java 
PRE-CREATION 
  
trunk/src/java/main/org/apache/zookeeper/common/fd/FailureDetectorFactory.java 
PRE-CREATION 
  
trunk/src/java/main/org/apache/zookeeper/common/fd/FailureDetectorOptParser.java
 PRE-CREATION 
  
trunk/src/java/main/org/apache/zookeeper/common/fd/FixedPingFailureDetector.java
 PRE-CREATION 
  
trunk/src/java/main/org/apache/zookeeper/common/fd/InterArrivalSamplingWindow.java
 PRE-CREATION 
  trunk/src/java/main/org/apache/zookeeper/common/fd/MessageType.java 
PRE-CREATION 
  
trunk/src/java/main/org/apache/zookeeper/common/fd/PhiAccrualFailureDetector.java
 PRE-CREATION 
  
trunk/src/java/main/org/apache/zookeeper/common/fd/SlicedPingFailureDetector.java
 PRE-CREATION 
  trunk/src/java/main/org/apache/zookeeper/server/ServerConfig.java 1065709 
  trunk/src/java/main/org/apache/zookeeper/server/SessionTracker.java 1065709 
  trunk/src/java/main/org/apache/zookeeper/server/SessionTrackerImpl.java 
1065709 
  trunk/src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java 1065709 
  trunk/src/java/main/org/apache/zookeeper/server/ZooKeeperServerMain.java 
1065709 
  
trunk/src/java/main/org/apache/zookeeper/server/quorum/FollowerZooKeeperServer.java
 1065709 
  trunk/src/java/main/org/apache/zookeeper/server/quorum/Leader.java 1065709 
  
trunk/src/java/main/org/apache/zookeeper/server/quorum/LeaderZooKeeperServer.java
 1065709 
  trunk/src/java/main/org/apache/zookeeper/server/quorum/Learner.java 1072085 
  trunk/src/java/main/org/apache/zookeeper/server/quorum/LearnerHandler.java 
1065709 
  
trunk/src/java/main/org/apache/zookeeper/server/quorum/LearnerSessionTracker.java
 1065709 
  
trunk/src/java/main/org/apache/zookeeper/server/quorum/LearnerZooKeeperServer.java
 1065709 
  
trunk/src/java/main/org/apache/zookeeper/server/quorum/ObserverZooKeeperServer.java
 1065709 
  trunk/src/java/main/org/apache/zookeeper/server/quorum/QuorumPeer.java 
1065709 
  trunk/src/java/main/org/apache/zookeeper/server/quorum/QuorumPeerConfig.java 
1065709 
  trunk/src/java/main/org/apache/zookeeper/server/quorum/QuorumPeerMain.java 
1065709 
  
trunk/src/java/main/org/apache/zookeeper/server/quorum/QuorumZooKeeperServer.java
 1065709 
  trunk/src/java/test/org/apache/zookeeper/TestableZooKeeper.java 1065709 
  trunk/src/java/test/org/apache/zookeeper/test/ClientBase.java 1065709 
  trunk/src/java/test/org/apache/zookeeper/test/DisconnectableZooKeeper.java 
1065709 
  trunk/src/java/test/org/apache/zookeeper/test/QuorumBase.java 1065709 
  trunk/src/java/test/org/apache/zookeeper/test/QuorumFDHammerTest.java 
PRE-CREATION 
  trunk/src/java/test/org/apache/zookeeper/test/RecoveryTest.java 1065709 
  trunk/src/java/test/org/apache/zookeeper/test/SessionTest.java 1065709 
  trunk/src/java/test/org/apache/zookeeper/test/fd/BertierClientHammerTest.java 
PRE-CREATION 
  trunk/src/java/test/org/apache/zookeeper/test/fd/BertierFDTest.java 
PRE-CREATION 
  trunk/src/java/test/org/apache/zookeeper/test/fd/BertierQuorumHammerTest.java 
PRE-CREATION 
  trunk/src/java/test/org/apache/zookeeper/test/fd/BertierRecoveryTest.java 
PRE-CREATION 
  trunk/src/java/test/org/apache/zookeeper/test/fd/BertierSessionTest.java 
PRE-CREATION 
  trunk/src/java/test/org/apache/zookeeper/test/fd/ChenClientHammerTest.java 
PRE-CREATION 
  trunk/src/java/test/org/apache/zookeeper/test/fd/ChenFDTest.java PRE-CREATION 
  trunk/src/java/test/org/apache/zookeeper/test/fd/ChenQuorumHammerTest.java 
PRE-CREATION 
  trunk/src/java/test/org/apache/zookeeper/test/fd/ChenRecoveryTest.java 
PRE-CREATION 
  trunk/src/java/test/org/apache/zookeeper/test/fd/ChenSessionTest.java 
PRE-CREATION 
  trunk/src/java/test/org/apache/zookeeper/test/fd/FixedPingFDTest.java 
PRE-CREATION 
  
trunk/src/java/test/org/apache/zookeeper/test/fd/PhiAccrualClientHammerTest.java
 PRE-CREATION 
  trunk/src/java/test/org/apache/zookeeper/test/fd/PhiAccrualFDTest.java 
PRE-CR

[jira] Commented: (ZOOKEEPER-702) GSoC 2010: Failure Detector Model

2011-03-08 Thread Camille Fournier (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13004295#comment-13004295
 ] 

Camille Fournier commented on ZOOKEEPER-702:


Created ReviewBoard for code review:
https://reviews.apache.org/r/483/

> GSoC 2010: Failure Detector Model
> -
>
> Key: ZOOKEEPER-702
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-702
> Project: ZooKeeper
>  Issue Type: Wish
>Reporter: Henry Robinson
>Assignee: Abmar Barros
>  Labels: gsoc, mentor
> Fix For: 3.4.0
>
> Attachments: ZOOKEEPER-702-code.patch, ZOOKEEPER-702-doc.patch, 
> ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, 
> ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, 
> ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, 
> ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, 
> ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, 
> ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, 
> ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, 
> ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, bertier-pseudo.txt, 
> bertier-pseudo.txt, chen-pseudo.txt, chen-pseudo.txt, phiaccrual-pseudo.txt, 
> phiaccrual-pseudo.txt
>
>
> Failure Detector Module
> Possible Mentor
> Henry Robinson (henry at apache dot org)
> Requirements
> Java, some distributed systems knowledge, comfort implementing distributed 
> systems protocols
> Description
> ZooKeeper servers detect the failure of other servers and clients by 
> counting the number of 'ticks' for which they don't get a heartbeat from 
> other machines. This is the 'timeout' method of failure detection and works 
> very well; however, it can be too aggressive and is not easily 
> tuned for some more unusual ZooKeeper installations (such as in a wide-area 
> network, or even in a mobile ad-hoc network).
> This project would abstract the notion of failure detection to a dedicated 
> Java module, and implement several failure detectors to compare and contrast 
> their appropriateness for ZooKeeper. For example, Apache Cassandra uses a 
> phi-accrual failure detector (http://ddsg.jaist.ac.jp/pub/HDY+04.pdf) which 
> is much more tunable and has some very interesting properties. This is a 
> great project if you are interested in distributed algorithms, or want to 
> help refactor some of ZooKeeper's internal code.
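For a flavor of the phi-accrual approach mentioned above, here is a hedged sketch. The class below is purely illustrative (it is not the patch's PhiAccrualFailureDetector) and assumes exponentially distributed heartbeat inter-arrival times; phi is -log10 of the probability that a heartbeat would be this late.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical phi-accrual sketch: suspicion grows continuously with the
 *  time since the last heartbeat, instead of flipping at a fixed timeout. */
class PhiAccrualSketch {
    private final Deque<Long> intervals = new ArrayDeque<>();
    private long lastHeartbeat = -1;
    private static final int WINDOW = 100;   // sliding window of samples

    void heartbeat(long nowMs) {
        if (lastHeartbeat >= 0) {
            intervals.addLast(nowMs - lastHeartbeat);
            if (intervals.size() > WINDOW) intervals.removeFirst();
        }
        lastHeartbeat = nowMs;
    }

    /** phi = -log10(P(interval > t)), with P modeled as exponential. */
    double phi(long nowMs) {
        if (intervals.isEmpty()) return 0.0;
        double mean = intervals.stream()
                .mapToLong(Long::longValue).average().orElse(1.0);
        double t = nowMs - lastHeartbeat;
        // For exponential inter-arrivals, P(interval > t) = exp(-t/mean)
        return -Math.log10(Math.exp(-t / mean));
    }
}
```

A monitor then compares phi against a tunable threshold, which is what makes the detector adjustable per deployment rather than hard-wired to ticks.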

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


Re: Review Request: FD options in ZooKeeper

2011-03-08 Thread Camille Fournier

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/483/#review316
---



trunk/src/java/test/org/apache/zookeeper/test/fd/BertierFDTest.java


A rather large nit to pick: I wish the asserts in your tests had a message 
associated with them. If/when one fails, it would be nice to have a bit of 
context to know what we were testing and why.
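As a small illustration of the suggestion (plain Java asserts rather than JUnit, to stay self-contained; the names and values are made up):

```java
/** Sketch: an assert that carries a message explains *what* failed,
 *  not just *where*. */
class AssertMessageSketch {
    public static void main(String[] args) {
        long timeoutMs = 400, gapMs = 500;
        // A bare `assert gapMs > timeoutMs;` would only report a line number.
        assert gapMs > timeoutMs
            : "expected ping gap of " + gapMs + "ms to exceed timeout of " + timeoutMs + "ms";
        System.out.println("ok");
    }
}
```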


- Camille


On 2011-03-08 15:52:08, Camille Fournier wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/483/
> ---
> 
> (Updated 2011-03-08 15:52:08)
> 
> 
> Review request for zookeeper.
> 
> 
> Summary
> ---
> 
> https://issues.apache.org/jira/browse/ZOOKEEPER-702
> 
> 
> Diffs
> -
> 
>   trunk/src/docs/src/documentation/content/xdocs/index.xml 1065709 
>   trunk/src/docs/src/documentation/content/xdocs/zookeeperFailureDetector.xml 
> PRE-CREATION 
>   trunk/src/java/main/org/apache/zookeeper/ClientCnxn.java 1065709 
>   trunk/src/java/main/org/apache/zookeeper/ClientCnxnSocket.java 1065709 
>   trunk/src/java/main/org/apache/zookeeper/ClientCnxnSocketNIO.java 1065709 
>   trunk/src/java/main/org/apache/zookeeper/ZooKeeper.java 1065709 
>   trunk/src/java/main/org/apache/zookeeper/ZooKeeperMain.java 1065709 
>   
> trunk/src/java/main/org/apache/zookeeper/common/fd/BertierFailureDetector.java
>  PRE-CREATION 
>   trunk/src/java/main/org/apache/zookeeper/common/fd/ChenFailureDetector.java 
> PRE-CREATION 
>   trunk/src/java/main/org/apache/zookeeper/common/fd/FailureDetector.java 
> PRE-CREATION 
>   
> trunk/src/java/main/org/apache/zookeeper/common/fd/FailureDetectorFactory.java
>  PRE-CREATION 
>   
> trunk/src/java/main/org/apache/zookeeper/common/fd/FailureDetectorOptParser.java
>  PRE-CREATION 
>   
> trunk/src/java/main/org/apache/zookeeper/common/fd/FixedPingFailureDetector.java
>  PRE-CREATION 
>   
> trunk/src/java/main/org/apache/zookeeper/common/fd/InterArrivalSamplingWindow.java
>  PRE-CREATION 
>   trunk/src/java/main/org/apache/zookeeper/common/fd/MessageType.java 
> PRE-CREATION 
>   
> trunk/src/java/main/org/apache/zookeeper/common/fd/PhiAccrualFailureDetector.java
>  PRE-CREATION 
>   
> trunk/src/java/main/org/apache/zookeeper/common/fd/SlicedPingFailureDetector.java
>  PRE-CREATION 
>   trunk/src/java/main/org/apache/zookeeper/server/ServerConfig.java 1065709 
>   trunk/src/java/main/org/apache/zookeeper/server/SessionTracker.java 1065709 
>   trunk/src/java/main/org/apache/zookeeper/server/SessionTrackerImpl.java 
> 1065709 
>   trunk/src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java 
> 1065709 
>   trunk/src/java/main/org/apache/zookeeper/server/ZooKeeperServerMain.java 
> 1065709 
>   
> trunk/src/java/main/org/apache/zookeeper/server/quorum/FollowerZooKeeperServer.java
>  1065709 
>   trunk/src/java/main/org/apache/zookeeper/server/quorum/Leader.java 1065709 
>   
> trunk/src/java/main/org/apache/zookeeper/server/quorum/LeaderZooKeeperServer.java
>  1065709 
>   trunk/src/java/main/org/apache/zookeeper/server/quorum/Learner.java 1072085 
>   trunk/src/java/main/org/apache/zookeeper/server/quorum/LearnerHandler.java 
> 1065709 
>   
> trunk/src/java/main/org/apache/zookeeper/server/quorum/LearnerSessionTracker.java
>  1065709 
>   
> trunk/src/java/main/org/apache/zookeeper/server/quorum/LearnerZooKeeperServer.java
>  1065709 
>   
> trunk/src/java/main/org/apache/zookeeper/server/quorum/ObserverZooKeeperServer.java
>  1065709 
>   trunk/src/java/main/org/apache/zookeeper/server/quorum/QuorumPeer.java 
> 1065709 
>   
> trunk/src/java/main/org/apache/zookeeper/server/quorum/QuorumPeerConfig.java 
> 1065709 
>   trunk/src/java/main/org/apache/zookeeper/server/quorum/QuorumPeerMain.java 
> 1065709 
>   
> trunk/src/java/main/org/apache/zookeeper/server/quorum/QuorumZooKeeperServer.java
>  1065709 
>   trunk/src/java/test/org/apache/zookeeper/TestableZooKeeper.java 1065709 
>   trunk/src/java/test/org/apache/zookeeper/test/ClientBase.java 1065709 
>   trunk/src/java/test/org/apache/zookeeper/test/DisconnectableZooKeeper.java 
> 1065709 
>   trunk/src/java/test/org/apache/zookeeper/test/QuorumBase.java 1065709 
>   trunk/src/java/test/org/apache/zookeeper/test/QuorumFDHammerTest.java 
> PRE-CREATION 
>   trunk/src/java/test/org/apache/zookeeper/test/RecoveryTest.java 1065709 
>   trunk/src/java/test/org/apache/zookeeper/test/SessionTest.java 1065709 
>   
> trunk/src/java/test/org/apache/zookeeper/test/fd/BertierClientHammerTest.java 
> PRE-CREATION 
>   trunk/src/java/test/org/apache/zookeeper/test/fd/BertierFDTest.java 
> PRE-CREATION 
>   
> trunk/src/java/test/org/apache/zookeeper/test/fd/BertierQuorumHammerTest.java 
> PRE-CREATION 
>   trunk/src/java/test/

[jira] Commented: (ZOOKEEPER-965) Need a multi-update command to allow multiple znodes to be updated safely

2011-03-08 Thread Jared Cantwell (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13004296#comment-13004296
 ] 

Jared Cantwell commented on ZOOKEEPER-965:
--

Hey Ted,

Has there been any progress on this issue? We would really love to see this 
happen, so just pinging to see what's left.

> Need a multi-update command to allow multiple znodes to be updated safely
> -
>
> Key: ZOOKEEPER-965
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-965
> Project: ZooKeeper
>  Issue Type: Bug
>Reporter: Ted Dunning
>Assignee: Ted Dunning
> Fix For: 3.4.0
>
>
> The basic idea is to have a single method called "multi" that will accept a 
> list of create, delete, update or check objects each of which has a desired 
> version or file state in the case of create.  If all of the version and 
> existence constraints can be satisfied, then all updates will be done 
> atomically.
> Two API styles have been suggested.  One has a list as above and the other 
> style has a "Transaction" that allows builder-like methods to build a set of 
> updates and a commit method to finalize the transaction.  This can trivially 
> be reduced to the first kind of API so the list based API style should be 
> considered the primitive and the builder style should be implemented as 
> syntactic sugar.
> The total size of all the data in all updates and creates in a single 
> transaction should be limited to 1MB.
> Implementation-wise this capability can be done using standard ZK internals.  
> The changes include:
> - update to ZK clients to allow the new call
> - additional wire level request
> - on the server, in the code that converts transactions to idempotent form, 
> the code should be slightly extended to convert a list of operations to 
> idempotent form.
> - on the client, a down-rev server that rejects the multi-update should be 
> detected gracefully and an informative exception should be thrown.
> To facilitate shared development, I have established a github repository at 
> https://github.com/tdunning/zookeeper  and am happy to extend committer 
> status to anyone who agrees to donate their code back to Apache.  The final 
> patch will be attached to this bug as normal.
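The two API styles described above can be sketched briefly. This is a hypothetical illustration, not the final ZooKeeper API: the builder-style Transaction is just sugar that collects operations and reduces to the primitive list-based form.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch of a list-of-ops multi-update API. */
class MultiSketch {
    /** One constraint-carrying operation in a multi-update. */
    static class Op {
        final String type, path;
        final int version;   // expected version, or -1 where not applicable
        Op(String type, String path, int version) {
            this.type = type; this.path = path; this.version = version;
        }
    }

    /** Builder-style sugar: collects ops, then commit() yields the single
     *  list that would be handed to the primitive multi(ops) call. */
    static class Transaction {
        private final List<Op> ops = new ArrayList<>();
        Transaction create(String path)              { ops.add(new Op("create", path, -1)); return this; }
        Transaction check(String path, int version)  { ops.add(new Op("check", path, version)); return this; }
        Transaction delete(String path, int version) { ops.add(new Op("delete", path, version)); return this; }
        List<Op> commit() { return ops; }
    }
}
```

Because the builder reduces trivially to the list, the server only ever needs to validate and apply one atomic list of version/existence constraints.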



Re: [jira] Commented: (ZOOKEEPER-975) new peer goes in LEADING state even if ensemble is online

2011-03-08 Thread Michi Mutsuzaki


"Vishal K (JIRA)"  wrote:


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13004273#comment-13004273
 ]

Vishal K commented on ZOOKEEPER-975:


Hi Flavio,

I have a patch for this, but it is on top of the fix for ZOOKEEPER-932. 
We have 932 applied to our ZK code since we need it. Until ZOOKEEPER-932 is 
reviewed and committed, I will have to keep back-porting patches (and do double 
testing). I will port my changes to trunk if someone requires a fix for the 
bug. Since this is not a blocker, I am going to hold off on the patch until 932 
is reviewed. That will reduce my testing and porting overhead. Does that sound ok?

The patch I have is good only for FLE.

{quote}
About maintenance, we have some time back talked about maintaining only the TCP 
version of FLE (FLE+QCM). There was never some real pressure to eliminate the 
others, and in fact previously some users were still using LE. I'm all for 
maintaining only FLE, but we need to hear the opinion of others. More thoughts?
{quote}

The documentation says: "The implementations of leader election 1 and 2 are 
currently not supported, and we have the intention of deprecating them in the 
near future. Implementations 0 and 3 are currently supported, and we plan to 
keep supporting them in the near future. To avoid having to support multiple 
versions of leader election unnecessarily, we may eventually consider 
deprecating algorithm 0 as well, but we will plan according to the needs of the 
community."

Is there a significant advantage of using LE 0 vs LE 3?

> new peer goes in LEADING state even if ensemble is online
> -
>
> Key: ZOOKEEPER-975
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-975
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.3.2
>Reporter: Vishal K
> Fix For: 3.4.0
>
> Attachments: ZOOKEEPER-975.patch
>
>
> Scenario:
> 1. 2 of the 3 ZK nodes are online
> 2. Third node is attempting to join
> 3. Third node unnecessarily goes in "LEADING" state
> 4. Then third goes back to LOOKING (no majority of followers) and finally 
> goes to FOLLOWING state.
> While going through the logs I noticed that a peer C that is trying to
> join an already formed cluster goes in LEADING state. This is because
> QuorumCnxManager of A and B sends the entire history of notification
> messages to C. C receives the notification messages that were
> exchanged between A and B when they were forming the cluster.
> In FastLeaderElection.lookForLeader(), due to the following piece of
> code, C quits lookForLeader assuming that it is supposed to lead.
> // If have received from all nodes, then terminate
> if ((self.getVotingView().size() == recvset.size()) &&
>         (self.getQuorumVerifier().getWeight(proposedLeader) != 0)) {
>     self.setPeerState((proposedLeader == self.getId()) ?
>             ServerState.LEADING : learningState());
>     leaveInstance();
>     return new Vote(proposedLeader, proposedZxid);
> } else if (termPredicate(recvset,
> This can cause:
> 1.  C to unnecessarily go in LEADING state and wait for tickTime * initLimit 
> and then restart the FLE.
> 2. C waits for 200 ms (finalizeWait) and then considers whatever
> notifications it has received to make a decision. C could potentially
> decide to follow an old leader, fail to connect to the leader, and
> then restart FLE. See code below.
> if (termPredicate(recvset,
>         new Vote(proposedLeader, proposedZxid, logicalclock))) {
>
>     // Verify if there is any change in the proposed leader
>     while ((n = recvqueue.poll(finalizeWait,
>             TimeUnit.MILLISECONDS)) != null) {
>         if (totalOrderPredicate(n.leader, n.zxid,
>                 proposedLeader, proposedZxid)) {
>             recvqueue.put(n);
>             break;
>         }
>     }
> In general, this does not affect correctness of FLE since C will
> eventually go back to FOLLOWING state (A and B won't vote for
> C). However, this delays C from joining the cluster. This can in turn
> affect recovery time of an application.
> Proposal: A and B s

[jira] Commented: (ZOOKEEPER-965) Need a multi-update command to allow multiple znodes to be updated safely

2011-03-08 Thread Ted Dunning (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13004311#comment-13004311
 ] 

Ted Dunning commented on ZOOKEEPER-965:
---

There has only been limited progress.

I have the client and server command parsing, wire formats and some message 
passing defined.

I have talked to Ben and Mahadev about where the final commit needs to be 
placed, but have not done any code at all.

My own extra-curricular schedule is completely under water from day job 
activities.  I am happy to help explain what has happened so far, but I am 
unable to drive further right now.

I can update my git branch on request to make it easier to see what is going on 
relative to trunk.

> Need a multi-update command to allow multiple znodes to be updated safely
> -
>
> Key: ZOOKEEPER-965
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-965
> Project: ZooKeeper
>  Issue Type: Bug
>Reporter: Ted Dunning
>Assignee: Ted Dunning
> Fix For: 3.4.0
>
>



Re: Apache Sonar service now available from asfinfra

2011-03-08 Thread Mahadev Konar
Don't we need to be mavenized before we get on board with this?

Looks like they only work with mavenized projects, no?

thanks
mahadev

On Fri, Mar 4, 2011 at 8:52 AM, Patrick Hunt  wrote:
> This is cool:
> http://wiki.apache.org/general/SonarInstance
>
> Would be great to see ZooKeeper up there. Anyone interested to take
> the lead on making this happen?
>
> Patrick
>


[jira] Commented: (ZOOKEEPER-837) cyclic dependency ClientCnxn, ZooKeeper

2011-03-08 Thread Mahadev konar (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13004341#comment-13004341
 ] 

Mahadev konar commented on ZOOKEEPER-837:
-

the test failure feels like the usual C test that's been failing for a long time.

> cyclic dependency ClientCnxn, ZooKeeper
> ---
>
> Key: ZOOKEEPER-837
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-837
> Project: ZooKeeper
>  Issue Type: Sub-task
>Affects Versions: 3.3.1
>Reporter: Patrick Datko
>Assignee: Thomas Koch
> Fix For: 3.4.0
>
> Attachments: ZOOKEEPER-837.patch, ZOOKEEPER-837.patch, 
> ZOOKEEPER-837.patch, ZOOKEEPER-837.patch
>
>
> ZooKeeper instantiates ClientCnxn in its ctor with this, and therefore builds a 
> cyclic dependency graph between both objects. This means you can't have the 
> one without the other. So why bother to make them separate classes 
> in the first place?
> ClientCnxn accesses ZooKeeper.state. State should rather be a property of 
> ClientCnxn. And ClientCnxn accesses zooKeeper.get???Watches() in its method 
> primeConnection(). I've not yet checked how this dependency could best be 
> resolved.
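One possible direction for the decoupling the report hints at can be sketched with hypothetical names (this is not ZooKeeper's actual API): let the connection object own its state, so the facade depends on the connection but never the reverse.

```java
/** Illustrative sketch of breaking a constructor-time cycle: state lives in
 *  the connection, and the facade holds a one-way reference to it. */
class CnxnSketch {
    enum State { CONNECTING, CONNECTED, CLOSED }

    static class ClientCnxn {
        private State state = State.CONNECTING;  // owned here, not by the facade
        State getState() { return state; }
        void connect() { state = State.CONNECTED; }
        void close()   { state = State.CLOSED; }
    }

    /** Facade delegates to ClientCnxn; ClientCnxn never references the facade. */
    static class ZooKeeperFacade {
        private final ClientCnxn cnxn;
        ZooKeeperFacade(ClientCnxn cnxn) { this.cnxn = cnxn; }
        State getState() { return cnxn.getState(); }
    }
}
```

With the dependency running one way, each class can be constructed and tested without the other, which is exactly what the cycle currently prevents.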



Re: Something about performance of Zookeeper

2011-03-08 Thread Qian Ye
Hi Flavio, asynchronous calls don't perform better; here are some results
we've got:

§ 1 client server, 1 process per client server, connect 1 zookeeper server, all
reads: cpu:14%~15%, qps:3833, latency:0.000261
§ 1 client server, 1 process per client server, connect all 3 zookeeper
servers, all reads: cpu:14%~15%, qps:3832, latency:0.000261
§ 1 client server, 10 processes per client server, connect all 3 zookeeper
servers, all reads: cpu:13%~20%, qps:14000->12000, latency:0.000469
§ 1 client server, 30 processes per client server, connect all 3 zookeeper
servers, all reads: cpu:15%~20%, qps:14000->1,,latency:
§ 2 client servers, 30 processes per client server, connect all 3 zookeeper
servers, all reads: cpu:15%~20%, qps:about 11000, latency:

It seems that the asynchronous calls perform even worse than the synchronous
calls.


On Wed, Mar 9, 2011 at 12:29 AM, Flavio Junqueira  wrote:

> Hi Qian, If I understand your description correctly, you are using
> synchronous calls. To get high throughput values, you need multiple
> outstanding requests, so you will need to use asynchronous calls.
>
> -Flavio
>
> On Mar 8, 2011, at 5:16 PM, Qian Ye wrote:
>
> P.S. 1 we use zookeeper 3.3.2
> P.S. 2 all our testing process get data from the same znode. The size of
> data on the znode is less than 1K.
>
> On Wed, Mar 9, 2011 at 12:08 AM, Qian Ye  wrote:
>
> Hi all:
>
> These days my friend and I did some performance tests on zookeeper. We
> found the performance of zookeeper is not as good as it is described in the
> Zookeeper Overview
> (http://hadoop.apache.org/zookeeper/docs/r3.3.2/zookeeperOver.html). In
> the Zookeeper Overview, the "ZooKeeper Throughput as the Read-Write Ratio
> Varies" figure shows that in an ensemble of 3 Zookeeper servers, the
> throughput can reach about 8, if the requests are all reads. However, we
> cannot get results like that in our performance test with the synchronous
> interface, zkpython.
>
> Here are some of our test results:
> (3 zookeeper ensemble, 8 core CPU, 2.4GHZ, 16 RAM, Linux 2.6.9)
>
> § 1 client server, 1 process per client server, connect 1 zookeeper
> server, all reads: cpu:8%~9%, qps:2208, latency:0.000453s
> § 1 client server, 1 process per client server, connect all 3 zookeeper
> servers, all reads: cpu:8%~9%, qps:2376.241573, latency:0.000421s
> § 1 client server, 1 process per client server, connect all 3 zookeeper
> servers, all reads: cpu:10%~20%, qps:15600, latency:0.000764s
> § 1 client server, 30 processes per client server, connect all 3 zookeeper
> servers, all reads: cpu:10%~20%, qps:15200, latency:
> § 2 client servers, 30 processes per client server, connect all 3
> zookeeper servers, all reads: cpu:10%~20%, qps:15800, latency:0.003487
>
> qps means "query per second", that is, throughput. The results show that
> when adding more client servers, the utilization rate of CPU doesn't
> increase, and the throughput doesn't increase much. It seems that the
> throughput won't reach 8, even if we add 28 more client servers to reach
> the number you mentioned in the Zookeeper Overview.
>
> Maybe I've done the tests wrong. Is there any particular thing I should pay
> attention to in this case? We set the max java heap size to 12GB in our
> test.
>
> Could you tell me the details about how you do the performance test, from
> which you get the results showed in the Zookeeper Overview?
>
>
> --
>
> With Regards!
>
>
> Ye, Qian
>
>
>
>
>
> --
> With Regards!
>
> Ye, Qian
>
>
> flavio junqueira
>
> research scientist
>
> f...@yahoo-inc.com
> direct +34 93-183-8828
>
> avinguda diagonal 177, 8th floor, barcelona, 08018, es
> phone (408) 349 3300fax (408) 349 3301
>
>
>


-- 
With Regards!

Ye, Qian


[jira] Commented: (ZOOKEEPER-837) cyclic dependency ClientCnxn, ZooKeeper

2011-03-08 Thread Mahadev konar (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13004344#comment-13004344
 ] 

Mahadev konar commented on ZOOKEEPER-837:
-

Thomas,
 The watchmanager class does not always synchronize on 
default/existWatches/childWatches. Any reason for the inconsistent 
synchronization?
 
 I am specifically talking about the following method:

{quote}
public Set materialize(Watcher.Event.KeeperState state,
                       Watcher.Event.EventType type,
                       String clientPath)
{quote}
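The hazard being raised is the usual one with shared maps: every reader and writer must lock the same monitor, or a reader can observe a map mid-update. An illustrative sketch (hypothetical names; this is not ZooKeeper's actual WatchManager):

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Sketch of consistent synchronization over a shared watch table. */
class WatchTableSketch {
    private final Map<String, Set<String>> dataWatches = new HashMap<>();

    void addWatch(String path, String watcher) {
        synchronized (dataWatches) {   // writers lock the map
            dataWatches.computeIfAbsent(path, p -> new HashSet<>()).add(watcher);
        }
    }

    /** Readers must take the *same* lock; skipping it here would be the
     *  inconsistent-synchronization bug the comment describes. */
    Set<String> materialize(String path) {
        synchronized (dataWatches) {
            Set<String> w = dataWatches.remove(path);   // fire-once semantics
            return w == null ? new HashSet<>() : w;
        }
    }
}
```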

> cyclic dependency ClientCnxn, ZooKeeper
> ---
>
> Key: ZOOKEEPER-837
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-837
> Project: ZooKeeper
>  Issue Type: Sub-task
>Affects Versions: 3.3.1
>Reporter: Patrick Datko
>Assignee: Thomas Koch
> Fix For: 3.4.0
>
> Attachments: ZOOKEEPER-837.patch, ZOOKEEPER-837.patch, 
> ZOOKEEPER-837.patch, ZOOKEEPER-837.patch
>
>



Re: Something about performance of Zookeeper

2011-03-08 Thread Eugene Koontz

Dear Qian Ye,
Thank you for doing these tests. I am interested to hear more about 
this! Can you document how you did your testing with zkpython?

-Eugene

On 3/8/11 6:44 PM, Qian Ye wrote:



> http://hadoop.apache.org/zookeeper/docs/r3.3.2/zookeeperOver.html). In
> the Zookeeper Overview, the "ZooKeeper Throughput as the Read-Write Ratio
> Varies" figure shows that in an ensemble of 3 Zookeeper servers, the
> throughput can reach about 8, if the requests are all reads. However, we
> cannot get results like that in our performance test with the synchronous
> interface, zkpython.