[jira] [Commented] (ZOOKEEPER-1355) Add zk.updateServerList(newServerList)

2012-10-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475543#comment-13475543
 ] 

Hadoop QA commented on ZOOKEEPER-1355:
--

+1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12549007/ZOOKEEPER-1355-13-Oct.patch
  against trunk revision 1391526.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 31 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1222//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1222//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1222//console

This message is automatically generated.

> Add zk.updateServerList(newServerList) 
> ---
>
> Key: ZOOKEEPER-1355
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1355
> Project: ZooKeeper
> Issue Type: New Feature
> Components: c client, java client
> Reporter: Alexander Shraer
> Assignee: Alexander Shraer
> Fix For: 3.5.0
>
> Attachments: loadbalancing-more-details.pdf, loadbalancing.pdf, 
> ZOOKEEPER-1355-10-Oct.patch, ZOOKEEPER-1355-12-Oct.patch, 
> ZOOKEEPER-1355-13-Oct.patch, ZOOKEEPER-1355-ver10-1.patch, 
> ZOOKEEPER-1355-ver10-2.patch, ZOOKEEPER-1355-ver10-3.patch, 
> ZOOKEEPER-1355-ver10-4.patch, ZOOKEEPER-1355-ver10-4.patch, 
> ZOOKEEPER-1355-ver10.patch, ZOOKEEPER-1355-ver11-1.patch, 
> ZOOKEEPER-1355-ver11.patch, ZOOKEEPER-1355-ver12-1.patch, 
> ZOOKEEPER-1355-ver12-2.patch, ZOOKEEPER-1355-ver12-4.patch, 
> ZOOKEEPER-1355-ver12.patch, ZOOKEEPER-1355-ver13.patch, 
> ZOOKEEPER-1355-ver14.patch, ZOOKEEPER-1355-ver2.patch, 
> ZOOKEEPER=1355-ver3.patch, ZOOKEEPER-1355-ver4.patch, 
> ZOOKEEPER-1355-ver5.patch, ZOOKEEPER-1355-ver6.patch, 
> ZOOKEEPER-1355-ver7.patch, ZOOKEEPER-1355-ver8.patch, 
> ZOOKEEPER-1355-ver9-1.patch, ZOOKEEPER-1355-ver9.patch, 
> ZOOOKEEPER-1355.patch, ZOOOKEEPER-1355-test.patch, ZOOOKEEPER-1355-ver1.patch
>
>
> When the set of servers changes, we would like to update the server list 
> stored by clients without restarting the clients.
> Moreover, assuming that the number of clients per server is the same (in 
> expectation) in the old configuration (as guaranteed by the current list 
> shuffling for example), we would like to re-balance client connections across 
> the new set of servers in a way that a) the number of clients per server is 
> the same for all servers (in expectation) and b) there is no 
> excessive/unnecessary client migration.
> It is simple to achieve (a) without (b) - just re-shuffle the new list of 
> servers at every client. But this would create unnecessary migration, which 
> we'd like to avoid.
> We propose a simple probabilistic migration scheme that achieves (a) and (b) 
> - each client locally decides whether and where to migrate when the list of 
> servers changes. The attached document describes the scheme and shows an 
> evaluation of it in Zookeeper. We also implemented re-balancing through a 
> consistent-hashing scheme and show a comparison. We derived the probabilistic 
> migration rules from a simple formula that we can also provide, if someone's 
> interested in the proof.
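
As an illustration of the kind of local, probabilistic decision described in the issue text, here is a minimal sketch for the simplest case only: servers are added and none are removed. The rule used (stay with probability old/new, otherwise move to a uniformly chosen added server) is an assumption made for this example, not the rule set from the attached document or the patch; the names pick_server and simulate are hypothetical.

```python
import random
from collections import Counter


def pick_server(current, old_servers, new_servers, rng):
    """Decide locally where one client should connect after servers were ADDED.

    Assumption for this sketch: old_servers is a subset of new_servers.
    The client stays put with probability len(old)/len(new); otherwise it
    moves to one of the added servers, chosen uniformly at random.
    """
    added = [s for s in new_servers if s not in old_servers]
    if not added or rng.random() < len(old_servers) / len(new_servers):
        return current  # no migration
    return rng.choice(added)  # migrate to a newly added server


def simulate(num_clients, old_servers, new_servers, seed=0):
    """Return (clients per server after the change, number of clients that moved)."""
    rng = random.Random(seed)
    # Start balanced in expectation, as shuffling the server list would give.
    before = [rng.choice(old_servers) for _ in range(num_clients)]
    after = [pick_server(s, old_servers, new_servers, rng) for s in before]
    moved = sum(1 for b, a in zip(before, after) if b != a)
    return Counter(after), moved


if __name__ == "__main__":
    old = ["zk1", "zk2", "zk3"]
    new = old + ["zk4", "zk5"]
    load, moved = simulate(10000, old, new)
    print(dict(load), "moved:", moved)
```

With C clients, each old server keeps (C/old) * (old/new) = C/new clients in expectation, and the C * (new - old)/new movers are spread evenly over the (new - old) added servers, so every server ends up with C/new. Only the fraction (new - old)/new of clients migrates, whereas re-shuffling at every client would move far more of them.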

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Success: ZOOKEEPER-1355 PreCommit Build #1222

2012-10-12 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/ZOOKEEPER-1355
Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1222/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 263962 lines...]
 [exec] BUILD SUCCESSFUL
 [exec] Total time: 0 seconds
 [exec] 
 [exec] 
 [exec] 
 [exec] 
 [exec] +1 overall.  Here are the results of testing the latest attachment 
 [exec]   
http://issues.apache.org/jira/secure/attachment/12549007/ZOOKEEPER-1355-13-Oct.patch
 [exec]   against trunk revision 1391526.
 [exec] 
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec] 
 [exec] +1 tests included.  The patch appears to include 31 new or 
modified tests.
 [exec] 
 [exec] +1 javadoc.  The javadoc tool did not generate any warning 
messages.
 [exec] 
 [exec] +1 javac.  The applied patch does not increase the total number 
of javac compiler warnings.
 [exec] 
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
(version 1.3.9) warnings.
 [exec] 
 [exec] +1 release audit.  The applied patch does not increase the 
total number of release audit warnings.
 [exec] 
 [exec] +1 core tests.  The patch passed core unit tests.
 [exec] 
 [exec] +1 contrib tests.  The patch passed contrib unit tests.
 [exec] 
 [exec] Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1222//testReport/
 [exec] Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1222//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
 [exec] Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1222//console
 [exec] 
 [exec] This message is automatically generated.
 [exec] 
 [exec] Adding comment to Jira.
 [exec] Comment added.
 [exec] hlqeau01pI logged out
 [exec] Finished build.

BUILD SUCCESSFUL
Total time: 29 minutes 4 seconds
Archiving artifacts
Recording test results
Description set: ZOOKEEPER-1355
Email was triggered for: Success
Sending email for trigger: Success



###
## FAILED TESTS (if any) 
##
All tests passed

[jira] [Updated] (ZOOKEEPER-1355) Add zk.updateServerList(newServerList)

2012-10-12 Thread Marshall McMullen (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marshall McMullen updated ZOOKEEPER-1355:
-

Attachment: ZOOKEEPER-1355-13-Oct.patch

I had meant to remove the Zab Test part from this patch, since Alex tells me it 
was already committed to trunk under another Jira.



[jira] [Commented] (ZOOKEEPER-1355) Add zk.updateServerList(newServerList)

2012-10-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475510#comment-13475510
 ] 

Hadoop QA commented on ZOOKEEPER-1355:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12548996/ZOOKEEPER-1355-12-Oct.patch
  against trunk revision 1391526.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 34 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

-1 javac.  The patch appears to cause tar ant target to fail.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1221//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1221//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1221//console

This message is automatically generated.



Failed: ZOOKEEPER-1355 PreCommit Build #1221

2012-10-12 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/ZOOKEEPER-1355
Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1221/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 607 lines...]
 [exec] 
 [exec] 
 [exec] 
 [exec] -1 overall.  Here are the results of testing the latest attachment 
 [exec]   
http://issues.apache.org/jira/secure/attachment/12548996/ZOOKEEPER-1355-12-Oct.patch
 [exec]   against trunk revision 1391526.
 [exec] 
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec] 
 [exec] +1 tests included.  The patch appears to include 34 new or 
modified tests.
 [exec] 
 [exec] +1 javadoc.  The javadoc tool did not generate any warning 
messages.
 [exec] 
 [exec] -1 javac.  The patch appears to cause tar ant target to fail.
 [exec] 
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
(version 1.3.9) warnings.
 [exec] 
 [exec] +1 release audit.  The applied patch does not increase the 
total number of release audit warnings.
 [exec] 
 [exec] -1 core tests.  The patch failed core unit tests.
 [exec] 
 [exec] +1 contrib tests.  The patch passed contrib unit tests.
 [exec] 
 [exec] Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1221//testReport/
 [exec] Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1221//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
 [exec] Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1221//console
 [exec] 
 [exec] This message is automatically generated.
 [exec] 
 [exec] Adding comment to Jira.
 [exec] Comment added.
 [exec] Wzr34544w1 logged out
 [exec] Finished build.

BUILD FAILED
/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-Build/trunk/build.xml:1568:
 exec returned: 2

Total time: 2 minutes 1 second
Build step 'Execute shell' marked build as failure
Archiving artifacts
Recording test results
Description set: ZOOKEEPER-1355
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
No tests ran.

[jira] [Updated] (ZOOKEEPER-1355) Add zk.updateServerList(newServerList)

2012-10-12 Thread Marshall McMullen (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marshall McMullen updated ZOOKEEPER-1355:
-

Attachment: ZOOKEEPER-1355-12-Oct.patch

This is an updated version of the patch that applies cleanly to the latest tip 
of trunk.



[jira] [Created] (ZOOKEEPER-1562) Memory leaks in zoo_multi API

2012-10-12 Thread Deepak Jagtap (JIRA)
Deepak Jagtap created ZOOKEEPER-1562:


 Summary: Memory leaks in zoo_multi API
 Key: ZOOKEEPER-1562
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1562
 Project: ZooKeeper
  Issue Type: Bug
  Components: c client
Affects Versions: 3.4.3, 3.4.4
 Environment: Zookeeper client and server both are running on CentOS 6.3
Reporter: Deepak Jagtap
Priority: Trivial


Valgrind reports a memory leak for zoo_multi operations.

==4056== 2,240 (160 direct, 2,080 indirect) bytes in 1 blocks are definitely lost in loss record 18 of 24
==4056==    at 0x4A04A28: calloc (vg_replace_malloc.c:467)
==4056==    by 0x504D822: create_completion_entry (zookeeper.c:2322)
==4056==    by 0x5052833: zoo_amulti (zookeeper.c:3141)
==4056==    by 0x5052A8B: zoo_multi (zookeeper.c:3240)

It looks like the completion entries for the individual operations in a 
multiupdate transaction are not being freed. My observation is that the size of 
the leak depends on the number of operations in a single multiupdate transaction.



Re: Review Request: Multi-thread CommitProcessor

2012-10-12 Thread Jay Shrauner

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/6260/
---

(Updated Oct. 12, 2012, 11:47 p.m.)


Review request for zookeeper and Patrick Hunt.


Changes
---

Address feedback from review--shutdown CommitProcessor if downstream processor 
throws an exception (preserves previous behavior)


Description
---

See https://issues.apache.org/jira/browse/ZOOKEEPER-1505


This addresses bug ZOOKEEPER-1505.
https://issues.apache.org/jira/browse/ZOOKEEPER-1505


Diffs (updated)
-

  /src/java/main/org/apache/zookeeper/server/FinalRequestProcessor.java 1391526 
  /src/java/main/org/apache/zookeeper/server/ServerCnxnFactory.java 1391526 
  /src/java/main/org/apache/zookeeper/server/WorkerService.java PRE-CREATION 
  /src/java/main/org/apache/zookeeper/server/quorum/CommitProcessor.java 
1391526 
  /src/java/main/org/apache/zookeeper/server/quorum/Leader.java 1391526 
  /src/java/test/org/apache/zookeeper/server/quorum/CommitProcessorTest.java 
PRE-CREATION 

Diff: https://reviews.apache.org/r/6260/diff/


Testing
---


Thanks,

Jay Shrauner



Re: Review Request: Multi-thread NIOServerCnxn

2012-10-12 Thread Jay Shrauner

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/6256/
---

(Updated Oct. 12, 2012, 11:45 p.m.)


Review request for zookeeper and Patrick Hunt.


Changes
---

Rebase


Description
---

See https://issues.apache.org/jira/browse/ZOOKEEPER-1504


This addresses bug ZOOKEEPER-1504.
https://issues.apache.org/jira/browse/ZOOKEEPER-1504


Diffs (updated)
-

  /src/java/main/org/apache/zookeeper/server/ExpiryQueue.java PRE-CREATION 
  /src/java/main/org/apache/zookeeper/server/NIOServerCnxn.java 1391526 
  /src/java/main/org/apache/zookeeper/server/NIOServerCnxnFactory.java 1391526 
  /src/java/main/org/apache/zookeeper/server/RateLogger.java PRE-CREATION 
  /src/java/main/org/apache/zookeeper/server/ServerCnxn.java 1391526 
  /src/java/main/org/apache/zookeeper/server/ServerCnxnFactory.java 1391526 
  /src/java/main/org/apache/zookeeper/server/SessionTrackerImpl.java 1391526 
  /src/java/main/org/apache/zookeeper/server/WorkerService.java PRE-CREATION 
  /src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java 1391526 
  /src/java/test/org/apache/zookeeper/test/ServerCnxnTest.java PRE-CREATION 

Diff: https://reviews.apache.org/r/6256/diff/


Testing
---


Thanks,

Jay Shrauner



[jira] [Commented] (ZOOKEEPER-1505) Multi-thread CommitProcessor

2012-10-12 Thread Jay Shrauner (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475444#comment-13475444
 ] 

Jay Shrauner commented on ZOOKEEPER-1505:
-

The Findbugs warning ("naked notify") is bogus; this is a helper routine that 
wakes up the main thread, and the state change happens in the routines that call it.

From the Findbugs description: "This bug does not necessarily indicate an error, 
since the change to mutable object state may have taken place in a method 
which then called the method containing the notification."

> Multi-thread CommitProcessor
> 
>
> Key: ZOOKEEPER-1505
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1505
> Project: ZooKeeper
> Issue Type: Improvement
> Components: server
> Affects Versions: 3.4.3, 3.4.4, 3.5.0
> Reporter: Jay Shrauner
> Assignee: Jay Shrauner
> Labels: performance, scaling
> Fix For: 3.5.0
>
> Attachments: ZOOKEEPER-1505.patch, ZOOKEEPER-1505.patch, 
> ZOOKEEPER-1505.patch
>
>
> CommitProcessor has a single thread that both pulls requests off its queues 
> and runs all downstream processors. This is noticeably inefficient for 
> read-intensive workloads, which could be run concurrently. The trick is 
> handling write transactions. I propose multi-threading this code according to 
> the following two constraints
>   - each session must see its requests responded to in order
>   - all committed transactions must be handled in zxid order, across all 
> sessions
> I believe these cover the only constraints we need to honor. In particular, I 
> believe we can relax the following:
>   - it does not matter if the read request in one session happens before or 
> after the write request in another session
> With these constraints, I propose the following threads
>   - 1    primary queue servicing/work dispatching thread
>   - 0-N  assignable worker threads, where a given session is always assigned 
> to the same worker thread
> By assigning sessions always to the same worker thread (using a simple 
> sessionId mod number of worker threads), we guarantee the first constraint-- 
> requests we push onto the thread queue are processed in order. The way we 
> guarantee the second constraint is we only allow a single commit transaction 
> to be in flight at a time--the queue servicing thread blocks while a commit 
> transaction is in flight, and when the transaction completes it clears the 
> flag.
> On a 32 core machine running Linux 2.6.38, achieved best performance with 32 
> worker threads for a 56% +/- 5% improvement in throughput (this improvement 
> was measured on top of that for ZOOKEEPER-1504, not in isolation).
> New classes introduced in this patch are:
> WorkerService (also in ZOOKEEPER-1504): ExecutorService wrapper that 
> makes worker threads daemon threads and names them in an easily debuggable 
> manner. Supports assignable threads (as used here) and non-assignable threads 
> (as used by NIOServerCnxnFactory).
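
To make the first constraint concrete, here is a minimal sketch of session-affine dispatch, assuming the "sessionId mod number of worker threads" rule quoted above. It models the per-worker queues as plain lists and does not model the single-commit-in-flight gate; worker_for and dispatch are hypothetical names, not code from the patch.

```python
from collections import defaultdict


def worker_for(session_id, num_workers):
    """Map a session to a fixed worker index (sessionId mod worker count)."""
    return session_id % num_workers


def dispatch(requests, num_workers):
    """Distribute (session_id, request) pairs, given in arrival order.

    Each worker queue is appended to in arrival order, and a session always
    lands on the same queue, so per-session ordering is preserved.
    """
    queues = defaultdict(list)
    for session_id, request in requests:
        queues[worker_for(session_id, num_workers)].append((session_id, request))
    return dict(queues)


if __name__ == "__main__":
    arrivals = [(5, "read a"), (6, "read b"), (9, "read c"), (5, "read d")]
    for worker, queue in sorted(dispatch(arrivals, 4).items()):
        print(worker, queue)
```

Because a worker drains its queue sequentially, two requests from the same session can never overtake each other, while requests from sessions that map to different workers may run concurrently.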



Success: ZOOKEEPER-1504 PreCommit Build #1220

2012-10-12 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/ZOOKEEPER-1504
Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1220/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 266158 lines...]
 [exec] BUILD SUCCESSFUL
 [exec] Total time: 0 seconds
 [exec] 
 [exec] 
 [exec] 
 [exec] 
 [exec] +1 overall.  Here are the results of testing the latest attachment 
 [exec]   
http://issues.apache.org/jira/secure/attachment/12548950/ZOOKEEPER-1504.patch
 [exec]   against trunk revision 1391526.
 [exec] 
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec] 
 [exec] +1 tests included.  The patch appears to include 3 new or 
modified tests.
 [exec] 
 [exec] +1 javadoc.  The javadoc tool did not generate any warning 
messages.
 [exec] 
 [exec] +1 javac.  The applied patch does not increase the total number 
of javac compiler warnings.
 [exec] 
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
(version 1.3.9) warnings.
 [exec] 
 [exec] +1 release audit.  The applied patch does not increase the 
total number of release audit warnings.
 [exec] 
 [exec] +1 core tests.  The patch passed core unit tests.
 [exec] 
 [exec] +1 contrib tests.  The patch passed contrib unit tests.
 [exec] 
 [exec] Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1220//testReport/
 [exec] Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1220//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
 [exec] Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1220//console
 [exec] 
 [exec] This message is automatically generated.
 [exec] 
 [exec] Adding comment to Jira.
 [exec] Comment added.
 [exec] 61jfuJgRdC logged out
 [exec] Finished build.

BUILD SUCCESSFUL
Total time: 27 minutes 38 seconds
Archiving artifacts
Recording test results
Description set: ZOOKEEPER-1504
Email was triggered for: Success
Sending email for trigger: Success



###
## FAILED TESTS (if any) 
##
All tests passed

[jira] [Commented] (ZOOKEEPER-1504) Multi-thread NIOServerCnxn

2012-10-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475337#comment-13475337
 ] 

Hadoop QA commented on ZOOKEEPER-1504:
--

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12548950/ZOOKEEPER-1504.patch
  against trunk revision 1391526.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1220//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1220//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1220//console

This message is automatically generated.

> Multi-thread NIOServerCnxn
> --
>
> Key: ZOOKEEPER-1504
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1504
> Project: ZooKeeper
> Issue Type: Improvement
> Components: server
> Affects Versions: 3.4.3, 3.4.4, 3.5.0
> Reporter: Jay Shrauner
> Assignee: Jay Shrauner
> Labels: performance, scaling
> Fix For: 3.5.0
>
> Attachments: ZOOKEEPER-1504.patch, ZOOKEEPER-1504.patch, 
> ZOOKEEPER-1504.patch, ZOOKEEPER-1504.patch
>
>
> NIOServerCnxnFactory is single threaded, which doesn't scale well to large 
> numbers of clients. This is particularly noticeable when thousands of clients 
> connect. I propose multi-threading this code as follows:
> - 1   acceptor thread, for accepting new connections
> - 1-N selector threads
> - 0-M I/O worker threads
> Numbers of threads are configurable, with defaults scaling according to 
> number of cores. Communication with the selector threads is handled via 
> LinkedBlockingQueues, and connections are permanently assigned to a 
> particular selector thread so that all potentially blocking SelectionKey 
> operations can be performed solely by the selector thread. An ExecutorService 
> is used for the worker threads.
> On a 32 core machine running Linux 2.6.38, achieved best performance with 4 
> selector threads and 64 worker threads for a 70% +/- 5% improvement in 
> throughput.
> This patch incorporates and supersedes the patches for
> https://issues.apache.org/jira/browse/ZOOKEEPER-517
> https://issues.apache.org/jira/browse/ZOOKEEPER-1444
> New classes introduced in this patch are:
>   - ExpiryQueue (from ZOOKEEPER-1444): factor out the logic from 
> SessionTrackerImpl used to expire sessions so that the same logic can be used 
> to expire connections
>   - RateLogger (from ZOOKEEPER-517): rate limit error message logging, 
> currently only used to throttle rate of logging "out of file descriptors" 
> errors
>   - WorkerService (also in ZOOKEEPER-1505): ExecutorService wrapper that 
> makes worker threads daemon threads and names them in an easily debuggable 
> manner. Supports assignable threads (as used by CommitProcessor) and 
> non-assignable threads (as used here).
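
The idea of rate-limiting a repeated error message, as described for RateLogger above, can be sketched as follows. This is an illustrative stand-in under assumed behaviour (emit at most once per interval and report how many messages were dropped in between), not the RateLogger class from the patch; the class name and its parameters are hypothetical.

```python
import time


class RateLimitedLogger:
    """Emit a message at most once per min_interval_s; count the rest."""

    def __init__(self, min_interval_s, emit=print, clock=time.monotonic):
        self.min_interval_s = min_interval_s
        self.emit = emit      # where accepted messages go
        self.clock = clock    # injectable so the behaviour can be tested
        self.last_emit = None
        self.suppressed = 0

    def log(self, message):
        """Return True if the message was emitted, False if it was suppressed."""
        now = self.clock()
        if self.last_emit is not None and now - self.last_emit < self.min_interval_s:
            self.suppressed += 1
            return False
        suffix = ""
        if self.suppressed:
            suffix = f" ({self.suppressed} similar messages suppressed)"
        self.emit(message + suffix)
        self.last_emit = now
        self.suppressed = 0
        return True


if __name__ == "__main__":
    logger = RateLimitedLogger(min_interval_s=1.0)
    for _ in range(3):
        logger.log("out of file descriptors")  # only the first call prints
```

A throttle like this keeps a tight accept loop that repeatedly hits "out of file descriptors" from flooding the log, while still recording how often the error occurred.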



[jira] [Commented] (ZOOKEEPER-1505) Multi-thread CommitProcessor

2012-10-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475336#comment-13475336
 ] 

Hadoop QA commented on ZOOKEEPER-1505:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12548952/ZOOKEEPER-1505.patch
  against trunk revision 1391526.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 1 new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1219//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1219//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1219//console

This message is automatically generated.

> Multi-thread CommitProcessor
> 
>
> Key: ZOOKEEPER-1505
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1505
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: server
>Affects Versions: 3.4.3, 3.4.4, 3.5.0
>Reporter: Jay Shrauner
>Assignee: Jay Shrauner
>  Labels: performance, scaling
> Fix For: 3.5.0
>
> Attachments: ZOOKEEPER-1505.patch, ZOOKEEPER-1505.patch, 
> ZOOKEEPER-1505.patch
>
>
> CommitProcessor has a single thread that both pulls requests off its queues 
> and runs all downstream processors. This is noticeably inefficient for 
> read-intensive workloads, which could be run concurrently. The trick is 
> handling write transactions. I propose multi-threading this code according to 
> the following two constraints
>   - each session must see its requests responded to in order
>   - all committed transactions must be handled in zxid order, across all 
> sessions
> I believe these cover the only constraints we need to honor. In particular, I 
> believe we can relax the following:
>   - it does not matter if the read request in one session happens before or 
> after the write request in another session
> With these constraints, I propose the following threads
>   - 1    primary queue servicing/work dispatching thread
>   - 0-N  assignable worker threads, where a given session is always assigned 
> to the same worker thread
> By assigning sessions always to the same worker thread (using a simple 
> sessionId mod number of worker threads), we guarantee the first constraint-- 
> requests we push onto the thread queue are processed in order. The way we 
> guarantee the second constraint is we only allow a single commit transaction 
> to be in flight at a time--the queue servicing thread blocks while a commit 
> transaction is in flight, and when the transaction completes it clears the 
> flag.
> On a 32 core machine running Linux 2.6.38, achieved best performance with 32 
> worker threads for a 56% +/- 5% improvement in throughput (this improvement 
> was measured on top of that for ZOOKEEPER-1504, not in isolation).
> New classes introduced in this patch are:
> WorkerService (also in ZOOKEEPER-1504): ExecutorService wrapper that 
> makes worker threads daemon threads and names them in an easily debuggable 
> manner. Supports assignable threads (as used here) and non-assignable threads 
> (as used by NIOServerCnxnFactory).
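
The first constraint above (per-session ordering through "sessionId mod number of worker threads") can be sketched in Java roughly as follows. This is an illustrative sketch under stated assumptions, not the code from the patch: the class `SessionAffinitySketch` and its methods are hypothetical, each worker is modelled as a single-thread executor, and `Math.floorMod` is used so that a negative session id still maps to a valid index. The second constraint (only one commit in flight at a time) is not modelled here.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical illustration of assignable worker threads keyed by session id. */
final class SessionAffinitySketch {
    private final ExecutorService[] workers;

    SessionAffinitySketch(int numWorkers) {
        workers = new ExecutorService[numWorkers];
        for (int i = 0; i < numWorkers; i++) {
            // One thread per worker, so tasks queued on a worker run in FIFO order.
            workers[i] = Executors.newSingleThreadExecutor();
        }
    }

    /** sessionId mod numWorkers; floorMod keeps the result non-negative. */
    static int workerIndex(long sessionId, int numWorkers) {
        return (int) Math.floorMod(sessionId, (long) numWorkers);
    }

    /** Requests of one session always land on the same worker, preserving their order. */
    void submit(long sessionId, Runnable request) {
        workers[workerIndex(sessionId, workers.length)].execute(request);
    }

    /** Stops accepting work and waits for queued requests; returns false on timeout. */
    boolean shutdownAndAwait(long timeoutMillis) {
        for (ExecutorService w : workers) {
            w.shutdown();
        }
        try {
            for (ExecutorService w : workers) {
                if (!w.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS)) {
                    return false;
                }
            }
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Because a session's requests never spread across workers, reads from different sessions can run concurrently while each session still sees its own responses in submission order.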



Failed: ZOOKEEPER-1505 PreCommit Build #1219

2012-10-12 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/ZOOKEEPER-1505
Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1219/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 262152 lines...]
 [exec] 
 [exec] 
 [exec] 
 [exec] -1 overall.  Here are the results of testing the latest attachment 
 [exec]   
http://issues.apache.org/jira/secure/attachment/12548952/ZOOKEEPER-1505.patch
 [exec]   against trunk revision 1391526.
 [exec] 
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec] 
 [exec] +1 tests included.  The patch appears to include 3 new or 
modified tests.
 [exec] 
 [exec] +1 javadoc.  The javadoc tool did not generate any warning 
messages.
 [exec] 
 [exec] +1 javac.  The applied patch does not increase the total number 
of javac compiler warnings.
 [exec] 
 [exec] -1 findbugs.  The patch appears to introduce 1 new Findbugs 
(version 1.3.9) warnings.
 [exec] 
 [exec] +1 release audit.  The applied patch does not increase the 
total number of release audit warnings.
 [exec] 
 [exec] +1 core tests.  The patch passed core unit tests.
 [exec] 
 [exec] +1 contrib tests.  The patch passed contrib unit tests.
 [exec] 
 [exec] Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1219//testReport/
 [exec] Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1219//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
 [exec] Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1219//console
 [exec] 
 [exec] This message is automatically generated.
 [exec] Adding comment to Jira.
 [exec] Comment added.
 [exec] vrs878qBhU logged out
 [exec] Finished build.

BUILD FAILED
/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-Build/trunk/build.xml:1568:
 exec returned: 1

Total time: 27 minutes 44 seconds
Build step 'Execute shell' marked build as failure
Archiving artifacts
Recording test results
Description set: ZOOKEEPER-1505
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
All tests passed

Re: [VOTE] Release ZooKeeper 3.4.5 (candidate 0)

2012-10-12 Thread Mahadev Konar
Thanks Ted. Will review the changes over the weekend.

Thanks again
mahadev

On Fri, Oct 12, 2012 at 1:12 PM, Ted Yu  wrote:
> Patch v7 for ZOOKEEPER-1560 passes test suite.
>
> Please take a look.
>
> On Thu, Oct 11, 2012 at 2:45 PM, Mahadev Konar wrote:
>
>> Thanks Alex for bringing it up. I'll hold the release for now. I see a
>> patch on 1560. I'll take a look and we'll see how to roll this into
>> 3.4.5.
>>
>> thanks
>> mahadev
>>
>> On Thu, Oct 11, 2012 at 2:42 PM, Alexander Shraer 
>> wrote:
>> > Hi Mahadev,
>> >
>> > ZOOKEEPER-1560 and ZOOKEEPER-1561 indicate a potentially serious issue,
>> > introduced recently in ZOOKEEPER-1437. Please consider this w.r.t. the
>> > 3.4.5 release.
>> >
>> > Best Regards,
>> > Alex
>> >
>> > On Wed, Oct 10, 2012 at 10:38 PM, Mahadev Konar 
>> wrote:
>> >> I think we have waited enough. Closing the vote now.
>> >>
>> >> With 5 +1's (3 binding) the vote passes. I will do the needful for
>> >> getting the release out.
>> >>
>> >> Thanks for voting folks.
>> >>
>> >> mahadev
>> >>
>> >> On Wed, Oct 10, 2012 at 9:04 AM, Flavio Junqueira 
>> wrote:
>> >>> +1
>> >>>
>> >>> -Flavio
>> >>>
>> >>> On Oct 8, 2012, at 7:05 AM, Mahadev Konar wrote:
>> >>>
>>  Given Eugene's findings on ZOOKEEPER-1557, I think we can continue
>>  rolling the current RC out. Others please vote on the thread if you
>>  see any issues with that. Folks who have already voted, please re vote
>>  in case you have a change of opinion.
>> 
>>  As for myself, I ran a couple of tests with the RC using open jdk 7
>>  and things seem to work.
>> 
>>  +1 from my side. Pat/Ben/Flavio/others what do you guys think?
>> 
>>  thanks
>>  mahadev
>> 
>>  On Sun, Oct 7, 2012 at 8:34 AM, Ted Yu  wrote:
>> > Currently ZooKeeper_branch34_openjdk7 and ZooKeeper_branch34_jdk7
>> are using
>> > lock ZooKeeper-solaris.
>> > I think ZooKeeper_branch34_openjdk7 and ZooKeeper_branch34_jdk7
>> should use
>> > a separate lock since they wouldn't run on a Solaris machine.
>> > I didn't seem to find how a new lock name can be added.
>> >
>> > Recent builds for ZooKeeper_branch34_openjdk7 and
>> ZooKeeper_branch34_jdk7
>> > have been green.
>> >
>> > Cheers
>> >
>> > On Sun, Oct 7, 2012 at 6:56 AM, Patrick Hunt 
>> wrote:
>> >
>> >> I've seen that before, it's a flakey test that's unrelated to the
>> sasl
>> >> stuff.
>> >>
>> >> Patrick
>> >>
>> >> On Sat, Oct 6, 2012 at 2:25 PM, Ted Yu  wrote:
>> >>> I saw one test failure:
>> >>>
>> >>>
>> >>
>> https://builds.apache.org/view/S-Z/view/ZooKeeper/job/ZooKeeper_branch34_openjdk7/9/testReport/org.apache.zookeeper.server.quorum/QuorumPeerMainTest/testHighestZxidJoinLate/
>> >>>
>> >>> FYI
>> >>>
>> >>> On Sat, Oct 6, 2012 at 7:16 AM, Ted Yu 
>> wrote:
>> >>>
>>  Up in ZOOKEEPER-1557, Eugene separated one test out and test
>> failure
>> >> seems
>>  to be gone.
>> 
>>  For ZooKeeper_branch34_jdk7, the two failed builds:
>>  #10 corresponded to ZooKeeper_branch34_openjdk7 build #7,
>>  #8 corresponded to ZooKeeper_branch34_openjdk7 build #5
>>  where tests failed due to BindException
>> 
>>  Cheers
>> 
>> 
>>  On Sat, Oct 6, 2012 at 7:06 AM, Patrick Hunt 
>> wrote:
>> 
>> > Yes. Those ubuntu machines have two slots each. If both tests
>> run at
>> > the same time... bam.
>> >
>> > I just added exclusion locks to the configuration of these two
>> jobs,
>> > that should help.
>> >
>> > Patrick
>> >
>> > On Fri, Oct 5, 2012 at 8:58 PM, Ted Yu 
>> wrote:
>> >> I think that was due to the following running on the same
>> machine at
>> >> the
>> >> same time:
>> >>
>> >> Building remotely on ubuntu4
>> >>  in workspace
>> >>
>> /home/jenkins/jenkins-slave/workspace/ZooKeeper_branch34_openjdk7
>> >>
>> >> We should introduce randomized port so that test suite can
>> execute in
>> >> parallel.
>> >>
>> >> Cheers
>> >>
>> >> On Fri, Oct 5, 2012 at 8:55 PM, Ted Yu 
>> wrote:
>> >>
>> >>> Some tests failed in build 8 due to (See
>> >>>
>> >>>
>> >
>> >>
>> https://builds.apache.org//view/S-Z/view/ZooKeeper/job/ZooKeeper_branch34_jdk7/8/testReport/org.apache.zookeeper.server/ZxidRolloverTest/testRolloverThenRestart/
>> > ):
>> >>>
>> >>> java.lang.RuntimeException: java.net.BindException: Address
>> already
>> >> in
>> > use
>> >>>  at
>> > org.apache.zookeeper.test.QuorumUtil.<init>(QuorumUtil.java:118)
>> >>>  at
>> >
>> >>
>> org.apache.zookeeper.server.ZxidRolloverTest.setUp(ZxidRolloverT

[jira] [Assigned] (ZOOKEEPER-1147) Add support for local sessions

2012-10-12 Thread Jay Shrauner (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Shrauner reassigned ZOOKEEPER-1147:
---

Assignee: Thawan Kooburat  (was: Jay Shrauner)

> Add support for local sessions
> --
>
> Key: ZOOKEEPER-1147
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1147
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: server
>Affects Versions: 3.3.3
>Reporter: Vishal Kathuria
>Assignee: Thawan Kooburat
>  Labels: api-change, scaling
> Fix For: 3.5.0
>
>   Original Estimate: 840h
>  Remaining Estimate: 840h
>
> This improvement is in the bucket of making ZooKeeper work at a large scale. 
> We are planning on having about 1 million clients connect to a ZooKeeper 
> ensemble through a set of 50-100 observers. The majority of these clients are 
> read-only, i.e. they do not do any updates or create ephemeral nodes.
> In ZooKeeper today, the client creates a session and the session creation is 
> handled like any other update. In the above use case, the session create/drop 
> workload can easily overwhelm an ensemble. The following is a proposal for a 
> "local session", to support a larger number of connections.
> 1.   The idea is to introduce a new type of session - "local" session. A 
> "local" session doesn't have the full functionality of a normal session.
> 2.   Local sessions cannot create ephemeral nodes.
> 3.   Once a local session is lost, you cannot re-establish it using the 
> session-id/password. The session and its watches are gone for good.
> 4.   When a local session connects, the session info is only maintained 
> on the zookeeper server (in this case, an observer) that it is connected to. 
> The leader is not aware of the creation of such a session and there is no 
> state written to disk.
> 5.   The pings and expiration are handled by the server that the session 
> is connected to.
> With the above changes, we can make ZooKeeper scale to a much larger number 
> of clients without making the core ensemble a bottleneck.
> In terms of API, there are two options that are being considered
> 1. Let the client specify at connect time which kind of session they 
> want.
> 2. All sessions connect as local sessions and automatically get promoted to 
> global sessions when they do an operation that requires a global session 
> (e.g. creating an ephemeral node)
> Chubby took the approach of lazily promoting all sessions to global, but I 
> don't think that would work in our case, where we want to keep sessions which 
> never create ephemeral nodes as always local. Option 2 would make it more 
> broadly usable but option 1 would be easier to implement.
> We are thinking of implementing option 1 as the first cut. There would be a 
> client flag, IsLocalSession (much like the current readOnly flag) that would 
> be used to determine whether to create a local session or a global session.



[jira] [Updated] (ZOOKEEPER-1505) Multi-thread CommitProcessor

2012-10-12 Thread Jay Shrauner (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Shrauner updated ZOOKEEPER-1505:


Attachment: ZOOKEEPER-1505.patch

Address feedback from review--shutdown CommitProcessor if downstream processor 
throws an exception (preserves previous behavior)

> Multi-thread CommitProcessor
> 
>
> Key: ZOOKEEPER-1505
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1505
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: server
>Affects Versions: 3.4.3, 3.4.4, 3.5.0
>Reporter: Jay Shrauner
>Assignee: Jay Shrauner
>  Labels: performance, scaling
> Fix For: 3.5.0
>
> Attachments: ZOOKEEPER-1505.patch, ZOOKEEPER-1505.patch, 
> ZOOKEEPER-1505.patch
>
>
> CommitProcessor has a single thread that both pulls requests off its queues 
> and runs all downstream processors. This is noticeably inefficient for 
> read-intensive workloads, which could be run concurrently. The trick is 
> handling write transactions. I propose multi-threading this code according to 
> the following two constraints
>   - each session must see its requests responded to in order
>   - all committed transactions must be handled in zxid order, across all 
> sessions
> I believe these cover the only constraints we need to honor. In particular, I 
> believe we can relax the following:
>   - it does not matter if the read request in one session happens before or 
> after the write request in another session
> With these constraints, I propose the following threads
>   - 1    primary queue servicing/work dispatching thread
>   - 0-N  assignable worker threads, where a given session is always assigned 
> to the same worker thread
> By assigning sessions always to the same worker thread (using a simple 
> sessionId mod number of worker threads), we guarantee the first constraint-- 
> requests we push onto the thread queue are processed in order. The way we 
> guarantee the second constraint is we only allow a single commit transaction 
> to be in flight at a time--the queue servicing thread blocks while a commit 
> transaction is in flight, and when the transaction completes it clears the 
> flag.
> On a 32 core machine running Linux 2.6.38, achieved best performance with 32 
> worker threads for a 56% +/- 5% improvement in throughput (this improvement 
> was measured on top of that for ZOOKEEPER-1504, not in isolation).
> New classes introduced in this patch are:
> WorkerService (also in ZOOKEEPER-1504): ExecutorService wrapper that 
> makes worker threads daemon threads and names them in an easily debuggable 
> manner. Supports assignable threads (as used here) and non-assignable threads 
> (as used by NIOServerCnxnFactory).



Re: [VOTE] Release ZooKeeper 3.4.5 (candidate 0)

2012-10-12 Thread Ted Yu
Patch v7 for ZOOKEEPER-1560 passes test suite.

Please take a look.

On Thu, Oct 11, 2012 at 2:45 PM, Mahadev Konar wrote:

> Thanks Alex for bringing it up. I'll hold the release for now. I see a
> patch on 1560. I'll take a look and we'll see how to roll this into
> 3.4.5.
>
> thanks
> mahadev
>
> On Thu, Oct 11, 2012 at 2:42 PM, Alexander Shraer 
> wrote:
> > Hi Mahadev,
> >
> > ZOOKEEPER-1560 and ZOOKEEPER-1561 indicate a potentially serious issue,
> > introduced recently in ZOOKEEPER-1437. Please consider this w.r.t. the
> > 3.4.5 release.
> >
> > Best Regards,
> > Alex
> >
> > On Wed, Oct 10, 2012 at 10:38 PM, Mahadev Konar 
> wrote:
> >> I think we have waited enough. Closing the vote now.
> >>
> >> With 5 +1's (3 binding) the vote passes. I will do the needful for
> >> getting the release out.
> >>
> >> Thanks for voting folks.
> >>
> >> mahadev
> >>
> >> On Wed, Oct 10, 2012 at 9:04 AM, Flavio Junqueira 
> wrote:
> >>> +1
> >>>
> >>> -Flavio
> >>>
> >>> On Oct 8, 2012, at 7:05 AM, Mahadev Konar wrote:
> >>>
>  Given Eugene's findings on ZOOKEEPER-1557, I think we can continue
>  rolling the current RC out. Others please vote on the thread if you
>  see any issues with that. Folks who have already voted, please re vote
>  in case you have a change of opinion.
> 
>  As for myself, I ran a couple of tests with the RC using open jdk 7
>  and things seem to work.
> 
>  +1 from my side. Pat/Ben/Flavio/others what do you guys think?
> 
>  thanks
>  mahadev
> 
>  On Sun, Oct 7, 2012 at 8:34 AM, Ted Yu  wrote:
> > Currently ZooKeeper_branch34_openjdk7 and ZooKeeper_branch34_jdk7
> are using
> > lock ZooKeeper-solaris.
> > I think ZooKeeper_branch34_openjdk7 and ZooKeeper_branch34_jdk7
> should use
> > a separate lock since they wouldn't run on a Solaris machine.
> > I didn't seem to find how a new lock name can be added.
> >
> > Recent builds for ZooKeeper_branch34_openjdk7 and
> ZooKeeper_branch34_jdk7
> > have been green.
> >
> > Cheers
> >
> > On Sun, Oct 7, 2012 at 6:56 AM, Patrick Hunt 
> wrote:
> >
> >> I've seen that before, it's a flakey test that's unrelated to the
> sasl
> >> stuff.
> >>
> >> Patrick
> >>
> >> On Sat, Oct 6, 2012 at 2:25 PM, Ted Yu  wrote:
> >>> I saw one test failure:
> >>>
> >>>
> >>
> https://builds.apache.org/view/S-Z/view/ZooKeeper/job/ZooKeeper_branch34_openjdk7/9/testReport/org.apache.zookeeper.server.quorum/QuorumPeerMainTest/testHighestZxidJoinLate/
> >>>
> >>> FYI
> >>>
> >>> On Sat, Oct 6, 2012 at 7:16 AM, Ted Yu 
> wrote:
> >>>
>  Up in ZOOKEEPER-1557, Eugene separated one test out and test
> failure
> >> seems
>  to be gone.
> 
>  For ZooKeeper_branch34_jdk7, the two failed builds:
>  #10 corresponded to ZooKeeper_branch34_openjdk7 build #7,
>  #8 corresponded to ZooKeeper_branch34_openjdk7 build #5
>  where tests failed due to BindException
> 
>  Cheers
> 
> 
>  On Sat, Oct 6, 2012 at 7:06 AM, Patrick Hunt 
> wrote:
> 
> > Yes. Those ubuntu machines have two slots each. If both tests
> run at
> > the same time... bam.
> >
> > I just added exclusion locks to the configuration of these two
> jobs,
> > that should help.
> >
> > Patrick
> >
> > On Fri, Oct 5, 2012 at 8:58 PM, Ted Yu 
> wrote:
> >> I think that was due to the following running on the same
> machine at
> >> the
> >> same time:
> >>
> >> Building remotely on ubuntu4
> >>  in workspace
> >>
> /home/jenkins/jenkins-slave/workspace/ZooKeeper_branch34_openjdk7
> >>
> >> We should introduce randomized port so that test suite can
> execute in
> >> parallel.
> >>
> >> Cheers
> >>
> >> On Fri, Oct 5, 2012 at 8:55 PM, Ted Yu 
> wrote:
> >>
> >>> Some tests failed in build 8 due to (See
> >>>
> >>>
> >
> >>
> https://builds.apache.org//view/S-Z/view/ZooKeeper/job/ZooKeeper_branch34_jdk7/8/testReport/org.apache.zookeeper.server/ZxidRolloverTest/testRolloverThenRestart/
> > ):
> >>>
> >>> java.lang.RuntimeException: java.net.BindException: Address
> already
> >> in
> > use
> >>>  at
> > org.apache.zookeeper.test.QuorumUtil.<init>(QuorumUtil.java:118)
> >>>  at
> >
> >>
> org.apache.zookeeper.server.ZxidRolloverTest.setUp(ZxidRolloverTest.java:63)
> >>> Caused by: java.net.BindException: Address already in use
> >>>  at sun.nio.ch.Net.bind0(Native Method)
> >>>  at sun.nio.ch.Net.bind(Net.java:344)
> >>>  at sun.nio.ch.Net.bind(Net.java:336)
> >>>  at
> >>

[jira] [Updated] (ZOOKEEPER-1504) Multi-thread NIOServerCnxn

2012-10-12 Thread Jay Shrauner (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Shrauner updated ZOOKEEPER-1504:


Attachment: ZOOKEEPER-1504.patch

Rebase

> Multi-thread NIOServerCnxn
> --
>
> Key: ZOOKEEPER-1504
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1504
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: server
>Affects Versions: 3.4.3, 3.4.4, 3.5.0
>Reporter: Jay Shrauner
>Assignee: Jay Shrauner
>  Labels: performance, scaling
> Fix For: 3.5.0
>
> Attachments: ZOOKEEPER-1504.patch, ZOOKEEPER-1504.patch, 
> ZOOKEEPER-1504.patch, ZOOKEEPER-1504.patch
>
>
> NIOServerCnxnFactory is single threaded, which doesn't scale well to large 
> numbers of clients. This is particularly noticeable when thousands of clients 
> connect. I propose multi-threading this code as follows:
> - 1   acceptor thread, for accepting new connections
> - 1-N selector threads
> - 0-M I/O worker threads
> Numbers of threads are configurable, with defaults scaling according to 
> number of cores. Communication with the selector threads is handled via 
> LinkedBlockingQueues, and connections are permanently assigned to a 
> particular selector thread so that all potentially blocking SelectionKey 
> operations can be performed solely by the selector thread. An ExecutorService 
> is used for the worker threads.
> On a 32 core machine running Linux 2.6.38, achieved best performance with 4 
> selector threads and 64 worker threads for a 70% +/- 5% improvement in 
> throughput.
> This patch incorporates and supersedes the patches for
> https://issues.apache.org/jira/browse/ZOOKEEPER-517
> https://issues.apache.org/jira/browse/ZOOKEEPER-1444
> New classes introduced in this patch are:
>   - ExpiryQueue (from ZOOKEEPER-1444): factor out the logic from 
> SessionTrackerImpl used to expire sessions so that the same logic can be used 
> to expire connections
>   - RateLogger (from ZOOKEEPER-517): rate limit error message logging, 
> currently only used to throttle rate of logging "out of file descriptors" 
> errors
>   - WorkerService (also in ZOOKEEPER-1505): ExecutorService wrapper that 
> makes worker threads daemon threads and names them in an easily debuggable 
> manner. Supports assignable threads (as used by CommitProcessor) and 
> non-assignable threads (as used here).



[jira] [Commented] (ZOOKEEPER-1560) Zookeeper client hangs on creation of large nodes

2012-10-12 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475146#comment-13475146
 ] 

Ted Yu commented on ZOOKEEPER-1560:
---

The good news is that patch v7 passed.
The not-so-good news is that I didn't find any occurrence of the warning message I 
added in v7.

Essentially patch v7 is the same as patch v2 - we shouldn't bail if a single 
sock.write() call didn't make progress.

> Zookeeper client hangs on creation of large nodes
> -
>
> Key: ZOOKEEPER-1560
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1560
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: java client
>Affects Versions: 3.4.4, 3.5.0
>Reporter: Igor Motov
>Assignee: Ted Yu
> Fix For: 3.5.0, 3.4.5
>
> Attachments: ZOOKEEPER-1560.patch, zookeeper-1560-v1.txt, 
> zookeeper-1560-v2.txt, zookeeper-1560-v3.txt, zookeeper-1560-v4.txt, 
> zookeeper-1560-v5.txt, zookeeper-1560-v6.txt, zookeeper-1560-v7.txt
>
>
> To reproduce, try creating a node with 0.5M of data using java client. The 
> test will hang waiting for a response from the server. See the attached patch 
> for the test that reproduces the issue.
> It seems that ZOOKEEPER-1437 introduced a few issues to 
> {{ClientCnxnSocketNIO.doIO}} that prevent {{ClientCnxnSocketNIO}} from 
> sending large packets that require several invocations of 
> {{SocketChannel.write}} to complete. The first issue is that the call to 
> {{outgoingQueue.removeFirstOccurrence(p);}} removes the packet from the queue 
> even if the packet wasn't completely sent yet.  It looks to me that this call 
> should be moved under {{if (!pbb.hasRemaining())}}. The second issue is that 
> {{p.createBB()}} is reinitializing {{ByteBuffer}} on every iteration, which 
> confuses {{SocketChannel.write}}. And the third issue is caused by extra 
> calls to {{cnxn.getXid()}} that increment xid on every iteration and confuse 
> the server.
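
The first two issues described above (removing the packet before it is fully sent, and rebuilding the ByteBuffer on every iteration) can be illustrated with a small Java sketch. It is a simplified model under stated assumptions, not the actual `ClientCnxnSocketNIO.doIO` code or the attached patch: `PartialWriteSketch`, `Packet`, `writeHead`, `sendAll`, and `TrickleChannel` are hypothetical names, the buffer is created once per packet, and the packet leaves the queue only when `hasRemaining()` is false. The xid issue is not modelled.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.WritableByteChannel;
import java.util.Deque;

/** Hypothetical illustration of sending a packet that needs several write() calls. */
final class PartialWriteSketch {
    /** Stand-in for a client packet; the buffer is created once and then reused. */
    static final class Packet {
        final ByteBuffer bb;
        Packet(byte[] payload) { this.bb = ByteBuffer.wrap(payload); }
    }

    /** One write attempt; returns true only if the head packet was fully sent and removed. */
    static boolean writeHead(WritableByteChannel channel, Deque<Packet> outgoingQueue)
            throws IOException {
        Packet p = outgoingQueue.peekFirst();
        if (p == null) {
            return false;
        }
        channel.write(p.bb); // may write zero bytes or only part of the buffer
        if (!p.bb.hasRemaining()) {
            outgoingQueue.removeFirstOccurrence(p); // remove only once everything is sent
            return true;
        }
        return false; // keep the packet and its buffer position for the next attempt
    }

    /** Demo driver; real NIO code would wait for write readiness between attempts. */
    static int sendAll(WritableByteChannel channel, Deque<Packet> outgoingQueue) {
        int calls = 0;
        try {
            while (!outgoingQueue.isEmpty()) {
                writeHead(channel, outgoingQueue);
                calls++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return calls;
    }

    /** Test channel that accepts at most maxPerCall bytes per write, like a full socket buffer. */
    static final class TrickleChannel implements WritableByteChannel {
        private final int maxPerCall;
        private final ByteArrayOutputStream sink = new ByteArrayOutputStream();
        TrickleChannel(int maxPerCall) { this.maxPerCall = maxPerCall; }

        @Override public int write(ByteBuffer src) {
            int n = Math.min(maxPerCall, src.remaining());
            for (int i = 0; i < n; i++) {
                sink.write(src.get());
            }
            return n;
        }
        @Override public boolean isOpen() { return true; }
        @Override public void close() { }
        byte[] received() { return sink.toByteArray(); }
    }
}
```

With a 10-byte packet and a channel that accepts 4 bytes per call, `sendAll` needs three write attempts, and the receiver sees the payload exactly once and in order because the buffer position is preserved between attempts.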



[jira] [Commented] (ZOOKEEPER-1560) Zookeeper client hangs on creation of large nodes

2012-10-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475134#comment-13475134
 ] 

Hadoop QA commented on ZOOKEEPER-1560:
--

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12548908/zookeeper-1560-v7.txt
  against trunk revision 1391526.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1218//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1218//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1218//console

This message is automatically generated.

> Zookeeper client hangs on creation of large nodes
> -
>
> Key: ZOOKEEPER-1560
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1560
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: java client
>Affects Versions: 3.4.4, 3.5.0
>Reporter: Igor Motov
>Assignee: Ted Yu
> Fix For: 3.5.0, 3.4.5
>
> Attachments: ZOOKEEPER-1560.patch, zookeeper-1560-v1.txt, 
> zookeeper-1560-v2.txt, zookeeper-1560-v3.txt, zookeeper-1560-v4.txt, 
> zookeeper-1560-v5.txt, zookeeper-1560-v6.txt, zookeeper-1560-v7.txt
>
>
> To reproduce, try creating a node with 0.5M of data using java client. The 
> test will hang waiting for a response from the server. See the attached patch 
> for the test that reproduces the issue.
> It seems that ZOOKEEPER-1437 introduced a few issues to 
> {{ClientCnxnSocketNIO.doIO}} that prevent {{ClientCnxnSocketNIO}} from 
> sending large packets that require several invocations of 
> {{SocketChannel.write}} to complete. The first issue is that the call to 
> {{outgoingQueue.removeFirstOccurrence(p);}} removes the packet from the queue 
> even if the packet wasn't completely sent yet.  It looks to me that this call 
> should be moved under {{if (!pbb.hasRemaining())}}. The second issue is that 
> {{p.createBB()}} is reinitializing {{ByteBuffer}} on every iteration, which 
> confuses {{SocketChannel.write}}. And the third issue is caused by extra 
> calls to {{cnxn.getXid()}} that increment xid on every iteration and confuse 
> the server.



Success: ZOOKEEPER-1560 PreCommit Build #1218

2012-10-12 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/ZOOKEEPER-1560
Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1218/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 262995 lines...]
 [exec] BUILD SUCCESSFUL
 [exec] Total time: 0 seconds
 [exec] 
 [exec] 
 [exec] 
 [exec] 
 [exec] +1 overall.  Here are the results of testing the latest attachment 
 [exec]   
http://issues.apache.org/jira/secure/attachment/12548908/zookeeper-1560-v7.txt
 [exec]   against trunk revision 1391526.
 [exec] 
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec] 
 [exec] +1 tests included.  The patch appears to include 3 new or 
modified tests.
 [exec] 
 [exec] +1 javadoc.  The javadoc tool did not generate any warning 
messages.
 [exec] 
 [exec] +1 javac.  The applied patch does not increase the total number 
of javac compiler warnings.
 [exec] 
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
(version 1.3.9) warnings.
 [exec] 
 [exec] +1 release audit.  The applied patch does not increase the 
total number of release audit warnings.
 [exec] 
 [exec] +1 core tests.  The patch passed core unit tests.
 [exec] 
 [exec] +1 contrib tests.  The patch passed contrib unit tests.
 [exec] 
 [exec] Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1218//testReport/
 [exec] Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1218//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
 [exec] Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1218//console
 [exec] 
 [exec] This message is automatically generated.
 [exec] Adding comment to Jira.
 [exec] Comment added.
 [exec] b2727V26Mo logged out
 [exec] Finished build.

BUILD SUCCESSFUL
Total time: 27 minutes 20 seconds
Archiving artifacts
Recording test results
Description set: ZOOKEEPER-1560
Email was triggered for: Success
Sending email for trigger: Success



###
## FAILED TESTS (if any) 
##
All tests passed

[jira] [Commented] (BOOKKEEPER-431) Duplicate definition of COOKIES_NODE

2012-10-12 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/BOOKKEEPER-431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475101#comment-13475101
 ] 

Uma Maheswara Rao G commented on BOOKKEEPER-431:


How about having a constants file and maintaining all such constants in one place?
If we keep the constants inside specific files, it is very easy to 
duplicate them.
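
A minimal sketch of the single-constants-file idea, for illustration only. The class name `SharedZkConstants`, the helper `cookiePath`, and the string values are assumptions, not taken from the BookKeeper source; only the names COOKIE_NODE and AVAILABLE_NODE come from this thread.

```java
// Hypothetical holder for shared znode names. The class name and the
// string values are illustrative assumptions, not BookKeeper's actual ones.
final class SharedZkConstants {
    static final String COOKIE_NODE = "cookies";
    static final String AVAILABLE_NODE = "available";

    private SharedZkConstants() {
        // constants only; never instantiated
    }

    // Example consumer: builds the cookie path under a given ledgers root.
    static String cookiePath(String ledgersRoot) {
        return ledgersRoot + "/" + COOKIE_NODE;
    }
}
```

With one holder like this, both Cookie.java and AbstractZkLedgerManager would refer to the same symbol, so the two definitions could not drift apart.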

> Duplicate definition of COOKIES_NODE
> 
>
> Key: BOOKKEEPER-431
> URL: https://issues.apache.org/jira/browse/BOOKKEEPER-431
> Project: Bookkeeper
>  Issue Type: Improvement
>Affects Versions: 4.1.0
>Reporter: Flavio Junqueira
>Assignee: Uma Maheswara Rao G
>Priority: Minor
> Fix For: 4.2.0
>
>
> Is it necessary to have two definitions of COOKIES_NODE, one in Cookie.java 
> and one in AbstractZkLedgerManager?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Assigned] (BOOKKEEPER-431) Duplicate definition of COOKIES_NODE

2012-10-12 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/BOOKKEEPER-431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G reassigned BOOKKEEPER-431:
--

Assignee: Uma Maheswara Rao G

> Duplicate definition of COOKIES_NODE
> 
>
> Key: BOOKKEEPER-431
> URL: https://issues.apache.org/jira/browse/BOOKKEEPER-431
> Project: Bookkeeper
>  Issue Type: Improvement
>Affects Versions: 4.1.0
>Reporter: Flavio Junqueira
>Assignee: Uma Maheswara Rao G
>Priority: Minor
> Fix For: 4.2.0
>
>
> Is it necessary to have two definitions of COOKIES_NODE, one in Cookie.java 
> and one in AbstractZkLedgerManager?



[jira] [Commented] (BOOKKEEPER-431) Duplicate definition of COOKIES_NODE

2012-10-12 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/BOOKKEEPER-431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475099#comment-13475099
 ] 

Flavio Junqueira commented on BOOKKEEPER-431:
-

Actually, Cookie.java defines COOKIE_NODE while AbstractZkLedgerManager defines 
COOKIES_NODE. I also noticed that AVAILABLE_NODE is duplicated. Is it for 
readability reasons? Shouldn't we have that in a single place?

> Duplicate definition of COOKIES_NODE
> 
>
> Key: BOOKKEEPER-431
> URL: https://issues.apache.org/jira/browse/BOOKKEEPER-431
> Project: Bookkeeper
>  Issue Type: Improvement
>Affects Versions: 4.1.0
>Reporter: Flavio Junqueira
>Priority: Minor
> Fix For: 4.2.0
>
>
> Is it necessary to have two definitions of COOKIES_NODE, one in Cookie.java 
> and one in AbstractZkLedgerManager?



[jira] [Updated] (ZOOKEEPER-1560) Zookeeper client hangs on creation of large nodes

2012-10-12 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated ZOOKEEPER-1560:
--

Attachment: zookeeper-1560-v7.txt

Patch v7 changes the IOE to a warning.
Let's see if the test is able to make further progress.

I wonder whether 77152 bytes would be big enough for most use cases.

> Zookeeper client hangs on creation of large nodes
> -
>
> Key: ZOOKEEPER-1560
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1560
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: java client
>Affects Versions: 3.4.4, 3.5.0
>Reporter: Igor Motov
>Assignee: Ted Yu
> Fix For: 3.5.0, 3.4.5
>
> Attachments: ZOOKEEPER-1560.patch, zookeeper-1560-v1.txt, 
> zookeeper-1560-v2.txt, zookeeper-1560-v3.txt, zookeeper-1560-v4.txt, 
> zookeeper-1560-v5.txt, zookeeper-1560-v6.txt, zookeeper-1560-v7.txt
>
>
> To reproduce, try creating a node with 0.5M of data using java client. The 
> test will hang waiting for a response from the server. See the attached patch 
> for the test that reproduces the issue.
> It seems that ZOOKEEPER-1437 introduced a few issues to 
> {{ClientCnxnSocketNIO.doIO}} that prevent {{ClientCnxnSocketNIO}} from 
> sending large packets that require several invocations of 
> {{SocketChannel.write}} to complete. The first issue is that the call to 
> {{outgoingQueue.removeFirstOccurrence(p);}} removes the packet from the queue 
> even if the packet wasn't completely sent yet.  It looks to me that this call 
> should be moved under {{if (!pbb.hasRemaining())}}. The second issue is that 
> {{p.createBB()}} is reinitializing {{ByteBuffer}} on every iteration, which 
> confuses {{SocketChannel.write}}. And the third issue is caused by extra 
> calls to {{cnxn.getXid()}} that increment xid on every iteration and confuse 
> the server.
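
A minimal sketch of the first two points in the description above, under stated assumptions: `PendingPacket`, `ThrottledChannel`, and `PartialWriteSketch` are hypothetical stand-ins and not ZooKeeper's classes. The buffer is created once and reused across passes, and the packet is dequeued only when `hasRemaining()` is false. The xid issue is not modelled here.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.WritableByteChannel;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for a queued request; not ZooKeeper's Packet class.
final class PendingPacket {
    private final byte[] payload;
    private ByteBuffer bb; // built once and reused, so its position survives passes

    PendingPacket(byte[] payload) {
        this.payload = payload;
    }

    ByteBuffer buffer() {
        if (bb == null) {
            bb = ByteBuffer.wrap(payload);
        }
        return bb;
    }
}

// Test double: accepts at most `limit` bytes per write call.
final class ThrottledChannel implements WritableByteChannel {
    private final int limit;
    private int total;

    ThrottledChannel(int limit) {
        this.limit = limit;
    }

    @Override
    public int write(ByteBuffer src) {
        int n = Math.min(limit, src.remaining());
        src.position(src.position() + n);
        total += n;
        return n;
    }

    @Override
    public boolean isOpen() {
        return true;
    }

    @Override
    public void close() {
    }

    int total() {
        return total;
    }
}

final class PartialWriteSketch {
    // One pass: write what the channel accepts; dequeue only when drained.
    static void writeOnce(Deque<PendingPacket> queue, WritableByteChannel ch) throws IOException {
        PendingPacket p = queue.peekFirst();
        if (p == null) {
            return;
        }
        ByteBuffer bb = p.buffer();
        ch.write(bb);
        if (!bb.hasRemaining()) {
            queue.removeFirstOccurrence(p);
        }
    }

    // Sends one payload through a throttled channel; returns the passes needed.
    // perCallLimit must be positive, otherwise the loop never makes progress.
    static int passesToDrain(int payloadSize, int perCallLimit) {
        Deque<PendingPacket> queue = new ArrayDeque<>();
        queue.add(new PendingPacket(new byte[payloadSize]));
        ThrottledChannel ch = new ThrottledChannel(perCallLimit);
        int passes = 0;
        try {
            while (!queue.isEmpty()) {
                writeOnce(queue, ch);
                passes++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        if (ch.total() != payloadSize) {
            throw new IllegalStateException("bytes lost or duplicated");
        }
        return passes;
    }
}
```

For example, `PartialWriteSketch.passesToDrain(5000, 2000)` needs three passes, and the packet leaves the queue only on the last one.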



[jira] [Created] (BOOKKEEPER-431) Duplicate definition of COOKIES_NODE

2012-10-12 Thread Flavio Junqueira (JIRA)
Flavio Junqueira created BOOKKEEPER-431:
---

 Summary: Duplicate definition of COOKIES_NODE
 Key: BOOKKEEPER-431
 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-431
 Project: Bookkeeper
  Issue Type: Improvement
Affects Versions: 4.1.0
Reporter: Flavio Junqueira
Priority: Minor
 Fix For: 4.2.0


Is it necessary to have two definitions of COOKIES_NODE, one in Cookie.java and 
one in AbstractZkLedgerManager?



[jira] [Commented] (ZOOKEEPER-1560) Zookeeper client hangs on creation of large nodes

2012-10-12 Thread Eugene Koontz (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475085#comment-13475085
 ] 

Eugene Koontz commented on ZOOKEEPER-1560:
--

It seems that in a particular iteration, 0 bytes are written:

{code}
localhost/127.0.0.1:11222, unexpected error, closing socket connection and attempting reconnect
 [exec] [junit] java.io.IOException: Couldn't write 2000 bytes, 0 bytes written in this iteration and 77152 bytes written in total. Original limit: 500074
 [exec] [junit] at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:145)
 [exec] [junit] at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:375)
 [exec] [junit] at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
 [exec] [junit] 2012-10-12 15:20:42,629 [myid:] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11222:NIOServerCnxn@349] - caught end of stream exception
 [exec] [junit] EndOfStreamException: Unable to read additional data from client sessionid 0x13a55902b650001, likely client has closed socket
 [exec] [junit] at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:220)
 [exec] [junit] at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
 [exec] [junit] at java.lang.Thread.run(Thread.java:662)
 [exec] [junit] 2012-10-12 15:20:42,630 [myid:] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11222:NIOServerCnxn@1001] - Closed socket connection for client /127.0.0.1:57126 which had sessionid 0x13a55902b650001
{code}

All the test failures so far share a strange resemblance: each one fails 
after 77152 bytes have been written.

> Zookeeper client hangs on creation of large nodes
> -
>
> Key: ZOOKEEPER-1560
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1560
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: java client
>Affects Versions: 3.4.4, 3.5.0
>Reporter: Igor Motov
>Assignee: Ted Yu
> Fix For: 3.5.0, 3.4.5
>
> Attachments: ZOOKEEPER-1560.patch, zookeeper-1560-v1.txt, 
> zookeeper-1560-v2.txt, zookeeper-1560-v3.txt, zookeeper-1560-v4.txt, 
> zookeeper-1560-v5.txt, zookeeper-1560-v6.txt
>
>
> To reproduce, try creating a node with 0.5M of data using java client. The 
> test will hang waiting for a response from the server. See the attached patch 
> for the test that reproduces the issue.
> It seems that ZOOKEEPER-1437 introduced a few issues to 
> {{ClientCnxnSocketNIO.doIO}} that prevent {{ClientCnxnSocketNIO}} from 
> sending large packets that require several invocations of 
> {{SocketChannel.write}} to complete. The first issue is that the call to 
> {{outgoingQueue.removeFirstOccurrence(p);}} removes the packet from the queue 
> even if the packet wasn't completely sent yet.  It looks to me that this call 
> should be moved under {{if (!pbb.hasRemaining())}}. The second issue is that 
> {{p.createBB()}} is reinitializing {{ByteBuffer}} on every iteration, which 
> confuses {{SocketChannel.write}}. And the third issue is caused by extra 
> calls to {{cnxn.getXid()}} that increment xid on every iteration and confuse 
> the server.



[jira] [Commented] (ZOOKEEPER-1560) Zookeeper client hangs on creation of large nodes

2012-10-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475079#comment-13475079
 ] 

Hadoop QA commented on ZOOKEEPER-1560:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12548898/zookeeper-1560-v6.txt
  against trunk revision 1391526.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1217//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1217//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1217//console

This message is automatically generated.

> Zookeeper client hangs on creation of large nodes
> -
>
> Key: ZOOKEEPER-1560
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1560
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: java client
>Affects Versions: 3.4.4, 3.5.0
>Reporter: Igor Motov
>Assignee: Ted Yu
> Fix For: 3.5.0, 3.4.5
>
> Attachments: ZOOKEEPER-1560.patch, zookeeper-1560-v1.txt, 
> zookeeper-1560-v2.txt, zookeeper-1560-v3.txt, zookeeper-1560-v4.txt, 
> zookeeper-1560-v5.txt, zookeeper-1560-v6.txt
>
>
> To reproduce, try creating a node with 0.5M of data using java client. The 
> test will hang waiting for a response from the server. See the attached patch 
> for the test that reproduces the issue.
> It seems that ZOOKEEPER-1437 introduced a few issues to 
> {{ClientCnxnSocketNIO.doIO}} that prevent {{ClientCnxnSocketNIO}} from 
> sending large packets that require several invocations of 
> {{SocketChannel.write}} to complete. The first issue is that the call to 
> {{outgoingQueue.removeFirstOccurrence(p);}} removes the packet from the queue 
> even if the packet wasn't completely sent yet.  It looks to me that this call 
> should be moved under {{if (!pbb.hasRemaining())}}. The second issue is that 
> {{p.createBB()}} is reinitializing {{ByteBuffer}} on every iteration, which 
> confuses {{SocketChannel.write}}. And the third issue is caused by extra 
> calls to {{cnxn.getXid()}} that increment xid on every iteration and confuse 
> the server.



Failed: ZOOKEEPER-1560 PreCommit Build #1217

2012-10-12 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/ZOOKEEPER-1560
Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1217/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 169234 lines...]
 [exec] 
 [exec] 
 [exec] 
 [exec] -1 overall.  Here are the results of testing the latest attachment 
 [exec]   http://issues.apache.org/jira/secure/attachment/12548898/zookeeper-1560-v6.txt
 [exec]   against trunk revision 1391526.
 [exec] 
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec] 
 [exec] +1 tests included.  The patch appears to include 3 new or modified tests.
 [exec] 
 [exec] +1 javadoc.  The javadoc tool did not generate any warning messages.
 [exec] 
 [exec] +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
 [exec] 
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.
 [exec] 
 [exec] +1 release audit.  The applied patch does not increase the total number of release audit warnings.
 [exec] 
 [exec] -1 core tests.  The patch failed core unit tests.
 [exec] 
 [exec] +1 contrib tests.  The patch passed contrib unit tests.
 [exec] 
 [exec] Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1217//testReport/
 [exec] Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1217//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
 [exec] Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1217//console
 [exec] 
 [exec] This message is automatically generated.
 [exec] 
 [exec] 
 [exec] 
==
 [exec] 
==
 [exec] Adding comment to Jira.
 [exec] 
==
 [exec] 
==
 [exec] 
 [exec] 
 [exec] Comment added.
 [exec] l38K6LEVny logged out
 [exec] 
 [exec] 
 [exec] 
==
 [exec] 
==
 [exec] Finished build.
 [exec] 
==
 [exec] 
==
 [exec] 
 [exec] 

BUILD FAILED
/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-Build/trunk/build.xml:1568: exec returned: 1

Total time: 24 minutes 48 seconds
Build step 'Execute shell' marked build as failure
Archiving artifacts
Recording test results
Description set: ZOOKEEPER-1560
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
3 tests failed.
REGRESSION:  org.apache.zookeeper.test.LETest.testLE

Error Message:
Thread 3 got 27 expected 28

Stack Trace:
junit.framework.AssertionFailedError: Thread 3 got 27 expected 28
at org.apache.zookeeper.test.LETest.testLE(LETest.java:135)
at org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:52)


FAILED:  org.apache.zookeeper.test.ChrootClientTest.testLargeNodeData

Error Message:
KeeperErrorCode = ConnectionLoss for /large

Stack Trace:
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /large
at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783)
at org.apache.zookeeper.test.ClientTest.testLargeNodeData(ClientTest.java:61)
at org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:52)


FAILED:  org.apache.zookeeper.test.ClientTest.testLargeNodeData

Error Message:
KeeperErrorCode = ConnectionLoss for /large

Stack Trace:
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /large
at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783)
at org.apache.zookeeper.test.ClientTest.testLargeNodeData(ClientTest.java:61)
at org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:52)




[jira] [Created] (BOOKKEEPER-430) Remove manual bookie registration from overview

2012-10-12 Thread Flavio Junqueira (JIRA)
Flavio Junqueira created BOOKKEEPER-430:
---

 Summary: Remove manual bookie registration from overview
 Key: BOOKKEEPER-430
 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-430
 Project: Bookkeeper
  Issue Type: Improvement
Affects Versions: 4.1.0
Reporter: Flavio Junqueira
Assignee: Flavio Junqueira


The documentation suggests that a user needs to manually register a bookie, 
which is not right.



[jira] [Commented] (BOOKKEEPER-422) Simplify AbstractSubscriptionManager

2012-10-12 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/BOOKKEEPER-422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475068#comment-13475068
 ] 

Flavio Junqueira commented on BOOKKEEPER-422:
-

It sounds like a good idea to me to use a SortedMap, Sijie. Do you see a 
problem with doing it, Stu?

It also sounds like a good idea to validate the subscriber id as you point out, 
Sijie. It should be a separate jira as you suggest.

> Simplify AbstractSubscriptionManager
> 
>
> Key: BOOKKEEPER-422
> URL: https://issues.apache.org/jira/browse/BOOKKEEPER-422
> Project: Bookkeeper
>  Issue Type: Improvement
>  Components: hedwig-server
>Reporter: Stu Hood
>Assignee: Stu Hood
>Priority: Minor
> Attachments: bk-422.diff, bk-422.diff, bk-422.diff
>
>
> It's difficult to maintain a duplicated/cached count of local subscribers, 
> and we've experienced a few issues due to it getting out of sync with the 
> actual set of subscribers. Since a count of local subscribers can be 
> calculated from the top2sub2seq map, let's do that instead.
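
A minimal sketch of deriving the count from the map instead of caching it. Only the name `top2sub2seq` comes from the description above; the simplified `String`/`Long` types, the class `DerivedCountSketch`, and the `isLocal` predicate are assumptions for illustration and not Hedwig's actual types.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Predicate;

final class DerivedCountSketch {
    // topic -> (subscriberId -> last consumed sequence id); simplified types.
    private final Map<String, Map<String, Long>> top2sub2seq = new ConcurrentHashMap<>();

    void subscribe(String topic, String subscriberId, long seq) {
        top2sub2seq.computeIfAbsent(topic, t -> new ConcurrentHashMap<>()).put(subscriberId, seq);
    }

    void unsubscribe(String topic, String subscriberId) {
        Map<String, Long> subs = top2sub2seq.get(topic);
        if (subs != null) {
            subs.remove(subscriberId);
        }
    }

    // Computed on demand, so there is no second counter to drift out of sync.
    long countLocal(String topic, Predicate<String> isLocal) {
        Map<String, Long> subs = top2sub2seq.get(topic);
        if (subs == null) {
            return 0;
        }
        return subs.keySet().stream().filter(isLocal).count();
    }
}
```

The trade-off is a scan of one topic's subscribers per query in exchange for removing the cached counter and its bookkeeping.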



[jira] [Commented] (ZOOKEEPER-1560) Zookeeper client hangs on creation of large nodes

2012-10-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475062#comment-13475062
 ] 

Hadoop QA commented on ZOOKEEPER-1560:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12548893/zookeeper-1560-v5.txt
  against trunk revision 1391526.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1216//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1216//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1216//console

This message is automatically generated.

> Zookeeper client hangs on creation of large nodes
> -
>
> Key: ZOOKEEPER-1560
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1560
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: java client
>Affects Versions: 3.4.4, 3.5.0
>Reporter: Igor Motov
>Assignee: Ted Yu
> Fix For: 3.5.0, 3.4.5
>
> Attachments: ZOOKEEPER-1560.patch, zookeeper-1560-v1.txt, 
> zookeeper-1560-v2.txt, zookeeper-1560-v3.txt, zookeeper-1560-v4.txt, 
> zookeeper-1560-v5.txt, zookeeper-1560-v6.txt
>
>
> To reproduce, try creating a node with 0.5M of data using java client. The 
> test will hang waiting for a response from the server. See the attached patch 
> for the test that reproduces the issue.
> It seems that ZOOKEEPER-1437 introduced a few issues to 
> {{ClientCnxnSocketNIO.doIO}} that prevent {{ClientCnxnSocketNIO}} from 
> sending large packets that require several invocations of 
> {{SocketChannel.write}} to complete. The first issue is that the call to 
> {{outgoingQueue.removeFirstOccurrence(p);}} removes the packet from the queue 
> even if the packet wasn't completely sent yet.  It looks to me that this call 
> should be moved under {{if (!pbb.hasRemaining())}}. The second issue is that 
> {{p.createBB()}} is reinitializing {{ByteBuffer}} on every iteration, which 
> confuses {{SocketChannel.write}}. And the third issue is caused by extra 
> calls to {{cnxn.getXid()}} that increment xid on every iteration and confuse 
> the server.



Failed: ZOOKEEPER-1560 PreCommit Build #1216

2012-10-12 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/ZOOKEEPER-1560
Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1216/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 169973 lines...]
 [exec] 
 [exec] 
 [exec] 
 [exec] -1 overall.  Here are the results of testing the latest attachment 
 [exec]   http://issues.apache.org/jira/secure/attachment/12548893/zookeeper-1560-v5.txt
 [exec]   against trunk revision 1391526.
 [exec] 
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec] 
 [exec] +1 tests included.  The patch appears to include 3 new or modified tests.
 [exec] 
 [exec] +1 javadoc.  The javadoc tool did not generate any warning messages.
 [exec] 
 [exec] +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
 [exec] 
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.
 [exec] 
 [exec] +1 release audit.  The applied patch does not increase the total number of release audit warnings.
 [exec] 
 [exec] -1 core tests.  The patch failed core unit tests.
 [exec] 
 [exec] +1 contrib tests.  The patch passed contrib unit tests.
 [exec] 
 [exec] Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1216//testReport/
 [exec] Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1216//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
 [exec] Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1216//console
 [exec] 
 [exec] This message is automatically generated.
 [exec] 
 [exec] 
 [exec] 
==
 [exec] 
==
 [exec] Adding comment to Jira.
 [exec] 
==
 [exec] 
==
 [exec] 
 [exec] 
 [exec] Comment added.
 [exec] vuJG8poe1s logged out
 [exec] 
 [exec] 
 [exec] 
==
 [exec] 
==
 [exec] Finished build.
 [exec] 
==
 [exec] 
==
 [exec] 
 [exec] 

BUILD FAILED
/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-Build/trunk/build.xml:1568: exec returned: 1

Total time: 23 minutes 57 seconds
Build step 'Execute shell' marked build as failure
Archiving artifacts
Recording test results
Description set: ZOOKEEPER-1560
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
2 tests failed.
FAILED:  org.apache.zookeeper.test.ChrootClientTest.testLargeNodeData

Error Message:
KeeperErrorCode = ConnectionLoss for /large

Stack Trace:
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /large
at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783)
at org.apache.zookeeper.test.ClientTest.testLargeNodeData(ClientTest.java:61)
at org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:52)


FAILED:  org.apache.zookeeper.test.ClientTest.testLargeNodeData

Error Message:
KeeperErrorCode = ConnectionLoss for /large

Stack Trace:
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /large
at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783)
at org.apache.zookeeper.test.ClientTest.testLargeNodeData(ClientTest.java:61)
at org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:52)




[jira] [Updated] (ZOOKEEPER-1560) Zookeeper client hangs on creation of large nodes

2012-10-12 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated ZOOKEEPER-1560:
--

Attachment: zookeeper-1560-v6.txt

Patch v6 changes the condition for raising IOE: it is now raised if there is no 
progress between successive sock.write() calls.

I guess the socket's output buffer might be a limiting factor on the number of 
bytes written in a particular sock.write() call.
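
To illustrate the output-buffer guess: in non-blocking mode a channel write can legitimately accept 0 bytes once the OS buffer is full, which is a "try again later" signal rather than an error. The sketch below uses `java.nio.channels.Pipe` as a stand-in for the client socket, an assumption made so that it runs without a server. The class and method names are hypothetical, and the number of bytes accepted before the stall depends on the platform.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;

final class ZeroByteWriteSketch {
    // Writes to a non-blocking pipe that nobody reads, until a write accepts
    // 0 bytes or the payload is exhausted; returns the bytes accepted so far.
    static int bytesAcceptedBeforeStall(int payloadSize) {
        try {
            Pipe pipe = Pipe.open();
            try {
                pipe.sink().configureBlocking(false);
                ByteBuffer bb = ByteBuffer.allocate(payloadSize);
                int total = 0;
                while (bb.hasRemaining()) {
                    int n = pipe.sink().write(bb);
                    if (n == 0) {
                        // Buffer full: a real client would wait for the
                        // selector to report writability and then resume.
                        break;
                    }
                    total += n;
                }
                return total;
            } finally {
                pipe.sink().close();
                pipe.source().close();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

If the same thing happens on the client socket, a 0-byte write after a fixed number of bytes would point at a full send buffer and a peer that has not drained it yet, not necessarily at a broken connection.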

> Zookeeper client hangs on creation of large nodes
> -
>
> Key: ZOOKEEPER-1560
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1560
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: java client
>Affects Versions: 3.4.4, 3.5.0
>Reporter: Igor Motov
>Assignee: Ted Yu
> Fix For: 3.5.0, 3.4.5
>
> Attachments: ZOOKEEPER-1560.patch, zookeeper-1560-v1.txt, 
> zookeeper-1560-v2.txt, zookeeper-1560-v3.txt, zookeeper-1560-v4.txt, 
> zookeeper-1560-v5.txt, zookeeper-1560-v6.txt
>
>
> To reproduce, try creating a node with 0.5M of data using java client. The 
> test will hang waiting for a response from the server. See the attached patch 
> for the test that reproduces the issue.
> It seems that ZOOKEEPER-1437 introduced a few issues to 
> {{ClientCnxnSocketNIO.doIO}} that prevent {{ClientCnxnSocketNIO}} from 
> sending large packets that require several invocations of 
> {{SocketChannel.write}} to complete. The first issue is that the call to 
> {{outgoingQueue.removeFirstOccurrence(p);}} removes the packet from the queue 
> even if the packet wasn't completely sent yet.  It looks to me that this call 
> should be moved under {{if (!pbb.hasRemaining())}}. The second issue is that 
> {{p.createBB()}} is reinitializing {{ByteBuffer}} on every iteration, which 
> confuses {{SocketChannel.write}}. And the third issue is caused by extra 
> calls to {{cnxn.getXid()}} that increment xid on every iteration and confuse 
> the server.



[jira] [Updated] (ZOOKEEPER-1560) Zookeeper client hangs on creation of large nodes

2012-10-12 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated ZOOKEEPER-1560:
--

Attachment: zookeeper-1560-v5.txt

From https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1215//testReport/org.apache.zookeeper.test/ClientTest/testLargeNodeData/ :
{code}
2012-10-12 14:10:50,042 [myid:] - WARN  [main-SendThread(localhost:11221):ClientCnxn$SendThread@1089] - Session 0x13a555031cf for server localhost/127.0.0.1:11221, unexpected error, closing socket connection and attempting reconnect
java.io.IOException: Couldn't write 2000 bytes, 1152 bytes written
at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:142)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:370)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
2012-10-12 14:10:50,044 [myid:] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11221:NIOServerCnxn@349] - caught end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid 0x13a555031cf, likely client has closed socket
at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:220)
at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
at java.lang.Thread.run(Thread.java:662)
{code}
Patch v5 adds more information to the exception message.

> Zookeeper client hangs on creation of large nodes
> -
>
> Key: ZOOKEEPER-1560
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1560
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: java client
>Affects Versions: 3.4.4, 3.5.0
>Reporter: Igor Motov
>Assignee: Ted Yu
> Fix For: 3.5.0, 3.4.5
>
> Attachments: ZOOKEEPER-1560.patch, zookeeper-1560-v1.txt, 
> zookeeper-1560-v2.txt, zookeeper-1560-v3.txt, zookeeper-1560-v4.txt, 
> zookeeper-1560-v5.txt
>
>
> To reproduce, try creating a node with 0.5M of data using java client. The 
> test will hang waiting for a response from the server. See the attached patch 
> for the test that reproduces the issue.
> It seems that ZOOKEEPER-1437 introduced a few issues to 
> {{ClientCnxnSocketNIO.doIO}} that prevent {{ClientCnxnSocketNIO}} from 
> sending large packets that require several invocations of 
> {{SocketChannel.write}} to complete. The first issue is that the call to 
> {{outgoingQueue.removeFirstOccurrence(p);}} removes the packet from the queue 
> even if the packet wasn't completely sent yet.  It looks to me that this call 
> should be moved under {{if (!pbb.hasRemaining())}}. The second issue is that 
> {{p.createBB()}} is reinitializing {{ByteBuffer}} on every iteration, which 
> confuses {{SocketChannel.write}}. And the third issue is caused by extra 
> calls to {{cnxn.getXid()}} that increment xid on every iteration and confuse 
> the server.



Jenkins build is still unstable: bookkeeper-trunk » hedwig-server #750

2012-10-12 Thread Apache Jenkins Server
See 




Jenkins build is back to stable : bookkeeper-trunk » bookkeeper-server #750

2012-10-12 Thread Apache Jenkins Server
See 




[jira] [Commented] (ZOOKEEPER-1560) Zookeeper client hangs on creation of large nodes

2012-10-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475014#comment-13475014
 ] 

Hadoop QA commented on ZOOKEEPER-1560:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12548889/zookeeper-1560-v4.txt
  against trunk revision 1391526.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1215//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1215//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1215//console

This message is automatically generated.

> Zookeeper client hangs on creation of large nodes
> -
>
> Key: ZOOKEEPER-1560
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1560
> Project: ZooKeeper
> Issue Type: Bug
> Components: java client
> Affects Versions: 3.4.4, 3.5.0
> Reporter: Igor Motov
> Assignee: Ted Yu
> Fix For: 3.5.0, 3.4.5
>
> Attachments: ZOOKEEPER-1560.patch, zookeeper-1560-v1.txt, 
> zookeeper-1560-v2.txt, zookeeper-1560-v3.txt, zookeeper-1560-v4.txt
>
>
> To reproduce, try creating a node with 0.5 MB of data using the Java client. The 
> test will hang waiting for a response from the server. See the attached patch 
> for the test that reproduces the issue.
> It seems that ZOOKEEPER-1437 introduced a few issues in 
> {{ClientCnxnSocketNIO.doIO}} that prevent {{ClientCnxnSocketNIO}} from 
> sending large packets that require several invocations of 
> {{SocketChannel.write}} to complete. The first issue is that the call to 
> {{outgoingQueue.removeFirstOccurrence(p);}} removes the packet from the queue 
> even if the packet hasn't been completely sent yet. It looks to me that this call 
> should be moved under {{if (!pbb.hasRemaining())}}. The second issue is that 
> {{p.createBB()}} reinitializes the {{ByteBuffer}} on every iteration, which 
> confuses {{SocketChannel.write}}. The third issue is caused by extra 
> calls to {{cnxn.getXid()}} that increment the xid on every iteration and confuse 
> the server.
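
The first two issues in the description can be illustrated with a small, self-contained sketch. It is not the ZooKeeper client code: PartialWriteSketch, ThrottledChannel, Packet, createBBIfAbsent and onWritable are hypothetical names, and the channel is a stand-in that accepts a bounded number of bytes per call, roughly the way a non-blocking socket may. The sketch assumes the buffer is created once per packet and that the packet leaves the queue only when the buffer has no bytes remaining.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.WritableByteChannel;
import java.util.Deque;

// Hypothetical names throughout; this is not the ZooKeeper client code.
final class PartialWriteSketch {

    /** Stand-in for a non-blocking socket that accepts at most `limit` bytes per call. */
    static final class ThrottledChannel implements WritableByteChannel {
        private final int limit;
        int bytesAccepted = 0;

        ThrottledChannel(int limit) {
            this.limit = limit;
        }

        @Override
        public int write(ByteBuffer src) {
            int n = Math.min(limit, src.remaining());
            src.position(src.position() + n); // consume n bytes, like a short write
            bytesAccepted += n;
            return n;
        }

        @Override
        public boolean isOpen() {
            return true;
        }

        @Override
        public void close() {
        }
    }

    /** Minimal packet whose buffer is built once and then reused. */
    static final class Packet {
        private final byte[] payload;
        ByteBuffer bb; // stays null until the first write attempt

        Packet(byte[] payload) {
            this.payload = payload;
        }

        void createBBIfAbsent() {
            if (bb == null) {
                bb = ByteBuffer.wrap(payload);
            }
        }
    }

    /**
     * Handles one "socket is writable" event. The head packet is dequeued
     * only after its buffer is fully drained. Returns true if it completed.
     */
    static boolean onWritable(Deque<Packet> outgoing, WritableByteChannel ch)
            throws IOException {
        Packet p = outgoing.peekFirst();
        if (p == null) {
            return false;
        }
        p.createBBIfAbsent(); // do not rebuild the buffer on a retry
        ch.write(p.bb);
        if (p.bb.hasRemaining()) {
            return false; // partial write: keep p queued for the next event
        }
        outgoing.removeFirstOccurrence(p);
        return true;
    }
}
```

With a 500,000-byte payload and a stand-in channel that takes 65,536 bytes per call, the packet stays queued for seven events and is removed on the eighth, after every byte has been accepted.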



Failed: ZOOKEEPER-1560 PreCommit Build #1215

2012-10-12 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/ZOOKEEPER-1560
Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1215/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 172814 lines...]
 [exec] 
 [exec] 
 [exec] 
 [exec] -1 overall.  Here are the results of testing the latest attachment 
 [exec]   http://issues.apache.org/jira/secure/attachment/12548889/zookeeper-1560-v4.txt
 [exec]   against trunk revision 1391526.
 [exec] 
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec] 
 [exec] +1 tests included.  The patch appears to include 3 new or modified tests.
 [exec] 
 [exec] +1 javadoc.  The javadoc tool did not generate any warning messages.
 [exec] 
 [exec] +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
 [exec] 
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.
 [exec] 
 [exec] +1 release audit.  The applied patch does not increase the total number of release audit warnings.
 [exec] 
 [exec] -1 core tests.  The patch failed core unit tests.
 [exec] 
 [exec] +1 contrib tests.  The patch passed contrib unit tests.
 [exec] 
 [exec] Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1215//testReport/
 [exec] Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1215//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
 [exec] Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1215//console
 [exec] 
 [exec] This message is automatically generated.
 [exec] 
 [exec] 
 [exec] 
==
 [exec] 
==
 [exec] Adding comment to Jira.
 [exec] 
==
 [exec] 
==
 [exec] 
 [exec] 
 [exec] Comment added.
 [exec] b22x2sL361 logged out
 [exec] 
 [exec] 
 [exec] 
==
 [exec] 
==
 [exec] Finished build.
 [exec] 
==
 [exec] 
==
 [exec] 
 [exec] 

BUILD FAILED
/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-Build/trunk/build.xml:1568:
 exec returned: 1

Total time: 24 minutes 12 seconds
Build step 'Execute shell' marked build as failure
Archiving artifacts
Recording test results
Description set: ZOOKEEPER-1560
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
2 tests failed.
FAILED:  org.apache.zookeeper.test.ChrootClientTest.testLargeNodeData

Error Message:
KeeperErrorCode = ConnectionLoss for /large

Stack Trace:
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /large
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
    at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783)
    at org.apache.zookeeper.test.ClientTest.testLargeNodeData(ClientTest.java:61)
    at org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:52)


FAILED:  org.apache.zookeeper.test.ClientTest.testLargeNodeData

Error Message:
KeeperErrorCode = ConnectionLoss for /large

Stack Trace:
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /large
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
    at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783)
    at org.apache.zookeeper.test.ClientTest.testLargeNodeData(ClientTest.java:61)
    at org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:52)




[jira] [Updated] (ZOOKEEPER-1560) Zookeeper client hangs on creation of large nodes

2012-10-12 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated ZOOKEEPER-1560:
--

Attachment: zookeeper-1560-v4.txt

Patch v4 adopts what Igor and Jacky suggested.

Since Packet p is guaranteed to send all bytes in its ByteBuffer, I think 
problem 2 no longer exists.

ChrootClientTest and ClientTest both passed.



[jira] [Commented] (ZOOKEEPER-1560) Zookeeper client hangs on creation of large nodes

2012-10-12 Thread Jacky007 (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13474981#comment-13474981
 ] 

Jacky007 commented on ZOOKEEPER-1560:
-

I think this would work for both 1560 and 1561.
{noformat}
if (p != null) {
    updateLastSend();
    // Skip xid assignment for ping and auth packets.
    if ((p.requestHeader != null) &&
            (p.requestHeader.getType() != OpCode.ping) &&
            (p.requestHeader.getType() != OpCode.auth)) {
        p.requestHeader.setXid(cnxn.getXid());
    }
    p.createBB();
    ByteBuffer pbb = p.bb;
    // ---> keep writing until the whole buffer has been handed to the socket
    while (pbb.hasRemaining()) {
        sock.write(pbb);
    }
    // ---> only then remove the packet from the queue
    outgoingQueue.removeFirstOccurrence(p);
    sentCount++;
    if (p.requestHeader != null
            && p.requestHeader.getType() != OpCode.ping
            && p.requestHeader.getType() != OpCode.auth) {
        pending.add(p);
    }
}
{noformat}
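
The effect of the write loop above can be checked in isolation with a rough sketch. Everything here is an assumption made for illustration: WriteFullySketch, ChunkedChannel, send and the xid counter are hypothetical stand-ins, not ZooKeeper classes. The point shown is that the xid is taken once and the buffer is built once per packet, however many write calls the channel needs.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.WritableByteChannel;

// Hypothetical names; only the loop shape mirrors the snippet above.
final class WriteFullySketch {

    /** Stand-in channel that accepts at most `limit` bytes per write call. */
    static final class ChunkedChannel implements WritableByteChannel {
        private final int limit;
        int writeCalls = 0;

        ChunkedChannel(int limit) {
            this.limit = limit;
        }

        @Override
        public int write(ByteBuffer src) {
            writeCalls++;
            int n = Math.min(limit, src.remaining());
            src.position(src.position() + n); // consume n bytes
            return n;
        }

        @Override
        public boolean isOpen() {
            return true;
        }

        @Override
        public void close() {
        }
    }

    private int xid = 1;   // hypothetical counter standing in for cnxn.getXid()
    int xidsIssued = 0;

    private int nextXid() {
        xidsIssued++;
        return xid++;
    }

    /** Sends one payload: one xid, one buffer, as many writes as needed. */
    int send(byte[] payload, WritableByteChannel ch) throws IOException {
        int assigned = nextXid();
        ByteBuffer bb = ByteBuffer.wrap(payload);
        // Caveat: a real non-blocking channel may accept 0 bytes per call
        // while its send buffer is full, in which case this loop spins.
        while (bb.hasRemaining()) {
            ch.write(bb);
        }
        return assigned;
    }
}
```

Sending a 10-byte and then a 4-byte payload through a stand-in channel that takes 3 bytes per call needs six write calls in total, yet only two xids are issued.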



[jira] [Comment Edited] (ZOOKEEPER-1549) Data inconsistency when follower is receiving a DIFF with a dirty snapshot

2012-10-12 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13474893#comment-13474893
 ] 

Flavio Junqueira edited comment on ZOOKEEPER-1549 at 10/12/12 8:44 AM:
---

I don't think major changes are needed, at least for the leader case. We simply 
shouldn't be taking snapshots over uncommitted state. Check ZOOKEEPER-1558 and 
ZOOKEEPER-1559, subtasks of this jira.

  was (Author: fpj):
I don't think major changes are needed, at least for the leader case. We 
simply shouldn't be taking snapshots over uncommitted state. Check 
ZOOKEEPER-1558 and ZOOKEEPER-1559, a subtask of this jira.
  
> Data inconsistency when follower is receiving a DIFF with a dirty snapshot
> --
>
> Key: ZOOKEEPER-1549
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1549
> Project: ZooKeeper
> Issue Type: Bug
> Components: quorum
> Affects Versions: 3.4.3
> Reporter: Jacky007
> Priority: Critical
> Attachments: case.patch
>
>
> The trunc code (from ZOOKEEPER-1154?) cannot work correctly if the snapshot is 
> not correct.
> Here is a scenario (similar to ZOOKEEPER-1154):
> Initial Condition
> 1. Let's say there are three nodes in the ensemble, A, B, and C, with A being the 
> leader.
> 2. The current epoch is 7.
> 3. For simplicity of the example, let's say the zxid is a two-digit number, 
> with the epoch being the first digit.
> 4. The zxid is 73.
> 5. All the nodes have seen change 73 and have persistently logged it.
> Step 1
> A request with zxid 74 is issued. The leader A writes it to its log, but the 
> entire ensemble crashes and B and C never write change 74 to their logs.
> Step 2
> A and B restart. A is elected as the new leader, loads its data, and takes a 
> clean snapshot (change 74 is in it). It then sends a diff to B, but B dies before 
> syncing with A. A dies later.
> Step 3
> B and C restart; A is still down.
> B and C form the quorum.
> B is the new leader. Let's say B's minCommitLog is 71 and maxCommitLog is 73.
> The epoch is now 8 and the zxid is 80.
> A request with zxid 81 is successful. On B, minCommitLog is now 71 and 
> maxCommitLog is 81.
> Step 4
> A starts up. It applies the change in the request with zxid 74 to its in-memory 
> data tree.
> A contacts B to registerAsFollower and provides 74 as its zxid.
> Since 71<=74<=81, B decides to send A the diff.
> Problem:
> The problem with the above sequence is that after truncating the log, A will 
> load the snapshot again, which is not correct.
> In the 3.3 branch, FileTxnSnapLog.restore does not call the listener (ZOOKEEPER-874), 
> so the leader sends a snapshot to the follower and this is not a problem.
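
The range check in Step 4 can be replayed with the numbers from the scenario. This is a sketch under stated assumptions rather than the quorum code: DiffDecisionSketch, epochOf, inCommitLogRange and leaderLog are hypothetical names, zxids follow the simplified two-digit model from the description, and the contents of B's log (71, 72, 73, 81) are inferred from the minCommitLog and maxCommitLog values given there.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper names; zxids use the two-digit model from the description.
final class DiffDecisionSketch {

    /** In the simplified model the first digit of a zxid is the epoch. */
    static int epochOf(int zxid) {
        return zxid / 10;
    }

    /** The range check from Step 4: minCommitLog <= peerZxid <= maxCommitLog. */
    static boolean inCommitLogRange(int peerZxid, int minCommitLog, int maxCommitLog) {
        return minCommitLog <= peerZxid && peerZxid <= maxCommitLog;
    }

    /** Assumed contents of B's log after Step 3 (72 is inferred, not stated). */
    static List<Integer> leaderLog() {
        return Arrays.asList(71, 72, 73, 81);
    }

    public static void main(String[] args) {
        int peerZxid = 74; // what A reports when it registers as a follower
        System.out.println("range check passes: " + inCommitLogRange(peerZxid, 71, 81));
        System.out.println("leader has txn 74:  " + leaderLog().contains(peerZxid));
        System.out.println("peer epoch: " + epochOf(peerZxid) + ", leader epoch: " + epochOf(81));
    }
}
```

In this model the check passes for 74 even though B never logged that transaction and it belongs to the older epoch 7, which is the situation the scenario describes.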



[jira] [Commented] (ZOOKEEPER-1549) Data inconsistency when follower is receiving a DIFF with a dirty snapshot

2012-10-12 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13474893#comment-13474893
 ] 

Flavio Junqueira commented on ZOOKEEPER-1549:
-

I don't think major changes are needed, at least for the leader case. We simply 
shouldn't be taking snapshots over uncommitted state. Check ZOOKEEPER-1558 and 
ZOOKEEPER-1559, a subtask of this jira.
