ZooKeeper-trunk-jdk7 - Build # 430 - Failure
See https://builds.apache.org/job/ZooKeeper-trunk-jdk7/430/

### ## LAST 60 LINES OF THE CONSOLE ###
[...truncated 252338 lines...]
    [junit] 2012-10-24 09:55:57,187 [myid:] - INFO [main:ClientBase@427] - STOPPING server
    [junit] 2012-10-24 09:55:57,188 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11221:NIOServerCnxnFactory@224] - NIOServerCnxn factory exited run method
    [junit] 2012-10-24 09:55:57,188 [myid:] - INFO [main:ZooKeeperServer@399] - shutting down
    [junit] 2012-10-24 09:55:57,188 [myid:] - INFO [main:SessionTrackerImpl@225] - Shutting down
    [junit] 2012-10-24 09:55:57,188 [myid:] - INFO [main:PrepRequestProcessor@733] - Shutting down
    [junit] 2012-10-24 09:55:57,188 [myid:] - INFO [main:SyncRequestProcessor@175] - Shutting down
    [junit] 2012-10-24 09:55:57,188 [myid:] - INFO [ProcessThread(sid:0 cport:-1)::PrepRequestProcessor@142] - PrepRequestProcessor exited loop!
    [junit] 2012-10-24 09:55:57,188 [myid:] - INFO [SyncThread:0:SyncRequestProcessor@155] - SyncRequestProcessor exited!
    [junit] 2012-10-24 09:55:57,189 [myid:] - INFO [main:FinalRequestProcessor@411] - shutdown of request processor complete
    [junit] 2012-10-24 09:55:57,189 [myid:] - INFO [main:FourLetterWordMain@43] - connecting to 127.0.0.1 11221
    [junit] 2012-10-24 09:55:57,190 [myid:] - INFO [main:JMXEnv@133] - ensureOnly:[]
    [junit] 2012-10-24 09:55:57,191 [myid:] - INFO [main:ClientBase@420] - STARTING server
    [junit] 2012-10-24 09:55:57,191 [myid:] - INFO [main:ZooKeeperServer@147] - Created server with tickTime 3000 minSessionTimeout 6000 maxSessionTimeout 6 datadir /home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk-jdk7/trunk/build/test/tmp/test6841446492964033919.junit.dir/version-2 snapdir /home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk-jdk7/trunk/build/test/tmp/test6841446492964033919.junit.dir/version-2
    [junit] 2012-10-24 09:55:57,192 [myid:] - INFO [main:NIOServerCnxnFactory@94] - binding to port 0.0.0.0/0.0.0.0:11221
    [junit] 2012-10-24 09:55:57,192 [myid:] - INFO [main:FileSnap@83] - Reading snapshot /home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk-jdk7/trunk/build/test/tmp/test6841446492964033919.junit.dir/version-2/snapshot.b
    [junit] 2012-10-24 09:55:57,194 [myid:] - INFO [main:FileTxnSnapLog@270] - Snapshotting: 0xb to /home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk-jdk7/trunk/build/test/tmp/test6841446492964033919.junit.dir/version-2/snapshot.b
    [junit] 2012-10-24 09:55:57,196 [myid:] - INFO [main:FourLetterWordMain@43] - connecting to 127.0.0.1 11221
    [junit] 2012-10-24 09:55:57,197 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11221:NIOServerCnxnFactory@197] - Accepted socket connection from /127.0.0.1:49287
    [junit] 2012-10-24 09:55:57,197 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11221:NIOServerCnxn@821] - Processing stat command from /127.0.0.1:49287
    [junit] 2012-10-24 09:55:57,198 [myid:] - INFO [Thread-4:NIOServerCnxn$StatCommand@655] - Stat command output
    [junit] 2012-10-24 09:55:57,198 [myid:] - INFO [Thread-4:NIOServerCnxn@1001] - Closed socket connection for client /127.0.0.1:49287 (no session established for client)
    [junit] 2012-10-24 09:55:57,198 [myid:] - INFO [main:JMXEnv@133] - ensureOnly:[InMemoryDataTree, StandaloneServer_port]
    [junit] 2012-10-24 09:55:57,200 [myid:] - INFO [main:JMXEnv@105] - expect:InMemoryDataTree
    [junit] 2012-10-24 09:55:57,200 [myid:] - INFO [main:JMXEnv@108] - found:InMemoryDataTree org.apache.ZooKeeperService:name0=StandaloneServer_port-1,name1=InMemoryDataTree
    [junit] 2012-10-24 09:55:57,200 [myid:] - INFO [main:JMXEnv@105] - expect:StandaloneServer_port
    [junit] 2012-10-24 09:55:57,200 [myid:] - INFO [main:JMXEnv@108] - found:StandaloneServer_port org.apache.ZooKeeperService:name0=StandaloneServer_port-1
    [junit] 2012-10-24 09:55:57,201 [myid:] - INFO [main:JUnit4ZKTestRunner$LoggedInvokeMethod@57] - FINISHED TEST METHOD testQuota
    [junit] 2012-10-24 09:55:57,201 [myid:] - INFO [main:ClientBase@457] - tearDown starting
    [junit] 2012-10-24 09:55:57,273 [myid:] - INFO [main:ZooKeeper@684] - Session: 0x13a923327c8 closed
    [junit] 2012-10-24 09:55:57,273 [myid:] - INFO [main-EventThread:ClientCnxn$EventThread@509] - EventThread shut down
    [junit] 2012-10-24 09:55:57,273 [myid:] - INFO [main:ClientBase@427] - STOPPING server
    [junit] 2012-10-24 09:55:57,274 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11221:NIOServerCnxnFactory@224] - NIOServerCnxn factory exited run method
    [junit] 2012-10-24 09:55:57,274 [myid:] - INFO [main:ZooKeeperServer@399] - shutting down
    [junit] 2012-10-24 09:55:57,275 [myid:] - INFO [main:SessionTrackerImpl@225] - Shutting down
    [junit] 2012-10-24 09:55:57,275 [myid:] -
[jira] [Commented] (ZOOKEEPER-1568) multi should have a non-transaction version
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483136#comment-13483136 ]

Flavio Junqueira commented on ZOOKEEPER-1568:
---------------------------------------------

Hi Jimmy, I'm trying to understand why submitting operations asynchronously is not sufficient for your case. Why do you need to use multi in this case?

multi should have a non-transaction version
-------------------------------------------

    Key: ZOOKEEPER-1568
    URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1568
    Project: ZooKeeper
    Issue Type: Improvement
    Reporter: Jimmy Xiang

Currently multi is transactional, i.e. all or none. However, sometimes we don't want that: we want all operations to be executed. Even if some operation(s) fail, it is ok; we just need to know the result of each operation.
[jira] [Commented] (ZOOKEEPER-1568) multi should have a non-transaction version
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483311#comment-13483311 ]

Jimmy Xiang commented on ZOOKEEPER-1568:
----------------------------------------

Hi Flavio, for our use case we need to create/setData hundreds or thousands of znodes. By submitting operations asynchronously, we still have to issue them one by one. If we could do it in batches, we would save lots of network trips.
[jira] [Commented] (ZOOKEEPER-1568) multi should have a non-transaction version
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483347#comment-13483347 ]

Flavio Junqueira commented on ZOOKEEPER-1568:
---------------------------------------------

In my view, the asynchronous API was designed to address exactly use cases like yours. I don't think you should suffer any severe penalty by using the asynchronous API. Have you actually tried it and had any issues with it?
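For readers following the thread: the pattern Flavio refers to relies on ZooKeeper's asynchronous create, which queues every request on the session socket without waiting for individual replies. A minimal sketch of that pattern, assuming illustrative paths and data (not taken from the thread):

{code}
import java.util.List;
import java.util.concurrent.CountDownLatch;

import org.apache.zookeeper.AsyncCallback.StringCallback;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException.Code;
import org.apache.zookeeper.ZooDefs.Ids;
import org.apache.zookeeper.ZooKeeper;

// Queue every create on the session socket without waiting for replies;
// the client pipelines them, so there is no per-op round-trip stall.
void createAll(ZooKeeper zk, List<String> paths, final byte[] data)
        throws InterruptedException {
    final CountDownLatch done = new CountDownLatch(paths.size());
    StringCallback cb = new StringCallback() {
        public void processResult(int rc, String path, Object ctx, String name) {
            if (rc != Code.OK.intValue()) {
                System.err.println("create failed for " + path + ": " + Code.get(rc));
            }
            done.countDown(); // each op still reports its own result
        }
    };
    for (String path : paths) {
        zk.create(path, data, Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT, cb, null);
    }
    done.await(); // wait for all outstanding replies
}
{code}

Note that even pipelined this way, each op is still one request on the wire, which is the overhead Jimmy's batching proposal targets.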
[jira] [Commented] (ZOOKEEPER-1560) Zookeeper client hangs on creation of large nodes
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483566#comment-13483566 ]

Patrick Hunt commented on ZOOKEEPER-1560:
-----------------------------------------

That's a good point; the while loop in the patch seems like it would block when the tcp buffer is full (e.g. if the server is slow to read). I don't think that's a good idea. Rather, we should have the code structured similarly to what it was before: write as much as possible and then use the selector to wait for the socket to become writeable again. Eventually the send buffer will drain and we can remove the packet from the queue.

Zookeeper client hangs on creation of large nodes
-------------------------------------------------

    Key: ZOOKEEPER-1560
    URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1560
    Project: ZooKeeper
    Issue Type: Bug
    Components: java client
    Affects Versions: 3.4.4, 3.5.0
    Reporter: Igor Motov
    Assignee: Ted Yu
    Fix For: 3.5.0, 3.4.5
    Attachments: ZOOKEEPER-1560.patch, zookeeper-1560-v1.txt, zookeeper-1560-v2.txt, zookeeper-1560-v3.txt, zookeeper-1560-v4.txt, zookeeper-1560-v5.txt, zookeeper-1560-v6.txt, zookeeper-1560-v7.txt

To reproduce, try creating a node with 0.5M of data using the java client. The test will hang waiting for a response from the server. See the attached patch for the test that reproduces the issue. It seems that ZOOKEEPER-1437 introduced a few issues to {{ClientCnxnSocketNIO.doIO}} that prevent {{ClientCnxnSocketNIO}} from sending large packets that require several invocations of {{SocketChannel.write}} to complete. The first issue is that the call to {{outgoingQueue.removeFirstOccurrence(p);}} removes the packet from the queue even if the packet wasn't completely sent yet; it looks to me that this call should be moved under {{if (!pbb.hasRemaining())}}. The second issue is that {{p.createBB()}} reinitializes the {{ByteBuffer}} on every iteration, which confuses {{SocketChannel.write}}. And the third issue is caused by extra calls to {{cnxn.getXid()}} that increment the xid on every iteration and confuse the server.
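The "write as much as possible, then let the selector wake us up" structure Patrick describes is the standard non-blocking NIO idiom. A generic sketch of it, deliberately independent of the ZooKeeper client classes under discussion (the queue and key handling here are illustrative, not the actual client code):

{code}
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.SocketChannel;
import java.util.Queue;

// Called when the selector reports the socket writable: write what the TCP
// buffer will take, keep the packet queued until fully sent, and use the
// OP_WRITE interest bit - not a busy loop - to resume later.
void handleWritable(SocketChannel sock, SelectionKey key, Queue<ByteBuffer> outgoing)
        throws IOException {
    ByteBuffer bb = outgoing.peek();
    if (bb != null) {
        sock.write(bb);          // may send only part of the buffer
        if (!bb.hasRemaining()) {
            outgoing.poll();     // fully drained: safe to remove now
        }
    }
    if (outgoing.isEmpty()) {
        key.interestOps(key.interestOps() & ~SelectionKey.OP_WRITE);
    } else {
        key.interestOps(key.interestOps() | SelectionKey.OP_WRITE);
    }
}
{code}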
[jira] [Commented] (ZOOKEEPER-1568) multi should have a non-transaction version
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483590#comment-13483590 ]

Marshall McMullen commented on ZOOKEEPER-1568:
----------------------------------------------

I actually think there is a valid use case for this, mostly for performance reasons. Because a multi is one transaction, it causes less perturbation of the distributed and replicated state of zookeeper than multiple individual operations outside a multi. With a multi:

- You only pay the cost of the RPC overhead once rather than on each individual operation
- You get one flush of the leader channel rather than multiple ones, one for each write operation
- A multi will case one new snapshot/log to be generated rather than multiple ones, one for each operation

There are other, non-performance reasons that make this worthwhile too; e.g., if it makes the programmer's job easier to use a multi with these semantics, then that's a win. In other distributed databases I've worked on, we used different terminology to distinguish between a multi op where everything succeeds or fails together and one where that is not guaranteed. We used the term "Batch" to imply we were batching up operations but there was no guarantee they'd all succeed/fail.
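For contrast with the proposal, this is what today's transactional multi looks like from the Java client; paths and data are placeholders. All three ops travel in one request and commit, or fail, as a unit:

{code}
import java.util.Arrays;
import java.util.List;

import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.Op;
import org.apache.zookeeper.OpResult;
import org.apache.zookeeper.ZooDefs.Ids;
import org.apache.zookeeper.ZooKeeper;

// Three writes travel in one request and commit as one quorum proposal;
// if any op fails, none are applied (the all-or-nothing semantics at issue).
List<OpResult> batchWrite(ZooKeeper zk, byte[] data)
        throws KeeperException, InterruptedException {
    return zk.multi(Arrays.asList(
            Op.create("/app/a", data, Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT),
            Op.setData("/app/b", data, -1),   // -1: match any version
            Op.delete("/app/c", -1)));
}
{code}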
[jira] [Commented] (ZOOKEEPER-1560) Zookeeper client hangs on creation of large nodes
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483592#comment-13483592 ]

Nikita Vetoshkin commented on ZOOKEEPER-1560:
---------------------------------------------

If no one can prepend {{outgoingQueue}} with a packet, a straightforward implementation like this should work:

{noformat}
diff --git a/src/java/main/org/apache/zookeeper/ClientCnxnSocketNIO.java b/src/java/main/org/apache/zookeeper/ClientCnxnSocketNIO.java
index 70d8538..457c8cc 100644
--- a/src/java/main/org/apache/zookeeper/ClientCnxnSocketNIO.java
+++ b/src/java/main/org/apache/zookeeper/ClientCnxnSocketNIO.java
@@ -111,17 +111,20 @@ public class ClientCnxnSocketNIO extends ClientCnxnSocket {
                     cnxn.sendThread.clientTunneledAuthenticationInProgress());
             if (p != null) {
-                outgoingQueue.removeFirstOccurrence(p);
                 updateLastSend();
-                if ((p.requestHeader != null) &&
-                        (p.requestHeader.getType() != OpCode.ping) &&
-                        (p.requestHeader.getType() != OpCode.auth)) {
-                    p.requestHeader.setXid(cnxn.getXid());
+                if (p.bb != null) {
+                    if ((p.requestHeader != null) &&
+                            (p.requestHeader.getType() != OpCode.ping) &&
+                            (p.requestHeader.getType() != OpCode.auth)) {
+                        p.requestHeader.setXid(cnxn.getXid());
+                    }
+                    p.createBB();
+                    // otherwise we're in the middle of sending packet
                 }
-                p.createBB();
                 ByteBuffer pbb = p.bb;
                 sock.write(pbb);
                 if (!pbb.hasRemaining()) {
+                    outgoingQueue.removeFirstOccurrence(p);
                     sentCount++;
                     if (p.requestHeader != null
                             && p.requestHeader.getType() != OpCode.ping
{noformat}
[jira] [Commented] (ZOOKEEPER-1560) Zookeeper client hangs on creation of large nodes
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483597#comment-13483597 ]

Ted Yu commented on ZOOKEEPER-1560:
-----------------------------------

Looking at createBB(), upon exit the field bb wouldn't be null. I wonder why p.createBB() is enclosed in the if (p.bb != null) block above?
[jira] [Commented] (ZOOKEEPER-1560) Zookeeper client hangs on creation of large nodes
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483601#comment-13483601 ]

Ted Yu commented on ZOOKEEPER-1560:
-----------------------------------

bq. similar to what it was before - write as much as possible and then use the selector to wait for the socket to become writeable again

I looked at the svn log for ClientCnxnSocketNIO.java back to 2011-04-12 and didn't seem to find the above change. FYI
[jira] [Commented] (ZOOKEEPER-1568) multi should have a non-transaction version
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483618#comment-13483618 ]

Ted Yu commented on ZOOKEEPER-1568:
-----------------------------------

bq. A multi will case one new snapshot/log to be generated

I guess you meant 'cause' above.

bq. but there was no guarantee they'd all succeed/fail.

I think we need to formalize how success / failure status for individual operations in this new multi API should be delivered back to the client.
[jira] [Commented] (ZOOKEEPER-1568) multi should have a non-transaction version
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483626#comment-13483626 ]

Marshall McMullen commented on ZOOKEEPER-1568:
----------------------------------------------

Yes, I meant 'cause' :). The existing multi code fills in a list of results, one for each op. Right now it aborts on the first op that fails and rolls back the data tree to what it was before it started, and it explicitly marks all ops after that point in the results list with a runtime exception. So the mechanism is already there to communicate the errors back to the client. I suppose the multi code would then need to take a bool to indicate whether it is all-or-nothing or not.
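A sketch of how a client could consume the per-op results machinery Marshall describes, assuming a hypothetical non-transactional multi that returns the same List<OpResult>, with failed or skipped ops represented as OpResult.ErrorResult entries:

{code}
import java.util.List;

import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.OpResult;

// Inspect a results list in which failed/skipped ops appear as ErrorResult
// entries ('results' would come from the hypothetical batch variant).
void report(List<OpResult> results) {
    for (OpResult r : results) {
        if (r instanceof OpResult.ErrorResult) {
            int err = ((OpResult.ErrorResult) r).getErr();
            System.err.println("op failed: " + KeeperException.Code.get(err));
        }
    }
}
{code}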
[jira] [Commented] (ZOOKEEPER-1568) multi should have a non-transaction version
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483635#comment-13483635 ]

Ted Yu commented on ZOOKEEPER-1568:
-----------------------------------

bq. it aborts on the first op that fails and rolls back

Should we allow operations after the failed operation to continue? The rationale is that the operations in the batch may not have dependencies among them.
[jira] [Commented] (ZOOKEEPER-1560) Zookeeper client hangs on creation of large nodes
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483725#comment-13483725 ]

Patrick Hunt commented on ZOOKEEPER-1560:
-----------------------------------------

bq. PDH: similar to what it was before - write as much as possible and then use the selector to wait for the socket to become writeable again

bq. Ted: I looked at svn log for ClientCnxnSocketNIO.java back to 2011-04-12 and didn't seem to find the above change. FYI

Hi Ted, the following is what I was referring to. This is from the latest on branch-3.3; 3.4.4 has a similar (although broken) block where it's a bit less obvious what's happening, branch-3.3 is clearer. Notice that we first attempt to write; if !remaining then we remove from the queue, otherwise we'll wait till the next time the selector wakes us up (the final isEmpty check is pretty critical here as well, to set interest correctly) and retry until the buffer is drained.

{noformat}
if (sockKey.isWritable()) {
    synchronized (outgoingQueue) {
        if (!outgoingQueue.isEmpty()) {
            ByteBuffer pbb = outgoingQueue.getFirst().bb;
            sock.write(pbb);
            if (!pbb.hasRemaining()) {
                sentCount++;
                Packet p = outgoingQueue.removeFirst();
                if (p.header != null
                        && p.header.getType() != OpCode.ping
                        && p.header.getType() != OpCode.auth) {
                    pendingQueue.add(p);
                }
            }
        }
    }
}
if (outgoingQueue.isEmpty()) {
    disableWrite();
} else {
    enableWrite();
}
{noformat}
[jira] [Commented] (ZOOKEEPER-1560) Zookeeper client hangs on creation of large nodes
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483737#comment-13483737 ]

Ted Yu commented on ZOOKEEPER-1560:
-----------------------------------

I got the following based on the above code snippet:

{code}
Index: src/java/main/org/apache/zookeeper/ClientCnxnSocketNIO.java
===================================================================
--- src/java/main/org/apache/zookeeper/ClientCnxnSocketNIO.java (revision 1401904)
+++ src/java/main/org/apache/zookeeper/ClientCnxnSocketNIO.java (working copy)
@@ -111,18 +111,18 @@
                     cnxn.sendThread.clientTunneledAuthenticationInProgress());
             if (p != null) {
-                outgoingQueue.removeFirstOccurrence(p);
                 updateLastSend();
                 if ((p.requestHeader != null) &&
                         (p.requestHeader.getType() != OpCode.ping) &&
                         (p.requestHeader.getType() != OpCode.auth)) {
                     p.requestHeader.setXid(cnxn.getXid());
                 }
-                p.createBB();
+                if (p.bb == null) p.createBB();
                 ByteBuffer pbb = p.bb;
                 sock.write(pbb);
                 if (!pbb.hasRemaining()) {
                     sentCount++;
+                    outgoingQueue.removeFirstOccurrence(p);
                     if (p.requestHeader != null
                             && p.requestHeader.getType() != OpCode.ping
                             && p.requestHeader.getType() != OpCode.auth) {
@@ -141,8 +141,12 @@
             synchronized(pendingQueue) {
                 pendingQueue.addAll(pending);
             }
-        }
+            if (outgoingQueue.isEmpty()) {
+                disableWrite();
+            } else {
+                enableWrite();
+            }
         }

     private Packet findSendablePacket(LinkedList<Packet> outgoingQueue,
{code}

I still saw testLargeNodeData fail:

{code}
Testcase: testLargeNodeData took 0.714 sec
	Caused an ERROR
KeeperErrorCode = ConnectionLoss for /large
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /large
	at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
	at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
	at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783)
	at org.apache.zookeeper.test.ClientTest.testLargeNodeData(ClientTest.java:61)
{code}
[jira] [Commented] (ZOOKEEPER-1560) Zookeeper client hangs on creation of large nodes
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483748#comment-13483748 ]

Igor Motov commented on ZOOKEEPER-1560:
---------------------------------------

{quote}
For problem #3, I only found one call to getXid() in doIO:
{code}
p.requestHeader.setXid(cnxn.getXid());
{code}
which is not in a loop. Some clarification would be nice.
{quote}

It's in the outer loop, so to speak. If the packet is large and is sent in chunks, the xid is incremented for every chunk. Before ZOOKEEPER-1437 it was incremented once per packet.
Re: Review Request: patch for ZOOKEEPER-1560: Zookeeper client hangs on creation of large nodes
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/7730/#review12743
-----------------------------------------------------------

src/java/main/org/apache/zookeeper/ClientCnxnSocketNIO.java
https://reviews.apache.org/r/7730/#comment27199

    Don't setXid >1x for the same packet

src/java/main/org/apache/zookeeper/ClientCnxnSocketNIO.java
https://reviews.apache.org/r/7730/#comment27200

    Don't createBB >1x for the same packet

src/java/main/org/apache/zookeeper/ClientCnxnSocketNIO.java
https://reviews.apache.org/r/7730/#comment27203

    Remove p from outgoingQueue only after we have finished writing it

src/java/main/org/apache/zookeeper/ClientCnxnSocketNIO.java
https://reviews.apache.org/r/7730/#comment27205

    Always pick the first packet if we already started writing it, so we finish writing it

- Skye Wanderman-Milne

On Oct. 25, 2012, 12:50 a.m., Skye Wanderman-Milne wrote:

    -----------------------------------------------------------
    This is an automatically generated e-mail. To reply, visit:
    https://reviews.apache.org/r/7730/
    -----------------------------------------------------------

    (Updated Oct. 25, 2012, 12:50 a.m.)

    Review request for zookeeper, Patrick Hunt and Ted Yu.

    Description
    -----------

    see ZOOKEEPER-1560 JIRA

    This addresses bug ZOOKEEPER-1560.
        https://issues.apache.org/jira/browse/ZOOKEEPER-1560

    Diffs
    -----

      src/java/main/org/apache/zookeeper/ClientCnxnSocketNIO.java 70d8538

    Diff: https://reviews.apache.org/r/7730/diff/

    Testing
    -------

    unit tests (including testLargeNodeData from ZOOKEEPER-1560 JIRA)

    Thanks,

    Skye Wanderman-Milne
[jira] [Updated] (ZOOKEEPER-1560) Zookeeper client hangs on creation of large nodes
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Skye Wanderman-Milne updated ZOOKEEPER-1560:
--------------------------------------------

    Attachment: ZOOKEEPER-1560-v8.patch

I've created a new patch (ZOOKEEPER-1560-v8.patch) that incorporates what we have so far (moving removeFirstOccurrence to after the packet is completely written, only calling createBB when a BB doesn't already exist, and only calling setXid when no xid is already set). It also modifies findSendablePacket to always choose the first packet if it is partially written. The only place that a packet is prepended to outgoingQueue is ClientCnxn.primeConnection, which should only happen at the very beginning, so a partially-written packet should remain at the beginning of the queue until it is removed. I also cleaned up some of the code, so the changes look more extensive than they really are :) Posted at https://reviews.apache.org/r/7730. I added comments to mark the important parts (as opposed to the clean up).
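The findSendablePacket rule Skye describes, reduced to a standalone sketch; this mirrors, but is not, the v8 patch, and the Packet class here is a stand-in with only the field the rule needs:

{code}
import java.util.LinkedList;

// Stand-in for the client's internal packet type.
class Packet {
    java.nio.ByteBuffer bb; // null until the packet is serialized for sending
}

Packet findSendablePacket(LinkedList<Packet> outgoingQueue) {
    if (outgoingQueue.isEmpty()) {
        return null;
    }
    Packet first = outgoingQueue.getFirst();
    if (first.bb != null && first.bb.position() > 0) {
        // Mid-send: the head packet must be finished before anything else,
        // otherwise its remaining bytes would interleave with another packet.
        return first;
    }
    // ... otherwise fall through to the normal selection logic ...
    return first;
}
{code}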
Failed: ZOOKEEPER-1560 PreCommit Build #1238
Jira: https://issues.apache.org/jira/browse/ZOOKEEPER-1560
Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1238/

### ## LAST 60 LINES OF THE CONSOLE ###
[...truncated 257285 lines...]
     [exec]
     [exec] -1 overall. Here are the results of testing the latest attachment
     [exec] http://issues.apache.org/jira/secure/attachment/12550725/ZOOKEEPER-1560-v8.patch
     [exec] against trunk revision 1391526.
     [exec]
     [exec] +1 @author. The patch does not contain any @author tags.
     [exec]
     [exec] -1 tests included. The patch doesn't appear to include any new or modified tests.
     [exec] Please justify why no new tests are needed for this patch.
     [exec] Also please list what manual steps were performed to verify this patch.
     [exec]
     [exec] +1 javadoc. The javadoc tool did not generate any warning messages.
     [exec]
     [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
     [exec]
     [exec] +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
     [exec]
     [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.
     [exec]
     [exec] +1 core tests. The patch passed core unit tests.
     [exec]
     [exec] +1 contrib tests. The patch passed contrib unit tests.
     [exec]
     [exec] Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1238//testReport/
     [exec] Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1238//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
     [exec] Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1238//console
     [exec]
     [exec] This message is automatically generated.
     [exec]
     [exec] ==
     [exec] ==
     [exec] Adding comment to Jira.
     [exec] ==
     [exec] ==
     [exec]
     [exec] Comment added.
     [exec] 61ekK3nG4J logged out
     [exec]
     [exec] ==
     [exec] ==
     [exec] Finished build.
     [exec] ==
     [exec] ==

BUILD FAILED
/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-Build/trunk/build.xml:1568: exec returned: 1

Total time: 27 minutes 35 seconds
Build step 'Execute shell' marked build as failure
Archiving artifacts
Recording test results
Description set: ZOOKEEPER-1560
Email was triggered for: Failure
Sending email for trigger: Failure

### ## FAILED TESTS (if any) ##
All tests passed
[jira] [Commented] (ZOOKEEPER-1560) Zookeeper client hangs on creation of large nodes
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483809#comment-13483809 ]

Hadoop QA commented on ZOOKEEPER-1560:
--------------------------------------

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12550725/ZOOKEEPER-1560-v8.patch
against trunk revision 1391526.

    +1 @author. The patch does not contain any @author tags.

    -1 tests included. The patch doesn't appear to include any new or modified tests.
    Please justify why no new tests are needed for this patch.
    Also please list what manual steps were performed to verify this patch.

    +1 javadoc. The javadoc tool did not generate any warning messages.

    +1 javac. The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit. The applied patch does not increase the total number of release audit warnings.

    +1 core tests. The patch passed core unit tests.

    +1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1238//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1238//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1238//console

This message is automatically generated.
Re: Review Request: patch for ZOOKEEPER-1560: Zookeeper client hangs on creation of large nodes
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/7730/#review12751
-----------------------------------------------------------

Ship it!

I think this patch nicely summarizes the collective feedback for this JIRA. Minor comments below.

src/java/main/org/apache/zookeeper/ClientCnxnSocketNIO.java
https://reviews.apache.org/r/7730/#comment27226

    'p.bb will already exist' - 'p.bb would not be null'

src/java/main/org/apache/zookeeper/ClientCnxnSocketNIO.java
https://reviews.apache.org/r/7730/#comment27227

    'we already starting' - 'we have already started'

src/java/main/org/apache/zookeeper/ClientCnxnSocketNIO.java
https://reviews.apache.org/r/7730/#comment27228

    Remove the whitespace introduced.

- Ted Yu
[jira] [Commented] (ZOOKEEPER-1560) Zookeeper client hangs on creation of large nodes
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483823#comment-13483823 ]

Ted Yu commented on ZOOKEEPER-1560:
-----------------------------------

I left some minor comments on the review board. Nice work, Skye.
[jira] [Commented] (ZOOKEEPER-1437) Client uses session before SASL authentication complete
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483888#comment-13483888 ]

Eugene Koontz commented on ZOOKEEPER-1437:
------------------------------------------

Hi Jordan, what version of Java are you using to run the Java client? If it's Java 7, your problem in fact might be ZOOKEEPER-1550.
-Eugene

Client uses session before SASL authentication complete
-------------------------------------------------------

    Key: ZOOKEEPER-1437
    URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1437
    Project: ZooKeeper
    Issue Type: Bug
    Components: java client
    Affects Versions: 3.4.3
    Reporter: Thomas Weise
    Assignee: Eugene Koontz
    Fix For: 3.4.4, 3.5.0
    Attachments: getXidCallHierarchy.png, ZOOKEEPER-1437.patch, ZOOKEEPER-1437.patch, ZOOKEEPER-1437.patch, ZOOKEEPER-1437.patch, ZOOKEEPER-1437.patch, ZOOKEEPER-1437.patch, ZOOKEEPER-1437.patch, ZOOKEEPER-1437.patch, ZOOKEEPER-1437.patch, ZOOKEEPER-1437.patch, ZOOKEEPER-1437.patch, ZOOKEEPER-1437.patch, ZOOKEEPER-1437.patch, ZOOKEEPER-1437.patch, ZOOKEEPER-1437.patch, ZOOKEEPER-1437.patch, ZOOKEEPER-1437.patch

Found issue in the context of hbase region server startup, but it can be reproduced w/ zkCli alone. getData may occur prior to SaslAuthenticated and fail with NoAuth. This is not expected behavior when the client is configured to use SASL.
[jira] [Commented] (BOOKKEEPER-362) Local subscriptions fail if remote region is down
[ https://issues.apache.org/jira/browse/BOOKKEEPER-362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483033#comment-13483033 ]

Sijie Guo commented on BOOKKEEPER-362:
--------------------------------------

It seems that the new patch could not be applied to the latest trunk. I took a look at the patch and had some comments, as below:

1) #checkTopicSubscribedFromRegion / #setTopicUnsubscribedFromRegion / #setTopicSubscribedFromRegion

1.1) Where to manage this information? I don't think it is a good idea to put it in TopicManager. TopicManager's role is to manage the ownership of topics, while these three methods interact with metadata related to region info, which is similar to SubscriptionDataManager and TopicPersistenceInfoManager. So I would suggest you move it to a brand new metadata manager like 'TopicRemoteSubscriptionDataManager' and put this manager under MetadataManagerFactory. You could then provide a ZKTopicRemoteSubscriptionDataManager in ZKMetadataManagerFactory, so you don't need to worry about the TODO issue you described in MMTopicManager, and it would make the responsibilities clearer.

1.2) The names of these methods: from my understanding of this issue, the data to record is which regions the hub has subscribed to the topic remotely, so a better name might be 'To', not 'From', e.g. '#checkTopicSubscribedToRegion', '#setTopicUnsubscribedToRegion', '#setTopicSubscribedToRegion'. If my understanding is not right, please correct me.

1.3) It would be better to use the colo name rather than regionAddress, I think, since regionAddress might be changed to a different address. Besides that, ZooKeeperServiceDown is not a good name; I would suggest using a name like 'MetadataServiceDown'.

2)

{code}
-        // no subscriptions now, it may be removed by other release ops
-        if (null != topicSubscriptions) {
-            for (ByteString subId : topicSubscriptions.keySet()) {
-                if (logger.isDebugEnabled()) {
-                    logger.debug("Stop serving subscriber (" + topic.toStringUtf8() + ", "
-                                 + subId.toStringUtf8() + ") when losing topic");
-                }
-                if (null != dm) {
-                    dm.stopServingSubscriber(topic, subId);
-                }
-            }
-        }
-        if (logger.isDebugEnabled()) {
-            logger.debug("Stop serving topic " + topic.toStringUtf8());
-        }
-        // Since we decrement local count when some of remote subscriptions failed,
-        // while we don't unsubscribe those succeed subscriptions. so we can't depends
-        // on local count, just try to notify unsubscribe.
-        notifyLastLocalUnsubscribe(topic);
+        cb.operationFinished(ctx, null);
{code}

It seems that you removed some logic (stop serving subscriber when unsubscribing) when you rebased the patch.

3) It was great to clean up the boolean flag in ReleaseOp by adding a backup map, but this fix doesn't seem to be related to this jira. So if it is convenient, when you generate a new patch, could you split this part into a separate JIRA?

{code}
 public class InMemorySubscriptionManager extends AbstractSubscriptionManager {
+    // Backup for top2sub2seq
+    final ConcurrentHashMap<ByteString, Map<ByteString, InMemorySubscriptionState>> _top2sub2seq =
+        new ConcurrentHashMap<ByteString, Map<ByteString, InMemorySubscriptionState>>();
{code}

4) Indent issue:

{code}
                     if (LOGGER.isDebugEnabled())
-                        LOGGER.debug("[" + myRegion + "] cross-region recv-fwd succeeded for topic "
+                            LOGGER.debug("[" + myRegion + "] cross-region recv-fwd succeeded for topic "
                                  + topic.toStringUtf8());
{code}

I saw some code with wrong indentation. I think it might have been introduced by the rebase.
Local subscriptions fail if remote region is down
-------------------------------------------------

    Key: BOOKKEEPER-362
    URL: https://issues.apache.org/jira/browse/BOOKKEEPER-362
    Project: Bookkeeper
    Issue Type: Bug
    Components: hedwig-server
    Affects Versions: 4.2.0
    Reporter: Aniruddha
    Assignee: Aniruddha
    Priority: Critical
    Labels: hedwig
    Attachments: 0001-Ignore-hub-client-remote-subscription-failure-if-we-.patch, rebase_remoteregion.patch

Currently, local subscriptions fail if the remote region hubs are down, even if the local hub has subscribed to the remote topic previously. Because of this, one region cannot function
[jira] [Commented] (BOOKKEEPER-336) bookie readEntries is taking more time if the ensemble has failed bookie(s)
[ https://issues.apache.org/jira/browse/BOOKKEEPER-336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483042#comment-13483042 ]

Rakesh R commented on BOOKKEEPER-336:
-------------------------------------

Thanks Ivan, Stud for the participation. I also agree to go with read timeouts. I can see that a parallel read to the quorum bookies would make the bkserver busy/exhausted with many read requests and might affect write latency in the worst case. But the good thing is, we have the BOOKKEEPER-429 idea of separate read/write threads; this would help us with latency, and IMO this issue should go together with it.

bookie readEntries is taking more time if the ensemble has failed bookie(s)
---------------------------------------------------------------------------

    Key: BOOKKEEPER-336
    URL: https://issues.apache.org/jira/browse/BOOKKEEPER-336
    Project: Bookkeeper
    Issue Type: Bug
    Affects Versions: 4.1.0
    Reporter: Brahma Reddy Battula
    Attachments: BOOKKEEPER-336.1.patch, BOOKKEEPER-336.draft1.diff, BOOKKEEPER-336.patch

Scenario:
1) Start three bookies. Create a ledger with ensemblesize=3, quorumsize=2
2) Add 100 entries to this ledger
3) Bring the first bookie down and read entries 0-99

Output: each entry is first fetched from the failed bookie, and only after waiting for the bookie connection timeout does the read move on to the next bookie. This is hurting read-entry performance.

Impact: Namenode switching time will be affected by adding this failed bookie readTimeOut as well.
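The "read timeout" alternative being agreed on here, sketched with a plain scheduler; sendRead, the answered flag, and the string ensemble are hypothetical stand-ins for illustration, not the BookKeeper client API:

{code}
import java.util.List;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical helper: issue the read to the first bookie, and fall back to
// the next one if no reply lands within a short per-read timeout (instead of
// waiting out the long connection timeout).
void readWithTimeout(final List<String> ensemble, final long entryId,
                     final AtomicBoolean answered,
                     ScheduledExecutorService timer, long readTimeoutMs) {
    sendRead(ensemble.get(0), entryId);
    timer.schedule(new Runnable() {
        public void run() {
            if (!answered.get()) {                  // first bookie silent so far
                sendRead(ensemble.get(1), entryId); // retry on the next bookie
            }
        }
    }, readTimeoutMs, TimeUnit.MILLISECONDS);
}

void sendRead(String bookie, long entryId) {
    // hypothetical: issue the read RPC to the given bookie
}
{code}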
Review Request: BOOKKEEPER-368: Implementing multiplexing java client.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/7724/
-----------------------------------------------------------

Review request for bookkeeper.

Description
-----------

Implement a multiplexing java client.

This addresses bug BOOKKEEPER-368.
    https://issues.apache.org/jira/browse/BOOKKEEPER-368

Diffs
-----

  hedwig-client/src/main/java/org/apache/hedwig/client/conf/ClientConfiguration.java fa2c6d6
  hedwig-client/src/main/java/org/apache/hedwig/client/handlers/SubscribeResponseHandler.java b8e5aec
  hedwig-client/src/main/java/org/apache/hedwig/client/netty/HedwigClientImpl.java 1724e04
  hedwig-client/src/main/java/org/apache/hedwig/client/netty/impl/HChannelHandler.java 7753c6e
  hedwig-client/src/main/java/org/apache/hedwig/client/netty/impl/multiplex/MultiplexHChannelManager.java PRE-CREATION
  hedwig-client/src/main/java/org/apache/hedwig/client/netty/impl/multiplex/MultiplexSubscribeResponseHandler.java PRE-CREATION
  hedwig-client/src/main/java/org/apache/hedwig/client/netty/impl/multiplex/MultiplexSubscriptionChannelPipelineFactory.java PRE-CREATION
  hedwig-client/src/main/java/org/apache/hedwig/client/netty/impl/multiplex/ResubscribeCallback.java PRE-CREATION
  hedwig-client/src/main/java/org/apache/hedwig/client/netty/impl/simple/SimpleSubscribeResponseHandler.java a426a7b
  hedwig-protocol/src/main/java/org/apache/hedwig/protocol/PubSubProtocol.java 8d8f2ac
  hedwig-protocol/src/main/java/org/apache/hedwig/protoextensions/PubSubResponseUtils.java af69043
  hedwig-protocol/src/main/protobuf/PubSubProtocol.proto 7fafcce
  hedwig-server/src/main/java/org/apache/hedwig/server/delivery/FIFODeliveryManager.java fd5f448
  hedwig-server/src/main/java/org/apache/hedwig/server/handlers/SubscribeHandler.java dfcde9f
  hedwig-server/src/main/java/org/apache/hedwig/server/handlers/SubscriptionChannelManager.java 2a8d093
  hedwig-server/src/main/java/org/apache/hedwig/server/proxy/ChannelTracker.java 5bfd898
  hedwig-server/src/main/java/org/apache/hedwig/server/proxy/HedwigProxy.java 35f8b64
  hedwig-server/src/main/java/org/apache/hedwig/server/proxy/ProxyCloseSubscriptionHandler.java PRE-CREATION
  hedwig-server/src/test/java/org/apache/hedwig/client/TestPubSubClient.java ce0f3f6
  hedwig-server/src/test/java/org/apache/hedwig/client/netty/TestCloseSubscription.java bf74df1
  hedwig-server/src/test/java/org/apache/hedwig/client/netty/TestMultiplexing.java PRE-CREATION
  hedwig-server/src/test/java/org/apache/hedwig/server/HedwigRegionTestBase.java 4ec0d50
  hedwig-server/src/test/java/org/apache/hedwig/server/delivery/TestThrottlingDelivery.java 4338825
  hedwig-server/src/test/java/org/apache/hedwig/server/handlers/TestSubUnsubHandler.java 5bbf603
  hedwig-server/src/test/java/org/apache/hedwig/server/integration/TestHedwigHub.java 02b4503
  hedwig-server/src/test/java/org/apache/hedwig/server/integration/TestHedwigRegion.java 0b1851e

Diff: https://reviews.apache.org/r/7724/diff/

Testing
-------

Passed all testing.

Thanks,

Sijie Guo
[jira] [Updated] (BOOKKEEPER-368) Implementing multiplexing java client.
[ https://issues.apache.org/jira/browse/BOOKKEEPER-368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sijie Guo updated BOOKKEEPER-368:
---------------------------------

    Attachment: BOOKKEEPER-368.diff

Attached a new patch rebased to the latest trunk. Also put it on review board: https://reviews.apache.org/r/7724/

Implementing multiplexing java client.
--------------------------------------

    Key: BOOKKEEPER-368
    URL: https://issues.apache.org/jira/browse/BOOKKEEPER-368
    Project: Bookkeeper
    Issue Type: Sub-task
    Reporter: Sijie Guo
    Assignee: Sijie Guo
    Fix For: 4.2.0
    Attachments: BOOKKEEPER-368.diff, BOOKKEEPER-368.diff

Implement a multiplexing java client.
Re: Review Request: [BOOKKEEPER-204] Provide a MetaStore interface, and a mock implementation.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/7314/
-----------------------------------------------------------

(Updated Oct. 24, 2012, 12:07 p.m.)

Review request for bookkeeper.

Changes
-------

Made some changes following Ivan's comments and refined the test case. BTW, although MetastoreTable#put is removed, MetastoreTable#versionedPut(.., Version.ANY) can also update data without comparing the version. I'm not quite sure whether to disable this usage or not.

Description
-----------

We need a MetaStore interface which makes it easy for us to plug in different scalable k/v storage, such as HBase.

This addresses bug BOOKKEEPER-204.
    https://issues.apache.org/jira/browse/BOOKKEEPER-204

Diffs (updated)
-----

  bookkeeper-server/src/main/java/org/apache/bookkeeper/metastore/MSException.java PRE-CREATION
  bookkeeper-server/src/main/java/org/apache/bookkeeper/metastore/MetaStore.java PRE-CREATION
  bookkeeper-server/src/main/java/org/apache/bookkeeper/metastore/MetastoreCallback.java PRE-CREATION
  bookkeeper-server/src/main/java/org/apache/bookkeeper/metastore/MetastoreCursor.java PRE-CREATION
  bookkeeper-server/src/main/java/org/apache/bookkeeper/metastore/MetastoreException.java PRE-CREATION
  bookkeeper-server/src/main/java/org/apache/bookkeeper/metastore/MetastoreFactory.java PRE-CREATION
  bookkeeper-server/src/main/java/org/apache/bookkeeper/metastore/MetastoreScannableTable.java PRE-CREATION
  bookkeeper-server/src/main/java/org/apache/bookkeeper/metastore/MetastoreTable.java PRE-CREATION
  bookkeeper-server/src/main/java/org/apache/bookkeeper/metastore/MetastoreTableItem.java PRE-CREATION
  bookkeeper-server/src/main/java/org/apache/bookkeeper/metastore/Value.java PRE-CREATION
  bookkeeper-server/src/main/java/org/apache/bookkeeper/metastore/mock/MockMetaStore.java PRE-CREATION
  bookkeeper-server/src/main/java/org/apache/bookkeeper/metastore/mock/MockMetastoreCursor.java PRE-CREATION
  bookkeeper-server/src/main/java/org/apache/bookkeeper/metastore/mock/MockMetastoreTable.java PRE-CREATION
  bookkeeper-server/src/test/java/org/apache/bookkeeper/metastore/MetastoreScannableTableAsyncToSyncConverter.java PRE-CREATION
  bookkeeper-server/src/test/java/org/apache/bookkeeper/metastore/MetastoreTableAsyncToSyncConverter.java PRE-CREATION
  bookkeeper-server/src/test/java/org/apache/bookkeeper/metastore/TestMetaStore.java PRE-CREATION

Diff: https://reviews.apache.org/r/7314/diff/

Testing
-------

Thanks,

Jiannan Wang
Re: Review Request: [BOOKKEEPER-204] Provide a MetaStore interface, and a mock implementation.
On Oct. 22, 2012, 2:11 p.m., Ivan Kelly wrote:

    bookkeeper-server/src/main/java/org/apache/bookkeeper/metastore/Value.java, line 37
    https://reviews.apache.org/r/7314/diff/1/?file=160309#file160309line37

    value should only be a byte[]. Adding fields like this overexpands the scope of the change without a strong need.

Jiannan Wang wrote:

    Currently, SubscriptionData contains preference and state information, where the state is updated frequently while the preference changes only on subscribe. For better performance, SubscriptionDataManager supports a partial update operation. This is the reason why we introduced fields in Value: to support updating a specific field.

This is a lot of complexity to add for a single corner case. A simpler solution would be to split SubscriptionData before writing to the metadata interface and to write to 2 separate keys, subid-prefs and subid-state for example.

- Ivan

---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/7314/#review12653
---
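Ivan's split-keys suggestion is easy to picture. A minimal sketch, assuming a plain key-value table in place of the real MetastoreTable; the -prefs/-state suffixes come from his comment, everything else is invented:

{code}
// Sketch of the two-key layout Ivan suggests: preferences and state live under
// separate keys, so the frequently updated state can be written on its own and
// Value stays a plain byte[]. The table is a stand-in for a MetastoreTable.
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class SubscriptionDataSplitter {
    private final Map<String, byte[]> table = new ConcurrentHashMap<>();

    // Rarely changes: written only on (re)subscribe.
    void writePreferences(String subId, byte[] prefs) {
        table.put(subId + "-prefs", prefs);
    }

    // Changes frequently: written as the consume position advances.
    void writeState(String subId, byte[] state) {
        table.put(subId + "-state", state);
    }
}
{code}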
[jira] [Commented] (BOOKKEEPER-390) Provide support for ZooKeeper authentication
[ https://issues.apache.org/jira/browse/BOOKKEEPER-390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483285#comment-13483285 ]

Flavio Junqueira commented on BOOKKEEPER-390:
---------------------------------------------

I'd like to ask a couple of questions just for my own understanding; this is not (yet) a criticism of this approach:
# When creating a bookkeeper object, we have the option of passing a zookeeper object. What if we require that, in the case of zookeeper authentication being enabled, the application creates a zookeeper object before using bookkeeper?
# We are moving towards having a MetaStore interface (BOOKKEEPER-204) so that we can use different backends to store metadata. Should we be looking into implementing a more general approach that fits into the MetaStore interface and enables authentication for anything that supports SASL?

Provide support for ZooKeeper authentication
--------------------------------------------
Key: BOOKKEEPER-390
URL: https://issues.apache.org/jira/browse/BOOKKEEPER-390
Project: Bookkeeper
Issue Type: New Feature
Components: bookkeeper-client, bookkeeper-server
Affects Versions: 4.0.0
Reporter: Rakesh R
Assignee: Rakesh R
Attachments: BOOKKEEPER-390-Acl-draftversion.patch

This JIRA adds support for protecting the state of Bookkeeper znodes on a multi-tenant ZooKeeper cluster.

Use case: a user runs a ZK cluster in multi-tenant mode, where more than one client service shares a single ZK service instance (cluster). In this case the client services typically want to protect their data (ZK znodes) from access by other services (tenants) on the cluster. Say you are running BK, HBase or ZKFC instances, etc.; having authentication/authorization on the znodes is important both for security and for helping to ensure that services don't interact negatively (touch each other's data).

Presently Bookkeeper does not have support for authentication or authorization when accessing ZK. This should be added to the BK clients/servers that access the ZK cluster. In general it means calling addAuthInfo once after a session is established.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
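For context on the last sentence of the description: with the stock ZooKeeper client, the whole scheme amounts to one addAuthInfo call after the session is established, plus creating znodes with a creator-only ACL. A minimal sketch, with placeholder credentials and paths rather than what the attached patch uses:

{code}
// Minimal sketch of "calling addAuthInfo once after a session is established"
// with the stock ZooKeeper client. Credentials and the znode path are
// placeholders, not what the attached patch uses.
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class ZkAuthSketch {
    public static void main(String[] args) throws Exception {
        ZooKeeper zk = new ZooKeeper("localhost:2181", 30000, null);
        // Authenticate the session once...
        zk.addAuthInfo("digest", "bookkeeper:secret".getBytes());
        // ...then create znodes that only the creator's identity can touch,
        // so other tenants on the shared cluster cannot read or modify them.
        zk.create("/bookkeeper-demo", new byte[0],
                ZooDefs.Ids.CREATOR_ALL_ACL, CreateMode.PERSISTENT);
        zk.close();
    }
}
{code}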
Jenkins build is still unstable: bookkeeper-trunk » bookkeeper-server #769
See https://builds.apache.org/job/bookkeeper-trunk/org.apache.bookkeeper$bookkeeper-server/769/
Jenkins build is still unstable: bookkeeper-trunk » hedwig-server #769
See https://builds.apache.org/job/bookkeeper-trunk/org.apache.bookkeeper$hedwig-server/769/
[jira] [Comment Edited] (BOOKKEEPER-390) Provide support for ZooKeeper authentication
[ https://issues.apache.org/jira/browse/BOOKKEEPER-390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483285#comment-13483285 ]

Flavio Junqueira edited comment on BOOKKEEPER-390 at 10/24/12 3:05 PM:
-----------------------------------------------------------------------

I'd like to ask a couple of questions just for my own understanding; this is not (yet) a criticism of this approach:
# When creating a bookkeeper object, we have the option of passing a zookeeper object. What if we require that, in the case of zookeeper authentication being enabled, the application creates a zookeeper object before using bookkeeper?
# We are moving towards having a MetaStore interface (BOOKKEEPER-204) so that we can use different backends to store metadata. Should we be looking into implementing a more general approach that fits into the MetaStore interface and enables authentication for anything that supports SASL?

was (Author: fpj):
I'd like to ask a couple of questions just for my own understanding, it is not (yet) a criticism to this approach:
# When creating a bookkeeper object, we have the option of passing a zookeeper object. What if we require that, in the case of zookeeper authentication enabled, the application creates a zookeeper object before using bookkeeper?
# We are moving towards having a MetaStore interface (BOOKKEEPER-204) so that we can use different backends to store metadata. Should we be looking into implementing a more general approach that fits into the MetaStore interface an enables authentication anything that supports SASL?

Provide support for ZooKeeper authentication
--------------------------------------------
Key: BOOKKEEPER-390
URL: https://issues.apache.org/jira/browse/BOOKKEEPER-390
Project: Bookkeeper
Issue Type: New Feature
Components: bookkeeper-client, bookkeeper-server
Affects Versions: 4.0.0
Reporter: Rakesh R
Assignee: Rakesh R
Attachments: BOOKKEEPER-390-Acl-draftversion.patch

This JIRA adds support for protecting the state of Bookkeeper znodes on a multi-tenant ZooKeeper cluster.

Use case: a user runs a ZK cluster in multi-tenant mode, where more than one client service shares a single ZK service instance (cluster). In this case the client services typically want to protect their data (ZK znodes) from access by other services (tenants) on the cluster. Say you are running BK, HBase or ZKFC instances, etc.; having authentication/authorization on the znodes is important both for security and for helping to ensure that services don't interact negatively (touch each other's data).

Presently Bookkeeper does not have support for authentication or authorization when accessing ZK. This should be added to the BK clients/servers that access the ZK cluster. In general it means calling addAuthInfo once after a session is established.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
Jenkins build is still unstable: bookkeeper-trunk #769
See https://builds.apache.org/job/bookkeeper-trunk/changes
[jira] [Created] (BOOKKEEPER-441) InMemorySubscriptionManager should back up top2sub2seq before changing it
Yixue (Andrew) Zhu created BOOKKEEPER-441:
------------------------------------------

Summary: InMemorySubscriptionManager should back up top2sub2seq before changing it
Key: BOOKKEEPER-441
URL: https://issues.apache.org/jira/browse/BOOKKEEPER-441
Project: Bookkeeper
Issue Type: Bug
Components: hedwig-server
Affects Versions: 4.3.0
Environment: unix
Reporter: Yixue (Andrew) Zhu
Priority: Minor
Fix For: 4.3.0

On topic loss, InMemorySubscriptionManager currently does not clear top2sub2seq. The intent is to allow readSubscription to get the information there. This introduces a dependency outside the class; the evidence is that the general ReleaseOp has to use a boolean parameter which targets this implementation detail. Further, this prevents Acquire-topic from notifying listeners (notifyFirstLocalSubscribe is not called) of the first subscription so they can act appropriately.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (BOOKKEEPER-346) Detect IOExceptions in LedgerCache and bookie should look at next ledger dir (if any)
[ https://issues.apache.org/jira/browse/BOOKKEEPER-346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483373#comment-13483373 ]

Ivan Kelly commented on BOOKKEEPER-346:
---------------------------------------

I don't like the coupling of the journal with ledger storage. It makes things hard to test and benchmark in isolation. What's more, it's unnecessary. If we have the copy sequence as above:
1) copy A.idx to A.idx.rloc
2) delete A.idx
3) rename A.idx.rloc to A.idx
the problematic case is if we crash after 2), before 3) completes. But on initialization of the LedgerCacheImpl we can scan all directories for A.idx.rloc: if A.idx exists, the copy was incomplete, so remove A.idx.rloc; if A.idx does not exist, rename A.idx.rloc to A.idx. There's no need to mess with the journal at all.

Detect IOExceptions in LedgerCache and bookie should look at next ledger dir (if any)
--------------------------------------------------------------------------------------
Key: BOOKKEEPER-346
URL: https://issues.apache.org/jira/browse/BOOKKEEPER-346
Project: Bookkeeper
Issue Type: Sub-task
Components: bookkeeper-server
Affects Versions: 4.1.0
Reporter: Rakesh R
Assignee: Vinay
Fix For: 4.2.0
Attachments: BOOKKEEPER-346.patch, BOOKKEEPER-346.patch, BOOKKEEPER-346.patch, BOOKKEEPER-346.patch, BOOKKEEPER-346.patch, BOOKKEEPER-346.patch

This jira is to detect IOExceptions in the LedgerCache and iterate over all the configured ledger dirs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
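Ivan's recovery rule translates almost directly into a startup scan. A sketch, simplified to a single directory (the real scan would cover all ledger directories) and using the A.idx/A.idx.rloc naming from his comment; everything else is assumed:

{code}
// Sketch of the startup scan Ivan describes. If A.idx still exists, the crash
// happened before the delete and the .rloc copy may be incomplete, so it is
// discarded; if A.idx is gone, the delete succeeded, so finishing the rename
// completes the move. The journal is never involved.
import java.io.File;

class RlocRecoverySketch {
    static void recover(File ledgerDir) {
        File[] rlocs = ledgerDir.listFiles((dir, name) -> name.endsWith(".idx.rloc"));
        if (rlocs == null) return;
        for (File rloc : rlocs) {
            String idxName = rloc.getName().substring(0, rloc.getName().length() - ".rloc".length());
            File idx = new File(rloc.getParentFile(), idxName);
            if (idx.exists()) {
                rloc.delete();      // crashed after 1), before 2): drop the partial copy
            } else {
                rloc.renameTo(idx); // crashed after 2), before 3): finish the move
            }
        }
    }
}
{code}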
[jira] [Commented] (BOOKKEEPER-441) InMemorySubscriptionManager should back up top2sub2seq before changing it
[ https://issues.apache.org/jira/browse/BOOKKEEPER-441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483378#comment-13483378 ]

Yixue (Andrew) Zhu commented on BOOKKEEPER-441:
-----------------------------------------------

I cannot assign the issue to myself for some reason (Edit/More Actions do not have the option).

InMemorySubscriptionManager should back up top2sub2seq before changing it
--------------------------------------------------------------------------
Key: BOOKKEEPER-441
URL: https://issues.apache.org/jira/browse/BOOKKEEPER-441
Project: Bookkeeper
Issue Type: Bug
Components: hedwig-server
Affects Versions: 4.3.0
Environment: unix
Reporter: Yixue (Andrew) Zhu
Priority: Minor
Labels: patch
Fix For: 4.3.0

On topic loss, InMemorySubscriptionManager currently does not clear top2sub2seq. The intent is to allow readSubscription to get the information there. This introduces a dependency outside the class; the evidence is that the general ReleaseOp has to use a boolean parameter which targets this implementation detail. Further, this prevents Acquire-topic from notifying listeners (notifyFirstLocalSubscribe is not called) of the first subscription so they can act appropriately.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (BOOKKEEPER-441) InMemorySubscriptionManager should back up top2sub2seq before changing it
[ https://issues.apache.org/jira/browse/BOOKKEEPER-441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yixue (Andrew) Zhu updated BOOKKEEPER-441:
------------------------------------------
Attachment: BackupTop2Sub2Seq.patch

InMemorySubscriptionManager should back up top2sub2seq before changing it
--------------------------------------------------------------------------
Key: BOOKKEEPER-441
URL: https://issues.apache.org/jira/browse/BOOKKEEPER-441
Project: Bookkeeper
Issue Type: Bug
Components: hedwig-server
Affects Versions: 4.3.0
Environment: unix
Reporter: Yixue (Andrew) Zhu
Assignee: Yixue (Andrew) Zhu
Priority: Minor
Labels: patch
Fix For: 4.3.0
Attachments: BackupTop2Sub2Seq.patch

On topic loss, InMemorySubscriptionManager currently does not clear top2sub2seq. The intent is to allow readSubscription to get the information there. This introduces a dependency outside the class; the evidence is that the general ReleaseOp has to use a boolean parameter which targets this implementation detail. Further, this prevents Acquire-topic from notifying listeners (notifyFirstLocalSubscribe is not called) of the first subscription so they can act appropriately.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (BOOKKEEPER-428) Expose command options in bookie scripts to disable/enable auto recovery temporarily
[ https://issues.apache.org/jira/browse/BOOKKEEPER-428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rakesh R updated BOOKKEEPER-428:
--------------------------------
Attachment: BOOKKEEPER-428.patch

Expose command options in bookie scripts to disable/enable auto recovery temporarily
-------------------------------------------------------------------------------------
Key: BOOKKEEPER-428
URL: https://issues.apache.org/jira/browse/BOOKKEEPER-428
Project: Bookkeeper
Issue Type: Sub-task
Components: bookkeeper-auto-recovery
Affects Versions: 4.0.0
Reporter: Rakesh R
Assignee: Rakesh R
Fix For: 4.2.0
Attachments: BOOKKEEPER-428.patch

Administrators can invoke disable/enable autorecovery options through the bookie shell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (BOOKKEEPER-428) Expose command options in bookie scripts to disable/enable auto recovery temporarily
[ https://issues.apache.org/jira/browse/BOOKKEEPER-428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483492#comment-13483492 ]

Rakesh R commented on BOOKKEEPER-428:
-------------------------------------

Attached a patch, which adds the following command options for toggling autorecovery. Could you please review? Thanks.
{code}
-d,--disable   Disable auto recovery of underreplicated ledgers
-e,--enable    Enable auto recovery of underreplicated ledgers
{code}

Expose command options in bookie scripts to disable/enable auto recovery temporarily
-------------------------------------------------------------------------------------
Key: BOOKKEEPER-428
URL: https://issues.apache.org/jira/browse/BOOKKEEPER-428
Project: Bookkeeper
Issue Type: Sub-task
Components: bookkeeper-auto-recovery
Affects Versions: 4.0.0
Reporter: Rakesh R
Assignee: Rakesh R
Fix For: 4.2.0
Attachments: BOOKKEEPER-428.patch

Administrators can invoke disable/enable autorecovery options through the bookie shell.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
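For readers wondering how a temporary, cluster-wide toggle like this can work at all: one plausible mechanism (an assumption, not necessarily what the attached patch does) is a marker znode that auto-recovery daemons check or watch. A sketch with a hypothetical path:

{code}
// Hypothetical sketch of one way such a toggle could work: the shell flips a
// marker znode and auto-recovery daemons treat its presence as "disabled".
// The path and the mechanism are assumptions, not necessarily what the patch does.
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

class AutoRecoveryToggleSketch {
    static final String DISABLE_NODE = "/demo/autorecovery-disabled"; // hypothetical path

    static void disable(ZooKeeper zk) throws Exception {
        try {
            zk.create(DISABLE_NODE, new byte[0],
                    ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
        } catch (KeeperException.NodeExistsException e) {
            // already disabled; nothing to do
        }
    }

    static void enable(ZooKeeper zk) throws Exception {
        try {
            zk.delete(DISABLE_NODE, -1); // version -1: unconditional delete
        } catch (KeeperException.NoNodeException e) {
            // already enabled; nothing to do
        }
    }
}
{code}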
[jira] [Commented] (BOOKKEEPER-441) InMemorySubscriptionManager should back up top2sub2seq before changing it
[ https://issues.apache.org/jira/browse/BOOKKEEPER-441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483741#comment-13483741 ]

Yixue (Andrew) Zhu commented on BOOKKEEPER-441:
-----------------------------------------------

4.2.0 sounds good. Will update it.

InMemorySubscriptionManager should back up top2sub2seq before changing it
--------------------------------------------------------------------------
Key: BOOKKEEPER-441
URL: https://issues.apache.org/jira/browse/BOOKKEEPER-441
Project: Bookkeeper
Issue Type: Bug
Components: hedwig-server
Affects Versions: 4.3.0
Environment: unix
Reporter: Yixue (Andrew) Zhu
Assignee: Yixue (Andrew) Zhu
Priority: Minor
Labels: patch
Fix For: 4.3.0
Attachments: BackupTop2Sub2Seq.patch

On topic loss, InMemorySubscriptionManager currently does not clear top2sub2seq. The intent is to allow readSubscription to get the information there. This introduces a dependency outside the class; the evidence is that the general ReleaseOp has to use a boolean parameter which targets this implementation detail. Further, this prevents Acquire-topic from notifying listeners (notifyFirstLocalSubscribe is not called) of the first subscription so they can act appropriately.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (BOOKKEEPER-441) InMemorySubscriptionManager should back up top2sub2seq before changing it
[ https://issues.apache.org/jira/browse/BOOKKEEPER-441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yixue (Andrew) Zhu updated BOOKKEEPER-441:
------------------------------------------
Attachment: BackupTop2Sub2Seq.patch

InMemorySubscriptionManager should back up top2sub2seq before changing it
--------------------------------------------------------------------------
Key: BOOKKEEPER-441
URL: https://issues.apache.org/jira/browse/BOOKKEEPER-441
Project: Bookkeeper
Issue Type: Bug
Components: hedwig-server
Affects Versions: 4.2.0
Environment: unix
Reporter: Yixue (Andrew) Zhu
Assignee: Yixue (Andrew) Zhu
Priority: Minor
Labels: patch
Fix For: 4.2.0
Attachments: BackupTop2Sub2Seq.patch, BackupTop2Sub2Seq.patch

On topic loss, InMemorySubscriptionManager currently does not clear top2sub2seq. The intent is to allow readSubscription to get the information there. This introduces a dependency outside the class; the evidence is that the general ReleaseOp has to use a boolean parameter which targets this implementation detail. Further, this prevents Acquire-topic from notifying listeners (notifyFirstLocalSubscribe is not called) of the first subscription so they can act appropriately.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
Review Request: Backup topic-sub inside InMemorySubscriptionManager
---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/7731/
---

Review request for bookkeeper, Ivan Kelly, Sijie Guo, and Aniruddha Laud.

Description
---
On topic loss, InMemorySubscriptionManager currently does not clear top2sub2seq. The intent is to allow readSubscription to get the information there. This introduces a dependency outside the class; the evidence is that the general ReleaseOp has to use a boolean parameter which targets this implementation detail. Further, this prevents Acquire-topic from notifying listeners (notifyFirstLocalSubscribe is not called) of the first subscription so they can act appropriately.

This change addresses the issue.

This addresses bug BOOKKEEPER-441.
https://issues.apache.org/jira/browse/BOOKKEEPER-441

Diffs
---
hedwig-server/src/main/java/org/apache/hedwig/server/subscriptions/AbstractSubscriptionManager.java 5552265
hedwig-server/src/main/java/org/apache/hedwig/server/subscriptions/InMemorySubscriptionManager.java 1400e49

Diff: https://reviews.apache.org/r/7731/diff/

Testing
---
Unit tests

Thanks,
Yixue (Andrew) Zhu
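A minimal sketch of the shape of such a fix, with assumed types and method names rather than the patch's actual code (the real change is in the two files above): move the topic's entry to a backup map on loss, and let reads fall back to it.

{code}
// Illustrative sketch: on topic loss the live entry is moved into a backup
// map instead of being left in top2sub2seq. Types and names are assumed.
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class SubStateBackupSketch<State> {
    private final Map<String, Map<String, State>> top2sub2seq = new ConcurrentHashMap<>();
    private final Map<String, Map<String, State>> backup = new ConcurrentHashMap<>();

    void lostTopic(String topic) {
        Map<String, State> subs = top2sub2seq.remove(topic); // live map is cleared...
        if (subs != null) {
            backup.put(topic, subs); // ...but the state survives for readSubscription
        }
        // With the topic gone from the live map, the next subscribe is "first"
        // again, so first-local-subscribe listeners can be notified.
    }

    Map<String, State> readSubscriptions(String topic) {
        Map<String, State> live = top2sub2seq.get(topic);
        return live != null ? live : backup.get(topic);
    }
}
{code}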
[jira] [Commented] (BOOKKEEPER-362) Local subscriptions fail if remote region is down
[ https://issues.apache.org/jira/browse/BOOKKEEPER-362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483843#comment-13483843 ]

Yixue (Andrew) Zhu commented on BOOKKEEPER-362:
-----------------------------------------------

I will change 1) and 3). As to 2), ZooKeeperServiceDown is a more general exception, not specific to this issue.

Local subscriptions fail if remote region is down
-------------------------------------------------
Key: BOOKKEEPER-362
URL: https://issues.apache.org/jira/browse/BOOKKEEPER-362
Project: Bookkeeper
Issue Type: Bug
Components: hedwig-server
Affects Versions: 4.2.0
Reporter: Aniruddha
Assignee: Yixue (Andrew) Zhu
Priority: Critical
Labels: hedwig
Attachments: 0001-Ignore-hub-client-remote-subscription-failure-if-we-.patch, rebase_remoteregion.patch

Currently, local subscriptions fail if the remote region hubs are down, even if the local hub has subscribed to the remote topic previously. Because of this, one region cannot function independently of the other. A more detailed discussion related to this can be found here: http://mail-archives.apache.org/mod_mbox/zookeeper-bookkeeper-dev/201208.mbox/%3cCAOLhyDQSOF+Y+pvnyrd-HJRq1YEr=c8ok_b3_mr81r1g-9m...@mail.gmail.com%3e

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (BOOKKEEPER-362) Local subscriptions fail if remote region is down
[ https://issues.apache.org/jira/browse/BOOKKEEPER-362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yixue (Andrew) Zhu updated BOOKKEEPER-362:
------------------------------------------
Attachment: rebase_remoteregion.patch

Local subscriptions fail if remote region is down
-------------------------------------------------
Key: BOOKKEEPER-362
URL: https://issues.apache.org/jira/browse/BOOKKEEPER-362
Project: Bookkeeper
Issue Type: Bug
Components: hedwig-server
Affects Versions: 4.2.0
Reporter: Aniruddha
Assignee: Yixue (Andrew) Zhu
Priority: Critical
Labels: hedwig
Fix For: 4.2.0
Attachments: 0001-Ignore-hub-client-remote-subscription-failure-if-we-.patch, rebase_remoteregion.patch, rebase_remoteregion.patch

Currently, local subscriptions fail if the remote region hubs are down, even if the local hub has subscribed to the remote topic previously. Because of this, one region cannot function independently of the other. A more detailed discussion related to this can be found here: http://mail-archives.apache.org/mod_mbox/zookeeper-bookkeeper-dev/201208.mbox/%3cCAOLhyDQSOF+Y+pvnyrd-HJRq1YEr=c8ok_b3_mr81r1g-9m...@mail.gmail.com%3e

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
Review Request: Track remote region subscribed status in zookeeper node under topic
---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/7732/
---

Review request for bookkeeper, Ivan Kelly and Sijie Guo.

Description
---
Ignore hub-client remote subscription failure if we have succeeded before, to handle the transient remote-region-unavailable scenario.

The following invariants are maintained:
1. Track remote region subscribed status in a zookeeper node under the topic, which is best effort for now.
2. Untrack remote region subscribed status once the last local subscription for the topic is removed.
3. The remote subscription is still attempted if there are no active local subscriptions on bootstrap, and retried until it succeeds.

This addresses bug BOOKKEEPER-362.
https://issues.apache.org/jira/browse/BOOKKEEPER-362

Diffs
---
hedwig-client/src/main/java/org/apache/hedwig/util/HedwigSocketAddress.java 8bfdada
hedwig-protocol/src/main/java/org/apache/hedwig/exceptions/PubSubException.java f2a20d0
hedwig-protocol/src/main/java/org/apache/hedwig/protocol/PubSubProtocol.java 8d8f2ac
hedwig-protocol/src/main/protobuf/PubSubProtocol.proto 7fafcce
hedwig-server/src/main/java/org/apache/hedwig/server/meta/MetadataManagerFactory.java bca37d2
hedwig-server/src/main/java/org/apache/hedwig/server/meta/RemoteSubscriptionDataManager.java PRE-CREATION
hedwig-server/src/main/java/org/apache/hedwig/server/meta/ZkMetadataManagerFactory.java e65ad78
hedwig-server/src/main/java/org/apache/hedwig/server/netty/PubSubServer.java c06f03a
hedwig-server/src/main/java/org/apache/hedwig/server/regions/HedwigHubClient.java 063a99c
hedwig-server/src/main/java/org/apache/hedwig/server/regions/HedwigHubClientFactory.java 68d317e
hedwig-server/src/main/java/org/apache/hedwig/server/regions/HedwigHubSubscriber.java 7055251
hedwig-server/src/main/java/org/apache/hedwig/server/regions/NoOpRemoteSubscriptionManager.java PRE-CREATION
hedwig-server/src/main/java/org/apache/hedwig/server/regions/RegionManager.java bae960b
hedwig-server/src/main/java/org/apache/hedwig/server/regions/ZKRemoteSubscriptionManager.java PRE-CREATION
hedwig-server/src/main/java/org/apache/hedwig/server/subscriptions/AbstractSubscriptionManager.java 5552265
hedwig-server/src/main/java/org/apache/hedwig/server/subscriptions/InMemorySubscriptionManager.java 1400e49
hedwig-server/src/main/java/org/apache/hedwig/server/subscriptions/SubscriptionEventListener.java 6c6e626
hedwig-server/src/test/java/org/apache/hedwig/server/TestRegionSubscribe.java PRE-CREATION
hedwig-server/src/test/java/org/apache/hedwig/server/meta/TestMetadataManagerFactory.java 44c30d7
hedwig-server/src/test/java/org/apache/hedwig/server/regions/TestNoOpRemoteSubManager.java PRE-CREATION
hedwig-server/src/test/java/org/apache/hedwig/server/regions/TestZkRemoteSubManager.java PRE-CREATION

Diff: https://reviews.apache.org/r/7732/diff/

Testing
---
Added unit tests + existing

Thanks,
Yixue (Andrew) Zhu
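Invariants 1 and 2 can be pictured against the plain ZooKeeper API. A sketch with an invented znode layout; the patch's ZKRemoteSubscriptionManager is the real implementation:

{code}
// Hypothetical sketch of invariants 1 and 2: record a per-region marker under
// the topic once the remote subscribe succeeds (best effort), and remove it
// when the last local subscription goes away. The znode layout is invented.
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

class RemoteSubTrackerSketch {
    private final ZooKeeper zk;
    RemoteSubTrackerSketch(ZooKeeper zk) { this.zk = zk; }

    private static String node(String topic, String region) {
        return "/demo/topics/" + topic + "/remote-subscribed/" + region; // invented layout
    }

    // Invariant 1: best-effort marker once the remote subscribe succeeds.
    void markSubscribed(String topic, String region) {
        try {
            zk.create(node(topic, region), new byte[0],
                    ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
        } catch (KeeperException | InterruptedException e) {
            // best effort: a lost marker only costs an extra resubscribe later
        }
    }

    // Invariant 2: untrack when the last local subscription is removed.
    void unmarkSubscribed(String topic, String region) throws Exception {
        try {
            zk.delete(node(topic, region), -1);
        } catch (KeeperException.NoNodeException e) {
            // already untracked
        }
    }
}
{code}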
Re: Review Request: Track remote region subscribed status in zookeeper node under topic
---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/7732/#review12752
---

hedwig-server/src/main/java/org/apache/hedwig/server/subscriptions/AbstractSubscriptionManager.java
https://reviews.apache.org/r/7732/#comment27229

    This change is covered by BOOKKEEPER-441, for which I already sent a separate code review out. It is needed here, but will be checked in separately.

- Yixue (Andrew) Zhu
[jira] [Commented] (BOOKKEEPER-346) Detect IOExceptions in LedgerCache and bookie should look at next ledger dir (if any)
[ https://issues.apache.org/jira/browse/BOOKKEEPER-346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483899#comment-13483899 ]

Vinay commented on BOOKKEEPER-346:
----------------------------------

Oh! Then I will post a patch with just the file movement using the above-mentioned steps. I hope that would be fine.

Detect IOExceptions in LedgerCache and bookie should look at next ledger dir (if any)
--------------------------------------------------------------------------------------
Key: BOOKKEEPER-346
URL: https://issues.apache.org/jira/browse/BOOKKEEPER-346
Project: Bookkeeper
Issue Type: Sub-task
Components: bookkeeper-server
Affects Versions: 4.1.0
Reporter: Rakesh R
Assignee: Vinay
Fix For: 4.2.0
Attachments: BOOKKEEPER-346.patch, BOOKKEEPER-346.patch, BOOKKEEPER-346.patch, BOOKKEEPER-346.patch, BOOKKEEPER-346.patch, BOOKKEEPER-346.patch

This jira is to detect IOExceptions in the LedgerCache and iterate over all the configured ledger dirs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira