[jira] [Updated] (ZOOKEEPER-1560) Zookeeper client hangs on creation of large nodes
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Skye Wanderman-Milne updated ZOOKEEPER-1560:
    Attachment: ZOOKEEPER-1560-v8_r4.patch

Add ClientTest.testLargeNodeData to v8_r3 patch.

Zookeeper client hangs on creation of large nodes
    Key: ZOOKEEPER-1560
    URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1560
    Project: ZooKeeper
    Issue Type: Bug
    Components: java client
    Affects Versions: 3.4.4, 3.5.0
    Reporter: Igor Motov
    Assignee: Skye Wanderman-Milne
    Fix For: 3.5.0, 3.4.5
    Attachments: ZOOKEEPER-1560.patch, zookeeper-1560-v1.txt, zookeeper-1560-v2.txt, zookeeper-1560-v3.txt, zookeeper-1560-v4.txt, zookeeper-1560-v5.txt, zookeeper-1560-v6.txt, zookeeper-1560-v7.txt, ZOOKEEPER-1560-v8.patch, ZOOKEEPER-1560-v8_r3.patch, ZOOKEEPER-1560-v8_r4.patch

To reproduce, try creating a node with 0.5M of data using java client. The test will hang waiting for a response from the server. See the attached patch for the test that reproduces the issue.

It seems that ZOOKEEPER-1437 introduced a few issues to {{ClientCnxnSocketNIO.doIO}} that prevent {{ClientCnxnSocketNIO}} from sending large packets that require several invocations of {{SocketChannel.write}} to complete.

The first issue is that the call to {{outgoingQueue.removeFirstOccurrence(p);}} removes the packet from the queue even if the packet wasn't completely sent yet. It looks to me that this call should be moved under {{if (!pbb.hasRemaining())}}

The second issue is that {{p.createBB()}} is reinitializing {{ByteBuffer}} on every iteration, which confuses {{SocketChannel.write}}.

And the third issue is caused by extra calls to {{cnxn.getXid()}} that increment xid on every iteration and confuse the server.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
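The three problems listed in the description can be illustrated with a minimal, self-contained sketch. It is not the ZooKeeper code or any of the attached patches: `Packet`, `ThrottledChannel`, `writeOnce`, and the `-1` "no xid yet" marker are hypothetical stand-ins, and only the names `outgoingQueue`, `createBB`, `removeFirstOccurrence`, and `hasRemaining` are borrowed from the report. The sketch assumes a channel that accepts only part of the buffer per call, and shows a write step that keeps the buffer, keeps the xid, and dequeues only after the last byte is sent.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.WritableByteChannel;
import java.util.ArrayDeque;
import java.util.Deque;

final class PartialWriteSketch {

    /** Hypothetical stand-in for a queued request; not ZooKeeper's Packet class. */
    static final class Packet {
        final byte[] payload;
        ByteBuffer bb;   // created once and reused, so its position survives between writes
        int xid = -1;    // assumption: -1 marks "no xid assigned yet"

        Packet(byte[] payload) { this.payload = payload; }

        void createBB() { bb = ByteBuffer.wrap(payload); }
    }

    /** Test channel that accepts at most `limit` bytes per call, like a socket with a small send buffer. */
    static final class ThrottledChannel implements WritableByteChannel {
        final ByteArrayOutputStream sink = new ByteArrayOutputStream();
        private final int limit;

        ThrottledChannel(int limit) { this.limit = limit; }

        @Override public int write(ByteBuffer src) {
            int n = Math.min(limit, src.remaining());
            byte[] chunk = new byte[n];
            src.get(chunk);            // advances src.position() by n
            sink.write(chunk, 0, n);
            return n;
        }
        @Override public boolean isOpen() { return true; }
        @Override public void close() { }
    }

    final Deque<Packet> outgoingQueue = new ArrayDeque<>();
    private int nextXid = 1;

    /** One write attempt on the head packet; returns true once that packet is fully sent. */
    boolean writeOnce(WritableByteChannel sock) throws IOException {
        Packet p = outgoingQueue.peekFirst();
        if (p == null) {
            return false;
        }
        if (p.xid == -1) {
            p.xid = nextXid++;        // assign the xid once, not on every iteration
        }
        if (p.bb == null) {
            p.createBB();             // build the buffer once, not on every iteration
        }
        sock.write(p.bb);             // may send only part of the buffer
        if (!p.bb.hasRemaining()) {
            outgoingQueue.removeFirstOccurrence(p);  // dequeue only after a complete send
            return true;
        }
        return false;
    }

    public static void main(String[] args) throws IOException {
        PartialWriteSketch cnxn = new PartialWriteSketch();
        cnxn.outgoingQueue.add(new Packet(new byte[5000]));
        ThrottledChannel sock = new ThrottledChannel(2000);
        int calls = 0;
        while (!cnxn.outgoingQueue.isEmpty()) {
            cnxn.writeOnce(sock);
            calls++;
        }
        System.out.println(calls + " writes, " + sock.sink.size() + " bytes sent");
    }
}
```

With a 5000-byte payload and a 2000-byte limit, the loop in `main` needs three calls to `writeOnce`, and the packet stays at the head of the queue with the same buffer and xid until the third call completes it.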
[jira] [Updated] (ZOOKEEPER-1560) Zookeeper client hangs on creation of large nodes
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Skye Wanderman-Milne updated ZOOKEEPER-1560:
    Attachment: ZOOKEEPER-1560-v8.patch

I've created a new patch (ZOOKEEPER-1560-v8.patch) that incorporates what we have so far (moving removeFirstOccurrence to after the packet is completely written, only calling createBB when a BB doesn't already exist, and only calling setXid when no xid is already set). It also modifies findSendablePacket to always choose the first packet if it is partially written. The only place that a packet is prepended to outgoingQueue is ClientCnxn.primeConnection, which should only happen at the very beginning, so a partially-written packet should remain at the beginning of the queue until it is removed.

I also cleaned up some of the code so the changes look more extensive than they really are :)

Posted at https://reviews.apache.org/r/7730. I added comments to mark the important parts (as opposed to the clean up).

Zookeeper client hangs on creation of large nodes
    Key: ZOOKEEPER-1560
    URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1560
    Project: ZooKeeper
    Issue Type: Bug
    Components: java client
    Affects Versions: 3.4.4, 3.5.0
    Reporter: Igor Motov
    Assignee: Ted Yu
    Fix For: 3.5.0, 3.4.5
    Attachments: ZOOKEEPER-1560.patch, zookeeper-1560-v1.txt, zookeeper-1560-v2.txt, zookeeper-1560-v3.txt, zookeeper-1560-v4.txt, zookeeper-1560-v5.txt, zookeeper-1560-v6.txt, zookeeper-1560-v7.txt, ZOOKEEPER-1560-v8.patch
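The rule that a partially written packet at the head of the queue must always be chosen next can be sketched as follows. This is a hypothetical illustration and not the code from ZOOKEEPER-1560-v8.patch: the message does not describe the rest of findSendablePacket, so the `primingOnly` flag and the `priming` field are assumptions introduced only to give the selector a reason to skip ahead. A non-null `bb` is taken to mean that a write of that packet has already started.

```java
import java.nio.ByteBuffer;
import java.util.Iterator;
import java.util.LinkedList;

final class SendablePacketSketch {

    /** Hypothetical packet; `priming` is an assumed marker, not a ZooKeeper field. */
    static final class Packet {
        final String name;
        final boolean priming;
        ByteBuffer bb;  // non-null once a write of this packet has started

        Packet(String name, boolean priming) {
            this.name = name;
            this.priming = priming;
        }
    }

    /**
     * Picks the next packet to write. A head packet whose buffer already exists
     * is returned unconditionally, so a half-sent packet is never interleaved
     * with bytes from another one.
     */
    static Packet findSendablePacket(LinkedList<Packet> queue, boolean primingOnly) {
        if (queue.isEmpty()) {
            return null;
        }
        Packet first = queue.getFirst();
        if (first.bb != null || !primingOnly) {
            return first;  // finish what was started, or nothing restricts the choice
        }
        // Assumed special case: only priming packets may go out right now.
        Iterator<Packet> it = queue.iterator();
        while (it.hasNext()) {
            Packet p = it.next();
            if (p.priming) {
                it.remove();
                queue.addFirst(p);  // move it to the head so it stays first until fully sent
                return p;
            }
        }
        return null;  // nothing sendable under the current restriction
    }

    public static void main(String[] args) {
        LinkedList<Packet> queue = new LinkedList<>();
        Packet create = new Packet("create", false);
        create.bb = ByteBuffer.allocate(16);  // pretend a write is already in progress
        queue.add(create);
        queue.add(new Packet("prime", true));
        System.out.println(findSendablePacket(queue, true).name);
    }
}
```

In `main`, the head packet already has a buffer, so it is selected even though the restriction would otherwise favour the priming packet.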
[jira] [Updated] (ZOOKEEPER-1560) Zookeeper client hangs on creation of large nodes
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated ZOOKEEPER-1560:
    Attachment: zookeeper-1560-v5.txt

From https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1215//testReport/org.apache.zookeeper.test/ClientTest/testLargeNodeData/ :
{code}
2012-10-12 14:10:50,042 [myid:] - WARN [main-SendThread(localhost:11221):ClientCnxn$SendThread@1089] - Session 0x13a555031cf for server localhost/127.0.0.1:11221, unexpected error, closing socket connection and attempting reconnect
java.io.IOException: Couldn't write 2000 bytes, 1152 bytes written
    at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:142)
    at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:370)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
2012-10-12 14:10:50,044 [myid:] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11221:NIOServerCnxn@349] - caught end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid 0x13a555031cf, likely client has closed socket
    at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:220)
    at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
    at java.lang.Thread.run(Thread.java:662)
{code}
Patch v5 adds more information to the exception message.

Zookeeper client hangs on creation of large nodes
    Key: ZOOKEEPER-1560
    URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1560
    Project: ZooKeeper
    Issue Type: Bug
    Components: java client
    Affects Versions: 3.4.4, 3.5.0
    Reporter: Igor Motov
    Assignee: Ted Yu
    Fix For: 3.5.0, 3.4.5
    Attachments: ZOOKEEPER-1560.patch, zookeeper-1560-v1.txt, zookeeper-1560-v2.txt, zookeeper-1560-v3.txt, zookeeper-1560-v4.txt, zookeeper-1560-v5.txt
[jira] [Updated] (ZOOKEEPER-1560) Zookeeper client hangs on creation of large nodes
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated ZOOKEEPER-1560:
    Attachment: zookeeper-1560-v6.txt

Patch v6 changes the condition for raising an IOE: it is now raised only if there is no progress between successive sock.write() calls. I guess the socket's output buffer might be a limiting factor on the number of bytes written in a particular sock.write() call.

Zookeeper client hangs on creation of large nodes
    Key: ZOOKEEPER-1560
    URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1560
    Project: ZooKeeper
    Issue Type: Bug
    Components: java client
    Affects Versions: 3.4.4, 3.5.0
    Reporter: Igor Motov
    Assignee: Ted Yu
    Fix For: 3.5.0, 3.4.5
    Attachments: ZOOKEEPER-1560.patch, zookeeper-1560-v1.txt, zookeeper-1560-v2.txt, zookeeper-1560-v3.txt, zookeeper-1560-v4.txt, zookeeper-1560-v5.txt, zookeeper-1560-v6.txt
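The point about the socket's output buffer can be shown with a small sketch that treats a short write, or even a zero-byte write, as "try again later" instead of an error. This is not the logic of patch v6: `ScriptedChannel`, `Outcome`, and `attempt` are hypothetical names, and the per-call byte counts are chosen to mirror the "Couldn't write 2000 bytes, 1152 bytes written" message from the build log in this thread. In a real non-blocking client the retry would normally wait until the selector reports the channel writable again.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.WritableByteChannel;

final class ShortWriteSketch {

    /** Fake channel that accepts a scripted number of bytes on each call. */
    static final class ScriptedChannel implements WritableByteChannel {
        private final int[] script;
        private int call = 0;

        ScriptedChannel(int... script) { this.script = script; }

        @Override public int write(ByteBuffer src) {
            int allowed = call < script.length ? script[call] : src.remaining();
            call++;
            int n = Math.min(allowed, src.remaining());
            src.position(src.position() + n);  // consume n bytes, as a real channel would
            return n;
        }
        @Override public boolean isOpen() { return true; }
        @Override public void close() { }
    }

    enum Outcome { DONE, RETRY_LATER }

    /** One write attempt; leftover bytes stay in the buffer for the next attempt. */
    static Outcome attempt(WritableByteChannel ch, ByteBuffer bb) throws IOException {
        ch.write(bb);
        return bb.hasRemaining() ? Outcome.RETRY_LATER : Outcome.DONE;
    }

    public static void main(String[] args) throws IOException {
        ByteBuffer bb = ByteBuffer.allocate(2000);
        // First call takes 1152 bytes, second takes none, third takes the rest.
        ScriptedChannel ch = new ScriptedChannel(1152, 0, 848);
        Outcome outcome;
        do {
            outcome = attempt(ch, bb);
            System.out.println(outcome + ", remaining=" + bb.remaining());
        } while (outcome != Outcome.DONE);
    }
}
```

Because the same `ByteBuffer` is passed to every attempt, its position records how much has already gone out, so no separate progress counter is needed.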
[jira] [Updated] (ZOOKEEPER-1560) Zookeeper client hangs on creation of large nodes
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated ZOOKEEPER-1560:
    Attachment: zookeeper-1560-v7.txt

Patch v7 changes the IOE to a warning. Let's see if the test is able to make further progress. I wonder whether 77152 bytes would be big enough for most use cases.

Zookeeper client hangs on creation of large nodes
    Key: ZOOKEEPER-1560
    URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1560
    Project: ZooKeeper
    Issue Type: Bug
    Components: java client
    Affects Versions: 3.4.4, 3.5.0
    Reporter: Igor Motov
    Assignee: Ted Yu
    Fix For: 3.5.0, 3.4.5
    Attachments: ZOOKEEPER-1560.patch, zookeeper-1560-v1.txt, zookeeper-1560-v2.txt, zookeeper-1560-v3.txt, zookeeper-1560-v4.txt, zookeeper-1560-v5.txt, zookeeper-1560-v6.txt, zookeeper-1560-v7.txt
[jira] [Updated] (ZOOKEEPER-1560) Zookeeper client hangs on creation of large nodes
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Igor Motov updated ZOOKEEPER-1560:
    Affects Version/s: (was: 3.4.3) 3.4.4

Zookeeper client hangs on creation of large nodes
    Key: ZOOKEEPER-1560
    URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1560
    Project: ZooKeeper
    Issue Type: Bug
    Components: java client
    Affects Versions: 3.4.4, 3.5.0
    Reporter: Igor Motov
    Attachments: ZOOKEEPER-1560.patch
[jira] [Updated] (ZOOKEEPER-1560) Zookeeper client hangs on creation of large nodes
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Igor Motov updated ZOOKEEPER-1560:
    Attachment: ZOOKEEPER-1560.patch

Test that reproduces the issue.

Zookeeper client hangs on creation of large nodes
    Key: ZOOKEEPER-1560
    URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1560
    Project: ZooKeeper
    Issue Type: Bug
    Components: java client
    Affects Versions: 3.4.3, 3.5.0
    Reporter: Igor Motov
    Attachments: ZOOKEEPER-1560.patch