Re: svn commit: r1660498 - /tomcat/trunk/java/org/apache/tomcat/util/net/Nio2Endpoint.java
2015-02-17 22:52 GMT+01:00 Mark Thomas ma...@apache.org:

> On 17/02/2015 21:02, ma...@apache.org wrote:
>> Author: markt
>> Date: Tue Feb 17 21:02:09 2015
>> New Revision: 1660498
>>
>> URL: http://svn.apache.org/r1660498
>> Log:
>> Possible fix for occasional NIO2 CI failures. Without the sync it is possible for a write registration to get lost.
>
> I still see the error but less frequently. So I think this patch is a step in the right direction. The logs still indicate that a write registration is being lost somewhere so my plan is to continue the code review.

I'm a bit puzzled as to why blocking is related to that.

Anyway, the last time this failure occurred with the same symptom, it was caused by SecureNio2Channel: r1586789. Since that class wasn't changed during the refactoring, I don't think it needs to be suspected this time.

Following the NPE fix I made, there are signs of double closing of the socket. I doubt this can be avoided simply by adding a null check.

17-Feb-2015 22:00:20.936 WARNING [https-nio2-127.0.0.1-auto-2-Acceptor-0] org.apache.tomcat.util.net.AbstractEndpoint.countDownConnection Incorrect connection count, multiple socket.close called on the same socket.

Rémy
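To illustrate the point about the null check: a check-then-act test (null or otherwise) is not atomic, so two threads can both pass it and both close. The following is a minimal sketch of an idempotent close using an atomic flag. It is not Tomcat code and not the fix applied in trunk; `CloseOnceSketch`, `TrackedSocket` and `connectionCount` are hypothetical names used only for illustration, and the sketch does not explain why two code paths attempt the close in the first place.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

class CloseOnceSketch {
    // Hypothetical stand-in for an endpoint's connection counter.
    static final AtomicInteger connectionCount = new AtomicInteger();

    // Hypothetical wrapper; only models the bookkeeping, not real I/O.
    static final class TrackedSocket {
        private final AtomicBoolean closed = new AtomicBoolean(false);

        TrackedSocket() {
            connectionCount.incrementAndGet();
        }

        /** Returns true only for the single call that performs the close. */
        boolean close() {
            // compareAndSet is atomic: exactly one caller sees false -> true.
            if (!closed.compareAndSet(false, true)) {
                return false; // already closed, do not count down again
            }
            connectionCount.decrementAndGet();
            return true;
        }
    }

    public static void main(String[] args) {
        TrackedSocket socket = new TrackedSocket();
        System.out.println(socket.close());        // first close does the work
        System.out.println(socket.close());        // second close is a no-op
        System.out.println(connectionCount.get()); // counter is back where it started
    }
}
```

A guard like this would only hide the warning; if the double close is a symptom of two paths racing to clean up the same socket, that race would still need to be found.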
Re: svn commit: r1660498 - /tomcat/trunk/java/org/apache/tomcat/util/net/Nio2Endpoint.java
On 18/02/2015 09:19, Rémy Maucherat wrote:
> 2015-02-17 22:52 GMT+01:00 Mark Thomas ma...@apache.org:
>> On 17/02/2015 21:02, ma...@apache.org wrote:
>>> Author: markt
>>> Date: Tue Feb 17 21:02:09 2015
>>> New Revision: 1660498
>>>
>>> URL: http://svn.apache.org/r1660498
>>> Log:
>>> Possible fix for occasional NIO2 CI failures. Without the sync it is possible for a write registration to get lost.
>>
>> I still see the error but less frequently. So I think this patch is a step in the right direction. The logs still indicate that a write registration is being lost somewhere so my plan is to continue the code review.
>
> I'm a bit puzzled as to why blocking is related to that.

I was too. I'm beginning to think that what looked like a less frequent occurrence of the error was just random effects. I'm leaning towards reverting this patch.

> Anyway, the last time this failure occurred with the same symptom, it was caused by SecureNio2Channel: r1586789. Since that class wasn't changed during the refactoring, I don't think it needs to be suspected this time.
>
> Following the NPE fix I made, there are signs of double closing of the socket. I doubt this can be avoided simply by adding a null check.
>
> 17-Feb-2015 22:00:20.936 WARNING [https-nio2-127.0.0.1-auto-2-Acceptor-0] org.apache.tomcat.util.net.AbstractEndpoint.countDownConnection Incorrect connection count, multiple socket.close called on the same socket.

That might be a different issue. I'm not sure.

I'm fairly confident that the problem we are seeing with TestWebSocketFrameClientSSL is related to a write registration not happening / getting lost. The symptom is that the server just stops writing, the client times out after 60s and the test fails.

I found a few places where this might be going wrong but - much like the commit above - I'm not convinced that the affected code path is used at all - let alone used in this test.

I have a few other ideas about where it might be going wrong that I want to look at today. If those don't pan out it will be back to adding debug log statements.

If folks want to follow along then I'll be using this branch in my fork of the Tomcat trunk git mirror:
https://github.com/markt-asf/tomcat/tree/linux-debug

At the moment that branch is an exact copy of trunk. I am just running the unit tests in a loop waiting for the first failure.

Mark

To unsubscribe, e-mail: dev-unsubscr...@tomcat.apache.org
For additional commands, e-mail: dev-h...@tomcat.apache.org
Re: svn commit: r1660498 - /tomcat/trunk/java/org/apache/tomcat/util/net/Nio2Endpoint.java
2015-02-18 10:39 GMT+01:00 Mark Thomas ma...@apache.org:

> I'm fairly confident that the problem we are seeing with TestWebSocketFrameClientSSL is related to a write registration not happening / getting lost. The symptom is that the server just stops writing, the client times out after 60s and the test fails.

There's one failure on the last run, but it's on the non-SSL variant of the test, and it doesn't time out.

Testcase: testConnectToServerEndpoint took 23.317 sec
FAILED
expected:10 but was:81754
junit.framework.AssertionFailedError: expected:10 but was:81754
at org.apache.tomcat.websocket.TestWebSocketFrameClient.testConnectToServerEndpoint(TestWebSocketFrameClient.java:76)

Rémy
Re: svn commit: r1660498 - /tomcat/trunk/java/org/apache/tomcat/util/net/Nio2Endpoint.java
On 18/02/2015 09:47, Rémy Maucherat wrote:
> 2015-02-18 10:39 GMT+01:00 Mark Thomas ma...@apache.org:
>> I'm fairly confident that the problem we are seeing with TestWebSocketFrameClientSSL is related to a write registration not happening / getting lost. The symptom is that the server just stops writing, the client times out after 60s and the test fails.
>
> There's one failure on the last run, but it's on the non-SSL variant of the test, and it doesn't time out.
>
> Testcase: testConnectToServerEndpoint took 23.317 sec
> FAILED
> expected:10 but was:81754
> junit.framework.AssertionFailedError: expected:10 but was:81754
> at org.apache.tomcat.websocket.TestWebSocketFrameClient.testConnectToServerEndpoint(TestWebSocketFrameClient.java:76)

Looks like we have multiple issues to track down then. Right now I'm having difficulty reproducing the problem that results in a timeout.

Mark
Re: svn commit: r1660498 - /tomcat/trunk/java/org/apache/tomcat/util/net/Nio2Endpoint.java
On 17/02/2015 21:02, ma...@apache.org wrote:
> Author: markt
> Date: Tue Feb 17 21:02:09 2015
> New Revision: 1660498
>
> URL: http://svn.apache.org/r1660498
> Log:
> Possible fix for occasional NIO2 CI failures. Without the sync it is possible for a write registration to get lost.

I still see the error but less frequently. So I think this patch is a step in the right direction. The logs still indicate that a write registration is being lost somewhere so my plan is to continue the code review.

Mark
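For readers unfamiliar with this failure mode, here is a minimal sketch of how a write registration can get lost without a sync, and how a shared lock closes the window. It is not the code from r1660498 or from Nio2Endpoint; the class, fields and methods (`WriteRegistrationSketch`, `writeInProgress`, `writeInterest`, and so on) are hypothetical, and the thread itself leaves open whether this is the actual cause of the CI failures. Without the lock, the registering thread can see a write still in flight, the completion thread can then finish and see no interest registered, and only afterwards is the interest flag set, so nobody is ever notified.

```java
class WriteRegistrationSketch {
    private final Object lock = new Object();
    private boolean writeInProgress = true;  // a write is currently in flight
    private boolean writeInterest = false;   // someone wants a "writable" callback
    private int notifications = 0;           // callbacks actually delivered

    /** Caller asks to be told when the socket can be written to again. */
    void registerWriteInterest() {
        synchronized (lock) { // check and set must be one atomic step
            if (writeInProgress) {
                writeInterest = true; // the completion path will notify
            } else {
                notifications++;      // already writable: notify right away
            }
        }
    }

    /** Called on another thread when the in-flight write completes. */
    void writeCompleted() {
        synchronized (lock) { // same lock, so the two paths cannot interleave
            writeInProgress = false;
            if (writeInterest) {
                writeInterest = false;
                notifications++;
            }
        }
    }

    int notificationCount() {
        synchronized (lock) {
            return notifications;
        }
    }

    /** Races the two paths repeatedly; counts rounds without exactly one callback. */
    static int countLost(int rounds) {
        int lost = 0;
        for (int i = 0; i < rounds; i++) {
            WriteRegistrationSketch s = new WriteRegistrationSketch();
            Thread a = new Thread(s::registerWriteInterest);
            Thread b = new Thread(s::writeCompleted);
            a.start();
            b.start();
            try {
                a.join();
                b.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
            if (s.notificationCount() != 1) {
                lost++;
            }
        }
        return lost;
    }

    public static void main(String[] args) {
        System.out.println("lost registrations: " + countLost(200));
    }
}
```

With both methods holding the same lock, every round delivers exactly one notification whichever thread runs first. Removing the `synchronized` blocks reopens the window described above, although the loss would then be timing-dependent and might show up only rarely, much like the intermittent CI failures.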