[JBoss-user] [Clustering/JBoss] - Re: silent TCP disconnect not detected
Well, if you're into src code, you could simply fetch TCP and ConnectionTable from CVS, and diff against 2.2.9.1 (or whatever you're running). Otherwise, you could also fetch the entire 2.3alpha from CVS, build it yourself and try with the new JAR. Or I could send you the JAR, I can do a build in 2 seconds here. If you prefer the latter, send me (bela at jboss dot com) an email and I'll reply with the new JAR file.
View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3924335#3924335
Reply to the post : http://www.jboss.com/index.html?module=bb&op=posting&mode=reply&p=3924335
---
This SF.net email is sponsored by: Splunk Inc. Do you grep through log files for problems? Stop! Download the new AJAX search engine that makes searching your log files as easy as surfing the web. DOWNLOAD SPLUNK! http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=121642
___
JBoss-user mailing list
JBoss-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/jboss-user
[JBoss-user] [Clustering/JBoss] - Re: silent TCP disconnect not detected
Hi Bela, I'm not sure how to obtain the change made for JGRP-185 - the fixed version is 2.3, but this doesn't seem to be available for download yet. This question might not belong in this forum, if there is a better place to have it answered please let me know. Thanks Anne
View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3924331#3924331
[JBoss-user] [Clustering/JBoss] - Re: silent TCP disconnect not detected
Yes. Try it out and let me know whether this works. Ran it in the debugger and it did, but feedback is welcome.
View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3919412#3919412
[JBoss-user] [Clustering/JBoss] - Re: silent TCP disconnect not detected
I was thinking that I would like to use it in addition to FD, but on further consideration I'm thinking you're right and it's probably unnecessary. With JGRP-185 that you just checked in, would I see the following behaviour?
- firewall starts dropping packets
- FD kicks in and member is declared suspect
- connection to suspect member is closed
- a new connection is attempted and is successful, member joins group again
If so, then I guess that is exactly what I need! Thanks Anne
View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3919411#3919411
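A rough Java sketch of the close-and-reconnect sequence listed above may help picture it. This is not the actual JGRP-185 change to TCP and ConnectionTable; the class `ReconnectOnSuspectSketch`, its `Conn` stand-in and both method names are made up for illustration, and no real sockets are involved.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not JGroups code.
final class ReconnectOnSuspectSketch {

    // Stand-in for a real socket connection to one cluster member.
    static final class Conn {
        final int id;
        boolean open = true;
        Conn(int id) { this.id = id; }
    }

    private final Map<String, Conn> table = new HashMap<>();
    private int nextId = 1;

    // Returns the cached connection, or "opens" a new one if none is cached.
    Conn getOrCreate(String member) {
        Conn c = table.get(member);
        if (c == null) {
            c = new Conn(nextId++);
            table.put(member, c);
        }
        return c;
    }

    // Called when failure detection suspects a member: close and forget the
    // connection so that the next send has to open a fresh one.
    void onSuspect(String member) {
        Conn stale = table.remove(member);
        if (stale != null) {
            stale.open = false;
        }
    }

    public static void main(String[] args) {
        ReconnectOnSuspectSketch t = new ReconnectOnSuspectSketch();
        Conn first = t.getOrCreate("B");
        t.onSuspect("B");
        Conn second = t.getOrCreate("B");
        System.out.println("first open: " + first.open + ", second id: " + second.id);
    }
}
```

The point of the sketch is only that the stale connection is dropped from the table on suspicion, so the next message to that member cannot reuse the socket the firewall has blacklisted.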
[JBoss-user] [Clustering/JBoss] - Re: silent TCP disconnect not detected
So you want me to do pinging on the TCP connection, and a missed heartbeat would close the connection, *before* FD detects the connection loss and generates a view change? What's the diff? Why would you use this rather than FD?
View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3919402#3919402
[JBoss-user] [Clustering/JBoss] - Re: silent TCP disconnect not detected
Super - thanks Bela! Do you know if there are any existing JBoss config parameters that would accomplish the application-level tcp keepalive that I was mentioning above? Reason being, ideally I would like to have this scenario detected, and the connection dropped and re-created before the far-end member is declared suspect. Thanks in advance Anne
View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3919380#3919380
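For readers wondering what such an application-level keepalive might look like, here is a minimal Java sketch. It is not an existing JBoss or JGroups parameter; `HeartbeatWatchdogSketch` and the 15-second silence window are assumptions chosen only so that the window is shorter than the roughly 45 seconds that FD with timeout=5s and max_tries=9 (the values quoted elsewhere in this thread) would take to suspect a member.

```java
// Hypothetical illustration only; not a JGroups or JBoss feature.
final class HeartbeatWatchdogSketch {
    private final long maxSilenceMillis;
    private long lastAckMillis;

    HeartbeatWatchdogSketch(long maxSilenceMillis, long nowMillis) {
        this.maxSilenceMillis = maxSilenceMillis;
        this.lastAckMillis = nowMillis;
    }

    // Call whenever any reply arrives from the peer over this connection.
    void recordAck(long nowMillis) {
        lastAckMillis = nowMillis;
    }

    // True once the peer has been silent for longer than the allowed window,
    // i.e. the connection should be closed and re-created.
    boolean shouldRecycle(long nowMillis) {
        return nowMillis - lastAckMillis > maxSilenceMillis;
    }

    public static void main(String[] args) {
        long fdWindowMillis = 5_000L * 9; // FD timeout=5s, max_tries=9
        HeartbeatWatchdogSketch w = new HeartbeatWatchdogSketch(15_000L, 0L);
        w.recordAck(5_000L);
        System.out.println("recycle after 10 s of silence: " + w.shouldRecycle(15_000L));
        System.out.println("recycle after 15.001 s of silence: " + w.shouldRecycle(20_001L));
        System.out.println("fires before FD would suspect: " + (15_000L < fdWindowMillis));
    }
}
```

The watchdog tracks replies received rather than data sent, which is what lets it notice a connection that still accepts writes but never delivers them.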
[JBoss-user] [Clustering/JBoss] - Re: silent TCP disconnect not detected
Okay, done (TCP and ConnectionTable)
View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3919379#3919379
[JBoss-user] [Clustering/JBoss] - Re: silent TCP disconnect not detected
Okay, got you. I created http://jira.jboss.com/jira/browse/JGRP-185, and am fixing it right now. This will be in CVS in 10 minutes.
View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3919367#3919367
[JBoss-user] [Clustering/JBoss] - Re: silent TCP disconnect not detected
Hi Bela, We do use FD, and the behaviour seems to be as follows:
- firewall starts dropping packets belonging to the cluster TCP connection
- FD kicks in, heartbeats are sent to neighbor but no ACKs are received
- max_tries is finally reached, and the neighbor is deemed suspect (we do have a VERIFY_SUSPECT here, but as above, no ACK is received)
- since that same TCP connection is still being used, we get stuck in a state where JBoss thinks the neighbor is down.
I am looking for a way to have JBoss close the socket it's using for clustering traffic and open a new one. Thanks Anne
View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3919337#3919337
[JBoss-user] [Clustering/JBoss] - Re: silent TCP disconnect not detected
If you use FD rather than FD_SOCK, the reject rule of the FW will discard packets, therefore heartbeats sent by FD won't be received, and the connection should be closed. Does that work for you?
View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3919253#3919253
[JBoss-user] [Clustering/JBoss] - Re: silent TCP disconnect not detected
Interesting. So with what the firewall is doing, JGroups must not be seeing any exceptions on the Socket, and thus doesn't close the connection. I've been trying to think of a workaround involving the conn_expire_time property of TCP (see http://wiki.jboss.org/wiki/Wiki.jsp?page=JGroupsTCP) but it has two flaws: 1) it doesn't work if there is continuous traffic over the connection, and 2) the connection would need to be recycled every few seconds if FD is used. Using FD_SOCK shouldn't help either; eventually the firewall will cut the main TCP connection. This won't cause suspicions any more, but messages still won't get through -- that's actually worse. Will have to get back to you on this one :(. AFAICT, there are no simple hooks in the TCP protocol code where you can trigger a connection recycle.
View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3919226#3919226
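The first flaw can be pictured with a small Java sketch of idle-based expiry. It assumes that expiry is driven by the time since the connection was last used for sending; it is not the JGroups reaper code, and `IdleExpirySketch`, the 60-second expiry and the 5-second heartbeat spacing are illustrative choices.

```java
// Hypothetical illustration only; not the JGroups connection reaper.
final class IdleExpirySketch {
    private final long expireMillis;
    private long lastUsedMillis;

    IdleExpirySketch(long expireMillis, long nowMillis) {
        this.expireMillis = expireMillis;
        this.lastUsedMillis = nowMillis;
    }

    // Every send refreshes the idle timer, whether or not the peer receives it.
    void recordSend(long nowMillis) {
        lastUsedMillis = nowMillis;
    }

    // True once nothing has been sent for longer than the expiry time.
    boolean isExpired(long nowMillis) {
        return nowMillis - lastUsedMillis > expireMillis;
    }

    public static void main(String[] args) {
        IdleExpirySketch conn = new IdleExpirySketch(60_000L, 0L);
        boolean everExpired = false;
        // Ten minutes of heartbeats, one every 5 seconds.
        for (long t = 5_000L; t <= 600_000L; t += 5_000L) {
            everExpired |= conn.isExpired(t);
            conn.recordSend(t);
        }
        System.out.println("expired while heartbeats were flowing: " + everExpired);
        System.out.println("expired after going quiet: " + conn.isExpired(600_000L + 60_001L));
    }
}
```

Because each outgoing heartbeat resets the timer, a connection whose packets are silently dropped never looks idle, and pushing the expiry below the heartbeat interval leads straight to the second flaw of recycling every few seconds.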
[JBoss-user] [Clustering/JBoss] - Re: silent TCP disconnect not detected
Sorry, I should have provided more details in my original post - we use FD (not FD_SOCK) with timeout=5s and max_tries=9. We use TCP as the transport protocol. The firewall removes connections that have been established for more than 4 hours. The firewall accomplishes this by removing the connection from its "pass" list, causing all further packets across that connection to be dropped. This manifests as a loss of visibility to the members in the cluster. I can recover from this by lowering the OS tcp_keepalive parameter, so that the TCP connection will time out and be destroyed by the OS before JBoss failure detection causes the remote member to be deemed suspect. When the connection is destroyed by the OS, JBoss creates a new TCP connection and is able to reach the other member of the cluster. However, lowering the OS tcp_keepalive is not acceptable as a permanent solution. I was hoping that JBoss might have a configuration parameter to achieve this timeout behaviour at the application level for sockets created by JBoss. I hope that's a little better, sorry for the confusion :) Thanks Anne
View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3919219#3919219
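As background on why the OS setting is the lever here: in plain Java a socket can only switch TCP keepalive on or off through `Socket.setKeepAlive`, while the probe timing comes from the operating system. The sketch below shows just that call on a loopback connection; `KeepAliveSketch` and `openWithKeepAlive` are illustrative names, and nothing here is taken from JBoss or JGroups.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical illustration only; not JBoss or JGroups code.
final class KeepAliveSketch {

    // Opens a client socket with SO_KEEPALIVE switched on. How long the
    // connection must be idle before probes start is decided by the OS.
    static Socket openWithKeepAlive(InetAddress host, int port) throws IOException {
        Socket s = new Socket(host, port);
        s.setKeepAlive(true);
        return s;
    }

    public static void main(String[] args) throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // A throwaway listener on an ephemeral port so the client can connect.
        try (ServerSocket server = new ServerSocket(0, 1, loopback);
             Socket client = openWithKeepAlive(loopback, server.getLocalPort())) {
            System.out.println("SO_KEEPALIVE enabled: " + client.getKeepAlive());
        }
    }
}
```

Enabling the option per socket therefore does not by itself shorten the timeout; that is why the workaround described above had to change the OS-wide value.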
[JBoss-user] [Clustering/JBoss] - Re: silent TCP disconnect not detected
I want to be sure I understand what connection you're talking about. A connection opened by the TCP protocol for normal message traffic? The protocol itself should be able to handle that, so if it's not able to handle the firewall breaking the connection that's one issue. I interpreted your first post to be about the connection opened by the FD_SOCK protocol, which is opened and then sits idle for hours. If that connection gets broken, a suspect event will be sent up the stack, but then VERIFY_SUSPECT should kick in, send a new packet over the regular TCP connection, and see that the other member isn't really dead. At that point FD_SOCK should open a new connection.
View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3918759#3918759
[JBoss-user] [Clustering/JBoss] - Re: silent TCP disconnect not detected
Hi Brian, Yes, we are using VERIFY_SUSPECT. But the "ping" sent by VERIFY_SUSPECT seems to use the same socket as all the other clustering traffic. I suppose I need JBoss to question the integrity of the connection: drop the connection and attempt a reconnection before suspecting the member. Thanks Anne
View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3918751#3918751
[JBoss-user] [Clustering/JBoss] - Re: silent TCP disconnect not detected
Are you using VERIFY_SUSPECT? (http://wiki.jboss.org/wiki/Wiki.jsp?page=JGroupsVERIFY_SUSPECT). You mention wanting to do this at the application level, so maybe I'm misunderstanding what you want.
View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3918733#3918733