[jboss-user] [Clustering/JBoss] - Re: strace shows futex

2008-12-22 Thread bstansbe...@jboss.com
"mohitanchlia" wrote : Such a great explanation. Why don't I get this from docs :) I know it takes so much keep up with the docs. http://www.jboss.org/file-access/default/members/jbossas/freezone/docs/Clustering_Guide/4/html/ch05s11s04.html is pretty close. :-) It lacks some of the implementati

[jboss-user] [Clustering/JBoss] - Re: strace shows futex

2008-12-19 Thread b...@jboss.com
VIEW_SYNC is discussed in http://javagroups.cvs.sourceforge.net/viewvc/javagroups/JGroups/doc/design/ReliableViewInstallation.txt?revision=1.1&view=markup. To ask JGroups-related questions, your best bet is the JGroups mailing list.

[jboss-user] [Clustering/JBoss] - Re: strace shows futex

2008-12-19 Thread mohitanchlia
Such a great explanation. Why don't I get this from docs :) I know it takes so much to keep up with the docs. So if I understand correctly, in the above example, if for some reason the election policy chooses Node D as master, then the view will be {D,C} instead of {C,D}. Is there a way in JMX Console to see

[jboss-user] [Clustering/JBoss] - Re: strace shows futex

2008-12-18 Thread bstansbe...@jboss.com
I just pinged the JGroups folks again. You can also try the JGroups users mailing list at https://lists.sourceforge.net/lists/listinfo/javagroups-users . Re: HASingletonController, the first thing to understand is that who the master is is a function of what nodes in the cluster have that HASingletonCon
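To illustrate the point that the master depends on which nodes have the singleton deployed, here is a minimal sketch. It assumes a simple "first node in the view that has the service" policy; that policy, the class name SingletonElection, and the method electMaster are hypothetical illustrations and not the JBoss HASingletonController API. The node names C and D are taken from the example discussed in this thread.

```java
// Hypothetical sketch: NOT the JBoss HASingletonController implementation.
// Assumption: the master is the first member of the current view that has
// the singleton service deployed.
final class SingletonElection {
    private SingletonElection() {}

    /**
     * Returns the first node in view order that has the service deployed,
     * or null when no node in the view has it.
     */
    static String electMaster(java.util.List<String> view,
                              java.util.Set<String> deployedOn) {
        for (String node : view) {
            if (deployedOn.contains(node)) {
                return node;
            }
        }
        return null;
    }
}
```

Under this assumption, a view of {C,D} with the service deployed on both nodes yields C as master, while the same view with the service deployed only on D yields D. The view order itself does not change in either case.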

[jboss-user] [Clustering/JBoss] - Re: strace shows futex

2008-12-18 Thread mohitanchlia
Just checking if you heard from JGroups about VIEW_SYNC. I have one more question: in one of the previous replies you mentioned that HAController tells the cluster that it's there when it comes up. My understanding was that HAPartition notifies HAController and then HAController based on th

[jboss-user] [Clustering/JBoss] - Re: strace shows futex

2008-11-28 Thread mohitanchlia
I think there is a problem with that process, and my understanding was that they introduced VIEW_SYNC to overcome it. I am eager to hear the JGroups response about VIEW_SYNC.

[jboss-user] [Clustering/JBoss] - Re: strace shows futex

2008-11-27 Thread bstansbe...@jboss.com
Not quite correct. FD doesn't "detect" a member, it validates that a known member is still alive. After "max_tries" heartbeat messages without a response, it initiates the process that leads to GMS excluding the non-responding member from the group. The fact you have shun="true" in both FD and G
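The behavior described here (count missed heartbeats for a known member, and only after max_tries misses start the exclusion process) can be sketched roughly as follows. This is a simplified illustration and not the JGroups FD source; the class HeartbeatMonitor and its methods are hypothetical names, and resetting the state when a response arrives is an assumption of the sketch.

```java
// Hypothetical sketch: NOT the JGroups FD protocol implementation.
// It only models the counting rule: a known member is suspected after
// maxTries consecutive heartbeats go unanswered.
final class HeartbeatMonitor {
    private final int maxTries;
    private int missed = 0;
    private boolean suspected = false;

    HeartbeatMonitor(int maxTries) {
        if (maxTries < 1) {
            throw new IllegalArgumentException("maxTries must be >= 1");
        }
        this.maxTries = maxTries;
    }

    /**
     * Records one heartbeat that got no response.
     * Returns true only on the miss that raises suspicion, which is the
     * point where the exclusion process would be initiated.
     */
    boolean onMissedHeartbeat() {
        if (suspected) {
            return false; // already suspected, nothing new to report
        }
        missed++;
        if (missed >= maxTries) {
            suspected = true;
            return true;
        }
        return false;
    }

    /** Assumption of this sketch: any response clears the miss count. */
    void onHeartbeatAck() {
        missed = 0;
        suspected = false;
    }

    int missedCount() {
        return missed;
    }

    boolean isSuspected() {
        return suspected;
    }
}
```

With maxTries set to 5, as in the configuration mentioned in this thread, four consecutive misses leave the member unsuspected and the fifth raises suspicion.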

[jboss-user] [Clustering/JBoss] - Re: strace shows futex

2008-11-25 Thread mohitanchlia
I thought that FD retries only 'x' number of times (max_tries) and after that it will not try to detect the member again. Is that not correct? Below is the config; max_tries is set to 5.

[jboss-user] [Clustering/JBoss] - Re: strace shows futex

2008-11-25 Thread bstansbe...@jboss.com
OK, but make sure you trace down that "FD drops the member it never detects that member again" issue on the support case, as you shouldn't be experiencing that behavior.

[jboss-user] [Clustering/JBoss] - Re: strace shows futex

2008-11-25 Thread mohitanchlia
Thanks a lot. The JBoss support group tells us that we shouldn't upgrade JGroups just like that. I think they are working on giving us the patch. Please let me know about VIEW_SYNC, because what scares me is that once FD drops the member, it never detects that member again, and VIEW_SYNC seems to be sol

[jboss-user] [Clustering/JBoss] - Re: strace shows futex

2008-11-25 Thread bstansbe...@jboss.com
I've asked the JGroups developers to respond to this question. VIEW_SYNC is not one of the standard protocols we include in the JBoss AS channel configurations, so I'm not as familiar with all of its pros and cons as I am with most protocols, and I don't want to give you wrong information.

[jboss-user] [Clustering/JBoss] - Re: strace shows futex

2008-11-25 Thread mohitanchlia
So I upgraded the jgroup.jar to 2.6 and that resolved the issue of slowness. I do have a question: while looking around I found VIEW_SYNC in JGroups, which helps resolve the scenario where the node is taken out of service by FD. Do you think it's worth having this parameter in UDP configuration of Jgro

[jboss-user] [Clustering/JBoss] - Re: strace shows futex

2008-11-25 Thread bstansbe...@jboss.com
Mohit, It turns out your employer has a support contract with Red Hat. Please use the support case opened on our Customer Service Portal to resolve the issues you are seeing. The CSP is a much better tool for handling complex operational issues like what we're discussing on this thread. Thank

[jboss-user] [Clustering/JBoss] - Re: strace shows futex

2008-11-24 Thread mohitanchlia
I also see: 2008-11-24 17:08:34,990 DEBUG [jgroups.protocols.FD] - heartbeat missing from 10.10.81.92:34144 (number=0)

[jboss-user] [Clustering/JBoss] - Re: strace shows futex

2008-11-24 Thread bstansbe...@jboss.com
Threads waiting on an object are quite normal; it's the standard mechanism by which a thread that has completed its work is unscheduled while waiting for more work. The threads in your stack trace other than "main" all look fine. The "main" thread wait is as I described above. As nodes start, e

[jboss-user] [Clustering/JBoss] - Re: strace shows futex

2008-11-24 Thread mohitanchlia
I also see these messages: 2008-11-24 17:01:08,547 WARN [protocols.pbcast.GMS] - failed to collect all ACKs (2) for view MergeView::[

[jboss-user] [Clustering/JBoss] - Re: strace shows futex

2008-11-24 Thread mohitanchlia
Do you think I should add VIEW_SYNC to help remerge the views? I am not sure why the startup is slow and JBoss gets stuck, or why it sometimes declares a node dead.

[jboss-user] [Clustering/JBoss] - Re: strace shows futex

2008-11-24 Thread mohitanchlia
I don't know why I see so many object waits in the above thread dumps. Also, as we add nodes to the cluster, startup becomes terribly slow. It looks like it takes a very long time initializing the 10 HA SingletonControllers we have. Sometimes it just declares one of the nodes a dead member even though a

[jboss-user] [Clustering/JBoss] - Re: strace shows futex

2008-11-24 Thread [EMAIL PROTECTED]
"main" prio=1 tid=0x0817a910 nid=0xaf5 in Object.wait() [0xa727..0xa72720b0]
    at java.lang.Object.wait(Native Method)
    - waiting on <0xe91e7988> (a java.util.HashMap)
    at org.jgroups.blocks.GroupRequest.doExecute(GroupRequest.java:501)
    - locked <0xe91e7988> (a

[jboss-user] [Clustering/JBoss] - Re: strace shows futex

2008-11-24 Thread mohitanchlia
Here is the thread dump from one of the nodes:
"ClientConnectionHandler" daemon prio=1 tid=0x080810a8 nid=0xba7 runnable [0x9f813000..0x9f813f30]
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.read(SocketInputStream.java:129)
    at jav

[jboss-user] [Clustering/JBoss] - Re: strace shows futex

2008-11-24 Thread [EMAIL PROTECTED]
Try getting a thread dump as discussed at http://www.jboss.org/community/docs/DOC-12300; what you posted doesn't show anything that means anything to me.