[Openais] cpg behavior on transitional membership change
Hi all,

I'm trying to further investigate the problem I described at https://www.redhat.com/archives/cluster-devel/2011-August/msg00133.html

The main problem for me there is that pacemaker first sees a transitional membership with left nodes, then it sees a stable membership with those nodes returned, and does nothing about that. On the other hand, dlm_controld sees CPG_REASON_NODEDOWN events on the CPGs related to all its lockspaces (at the same time as the transitional membership change) and stops the kernel part of each lockspace until the whole cluster is rebooted (or until some other recovery procedure, which unfortunately does not happen :( ). It neither requests fencing of the left node nor recovers when the node returns on the next stable membership.

Could anyone please help me understand what the correct CPG behavior on membership change is? From what I see, CPG emits a CPG_REASON_NODEDOWN event on both transitional and stable membership if there is a node which left the cluster. Am I correct here? And is that the right thing if I am? If yes, is there a way to detect the membership change type (transitional or stable) through the CPG API?

Hoping for an answer,
Best regards,
Vladislav

___
Openais mailing list
Openais@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/openais
Re: [Openais] [PATCH] Ignore memb_join messages during flush operations
Reviewed-by: Jan Friesse jfrie...@redhat.com

Steven Dake wrote:
> a memb_join operation that occurs during flushing can result in an
> entry into the GATHER state from the RECOVERY state. This results in the
> regular sort queue being used instead of the recovery sort queue,
> resulting in segfault.
>
> Signed-off-by: Steven Dake sd...@redhat.com
> ---
>  exec/totemudp.c | 13 +++++++++++++
>  1 files changed, 13 insertions(+), 0 deletions(-)
>
> diff --git a/exec/totemudp.c b/exec/totemudp.c
> index 96849b7..0c12b56 100644
> --- a/exec/totemudp.c
> +++ b/exec/totemudp.c
> @@ -90,6 +90,8 @@
>  #define BIND_STATE_REGULAR 1
>  #define BIND_STATE_LOOPBACK 2
>  
> +#define MESSAGE_TYPE_MCAST 1
> +
>  #define HMAC_HASH_SIZE 20
>  struct security_header {
>  	unsigned char hash_digest[HMAC_HASH_SIZE]; /* The hash *MUST* be first in the data structure */
> @@ -1172,6 +1174,7 @@ static int net_deliver_fn (
>  	int res = 0;
>  	unsigned char *msg_offset;
>  	unsigned int size_delv;
> +	char *message_type;
>  
>  	if (instance->flushing == 1) {
>  		iovec = instance->totemudp_iov_recv_flush;
> @@ -1234,6 +1237,16 @@ static int net_deliver_fn (
>  	}
>  
>  	/*
> +	 * Drop all non-mcast messages (more specifically join
> +	 * messages should be dropped)
> +	 */
> +	message_type = (char *)msg_offset;
> +	if (instance->flushing == 1 && *message_type != MESSAGE_TYPE_MCAST) {
> +		iovec->iov_len = FRAME_SIZE_MAX;
> +		return (0);
> +	}
> +
> +	/*
>  	 * Handle incoming message
>  	 */
>  	instance->totemudp_deliver_fn (
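The check added by the patch above can be isolated as a small predicate. What follows is only a minimal sketch, not corosync code: `drop_during_flush`, its parameters, and the handling of empty frames are assumptions made for illustration. Only the value of `MESSAGE_TYPE_MCAST` and the idea of testing the first byte of the message come from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Value taken from the patch; the patch reads the message type from the
 * first byte at msg_offset. */
#define MESSAGE_TYPE_MCAST 1

/*
 * Hypothetical helper (not part of corosync): returns 1 if a received
 * frame should be dropped, 0 if it should be delivered.
 *
 * While flushing, only mcast messages are let through, so that a
 * memb_join cannot move the protocol from RECOVERY to GATHER.
 */
int drop_during_flush(int flushing, const unsigned char *msg, size_t len)
{
	assert(msg != NULL || len == 0);

	/* Assumption of this sketch: a frame with no type byte is dropped. */
	if (len == 0)
		return 1;

	return (flushing == 1 && msg[0] != MESSAGE_TYPE_MCAST);
}
```

Outside of a flush the predicate never drops a non-empty frame, which matches the patch leaving the normal receive path untouched.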
[Openais] [PATCH] Allow nss building conditionally with rpmbuild operation
Signed-off-by: Steven Dake sd...@redhat.com
---
 corosync.spec.in | 8 ++++++++
 1 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/corosync.spec.in b/corosync.spec.in
index 74ab851..5c651aa 100644
--- a/corosync.spec.in
+++ b/corosync.spec.in
@@ -11,6 +11,7 @@
 %bcond_with snmp
 %bcond_with dbus
 %bcond_with rdma
+%bcond_with nss
 
 Name: corosync
 Summary: The Corosync Cluster Engine and Application Programming Interfaces
@@ -36,7 +37,9 @@ Conflicts: openais = 0.89, openais-devel = 0.89
 %if %{buildtrunk}
 BuildRequires: autoconf automake
 %endif
+%if %{with nss}
 BuildRequires: nss-devel
+%endif
 %if %{with rdma}
 BuildRequires: libibverbs-devel librdmacm-devel
 %endif
@@ -83,6 +86,11 @@ export rdmacm_LIBS=-lrdmacm \
 %if %{with rdma}
 	--enable-rdma \
 %endif
+%if %{with nss}
+	--enable-nss \
+%else
+	--disable-nss \
+%endif
 	--with-initddir=%{_initrddir}
 
 make %{_smp_mflags}
-- 
1.7.6
Re: [Openais] cpg behavior on transitional membership change
On 09/02/2011 12:59 AM, Vladislav Bogdanov wrote:
> Hi all,
>
> I'm trying to further investigate the problem I described at
> https://www.redhat.com/archives/cluster-devel/2011-August/msg00133.html
>
> The main problem for me there is that pacemaker first sees a transitional
> membership with left nodes, then it sees a stable membership with those
> nodes returned, and does nothing about that. On the other hand,
> dlm_controld sees CPG_REASON_NODEDOWN events on the CPGs related to all
> its lockspaces (at the same time as the transitional membership change)
> and stops the kernel part of each lockspace until the whole cluster is
> rebooted (or until some other recovery procedure, which unfortunately
> does not happen

I believe fenced should reboot the node, but only if there is quorum. It is possible your cluster has lost quorum during this series of events. I have copied Dave for his feedback on this point.

> :( ). It neither requests fencing of the left node nor recovers when the
> node returns on the next stable membership.
>
> Could anyone please help me understand what the correct CPG behavior on
> membership change is? From what I see, CPG emits a CPG_REASON_NODEDOWN
> event on both transitional and stable membership if there is a node
> which left the cluster. Am I correct here? And is that the right thing
> if I am?

Line #'s where this happens?

> If yes, is there a way to detect the membership change type
> (transitional or stable) through the CPG API?

A transitional membership will always contain a subset of the previous regular membership. This means it will always contain 0 or more left members. A transitional membership means "the membership of nodes transitioning from the previous regular membership to the new regular membership". A regular configuration is where members are added to the configuration when detected. A transitional membership never has nodes added to it.

> Hoping for an answer,

It would be nice if cpg and totem had a direct relationship in how their transitional and regular configurations were generated, but this doesn't happen currently. I am not sure if there is a good reason for this.

Regards
-steve

> Best regards,
> Vladislav
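The subset property described in the reply above can be written down as a check on two node lists. This is a minimal sketch under stated assumptions: the function names are hypothetical, memberships are modelled as plain arrays of node IDs, and nothing here is part of the CPG or totem API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true if nodeid appears in list[0..n). */
bool nodeid_in_list(const uint32_t *list, size_t n, uint32_t nodeid)
{
	for (size_t i = 0; i < n; i++)
		if (list[i] == nodeid)
			return true;
	return false;
}

/*
 * Hypothetical helper: true if `next` could be a transitional membership
 * following the regular membership `prev`, i.e. every node in `next` was
 * already in `prev` (zero or more nodes left, none were added).
 */
bool could_be_transitional(const uint32_t *prev, size_t nprev,
			   const uint32_t *next, size_t nnext)
{
	assert(prev != NULL || nprev == 0);
	assert(next != NULL || nnext == 0);

	for (size_t i = 0; i < nnext; i++)
		if (!nodeid_in_list(prev, nprev, next[i]))
			return false;	/* a node was added: must be regular */
	return true;
}
```

The check is necessary but not sufficient: a regular membership that only lost nodes passes it too, so it can rule out a transitional membership but cannot positively identify one.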
Re: [Openais] cpg behavior on transitional membership change
On Fri, Sep 02, 2011 at 10:30:53AM -0700, Steven Dake wrote:
> On 09/02/2011 12:59 AM, Vladislav Bogdanov wrote:
> > Hi all,
> >
> > I'm trying to further investigate problem I described at
> > https://www.redhat.com/archives/cluster-devel/2011-August/msg00133.html
> >
> > The main problem for me there is that pacemaker first sees transitional
> > membership with left nodes, then it sees stable membership with that
> > nodes returned back, and does nothing about that. On the other hand,
> > dlm_controld sees CPG_REASON_NODEDOWN events on CPGs related to all its
> > lockspaces (at the same time with transitional membership change) and
> > stops kernel part of each lockspace until whole cluster is rebooted (or
> > until some other recovery procedure which unfortunately does not happen
>
> I believe fenced should reboot the node, but only if there is quorum. It
> is possible your cluster has lost quorum during this series of events.
> I have copied Dave for his feedback on this point.

I really can't make any sense of the report, sorry. Maybe reproduce it without pacemaker, and then describe the specific steps to create the issue and resulting symptoms. After that we can determine what logs, if any, would be useful.
Re: [Openais] cpg behavior on transitional membership change
Hi Steve,

02.09.2011 20:30, Steven Dake wrote:
> On 09/02/2011 12:59 AM, Vladislav Bogdanov wrote:
...
> > I'm trying to further investigate the problem I described at
> > https://www.redhat.com/archives/cluster-devel/2011-August/msg00133.html
> >
> > The main problem for me there is that pacemaker first sees a
> > transitional membership with left nodes, then it sees a stable
> > membership with those nodes returned, and does nothing about that. On
> > the other hand, dlm_controld sees CPG_REASON_NODEDOWN events on the CPGs
> > related to all its lockspaces (at the same time as the transitional
> > membership change) and stops the kernel part of each lockspace until
> > the whole cluster is rebooted (or until some other recovery procedure,
> > which unfortunately does not happen
>
> I believe fenced should reboot the node, but only if there is quorum. It
> is possible your cluster has lost quorum during this series of events.
> I have copied Dave for his feedback on this point.

Aha. I think so too. But fenced doesn't do that, and neither do the other daemons from cluster3. This part of the code is identical among them, which is why I think this does not depend on whether the cman or the pacemaker stack is used:

fence/fenced/cpg.c around line 1440 (as of 3.1.1)

	if (left_list[i].reason == CPG_REASON_NODEDOWN ||
	    left_list[i].reason == CPG_REASON_PROCDOWN) {
		memb->failed = 1;
		cg->failed_count++;
	}
	...
	if (left_list[i].reason == CPG_REASON_PROCDOWN)
		kick_node_from_cluster(memb->nodeid);

Probably the last lines should be:

	if (left_list[i].reason == CPG_REASON_NODEDOWN ||
	    left_list[i].reason == CPG_REASON_PROCDOWN)
		kick_node_from_cluster(memb->nodeid);

at least in one of the daemons (fenced is a good candidate, but I prefer dlm_controld)?

About quorum: the 3-node cluster was split into two partitions, 2 bare-metal nodes and 1 VM node. When I found that, the two metal ones were in 'kern_stop' state, having transitioned via 'kern_stop,fencing' state I suppose. The VM did not have quorum, so it was left in 'kern_stop,fencing' state.

dlm dump says:
1313579105 clvmd add_change cg 4 remove nodeid 1543767306 reason 3

That means a CPG_REASON_NODEDOWN event. Then:
1313579105 Node 1543767306/mgmt01 has not been shot yet
1313579105 clvmd check_fencing 1543767306 wait add 1313562825 fail 1313579105 last 0
1313579107 Node 1543767306/mgmt01 was last shot 'now'

This is not true: there is no line about fencing actually being scheduled (and it is clear from the code why). This could be a deficiency of the .pcmk dlm_controld variant, but that is not important here I think.

1313579107 clvmd check_fencing 1543767306 done add 1313562825 fail 1313579105 last 1313579107
1313579107 clvmd check_fencing done

> > :( ). It neither requests fencing of the left node nor recovers when
> > the node returns on the next stable membership.
> >
> > Could anyone please help me understand what the correct CPG behavior on
> > membership change is? From what I see, CPG emits a CPG_REASON_NODEDOWN
> > event on both transitional and stable membership if there is a node
> > which left the cluster. Am I correct here? And is that the right thing
> > if I am?

Ah, I must have mixed something up, it was quite a long time ago. Actually, yes, that was the transitional one. There was only one such event.

> Line #'s where this happens?

I just saw that in the pacemaker plugin logs and in the dlm_tool dump logs. Their timestamps are identical.

> > If yes, is there a way to detect the membership change type
> > (transitional or stable) through the CPG API?
>
> A transitional membership will always contain a subset of the previous
> regular membership. This means it will always contain 0 or more left
> members. A transitional membership means "the membership of nodes
> transitioning from the previous regular membership to the new regular
> membership". A regular configuration is where members are added to the
> configuration when detected. A transitional membership never has nodes
> added to it.

Thank you very much for the clarification. Shouldn't pacemaker then schedule fencing itself (from the partition with quorum) if there are left nodes?

BTW, there was actually only a second or two between the transitional and the regular membership. I probably need to ask Andrew for details of the pacemaker logic. Unfortunately I lost those logs and can hardly reproduce that :( It was a VM which left the cluster, and it probably just suffered from insufficient host CPU time.

And... Just wondering, what could be a reason to recalculate membership if there are 0 left or added members?

> > Hoping for an answer,
>
> It would be nice if cpg and totem had a direct relationship in how their
> transitional and regular configurations were generated, but this doesn't
> happen currently. I am not sure if there is a good reason for this.

Pacemaker uses totem? At least it doesn't use cpg. Maybe that is the reason it does not fence from within?

Thank you very much,
Vladislav
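The difference between the quoted fenced logic and the change proposed in the message above can be sketched as follows. This is an illustration only, not code from cluster3 or corosync: `struct left_entry`, `count_failed` and `should_kick` are hypothetical names, the `REASON_*` constants are local stand-ins that are assumed to mirror the `CPG_REASON_*` values (the dlm dump above shows NODEDOWN as reason 3), and the proposal itself was not confirmed by anyone in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins, assumed to mirror the CPG_REASON_* values. */
enum left_reason {
	REASON_JOIN = 1,
	REASON_LEAVE = 2,
	REASON_NODEDOWN = 3,
	REASON_NODEUP = 4,
	REASON_PROCDOWN = 5
};

/* Hypothetical, simplified stand-in for one left_list entry. */
struct left_entry {
	uint32_t nodeid;
	uint32_t reason;
};

/* Counts members that the quoted code marks as failed. */
size_t count_failed(const struct left_entry *left, size_t n)
{
	size_t failed = 0;

	assert(left != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (left[i].reason == REASON_NODEDOWN ||
		    left[i].reason == REASON_PROCDOWN)
			failed++;
	return failed;
}

/*
 * kick_on_nodedown == false: behaviour of the quoted code (PROCDOWN only).
 * kick_on_nodedown == true:  behaviour proposed in the message above.
 */
bool should_kick(uint32_t reason, bool kick_on_nodedown)
{
	if (reason == REASON_PROCDOWN)
		return true;
	return kick_on_nodedown && reason == REASON_NODEDOWN;
}
```

With the quoted logic a NODEDOWN member is counted as failed but never kicked, which is the gap the message points at.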
Re: [Openais] cpg behavior on transitional membership change
02.09.2011 20:55, David Teigland wrote:
[snip]
> I really can't make any sense of the report, sorry. Maybe reproduce it
> without pacemaker, and then describe the specific steps to create the
> issue and resulting symptoms. After that we can determine what logs, if
> any, would be useful.

I just tried to ask a question about the logic of the cluster components, based on information I discovered from both logs and code analysis. I'm sorry if I was unclear; probably some language barrier still exists.

Please see my previous mail, where I tried to add some explanation of why I think the current logic is not complete.

Thank you,
Vladislav