27.09.2011 10:56, Andrew Beekhof wrote:
> On Tue, Sep 27, 2011 at 5:07 PM, Vladislav Bogdanov
> <bub...@hoster-ok.com> wrote:
>> 27.09.2011 08:59, Andrew Beekhof wrote:
>> [snip]
>>>>>>>> I agree with Jiaju
>>>>>>>> (https://lists.linux-foundation.org/pipermail/openais/2011-September/016713.html),
>>>>>>>> that this could be solely a pacemaker problem, because it probably
>>>>>>>> should originate fencing itself in such a situation, I think.
>>>>>>>>
>>>>>>>> So, using pacemaker/dlm with the openais stack is currently risky
>>>>>>>> due to possible hangs of dlm lockspaces.
>>>>>>>
>>>>>>> It shouldn't be, failing to connect to attrd is very unusual.
>>>>>>
>>>>>> By the way, one of the underlying problems, which actually made me
>>>>>> notice all this, is that the pacemaker cluster does not fence its DC
>>>>>> if it leaves the cluster for a very short time. That is what Jiaju
>>>>>> said in his notes, and I can confirm it.
>>>>>
>>>>> That's highly surprising. Do the logs you sent display this behaviour?
>>>>
>>>> They do. The rest of the cluster begins the election, but then accepts
>>>> the returned DC back (I write this from memory, I looked at the logs on
>>>> Sep 5-6, so I may be mixing something up).
>>>
>>> Actually, this might be possible - if DC.old came back before DC.new
>>> had a chance to get elected, run the PE and initiate fencing, then
>>> there would be no need to fence.
>>>
>>
>> (text below is for pacemaker on top of the openais stack, not for cman)
>>
>> Except the dlm lockspaces are in kern_stop state, so the whole
>> dlm-related part is frozen :( - clvmd in my case, but I expect the same
>> from gfs2 and ocfs2.
>> And fencing requests originated on the CPG NODEDOWN event by dlm_controld
>> (with my patch to dlm_controld and your patch for
>> crm_terminate_member_common()) in a quorate partition are lost. DC.old
>> doesn't accept CIB updates from other nodes, so those fencing requests
>> are discarded.
>
> All the more reason to start using the stonith api directly.
> I was playing around last night with the dlm_controld.pcmk code:
>
> https://github.com/beekhof/dlm/commit/9f890a36f6844c2a0567aea0a0e29cc47b01b787

Wow, I'll try it!

Btw (off-topic), don't you think it could be interesting to support the
stacks as dlopen'ed modules there? From what I see in that code, it could
be achieved fairly easily: define a module API structure, implement those
functions for each stack, add module loading to the dlm_controld core, and
change the call sites to go through the module's functions - something like
the sketch below.
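Just to illustrate what I mean (a rough sketch only - every name below is
made up for the example, nothing is taken from the actual dlm_controld
source):

/* dlm_stack.h - hypothetical plugin interface for dlm_controld stacks */
#include <dlfcn.h>
#include <stdio.h>

/* Operations every stack module (pcmk, cman, ...) would implement.
 * Purely illustrative names, not the real dlm_controld functions. */
struct dlm_stack_ops {
	const char *name;
	int  (*setup)(void);                  /* connect to the stack */
	void (*teardown)(void);               /* disconnect from it */
	int  (*fence_node)(const char *node); /* request fencing of a node */
	int  (*is_quorate)(void);             /* current quorum state */
};

/* Load "dlm_stack_<name>.so" and fetch its exported ops table.
 * The daemon would link with -ldl for this. */
static struct dlm_stack_ops *load_stack(const char *name)
{
	char path[256];
	void *handle;
	struct dlm_stack_ops *ops;

	snprintf(path, sizeof(path), "dlm_stack_%s.so", name);
	handle = dlopen(path, RTLD_NOW);
	if (!handle) {
		fprintf(stderr, "dlopen %s: %s\n", path, dlerror());
		return NULL;
	}
	ops = dlsym(handle, "stack_ops");
	if (!ops)
		fprintf(stderr, "dlsym stack_ops: %s\n", dlerror());
	return ops;
}

The core would then just call ops->fence_node() and friends, without caring
which stack is actually underneath.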
>>
>> I think the problem is that membership changes are handled in a
>> non-transactional way (?).
>
> Sounds more like the dlm/etc is being dumb - if the host is back and
> healthy, why would we want to shoot it?

Ammmm..... No comments from me on this ;)
But anyway, something needs to be done on one side or the other...

>
>> If pacemaker fully finished processing one membership change - elected a
>> new DC in the quorate partition, and did not try to take over the DC role
>> (or released it) in a non-quorate partition while a quorate one exists -
>> that problem could be gone.
>
> Non quorate partitions still have a DC.
> They're just not supposed to do anything (depending on the value of
> no-quorum-policy).

I actually meant "do not try to take over the DC role in a rejoined cluster
(or release that role) if, before the rejoin, that role was held in a
non-quorate partition while a quorate one existed". Sorry for the
confusion - still not the most natural wording, but hopefully clearer.
Maybe the DC from the non-quorate partition should just get a lower
priority to become DC when the cluster rejoins and a new election happens
(does one happen?).

>
>> I didn't dig into the code that much, so all of the above is just my
>> deduction, which may be completely wrong.
>> And of course the real logic could (should) be much more complicated,
>> with handling of just-rebooted members, etc.
>>
>> (end of openais-specific part)
>>
>>>> [snip]
>>>>>>>> Although it took 25 seconds instead of 3 to break the cluster (I
>>>>>>>> understand, it is almost impossible to load a host that much, but
>>>>>>>> anyway), I then got a real nightmare: two nodes of the 3-node
>>>>>>>> cluster had cman stopped (and pacemaker too, because of the cman
>>>>>>>> connection loss) - they asked kick_node_from_cluster() for each
>>>>>>>> other, and that succeeded. But fencing didn't happen (I still need
>>>>>>>> to look into why, but this is cman specific).
>>>>
>>>> Btw this part is tricky for me to understand the underlying logic:
>>>> * cman just stops the cman processes on remote nodes, disregarding
>>>> quorum. I hope that could be fixed in corosync, if I understand one of
>>>> the latest threads there correctly.
>>>> * But cman does not fence those nodes, and they still run resources,
>>>> which could be extremely dangerous under some circumstances. And cman
>>>> does not fence them even if it has fence devices configured in
>>>> cluster.conf (I verified that).
>>>>
>>>>>>>> The remaining node had pacemaker hung; it didn't even notice the
>>>>>>>> cluster infrastructure change: down nodes were listed as online,
>>>>>>>> one of them was the DC, and all resources were marked as started
>>>>>>>> on all (down too) nodes. No log entries from pacemaker at all.
>>>>>>>
>>>>>>> Well, I can't see any logs from anyone, so it's hard for me to
>>>>>>> comment.
>>>>>>
>>>>>> Logs were sent privately.
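P.S. Regarding calling the stonith API directly - if I read your commit
right, the fencing path boils down to something like the sketch below. This
is from memory, so the connect name and timeout are my guesses, and the
exact fence() argument list should be checked against <crm/stonith-ng.h>
for the pacemaker version in use:

#include <stdio.h>
#include <crm/stonith-ng.h>

/* Rough sketch: ask stonithd to fence a node directly, bypassing
 * attrd/CIB. Link against the pacemaker stonith client library
 * (-lstonithd). */
static int fence_node(const char *node)
{
	stonith_t *st = stonith_api_new();
	int rc;

	rc = st->cmds->connect(st, "dlm_controld", NULL);
	if (rc < 0) {
		fprintf(stderr, "stonith connect failed: %d\n", rc);
		goto out;
	}

	/* Synchronous reboot request with a 120s timeout; the signature
	 * of fence() has changed between pacemaker versions. */
	rc = st->cmds->fence(st, st_opt_sync_call, node, "reboot", 120);
	fprintf(stderr, "fencing of %s %s (rc=%d)\n",
	        node, rc == 0 ? "succeeded" : "failed", rc);

	st->cmds->disconnect(st);
out:
	stonith_api_delete(st);
	return rc;
}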
Vladislav

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker