On 07/29/2016 05:41 PM, Andrew Beekhof wrote:
>
>> On 30 Jul 2016, at 8:32 AM, Ken Gaillot wrote:
>>
>> I finally had time to investigate this, and it definitely is broken.
>>
>> The only existing heartbeat RA to use the *_notify_active_* variables is
>> Filesystem, and it only does so for OCFS2 on SLES10, which didn't even
>> ship pacemaker,
>
> I'm pretty sure it did

All I could find was:

"SLES 10 did not yet ship pacemaker, but heartbeat with the builtin crm"
http://oss.clusterlabs.org/pipermail/pacemaker/2014-July/022232.html

I'm sure people were compiling it, and ClusterLabs probably even provided a
repo, but it looks like SLES didn't ship it.

The issue is that the code that builds the active list checks for role
RSC_ROLE_STARTED rather than RSC_ROLE_SLAVE + RSC_ROLE_MASTER, so I don't
think it ever would have worked.

>
>> so I'm guessing it's been broken from the beginning of
>> pacemaker.
>>
>> The fix looks straightforward, so I should be able to take care of it soon.
>>
>> Filed bug http://bugs.clusterlabs.org/show_bug.cgi?id=5295
>>
>>> On 05/08/2016 04:57 AM, Jehan-Guillaume de Rorthais wrote:
>>> On Fri, 6 May 2016 15:41:11 -0500,
>>> Ken Gaillot wrote:
>>>
>>>>> On 05/03/2016 05:30 PM, Jehan-Guillaume de Rorthais wrote:
>>>>> On Tue, 3 May 2016 21:10:12 +0200,
>>>>> Jehan-Guillaume de Rorthais wrote:
>>>>>
>>>>>> On Mon, 2 May 2016 17:59:55 -0500,
>>>>>> Ken Gaillot wrote:
>>>>>>
>>>>>>>> On 04/28/2016 04:47 AM, Jehan-Guillaume de Rorthais wrote:
>>>>>>>> Hello all,
>>>>>>>>
>>>>>>>> While testing and experimenting with our RA for PostgreSQL, I found
>>>>>>>> that the meta_notify_active_* variables always seem to be empty.
>>>>>>>> Here is an example of these variables as they are seen from our RA
>>>>>>>> during a migration/switchover:
>>>>>>>>
>>>>>>>> {
>>>>>>>>   'type' => 'pre',
>>>>>>>>   'operation' => 'demote',
>>>>>>>>   'active' => [],
>>>>>>>>   'inactive' => [],
>>>>>>>>   'start' => [],
>>>>>>>>   'stop' => [],
>>>>>>>>   'demote' => [
>>>>>>>>     {
>>>>>>>>       'rsc' => 'pgsqld:1',
>>>>>>>>       'uname' => 'hanode1'
>>>>>>>>     }
>>>>>>>>   ],
>>>>>>>>   'master' => [
>>>>>>>>     {
>>>>>>>>       'rsc' => 'pgsqld:1',
>>>>>>>>       'uname' => 'hanode1'
>>>>>>>>     }
>>>>>>>>   ],
>>>>>>>>   'promote' => [
>>>>>>>>     {
>>>>>>>>       'rsc' => 'pgsqld:0',
>>>>>>>>       'uname' => 'hanode3'
>>>>>>>>     }
>>>>>>>>   ],
>>>>>>>>   'slave' => [
>>>>>>>>     {
>>>>>>>>       'rsc' => 'pgsqld:0',
>>>>>>>>       'uname' => 'hanode3'
>>>>>>>>     },
>>>>>>>>     {
>>>>>>>>       'rsc' => 'pgsqld:2',
>>>>>>>>       'uname' => 'hanode2'
>>>>>>>>     }
>>>>>>>>   ],
>>>>>>>> }
>>>>>>>>
>>>>>>>> In case this comes from our side, here is the code building this:
>>>>>>>>
>>>>>>>> https://github.com/dalibo/PAF/blob/6e86284bc647ef1e81f01f047f1862e40ba62906/lib/OCF_Functions.pm#L444
>>>>>>>>
>>>>>>>> But looking at the variable itself in debug logs, I always find it
>>>>>>>> empty, in various situations (switchover, recover, failover).
>>>>>>>>
>>>>>>>> If I understand the documentation correctly, I would expect 'active'
>>>>>>>> to list all three resources, shouldn't it? Currently, to work around
>>>>>>>> this, we assume: active == master + slave
>>>>>>>
>>>>>>> You're right, it should. The pacemaker code that generates the "active"
>>>>>>> variables is the same used for "demote" etc., so it seems unlikely the
>>>>>>> issue is on pacemaker's side. Especially since your code treats active
>>>>>>> etc. differently from demote etc., it seems like it must be in there
>>>>>>> somewhere, but I don't see where.
>>>>>>
>>>>>> The code treats active, inactive, start and stop all together, for any
>>>>>> cloned resource.
>>>>>> If the resource is multistate, it also adds promote, demote, slave
>>>>>> and master.
>>>>>>
>>>>>> Note that from this piece of code, the 7 other notify vars are set
>>>>>> correctly: start, stop, inactive, promote, demote, slave, master. Only
>>>>>> active is always missing.
>>>>>>
>>>>>> I'll investigate and try to find where the bug is hiding.
>>>>>
>>>>> So I added a piece of code to dump **all** the environment variables to
>>>>> a temp file as early as possible **to avoid any interaction with our
>>>>> perl module** in the code of the RA, i.e.:
>>>>>
>>>>> BEGIN {
>>>>>     # Debug only: dump the environment before any module is loaded.
>>>>>     use Time::HiRes qw(time);
>>>>>     my $now = time;
>>>>>     open my $fh, ">", "/tmp/test-$now.env.txt"
>>>>>         or die "cannot open dump file: $!";
>>>>>     printf($fh "%-20s = ''%s''\n", $_, $ENV{$_}) foreach sort keys %ENV;
>>>>>     close $fh;
>>>>> }
>>>>>
>>>>> Then I started my cluster and set maintenance-mode=false while no
>>>>> resources were running. So the debug files contain the probe action,
>>>>> the start on all nodes, one promote on the master and the first
>>>>> monitors. The "*active" variables are always empty anywhere in the
>>>>> cluster. Attached is the result of the following command on the master
>>>>> node:
>>>>>
>>>>> for i in test-*; do echo "===== $i ====="; grep OCF_ "$i"; done > debug-env.txt
>>>>>
>>>>> I'm using Pacemaker 1.1.13-10.el7_2.2-44eb2dd under CentOS 7.2.1511.
>>>>>
>>>>> For completeness, I added the Pacemaker configuration I use for my
>>>>> 3-node dev/test cluster.
>>>>>
>>>>> Let me know if you think of more investigations and tests I could run
>>>>> on this issue. I'm out of ideas for tonight (and I really would prefer
>>>>> having this bug on my side).
>>>>
>>>> From your environment dumps, what I think is happening is that you are
>>>> getting multiple notifications (start, pre-promote, post-promote) in a
>>>> single cluster transition.
>>>> So the variables reflect the initial state of that transition: none of
>>>> the instances are active, all three are being started (so the nodes are
>>>> in the "*_start_*" variables), and one is being promoted.
>>>
>>>
>>> Yes, this is what is happening here. It's embarrassing I didn't think
>>> about that :)
>>>
>>>> The starts will be done before the promote. If one of the starts fails,
>>>> the transition will be aborted, and a new one will be calculated. So, if
>>>> you get to the promote, you can assume anything in "*_start_*" is now
>>>> active.
>>>
>>> I did another simple test:
>>>
>>> * 3 ms clones are running on hanode1 hanode2 hanode3
>>> * master role is on hanode1
>>> * I move the master role to hanode2 using:
>>>   "pcs resource move pgsql-ha hanode2 --master"
>>>
>>> The transition gives us:
>>>
>>> * demote on hanode1
>>> * promote on hanode2
>>>
>>> I suppose all three clones on hanode1, hanode2 and hanode3 should appear
>>> in the active env variables in this context, shouldn't they?
>>>
>>> Please find attached the environment dumps of this transition from
>>> hanode1. You'll see both "OCF_RESKEY_CRM_meta_notify_active_resource" and
>>> "OCF_RESKEY_CRM_meta_notify_active_uname" only contain one char: a space.
>>>
>>> I started looking at the Pacemaker code, at least to get a better
>>> understanding of where environment variables are set and when they are
>>> available. I've been out of luck so far, but I lack time. Any pointers
>>> would be appreciated :)
>>>
>>>>> On a side note, I noticed with these debug files that the notify
>>>>> variables were also available outside of notify actions (start and
>>>>> notify here). Are they always available during "transition actions"
>>>>> (start, stop, promote, demote)? Looking at the mysql RA, it uses
>>>>> OCF_RESKEY_CRM_meta_notify_master_uname during the start action. So I
>>>>> suppose it's safe?
>>>>
>>>> Good question, I've never tried that before.
>>>> I'm reluctant to say it's guaranteed; it's possible seeing them in the
>>>> start action is a side effect of the current implementation and could
>>>> theoretically change in the future. But if mysql is relying on it, I
>>>> suppose it's well-established already, making changing it unlikely ...
>>>
>>> Thank you very much for this clarification. Presently we keep in a
>>> private attribute what we //think// (we can not rely on active_uname :/)
>>> are the active unames for the ms resource. Since the notify vars
>>> appearing outside of notify actions seems to be just a side effect of the
>>> current implementation, I prefer to stay away from them when we are not
>>> in a notify action and keep our current implementation.
>>>
>>> Thank you,
>>
>>
>> _______________________________________________
>> Developers mailing list
>> http://clusterlabs.org/mailman/listinfo/developers
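
For anyone hitting the same empty "active" list, the "active == master +
slave" workaround mentioned in the thread could look roughly like the sketch
below for a shell-based RA. This is only an illustration, not PAF's or
Pacemaker's code: the function name notify_active_unames is hypothetical, and
it assumes OCF_RESKEY_CRM_meta_notify_slave_uname is a whitespace-separated
list just like the OCF_RESKEY_CRM_meta_notify_master_uname variable discussed
above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Pacemaker or PAF): print, one per line,
# the nodes assumed to host an active instance, derived as master + slave.
notify_active_unames() {
    # The expansions are deliberately unquoted: the variables hold
    # whitespace-separated node names (possibly only a single space),
    # and word splitting turns them into individual loop items.
    for node in $OCF_RESKEY_CRM_meta_notify_master_uname \
                $OCF_RESKEY_CRM_meta_notify_slave_uname; do
        printf '%s\n' "$node"
    done | sort -u   # sort and drop duplicates
}
```

Called from within a notify action, the function prints nothing when both
lists are empty, so the caller can still tell "no active instance" apart from
a populated list.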
