On Mon, 2017-08-14 at 12:33 -0500, Ken Gaillot wrote:
> On Wed, 2017-08-02 at 09:59 +0000, 井上 和徳 wrote:
> > Hi,
> >
> > In Pacemaker-1.1.17, an attribute updated while pacemaker is starting
> > is not displayed in crm_mon. In Pacemaker-1.1.16 it is displayed, so
> > the results differ.
> >
> > https://github.com/ClusterLabs/pacemaker/commit/fe44f400a3116a158ab331a92a49a4ad8937170d
> > This commit is the cause, but is the following result (3.) expected
> > behavior?
>
> This turned out to be an odd one. The sequence of events is:
>
> 1. When the node leaves the cluster, the DC (correctly) wipes all its
> transient attributes from attrd and the CIB.
>
> 2. Pacemaker is newly started on the node, and a transient attribute is
> set before the node joins the cluster.
>
> 3. The node joins the cluster, and its transient attributes (including
> the new value) are synced with the rest of the cluster, in both attrd
> and the CIB. So far, so good.
>
> 4. Because this is the node's first join since its crmd started, its
> crmd wipes all of its transient attributes again. The idea is that the
> node may have restarted so quickly that the DC hasn't yet done it (step
> 1 here), so clear them now to avoid any problems with old values.
> However, the crmd wipes only the CIB -- not attrd (arguably a bug).
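The divergence described in step 4 can be observed directly by querying attrd and the CIB side by side. A sketch (assumes a running cluster and that the transient attribute, here named KEY, is stored as an nvpair in the CIB status section):

```shell
# attrd's in-memory view of the attribute, for all hosts:
attrd_updater -Q -n KEY -A

# The CIB's copy of the same attribute, queried from the status section.
# After step 4, this nvpair is gone even though attrd still holds the value:
cibadmin --query --xpath "//status//nvpair[@name='KEY']"
```

When the two disagree, crm_mon shows the CIB side, which is why the attribute disappears from its output.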
Whoops, clarification: the node may have restarted so quickly that
corosync didn't notice it left, so the DC would never have gotten the
"peer lost" message that triggers wiping its transient attributes. I
suspect the crmd wipes only the CIB in this case because we assumed
attrd would be empty at this point -- missing exactly this case where a
value was set between start-up and first join.

> 5. With the older pacemaker version, both the joining node and the DC
> would request a full write-out of all values from attrd. Because step 4
> only wiped the CIB, this ends up restoring the new value. With the
> newer pacemaker version, this step is no longer done, so the value
> winds up staying in attrd but not in the CIB (until the next write-out
> naturally occurs).
>
> I don't have a solution yet, but step 4 is clearly the problem (rather
> than the new code that skips step 5, which is still a good idea
> performance-wise). I'll keep working on it.
>
> > [test case]
> > 1. Start pacemaker on two nodes at the same time, and update the
> >    attribute during startup. In this case, the attribute is
> >    displayed in crm_mon.
> >
> > [root@node1 ~]# ssh -f node1 'systemctl start pacemaker ; attrd_updater -n KEY -U V-1' ; \
> >                 ssh -f node3 'systemctl start pacemaker ; attrd_updater -n KEY -U V-3'
> > [root@node1 ~]# crm_mon -QA1
> > Stack: corosync
> > Current DC: node3 (version 1.1.17-1.el7-b36b869) - partition with quorum
> >
> > 2 nodes configured
> > 0 resources configured
> >
> > Online: [ node1 node3 ]
> >
> > No active resources
> >
> > Node Attributes:
> > * Node node1:
> >     + KEY : V-1
> > * Node node3:
> >     + KEY : V-3
> >
> > 2. Restart pacemaker on node1, and update the attribute during
> >    startup.
> >
> > [root@node1 ~]# systemctl stop pacemaker
> > [root@node1 ~]# systemctl start pacemaker ; attrd_updater -n KEY -U V-10
> >
> > 3. The attribute is registered in attrd, but it is not registered in
> >    the CIB, so the updated attribute is not displayed in crm_mon.
> >
> > [root@node1 ~]# attrd_updater -Q -n KEY -A
> > name="KEY" host="node3" value="V-3"
> > name="KEY" host="node1" value="V-10"
> >
> > [root@node1 ~]# crm_mon -QA1
> > Stack: corosync
> > Current DC: node3 (version 1.1.17-1.el7-b36b869) - partition with quorum
> >
> > 2 nodes configured
> > 0 resources configured
> >
> > Online: [ node1 node3 ]
> >
> > No active resources
> >
> > Node Attributes:
> > * Node node1:
> > * Node node3:
> >     + KEY : V-3
> >
> > Best Regards
>
> --
> Ken Gaillot <kgail...@redhat.com>

_______________________________________________
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
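Until a fix for step 4 lands, a possible workaround is to force the full write-out manually. A sketch, assuming attrd_updater's advanced --refresh option (which asks the attribute manager to resend all current values to the CIB) behaves as documented:

```shell
# Ask attrd to rewrite every current attribute value into the CIB.
# This restores the value that step 4 wiped from the CIB only.
attrd_updater --refresh

# crm_mon reads the CIB, so the attribute should reappear in its output:
crm_mon -QA1
```

This only papers over the symptom; the crmd's CIB-only wipe on first join is still the underlying problem.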