On 01/15/2016 05:02 AM, Arjun Pandey wrote:
> Based on corosync logs from orana (the node that did the actual
> fencing and is the current master node)
>
> I also tried looking at pengine outputs based on crm_simulate. Up until
> the fenced node rejoins, things look good.
>
> [root@ucc1 orana]# crm_simulate -S --xml-file ./pengine/pe-input-1450.bz2 -u kamet
> Current cluster status:
> Node kamet: pending
> Online: [ orana ]
Above, "pending" means that the node has started to join the cluster,
but has not yet fully joined.

> Jan 13 19:32:53 [4295] orana pengine: info: probe_resources:
> Action probe_complete-kamet on kamet is unrunnable (pending)

Any action on kamet is unrunnable until it finishes joining the cluster.

> Jan 13 19:32:59 [4292] orana stonith-ng: info:
> crm_update_peer_proc: pcmk_cpg_membership: Node kamet[2] -
> corosync-cpg is now online

The pacemaker daemons on orana each report when they see kamet come up
at the corosync level. Here, stonith-ng sees it.

> Jan 13 19:32:59 [4291] orana cib: info:
> crm_update_peer_proc: pcmk_cpg_membership: Node kamet[2] -
> corosync-cpg is now online

Now, the cib sees it.

> Jan 13 19:33:00 [4296] orana crmd: info:
> crm_update_peer_proc: pcmk_cpg_membership: Node kamet[2] -
> corosync-cpg is now online

Now, crmd sees it.

>>>> [Arjun] Why does pengine declare that the following monitor
>>>> actions are now unrunnable?

> Jan 13 19:33:00 [4295] orana pengine: warning: custom_action:
> Action foo:0_monitor_0 on kamet is unrunnable (pending)

At this point, pengine still hasn't seen kamet join, so actions on it
are still unrunnable.

> Jan 13 19:33:00 [4296] orana crmd: info: join_make_offer:
> join-2: Sending offer to kamet

Having seen kamet at the corosync level, crmd now offers cluster-level
membership to kamet.
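When tracing a join like this, it can help to pull all of the join-related
messages out of the log in one pass, so the corosync-level events and the
crmd join offer can be read in order. A minimal sketch (the sample log
created here is a stand-in; in practice you would grep your actual
corosync/pacemaker log file):

```shell
# Extract the corosync-level membership events (crm_update_peer_proc)
# and the cluster-level join offers (join_make_offer) from a log.
# The sample file below is illustrative only.
log=$(mktemp)
cat > "$log" <<'EOF'
Jan 13 19:32:59 [4292] orana stonith-ng: info: crm_update_peer_proc: pcmk_cpg_membership: Node kamet[2] - corosync-cpg is now online
Jan 13 19:33:00 [4296] orana crmd: info: crm_update_peer_proc: pcmk_cpg_membership: Node kamet[2] - corosync-cpg is now online
Jan 13 19:33:00 [4296] orana crmd: info: join_make_offer: join-2: Sending offer to kamet
EOF
joins=$(grep -E 'crm_update_peer_proc|join_make_offer' "$log")
echo "$joins"
rm -f "$log"
```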
> Jan 13 19:33:00 [4291] orana cib: info: cib_process_replace:
> Replacement 0.4.0 from kamet not applied to 0.74.1: current epoch is
> greater than the replacement
> Jan 13 19:33:00 [4291] orana cib: warning: cib_process_request:
> Completed cib_replace operation for section 'all': Update was older
> than existing configuration (rc=-205, origin=kamet/cibadmin/2,
> version=0.74.1)
> Jan 13 19:33:00 [4291] orana cib: info: cib_perform_op:
> Diff: --- 0.74.1 2
> Jan 13 19:33:00 [4291] orana cib: info: cib_perform_op:
> Diff: +++ 0.75.0 (null)
> Jan 13 19:33:00 [4291] orana cib: info: cib_perform_op: --
> /cib/configuration/nodes/node[@id='kamet']
> Jan 13 19:33:00 [4291] orana cib: info: cib_perform_op: --
> /cib/configuration/nodes/node[@id='orana']
> Jan 13 19:33:00 [4291] orana cib: info: cib_perform_op: --
> /cib/configuration/resources/primitive[@id='fence-uc-orana']/meta_attributes[@id='fence-uc-orana-meta_attributes']
> Jan 13 19:33:00 [4291] orana cib: info: cib_perform_op: --
> /cib/configuration/resources/primitive[@id='fence-uc-kamet']
> Jan 13 19:33:00 [4291] orana cib: info: cib_perform_op: --
> /cib/configuration/resources/primitive[@id='C-3']
> Jan 13 19:33:00 [4291] orana cib: info: cib_perform_op: --
> /cib/configuration/resources/primitive[@id='C-FLT']
> Jan 13 19:33:00 [4291] orana cib: info: cib_perform_op: --
> /cib/configuration/resources/primitive[@id='C-FLT2']
> Jan 13 19:33:00 [4291] orana cib: info: cib_perform_op: --
> /cib/configuration/resources/primitive[@id='E-3']
> Jan 13 19:33:00 [4291] orana cib: info: cib_perform_op: --
> /cib/configuration/resources/primitive[@id='MGMT-FLT']
> Jan 13 19:33:00 [4291] orana cib: info: cib_perform_op: --
> /cib/configuration/resources/primitive[@id='M-FLT']
> Jan 13 19:33:00 [4291] orana cib: info: cib_perform_op: --
> /cib/configuration/resources/primitive[@id='M-FLT2']
> Jan 13 19:33:00 [4291] orana cib: info: cib_perform_op: --
> /cib/configuration/resources/primitive[@id='S-FLT']
> Jan 13 19:33:00 [4291] orana cib: info: cib_perform_op: --
> /cib/configuration/resources/primitive[@id='S-FLT2']
> Jan 13 19:33:00 [4291] orana cib: info: cib_perform_op: --
> /cib/configuration/constraints/rsc_colocation[@id='colocation-C-3-foo-master-INFINITY']
> Jan 13 19:33:00 [4291] orana cib: info: cib_perform_op: --
> /cib/configuration/constraints/rsc_order[@id='order-C-3-foo-master-mandatory']
> Jan 13 19:33:00 [4291] orana cib: info: cib_perform_op: --
> /cib/configuration/constraints/rsc_colocation[@id='colocation-C-FLT-foo-master-INFINITY']
> Jan 13 19:33:00 [4291] orana cib: info: cib_perform_op: --
> /cib/configuration/constraints/rsc_order[@id='order-C-FLT-foo-master-mandatory']
> Jan 13 19:33:00 [4291] orana cib: info: cib_perform_op: --
> /cib/configuration/constraints/rsc_colocation[@id='colocation-C-FLT2-foo-master-INFINITY']
> Jan 13 19:33:00 [4291] orana cib: info: cib_perform_op: --
> /cib/configuration/constraints/rsc_order[@id='order-C-FLT2-foo-master-mandatory']
> Jan 13 19:33:00 [4291] orana cib: info: cib_perform_op: --
> /cib/configuration/constraints/rsc_colocation[@id='colocation-E-3-foo-master-INFINITY']
> Jan 13 19:33:00 [4291] orana cib: info: cib_perform_op: --
> /cib/configuration/constraints/rsc_order[@id='order-E-3-foo-master-mandatory']
> Jan 13 19:33:00 [4291] orana cib: info: cib_perform_op: --
> /cib/configuration/constraints/rsc_colocation[@id='colocation-MGMT-FLT-foo-master-INFINITY']
> Jan 13 19:33:00 [4291] orana cib: info: cib_perform_op: --
> /cib/configuration/constraints/rsc_order[@id='order-MGMT-FLT-foo-master-mandatory']
> Jan 13 19:33:00 [4291] orana cib: info: cib_perform_op: --
> /cib/configuration/constraints/rsc_colocation[@id='colocation-M-FLT-foo-master-INFINITY']
> Jan 13 19:33:00 [4291] orana cib: info: cib_perform_op: --
> /cib/configuration/constraints/rsc_order[@id='order-M-FLT-foo-master-mandatory']
> Jan 13 19:33:00 [4291] orana cib: info: cib_perform_op: --
> /cib/configuration/constraints/rsc_colocation[@id='colocation-M-FLT2-foo-master-INFINITY']
> Jan 13 19:33:00 [4291] orana cib: info: cib_perform_op: --
> /cib/configuration/constraints/rsc_order[@id='order-M-FLT2-foo-master-mandatory']
> Jan 13 19:33:00 [4291] orana cib: info: cib_perform_op: --
> /cib/configuration/constraints/rsc_colocation[@id='colocation-S-FLT-foo-master-INFINITY']
> Jan 13 19:33:00 [4291] orana cib: info: cib_perform_op: --
> /cib/configuration/constraints/rsc_order[@id='order-S-FLT-foo-master-mandatory']
> Jan 13 19:33:00 [4291] orana cib: info: cib_perform_op: --
> /cib/configuration/constraints/rsc_colocation[@id='colocation-S-FLT2-foo-master-INFINITY']
> Jan 13 19:33:00 [4291] orana cib: info: cib_perform_op: --
> /cib/configuration/constraints/rsc_order[@id='order-S-FLT2-foo-master-mandatory']
> Jan 13 19:33:00 [4291] orana cib: info: cib_perform_op: --
> /cib/configuration/constraints/rsc_colocation[@id='colocation-fence-uc-orana-foo-master-INFINITY']
> Jan 13 19:33:00 [4291] orana cib: info: cib_perform_op: --
> /cib/configuration/constraints/rsc_colocation[@id='colocation-fence-uc-kamet-foo-master-INFINITY']
> Jan 13 19:33:00 [4291] orana cib: info: cib_perform_op: --
> /cib/configuration/constraints/rsc_order[@id='order-fence-uc-kamet-foo-master-mandatory']
> Jan 13 19:33:00 [4291] orana cib: info: cib_perform_op: --
> /cib/configuration/constraints/rsc_order[@id='order-fence-uc-orana-foo-master-mandatory']
> Jan 13 19:33:00 [4291] orana cib: info: cib_perform_op: +
> /cib: @epoch=75, @num_updates=0
> Jan 13 19:33:00 [4291] orana cib: info: cib_perform_op: +
> /cib/configuration/resources/primitive[@id='fence-uc-orana']/instance_attributes[@id='fence-uc-orana-instance_attributes']/nvpair[@id='fence-uc-orana-instance_attributes-delay']:
> @value=0
> Jan 13 19:33:00 [4291] orana cib: info: cib_process_request:
> Completed cib_replace operation for section configuration: OK (rc=0,
> origin=kamet/cibadmin/2, version=0.75.0)

The above is
the problem. You can see all of the resources being deleted from the
CIB ("--" marks lines being removed from the CIB, and "+" marks lines
being added). For some reason, the cluster used a much older CIB from
kamet to replace the current one in use by the cluster. I'm not sure
why this happened; it may be a bug. What version of pacemaker are you
using?

Check the permissions on /var/lib/pacemaker/cib and the files in it on
both nodes. I'd expect everything to be owned and writable by the
hacluster user.

>>>>>> [Arjun] What do the following logs signify?

> Jan 13 19:33:00 [4292] orana stonith-ng: info:
> stonith_device_remove: Device 'C-3' not found (2 active devices)

These are not important in themselves; they are follow-up effects of
the resources being removed from the CIB above. Whenever the CIB
changes, stonith-ng re-checks what fencing devices are available.

_______________________________________________
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org