Hi! Could this mean the stonith-timeout is significantly larger than the time for a complete reboot? In that case the fenced node would already be up again by the time the cluster thinks the fencing has just completed.
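One way to check that theory would be to compare the configured timeout with the time node2 actually needs for a full reboot. A rough sketch, assuming the Pacemaker CLI tools (crm_attribute, crmsh) are available on the nodes; the 60s value is only an illustration, not a recommendation for this cluster:

    # show the cluster-wide stonith-timeout property, if it has been set explicitly
    crm_attribute --type crm_config --name stonith-timeout --query

    # adjust it with crmsh, for example:
    crm configure property stonith-timeout=60s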
Regards,
Ulrich

P.S.: Sorry for the late reply; I was offline for a while...

>>> Cesar Hernandez <c.hernan...@medlabmg.com> wrote on 06.07.2017 at 16:20 in message <0674aeed-8fd2-4dab-a27f-498db0f36...@medlabmg.com>:
>>
>> If node2 is getting the notification of its own fencing, it wasn't
>> successfully fenced. Successful fencing would render it incapacitated
>> (powered down, or at least cut off from the network and any shared
>> resources).
>
>
> Maybe I don't understand you, or maybe you don't understand me... ;)
> This is the syslog of the machine, where you can see that the machine has
> rebooted successfully, and as I said, it has been rebooted successfully all
> the times:
>
> Jul 5 10:41:54 node2 kernel: [ 0.000000] Initializing cgroup subsys cpuset
> Jul 5 10:41:54 node2 kernel: [ 0.000000] Initializing cgroup subsys cpu
> Jul 5 10:41:54 node2 kernel: [ 0.000000] Initializing cgroup subsys cpuacct
> Jul 5 10:41:54 node2 kernel: [ 0.000000] Linux version 3.16.0-4-amd64 (debian-ker...@lists.debian.org) (gcc version 4.8.4 (Debian 4.8.4-1) ) #1 SMP Debian 3.16.39-1 (2016-12-30)
> Jul 5 10:41:54 node2 kernel: [ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-3.16.0-4-amd64 root=UUID=711e1ec2-2a36-4405-bf46-44b43cfee42e ro init=/bin/systemd console=ttyS0 console=hvc0
> Jul 5 10:41:54 node2 kernel: [ 0.000000] e820: BIOS-provided physical RAM map:
> Jul 5 10:41:54 node2 kernel: [ 0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009dfff] usable
> Jul 5 10:41:54 node2 kernel: [ 0.000000] BIOS-e820: [mem 0x000000000009e000-0x000000000009ffff] reserved
> Jul 5 10:41:54 node2 kernel: [ 0.000000] BIOS-e820: [mem 0x00000000000e0000-0x00000000000fffff] reserved
> Jul 5 10:41:54 node2 kernel: [ 0.000000] BIOS-e820: [mem 0x0000000000100000-0x000000003fffffff] usable
> Jul 5 10:41:54 node2 kernel: [ 0.000000] BIOS-e820: [mem 0x00000000fc000000-0x00000000ffffffff] reserved
> Jul 5 10:41:54 node2 kernel: [ 0.000000] NX (Execute Disable) protection: active
> Jul 5 10:41:54 node2 kernel: [ 0.000000] SMBIOS 2.4 present.
>
> ...
>
> Jul 5 10:41:54 node2 dhclient: DHCPREQUEST on eth0 to 255.255.255.255 port 67
>
> ...
>
> Jul 5 10:41:54 node2 corosync[585]: [MAIN ] Corosync Cluster Engine ('UNKNOWN'): started and ready to provide service.
> Jul 5 10:41:54 node2 corosync[585]: [MAIN ] Corosync built-in features: nss
> Jul 5 10:41:54 node2 corosync[585]: [MAIN ] Successfully read main configuration file '/etc/corosync/corosync.conf'.
>
> ...
>
> Jul 5 10:41:57 node2 crmd[608]: notice: Defaulting to uname -n for the local classic openais (with plugin) node name
> Jul 5 10:41:57 node2 crmd[608]: notice: Membership 4308: quorum acquired
> Jul 5 10:41:57 node2 crmd[608]: notice: plugin_handle_membership: Node node2[1108352940] - state is now member (was (null))
> Jul 5 10:41:57 node2 crmd[608]: notice: plugin_handle_membership: Node node11[794540] - state is now member (was (null))
> Jul 5 10:41:57 node2 crmd[608]: notice: The local CRM is operational
> Jul 5 10:41:57 node2 crmd[608]: notice: State transition S_STARTING -> S_PENDING [ input=I_PENDING cause=C_FSA_INTERNAL origin=do_started ]
> Jul 5 10:41:57 node2 stonith-ng[604]: notice: Watching for stonith topology changes
> Jul 5 10:41:57 node2 stonith-ng[604]: notice: Membership 4308: quorum acquired
> Jul 5 10:41:57 node2 stonith-ng[604]: notice: plugin_handle_membership: Node node11[794540] - state is now member (was (null))
> Jul 5 10:41:57 node2 stonith-ng[604]: notice: On loss of CCM Quorum: Ignore
> Jul 5 10:41:58 node2 stonith-ng[604]: notice: Added 'st-fence_propio:0' to the device list (1 active devices)
> Jul 5 10:41:59 node2 stonith-ng[604]: notice: Operation reboot of node2 by node11 for crmd.2141@node11.61c3e613: OK
> Jul 5 10:41:59 node2 crmd[608]: crit: We were allegedly just fenced by node11 for node11!
> Jul 5 10:41:59 node2 corosync[585]: [pcmk ] info: pcmk_ipc_exit: Client crmd (conn=0x228d970, async-conn=0x228d970) left
> Jul 5 10:41:59 node2 pacemakerd[597]: warning: The crmd process (608) can no longer be respawned, shutting the cluster down.
> Jul 5 10:41:59 node2 pacemakerd[597]: notice: Shutting down Pacemaker
> Jul 5 10:41:59 node2 pacemakerd[597]: notice: Stopping pengine: Sent -15 to process 607
> Jul 5 10:41:59 node2 pengine[607]: notice: Invoking handler for signal 15: Terminated
> Jul 5 10:41:59 node2 pacemakerd[597]: notice: Stopping attrd: Sent -15 to process 606
> Jul 5 10:41:59 node2 attrd[606]: notice: Invoking handler for signal 15: Terminated
> Jul 5 10:41:59 node2 attrd[606]: notice: Exiting...
> Jul 5 10:41:59 node2 corosync[585]: [pcmk ] info: pcmk_ipc_exit: Client attrd (conn=0x2280ef0, async-conn=0x2280ef0) left

_______________________________________________
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org