Hi Andrew,

Thank you for your comments.

> >>> The guests are located on the shared disk.
> >>
> >> What is on the shared disk? The whole OS or app-specific data (i.e.
> >> nothing pacemaker needs directly)?
> >
> > The shared disk holds the whole OS and all of the data.
>
> Oh. I can imagine that being problematic.
> Pacemaker really isn't designed to function without disk access.

I think so, too; that is why I made the following suggestion:

> >>> For example...
> >>> 1. crmd watches its request to pengine with a timer...
> >>> 2. pengine performs its writes under a timer and watches their progress...
> >>> ...etc...

But there may be a better method.
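To illustrate the idea (only a rough sketch, not real Pacemaker code):
every synchronous write would run under a deadline, so that blocked I/O
is detected instead of hanging forever. From the shell it might look
like this (the probe path and the 10-second deadline are placeholders
I made up):

  # Hypothetical probe: a small synchronous write with a deadline.
  # "timeout" and dd's conv=fsync are standard coreutils features;
  # /var/lib/pengine/.io_probe is just an example path.
  if ! timeout 10 dd if=/dev/zero of=/var/lib/pengine/.io_probe \
          bs=4096 count=1 conv=fsync 2>/dev/null; then
      echo "disk I/O appears to be blocked" >&2
  fi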
> You might be able to get away with it if you turn off saving PE files to disk
> though.
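If I understand that suggestion correctly, it would mean setting the PE
series limits to zero so that pengine keeps no files on disk. I assume
something like the following (please correct me if these are the wrong
properties, or if 1.0.13 does not support them):

  # Assumption: a series limit of 0 stops pengine from saving
  # pe-input/pe-warn/pe-error files to disk.
  crm configure property pe-input-series-max=0
  crm configure property pe-warn-series-max=0
  crm configure property pe-error-series-max=0

If my understanding is right, that should at least remove the fsync()
calls seen in the strace below.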
> > The placement of this shared disk is similar in KVM, where the problem
> > does not occur.
>
> That it works in KVM in this situation is kind of surprising.
> Or perhaps I misunderstand.

I will check the details of the behaviour on KVM once again.
However, the behaviour on KVM is clearly different from the behaviour on
vSphere 5.1.

Best Regards,
Hideo Yamauchi.

> >
> > * We understand that the behaviour differs depending on the hypervisor.
> > * However, it seems necessary to work around this problem in order to use
> >   Pacemaker in a vSphere 5.1 environment.
> >
> > Best Regards,
> > Hideo Yamauchi.
> >
> > --- On Wed, 2013/5/15, Andrew Beekhof <and...@beekhof.net> wrote:
> >
> >> On 13/05/2013, at 4:14 PM, renayama19661...@ybb.ne.jp wrote:
> >>
> >>> Hi All,
> >>>
> >>> We built a simple cluster in a vSphere 5.1 environment.
> >>>
> >>> It consists of two ESXi servers and a shared disk.
> >>>
> >>> The guests are located on the shared disk.
> >>
> >> What is on the shared disk? The whole OS or app-specific data (i.e.
> >> nothing pacemaker needs directly)?
> >>
> >>> Step 1) Build the cluster. (The DC node is the active node.)
> >>>
> >>> ============
> >>> Last updated: Mon May 13 14:16:09 2013
> >>> Stack: Heartbeat
> >>> Current DC: pgsr01 (85a81130-4fed-4932-ab4c-21ac2320186f) - partition with quorum
> >>> Version: 1.0.13-30bb726
> >>> 2 Nodes configured, unknown expected votes
> >>> 2 Resources configured.
> >>> ============
> >>>
> >>> Online: [ pgsr01 pgsr02 ]
> >>>
> >>> Resource Group: test-group
> >>>     Dummy1 (ocf::pacemaker:Dummy): Started pgsr01
> >>>     Dummy2 (ocf::pacemaker:Dummy): Started pgsr01
> >>> Clone Set: clnPingd
> >>>     Started: [ pgsr01 pgsr02 ]
> >>>
> >>> Node Attributes:
> >>> * Node pgsr01:
> >>>     + default_ping_set : 100
> >>> * Node pgsr02:
> >>>     + default_ping_set : 100
> >>>
> >>> Migration summary:
> >>> * Node pgsr01:
> >>> * Node pgsr02:
> >>>
> >>> Step 2) Attach strace to the pengine process on the DC node.
> >>>
> >>> [root@pgsr01 ~]# ps -ef |grep heartbeat
> >>> root      2072     1  0 13:56 ?  00:00:00 heartbeat: master control process
> >>> root      2075  2072  0 13:56 ?  00:00:00 heartbeat: FIFO reader
> >>> root      2076  2072  0 13:56 ?  00:00:00 heartbeat: write: bcast eth1
> >>> root      2077  2072  0 13:56 ?  00:00:00 heartbeat: read: bcast eth1
> >>> root      2078  2072  0 13:56 ?  00:00:00 heartbeat: write: bcast eth2
> >>> root      2079  2072  0 13:56 ?  00:00:00 heartbeat: read: bcast eth2
> >>> 496       2082  2072  0 13:57 ?  00:00:00 /usr/lib64/heartbeat/ccm
> >>> 496       2083  2072  0 13:57 ?  00:00:00 /usr/lib64/heartbeat/cib
> >>> root      2084  2072  0 13:57 ?  00:00:00 /usr/lib64/heartbeat/lrmd -r
> >>> root      2085  2072  0 13:57 ?  00:00:00 /usr/lib64/heartbeat/stonithd
> >>> 496       2086  2072  0 13:57 ?  00:00:00 /usr/lib64/heartbeat/attrd
> >>> 496       2087  2072  0 13:57 ?  00:00:00 /usr/lib64/heartbeat/crmd
> >>> 496       2089  2087  0 13:57 ?  00:00:00 /usr/lib64/heartbeat/pengine
> >>> root      2182     1  0 14:15 ?  00:00:00 /usr/lib64/heartbeat/pingd -D -p /var/run//pingd-default_ping_set -a default_ping_set -d 5s -m 100 -i 1 -h 192.168.101.254
> >>> root      2287  1973  0 14:16 pts/0  00:00:00 grep heartbea
> >>>
> >>> [root@pgsr01 ~]# strace -p 2089
> >>> Process 2089 attached - interrupt to quit
> >>> restart_syscall(<... resuming interrupted call ...>) = 0
> >>> times({tms_utime=5, tms_stime=6, tms_cutime=0, tms_cstime=0}) = 429527557
> >>> recvfrom(5, 0xa93ff7, 953, 64, 0, 0) = -1 EAGAIN (Resource temporarily unavailable)
> >>> poll([{fd=5, events=0}], 1, 0) = 0 (Timeout)
> >>> recvfrom(5, 0xa93ff7, 953, 64, 0, 0) = -1 EAGAIN (Resource temporarily unavailable)
> >>> poll([{fd=5, events=0}], 1, 0) = 0 (Timeout)
> >>> (snip)
> >>>
> >>> Step 3) Disconnect the shared disk on which the active node is placed.
> >>>
> >>> Step 4) Cut off the pingd network of the standby node.
> >>>         The pingd score is updated correctly, but pengine's processing
> >>>         stays blocked.
> >>>
> >>> ~ # esxcfg-vswitch -N vmnic1 -p "ap-db" vSwitch1
> >>> ~ # esxcfg-vswitch -N vmnic2 -p "ap-db" vSwitch1
> >>>
> >>> (snip)
> >>> brk(0xd05000) = 0xd05000
> >>> brk(0xeed000) = 0xeed000
> >>> brk(0xf2d000) = 0xf2d000
> >>> fstat(6, {st_mode=S_IFREG|0600, st_size=0, ...}) = 0
> >>> mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f86a255a000
> >>> write(6, "BZh51AY&SY\327\373\370\203\0\t(_\200UPX\3\377\377%cT \277\377\377"..., 2243) = 2243
> >>> brk(0xb1d000) = 0xb1d000
> >>> fsync(6 ------------------------------> BLOCKED
> >>> (snip)
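(For reference: the block can probably be reproduced without Pacemaker
by issuing any synchronous write against the affected datastore, for
example with a hypothetical one-liner such as

  # This should hang the same way once the shared disk is disconnected,
  # if fsync() on the datastore is really what blocks.
  dd if=/dev/zero of=/var/lib/pengine/.repro bs=4096 count=1 conv=fsync

where the path is only a placeholder.)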
> >>> ============
> >>> Last updated: Mon May 13 14:19:15 2013
> >>> Stack: Heartbeat
> >>> Current DC: pgsr01 (85a81130-4fed-4932-ab4c-21ac2320186f) - partition with quorum
> >>> Version: 1.0.13-30bb726
> >>> 2 Nodes configured, unknown expected votes
> >>> 2 Resources configured.
> >>> ============
> >>>
> >>> Online: [ pgsr01 pgsr02 ]
> >>>
> >>> Resource Group: test-group
> >>>     Dummy1 (ocf::pacemaker:Dummy): Started pgsr01
> >>>     Dummy2 (ocf::pacemaker:Dummy): Started pgsr01
> >>> Clone Set: clnPingd
> >>>     Started: [ pgsr01 pgsr02 ]
> >>>
> >>> Node Attributes:
> >>> * Node pgsr01:
> >>>     + default_ping_set : 100
> >>> * Node pgsr02:
> >>>     + default_ping_set : 0 : Connectivity is lost
> >>>
> >>> Migration summary:
> >>> * Node pgsr01:
> >>> * Node pgsr02:
> >>>
> >>> Step 5) Reconnect the pingd network of the standby node.
> >>>         The pingd score is updated correctly, but pengine's processing
> >>>         stays blocked.
> >>>
> >>> ~ # esxcfg-vswitch -M vmnic1 -p "ap-db" vSwitch1
> >>> ~ # esxcfg-vswitch -M vmnic2 -p "ap-db" vSwitch1
> >>>
> >>> ============
> >>> Last updated: Mon May 13 14:19:40 2013
> >>> Stack: Heartbeat
> >>> Current DC: pgsr01 (85a81130-4fed-4932-ab4c-21ac2320186f) - partition with quorum
> >>> Version: 1.0.13-30bb726
> >>> 2 Nodes configured, unknown expected votes
> >>> 2 Resources configured.
> >>> ============
> >>>
> >>> Online: [ pgsr01 pgsr02 ]
> >>>
> >>> Resource Group: test-group
> >>>     Dummy1 (ocf::pacemaker:Dummy): Started pgsr01
> >>>     Dummy2 (ocf::pacemaker:Dummy): Started pgsr01
> >>> Clone Set: clnPingd
> >>>     Started: [ pgsr01 pgsr02 ]
> >>>
> >>> Node Attributes:
> >>> * Node pgsr01:
> >>>     + default_ping_set : 100
> >>> * Node pgsr02:
> >>>     + default_ping_set : 100
> >>>
> >>> Migration summary:
> >>> * Node pgsr01:
> >>> * Node pgsr02:
> >>>
> >>> --------- pengine remains blocked -----
> >>>
> >>> Step 6) Cut off the pingd network of the active node.
> >>>         The pingd score is updated correctly, but pengine's processing
> >>>         stays blocked.
> >>>
> >>> ~ # esxcfg-vswitch -N vmnic1 -p "ap-db" vSwitch1
> >>> ~ # esxcfg-vswitch -N vmnic2 -p "ap-db" vSwitch1
> >>>
> >>> ============
> >>> Last updated: Mon May 13 14:20:32 2013
> >>> Stack: Heartbeat
> >>> Current DC: pgsr01 (85a81130-4fed-4932-ab4c-21ac2320186f) - partition with quorum
> >>> Version: 1.0.13-30bb726
> >>> 2 Nodes configured, unknown expected votes
> >>> 2 Resources configured.
> >>> ============
> >>>
> >>> Online: [ pgsr01 pgsr02 ]
> >>>
> >>> Resource Group: test-group
> >>>     Dummy1 (ocf::pacemaker:Dummy): Started pgsr01
> >>>     Dummy2 (ocf::pacemaker:Dummy): Started pgsr01
> >>> Clone Set: clnPingd
> >>>     Started: [ pgsr01 pgsr02 ]
> >>>
> >>> Node Attributes:
> >>> * Node pgsr01:
> >>>     + default_ping_set : 0 : Connectivity is lost
> >>> * Node pgsr02:
> >>>     + default_ping_set : 100
> >>>
> >>> Migration summary:
> >>> * Node pgsr01:
> >>> * Node pgsr02:
> >>>
> >>> --------- pengine remains blocked -----
> >>>
> >>> After this, the resources do not move to the standby node, because no
> >>> transition can be produced while pengine remains blocked.
> >>> In the vSphere environment the block is eventually released after a
> >>> considerable time, and a transition is finally produced.
> >>> * The I/O blocking of pengine seems to occur repeatedly.
> >>> * Other processes may be blocked, too.
> >>> * It took more than one hour from the failure to the completion of
> >>>   failover.
> >>>
> >>> This problem shows that resources may fail to move after disk trouble
> >>> in a vSphere environment.
> >>>
> >>> Because our users want to run Pacemaker in vSphere environments, a
> >>> solution to this problem is necessary.
> >>>
> >>> Do you know of any case where a similar problem was solved on vSphere?
> >>>
> >>> If there is no known solution, we think it is necessary to avoid
> >>> blocking pengine.
> >>>
> >>> For example...
> >>> 1. crmd watches its request to pengine with a timer...
> >>> 2. pengine performs its writes under a timer and watches their progress...
> >>> ...etc...
> >>>
> >>> * This problem does not seem to occur on KVM.
> >>> * The difference may lie in the hypervisor.
> >>> * In addition, the problem did not occur on a physical Linux machine.
> >>>
> >>> Best Regards,
> >>> Hideo Yamauchi.

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org