Hi, I saw in the mailing list archives that the topic in the subject line has been discussed before, but I thought I might add my own experience to the debate.
I'm running (not yet in production, so I'm free to run all kinds of tests) a three-node Corosync/Pacemaker cluster (Rita, Sara, Quorum-rs). Two nodes are physical machines running Xen (Rita and Sara), while the third (Quorum-rs) is a virtual machine running on another physical host (so it never runs on Rita or Sara). Quorum-rs is always on standby, never runs any services and, as the name suggests, is there just to be counted for quorum.

Rita and Sara share a small DRBD+OCFS2 storage, just to hold the configuration files of the Xen domains, some ISO images and so on. There is also an NFS partition mounted on both Rita and Sara from an external NAS device. Both Rita and Sara have a dedicated eth1 interface towards the NFS server, which is used by neither Corosync nor DRBD.

There is, for example, one Xen HVM virtual machine running Windows XP which has its disk as a file on NFS. If I exercise the disk in the WinXP domain (by running 'sdelete -c c:'), things get worse. I start getting messages like the ones below on the node where this WinXP guest is running:

Aug 26 10:09:28 rita corosync[1359]: [TOTEM ] Process pause detected for 5614 ms, flushing membership messages.
Aug 26 10:09:28 rita corosync[1359]: [pcmk ] notice: pcmk_peer_update: Transitional membership event on ring 128: memb=3, new=0, lost=0
Aug 26 10:09:28 rita corosync[1359]: [pcmk ] info: pcmk_peer_update: memb: rita 16863498
Aug 26 10:09:28 rita corosync[1359]: [pcmk ] info: pcmk_peer_update: memb: sara 33640714
Aug 26 10:09:28 rita corosync[1359]: [pcmk ] info: pcmk_peer_update: memb: quorum-rs 50417930
Aug 26 10:09:28 rita corosync[1359]: [pcmk ] notice: pcmk_peer_update: Stable membership event on ring 128: memb=3, new=0, lost=0
Aug 26 10:09:28 rita corosync[1359]: [pcmk ] info: pcmk_peer_update: MEMB: rita 16863498
Aug 26 10:09:28 rita corosync[1359]: [pcmk ] info: pcmk_peer_update: MEMB: sara 33640714
Aug 26 10:09:28 rita corosync[1359]: [pcmk ] info: pcmk_peer_update: MEMB: quorum-rs 50417930
Aug 26 10:09:28 rita corosync[1359]: [TOTEM ] A processor joined or left the membership and a new membership was formed.
Aug 26 10:09:28 rita corosync[1359]: [MAIN ] Completed service synchronization, ready to provide service.

If Corosync is paused for a longer time, as in:

Aug 26 10:11:00 rita corosync[1359]: [TOTEM ] Process pause detected for 16948 ms, flushing membership messages.

then the node is detected as failed by the remaining two nodes and fenced (powered down). That is no surprise: when I run tcpdump on the Quorum-rs server, for example, I can see a period of around 17 seconds during which no messages arrive from Rita, the node running the WinXP guest:
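(Editorial aside: not a fix for the stall itself, but a possible stopgap against the fencing. The window the other nodes grant before declaring a node dead is governed by the Totem token timeout, which can be raised in /etc/corosync/corosync.conf on all nodes. The `token` and `consensus` parameters are real Corosync totem options; the values below are illustrative assumptions, not tested recommendations:)

```
totem {
    version: 2
    # Default token timeout is on the order of 1000 ms, so a ~17 s
    # scheduling stall of the corosync process will always exceed it.
    # Raising it trades slower failure detection for tolerance of pauses.
    token: 20000
    # consensus must stay larger than token (default is 1.2 * token).
    consensus: 24000
    # (interface / bindnetaddr settings unchanged)
}
```

This only masks the symptom; the capture below shows the underlying 17-second transmit gap directly.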
10:11:07.285895 IP 10.81.1.1.5404 > 239.94.81.2.5405: UDP, length 82
10:11:07.600773 IP 10.81.1.1.5404 > 239.94.81.2.5405: UDP, length 82
10:11:07.915640 IP 10.81.1.1.5404 > 239.94.81.2.5405: UDP, length 82
10:11:08.230572 IP 10.81.1.1.5404 > 239.94.81.2.5405: UDP, length 82
10:11:08.547069 IP 10.81.1.1.5404 > 239.94.81.2.5405: UDP, length 82
10:11:08.862314 IP 10.81.1.1.5404 > 239.94.81.2.5405: UDP, length 82
10:11:25.923506 IP 10.81.1.1.5404 > 239.94.81.2.5405: UDP, length 82 <---- gap here
10:11:25.923676 IP 10.81.1.1.5404 > 239.94.81.2.5405: UDP, length 200
10:11:26.132995 IP 10.81.1.1.5404 > 239.94.81.2.5405: UDP, length 200

Things I have tried so far:

1.) Both Rita and Sara have two CPU cores (0-1) dedicated to dom0, and the domUs can run only on cores 2-7:

xen_commandline : dom0_mem=768M dom0_max_vcpus=2 dom0_vcpus_pin

Name       ID  VCPU  CPU  State  Time(s)  CPU Affinity
Domain-0    0     0    0  r--      221.3  0
Domain-0    0     1    1  -b-      280.7  1

2.) Changed the credit scheduler weight for Domain-0 from the default 256 to 1024:

Name       ID  Weight  Cap
Domain-0    0    1024    0

3.) Recompiled Corosync with the patch corosync-trunk-reset-pause-timestamp-on-events.patch applied.

None of it made any difference; I can reproduce the problem each and every time. Does anybody have any hints on what to try next? I'm thinking of switching to jumbo frames on the eth1 NIC towards the NFS server (I should be able to do it on both ends) and of recompiling the kernel with CONFIG_PREEMPT=y. Perhaps I could also try the Linux Trace Toolkit?

As for the versions, it's a Debian/Squeeze installation:

corosync/squeeze uptodate 1.2.1-4
pacemaker/squeeze uptodate 1.0.9.1+hg15626-1
ocfs2-tools/squeeze uptodate 1.4.4-3
kernel 2.6.32-5-xen-amd64

Best Regards,
Martin

_______________________________________________
Openais mailing list
Openais@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/openais
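(Editorial aside on the capture above: the gap need not be eyeballed; it can be extracted mechanically from saved tcpdump output. A minimal sketch in awk, assuming the default tcpdump timestamp format `HH:MM:SS.usec` in the first field and a capture that does not cross midnight; `capture.txt` is a hypothetical file holding the tcpdump lines:)

```shell
# Print any inter-packet gap larger than 1 second, along with the line
# number of the packet that follows the gap.
awk '{
    split($1, t, ":")                     # "10:11:25.923506" -> h, m, s.frac
    now = t[1] * 3600 + t[2] * 60 + t[3]  # seconds since midnight
    if (NR > 1 && now - prev > 1)
        printf "gap of %.3f s before line %d\n", now - prev, NR
    prev = now
}' capture.txt
```

On the excerpt above this reports a single gap of about 17.061 s, between the packets at 10:11:08.862314 and 10:11:25.923506.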