Re: [Pacemaker] Problem to support NFS v3
Thanks Dejan. I will try to look for answers from other components.

Liang Ma
Contractuel | Consultant | SED Systems Inc.
Ground Systems Analyst
Agence spatiale canadienne | Canadian Space Agency
6767, Route de l'Aéroport, Longueuil (St-Hubert), QC, Canada, J3Y 8Y9
Tél/Tel : (450) 926-5099 | Téléc/Fax: (450) 926-5083
Courriel/E-mail : [liang...@space.gc.ca]
Site web/Web site : [www.space.gc.ca]

-----Original Message-----
From: Dejan Muhamedagic [mailto:deja...@fastmail.fm]
Sent: September 17, 2010 5:27 AM
To: The Pacemaker cluster resource manager
Subject: Re: [Pacemaker] Problem to support NFS v3

I can vaguely recall problems like this in the past, but I am missing
all the details :-/ At any rate, the problem is related either to the
kernel or to NFS. I think you would do better to ask in a Fedora or
NFS related forum.

Thanks,
Dejan

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
Re: [Pacemaker] Stonith External/SBD problem/question
Ignore this, guys. Sorry for double posting as well (to both the
linux-ha and pacemaker mailing lists). I forgot to add the resource
name.

Thanks,
Mihai

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
Re: [Pacemaker] cib fails to start until host is rebooted
Andrew Beekhof wrote:
> I spoke to Steve, and the only thing he could come up with was that
> the group might not be correct.
>
> When the cluster is in this state, please run:
>
>     ps x -o pid,euser,ruser,egroup,rgroup,command
>
> And compare it to the "normal" output. Also, confirm that there is
> only one group named haclient, and one user named hacluster.

Thanks, that was the right track. Looks like I fat-fingered a '9' in
front of the '0' in root's gid in /etc/passwd:

    root:x:0:90:root:/root:/bin/bash

gid 90 happens to be owned by haclient. With root's gid fixed,
everything works as expected.

Mike

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
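For anyone hitting the same wall, a quick sanity check along these lines (a sketch; it only verifies root's primary gid, the field that was fat-fingered above) would have flagged the typo immediately:

```shell
# root's primary gid must be 0; a mangled /etc/passwd entry like
# root:x:0:90:... makes both of these print 90 instead of 0.
id -g root
awk -F: '$1 == "root" { print $4 }' /etc/passwd
```

Combine it with `getent group haclient` and `getent passwd hacluster` to confirm, as Andrew suggests, that each exists exactly once.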
[Pacemaker] Stonith External/SBD problem/question
Hello guys,

I'm a newbie with Linux-HA and, as all newbies, I have a problem that
I'm trying to sort out.

I have shared storage (iSCSI) and 2 nodes connected to it. I wanted to
have stonith active, so I configured a separate partition on the shared
storage, initialized it, and configured it in /etc/sysconfig/sbd. All
fine.

Now I tried defining the primitive in pacemaker, in crm configure:

crm(live)configure# property stonith-enabled="true"
crm(live)configure# property stonith-timeout="30s"
crm(live)configure# primitive stonith:external/sbd params sbd_device="/dev/sda2"
ERROR: provider could not be determined for params
ERROR: syntax in primitive: params sbd_device=/dev/sda2

I changed the device naming to:

crm(live)configure# primitive stonith:external/sbd params sbd_device="/dev/disk/by-id/scsi-360003ffc4b1185eab88f92602529a5a3-part2"
ERROR: provider could not be determined for params
ERROR: syntax in primitive: params sbd_device=/dev/disk/by-id/scsi-360003ffc4b1185eab88f92602529a5a3-part2

then to:

crm(live)configure# primitive stonith:external/sbd params sbd_device="/dev/disk/by-path/ip-192.168.10.2:3260-iscsi-iqn.1991-05.com.microsoft:deh9451lvp--mailconta-target-lun-0-part2"
ERROR: provider could not be determined for params
ERROR: syntax in primitive: params sbd_device=/dev/disk/by-path/ip-192.168.10.2:3260-iscsi-iqn.1991-05.com.microsoft:deh9451lvp--mailconta-target-lun-0-part2

and finally to:

crm(live)configure# primitive stonith:external/sbd params sbd_device="/dev/disk/by-uuid/d25c9264-6065-4dee-9a39-2dc146727579"
ERROR: provider could not be determined for params
ERROR: syntax in primitive: params sbd_device=/dev/disk/by-uuid/d25c9264-6065-4dee-9a39-2dc146727579

Can someone point me in the right direction? What am I doing wrong and
how can I fix this?

Thanks,
Mihai

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
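The errors above come from the primitive syntax rather than the device path: crm expects a resource id between `primitive` and the resource type, so with the id missing it tries to parse `params` as the type. A corrected version (the id `sbd-fencing` is made up; any unique name works) would look like:

```
crm(live)configure# primitive sbd-fencing stonith:external/sbd \
        params sbd_device="/dev/disk/by-id/scsi-360003ffc4b1185eab88f92602529a5a3-part2"
```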
Re: [Pacemaker] cannot append to /var/log/cluster/corosync.log, permission denied
Change the log owner and group to:

    -rw-rw---- 1 hacluster haclient 111231 17 set 13:01 /var/log/corosync/corosync.log

and check your logrotate config (if you use it). Even if you have
specified "user: root" in corosync.conf, the crmd daemon is executed as
hacluster, not as root. I think this change was introduced in pacemaker
version 1.0.9.

regards,
mike

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
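As a concrete sketch of the change above (an illustrative root session using the path and ownership quoted in the message):

```
# chown hacluster:haclient /var/log/corosync/corosync.log
# chmod 0660 /var/log/corosync/corosync.log
# ls -l /var/log/corosync/corosync.log
-rw-rw---- 1 hacluster haclient 111231 17 set 13:01 /var/log/corosync/corosync.log
```

If logrotate manages this file, its create directive has to match as well (e.g. `create 0660 hacluster haclient`), or the rotated file will come back owned by root:root.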
Re: [Pacemaker] cannot append to /var/log/cluster/corosync.log permission denied
-----Original Message-----
From: Dejan Muhamedagic [mailto:deja...@fastmail.fm]
Sent: Friday, September 17, 2010 6:01 PM
To: The Pacemaker cluster resource manager
Subject: Re: [Pacemaker] cannot append to /var/log/cluster/corosync.log permission denied

> logging {
>         fileline: off
>         to_stderr: no
>         to_logfile: yes

Change this to:

    to_logfile: no

And use just syslog.

Thanks,
Dejan

Hi, Dejan,

Thanks, it works. However, another cluster that I finished setting up
about a month ago didn't have this problem. Anyway, thank you.

Regards,
Alister

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
Re: [Pacemaker] cannot append to /var/log/cluster/corosync.log permission denied
Hi,

On Fri, Sep 17, 2010 at 12:49:39AM +0800, Alister Wong wrote:
> Hi, all,
>
> I just set up a 2-node cluster on two RHEL servers (5.5 x86_64).
>
> After setup everything looked fine. However, in /var/log/messages I
> found some logs showing that crmd failed to append to corosync.log.
> I run corosync as root.
>
> e.g.
>
> Sep 17 00:39:14 property01-b crmd: [17332]: info: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ]
> Sep 17 00:39:14 property01-b crmd: Cannot append to /var/log/cluster/corosync.log: Permission denied
> Sep 17 00:39:14 property01-b crmd: [17332]: info: do_state_transition: Starting PEngine Recheck Timer
> Sep 17 00:39:14 property01-b crmd: Cannot append to /var/log/cluster/corosync.log: Permission denied
>
> Below are corosync.conf and the crm configure result.
>
> corosync.conf:
>
> # Please read the corosync.conf.5 manual page
> compatibility: whitetank
>
> totem {
>         version: 2
>         secauth: off
>         threads: 0
>         interface {
>                 ringnumber: 0
>                 bindnetaddr: 192.168.200.15
>                 mcastaddr: 226.94.1.1
>                 mcastport: 5405
>         }
> }
>
> logging {
>         fileline: off
>         to_stderr: no
>         to_logfile: yes

Change this to:

    to_logfile: no

And use just syslog.

Thanks,
Dejan

>         to_syslog: yes
>         logfile: /var/log/cluster/corosync.log
>         debug: off
>         timestamp: on
>         logger_subsys {
>                 subsys: AMF
>                 debug: off
>         }
> }
>
> amf {
>         mode: disabled
> }
>
> aisexec {
>         user: root
>         group: root
> }
>
> service {
>         # Load the Pacemaker Cluster Resource Manager
>         name: pacemaker
>         ver: 0
> }
>
> crm configure:
>
> node property01-a
> node property01-b
> property $id="cib-bootstrap-options" \
>         dc-version="1.0.9-89bd754939df5150de7cd76835f98fe90851b677" \
>         cluster-infrastructure="openais" \
>         expected-quorum-votes="2"
>
> Here is the /var/log/cluster/corosync.log permission info:
>
> [r...@property01-b corosync]# ls -l /var/log/cluster/
> total 148
> -rw-rw---- 1 root root 145285 Sep 17 00:39 corosync.log
>
> From corosync.log, pcmk and corosync should be able to write to it:
>
> Sep 17 00:39:07 corosync [pcmk ] info: pcmk_peer_update: MEMB: property01-b 80259264
> Sep 17 00:39:07 corosync [pcmk ] info: send_member_notification: Sending membership update 72 to 2 children
> Sep 17 00:39:07 corosync [pcmk ] info: update_member: 0x5275c50 Node 80259264 ((null)) born on: 72
> Sep 17 00:39:07 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.
> Sep 17 00:39:07 corosync [pcmk ] info: update_member: 0x527cec0 Node 63482048 (property01-a) born on: 72
> Sep 17 00:39:07 corosync [pcmk ] info: update_member: 0x527cec0 Node 63482048 now known as property01-a (was: (null))
> Sep 17 00:39:07 corosync [pcmk ] info: update_member: Node property01-a now has process list: 00013312 (78610)
> Sep 17 00:39:08 corosync [pcmk ] info: update_member: Node property01-a now has 1 quorum votes (was 0)
> Sep 17 00:39:08 corosync [pcmk ] info: send_member_notification: Sending membership update 72 to 2 children
> Sep 17 00:39:08 corosync [MAIN ] Completed service synchronization, ready to provide service.
>
> The cluster was just set up and the current configuration is very
> simple, but I can't find out what I did to cause this issue. Can
> anyone help me get rid of the permission issue? I have no clue about
> the cause.
>
> Thank you very much.
>
> Regards,
> Alister

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
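For reference, the logging section with Dejan's change applied would look like this (a sketch; only to_logfile changes, and the logfile: line is dropped since it is unused once file logging is off):

```
logging {
        fileline: off
        to_stderr: no
        to_logfile: no
        to_syslog: yes
        debug: off
        timestamp: on
        logger_subsys {
                subsys: AMF
                debug: off
        }
}
```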
Re: [Pacemaker] Problem to support NFS v3
Hi,

On Thu, Sep 16, 2010 at 09:04:29AM -0400, liang...@asc-csa.gc.ca wrote:
> Somehow this may not get through. Let me try it again.
>
> Hi There,
>
> I have set up a pair of primary/secondary servers for services such
> as web, ftp, samba and nfs with pacemaker 1.1.1, drbd-pacemaker 8.3.7
> and corosync 1.2.7 on Fedora 13. Things work well except for an old
> Solaris 2.7 client. The Solaris system can't mount from the nfs
> server because it supports NFS v3, not v4. From tcpdump, this is what
> goes wrong with a pure NFS v3 client:
>
> 19:55:02.886062 192.168.249.150.39331 > 10.1.1.200.sunrpc: udp 84 (DF)
> 19:55:02.886625 10.1.1.199.sunrpc > 192.168.249.150.39331: udp 28 (DF)
> 20:01:12.741180 192.168.249.150.52440 > 10.1.1.200.sunrpc: udp 84 (DF)
> 20:01:12.741667 10.1.1.199.sunrpc > 192.168.249.150.52440: udp 28 (DF)
>
> where 10.1.1.200 is the virtual IP address and 10.1.1.199 is the IP
> address of the primary server. If I mount via the non-virtual address
> 10.1.1.199, it works fine, but obviously the virtual address is what
> we want with a virtual server.
>
> Originally I used the resource agent lsb:nfs, then changed to
> ocf:heartbeat:nfsserver with a parameter pointing to the virtual
> address 10.1.1.200. I also added rpcbind, which takes care of the
> sunrpc requests, as a clustered service (see the related
> configuration below). It behaves the same as before: the virtual
> server responds from its physical address.
>
> primitive nfs ocf:heartbeat:nfsserver \
>         params nfs_ip="10.1.1.200" nfs_shared_infodir="/var/drbdata0/exports" \
>         params nfs_init_script="/etc/init.d/nfs" nfs_notify_cmd="/sbin/rpc.statd" \
>         op monitor interval="5s" timeout="20s" depth="0" \
>         meta target-role="Started"
> primitive rpcbind lsb:rpcbind \
>         op monitor interval="50s" \
>         meta target-role="Started"
>
> The virtual NFS service works fine with nfs clients that support
> NFS v4. Any ideas please?

I can vaguely recall problems like this in the past, but I am missing
all the details :-/ At any rate, the problem is related either to the
kernel or to NFS. I think you would do better to ask in a Fedora or
NFS related forum.

Thanks,
Dejan

> Thanks in advance.
>
> Liang Ma
> Contractuel | Consultant | SED Systems Inc.
> Ground Systems Analyst
> Agence spatiale canadienne | Canadian Space Agency
> 6767, Route de l'Aéroport, Longueuil (St-Hubert), QC, Canada, J3Y 8Y9
> Tél/Tel : (450) 926-5099 | Téléc/Fax: (450) 926-5083
> Courriel/E-mail : [liang...@space.gc.ca]
> Site web/Web site : [www.space.gc.ca]

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
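One avenue worth checking (a guess, not something confirmed in this thread): the tcpdump output above is the classic multihomed-UDP symptom, where rpcbind's reply leaves from the interface's primary address rather than the address the request was sent to. rpcbind(8) documents a -h option to bind specific addresses for exactly this case. On Fedora that might be wired up through /etc/sysconfig/rpcbind; the variable name and the file's use by the init script are assumptions to verify against your distribution:

```
# /etc/sysconfig/rpcbind -- bind rpcbind to the cluster VIP as well,
# so UDP replies carry 10.1.1.200 as their source address.
# (Hypothetical fragment; check rpcbind(8) and your init script.)
RPCBIND_ARGS="-h 127.0.0.1 -h 10.1.1.200"
```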
Re: [Pacemaker] error in logs - cannot interleave clone... and some general ideas
2010/9/15 Michał Purzyński:
> hey,
>
> i'm in the process of setting up a failover + load balancing cluster
> for OpenLDAP (mirror mode).
>
> i've got some strange log messages i don't really understand:
>
> Sep 15 00:33:38 udalia3 crmd: [30332]: ERROR: crmd_ha_msg_filter: Another DC detected: udalia4 (op=noop)
>
> this one is easy - one node lost connectivity, and each of them (i
> have a two node cluster) thinks it's the DC. anyway, after
> connectivity is back, everything syncs.
>
> btw, what should i do about that? stonith?

almost certainly

> does not seem to be a good solution with the slapd service - no
> shared data, each node has its own database, they are synchronized at
> the ldap replication level, mirror mode.

what happens if changes are made to both?

> if you have some ideas/thoughts about my cluster setup, share them as
> well. maybe there are some logical mistakes i made or something. just
> comment on it.
>
> Sep 15 00:36:53 udalia3 pengine: [30331]: ERROR: clone_rsc_colocation_rh: Cannot interleave clone LDAPclone and ClusterIPclone because they do not support the same number of resources per node
>
> this is the one i don't get. i don't interleave anything. at least i
> don't know i do ;)

it became the default.
set the interleave=false param for both clones and it will go away

> configuration of my cluster:
>
> node udalia3 \
>         attributes standby="off"
> node udalia4 \
>         attributes standby="off"
> primitive ClusterIP ocf:heartbeat:IPaddr2 \
>         params ip="172.16.6.93" cidr_netmask="24" clusterip_hash="sourceip-sourceport" \
>         op monitor interval="0" timeout="60s" start stop
> primitive LDAP lsb:slapd \
>         op monitor interval="0" timeout="60s" start stop
> primitive resPing ocf:pacemaker:ping \
>         params host_list="172.16.4.7" dampen="5s" multiplier="1000" debug="true" \
>         op start interval="0" timeout="60s" \
>         op stop interval="0" timeout="60s" \
>         op monitor interval="0" timeout="60s"
> clone ClusterIPclone ClusterIP \
>         meta globally-unique="true" clone-max="2" clone-node-max="2" target-role="Started"
> clone LDAPclone LDAP \
>         meta globally-unique="false" clone-max="2" clone-node-max="1" target-role="Started"
> clone clonePing resPing
> location ip-no-convectivity ClusterIPclone \
>         rule $id="ping-exclude-rule2" -inf: not_defined pingd or pingd number:lte 0
> location ldap-no-convectivity LDAPclone \
>         rule $id="ping-exclude-rule" -inf: not_defined pingd or pingd number:lte 0
> colocation ldap-with-ip inf: LDAPclone ClusterIPclone
> order ldap-after-ip inf: ClusterIPclone LDAPclone
> property $id="cib-bootstrap-options" \
>         dc-version="1.1.2-53b05d88305c603b83b6dbdd799f8e3bca9b7efd" \
>         cluster-infrastructure="openais" \
>         expected-quorum-votes="2" \
>         stonith-enabled="false" \
>         no-quorum-policy="ignore" \
>         last-lrm-refresh="1284502285"
> rsc_defaults $id="rsc-options" \
>         resource-stickiness="0"
>
> thx for your help.
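Applied to the configuration above, the suggestion amounts to adding the interleave meta attribute to both clone definitions (a sketch; everything else unchanged):

```
clone ClusterIPclone ClusterIP \
        meta globally-unique="true" clone-max="2" clone-node-max="2" \
        interleave="false" target-role="Started"
clone LDAPclone LDAP \
        meta globally-unique="false" clone-max="2" clone-node-max="1" \
        interleave="false" target-role="Started"
```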
___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
Re: [Pacemaker] cib fails to start until host is rebooted
I spoke to Steve, and the only thing he could come up with was that the
group might not be correct.

When the cluster is in this state, please run:

    ps x -o pid,euser,ruser,egroup,rgroup,command

And compare it to the "normal" output. Also, confirm that there is only
one group named haclient, and one user named hacluster.

On Tue, Sep 7, 2010 at 11:03 PM, Michael Smith wrote:
> Michael Smith wrote:
>> On Mon, 6 Sep 2010, Andrew Beekhof wrote:
>>> Is /dev/shm full (or not mounted) by any chance?
>>
>> No - I tried clearing that out, too.
>>
>>> And corosync is actually running?
>>
>> Yes, it's logging "[IPC ] Invalid IPC credentials." when cib tries
>> to connect.
>
> For what it's worth, I have the same problem after updating:
>
> cluster-glue-1.0.6-2.1
> corosync-1.2.7-1.1
> openais-1.1.3-1.1
> pacemaker-1.1.2.1-5.1
>
> Mike

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker