Re: [Pacemaker] Problem to support NFS v3

2010-09-17 Thread Liang.Ma
Thanks Dejan.

I will try to look for answers from other components.

Liang Ma
Contractuel | Consultant | SED Systems Inc. 
Ground Systems Analyst
Agence spatiale canadienne | Canadian Space Agency
6767, Route de l'Aéroport, Longueuil (St-Hubert), QC, Canada, J3Y 8Y9
Tél/Tel : (450) 926-5099 | Téléc/Fax: (450) 926-5083
Courriel/E-mail : [liang...@space.gc.ca]
Site web/Web site : [www.space.gc.ca ] 




-Original Message-
From: Dejan Muhamedagic [mailto:deja...@fastmail.fm] 
Sent: September 17, 2010 5:27 AM
To: The Pacemaker cluster resource manager
Subject: Re: [Pacemaker] Problem to support NFS v3

Hi,
On Thu, Sep 16, 2010 at 09:04:29AM -0400, liang...@asc-csa.gc.ca wrote:
> Somehow this may not get through. Let me try it again.
> 
> Hi There,
> 
> I have set up a pair of primary/secondary servers for services
> such as web, ftp, samba and nfs with pacemaker 1.1.1,
> drbd-pacemaker 8.3.7, corosync 1.2.7 on Fedora 13. Things work
> well except for an old Solaris 2.7 client. The Solaris system
> can't mount from the NFS server because it supports only NFS v3, not
> v4. From tcpdump, this is what goes wrong with a pure NFS v3
> client:
> 
> 19:55:02.886062 192.168.249.150.39331 > 10.1.1.200.sunrpc:  udp 84 (DF)
> 19:55:02.886625 10.1.1.199.sunrpc > 192.168.249.150.39331:  udp 28 (DF) 
> 20:01:12.741180 192.168.249.150.52440 > 10.1.1.200.sunrpc:  udp 84 (DF)
> 20:01:12.741667 10.1.1.199.sunrpc > 192.168.249.150.52440:  udp 28 (DF)


> 
> Here 10.1.1.200 is the virtual IP address and 10.1.1.199 is the IP address 
> of the primary server. If I mount from the non-virtual address 
> 10.1.1.199, it works fine. But obviously that is not what we want with a virtual 
> server.
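
The mismatch in the capture above can be checked mechanically. The following Python sketch is an illustration only and not part of the original thread: it pairs each request with the reply sent back to the same client endpoint and reports replies whose source address differs from the address the client queried. The function names, and the assumption that every line follows the `time src > dst: ...` layout shown above, are hypothetical.

```python
import re

# Matches "<time> <src> > <dst>: ..." as printed in the capture above.
LINE = re.compile(r"^\S+\s+(\S+)\s+>\s+(\S+):")


def split_endpoint(endpoint):
    """Split 'a.b.c.d.port' into (host, port); the port may be a service name."""
    host, port = endpoint.rsplit(".", 1)
    return host, port


def mismatched_replies(lines):
    """Return (client_endpoint, asked_host, answered_host) for odd replies."""
    pending = {}  # client endpoint -> server host the client addressed
    mismatches = []
    for line in lines:
        match = LINE.match(line.strip())
        if not match:
            continue
        src, dst = match.groups()
        if dst in pending:
            # A packet going back to a client we saw a request from.
            asked = pending.pop(dst)
            answered, _ = split_endpoint(src)
            if answered != asked:
                mismatches.append((dst, asked, answered))
        else:
            # Treat anything else as a new request from a client.
            pending[src] = split_endpoint(dst)[0]
    return mismatches
```

Fed the four lines of the capture, this flags both replies as coming from 10.1.1.199 although the client addressed 10.1.1.200, which is the behaviour described in the message.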
> 
> Originally I used the resource agent lsb:nfs. Then I changed to 
> ocf:heartbeat:nfsserver with a specific parameter pointing to the virtual 
> address 10.1.1.200. I also added rpcbind, which takes care of sunrpc requests, as a 
> virtual service too (see the related configuration below). The result is the 
> same as before: the virtual server responds with its physical address. 
> 
> primitive nfs ocf:heartbeat:nfsserver \
>   params nfs_ip="10.1.1.200" nfs_shared_infodir="/var/drbdata0/exports" \
>   params nfs_init_script="/etc/init.d/nfs" nfs_notify_cmd="/sbin/rpc.statd" \
>   op monitor interval="5s" timeout="20s" depth="0" \
>   meta target-role="Started"
> primitive rpcbind lsb:rpcbind \
>   op monitor interval="50s" \
>   meta target-role="Started"
> 
> The virtual NFS service works fine with nfs clients that
> support NFS v4. Any ideas please?

I vaguely recall problems like this in the past, but I'm missing
all the details :-/ At any rate, the problem is related to either
the kernel or NFS. I think you would do better to ask in a Fedora or
NFS-related forum.

Thanks,

Dejan

> Thanks in advance.
> 
> 
> ___
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> 
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: 
> http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker



Re: [Pacemaker] Stonith External/SBD problem/question

2010-09-17 Thread Mihai Tanasescu

Ignore this, guys.
Sorry also for double posting (on both the linux-ha and pacemaker
mailing lists).


I forgot to add the resource name.
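
For reference, the crm shell's `primitive` command expects a resource id before the agent specification. Without one, the shell appears to have taken `stonith:external/sbd` as the id and `params` as the agent, hence "provider could not be determined for params". A minimal Python sketch that assembles the corrected line; the id `sbd-fence` and the helper name are hypothetical, and only the `sbd_device` parameter comes from the thread.

```python
def sbd_primitive(resource_id, sbd_device):
    """Build a crm 'primitive' line for the external/sbd STONITH agent."""
    # The id must be a plain name. An agent spec such as
    # "stonith:external/sbd" in this position is the mistake from the thread.
    if not resource_id or ":" in resource_id or "/" in resource_id:
        raise ValueError("a resource id must come before the agent")
    return 'primitive %s stonith:external/sbd params sbd_device="%s"' % (
        resource_id,
        sbd_device,
    )


# "sbd-fence" is an assumed name; any valid resource id would do.
print(sbd_primitive("sbd-fence", "/dev/sda2"))
```

The printed line is what would be typed at the `crm(live)configure#` prompt in place of the failing command.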


Mihai Tanasescu wrote:

Hello guys,


I'm a newbie with Linux-ha and as all newbies I have a problem that I'm
trying to sort out.


I have a shared storage (iSCSI) and 2 nodes connected to it.

I wanted to have stonith active.

I configured a separate partition on the shared storage, initialized it,
configured it in /etc/sysconfig/sbd.

All fine.

Now I tried defining the primitive in pacemaker, in crm configure:

crm(live)configure# property stonith-enabled="true"
crm(live)configure# property stonith-timeout="30s"
crm(live)configure# primitive stonith:external/sbd params
sbd_device="/dev/sda2"
ERROR: provider could not be determined for params
ERROR: syntax in primitive: params sbd_device=/dev/sda2

I changed the device naming to:
crm(live)configure# primitive stonith:external/sbd params
sbd_device="/dev/disk/by-id/scsi-360003ffc4b1185eab88f92602529a5a3-part2"
ERROR: provider could not be determined for params
ERROR: syntax in primitive: params
sbd_device=/dev/disk/by-id/scsi-360003ffc4b1185eab88f92602529a5a3-part2

then to:
crm(live)configure# primitive stonith:external/sbd params
sbd_device="/dev/disk/by-path/ip-192.168.10.2:3260-iscsi-iqn.1991-05.com.microsoft:deh9451lvp--mailconta-target-lun-0-part2" 


ERROR: provider could not be determined for params
ERROR: syntax in primitive: params
sbd_device=/dev/disk/by-path/ip-192.168.10.2:3260-iscsi-iqn.1991-05.com.microsoft:deh9451lvp--mailconta-target-lun-0-part2 



and finally to:

crm(live)configure# primitive stonith:external/sbd params
sbd_device="/dev/disk/by-uuid/d25c9264-6065-4dee-9a39-2dc146727579"
ERROR: provider could not be determined for params
ERROR: syntax in primitive: params
sbd_device=/dev/disk/by-uuid/d25c9264-6065-4dee-9a39-2dc146727579

Can someone point me in the right direction ?

What am I doing wrong and how can I fix this ?


Thanks,
Mihai










Re: [Pacemaker] cib fails to start until host is rebooted

2010-09-17 Thread Michael Smith

Andrew Beekhof wrote:

I spoke to Steve, and the only thing he could come up with was that
the group might not be correct.

When the cluster is in this state, please run:
   ps x -o pid,euser,ruser,egroup,rgroup,command

And compare it to the "normal" output.

Also, confirm that there is only one group named haclient, and one
user named hacluster.


Thanks, that was the right track. Looks like I fat-fingered a '9' in 
front of the '0' in root's gid in /etc/passwd:


root:x:0:90:root:/root:/bin/bash

gid 90 happens to be owned by haclient. With root's gid fixed, 
everything works as expected.
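
The two checks discussed in this thread (root's primary gid, and exactly one `hacluster` user and one `haclient` group) can be scripted. The Python sketch below is one possible way to do it, assuming the standard colon-separated passwd and group formats; the function name and the sample `hacluster` entry are hypothetical.

```python
def check_cluster_accounts(passwd_text, group_text):
    """Return a list of problems found in passwd/group-format text."""

    def entries(text):
        rows = []
        for line in text.splitlines():
            line = line.strip()
            if line and not line.startswith("#"):
                rows.append(line.split(":"))
        return rows

    users = entries(passwd_text)
    groups = entries(group_text)
    problems = []
    for fields in users:
        # passwd fields: name:password:uid:gid:gecos:home:shell
        if len(fields) >= 4 and fields[0] == "root" and fields[3] != "0":
            problems.append("root has primary gid %s, expected 0" % fields[3])
    hacluster = sum(1 for f in users if f[0] == "hacluster")
    haclient = sum(1 for f in groups if f[0] == "haclient")
    if hacluster != 1:
        problems.append("expected exactly one user hacluster, found %d" % hacluster)
    if haclient != 1:
        problems.append("expected exactly one group haclient, found %d" % haclient)
    return problems


# The root line is the one quoted above; the hacluster line is made up.
passwd = "root:x:0:90:root:/root:/bin/bash\nhacluster:x:90:90::/:/sbin/nologin\n"
group = "root:x:0:\nhaclient:x:90:\n"
print(check_cluster_accounts(passwd, group))
```

On a real node the two arguments would be the contents of /etc/passwd and /etc/group.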


Mike



[Pacemaker] Stonith External/SBD problem/question

2010-09-17 Thread Mihai Tanasescu

Hello guys,


I'm a newbie with Linux-ha and as all newbies I have a problem that I'm
trying to sort out.


I have a shared storage (iSCSI) and 2 nodes connected to it.

I wanted to have stonith active.

I configured a separate partition on the shared storage, initialized it,
configured it in /etc/sysconfig/sbd.

All fine.

Now I tried defining the primitive in pacemaker, in crm configure:

crm(live)configure# property stonith-enabled="true"
crm(live)configure# property stonith-timeout="30s"
crm(live)configure# primitive stonith:external/sbd params
sbd_device="/dev/sda2"
ERROR: provider could not be determined for params
ERROR: syntax in primitive: params sbd_device=/dev/sda2

I changed the device naming to:
crm(live)configure# primitive stonith:external/sbd params
sbd_device="/dev/disk/by-id/scsi-360003ffc4b1185eab88f92602529a5a3-part2"
ERROR: provider could not be determined for params
ERROR: syntax in primitive: params
sbd_device=/dev/disk/by-id/scsi-360003ffc4b1185eab88f92602529a5a3-part2

then to:
crm(live)configure# primitive stonith:external/sbd params
sbd_device="/dev/disk/by-path/ip-192.168.10.2:3260-iscsi-iqn.1991-05.com.microsoft:deh9451lvp--mailconta-target-lun-0-part2"
ERROR: provider could not be determined for params
ERROR: syntax in primitive: params
sbd_device=/dev/disk/by-path/ip-192.168.10.2:3260-iscsi-iqn.1991-05.com.microsoft:deh9451lvp--mailconta-target-lun-0-part2

and finally to:

crm(live)configure# primitive stonith:external/sbd params
sbd_device="/dev/disk/by-uuid/d25c9264-6065-4dee-9a39-2dc146727579"
ERROR: provider could not be determined for params
ERROR: syntax in primitive: params
sbd_device=/dev/disk/by-uuid/d25c9264-6065-4dee-9a39-2dc146727579

Can someone point me in the right direction ?

What am I doing wrong and how can I fix this ?


Thanks,
Mihai







Re: [Pacemaker] cannot append to /var/log/cluster/corosync.log, permission denied

2010-09-17 Thread Michele Pellegrini
Change log owner and group to:

-rw-rw 1 hacluster haclient 111231 17 set 13:01 /var/log/corosync/corosync.log

and check logrotate config (if you use it).

Even if you have specified "user: root" in corosync.conf, the crmd daemon
runs as hacluster, not as root. I think this change was introduced in
Pacemaker 1.0.9.
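
To see why ownership matters here, the classic Unix write-permission rule can be sketched in a few lines of Python. This is a simplified model (it ignores root's override and ACLs), the function name is hypothetical, and the mode 0o660 used below is an assumption because the listings in this thread show the permission string truncated.

```python
def can_append(owner, group, mode, user, user_groups):
    """Classic Unix write check: owner bits, then group bits, then other."""
    if user == owner:
        return bool(mode & 0o200)
    if group in user_groups:
        return bool(mode & 0o020)
    return bool(mode & 0o002)


# Log owned by root:root, crmd running as hacluster (group haclient).
print(can_append("root", "root", 0o660, "hacluster", ["haclient"]))
# Log owned by hacluster:haclient, as suggested above.
print(can_append("hacluster", "haclient", 0o660, "hacluster", ["haclient"]))
```

Under these assumptions a root:root log is not writable by a daemon running as hacluster, while a hacluster:haclient log is.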


regards,
mike



Re: [Pacemaker] cannot append to /var/log/cluster/corosync.log permission denied

2010-09-17 Thread Alister Wong
-Original Message-
From: Dejan Muhamedagic [mailto:deja...@fastmail.fm] 
Sent: Friday, September 17, 2010 6:01 PM
To: The Pacemaker cluster resource manager
Subject: Re: [Pacemaker] cannot append to /var/log/cluster/corosync.log permission denied

Hi,

On Fri, Sep 17, 2010 at 12:49:39AM +0800, Alister Wong wrote:
> Hi, all,
> 
> I just set up a two-node cluster on two RHEL servers (5.5 x86_64).
> 
> After setup, everything looked fine. However, in /var/log/messages
> I found log entries showing that crmd failed to append to
> corosync.log. I run corosync as root.
> 
> e.g.
> 
> Sep 17 00:39:14 property01-b crmd: [17332]: info: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ]
> Sep 17 00:39:14 property01-b crmd: Cannot append to /var/log/cluster/corosync.log: Permission denied
> Sep 17 00:39:14 property01-b crmd: [17332]: info: do_state_transition: Starting PEngine Recheck Timer
> Sep 17 00:39:14 property01-b crmd: Cannot append to /var/log/cluster/corosync.log: Permission denied
> 
> Below are corosync.conf and the crm configure result.
> 
> corosync.conf:
> 
> # Please read the corosync.conf.5 manual page
> compatibility: whitetank
> 
> totem {
>         version: 2
>         secauth: off
>         threads: 0
>         interface {
>                 ringnumber: 0
>                 bindnetaddr: 192.168.200.15
>                 mcastaddr: 226.94.1.1
>                 mcastport: 5405
>         }
> }
> 
> logging {
>         fileline: off
>         to_stderr: no
>         to_logfile: yes

Change this to: 
 to_logfile: no

And use just syslog.

Thanks,

Dejan

>         to_syslog: yes
>         logfile: /var/log/cluster/corosync.log
>         debug: off
>         timestamp: on
>         logger_subsys {
>                 subsys: AMF
>                 debug: off
>         }
> }
> 
> amf {
>         mode: disabled
> }
> 
> aisexec{
>         user: root
>         group: root
> }
> 
> service{
>         # Load the Pacemaker Cluster Resource Manager
>         name: pacemaker
>         ver: 0
> }
> 
> crm configure:
> 
> node property01-a
> node property01-b
> property $id="cib-bootstrap-options" \
>         dc-version="1.0.9-89bd754939df5150de7cd76835f98fe90851b677" \
>         cluster-infrastructure="openais" \
>         expected-quorum-votes="2"
> 
> Here is the /var/log/cluster/corosync.log permission info:
> 
> [r...@property01-b corosync]# ls -l /var/log/cluster/
> total 148
> -rw-rw 1 root root 145285 Sep 17 00:39 corosync.log
> 
> From corosync.log (pcmk and corosync are evidently able to write to it):
> 
> Sep 17 00:39:07 corosync [pcmk  ] info: pcmk_peer_update: MEMB: property01-b 80259264
> Sep 17 00:39:07 corosync [pcmk  ] info: send_member_notification: Sending membership update 72 to 2 children
> Sep 17 00:39:07 corosync [pcmk  ] info: update_member: 0x5275c50 Node 80259264 ((null)) born on: 72
> Sep 17 00:39:07 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.
> Sep 17 00:39:07 corosync [pcmk  ] info: update_member: 0x527cec0 Node 63482048 (property01-a) born on: 72
> Sep 17 00:39:07 corosync [pcmk  ] info: update_member: 0x527cec0 Node 63482048 now known as property01-a (was: (null))
> Sep 17 00:39:07 corosync [pcmk  ] info: update_member: Node property01-a now has process list: 00013312 (78610)
> Sep 17 00:39:08 corosync [pcmk  ] info: update_member: Node property01-a now has 1 quorum votes (was 0)
> Sep 17 00:39:08 corosync [pcmk  ] info: send_member_notification: Sending membership update 72 to 2 children
> Sep 17 00:39:08 corosync [MAIN  ] Completed service synchronization, ready to provide service.
> 
> The cluster was just set up and the current configuration is very simple, but I
> can't work out what I did to cause this issue. Can anyone help me get
> rid of the permission problem? I have no clue where to look for the cause.
> 
> Thank you very much.
> 
> Regards,
> Alister
> 


Hi, Dejan,
Thanks. It works. However, I have another cluster whose setup was finished
about a month ago, and it didn't have this problem.
Anyway, thank you.

Regards,
Alister

Re: [Pacemaker] cannot append to /var/log/cluster/corosync.log permission denied

2010-09-17 Thread Dejan Muhamedagic
Hi,

On Fri, Sep 17, 2010 at 12:49:39AM +0800, Alister Wong wrote:
> Hi, all,
> 
> I just set up a two-node cluster on two RHEL servers (5.5 x86_64).
> 
> After setup, everything looked fine. However, in /var/log/messages
> I found log entries showing that crmd failed to append to
> corosync.log. I run corosync as root.
> 
> e.g.
> 
> Sep 17 00:39:14 property01-b crmd: [17332]: info: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ]
> Sep 17 00:39:14 property01-b crmd: Cannot append to /var/log/cluster/corosync.log: Permission denied
> Sep 17 00:39:14 property01-b crmd: [17332]: info: do_state_transition: Starting PEngine Recheck Timer
> Sep 17 00:39:14 property01-b crmd: Cannot append to /var/log/cluster/corosync.log: Permission denied
> 
> Below are corosync.conf and the crm configure result.
> 
> corosync.conf:
> 
> # Please read the corosync.conf.5 manual page
> compatibility: whitetank
> 
> totem {
>         version: 2
>         secauth: off
>         threads: 0
>         interface {
>                 ringnumber: 0
>                 bindnetaddr: 192.168.200.15
>                 mcastaddr: 226.94.1.1
>                 mcastport: 5405
>         }
> }
> 
> logging {
>         fileline: off
>         to_stderr: no
>         to_logfile: yes

Change this to: 
 to_logfile: no

And use just syslog.

Thanks,

Dejan

>         to_syslog: yes
>         logfile: /var/log/cluster/corosync.log
>         debug: off
>         timestamp: on
>         logger_subsys {
>                 subsys: AMF
>                 debug: off
>         }
> }
> 
> amf {
>         mode: disabled
> }
> 
> aisexec{
>         user: root
>         group: root
> }
> 
> service{
>         # Load the Pacemaker Cluster Resource Manager
>         name: pacemaker
>         ver: 0
> }
> 
> crm configure:
> 
> node property01-a
> node property01-b
> property $id="cib-bootstrap-options" \
>         dc-version="1.0.9-89bd754939df5150de7cd76835f98fe90851b677" \
>         cluster-infrastructure="openais" \
>         expected-quorum-votes="2"
> 
> Here is the /var/log/cluster/corosync.log permission info:
> 
> [r...@property01-b corosync]# ls -l /var/log/cluster/
> total 148
> -rw-rw 1 root root 145285 Sep 17 00:39 corosync.log
> 
> From corosync.log (pcmk and corosync are evidently able to write to it):
> 
> Sep 17 00:39:07 corosync [pcmk  ] info: pcmk_peer_update: MEMB: property01-b 80259264
> Sep 17 00:39:07 corosync [pcmk  ] info: send_member_notification: Sending membership update 72 to 2 children
> Sep 17 00:39:07 corosync [pcmk  ] info: update_member: 0x5275c50 Node 80259264 ((null)) born on: 72
> Sep 17 00:39:07 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.
> Sep 17 00:39:07 corosync [pcmk  ] info: update_member: 0x527cec0 Node 63482048 (property01-a) born on: 72
> Sep 17 00:39:07 corosync [pcmk  ] info: update_member: 0x527cec0 Node 63482048 now known as property01-a (was: (null))
> Sep 17 00:39:07 corosync [pcmk  ] info: update_member: Node property01-a now has process list: 00013312 (78610)
> Sep 17 00:39:08 corosync [pcmk  ] info: update_member: Node property01-a now has 1 quorum votes (was 0)
> Sep 17 00:39:08 corosync [pcmk  ] info: send_member_notification: Sending membership update 72 to 2 children
> Sep 17 00:39:08 corosync [MAIN  ] Completed service synchronization, ready to provide service.
> 
> The cluster was just set up and the current configuration is very simple, but I
> can't work out what I did to cause this issue. Can anyone help me get
> rid of the permission problem? I have no clue where to look for the cause.
> 
> Thank you very much.
> 
> Regards,
> Alister
> 





Re: [Pacemaker] Problem to support NFS v3

2010-09-17 Thread Dejan Muhamedagic
Hi,
On Thu, Sep 16, 2010 at 09:04:29AM -0400, liang...@asc-csa.gc.ca wrote:
> Somehow this may not get through. Let me try it again.
> 
> Hi There,
> 
> I have set up a pair of primary/secondary servers for services
> such as web, ftp, samba and nfs with pacemaker 1.1.1,
> drbd-pacemaker 8.3.7, corosync 1.2.7 on Fedora 13. Things work
> well except for an old Solaris 2.7 client. The Solaris system
> can't mount from the NFS server because it supports only NFS v3, not
> v4. From tcpdump, this is what goes wrong with a pure NFS v3
> client:
> 
> 19:55:02.886062 192.168.249.150.39331 > 10.1.1.200.sunrpc:  udp 84 (DF)
> 19:55:02.886625 10.1.1.199.sunrpc > 192.168.249.150.39331:  udp 28 (DF) 
> 20:01:12.741180 192.168.249.150.52440 > 10.1.1.200.sunrpc:  udp 84 (DF)
> 20:01:12.741667 10.1.1.199.sunrpc > 192.168.249.150.52440:  udp 28 (DF)


> 
> Here 10.1.1.200 is the virtual IP address and 10.1.1.199 is the IP address 
> of the primary server. If I mount from the non-virtual address 
> 10.1.1.199, it works fine. But obviously that is not what we want with a virtual 
> server.
> 
> Originally I used the resource agent lsb:nfs. Then I changed to 
> ocf:heartbeat:nfsserver with a specific parameter pointing to the virtual 
> address 10.1.1.200. I also added rpcbind, which takes care of sunrpc requests, as a 
> virtual service too (see the related configuration below). The result is the 
> same as before: the virtual server responds with its physical address. 
> 
> primitive nfs ocf:heartbeat:nfsserver \
>   params nfs_ip="10.1.1.200" nfs_shared_infodir="/var/drbdata0/exports" \
>   params nfs_init_script="/etc/init.d/nfs" nfs_notify_cmd="/sbin/rpc.statd" \
>   op monitor interval="5s" timeout="20s" depth="0" \
>   meta target-role="Started"
> primitive rpcbind lsb:rpcbind \
>   op monitor interval="50s" \
>   meta target-role="Started"
> 
> The virtual NFS service works fine with nfs clients that
> support NFS v4. Any ideas please?

I vaguely recall problems like this in the past, but I'm missing
all the details :-/ At any rate, the problem is related to either
the kernel or NFS. I think you would do better to ask in a Fedora or
NFS-related forum.

Thanks,

Dejan

> Thanks in advance.
> 
> 



Re: [Pacemaker] error in logs - cannot interleave clone... and some general ideas

2010-09-17 Thread Andrew Beekhof
2010/9/15 Michał Purzyński :
> hey,
>
> i'm in process of setting up a failover + load balancing cluster for
> OpenLDAP (mirror mode).
>
> i've got some strange log messages i don't really understand:
>
> Sep 15 00:33:38 udalia3 crmd: [30332]: ERROR: crmd_ha_msg_filter:
> Another DC detected: udalia4 (op=noop)
>
> this one is easy - one node lost connectivity, each of them (i have two
> node cluster) thinks he's the DC. anyway, after the connectivity is
> back, everything syncs.
>
> btw, what should i do about that? stonith?

almost certainly

> does not seem to be a good
> solution with slapd service - no shared data, each node has own
> database, they are synchronized on the ldap replication level, mirror mode.

what happens if changes are made to both?

>
> if you have some ideas/thoughts about my cluster setup, share them as
> well. maybe there are some logical mistakes i made or sth. just comment
> on it.
>
> Sep 15 00:36:53 udalia3 pengine: [30331]: ERROR:
> clone_rsc_colocation_rh: Cannot interleave clone LDAPclone and
> ClusterIPclone because they do not support the same number of resources
> per node
>
> this is the one i don't get. i don't interleave anything. at least i
> don't know i do ;)

it became the default.
set the interleave=false param for both clones and it will go away
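
The wording of the error suggests that two colocated clones cannot be interleaved when their `clone-node-max` values differ; in the configuration quoted below these are 2 for ClusterIPclone and 1 for LDAPclone. The Python sketch here only illustrates that reading of the log message and is not a description of Pacemaker's implementation; the function name and data layout are hypothetical.

```python
def interleave_conflicts(clone_node_max, colocations):
    """Return colocated clone pairs whose clone-node-max values differ.

    clone_node_max: dict mapping clone name -> clone-node-max
    colocations:    list of (clone, with_clone) pairs
    """
    conflicts = []
    for first, second in colocations:
        if clone_node_max[first] != clone_node_max[second]:
            conflicts.append((first, second))
    return conflicts


# Values taken from the configuration in this message.
clones = {"ClusterIPclone": 2, "LDAPclone": 1}
print(interleave_conflicts(clones, [("LDAPclone", "ClusterIPclone")]))
```

With the values from this thread the single colocation `ldap-with-ip` is reported, which matches the pair named in the log line.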

>
> configuration of my cluster:
>
> node udalia3 \
>        attributes standby="off"
> node udalia4 \
>        attributes standby="off"
> primitive ClusterIP ocf:heartbeat:IPaddr2 \
>        params ip="172.16.6.93" cidr_netmask="24" clusterip_hash="sourceip-sourceport" \
>        op monitor interval="0" timeout="60s" start stop
> primitive LDAP lsb:slapd \
>        op monitor interval="0" timeout="60s" start stop
> primitive resPing ocf:pacemaker:ping \
>        params host_list="172.16.4.7" dampen="5s" multiplier="1000" debug="true" \
>        op start interval="0" timeout="60s" \
>        op stop interval="0" timeout="60s" \
>        op monitor interval="0" timeout="60s"
> clone ClusterIPclone ClusterIP \
>        meta globally-unique="true" clone-max="2" clone-node-max="2" target-role="Started"
> clone LDAPclone LDAP \
>        meta globally-unique="false" clone-max="2" clone-node-max="1" target-role="Started"
> clone clonePing resPing
> location ip-no-convectivity ClusterIPclone \
>        rule $id="ping-exclude-rule2" -inf: not_defined pingd or pingd number:lte 0
> location ldap-no-convectivity LDAPclone \
>        rule $id="ping-exclude-rule" -inf: not_defined pingd or pingd number:lte 0
> colocation ldap-with-ip inf: LDAPclone ClusterIPclone
> order ldap-after-ip inf: ClusterIPclone LDAPclone
> property $id="cib-bootstrap-options" \
>        dc-version="1.1.2-53b05d88305c603b83b6dbdd799f8e3bca9b7efd" \
>        cluster-infrastructure="openais" \
>        expected-quorum-votes="2" \
>        stonith-enabled="false" \
>        no-quorum-policy="ignore" \
>        last-lrm-refresh="1284502285"
> rsc_defaults $id="rsc-options" \
>        resource-stickiness="0"
>
> thx for your help.
>



Re: [Pacemaker] cib fails to start until host is rebooted

2010-09-17 Thread Andrew Beekhof
I spoke to Steve, and the only thing he could come up with was that
the group might not be correct.

When the cluster is in this state, please run:
   ps x -o pid,euser,ruser,egroup,rgroup,command

And compare it to the "normal" output.

Also, confirm that there is only one group named haclient, and one
user named hacluster.

On Tue, Sep 7, 2010 at 11:03 PM, Michael Smith  wrote:
> Michael Smith wrote:
>>
>> On Mon, 6 Sep 2010, Andrew Beekhof wrote:
>>
>>>>> Is /dev/shm full (or not mounted) by any chance?
>>>>
>>>> No - I tried clearing that out, too.
>>>
>>> And corosync is actually running?
>>
>> Yes, it's logging "[IPC   ] Invalid IPC credentials." when cib tries to
>> connect.
>
> For what it's worth, I have the same problem after updating:
>
>
> cluster-glue-1.0.6-2.1
> corosync-1.2.7-1.1
> openais-1.1.3-1.1
> pacemaker-1.1.2.1-5.1
>
> Mike
>
