Re: [Pacemaker] ERROR: Wrong stack o2cb

2013-06-26 Thread Denis Witt
On Wed, 26 Jun 2013 11:53:17 -0400 (EDT)
Jake Smith  wrote:

> If you wanted, you could set location constraints preventing test4
> and/or clone-node-max so you don't have Stopped entries in the clone
> sets for node test4.

Hi Jake,

At the moment I'm not sure whether we're going to use OCFS2 at all on the
production systems. But if we do, I'll add those rules, thanks.
 
> I like shorter when possible, so you could also combine the filesystem
> into the ocfs2 group before cloning and have only one clone set for
> controld, o2cb, and the OCFS2 filesystem.  That's just me ;-)

Sounds good for later use. I'm new to Pacemaker and HA in general, so for
now I try to keep things as simple as possible, to make it easier to spot
errors and to understand what is going on. It's a bit irritating at first
when resources no longer seem to exist once they are used in groups or
clone sets, but I'm getting used to it. :)

> > Thanks a lot!
> Glad it helped!  That's what we're all here for

And you all are doing a very good job here. This list is remarkably
helpful and friendly. 

Best regards
Denis Witt


___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] ERROR: Wrong stack o2cb

2013-06-26 Thread Jake Smith

- Original Message -
> From: "Denis Witt" 
> To: pacemaker@oss.clusterlabs.org
> Cc: "jsmith" 
> Sent: Wednesday, June 26, 2013 8:35:08 AM
> Subject: Re: [Pacemaker] ERROR: Wrong stack o2cb
> 
> On Wed, 26 Jun 2013 07:53:37 -0400 (EDT)
> jsmith  wrote:
> 
> > You could start ocfs2 in the cluster just disable/remove the
> > filesystem resource for now. Once pacemaker has started ocfs2 I
> > believe you can do what you need?
> 
> Hi Jake,
> 
> Node test4: standby
> Online: [ test4-node1 test4-node2 ]
> 
>  Master/Slave Set: ms_drbd [drbd]
>  Masters: [ test4-node1 test4-node2 ]
>  Clone Set: clone_pingtest [pingtest]
>  Started: [ test4-node1 test4-node2 ]
>  Stopped: [ pingtest:2 ]
>  Resource Group: grp_all
>  sip  (ocf::heartbeat:IPaddr2):   Started test4-node1
>  apache   (ocf::heartbeat:apache):Started test4-node1
>  Clone Set: cl_ocfs2mgmt [g_ocfs2mgmt]
>  Started: [ test4-node2 test4-node1 ]
>  Stopped: [ g_ocfs2mgmt:2 ]
>  Clone Set: cl_fs_ocfs2 [fs_drbd]
>  Started: [ test4-node2 test4-node1 ]
>  Stopped: [ fs_drbd:2 ]

If you wanted, you could set location constraints preventing test4 and/or
clone-node-max so you don't have Stopped entries in the clone sets for node
test4.
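
Roughly, in crm shell syntax, that could look like this (just a sketch; the
constraint names are made up, the resource/node names come from your status
output):

location loc_no_ocfs2mgmt_on_test4 cl_ocfs2mgmt -inf: test4
location loc_no_ocfs2_fs_on_test4 cl_fs_ocfs2 -inf: test4

or, alternatively, cap the number of instances via the clone meta attributes
(clone-max / clone-node-max) so no instance is allocated for test4.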

I like shorter when possible, so you could also combine the filesystem into
the ocfs2 group before cloning and have only one clone set for controld,
o2cb, and the OCFS2 filesystem.  That's just me ;-)
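
Roughly (an untested sketch reusing the resource names from your config; the
group/clone names are made up):

group g_ocfs2 p_controld p_o2cb fs_drbd
clone cl_ocfs2 g_ocfs2 meta interleave="true"

which would replace the separate cl_ocfs2mgmt and cl_fs_ocfs2 clone sets.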

> 
> Failed actions:
> p_o2cb:0_monitor_0 (node=test4, call=164, rc=5, status=complete): not installed
> p_controld:0_monitor_0 (node=test4, call=163, rc=5, status=complete): not installed
> drbd:0_monitor_0 (node=test4, call=159, rc=5, status=complete): not installed
> 
> Thanks a lot!
> 

Glad it helped!  That's what we're all here for

> For the record, using DRBD/OCFS2 with Pacemaker/corosync on Debian
> Wheezy:
> 
> apt-get install ocfs2-tools ocfs2-tools-pacemaker openais dlm-pcmk
> 
> Configure your DRBD drives and make sure they are running (you can
> format them as ext4 to test whether they mount correctly, but don't
> run them in dual-primary mode yet).
> 
> DON'T add /etc/ocfs2/cluster.conf
> update-rc.d ocfs2 disable
> update-rc.d o2cb disable
> Add ocfs2_stack_user to /etc/modules
> 
> Then add all groups/clone sets/primitives, except the fs_drbd-related
> ones.
> Once the cluster is running, format the drive so that the correct
> stack gets written.
> Then add the fs_drbd-related resources.
> 
> This should work. I'll check this procedure on a new machine and
> extend the list if necessary.

One more comment - I noticed in the log output last time an error about 
killproc.  When I tested a bit on Ubuntu with ocfs2 I noticed there was a patch 
for that error.  It looks from your log like in Debian it still stopped 
successfully, and I'm sure you'll find out for sure when you test 
fencing/stonith/failures, but you might want to double check:

Jun 26 10:32:29 test4 lrmd: [3134]: info: rsc:p_o2cb:0 stop[15] (pid 3651)
Jun 26 10:32:29 test4 ocfs2_controld: kill node 302186506 - ocfs2_controld 
PROCDOWN
Jun 26 10:32:29 test4 stonith-ng: [3133]: info: initiate_remote_stonith_op: 
Initiating remote operation off for 302186506: 
1be401d4-547d-4f59-b380-a5e996c70a31
Jun 26 10:32:29 test4 stonith-ng: [3133]: info: stonith_command: Processed 
st_query from test4-node1: rc=0
Jun 26 10:32:29 test4 stonith-ng: [3133]: info: crm_new_peer: Node test4 now 
has id: 302252042
Jun 26 10:32:29 test4 stonith-ng: [3133]: info: crm_new_peer: Node 302252042 is 
now known as test4
Jun 26 10:32:29 test4 stonith-ng: [3133]: info: crm_new_peer: Node test4-node2 
now has id: 302186506
Jun 26 10:32:29 test4 stonith-ng: [3133]: info: crm_new_peer: Node 302186506 is 
now known as test4-node2
Jun 26 10:32:29 test4 o2cb[3651]: INFO: Stopping p_o2cb:0
Jun 26 10:32:29 test4 o2cb[3651]: INFO: Stopping ocfs2_controld.pcmk
Jun 26 10:32:29 test4 lrmd: [3134]: info: RA output: (p_o2cb:0:stop:stderr) 
/usr/lib/ocf/resource.d//pacemaker/o2cb: line 171: killproc: command not found
^^^ This line
Jun 26 10:32:30 test4 lrmd: [3134]: info: operation stop[15] on p_o2cb:0 for 
client 3137: pid 3651 exited with return code 0
Jun 26 10:32:30 test4 crmd: [3137]: info: process_lrm_event: LRM operation 
p_o2cb:0_stop_0 (call=15, rc=0, cib-update=21, confirmed=true) ok
^^^ looks like the stop succeeded anyway but...

On Ubuntu - 
https://bugs.launchpad.net/ubuntu/lucid/+source/pacemaker/+bug/727422
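
Until a patched RA lands, a crude local stand-in along these lines could be
dropped near the top of /usr/lib/ocf/resource.d/pacemaker/o2cb (an untested
sketch, not the upstream fix; it just emulates the common
"killproc [-SIGNAL] program" form the RA seems to expect):

killproc() {
    # send the given signal (default TERM) to all processes whose
    # executable name matches the last argument
    sig="-TERM"
    case "$1" in -*) sig="$1"; shift ;; esac
    pkill "$sig" -x "$(basename "$1")"
}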

Jake



Re: [Pacemaker] ERROR: Wrong stack o2cb

2013-06-26 Thread Denis Witt
On Wed, 26 Jun 2013 07:53:37 -0400 (EDT)
jsmith  wrote:

> You could start ocfs2 in the cluster just disable/remove the
> filesystem resource for now. Once pacemaker has started ocfs2 I
> believe you can do what you need? 

Hi Jake,

Node test4: standby
Online: [ test4-node1 test4-node2 ]

 Master/Slave Set: ms_drbd [drbd]
 Masters: [ test4-node1 test4-node2 ]
 Clone Set: clone_pingtest [pingtest]
 Started: [ test4-node1 test4-node2 ]
 Stopped: [ pingtest:2 ]
 Resource Group: grp_all
 sip    (ocf::heartbeat:IPaddr2):   Started test4-node1
 apache (ocf::heartbeat:apache):    Started test4-node1
 Clone Set: cl_ocfs2mgmt [g_ocfs2mgmt]
 Started: [ test4-node2 test4-node1 ]
 Stopped: [ g_ocfs2mgmt:2 ]
 Clone Set: cl_fs_ocfs2 [fs_drbd]
 Started: [ test4-node2 test4-node1 ]
 Stopped: [ fs_drbd:2 ]

Failed actions:
p_o2cb:0_monitor_0 (node=test4, call=164, rc=5, status=complete): not installed
p_controld:0_monitor_0 (node=test4, call=163, rc=5, status=complete): not installed
drbd:0_monitor_0 (node=test4, call=159, rc=5, status=complete): not installed

Thanks a lot!

For the record, using DRBD/OCFS2 with Pacemaker/corosync on Debian
Wheezy:

apt-get install ocfs2-tools ocfs2-tools-pacemaker openais dlm-pcmk

Configure your DRBD drives and make sure they are running (you can format
them as ext4 to test whether they mount correctly, but don't run them in
dual-primary mode yet).

DON'T add /etc/ocfs2/cluster.conf
update-rc.d ocfs2 disable
update-rc.d o2cb disable
Add ocfs2_stack_user to /etc/modules

Then add all groups/clone sets/primitives, except the fs_drbd-related ones.
Once the cluster is running, format the drive so that the correct stack
gets written (see the sketch below).
Then add the fs_drbd-related resources.

This should work. I'll check this procedure on a new machine and extend
the list if necessary.
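
For reference, the formatting step might look like this, run on one node
while the cl_ocfs2mgmt clone is already started (device and slot count are
assumptions for illustration):

mkfs.ocfs2 -N 2 /dev/drbd0

or, for a volume that was already formatted under the legacy o2cb stack:

tunefs.ocfs2 --update-cluster-stack /dev/drbd0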

Again, thanks a lot!

Best regards
Denis Witt



Re: [Pacemaker] ERROR: Wrong stack o2cb

2013-06-26 Thread jsmith


 Original message 
From: Denis Witt  
Date: 06/26/2013  6:46 AM  (GMT-05:00) 
To: The Pacemaker cluster resource manager  
Subject: Re: [Pacemaker] ERROR: Wrong stack o2cb 
 
On Wed, 26 Jun 2013 11:07:05 +0200
Lars Marowsky-Bree  wrote:

> This indicates you have a 'wrong stack' on disk still. You need to run
> mkfs.ocfs2/tunefs.ocfs2 while the o2cb cluster resource is running, or
> to set it to "pcmk" manually.

Hi Lars,

> At the moment I assume I have a user stack on disk:
> /sys/fs/ocfs2/cluster_stack says pcmk, loaded_cluster_plugins says user,
> and active_cluster_plugins is empty.

> I'm not sure if anyone has tested pcmk+ocfs2 on Debian for a while.
> Perhaps it's a good thing to check the debian cluster list if any
> exists?

I'll have a look.

> I'd just delete /etc/ocfs2/cluster.conf. Anything that requires it
> indicates that it's not working properly with pacemaker ;-)

The problem is that it won't start, so I can't write a new stack. Is
there a way to start it manually using the pcmk stack?



You could start ocfs2 in the cluster, just disable/remove the filesystem
resource for now. Once pacemaker has started ocfs2, I believe you can do
what you need?

Jake



Re: [Pacemaker] ERROR: Wrong stack o2cb

2013-06-26 Thread Denis Witt
On Wed, 26 Jun 2013 11:07:05 +0200
Lars Marowsky-Bree  wrote:

> This indicates you have a 'wrong stack' on disk still. You need to run
> mkfs.ocfs2/tunefs.ocfs2 while the o2cb cluster resource is running, or
> to set it to "pcmk" manually.

Hi Lars,

At the moment I assume I have a user stack on disk:
/sys/fs/ocfs2/cluster_stack says pcmk, loaded_cluster_plugins says user,
and active_cluster_plugins is empty.
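
For reference, those values come straight from sysfs and can be re-checked
with:

cat /sys/fs/ocfs2/cluster_stack
cat /sys/fs/ocfs2/loaded_cluster_plugins
cat /sys/fs/ocfs2/active_cluster_plugins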

> I'm not sure if anyone has tested pcmk+ocfs2 on Debian for a while.
> Perhaps it's a good thing to check the debian cluster list if any
> exists?

I'll have a look.

> I'd just delete /etc/ocfs2/cluster.conf. Anything that requires it
> indicates that it's not working properly with pacemaker ;-)

The problem is that it won't start, so I can't write a new stack. Is
there a way to start it manually using the pcmk stack?

Best regards
Denis Witt



Re: [Pacemaker] ERROR: Wrong stack o2cb

2013-06-26 Thread Lars Marowsky-Bree
On 2013-06-25T17:08:36, Denis Witt  wrote:

> My cluster.conf (I added this later to be able to run tunefs.ocfs2
> --update-cluster-stack):

This indicates you have a 'wrong stack' on disk still. You need to run
mkfs.ocfs2/tunefs.ocfs2 while the o2cb cluster resource is running, or to
set it to "pcmk" manually.
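
A sketch of the second variant, with the o2cb resource running under
Pacemaker and the volume unmounted (the device path is an assumption):

tunefs.ocfs2 --update-cluster-stack /dev/drbd0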

I'm not sure if anyone has tested pcmk+ocfs2 on Debian for a while.
Perhaps it would be a good idea to check the Debian cluster list, if one
exists?

I'd just delete /etc/ocfs2/cluster.conf. Anything that requires it
indicates that it's not working properly with pacemaker ;-)


Regards,
Lars

-- 
Architect Storage/HA
SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, HRB 
21284 (AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde




Re: [Pacemaker] ERROR: Wrong stack o2cb

2013-06-26 Thread Denis Witt
On Tue, 25 Jun 2013 13:34:30 -0400 (EDT)
Jake Smith  wrote:

> I'm guessing if you run:
> grep user /sys/fs/ocfs2/loaded_cluster_plugins 2>&1; rc=$?
> you're going to return a 1 or something other than 0.

Hi Jake,

True, as long as ocfs2_stack_user isn't loaded. I loaded it, and now the
command returns 0.

But now I'm no longer able to mount the cluster drive. Corosync fails
to start o2cb/ocfs2, and if I start it by hand the stacks mismatch.

Below you will find the corosync logs:

Jun 26 10:32:13 test4 corosync[3101]:   [MAIN  ] Corosync Cluster Engine 
('1.4.2'): started and ready to provide service.
Jun 26 10:32:13 test4 corosync[3101]:   [MAIN  ] Corosync built-in features: nss
Jun 26 10:32:13 test4 corosync[3101]:   [MAIN  ] Successfully read main 
configuration file '/etc/corosync/corosync.conf'.
Jun 26 10:32:13 test4 corosync[3101]:   [TOTEM ] Initializing transport (UDP/IP 
Multicast).
Jun 26 10:32:13 test4 corosync[3101]:   [TOTEM ] Initializing transmit/receive 
security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
Jun 26 10:32:13 test4 corosync[3101]:   [TOTEM ] The network interface 
[10.0.2.18] is now up.
Jun 26 10:32:13 test4 corosync[3101]:   [pcmk  ] info: process_ais_conf: 
Reading configure
Jun 26 10:32:13 test4 corosync[3101]:   [pcmk  ] info: config_find_init: Local 
handle: 5650605097994944515 for logging
Jun 26 10:32:13 test4 corosync[3101]:   [pcmk  ] info: config_find_next: 
Processing additional logging options...
Jun 26 10:32:13 test4 corosync[3101]:   [pcmk  ] info: get_config_opt: Found 
'off' for option: debug
Jun 26 10:32:13 test4 corosync[3101]:   [pcmk  ] info: get_config_opt: Found 
'no' for option: to_logfile
Jun 26 10:32:13 test4 corosync[3101]:   [pcmk  ] info: get_config_opt: Found 
'yes' for option: to_syslog
Jun 26 10:32:13 test4 corosync[3101]:   [pcmk  ] info: get_config_opt: Found 
'daemon' for option: syslog_facility
Jun 26 10:32:13 test4 corosync[3101]:   [pcmk  ] info: config_find_init: Local 
handle: 273040974342372 for quorum
Jun 26 10:32:13 test4 corosync[3101]:   [pcmk  ] info: config_find_next: No 
additional configuration supplied for: quorum
Jun 26 10:32:13 test4 corosync[3101]:   [pcmk  ] info: get_config_opt: No 
default for option: provider
Jun 26 10:32:13 test4 corosync[3101]:   [pcmk  ] info: config_find_init: Local 
handle: 5880381755227111429 for service
Jun 26 10:32:13 test4 corosync[3101]:   [pcmk  ] info: config_find_next: 
Processing additional service options...
Jun 26 10:32:13 test4 corosync[3101]:   [pcmk  ] info: get_config_opt: Found 
'0' for option: ver
Jun 26 10:32:13 test4 corosync[3101]:   [pcmk  ] info: get_config_opt: 
Defaulting to 'pcmk' for option: clustername
Jun 26 10:32:13 test4 corosync[3101]:   [pcmk  ] info: get_config_opt: 
Defaulting to 'no' for option: use_logd
Jun 26 10:32:13 test4 corosync[3101]:   [pcmk  ] info: get_config_opt: 
Defaulting to 'no' for option: use_mgmtd
Jun 26 10:32:13 test4 corosync[3101]:   [pcmk  ] info: pcmk_startup: CRM: 
Initialized
Jun 26 10:32:13 test4 corosync[3101]:   [pcmk  ] Logging: Initialized 
pcmk_startup
Jun 26 10:32:13 test4 corosync[3101]:   [pcmk  ] info: pcmk_startup: Maximum 
core file size is: 18446744073709551615
Jun 26 10:32:13 test4 corosync[3101]:   [pcmk  ] info: pcmk_startup: Service: 9
Jun 26 10:32:13 test4 corosync[3101]:   [pcmk  ] info: pcmk_startup: Local 
hostname: test4-node1
Jun 26 10:32:13 test4 corosync[3101]:   [pcmk  ] info: pcmk_update_nodeid: 
Local node id: 302120970
Jun 26 10:32:13 test4 corosync[3101]:   [pcmk  ] info: update_member: Creating 
entry for node 302120970 born on 0
Jun 26 10:32:13 test4 corosync[3101]:   [pcmk  ] info: update_member: 0xe28c00 
Node 302120970 now known as test4-node1 (was: (null))
Jun 26 10:32:13 test4 corosync[3101]:   [pcmk  ] info: update_member: Node 
test4-node1 now has 1 quorum votes (was 0)
Jun 26 10:32:13 test4 corosync[3101]:   [pcmk  ] info: update_member: Node 
302120970/test4-node1 is now: member
Jun 26 10:32:13 test4 corosync[3101]:   [pcmk  ] info: spawn_child: Forked 
child 3132 for process cib
Jun 26 10:32:13 test4 corosync[3101]:   [pcmk  ] info: spawn_child: Forked 
child 3133 for process stonith-ng
Jun 26 10:32:13 test4 corosync[3101]:   [pcmk  ] info: spawn_child: Forked 
child 3134 for process lrmd
Jun 26 10:32:13 test4 corosync[3101]:   [pcmk  ] info: spawn_child: Forked 
child 3135 for process attrd
Jun 26 10:32:13 test4 corosync[3101]:   [pcmk  ] info: spawn_child: Forked 
child 3136 for process pengine
Jun 26 10:32:13 test4 corosync[3101]:   [pcmk  ] info: spawn_child: Forked 
child 3137 for process crmd
Jun 26 10:32:13 test4 corosync[3101]:   [SERV  ] Service engine loaded: 
Pacemaker Cluster Manager 1.1.7
Jun 26 10:32:14 test4 pengine: [3136]: info: Invoked: 
/usr/lib/pacemaker/pengine 
Jun 26 10:32:14 test4 corosync[3101]:   [SERV  ] Service engine loaded: openais 
checkpoint service B.01.01
Jun 26 10:32:14 test4 attrd: [3135]: info: Invoked: /usr/lib/pacemaker/attrd 
Jun 26 10:32:14 test4 attrd: [

Re: [Pacemaker] ERROR: Wrong stack o2cb

2013-06-25 Thread Jake Smith



- Original Message -
> From: "Denis Witt" 
> To: pacemaker@oss.clusterlabs.org
> Cc: "Jake Smith" 
> Sent: Tuesday, June 25, 2013 11:47:36 AM
> Subject: Re: [Pacemaker] ERROR: Wrong stack o2cb
> 
> On Tue, 25 Jun 2013 11:37:15 -0400 (EDT)
> Jake Smith  wrote:
> 
> > You probably already know but you're going to get failed "not
> > installed" from test4 always unless you install the same packages
> > there.
> > 
> > Do you have logs from test4-node[1|2] that are generating the not
> > installed for o2cb?  The log below is just from test4 if I'm not
> > mistaken which we expect doesn't have o2cb installed.
> 
> Hi Jake,
> 
> the log is from test4-node2, the machine was renamed and in the logs
> it
> still shows up as test4. It has o2cb installed. I can use the Drive
> fine on this machine when I start o2cb and ocfs2 by hand and mount
> the
> drive.
>  
> > A quick search for "ERROR: Wrong stack o2cb" indicates you may want
> > to verify o2cb isn't starting on boot?  But that's just a guess
> > without the logs from the affected nodes.
> 
> I've executed "update-rc.d o2cb disable" and "update-rc.d ocfs2
> disable". The services are stopped and pacemaker/corosync should
> handle
> everything. o2cb is still enabled in /etc/default/o2cb but the
> init-Script isn't executed on boot.
> 

This might help some - the second to last post:
http://comments.gmane.org/gmane.linux.highavailability.pacemaker/13918

I'll quote Bruno Macadre:


I don't know if you solved your problem, but I just have the same
behavior on my freshly installed Pacemaker.

With the 2 lines:
p_o2cb:1_monitor_0 (node=nas1, call=10, rc=5, status=complete): not installed
p_o2cb:0_monitor_0 (node=nas2, call=10, rc=5, status=complete): not installed

After some tries, I've found a bug in the resource agent
ocf:pacemaker:o2cb.

When this agent starts, its first action is to run 'o2cb_monitor' to
check whether o2cb is already started. If not (i.e. $? == $OCF_NOT_RUNNING)
it loads everything needed and finally starts.

The bug is that 'o2cb_monitor' returns $OCF_NOT_RUNNING if something is
missing, except for the module 'ocfs2_user_stack', for which it returns
$OCF_ERR_INSTALLED. So if the module 'ocfs2_user_stack' is not loaded
before starting the ocf:pacemaker:o2cb resource agent, it fails to start
with a 'not installed' error.

The workaround I've just found is to place 'ocfs2_user_stack' in my
/etc/modules on all nodes, and everything works fine.

I hope this helps someone and that this bug will be corrected in a future
release of the o2cb RA.


I took a look at the current RA and around lines 341-5 there is this check in 
the monitor code:
grep user "$LOADED_PLUGINS_FILE" >/dev/null 2>&1; rc=$?
if [ $rc != 0 ]; then
    ocf_log err "Wrong stack `cat $LOADED_PLUGINS_FILE`"
    return $OCF_ERR_INSTALLED
fi

I'm guessing if you run:
grep user /sys/fs/ocfs2/loaded_cluster_plugins 2>&1; rc=$?

you're going to get a return code of 1 or something other than 0.
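
A quick way to check and fix that on a node would be something along these
lines (a sketch; the module name is the one used elsewhere in this thread):

modprobe ocfs2_stack_user
echo ocfs2_stack_user >> /etc/modules   # also load it at boot
grep user /sys/fs/ocfs2/loaded_cluster_plugins && echo "user plugin loaded"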

Also in the above thread, the 4th to last post from Andreas Kurz @Hastexo
mentions this:
 > This message was immediately followed by "Wrong stack" errors, and

 check the content of /sys/fs/ocfs2/loaded_cluster_plugins ... and if
 you have that config file and it contains the value "user" this is a good
 sign you have started ocfs2/o2cb via init

HTH

Jake



Re: [Pacemaker] ERROR: Wrong stack o2cb

2013-06-25 Thread Denis Witt
On Tue, 25 Jun 2013 11:37:15 -0400 (EDT)
Jake Smith  wrote:

> You probably already know but you're going to get failed "not
> installed" from test4 always unless you install the same packages
> there.
> 
> Do you have logs from test4-node[1|2] that are generating the not
> installed for o2cb?  The log below is just from test4 if I'm not
> mistaken which we expect doesn't have o2cb installed.

Hi Jake,

The log is from test4-node2; the machine was renamed and in the logs it
still shows up as test4. It has o2cb installed. I can use the drive fine
on this machine when I start o2cb and ocfs2 by hand and mount the drive.
 
> A quick search for "ERROR: Wrong stack o2cb" indicates you may want
> to verify o2cb isn't starting on boot?  But that's just a guess
> without the logs from the affected nodes.

I've executed "update-rc.d o2cb disable" and "update-rc.d ocfs2
disable". The services are stopped and Pacemaker/Corosync should handle
everything. o2cb is still enabled in /etc/default/o2cb, but the init
script isn't executed on boot.
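
If you want the default file to reflect that as well, the init script is
driven by a variable in /etc/default/o2cb; roughly (variable name as shipped
on Debian, double-check your version):

sed -i 's/^O2CB_ENABLED=.*/O2CB_ENABLED=false/' /etc/default/o2cb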

Best regards
Denis Witt



Re: [Pacemaker] ERROR: Wrong stack o2cb

2013-06-25 Thread Denis Witt
On Tue, 25 Jun 2013 17:31:49 +0200
emmanuel segura  wrote:

> If you use ocfs with pacemaker, you don't need to configure ocfs in
> legacy mode using /etc/ocfs2/cluster.conf

Hi,

I just added the cluster.conf to be able to run tunefs.ocfs2. It
doesn't matter if it is present or not, the error is the same.

Best regards
Denis Witt



Re: [Pacemaker] ERROR: Wrong stack o2cb

2013-06-25 Thread Jake Smith



- Original Message -
> From: "Denis Witt" 
> To: pacemaker@oss.clusterlabs.org
> Sent: Tuesday, June 25, 2013 11:08:36 AM
> Subject: [Pacemaker] ERROR: Wrong stack o2cb
> 
> Hi List,
> 
> I'm having trouble getting OCFS2 running. If I run everything by hand
> the OCFS-Drive works quite well, but cluster integration doesn't work
> at all.
> 
> The Status:
> 
> 
> Last updated: Tue Jun 25 17:00:49 2013
> Last change: Tue Jun 25 16:58:03 2013 via crmd on test4
> Stack: openais
> Current DC: test4 - partition with quorum
> Version: 1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff
> 3 Nodes configured, 3 expected votes
> 16 Resources configured.
> 
> 
> Node test4: standby
> Online: [ test4-node1 test4-node2 ]
> 
>  Master/Slave Set: ms_drbd [drbd]
>  Masters: [ test4-node1 test4-node2 ]
>  Clone Set: clone_pingtest [pingtest]
>  Started: [ test4-node2 test4-node1 ]
>  Stopped: [ pingtest:2 ]
> 
> Failed actions:
> p_o2cb:0_monitor_0 (node=test4-node2, call=20, rc=5, status=complete): not installed
> p_o2cb:1_monitor_0 (node=test4-node1, call=20, rc=5, status=complete): not installed
> drbd:0_monitor_0 (node=test4, call=98, rc=5, status=complete): not installed
> p_controld:0_monitor_0 (node=test4, call=99, rc=5, status=complete): not installed
> p_o2cb:0_monitor_0 (node=test4, call=100, rc=5, status=complete): not installed
> 

You probably already know, but you're always going to get failed "not
installed" actions from test4 unless you install the same packages there.

Do you have logs from test4-node[1|2] that are generating the "not installed"
for o2cb?  The log below is just from test4, if I'm not mistaken, which we
expect doesn't have o2cb installed.

A quick search for "ERROR: Wrong stack o2cb" indicates you may want to verify 
o2cb isn't starting on boot?  But that's just a guess without the logs from the 
affected nodes.

> My Config:
> 
> node test4 \
>   attributes standby="on"
> node test4-node1
> node test4-node2
> primitive apache ocf:heartbeat:apache \
>   params configfile="/etc/apache2/apache2.conf" \
>   op monitor interval="10" timeout="15" \
>   meta target-role="Started"
> primitive drbd ocf:linbit:drbd \
>   params drbd_resource="drbd0"
> primitive fs_drbd ocf:heartbeat:Filesystem \
>   params device="/dev/drbd0" directory="/var/www" fstype="ocfs2"
> primitive p_controld ocf:pacemaker:controld
> primitive p_o2cb ocf:pacemaker:o2cb
> primitive pingtest ocf:pacemaker:ping \
>   params multiplier="1000" host_list="10.0.0.1" \
>   op monitor interval="5s"
> primitive sip ocf:heartbeat:IPaddr2 \
>   params ip="10.0.0.18" nic="eth0" \
>   op monitor interval="10" timeout="20" \
>   meta target-role="Started"
> group g_ocfs2mgmt p_controld p_o2cb
> group grp_all sip apache
> ms ms_drbd drbd \
>   meta master-max="2" clone-max="2"
> clone cl_fs_ocfs2 fs_drbd \
>   meta target-role="Started"
> clone cl_ocfs2mgmt g_ocfs2mgmt \
>   meta interleave="true"
> clone clone_pingtest pingtest
> location loc_all_on_best_ping grp_all \
>   rule $id="loc_all_on_best_ping-rule" -inf: not_defined pingd or pingd lt 1000
> colocation c_ocfs2 inf: cl_fs_ocfs2 cl_ocfs2mgmt ms_drbd:Master
> colocation coloc_all_on_drbd inf: grp_all ms_drbd:Master
> order order_all_after_drbd inf: ms_drbd:promote cl_ocfs2mgmt:start cl_fs_ocfs2:start grp_all:start
> property $id="cib-bootstrap-options" \
>   dc-version="1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff" \
>   cluster-infrastructure="openais" \
>   expected-quorum-votes="3" \
>   stonith-enabled="false" \
>   default-resource-stickiness="100" \
>   maintenance-mode="false" \
>   last-lrm-refresh="1372172283"
> 
> test4 is a quorum-node.

Even though you have test4 in standby, I would recommend location rules to
prevent drbd from ever running on test4.  Just in case ;-)
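
Something along these lines would do it (a sketch; constraint names are made
up, resource/node names come from your config):

location loc_no_drbd_on_test4 ms_drbd -inf: test4
location loc_no_ocfs2mgmt_on_test4 cl_ocfs2mgmt -inf: test4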

HTH

Jake

> 
> My system is Debian Wheezy. I installed the following packages:
> 
> dlm-pcmk, ocfs2-tools, ocfs2-tools-pacemaker, openais
> 
> My drbd.conf:
> 
> ### global settings ###
> global {
> # participate in the usage statistics at usage.drbd.org?
> usage-count no;
> }
> ### options inherited by all resources ###
> common {
>   sync

Re: [Pacemaker] ERROR: Wrong stack o2cb

2013-06-25 Thread emmanuel segura
Hello Denis

If you use ocfs with pacemaker, you don't need to configure ocfs in legacy
mode using /etc/ocfs2/cluster.conf

Thanks
Emmanuel


2013/6/25 Denis Witt 

> Hi List,
>
> I'm having trouble getting OCFS2 running. If I run everything by hand
> the OCFS-Drive works quite well, but cluster integration doesn't work
> at all.
>
> The Status:
>
> 
> Last updated: Tue Jun 25 17:00:49 2013
> Last change: Tue Jun 25 16:58:03 2013 via crmd on test4
> Stack: openais
> Current DC: test4 - partition with quorum
> Version: 1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff
> 3 Nodes configured, 3 expected votes
> 16 Resources configured.
> 
>
> Node test4: standby
> Online: [ test4-node1 test4-node2 ]
>
>  Master/Slave Set: ms_drbd [drbd]
>  Masters: [ test4-node1 test4-node2 ]
>  Clone Set: clone_pingtest [pingtest]
>  Started: [ test4-node2 test4-node1 ]
>  Stopped: [ pingtest:2 ]
>
> Failed actions:
> p_o2cb:0_monitor_0 (node=test4-node2, call=20, rc=5, status=complete): not installed
> p_o2cb:1_monitor_0 (node=test4-node1, call=20, rc=5, status=complete): not installed
> drbd:0_monitor_0 (node=test4, call=98, rc=5, status=complete): not installed
> p_controld:0_monitor_0 (node=test4, call=99, rc=5, status=complete): not installed
> p_o2cb:0_monitor_0 (node=test4, call=100, rc=5, status=complete): not installed
>
> My Config:
>
> node test4 \
> attributes standby="on"
> node test4-node1
> node test4-node2
> primitive apache ocf:heartbeat:apache \
> params configfile="/etc/apache2/apache2.conf" \
> op monitor interval="10" timeout="15" \
> meta target-role="Started"
> primitive drbd ocf:linbit:drbd \
> params drbd_resource="drbd0"
> primitive fs_drbd ocf:heartbeat:Filesystem \
> params device="/dev/drbd0" directory="/var/www" fstype="ocfs2"
> primitive p_controld ocf:pacemaker:controld
> primitive p_o2cb ocf:pacemaker:o2cb
> primitive pingtest ocf:pacemaker:ping \
> params multiplier="1000" host_list="10.0.0.1" \
> op monitor interval="5s"
> primitive sip ocf:heartbeat:IPaddr2 \
> params ip="10.0.0.18" nic="eth0" \
> op monitor interval="10" timeout="20" \
> meta target-role="Started"
> group g_ocfs2mgmt p_controld p_o2cb
> group grp_all sip apache
> ms ms_drbd drbd \
> meta master-max="2" clone-max="2"
> clone cl_fs_ocfs2 fs_drbd \
> meta target-role="Started"
> clone cl_ocfs2mgmt g_ocfs2mgmt \
> meta interleave="true"
> clone clone_pingtest pingtest
> location loc_all_on_best_ping grp_all \
>     rule $id="loc_all_on_best_ping-rule" -inf: not_defined pingd or pingd lt 1000
> colocation c_ocfs2 inf: cl_fs_ocfs2 cl_ocfs2mgmt ms_drbd:Master
> colocation coloc_all_on_drbd inf: grp_all ms_drbd:Master
> order order_all_after_drbd inf: ms_drbd:promote cl_ocfs2mgmt:start cl_fs_ocfs2:start grp_all:start
> property $id="cib-bootstrap-options" \
> dc-version="1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff" \
> cluster-infrastructure="openais" \
> expected-quorum-votes="3" \
> stonith-enabled="false" \
> default-resource-stickiness="100" \
> maintenance-mode="false" \
> last-lrm-refresh="1372172283"
>
> test4 is a quorum-node.
>
> My system is Debian Wheezy. I installed the following packages:
>
> dlm-pcmk, ocfs2-tools, ocfs2-tools-pacemaker, openais
>
> My drbd.conf:
>
> ### global settings ###
> global {
> # participate in the usage statistics at usage.drbd.org?
> usage-count no;
> }
> ### options inherited by all resources ###
> common {
>   syncer {
>     rate 33M;
>   }
> }
> ### resource-specific options
> resource drbd0 {
>   # protocol version
>   protocol C;
>
>   startup {
>     # timeout (in seconds) for establishing the connection at startup
>     wfc-timeout 60;
>     # timeout (in seconds) for establishing the connection at startup
>     # after data inconsistency was detected previously
>     # ("degraded mode")
>     degr-wfc-timeout  120;
>
>     become-primary-on both;
>
>   }
>   disk {
>     # action on I/O errors: detach the drive
>     on-io-error pass_on;
>     fencing resource-only;
>   }
>   net {
>     ### various network options that are normally not needed;   ###
>     ### the HA link should generally be as fast as possible...  ###
>     # timeout   60;
>     # connect-int   10;
>     # ping-int  10;
>     # max-buffers 2048;
>     # max-epoch-size  2048;
>     allow-two-primaries;
>     after-sb-0pri discard-zero-changes;
>     after-sb-1pri discard-secondary;
>     after-sb-2pri disconnect;
>   }
>   syncer {
>     # speed of the HA link
>     rate 33M;
>   }
>   on test4-node1 {
>     ### options for the master

[Pacemaker] ERROR: Wrong stack o2cb

2013-06-25 Thread Denis Witt
Hi List,

I'm having trouble getting OCFS2 running. If I run everything by hand
the OCFS-Drive works quite well, but cluster integration doesn't work
at all.

The Status:


Last updated: Tue Jun 25 17:00:49 2013
Last change: Tue Jun 25 16:58:03 2013 via crmd on test4
Stack: openais
Current DC: test4 - partition with quorum
Version: 1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff
3 Nodes configured, 3 expected votes
16 Resources configured.


Node test4: standby
Online: [ test4-node1 test4-node2 ]

 Master/Slave Set: ms_drbd [drbd]
 Masters: [ test4-node1 test4-node2 ]
 Clone Set: clone_pingtest [pingtest]
 Started: [ test4-node2 test4-node1 ]
 Stopped: [ pingtest:2 ]

Failed actions:
p_o2cb:0_monitor_0 (node=test4-node2, call=20, rc=5, status=complete): not installed
p_o2cb:1_monitor_0 (node=test4-node1, call=20, rc=5, status=complete): not installed
drbd:0_monitor_0 (node=test4, call=98, rc=5, status=complete): not installed
p_controld:0_monitor_0 (node=test4, call=99, rc=5, status=complete): not installed
p_o2cb:0_monitor_0 (node=test4, call=100, rc=5, status=complete): not installed

My Config:

node test4 \
attributes standby="on"
node test4-node1
node test4-node2
primitive apache ocf:heartbeat:apache \
params configfile="/etc/apache2/apache2.conf" \
op monitor interval="10" timeout="15" \
meta target-role="Started"
primitive drbd ocf:linbit:drbd \
params drbd_resource="drbd0"
primitive fs_drbd ocf:heartbeat:Filesystem \
params device="/dev/drbd0" directory="/var/www" fstype="ocfs2"
primitive p_controld ocf:pacemaker:controld
primitive p_o2cb ocf:pacemaker:o2cb
primitive pingtest ocf:pacemaker:ping \
params multiplier="1000" host_list="10.0.0.1" \
op monitor interval="5s"
primitive sip ocf:heartbeat:IPaddr2 \
params ip="10.0.0.18" nic="eth0" \
op monitor interval="10" timeout="20" \
meta target-role="Started"
group g_ocfs2mgmt p_controld p_o2cb
group grp_all sip apache
ms ms_drbd drbd \
meta master-max="2" clone-max="2"
clone cl_fs_ocfs2 fs_drbd \
meta target-role="Started"
clone cl_ocfs2mgmt g_ocfs2mgmt \
meta interleave="true"
clone clone_pingtest pingtest
location loc_all_on_best_ping grp_all \
    rule $id="loc_all_on_best_ping-rule" -inf: not_defined pingd or pingd lt 1000
colocation c_ocfs2 inf: cl_fs_ocfs2 cl_ocfs2mgmt ms_drbd:Master
colocation coloc_all_on_drbd inf: grp_all ms_drbd:Master
order order_all_after_drbd inf: ms_drbd:promote cl_ocfs2mgmt:start cl_fs_ocfs2:start grp_all:start
property $id="cib-bootstrap-options" \
dc-version="1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff" \
cluster-infrastructure="openais" \
expected-quorum-votes="3" \
stonith-enabled="false" \
default-resource-stickiness="100" \
maintenance-mode="false" \
last-lrm-refresh="1372172283"

test4 is a quorum-node.

My system is Debian Wheezy. I installed the following packages:

dlm-pcmk, ocfs2-tools, ocfs2-tools-pacemaker, openais

My drbd.conf:

### global settings ###
global {
  # participate in the usage statistics at usage.drbd.org?
  usage-count no;
}
### options inherited by all resources ###
common {
  syncer {
    rate 33M;
  }
}
### resource-specific options
resource drbd0 {
  # protocol version
  protocol C;

  startup {
    # timeout (in seconds) for establishing the connection at startup
    wfc-timeout 60;
    # timeout (in seconds) for establishing the connection at startup
    # after data inconsistency was detected previously
    # ("degraded mode")
    degr-wfc-timeout  120;

    become-primary-on both;

  }
  disk {
    # action on I/O errors: detach the drive
    on-io-error pass_on;
    fencing resource-only;
  }
  net {
    ### various network options that are normally not needed;   ###
    ### the HA link should generally be as fast as possible...  ###
    # timeout   60;
    # connect-int   10;
    # ping-int  10;
    # max-buffers 2048;
    # max-epoch-size  2048;
    allow-two-primaries;
    after-sb-0pri discard-zero-changes;
    after-sb-1pri discard-secondary;
    after-sb-2pri disconnect;
  }
  syncer {
    # speed of the HA link
    rate 33M;
  }
  on test4-node1 {
    ### options for the master server ###
    # name of the block device that is provided
    device /dev/drbd0;
    # backing device underneath DRBD
    disk   /dev/xvda3;
    # address and port over which synchronization runs
    address    10.0.2.18:7788;
    # location of the metadata, here inside the device itself
    meta-disk  internal;
  }
  on test4-node2 {
    ## options for the sla

Re: [Pacemaker] "ERROR: Wrong stack o2cb" when trying to start o2cb service in Pacemaker cluster

2012-07-26 Thread David Guyot
Hello!

No, I didn't solve my problem; instead, I removed the non-essential
OCFS2 functionality from my cluster. Nevertheless, from what I remember,
this likely comes from a bug, so your answer seems consistent.

It could prove helpful in the future.

Thank you!

Kind regards.

On 26/07/2012 14:39, Bruno MACADRE wrote:
> Hi,
>
> I don't know if you solved your problem, but I just have the same
> behavior on my freshly installed Pacemaker.
>
> With the 2 lines :
> p_o2cb:1_monitor_0 (node=nas1, call=10, rc=5,status=complete): not
> installed
> p_o2cb:0_monitor_0 (node=nas2, call=10, rc=5, status=complete): not
> installed
>
> After some tries, I've found a bug in the resource agent
> ocf:pacemaker:o2cb
>
> When this agent starts, its first action is to run 'o2cb_monitor'
> to check whether o2cb is already started. If not (i.e. $? ==
> $OCF_NOT_RUNNING) it loads everything needed and finally starts.
>
> The bug is that 'o2cb_monitor' returns $OCF_NOT_RUNNING if
> something is missing, except for the module 'ocfs2_user_stack', for
> which it returns $OCF_ERR_INSTALLED. So if the module 'ocfs2_user_stack'
> is not loaded before starting the ocf:pacemaker:o2cb resource agent, it
> fails to start with a 'not installed' error.
>
> The workaround I've just found is to place 'ocfs2_user_stack' in my
> /etc/modules on all nodes, and everything works fine.
>
> I hope this helps someone and that this bug will be corrected in a
> future release of the o2cb RA.
>
> For information :
> . OS : Ubuntu Server 12.04 x64
> . Pacemaker : 1.1.6
> . Stack : cman 3.1.7
>
> Regards,
> Bruno
>






Re: [Pacemaker] "ERROR: Wrong stack o2cb" when trying to start o2cb service in Pacemaker cluster

2012-07-26 Thread Bruno MACADRE

Hi,

	I don't know if you solved your problem, but I just have the same
behavior on my freshly installed Pacemaker.


With the 2 lines:
p_o2cb:1_monitor_0 (node=nas1, call=10, rc=5, status=complete): not installed
p_o2cb:0_monitor_0 (node=nas2, call=10, rc=5, status=complete): not installed


After some tries, I've found a bug in the resource agent 
ocf:pacemaker:o2cb

	When this agent starts, its first action is to run 'o2cb_monitor' to
check whether o2cb is already started. If not (i.e. $? == $OCF_NOT_RUNNING)
it loads everything needed and finally starts.


	The bug is that 'o2cb_monitor' returns $OCF_NOT_RUNNING if something is
missing, except for the module 'ocfs2_user_stack', for which it returns
$OCF_ERR_INSTALLED. So if the module 'ocfs2_user_stack' is not loaded
before starting the ocf:pacemaker:o2cb resource agent, it fails to start
with a 'not installed' error.


	The workaround I've just found is to place 'ocfs2_user_stack' in my
/etc/modules on all nodes, and everything works fine.


	I hope this helps someone and that this bug will be corrected in a future
release of the o2cb RA.


For information :
. OS : Ubuntu Server 12.04 x64
. Pacemaker : 1.1.6
. Stack : cman 3.1.7

Regards,
Bruno

--

Bruno MACADRE
---
 Ingénieur Systèmes et Réseau | Systems and Network Engineer
 Département Informatique | Department of computer science
 Responsable Réseau et Téléphonie | Telecom and Network Manager
 Université de Rouen  | University of Rouen
---
Coordonnées / Contact :
Université de Rouen
Faculté des Sciences et Techniques - Madrillet
Avenue de l'Université - BP12
76801 St Etienne du Rouvray CEDEX
FRANCE

Tél : +33 (0)2-32-95-51-86
Fax : +33 (0)2-32-95-51-87
---




Re: [Pacemaker] "ERROR: Wrong stack o2cb" when trying to start o2cb service in Pacemaker cluster

2012-06-22 Thread David Guyot

On 22/06/2012 11:58, Andreas Kurz wrote:
> On 06/22/2012 11:14 AM, David Guyot wrote:
>> Hello.
>>
>> Concerning dlm-pcmk, it's not available from backports, so I installed
>> it from stable; only ocfs2-tools-pacemaker are available and installed
>> from it.
> thats ok
>
>> I checked if /etc/init.d/ocfs2 and /etc/init.d/o2cb are removed from
>> /etc/rcX.d/*, and they are, so the system cannot boot them up by itself.
> you also explicitly stopped them (on both nodes), or did you reboot the
> systems anyway?
Yes, I explicitly stopped them on both nodes and, to be sure, restarted
the system and then explicitly stopped them again, but to no effect; I
still get:

Failed actions:
p_o2cb:1_monitor_0 (node=Vindemiatrix, call=9, rc=5,
status=complete): not installed
p_o2cb:0_monitor_0 (node=Malastare, call=9, rc=5, status=complete):
not installed
>
>> I also reconfigured DRBD resources using notify=true in each DRBD
>> master, then I reconfigured OCFS2 resources using these crm commands
>>
>> primitive p_controld ocf:pacemaker:controld
>> primitive p_o2cb ocf:ocfs2:o2cb
> interesting ... should be ocf:pacemaker:o2cb
In fact, this is an error in the guide that I had already noticed and
corrected to ocf:pacemaker:o2cb.
>
>> group g_ocfs2mgmt p_controld p_o2cb
>> clone cl_ocfs2mgmt g_ocfs2mgmt meta interleave=true
>>
> looks ok for testing o2cb, controld .. you will need colocation and
> order constraints later when starting the filesystem
>
>> root@Malastare:/home/david# crm configure show
>> node Malastare
>> node Vindemiatrix
>> primitive p_controld ocf:pacemaker:controld
>> primitive p_drbd_backupvi ocf:linbit:drbd \
>> params drbd_resource="backupvi"
>> primitive p_drbd_pgsql ocf:linbit:drbd \
>> params drbd_resource="postgresql"
>> primitive p_drbd_svn ocf:linbit:drbd \
>> params drbd_resource="svn"
>> primitive p_drbd_www ocf:linbit:drbd \
>> params drbd_resource="www"
>> primitive p_o2cb ocf:pacemaker:o2cb
>> primitive soapi-fencing-malastare stonith:external/ovh \
>> params reversedns="ns208812.ovh.net"
>> primitive soapi-fencing-vindemiatrix stonith:external/ovh \
>> params reversedns="ns235795.ovh.net"
>> group g_ocfs2mgmt p_controld p_o2cb
>> ms ms_drbd_backupvi p_drbd_backupvi \
>> meta master-max="2" clone-max="2" notify="true"
>> ms ms_drbd_pgsql p_drbd_pgsql \
>> meta master-max="2" clone-max="2" notify="true"
>> ms ms_drbd_svn p_drbd_svn \
>> meta master-max="2" clone-max="2" notify="true"
>> ms ms_drbd_www p_drbd_www \
>> meta master-max="2" clone-max="2" notify="true"
>> clone cl_ocfs2mgmt g_ocfs2mgmt \
>> meta interleave="true"
>> location stonith-malastare soapi-fencing-malastare -inf: Malastare
>> location stonith-vindemiatrix soapi-fencing-vindemiatrix -inf: Vindemiatrix
>> property $id="cib-bootstrap-options" \
>> dc-version="1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff" \
>> cluster-infrastructure="openais" \
>> expected-quorum-votes="2"
>>
>> Unfortunately, the problem is still there :
>>
>> root@Malastare:/home/david# crm_mon --one-shot -VroA
>> 
>> Last updated: Fri Jun 22 10:54:31 2012
>> Last change: Fri Jun 22 10:54:27 2012 via crm_shadow on Malastare
>> Stack: openais
>> Current DC: Malastare - partition with quorum
>> Version: 1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff
>> 2 Nodes configured, 2 expected votes
>> 14 Resources configured.
>> 
>>
>> Online: [ Malastare Vindemiatrix ]
>>
>> Full list of resources:
>>
>>  soapi-fencing-malastare(stonith:external/ovh):Started Vindemiatrix
>>  soapi-fencing-vindemiatrix(stonith:external/ovh):Started Malastare
>>  Master/Slave Set: ms_drbd_pgsql [p_drbd_pgsql]
>>  Masters: [ Malastare Vindemiatrix ]
>>  Master/Slave Set: ms_drbd_svn [p_drbd_svn]
>>  Masters: [ Malastare Vindemiatrix ]
>>  Master/Slave Set: ms_drbd_www [p_drbd_www]
>>  Masters: [ Malastare Vindemiatrix ]
>>  Master/Slave Set: ms_drbd_backupvi [p_drbd_backupvi]
>>  Masters: [ Malastare Vindemiatrix ]
>>  Clone Set: cl_ocfs2mgmt [g_ocfs2mgmt]
>>  Stopped: [ g_ocfs2mgmt:0 g_ocfs2mgmt:1 ]
>>
>> Node Attributes:
>> * Node Malastare:
>> + master-p_drbd_backupvi:0: 1
>> + master-p_drbd_pgsql:0   : 1
>> + master-p_drbd_svn:0 : 1
>> + master-p_drbd_www:0 : 1
>> * Node Vindemiatrix:
>> + master-p_drbd_backupvi:1: 1
>> + master-p_drbd_pgsql:1   : 1
>> + master-p_drbd_svn:1 : 1
>> + master-p_drbd_www:1 : 1
>>
>> Operations:
>> * Node Vindemiatrix:
>>soapi-fencing-malastare: migration-threshold=100
>> + (4) start: rc=0 (ok)
>>p_drbd_pgsql:1: migration-threshold=100
>> + (5) probe: rc=8 (master)
>>p_drbd_svn:1: migration-threshold=100
>> + (6) probe: rc=8 (master)
>>p_drbd_www:1: migration-threshold=100
>> + (7) p

Re: [Pacemaker] "ERROR: Wrong stack o2cb" when trying to start o2cb service in Pacemaker cluster

2012-06-22 Thread Andreas Kurz
On 06/22/2012 11:14 AM, David Guyot wrote:
> Hello.
> 
> Concerning dlm-pcmk, it's not available from backports, so I installed
> it from stable; only ocfs2-tools-pacemaker are available and installed
> from it.

that's ok

> 
> I checked if /etc/init.d/ocfs2 and /etc/init.d/o2cb are removed from
> /etc/rcX.d/*, and they are, so the system cannot boot them up by itself.

you also explicitly stopped them (on both nodes), or did you reboot the
systems anyway?

> I also reconfigured DRBD resources using notify=true in each DRBD
> master, then I reconfigured OCFS2 resources using these crm commands
> 
> primitive p_controld ocf:pacemaker:controld
> primitive p_o2cb ocf:ocfs2:o2cb

interesting ... should be ocf:pacemaker:o2cb

> group g_ocfs2mgmt p_controld p_o2cb
> clone cl_ocfs2mgmt g_ocfs2mgmt meta interleave=true
> 

looks ok for testing o2cb, controld .. you will need colocation and
order constraints later when starting the filesystem

> root@Malastare:/home/david# crm configure show
> node Malastare
> node Vindemiatrix
> primitive p_controld ocf:pacemaker:controld
> primitive p_drbd_backupvi ocf:linbit:drbd \
> params drbd_resource="backupvi"
> primitive p_drbd_pgsql ocf:linbit:drbd \
> params drbd_resource="postgresql"
> primitive p_drbd_svn ocf:linbit:drbd \
> params drbd_resource="svn"
> primitive p_drbd_www ocf:linbit:drbd \
> params drbd_resource="www"
> primitive p_o2cb ocf:pacemaker:o2cb
> primitive soapi-fencing-malastare stonith:external/ovh \
> params reversedns="ns208812.ovh.net"
> primitive soapi-fencing-vindemiatrix stonith:external/ovh \
> params reversedns="ns235795.ovh.net"
> group g_ocfs2mgmt p_controld p_o2cb
> ms ms_drbd_backupvi p_drbd_backupvi \
> meta master-max="2" clone-max="2" notify="true"
> ms ms_drbd_pgsql p_drbd_pgsql \
> meta master-max="2" clone-max="2" notify="true"
> ms ms_drbd_svn p_drbd_svn \
> meta master-max="2" clone-max="2" notify="true"
> ms ms_drbd_www p_drbd_www \
> meta master-max="2" clone-max="2" notify="true"
> clone cl_ocfs2mgmt g_ocfs2mgmt \
> meta interleave="true"
> location stonith-malastare soapi-fencing-malastare -inf: Malastare
> location stonith-vindemiatrix soapi-fencing-vindemiatrix -inf: Vindemiatrix
> property $id="cib-bootstrap-options" \
> dc-version="1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff" \
> cluster-infrastructure="openais" \
> expected-quorum-votes="2"
> 
> Unfortunately, the problem is still there :
> 
> root@Malastare:/home/david# crm_mon --one-shot -VroA
> 
> Last updated: Fri Jun 22 10:54:31 2012
> Last change: Fri Jun 22 10:54:27 2012 via crm_shadow on Malastare
> Stack: openais
> Current DC: Malastare - partition with quorum
> Version: 1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff
> 2 Nodes configured, 2 expected votes
> 14 Resources configured.
> 
> 
> Online: [ Malastare Vindemiatrix ]
> 
> Full list of resources:
> 
>  soapi-fencing-malastare(stonith:external/ovh):Started Vindemiatrix
>  soapi-fencing-vindemiatrix(stonith:external/ovh):Started Malastare
>  Master/Slave Set: ms_drbd_pgsql [p_drbd_pgsql]
>  Masters: [ Malastare Vindemiatrix ]
>  Master/Slave Set: ms_drbd_svn [p_drbd_svn]
>  Masters: [ Malastare Vindemiatrix ]
>  Master/Slave Set: ms_drbd_www [p_drbd_www]
>  Masters: [ Malastare Vindemiatrix ]
>  Master/Slave Set: ms_drbd_backupvi [p_drbd_backupvi]
>  Masters: [ Malastare Vindemiatrix ]
>  Clone Set: cl_ocfs2mgmt [g_ocfs2mgmt]
>  Stopped: [ g_ocfs2mgmt:0 g_ocfs2mgmt:1 ]
> 
> Node Attributes:
> * Node Malastare:
> + master-p_drbd_backupvi:0: 1
> + master-p_drbd_pgsql:0   : 1
> + master-p_drbd_svn:0 : 1
> + master-p_drbd_www:0 : 1
> * Node Vindemiatrix:
> + master-p_drbd_backupvi:1: 1
> + master-p_drbd_pgsql:1   : 1
> + master-p_drbd_svn:1 : 1
> + master-p_drbd_www:1 : 1
> 
> Operations:
> * Node Vindemiatrix:
>soapi-fencing-malastare: migration-threshold=100
> + (4) start: rc=0 (ok)
>p_drbd_pgsql:1: migration-threshold=100
> + (5) probe: rc=8 (master)
>p_drbd_svn:1: migration-threshold=100
> + (6) probe: rc=8 (master)
>p_drbd_www:1: migration-threshold=100
> + (7) probe: rc=8 (master)
>p_drbd_backupvi:1: migration-threshold=100
> + (8) probe: rc=8 (master)
>p_o2cb:1: migration-threshold=100
> + (10) probe: rc=5 (not installed)
> * Node Malastare:
>soapi-fencing-vindemiatrix: migration-threshold=100
> + (4) start: rc=0 (ok)
>p_drbd_pgsql:0: migration-threshold=100
> + (5) probe: rc=8 (master)
>p_drbd_svn:0: migration-threshold=100
> + (6) probe: rc=8 (master)
>p_drbd_www:0: migration-threshold=100
> + (7) probe: rc=8 (master)
>p_drbd_backupvi:0: migration-threshold=100

Re: [Pacemaker] "ERROR: Wrong stack o2cb" when trying to start o2cb service in Pacemaker cluster

2012-06-22 Thread David Guyot
Hello.

Concerning dlm-pcmk, it's not available from backports, so I installed
it from stable; only ocfs2-tools-pacemaker are available and installed
from it.

I checked if /etc/init.d/ocfs2 and /etc/init.d/o2cb are removed from
/etc/rcX.d/*, and they are, so the system cannot boot them up by itself.
I also reconfigured DRBD resources using notify=true in each DRBD
master, then I reconfigured OCFS2 resources using these crm commands

primitive p_controld ocf:pacemaker:controld
primitive p_o2cb ocf:ocfs2:o2cb
group g_ocfs2mgmt p_controld p_o2cb
clone cl_ocfs2mgmt g_ocfs2mgmt meta interleave=true

root@Malastare:/home/david# crm configure show
node Malastare
node Vindemiatrix
primitive p_controld ocf:pacemaker:controld
primitive p_drbd_backupvi ocf:linbit:drbd \
params drbd_resource="backupvi"
primitive p_drbd_pgsql ocf:linbit:drbd \
params drbd_resource="postgresql"
primitive p_drbd_svn ocf:linbit:drbd \
params drbd_resource="svn"
primitive p_drbd_www ocf:linbit:drbd \
params drbd_resource="www"
primitive p_o2cb ocf:pacemaker:o2cb
primitive soapi-fencing-malastare stonith:external/ovh \
params reversedns="ns208812.ovh.net"
primitive soapi-fencing-vindemiatrix stonith:external/ovh \
params reversedns="ns235795.ovh.net"
group g_ocfs2mgmt p_controld p_o2cb
ms ms_drbd_backupvi p_drbd_backupvi \
meta master-max="2" clone-max="2" notify="true"
ms ms_drbd_pgsql p_drbd_pgsql \
meta master-max="2" clone-max="2" notify="true"
ms ms_drbd_svn p_drbd_svn \
meta master-max="2" clone-max="2" notify="true"
ms ms_drbd_www p_drbd_www \
meta master-max="2" clone-max="2" notify="true"
clone cl_ocfs2mgmt g_ocfs2mgmt \
meta interleave="true"
location stonith-malastare soapi-fencing-malastare -inf: Malastare
location stonith-vindemiatrix soapi-fencing-vindemiatrix -inf: Vindemiatrix
property $id="cib-bootstrap-options" \
dc-version="1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff" \
cluster-infrastructure="openais" \
expected-quorum-votes="2"

Unfortunately, the problem is still there :

root@Malastare:/home/david# crm_mon --one-shot -VroA

Last updated: Fri Jun 22 10:54:31 2012
Last change: Fri Jun 22 10:54:27 2012 via crm_shadow on Malastare
Stack: openais
Current DC: Malastare - partition with quorum
Version: 1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff
2 Nodes configured, 2 expected votes
14 Resources configured.


Online: [ Malastare Vindemiatrix ]

Full list of resources:

 soapi-fencing-malastare(stonith:external/ovh):Started Vindemiatrix
 soapi-fencing-vindemiatrix(stonith:external/ovh):Started Malastare
 Master/Slave Set: ms_drbd_pgsql [p_drbd_pgsql]
 Masters: [ Malastare Vindemiatrix ]
 Master/Slave Set: ms_drbd_svn [p_drbd_svn]
 Masters: [ Malastare Vindemiatrix ]
 Master/Slave Set: ms_drbd_www [p_drbd_www]
 Masters: [ Malastare Vindemiatrix ]
 Master/Slave Set: ms_drbd_backupvi [p_drbd_backupvi]
 Masters: [ Malastare Vindemiatrix ]
 Clone Set: cl_ocfs2mgmt [g_ocfs2mgmt]
 Stopped: [ g_ocfs2mgmt:0 g_ocfs2mgmt:1 ]

Node Attributes:
* Node Malastare:
+ master-p_drbd_backupvi:0: 1
+ master-p_drbd_pgsql:0   : 1
+ master-p_drbd_svn:0 : 1
+ master-p_drbd_www:0 : 1
* Node Vindemiatrix:
+ master-p_drbd_backupvi:1: 1
+ master-p_drbd_pgsql:1   : 1
+ master-p_drbd_svn:1 : 1
+ master-p_drbd_www:1 : 1

Operations:
* Node Vindemiatrix:
   soapi-fencing-malastare: migration-threshold=100
+ (4) start: rc=0 (ok)
   p_drbd_pgsql:1: migration-threshold=100
+ (5) probe: rc=8 (master)
   p_drbd_svn:1: migration-threshold=100
+ (6) probe: rc=8 (master)
   p_drbd_www:1: migration-threshold=100
+ (7) probe: rc=8 (master)
   p_drbd_backupvi:1: migration-threshold=100
+ (8) probe: rc=8 (master)
   p_o2cb:1: migration-threshold=100
+ (10) probe: rc=5 (not installed)
* Node Malastare:
   soapi-fencing-vindemiatrix: migration-threshold=100
+ (4) start: rc=0 (ok)
   p_drbd_pgsql:0: migration-threshold=100
+ (5) probe: rc=8 (master)
   p_drbd_svn:0: migration-threshold=100
+ (6) probe: rc=8 (master)
   p_drbd_www:0: migration-threshold=100
+ (7) probe: rc=8 (master)
   p_drbd_backupvi:0: migration-threshold=100
+ (8) probe: rc=8 (master)
   p_o2cb:0: migration-threshold=100
+ (10) probe: rc=5 (not installed)

Failed actions:
p_o2cb:1_monitor_0 (node=Vindemiatrix, call=10, rc=5,
status=complete): not installed
p_o2cb:0_monitor_0 (node=Malastare, call=10, rc=5, status=complete):
not installed

Nevertheless, I noticed a strange error message in Corosync/Pacemaker logs :
Jun 22 10:54:25 Vindemiatrix lrmd: [24580]: info: RA output:
(p_controld:1:probe:stderr) dlm_controld.pcmk: no process found

This message was immediately followed b

Re: [Pacemaker] "ERROR: Wrong stack o2cb" when trying to start o2cb service in Pacemaker cluster

2012-06-20 Thread Andreas Kurz
On 06/20/2012 03:49 PM, David Guyot wrote:
> Actually, yes, I start DRBD manually, because this is currently a test
> configuration which relies on OpenVPN for the communications between
> these 2 nodes. I have no order and collocation constraints because I'm
> discovering these software and trying to configure them step by step and
> make resources work before ordering them (nevertheless, I just tried to
> configure DLM/O2CB constraints, but they fail, apparently because they
> are relying on O2CB, which causes the problem I wrote you about.) And I
> have no OCFS2 mounts because I was on the assumption that OCFS2 wouldn't
> mount partitions without O2CB and DLM, which seems to be right :

In fact it won't work without constraints, even if you are only testing:
e.g. controld and o2cb must run on the same node (in fact on both nodes,
of course) and controld must run before o2cb.
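
In crm syntax that would be roughly (a sketch using the names from your
crm_mon output; only one of the DRBD master/slave sets is shown and the
constraint names are made up):

colocation col_o2cb_with_drbd inf: cl_ocfs2mgmt ms_drbd_ocfs2_www:Master
order o_drbd_before_o2cb inf: ms_drbd_ocfs2_www:promote cl_ocfs2mgmt:start

The controld-before-o2cb ordering within a node is already implied by keeping
both primitives in the g_ocfs2mgmt group.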

And the error message you showed in a previous mail:

2012/06/20_09:04:35 ERROR: Wrong stack o2cb

... implies that you are already running the native ocfs2 cluster stack
outside of pacemaker. Did you do an "/etc/init.d/ocfs2 stop" before starting
your cluster tests, and is it still stopped? And if it is stopped, a cleanup
of the cl_ocfs2mgmt resource should start that resource ... if there are no
other errors.
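
For example (a sketch; the resource name is taken from your config):

/etc/init.d/o2cb stop
/etc/init.d/ocfs2 stop
crm resource cleanup cl_ocfs2mgmt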

You installed dlm-pcmk and ocfs2-tools-pacemaker packages from backports?

> 
> root@Malastare:/home/david# crm_mon --one-shot -VroA
> 
> Last updated: Wed Jun 20 15:32:50 2012
> Last change: Wed Jun 20 15:28:34 2012 via crm_shadow on Malastare
> Stack: openais
> Current DC: Vindemiatrix - partition with quorum
> Version: 1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff
> 2 Nodes configured, 2 expected votes
> 14 Resources configured.
> 
> 
> Online: [ Vindemiatrix Malastare ]
> 
> Full list of resources:
> 
>  soapi-fencing-malastare(stonith:external/ovh):Started Vindemiatrix
>  soapi-fencing-vindemiatrix(stonith:external/ovh):Started Malastare
>  Master/Slave Set: ms_drbd_ocfs2_pgsql [p_drbd_ocfs2_pgsql]
>  Masters: [ Malastare Vindemiatrix ]
>  Master/Slave Set: ms_drbd_ocfs2_backupvi [p_drbd_ocfs2_backupvi]
>  Masters: [ Malastare Vindemiatrix ]
>  Master/Slave Set: ms_drbd_ocfs2_svn [p_drbd_ocfs2_svn]
>  Masters: [ Malastare Vindemiatrix ]
>  Master/Slave Set: ms_drbd_ocfs2_www [p_drbd_ocfs2_www]
>  Masters: [ Malastare Vindemiatrix ]
>  Clone Set: cl_ocfs2mgmt [g_ocfs2mgmt]
>  Stopped: [ g_ocfs2mgmt:0 g_ocfs2mgmt:1 ]
> 
> Node Attributes:
> * Node Vindemiatrix:
> + master-p_drbd_ocfs2_backupvi:1  : 1
> + master-p_drbd_ocfs2_pgsql:1 : 1
> + master-p_drbd_ocfs2_svn:1   : 1
> + master-p_drbd_ocfs2_www:1   : 1
> * Node Malastare:
> + master-p_drbd_ocfs2_backupvi:0  : 1
> + master-p_drbd_ocfs2_pgsql:0 : 1
> + master-p_drbd_ocfs2_svn:0   : 1
> + master-p_drbd_ocfs2_www:0   : 1
> 
> Operations:
> * Node Vindemiatrix:
>p_drbd_ocfs2_pgsql:1: migration-threshold=100
> + (4) probe: rc=8 (master)
>p_drbd_ocfs2_backupvi:1: migration-threshold=100
> + (5) probe: rc=8 (master)
>p_drbd_ocfs2_svn:1: migration-threshold=100
> + (6) probe: rc=8 (master)
>p_drbd_ocfs2_www:1: migration-threshold=100
> + (7) probe: rc=8 (master)
>soapi-fencing-malastare: migration-threshold=100
> + (10) start: rc=0 (ok)
>p_o2cb:1: migration-threshold=100
> + (9) probe: rc=5 (not installed)
> * Node Malastare:
>p_drbd_ocfs2_pgsql:0: migration-threshold=100
> + (4) probe: rc=8 (master)
>p_drbd_ocfs2_backupvi:0: migration-threshold=100
> + (5) probe: rc=8 (master)
>p_drbd_ocfs2_svn:0: migration-threshold=100
> + (6) probe: rc=8 (master)
>soapi-fencing-vindemiatrix: migration-threshold=100
> + (10) start: rc=0 (ok)
>p_drbd_ocfs2_www:0: migration-threshold=100
> + (7) probe: rc=8 (master)
>p_o2cb:0: migration-threshold=100
> + (9) probe: rc=5 (not installed)
> 
> Failed actions:
> p_o2cb:1_monitor_0 (node=Vindemiatrix, call=9, rc=5,
> status=complete): not installed
> p_o2cb:0_monitor_0 (node=Malastare, call=9, rc=5, status=complete):
> not installed
> root@Malastare:/home/david# mount -t ocfs2 /dev/drbd1 /media/ocfs/
> mount.ocfs2: Cluster stack specified does not match the one currently
> running while trying to join the group
> 
> Concerning the notify meta-attribute, I didn't configure it because it
> wasn't even referred to in the official DRBD guide (
> http://www.drbd.org/users-guide-8.3/s-ocfs2-pacemaker.html), and I don't
> know what it does, so, by default, I stupidly followed the official
> guide. What does this meta-attribute set? If you know a better guide,
> could you please tell me about it, so I can check my config against this
> other guide?

Well, then this is a documentation bug ... you will 

Re: [Pacemaker] "ERROR: Wrong stack o2cb" when trying to start o2cb service in Pacemaker cluster

2012-06-20 Thread David Guyot
Actually, yes, I start DRBD manually, because this is currently a test
configuration which relies on OpenVPN for communication between
these 2 nodes. I have no order and colocation constraints because I'm
discovering this software and trying to configure it step by step and
make resources work before ordering them (nevertheless, I just tried to
configure DLM/O2CB constraints, but they fail, apparently because they
rely on O2CB, which causes the problem I wrote you about). And I
have no OCFS2 mounts because I was working on the assumption that OCFS2
wouldn't mount partitions without O2CB and DLM, which seems to be right:

root@Malastare:/home/david# crm_mon --one-shot -VroA

Last updated: Wed Jun 20 15:32:50 2012
Last change: Wed Jun 20 15:28:34 2012 via crm_shadow on Malastare
Stack: openais
Current DC: Vindemiatrix - partition with quorum
Version: 1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff
2 Nodes configured, 2 expected votes
14 Resources configured.


Online: [ Vindemiatrix Malastare ]

Full list of resources:

 soapi-fencing-malastare(stonith:external/ovh):Started Vindemiatrix
 soapi-fencing-vindemiatrix(stonith:external/ovh):Started Malastare
 Master/Slave Set: ms_drbd_ocfs2_pgsql [p_drbd_ocfs2_pgsql]
 Masters: [ Malastare Vindemiatrix ]
 Master/Slave Set: ms_drbd_ocfs2_backupvi [p_drbd_ocfs2_backupvi]
 Masters: [ Malastare Vindemiatrix ]
 Master/Slave Set: ms_drbd_ocfs2_svn [p_drbd_ocfs2_svn]
 Masters: [ Malastare Vindemiatrix ]
 Master/Slave Set: ms_drbd_ocfs2_www [p_drbd_ocfs2_www]
 Masters: [ Malastare Vindemiatrix ]
 Clone Set: cl_ocfs2mgmt [g_ocfs2mgmt]
 Stopped: [ g_ocfs2mgmt:0 g_ocfs2mgmt:1 ]

Node Attributes:
* Node Vindemiatrix:
+ master-p_drbd_ocfs2_backupvi:1  : 1
+ master-p_drbd_ocfs2_pgsql:1 : 1
+ master-p_drbd_ocfs2_svn:1   : 1
+ master-p_drbd_ocfs2_www:1   : 1
* Node Malastare:
+ master-p_drbd_ocfs2_backupvi:0  : 1
+ master-p_drbd_ocfs2_pgsql:0 : 1
+ master-p_drbd_ocfs2_svn:0   : 1
+ master-p_drbd_ocfs2_www:0   : 1

Operations:
* Node Vindemiatrix:
   p_drbd_ocfs2_pgsql:1: migration-threshold=100
+ (4) probe: rc=8 (master)
   p_drbd_ocfs2_backupvi:1: migration-threshold=100
+ (5) probe: rc=8 (master)
   p_drbd_ocfs2_svn:1: migration-threshold=100
+ (6) probe: rc=8 (master)
   p_drbd_ocfs2_www:1: migration-threshold=100
+ (7) probe: rc=8 (master)
   soapi-fencing-malastare: migration-threshold=100
+ (10) start: rc=0 (ok)
   p_o2cb:1: migration-threshold=100
+ (9) probe: rc=5 (not installed)
* Node Malastare:
   p_drbd_ocfs2_pgsql:0: migration-threshold=100
+ (4) probe: rc=8 (master)
   p_drbd_ocfs2_backupvi:0: migration-threshold=100
+ (5) probe: rc=8 (master)
   p_drbd_ocfs2_svn:0: migration-threshold=100
+ (6) probe: rc=8 (master)
   soapi-fencing-vindemiatrix: migration-threshold=100
+ (10) start: rc=0 (ok)
   p_drbd_ocfs2_www:0: migration-threshold=100
+ (7) probe: rc=8 (master)
   p_o2cb:0: migration-threshold=100
+ (9) probe: rc=5 (not installed)

Failed actions:
p_o2cb:1_monitor_0 (node=Vindemiatrix, call=9, rc=5,
status=complete): not installed
p_o2cb:0_monitor_0 (node=Malastare, call=9, rc=5, status=complete):
not installed
root@Malastare:/home/david# mount -t ocfs2 /dev/drbd1 /media/ocfs/
mount.ocfs2: Cluster stack specified does not match the one currently
running while trying to join the group

Concerning the notify meta-attribute, I didn't configure it because it
wasn't even referred to in the official DRBD guide (
http://www.drbd.org/users-guide-8.3/s-ocfs2-pacemaker.html), and I don't
know what it does, so, by default, I stupidly followed the official
guide. What does this meta-attribute set? If you know a better guide,
could you please tell me about it, so I can check my config against this
other guide?

And, last but not least, I run Debian Squeeze 3.2.13-grsec--grs-ipv6-64.

Thank you in advance.

Kind regards.

PS: if you find me a bit rude, please accept my apologies; I've been working
on this for weeks following the official DRBD guide, and it's frustrating
to ask for help as a last resort and to be answered with something which
sounds like "What's this bloody mess!?!" to my tired nerve cells. Once
again, please accept my apologies.

On 20/06/2012 15:09, Andreas Kurz wrote:
> On 06/20/2012 02:22 PM, David Guyot wrote:
>> Hello.
>>
>> Oops, an omission.
>>
>> Here comes my Pacemaker config :
>> root@Malastare:/home/david# crm configure show
>> node Malastare
>> node Vindemiatrix
>> primitive p_controld ocf:pacemaker:controld
>> primitive p_drbd_ocfs2_backupvi ocf:linbit:drbd \
>> params drbd_resource="backupvi"
>> primitive p_drbd_ocfs2_pgsql ocf:linbit:drbd \
>> params drbd_resource="postgresql"
>> primitive p_drbd_ocfs2_svn ocf:linbit:drbd 

Re: [Pacemaker] "ERROR: Wrong stack o2cb" when trying to start o2cb service in Pacemaker cluster

2012-06-20 Thread Andreas Kurz
On 06/20/2012 02:22 PM, David Guyot wrote:
> Hello.
> 
> Oops, an omission.
> 
> Here comes my Pacemaker config :
> root@Malastare:/home/david# crm configure show
> node Malastare
> node Vindemiatrix
> primitive p_controld ocf:pacemaker:controld
> primitive p_drbd_ocfs2_backupvi ocf:linbit:drbd \
> params drbd_resource="backupvi"
> primitive p_drbd_ocfs2_pgsql ocf:linbit:drbd \
> params drbd_resource="postgresql"
> primitive p_drbd_ocfs2_svn ocf:linbit:drbd \
> params drbd_resource="svn"
> primitive p_drbd_ocfs2_www ocf:linbit:drbd \
> params drbd_resource="www"
> primitive p_o2cb ocf:pacemaker:o2cb \
> meta target-role="Started"
> primitive soapi-fencing-malastare stonith:external/ovh \
> params reversedns="ns208812.ovh.net"
> primitive soapi-fencing-vindemiatrix stonith:external/ovh \
> params reversedns="ns235795.ovh.net"
> ms ms_drbd_ocfs2_backupvi p_drbd_ocfs2_backupvi \
> meta master-max="2" clone-max="2"
> ms ms_drbd_ocfs2_pgsql p_drbd_ocfs2_pgsql \
> meta master-max="2" clone-max="2"
> ms ms_drbd_ocfs2_svn p_drbd_ocfs2_svn \
> meta master-max="2" clone-max="2"
> ms ms_drbd_ocfs2_www p_drbd_ocfs2_www \
> meta master-max="2" clone-max="2"
> location stonith-malastare soapi-fencing-malastare -inf: Malastare
> location stonith-vindemiatrix soapi-fencing-vindemiatrix -inf: Vindemiatrix
> property $id="cib-bootstrap-options" \
> dc-version="1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff" \
> cluster-infrastructure="openais" \
> expected-quorum-votes="2"
> 

I have absolutely no idea why your configuration can run at all without
more errors ... do you start the drbd resources manually before the cluster?

You are missing the notify meta-attribute for all your DRBD ms
resources, you have no order and colocation constraints or groups at all
and you don't clone controld and o2cb ... and there are no ocfs2 mounts?
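
As a sketch only (same resource names as in your config; check the DRBD
user's guide for the authoritative version), the pieces would normally look
more like:

ms ms_drbd_ocfs2_www p_drbd_ocfs2_www \
meta master-max="2" clone-max="2" notify="true"
group g_ocfs2mgmt p_controld p_o2cb
clone cl_ocfs2mgmt g_ocfs2mgmt meta interleave="true"

The notify="true" meta-attribute makes Pacemaker send pre/post notifications
to the DRBD agent so the two peers can coordinate; the linbit agent expects
it, especially in dual-primary setups.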

Also quite important: what distribution are you using?

> The STONITH resources are custom ones which use my provider's SOAP API to
> electrically reboot fenced nodes.
> 
> Concerning the web page you told me about, I tried to insert the
> referenced environment variable, but it did not solve the problem:

Really have a look at the crm configuration snippet on that page and
read manuals about setting up DRBD in Pacemaker.

Regards,
Andreas

> root@Malastare:/home/david# crm_mon --one-shot -VroA
> 
> Last updated: Wed Jun 20 14:14:41 2012
> Last change: Wed Jun 20 09:22:39 2012 via cibadmin on Malastare
> Stack: openais
> Current DC: Vindemiatrix - partition with quorum
> Version: 1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff
> 2 Nodes configured, 2 expected votes
> 12 Resources configured.
> 
> 
> Online: [ Vindemiatrix Malastare ]
> 
> Full list of resources:
> 
>  soapi-fencing-malastare(stonith:external/ovh):Stopped
>  p_controld(ocf::pacemaker:controld):Started Malastare
>  p_o2cb(ocf::pacemaker:o2cb):Started Vindemiatrix FAILED
>  soapi-fencing-vindemiatrix(stonith:external/ovh):Stopped
>  Master/Slave Set: ms_drbd_ocfs2_pgsql [p_drbd_ocfs2_pgsql]
>  Masters: [ Vindemiatrix Malastare ]
>  Master/Slave Set: ms_drbd_ocfs2_backupvi [p_drbd_ocfs2_backupvi]
>  Masters: [ Vindemiatrix Malastare ]
>  Master/Slave Set: ms_drbd_ocfs2_svn [p_drbd_ocfs2_svn]
>  Masters: [ Vindemiatrix Malastare ]
>  Master/Slave Set: ms_drbd_ocfs2_www [p_drbd_ocfs2_www]
>  Masters: [ Vindemiatrix Malastare ]
> 
> Node Attributes:
> * Node Vindemiatrix:
> + master-p_drbd_ocfs2_backupvi:0  : 1
> + master-p_drbd_ocfs2_pgsql:0 : 1
> + master-p_drbd_ocfs2_svn:0   : 1
> + master-p_drbd_ocfs2_www:0   : 1
> * Node Malastare:
> + master-p_drbd_ocfs2_backupvi:1  : 1
> + master-p_drbd_ocfs2_pgsql:1 : 1
> + master-p_drbd_ocfs2_svn:1   : 1
> + master-p_drbd_ocfs2_www:1   : 1
> 
> Operations:
> * Node Vindemiatrix:
>p_o2cb: migration-threshold=100 fail-count=100
> + (11) start: rc=5 (not installed)
>p_drbd_ocfs2_pgsql:0: migration-threshold=100
> + (6) probe: rc=8 (master)
>p_drbd_ocfs2_backupvi:0: migration-threshold=100
> + (7) probe: rc=8 (master)
>p_drbd_ocfs2_svn:0: migration-threshold=100
> + (8) probe: rc=8 (master)
>p_drbd_ocfs2_www:0: migration-threshold=100
> + (9) probe: rc=8 (master)
> * Node Malastare:
>p_controld: migration-threshold=100
> + (10) start: rc=0 (ok)
>p_o2cb: migration-threshold=100
> + (4) probe: rc=5 (not installed)
>p_drbd_ocfs2_pgsql:1: migration-threshold=100
> + (6) probe: rc=8 (master)
>p_drbd_ocfs2_backupvi:1: migration-threshold=100
> + (7) probe: rc=8 (master)
>p_drbd_ocfs2_svn:1: migration-threshold=100
> + (8) probe: rc=8 (master)
>p_drbd_ocfs2_www:1: migration-threshold=100
> + (9) prob

Re: [Pacemaker] "ERROR: Wrong stack o2cb" when trying to start o2cb service in Pacemaker cluster

2012-06-20 Thread David Guyot
Hello.

Oops, an omission.

Here comes my Pacemaker config :
root@Malastare:/home/david# crm configure show
node Malastare
node Vindemiatrix
primitive p_controld ocf:pacemaker:controld
primitive p_drbd_ocfs2_backupvi ocf:linbit:drbd \
params drbd_resource="backupvi"
primitive p_drbd_ocfs2_pgsql ocf:linbit:drbd \
params drbd_resource="postgresql"
primitive p_drbd_ocfs2_svn ocf:linbit:drbd \
params drbd_resource="svn"
primitive p_drbd_ocfs2_www ocf:linbit:drbd \
params drbd_resource="www"
primitive p_o2cb ocf:pacemaker:o2cb \
meta target-role="Started"
primitive soapi-fencing-malastare stonith:external/ovh \
params reversedns="ns208812.ovh.net"
primitive soapi-fencing-vindemiatrix stonith:external/ovh \
params reversedns="ns235795.ovh.net"
ms ms_drbd_ocfs2_backupvi p_drbd_ocfs2_backupvi \
meta master-max="2" clone-max="2"
ms ms_drbd_ocfs2_pgsql p_drbd_ocfs2_pgsql \
meta master-max="2" clone-max="2"
ms ms_drbd_ocfs2_svn p_drbd_ocfs2_svn \
meta master-max="2" clone-max="2"
ms ms_drbd_ocfs2_www p_drbd_ocfs2_www \
meta master-max="2" clone-max="2"
location stonith-malastare soapi-fencing-malastare -inf: Malastare
location stonith-vindemiatrix soapi-fencing-vindemiatrix -inf: Vindemiatrix
property $id="cib-bootstrap-options" \
dc-version="1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff" \
cluster-infrastructure="openais" \
expected-quorum-votes="2"

The STONITH resources are custom ones which use my provider's SOAP API to
electrically reboot fenced nodes.

Concerning the web page you told me about, I tried to insert the
referenced environment variable, but it did not solve the problem:
root@Malastare:/home/david# crm_mon --one-shot -VroA

Last updated: Wed Jun 20 14:14:41 2012
Last change: Wed Jun 20 09:22:39 2012 via cibadmin on Malastare
Stack: openais
Current DC: Vindemiatrix - partition with quorum
Version: 1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff
2 Nodes configured, 2 expected votes
12 Resources configured.


Online: [ Vindemiatrix Malastare ]

Full list of resources:

 soapi-fencing-malastare(stonith:external/ovh):Stopped
 p_controld(ocf::pacemaker:controld):Started Malastare
 p_o2cb(ocf::pacemaker:o2cb):Started Vindemiatrix FAILED
 soapi-fencing-vindemiatrix(stonith:external/ovh):Stopped
 Master/Slave Set: ms_drbd_ocfs2_pgsql [p_drbd_ocfs2_pgsql]
 Masters: [ Vindemiatrix Malastare ]
 Master/Slave Set: ms_drbd_ocfs2_backupvi [p_drbd_ocfs2_backupvi]
 Masters: [ Vindemiatrix Malastare ]
 Master/Slave Set: ms_drbd_ocfs2_svn [p_drbd_ocfs2_svn]
 Masters: [ Vindemiatrix Malastare ]
 Master/Slave Set: ms_drbd_ocfs2_www [p_drbd_ocfs2_www]
 Masters: [ Vindemiatrix Malastare ]

Node Attributes:
* Node Vindemiatrix:
+ master-p_drbd_ocfs2_backupvi:0  : 1
+ master-p_drbd_ocfs2_pgsql:0 : 1
+ master-p_drbd_ocfs2_svn:0   : 1
+ master-p_drbd_ocfs2_www:0   : 1
* Node Malastare:
+ master-p_drbd_ocfs2_backupvi:1  : 1
+ master-p_drbd_ocfs2_pgsql:1 : 1
+ master-p_drbd_ocfs2_svn:1   : 1
+ master-p_drbd_ocfs2_www:1   : 1

Operations:
* Node Vindemiatrix:
   p_o2cb: migration-threshold=100 fail-count=100
+ (11) start: rc=5 (not installed)
   p_drbd_ocfs2_pgsql:0: migration-threshold=100
+ (6) probe: rc=8 (master)
   p_drbd_ocfs2_backupvi:0: migration-threshold=100
+ (7) probe: rc=8 (master)
   p_drbd_ocfs2_svn:0: migration-threshold=100
+ (8) probe: rc=8 (master)
   p_drbd_ocfs2_www:0: migration-threshold=100
+ (9) probe: rc=8 (master)
* Node Malastare:
   p_controld: migration-threshold=100
+ (10) start: rc=0 (ok)
   p_o2cb: migration-threshold=100
+ (4) probe: rc=5 (not installed)
   p_drbd_ocfs2_pgsql:1: migration-threshold=100
+ (6) probe: rc=8 (master)
   p_drbd_ocfs2_backupvi:1: migration-threshold=100
+ (7) probe: rc=8 (master)
   p_drbd_ocfs2_svn:1: migration-threshold=100
+ (8) probe: rc=8 (master)
   p_drbd_ocfs2_www:1: migration-threshold=100
+ (9) probe: rc=8 (master)

Failed actions:
p_o2cb_start_0 (node=Vindemiatrix, call=11, rc=5, status=complete):
not installed
p_o2cb_monitor_0 (node=Malastare, call=4, rc=5, status=complete):
not installed

Thank you in advance for your help!

Kind regards.

On 20/06/2012 14:02, Andreas Kurz wrote:
> On 06/20/2012 01:43 PM, David Guyot wrote:
>> Hello, everybody.
>>
>> I'm trying to configure Pacemaker to use DRBD + OCFS2 storage, but
>> I'm stuck with DRBD and controld up and o2cb doggedly displaying "not
>> installed" errors. To do this, I followed the DRBD guide (
>> http://www.drbd.org/users-guide-8.3/ch-ocfs2.html), with the difference
>> that I was forced to disable DRBD fencing because it was interfering
>> with Pacemaker fencing and stopping each node as often as it could.
> Unfortunately y

Re: [Pacemaker] "ERROR: Wrong stack o2cb" when trying to start o2cb service in Pacemaker cluster

2012-06-20 Thread Andreas Kurz
On 06/20/2012 01:43 PM, David Guyot wrote:
> Hello, everybody.
> 
> I'm trying to configure Pacemaker to use DRBD + OCFS2 storage, but
> I'm stuck with DRBD and controld up and o2cb doggedly displaying "not
> installed" errors. To do this, I followed the DRBD guide (
> http://www.drbd.org/users-guide-8.3/ch-ocfs2.html), with the difference
> that I was forced to disable DRBD fencing because it was interfering
> with Pacemaker fencing and stopping each node as often as it could.

Unfortunately you didn't share your Pacemaker configuration, but you
definitely must not start any ocfs2 init script; let it all be managed
by the cluster manager.

Here is a brief setup description, which also mentions the tunefs.ocfs2 step
once the Pacemaker stack is running:

http://www.hastexo.com/resources/hints-and-kinks/ocfs2-pacemaker-debianubuntu
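
The step that usually matters from that page, roughly (run it from one node
only, once the cluster-managed controld/o2cb are up; /dev/drbd1 is just the
device from your own mails):

tunefs.ocfs2 --update-cluster-stack /dev/drbd1

If the filesystem was created while the native o2cb stack was active, this
rewrites the on-disk cluster stack information so it matches the pcmk stack
the cluster is now running.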

And once this is running as expected you really want to reactivate the
DRBD fencing configuration.
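
For completeness, the DRBD fencing configuration referred to is typically
something along these lines in the resource definition (a sketch only;
handler paths may differ on your installation):

disk {
fencing resource-and-stonith;
}
handlers {
fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
}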

Regards,
Andreas

-- 
Need help with Pacemaker?
http://www.hastexo.com/now

> 
> Now, as I said, I'm stuck with these errors (Malastare and Vindemiatrix
> being the 2 nodes of my cluster) :
> Failed actions:
> p_o2cb_start_0 (node=Vindemiatrix, call=11, rc=5, status=complete):
> not installed
> p_o2cb_monitor_0 (node=Malastare, call=4, rc=5, status=complete):
> not installed
> 
> Looking into logs, I find these messages :
> o2cb[19904]:2012/06/20_09:04:35 ERROR: Wrong stack o2cb
> o2cb[19904]:2012/06/20_09:04:35 ERROR: Wrong stack o2cb
> 
> I tried to manually test ocf:pacemaker:o2cb, but I got this result :
> root@Malastare:/home/david# export OCF_ROOT="/usr/lib/ocf"
> root@Malastare:/home/david# /usr/lib/ocf/resource.d/pacemaker/o2cb monitor
> o2cb[22387]: ERROR: Wrong stack o2cb
> root@Malastare:/home/david# echo $?
> 5
> 
> I tried the solution described on this message (
> http://oss.clusterlabs.org/pipermail/pacemaker/2009-December/004112.html),
> but tunefs.ocfs2 failed :
> root@Malastare:/home/david# cat /etc/ocfs2/cluster.conf
> node:
> name = Malastare
> cluster = ocfs2
> number = 0
> ip_address = 10.88.0.1
> ip_port = 
> node:
> name = Vindemiatrix
> cluster = ocfs2
> number = 1
> ip_address = 10.88.0.2
> ip_port = 
> cluster:
> name = ocfs2
> node_count = 2
> root@Malastare:/home/david# /etc/init.d/ocfs2 start
> root@Malastare:/home/david# tunefs.ocfs2 --update-cluster-stack /dev/drbd1
> Updating on-disk cluster information to match the running cluster.
> DANGER: YOU MUST BE ABSOLUTELY SURE THAT NO OTHER NODE IS USING THIS
> FILESYSTEM BEFORE MODIFYING ITS CLUSTER CONFIGURATION.
> Update the on-disk cluster information? yes
> tunefs.ocfs2: Unable to access cluster service - unable to update the
> cluster stack information on device "/dev/drbd1"
> root@Malastare:/home/david# /etc/init.d/o2cb start
> root@Malastare:/home/david# tunefs.ocfs2 --update-cluster-stack /dev/drbd1
> Updating on-disk cluster information to match the running cluster.
> DANGER: YOU MUST BE ABSOLUTELY SURE THAT NO OTHER NODE IS USING THIS
> FILESYSTEM BEFORE MODIFYING ITS CLUSTER CONFIGURATION.
> Update the on-disk cluster information? yes
> tunefs.ocfs2: Unable to access cluster service - unable to update the
> cluster stack information on device "/dev/drbd1"
> 
> I also tried to manually set the cluster stack using "echo pcmk >
> /sys/fs/ocfs2/cluster_stack" on both nodes, then restarting Corosync and
> Pacemaker, but the error messages stayed. Same thing with resetting the
> OCFS2 filesystems while the cluster is online. Now I'm stuck, desperately
> waiting for help... Seriously, if some do-gooder would like to help me, I
> would greatly appreciate his or her help.
> 
> Thank you in advance for your answers.
> 
> Kind regards.
> 
> 
> 
> ___
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> 
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
> 





___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[Pacemaker] "ERROR: Wrong stack o2cb" when trying to start o2cb service in Pacemaker cluster

2012-06-20 Thread David Guyot
Hello, everybody.

I'm trying to configure Pacemaker to use DRBD + OCFS2 storage, but
I'm stuck with DRBD and controld up and o2cb doggedly displaying "not
installed" errors. To do this, I followed the DRBD guide (
http://www.drbd.org/users-guide-8.3/ch-ocfs2.html), with the difference
that I was forced to disable DRBD fencing because it was interfering
with Pacemaker fencing and stopping each node as often as it could.

Now, as I said, I'm stuck with these errors (Malastare and Vindemiatrix
being the 2 nodes of my cluster) :
Failed actions:
p_o2cb_start_0 (node=Vindemiatrix, call=11, rc=5, status=complete):
not installed
p_o2cb_monitor_0 (node=Malastare, call=4, rc=5, status=complete):
not installed

Looking into logs, I find these messages :
o2cb[19904]:2012/06/20_09:04:35 ERROR: Wrong stack o2cb
o2cb[19904]:2012/06/20_09:04:35 ERROR: Wrong stack o2cb

I tried to manually test ocf:pacemaker:o2cb, but I got this result :
root@Malastare:/home/david# export OCF_ROOT="/usr/lib/ocf"
root@Malastare:/home/david# /usr/lib/ocf/resource.d/pacemaker/o2cb monitor
o2cb[22387]: ERROR: Wrong stack o2cb
root@Malastare:/home/david# echo $?
5

I tried the solution described on this message (
http://oss.clusterlabs.org/pipermail/pacemaker/2009-December/004112.html),
but tunefs.ocfs2 failed :
root@Malastare:/home/david# cat /etc/ocfs2/cluster.conf
node:
name = Malastare
cluster = ocfs2
number = 0
ip_address = 10.88.0.1
ip_port = 
node:
name = Vindemiatrix
cluster = ocfs2
number = 1
ip_address = 10.88.0.2
ip_port = 
cluster:
name = ocfs2
node_count = 2
root@Malastare:/home/david# /etc/init.d/ocfs2 start
root@Malastare:/home/david# tunefs.ocfs2 --update-cluster-stack /dev/drbd1
Updating on-disk cluster information to match the running cluster.
DANGER: YOU MUST BE ABSOLUTELY SURE THAT NO OTHER NODE IS USING THIS
FILESYSTEM BEFORE MODIFYING ITS CLUSTER CONFIGURATION.
Update the on-disk cluster information? yes
tunefs.ocfs2: Unable to access cluster service - unable to update the
cluster stack information on device "/dev/drbd1"
root@Malastare:/home/david# /etc/init.d/o2cb start
root@Malastare:/home/david# tunefs.ocfs2 --update-cluster-stack /dev/drbd1
Updating on-disk cluster information to match the running cluster.
DANGER: YOU MUST BE ABSOLUTELY SURE THAT NO OTHER NODE IS USING THIS
FILESYSTEM BEFORE MODIFYING ITS CLUSTER CONFIGURATION.
Update the on-disk cluster information? yes
tunefs.ocfs2: Unable to access cluster service - unable to update the
cluster stack information on device "/dev/drbd1"

I also tried to manually set the cluster stack using "echo pcmk >
/sys/fs/ocfs2/cluster_stack" on both nodes, then restarting Corosync and
Pacemaker, but the error messages stayed. Same thing with resetting the
OCFS2 filesystems while the cluster is online. Now I'm stuck, desperately
waiting for help... Seriously, if some do-gooder would like to help me, I
would greatly appreciate his or her help.

Thank you in advance for your answers.

Kind regards.




___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org