Re: [Pacemaker] different behavior cibadmin -Ql with cman and corosync2

2013-09-05 Thread Christine Caulfield
On 03/09/13 22:03, Andrew Beekhof wrote: On 03/09/2013, at 11:49 PM, Christine Caulfield ccaul...@redhat.com wrote: On 03/09/13 05:20, Andrew Beekhof wrote: On 02/09/2013, at 5:27 PM, Andrey Groshev gre...@yandex.ru wrote: 30.08.2013, 07:18, Andrew Beekhof and...@beekhof.net: On

[Pacemaker] Howto recover from node state UNCLEAN (online)

2013-09-05 Thread Andreas Mock
Hi all, is there a way to recover from node state UNCLEAN (online) without rebooting? Background: - RHEL6.4 - cman-cluster with pacemaker - stonith enabled and working - resource monitoring failed on node 1 => stop of resource on node 1 failed => stonith off node 1 worked - more or less

Re: [Pacemaker] Howto recover from node state UNCLEAN (online)

2013-09-05 Thread Lars Marowsky-Bree
On 2013-09-05T12:23:23, Andreas Mock andreas.m...@web.de wrote: - resource monitoring failed on node 1 => stop of resource on node 1 failed => stonith off node 1 worked - more or less parallel as resource is clone resource resource monitoring failed on node 2 => stop of resource on

Re: [Pacemaker] different behavior cibadmin -Ql with cman and corosync2

2013-09-05 Thread Andrew Beekhof
On 05/09/2013, at 6:37 PM, Christine Caulfield ccaul...@redhat.com wrote: On 03/09/13 22:03, Andrew Beekhof wrote: On 03/09/2013, at 11:49 PM, Christine Caulfield ccaul...@redhat.com wrote: On 03/09/13 05:20, Andrew Beekhof wrote: On 02/09/2013, at 5:27 PM, Andrey Groshev

[Pacemaker] Corosync quorum not updating on split node

2013-09-05 Thread Mark Round
Hi all, I have a problem whereby when I create a network split/partition (by dropping traffic with iptables), the victim node for some reason does not realise it has split from the network. It seems to recognise that it can't form a cluster due to network issues, but the status is not

Re: [Pacemaker] different behavior cibadmin -Ql with cman and corosync2

2013-09-05 Thread Christine Caulfield
On 05/09/13 11:33, Andrew Beekhof wrote: On 05/09/2013, at 6:37 PM, Christine Caulfield ccaul...@redhat.com wrote: On 03/09/13 22:03, Andrew Beekhof wrote: On 03/09/2013, at 11:49 PM, Christine Caulfield ccaul...@redhat.com wrote: On 03/09/13 05:20, Andrew Beekhof wrote: On 02/09/2013,

[Pacemaker] Resource ordering/colocating question (DRBD + LVM + FS)

2013-09-05 Thread Heikki Manninen
Hello, I'm having a bit of a problem understanding what's going on with my simple two-node demo cluster here. My resources come up correctly after restarting the whole cluster but the LVM and Filesystem resources fail to start after a single node restart or standby/unstandby (after node comes

Re: [Pacemaker] Resource ordering/colocating question (DRBD + LVM + FS)

2013-09-05 Thread Andreas Mock
Hi Heikki, just some comments for helping yourself. 1) The second output of crm_mon show a resource IP_database which is not shown in the initial crm_mon output and also not in the config. => Reduce your problem/config to the minimum being reproducible. 2) Enable logging and look out which node

Re: [Pacemaker] Corosync quorum not updating on split node

2013-09-05 Thread Mark Round
Just a quick follow up - I had this answered on the Corosync mailing list (which I guess should have been the place for this anyway). As I was blocking all traffic with iptables, it was also blocking lo, which caused all sorts of things to break. As soon as I only blocked on eth0, things

[Pacemaker] heartbeat:anything resource not stop/monitoring after reboot

2013-09-05 Thread David Coulson
We patched and rebooted one of our clusters this morning - I verified that pacemaker is the same as previous, plus it matches another similar cluster. There is a resource in the cluster defined as: primitive re-named-reload ocf:heartbeat:anything \ params binfile=/usr/sbin/rndc

Re: [Pacemaker] heartbeat:anything resource not stop/monitoring after reboot

2013-09-05 Thread Andrew Beekhof
On 06/09/2013, at 1:23 AM, David Coulson da...@davidcoulson.net wrote: We patched and rebooted one of our clusters this morning - I verified that pacemaker is the same as previous, plus it matches another similar cluster. There is a resource in the cluster defined as: primitive