Re: [ClusterLabs] Pacemaker/pcs & DRBD not demoting secondary node to Slave (always Stopped)

2015-09-21 Thread Jason Gress
Thank you for your comment.  I attempted to use iDRAC/IPMI STONITH, and after
spending over a day on it, I had to put it on the back burner for timeline
reasons.  For whatever reason, I could not get IPMI to talk, and the
iDRAC5 plugin was not working either, for reasons I don't understand.

Is that what you had in mind, or is there another method/configuration for
fencing DRBD?

Thank you for your advice,

Jason

On 9/20/15, 9:40 PM, "Digimer" <li...@alteeve.ca> wrote:

>On 20/09/15 09:18 PM, Jason Gress wrote:
>> I seem to have caused a split brain while attempting to repair this.  But that
>
>Use fencing! Voila, no more split-brains.
>
>> wasn't the issue.  You can't have any colocation requirements for DRBD
>> resources; that's what killed me.   This line did it:
>> 
>>  ms_drbd_vmfs with ClusterIP (score:INFINITY)
>> (id:colocation-ms_drbd_vmfs-ClusterIP-INFINITY)
>> 
>> Do NOT do this!
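>>
>> (My best guess, not verified, at why: with a role-agnostic INFINITY
>> colocation, every instance of the clone may only run where ClusterIP runs,
>> so the copy on the other node can never start.  A role-specific constraint
>> along the lines of
>>
>>  pcs constraint colocation add ClusterIP with master ms_drbd_vmfs INFINITY
>>
>> should express the real intent, the IP following the Master, without
>> pinning the slave.)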
>> 
>> Jason
>> 
>> From: Jason Gress <jgr...@accertify.com>
>> Reply-To: Cluster Labs - All topics related to open-source clustering
>> welcomed <users@clusterlabs.org>
>> Date: Friday, September 18, 2015 at 3:03 PM
>> To: Cluster Labs - All topics related to open-source clustering welcomed
>> <users@clusterlabs.org>
>> Subject: Re: [ClusterLabs] Pacemaker/pcs & DRBD not demoting secondary
>> node to Slave (always Stopped)
>> 
>> Well, it almost worked.  I was able to modify the existing cluster per
>> your command, and it worked great.
>> 
>> Today, I made two more clusters via the exact same process (I
>> used/modified my notes as I was building and fixing the first one
>> yesterday) and now it's doing the same thing, despite having your
>> improved master slave rule.  Here's the config:
>> 
>> [root@fx201-1a ~]# pcs config --full
>> Cluster Name: fx201-vmcl
>> Corosync Nodes:
>>  fx201-1a.zwo fx201-1b.zwo
>> Pacemaker Nodes:
>>  fx201-1a.zwo fx201-1b.zwo
>> 
>> Resources:
>>  Resource: ClusterIP (class=ocf provider=heartbeat type=IPaddr2)
>>   Attributes: ip=10.XX.XX.XX cidr_netmask=24
>>   Operations: start interval=0s timeout=20s
>>(ClusterIP-start-timeout-20s)
>>   stop interval=0s timeout=20s (ClusterIP-stop-timeout-20s)
>>   monitor interval=15s (ClusterIP-monitor-interval-15s)
>>  Master: ms_drbd_vmfs
>>   Meta Attrs: master-max=1 master-node-max=1 clone-max=2
>> clone-node-max=1 notify=true
>>   Resource: drbd_vmfs (class=ocf provider=linbit type=drbd)
>>Attributes: drbd_resource=vmfs
>>Operations: start interval=0s timeout=240
>>(drbd_vmfs-start-timeout-240)
>>promote interval=0s timeout=90
>>(drbd_vmfs-promote-timeout-90)
>>demote interval=0s timeout=90
>>(drbd_vmfs-demote-timeout-90)
>>stop interval=0s timeout=100 (drbd_vmfs-stop-timeout-100)
>>monitor interval=29s role=Master
>> (drbd_vmfs-monitor-interval-29s-role-Master)
>>monitor interval=31s role=Slave
>> (drbd_vmfs-monitor-interval-31s-role-Slave)
>>  Resource: vmfsFS (class=ocf provider=heartbeat type=Filesystem)
>>   Attributes: device=/dev/drbd0 directory=/exports/vmfs fstype=xfs
>>   Operations: start interval=0s timeout=60 (vmfsFS-start-timeout-60)
>>   stop interval=0s timeout=60 (vmfsFS-stop-timeout-60)
>>   monitor interval=20 timeout=40
>>(vmfsFS-monitor-interval-20)
>>  Resource: nfs-server (class=systemd type=nfs-server)
>>   Operations: monitor interval=60s (nfs-server-monitor-interval-60s)
>> 
>> Stonith Devices:
>> Fencing Levels:
>> 
>> Location Constraints:
>> Ordering Constraints:
>>   promote ms_drbd_vmfs then start vmfsFS (kind:Mandatory)
>> (id:order-ms_drbd_vmfs-vmfsFS-mandatory)
>>   start vmfsFS then start nfs-server (kind:Mandatory)
>> (id:order-vmfsFS-nfs-server-mandatory)
>>   start ClusterIP then start nfs-server (kind:Mandatory)
>> (id:order-ClusterIP-nfs-server-mandatory)
>> Colocation Constraints:
>>   ms_drbd_vmfs with ClusterIP (score:INFINITY)
>> (id:colocation-ms_drbd_vmfs-ClusterIP-INFINITY)
>>   vmfsFS with ms_drbd_vmfs (score:INFINITY) (with-rsc-role:Master)
>> (id:colocation-vmfsFS-ms_drbd_vmfs-INFINITY)
>>   nfs-server with vmfsFS (score:INFINITY)
>> (id:colocation-nfs-server-vmfsFS-INFINITY)
>>   nfs-server with ClusterIP (score:INFINITY)
>> (id:colocation-nfs-server-ClusterIP-INFINITY)
>> 
>> […]

Re: [ClusterLabs] Pacemaker/pcs & DRBD not demoting secondary node to Slave (always Stopped)

2015-09-21 Thread Jason Gress
Yeah, I had problems, which I am thinking might be firewall related.  In a
previous place of employment I had IPMI working great (but with
heartbeat), so I do have some experience with IPMI STONITH.

Example:

[root@fx201-1a ~]# fence_ipmilan -a 10.XX.XX.XX -l root -p calvin -o status
Failed: Unable to obtain correct plug status or plug is not available

(Don't worry, the default Dell password was left in for illustrative purposes only!)


When I tried to watch it with tcpdump, it seemed to use random ports, so I
couldn't really advise the firewall team on how to make this work.  SSH
into the DRAC works fine, and IPMI over IP is enabled.  If anyone has
ideas on this, they would be greatly appreciated.


Thank you again,

Jason

On 9/21/15, 10:12 AM, "Digimer" <li...@alteeve.ca> wrote:

>IPMI fencing is a very common type, and it shouldn't be so hard to get
>working. Easiest is to test it out first on the command line, outside
>pacemaker. Run:
>
>fence_ipmilan -a <node IP> -l <user> -p <password> -o status
>
>If that doesn't work, you may need to use lanplus or similar. See 'man
>fence_ipmilan'. Once you can use the above command to query the power
>status of the nodes, you're 95% of the way there.
>
>Fencing cannot be put on the back burner, as you've now seen. Without
>it, things can and will go very wrong.
>
>On 21/09/15 09:34 AM, Jason Gress wrote:
>> […]

Re: [ClusterLabs] Pacemaker/pcs & DRBD not demoting secondary node to Slave (always Stopped)

2015-09-21 Thread Digimer
On 21/09/15 11:23 AM, Jason Gress wrote:
> Yeah, I had problems, which I am thinking might be firewall related.  In a
> previous place of employment I had IPMI working great (but with
> heartbeat), so I do have some experience with IPMI STONITH.

If you can query the IPMI sensor data, you should have no trouble
fencing. It's basically a wrapper for ipmitool. Can you use ipmitool to
query anything over the network?
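A quick check, using the same placeholder address and credentials as your
example (untested here), would be something like:

ipmitool -I lanplus -H 10.XX.XX.XX -U root -P calvin chassis power status

If the plain 'lan' interface answers but 'lanplus' doesn't (or the reverse),
that also tells you which interface the fence agent needs to be told to use.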

> Example:
> 
> [root@fx201-1a ~]# fence_ipmilan -a 10.XX.XX.XX -l root -p calvin -o status
> Failed: Unable to obtain correct plug status or plug is not available
> 
> (Don't worry, the default Dell password was left in for illustrative purposes only!)

That doesn't make sense... No plug is needed for IPMI fencing. What OS
and what hardware?

> When I tried to watch it with tcpdump, it seemed to use random ports, so I
> couldn't really advise the firewall team on how to make this work.  SSH
> into the DRAC works fine, and IPMI over IP is enabled.  If anyone has
> ideas on this, they would be greatly appreciated.

You're initiating the connection, so no firewall edits should be needed
(anything returning should be ESTABLISHED/RELATED).
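For what it's worth, IPMI over LAN (RMCP/RMCP+) talks to UDP port 623 on the
BMC; the "random ports" you saw were most likely just the client's ephemeral
source ports.  Watching that port specifically should confirm it:

tcpdump -ni any udp port 623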

-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?

___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Pacemaker/pcs & DRBD not demoting secondary node to Slave (always Stopped)

2015-09-21 Thread Digimer
IPMI fencing is a very common type, and it shouldn't be so hard to get
working. Easiest is to test it out first on the command line, outside
pacemaker. Run:

fence_ipmilan -a <node IP> -l <user> -p <password> -o status

If that doesn't work, you may need to use lanplus or similar. See 'man
fence_ipmilan'. Once you can use the above command to query the power
status of the nodes, you're 95% of the way there.
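With lanplus, assuming your agent version takes -P for it (check the man
page), the test would look something like:

fence_ipmilan -P -a <node IP> -l <user> -p <password> -o status

with the placeholders filled in for each node.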

Fencing cannot be put on the back burner, as you've now seen. Without
it, things can and will go very wrong.
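Once stonith works in pacemaker, DRBD itself should be wired into it too, via
its fence-peer handler. An illustrative DRBD 8.4 snippet (the crm-fence-peer
handler scripts named here ship with the stock DRBD utilities; adjust the
paths to your install):

resource vmfs {
  disk {
    fencing resource-and-stonith;  # suspend I/O and fence the peer on disconnect
  }
  handlers {
    fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
    after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
  }
}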

On 21/09/15 09:34 AM, Jason Gress wrote:
> […]

Re: [ClusterLabs] Pacemaker/pcs & DRBD not demoting secondary node to Slave (always Stopped)

2015-09-17 Thread Jason Gress
That may very well be it.  Would you be so kind as to show me the pcs command 
to create that config?  I generated my configuration with these commands, and 
I'm not sure how to get the additional monitor options in there:

pcs resource create drbd_vmfs ocf:linbit:drbd drbd_resource=vmfs \
    op monitor interval=30s
pcs resource master ms_drbd_vmfs drbd_vmfs master-max=1 master-node-max=1 \
    clone-max=2 clone-node-max=1 notify=true
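My best guess, untested, is that the role-specific monitors go in as extra
"op" clauses on the create, something like:

pcs resource create drbd_vmfs ocf:linbit:drbd drbd_resource=vmfs \
    op start timeout=240 op promote timeout=90 op demote timeout=90 \
    op stop timeout=100 \
    op monitor interval=29s role=Master \
    op monitor interval=31s role=Slave

but I would appreciate confirmation that the syntax is right.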

Thank you very much for your help, and sorry for the newbie question!

Jason

From: Luke Pascoe <l...@osnz.co.nz>
Reply-To: Cluster Labs - All topics related to open-source clustering welcomed
<users@clusterlabs.org>
Date: Thursday, September 17, 2015 at 6:54 PM
To: Cluster Labs - All topics related to open-source clustering welcomed
<users@clusterlabs.org>
Subject: Re: [ClusterLabs] Pacemaker/pcs & DRBD not demoting secondary node to 
Slave (always Stopped)

[…]

Re: [ClusterLabs] Pacemaker/pcs & DRBD not demoting secondary node to Slave (always Stopped)

2015-09-17 Thread Luke Pascoe
Ah yes, sorry.

clone-node-max
How many copies of the resource can be started on a single node; the
default value is 1.

So yes, a value of 1 here is correct.

Luke Pascoe



E l...@osnz.co.nz
P +64 (9) 296 2961
M +64 (27) 426 6649
W www.osnz.co.nz

24 Wellington St
Papakura
Auckland, 2110
New Zealand

On 18 September 2015 at 11:36, Jason Gress <jgr...@accertify.com> wrote:

> […]

Re: [ClusterLabs] Pacemaker/pcs & DRBD not demoting secondary node to Slave (always Stopped)

2015-09-17 Thread Jason Gress
I can't say whether you are right or wrong (you may be right!) but I followed
the Cluster From Scratch tutorial closely, and it only had a clone-node-max=1
there.  (Page 106 of the pdf, for the curious.)

Thanks,

Jason

From: Luke Pascoe <l...@osnz.co.nz>
Reply-To: Cluster Labs - All topics related to open-source clustering welcomed
<users@clusterlabs.org>
Date: Thursday, September 17, 2015 at 6:29 PM
To: Cluster Labs - All topics related to open-source clustering welcomed
<users@clusterlabs.org>
Subject: Re: [ClusterLabs] Pacemaker/pcs & DRBD not demoting secondary node to 
Slave (always Stopped)

I may be wrong, but shouldn't "clone-node-max" be 2 on the ms_drbd_vmfs 
resource?


Luke Pascoe



E l...@osnz.co.nz
P +64 (9) 296 2961
M +64 (27) 426 6649
W www.osnz.co.nz

24 Wellington St
Papakura
Auckland, 2110
New Zealand

On 18 September 2015 at 11:02, Jason Gress <jgr...@accertify.com> wrote:
I have a simple DRBD + filesystem + NFS configuration that works properly when 
I manually start/stop DRBD, but will not start the DRBD slave resource properly 
on failover or recovery.  I cannot ever get the Master/Slave set to say 
anything but 'Stopped'.  I am running CentOS 7.1 with the latest packages as 
of today:

[root@fx201-1a log]# rpm -qa | grep -e pcs -e pacemaker -e drbd
pacemaker-cluster-libs-1.1.12-22.el7_1.4.x86_64
pacemaker-1.1.12-22.el7_1.4.x86_64
pcs-0.9.137-13.el7_1.4.x86_64
pacemaker-libs-1.1.12-22.el7_1.4.x86_64
drbd84-utils-8.9.3-1.1.el7.elrepo.x86_64
pacemaker-cli-1.1.12-22.el7_1.4.x86_64
kmod-drbd84-8.4.6-1.el7.elrepo.x86_64

Here is my pcs config output:

[root@fx201-1a log]# pcs config
Cluster Name: fx201-vmcl
Corosync Nodes:
 fx201-1a.ams fx201-1b.ams
Pacemaker Nodes:
 fx201-1a.ams fx201-1b.ams

Resources:
 Resource: ClusterIP (class=ocf provider=heartbeat type=IPaddr2)
  Attributes: ip=10.XX.XX.XX cidr_netmask=24
  Operations: start interval=0s timeout=20s (ClusterIP-start-timeout-20s)
  stop interval=0s timeout=20s (ClusterIP-stop-timeout-20s)
  monitor interval=15s (ClusterIP-monitor-interval-15s)
 Master: ms_drbd_vmfs
  Meta Attrs: master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 
notify=true
  Resource: drbd_vmfs (class=ocf provider=linbit type=drbd)
   Attributes: drbd_resource=vmfs
   Operations: start interval=0s timeout=240 (drbd_vmfs-start-timeout-240)
   promote interval=0s timeout=90 (drbd_vmfs-promote-timeout-90)
   demote interval=0s timeout=90 (drbd_vmfs-demote-timeout-90)
   stop interval=0s timeout=100 (drbd_vmfs-stop-timeout-100)
   monitor interval=30s (drbd_vmfs-monitor-interval-30s)
 Resource: vmfsFS (class=ocf provider=heartbeat type=Filesystem)
  Attributes: device=/dev/drbd0 directory=/exports/vmfs fstype=xfs
  Operations: start interval=0s timeout=60 (vmfsFS-start-timeout-60)
  stop interval=0s timeout=60 (vmfsFS-stop-timeout-60)
  monitor interval=20 timeout=40 (vmfsFS-monitor-interval-20)
 Resource: nfs-server (class=systemd type=nfs-server)
  Operations: monitor interval=60s (nfs-server-monitor-interval-60s)

Stonith Devices:
Fencing Levels:

Location Constraints:
Ordering Constraints:
  promote ms_drbd_vmfs then start vmfsFS (kind:Mandatory) 
(id:order-ms_drbd_vmfs-vmfsFS-mandatory)
  start vmfsFS then start nfs-server (kind:Mandatory) 
(id:order-vmfsFS-nfs-server-mandatory)
  start ClusterIP then start nfs-server (kind:Mandatory) 
(id:order-ClusterIP-nfs-server-mandatory)
Colocation Constraints:
  ms_drbd_vmfs with ClusterIP (score:INFINITY) 
(id:colocation-ms_drbd_vmfs-ClusterIP-INFINITY)
  vmfsFS with ms_drbd_vmfs (score:INFINITY) (with-rsc-role:Master) 
(id:colocation-vmfsFS-ms_drbd_vmfs-INFINITY)
  nfs-server with vmfsFS (score:INFINITY) 
(id:colocation-nfs-server-vmfsFS-INFINITY)

Cluster Properties:
 cluster-infrastructure: corosync
 cluster-name: fx201-vmcl
 dc-version: 1.1.13-a14efad
 have-watchdog: false
 last-lrm-refresh: 1442528181
 stonith-enabled: false

And status:

[root@fx201-1a log]# pcs status --full
Cluster name: fx201-vmcl
Last updated: Thu Sep 17 17:55:56 2015 Last change: Thu Sep 17 17:18:10 2015 by 
root via crm_attribute on fx201-1b.ams
Stack: corosync
Current DC: fx201-1b.ams (2) (version 1.1.13-a14efad) - partition with quorum
2 nodes and 5 resources configured

Online: [ fx201-1a.ams (1) fx201-1b.ams (2) ]

Full list of resources:

 ClusterIP  (ocf::heartbeat:IPaddr2):  Started fx201-1a.ams
 Master/Slave Set: ms_drbd_vmfs [drbd_vmfs]
     drbd_vmfs  (ocf::linbit:drbd):  Master fx201-1a.ams
     drbd_vmfs  (ocf::linbit:drbd):  Stopped
     Masters: [ fx201-1a.ams ]
     Stopped: [ fx201-1b.ams ]
 vmfsFS  (ocf::heartbeat:Filesystem):  Started fx201-1a.ams
 […]

Re: [ClusterLabs] Pacemaker/pcs & DRBD not demoting secondary node to Slave (always Stopped)

2015-09-17 Thread Luke Pascoe
The only difference in the DRBD resource between yours and mine that I can
see is the monitoring parameters (mine works nicely, but is CentOS 6).
Here's mine:

Master: ms_drbd_iscsicg0
  Meta Attrs: master-max=1 master-node-max=1 clone-max=2 clone-node-max=1
notify=true
  Resource: drbd_iscsivg0 (class=ocf provider=linbit type=drbd)
   Attributes: drbd_resource=iscsivg0
   Operations: start interval=0s timeout=240
(drbd_iscsivg0-start-timeout-240)
   promote interval=0s timeout=90
(drbd_iscsivg0-promote-timeout-90)
   demote interval=0s timeout=90
(drbd_iscsivg0-demote-timeout-90)
   stop interval=0s timeout=100 (drbd_iscsivg0-stop-timeout-100)
   monitor interval=29s role=Master
(drbd_iscsivg0-monitor-interval-29s-role-Master)
   monitor interval=31s role=Slave
(drbd_iscsivg0-monitor-interval-31s-role-Slave)

What mechanism are you using to fail over? Check your constraints after you
do it and make sure it hasn't added one which stops the slave clone from
starting on the "failed" node.
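In particular, if you triggered the failover with "pcs resource move", bear in
mind (if I remember the behaviour correctly) that it works by injecting a
cli-prefer location constraint which sticks around afterwards.  Something like
this, with your own resource name substituted, shows and clears it:

pcs constraint --full        # look for a cli-prefer-* location constraint
pcs resource clear vmfsFS    # or: pcs constraint remove <constraint-id>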


Luke Pascoe



E l...@osnz.co.nz
P +64 (9) 296 2961
M +64 (27) 426 6649
W www.osnz.co.nz

24 Wellington St
Papakura
Auckland, 2110
New Zealand

On 18 September 2015 at 11:40, Jason Gress <jgr...@accertify.com> wrote:

> Looking more closely, according to page 64 (
> http://clusterlabs.org/doc/Cluster_from_Scratch.pdf) it does indeed
> appear that 1 is the correct number.  (I just realized that it's page 64 of
> the "book", but page 76 of the pdf.)
>
> Thank you again,
>
> Jason
>
> […]