Re: [ClusterLabs] fence_apc delay?

2016-09-06 Thread Ken Gaillot
On 09/06/2016 11:44 AM, Dan Swartzendruber wrote:
> On 2016-09-06 10:59, Ken Gaillot wrote:
>> On 09/05/2016 09:38 AM, Marek Grac wrote:
>>> Hi,
>>>
> 
> [snip]
> 
>> FYI, no special configuration is needed for this with recent pacemaker
>> versions. If multiple devices are listed in a topology level, pacemaker
>> will automatically convert reboot requests into all-off-then-all-on.
> 
> Hmmm, thinking about this some more, this just puts me back in the
> current situation (e.g. having an 'extra' delay.)  The issue for me
> would be having two fencing devices, each of which needs a brief delay
> to let its target's PS drain.  If a single PDU fencing agent does this
> (with proposed change):
> 
> power-off
> wait N seconds
> power-on
> 
> that is cool.  Unfortunately, with the all-off-then-all-on pacemaker
> would do, I would get this:
> 
> power-off node A
> wait N seconds
> power-off node B
> wait N seconds
> power-on node A
> power-on node B
> 
> or am I missing something?  If not, seems like it would be nice to have
> some sort of delay at the pacemaker level.  e.g. tell pacemaker to
> convert a reboot of node A into a 'turn off node A, wait N seconds, turn
> on node A'?

You're exactly right. Pacemaker does seem like the appropriate place to
handle this, but it would be a good bit of work. I think the best
workaround for now would be to set the delay only on the B device.
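
For illustration, a minimal sketch of that workaround with pcs, assuming
two fence_apc-backed PDU devices registered at the same topology level
(so a reboot becomes all-off-then-all-on), with the drain delay carried
only by the second device; all names, addresses, credentials and plug
numbers below are placeholders, not anyone's actual setup:

  # Device "A": no extra delay
  pcs stonith create fence_pdu_a fence_apc ipaddr=pdu-a.example.com \
      login=apc passwd=apc port=3 pcmk_host_list=nodeA
  # Device "B": carries the drain delay, so the sequence is padded once
  # rather than once per device
  pcs stonith create fence_pdu_b fence_apc ipaddr=pdu-b.example.com \
      login=apc passwd=apc port=3 power_wait=5 pcmk_host_list=nodeA
  # Both devices at one topology level, so a reboot request is converted
  # into all-off-then-all-on across them
  pcs stonith level add 1 nodeA fence_pdu_a,fence_pdu_b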

I do see now why power-wait, as a fence agent property, is not ideal for
this purpose: one fence device might be used with multiple nodes, yet
the ideal delay might vary by node (if they have different power supply
models, for example).

On the other hand, setting it as a node attribute isn't right either,
because one node might be fenceable by multiple devices, and the delay
might not be appropriate for all of them.

We'd need to specify the delay per node/device combination -- something
like pcmk_off_delay=node1:3;node2:5 as an (ugly) fence device property.

It would be a significant project. If you think it's important, please
open a feature request at bugs.clusterlabs.org.

___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Service pacemaker start kills my cluster and other NFS HA issues

2016-09-06 Thread Ken Gaillot
On 09/05/2016 05:16 AM, Pablo Pines Leon wrote:
> Hello,
> 
> I implemented the suggested change in corosync, and I realized that "service
> pacemaker stop" on the master node works provided that I run crm_resource -P
> from another terminal right after it. The same goes for the "failback" case:
> bringing the failed node back into the cluster causes the IP resource and then
> the NFS exports to fail, but if I run crm_resource -P twice after running
> "service pacemaker start" to get it back in, it will work.
> 
> However, I see no reason why this is happening, if the failover works fine 
> why can there be any problem getting a node back in the cluster?

Looking at your config again, I see only some of your resources have
monitor operations. All primitives should have monitors; for master/slave
resources, the two monitors go on the m/s resource itself, one for the
Master role and one for the Slave role (with different intervals).

BTW, crm_resource -P is deprecated in favor of -C. Same thing, just renamed.
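
For reference, a hedged sketch of what that looks like with pcs; the
agent (ocf:linbit:drbd), the drbd_resource name and the intervals are
assumptions, not necessarily what your configuration uses:

  # Two monitors on the master/slave resource, one per role, with
  # different intervals
  pcs resource create res_drbd_export ocf:linbit:drbd drbd_resource=export \
      op monitor interval=29s role=Master \
      op monitor interval=31s role=Slave
  pcs resource master ms_drbd_export res_drbd_export \
      master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
  # Ordinary primitives just need a single recurring monitor, e.g.
  pcs resource update res_ip op monitor interval=30s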

> Thanks and kind regards
> 
> Pablo
> 
> From: Pablo Pines Leon [pablo.pines.l...@cern.ch]
> Sent: 01 September 2016 09:49
> To: kgail...@redhat.com; Cluster Labs - All topics  related to 
> open-source clustering welcomed
> Subject: Re: [ClusterLabs] Service pacemaker start kills my cluster and other 
> NFS HA issues
> 
> Dear Ken,
> 
> Thanks for your reply. That configuration works perfectly fine in Ubuntu; the
> problem is that in CentOS 7, for some reason, I am not even able to do a
> "service pacemaker stop" on the node that is running as master (with the
> slave off too), because it will have some failed actions that don't make any
> sense:
> 
> Migration Summary:
> * Node nfsha1:
>    res_exportfs_root: migration-threshold=100 fail-count=1 last-failure='Thu Sep  1 09:42:43 2016'
>    res_exportfs_export1: migration-threshold=100 fail-count=100 last-failure='Thu Sep  1 09:42:38 2016'
> 
> Failed Actions:
> * res_exportfs_root_monitor_3 on nfsha1 'not running' (7): call=79, status=complete, exitreason='none',
>     last-rc-change='Thu Sep  1 09:42:43 2016', queued=0ms, exec=0ms
> * res_exportfs_export1_stop_0 on nfsha1 'unknown error' (1): call=88, status=Timed Out, exitreason='none',
>     last-rc-change='Thu Sep  1 09:42:18 2016', queued=0ms, exec=20001ms
> 
> So I am wondering what is different between the two OSes that causes this
> different outcome.
> 
> Kind regards
> 
> 
> From: Ken Gaillot [kgail...@redhat.com]
> Sent: 31 August 2016 17:31
> To: users@clusterlabs.org
> Subject: Re: [ClusterLabs] Service pacemaker start kills my cluster and other 
> NFS HA issues
> 
> On 08/30/2016 10:49 AM, Pablo Pines Leon wrote:
>> Hello,
>>
>> I have set up a DRBD-Corosync-Pacemaker cluster following the
>> instructions from https://wiki.ubuntu.com/ClusterStack/Natty adapting
>> them to CentOS 7 (e.g: using systemd). After testing it in Virtual
> 
> There is a similar how-to specifically for CentOS 7:
> 
> http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Clusters_from_Scratch/index.html
> 
> I think if you compare your configs to that, you'll probably find the
> cause. I'm guessing the most important missing pieces are "two_node: 1"
> in corosync.conf, and fencing.
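
For reference, a quick way to check for that setting, assuming the stock
CentOS 7 file layout; the stanza shown in the comment is illustrative,
not a dump of any real config:

  grep -A3 '^quorum' /etc/corosync/corosync.conf
  # expected to contain something along the lines of:
  #   quorum {
  #       provider: corosync_votequorum
  #       two_node: 1
  #   }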
> 
> 
>> Machines it seemed to be working fine, so it is now implemented on
>> physical machines. I have noticed that the failover works fine as
>> long as I kill the master by pulling the AC cable, but not if I issue
>> the halt, reboot or shutdown commands; that leaves the cluster in a
>> situation like this:
>>
>> Last updated: Tue Aug 30 16:55:58 2016  Last change: Tue Aug 23
>> 11:49:43 2016 by hacluster via crmd on nfsha2
>> Stack: corosync
>> Current DC: nfsha2 (version 1.1.13-10.el7_2.4-44eb2dd) - partition with
>> quorum
>> 2 nodes and 9 resources configured
>>
>> Online: [ nfsha1 nfsha2 ]
>>
>>  Master/Slave Set: ms_drbd_export [res_drbd_export]
>>      Masters: [ nfsha2 ]
>>      Slaves: [ nfsha1 ]
>>  Resource Group: rg_export
>>      res_fs                (ocf::heartbeat:Filesystem):    Started nfsha2
>>      res_exportfs_export1  (ocf::heartbeat:exportfs):      FAILED nfsha2 (unmanaged)
>>      res_ip                (ocf::heartbeat:IPaddr2):       Stopped
>>  Clone Set: cl_nfsserver [res_nfsserver]
>>      Started: [ nfsha1 ]
>>  Clone Set: cl_exportfs_root [res_exportfs_root]
>>      res_exportfs_root     (ocf::heartbeat:exportfs):      FAILED nfsha2
>>      Started: [ nfsha1 ]
>>
>> Migration Summary:
>> * Node 2:
>>    res_exportfs_export1: migration-threshold=100 fail-count=100 last-failure='Tue Aug 30 16:55:50 2016'
>>    res_exportfs_root: migration-threshold=100 fail-count=1 last-failure='Tue Aug 30 16:55:48 2016'
>> * Node 1:
>>
>> Failed Actions:
>> * res_exportfs_export1_stop_0 on nfsha2 'unknown error' (1): call=134, status=Timed Out, exitreason='none',

Re: [ClusterLabs] [rgmanager] generic 'initscript' resource agent that passes arguments?

2016-09-06 Thread Jan Pokorný
On 29/08/16 13:41 -0400, berg...@merctech.com wrote:
> I've got a number of scripts that are based on LSB compliant scripts,
> but which also accept arguments & values. For example, a script to manage
> multiple virtual machines has a command-line in the form:
> 
>   vbox_init --vmname $VMNAME [-d|--debug] [start|stop|status|restart]
> 
> I'd like to manage these services as cluster resources, ideally without
> modifying the existing (tested, functioning) init scripts.
> 
> For example, I do not want to create individual vbox_init scripts with
> hard-coded values for the virtual machine name (and I'd strongly prefer
> not to do a hack using $0 to look up the VM name, as in calling the scripts
> "vbox_init.$NAME1" and "vbox_init.$NAME2", etc).
> 
> Similarly, I don't want to create individual resource agents with
> hard-coded values.
> 
> Is there an existing variant of the /usr/share/cluster/script.sh resource
> that enables passing arbitrary argument+value pairs and flags to the
> actual init script? Continuing the above example, the new resource
> ("scriptArgs") would appear in cluster.conf something like this:
> 
> 
> [The sample cluster.conf XML was stripped by the archive; what remains
> shows a resource definition with name="vbox_init" and two service
> entries, each with restart_expire_time="180", wrapping references to
> the proposed scriptArgs resource with per-VM arguments.]
> 

None that I know of.

Adapting script.sh into a scriptArgs.sh that works as desired should not
be a hard task, though.  Once you add such a custom RA to the system
(i.e., across all nodes), and assuming you have adapted the produced
meta-data accordingly, remember to run "ccs_update_schema" so that a
subsequent start of the cluster stack on the node, or a modification via
the ccs utility, will not fail due to an unrecognized RA being present
in the to-be-applied configuration.
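
A rough sketch of those steps on CentOS 6 paths; the scriptArgs.sh name,
the "args" attribute and the OCF_RESKEY_* details are whatever you end
up implementing, not an existing agent:

  # repeat on every cluster node
  cp /usr/share/cluster/script.sh /usr/share/cluster/scriptArgs.sh
  # edit scriptArgs.sh so that its meta-data declares an extra "args"
  # attribute and the start/stop/status calls append "$OCF_RESKEY_args"
  # to the invocation of "$OCF_RESKEY_file"
  ccs_update_schema
  # only then reference the new resource type from cluster.conf / ccs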

Note that the new agent will not get reflected in the luci web UI
automatically; you would have to add support for it on your own.

> I'm using CentOS6 and:
>   cman-3.0.12.1-78.el6.x86_64
>   luci-0.26.0-78.el6.centos.x86_64
>   rgmanager-3.0.12.1-26.el6_8.3.x86_64
>   ricci-0.16.2-86.el6.x86_64

-- 
Jan (Poki)




Re: [ClusterLabs] fence_apc delay?

2016-09-06 Thread Jan Pokorný
On 06/09/16 10:35 -0500, Ken Gaillot wrote:
> On 09/06/2016 10:20 AM, Dan Swartzendruber wrote:
>> On 2016-09-06 10:59, Ken Gaillot wrote:
>> 
>> [snip]
>> 
>>> I thought power-wait was intended for this situation, where the node's
>>> power supply can survive a brief outage, so a delay is needed to ensure
>>> it drains. In any case, I know people are using it for that.
>>> 
>>> Are there any drawbacks to using power-wait for this purpose, even if
>>> that wasn't its original intent? Is it just that the "on" will get the
>>> delay as well?
>> 
>> I can't speak to the first part of your question, but for me the second
>> part is a definite YES.  The issue is that I want a long enough delay to
>> be sure the host is D E A D and not writing to the pool anymore; but
>> that delay is now multiplied by 2, and if it gets "too long", vsphere
>> guests can start getting disk I/O errors...
> 
> Ah, Marek's suggestions are the best way out, then. Fence agents are
> usually simple shell scripts, so adding a power-wait-off option
> shouldn't be difficult.

A little correction: they are almost exclusively _Python_ scripts, but
that doesn't change much.  Just a basic understanding of the fencing
library (part of fence-agents), and perhaps a slight modification of it,
will be needed.
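
For what it's worth, agents can also be exercised straight from a shell,
which makes testing a modified copy easy; a sketch with placeholder
address, credentials and plug number (the long options used here are the
standard fence-agents ones):

  fence_apc --ip=pdu.example.com --username=apc --password=apc \
            --plug=3 --action=off --power-wait=5
  fence_apc --ip=pdu.example.com --username=apc --password=apc \
            --plug=3 --action=on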

-- 
Jan (Poki)




Re: [ClusterLabs] fence_apc delay?

2016-09-06 Thread Dan Swartzendruber

On 2016-09-06 10:59, Ken Gaillot wrote:
> On 09/05/2016 09:38 AM, Marek Grac wrote:
>> Hi,

[snip]

> FYI, no special configuration is needed for this with recent pacemaker
> versions. If multiple devices are listed in a topology level, pacemaker
> will automatically convert reboot requests into all-off-then-all-on.


Hmmm, thinking about this some more, this just puts me back in the 
current situation (e.g. having an 'extra' delay.)  The issue for me 
would be having two fencing devices, each of which needs a brief delay 
to let its target's PS drain.  If a single PDU fencing agent does this 
(with proposed change):


power-off
wait N seconds
power-on

that is cool.  Unfortunately, with the all-off-then-all-on pacemaker 
would do, I would get this:


power-off node A
wait N seconds
power-off node B
wait N seconds
power-on node A
power-on node B

or am I missing something?  If not, seems like it would be nice to have 
some sort of delay at the pacemaker level.  e.g. tell pacemaker to 
convert a reboot of node A into a 'turn off node A, wait N seconds, turn 
on node A'?




Re: [ClusterLabs] fence_apc delay?

2016-09-06 Thread Ken Gaillot
On 09/06/2016 10:20 AM, Dan Swartzendruber wrote:
> On 2016-09-06 10:59, Ken Gaillot wrote:
> 
> [snip]
> 
>> I thought power-wait was intended for this situation, where the node's
>> power supply can survive a brief outage, so a delay is needed to ensure
>> it drains. In any case, I know people are using it for that.
>>
>> Are there any drawbacks to using power-wait for this purpose, even if
>> that wasn't its original intent? Is it just that the "on" will get the
>> delay as well?
> 
> I can't speak to the first part of your question, but for me the second
> part is a definite YES.  The issue is that I want a long enough delay to
> be sure the host is D E A D and not writing to the pool anymore; but
> that delay is now multiplied by 2, and if it gets "too long", vsphere
> guests can start getting disk I/O errors...

Ah, Marek's suggestions are the best way out, then. Fence agents are
usually simple shell scripts, so adding a power-wait-off option
shouldn't be difficult.

>>> *) Configure fence device to not use reboot but OFF, ON
>>> Very same to the situation when there are multiple power circuits; you
>>> have to switch them all OFF and afterwards turn them ON.
>>
>> FYI, no special configuration is needed for this with recent pacemaker
>> versions. If multiple devices are listed in a topology level, pacemaker
>> will automatically convert reboot requests into all-off-then-all-on.
> 
> My understanding was that applied to 1.1.14?  My CentOS 7 host has
> pacemaker 1.1.13 :(

Correct -- but most OS distributions, including CentOS, backport
specific bugfixes and features from later versions. In this case, as
long as you've applied updates (pacemaker-1.1.13-10 or later), you've
got it.
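
A quick way to check, for anyone following along (the exact build string
will vary):

  rpm -q pacemaker
  # anything at pacemaker-1.1.13-10.el7 or later should have the
  # backport; otherwise:
  yum update pacemaker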




Re: [ClusterLabs] fence_apc delay?

2016-09-06 Thread Dan Swartzendruber

On 2016-09-06 10:59, Ken Gaillot wrote:

[snip]

> I thought power-wait was intended for this situation, where the node's
> power supply can survive a brief outage, so a delay is needed to ensure
> it drains. In any case, I know people are using it for that.
>
> Are there any drawbacks to using power-wait for this purpose, even if
> that wasn't its original intent? Is it just that the "on" will get the
> delay as well?

I can't speak to the first part of your question, but for me the second
part is a definite YES.  The issue is that I want a long enough delay to
be sure the host is D E A D and not writing to the pool anymore; but
that delay is now multiplied by 2, and if it gets "too long", vsphere
guests can start getting disk I/O errors...

>> *) Configure fence device to not use reboot but OFF, ON
>> Very same to the situation when there are multiple power circuits; you
>> have to switch them all OFF and afterwards turn them ON.
>
> FYI, no special configuration is needed for this with recent pacemaker
> versions. If multiple devices are listed in a topology level, pacemaker
> will automatically convert reboot requests into all-off-then-all-on.

My understanding was that applied to 1.1.14?  My CentOS 7 host has
pacemaker 1.1.13 :(

[snip]




Re: [ClusterLabs] fence_apc delay?

2016-09-06 Thread Ken Gaillot
On 09/05/2016 09:38 AM, Marek Grac wrote:
> Hi,
> 
> On Mon, Sep 5, 2016 at 3:46 PM, Dan Swartzendruber wrote:
> 
> ...
> Marek, thanks.  I have tested repeatedly (8 or so times with disk
> writes in progress) with 5-7 seconds and have had no corruption.  My
> only issue with using power_wait here (possibly I am
> misunderstanding this) is that the default action is 'reboot' which
> I *think* is 'power off, then power on'.  e.g. two operations to the
> fencing device.  The only place I need a delay though, is after the
> power off operation - doing so after power on is just wasted time
> that the resource is offline before the other node takes it over. 
> Am I misunderstanding this?  Thanks!
> 
> 
> You are right. Default sequence for reboot is:
> 
> get status, power off, delay(power-wait), get status [repeat until OFF],
> power on, delay(power-wait), get status [repeat until ON].
> 
> The power-wait was introduced because some devices respond with strange
> values when they are asked too soon after power change. It was not
> intended to be used in a way that you propose. Possible solutions:

I thought power-wait was intended for this situation, where the node's
power supply can survive a brief outage, so a delay is needed to ensure
it drains. In any case, I know people are using it for that.

Are there any drawbacks to using power-wait for this purpose, even if
that wasn't its original intent? Is it just that the "on" will get the
delay as well?

> *) Configure fence device to not use reboot but OFF, ON
> Very same to the situation when there are multiple power circuits; you
> have to switch them all OFF and afterwards turn them ON.

FYI, no special configuration is needed for this with recent pacemaker
versions. If multiple devices are listed in a topology level, pacemaker
will automatically convert reboot requests into all-off-then-all-on.

> *) Add a new option power-wait-off that will be used only in OFF case
> (and will override power-wait). It should be quite easy to do. Just
> send us a PR.
> 
> m,



Re: [ClusterLabs] What cib_stats line means in logfile

2016-09-06 Thread Ken Gaillot
On 09/05/2016 03:59 PM, Jan Pokorný wrote:
> On 05/09/16 21:26 +0200, Jan Pokorný wrote:
>> On 25/08/16 17:55 +0200, Sébastien Emeriau wrote:
>>> When I check my corosync.log I see this line:
>>>
>>> info: cib_stats: Processed 1 operations (1.00us average, 0%
>>> utilization) in the last 10min
>>>
>>> What does it mean (cpu load or just information) ?
>>
>> These are just periodically emitted diagnostic summaries (every 10
>> minutes by default, if any operations were observed at all) that were
>> once considered useful; this was later reconsidered, leading to their
>> complete removal:
>>
>> https://github.com/ClusterLabs/pacemaker/commit/73e8c89#diff-37b681fa792dfc09ec67bb0d64eb55feL306
>>
>> Honestly, using a Pacemaker as old as 1.1.8 (released 4 years ago)
> 
> actually, it must have been even older than that (I'm afraid to ask).
> 
>> would be a bigger concern for me.  Plenty of important fixes
>> (as well as enhancements) have been added since then...
> 
> P.S. I checked my mailbox, which aggregates plentiful sources such as this
> list and various GitHub notifications, and found one other trace of such
> an outdated version within this year, plus two more last year(!).

My guess is Debian -- the pacemaker package stagnated in Debian for a
long time, so the stock Debian packages were at 1.1.7 as late as wheezy
(initially released in 2013, but it's LTS until 2018). Then, pacemaker
was dropped entirely from jessie.

Recent versions are once again actively maintained in Debian
backports/unstable, so the situation should improve from here on out,
but I bet a lot of Debian boxes still run wheezy or earlier.




[ClusterLabs] clustering fiber channel pools on multiple wwns

2016-09-06 Thread Gabriele Bulfon
Hi,
on illumos, I have a way to cluster one ZFS pool on two nodes, by moving the
IP, the pool, and its shares at once to the other node.
This works for iSCSI too: the IP of the target is migrated together with the
pool, so the iSCSI resource is still there, running on the same IP (just on a
different node).
Now I was thinking of doing the same with fibre channel: two nodes, each with
its own QLogic FC HBA connected to an FC switch, and VMware clients with their
FC cards connected to the same switch.
I can't see how I can do this with FC, because with iSCSI I can migrate the
hosting IP, but with FC I can't migrate the hosting WWN!
What I need is to tell VMware that the target volume may be running on two
different WWNs, so that a failing WWN triggers a retry on the other WWN: the
pool and shared volumes will be moving from one WWN to the other.
Am I dreaming??
Gabriele

Sonicle S.r.l. : http://www.sonicle.com
Music : http://www.gabrielebulfon.com
Quantum Mechanics : http://www.cdbaby.com/cd/gabrielebulfon


Re: [ClusterLabs] unable to add removed node to cluster

2016-09-06 Thread Tomas Jelinek

Hi,

On 6.9.2016 at 08:12, Omar Jaber wrote:

> Hi,
>
> I created a cluster containing three nodes. Then I removed one of the
> nodes by running the "pcs cluster destroy" command.


This is the root cause of your problem.  "pcs cluster destroy" only 
wipes out cluster configuration from a node but it does not tell the 
rest of the cluster that the node got removed.  Use "pcs cluster node 
remove" to remove a node from a cluster.




> The node was stopped from the cluster, but when I try to rejoin the node
> by running these commands:
>
> 1- systemctl start pcsd.service
>
> 2- systemctl start pcsd.service
>
> (on the removed node)
>
> 3- pcs cluster auth
>
> 4- pcs cluster node add
>
> (on a node in the existing cluster)
>
> the output from the last command is:
>
> Error: unable to add hostname1 on hostname2 - Error connecting to
> hostname2 - (HTTP error: 400)
> Error: unable to add hostname1 on hostname3 - Error connecting to
> hostname3 - (HTTP error: 400)
> Error: unable to add hostname1 on hostname1 - Error connecting to
> hostname2 - (HTTP error: 400)
> Error: Unable to update any nodes


This most probably fails because the node you want to add is still
present in the cluster configuration on the remaining nodes.  You can
get detailed info by running "pcs cluster node add  --debug".

To fix that, run "pcs cluster localnode remove " on the
two remaining nodes.  Then you can add the removed node back to the cluster.
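
A sketch of that recovery, reusing the hostnames from the error output
above; the corosync reload step is an assumption about what the
surviving nodes may additionally need:

  # on each of the two surviving nodes:
  pcs cluster localnode remove hostname1
  pcs cluster reload corosync
  # then, from one surviving node, re-add the removed node and start it:
  pcs cluster node add hostname1
  pcs cluster start hostname1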



Regards,
Tomas





> Any idea what the problem is?
>
> Thanks,
>
> Omar Jaber