Re: [ClusterLabs] Antw: Re: Notification agent and Notification recipients

2017-08-08 Thread Ken Gaillot
On Tue, 2017-08-08 at 17:40 +0530, Sriram wrote:
> Hi Ulrich,
> 
> 
> Please see inline.
> 
> On Tue, Aug 8, 2017 at 2:01 PM, Ulrich Windl
>  wrote:
> >>> Sriram  schrieb am 08.08.2017 um
> 09:30 in Nachricht
> 

Re: [ClusterLabs] Antw: Re: big trouble with a DRBD resource

2017-08-08 Thread Ken Gaillot
On Tue, 2017-08-08 at 10:18 +0200, Ulrich Windl wrote:
> >>> Ken Gaillot <kgail...@redhat.com> schrieb am 07.08.2017 um 22:26 in 
> >>> Nachricht
> <1502137587.5788.83.ca...@redhat.com>:
> 
> [...]
> > Unmanaging doesn't stop monitoring a resource, it only prevents starting
> > and stopping of the resource. That lets you see the current status, even
> > if you're in the middle of maintenance or what not. You can disable
> 
> This feature is debatable IMHO: if you plan to update the RAs, it seems a
> bad idea to run the monitor (which is part of the RA). Especially if a monitor
> detects a problem while in maintenance (e.g. the updated RA needs a new or
> changed parameter), it will cause actions once you stop maintenance mode,
> right?

Generally, it won't cause any actions if the resource is back in a good
state when you leave maintenance mode. I'm not sure whether failures
during maintenance mode count toward the migration fail count -- I'm
guessing they do but shouldn't. If so, it would be possible that the
cluster decides to move it even if it's in a good state, due to the
migration threshold. I'll make a note to look into that.

Unmanaging a resource (or going into maintenance mode) doesn't
necessarily mean that the user expects that resource to stop working. It
can be a precaution while doing other work on that node, in which case
they may very well want to know if it starts having problems.
 
You can already disable the monitors if you want, so I don't think it
needs to be changed in pacemaker. My general outlook is that pacemaker
should be as conservative as possible (in this case, letting the user
know when there's an error), but higher-level tools can make different
assumptions if they feel their users would prefer it. So, pcs and crm
are free to disable monitors by default when unmanaging a resource, if
they think that's better.

> My preference would be to leave the RAs completely alone while in maintenance 
> mode. Leaving maintenance mode could trigger a re-probe to make sure the 
> cluster is happy with the current state.
> 
> > monitoring separately by setting the enabled="false" meta-attribute on
> > the monitor operation.
> > 
> > Standby would normally stop all resources from running on a node (and
> > thus all monitors as well), but if a resource is unmanaged, standby
> > won't override that -- it'll prevent the cluster from starting any new
> > resources on the node, but it won't stop the unmanaged resource (or any
> > of its monitors).
> [...]
> 
> Regards,
> Ulrich
> 
> 
> 

-- 
Ken Gaillot <kgail...@redhat.com>





___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] big trouble with a DRBD resource

2017-08-07 Thread Ken Gaillot
On Mon, 2017-08-07 at 21:16 +0200, Lentes, Bernd wrote:
> - On Aug 4, 2017, at 10:19 PM, kgaillot kgail...@redhat.com wrote:
> 
> > The cluster reacted promptly:
> > crm(live)# configure primitive prim_drbd_idcc_devel ocf:linbit:drbd params 
> > drbd_resource=idcc-devel \
> >> op monitor interval=60
> > WARNING: prim_drbd_idcc_devel: default timeout 20s for start is smaller 
> > than the advised 240
> > WARNING: prim_drbd_idcc_devel: default timeout 20s for stop is smaller than 
> > the advised 100
> > WARNING: prim_drbd_idcc_devel: action monitor not advertised in meta-data, 
> > it may not be supported by the RA
> 
> > Why is it complaining about missing clone-max ? This is a meta attribute 
> > for a clone, but not for a simple resource !?! 
> > This message is constantly repeated, it still appears although cluster is 
> > in standby since three days.
> 
> > The "ERROR" message is coming from the DRBD resource agent itself, not
> > pacemaker. Between that message and the two separate monitor operations,
> > it looks like the agent will only run as a master/slave clone.
> 
> This message concerning clone-max still appears once per minute in syslog, 
> although both nodes are in standby for days and the drbd resource is 
> unmanaged too.
> With stat I checked that the RA is called once per minute. With strace I
> found out that it is lrmd which calls the RA with the option "monitor".
> But why is it still checking? I thought "standby" for the nodes means the
> cluster does not care anymore about the resources, and "unmanaged" means the
> same for a dedicated resource.
> So this should mean a doubled "don't care anymore about drbd".

Unmanaging doesn't stop monitoring a resource, it only prevents starting
and stopping of the resource. That lets you see the current status, even
if you're in the middle of maintenance or what not. You can disable
monitoring separately by setting the enabled="false" meta-attribute on
the monitor operation.

Standby would normally stop all resources from running on a node (and
thus all monitors as well), but if a resource is unmanaged, standby
won't override that -- it'll prevent the cluster from starting any new
resources on the node, but it won't stop the unmanaged resource (or any
of its monitors).

> 
> crm(live)# status
> Last updated: Mon Aug  7 16:05:13 2017
> Last change: Tue Aug  1 18:54:02 2017 by root via cibadmin on ha-idg-2
> Stack: classic openais (with plugin)
> Current DC: ha-idg-2 - partition with quorum
> Version: 1.1.12-f47ea56
> 2 Nodes configured, 2 expected votes
> 18 Resources configured
> 
> 
> Node ha-idg-1: standby
> Node ha-idg-2: standby
> 
>  prim_drbd_idcc_devel   (ocf::linbit:drbd): FAILED (unmanaged)[ ha-idg-1 
> ha-idg-2 ]
> 
> 
> What is interesting: Although saying "action monitor not advertised in 
> meta-data, it may not be supported by the RA", it is:

Pacemaker doesn't print that message; it's probably coming from crm.

> /usr/lib/ocf/resource.d/linbit/drbd:
> 
>  ...
> case $__OCF_ACTION in
> start)
> drbd_start
> ;;
>  ...
> 
> monitor)   
> ^^^
> drbd_monitor
> ;;
>  ...
> 
> And the "monitor_Slave" and "monitor_Master" mentioned in "crm ra info
> ocf:linbit:drbd" I can't find in the RA. Strange.
> Doesn't "crm ra info ocf:linbit:drbd" retrieve its information from the RA?
> 
> In the RA i just find:
> 
>  ...
> 
> 
>  ...
> 
> 
> This is what "crm ra info ocf:linbit:drbd" says:
>  ...
> Operations' defaults (advisory minimum):
> 
> 
>  ...
> monitor_Slave timeout=20 interval=20
> monitor_Master timeout=20 interval=10

Ah, this makes more sense to me now ... it looks like the RA actually
supports "monitor" (which is what I expected) and crm ra info is
displaying that differently due to the two supported roles (Master and Slave).

> Bernd
>  
> 
> Helmholtz Zentrum Muenchen
> Deutsches Forschungszentrum fuer Gesundheit und Umwelt (GmbH)
> Ingolstaedter Landstr. 1
> 85764 Neuherberg
> www.helmholtz-muenchen.de
> Aufsichtsratsvorsitzende: MinDir'in Baerbel Brumme-Bothe
> Geschaeftsfuehrer: Prof. Dr. Guenther Wess, Heinrich Bassler, Dr. Alfons 
> Enhsen
> Registergericht: Amtsgericht Muenchen HRB 6466
> USt-IdNr: DE 129521671

-- 
Ken Gaillot <kgail...@redhat.com>







Re: [ClusterLabs] IPaddr2 RA and bonding

2017-08-07 Thread Ken Gaillot
On Mon, 2017-08-07 at 10:02 +, Tomer Azran wrote:
> Hello All,
> 
>  
> 
> We are using CentOS 7.3 with pacemaker in order to create a cluster.
> 
> Each cluster node has a bonding interface consisting of two NICs.
> 
> The cluster has an IPAddr2 resource configured like that:
> 
>  
> 
> # pcs resource show cluster_vip
>  Resource: cluster_vip (class=ocf provider=heartbeat type=IPaddr2)
>   Attributes: ip=192.168.1.3
>   Operations: start interval=0s timeout=20s (cluster_vip-start-interval-0s)
>               stop interval=0s timeout=20s (cluster_vip-stop-interval-0s)
>               monitor interval=30s (cluster_vip-monitor-interval-30s)
> 
> 
> We are running tests and want to simulate a state when the network
> links are down.
> 
> We are pulling both network cables from the server.
> 
>  
> 
> The problem is that the resource is not marked as failed, and the
> faulted node keep holding it and does not fail it over to the other
> node.
> 
> I think that the problem is within the bond interface. The bond
> interface is marked as UP on the OS. It even can ping itself:
> 
>  
> 
> # ip link show
> 
> 2: eno3: <NO-CARRIER,BROADCAST,MULTICAST,SLAVE,UP> mtu 1500 qdisc mq
> master bond1 state DOWN mode DEFAULT qlen 1000
> 
> link/ether 00:1e:67:f6:5a:8a brd ff:ff:ff:ff:ff:ff
> 
> 3: eno4: <NO-CARRIER,BROADCAST,MULTICAST,SLAVE,UP> mtu 1500 qdisc mq
> master bond1 state DOWN mode DEFAULT qlen 1000
> 
> link/ether 00:1e:67:f6:5a:8a brd ff:ff:ff:ff:ff:ff
> 
> 9: bond1: <NO-CARRIER,BROADCAST,MULTICAST,MASTER,UP> mtu 1500 qdisc
> noqueue state DOWN mode DEFAULT qlen 1000
> 
> link/ether 00:1e:67:f6:5a:8a brd ff:ff:ff:ff:ff:ff
> 
>  
> 
> As far as I understand the IPaddr2 RA does not check the link state of
> the interface – What can be done?

You are correct. The IP address itself *is* up, even if the link is
down, and it can be used locally on that host.

If you want to monitor connectivity to other hosts, you have to do that
separately. The most common approach is to use the ocf:pacemaker:ping
resource. See:

http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Pacemaker_Explained/index.html#_moving_resources_due_to_connectivity_changes
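For illustration, a minimal sketch of that approach with pcs (the ping target
address, scores and operation values are placeholders, not from this thread;
only "cluster_vip" is the resource shown above):

  # a cloned ping resource that keeps a "pingd" node attribute up to date
  pcs resource create ping ocf:pacemaker:ping \
      host_list=192.168.1.254 dampen=5s multiplier=1000 \
      op monitor interval=15s --clone
  # keep the VIP off any node that cannot reach the ping target
  pcs constraint location cluster_vip rule score=-INFINITY pingd lt 1 or not_defined pingd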
 
> BTW, I tried to find a solution on the bonding configuration which
> disables the bond when no link is up, but I didn't find any.
> 
>  
> 
> Tomer.
> 
> 

-- 
Ken Gaillot <kgail...@redhat.com>







Re: [ClusterLabs] big trouble with a DRBD resource

2017-08-07 Thread Ken Gaillot
On Mon, 2017-08-07 at 15:23 +0200, Lentes, Bernd wrote:
> - On Aug 4, 2017, at 10:19 PM, kgaillot kgail...@redhat.com wrote:
> 
> > 
> > The "ERROR" message is coming from the DRBD resource agent itself, not
> > pacemaker. Between that message and the two separate monitor operations,
> > it looks like the agent will only run as a master/slave clone.
> 
> Btw:
> Does the crm/lrm call the RA in the same way init does with scripts in
> /etc/init.d?
> E.g. if a resource has a monitor interval of 120, is the RA called every 2
> min by the cluster with the option "monitor"?
> That would be a pretty simple concept. Not a bad one.

Yes, Pacemaker executes the RA with the appropriate environment
variables whenever a resource action is needed.

Monitors work almost as you describe -- it's not exactly every 2
minutes, but 2 minutes from the last monitor's completion (or timeout),
to ensure that monitors don't overlap.
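You can see the same mechanism outside the cluster by calling the RA the way
lrmd would; a rough sketch for the drbd resource quoted earlier in this digest
(the OCF_* variable names are the standard OCF interface, the values come from
that configuration; a master/slave RA like this one may also expect CRM_meta_*
variables such as clone_max):

  OCF_ROOT=/usr/lib/ocf \
  OCF_RESOURCE_INSTANCE=prim_drbd_idcc_devel \
  OCF_RESKEY_drbd_resource=idcc-devel \
  OCF_RESKEY_CRM_meta_clone_max=2 \
  /usr/lib/ocf/resource.d/linbit/drbd monitor
  echo "monitor exit code: $?"   # 0 = running, 7 = not running, 8 = running as master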

> https://wiki.clusterlabs.org/wiki/Debugging_Resource_Failures indicates that.
> 
> 
> Bernd
>  
> 
> Helmholtz Zentrum Muenchen
> Deutsches Forschungszentrum fuer Gesundheit und Umwelt (GmbH)
> Ingolstaedter Landstr. 1
> 85764 Neuherberg
> www.helmholtz-muenchen.de
> Aufsichtsratsvorsitzende: MinDir'in Baerbel Brumme-Bothe
> Geschaeftsfuehrer: Prof. Dr. Guenther Wess, Heinrich Bassler, Dr. Alfons 
> Enhsen
> Registergericht: Amtsgericht Muenchen HRB 6466
> USt-IdNr: DE 129521671
> 

-- 
Ken Gaillot <kgail...@redhat.com>







Re: [ClusterLabs] nginx resource - how to reload config or do a config test

2017-08-07 Thread Ken Gaillot
On Mon, 2017-08-07 at 16:32 +0200, Przemyslaw Kulczycki wrote:
> Hi.
> I have a 2-node cluster with a cloned IP and nginx configured.
> 
> 
> [user@proxy04 ~]$ sudo pcs resource show --full
>  Clone: ha-ip-clone
>   Meta Attrs: clone-max=2 clone-node-max=2 globally-unique=true
> resource-stickiness=0
>   Resource: ha-ip (class=ocf provider=heartbeat type=IPaddr2)
>Attributes: cidr_netmask=24 clusterip_hash=sourceip
> ip=192.68.20.240
>Operations: monitor interval=5s timeout=15s
> (ha-ip-monitor-interval-5s)
>start interval=0s timeout=20s (ha-ip-start-interval-0s)
>stop interval=0s timeout=20s (ha-ip-stop-interval-0s)
>  Clone: ha-nginx-clone
>   Meta Attrs: clone-max=2 clone-node-max=1 globally-unique=true
> resource-stickiness=0
>   Resource: ha-nginx (class=ocf provider=heartbeat type=nginx)
>Attributes: configfile=/etc/nginx/nginx.conf
>Operations: monitor interval=5s timeout=20s
> (ha-nginx-monitor-interval-5s)
>start interval=0s timeout=60s
> (ha-nginx-start-interval-0s)
>stop interval=0s timeout=60s
> (ha-nginx-stop-interval-0s)
> 
> 
> $ pcs --version
> 0.9.158
> 
> 
> I have 2 questions about that resource type:
> 1) How do I reload nginx config in the clustered resource without
> restarting the nginx process?
> pcs doesn't have an option to do that (analogous to pcs resource
> restart ha-nginx-clone)
> Is there a pacemaker command to do that?

Pacemaker's reload capability is a bit muddled right now. It's on the
to-do list to overhaul it. Currently, if the resource agent supports the
reload action, and you change a resource parameter marked as unique=0
(or left as default) in the agent meta-data, Pacemaker will execute the
reload action.

You can use this method by changing such a parameter, but it's more a
workaround than a solution.

> According to http://linux-ha.org/doc/man-pages/re-ra-nginx.html this
> type of agent supports a reload option, so how can I use it?

You can run the agent manually from the command line like:

OCF_ROOT=/usr/lib/ocf [OCF_RESKEY_<param>=<value> ...] \
    /usr/lib/ocf/resource.d/heartbeat/nginx reload

where you need to set the param/value pairs identical to what you have
in the cluster configuration.
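For the ha-nginx resource shown above, which only sets configfile, that would
look roughly like this (a sketch, not verified against your environment):

  OCF_ROOT=/usr/lib/ocf \
  OCF_RESKEY_configfile=/etc/nginx/nginx.conf \
  /usr/lib/ocf/resource.d/heartbeat/nginx reload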

However, in nginx's case, the RA simply does a "kill -HUP" to the nginx
PID, so it's probably easier to just do that yourself.

> 2) How do I do an nginx config test with the clustered resource?
> 
> 
> I know I can do a "nginx -t", but is there an option to do it using
> pacemaker/pcs commands on both nodes?

No, but that's OK. You don't want to start or stop or change the
configuration without involving the cluster, but tests and checks are
fine to run outside cluster control.

> -- 
> Best Regards
>  
> Przemysław Kulczycki
> System administrator
> Avaleo
> 
> Email: u...@avaleo.net

-- 
Ken Gaillot <kgail...@redhat.com>







Re: [ClusterLabs] dry-run an alert?

2017-08-07 Thread Ken Gaillot
On Mon, 2017-08-07 at 17:48 +0100, lejeczek wrote:
> hi everyone
> 
> I wonder, is it possible to dry-run an alert agent? Test it 
> somehow without the actual event taking place?
> 
> 
> many thanks.
> L.

There's no special tool to do so, but it would be fairly simple to do it
by hand -- just set the environment variables to simulate the event you
want and then call the agent. The possible environment variables are
described at:

http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Pacemaker_Explained/index.html#_writing_an_alert_agent
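As an illustration, a hedged sketch of simulating a "node lost" event for a
file-based alert agent (the CRM_alert_* names are the documented ones; the
agent path and all values here are examples only and depend on your packages):

  export CRM_alert_kind=node
  export CRM_alert_version=1.1.17
  export CRM_alert_node=node1
  export CRM_alert_nodeid=1
  export CRM_alert_desc=lost
  export CRM_alert_recipient=/tmp/test-alerts.log
  /usr/share/pacemaker/alerts/alert_file.sh.sample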

-- 
Ken Gaillot <kgail...@redhat.com>







Re: [ClusterLabs] ClusterLabs.Org Documentation Problem?

2017-08-22 Thread Ken Gaillot
On Tue, 2017-08-22 at 19:40 +, Eric Robinson wrote:
> The documentation located here… 
> 
>  
> 
> http://clusterlabs.org/doc/
> 
>  
> 
> …is confusing because it offers two combinations:
> 
>  
> 
> Pacemaker 1.0 for Corosync 1.x
> 
> Pacemaker 1.1 for Corosync 2.x
> 
>  
> 
> According to the documentation, if you use Corosync 1.x you need
> Pacemaker 1.0, but if you use Corosync 2.x then you need Pacemaker
> 1.1. 
> 
>  
> 
> However, on my Centos 6.9 system, when I do ‘yum install pacemaker
> corosync” I get the following versions:
> 
>  
> 
> pacemaker-1.1.15-5.el6.x86_64
> 
> corosync-1.4.7-5.el6.x86_64
> 
>  
> 
> What’s the correct answer? Does Pacemaker 1.1.15 work with Corosync
> 1.4.7? If so, is the documentation at ClusterLabs misleading? 
> 
>  
> 
> --
> Eric Robinson

The page actually offers a third option ... "Pacemaker 1.1 for CMAN or
Corosync 1.x". That's the configuration used by CentOS 6.

However, that's still a bit misleading; the documentation set for
"Pacemaker 1.1 for Corosync 2.x" is the only one that is updated, and
it's mostly independent of the underlying layer, so you should prefer
that set.

I plan to reorganize that page in the coming months, so I'll try to make
it clearer.

-- 
Ken Gaillot <kgail...@redhat.com>







Re: [ClusterLabs] Resources still retains in Primary Node even though its interface went down

2017-05-03 Thread Ken Gaillot
On 05/03/2017 02:43 AM, pillai bs wrote:
> Hi Experts!!!
> 
>   I have a two-node setup for HA (Primary/Secondary) with
> separate resources for Home/data/logs/Virtual IP. The expected
> behavior should be: if the primary node goes down, the secondary has to take
> charge (meaning initially the VIP points to the primary node, so the
> user can access home/data/logs from the primary node. Once the primary node goes
> down, the VIP/floating IP points to the secondary node so that the user
> experiences uninterrupted service).
>  I'm using dual-ring support to avoid split brain. I have
> two interfaces (public & private). The intention of having the private interface
> is for data sync alone.
> 
> I have tested my setup in two different ways:
> 1. Made the primary interface down (ifdown eth0); as expected the VIP and other
> resources moved from the primary to the secondary node (the VIP was not
> reachable from the primary node).
> 2. Made the primary interface down (physically unplugged the Ethernet
> cable). The primary node still retained the resources; the VIP/floating IP was
> reachable from the primary node.
> 
> Is my testing correct? How come the VIP is reachable even though
> eth0 was down? Please advise!!!
> 
> Regards,
> Madhan.B

Sorry, didn't see this message before replying to the other one :)

The IP resource is successful if the IP is up *on that host*. It doesn't
check that the IP is reachable from any other site. Similarly,
filesystem resources just make sure that the filesystem can be mounted
on the host. So, unplugging the Ethernet won't necessarily make those
resources fail.

Take a look at the ocf:pacemaker:ping resource for a way to ensure that
the primary host has connectivity to the outside world. Also, be sure
you have fencing configured, so that the surviving node can kill a node
that is completely cut off or unresponsive.



Re: [ClusterLabs] Resources still retains in primary node

2017-05-03 Thread Ken Gaillot
On 05/03/2017 02:30 AM, pillai bs wrote:
> Hi Experts!!!
> 
>   I have a two-node HA setup (Primary/Secondary) with
> separate resources for Home/data/logs/Virtual IP. The expected
> behavior should be: if the primary node goes down, the secondary has to take
> charge (meaning initially the VIP points to the primary node, so the
> user can access home/data/logs from the primary node. Once the primary node goes
> down, the VIP/floating IP points to the secondary node so that the user
> experiences uninterrupted service).

Yes, that's a common setup for pacemaker clusters. Did you have a
problem with it?



Re: [ClusterLabs] How to check if a resource on a cluster node is really back on after a crash

2017-05-11 Thread Ken Gaillot
On 05/11/2017 03:00 PM, Ludovic Vaugeois-Pepin wrote:
> Hi
> I translated a PostgreSQL multi-state RA
> (https://github.com/dalibo/PAF) into Python
> (https://github.com/ulodciv/deploy_cluster), and I have been editing it
> heavily.
> 
> In parallel I am writing unit tests and functional tests.
> 
> I am having an issue with a functional test that abruptly powers off a
> slave named, say, "host3" (hot standby PG instance). Later on I start the
> slave back up. Once it is started, I run "pcs cluster start host3". And
> this is where I start having a problem.
> 
> I check every second the output of "pcs status xml" until host3 is said
> to be ready as a slave again. In the following I assume that test3 is
> ready as a slave:
> 
> 
>  standby_onfail="false" maintenance="false" pending="false"
> unclean="false" shutdown="false" expected_up="true" is_dc="false"
> resources_running="2" type="member" />
>  standby_onfail="false" maintenance="false" pending="false"
> unclean="false" shutdown="false" expected_up="true" is_dc="true"
> resources_running="1" type="member" />
>  standby_onfail="false" maintenance="false" pending="false"
> unclean="false" shutdown="false" expected_up="true" is_dc="false"
> resources_running="1" type="member" />
> 

The <nodes> section says nothing about the current state of the nodes.
Look at the <node_state> entries for that. in_ccm means the cluster
stack level, and crmd means the pacemaker level -- both need to be up.
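For example, something along these lines should show those attributes for the
rejoining node (a sketch; "test3" is the node name from this thread):

  cibadmin --query --xpath "//node_state[@uname='test3']"
  # look for in_ccm="true" and crmd="online" before trusting the resource status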

> 
>  managed="true" failed="false" failure_ignored="false" >
>  role="Slave" active="true" orphaned="false" managed="true"
> failed="false" failure_ignored="false" nodes_running_on="1" >
> 
> 
>  role="Master" active="true" orphaned="false" managed="true"
> failed="false" failure_ignored="false" nodes_running_on="1" >
> 
> 
>  role="Slave" active="true" orphaned="false" managed="true"
> failed="false" failure_ignored="false" nodes_running_on="1" >
> 
> 
> 
> By ready to go I mean that upon running "pcs cluster start test3", the
> following occurs before test3 appears ready in the XML:
> 
> pcs cluster start test3
> monitor-> RA returns unknown error (1) 
> notify/pre-stop-> RA returns ok (0)
> stop   -> RA returns ok (0)
> start-> RA returns ok (0)
> 
> The problem I have is that between "pcs cluster start test3" and
> "monitor", it seems that the XML returned by "pcs status xml" says test3
> is ready (the XML extract above is what I get at that moment). Once
> "monitor" occurs, the returned XML shows test3 to be offline, and not
> until the start is finished do I once again have test3 shown as ready.
> 
> Am I getting anything wrong? Is there a simpler or better way to check
> if test3 is fully functional again, i.e. OCF start was successful?
> 
> Thanks
> 
> Ludovic



Re: [ClusterLabs] newbie question

2017-05-11 Thread Ken Gaillot
On 05/05/2017 03:09 PM, Sergei Gerasenko wrote:
> Hi,
> 
> I have a very simple question. 
> 
> Pacemaker uses a dedicated "multicast" interface for the totem protocol.
> I'm using pacemaker with LVS to provide HA load balancing. LVS uses
> multicast interfaces to sync the status of TCP connections if a failover
> occurs.
> 
> I can understand services using the same interface if ports are used.
> That way you can get a socket (ip + port). But there's no ports in this
> case. So how can two services exchange messages without specifying
> ports? I guess that's somehow related to multicast, but how exactly I
> don't get.
> 
> Can somebody point me to a primer on this topic?
> 
> Thanks,
>   S.

Corosync is actually the cluster component that can use multicast, and
it does use a specific port on a specific address. By default, it uses
ports 5404 and 5405 when using multicast. See the corosync.conf(5) man
page for mcastaddr and mcastport. Also see the transport option;
corosync can be configured to use UDP unicast rather than multicast.

I don't remember much about LVS, but I would guess it's the same -- it's
probably just using a default port if not specified in the config.
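For reference, a minimal sketch of the relevant corosync.conf pieces (the
addresses are placeholders; check corosync.conf(5) for your version):

  totem {
      version: 2
      transport: udp          # multicast; set to udpu for UDP unicast
      interface {
          ringnumber: 0
          bindnetaddr: 192.168.1.0
          mcastaddr: 239.255.1.1
          mcastport: 5405
      }
  }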



Re: [ClusterLabs] how to set a dedicated fence delay for a stonith agent ?

2017-05-10 Thread Ken Gaillot
On 05/10/2017 12:20 AM, Kristoffer Grönlund wrote:
> "Lentes, Bernd"  writes:
> 
>> - On May 8, 2017, at 9:20 PM, Bernd Lentes 
>> bernd.len...@helmholtz-muenchen.de wrote:
>>
>>> Hi,
>>>
>>> i remember that digimer often campaigns for a fence delay in a 2-node  
>>> cluster.
>>> E.g. here: 
>>> http://oss.clusterlabs.org/pipermail/pacemaker/2013-July/019228.html
>>> In my eyes it makes sense, so i try to establish that. I have two HP 
>>> servers,
>>> each with an ILO card.
>>> I have to use the stonith:external/ipmi agent, the stonith:external/riloe
>>> refused to work.
>>>
>>> But i don't have a delay parameter there.
>>> crm ra info stonith:external/ipmi:
>>>
>>> ...
>>> pcmk_delay_max (time, [0s]): Enable random delay for stonith actions and 
>>> specify
>>> the maximum of random delay
>>>This prevents double fencing when using slow devices such as sbd.
>>>Use this to enable random delay for stonith actions and specify the 
>>> maximum of
>>>random delay.
>>> ...
>>>
>>> This is the only delay parameter i can use. But a random delay does not 
>>> seem to
>>> be a reliable solution.
>>>
>>> The stonith:ipmilan agent also provides just a random delay. Same with the 
>>> riloe
>>> agent.
>>>
>>> How did anyone solve this problem ?
>>>
>>> Or do i have to edit the RA (I will get practice in that :-))?
>>>
>>>
>>
>> crm ra info stonith:external/ipmi says there exists a parameter 
>> pcmk_delay_max.
>> Having a look in  /usr/lib64/stonith/plugins/external/ipmi i don't find 
>> anything about delay.
>> Also "crm_resource --show-metadata=stonith:external/ipmi" does not say 
>> anything about a delay.
>>
>> Is this "pcmk_delay_max" not implemented ? From where does "crm ra info 
>> stonith:external/ipmi" get this info ?
>>
> 
> pcmk_delay_max is implemented by Pacemaker. crmsh gets the information
> about available parameters by querying stonithd directly.
> 
> Cheers,
> Kristoffer

The various pcmk_* parameters are documented in the stonithd(7) man page.

Some fence agents implement a delay parameter of their own, to set a
fixed delay. I believe that's what digimer uses.
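As a hedged sketch of the pcmk_delay_max approach in crm shell: the
external/ipmi parameter names and all values below are examples from memory and
should be checked with "crm ra info stonith:external/ipmi" before use.

  crm configure primitive fence-ilo-1 stonith:external/ipmi \
      params hostname=node1 ipaddr=192.168.1.10 userid=admin passwd=secret \
             interface=lanplus pcmk_delay_max=15s \
      op monitor interval=60m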

>>
>> Bernd
>>  
>>
>> Helmholtz Zentrum Muenchen
>> Deutsches Forschungszentrum fuer Gesundheit und Umwelt (GmbH)
>> Ingolstaedter Landstr. 1
>> 85764 Neuherberg
>> www.helmholtz-muenchen.de
>> Aufsichtsratsvorsitzende: MinDir'in Baerbel Brumme-Bothe
>> Geschaeftsfuehrer: Prof. Dr. Guenther Wess, Heinrich Bassler, Dr. Alfons 
>> Enhsen
>> Registergericht: Amtsgericht Muenchen HRB 6466
>> USt-IdNr: DE 129521671



Re: [ClusterLabs] cloned resources ordering and remote nodes problem

2017-05-09 Thread Ken Gaillot
On 04/13/2017 08:49 AM, Radoslaw Garbacz wrote:
> Thank you, however in my case this parameter does not change the
> described behavior.
> 
> I have a more detail example:
> order: res_A-clone -> res_B-clone -> res_C
> when "res_C" is not on the node, which had "res_A" instance failed, it
> will not be restarted, only "res_A" and "res_B" all instances will.
> 
> I implemented a workaround by modifying "res_C" I made it also cloned,
> and now it is restarted.
> 
> 
> My Pacemaker 1.1.16-1.el6
> System: CentOS 6

I haven't been able to reproduce this. Can you attach a configuration
file that exhibits the problem?




Re: [ClusterLabs] Pacemaker's "stonith too many failures" log is not accurate

2017-05-17 Thread Ken Gaillot
On 05/17/2017 04:56 AM, Klaus Wenninger wrote:
> On 05/17/2017 11:28 AM, 井上 和徳 wrote:
>> Hi,
>> I'm testing Pacemaker-1.1.17-rc1.
>> The number of failures in "Too many failures (10) to fence" log does not 
>> match the number of actual failures.
> 
> Well, it kind of does, as after 10 failures it doesn't try fencing again,
> so that is what the failure count stays at ;-)
> Of course it still sees the need to fence but doesn't actually try.
> 
> Regards,
> Klaus

This feature can be a little confusing: it doesn't prevent all further
fence attempts of the target, just *immediate* fence attempts. Whenever
the next transition is started for some other reason (a configuration or
state change, cluster-recheck-interval, node failure, etc.), it will try
to fence again.

Also, it only checks this threshold if it's aborting a transition
*because* of this fence failure. If it's aborting the transition for
some other reason, the number can go higher than the threshold. That's
what I'm guessing happened here.

>> After the 11th fence failure, "Too many failures (10) to fence" is
>> output.
>> Incidentally, stonith-max-attempts has not been set, so it is 10 by default.
>>
>> [root@x3650f log]# egrep "Requesting fencing|error: Operation reboot|Stonith 
>> failed|Too many failures"
>> ##Requesting fencing : 1st time
>> May 12 05:51:47 rhel73-1 crmd[5269]:  notice: Requesting fencing (reboot) of 
>> node rhel73-2
>> May 12 05:52:52 rhel73-1 stonith-ng[5265]:   error: Operation reboot of 
>> rhel73-2 by rhel73-1 for crmd.5269@rhel73-1.8415167d: No data available
>> May 12 05:52:52 rhel73-1 crmd[5269]:  notice: Transition aborted: Stonith 
>> failed
>> ## 2nd time
>> May 12 05:52:52 rhel73-1 crmd[5269]:  notice: Requesting fencing (reboot) of 
>> node rhel73-2
>> May 12 05:53:56 rhel73-1 stonith-ng[5265]:   error: Operation reboot of 
>> rhel73-2 by rhel73-1 for crmd.5269@rhel73-1.53d3592a: No data available
>> May 12 05:53:56 rhel73-1 crmd[5269]:  notice: Transition aborted: Stonith 
>> failed
>> ## 3rd time
>> May 12 05:53:56 rhel73-1 crmd[5269]:  notice: Requesting fencing (reboot) of 
>> node rhel73-2
>> May 12 05:55:01 rhel73-1 stonith-ng[5265]:   error: Operation reboot of 
>> rhel73-2 by rhel73-1 for crmd.5269@rhel73-1.9177cb76: No data available
>> May 12 05:55:01 rhel73-1 crmd[5269]:  notice: Transition aborted: Stonith 
>> failed
>> ## 4th time
>> May 12 05:55:01 rhel73-1 crmd[5269]:  notice: Requesting fencing (reboot) of 
>> node rhel73-2
>> May 12 05:56:05 rhel73-1 stonith-ng[5265]:   error: Operation reboot of 
>> rhel73-2 by rhel73-1 for crmd.5269@rhel73-1.946531cb: No data available
>> May 12 05:56:05 rhel73-1 crmd[5269]:  notice: Transition aborted: Stonith 
>> failed
>> ## 5th time
>> May 12 05:56:05 rhel73-1 crmd[5269]:  notice: Requesting fencing (reboot) of 
>> node rhel73-2
>> May 12 05:57:10 rhel73-1 stonith-ng[5265]:   error: Operation reboot of 
>> rhel73-2 by rhel73-1 for crmd.5269@rhel73-1.278b3c4b: No data available
>> May 12 05:57:10 rhel73-1 crmd[5269]:  notice: Transition aborted: Stonith 
>> failed
>> ## 6th time
>> May 12 05:57:10 rhel73-1 crmd[5269]:  notice: Requesting fencing (reboot) of 
>> node rhel73-2
>> May 12 05:58:14 rhel73-1 stonith-ng[5265]:   error: Operation reboot of 
>> rhel73-2 by rhel73-1 for crmd.5269@rhel73-1.7a49aebb: No data available
>> May 12 05:58:14 rhel73-1 crmd[5269]:  notice: Transition aborted: Stonith 
>> failed
>> ## 7th time
>> May 12 05:58:14 rhel73-1 crmd[5269]:  notice: Requesting fencing (reboot) of 
>> node rhel73-2
>> May 12 05:59:19 rhel73-1 stonith-ng[5265]:   error: Operation reboot of 
>> rhel73-2 by rhel73-1 for crmd.5269@rhel73-1.83421862: No data available
>> May 12 05:59:19 rhel73-1 crmd[5269]:  notice: Transition aborted: Stonith 
>> failed
>> ## 8th time
>> May 12 05:59:19 rhel73-1 crmd[5269]:  notice: Requesting fencing (reboot) of 
>> node rhel73-2
>> May 12 06:00:24 rhel73-1 stonith-ng[5265]:   error: Operation reboot of 
>> rhel73-2 by rhel73-1 for crmd.5269@rhel73-1.afd7ef98: No data available
>> May 12 06:00:24 rhel73-1 crmd[5269]:  notice: Transition aborted: Stonith 
>> failed
>> ## 9th time
>> May 12 06:00:24 rhel73-1 crmd[5269]:  notice: Requesting fencing (reboot) of 
>> node rhel73-2
>> May 12 06:01:28 rhel73-1 stonith-ng[5265]:   error: Operation reboot of 
>> rhel73-2 by rhel73-1 for crmd.5269@rhel73-1.3b033dbe: No data available
>> May 12 06:01:28 rhel73-1 crmd[5269]:  notice: Transition aborted: Stonith 
>> failed
>> ## 10th time
>> May 12 06:01:28 rhel73-1 crmd[5269]:  notice: Requesting fencing (reboot) of 
>> node rhel73-2
>> May 12 06:02:33 rhel73-1 stonith-ng[5265]:   error: Operation reboot of 
>> rhel73-2 by rhel73-1 for crmd.5269@rhel73-1.5447a345: No data available
>> May 12 06:02:33 rhel73-1 crmd[5269]:  notice: Transition aborted: Stonith 
>> failed
>> ## 11th time
>> May 12 06:02:33 rhel73-1 crmd[5269]:  notice: Requesting fencing (reboot) of 
>> node rhel73-2
>> May 12 06:03:37 rhel73-1 stonith-ng[5265]:   error: 

Re: [ClusterLabs] Pacemaker 1.1.17-rc1 now available

2017-05-09 Thread Ken Gaillot
On 05/09/2017 03:51 AM, Lars Ellenberg wrote:
> Yay!
> 
> On Mon, May 08, 2017 at 07:50:49PM -0500, Ken Gaillot wrote:
>> "crm_attribute --pattern" to update or delete all node
>> attributes matching a regular expression
> 
> Just a nit, but "pattern" usually is associated with "glob pattern".
> If it's not a "pattern" but a "regex",
> "--regex" would be more appropriate.
> 
>  :-)
> 
> Cheers,
> 
> Lars

How about "--match", with the help text saying "regular expression"?




Re: [ClusterLabs] How to check if a resource on a cluster node is really back on after a crash

2017-05-12 Thread Ken Gaillot
aged="true"
> failed="false" failure_ignored="false" nodes_running_on="1" >
> 
> 
>  role="Slave" active="true" orphaned="false" managed="true"
> failed="false" failure_ignored="false" nodes_running_on="1" >
> 
> 
> 
>  resource_agent="ocf::heartbeat:IPaddr2" role="Started" active="true"
> orphaned="false" managed="true" failed="false"
> failure_ignored="false" nodes_running_on="1" >
> 
> 
> 
> 
> 
> 
> At 10:45:41.606, after first "monitor" on test3 (I can now tell the
> resources on test3 are not ready):
> 
> crm_mon -1:
> 
> Stack: corosync
> Current DC: test1 (version 1.1.15-11.el7_3.4-e174ec8) -
> partition with quorum
> Last updated: Fri May 12 10:45:41 2017  Last change: Fri
> May 12 10:45:39 2017 by root via crm_attribute on test1
> 
> 3 nodes and 4 resources configured
> 
> Online: [ test1 test2 test3 ]
> 
> Active resources:
> 
>  Master/Slave Set: pgsql-ha [pgsqld]
>  Masters: [ test1 ]
>  Slaves: [ test2 ]
>  pgsql-master-ip(ocf::heartbeat:IPaddr2):   Started
> test1
> 
> 
> crm_mon -X:
> 
> 
>  managed="true" failed="false" failure_ignored="false" >
>  role="Master" active="true" orphaned="false" managed="true"
> failed="false" failure_ignored="false" nodes_running_on="1" >
> 
> 
>  role="Slave" active="true" orphaned="false" managed="true"
> failed="false" failure_ignored="false" nodes_running_on="1" >
> 
> 
>  role="Stopped" active="false" orphaned="false" managed="true"
> failed="false" failure_ignored="false" nodes_running_on="0" />
> 
>  resource_agent="ocf::heartbeat:IPaddr2" role="Started" active="true"
> orphaned="false" managed="true" failed="false"
> failure_ignored="false" nodes_running_on="1" >
> 
> 
> 
> 
> On Fri, May 12, 2017 at 12:45 AM, Ken Gaillot <kgail...@redhat.com
> <mailto:kgail...@redhat.com>> wrote:
> 
> On 05/11/2017 03:00 PM, Ludovic Vaugeois-Pepin wrote:
> > Hi
> > I translated the a Postgresql multi state RA
> > (https://github.com/dalibo/PAF) in Python
> > (https://github.com/ulodciv/deploy_cluster
> <https://github.com/ulodciv/deploy_cluster>), and I have been
> editing it
> > heavily.
> >
> > In parallel I am writing unit tests and functional tests.
> >
> > I am having an issue with a functional test that abruptly
> powers off a
> > slave named says "host3" (hot standby PG instance). Later on I
> start the
> > slave back. Once it is started, I run "pcs cluster start
> host3". And
> > this is where I start having a problem.
> >
> > I check every second the output of "pcs status xml" until
> host3 is said
> > to be ready as a slave again. In the following I assume that
> test3 is
> > ready as a slave:
> >
> > 
> >  > standby_onfail="false" maintenance="false" pending="false"
> > unclean="false" shutdown="false" expected_up="true" is_dc="false"
> > resources_running="2" type="member" />
> >  > standby_onfail="false" maintenance="false" pending="false"
> > unclean="false" shutdown="false" expected_up="true" is_dc="true"
> > resources_running="1" type="member" />
> >  > standby_onfail="false" maintenance="false" pending="false"
> 

Re: [ClusterLabs] how to set a dedicated fence delay for a stonith agent ?

2017-05-10 Thread Ken Gaillot
On 05/10/2017 12:26 PM, Dimitri Maziuk wrote:
> 
> i remember that digimer often campaigns for a fence delay in a 2-node  
> cluster.
> ...
> But  ... a random delay does not seem to
> be a reliable solution.
> 
>> Some fence agents implement a delay parameter of their own, to set a
>> fixed delay. I believe that's what digimer uses.
> 
> Is it just me or does this sound like catch-22:
> - pacemaker does not work reliably without fencing

Correct -- more specifically, some failure scenarios can't be safely
handled without fencing.

> - fencing in 2-node clusters does not work reliably without fixed delay

Not quite. Fixed delay allows a particular method for avoiding a death
match in a two-node cluster. Pacemaker's built-in random delay
capability is another method.

> - code that ships with pacemaker does not implement fixed delay.

Fence agents are used with pacemaker but not shipped as part of it. They
have their own packages distributed separately. Anyone can write a fence
agent and make it available to the community.

It would be nice if every fence agent supported a delay parameter, but
there's no requirement to do so, and even if there were, it would just
be a guideline -- it's up to the developer.

There's certainly an argument to be made for supporting a fixed delay at
the pacemaker level. There's an idea floating around to do this based on
node health, which could allow a lot of flexibility.



Re: [ClusterLabs] ClusterIP won't return to recovered node

2017-06-12 Thread Ken Gaillot
On 06/10/2017 10:53 AM, Dan Ragle wrote:
> So I guess my bottom line question is: How does one tell Pacemaker that
> the individual legs of globally unique clones should *always* be spread
> across the available nodes whenever possible, regardless of the number
> of processes on any one of the nodes? For kicks I did try:
> 
> pcs constraint location ClusterIP:0 prefers node1-pcs=INFINITY
> 
> but it responded with an error about an invalid character (:).

There isn't a way currently. It will try to do that when initially
placing them, but once they've moved together, there's no simple way to
tell them to move. I suppose a workaround might be to create a dummy
resource that you constrain to that node so it looks like the other node
is less busy.
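A rough sketch of that workaround with pcs (the resource and node names are
examples only; pin the dummy to the node where both clone instances landed so
the other node looks less busy):

  pcs resource create lb-anchor ocf:pacemaker:Dummy
  pcs constraint location lb-anchor prefers node2-pcs=INFINITY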



Re: [ClusterLabs] what is the best practice for removing a node temporary (e.g. for installing updates) ?

2017-06-19 Thread Ken Gaillot
On 06/19/2017 10:23 AM, Lentes, Bernd wrote:
> Hi,
> 
> what would you consider to be the best way of removing a node temporarily from
> the cluster, e.g. for installing updates?
> I thought "crm node maintenance node" would be the right way, but I was
> astonished that the resources keep running on it. I would have expected that
> the resources stop.
> I think "/etc/init.d/openais stop" seems to be a good solution. The resources
> are stopped and eventually moved to the other node, then I can install
> updates, change hardware ... and reboot it afterwards.
> 
> Or prefer "crm node standby" ?
> 
> Thanks.
> 
> 
> Bernd

standby followed by either maintenance mode or stop

standby moves all resources off the node, and ensures that resources
won't move back to it when it rejoins, giving you the chance to make
sure everything looks good before allowing resources back on it.

Maintenance mode or stop lets you mess with the node without getting it
fenced. With the node in standby, there's not a whole lot of difference.
Mainly a node in maintenance but still up contributes to quorum.
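Roughly, with crm shell (the node name is an example; the service commands
depend on your distribution and cluster stack):

  crm node standby node1            # move resources off and keep them off
  /etc/init.d/openais stop          # or: systemctl stop pacemaker corosync
  # ... install updates, change hardware, reboot ...
  /etc/init.d/openais start         # or: systemctl start pacemaker
  crm node online node1             # allow resources back once all looks good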



Re: [ClusterLabs] Pacemaker 1.1.17 Release Candidate 4 (likely final)

2017-06-21 Thread Ken Gaillot
On 06/21/2017 02:58 AM, Ferenc Wágner wrote:
> Ken Gaillot <kgail...@redhat.com> writes:
> 
>> The most significant change in this release is a new cluster option to
>> improve scalability.
>>
>> As users start to create clusters with hundreds of resources and many
>> nodes, one bottleneck is a complete reprobe of all resources (for
>> example, after a cleanup of all resources).
> 
> Hi,
> 
> Does crm_resource --cleanup without any --resource specified do this?
> Does this happen any other (automatic or manual) way?

Correct.

A full probe also happens at startup, but that generally is spread out
over enough time not to matter.

Prior to this release, a full write-out of all node attributes also
occurs whenever a node joins the cluster, which has similar
characteristic (due to fail counts for each resource on each node). With
this release, that is skipped when using the corosync 2 stack, since we
have extra guarantees there that make it unnecessary.

> 
>> This can generate enough CIB updates to get the crmd's CIB connection
>> dropped for not processing them quickly enough.
> 
> Is this a catastrophic scenario, or does the cluster recover gently?

The crmd exits, leading to node fencing.

>> This bottleneck has been addressed with a new cluster option,
>> cluster-ipc-limit, to raise the threshold for dropping the connection.
>> The default is 500. The recommended value is the number of nodes in the
>> cluster multiplied by the number of resources.
> 
> I'm running a production cluster with 6 nodes and 159 resources (ATM),
> which gives almost twice the above default.  What symptoms should I
> expect to see under 1.1.16?  (1.1.16 has just been released with Debian
> stretch.  We can't really upgrade it, but changing the built-in default
> is possible if it makes sense.)

Even twice the threshold is fine in most clusters, because it's highly
unlikely that all probe results will come back at exactly the same time.
The case that prompted this involved 200 resources whose monitor action
was a simple pid check, so they executed near instantaneously (on 9 nodes).

The symptom is an "Evicting client" log message from the cib, listing
the pid of the crmd, followed by the crmd exiting.

Changing the compiled-in default on older versions is a potential
workaround (search for 500 in lib/common/ipc.c), but not ideal since it
applies to all clusters (even those too small to need it) and all
clients (including command-line clients, whereas the new
cluster-ipc-limit option only affects connections from other cluster
daemons).
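Once on 1.1.17, the option can be set like any other cluster property; a sketch
using the 6-node / 159-resource numbers from the post above (6 x 159 = 954):

  crm_attribute --type crm_config --name cluster-ipc-limit --update 954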

The only real downside of increasing the threshold is the potential for
increased memory usage (which is why there's a limit to begin with, to
avoid an unresponsive client from causing a memory surge on a cluster
daemon). The usage is dependent on the size of the queued IPC messages
-- for probe results, it should be under 1K per result. The memory is
only used if the queue actually backs up (it's not pre-allocated).



Re: [ClusterLabs] vip is not removed after node lost connection with the other two nodes

2017-06-23 Thread Ken Gaillot
On 06/23/2017 11:52 AM, Dimitri Maziuk wrote:
> On 06/23/2017 11:24 AM, Jan Pokorný wrote:
> 
>> People using ifdown or the iproute-based equivalent seem far
>> too prevalent, even if for long time bystanders the idea looks
>> continually disproved ad nauseam.
> 
> Has anyone had a network card fail recently and what does that look like
> on modern kernels? -- That's an honest question, I have not seen that in
> forever (fingers crossed knock on wood).
> 
> I.e. is the expectation that real life failure will be "nice" to
> corosync actually warranted?

I don't think there is such an expectation. If I understand correctly,
the issue with using ifdown as a test is two-fold: it's not a good
simulation of a typical network outage, and corosync is unable to
recover from an interface that goes down and later comes back up, so you
can only test the "down" part. Implementing some sort of recovery
mechanism in that situation is a goal for corosync 3, I believe.



[ClusterLabs] clusterlabs.org now supports https :-)

2017-06-26 Thread Ken Gaillot
Thanks to the wonderful service provided by Let's Encrypt[1], we now
have an SSL certificate for the ClusterLabs websites. You can use the
websites with secure encryption by starting the URL with "https", for
example:

   https://www.clusterlabs.org/

The ClusterLabs wiki[2] and bugzilla[3] sites, which accept logins, now
always redirect to https, so passwords are never sent in clear text.
While we have no indication that any accounts have ever been
compromised, it's a good time to login and change your password if you
have an account on one of these sites.

[1] https://letsencrypt.org/
[2] https://wiki.clusterlabs.org/
[3] https://bugs.clusterlabs.org/
-- 
Ken Gaillot <kgail...@redhat.com>



Re: [ClusterLabs] clearing failed actions

2017-06-21 Thread Ken Gaillot
Paddr2):   Started ctdb2
>>> Jun 19 17:37:06 [18997] ctmgrpengine: info: clone_print:
>>> Master/Slave Set: mysql [db-mysql]
>>> Jun 19 17:37:06 [18997] ctmgrpengine:debug: native_active:
>> Resource
>>> db-mysql:0 active on ctdb1
>>> Jun 19 17:37:06 [18997] ctmgrpengine:debug: native_active:
>> Resource
>>> db-mysql:0 active on ctdb1
>>> Jun 19 17:37:06 [18997] ctmgrpengine:debug: native_active:
>> Resource
>>> db-mysql:1 active on ctdb2
>>> Jun 19 17:37:06 [18997] ctmgrpengine:debug: native_active:
>> Resource
>>> db-mysql:1 active on ctdb2
>>> Jun 19 17:37:06 [18997] ctmgrpengine: info: short_print:
>>>  Masters:
>> [
>>> ctdb1 ]
>>> Jun 19 17:37:06 [18997] ctmgrpengine: info: short_print:
>>>  Slaves: [
>>> ctdb2 ]
>>> Jun 19 17:37:06 [18997] ctmgrpengine: info: get_failcount_full: 
>>> db-
>> ip-
>>> master has failed 1 times on ctdb1
>>> Jun 19 17:37:06 [18997] ctmgrpengine: info: common_apply_stickiness:
>>> db-ip-master can fail 99 more times on ctdb1 before being forced off
>>> Jun 19 17:37:06 [18997] ctmgrpengine:debug:
>> common_apply_stickiness:
>>> Resource db-mysql:0: preferring current location (node=ctdb1, weight=1)
>>> Jun 19 17:37:06 [18997] ctmgrpengine:debug:
>> common_apply_stickiness:
>>> Resource db-mysql:1: preferring current location (node=ctdb2, weight=1)
>>> Jun 19 17:37:06 [18997] ctmgrpengine:debug: native_assign_node:
>>> Assigning ctdb1 to db-mysql:0
>>> Jun 19 17:37:06 [18997] ctmgrpengine:debug: native_assign_node:
>>> Assigning ctdb2 to db-mysql:1
>>> Jun 19 17:37:06 [18997] ctmgrpengine:debug: clone_color:
>>> Allocated
>> 2
>>> mysql instances of a possible 2
>>> Jun 19 17:37:06 [18997] ctmgrpengine:debug: master_color:   db-
>>> mysql:0 master score: 3601
>>> Jun 19 17:37:06 [18997] ctmgrpengine: info: master_color:   
>>> Promoting
>>> db-mysql:0 (Master ctdb1)
>>> Jun 19 17:37:06 [18997] ctmgrpengine:debug: master_color:   db-
>>> mysql:1 master score: 3600
>>> Jun 19 17:37:06 [18997] ctmgrpengine: info: master_color:   
>>> mysql:
>>> Promoted 1 instances of a possible 1 to master
>>> Jun 19 17:37:06 [18997] ctmgrpengine:debug: native_assign_node:
>>> Assigning ctdb1 to db-ip-master
>>> Jun 19 17:37:06 [18997] ctmgrpengine:debug: native_assign_node:
>>> Assigning ctdb2 to db-ip-slave
>>> Jun 19 17:37:06 [18997] ctmgrpengine:debug: master_create_actions:
>>> Creating actions for mysql
>>> Jun 19 17:37:06 [18997] ctmgrpengine: info: LogActions: 
>>> Leave   db-
>> ip-
>>> master(Started ctdb1)
>>> Jun 19 17:37:06 [18997] ctmgrpengine: info: LogActions: 
>>> Leave   db-
>> ip-
>>> slave (Started ctdb2)
>>> Jun 19 17:37:06 [18997] ctmgrpengine: info: LogActions: 
>>> Leave   db-
>>> mysql:0  (Master ctdb1)
>>> Jun 19 17:37:06 [18997] ctmgrpengine: info: LogActions: 
>>> Leave   db-
>>> mysql:1  (Slave ctdb2)
>>> Jun 19 17:37:06 [18997] ctmgrpengine:   notice: process_pe_message:
>>> Calculated Transition 38: /var/lib/pacemaker/pengine/pe-input-16.bz2
>>> Jun 19 17:37:06 [18998] ctmgr   crmd:debug: s_crmd_fsa: 
>>> Processing
>>> I_PE_SUCCESS: [ state=S_POLICY_ENGINE cause=C_IPC_MESSAGE
>>> origin=handle_response ]
>>> Jun 19 17:37:06 [18998] ctmgr   crmd: info: do_state_transition:
>> State
>>> transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [
>>> input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response ]
>>> Jun 19 17:37:06 [18998] ctmgr   crmd:debug: unpack_graph:
>> Unpacked
>>> transition 38: 0 actions in 0 synapses
>>> Jun 19 17:37:06 [18998] ctmgr   crmd: info: do_te_invoke:   
>>> Processing
>>> graph 38 (ref=pe_calc-dc-1497893826-144) derived from
>>> /var/lib/pacemaker/pengine/pe-input-16.bz2
>>> Jun 19 17:37:06 [18998] ctmgr   crmd:debug: print_graph:
>>> Empty
>>> transition graph
>>> Jun 19 17:37:06 [18998] ctmgr   crmd:   notice: run_graph:  Transition 
>>> 38
>>> (Complete=0, Pen

Re: [ClusterLabs] vip is not removed after node lost connection with the other two nodes

2017-06-23 Thread Ken Gaillot
On 06/22/2017 09:44 PM, Hui Xiang wrote:
> Hi guys,
> 
>   I have setup 3 nodes(node-1, node-2, node-3) as controller nodes, an
> vip is selected by pacemaker between them, after manually make the
> management interface down in node-1 (used by corosync) but still have
> connectivity to public or non-management network, I was expecting that
> the vip in node-1 will be stop/remove by pacemaker since this node lost
> connection with the other two node, however, now there are two vip in
> the cluster, below is my configuration:
> 
> [node-1]
> Online: [ node-1.domain.tld node-2.domain.tld node-3.domain.tld ]
>  vip__public_old(ocf::es:ns_IPaddr2):Started node-1.domain.tld 
> 
> [node-2 node-3]
> Online: [ node-2.domain.tld node-3.domain.tld ]
> OFFLINE: [ node-1.domain.tld ]
>  vip__public_old(ocf::es:ns_IPaddr2):Started node-3.domain.tld 
> 
> 
> My question is am I miss any configuration, how can I make vip removed
> in node-1, shouldn't crm status in node-1 be:
> [node-1]
> Online: [ node-1.domain.tld ]
> OFFLINE: [  node-2.domain.tld node-3.domain.tld ] 
> 
> 
> Thanks much.
> Hui.

Hi,

How did you make the cluster interface down? If you're blocking it via
firewall, be aware that you have to block *outbound* traffic on the
corosync port.

Do you have stonith working? When the cluster loses a node, it recovers
by fencing it.
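For example, a common way to simulate the outage with a firewall, which also
cuts the outbound direction (assumes the default corosync UDP ports 5404-5405;
adjust to your mcastport):

  iptables -A INPUT  -p udp --dport 5404:5405 -j DROP
  iptables -A OUTPUT -p udp --dport 5404:5405 -j DROP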



Re: [ClusterLabs] both nodes OFFLINE

2017-05-22 Thread Ken Gaillot
On 05/13/2017 01:36 AM, 石井 俊直 wrote:
> Hi.
> 
> We sometimes have a problem in our two-node cluster on CentOS 7. Let node-2
> and node-3
> be the names of the nodes. When the problem happens, both nodes are
> recognized as OFFLINE
> on node-3, while on node-2 only node-3 is recognized as OFFLINE.
> 
> When that happens, the following log message is added repeatedly on node-2
> and the log file
> (/var/log/cluster/corosync.log) grows to hundreds of megabytes in a short time.
> Log message
> content on node-3 is different.
> 
> The erroneous state is temporarily resolved if the OS of node-2 is restarted. On the
> other hand,
> restarting the OS of node-3 results in the same state.
> 
> I’ve searched content of ML and found a post (Mon Oct 1 01:27:39 CEST 2012) 
> about
> "Discarding update with feature set” problem. According to the message, our 
> problem
> may be solved by removing /var/lib/pacemaker/crm/cib.* on node-2.
> 
> What I want to know is whether removing the above files on just one of the
> nodes is safe?
> If there’s other method to solve the problem, I’d like to hear that.
> 
> Thanks.
> 
> —— from corosync.log  
> cib:error: cib_perform_op:Discarding update with feature set 
> '3.0.11' greater than our own '3.0.10'

This implies that the pacemaker versions are different on the two nodes.
Usually, when the pacemaker version changes, the feature set version
also changes, which means that it introduces new features that won't
work with older pacemaker versions.

Running a cluster with mixed pacemaker versions in such a case is
allowed, but only during a rolling upgrade. Once an older node leaves
the cluster for any reason, it will not be allowed to rejoin until it is
upgraded.

Removing the cib files won't help, since node-2 apparently does not
support node-3's pacemaker version.

If that's not the situation you are in, please give more details, as
this should not be possible otherwise.

> cib:error: cib_process_request:   Completed cib_replace operation for 
> section 'all': Protocol not supported (rc=-93, origin=node-3/crmd/12708, 
> version=0.83.30)
> crmd:   error: finalize_sync_callback:Sync from node-3 failed: 
> Protocol not supported
> crmd:info: register_fsa_error_adv:Resetting the current action 
> list
> crmd: warning: do_log:Input I_ELECTION_DC received in state 
> S_FINALIZE_JOIN from finalize_sync_callback
> crmd:info: do_state_transition:   State transition S_FINALIZE_JOIN -> 
> S_INTEGRATION | input=I_ELECTION_DC cause=C_FSA_INTERNAL 
> origin=finalize_sync_callback
> crmd:info: crm_update_peer_join:  initialize_join: Node node-2[1] - 
> join-6329 phase 2 -> 0
> crmd:info: crm_update_peer_join:  initialize_join: Node node-3[2] - 
> join-6329 phase 2 -> 0
> crmd:info: update_dc: Unset DC. Was node-2
> crmd:info: join_make_offer:   join-6329: Sending offer to node-2
> crmd:info: crm_update_peer_join:  join_make_offer: Node node-2[1] - 
> join-6329 phase 0 -> 1
> crmd:info: join_make_offer:   join-6329: Sending offer to node-3
> crmd:info: crm_update_peer_join:  join_make_offer: Node node-3[2] - 
> join-6329 phase 0 -> 1
> crmd:info: do_dc_join_offer_all:  join-6329: Waiting on 2 outstanding 
> join acks
> crmd:info: update_dc: Set DC to node-2 (3.0.10)
> crmd:info: crm_update_peer_join:  do_dc_join_filter_offer: Node node-2[1] 
> - join-6329 phase 1 -> 2
> crmd:info: crm_update_peer_join:  do_dc_join_filter_offer: Node node-3[2] 
> - join-6329 phase 1 -> 2
> crmd:info: do_state_transition:   State transition S_INTEGRATION -> 
> S_FINALIZE_JOIN | input=I_INTEGRATED cause=C_FSA_INTERNAL 
> origin=check_join_state
> crmd:info: crmd_join_phase_log:   join-6329: node-2=integrated
> crmd:info: crmd_join_phase_log:   join-6329: node-3=integrated
> crmd:  notice: do_dc_join_finalize:   Syncing the Cluster Information Base 
> from node-3 to rest of cluster | join-6329
> crmd:  notice: do_dc_join_finalize:   Requested version <cib crm_feature_set="3.0.11" validate-with="pacemaker-2.5" epoch="84" 
> num_updates="1" admin_epoch="0" cib-last-written="Thu May 11 08:05:45 2017" 
> update-origin="node-2" update-client="crm_resource" update-user="root" 
> have-quorum="1"/>
> cib: info: cib_process_request:   Forwarding cib_sync operation for 
> section 'all' to node-3 (origin=local/crmd/12710)
> cib: info: cib_process_replace:   Digest matched on replace from node-3: 
> 85a19c7927c54ccb15794f2720e07ce1
> cib: info: cib_process_replace:   Replaced 0.83.30 with 0.84.1 from node-3
> cib: info: __xml_diff_object: Moved node_state@crmd (3 -> 2)

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] CIB: op-status=4 ?

2017-05-22 Thread Ken Gaillot
0 on olegdbx39-vm03
> returned 'unknown' (189) instead of the expected value: 'not running' (7)
> May 19 13:15:42 [8114] olegdbx39-vm-0crm_mon:  warning:
> unpack_rsc_op_failure:Processing failed op monitor for
> dbx_first_datas:0 on olegdbx39-vm03: unknown (189)
> May 19 13:15:42 [8114] olegdbx39-vm-0crm_mon:debug:
> determine_op_status:dbx_first_datas_monitor_0 on olegdbx39-vm03
> returned 'unknown' (189) instead of the expected value: 'not running' (7)
> May 19 13:15:42 [8114] olegdbx39-vm-0crm_mon:  warning:
> unpack_rsc_op_failure:Processing failed op monitor for
> dbx_first_datas:0 on olegdbx39-vm03: unknown (189)
> May 19 13:15:42 [8114] olegdbx39-vm-0crm_mon:debug:
> determine_op_status:dbx_first_head_monitor_0 on olegdbx39-vm03
> returned 'unknown' (189) instead of the expected value: 'not running' (7)
> May 19 13:15:42 [8114] olegdbx39-vm-0crm_mon:  warning:
> unpack_rsc_op_failure:Processing failed op monitor for
> dbx_first_head on olegdbx39-vm03: unknown (189)
> May 19 13:15:42 [6872] olegdbx39-vm-0 stonith-ng:debug:
> xml_patch_version_check:Can apply patch 2.5.47 to 2.5.46
> May 19 13:15:42 [8114] olegdbx39-vm-0crm_mon:debug:
> determine_op_status:dbx_first_head_monitor_0 on olegdbx39-vm03
> returned 'unknown' (189) instead of the expected value: 'not running' (7)
> May 19 13:15:42 [8114] olegdbx39-vm-0crm_mon:  warning:
> unpack_rsc_op_failure:Processing failed op monitor for
> dbx_first_head on olegdbx39-vm03: unknown (189)
> May 19 13:15:42 [8114] olegdbx39-vm-0crm_mon:debug:
> find_anonymous_clone:Internally renamed dbx_mounts_nodes on
> olegdbx39-vm03 to dbx_mounts_nodes:0
> May 19 13:15:42 [8114] olegdbx39-vm-0crm_mon:debug:
> determine_op_status:dbx_mounts_nodes_monitor_0 on olegdbx39-vm03
> returned 'unknown' (189) instead of the expected value: 'not running' (7)
> May 19 13:15:42 [8114] olegdbx39-vm-0crm_mon:  warning:
> unpack_rsc_op_failure:Processing failed op monitor for
> dbx_mounts_nodes:0 on olegdbx39-vm03: unknown (189)
> May 19 13:15:42 [8114] olegdbx39-vm-0crm_mon:debug:
> determine_op_status:dbx_mounts_nodes_monitor_0 on olegdbx39-vm03
> returned 'unknown' (189) instead of the expected value: 'not running' (7)
> May 19 13:15:42 [8114] olegdbx39-vm-0crm_mon:  warning:
> unpack_rsc_op_failure:Processing failed op monitor for
> dbx_mounts_nodes:0 on olegdbx39-vm03: unknown (189)
> May 19 13:15:42 [8114] olegdbx39-vm-0crm_mon:debug:
> find_anonymous_clone:Internally renamed dbx_nfs_mounts_datas on
> olegdbx39-vm03 to dbx_nfs_mounts_datas:0
> May 19 13:15:42 [8114] olegdbx39-vm-0crm_mon:debug:
> determine_op_status:dbx_nfs_mounts_datas_monitor_0 on
> olegdbx39-vm03 returned 'unknown' (189) instead of the expected
> value: 'not running' (7)
> May 19 13:15:42 [8114] olegdbx39-vm-0crm_mon:  warning:
> unpack_rsc_op_failure:Processing failed op monitor for
> dbx_nfs_mounts_datas:0 on olegdbx39-vm03: unknown (189)
> May 19 13:15:42 [8114] olegdbx39-vm-0crm_mon:debug:
> find_anonymous_clone:Internally renamed dbx_ready_primary on
> olegdbx39-vm03 to dbx_ready_primary:0
> May 19 13:15:42 [8114] olegdbx39-vm-0crm_mon:debug:
> find_anonymous_clone:Internally renamed dbx_first_datas on
> olegdbx39-vm-0 to dbx_first_datas:1
> May 19 13:15:42 [8114] olegdbx39-vm-0crm_mon:debug:
> find_anonymous_clone:Internally renamed dbx_swap_nodes on
> olegdbx39-vm-0 to dbx_swap_nodes:0
> May 19 13:15:42 [8114] olegdbx39-vm-0crm_mon:debug:
> find_anonymous_clone:Internally renamed dbx_mounts_nodes on
> olegdbx39-vm-0 to dbx_mounts_nodes:1
> May 19 13:15:42 [8114] olegdbx39-vm-0crm_mon:debug:
> find_anonymous_clone:Internally renamed dbx_bind_mounts_nodes on
> olegdbx39-vm-0 to dbx_bind_mounts_nodes:1
> May 19 13:15:42 [8114] olegdbx39-vm-0crm_mon:debug:
> find_anonymous_clone:Internally renamed dbx_nfs_mounts_datas on
> olegdbx39-vm-0 to dbx_nfs_mounts_datas:0
> May 19 13:15:42 [8114] olegdbx39-vm-0crm_mon:debug:
> find_anonymous_clone:Internally renamed dbx_nfs_nodes on
> olegdbx39-vm-0 to dbx_nfs_nodes:0
> May 19 13:15:42 [8114] olegdbx39-vm-0crm_mon:debug:
> find_anonymous_clone:Internally renamed dbx_ready_primary on
> olegdbx39-vm-0 to dbx_ready_primary:0
> May 19 13:15:42 [8114] olegdbx39-vm-0crm_mon:debug:
> find_anonymous_clone:Internally renamed dbx_first_datas on
> olegdbx39-vm02 to dbx_first_datas:1
> May 19 13:15:42 [8114] olegdbx39-vm-0crm_mon:debug:
> find_anonymous_clone:Internally renamed dbx_swap_nodes on

[ClusterLabs] Pacemaker 1.1.17 Release Candidate 2

2017-05-23 Thread Ken Gaillot
The second release candidate for Pacemaker version 1.1.17 is now
available at:

https://github.com/ClusterLabs/pacemaker/releases/tag/Pacemaker-1.1.17-rc2

This release contains multiple fixes related to the new bundle feature,
plus:

* A regression introduced in Pacemaker 1.1.15 has been discovered and
fixed. When a Pacemaker Remote connection needed to be recovered, any
actions on that node were not ordered after the connection recovery,
potentially leading to unnecessary failures and recovery actions before
arriving at the correct state.

* The fencing daemon monitors the cluster configuration for constraints
related to fence devices, to know whether to enable or disable them on
the local node. Previously, after reading the initial configuration, it
could detect later changes or removals of constraints, but not
additions. Now, it can.
-- 
Ken Gaillot <kgail...@redhat.com>

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] failcount is not getiing reset after failure_timeout if monitoring is disabled

2017-05-23 Thread Ken Gaillot
On 05/23/2017 08:00 AM, ashutosh tiwari wrote:
> Hi,
> 
> We are running a two node cluster(Active(X)/passive(Y)) having muliple
> resources of type IpAddr2.
> Running monitor operations for multiple IPAddr2 resource is actually
> hogging the CPU, 
> as we have configured a very low value for the monitor interval (200 msec).

That is very low. Although times are generally specified in msec in the
pacemaker configuration, pacemaker generally has 1-second granularity in
the implementation, so this is probably treated the same as a 1s interval.

> 
> To avoid this problem ,we are trying to use netlink notification for
> monitoring floating Ip  and updating the failcount for the corresponding
> Ipaddr2 resource using crm_failcount . Along with this we have disabled
> the ipaddr2 monitoring. 

There is a better approach.

Directly modifying fail counts is not a good idea. Fail counts are being
overhauled in pacemaker 1.1.17 and later, and crm_failcount will only be
able to query or delete a failcount, not set or increment it. There
won't be a convenient way to modify a fail count, as we are trying to
discourage that as an implementation detail that can change.

> Things work fine up to here, as the IPAddr2 resource migrates to the other node (Y)
> once failcount equals the migration threshold(1) and Y becomes Active
> due to resource colocation constraints.
> 
> We have configured failure timeout to 3 sec and expected it to clear the
> failcount on the initially active node(X). 
> Problem is that failcount never gets reset on X and thus cluster fails
> to move back to X.

Technically, it's not the fail count that expires, but a particular
failed operation that expires. Even though manually increasing the fail
count will result in recovery actions, if there is no failed operation
in the resource history, then there's nothing to expire.

However, pacemaker does provide a way to do what you want: see the
crm_resource(8) man page for the -F/--fail option. It will record a fake
operation failure in the resource history, and process it as if it were
a real failure. That should do what you want.
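
For example (the resource and node names here are just placeholders):

$ crm_resource --fail --resource my_ip --node nodeX

That records a fake failure for my_ip on nodeX in the resource history, so
migration-threshold and failure-timeout will apply to it as usual.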

> However if we enable the monitoring everything works fine and failcount
> gets reset allowing to fallback.
> 
> 
> Regrds,
> Ashutosh T
FYI, there's an idea for a future feature that could also be helpful
here. We're thinking of creating a new ocf:pacemaker:IP resource agent
that would be based on systemd's networking support. This would allow
pacemaker to be notified by systemd of IP failures without having to
poll. I'm not sure how systemd itself detects the failures. No timeline
on when this might be available, though.

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] resource monitor logging

2017-05-25 Thread Ken Gaillot
On 05/24/2017 03:44 PM, Christopher Pax wrote:
> 
> I am running postgresql as a resource in corosync, and there is a
> monitor process that kicks off every few seconds to see if postgresql is
> alive (it runs a select now()). My immediate concern is that it is
> generating a lot of logs in auth.log, and I am wondering if this is normal
> behavior? Is there a way to silence this?

That's part of the operating system's security setup. I wouldn't disable
it, for security reasons. If the issue is that the logs are growing too
fast, I'd just rotate them more frequently and keep fewer old logs.

Also, consider whether running a monitor that frequently is really
necessary.

Another possibility, which probably would require modifying the resource
agent, would be to configure two monitors of different "levels". The
regular monitor, scheduled frequently, could just check that the
postgresql pid is still alive. The second level monitor, scheduled less
frequently, could try the select.
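
Very roughly, in crm shell syntax that could look like the following
(untested; it assumes the agent is modified to check OCF_CHECK_LEVEL and
only run the select at the deeper level, and I believe crmsh passes
OCF_CHECK_LEVEL through as an operation instance attribute -- double-check
with your crmsh version):

op monitor interval=3s timeout=60 OCF_CHECK_LEVEL=0 \
op monitor interval=60s timeout=60 OCF_CHECK_LEVEL=10

The two monitors must use different intervals for both to be scheduled.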

> 
> ##
> ## /var/log/auth.log
> ##
> May 24 15:23:19 ssinode02-g2 runuser: pam_unix(runuser:session): session
> opened for user postgres by (uid=0)
> May 24 15:23:19 ssinode02-g2 runuser: pam_unix(runuser:session): session
> closed for user postgres
> May 24 15:23:19 ssinode02-g2 runuser: pam_unix(runuser:session): session
> opened for user postgres by (uid=0)
> May 24 15:23:19 ssinode02-g2 runuser: pam_unix(runuser:session): session
> closed for user postgres
> May 24 15:23:19 ssinode02-g2 runuser: pam_unix(runuser:session): session
> opened for user postgres by (uid=0)
> May 24 15:23:19 ssinode02-g2 runuser: pam_unix(runuser:session): session
> closed for user postgres
> 
> ##
> ## /var/log/postgresql/data.log
> ##
> DEBUG:  forked new backend, pid=27900 socket=11
> LOG:  connection received: host=[local]
> LOG:  connection authorized: user=postgres database=template1
> LOG:  statement: select now();
> LOG:  disconnection: session time: 0:00:00.003 user=postgres
> database=template1 host=[local]
> DEBUG:  server process (PID 27900) exited with exit code 0
> DEBUG:  forked new backend, pid=28030 socket=11
> LOG:  connection received: host=[local]
> LOG:  connection authorized: user=postgres database=template1
> LOG:  statement: select now();
> LOG:  disconnection: session time: 0:00:00.002 user=postgres
> database=template1 host=[local]
> 
> 
> ## snippit of pgsql corosync primitive
> primitive res_pgsql_2 pgsql \
> params pgdata="/mnt/drbd/postgres"
> config="/mnt/drbd/postgres/postgresql.conf" start_opt="-d 2"
> pglibs="/usr/lib/postgresql/9.5/lib"
> logfile="/var/log/postgresql/data.log" \
> operations $id=res_pgsql_1-operations \
> op start interval=0 timeout=60 \
> op stop interval=0 timeout=60 \
> op monitor interval=3 timeout=60 start-delay=0

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] In N+1 cluster, add/delete of one resource result in other node resources to restart

2017-05-19 Thread Ken Gaillot
On 05/19/2017 04:14 AM, Anu Pillai wrote:
> Hi Ken,
> 
> Did you get any chance to go through the logs? 

sorry, not yet

> Do you need any more details ?
> 
> Regards,
> Aswathi
> 
> On Tue, May 16, 2017 at 3:04 PM, Anu Pillai
> <anu.pillai.subsc...@gmail.com <mailto:anu.pillai.subsc...@gmail.com>>
> wrote:
> 
> Hi,
> 
> Please find attached debug logs for the stated problem as well as
> crm_mon command outputs. 
> In this case we are trying to remove/delete res3 and system/node
> (0005B94238BC) from the cluster.
> 
> *_Test reproduction steps_*
> 
> Current Configuration of the cluster:
>  0005B9423910  - res2 
>  0005B9427C5A - res1
>  0005B94238BC - res3
> 
> *crm_mon output:*
> 
> Defaulting to one-shot mode
> You need to have curses available at compile time to enable console mode
> Stack: corosync
> Current DC: 0005B9423910 (version 1.1.14-5a6cdd1) - partition with
> quorum
> Last updated: Tue May 16 12:21:23 2017  Last change: Tue May
> 16 12:13:40 2017 by root via crm_attribute on 0005B9423910
> 
> 3 nodes and 3 resources configured
> 
> Online: [ 0005B94238BC 0005B9423910 0005B9427C5A ]
> 
>  res2   (ocf::redundancy:RedundancyRA): Started 0005B9423910
>  res1   (ocf::redundancy:RedundancyRA): Started 0005B9427C5A
>  res3   (ocf::redundancy:RedundancyRA): Started 0005B94238BC
> 
> 
> Trigger the delete operation for res3 and node 0005B94238BC.
> 
> Following commands applied from node 0005B94238BC
> $ pcs resource delete res3 --force
> $ crm_resource -C res3
> $ pcs cluster stop --force 
> 
> Following command applied from DC(0005B9423910)
> $ crm_node -R 0005B94238BC --force
> 
> 
> *crm_mon output:*
> *
> *
> Defaulting to one-shot mode
> You need to have curses available at compile time to enable console mode
> Stack: corosync
> Current DC: 0005B9423910 (version 1.1.14-5a6cdd1) - partition with
> quorum
> Last updated: Tue May 16 12:21:27 2017  Last change: Tue May
> 16 12:21:26 2017 by root via cibadmin on 0005B94238BC
> 
> 3 nodes and 2 resources configured
> 
> Online: [ 0005B94238BC 0005B9423910 0005B9427C5A ]
> 
> 
> Observation is remaining two resources res2 and res1 were stopped
> and started.
> 
> 
> Regards,
> Aswathi
> 
> On Mon, May 15, 2017 at 8:11 PM, Ken Gaillot <kgail...@redhat.com> wrote:
> 
> On 05/15/2017 06:59 AM, Klaus Wenninger wrote:
> > On 05/15/2017 12:25 PM, Anu Pillai wrote:
> >> Hi Klaus,
> >>
> >> Please find attached cib.xml as well as corosync.conf.
> 
> Maybe you're only setting this while testing, but having
> stonith-enabled=false and no-quorum-policy=ignore is highly
> dangerous in
> any kind of network split.
> 
> FYI, default-action-timeout is deprecated in favor of setting a
> timeout
> in op_defaults, but it doesn't hurt anything.
> 
> > Why wouldn't you keep placement-strategy with default
> > to keep things simple. You aren't using any load-balancing
> > anyway as far as I understood it.
> 
> It looks like the intent is to use placement-strategy to limit
> each node
> to 1 resource. The configuration looks good for that.
> 
> > Haven't used resource-stickiness=INF. No idea which strange
> > behavior that triggers. Try to have it just higher than what
> > the other scores might some up to.
> 
> Either way would be fine. Using INFINITY ensures that no other
> combination of scores will override it.
> 
> > I might have overseen something in your scores but otherwise
> > there is nothing obvious to me.
> >
> > Regards,
> > Klaus
> 
> I don't see anything obvious either. If you have logs around the
> time of
> the incident, that might help.
> 
> >> Regards,
> >> Aswathi
> >>
> >> On Mon, May 15, 2017 at 2:46 PM, Klaus Wenninger <kwenn...@redhat.com> wrote:
> >>
> >> On 05/15/2017 09:36 AM, Anu Pillai wrote:
> >> > Hi,
> >> >
>  

Re: [ClusterLabs] CIB: op-status=4 ?

2017-05-18 Thread Ken Gaillot
On 05/17/2017 06:10 PM, Radoslaw Garbacz wrote:
> Hi,
> 
> I have a question regarding the 'op-status'
> attribute getting the value 4.
> 
> In my case I have a strange behavior, when resources get those "monitor"
> operation entries in the CIB with op-status=4, and they do not seem to
> be called (exec-time=0).
> 
> What does 'op-status' = 4 mean?

The action had an error status

> 
> I would appreciate some elaboration regarding this, since this is
> interpreted by pacemaker as an error, which causes logs:
> crm_mon:error: unpack_rsc_op:Preventing dbx_head_head from
> re-starting anywhere: operation monitor failed 'not configured' (6)

The rc-code="6" is the more interesting number; it's the result returned
by the resource agent. As you can see above, it means "not configured".
What that means exactly is up to the resource agent's interpretation.

> and I am pretty sure the resource agent was not called (no logs,
> exec-time=0)

Normally this could only come from the resource agent.

However there are two cases where pacemaker generates this error itself:
if the resource definition in the CIB is invalid; and if your version of
pacemaker was compiled with support for reading sensitive parameter
values from a file, but that file could not be read.

It doesn't sound like your case is either one of those though, since
they would prevent the resource from even starting. Most likely it's
coming from the resource agent. I'd look at the resource agent source
code and see where it can return OCF_ERR_CONFIGURED.
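
The usual pattern to look for is something like this (purely illustrative,
not the actual agent code):

if [ -z "$OCF_RESKEY_some_required_param" ]; then
    ocf_log err "some_required_param must be set"
    return $OCF_ERR_CONFIGURED   # exit code 6 = "not configured"
fi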

> There are two aspects of this:
> 
> 1) harmless (pacemaker seems to not bother about it), which I guess
> indicates cancelled monitoring operations:
> op-status=4, rc-code=189

This error means the connection between the crmd and lrmd daemons was
lost -- most commonly, that shows up for operations that were pending at
shutdown.

> 
> * Example:
>  operation_key="dbx_first_datas_monitor_0" operation="monitor"
> crm-debug-origin="do_update_resource" crm_feature_set="3.0.12"
> transition-key="38:0:7:c8b63d9d-9c70-4f99-aa1b-e993de6e4739"
> transition-magic="4:189;38:0:7:c8b63d9d-9c70-4f99-aa1b-e993de6e4739"
> on_node="olegdbx61-vm01" call-id="10" rc-code="189" op-status="4"
> interval="0" last-run="1495057378" last-rc-change="1495057378"
> exec-time="0" queue-time="0" op-digest="f6bd1386a336e8e6ee25ecb651a9efb6"/>
> 
> 
> 2) error level one (op-status=4, rc-code=6), which generates logs:
> crm_mon:error: unpack_rsc_op:Preventing dbx_head_head from
> re-starting anywhere: operation monitor failed 'not configured' (6)
> 
> * Example:
>  operation_key="dbx_head_head_monitor_0" operation="monitor"
> crm-debug-origin="do_update_resource" crm_feature_set="3.0.12"
> transition-key="39:0:7:c8b63d9d-9c70-4f99-aa1b-e993de6e4739"
> transition-magic="4:6;39:0:7:c8b63d9d-9c70-4f99-aa1b-e993de6e4739"
> on_node="olegdbx61-vm01" call-id="9" rc-code="6"
> op-status="4" interval="0" last-run="1495057389"
> last-rc-change="1495057389" exec-time="0" queue-time="0"
> op-digest="60cdc9db1c5b77e8dba698d3d0c8cda8"/>
> 
> 
> Could it be some hardware (VM hypervisor) issue?
> 
> 
> Thanks in advance,
> 
> -- 
> Best Regards,
> 
> Radoslaw Garbacz
> XtremeData Incorporated

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] question about fence-virsh

2017-05-19 Thread Ken Gaillot
On 05/19/2017 03:47 PM, Andrew Kerber wrote:
> What I am trying to say here is when I get one of the virtual machines
> in a bad state, I can still log in and reboot it with the reboot
> command. But I need my fencing resource to handle that reboot.
> 
> On Fri, May 19, 2017 at 1:32 PM, Andrew Kerber wrote:
> 
> Thanks for the answer, but thats not the problem.  I dont have
> access to the console, its a security issue.  I only have access
> within the virtual machines, so I want to send the reboot command
> within the virtual machine, not to the console. Typically our
> hangups are such that the reboot command works, and the machine
> hangs at starting back up, and I get an admin to go hit the console.

What you're asking for is an "ssh" fence agent. While such can be found,
they are not considered reliable fence agents.

Your *typical* problem may be solvable with running "reboot" inside the
VM, but there are situations in which that won't work (kernel panic,
loss of network connectivity in the VM, crippling load, etc.). Only
access to the hypervisor can provide a reliable fence mechanism for the VM.

If you're lucky, whoever is providing your VM can also provide you an
API to use to request a hard reboot of the VM at the hypervisor level.
Then, you can see if there is a fence agent already written for that
API, or modify an existing one to handle it.

If you can't even get API access to the hypervisor, then you're not
going to get full HA. You could search for an ssh fence agent, but be
aware that's a partial solution at best, and you won't be able to
recover from certain failure scenarios.


> On Fri, May 19, 2017 at 12:39 PM, Digimer wrote:
> 
> On 19/05/17 12:59 PM, Andrew Kerber wrote:
> > I have been setting up a cluster on virtual machines with some 
> shared
> > resources.  The only fencing tool I have found designed for that
> > configuration is fence virsh, but I have not been able to figure out
> > from the documentation how to get fence-virsh to issue the reboot
> > command.  Does anyone have a good explanation of how to configure
> > fence-virsh to issue a reboot command?  I understand its not 
> perfect,
> > because in some hard lockup situations only hitting a power button 
> will
> > work, but for this configuration thats not really an option.
> >
> > --
> > Andrew W. Kerber
> 
> fence_virsh -a <hypervisor address> -l root -p <password> -n <vm name> -o status
> 
> That should show the status. To reboot, change 'status' to 'reboot'.
> 
> If this doesn't work, make sure you can ssh from the nodes to the
> hypervisor as the root user.
> 
> --
> Digimer
> Papers and Projects: https://alteeve.com/w/
> "I am, somehow, less interested in the weight and convolutions of
> Einstein’s brain than in the near certainty that people of equal
> talent
> have lived and died in cotton fields and sweatshops." - Stephen
> Jay Gould

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] In N+1 cluster, add/delete of one resource result in other node resources to restart

2017-05-22 Thread Ken Gaillot
On 05/16/2017 04:34 AM, Anu Pillai wrote:
> Hi,
> 
> Please find attached debug logs for the stated problem as well as
> crm_mon command outputs. 
> In this case we are trying to remove/delete res3 and system/node
> (0005B94238BC) from the cluster.
> 
> *_Test reproduction steps_*
> 
> Current Configuration of the cluster:
>  0005B9423910  - res2 
>  0005B9427C5A - res1
>  0005B94238BC - res3
> 
> *crm_mon output:*
> 
> Defaulting to one-shot mode
> You need to have curses available at compile time to enable console mode
> Stack: corosync
> Current DC: 0005B9423910 (version 1.1.14-5a6cdd1) - partition with quorum
> Last updated: Tue May 16 12:21:23 2017  Last change: Tue May 16
> 12:13:40 2017 by root via crm_attribute on 0005B9423910
> 
> 3 nodes and 3 resources configured
> 
> Online: [ 0005B94238BC 0005B9423910 0005B9427C5A ]
> 
>  res2   (ocf::redundancy:RedundancyRA): Started 0005B9423910
>  res1   (ocf::redundancy:RedundancyRA): Started 0005B9427C5A
>  res3   (ocf::redundancy:RedundancyRA): Started 0005B94238BC
> 
> 
> Trigger the delete operation for res3 and node 0005B94238BC.
> 
> Following commands applied from node 0005B94238BC
> $ pcs resource delete res3 --force
> $ crm_resource -C res3
> $ pcs cluster stop --force 

I don't think "pcs resource delete" or "pcs cluster stop" does anything
with the --force option. In any case, --force shouldn't be needed here.

The crm_mon output you see is actually not what it appears. It starts with:

May 16 12:21:27 [4661] 0005B9423910   crmd:   notice: do_lrm_invoke:
   Forcing the status of all resources to be redetected

This is usually the result of a "cleanup all" command. It works by
erasing the resource history, causing pacemaker to re-probe all nodes to
get the current state. The history erasure makes it appear to crm_mon
that the resources are stopped, but they actually are not.

In this case, I'm not sure why it's doing a "cleanup all", since you
only asked it to cleanup res3. Maybe in this particular instance, you
actually did "crm_resource -C"?
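
For reference, the two forms behave quite differently:

$ crm_resource -C -r res3   # clear the operation history of res3 only
$ crm_resource -C           # clear the history of all resources (full re-probe)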

> Following command applied from DC(0005B9423910)
> $ crm_node -R 0005B94238BC --force

This can cause problems. This command shouldn't be run unless the node
is removed from both pacemaker's and corosync's configuration. If you
actually are trying to remove the node completely, a better alternative
would be "pcs cluster node remove 0005B94238BC", which will handle all
of that for you. If you're not trying to remove the node completely,
then you shouldn't need this command at all.

> 
> 
> *crm_mon output:*
> *
> *
> Defaulting to one-shot mode
> You need to have curses available at compile time to enable console mode
> Stack: corosync
> Current DC: 0005B9423910 (version 1.1.14-5a6cdd1) - partition with quorum
> Last updated: Tue May 16 12:21:27 2017  Last change: Tue May 16
> 12:21:26 2017 by root via cibadmin on 0005B94238BC
> 
> 3 nodes and 2 resources configured
> 
> Online: [ 0005B94238BC 0005B9423910 0005B9427C5A ]
> 
> 
> Observation is remaining two resources res2 and res1 were stopped and
> started.
> 
> 
> Regards,
> Aswathi
> 
> On Mon, May 15, 2017 at 8:11 PM, Ken Gaillot <kgail...@redhat.com> wrote:
> 
> On 05/15/2017 06:59 AM, Klaus Wenninger wrote:
> > On 05/15/2017 12:25 PM, Anu Pillai wrote:
> >> Hi Klaus,
> >>
> >> Please find attached cib.xml as well as corosync.conf.
> 
> Maybe you're only setting this while testing, but having
> stonith-enabled=false and no-quorum-policy=ignore is highly dangerous in
> any kind of network split.
> 
> FYI, default-action-timeout is deprecated in favor of setting a timeout
> in op_defaults, but it doesn't hurt anything.
> 
> > Why wouldn't you keep placement-strategy with default
> > to keep things simple. You aren't using any load-balancing
> > anyway as far as I understood it.
> 
> It looks like the intent is to use placement-strategy to limit each node
> to 1 resource. The configuration looks good for that.
> 
> > Haven't used resource-stickiness=INF. No idea which strange
> > behavior that triggers. Try to have it just higher than what
> > the other scores might some up to.
> 
> Either way would be fine. Using INFINITY ensures that no other
> combination of scores will override it.
> 
> > I might have overseen something in your scores but otherwise
> > there is nothing obvious to me.
> >
> > Regards,
> > Klaus
> 
> I don't see anything obvious either. If you have lo

Re: [ClusterLabs] clearing failed actions

2017-05-30 Thread Ken Gaillot
On 05/30/2017 09:13 AM, Attila Megyeri wrote:
> Hi,
> 
>  
> 
> Shouldn’t the 
> 
>  
> 
> cluster-recheck-interval="2m"
> 
>  
> 
> property instruct pacemaker to recheck the cluster every 2 minutes and
> clean the failcounts?

It instructs pacemaker to recalculate whether any actions need to be
taken (including expiring any failcounts appropriately).

> At the primitive level I also have a
> 
>  
> 
> migration-threshold="30" failure-timeout="2m"
> 
>  
> 
> but whenever I have a failure, it remains there forever.
> 
>  
> 
>  
> 
> What could be causing this?
> 
>  
> 
> thanks,
> 
> Attila
Is it a single old failure, or a recurring failure? The failure timeout
works in a somewhat nonintuitive way. Old failures are not individually
expired. Instead, all failures of a resource are simultaneously cleared
if all of them are older than the failure-timeout. So if something keeps
failing repeatedly (more frequently than the failure-timeout), none of
the failures will be cleared.

If it's not a repeating failure, something odd is going on.
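
One way to see what's actually being recorded is a one-shot status that
includes the fail counts, e.g.:

$ crm_mon -1 -f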

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] clearing failed actions

2017-05-31 Thread Ken Gaillot
On 05/30/2017 02:50 PM, Attila Megyeri wrote:
> Hi Ken,
> 
> 
>> -Original Message-----
>> From: Ken Gaillot [mailto:kgail...@redhat.com]
>> Sent: Tuesday, May 30, 2017 4:32 PM
>> To: users@clusterlabs.org
>> Subject: Re: [ClusterLabs] clearing failed actions
>>
>> On 05/30/2017 09:13 AM, Attila Megyeri wrote:
>>> Hi,
>>>
>>>
>>>
>>> Shouldn't the
>>>
>>>
>>>
>>> cluster-recheck-interval="2m"
>>>
>>>
>>>
>>> property instruct pacemaker to recheck the cluster every 2 minutes and
>>> clean the failcounts?
>>
>> It instructs pacemaker to recalculate whether any actions need to be
>> taken (including expiring any failcounts appropriately).
>>
>>> At the primitive level I also have a
>>>
>>>
>>>
>>> migration-threshold="30" failure-timeout="2m"
>>>
>>>
>>>
>>> but whenever I have a failure, it remains there forever.
>>>
>>>
>>>
>>>
>>>
>>> What could be causing this?
>>>
>>>
>>>
>>> thanks,
>>>
>>> Attila
>> Is it a single old failure, or a recurring failure? The failure timeout
>> works in a somewhat nonintuitive way. Old failures are not individually
>> expired. Instead, all failures of a resource are simultaneously cleared
>> if all of them are older than the failure-timeout. So if something keeps
>> failing repeatedly (more frequently than the failure-timeout), none of
>> the failures will be cleared.
>>
>> If it's not a repeating failure, something odd is going on.
> 
> It is not a repeating failure. Let's say that a resource fails for whatever 
> action, It will remain in the failed actions (crm_mon -Af) until I issue a 
> "crm resource cleanup ". Even after days or weeks, even though 
> I see in the logs that cluster is rechecked every 120 seconds.
> 
> How could I troubleshoot this issue?
> 
> thanks!


Ah, I see what you're saying. That's expected behavior.

The failure-timeout applies to the failure *count* (which is used for
checking against migration-threshold), not the failure *history* (which
is used for the status display).

The idea is to have it no longer affect the cluster behavior, but still
allow an administrator to know that it happened. That's why a manual
cleanup is required to clear the history.

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Pacemaker's "stonith too many failures" log is not accurate

2017-05-31 Thread Ken Gaillot
On 05/26/2017 03:21 AM, 井上 和徳 wrote:
> Hi Ken,
> 
> The cause turned out to be the following.
> 
> When stonith is executed, stonithd sends results and notifications to crmd.
> https://github.com/ClusterLabs/pacemaker/blob/0459f409580f41b35ce8ae31fb22e6370a508dab/fencing/remote.c#L402-L406
> 
> - when "result" is sent (calling do_local_reply()), too many stonith failures 
> is checked in too_many_st_failures().
>   
> https://github.com/ClusterLabs/pacemaker/blob/0459f409580f41b35ce8ae31fb22e6370a508dab/crmd/te_callbacks.c#L638-L669
> - when "notification" is sent (calling do_stonith_notify()), the number of 
> failures is incremented in st_fail_count_increment().
>   
> https://github.com/ClusterLabs/pacemaker/blob/0459f409580f41b35ce8ae31fb22e6370a508dab/crmd/te_callbacks.c#L704-L726
> From this, since checking is done before incrementing, the number of failures 
> in "Too many failures (10) to fence" log does not match the number of actual 
> failures.

Thanks for this analysis!

We do want the result to be sent before the notifications, so the
solution will be slightly more complicated. The DC will have to call
st_fail_count_increment() when receiving the result, while non-DC nodes
will continue to call it when receiving the notification.

I'll put together a fix before the 1.1.17 release.

> 
> I confirmed that the expected result will be obtained from the following 
> changes.
> 
> # git diff
> diff --git a/fencing/remote.c b/fencing/remote.c
> index 4a47d49..3ff324e 100644
> --- a/fencing/remote.c
> +++ b/fencing/remote.c
> @@ -399,12 +399,12 @@ handle_local_reply_and_notify(remote_fencing_op_t * op, 
> xmlNode * data, int rc)
>  reply = stonith_construct_reply(op->request, NULL, data, rc);
>  crm_xml_add(reply, F_STONITH_DELEGATE, op->delegate);
> 
> -/* Send fencing OP reply to local client that initiated fencing */
> -do_local_reply(reply, op->client_id, op->call_options & 
> st_opt_sync_call, FALSE);
> -
>  /* bcast to all local clients that the fencing operation happend */
>  do_stonith_notify(0, T_STONITH_NOTIFY_FENCE, rc, notify_data);
> 
> +/* Send fencing OP reply to local client that initiated fencing */
> +do_local_reply(reply, op->client_id, op->call_options & 
> st_opt_sync_call, FALSE);
> +
>  /* mark this op as having notify's already sent */
>  op->notify_sent = TRUE;
>  free_xml(reply);
> 
> Regards,
> Kazunori INOUE
> 
>> -Original Message-
>> From: Ken Gaillot [mailto:kgail...@redhat.com]
>> Sent: Wednesday, May 17, 2017 11:09 PM
>> To: users@clusterlabs.org
>> Subject: Re: [ClusterLabs] Pacemaker's "stonith too many failures" log is 
>> not accurate
>>
>> On 05/17/2017 04:56 AM, Klaus Wenninger wrote:
>>> On 05/17/2017 11:28 AM, 井上 和徳 wrote:
>>>> Hi,
>>>> I'm testing Pacemaker-1.1.17-rc1.
>>>> The number of failures in "Too many failures (10) to fence" log does not 
>>>> match the number of actual failures.
>>>
>>> Well it kind of does as after 10 failures it doesn't try fencing again
>>> so that is what
>>> failures stay at ;-)
>>> Of course it still sees the need to fence but doesn't actually try.
>>>
>>> Regards,
>>> Klaus
>>
>> This feature can be a little confusing: it doesn't prevent all further
>> fence attempts of the target, just *immediate* fence attempts. Whenever
>> the next transition is started for some other reason (a configuration or
>> state change, cluster-recheck-interval, node failure, etc.), it will try
>> to fence again.
>>
>> Also, it only checks this threshold if it's aborting a transition
>> *because* of this fence failure. If it's aborting the transition for
>> some other reason, the number can go higher than the threshold. That's
>> what I'm guessing happened here.
>>
>>>> After the 11th time fence failure, "Too many failures (10) to fence" is 
>>>> output.
>>>> Incidentally, stonith-max-attempts has not been set, so it is 10 by 
>>>> default..
>>>>
>>>> [root@x3650f log]# egrep "Requesting fencing|error: Operation 
>>>> reboot|Stonith failed|Too many failures"
>>>> ##Requesting fencing : 1st time
>>>> May 12 05:51:47 rhel73-1 crmd[5269]:  notice: Requesting fencing (reboot) 
>>>> of node rhel73-2
>>>> May 12 05:52:52 rhel73-1 stonith-ng[5265]:   error: Operation reboot of 
>>>> rhel73-2 by rhel73-1 for
>> crmd.5269@rhel73-1.8415167d: No data available
>>>> May 12 05:52:52 

Re: [ClusterLabs] Node attribute disappears when pacemaker is started

2017-05-31 Thread Ken Gaillot
On 05/26/2017 03:21 AM, 井上 和徳 wrote:
> Hi Ken,
> 
> I got crm_report.
> 
> Regards,
> Kazunori INOUE

I don't think it attached -- my mail client says it's 0 bytes.

>> -Original Message-----
>> From: Ken Gaillot [mailto:kgail...@redhat.com]
>> Sent: Friday, May 26, 2017 4:23 AM
>> To: users@clusterlabs.org
>> Subject: Re: [ClusterLabs] Node attribute disappears when pacemaker is 
>> started
>>
>> On 05/24/2017 05:13 AM, 井上 和徳 wrote:
>>> Hi,
>>>
>>> After loading the node attribute, when I start pacemaker of that node, the 
>>> attribute disappears.
>>>
>>> 1. Start pacemaker on node1.
>>> 2. Load configure containing node attribute of node2.
>>>(I use multicast addresses in corosync, so did not set "nodelist 
>>> {nodeid: }" in corosync.conf.)
>>> 3. Start pacemaker on node2, the node attribute that should have been load 
>>> disappears.
>>>Is this specifications ?
>>
>> Hi,
>>
>> No, this should not happen for a permanent node attribute.
>>
>> Transient node attributes (status-attr in crm shell) are erased when the
>> node starts, so it would be expected in that case.
>>
>> I haven't been able to reproduce this with a permanent node attribute.
>> Can you attach logs from both nodes around the time node2 is started?
>>
>>>
>>> 1.
>>> [root@rhel73-1 ~]# systemctl start corosync;systemctl start pacemaker
>>> [root@rhel73-1 ~]# crm configure show
>>> node 3232261507: rhel73-1
>>> property cib-bootstrap-options: \
>>>   have-watchdog=false \
>>>   dc-version=1.1.17-0.1.rc2.el7-524251c \
>>>   cluster-infrastructure=corosync
>>>
>>> 2.
>>> [root@rhel73-1 ~]# cat rhel73-2.crm
>>> node rhel73-2 \
>>>   utilization capacity="2" \
>>>   attributes attrname="attr2"
>>>
>>> [root@rhel73-1 ~]# crm configure load update rhel73-2.crm
>>> [root@rhel73-1 ~]# crm configure show
>>> node 3232261507: rhel73-1
>>> node rhel73-2 \
>>>   utilization capacity=2 \
>>>   attributes attrname=attr2
>>> property cib-bootstrap-options: \
>>>   have-watchdog=false \
>>>   dc-version=1.1.17-0.1.rc2.el7-524251c \
>>>   cluster-infrastructure=corosync
>>>
>>> 3.
>>> [root@rhel73-1 ~]# ssh rhel73-2 'systemctl start corosync;systemctl start 
>>> pacemaker'
>>> [root@rhel73-1 ~]# crm configure show
>>> node 3232261507: rhel73-1
>>> node 3232261508: rhel73-2
>>> property cib-bootstrap-options: \
>>>   have-watchdog=false \
>>>   dc-version=1.1.17-0.1.rc2.el7-524251c \
>>>   cluster-infrastructure=corosync
>>>
>>> Regards,
>>> Kazunori INOUE

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] crm_resource -c field

2017-06-07 Thread Ken Gaillot
On 06/05/2017 10:19 AM, iva...@libero.it wrote:
> Hello,
> could you explain the meaning of fields in "crm_resource -c" command (c
> in lowercase)?
> 
> I've tried to search on web but i didn't find anything.
> 
> Thanks and regards
> 
> Ivan

It's used solely by pacemaker's cluster test suite (CTS) to get key
information in a parsable format. It could theoretically change in any
release, so I wouldn't rely on it for end user purposes.

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Cloned IP not moving back after node restart or standby

2017-06-07 Thread Ken Gaillot
On 06/02/2017 06:33 AM, Takehiro Matsushima wrote:
> Hi,
> 
> You should not clone the IPaddr2 resource.
> Cloning means the resource runs at the same time on both nodes, so
> these nodes will have the same duplicated IP address on the network.
> 
> Specifically, you need to configure the IPaddr2 resource to run on the node
> where the cloned nginx is running, using a colocation constraint.
> 
> However please note that does not work as a load-balancer.
> 
> Regards,
> Takehiro

Actually, IPaddr2 provides special logic for being run as a unique clone
(unique meaning that multiple instances of the clone can run on one
node). It uses the iptables "clusterip" feature to utilize multicast
Ethernet addresses (something different from multicast IP) so that each
node handles only certain requests to the IP.

> On 2017/05/31 at 1:48 AM, "Przemyslaw Kulczycki" wrote:
> 
> Hi.
> I'm trying to setup a 2-node corosync+pacemaker cluster to function
> as an active-active setup for nginx with a shared IP.
> 
> I've discovered (much to my disappointment) that every time I
> restart one node or put it in standby, the second instance of the
> cloned IP gets moved to the first node and doesn't go back once the
> second node is available, even though I have set stickiness to 0.

> [upr@webdemo3 ~]$ sudo pcs status
> Cluster name: webdemo_cluster2
> Stack: corosync
> Current DC: webdemo3 (version 1.1.15-11.el7_3.4-e174ec8) - partition
> with quorum
> Last updated: Tue May 30 18:40:18 2017  Last change: Tue May
> 30 17:56:24 2017 by hacluster via crmd on webdemo4
> 
> 2 nodes and 4 resources configured
> 
> Online: [ webdemo3 webdemo4 ]
> 
> Full list of resources:
> 
>  Clone Set: ha-ip-clone [ha-ip] (unique)
>  ha-ip:0 (ocf::heartbeat:IPaddr2):   Started webdemo3
>  ha-ip:1 (ocf::heartbeat:IPaddr2):   Started webdemo3
>  Clone Set: ha-nginx-clone [ha-nginx] (unique)
>  ha-nginx:0 (ocf::heartbeat:nginx): Started webdemo3
>  ha-nginx:1 (ocf::heartbeat:nginx): Started webdemo4
> 
> Failed Actions:
> * ha-nginx:0_monitor_2 on webdemo3 'not running' (7): call=108,
> status=complete, exitreason='none',
> last-rc-change='Tue May 30 17:56:46 2017', queued=0ms, exec=0ms
> 
> 
> Daemon Status:
>   corosync: active/enabled
>   pacemaker: active/enabled
>   pcsd: active/enabled
> 
> [upr@webdemo3 ~]$ sudo pcs config --full
> Cluster Name: webdemo_cluster2
> Corosync Nodes:
>  webdemo3 webdemo4
> Pacemaker Nodes:
>  webdemo3 webdemo4
> 
> Resources:
>  Clone: ha-ip-clone
>   Meta Attrs: clone-max=2 clone-node-max=2 globally-unique=true
> *stickiness=0*
>
>   Resource: ha-ip (class=ocf provider=heartbeat type=IPaddr2)
>Attributes: ip=10.75.39.235 cidr_netmask=24 clusterip_hash=sourceip
>Operations: start interval=0s timeout=20s (ha-ip-start-interval-0s)
>stop interval=0s timeout=20s (ha-ip-stop-interval-0s)
>monitor interval=10s timeout=20s
> (ha-ip-monitor-interval-10s)
>  Clone: ha-nginx-clone
>   Meta Attrs: globally-unique=true clone-node-max=1
>   Resource: ha-nginx (class=ocf provider=heartbeat type=nginx)
>Operations: start interval=0s timeout=60s
> (ha-nginx-start-interval-0s)
>stop interval=0s timeout=60s (ha-nginx-stop-interval-0s)
>monitor interval=20s timeout=30s
> (ha-nginx-monitor-interval-20s)
> 
> Stonith Devices:
> Fencing Levels:
> 
> Location Constraints:
> Ordering Constraints:
> Colocation Constraints:
>   ha-ip-clone with ha-nginx-clone (score:INFINITY)
> (id:colocation-ha-ip-ha-nginx-INFINITY)
> Ticket Constraints:
> 
> Alerts:
>  No alerts defined
> 
> Resources Defaults:
>  resource-stickiness: 100
> Operations Defaults:
>  No defaults set
> 
> Cluster Properties:
>  cluster-infrastructure: corosync
>  cluster-name: webdemo_cluster2
>  dc-version: 1.1.15-11.el7_3.4-e174ec8
>  have-watchdog: false
>  last-lrm-refresh: 1496159785
>  no-quorum-policy: ignore
>  stonith-enabled: false
> 
> Quorum:
>   Options:
> 
> Am I doing something incorrectly?
> 
> Additionally, I'd like to know what's the difference between these
> commands:
> 
> sudo pcs resource update ha-ip-clone stickiness=0
> 
> sudo pcs resource meta ha-ip-clone resource-stickiness=0
> 
> 
> They seem to set the same thing, but there might be a subtle difference.
> 
> -- 
> Best Regards
>  
> Przemysław Kulczycki
> System administrator
> Avaleo
> 
> Email: u...@avaleo.net 

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: 

Re: [ClusterLabs] Node attribute disappears when pacemaker is started

2017-06-08 Thread Ken Gaillot
Hi,

Looking at the incident around May 26 16:40:00, here is what happens:

You are setting the attribute for rhel73-2 from rhel73-1, while rhel73-2
is not part of cluster from rhel73-1's point of view.

The crm shell sets the node attribute for rhel73-2 with a CIB
modification that starts like this:

++ /cib/configuration/nodes:  <node id="rhel73-2" uname="rhel73-2"/>

Note that the node ID is the same as its name. The CIB accepts the
change (because you might be adding the proper node later). The crmd
knows that this is not currently valid:

May 26 16:39:39 rhel73-1 crmd[2908]:   error: Invalid node id: rhel73-2

When rhel73-2 joins the cluster, rhel73-1 learns its node ID, and it
removes the existing (invalid) rhel73-2 entry, including its attributes,
because it assumes that the entry is for an older node that has been
removed.

I believe attributes can be set for a node that's not in the cluster
only if the node IDs are specified explicitly in corosync.conf.
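
For example, a nodelist along these lines in corosync.conf on all nodes
would give each name a fixed ID (the IDs here are just examples), so the
attribute could be associated with the right node before it joins:

nodelist {
    node {
        ring0_addr: rhel73-1
        nodeid: 1
    }
    node {
        ring0_addr: rhel73-2
        nodeid: 2
    }
}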

You may want to mention the issue to the crm shell developers. It should
probably at least warn if the node isn't known.


On 05/31/2017 09:35 PM, 井上 和徳 wrote:
> Hi Ken,
> 
> I'm sorry. Attachment size was too large.
> I attached it to GitHub, so look at it.
> https://github.com/inouekazu/pcmk_report/blob/master/pcmk-Fri-26-May-2017.tar.bz2
> 
>> -Original Message-----
>> From: Ken Gaillot [mailto:kgail...@redhat.com]
>> Sent: Thursday, June 01, 2017 8:43 AM
>> To: users@clusterlabs.org
>> Subject: Re: [ClusterLabs] Node attribute disappears when pacemaker is 
>> started
>>
>> On 05/26/2017 03:21 AM, 井上 和徳 wrote:
>>> Hi Ken,
>>>
>>> I got crm_report.
>>>
>>> Regards,
>>> Kazunori INOUE
>>
>> I don't think it attached -- my mail client says it's 0 bytes.
>>
>>>> -Original Message-
>>>> From: Ken Gaillot [mailto:kgail...@redhat.com]
>>>> Sent: Friday, May 26, 2017 4:23 AM
>>>> To: users@clusterlabs.org
>>>> Subject: Re: [ClusterLabs] Node attribute disappears when pacemaker is 
>>>> started
>>>>
>>>> On 05/24/2017 05:13 AM, 井上 和徳 wrote:
>>>>> Hi,
>>>>>
>>>>> After loading the node attribute, when I start pacemaker of that node, 
>>>>> the attribute disappears.
>>>>>
>>>>> 1. Start pacemaker on node1.
>>>>> 2. Load configure containing node attribute of node2.
>>>>>(I use multicast addresses in corosync, so did not set "nodelist 
>>>>> {nodeid: }" in corosync.conf.)
>>>>> 3. Start pacemaker on node2, the node attribute that should have been 
>>>>> load disappears.
>>>>>Is this specifications ?
>>>>
>>>> Hi,
>>>>
>>>> No, this should not happen for a permanent node attribute.
>>>>
>>>> Transient node attributes (status-attr in crm shell) are erased when the
>>>> node starts, so it would be expected in that case.
>>>>
>>>> I haven't been able to reproduce this with a permanent node attribute.
>>>> Can you attach logs from both nodes around the time node2 is started?
>>>>
>>>>>
>>>>> 1.
>>>>> [root@rhel73-1 ~]# systemctl start corosync;systemctl start pacemaker
>>>>> [root@rhel73-1 ~]# crm configure show
>>>>> node 3232261507: rhel73-1
>>>>> property cib-bootstrap-options: \
>>>>>   have-watchdog=false \
>>>>>   dc-version=1.1.17-0.1.rc2.el7-524251c \
>>>>>   cluster-infrastructure=corosync
>>>>>
>>>>> 2.
>>>>> [root@rhel73-1 ~]# cat rhel73-2.crm
>>>>> node rhel73-2 \
>>>>>   utilization capacity="2" \
>>>>>   attributes attrname="attr2"
>>>>>
>>>>> [root@rhel73-1 ~]# crm configure load update rhel73-2.crm
>>>>> [root@rhel73-1 ~]# crm configure show
>>>>> node 3232261507: rhel73-1
>>>>> node rhel73-2 \
>>>>>   utilization capacity=2 \
>>>>>   attributes attrname=attr2
>>>>> property cib-bootstrap-options: \
>>>>>   have-watchdog=false \
>>>>>   dc-version=1.1.17-0.1.rc2.el7-524251c \
>>>>>   cluster-infrastructure=corosync
>>>>>
>>>>> 3.
>>>>> [root@rhel73-1 ~]# ssh rhel73-2 'systemctl start corosync;systemctl start 
>>>>> pacemaker'
>>>>> [root@rhel73-1 ~]# crm configure show
>>>>> node 3232261507: rhel73-1
>>>>> node 3232261508: rhel73-2
>>>>> property cib-bootstrap-options: \
>>>>>   have-watchdog=false \
>>>>>   dc-version=1.1.17-0.1.rc2.el7-524251c \
>>>>>   cluster-infrastructure=corosync
>>>>>
>>>>> Regards,
>>>>> Kazunori INOUE

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Pacemaker shutting down peer node

2017-06-15 Thread Ken Gaillot
On 06/15/2017 12:38 AM, Jaz Khan wrote:
> Hi,
> 
> I have been encountering this serious issue from past couple of months.
> I really have no idea that why pacemaker sends shutdown signal to peer
> node and it goes down. This is very strange and I am too much worried . 
> 
> This is not happening daily, but it surely does this kind of behavior
> after every few days.
> 
> Version:
> Pacemaker 1.1.16
> Corosync 2.4.2
> 
> Please help me out with this bug! Below is the log message.
> 
> 
> 
> Jun 14 15:52:23 apex1 crmd[18733]:  notice: State transition S_IDLE ->
> S_POLICY_ENGINE
> Jun 14 15:52:23 apex1 pengine[18732]:  notice: On loss of CCM Quorum: Ignore
> 
> Jun 14 15:52:23 apex1 pengine[18732]:  notice: Scheduling Node ha-apex2
> for shutdown

This is not fencing, but a clean shutdown. Normally this only happens
in response to a user request.

Check the logs on both nodes before this point, to try to see what was
the first indication that it would shut down.

> 
> Jun 14 15:52:23 apex1 pengine[18732]:  notice: Movevip#011(Started
> ha-apex2 -> ha-apex1)
> Jun 14 15:52:23 apex1 pengine[18732]:  notice: Move  
>  filesystem#011(Started ha-apex2 -> ha-apex1)
> Jun 14 15:52:23 apex1 pengine[18732]:  notice: Movesamba#011(Started
> ha-apex2 -> ha-apex1)
> Jun 14 15:52:23 apex1 pengine[18732]:  notice: Move  
>  database#011(Started ha-apex2 -> ha-apex1)
> Jun 14 15:52:23 apex1 pengine[18732]:  notice: Calculated transition
> 1744, saving inputs in /var/lib/pacemaker/pengine/pe-input-123.bz2
> Jun 14 15:52:23 apex1 crmd[18733]:  notice: Initiating stop operation
> vip_stop_0 on ha-apex2
> Jun 14 15:52:23 apex1 crmd[18733]:  notice: Initiating stop operation
> samba_stop_0 on ha-apex2
> Jun 14 15:52:23 apex1 crmd[18733]:  notice: Initiating stop operation
> database_stop_0 on ha-apex2
> Jun 14 15:52:26 apex1 crmd[18733]:  notice: Initiating stop operation
> filesystem_stop_0 on ha-apex2
> Jun 14 15:52:27 apex1 kernel: drbd apexdata apex2.br :
> peer( Primary -> Secondary )
> Jun 14 15:52:27 apex1 crmd[18733]:  notice: Initiating start operation
> filesystem_start_0 locally on ha-apex1
> 
> Jun 14 15:52:27 apex1 crmd[18733]:  notice: do_shutdown of peer ha-apex2
> is complete
> 
> Jun 14 15:52:27 apex1 attrd[18731]:  notice: Node ha-apex2 state is now lost
> Jun 14 15:52:27 apex1 attrd[18731]:  notice: Removing all ha-apex2
> attributes for peer loss
> Jun 14 15:52:27 apex1 attrd[18731]:  notice: Lost attribute writer ha-apex2
> Jun 14 15:52:27 apex1 attrd[18731]:  notice: Purged 1 peers with id=2
> and/or uname=ha-apex2 from the membership cache
> Jun 14 15:52:27 apex1 stonith-ng[18729]:  notice: Node ha-apex2 state is
> now lost
> Jun 14 15:52:27 apex1 stonith-ng[18729]:  notice: Purged 1 peers with
> id=2 and/or uname=ha-apex2 from the membership cache
> Jun 14 15:52:27 apex1 cib[18728]:  notice: Node ha-apex2 state is now lost
> Jun 14 15:52:27 apex1 cib[18728]:  notice: Purged 1 peers with id=2
> and/or uname=ha-apex2 from the membership cache
> 
> 
> 
> Best regards,
> Jaz. K

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] ClusterIP won't return to recovered node

2017-06-16 Thread Ken Gaillot
On 06/16/2017 01:18 PM, Dan Ragle wrote:
> 
> 
> On 6/12/2017 10:30 AM, Ken Gaillot wrote:
>> On 06/12/2017 09:23 AM, Klaus Wenninger wrote:
>>> On 06/12/2017 04:02 PM, Ken Gaillot wrote:
>>>> On 06/10/2017 10:53 AM, Dan Ragle wrote:
>>>>> So I guess my bottom line question is: How does one tell Pacemaker
>>>>> that
>>>>> the individual legs of globally unique clones should *always* be
>>>>> spread
>>>>> across the available nodes whenever possible, regardless of the number
>>>>> of processes on any one of the nodes? For kicks I did try:
>>>>>
>>>>> pcs constraint location ClusterIP:0 prefers node1-pcs=INFINITY
>>>>>
>>>>> but it responded with an error about an invalid character (:).
>>>> There isn't a way currently. It will try to do that when initially
>>>> placing them, but once they've moved together, there's no simple way to
>>>> tell them to move. I suppose a workaround might be to create a dummy
>>>> resource that you constrain to that node so it looks like the other
>>>> node
>>>> is less busy.
>>>
>>> Another ugly dummy resource idea - maybe less fragile -
>>> and not tried out:
>>> One could have 2 dummy resources that would rather like
>>> to live on different nodes - no issue with primitives - and
>>> do depend collocated on ClusterIP.
>>> Wouldn't that pull them apart once possible?
>>
>> Sounds like a good idea
> 
> H... still no luck with this.
> 
> Based on your suggestion, I thought this would work (leaving out all the
> status displays this time):
> 
> # pcs resource create Test1 systemd:test1
> # pcs resource create Test2 systemd:test2
> # pcs constraint location Test1 prefers node1-pcs=INFINITY
> # pcs constraint location Test2 prefers node1-pcs=INFINITY
> # pcs resource create Test3 systemd:test3
> # pcs resource create Test4 systemd:test4
> # pcs constraint location Test3 prefers node1-pcs=INFINITY
> # pcs constraint location Test4 prefers node2-pcs=INFINITY
> # pcs resource create ClusterIP ocf:heartbeat:IPaddr2 ip=162.220.75.138
> nic=bond0 cidr_netmask=24
> # pcs resource meta ClusterIP resource-stickiness=0
> # pcs resource clone ClusterIP clone-max=2 clone-node-max=2
> globally-unique=true
> # pcs constraint colocation add ClusterIP-clone with Test3 INFINITY
> # pcs constraint colocation add ClusterIP-clone with Test4 INFINITY
> 
> But that simply refuses to run ClusterIP at all ("Resource ClusterIP:0/1
> cannot run anywhere"). And if I change the last two colocation
> constraints to a numeric then it runs, but with the same problem I had
> before (both ClusterIP instances on one node).
> 
> I also tried it reversing the colocation definition (add Test3 with
> ClusterIP-clone) and trying differing combinations of scores between the
> location and colocation constraints, still with no luck.
> 
> Thanks,
> 
> Dan

Ah of course, the colocation with both means they all have to run on the
same node, which is impossible.

FYI you can create dummy resources with ocf:pacemaker:Dummy so you don't
have to write your own agents.

OK, this is getting even hackier, but I'm thinking you can use
utilization for this:

http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Pacemaker_Explained/index.html#idm139683960632560

* Create two dummy resources, each with a -INFINITY location preference
for one of the nodes, so each is allowed to run on only one node.

* Set the priority meta-attribute to a positive number on all your real
resources, and leave the dummies at 0 (so if the cluster can't run all
of them, it will stop the dummies first).

* Set placement-strategy=utilization.

* Define a utilization attribute, with values for each node and resource
like this:
** Set a utilization of 1 on all resources except the dummies and the
clone, so that their total utilization is N.
** Set a utilization of 100 on the dummies and the clone.
** Set a utilization capacity of 200 + N on each node.

(I'm assuming you never expect to have more than 99 other resources. If
that's not the case, just raise the 100 usage accordingly.)

With those values, if only one node is up, that node can host all the
real resources (including both clone instances), with the dummies
stopped. If both nodes are up, the only way the cluster can run all
resources (including the clone instances and dummies) is to spread the
clone instances out.

Again, it's hacky, and I haven't tested it, but I think it would work.
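
A rough pcs sketch of the above (untested; "capacity" is just a made-up
utilization attribute name, the node names are the ones from earlier in
the thread, and the node capacity of 205 assumes five other real
resources, i.e. 200 + N):

   pcs resource create pin1 ocf:pacemaker:Dummy
   pcs resource create pin2 ocf:pacemaker:Dummy
   pcs constraint location pin1 avoids node2-pcs=INFINITY
   pcs constraint location pin2 avoids node1-pcs=INFINITY
   pcs property set placement-strategy=utilization
   pcs resource utilization pin1 capacity=100
   pcs resource utilization pin2 capacity=100
   pcs resource utilization ClusterIP capacity=100
   pcs node utilization node1-pcs capacity=205
   pcs node utilization node2-pcs capacity=205

plus "pcs resource utilization <rsc> capacity=1" and "pcs resource meta
<rsc> priority=1" for each of the other real resources. Exact pcs syntax
may vary between versions.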

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Pacemaker shutting down peer node

2017-06-16 Thread Ken Gaillot
On 06/16/2017 11:21 AM, Jaz Khan wrote:
> Hi,
> 
> I have checked node ha-apex2.
> The log on that machine from /var/log/messages says "systemd: Power
> button pressed" and "Shutting down"  but this message appeared just
> when the ha-apex1 node scheduled the shutdown with difference in seconds.
> 
> It seems like the peer node (ha-apex1) has sent some kind of power off
> request and it obeyed to the request.
>  
> On node ha-apex1 it clearly says "Scheduling Node ha-apex2 for shutdown"
> which seems like it has scheduled this task to be executed on peer node.

That's the cluster's response to systemd's shutdown request.

Something in your system is triggering the "power button pressed" event.
I believe that message usually originates from /etc/acpi/powerbtn.sh.
(As an aside, it's usually a good idea to disable ACPI on servers.)

In my experience, "Power button pressed" usually means a real person
pushed a real power button. (My favorite time was when a hosting
provider labeled some physical machines incorrectly and kept rebooting a
server used by the company I worked for at the time, wondering why it
wasn't having the intended effect.) But I'm sure it's possible it's
being generated via IPMI or something.

I don't think any cluster fence agents could be the cause because you
don't see any fencing messages in your logs, and fence agents should
always use a hard poweroff, not something that can be intercepted by the OS.

> My servers are running in production, please help me out. I really do
> not want anything to happen to any of node. I hope you understand the
> seriousness of this issue.
> 
> NOTE: This didn't only happen on this cluster group of nodes. It also
> happened few times on another cluster group of machines as well.
> 
> Look at this two messages from ha-apex1 node.
> 
> Jun 14 15:52:23 apex1 pengine[18732]:  notice: Scheduling Node ha-apex2
> for shutdown
> 
> Jun 14 15:52:27 apex1 crmd[18733]:  notice: do_shutdown of peer ha-apex2
> is complete
> 
> 
> Best regards,
> Jaz
> 
> 
> 
> 
> 
> Message: 1
> Date: Thu, 15 Jun 2017 13:53:00 -0500
> From: Ken Gaillot <kgail...@redhat.com>
> To: users@clusterlabs.org
> Subject: Re: [ClusterLabs] Pacemaker shutting down peer node
> Message-ID: <5d122183-2030-050d-3a8e-9c158fa5f...@redhat.com>
> Content-Type: text/plain; charset=utf-8
> 
> On 06/15/2017 12:38 AM, Jaz Khan wrote:
> > Hi,
> >
> > I have been encountering this serious issue from past couple of
> months.
> > I really have no idea that why pacemaker sends shutdown signal to peer
> > node and it goes down. This is very strange and I am too much
> worried .
> >
> > This is not happening daily, but it surely does this kind of behavior
> > after every few days.
> >
> > Version:
> > Pacemaker 1.1.16
> > Corosync 2.4.2
> >
> > Please help me out with this bug! Below is the log message.
> >
> >
> >
> > Jun 14 15:52:23 apex1 crmd[18733]:  notice: State transition S_IDLE ->
> > S_POLICY_ENGINE
> > Jun 14 15:52:23 apex1 pengine[18732]:  notice: On loss of CCM
> Quorum: Ignore
> >
> > Jun 14 15:52:23 apex1 pengine[18732]:  notice: Scheduling Node
> ha-apex2
> > for shutdown
> 
> This is not a fencing, but a clean shutdown. Normally this only happens
> in response to a user request.
> 
> Check the logs on both nodes before this point, to try to see what was
> the first indication that it would shut down.
> 
> >
> > Jun 14 15:52:23 apex1 pengine[18732]:  notice: Movevip#011(Started
> > ha-apex2 -> ha-apex1)
> > Jun 14 15:52:23 apex1 pengine[18732]:  notice: Move
> >  filesystem#011(Started ha-apex2 -> ha-apex1)
> > Jun 14 15:52:23 apex1 pengine[18732]:  notice: Move   
> samba#011(Started
> > ha-apex2 -> ha-apex1)
> > Jun 14 15:52:23 apex1 pengine[18732]:  notice: Move
> >  database#011(Started ha-apex2 -> ha-apex1)
> > Jun 14 15:52:23 apex1 pengine[18732]:  notice: Calculated transition
> > 1744, saving inputs in /var/lib/pacemaker/pengine/pe-input-123.bz2
> > Jun 14 15:52:23 apex1 crmd[18733]:  notice: Initiating stop operation
> > vip_stop_0 on ha-apex2
> > Jun 14 15:52:23 apex1 crmd[18733]:  notice: Initiating stop operation
> > samba_stop_0 on ha-apex2
&g

Re: [ClusterLabs] How to fence cluster node when SAN filesystem fail

2017-05-02 Thread Ken Gaillot
Hi,

Upstream documentation on fencing in Pacemaker is available at:

http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Pacemaker_Explained/index.html#idm139683949958512

Higher-level tools such as crm shell and pcs make it easier; see their
man pages and other documentation for details.


On 05/01/2017 10:35 PM, Albert Weng wrote:
> Hi All,
> 
> My environment :
> (1) two node (active/passive) pacemaker cluster
> (2) SAN storage attached, add resource type "filesystem"
> (3) OS : RHEL 7.2
> 
> In old version of RHEL cluster, when attached SAN storage path lost(ex.
> filesystem fail),
> active node will trigger fence device to reboot itself.
> 
> but when i use pacemaker on RHEL cluster, when i remove fiber cable on
> active node, all resources failover to passive node normally, but active
> node doesn't reboot.
> 
> how to trigger fence reboot action when SAN filesystem lost?
> 
> Thank a lot~~~
> 
> 
> -- 
> Kind regards,
> Albert Weng
> 
> 
>   Virus-free. www.avast.com

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] crm_mon -h (writing to a html-file) not showing all desired information and having trouble with the -d option

2017-05-08 Thread Ken Gaillot
On 05/08/2017 11:13 AM, Lentes, Bernd wrote:
> Hi,
> 
> playing around with my cluster i always have a shell with crm_mon running 
> because it provides me a lot of useful and current information concerning 
> cluster, nodes, resources ...
> Normally i have a "crm_mon -nrfRAL" running.
> I'd like to have that output as a web page too.
> So i tried the option -h.
> I have crm_mon from pacemaker 1.1.12 on a SLES 11 SP4 box. I'm writing the 
> file to /srv/www/hawk/public/crm_mon.html.
> I have hawk running, so i don't need an extra webserver for that.
> 
> First, i was very astonished when i used the option -d (daemonize). Using 
> that hawk does not find the html-file, although i see it in the fs, and it's 
> looking good.
> Hawk (or lighttpd) throws an error 404. Without -d lighttpd finds the files 
> and presents it via browser !?!
> 
> This is the file without -d:
> 
> ha-idg-2:/srv/www/hawk/public # stat crm_mon.html
>   File: `crm_mon.html'
>   Size: 1963Blocks: 8  IO Block: 4096   regular file
> Device: 1fh/31d Inode: 7082Links: 1
> Access: (0644/-rw-r--r--)  Uid: (0/root)   Gid: (0/root)
> Access: 2017-05-08 18:03:25.695754151 +0200
> Modify: 2017-05-08 18:03:20.875680374 +0200
> Change: 2017-05-08 18:03:20.875680374 +0200
> 
> 
> Same file with crm_mon -d:
> 
> ha-idg-2:/srv/www/hawk/public # stat crm_mon.html
>   File: `crm_mon.html'
>   Size: 1963Blocks: 8  IO Block: 4096   regular file
> Device: 1fh/31d Inode: 7084Links: 1
> Access: (0640/-rw-r-----)  Uid: (0/root)   Gid: (0/root)

The "other" bit is gone, is that it?

> Access: 2017-05-08 18:04:16.048524856 +0200
> Modify: 2017-05-08 18:04:16.048524856 +0200
> Change: 2017-05-08 18:04:16.048524856 +0200
>  Birth: -
> 
> I see no important difference, just the different inode.
> 
> This is the access.log from lighttpd:
> 
> 10.35.34.70 ha-idg-2:7630 - [08/May/2017:18:04:10 +0200] "GET /crm_mon.html 
> HTTP/1.1" 200 563 "https://ha-idg-2:7630/crm_mon.html; "Mozilla/5.0 (Windows 
> NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) 
> Chrome/57.0.2987.133 Safa
> ri/537.36"
> 10.35.34.70 ha-idg-2:7630 - [08/May/2017:18:04:15 +0200] "GET /crm_mon.html 
> HTTP/1.1" 200 563 "https://ha-idg-2:7630/crm_mon.html; "Mozilla/5.0 (Windows 
> NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) 
> Chrome/57.0.2987.133 Safa
> ri/537.36"
> 10.35.34.70 ha-idg-2:7630 - [08/May/2017:18:04:20 +0200] "GET /crm_mon.html 
> HTTP/1.1" 404 1163 "https://ha-idg-2:7630/crm_mon.html; "Mozilla/5.0 (Windows 
> NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) 
> Chrome/57.0.2987.133 Saf
> ari/537.36"
> 
> It simply changes from http status code 200 to 404. Why ? 
> 
> And using "crm_mon -nfrotRALV -h /srv/www/hawk/public/crm_mon.html" i get the 
> following output:
> 
> 
> Cluster summary
> 
> Last updated: Mon May 8 18:08:58 2017
> Current DC: ha-idg-2
> 2 Nodes configured.
> 14 Resources configured.
> Config Options
> 
> STONITH of failed nodes   :   enabled
> Cluster is:   symmetric
> No Quorum Policy  :   Ignore
> Node List
> 
> Node: ha-idg-1: online
> prim_clvmd(ocf::lvm2:clvmd):  Started 
> prim_stonith_ipmi_ha-idg-2(stonith:external/ipmi):Started 
> prim_ocfs2(ocf::ocfs2:o2cb):  Started 
> prim_vm_mausdb(ocf::heartbeat:VirtualDomain): Started 
> prim_vg_cluster_01(ocf::heartbeat:LVM):   Started 
> prim_fs_lv_xml_vm (ocf::heartbeat:Filesystem):Started 
> prim_dlm  (ocf::pacemaker:controld):  Started 
> prim_vnc_ip_mausdb(ocf::lentes:IPaddr):   Started 
> Node: ha-idg-2: online
> prim_clvmd(ocf::lvm2:clvmd):  Started 
> prim_stonith_ipmi_ha-idg-1(stonith:external/ipmi):Started 
> prim_ocfs2(ocf::ocfs2:o2cb):  Started 
> prim_vg_cluster_01(ocf::heartbeat:LVM):   Started 
> prim_fs_lv_xml_vm (ocf::heartbeat:Filesystem):Started 
> prim_dlm  (ocf::pacemaker:controld):  Started 
> Inactive Resources
> 
> I'm missing the constraints, operations and timing details. How can i get 
> them ?
> 
> 
> Bernd

The crm_mon HTML code doesn't get many reports/requests/submissions from
users, so it doesn't get a lot of attention. I wouldn't be too surprised
if there are some loose ends.

I'm not sure why those sections wouldn't appear. The code for it seems
to be there.



___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Instant service restart during failback

2017-05-08 Thread Ken Gaillot
If you look in the logs when the node comes back, there should be some
"pengine:" messages noting that the restarts will be done, and then a
"saving inputs in " message. If you can attach that file (both
with and without the constraint changes would be ideal), I'll take a
look at it.

On 04/21/2017 05:26 AM, Euronas Support wrote:
> Seems that replacing inf: with 0: in some colocation constraints fixes the 
> problem, but still cannot understand why it worked for one node and not for 
> the other.
> 
> On 20.4.2017 12:16:02 Klechomir wrote:
>> Hi Klaus,
>> It would have been too easy if it was interleave.
>> All my cloned resoures have interlave=true, of course.
>> What bothers me more is that the behaviour is asymmetrical.
>>
>> Regards,
>> Klecho
>>
>> On 20.4.2017 10:43:29 Klaus Wenninger wrote:
>>> On 04/20/2017 10:30 AM, Klechomir wrote:
 Hi List,
 Been investigating the following problem recently:

 Have two node cluster with 4 cloned (2 on top of 2) + 1 master/slave
 services on it (corosync+pacemaker 1.1.15)
 The failover works properly for both nodes, i.e. when one node is
 restarted/turned in standby, the other properly takes over, but:

 Every time when node2 has been in standby/turned off and comes back,
 everything recovers propery.
 Every time when node1 has been in standby/turned off and comes back,
 part
 of the cloned services on node2 are getting instantly restarted, at the
 same second when node1 re-appeares, without any apparent reason (only
 the
 stop/start messages in the debug).

 Is there some known possible reason for this?
>>>
>>> That triggers some deja-vu feeling...
>>> Did you have a similar issue a couple of weeks ago?
>>> I remember in that particular case 'interleave=true' was not the
>>> solution to the problem but maybe here ...
>>>
>>> Regards,
>>> Klaus
>>>
 Best regards,
 Klecho

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] pacemaker daemon shutdown time with lost remote node

2017-05-08 Thread Ken Gaillot
On 04/28/2017 02:22 PM, Radoslaw Garbacz wrote:
> Hi,
> 
> I have a question regarding pacemaker daemon shutdown
> procedure/configuration.
> 
> In my case, when a remote node is lost pacemaker needs exactly 10minutes
> to shutdown, during which there is nothing logged.
> So my questions:
> 1. What is pacemaker doing at this time?
> 2. How to make it shorter?

The logs from the other nodes will be helpful. One of the nodes will be
the DC, and will have all the scheduled commands.

Generally, in a shutdown, pacemaker first tries to stop all resources.
If one of those stops is either taking a long time or timing out, that
might explain it.

> Changed Pacemaker Configuration:
> - cluster-delay
> - dc-deadtime
> 
> 
> Pacemaker Logs:
> Apr 28 17:38:08 [17689] ip-10-41-177-183 pacemakerd:   notice:
> crm_signal_dispatch: Caught 'Terminated' signal | 15 (invoking handler)
> Apr 28 17:38:08 [17689] ip-10-41-177-183 pacemakerd:   notice:
> pcmk_shutdown_worker:Shutting down Pacemaker
> Apr 28 17:38:08 [17689] ip-10-41-177-183 pacemakerd:   notice:
> stop_child:  Stopping crmd | sent signal 15 to process 17698
> Apr 28 17:48:07 [17695] ip-10-41-177-183   lrmd: info:
> cancel_recurring_action: Cancelling ocf operation
> monitor_head_monitor_191000
> Apr 28 17:48:07 [17695] ip-10-41-177-183   lrmd: info:
> log_execute: executing - rsc:monitor_head action:stop call_id:130
> [...]
> Apr 28 17:48:07 [17689] ip-10-41-177-183 pacemakerd: info: main:   
> Exiting pacemakerd
> Apr 28 17:48:07 [17689] ip-10-41-177-183 pacemakerd: info:
> crm_xml_cleanup: Cleaning up memory from libxml2
> 
> 
> Pacemaker built from github: 1.16
> 
> 
> Help greatly appreciated.
> 
> -- 
> Best Regards,
> 
> Radoslaw Garbacz
> XtremeData Incorporated

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Antw: Antw: notice: throttle_handle_load: High CPU load detected

2017-05-08 Thread Ken Gaillot
On 05/05/2017 12:37 AM, jitendra.jaga...@dell.com wrote:
>  
> 
> Hello All,
> 
>  
> 
> Sorry for resurrecting old thread.
> 
>  
> 
> I am also observing “High CPU load detected" messages in the logs
> 
>  
> 
> In this email chain, I see everyone is suggesting to change
> "load-threshold" settings
> 
>  
> 
> But I am not able to find any good information about “load-threshold”
> except this https://www.mankier.com/7/crmd
> 
>  
> 
> Even in Pacemaker document
> “http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/pdf/Pacemaker_Explained/Pacemaker-1.1-Pacemaker_Explained-en-US.pdf”
> 
>  
> 
> There is not much detail about “load-threshold”.
> 
>  
> 
> Please can someone share steps or any commands to modify “load-threshold”.
> 
>  
> 
> Thanks
> 
> Jitendra

Hi Jitendra,

Those messages indicate there is a real issue with the CPU load. When
the cluster notices high load, it reduces the number of actions it will
execute at the same time. This is generally a good idea, to avoid making
the load worse.

The messages don't hurt anything, they just let you know that there is
something worth investigating.

If you've investigated the load and it's not something to be concerned
about, you can change load-threshold to adjust what the cluster
considers "high". The load-threshold works like this:

* It defaults to 0.8 (which means pacemaker should try to avoid
consuming more than 80% of the system's resources).

* On a single-core machine, load-threshold is multiplied by 0.6 (because
with only one core you *really* don't want to consume too many
resources); on a multi-core machine, load-threshold is multiplied by the
number of cores (to normalize the system load per core).

* That number is then multiplied by 1.2 to get the "Noticeable CPU load
detected" message (debug level), by 1.6 to get the "Moderate CPU load"
message, and 2.0 to get the "High CPU load" message. These are measured
against the 1-minute system load average (the same number you would get
with top, uptime, etc.).
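
For example, on a 24-core machine with the default load-threshold of 0.8,
the adjusted target is 0.8 * 24 = 19.2, so "Noticeable CPU load" appears
above a 1-minute load average of about 23 (19.2 * 1.2), "Moderate CPU
load" above about 31 (19.2 * 1.6), and "High CPU load" above about 38
(19.2 * 2.0).

load-threshold itself is a cluster property, so changing it would look
something like this (untested; pcs may want --force for option names it
doesn't know about, and crm shell users can set the same property under
crm_config):

   pcs property set load-threshold=160%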

So, if you raise load-threshold above 0.8, you won't see the log
messages until the load gets even higher. But, that doesn't do anything
about the actual load problem.

> *From:*Kostiantyn Ponomarenko [mailto:konstantin.ponomare...@gmail.com]
> *Sent:* Tuesday, April 5, 2016 8:37 AM
> *To:* kgail...@redhat.com
> *Cc:* Cluster Labs - All topics related to open-source clustering
> welcomed <users@clusterlabs.org>
> *Subject:* Re: [ClusterLabs] Antw: Antw: notice: throttle_handle_load:
> High CPU load detected
> 
>  
> 
> Thank you, Ken.
> 
> This helps a lot.
> 
> Now I am sure that my current approach fits best for me =)
> 
> 
> Thank you,
> 
> Kostia
> 
>  
> 
> On Wed, Mar 30, 2016 at 11:10 PM, Ken Gaillot <kgail...@redhat.com> wrote:
> 
> On 03/29/2016 08:22 AM, Kostiantyn Ponomarenko wrote:
> > Ken, thank you for the answer.
> >
> > Every node in my cluster under normal conditions has "load average" of
> > about 420. It is mainly connected to the high disk IO on the system.
> > My system is designed to use almost 100% of its hardware
> (CPU/RAM/disks),
> > so the situation when the system consumes almost all HW resources is
> > normal.
> 
> 420 suggests that HW resources are outstripped -- anything above the
> system's number of cores means processes are waiting for some resource.
> (Although with an I/O-bound workload like this, the number of cores
> isn't very important -- most will be sitting idle despite the high
> load.) And if that's during normal conditions, what will happen during a
> usage spike? It sounds like a recipe for less-than-HA.
> 
> Under high load, there's a risk of negative feedback, where monitors
> time out, causing pacemaker to schedule recovery actions, which cause
> load to go higher and more monitors to time out, etc. That's why
> throttling is there.
> 
> > I would like to get rid of "High CPU load detected" messages in the
> > log, because
> > they flood corosync.log as well as system journal.
> >
> > Maybe you can give an advice what would be the best way do to it?
> >
> > So far I came up with the idea of setting "load-threshold" to 1000% ,
> > because of:
> > 420(load average) / 24 (cores) = 17.5 (adjusted_load);
> > 2 (THROTLE_FACTOR_HIGH) * 10 (throttle_load_target) = 20
> >
> > if(adjusted_load > THROTTLE_FACTOR_HIGH * throttle_load_target) {
> > crm_notice(&qu

Re: [ClusterLabs] Antw: Behavior after stop action failure with the failure-timeout set and STONITH disabled

2017-05-08 Thread Ken Gaillot
On 05/05/2017 07:49 AM, Jan Wrona wrote:
> On 5.5.2017 08:15, Ulrich Windl wrote:
> Jan Wrona  schrieb am 04.05.2017 um 16:41 in
> Nachricht
>> :
>>> I hope I'll be able to explain the problem clearly and correctly.
>>>
>>> My setup (simplified): I have two cloned resources, a filesystem mount
>>> and a process which writes to that filesystem. The filesystem is Gluster
>>> so its OK to clone it. I also have a mandatory ordering constraint
>>> "start gluster-mount-clone then start writer-process-clone". I don't
>>> have a STONITH device, so I've disable STONITH by setting
>>> stonith-enabled=false.
>>>
>>> The problem: Sometimes the Gluster freezes for a while, which causes the
>>> gluster-mount resource's monitor with the OCF_CHECK_LEVEL=20 to timeout
>>> (it is unable to write the status file). When this happens, the cluster

Have you tried increasing the monitor timeouts?

>> Actually I would do two things:
>>
>> 1) Find out why Gluster freezes, and what to do to avoid that
> 
> It freezes when one of the underlying MD RAIDs starts its regular check.
> I've decreased its speed limit (from the default 200 MB/s to the 50
> MB/s, I cannot go any lower), but it helped only a little, the mount
> still tends to freeze for a few seconds during the check.
> 
>>
>> 2) Implement stonith
> 
> Currently I can't. But AFAIK Pacemaker should work properly even with
> disabled STONITH and the state I've run into doesn't seem right to me at
> all. I was asking for clarification of what the cluster is trying to do
> in such situation, I don't understand the "Ignoring expired calculated
> failure" log messages and I don't understand why the crm_mon was showing
> that the writer-process is started even though it was not.

Pacemaker can work without stonith, but there are certain failure
situations that can't be recovered any other way, so whether that's
working "properly" is a matter of opinion. :-) In this particular case,
stonith doesn't make the situation much better -- you want to prevent
the need for stonith to begin with (hopefully increasing the monitor
timeouts is sufficient). But stonith is still good to have for other
situations.

The cluster shows the service as started because it determines the state
by the service's operation history:

   successful start at time A = started
   successful start at time A + failed stop at time B = started (failed)
   after failure expires, back to: successful start at time A = started

If the service is not actually running at that point, the next recurring
monitor should detect that.
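
(For example, the operation history the cluster is going by can be seen
with something like "crm_mon -1 -o", which lists recent operations per
resource.)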

>> Regards,
>> Ulrich
>>
>>
>>> tries to recover by restarting the writer-process resource. But the
>>> writer-process is writing to the frozen filesystem which makes it
>>> uninterruptable, not even SIGKILL works. Then the stop operation times
>>> out and on-fail with disabled STONITH defaults to block (don’t perform
>>> any further operations on the resource):
>>> warning: Forcing writer-process-clone away from node1.example.org after
>>> 100 failures (max=100)
>>> After that, the cluster continues with the recovery process by
>>> restarting the gluster-mount resource on that node and it usually
>>> succeeds. As a consequence of that remount, the uninterruptable system
>>> call in the writer process fails, signals are finally delivered and the
>>> writer-process is terminated. But the cluster doesn't know about that!
>>>
>>> I thought I can solve this by setting the failure-timeout meta attribute
>>> to the writer-process resource, but it only made things worse. The
>>> documentation states: "Stop failures are slightly different and crucial.
>>> ... If a resource fails to stop and STONITH is not enabled, then the
>>> cluster has no way to continue and will not try to start the resource
>>> elsewhere, but will try to stop it again after the failure timeout.",

The documentation is silently making the assumption that the condition
that led to the initial stop is still true. In this case, if the gluster
failure has long since been cleaned up, there is no reason to try to
stop the writer-process.

>>> but I'm seeing something different. When the policy engine is launched
>>> after the nearest cluster-recheck-interval, following lines are written
>>> to the syslog:
>>> crmd[11852]: notice: State transition S_IDLE -> S_POLICY_ENGINE
>>> pengine[11851]:  notice: Clearing expired failcount for writer-process:1
>>> on node1.example.org
>>> pengine[11851]:  notice: Clearing expired failcount for writer-process:1
>>> on node1.example.org
>>> pengine[11851]:  notice: Ignoring expired calculated failure
>>> writer-process_stop_0 (rc=1,
>>> magic=2:1;64:557:0:2169780b-ca1f-483e-ad42-118b7c7c1a7d) on
>>> node1.example.org
>>> pengine[11851]:  notice: Clearing expired failcount for writer-process:1
>>> on node1.example.org
>>> pengine[11851]:  notice: Ignoring expired calculated failure
>>> writer-process_stop_0 (rc=1,
>>> 

Re: [ClusterLabs] stonith device locate on same host in active/passive cluster

2017-05-04 Thread Ken Gaillot
On 05/03/2017 09:04 PM, Albert Weng wrote:
> Hi Marek,
> 
> Thanks your reply.
> 
> On Tue, May 2, 2017 at 5:15 PM, Marek Grac wrote:
> 
> 
> 
> On Tue, May 2, 2017 at 11:02 AM, Albert Weng wrote:
> 
> 
> Hi Marek,
> 
> thanks for your quickly responding.
> 
> According to you opinion, when i type "pcs status" then i saw
> the following result of fence :
> ipmi-fence-node1(stonith:fence_ipmilan):Started cluaterb
> ipmi-fence-node2(stonith:fence_ipmilan):Started clusterb
> 
> Does it means both ipmi stonith devices are working correctly?
> (rest of resources can failover to another node correctly)
> 
> 
> Yes, they are working correctly. 
> 
> When it becomes important to run fence agents to kill the other
> node. It will be executed from the other node, so the fact where
> fence agent resides currently is not important
> 
> Does "started on node" means which node is controlling fence behavior?
> even all fence agents and resources "started on same node", the cluster
> fence behavior still work correctly?
>  
> 
> Thanks a lot.
> 
> m,

Correct. Fencing is *executed* independently of where or even whether
fence devices are running. The node that is "running" a fence device
performs the recurring monitor on the device; that's the only real effect.
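
For example, to see which devices the cluster thinks can fence a given
node, something like this should work with your node names:

   stonith_admin -l clustera
   stonith_admin -l clusterb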

> should i have to use location constraint to avoid stonith device
> running on same node ?
> # pcs constraint location ipmi-fence-node1 prefers clustera
> # pcs constraint location ipmi-fence-node2 prefers clusterb
> 
> thanks a lot

It's a good idea, so that a node isn't monitoring its own fence device,
but that's the only reason -- it doesn't affect whether or how the node
can be fenced. I would configure it as an anti-location, e.g.

   pcs constraint location ipmi-fence-node1 avoids node1=100

In a 2-node cluster, there's no real difference, but in a larger
cluster, it's the simplest config. I wouldn't use INFINITY (there's no
harm in a node monitoring its own fence device if it's the last node
standing), but I would use a score high enough to outweigh any stickiness.
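
With the names from your setup, that would be something like (untested;
ipmi-fence-node1 is the device that fences clustera in your original
config, and ipmi-fence-node2 the one that fences clusterb):

   pcs constraint location ipmi-fence-node1 avoids clustera=100
   pcs constraint location ipmi-fence-node2 avoids clusterb=100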

> On Tue, May 2, 2017 at 4:25 PM, Marek Grac wrote:
> 
> Hi,
> 
> 
> 
> On Tue, May 2, 2017 at 3:39 AM, Albert Weng wrote:
> 
> Hi All,
> 
> I have created active/passive pacemaker cluster on RHEL 7.
> 
> here is my environment:
> clustera : 192.168.11.1
> clusterb : 192.168.11.2
> clustera-ilo4 : 192.168.11.10
> clusterb-ilo4 : 192.168.11.11
> 
> both nodes are connected SAN storage for shared storage.
> 
> i used the following cmd to create my stonith devices on
> each node :
> # pcs -f stonith_cfg stonith create ipmi-fence-node1
> fence_ipmilan parms lanplus="ture"
> pcmk_host_list="clustera" pcmk_host_check="static-list"
> action="reboot" ipaddr="192.168.11.10"
> login=adminsitrator passwd=1234322 op monitor interval=60s
> 
> # pcs -f stonith_cfg stonith create ipmi-fence-node02
> fence_ipmilan parms lanplus="true"
> pcmk_host_list="clusterb" pcmk_host_check="static-list"
> action="reboot" ipaddr="192.168.11.11" login=USERID
> passwd=password op monitor interval=60s
> 
> # pcs status
> ipmi-fence-node1 clustera
> ipmi-fence-node2 clusterb
> 
> but when i failover to passive node, then i ran
> # pcs status
> 
> ipmi-fence-node1clusterb
> ipmi-fence-node2clusterb
> 
> why both fence device locate on the same node ? 
> 
> 
> When node 'clustera' is down, is there any place where
> ipmi-fence-node* can be executed?
> 
> If you are worrying that node can not self-fence itself you
> are right. But if 'clustera' will become available then
> attempt to fence clusterb will work as expected.
> 
> m, 
> 

[ClusterLabs] Pacemaker 1.1.17-rc1 now available

2017-05-08 Thread Ken Gaillot
Source code for the first release candidate for Pacemaker version 1.1.17
is now available at:

https://github.com/ClusterLabs/pacemaker/releases/tag/Pacemaker-1.1.17-rc1

The most significant enhancements in this release are:

* A new "bundle" resource type simplifies launching resources inside
Docker containers. This feature is considered experimental for this
release. It was discussed in detail previously:

  http://lists.clusterlabs.org/pipermail/users/2017-April/005380.html

A walk-through is available on the ClusterLabs wiki for anyone who wants
to experiment with the feature:

  http://wiki.clusterlabs.org/wiki/Bundle_Walk-Through

* A new environment variable PCMK_node_start_state can specify that a
node should start in standby mode. It was also discussed previously:

  http://lists.clusterlabs.org/pipermail/users/2017-April/005607.html

* The "crm_resource --cleanup" and "crm_failcount" commands can now
operate on a single operation type (previously, they could only operate
on all operations at once). This is part of an underlying switch to
tracking failure counts per operation, also discussed previously:

  http://lists.clusterlabs.org/pipermail/users/2017-April/005391.html

* Several command-line tools have new options, including "crm_resource
--validate" to run a resource agent's validate-all action,
"stonith_admin --list-targets" to list all potential targets of a fence
device, and "crm_attribute --pattern" to update or delete all node
attributes matching a regular expression

* The cluster's handling of fence failures has been improved. Among the
changes, a new "stonith-max-attempts" cluster option specifies how many
times fencing can fail for a target before the cluster will no longer
immediately re-attempt it (previously hard-coded at 10).

* Location constraints using rules may now compare a node attribute
against a resource parameter, using the new "value-source" field.
Previously, node attributes could only be compared against literal
values. This is most useful in combination with rsc-pattern to apply the
constraint to multiple resources.

As usual, to support the new features, the CRM feature set has been
incremented. This means that mixed-version clusters are supported only
during a rolling upgrade -- nodes with an older version will not be
allowed to rejoin once they shut down.

For a more detailed list of bug fixes and other changes, see the change log:

https://github.com/ClusterLabs/pacemaker/blob/1.1/ChangeLog

Everyone is encouraged to download, compile and test the new release. We
do many regression tests and simulations, but we can't cover all
possible use cases, so your feedback is important and appreciated.

Many thanks to all contributors of source code to this release,
including Alexandra Zhuravleva, Andrew Beekhof, Aravind Kumar, Eric
Marques, Ferenc Wágner, Yan Gao, Hayley Swimelar, Hideo Yamauchi, Igor
Tsiglyar, Jan Pokorný, Jehan-Guillaume de Rorthais, Ken Gaillot, Klaus
Wenninger, Kristoffer Grönlund, Michal Koutný, Nate Clark, Patrick
Hemmer, Sergey Mishin, Vladislav Bogdanov, and Yusuke Iida. Apologies if
I have overlooked anyone.
-- 
Ken Gaillot <kgail...@redhat.com>

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Node attribute disappears when pacemaker is started

2017-05-25 Thread Ken Gaillot
On 05/24/2017 05:13 AM, 井上 和徳 wrote:
> Hi,
> 
> After loading the node attribute, when I start pacemaker of that node, the 
> attribute disappears.
> 
> 1. Start pacemaker on node1.
> 2. Load configure containing node attribute of node2.
>(I use multicast addresses in corosync, so did not set "nodelist {nodeid: 
> }" in corosync.conf.)
> 3. Start pacemaker on node2, the node attribute that should have been load 
> disappears.
>Is this specifications ?

Hi,

No, this should not happen for a permanent node attribute.

Transient node attributes (status-attr in crm shell) are erased when the
node starts, so it would be expected in that case.

I haven't been able to reproduce this with a permanent node attribute.
Can you attach logs from both nodes around the time node2 is started?

> 
> 1.
> [root@rhel73-1 ~]# systemctl start corosync;systemctl start pacemaker
> [root@rhel73-1 ~]# crm configure show
> node 3232261507: rhel73-1
> property cib-bootstrap-options: \
>   have-watchdog=false \
>   dc-version=1.1.17-0.1.rc2.el7-524251c \
>   cluster-infrastructure=corosync
> 
> 2.
> [root@rhel73-1 ~]# cat rhel73-2.crm
> node rhel73-2 \
>   utilization capacity="2" \
>   attributes attrname="attr2"
> 
> [root@rhel73-1 ~]# crm configure load update rhel73-2.crm
> [root@rhel73-1 ~]# crm configure show
> node 3232261507: rhel73-1
> node rhel73-2 \
>   utilization capacity=2 \
>   attributes attrname=attr2
> property cib-bootstrap-options: \
>   have-watchdog=false \
>   dc-version=1.1.17-0.1.rc2.el7-524251c \
>   cluster-infrastructure=corosync
> 
> 3.
> [root@rhel73-1 ~]# ssh rhel73-2 'systemctl start corosync;systemctl start 
> pacemaker'
> [root@rhel73-1 ~]# crm configure show
> node 3232261507: rhel73-1
> node 3232261508: rhel73-2
> property cib-bootstrap-options: \
>   have-watchdog=false \
>   dc-version=1.1.17-0.1.rc2.el7-524251c \
>   cluster-infrastructure=corosync
> 
> Regards,
> Kazunori INOUE

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] clearing failed actions

2017-05-31 Thread Ken Gaillot
On 05/31/2017 12:17 PM, Ken Gaillot wrote:
> On 05/30/2017 02:50 PM, Attila Megyeri wrote:
>> Hi Ken,
>>
>>
>>> -Original Message-
>>> From: Ken Gaillot [mailto:kgail...@redhat.com]
>>> Sent: Tuesday, May 30, 2017 4:32 PM
>>> To: users@clusterlabs.org
>>> Subject: Re: [ClusterLabs] clearing failed actions
>>>
>>> On 05/30/2017 09:13 AM, Attila Megyeri wrote:
>>>> Hi,
>>>>
>>>>
>>>>
>>>> Shouldn't the
>>>>
>>>>
>>>>
>>>> cluster-recheck-interval="2m"
>>>>
>>>>
>>>>
>>>> property instruct pacemaker to recheck the cluster every 2 minutes and
>>>> clean the failcounts?
>>>
>>> It instructs pacemaker to recalculate whether any actions need to be
>>> taken (including expiring any failcounts appropriately).
>>>
>>>> At the primitive level I also have a
>>>>
>>>>
>>>>
>>>> migration-threshold="30" failure-timeout="2m"
>>>>
>>>>
>>>>
>>>> but whenever I have a failure, it remains there forever.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> What could be causing this?
>>>>
>>>>
>>>>
>>>> thanks,
>>>>
>>>> Attila
>>> Is it a single old failure, or a recurring failure? The failure timeout
>>> works in a somewhat nonintuitive way. Old failures are not individually
>>> expired. Instead, all failures of a resource are simultaneously cleared
>>> if all of them are older than the failure-timeout. So if something keeps
>>> failing repeatedly (more frequently than the failure-timeout), none of
>>> the failures will be cleared.
>>>
>>> If it's not a repeating failure, something odd is going on.
>>
>> It is not a repeating failure. Let's say that a resource fails for whatever 
>> action, It will remain in the failed actions (crm_mon -Af) until I issue a 
>> "crm resource cleanup ". Even after days or weeks, even 
>> though I see in the logs that cluster is rechecked every 120 seconds.
>>
>> How could I troubleshoot this issue?
>>
>> thanks!
> 
> 
> Ah, I see what you're saying. That's expected behavior.
> 
> The failure-timeout applies to the failure *count* (which is used for
> checking against migration-threshold), not the failure *history* (which
> is used for the status display).
> 
> The idea is to have it no longer affect the cluster behavior, but still
> allow an administrator to know that it happened. That's why a manual
> cleanup is required to clear the history.

Hmm, I'm wrong there ... failure-timeout does expire the failure history
used for status display.

It works with the current versions. It's possible 1.1.10 had issues with
that.

Check the status to see which node is DC, and look at the pacemaker log
there after the failure occurred. There should be a message about the
failcount expiring. You can also look at the live CIB and search for
last_failure to see what is used for the display.
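
For example, something like:

   crm_mon -1 | grep "Current DC"          # find the DC
   cibadmin --query | grep last_failure    # see the stored failure entries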

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[ClusterLabs] Pacemaker 1.1.17 Release Candidate 3

2017-05-31 Thread Ken Gaillot
The third release candidate for Pacemaker version 1.1.17 is now
available at:

https://github.com/ClusterLabs/pacemaker/releases/tag/Pacemaker-1.1.17-rc3

Significant changes in this release:

* This release adds support for setting meta-attributes on the new
bundle resource type, which will be inherited by the bundle's component
resources. This allows features such target-role, is-managed,
maintenance mode, etc., to work with bundles.

* A node joining a cluster no longer forces a write-out of all node
attributes when atomic attrd is in use, as this is only necessary with
legacy attrd (which is used on legacy cluster stacks such as heartbeat,
corosync 1, and CMAN). This improves scalability, as the write-out could
cause a surge in IPC traffic that causes problems in large clusters.

* Recovery of failed Pacemaker Remote connections now avoids restarting
resources on the Pacemaker Remote node unless necessary.

Testing and feedback is welcome!
-- 
Ken Gaillot <kgail...@redhat.com>

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Pacemaker occasionally takes minutes to respond

2017-05-31 Thread Ken Gaillot
On 05/24/2017 08:04 AM, Attila Megyeri wrote:
> Hi Klaus,
> 
> Thank you for your response.
> I tried many things, but no luck.
> 
> We have many pacemaker clusters with 99% identical configurations, package 
> versions, and only this one causes issues. (BTW we use unicast for corosync, 
> but this is the same for our other clusters as well.)
> I checked all connection settings between the nodes (to confirm there are no 
> firewall issues), increased the number of cores on each node, but still - as 
> long as a monitor operation is pending for a resource, no other operation is 
> executed.
> 
> e.g. resource A is being monitored, and timeout is 90 seconds, until this 
> check times out I cannot do a cleanup or start/stop on any other resource.

Do you have any constraints configured? If B depends on A, you probably
want at least an ordering constraint. Then the cluster would stop B
before stopping A, and not try to start it until A is up again.
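
For example (sketch only, using your A/B names):

   pcs constraint order start A then start B
   pcs constraint colocation add B with A INFINITY   # if B must also be on A's node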

Throttling based on load wasn't added until Pacemaker 1.1.11, so the
only limit on parallel execution in 1.1.10 was batch-limit, which
defaulted to 30 at the time.

I'd investigate by figuring out which node was DC at the time and
checking its pacemaker log (preferably with PCMK_debug=crmd turned on).
You can see each run of the policy engine and what decisions were made,
ending with a message like "saving inputs in
/var/lib/pacemaker/pengine/pe-input-4940.bz2". You can run crm_simulate
on that file to get more information about the decision-making process.

"crm_simulate -Sx $FILE -D transition.dot" will create a dot graph of
the transition showing dependencies. You can convert the graph to an svg
with "dot transition.dot -Tsvg > transition.svg" and then look at that
file in any SVG viewer (including most browsers).

> Two more interesting things: 
> - cluster recheck is set to 2 minutes, and even though the resources are 
> running properly, the fail counters are not reduced and crm_mon lists the 
> resources in failed actions section. forever. Or until I manually do resource 
> cleanup.
> - If i execute a crm resource cleanup RES_name from another node, sometimes 
> it simply does not clean up the failed state. If I execute this from the node 
> where the resource IS actually runing, the resource is removed from the 
> failed actions.
> 
> 
> What do you recommend, how could I start troubleshooting these issues? As I 
> said, this setup works fine in several other systems, but here I am 
> really-realy stuck.
> 
> 
> thanks!
> 
> Attila
> 
> 
> 
> 
> 
>> -Original Message-
>> From: Klaus Wenninger [mailto:kwenn...@redhat.com]
>> Sent: Wednesday, May 10, 2017 2:04 PM
>> To: users@clusterlabs.org
>> Subject: Re: [ClusterLabs] Pacemaker occasionally takes minutes to respond
>>
>> On 05/09/2017 10:34 PM, Attila Megyeri wrote:
>>>
>>> Actually I found some more details:
>>>
>>>
>>>
>>> there are two resources: A and B
>>>
>>>
>>>
>>> resource B depends on resource A (when the RA monitors B, if will fail
>>> if A is not running properly)
>>>
>>>
>>>
>>> If I stop resource A, the next monitor operation of "B" will fail.
>>> Interestingly, this check happens immediately after A is stopped.
>>>
>>>
>>>
>>> B is configured to restart if monitor fails. Start timeout is rather
>>> long, 180 seconds. So pacemaker tries to restart B, and waits.
>>>
>>>
>>>
>>> If I want to start "A", nothing happens until the start operation of
>>> "B" fails - typically several minutes.
>>>
>>>
>>>
>>>
>>>
>>> Is this the right behavior?
>>>
>>> It appears that pacemaker is blocked until resource B is being
>>> started, and I cannot really start its dependency...
>>>
>>> Shouldn't it be possible to start a resource while another resource is
>>> also starting?
>>>
>>
>> As long as resources don't depend on each other parallel starting should
>> work/happen.
>>
>> The number of parallel actions executed is derived from the number of
>> cores and
>> when load is detected some kind of throttling kicks in (in fact reduction of
>> the operations executed in parallel with the aim to reduce the load induced
>> by pacemaker). When throttling kicks in you should get log messages (there
>> is in fact a parallel discussion going on ...).
>> No idea if throttling might be a reason here but maybe worth considering
>> at least.
>>
>> Another reason why certain things happen with quite some delay I've
>> observed
>> is that obviously some situations are just resolved when the
>> cluster-recheck-interval
>> triggers a pengine run in addition to those triggered by changes.
>> You might easily verify this by changing the cluster-recheck-interval.
>>
>> Regards,
>> Klaus
>>
>>>
>>>
>>>
>>>
>>> Thanks,
>>>
>>> Attila
>>>
>>>
>>>
>>>
>>>
>>> *From:*Attila Megyeri [mailto:amegy...@minerva-soft.com]
>>> *Sent:* Tuesday, May 9, 2017 9:53 PM
>>> *To:* users@clusterlabs.org; kgail...@redhat.com
>>> *Subject:* [ClusterLabs] Pacemaker occasionally takes minutes to respond
>>>
>>>
>>>
>>> Hi Ken, all,
>>>
>>>
>>>

Re: [ClusterLabs] How to avoid stopping ordered resources on cleanup?

2017-09-15 Thread Ken Gaillot
ure_set="3.0.11" transition-key="6:1380:0:e2c19428-0707-4677-
> a89a-ff1c19ebe57c" transition-magic="0:0;6:1380:0:e2c19428-0707-4677-
> a89a-ff1c19ebe57c" on_node="bam2-backend" call-id="
> Sep 13 06:43:55 [3826] bam1-omc    cib: info:
> cib_process_request:  Completed cib_modify operation for section
> status: OK (rc=0, origin=bam2-backend/crmd/45, version=0.168.8)

Re: [ClusterLabs] Force stopping the resources from a resource group in parallel

2017-09-15 Thread Ken Gaillot
On Tue, 2017-09-12 at 10:49 +0200, John Gogu wrote:
> Hello,
> I have created a resource group from 2 resources: pcs resource group
> add Group1 IPaddr Email. From the documentation is clear that
> resources are stopped in the reverse order in which are specified
> (Email first, then IPaddr).
> 
> There is a way to force stopping of the resources from a resource
> group (Group1) in parallel?

A group is essentially a shorthand for ordering+colocation, so the
ordering is always enforced.

Instead of a group, you could create just a colocation constraint,
which allows the resources to start and stop in any order (or
simultaneously), but always on the same node.

If you need them to always start in order, but stopping can be done in
any order (or simultaneously), then use a colocation constraint plus an
ordering constraint with symmetrical=false.
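
With your example resources, that would look something like (untested):

   pcs resource ungroup Group1
   pcs constraint colocation add Email with IPaddr INFINITY
   pcs constraint order start IPaddr then start Email symmetrical=false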

> 
> 
> Mit freundlichen Grüßen/Kind regards,
> 
> John Gogu
> Skype: ionut.gogu
> 
> ___
> Users mailing list: Users@clusterlabs.org
> http://lists.clusterlabs.org/mailman/listinfo/users
> 
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.
> pdf
> Bugs: http://bugs.clusterlabs.org
-- 
Ken Gaillot <kgail...@redhat.com>




___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Cannot stop cluster due to order constraint

2017-09-15 Thread Ken Gaillot
> pcs resource create backup3 ocf:heartbeat:Dummy
> pcs resource create backup4 ocf:heartbeat:Dummy
> pcs resource create backup5 ocf:heartbeat:Dummy
> pcs resource create backup6 ocf:heartbeat:Dummy
> 
> pcs constraint order start main1 then stop backup1 kind=Serialize
> pcs constraint order start main2 then stop backup2 kind=Serialize
> pcs constraint order start main3 then stop backup3 kind=Serialize
> pcs constraint order start main4 then stop backup4 kind=Serialize
> pcs constraint order start main5 then stop backup5 kind=Serialize
> pcs constraint order start main6 then stop backup6 kind=Serialize
> 
> pcs constraint colocation add backup1 with main1 -200
> pcs constraint colocation add backup2 with main2 -200
> pcs constraint colocation add backup3 with main3 -200
> pcs constraint colocation add backup4 with main3 -200
> pcs constraint colocation add backup5 with main3 -200
> pcs constraint colocation add backup6 with main3 -200
> 
> 
> ___
> Users mailing list: Users@clusterlabs.org
> http://lists.clusterlabs.org/mailman/listinfo/users
> 
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.
> pdf
> Bugs: http://bugs.clusterlabs.org
-- 
Ken Gaillot <kgail...@redhat.com>




___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] IP clone issue

2017-09-15 Thread Ken Gaillot
> >
> > I've created an IP resource with clone using the following
> command
> >
> > pcs resource create ClusterIP ocf:heartbeat:IPaddr2 params
> > nic="ens192" ip="xxx.yyy.zzz.www" cidr_netmask="24"
> > clusterip_hash="sourceip" op start interval="0"
> timeout="20" op
> > stop interval="0" timeout="20" op monitor interval="10"
> > timeout="20" meta resource-stickiness=0 clone meta clone-
> max="2"
> > clone-node-max="2" interleave="true" globally-unique="true"
> >
> > The xxx.yyy.zzz.www is public IP not a private one.
> >
> > With the above command the IP clone is created but it is
> started
> > only on one node. This is the output of pcs status command
> >
> > Clone Set: ClusterIP-clone [ClusterIP] (unique)
> >  ClusterIP:0(ocf::heartbeat:IPaddr2):Started
> node02
> >  ClusterIP:1(ocf::heartbeat:IPaddr2):Started
> node02

By default, pacemaker will spread out all resources (including unique
clone instances) evenly across nodes. So if the other node already has
more resources, the above can be the result.

The suggestion of raising the priority on ClusterIP would make the
cluster place it first, so it will be spread out first. Stickiness can
affect it, though.
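
If you want to try the priority approach, a sketch (untested) would be:

   pcs resource meta ClusterIP-clone priority=10

leaving the other resources at the default priority of 0, so the clone
instances get placed first.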

> > If I modify the clone-node-max to 1 then the resource is
> started
> > on both nodes as seen in this pcs status output:
> >
> > Clone Set: ClusterIP-clone [ClusterIP] (unique)
> >  ClusterIP:0(ocf::heartbeat:IPaddr2):Started
> node02
> >  ClusterIP:1(ocf::heartbeat:IPaddr2):Started
> node01
> >
> > But if one node fails the IP resource is not migrated to
> active
> > node as is said in documentation.
> >
> > Clone Set: ClusterIP-clone [ClusterIP] (unique)
> >  ClusterIP:0(ocf::heartbeat:IPaddr2):Started
> node02
> >  ClusterIP:1(ocf::heartbeat:IPaddr2):Stopped

This is surprising. I'd have to see the logs and/or pe-input to know
why both can't be started.
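
A crm_report archive covering the time of the failover would include both
-- something like (timestamps are placeholders):

   crm_report -f "2017-09-14 00:00:00" -t "2017-09-14 06:00:00" ip-clone-issue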

> >
> > When the IP is active on both nodes the services are
> accessible
> > so there is not an issue with the fact that the interface
> dose
> > not have an IP allocated at boot. The gateway is set with
> > another pcs command and it is working.
> >
> > Thank in advance for any info.
> >
> > Best regards
> > Octavian Ciobanu

-- 
Ken Gaillot <kgail...@redhat.com>




___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[ClusterLabs] Disabling stonith in Pacemaker 2.0 (was: Re: Pacemaker 1.1.18 deprecation warnings)

2017-09-18 Thread Ken Gaillot
On Mon, 2017-09-18 at 13:53 -0400, Digimer wrote:
> On 2017-09-18 01:48 PM, Ken Gaillot wrote:
> > As discussed at the recent ClusterLabs Summit, I plan to start the
> > release cycle for Pacemaker 1.1.18 soon.
> > 
> > There will be the usual bug fixes and a few small new features, but
> the
> > main goal will be to provide a final 1.1 release that Pacemaker 2.0
> can
> > branch from.
> > 
> > As such, 1.1.18 will start to log deprecation warnings for syntax
> that
> > is planned to be removed in 2.0. So, we need to decide fairly
> quickly
> > what we intend to remove.
> > 
> > Below is what I'm proposing. If anyone feels strongly about keeping
> > support for any of these, speak now or forever hold your peace!
> > 



> > * cluster properties that have been obsoleted by the rsc_defaults
> and
> > op_defaults sections
> > ** stonith-enabled or stonith_enabled (now "requires" in
> rsc_defaults)



> Andrew announced that disabling stonith will put a node into
> maintenance
> mode. This should be announced/alerted as well, eh?

My current plan is to remove the stonith-enabled option entirely, so
there will be no stonith-enabled=false anymore.

Users will still have a way to disable fencing, by (ab)using the
"requires" resource meta-attribute. This is a per-resource option
rather than cluster-wide (though it can be applied to all resources
using rsc_defaults).

I don't plan on preventing people from running only resources with
requires=quorum or requires=nothing, without any fencing configured.
However we will probably start tracking whether at least one resource
has requires=fencing, and require that at least one enabled fencing
device be configured if so. If fencing is not available in such a case,
we could put the cluster into maintenance mode, or log a warning and
block when fencing is needed.
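
For example, disabling fencing cluster-wide would then look something like
this (sketch only; exact tooling syntax may differ):

   pcs resource defaults requires=quorum

or per resource with "pcs resource meta <resource> requires=quorum".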
-- 
Ken Gaillot <kgail...@redhat.com>




___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[ClusterLabs] Pacemaker 1.1.18 deprecation warnings

2017-09-18 Thread Ken Gaillot
As discussed at the recent ClusterLabs Summit, I plan to start the
release cycle for Pacemaker 1.1.18 soon.

There will be the usual bug fixes and a few small new features, but the
main goal will be to provide a final 1.1 release that Pacemaker 2.0 can
branch from.

As such, 1.1.18 will start to log deprecation warnings for syntax that
is planned to be removed in 2.0. So, we need to decide fairly quickly
what we intend to remove.

Below is what I'm proposing. If anyone feels strongly about keeping
support for any of these, speak now or forever hold your peace!

* support for legacy cluster stacks (heartbeat, corosync 1 + CMAN, and
corosync 1 + pacemaker plugin). Pacemaker 2.0 will initially support
only corosync 2, though future support is planned for the new knet
stack.

* compile-time option to directly support SNMP and ESMTP in crm_mon
(i.e. the --snmp-* and --mail-* options) (alerts are the current
syntax)

* pcmk_*_cmd stonith attributes (pcmk_*_action is the current syntax)

* pcmk_poweroff_action (pcmk_off_action is the current syntax)

* "requires" operation meta-attribute ("requires" resource meta-
attribute is the current syntax)

* undocumented "resource isolation" feature (bundles are current
syntax)

* undocumented LRMD_MAX_CHILDREN environment variable
(PCMK_node_action_limit is the current syntax)

* cluster properties that have been obsoleted by the rsc_defaults and
op_defaults sections
** stonith-enabled or stonith_enabled (now "requires" in rsc_defaults)
** default-resource-stickiness, default_resource_stickiness (now
"resource-stickiness" in rsc_defaults)
** is-managed-default or is_managed_default (now "is-managed" in
rsc_defaults)
** default-action-timeout or default_action_timeout (now "timeout" in
op_defaults)

* undocumented old names of cluster properties
** no_quorum_policy (now no-quorum-policy)
** symmetric_cluster (now symmetric-cluster)
** stonith_action (now stonith-action)
** startup_fencing (now startup-fencing)
** transition_idle_timeout (now cluster-delay)
** default_action_timeout (now default-action-timeout)
** stop_orphan_resources (now stop-orphan-resources)
** stop_orphan_actions (now stop-orphan-actions)
** remote_after_stop (now remove-after-stop)
** dc_deadtime (now dc-deadtime)
** cluster_recheck_interval (now cluster-recheck-interval)
** election_timeout (now election-timeout)
** shutdown_escalation (now shutdown-escalation)

* undocumented old names of resource meta-attributes
** resource-failure-stickiness, resource_failure_stickiness, default-
resource-failure-stickiness, and
default_resource_failure_stickiness (now migration-threshold)

* undocumented and ignored -r option to lrmd

* compile-time option to use undocumented "notification-agent" and
"notification-recipient" cluster properties instead of current "alerts"
syntax

* compatibility with CIB schemas below 1.0, and schema 1.1 (should not
affect anyone who created their configuration using Pacemaker 1.0.0 or
later)
-- 
Ken Gaillot <kgail...@redhat.com>




___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Antw: Pacemaker 1.1.18 deprecation warnings

2017-09-19 Thread Ken Gaillot
On Tue, 2017-09-19 at 09:13 +0200, Ulrich Windl wrote:
> >>> Ken Gaillot <kgail...@redhat.com> schrieb am 18.09.2017 um 19:48
> in Nachricht
> <1505756918.5541.4.ca...@redhat.com>:
> > As discussed at the recent ClusterLabs Summit, I plan to start the
> > release cycle for Pacemaker 1.1.18 soon.
> > 
> > There will be the usual bug fixes and a few small new features, but
> the
> > main goal will be to provide a final 1.1 release that Pacemaker 2.0
> can
> > branch from.
> > 
> > As such, 1.1.18 will start to log deprecation warnings for syntax
> that
> > is planned to be removed in 2.0. So, we need to decide fairly
> quickly
> > what we intend to remove.
> 
> I think it should work the other way 'round: Once pacemaker 2.0
> implemented the replacements, declare the old versions as obsolete.
> I see little sense in declaring features as obsolete as long as there
> is no replacement available.

All of the items mentioned already have replacements. Most are either
already documented as deprecated, or undocumented -- this would just be
to get a log warning as well.

I'm thinking something along the lines of:

warning: configuration uses XXX, which will be removed in a future
release of Pacemaker; use YYY instead
 
> > Below is what I'm proposing. If anyone feels strongly about keeping
> > support for any of these, speak now or forever hold your peace!
> > 
> > * support for legacy cluster stacks (heartbeat, corosync 1 + CMAN,
> and
> > corosync 1 + pacemaker plugin). Pacemaker 2.0 will initially
> support
> > only corosync 2, though future support is planned for the new knet
> > stack.
> > 
> > * compile-time option to directly support SNMP and ESMTP in crm_mon
> > (i.e. the --snmp-* and --mail-* options) (alerts are the current
> > syntax)
> > 
> > * pcmk_*_cmd stonith attributes (pcmk_*_action is the current
> syntax)
> > 
> > * pcmk_poweroff_action (pcmk_off_action is the current syntax)
> > 
> > * "requires" operation meta-attribute ("requires" resource meta-
> > attribute is the current syntax)
> > 
> > * undocumented "resource isolation" feature (bundles are current
> > syntax)
> > 
> > * undocumented LRMD_MAX_CHILDREN environment variable
> > (PCMK_node_action_limit is the current syntax)
> > 
> > * cluster properties that have been obsoleted by the rsc_defaults
> and
> > op_defaults sections
> > ** stonith-enabled or stonith_enabled (now "requires" in
> rsc_defaults)
> > ** default-resource-stickiness, default_resource_stickiness (now
> > "resource-stickiness" in rsc_defaults)
> > ** is-managed-default or is_managed_default (now "is-managed" in
> > rsc_defaults)
> > ** default-action-timeout or default_action_timeout (now "timeout"
> in
> > op_defaults)
> > 
> > * undocumented old names of cluster properties
> > ** no_quorum_policy (now no-quorum-policy)
> > ** symmetric_cluster (now symmetric-cluster)
> > ** stonith_action (now stonith-action)
> > ** startup_fencing (now startup-fencing)
> > ** transition_idle_timeout (now cluster-delay)
> > ** default_action_timeout (now default-action-timeout)
> > ** stop_orphan_resources (now stop-orphan-resources)
> > ** stop_orphan_actions (now stop-orphan-actions)
> > ** remote_after_stop (now remove-after-stop)
> > ** dc_deadtime (now dc-deadtime)
> > ** cluster_recheck_interval (now cluster-recheck-interval)
> > ** election_timeout (now election-timeout)
> > ** shutdown_escalation (now shutdown-escalation)
> > 
> > * undocumented old names of resource meta-attributes
> > ** resource-failure-stickiness, resource_failure_stickiness,
> default-
> > resource-failure-stickiness, and
> > default_resource_failure_stickiness (now migration-threshold)
> > 
> > * undocumented and ignored -r option to lrmd
> > 
> > * compile-time option to use undocumented "notification-agent" and
> > "notification-recipient" cluster properties instead of current
> "alerts"
> > syntax
> > 
> > * compatibility with CIB schemas below 1.0, and schema 1.1 (should
> not
> > affect anyone who created their configuration using Pacemaker 1.0.0
> or
> > later)
-- 
Ken Gaillot <kgail...@redhat.com>




___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] New website design and new-new logo

2017-09-21 Thread Ken Gaillot
On Thu, 2017-09-21 at 09:26 +0200, Jehan-Guillaume de Rorthais wrote:
> On Wed, 20 Sep 2017 21:25:51 -0400
> Digimer <li...@alteeve.ca> wrote:
> 
> > On 2017-09-20 07:53 PM, Ken Gaillot wrote:
> > > Hi everybody,
> > > 
> > > We've started a major update of the ClusterLabs web design. The
> main
> > > goal (besides making it look more modern) is to make the top-
> level more
> > > about all ClusterLabs projects rather than Pacemaker-specific.
> It's
> > > also much more mobile-friendly.
> > > 
> > > We've also updated our new logo -- Kristoffer Grönlund had a
> > > professional designer look at the one he created. I hope everyone
> likes
> > > the end result. It's simpler, cleaner and friendlier.
> 
> Really nice, I like it! Thanks to both of you!
> 
> > > Check it out at https://clusterlabs.org/  
> > 
> > This is excellent!
> > 
> > Can I recommend an additional category? It would be nice to have a
> > "Projects" link that provided a list of projects that fall under
> the
> > clusterlabs umbrella, with a brief blurb and a link to each.
> 
> Speaking of another category, maybe it could references some more
> community blogs? Or even add a planet? (planet.postgresql.org is
> pretty popular
> in pgsql community).

That's a great idea, I'll look into planet when I get the chance

> 
> I'm sure you guys has some people around writing posts about news,
> features,
> tech preview, etc. On top of my head, I can think of RH, Suse,
> Linbit,
> Unixarena, Hastexo, Alteeve, ...
> 
> ++
-- 
Ken Gaillot <kgail...@redhat.com>




___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] New website design and new-new logo

2017-09-21 Thread Ken Gaillot
On Wed, 2017-09-20 at 21:25 -0400, Digimer wrote:
> On 2017-09-20 07:53 PM, Ken Gaillot wrote:
> > Hi everybody,
> > 
> > We've started a major update of the ClusterLabs web design. The
> main
> > goal (besides making it look more modern) is to make the top-level
> more
> > about all ClusterLabs projects rather than Pacemaker-specific. It's
> > also much more mobile-friendly.
> > 
> > We've also updated our new logo -- Kristoffer Grönlund had a
> > professional designer look at the one he created. I hope everyone
> likes
> > the end result. It's simpler, cleaner and friendlier.
> > 
> > Check it out at https://clusterlabs.org/
> 
> This is excellent!
> 
> Can I recommend an additional category? It would be nice to have a
> "Projects" link that provided a list of projects that fall under the
> clusterlabs umbrella, with a brief blurb and a link to each.

Yes! This is just the start, we intend to improve the content over
time. (Time being a scarce commodity ...)

I'd like to see the front page be much shorter, with most of its
content moved to the Pacemaker page, and the front page would be mostly
links to all the individual projects plus brief info about ClusterLabs
itself.

> 
> Thanks for doing this! It's much more general now, which is great.
> 
-- 
Ken Gaillot <kgail...@redhat.com>




___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Pacemaker 1.1.18 deprecation warnings

2017-09-20 Thread Ken Gaillot
On Wed, 2017-09-20 at 11:48 +0200, Ferenc Wágner wrote:
> Ken Gaillot <kgail...@redhat.com> writes:
> 
> > * undocumented LRMD_MAX_CHILDREN environment variable
> > (PCMK_node_action_limit is the current syntax)
> 
> By the way, is the current syntax documented somewhere?  Looking at

Unfortunately not in its entirety (on the to-do list)

The crmd man page documents load-threshold and node-action-limit

> crmd/throttle.c, throttle_update_job_max() is only ever invoked with
> a
> NULL argument, so "Global preference from the CIB" isn't implemented
> either.  Or do I overlook something?

See crmd/control.c:config_query_callback() -- it calls
throttle_update_job_max() with the value of node-action-limit (a
cluster property that applies to all nodes, as opposed to
PCMK_node_action_limit which is an environment variable that applies
only to the local node)
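
For example (values are illustrative):

  # cluster-wide property
  pcs property set node-action-limit=2

  # per-node limit, in Pacemaker's environment file
  # (e.g. /etc/sysconfig/pacemaker on Red Hat-based systems)
  PCMK_node_action_limit=2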
-- 
Ken Gaillot <kgail...@redhat.com>




___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] some resources move after recovery

2017-09-20 Thread Ken Gaillot
On Wed, 2017-09-20 at 10:08 +, Roberto Muñoz Gomez wrote:
> Hi,
> 
> 
> I don't know why if one of the two nodes is rebooted, when the node
> is back, some of the resources move to it despite default-resource-
> stickiness=100 and the resources have failcount=0 and there is no
> constraint influencing that change.
> 
> By some I mean sometimes 1, other 90, other 103...in a 900+ resource
> cluster.
> 
> The only clue I have is this line in the log:
> 
> pengine: info: native_color:    Resource o464rt cannot run
> anywhere
> 
> Is there any way I can "debug" this behaviour?

It's not very user-friendly, but you can get the most information from
the crm_simulate tool. Shortly past the above line in the log, there
will be a line like "Calculated transition ..., saving inputs in ..."
with a file path.

Grab that file, which has the entire state of the cluster at that point
in time. Run "crm_simulate -Ssx ". It will tell you what it
thinks the state of the nodes and resources were at that time, all the
scores that go into resource placement, and then the actions it thinks
need to be taken ("Transition Summary"). It will then simulate taking
those actions and show what the resulting new status would be.

It's not always obvious where the scores come from, but it does give
more information.
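
For example (the file name below is just illustrative):

  crm_simulate -Ssx /var/lib/pacemaker/pengine/pe-input-1234.bz2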

> 
> 
> Best Regards.
> ·
> Roberto Muñoz
> BME - Sistemas UNIX
> C/ Tramontana, 2 Bis. Edificio 2 - 1ª Planta
> 28230 Las Rozas, Madrid - España
> Tlfn: +34-917095778
> 
> 
> P Antes de imprimir, piensa en el MEDIO AMBIENTE
> AVISO LEGAL/DISCLAIMER
-- 
Ken Gaillot <kgail...@redhat.com>




___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] New website design and new-new logo

2017-09-21 Thread Ken Gaillot
On Thu, 2017-09-21 at 11:56 +0200, Kai Dupke wrote:
> On 09/21/2017 01:53 AM, Ken Gaillot wrote:
> > Check it out at https://clusterlabs.org/
> 
> Two comments
> 
> - I would like to see the logo used by as many
> people/projects/marketingers, so I propose to link the Logo to a Logo
> page with some prepared Logos - at least with one big to download and
> a
> license info

Another good idea

> 
> - Should we not add a word about the license to the FAQ on top? I
> mean,
> I am with Open Source for quite some time but some others might not
> and
> we want to get fresh members to the community, right? I'm not sure
> all
> subproject share the same license, but if so then it should be
> written down.

Yes, the FAQ needs an overhaul as well -- all the Pacemaker-specific
questions should be moved to a separate Pacemaker FAQ, and the top FAQ
should just have questions about ClusterLabs plus links to project FAQs

> 
> Best regards,
> Kai Dupke
> Senior Product Manager
> SUSE Linux Enterprise 15
-- 
Ken Gaillot <kgail...@redhat.com>




___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] New website design and new-new logo

2017-09-21 Thread Ken Gaillot
On Thu, 2017-09-21 at 16:46 +0200, Kai Dupke wrote:
> On 09/21/2017 04:42 PM, Ken Gaillot wrote:
> > Yes, the FAQ needs an overhaul as well -- all the Pacemaker-
> specific
> > questions should be moved to a separate Pacemaker FAQ, and the top
> FAQ
> > should just have questions about ClusterLabs plus links to project
> FAQs
> 
> Can we make this a wiki page, so others can contribute as well?

I'm thinking the project FAQs should be on the wiki (or their own
separate sites if they use any), and the overall ClusterLabs FAQ (which
should be short and rarely change) can stay a static web page.

There already is a (pacemaker-centric) FAQ on the wiki that overlaps
with the main one, so it definitely could use some clean up for
consistency.

> 
> Best regards,
> Kai Dupke
> Senior Product Manager
> SUSE Linux Enterprise 15
-- 
Ken Gaillot <kgail...@redhat.com>




___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[ClusterLabs] New website design and new-new logo

2017-09-20 Thread Ken Gaillot
Hi everybody,

We've started a major update of the ClusterLabs web design. The main
goal (besides making it look more modern) is to make the top-level more
about all ClusterLabs projects rather than Pacemaker-specific. It's
also much more mobile-friendly.

We've also updated our new logo -- Kristoffer Grönlund had a
professional designer look at the one he created. I hope everyone likes
the end result. It's simpler, cleaner and friendlier.

Check it out at https://clusterlabs.org/

-- 
Ken Gaillot <kgail...@redhat.com>




___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] High CPU during CIB sync

2017-09-15 Thread Ken Gaillot
On Mon, 2017-09-11 at 16:02 +0530, Anu Pillai wrote:
> Hi,
> 
> We are using 3 node cluster (2 active and 1 standby). 
> When failover happens, CPU utilization going high in newly active
> node as well as other active node. It is remaining in high CPU state
> for nearly 20 seconds.
> 
> We have 122 resource attributes under the resource(res1) which is
> failing over. Failover triggered at 14:49:05 
> 
> Cluster Information:
> Pacemaker 1.1.14
> Corosync Cluster Engine, version '2.3.5'
> pcs version 0.9.150
> dc-version: 1.1.14-5a6cdd1
> no-quorum-policy: ignore
> notification-agent: /etc/sysconfig/notify.sh
> notification-recipient: /var/log/notify.log
> placement-strategy: balanced
> startup-fencing: true
> stonith-enabled: false
> 
> Our device is having 8 cores. Pacemaker and related application
> running on Core 6
> 
> top command output:
> CPU0:  4.4% usr 17.3% sys  0.0% nic 75.7% idle  0.0% io  1.9% irq
>  0.4% sirq
> CPU1:  9.5% usr  2.5% sys  0.0% nic 88.0% idle  0.0% io  0.0% irq
>  0.0% sirq
> CPU2:  1.4% usr  1.4% sys  0.0% nic 96.5% idle  0.0% io  0.4% irq
>  0.0% sirq
> CPU3:  3.4% usr  0.4% sys  0.0% nic 95.5% idle  0.0% io  0.4% irq
>  0.0% sirq
> CPU4:  7.9% usr  2.4% sys  0.0% nic 88.5% idle  0.0% io  0.9% irq
>  0.0% sirq
> CPU5:  0.5% usr  0.5% sys  0.0% nic 98.5% idle  0.0% io  0.5% irq
>  0.0% sirq
> CPU6: 60.3% usr 38.6% sys  0.0% nic  0.0% idle  0.0% io  0.4% irq
>  0.4% sirq
> CPU7:  2.9% usr 10.3% sys  0.0% nic 83.6% idle  0.0% io  2.9% irq
>  0.0% sirq
> Load average: 3.47 1.82 1.63 7/314 11444
>  
>  PID  PPID USER     STAT   VSZ %VSZ CPU %CPU COMMAND
>  4921  4839 hacluste R <  78492  2.8   6  2.0
> /usr/libexec/pacemaker/cib
> 11240 11239 root       RW<      0  0.0   6  1.9 [python]
>  4925  4839 hacluste R <  52804  1.9   6  1.1
> /usr/libexec/pacemaker/pengine
>  4637     1 root           R <  97620  3.5   6  0.4 corosync -p -f
>  4926  4839 hacluste S <   131m  4.8   6  0.3
> /usr/libexec/pacemaker/crmd
>  4839     1 root           S <  33448  1.2   6  0.1 pacemakerd
> 
> 
> 
> I am attaching the log for your reference.
> 
> 
> 
> Regards,
> Aswathi

Is there a reason all the cluster services are pegged to one core?
Pacemaker can take advantage of multiple cores both by spreading out
the daemons and by running multiple resource actions at once.

I see you're using the original "notifications" implementation. This
has been superseded by "alerts" in Pacemaker 1.1.15 and later. I
recommend upgrading if you can, which will also get you bugfixes in
pacemaker and corosync that could help. In any case, your notify script
/etc/sysconfig/notify.sh is generating errors. If you don't really need
the notify logging, I'd disable that and see if that helps.
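
For reference, once on 1.1.15+ the equivalent alerts configuration would
look something like this with a reasonably recent pcs (the id is
arbitrary; exact syntax can differ between pcs versions):

  pcs alert create id=notify path=/etc/sysconfig/notify.sh
  pcs alert recipient add notify value=/var/log/notify.log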

It looks to me that, after failover, the resource agent is setting a
lot of node attributes and possibly its own resource attributes. Each
of those changes requires the cluster to recalculate resource
placement, and that's probably where most of the CPU usage is coming
from. (BTW, setting node attributes is fine, but a resource agent
generally shouldn't change its own configuration.)

You should be able to reduce the CPU usage by setting "dampening" on
the node attributes. This will make the cluster wait a bit of time
before writing node attribute changes to the CIB, so the recalculation
doesn't have to occur immediately after each change. See the "--delay"
option to attrd_updater (which can be used when creating the attribute
initially).
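
For example (attribute name and value are placeholders for whatever your
agent sets):

  attrd_updater --name my_attr --update 1 --delay 5s

batches further changes to "my_attr" for about 5 seconds before they are
written to the CIB.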

-- 
Ken Gaillot <kgail...@redhat.com>




___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Pacemaker resource parameter reload confusion

2017-09-22 Thread Ken Gaillot
On Fri, 2017-09-22 at 16:23 +0200, Ferenc Wágner wrote:
> Hi,
> 
> I'm running a custom resourcre agent under Pacemaker 1.1.16, which
> has
> several reloadable parameters:
> 
> $ /usr/sbin/crm_resource --show-metadata=ocf:niif:TransientDomain |
> fgrep unique=
> 
> [the matching parameter lines were stripped by the list archive]
> I used to routinely change the unique="0" parameters without having
> the
> corresponding resources restarted.  But now something like
> 
> $ sudo crm_resource -r vm-alder -p admins -v "kissg wferi"
> 
> restarts the resource in a somewhat strange way:
> 
> crmd[27037]:   notice: State transition S_IDLE -> S_POLICY_ENGINE
> pengine[27036]:   notice: Reload  vm-alder#011(Started vhbl05)
> pengine[27036]:   notice: Calculated transition 1309, saving inputs
> in /var/lib/pacemaker/pengine/pe-input-1033.bz2
> crmd[27037]:   notice: Initiating stop operation vm-alder_stop_0 on
> vhbl05
> crmd[27037]:   notice: Initiating reload operation vm-alder_reload_0
> on vhbl05
> crmd[27037]:   notice: Transition aborted by deletion of
> lrm_rsc_op[@id='vm-alder_last_failure_0']: Resource operation removal
> crmd[27037]:   notice: Transition 1309 (Complete=10, Pending=0,
> Fired=0, Skipped=1, Incomplete=2,
> Source=/var/lib/pacemaker/pengine/pe-input-1033.bz2): Stopped
> pengine[27036]:   notice: Calculated transition 1310, saving inputs
> in /var/lib/pacemaker/pengine/pe-input-1034.bz2

Hmm, stop+reload is definitely a bug. Can you attach (or email it to me
privately, or file a bz with it attached) the above pe-input file with
any sensitive info removed?

> crmd[27037]:   notice: Initiating monitor operation vm-
> alder_monitor_6 on vhbl05
> crmd[27037]:  warning: Action 228 (vm-alder_monitor_6) on vhbl05
> failed (target: 0 vs. rc: 7): Error
> crmd[27037]:   notice: Transition aborted by operation vm-
> alder_monitor_6 'create' on vhbl05: Event failed
> crmd[27037]:   notice: Transition 1310 (Complete=7, Pending=0,
> Fired=0, Skipped=0, Incomplete=0,
> Source=/var/lib/pacemaker/pengine/pe-input-1034.bz2): Complete
> pengine[27036]:  warning: Processing failed op monitor for vm-alder
> on vhbl05: not running (7)
> pengine[27036]:   notice: Recover vm-alder#011(Started vhbl05)
> pengine[27036]:   notice: Calculated transition 1311, saving inputs
> in /var/lib/pacemaker/pengine/pe-input-1035.bz2
> pengine[27036]:  warning: Processing failed op monitor for vm-alder
> on vhbl05: not running (7)
> pengine[27036]:   notice: Recover vm-alder#011(Started vhbl05)
> pengine[27036]:   notice: Calculated transition 1312, saving inputs
> in /var/lib/pacemaker/pengine/pe-input-1036.bz2
> crmd[27037]:   notice: Initiating stop operation vm-alder_stop_0 on
> vhbl05
> crmd[27037]:   notice: Initiating start operation vm-alder_start_0 on
> vhbl05
> crmd[27037]:   notice: Initiating monitor operation vm-
> alder_monitor_6 on vhbl05
> crmd[27037]:   notice: Transition 1312 (Complete=10, Pending=0,
> Fired=0, Skipped=0, Incomplete=0,
> Source=/var/lib/pacemaker/pengine/pe-input-1036.bz2): Complete
> crmd[27037]:   notice: State transition S_TRANSITION_ENGINE -> S_IDLE
> 
> I've got info level logs as well, but those are rather long and maybe
> someone can pinpoint my problem without going through those.  I
> remember
> past discussions about "doing reload right", but I'm not sure what
> was
> implemented in the end, and I can't find anything in the changelog
> either.  So, what do I miss here?  Parallel reload and stop looks
> rather
> suspicious, though...

Nothing's been done about reload yet. It's waiting until we get around
to an overhaul of the OCF resource agent standard, so we can define the
semantics more clearly. It will involve replacing "unique" with
separate meta-data for reloadability and GUI hinting, and possibly
changes to the reload operation. Of course we'll try to stay backward-
compatible.
-- 
Ken Gaillot <kgail...@redhat.com>




___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Pacemaker 1.1.18 deprecation warnings

2017-09-19 Thread Ken Gaillot
On Tue, 2017-09-19 at 15:28 +0200, Jan Pokorný wrote:
> On 19/09/17 08:45 +0200, Klaus Wenninger wrote:
> > We could as well deprecate use of CRM_NOTIFY_* in alert-agents.
> > Just don't know an easy way of writing out a deprecation warning
> > upon a script using one of these.

I considered dropping support for CRM_notify_* in 2.0, but I think we
want to retain compatibility with scripts that people wrote for crm_mon
-E (which itself will not be deprecated this go-around either). I think
I'd rather wait for a future Pacemaker 2.x to break those, since alerts
are still relatively new to users.

The notification-agent/notification-recipient syntax is different,
since it was never released upstream.

> Rename to CRM_NOTIFY_DEPRECATED_* to allow emergency sed-based
> re-enabling in script-based notification agents (100% now?)
> while getting a clear message wrt. future across?

Not necessary, it's equally easy to s/CRM_notify_/CRM_alert_/g
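
For instance, on a copy of each script, something like:

  sed -i 's/CRM_notify_/CRM_alert_/g' /path/to/alert-agent.sh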

> 
> > Of course one could search the alert-agents for the string when
> > they are read from the CIB. Apart from the multiple points of
> > ugliness this would impose positive point would be a log for non
> > existentalert-agents prior to their unsuccessful first use.
-- 
Ken Gaillot <kgail...@redhat.com>




___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Pacemaker 1.1.18 deprecation warnings

2017-09-19 Thread Ken Gaillot
On Tue, 2017-09-19 at 10:18 +0300, Klechomir wrote:
> Hi Ken,
> Any plans that there will be "lost monitoring request" handling (an
> added 
> syntax) in 2.0?
> 
> Regards,
> Klecho

2.0 will be about removing legacy syntax rather than adding new
features. The X.Y.Z version numbering policy will become:

X (major): changes that prevent at least some rolling upgrades

Y (minor): changes that don't break any rolling upgrades but require
closer user attention (changes in configuration defaults, tool
behavior, and/or public C API)

Z (minor-minor): backward-compatible changes (features and bug fixes)

So, we can add new (backward-compatible) features in any release.

Specifically to your question, what do you mean by "lost monitoring
request"? If you mean the ability to ignore a certain number of failed
monitors before recovering, there are plans for that, but it will be
later than 2.0.

> On 18.9.2017 12:48:38 Ken Gaillot wrote:
> > As discussed at the recent ClusterLabs Summit, I plan to start the
> > release cycle for Pacemaker 1.1.18 soon.
> > 
> > There will be the usual bug fixes and a few small new features, but
> the
> > main goal will be to provide a final 1.1 release that Pacemaker 2.0
> can
> > branch from.
> > 
> > As such, 1.1.18 will start to log deprecation warnings for syntax
> that
> > is planned to be removed in 2.0. So, we need to decide fairly
> quickly
> > what we intend to remove.
> > 
> > Below is what I'm proposing. If anyone feels strongly about keeping
> > support for any of these, speak now or forever hold your peace!
> > 
> > * support for legacy cluster stacks (heartbeat, corosync 1 + CMAN,
> and
> > corosync 1 + pacemaker plugin). Pacemaker 2.0 will initially
> support
> > only corosync 2, though future support is planned for the new knet
> > stack.
> > 
> > * compile-time option to directly support SNMP and ESMTP in crm_mon
> > (i.e. the --snmp-* and --mail-* options) (alerts are the current
> > syntax)
> > 
> > * pcmk_*_cmd stonith attributes (pcmk_*_action is the current
> syntax)
> > 
> > * pcmk_poweroff_action (pcmk_off_action is the current syntax)
> > 
> > * "requires" operation meta-attribute ("requires" resource meta-
> > attribute is the current syntax)
> > 
> > * undocumented "resource isolation" feature (bundles are current
> > syntax)
> > 
> > * undocumented LRMD_MAX_CHILDREN environment variable
> > (PCMK_node_action_limit is the current syntax)
> > 
> > * cluster properties that have been obsoleted by the rsc_defaults
> and
> > op_defaults sections
> > ** stonith-enabled or stonith_enabled (now "requires" in
> rsc_defaults)
> > ** default-resource-stickiness, default_resource_stickiness (now
> > "resource-stickiness" in rsc_defaults)
> > ** is-managed-default or is_managed_default (now "is-managed" in
> > rsc_defaults)
> > ** default-action-timeout or default_action_timeout (now "timeout"
> in
> > op_defaults)
> > 
> > * undocumented old names of cluster properties
> > ** no_quorum_policy (now no-quorum-policy)
> > ** symmetric_cluster (now symmetric-cluster)
> > ** stonith_action (now stonith-action)
> > ** startup_fencing (now startup-fencing)
> > ** transition_idle_timeout (now cluster-delay)
> > ** default_action_timeout (now default-action-timeout)
> > ** stop_orphan_resources (now stop-orphan-resources)
> > ** stop_orphan_actions (now stop-orphan-actions)
> > ** remote_after_stop (now remove-after-stop)
> > ** dc_deadtime (now dc-deadtime)
> > ** cluster_recheck_interval (now cluster-recheck-interval)
> > ** election_timeout (now election-timeout)
> > ** shutdown_escalation (now shutdown-escalation)
> > 
> > * undocumented old names of resource meta-attributes
> > ** resource-failure-stickiness, resource_failure_stickiness,
> default-
> > resource-failure-stickiness, and
> > default_resource_failure_stickiness (now migration-threshold)
> > 
> > * undocumented and ignored -r option to lrmd
> > 
> > * compile-time option to use undocumented "notification-agent" and
> > "notification-recipient" cluster properties instead of current
> "alerts"
> > syntax
> > 
> > * compatibility with CIB schemas below 1.0, and schema 1.1 (should
> not
> > affect anyone who created their configuration using Pacemaker 1.0.0
> or
> > later)
-- 
Ken Gaillot <kgail...@redhat.com>




___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[ClusterLabs] Pacemaker 1.1.18-rc1 now available

2017-10-06 Thread Ken Gaillot
Source code for the first release candidate for Pacemaker version
1.1.18 is now available at:

https://github.com/ClusterLabs/pacemaker/releases/tag/Pacemaker-1.1.18-rc1

The main goal of this release is to provide a point from which we can
branch off the development of Pacemaker 2.0.0, and to start logging
warnings when legacy configuration syntax to be removed in 2.0.0 is in
use.

Despite that, and only being 3 months after the 1.1.17 release, we have
some interesting new features. The most significant are:

* Bundles are now close to being fully production-ready. They support
all constraint types, and they now support rkt as well as Docker
containers. The only known significant limitation is that cleaning up a
running bundle, or restarting Pacemaker while a bundle is unmanaged or
the cluster is in maintenance mode, may cause the bundle to fail. 

* Alerts may now be filtered so that alert agents are called only for
desired alert types, and (as an experimental feature) it is now
possible to receive alerts for transient node attribute changes.

* Status output (including crm_mon, pengine logs, and a new
crm_resource --why option) now has more details about why resources are
in a certain state.

As usual, to support the new features, the CRM feature set has been
incremented. This means that mixed-version clusters are supported only
during a rolling upgrade -- nodes with an older version will not be
allowed to rejoin once they shut down.

For a more detailed list of bug fixes and other changes, see the change
log:

https://github.com/ClusterLabs/pacemaker/blob/1.1/ChangeLog

Everyone is encouraged to download, compile and test the new release.
We do many regression tests and simulations, but we can't cover all
possible use cases, so your feedback is important and appreciated.

Many thanks to all contributors of source code to this release,
including Andrew Beekhof, Aravind Kumar, Artur Novik, Bin Liu, Yan Gao,
Hideo Yamauchi, Igor Tsiglyar, Jan Pokorný, Ken Gaillot, Klaus
Wenninger, Nye Liu, and Valentin Vidic.
-- 
Ken Gaillot <kgail...@redhat.com>

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] crm_resource --wait

2017-10-09 Thread Ken Gaillot
On Mon, 2017-10-09 at 16:37 +1000, Leon Steffens wrote:
> Hi all,
> 
> We have a use case where we want to place a node into standby and
> then wait for all the resources to move off the node (and be started
> on other nodes) before continuing.  
> 
> In order to do this we call:
> $ pcs cluster standby brilxvm45
> $ crm_resource --wait --timeout 300
> 
> This works most of the time, but in one of our test environments we
> are hitting a problem:
> 
> When we put the node in standby, the reported cluster transition is:
> 
> $  /usr/sbin/crm_simulate -x pe-input-3595.bz2 -S
> 
> Using the original execution date of: 2017-10-08 16:58:05Z
> ...
> Transition Summary:
>  * Restart sv_fencer    (Started brilxvm43)
>  * Stop    sv.svtest.aa.sv.monitor:1    (brilxvm45)
>  * Move    sv.svtest.aa.26.partition    (Started brilxvm45 ->
> brilxvm43)
>  * Move    sv.svtest.aa.27.partition    (Started brilxvm45 ->
> brilxvm44)
>  * Move    sv.svtest.aa.28.partition    (Started brilxvm45 ->
> brilxvm43)
> 
> We expect crm_resource --wait to return once sv_fencer (a fencing
> device) has been restarted (not sure why it's being restarted), and
> the 3 partition resources have been moved.
> 
> But crm_resource actually times out after 300 seconds with the
> following error:
> 
> Pending actions:
> Action 40: sv_fencer_monitor_6 on brilxvm44
> Action 39: sv_fencer_start_0 on brilxvm44
> Action 38: sv_fencer_stop_0 on brilxvm43
> Error performing operation: Timer expired
> 
> It looks like it's waiting for the sv_fencer fencing agent to start
> on brilxvm44, even though the current transition did not include that
> move.  

crm_resource --wait doesn't wait for a specific transition to complete;
it waits until no further actions are needed.

That is one of its limitations, that if something keeps provoking a new
transition, it will never complete except by timeout.

> 
> After the crm_resource --wait has timed out, we set a property on a
> different node (brilxvm43).  This seems to trigger a new transition
> to move sv_fencer to brilxvm44:
> 
> $  /usr/sbin/crm_simulate -x pe-input-3596.bz2 -S
> Using the original execution date of: 2017-10-08 17:03:27Z
> 
> Transition Summary:
>  * Move    sv_fencer    (Started brilxvm43 -> brilxvm44)
> 
> And from the corosync.log it looks like this transition triggers
> actions 38 - 40 (the ones crm_resource --wait waited for).
> 
> So it looks like the crm_resource --wait knows about the transition
> to move the sv_fencer resource, but the subsequent setting of the
> node property is the one that actually triggers it  (which is too
> late as it gets executed after the wait).
> 
> I have attached the DC's corosync.log for the applicable time period
> (timezone is UTC+10).  (The last few lines in the corosync - the
> interruption of transition 141 - is because of a subsequent standby
> being done for brilxvm43).
> 
> A possible workaround I thought of was to make the sv_fencer resource
> slightly sticky (all the other resources are), but I'm not sure if
> this will just hide the problem for this specific scenario.
> 
> We are using Pacemaker 1.1.15 on RedHat 6.9.
> 
> Regards,
> Leon
> 
> 
> 
> ___
> Users mailing list: Users@clusterlabs.org
> http://lists.clusterlabs.org/mailman/listinfo/users
> 
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.
> pdf
> Bugs: http://bugs.clusterlabs.org

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] corosync service not automatically started

2017-10-10 Thread Ken Gaillot
On Tue, 2017-10-10 at 12:24 +0200, Václav Mach wrote:
> On 10/10/2017 11:40 AM, Valentin Vidic wrote:
> > On Tue, Oct 10, 2017 at 11:26:24AM +0200, Václav Mach wrote:
> > > # The primary network interface
> > > allow-hotplug eth0
> > > iface eth0 inet dhcp
> > > # This is an autoconfigured IPv6 interface
> > > iface eth0 inet6 auto
> > 
> > allow-hotplug or dhcp could be causing problems.  You can try
> > disabling corosync and pacemaker so they don't start on boot
> > and start them manually after a few minutes when the network
> > is stable.  If it works than you have some kind of a timing
> > issue.  You can try using 'auto eth0' or a static IP address
> > to see if it helps...
> > 
> 
> It seems that static network configuration really solved this issue.
> No 
> further modifications of services were necessary.
> 
> Thanks for help.

Yes, that would be it -- corosync doesn't play well with DHCP. The
lease renewals (even when keeping the same IP) can disrupt corosync
communication. (At least that was the case when I last looked into it --
there may have been recent changes to work around it, or that may be
coming in the new kronosnet protocol.)
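
For reference, a static version of the earlier interfaces stanza would
look something like this (addresses are placeholders):

  auto eth0
  iface eth0 inet static
      address 192.168.1.10
      netmask 255.255.255.0
      gateway 192.168.1.1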

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] crm_resource --wait

2017-10-10 Thread Ken Gaillot
On Tue, 2017-10-10 at 15:19 +1000, Leon Steffens wrote:
> Hi Ken,
> 
> I managed to reproduce this on a simplified version of the cluster,
> and on Pacemaker 1.1.15, 1.1.16, as well as 1.1.18-rc1

> The steps to create the cluster are:
> 
> pcs property set stonith-enabled=false
> pcs property set placement-strategy=balanced
> 
> pcs node utilization vm1 cpu=100
> pcs node utilization vm2 cpu=100
> pcs node utilization vm3 cpu=100
> 
> pcs property set maintenance-mode=true
> 
> pcs resource create sv-fencer ocf:pacemaker:Dummy
> 
> pcs resource create sv ocf:pacemaker:Dummy clone notify=false
> pcs resource create std ocf:pacemaker:Dummy meta resource-
> stickiness=100
> 
> pcs resource create partition1 ocf:pacemaker:Dummy meta resource-
> stickiness=100
> pcs resource create partition2 ocf:pacemaker:Dummy meta resource-
> stickiness=100
> pcs resource create partition3 ocf:pacemaker:Dummy meta resource-
> stickiness=100
> 
> pcs resource utilization partition1 cpu=5
> pcs resource utilization partition2 cpu=5
> pcs resource utilization partition3 cpu=5
> 
> pcs constraint colocation add std with sv-clone INFINITY
> pcs constraint colocation add partition1 with sv-clone INFINITY
> pcs constraint colocation add partition2 with sv-clone INFINITY
> pcs constraint colocation add partition3 with sv-clone INFINITY
> 
> pcs property set maintenance-mode=false
>  
> 
> I can then reproduce the issues in the following way:
> 
> $ pcs resource
>  sv-fencer      (ocf::pacemaker:Dummy): Started vm1
>  Clone Set: sv-clone [sv]
>      Started: [ vm1 vm2 vm3 ]
>  std    (ocf::pacemaker:Dummy): Started vm2
>  partition1     (ocf::pacemaker:Dummy): Started vm3
>  partition2     (ocf::pacemaker:Dummy): Started vm1
>  partition3     (ocf::pacemaker:Dummy): Started vm2
> 
> $ pcs cluster standby vm3
> 
> # Check that all resources have moved off vm3
> $ pcs resource
>  sv-fencer      (ocf::pacemaker:Dummy): Started vm1
>  Clone Set: sv-clone [sv]
>      Started: [ vm1 vm2 ]
>      Stopped: [ vm3 ]
>  std    (ocf::pacemaker:Dummy): Started vm2
>  partition1     (ocf::pacemaker:Dummy): Started vm1
>  partition2     (ocf::pacemaker:Dummy): Started vm1
>  partition3     (ocf::pacemaker:Dummy): Started vm2

Thanks for the detailed information, this should help me get to the
bottom of it. From this description, it sounds like a new transition
isn't being triggered when it should.

Could you please attach the DC's pe-input file that is listed in the
logs after the standby step above? That would simplify analysis.

> # Wait for any outstanding actions to complete.
> $ crm_resource --wait --timeout 300
> Pending actions:
>         Action 22: sv-fencer_monitor_1      on vm2
>         Action 21: sv-fencer_start_0    on vm2
>         Action 20: sv-fencer_stop_0     on vm1
> Error performing operation: Timer expired
> 
> # Check the resources again - sv-fencer is still on vm1
> $ pcs resource
>  sv-fencer      (ocf::pacemaker:Dummy): Started vm1
>  Clone Set: sv-clone [sv]
>      Started: [ vm1 vm2 ]
>      Stopped: [ vm3 ]
>  std    (ocf::pacemaker:Dummy): Started vm2
>  partition1     (ocf::pacemaker:Dummy): Started vm1
>  partition2     (ocf::pacemaker:Dummy): Started vm1
>  partition3     (ocf::pacemaker:Dummy): Started vm2
> 
> # Perform a random update to the CIB.
> $ pcs resource update std op monitor interval=20 timeout=20
> 
> # Check resource status again - sv_fencer has now moved to vm2 (the
> action crm_resource was waiting for)
> $ pcs resource
>  sv-fencer      (ocf::pacemaker:Dummy): Started vm2  <<<
>  Clone Set: sv-clone [sv]
>      Started: [ vm1 vm2 ]
>      Stopped: [ vm3 ]
>  std    (ocf::pacemaker:Dummy): Started vm2
>  partition1     (ocf::pacemaker:Dummy): Started vm1
>  partition2     (ocf::pacemaker:Dummy): Started vm1
>  partition3     (ocf::pacemaker:Dummy): Started vm2
> 
> I do not get the problem if I:
> 1) remove the "std" resource; or
> 2) remove the co-location constraints; or
> 3) remove the utilization attributes for the partition resources.
> 
> In these cases the sv-fencer resource is happy to stay on vm1, and
> crm_resource --wait returns immediately.
> 
> It looks like the pcs cluster standby call is creating/registering
> the actions to move the sv-fencer resource to vm2, but it doesn't
> include it in the cluster transition.  When the CIB is later updated
> by something else, the action is included in that transition.
> 
> 
> Regards,
> Leon

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[ClusterLabs] Pacemaker 1.1.18 Release Candidate 2

2017-10-16 Thread Ken Gaillot
The second release candidate for Pacemaker version 1.1.18 is now
available at:

https://github.com/ClusterLabs/pacemaker/releases/tag/Pacemaker-1.1.18-rc2

This release fixes a few minor regressions introduced in rc1, plus a
few long-standing minor bugs. For details, see the ChangeLog.

Any testing you can do is very welcome.
-- 
Ken Gaillot <kgail...@redhat.com>

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] When resource fails to start it stops an apparently unrelated resource

2017-10-16 Thread Ken Gaillot
On Mon, 2017-10-16 at 18:30 +0200, Gerard Garcia wrote:
> Hi,
> 
> I have a cluster with two ocf:heartbeat:anything resources each one
> running as a clone in all nodes of the cluster. For some reason when
> one of them fails to start the other one stops. There is not any
> constrain configured or any kind of relation between them. 
> 
> Is it possible that there is some kind of implicit relation that I'm
> not aware of (for example because they are the same type?)
> 
> Thanks,
> 
> Gerard

There is no implicit relation on the Pacemaker side. However if the
agent returns "failed" for both resources when either one fails, you
could see something like that. I'd look at the logs on the DC and see
why it decided to restart the second resource.
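
Something like this on the DC usually shows the reasoning (the log
location varies by distribution -- it may be /var/log/messages instead):

  grep -e pengine -e crmd /var/log/cluster/corosync.log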
-- 
Ken Gaillot <kgail...@redhat.com>

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] When resource fails to start it stops an apparently unrelated resource

2017-10-17 Thread Ken Gaillot
On Tue, 2017-10-17 at 11:47 +0200, Gerard Garcia wrote:
> Thanks Ken. Yes, inspecting the logs seems that the failcount of the
> correctly running resource reaches the maximum number of allowed
> failures and gets banned in all nodes.
> 
> What is weird is that I just see how the failcount for the first
> resource gets updated, is like the failcount are being mixed. In
> fact, when the two resources get banned the only way I have to make
> the first one start is to disable the failing one and clean the
> failcount of the two resources (it is not enough to only clean the
> failcount of the first resource) does it make sense?
> 
> Gerard

My suspicion is that you have two instances of the same service, and
the resource agent monitor is only checking the general service, rather
than a specific instance of it, so the monitors on both of them return
failure if either one is failing.

That would make sense why you have to disable the failing resource, so
its monitor stops running. I can't think of why you'd have to clean its
failcount for the other one to start, though.

The "anything" agent very often causes more problems than it solves ...
 I'd recommend writing your own OCF agent tailored to your service.
It's not much more complicated than an init script.
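
A bare-bones agent is mostly boilerplate -- something like the sketch
below (untested; "mydaemon", the start command and the pidfile default
are placeholders you'd replace with your service's specifics):

#!/bin/sh
# Minimal OCF resource agent skeleton for a single daemon instance
: ${OCF_ROOT=/usr/lib/ocf}
: ${OCF_FUNCTIONS_DIR=${OCF_ROOT}/lib/heartbeat}
. ${OCF_FUNCTIONS_DIR}/ocf-shellfuncs

PIDFILE="${OCF_RESKEY_pidfile:-/var/run/mydaemon.pid}"

meta_data() {
    cat <<EOF
<?xml version="1.0"?>
<resource-agent name="mydaemon" version="0.1">
  <version>1.0</version>
  <longdesc lang="en">Starts, stops and monitors one specific mydaemon instance.</longdesc>
  <shortdesc lang="en">Example agent for one mydaemon instance</shortdesc>
  <parameters>
    <parameter name="pidfile" unique="1">
      <shortdesc lang="en">Pidfile of this instance</shortdesc>
      <content type="string" default="/var/run/mydaemon.pid"/>
    </parameter>
  </parameters>
  <actions>
    <action name="start" timeout="20s"/>
    <action name="stop" timeout="20s"/>
    <action name="monitor" timeout="20s" interval="10s"/>
    <action name="meta-data" timeout="5s"/>
  </actions>
</resource-agent>
EOF
}

monitor() {
    # Check this specific instance (via its pidfile), not the service in general
    [ -f "$PIDFILE" ] || return $OCF_NOT_RUNNING
    kill -0 "$(cat "$PIDFILE")" 2>/dev/null && return $OCF_SUCCESS
    return $OCF_NOT_RUNNING
}

start() {
    monitor && return $OCF_SUCCESS
    /usr/sbin/mydaemon --pidfile "$PIDFILE" || return $OCF_ERR_GENERIC
    while ! monitor; do sleep 1; done   # the cluster enforces the timeout
    return $OCF_SUCCESS
}

stop() {
    monitor || return $OCF_SUCCESS
    kill "$(cat "$PIDFILE")"
    while monitor; do sleep 1; done     # wait until it is really gone
    return $OCF_SUCCESS
}

case "$1" in
    meta-data)    meta_data; exit $OCF_SUCCESS ;;
    start)        start; exit $? ;;
    stop)         stop; exit $? ;;
    monitor)      monitor; exit $? ;;
    validate-all) exit $OCF_SUCCESS ;;
    *)            exit $OCF_ERR_UNIMPLEMENTED ;;
esac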

> On Mon, Oct 16, 2017 at 6:57 PM, Ken Gaillot <kgail...@redhat.com>
> wrote:
> > On Mon, 2017-10-16 at 18:30 +0200, Gerard Garcia wrote:
> > > Hi,
> > >
> > > I have a cluster with two ocf:heartbeat:anything resources each
> > one
> > > running as a clone in all nodes of the cluster. For some reason
> > when
> > > one of them fails to start the other one stops. There is not any
> > > constrain configured or any kind of relation between them. 
> > >
> > > Is it possible that there is some kind of implicit relation that
> > I'm
> > > not aware of (for example because they are the same type?)
> > >
> > > Thanks,
> > >
> > > Gerard
> > 
> > There is no implicit relation on the Pacemaker side. However if the
> > agent returns "failed" for both resources when either one fails,
> > you
> > could see something like that. I'd look at the logs on the DC and
> > see
> > why it decided to restart the second resource.
> > --
> > Ken Gaillot <kgail...@redhat.com>
> > 
> > ___
> > Users mailing list: Users@clusterlabs.org
> > http://lists.clusterlabs.org/mailman/listinfo/users
> > 
> > Project Home: http://www.clusterlabs.org
> > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratc
> > h.pdf
> > Bugs: http://bugs.clusterlabs.org
> > 
> 
> ___
> Users mailing list: Users@clusterlabs.org
> http://lists.clusterlabs.org/mailman/listinfo/users
> 
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.
> pdf
> Bugs: http://bugs.clusterlabs.org
-- 
Ken Gaillot <kgail...@redhat.com>

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Debugging problems with resource timeout without any actions from cluster

2017-10-17 Thread Ken Gaillot
On Tue, 2017-10-17 at 15:30 +0600, Sergey Korobitsin wrote:
> Ken Gaillot ☫ → To Cluster Labs - All topics related to open-source
> clustering welcomed @ Thu, Oct 12, 2017 09:47 -0500
> 
> Thanks for the answer, Ken,
> 
> > > I found several ways to achieve that:
> > > 
> > > 1. Put cluster in maintainance mode (as described here:
> > >    https://www.hastexo.com/resources/hints-and-kinks/maintenance-active-pacemaker-clusters/)
> > > 
> > >    As far as I understand, services will be monitored, all logs
> > > written,
> > >    etc., but no action in case of failures will be taken. Is that
> > > right?
> > 
> > Actually, maintenance mode stops all monitors (except those with
> > role=Stopped, which ensure a service is not running).
> 
> OK, got it.
> 
> > > 2. Put the particular resource to unmanaged mode, as described
> > > here:
> > >    http://clusterlabs.org/doc/en-US/Pacemaker/1.1/html-single/Pacemaker_Explained/#s-monitoring-unmanaged
> > 
> > Disabling starts and stops is the exact purpose of unmanaged, so
> > this
> > is one way to get what you want. FYI you can also set this as a
> > global
> > default for all resources by setting it in the resource defaults
> > section of the configuration.
> 
> OK, got it too.
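
For reference, setting that default cluster-wide looks something like
this with pcs (or the equivalent rsc_defaults entry in the crm shell):

  pcs resource defaults is-managed=false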
> 
> > > 3. Start all resources and remove start and stop operations from
> > > them.
> > 
> > :-O
> 
> This is kinda quirky way, but it exists! :-)
> 
> > > Which is the best way to achieve my purpose? I would like cluster
> > > to
> > > run
> > > as usual (and logging as usual or with trace on problematic
> > > resource),
> > > but no action in case of monitor failure should be taken.
> > 
> > That's actually a different goal, also easily accomplished, by
> > setting
> > on-fail=ignore on the monitor operation. From the sound of it, this
> > is
> > closer to what you want, since the cluster is still allowed to
> > start/stop resources when you standby a node, etc.
> 
> I'll try this one.
> 
> > You could also delete the recurring monitor operation from the
> > configuration, and it wouldn't run at all. But keeping it and
> > setting
> > on-fail=ignore lets you see failures in cluster status.
> > However, I'm not sure bypassing the monitor is the best solution to
> > this problem. If the problem is simply that your database monitor
> > can
> > legitimately take longer than 20 seconds in normal operation, then
> > raise the timeout as needed.
> 
> I want to determine why it needed more than 20 seconds, and under
> what
> circumstances.

Ah, excellent, that's what on-fail=ignore is useful for :-)
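
With pcs, something along these lines (resource name and timings are
placeholders):

  pcs resource update my_db op monitor interval=20s timeout=60s on-fail=ignore

keeps the monitor failures visible in crm_mon without triggering any
recovery.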

-- 
Ken Gaillot <kgail...@redhat.com>

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Pacemaker resource parameter reload confusion

2017-10-17 Thread Ken Gaillot
On Fri, 2017-09-22 at 18:30 +0200, Ferenc Wágner wrote:
> Ken Gaillot <kgail...@redhat.com> writes:
> 
> > Hmm, stop+reload is definitely a bug. Can you attach (or email it
> > to me
> > privately, or file a bz with it attached) the above pe-input file
> > with
> > any sensitive info removed?
> 
> I sent you the pe-input file privately.  It indeed shows the issue:
> 
> $ /usr/sbin/crm_simulate -x pe-input-1033.bz2 -RS
> [...]
> Executing cluster transition:
>  * Resource action: vm-alderstop on vhbl05
>  * Resource action: vm-alderreload on vhbl05
> [...]
> 
> Hope you can easily get to the bottom of this.
> 
> > Nothing's been done about reload yet. It's waiting until we get
> > around
> > to an overhaul of the OCF resource agent standard, so we can define
> > the semantics more clearly. It will involve replacing "unique" with
> > separate meta-data for reloadability and GUI hinting, and possibly
> > changes to the reload operation. Of course we'll try to stay
> > backward-
> > compatible.
> 
> Thanks for the confirmation.

This turned out to have the same underlying cause as CLBZ#5309. I have
a fix pending review, which I expect to make it into the soon-to-be-
released 1.1.18.

It is a regression introduced in 1.1.15 by commit 2558d76f. The logic
for reloads was consolidated in one place, but that happened to be
before restarts were scheduled, so it no longer had the right
information about whether a restart was needed. Now, it sets an
ordering flag that is used later to cancel the reload if the restart
becomes required. I've also added a regression test for it.
-- 
Ken Gaillot <kgail...@redhat.com>

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Mysql upgrade in DRBD setup

2017-10-13 Thread Ken Gaillot
On Fri, 2017-10-13 at 17:35 +0200, Attila Megyeri wrote:
> Hi Ken, Kristián,
> 
> 
> Thanks - I am familiar with the native replication, and we use that
> as well.
> But in this scenario I have to use DRBD. (There is a DRBD Mysql
> cluster that is a central site, which is replicated to many sites
> using native replication, and all sites have DRBD clusters as well -
> In this setup I have to use DRBD for high availability).
> 
> 
> Anyway - I thought there is a better approach for the DRBD-replicated 
> Mysql than what I outlined.
> What I am concerned about, is what will happen if I upgrade the
> active node (let's say I'm okay with the downtime) - when I fail over
> to the other node, where the program files and the data files are on
> different versions...And when I start upgrading that.
> 
> Any experience anyone?
> 
> @Kristián: my experience shows that If I try to update mysql without
> a mounted data fs - it will fail terribly... So the only option is to
> upgrade the mounted, and active instance - but the issue is the
> version difference (prog vs. data)

Exactly -- which is why I'd still go with native replication for this,
too. It just adds a step in the upgrade process I outlined earlier:
repoint all the other sites' mysql instances to the second central
server after it is upgraded (before or after it is made master, doesn't
matter). I'm assuming only the master is allowed to write.

Another alternative would be to use galera for multi-master (at least
for the two servers at the central site).

Also, it's still possible to use DRBD beneath a native replication
setup, but you'd have to replicate both the master and slave data
(using only one at a time on any given server). This makes more sense
if the mysql servers are running inside VMs or containers that can
migrate between the physical machines.

> 
> Thanks!
> 
> 
> 
> -Original Message-
> From: Ken Gaillot [mailto:kgail...@redhat.com] 
> Sent: Thursday, October 12, 2017 9:22 PM
> To: Cluster Labs - All topics related to open-source clustering
> welcomed <users@clusterlabs.org>
> Subject: Re: [ClusterLabs] Mysql upgrade in DRBD setup
> 
> On Thu, 2017-10-12 at 18:51 +0200, Attila Megyeri wrote:
> > Hi all,
> >  
> > What is the recommended mysql server upgrade methodology in case of
> > an 
> > active/passive DRBD storage?
> > (Ubuntu is the platform)
> 
> If you want to minimize downtime in a MySQL upgrade, your best bet is
> to use MySQL native replication rather than replicate the storage.
> 
> 1. starting point: node1 = master, node2 = slave 2. stop mysql on
> node2, upgrade, start mysql again, ensure OK 3. switch master to
> node2 and slave to node1, ensure OK 4. stop mysql on node1, upgrade,
> start mysql again, ensure OK
> 
> You might have a small window where the database is read-only while
> you switch masters (you can keep it to a few seconds if you arrange
> things well), but other than that, you won't have any downtime, even
> if some part of the upgrade gives you trouble.
> 
> >  
> > 1)  On the passive node the mysql data directory is not
> > mounted, 
> > so the backup fails (some postinstall jobs will attempt to perform 
> > manipulations on certain files in the data directory).
> > 2)  If the upgrade is done on the active node, it will restart
> > the 
> > service (with the service restart, not in a crm managed fassion…), 
> > which is not a very good option (downtime in a HA solution). Not
> > to 
> > mention, that it will update some files in the mysql data
> > directory, 
> > which can cause strange issues if the A/P pair is changed – since
> > on 
> > the other node the program code will still be the old one, while
> > the 
> > data dir is already upgraded.
> >  
> > Any hints are welcome!
> >  
> > Thanks,
> > Attila
> >  
> > ___
> > Users mailing list: Users@clusterlabs.org 
> > http://lists.clusterlabs.org/mailman/listinfo/users
> > 
> > Project Home: http://www.clusterlabs.org Getting started: 
> > http://www.clusterlabs.org/doc/Cluster_from_Scratch.
> > pdf
> > Bugs: http://bugs.clusterlabs.org
> 
> --
> Ken Gaillot <kgail...@redhat.com>
> 
> ___
> Users mailing list: Users@clusterlabs.org http://lists.clusterlabs.or
> g/mailman/listinfo/users
> 
> Project Home: http://www.clusterlabs.org Getting started:
> http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org

Re: [ClusterLabs] set node in maintenance - stop corosync - node is fenced - is that correct ?

2017-10-16 Thread Ken Gaillot
On Mon, 2017-10-16 at 21:49 +0200, Lars Ellenberg wrote:
> On Mon, Oct 16, 2017 at 09:20:52PM +0200, Lentes, Bernd wrote:
> > - On Oct 16, 2017, at 7:38 PM, Digimer li...@alteeve.ca wrote:
> > > On 2017-10-16 01:24 PM, Lentes, Bernd wrote:
> > > > i have the following behavior: I put a node in maintenance
> > > > mode, afterwards stop
> > > > corosync on that node with /etc/init.d/openais stop.
> > > > This node is immediately fenced. Is that expected behavior ? I
> > > > thought putting a
> > > > node into maintenance does mean the cluster does not care
> > > > anymore about that
> > > > node.
> > OS is SLES 11 SP4. That's not the most recent one.
> > Pacmekaer is 1.1.12.
> > I didn't plan to remove the node, but to do some maintenance on it.
> > 
> > If i put the node in standby, then i can invoke
> > "/etc/init.d/openais
> > stop" without that node getting fenced.
> > But then all resources on that node are stopped/migrated. If i
> > don't
> > want that, i thought maintenance is the right way.
> > Am i wrong ?
> > 
> > Ah, i just saw that i wasn't complete clear. The node is fenced
> > after
> > stopping openais, not after putting it into maintenance.
> > I did that via "crm node maintenance "
> 
> from the Changelog:
> 
> Changes since Pacemaker-1.1.15
>   ...
>   + pengine: do not fence a node in maintenance mode if it shuts down
> cleanly
>   ...
> 
> just saying ... may or may not be what you are seeing.
> 
> Short term "workaround" may be to do things differently.
> Maybe just set the cluster wide maintenance mode, not per node?

Sounds right.

Another thing to keep in mind is that even if pacemaker doesn't fence
the node, if you use DLM, DLM might fence the node (it doesn't know
about or respect any pacemaker maintenance/unmanaged settings).

I'd stop pacemaker before stopping corosync, in any case. In
maintenance mode, that should be fine. I don't think a running
pacemaker would be able to reconnect to corosync after corosync comes
back.
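
A rough sketch of that sequence with crmsh, assuming cluster-wide
maintenance mode and that your distribution ships separate service
scripts (names vary; on SLES 11 the openais script may start the whole
stack itself):

    # put the whole cluster into maintenance; resources keep running unmanaged
    crm configure property maintenance-mode=true

    # stop the stack on the node being serviced, Pacemaker first
    service pacemaker stop    # skip if openais starts Pacemaker itself
    service openais stop

    # ... do the maintenance work, then bring the stack back ...
    service openais start
    service pacemaker start

    # leave maintenance mode once the node has rejoined
    crm configure property maintenance-mode=false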

> What are you really trying to do,
> what is the reason you need it in maintenance-mode
> and stop pacemaker/corosync/openais/the clusterstack,
> but do not want to stop/migrate off the resources,
> as would be done with "standby"?
> 
-- 
Ken Gaillot <kgail...@redhat.com>

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Corosync on a home network

2017-09-12 Thread Ken Gaillot
On Mon, 2017-09-11 at 23:38 +0100, J Martin Rushton wrote:
> I've had it switched off over the last week whilst I've been trying to
> sort this out, but forgot tonight.  It must be the combination of
> setting multicast_querier and stopping the firewall that is needed.  I

FYI this should open the requisite ports:

firewall-cmd --permanent --add-service=high-availability
firewall-cmd --reload

You may need to tweak that if your cluster network is not in the default
zone.
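
For example, assuming the cluster interface sits in a zone named
"internal" (the zone name here is only a placeholder):

    firewall-cmd --permanent --zone=internal --add-service=high-availability
    firewall-cmd --reload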

> can now see:
> 
> Quorum information
> --
> Date: Mon Sep 11 23:20:15 2017
> Quorum provider:  corosync_votequorum
> Nodes:4
> Node ID:  1
> Ring ID:  1/31156
> Quorate:  Yes
> 
> Votequorum information
> --
> Expected votes:   4
> Highest expected: 4
> Total votes:  4
> Quorum:   3
> Flags:Quorate
> 
> Membership information
> --
> Nodeid  Votes Name
>  1  1 192.168.1.2 (local)
>  2  1 192.168.1.51
>  3  1 192.168.1.52
>  4  1 192.168.1.53
> 
> which is what I wanted.
> 
> Thank you very much, I can go on to build the filesystem now.
> Martin
> 
> On 11/09/17 23:14, Leon Steffens wrote:
> > Is the firewalld service running?  Just did a quick test on my Centos 7 
> > installation and by default SSH is allowed through the firewall, but 
> > corosync cannot connect to the other nodes.
> > 
> > Try: systemctl stop firewalld.service
> 
> ___
> Users mailing list: Users@clusterlabs.org
> http://lists.clusterlabs.org/mailman/listinfo/users
> 
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org

-- 
Ken Gaillot <kgail...@redhat.com>





___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[ClusterLabs] 2017 ClusterLabs Summit -- Pacemaker 1.2.0 or 2.0 talk

2017-09-06 Thread Ken Gaillot
I thought the first day of the 2017 ClusterLabs Summit went impressively
well today. We had about 50 attendees from Alteeve, Citrix, Debian,
Linbit, Red Hat, SuSE, and possibly more I've missed. The talks were
packed with information and covered a wide variety of topics, from
container orchestration to storage management to GUIs.

I gave a talk on future directions for Pacemaker. The slides are
available at:

   https://www.clusterlabs.org/images/Pacemaker-slides-2017-09-06.pdf

The main idea is that Pacemaker 1.2.0 and/or 2.0 will be more about
removing legacy code than adding anything new. As the slides are
presented, my original idea was to release 1.2.0 in the near term, with
smaller changes that don't affect rolling upgrades, then 2.0 at some
point further in the future, that would break rolling upgrades for a
(hopefully small) subset of users.

However, after discussions during and after the talk, it seems more
worthwhile to go straight to 2.0, with the major change being dropping
support for the legacy cluster layers -- heartbeat, CMAN, and the
corosync 1 pacemaker plugin. This would allow us to simplify the
pacemaker code and make it easier to add new features in the future. We
would keep the 1.1 branch alive for a period of time, and anyone who
still uses the older stacks, but is interested in fixes or features from
the 2.0 line, could submit pull requests to backport them to 1.1.

I'd like to open the discussion on this list as to what changes should
be in 2.0. The slides give some examples that fall into categories:

* Changes in Pacemaker's defaults: a higher default stonith-timeout;
defaulting record-pending to true, which will allow status displays to
show when an action (such as start or stop) is currently in progress;
interpreting a negative stonith-watchdog-timeout as meaning to
automatically calculate a value (the default of 0 would continue to mean
disabling watchdog use by Pacemaker); moving the default location of the
Pacemaker detail log to /var/log/cluster/pacemaker.log (a change from
the slides, based on summit discussions), and removing support for some
very rarely used legacy names for various configuration values.

* Changes in command-line tool behavior: We might drop old legacy
synonyms for command-line options, such as "crm_resource --host-uname"
for what is now "crm_resource --node"; and remove crm_mon's built-in
SNMP/ESMTP support in favor of the relatively recent alerts feature.

* Changes in the C API: This would affect very few people -- only users
who wrote their own C code using the Pacemaker C libraries. We would
coordinate these changes with the handful of public projects (such as
sbd) that use the API. These changes will be discussed further on the
develop...@clusterlabs.org mailing list rather than this one.

* Breaking rolling upgrades for certain legacy features: We would try to
keep the number of affected users to a minimum. Examples are
configurations created under pre-Pacemaker-1.0 and never converted to
the post-1.0 XML syntax; the undocumented "resource isolation" feature,
which has been superseded by the new bundle feature; certain legacy
cluster options that changed names long ago; and as mentioned, support
for the heartbeat, CMAN, and corosync 1+plugin stacks. Also, dropping
support for "legacy attrd" would mean that rolling upgrades from
Pacemaker older than 1.1.11, even if running on corosync 2, would not
work (even then, a rolling upgrade would work in two steps, first to a
later 1.1 version, then to 2.0).

The purpose of this email is to start a discussion about these changes.
Nothing is set in stone. We do want to focus on removing legacy
usage rather than adding new features in the 2.0 release. Anyone who has
an opinion or questions about the changes mentioned above, or
suggestions for similar changes, is encouraged to participate.

-- 
Ken Gaillot <kgail...@redhat.com>





___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[ClusterLabs] Coming in 1.1.18: deprecating stonith-enabled

2017-09-25 Thread Ken Gaillot
Hi all,

I thought I'd call attention to one of the most visible deprecations
coming in 1.1.18: stonith-enabled. In order to deprecate that option,
we have to provide an alternate way to do the things that it does.

stonith-enabled determines whether a resource's "requires" meta-
attribute defaults to "quorum" or "fencing". This already has an
alternate method, the rsc_defaults section.
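
For example, with pcs the default can be set explicitly through resource
defaults (the value shown is just an illustration):

    pcs resource defaults requires=quorum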

For everything else, e.g. whether to fence misbehaving nodes, and
whether to start resources when fencing hasn't been configured, the
cluster will now check additional criteria.

This my plan at the moment:

Fencing will be considered possible in a configuration if: "no-quorum-
policy" is "suicide", any resource has "requires" set to "unfencing" or
"fencing" (the default), any operation has "on-fail" set to "fence"
(the default for stop operations), or any fence resource has been
configured.

If fencing is not possible, the cluster will behave as if stonith-
enabled is false (even if it's not).
-- 
Ken Gaillot <kgail...@redhat.com>




___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Transition aborted when disabling resource

2017-09-30 Thread Ken Gaillot
ormlrz1    cib: info:
> > cib_perform_op:  Diff: +++ 0.18640.1 (null)
> > Sep 25 23:50:08 [4492] vttwinformlrz1    cib: info:
> > cib_perform_op:  +  /cib:  @num_updates=1
> > Sep 25 23:50:08 [4492] vttwinformlrz1    cib: info:
> > cib_perform_op:  + 
> > /cib/status/node_state[@id='2']/lrm[@id='2']/lrm_resources/lrm_reso
> > urce[@id='cftmd1s1']/lrm_rsc_op[@id='cftmd1s1_last_0']: 
> > @operation_key=cftmd1s1_start_0, @operation=start, @transition-
> > key=513:17221:0:d060d698-76d6-4a95-8f54-b0cd908aa999, @transition-
> > magic=0:0;513:17221:0:d060d698-76d6-4a95-8f54-b0cd908aa999, @call-
> > id=10460, @last-run=1506376207, @last-rc-change=1506376207, @exec-
> > time=1190
> > Sep 25 23:50:08 [4492] vttwinformlrz1    cib: info:
> > cib_process_request: Completed cib_modify operation for section
> > status: OK (rc=0, origin=vttwinformlrz2/crmd/9922,
> > version=0.18640.1)
> > Sep 25 23:50:08 [4501] vttwinformlrz1   crmd: info:
> > match_graph_event:   Action cftmd1s1_start_0 (513) confirmed on
> > vttwinformlrz2 (rc=0)
> > Sep 25 23:50:08 [4495] vttwinformlrz1 stonith-ng: info:
> > update_cib_stonith_devices_v2:   Updating device list from the
> > cib: modify lrm_rsc_op[@id='cftmd1s1_last_0']
> > Sep 25 23:50:08 [4495] vttwinformlrz1 stonith-ng: info:
> > cib_devices_update:  Updating devices to version 0.18640.1
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > I noticed the service is still running because the EOD task fails
> > and the next day the program is still running and it should not.
> > The forward lines in the log do not show anything regarding this
> > resource, it only shows it is 'Started'
>  
> Well we see that the target-role is properly set to 'Stopped' in the
> CIB.
> As we don't see a pengine run triggered by this config change, I would
> suppose that the logs you have attached are not from the DC, where the
> pengine would be running (e.g. 'Current DC' shown by crm_mon).
> 
> 
> Yes, it was the DC because the log on the other node is even shorter.
> I have noticed the line: "Sep 25 23:50:08 [4495] vttwinformlrz1
> stonith-ng: info: stonith_device_remove:   Device 'ctpinetfh'
> not found (1 active devices)"
> 
> The resource is not a stonith device. Why try to remove it?
> 
> I have set up some logs in the resource agent to trace if the stop
> method is called.
> 
> 
> Regards,
> Klaus
> 
> > Regards
> > Roberto
-- 
Ken Gaillot <kgail...@redhat.com>




___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Pacemaker starts with error on LVM resource

2017-10-01 Thread Ken Gaillot
On Thu, 2017-09-28 at 18:05 +0300, Octavian Ciobanu wrote:
> Hello all.
> 
> I have a test configuration with 2 nodes that is configured as iSCSI
> storage.
> 
> I've created a master/slave DRBD resource and a group that has the
> following resources ordered as follows:
>  - iSCSI TCP IP/port block (ocf::heartbeat:portblock)
>  - LVM (ocf::heartbeat:LVM)
>  - iSCSI IP (ocf::heartbeat:IPaddr2)
>  - iSCSI Target (ocf::heartbeat:iSCSITarget) for first LVM partition
>  - iSCSI LUN (ocf::heartbeat:iSCSILogicalUnit) for first LVM
> partition
>  - iSCSI Target (ocf::heartbeat:iSCSITarget) for second LVM partition
>  - iSCSI LUN (ocf::heartbeat:iSCSILogicalUnit) for second LVM
> partition
>  - iSCSI Target (ocf::heartbeat:iSCSITarget) for third LVM partition
>  - iSCSI LUN (ocf::heartbeat:iSCSILogicalUnit) for third LVM
> partition
>  - iSCSI TCP IP/port unBlock (ocf::heartbeat:portblock)
> 
> the LVM-iSCSI group has an order constraint on it to start after the
> DRBD resource as can be seen from pcs constraint list command
> 
> Ordering Constraints:
>   promote Storage-DRBD then start Storage (kind:Mandatory)
> Colocation Constraints:
>   Storage with Storage-DRBD (score:INFINITY) (with-rsc-role:Master)
> 
> All was OK until I did an update from CentOS 7.3 to 7.4 via yum.
> 
> After the update every time I start the cluster I get this error:
> 
> Failed Actions:
> * Storage-LVM_monitor_0 on storage01 'unknown error' (1): call=22,
> status=complete, exitreason='LVM Volume ClusterDisk is not
> available',
>     last-rc-change='Thu Sep 28 19:16:57 2017', queued=0ms, exec=515ms
> * Storage-LVM_monitor_0 on storage02 'unknown error' (1): call=22,
> status=complete, exitreason='LVM Volume ClusterDisk is not
> available',
>     last-rc-change='Thu Sep 28 19:17:48 2017', queued=0ms, exec=746ms

The "_monitor_0" on these failures means they were the initial probes
of the resource, not a recurring monitor after it was started. Before
starting a resource, Pacemaker probes its current state on all nodes,
to make sure it matches what is expected.

"Ordered probes" is a long-desired enhancement, where Pacemaker
wouldn't probe a resource until all its dependencies are up. It's
trickier than it sounds though, so it hasn't been implemented yet
(except for resources on guest nodes ordered after the guest resource
starts, which just got added to the master branch).

I don't remember anything specific to iSCSI+LVM in 7.4, hopefully
someone else does.

> Even with this error, after the DRBD resource starts, the LVM resource
> starts as it should on the DRBD master node.
> 
> I did look on both nodes to see if LVM services got started by the
> system, and disabled and even masked them to be sure that they
> will not start at all, but with these changes I still get this error.
> 
> From what I see the cluster service tries to start LVM before the
> DRBD resource is started and fails as it does not find the DRBD disk.
> 
> Any ideas on how to fix this ?
> 
> Best regards 
> Octavian Ciobanu

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] strange behaviour from pacemaker_remote

2017-10-01 Thread Ken Gaillot
On Thu, 2017-09-28 at 01:39 +0200, Adam Spiers wrote:
> Hi all,
> 
> When I do a
> 
> pkill -9 -f pacemaker_remote
> 
> to simulate failure of a remote node, sometimes I see things like:
> 
> 08:29:32 d52-54-00-da-4e-05 pacemaker_remoted[5806]: error: No
> ipc providers available for uid 0 gid 0
> 08:29:32 d52-54-00-da-4e-05 pacemaker_remoted[5806]: error: Error
> in connection setup (5806-5805-15): Remote I/O error (121)
> 
> ... and the node doesn't get fenced as expected.  Other times it
> does.
> Is this my fault for using an invalid way of simulating failure, or
> some kind of bug?
> 
> Sadly I don't have the exact version of pacemaker_remoted to hand,
> but
> I can provide it tomorrow if necessary.  It's not the latest release,
> maybe not even the one immediately preceding it.
> 
> Thanks!
> Adam

Before fencing, the cluster will try re-establishing the connection. If
you've got pacemaker_remote enabled via systemd, systemd may be
respawning it quick enough that the cluster reconnect succeeds.
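
A quick way to check that theory (assuming the unit is named
pacemaker_remote.service) is to look at its restart policy and at how
fast it came back after the kill:

    systemctl show pacemaker_remote -p Restart -p RestartUSec
    journalctl -u pacemaker_remote -n 50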

Also, until a recent master branch commit, remote nodes would not get
fenced if they were not running any resources.

And of course, a fencing resource has to be configured for the remote
node.

If none of those things were the reason, there may be a bug -- a PE
input file from the DC for that transition would be helpful.

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Restarting a failed resource on same node

2017-10-03 Thread Ken Gaillot
On Mon, 2017-10-02 at 12:32 -0700, Paolo Zarpellon wrote:
> Hi,
> on a basic 2-node cluster, I have a master-slave resource where
> master runs on a node and slave on the other one. If I kill the slave
> resource, the resource status goes to "stopped".
> Similarly, if I kill the master resource, the slave one is
> promoted to master but the failed one does not restart as slave.
> Is there a way to restart failing resources on the same node they
> were running?
> Thank you in advance.
> Regards,
> Paolo

Restarting on the same node is the default behavior -- something must
be blocking it. For example, check your migration-threshold (if
restarting fails this many times, it has nowhere to go and will stop).
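
A quick way to check is to look at the fail counts and the resource's
meta-attributes (the resource name below is a placeholder):

    crm_mon -1 --failcounts
    pcs resource show my-ms-resource
    pcs resource meta my-ms-resource migration-threshold=3
    pcs resource cleanup my-ms-resource

With migration-threshold=1, a single failure keeps the resource off that
node until the failure is cleaned up.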

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] what does cluster do when 'resourceA with resourceB' happens

2017-10-03 Thread Ken Gaillot
On Tue, 2017-10-03 at 11:53 +0100, lejeczek wrote:
> hi
> 
> I'm reading "An A-Z guide to Pacemaker's Configurations 
> Options" and in there it read:
> "...
> So when you are creating colocation constraints, it is 
> important to consider whether you should
> colocate A with B, or B with A.
> Another thing to keep in mind is that, assuming A is 
> colocated with B, the cluster will take into account
> A’s preferences when deciding which node to choose for B.
> ..."
> 
> I have a healthy cluster, three nodes, and five resources at a
> given time running on node1.
> Then I create an LVM resource which the cluster decides to put on 
> node2.
> Then I do:
> 
> $ pcs constraint colocation add resourceA(running@node1) 
> with aLocalStorage6(newly created, running@node2)
> 
> The cluster moves all the resources to node2.
> 
> And I say whaaat? Something, guide+reality, does not add up, 
> right?

That colocation constraint says "place aLocalStorage6 somewhere, then
place resourceA with it", so it makes sense resourceA moves there.

If everything else moves there, too, there must be something else in
the configuration telling it to. (Other colocation constraints maybe?)
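
If the intent was to keep resourceA where it is and pull the new storage
resource to it, the constraint would be written the other way around,
with the dependent resource first:

    pcs constraint colocation add aLocalStorage6 with resourceA INFINITY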

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] ClusterLabs.Org Documentation Problem?

2017-08-24 Thread Ken Gaillot
On Wed, 2017-08-23 at 23:33 +, Eric Robinson wrote:
> I have a BIG correction.
> 
> If you follow the instructions titled, "Pacemaker 1.1 for Corosync 2.x," and 
> NOT the ones entitled, "Pacemaker 1.1 for CMAN or Corosync 1.x," guess what? 
> It installs cman anyway, and you spend a couple of days wondering why none of 
> your changes to corosync.conf seem to be working.
> 
> --
> Eric Robinson

That's an unfortunate result of trying to use the corosync 2
instructions with CentOS 6, which only supports corosync 1 + CMAN.

The "Pacemaker Explained" document is independent of OS and toolset, but
"Clusters From Scratch" and the walk-through portions of "Pacemaker
Remote" have to pick one configuration to use for examples, and
currently they use CentOS 7.

At one point, we maintained dual versions of Clusters From Scratch for
CentOS and OpenSuSE, but it was too difficult to maintain. I believe
Debian maintains their own variant at a different location.

It would probably be worthwhile to add more info boxes to Clusters From
Scratch pointing out where other OSes might do things differently.

> -Original Message-
> From: Jan Friesse [mailto:jfrie...@redhat.com] 
> Sent: Tuesday, August 22, 2017 11:52 PM
> To: Cluster Labs - All topics related to open-source clustering welcomed 
> <users@clusterlabs.org>; kgail...@redhat.com
> Subject: Re: [ClusterLabs] ClusterLabs.Org Documentation Problem?
> 
> > Thanks for the reply. Yes, it's a bit confusing. I did end up using the 
> > documentation for Corosync 2.X since that seemed newer, but it also assumed 
> > CentOS/RHEL7 and systemd-based commands. It also incorporates cman, pcsd, 
> > psmisc, and policycoreutils-python, which are all new to me. If there 
> > is anything I can do to assist with getting the documentation cleaned up, 
> > I'd be more than glad to help.
> 
> Just a small correction.
> 
> Documentation shouldn't incorporate cman. Cman was used with corosync 1.x as 
> a configuration layer and (more important) quorum provider. With Corosync 2.x 
> quorum provider is already in corosync so no need for cman.
> 
> 
> 
> >
> > --
> > Eric Robinson
> >
> > -Original Message-
> > From: Ken Gaillot [mailto:kgail...@redhat.com]
> > Sent: Tuesday, August 22, 2017 2:08 PM
> > To: Cluster Labs - All topics related to open-source clustering 
> > welcomed <users@clusterlabs.org>
> > Subject: Re: [ClusterLabs] ClusterLabs.Org Documentation Problem?
> >
> > On Tue, 2017-08-22 at 19:40 +, Eric Robinson wrote:
> >> The documentation located here…
> >>
> >>
> >>
> >> http://clusterlabs.org/doc/
> >>
> >>
> >>
> >> …is confusing because it offers two combinations:
> >>
> >>
> >>
> >> Pacemaker 1.0 for Corosync 1.x
> >>
> >> Pacemaker 1.1 for Corosync 2.x
> >>
> >>
> >>
> >> According to the documentation, if you use Corosync 1.x you need 
> >> Pacemaker 1.0, but if you use Corosync 2.x then you need Pacemaker 
> >> 1.1.
> >>
> >>
> >>
> >> However, on my Centos 6.9 system, when I do ‘yum install pacemaker 
> >> corosync” I get the following versions:
> >>
> >>
> >>
> >> pacemaker-1.1.15-5.el6.x86_64
> >>
> >> corosync-1.4.7-5.el6.x86_64
> >>
> >>
> >>
> >> What’s the correct answer? Does Pacemaker 1.1.15 work with Corosync 
> >> 1.4.7? If so, is the documentation at ClusterLabs misleading?
> >>
> >>
> >>
> >> --
> >> Eric Robinson
> >
> > The page actually offers a third option ... "Pacemaker 1.1 for CMAN or 
> > Corosync 1.x". That's the configuration used by CentOS 6.
> >
> > However, that's still a bit misleading; the documentation set for 
> > "Pacemaker 1.1 for Corosync 2.x" is the only one that is updated, and it's 
> > mostly independent of the underlying layer, so you should prefer that set.
> >
> > I plan to reorganize that page in the coming months, so I'll try to make it 
> > clearer.

-- 
Ken Gaillot <kgail...@redhat.com>





___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] start one node only?

2017-08-24 Thread Ken Gaillot
On Thu, 2017-08-24 at 15:10 -0500, Dimitri Maziuk wrote:
> Hi everyone,
> 
> I seem to remember seeing this once before, but my google-fu is
> failing: I've a 2-node active-passive cluster; when I power up one node
> only, resources remain stopped. Is there a way to boot a cluster on one
> node only?
> 
> -- Note that if I boot up the other node everything starts, and then I
> can shut one of them down and it'll keep running. But that doesn't seem
> to happen when starting cold.
> 
> What am I missing?
> 
> TIA

That's a fail-safe. You're probably using corosync's wait_for_all option
(most likely via the two_node option). See the votequorum(5) man page
for details.

You could set wait_for_all to 0 in corosync.conf, then boot. The living
node should try to fence the other one, and proceed if fencing succeeds.
You may want to set wait_for_all back to 1 once your cluster is back to
normal.
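
A minimal corosync.conf quorum section for that temporary change might
look like this (two_node normally implies wait_for_all, so the override
has to be explicit):

    quorum {
        provider: corosync_votequorum
        two_node: 1
        wait_for_all: 0    # set back to 1 (or remove) after the cold boot
    }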
-- 
Ken Gaillot <kgail...@redhat.com>





___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] start one node only?

2017-08-24 Thread Ken Gaillot
On Thu, 2017-08-24 at 15:53 -0500, Dimitri Maziuk wrote:
> On 08/24/2017 03:40 PM, Ken Gaillot wrote:
> 
> > You could set wait_for_all to 0 in corosync.conf, then boot. The living
> > node should try to fence the other one, and proceed if fencing succeeds.
> 
> Didn't I just read a thread that says it won't: the other node is
> already down?

How could it know that, from a cold boot? It doesn't know if the other
node is down, or up but unreachable. wait_for_all is how to keep that
fencing from happening at every cluster start, but the trade-off is you
can't cold-boot a partial cluster.
-- 
Ken Gaillot <kgail...@redhat.com>





___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Pacemaker in Azure

2017-08-24 Thread Ken Gaillot
That would definitely be of wider interest.

I could see modifying the IPaddr2 RA to take some new arguments for
AWS/Azure parameters, and if those are configured, it would do the
appropriate API requests.
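
A very rough sketch of what such an agent could look like, as a
standalone OCF script rather than a change to IPaddr2 itself. Everything
here is illustrative: the parameter names are made up, the required
meta-data action is omitted, and the cloud call shown is the AWS CLI
(an Azure version would use the az equivalents instead):

    #!/bin/sh
    # cloud-ip: hypothetical agent that reassigns a secondary private IP
    # at the cloud layer; IPaddr2 would be colocated on top of it to do
    # the local plumbing.
    : ${OCF_FUNCTIONS_DIR=${OCF_ROOT}/lib/heartbeat}
    . ${OCF_FUNCTIONS_DIR}/ocf-shellfuncs

    addr="${OCF_RESKEY_ip}"             # floating address to move
    eni="${OCF_RESKEY_interface_id}"    # this node's cloud NIC ID

    cloudip_start() {
        # claim the address for this node's interface
        aws ec2 assign-private-ip-addresses \
            --network-interface-id "$eni" \
            --private-ip-addresses "$addr" \
            --allow-reassignment || exit $OCF_ERR_GENERIC
        exit $OCF_SUCCESS
    }

    cloudip_monitor() {
        # crude check: the address is plumbed locally; a real agent
        # would query the provider API instead
        ip -o addr show | grep -q " $addr/" && exit $OCF_SUCCESS
        exit $OCF_NOT_RUNNING
    }

    case "$1" in
        start)        cloudip_start ;;
        stop)         exit $OCF_SUCCESS ;;  # takeover elsewhere reclaims it
        monitor)      cloudip_monitor ;;
        validate-all) exit $OCF_SUCCESS ;;
        *)            exit $OCF_ERR_UNIMPLEMENTED ;;
    esac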

On Thu, 2017-08-24 at 23:27 +, Eric Robinson wrote:
> Leon -- I will pay you one trillion samolians for that resource agent!
> Any way we can get our hands on a copy? 
> 
>  
> 
> --
> Eric Robinson
> 
>  
> 
> From: Leon Steffens [mailto:l...@steffensonline.com] 
> Sent: Thursday, August 24, 2017 3:48 PM
> To: Cluster Labs - All topics related to open-source clustering
> welcomed <users@clusterlabs.org>
> Subject: Re: [ClusterLabs] Pacemaker in Azure
> 
>  
> 
> That's what we did in AWS.  The IPaddr2 resource agent does an arp
> broadcast after changing the local IP but this does not work in AWS
> (probably for the same reasons as Azure). 
> 
>  
> 
> 
> We created our own OCF resource agent that uses the Amazon APIs to
> move the IP in AWS land and made that dependent on the IPaddr2
> resource, and it worked fine.
> 
> 
>  
> 
> 
>  
> 
> 
> Leon Steffens
> 
> 
>  
> 
> On Fri, Aug 25, 2017 at 8:34 AM, Eric Robinson
> <eric.robin...@psmnv.com> wrote:
> 
> > Don't use Azure? ;)
> 
> That would be my preference. But since I'm stuck with Azure
> (management decision) I need to come up with something. It
> appears there is an Azure API to make changes on-the-fly from
> a Linux box. Maybe I'll write a resource agent to change Azure
> and make IPaddr2 dependent on it. That might work?
> 
> --
> Eric Robinson
> 
> 
> ___
> Users mailing list: Users@clusterlabs.org
> http://lists.clusterlabs.org/mailman/listinfo/users
> 
> Project Home: http://www.clusterlabs.org
> Getting started:
> http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
> 
>  
> 
> 
> ___
> Users mailing list: Users@clusterlabs.org
> http://lists.clusterlabs.org/mailman/listinfo/users
> 
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org

-- 
Ken Gaillot <kgail...@redhat.com>





___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Antw: Retries before setting fail-count to INFINITY

2017-08-21 Thread Ken Gaillot
On Mon, 2017-08-21 at 15:39 +0200, Ulrich Windl wrote:
> >>> Vaibhaw Pandey  schrieb am 21.08.2017 um 14:58 in
> Nachricht
> 

Re: [ClusterLabs] Pacemaker stopped monitoring the resource

2017-08-31 Thread Ken Gaillot
On Thu, 2017-08-31 at 06:41 +, Abhay B wrote:
> Hi, 
> 
> 
> I have a 2 node HA cluster configured on CentOS 7 with pcs command. 
> 
> 
> Below are the properties of the cluster :
> 
> 
> # pcs property
> Cluster Properties:
>  cluster-infrastructure: corosync
>  cluster-name: SVSDEHA
>  cluster-recheck-interval: 2s
>  dc-deadtime: 5
>  dc-version: 1.1.15-11.el7_3.5-e174ec8
>  have-watchdog: false
>  last-lrm-refresh: 1504090367
>  no-quorum-policy: ignore
>  start-failure-is-fatal: false
>  stonith-enabled: false
> 
> 
> PFA the cib.
> Also attached is the corosync.log around the time the below issue
> happened.
> 
> 
> After around 10 hrs and multiple failures, pacemaker stops monitoring
> resource on one of the nodes in the cluster.
> 
> 
> So even though the resource on the other node fails, it is never migrated
> to the node on which the resource is not monitored.
> 
> 
> Wanted to know what could have triggered this and how to avoid getting
> into such scenarios.
> I am going through the logs and couldn't find why this happened.
> 
> 
> After this log the monitoring stopped.   
> 
> Aug 29 11:01:44 [16500] TPC-D12-10-002.phaedrus.sandvine.com
> crmd: info: process_lrm_event:   Result of monitor operation for
> SVSDEHA on TPC-D12-10-002.phaedrus.sandvine.com: 0 (ok) | call=538
> key=SVSDEHA_monitor_2000 confirmed=false cib-update=50013

Are you sure the monitor stopped? Pacemaker only logs recurring monitors
when the status changes. Any successful monitors after this wouldn't be
logged.
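
One way to confirm is a one-shot status that includes the operation
history, for example:

    crm_mon -1 --operations --timing-details

The per-operation timestamps there show when the monitor last actually
ran, regardless of what made it into the log.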

> Below log says the resource is leaving the cluster. 
> Aug 29 11:01:44 [16499] TPC-D12-10-002.phaedrus.sandvine.com
> pengine: info: LogActions:  Leave   SVSDEHA:0   (Slave
> TPC-D12-10-002.phaedrus.sandvine.com)

This means that the cluster will leave the resource where it is (i.e. it
doesn't need a start, stop, move, demote, promote, etc.).

> Let me know if anything more is needed. 
> 
> 
> Regards,
> Abhay
> 
> 
> PS:'pcs resource cleanup' brought the cluster back into good state. 

There are a lot of resource action failures, so I'm not sure where the
issue is, but I'm guessing it has to do with migration-threshold=1 --
once a resource has failed once on a node, it won't be allowed back on
that node until the failure is cleaned up. Of course you also have
failure-timeout=1s, which should clean it up immediately, so I'm not
sure.

My gut feeling is that you're trying to do too many things at once. I'd
start over from scratch and proceed more slowly: first, set "two_node:
1" in corosync.conf and let no-quorum-policy default in pacemaker; then,
get stonith configured, tested, and enabled; then, test your resource
agent manually on the command line to make sure it conforms to the
expected return values
( 
http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Pacemaker_Explained/index.html#ap-ocf
 ); then add your resource to the cluster without migration-threshold or 
failure-timeout, and work out any issues with frequent failures; then finally 
set migration-threshold and failure-timeout to reflect how you want recovery to 
proceed.
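
A rough outline of that sequence with pcs (all names and parameters below
are placeholders; adjust them to your fence hardware and agent):

    # after setting "two_node: 1" in corosync.conf on both nodes
    pcs property set no-quorum-policy=stop     # i.e. back to the default

    pcs stonith create my-fence fence_ipmilan \
        pcmk_host_list="node1 node2" ipaddr=... login=... passwd=...
    stonith_admin --reboot <node>              # prove fencing works
    pcs property set stonith-enabled=true

    # exercise the agent by hand before giving it to the cluster
    ocf-tester -n SVSDEHA /usr/lib/ocf/resource.d/<provider>/<agent>

    # add the resource without migration-threshold/failure-timeout first,
    # and only set those once it runs cleanly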
-- 
Ken Gaillot <kgail...@redhat.com>





___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] VirtualDomain live migration error

2017-08-31 Thread Ken Gaillot
On Thu, 2017-08-31 at 01:13 +0200, Oscar Segarra wrote:
> Hi,
> 
> 
> In my environment, I have just two hosts, where the qemu-kvm process is
> launched by a regular user (oneadmin) - OpenNebula - 
> 
> 
> I have created a VirtualDomain resource that starts and stops the VM
> perfectly. Nevertheless, when I change the location weight in order to
> force the migration, it raises a migration failure "error: 1"
> 
> 
> If I execute the virsh migrate command (that appears in corosync.log)
> from command line, it works perfectly.
> 
> 
> Anybody has experienced the same issue?
> 
> 
> Thanks in advance for your help 

If something works from the command line but not when run by a daemon,
my first suspicion is SELinux. Check the audit log for denials around
that time.

I'd also check the system log and Pacemaker detail log around that time
to see if there is any more information.
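
For the SELinux check, something along these lines (assuming the audit
tools are installed):

    ausearch -m avc -ts recent
    grep denied /var/log/audit/audit.log | tail

    # as a quick, temporary test only: retry the move in permissive mode
    setenforce 0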
-- 
Ken Gaillot <kgail...@redhat.com>





___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org

