[ClusterLabs] Antw: Re: Cannot clone clvmd resource

2017-03-02 Thread Ulrich Windl
>>> Eric Ren wrote on 03.03.2017 at 04:12 in message:
[...]
> A bugfix for this issue has been released in lvm2 2.02.120-70.1. And since SLE12-SP2
> and openSUSE Leap 42.2, we recommend using '/usr/lib/ocf/resource.d/heartbeat/clvm'
> instead, which comes from the 'resource-agents' package.

[...]
It seems some release notes were not clear enough: I found out that we are also 
using ocf:lvm2:clvmd here (SLES11 SP4). When trying to diff, I found this:
# diff -u /usr/lib/ocf/resource.d/{lvm2,heartbeat}/clvmd |less
diff: /usr/lib/ocf/resource.d/heartbeat/clvmd: No such file or directory
# rpm -qf /usr/lib/ocf/resource.d/heartbeat /usr/lib/ocf/resource.d/lvm2/
resource-agents-3.9.5-49.2
lvm2-clvm-2.02.98-0.42.3

I'm confused!

Regards,
Ulrich




___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[ClusterLabs] Antw: Re: Never join a list without a problem...

2017-03-02 Thread Ulrich Windl
>>> Jeffrey Westgate wrote on 02.03.2017 at 17:32 in message:
> Since we have both pieces of the load-balanced cluster doing the same thing -
> for still-as-yet unidentified reasons - we've put atop on one and sysdig on the
> other.  Running atop at 10-second slices, hoping it will catch something.
> While configuring it yesterday, that server went into its 'episode', but
> there was nothing in the atop log to show anything.  Nothing else changed
> except the cpu load average.  No increase in any other parameter.
> 
> frustrating.

Hi!

You could try the monit approach (I could provide an RPM with a
"recent-enough" monit compiled for SLES11 SP4 (x86-64) if you need it).

The part that monitors unusual load looks like this:
  check system host.domain.org
if loadavg (1min) > 8 then exec "/var/lib/monit/log-top.sh"
if loadavg (5min) > 4 then exec "/var/lib/monit/log-top.sh"
if loadavg (15min) > 2 then exec "/var/lib/monit/log-top.sh"
if memory usage > 90% for 2 cycles then exec "/var/lib/monit/log-top.sh"
if swap usage > 25% for 2 cycles then exec "/var/lib/monit/log-top.sh"
if swap usage > 50% then exec "/var/lib/monit/log-top.sh"
if cpu usage > 99% for 15 cycles then alert
if cpu usage (user) > 90% for 30 cycles then alert
if cpu usage (system) > 20% for 2 cycles then exec "/var/lib/monit/log-top.sh"
if cpu usage (wait) > 80% then exec "/var/lib/monit/log-top.sh"
group local
### all numbers are a matter of taste ;-)
And my script (for lack of better ideas) looks like this:
#!/bin/sh
{
echo "== $(/bin/date) =="
/usr/bin/mpstat
echo "---"
/usr/bin/vmstat
echo "---"
/usr/bin/top -b -n 1 -Hi
} >> /var/log/monit/top.log

Regards,
Ulrich

> 
> 
> 
> From: Adam Spiers [aspi...@suse.com]
> Sent: Wednesday, March 01, 2017 5:33 AM
> To: Cluster Labs - All topics related to open-source clustering welcomed
> Cc: Jeffrey Westgate
> Subject: Re: [ClusterLabs] Never join a list without a problem...
> 
> Ferenc Wágner  wrote:
>>Jeffrey Westgate  writes:
>>
>>> We use Nagios to monitor, and once every 20 to 40 hours - sometimes
>>> longer, and we cannot set a clock by it - while the machine is 95%
>>> idle (or more according to 'top'), the host load shoots up to 50 or
>>> 60%.  It takes about 20 minutes to peak, and another 30 to 45 minutes
>>> to come back down to baseline, which is mostly 0.00.  (attached
>>> hostload.pdf) This happens to both machines, randomly, and is
>>> concerning, as we'd like to find what's causing it and resolve it.
>>
>>Try running atop (http://www.atoptool.nl/).  It collects and logs
>>process accounting info, allowing you to step back in time and check
>>resource usage in the past.
> 
> Nice, I didn't know atop could also log the collected data for future
> analysis.
> 
> If you want to capture even more detail, sysdig is superb:
> 
> http://www.sysdig.org/ 
> 




___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Cannot clone clvmd resource

2017-03-02 Thread Eric Ren

On 03/02/2017 07:09 AM, Anne Nicolas wrote:


On 01/03/2017 at 23:20, Ken Gaillot wrote:

On 03/01/2017 03:49 PM, Anne Nicolas wrote:

Hi there


I'm testing quite an easy configuration to work on clvm. I'm just
getting crazy as it seems clvmd cannot be cloned on other nodes.

clvmd starts well on node1 but fails on both node2 and node3.

Your config looks fine, so I'm going to guess there's some local
difference on the nodes.


In pacemaker journalctl I get the following message
Mar 01 16:34:36 node3 pidofproc[27391]: pidofproc: cannot stat /clvmd:
No such file or directory
Mar 01 16:34:36 node3 pidofproc[27392]: pidofproc: cannot stat
/cmirrord: No such file or directory

I have no idea where the above is coming from. pidofproc is an LSB
function, but (given journalctl) I'm assuming you're using systemd. I
don't think anything in pacemaker or resource-agents uses pidofproc (at
least not currently, not sure about the older version you're using).


Thanks for your feedback. I finally checked the RA script and found the
error.

In the clvmd RA script on the non-working nodes I got:
# Common variables
DAEMON="${sbindir}/clvmd"
CMIRRORD="${sbindir}/cmirrord"
LVMCONF="${sbindir}/lvmconf"

On the working node:
DAEMON="/usr/sbin/clvmd"
CMIRRORD="/usr/sbin/cmirrord"

Looks like the path variables were not expanded. I just have to
check why I got those versions.
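
As a quick cross-check (a minimal sketch, assuming the agent sits at the SUSE
lvm2-clvm path shown above; the grep pattern and the rpm verify call are just
one way to compare nodes, not part of the agent itself):

# Did the packaging substitute the ${sbindir} placeholder on this node?
grep -n -E '^(DAEMON|CMIRRORD|LVMCONF)=' /usr/lib/ocf/resource.d/lvm2/clvmd
# Verify the agent file against the package that owns it; modified files get flagged.
rpm -Vf /usr/lib/ocf/resource.d/lvm2/clvmd

Running both on a working and a non-working node should show whether the agent
file itself differs or only its package version.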

A bugfix for this issue has been released in lvm2 2.02.120-70.1. And since SLE12-SP2
and openSUSE Leap 42.2, we recommend using '/usr/lib/ocf/resource.d/heartbeat/clvm'
instead, which comes from the 'resource-agents' package.

Eric


Thanks again for your answer.


Mar 01 16:34:36 node3 lrmd[2174]: notice: finished - rsc:p-clvmd
action:stop call_id:233 pid:27384 exit-code:0 exec-time:45ms queue-time:0ms
Mar 01 16:34:36 node3 crmd[2177]: notice: Operation p-clvmd_stop_0: ok
(node=node3, call=233, rc=0, cib-update=541, confirmed=true)
Mar 01 16:34:36 node3 crmd[2177]: notice: Initiating action 72: stop
p-dlm_stop_0 on node3 (local)
Mar 01 16:34:36 node3 lrmd[2174]: notice: executing - rsc:p-dlm
action:stop call_id:235
Mar 01 16:34:36 node3 crmd[2177]: notice: Initiating action 67: stop
p-dlm_stop_0 on node2

Here is my configuration

node 739312139: node1
node 739312140: node2
node 739312141: node3
primitive admin_addr IPaddr2 \
 params ip=172.17.2.10 \
 op monitor interval=10 timeout=20 \
 meta target-role=Started
primitive p-clvmd ocf:lvm2:clvmd \
 op start timeout=90 interval=0 \
 op stop timeout=100 interval=0 \
 op monitor interval=30 timeout=90
primitive p-dlm ocf:pacemaker:controld \
 op start timeout=90 interval=0 \
 op stop timeout=100 interval=0 \
 op monitor interval=60 timeout=90
primitive stonith-sbd stonith:external/sbd
group g-clvm p-dlm p-clvmd
clone c-clvm g-clvm meta interleave=true
property cib-bootstrap-options: \
 have-watchdog=true \
 dc-version=1.1.13-14.7-6f22ad7 \
 cluster-infrastructure=corosync \
 cluster-name=hacluster \
 stonith-enabled=true \
 placement-strategy=balanced \
 no-quorum-policy=freeze \
 last-lrm-refresh=1488404073
rsc_defaults rsc-options: \
 resource-stickiness=1 \
 migration-threshold=10
op_defaults op-options: \
 timeout=600 \
 record-pending=true

Thanks in advance for your input

Cheers







___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Cannot clone clvmd resource

2017-03-02 Thread Eric Ren

On 03/02/2017 06:20 AM, Ken Gaillot wrote:

On 03/01/2017 03:49 PM, Anne Nicolas wrote:

Hi there


I'm testing quite an easy configuration to work on clvm. I'm just
getting crazy as it seems clvmd cannot be cloned on other nodes.

clvmd starts well on node1 but fails on both node2 and node3.

Your config looks fine, so I'm going to guess there's some local
difference on the nodes.


In pacemaker journalctl I get the following message
Mar 01 16:34:36 node3 pidofproc[27391]: pidofproc: cannot stat /clvmd:
No such file or directory
Mar 01 16:34:36 node3 pidofproc[27392]: pidofproc: cannot stat
/cmirrord: No such file or directory

I have no idea where the above is coming from. pidofproc is an LSB
function, but (given journalctl) I'm assuming you're using systemd. I
don't think anything in pacemaker or resource-agents uses pidofproc (at
least not currently, not sure about the older version you're using).

I guess Anne is using LVM2 from a SUSE release. In our lvm2 package, there are
cLVM-related resource agents for clvmd and cmirrord. They use pidofproc.
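
For reference, here is a minimal sketch of how an unexpanded path variable
produces exactly the journal message quoted above (illustrative only, not the
actual RA code; it assumes pidofproc is available as on SLES):

#!/bin/sh
# Illustrative: an unexpanded ${sbindir} turns the daemon path into "/clvmd".
sbindir=""                     # left empty when the packaging substitution is missing
DAEMON="${sbindir}/clvmd"      # becomes just "/clvmd"
pidofproc "$DAEMON"            # -> "pidofproc: cannot stat /clvmd: No such file or directory"
# With sbindir=/usr/sbin the same call checks the real clvmd process.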

Eric




Mar 01 16:34:36 node3 lrmd[2174]: notice: finished - rsc:p-clvmd
action:stop call_id:233 pid:27384 exit-code:0 exec-time:45ms queue-time:0ms
Mar 01 16:34:36 node3 crmd[2177]: notice: Operation p-clvmd_stop_0: ok
(node=node3, call=233, rc=0, cib-update=541, confirmed=true)
Mar 01 16:34:36 node3 crmd[2177]: notice: Initiating action 72: stop
p-dlm_stop_0 on node3 (local)
Mar 01 16:34:36 node3 lrmd[2174]: notice: executing - rsc:p-dlm
action:stop call_id:235
Mar 01 16:34:36 node3 crmd[2177]: notice: Initiating action 67: stop
p-dlm_stop_0 on node2

Here is my configuration

node 739312139: node1
node 739312140: node2
node 739312141: node3
primitive admin_addr IPaddr2 \
 params ip=172.17.2.10 \
 op monitor interval=10 timeout=20 \
 meta target-role=Started
primitive p-clvmd ocf:lvm2:clvmd \
 op start timeout=90 interval=0 \
 op stop timeout=100 interval=0 \
 op monitor interval=30 timeout=90
primitive p-dlm ocf:pacemaker:controld \
 op start timeout=90 interval=0 \
 op stop timeout=100 interval=0 \
 op monitor interval=60 timeout=90
primitive stonith-sbd stonith:external/sbd
group g-clvm p-dlm p-clvmd
clone c-clvm g-clvm meta interleave=true
property cib-bootstrap-options: \
 have-watchdog=true \
 dc-version=1.1.13-14.7-6f22ad7 \
 cluster-infrastructure=corosync \
 cluster-name=hacluster \
 stonith-enabled=true \
 placement-strategy=balanced \
 no-quorum-policy=freeze \
 last-lrm-refresh=1488404073
rsc_defaults rsc-options: \
 resource-stickiness=1 \
 migration-threshold=10
op_defaults op-options: \
 timeout=600 \
 record-pending=true

Thanks in advance for your input

Cheers







___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] PCMK_OCF_DEGRADED (_MASTER): exit codes are mapped to PCMK_OCF_UNKNOWN_ERROR

2017-03-02 Thread Ken Gaillot
On 03/01/2017 05:28 PM, Andrew Beekhof wrote:
> On Tue, Feb 28, 2017 at 12:06 AM, Lars Ellenberg
>  wrote:
>> When I recently tried to make use of the DEGRADED monitoring results,
>> I found out that it does still not work.
>>
>> Because LRMD chooses to filter them in ocf2uniform_rc(),
>> and maps them to PCMK_OCF_UNKNOWN_ERROR.
>>
>> See patch suggestion below.
>>
>> It also filters away the other "special" rc values.
>> Do we really not want to see them in crmd/pengine?
> 
> I would think we do.
> 
>> Why does LRMD think it needs to outsmart the pengine?
> 
> Because the person that implemented the feature incorrectly assumed
> the rc would be passed back unmolested.
> 
>>
>> Note: I did build it, but did not use this yet,
>> so I have no idea if the rest of the implementation of the DEGRADED
>> stuff works as intended or if there are other things missing as well.
> 
> failcount might be the other place that needs some massaging.
> specifically, not incrementing it when a degraded rc comes through

I think that's already taken care of.

>> Thoughts?
> 
> looks good to me
> 
>>
>> diff --git a/lrmd/lrmd.c b/lrmd/lrmd.c
>> index 724edb7..39a7dd1 100644
>> --- a/lrmd/lrmd.c
>> +++ b/lrmd/lrmd.c
>> @@ -800,11 +800,40 @@ hb2uniform_rc(const char *action, int rc, const char *stdout_data)
>>  static int
>>  ocf2uniform_rc(int rc)
>>  {
>> -if (rc < 0 || rc > PCMK_OCF_FAILED_MASTER) {
>> -return PCMK_OCF_UNKNOWN_ERROR;

Let's simply use > PCMK_OCF_OTHER_ERROR here, since that's guaranteed to
be the high end (a minimal sketch of that variant follows the quoted patch
below).

Lars, do you want to test that?

>> +switch (rc) {
>> +default:
>> +   return PCMK_OCF_UNKNOWN_ERROR;
>> +
>> +case PCMK_OCF_OK:
>> +case PCMK_OCF_UNKNOWN_ERROR:
>> +case PCMK_OCF_INVALID_PARAM:
>> +case PCMK_OCF_UNIMPLEMENT_FEATURE:
>> +case PCMK_OCF_INSUFFICIENT_PRIV:
>> +case PCMK_OCF_NOT_INSTALLED:
>> +case PCMK_OCF_NOT_CONFIGURED:
>> +case PCMK_OCF_NOT_RUNNING:
>> +case PCMK_OCF_RUNNING_MASTER:
>> +case PCMK_OCF_FAILED_MASTER:
>> +
>> +case PCMK_OCF_DEGRADED:
>> +case PCMK_OCF_DEGRADED_MASTER:
>> +   return rc;
>> +
>> +#if 0
>> +   /* What about these?? */
> 
> yes, these should get passed back as-is too
> 
>> +/* 150-199 reserved for application use */
>> +PCMK_OCF_CONNECTION_DIED = 189, /* Operation failure implied by disconnection of the LRM API to a local or remote node */
>> +
>> +PCMK_OCF_EXEC_ERROR= 192, /* Generic problem invoking the agent */
>> +PCMK_OCF_UNKNOWN   = 193, /* State of the service is unknown - used for recording in-flight operations */
>> +PCMK_OCF_SIGNAL= 194,
>> +PCMK_OCF_NOT_SUPPORTED = 195,
>> +PCMK_OCF_PENDING   = 196,
>> +PCMK_OCF_CANCELLED = 197,
>> +PCMK_OCF_TIMEOUT   = 198,
>> +PCMK_OCF_OTHER_ERROR   = 199, /* Keep the same codes as PCMK_LSB */
>> +#endif
>>  }
>> -
>> -return rc;
>>  }
>>
>>  static int
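
For clarity, a minimal sketch of the variant Ken suggests above (untested; it
assumes PCMK_OCF_OTHER_ERROR (199) stays the highest defined code, so all
defined values, including PCMK_OCF_DEGRADED(_MASTER), pass through unchanged):

static int
ocf2uniform_rc(int rc)
{
    /* Map anything outside the known OCF range to a generic error;
     * hand every defined code back to crmd/pengine unmodified. */
    if (rc < 0 || rc > PCMK_OCF_OTHER_ERROR) {
        return PCMK_OCF_UNKNOWN_ERROR;
    }
    return rc;
}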

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Never join a list without a problem...

2017-03-02 Thread Jeffrey Westgate
Since we have both pieces of the load-balanced cluster doing the same thing - 
for still-as-yet unidentified reasons - we've put atop on one and sysdig on the 
other.  Running atop at 10 second slices, hoping it will catch something.  
While configuring it yesterday, that server went into its 'episode', but there 
was nothing in the atop log to show anything.  Nothing else changed except the 
cpu load average.  No increase in any other parameter.

frustrating.



From: Adam Spiers [aspi...@suse.com]
Sent: Wednesday, March 01, 2017 5:33 AM
To: Cluster Labs - All topics related to open-source clustering welcomed
Cc: Jeffrey Westgate
Subject: Re: [ClusterLabs] Never join a list without a problem...

Ferenc Wágner  wrote:
>Jeffrey Westgate  writes:
>
>> We use Nagios to monitor, and once every 20 to 40 hours - sometimes
>> longer, and we cannot set a clock by it - while the machine is 95%
>> idle (or more according to 'top'), the host load shoots up to 50 or
>> 60%.  It takes about 20 minutes to peak, and another 30 to 45 minutes
>> to come back down to baseline, which is mostly 0.00.  (attached
>> hostload.pdf) This happens to both machines, randomly, and is
>> concerning, as we'd like to find what's causing it and resolve it.
>
>Try running atop (http://www.atoptool.nl/).  It collects and logs
>process accounting info, allowing you to step back in time and check
>resource usage in the past.

Nice, I didn't know atop could also log the collected data for future
analysis.

If you want to capture even more detail, sysdig is superb:

http://www.sysdig.org/

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[ClusterLabs] Antw: Antw: Cannot clone clvmd resource

2017-03-02 Thread Ulrich Windl
>>> "Ulrich Windl"  schrieb am 02.03.2017 um
08:40 in Nachricht <58b7cc6a02a100024...@gwsmtp1.uni-regensburg.de>:
> Hi!
> 
> What about colocation and ordering?

Sorry, I missed the "group"

> 
> Regards,
> Ulrich
> 
> >>> Anne Nicolas wrote on 01.03.2017 at 22:49 in message
> <0b585272-1c5b-0f07-1f01-747c003c6...@gmail.com>:
>> Hi there
>> 
>> 
>> I'm testing quite an easy configuration to work on clvm. I'm just
>> getting crazy as it seems clvmd cannot be cloned on other nodes.
>> 
>> clvmd starts well on node1 but fails on both node2 and node3.
>> 
>> In pacemaker journalctl I get the following message
>> Mar 01 16:34:36 node3 pidofproc[27391]: pidofproc: cannot stat /clvmd:
>> No such file or directory
>> Mar 01 16:34:36 node3 pidofproc[27392]: pidofproc: cannot stat
>> /cmirrord: No such file or directory
>> Mar 01 16:34:36 node3 lrmd[2174]: notice: finished - rsc:p-clvmd
>> action:stop call_id:233 pid:27384 exit-code:0 exec-time:45ms queue-time:0ms
>> Mar 01 16:34:36 node3 crmd[2177]: notice: Operation p-clvmd_stop_0: ok
>> (node=node3, call=233, rc=0, cib-update=541, confirmed=true)
>> Mar 01 16:34:36 node3 crmd[2177]: notice: Initiating action 72: stop
>> p-dlm_stop_0 on node3 (local)
>> Mar 01 16:34:36 node3 lrmd[2174]: notice: executing - rsc:p-dlm
>> action:stop call_id:235
>> Mar 01 16:34:36 node3 crmd[2177]: notice: Initiating action 67: stop
>> p-dlm_stop_0 on node2
>> 
>> Here is my configuration
>> 
>> node 739312139: node1
>> node 739312140: node2
>> node 739312141: node3
>> primitive admin_addr IPaddr2 \
>> params ip=172.17.2.10 \
>> op monitor interval=10 timeout=20 \
>> meta target-role=Started
>> primitive p-clvmd ocf:lvm2:clvmd \
>> op start timeout=90 interval=0 \
>> op stop timeout=100 interval=0 \
>> op monitor interval=30 timeout=90
>> primitive p-dlm ocf:pacemaker:controld \
>> op start timeout=90 interval=0 \
>> op stop timeout=100 interval=0 \
>> op monitor interval=60 timeout=90
>> primitive stonith-sbd stonith:external/sbd
>> group g-clvm p-dlm p-clvmd
>> clone c-clvm g-clvm meta interleave=true
>> property cib-bootstrap-options: \
>> have-watchdog=true \
>> dc-version=1.1.13-14.7-6f22ad7 \
>> cluster-infrastructure=corosync \
>> cluster-name=hacluster \
>> stonith-enabled=true \
>> placement-strategy=balanced \
>> no-quorum-policy=freeze \
>> last-lrm-refresh=1488404073
>> rsc_defaults rsc-options: \
>> resource-stickiness=1 \
>> migration-threshold=10
>> op_defaults op-options: \
>> timeout=600 \
>> record-pending=true
>> 
>> Thanks in advance for your input
>> 
>> Cheers
>> 
>> -- 
>> Anne Nicolas
>> http://mageia.org 
>> 
> 
> 
> 
> 
> 





___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Antw: Cannot clone clvmd resource

2017-03-02 Thread Anne Nicolas
Anne
http://mageia.org

On 2 March 2017 at 08:40, "Ulrich Windl" wrote:
>
> Hi!
>
> What about colocation and ordering?

The problem was not there, but indeed I should have created a group with
cloned resources rather than primitives.

Thanks anyway
>
> Regards,
> Ulrich
>
> >>> Anne Nicolas  schrieb am 01.03.2017 um 22:49 in
Nachricht
> <0b585272-1c5b-0f07-1f01-747c003c6...@gmail.com>:
> > Hi there
> >
> >
> > I'm testing quite an easy configuration to work on clvm. I'm just
> > getting crazy as it seems clvmd cannot be cloned on other nodes.
> >
> > clvmd starts well on node1 but fails on both node2 and node3.
> >
> > In pacemaker journalctl I get the following message
> > Mar 01 16:34:36 node3 pidofproc[27391]: pidofproc: cannot stat /clvmd:
> > No such file or directory
> > Mar 01 16:34:36 node3 pidofproc[27392]: pidofproc: cannot stat
> > /cmirrord: No such file or directory
> > Mar 01 16:34:36 node3 lrmd[2174]: notice: finished - rsc:p-clvmd
> > action:stop call_id:233 pid:27384 exit-code:0 exec-time:45ms
queue-time:0ms
> > Mar 01 16:34:36 node3 crmd[2177]: notice: Operation p-clvmd_stop_0: ok
> > (node=node3, call=233, rc=0, cib-update=541, confirmed=true)
> > Mar 01 16:34:36 node3 crmd[2177]: notice: Initiating action 72: stop
> > p-dlm_stop_0 on node3 (local)
> > Mar 01 16:34:36 node3 lrmd[2174]: notice: executing - rsc:p-dlm
> > action:stop call_id:235
> > Mar 01 16:34:36 node3 crmd[2177]: notice: Initiating action 67: stop
> > p-dlm_stop_0 on node2
> >
> > Here is my configuration
> >
> > node 739312139: node1
> > node 739312140: node2
> > node 739312141: node3
> > primitive admin_addr IPaddr2 \
> > params ip=172.17.2.10 \
> > op monitor interval=10 timeout=20 \
> > meta target-role=Started
> > primitive p-clvmd ocf:lvm2:clvmd \
> > op start timeout=90 interval=0 \
> > op stop timeout=100 interval=0 \
> > op monitor interval=30 timeout=90
> > primitive p-dlm ocf:pacemaker:controld \
> > op start timeout=90 interval=0 \
> > op stop timeout=100 interval=0 \
> > op monitor interval=60 timeout=90
> > primitive stonith-sbd stonith:external/sbd
> > group g-clvm p-dlm p-clvmd
> > clone c-clvm g-clvm meta interleave=true
> > property cib-bootstrap-options: \
> > have-watchdog=true \
> > dc-version=1.1.13-14.7-6f22ad7 \
> > cluster-infrastructure=corosync \
> > cluster-name=hacluster \
> > stonith-enabled=true \
> > placement-strategy=balanced \
> > no-quorum-policy=freeze \
> > last-lrm-refresh=1488404073
> > rsc_defaults rsc-options: \
> > resource-stickiness=1 \
> > migration-threshold=10
> > op_defaults op-options: \
> > timeout=600 \
> > record-pending=true
> >
> > Thanks in advance for your input
> >
> > Cheers
> >
> > --
> > Anne Nicolas
> > http://mageia.org
> >
>
>
>
>
>
___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org