[ClusterLabs] Antw: starting primitive resources of a group without starting the complete group - unclear behaviour

2017-04-20 Thread Ulrich Windl
>>> "Lentes, Bernd"  schrieb am 20.04.2017 
>>> um
21:53 in Nachricht
<1649590422.18260279.1492718032265.javamail.zim...@helmholtz-muenchen.de>:
> Hi,
> 
> just for the sake of completeness I'd like to figure out what happens if I 
> start one resource which is a member of a group, but only this resource.
> I'd like to see what the other resources of that group are doing, even if it 
> may not make much sense. Just for learning and understanding.

The resource in the group is restricted to what the group enforces.

> 
> But my test results are driving me mad:

You should have explained what your group looks like and what resource you are 
testing.

> 
> first test:
> 
> crm(live)# status
> Last updated: Thu Apr 20 20:56:08 2017
> Last change: Thu Apr 20 20:46:35 2017 by root via cibadmin on ha-idg-2
> Stack: classic openais (with plugin)
> Current DC: ha-idg-2 - partition with quorum
> Version: 1.1.12-f47ea56
> 2 Nodes configured, 2 expected votes
> 14 Resources configured
> 
> 
> Online: [ ha-idg-1 ha-idg-2 ]
> 
>  Clone Set: clone_group_prim_dlm_clvmd_vg_cluster_01_ocfs2_fs_lv_xml [group_prim_dlm_clvmd_vg_cluster_01_ocfs2_fs_lv_xml]
>  Started: [ ha-idg-1 ha-idg-2 ]
>  prim_stonith_ipmi_ha-idg-1 (stonith:external/ipmi): Started ha-idg-2
>  prim_stonith_ipmi_ha-idg-2 (stonith:external/ipmi): Started ha-idg-1
> 
> crm(live)# resource start prim_vnc_ip_mausdb
> 
> crm(live)# status
> Last updated: Thu Apr 20 20:56:44 2017
> Last change: Thu Apr 20 20:56:44 2017 by root via crm_resource on ha-idg-1
> Stack: classic openais (with plugin)
> Current DC: ha-idg-2 - partition with quorum
> Version: 1.1.12-f47ea56
> 2 Nodes configured, 2 expected votes
> 14 Resources configured
> 
> 
> Online: [ ha-idg-1 ha-idg-2 ]
> 
>  Clone Set: clone_group_prim_dlm_clvmd_vg_cluster_01_ocfs2_fs_lv_xml [group_prim_dlm_clvmd_vg_cluster_01_ocfs2_fs_lv_xml]
>  Started: [ ha-idg-1 ha-idg-2 ]
>  prim_stonith_ipmi_ha-idg-1 (stonith:external/ipmi): Started ha-idg-2
>  prim_stonith_ipmi_ha-idg-2 (stonith:external/ipmi): Started ha-idg-1
>  Resource Group: group_vnc_mausdb
>  prim_vnc_ip_mausdb (ocf::heartbeat:IPaddr): Started ha-idg-1   <===
>  prim_vm_mausdb (ocf::heartbeat:VirtualDomain): Started ha-idg-1   <===
> 
> 
> 
> second test:

What's the status before the test?

> 
> crm(live)# status
> Last updated: Thu Apr 20 21:24:19 2017
> Last change: Thu Apr 20 21:20:04 2017 by root via cibadmin on ha-idg-2
> Stack: classic openais (with plugin)
> Current DC: ha-idg-2 - partition with quorum
> Version: 1.1.12-f47ea56
> 2 Nodes configured, 2 expected votes
> 14 Resources configured
> 
> 
> Online: [ ha-idg-1 ha-idg-2 ]
> 
>  Clone Set: clone_group_prim_dlm_clvmd_vg_cluster_01_ocfs2_fs_lv_xml [group_prim_dlm_clvmd_vg_cluster_01_ocfs2_fs_lv_xml]
>  Started: [ ha-idg-1 ha-idg-2 ]
>  prim_stonith_ipmi_ha-idg-1 (stonith:external/ipmi): Started ha-idg-2
>  prim_stonith_ipmi_ha-idg-2 (stonith:external/ipmi): Started ha-idg-1
> 
> 
> crm(live)# resource start prim_vnc_ip_mausdb
> 
> 
> crm(live)# status
> Last updated: Thu Apr 20 21:26:05 2017
> Last change: Thu Apr 20 21:25:55 2017 by root via cibadmin on ha-idg-2
> Stack: classic openais (with plugin)
> Current DC: ha-idg-2 - partition with quorum
> Version: 1.1.12-f47ea56
> 2 Nodes configured, 2 expected votes
> 14 Resources configured
> 
> 
> Online: [ ha-idg-1 ha-idg-2 ]
> 
>  Clone Set: clone_group_prim_dlm_clvmd_vg_cluster_01_ocfs2_fs_lv_xml [group_prim_dlm_clvmd_vg_cluster_01_ocfs2_fs_lv_xml]
>  Started: [ ha-idg-1 ha-idg-2 ]
>  prim_stonith_ipmi_ha-idg-1 (stonith:external/ipmi): Started ha-idg-2
>  prim_stonith_ipmi_ha-idg-2 (stonith:external/ipmi): Started ha-idg-1
>  Resource Group: group_vnc_mausdb
>  prim_vnc_ip_mausdb (ocf::heartbeat:IPaddr): Started ha-idg-1   <===
>  prim_vm_mausdb (ocf::heartbeat:VirtualDomain): (target-role:Stopped) Stopped   <===
> 
> 
> One time the second resource of the group is started along with the first 
> one, the other time it is not!?
> Why this unclear behaviour?

With your incomplete status and test description it's hard to say.

> 
> This is my configuration:
> 
> primitive prim_vm_mausdb VirtualDomain \
> params config="/var/lib/libvirt/images/xml/mausdb_vm.xml" \
> params hypervisor="qemu:///system" \
> params migration_transport=ssh \
> op start interval=0 timeout=120 \
> op stop interval=0 timeout=130 \
> op monitor interval=30 timeout=30 \
> op migrate_from interval=0 timeout=180 \
> op migrate_to interval=0 timeout=190 \
> meta allow-migrate=true is-managed=true \
> utilization cpu=4 hv_memory=8006
> 
> 
> primitive prim_vnc_ip_mausdb IPaddr \
> params ip=146.107.235.161 nic=br0 cidr_netmask=24 \
> meta target-role=Started
> 
> 
> group grou

[ClusterLabs] Antw: Re: Colocation of a primitive resource with a clone with limited copies

2017-04-20 Thread Ulrich Windl
>>> Ken Gaillot  wrote on 20.04.2017 at 19:33 in message
<252dd10e-e418-9d96-658c-1a351002c...@redhat.com>:
> On 04/20/2017 10:52 AM, Jan Wrona wrote:
>> Hello,
>> 
>> my problem is closely related to the thread [1], but I didn't find a
>> solution there. I have a resource that is set up as a clone C restricted
> to two copies (using the clone-max=2 meta attribute), because the
> resource takes a long time to get ready (it starts immediately though),
> 
> A resource agent must not return from "start" until a "monitor"
> operation would return success.

Of course it may, but with an error ;-)

[...]

Ulrich



___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] starting primitive resources of a group without starting the complete group - unclear behaviour

2017-04-20 Thread Ken Gaillot
On 04/20/2017 02:53 PM, Lentes, Bernd wrote:
> Hi,
> 
> just for the sake of completeness I'd like to figure out what happens if I 
> start one resource which is a member of a group, but only this resource.
> I'd like to see what the other resources of that group are doing, even if it 
> may not make much sense. Just for learning and understanding.
> 
> But my test results are driving me mad:
> 
> first test:
> 
> crm(live)# status
> Last updated: Thu Apr 20 20:56:08 2017
> Last change: Thu Apr 20 20:46:35 2017 by root via cibadmin on ha-idg-2
> Stack: classic openais (with plugin)
> Current DC: ha-idg-2 - partition with quorum
> Version: 1.1.12-f47ea56
> 2 Nodes configured, 2 expected votes
> 14 Resources configured
> 
> 
> Online: [ ha-idg-1 ha-idg-2 ]
> 
>  Clone Set: clone_group_prim_dlm_clvmd_vg_cluster_01_ocfs2_fs_lv_xml [group_prim_dlm_clvmd_vg_cluster_01_ocfs2_fs_lv_xml]
>  Started: [ ha-idg-1 ha-idg-2 ]
>  prim_stonith_ipmi_ha-idg-1 (stonith:external/ipmi): Started ha-idg-2
>  prim_stonith_ipmi_ha-idg-2 (stonith:external/ipmi): Started ha-idg-1
> 
> crm(live)# resource start prim_vnc_ip_mausdb
> 
> crm(live)# status
> Last updated: Thu Apr 20 20:56:44 2017
> Last change: Thu Apr 20 20:56:44 2017 by root via crm_resource on ha-idg-1
> Stack: classic openais (with plugin)
> Current DC: ha-idg-2 - partition with quorum
> Version: 1.1.12-f47ea56
> 2 Nodes configured, 2 expected votes
> 14 Resources configured
> 
> 
> Online: [ ha-idg-1 ha-idg-2 ]
> 
>  Clone Set: clone_group_prim_dlm_clvmd_vg_cluster_01_ocfs2_fs_lv_xml [group_prim_dlm_clvmd_vg_cluster_01_ocfs2_fs_lv_xml]
>  Started: [ ha-idg-1 ha-idg-2 ]
>  prim_stonith_ipmi_ha-idg-1 (stonith:external/ipmi): Started ha-idg-2
>  prim_stonith_ipmi_ha-idg-2 (stonith:external/ipmi): Started ha-idg-1
>  Resource Group: group_vnc_mausdb
>  prim_vnc_ip_mausdb (ocf::heartbeat:IPaddr): Started ha-idg-1   <===
>  prim_vm_mausdb (ocf::heartbeat:VirtualDomain): Started ha-idg-1   <===
> 
> 
> 
> second test:
> 
> crm(live)# status
> Last updated: Thu Apr 20 21:24:19 2017
> Last change: Thu Apr 20 21:20:04 2017 by root via cibadmin on ha-idg-2
> Stack: classic openais (with plugin)
> Current DC: ha-idg-2 - partition with quorum
> Version: 1.1.12-f47ea56
> 2 Nodes configured, 2 expected votes
> 14 Resources configured
> 
> 
> Online: [ ha-idg-1 ha-idg-2 ]
> 
>  Clone Set: clone_group_prim_dlm_clvmd_vg_cluster_01_ocfs2_fs_lv_xml [group_prim_dlm_clvmd_vg_cluster_01_ocfs2_fs_lv_xml]
>  Started: [ ha-idg-1 ha-idg-2 ]
>  prim_stonith_ipmi_ha-idg-1 (stonith:external/ipmi): Started ha-idg-2
>  prim_stonith_ipmi_ha-idg-2 (stonith:external/ipmi): Started ha-idg-1
> 
> 
> crm(live)# resource start prim_vnc_ip_mausdb
> 
> 
> crm(live)# status
> Last updated: Thu Apr 20 21:26:05 2017
> Last change: Thu Apr 20 21:25:55 2017 by root via cibadmin on ha-idg-2
> Stack: classic openais (with plugin)
> Current DC: ha-idg-2 - partition with quorum
> Version: 1.1.12-f47ea56
> 2 Nodes configured, 2 expected votes
> 14 Resources configured
> 
> 
> Online: [ ha-idg-1 ha-idg-2 ]
> 
>  Clone Set: clone_group_prim_dlm_clvmd_vg_cluster_01_ocfs2_fs_lv_xml [group_prim_dlm_clvmd_vg_cluster_01_ocfs2_fs_lv_xml]
>  Started: [ ha-idg-1 ha-idg-2 ]
>  prim_stonith_ipmi_ha-idg-1 (stonith:external/ipmi): Started ha-idg-2
>  prim_stonith_ipmi_ha-idg-2 (stonith:external/ipmi): Started ha-idg-1
>  Resource Group: group_vnc_mausdb
>  prim_vnc_ip_mausdb (ocf::heartbeat:IPaddr): Started ha-idg-1   <===
>  prim_vm_mausdb (ocf::heartbeat:VirtualDomain): (target-role:Stopped) Stopped   <===

target-role=Stopped prevents a resource from being started.

In a group, each member of the group depends on the previously listed
members, same as if ordering and colocation constraints had been created
between each pair. So, starting a resource in the "middle" of a group
will also start everything before it.
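
For illustration, a two-member group such as group_vnc_mausdb behaves roughly
as if these constraints had been configured explicitly (a crm shell sketch,
not taken from your actual CIB):

    colocation col_vm_with_ip inf: prim_vm_mausdb prim_vnc_ip_mausdb
    order ord_ip_before_vm inf: prim_vnc_ip_mausdb prim_vm_mausdb

So "resource start prim_vnc_ip_mausdb" by itself never pulls prim_vm_mausdb up
with it; whether prim_vm_mausdb also starts depends on whether a
target-role=Stopped is still set on it (or inherited from the group). Assuming
crmsh, that can be checked and cleared with:

    crm resource meta prim_vm_mausdb show target-role
    crm resource meta prim_vm_mausdb delete target-role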

> 
> One time the second resource of the group is started along with the first 
> one, the other time it is not!?
> Why this unclear behaviour?
> 
> This is my configuration:
> 
> primitive prim_vm_mausdb VirtualDomain \
> params config="/var/lib/libvirt/images/xml/mausdb_vm.xml" \
> params hypervisor="qemu:///system" \
> params migration_transport=ssh \
> op start interval=0 timeout=120 \
> op stop interval=0 timeout=130 \
> op monitor interval=30 timeout=30 \
> op migrate_from interval=0 timeout=180 \
> op migrate_to interval=0 timeout=190 \
> meta allow-migrate=true is-managed=true \
> utilization cpu=4 hv_memory=8006
> 
> 
> primitive prim_vnc_ip_mausdb IPaddr \
> params ip=146.107.235.161 nic=br0 cidr_netmask=24 \
> meta target-role=Started
> 
> 
> group group_vnc_mausdb prim_vnc_ip_mausdb prim_vm_mausdb \
>  

Re: [ClusterLabs] Colocation of a primitive resource with a clone with limited copies

2017-04-20 Thread Jan Wrona

On 20.4.2017 19:33, Ken Gaillot wrote:

On 04/20/2017 10:52 AM, Jan Wrona wrote:

Hello,

my problem is closely related to the thread [1], but I didn't find a
solution there. I have a resource that is set up as a clone C restricted
to two copies (using the clone-max=2 meta attribute), because the
resource takes a long time to get ready (it starts immediately though),

A resource agent must not return from "start" until a "monitor"
operation would return success.

Beyond that, the cluster doesn't care what "ready" means, so it's OK if
it's not fully operational by some measure. However, that raises the
question of what you're accomplishing with your monitor.

I know all that and my RA respects that. I didn't want to go into 
details about the service I'm running, but maybe it will help you 
understand. It's a data collector which receives and processes data from 
a UDP stream. To understand these data, it needs templates which 
periodically occur in the stream (every five minutes or so). After 
"start" the service is up and running, "monitor" operations are 
successful, but until the templates arrive the service is not "ready". I 
basically need to somehow simulate this "ready" state.



and by having it ready as a clone, I can failover in the time it takes
to move an IP resource. I have a colocation constraint "resource IP with
clone C", which will make sure IP runs with a working instance of C:

Configuration:
  Clone: dummy-clone
   Meta Attrs: clone-max=2 interleave=true
   Resource: dummy (class=ocf provider=heartbeat type=Dummy)
Operations: start interval=0s timeout=20 (dummy-start-interval-0s)
stop interval=0s timeout=20 (dummy-stop-interval-0s)
monitor interval=10 timeout=20 (dummy-monitor-interval-10)
  Resource: ip (class=ocf provider=heartbeat type=Dummy)
   Operations: start interval=0s timeout=20 (ip-start-interval-0s)
   stop interval=0s timeout=20 (ip-stop-interval-0s)
   monitor interval=10 timeout=20 (ip-monitor-interval-10)

Colocation Constraints:
   ip with dummy-clone (score:INFINITY)

State:
  Clone Set: dummy-clone [dummy]
  Started: [ sub1.example.org sub3.example.org ]
  ip (ocf::heartbeat:Dummy): Started sub1.example.org


This is fine until the active node (sub1.example.org) fails. Instead
of moving the IP to the passive node (sub3.example.org) with a ready clone
instance, Pacemaker will move it to the node where it just started a
fresh instance of the clone (sub2.example.org in my case):

New state:
  Clone Set: dummy-clone [dummy]
  Started: [ sub2.example.org sub3.example.org ]
  ip (ocf::heartbeat:Dummy): Started sub2.example.org


Documentation states that the cluster will choose a copy based on where
the clone is running and the resource's own location preferences, so I
don't understand why this is happening. Is there a way to tell Pacemaker
to move the IP to the node where the resource is already running?

Thanks!
Jan Wrona

[1] http://lists.clusterlabs.org/pipermail/users/2016-November/004540.html

The cluster places ip based on where the clone will be running at that
point in the recovery, rather than where it was running before recovery.

Unfortunately I can't think of a way to do exactly what you want,
hopefully someone else has an idea.

One possibility would be to use on-fail=standby on the clone monitor.
That way, instead of recovering the clone when it fails, all resources
on the node would move elsewhere. You'd then have to manually take the
node out of standby for it to be usable again.

I don't see how that would solve it. The node would be put into the 
standby state, cluster would recover the clone instance on some other 
node and possibly place the IP there too. Moreover I don't want to put 
the whole node into standby because of one failed monitor.


It might be possible to do something more if you convert the clone to a
master/slave resource, and colocate ip with the master role. For
example, you could set the master score based on how long the service
has been running, so the longest-running instance is always master.

This sounds promising; I have heard about master/slave resources but 
never actually used any. I'll look more into that, thank you for your 
advice!


___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org



___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[ClusterLabs] starting primitive resources of a group without starting the complete group - unclear behaviour

2017-04-20 Thread Lentes, Bernd
Hi,

just for the sake of completeness I'd like to figure out what happens if I 
start one resource which is a member of a group, but only this resource.
I'd like to see what the other resources of that group are doing, even if it 
may not make much sense. Just for learning and understanding.

But my test results are driving me mad:

first test:

crm(live)# status
Last updated: Thu Apr 20 20:56:08 2017
Last change: Thu Apr 20 20:46:35 2017 by root via cibadmin on ha-idg-2
Stack: classic openais (with plugin)
Current DC: ha-idg-2 - partition with quorum
Version: 1.1.12-f47ea56
2 Nodes configured, 2 expected votes
14 Resources configured


Online: [ ha-idg-1 ha-idg-2 ]

 Clone Set: clone_group_prim_dlm_clvmd_vg_cluster_01_ocfs2_fs_lv_xml [group_prim_dlm_clvmd_vg_cluster_01_ocfs2_fs_lv_xml]
 Started: [ ha-idg-1 ha-idg-2 ]
 prim_stonith_ipmi_ha-idg-1 (stonith:external/ipmi): Started ha-idg-2
 prim_stonith_ipmi_ha-idg-2 (stonith:external/ipmi): Started ha-idg-1

crm(live)# resource start prim_vnc_ip_mausdb

crm(live)# status
Last updated: Thu Apr 20 20:56:44 2017
Last change: Thu Apr 20 20:56:44 2017 by root via crm_resource on ha-idg-1
Stack: classic openais (with plugin)
Current DC: ha-idg-2 - partition with quorum
Version: 1.1.12-f47ea56
2 Nodes configured, 2 expected votes
14 Resources configured


Online: [ ha-idg-1 ha-idg-2 ]

 Clone Set: clone_group_prim_dlm_clvmd_vg_cluster_01_ocfs2_fs_lv_xml [group_prim_dlm_clvmd_vg_cluster_01_ocfs2_fs_lv_xml]
 Started: [ ha-idg-1 ha-idg-2 ]
 prim_stonith_ipmi_ha-idg-1 (stonith:external/ipmi): Started ha-idg-2
 prim_stonith_ipmi_ha-idg-2 (stonith:external/ipmi): Started ha-idg-1
 Resource Group: group_vnc_mausdb
 prim_vnc_ip_mausdb (ocf::heartbeat:IPaddr): Started ha-idg-1   <===
 prim_vm_mausdb (ocf::heartbeat:VirtualDomain): Started ha-idg-1   <===



second test:

crm(live)# status
Last updated: Thu Apr 20 21:24:19 2017
Last change: Thu Apr 20 21:20:04 2017 by root via cibadmin on ha-idg-2
Stack: classic openais (with plugin)
Current DC: ha-idg-2 - partition with quorum
Version: 1.1.12-f47ea56
2 Nodes configured, 2 expected votes
14 Resources configured


Online: [ ha-idg-1 ha-idg-2 ]

 Clone Set: clone_group_prim_dlm_clvmd_vg_cluster_01_ocfs2_fs_lv_xml [group_prim_dlm_clvmd_vg_cluster_01_ocfs2_fs_lv_xml]
 Started: [ ha-idg-1 ha-idg-2 ]
 prim_stonith_ipmi_ha-idg-1 (stonith:external/ipmi): Started ha-idg-2
 prim_stonith_ipmi_ha-idg-2 (stonith:external/ipmi): Started ha-idg-1


crm(live)# resource start prim_vnc_ip_mausdb


crm(live)# status
Last updated: Thu Apr 20 21:26:05 2017
Last change: Thu Apr 20 21:25:55 2017 by root via cibadmin on ha-idg-2
Stack: classic openais (with plugin)
Current DC: ha-idg-2 - partition with quorum
Version: 1.1.12-f47ea56
2 Nodes configured, 2 expected votes
14 Resources configured


Online: [ ha-idg-1 ha-idg-2 ]

 Clone Set: clone_group_prim_dlm_clvmd_vg_cluster_01_ocfs2_fs_lv_xml [group_prim_dlm_clvmd_vg_cluster_01_ocfs2_fs_lv_xml]
 Started: [ ha-idg-1 ha-idg-2 ]
 prim_stonith_ipmi_ha-idg-1 (stonith:external/ipmi): Started ha-idg-2
 prim_stonith_ipmi_ha-idg-2 (stonith:external/ipmi): Started ha-idg-1
 Resource Group: group_vnc_mausdb
 prim_vnc_ip_mausdb (ocf::heartbeat:IPaddr): Started ha-idg-1   <===
 prim_vm_mausdb (ocf::heartbeat:VirtualDomain): (target-role:Stopped) Stopped   <===


One time the second resource of the group is started along with the first 
one, the other time it is not!?
Why this unclear behaviour?

This is my configuration:

primitive prim_vm_mausdb VirtualDomain \
params config="/var/lib/libvirt/images/xml/mausdb_vm.xml" \
params hypervisor="qemu:///system" \
params migration_transport=ssh \
op start interval=0 timeout=120 \
op stop interval=0 timeout=130 \
op monitor interval=30 timeout=30 \
op migrate_from interval=0 timeout=180 \
op migrate_to interval=0 timeout=190 \
meta allow-migrate=true is-managed=true \
utilization cpu=4 hv_memory=8006


primitive prim_vnc_ip_mausdb IPaddr \
params ip=146.107.235.161 nic=br0 cidr_netmask=24 \
meta target-role=Started


group group_vnc_mausdb prim_vnc_ip_mausdb prim_vm_mausdb \
meta target-role=Stopped is-managed=true


Failcounts for the group and the vm are zero on both nodes. Scores for the vm 
on both nodes are -INFINITY.
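
(In case it helps to reproduce the check: the allocation scores and failcounts
above can be read roughly like this - assuming crmsh and the crm_simulate tool
shipped with pacemaker; the grep is only for readability:)

    crm_simulate -sL | grep prim_vm_mausdb
    crm resource failcount prim_vm_mausdb show ha-idg-1
    crm resource failcount prim_vm_mausdb show ha-idg-2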


Starting the vm in the second case (resource start prim_vm_mausdb) succeeds, 
then I have both resources running.

Any ideas ?


Bernd


-- 
Bernd Lentes 

Systemadministration 
institute of developmental genetics 
Gebäude 35.34 - Raum 208 
HelmholtzZentrum München 
bernd.len...@helmholtz-muenchen.de 
phone: +49 (0)89 3187 1241 
fax: +49 (0)89 3187 2294 

Only when you commit to something can you be wrong 
Scott Adams
 

Helmholtz Zentrum Muenchen
Deutsches Forschungszen

Re: [ClusterLabs] Colocation of a primitive resource with a clone with limited copies

2017-04-20 Thread Ken Gaillot
On 04/20/2017 10:52 AM, Jan Wrona wrote:
> Hello,
> 
> my problem is closely related to the thread [1], but I didn't find a
> solution there. I have a resource that is set up as a clone C restricted
> to two copies (using the clone-max=2 meta attribute), because the
> resource takes a long time to get ready (it starts immediately though),

A resource agent must not return from "start" until a "monitor"
operation would return success.

Beyond that, the cluster doesn't care what "ready" means, so it's OK if
it's not fully operational by some measure. However, that raises the
question of what you're accomplishing with your monitor.
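
A minimal sketch of what that usually looks like inside a shell-based OCF
agent (the daemon and helper names here are made up for illustration, not
taken from any particular agent; OCF_SUCCESS/OCF_ERR_GENERIC come from
ocf-shellfuncs):

    collector_start() {
        # launch the daemon; bail out if it cannot be started at all
        collector_daemon --config "$OCF_RESKEY_config" || return $OCF_ERR_GENERIC
        # do not return from start until monitor would report success
        while ! collector_monitor; do
            sleep 1
        done
        return $OCF_SUCCESS
    }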

> and by having it ready as a clone, I can failover in the time it takes
> to move an IP resource. I have a colocation constraint "resource IP with
> clone C", which will make sure IP runs with a working instance of C:
> 
> Configuration:
>  Clone: dummy-clone
>   Meta Attrs: clone-max=2 interleave=true
>   Resource: dummy (class=ocf provider=heartbeat type=Dummy)
>Operations: start interval=0s timeout=20 (dummy-start-interval-0s)
>stop interval=0s timeout=20 (dummy-stop-interval-0s)
>monitor interval=10 timeout=20 (dummy-monitor-interval-10)
>  Resource: ip (class=ocf provider=heartbeat type=Dummy)
>   Operations: start interval=0s timeout=20 (ip-start-interval-0s)
>   stop interval=0s timeout=20 (ip-stop-interval-0s)
>   monitor interval=10 timeout=20 (ip-monitor-interval-10)
> 
> Colocation Constraints:
>   ip with dummy-clone (score:INFINITY)
> 
> State:
>  Clone Set: dummy-clone [dummy]
>  Started: [ sub1.example.org sub3.example.org ]
>  ip (ocf::heartbeat:Dummy): Started sub1.example.org
> 
> 
> This is fine until the active node (sub1.example.org) fails. Instead
> of moving the IP to the passive node (sub3.example.org) with a ready clone
> instance, Pacemaker will move it to the node where it just started a
> fresh instance of the clone (sub2.example.org in my case):
> 
> New state:
>  Clone Set: dummy-clone [dummy]
>  Started: [ sub2.example.org sub3.example.org ]
>  ip (ocf::heartbeat:Dummy): Started sub2.example.org
> 
> 
> Documentation states that the cluster will choose a copy based on where
> the clone is running and the resource's own location preferences, so I
> don't understand why this is happening. Is there a way to tell Pacemaker
> to move the IP to the node where the resource is already running?
> 
> Thanks!
> Jan Wrona
> 
> [1] http://lists.clusterlabs.org/pipermail/users/2016-November/004540.html

The cluster places ip based on where the clone will be running at that
point in the recovery, rather than where it was running before recovery.

Unfortunately I can't think of a way to do exactly what you want,
hopefully someone else has an idea.

One possibility would be to use on-fail=standby on the clone monitor.
That way, instead of recovering the clone when it fails, all resources
on the node would move elsewhere. You'd then have to manually take the
node out of standby for it to be usable again.
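
In case it helps, with pcs that would be something along these lines (adjust
to your real resource; "dummy" is just the test resource from your output):

    pcs resource update dummy op monitor interval=10 timeout=20 on-fail=standby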

It might be possible to do something more if you convert the clone to a
master/slave resource, and colocate ip with the master role. For
example, you could set the master score based on how long the service
has been running, so the longest-running instance is always master.
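
A rough sketch of that last idea, assuming a shell-based agent: the monitor
action could derive the promotion score from how long the instance has been up
and feed it to the cluster via crm_master (the stamp-file handling and the
score cap are made up for illustration):

    # in the agent's monitor action, after the normal health check has passed
    start_stamp="${HA_RSCTMP}/${OCF_RESOURCE_INSTANCE}.start_stamp"
    [ -f "$start_stamp" ] || date +%s > "$start_stamp"
    uptime=$(( $(date +%s) - $(cat "$start_stamp") ))
    [ "$uptime" -gt 100000 ] && uptime=100000   # keep the score bounded
    crm_master -l reboot -v "$uptime"
    # (the stop action would remove $start_stamp again)

The ip resource would then be colocated with the Master role of that resource
instead of with the clone.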

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[ClusterLabs] Digest does not match

2017-04-20 Thread Kostiantyn Ponomarenko
Hi folks,

We have a lot of our two-node systems running in our server room.
I noticed that some of them occasionally have these entries in the syslog:

Mar 15 12:54:45 A5-E4-151-bottom corosync[13766]: [TOTEM ] Digest does
not match
Mar 15 12:54:45 A5-E4-151-bottom corosync[13766]: [TOTEM ] Received
message has invalid digest... ignoring.
Mar 15 12:54:45 A5-E4-151-bottom corosync[13766]: [TOTEM ] Invalid packet data

I am attaching corosync.conf which we use on all our systems.

Each two-node system has a dedicated Ethernet interface for interconnection.
And the two nodes are connected directly one to another using this
interface without any switches.
Based on that I assume there is no way this connection is exposed to
the outside world.

What could be causing this issue?

Thank you,
Kostia


corosync.conf
Description: Binary data
___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[ClusterLabs] Colocation of a primitive resource with a clone with limited copies

2017-04-20 Thread Jan Wrona

Hello,

my problem is closely related to the thread [1], but I didn't find a 
solution there. I have a resource that is set up as a clone C restricted 
to two copies (using the clone-max=2 meta attribute), because the 
resource takes a long time to get ready (it starts immediately though), 
and by having it ready as a clone, I can failover in the time it takes 
to move an IP resource. I have a colocation constraint "resource IP with 
clone C", which will make sure IP runs with a working instance of C:


Configuration:
 Clone: dummy-clone
  Meta Attrs: clone-max=2 interleave=true
  Resource: dummy (class=ocf provider=heartbeat type=Dummy)
   Operations: start interval=0s timeout=20 (dummy-start-interval-0s)
   stop interval=0s timeout=20 (dummy-stop-interval-0s)
   monitor interval=10 timeout=20 (dummy-monitor-interval-10)
 Resource: ip (class=ocf provider=heartbeat type=Dummy)
  Operations: start interval=0s timeout=20 (ip-start-interval-0s)
  stop interval=0s timeout=20 (ip-stop-interval-0s)
  monitor interval=10 timeout=20 (ip-monitor-interval-10)

Colocation Constraints:
  ip with dummy-clone (score:INFINITY)

State:
 Clone Set: dummy-clone [dummy]
 Started: [ sub1.example.org sub3.example.org ]
 ip (ocf::heartbeat:Dummy): Started sub1.example.org


This is fine until the active node (sub1.example.org) fails. Instead 
of moving the IP to the passive node (sub3.example.org) with a ready clone 
instance, Pacemaker will move it to the node where it just started a 
fresh instance of the clone (sub2.example.org in my case):


New state:
 Clone Set: dummy-clone [dummy]
 Started: [ sub2.example.org sub3.example.org ]
 ip (ocf::heartbeat:Dummy): Started sub2.example.org


Documentation states that the cluster will choose a copy based on where 
the clone is running and the resource's own location preferences, so I 
don't understand why this is happening. Is there a way to tell Pacemaker 
to move the IP to the node where the resource is already running?


Thanks!
Jan Wrona

[1] http://lists.clusterlabs.org/pipermail/users/2016-November/004540.html
___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Wtrlt: Antw: Re: Antw: Re: how important would you consider to have two independent fencing device for each node ?

2017-04-20 Thread Ken Gaillot
On 04/20/2017 01:43 AM, Ulrich Windl wrote:
> Should have gone to the list...
> 
> Digimer  wrote on 19.04.2017 at 17:20 in message
>> <600637f1-fef8-0a3d-821c-7aecfa398...@alteeve.ca>:
>>> On 19/04/17 02:38 AM, Ulrich Windl wrote:
>>> Digimer  wrote on 18.04.2017 at 19:08 in message
 <26e49390-b384-b46e-4965-eba5bfe59...@alteeve.ca>:
> On 18/04/17 11:07 AM, Lentes, Bernd wrote:
>> Hi,
>>
>> I'm currently establishing a two-node cluster. Each node is an HP server 
>> with an ILO card.
>> I can fence both of them, it's working fine.
>> But what if the ILO does not work correctly? Then fencing is not 
>> possible.
>
> Correct. If you only have iLO fencing, then the cluster would hang
> (failed fencing is *not* an indication of node death).
>
>> I also have a switched PDU from APC. Each server has two power supplies. 
>> Currently one is connected to the normal power equipment, the other to the 
>> UPS.
>> As a sort of redundancy, in case the UPS does not work properly.
>
> That's a fine setup.
>
>> When I'd like to use the switched PDU as a fencing device I will lose the 
>> redundancy of two independent power sources, because then I have to connect 
>> both power supplies together to the UPS.
>> I wouldn't like to do that.
>
> Not if you have two switched PDUs. This is what we do in our Anvil!
> systems... One PDU feeds the first PSU in each node and the second PDU
> feeds the second PSUs. Ideally both PDUs are fed by UPSes, but that's
> not as important. One PDU on a UPS and one PDU directly from mains will
> work.
>
>> How important would you consider it to have two independent fencing devices 
>> for each node? I can't buy another PDU, currently we are very poor.
>
> Depends entirely on your tolerance for interruption. *I* answer that
> with "extremely important". However, most clusters out there have only
> IPMI-based fencing, so they would obviously say "not so important".
>
>> Is there another way to create a second fencing device, independent from 
>> the ILO card?
>>
>> Thanks.
>
> Sure, SBD would work. I've never seen IPMI not have a watchdog timer
> (and iLO is IPMI++), as one example. It's slow, and needs shared
> storage, but a small box somewhere running a small tgtd or iscsid
> should
> do the trick (note that I have never used SBD myself...).

 Slow is relative: If it takes 3 seconds from issuing the reset command until 
 the node is dead, it's fast enough for most cases. Even a switched PDU has 
 some delays: The command has to be processed, the relay may "stick" a short 
 moment, the power supply's capacitors have to discharge (if you have two power 
 supplies, both need to)...  And iLOs don't really like to be powered off.

 Ulrich
>>>
>>> The way I understand SBD, and correct me if I am wrong, recovery won't
>>> begin until sometime after the watchdog timer kicks. If the watchdog
>>> timer is 60 seconds, then your cluster will hang for >60 seconds (plus
>>> fence delays, etc).
>>
>> I think it works differently: One task periodically reads its mailbox slot 
>> for commands, and once a command was read, it's executed immediately. Only if 
>> the read task hangs for a long time does the watchdog itself trigger a reset 
>> (as SBD seems dead). So the delay is actually the sum of "write 
>> delay", "read delay", and "command execution".

I think you're right when sbd uses shared-storage, but there is a
watchdog-only configuration that I believe digimer was referring to.

With watchdog-only, the cluster will wait for the value of the
stonith-watchdog-timeout property before considering the fencing successful.
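
A minimal sketch of that watchdog-only variant (the file contents and the
5s/10s values below are just an example, not a recommendation):

    # /etc/sysconfig/sbd - no SBD_DEVICE configured, watchdog only
    SBD_WATCHDOG_DEV=/dev/watchdog
    SBD_WATCHDOG_TIMEOUT=5

    # cluster side (crm shell)
    crm configure property stonith-enabled=true stonith-watchdog-timeout=10s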

>> The manual page (SLES 11 SP4) states: "Set watchdog timeout to N seconds. 
>> This depends mostly on your storage latency; the majority of devices must be 
>> successfully read within this time, or else the node will self-fence." and 
>> "If a watchdog is used together with the "sbd" as is strongly recommended, 
>> the watchdog is activated at initial start of the sbd daemon. The watchdog is 
>> refreshed every time the majority of SBD devices has been successfully read. 
>> Using a watchdog provides additional protection against "sbd" crashing."
>>
>> Final remark: I think the developers of sbd were under drugs (or never saw a 
>> UNIX program before) when designing the options. For example: "-W  Enable or 
>> disable use of the system watchdog to protect against the sbd processes 
>> failing and the node being left in an undefined state. Specify this once to 
>> enable, twice to disable." (MHO)
>>
>> Regards,
>> Ulrich
>>
>>>
>>> IPMI and PDUs can confirm fencing of the peer in ~5 seconds (plus fence
>>> delays).
>>>
>>> -- 
>>> Digimer
>>> Papers and Projects: https://alteeve.com/w/ 
>>> "I am,

Re: [ClusterLabs] Antw: Re: Antw: Re: lvm on shared storage and a lot of...

2017-04-20 Thread lejeczek



On 20/04/17 07:57, Ulrich Windl wrote:

lejeczek  wrote on 19.04.2017 at 18:51 in message

:


On 18/04/17 15:22, Ulrich Windl wrote:

lejeczek  wrote on 18.04.2017 at 16:14 in message

:


On 18/04/17 14:45, Digimer wrote:

On 18/04/17 07:31 AM, lejeczek wrote:

.. device_block & device_unblock in dmesg.

and I see that the LVM resource would fail.
This to me seems to happen randomly, or I fail to spot a pattern.

Shared storage is a sas3 enclosure.
I believe I follow docs on LVM to the letter. I don't know what could be
the problem.

would you suggest ways to troubleshoot it? Is it faulty/failing hardware?

many thanks,
L.

LVM or clustered LVM?


no clvmd
And even though the resource would start and the fs would mount, if
I start using it more intensively I'd get more of the
block/unblock messages, and after a while the mountpoint resource fails
and then the LVM resource too.
It only gets worse after that; even after I deleted the resources, I
begin to see, e.g.:

[ 6242.606870] sd 7:0:32:0: device_unblock and setting to
running, handle(0x002c)
[ 6334.248617] sd 7:0:18:0: [sdy] tag#0 FAILED Result:
hostbyte=DID_OK driverbyte=DRIVER_SENSE
[ 6334.248633] sd 7:0:18:0: [sdy] tag#0 Sense Key : Not
Ready [current]
[ 6334.248640] sd 7:0:18:0: [sdy] tag#0 Add. Sense: Logical
unit is in process of becoming ready
[ 6334.248647] sd 7:0:18:0: [sdy] tag#0 CDB: Read(10) 28 00
00 00 00 00 00 00 08 00
[ 6334.248652] blk_update_request: I/O error, dev sdy, sector 0

Silly question: Do you have a multi-initiator setup where both initiators

use the same ID? Do your initiators have the highest priority (over the
targets)?

Regards,
Ulrich


no, I am not using iscsi here, it'a a DAS via sas3.

Isn't SAS also using the SCSI protocol? Initiator and target are SCSI terms, 
not iSCSI terms.


yes, though in my mind those terms always make me think of iSCSI first.
I don't see where the cluster would use multi-initiator, but 
I may be missing or not know it; the setup is a single SAS link (cable) 
between the HBA and the enclosure, and the same for the 
second node.
That said, there might actually be a problem with the power 
board in this enclosure, the manufacturer concluded, and a 
replacement part is now on its way.
I would just be grateful if I could rule out the software as 
a culprit here.







___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org







___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Instant service restart during failback

2017-04-20 Thread Klechomir
Hi Klaus,
It would have been too easy if it was interleave.
All my cloned resoures have interlave=true, of course.
What bothers me more is that the behaviour is asymmetrical.

Regards,
Klecho

On 20.4.2017 10:43:29 Klaus Wenninger wrote:
> On 04/20/2017 10:30 AM, Klechomir wrote:
> > Hi List,
> > Been investigating the following problem recently:
> > 
> > Have two node cluster with 4 cloned (2 on top of 2) + 1 master/slave
> > services on it (corosync+pacemaker 1.1.15)
> > The failover works properly for both nodes, i.e. when one node is
> > restarted/turned in standby, the other properly takes over, but:
> > 
> > Every time when node2 has been in standby/turned off and comes back,
> > everything recovers properly.
> > Every time when node1 has been in standby/turned off and comes back, part
> > of the cloned services on node2 are getting instantly restarted, at the
> > same second when node1 reappears, without any apparent reason (only the
> > stop/start messages in the debug).
> > 
> > Is there some known possible reason for this?
> 
> That triggers some deja-vu feeling...
> Did you have a similar issue a couple of weeks ago?
> I remember in that particular case 'interleave=true' was not the
> solution to the problem but maybe here ...
> 
> Regards,
> Klaus
> 
> > Best regards,
> > Klecho
> > 
> > ___
> > Users mailing list: Users@clusterlabs.org
> > http://lists.clusterlabs.org/mailman/listinfo/users
> > 
> > Project Home: http://www.clusterlabs.org
> > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> > Bugs: http://bugs.clusterlabs.org


___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Instant service restart during failback

2017-04-20 Thread Klaus Wenninger
On 04/20/2017 10:30 AM, Klechomir wrote:
> Hi List,
> Been investigating the following problem recently:
>
> Have two node cluster with 4 cloned (2 on top of 2) + 1 master/slave services 
> on it (corosync+pacemaker 1.1.15)
> The failover works properly for both nodes, i.e. when one node is 
> restarted/turned in standby, the other properly takes over, but:
>
> Every time when node2 has been in standby/turned off and comes back, 
> everything 
> recovers properly.
> Every time when node1 has been in standby/turned off and comes back, part of 
> the cloned services on node2 are getting instantly restarted, at the same 
> second when node1 reappears, without any apparent reason (only the 
> stop/start messages in the debug).
>
> Is there some known possible reason for this?

That triggers some deja-vu feeling...
Did you have a similar issue a couple of weeks ago?
I remember in that particular case 'interleave=true' was not the
solution to the problem but maybe here ...

Regards,
Klaus

>
> Best regards,
> Klecho
>
> ___
> Users mailing list: Users@clusterlabs.org
> http://lists.clusterlabs.org/mailman/listinfo/users
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org


-- 
Klaus Wenninger

Senior Software Engineer, EMEA ENG Openstack Infrastructure

Red Hat

kwenn...@redhat.com   


___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[ClusterLabs] Instant service restart during failback

2017-04-20 Thread Klechomir
Hi List,
Been investigating the following problem recently:

Have two node cluster with 4 cloned (2 on top of 2) + 1 master/slave services 
on it (corosync+pacemaker 1.1.15)
The failover works properly for both nodes, i.e. when one node is 
restarted/turned in standby, the other properly takes over, but:

Every time when node2 has been in standby/turned off and comes back, everything 
recovers properly.
Every time when node1 has been in standby/turned off and comes back, part of 
the cloned services on node2 are getting instantly restarted, at the same 
second when node1 reappears, without any apparent reason (only the 
stop/start messages in the debug).

Is there some known possible reason for this?

Best regards,
Klecho

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[ClusterLabs] Antw: Re: Antw: Re: lvm on shared storage and a lot of...

2017-04-20 Thread Ulrich Windl
>>> lejeczek  wrote on 19.04.2017 at 18:51 in message
:

> 
> On 18/04/17 15:22, Ulrich Windl wrote:
> lejeczek  wrote on 18.04.2017 at 16:14 in message
>> :
>>
>>> On 18/04/17 14:45, Digimer wrote:
 On 18/04/17 07:31 AM, lejeczek wrote:
> .. device_block & device_unblock in dmesg.
>
> and I see that the LVM resource would fail.
> This to me seems to happen randomly, or I fail to spot a pattern.
>
> Shared storage is a sas3 enclosure.
> I believe I follow docs on LVM to the letter. I don't know what could be
> the problem.
>
> would you suggest ways to troubleshoot it? Is it faulty/failing hardware?
>
> many thanks,
> L.
 LVM or clustered LVM?

>>> no clvmd
>>> And even though the resource would start and the fs would mount, if
>>> I start using it more intensively I'd get more of the
>>> block/unblock messages, and after a while the mountpoint resource fails
>>> and then the LVM resource too.
>>> It only gets worse after that; even after I deleted the resources, I
>>> begin to see, e.g.:
>>>
>>> [ 6242.606870] sd 7:0:32:0: device_unblock and setting to
>>> running, handle(0x002c)
>>> [ 6334.248617] sd 7:0:18:0: [sdy] tag#0 FAILED Result:
>>> hostbyte=DID_OK driverbyte=DRIVER_SENSE
>>> [ 6334.248633] sd 7:0:18:0: [sdy] tag#0 Sense Key : Not
>>> Ready [current]
>>> [ 6334.248640] sd 7:0:18:0: [sdy] tag#0 Add. Sense: Logical
>>> unit is in process of becoming ready
>>> [ 6334.248647] sd 7:0:18:0: [sdy] tag#0 CDB: Read(10) 28 00
>>> 00 00 00 00 00 00 08 00
>>> [ 6334.248652] blk_update_request: I/O error, dev sdy, sector 0
>> Silly question: Do you have a multi-initiator setup where both initiators 
> use the same ID? Do your initiators have the highest priority (over the 
> targets)?
>>
>> Regards,
>> Ulrich
>>
> no, I am not using iscsi here, it'a a DAS via sas3.

Isn't SAS also using the SCSI protocol? Initiator and target are SCSI terms, 
not iSCSI terms.


> 
>>
>>
>>
>>
>>
>> ___
>> Users mailing list: Users@clusterlabs.org 
>> http://lists.clusterlabs.org/mailman/listinfo/users 
>>
>> Project Home: http://www.clusterlabs.org 
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf 
>> Bugs: http://bugs.clusterlabs.org 





___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org