[ClusterLabs] Antw: starting primitive resources of a group without starting the complete group - unclear behaviour
>>> "Lentes, Bernd" wrote on 20.04.2017 at 21:53 in message
<1649590422.18260279.1492718032265.javamail.zim...@helmholtz-muenchen.de>:
> Hi,
>
> just for the sake of completeness I'd like to figure out what happens if I
> start one resource which is a member of a group, but only this resource.
> I'd like to see what the other resources of that group are doing, even if
> that may not make much sense. Just for learning and understanding.

The resource in the group is restricted to what the group enforces.

> But I'm getting mad about my test results:

You should have explained what your group looks like and which resource you
are testing.

> first test:
>
> crm(live)# status
> Last updated: Thu Apr 20 20:56:08 2017
> Last change: Thu Apr 20 20:46:35 2017 by root via cibadmin on ha-idg-2
> Stack: classic openais (with plugin)
> Current DC: ha-idg-2 - partition with quorum
> Version: 1.1.12-f47ea56
> 2 Nodes configured, 2 expected votes
> 14 Resources configured
>
> Online: [ ha-idg-1 ha-idg-2 ]
>
> Clone Set: clone_group_prim_dlm_clvmd_vg_cluster_01_ocfs2_fs_lv_xml
>   [group_prim_dlm_clvmd_vg_cluster_01_ocfs2_fs_lv_xml]
>     Started: [ ha-idg-1 ha-idg-2 ]
> prim_stonith_ipmi_ha-idg-1 (stonith:external/ipmi): Started ha-idg-2
> prim_stonith_ipmi_ha-idg-2 (stonith:external/ipmi): Started ha-idg-1
>
> crm(live)# resource start prim_vnc_ip_mausdb
>
> crm(live)# status
> Last updated: Thu Apr 20 20:56:44 2017
> Last change: Thu Apr 20 20:56:44 2017 by root via crm_resource on ha-idg-1
> [...]
> Online: [ ha-idg-1 ha-idg-2 ]
>
> Clone Set: clone_group_prim_dlm_clvmd_vg_cluster_01_ocfs2_fs_lv_xml
>   [group_prim_dlm_clvmd_vg_cluster_01_ocfs2_fs_lv_xml]
>     Started: [ ha-idg-1 ha-idg-2 ]
> prim_stonith_ipmi_ha-idg-1 (stonith:external/ipmi): Started ha-idg-2
> prim_stonith_ipmi_ha-idg-2 (stonith:external/ipmi): Started ha-idg-1
> Resource Group: group_vnc_mausdb
>     prim_vnc_ip_mausdb (ocf::heartbeat:IPaddr): Started ha-idg-1  <===
>     prim_vm_mausdb (ocf::heartbeat:VirtualDomain): Started ha-idg-1  <===
>
> second test:

What's the status before the test?

> crm(live)# status
> Last updated: Thu Apr 20 21:24:19 2017
> Last change: Thu Apr 20 21:20:04 2017 by root via cibadmin on ha-idg-2
> [...]
> Online: [ ha-idg-1 ha-idg-2 ]
>
> Clone Set: clone_group_prim_dlm_clvmd_vg_cluster_01_ocfs2_fs_lv_xml
>   [group_prim_dlm_clvmd_vg_cluster_01_ocfs2_fs_lv_xml]
>     Started: [ ha-idg-1 ha-idg-2 ]
> prim_stonith_ipmi_ha-idg-1 (stonith:external/ipmi): Started ha-idg-2
> prim_stonith_ipmi_ha-idg-2 (stonith:external/ipmi): Started ha-idg-1
>
> crm(live)# resource start prim_vnc_ip_mausdb
>
> crm(live)# status
> Last updated: Thu Apr 20 21:26:05 2017
> Last change: Thu Apr 20 21:25:55 2017 by root via cibadmin on ha-idg-2
> [...]
> Online: [ ha-idg-1 ha-idg-2 ]
>
> Clone Set: clone_group_prim_dlm_clvmd_vg_cluster_01_ocfs2_fs_lv_xml
>   [group_prim_dlm_clvmd_vg_cluster_01_ocfs2_fs_lv_xml]
>     Started: [ ha-idg-1 ha-idg-2 ]
> prim_stonith_ipmi_ha-idg-1 (stonith:external/ipmi): Started ha-idg-2
> prim_stonith_ipmi_ha-idg-2 (stonith:external/ipmi): Started ha-idg-1
> Resource Group: group_vnc_mausdb
>     prim_vnc_ip_mausdb (ocf::heartbeat:IPaddr): Started ha-idg-1  <===
>     prim_vm_mausdb (ocf::heartbeat:VirtualDomain): (target-role:Stopped)
>     Stopped  <===
>
> Once the second resource of the group is started with the first resource,
> the other time not !?!
> Why this unclear behaviour?

With your incomplete status and test description it's hard to say.

> This is my configuration:
>
> primitive prim_vm_mausdb VirtualDomain \
>     params config="/var/lib/libvirt/images/xml/mausdb_vm.xml" \
>     params hypervisor="qemu:///system" \
>     params migration_transport=ssh \
>     op start interval=0 timeout=120 \
>     op stop interval=0 timeout=130 \
>     op monitor interval=30 timeout=30 \
>     op migrate_from interval=0 timeout=180 \
>     op migrate_to interval=0 timeout=190 \
>     meta allow-migrate=true is-managed=true \
>     utilization cpu=4 hv_memory=8006
>
> primitive prim_vnc_ip_mausdb IPaddr \
>     params ip=146.107.235.161 nic=br0 cidr_netmask=24 \
>     meta target-role=Started
>
> group grou
[ClusterLabs] Antw: Re: Colocation of a primitive resource with a clone with limited copies
>>> Ken Gaillot wrote on 20.04.2017 at 19:33 in message
<252dd10e-e418-9d96-658c-1a351002c...@redhat.com>:
> On 04/20/2017 10:52 AM, Jan Wrona wrote:
>> Hello,
>>
>> my problem is closely related to the thread [1], but I didn't find a
>> solution there. I have a resource that is set up as a clone C restricted
>> to two copies (using the clone-max=2 meta attribute), because the
>> resource takes a long time to get ready (it starts immediately though),
>
> A resource agent must not return from "start" until a "monitor"
> operation would return success.

Of course it may, but with an error ;-)

[...]

Ulrich

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users
Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
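Ulrich's quip aside, the convention Ken cites is that "start" must not report success while "monitor" would still fail. A minimal, self-contained sketch of that pattern (service internals stubbed with a marker file; all names are invented for illustration, not taken from any real agent):

```shell
#!/bin/sh
# Stubbed OCF-style agent: "start" blocks until its own "monitor" would
# succeed, which is what keeps a just-started resource from immediately
# reporting a failed monitor.
OCF_SUCCESS=0
OCF_NOT_RUNNING=7
STATE_FILE="${TMPDIR:-/tmp}/demo_collector_state"

my_monitor() {
    # "Running" means our marker file exists.
    [ -f "$STATE_FILE" ] && return "$OCF_SUCCESS"
    return "$OCF_NOT_RUNNING"
}

my_start() {
    # Simulate a daemon that becomes ready a moment after launch.
    ( sleep 1; touch "$STATE_FILE" ) &
    # Do not return until our own monitor reports success; in a real
    # agent the cluster's start timeout bounds this loop.
    while ! my_monitor; do
        sleep 1
    done
    return "$OCF_SUCCESS"
}

my_start && echo "start returned: monitor now succeeds"
```

In a real agent the loop would poll the actual service (a status URL, a PID check, etc.) rather than a marker file.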
Re: [ClusterLabs] starting primitive resources of a group without starting the complete group - unclear behaviour
On 04/20/2017 02:53 PM, Lentes, Bernd wrote:
> Hi,
>
> just for the sake of completeness I'd like to figure out what happens if I
> start one resource which is a member of a group, but only this resource.
> I'd like to see what the other resources of that group are doing, even if
> that may not make much sense. Just for learning and understanding.
>
> But I'm getting mad about my test results:
>
> first test:
>
> crm(live)# status
> Last updated: Thu Apr 20 20:56:08 2017
> Last change: Thu Apr 20 20:46:35 2017 by root via cibadmin on ha-idg-2
> [...]
> Online: [ ha-idg-1 ha-idg-2 ]
>
> Clone Set: clone_group_prim_dlm_clvmd_vg_cluster_01_ocfs2_fs_lv_xml
>   [group_prim_dlm_clvmd_vg_cluster_01_ocfs2_fs_lv_xml]
>     Started: [ ha-idg-1 ha-idg-2 ]
> prim_stonith_ipmi_ha-idg-1 (stonith:external/ipmi): Started ha-idg-2
> prim_stonith_ipmi_ha-idg-2 (stonith:external/ipmi): Started ha-idg-1
>
> crm(live)# resource start prim_vnc_ip_mausdb
>
> crm(live)# status
> Last updated: Thu Apr 20 20:56:44 2017
> Last change: Thu Apr 20 20:56:44 2017 by root via crm_resource on ha-idg-1
> [...]
> Resource Group: group_vnc_mausdb
>     prim_vnc_ip_mausdb (ocf::heartbeat:IPaddr): Started ha-idg-1  <===
>     prim_vm_mausdb (ocf::heartbeat:VirtualDomain): Started ha-idg-1  <===
>
> second test:
>
> crm(live)# status
> Last updated: Thu Apr 20 21:24:19 2017
> Last change: Thu Apr 20 21:20:04 2017 by root via cibadmin on ha-idg-2
> [...]
>
> crm(live)# resource start prim_vnc_ip_mausdb
>
> crm(live)# status
> Last updated: Thu Apr 20 21:26:05 2017
> Last change: Thu Apr 20 21:25:55 2017 by root via cibadmin on ha-idg-2
> [...]
> Resource Group: group_vnc_mausdb
>     prim_vnc_ip_mausdb (ocf::heartbeat:IPaddr): Started ha-idg-1  <===
>     prim_vm_mausdb (ocf::heartbeat:VirtualDomain): (target-role:Stopped)
>     Stopped  <===

target-role=Stopped prevents a resource from being started.

In a group, each member of the group depends on the previously listed
members, same as if ordering and colocation constraints had been created
between each pair. So, starting a resource in the "middle" of a group will
also start everything before it.

> Once the second resource of the group is started with the first resource,
> the other time not !?!
> Why this unclear behaviour?
>
> This is my configuration:
>
> primitive prim_vm_mausdb VirtualDomain \
>     params config="/var/lib/libvirt/images/xml/mausdb_vm.xml" \
>     params hypervisor="qemu:///system" \
>     params migration_transport=ssh \
>     op start interval=0 timeout=120 \
>     op stop interval=0 timeout=130 \
>     op monitor interval=30 timeout=30 \
>     op migrate_from interval=0 timeout=180 \
>     op migrate_to interval=0 timeout=190 \
>     meta allow-migrate=true is-managed=true \
>     utilization cpu=4 hv_memory=8006
>
> primitive prim_vnc_ip_mausdb IPaddr \
>     params ip=146.107.235.161 nic=br0 cidr_netmask=24 \
>     meta target-role=Started
>
> group group_vnc_mausdb prim_vnc_ip_mausdb prim_vm_mausdb \
>
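The implicit constraints Ken describes can be sketched in crmsh terms, reusing the resource names from this thread (purely an illustration of the equivalence, not a suggested change to the configuration):

```shell
# This group definition...
crm configure group group_vnc_mausdb prim_vnc_ip_mausdb prim_vm_mausdb

# ...behaves roughly like these explicit constraints between each pair
# of adjacent members: the VM starts after the IP, and must run on the
# same node as the IP.
crm configure order o_ip_before_vm inf: prim_vnc_ip_mausdb prim_vm_mausdb
crm configure colocation c_vm_with_ip inf: prim_vm_mausdb prim_vnc_ip_mausdb
```

That is why starting prim_vnc_ip_mausdb (the first member) can pull up prim_vm_mausdb, while a target-role=Stopped on a member blocks it individually.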
Re: [ClusterLabs] Colocation of a primitive resource with a clone with limited copies
On 20.4.2017 19:33, Ken Gaillot wrote:
> On 04/20/2017 10:52 AM, Jan Wrona wrote:
>> Hello,
>>
>> my problem is closely related to the thread [1], but I didn't find a
>> solution there. I have a resource that is set up as a clone C restricted
>> to two copies (using the clone-max=2 meta attribute), because the
>> resource takes a long time to get ready (it starts immediately though),
>
> A resource agent must not return from "start" until a "monitor"
> operation would return success. Beyond that, the cluster doesn't care
> what "ready" means, so it's OK if it's not fully operational by some
> measure. However, that raises the question of what you're accomplishing
> with your monitor.

I know all that and my RA respects that. I didn't want to go into details
about the service I'm running, but maybe it will help you understand. It's
a data collector which receives and processes data from a UDP stream. To
understand these data, it needs templates which periodically occur in the
stream (every five minutes or so). After "start" the service is up and
running and "monitor" operations are successful, but until the templates
arrive the service is not "ready". I basically need to somehow simulate
this "ready" state.

>> and by having it ready as a clone, I can failover in the time it takes
>> to move an IP resource. I have a colocation constraint "resource IP with
>> clone C", which will make sure IP runs with a working instance of C:
>>
>> Configuration:
>>  Clone: dummy-clone
>>   Meta Attrs: clone-max=2 interleave=true
>>   Resource: dummy (class=ocf provider=heartbeat type=Dummy)
>>    Operations: start interval=0s timeout=20 (dummy-start-interval-0s)
>>                stop interval=0s timeout=20 (dummy-stop-interval-0s)
>>                monitor interval=10 timeout=20 (dummy-monitor-interval-10)
>>  Resource: ip (class=ocf provider=heartbeat type=Dummy)
>>   Operations: start interval=0s timeout=20 (ip-start-interval-0s)
>>               stop interval=0s timeout=20 (ip-stop-interval-0s)
>>               monitor interval=10 timeout=20 (ip-monitor-interval-10)
>>
>> Colocation Constraints:
>>   ip with dummy-clone (score:INFINITY)
>>
>> State:
>>  Clone Set: dummy-clone [dummy]
>>      Started: [ sub1.example.org sub3.example.org ]
>>  ip (ocf::heartbeat:Dummy): Started sub1.example.org
>>
>> This is fine until the active node (sub1.example.org) fails. Instead of
>> moving the IP to the passive node (sub3.example.org) with a ready clone
>> instance, Pacemaker will move it to the node where it just started a
>> fresh instance of the clone (sub2.example.org in my case):
>>
>> New state:
>>  Clone Set: dummy-clone [dummy]
>>      Started: [ sub2.example.org sub3.example.org ]
>>  ip (ocf::heartbeat:Dummy): Started sub2.example.org
>>
>> Documentation states that the cluster will choose a copy based on where
>> the clone is running and the resource's own location preferences, so I
>> don't understand why this is happening. Is there a way to tell Pacemaker
>> to move the IP to the node where the resource is already running?
>>
>> Thanks!
>> Jan Wrona
>>
>> [1] http://lists.clusterlabs.org/pipermail/users/2016-November/004540.html
>
> The cluster places ip based on where the clone will be running at that
> point in the recovery, rather than where it was running before recovery.
>
> Unfortunately I can't think of a way to do exactly what you want,
> hopefully someone else has an idea.
>
> One possibility would be to use on-fail=standby on the clone monitor.
> That way, instead of recovering the clone when it fails, all resources
> on the node would move elsewhere. You'd then have to manually take the
> node out of standby for it to be usable again.

I don't see how that would solve it. The node would be put into the
standby state, the cluster would recover the clone instance on some other
node and possibly place the IP there too. Moreover, I don't want to put
the whole node into standby because of one failed monitor.

> It might be possible to do something more if you convert the clone to a
> master/slave resource, and colocate ip with the master role. For
> example, you could set the master score based on how long the service
> has been running, so the longest-running instance is always master.

This sounds promising. I have heard about master/slave resources but never
actually used any. I'll look more into that, thank you for your advice!
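Ken's longest-running-instance idea could be implemented roughly like this (an assumption about one possible approach, not a tested recipe; the marker-file path and function name are invented). The agent's "monitor" action feeds the instance's uptime into its master score via crm_master, so the instance that has been up longest always wins the Master role that ip would be colocated with:

```shell
# Hypothetical fragment of the clone's OCF agent "monitor" action.
# /var/run/collector.started is assumed to be a marker file written by
# the agent's "start" action; its age in seconds becomes the master
# score, so the longest-running instance is promoted.
monitor_update_master_score() {
    marker=/var/run/collector.started
    if [ -f "$marker" ]; then
        uptime=$(( $(date +%s) - $(stat -c %Y "$marker") ))
        crm_master -l reboot -v "$uptime"
    else
        crm_master -D   # no master score while the service is down
    fi
}
```

A refinement would be to only raise the score once the templates have actually arrived, which directly encodes the "ready" state Jan describes.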
[ClusterLabs] starting primitive resources of a group without starting the complete group - unclear behaviour
Hi,

just for the sake of completeness I'd like to figure out what happens if I
start one resource which is a member of a group, but only this resource. I'd
like to see what the other resources of that group are doing, even if that
may not make much sense. Just for learning and understanding.

But I'm getting mad about my test results:

first test:

crm(live)# status
Last updated: Thu Apr 20 20:56:08 2017
Last change: Thu Apr 20 20:46:35 2017 by root via cibadmin on ha-idg-2
Stack: classic openais (with plugin)
Current DC: ha-idg-2 - partition with quorum
Version: 1.1.12-f47ea56
2 Nodes configured, 2 expected votes
14 Resources configured

Online: [ ha-idg-1 ha-idg-2 ]

Clone Set: clone_group_prim_dlm_clvmd_vg_cluster_01_ocfs2_fs_lv_xml
  [group_prim_dlm_clvmd_vg_cluster_01_ocfs2_fs_lv_xml]
    Started: [ ha-idg-1 ha-idg-2 ]
prim_stonith_ipmi_ha-idg-1 (stonith:external/ipmi): Started ha-idg-2
prim_stonith_ipmi_ha-idg-2 (stonith:external/ipmi): Started ha-idg-1

crm(live)# resource start prim_vnc_ip_mausdb

crm(live)# status
Last updated: Thu Apr 20 20:56:44 2017
Last change: Thu Apr 20 20:56:44 2017 by root via crm_resource on ha-idg-1
Stack: classic openais (with plugin)
Current DC: ha-idg-2 - partition with quorum
Version: 1.1.12-f47ea56
2 Nodes configured, 2 expected votes
14 Resources configured

Online: [ ha-idg-1 ha-idg-2 ]

Clone Set: clone_group_prim_dlm_clvmd_vg_cluster_01_ocfs2_fs_lv_xml
  [group_prim_dlm_clvmd_vg_cluster_01_ocfs2_fs_lv_xml]
    Started: [ ha-idg-1 ha-idg-2 ]
prim_stonith_ipmi_ha-idg-1 (stonith:external/ipmi): Started ha-idg-2
prim_stonith_ipmi_ha-idg-2 (stonith:external/ipmi): Started ha-idg-1
Resource Group: group_vnc_mausdb
    prim_vnc_ip_mausdb (ocf::heartbeat:IPaddr): Started ha-idg-1  <===
    prim_vm_mausdb (ocf::heartbeat:VirtualDomain): Started ha-idg-1  <===

second test:

crm(live)# status
Last updated: Thu Apr 20 21:24:19 2017
Last change: Thu Apr 20 21:20:04 2017 by root via cibadmin on ha-idg-2
Stack: classic openais (with plugin)
Current DC: ha-idg-2 - partition with quorum
Version: 1.1.12-f47ea56
2 Nodes configured, 2 expected votes
14 Resources configured

Online: [ ha-idg-1 ha-idg-2 ]

Clone Set: clone_group_prim_dlm_clvmd_vg_cluster_01_ocfs2_fs_lv_xml
  [group_prim_dlm_clvmd_vg_cluster_01_ocfs2_fs_lv_xml]
    Started: [ ha-idg-1 ha-idg-2 ]
prim_stonith_ipmi_ha-idg-1 (stonith:external/ipmi): Started ha-idg-2
prim_stonith_ipmi_ha-idg-2 (stonith:external/ipmi): Started ha-idg-1

crm(live)# resource start prim_vnc_ip_mausdb

crm(live)# status
Last updated: Thu Apr 20 21:26:05 2017
Last change: Thu Apr 20 21:25:55 2017 by root via cibadmin on ha-idg-2
Stack: classic openais (with plugin)
Current DC: ha-idg-2 - partition with quorum
Version: 1.1.12-f47ea56
2 Nodes configured, 2 expected votes
14 Resources configured

Online: [ ha-idg-1 ha-idg-2 ]

Clone Set: clone_group_prim_dlm_clvmd_vg_cluster_01_ocfs2_fs_lv_xml
  [group_prim_dlm_clvmd_vg_cluster_01_ocfs2_fs_lv_xml]
    Started: [ ha-idg-1 ha-idg-2 ]
prim_stonith_ipmi_ha-idg-1 (stonith:external/ipmi): Started ha-idg-2
prim_stonith_ipmi_ha-idg-2 (stonith:external/ipmi): Started ha-idg-1
Resource Group: group_vnc_mausdb
    prim_vnc_ip_mausdb (ocf::heartbeat:IPaddr): Started ha-idg-1  <===
    prim_vm_mausdb (ocf::heartbeat:VirtualDomain): (target-role:Stopped)
    Stopped  <===

Once the second resource of the group is started with the first resource,
the other time not !?!
Why this unclear behaviour?

This is my configuration:

primitive prim_vm_mausdb VirtualDomain \
    params config="/var/lib/libvirt/images/xml/mausdb_vm.xml" \
    params hypervisor="qemu:///system" \
    params migration_transport=ssh \
    op start interval=0 timeout=120 \
    op stop interval=0 timeout=130 \
    op monitor interval=30 timeout=30 \
    op migrate_from interval=0 timeout=180 \
    op migrate_to interval=0 timeout=190 \
    meta allow-migrate=true is-managed=true \
    utilization cpu=4 hv_memory=8006

primitive prim_vnc_ip_mausdb IPaddr \
    params ip=146.107.235.161 nic=br0 cidr_netmask=24 \
    meta target-role=Started

group group_vnc_mausdb prim_vnc_ip_mausdb prim_vm_mausdb \
    meta target-role=Stopped is-managed=true

Failcounts for the group and the VM are zero on both nodes. The score for
the VM on both nodes is -INFINITY. Starting the VM in the second case
(resource start prim_vm_mausdb) succeeds; then I have both resources
running.

Any ideas?

Bernd

--
Bernd Lentes
System administration
Institute of Developmental Genetics
Building 35.34 - Room 208
HelmholtzZentrum München
bernd.len...@helmholtz-muenchen.de
phone: +49 (0)89 3187 1241
fax: +49 (0)89 3187 2294

"Only once you commit yourself to something can you be wrong." - Scott Adams

Helmholtz Zentrum Muenchen Deutsches Forschungszen
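A few inspection commands (crmsh/pacemaker 1.1-era tooling; treat the exact flags as assumptions and check your local man pages) that help show why a group member like prim_vm_mausdb stays down despite zero failcounts:

```shell
# The group carries meta target-role=Stopped, which group members
# inherit; compare the effective target-role on group and member.
crm_resource --resource group_vnc_mausdb --meta --get-parameter target-role
crm_resource --resource prim_vm_mausdb --meta --get-parameter target-role

# Dump the scheduler's allocation scores for the VM (-s shows scores,
# -L works against the live cluster) -- a -INFINITY here usually traces
# back to a constraint or an inherited target-role.
crm_simulate -sL | grep prim_vm_mausdb
```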
Re: [ClusterLabs] Colocation of a primitive resource with a clone with limited copies
On 04/20/2017 10:52 AM, Jan Wrona wrote:
> Hello,
>
> my problem is closely related to the thread [1], but I didn't find a
> solution there. I have a resource that is set up as a clone C restricted
> to two copies (using the clone-max=2 meta attribute), because the
> resource takes a long time to get ready (it starts immediately though),

A resource agent must not return from "start" until a "monitor" operation
would return success. Beyond that, the cluster doesn't care what "ready"
means, so it's OK if it's not fully operational by some measure. However,
that raises the question of what you're accomplishing with your monitor.

> and by having it ready as a clone, I can failover in the time it takes
> to move an IP resource. I have a colocation constraint "resource IP with
> clone C", which will make sure IP runs with a working instance of C:
>
> Configuration:
>  Clone: dummy-clone
>   Meta Attrs: clone-max=2 interleave=true
>   Resource: dummy (class=ocf provider=heartbeat type=Dummy)
>    Operations: start interval=0s timeout=20 (dummy-start-interval-0s)
>                stop interval=0s timeout=20 (dummy-stop-interval-0s)
>                monitor interval=10 timeout=20 (dummy-monitor-interval-10)
>  Resource: ip (class=ocf provider=heartbeat type=Dummy)
>   Operations: start interval=0s timeout=20 (ip-start-interval-0s)
>               stop interval=0s timeout=20 (ip-stop-interval-0s)
>               monitor interval=10 timeout=20 (ip-monitor-interval-10)
>
> Colocation Constraints:
>   ip with dummy-clone (score:INFINITY)
>
> State:
>  Clone Set: dummy-clone [dummy]
>      Started: [ sub1.example.org sub3.example.org ]
>  ip (ocf::heartbeat:Dummy): Started sub1.example.org
>
> This is fine until the active node (sub1.example.org) fails. Instead
> of moving the IP to the passive node (sub3.example.org) with a ready
> clone instance, Pacemaker will move it to the node where it just started
> a fresh instance of the clone (sub2.example.org in my case):
>
> New state:
>  Clone Set: dummy-clone [dummy]
>      Started: [ sub2.example.org sub3.example.org ]
>  ip (ocf::heartbeat:Dummy): Started sub2.example.org
>
> Documentation states that the cluster will choose a copy based on where
> the clone is running and the resource's own location preferences, so I
> don't understand why this is happening. Is there a way to tell Pacemaker
> to move the IP to the node where the resource is already running?
>
> Thanks!
> Jan Wrona
>
> [1] http://lists.clusterlabs.org/pipermail/users/2016-November/004540.html

The cluster places ip based on where the clone will be running at that
point in the recovery, rather than where it was running before recovery.

Unfortunately I can't think of a way to do exactly what you want,
hopefully someone else has an idea.

One possibility would be to use on-fail=standby on the clone monitor.
That way, instead of recovering the clone when it fails, all resources on
the node would move elsewhere. You'd then have to manually take the node
out of standby for it to be usable again.

It might be possible to do something more if you convert the clone to a
master/slave resource, and colocate ip with the master role. For example,
you could set the master score based on how long the service has been
running, so the longest-running instance is always master.
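If you try the master/slave suggestion, the shape of the configuration would be roughly the following (sketched with pcs 0.9-style syntax as an untested assumption; adapt IDs and options to your setup):

```shell
# Replace the plain clone with a master/slave resource and bind ip to
# the Master role, so ip follows whichever instance currently holds the
# highest master score rather than any started instance.
pcs resource master dummy-master dummy \
    clone-max=2 master-max=1 interleave=true
pcs constraint colocation add ip with master dummy-master INFINITY
```

The remaining work is then entirely in the agent: it has to set a master score (e.g. via crm_master) that reflects how "ready" each instance is.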
[ClusterLabs] Digest does not match
Hi folks,

We have a lot of two-node systems running in our server room. I noticed
that some of them occasionally have these entries in the syslog:

Mar 15 12:54:45 A5-E4-151-bottom corosync[13766]: [TOTEM ] Digest does not match
Mar 15 12:54:45 A5-E4-151-bottom corosync[13766]: [TOTEM ] Received message has invalid digest... ignoring.
Mar 15 12:54:45 A5-E4-151-bottom corosync[13766]: [TOTEM ] Invalid packet data

I am attaching the corosync.conf which we use on all our systems. Each
two-node system has a dedicated Ethernet interface for interconnection, and
the two nodes are connected directly to one another over this interface
without any switches. Based on that I assume there is no way this
connection is exposed to the outside world.

What could be causing this issue?

Thank you,
Kostia

corosync.conf
Description: Binary data
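One plausible cause to rule out (an assumption on my part, since the attached corosync.conf isn't visible here): with secauth enabled, corosync signs every totem packet with /etc/corosync/authkey, and "Digest does not match" is logged when a packet arrives that was signed with a different key, for example when the keys on the two nodes have drifted apart. Verifying and redistributing the key looks like:

```shell
# Generate a fresh authkey on one node, copy it to the peer, and confirm
# both copies are identical ("peer-node" is a placeholder hostname).
corosync-keygen
scp /etc/corosync/authkey peer-node:/etc/corosync/authkey
md5sum /etc/corosync/authkey    # run on both nodes and compare
```

If the keys already match, the next suspects would be NIC/driver-level packet corruption on the dedicated link.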
[ClusterLabs] Colocation of a primitive resource with a clone with limited copies
Hello,

my problem is closely related to the thread [1], but I didn't find a
solution there. I have a resource that is set up as a clone C restricted to
two copies (using the clone-max=2 meta attribute), because the resource
takes a long time to get ready (it starts immediately though), and by
having it ready as a clone, I can failover in the time it takes to move an
IP resource. I have a colocation constraint "resource IP with clone C",
which will make sure IP runs with a working instance of C:

Configuration:
 Clone: dummy-clone
  Meta Attrs: clone-max=2 interleave=true
  Resource: dummy (class=ocf provider=heartbeat type=Dummy)
   Operations: start interval=0s timeout=20 (dummy-start-interval-0s)
               stop interval=0s timeout=20 (dummy-stop-interval-0s)
               monitor interval=10 timeout=20 (dummy-monitor-interval-10)
 Resource: ip (class=ocf provider=heartbeat type=Dummy)
  Operations: start interval=0s timeout=20 (ip-start-interval-0s)
              stop interval=0s timeout=20 (ip-stop-interval-0s)
              monitor interval=10 timeout=20 (ip-monitor-interval-10)

Colocation Constraints:
  ip with dummy-clone (score:INFINITY)

State:
 Clone Set: dummy-clone [dummy]
     Started: [ sub1.example.org sub3.example.org ]
 ip (ocf::heartbeat:Dummy): Started sub1.example.org

This is fine until the active node (sub1.example.org) fails. Instead of
moving the IP to the passive node (sub3.example.org) with a ready clone
instance, Pacemaker will move it to the node where it just started a fresh
instance of the clone (sub2.example.org in my case):

New state:
 Clone Set: dummy-clone [dummy]
     Started: [ sub2.example.org sub3.example.org ]
 ip (ocf::heartbeat:Dummy): Started sub2.example.org

Documentation states that the cluster will choose a copy based on where the
clone is running and the resource's own location preferences, so I don't
understand why this is happening. Is there a way to tell Pacemaker to move
the IP to the node where the resource is already running?

Thanks!
Jan Wrona

[1] http://lists.clusterlabs.org/pipermail/users/2016-November/004540.html
Re: [ClusterLabs] Wtrlt: Antw: Re: Antw: Re: how important would you consider to have two independent fencing device for each node ?
On 04/20/2017 01:43 AM, Ulrich Windl wrote:
> Should have gone to the list...
>
>> Digimer wrote on 19.04.2017 at 17:20 in message
>> <600637f1-fef8-0a3d-821c-7aecfa398...@alteeve.ca>:
>>> On 19/04/17 02:38 AM, Ulrich Windl wrote:
>>>> Digimer wrote on 18.04.2017 at 19:08 in message
>>>> <26e49390-b384-b46e-4965-eba5bfe59...@alteeve.ca>:
>>>>> On 18/04/17 11:07 AM, Lentes, Bernd wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I'm currently establishing a two-node cluster. Each node is a HP
>>>>>> server with an ILO card. I can fence both of them, it's working
>>>>>> fine. But what if the ILO does not work correctly? Then fencing is
>>>>>> not possible.
>>>>>
>>>>> Correct. If you only have iLO fencing, then the cluster would hang
>>>>> (failed fencing is *not* an indication of node death).
>>>>>
>>>>>> I also have a switched PDU from APC. Each server has two power
>>>>>> supplies. Currently one is connected to the normal power equipment,
>>>>>> the other to the UPS, as a sort of redundancy if the UPS does not
>>>>>> work properly.
>>>>>
>>>>> That's a fine setup.
>>>>>
>>>>>> If I used the switched PDU as a fencing device I would lose the
>>>>>> redundancy of two independent power sources, because then I would
>>>>>> have to connect both power supplies to the UPS. I wouldn't like to
>>>>>> do that.
>>>>>
>>>>> Not if you have two switched PDUs. This is what we do in our Anvil!
>>>>> systems... One PDU feeds the first PSU in each node and the second
>>>>> PDU feeds the second PSUs. Ideally both PDUs are fed by UPSes, but
>>>>> that's not as important. One PDU on a UPS and one PDU directly from
>>>>> mains will work.
>>>>>
>>>>>> How important would you consider it to have two independent fencing
>>>>>> devices for each node? I can't buy another PDU, currently we are
>>>>>> very poor.
>>>>>
>>>>> Depends entirely on your tolerance for interruption. *I* answer that
>>>>> with "extremely important". However, most clusters out there have
>>>>> only IPMI-based fencing, so they would obviously say "not so
>>>>> important".
>>>>>
>>>>>> Is there another way to create a second fencing device, independent
>>>>>> from the ILO card?
>>>>>>
>>>>>> Thanks.
>>>>>
>>>>> Sure, SBD would work. I've never seen IPMI not have a watchdog timer
>>>>> (and iLO is IPMI++), as one example. It's slow, and needs shared
>>>>> storage, but a small box somewhere running a small tgtd or iscsid
>>>>> should do the trick (note that I have never used SBD myself...).
>>>>
>>>> Slow is relative: if it takes 3 seconds from issuing the reset command
>>>> until the node is dead, it's fast enough for most cases. Even a
>>>> switched PDU has some delays: the command has to be processed, the
>>>> relay may "stick" a short moment, the power supply's capacitors have
>>>> to discharge (if you have two power supplies, both need to)... And
>>>> iLOs don't really like to be powered off.
>>>>
>>>> Ulrich
>>>
>>> The way I understand SBD, and correct me if I am wrong, recovery won't
>>> begin until sometime after the watchdog timer kicks. If the watchdog
>>> timer is 60 seconds, then your cluster will hang for >60 seconds (plus
>>> fence delays, etc).
>>
>> I think it works differently: one task periodically reads its mailbox
>> slot for commands, and once a command is read, it's executed
>> immediately. Only if the read task hangs for a long time does the
>> watchdog itself trigger a reset (as SBD seems dead). So the delay is
>> actually the sum of "write delay", "read delay", and "command
>> execution".

I think you're right when sbd uses shared storage, but there is a
watchdog-only configuration that I believe digimer was referring to. With
watchdog-only, the cluster will wait for the value of the
stonith-watchdog-timeout property before considering the fencing
successful.

>> The manual page (SLES 11 SP4) states: "Set watchdog timeout to N
>> seconds. This depends mostly on your storage latency; the majority of
>> devices must be successfully read within this time, or else the node
>> will self-fence." and "If a watchdog is used together with the 'sbd' as
>> is strongly recommended, the watchdog is activated at initial start of
>> the sbd daemon. The watchdog is refreshed every time the majority of
>> SBD devices has been successfully read. Using a watchdog provides
>> additional protection against 'sbd' crashing."
>>
>> Final remark: I think the developers of sbd were under drugs (or never
>> saw a UNIX program before) when designing the options. For example:
>> "-W Enable or disable use of the system watchdog to protect against
>> the sbd processes failing and the node being left in an undefined
>> state. Specify this once to enable, twice to disable." (MHO)
>>
>> Regards,
>> Ulrich
>
>>> IPMI and PDUs can confirm fencing the peer in ~5 seconds (plus fence
>>> delays).
>>>
>>> --
>>> Digimer
>>> Papers and Projects: https://alteeve.com/w/
>>> "I am,
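For concreteness, the disk-based poison-pill path being debated looks roughly like this on the command line (device path and node name are placeholders; the flags follow the sbd(8) man page as I recall it, so double-check before use):

```shell
# Initialize the message slots on the shared LUN, start sbd watching the
# device with the system watchdog enabled, and send a "poison pill"
# reset to a peer node through the shared disk.
sbd -d /dev/disk/by-id/shared-lun create
sbd -d /dev/disk/by-id/shared-lun -W watch
sbd -d /dev/disk/by-id/shared-lun message node2 reset
```

The shared-storage mode delivers the reset via the mailbox read loop Ulrich describes; the watchdog-only mode Ken mentions has no mailbox, which is why the cluster must wait out stonith-watchdog-timeout instead.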
Re: [ClusterLabs] Antw: Re: Antw: Re: lvm on shared storage and a lot of...
On 20/04/17 07:57, Ulrich Windl wrote: lejeczek schrieb am 19.04.2017 um 18:51 in Nachricht : On 18/04/17 15:22, Ulrich Windl wrote: lejeczek schrieb am 18.04.2017 um 16:14 in Nachricht : On 18/04/17 14:45, Digimer wrote: On 18/04/17 07:31 AM, lejeczek wrote: .. device_block & device_unblock in dmesg. and I see that the LVM resource would fail. This to me seems to happen randomly, or I fail to spot a pattern. Shared storage is a sas3 enclosure. I believe I follow docs on LVM to the letter. I don't know what could be the problem. would you suggest ways to troubleshoot it? Is it faulty/failing hardware? many thanks, L. LVM or clustered LVM? no clvmd And inasmuch as the resource would start, fs would mount, if I start using it more intensely I'd get more of block/unblock and after a while mountpoint resource failes and then LVM resource too. It gets only worse after, even after I deleted resourced, I begin to see, eg.: [ 6242.606870] sd 7:0:32:0: device_unblock and setting to running, handle(0x002c) [ 6334.248617] sd 7:0:18:0: [sdy] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE [ 6334.248633] sd 7:0:18:0: [sdy] tag#0 Sense Key : Not Ready [current] [ 6334.248640] sd 7:0:18:0: [sdy] tag#0 Add. Sense: Logical unit is in process of becoming ready [ 6334.248647] sd 7:0:18:0: [sdy] tag#0 CDB: Read(10) 28 00 00 00 00 00 00 00 08 00 [ 6334.248652] blk_update_request: I/O error, dev sdy, sector 0 Silly question: Do you have a multi-initiator setup where both initiators use the same ID? Do your initiators have the highest prioriy (over the targets)? Regards, Ulrich no, I am not using iscsi here, it'a a DAS via sas3. Isn't SAS also using the SCSI protocol? Initiator and target are SCSI terms, not iSCSI terms. yes, in my mind though it first always references to iscsi. I don't see where the cluster would use multi-initiator, but I may miss/or not know it, setup is a single link(cable) sas between the HBA and the enclosure, and the same for the second node. 
That said, there might actually be a problem with the power board in this enclosure; the manufacturer concluded as much, and a replacement part is now on its way. I would just be grateful if I could rule out the software as the culprit here.

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
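[Editor's note] One way to rule software in or out is to exercise the SCSI path below LVM and the cluster stack. A minimal diagnostic sketch using standard Linux storage tools; the device name `sdy` is taken from the dmesg output above and may differ on your system:

```shell
# Map the suspect device to its HBA/enclosure slot in the SAS topology
lsscsi -v | grep -A1 sdy

# Check kernel-level error history for the device and any block/unblock events
dmesg | grep -E 'sdy|device_(un)?block'

# Query the drive's own health/error counters over SAS (smartmontools)
smartctl -a /dev/sdy

# Exercise the raw read path with no LVM or filesystem involved;
# if this alone produces I/O errors, the fault is below the cluster stack
dd if=/dev/sdy of=/dev/null bs=1M count=1024 iflag=direct
```

If the raw `dd` read is clean under sustained load while the cluster-managed filesystem still fails, the software side deserves another look; if it fails too, the enclosure hardware is the likelier culprit.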
Re: [ClusterLabs] Instant service restart during failback
Hi Klaus,
It would have been too easy if it was interleave. All my cloned resources have interleave=true, of course. What bothers me more is that the behaviour is asymmetrical.
Regards,
Klecho

On 20.4.2017 10:43:29, Klaus Wenninger wrote:
> On 04/20/2017 10:30 AM, Klechomir wrote:
>> Hi List,
>> Been investigating the following problem recently:
>>
>> We have a two-node cluster with 4 cloned (2 on top of 2) + 1 master/slave services on it (corosync+pacemaker 1.1.15).
>> Failover works properly for both nodes, i.e. when one node is restarted/put in standby, the other properly takes over, but:
>>
>> Every time node2 has been in standby/turned off and comes back, everything recovers properly.
>> Every time node1 has been in standby/turned off and comes back, part of the cloned services on node2 are instantly restarted, at the same second node1 reappears, without any apparent reason (only the stop/start messages in the debug log).
>>
>> Is there some known possible reason for this?
> That triggers some deja-vu feeling...
> Did you have a similar issue a couple of weeks ago?
> I remember that in that particular case 'interleave=true' was not the solution to the problem, but maybe here...
>
> Regards,
> Klaus
Re: [ClusterLabs] Instant service restart during failback
On 04/20/2017 10:30 AM, Klechomir wrote:
> Hi List,
> Been investigating the following problem recently:
>
> We have a two-node cluster with 4 cloned (2 on top of 2) + 1 master/slave services on it (corosync+pacemaker 1.1.15).
> Failover works properly for both nodes, i.e. when one node is restarted/put in standby, the other properly takes over, but:
>
> Every time node2 has been in standby/turned off and comes back, everything recovers properly.
> Every time node1 has been in standby/turned off and comes back, part of the cloned services on node2 are instantly restarted, at the same second node1 reappears, without any apparent reason (only the stop/start messages in the debug log).
>
> Is there some known possible reason for this?

That triggers some deja-vu feeling...
Did you have a similar issue a couple of weeks ago?
I remember that in that particular case 'interleave=true' was not the solution to the problem, but maybe here...

Regards,
Klaus

--
Klaus Wenninger
Senior Software Engineer, EMEA ENG Openstack Infrastructure
Red Hat
kwenn...@redhat.com
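[Editor's note] For readers following the interleave discussion: `interleave` is a clone meta attribute that controls whether each clone instance of a dependent resource waits only for the copy of its dependency on its own node, rather than for all copies cluster-wide. A minimal crm shell sketch; the resource names are made up for illustration:

```shell
# Hypothetical layered clones: cl_lower must run before cl_upper on each node.
# With interleave=true, an instance of cl_upper on node2 only depends on the
# cl_lower instance on node2, so restarting cl_lower on node1 does not
# ripple a restart into cl_upper on node2.
crm configure primitive p_lower ocf:heartbeat:Dummy
crm configure primitive p_upper ocf:heartbeat:Dummy
crm configure clone cl_lower p_lower meta interleave=true
crm configure clone cl_upper p_upper meta interleave=true
crm configure order o_lower_first inf: cl_lower cl_upper
crm configure colocation col_upper_with_lower inf: cl_upper cl_lower
```

With `interleave=false` (the default), every instance of cl_upper waits for all instances of cl_lower, which is one classic cause of restarts on the surviving node during failback; in this thread, however, the poster reports the attribute is already set.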
[ClusterLabs] Instant service restart during failback
Hi List,
Been investigating the following problem recently:

We have a two-node cluster with 4 cloned (2 on top of 2) + 1 master/slave services on it (corosync+pacemaker 1.1.15).
Failover works properly for both nodes, i.e. when one node is restarted/put in standby, the other properly takes over, but:

Every time node2 has been in standby/turned off and comes back, everything recovers properly.
Every time node1 has been in standby/turned off and comes back, part of the cloned services on node2 are instantly restarted, at the same second node1 reappears, without any apparent reason (only the stop/start messages in the debug log).

Is there some known possible reason for this?

Best regards,
Klecho
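[Editor's note] When Pacemaker schedules restarts "without any apparent reason", the scheduled transition can be inspected directly with `crm_simulate`; it replays the policy engine's decision and shows the placement scores behind it. A sketch using standard Pacemaker tooling; the pe-input file number below is an example, the real one is named in the logs at the moment node1 rejoined:

```shell
# Show the actions the cluster would schedule right now, with allocation scores
crm_simulate -sL

# Replay the saved policy-engine input from the failback moment;
# pick the pe-input file referenced in the log when node1 came back
crm_simulate -S -x /var/lib/pacemaker/pengine/pe-input-123.bz2
```

Comparing the replay for node1's return against the one for node2's return should show which constraint or score difference makes the behaviour asymmetrical.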
[ClusterLabs] Antw: Re: Antw: Re: lvm on shared storage and a lot of...
>>> lejeczek schrieb am 19.04.2017 um 18:51 in Nachricht:
> On 18/04/17 15:22, Ulrich Windl wrote:
>> lejeczek schrieb am 18.04.2017 um 16:14 in Nachricht:
>>> On 18/04/17 14:45, Digimer wrote:
>>>> On 18/04/17 07:31 AM, lejeczek wrote:
>>>>> .. device_block & device_unblock in dmesg, and I see that the LVM resource would fail. This seems to happen randomly, or I fail to spot a pattern.
>>>>> Shared storage is a sas3 enclosure. I believe I follow the docs on LVM to the letter. I don't know what the problem could be. Would you suggest ways to troubleshoot it? Is it faulty/failing hardware?
>>>>> many thanks, L.
>>>> LVM or clustered LVM?
>>> No clvmd. And inasmuch as the resource would start and the fs would mount, if I start using it more intensely I get more of the block/unblock, and after a while the mountpoint resource fails and then the LVM resource too. It only gets worse after; even after I deleted the resources, I begin to see, e.g.:
>>>
>>> [ 6242.606870] sd 7:0:32:0: device_unblock and setting to running, handle(0x002c)
>>> [ 6334.248617] sd 7:0:18:0: [sdy] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
>>> [ 6334.248633] sd 7:0:18:0: [sdy] tag#0 Sense Key : Not Ready [current]
>>> [ 6334.248640] sd 7:0:18:0: [sdy] tag#0 Add. Sense: Logical unit is in process of becoming ready
>>> [ 6334.248647] sd 7:0:18:0: [sdy] tag#0 CDB: Read(10) 28 00 00 00 00 00 00 00 08 00
>>> [ 6334.248652] blk_update_request: I/O error, dev sdy, sector 0
>> Silly question: Do you have a multi-initiator setup where both initiators use the same ID? Do your initiators have the highest priority (over the targets)?
>>
>> Regards,
>> Ulrich
> No, I am not using iSCSI here; it's a DAS via sas3.

Isn't SAS also using the SCSI protocol? Initiator and target are SCSI terms, not iSCSI terms.