Re: [Pacemaker] Cleanup over secondary node
Hi Andrew.

On Monday, 15 April 2013 14:36:48 +1000, Andrew Beekhof wrote:

>> I'm testing a Pacemaker+Corosync cluster with KVM virtual machines.
>> When restarting a node, I got the following status:
>>
>> # crm status
>> Last updated: Sun Apr 14 11:50:00 2013
>> Last change: Sun Apr 14 11:49:54 2013
>> Stack: openais
>> Current DC: daedalus - partition with quorum
>> Version: 1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff
>> 2 Nodes configured, 2 expected votes
>> 8 Resources configured.
>>
>> Online: [ atlantis daedalus ]
>>
>> Resource Group: servicios
>>     fs_drbd_servicios  (ocf::heartbeat:Filesystem):  Started daedalus
>>     clusterIP          (ocf::heartbeat:IPaddr2):     Started daedalus
>>     Mysql              (ocf::heartbeat:mysql):       Started daedalus
>>     Apache             (ocf::heartbeat:apache):      Started daedalus
>>     Pure-FTPd          (ocf::heartbeat:Pure-FTPd):   Started daedalus
>>     Asterisk           (ocf::heartbeat:asterisk):    Started daedalus
>> Master/Slave Set: drbd_serviciosClone [drbd_servicios]
>>     Masters: [ daedalus ]
>>     Slaves: [ atlantis ]
>>
>> Failed actions:
>>     Asterisk_monitor_0 (node=atlantis, call=12, rc=5, status=complete): not installed
>>
>> The problem is that if I do a cleanup of the Asterisk resource on the
>> secondary, this has no effect. It seems that Pacemaker needs to have
>> access to the config file for the resource.
>
> Not Pacemaker, the resource agent. Pacemaker runs a non-recurring
> monitor operation to see what state the service is in; it seems the
> asterisk agent needs that config file.
>
> I'd suggest changing the agent so that if the asterisk process is not
> running, the agent returns 7 (not running) before trying to access the
> config file.
I was reviewing the resource definition assuming I might have made some
reference to the Asterisk configuration file there, but this was not the
case:

primitive Asterisk ocf:heartbeat:asterisk \
        params realtime=true \
        op monitor interval=60s \
        meta target-role=Started

This agent is the one available in the resource-agents package from the
Debian Backports repository:

atlantis:~# aptitude show resource-agents
Package: resource-agents
New: yes
State: installed
Automatically installed: yes
Version: 1:3.9.2-5~bpo60+1
Priority: optional
Section: admin
Maintainer: Debian HA Maintainers debian-ha-maintain...@lists.alioth.debian.org
Uncompressed size: 2,228 k
Depends: libc6 (= 2.4), libglib2.0-0 (= 2.12.0), libnet1 (= 1.1.2.1), libplumb2, libplumbgpl2, cluster-glue, python
Conflicts: cluster-agents (= 1:1.0.4-1), rgmanager (= 3.0.12-2+b1)
Replaces: cluster-agents (= 1:1.0.4-1), rgmanager (= 3.0.12-2+b1)
Description: Cluster Resource Agents
 The Cluster Resource Agents are a set of scripts to interface with
 several services to operate in a High Availability environment for both
 Pacemaker and rgmanager resource managers.
Homepage: https://github.com/ClusterLabs/resource-agents

Do you know if there is any way to get the behavior that you suggested
using this agent?

Thanks for your reply.

Regards,
Daniel

-- 
Ing. Daniel Bareiro - GNU/Linux registered user #188.598
Proudly running Debian GNU/Linux with uptime:
21:54:06 up 52 days, 6:01, 11 users, load average: 0.00, 0.02, 0.00

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker
Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
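For readers following along, the agent-side change Andrew suggests can be
sketched as below. This is an illustrative stand-in, not the packaged
heartbeat:asterisk code; the function name and the stubbed process check
are assumptions made for the demo.

```shell
#!/bin/sh
# Sketch of a probe-friendly monitor: check for a running process *before*
# touching the config file, so a probe on the standby node reports
# "not running" (7) instead of "not installed" (5).

OCF_SUCCESS=0
OCF_ERR_INSTALLED=5
OCF_NOT_RUNNING=7

# asterisk_monitor RUNNING CONFIG
#   RUNNING: "yes" if an asterisk process exists (stubbed here so the
#   demo is deterministic); CONFIG: path to asterisk.conf
asterisk_monitor() {
    # No process -> "not running", without requiring the config file.
    [ "$1" = yes ] || return $OCF_NOT_RUNNING
    # Only a running instance needs its configuration validated.
    [ -f "$2" ] || return $OCF_ERR_INSTALLED
    return $OCF_SUCCESS
}

# Probe on the standby node: no process, config on the unmounted DRBD fs.
asterisk_monitor no /etc/asterisk/asterisk.conf
echo "standby probe rc=$?"   # prints: standby probe rc=7
```

With this ordering, the probe that `crm resource cleanup` triggers on the
standby node succeeds with rc=7 and the failed-action entry stays gone.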
Re: [Pacemaker] Cleanup over secondary node
On 17/04/2013, at 11:28 AM, Daniel Bareiro daniel-lis...@gmx.net wrote:

> Hi Andrew.
>
> On Monday, 15 April 2013 14:36:48 +1000, Andrew Beekhof wrote:
>
> [quoted `crm status` output and earlier discussion trimmed]
>
>> Not Pacemaker, the resource agent. Pacemaker runs a non-recurring
>> monitor operation to see what state the service is in; it seems the
>> asterisk agent needs that config file.
>>
>> I'd suggest changing the agent so that if the asterisk process is not
>> running, the agent returns 7 (not running) before trying to access
>> the config file.
> I was reviewing the resource definition assuming I might have made
> some reference to the Asterisk configuration file there, but this was
> not the case:
>
> primitive Asterisk ocf:heartbeat:asterisk \
>         params realtime=true \
>         op monitor interval=60s \
>         meta target-role=Started
>
> This agent is the one available in the resource-agents package from
> the Debian Backports repository:
>
> [aptitude show resource-agents output trimmed]
>
> Do you know if there is any way to get the behavior that you suggested
> using this agent?

You'll have to edit it and submit the changes upstream.
If whatever it is looking for is not found when a monitor is requested,
it should probably return 7 (STOPPED).

> Thanks for your reply.
>
> Regards,
> Daniel
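A practical note for anyone in the same spot (this relies on the standard
OCF directory layout, not on anything stated in the thread): while a fix
goes upstream, you can run a patched copy of the agent under your own
provider name instead of editing the packaged file. Pacemaker resolves
ocf:PROVIDER:NAME to OCF_ROOT/resource.d/PROVIDER/NAME. The sketch below
uses a scratch directory in place of /usr/lib/ocf so it is runnable
anywhere; the "custom" provider name is an arbitrary choice.

```shell
#!/bin/sh
# Keep the packaged agent intact; patch a copy under a new provider dir,
# then reference it as ocf:custom:asterisk in the primitive definition.
# A scratch directory stands in for /usr/lib/ocf for this demo.
OCF_ROOT=$(mktemp -d)

# Stand-in for the packaged heartbeat agent.
install -d "$OCF_ROOT/resource.d/heartbeat"
printf '#!/bin/sh\n# packaged agent\n' > "$OCF_ROOT/resource.d/heartbeat/asterisk"

# Local provider holding the copy you are free to patch.
install -d "$OCF_ROOT/resource.d/custom"
cp "$OCF_ROOT/resource.d/heartbeat/asterisk" "$OCF_ROOT/resource.d/custom/asterisk"
chmod +x "$OCF_ROOT/resource.d/custom/asterisk"

ls "$OCF_ROOT/resource.d/custom"   # prints: asterisk
```

On a real node the copy would go under /usr/lib/ocf/resource.d/custom/,
and the primitive would be changed from ocf:heartbeat:asterisk to
ocf:custom:asterisk.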
[Pacemaker] Cleanup over secondary node
Hi all!

I'm testing a Pacemaker+Corosync cluster with KVM virtual machines. When
restarting a node, I got the following status:

# crm status
Last updated: Sun Apr 14 11:50:00 2013
Last change: Sun Apr 14 11:49:54 2013
Stack: openais
Current DC: daedalus - partition with quorum
Version: 1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff
2 Nodes configured, 2 expected votes
8 Resources configured.

Online: [ atlantis daedalus ]

Resource Group: servicios
    fs_drbd_servicios  (ocf::heartbeat:Filesystem):  Started daedalus
    clusterIP          (ocf::heartbeat:IPaddr2):     Started daedalus
    Mysql              (ocf::heartbeat:mysql):       Started daedalus
    Apache             (ocf::heartbeat:apache):      Started daedalus
    Pure-FTPd          (ocf::heartbeat:Pure-FTPd):   Started daedalus
    Asterisk           (ocf::heartbeat:asterisk):    Started daedalus
Master/Slave Set: drbd_serviciosClone [drbd_servicios]
    Masters: [ daedalus ]
    Slaves: [ atlantis ]

Failed actions:
    Asterisk_monitor_0 (node=atlantis, call=12, rc=5, status=complete): not installed

The problem is that if I do a cleanup of the Asterisk resource on the
secondary, this has no effect. It seems that Pacemaker needs to have
access to the config file for the resource.
But this is not available, because it lives on the DRBD device, which is
only accessible on the primary:

Apr 14 11:58:06 atlantis cib: [1136]: info: apply_xml_diff: Digest mis-match: expected f6e4778e0ca9d8d681ba86acb83a6086, calculated ad03ff3e0622f60c78e8e1ece055bd63
Apr 14 11:58:06 atlantis cib: [1136]: notice: cib_process_diff: Diff 0.825.3 - 0.825.4 not applied to 0.825.3: Failed application of an update diff
Apr 14 11:58:06 atlantis cib: [1136]: info: cib_server_process_diff: Requesting re-sync from peer
Apr 14 11:58:06 atlantis crmd: [1141]: info: delete_resource: Removing resource Asterisk for 3141_crm_resource (internal) on atlantis
Apr 14 11:58:06 atlantis crmd: [1141]: info: notify_deleted: Notifying 3141_crm_resource on atlantis that Asterisk was deleted
Apr 14 11:58:06 atlantis crmd: [1141]: WARN: decode_transition_key: Bad UUID (crm-resource-3141) in sscanf result (3) for 0:0:crm-resource-3141
Apr 14 11:58:06 atlantis crmd: [1141]: info: ais_dispatch_message: Membership 1616: quorum retained
Apr 14 11:58:06 atlantis lrmd: [1138]: info: rsc:Asterisk probe[13] (pid 3144)
Apr 14 11:58:06 atlantis asterisk[3144]: ERROR: Config /etc/asterisk/asterisk.conf doesn't exist
Apr 14 11:58:06 atlantis lrmd: [1138]: info: operation monitor[13] on Asterisk for client 1141: pid 3144 exited with return code 5
Apr 14 11:58:06 atlantis crmd: [1141]: info: process_lrm_event: LRM operation Asterisk_monitor_0 (call=13, rc=5, cib-update=40, confirmed=true) not installed

Is there any way to remedy this situation?

Thanks in advance for your reply.

Regards,
Daniel

-- 
Ing. Daniel Bareiro - GNU/Linux registered user #188.598
Proudly running Debian GNU/Linux with uptime:
11:46:23 up 49 days, 19:53, 12 users, load average: 0.00, 0.01, 0.00
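What the failed probe in the log boils down to can be condensed into a
few lines. This is a stand-in sketch (not the packaged agent's code) with
the path and error message taken from the log above:

```shell
#!/bin/sh
# Stand-in for the agent's probe-time behaviour seen in the log: the agent
# checks its config before anything else, so on a node where the DRBD
# filesystem is not mounted every probe fails with rc=5 ("not installed").

OCF_SUCCESS=0
OCF_ERR_INSTALLED=5

probe() {
    config=$1
    if [ ! -f "$config" ]; then
        echo "ERROR: Config $config doesn't exist" >&2
        return $OCF_ERR_INSTALLED
    fi
    return $OCF_SUCCESS
}

probe /etc/asterisk/asterisk.conf   # absent on the standby node
echo "probe rc=$?"
```

This is also why the cleanup appears to have no effect: cleanup clears
the failed action and immediately re-probes the resource, and the
re-probe fails the same way for as long as the agent insists on reading
a config file that only exists where the DRBD filesystem is mounted.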
Re: [Pacemaker] Cleanup over secondary node
On 15/04/2013, at 1:01 AM, Daniel Bareiro daniel-lis...@gmx.net wrote:

> Hi all!
>
> I'm testing a Pacemaker+Corosync cluster with KVM virtual machines.
> When restarting a node, I got the following status:
>
> [`crm status` output trimmed]
>
> Failed actions:
>     Asterisk_monitor_0 (node=atlantis, call=12, rc=5, status=complete): not installed
>
> The problem is that if I do a cleanup of the Asterisk resource on the
> secondary, this has no effect. It seems that Pacemaker needs to have
> access to the config file for the resource.

Not Pacemaker, the resource agent. Pacemaker runs a non-recurring
monitor operation to see what state the service is in; it seems the
asterisk agent needs that config file.

I'd suggest changing the agent so that if the asterisk process is not
running, the agent returns 7 (not running) before trying to access the
config file.
> But this is not available, because it lives on the DRBD device, which
> is only accessible on the primary:
>
> [log output trimmed]
>
> Is there any way to remedy this situation?
>
> Thanks in advance for your reply.
>
> Regards,
> Daniel