[Pacemaker] pcs equivalent of crm configure erase
Hi all,

can someone tell me what the pcs equivalent of "crm configure erase" is? Is there a pcs cheat sheet showing the common tasks, or some other documentation?

Best regards
Andreas
[Pacemaker] Disable startup fencing with cman
Hi all,

in a two-node cluster (RHEL6.x, cman, pacemaker), when I start up the very first node, it will try to fence the other node if it can't see it. This can be the case during maintenance. How do I avoid this startup fencing temporarily when I know that the other node is down?

Best regards
Andreas
Re: [Pacemaker] iscsi target mounting readonly on client
Thanks for the information. I'll give it a try on Monday and give you feedback.

2013/4/12 Felix Zachlod fz.li...@sis-gmbh.info:
> Hello Joseph!
>
> -----Original Message-----
> From: Joseph-Andre Guaragna [mailto:joseph-an...@rdmo.com]
> Sent: Friday, 12 April 2013 17:19
> To: pacemaker@oss.clusterlabs.org
> Subject: [Pacemaker] iscsi target mounting readonly on client
>
> You have to make two things absolutely sure.
>
> 1. Data that has been acknowledged by your iSCSI target to your initiator has hit the device and not only the page cache! If you run your target in fileio mode you have to use write-through, because with write-back neither you nor your cluster manager can ever tell whether the writes have completed before switching the DRBD states. That will only perform well if you have a decent RAID card with BBWC! But you MUST run write-through or blockio (which will be write-through too); running write-back in such a constellation IS NOT SAFE - you risk SERIOUS DATA CORRUPTION when switching targets.
>
> 2. On your initiator side, try to raise the /sys/block/sd*/device/timeout value. That is the time the block device will wait for a command to complete before handing an I/O error to the upper layer - which will most probably lead to your filesystem remounting read-only.
>
> 3. This is just a side note: do not use IET. We ran a production target with IET for about 2 years and it caused horrible problems for us. Consider SCST or LIO (I personally do not have any experience with LIO, but SCST has been running in our production environment for years now without any problems).
>
> regards, Felix
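For point 2, a minimal sketch of raising the initiator-side SCSI timeout (the device name /dev/sdb and the 180-second value are only examples, and the setting is not persistent across reboots):

  # current command timeout of the iSCSI disk, in seconds
  cat /sys/block/sdb/device/timeout
  # raise it so a short target failover shows up as a stall rather than an
  # I/O error that remounts the filesystem read-only
  echo 180 > /sys/block/sdb/device/timeout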
[Pacemaker] Cleanup over secondary node
Hi all!

I'm testing a Pacemaker+Corosync cluster with KVM virtual machines. When restarting a node, I got the following status:

# crm status
Last updated: Sun Apr 14 11:50:00 2013
Last change: Sun Apr 14 11:49:54 2013
Stack: openais
Current DC: daedalus - partition with quorum
Version: 1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff
2 Nodes configured, 2 expected votes
8 Resources configured.

Online: [ atlantis daedalus ]

 Resource Group: servicios
     fs_drbd_servicios (ocf::heartbeat:Filesystem): Started daedalus
     clusterIP (ocf::heartbeat:IPaddr2): Started daedalus
     Mysql (ocf::heartbeat:mysql): Started daedalus
     Apache (ocf::heartbeat:apache): Started daedalus
     Pure-FTPd (ocf::heartbeat:Pure-FTPd): Started daedalus
     Asterisk (ocf::heartbeat:asterisk): Started daedalus
 Master/Slave Set: drbd_serviciosClone [drbd_servicios]
     Masters: [ daedalus ]
     Slaves: [ atlantis ]

Failed actions:
    Asterisk_monitor_0 (node=atlantis, call=12, rc=5, status=complete): not installed

The problem is that if I do a cleanup of the Asterisk resource on the secondary, this has no effect. It seems that Pacemaker needs to have access to the resource's config file, but this is not available because it lives on the DRBD device that is only accessible on the primary:

Apr 14 11:58:06 atlantis cib: [1136]: info: apply_xml_diff: Digest mis-match: expected f6e4778e0ca9d8d681ba86acb83a6086, calculated ad03ff3e0622f60c78e8e1ece055bd63
Apr 14 11:58:06 atlantis cib: [1136]: notice: cib_process_diff: Diff 0.825.3 - 0.825.4 not applied to 0.825.3: Failed application of an update diff
Apr 14 11:58:06 atlantis cib: [1136]: info: cib_server_process_diff: Requesting re-sync from peer
Apr 14 11:58:06 atlantis crmd: [1141]: info: delete_resource: Removing resource Asterisk for 3141_crm_resource (internal) on atlantis
Apr 14 11:58:06 atlantis crmd: [1141]: info: notify_deleted: Notifying 3141_crm_resource on atlantis that Asterisk was deleted
Apr 14 11:58:06 atlantis crmd: [1141]: WARN: decode_transition_key: Bad UUID (crm-resource-3141) in sscanf result (3) for 0:0:crm-resource-3141
Apr 14 11:58:06 atlantis crmd: [1141]: info: ais_dispatch_message: Membership 1616: quorum retained
Apr 14 11:58:06 atlantis lrmd: [1138]: info: rsc:Asterisk probe[13] (pid 3144)
Apr 14 11:58:06 atlantis asterisk[3144]: ERROR: Config /etc/asterisk/asterisk.conf doesn't exist
Apr 14 11:58:06 atlantis lrmd: [1138]: info: operation monitor[13] on Asterisk for client 1141: pid 3144 exited with return code 5
Apr 14 11:58:06 atlantis crmd: [1141]: info: process_lrm_event: LRM operation Asterisk_monitor_0 (call=13, rc=5, cib-update=40, confirmed=true) not installed

Is there any way to remedy this situation?

Thanks in advance for your reply.

Regards,
Daniel

--
Ing. Daniel Bareiro - GNU/Linux registered user #188.598
Proudly running Debian GNU/Linux with uptime:
11:46:23 up 49 days, 19:53, 12 users, load average: 0.00, 0.01, 0.00
Re: [Pacemaker] Disable startup fencing with cman
On 14/04/2013 10:47 AM, Andreas Mock wrote:
> Hi all,
> in a two node cluster (RHEL6.x, cman, pacemaker) when I start up the very first node, this node will try to fence the other node if it can't see it. This can be the case during maintenance.
> How do I avoid this startup fencing temporarily when I know that the other node is down?
>
> Best regards
> Andreas

Have you tried putting the node into standby? I don't know if it will work, just sharing my idea here.
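A sketch of that idea with the usual tools (the node name node2 is only a placeholder, and whether standby actually prevents the startup fencing is exactly what would need testing):

  # mark the absent node as standby so no resources are placed on it
  crm node standby node2        # or: pcs cluster standby node2
  # undo it after maintenance
  crm node online node2         # or: pcs cluster unstandby node2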
Re: [Pacemaker] 1.1.8 not compatible with 1.1.7?
On 12/04/2013 09:37 PM, Pavlos Parissis wrote:
> Hoi,
> As I wrote in another post[1], I failed to upgrade to 1.1.8 on a 2-node cluster. Before the upgrade both nodes were using CentOS 6.3, corosync 1.4.1-7 and pacemaker 1.1.7. I followed the rolling upgrade process, so I stopped pacemaker and then corosync on node1 and upgraded to CentOS 6.4. The OS upgrade also upgrades pacemaker to 1.1.8-7 and corosync to 1.4.1-15. The upgrade of the rpms went smoothly, as I knew about the crmsh issue and had made sure I had the crmsh rpm in my repos.
> Corosync started without any problems and both nodes could see each other[2]. But for some reason node2 never received a reply from node1 to its join offer, and node1 never joined the cluster. Node1 formed a new cluster as it never got a reply from node2, so I ended up with a split-brain situation.
> Logs of node1 can be found here https://dl.dropboxusercontent.com/u/1773878/pacemaker-issue/node1.log and of node2 here https://dl.dropboxusercontent.com/u/1773878/pacemaker-issue/node2.log
> Doing a Disconnect & Reattach upgrade of both nodes at the same time brings me a working 1.1.8 cluster. Any attempt to make a 1.1.8 node join a cluster with a 1.1.7 node failed.
> Cheers,
> Pavlos
Re: [Pacemaker] 1.1.8 not compatible with 1.1.7?
On 15/04/2013, at 7:31 AM, Pavlos Parissis pavlos.paris...@gmail.com wrote:
> On 12/04/2013 09:37 PM, Pavlos Parissis wrote:
>> Hoi,
>> As I wrote in another post[1], I failed to upgrade to 1.1.8 on a 2-node cluster. Before the upgrade both nodes were using CentOS 6.3, corosync 1.4.1-7 and pacemaker 1.1.7. I followed the rolling upgrade process, so I stopped pacemaker and then corosync on node1 and upgraded to CentOS 6.4. The OS upgrade also upgrades pacemaker to 1.1.8-7 and corosync to 1.4.1-15. The upgrade of the rpms went smoothly, as I knew about the crmsh issue and had made sure I had the crmsh rpm in my repos.
>> Corosync started without any problems and both nodes could see each other[2]. But for some reason node2 never received a reply from node1 to its join offer, and node1 never joined the cluster. Node1 formed a new cluster as it never got a reply from node2, so I ended up with a split-brain situation.
>> Logs of node1 can be found here https://dl.dropboxusercontent.com/u/1773878/pacemaker-issue/node1.log and of node2 here https://dl.dropboxusercontent.com/u/1773878/pacemaker-issue/node2.log
>> Doing a Disconnect & Reattach upgrade of both nodes at the same time brings me a working 1.1.8 cluster. Any attempt to make a 1.1.8 node join a cluster with a 1.1.7 node failed.

There wasn't enough detail in the logs to suggest a solution, but if you add the following to /etc/sysconfig/pacemaker and re-test, it might shed some additional light on the problem:

export PCMK_trace_functions=ais_dispatch_message

Certainly there was no intention to make them incompatible.

>> Cheers,
>> Pavlos
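A sketch of how that tracing might be enabled on the upgraded node (the service restart step is an assumption about how the setting gets picked up; the trace output goes wherever pacemaker already logs, e.g. syslog):

  # /etc/sysconfig/pacemaker
  export PCMK_trace_functions=ais_dispatch_message

  # restart pacemaker on that node so the environment variable takes effect
  service pacemaker restart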
Re: [Pacemaker] Disable startup fencing with cman
On 14/04/2013, at 6:47 PM, Andreas Mock andreas.m...@web.de wrote:
> Hi all,
> in a two node cluster (RHEL6.x, cman, pacemaker) when I start up the very first node, this node will try to fence the other node if it can't see it. This can be the case during maintenance.
> How do I avoid this startup fencing temporarily when I know that the other node is down?

Set the target-role for your fencing device(s) to Stopped and use stonith_admin --confirm
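A rough sketch of that suggestion (the fence device id fence-node2 and the node name node2 are placeholders for your own configuration):

  # keep the fencing device stopped while the peer is known to be down
  crm resource meta fence-node2 set target-role Stopped
  # tell the cluster that the absent node really is safely down
  stonith_admin --confirm node2
  # after maintenance, allow fencing to run again
  crm resource meta fence-node2 delete target-role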
Re: [Pacemaker] racing crm commands... last write wins?
On 12/04/2013, at 11:35 PM, Brian J. Murrell br...@interlinx.bc.ca wrote:
> On 13-04-10 07:02 PM, Andrew Beekhof wrote:
>> On 11/04/2013, at 6:33 AM, Brian J. Murrell brian-squohqy54cvwr29bmmi...@public.gmane.org wrote:
>>> Does crm_resource suffer from this problem
>> no
> Excellent. I was unable to find any comprehensive documentation on just how to implement a pacemaker configuration solely with crm_resource, and the manpage for it doesn't seem to indicate any way to create resources, for example.

Right, creation (and any other modification of the config) is via cibadmin. However, that involves dealing with XML, which most people have an aversion to, hence the common use of pcs and crmsh.

> Is it typical that when you don't want to use crm (or pcs) and want to rely on the crm_* group of commands, that you do so in conjunction with cibadmin for things like creating resources, etc.?

Yes.

> It seems so, but I just want to make sure there is not something I have not uncovered yet.
>
> Cheers,
> b.
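For illustration, a minimal sketch of creating a resource with cibadmin alone (the ocf:pacemaker:Dummy resource and its id are placeholders):

  # push raw XML into the resources section of the CIB
  cibadmin -o resources -C -X '<primitive id="test-dummy" class="ocf" provider="pacemaker" type="Dummy"/>'
  # verify with one of the crm_* tools
  crm_resource --resource test-dummy --query-xml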
Re: [Pacemaker] pcs equivalent of crm configure erase
On 14/04/2013, at 5:52 PM, Andreas Mock andreas.m...@web.de wrote:
> Hi all,
> can someone tell me what the pcs equivalent of "crm configure erase" is? Is there a pcs cheat sheet showing the common tasks, or some other documentation?

"pcs help" should be reasonably informative, but I don't see anything equivalent.
Chris?

> Best regards
> Andreas
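Until pcs grows an equivalent, the closest low-level approximation is probably cibadmin, which both shells sit on top of - a sketch (note this wipes the entire configuration, so only use it on a cluster you are happy to rebuild):

  # show the current configuration first
  cibadmin --query
  # erase the CIB contents (resources, constraints, properties)
  cibadmin --erase --force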
Re: [Pacemaker] Different value on cluster-infrastructure between 2 nodes
On 12/04/2013, at 11:10 PM, Pavlos Parissis pavlos.paris...@gmail.com wrote:
> Hi,
> I am doing a rolling upgrade of pacemaker from CentOS 6.3 to 6.4, and when the 1st node is upgraded and gets the 1.1.8 version it doesn't join the cluster, so I end up with 2 clusters.
> In the logs of node1 I see cluster-infrastructure value="classic openais (with plugin)", but node2 (still on CentOS 6.3 and pacemaker 1.1.7) has cluster-infrastructure="openais".

The string changed but they mean the same thing.

> I also see a different dc-version between the nodes.

Because both nodes are their own DC for some reason.

> Does anyone know if these could be the reason for node1 not joining the cluster and deciding to make its own cluster?

No. It's the side effect, not the cause.

> corosync communication looks fine:
> Printing ring status.
> Local node ID 484162314
> RING ID 0
>     id = 10.187.219.28
>     status = ring 0 active with no faults
> RING ID 1
>     id = 192.168.1.2
>     status = ring 1 active with no faults
>
> Cheers,
> Pavlos
Re: [Pacemaker] attrd waits one second before doing update
On 12/04/2013, at 5:45 PM, Rainer Brestan rainer.bres...@gmx.net wrote:
> OK, and where is the difference between 1.1.8 and 1.1.7?

Prior to 1.1.8 the local node flushed its value immediately, which caused the CIB to be updated too soon (compared to the other nodes). Since the whole point of attrd is to try and have them arrive at the same time, we changed this to be more consistent.

> I am currently testing this on a one-node cluster, so attrd waits for the message to come back from itself. This can't take one second - or is attrd waiting this long anyway to be sure to get it back from all nodes?

There is no additional delay; the local node flushes its value as soon as the message comes back to itself (and therefore to all other nodes too).

> Rainer
>
> Sent: Friday, 12 April 2013 at 02:03
> From: Andrew Beekhof and...@beekhof.net
> To: The Pacemaker cluster resource manager pacemaker@oss.clusterlabs.org
> Subject: Re: [Pacemaker] attrd waits one second before doing update
>
>> On 12/04/2013, at 7:17 AM, Rainer Brestan rainer.bres...@gmx.net wrote:
>>> In pacemaker 1.1.7-6 with corosync 1.4.1-7, updates of attributes happen almost instantly. Used with the SysInfo resource agent and manual commands like attrd_updater -U 4 -n test. In the logfile there is one line
>>>     attrd[...] notice: attrd_trigger_update: Sending flush op to all hosts for: ...
>>> and a few milliseconds later
>>>     attrd[...] notice: attrd_perform_update: Sent update ...
>>> with the same content.
>>> After the upgrade to version 1.1.8-6 there is always nearly exactly one second between trigger and perform:
>>>     2013-04-11T22:51:55.389+02:00 int2node2 attrd[28370] notice: attrd_trigger_update: Sending flush op to all hosts for: text (81)
>>>     2013-04-11T22:51:56.397+02:00 int2node2 attrd[28370] notice: attrd_perform_update: Sent update 5814: text=81
>>> And what I found out when having several updates running: they share a single queue. All attrd_updater processes wait for the next one to finish, so there can't be more than one update per second any more.
>>> Has this something to do with "attrd: Have single-shot clients wait for an ack before disconnecting" stated in the changelog for 1.1.8?
>> No, nothing at all.
>>> If yes, is it intended to have a single queue?
>> More like unavoidable, since we need to talk to the other nodes and messages between them are ordered.
>>> And is this 1 second fixed? Where does this 1 second come from? I don't think that it takes one second to get the ack.
>> When the timer expires, attrd sends a cluster message to all nodes (including itself) telling them to update the CIB with their current value. The delay comes from waiting for the cluster message we sent to arrive back again before sending our own updates; this helps ensure all the updates arrive in the CIB at almost the same time.
> This can run into heavy delays (and therefore timeouts) for monitor functions of RAs performing attribute updates.
> Rainer
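A quick way to observe the behaviour discussed above - a rough sketch, assuming the attribute name "test" is unused and that pacemaker logs to /var/log/messages (adjust to your log destination):

  # time a single update; on 1.1.8 the perform_update typically follows the
  # trigger_update by roughly one second
  time attrd_updater -n test -U 81
  # compare the trigger and perform timestamps in the log
  grep -E 'attrd_(trigger|perform)_update' /var/log/messages | tail -n 4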
Re: [Pacemaker] RHEL6.x dependency between 2-node-settings for cman and quorum settings in pacemaker
On 12/04/2013, at 4:58 PM, Andreas Mock andreas.m...@web.de wrote:
> Hi all,
> another question arose while reading the documentation concerning a 2-node cluster under RHEL6.x with CMAN and pacemaker.
> a) In the quick start guide one of the things you set is CMAN_QUORUM_TIMEOUT=0 in /etc/sysconfig/cman to get one node of the cluster up without waiting for quorum. (Correct me if my understanding is wrong.)
> b) There is a special setting in cluster.conf, <cman two_node="1" expected_votes="1"/>, which allows one node to gain quorum in a two-node cluster. (Please also correct me here if my understanding is wrong.)
> c) And there is a pacemaker setting no-quorum-policy which is mostly set to 'ignore' in all startup tutorials.
> My question: I would like to understand how these settings influence each other and/or depend on each other.

a) allows "service cman start" to complete (and therefore allows "service pacemaker start" to begin) before quorum has arrived.

b) is a possible alternative to a), but I've never tested it because it is superseded by c), and in fact it makes c) meaningless since the cluster always has quorum.

a+c is preferred for consistency with clusters of more than 2 nodes.

> As much insight as possible appreciated. ;-)
> Best regards
> Andreas
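Putting the preferred a+c combination together, a minimal sketch (only the relevant lines shown; the pcs form is an alternative to the crm one):

  # /etc/sysconfig/cman - let "service cman start" return without waiting for quorum
  CMAN_QUORUM_TIMEOUT=0

  # pacemaker property - keep running resources even when quorum is lost
  crm configure property no-quorum-policy=ignore
  # or: pcs property set no-quorum-policy=ignore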
Re: [Pacemaker] Cleanup over secondary node
On 15/04/2013, at 1:01 AM, Daniel Bareiro daniel-lis...@gmx.net wrote:
> Hi all!
>
> I'm testing a Pacemaker+Corosync cluster with KVM virtual machines. When restarting a node, I got the following status:
>
> # crm status
> Last updated: Sun Apr 14 11:50:00 2013
> Last change: Sun Apr 14 11:49:54 2013
> Stack: openais
> Current DC: daedalus - partition with quorum
> Version: 1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff
> 2 Nodes configured, 2 expected votes
> 8 Resources configured.
>
> Online: [ atlantis daedalus ]
>
>  Resource Group: servicios
>      fs_drbd_servicios (ocf::heartbeat:Filesystem): Started daedalus
>      clusterIP (ocf::heartbeat:IPaddr2): Started daedalus
>      Mysql (ocf::heartbeat:mysql): Started daedalus
>      Apache (ocf::heartbeat:apache): Started daedalus
>      Pure-FTPd (ocf::heartbeat:Pure-FTPd): Started daedalus
>      Asterisk (ocf::heartbeat:asterisk): Started daedalus
>  Master/Slave Set: drbd_serviciosClone [drbd_servicios]
>      Masters: [ daedalus ]
>      Slaves: [ atlantis ]
>
> Failed actions:
>     Asterisk_monitor_0 (node=atlantis, call=12, rc=5, status=complete): not installed
>
> The problem is that if I do a cleanup of the Asterisk resource on the secondary, this has no effect. It seems that Pacemaker needs to have access to the resource's config file.

Not Pacemaker, the resource agent.

Pacemaker runs a non-recurring monitor operation to see what state the service is in; it seems the asterisk agent needs that config file.

I'd suggest changing the agent so that if the asterisk process is not running, the agent returns 7 (not running) before trying to access the config file.

> But this is not available, because it is on the DRBD device that is only accessible on the primary:
>
> Apr 14 11:58:06 atlantis cib: [1136]: info: apply_xml_diff: Digest mis-match: expected f6e4778e0ca9d8d681ba86acb83a6086, calculated ad03ff3e0622f60c78e8e1ece055bd63
> Apr 14 11:58:06 atlantis cib: [1136]: notice: cib_process_diff: Diff 0.825.3 - 0.825.4 not applied to 0.825.3: Failed application of an update diff
> Apr 14 11:58:06 atlantis cib: [1136]: info: cib_server_process_diff: Requesting re-sync from peer
> Apr 14 11:58:06 atlantis crmd: [1141]: info: delete_resource: Removing resource Asterisk for 3141_crm_resource (internal) on atlantis
> Apr 14 11:58:06 atlantis crmd: [1141]: info: notify_deleted: Notifying 3141_crm_resource on atlantis that Asterisk was deleted
> Apr 14 11:58:06 atlantis crmd: [1141]: WARN: decode_transition_key: Bad UUID (crm-resource-3141) in sscanf result (3) for 0:0:crm-resource-3141
> Apr 14 11:58:06 atlantis crmd: [1141]: info: ais_dispatch_message: Membership 1616: quorum retained
> Apr 14 11:58:06 atlantis lrmd: [1138]: info: rsc:Asterisk probe[13] (pid 3144)
> Apr 14 11:58:06 atlantis asterisk[3144]: ERROR: Config /etc/asterisk/asterisk.conf doesn't exist
> Apr 14 11:58:06 atlantis lrmd: [1138]: info: operation monitor[13] on Asterisk for client 1141: pid 3144 exited with return code 5
> Apr 14 11:58:06 atlantis crmd: [1141]: info: process_lrm_event: LRM operation Asterisk_monitor_0 (call=13, rc=5, cib-update=40, confirmed=true) not installed
>
> Is there any way to remedy this situation?
>
> Thanks in advance for your reply.
>
> Regards,
> Daniel
>
> --
> Ing.
> Daniel Bareiro - GNU/Linux registered user #188.598
> Proudly running Debian GNU/Linux with uptime:
> 11:46:23 up 49 days, 19:53, 12 users, load average: 0.00, 0.01, 0.00
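A sketch of the kind of guard described above, as it might look in the agent's monitor path (function and variable names are illustrative, not the actual asterisk agent code):

  # report "not running" before touching the config file, so the probe on the
  # standby node returns cleanly instead of failing with "not installed"
  asterisk_monitor() {
      if ! pgrep -f "$ASTERISK_BINARY" >/dev/null 2>&1; then
          return $OCF_NOT_RUNNING       # 7
      fi
      if [ ! -f "$CONFIGFILE" ]; then
          return $OCF_ERR_INSTALLED     # 5 - only once we know asterisk should be up here
      fi
      # ... existing health checks follow ...
  }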
Re: [Pacemaker] How to display interface link status in corosync
Hi,

2013/4/8 Andrew Beekhof and...@beekhof.net:
> I'm not 100% sure what the best approach is here. Traditionally this is done with resource agents (i.e. ClusterMon or ping) which update attrd. We could potentially build it into attrd directly, but then we'd need to think about how to turn it on/off. I think I'd lean towards a new agent+daemon or a new daemon launched by ClusterMon.

I'll check whether I can implement this function as a new agent+daemon. I have a question: I am not sure how to launch a daemon from ClusterMon. Do you mean to use crm_mon -E?

Sincerely,
Yuichi

> On 04/04/2013, at 8:59 PM, Yuichi SEINO seino.clust...@gmail.com wrote:
>> Hi All,
>> I want to display the interface link status in corosync, so I think I will add this function to pacemakerd. I am going to display this status under Node Attributes in crm_mon. When the state of a link changes, corosync can run the callback function; when that happens, we update the attributes. This function needs to start after attrd has started, and pacemakerd's mainloop starts after its sub-processes have started, so I think that is the best timing.
>> I show the expected crm_mon output:
>>
>> # crm_mon -fArc1
>> Last updated: Thu Apr 4 08:08:08 2013
>> Last change: Wed Apr 3 04:15:48 2013 via crmd on coro-n2
>> Stack: corosync
>> Current DC: coro-n1 (168427526) - partition with quorum
>> Version: 1.1.9-c791037
>> 2 Nodes configured, unknown expected votes
>> 2 Resources configured.
>>
>> Online: [ coro-n1 coro-n2 ]
>>
>> Full list of resources:
>>  Clone Set: OFclone [openstack-fencing]
>>      Started: [ coro-n1 coro-n2 ]
>>
>> Node Attributes:
>> * Node coro-n1:
>>     + ringnumber(0) : 10.10.0.6 is FAULTY
>>     + ringnumber(1) : 10.20.0.6 is UP
>> * Node coro-n2:
>>     + ringnumber(0) : 10.10.0.7 is FAULTY
>>     + ringnumber(1) : 10.20.0.7 is UP
>>
>> Migration summary:
>> * Node coro-n2:
>> * Node coro-n1:
>>
>> Tickets:
>>
>> Sincerely,
>> Yuichi
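On launching a daemon from ClusterMon: crm_mon's -E/--external-agent option runs a program when cluster events occur, and the ClusterMon resource agent can pass it through via its extra_options parameter - a sketch, with the script path and resource id as placeholders:

  # ClusterMon resource that invokes a custom script on each cluster event
  crm configure primitive link-status-mon ocf:pacemaker:ClusterMon \
      params extra_options="-E /usr/local/bin/link-status-update.sh" \
      op monitor interval="10s"

  # the equivalent by hand, outside cluster control:
  crm_mon --daemonize --external-agent /usr/local/bin/link-status-update.sh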