Re: [Pacemaker] Pacemaker fails to transition single node master/slave resource to master

2012-08-27 Thread Eliot Gable
suggestions? On Mon, Aug 27, 2012 at 7:49 AM, Lars Marowsky-Bree wrote: > On 2012-08-26T11:44:52, Eliot Gable wrote: > > > Looking back through my E-mails, it looks like this was originally > deployed > > on 1.1.2. My guess is that someone did a sweeping software update on the

Re: [Pacemaker] Pacemaker fails to transition single node master/slave resource to master

2012-08-26 Thread Eliot Gable
On Sun, Aug 26, 2012 at 11:31 AM, Eliot Gable wrote: > > I have also tried doing a resource cleanup on FreeSWITCH-MS, restarting > pacemaker and corosync, putting the node in standby and bringing it back > out, upgrading pacemaker and corosync (to the version you see in the > o

[Pacemaker] Pacemaker fails to transition single node master/slave resource to master

2012-08-26 Thread Eliot Gable
node node1 primitive FreeSWITCH ocf:fssolutions:FreeSWITCH \ params ips="bond2/212.163.22.155/26:bond2/212.163.22.156/26" user="freeswitch" group="freeswitch" \ op monitor interval="3s" role="Master" depth="0" \ op monitor interval="10s" role="Slave" depth="0" \ op s

Re: [Pacemaker] STONITH external/ssh missing on RHEL 5.5 EPEL 5.4 + ClusterLabs Repo RPM Build?

2010-12-20 Thread Eliot Gable
> That's intentional, see: > > http://hg.linux-ha.org/glue/rev/5ef3f9370458 > > You really don't want to rely on SSH STONITH in a production environment. > > Regards, > > Tim Sure, but I'm in a lab environment at the moment without UPS-based STONITH capabilities, so having SSH STONITH working

[Pacemaker] STONITH external/ssh missing on RHEL 5.5 EPEL 5.4 + ClusterLabs Repo RPM Build?

2010-12-17 Thread Eliot Gable
I just did an install of Pacemaker on my CentOS 5.5 system using EPEL 5.4 and ClusterLabs Repo. It seems the RPMs do not include the STONITH plugin external/ssh. Is it in some package that I missed or is it really not provided? Is there any way to get it? Thanks. Eliot Gable Senior Product

Re: [Pacemaker] Master/Slave not failing over

2010-06-28 Thread Eliot Gable
? Eliot Gable Senior Product Developer 1228 Euclid Ave, Suite 390 Cleveland, OH 44115 Direct: 216-373-4808 Fax: 216-373-4657 ega...@broadvox.net CONFIDENTIAL COMMUNICATION. This e-mail and any files transmitted with it are confidential and are intended solely for the use of the individual or

Re: [Pacemaker] Master/Slave not failing over

2010-06-25 Thread Eliot Gable
switched to a Master. Although, right now, it is arguably doing things correctly, because it is reporting the current state as it exists at the time of returning from the monitoring action.

Re: [Pacemaker] Master/Slave not failing over

2010-06-25 Thread Eliot Gable
3000 on node-1: unknown error (1) Which makes me think maybe this is related to this failed operation from yesterday. However, I have stopped and started the resource several times on node-1 since this failed op occurred. Do I need to clear these things (cleanup the resource) each time I start the resourc

Re: [Pacemaker] Master/Slave not failing over

2010-06-25 Thread Eliot Gable
till just sits there as a slave. Is there something else I am missing? Thanks again.

Re: [Pacemaker] Master/Slave not failing over

2010-06-25 Thread Eliot Gable
onitoring on it to ensure that everything came up correctly, I should at that point issue crm_master again with -v option to set a score for the node so it is a good candidate to become master, correct?

Re: [Pacemaker] Master/Slave not failing over

2010-06-25 Thread Eliot Gable
ile it mentions using crm_master to provide a promotion score, it does not tell me what actual attribute it is that needs to be modified. Is there another command that can print out all available attributes, or a document somewhere that lists them?

Re: [Pacemaker] Master/Slave not failing over

2010-06-24 Thread Eliot Gable
slave, then STOP on the failed master, followed by START on the failed master. How can I achieve this? Is there some sort of constraint or something I can put in place to make it happen? Thanks again for any insights.

Re: [Pacemaker] Master/Slave not failing over

2010-06-24 Thread Eliot Gable
CF_CHECK_LEVEL="10" \ op monitor interval="5" role="Master" timeout="30s" \ op monitor interval="10" role="Master" timeout="30s" OCF_CHECK_LEVEL="10" \ op start interval="0" timeout="40

[Pacemaker] Master/Slave not failing over

2010-06-24 Thread Eliot Gable
immediately promote the slave. I can understand it waiting for a DEMOTE action to succeed on the failed master before it promotes the slave, but that is all it should need to do it. Is there any way I can change this behavior? Am I missing some key point in the process?

[Pacemaker] pgpool2 OCF Resource Agent

2010-06-21 Thread Eliot Gable
n MA 02111-1307, USA. # ### # # This resource agent was written by E

Re: [Pacemaker] Corosync + Pacemaker New Install: Corosync Fails Without Error Message

2010-06-18 Thread Eliot Gable
) from /lib64/libpthread.so.0 No symbol table info available. #3 0x00332ded3d1d in clone () from /lib64/libc.so.6 No symbol table info available. (gdb) Downgrading again back to 1.2.1-1.el5 seems to resolve the issue, and Corosync runs.

[Pacemaker] Corosync + Pacemaker New Install: Corosync Fails Without Error Message

2010-06-18 Thread Eliot Gable
nyone have any suggestions about how I can figure out what is causing the problem?

Re: [Pacemaker] CRM help

2009-10-29 Thread Eliot Gable
ERROR: could not parse meta-data for (ocf,mysql,heartbeat) ERROR: ocf:heartbeat:mysql: no such resource agent Why would CRM not be able to parse the meta-data while the ocf-tester script seems to like the RA and its meta-data just fine?

Re: [Pacemaker] CRM help

2009-10-29 Thread Eliot Gable
h resource agent

Re: [Pacemaker] CRM help

2009-10-29 Thread Eliot Gable
resource agent

Re: [Pacemaker] CRM help

2009-10-29 Thread Eliot Gable
hitting CTRL-C for each line until it gives me all errors and then it exits. The documentation clearly shows doing it the way I first posted.

Re: [Pacemaker] CRM help

2009-10-29 Thread Eliot Gable
Never mind. I had a leading 'configure' statement by itself.

Re: [Pacemaker] CRM help

2009-10-29 Thread Eliot Gable
That just results in syntax errors on every line.

[Pacemaker] CRM help

2009-10-29 Thread Eliot Gable
In the past, I have always done things by manually creating a CIB XML file and then importing it. But, to save time, I thought I would try CRM. So, I made this script: #!/bin/bash crm<http://www.clusterlabs.org/mediawiki/images/8/8d/Crm_cli.pdf. Anyone have any suggestions? Eliot Ga

[Pacemaker] Pacemaker Build Issues on CentOS 5.3 64-bit

2009-10-29 Thread Eliot Gable
/usr/lib64 instead of /usr/lib. If someone could fix these issues quickly, it would save everyone using CentOS lots of time and headache trying to make this work. :) I would be more than happy to test any modifications for anyone willing to try to fix it.

Re: [Pacemaker] Resource agent monitoring

2009-07-27 Thread Eliot Gable
Nevermind. I did something stupid. From: Eliot Gable [mailto:ega...@broadvox.com] Sent: Monday, July 27, 2009 3:16 PM To: pacemaker@oss.clusterlabs.org Subject: [Pacemaker] Resource agent monitoring Is Pacemaker now monitoring resource agents for changes and marking the nodes as UNCLEAN if a

[Pacemaker] Resource agent monitoring

2009-07-27 Thread Eliot Gable
Is Pacemaker now monitoring resource agents for changes and marking the nodes as UNCLEAN if a change is detected? If so, how do I disable this? I recently upgraded to 1.0.4 from 1.0.3 and now, when I update my RA, it causes a stonith on every node I push the RA out to.

Re: [Pacemaker] Master/Slave Resource and Stop/Start of HA System

2009-07-27 Thread Eliot Gable
esource and Stop/Start of HA System > > > > I'd guess the master preference (in the status section) got lost somehow. > > You should probably file a bug. > > > > On Jul 23, 2009, at 11:24 PM, Eliot Gable wrote: > > Ok, it does not actually stop the mas

Re: [Pacemaker] Master/Slave Resource and Stop/Start of HA System

2009-07-23 Thread Eliot Gable
Ok, it does not actually stop the master, but it DOES demote the master to slave. From: Eliot Gable [mailto:ega...@broadvox.com] Sent: Thursday, July 23, 2009 5:23 PM To: pacemaker@oss.clusterlabs.org Subject: [Pacemaker] Master/Slave Resource and Stop/Start of HA System Running Pacemaker

[Pacemaker] Master/Slave Resource and Stop/Start of HA System

2009-07-23 Thread Eliot Gable
Running Pacemaker 1.0.4. With my Master/Slave resource in Master on node1 and Slave on node2, if I /etc/init.d/heartbeat stop on node2, I see the slave go down and node1 stays master in crm_mon. When it finishes, node2 is in OFFLINE status (no unclean modifier). When I then /etc/init.d/heartbe

Re: [Pacemaker] Master/Slave failover during reboot

2009-07-21 Thread Eliot Gable
On another note, what remote power device do you recommend for fencing an UNCLEAN node? -Original Message- From: Eliot Gable [mailto:ega...@broadvox.com] Sent: Tuesday, July 21, 2009 9:40 AM To: pacemaker@oss.clusterlabs.org Subject: Re: [Pacemaker] Master/Slave failover during reboot

Re: [Pacemaker] Master/Slave failover during reboot

2009-07-21 Thread Eliot Gable
9-07-20T15:27:23, Eliot Gable wrote: > I have a resource that is configured as a Master/Slave resource. If I > kill a resource it is dependent on, it properly fails over to the > other node. However, if I reboot the master node, it does not fail > over. What I see is that the master no

[Pacemaker] Master/Slave failover during reboot

2009-07-21 Thread Eliot Gable
I have a resource that is configured as a Master/Slave resource. If I kill a resource it is dependent on, it properly fails over to the other node. However, if I reboot the master node, it does not fail over. What I see is that the master node switches to UNCLEAN - Offline, the master resource

Re: [Pacemaker] Master/Slave failover during reboot

2009-07-21 Thread Eliot Gable
happening again, I will E-mail the list. From: Eliot Gable Sent: Monday, July 20, 2009 3:27 PM To: 'pacemaker@oss.clusterlabs.org' Subject: Master/Slave failover during reboot I have a resource that is configured as a Master/Slave resource. If I kill a resource it is dependent on, i

Re: [Pacemaker] stonith reboot behavior

2009-06-25 Thread Eliot Gable
A "reboot" should never fail. That is, it should always guarantee that the system actually went down entirely. It does not need to guarantee that it comes back up automatically. If it gets stuck in the boot-up process, you can just manually intervene and fix that whenever it's possible and when

Re: [Pacemaker] Pacemaker on OpenAIS, RRP, and link failure

2009-06-05 Thread Eliot Gable
FYI, I recently tried NIC bonding on CentOS 5.2 32-bit and had issues in the bonding driver causing kernel panics. I disabled bonding because it was less stable.

Re: [Pacemaker] cibadmin update

2009-06-05 Thread Eliot Gable
Update your admin epoch.

Re: [Pacemaker] cibadmin doesn't change cib.xml

2009-06-03 Thread Eliot Gable
You can always check. Probably look at /var/lib/heartbeat and everything under it if you are using Heartbeat. If OpenAIS, not sure where to look.

Re: [Pacemaker] cibadmin doesn't change cib.xml

2009-06-03 Thread Eliot Gable
Try: crm_verify -V -x newcib.xml and make sure it verifies OK. Then do: cibadmin -R -o cib -x newcib.xml After doing that, try: cibadmin -Q | less And check to see if it has the new CIB. If that doesn't work, post your CIB.

Re: [Pacemaker] System Health backend part

2009-06-03 Thread Eliot Gable
I actually do start pingd on just one node and fail it over. It won't work on my slave node because the slave node does not have Internet access, only local cluster access. If it ran all the time on that node, it would always show Internet connectivity down. Thus, I must agree with Andrew: Pacem

Re: [Pacemaker] Managing resources - classes

2009-05-27 Thread Eliot Gable
sent, contains the appropriate resources, and is readable and executable by root.

Re: [Pacemaker] Managing resources - classes

2009-05-27 Thread Eliot Gable
missing the files.

Re: [Pacemaker] Managing resources - classes

2009-05-27 Thread Eliot Gable
crm(live)# ra crm(live)ra# classes heartbeat ocf / pacemaker heartbeat lsb stonith

Re: [Pacemaker] Managing resources - classes

2009-05-27 Thread Eliot Gable
give you some idea just how much there is to learn. Don’t expect to have it mastered in a couple of days.

Re: [Pacemaker] PingD Failure-Timeout

2009-05-26 Thread Eliot Gable
I am using 1.0.3, but the failure-timeout thing does not seem to work for pingd.

Re: [Pacemaker] PingD Failure-Timeout

2009-05-21 Thread Eliot Gable
: The stonith resources correctly fence on a failure of the stop action on a resource. Any suggestions?

[Pacemaker] PingD Failure-Timeout

2009-05-21 Thread Eliot Gable
omatically clear that -1000 score after a certain (small) interval of time?

Re: [Pacemaker] globally-unique clone question

2009-05-21 Thread Eliot Gable
ydata" value="node1:www.example.com node2:www.example2.com node3:www.example3.com" Then, inside the RA, check the hostname and use the domain that is attached to that hostname.
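
The per-host lookup described in this message could be sketched roughly as follows. This is only an illustration, not code from the original post: the function name domain_for_host is hypothetical, and it assumes the attribute value is a space-separated list of host:domain pairs as in the example value shown.

```shell
#!/bin/sh
# Hypothetical helper (not from the original message): print the domain
# mapped to a host in a space-separated list of "host:domain" pairs.
# Returns 1 and prints nothing when the host is not in the list.
domain_for_host() {
    mapping=$1
    host=$2
    for pair in $mapping; do
        case "$pair" in
            "$host":*)
                # Strip everything up to and including the first colon.
                printf '%s\n' "${pair#*:}"
                return 0
                ;;
        esac
    done
    return 1
}
```

Inside a resource agent this might be called as domain_for_host "$MAPPING" "$(uname -n)", where MAPPING stands in for whatever variable holds the attribute value (the attribute's real name is cut off in the preview above).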

Re: [Pacemaker] globally-unique clone question

2009-05-21 Thread Eliot Gable
CLUSTERIP (since all nodes would then be serving the same data) and put a constraint in the CIB that says that if the clone fails on a node, pull that node from the load-sharing config by stopping or moving away your load-sharing resource. At least, that's how I would do it.

Re: [Pacemaker] PEngine Recheck Timer message every 15 minutes - why?

2009-05-14 Thread Eliot Gable
Most likely because you have a cluster-recheck-interval="15m" specified.

Re: [Pacemaker] crm_resource -C vs. crm/resource/cleanup

2009-05-11 Thread Eliot Gable
with.

Re: [Pacemaker] Force a Master resource off a node if another resource fails

2009-05-04 Thread Eliot Gable
th lots of attributes, but it really makes more sense for CRM to handle this natively. Please let me know if I am just being ignorant and missing something here. Thanks again,

Re: [Pacemaker] Pacemaker 1.0.3 errors

2009-05-04 Thread Eliot Gable
FYI, I see it too on CentOS 5.2.

Re: [Pacemaker] Force a Master resource off a node if another resource fails

2009-04-30 Thread Eliot Gable
are both completely ignored. I cannot find anything in the documentation that says it is supported, but it parses properly. It would be really nice to support this.

[Pacemaker] Force a Master resource off a node if another resource fails

2009-04-30 Thread Eliot Gable
ssing in this logic? Do I need another constraint to force it over to node2 if res-b fails? Do I need another constraint to force it to node1 if res-c fails on node2? Thanks for any assistance you can provide.

Re: [Pacemaker] pacemaker 1.0.2 rsc_defaults section

2009-04-29 Thread Eliot Gable
u express that two multi-state resources are preferred to be in Master/Slave or Slave/Master colocation, but Master/Master is also allowed (just lower preference)? Would you assign role="Master" with a score="-1"? Thanks again for your assistance.

[Pacemaker] master_slave not allowed in resources section?

2009-04-28 Thread Eliot Gable
...SNIP... Am I missing something? Everywhere I look shows inside : The live CIB reports this for version: ...SNIP...

[Pacemaker] pacemaker 1.0.2 rsc_defaults section

2009-04-28 Thread Eliot Gable
: ...snip... Any suggestions? Thanks in advance for the assistance.