suggestions?
On Mon, Aug 27, 2012 at 7:49 AM, Lars Marowsky-Bree wrote:
> On 2012-08-26T11:44:52, Eliot Gable wrote:
>
> > Looking back through my E-mails, it looks like this was originally
> deployed
> > on 1.1.2. My guess is that someone did a sweeping software update on the
On Sun, Aug 26, 2012 at 11:31 AM, Eliot Gable wrote:
>
> I have also tried doing a resource cleanup on FreeSWITCH-MS, restarting
> pacemaker and corosync, putting the node in standby and bringing it back
> out, upgrading pacemaker and corosync (to the version you see in the
> o
node node1
primitive FreeSWITCH ocf:fssolutions:FreeSWITCH \
params ips="bond2/212.163.22.155/26:bond2/212.163.22.156/26"
user="freeswitch" group="freeswitch" \
op monitor interval="3s" role="Master" depth="0" \
op monitor interval="10s" role="Slave" depth="0" \
op s
> That's intentional, see:
>
> http://hg.linux-ha.org/glue/rev/5ef3f9370458
>
> You really don't want to rely on SSH STONITH in a production environment.
>
> Regards,
>
> Tim
Sure, but I'm in a lab environment at the moment without UPS-based STONITH
capabilities, so having SSH STONITH working
I just did an install of Pacemaker on my CentOS 5.5 system using EPEL 5.4 and
ClusterLabs Repo. It seems the RPMs do not include the STONITH plugin
external/ssh. Is it in some package that I missed or is it really not provided?
Is there any way to get it?
Thanks.
Eliot Gable
Senior Product
?
Eliot Gable
Senior Product Developer
1228 Euclid Ave, Suite 390
Cleveland, OH 44115
Direct: 216-373-4808
Fax: 216-373-4657
ega...@broadvox.net
CONFIDENTIAL COMMUNICATION. This e-mail and any files transmitted with it are
confidential and are intended solely for the use of the individual or
switched to a Master. Although, right
now, it is arguably doing things correctly, because it is reporting the current
state as it exists at the time of returning from the monitoring action.
Eliot Gable
3000 on node-1: unknown error (1)
Which makes me think maybe this is related to this failed operation from
yesterday. However, I have stopped and started the resource several times on
node-1 since this failed op occurred. Do I need to clear these things (cleanup
the resource) each time I start the resourc
till
just sits there as a slave. Is there something else I am missing?
Thanks again.
Eliot Gable
onitoring on it to ensure that everything came up correctly, I should at that
point issue crm_master again with -v option to set a score for the node so it
is a good candidate to become master, correct?
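For illustration, a minimal sketch of how a resource agent might adjust its promotion score with crm_master. The helper names and the default score of 100 are hypothetical. The sketch assumes it is called from inside the agent, where the cluster has already set the resource instance in the environment; `-v` sets the score, `-D` deletes it, and `-l reboot` keeps the preference only until the node restarts.

```shell
#!/bin/bash
# Hypothetical helpers for a master/slave resource agent.

# Mark this node as a promotion candidate with the given score
# (defaults to 100, an arbitrary example value).
set_master_score() {
    crm_master -l reboot -v "${1:-100}"
}

# Withdraw this node's promotion preference, e.g. after a failed monitor.
clear_master_score() {
    crm_master -l reboot -D
}
```

An agent could call set_master_score at the end of a successful start or monitor, and clear_master_score when it detects that the node should not be promoted.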
Eliot Gable
ile it mentions using crm_master to
provide a promotion score, it does not tell me what actual attribute it is that
needs to be modified. Is there another command that can print out all available
attributes, or a document somewhere that lists them?
Eliot Gable
slave, then STOP on the failed master,
followed by START on the failed master. How can I achieve this? Is there some
sort of constraint or something I can put in place to make it happen?
Thanks again for any insights.
Eliot Gable
CF_CHECK_LEVEL="10" \
op monitor interval="5" role="Master" timeout="30s" \
op monitor interval="10" role="Master" timeout="30s"
OCF_CHECK_LEVEL="10" \
op start interval="0" timeout="40
immediately promote the slave. I can understand it waiting for a DEMOTE action
to succeed on the failed master before it promotes the slave, but that is all
it should need to do it. Is there any way I can change this behavior? Am I
missing some key point in the process?
Eliot Gable
n MA 02111-1307, USA.
#
###
#
# This resource agent was written by E
) from /lib64/libpthread.so.0
No symbol table info available.
#3 0x00332ded3d1d in clone () from /lib64/libc.so.6
No symbol table info available.
(gdb)
Downgrading again back to 1.2.1-1.el5 seems to resolve the issue, and Corosync
runs.
Eliot Gable
nyone have any suggestions about how I can figure out what is causing the
problem?
Eliot Gable
ERROR: could not parse meta-data for (ocf,mysql,heartbeat)
ERROR: ocf:heartbeat:mysql: no such resource agent
Why would CRM not be able to parse the meta-data while the ocf-tester script
seems to like the RA and its meta-data just fine?
Eliot Gable
h resource agent
Eliot Gable
Senior Engineer
1228 Euclid Ave, Suite 390
Cleveland, OH 44115
Direct: 216-373-4808
Fax: 216-373-4657
ega...@broadvox.net
CONFIDENTIAL COMMUNICATION. This e-mail and any files transmitted with it are
confidential and are intended solely for the use of the individual
resource agent
Eliot Gable
hitting CTRL-C
for each line until it gives me all errors and then it exits.
The documentation clearly shows doing it the way I first posted.
Eliot Gable
Never mind. I had a leading 'configure' statement by itself.
Eliot Gable
That just results in syntax errors on every line.
Eliot Gable
In the past, I have always done things by manually creating a CIB XML file and
then importing it. But, to save time, I thought I would try CRM. So, I made
this script:
#!/bin/bash
crm<http://www.clusterlabs.org/mediawiki/images/8/8d/Crm_cli.pdf. Anyone have any
suggestions?
Eliot Ga
/usr/lib64 instead of /usr/lib.
If someone could fix these issues quickly, it would save everyone using CentOS
lots of time and headache trying to make this work. :) I would be more than
happy to test any modifications for anyone willing to try to fix it.
Eliot Gable
Never mind. I did something stupid.
From: Eliot Gable [mailto:ega...@broadvox.com]
Sent: Monday, July 27, 2009 3:16 PM
To: pacemaker@oss.clusterlabs.org
Subject: [Pacemaker] Resource agent monitoring
Is Pacemaker now monitoring resource agents for changes and marking the nodes
as UNCLEAN if a
Is Pacemaker now monitoring resource agents for changes and marking the nodes
as UNCLEAN if a change is detected? If so, how do I disable this? I recently
upgraded to 1.0.4 from 1.0.3 and now, when I update my RA, it causes a stonith
on every node I push the RA out to.
__
esource and Stop/Start of HA System
>
>
>
> I'd guess the master preference (in the status section) got lost somehow.
>
> You should probably file a bug.
>
>
>
> On Jul 23, 2009, at 11:24 PM, Eliot Gable wrote:
>
> Ok, it does not actually stop the mas
Ok, it does not actually stop the master, but it DOES demote the master to
slave.
From: Eliot Gable [mailto:ega...@broadvox.com]
Sent: Thursday, July 23, 2009 5:23 PM
To: pacemaker@oss.clusterlabs.org
Subject: [Pacemaker] Master/Slave Resource and Stop/Start of HA System
Running Pacemaker
Running Pacemaker 1.0.4.
With my Master/Slave resource in Master on node1 and Slave on node2, if I
/etc/init.d/heartbeat stop on node2, I see the slave go down and node1 stays
master in crm_mon. When it finishes, node2 is in OFFLINE status (no unclean
modifier). When I then /etc/init.d/heartbe
On another note, what remote power device do you recommend for fencing an
UNCLEAN node?
-Original Message-
From: Eliot Gable [mailto:ega...@broadvox.com]
Sent: Tuesday, July 21, 2009 9:40 AM
To: pacemaker@oss.clusterlabs.org
Subject: Re: [Pacemaker] Master/Slave failover during reboot
9-07-20T15:27:23, Eliot Gable wrote:
> I have a resource that is configured as a Master/Slave resource. If I
> kill a resource it is dependent on, it properly fails over to the
> other node. However, if I reboot the master node, it does not fail
> over. What I see is that the master no
I have a resource that is configured as a Master/Slave resource. If I kill a
resource it is dependent on, it properly fails over to the other node. However,
if I reboot the master node, it does not fail over. What I see is that the
master node switches to UNCLEAN - Offline, the master resource
happening again, I will
E-mail the list.
From: Eliot Gable
Sent: Monday, July 20, 2009 3:27 PM
To: 'pacemaker@oss.clusterlabs.org'
Subject: Master/Slave failover during reboot
I have a resource that is configured as a Master/Slave resource. If I kill a
resource it is dependent on, i
A "reboot" should never fail. That is, it should always guarantee that the
system actually went down entirely. It does not need to guarantee that it comes
back up automatically. If it gets stuck in the boot-up process, you can just
manually intervene and fix that whenever it's possible and when
FYI, I recently tried NIC bonding on CentOS 5.2 32-bit and had issues in the
bonding driver causing kernel panics. I disabled bonding because it was less
stable.
Eliot Gable
Update your admin epoch.
Eliot Gable
You can always check. Probably look at /var/lib/heartbeat and everything under
it if you are using Heartbeat. If OpenAIS, not sure where to look.
Eliot Gable
Try:
crm_verify -V -x newcib.xml
and make sure it verifies OK. Then do:
cibadmin -R -o cib -x newcib.xml
After doing that, try:
cibadmin -Q | less
And check to see if it has the new CIB. If that doesn't work, post your CIB.
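The same steps can be wrapped so that the live CIB is only replaced when verification succeeds. This is a minimal sketch, not a tested procedure: the function name replace_cib and the RUN hook (set RUN=echo to print the commands instead of running them) are hypothetical, and the crm_verify and cibadmin invocations are copied from the commands above.

```shell
#!/bin/bash
# Sketch: verify a CIB file and replace the live CIB only if it passes.
# RUN is a hypothetical dry-run hook: RUN=echo prints the commands
# instead of executing them.
replace_cib() {
    local cib_file="$1"
    ${RUN:-} crm_verify -V -x "$cib_file" || {
        echo "verification of $cib_file failed; live CIB left untouched" >&2
        return 1
    }
    ${RUN:-} cibadmin -R -o cib -x "$cib_file"
}
```

For example, `RUN=echo replace_cib newcib.xml` prints the two commands, and `replace_cib newcib.xml` runs them. Checking the result with `cibadmin -Q | less` stays a manual step.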
Eliot Gable
I actually do start pingd on just one node and fail it over. It won't work on
my slave node because the slave node does not have Internet access, only local
cluster access. If it ran all the time on that node, it would always show
Internet connectivity down. Thus, I must agree with Andrew: Pacem
sent, contains the appropriate
resources, and is readable and executable by root.
Eliot Gable
missing the files.
Eliot Gable
crm(live)# ra
crm(live)ra# classes
heartbeat
ocf / pacemaker heartbeat
lsb
stonith
Eliot Gable
give you some idea just
how much there is to learn. Don’t expect to have it mastered in a couple of
days.
Eliot Gable
I am using 1.0.3, but the failure-timeout thing does not seem to work for pingd.
Eliot Gable
:
The stonith resources correctly fence on a failure of the stop action on a
resource.
Any suggestions?
Eliot Gable
omatically clear that -1000 score
after a certain (small) interval of time?
Eliot Gable
ydata" value="node1:www.example.com node2:www.example2.com
node3:www.example3.com"
Then, inside the RA, check the hostname and use the domain that is attached to
that hostname.
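A minimal sketch of that lookup, assuming the parameter holds space-separated "host:value" pairs as in the example above. The helper name lookup_node_value is hypothetical, and so is the parameter name in the usage note, since the real name is cut off in the example.

```shell
#!/bin/bash
# Sketch: pick the value attached to a host name out of a
# whitespace-separated list of "host:value" pairs.
# lookup_node_value is a hypothetical helper name.
lookup_node_value() {
    local map="$1" host="$2" pair
    for pair in $map; do
        # Text before the first ':' is the host, the rest is the value.
        if [ "${pair%%:*}" = "$host" ]; then
            echo "${pair#*:}"
            return 0
        fi
    done
    # No entry for this host.
    return 1
}
```

Inside the RA this might be called as `lookup_node_value "$OCF_RESKEY_mydata" "$(uname -n)"`, where mydata stands in for whatever the parameter is actually named.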
Eliot Gable
CLUSTERIP
(since all nodes would then be serving the same data) and put a constraint in
the CIB that says that if the clone fails on a node, pull that node from the
load-sharing config by stopping or moving away your load-sharing resource. At
least, that's how I would do it.
Eliot Gable
Most likely because you have a cluster-recheck-interval="15m" specified.
Eliot Gable
with.
Eliot Gable
th lots of
attributes, but it really makes more sense for CRM to handle this natively.
Please let me know if I am just being ignorant and missing something here.
Thanks again,
Eliot Gable
FYI, I see it too on CentOS 5.2.
Eliot Gable
are both
completely ignored. I cannot find anything in the documentation that says it is
supported, but it parses properly.
It would be really nice to support this.
Eliot Gable
ssing in this logic? Do I need another constraint to force it over to
node2 if res-b fails? Do I need another constraint to force it to node1 if
res-c fails on node2?
Thanks for any assistance you can provide.
Eliot Gable
u express that two multi-state
resources are preferred to be in Master/Slave or Slave/Master colocation, but
Master/Master is also allowed (just lower preference)? Would you assign
role="Master" with a score="-1"?
Thanks again for your assistance.
Eliot Gable
...SNIP...
Am I missing something? Everywhere I look shows inside
:
The live CIB reports this for version:
...SNIP...
Eliot Gable
:
...snip...
Any suggestions?
Thanks in advance for the assistance.
Eliot Gable