Hi all,
I have defined a fence resource and cloned it. But when a node becomes
UNCLEAN (I disconnected its network), the fence action is executed
immediately. Is there a way to avoid this (for example, a tolerance
time for brief network flaps)? If the network is not stable,
I don't want nodes to be fenced unnecessarily.
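A tolerance window like this is usually handled at the membership layer
rather than in Pacemaker itself. A minimal sketch, assuming corosync 2.x
and that slower failure detection is acceptable for your failover
requirements (the 10000 ms value is illustrative only):

  # /etc/corosync/corosync.conf
  totem {
          version: 2
          # declare a node dead only after 10 s without token traffic,
          # so short network flaps do not mark the node UNCLEAN
          token: 10000
  }

Note that raising the token timeout delays all failure detection, not
just fencing.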
I have a resource that returned 'not installed' because (I think) I had
forgotten to install the required package. I've installed the package now but I
still see the following every 15 minutes:
Preventing ocfs2mgmt from re-starting on node1: operation monitor failed 'not
installed' (rc=5)
As f
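The failure record persists until it is cleaned up; reinstalling the
package alone does not clear it. A minimal sketch of the cleanup,
assuming the crm shell is available:

  # forget the stored monitor failure and re-probe the resource
  crm resource cleanup ocfs2mgmt node1

  # equivalently, with the lower-level tool:
  crm_resource --cleanup --resource ocfs2mgmt --node node1

After the cleanup, Pacemaker re-probes the resource and should now find
the agent's package installed.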
On Thu, Dec 20, 2012 at 12:25 AM, Felipe Gutierrez <felipe.o.gutier...@gmail.com> wrote:
> Hi Soni,
>
> I made these configurations on my DRBD and split-brain recovery
> worked correctly.
>
> http://www.drbd.org/users-guide-8.3/s-configure-split-brain-behavior.html#s-split-brain-notification
> Hi Cherish,
>
> On Wed, Dec 19, 2012 at 1:11 AM, bin chen wrote:
>
>> Hi,all
>> My cluster is pacemaker 1.1.7 + corosync 2.0. I have written a
>> resource agent to manage the virtual machine. The RA supports
>> start, stop, migrate_from, migrate_to, and monitor.
>> But when I try to migrate
Hi!
Is it possible to set something like a started-weight score for a
resource? Or to make a resource stop instead of migrating elsewhere (and
dragging along all the resources colocated with it)? (That would be the
same as setting resource-stickiness = started-weight.)
I have a spam filter that sometimes fails
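One way to get the stop-instead-of-migrate behavior is through the
monitor operation's on-fail setting rather than a score. A minimal
sketch in crm shell syntax, using ocf:heartbeat:Dummy as a stand-in for
the spam filter resource:

  primitive spamfilter ocf:heartbeat:Dummy \
          op monitor interval=30s on-fail=stop

With on-fail=stop, a failed monitor stops the resource in place instead
of recovering it on another node, so colocated resources are not dragged
along.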
Is fencing/stonith configured in pacemaker? Can you call a fence against
a peer in pacemaker and trigger a reboot of the target node? If that
doesn't work, then you don't have proper fencing in pacemaker and the
crm-fence-peer.sh hook script won't work.
So yes, you need stonith and you need to make sure it actually works.
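A quick way to verify this from the command line, a minimal sketch
assuming node2 is a peer you can safely power-cycle:

  # ask pacemaker's fencing subsystem to reboot the peer
  stonith_admin --reboot node2

If the target does not reboot, fencing is not working and the
crm-fence-peer.sh hook cannot work either.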
Hi Digimer,
I am already using crm-fence-peer.sh
resource r8 {
  handlers {
    fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
    after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
    split-brain "/usr/lib/drbd/notify-split-brain.sh root";
  }
}
Is stonith still necessary? How do I configure it?
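Yes, stonith is still necessary: crm-fence-peer.sh only places a
constraint in the CIB, while actually cutting off a failed node requires
a working fence device in Pacemaker. A minimal sketch in crm shell
syntax, assuming IPMI-capable nodes; hostnames, addresses, and
credentials are placeholders:

  primitive fence-node1 stonith:external/ipmi \
          params hostname=node1 ipaddr=192.168.1.101 userid=admin passwd=secret \
          op monitor interval=60s
  primitive fence-node2 stonith:external/ipmi \
          params hostname=node2 ipaddr=192.168.1.102 userid=admin passwd=secret \
          op monitor interval=60s
  location l-fence-node1 fence-node1 -inf: node1
  location l-fence-node2 fence-node2 -inf: node2
  property stonith-enabled=true

The location constraints keep each fence device off the node it is
supposed to kill.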
On 12/19/2012 06:21 AM, Felipe Gutierrez wrote:
> Hi everyone,
>
> I have a scenario where I disconnect my primary from the network and the
> secondary takes over, becoming primary. After this, I reconnect the younger
> primary, and both nodes become Secondary (DRBD), or Slave in Pacemaker.
> It is becaus
Hi Soni,
I made these configurations on my DRBD and split-brain recovery
worked correctly.
http://www.drbd.org/users-guide-8.3/s-configure-split-brain-behavior.html#s-split-brain-notification
I believe that the article you sent me is about split-brain in Corosync,
and it is different o
Hi Soni, thanks for the reply.
I understand that it is not possible if I don't have a dedicated
back-to-back connection.
But I am thinking of creating a script that does that for me.
The commands are described here:
http://www.hastexo.com/resources/hints-and-kinks/solve-drbd-split-brain-4-steps
And these co
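For reference, the manual recovery those steps describe looks roughly
like this; a minimal sketch using DRBD 8.3-era syntax, assuming resource
r0 and that the node whose changes may be discarded is the split-brain
victim:

  # on the split-brain victim: give up its changes and reconnect
  drbdadm secondary r0
  drbdadm -- --discard-my-data connect r0

  # on the survivor, if it reports StandAlone:
  drbdadm connect r0

DRBD then resynchronizes the victim from the survivor.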
Well, not accurate unless they consider the crm interface as making manual
changes. I do not see those errors anymore after rebooting the machines.
Things are (almost) all working now.
Paul
--
Speak the truth, but leave immediately after. - Slovenian proverb
Paul Shannon
ITO, WFO Juneau
Thank you, thank you, thank you. I was chasing that problem for several
days. I did not see anything in any of the logs that pointed to where the
problem might be. Now I see that there is something about that requirement
on the linux-ha ocf_heartbeat_apache page. Thanks again.
Paul Shannon
Hi Cherish,
On Wed, Dec 19, 2012 at 1:11 AM, bin chen wrote:
> Hi,all
> My cluster is pacemaker 1.1.7 + corosync 2.0. I have written a
> resource agent to manage the virtual machine. The RA supports
> start, stop, migrate_from, migrate_to, and monitor.
> But when I try to migrate a running
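If migrate_to/migrate_from are never invoked, one common cause is that
live migration is not enabled on the resource. A minimal sketch in crm
shell syntax, with a hypothetical primitive standing in for the VM:

  primitive my-vm ocf:custom:vm-agent \
          meta allow-migrate=true \
          op monitor interval=30s

Without meta allow-migrate=true, Pacemaker moves the resource with a
stop on one node and a start on the other, and the migrate actions are
never called.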
Oops, I haven't had my coffee yet this morning... I see you've written
your own RA rather than using the existing ones, my apologies for the noise
on the list.
Mark
On Wed, Dec 19, 2012 at 9:08 AM, mark - pacemaker list <m+pacema...@nerdish.us> wrote:
> Hi Cherish,
>
> On Wed, Dec 19, 2012 at
On Wednesday, 19 December 2012 at 13:28:40, Lars Marowsky-Bree wrote:
> On 2012-12-19T13:22:54, Nikita Michalko wrote:
> > > They should all read Lamport ;-)
> >
> > Interesting - what/who is Lamport though?
>
> LMGTFY:
> http://en.wikipedia.org/wiki/Lamport_timestamps#Lamport.27s_logical_clock_in_distributed_systems
On 2012-12-19T13:22:54, Nikita Michalko wrote:
> > They should all read Lamport ;-)
> Interesting - what/who is Lamport though?
LMGTFY:
http://en.wikipedia.org/wiki/Lamport_timestamps#Lamport.27s_logical_clock_in_distributed_systems
--
Architect Storage/HA
SUSE LINUX Products GmbH, GF: Jeff
Hi Lars!
On Wednesday, 19 December 2012 at 13:06:25, Lars Marowsky-Bree wrote:
> On 2012-12-19T10:06:25, James Harper wrote:
> > What is the behaviour of a cluster when the nodes are up to 10 minutes
> > out of sync with each other, because they've just been booted up after
> > a crash and th
On 2012-12-19T10:06:25, James Harper wrote:
> What is the behaviour of a cluster when the nodes are up to 10 minutes
> out of sync with each other, because they've just been booted up after
> a crash and the hwclocks are out of date and there is no ntp time
> source reachable? Could it cause lots
On 12/19/12 5:06 AM, James Harper wrote:
What is the best way on bootup in the above situation to ensure time
synchronisation? Is it as simple as having a cron job to reset the hardware
clock every so often so that on reboot things are reasonable?
At least RHEL and SuSE can do an explicit ntp
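A minimal sketch of the cron idea mentioned above, assuming ntpd keeps
the system clock correct while the node is up; the hourly schedule is
illustrative:

  # /etc/cron.d/sync-hwclock (hypothetical file)
  # copy system time to the hardware clock, so a reboot without a
  # reachable NTP server still starts with a reasonable time
  0 * * * * root /sbin/hwclock --systohc

This mainly helps after a crash; on a clean shutdown most distributions
save the system time to the hardware clock anyway.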
Cutting the communication link between the two nodes is not a valid
failover scenario. Both sides will think the other node is offline and
become primary, and if you reconnect them, split-brain will happen. You
can make the communication link between the two nodes redundant. Maybe
these articles c
laurent+pacema...@u-picardie.fr writes:
> In the end I'm going to file a bug.
Just for information:
http://bugs.clusterlabs.org/show_bug.cgi?id=5127
"stonith_agent status" was always returning rc=0 despite being called
with the port and nodename env vars, my mistake.
to work around the issue w
Hi everyone,
I have a scenario where I disconnect my primary from the network and the
secondary takes over, becoming primary. After this, I reconnect the younger
primary, and both nodes become Secondary (DRBD), or Slave in Pacemaker. It
is because DRBD on the younger primary is StandAlone and Outdated. It is
>
> What is the behaviour of a cluster when the nodes are up to 10 minutes out
> of sync with each other, because they've just been booted up after a crash
> and the hwclocks are out of date and there is no ntp time source reachable?
> Could it cause lots of sig11's and constant re-elections becau
What is the behaviour of a cluster when the nodes are up to 10 minutes out of
sync with each other, because they've just been booted up after a crash and the
hwclocks are out of date and there is no ntp time source reachable? Could it
cause lots of sig11's and constant re-elections because that'
(12.12.13 08:26), Andrew Beekhof wrote:
On Wed, Dec 12, 2012 at 8:02 PM, Kazunori INOUE wrote:
Hi,
I recognize that pacemakerd is much less likely to crash.
However, the possibility of it being killed by the OOM killer etc. is not 0%.
True. Although we just established in another thread that we don
Hi,
How can I configure a resource (e.g. an Apache instance) to depend on the
start of a clone resource (e.g. a filesystem resource) on the given node?
I know how to arrange a primitive into a group, but in this particular case,
the primitive must run on the passive node as well (performing some async
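Ordering against the clone instance on the same node is typically done
with an order constraint plus a colocation; a minimal sketch in crm
shell syntax, assuming a hypothetical clone named fs-clone and a
primitive named apache:

  order apache-after-fs inf: fs-clone apache
  colocation apache-with-fs inf: apache fs-clone

The order makes apache wait for the clone to start, and the colocation
ties apache to a node with a running instance, while the clone itself
keeps running on the passive node as well.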
Hi,
I am not able to move NFS to the second node of my cluster. Some time ago,
crmd on the node that NFS currently runs on jammed (it used up all its
file descriptors) and was kill -9'ed:
Last updated: Wed Dec 19 03:39:59 2012
Stack: openais
Current DC: filer-1 - partition with quorum
Version:
Hi all,
I'd like to share my successful attempt to confine pacemaker.
I took the barebones pacemaker module found in the latest Fedora
selinux-policy (3.11.1-64.fc18) and extended it a bit, so now I have
Pacemaker and some Pacemaker-managed services running confined.
Everything runs on EL6 with corosy
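For anyone wanting to reproduce this, building and loading a locally
extended policy module usually looks like the following; a minimal
sketch, assuming the module source is in pacemaker.te:

  # compile the type-enforcement source into a policy module
  checkmodule -M -m -o pacemaker.mod pacemaker.te
  # package it (add -f pacemaker.fc if you also have file contexts)
  semodule_package -o pacemaker.pp -m pacemaker.mod
  # load it into the running policy
  semodule -i pacemaker.pp

On EL6, these tools come from the checkpolicy and policycoreutils
packages.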