Hello

I have a simple two-node cluster (built via the RHEL 6 pcs interface) with
multiple master/slave resources, fencing disabled, and a single sync interface.

Mostly everything works fine. But there is a problem with how quickly the
cluster reacts when the master node is powered off (hard): the slave node
detects that the master is down after anywhere from about 100 to 3500 ms. The
main question is how to avoid the delay of about 3 seconds that occurs
sometimes.

On the slave node I have a little script that checks the connection to the
master node. It detects the loss of the sync link within about 100 ms. But
corosync sometimes needs much more time to figure out the situation and mark
the master node as offline. Until then it shows an 'ok' ring status.
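
For illustration only, a connection check of this kind could look roughly like
the sketch below. It is not the actual script: the function name, the TCP port
(22) and the 100 ms timeout are assumptions made for the example. After a hard
power-off the peer sends no reset, so the connect attempt simply times out and
the timeout value bounds the detection time.

```python
import socket


def peer_reachable(host: str, port: int, timeout_s: float = 0.1) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within timeout_s."""
    try:
        # create_connection raises OSError (including timeouts and refused
        # connections) when the peer cannot be reached in time.
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # "main-node" is the ring0_addr of the master; port 22 is an assumed
    # TCP service that is expected to be listening there.
    if peer_reachable("main-node", 22):
        print("master reachable")
    else:
        print("master unreachable")
```

Such a check only shows that the link is gone; it does not change what
corosync itself believes about the ring.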

If I understand correctly:
1. Pacemaker actions (crm_resource --move) are not carried out until corosync
   has refreshed its ring state.
2. Detection of the problem on the corosync side can be sped up by tuning the
   timeouts in corosync.conf.
3. There is no way to ask corosync to recheck its ring status or to mark a
   ring as failed manually.

But maybe I'm missing something.
All I want is to move resources faster. In my little script I tried to force
the cluster software to move resources to the slave node, but I have had no
success so far.
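
As a rough sketch of what such a forced move might look like (not the actual
script), the snippet below builds and runs a crm_resource call. The resource
name "ms-example" is a placeholder, and restricting the move to the Master role
with --master is an assumption about what is wanted for a master/slave
resource.

```python
import subprocess
from typing import List


def build_move_command(resource: str, target_node: str,
                       master_only: bool = True) -> List[str]:
    """Build a crm_resource command line that moves a resource to target_node."""
    cmd = ["crm_resource", "--resource", resource, "--move", "--node", target_node]
    if master_only:
        # Assumption: only the Master role should be moved.
        cmd.append("--master")
    return cmd


def move_resource(resource: str, target_node: str, master_only: bool = True) -> int:
    """Run the move command and return the exit status of crm_resource."""
    result = subprocess.run(build_move_command(resource, target_node, master_only),
                            check=False)
    return result.returncode


if __name__ == "__main__":
    # "ms-example" is a hypothetical resource name; "reserve-node" is the
    # slave node from the nodelist in corosync.conf.
    print(" ".join(build_move_command("ms-example", "reserve-node")))
```

The command only records a location constraint; whether pacemaker acts on it
before corosync has marked the master node as offline is exactly the open
question here.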

Could you please share your thoughts on the situation? Thank you in advance.

Cluster software:
corosync - 2.4.3
pacemaker - 1.1.18
libqb - 1.0.2

corosync.conf:

totem {
    version: 2
    secauth: off
    cluster_name: cluster
    transport: udpu
    token: 2000
}

nodelist {
    node {
        ring0_addr: main-node
        nodeid: 1
    }

    node {
        ring0_addr: reserve-node
        nodeid: 2
    }
}

quorum {
    provider: corosync_votequorum
    two_node: 1
}


Regards, Maxim.