Re: [ClusterLabs] Replicated PGSQL woes

2016-10-13 Thread Israel Brewster
On Oct 13, 2016, at 1:56 PM, Jehan-Guillaume de Rorthais  
wrote:
> 
> On Thu, 13 Oct 2016 10:05:33 -0800
> Israel Brewster  wrote:
> 
>> On Oct 13, 2016, at 9:41 AM, Ken Gaillot  wrote:
>>> 
>>> On 10/13/2016 12:04 PM, Israel Brewster wrote:  
> [...]
> 
 But whatever- this is a cluster, it doesn't really matter which node
 things are running on, as long as they are running. So the cluster is
 working - postgresql starts, the master process is on the same node as
 the IP, you can connect, etc, everything looks good. Obviously the next
 thing to try is failover - should the master node fail, the slave node
 should be promoted to master. So I try testing this by shutting down the
 cluster on the primary server: "pcs cluster stop"
 ...and nothing happens. The master shuts down (uncleanly, I might add -
 it leaves behind a lock file that prevents it from starting again until
 I manually remove said lock file), but the slave is never promoted to  
>>> 
>>> This definitely needs to be corrected. What creates the lock file, and
>>> how is that entity managed?  
>> 
>> The lock file entity is created/managed by the postgresql process itself. On
>> launch, postgres creates the lock file to say it is running, and deletes said
>> lock file when it shuts down. To my understanding, its role in life is to
>> prevent a restart after an unclean shutdown so the admin is reminded to make
>> sure that the data is in a consistent state before starting the server again.
> 
> What is the name of this lock file? Where is it?
> 
> PostgreSQL does not create lock file. It creates a "postmaster.pid" file, but
> it does not forbid a startup if the new process doesn't find another process
> with the pid and shm shown in the postmaster.pid.
> 
> As far as I know, the pgsql resource agent create such a lock file on promote
> and delete it on graceful stop. If the PostgreSQL instance couldn't be stopped
> correctly, the lock files stays and the RA refuse to start it the next time.

Ah, you're right. Looking at the RA I see where it creates the file in 
question. The delete appears to be in the pgsql_real_stop() function (which 
makes sense), wrapped in an if block that checks for $1 being master and 
$OCF_RESKEY_CRM_meta_notify_slave_uname being a space. Throwing a little 
debugging code in there I see that when it hits that block on a cluster stop, 
$OCF_RESKEY_CRM_meta_notify_slave_uname is centtest1.ravnalaska.net, not a 
space, so the lock file is not removed:

if  [ "$1" = "master" -a "$OCF_RESKEY_CRM_meta_notify_slave_uname" = " " ]; 
then
ocf_log info "Removing $PGSQL_LOCK."
rm -f $PGSQL_LOCK
fi 

It doesn't look like there is anywhere else where the file would be removed.
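(For reference, the "little debugging code" here amounts to a single extra
ocf_log line just above that check; a sketch only, not part of the stock RA:)

    # Hypothetical debug line, placed in pgsql_real_stop() just before the if-block
    ocf_log info "DEBUG stop: role='$1' slave_uname='$OCF_RESKEY_CRM_meta_notify_slave_uname'"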

> 
> [...]
 What can I do to fix this? What troubleshooting steps can I follow? Thanks.
> 
> I can not find the result of the stop operation in your log files, maybe the
> log from CentTest2 would be more useful.

Sure. I was looking at centtest1 because I was trying to figure out why it 
wouldn't promote, but if centtest2 never really stopped (properly) that could 
explain things. Here's the log from 2 when calling pcs cluster stop:

Oct 13 14:05:14 CentTest2 attrd[9424]:   notice: Sending flush op to all hosts 
for: standby (true)
Oct 13 14:05:14 CentTest2 attrd[9424]:   notice: Sent update 26: standby=true
Oct 13 14:05:14 CentTest2 pacemaker: Waiting for shutdown of managed resources
Oct 13 14:05:14 CentTest2 crmd[9426]:   notice: Operation pgsql_96_notify_0: ok 
(node=centtest2.ravnalaska.net, call=21, rc=0, cib-update=0, confirmed=true)
Oct 13 14:05:14 CentTest2 attrd[9424]:   notice: Sending flush op to all hosts 
for: master-pgsql_96 (-INFINITY)
Oct 13 14:05:14 CentTest2 attrd[9424]:   notice: Sent update 28: 
master-pgsql_96=-INFINITY
Oct 13 14:05:14 CentTest2 attrd[9424]:   notice: Sending flush op to all hosts 
for: pgsql_96-master-baseline ()
Oct 13 14:05:14 CentTest2 attrd[9424]:   notice: Sent delete 30: 
node=centtest2.ravnalaska.net, attr=pgsql_96-master-baseline, id=, 
set=(null), section=status
Oct 13 14:05:14 CentTest2 attrd[9424]:   notice: Sent delete 32: 
node=centtest2.ravnalaska.net, attr=pgsql_96-master-baseline, id=, 
set=(null), section=status
Oct 13 14:05:14 CentTest2 pgsql(pgsql_96)[5107]: INFO: Stopping PostgreSQL on 
demote.
Oct 13 14:05:14 CentTest2 pgsql(pgsql_96)[5107]: INFO: stop_escalate(or 
stop_escalate_in_slave) time is adjusted to 50 based on the configured timeout.
Oct 13 14:05:14 CentTest2 pgsql(pgsql_96)[5107]: INFO: server shutting down
Oct 13 14:05:15 CentTest2 pgsql(pgsql_96)[5107]: INFO: PostgreSQL is down
Oct 13 14:05:15 CentTest2 pgsql(pgsql_96)[5107]: INFO: Changing pgsql_96-status 
on centtest2.ravnalaska.net : PRI->STOP.
Oct 13 14:05:15 CentTest2 attrd[9424]:   notice: Sending flush op to all hosts 
for: pgsql_96-status (STOP)
Oct 13 14:05:15 

Re: [ClusterLabs] Replicated PGSQL woes

2016-10-13 Thread Jehan-Guillaume de Rorthais
On Thu, 13 Oct 2016 10:05:33 -0800
Israel Brewster  wrote:

> On Oct 13, 2016, at 9:41 AM, Ken Gaillot  wrote:
> > 
> > On 10/13/2016 12:04 PM, Israel Brewster wrote:  
[...]
 
> >> But whatever- this is a cluster, it doesn't really matter which node
> >> things are running on, as long as they are running. So the cluster is
> >> working - postgresql starts, the master process is on the same node as
> >> the IP, you can connect, etc, everything looks good. Obviously the next
> >> thing to try is failover - should the master node fail, the slave node
> >> should be promoted to master. So I try testing this by shutting down the
> >> cluster on the primary server: "pcs cluster stop"
> >> ...and nothing happens. The master shuts down (uncleanly, I might add -
> >> it leaves behind a lock file that prevents it from starting again until
> >> I manually remove said lock file), but the slave is never promoted to  
> > 
> > This definitely needs to be corrected. What creates the lock file, and
> > how is that entity managed?  
> 
> The lock file entity is created/managed by the postgresql process itself. On
> launch, postgres creates the lock file to say it is running, and deletes said
> lock file when it shuts down. To my understanding, its role in life is to
> prevent a restart after an unclean shutdown so the admin is reminded to make
> sure that the data is in a consistent state before starting the server again.

What is the name of this lock file? Where is it?

PostgreSQL does not create a lock file. It creates a "postmaster.pid" file, but
it does not forbid a startup if the new process doesn't find another process
with the pid and shm shown in postmaster.pid.

As far as I know, the pgsql resource agent creates such a lock file on promote
and deletes it on graceful stop. If the PostgreSQL instance couldn't be stopped
correctly, the lock file stays and the RA refuses to start it the next time.
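(If the lock file ever has to be cleared by hand after a crash, it normally
lives under the RA's tmpdir; a sketch, assuming the pgsql RA default path and
that the resource does not override "tmpdir":)

    # Default location used by the pgsql RA unless the resource sets tmpdir;
    # only remove it after confirming the data on this node is consistent.
    ls -l /var/lib/pgsql/tmp/PGSQL.lock
    rm -f /var/lib/pgsql/tmp/PGSQL.lock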

[...]
> >> What can I do to fix this? What troubleshooting steps can I follow? Thanks.

I cannot find the result of the stop operation in your log files; maybe the
log from CentTest2 would be more useful. But I can find this:

  Oct 13 08:29:41 CentTest1 pengine[30095]:   notice: Scheduling Node
  centtest2.ravnalaska.net for shutdown
  ...
  Oct 13 08:29:41 CentTest1 pengine[30095]:   notice: Scheduling Node
  centtest2.ravnalaska.net for shutdown

Which means the stop operation probably raised an error, leading to a fencing
of the node. In this circumstance, I bet PostgreSQL wasn't able to stop
correctly and the lock file stayed in place.

Could you please show us your full cluster setup?
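(For instance, assuming pcs and the usual Pacemaker CLI tools are installed,
any of the following captures the configuration and current state:)

    pcs config                   # resources, constraints and cluster properties
    pcs status --full            # current state, including node attributes
    crm_mon -Afr -1              # one-shot status with fail counts and attributes
    cibadmin --query > cib.xml   # raw CIB, if the XML is easier to share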

By the way, did you have a look at the PAF project?

  http://dalibo.github.io/PAF/
  http://dalibo.github.io/PAF/documentation.html

The v1.1 version for EL6 is not ready yet, but you might want to give it a
try: https://github.com/dalibo/PAF/tree/v1.1

I would recommend EL7 and PAF 2.0, which is published, packaged, and ready to use.

Regards,

-- 
Jehan-Guillaume (ioguix) de Rorthais
Dalibo



Re: [ClusterLabs] cross DC cluster using public ip?

2016-10-13 Thread Les Green
Corosync does not work with NAT. At least I tried for AGES and could not
get it to work.

Easiest is to set up a VPN between the sites or servers for just the
corosync traffic.
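(In that case each node binds corosync to its tunnel address; a sketch only,
using a hypothetical 10.8.0.0/24 VPN, in the same corosync 1.x udpu style as
the config quoted below:)

    totem {
        version: 2
        transport: udpu
        interface {
            ringnumber: 0
            # hypothetical VPN network the tunnel addresses live in
            bindnetaddr: 10.8.0.0
            mcastport: 5405
            # one member block per node, using each node's tunnel IP (hypothetical)
            member {
                memberaddr: 10.8.0.1
            }
            member {
                memberaddr: 10.8.0.2
            }
        }
    }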

On 13.10.2016 22:14, neeraj ch wrote:
> Hello 
> 
> Thank you for taking the time to respond. 
> 
> In my setup the public IP is not on the box , the box is attached to a
> private network and packets to the public IP  I think are just forwarded
> to the private IP. 
> 
> When I tried using the local private address as the bind address ,
> public address as the member address and ran a tcp dump , both nodes are
> sending packets to each other over the public IP but they are responding
> to each other's private address Instead of just responding back to the
> address the packet arrived from. It looks like corosync is sending the
> IP its listening on , and the other node is trying to respond to it ,
> and hence if corosync binds to a private address a node not in the same
> DC will not be able to respond to it. 
> 
> Is this how corosync works ? 
> 
> Is there a way to force the node to respond to the IP its receiving
> packets from ? or to broad cast its public IP rather than the private IP
> ? Would it be any better if I used corosync 2.X , for the same setup ?  
> 
> On Thu, Oct 13, 2016 at 12:41 AM, Klaus Wenninger wrote:
> 
> On 10/13/2016 09:30 AM, Jan Friesse wrote:
> > neeraj ch napsal(a):
> >> Hello ,
> >>
> >> We are testing out corosync and pacemaker for DB high availability on
> >> the
> >> cloud. I was able to set up a cluster with in a DC using corosync 1.4
> >> and
> >> pacemaker 1.12. It works great and I wanted to try a cross DC cluster. 
> I
> >> was using unicast as multicast was disabled by default.
> >>
> >> I was not sure how Corosync behaves with public IP's but I still went
> >> ahead
> >> and tried it with both public IP's as well as DNS names. These DNS 
> names
> >> resolve as local IP when the other node is with in the same subnet.
> >
> > Every node has to be able to see every other node. So mixing of public
> > and private ips is not going to work (with exception of special case
> > where all private ips are in the same network). Also keep in mind
> > config file has to be same on all nodes.
> 
> Guess reason is that corosync derives an ID from the IP.
> So the hostname has to resolve to the same IP on all nodes
> and under all circumstances.
> 
> Oh Got It.  
> 
> 
> >
> >
> >>
> >> while I was using public IP's both the node inside the same subnet as
> >> well
> >> as outside were unable to connect, except for itself. While using DNS
> >> names
> >> the membership information showed the nodes within same subnet being
> >> connected to while the nodes outside were not connected
> >
> > This is somehow expected.
> >>
> >>
> >> My corosync config is as follows.
> >>
> >> totem {
> >>version: 2
> >>secauth: off
> >>threads: 0
> >>interface {
> >>
> >> member {
> >>memberaddr: 
> >> }
> >>member {
> >>memberaddr: 
> >> }
> >> member {
> >>memberaddr: 
> >> }
> >> ringnumber: 0
> >> bindnetaddr: 172.31.0.0
> >> mcastport: 5405
> >> ttl: 1
> >>}
> >>transport: udpu
> >> }
> >>
> >> logging {
> >>fileline: off
> >>to_stderr: no
> >>to_logfile: yes
> >>to_syslog: yes
> >>logfile: /var/log/cluster/corosync.log
> >>debug: on
> >>timestamp: on
> >>logger_subsys {
> >> subsys: AMF
> >> debug: on
> >>}
> >> }
> >>
> >> service {
> >>  # Load the Pacemaker Cluster Resource Manager
> >>  name: pacemaker
> >>  ver: 1
> >> }
> >>
> >> amf {
> >>mode: disabled
> >> }
> >>
> >>
> >> I am checking membership information by using corosync-objctl. I have
> >> also
> >> tried using public ip as the bind address , that makes the membership
> >> from
> >
> > Just to make sure. This "public" ip is really ip of given machine?
> >
> >> 1 to 0 as it doesn't add itself.
> >>
> >> If any one has any suggestion / advice on how to debug or what I am
> >> doing
> >> wrong . Any help would be very appreciated.
> >>
> >> Thank you
> >>
> >>
> >>
> >> ___
> >> Users mailing list: Users@clusterlabs.org
> 
> 

Re: [ClusterLabs] cross DC cluster using public ip?

2016-10-13 Thread neeraj ch
Hello

Thank you for taking the time to respond.

In my setup the public IP is not on the box; the box is attached to a
private network, and packets to the public IP are, I think, just forwarded to
the private IP.

When I tried using the local private address as the bind address and the
public address as the member address, and ran a tcpdump, both nodes were
sending packets to each other over the public IP, but they were responding to
each other's private address instead of just responding back to the address
the packet arrived from. It looks like corosync is sending the IP it's
listening on, and the other node is trying to respond to it, and hence if
corosync binds to a private address, a node not in the same DC will not be
able to respond to it.

Is this how corosync works?

Is there a way to force the node to respond to the IP it's receiving packets
from, or to broadcast its public IP rather than the private IP? Would it be
any better if I used corosync 2.x for the same setup?

On Thu, Oct 13, 2016 at 12:41 AM, Klaus Wenninger 
wrote:

> On 10/13/2016 09:30 AM, Jan Friesse wrote:
> > neeraj ch napsal(a):
> >> Hello ,
> >>
> >> We are testing out corosync and pacemaker for DB high availability on
> >> the
> >> cloud. I was able to set up a cluster with in a DC using corosync 1.4
> >> and
> >> pacemaker 1.12. It works great and I wanted to try a cross DC cluster. I
> >> was using unicast as multicast was disabled by default.
> >>
> >> I was not sure how Corosync behaves with public IP's but I still went
> >> ahead
> >> and tried it with both public IP's as well as DNS names. These DNS names
> >> resolve as local IP when the other node is with in the same subnet.
> >
> > Every node has to be able to see every other node. So mixing of public
> > and private ips is not going to work (with exception of special case
> > where all private ips are in the same network). Also keep in mind
> > config file has to be same on all nodes.
>
> Guess reason is that corosync derives an ID from the IP.
> So the hostname has to resolve to the same IP on all nodes
> and under all circumstances.
>
Oh Got It.

>
> >
> >
> >>
> >> while I was using public IP's both the node inside the same subnet as
> >> well
> >> as outside were unable to connect, except for itself. While using DNS
> >> names
> >> the membership information showed the nodes within same subnet being
> >> connected to while the nodes outside were not connected
> >
> > This is somehow expected.
> >>
> >>
> >> My corosync config is as follows.
> >>
> >> totem {
> >>version: 2
> >>secauth: off
> >>threads: 0
> >>interface {
> >>
> >> member {
> >>memberaddr: 
> >> }
> >>member {
> >>memberaddr: 
> >> }
> >> member {
> >>memberaddr: 
> >> }
> >> ringnumber: 0
> >> bindnetaddr: 172.31.0.0
> >> mcastport: 5405
> >> ttl: 1
> >>}
> >>transport: udpu
> >> }
> >>
> >> logging {
> >>fileline: off
> >>to_stderr: no
> >>to_logfile: yes
> >>to_syslog: yes
> >>logfile: /var/log/cluster/corosync.log
> >>debug: on
> >>timestamp: on
> >>logger_subsys {
> >> subsys: AMF
> >> debug: on
> >>}
> >> }
> >>
> >> service {
> >>  # Load the Pacemaker Cluster Resource Manager
> >>  name: pacemaker
> >>  ver: 1
> >> }
> >>
> >> amf {
> >>mode: disabled
> >> }
> >>
> >>
> >> I am checking membership information by using corosync-objctl. I have
> >> also
> >> tried using public ip as the bind address , that makes the membership
> >> from
> >
> > Just to make sure. This "public" ip is really ip of given machine?
> >
> >> 1 to 0 as it doesn't add itself.
> >>
> >> If any one has any suggestion / advice on how to debug or what I am
> >> doing
> >> wrong . Any help would be very appreciated.
> >>
> >> Thank you
> >>
> >>
> >>
> >> ___
> >> Users mailing list: Users@clusterlabs.org
> >> http://clusterlabs.org/mailman/listinfo/users
> >>
> >> Project Home: http://www.clusterlabs.org
> >> Getting started: http://www.clusterlabs.org/
> doc/Cluster_from_Scratch.pdf
> >> Bugs: http://bugs.clusterlabs.org
> >>
> >
> >
> > ___
> > Users mailing list: Users@clusterlabs.org
> > http://clusterlabs.org/mailman/listinfo/users
> >
> > Project Home: http://www.clusterlabs.org
> > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> > Bugs: http://bugs.clusterlabs.org
>
>
>
> ___
> Users mailing list: Users@clusterlabs.org
> http://clusterlabs.org/mailman/listinfo/users
>
> Project Home: http://www.clusterlabs.org
> Getting started: 

Re: [ClusterLabs] Replicated PGSQL woes

2016-10-13 Thread Ken Gaillot
On 10/13/2016 12:04 PM, Israel Brewster wrote:
> Summary: Two-node cluster setup with latest pgsql resource agent.
> Postgresql starts initially, but failover never happens.
> 
> Details:
> 
> I'm trying to get a cluster set up with Postgresql 9.6 in a streaming
> replication using named slots scenario. I'm using the latest pgsql
> Resource Agent, which does appear to support the named replication slot
> feature, and I've pulled in the various utility functions the RA uses
> that weren't available in my base install, so the RA itself no longer
> gives me errors.
> 
> Setup: Two machines, centtest1 and centtest2. Both are running CentOS
> 6.8. Centtest1 has an IP of 10.211.55.100, and centtest2 has an IP of
> 10.211.55.101. The cluster is set up and functioning, with a shared
> virtual IP resource at 10.211.55.200. Postgresql has been set up and
> tested functioning properly on both nodes with centtest1 as the master
> and centtest2 as the streaming replica slave. 
> 
> I then set up the postgresql master/slave resource using the following
> commands:
> 
> pcs resource create pgsql_96 pgsql \
> pgctl="/usr/pgsql-9.6/bin/pg_ctl" \
> logfile="/var/log/pgsql/test2.log" \
> psql="/usr/pgsql-9.6/bin/psql" \
> pgdata="/pgsql96/data" \
> rep_mode="async" \
> repuser="postgres" \
> node_list="tcentest1.ravnalaska.net 
> centtest2.ravnalaska.net " \
> master_ip="10.211.55.200" \
> archive_cleanup_command="" \
> restart_on_promote="true" \
> replication_slot_name="centtest_2_slot" \
> monitor_user="postgres" \
> monitor_password="SuperSecret" \
> op start timeout="60s" interval="0s" on-fail="restart" \
> op monitor timeout="60s" interval="4s" on-fail="restart" \
> op monitor timeout="60s" interval="3s" on-fail="restart" role="Master" \
> op promote timeout="60s" interval="0s" on-fail="restart" \
> op demote timeout="60s" interval="0s" on-fail=stop \
> op stop timeout="60s" interval="0s" on-fail="block" \
> op notify timeout="60s" interval="0s";
> 
> pcs resource master msPostgresql pgsql_96 master-max=1 master-node-max=1
> clone-max=2 clone-node-max=1 notify=true
> 
> pcs constraint colocation add virtual_ip with Master msPostgresql INFINITY
> pcs constraint order promote msPostgresql then start virtual_ip
> symmetrical=false score=INFINITY
> pcs constraint order demote  msPostgresql then stop  virtual_ip
> symmetrical=false score=0
> 
> My preference would be that the master runs on centtest1, so I add the
> following constraint as well:
> 
> pcs constraint location --master msPostgresql prefers
> centtest1.ravnalaska.net=50
> 
> When I then start the cluster, I first see *both* machines come up as
> "slave", which I feel is somewhat odd, however the cluster software
> quickly figures things out and promotes centtest2 to master. I've tried

This is inherent to pacemaker's model of multistate resources. Instances
are always started in slave mode, and then promotion to master is a
separate step.

> this a dozen different times, and it *always* promotes centtest2 to
> master - even if I put INFINITY in for the location constraint.

Surprisingly, location constraints do not directly support limiting to
one role (your "--master" option is ignored, and I'm surprised it
doesn't give an error). To do what you want, you need a rule, like:

pcs constraint location msPostgresql rule \
   role=master score=50 \
   #uname eq centtest1.ravnalaska.net

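(To confirm the rule is in place and to watch the promotion scores the agent
sets, something along these lines works; exact output varies by version:)

    pcs constraint --full    # lists the location rule together with its constraint id
    crm_mon -1 -A            # one-shot status; master scores appear as node attributes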

> But whatever- this is a cluster, it doesn't really matter which node
> things are running on, as long as they are running. So the cluster is
> working - postgresql starts, the master process is on the same node as
> the IP, you can connect, etc, everything looks good. Obviously the next
> thing to try is failover - should the master node fail, the slave node
> should be promoted to master. So I try testing this by shutting down the
> cluster on the primary server: "pcs cluster stop"
> ...and nothing happens. The master shuts down (uncleanly, I might add -
> it leaves behind a lock file that prevents it from starting again until
> I manually remove said lock file), but the slave is never promoted to

This definitely needs to be corrected. What creates the lock file, and
how is that entity managed?

> master. Neither pcs status or crm_mon show any errors, but centtest1
> never becomes master.

I remember a situation where a resource agent improperly set master
scores, which led to no master being promoted. I don't remember the
details, though.
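(One way to check whether the agent is setting a promotion score at all; the
attribute name follows the usual master-<resource> convention, matching the
master-pgsql_96 attribute seen elsewhere in this thread:)

    # Query the transient master score for centtest1 (repeat with the other node)
    crm_attribute -N centtest1.ravnalaska.net -n master-pgsql_96 -l reboot --query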

> 
> If instead of stoping the cluster on centtest2, I try to simply move the
> master using the command "pcs resource move --master msPostgresql", I
> first run into the aforementioned unclean shutdown issue (lock file left
> behind that has to be manually removed), and after removing the lock
> file, I wind up with *both* nodes being slaves, and no master node. "pcs
> resource clear --master msPostgresql" re-promotes centtest2 to master.
> 
> What it looks like is that for 

[ClusterLabs] Replicated PGSQL woes

2016-10-13 Thread Israel Brewster
Summary: Two-node cluster setup with latest pgsql resource agent. Postgresql starts initially, but failover never happens.

Details:

I'm trying to get a cluster set up with Postgresql 9.6 in a streaming replication using named slots scenario. I'm using the latest pgsql Resource Agent, which does appear to support the named replication slot feature, and I've pulled in the various utility functions the RA uses that weren't available in my base install, so the RA itself no longer gives me errors.

Setup: Two machines, centtest1 and centtest2. Both are running CentOS 6.8. Centtest1 has an IP of 10.211.55.100, and centtest2 has an IP of 10.211.55.101. The cluster is set up and functioning, with a shared virtual IP resource at 10.211.55.200. Postgresql has been set up and tested functioning properly on both nodes with centtest1 as the master and centtest2 as the streaming replica slave.

I then set up the postgresql master/slave resource using the following commands:

pcs resource create pgsql_96 pgsql \
pgctl="/usr/pgsql-9.6/bin/pg_ctl" \
logfile="/var/log/pgsql/test2.log" \
psql="/usr/pgsql-9.6/bin/psql" \
pgdata="/pgsql96/data" \
rep_mode="async" \
repuser="postgres" \
node_list="tcentest1.ravnalaska.net centtest2.ravnalaska.net" \
master_ip="10.211.55.200" \
archive_cleanup_command="" \
restart_on_promote="true" \
replication_slot_name="centtest_2_slot" \
monitor_user="postgres" \
monitor_password="SuperSecret" \
op start timeout="60s" interval="0s" on-fail="restart" \
op monitor timeout="60s" interval="4s" on-fail="restart" \
op monitor timeout="60s" interval="3s" on-fail="restart" role="Master" \
op promote timeout="60s" interval="0s" on-fail="restart" \
op demote timeout="60s" interval="0s" on-fail=stop \
op stop timeout="60s" interval="0s" on-fail="block" \
op notify timeout="60s" interval="0s";

pcs resource master msPostgresql pgsql_96 master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true

pcs constraint colocation add virtual_ip with Master msPostgresql INFINITY
pcs constraint order promote msPostgresql then start virtual_ip symmetrical=false score=INFINITY
pcs constraint order demote  msPostgresql then stop  virtual_ip symmetrical=false score=0

My preference would be that the master runs on centtest1, so I add the following constraint as well:

pcs constraint location --master msPostgresql prefers centtest1.ravnalaska.net=50

When I then start the cluster, I first see *both* machines come up as "slave", which I feel is somewhat odd, however the cluster software quickly figures things out and promotes centtest2 to master. I've tried this a dozen different times, and it *always* promotes centtest2 to master - even if I put INFINITY in for the location constraint.

But whatever- this is a cluster, it doesn't really matter which node things are running on, as long as they are running. So the cluster is working - postgresql starts, the master process is on the same node as the IP, you can connect, etc, everything looks good. Obviously the next thing to try is failover - should the master node fail, the slave node should be promoted to master. So I try testing this by shutting down the cluster on the primary server: "pcs cluster stop"
...and nothing happens. The master shuts down (uncleanly, I might add - it leaves behind a lock file that prevents it from starting again until I manually remove said lock file), but the slave is never promoted to master.

Neither pcs status nor crm_mon show any errors, but centtest1 never becomes master.

If instead of stopping the cluster on centtest2, I try to simply move the master using the command "pcs resource move --master msPostgresql", I first run into the aforementioned unclean shutdown issue (lock file left behind that has to be manually removed), and after removing the lock file, I wind up with *both* nodes being slaves, and no master node. "pcs resource clear --master msPostgresql" re-promotes centtest2 to master.

What it looks like is that for some reason pacemaker/corosync is absolutely refusing to ever make centtest1 a master - even when I explicitly tell it to, or when it is the only node left.

Looking at the messages log when I do the node shutdown test I see this:

Oct 13 08:29:39 CentTest1 crmd[30096]:   notice: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL origin=abort_transition_graph ]
Oct 13 08:29:39 CentTest1 pengine[30095]:   notice: On loss of CCM Quorum: Ignore
Oct 13 08:29:39 CentTest1 pengine[30095]:   notice: Stop    virtual_ip#011(centtest2.ravnalaska.net)
Oct 13 08:29:39 CentTest1 pengine[30095]:   notice: Demote  pgsql_96:0#011(Master -> Stopped centtest2.ravnalaska.net)
Oct 13 08:29:39 CentTest1 pengine[30095]:   notice: Calculated Transition 193: /var/lib/pacemaker/pengine/pe-input-500.bz2
Oct 13 08:29:39 CentTest1 crmd[30096]:   notice: Initiating action 43: notify pgsql_96_pre_notify_demote_0 on centtest2.ravnalaska.net
Oct 13 08:29:39 CentTest1 crmd[30096]:   notice: Initiating action 45: notify pgsql_96_pre_notify_demote_0 on 

Re: [ClusterLabs] Unexpected Resource movement after failover

2016-10-13 Thread Nikhil Utane
Andrei,

*"It would help if you told which node and which resources, so
your configuration could be interpreted in context. "*

Any resource can run on any node as long as it is not running any other
resource.

*"so "a not with b" does not imply "b not with a". So first pacemaker
decided where to place "b" and then it had to move "a" because it cannot
colocate with "b"."*

Hmm. I used to think "a not with b" means "b not with a" as well. Looks
like that's not the case. That should be it then.

Thanks for the quick answer, guys.

-Nikhil



On Thu, Oct 13, 2016 at 7:59 PM, Andrei Borzenkov 
wrote:

> On Thu, Oct 13, 2016 at 4:59 PM, Nikhil Utane
>  wrote:
> > Hi,
> >
> > I have 5 nodes and 4 resources configured.
> > I have configured constraint such that no two resources can be
> co-located.
> > I brought down a node (which happened to be DC). I was expecting the
> > resource on the failed node would be migrated to the 5th waiting node
> (that
> > is not running any resource).
> > However what happened was the failed node resource was started on another
> > active node (after stopping it's existing resource) and that node's
> resource
> > was moved to the waiting node.
> >
> > What could I be doing wrong?
> >
>
> It would help if you told which node and which resources, so your
> configuration could be interpreted in context. But I guess Ulrich is
> correct - your constraints are asymmetrical (I assume, I am not
> familiar with PCS), so "a not with b" does not imply "b not with a".
> So first pacemaker decided where to place "b" and then it had to move
> "a" because it cannot colocate with "b".
>
> >  > name="have-watchdog"/>
> >  > name="dc-version"/>
> >  value="corosync"
> > name="cluster-infrastructure"/>
> >  > name="stonith-enabled"/>
> >  > name="no-quorum-policy"/>
> >  > name="default-action-timeout"/>
> >  > name="symmetric-cluster"/>
> >
> > # pcs constraint
> > Location Constraints:
> >   Resource: cu_2
> > Enabled on: Redun_CU4_Wb30 (score:0)
> > Enabled on: Redund_CU2_WB30 (score:0)
> > Enabled on: Redund_CU3_WB30 (score:0)
> > Enabled on: Redund_CU5_WB30 (score:0)
> > Enabled on: Redund_CU1_WB30 (score:0)
> >   Resource: cu_3
> > Enabled on: Redun_CU4_Wb30 (score:0)
> > Enabled on: Redund_CU2_WB30 (score:0)
> > Enabled on: Redund_CU3_WB30 (score:0)
> > Enabled on: Redund_CU5_WB30 (score:0)
> > Enabled on: Redund_CU1_WB30 (score:0)
> >   Resource: cu_4
> > Enabled on: Redun_CU4_Wb30 (score:0)
> > Enabled on: Redund_CU2_WB30 (score:0)
> > Enabled on: Redund_CU3_WB30 (score:0)
> > Enabled on: Redund_CU5_WB30 (score:0)
> > Enabled on: Redund_CU1_WB30 (score:0)
> >   Resource: cu_5
> > Enabled on: Redun_CU4_Wb30 (score:0)
> > Enabled on: Redund_CU2_WB30 (score:0)
> > Enabled on: Redund_CU3_WB30 (score:0)
> > Enabled on: Redund_CU5_WB30 (score:0)
> > Enabled on: Redund_CU1_WB30 (score:0)
> > Ordering Constraints:
> > Colocation Constraints:
> >   cu_3 with cu_2 (score:-INFINITY)
> >   cu_4 with cu_2 (score:-INFINITY)
> >   cu_4 with cu_3 (score:-INFINITY)
> >   cu_5 with cu_2 (score:-INFINITY)
> >   cu_5 with cu_3 (score:-INFINITY)
> >   cu_5 with cu_4 (score:-INFINITY)
> >
> > -Thanks
> > Nikhil
> >
> > ___
> > Users mailing list: Users@clusterlabs.org
> > http://clusterlabs.org/mailman/listinfo/users
> >
> > Project Home: http://www.clusterlabs.org
> > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> > Bugs: http://bugs.clusterlabs.org
> >
>
> ___
> Users mailing list: Users@clusterlabs.org
> http://clusterlabs.org/mailman/listinfo/users
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
>


Re: [ClusterLabs] Antw: Re: Antw: Re: OCFS2 on cLVM with node waiting for fencing timeout

2016-10-13 Thread Ken Gaillot
On 10/13/2016 03:36 AM, Ulrich Windl wrote:
> That's what I'm talking about: If 1 of 3 nodes is rebooting (or the cluster 
> is split-brain 1:2), the single node CANNOT continue due to lack of quorum, 
> while the remaining two nodes can. Is it still necessary to wait for 
> completion of stonith?

If the 2 nodes have working communication with the 1 node, then the 1
node will leave the cluster in an orderly way, and fencing will not be
involved. In that case, yes, quorum is used to prevent the 1 node from
starting services until it rejoins the cluster.

However, if the 2 nodes lose communication with the 1 node, they cannot
be sure it is functioning well enough to respect quorum. In this case,
they have to fence it. DLM has to wait for the fencing to succeed to be
sure the 1 node is not messing with shared resources.




Re: [ClusterLabs] Antw: Unexpected Resource movement after failover

2016-10-13 Thread Nikhil Utane
Ulrich,

I have only 4 resources (not 5; it's the nodes that are 5). So then I only
need 6 constraints, right?

     [,1] [,2] [,3] [,4] [,5] [,6]
[1,] "A"  "A"  "A"  "B"  "B"  "C"
[2,] "B"  "C"  "D"  "C"  "D"  "D"

I understand that if I configure a constraint of R1 with R2 with score
-INFINITY, then the same applies for R2 with R1 (I don't have to configure it
explicitly).
I am not having a problem of multiple resources getting scheduled on the
same node. Rather, one working resource is unnecessarily getting relocated.

-Thanks
Nikhil


On Thu, Oct 13, 2016 at 7:45 PM, Ulrich Windl <
ulrich.wi...@rz.uni-regensburg.de> wrote:

> Hi!
>
> Don't you need 10 constraints, excluding every possible pair of your 5
> resources (named A-E here), like in this table (produced with R):
>
>  [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
> [1,] "A"  "A"  "A"  "A"  "B"  "B"  "B"  "C"  "C"  "D"
> [2,] "B"  "C"  "D"  "E"  "C"  "D"  "E"  "D"  "E"  "E"
>
> Ulrich
>
> >>> Nikhil Utane  schrieb am 13.10.2016 um
> 15:59 in
> Nachricht
> :
> > Hi,
> >
> > I have 5 nodes and 4 resources configured.
> > I have configured constraint such that no two resources can be
> co-located.
> > I brought down a node (which happened to be DC). I was expecting the
> > resource on the failed node would be migrated to the 5th waiting node
> (that
> > is not running any resource).
> > However what happened was the failed node resource was started on another
> > active node (after stopping it's existing resource) and that node's
> > resource was moved to the waiting node.
> >
> > What could I be doing wrong?
> >
> >  > name="have-watchdog"/>
> >  > name="dc-version"/>
> >  value="corosync"
> > name="cluster-infrastructure"/>
> >  > name="stonith-enabled"/>
> >  > name="no-quorum-policy"/>
> >  > name="default-action-timeout"/>
> >  > name="symmetric-cluster"/>
> >
> > # pcs constraint
> > Location Constraints:
> >   Resource: cu_2
> > Enabled on: Redun_CU4_Wb30 (score:0)
> > Enabled on: Redund_CU2_WB30 (score:0)
> > Enabled on: Redund_CU3_WB30 (score:0)
> > Enabled on: Redund_CU5_WB30 (score:0)
> > Enabled on: Redund_CU1_WB30 (score:0)
> >   Resource: cu_3
> > Enabled on: Redun_CU4_Wb30 (score:0)
> > Enabled on: Redund_CU2_WB30 (score:0)
> > Enabled on: Redund_CU3_WB30 (score:0)
> > Enabled on: Redund_CU5_WB30 (score:0)
> > Enabled on: Redund_CU1_WB30 (score:0)
> >   Resource: cu_4
> > Enabled on: Redun_CU4_Wb30 (score:0)
> > Enabled on: Redund_CU2_WB30 (score:0)
> > Enabled on: Redund_CU3_WB30 (score:0)
> > Enabled on: Redund_CU5_WB30 (score:0)
> > Enabled on: Redund_CU1_WB30 (score:0)
> >   Resource: cu_5
> > Enabled on: Redun_CU4_Wb30 (score:0)
> > Enabled on: Redund_CU2_WB30 (score:0)
> > Enabled on: Redund_CU3_WB30 (score:0)
> > Enabled on: Redund_CU5_WB30 (score:0)
> > Enabled on: Redund_CU1_WB30 (score:0)
> > Ordering Constraints:
> > Colocation Constraints:
> >   cu_3 with cu_2 (score:-INFINITY)
> >   cu_4 with cu_2 (score:-INFINITY)
> >   cu_4 with cu_3 (score:-INFINITY)
> >   cu_5 with cu_2 (score:-INFINITY)
> >   cu_5 with cu_3 (score:-INFINITY)
> >   cu_5 with cu_4 (score:-INFINITY)
> >
> > -Thanks
> > Nikhil
>
>
>
>
>
> ___
> Users mailing list: Users@clusterlabs.org
> http://clusterlabs.org/mailman/listinfo/users
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
>


Re: [ClusterLabs] Unexpected Resource movement after failover

2016-10-13 Thread Andrei Borzenkov
On Thu, Oct 13, 2016 at 4:59 PM, Nikhil Utane
 wrote:
> Hi,
>
> I have 5 nodes and 4 resources configured.
> I have configured constraint such that no two resources can be co-located.
> I brought down a node (which happened to be DC). I was expecting the
> resource on the failed node would be migrated to the 5th waiting node (that
> is not running any resource).
> However what happened was the failed node resource was started on another
> active node (after stopping it's existing resource) and that node's resource
> was moved to the waiting node.
>
> What could I be doing wrong?
>

It would help if you told which node and which resources, so your
configuration could be interpreted in context. But I guess Ulrich is
correct - your constraints are asymmetrical (I assume, I am not
familiar with PCS), so "a not with b" does not imply "b not with a".
So first pacemaker decided where to place "b" and then it had to move
"a" because it cannot colocate with "b".

>  name="have-watchdog"/>
>  name="dc-version"/>
>  name="cluster-infrastructure"/>
>  name="stonith-enabled"/>
>  name="no-quorum-policy"/>
>  name="default-action-timeout"/>
>  name="symmetric-cluster"/>
>
> # pcs constraint
> Location Constraints:
>   Resource: cu_2
> Enabled on: Redun_CU4_Wb30 (score:0)
> Enabled on: Redund_CU2_WB30 (score:0)
> Enabled on: Redund_CU3_WB30 (score:0)
> Enabled on: Redund_CU5_WB30 (score:0)
> Enabled on: Redund_CU1_WB30 (score:0)
>   Resource: cu_3
> Enabled on: Redun_CU4_Wb30 (score:0)
> Enabled on: Redund_CU2_WB30 (score:0)
> Enabled on: Redund_CU3_WB30 (score:0)
> Enabled on: Redund_CU5_WB30 (score:0)
> Enabled on: Redund_CU1_WB30 (score:0)
>   Resource: cu_4
> Enabled on: Redun_CU4_Wb30 (score:0)
> Enabled on: Redund_CU2_WB30 (score:0)
> Enabled on: Redund_CU3_WB30 (score:0)
> Enabled on: Redund_CU5_WB30 (score:0)
> Enabled on: Redund_CU1_WB30 (score:0)
>   Resource: cu_5
> Enabled on: Redun_CU4_Wb30 (score:0)
> Enabled on: Redund_CU2_WB30 (score:0)
> Enabled on: Redund_CU3_WB30 (score:0)
> Enabled on: Redund_CU5_WB30 (score:0)
> Enabled on: Redund_CU1_WB30 (score:0)
> Ordering Constraints:
> Colocation Constraints:
>   cu_3 with cu_2 (score:-INFINITY)
>   cu_4 with cu_2 (score:-INFINITY)
>   cu_4 with cu_3 (score:-INFINITY)
>   cu_5 with cu_2 (score:-INFINITY)
>   cu_5 with cu_3 (score:-INFINITY)
>   cu_5 with cu_4 (score:-INFINITY)
>
> -Thanks
> Nikhil
>
> ___
> Users mailing list: Users@clusterlabs.org
> http://clusterlabs.org/mailman/listinfo/users
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
>



Re: [ClusterLabs] Unexpected Resource movement after failover

2016-10-13 Thread Nikhil Utane
Additional info,




-Nikhil

On Thu, Oct 13, 2016 at 7:29 PM, Nikhil Utane 
wrote:

> Hi,
>
> I have 5 nodes and 4 resources configured.
> I have configured constraint such that no two resources can be co-located.
> I brought down a node (which happened to be DC). I was expecting the
> resource on the failed node would be migrated to the 5th waiting node (that
> is not running any resource).
> However what happened was the failed node resource was started on another
> active node (after stopping it's existing resource) and that node's
> resource was moved to the waiting node.
>
> What could I be doing wrong?
>
>  name="have-watchdog"/>
>  name="dc-version"/>
>  value="corosync" name="cluster-infrastructure"/>
>  name="stonith-enabled"/>
>  name="no-quorum-policy"/>
>  name="default-action-timeout"/>
>  name="symmetric-cluster"/>
>
> # pcs constraint
> Location Constraints:
>   Resource: cu_2
> Enabled on: Redun_CU4_Wb30 (score:0)
> Enabled on: Redund_CU2_WB30 (score:0)
> Enabled on: Redund_CU3_WB30 (score:0)
> Enabled on: Redund_CU5_WB30 (score:0)
> Enabled on: Redund_CU1_WB30 (score:0)
>   Resource: cu_3
> Enabled on: Redun_CU4_Wb30 (score:0)
> Enabled on: Redund_CU2_WB30 (score:0)
> Enabled on: Redund_CU3_WB30 (score:0)
> Enabled on: Redund_CU5_WB30 (score:0)
> Enabled on: Redund_CU1_WB30 (score:0)
>   Resource: cu_4
> Enabled on: Redun_CU4_Wb30 (score:0)
> Enabled on: Redund_CU2_WB30 (score:0)
> Enabled on: Redund_CU3_WB30 (score:0)
> Enabled on: Redund_CU5_WB30 (score:0)
> Enabled on: Redund_CU1_WB30 (score:0)
>   Resource: cu_5
> Enabled on: Redun_CU4_Wb30 (score:0)
> Enabled on: Redund_CU2_WB30 (score:0)
> Enabled on: Redund_CU3_WB30 (score:0)
> Enabled on: Redund_CU5_WB30 (score:0)
> Enabled on: Redund_CU1_WB30 (score:0)
> Ordering Constraints:
> Colocation Constraints:
>   cu_3 with cu_2 (score:-INFINITY)
>   cu_4 with cu_2 (score:-INFINITY)
>   cu_4 with cu_3 (score:-INFINITY)
>   cu_5 with cu_2 (score:-INFINITY)
>   cu_5 with cu_3 (score:-INFINITY)
>   cu_5 with cu_4 (score:-INFINITY)
>
> -Thanks
> Nikhil
>


[ClusterLabs] Unexpected Resource movement after failover

2016-10-13 Thread Nikhil Utane
Hi,

I have 5 nodes and 4 resources configured.
I have configured constraints such that no two resources can be co-located.
I brought down a node (which happened to be the DC). I was expecting the
resource on the failed node to be migrated to the 5th waiting node (that
is not running any resource).
However, what happened was that the failed node's resource was started on
another active node (after stopping its existing resource), and that node's
resource was moved to the waiting node.

What could I be doing wrong?









# pcs constraint
Location Constraints:
  Resource: cu_2
Enabled on: Redun_CU4_Wb30 (score:0)
Enabled on: Redund_CU2_WB30 (score:0)
Enabled on: Redund_CU3_WB30 (score:0)
Enabled on: Redund_CU5_WB30 (score:0)
Enabled on: Redund_CU1_WB30 (score:0)
  Resource: cu_3
Enabled on: Redun_CU4_Wb30 (score:0)
Enabled on: Redund_CU2_WB30 (score:0)
Enabled on: Redund_CU3_WB30 (score:0)
Enabled on: Redund_CU5_WB30 (score:0)
Enabled on: Redund_CU1_WB30 (score:0)
  Resource: cu_4
Enabled on: Redun_CU4_Wb30 (score:0)
Enabled on: Redund_CU2_WB30 (score:0)
Enabled on: Redund_CU3_WB30 (score:0)
Enabled on: Redund_CU5_WB30 (score:0)
Enabled on: Redund_CU1_WB30 (score:0)
  Resource: cu_5
Enabled on: Redun_CU4_Wb30 (score:0)
Enabled on: Redund_CU2_WB30 (score:0)
Enabled on: Redund_CU3_WB30 (score:0)
Enabled on: Redund_CU5_WB30 (score:0)
Enabled on: Redund_CU1_WB30 (score:0)
Ordering Constraints:
Colocation Constraints:
  cu_3 with cu_2 (score:-INFINITY)
  cu_4 with cu_2 (score:-INFINITY)
  cu_4 with cu_3 (score:-INFINITY)
  cu_5 with cu_2 (score:-INFINITY)
  cu_5 with cu_3 (score:-INFINITY)
  cu_5 with cu_4 (score:-INFINITY)

-Thanks
Nikhil


Re: [ClusterLabs] Antw: Re: Antw: Re: OCFS2 on cLVM with node waiting for fencing timeout

2016-10-13 Thread Eric Ren

Hi,

On 10/13/2016 04:36 PM, Ulrich Windl wrote:

Eric Ren  schrieb am 13.10.2016 um 09:48 in Nachricht

<73f764d0-75e7-122f-ff4e-d0b27dbdd...@suse.com>:
[...]

>>> When assuming node h01 still lived when communication failed, wouldn't
>>> quorum prevent h01 from doing anything with DLM and OCFS2 anyway?
>> Not sure I understand you correctly. By default, losing quorum will make
>> DLM stop service.
> That's what I'm talking about: If 1 of 3 nodes is rebooting (or the cluster is
> split-brain 1:2), the single node CANNOT continue due to lack of quorum, while
> the remaining two nodes can. Is it still necessary to wait for completion of
> stonith?

Quorum and fencing completion are different conditions to be checked before
starting to provide service again. FYI:

https://github.com/renzhengeek/libdlm/blob/master/dlm_controld/cpg.c#L603

>> See `man dlm_controld`:
>> ```
>> --enable_quorum_lockspace 0|1
>>  enable/disable quorum requirement for lockspace operations
>> ```
> Does not exist in SLES11 SP4...

Well, I think it's better to keep the default behavior. Otherwise, it's
dangerous when split-brain happens.
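(Where a newer dlm_controld does offer that knob, it is set in its config; a
sketch assuming a dlm 4.x style /etc/dlm/dlm.conf, which, as noted above, does
not apply to SLES11 SP4:)

    # /etc/dlm/dlm.conf (hypothetical); 0 drops the quorum requirement for
    # lockspace operations, which is exactly the split-brain risk mentioned above
    enable_quorum_lockspace=0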


Eric


Ulrich





[ClusterLabs] Antw: Re: Antw: Re: OCFS2 on cLVM with node waiting for fencing timeout

2016-10-13 Thread Ulrich Windl
>>> Eric Ren  schrieb am 13.10.2016 um 09:48 in Nachricht
<73f764d0-75e7-122f-ff4e-d0b27dbdd...@suse.com>:
[...]
>> When assuming node h01 still lived when communication failed, wouldn't 
> quorum prevent h01 from doing anything with DLM and OCFS2 anyway?
> Not sure I understand you correctly. By default, loosing quorum will make 
> DLM stop service. 

That's what I'm talking about: If 1 of 3 nodes is rebooting (or the cluster is 
split-brain 1:2), the single node CANNOT continue due to lack of quorum, while 
the remaining two nodes can. Is it still necessary to wait for completion of 
stonith?

> See `man dlm_controld`:
> ```
> --enable_quorum_lockspace 0|1
> enable/disable quorum requirement for lockspace operations
> ```

Does not exist in SLES11 SP4...

Ulrich





Re: [ClusterLabs] Antw: Re: OCFS2 on cLVM with node waiting for fencing timeout

2016-10-13 Thread emmanuel segura
If you want to reduce the multipath switching time when one controller goes
down, see:
https://www.redhat.com/archives/dm-devel/2009-April/msg00266.html
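(The usual knobs for that are the multipath/SCSI timeouts; a sketch only, with
hypothetical values; the linked thread and the array vendor's recommendations
take precedence:)

    # /etc/multipath.conf fragment (hypothetical values, tune for your SAN)
    defaults {
        fast_io_fail_tmo  5     # fail I/O on a lost path quickly instead of hanging
        dev_loss_tmo      30    # how long before the SCSI device itself is removed
        no_path_retry     12    # polling intervals to keep queueing when all paths are down
    }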

2016-10-13 10:27 GMT+02:00 Ulrich Windl :
 Eric Ren  schrieb am 13.10.2016 um 09:31 in Nachricht
> :
>> Hi,
>>
>> On 10/10/2016 10:46 PM, Ulrich Windl wrote:
>>> Hi!
>>>
>>> I observed an interesting thing: In a three node cluster (SLES11 SP4) with
>> cLVM and OCFS2 on top, one node was fenced as the OCFS2 filesystem was
>> somehow busy on unmount. We have (for paranoid reasons mainly) an excessive
>> long fencing timout for SBD: 180 seconds
>>>
>>> While one node was actually reset immediately (the cluster was still waiting
>> for the fencing to "complete" through timeout), the other nodes seemed to
>> freeze the filesystem. Thus I observed a read delay > 140 seconds on one 
>> node,
>> the other was also close to 140 seconds.
>> ocfs2 and cLVM are both depending on DLM. DLM deamon will notify them to
>> stop service (which
>> means any cluster locking
>> request would be blocked) during the fencing process.
>>
>> So I'm wondering why it takes so long to finish the fencing process?
>
> As I wrote: Using SBD this is paranoia (as fencing doesn't report back a 
> status like "completed" or "failed". Actually the fencing only needs a few 
> seconds, but the timeout is 3 minutes. Only then the cluster believes that 
> the node is down now (our servers boot so slowly that they are not up within 
> three minutes, also). Why three minutes? Writing to a SCSI disk may be 
> retried up to one minute, and reading may also be retried for a minute. So 
> for a bad SBD disk (or some strange transport problem) it could take two 
> minutes until the receiving SBD gets the fencing command. If the timeout is 
> too low, resources could be restarted before the node was actually fenced, 
> causing data corruption.
>
> Ulrich
> P.S: One common case where our SAN disks seem slow is "Online" firmware 
> update where a controller may be down 20 to 30 seconds. Multipathing is 
> expected to switch to another controller within a few seconds. However the 
> commands to test the disk in multipath are also SCSI commands that may hang 
> for a while...
>
>>
>> Eric
>>>
>>> This was not expected for a cluster filesystem (by me).
>>>
>>> I wonder: Is that expected bahavior?
>>>
>>> Regards,
>>> Ulrich
>>>
>>>
>>>
>>> ___
>>> Users mailing list: Users@clusterlabs.org
>>> http://clusterlabs.org/mailman/listinfo/users
>>>
>>> Project Home: http://www.clusterlabs.org
>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>> Bugs: http://bugs.clusterlabs.org
>>>
>>
>>
>> ___
>> Users mailing list: Users@clusterlabs.org
>> http://clusterlabs.org/mailman/listinfo/users
>>
>> Project Home: http://www.clusterlabs.org
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs: http://bugs.clusterlabs.org
>
>
>
>
>
> ___
> Users mailing list: Users@clusterlabs.org
> http://clusterlabs.org/mailman/listinfo/users
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org



-- 
  .~.
  /V\
 //  \\
/(   )\
^`~'^



Re: [ClusterLabs] Antw: Re: OCFS2 on cLVM with node waiting for fencing timeout

2016-10-13 Thread Eric Ren

Hi,

On 10/11/2016 02:18 PM, Ulrich Windl wrote:

{ emmanuel segura  schrieb am 10.10.2016 um 16:49 in

Nachricht

Re: [ClusterLabs] OCFS2 on cLVM with node waiting for fencing timeout

2016-10-13 Thread Eric Ren

Hi,

On 10/10/2016 10:46 PM, Ulrich Windl wrote:

> Hi!
>
> I observed an interesting thing: In a three node cluster (SLES11 SP4) with cLVM
> and OCFS2 on top, one node was fenced as the OCFS2 filesystem was somehow busy
> on unmount. We have (for paranoid reasons mainly) an excessively long fencing
> timeout for SBD: 180 seconds
>
> While one node was actually reset immediately (the cluster was still waiting
> for the fencing to "complete" through timeout), the other nodes seemed to
> freeze the filesystem. Thus I observed a read delay > 140 seconds on one node,
> the other was also close to 140 seconds.

ocfs2 and cLVM both depend on DLM. The DLM daemon will notify them to stop
service (which means any cluster locking request would be blocked) during the
fencing process.

So I'm wondering why it takes so long to finish the fencing process?

Eric

> This was not expected for a cluster filesystem (by me).
>
> I wonder: Is that expected behavior?
>
> Regards,
> Ulrich





Re: [ClusterLabs] cross DC cluster using public ip?

2016-10-13 Thread Jan Friesse

neeraj ch napsal(a):

Hello ,

We are testing out corosync and pacemaker for DB high availability on the
cloud. I was able to set up a cluster with in a DC using corosync 1.4 and
pacemaker 1.12. It works great and I wanted to try a cross DC cluster. I
was using unicast as multicast was disabled by default.

I was not sure how Corosync behaves with public IP's but I still went ahead
and tried it with both public IP's as well as DNS names. These DNS names
resolve as local IP when the other node is with in the same subnet.


Every node has to be able to see every other node. So mixing of public 
and private IPs is not going to work (with the exception of the special case 
where all private IPs are in the same network). Also keep in mind the config 
file has to be the same on all nodes.





while I was using public IP's both the node inside the same subnet as well
as outside were unable to connect, except for itself. While using DNS names
the membership information showed the nodes within same subnet being
connected to while the nodes outside were not connected


This is somewhat expected.



My corosync config is as follows.

totem {
   version: 2
   secauth: off
   threads: 0
   interface {

member {
   memberaddr: 
}
   member {
   memberaddr: 
}
member {
   memberaddr: 
}
ringnumber: 0
bindnetaddr: 172.31.0.0
mcastport: 5405
ttl: 1
   }
   transport: udpu
}

logging {
   fileline: off
   to_stderr: no
   to_logfile: yes
   to_syslog: yes
   logfile: /var/log/cluster/corosync.log
   debug: on
   timestamp: on
   logger_subsys {
subsys: AMF
debug: on
   }
}

service {
 # Load the Pacemaker Cluster Resource Manager
 name: pacemaker
 ver: 1
}

amf {
   mode: disabled
}


I am checking membership information by using corosync-objctl. I have also
tried using public ip as the bind address , that makes the membership from


Just to make sure: this "public" IP is really the IP of the given machine?


1 to 0 as it doesn't add itself.

If any one has any suggestion / advice on how to debug or what I am doing
wrong . Any help would be very appreciated.

Thank you


