[ClusterLabs] Mysql slave did not start replication after failure, and read-only IP also remained active on the much outdated slave

2016-08-22 Thread Attila Megyeri
Dear community, A few days ago we had an issue in our MySQL M/S replication cluster. We have a setup with one R/W master and one RO slave. The RO VIP is supposed to run on the slave as long as it is not too far behind the master, and if any error occurs, the RO VIP is moved to the master. Something happe

Re: [ClusterLabs] Mysql slave did not start replication after failure, and read-only IP also remained active on the much outdated slave

2016-08-22 Thread Andrei Borzenkov
On Mon, Aug 22, 2016 at 12:18 PM, Attila Megyeri wrote: > Dear community, > > > > A few days ago we had an issue in our Mysql M/S replication cluster. > > We have a one R/W Master, and a one RO Slave setup. RO VIP is supposed to be > running on the slave if it is not too much behind the master, an

Re: [ClusterLabs] Mysql slave did not start replication after failure, and read-only IP also remained active on the much outdated slave

2016-08-22 Thread Attila Megyeri
Hi Andrei, I waited several hours, and nothing happened. I assume that the RA does not treat this case properly. Mysql was running, but the "show slave status" command returned something that the RA was not prepared to parse, and instead of reporting a non-readable attribute, it returned some

Re: [ClusterLabs] Entire Group stop on stopping of single Resource

2016-08-22 Thread Jan Pokorný
On 19/08/16 23:09 +0530, jaspal singla wrote: > I have a resource group (ctm_service) comprised of various resources. Now > the requirement is that when one of its resources stops for some time (10-20 > seconds), I want the entire group to be stopped. Note that if the resource is stopped _just_ for this perio

Re: [ClusterLabs] Mysql slave did not start replication after failure, and read-only IP also remained active on the much outdated slave

2016-08-22 Thread Ken Gaillot
On 08/22/2016 07:24 AM, Attila Megyeri wrote: > Hi Andrei, > > I waited several hours, and nothing happened. And actually, we can see from the configuration you provided that cluster-recheck-interval is 2 minutes. I don't see anything about stonith; is it enabled and tested? This looks like a s

[ClusterLabs] pacemakerd quits after few seconds with some errors

2016-08-22 Thread Gabriele Bulfon
Hi, I built corosync/pacemaker for our XStreamOS/illumos : corosync starts fine and logs correctly, pacemakerd quits after some seconds with the attached log. Any idea where the issue is? Thanks, Gabriele Sonic

Re: [ClusterLabs] Mysql slave did not start replication after failure, and read-only IP also remained active on the much outdated slave

2016-08-22 Thread Attila Megyeri
Hi Ken, Thanks a lot for your feedback, my answers are inline. > -Original Message- > From: Ken Gaillot [mailto:kgail...@redhat.com] > Sent: Monday, August 22, 2016 4:12 PM > To: users@clusterlabs.org > Subject: Re: [ClusterLabs] Mysql slave did not start replication after > failure, >

Re: [ClusterLabs] pacemakerd quits after few seconds with some errors

2016-08-22 Thread Ken Gaillot
On 08/22/2016 12:17 PM, Gabriele Bulfon wrote: > Hi, > > I built corosync/pacemaker for our XStreamOS/illumos : corosync starts > fine and logs correctly, pacemakerd quits after some seconds with the > attached log. > Any idea where the issue is? Pacemaker is not able to communicate with corosync

[ClusterLabs] Which cluster HA package to choose

2016-08-22 Thread Ron Gilad
Hi, I have come across several cluster High Availability packages: - Pacemaker + Corosync - Red Hat Enterprise Linux Cluster Which package do you think is the best to choose? Do you know if the latest version is stable? And which companies are using it? Thanks in advance, Ron

Re: [ClusterLabs] pacemakerd quits after few seconds with some errors

2016-08-22 Thread Klaus Wenninger
On 08/23/2016 12:20 AM, Ken Gaillot wrote: > On 08/22/2016 12:17 PM, Gabriele Bulfon wrote: >> Hi, >> >> I built corosync/pacemaker for our XStreamOS/illumos : corosync starts >> fine and logs correctly, pacemakerd quits after some seconds with the >> attached log. >> Any idea where the issue is? >

Re: [ClusterLabs] Which cluster HA package to choose

2016-08-22 Thread Digimer
On 22/08/16 11:40 AM, Ron Gilad wrote: > Hi, > > I have come across several cluster High Availability packages: > > - Pacemaker + Corosync > > - Red Hat Enterprise Linux Cluster > > Which package do you think is the best to choose? > > Do you know if the latest version is st

[ClusterLabs] corosync.log is 5.1GB in a short period

2016-08-22 Thread 朱荣
Hello: I have a problem with the corosync log: it grew to 5.1GB in a short time. When I checked the log, it showed the same message repeated within a short period, as in the attachment. What happened to corosync? Thank you! My corosync and pacemaker versions are: corosync-2.3.4-7.el7.x86_64 pac

Re: [ClusterLabs] pacemakerd quits after few seconds with some errors

2016-08-22 Thread Gabriele Bulfon
Thanks! I am using Corosync 2.3.6 and Pacemaker 1.1.4 with the "--with-corosync" option. How does Corosync determine its own version? Sonicle S.r.l. : http://www.sonicle.com Music: http://www.gabrielebulfon.com Quan

Re: [ClusterLabs] pacemakerd quits after few seconds with some errors

2016-08-22 Thread Jan Pokorný
On 23/08/16 07:23 +0200, Gabriele Bulfon wrote: > Thanks! I am using Corosync 2.3.6 and Pacemaker 1.1.4 with the > "--with-corosync" option. > How does Corosync determine its own version? The situation may be as simple as building corosync from a GitHub-provided automatic tarball, which is never a good ide

Re: [ClusterLabs] pacemakerd quits after few seconds with some errors

2016-08-22 Thread Gabriele Bulfon
Thanks! I found it myself before reading this :) using that url I got the correct tar.gz and PACKAGE_VERSION is fine ;) Going on now hoping it's going to work :) Gabriele Sonicle S.r.l. : http://www.sonicle.co

Re: [ClusterLabs] corosync.log is 5.1GB in a short period

2016-08-22 Thread Kristoffer Grönlund
朱荣 writes: > Hello: > I have a problem with the corosync log: it grew to 5.1GB in a > short time. > When I checked the log, it showed the same message repeated within a short > period, as in the attachment. > What happened to corosync? Thank you! > My corosync and pacemaker versions are: coros

Re: [ClusterLabs] corosync.log is 5.1GB in a short period

2016-08-22 Thread Klaus Wenninger
On 08/23/2016 08:31 AM, Kristoffer Grönlund wrote: > 朱荣 writes: > >> Hello: >> I have a problem with the corosync log: it grew to 5.1GB in >> a short time. >> When I checked the log, it showed the same message repeated within a short >> period, as in the attachment. >> What happened abo

Re: [ClusterLabs] pacemakerd quits after few seconds with some errors

2016-08-22 Thread Gabriele Bulfon
Ok, it looks like Corosync now runs fine with its version, but then pacemakerd fails again with new errors on attrd and the other daemons it tries to fork. The main cause seems to be related to the HA signon and the cluster process group API. Any idea? Gabriele