Re: [Pacemaker] [Problem]Number of times control of the fail-count is late.

2010-11-12 Thread renayama19661014
Hi Andrew,

Thank you for comment.

> >
> > It seems to be a problem that the update of the fail-count was late.
> > But this problem seems to occur only with certain timing.
> >
> > When the fail-count is counted incorrectly, the failover timing of the
> > resource is affected.
> >
> > Is this problem already discussed?
> 
> Not that I know of
> 
> > Isn't the delay in the update of the fail-count, which goes by way of
> > attrd, a problem?
> 
> Indeed.

All right.


> 
> >
> >  * I attach log and some pe-files at Bugzilla.
> >  * http://developerbugs.linux-foundation.org/show_bug.cgi?id=2520
> 
> Ok, I'll follow up there.

Thanks.

If there is any other information you need, please send me an email.

Best Regards,
Hideo Yamauchi.


___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker


Re: [Pacemaker] symmetric anti-collocation

2010-11-12 Thread Alan Jones
I've looked into the code more and added more logging, etc.
The pengine essentially walks the list of constraints, applying
weights, and then walks the list of resources and tallies the weights.
In my example, it ends up walking the resources backward, i.e. it
assigns a node to Y and then assigns a node to X.
Unfortunately, at the time of assigning a node to Y, X has no assigned
node and so the colocation rule cannot be applied.
What is needed is a backtracking method from the computer science area
of "constraint satisfaction".
Alan



[Pacemaker] understanding scores

2010-11-12 Thread Pavlos Parissis
Hi,

I am trying to understand how the scores are calculated based on the
output of ptest -sL, and I have a few questions.
Below are my scores with a line-number column; at the bottom you will
find my configuration.

So, let's start

1 group_color: pbx_service_01 allocation score on node-01: 200
 2 group_color: pbx_service_01 allocation score on node-03: 10
 3 group_color: ip_01 allocation score on node-01: 1200
 4 group_color: ip_01 allocation score on node-03: 10
so far so good: ip_01 has 1200, i.e. 1000 due to resource-stickiness="1000"
plus 200 from the group location constraint

 5 group_color: fs_01 allocation score on node-01: 1000
 6 group_color: fs_01 allocation score on node-03: 0
 7 group_color: pbx_01 allocation score on node-01: 1000
 8 group_color: pbx_01 allocation score on node-03: 0
 9 group_color: sshd_01 allocation score on node-01: 1000
 10 group_color: sshd_01 allocation score on node-03: 0
 11 group_color: mailAlert-01 allocation score on node-01: 1000
 12 group_color: mailAlert-01 allocation score on node-03: 0
hold on now, why do all the above resources have 1000 on node-01 (like
fs_01) and not 1200?

 13 native_color: ip_01 allocation score on node-01: 5200
5 resources x 1000 from resource-stickiness="1000" plus the 200, right?
what is the difference between native_color and group_color?

 14 native_color: ip_01 allocation score on node-03: 10
 15 clone_color: ms-drbd_01 allocation score on node-01: 4100
why 4100?

 16 clone_color: ms-drbd_01 allocation score on node-03: -100
I guess this comes from the colocation constraint

 17 clone_color: drbd_01:0 allocation score on node-01: 11100
I am lost now, so I will stop here :-)

 18 clone_color: drbd_01:0 allocation score on node-03: 0
 19 clone_color: drbd_01:1 allocation score on node-01: 100
 20 clone_color: drbd_01:1 allocation score on node-03: 11000
 21 native_color: drbd_01:0 allocation score on node-01: 11100
 22 native_color: drbd_01:0 allocation score on node-03: 0
 23 native_color: drbd_01:1 allocation score on node-01: -100
 24 native_color: drbd_01:1 allocation score on node-03: 11000
 25 drbd_01:0 promotion score on node-01: 18100
 26 drbd_01:1 promotion score on node-03: -100
 27 native_color: fs_01 allocation score on node-01: 15100
 28 native_color: fs_01 allocation score on node-03: -100
 29 native_color: pbx_01 allocation score on node-01: 3000
 30 native_color: pbx_01 allocation score on node-03: -100
 31 native_color: sshd_01 allocation score on node-01: 2000
 32 native_color: sshd_01 allocation score on node-03: -100
 33 native_color: mailAlert-01 allocation score on node-01: 1000
 34 native_color: mailAlert-01 allocation score on node-03: -100
 35 group_color: pbx_service_02 allocation score on node-02: 200
 36 group_color: pbx_service_02 allocation score on node-03: 10
 37 group_color: ip_02 allocation score on node-02: 1200
 38 group_color: ip_02 allocation score on node-03: 10
 39 group_color: fs_02 allocation score on node-02: 1000
 40 group_color: fs_02 allocation score on node-03: 0
 41 group_color: pbx_02 allocation score on node-02: 1000
 42 group_color: pbx_02 allocation score on node-03: 0
 43 group_color: sshd_02 allocation score on node-02: 1000
 44 group_color: sshd_02 allocation score on node-03: 0
 45 group_color: mailAlert-02 allocation score on node-02: 1000
 46 group_color: mailAlert-02 allocation score on node-03: 0
 47 native_color: ip_02 allocation score on node-02: 5200
 48 native_color: ip_02 allocation score on node-03: 10
 49 clone_color: ms-drbd_02 allocation score on node-02: 4100
 50 clone_color: ms-drbd_02 allocation score on node-03: -100
 51 clone_color: drbd_02:0 allocation score on node-02: 11100
 52 clone_color: drbd_02:0 allocation score on node-03: 0
 53 clone_color: drbd_02:1 allocation score on node-02: 100
 54 clone_color: drbd_02:1 allocation score on node-03: 11000
 55 native_color: drbd_02:0 allocation score on node-02: 11100
 56 native_color: drbd_02:0 allocation score on node-03: 0
 57 native_color: drbd_02:1 allocation score on node-02: -100
 58 native_color: drbd_02:1 allocation score on node-03: 11000
 59 drbd_02:0 promotion score on node-02: 18100
 60 drbd_02:2 promotion score on none: 0
 61 drbd_02:1 promotion score on node-03: -100
 62 native_color: fs_02 allocation score on node-02: 15100
 63 native_color: fs_02 allocation score on node-03: -100
 64 native_color: pbx_02 allocation score on node-02: 3000
 65 native_color: pbx_02 allocation score on node-03: -100
 66 native_color: sshd_02 allocation score on node-02: 2000
 67 native_color: sshd_02 allocation score on node-03: -100
 68 native_color: mailAlert-02 allocation score on node-02: 1000
 69 native_color: mailAlert-02 allocation score on node-03: -100
 70 drbd_01:0 promotion score on node-01: 100
 71 drbd_01:1 promotion score on node-03: -100
 72 drbd_02:0 promotion score on node-02: 100
 73 drbd_02:2 promotion score on none: 0
 74 drbd_02:1 promotion score on node-03: -100
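
Following Dan's earlier point that with groups the score of each active
resource is added, my current reading of the first few numbers, against an
assumed config fragment (reconstructed from the scores above, not my
verified CIB; constraint ids are placeholders), is:

rsc_defaults resource-stickiness="1000"
group pbx_service_01 ip_01 fs_01 pbx_01 sshd_01 mailAlert-01
location loc_01 pbx_service_01 200: node-01
location loc_03 pbx_service_01 10: node-03

group_color(ip_01, node-01)  = 1000 (own stickiness) + 200 (location) = 1200
native_color(ip_01, node-01) = 5 x 1000 (stickiness of all five active
members added onto the first member placed) + 200 = 5200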

Re: [Pacemaker] 2 node failover cluster + MySQL Master-Master replica setup

2010-11-12 Thread Ruzsinszky Attila
> That's what I said - I didn't see it either.
> But if you check the current RA:
OK, sorry.

> # crm ra meta mysql|grep ^replica
> replication_user (string): MySQL replication user
> replication_passwd (string): MySQL replication user password
> replication_port (string, [3306]): MySQL replication port
Maybe I have seen those parameters (or not) in DMC...

Ruzsi



Re: [Pacemaker] 2 node failover cluster + MySQL Master-Master replica setup

2010-11-12 Thread Vadym Chepkov

On Nov 12, 2010, at 1:22 PM, Ruzsinszky Attila wrote:

>> http://lmgtfy.com/?q=linbit+mysql+replication
> OK.
> I found that webinar. There isn't any "printed" (readable) doc. :-(
> 

That's what I said - I didn't see it either.
But if you check the current RA:

# crm ra meta mysql|grep ^replica
replication_user (string): MySQL replication user
replication_passwd (string): MySQL replication user password
replication_port (string, [3306]): MySQL replication port
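
Presumably they'd be passed to the RA along these lines (an untested
sketch with placeholder values):

primitive mysql_repl ocf:heartbeat:mysql \
    params replication_user="repl" \
           replication_passwd="secret" \
           replication_port="3306"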

Vadym






Re: [Pacemaker] 2 node failover cluster + MySQL Master-Master replica setup

2010-11-12 Thread Ruzsinszky Attila
> http://lmgtfy.com/?q=linbit+mysql+replication
OK.
I found that webinar. There isn't any "printed" (readable) doc. :-(

TIA,
Ruzsi



Re: [Pacemaker] 2 node failover cluster + MySQL Master-Master replica setup

2010-11-12 Thread Vadym Chepkov

On Nov 12, 2010, at 1:11 PM, Ruzsinszky Attila wrote:

>> I am pretty sure Linbit announced a mysql RA with replication capabilities.
>> I haven't seen documentation for it, though.
> Any URL, or anything?
> 

http://lmgtfy.com/?q=linbit+mysql+replication



> Thanks,
> Ruzsi
> 




Re: [Pacemaker] 2 node failover cluster + MySQL Master-Master replica setup

2010-11-12 Thread Ruzsinszky Attila
> I am pretty sure Linbit announced a mysql RA with replication capabilities.
> I haven't seen documentation for it, though.
Any URL, or anything?

Thanks,
Ruzsi



Re: [Pacemaker] 2 node failover cluster + MySQL Master-Master replica setup

2010-11-12 Thread Vadym Chepkov

On Nov 12, 2010, at 11:14 AM, Ruzsinszky Attila wrote:

> Hi,
> 
>> And now (officially) RHCS can also use Pacemaker
>> http://theclusterguy.clusterlabs.org/post/1551292286
> Nice.
> 
>> Yeah, like I said, Master-Master and Pacemaker without a proper resource
>> agent will cause issues.
> Yes.
> 
>> big problems. Now let me explain this, a 2-node Multi-Master MySQL setup
>> means setting up every node as both Master and Slave, node 1's Master
>> replicates asynchronously to node 2's Slave and node 2's Master replicates
>> asynchronously to node 1's Slave. The replication channels between the two
>> are not redundant, nor do they recover from failure automatically and you
>> have to manually set the auto-increment-increment and auto-increment-offset
>> so that you don't have primary key collisions.
> Clear.
> 
>> each server. Looking at how DRBD handles these kinds of things is one way to
>> go about it, but ... it's a huge task and there are a lot of things that can
>> go terribly wrong.
> :-(
> 
>> So again, for the third time, the problem is not the Multi-Master setup,
>> nor is it Pacemaker, it's just a very specific use case for which a
>> resource agent wasn't written.
> OK.
> So now almost the only possibility is DRBD+MySQL?
> 

I am pretty sure Linbit announced a mysql RA with replication capabilities.
I haven't seen documentation for it, though.

Vadym




Re: [Pacemaker] start filesystem like this is right?

2010-11-12 Thread Pavlos Parissis
On 12 November 2010 07:37, jiaju liu  wrote:
>
> start resource steps
> step(1)
> crm configure primitive vol_mpath0 ocf:heartbeat:Filesystem meta 
> target-role=stopped params device=/dev/mapper/mpath0 
> directory=/mnt/mapper/mpath0 fstype='lustre' op start timeout=300s  op stop 
> timeout=120s op monitor timeout=120s interval=60s op notify timeout=60s
> step(2)
> crm resource reprobe
>
> step(3)
> crm configure location vol_mpath0_location_manage datavol_mpath0 rule -inf: 
> not_defined pingd_manage or pingd_manage lte 0
>
> crm configure location vol_mpath0_location_data datavol_mpath0 rule -inf: 
> not_defined pingd_data or pingd_data lte 0

why do you have 2 location constraints? and where are the definitions for
pingd_manage and pingd_data?
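
For comparison, such definitions usually look roughly like the following
(a sketch; the resource names, host_list and intervals are placeholders,
not your config):

primitive ping_manage ocf:pacemaker:ping \
    params name="pingd_manage" host_list="10.0.0.1" multiplier="100" \
    op monitor interval="15s"
clone cl_ping_manage ping_manage meta globally-unique="false"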

>
> step(4)
> crm resource start vol_mpath0
>
> delete resource steps
>
> step(1)
> crm resource stop vol_mpath0
>
> step(2)
> crm resource cleanup vol_mpath0
>
> step(3)
> crm configure delete vol_mpath0
>
> Above are my steps. Are they right? I repeated these steps several times.
> At the beginning it worked well; after 5 or 6 times the resource could not
> start. Using crm resource start vol_mpath0 again was no use.

Could it be that your ping nodes are down?
>
> my pacemaker package are
>     pacemaker-1.0.8-6.1.el5
>     pacemaker-libs-devel-1.0.8-6.1.el5
>     pacemaker-libs-1.0.8-6.1.el5
>
>     openais packages are
>     openaislib-devel-1.1.0-1.el5
>     openais-1.1.0-1.el5
>     openaislib-1.1.0-1.el5
>
>     corosync packages are
>     corosync-1.2.2-1.1.el5
>     corosynclib-devel-1.2.2-1.1.el5
>     corosynclib-1.2.2-1.1.el5
>     Does anyone know why? Thanks a lot
>
>



Re: [Pacemaker] Infinite fail-count and migration-threshold after node fail-back

2010-11-12 Thread Pavlos Parissis
On 11 November 2010 16:59, Dan Frincu  wrote:
[...snip...]
>
> [rsc_location XML example from Pacemaker Explained stripped by the archive]
> Example 6.1. Example set of opt-in location constraints
>
> At the moment you have symmetric-cluster=false, you need to add
> location constraints in order to get your resources running.
> Below is my conf and it works as expected, pbx_service_01 starts on
> node-01 and never fails back, in case failed over to node-03 and
> node-01 is back on line, due to resource-stickiness="1000", but take a
> look at the score in location constraint, very low scores compared to
> 1000 - I could  have also set it to inf
>
>
> Yes but you don't have groups defined in your setup, having groups means the
> score of each active resource is added.
> http://www.clusterlabs.org/doc/en-US/Pacemaker/1.0/html/Pacemaker_Explained/ch-advanced-resources.html#id2220530
>
> For example:
>
> r...@cluster1:~# ptest -sL
> Allocation scores:
> group_color: all allocation score on cluster1: 0
> group_color: all allocation score on cluster2: -100
> group_color: virtual_ip_1 allocation score on cluster1: 1000
> group_color: virtual_ip_1 allocation score on cluster2: -100
> group_color: virtual_ip_2 allocation score on cluster1: 1000
> group_color: virtual_ip_2 allocation score on cluster2: 0
> group_color: Failover_Alert allocation score on cluster1: 1000
> group_color: Failover_Alert allocation score on cluster2: 0
> group_color: fs_home allocation score on cluster1: 1000
> group_color: fs_home allocation score on cluster2: 0
> group_color: fs_mysql allocation score on cluster1: 1000
> group_color: fs_mysql allocation score on cluster2: 0
> group_color: fs_storage allocation score on cluster1: 1000
> group_color: fs_storage allocation score on cluster2: 0
> group_color: httpd allocation score on cluster1: 1000
> group_color: httpd allocation score on cluster2: 0
> group_color: mysqld allocation score on cluster1: 1000
> group_color: mysqld allocation score on cluster2: 0
> clone_color: ms_drbd_home allocation score on cluster1: 9000
> clone_color: ms_drbd_home allocation score on cluster2: -100
> clone_color: drbd_home:0 allocation score on cluster1: 1100
> clone_color: drbd_home:0 allocation score on cluster2: 0
> clone_color: drbd_home:1 allocation score on cluster1: 0
> clone_color: drbd_home:1 allocation score on cluster2: 1100
> native_color: drbd_home:0 allocation score on cluster1: 1100
> native_color: drbd_home:0 allocation score on cluster2: 0
> native_color: drbd_home:1 allocation score on cluster1: -100
> native_color: drbd_home:1 allocation score on cluster2: 1100
> drbd_home:0 promotion score on cluster1: 18100
> drbd_home:1 promotion score on cluster2: -100
> clone_color: ms_drbd_mysql allocation score on cluster1: 10100
> clone_color: ms_drbd_mysql allocation score on cluster2: -100
> clone_color: drbd_mysql:0 allocation score on cluster1: 1100
> clone_color: drbd_mysql:0 allocation score on cluster2: 0
> clone_color: drbd_mysql:1 allocation score on cluster1: 0
> clone_color: drbd_mysql:1 allocation score on cluster2: 1100
> native_color: drbd_mysql:0 allocation score on cluster1: 1100
> native_color: drbd_mysql:0 allocation score on cluster2: 0
> native_color: drbd_mysql:1 allocation score on cluster1: -100
> native_color: drbd_mysql:1 allocation score on cluster2: 1100
> drbd_mysql:0 promotion score on cluster1: 20300
> drbd_mysql:1 promotion score on cluster2: -100
> clone_color: ms_drbd_storage allocation score on cluster1: 11200
> clone_color: ms_drbd_storage allocation score on cluster2: -100
> clone_color: drbd_storage:0 allocation score on cluster1: 1100
> clone_color: drbd_storage:0 allocation score on cluster2: 0
> clone_color: drbd_storage:1 allocation score on cluster1: 0
> clone_color: drbd_storage:1 allocation score on cluster2: 1100
> native_color: drbd_storage:0 allocation score on cluster1: 1100
> native_color: drbd_storage:0 allocation score on cluster2: 0
> native_color: drbd_storage:1 allocation score on cluster1: -100
> native_color: drbd_storage:1 allocation score on cluster2: 1100
> drbd_storage:0 promotion score on cluster1: 22500
> drbd_storage:1 promotion score on cluster2: -100
> native_color: virtual_ip_1 allocation score on cluster1: 12300
> native_color: virtual_ip_1 allocation score on cluster2: -100
> native_color: virtual_ip_2 allocation score on cluster1: 8000
> native_color: virtual_ip_2 allocation score on cluster2: -100
> native_color: Failover_Alert allocation score on cluster1: 7000
> native_color: Failover_Alert allocation score on cluster2: -100
> native_color: fs_home allocation score on cluster1: 6000
> native_color: fs_home allocation score on cluster2: -100
> native_color: fs_mysql allocation score on cluster1: 5000
> native_color: fs_mysql allocation score on cluster2: -100
> native_color: fs_storage allocation score on cluster1: 4000
> native_color: fs_storage allocation score on cluster2: -100
> native_color: mys

Re: [Pacemaker] 2 node failover cluster + MySQL Master-Master replica setup

2010-11-12 Thread Ruzsinszky Attila
> Isn't Multi-Master plus DRBD a big no-no? It implies that two MySQL
In this situation there isn't a Multi-Master setup.
It is just a normal single-instance mysql process which will be switched by
DRBD to the active node.

TIA,
Ruzsi



Re: [Pacemaker] Project updates

2010-11-12 Thread Vadym Chepkov
On Fri, Nov 12, 2010 at 9:32 AM, Andrew Beekhof  wrote:
> For those that aren't using RSS readers, I wanted to draw people's
> attention to a couple of updates that went out today.
>
> Nothing dramatic, just a new 1.0 release (and back-announcement for
> some from 1.1):

Perhaps something was forgotten in the excitement? :)

$ hg diff
diff -r 99f5a1e61667 GNUmakefile
--- a/GNUmakefile   Fri Nov 12 09:12:32 2010 +0100
+++ b/GNUmakefile   Fri Nov 12 11:47:28 2010 -0500
@@ -26,7 +26,7 @@
 TARFILE= $(distdir).tar.bz2
 DIST_ARCHIVES  = $(TARFILE)

-LAST_RELEASE   = Pacemaker-1.0.9.1
+LAST_RELEASE   = Pacemaker-1.0.10
 STABLE_SERIES  = stable-1.0

 RPM_ROOT   = $(shell pwd)
diff -r 99f5a1e61667 configure.ac
--- a/configure.ac  Fri Nov 12 09:12:32 2010 +0100
+++ b/configure.ac  Fri Nov 12 11:47:28 2010 -0500
@@ -19,7 +19,7 @@
 dnl checks for library functions
 dnl checks for system services

-AC_INIT(pacemaker, 1.0.9, pacemaker@oss.clusterlabs.org)
+AC_INIT(pacemaker, 1.0.10, pacemaker@oss.clusterlabs.org)
 CRM_DTD_VERSION="1.0"

 PKG_FEATURES=""



Re: [Pacemaker] 2 node failover cluster + MySQL Master-Master replica setup

2010-11-12 Thread Dennis Jacobfeuerborn

On 11/12/2010 01:50 PM, Dan Frincu wrote:
[SNIP]
>>> server. Even the LSB script doesn't handle a Multi-Master setup. You'd have
>>> to write a custom resource agent, and it would probably fit your setup and
>>> your setup alone, meaning it couldn't be widely used for other setups, I
>>> know I had to make some modifications to the mysql resource agent and those
>>> changes were specific to my setup.
>> No, I don't want to write scripts. I'm not a programmer. I just want to
>> try out a new tech for MySQL clustering other than MySQL+DRBD. It is
>> clear to me theoretically: the files of mysqld reside on the common dir,
>> which is switched by DRBD. Is that right?
> Yes.

Isn't Multi-Master plus DRBD a big no-no? It implies that two MySQL
instances write to the same datadir, which if I remember correctly is not
supported by MySQL, or has that changed?


Regards,
  Dennis



Re: [Pacemaker] 2 node failover cluster + MySQL Master-Master replica setup

2010-11-12 Thread Ruzsinszky Attila
> The mysql RA controls a single MySQL instance and the rest of the HA setup
> is done via DRBD Master-Slave resources.
OK. Thank you for your help.

Ruzsi



Re: [Pacemaker] 2 node failover cluster + MySQL Master-Master replica setup

2010-11-12 Thread Dan Frincu

Hi,

Ruzsinszky Attila wrote:
> Hi,
>
>> And now (officially) RHCS can also use Pacemaker
>> http://theclusterguy.clusterlabs.org/post/1551292286
> Nice.
>
>> Yeah, like I said, Master-Master and Pacemaker without a proper resource
>> agent will cause issues.
> Yes.
>
>> big problems. Now let me explain this, a 2-node Multi-Master MySQL setup
>> means setting up every node as both Master and Slave, node 1's Master
>> replicates asynchronously to node 2's Slave and node 2's Master replicates
>> asynchronously to node 1's Slave. The replication channels between the two
>> are not redundant, nor do they recover from failure automatically and you
>> have to manually set the auto-increment-increment and auto-increment-offset
>> so that you don't have primary key collisions.
> Clear.
>
>> each server. Looking at how DRBD handles these kinds of things is one way to
>> go about it, but ... it's a huge task and there are a lot of things that can
>> go terribly wrong.
> :-(
>
>> So again, for the third time, the problem is not the Multi-Master setup,
>> nor is it Pacemaker, it's just a very specific use case for which a
>> resource agent wasn't written.
> OK.
> So now almost the only possibility is DRBD+MySQL?

Afaik, yes. I'm hoping someone will step in and say otherwise, but to the
best of my knowledge, the only integration between MySQL and Pacemaker is
represented by the mysql and mysql-proxy resource agents:
http://www.linux-ha.org/wiki/Resource_Agents

The mysql RA controls a single MySQL instance and the rest of the HA
setup is done via DRBD Master-Slave resources.
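
In skeleton form that pattern would look something like this (a rough,
untested sketch; the resource names, device and paths are assumed, not
anyone's actual config):

primitive drbd_mysql ocf:linbit:drbd \
    params drbd_resource="mysql" \
    op monitor interval="30s"
ms ms_drbd_mysql drbd_mysql \
    meta master-max="1" clone-max="2" notify="true"
primitive fs_mysql ocf:heartbeat:Filesystem \
    params device="/dev/drbd0" directory="/var/lib/mysql" fstype="ext3"
primitive mysqld ocf:heartbeat:mysql
group g_mysql fs_mysql mysqld
colocation mysql_on_drbd inf: g_mysql ms_drbd_mysql:Master
order mysql_after_drbd inf: ms_drbd_mysql:promote g_mysql:start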


Regards,

Dan



--
Dan FRINCU
Systems Engineer
CCNA, RHCE
Streamwide Romania



Re: [Pacemaker] symmetric anti-collocation

2010-11-12 Thread Alan Jones
On Thu, Nov 11, 2010 at 11:31 PM, Andrew Beekhof  wrote:
>> colocation X-Y -2: X Y
>> colocation Y-X -2: Y X
>
> the second one is implied by the first and is therefore redundant

If only that were true!
What happens with the first rule is that other constraints that force
Y to a node will evict X but not the other way around.
What I'm doing is to first apply a "slight" preference for each
resource to each node:
location X-nodeA X 1: nodeA
location Y-nodeB Y 1: nodeB
And then impose absolute constraints that come from the outside environment.
In the particular case that has a problem, the constraint looks like:
location X-not-nodeA X -inf: nodeA
The behavior I expected was for X to be placed on nodeB and Y to
"anti-colocate" onto nodeA because our colocation rule is stronger
than the node preference rule.  What happens instead is that both X
and Y run on nodeB.
The similar constraint on Y (by itself) does work:
location Y-not-nodeB Y -inf: nodeB
and results in Y running on nodeA and X running on nodeB.  This is the
case whether I have one colocation rule or two, i.e. the second
colocation rule is ignored.
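
For completeness, the failing scenario can be reproduced with a config
along these lines (my sketch with Dummy resources assumed, constraints as
described above):

primitive X ocf:pacemaker:Dummy
primitive Y ocf:pacemaker:Dummy
location X-nodeA X 1: nodeA
location Y-nodeB Y 1: nodeB
location X-not-nodeA X -inf: nodeA
colocation X-Y -2: X Y
colocation Y-X -2: Y X

Expected: X on nodeB, Y anti-colocated onto nodeA.
Observed: both X and Y end up on nodeB.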

Looking at the code, I think the solution would be to short-circuit
the recursion when you can only run on one node due to -inf rules
rather than on a loop.  Obviously, it would not be a simple change and
needs some thought.
If you have any other suggestions let me know.
Alan



Re: [Pacemaker] 2 node failover cluster + MySQL Master-Master replica setup

2010-11-12 Thread Ruzsinszky Attila
Hi,

> And now (officially) RHCS can also use Pacemaker
> http://theclusterguy.clusterlabs.org/post/1551292286
Nice.

> Yeah, like I said, Master-Master and Pacemaker without a proper resource
> agent will cause issues.
Yes.

> big problems. Now let me explain this, a 2-node Multi-Master MySQL setup
> means setting up every node as both Master and Slave, node 1's Master
> replicates asynchronously to node 2's Slave and node 2's Master replicates
> asynchronously to node 1's Slave. The replication channels between the two
> are not redundant, nor do they recover from failure automatically and you
> have to manually set the auto-increment-increment and auto-increment-offset
> so that you don't have primary key collisions.
Clear.

> each server. Looking at how DRBD handles these kinds of things is one way to
> go about it, but ... it's a huge task and there are a lot of things that can
> go terribly wrong.
:-(

> So again, for the third time, the problem is not the Multi-Master setup,
> nor is it Pacemaker, it's just a very specific use case for which a
> resource agent wasn't written.
OK.
So now almost the only possibility is DRBD+MySQL?

TIA,
Ruzsi



[Pacemaker] Project updates

2010-11-12 Thread Andrew Beekhof
For those that aren't using RSS readers, I wanted to draw people's
attention to a couple of updates that went out today.

Nothing dramatic, just a new 1.0 release (and back-announcement for
some from 1.1):

   
http://theclusterguy.clusterlabs.org/post/1551292286/pacemaker-release-roundup

Also, NTT contributed a new logo.  See what you think:

  http://theclusterguy.clusterlabs.org/post/1551578523/new-logo



Re: [Pacemaker] Pacemaker-1.1.4, when?

2010-11-12 Thread Nikola Ciprich
(resent)
1.1.4 with new glib2: tests pass smoothly
1.1.4 + patch and older glib2 - all tests are segfaulting...

i.e.:
Program terminated with signal 11, Segmentation fault.
#0  IA__g_str_hash (v=0x0) at gstring.c:95
95      guint32 h = *p;
(gdb) bt
#0  IA__g_str_hash (v=0x0) at gstring.c:95
#1  0x7fe087bb6128 in g_hash_table_lookup_node (hash_table=0x1390ec0, 
key=0x0, value=0x13a3b00) at ghash.c:231
#2  IA__g_hash_table_insert (hash_table=0x1390ec0, key=0x0, value=0x13a3b00) at 
ghash.c:336
#3  0x7fe089367953 in convert_graph_action (resource=0x13a30a0, 
action=0x139cb80, status=0, rc=7) at unpack.c:308
#4  0x0040362a in exec_rsc_action (graph=0x1394fa0, action=0x139cb80) 
at crm_inject.c:359
#5  0x7fe089368642 in initiate_action (graph=0x1394fa0, action=0x139cb80) 
at graph.c:172
#6  0x7fe08936899d in fire_synapse (graph=0x1394fa0, synapse=0x139ba60) at 
graph.c:204
#7  0x7fe089368dbd in run_graph (graph=0x1394fa0) at graph.c:262
#8  0x0040428f in run_simulation (data_set=0x7fff712280a0) at 
crm_inject.c:540
#9  0x0040632a in main (argc=9, argv=0x7fff71228308) at 
crm_inject.c:1148


On Fri, Nov 12, 2010 at 01:41:26PM +0100, Andrew Beekhof wrote:
> 2010/11/12 Nikola Ciprich :
> >> do the pe regression tests pass?
> > Hi Andrew,
> > how do I run PE tests? looking into regression directory,
> > I'm a bit confused..
> 
> either pengine/regression.sh from the top of the source directory, or
> from somewhere under /usr/share/pacemaker (check where the -devel
> package puts them)
> 
> > n.
> >
> > --
> > -
> > Ing. Nikola CIPRICH
> > LinuxBox.cz, s.r.o.
> > 28. rijna 168, 709 01 Ostrava
> >
> > tel.:   +420 596 603 142
> > fax:    +420 596 621 273
> > mobil:  +420 777 093 799
> > www.linuxbox.cz
> >
> > mobil servis: +420 737 238 656
> > email servis: ser...@linuxbox.cz
> > -
> >
> 

-- 
-
Ing. Nikola CIPRICH
LinuxBox.cz, s.r.o.
28. rijna 168, 709 01 Ostrava

tel.:   +420 596 603 142
fax:    +420 596 621 273
mobil:  +420 777 093 799
www.linuxbox.cz

mobil servis: +420 737 238 656
email servis: ser...@linuxbox.cz
-



Re: [Pacemaker] 2 node failover cluster + MySQL Master-Master replica setup

2010-11-12 Thread Dan Frincu

Hi,

Ruzsinszky Attila wrote:
> Hi,
>
>> A MySQL Multi-Master architecture for a 2 node setup brings a lot of
>> configuration and administration overhead and has no conflict detection or
>> resolution. Integrating such a setup with Pacemaker only adds to the
> Yes, I found it.
> The real story: I want to learn clustering with a 2 node failover cluster.
> I configured the cluster by DMC (DRBD Management Console).
> I used the GUI to configure a MySQL service. It was almost unsuccessful,
> which wasn't a surprise for me. After that I started to read some HowTos,
> web pages, etc. for help. I found someone from the #mysql-nbd channel who
> helped me and advised me to use an M-M MySQL config, but he knows almost
> nothing about Pacemaker (he uses RH cluster).

And now (officially) RHCS can also use Pacemaker
http://theclusterguy.clusterlabs.org/post/1551292286

> After we did the working M-M config I started pacemaker and I could see
> MySQL is working. I could connect to the common IP and I could create a
> test DB. Everything seemed all right until I put the master node (from
> pacemaker's point of view) into standby. At that moment mysqld started
> "blinking" between working and not-working states because pacemaker
> always restarted the process.
>
> In the messages file I could see some lines about missing privs (RELOAD
> and SUPER).
>
> So I'm here now.

Yeah, like I said, Master-Master and Pacemaker without a proper resource
agent will cause issues.

>> server. Even the LSB script doesn't handle a Multi-Master setup. You'd have
>> to write a custom resource agent, and it would probably fit your setup and
>> your setup alone, meaning it couldn't be widely used for other setups, I
>> know I had to make some modifications to the mysql resource agent and those
>> changes were specific to my setup.
> No, I don't want to write scripts. I'm not a programmer. I just want to
> try out a new tech for MySQL clustering other than MySQL+DRBD. It is
> clear to me theoretically: the files of mysqld reside on the common dir,
> which is switched by DRBD. Is that right?

Yes.

>> MySQL Cluster is a choice, it could be integrated with Pacemaker, although I
> Now I don't want MySQL Cluster. I think it is a bigger task for me.
>
>> Anyways, this is just to get a feel for what's involved in the process, and
>> how Pacemaker would fit the picture, at least from my point of view.
> OK
>
>> I would recommend all questions related to MySQL Cluster, Replication,
>> Multi-Master be directed to the appropriate mailing lists though, and if you
> As I mentioned I've got an M-M config from #mysql-nbd.
> The recent problem is MySQL (M-M) + Pacemaker.

Back to square one, don't pass go, don't collect $200. No resource 
agent, big problems. Now let me explain this, a 2-node Multi-Master 
MySQL setup means setting up every node as both Master and Slave, node 
1's Master replicates asynchronously to node 2's Slave and node 2's 
Master replicates asynchronously to node 1's Slave. The replication 
channels between the two are not redundant, nor do they recover from 
failure automatically and you have to manually set the 
auto-increment-increment and auto-increment-offset so that you don't 
have primary key collisions.
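
To illustrate that last point, the usual two-node settings would be 
something like this (a sketch; which node gets which offset is an 
arbitrary choice):

# node 1, my.cnf
[mysqld]
auto-increment-increment = 2
auto-increment-offset    = 1

# node 2, my.cnf
[mysqld]
auto-increment-increment = 2
auto-increment-offset    = 2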


Now imagine a resource agent and what it should do to keep the resources 
up. First you need to check periodically (monitor) the replication 
channels, if they fail, you must determine which node has the most 
recent information and make sure that first it's information is sent to 
the other node via the Slave replication channel, then activate the 
reverse Master-Slave channel, otherwise you'd be in a MySQL 
'split-brain' situation, where each node has information written to it 
and the database now contains different views on each server. Looking at 
how DRBD handles these kinds of things is one way to go about it, but 
... it's a huge task and there are a lot of things that can go terribly 
wrong.


So again, for the third time, the problem is not the Multi-Master setup, 
nor is it Pacemaker, it's just a very specific use case for which a 
resource agent wasn't written.


Regards,

Dan
  

>> want to write a resource agent for a Multi-Master setup, by all means, do
>> share :)
> No, I don't want. I'm a beginner both in clustering and MySQL.
>
>> Hope this helps.
> Yes, of course.
>
> BTW.
> If I want to solve the above problem, can you help me? Of course with my
> exact error messages, config files, etc. I "feel" my M-M config is not
> rock stable (I was able to break the IO or SQL "channel" between the
> two mysqld processes), so I don't know whether I want this type of setup.
>
> TIA,
> Ruzsi



--
Dan FRINCU
Systems Engineer
CCNA,

Re: [Pacemaker] Pacemaker-1.1.4, when?

2010-11-12 Thread Andrew Beekhof
2010/11/12 Nikola Ciprich :
>> do the pe regression tests pass?
> Hi Andrew,
> how do I run PE tests? looking into regression directory,
> I'm a bit confused..

either pengine/regression.sh from the top of the source directory, or
from somewhere under /usr/share/pacemaker (check where the -devel
package puts them)
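
i.e. something like the following (the installed location below is a
guess; as said, check where your -devel package puts them):

# from the top of a source checkout:
./pengine/regression.sh

# or, from an installed -devel package (exact path varies by distro):
/usr/share/pacemaker/tests/pengine/regression.sh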

> n.
>
> --
> -
> Ing. Nikola CIPRICH
> LinuxBox.cz, s.r.o.
> 28. rijna 168, 709 01 Ostrava
>
> tel.:   +420 596 603 142
> fax:    +420 596 621 273
> mobil:  +420 777 093 799
> www.linuxbox.cz
>
> mobil servis: +420 737 238 656
> email servis: ser...@linuxbox.cz
> -
>



Re: [Pacemaker] Pacemaker-1.1.4, when?

2010-11-12 Thread Andrew Beekhof
On Fri, Nov 12, 2010 at 11:19 AM, Nikola Ciprich
 wrote:
> it compiles well and doesn't crash, but seems to me like
> it misbehaves somehow...
> MS resources are not promoted properly or are suddenly marked
> as orphaned etc..

do the pe regression tests pass?

>
> On Fri, Nov 12, 2010 at 05:42:41PM +0900, nozawat wrote:
>> Hi Andrew,
>>
>>  It is very good.
>>  I attach a patch for the time being.
>>
>> Regards
>> Tomo
>>
>> 2010/11/12 Andrew Beekhof 
>>
>> > Ah, silly me.
>> >
>> > After
>> > +    iter->offset = 0;
>> >
>> > you need:
>> > +    iter->values = NULL;
>> >
>> >
>> > On Fri, Nov 12, 2010 at 2:07 AM, nozawat  wrote:
>> > > Hi Andrew,
>> > >
>> > >  I show below a result of print.
>> > >
>> > > 1)print *iter
>> > > (gdb) print *iter
>> > > $1 = {offset = 2, hash = 0x12aa7ec0, values = 0x7fff1568e580}
>> > > -
>> > >
>> > > 2)print *values
>> > > (gdb) print *values
>> > > $2 = {data = 0x7fff1568e5c0, next = 0x2b247ebb85a1, prev = 0x1}
>> > > -
>> > >
>> > > Regards,
>> > > Tomo
>> > >
>> > > 2010/11/12 Andrew Beekhof 
>> > >>
>> > >> On Thu, Nov 11, 2010 at 3:50 PM, nozawat  wrote:
>> > >> > Hi Andrew
>> > >> >
>> > >> >  Sorry,pengine output a core.
>> > >>
>> > >> could you go up to frame #1 and run:
>> > >> print *iter
>> > >> print *values
>> > >>
>> > >> >
>> > >> > -
>> > >> > gdb) where
>> > >> > #0  0x2b247fa8b53a in g_list_nth_data () from
>> > >> > /lib64/libglib-2.0.so.0
>> > >> > #1  0x2b247ebc5027 in g_hash_table_iter_next (iter=0x7fff1568e4c0,
>> > >> > key=0x0, value=0x7fff1568e4e0)
>> > >> >     at ../include/crm/common/util.h:
>> > >> > 348
>> > >> > #2  0x2b247ebc9301 in native_rsc_location (rsc=0x12aa9cc0,
>> > >> > constraint=0x12af5480) at native.c:1215
>> > >> > #3  0x2b247ebcf56c in group_rsc_location (rsc=0x12aa9cc0,
>> > >> > constraint=0x12af5480) at group.c:421
>> > >> > #4  0x2b247ebb85a1 in apply_placement_constraints
>> > >> > (data_set=0x7fff1568e6b0) at allocate.c:523
>> > >> > #5  0x2b247ebb96f6 in stage2 (data_set=0x7fff1568e6b0) at
>> > >> > allocate.c:872
>> > >> > #6  0x2b247ebb6754 in do_calculations (data_set=0x7fff1568e6b0,
>> > >> > xml_input=0x1295ec90, now=0x0)
>> > >> >     at pengine.c:262
>> > >> > #7  0x2b247ebb5d3e in process_pe_message (msg=0x12941e60,
>> > >> > xml_data=0x1295a610, sender=0x12940ac0)
>> > >> >     at pengine.c:124
>> > >> > #8  0x00401265 in pe_msg_callback (client=0x12940ac0,
>> > >> > user_data=0x0)
>> > >> > at main.c:60
>> > >> > #9  0x2b247f634b97 in G_CH_dispatch_int (source=0x1293fd80,
>> > >> > callback=0,
>> > >> > user_data=0x0) at GSource.c:637
>> > >> > #10 0x2b247fa8ddb4 in g_main_context_dispatch () from
>> > >> > /lib64/libglib-2.0.so.0
>> > >> > #11 0x2b247fa90c0d in ?? () from /lib64/libglib-2.0.so.0
>> > >> > #12 0x2b247fa90f1a in g_main_loop_run () from
>> > >> > /lib64/libglib-2.0.so.0
>> > >> > #13 0x0040186f in main (argc=1, argv=0x7fff1568eb48) at
>> > >> > main.c:177
>> > >> > (gdb)
>> > >> >
>> > >> > --
>> > >> >
>> > >> > Regards,
>> > >> > Tomo
>> > >> >
>> > >> >
>> > >> >
>> > >> >
>> > >> > 2010/11/11 Andrew Beekhof 
>> > >> >>
>> > >> >> On Thu, Nov 11, 2010 at 12:31 PM, nozawat  wrote:
>> > >> >> > Hi Andrew,
>> > >> >> >
>> > >> >> >  I ran it. However, an error has been output.
>> > >> >> >  Probably I have a feeling that glib does not move well.
>> > >> >> >  I attached ha-log.
>> > >> >> >
>> > >> >> >  I feel like cannot read a library well.
>> > >> >> >  It is contents of core as follows.
>> > >> >>
>> > >> >> you'll need the debuginfo package installed
>> > >> >>
>> > >> >> > 
>> > >> >> > $ gdb /usr/sbin/corosync core.27920
>> > >> >> > GNU gdb Fedora (6.8-37.el5)
>> > >> >> > Copyright (C) 2008 Free Software Foundation, Inc.
>> > >> >> > License GPLv3+: GNU GPL version 3 or later
>> > >> >> > 
>> > >> >> > This is free software: you are free to change and redistribute it.
>> > >> >> > There is NO WARRANTY, to the extent permitted by law.  Type "show
>> > >> >> > copying"
>> > >> >> > and "show warranty" for details.
>> > >> >> > This GDB was configured as "x86_64-redhat-linux-gnu"...
>> > >> >> >
>> > >> >> > warning: core file may not match specified executable file.
>> > >> >> > Core was generated by `/usr/lib64/heartbeat/pengine'.
>> > >> >> > Program terminated with signal 11, Segmentation fault.
>> > >> >> > [New process 27920]
>> > >> >> > #0  0x2b247fa8b53a in ?? ()
>> > >> >> > (gdb) where
>> > >> >> > #0  0x2b247fa8b53a in ?? ()
>> > >> >> > #1  0x2b247ebc5027 in ?? ()
>> > >> >> > #2  0x in ?? ()
>> > >> >> > --
>> > >> >> >
>> > >> >> > Regards,
>> > >> >> > Tomo
>> > >> >> >
>> > >> >> > 2010/11/11 Andrew Beekhof 
>> > >> >> >>
>> > >> >> >> On Thu, Nov 11, 2010 at 10:26 AM, nozawat 
>> > wrote:
>> > >> >> >> > Hi Andrew,
>> > >> >> >> >
>> > >> >> >> >  Thanks for a revision.
>> > >> >> >> >  I confirmed completion of compilin

Re: [Pacemaker] 2 node failover cluster + MySQL Master-Master replica setup

2010-11-12 Thread Ruzsinszky Attila
Hi,

> A MySQL Multi-Master architecture for a 2 node setup brings a lot of
> configuration and administration overhead and has no conflict detection or
> resolution. Integrating such a setup with Pacemaker only adds to the
Yes, I found it.
The real story: I want to learn clustering with a 2 node failover cluster.
I configured the cluster by DMC (DRBD Management Console).
I used the GUI to configure a MySQL service. It was almost unsuccessful,
which wasn't a surprise for me. After that I started to read some HowTos,
web pages, etc. for help. I found someone from the #mysql-nbd channel who
helped me and advised me to use an M-M MySQL config, but he knows almost
nothing about Pacemaker (he uses RH cluster).

After we did the working M-M config I started pacemaker and I could see
MySQL is working. I could connect to the common IP and I could create a
test DB. Everything seemed all right until I put the master node (from
pacemaker's point of view) into standby. At that moment mysqld started
"blinking" between working and not-working states because pacemaker
always restarted the process.

In the messages file I could see some lines about missing privs (RELOAD
and SUPER).

So I'm here now.

> server. Even the LSB script doesn't handle a Multi-Master setup. You'd have
> to write a custom resource agent, and it would probably fit your setup and
> your setup alone, meaning it couldn't be widely used for other setups, I
> know I had to make some modifications to the mysql resource agent and those
> changes were specific to my setup.
No, I don't want to write scripts. I'm not a programmer. I just want to
try out a new tech for MySQL clustering other than MySQL+DRBD. It is
clear to me theoretically: the files of mysqld reside on the common dir,
which is switched by DRBD. Is that right?

> MySQL Cluster is a choice, it could be integrated with Pacemaker, although I
Now I don't want MySQL Cluster. I think it is a bigger task for me.

> Anyways, this is just to get a feel for what's involved in the process, and
> how Pacemaker would fit the picture, at least from my point of view.
OK

> I would recommend all questions related to MySQL Cluster, Replication,
> Multi-Master be directed to the appropriate mailing lists though, and if you
As I mentioned I've got an M-M config from #mysql-nbd.
The recent problem is MySQL (M-M) + Pacemaker.

> want to write a resource agent for a Multi-Master setup, by all means, do
> share :)
No, I don't want. I'm a beginner both in clustering and MySQL.

> Hope this helps.
Yes, of course.

BTW.
If I want to solve the above problem, can you help me? Of course with my
exact error messages, config files, etc. I "feel" my M-M config is not
rock stable (I was able to break the IO or SQL "channel" between the
two mysqld processes), so I don't know whether I want this type of setup.

TIA,
Ruzsi



Re: [Pacemaker] Pacemaker-1.1.4, when?

2010-11-12 Thread Nikola Ciprich
it compiles well and doesn't crash, but seems to me like
it misbehaves somehow...
MS resources are not promoted properly or are suddenly marked
as orphaned etc..

On Fri, Nov 12, 2010 at 05:42:41PM +0900, nozawat wrote:
> Hi Andrew,
> 
>  It is very good.
>  I attach a patch for the time being.
> 
> Regards
> Tomo
> 
> 2010/11/12 Andrew Beekhof 
> 
> > Ah, silly me.
> >
> > After
> > +iter->offset = 0;
> >
> > you need:
> > +iter->values = NULL;
> >
> >
> > On Fri, Nov 12, 2010 at 2:07 AM, nozawat  wrote:
> > > Hi Andrew,
> > >
> > >  I show below a result of print.
> > >
> > > 1)print *iter
> > > (gdb) print *iter
> > > $1 = {offset = 2, hash = 0x12aa7ec0, values = 0x7fff1568e580}
> > > -
> > >
> > > 2)print *values
> > > (gdb) print *values
> > > $2 = {data = 0x7fff1568e5c0, next = 0x2b247ebb85a1, prev = 0x1}
> > > -
> > >
> > > Regards,
> > > Tomo
> > >
> > > 2010/11/12 Andrew Beekhof 
> > >>
> > >> On Thu, Nov 11, 2010 at 3:50 PM, nozawat  wrote:
> > >> > Hi Andrew
> > >> >
> > >> >  Sorry,pengine output a core.
> > >>
> > >> could you go up to frame #1 and run:
> > >> print *iter
> > >> print *values
> > >>
> > >> >
> > >> > -
> > >> > gdb) where
> > >> > #0  0x2b247fa8b53a in g_list_nth_data () from
> > >> > /lib64/libglib-2.0.so.0
> > >> > #1  0x2b247ebc5027 in g_hash_table_iter_next (iter=0x7fff1568e4c0,
> > >> > key=0x0, value=0x7fff1568e4e0)
> > >> > at ../include/crm/common/util.h:
> > >> > 348
> > >> > #2  0x2b247ebc9301 in native_rsc_location (rsc=0x12aa9cc0,
> > >> > constraint=0x12af5480) at native.c:1215
> > >> > #3  0x2b247ebcf56c in group_rsc_location (rsc=0x12aa9cc0,
> > >> > constraint=0x12af5480) at group.c:421
> > >> > #4  0x2b247ebb85a1 in apply_placement_constraints
> > >> > (data_set=0x7fff1568e6b0) at allocate.c:523
> > >> > #5  0x2b247ebb96f6 in stage2 (data_set=0x7fff1568e6b0) at
> > >> > allocate.c:872
> > >> > #6  0x2b247ebb6754 in do_calculations (data_set=0x7fff1568e6b0,
> > >> > xml_input=0x1295ec90, now=0x0)
> > >> > at pengine.c:262
> > >> > #7  0x2b247ebb5d3e in process_pe_message (msg=0x12941e60,
> > >> > xml_data=0x1295a610, sender=0x12940ac0)
> > >> > at pengine.c:124
> > >> > #8  0x00401265 in pe_msg_callback (client=0x12940ac0,
> > >> > user_data=0x0)
> > >> > at main.c:60
> > >> > #9  0x2b247f634b97 in G_CH_dispatch_int (source=0x1293fd80,
> > >> > callback=0,
> > >> > user_data=0x0) at GSource.c:637
> > >> > #10 0x2b247fa8ddb4 in g_main_context_dispatch () from
> > >> > /lib64/libglib-2.0.so.0
> > >> > #11 0x2b247fa90c0d in ?? () from /lib64/libglib-2.0.so.0
> > >> > #12 0x2b247fa90f1a in g_main_loop_run () from
> > >> > /lib64/libglib-2.0.so.0
> > >> > #13 0x0040186f in main (argc=1, argv=0x7fff1568eb48) at
> > >> > main.c:177
> > >> > (gdb)
> > >> >
> > >> > --
> > >> >
> > >> > Regards,
> > >> > Tomo
> > >> >
> > >> >
> > >> >
> > >> >
> > >> > 2010/11/11 Andrew Beekhof 
> > >> >>
> > >> >> On Thu, Nov 11, 2010 at 12:31 PM, nozawat  wrote:
> > >> >> > Hi Andrew,
> > >> >> >
> > >> >> >  I ran it. However, an error has been output.
> > >> >> >  Probably I have a feeling that glib does not move well.
> > >> >> >  I attached ha-log.
> > >> >> >
> > >> >> >  I feel like cannot read a library well.
> > >> >> >  It is contents of core as follows.
> > >> >>
> > >> >> you'll need the debuginfo package installed
> > >> >>
> > >> >> > 
> > >> >> > $ gdb /usr/sbin/corosync core.27920
> > >> >> > GNU gdb Fedora (6.8-37.el5)
> > >> >> > Copyright (C) 2008 Free Software Foundation, Inc.
> > >> >> > License GPLv3+: GNU GPL version 3 or later
> > >> >> > 
> > >> >> > This is free software: you are free to change and redistribute it.
> > >> >> > There is NO WARRANTY, to the extent permitted by law.  Type "show
> > >> >> > copying"
> > >> >> > and "show warranty" for details.
> > >> >> > This GDB was configured as "x86_64-redhat-linux-gnu"...
> > >> >> >
> > >> >> > warning: core file may not match specified executable file.
> > >> >> > Core was generated by `/usr/lib64/heartbeat/pengine'.
> > >> >> > Program terminated with signal 11, Segmentation fault.
> > >> >> > [New process 27920]
> > >> >> > #0  0x2b247fa8b53a in ?? ()
> > >> >> > (gdb) where
> > >> >> > #0  0x2b247fa8b53a in ?? ()
> > >> >> > #1  0x2b247ebc5027 in ?? ()
> > >> >> > #2  0x in ?? ()
> > >> >> > --
> > >> >> >
> > >> >> > Regards,
> > >> >> > Tomo
> > >> >> >
> > >> >> > 2010/11/11 Andrew Beekhof 
> > >> >> >>
> > >> >> >> On Thu, Nov 11, 2010 at 10:26 AM, nozawat 
> > wrote:
> > >> >> >> > Hi Andrew,
> > >> >> >> >
> > >> >> >> >  Thanks for a revision.
> > >> >> >> >  I confirmed completion of compiling it.
> > >> >> >> >  I revised it a little, I attach a patch.
> > >> >> >>
> > >> >> >> Thanks!  Did you try running it?
> > >> >> >>
> > >> >> >> >
> > >> >> >> > Regards,
> > >> >> >> > Tomo
> > >> >> >> >
> > >> >> >> >
> > >> >> >> > 20

Re: [Pacemaker] making resource managed

2010-11-12 Thread Vadim S. Khondar
On Wed, 2010-11-10 at 09:03 +0100, Andrew Beekhof wrote:
> On Tue, Nov 9, 2010 at 2:14 PM, Vadim S. Khondar  wrote:
> > On Tue, 2010-11-09 at 09:49 +0100, Andrew Beekhof wrote:
> >> being unmanaged is a side-effect of a) the resource failing to stop
> >> and b) no fencing being configured
> >> once you've fixed the error, run crm resource cleanup as misch suggested
> >>
> >
> > I understand that.
> > However, for example, in a situation when the VPS fails to start (not to stop)
> 
> Its failing to stop too:
> 
> ca_stop_0 (node=ha-3, call=49, rc=1, status=complete): unknown error
>

> 
> Possibly an ordering constraint.  Otherwise, no idea.
> Depends on how your resource agent works.

No ordering constraints are explicitly listed in the configuration.
:(

I will try moving to v1.1.
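
For reference, the recovery Andrew describes would look roughly like this
(a sketch on my side; resource name taken from the quoted log line):

# once the underlying stop failure is fixed:
crm resource cleanup ca
crm_mon -1    # the resource should show as managed again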

-- 
Vadim S. Khondar

v.khon...@o3.ua




Re: [Pacemaker] 2 node failover cluster + MySQL Master-Master replica setup

2010-11-12 Thread Dan Frincu

Hi,

Ruzsinszky Attila wrote:

>> You're not making sense, first you say MySQL Master-Master, then you
>> mention master mysqld on clusterB and slave mysqld on clusterA. So,
>> which one is it:
> Yes, it is true. If I stop openais and I start mysql without openais, the
> config is M-M (or Multi-Master).
>
> When pacemaker starts the mysql processes I can see master and slave
> mysqld text from crm_mon.
>
>> - MySQL Master-Master (or Multi-Master) which can be achieved via MySQL
>> Replication
>> - MySQL Master-Slave, which can be achieved via MySQL Replication as well
> I'd like to implement the above. I don't know which one is right for me.
> Because of the M-M MySQL config I think the 1st one is my choice.

A MySQL Multi-Master architecture for a 2 node setup brings a lot of 
configuration and administration overhead and has no conflict detection 
or resolution. Integrating such a setup with Pacemaker only adds to the 
overhead, as the current resource agents only handle a standalone MySQL 
server. Even the LSB script doesn't handle a Multi-Master setup. You'd 
have to write a custom resource agent, and it would probably fit your 
setup and your setup alone, meaning it couldn't be widely used for other 
setups, I know I had to make some modifications to the mysql resource 
agent and those changes were specific to my setup.


MySQL Cluster is a choice, it could be integrated with Pacemaker, 
although I don't actually see the benefits in this case, meaning MySQL 
Cluster would be the database backend, on its own, doing its job, and to 
that backend you could connect from multiple frontends, put a load 
balancer (or two) before the frontends and you've got quite the setup, 
and the frontends and load balancer could be controlled by Pacemaker. 
But MySQL Cluster has its downsides as well, it needs a minimum of 4 
nodes (it could probably work with less but that's the general 
recommendation): 2 data nodes, one SQL node and one management node. The 
SQL and management roles could be collocated on one physical node + 2 
data nodes = 3 nodes.


Anyways, this is just to get a feel for what's involved in the process, 
and how Pacemaker would fit the picture, at least from my point of view.


I would recommend all questions related to MySQL Cluster, Replication, 
Multi-Master be directed to the appropriate mailing lists though, and if 
you want to write a resource agent for a Multi-Master setup, by all 
means, do share :)


Hope this helps.

Regards,
Dan
  

>> - MySQL Master with a DRBD backend (even MySQL docs recommend this type
>> of setup for some use cases) in which the MySQL instance runs only where
>> DRBD is primary
> I think I know this setup and don't want it now.
>
>> - MySQL Cluster (nothing to do with Pacemaker, although they can be put
>> together in a setup)
> This would be the next test if I have enough time.
>
> TIA,
> Ruzsi



--
Dan FRINCU
Systems Engineer
CCNA, RHCE
Streamwide Romania



Re: [Pacemaker] Pacemaker-1.1.4, when?

2010-11-12 Thread nozawat
Hi Andrew,

 It is very good.
 I attach a patch for the time being.

Regards
Tomo

2010/11/12 Andrew Beekhof 

> Ah, silly me.
>
> After
> +iter->offset = 0;
>
> you need:
> +iter->values = NULL;
>
>
> On Fri, Nov 12, 2010 at 2:07 AM, nozawat  wrote:
> > Hi Andrew,
> >
> >  I show below a result of print.
> >
> > 1)print *iter
> > (gdb) print *iter
> > $1 = {offset = 2, hash = 0x12aa7ec0, values = 0x7fff1568e580}
> > -
> >
> > 2)print *values
> > (gdb) print *values
> > $2 = {data = 0x7fff1568e5c0, next = 0x2b247ebb85a1, prev = 0x1}
> > -
> >
> > Regards,
> > Tomo
> >
> > 2010/11/12 Andrew Beekhof 
> >>
> >> On Thu, Nov 11, 2010 at 3:50 PM, nozawat  wrote:
> >> > Hi Andrew
> >> >
> >> >  Sorry,pengine output a core.
> >>
> >> could you go up to frame #1 and run:
> >> print *iter
> >> print *values
> >>
> >> >
> >> > -
> >> > gdb) where
> >> > #0  0x2b247fa8b53a in g_list_nth_data () from
> >> > /lib64/libglib-2.0.so.0
> >> > #1  0x2b247ebc5027 in g_hash_table_iter_next (iter=0x7fff1568e4c0,
> >> > key=0x0, value=0x7fff1568e4e0)
> >> > at ../include/crm/common/util.h:
> >> > 348
> >> > #2  0x2b247ebc9301 in native_rsc_location (rsc=0x12aa9cc0,
> >> > constraint=0x12af5480) at native.c:1215
> >> > #3  0x2b247ebcf56c in group_rsc_location (rsc=0x12aa9cc0,
> >> > constraint=0x12af5480) at group.c:421
> >> > #4  0x2b247ebb85a1 in apply_placement_constraints
> >> > (data_set=0x7fff1568e6b0) at allocate.c:523
> >> > #5  0x2b247ebb96f6 in stage2 (data_set=0x7fff1568e6b0) at
> >> > allocate.c:872
> >> > #6  0x2b247ebb6754 in do_calculations (data_set=0x7fff1568e6b0,
> >> > xml_input=0x1295ec90, now=0x0)
> >> > at pengine.c:262
> >> > #7  0x2b247ebb5d3e in process_pe_message (msg=0x12941e60,
> >> > xml_data=0x1295a610, sender=0x12940ac0)
> >> > at pengine.c:124
> >> > #8  0x00401265 in pe_msg_callback (client=0x12940ac0,
> >> > user_data=0x0)
> >> > at main.c:60
> >> > #9  0x2b247f634b97 in G_CH_dispatch_int (source=0x1293fd80,
> >> > callback=0,
> >> > user_data=0x0) at GSource.c:637
> >> > #10 0x2b247fa8ddb4 in g_main_context_dispatch () from
> >> > /lib64/libglib-2.0.so.0
> >> > #11 0x2b247fa90c0d in ?? () from /lib64/libglib-2.0.so.0
> >> > #12 0x2b247fa90f1a in g_main_loop_run () from
> >> > /lib64/libglib-2.0.so.0
> >> > #13 0x0040186f in main (argc=1, argv=0x7fff1568eb48) at
> >> > main.c:177
> >> > (gdb)
> >> >
> >> > --
> >> >
> >> > Regards,
> >> > Tomo
> >> >
> >> >
> >> >
> >> >
> >> > 2010/11/11 Andrew Beekhof 
> >> >>
> >> >> On Thu, Nov 11, 2010 at 12:31 PM, nozawat  wrote:
> >> >> > Hi Andrew,
> >> >> >
> >> >> >  I ran it. However, an error has been output.
> >> >> >  Probably I have a feeling that glib does not move well.
> >> >> >  I attached ha-log.
> >> >> >
> >> >> >  I feel like cannot read a library well.
> >> >> >  It is contents of core as follows.
> >> >>
> >> >> you'll need the debuginfo package installed
> >> >>
> >> >> > 
> >> >> > $ gdb /usr/sbin/corosync core.27920
> >> >> > GNU gdb Fedora (6.8-37.el5)
> >> >> > Copyright (C) 2008 Free Software Foundation, Inc.
> >> >> > License GPLv3+: GNU GPL version 3 or later
> >> >> > 
> >> >> > This is free software: you are free to change and redistribute it.
> >> >> > There is NO WARRANTY, to the extent permitted by law.  Type "show
> >> >> > copying"
> >> >> > and "show warranty" for details.
> >> >> > This GDB was configured as "x86_64-redhat-linux-gnu"...
> >> >> >
> >> >> > warning: core file may not match specified executable file.
> >> >> > Core was generated by `/usr/lib64/heartbeat/pengine'.
> >> >> > Program terminated with signal 11, Segmentation fault.
> >> >> > [New process 27920]
> >> >> > #0  0x2b247fa8b53a in ?? ()
> >> >> > (gdb) where
> >> >> > #0  0x2b247fa8b53a in ?? ()
> >> >> > #1  0x2b247ebc5027 in ?? ()
> >> >> > #2  0x in ?? ()
> >> >> > --
> >> >> >
> >> >> > Regards,
> >> >> > Tomo
> >> >> >
> >> >> > 2010/11/11 Andrew Beekhof 
> >> >> >>
> >> >> >> On Thu, Nov 11, 2010 at 10:26 AM, nozawat 
> wrote:
> >> >> >> > Hi Andrew,
> >> >> >> >
> >> >> >> >  Thanks for a revision.
> >> >> >> >  I confirmed completion of compiling it.
> >> >> >> >  I revised it a little, I attach a patch.
> >> >> >>
> >> >> >> Thanks!  Did you try running it?
> >> >> >>
> >> >> >> >
> >> >> >> > Regards,
> >> >> >> > Tomo
> >> >> >> >
> >> >> >> >
> >> >> >> > 2010/11/11 Andrew Beekhof 
> >> >> >> >>
> >> >> >> >> This might be a little better:
> >> >> >> >>
> >> >> >> >> diff -r dd75da218e4f configure.ac
> >> >> >> >> --- a/configure.ac  Fri Oct 29 12:12:45 2010 +0200
> >> >> >> >> +++ b/configure.ac  Tue Nov 09 13:20:55 2010 +0100
> >> >> >> >> @@ -654,7 +654,7 @@ AC_MSG_RESULT(using $GLIBCONFIG)
> >> >> >> >>
> >> >> >> >>  AC_CHECK_LIB(glib-2.0, g_hash_table_get_values)
> >> >> >> >>  if test "x$ac_cv_lib_glib_2_0_g_hash_table_get_values" !=
> >> >> >> >> x""yes;
>