Re: [Linux-HA] mysql drbd and SAN all together
Indeed, I can confirm that for the stable versions. The MySQL Cluster (NDB) engine provides great performance, but every node must have enough memory to cope with the current database size. As long as you're willing to go for a small, lean, mean database, that's fine. However, the more recent bleeding-edge versions (as of 5.1.6) do have disk data support; it depends on your environment whether you'd like to have a go at 5.1. There are always the more proven, mature engines to turn to. On 4/22/07, Eddie C <[EMAIL PROTECTED]> wrote: MySQL Cluster is an in-memory database. Your database size is limited by your RAM, if I read the documentation properly. Not useful for anything I am doing. ___ Linux-HA mailing list Linux-HA@lists.linux-ha.org http://lists.linux-ha.org/mailman/listinfo/linux-ha See also: http://linux-ha.org/ReportingProblems
Re: [Linux-HA] heartbeat-2.0.8: load balancing
On Sat, Apr 21, 2007 at 07:48:09PM -0600, Alan Robertson wrote: > Gerry Reno wrote: > > I have a virtual IP resource that I'm making available via heartbeat and > > I am controlling this via the GUI. Now I want to add ldirectord load > > balancing for a service on three machines. How can this be added? > > ldirectord is installed but there is no config file. How do I see and > > control these ldirectord load balanced resources in the GUI? Or are > > they not manageable via heartbeat and GUI? > > At this point in time, the load balancer infrastructure doesn't > integrate with the heartbeat infrastructure beyond being able to keep > the load balancer running. > > Sorry :-( > > I can see that being a nice thing to do, though... Is the answer that ldirectord needs to be extended so that the GUI knows how to configure it? If so I am (as always) happy to consider patches. -- Horms H: http://www.vergenet.net/~horms/ W: http://www.valinux.co.jp/en/ ___ Linux-HA mailing list Linux-HA@lists.linux-ha.org http://lists.linux-ha.org/mailman/listinfo/linux-ha See also: http://linux-ha.org/ReportingProblems
Re: [Linux-HA] standalone pingd.sh
On Fri, 20 Apr 2007, Alan Robertson wrote: > David Lee wrote: > > On Thu, 19 Apr 2007, Xinwei Hu wrote: > > > >> [David Lee had earlier written:] > >>> 5. "ping -q -c 1 $ping_host". The options for "ping" are notoriously > >>> variable from system to system. Keep it simple. (For example my system > >>> doesn't have a "-q" option; and it says that "-c " is for a thing > >>> called "traffic class", only valid on IPv6.) If they are not necessary, > >>> leave them out. If they are necessary, then for those of us who come > >>> along later to maintain code, especially on other operating systems, it is > >>> worth adding comments about your intentions, such as: > >>># -q: to do foo > >>># -c to do bar > >> Here on my system: > >>-c count > >> Stop after sending count ECHO_REQUEST packets. With > >> deadline > >> option, ping waits for count ECHO_REPLY packets, until the > >> time‐ > >> out expires. > >>-q Quiet output. Nothing is displayed except the summary lines > >> at > >> startup time and when finished. > > > > "ping" on Linux and Solaris (to name two of our OSes) seem incompatible in > > their options. > > > >> -q can be removed as we did ">/dev/null 2>&1" already. > > > > Yes. The ">/dev/null 2>&1" method is the way to go to suppress output > > across a range of OSes > > > >> -c is used so that ping won't last forever. > > > > On Solaris: "ping hostname [data_size ] [ count ]" > > > > In practice, it seems that "ping hostname number" also causes a swift > > return for non-replying hosts. > > > > See "resources/heartbeat/IPaddr.in" and "resources/OCF/IPaddr.in" which > > tryi to do the right thing according to which OS they are running on. > > > > So it might be worth us trying to develop our own "ping-wrapper" command > > with a fixed, portable, interface, whose contents are based on those in > > those other two files, and which they would then use, and which your new > > "pingd" could also use. > [...] 
> Of course, we already have portable ping code in C in our base. The > best way to do this portably is probably to use that code, which is > guaranteed not to change without us knowing about it... Alan: The context of this discussion is for callers that are shell-scripts (the proposed "pingd.sh"; also noting two "IPaddr.in") rather than C code. The principle of calling the system "ping" is clean and simple -- far more so (isn't it?) than having to (re-)write a ping-like command to call the C code in our base. The current problem is simply that the system "ping" in different OSes has different options, and the current discussion is about how to handle that. So I'm wondering whether, in the case of script-based (not C) callers, we could simply have a shell function (e.g. "pingfn") with a fixed interface acting as a wrapper to the system-shell ping command and handling all the relevant OS incompatibilities. -- : David LeeI.T. Service : : Senior Systems ProgrammerComputer Centre : : UNIX Team Leader Durham University : : South Road: : http://www.dur.ac.uk/t.d.lee/Durham DH1 3LE: : Phone: +44 191 334 2752 U.K. : ___ Linux-HA mailing list Linux-HA@lists.linux-ha.org http://lists.linux-ha.org/mailman/listinfo/linux-ha See also: http://linux-ha.org/ReportingProblems
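David's proposed "pingfn" shell function might look something like the sketch below. The per-OS option choices are assumptions drawn from this thread (Solaris taking a bare count argument, Linux/iputils using -c), not a tested portability matrix.

```shell
# Sketch of the proposed "pingfn" wrapper: one fixed interface,
# pingfn <host> [count], hiding per-OS ping option differences.
# The OS cases are illustrative assumptions from this thread.
pingfn() {
    host=$1
    count=${2:-1}
    case `uname -s` in
        SunOS)
            # Solaris form noted above: ping hostname [count]
            ping "$host" "$count" >/dev/null 2>&1 ;;
        *)
            # Linux/iputils and most BSDs: -c limits the packet count
            ping -c "$count" "$host" >/dev/null 2>&1 ;;
    esac
}
```

Output is discarded, so callers only see the exit code, which is the one thing that is reasonably portable across ping implementations.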
Re: [Linux-HA] standalone pingd.sh
David Lee wrote: On Fri, 20 Apr 2007, Alan Robertson wrote: Of course, we already have portable ping code in C in our base. The best way to do this portably is probably to use that code, which is guaranteed not to change without us knowing about it... Alan: The context of this discussion is for callers that are shell-scripts (the proposed "pingd.sh"; also noting two "IPaddr.in") rather than C code. The principle of calling the system "ping" is clean and simple -- far more so (isn't it?) than having to (re-)write a ping-like command to call the C code in our base. Sadly, no, it isn't. The system pings are wildly incompatible and lacking in useful features. And since the C code already exists (or so I believe, as Alan made the claim), wrapping a main() and getopts() around it is trivial. The current problem is simply that the system "ping" in different OSes has different options, and the current discussion is about how to handle that. So I'm wondering whether, in the case of script-based (not C) callers, we could simply have a shell function (e.g. "pingfn") with a fixed interface acting as a wrapper to the system-shell ping command and handling all the relevant OS incompatibilities. Good luck - mostly, you can't do so and keep useful semantics. How many ICMP ECHO REQUEST packets do you send? How many REPLY packets do you require for it to be "good"? How long do you wait for each packet? How long do you wait between each REQUEST, and does it depend on the timing of the REPLY? I've had to do this for our monitoring system, and ended up writing a wrapper around fping. -- Carson ___ Linux-HA mailing list Linux-HA@lists.linux-ha.org http://lists.linux-ha.org/mailman/listinfo/linux-ha See also: http://linux-ha.org/ReportingProblems
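A wrapper around fping, as Carson describes, sidesteps most of those questions because fping exposes count and timeout directly and prints a per-host summary. The helper below is a hypothetical sketch that only parses such a summary line; the "xmt/rcv/%loss" format is assumed from what `fping -q -c <count>` prints.

```shell
# Hypothetical piece of an fping wrapper. "fping -q -c <count>" is
# assumed to emit per-host summary lines like:
#   10.0.0.1 : xmt/rcv/%loss = 3/3/0%
# alive_from_fping succeeds only if the host lost no packets.
alive_from_fping() {
    # $1 = one fping summary line
    loss=$(printf '%s\n' "$1" | sed -n 's|.*%loss = [0-9]*/[0-9]*/\([0-9]*\)%.*|\1|p')
    [ -n "$loss" ] && [ "$loss" -eq 0 ]
}
```

A stricter policy ("good if at least one reply") would compare the rcv field instead of the loss percentage; that is exactly the kind of semantic choice Carson's questions are about.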
Re: [Linux-HA] heartbeat-2.0.8: load balancing
On Monday, 23 April 2007 at 11:06, Simon Horman wrote: > On Sat, Apr 21, 2007 at 07:48:09PM -0600, Alan Robertson wrote: > > Gerry Reno wrote: > > > I have a virtual IP resource that I'm making available via heartbeat > > > and I am controlling this via the GUI. Now I want to add ldirectord > > > load balancing for a service on three machines. How can this be added? > > > ldirectord is installed but there is no config file. How do I see and > > > control these ldirectord load balanced resources in the GUI? Or are > > > they not manageable via heartbeat and GUI? > > > > At this point in time, the load balancer infrastructure doesn't > > integrate with the heartbeat infrastructure beyond being able to keep > > the load balancer running. > > > > Sorry :-( > > > > I can see that being a nice thing to do, though... > > Is the answer that ldirectord needs to be extended so that > the GUI knows how to configure it? If so I am (as always) > happy to consider patches. Hi, if your platform is Linux and your distro supports the CLUSTERIP target of iptables, I would be glad if you could do some tests on my IPaddr2 resource agent. It does load sharing between several nodes. Please mail me if you want to test it. Status of the script: works for me. -- Dr. Michael Schwartzkopff MultiNET Services GmbH Address: Bretonischer Ring 7; 85630 Grasbrunn; Germany Tel: +49 - 89 - 45 69 11 0 Fax: +49 - 89 - 45 69 11 21 mob: +49 - 174 - 343 28 75 mail: [EMAIL PROTECTED] web: www.multinet.de Registered office: 85630 Grasbrunn Commercial register: Amtsgericht München HRB 114375 Managing directors: Günter Jurgeneit, Hubert Martens --- PGP Fingerprint: F919 3919 FF12 ED5A 2801 DEA6 AA77 57A4 EDD8 979B Skype: misch42 ___ Linux-HA mailing list Linux-HA@lists.linux-ha.org http://lists.linux-ha.org/mailman/listinfo/linux-ha See also: http://linux-ha.org/ReportingProblems
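For readers unfamiliar with the CLUSTERIP target Michael mentions: it lets several nodes share one IP by answering ARP with a shared multicast MAC and hashing which node handles which client. The command below is purely illustrative of that mechanism; every value is made up, and it is not taken from the IPaddr2 agent itself, so check the iptables documentation before trying anything like it.

```shell
# Illustrative only: roughly the kind of rule a CLUSTERIP-based
# load-sharing setup installs (run on each node with its own
# --local-node value). All addresses and values here are invented.
iptables -I INPUT -d 192.168.1.100 -i eth0 -p tcp --dport 3306 \
    -j CLUSTERIP --new --hashmode sourceip \
    --clustermac 01:00:5e:00:00:20 --total-nodes 2 --local-node 1
```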
Re: [Linux-HA] Heartbeat Active-Active Mysql servers
On Mon, Apr 09, 2007 at 11:45:12AM -0400, Benjamin Lawetz wrote: > I'm upgrading my existing mysql replication to new servers and moving up > from heartbeat 1.x to 2.x > I've got the basics up and running and was reading up on the monitoring > feature. I'm a little confused so I just wanted to run this by you guys to > make sure I understood correctly: > > I have 2 mysql servers > Mysql1 192.168.1.4 > Mysql2 192.168.1.5 > I have a heartbeat that controls 2 virtual Ips: 192.168.1.6 (which has a > preffered node of mysql1) and 192.168.1.7 (which has a preffered node of > mysql2) > > I don't want hearbeat to control starting and stopping of mysql (these > services should always be running and are controlled by the OS). That's what heartbeat can do for you as well. I can't imagine a reason to run them outside of the cluster if you want them HA. > But I do want heartbeat to monitor mysql and failover if the monitoring > fails. What would failover consist of then? > The way I understand how to configure this is that I have to add the mysql > ressource to my configuration but disable (or make them return that > everything proceeded correctly) the "start" and "stop" commands on the mysql > script so that start and stop does nothing but monitor functions correctly. > > Is this correct? > > Thanks in advance for your help > > -- > Benjamin > TéliPhone inc. > > > -- > N'envoyé pas de courriel à l'adresse qui suit, sinon vous serez > automatiquement mis sur notre liste noire. > [EMAIL PROTECTED] > Do not send an email to the email above or you will automatically be > blacklisted. > > ___ > Linux-HA mailing list > Linux-HA@lists.linux-ha.org > http://lists.linux-ha.org/mailman/listinfo/linux-ha > See also: http://linux-ha.org/ReportingProblems -- Dejan ___ Linux-HA mailing list Linux-HA@lists.linux-ha.org http://lists.linux-ha.org/mailman/listinfo/linux-ha See also: http://linux-ha.org/ReportingProblems
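What Benjamin describes (start/stop as no-ops, a real monitor) could be sketched as an OCF-style agent like the one below. This is an assumption-laden illustration: the `OCF_RESKEY_pidfile` parameter and the pidfile check are invented for this sketch and are not the real mysql agent's interface.

```shell
# Sketch of Benjamin's approach: start/stop pretend to succeed
# (the OS owns mysqld), while monitor genuinely checks the daemon.
# The pidfile parameter is hypothetical, not the real mysql RA's.
OCF_SUCCESS=0
OCF_NOT_RUNNING=7

mysql_shadow() {
    action=$1
    pidfile=${OCF_RESKEY_pidfile:-/var/run/mysqld/mysqld.pid}
    case "$action" in
        start|stop)
            # No-op: report success and let the init system do the work.
            return $OCF_SUCCESS ;;
        monitor|status)
            # Real check: is the pid in the pidfile alive?
            if [ -f "$pidfile" ] && kill -0 "$(cat "$pidfile")" 2>/dev/null; then
                return $OCF_SUCCESS
            fi
            return $OCF_NOT_RUNNING ;;
    esac
}
```

As Dejan's reply points out, though, letting heartbeat own start/stop outright is simpler and answers the "what would failover consist of?" question cleanly.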
Re: [Linux-HA] status check
On Fri, Apr 20, 2007 at 04:18:24PM +0200, Andrew Beekhof wrote: > On 4/19/07, Alan Robertson <[EMAIL PROTECTED]> wrote: > >Andrew Beekhof wrote: > >> On 4/18/07, Lars Marowsky-Bree <[EMAIL PROTECTED]> wrote: > >>> On 2007-04-17T19:40:13, [EMAIL PROTECTED] wrote: > >>> > >>> > >Easiest way is to model after an existing resource agent, Xen for > >>> > >example. > >>> > I've found the Dummy one a good start in the past. Simple, and shows > >>> > the basic required components. > >>> > >>> Yeah, but that one now calls the ha_pseudo_resource wrappers which isn't > >>> exactly obvious. > >> > >> it never used to and I'd much prefer it didn't for precisely the > >> reason that it was a good template > >> > >> in fact i might just go and revert that particular change now - at > >> least for Dummy > > > >We actually use that resource, and it's not such a good template, IIRC > >for other reasons. > > as the person who wrote this RA, I can tell you it has always had two > purposes (as also mentioned in the commit message). > > 1 - be so insanely simple that it was guaranteed to work (and therefor > good for testing) > 2 - be an appropriate starting point for people writing RAs > (without any extra baggage that would then get copied a million times) > > _please_ do not add features to it, nor refactor it. > I like it exactly how it is. I removed all features on Friday. > >If we want a template, maybe we should just put it in the doc directory > >as a template with lots and lots of comments? > > > >-- > >Alan Robertson <[EMAIL PROTECTED]> > > > >"Openness is the foundation and preservative of friendship... Let me > >claim from you at all times your undisguised opinions." 
- William Wilberforce -- Dejan ___ Linux-HA mailing list Linux-HA@lists.linux-ha.org http://lists.linux-ha.org/mailman/listinfo/linux-ha See also: http://linux-ha.org/ReportingProblems
Re: [Linux-HA] IPv6, service fail on start behaviour
On Thu, Apr 19, 2007 at 12:08:48PM +0200, Andrew Beekhof wrote: > On 4/19/07, Benjamin Watine <[EMAIL PROTECTED]> wrote: > >Andrew Beekhof a écrit : > >> On 4/18/07, Benjamin Watine <[EMAIL PROTECTED]> wrote: > >>> Hi all > >>> > >>> I have two questions about Heartbeat v2 configuration : > >>> > >>> 1. IPv6addr : I've tried to configure virtual IPv6 address for a > >>> resource group. Because I didn't find documentation about this script, I > >>> did it like IPaddr, but it don't seems to be OK. > >>> What are the parameters to use with IPv6addr script ? Is this script > >>> fully OK ? > >> > >> /path/to/IPv6addr meta-data > >> > ># /etc/ha.d/resource.d/IPv6addr meta-data > >usage: /etc/ha.d/resource.d/IPv6addr > >(start|stop|status|usage|meta-data) > > > >That's why I ask. > >"/usr/lib/ocf/resource.d/heartbeat/IPv6addr meta-data" works, thank you > >Should I open a bugzilla ? > > couldn't hurt to None of the heartbeat class agents supports meta-data. Is this usage wrong? > > > > >>> > >>> 2. On a resource group, when a service shut down (manually or because of > >>> fail), heartbeat try to start it again. It's ok, but if the service > >>> have, for any reason, stop after one minute, HB will try to start it > >>> undefinitely... How can I configure HeartBeat in the way it tries to > >>> load a service only 3 times between two reboot, and failover if it can't > >>> run durably the service ? I look after DTD, but didn't find anything > >>> about this. > >> > >> http://www.linux-ha.org/v2/faq/forced_failover > > > >Thanks a lot ! > > > >> > >>> > >>> I hope you understand what I mean :) > >>> > >>> Thanks ! 
> >>> Ben -- Dejan ___ Linux-HA mailing list Linux-HA@lists.linux-ha.org http://lists.linux-ha.org/mailman/listinfo/linux-ha See also: http://linux-ha.org/ReportingProblems
Re: [Linux-HA] Location constraints
Any hint ? Benjamin Watine a écrit : Hi the list I'm trying to set location constraint for 2 resources group, but I don't understand very well how it works. I want to define a prefered node for each group, and tell HeartBeat to move the group on the other node if 3 resources fail (and restart) occurs. So, I defined default-resource-stickiness at 100, default-resource-failure-stickiness at -100, and put a score of 1200 on prefered node, and 1000 for "second" node. ((1200-1000+100)/100 = 3). I'm trying to do this for 2 group. If 3 fails occurs for the resource of a group, all the group have to be moved to the other node. Can I configure group location constraint as for resource ? How can I get group failcount (if it make sense) ? ... and nothing works :p The resource group don't start on the good node, and never failover if I manually stop 3 times a resource of the group. Some light about this location constraints would be greatly appreciated ! cibadmin -Ql attached. Thank you, in advance. Ben ___ Linux-HA mailing list Linux-HA@lists.linux-ha.org http://lists.linux-ha.org/mailman/listinfo/linux-ha See also: http://linux-ha.org/ReportingProblems ___ Linux-HA mailing list Linux-HA@lists.linux-ha.org http://lists.linux-ha.org/mailman/listinfo/linux-ha See also: http://linux-ha.org/ReportingProblems
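Benjamin's arithmetic above can be checked directly. Under the rule his "(1200-1000+100)/100 = 3" implies, a group abandons its preferred node once accumulated failure penalties outweigh its score advantage plus stickiness:

```shell
# Worked check of the score arithmetic from the message above:
#   failcount * |failure_stickiness| >= (pref - other) + stickiness
pref_score=1200         # constraint score on the preferred node
other_score=1000        # constraint score on the second node
stickiness=100          # default-resource-stickiness
failure_penalty=100     # |default-resource-failure-stickiness|

fails_needed=$(( (pref_score - other_score + stickiness) / failure_penalty ))
echo "$fails_needed"    # 3 failures before the group should move
```

If your heartbeat build ships crm_failcount, something like `crm_failcount -G -r <resource>` should show the per-resource failcount actually accumulated; note that failcounts are tracked per resource, not per group, which may be why three manual stops of one member don't move the whole group.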
Re: [Linux-HA] IPv6, service fail on start behaviour
Dejan Muhamedagic a écrit : On Thu, Apr 19, 2007 at 12:08:48PM +0200, Andrew Beekhof wrote: On 4/19/07, Benjamin Watine <[EMAIL PROTECTED]> wrote: Andrew Beekhof a écrit : On 4/18/07, Benjamin Watine <[EMAIL PROTECTED]> wrote: Hi all I have two questions about Heartbeat v2 configuration : 1. IPv6addr : I've tried to configure virtual IPv6 address for a resource group. Because I didn't find documentation about this script, I did it like IPaddr, but it don't seems to be OK. What are the parameters to use with IPv6addr script ? Is this script fully OK ? /path/to/IPv6addr meta-data # /etc/ha.d/resource.d/IPv6addr meta-data usage: /etc/ha.d/resource.d/IPv6addr (start|stop|status|usage|meta-data) That's why I ask. "/usr/lib/ocf/resource.d/heartbeat/IPv6addr meta-data" works, thank you Should I open a bugzilla ? couldn't hurt to What couldn't hurt ? This behaviour or to open a bugzilla ? :) None of the heartbeat class agents supports meta-data. Is this usage wrong? 2. On a resource group, when a service shut down (manually or because of fail), heartbeat try to start it again. It's ok, but if the service have, for any reason, stop after one minute, HB will try to start it undefinitely... How can I configure HeartBeat in the way it tries to load a service only 3 times between two reboot, and failover if it can't run durably the service ? I look after DTD, but didn't find anything about this. http://www.linux-ha.org/v2/faq/forced_failover Thanks a lot ! I hope you understand what I mean :) Thanks ! 
Ben ___ Linux-HA mailing list Linux-HA@lists.linux-ha.org http://lists.linux-ha.org/mailman/listinfo/linux-ha See also: http://linux-ha.org/ReportingProblems
[Linux-HA] Help : Is RHEL Advanced Server 3 Supports Heartbeat 2.0.8 and DRBD?
Hi All, I want to set up MySQL failover (for a single DB failure) with the help of Heartbeat 2.0.8 and DRBD (disk-based replication). Could anybody please advise on the operating system requirements for Heartbeat 2.0.8 and DRBD? I understand that DRBD is bundled with Red Hat Advanced Server 3, which is what I have right now; if it supports Heartbeat 2.0.8, please let me know. I have done a Heartbeat 2.0.8 setup on CentOS 4.3, but the setup did not work perfectly. I have a couple of questions about the Heartbeat setup: 1. What should I watch out for when generating cib.xml using the haresource2cib.py file? (I got many errors here because my virtual IP was not activated by doing this.) 2. Where do I put the MySQL service so Heartbeat can execute it? Right now I have only these two queries, but tomorrow I will get RHEL Advanced Server 3, so I can explain my problems in more detail during the setup. Thanks & Regards, Jugal Shah ___ Linux-HA mailing list Linux-HA@lists.linux-ha.org http://lists.linux-ha.org/mailman/listinfo/linux-ha See also: http://linux-ha.org/ReportingProblems
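One common source of converter errors is a malformed haresources line. Before running the conversion script, it can help to confirm each line has the expected shape: preferred node first, then the resources started left to right. The example line below is hypothetical.

```shell
# Sanity-check the shape of a haresources line before converting it:
# first field = preferred node, remainder = resources (left to right).
# The example line is made up for illustration.
line="db1 192.168.1.6 mysql"
node=${line%% *}          # strip everything after the first space
resources=${line#* }      # strip the first field
echo "node=$node"
echo "resources=$resources"
```

After generating cib.xml, if your build includes crm_verify, running it with `-L -V` against the live CIB is a reasonable way to surface configuration errors before they bite.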
Re: [Linux-HA] IPv6, service fail on start behaviour
On 4/23/07, Dejan Muhamedagic <[EMAIL PROTECTED]> wrote: On Thu, Apr 19, 2007 at 12:08:48PM +0200, Andrew Beekhof wrote: > On 4/19/07, Benjamin Watine <[EMAIL PROTECTED]> wrote: > >Andrew Beekhof a écrit : > >> On 4/18/07, Benjamin Watine <[EMAIL PROTECTED]> wrote: > >>> Hi all > >>> > >>> I have two questions about Heartbeat v2 configuration : > >>> > >>> 1. IPv6addr : I've tried to configure virtual IPv6 address for a > >>> resource group. Because I didn't find documentation about this script, I > >>> did it like IPaddr, but it don't seems to be OK. > >>> What are the parameters to use with IPv6addr script ? Is this script > >>> fully OK ? > >> > >> /path/to/IPv6addr meta-data > >> > ># /etc/ha.d/resource.d/IPv6addr meta-data > >usage: /etc/ha.d/resource.d/IPv6addr > >(start|stop|status|usage|meta-data) > > > >That's why I ask. > >"/usr/lib/ocf/resource.d/heartbeat/IPv6addr meta-data" works, thank you > >Should I open a bugzilla ? > > couldn't hurt to None of the heartbeat class agents supports meta-data. Is this usage wrong? 99% of them just redirect to the OCF agent... so in theory it should work anyway ___ Linux-HA mailing list Linux-HA@lists.linux-ha.org http://lists.linux-ha.org/mailman/listinfo/linux-ha See also: http://linux-ha.org/ReportingProblems
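The redirection Andrew describes amounts to a thin shim: a heartbeat-class script that forwards every action to the OCF agent of the same name. The sketch below is an assumption about that shape, not the actual shipped shims; the overridable `OCF_ROOT` is introduced here only so the sketch can be exercised (real shims would use /usr/lib/ocf directly).

```shell
# Sketch of a heartbeat-class shim forwarding to the OCF agent of the
# same name, as Andrew describes. OCF_ROOT is overridable purely for
# illustration; the real agents hardcode their install path.
run_shim() {
    agent=$1; shift
    ocf_root=${OCF_ROOT:-/usr/lib/ocf}
    target="$ocf_root/resource.d/heartbeat/$agent"
    if [ -x "$target" ]; then
        "$target" "$@"            # pass the action (start/stop/meta-data/...) through
    else
        echo "no OCF agent for $agent" >&2
        return 1
    fi
}
```

Under this scheme, meta-data "should work anyway", because the heartbeat-class path simply relays the request to the OCF agent that does implement it.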
R: [Linux-HA] I: Mysql Ha cluster
Hello, thank you for your reply, but if I type "Mysql meta-data" the output is the following: 1.0 Resource script for MySQL. It manages a MySQL Database instance as an HA resource. MySQL resource agent Configuration file MySQL config Directory containing databases MySQL datadir User running MySQL daemon MySQL user Group running MySQL daemon (for logfile and directory permissions) MySQL group Table to be tested in monitor statement (in . notation) MySQL test table MySQL test user MySQL test user MySQL test user password MySQL test user password If the MySQL database does not exist, it will be created Create the database if it does not exist And there isn't any parameter for the binary location of mysqld. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On behalf of Alan Robertson Sent: Saturday, 14 April 2007 16:37 To: General Linux-HA mailing list Subject: Re: [Linux-HA] I: Mysql Ha cluster Viesti Luca wrote: > > > > I need to create a centralized MySQL cluster with 3 different > mysql. One mysql for the devel instance > One mysql for the test instance and the last for production. > Now i have installed version 2.08 of heartbeat on a sles9 SP3. > > > Can i specify different binaries for starting and stopping mysql when i use the > mysql agent? There is a parameter called 'binary' which you can use to configure that (at least in the version I'm looking at) > I think that i use the OCF agent and after specify 3 different > OCF-resource-name (mysql-dev,..,mysql-prod) > > Can someone help me to configure OCF? Have you read the script? Have you run it and asked for its metadata? /usr/lib/ocf/resource.d/heartbeat/mysql meta-data will tell you all the parameters it supports and what they mean. The output from this is as follows: 1.0 Resource script for MySQL. It manages a MySQL Database instance as an HA resource.
MySQL resource agent Location of the MySQL binary MySQL binary Configuration file MySQL config Directory containing databases MySQL datadir User running MySQL daemon MySQL user Group running MySQL daemon (for logfile and directory permissions) MySQL group Table to be tested in monitor statement (in . notation) MySQL test table MySQL test user MySQL test user MySQL test user password MySQL test user password If the MySQL database does not exist, it will be created Create the database if it does not exist -- Alan Robertson <[EMAIL PROTECTED]> "Openness is the foundation and preservative of friendship... Let me claim from you at all times your undisguised opinions." - William Wilberforce ___ Linux-HA mailing list Linux-HA@lists.linux-ha.org http://lists.linux-ha.org/mailman/listinfo/linux-ha See also: http://linux-ha.org/ReportingProblems The information contained in this electronic message and any attachments (the "Message") is intended for one or more specific individuals or entities, and may be confidential, proprietary, privileged or otherwise protected by law. If you are not the intended recipient, please notify the sender immediately, delete this Message and do not disclose, distribute, or copy it to any third party or otherwise use this Message. Electronic messages are not secure or error free and can contain viruses or may be delayed, and the sender is not liable for any of these occurrences. The sender reserves the right to monitor, record and retain electronic messages. Le informazioni contenute in questo messaggio e gli eventuali allegati (il "Messaggio") si intendono inviate a uno o piu' specifici destinatari. Il contenuto del Messaggio puo' essere confidenziale, riservato e comunque protetto dalla legge applicabile. Se non siete i destinatari del Messaggio, siete pregati di informare immediatamente il mittente, cancellate questo Messaggio, non rivelatelo, non distribuitelo ne' inoltratelo a terzi, non copiatelo ne' fatene alcun uso. 
___ Linux-HA mailing list Linux-HA@lists.linux-ha.org http://lists.linux-ha.org/mailman/listinfo/linux-ha See also: http://linux-ha.org/ReportingProblems
Re: [Linux-HA] nodes stays offline after communication is restored
On Mon, Apr 23, 2007 at 10:27:55AM +0200, Thomas Åkerblom (HF/EBC) wrote: > Hi. > > OS: SLES10 > Linux-HA 2.0.8 > > I have a system with two nodes (HA-1 & HA-2) and one standby (HA-3). > To illustrate my problem I have set up HA to define two alias addresses each > on the two hosts HA-1 & HA-2. > After initiation all is OK and crm_mon on all three nodes shows that all > nodes are online. > When I unplug the network cable to HA-2, HA-3 will take over and also then > all seems OK. > HA-1 and HA-3 are online and HA-2 is offline. > HA-2 is still running but considers HA-1 & HA-3 to be offline, as they are. > The problem starts when I plug the network back to HA-2. > The situation stays, HA-2 is offline to HA-1 & HA-3 and vice versa. > I have a persistent split brain situation. > In the syslog I can see that they recognize each other to be alive but it > just doesn't appear to be good enough. > Is my configuration faulty? > I'm attaching the syslogs from the time when I plugged the cable back and for > a short time after. There is a problem somewhere in CCM, but I couldn't see anything obvious. BTW, you're running a version of heartbeat recently pulled from the dev branch (or you downloaded a compiled package from somewhere) which has lame logging, i.e. all messages are tagged with "logd" which is not very useful. It was me that broke it, but then fixed it on Thursday, so you should either get the newer code or, since this bug is exercised only if ha_logd logs through syslog, change your logging config accordingly. Thanks. Dejan > I also attach the cib.xml and the ha.cf > > > <> <> <> > > <> <> > > Regards > *** Thomas > This communication is confidential and intended solely for the addressee(s). > Any unauthorized review, use, disclosure or distribution is prohibited. If > you believe this message has been sent to you in error, please notify the > sender by replying to this transmission and delete the message without > disclosing it. Thank you. 
> E-mail including attachments is susceptible to data corruption, interruption, unauthorized amendment, tampering and viruses, and we only send and receive e-mails on the basis that we are not liable for any such corruption, interception, amendment, tampering or viruses or any consequences thereof. > Content-Description: cib.xml > [quoted cib.xml attachment omitted: its XML markup was stripped by the list archive] -- Dejan ___ Linux-HA mailing list Linux-HA@lists.linux-ha.org http://lists.linux-ha.org/mailman/listinfo/linux-ha See also: http://linux-ha.org/ReportingProblems
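Dejan's workaround above, changing the logging config so ha_logd does not log through syslog, might look like the ha.cf fragment below. This is a sketch: the directive names are standard ha.cf ones, the paths are illustrative, and whether this actually sidesteps the "logd" tagging bug depends on the build.

```
# ha.cf logging sketch: log to files directly instead of via
# ha_logd/syslog (paths illustrative)
use_logd off
logfacility none
logfile /var/log/ha-log
debugfile /var/log/ha-debug
```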
Re: [Linux-HA] heartbeat-2.0.8: load balancing
Simon Horman wrote: > On Sat, Apr 21, 2007 at 07:48:09PM -0600, Alan Robertson wrote: >> Gerry Reno wrote: >>> I have a virtual IP resource that I'm making available via heartbeat and >>> I am controlling this via the GUI. Now I want to add ldirectord load >>> balancing for a service on three machines. How can this be added? >>> ldirectord is installed but there is no config file. How do I see and >>> control these ldirectord load balanced resources in the GUI? Or are >>> they not manageable via heartbeat and GUI? >> At this point in time, the load balancer infrastructure doesn't >> integrate with the heartbeat infrastructure beyond being able to keep >> the load balancer running. >> >> Sorry :-( >> >> I can see that being a nice thing to do, though... > > Is the answer that ldirectord needs to be extended so that > the GUI knows how to configure it? If so I am (as always) > happy to consider patches. I think so, but also of course, the GUI would need work as well. It just seems like a nice thing to think about. -- Alan Robertson <[EMAIL PROTECTED]> "Openness is the foundation and preservative of friendship... Let me claim from you at all times your undisguised opinions." - William Wilberforce ___ Linux-HA mailing list Linux-HA@lists.linux-ha.org http://lists.linux-ha.org/mailman/listinfo/linux-ha See also: http://linux-ha.org/ReportingProblems
Re: [Linux-HA] ipfail not failing over when ping nodes are unpingable
On Fri, 2007-04-20 at 13:55 -0600, Alan Robertson wrote: > Faisal Shaikh wrote: > > Hi all, > > > > Im having trouble with setting up a pair of machines with a single > > resource (an IP address) to failover between them. > > The machines are Sun Netras (T1 105) running Gentoo Linux. > > > > The scenario is as follows: > > fw1: (primary resource holder) > > eth0: 192.168.1.52 > > eth1: 10.0.0.2 > > > > fw3: (secondary resource holder) > > eth0: 192.168.1.60 > > eth1: 10.0.0.1 > > > > > > eth1 is used as a private network between these two machines for the > > heartbeat. (I havn't got the correct cable type to use the serial > > connection for the heartbeat.) > > > > > > The IP address fails over correctly in the following cases: > > > > 1. When I switch off the primary resource holder. > > 2. When I stop heartbeat on the primary resource holder. > > > > However, if I disconnect the primary resource holder from the network so > > that it cant ping the ping nodes, the IP address does not fail over to > > the secondary resource holder. > > > > After disconnecting the cable, The log entries in the primary resource > > holder is as follows: > > > > Apr 20 19:47:03 fw1 heartbeat: [4232]: WARN: node 192.168.1.2: is dead > > Apr 20 19:47:03 fw1 heartbeat: [4232]: WARN: node 192.168.1.3: is dead > > Apr 20 19:47:03 fw1 heartbeat: [4232]: debug: StartNextRemoteRscReq(): > > child count 1 > > Apr 20 19:47:03 fw1 heartbeat: [4232]: info: Link > > 192.168.1.2:192.168.1.2 dead. > > Apr 20 19:47:03 fw1 heartbeat: [4232]: info: Link > > 192.168.1.3:192.168.1.3 dead. > > Apr 20 19:47:03 fw1 heartbeat: [4496]: debug: notify_world: setting > > SIGCHLD Handler to SIG_DFL > > Apr 20 19:47:03 fw1 harc[4496]: info: Running /etc/ha.d/rc.d/status > > status > > Apr 20 19:47:03 fw1 heartbeat: [4512]: debug: notify_world: setting > > SIGCHLD Handler to SIG_DFL > > Apr 20 19:47:03 fw1 harc[4512]: info: Running /etc/ha.d/rc.d/status > > status > > > > > > And it stays there doing nothing. 
> > My ha.cf file is as follows:
> >
> > ucast eth1 10.0.0.1
> > logfile /var/log/ha-log
> > debugfile /var/log/ha-debug
> > keepalive 2
> > warntime 10
> > deadtime 30
> > initdead 120
> > baud 19200
> > udpport 694
> > auto_failback on
> > node fw1
> > node fw3
> >
> > respawn hacluster /usr/lib/heartbeat/ipfail
> > ping 192.168.1.2 192.168.1.3
> > crm off
> >
> > My haresources file is:
> > fw1 192.168.1.100/32/192.168.1.255
> >
> > I'd appreciate it greatly if someone could point me in the right
> > direction please.
>
> You need redundant communication for ipfail to work.
>
> You see, ipfail will only fail over if the two nodes can communicate
> with each other and agree to move things around.
>
> What you've done is created a split-brain, where each node thinks
> the other is dead. If the other is dead, who can take over?

Hi Alan,

Many thanks for your quick reply! I thought that I did have a redundant communication link. The service IP was on eth0 (the 192.168.1.0 network) while the heartbeat link was on eth1 (the 10.0.0.0 network). After pulling the network cable on eth0, I could still see the heartbeat packets and their replies on eth1 using tcpdump.

After pulling the network cable on eth0, if I stop heartbeat on the primary, the secondary takes over with no problems. This would indicate that the servers are able to communicate, but the handover doesn't occur if a NIC/cable goes down on the primary.

I'm going to try to obtain a null modem adaptor for a serial cable that I'll run between the two Netras.

Regards,
Faisal
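Alan's point about redundant communication amounts to listing more than one path in ha.cf. A minimal sketch of what fw1's file might look like with a second unicast path and the planned serial link added (the eth0 line and the serial device name are illustrative assumptions, not from the thread):

```
# /etc/ha.d/ha.cf on fw1 -- sketch only, not the poster's actual file
ucast eth1 10.0.0.1          # existing dedicated crossover link to fw3
ucast eth0 192.168.1.60      # assumed: second path over the service LAN
serial /dev/ttyS0            # assumed: null-modem cable, once obtained
baud 19200
```

Redundant paths let the nodes keep talking when one link fails, which is what ipfail needs in order to negotiate a takeover rather than have each node declare the other dead.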
Re: [Linux-HA] ipfail not failing over when ping nodes are unpingable
Both cards on each machine (Sun Netras) have the same MAC address. Can this be a cause of the problem?

fw1 ha.d # lspci | grep Ethernet
01:01.1 Ethernet controller: Sun Microsystems Computer Corp. Happy Meal (rev 01)
01:03.1 Ethernet controller: Sun Microsystems Computer Corp. Happy Meal (rev 01)
fw1 ha.d # dmesg | grep Ethernet
Ethernet address: 08:00:20:c2:d5:3e
eth0: HAPPY MEAL (PCI/CheerIO) 10/100BaseT Ethernet 08:00:20:c2:d5:3e
eth1: HAPPY MEAL (PCI/CheerIO) 10/100BaseT Ethernet 08:00:20:c2:d5:3e

fw3 ~ # lspci | grep Ethernet
01:01.1 Ethernet controller: Sun Microsystems Computer Corp. Happy Meal (rev 01)
01:03.1 Ethernet controller: Sun Microsystems Computer Corp. Happy Meal (rev 01)
fw3 ~ # dmesg | grep Ethernet
[0.00] Ethernet address: 08:00:20:c2:d3:4e
[ 16.613126] eth0: HAPPY MEAL (PCI/CheerIO) 10/100BaseT Ethernet 08:00:20:c2:d3:4e
[ 16.714221] eth1: HAPPY MEAL (PCI/CheerIO) 10/100BaseT Ethernet 08:00:20:c2:d3:4e

Faisal
Re: [Linux-HA] STONITH in response to stop failures (suicide or ssh)
I think I want the same functionality as Christopher wants:

* when a resource on a node goes into FAILED state, reboot the machine (currently we have no STONITH device - I know it is insecure, but I have to use what I have)

Heartbeat version 2.0.8

Situation:
* 2 node cluster
* dummy-resource provided by heartbeat runs on management2
* DC is management1

Actions:
* touch /tmp/Dummy.monitor /tmp/Dummy.stop --> in this way the monitor and stop operations fail

Afterwards:
* dummy-resource does not run anywhere
* stonithd seems to core dump
* reboot of management2 failed (?? may this be because stonithd core dumps?)
* "/etc/init.d/heartbeat stop" on management2 hangs forever

Attached are the CIB, the pe-warn* files from management1, and the ha-log of both machines. Am I doing something wrong with the stonith device?

On Tuesday 17 April 2007 15:07, Dave Blaschke wrote:
> Christophe Zwecker wrote:
> > Dave Blaschke wrote:
> >> Christophe Zwecker wrote:
> >>> Dave Blaschke wrote:
> >>>> Christophe Zwecker wrote:
> >>>>> Hi Dave,
> >>>>>
> >>>>> it's this:
> >>>>>
> >>>>> grep mw-test /etc/ha.d/ha.cf
> >>>>> node mw-test-n1.i-dis.net
> >>>>> node mw-test-n2.i-dis.net
> >>>>>
> >>>>> [EMAIL PROTECTED] ~]# uname -n
> >>>>> mw-test-n2.i-dis.net
> >>>> And your cib.xml?
> >>> grep mw-test /var/lib/heartbeat/crm/cib.xml
> >>> id="5b1a3c52-a893-44c5-a9c7-035fc632ff8d">
> >>> id="cc1c8955-58d2-4ee3-8e98-b07599335e0c">
> >>> id="prefered_location_group_1_expr" operation="eq" value="mw-test-n1.i-dis.net"/>
> >> I'd actually like to see the whole thing please...
> > here ya go, sorry for the delay, i was on vacation!
> Ahh, vacation. Okay, envying over... :-)
>
> I don't proclaim to be a R2 config expert, but I'm pretty sure you'll
> need something similar to the following in your CIB to tell heartbeat
> how to STONITH:
>
> provider="heartbeat">
>
> You won't need any attributes for suicide; you'll need a hostlist if you
> choose to use ssh.
> See http://www.linux-ha.org/ConfiguringStonithPlugins for the full XML
> sample.
>
> > thx alot for your input and time
> >
> > Christophe

Attachments (BZip2 compressed data): pe-input-44.bz2, pe-warn-297.bz2, pe-warn-298.bz2, pe-warn-299.bz2, pe-warn-300.bz2, pe-warn-301.bz2, pe-warn-302.bz2, ha-log-mgt2.bz2, ha-log-mgt1.bz2
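The stonith XML in Dave's reply was garbled by the archive. A rough sketch of the kind of ssh stonith clone he seems to describe - all ids and the hostlist value here are illustrative guesses, so treat the ConfiguringStonithPlugins page as authoritative:

```
<clone id="DoFencing">
  <instance_attributes id="DoFencing_ia">
    <attributes>
      <nvpair id="DoFencing_unique" name="globally_unique" value="false"/>
    </attributes>
  </instance_attributes>
  <primitive id="child_DoFencing" class="stonith" type="ssh" provider="heartbeat">
    <instance_attributes id="child_DoFencing_ia">
      <attributes>
        <!-- the ssh plugin needs a hostlist; the suicide plugin takes no attributes -->
        <nvpair id="child_DoFencing_hostlist" name="hostlist" value="management1 management2"/>
      </attributes>
    </instance_attributes>
  </primitive>
</clone>
```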
Re: [Linux-HA] Cannot create group containing drbd using HB GUI
OK, unstuck, and moving forward with a patch from the DRBD email list...

I've got drbd configured in a fairly reliable Master/Slave setup, and I can fail it back and forth between nodes using cibadmin and XML that changes the place constraint from node to node. (Not sure what this means, but when the drbd processes first come up, the GUI indicates one as Master but does not show the other as Slave, only that it is running. When I change the place constraint, Master moves from one node to the other, and then the formerly Master node indicates Slave. From that point on, behavior is as expected.)

Now I've created a group containing only a single Filesystem resource, colocated with the drbd master (based on the previously discussed constraint rules of a -infinity score for existing on a stopped or slave drbd node), and ordered to come up after the drbd master. I'm using target_role to control whether HA starts it or not (one XML file sets target_role to stopped, the other to started).

First question: what is the best way to start and stop resources without using the GUI (in other words, is my use of target_role a good way to control resources)?

Second question: does it make more sense to have target_role defined in the group's instance_attributes or in the instance_attributes within the individual primitive resource?

Thanks,
Doug

On Fri, 2007-04-20 at 14:46 -0400, Doug Knight wrote:
> Well, whatever was stuck, I had to do an rmmod to remove the drbd module
> from the kernel, then modprobe it back in, and the "stuck" Secondary
> indication went away.
>
> Doug
>
> On Fri, 2007-04-20 at 14:30 -0400, Doug Knight wrote:
> > I completely shut down heartbeat on both nodes, cleared out the backup
> > cib.xml files, recopied the cib.xml from the primary node to the
> > secondary, then brought everything back up. This cleared the "diff"
> > error. The drbd master/slave pair came up as expected, but when I tried
> > to stop them, they eventually went into an unmanaged state.
> > Looking at the logs and comparing to the stop function in the OCF
> > script, I noticed that I was seeing a successful "drbdadm down", but the
> > additional check for status after the down was indicating that the down
> > was unsuccessful (from checking drbdadm state). Further, I manually
> > verified that the drbd processes were indeed down, and executed the
> > following:
> >
> > [EMAIL PROTECTED] xml]# /sbin/drbdadm -c /etc/drbd.conf state pgsql
> > Secondary/Unknown
> > [EMAIL PROTECTED] xml]# cat /proc/drbd
> > version: 8.0.1 (api:86/proto:86)
> > SVN Revision: 2784 build by [EMAIL PROTECTED], 2007-04-09 11:30:31
> > 0: cs:Unconfigured
> >
> > It's the same output on either node, and drbd is definitely down on both
> > nodes. So /proc/drbd correctly indicates drbd is down, but the
> > subsequent check using drbdadm state comes back indicating one side is
> > up in Secondary mode, which it's not. This is why the resource is now in
> > unmanaged mode. Any ideas why the two tools would differ?
> >
> > Doug
> >
> > On Fri, 2007-04-20 at 11:35 -0400, Doug Knight wrote:
> > > In the interim, I set the filesystem group to unmanaged to test failing
> > > the drbd master/slave processes back and forth, using the value part
> > > of the place constraint. On my first attempt to switch nodes, it
> > > basically took both drbd processes down, and they stayed down. When I
> > > checked the logs on the node to which I was switching the primary drbd,
> > > I found a message about a failed application of a diff. I switched the
> > > place constraint back to the original node.
> > > I decided to shut down heartbeat on the node where I was seeing the
> > > diff error; now the shutdown is hung and the diff error below is
> > > repeating every minute:
> > >
> > > cib[3040]: 2007/04/20_11:24:52 WARN: cib_process_diff: Diff 0.11.587 -> 0.11.588 not applied to 0.11.593: current "num_updates" is greater than required
> > > cib[3040]: 2007/04/20_11:24:52 WARN: do_cib_notify: cib_apply_diff of FAILED: Application of an update diff failed
> > > cib[3040]: 2007/04/20_11:24:52 WARN: cib_process_request: cib_apply_diff operation failed: Application of an update diff failed
> > > cib[3040]: 2007/04/20_11:24:52 WARN: cib_process_diff: Diff 0.11.588 -> 0.11.589 not applied to 0.11.593: current "num_updates" is greater than required
> > > cib[3040]: 2007/04/20_11:24:52 WARN: do_cib_notify: cib_apply_diff of FAILED: Application of an update diff failed
> > > cib[3040]: 2007/04/20_11:24:52 WARN: cib_process_request: cib_apply_diff operation failed: Application of an update diff failed
> > >
> > > My boss and I are getting rather frustrated trying to get this setup to
> > > work. Is there something obvious I'm missing? Has anyone ever had HA
> > > 2.0.8, using v2 monitoring and the drbd ocf script, and drbd version
> > > 8.0.1 working in a two node cluster? I'm concerned because of the
> > > comment made earlier by Bernhard.
> > >
> > > Doug
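On the target_role question, one common shape in a heartbeat v2 CIB is to hang the attribute off the group's instance_attributes, so a single nvpair flipped between "started" and "stopped" controls all members at once. All ids and resource names below are illustrative, not taken from Doug's cib:

```
<group id="grp_fs">
  <instance_attributes id="grp_fs_ia">
    <attributes>
      <!-- flip the value and push the fragment with cibadmin to start/stop the group -->
      <nvpair id="grp_fs_target_role" name="target_role" value="started"/>
    </attributes>
  </instance_attributes>
  <primitive id="rsc_fs" class="ocf" type="Filesystem" provider="heartbeat"/>
</group>
```

Placing target_role on an individual primitive instead would control just that resource; the group-level placement makes more sense when the members should never be started independently.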
RE: [Linux-HA] Cannot create group containing drbd using HB GUI
I'm also wrangling with this issue (getting the drbd OCF to work in V2, logically grouping master mode with the services that are on it). One thing I've run into so far is that there appear to be some bugs in the drbd ocf script.

1) In do_cmd() it uses "local cmd_out" immediately before taking the result code from $?. This always succeeds (on CentOS 4.4 32 bit anyway). Declaring this local in an earlier line returns the correct return code from the drbdadm command from the function. As this return code is used elsewhere, it helps that failure codes are passed back as intended.

2) There needs to be a wait loop after the module is loaded, same as in the drbd-distributed /etc/init.d/drbd script. I inserted this into drbd_start() (UDEV_TIMEOUT is set in the script header to 10):

# make sure udev has time to create the device files
for RESOURCE in `$DRBDADM sh-resources`; do
    for DEVICE in `$DRBDADM sh-dev $RESOURCE`; do
        UDEV_TIMEOUT_LOCAL=$UDEV_TIMEOUT
        while [ ! -e $DEVICE ] && [ $UDEV_TIMEOUT_LOCAL -gt 0 ]; do
            sleep 1
            UDEV_TIMEOUT_LOCAL=$(( $UDEV_TIMEOUT_LOCAL - 1 ))
        done
    done
done

It takes several seconds after the modload returns for the /dev/drbd0 device to appear - and nothing works until it does.

3) A similar timer is needed in drbd_promote, as drbdadm won't let you "Primary" until the other side is not "Primary". I found that heartbeat was firing off the promote on "b" slightly before the demote on "a", causing a failure. I added this (REMOTE_DEMOTE_TIMEOUT is set in the script header to 10):

drbd_get_status
DEMOTE_TIMEOUT_LOCAL=$REMOTE_DEMOTE_TIMEOUT
while [ "x$DRBD_STATE_REMOTE" = "xPrimary" ] && [ $DEMOTE_TIMEOUT_LOCAL -gt 0 ]; do
    sleep 1
    DEMOTE_TIMEOUT_LOCAL=$(( $DEMOTE_TIMEOUT_LOCAL - 1 ))
    drbd_get_status
done

With these changes I was able to get drbd to start, stop and migrate cleanly when I tweaked the location scores.
Getting the services dependent on that disk to do the same is still an open question :-) My modified drbd ocf script is attached, use at your own risk.

Alastair Young
Director, Operations
Ludi labs
399 West El Camino Real
Mountain View, CA 94040
Email: [EMAIL PROTECTED]
Direct: 650-241-0068
Mobile: 925-784-0812

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Martin Fick
Sent: Thursday, April 19, 2007 1:13 PM
To: General Linux-HA mailing list
Subject: Re: [Linux-HA] Cannot create group containing drbd using HB GUI

Hi Doug,

I personally could not get the DRBD OCF to work; I am using drbd 0.7x, what about you? I never tried a master/slave setup though. I created my own drbd OCF, it is on my site along with the CIB scripts:

http://www.theficks.name/bin/lib/ocf/drbd

You can even use the drbd CIBs as a starting place if you want:

http://www.theficks.name/bin/lib/heartbeat/drbd

I just updated them all (CIBs and OCF agents) if you want to try them out.

-Martin

--- Doug Knight <[EMAIL PROTECTED]> wrote:
> I made the ID change indicated below (for the colocation constraints),
> and everything configured fine using cibadmin. Now, I started JUST the
> drbd master/slave resource, with the rsc_location rule setting the
> expression uname to one of the two nodes in the cluster. Both drbd
> processes come up and sync up the partition, but both are still in
> slave/secondary mode (i.e. the rsc_location rule did not cause a
> promotion). Am I missing something here? This is the rsc_location
> constraint:
>
> score="100">
> attribute="#uname" operation="eq" value="arc-dknightlx"/>
>
> (By the way, the example from the Idioms/MasterConstraints web page does
> not have an ID specified in the expression tag, so I added one to mine.)
>
> Doug
>
> On Thu, 2007-04-19 at 13:04 -0400, Doug Knight wrote:
> > ...
> > > > > > > >> > > > > > For exemple > > > > > rsc="drbd1"> > > > > > score="600"> > > > > > operation="eq" value="nodeA" > > > > > id="pref_drbd1_loc_nodeA_attr"/> > > > > > > > > > > score="800"> > > > > > operation="eq" value="nodeB" > > > > > id="pref_drbd1_loc_nodeB_attr"/> > > > > > > > > > > > > > > > > > > > > In this case, nodeB will be primary for > resource drbd1. Is that what > > > > > > > > > > >> you > > > > > >> > > > > > were looking for ? > > > > > > > > > > >>> Not like this, not when using the drbd > OCF Resource Agent as a > > > > > >>> master-slave one. In that case, you need > to bind the rsc_location to > > > > > >>> > > > > > >> the > > > > > >> > > > > > >>> role=Master as well. > > > > > >>> > > > > > >> I was missing this in the CIB idioms > page. I just added it. > > > > > >> > > > > > >>h
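Alastair's first fix is easy to reproduce outside the agent: in bash, `local` is itself a command that succeeds, so declaring a local between a command and the `$?` check silently discards the exit status. A minimal sketch of the pattern (the function names are hypothetical stand-ins, not the actual do_cmd):

```shell
#!/bin/bash

buggy_do_cmd() {
    false               # stand-in for a failing drbdadm call; $? is now 1
    local cmd_out       # BUG: 'local' runs as a command and resets $? to 0
    return $?           # the failure is lost; always returns 0
}

fixed_do_cmd() {
    local cmd_out       # declare first, as in Alastair's patch
    false               # now $? still holds the command's status...
    return $?           # ...and the failure propagates as intended
}

buggy_do_cmd; echo "buggy: $?"   # prints "buggy: 0"
fixed_do_cmd; echo "fixed: $?"   # prints "fixed: 1"
```

This is why the unmodified script kept reporting success to heartbeat even when drbdadm failed.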
RE: [Linux-HA] Forced umount of DRBD volume
This is what I have in my /etc/init.d/drbdctrl for my HB V1 machines. This almost always seems to work. I make sure that all important services that should be accessing the disk are already dead, but who knows who may be logged in scanning logfiles, etc.

It is important to point fuser -mk at the disk device, not at the mount point. If the device is already dismounted, the latter syntax kills a lot of processes that it shouldn't.

#!/bin/bash
case "$1" in
  start)
    # makes this node the primary node
    /sbin/drbdadm primary all
    mount /dev/drbd0 /service
    ;;
  stop)
    # makes this node the secondary node
    fuser -mk /dev/drbd0
    sleep 2
    umount /service
    /sbin/drbdadm secondary all
    ;;
  *)
    echo "Usage: /etc/init.d/drbdctrl {start|stop}"
    exit 1
    ;;
esac
exit 0

Alastair Young
Director, Operations
Ludi labs
399 West El Camino Real
Mountain View, CA 94040
Email: [EMAIL PROTECTED]
Direct: 650-241-0068
Mobile: 925-784-0812

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Piotr Kaczmarzyk
Sent: Saturday, April 21, 2007 5:30 AM
To: General Linux-HA mailing list
Subject: Re: [Linux-HA] Forced umount of DRBD volume

> It would be best to replace the underlying DRBD with NFS so that the user
> could still operate on his files without logging out, but user processes
> (including bash) would have to be restarted to reopen the files.

I'll answer myself (partially) - first of all, I found the 'Filesystem' OCF script and its use of the fuser command (so the recommended solution is to kill everything); second, I read a note on http://www.linux-ha.org/HaNFS saying that "NFS-mounting any filesystem on your NFS servers is highly discouraged". I don't understand why.
I did that manually and it worked:
- mounted /dev/drbd2 as /usr/local/mysql-drbd
- added IP 10.0.0.1 to eth0
- started NFS
- mounted via NFS 10.0.0.1:/usr/local/mysql-drbd as /usr/local/mysql

Then:
- stopped mysqld (just in case)
- stopped the NFS server
- removed IP 10.0.0.1
- unmounted /dev/drbd2 (only nfsd used it) and set it as secondary

On the second node:
- set /dev/drbd2 as primary
- mounted it as /usr/local/mysql-drbd
- added IP 10.0.0.1 to eth0
- started NFS

And the directory was still accessible from the first node. So what's wrong with such a configuration, and why should it be avoided? It has advantages - users with shell access won't notice that something has changed, postfix will be able to deliver mail queued in the local spool, etc.

Best regards,
Piotr
RE: [Linux-HA] Cannot create group containing drbd using HB GUI
Attached is the cib I am using. By adjusting the scores on the drbd_m_like_ rules I can migrate the drbd master between nodes; the filesystem cleanly dismounts first and remounts on the new master afterwards.

What I also need it to do is migrate the services in response to a failure or other score change of the grp_www group. I've tried many permutations and I can't figure this out. The best I come up with is failure of the rsc_www_fs resource in situ after I manually dismount it a few times. At worst, Bad Things Happen. As best as I can guess, grp_www won't move to the slave node no matter what. Perhaps because of the -INFINITY in the colocation?

What I need is to have the other node become master and then have grp_www start on it. Essentially I need the master state of drbd-ms to effectively be the first member of grp_www. I know that cannot be done overtly, but how does one get that effect? What's the incantation to get the master_slave to change master in response to a failure/score change on a collocated service?

I am running hb2.0.8 on CentOS 4.4 i386 running under vmware. Drbd is v0.7 with the modified/fixed drbd ocf script I posted earlier.

Alastair Young
Director, Operations
Ludi labs
399 West El Camino Real
Mountain View, CA 94040
Email: [EMAIL PROTECTED]
Direct: 650-241-0068
Mobile: 925-784-0812

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Alastair N. Young
Sent: Monday, April 23, 2007 2:19 PM
To: General Linux-HA mailing list
Subject: RE: [Linux-HA] Cannot create group containing drbd using HB GUI

I'm also wrangling with this issue (getting the drbd OCF to work in V2, logically grouping master mode with the services that are on it). One thing I've run into so far is that there appear to be some bugs in the drbd ocf script.

1) In do_cmd() it uses "local cmd_out" immediately before taking the result code from $?. This always succeeds (on CentOS 4.4 32 bit anyway).
Declaring this local in an earlier line returns the correct return code from the drbdadm command from the function. As this return code is used elsewhere, it helps that failure codes are passed back as intended.

2) There needs to be a wait loop after the module is loaded, same as in the drbd-distributed /etc/init.d/drbd script. I inserted this into drbd_start() (UDEV_TIMEOUT is set in the script header to 10):

# make sure udev has time to create the device files
for RESOURCE in `$DRBDADM sh-resources`; do
    for DEVICE in `$DRBDADM sh-dev $RESOURCE`; do
        UDEV_TIMEOUT_LOCAL=$UDEV_TIMEOUT
        while [ ! -e $DEVICE ] && [ $UDEV_TIMEOUT_LOCAL -gt 0 ]; do
            sleep 1
            UDEV_TIMEOUT_LOCAL=$(( $UDEV_TIMEOUT_LOCAL - 1 ))
        done
    done
done

It takes several seconds after the modload returns for the /dev/drbd0 device to appear - and nothing works until it does.

3) A similar timer is needed in drbd_promote, as drbdadm won't let you "Primary" until the other side is not "Primary". I found that heartbeat was firing off the promote on "b" slightly before the demote on "a", causing a failure. I added this (REMOTE_DEMOTE_TIMEOUT is set in the script header to 10):

drbd_get_status
DEMOTE_TIMEOUT_LOCAL=$REMOTE_DEMOTE_TIMEOUT
while [ "x$DRBD_STATE_REMOTE" = "xPrimary" ] && [ $DEMOTE_TIMEOUT_LOCAL -gt 0 ]; do
    sleep 1
    DEMOTE_TIMEOUT_LOCAL=$(( $DEMOTE_TIMEOUT_LOCAL - 1 ))
    drbd_get_status
done

With these changes I was able to get drbd to start, stop and migrate cleanly when I tweaked the location scores. Getting the services dependent on that disk to do the same is still an open question :-) My modified drbd ocf script is attached, use at your own risk.
Alastair Young
Director, Operations
Ludi labs
399 West El Camino Real
Mountain View, CA 94040
Email: [EMAIL PROTECTED]
Direct: 650-241-0068
Mobile: 925-784-0812

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Martin Fick
Sent: Thursday, April 19, 2007 1:13 PM
To: General Linux-HA mailing list
Subject: Re: [Linux-HA] Cannot create group containing drbd using HB GUI

Hi Doug,

I personally could not get the DRBD OCF to work; I am using drbd 0.7x, what about you? I never tried a master/slave setup though. I created my own drbd OCF, it is on my site along with the CIB scripts:

http://www.theficks.name/bin/lib/ocf/drbd

You can even use the drbd CIBs as a starting place if you want:

http://www.theficks.name/bin/lib/heartbeat/drbd

I just updated them all (CIBs and OCF agents) if you want to try them out.

-Martin

--- Doug Knight <[EMAIL PROTECTED]> wrote:
> I made the ID change indicated below (for the colocation constraints),
> and everything configured fine using cibadmin. Now, I started JUST the
> drbd master/slave resource, with the rsc_location rule setting the
> expression uname to one of the two nodes in the cluster. Both dr
Re: [Linux-HA] heartbeat-2.0.8: load balancing
On Mon, Apr 23, 2007 at 09:42:35AM -0600, Alan Robertson wrote:
> Simon Horman wrote:
> > On Sat, Apr 21, 2007 at 07:48:09PM -0600, Alan Robertson wrote:
> >> Gerry Reno wrote:
> >>> I have a virtual IP resource that I'm making available via heartbeat and
> >>> I am controlling this via the GUI. Now I want to add ldirectord load
> >>> balancing for a service on three machines. How can this be added?
> >>> ldirectord is installed but there is no config file. How do I see and
> >>> control these ldirectord load balanced resources in the GUI? Or are
> >>> they not manageable via heartbeat and the GUI?
> >> At this point in time, the load balancer infrastructure doesn't
> >> integrate with the heartbeat infrastructure beyond being able to keep
> >> the load balancer running.
> >>
> >> Sorry :-(
> >>
> >> I can see that being a nice thing to do, though...
> >
> > Is the answer that ldirectord needs to be extended so that
> > the GUI knows how to configure it? If so I am (as always)
> > happy to consider patches.
>
> I think so, but also of course the GUI would need work as well.
>
> It just seems like a nice thing to think about.

Yes, I totally agree.

--
Horms
H: http://www.vergenet.net/~horms/
W: http://www.valinux.co.jp/en/