Re: [Linux-HA] heartbeat - execute a script on a running node when the other node is back?

2009-11-16 Thread Dejan Muhamedagic
Hi,

On Mon, Nov 16, 2009 at 02:39:52PM +0100, Dominik Klein wrote:
> Tomasz Chmielewski wrote:
> > Dejan Muhamedagic wrote:
> >> Hi,
> >>
> >> On Sun, Nov 15, 2009 at 09:09:53PM +0100, Tomasz Chmielewski wrote:
> >>> I have two nodes, node_1 and node_2.
> >>>
> >>> node_2 was down, but is now up.
> >>>
> >>>
> >>> How can I execute a custom script on node_1 when it detects that node_2 
> >>> is back?
> >> That's not possible. What would you want to do with that script?
> > 
> > I have two PostgreSQL servers running; pgpool-ii is started by Heartbeat 
> > to distribute the load (reads) among two servers and to send writes to 
> > both servers.
> > 
> > When one PostgreSQL server fails, the setup will still work fine. When 
> > the failed PostgreSQL instance is back, the data should be first 
> > "synchronized" from the running PostgreSQL server to the server that 
> > had failed.
> > 
> > It is best if such a script could be started by Heartbeat running on the 
> > active node, as soon as it detects that the other node is back.
> 
> If you need such a thing - I'd personally be most comfortable with not
> starting the cluster at boot time. Then you can do whatever you need to
> do and then - when you _know_ everything is right, the script is done
> etc. - start the cluster software.
> 
> Just my personal preference.

It would be mine too. Nothing wrong with having a script for that,
but it is best to run it by hand so that you can check the database.
Besides, your nodes shouldn't be disappearing that often.
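
A minimal sketch of that manual workflow, assuming a hypothetical resync
script at /usr/local/sbin/pg_resync.sh and the usual /etc/init.d/heartbeat
init script. Neither path comes from this thread, so adjust both. The
cluster is started only after the resync succeeds and the operator confirms
that the database looks right.

```python
import subprocess

# Both commands are assumptions for illustration, not taken from the thread.
SYNC_CMD = ["/usr/local/sbin/pg_resync.sh"]      # hypothetical resync script
START_CMD = ["/etc/init.d/heartbeat", "start"]   # typical init script path


def resync_then_start(confirm, run=subprocess.call,
                      sync_cmd=SYNC_CMD, start_cmd=START_CMD):
    """Run the resync, ask the operator, and only then start the cluster.

    `confirm` is a callable that returns True once the operator has
    checked the database. Returns True only if the start command was
    issued and exited with status 0.
    """
    if run(sync_cmd) != 0:
        return False    # resync failed: leave the cluster stopped
    if not confirm():
        return False    # operator is not satisfied yet
    return run(start_cmd) == 0


def ask_operator():
    """Interactive confirmation; anything other than 'y' means no."""
    answer = input("Database checked and consistent? [y/N] ")
    return answer.strip().lower() == "y"
```

On the returning node one would call resync_then_start(ask_operator) by
hand, with Heartbeat left out of the boot sequence.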

Thanks,

Dejan

> Regards
> Dominik
___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


[Linux-HA] Announce: LINBIT to assume stewardship for Heartbeat code base, with permission from the Board

2009-11-16 Thread Florian Haas
Dear members of the Linux-HA community,

This is to announce that LINBIT, with the kind permission from the
Linux-HA project board, will act as the "steward" of the Heartbeat
cluster messaging layer code base, from this point forward. This is a
summary of our motivation and plans related to that role.


What does this entail?

- LINBIT will assume responsibility for bug fixes for
the Heartbeat code base, currently hosted at http://hg.linux-ha.org/dev/.

- LINBIT will bundle up the 3.0 beta codebase, make a 3.0 final
release (currently this is planned for the month of January 2010), and
subsequently make bugfix releases as deemed necessary.

- LINBIT will further collaborate with the Pacemaker project to
keep the existing dual-stack capability in Pacemaker.

- LINBIT will continue making the public Mercurial repository
available at the present location (any eventual relocation, if desired
by the Board, would be publicly announced with ample advance notice).

- LINBIT will administer the public mailing lists (linux-ha and
linux-ha-dev) on the servers currently hosting them (again, any eventual
relocation would be publicly announced with ample advance notice).

- LINBIT intends to offer improved documentation for the Heartbeat
messaging layer. This is meant to consolidate the content currently
found on the linux-ha.org wiki site.

- LINBIT intends to offer support services for the Heartbeat/Pacemaker
cluster stack (i.e. the Pacemaker cluster resource manager running on
top of the Heartbeat cluster communication layer).

- LINBIT will continue to respect the Board as the final authority on
matters affecting the project as a whole.


What does this not entail?

- LINBIT has no intention to add significant features to the Heartbeat
code base, or extend its functionality significantly.

- LINBIT has no intention to apply changes to the licensing, development
model, or collaboration model for the Linux-HA code base.

- LINBIT has no intention to establish the Heartbeat code base as a
_long-term_ alternative or competition to the OpenAIS/Corosync cluster
messaging layer. However, we do believe that it is a valid alternative
for the short to mid term, and for some configurations where
OpenAIS/Corosync is currently suffering from some growing pains.

- LINBIT has no intention to support or advocate continued use of
Heartbeat in v1 (haresources) configurations. We will continue to
recommend switching to the Pacemaker cluster stack, now that two
(technically and commercially) supported cluster messaging layers are
available.


At this time, the primary contact in charge of Heartbeat development
matters at LINBIT is Lars Ellenberg; the person in charge of
documentation is myself. The best means of relaying comments and asking
questions continues to be the public mailing list.

We hope that this is a useful service to the Heartbeat user community. I
want to reiterate that we have no intention whatsoever to change the
current, proven, community-centric approach to how the Heartbeat code
base is managed. We continue to welcome, and depend on, community
suggestions, feedback, and collaboration. Heartbeat is a community
project and will remain so. If you have any questions about our
intentions and plans, please post them on this list. If for some reason
you would like to discuss them off-list, please use my direct email
address. Your comments are highly appreciated. Thank you very much!

With best regards on behalf of LINBIT,
Florian





Re: [Linux-HA] How to put minutes in time based rule

2009-11-16 Thread Andrew Beekhof
On Tue, Nov 10, 2009 at 11:50 AM, abhishek agrawal
 wrote:
> I was trying to define a time-based rule to stop a resource during a
> certain period. I want minute-level granularity, e.g. stopped between
> 9:00 and 10:30 daily.
>
> There is no support for minutes in HA for this, so I was trying
> fractional hours as follows:
>
> [XML rule definition not preserved in the archive]
>
> Should it work? Will 10.5 correspond to 10:30?

No. But it would be a worthy feature request in bugzilla.
You could use two expressions (one with operation=gt the other with
lt) to achieve the desired result.
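
The idea of pairing a "greater than" test with a "less than" test can be
illustrated outside the cluster. The sketch below is plain Python, not
Pacemaker rule syntax; the 09:00 and 10:30 bounds come from the question,
and the function name is made up for illustration.

```python
from datetime import time

WINDOW_START = time(9, 0)     # 09:00, from the question
WINDOW_END = time(10, 30)     # 10:30 as hours and minutes, not 10.5


def in_stop_window(now, start=WINDOW_START, end=WINDOW_END):
    """True when `now` is after `start` AND before `end`.

    The two comparisons play the roles of the "gt" and "lt" expressions;
    both must hold for the time to count as inside the window.
    """
    after_start = now > start    # the "gt" half
    before_end = now < end       # the "lt" half
    return after_start and before_end
```

Both comparisons are strict here, so 09:00 and 10:30 themselves fall
outside the window.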


Re: [Linux-HA] Linux HA on Solaris 10

2009-11-16 Thread Andrew Beekhof
On Mon, Nov 16, 2009 at 2:53 PM, Tiaan Wessels  wrote:
> Hi,
>
> I am really interested in getting Linux HA working on a Solaris 10 SPARC
> zone. I have downloaded Heartbeat-STABLE-2-1-STABLE-2.1.4 and executed
> as root  ./ConfigureMe configure and got a long list of output ending in
> the following:
>
> configure: WARNING: The following recommended components noted earlier
> are missing:
>     gnutls/gnutls.h, swig, gnutls/gnutls.h
>    We will continue but you may have lost some non-critical functionality.
> configure: error: The following required components noted earlier are
> missing:
>     glib2-devel
>    Please supply them and try again.
>
> I could not find a glib2-devel package for our platform and installing
> this from source is quite difficult due to administrative and
> configuration management concerns (I have special permission for HA!).
> My question: Is there a command-line option to ConfigureMe or another
> way to disable the component requiring glib2 ?

No, because it's mandatory.
Nothing will compile without it.
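
Before rerunning ConfigureMe, it can help to confirm whether the GLib 2
development files are visible at all. The sketch below assumes pkg-config
is installed and that GLib registers itself under the module name
"glib-2.0"; on Solaris the relevant directory may first need to be added to
PKG_CONFIG_PATH. Whether Heartbeat's configure script performs exactly this
check is not stated in this thread.

```python
import subprocess


def glib2_version(run=subprocess.run):
    """Return the glib-2.0 version reported by pkg-config, or None."""
    try:
        result = run(
            ["pkg-config", "--modversion", "glib-2.0"],
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            universal_newlines=True,
        )
    except OSError:
        return None    # pkg-config itself is missing or not on PATH
    if result.returncode != 0:
        return None    # pkg-config ran but could not find the module
    return result.stdout.strip()
```

A None result means the build would most likely stop at the same
"glib2-devel" error shown above.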

> I cannot run gmake after
> the ConfigureMe completes even though it is stated only as a warning. I
> could not find another way to kick start compiling. Any documents on how
> to compile ? Had a look at Linux-HA but quite scarce on information on
> how to compile from scratch so maybe I can document my findings at the
> end and add to HA docs.
>
> Thanks
> --
> Tiaan Wessels
> Netsys International
> Tel: +27 (0)12 349-2056 (Business)
> +27 (0)12 349-2757 (Facsimile)
> E-mail: ti...@netsys.co.za


[Linux-HA] Linux HA on Solaris 10

2009-11-16 Thread Tiaan Wessels
Hi,

I am really interested in getting Linux HA working on a Solaris 10 SPARC 
zone. I have downloaded Heartbeat-STABLE-2-1-STABLE-2.1.4 and executed 
as root  ./ConfigureMe configure and got a long list of output ending in 
the following:

configure: WARNING: The following recommended components noted earlier 
are missing:
 gnutls/gnutls.h, swig, gnutls/gnutls.h
We will continue but you may have lost some non-critical functionality.
configure: error: The following required components noted earlier are 
missing:
 glib2-devel
Please supply them and try again.

I could not find a glib2-devel package for our platform and installing 
this from source is quite difficult due to administrative and 
configuration management concerns (I have special permission for HA!). 
My question: Is there a command-line option to ConfigureMe or another 
way to disable the component requiring glib2 ? I cannot run gmake after 
the ConfigureMe completes even though it is stated only as a warning. I 
could not find another way to kick start compiling. Any documents on how 
to compile ? Had a look at Linux-HA but quite scarce on information on 
how to compile from scratch so maybe I can document my findings at the 
end and add to HA docs.

Thanks
-- 
Tiaan Wessels
Netsys International
Tel: +27 (0)12 349-2056 (Business)
+27 (0)12 349-2757 (Facsimile)
E-mail: ti...@netsys.co.za


Re: [Linux-HA] heartbeat - execute a script on a running node when the other node is back?

2009-11-16 Thread Dominik Klein
Tomasz Chmielewski wrote:
> Dejan Muhamedagic wrote:
>> Hi,
>>
>> On Sun, Nov 15, 2009 at 09:09:53PM +0100, Tomasz Chmielewski wrote:
>>> I have two nodes, node_1 and node_2.
>>>
>>> node_2 was down, but is now up.
>>>
>>>
>>> How can I execute a custom script on node_1 when it detects that node_2 
>>> is back?
>> That's not possible. What would you want to do with that script?
> 
> I have two PostgreSQL servers running; pgpool-ii is started by Heartbeat 
> to distribute the load (reads) among two servers and to send writes to 
> both servers.
> 
> When one PostgreSQL server fails, the setup will still work fine. When 
> the failed PostgreSQL instance is back, the data should be first 
> "synchronized" from the running PostgreSQL server to the server that 
> had failed.
> 
> It is best if such a script could be started by Heartbeat running on the 
> active node, as soon as it detects that the other node is back.

If you need such a thing - I'd personally be most comfortable with not
starting the cluster at boot time. Then you can do whatever you need to
do and then - when you _know_ everything is right, the script is done
etc. - start the cluster software.

Just my personal preference.

Regards
Dominik


Re: [Linux-HA] heartbeat - execute a script on a running node when the other node is back?

2009-11-16 Thread Tomasz Chmielewski
Dejan Muhamedagic wrote:
> Hi,
> 
> On Sun, Nov 15, 2009 at 09:09:53PM +0100, Tomasz Chmielewski wrote:
>> I have two nodes, node_1 and node_2.
>>
>> node_2 was down, but is now up.
>>
>>
>> How can I execute a custom script on node_1 when it detects that node_2 
>> is back?
> 
> That's not possible. What would you want to do with that script?

I have two PostgreSQL servers running; pgpool-ii is started by Heartbeat 
to distribute the load (reads) among two servers and to send writes to 
both servers.

When one PostgreSQL server fails, the setup will still work fine. When 
the failed PostgreSQL instance is back, the data should be first 
"synchronized" from the running PostgreSQL server to the server that 
had failed.

It is best if such a script could be started by Heartbeat running on the 
active node, as soon as it detects that the other node is back.


-- 
Tomasz Chmielewski
http://wpkg.org


Re: [Linux-HA] heartbeat - execute a script on a running node when the other node is back?

2009-11-16 Thread Tobias Appel
On 11/15/2009 09:09 PM, Tomasz Chmielewski wrote:
> I have two nodes, node_1 and node_2.
>
> node_2 was down, but is now up.
>
>
> How can I execute a custom script on node_1 when it detects that node_2
> is back?
>
>

This is a little off-topic for the Heartbeat list, I guess, but we use 
Nagios to monitor our Heartbeat clusters, and you can use Nagios event 
handlers to execute scripts when there is a state change.

I don't think that you would want to install nagios just for this, but 
in case you are running Nagios or something similar already you might 
try this approach.
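
A rough sketch of such an event handler, assuming the Nagios command
definition passes the $HOSTSTATE$, $HOSTSTATETYPE$ and $HOSTATTEMPT$ macros
as arguments, and that a hypothetical /usr/local/sbin/pg_resync.sh does the
actual work. It acts only when the host is reported UP in a HARD state,
i.e. after Nagios has confirmed the recovery. Note that the handler runs on
the Nagios host, so reaching the database node from there is left out.

```python
import subprocess

# Hypothetical resync script; not part of Heartbeat or Nagios.
SYNC_CMD = ["/usr/local/sbin/pg_resync.sh"]


def should_resync(host_state, state_type):
    """Act only on a confirmed (HARD) transition to UP."""
    return host_state == "UP" and state_type == "HARD"


def handle_event(argv, run=subprocess.call):
    """argv is [host_state, state_type, attempt]; returns an exit status."""
    if len(argv) < 2:
        return 2    # called with too few arguments
    host_state, state_type = argv[0], argv[1]
    if not should_resync(host_state, state_type):
        return 0    # nothing to do for this event
    return run(SYNC_CMD)
```

As a standalone script it would end with
sys.exit(handle_event(sys.argv[1:])) after importing sys.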

Bye,
Tobi


Re: [Linux-HA] Lot of core dumps found - should I worry about it?

2009-11-16 Thread Tobias Appel
On 11/16/2009 11:27 AM, Dejan Muhamedagic wrote:

>>
>> Can I stop Heartbeat from creating those?
>
> Yes. Add "coredumps false" to ha.cf. Though if something really
> goes wrong and you don't have a coredump we'd probably ask you to
> reproduce :)
>
> Thanks,
>
> Dejan

Thanks, that's fine by me, since for the moment I just have to make sure 
that the root partition doesn't fill up.



Re: [Linux-HA] heartbeat - execute a script on a running node when the other node is back?

2009-11-16 Thread Dejan Muhamedagic
Hi,

On Sun, Nov 15, 2009 at 09:09:53PM +0100, Tomasz Chmielewski wrote:
> I have two nodes, node_1 and node_2.
> 
> node_2 was down, but is now up.
> 
> 
> How can I execute a custom script on node_1 when it detects that node_2 
> is back?

That's not possible. What would you want to do with that script?

Thanks,

Dejan

> 
> -- 
> Tomasz Chmielewski
> http://wpkg.org


Re: [Linux-HA] Lot of core dumps found - should I worry about it?

2009-11-16 Thread Dejan Muhamedagic
Hi,

On Mon, Nov 16, 2009 at 11:12:51AM +0100, Tobias Appel wrote:
> Hi,
> 
> well Nagios informed me today that the root partition of my Heartbeat 
> Cluster is getting full. After a short investigation I found out that 
> this directory is over 2 GB in size:
> 
> /var/lib/heartbeat/cores/root/
> 
> Over 250 of those files were in there:
> 
> -rw---  1 root   root   8228864 Nov 16 11:08 core.8251
> 
> Heartbeat runs fine and stable though. I know that one of the two 
> Ethernet Interfaces I use for hb (eth1 and eth3) crashes a lot due to a 
> driver error (problem with SUN / NVIDIA and RedHat, no fix yet) and I 
> suppose that's why there is a core dump - because Heartbeat knows that 
> the link is down.
> Other than that I don't think that anything is wrong.

If you're really sure about that ... BTW, if there are core dumps
which don't result from ABORT (signal 6), then it would be good
to see backtraces.

> Also those core dumps happen only on the active node in our two-node 
> cluster. None are on the passive node.
> 
> Can I stop Heartbeat from creating those?

Yes. Add "coredumps false" to ha.cf. Though if something really
goes wrong and you don't have a coredump we'd probably ask you to
reproduce :)
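
If the dumps should stay enabled for debugging, an alternative is to keep
only the most recent ones. The sketch below is a housekeeping helper built
on assumptions, not a Heartbeat feature: it looks for files named core.* in
the directory mentioned above and removes all but the newest `keep` files.

```python
import os

CORE_DIR = "/var/lib/heartbeat/cores/root"   # directory named in the thread


def prune_cores(directory=CORE_DIR, keep=5, dry_run=True):
    """Keep the `keep` newest core.* files; return the paths (to be) removed."""
    cores = [
        os.path.join(directory, name)
        for name in os.listdir(directory)
        if name.startswith("core.")
    ]
    # Newest first, by modification time.
    cores.sort(key=os.path.getmtime, reverse=True)
    doomed = cores[keep:]
    if not dry_run:
        for path in doomed:
            os.remove(path)
    return doomed
```

With the default dry_run=True nothing is deleted; the returned list only
shows what would go.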

Thanks,

Dejan

> Thanks in advance,
> Tobi


[Linux-HA] Lot of core dumps found - should I worry about it?

2009-11-16 Thread Tobias Appel
Hi,

well Nagios informed me today that the root partition of my Heartbeat 
Cluster is getting full. After a short investigation I found out that 
this directory is over 2 GB in size:

/var/lib/heartbeat/cores/root/

Over 250 of those files were in there:

-rw---  1 root   root   8228864 Nov 16 11:08 core.8251

Heartbeat runs fine and stable though. I know that one of the two 
Ethernet Interfaces I use for hb (eth1 and eth3) crashes a lot due to a 
driver error (problem with SUN / NVIDIA and RedHat, no fix yet) and I 
suppose that's why there is a core dump - because Heartbeat knows that 
the link is down.
Other than that I don't think that anything is wrong.

Also those core dumps happen only on the active node in our two-node 
cluster. None are on the passive node.

Can I stop Heartbeat from creating those?

Thanks in advance,
Tobi


[Linux-HA] DRBD Management Console 0.4.2

2009-11-16 Thread Rasto Levrinc
Hi,

This is the next DRBD-MC beta release 0.4.2. DRBD-MC is a Java GUI that
helps to configure DRBD/Pacemaker/Corosync/Openais/Heartbeat clusters.
Some people are concerned that Java is too slow for GUIs and that was true
years ago, but now at last it's Java GUI time!

Now let's see what is new in this release. As a nod to the bleeding-edge
people, DRBD-MC now works with Pacemaker 1.0.6 and Corosync. New
Clusterlabs repositories were added as an installation option, so that you
can install the latest Pacemaker releases. If Clusterlabs repositories are
still too stale for you, it is possible to install from the GUI the latest
clone from Pacemaker Mercurial, making the GUI only one step behind the
last commit from Beekhof. The goal is to get ahead and let them catch up
on DRBD-MC.

Thanks to the feedback of the sysadmins out there some issues with disks
not being recognized and some other misconceptions were fixed. They also
found even more bugs, while clicking on things I would not dare to click.

In the meantime the brand-new DRBD 8.3.6 was released and they finally got
a configure script together, unfortunately breaking the DRBD installation
from the GUI in the process. Needless to say, it is fixed now.

Another new addition in this release is support for cloned groups. It
still needs some work to make it easier and more intuitive to configure
but at least it is now possible to enter and operate it.

See some new features in the screenshot:

http://oss.linbit.com/drbd-mc/img/drbd-mc-0.4.2.png

You can get DRBD-MC here:

http://www.drbd.org/mc/management-console/
http://oss.linbit.com/drbd-mc/drbd-mc-0.4.2.tar.gz
http://oss.linbit.com/drbd-mc/DMC-0.4.2.jar

You can start it with the help of Java Web Start, or you can download it
and start it with the "java -jar DMC-0.4.2.jar" command.

Here is the almost complete changelog:

* offline/online detection for nodes was fixed
* bogus "status failed" message while loading the cluster was removed
* null pointer exception when clicking on host that is not connected was
fixed
* DRBD 8.3.6 installation was fixed
* freezes when adding a dependent group were fixed
* block devices that do not have partitions are shown now
* md_d device names for sw raids are allowed now
* support for cloned groups was implemented
* advanced mode check box is now global
* Pacemaker installation fixes
* auto option for DRBD configuration was added
* startup when there are no DRBD resources was fixed
* install from source option for Pacemaker was added
* versions of installed components on every node are now shown in the node
tooltips

Rasto Levrinc

-- 
: Dipl-Ing Rastislav Levrinc
: DRBD-MC http://www.drbd.org/mc/management-console/
: DRBD/HA support and consulting http://www.linbit.com/
DRBD(R) and LINBIT(R) are registered trademarks of LINBIT, Austria.

