Re: [Linux-HA] Heartbeat 2.1.3-23.1 (where is the stable version HA/pacemaker)

2008-08-28 Thread Lars Marowsky-Bree
On 2008-08-28T19:50:42, Andrew Beekhof [EMAIL PROTECTED] wrote: that sounds great. Do you know how much these packages differ from the suse-10.2 packages??? different dependencies. not sure i'd mix them personally Not for anything worth running in production, for sure not. I'd honestly

Re: [Linux-HA] Loss of name resolution == loss of cluster?

2008-08-27 Thread Lars Marowsky-Bree
On 2008-08-26T20:10:24, Todd, Conor [EMAIL PROTECTED] wrote: I'm about to switch the primary interface over to static IPs and put the hostname to IP mapping in /etc/hosts so that the cluster nodes will always be able to resolve their hostnames. Static IPs? Were the hostnames periodically

Re: [Linux-HA] Filesystem resource fails to start

2008-08-27 Thread Lars Marowsky-Bree
On 2008-08-26T11:52:35, Matt Zagrabelny [EMAIL PROTECTED] wrote: Set use_logd yes in the ha.cf file and log to syslog; the logs are so much more readable. Out of curiosity, why do you say that using syslog makes the logs so much more readable? Well, of course, the line format is a personal

Re: [Linux-HA] Heartbeat 2.1.3-32.1

2008-08-27 Thread Lars Marowsky-Bree
On 2008-08-27T16:49:23, Mega Mailingliste [EMAIL PROTECTED] wrote: why do you want these specific versions? Because this is one of the versions that runs stably in our environment. The earlier versions got us into trouble with some failovers and other stuff. So why should we use a

Re: [Linux-HA] Run custom scripts after resource start/stop

2008-08-27 Thread Lars Marowsky-Bree
On 2008-08-27T11:36:37, Enrique José Hernández Blasco [EMAIL PROTECTED] wrote: Write this into the corresponding OCF RA scripts. It is plain bash, so should not be too difficult. I've set up a directory per start/stop hook and OCF RA and run-parts on there but I think it could be cooler o:) to

Re: [Linux-HA] oracle-xe, xen, ocfs2 and heartbeat

2008-08-27 Thread Lars Marowsky-Bree
On 2008-08-26T14:00:35, Ciro Iriarte [EMAIL PROTECTED] wrote: I was talking about the combination of OCFS2 with HA for membership, for other tasks I think it'll be as useful as today... We will continue to support the integration between OCFS2 and the user-space cluster membership layer in

Re: [Linux-HA] Heartbeat 2.1.3-32.1

2008-08-27 Thread Lars Marowsky-Bree
On 2008-08-27T18:42:41, Mega Mailingliste [EMAIL PROTECTED] wrote: Right, that's true, but we've been using the SUSE build service for 3 or 4 installations and, I know that's my mistake, we forgot to save the RPMs anywhere. How should we know that the RPMs change so fast? Sorry, the build

Re: [Linux-HA] Run custom scripts after resource start/stop

2008-08-27 Thread Lars Marowsky-Bree
On 2008-08-27T18:48:01, Andreas Mock [EMAIL PROTECTED] wrote: Despite that you have to emphasize the following: What do you do if the piece of code in the do-it-after-resource-start/stop-hook has an error? What do you send back as result code? What happens if the

Re: [Linux-HA] Loss of name resolution == loss of cluster?

2008-08-26 Thread Lars Marowsky-Bree
On 2008-08-25T23:50:26, Todd, Conor [EMAIL PROTECTED] wrote: My understanding with heartbeat is that the bcast directive tells it to use UDP broadcast on the interfaces in the order in which they're given. UDP broadcast doesn't rely on DNS, so I'm wondering why a loss of DNS resolution would

Re: [Linux-HA] Loss of name resolution == loss of cluster?

2008-08-26 Thread Lars Marowsky-Bree
On 2008-08-26T13:34:31, Brian Reichert [EMAIL PROTECTED] wrote: Are you using IP addresses to describe the other nodes, or are you using hostnames? The docs say to use IP addresses, but I've seen hostnames work. But these do not prevent nodes from seeing each other. The media channels do

Re: [Linux-ha-dev] Problem compiling 2.1.4 and 2.9.0 beta release

2008-08-25 Thread Lars Marowsky-Bree
On 2008-08-25T11:29:58, Adrian Chapela [EMAIL PROTECTED] wrote: And if I run: ./ConfigureMe make --enable-libc-malloc the output error is the same. Have you another idea ? Yes. Build with --enable-fatal-warnings=no ___ Linux-HA-Dev:

Re: [Linux-ha-dev] Problem compiling 2.1.4 and 2.9.0 beta release

2008-08-25 Thread Lars Marowsky-Bree
On 2008-08-25T16:43:14, Adrian Chapela [EMAIL PROTECTED] wrote: Have you another idea ? Yes. Build with --enable-fatal-warnings=no I was sure I had tried this option, but I tried again now and it compiled OK. The problem was caused by an incompatible type definition - cl_malloc.h

Re: [Linux-ha-dev] Problem compiling 2.1.4 and 2.9.0 beta release

2008-08-25 Thread Lars Marowsky-Bree
On 2008-08-25T18:05:49, Adrian Chapela [EMAIL PROTECTED] wrote: The problem was caused by an incompatible type definition - cl_malloc.h used unsigned long instead of size_t. Since cl_malloc is now gone from dev, the next release should do without this warning. Thank you! Actually, I spoke

Re: [Linux-HA] Filesystem resource fails to start

2008-08-25 Thread Lars Marowsky-Bree
On 2008-08-25T14:04:14, Christoph Eßer [EMAIL PROTECTED] wrote: Hi there, meanwhile I managed getting a logfile by a simple reboot of both nodes. The real problem I have is that Heartbeat doesn't start a configured Filesystem resource. The logfile entry is rather cryptic and doesn't help

Re: [Linux-HA] (no subject)

2008-08-25 Thread Lars Marowsky-Bree
On 2008-08-25T16:29:14, Klaus Jagemann [EMAIL PROTECTED] wrote: heartbeat[1539]: 2008/08/25_15:09:12 ERROR: Cannot write to media pipe 2: Resource temporarily unavailable heartbeat[1539]: 2008/08/25_15:09:12 ERROR: Shutting down. Due to the size of the logfile (1.2MB), I will only attach

Re: [Linux-HA] Using Hearbeat to restart ec2 instance

2008-08-25 Thread Lars Marowsky-Bree
On 2008-08-25T08:38:19, Chris Conley [EMAIL PROTECTED] wrote: Sorry if I wasn't clear enough. EC2 is the cloud computing service provided by Amazon: http://www.amazon.com/gp/browse.html?node=201590011 Basically, I have heartbeat running on two virtual machines in the EC2 cloud. If one of

Re: [Linux-HA] Heartbeat 2.1.3 compatible with 2.1.4?

2008-08-24 Thread Lars Marowsky-Bree
On 2008-08-24T07:50:42, Todd, Conor [EMAIL PROTECTED] wrote: I'm wondering if I upgrade one of my cluster nodes to 2.1.4, if the rest of the nodes will still accept it as part of their cluster? I'd rather not have to start from scratch in order to upgrade. In theory, that should work.

Re: [Linux-HA] Compilation error of new release 2.1.4

2008-08-23 Thread Lars Marowsky-Bree
On 2008-08-22T11:18:07, Paul Walsh [EMAIL PROTECTED] wrote: pils.c:244: error: initialization from incompatible pointer type pils.c:245: error: initialization from incompatible pointer type make[2]: *** [pils.lo] Error 1 make[2]: se sale del directorio

Re: [Linux-HA] How to configure fencing after failover ?

2008-08-23 Thread Lars Marowsky-Bree
On 2008-08-22T09:20:02, Ronny Becker [EMAIL PROTECTED] wrote: Hello, I am trying to configure fencing after failover and cannot find a way to do this. I am using heartbeat 2.1.4. This is what I have configured: - resource_stickiness is about 99 - resource_failure_stickiness is -50 -

Re: [Linux-HA] Adding/removing constraints on the fly

2008-08-23 Thread Lars Marowsky-Bree
On 2008-08-22T19:40:52, Mike Sweetser - Adhost [EMAIL PROTECTED] wrote: I want to do the following: * Remove no_r1_on_gamma * Remove run_r1 * Create a new constraint called run_r1 with a score of 100 for gamma * Create a new constraint called no_r1_on_alpha with a score of -INFINITY for

Re: [Linux-HA] Securing data within the cib.xml

2008-08-23 Thread Lars Marowsky-Bree
On 2008-08-22T17:20:40, Kevin Harms [EMAIL PROTECTED] wrote: I'm wondering if anyone has any experience with the following case. The STONITH utility uses IPMI to reboot machines and works via a username and password. This username and password are in the cib.xml. Beyond the standard

[Linux-HA] Linux Kongress 2008

2008-08-23 Thread Lars Marowsky-Bree
Dear readers, this year's Linux Kongress, held from 7th to 10th October in Hamburg, features 3 tutorials which explain and use Linux-HA as their clustering technology. Congratulations and thanks to Ralph Dehner, Lars Ellenberg, Joerg Jungermann, Maximilian Wilhelm. It is wonderful to see the

Re: [Linux-ha-dev] Problem compiling 2.1.4 and 2.9.0 beta release

2008-08-22 Thread Lars Marowsky-Bree
On 2008-08-22T13:53:08, Adrian Chapela [EMAIL PROTECTED] wrote: pils.c -fPIC -DPIC -o .libs/pils.o cc1: warnings being treated as errors pils.c:244: error: initialization from incompatible pointer type pils.c:245: error: initialization from incompatible pointer type make[2]: *** [pils.lo]

Re: [Linux-ha-dev] Problem compiling 2.1.4 and 2.9.0 beta release

2008-08-22 Thread Lars Marowsky-Bree
On 2008-08-22T15:09:01, Adrian Chapela [EMAIL PROTECTED] wrote: Lars Marowsky-Bree wrote: On 2008-08-22T13:53:08, Adrian Chapela [EMAIL PROTECTED] wrote: pils.c -fPIC -DPIC -o .libs/pils.o cc1: warnings being treated as errors pils.c:244: error: initialization from incompatible

[Linux-HA] Re: [Pacemaker] Constraints with and without rules

2008-08-22 Thread Lars Marowsky-Bree
On 2008-08-22T16:35:45, Ben Beuchler [EMAIL PROTECTED] wrote: I apologize for the cross-post if that's not kosher. It appears linux-ha still gets considerably more traffic than the pacemaker list. Are these functionally equivalent on a cluster with symmetric-cluster=false? This is the

[Linux-ha-dev] Announcing: 2.99.0 (beta! release)

2008-08-21 Thread Lars Marowsky-Bree
Hi, in preparation for 3.0.0, which is bound to happen eventually when it is ready, I am happy to announce 2.99.0 as a _beta_ version. 2.99.0 introduces significant changes: - 2.99.0 removes code now maintained outside heartbeat such as the CRM (now Pacemaker), thus everything which depends

Re: [Linux-ha-dev] 3.0 thoughts (dopd)

2008-08-21 Thread Lars Marowsky-Bree
On 2008-08-19T15:02:09, Rasto Levrinc [EMAIL PROTECTED] wrote: - Remove code which depends on pacemaker (mgmt, dopd, CIM/SNMP) - Remove unmaintained code (telecom/). dopd does not depend on crm directly, but it uses clplumbing. Can it stay there? Ah, right, dopd stays then I think.

[Linux-ha-dev] 3.0.0: cts?

2008-08-21 Thread Lars Marowsky-Bree
Hi, where should we maintain CTS? Pacemaker has its own fork of it, at this point in time. But I think CTS makes sense for heartbeat to have too, as _someone_ might still care for the v1 functionality, and it might make sense to test just the cluster layer w/o resources. My preferred approach

Re: [Linux-ha-dev] 3.0.0: cts?

2008-08-21 Thread Lars Marowsky-Bree
On 2008-08-21T20:43:55, Andrew Beekhof [EMAIL PROTECTED] wrote: My preferred approach would be to create heartbeat-cts as a package, and put CTS in there; and then, Pacemaker would just drop in the overlays it needs. wont work as you'd have multiple packages with the same files.

Re: [Linux-HA] 3node split site quorumd tests

2008-08-21 Thread Lars Marowsky-Bree
On 2008-08-20T18:33:47, Robert [EMAIL PROTECTED] wrote: Well, from my perspective - which of course is the perspective of a user and not the developer of heartbeat - it works. By works I mean, when I pull the cable between the datacenters, the resources are active on one node only (the

Re: [Linux-HA] 3node split site quorumd tests

2008-08-21 Thread Lars Marowsky-Bree
On 2008-08-20T23:35:36, Robert Heinzmann (ml) [EMAIL PROTECTED] wrote: I'm getting confused even more while going through some postings and google findings regarding Linux split site setups and quorum server. Alan writes a lot about quorum server on his blog: That's something you'd have to

Re: [Linux-HA] building the stripped-down heartbeat on Ubuntu 8.0.4

2008-08-21 Thread Lars Marowsky-Bree
On 2008-08-20T17:33:01, Ben Beuchler [EMAIL PROTECTED] wrote: I'm attempting to build debs of this version of Heartbeat on Ubuntu 8.0.4: http://hg.linux-ha.org/dev/archive/tip.tar.bz2 $ cd Linux-HA-Dev-b32ca6086e32 $ dpkg-buildpackage -rfakeroot ... All goes well until it tries to

Re: [Linux-HA] 3node split site quorumd tests

2008-08-21 Thread Lars Marowsky-Bree
On 2008-08-21T13:38:14, Robert [EMAIL PROTECTED] wrote: So if quorumd works well in certain environments can it be used ? (e.g. non STONITH configurations). Sure. 2 Nodes without STONITH, and unable to recover from stop failures - now that makes for a stable, reliable, and supportable

Re: [Linux-HA] 3node split site quorumd tests

2008-08-21 Thread Lars Marowsky-Bree
On 2008-08-21T13:47:07, Andreas Kurz [EMAIL PROTECTED] wrote: Have you considered to install an additional cluster node as a Quorumserver which runs no services ? Doesn't help the STONITH issue at all. Doesn't work in split-site configurations; the site with N/2+1 nodes will always win, the
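The N/2+1 rule mentioned in this reply can be illustrated with a minimal shell sketch of simple majority quorum. The function name `has_quorum` and the node counts are illustrative assumptions, not heartbeat code; the sketch only shows why, after a split, the larger site always keeps quorum and the smaller one never does, regardless of which site actually failed.

```shell
#!/bin/sh
# Hypothetical helper (not part of heartbeat): prints "yes" if `visible`
# nodes out of `total` form a strict majority, i.e. at least total/2 + 1
# using integer division, and "no" otherwise.
has_quorum() {
    visible=$1
    total=$2
    needed=$(( total / 2 + 1 ))
    if [ "$visible" -ge "$needed" ]; then
        echo yes
    else
        echo no
    fi
}

# Example: a 5-node cluster split 3/2 across two sites.
has_quorum 3 5   # the larger site
has_quorum 2 5   # the smaller site
```

Under the same rule, a 2-node cluster that splits 1/1 leaves neither side with quorum, which is one reason a tie-breaker alone does not replace fencing.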

Re: [Linux-HA] 3node split site quorumd tests

2008-08-21 Thread Lars Marowsky-Bree
On 2008-08-21T13:56:48, Robert [EMAIL PROTECTED] wrote: Another way could also be, to just set up a 2 node cluster with stonith via meatware. This allows server maintenance, but automatic failovers have to be administrator controlled - that's acceptable for the disaster case I guess.

[Linux-HA] Announcing: 2.99.0 (beta! release)

2008-08-21 Thread Lars Marowsky-Bree
Hi, in preparation for 3.0.0, which is bound to happen eventually when it is ready, I am happy to announce 2.99.0 as a _beta_ version. 2.99.0 introduces significant changes: - 2.99.0 removes code now maintained outside heartbeat such as the CRM (now Pacemaker), thus everything which depends

Re: [Linux-HA] 3node split site quorumd tests

2008-08-21 Thread Lars Marowsky-Bree
On 2008-08-21T14:52:41, Andreas Kurz [EMAIL PROTECTED] wrote: Yes, that should work. But so would already simple majority quorum; set ignore_quorum=freeze ^ by default, and when you want one site/node to continue, force it to ignore. Hmm ...

Re: [Linux-HA] 3node split site quorumd tests

2008-08-21 Thread Lars Marowsky-Bree
On 2008-08-21T14:25:36, Robert [EMAIL PROTECTED] wrote: And what about three sides ? NodeA - DC-A NodeB - DC-B NodeQuorum - DC-C ? Fencing still is not addressed. Regards, Lars -- Teamlead Kernel, SuSE Labs, Research and Development SUSE LINUX Products GmbH, GF: Markus Rex, HRB

Re: [Linux-HA] long time until reachability

2008-08-21 Thread Lars Marowsky-Bree
On 2008-08-21T11:44:42, [EMAIL PROTECTED] wrote: Sorry for my first mail with an incorrect address... :( So this post is the second Hi! I've set up a small Linux-HA project for myself. I have two Servers which should provide one IP Server1: 10.0.0.4 Server2: 10.0.0.5

Re: [Linux-HA] Announcing: 2.99.0 (beta! release)

2008-08-21 Thread Lars Marowsky-Bree
On 2008-08-21T15:14:08, Michael Schwartzkopff [EMAIL PROTECTED] wrote: On Thursday, 21 August 2008 14:50, Lars Marowsky-Bree wrote: Hi, in preparation for 3.0.0, which is bound to happen eventually when it is ready, I am happy to announce 2.99.0 as a _beta_ version. (...) Hi, I

Re: [Linux-ha-dev] 3.0 thoughts

2008-08-20 Thread Lars Marowsky-Bree
On 2008-08-20T16:33:44, Simon Horman [EMAIL PROTECTED] wrote: I realise that the whole point of 3.0.0 is to produce a lha release that depends on pacemaker. I think you meant the right thing, but the phrasing would be more like a lha release which pacemaker can depend on ;-) When I read your

Re: [Linux-ha-dev] 3.0 thoughts

2008-08-20 Thread Lars Marowsky-Bree
On 2008-08-20T11:22:25, Wolfram Schlich [EMAIL PROTECTED] wrote: * Robert [EMAIL PROTECTED] [2008-08-19 15:38]: When thinking about splitting the packages, I think a separate RA package is useful. Most changes are done at the RA level (new versions of supported applications, bug fixes

Re: [Linux-ha-dev] 3.0 thoughts

2008-08-20 Thread Lars Marowsky-Bree
On 2008-08-20T13:18:42, Wolfram Schlich [EMAIL PROTECTED] wrote: But yes, clearly we hope to have more independently packaged resource agents one day; it would be so cool if they were instead shipped with the service package (and not our problem to maintain! ;-). That's a bit beyond of

Re: [Linux-ha-dev] uninstall Heartbeat 2.1.4-1 on RedHat

2008-08-20 Thread Lars Marowsky-Bree
On 2008-08-20T11:18:12, Junko IKEDA [EMAIL PROTECTED] wrote: Is it hard to replace the %run_ldconfig macro with /sbin/ldconfig for RedHat? No, it's fixed.

Re: [Linux-ha-dev] 3.0 thoughts

2008-08-20 Thread Lars Marowsky-Bree
On 2008-08-20T14:17:17, Wolfram Schlich [EMAIL PROTECTED] wrote: How the service provider arranges their packaging and upgrade strategy is their concern, not ours ;-) How often do you upgrade init scripts without the service? An OCF RA is not just an init script -- please do not forget the

Re: [Linux-ha-dev] 3.0 thoughts

2008-08-20 Thread Lars Marowsky-Bree
On 2008-08-20T18:05:32, Wolfram Schlich [EMAIL PROTECTED] wrote: So there's no point why the LSB script and the OCF RA could not be one and the same script, sharing much of the logic. There are systems that are not using LSB init scripts at all but that are still used together with software

Re: [Linux-ha-dev] 3.0 thoughts

2008-08-20 Thread Lars Marowsky-Bree
On 2008-08-20T18:16:41, Robert [EMAIL PROTECTED] wrote: That's a good idea. Why not put all OCF library stuff into one package (ocf-base) and add a package for each resource agent (e.g. ocfs-ra-drbd). Packages are not free. That'd be a mess to maintain. You don't want that. Trust me. Not

Re: [Linux-HA] 3node split site quorumd tests

2008-08-20 Thread Lars Marowsky-Bree
On 2008-08-19T16:40:48, Michael Schwartzkopff [EMAIL PROTECTED] wrote: On Tuesday, 19 August 2008 15:55, Robert wrote: Hi, I've been running some tests with quorumd and a three node setup. The setup is shown in the attached graphics. The cluster uses export HA_quorum=majority:quorumd.

Re: [Linux-HA] Announcing: heartbeat 2.1.4

2008-08-20 Thread Lars Marowsky-Bree
On 2008-08-20T13:03:50, Adrian Chapela [EMAIL PROTECTED] wrote: I can't find this bug 1852 (http://developerbugs.linux-foundation.org/show_bug.cgi?id=1852) resolved. When could it be resolved? That was fixed in March and is included, but the commit did not mention that bugzilla id, because

Re: [Linux-HA] 3node split site quorumd tests

2008-08-20 Thread Lars Marowsky-Bree
On 2008-08-19T15:55:13, Robert [EMAIL PROTECTED] wrote: I played around with hb_setweight - no success (hb_setweight debnode1 200). The quorum server ignores the weights. The quorum server is a broken piece of ... code. Avoid it. Is there a way to force the DC to be in DataCenter1 if

Re: [Linux-ha-dev] Announcing: heartbeat 2.1.4

2008-08-19 Thread Lars Marowsky-Bree
On 2008-08-19T10:47:20, Simon Horman [EMAIL PROTECTED] wrote: I have uploaded packages for Debian (sid) to debian.org and they should be available in the debian archive for most architectures within the next 24 hours. Source and i386 binaries are also available at

[Linux-ha-dev] 3.0 thoughts

2008-08-19 Thread Lars Marowsky-Bree
Hi, here are my thoughts for 3.0: - Remove code which depends on pacemaker (mgmt, dopd, CIM/SNMP) - Remove unmaintained code (telecom/). - Split up packages so heartbeat's cluster infrastructure layer can be installed separately from - supporting libraries, resources, stonith, and lrm code -

Re: [Linux-ha-dev] 3.0 thoughts

2008-08-19 Thread Lars Marowsky-Bree
On 2008-08-19T23:13:25, Simon Horman [EMAIL PROTECTED] wrote: I'm fine with all of this. Though I wonder if it might be best to get 3.0.0 out the door before worrying about splitting up the packages. Uhm, but that's the whole point of 3.0.x. ;-) What release goals for 3.0.0 do you have in

[Linux-HA] Re: [Linux-ha-dev] Announcing: heartbeat 2.1.4

2008-08-19 Thread Lars Marowsky-Bree
On 2008-08-19T10:47:20, Simon Horman [EMAIL PROTECTED] wrote: I have uploaded packages for Debian (sid) to debian.org and they should be available in the debian archive for most architectures within the next 24 hours. Source and i386 binaries are also available at

Re: [Linux-ha-dev] duplicate resource active in 2.1.4-RC

2008-08-18 Thread Lars Marowsky-Bree
On 2008-08-18T10:19:56, Andrew Beekhof [EMAIL PROTECTED] wrote: Fixed in: http://hg.clusterlabs.org/pacemaker/stable-0.6/rev/2d516888d27c This is in 2.1.4 as well. Regards, Lars

[Linux-ha-dev] Announcing: heartbeat 2.1.4

2008-08-18 Thread Lars Marowsky-Bree
are available from http://software.opensuse.org/download/server:/ha-clustering:/lha-2.1 2. The source tar ball available from http://hg.linux-ha.org/lha-2.1/archive/STABLE-2.1.4.tar.bz2 * Mon Aug 18 2008 Lars Marowsky-Bree [EMAIL PROTECTED] and MANY others (see doc/AUTHORS) + Version

[Linux-ha-dev] Re: [Linux-HA] Announcing: heartbeat 2.1.4

2008-08-18 Thread Lars Marowsky-Bree
On 2008-08-18T10:30:24, Harakiri [EMAIL PROTECTED] wrote: Since this is the last 2.1.x Version, i hope that at least for the 3.x branch, solaris sparc is planned for release. Sure. We're quite willing to take patches. 3.0.x should be considerably easier to port too, as a lot of code has,

[Linux-HA] Announcing: heartbeat 2.1.4

2008-08-18 Thread Lars Marowsky-Bree
are available from http://software.opensuse.org/download/server:/ha-clustering:/lha-2.1 2. The source tar ball available from http://hg.linux-ha.org/lha-2.1/archive/STABLE-2.1.4.tar.bz2 * Mon Aug 18 2008 Lars Marowsky-Bree [EMAIL PROTECTED] and MANY others (see doc/AUTHORS) + Version

Re: [Linux-ha-dev] Medium: Debian: Dependancy on uuid-runtime

2008-08-17 Thread Lars Marowsky-Bree
On 2008-08-17T22:11:41, Simon Horman [EMAIL PROTECTED] wrote: I should have made it clear that my target for the work that I did in debian/ is unstable. If uuid-runtime | uuidgen makes Etch happy, I'm happy for that change to go in. It is curious that the packages seem to build for Etch in

Re: [Linux-ha-dev] Cope with empty $RANDOM

2008-08-16 Thread Lars Marowsky-Bree
On 2008-08-13T10:24:14, Simon Horman [EMAIL PROTECTED] wrote: Sorry, I've poked around a bit further and I no longer think that local is POSIX. People should simply choose whatever language it is they want to write their scripts in and declare that in the #! line, be it bash, sh, ksh, csh,

Re: [Linux-ha-dev] duplicate resource active in 2.1.4-RC

2008-08-15 Thread Lars Marowsky-Bree
On 2008-08-15T11:55:35, Keisuke MORI [EMAIL PROTECTED] wrote: I look forward to hearing from Keisuke-san whether this works for them now! It does not seem to be fixed right. It does not cause an assertion failure any more (neither crash ;-), but an invalid clone resource appears.

Re: [Linux-ha-dev] duplicate resource active in 2.1.4-RC

2008-08-15 Thread Lars Marowsky-Bree
On 2008-08-15T17:52:42, Keisuke MORI [EMAIL PROTECTED] wrote: More precisely, we once tried to use clones with 2.1.3 in production but had to suspend using them because there were some problems. Now we want to upgrade to the coming 2.1.4 with clones. _Clones_ by themselves work fine,

[Linux-ha-dev] Announcing: Release candidate for 2.1.4

2008-08-15 Thread Lars Marowsky-Bree
, Fedora. The package version matching this announcement is 2.1.14-17.1. I will move this to the final location, and update the wiki, and post a more formal announcement (also cc'ing the announcement list) once the final version is ready. * Fri Aug 15 2008 Lars Marowsky-Bree [EMAIL PROTECTED

[Linux-HA] Announcing: Release candidate for 2.1.4

2008-08-15 Thread Lars Marowsky-Bree
, Fedora. The package version matching this announcement is 2.1.14-17.1. I will move this to the final location, and update the wiki, and post a more formal announcement (also cc'ing the announcement list) once the final version is ready. * Fri Aug 15 2008 Lars Marowsky-Bree [EMAIL PROTECTED

Re: [Linux-HA] Help with conversion to heartbeat v2 IPaddr not working

2008-08-15 Thread Lars Marowsky-Bree
On 2008-08-14T14:24:58, David Avery [EMAIL PROTECTED] wrote: I originally set up Heartbeat and I set up my haresources file and everything worked fine. After converting my haresources file with the python script and converting it to my cib.xml file. When I start heartbeat my drbd, mysql, and

Re: [Linux-HA] bad path to pingd

2008-08-15 Thread Lars Marowsky-Bree
On 2008-08-14T10:32:24, Chase Simms [EMAIL PROTECTED] wrote: I have the following message throughout my logs. lrmd[3302]: 2008/08/14_08:54:53 info: RA output: (resource_Pingd:monitor:stderr) /usr/lib/ocf/resource.d//heartbeat/pingd: line 193: kill: (31237) - No such process Please note

Re: [Linux-HA] Restart behavior of master/slave resources

2008-08-15 Thread Lars Marowsky-Bree
On 2008-08-15T13:13:46, Krauth, Alexander [EMAIL PROTECTED] wrote: Situation: Running a two node cluster. A master/slave resource with clone_max = 2, clone_node_max = 1, master_max = 1, master_node_max = 1. Startup is working, that means 1 clone is running as a master and the other clone is

Re: [Linux-HA] heartbeart.alert problems

2008-08-15 Thread Lars Marowsky-Bree
On 2008-08-14T16:21:54, Fabricio Vaccari Constanski [EMAIL PROTECTED] wrote: Hi I don't speak English very well, but I will try. I'm using mon and the heartbeat.alert script to monitor apache2 on Debian, but it is not stopping heartbeat when the apache2 service fails. any suggestions?

Re: [Linux-ha-dev] 2.1.4 Changes

2008-08-14 Thread Lars Marowsky-Bree
On 2008-08-14T23:12:08, Simon Horman [EMAIL PROTECTED] wrote: Hi Lars, Hi All, I got to the bottom of why the BasicSanityCheck was hanging on the CRM test. Apparently the CRM test needs access to the directory in which the log file is stored, and the new maketempdir code wasn't allowing

Re: [Linux-ha-dev] duplicate resource active in 2.1.4-RC

2008-08-14 Thread Lars Marowsky-Bree
On 2008-08-14T16:33:57, Andrew Beekhof [EMAIL PROTECTED] wrote: But I've got PE crash now when I used with clone resources... I think the following is the correct fix, but i need to do some more testing I've pushed that fix for the fatal assert to both the lha-2.1 tree and the openSUSE build

Re: [Linux-ha-dev] duplicate resource active in 2.1.4-RC

2008-08-13 Thread Lars Marowsky-Bree
On 2008-08-13T17:11:54, Keisuke MORI [EMAIL PROTECTED] wrote: Hi, I've got an unexpected behavior during our regression test for the 2.1.4 release. When the stop of a resource with on_fail=block failed, it looks like the resource is running on both nodes according to the log and

Re: [Linux-ha-dev] duplicate resource active in 2.1.4-RC

2008-08-13 Thread Lars Marowsky-Bree
On 2008-08-13T15:01:17, Andrew Beekhof [EMAIL PROTECTED] wrote: It's fixed in pacemaker (now): 665119a56b2a Pushed to lha-2.1 ___ Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev Home Page:

Re: [Linux-ha-dev] Cope with empty $RANDOM

2008-08-11 Thread Lars Marowsky-Bree
On 2008-08-07T09:41:25, David Lee [EMAIL PROTECTED] wrote: var1=XXX var2=YYY my_function() { _mf_v1=$1 _mf_v2=$2 } your_function() { _yf_v1=$1 _yf_v2=$2 } Pragmatically most sh-programming things are fine with such conventions; the
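The prefix convention quoted in this message can be sketched as follows. This is a minimal illustration, assuming a plain POSIX sh without the non-POSIX `local` keyword; the function name `double_it` and the `_di_` prefix are made up for the example. Each function keeps its working variables under its own prefix so that it cannot clobber a caller's variable of the same base name.

```shell
#!/bin/sh
# Sketch: per-function variable prefixes instead of `local`.
# All names here are illustrative, not taken from any heartbeat script.

count=10                # the caller's variable; must survive the call

double_it() {
    _di_in=$1           # prefixed with _di_, so $count is never touched
    _di_out=$(( _di_in * 2 ))
}

double_it 4
echo "count=$count doubled=$_di_out"
```

The trade-off is that the variables are still global, so the convention relies on discipline rather than on scoping enforced by the shell.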

Re: [Linux-HA] I don't understand

2008-08-11 Thread Lars Marowsky-Bree
On 2008-08-08T17:01:54, Franck Huet [EMAIL PROTECTED] wrote: heartbeat[7641]: 2008/08/08_16:51:46 WARN: node avil003.sylvea.fr: is dead heartbeat[10274]: 2008/08/08_16:53:14 WARN: node noeud1: is dead The two nodes cannot communicate. You need to fix that. Regards, Lars

Re: [Linux-HA] Resources starting twice

2008-08-08 Thread Lars Marowsky-Bree
On 2008-08-05T18:29:16, Michael Alger [EMAIL PROTECTED] wrote: It's a bit crude, but it works. The main problem is that if the monitoring script stops, heartbeat has no idea and therefore can't factor that into its decision-making process. I haven't determined whether we can somehow store a

Re: [Linux-HA] Adding third node turns all resources unmanaged

2008-08-04 Thread Lars Marowsky-Bree
On 2008-07-24T16:29:54, Gerard Petersen [EMAIL PROTECTED] wrote: The third node is already added to heartbeat config, and in standby mode. We have constraints in place (full log and config will follow), that work with the +INF, 'zero' and -INF values, respectively as Master location, Slave

Re: [Linux-HA] Resources starting twice

2008-08-01 Thread Lars Marowsky-Bree
On 2008-07-31T16:44:31, Angel Rengifo Cancino [EMAIL PROTECTED] wrote: Yep, it's because I'm first trying to understand very well heartbeat 1.x before learning 2.x style. Using haresources it seems easier for my simple requirements. That's not necessarily helpful, as v2 is very different, and

[Linux-ha-dev] HA Cluster Developer Summit 2008: call for participants

2008-07-30 Thread Lars Marowsky-Bree
Hi, sorry, it seems that Fabio has forgotten to forward this to the linux-ha-dev list. Regards, Lars

Re: [Linux-HA] RE: ipfail-like, or something else?

2008-07-24 Thread Lars Marowsky-Bree
On 2008-07-22T11:58:31, [EMAIL PROTECTED] wrote: Hi Mark, so what you are asking for is a way to externally mark a node down (and clean/fenced too), because you have additional out-of-band knowledge? We don't yet have a way of doing this. A single HBcomm plugin can't deliver, because

Re: [Linux-ha-dev] Apache resource agent broken in Debian lenny

2008-07-20 Thread Lars Marowsky-Bree
On 2008-07-20T17:30:20, Dejan Muhamedagic [EMAIL PROTECTED] wrote: I'm not sure if we should deal with this at all in heartbeat. Trying to support all the new setups and distributions, it may become a nightmare. For the maintenance too: dealing with the apache configuration is already rather

Re: [Linux-ha-dev] Apache resource agent broken in Debian lenny

2008-07-20 Thread Lars Marowsky-Bree
On 2008-07-20T21:40:12, Florian Knauf [EMAIL PROTECTED] wrote: From what I can see, the apache RA basically replicates functionality that's already part of apache2ctl - which is supposed to be used to start, stop and control apache, anyway. Apache2ctl evaluates the envvars file if it finds

Re: [Linux-HA] Late Heartbeat Error

2008-07-20 Thread Lars Marowsky-Bree
On 2008-07-21T00:17:04, Choon Kiat [EMAIL PROTECTED] wrote: Hi, I am running a Xen virtual machine with heartbeat on it. It reports this Gmain_timeout_dispatch every 2 or 5 minutes. The host is not busy as nothing is running currently. Are there any other ways to optimize heartbeat to prevent this

Re: [Linux-HA] ip ressource with alias instead of secondary address

2008-07-19 Thread Lars Marowsky-Bree
On 2008-07-18T19:19:39, Dirk H. Schulz [EMAIL PROTECTED] wrote: What I need is an interface alias with its own address (like eth0:0) because of the routing protocol I use at the gateway cluster. Just out of curiosity, what broken routing protocol requires that? Oh, it is not broken. It is

Re: [Linux-ha-dev] Re: cronjobs v1.2 rc1 OCF resource agent now available

2008-07-18 Thread Lars Marowsky-Bree
On 2008-07-18T16:46:49, Dejan Muhamedagic [EMAIL PROTECTED] wrote: We are talking about different metrics. All monitor operations require a fork of a shell. ... which incidentally is also something we eventually need to fix, and either fork a daemon when the RA starts (which then would talk

Re: [Linux-HA] use of crm_resource

2008-07-18 Thread Lars Marowsky-Bree
On 2008-07-15T14:24:09, Dejan Muhamedagic [EMAIL PROTECTED] wrote: I'd propose to set the group to the unmanaged mode, do whatever you have to do to the resource and then put it back to managed: crm_resource -r grp --meta -p is_managed -v false stop apache ...
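The unmanage, work, re-manage sequence proposed here can be sketched as a small wrapper. This is only a sketch under stated assumptions: the `crm_resource -r grp --meta -p is_managed -v false` call is quoted from the post, while the matching `-v true` call, the `run` and `set_managed` helpers, and the `DRY_RUN` switch are illustrative additions. By default the script only prints what it would run.

```shell
#!/bin/sh
# Sketch only: prints the commands instead of executing them unless
# DRY_RUN=0 is set in the environment. Helper names are hypothetical.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

set_managed() {
    # $1 = resource or group id, $2 = true|false
    # "-v true" is an assumption mirroring the quoted "-v false" call.
    run crm_resource -r "$1" --meta -p is_managed -v "$2"
}

set_managed grp false
# ... perform the manual work on the resource here ...
set_managed grp true
```

Keeping the dry-run default makes it easy to review the exact commands before touching a live cluster.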

Re: [Linux-HA] ip ressource with alias instead of secondary address

2008-07-18 Thread Lars Marowsky-Bree
On 2008-07-17T06:49:40, Dirk H. Schulz [EMAIL PROTECTED] wrote: What I need is an interface alias with its own address (like eth0:0) because of the routing protocol I use at the gateway cluster. Just out of curiosity, what broken routing protocol requires that?

Re: [Linux-ha-dev] Re: cronjobs v1.2 rc1 OCF resource agent now available

2008-07-17 Thread Lars Marowsky-Bree
On 2008-07-16T20:42:28, Matthew Soffen [EMAIL PROTECTED] wrote: The main reason for not wanting bash was for the non-Linux architectures (i.e. *BSD and Solaris). Yes, it's readily available, but since it isn't standard and sh is, what's the big deal? Designing/coding for sh (not for bash)

Re: [Linux-ha-dev] Re: cronjobs v1.2 rc1 OCF resource agent now available

2008-07-16 Thread Lars Marowsky-Bree
On 2008-07-15T15:18:26, Dejan Muhamedagic [EMAIL PROTECTED] wrote: There's a tool in Debian which finds bashisms: checkbashisms(1). It should find most bash features. Otherwise, I don't know of any conversion guides. You should be able to find shell references on the Internet. If in doubt,

Re: [Linux-HA] Typo in Xen(Xen_Stop).

2008-07-16 Thread Lars Marowsky-Bree
On 2008-07-16T13:56:27, HIDEO YAMAUCHI [EMAIL PROTECTED] wrote: Hi All, The resource of Xen uses 'xm destroy -w .'. But, this command does not have the w option. I think that it is a typo. Which Xen version are you running on? And yeah, more recent versions seem to have cut that

Re: [Linux-HA] DRBD Volumes won't get promoted

2008-07-16 Thread Lars Marowsky-Bree
On 2008-07-16T11:19:53, Schmidt, Florian [EMAIL PROTECTED] wrote: What can I do? Provide more information in the form of your CIB and the actual version of our heartbeat and drbd you're running, for a start ;-) Regards, Lars -- Teamlead Kernel, SuSE Labs, Research and Development SUSE

Re: [Linux-HA] About motion time of Watchdog.

2008-07-15 Thread Lars Marowsky-Bree
On 2008-07-14T21:20:10, HIDEO YAMAUCHI [EMAIL PROTECTED] wrote: Hi Lars, I'm sorry. I am not good at English. Regards, But yes, it appears that the watchdog timeout should be less than deadtime. However, the idea and I of you are the same. Can you reflect this demand? Or, should I

Re: [Linux-HA] Wierd heartbeat problem.

2008-07-15 Thread Lars Marowsky-Bree
On 2008-07-14T10:13:21, Nikhil Kulkarni [EMAIL PROTECTED] wrote: This is my haresources file: watchdog-client1 IPaddr::10.0.38.71/24/eth0 drbddisk::r0 Delay::3::0 Filesystem::/dev/drbd1:://mnt/data::ext3 kill nfs Delay::3::0 nfs nfslock Why all the Delay resources? They don't help

Re: [Linux-HA] Error messages in /var/log/messages after starting heartbeat

2008-07-15 Thread Lars Marowsky-Bree
On 2008-07-14T19:25:30, Bala [EMAIL PROTECTED] wrote: Jul 12 01:27:43 w2k8-src heartbeat: [5424]: ERROR: MSG[3] : [protocol=1] My /etc/ha.d/ha.cnf has the following two nodes defined in the HA configuration which match with the uname -n output as recommended:

Re: [Linux-HA] use of crm_resource

2008-07-15 Thread Lars Marowsky-Bree
On 2008-07-15T08:13:25, Paul Walsh [EMAIL PROTECTED] wrote: I have the following resource group defined: Resource Group: Moodle web_dev (heartbeat:drbddisk): Started mercury mysql_dev (heartbeat:drbddisk): Started mercury weblog_dev(heartbeat:drbddisk): Started

Re: [Linux-HA] use of crm_resource

2008-07-15 Thread Lars Marowsky-Bree
On 2008-07-15T10:56:44, Paul Walsh [EMAIL PROTECTED] wrote: /usr/lib/ocf/resource.d/BCU/apache2 stop /usr/lib/ocf/resource.d/BCU/apache2 start No, not if you're doing monitoring; the cluster will find out and restart the group. The group, or just the resource? In theory, the script should

Re: [Linux-HA] Re: Error messages in /var/log/messages after starting heartbeat

2008-07-15 Thread Lars Marowsky-Bree
On 2008-07-15T15:54:10, Bala [EMAIL PROTECTED] wrote: Do you have several clusters on the same network segment? If so, you should put them on different port numbers (udpport) or multicast addresses. Actually, yes. I do have a different cluster on the same network. Thanks for the pointer. I

Re: [Linux-HA] About motion time of Watchdog.

2008-07-15 Thread Lars Marowsky-Bree
On 2008-07-15T16:19:31, HIDEO YAMAUCHI [EMAIL PROTECTED] wrote: Hi Lars, I forgot to say a very important thing. The system which I intend for does not use STONITH. If STONITH enters, I do not become such a problem. Systems without STONITH are not supportable anyway. Regards, Lars

Re: [Linux-HA] Xen agent

2008-07-15 Thread Lars Marowsky-Bree
On 2008-07-14T08:05:52, Ciro Iriarte [EMAIL PROTECTED] wrote: Checking the Xen agent I see that the configuration file is required. Can't xen sync the configuration by itself at migration time (xend-relocation-server)? In case this is false, I still need to export and sync the

Re: [Linux-HA] About motion time of Watchdog.

2008-07-14 Thread Lars Marowsky-Bree
On 2008-07-14T12:27:58, HIDEO YAMAUCHI [EMAIL PROTECTED] wrote: Hi All, I confirmed it about watchdog of Heartbeat. The environment that I confirmed has two resources in two nodes(active/standby). The resource of one eyes checks starting VIP. The resource of two eyes is VIP(IPaddr).

Re: [Linux-HA] Xen agent

2008-07-14 Thread Lars Marowsky-Bree
On 2008-07-12T00:06:10, Ciro Iriarte [EMAIL PROTECTED] wrote: Hi, currently Xen 3.2 on SLES10 managed domains ('xm new' instead of 'xm create') only uses configuration files the first time you add them, later, any 'xm block-attach' is stored in its own database (not sure about the
