Re: [Linux-HA] Anyone successfully install PAcemaker/Corosync on Freebsd?

2016-02-10 Thread Lars Ellenberg
gestions. Hoping that perhaps > someone has successfully done this. > > thanks in advance > -mgb -- : Lars Ellenberg : http://www.LINBIT.com ___ Linux-HA mailing list is closing down. Please subscribe to us...@clusterlabs.org instead.

Re: [Linux-HA] CIB not supported: validator 'pacemaker-2.0', release '3.0.9'

2015-12-21 Thread Lars Ellenberg
he upgrade command You should upgrade your crm shell. If you want, you can also upgrade your cib. -- : Lars Ellenberg : http://www.LINBIT.com | Your Way to High Availability : DRBD, Linux-HA and Pacemaker support and consulting DRBD® and LINBI

Re: [Linux-HA] Download of Cluster Glue package?

2015-10-30 Thread Lars Ellenberg
On Fri, Oct 30, 2015 at 04:08:13PM +0100, Lars Ellenberg wrote: > On Wed, Oct 28, 2015 at 03:21:57PM +, Dejan Bucar wrote: > > Hi, > > > > The download link for cluster glue has stopped working, > > http://hg.linux-ha.org/glue/archive/glue-1.0.12.tar.bz2. Is it

Re: [Linux-HA] Download of Cluster Glue package?

2015-10-30 Thread Lars Ellenberg
cally generated tarballs, so I disabled them, but put some (the most frequently requested ones) up as static files. Apparently this had been requested by mercurial hash more frequently than by version tag, so I left out the version tag. Corrected now. -- : Lars Ellenberg : http://www.LINBIT.com | Y

Re: [Linux-HA] Pacemaker 10-15% CPU.

2015-10-30 Thread Lars Ellenberg
cores equally. > > Please let us know if there is any way we could minimise the CPU > utilisation. We don't require stonith feature, but there is no way to stop that > daemon from running to our knowledge. If that is also possible, please let > us know. Has been answered on the P

Re: [Linux-HA] ORACLE 12 and SLES HAE (Sles 11sp3)

2015-10-30 Thread Lars Ellenberg
/heartbeat/SAPDatabase > resource-agents-3.9.5-0.34.57 > and there are NOT updates about that in the channel. > > Any Idea on that? Has been answered on the Pacemaker/Clusterlabs list meanwhile... -- : Lars Ellenberg : http://www.LINBIT.com | Your Way to High Availability : DRBD, Li

Re: [Linux-HA] Heartbeat packages for Redhat-7

2015-04-03 Thread Lars Ellenberg
get with the RHEL 7 "native" HA cluster. For more about Pacemaker, visit clusterlabs.org, subscribe to us...@clusterlabs.org, or join on freenode #clusterlabs -- : Lars Ellenberg : http://www.LINBIT.com | Your Way to High Availability : DRBD, Linux-HA a

Re: [Linux-HA] old page listing HA success-stories?

2015-03-30 Thread Lars Ellenberg
On Sat, Mar 28, 2015 at 06:57:40PM -0500, james pruett wrote: > Hi, > > I am looking for HA success stories written around year 2000. > Are those pages around anywhere ?? You may be referring to http://linux-ha.org/SuccessStories/ But they are outdated by a decade at least. (There are entries ref

[Linux-HA] Please subscribe to us...@clusterlabs.org. This Mailing List is considered deprecated.

2015-02-27 Thread Lars Ellenberg
need to subscribe to the 'users' list at: http://oss.clusterlabs.org/mailman/listinfo/users The mailing list itself will automatically remind posters of this (at most) once a week. Apologies for the inconvenience. Lars Ellenberg _

[Linux-HA] Call for review of undocumented parameters in resource agent meta data

2015-02-11 Thread Lars Ellenberg
On Fri, Jan 30, 2015 at 09:52:49PM +0100, Dejan Muhamedagic wrote: > Hello, > > We've tagged today (Jan 30) a new stable resource-agents release > (3.9.6) in the upstream repository. > > Big thanks go to all contributors! Needless to say, without you > this release would not be possible. Big tha

[Linux-HA] Announcing the Heartbeat 3.0.6 Release

2015-02-10 Thread Lars Ellenberg
orosync still. But typically, for new deployments involving Pacemaker, in most cases you should choose Corosync 2.3.x as your membership and communication layer. For existing deployments using Heartbeat, upgrading to this Heartbeat version is strongly recommended. Thanks, Lars Ellenberg signatu

Re: [Linux-HA] [ha-wg-technical] [ha-wg] [RFC] Organizing HA Summit 2015

2014-11-05 Thread Lars Ellenberg
On Sat, Nov 01, 2014 at 01:19:35AM -0400, Digimer wrote: > All the cool kids will be there. > > You want to be a cool kid, right? Well, no. ;-) But I'll still be there, and a few other Linbit'ers as well. Fabio, let us know what we could do to help make it happen. Lars > On 01/11/14 0

Re: [Linux-HA] Hertbeat fail-over Email Alert

2014-09-29 Thread Lars Ellenberg
opped" (and send out an email). May be good enough for what you seem to ask for. Just don't mistake email as a synchronous, or reliable, communication. Not receiving an email (in time) may be *very* different from "All ok, running as normal, nothing happened". So this does NOT re

Re: [Linux-HA] Hertbeat fail-over Email Alert

2014-09-24 Thread Lars Ellenberg
f fail-over > happen and fail over completed. > > We already setup smtp in both the servers. > And we are able to send mail from terminal window. > Storage1 > Storage2 > > Please guide us. What's wrong with the MailTo resource agent? -- : Lars Ellenberg : LINBIT

Re: [Linux-HA] Antw: Re: Q: dampening explained?

2014-09-10 Thread Lars Ellenberg
; did not wait for "the dust to settle". > > > > > > Proposal: > > > > add a "update deadline" along with the dampening, which would normally > > be sufficiently larger, and count from the original update (this timer > > would not be

Re: [Linux-HA] Antw: Re: Q: dampening explained?

2014-09-09 Thread Lars Ellenberg
't make it into the cib for "a long time". Workaround: use a dampening interval shorter than the update interval. Problem with that workaround: you may still hit the same undesired situations you could reach with immediately updating the values, you did not wait for "the

Re: [Linux-HA] Q: ping (ocf:pacemaker:ping) from specific address?

2014-09-04 Thread Lars Ellenberg
On Thu, Sep 04, 2014 at 04:23:36PM +0200, Ulrich Windl wrote: > >>> Lars Ellenberg wrote on 04.09.2014 at 15:28 > Hi! > > Yes, it helps: The problems with ping are: > 1) There is no error if you use an interface alias with -I (like bond0:xyz) > 2) You can use an IP

Re: [Linux-HA] Q: ping (ocf:pacemaker:ping) from specific address?

2014-09-04 Thread Lars Ellenberg
eq=2 ttl=41 time=27.4 ms --- 8.8.8.8 ping statistics --- 2 packets transmitted, 2 received, 0% packet loss, time 1000ms rtt min/avg/max/mdev = 27.405/27.430/27.456/0.167 ms Does that help? -- : Lars Ellenberg : LINBIT | Your Way to High Availability : DRBD/HA support and cons

Re: [Linux-HA] Ha.cf with IPv6 address

2014-07-17 Thread Lars Ellenberg
figuration error. > heartbeat [3436]: 2014/07/14_15: 59:32 ERROR: Configuration error, > heartbeat not started. -- : Lars Ellenberg : LINBIT | Your Way to High Availability : DRBD/HA support and consulting http://www.linbit.com DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.

Re: [Linux-HA] heartbeat 3.0.3 crashes if there are networking/multicast issues (ERROR: lowseq cannnot be greater than ackseq)

2014-06-26 Thread Lars Ellenberg
tbeat: [10923]: CRIT: Emergency Shutdown(MCP > dead): Killing ourselves. > > At this point clustering has failed, because the heartbeat services/processes > aren't running anymore.. > > Has anyone else seen this? It has been fixed years ago ... > It seems the bug

Re: [Linux-HA] unable to recover from split-brain in a two-node cluster

2014-06-26 Thread Lars Ellenberg
On Tue, Jun 24, 2014 at 08:48:03AM -0700, f...@vmware.com wrote: > Hi Lars, > > Thanks for pointing out the patch. It is not in the heartbeat version on the > system (it is using Heartbeat-3-0-7e3a82377fa8). I'll try that out. > > As for ccm_testclient, the system has stripped out unnecessary fi

Re: [Linux-HA] IP-Address problem with cluster

2014-06-26 Thread Lars Ellenberg
e "service" IP address. You could use ip route to make them use the node address to contact the other nodes, but use the service address to contact clients. -- : Lars Ellenberg : LINBIT | Your Way to High Availability : DRBD/HA support and consulting http://www.linbit.com DRBD® an

Re: [Linux-HA] unable to recover from split-brain in a two-node cluster

2014-06-24 Thread Lars Ellenberg
ship? Is that 3.0.5 release tag, or a more "recent" hg checkout? You need heartbeat up to at least this commit: http://hg.linux-ha.org/heartbeat-STABLE_3_0/rev/fd1b907a0de6 (I meant to add a 3.0.6 release tag since at least I pushed that commit, but because of packaging inconsistencies I want

Re: [Linux-HA] Heartbeat Supported Version

2014-06-02 Thread Lars Ellenberg
an and corosync 1.x...) Recommendation for new clusters: go with pacemaker (1.1.12 will be released soon) and corosync (2.3.3 is it now?). That's also about what you will get with current distributions (rhel7, sles12). (Though we at Linbit are still happy with heartbeat + pacemaker as well).

Re: [Linux-HA] Problems with cluster glue archive

2013-11-26 Thread Lars Ellenberg
append "tip.tar.bz2" to your attempt above... Cheers, Lars -- : Lars Ellenberg : LINBIT | Your Way to High Availability : DRBD/HA support and consulting http://www.linbit.com ___ Linux-HA mailing list Linux-HA@lists.linux-ha.org http://

Re: [Linux-HA] Ping domain

2013-11-19 Thread Lars Ellenberg
://moin.linux-ha.org/PingDirective Or consider switching to Pacemaker... -- : Lars Ellenberg : LINBIT | Your Way to High Availability : DRBD/HA support and consulting http://www.linbit.com DRBD® and LINBIT® are registered trademarks of LINBIT, Austria. ___ L

Re: [Linux-HA] Heartbeat errors related to Gmain_timeout_dispatch at low traffic

2013-11-19 Thread Lars Ellenberg
d your virtualization just stops scheduling the VM itself, because it thinks it is underutilized... Does it recover if you kill/restart heartbeat? -- : Lars Ellenberg : LINBIT | Your Way to High Availability : DRBD/HA support and consulting http://www.linbit.com DRBD® and LINBIT® are registered trad

Re: [Linux-HA] drbd/pacemaker multiple tgt targets, portblock, and race conditions (long-ish)

2013-11-19 Thread Lars Ellenberg
d also like to know, especially if they've found > > alternate solutions. > > Can't say about 1, I use IET, it doesn't seem to have that limitation. > 2 - I use alternative home-brew ms RA which blocks (DROP) both input and >

Re: [Linux-HA] iSCSI corruption during interconnect failure with pacemaker+tgt+drbd+protocol C

2013-11-19 Thread Lars Ellenberg
of > them. But this isn't what happens; instead most of the outstanding > writes are lost. No i/o error is reported on the initiator; stuff > just vanishes. > > I'm writing directly to a block device for these tests, so the lost > data isn't the result of filesystem co

Re: [Linux-HA] Antw: Re: General question about heartbeat tokens and node overloaded.

2013-10-02 Thread Lars Ellenberg
oint, if a node is so busy it cannot get anything done anymore, maybe it is better for overall behaviour to put it down. Unless death-by-overload becomes a "frequent" problem, in which case there is nothing the cluster can do really: You then need to revisit capacity planning, and put in

Re: [Linux-HA] announcement: planning resource-agents release 3.9.6

2013-09-30 Thread Lars Ellenberg
On Mon, Sep 30, 2013 at 03:33:35PM +0200, Dejan Muhamedagic wrote: > Hello, > > We released resource-agents v3.9.5 back in February. In the > meantime there have been quite a few fixes and new features > pushed to the repository and it is high time for another release. > >

Re: [Linux-HA] Installing Heartbeat 3.0.5 in RHEL 6.3

2013-09-30 Thread Lars Ellenberg
ld/BUILD/Heartbeat-3-0-STABLE-3.0.4/heartbeat' > make: *** [all-recursive] Error 1 > error: Estado de salida erróneo de /var/tmp/rpm-tmp.nA38v2 (%build) > > > Errores de construcción RPM: > Estado de salida erróneo de /var/tmp/rpm-tmp.nA38v2 (%build) > > > a

Re: [Linux-HA] [Pacemaker] Probably a regression of the linbit drbd agent between pacemaker 1.1.8 and 1.1.10

2013-09-10 Thread Lars Ellenberg
ters like me? ;-) Why not? That's what release candidates are intended for. You'd only have to confirm that it works for you now. Respectively, that it still does not, in which case you better report that now than after the release, right? -- : Lars Ellenberg : LINBIT | Your Way to H

Re: [Linux-HA] [Pacemaker] Probably a regression of the linbit drbd agent between pacemaker 1.1.8 and 1.1.10

2013-09-09 Thread Lars Ellenberg
On Mon, Sep 09, 2013 at 02:42:45PM +1000, Andrew Beekhof wrote: > > On 06/09/2013, at 5:51 PM, Lars Ellenberg wrote: > > > On Tue, Aug 27, 2013 at 06:51:45AM +0200, Andreas Mock wrote: > >> Hi Andrew, > >> > >> as this is a real showstopper at the mom

Re: [Linux-HA] [Pacemaker] Probably a regression of the linbit drbd agent between pacemaker 1.1.8 and 1.1.10

2013-09-06 Thread Lars Ellenberg
03-test results in the > > following: > > > > --8<- > > Last updated: Mon Aug 26 19:29:38 2013 Last change: Mon Aug 26 > > 19:29:28 2013 via cibadmin on dis04-test > > Stack: cman > > Current DC: dis03-test

Re: [Linux-HA] establishing a new resource-agent package provider

2013-08-07 Thread Lars Ellenberg
one project, before it was exploded into all those sub-projects. People don't even need to know the former ever existed. Though I maintain that it is still quite heavily in use... but that's not the point at all. I simply see no reason to change the name. It is established, and documented all ov

Re: [Linux-HA] Question for the linux HA group

2013-07-01 Thread Lars Ellenberg
that this > move doesn't occur? > We are missing the involved program versions, (heartbeat, pacemaker, ...) and the pacemaker configuration. ( crm configure show or cibadmin -Q ) But just from the ha.cf, we already see that you have only one communication link. Please use multiple

Re: [Linux-HA] LVM Resource agent, "exclusive" activation

2013-06-18 Thread Lars Ellenberg
; the tag on stop though? Or why we would override a different tag on start. As the "cluster tag" is not supposed to change, we could just require the admin to set it once. Has the side-effect that an admin can revoke the "cluster rights" by simply re-tagging with so

Re: [Linux-HA] Heartbeat haresources with IPv6

2013-06-15 Thread Lars Ellenberg
rg() { case `canonname $1` in +IPv6addr::*) + # special case, there is only one argument, + # and it contains :: + echo $1 | sed 's%[^:]*::%%' + ;; *::*) echo $1 | sed 's%[^:]*::%%' | sed 's%::% %g'
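To illustrate why the patch above special-cases IPv6addr, here is a minimal sketch of what the two sed expressions from the snippet do to a haresources-style argument string. The function names and the sample addresses are made up for this illustration; this is not the actual heartbeat code, only the two sed expressions are taken from the snippet.

```shell
#!/bin/sh
# Hypothetical helper names, for illustration only.

# Generic case: strip the leading "ResourceType::" prefix,
# then turn every remaining "::" separator into a space.
split_args() {
    printf '%s\n' "$1" | sed 's%[^:]*::%%' | sed 's%::% %g'
}

# IPv6addr special case: strip the prefix only, because the single
# argument (an IPv6 address) may itself contain "::".
strip_prefix_only() {
    printf '%s\n' "$1" | sed 's%[^:]*::%%'
}

split_args 'IPaddr::10.0.0.1::eth0'           # prints: 10.0.0.1 eth0
strip_prefix_only 'IPv6addr::2001:db8::1'     # prints: 2001:db8::1
split_args 'IPv6addr::2001:db8::1'            # prints: 2001:db8 1  (address mangled)
```

The last call shows the failure mode the special case avoids: the generic splitting treats the "::" inside the address as an argument separator.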

Re: [Linux-HA] LVM Resource agent, "exclusive" activation

2013-05-21 Thread Lars Ellenberg
On Tue, May 21, 2013 at 05:52:39PM -0400, David Vossel wrote: > - Original Message - > > From: "Lars Ellenberg" > > To: "Brassow Jonathan" > > Cc: "General Linux-HA mailing list" , "Lars > > Marowsky-Bree" , "Fabio

Re: [Linux-HA] Sockets Missing from /var/run/heartbeat

2013-05-21 Thread Lars Ellenberg
d... Or a later "cleanup" by some init script or misguided daemon did rm -rf /var/run/* ? What does "lsof -p " say? Lars > On May 17, 2013, at 11:02 AM, "Lars Ellenberg" > wrote: > > > On Thu, May 16, 2013 at 08:05:39PM +, Wilson, Chris

Re: [Linux-HA] Antw: Using crm to configure a rule

2013-05-21 Thread Lars Ellenberg
group called 'HACMASTER': > >>>> > >>>> Resource Group: HACMASTER > >>>> HACMASTER-JOBFILE (ocf::PPS:hacJobFile): Started gpmhac01 > >>>> HACMASTER-PWFILE (ocf::PPS:hacPWFile): Started gpmhac01 > >>&

Re: [Linux-HA] LVM Resource agent, "exclusive" activation

2013-05-20 Thread Lars Ellenberg
On Fri, May 17, 2013 at 02:00:48PM -0500, Brassow Jonathan wrote: > > On May 17, 2013, at 10:14 AM, Lars Ellenberg wrote: > > > On Thu, May 16, 2013 at 10:42:30AM -0400, David Vossel wrote: > > > >>>>>>> The use of 'auto_act

Re: [Linux-HA] LVM Resource agent, "exclusive" activation

2013-05-17 Thread Lars Ellenberg
On Thu, May 16, 2013 at 10:42:30AM -0400, David Vossel wrote: > > The use of 'auto_activation_volume_list' depends on updates to the LVM > > initscripts - ensuring that they use '-aay' in order to activate > > logical > > volumes. That has been checked in upstream. I'm sure

Re: [Linux-HA] Sockets Missing from /var/run/heartbeat

2013-05-17 Thread Lars Ellenberg
put a mkdir in your init script, if you have /var/run on tmpfs or similar. heartbeat 3 has that covered, btw. -- : Lars Ellenberg : LINBIT | Your Way to High Availability : DRBD/HA support and consulting http://www.linbit.com ___ Linux-HA mailing l

Re: [Linux-HA] LVM Resource agent, "exclusive" activation

2013-05-15 Thread Lars Ellenberg
On Tue, May 14, 2013 at 11:36:54AM -0400, David Vossel wrote: > - Original Message - > > From: "Lars Ellenberg" > > To: "Lars Marowsky-Bree" > > Cc: "Fabio M. Di Nitto" , "General Linux-HA mailing > > list" , > > &

Re: [Linux-HA] LVM Resource agent, "exclusive" activation

2013-05-14 Thread Lars Ellenberg
r would know about it, too, and had made sure it is not activated there. If that other node was not in the membership, we would re-tag and activate anyways. So why not just do that, document that it is done this way, and not pretend it would do more than that. It does not. Lars

[Linux-HA] LVM Resource agent, "exclusive" activation

2013-05-14 Thread Lars Ellenberg
This is about pull request https://github.com/ClusterLabs/resource-agents/pull/222 "Merge redhat lvm.sh feature set into heartbeat LVM agent" Apologies to the CC for list duplicates. Cc list was made by looking at the comments in the pull request, and some previous off-list thread. Even though

Re: [Linux-HA] Antw: DRBD NetworkFailure

2013-04-25 Thread Lars Ellenberg
r!) into the servers, and use that. That's usually quick to do, does not cost much effort nor money, and gives fast results. Either the issue persists, and you have ruled out NIC/driver issues with high confidence. Or the issue goes away, and you can then proceed with either blacklisting th

Re: [Linux-HA] drbd error message decoding help

2013-04-24 Thread Lars Ellenberg
alling a CRM process? Or is it the > other way around (which would make more sense?) The DRBD resource agent adjusts the "master score". Just as is required for all resource agents supporting a "Master" state. -- : Lars Ellenberg : LINBIT | Your Way to High Availability :

Re: [Linux-HA] Announcing release 0.1.0 of the Assimilation Monitoring Project!

2013-04-24 Thread Lars Ellenberg
at all times your undisguised opinions." - William Wilberforce > > _______ > Linux-HA mailing list > Linux-HA@lists.linux-ha.org > http://lists.linux-ha.org/mailman/listinfo/linux-ha > See also: http://linux-ha.org/ReportingProblems -- :

Re: [Linux-HA] best combo of software for HA (on Ubuntu 12.04)

2013-02-02 Thread Lars Ellenberg
ing about some 100% ... but those are the 100% of the to-be-resynced blocks in this particular resync, not the 100% of the device... > > I settled on the latest version of gluster instead. Resyncing is > > faster, as the files for the individual VMs can be synced. What

Re: [Linux-HA] "times(2) really returns an unsigned value" for linux-ha fixed from which version?

2012-12-07 Thread Lars Ellenberg
hg.linux-ha.org/ > Could you please tell us if 2.1.3 fixed the bug? Thank u. That was fixed in 2006, which was before 2.0.8. But seriously: wtf are you doing with 2.1.3? -- : Lars Ellenberg : LINBIT | Your Way to High Availability : DRBD/HA support and consulting http://www.linbit.com DRB

Re: [Linux-HA] resource monitor timeout, Killing with signal SIGTERM (15).

2012-11-13 Thread Lars Ellenberg
termined by some "failure stickyness arithmetic", which iirc was really cumbersome to get "right"; if at all. That has since been replaced by the "fail count" concept, which is much easier to handle. > provider="heartbeat"> > > > > >

Re: [Linux-HA] Heartbeat not starting when both nodes are down

2012-10-10 Thread Lars Ellenberg
2.168.20.51 > >>> auto_failback on > >>> nodecluster1.gamez.es cluster2.gamez.es > >>> use_logd yes > >>> crm on > >>> autojoin none > >>> > >>> Any ideas on what am I doing wrong? > > [...] > > >

Re: [Linux-HA] resource-agents build fails with error about cl_log.h

2012-07-31 Thread Lars Ellenberg
the yum equivalent would be yum-builddep $package. Not sure about rug/zypper/whatnot. > I'd make those changes myself, but I'm not really qualified/have permissions. > > >>> On 31.07.2012 at 11:56, in message <20120731155622.GS29767@soda.linbit>, > >>>

Re: [Linux-HA] Manual Resource Migration/Move

2012-07-31 Thread Lars Ellenberg
:30:14 halab4 cib: [13557]: info: cib:diff: + ... yet another constraint, crm shell syntax equivalent below. from the crm resource migrate: location cli-prefer-groupMysql groupMysql inf: halab4 from your config: location location-groupMysql-on-node1 groupMysql inf: halab3 So pace

Re: [Linux-HA] Heartbeat isn't switching to the 2nd node when Httpd is down!

2012-07-31 Thread Lars Ellenberg
10 > initdead 15 > > crm respawn > > node node01 > node node02 > > And I've tried 2 combinations for my cib.xml: learn to use the crm shell, so much easier to the eyes... > > 1: > Code: > > > I think you are mis

Re: [Linux-HA] resource-agents build fails with error about cl_log.h

2012-07-31 Thread Lars Ellenberg
: *** [all-recursive] Error 1 > gmake[1]: Leaving directory > `/home/dcorlette/src/siem/content/tools/HA/resource-agents' > make: *** [all] Error 2 > > > > > > > David Corlette > Product Line Lead > dcorle...@netiq.com > 703.663.55

Re: [Linux-HA] Crm configure edit infinite loop

2012-07-02 Thread Lars Ellenberg
> > > > Jul 02 15:24:19 PCRM-WEB-PROD1 cib: [1307]: info: cib_stats: > Processed 52 operations (384.00us average, 0% utilization) in the last 10min > > Crm does not release to configure it again or even apply any change. > Version - > Pacemaker - 1.1.6 Corosync

Re: [Linux-HA] problems with 3.4 kernel

2012-06-25 Thread Lars Ellenberg
On Fri, Jun 08, 2012 at 02:37:34PM -0700, David Lang wrote: > On Fri, 8 Jun 2012, Lars Ellenberg wrote: > > > On Fri, Jun 08, 2012 at 02:07:17PM -0700, David Lang wrote: > >> I just updated one of my systems to the 3.4 kernel and findif appears to be > >> failing (

Re: [Linux-HA] problems with 3.4 kernel

2012-06-08 Thread Lars Ellenberg
etmask 24 broadcast 1.2.3.255 Try adding more variables as listed by the help text above to see what makes it work or fail. -- : Lars Ellenberg : LINBIT | Your Way to High Availability : DRBD/HA support and consulting http://www.linbit.com ___

Re: [Linux-HA] BUG in IPaddr: Reboot Continually and suggested fix

2012-06-08 Thread Lars Ellenberg
results in ha-log these > lines: Worked around in resource-agents 3.9.3. commit: https://github.com/ClusterLabs/resource-agents/commit/dbe0f24 Though I'd really have used ifname=${ifname%:} ... -- : Lars Ellenberg : LINBIT | Your Way to High Availabi
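For readers unfamiliar with the ifname=${ifname%:} idiom mentioned above: it is plain POSIX parameter expansion that removes one trailing colon, if there is one, and leaves the value alone otherwise. A small sketch follows; the wrapper function name and the sample interface names are assumptions for illustration, not taken from the IPaddr agent.

```shell
#!/bin/sh
# Hypothetical wrapper around the ${var%:} expansion, for illustration only.
strip_trailing_colon() {
    name=$1
    # "%:" removes the shortest suffix matching ":", i.e. one trailing colon.
    printf '%s\n' "${name%:}"
}

strip_trailing_colon 'eth0:'     # prints: eth0
strip_trailing_colon 'eth0:1'    # prints: eth0:1  (no trailing colon, unchanged)
strip_trailing_colon 'eth0'      # prints: eth0
```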

Re: [Linux-HA] Does globally-unique make sense on filesystems cloned resources?

2012-06-06 Thread Lars Ellenberg
Two globally-unique clones I came across in real life: Cluster IP buckets, in the sense of the iptables CLUSTERIP target. Sequences of IPs generated by the IPaddr2 resource, where the clone id is added to the base IP. Both will also need to allow clone-node-max > 1, and one node will h

Re: [Linux-HA] corosync/pacemaker cluster failed

2012-05-25 Thread Lars Ellenberg
On Fri, May 25, 2012 at 10:46:10AM +1000, Andrew Beekhof wrote: > On Fri, May 25, 2012 at 10:39 AM, Tracy Reed wrote: > > On Fri, May 25, 2012 at 02:01:18AM +0200, Lars Ellenberg spake thusly: > >> Something is broken with your IPaddr2 script. > >> Relevant packa

Re: [Linux-HA] 8.3.7 Version Advice

2012-05-25 Thread Lars Ellenberg
On Thu, May 24, 2012 at 09:33:37PM -0300, Net Warrior wrote: > On 05/24/2012 08:32 PM, Lars Ellenberg wrote: > > On Thu, May 24, 2012 at 01:56:37PM -0300, Net Warrior wrote: > >> Hi there list. > >> > >> I'm about to implement some work with DRBD but th

Re: [Linux-HA] corosync/pacemaker cluster failed

2012-05-24 Thread Lars Ellenberg
gents/master/heartbeat/IPaddr2 and pastebin diff -u /usr/lib/ocf/resource.d//heartbeat/IPaddr2 ./IPaddr2 Or replace yours with the current one. Maybe just upgrade resource-agents? Though, if the script is broken in that way (syntax error), it should never work, and not even start, usual

Re: [Linux-HA] 8.3.7 Version Advice

2012-05-24 Thread Lars Ellenberg
ou are willing to build and use your own 2.6.33 on a RHEL6, but not considering using a more recent DRBD module? Why would you do that. -- : Lars Ellenberg : LINBIT | Your Way to High Availability : DRBD/HA support and consulting http://www.linbit.com _

Re: [Linux-HA] questions about Heartbeat

2012-05-14 Thread Lars Ellenberg
vent the wheel and do your own cluster management. -- : Lars Ellenberg : LINBIT | Your Way to High Availability : DRBD/HA support and consulting http://www.linbit.com DRBD® and LINBIT® are registered trademarks of LINBIT, Austria. ___ Linux-HA mailing li

Re: [Linux-HA] heartbeat strange behavior

2012-05-02 Thread Lars Ellenberg
rce.d/IPaddr x.x.x.x/24 stop > > Apr 30 00:03:29 node-b heartbeat: [3082]: info: node-b wants to go standby > [foreign] > Apr 30 00:03:39 node-b heartbeat: [3082]: WARN: No reply to standby > request. Standby request cancelled. > Apr 30 00:04:29 node-b heartbeat: [3082]: WARN

Re: [Linux-HA] "grep -zqs" in ocf:heartbeat:exportfs

2012-04-12 Thread Lars Ellenberg
On Thu, Apr 12, 2012 at 03:30:06PM +0200, Lars Ellenberg wrote: > On Thu, Apr 12, 2012 at 03:20:08PM +0200, Ulrich Windl wrote: > > Hi, > > > > in ocf:heartbeat:exportfs (as found in SLES11 SP1) there is a problem with > > "grep -zqs": > > The "-

Re: [Linux-HA] "grep -zqs" in ocf:heartbeat:exportfs

2012-04-12 Thread Lars Ellenberg
does produce false results. Probably never tested... > BTW: Shouldn't the match be "anchored"? ^$directory ... $host\$ https://github.com/ClusterLabs/resource-agents/commit/5b0bf96e77ed3c4e179c8b4c6a5ffd4709f8fdae There also was a medium length discussion on the mailing l

Re: [Linux-HA] ocf:heartbeat:apache resource agent and timeouts

2012-04-12 Thread Lars Ellenberg
On Thu, Apr 12, 2012 at 12:06:54PM +0200, Lars Ellenberg wrote: > On Sun, Apr 08, 2012 at 03:16:17PM +0200, David Gubler wrote: > > Hi Lars, > > > > On 05.04.2012 18:53, Lars Ellenberg wrote: > > > Uhm, "invalid test case". > > > > > > r

Re: [Linux-HA] ocf:heartbeat:apache resource agent and timeouts

2012-04-12 Thread Lars Ellenberg
On Sun, Apr 08, 2012 at 03:16:17PM +0200, David Gubler wrote: > Hi Lars, > > On 05.04.2012 18:53, Lars Ellenberg wrote: > > Uhm, "invalid test case". > > > > rather try: > > iptables -I INPUT -p tcp --dport 80 -i lo -j REJECT > > or even > &

Re: [Linux-HA] ocf:heartbeat:apache resource agent and timeouts

2012-04-12 Thread Lars Ellenberg
"false" \ > no-quorum-policy="ignore" \ > last-lrm-refresh="1333886776" > > > crm_mon shows > Clone Set: apacheClone [apache] > apache:0 (ocf::heartbeat:apache):Started node2 (unmanaged) > apache:1 (ocf::hear

Re: [Linux-HA] pacemaker+drbd promotion delay

2012-04-12 Thread Lars Ellenberg
> > Based on what you asked for from the previous extract, I think what you want > > from this test is pe-input-5. Just to play it safe, I copied and bunzip2'ed > > all > > three pe-input files mentioned in the log messages: > > > > pe-input-4: <http://pastebin.com/Tx

Re: [Linux-HA] ocf:heartbeat:apache resource agent and timeouts

2012-04-05 Thread Lars Ellenberg
tart timed out. Which I think is the right thing to do. > Bottom line: > I think the apache resource agent badly needs a timeout parameter which > is supplied to wget/curl and the documentation should make clear that > the current monitor timeout provided by pacemaker is not a substitute

Re: [Linux-HA] order troubles

2012-03-22 Thread Lars Ellenberg
"cli-prefer-rule-ve2010" inf: #uname eq virtue4 > location cli-prefer-ve2100 ve2100 \ > rule $id="cli-prefer-rule-ve2100" inf: #uname eq virtue5 > location cli-prefer-ve2101 ve2101 \ > rule $id="cli-prefer-rule-ve2101" inf: #uname eq virtue4 >

Re: [Linux-HA] clvm/dlm/gfs2 hangs if a node crashes

2012-03-20 Thread Lars Ellenberg
odes should already > >> have been fenced because of connection loss between nodes (on drbd > >> replication link). > >> > >> You can use e.g. that nice fencing script: > >> > >> http://goo.gl/O4N8f > > > > This is the output

Re: [Linux-HA] cman+pacemaker+drbd fencing problem

2012-03-01 Thread Lars Ellenberg
orestes-tb corosync[2296]: [CPG ] chosen downlist: sender > r(0) ip(129.236.252.14) r(1) ip(192.168.100.6) ; members(old:2 left:1) > Mar 1 12:04:03 orestes-tb corosync[2296]: [MAIN ] Completed service > synchronization, ready to provide service. > Mar 1 12:0

Re: [Linux-HA] cman+pacemaker+drbd fencing problem

2012-02-28 Thread Lars Ellenberg
On Tue, Feb 28, 2012 at 03:51:29PM -0500, William Seligman wrote: > On 2/28/12 2:09 PM, Lars Ellenberg wrote: > > You say "fencing resource-only" in drbd.conf. > > But you did not show the fencing handler used? > > Did you specify one at all? > > It looks l

Re: [Linux-HA] cman+pacemaker+drbd fencing problem

2012-02-28 Thread Lars Ellenberg
n link will just fail, in which case you really need the "fencing resource-and-stonith", and a suitable fence-peer handler. > and starts the sync process with hypatia-tb. Then cman+corosync steps > in on orestes-tb and fences hypatia-tb, before the sync can proceed. > > I ra

Re: [Linux-HA] Understanding the behavior of IPaddr2 clone

2012-02-27 Thread Lars Ellenberg
ailover, each remaining node should take over its share. > Could this please be documented more clearly somewhere? Clusters from Scratch not good enough? http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Clusters_from_Scratch/ch08s06.html But yes, I'll add a note to the I

Re: [Linux-HA] Understanding the behavior of IPaddr2 clone

2012-02-27 Thread Lars Ellenberg
On Mon, Feb 27, 2012 at 05:23:36PM -0500, William Seligman wrote: > On 2/27/12 4:10 PM, Lars Ellenberg wrote: > > On Mon, Feb 27, 2012 at 03:39:04PM -0500, William Seligman wrote: > >> On 2/24/12 3:36 PM, William Seligman wrote: > >>> On 2/17/12 7:30 AM, Dejan Muhame

Re: [Linux-HA] Understanding the behavior of IPaddr2 clone

2012-02-27 Thread Lars Ellenberg
but I can't be sure it will work for more than two nodes. > Would some nice person test this? > > - I wrote my code assuming that the clone number assigned to a node would > remain > constant. If the clone numbers were to change by deleting/adding a node to the > cluster,

Re: [Linux-HA] stonith/fence using external/libvirt on KVM

2012-02-24 Thread Lars Ellenberg
clusters (handful of resources, small number of nodes), no cluster file system, no DLM involved, and trying to get away without stonith: please use heartbeat. Unless the mentioned behaviour is fixed meanwhile... Everyone else, **with tested and working stonith**, for new deploym

Re: [Linux-HA] Active-passive cluster, best practice question

2012-02-21 Thread Lars Ellenberg
${i}/x ; done" -- i.e. a > non-issue really. There is also csync2 ... and this would be exactly the originally intended use case. -- : Lars Ellenberg : LINBIT | Your Way to High Availability : DRBD/HA support and consulting http://www.linbit.com __

Re: [Linux-HA] Suggestion for exportfs resource

2012-02-21 Thread Lars Ellenberg
tfs > /mail franklin.nevis.columbia.edu > > The exportfs command "canonicalizes" the clientspec, so once again the monitor > operation will always fail. > > I either have to use the canonical name in the clientspec, Right. -- : Lars Ellenberg : LINBIT

Re: [Linux-HA] lrmd error

2012-02-20 Thread Lars Ellenberg
o: RA output: (ping:0:monitor:stderr) syntax error > lrmd: [9362]: info: RA output: (ping:0:monitor:stderr) > attrd_updater: [6457]: info: Invoked: attrd_updater -n pingd -v -d 5s > > Thanks. > Josh Becigneul -- : Lars Ellenberg : LINBIT | Your Way to High Availability : DRBD/HA supp

Re: [Linux-HA] pacemaker/corosync - cl_status . REASON: hb_api_signon: Can't initiate connection to heartbeat

2012-02-20 Thread Lars Ellenberg
nux-ha > See also: http://linux-ha.org/ReportingProblems > ___ > Linux-HA mailing list > Linux-HA@lists.linux-ha.org > http://lists.linux-ha.org/mailman/listinfo/linux-ha > See also: http://linux-ha.org/ReportingProblems -- : Lars El

Re: [Linux-HA] cman+pacemaker+dual-primary drbd does not promote

2012-01-31 Thread Lars Ellenberg
mething DRBD could fix. Try to get a "ocf:pacemaker:Stateful" dummy resource promoted, if that works, come back with drbd specifics. -- : Lars Ellenberg : LINBIT | Your Way to High Availability : DRBD/HA support and consulting http://www.linbit.com _

Re: [Linux-HA] cman+pacemaker+dual-primary drbd does not promote

2012-01-31 Thread Lars Ellenberg
On Tue, Jan 31, 2012 at 10:11:23PM +0100, emmanuel segura wrote: > William can you try like this > > primitive AdminDrbd ocf:linbit:drbd \ > params drbd_resource="admin" \ > op monitor interval="60s" role="Master" > > clone Adming AdminDrbd

Re: [Linux-HA] cman+pacemaker+dual-primary drbd does not promote

2012-01-31 Thread Lars Ellenberg
vel (fence-peer) is mandatory. Unless you don't care for data integrity. > > > DRBD looks OK: > > # cat /proc/drbd > version: 8.4.0 (api:1/proto:86-100) > GIT-hash: 28753f559ab51b549d16bcf487fe625d5919c49c build by gardner@, > 2012-01-25 > 19:10:28 > 0: cs:C

Re: [Linux-HA] heartbeat doesnt create the socket /var/run/heartbeat/register

2012-01-20 Thread Lars Ellenberg
eated? Where did you get your packages/binaries? Double check your build? lsof -n -p your heartbeat master control process? > Is there a workaround I can do to create the socket? Fix your installation. > This problem doesn't happen all the time. I have another node with the >

Re: [Linux-HA] pacemaker virtual ip does not start (was: Unknown Problems with Heartbeat)

2012-01-20 Thread Lars Ellenberg
ited in length, though. The length over ":label" has to be <= 15 bytes (IFNAMSIZ - terminating NUL). Which means eth0:vinotinto works, but eth0:cabernetsauvignon will fail with the non-obvious "RTNETLINK answers: Numerical result out of range". hth, -- : Lars Ellenberg :
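The 15-byte limit described above can be checked before handing a name to the IP tooling. The following is only a sketch: the function name is invented for this example, and the limit of 15 is taken from the message (IFNAMSIZ minus the terminating NUL), assuming the whole "device:label" string is what gets counted.

```shell
#!/bin/sh
# Hypothetical pre-flight check, for illustration only.
# Succeeds if the full "device:label" name fits into 15 bytes.
label_fits() {
    # wc -c counts bytes; the arithmetic expansion normalizes any padding
    # some wc implementations put in front of the number.
    len=$(( $(printf '%s' "$1" | wc -c) ))
    [ "$len" -le 15 ]
}

for name in eth0:vinotinto eth0:cabernetsauvignon; do
    if label_fits "$name"; then
        echo "$name: ok"
    else
        echo "$name: too long"
    fi
done
```

With the two names from the message, the first one passes the check and the second one is rejected, which matches the behaviour described there.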

Re: [Linux-HA] compiling cluster glue in solaris 10 getting error

2011-12-28 Thread Lars Ellenberg
> > -- > > ‘Winner make things happen, Lossers let things happen’ > > > > > > -- > ‘Winner make things happen, Lossers let things happen’ > ___ > Linux-HA mailing list > Linux-HA@lists.linux-ha.org > http://lis

Re: [Linux-HA] Antw: Re: DRBD Split brain question

2011-12-21 Thread Lars Ellenberg
mmon.conf: net { > global_common.conf: allow-two-primaries; > global_common.conf: after-sb-0pri discard-zero-changes; > global_common.conf: after-sb-1pri discard-secondary; > global_common.conf: after-sb-2pri disconnect; > glo

Re: [Linux-HA] OCFS on top of dual-primary DRBD in SLES11 SP1

2011-12-19 Thread Lars Ellenberg
igger ; > reboot -f"; > pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; > /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; > reboot -f"; > local-io-error "/usr/lib/drbd

Re: [Linux-HA] Antw: Re: Q: "exec-time" values

2011-12-06 Thread Lars Ellenberg
because of "max-children" or for whatever reason, and the actual exec time from fork to sigchld processing, that's a specific scope, no problem to replace time_longclock there with gettimeofday, or whatever you feel gives you the most pleasure (getrusage is not it, I dare say). If the pa

Re: [Linux-HA] Antw: Re: Q: "exec-time" values

2011-12-05 Thread Lars Ellenberg
(clock_t) < 8) meets those requirements. I don't see how getrusage would meet these requirements. So if you want to hack something up with getrusage, restrict that to a certain usage of time_longclock(), not to the implementation of time_longclock() itself. If you find something that fu
