Hi Igor,

I've been following this and your previous thread a little bit and have
some suggestions.

What version(s) of the following packages are installed on your system?

Heartbeat
drbd
drbdlinks
cluster-glue
cluster-agents
drbd0.7-module-source or drbd8-source

(These package names are based on Ubuntu Lucid, which you indicated you
were running.)

How is DRBD configured on your system? Could you post your configs,
please?

I've run both heartbeat and corosync in production with DRBD, and apart
from some very occasional odd behaviour they both work perfectly.

Are you running any kind of iptables firewalls on your systems?

The first thing to establish is that DRBD is working properly and you
can manually promote / demote your resource - because if that doesn't
work nothing will.

Secondly, the DRBD resource agents changed in version 8.x; the one
supplied with Heartbeat is _not_ supported according to Linbit. Instead
you should use Linbit's OCF resource agent (ocf:linbit:drbd).

drbdlinks is useful, but it relies upon DRBD being started by the OS, not
by the cluster manager.  This is why many people use heartbeat/pacemaker:
drbdlinks can still be used in a controlled manner (after permitting the
cluster to start drbd first).  Start simple: just get DRBD to go primary,
and worry about drbdlinks later.

What about startup order on boot?
Is heartbeat started before or after DRBD?
In your case (if you're using drbdlinks) Heartbeat should probably be
started _after_ drbd (on RHEL systems it's typically the reverse).
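
If it helps, here is a rough Python sketch for checking the start order
from the SysV rc links. It assumes both services are started through
/etc/rc2.d (runlevel 2 is the Ubuntu default) and that the init scripts
are named "drbd" and "heartbeat"; the function names are just mine, and
your box may be set up differently.

```python
import os
import re

# SysV start links are named S<NN><script>; rc runs them in lexical order.
RC_LINK = re.compile(r"^S(\d+)(.+)$")


def start_order(link_names, services=("drbd", "heartbeat")):
    """Return the given services in the order their start links would run."""
    found = []
    for name in link_names:
        m = RC_LINK.match(name)
        if m and m.group(2) in services:
            found.append((name, m.group(2)))
    found.sort()  # lexical order of the full link name
    return [svc for _, svc in found]


def boot_start_order(rc_dir="/etc/rc2.d"):
    """Read the start links from rc_dir (assumed path) and order them."""
    return start_order(os.listdir(rc_dir))
```

Calling boot_start_order() on each node should list drbd before
heartbeat if the order is what I'm suggesting above.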

After a reboot what does cat /proc/drbd say on each system?

That will at least confirm that DRBD is in the correct state.
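
If you want to compare the two nodes quickly, something like the Python
sketch below pulls the connection state, roles and disk states out of
the /proc/drbd text. The sample line is made up for illustration (it is
not from your system), the function names are my own, and treating
"Connected" plus "UpToDate/UpToDate" as healthy is my assumption of what
you want to see.

```python
import re

# Matches a device status line such as
#   " 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r----"
# DRBD 8.0 labels the role field "st:" where 8.3 uses "ro:".
STATUS_LINE = re.compile(
    r"^\s*(?P<minor>\d+):\s+cs:(?P<cs>\S+)"
    r"(?:\s+(?:ro|st):(?P<local_role>[^/\s]+)/(?P<peer_role>\S+))?"
    r"(?:\s+ds:(?P<local_disk>[^/\s]+)/(?P<peer_disk>\S+))?"
)

# Illustrative sample only, not real output from any system.
SAMPLE = """\
version: 8.3.7 (api:88/proto:86-91)
 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
"""


def parse_proc_drbd(text):
    """Map each DRBD minor number to its cs / role / disk-state fields."""
    devices = {}
    for line in text.splitlines():
        m = STATUS_LINE.match(line)
        if m:
            fields = m.groupdict()
            devices[int(fields.pop("minor"))] = fields
    return devices


def is_healthy(dev):
    """Assumed definition: connected, with both disks up to date."""
    return (dev["cs"] == "Connected"
            and dev["local_disk"] == "UpToDate"
            and dev["peer_disk"] == "UpToDate")
```

On a real node you would pass open("/proc/drbd").read() to
parse_proc_drbd() instead of SAMPLE.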

Yes, you are on the right track with heartbeat or corosync, but clusters
are not simple creatures, and many things can cause intermittent or
downright silly problems (such as port span or auto-negotiation on
switches). Don't give up.

Best Regards,

Brett

On Wed, 2010-08-11 at 16:23 -0500, Igor Chudov wrote:
> On Wed, Aug 11, 2010 at 3:24 PM, Dimitri Maziuk <dmaz...@bmrb.wisc.edu> wrote:
> > On Wednesday 11 August 2010 15:12, Igor Chudov wrote:
> > ...
> >> At this point, I am beginning to have my doubts about this whole
> >> heartbeat system and its ability to serve for years, in what looks to
> >> me like simple configuration.
> > ...
> >
> > Well, that's kinda why I stick to 2.1.4 (also b/c it's a stock rpm on
> > centos) and v1-style config. From back when things were simple stupid.
> 
> Simple stupid is exactly what I want.
> 
> > As I understand it, most heartbeat work since was done on v2 features: xml,
> > resource monitoring, corosync, pacemaker... which I'm either not missing
> > (mon works just fine for monitoring) or actively don't want (xml in
> > particular).
> 
> I would not mind xml if either 1) it was documented or 2) the command
> line tool was documented beyond just mentioning every field or 3) the
> GUI was working instead of not working.
> 
> > When I need a 3-node cluster I'll think about those. Until then, 2.1.4 is
> > not perfect but it works well enough.
> 
> My heartbeat is 3.0.3.
> 
> Do you think that, say, 2.1.4 is sufficiently bug free that I could
> install it from source and just let it run forever?
> 
> I mean, I just want to get that simple two node cluster to run. I am
> not trying to back up Mars to Venus and Uranus by TCP over light rays.
> If 2.1.4 is easy and works, I will just install it. I assume that it
> can work with standard Ubuntu Lucid drbd.
> 
> 
> i
> 

-- 
Best Regards,

Brett Delle Grazie

_______________________________________________
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
