Re: [Linux-HA] Heartbeat does not take over if BOTH machines are booted at the same time

Igor Chudov Thu, 05 Aug 2010 21:47:56 -0700

On Thu, Aug 5, 2010 at 6:32 PM, Pushkar Pradhan <[email protected]> wrote:
> I set up two Ubuntu Lucid machines to serve as a two-node Heartbeat
> cluster without Corosync.
>
> They support a DRBD service, IP address, NFS and Samba services.
>
> Things mostly work, and if I reboot one server, the other takes over.
>
> What does NOT work is that if I reboot both, then *neither* takes
> over. When they are in this state -- both running and none active --
> if I reboot one of them, then the other begins to work.
>
> This is becoming a real embarrassment for me at work and I would love
> to get some help.
>
> haresources:
> pfs-srv3 drbddisk::r0 Filesystem::/dev/drbd0::/pfs::ext3 10.1.8.45/24
> nfs-kernel-server smbd
> pfs-srv4
>
> ha.cf:
> use_logd on
> udpport 12694
> keepalive 1
> warntime 15
> deadtime 20
> debug 1
> initdead 60
> bcast eth1
> node pfs-srv3
> node pfs-srv4
> auto_failback on
> crm off
>
>
> Can you experiment with a really large initdead time like 2 or 5 minutes? 
> Also see if it helps to do unicast messaging?


Larger initdead does not help. I will try unicast tomorrow but I doubt
it will help.
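
For reference, switching the ha.cf above to unicast would mean replacing
the bcast line with a ucast line that names the peer. This is only a
sketch: the address is a placeholder for the eth1 address of pfs-srv4,
not a value taken from this thread, and on pfs-srv4 the line would point
at pfs-srv3 instead.

    # ha.cf on pfs-srv3, unicast instead of broadcast (sketch)
    # bcast eth1
    ucast eth1 <pfs-srv4-eth1-address>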

Pushkar, could you or someone else suggest some tools to troubleshoot
this issue?

Right now I am poking in the dark.

i


>
> pushkar
>
>
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
>