On 8/3/07, Klemens Kittan <[EMAIL PROTECTED]> wrote:
> Hi,
>
> we had partial success and now further questions arise:
>
> We added role="Master" to the declaration for the drbd0 monitoring operation:
>
> <op id="drbd0_mon_0" name="monitor" interval="10s" timeout="5s"
> role="Master"/>
>
> With this declaration, the other node becomes Master if either drbd0 or
> heartbeat itself is stopped on the master, or if the external network is
> disconnected. The output of cat /proc/drbd is OK.
>
> Naturally, drbd0 on the Slave is then not monitored at all. Though this is
> not critical, we tried to add another monitor operation with role="Slave"
> but then none of the nodes was promoted as master initially...

Was that with the same interval? Monitor operations on the same resource
need distinct intervals, so try adding:
 <op id="drbd0_mon_11" name="monitor" interval="11s" timeout="5s"/>
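
For illustration, the two operations would then sit side by side in the
resource's operations section. This is only a sketch: I'm assuming both ops
live in the usual <operations> block of the drbd0 primitive, and I haven't
checked it against your attached cib.xml.

```xml
<operations>
  <!-- existing monitor for the Master instance, as already declared -->
  <op id="drbd0_mon_0" name="monitor" interval="10s" timeout="5s" role="Master"/>
  <!-- additional monitor, same name but a different interval -->
  <op id="drbd0_mon_11" name="monitor" interval="11s" timeout="5s"/>
</operations>
```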

> We still do have a problem with the failback mechanism:
> If the old Master comes back (heartbeat/drbd start or network reconnected),
> the Master role moves back (failback behaviour).  We have a
> preferred_location with score 100 on odin, but we thought that the
> default_resource_stickiness of INFINITY would clearly win over that so
> preferred_location would only have an effect when heartbeat is started?

No. Stickiness only controls where resources run, not which state (Master
or Slave) they are in. The drbd agent should be setting the correct master
preference using crm_master...

> The observed behaviour suggests that either the default_resource_stickiness
> does not apply to a multistate resource or that it does only distinguish
> between Stopped and Started, not between Master/Slave.
> Can anyone please help with this?
>
> For these tests, we omitted the IPAddr/MailTo resource group from the cib.xml
> The file is attached.
>
> Thanks in advance,
> Klemens
>
> ...
> >
> > 2. I stopped the drbd service on the Master(odin). The drbd resource though
> > stays Master on odin, Slave on frigg and the IPAddr/MailTo group stays on
> > odin. drbd-status on odin becomes Unknown/Secondary, on frigg
> > Secondary/Unknown. Why does monitoring for the drbd resource not work?
> >
> > 3. If I stop heartbeat itself on odin, takeover to frigg is done with all
> > resources... Fine. Now, if I start heartbeat again, everything moves back
> > to odin. I thought auto-failback was disabled by setting
> > default-resource-stickiness=INFINITY
> >
> > 4. The same behaviour as under 3. can be triggered by disconnecting
> > odin's connection to the external network. Again, after disconnecting,
> > everything moves back to odin...
> >
> > >> I used the configuration from DRDBHowTov2 (taken from
> > >> wiki.linux-ha.org/DRBD/...) and have DRBD running on two nodes
> > >> successfully.
> > >>
> > >> Node 1 (odin) became master, node 2 (frigg) became started (equal to
> > >> slave).
> > >>
> >
> > >> drbd_1:0 Started frigg
> > >> drbd_1:1 Master odin
> > >>
> > >> Now, I stopped DRBD on node 1 using "/etc/init.d/drbd stop". HA still
> > >> reports:
> > >> drbd_1:0 Started frigg
> > >> drbd_1:1 Master odin
> > >> while drbd ("cat /proc/drbd") on frigg reports:
> > >> st:Secondary/Unknown
> > >>
> > >> As a result, I cannot mount DRBD, because it is not running as primary.
> > >> Normally, HA should notice that DRBD is not running on odin anymore and
> > >> should migrate DRBD to frigg (or should set the state to primary on frigg)!
> > >> This is not the case, and I can't find the mistake.
> > >>
> > >> Can anyone help me in this case?
>
> --
> Klemens Kittan
> Systemadministrator
>
> Uni-Potsdam, Inst. f. Informatik
> August-Bebel-Str. 89
> 14482 Potsdam
>
> Tel.    :   +49-331-977/3125
> Fax.    :   +49-331-977/3122
> eMail   : [EMAIL PROTECTED]
>
> gpg --recv-keys --keyserver wwwkeys.de.pgp.net 6EA09333
>
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
>
>
