Thanks for your comments.

Lars Marowsky-Bree wrote:
> On 2011-02-05T00:16:45, Lars Ellenberg <lars.ellenb...@linbit.com> wrote:
> 
>> I do like this packet counter monitoring.
> 
> So do I, but I'd just casually suggest that this may make sense as a
> daemon, or at least per physical NIC - instead of per virtual IP.
> 
I do understand the reasoning, but right now it's beyond my abilities to
code such a daemon. I have talked to a colleague, though, who said it
should be doable (for him) to write a small daemon that listens on the
kernel's netlink interface and acts on events such as link loss. This
would of course be a much cleaner solution, and faster too. We will see
whether we have time to look into this.
So for now I will just go with this approach, which has some benefits
too, since the monitoring sits right with the IP you care about...
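
To illustrate the netlink idea without a real daemon, here is a rough
shell sketch built on "ip monitor link" from iproute2, which prints the
kernel's netlink link notifications. The function names parse_link_line
and watch_links are made up for this example, and treating the LOWER_UP
flag as "carrier present" is my assumption about what one would act on.
It is not what my colleague has in mind, just a way to show the shape of
the thing.

```shell
# Sketch only: turn "ip monitor link" output into "<ifname> up|down" events.

# $1: one line of output. Prints "<ifname> up" or "<ifname> down" for
# lines that describe an interface, and nothing for continuation lines.
parse_link_line() {
    line=${1#Deleted }
    case "$line" in
        [0-9]*": "*": <"*) ;;     # looks like "2: eth0: <FLAGS> ..."
        *) return 0 ;;            # e.g. "link/ether ..." lines
    esac
    rest=${line#*: }              # strip the leading interface index
    ifname=${rest%%:*}            # name runs up to the next colon
    ifname=${ifname%%@*}          # drop "@parent" on VLAN-style names
    flags=${rest#*<}
    flags=${flags%%>*}            # comma-separated flag list
    case ",$flags," in
        *,LOWER_UP,*) echo "$ifname up" ;;
        *)            echo "$ifname down" ;;
    esac
}

# Blocks forever; a real agent or daemon would act on the events
# instead of just printing them.
watch_links() {
    ip monitor link | while read -r line; do
        parse_link_line "$line"
    done
}
```

Calling watch_links in a terminal and pulling a cable should then print
a "down" line for the affected interface.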

> (Andrew, much to my dismay, has changed the pingd code to instead be a
> periodically executed RA, which IMHO is a detrimental change - we need
> to stop executing more.)
> 
>>> +# 09: check for nonempty ARP cache
>>> +# 10: watch for packet counter changes
>>> +#
>>> +# 19: check arping_ip_list
>>> +# 20: check arping ARP cache entries
>>> +# 
>>> +# 30:  watch for packet counter changes in promiscios mode
>>> +# 
>>> +# If unsuccessfull in levels 18 and above,
>>> +# the tests for higher check levels are run.
> 
> As a general suggestion, I would not base this on the "monitor depth".
> These are not necessarily incremental, and we have the ability to pass
> arbitrary parameters to the monitor operation. You could either have one
> parameter that you treat like a flag list or individual ones.

This sounds reasonable to me. I will look into this as soon as I can.

Does anybody have any other opinions on that?
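
To make the "flag list" variant concrete, here is a minimal sketch of
how a single parameter could select the checks. The parameter name
monitor_checks and the flag names (arp_cache, packet_count, arping,
promisc) are hypothetical and not part of the patch; the only thing
taken as given is that resource parameters reach the agent as
OCF_RESKEY_<name> environment variables.

```shell
# Hypothetical parameter: monitor_checks, a space-separated flag list,
# e.g. monitor_checks="arp_cache packet_count"

# $1: flag name. Returns 0 if the flag is present in the list.
check_enabled() {
    case " ${OCF_RESKEY_monitor_checks:-} " in
        *" $1 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Example: report which of the assumed checks would run.
OCF_RESKEY_monitor_checks="arp_cache packet_count"
for c in arp_cache packet_count arping promisc; do
    if check_enabled "$c"; then
        echo "$c: enabled"
    else
        echo "$c: disabled"
    fi
done
```

With a list like this the checks no longer need to be ordered or
incremental, which seems to be the main objection to tying them to the
monitor depth.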

> 
> And, clearly, one may want to try arping|pinging the default gateway
> too. So there's significant overlap here with the "ping" stuff.

It is not that clear to me (yet?) that this would be useful. Couldn't it
be that we just serve an IP for one subnet on one interface, while the
default gateway is on another subnet on another interface? If there is
a gateway on this interface, then I would definitely recommend that the
user add its IP to the arping_ip_list. Maybe I should add a comment
about this. I don't see that much overlap here yet, but maybe that's
because I haven't looked at the ping RA yet.

> In particular, your changes capture a point in time, while ping, by
> virtue of going through attrd, is able to dampen changes. Your approach
> will immediately fail a NIC - possibly on all nodes - if a switch
> reboots, or if there really is a brief lack of traffic. I'm not quite
> sure that is a desirable property.

Me neither, but I'm also not sure that the opposite is desirable. If I
have two (or more) nodes connected to different switches, then I might
in fact want the IP to switch over as soon as one switch fails. On the
other hand, if all nodes are connected to just one switch, then moving
the IP on a switch reboot would not be such a good idea. (Although one
could expect an administrator to disable the monitor operation prior to
rebooting the switch... :)

Maybe one could keep that configurable (e.g. with a dampen option). I
will have a look at the ping RA and attrd.
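
Just to show what I mean by "configurable", here is a rough sketch that
only reports a failure after several consecutive bad monitor results.
This is not how attrd dampens changes (I have not looked at that yet);
it is a simple local counter swapped in for illustration. The names
FAIL_THRESHOLD, STATE_FILE and dampened_result are all made up, and the
state file location is an assumption.

```shell
# Sketch only: tolerate brief outages by counting consecutive failures.
STATE_FILE="${STATE_FILE:-/tmp/ipaddr2-demo.failcount}"  # assumed location
FAIL_THRESHOLD="${FAIL_THRESHOLD:-3}"                    # hypothetical parameter

# $1: result of the current check (0 = traffic seen, non-zero = none).
# Returns 0 while the interface should still be treated as healthy,
# non-zero once FAIL_THRESHOLD consecutive checks have failed.
dampened_result() {
    count=0
    if [ -r "$STATE_FILE" ]; then
        read -r count < "$STATE_FILE" || count=0
    fi
    case "$count" in
        ''|*[!0-9]*) count=0 ;;   # ignore a corrupt or empty state file
    esac
    if [ "$1" -eq 0 ]; then
        count=0                   # any success resets the counter
    else
        count=$((count + 1))
    fi
    echo "$count" > "$STATE_FILE"
    [ "$count" -lt "$FAIL_THRESHOLD" ]
}
```

With FAIL_THRESHOLD=1 this behaves like the current patch (fail
immediately); with a higher value and a short monitor interval a switch
reboot could be ridden out, at the price of a slower failover.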


Yours,
Robert.
_______________________________________________________
Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/