On 05/22/10 01:00 AM, Jacques wrote:
I'm imagining a situation where ILB stops being responsive but is
still running.  A bug would be the most likely cause, but I've seen
imaginary gremlins flip a few bits, too. I think that my main
question was whether VRRP has any awareness of ILB running.  It
sounds like it doesn't.  So if the process runs away or someone
foolishly disables it via svcadm, VRRP won't be any the wiser.
Correct?


The real work, except for the server health checks, is done in the
kernel.  So depending on what the bug in ilbd really is, the
ILB function may well continue.  If the bug is that it somehow
removes all ILB rules, then of course the ILB function will
no longer work.  If you are talking about a bug in the kernel
ILB code, then yes, ILB will stop functioning even if the
ilbd process is running happily.  So if you are talking about
the whole ILB functionality, there are multiple components
which should be considered.


I'm just trying to understand the relationship between the services.
If there isn't a relationship, I guess I could set up a verify script
that monitors the responsiveness of ILB and disables the VRRP node if
the service is unresponsive (due to a bug or someone accidentally
disabling it).  Does anyone already do this?


I think you can consider the use of VRRP with ILB as an HA
mechanism to be just a simple ping health check on the ILB machine.
Currently, we don't have a higher level health check on ILB
between master and backup.  This can get quite complicated as
ILB can support many different rules.  So in theory, the health
check on ILB functionality should be rule specific.  We don't
want to have such a complicated mechanism, as it is really needed
only when there is a serious bug in the ILB code.  And the more
complex an HA mechanism is, the more likely it is to have bugs in
the code...  The original design indeed had some ties between VRRP
and the ILB service (i.e. between the user level processes).  But the
complexity was not justified and it did not catch kernel issues
anyway.  We opted for the simple design with clear interaction.
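
For anyone who still wants the external verify script described in
the question, a rough sketch of the idea follows.  It is only an
illustration, not something shipped with ILB or VRRP.  The VIP, port
and router name are made-up placeholders, and the
"vrrpadm disable-router" invocation is an assumption to be checked
against the vrrpadm man page on your release.  A probe that
originates on the ILB host itself may not follow the same kernel
path as client traffic, so where the probe runs from is left open.

```python
import socket
import subprocess
import time

# Hypothetical values: substitute your own VIP, port, and router name.
VIP = "192.0.2.10"
PORT = 80
ROUTER = "vrrp1"
# Assumed command for taking this node out of the VRRP group.
DISABLE_CMD = ["vrrpadm", "disable-router", ROUTER]


def probe(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def should_disable(results, threshold=3):
    """True when the last `threshold` probe results were all failures."""
    recent = results[-threshold:]
    return len(recent) == threshold and not any(recent)


def watch(check, on_fail, interval=5.0, threshold=3,
          max_rounds=None, sleep=time.sleep):
    """Call check() repeatedly; call on_fail() once after `threshold`
    consecutive failures and return True.  Return False if max_rounds
    checks complete without triggering."""
    results = []
    rounds = 0
    while max_rounds is None or rounds < max_rounds:
        results.append(check())
        results = results[-threshold:]  # keep only what we need
        if should_disable(results, threshold):
            on_fail()
            return True
        rounds += 1
        sleep(interval)
    return False


if __name__ == "__main__":
    watch(lambda: probe(VIP, PORT),
          lambda: subprocess.run(DISABLE_CMD, check=True))
```

Requiring several consecutive failures keeps a single dropped
connection from forcing a failover.  Note that this checks one
rule only; as said above, a complete check would have to be rule
specific.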



--

                                        K. Poon.
                                        [email protected]
_______________________________________________
networking-discuss mailing list
[email protected]
