On 10/27/08 09:31, Michael Schuster wrote:
> Darren,
>
> thx for your comments. some answers/reflection below:
>
> On 10/26/08 19:43, Darren Reed wrote:
> ..
>> Health Checks.
>> ==============
>> This design has a single daemon, with a single thread,
>> that polls multiple servers to update a single pool of
>> data in the kernel.
>>
>> If we assume that the in-kernel handling of requests
>> from the daemon enforces MP-safety, why not run multiple
>> daemons?
>
> actually, it's the daemon that will serialise access to the kernel.
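To make the disagreement above concrete, here is a minimal sketch (illustrative only, not ilbd code, and in Python rather than the C the daemon would actually use) of how probing can be fully concurrent while updates to the kernel remain serialised at a single point, rather than relying on the daemon's single thread for serialisation:

```python
# Sketch: one health-check thread per destination, with all results
# funnelled through a queue. Only one loop ever touches kernel_table,
# so "kernel" updates are serialised even though probing is concurrent.
# All names (run_checks, probe, kernel_table) are hypothetical.
import queue
import threading

def run_checks(destinations, probe, kernel_table):
    """Probe each destination from its own thread; apply results serially."""
    updates = queue.Queue()

    def checker(dest):
        # A real checker would loop on a timer; one probe suffices here.
        updates.put((dest, probe(dest)))

    threads = [threading.Thread(target=checker, args=(d,)) for d in destinations]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # Single point of serialisation: only this loop mutates kernel_table.
    while not updates.empty():
        dest, healthy = updates.get()
        if healthy:
            kernel_table.add(dest)
        else:
            kernel_table.discard(dest)
    return kernel_table
```

With this split, a slow probe delays only its own thread, and the serialisation lives in one small, easily audited consumer rather than in the structure of the whole daemon.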
If your kernel interfaces aren't MP-safe then you need to fix them so
that they are. It is not acceptable to require the daemon to ensure the
integrity of data inside the kernel.

>> i.e. run an ilbd per back end server (or at least a
>> thread per back-end server.) You might still need a
>> single daemon to act as the manager? *shrug*
>
> this sounds like you're replacing the load of repeatedly starting health
> check processes by having as many processes sitting around idly a lot of
> the time.

Yup.

> Since in the current design ilbd maintains quite a bit of state, one would
> indeed have to coordinate all the information to be able to get the
> "complete" picture again, so the added benefit seem a little elusive to me
> here.

What state is there to manage that needs to be shared? And if there is
such state, why isn't it talked about in the design doc?

So far as health checks go, the ilbd is responsible for:
- ensuring that all of the destinations are periodically probed, and
- ensuring that the list of in-kernel destinations matches those that
  are successfully responding to probes.

One way to do that is to have a big program that polls each one in turn,
with lots of complexity to ensure that nobody causes the program to pause
too long and everyone gets serviced in turn, all inside one big loop.
There is lots of state held, but it is all still per-destination.

>> This should also remove the ilbd main-loop from being a
>> critical section of code, where slow down from dealing
>> with one external server can impact all of the others.
>> Instead, scheduling of work is left up to the kernel to
>> schedule threads/processes, depending on who's busy or
>> blocked, etc.
>
> anything that we expect to block (health check) is farmed out to processes,
> so that happens anyway.

The design document does not reflect this at all.

Darren
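For reference, the "one big loop" design Darren describes can be sketched roughly as follows (a hypothetical illustration, again in Python rather than C; none of these names come from the ILB design doc). Each pass walks every destination, probes only those whose deadline has passed, and keeps all state per-destination:

```python
# Sketch of the single-loop poller described above: one loop services
# every destination in turn, with per-destination deadlines so that
# each gets probed periodically. All names here are illustrative.
def poll_once(now, state, probe, interval=10):
    """One pass of the main loop.

    state maps destination -> {'next': next-probe deadline, 'healthy': bool}.
    Returns the set of destinations currently considered healthy, i.e.
    what the in-kernel destination list should be synchronised to.
    """
    for dest, s in state.items():
        if now >= s['next']:
            # In the real daemon this probe can block, which is exactly
            # the slow-down-one-slows-down-all problem discussed above.
            s['healthy'] = probe(dest)
            s['next'] = now + interval
    return {d for d, s in state.items() if s['healthy']}
```

The structural point of the argument is visible here: the state is per-destination either way, so nothing about it forces a single loop; what the single loop adds is the risk that one blocking `probe()` call stalls service to every other destination.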
