On 10/27/08 17:09, Darren Reed wrote:
> On 10/27/08 09:31, Michael Schuster wrote:
>> Darren,
>>
>> thx for your comments. some answers/reflection below:
>>
>> On 10/26/08 19:43, Darren Reed wrote:
>> ..
>>> Health Checks.
>>> ==============
>>> This design has a single daemon, with a single thread,
>>> that polls multiple servers to update a single pool of
>>> data in the kernel.
>>>
>>> If we assume that the in-kernel handling of requests
>>> from the daemon enforces MP-safety, why not run multiple
>>> daemons?
>>
>> actually, it's the daemon that will serialise access to the kernel.
>
> If your kernel interfaces aren't MP-safe, then you need to fix
> your interfaces so that they are. It is not acceptable to
> require the daemon to ensure the integrity of data inside
> the kernel.
>
>
>>> i.e. run an ilbd per back end server (or at least a
>>> thread per back-end server.) You might still need a
>>> single daemon to act as the manager? *shrug*
>>
>> this sounds like you're replacing the load of repeatedly starting health 
>> check processes with the cost of having as many processes sitting around 
>> idle much of the time.
>
> Yup.
>
Darren,

Assuming I have 100 back-end servers, your suggestion requires 100 
ilbd instances. What exactly would be the benefit of this design 
that would justify the added complexity?

BTW, for Phase 1 we have decided that the external health checks will be 
implemented in the same way as the ping and TCP/UDP probes. Depending on 
what gets the most use by admins, we may change the implementation of the 
ping and TCP/UDP probes in a later phase.
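For reference, a TCP probe of the kind mentioned above amounts to little more than a timed connect attempt. A minimal sketch in Python (the name `tcp_probe` and the default timeout are mine, not from the ILB code):

```python
import socket

def tcp_probe(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) completes
    within `timeout` seconds; False on refusal or timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A UDP probe is messier (no handshake to confirm), which is one reason the probe type matters for what "healthy" actually means.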
>
>> Since in the current design ilbd maintains quite a bit of state, one would 
>> indeed have to coordinate all the information to be able to get the 
>> "complete" picture again, so the added benefit seems a little elusive to me 
>> here.
>
> What state is there to manage that needs to be shared?
> And if there is such state, why isn't it talked about
> in the design doc?
>
> So far as health checks go, the ilbd is responsible for:
> - ensuring that all of the destinations are periodically
>   probed and
> - ensuring that the list of in-kernel destinations matches
>   those that are successfully responding to probes.
>
> One way to do that is to have a big program that polls each
> one in turn, with lots of complexity to ensure that nobody
> causes the program to pause too long and everyone gets
> serviced in turn, all inside one big loop. There is lots
> of state held but it is all still per-destination.
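The per-destination alternative being argued for here can be sketched with one worker thread per back-end server, holding only per-destination state (illustrative only; the names `probe_loop` and `start_workers` are mine):

```python
import threading

def probe_loop(dest, probe, interval, status, stop):
    # One worker per back-end server: each probes on its own schedule,
    # so a slow or blocked destination cannot delay the others.
    while not stop.is_set():
        status[dest] = probe(dest)   # per-destination state only
        stop.wait(interval)          # sleep, but wake promptly on stop

def start_workers(dests, probe, interval=5.0):
    status = {}
    stop = threading.Event()
    threads = [threading.Thread(target=probe_loop,
                                args=(d, probe, interval, status, stop),
                                daemon=True)
               for d in dests]
    for t in threads:
        t.start()
    return status, stop, threads
```

The kernel's scheduler then does the work that the big single loop would otherwise have to do by hand.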
>
>
>>> This should also remove the ilbd main-loop from being a
>>> critical section of code, where slow down from dealing
>>> with one external server can impact all of the others.
>>> Instead, scheduling of work is left up to the kernel to
>>> schedule threads/processes, depending on who's busy or
>>> blocked, etc.
>>
>> anything that we expect to block (health check) is farmed out to processes, 
>> so that happens anyway.
>
> The design document does not reflect this at all.
>
> Darren
>
