Re: [PATCH master] Inter-node RPC timeout design

Michael Hanselmann Tue, 01 Dec 2009 05:10:00 -0800

2009/12/1 Iustin Pop <[email protected]>:
> On Mon, Nov 30, 2009 at 06:01:17PM +0100, Michael Hanselmann wrote:
>> +There is one major problem with this design: Timeouts can not be used on
>> +a per-request basis. Neither client or server know how long it will
>> +take. Even if we might be able to group requests into different
>> +categories (e.g. fast and slow), this is not reliable.
>> +
>> +If a node has an issue or the network connection fails while a request
>> +is being handled, the master daemon can wait for a long time for the
>> +connection to time out (due to the operating system's underlying TCP
>> +keep-alive packets or timeouts). While the settings for keep-alive
>> +packets can be changed using Linux-specific socket options, we don't
>> +consider them reliable and responsive enough for our case.
>
> This is not really fair/correct, we use and rely on socket timeouts configured
> system-wide for 'node down' case - and it works.


Yes, for the initial connect. However, the HTTP client disables read
timeouts after connecting (see
lib/http/client.py:HttpClientRequestExecutor.READ_TIMEOUT and
HttpClientRequestExecutor._ReadResponse). Otherwise it would time out
for long-running RPCs, depending on how the timeout is chosen. Hence
the “while a request is being handled” above.
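
To illustrate the behaviour described above, here is a rough sketch using
plain sockets. It is not the code from lib/http/client.py; the function
name and default value are made up for this example.

```python
import socket


def open_rpc_connection(host, port, connect_timeout=10.0):
    """Connect with a timeout, then disable the timeout for reads.

    Hypothetical helper for illustration only.
    """
    # The timeout applies while establishing the connection, so a node
    # that is down is detected here.
    sock = socket.create_connection((host, port), timeout=connect_timeout)
    # From here on recv() blocks without a limit: a long-running RPC will
    # not time out, but neither will a node that fails mid-request.
    sock.settimeout(None)
    return sock
```

With the read timeout disabled, only the operating system's TCP timeouts
remain once the request has been sent.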

>> +This proposal can easily be implemented using HTTP, though it would
>> +likely be more efficient and less complicated to use the LUXI protocol
>> +already used to communicate between client tools and the Ganeti master
>> +daemon.
>
> I'm not sure I understand here - what is the actual proposal, switch or
> remain with HTTP?

Remain with HTTP for now:

@@ -73,7 +73,8 @@ libraries, which, unfortunately, turned out to miss important features
 This proposal can easily be implemented using HTTP, though it would
 likely be more efficient and less complicated to use the LUXI protocol
 already used to communicate between client tools and the Ganeti master
-daemon.
+daemon. Switching to another protocol can occur at a later point. This
+proposal should be implemented using HTTP as its underlying protocol.

>> +Function processes communicate with the parent process via stdio and
>> +possibly their exit status. Every function process has a unique
>> +identifier, though it shouldn't be the process ID (PIDs can be recycled
>> +and are prone to race conditions for this use case).
>
> (I wonder if PIDs+other ID is not unique enough)

If the other ID is just a counter, there's no need to combine it with
the PID. The node daemon will have to keep an internal list of its
child processes anyway.

Actually, it probably should be something like "%s-%s-%s" %
(time.time(), pid, unique_id). Otherwise, if the node daemon is
restarted, function call IDs can collide again. A UUID would be even
better, but would probably be too expensive. The exact format or composition
of the function call ID should not be part of this rather high-level
proposal.
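
Just to make the idea concrete, a minimal sketch of such an ID generator.
The function name and the use of itertools.count for the counter are
assumptions for this example, not part of the proposal.

```python
import itertools
import os
import time

# Per-process counter; it starts over whenever the daemon is restarted,
# which is why the timestamp and PID are included as well.
_counter = itertools.count(1)


def make_function_call_id():
    """Return an ID built from timestamp, PID and a running counter."""
    return "%s-%s-%s" % (time.time(), os.getpid(), next(_counter))
```

Two calls in the same process differ at least in the counter, and IDs from
before and after a restart differ in the timestamp (and usually the PID).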

>> +In the future, ``StartFunction`` could support an additional parameter
>> +to specify after how long the function process should be aborted.
>                                          ^ started, or process started?

I used the term “function process” to describe the child processes
started by the node daemon to actually call a (backend) function.
Should I add a small glossary?

>> +Simplified timing diagram::
>> +
>> +  Master daemon        Node daemon                      Function process
>> +   |
>> +  Call function
>> +  (timeout 10s) -----> Parse request and fork for ----> Start function
>> +                       calling actual function, then     |
>> +                       wait up to 10s for function to    |
>> +                       finish                            |
> […]
>
> Questions: the "wait up to 10s" there, is done in which process? parent
> ganeti-noded - which would mean stalling all other requests?

It needs to be in the parent process (ganeti-noded). As Guido writes,
we already have the library code to do this in an asynchronous fashion
(which is, more or less, a necessity for this proposal).
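
A rough sketch of what waiting in the parent without stalling other
requests could look like. This does not use our existing library code; the
class, its method names and the status strings are made up for this
example, and what gets reported to the master once the deadline has passed
is left open here.

```python
import subprocess
import time


class FunctionProcessTable:
    """Tracks child processes and their deadlines (hypothetical)."""

    def __init__(self):
        self._procs = {}  # call ID -> (Popen object, deadline)

    def start(self, call_id, argv, timeout):
        """Start a function process and record when its wait ends."""
        proc = subprocess.Popen(argv, stdout=subprocess.DEVNULL)
        self._procs[call_id] = (proc, time.monotonic() + timeout)

    def poll(self):
        """Check all children without blocking; return call ID -> status."""
        status = {}
        for call_id, (proc, deadline) in self._procs.items():
            if proc.poll() is not None:
                status[call_id] = "finished"
            elif time.monotonic() >= deadline:
                status[call_id] = "deadline passed"
            else:
                status[call_id] = "waiting"
        return status
```

The main loop would call poll() between handling other requests, so a slow
function only delays its own response.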

> What happens if the noded process is restarted?

If we handle SIGINT/SIGTERM, the node daemon could wait for its child
processes before exiting. Otherwise the function processes just run to
the end. I don't think we should kill them; that would make signal
handling even more complicated (assuming root won't send signals).

Regards,
Michael
