2009/12/1 Iustin Pop <[email protected]>:
> On Mon, Nov 30, 2009 at 06:01:17PM +0100, Michael Hanselmann wrote:
>> +There is one major problem with this design: Timeouts can not be used on
>> +a per-request basis. Neither client or server know how long it will
>> +take. Even if we might be able to group requests into different
>> +categories (e.g. fast and slow), this is not reliable.
>> +
>> +If a node has an issue or the network connection fails while a request
>> +is being handled, the master daemon can wait for a long time for the
>> +connection to time out (due to the operating system's underlying TCP
>> +keep-alive packets or timeouts). While the settings for keep-alive
>> +packets can be changed using Linux-specific socket options, we don't
>> +consider them reliable and responsive enough for our case.
>
> This is not really fair/correct, we use and rely on socket timeouts configured
> system-wide for 'node down' case - and it works.

Yes, for the initial connect. However, the HTTP client disables read
timeouts after connecting (see
lib/http/client.py:HttpClientRequestExecutor.READ_TIMEOUT and
HttpClientRequestExecutor._ReadResponse). Otherwise it would time out
for long-running RPCs, depending on how the timeout is chosen. Hence the
“while a request is being handled” above.

>> +This proposal can easily be implemented using HTTP, though it would
>> +likely be more efficient and less complicated to use the LUXI protocol
>> +already used to communicate between client tools and the Ganeti master
>> +daemon.
>
> I'm not sure I understand here - what is the actual proposal, switch or
> remain with HTTP?

Remain with HTTP for now:

@@ -73,7 +73,8 @@ libraries, which, unfortunately, turned out to miss important features
 This proposal can easily be implemented using HTTP, though it would
 likely be more efficient and less complicated to use the LUXI protocol
 already used to communicate between client tools and the Ganeti master
-daemon.
+daemon. Switching to another protocol can occur at a later point. This
+proposal should be implemented using HTTP as its underlying protocol.

>> +Function processes communicate with the parent process via stdio and
>> +possibly their exit status. Every function process has a unique
>> +identifier, though it shouldn't be the process ID (PIDs can be recycled
>> +and are prone to race conditions for this use case).
>
> (I wonder if PIDs+other ID is not unique enough)

If the other ID is just a counter, there's no need to combine it with
the PID. The node daemon will have to keep an internal list of its child
processes anyway.

Actually, it probably should be something like
"%s-%s-%s" % (time.time(), pid, unique_id). Otherwise, if the node
daemon is restarted, function calls can collide again. A UUID would be
even better, but would probably be too expensive. The exact format or
composition of the function call ID should not be part of this rather
high-level proposal.
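
For illustration only, here is a minimal sketch of such an ID. It assumes
the third component is a simple per-daemon counter; make_call_id and
_counter are hypothetical names, not existing Ganeti code:

```python
import itertools
import os
import time

# Hypothetical per-daemon counter; it restarts at 1 whenever the node
# daemon is restarted, which is why the timestamp is part of the ID.
_counter = itertools.count(1)


def make_call_id(now=None, pid=None):
    """Build a function call ID from start time, daemon PID and a counter.

    The arguments exist only to make the sketch testable; by default the
    current time and the PID of the calling process are used.
    """
    if now is None:
        now = time.time()
    if pid is None:
        pid = os.getpid()
    return "%s-%s-%s" % (now, pid, next(_counter))
```

The timestamp is what guards against collisions after a node daemon
restart, where both the PID and the counter can repeat. If the cost turns
out to be acceptable, uuid.uuid4() from the standard library would be the
UUID alternative.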

>> +In the future, ``StartFunction`` could support an additional parameter
>> +to specify after how long the function process should be aborted.
> ^ started, or process started?

I used the term “function process” to describe the child processes
started by the node daemon to actually call a (backend) function. Should
I add a small glossary?

>> +Simplified timing diagram::
>> +
>> +  Master daemon        Node daemon                      Function process
>> +   |
>> +  Call function
>> +  (timeout 10s) -----> Parse request and fork for ----> Start function
>> +                       calling actual function, then     |
>> +                       wait up to 10s for function to    |
>> +                       finish                            |
> […]
>
> Questions: the "wait up to 10s" there, is done in which process? parent
> ganeti-noded - which would mean stalling all other requests?

It needs to be in the parent process (ganeti-noded). As Guido writes, we
already have the library code to do this in an asynchronous fashion
(which is, more or less, a necessity for this proposal).

> What happens if the noded process is restarted?

If we handle SIGINT/SIGTERM, it could wait for its child processes.
Otherwise the function processes just run to the end. I don't think we
should kill them, otherwise things get even more complicated with signal
handling (assuming root won't send signals).

Regards,
Michael
