On Thu, Aug 25, 2016 at 02:55:12PM +0200, Klaus Aehlig wrote:
>
> Hi Brian,
>
> > I'm not sure that's true. For example if you have running job that has made
> > a WaitForJobChange RPC call over the luxi UDS and is waiting for the
> > response, isn't that going to be interrupted if the luxi daemon is
> > restarted?
>
> it certainly was a design goal of the daemon refactoring; we might have missed
> some bugs, but I think here we did it right.
>
> Besides that I doubt that any jobs do a WaitForJobChange RPC (and, in fact, as
> far as I remember, jobs get all their information from WConfD), the mechanism
> used by jobs for calling UDS is aware that the daemon might be absent,
> restarted, etc, and does all the needed retries. If I remember correctly,
> quite a lot of the magic is in lib/rpc/transport.py.

*lightbulb*

Aaaah, you're right! The WaitForJobChanges I was seeing in our test setup
in fact all came from a bunch of running gnt-job watch commands, not the
jobs themselves.

Cheers,
Brian.

> Thanks,
> Klaus
>
> --
> Klaus Aehlig
> Google Germany GmbH, Erika-Mann-Str. 33, 80636 Muenchen
> Registergericht und -nummer: Hamburg, HRB 86891
> Sitz der Gesellschaft: Hamburg
> Geschaeftsfuehrer: Matthew Scott Sucherman, Paul Terence Manicle
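
For anyone following along: the retry behaviour Klaus describes, where a client of a Unix domain socket tolerates the daemon being absent or restarting, can be illustrated roughly as below. This is only a sketch of the general idea and not the code in lib/rpc/transport.py. The function name connect_with_retry and its parameters (attempts, delay) are hypothetical, made up for illustration.

```python
import errno
import socket
import time


def connect_with_retry(path, attempts=10, delay=0.5):
    """Connect to a Unix domain socket, retrying while the daemon is away.

    Hypothetical helper for illustration only; not part of Ganeti.
    """
    last_error = None
    for _ in range(attempts):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        try:
            sock.connect(path)
            return sock
        except OSError as err:
            sock.close()
            # ENOENT: the socket file does not exist (daemon not started).
            # ECONNREFUSED: the file exists but nobody is listening
            # (daemon stopped or in the middle of a restart).
            if err.errno not in (errno.ENOENT, errno.ECONNREFUSED):
                raise
            last_error = err
            time.sleep(delay)
    raise ConnectionError(f"daemon at {path} unavailable: {last_error}")
```

A caller would use the returned socket as usual and, if the connection drops mid-request, reconnect through the same helper and resend the request. That only works if the request is safe to repeat, which is an assumption of this sketch.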
