> How much of Ganeti does actually depend on 100% reproducible job queues?

Probably not too much; the main point is that we announced that
restarting luxid is not a big deal and can now be done at any
time (it won't even affect running jobs). So I don't think
the scheduling needs to be 100% reproducible, but if an essential
property of Ganeti (like jobs not starving) depends on luxid
living long enough, that certainly is a fundamental change in how
we currently think about Ganeti daemons.

> The other option that I had considered was to somehow "quantize" the 
> start-time and current-time [...] This means using the wall clock to
> determine how long the job has been waiting and reorder the
> queue according to that instead. [...]
> The age formula wouldn't change, it would be persistent across reboots, and 
> it would be fairly trivial to implement.

That seems like a good plan. It would also be consistent with the strategy
that running jobs use to avoid starvation while waiting for locks, which is
also time-based.
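
To make sure we mean the same thing, here is a rough sketch of how I
read the quantized wall-clock idea. It is only an illustration, not luxid
code: the tick size, the function names, the job representation, and the
"one priority step per tick" age formula are all assumptions of mine, as is
the convention that a lower number means more urgent. The point it shows is
that the age depends only on the stored receive timestamp and the current
clock, so a restarted daemon computes the same ordering.

```python
QUANTUM_SECONDS = 10  # hypothetical tick size, not a value taken from Ganeti


def quantized_age(received_ts, now_ts, quantum=QUANTUM_SECONDS):
    """Age of a job in whole ticks, derived from wall-clock timestamps only."""
    received_tick = int(received_ts // quantum)
    now_tick = int(now_ts // quantum)
    # Clamp at zero so a clock that jumps backwards never yields a negative age.
    return max(0, now_tick - received_tick)


def effective_priority(base_priority, received_ts, now_ts,
                       quantum=QUANTUM_SECONDS):
    """Assumed age formula: each tick waited improves priority by one step.

    Lower values are treated as more urgent (an assumption of this sketch).
    """
    return base_priority - quantized_age(received_ts, now_ts, quantum)


def reorder(queue, now_ts, quantum=QUANTUM_SECONDS):
    """Return jobs sorted by effective priority.

    Each job is a dict with 'id', 'priority' and 'received_ts'. Ties are
    broken by receive time and then id, so the order is deterministic.
    """
    return sorted(
        queue,
        key=lambda job: (
            effective_priority(job["priority"], job["received_ts"],
                               now_ts, quantum),
            job["received_ts"],
            job["id"],
        ),
    )
```

Since nothing here depends on in-memory state, the daemon would not need to
live long enough for a waiting job to age; it only needs the receive
timestamp to be stored with the job.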

Thanks,
Klaus

-- 
Klaus Aehlig
Google Germany GmbH, Erika-Mann-Str. 33, 80636 Muenchen
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg
Geschaeftsfuehrer: Matthew Scott Sucherman, Paul Terence Manicle
