> Will this age parameter be part of the job data?

No. The age parameter is a runtime variable that lives in the job scheduler. It is not saved to disk or propagated anywhere. If the master node crashes and the scheduler resumes operations on a new master, the age parameter is reset and re-initialized. We don't expect jobs to stay in the queue long enough for that to be a problem, and this avoids any unnecessary replication across the cluster.
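
To make the "runtime only" point concrete, here is a minimal sketch of how such an in-memory age counter could work. It is not the actual patch: `JobId`, `AgeMap`, `enqueue`, `jobScheduled`, `ageOf` and `sortByAge` are hypothetical names chosen for illustration, and jobs are reduced to bare integer IDs.

```haskell
import Data.List (sortBy)
import qualified Data.Map.Strict as Map
import Data.Ord (Down (..), comparing)

-- Hypothetical stand-ins for illustration; not the real scheduler types.
type JobId = Int
type AgeMap = Map.Map JobId Int

-- | Start tracking a newly queued job at age 0.
-- If the job is already tracked, its current age is kept.
enqueue :: JobId -> AgeMap -> AgeMap
enqueue jid = Map.insertWith (\_new old -> old) jid 0

-- | A job was picked to run: stop tracking it and age every
-- other pending job by 1. Only this in-memory map changes.
jobScheduled :: JobId -> AgeMap -> AgeMap
jobScheduled jid = Map.map (+ 1) . Map.delete jid

-- | Age of a job, defaulting to 0 for untracked jobs
-- (for example after the map was reset on a new master).
ageOf :: AgeMap -> JobId -> Int
ageOf ages jid = Map.findWithDefault 0 jid ages

-- | Order the queue oldest-first. The sort is stable, so jobs of
-- equal age keep their relative queue order; the jobs themselves
-- are not modified.
sortByAge :: AgeMap -> [JobId] -> [JobId]
sortByAge ages = sortBy (comparing (Down . ageOf ages))
```

Since the map lives only in the scheduler's memory, losing it on a master failover just means every pending job starts again from age 0.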
> ...changing all jobs in the queue whenever scheduling a job and
> replicating that to all master candidates might introduce a significant
> overhead.

The data for each job is never modified or touched by the scheduler, so nothing is propagated across the cluster. This is simply an extra "sorting" step in the middle of the scheduling pipeline, applied to the job queue. The queue itself is not modified. The code looks roughly like this:

  toRun = take n . applyFilter . applyReasonLimit . applyPriority . sortByLockWeight $ jobQueue

(not exactly like this, but that is the idea)

Using the job ID as an age indicator is a really interesting idea, but I think it would be trickier to calculate and more confusing than the current approach, while achieving pretty much the same end result. As it is now, the age is just a counter that gets incremented inside a Map.

On Wednesday, August 24, 2016 at 6:23:19 PM UTC+1, Klaus Aehlig wrote:
> > +As previously specified, to prevent starvation we introduce an aging system
> > +for queued jobs that keeps the queue fair. Each job in the queue will have an
> > +'Age' parameter that will keep track of its current age in the queue.
>
> Will this age parameter be part of the job data? The reason I'm asking is that
> every change to the state of a job will be replicated to all master candidates.
> Given that the design is meant for the situation where there are many jobs
> in the queue...
>
> > +Every time
> > +a job is scheduled to run in the queue, the age parameter for any other pending
> > +job gets increased by 1.
>
> ...changing all jobs in the queue whenever scheduling a job and replicating that
> to all master candidates might introduce a significant overhead. Given that job
> identifiers are assigned sequentially, the difference between the id of a job and
> the next id to be assigned could also serve as an age indicator.
>
> --
> Klaus Aehlig
> Google Germany GmbH, Erika-Mann-Str. 33, 80636 Muenchen
> Registergericht und -nummer: Hamburg, HRB 86891
> Sitz der Gesellschaft: Hamburg
> Geschaeftsfuehrer: Matthew Scott Sucherman, Paul Terence Manicle
