On 12/21/2012 09:53 PM, Jaroslav Kortus wrote:
Furthermore, the scheduler would be updated to work on a *cached* copy of the system status data. This is needed to avoid the current problem where there's a race condition with system status changes occurring during a scheduling pass, leading to recipes jumping the queue (I'm interested in hearing about relatively clean ways to do this with SQLAlchemy, though: http://stackoverflow.com/questions/13983067/cached-reads-immediate-writes-with-sql-alchemy)

Do you mean https://bugzilla.redhat.com/show_bug.cgi?id=872187 ?

I've worked around it by using one transaction for the whole scheduling loop (not just per recipe, as it was before). This gives a sort of "cached" view of the data, as the transaction's reads are consistent no matter what gets written during the transaction. I'm ordering the recipes by recipe set priority, recipeset.id and recipe.id (in that order), and then they get scheduled.

I haven't seen any deadlocks since then (with tasks having the same
priority).
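As a rough illustration, the ordering described above amounts to something like the following (a toy data model, not Beaker's actual schema or code):

```python
from collections import namedtuple

# Hypothetical minimal model of a queued recipe; field names are
# illustrative only, not Beaker's real columns.
Recipe = namedtuple("Recipe", ["id", "recipeset_id", "priority"])

def scheduling_order(recipes):
    """Order recipes by recipe set priority (higher first), then
    recipeset.id, then recipe.id, as described above."""
    return sorted(recipes, key=lambda r: (-r.priority, r.recipeset_id, r.id))

queue = [
    Recipe(id=7, recipeset_id=3, priority=10),
    Recipe(id=5, recipeset_id=2, priority=30),
    Recipe(id=6, recipeset_id=2, priority=30),
]
print([r.id for r in scheduling_order(queue)])  # [5, 6, 7]
```

With a deterministic total order like this, two scheduling passes that see the same consistent snapshot will always pick recipes in the same sequence.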

Yeah, but lumping everything into one giant transaction has its own problems (mainly to do with state consistency with external systems like RHEV-M and the filesystem).

What I realised over the Christmas break is that many of these problems can be resolved by moving towards a more event-based scheduling system, with two key scheduling events:

1. When a new recipe is submitted, attempt to assign it to a dynamic virtual system or to a system from the idle pool.

2. When a system completes its current task, attempt to assign it a recipe from the queue *before* placing it back in the idle pool (in the case of dynamic virt, see if any of the recipes that previously failed dynamic virt allocation can now be allocated a dynamic VM).
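To make the idea concrete, here's a minimal sketch of the two event handlers. Everything in it (names, data model, the trivial compatibility check) is hypothetical, not Beaker's actual code:

```python
# Toy state for the sketch: free systems, waiting recipes, assignments.
idle_pool = set()   # systems with no current task
queue = []          # recipes waiting for hardware
running = {}        # recipe -> system it was assigned to

def compatible(recipe, system):
    # Stand-in for Beaker's host filtering; here any system matches.
    return True

def on_recipe_submitted(recipe):
    """Event 1: on submission, try to place the recipe immediately."""
    for system in sorted(idle_pool):
        if compatible(recipe, system):
            idle_pool.discard(system)
            running[recipe] = system
            return
    queue.append(recipe)

def on_system_released(system):
    """Event 2: offer a freed system to the queue *before* idling it."""
    for recipe in queue:
        if compatible(recipe, system):
            queue.remove(recipe)
            running[recipe] = system
            return
    idle_pool.add(system)

on_recipe_submitted("recipe-1")   # no idle systems yet -> queued
on_system_released("system-a")    # freed system picks up recipe-1
print(running)                    # {'recipe-1': 'system-a'}
print(queue, idle_pool)           # [] set()
```

The key property is that a freed system never touches the idle pool while compatible work is queued, which closes the window the current polling loop leaves open.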

The current scheduling loop would then become a cleanup loop (e.g. looking for dead recipes that need to be aborted for various reasons)

With separate scheduling events, different prioritisation rules can be applied to the two kinds of scheduling:

New recipes would use the current recipe based scheduling: filter and order the available systems according to the preferences of the user submitting the job and the requirements expressed in the recipe.

Free systems would use system based scheduling: order the queued recipes according to the preferences of the system owner and the priorities of the queued recipes.
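The two orderings might look something like this side by side (again, all names and the preference fields are made up for illustration, not Beaker's real schema):

```python
def pick_system(recipe, systems, user_prefs):
    """Recipe-based scheduling: filter systems by the recipe's
    requirements, then prefer the submitter's preferred lab."""
    candidates = [s for s in systems if s["arch"] == recipe["arch"]]
    candidates.sort(key=lambda s: (s["lab"] != user_prefs.get("preferred_lab"),
                                   s["id"]))
    return candidates[0] if candidates else None

def pick_recipe(system, queued, owner_prefs):
    """System-based scheduling: filter queued recipes the system can
    run, prefer the owner's preferred group, then higher priority."""
    candidates = [r for r in queued if r["arch"] == system["arch"]]
    candidates.sort(key=lambda r: (r["group"] != owner_prefs.get("preferred_group"),
                                   -r["priority"],
                                   r["id"]))
    return candidates[0] if candidates else None

systems = [{"id": 1, "arch": "x86_64", "lab": "lab-1"},
           {"id": 2, "arch": "x86_64", "lab": "lab-2"}]
best = pick_system({"arch": "x86_64"}, systems, {"preferred_lab": "lab-2"})
print(best["id"])  # 2

queued = [{"id": 10, "arch": "x86_64", "group": "qa", "priority": 20},
          {"id": 11, "arch": "x86_64", "group": "dev", "priority": 50}]
chosen = pick_recipe({"arch": "x86_64"}, queued, {"preferred_group": "qa"})
print(chosen["id"])  # 10
```

Note the asymmetry: in the first case the recipe is fixed and systems compete; in the second the system is fixed and recipes compete, under a different set of preferences.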

The latter scheduling algorithm can deal with the deadlock problem by prioritising recipes that are part of a recipe set that already has some resources allocated on the relevant lab controller over those which are just part of the general queue.
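As a sketch of that priority rule (with an invented data model, purely to show the sort key):

```python
def deadlock_aware_order(queued, sets_with_resources_here):
    """Order queued recipes so that those whose recipe set already
    holds resources on this lab controller sort ahead of the rest,
    then fall back to priority and id."""
    return sorted(
        queued,
        key=lambda r: (r["recipeset_id"] not in sets_with_resources_here,
                       -r["priority"],
                       r["id"]),
    )

queued = [
    {"id": 10, "recipeset_id": 1, "priority": 50},
    {"id": 11, "recipeset_id": 2, "priority": 10},
]
# Recipe set 2 already has a system allocated on this lab controller,
# so recipe 11 jumps ahead of the higher-priority recipe 10.
print([r["id"] for r in deadlock_aware_order(queued, {2})])  # [11, 10]
```

Finishing partially-allocated recipe sets first releases their held systems sooner, which is what breaks the deadlock.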

The one downside is that users would be able to exploit this in order to jump the queue for access to rare resources, by also scheduling a recipe in the same recipe set that can run on readily available hardware. While we likely can't prevent such abuse, we should be able to provide tools to help detect it, and leave it up to organizational "acceptable use" policies to deal with it.

Cheers,
Nick.

--
Nick Coghlan
Red Hat Infrastructure Engineering & Development, Brisbane

Python Applications Team Lead
Beaker Development Lead (http://beaker-project.org/)
GlobalSync Development Lead (http://pulpdist.readthedocs.org)
_______________________________________________
Beaker-devel mailing list
[email protected]
https://lists.fedorahosted.org/mailman/listinfo/beaker-devel
