Thanks Niphlod, that's brilliant! I've been using a workaround to schedule dependent jobs, and this will help tidy things up.

On Tuesday, August 5, 2014 8:51:25 PM UTC+12, Niphlod wrote:
>
> Hi @all,
> we have another feature in trunk for the scheduler... Jobs (i.e. task
> dependencies)
>
> Directly from https://github.com/niphlod/w2p_scheduler_tests/ (which has
> been updated to accommodate the explanation of the new feature...)
>
> What are "Jobs", you ask? Well, it's a way to coordinate a set of tasks
> that have dependencies (what in Celery is called "Canvas").
>
> As always, the Scheduler sticks to the basics. Every Job is considered to
> be a DAG (a Directed Acyclic Graph
> <http://en.wikipedia.org/wiki/Directed_acyclic_graph>).
> Without going into silly details, every task can have one or more
> dependencies, but of course you can't have mutual dependencies among the
> same tasks.
> If a "job" can't be represented as a DAG, then it can't be processed in
> its entirety. You can still queue it, but it won't ever complete (i.e. you
> could have a complete stall at the first task, or just one task left out
> of 100 queued...)
>
> So... what can you do?
> Let's take a trivial example (there are a few based on mathematics,
> map/reduce, etc... but hey, this is an example!!!)
> Suppose you need to create a job that describes what is needed to get
> dressed (thanks to http://hansolav.net/sql/graphs.html)...
>
> We have a few items to wear, and there's an "order" to respect...
> Items are: watch, jacket, shirt, tie, pants, undershorts, belt, shoes,
> socks
>
> Now, we can't put on the tie without wearing the shirt first, etc...
>
> <http://yuml.me/995413d6>
>
> Suppose we have those tasks queued in a controller (for example's sake,
> the same function, with different task_name(s))...
>
> watch = s.queue_task(fname, task_name='watch')
> jacket = s.queue_task(fname, task_name='jacket')
> shirt = s.queue_task(fname, task_name='shirt')
> tie = s.queue_task(fname, task_name='tie')
> pants = s.queue_task(fname, task_name='pants')
> undershorts = s.queue_task(fname, task_name='undershorts')
> belt = s.queue_task(fname, task_name='belt')
> shoes = s.queue_task(fname, task_name='shoes')
> socks = s.queue_task(fname, task_name='socks')
>
> Now, there's a helper class to construct and validate a "job".
> First, let's declare a job named "job_1"
>
> from gluon.scheduler import JobGraph
> myjob = JobGraph(db, 'job_1')
>
> Next, we need to establish dependencies
>
> # before the tie, comes the shirt
> myjob.add_deps(tie.id, shirt.id)
> # before the belt too comes the shirt
> myjob.add_deps(belt.id, shirt.id)
> # before the jacket, comes the tie
> myjob.add_deps(jacket.id, tie.id)
> # before the belt, come the pants
> myjob.add_deps(belt.id, pants.id)
> # before the shoes, come the pants
> myjob.add_deps(shoes.id, pants.id)
> # before the pants, come the undershorts
> myjob.add_deps(pants.id, undershorts.id)
> # before the shoes, come the undershorts
> myjob.add_deps(shoes.id, undershorts.id)
> # before the jacket, comes the belt
> myjob.add_deps(jacket.id, belt.id)
> # before the shoes, come the socks
> myjob.add_deps(shoes.id, socks.id)
>
> Then, we can ask JobGraph whether what we asked for is a job that can be
> accomplished
>
> myjob.validate('job_1')
>
> And voilà, job done! If it's not a DAG, then an exception will be raised
> and the jobs won't be committed (and of course their dependencies won't be
> committed either)
>
> How does it work under the hood?
>
> There's a new table called scheduler_task_deps that holds a reference to
> the job_name, the task parent, the task child and
> a boolean to mark the "path" (the arrows in the graph) as "visitable".
> To be fair, the job name isn't that important: you can have task
> dependencies among different jobs, it's just not that easy to verify
> that the Job is a DAG at a later stage.
> If a path is "visitable" it means that the DAG can be "walked" in
> that direction.
> Every time a task gets "COMPLETED", the "paths" get updated to be
> "visitable". The algo to pick up tasks has been updated
> to fetch only tasks that have no dependencies, or dependencies
> that have already been satisfied (i.e. tasks that depend
> on nothing, or tasks that depend on tasks that are already COMPLETED).
>
> Let me know what you think, and if you spot bugs.
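
For anyone else following along, here is how I picture the "visitable" logic described above. This is only a rough pure-Python sketch of the idea, not the actual scheduler code: it uses task names instead of ids, keeps the dependency rows in a plain dict rather than in scheduler_task_deps, and the names DEPS, TASKS and run_order are made up for illustration. Each pair follows the same argument order as the add_deps() calls in the quoted post (task first, then the task that must come before it).

```python
from collections import defaultdict

# (task, prerequisite) pairs, e.g. "before the tie, comes the shirt".
DEPS = [
    ("tie", "shirt"), ("belt", "shirt"), ("jacket", "tie"),
    ("belt", "pants"), ("shoes", "pants"), ("pants", "undershorts"),
    ("shoes", "undershorts"), ("jacket", "belt"), ("shoes", "socks"),
]
TASKS = ["watch", "jacket", "shirt", "tie", "pants",
         "undershorts", "belt", "shoes", "socks"]


def run_order(tasks, deps):
    """Return one valid execution order, or raise ValueError if stalled."""
    # One flag per arrow in the graph; it flips to True ("visitable")
    # once the prerequisite at the end of the arrow has completed.
    visitable = {edge: False for edge in deps}
    incoming = defaultdict(list)
    for task, prereq in deps:
        incoming[task].append((task, prereq))

    pending, order = list(tasks), []
    while pending:
        # Pick only tasks with no dependencies, or with every
        # dependency already satisfied.
        ready = [t for t in pending
                 if all(visitable[e] for e in incoming[t])]
        if not ready:
            # Nothing can be picked: the graph has a cycle, so it is
            # not a DAG and the job would never complete.
            raise ValueError("stalled with: %s" % sorted(pending))
        for t in ready:
            order.append(t)      # pretend the task ran and is COMPLETED
            pending.remove(t)
        # Mark every arrow leaving a completed task as visitable.
        for edge in visitable:
            if edge[1] in ready:
                visitable[edge] = True
    return order


print(run_order(TASKS, DEPS))
```

If you feed it a mutual dependency, e.g. run_order(["a", "b"], [("a", "b"), ("b", "a")]), it raises ValueError straight away, which is the "complete stall at the first task" case from the post.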