Hi @all, we have another feature in trunk for the scheduler... Jobs (i.e. task dependencies)
Directly from https://github.com/niphlod/w2p_scheduler_tests/ (which has been updated to accommodate the new feature explanation...)

What are "Jobs", you ask? Well, it's a way to coordinate a set of tasks that have dependencies (what Celery calls "Canvas"). As always, the Scheduler sticks to the basics. Every Job is considered to be a DAG (a Directed Acyclic Graph <http://en.wikipedia.org/wiki/Directed_acyclic_graph>). Without going into silly details, every task can have one or more dependencies, but of course you can't have mutual dependencies among the same tasks. If a "job" can't be represented as a DAG, then it can't be processed in its entirety. You can still queue it, but it won't ever complete (i.e. you could have a complete stall at the first task, or just one task out of 100 left queued...)

So... what can you do? Let's take a trivial example (there are a few based on mathematics, map/reduce, etc... but hey, this is an example!!!)

Suppose you need to create a job that describes what is needed to get dressed (thanks to http://hansolav.net/sql/graphs.html)... We have a few items to wear, and there's an "order" to respect...

Items are: watch, jacket, shirt, tie, pants, undershorts, belt, shoes, socks

Now, we can't put on the tie without wearing the shirt first, etc...

<http://yuml.me/995413d6>

Suppose we have those tasks queued in a controller (for example's sake, the same function, with different task_name(s))...

    watch = s.queue_task(fname, task_name='watch')
    jacket = s.queue_task(fname, task_name='jacket')
    shirt = s.queue_task(fname, task_name='shirt')
    tie = s.queue_task(fname, task_name='tie')
    pants = s.queue_task(fname, task_name='pants')
    undershorts = s.queue_task(fname, task_name='undershorts')
    belt = s.queue_task(fname, task_name='belt')
    shoes = s.queue_task(fname, task_name='shoes')
    socks = s.queue_task(fname, task_name='socks')

Now, there's a helper class to construct and validate a "job".
First, let's declare a job named "job_1"

    from gluon.scheduler import JobGraph
    myjob = JobGraph(db, 'job_1')

Next, we need to establish dependencies

    # before the tie, comes the shirt
    myjob.add_deps(tie.id, shirt.id)
    # before the belt, too, comes the shirt
    myjob.add_deps(belt.id, shirt.id)
    # before the jacket, comes the tie
    myjob.add_deps(jacket.id, tie.id)
    # before the belt, come the pants
    myjob.add_deps(belt.id, pants.id)
    # before the shoes, come the pants
    myjob.add_deps(shoes.id, pants.id)
    # before the pants, come the undershorts
    myjob.add_deps(pants.id, undershorts.id)
    # before the shoes, come the undershorts
    myjob.add_deps(shoes.id, undershorts.id)
    # before the jacket, comes the belt
    myjob.add_deps(jacket.id, belt.id)
    # before the shoes, come the socks
    myjob.add_deps(shoes.id, socks.id)

Then, we can ask JobGraph whether what we asked for is a job that can actually be accomplished

    myjob.validate('job_1')

And voilà, job done! If it's not a DAG, an exception will be raised and the job won't be committed (and of course its dependencies won't be committed either).

How does it work under the hood? There's a new table called scheduler_task_deps that holds a reference to the job_name, the task parent, the task child and a boolean to mark the "path" (the arrows in the graph) as "visitable". To be fair, the job name isn't that important: you can have task dependencies among different jobs, it's just not that easy to verify that the Job is a DAG at a later stage.

If a path is "visitable", it means that the DAG can be "walked" in that direction. Every time a task gets "COMPLETED", its "paths" get updated to be "visitable". The algorithm that picks up tasks has been updated to fetch only tasks that have no dependencies, or whose dependencies have already been satisfied (i.e. tasks that depend on nothing, or tasks that depend on tasks that are already COMPLETED).

Let me know what you think, and if you spot bugs.
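
P.S. If you want to reason about the "pick only tasks whose dependencies are satisfied" idea without reading the scheduler code, here is a rough standalone sketch in plain Python. It is NOT what gluon/scheduler.py does internally: there is no database and no scheduler_task_deps table, and the names DEPS, TASKS, runnable and walk are made up for this example. It only mimics the logic described above on the "get dressed" graph, and shows why a job that isn't a DAG stalls.

```python
# Standalone illustration, not the web2py implementation.
# Each pair is (child, parent): the child can run only after the parent.
from collections import defaultdict

DEPS = [
    ("tie", "shirt"), ("belt", "shirt"), ("jacket", "tie"),
    ("belt", "pants"), ("shoes", "pants"), ("pants", "undershorts"),
    ("shoes", "undershorts"), ("jacket", "belt"), ("shoes", "socks"),
]
TASKS = ["watch", "jacket", "shirt", "tie", "pants",
         "undershorts", "belt", "shoes", "socks"]


def runnable(tasks, deps, completed):
    """Return the tasks not yet completed whose parents are all completed."""
    parents = defaultdict(set)
    for child, parent in deps:
        parents[child].add(parent)
    return [t for t in tasks
            if t not in completed and parents[t] <= completed]


def walk(tasks, deps):
    """Complete tasks round by round; raise if some task can never run."""
    completed, rounds = set(), []
    while len(completed) < len(tasks):
        batch = runnable(tasks, deps, completed)
        if not batch:
            # Nothing can be picked up: the graph has a cycle (not a DAG).
            raise ValueError("not a DAG: some tasks can never run")
        rounds.append(batch)
        completed.update(batch)
    return rounds


if __name__ == "__main__":
    for number, batch in enumerate(walk(TASKS, DEPS), start=1):
        print(number, batch)
```

Each round is a set of tasks that could be picked up at the same time. With a mutual dependency (say, "a" needs "b" and "b" needs "a"), walk() raises instead of looping forever, which is the in-memory equivalent of the stall mentioned earlier.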