Fabien <fabien.mauss...@gmail.com> writes:

> I am developing a tool which works on individual entities (glaciers)
> and do a lot of operations on them. There are many tasks to do, one
> after each other, and each task follows the same interface: ...
If most of the resources will be spent on computation and the
communications overhead is fairly low, the path of least resistance may
be to:

1) write a script that computes just one glacier (no multiprocessing)

2) write a control script that runs the glacier script through something
   like os.popen(), so normally it collects an answer, but it can also
   notice if the glacier script crashes, or kill it on a timeout if it
   takes too long

3) track the glacier tasks in an external queue server: I've used Redis
   (redis.io) for this, since it's simple and powerful, but there are
   other tools like 0mq that might be a more precise fit

4) have the control script read tasks from the queue server and update
   the queue server when results are ready

The advantages of this over multiprocessing:

1) Redis is a TCP server, which means you can easily spread your compute
   scripts over multiple computers, getting more parallelism. You can
   write compound values into it as JSON strings if they are not too
   large; otherwise you probably have to use files, but you can pass the
   filenames through Redis. You can connect new clients whenever you
   want through the publish/subscribe interface, and so on.

2) With a simple control script you don't have to worry too much about
   the many ways the computation script might fail: you can restart it,
   or put the whole thing under your favorite supervision daemon (cron,
   upstart, systemd or whatever) so it restarts automatically even if
   your whole computer reboots. Redis can even mirror itself to a
   failover server in real time if you think you need that, plus it can
   checkpoint its state to disk.

-- 
https://mail.python.org/mailman/listinfo/python-list
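Step 2 might look like the sketch below. It uses subprocess.run rather
than os.popen() because it exposes the return code and a timeout
directly; the single-glacier script name ("compute_glacier.py") and its
command line are made up for the example:

```python
import subprocess

def run_task(cmd, timeout=3600):
    """Run one glacier computation as a child process.

    cmd is the command line for the single-glacier script, e.g.
    ["python", "compute_glacier.py", glacier_id] (that script name is
    hypothetical).  Returns (status, output), where status is "ok",
    "crashed", or "timeout".
    """
    try:
        result = subprocess.run(cmd, capture_output=True, text=True,
                                timeout=timeout)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child when the timeout expires,
        # so the control script can just record the failure and move on.
        return ("timeout", "")
    if result.returncode != 0:
        # Non-zero exit: the glacier script crashed; keep stderr
        # around for diagnosis.
        return ("crashed", result.stderr)
    return ("ok", result.stdout)
```

The control script can then loop over glacier ids, call run_task for
each, and decide per result whether to retry, skip, or report.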
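The queue pattern in steps 3-4, including passing compound results as
JSON strings, can be sketched like this. The helpers only assume a
client object with Redis's LPUSH/BRPOP/SET/GET commands (the redis-py
package provides these against a real server); a tiny in-memory
stand-in is included so the sketch runs without a server, and the key
names are invented for the example:

```python
import json
from collections import defaultdict, deque

def enqueue_task(client, glacier_id):
    # Push a task onto the shared work queue (a Redis list).
    client.lpush("glacier:tasks", glacier_id)

def work_one(client, compute):
    # Pop one task (BRPOP blocks until a task is available on a real
    # server), run the computation, and store the result as JSON.
    _, glacier_id = client.brpop("glacier:tasks")
    result = compute(glacier_id)
    client.set("glacier:result:%s" % glacier_id, json.dumps(result))

def fetch_result(client, glacier_id):
    # Decode a stored result back into a Python value.
    return json.loads(client.get("glacier:result:%s" % glacier_id))

class FakeRedis:
    """In-memory stand-in for the handful of commands used above."""
    def __init__(self):
        self.lists = defaultdict(deque)
        self.kv = {}
    def lpush(self, key, value):
        self.lists[key].appendleft(value)
    def brpop(self, key):
        # Like redis-py's brpop, return a (key, value) pair.
        return (key, self.lists[key].pop())
    def set(self, key, value):
        self.kv[key] = value
    def get(self, key):
        return self.kv[key]
```

With a real server you would replace FakeRedis() with something like
redis.Redis(host=...), and any number of worker machines could run
work_one in a loop against the same queue, which is where the extra
parallelism comes from.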