On Sat, 20 Jun 2015 12:01 am, Fabien wrote:

> Folks,
>
> I am developing a tool which works on individual entities (glaciers) and
> do a lot of operations on them. There are many tasks to do, one after
> each other, and each task follows the same interface:
I'm afraid your description is contradictory. Here you say the tasks run
one after another, but then you say:

> This way, the tasks can be run in parallel very easily:

and then later still you contradict this:

> Also, the task2 should not be run if task1 threw an error.

If task2 relies on task1, then you *cannot* run them in parallel. You have
to run them one after the other, sequentially.

You also ask:

> There are going to be errors, some
> of them are even expected for special outliers. What I would like the
> tool to do is that in case of error, it writes the identifier of the
> problematic glacier somewhere, the error encountered and more info if
> possible. Because of multiprocessing, I can't write in a shared file, so
> I thought that the individual processes should write a unique "error
> file" in a dedicated directory.

The documentation for the logging module has examples of using
multiprocessing to write to a single log file from multiple processes.
It's a bit complicated, since *directly* writing to a single log from
multiple processes is not supported, but it is possible:

https://docs.python.org/3/howto/logging-cookbook.html#logging-to-a-single-file-from-multiple-processes

Or, if you are on a Unix or Linux system, you can log to syslog and let
syslog handle it.

Since your sample code appears to do a lot of file I/O, it may be that you
can use threads rather than multiprocessing. That would allow all the
threads to communicate with a single thread that handles logging.

Or use a lock file:

http://stackoverflow.com/questions/1444790/python-module-for-creating-pid-based-lockfile

-- 
Steven

-- 
https://mail.python.org/mailman/listinfo/python-list