On 04/13/2015 10:58 AM, Fabien wrote:
Folks,


A comment. Pickle is a method of creating persistent data, most commonly used to preserve data between runs. A database is another method. Although either one can also be used with multiprocessing, you seem to be worrying more about the mechanism, and not enough about the problem.

I am writing a quite extensive piece of scientific software. Its
workflow is quite easy to explain. The tool performs a series of
operations on watersheds (such as mapping data onto them, geostatistics
and more). There are thousands of independent watersheds of different
sizes, and the size determines the computing time spent on each of them.

First question: what is the name or "identity" of a watershed? Apparently it's named by a directory. But you mention ID as well. You write a function A() that takes only a directory name. Is that the name of the watershed? One per directory? And you can derive the ID from the directory name?

Second question, is there any communication between watersheds, or are they totally independent?

Third: this "external data", is it dynamic, do you have to fetch it in a particular order, is it separated by watershed id, or what?

Fourth: when the program starts, are the directories all empty, so the presence of a pickle file tells you that A() has run? Or is there some other meaning for those files?
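
To make the fourth question concrete, here is a minimal sketch of the "pickle file means A() has already run" reading. It is only one guess at the intent: the file name a_result.pkl, the helper compute_a() and the shape of the stored result are all made up for illustration and do not come from the original code.

```python
import pickle
from pathlib import Path

# Hypothetical file name: the original post does not say what the file is called.
RESULT_NAME = "a_result.pkl"


def compute_a(watershed_dir):
    """Stand-in for the real operation A; returns a dummy result."""
    return {"watershed": Path(watershed_dir).name}


def run_a(watershed_dir):
    """Run A for one watershed directory unless its pickle already exists."""
    result_path = Path(watershed_dir) / RESULT_NAME
    if result_path.exists():
        # A has run before: reuse the stored result instead of recomputing.
        with result_path.open("rb") as f:
            return pickle.load(f)
    result = compute_a(watershed_dir)
    with result_path.open("wb") as f:
        pickle.dump(result, f)
    return result
```

If the files mean something else (for example, partial results that later steps update), this check would not be enough on its own.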


Say I have the operations A, B, C and D. B and C are completely
independent but they need A to be run first, D needs B and C, and so
forth. Eventually the whole operations A, B, C and D will run once for
all,

For all what?

but of course the whole development is an iterative process and I
rerun all operations many times.

Based on what? Is the external data changing, and you have to rerun functions to update what you've already stored about them? Or do you just mean you call the A() function on every possible watershed?
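
For what it's worth, the ordering constraint itself (B and C need A, D needs B and C) is easy to write down independently of how the results get stored. A minimal sketch, assuming Python 3.9 or later for the standard-library graphlib module; the DEPENDS_ON table and run_order() are illustrative names, not anything from the original code.

```python
from graphlib import TopologicalSorter

# Each operation maps to the set of operations that must run before it,
# as described in the post: B and C need A, D needs B and C.
DEPENDS_ON = {
    "A": set(),
    "B": {"A"},
    "C": {"A"},
    "D": {"B", "C"},
}


def run_order(depends_on):
    """Return one valid execution order for the given dependency table."""
    return list(TopologicalSorter(depends_on).static_order())


order = run_order(DEPENDS_ON)
print(order)  # A comes first and D last; B and C may appear in either order
```

This says nothing about when a step needs to be rerun, which is exactly the part that is still unclear to me.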



(I suddenly have to go out, so I can't comment on the rest, except that choosing whether to pickle, marshal, use a database, or custom-serialize seems a bit premature. You may have it all clear in your head, but I can't see what the interplay between all these calls to one-letter-named functions is intended to be.)


--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list
