zugnush wrote:
You could do something like this so that every process will know if
the file "belongs" to it without prior coordination. It means a lot
of redundant hashing though.

In [36]: import glob, hashlib

In [37]: pool = 11

In [38]: process = 5

In [39]: [f for f in glob.glob('*') if int(hashlib.md5(f.encode()).hexdigest(), 16) % pool == process]
Out[39]:

You're also relying on the hash spreading the files evenly across the processes; otherwise some processes will sit idle even though there is still useful work to do.
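To make the imbalance concrete, here is a rough sketch that counts how many files each process would end up owning under the md5-modulo scheme quoted above. The file names and the helper names (owner, bucket_sizes) are made up for illustration, and it uses hashlib rather than the old md5 module; treat it as a sketch, not a measurement of any real directory.

```python
import hashlib
from collections import Counter


def owner(filename, pool):
    """Index of the process that 'owns' filename under the md5 % pool scheme."""
    digest = hashlib.md5(filename.encode("utf-8")).hexdigest()
    return int(digest, 16) % pool


def bucket_sizes(filenames, pool):
    """Number of files assigned to each of the `pool` processes."""
    counts = Counter(owner(f, pool) for f in filenames)
    return [counts.get(i, 0) for i in range(pool)]


# Hypothetical file names, invented purely for illustration.
names = ["data_%03d.txt" % i for i in range(50)]
sizes = bucket_sizes(names, pool=11)

print(sizes)                    # one count per process
print(max(sizes) - min(sizes))  # gap between the busiest and idlest process
```

Whatever the exact counts turn out to be, the process with the fewest files finishes first and then has nothing to do, while the busiest one is still working through its share.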

In other words, why would you rely on a scheme that limits some processes to certain parts of the data? If we're already trying to get away without a global lock for synchronisation, this seems to go against the original intent of the problem...

  n
--
http://mail.python.org/mailman/listinfo/python-list
