zugnush wrote:
You could do something like this so that every process knows whether
a file "belongs" to it without prior coordination. It does mean a lot
of redundant hashing, though.
In [36]: import glob, hashlib
In [37]: pool = 11
In [38]: process = 5
In [39]: [f for f in glob.glob('*') if int(hashlib.md5(f.encode()).hexdigest(), 16) % pool == process]
Out[39]:
You're also relying on the hashing being perfectly distributed,
otherwise some processes aren't going to be performing useful work even
though there is useful work to perform.
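To make that concrete, here is a rough sketch that counts how many
file names each process would end up owning under the md5-modulo
scheme. The file names are made up, and owner() and bucket_counts()
are hypothetical helpers written for this illustration, not anything
from the original post.

```python
import hashlib


def owner(name, pool):
    """Index of the process that 'owns' name under the md5-modulo scheme."""
    digest = hashlib.md5(name.encode("utf-8")).hexdigest()
    return int(digest, 16) % pool


def bucket_counts(names, pool):
    """Count how many names are assigned to each of the pool processes."""
    counts = [0] * pool
    for name in names:
        counts[owner(name, pool)] += 1
    return counts


if __name__ == "__main__":
    # Made-up file names, purely for illustration.
    names = ["file%03d.dat" % i for i in range(20)]
    counts = bucket_counts(names, 11)
    print("files per process:", counts)
    print("idle processes:", counts.count(0))
```

With fewer files than processes, some processes are guaranteed to get
nothing. Even with more files the split is fixed in advance, so a
process that finishes its own share early cannot pick up anyone
else's.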
In other words, why would you rely on a scheme that limits some
processes to certain parts of the data? If we're already talking about
trying to get away without some global lock for synchronisation, this
seems to go against the original intent of the problem...
n
--
http://mail.python.org/mailman/listinfo/python-list