Hi,
your suggestion sounds really reasonable, but the point is that these
processes are running on different machines and I don't want to put a
lot of effort into synchronizing these threads.
Isn't there any easy solution for having multiple processes working on
the same database table?
Thanks.
entirely different machines, then you'd have to partition the rows of
the table yourself. You'd select a range of rows on each machine using
LIMIT/OFFSET. The exact count per machine depends on the total number
of rows in the table and the total number of machines.
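
A rough sketch of that partitioning, in case it helps. The table name
"work_items", the column "id", and both function names are made up for
illustration. The ORDER BY is my own addition: without a stable
ordering, LIMIT/OFFSET windows are not guaranteed to be disjoint across
machines.

```python
def partition_ranges(total_rows, num_machines):
    """Return one (limit, offset) pair per machine, covering each row once."""
    if num_machines < 1:
        raise ValueError("num_machines must be at least 1")
    # Ceiling division, so the last machine picks up any remainder.
    chunk = -(-total_rows // num_machines)
    ranges = []
    for machine_index in range(num_machines):
        offset = machine_index * chunk
        # Clamp the limit so windows past the end of the table are empty.
        limit = max(0, min(chunk, total_rows - offset))
        ranges.append((limit, offset))
    return ranges


def build_query(limit, offset):
    # "work_items" and "id" are hypothetical names, not from this thread.
    return ("SELECT * FROM work_items ORDER BY id "
            "LIMIT %d OFFSET %d" % (limit, offset))
```

Each machine would then run build_query() with its own pair, e.g. the
second of three machines over 10 rows gets LIMIT 4 OFFSET 4. This
assumes the table does not change while the machines are working.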
On Feb 25, 2009,
I would select the full set of rows and then hand off groups of
those rows, converted into serializable objects first, using the
imap method of a multiprocessing.Pool object. It would be best
if the rows are returned via a ResultProxy so that work can begin on
results before all results
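
A minimal sketch of the Pool.imap approach described above, for a
single machine. The names chunked, process_group and process_all are
hypothetical, the "work" is a placeholder sum, and rows are assumed to
be (id, value) pairs. Any iterable of rows will do as input, including
a result object that is iterated lazily.

```python
from itertools import islice
from multiprocessing import Pool


def chunked(rows, size):
    """Yield lists of up to `size` rows, converted to plain tuples.

    Plain tuples pickle cleanly, which is needed to send them to workers.
    """
    iterator = iter(rows)
    while True:
        group = [tuple(row) for row in islice(iterator, size)]
        if not group:
            return
        yield group


def process_group(group):
    # Placeholder work (an assumption): sum the second column of each row.
    return sum(value for _id, value in group)


def process_all(rows, group_size=100, workers=4):
    """Yield one result per group, in the order the groups were produced."""
    with Pool(workers) as pool:
        # imap takes groups from the generator and yields results in order.
        for result in pool.imap(process_group, chunked(rows, group_size)):
            yield result
```

Call process_all() from under an `if __name__ == "__main__":` guard,
since multiprocessing re-imports the main module on platforms that
spawn worker processes.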