Alban Hertroys wrote:

Jeremy Jones wrote:

(not waiting, because it already did happen). What is it exactly that you are trying to accomplish? I'm sure there is a better approach.


I think I saw at least a bit of the light, reading up on readers and writers (a colleague showed up with a book called "Operating System Concepts" that has a chapter on process synchronization).
It looks like I should be writing and reading 3 Queues instead of trying to halt and pause the threads explicitly. That looks a lot easier...


Thanks for pointing out the problem area.

That's actually along the lines of what I was going to recommend after getting more detail on what you are doing. A couple of things that may (or may not) help you are:


* the Queue class in the Python standard library has a "maxsize" parameter, so when you create a queue you can specify how large it may grow. Your three threads can busily parse XML, extract data from it and put it into a queue; once the queue holds "maxsize" items, the next put() call will block until the consumer thread has taken something out. I've never used xml.parsers.xmlproc.xmlproc.Application, but looking at the data, it seems to resemble a SAX parser, so you should have no problem putting (potentially blocking) calls to the queue into your handler. The only thing this really buys you is that you won't have read the whole XML file into memory.
* the get method on a queue object has a "block" flag. You can effectively poll your queues something like this:


#untested code
import time
import Queue  #Python 2 module name; it is called "queue" in Python 3

#a_done, b_done and c_done are just checks to see if that particular document is done
#(they need to be re-evaluated as the parsers finish, or the inner loops never end)
while not (a_done and b_done and c_done):
    got_a, got_b, got_c = False, False, False
    item_a, item_b, item_c = None, None, None
    while (not a_done) and (not got_a):
        try:
            item_a = queue_a.get(0) #the 0 says don't block and raise an Empty exception if there's nothing there
            got_a = True
        except Queue.Empty:
            time.sleep(.3) #nothing there yet, wait a bit before polling again
    while (not b_done) and (not got_b):
        try:
            item_b = queue_b.get(0)
            got_b = True
        except Queue.Empty:
            time.sleep(.3)
    while (not c_done) and (not got_c):
        try:
            item_c = queue_c.get(0)
            got_c = True
        except Queue.Empty:
            time.sleep(.3)
    put_into_database_or_whatever(item_a, item_b, item_c)


This will allow you to deal with one item from each queue at a time, and if the xml files are different sizes, it should still work - once a shorter file is done, you'll just pass None to put_into_database_or_whatever for that particular file.
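
For the "maxsize" point above, here is a minimal sketch of a bounded queue with a single producer and a single consumer. It is only an illustration under a few assumptions: it uses the Python 3 module name "queue" (the module is called "Queue" in Python 2), the DONE sentinel and the producer/consume/run_demo names are made up for this example, and a plain range of numbers stands in for whatever your parser handler would extract.

```python
import queue      # Python 3 name; the module was called Queue in Python 2
import threading

DONE = object()   # hypothetical sentinel marking the end of one document


def producer(items, q):
    """Stand-in for a parser handler: put() blocks while the queue is full."""
    for item in items:
        q.put(item)        # blocks once maxsize items are waiting
    q.put(DONE)            # tell the consumer this document is finished


def consume(q):
    """Drain one queue until the sentinel arrives; return the items seen."""
    seen = []
    while True:
        item = q.get()     # blocks until the producer supplies something
        if item is DONE:
            return seen
        seen.append(item)


def run_demo():
    q = queue.Queue(maxsize=2)   # at most 2 items wait in the queue at once
    worker = threading.Thread(target=producer, args=(range(10), q))
    worker.start()
    result = consume(q)
    worker.join()
    return result


if __name__ == "__main__":
    print(run_demo())
```

A sentinel like this is just one possible way to implement the "is this document done" check; with three documents you would have three such queues and could still poll them with non-blocking get calls as in the loop above.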

HTH.

Jeremy Jones
--
http://mail.python.org/mailman/listinfo/python-list
