On Aug 9, 7:25 pm, [EMAIL PROTECTED] wrote:
> Hi all! I'm implementing one of my first multithreaded apps, and have
> gotten to a point where I think I'm going off track from a standard
> idiom. Wondering if anyone can point me in the right direction.
>
> The script will run as a daemon and watch a given directory for new
> files. Once it determines that a file has finished moving into the
> watch folder, it will kick off a process on one of the files. Several
> of these could be running at any given time up to a max number of
> threads.
>
> Here's how I have it designed so far. The main thread starts a
> Watch(threading.Thread) class that loops and searches a directory for
> files. It has been passed a Queue.Queue() object (watch_queue), and
> as it finds new files in the watch folder, it adds the file name to
> the queue.
>
> The main thread then grabs an item off the watch_queue, and kicks off
> processing on that file using another class Worker(threading.thread).
>
> My problem is with communicating between the threads as to which files
> are currently processing, or are already present in the watch_queue so
> that the Watch thread does not continuously add unneeded files to the
> watch_queue to be processed. For example...Watch() finds a file to be
> processed and adds it to the queue. The main thread sees the file on
> the queue and pops it off and begins processing. Now the file has
> been removed from the watch_queue, and Watch() thread has no way of
> knowing that the other Worker() thread is processing it, and shouldn't
> pick it up again. So it will see the file as new and add it to the
> queue again. PS.. The file is deleted from the watch folder after it
> has finished processing, so that's how i'll know which files to
> process in the long term.
>

I would suggest something like the following in the watch thread:

    import os
    import time

    # folder and process_queue are set up elsewhere in the Watch thread
    seen_files = {}
    while True:
        # look for new files
        for name in os.listdir(folder):
            if name not in seen_files:
                # first sighting: hand the name to the main thread
                process_queue.put(name)
            # mark every file still in the folder as seen on this pass
            seen_files[name] = True

        # forget any missing files and mark the others as not seen,
        # ready for next time
        seen_files = dict((name, False) for name, seen in seen_files.items() if seen)

        time.sleep(1)
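
If it helps, the same bookkeeping can be pulled out of the endless loop into a small function, which makes it easier to try on its own. This is only a sketch under a few assumptions: it targets Python 3 (where the module is named queue rather than Queue), scan_once is a name made up for illustration, and it tracks names in a set instead of the True/False dict, which should give the same behaviour. It does not try to detect whether a file has finished moving into the folder.

```python
import os
import queue
import tempfile


def scan_once(folder, seen, process_queue):
    """Run one pass over folder and queue any name not seen on the last pass.

    seen is the set returned by the previous call (an empty set the first
    time). The return value is the set to pass in on the next call.
    """
    present = set(os.listdir(folder))
    # Only names that were absent on the previous pass are new.
    for name in sorted(present - seen):
        process_queue.put(name)
    # Names that have vanished are dropped here, so a file that is deleted
    # after processing and later re-created will be queued again.
    return present


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        q = queue.Queue()
        open(os.path.join(folder, "a.txt"), "w").close()
        seen = scan_once(folder, set(), q)
        seen = scan_once(folder, seen, q)  # a.txt still present: not queued again
        print(q.qsize())  # prints 1
```

In the Watch thread this would be called once per iteration, with time.sleep(1) between calls, keeping the returned set for the next pass.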