Sahil Tandon wrote:
>On Tue, 2012-11-06 at 11:26:40 -0800, Mark Sapiro wrote:
>
>> In your case, the input to the hash on which runners are sliced
>> includes all the message headers and the listname so it is likely that
>> the "equivalent but different" listname messages will be in different
>> slices of the hash space.
>>
>> This is not a concern if IncomingRunner is not sliced. It is also not a
>> concern with a disk based cache as long as buffers are flushed after
>> writing because IncomingRunner locks the list whose message is being
>> processed which should prevent race conditions between different
>> slices of IncomingRunner.
>
>Then, would it make sense (or be overkill) to have the handler populate
>a dict of key, value = message-id, timestamp? And, store that dict in a
>pickle whose filename is derived from mlist.internal_name()?
>
>Obviously, this would result in a lot of pickles that are constantly
>opened, edited (and, periodically cleansed), and closed. Is the
>performance cost/benefit prohibitive?
Whether the cost is prohibitive depends on how many messages per minute,
hour, day, etc. you process through the list. I think it could work. The
in-memory dictionary would also work as long as you are running with the
default single qrunner per queue, except for the rare case where the two
duplicates are processed on opposite sides of a restart.

As an implementation note: for the file name (path) derived from the
list's internal_name, I would just use a fixed file name, e.g.,
message-ids.pck, in the existing lists/internal_name()/ directory.

>I would also be relying on the
>fact that a handler is never concurrently called for the same list -- is
>that understanding accurate? -- which avoids the scenario in which we
>are trying to simultaneously manipulate the same pickle.

Yes, that is accurate. IncomingRunner locks the list before processing
the pipeline and doesn't unlock it until it's done, so processing of the
pipeline for a given message and list is complete before any other
runner can begin processing a message for that list.

-- 
Mark Sapiro <[email protected]>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan
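
For illustration only, here is a minimal sketch of the per-list pickle idea discussed above. It is not Mailman code and does not import Mailman. The helper names (load_seen, save_seen, seen_before), the list_dir argument and the one-day MAX_AGE retention window are all hypothetical; only the fixed file name message-ids.pck comes from the thread. It is written for a current Python 3, assumes the caller already holds the list lock, and flushes the file to disk before renaming it into place.

```python
import os
import pickle
import time

CACHE_NAME = "message-ids.pck"   # fixed file name suggested in the thread
MAX_AGE = 24 * 60 * 60           # hypothetical retention window, in seconds


def load_seen(path):
    """Return the {message-id: timestamp} dict, or an empty one if absent."""
    try:
        with open(path, "rb") as fp:
            return pickle.load(fp)
    except FileNotFoundError:
        return {}


def save_seen(path, seen):
    """Write the dict to a temp file, flush it to disk, then rename it."""
    tmp = path + ".tmp"
    with open(tmp, "wb") as fp:
        pickle.dump(seen, fp)
        fp.flush()
        os.fsync(fp.fileno())
    os.replace(tmp, path)


def seen_before(list_dir, message_id, now=None, max_age=MAX_AGE):
    """Record message_id for this list; return True if it was already there.

    Assumes the caller already holds the list lock (as described in the
    thread for IncomingRunner), so no file locking is attempted here.
    """
    now = time.time() if now is None else now
    path = os.path.join(list_dir, CACHE_NAME)
    seen = load_seen(path)
    # Periodic cleansing: drop entries older than max_age.
    seen = {mid: ts for mid, ts in seen.items() if now - ts <= max_age}
    duplicate = message_id in seen
    if not duplicate:
        seen[message_id] = now
    save_seen(path, seen)
    return duplicate
```

In a real handler, list_dir would be the list's own directory under lists/, and the Message-ID would come from the message being processed. Every call rewrites the whole pickle, which is the cost the question above is weighing against message volume.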
