On Sat, May 14, 2011 at 3:06 PM, Jeroen Vermeulen <[email protected]> wrote:
> On 2011-05-11 10:13, Robert Collins wrote:
>
>> I suspect an easy migration target if folk want one would be to
>> migrate all the fire-and-forget jobs to trigger via rabbit (leaving
>> the payload in the db), by hooking a 'do it now' message into the
>> post-transaction actions in zope.
>
> It's exciting news. We'll want to be careful in migrating jobs though: IIRC
> rabbit is nontransactional. That means we'll still need some way for
> consumers of jobs to recognize cases where the producer transaction aborted
> after firing off the job.
I believe 2pc is possible with amqp 0.10, but rabbitmq does 0.8 - and see the debate around 0mq for

> In some of those cases, executing a job unnecessarily won't hurt -- ones
> that refresh statistics for example. In others, the job absolutely must not
> execute.
>
> Without having looked into it properly, I think we'll need some kind of
> wrapper to support this distinction. Traditional transactional messaging
> uses two-phase commit; other products use database queues similar to our
> Job. Both are probably overweight to the point where our baby would go out
> with the bathwater. We could fake it by queuing up jobs in memory and
> sending them after commit, but that leaves open a window for message loss.

pg_amqp is probably the right thing for now, for must-be-coupled-to-pg-transaction issues (first sketch below). That said, I suspect much of our usage will be notifications rather than content transferral: that is, idempotent messages that trigger processing of something in a transactional store. Beyond that, I think we need to consider ordering things (where possible) to be idempotent and failure tolerant (given the potential overheads of 2pc).

> Another problem happens when things work too well: you create a
> database-backed object. You fire off a job related to that object. You
> commit. But the consumer of that job picks it up before your commit has
> propagated and boom! The job dies in flames because it accesses objects
> that aren't decisively in the database yet.

For this sort of thing fire-after-commit should be easy and sufficient (second sketch below).

> I imagine both problems go away if every message carries a database
> transaction id, and the job runner keeps an eye on the database transaction
> log: the runner shouldn't consume a job until the producing transaction has
> committed, and it should drop jobs whose producers have aborted. Is
> something along those lines built in?

No, and that sounds terrifying to me - if we tie things up that tightly, we may as well just have a queue in the db and poll it every 50ms. I think we need some optional glue for things that need transactional semantics or after-transaction semantics. But there are many other sorts of things (offhand: cancelling of jobs [tell the job service to do the cancellation], reporting of oopses, dispatching to clusters of workers, operational stats gathering, parallelism-within-a-request) for which we shouldn't need either of those constraints - and for those we should be as lean as we can.

-Rob
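P.S. Two rough sketches of what the glue could look like. First, the pg_amqp idea. This assumes the pg_amqp extension is installed and configured with a broker row (id 1 here); the table, column names and payload are made up purely for illustration. As I understand pg_amqp, amqp.publish() holds the message inside the current Postgres transaction, so it only goes out if that transaction commits and is dropped on rollback:

    import psycopg2

    # Hypothetical connection details; the point is that the data change
    # and the amqp.publish() call share one Postgres transaction.
    conn = psycopg2.connect("dbname=launchpad_dev")
    cur = conn.cursor()

    job_id = 42  # illustrative only

    # Make the job visible in the db...
    cur.execute("UPDATE job SET status = 'WAITING' WHERE id = %s", (job_id,))
    # ...and ask pg_amqp to publish a small notification from within the
    # same transaction. pg_amqp keeps the message until commit and discards
    # it on rollback, so consumers never hear about aborted work.
    cur.execute(
        "SELECT amqp.publish(1, 'launchpad', 'job.ready', %s)",
        (str(job_id),),
    )

    conn.commit()  # the UPDATE and the message stand or fall together

The consumer side then treats 'job.ready' as nothing more than a hint to go look at the Job table, which keeps the message idempotent.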

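Second, a minimal sketch of the fire-after-commit glue, using the transaction package's addAfterCommitHook; publish_notification() is a hypothetical helper that pushes a small message onto rabbit. The hook receives the commit status, so an aborted transaction never publishes anything:

    import transaction

    def publish_notification(routing_key, payload):
        """Hypothetical helper: push an idempotent 'go look at the db'
        message onto rabbit, fire-and-forget."""
        raise NotImplementedError  # wire up to whatever amqp client we pick

    def _publish_after_commit(status, routing_key, payload):
        # status is True only when the transaction actually committed,
        # so nothing is published for aborted transactions.
        if status:
            publish_notification(routing_key, payload)

    def fire_after_commit(routing_key, payload):
        """Register a notification to be sent once the current
        zope/transaction commit succeeds."""
        txn = transaction.get()
        txn.addAfterCommitHook(
            _publish_after_commit, args=(routing_key, payload))

    # Usage, somewhere in request handling before the commit:
    # fire_after_commit('branch.scanned', {'branch_id': 42})

That closes the consumer-sees-the-message-before-the-commit-lands window, at the cost of possibly losing a message if the process dies between commit and publish - which should be fine for idempotent notifications backed by the db.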
