Hey all --

I've been thinking about ways to improve how we handle temporary failures
in our background processing queues. For instance, if we're pushing a
message out to another service (OStatus federation, Twitter & Facebook
bridges) or pinging updates out over XMPP or another instant messaging
protocol, we might be temporarily unable to reach the other server.

Right now, we have to either drop the message immediately ("sorry, can't get
there") or put it back on the queue to be retried right away. That only
helps if trying again immediately is going to work. If a remote server is
down for a few minutes or a few hours, we're either pegging server
resources retrying something that doesn't work, or dumping the message
after a few retries, possibly just seconds later.

It would be great if we could say "ok, this queue item failed, but
can we try it again in 15 seconds? or 15 minutes? or 12 hours?"
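
To make that concrete, here is a rough sketch of a retry schedule, written
in Python for brevity rather than as actual StatusNet code. Everything in it
is hypothetical: the names next_retry_delay and handle_failure, the
"attempts" and "not_before" fields, and the choice to give up after the
third retry are assumptions for illustration, not something taken from the
wiki notes.

```python
from typing import Optional

# Hypothetical schedule using the example delays: 15 seconds, 15 minutes, 12 hours.
RETRY_DELAYS = [15, 15 * 60, 12 * 60 * 60]  # seconds


def next_retry_delay(attempt: int) -> Optional[int]:
    """Delay in seconds before retry number `attempt` (1-based), or None to give up."""
    if 1 <= attempt <= len(RETRY_DELAYS):
        return RETRY_DELAYS[attempt - 1]
    return None


def handle_failure(item: dict, now: float) -> Optional[dict]:
    """Return a copy of `item` stamped with its earliest retry time.

    Returns None when the schedule is exhausted and the item should be dropped.
    """
    attempt = item.get("attempts", 0) + 1
    delay = next_retry_delay(attempt)
    if delay is None:
        return None
    # "not_before" is an assumed field: the queue would skip the item until then.
    return {**item, "attempts": attempt, "not_before": now + delay}
```

The idea is that a handler which fails would hand the item back with a
"not before" timestamp instead of putting it straight back at the front of
the queue.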

I whipped up a few notes on the wiki on how this might be implemented
reasonably efficiently:
http://status.net/wiki/Delayed_queue

Any thoughts? My primary concern is making sure we can handle checks for
future events for thousands of sites hosted by the same server set, of
course, as that's our case here at StatusNet Inc. ;) But it should work for
small sites, too.
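
One possible way to keep that check cheap, sketched in Python under stated
assumptions: keep a single index of delayed items ordered by due time and
shared by all sites, so each poll only looks at the earliest entry no matter
how many sites are hosted. The class and method names below are invented for
this example, and the in-memory heap stands in for whatever persistent store
a real deployment would need; this is not necessarily the design described
on the wiki page.

```python
import heapq
from typing import Any, List, Optional, Tuple


class SharedDelayedQueue:
    """One due-time-ordered index for delayed items from every hosted site."""

    def __init__(self) -> None:
        # Entries are (due, seq, site, item); seq breaks ties so items
        # themselves are never compared.
        self._heap: List[Tuple[float, int, str, Any]] = []
        self._seq = 0

    def push(self, site: str, item: Any, due: float) -> None:
        """Schedule `item` for `site` to become available at time `due`."""
        heapq.heappush(self._heap, (due, self._seq, site, item))
        self._seq += 1

    def pop_due(self, now: float) -> List[Tuple[str, Any]]:
        """Remove and return every (site, item) whose due time has passed."""
        ready = []
        while self._heap and self._heap[0][0] <= now:
            _due, _seq, site, item = heapq.heappop(self._heap)
            ready.append((site, item))
        return ready

    def seconds_until_next(self, now: float) -> Optional[float]:
        """How long a daemon could sleep before anything is due; None if empty."""
        if not self._heap:
            return None
        return max(0.0, self._heap[0][0] - now)
```

With this shape, sites that have nothing pending add no work to the poll,
and a small single-site install just ends up with a very short heap.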

-- brion
_______________________________________________
StatusNet-dev mailing list
[email protected]
http://lists.status.net/mailman/listinfo/statusnet-dev
