Hi folks,

I thought I'd poll the list on the best way to approach a problem in Twisted.

The background is that we have a number of resources which can be requested by a REST client, and which are calculated on demand. The calculation is moderately expensive (can take multiple seconds), so the results of the calculation are cached so multiple lookups of the same resource are more efficient.

The problem comes in trying to handle multiple clients requesting the same resource at once. Obviously if 200 clients all request the same resource at the same time, we don't want to fire off 200 calculation requests.

The approach we adopted was, effectively, to maintain a lock for each resource:

lock = defer.DeferredLock()
cached_result = None

@defer.inlineCallbacks
def getResource():
     yield lock.acquire()
     try:
         if cached_result is None:
             cached_result = yield do_expensive_calculation()
         defer.returnValue(cached_result)
     finally:
         lock.release()

(Of course one can optimise the above to avoid getting the lock if we already have the cached result - I've omitted that for simplicity.)

That's all very well, but it falls down when we get more than about 200 requests for the same resource: once the calculation completes, we can suddenly serve all the requests, and the Deferreds returned by DeferredLock end up chaining together in a way that overflows the stack.

I reported this as http://twistedmatrix.com/trac/ticket/9304 and, at the time, worked around it by adding a call to reactor.callLater(0) into our implementation. However, Jean-Paul's comments on that bug implied that we were approaching the problem in completely the wrong way, and instead we should be avoiding queuing up work like this in the first place.

It's worth reiterating that the requests arrive from REST clients which we have no direct control over. We *could* keep track of the number of waiting clients, and make the API respond with a 5xx error or similar if that number gets too high, with the expectation that the client retries - but one concern would be that the load from the additional HTTP traffic would outweigh any efficiency gained by not stacking up Deferreds.

So, I'd welcome any advice on better ways to approach the problem.

Richard

_______________________________________________
Twisted-Python mailing list
Twisted-Python@twistedmatrix.com
https://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python

Reply via email to