On Friday, 27 September 2019 04:38:46 CEST Waqar Khan wrote:
> Hi,
> What's a good way to use a simple dictionary as a cache in the Twisted
> framework?
> Basically, I have this callback chain where I ultimately make a REST
> call (in a non-blocking way, using treq) to fetch some data. But before
> I make the call, I use a dictionary to check whether the value is
> already available. However, I have noticed that the event loop gets
> pretty busy (sometimes things get stuck and the Twisted server stops)
> as soon as I add this logic, which is pretty much:
>
> @defer.inlineCallbacks
> def fetch(key):
>     if key in cache:
>         return cache[key]
>     # else call back to treq to fetch value
>     cache[key] = value
>     return value
>
> This dict can grow to around 50k entries. What's a good way to solve
> this issue?
If it gets stuck, then the cause is probably in the part of the code you
omitted, so it would help to explain exactly how the value is fetched. I can
see two other problems with this caching mechanism, though:

1. Items are never removed from the cache, so unless there is a limit to the
   number of different keys that can be used, the cache can grow
   indefinitely. You might want something like an LRU cache rather than a
   plain dictionary:
   https://docs.python.org/3/library/functools.html#functools.lru_cache

2. If a lot of clients are requesting the same thing, you won't see any
   benefit from caching until the first request completes. So you could get
   a pattern like this:

   T=0: key A requested, A is not cached, start fetch #1 of A
   T=1: key A requested, A is not cached, start fetch #2 of A
   T=2: key A requested, A is not cached, start fetch #3 of A
   T=3: key A requested, A is not cached, start fetch #4 of A
   T=4: key A requested, A is not cached, start fetch #5 of A
   T=5: fetch #1 of A completes and is added to the cache
   T=6: key A requested, A is cached, return value immediately

   In this example, the value for A is fetched 5 times despite the caching
   mechanism. If fetching takes a long time compared to the rate at which
   requests come in, this effect gets worse quadratically: the total time
   spent fetching is the number of requests that arrive during the first
   fetch, multiplied by the duration of a single fetch.

   To avoid this, you can put a Deferred for the fetch operation in the
   cache (or in a separate dictionary); if another request for the same key
   arrives before the fetch completes, return that Deferred instead of
   starting another fetch.

Bye,
	Maarten

_______________________________________________
Twisted-Python mailing list
Twisted-Python@twistedmatrix.com
https://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python