Hello, I have some questions regarding django.middleware.cache, which is an impressive feature of Django. I am trying to understand it better, but some design decisions are unclear to me. I am sure you had your reasons for doing it that way, and I want to know them. ;-)

1) Prefix. All cache keys are prefixed with 'views.decorators.cache.cache_page.'. That is a rather long prefix. Why didn't you use something shorter? It looks as if Django's cache objects were once a tiny part of some huge cache, so you worried about a potential clash. Is it an original design restriction? Do you have any rational explanation for why we have it in Django now?

2) Gzip. The gzip flag is part of the cache key. Additionally, 'Content-Encoding' = 'gzip' is part of the response object. If somebody requests a page but cannot accept gzip-encoded content, it is trivial to ungzip it. And vice versa: we can gzip uncompressed data. This can be exploited using several strategies:

a) Active: every time we put something in the cache, we put 2 objects: gzipped and uncompressed. Cons: extra work + extra cache space if nobody wants one of the generated versions.

b) Passive: we keep one version in the cache and generate the counterpart dynamically. Cons: extra work, which can be compounded if we have the "wrong" version in the cache.

c) Passive-aggressive: this is a variation of b. We always keep the compressed version in the cache, saving space and transfer time. Practically all modern browsers accept compressed content (if I remember correctly, Opera is a notable exception). For the rest of them we uncompress on the fly. I hope it will be a rather rare event.

d) Lazy: this is a variation of a. If we have some version in the cache and its counterpart is requested, we generate it from the cached version and save it in the cache as well.

b and d may require an extra lookup => has_key() should be implemented efficiently. It had better be. As far as I can tell, that is not the case for memcache. Nevertheless, it can shave off time for expensive requests.

Why did you decide to implement a multi-component key that has the gzip flag as its last component? Why did you decide to _generate_ content through the full Django machinery independently for the gzipped and uncompressed versions?

3) 404s. I've noticed that all responses are cached, including responses with status code 404, and everything else returned from Django. Was it the original intention to cache 404s? Why? Was it too expensive to recheck and too frequent to ignore? What about the rest of the non-200 codes?

I hope someone will educate me on these issues.

Thank you in advance,

Eugene
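
P.S. To make option c) from question 2 concrete, here is a minimal sketch of what I have in mind. It is not Django's code: the dict `cache` and the functions `store_page` and `fetch_page` are hypothetical names of mine, a plain dict stands in for the real cache backend, and the Accept-Encoding check is deliberately naive.

```python
import gzip

# Hypothetical stand-in for a cache backend; a plain dict, for illustration only.
cache = {}


def store_page(key, body):
    """Strategy c: keep only the gzipped version of the page body (bytes)."""
    cache[key] = gzip.compress(body)


def fetch_page(key, accept_encoding):
    """Return (body, content_encoding), or None on a cache miss.

    accept_encoding is the raw value of the request's Accept-Encoding header.
    """
    compressed = cache.get(key)
    if compressed is None:
        return None
    # Naive substring check, assumed good enough for a sketch;
    # it ignores q-values such as "gzip;q=0".
    if "gzip" in accept_encoding.lower():
        return compressed, "gzip"
    # Rare path: the client cannot accept gzip, so uncompress on the fly.
    return gzip.decompress(compressed), None
```

With this layout there is one cache entry per page, so the key would not need a gzip component, and the view would run once no matter which encoding the first client asked for.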
