On 1/19/07, Zoran Vasiljevic <[EMAIL PROTECTED]> wrote:
Hi!

I have observed an ns_cache behaviour that I'd like to change. In ns_cache it is possible to create timestamped elements that expire after some time, but they will never expire on their own. If you fill the cache with thousands of elements and never visit any of them again, they stay in the cache forever, bloating memory.

There has to be some instance that either automatically purges expired elements (yet another detached thread), or one should be able to do that programmatically, like

    ns_cache_flush ?-expired? cache

otherwise the system memory just bloats and bloats.

At the moment, the only workaround for this behaviour is the programmer calling ns_cache_keys, as it returns all still-valid keys, silently deleting expired ones. This is not optimal, as again large memory might be needed to store all the "alive" keys, although the caller is not particularly interested in having them in the first place.

What should we do?
I don't think it's 100% accurate to say memory just bloats and bloats! All caches have an upper bound; a 10MB cache will not grow beyond 10MB.

There are two ways an expired entry will be physically removed from the cache: it reaches the end of the LRU list and is pruned whether it has expired or not, or it is accessed but has expired, so it is pruned rather than returned.

The idea is that a cache contains a subset of the total data set. The natural action of fresh data entering the cache pushes out the expired entries without overhead, just as it pushes out infrequently used entries.

Looks like you have a different situation though. Your cache is sized large enough to hold all the data, but the usual working set is much smaller. With the fancy new memory allocator that returns memory to the OS, you'd like to prune the cache of expired entries even though the cache is still below its configured max size.

I removed this functionality from the old code! I'm fine with it being put back, but there were a couple of problems with the old code worth mentioning.

In the old code, caches were either size limited *or* time limited, but not both. With a time limited cache you therefore had to have a sweeper thread to clear out expired entries, or it would indeed bloat and bloat... Unfortunately, if a load spike occurred in between sweeps, the cache would grow until the server aborted.

Another problem (if I'm remembering this right) was that the sweep interval was the same as the expiry time for entries, which was specified at cache creation time. So if you had a large cache with a short expiry time, a thread would frequently be created, lock the cache, scan through all the entries, maybe evict a few, then unlock the cache. And then do it all again a moment later.

So for a new sweeper, maybe the way to handle it is to create Ns_CacheFlushExpired(), which can be passed to Ns_ScheduleProc(). C code can handle the period manually.
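To make the proposal concrete, here is a minimal, self-contained sketch of what such a flush pass might look like. The types and names (Entry, Cache, CacheFlushExpired) are hypothetical stand-ins, not the actual NaviServer sources; in the real server the pass would run under the cache lock and unlink entries from both the hash table and the LRU list.

```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical, simplified cache structures for illustration only. */
typedef struct Entry {
    struct Entry *next;
    time_t        expires;   /* absolute expiry time, 0 = never expires */
    const char   *key;
} Entry;

typedef struct Cache {
    Entry *head;
} Cache;

/*
 * Remove every entry whose expiry time has passed and return the
 * number of entries flushed. A scheduled proc (registered via
 * something like Ns_ScheduleProc) would call this periodically.
 */
static int
CacheFlushExpired(Cache *cachePtr, time_t now)
{
    Entry **nextPtrPtr = &cachePtr->head;
    int     nflushed = 0;

    while (*nextPtrPtr != NULL) {
        Entry *ePtr = *nextPtrPtr;

        if (ePtr->expires != 0 && ePtr->expires <= now) {
            *nextPtrPtr = ePtr->next;    /* unlink and free the entry */
            free(ePtr);
            nflushed++;
        } else {
            nextPtrPtr = &ePtr->next;    /* keep it, advance */
        }
    }
    return nflushed;
}
```

Because the sweep walks every entry while holding the lock, the interval matters: run it too often on a large cache and you recreate the lock-contention problem described above, which is why decoupling the flush interval from the entry expiry time seems worthwhile.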
For Tcl, add a new switch to ns_cache_create: -expireflush (or whatever). Keep the -expire and -expireflush times distinct. Tcl caches live forever IIRC, so that's all there is to it; no need to handle unregistering the scheduled proc.

I've been thinking about how to auto-tune cache sizes, and it occurs to me that maybe it solves this problem too. The idea is to use a common LRU list for all caches. Specify a total maximum cache size which all caches share; don't specify individual cache sizes. You set the total max cache size to whatever your server can spare, and as the server runs and load changes, the caches compete for a slice of the memory. In effect, creating a new cache is creating a new name space in the single global cache.

The more caches exist, the harder it is to tune the sizes for maximum performance. When you consider dynamically changing load, it's basically impossible. The above is just an implementation; the underlying problem is: given space X and Y caches of varying load, how do you assign the space to the caches such that, overall, performance is highest? Hit rate is not necessarily the best measure of effectiveness. Perhaps it's more expensive to fill one cache than it is to fill another? A single, shared LRU seemed like the simplest solution. Ideas appreciated.

Anyway, this applies to your current problem because, by using a common LRU, the contents of other caches can push out the expired entries from a cache which would otherwise leave them active. My assumption is the same: caches are usually smaller than the total data size, so the common case will compensate for your unusual cache problem.

If you only have the one cache, or all your caches fit fully into memory, I guess this doesn't apply and you need the sweeper. Otherwise, the auto-tuning caches would mean no new extra tuning knob for expiry flushing, and of course the auto size tuning...
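The common-LRU idea can be sketched in a few lines of self-contained C. All names here (GlobalLru, Ent, Insert) are made up for illustration; the point is only that eviction happens from one shared tail against one shared budget, so a busy cache's fresh entries can evict another cache's stale ones.

```c
#include <stdio.h>
#include <stdlib.h>

/* One entry on the shared LRU list; cacheId says which named cache
 * owns it, but eviction ignores ownership entirely. */
typedef struct Ent {
    struct Ent *prev, *next;   /* global LRU list, head = most recent */
    int         cacheId;
    size_t      size;
} Ent;

typedef struct {
    Ent   *head, *tail;
    size_t used, maxSize;      /* one size budget shared by all caches */
} GlobalLru;

static void
Unlink(GlobalLru *g, Ent *e)
{
    if (e->prev != NULL) e->prev->next = e->next; else g->head = e->next;
    if (e->next != NULL) e->next->prev = e->prev; else g->tail = e->prev;
    g->used -= e->size;
}

/*
 * Insert at the head, evicting from the shared tail until the new
 * entry fits. The victim may belong to any cache -- this is what lets
 * active caches push out another cache's expired/idle entries.
 */
static void
Insert(GlobalLru *g, Ent *e)
{
    while (g->tail != NULL && g->used + e->size > g->maxSize) {
        Ent *victim = g->tail;
        Unlink(g, victim);
        free(victim);
    }
    e->prev = NULL;
    e->next = g->head;
    if (g->head != NULL) g->head->prev = e; else g->tail = e;
    g->head = e;
    g->used += e->size;
}
```

With, say, a 100-byte budget, inserting a 60-byte entry for cache 2 when cache 1 already holds 60 bytes evicts cache 1's entry, even though cache 1 never touched its own LRU. That's the whole mechanism; the open question from above (whether hit rate is the right fitness measure when fill costs differ per cache) is untouched by this sketch.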
