On Dec 11, 2013, at 12:29 PM, Esteban Lorenzano <esteba...@gmail.com> wrote:
> On Tue, Dec 10, 2013 at 11:09 PM, Stéphane Ducasse <stephane.duca...@inria.fr> wrote:
>
>> Sven thanks a lot
>>
>> yes! really cool!
>>
>> How can we resist pushing this into Pharo 3.0 :)
>
> over my dead body :)
> but in Pharo 4…

you are saying that you will let Squeak be in advance compared to Pharo :) I cannot believe that :).

Stef

On Dec 10, 2013, at 4:54 PM, Sven Van Caekenberghe <s...@stfx.eu> wrote:

Hi,

Caching is an important technique for trading space (memory consumption) for time (speed of execution) in computation. Used correctly, caching can significantly improve an application's performance.

Given the key/value nature of a cache, a Dictionary is often the first data structure used to implement one. This is not a good idea, however: a plain Dictionary is unbounded by definition and will lead to explosive cache growth.

Quick and dirty hacks to turn a Dictionary into a least-recently-used (LRU) and/or time-to-live (TTL) based bounded cache are often incorrect and inefficient.

The LRUCache implementation currently in Pharo is too simple and not efficient, being O(n) on its main operation.

Pharo deserves and needs a good LRU/TTL cache implementation that can be used uniformly in various places of the system and in code built on top of it. That is why I implemented a proposal.

Neo-Caching is an implementation of both a good LRU cache and a TTL extension of it. The caches are easy to use, yet have a number of interesting features. The implementation is properly O(1) on its main operations and is 2 to 3 times faster than LRUCache. The package also contains a doubly linked list implementation. The code comes with a full set of unit tests. There are class and method comments, and the code is easy to read.
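The O(1) claim rests on the classic LRU structure: a Dictionary mapping keys to nodes of a doubly linked list kept in recency order, so a hit (move the node to the front) and an eviction (drop the tail node) are both constant-time. A minimal sketch of such a lookup, using hypothetical names (dictionary, list, moveToFront:, evictIfNecessary) that are illustrative only and not the actual Neo-Caching API:

  at: key
    "Answer the cached value for key, computing it on a miss.
    Assumption: dictionary maps keys to doubly-linked-list nodes,
    and list keeps those nodes in most-recently-used-first order."
    | node |
    node := dictionary at: key ifAbsent: [ nil ].
    node isNil
      ifTrue: [
        "Miss: compute the value, make it the most recent entry,
        then evict tail nodes while the cache is over its maximum weight."
        node := list addFirst: (factory value: key).
        dictionary at: key put: node.
        self evictIfNecessary ]
      ifFalse: [
        "Hit: splice the node out and relink it at the front, both O(1)."
        list moveToFront: node ].
    ^ node value

A Dictionary alone cannot track recency without an O(n) scan; the doubly linked list is what makes the move-to-front and evict-from-tail steps constant-time.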
The code (which has no dependencies) can be found here:

  http://mc.stfx.eu/Neo
  http://www.smalltalkhub.com/mc/SvenVanCaekenberghe/Neo/main

There is only one package to load. The code has been written in Pharo 3.0, but it should run on older versions as well.


Code


Let's create a collection of about 16K keys with some repetitions in them and a reasonable distribution:

  data := Array streamContents: [ :out |
    1 to: 4096 do: [ :each |
      each primeFactorsOn: out.
      out nextPut: each ] ].
  data := data collect: [ :each | each asWords ].

By using #asWords the keys have better hash properties. Now, we put all of this in a cache:

  cache := NeoLRUCache new.
  cache maximumWeight: 512.
  cache factory: [ :key | key ].
  data do: [ :each | cache at: each ].
  cache.

  ==> a NeoLRUCache(#512 512/512 [ 1 ] [ :key | key ] 72%)

We had a 72% hit ratio, which is good. The cache is of course full, 512/512, with 512 entries (not necessarily the same thing, see below).

We can now benchmark this:

  [ data do: [ :each | cache at: each ] ] bench.

  ==> '35.8 per second.'

And compare it to LRUCache:

  cache := LRUCache size: 512 factory: [ :key | key ].

  [ data do: [ :each | cache at: each ] ] bench.

  ==> '12.6 per second.'


Features


Instead of just counting the number of entries, which is the default behaviour, the concept of weight is used to determine when a cache is full. For each cached value, a block or selector is used to compute its weight. When adding a new entry causes the total weight to exceed a maximum, the least recently used item(s) are evicted until the weight is again below the maximum.

  cache
    computeWeight: #sizeInMemory;
    maximumWeight: 16*1024.

will keep the cache below 16KB of real memory usage.
This can be very useful when caching variously sized images or MC packages, for example.

By default, there is no protection against concurrent access, but optionally a semaphore can be used for mutual exclusion. This slows down access.

  cache useSemaphore.

NeoTTLCache extends NeoLRUCache by maintaining a timestamp for each cached value. Upon a cache hit, there is a check to see whether this timestamp is older than the allowed time to live. If it is, the value is stale and will be recomputed.

  (cache := NeoTTLCache new)
    timeToLive: 10 minutes;
    factory: [ :key | ZnEasy get: key ];
    maximumWeight: 32*1024;
    computeWeight: #contentLength.

  cache at: 'http://zn.stfx.eu/zn/numbers.txt'.

  ==> a ZnResponse(200 OK text/plain;charset=utf-8 71B)

  cache

  ==> a NeoTTLCache(#1 71/32768 #contentLength [ :key | ZnEasy get: key ] 0% 0:00:10:00)

This would be a simple HTTP cache, keeping resolved URLs in memory for at most 10 minutes before refreshing them. It uses #contentLength on ZnResponse to compute the weight. You can see from the print string that there is now 1 entry, while the weight is 71 bytes out of 32768.


These are the main points; please have a look at the code. Feedback is welcome.

Regards,

Sven