On Dec 11, 2013, at 12:29 PM, Esteban Lorenzano <esteba...@gmail.com> wrote:
> On Tue, Dec 10, 2013 at 11:09 PM, Stéphane Ducasse <stephane.duca...@inria.fr> wrote:
>
>> Sven thanks a lot
>>
>> yes! really cool!
>>
>> How can we resist pushing this into Pharo 3.0 :)
>
> over my dead body :)
> but in Pharo 4…

you are saying that you will let Squeak be in advance compared to Pharo :) I cannot believe that :).

Stef

On Dec 10, 2013, at 4:54 PM, Sven Van Caekenberghe <s...@stfx.eu> wrote:

Hi,

Caching is an important technique for trading space (memory consumption) for time (speed of execution) in computation. Used correctly, caching can significantly improve an application's performance.

Given the key/value nature of a cache, a Dictionary is often the first data structure used to implement one. This is not a good idea, however: a plain Dictionary is unbounded by definition and will lead to explosive cache growth.

Quick and dirty hacks to turn a Dictionary into a least-recently-used (LRU) and/or time-to-live (TTL) based bounded cache are often incorrect and inefficient.

The LRUCache implementation currently in Pharo is too simple and not efficient, being O(n) on its main operation.

Pharo deserves and needs a good LRU/TTL cache implementation that can be used uniformly in various places of the system and in code built on top of it. That is why I implemented a proposal.

Neo-Caching is an implementation of both a good LRU cache and a TTL extension of it. The caches are easy to use, yet have a number of interesting features. The implementation is properly O(1) on its main operations and is 2 to 3 times faster than LRUCache. The package also contains a doubly linked list implementation. The code comes with a full set of unit tests. There are class and method comments, and the code is easy to read.
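The O(1) claim rests on the classic LRU structure: a Dictionary mapping keys to nodes of a doubly linked list kept in recency order, so a hit (move the node to the front) and an eviction (drop the tail node) are both constant-time. A minimal sketch of such a lookup, using hypothetical names (dictionary, list, moveToFront:, evictIfNecessary) that are illustrative only and not the actual Neo-Caching API:

  at: key
    "Answer the cached value for key, computing it on a miss.
    Assumption: dictionary maps keys to doubly-linked-list nodes,
    and list keeps those nodes in most-recently-used-first order."
    | node |
    node := dictionary at: key ifAbsent: [ nil ].
    node isNil
      ifTrue: [
        "Miss: compute the value, make it the most recent entry,
        then evict tail nodes while the cache is over its maximum weight."
        node := list addFirst: (factory value: key).
        dictionary at: key put: node.
        self evictIfNecessary ]
      ifFalse: [
        "Hit: splice the node out and relink it at the front, both O(1)."
        list moveToFront: node ].
    ^ node value

A Dictionary alone cannot track recency without an O(n) scan; the doubly linked list is what makes the move-to-front and evict-from-tail steps constant-time.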
The code (which has no dependencies) can be found here:

  http://mc.stfx.eu/Neo
  http://www.smalltalkhub.com/mc/SvenVanCaekenberghe/Neo/main

There is only one package to load. The code has been written in Pharo 3.0, but it should run on older versions as well.


Code


Let's create a collection of about 16K keys with some repetitions in them and a reasonable distribution:

  data := Array streamContents: [ :out |
    1 to: 4096 do: [ :each |
      each primeFactorsOn: out.
      out nextPut: each ] ].
  data := data collect: [ :each | each asWords ].

By using #asWords the keys have better hash properties. Now, we put all of this in a cache:

  cache := NeoLRUCache new.
  cache maximumWeight: 512.
  cache factory: [ :key | key ].
  data do: [ :each | cache at: each ].
  cache.

  ==> a NeoLRUCache(#512 512/512 [ 1 ] [ :key | key ] 72%)

We had a 72% hit ratio, which is good. The cache is of course full, 512/512, with 512 entries (not necessarily the same thing, see below).

We can now benchmark this:

  [ data do: [ :each | cache at: each ] ] bench.

  ==> '35.8 per second.'

And compare it to LRUCache:

  cache := LRUCache size: 512 factory: [ :key | key ].

  [ data do: [ :each | cache at: each ] ] bench.

  ==> '12.6 per second.'


Features


Instead of just counting the number of entries, which is the default behaviour, the concept of weight is used to determine when a cache is full. For each cached value, a block or selector is used to compute its weight. When adding a new entry causes the total weight to exceed a maximum, the least recently used item(s) are evicted until the weight is again below the maximum.

  cache
    computeWeight: #sizeInMemory;
    maximumWeight: 16*1024.

will keep the cache below 16KB of real memory usage.
This can be very useful when caching variously sized images or MC packages, for example.

By default, there is no protection against concurrent access, but optionally a semaphore can be used for mutual exclusion. This slows down access.

  cache useSemaphore.

NeoTTLCache extends NeoLRUCache by maintaining a timestamp for each cached value. Upon a cache hit, there is a check to see whether this timestamp is older than the allowed time to live. If it is, the value is stale and will be recomputed.

  (cache := NeoTTLCache new)
    timeToLive: 10 minutes;
    factory: [ :key | ZnEasy get: key ];
    maximumWeight: 32*1024;
    computeWeight: #contentLength.

  cache at: 'http://zn.stfx.eu/zn/numbers.txt'.

  ==> a ZnResponse(200 OK text/plain;charset=utf-8 71B)

  cache

  ==> a NeoTTLCache(#1 71/32768 #contentLength [ :key | ZnEasy get: key ] 0% 0:00:10:00)

This would be a simple HTTP cache, keeping resolved URLs in memory for at most 10 minutes before refreshing them. It uses #contentLength on ZnResponse to compute the weight. You can see from the print string that there is now 1 entry, while the weight is 71 bytes out of 32768.


These are the main points; please have a look at the code. Feedback is welcome.

Regards,

Sven