Title: RE: [PROPOSAL] Cache project...

For what it's worth, I too have a home-grown cache implementation that we've been pretty happy with.  We use this cache in a number of places, from caching JSP output (using a set of <cache/> tags), to caching database results, and like Daniel suggests, computationally expensive results. 

I had actually been planning to propose a cache project as well, so I think I'm in favor, but I have some design suggestions.

Like Craig suggests, the interface is essentially the same as ObjectPool or HashMap--i.e., put an object into the cache, get an object from the cache--but of course the cache can return multiple copies of the object, and the put operation often includes extra attributes such as time-to-live or cost.  Our basic cache interface looks like:

 boolean store(Serializable key, Serializable val, Long expiry, Long cost)
          Store the specified val under the specified key.
 Serializable retrieve(Serializable key)
          Obtain the value previously stored under the given key.
 void clear()
          Remove all values previously stored.
 void clear(Serializable key)
          Remove any value previously stored under the given key.
 boolean contains(Serializable key)
          Returns true if I have a value associated with the given key, false otherwise.
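
To make the listing above concrete, here is a rough sketch of the five operations as a Java interface with a trivial in-memory implementation. The class name SimpleMemoryCache is hypothetical, and two things are assumed because the listing does not say: that expiry is an absolute time in milliseconds (null meaning "never expires") and that cost can be ignored by a simple implementation.

```java
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

// The five operations from the listing, written out as an interface.
interface Cache {
    boolean store(Serializable key, Serializable val, Long expiry, Long cost);
    Serializable retrieve(Serializable key);
    void clear();
    void clear(Serializable key);
    boolean contains(Serializable key);
}

// Hypothetical in-memory implementation, for illustration only.
class SimpleMemoryCache implements Cache {
    // Assumption: expiry is an absolute time in epoch millis; null = never expires.
    private static final class Entry {
        final Serializable val;
        final Long expiry;
        Entry(Serializable val, Long expiry) { this.val = val; this.expiry = expiry; }
        boolean isStale(long now) { return expiry != null && expiry <= now; }
    }

    private final Map<Serializable, Entry> entries = new HashMap<>();

    public synchronized boolean store(Serializable key, Serializable val, Long expiry, Long cost) {
        // cost is accepted but ignored in this sketch
        if (key == null || val == null) return false;
        entries.put(key, new Entry(val, expiry));
        return true;
    }

    public synchronized Serializable retrieve(Serializable key) {
        Entry e = entries.get(key);
        if (e == null) return null;
        if (e.isStale(System.currentTimeMillis())) {
            entries.remove(key);   // drop expired values lazily on access
            return null;
        }
        return e.val;
    }

    public synchronized void clear() { entries.clear(); }

    public synchronized void clear(Serializable key) { entries.remove(key); }

    public synchronized boolean contains(Serializable key) { return retrieve(key) != null; }
}
```

Unlike a pool, retrieve() leaves the value in place, so any number of callers can get the same object.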


The cache can publish store/retrieve/evict events to listeners:

 void registerRetrievalListener(RetrievalListener obs)
          Add the given RetrievalListener to my set of RetrievalListeners.
 void registerStorageListener(StorageListener obs)
          Add the given StorageListener to my set of StorageListeners.
 void unregisterRetrievalListener(RetrievalListener obs)
          Remove the given RetrievalListener from my set of RetrievalListeners.
 void unregisterRetrievalListeners()
          Clear my set of RetrievalListeners.
 void unregisterStorageListener(StorageListener obs)
          Remove the given StorageListener from my set of StorageListeners.
 void unregisterStorageListeners()
          Clear my set of StorageListeners.
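
A minimal sketch of how the retrieval half of this might be wired up. The listing does not show what a RetrievalListener looks like, so the retrieved(key, hit) callback and the RetrievalEventSupport helper class are both made up for illustration; a StorageListener would follow the same pattern.

```java
import java.io.Serializable;
import java.util.Set;
import java.util.concurrent.CopyOnWriteArraySet;

// Hypothetical callback shape; the real interface is not shown in the listing.
interface RetrievalListener {
    void retrieved(Serializable key, boolean hit);
}

// Hypothetical helper that a cache could delegate its listener bookkeeping to.
class RetrievalEventSupport {
    // A set, so registering the same listener twice has no extra effect.
    private final Set<RetrievalListener> listeners = new CopyOnWriteArraySet<>();

    void registerRetrievalListener(RetrievalListener obs) { listeners.add(obs); }

    void unregisterRetrievalListener(RetrievalListener obs) { listeners.remove(obs); }

    void unregisterRetrievalListeners() { listeners.clear(); }

    // The cache would call this from retrieve(), once per lookup.
    void fireRetrieved(Serializable key, boolean hit) {
        for (RetrievalListener l : listeners) {
            l.retrieved(key, hit);
        }
    }
}
```

A copy-on-write set is used here so that a listener can unregister itself from inside its own callback without breaking the iteration.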

The cache also contains the notion of a "group"--a sort of meta-key that can be associated with more than one object in the cache:

 boolean store(Serializable key, Serializable val, Long expiry, Long cost, Serializable group)
          Store the specified val under the specified key and the specified group.
 Serializable[] getKeysForGroup(Serializable group)
          Obtain the keys currently stored under the given group.
 void clearGroup(Serializable group)
          Remove any value previously stored under the given group.
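
One way the group bookkeeping could work is a second map from each group to the keys stored under it. The sketch below is an assumption about the mechanism, not a description of the actual implementation; GroupedCache is a hypothetical name, and expiry and cost are left out to keep it short.

```java
import java.io.Serializable;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the "group" meta-key idea.
class GroupedCache {
    private final Map<Serializable, Serializable> values = new HashMap<>();
    private final Map<Serializable, Set<Serializable>> keysByGroup = new HashMap<>();

    // Limitation of this sketch: re-storing a key under a different group
    // does not remove it from its old group.
    synchronized boolean store(Serializable key, Serializable val, Serializable group) {
        values.put(key, val);
        keysByGroup.computeIfAbsent(group, g -> new LinkedHashSet<Serializable>()).add(key);
        return true;
    }

    synchronized Serializable retrieve(Serializable key) {
        return values.get(key);
    }

    synchronized Serializable[] getKeysForGroup(Serializable group) {
        Set<Serializable> keys = keysByGroup.get(group);
        return keys == null ? new Serializable[0] : keys.toArray(new Serializable[0]);
    }

    // Removes every value that was stored under the given group.
    synchronized void clearGroup(Serializable group) {
        Set<Serializable> keys = keysByGroup.remove(group);
        if (keys != null) {
            for (Serializable k : keys) {
                values.remove(k);
            }
        }
    }
}
```

This is handy for things like invalidating every cached page fragment that depends on one database table in a single call.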


The main thing that we do right in this cache impl., I think, is that we treat the Cache as an aggregation of 'policy' objects, making it easy to create different types of caches and/or different cache configurations.  Specifically, the Cache is an aggregation of:

 * A "Stash" which is a physical storage mechanism for cached objects (memory, disk, database, etc.)

 * An (optional) "StashPolicy" which determines whether a given object is cacheable

 * An (optional) "EvictionPolicy" which determines which objects to evict (remove from the cache) when the cache is full (Least Recently Used, Least Relative Value, Least Frequently Used, etc.)

 * A "StaleObjectEvictor" which removes stale (expired) objects from the cache.

I definitely think that that is the right general approach.
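
To illustrate the aggregation idea, here is a rough sketch. Only the type names Stash, StashPolicy and EvictionPolicy come from the description above; every method signature, the capacity parameter, and the MemoryStash, FifoEvictionPolicy and PolicyCache classes are assumptions. The StaleObjectEvictor is left out, and a simple first-in-first-out policy stands in for LRU/LFU to keep the example short.

```java
import java.io.Serializable;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical method signatures; only the type names come from the text.
interface Stash {
    void put(Serializable key, Serializable val);
    Serializable get(Serializable key);
    void remove(Serializable key);
    int size();
}

interface StashPolicy {
    boolean shouldStore(Serializable key, Serializable val);
}

interface EvictionPolicy {
    void stored(Serializable key);
    Serializable nextVictim();   // key to evict, or null if there is none
}

class MemoryStash implements Stash {
    private final Map<Serializable, Serializable> map = new HashMap<>();
    public void put(Serializable key, Serializable val) { map.put(key, val); }
    public Serializable get(Serializable key) { return map.get(key); }
    public void remove(Serializable key) { map.remove(key); }
    public int size() { return map.size(); }
}

// Evicts in insertion order; a stand-in for the LRU/LFU policies.
class FifoEvictionPolicy implements EvictionPolicy {
    private final Deque<Serializable> order = new ArrayDeque<>();
    public void stored(Serializable key) {
        if (!order.contains(key)) order.addLast(key);
    }
    public Serializable nextVictim() { return order.pollFirst(); }
}

// The cache itself only coordinates the pluggable parts.
class PolicyCache {
    private final Stash stash;
    private final StashPolicy stashPolicy;       // optional, may be null
    private final EvictionPolicy evictionPolicy; // optional, may be null
    private final int capacity;                  // assumption: "full" means an entry limit

    PolicyCache(Stash stash, StashPolicy stashPolicy, EvictionPolicy evictionPolicy, int capacity) {
        this.stash = stash;
        this.stashPolicy = stashPolicy;
        this.evictionPolicy = evictionPolicy;
        this.capacity = capacity;
    }

    synchronized boolean store(Serializable key, Serializable val) {
        if (stashPolicy != null && !stashPolicy.shouldStore(key, val)) return false;
        if (stash.get(key) == null && stash.size() >= capacity) {
            Serializable victim = evictionPolicy == null ? null : evictionPolicy.nextVictim();
            if (victim == null) return false;   // full and nothing to evict
            stash.remove(victim);
        }
        stash.put(key, val);
        if (evictionPolicy != null) evictionPolicy.stored(key);
        return true;
    }

    synchronized Serializable retrieve(Serializable key) { return stash.get(key); }
}
```

Swapping MemoryStash for a disk-backed stash, or the FIFO policy for an LRU one, would then change the behavior without touching PolicyCache.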

> All told, do you think we are trying to achieve something
> similar, or is my approach too heavyweight for
> your needs?

Daniel, I think we're talking about something similar, don't you?

-----Original Message-----
From: Daniel Hoppe
To: '[EMAIL PROTECTED]'
Cc: Felix Schauerte; Stefan Siprell
Sent: 5/17/01 4:10 PM
Subject: AW: [PROPOSAL] Cache project...

James, Craig,

I did not follow the ObjectPool discussions too closely, so I hope I'm
not missing the point. As far as I understand it:

- the pool hands out an object exactly once. The object is unavailable
  to others until it is returned
- the cache may hand out an object several times. The object is not
  exclusively used.

The cache could therefore be used for e.g. complex computation results,
results of database queries and other computing-time-intensive tasks.
The object pool contains scarce resources, typically database
connections.

James, what do you think of making a cache JMX compliant? I'm working on
a cache which is supposed to buffer data in a content management system.
The cache is supposed to store objects which are quite expensive, so the
caching will be crucial for application performance. That's why I need
to have a good overview of what's happening inside the cache, not just
as some debugging output but rather in a fashion that can be remotely
monitored and integrated with monitoring tools.

I plan to implement the following measures:
- objects in cache,
- cache hits,
- invalidations,
- cache misses for keys that have not been in the cache yet,
- cache misses due to a constraint on maximum cache entries.

The last two points might seem a little strange at first, but I think
they make sense in certain situations. If a cache has the option of
either using soft references or a maximum number of entries, there will
be a certain number of cache misses due to either a limited heap size
(in the case of soft references) or a limited number of maximum entries
allowed (which is most probably related to heap size as well).

With this monitoring information a sysadmin could easily determine if
e.g. an installation would benefit from an increased heap size.

To distinguish between both types of cache misses it is of course
necessary
- to keep a list of keys which are already known to the cache
- to have a mechanism to finally drop keys after a certain while (e.g.
  the key of a deleted page in a content management system should not
  remain in the cache for weeks and months).
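
As a rough illustration of the counters and the known-keys list described above, here is a sketch built on a size-bounded LinkedHashMap. MonitoredCache and its method names are hypothetical, and it takes one shortcut: known keys are only forgotten on explicit invalidation, not after a timeout as the second point asks for.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: a bounded cache that tells "never seen" misses
// apart from misses on keys that were dropped for lack of space.
class MonitoredCache<K, V> {
    private final Map<K, V> entries;
    private final Set<K> knownKeys = new HashSet<>();
    private long hits, invalidations, coldMisses, evictionMisses;

    MonitoredCache(final int maxEntries) {
        // Insertion-ordered map that silently drops its oldest entry when over the limit.
        this.entries = new LinkedHashMap<K, V>(16, 0.75f, false) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries;
            }
        };
    }

    synchronized void put(K key, V value) {
        entries.put(key, value);
        knownKeys.add(key);
    }

    synchronized V get(K key) {
        V value = entries.get(key);
        if (value != null) {
            hits++;
        } else if (knownKeys.contains(key)) {
            evictionMisses++;   // was cached once, dropped because of the size limit
        } else {
            coldMisses++;       // key has never been in the cache
        }
        return value;
    }

    synchronized void invalidate(K key) {
        if (entries.remove(key) != null) invalidations++;
        knownKeys.remove(key);  // shortcut: forget the key right away
    }

    synchronized int size() { return entries.size(); }
    synchronized long getHits() { return hits; }
    synchronized long getInvalidations() { return invalidations; }
    synchronized long getColdMisses() { return coldMisses; }
    synchronized long getEvictionMisses() { return evictionMisses; }
}
```

A steadily climbing eviction-miss count is the signal that a larger entry limit (or more heap) would help; the getters are the kind of values a monitoring interface could expose.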

This implies that the value object needs to consume noticeably more heap
and more computation time on creation than the key; otherwise the cache
would of course not make much sense.

In my current prototype I'm supplying three types of caches. When full,
a cache either

- drops the oldest value objects,
- drops the ones with the longest interval since the last hit, or
- drops the ones with the fewest total hits.
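
The first two strategies can be sketched with java.util.LinkedHashMap alone, since it supports both insertion order and access order. This is only an illustration of the idea, not the prototype's code; BoundedMap is a hypothetical name, and the third strategy (fewest total hits) would need a per-entry hit counter and is not shown.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: one class, two eviction behaviors.
//   accessOrder = false -> drops the oldest inserted entry
//   accessOrder = true  -> drops the entry with the longest time since its last hit
class BoundedMap<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    BoundedMap(int maxEntries, boolean accessOrder) {
        super(16, 0.75f, accessOrder);
        this.maxEntries = maxEntries;
    }

    // Called by LinkedHashMap after every put; returning true removes the eldest entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }
}
```

With a limit of two entries, putting "a" and "b", reading "a", and then putting "c" evicts "a" in the first mode and "b" in the second. Note that LinkedHashMap is not thread-safe, so a real cache would need to synchronize access.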

I can configure which kind of references are used (strong, soft, weak).
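
For the soft-reference case, a minimal sketch might look like the following. SoftValueCache is a hypothetical name and this is not the prototype's code; a weak-reference variant would swap in java.lang.ref.WeakReference, and a strong one would store the value directly.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: values are held through SoftReferences, so the
// garbage collector may reclaim them when memory runs low.
class SoftValueCache<K, V> {
    private final Map<K, SoftReference<V>> refs = new HashMap<>();

    synchronized void put(K key, V value) {
        refs.put(key, new SoftReference<>(value));
    }

    // Returns null both for unknown keys and for values the collector has reclaimed.
    synchronized V get(K key) {
        SoftReference<V> ref = refs.get(key);
        if (ref == null) return null;
        V value = ref.get();
        if (value == null) {
            refs.remove(key);   // clean up the emptied reference
        }
        return value;
    }

    synchronized int size() { return refs.size(); }
}
```

The keys stay strongly referenced here, which only pays off when the values are much larger than the keys.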

I like the idea of a cacheloader. Thought of that as well, but somehow
did not have the drive to implement that in my ejb environment yet
(might be messy if the cacheloader has to deal with FinderExceptions of
EJBs).

What I did not fully get so far is the idea of the cache-regions. I
always thought of putting a cache instance in some central location
(Web Application Context, JNDI Tree, e.g.), but maybe my view is too
J2EE-focused on this. I'm a little bit sceptical about a cache being a
static member as there are some restrictions on that in the EJB spec.

All told, do you think we are trying to achieve something similar, or is
my approach too heavyweight for your needs?

Cheers,

Daniel
