A reply from the J2EE interest list:
Hi!

How do you handle the following topics:

1) Let's say the DAO is involved in a transaction with an isolation level that
means that data should not be visible to clients outside the transaction
until the transaction is committed. One solution could be to let the
cache implement XAResource and use a transaction that spans both the JDBC
calls and the calls to update the cache.

2) To decrease the impact on heap space, the cache could use a map that
extends AbstractMap and uses SoftReferences.
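Point 2 could be sketched roughly as follows: a map whose values are held through SoftReferences, so the garbage collector is free to reclaim cached entries under heap pressure. The class and method names here are illustrative only, not from any framework:

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;

/**
 * Illustrative sketch: a cache map whose values are reachable only
 * through SoftReferences, so the GC may reclaim them when memory is low.
 */
class SoftCacheMap {
    private final Map refs = new HashMap(); // key -> SoftReference(value)

    public synchronized void put(Object key, Object value) {
        refs.put(key, new SoftReference(value));
    }

    // Returns null both on a plain miss and when the value was collected
    public synchronized Object get(Object key) {
        SoftReference ref = (SoftReference) refs.get(key);
        if (ref == null) {
            return null;
        }
        Object value = ref.get();
        if (value == null) {
            refs.remove(key); // drop the cleared reference
        }
        return value;
    }
}
```

A caller treats a collected value exactly like a cache miss and reloads it from the database.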

Thanx

Lars-Fredrik Smedberg
Senior Architect, Mynta Management & IT AB


Antonio Kantek wrote:
A.C.E. Smart Cache: Speeding Up Data Access
 
Intent:

To cache data objects (be they value objects or entity beans) that are frequently read, yet represent mutable data (read-mostly). This problem is not so difficult when only one server (JVM) is in use, but is much more complicated when applied to a cluster of servers. The Active Clustered Expiry Cache can solve this problem.

This pattern is somewhat similar to the Seppuku pattern published by Dimitri Rakitine. I'd consider Seppuku to be a more specific (and very cool) variation of the ACE Cache pattern, involving Read-Only Entity Beans and Weblogic Server.

This pattern is J2EE compliant and vendor neutral (although the Seppuku pattern is a neat one for Weblogic users).

Motivation:

Typically, in a data-driven application, some data is read more frequently than other data. Although most DBMSs will cache queries for such data in memory, enabling fast retrieval, it is often desirable to have something even faster: a cache in memory in the server (app or even web).

This is easy enough in the case of a single node server (no cluster). The application is designed so that reading this data goes through a singleton cache interface, which either returns the cached data or retrieves (and caches) it from the DB. Changing this data goes through the same interface, invalidating the cached data.
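The single-node case above could be sketched like this. The singleton, the DataSource interface, and all names are my own illustration of the idea, not part of the pattern's required interfaces:

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Illustrative single-node read-through cache: reads go through the
 * singleton, which returns cached data or loads (and caches) it;
 * writes go through the same interface and invalidate the entry.
 */
class ReadThroughCache {
    // Stands in for whatever DAO actually loads the data
    public interface DataSource {
        Object load(Object key);
    }

    private static ReadThroughCache instance;
    private final Map cache = new HashMap();
    private final DataSource source;

    private ReadThroughCache(DataSource source) {
        this.source = source;
    }

    public static synchronized ReadThroughCache getInstance(DataSource source) {
        if (instance == null) {
            instance = new ReadThroughCache(source);
        }
        return instance;
    }

    // Return cached data, or retrieve (and cache) it on a miss
    public synchronized Object get(Object key) {
        Object value = cache.get(key);
        if (value == null) {
            value = source.load(key);
            cache.put(key, value);
        }
        return value;
    }

    // Called after the underlying data changes: invalidate the cached copy
    public synchronized void invalidate(Object key) {
        cache.remove(key);
    }
}
```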

This is also easy to accomplish in a cluster, if the data is truly read-only. Then each node will have an instance (singleton) of the cache and will populate it as necessary. No expiration (a term used throughout this text in lieu of "invalidation") is necessary, since the data never changes. In addition, a cache such as this can have a "timeout", so that items will only be stale for a maximum time.
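The "timeout" mentioned above amounts to stamping each entry when it is stored and treating entries older than some maximum age as expired on lookup. A minimal sketch (names are illustrative):

```java
/**
 * Illustrative timeout support: an entry remembers when it was cached,
 * and the cache treats it as expired once it exceeds a maximum age.
 */
class TimedEntry {
    private final Object value;
    private final long storedAt;

    public TimedEntry(Object value) {
        this.value = value;
        this.storedAt = System.currentTimeMillis();
    }

    // True once the entry has been cached longer than maxAgeMillis
    public boolean isExpired(long maxAgeMillis) {
        return System.currentTimeMillis() - storedAt > maxAgeMillis;
    }

    public Object getValue() {
        return value;
    }
}
```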

Where this becomes difficult is when a clustered environment is necessary (high load, failover) and the data is mutable. In a nutshell, changes to data must be reflected in ALL caches, so that stale reads do not occur. Ensuring that all nodes are notified synchronously has performance problems, with both network traffic and contention between notifiers (publishers) and caches being notified (subscribers). However, it is still very desirable to have asynchronous expiration across the cluster, so that all caches will be synced in a "timely" manner. Hence the "smart" cache; it is aware of its peers, and keeps in sync.

This restriction means that the cached data must be considered read-only. Because the caches are expired asynchronously, there is a small interval of time when the data is stale. This is fine for data that is only to be read for output; after all, the request for such data could have come a split second earlier. But if cached data is read, and the application then decides (using a cached read which is (slightly) stale) to change the data, we are violating ACIDity. The goal here is NOT to build an ACID, cluster-wide, in-memory data store; the DB and application server vendors are counted on to provide that kind of functionality.

Applicability:

Use the ACE Cache when

1) Data is "read-mostly"
2) Application server tier is clustered.
3) Data is read by many simultaneous requests
4) Data is not usually changed (at runtime) through other means (e.g. direct SQL by an admin, other kinds of applications)

Participants:

  DataObject: The data object itself
             This could be an entity bean or separate value object.

  DataObjectKey: A key object, satisfying equals() and hashCode(), to uniquely retrieve the DataObject
                 This could be an EJB Primary Key, or just any key class

  Cache: Used to store the data objects, mapping DataObjectKey to DataObjects. Best performance if a singleton, and must be synchronized appropriately.
        This could be backed by a Map, or possibly an application server's entity bean cache (e.g. WL Read-Only beans).

For some stripped down interfaces, see the end of the text. I might expose more implementation code later on, but it uses many of my libraries of utilities, and dragging all of that in here would make this post quite a novel!

Behavior:

The Cache has a reference to the DAO (Data Access Object) logic, whether embedded within an EJB or not. If the cache is queried and the DataObject does not exist, then the DataObject is created (and cached). This means that different instances of the cache (in different processes) will be populated differently (this can be exploited, especially when dealing with user-specific data).

These DataObjects are read-only, so if they are entity beans, they should be read-only beans. If they are value objects, they need to have a flag set so that they cannot be "saved". What is more, since these DataObject instances are shared between all callers of the Cache, the DataObjects need to be *immutable*, either by only implementing a read-only interface, or by throwing exceptions (usually RuntimeExceptions) when mutating methods are called.
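The exception-throwing flavor of immutability could look something like this (the class, its field, and the flag method are hypothetical; a real value object would carry many fields and implement the IValueObject interface shown later):

```java
/**
 * Illustrative value object with an immutability guard: once flagged
 * read-only, every mutator throws a RuntimeException, so shared cached
 * instances cannot be changed by one caller under another's feet.
 */
class CachedValueObject {
    private String name;
    private boolean readOnly;

    public void flagReadOnly() {
        readOnly = true; // set when the instance goes into the shared cache
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        if (readOnly) {
            // shared cached instances must never be mutated
            throw new IllegalStateException("read-only value object");
        }
        this.name = name;
    }
}
```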

The expiration logic is also tied to the Cache object. The Cache subscribes to a JMS Topic (or is referenced by a MessageDrivenBean). Then, when a value object is "saved" or an entity bean's ejbStore() method is called, the Cache is expired for a particular DataObjectKey. The Cache then publishes (to all caches but itself) the DataObjectKey. The listeners (onMessage()) then expire that key (and value) from their cache. So, asynchronously, all Caches across the cluster are brought into sync.
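To show the fan-out shape without dragging in a JMS provider, here is a runnable stand-in: ExpiryTopic plays the role of the JMS Topic, publish() the role of the TopicPublisher, and onExpiryMessage() the role of onMessage(). All names are my own, and in a real deployment the delivery would be asynchronous:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/**
 * Illustrative clustered expiry: when one cache's data is "saved",
 * it expires its own entry, then broadcasts the key so every peer
 * cache expires that entry too.
 */
class ExpiringCache {
    // Stand-in for the JMS Topic: delivers a key to every cache but the sender
    public static class ExpiryTopic {
        private final List subscribers = new ArrayList();

        public void subscribe(ExpiringCache cache) {
            subscribers.add(cache);
        }

        public void publish(ExpiringCache from, Object key) {
            for (Iterator it = subscribers.iterator(); it.hasNext();) {
                ExpiringCache c = (ExpiringCache) it.next();
                if (c != from) {
                    c.onExpiryMessage(key); // plays the role of onMessage()
                }
            }
        }
    }

    private final Map data = new HashMap();
    private final ExpiryTopic topic;

    public ExpiringCache(ExpiryTopic topic) {
        this.topic = topic;
        topic.subscribe(this);
    }

    public synchronized void put(Object key, Object value) {
        data.put(key, value);
    }

    public synchronized Object get(Object key) {
        return data.get(key);
    }

    // Called when the underlying data is saved: expire locally, notify peers
    public void save(Object key) {
        synchronized (this) {
            data.remove(key);
        }
        topic.publish(this, key); // publish outside the lock
    }

    // Peer notification handler: drop the now-stale entry
    public synchronized void onExpiryMessage(Object key) {
        data.remove(key);
    }
}
```

After save(), every cache treats the key as a miss and reloads fresh data from the DB on the next read.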

If using WL read-only entity beans, the link between the Cache and the expiry logic already exists (see Seppuku). If using Value Objects, one way to integrate the expiration logic is to have each Value Object keep a reference to the Cache. Then, when the Value Object is "saved", the (local) Cache is expired, and then the remote Caches are expired asynchronously (and actively).

Consequences:

1. The DB will be queried much less often by read requests.
2. There will be much less object creation in the application servers.
3. There may be small latencies in read-only data propagating across the cluster.
4. The cache may take up significant heap space in the application server.


Implementation Issues Beyond The Scope of the Pattern (I can comment on these separately):

1. Managing cache size (e.g. LRU scheme, different caches for different DataObjects or not)
2. Trading off granular caching with coarse data retrieval (hard or soft references between value objects?). Some REALLY cool stuff here.
3. Proper synchronization of the Cache and the DAO within
4. Strategies/frameworks for making DataObjects immutable when needed, and for integrating this pattern with data access control (permissions)
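For issue 1, one simple LRU scheme uses java.util.LinkedHashMap in access order and overrides removeEldestEntry() to cap the size. The class name and capacity parameter are illustrative (note this requires a JDK that ships LinkedHashMap):

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Illustrative LRU size cap: a LinkedHashMap in access order evicts
 * the least-recently-used entry once the cache exceeds maxEntries.
 */
class LruCacheMap extends LinkedHashMap {
    private final int maxEntries;

    public LruCacheMap(int maxEntries) {
        // access-order iteration, so the eldest entry is the LRU one
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    protected boolean removeEldestEntry(Map.Entry eldest) {
        return size() > maxEntries;
    }
}
```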

Some interfaces:

*******************************************************************

import java.util.*;

/**
 * Cache interface, to hide our many implementations
 */
public interface ICache
    extends java.io.Serializable
{
    // Hook to tell cache what to do if it does not contain requested item
    public Object miss(Object aKey)
        throws CacheException;
    
    public void flush();

    public void expire(Object aKey);
    
    public void hit(Object aKey);
    
    public Object get(Object aKey)
        throws CacheException;
    
    public void add(Object aKey, Object aValue);
    
    public void addAll(Map aMap);
    
    public boolean contains(Object aKey);
}

*******************************************************************

import java.util.*;

/**
 * Extends the Cache interface to provide methods for bulk access, hitting, missing, and expiry.
 */
public interface IBulkCache
    extends ICache
{
    public Map getAll(Collection aColl)
        throws CacheException;

    public Map missAll(Collection aColl)
        throws CacheException;

    public void hitAll(Collection aColl);

    public void expireAll(Collection aColl);
}

*******************************************************************
/** Just a tag interface; implementations might expose ids, perhaps.
 * equals(), hashCode(), and compareTo() implementations are important
 */
public interface IValueObjectKey
    extends java.io.Serializable,
            Cloneable,
            Comparable
{
}

*******************************************************************

//this one has dependencies on lots of other stuff, so don't try to compile it

public interface IValueObject
    extends java.io.Serializable,
            Cloneable
{
    public void flagReadOnly();

    public boolean isReadOnly();

    public void flagImmutable()
        throws ValueObjectException;

    public boolean isImmutable();

    public boolean isImmutableCapable();

    public boolean isCacheable();

    public boolean isValueObjectCloneable();

    public boolean isSanitized()
           throws SanityCheckException;

    public void sanitize()
        throws SanityCheckException;

    public IValueObjectKey getValueObjectKey();

    public Object clone();

    public IValueObject cloneDeep()
        throws CloneNotSupportedException;

    public void save()
        throws SaveException;

    //this other part deals with the Value Object graph, important for getting expiry right

    public boolean isValueObjectReferencesEnabled();

    public Collection getValueObjectReferences()
        throws ValueObjectReferencesNotEnabledException; //value object references and value object reference lists

    public Collection getReferencedValueObjects()
        throws ValueObjectReferencesNotEnabledException; //flat collection of the value objects

    public Collection getValueObjectsDeep()
        throws ValueObjectReferencesNotEnabledException; // all contained value objects, including this one

    public Map getValueObjectMapDeep()
        throws ValueObjectReferencesNotEnabledException; // all contained value objects, including this one
}

