On 9 Jul 2009, at 14:57, Emmanuel Bernard wrote:

Here are the concall notes on how to cluster and copy Hibernate indexes using non-file-system approaches.

Forget JBoss Cache, forget plain JGroups, and focus on Infinispan.
Start with Infinispan in replication mode (the most stable code) and then try distribution. It should be interesting to test the dist algorithm and see how well the L1 cache behaves in a search environment. For the architecture, we will try the following approaches in decreasing order of interest (if the first one works like a charm, we stick with it):
1. Share the same grid cache between the master and the slaves.
2. Have a local cache on the master where indexing is done, and manually copy the chunks of changed data over to the grid. This requires storing some metadata (namely the list of chunks for a given index and the last update for each chunk) to implement the same algorithm as the one implemented in FSMaster/SlaveDirectoryProvider (incremental copy).
3. Have a local cache on the master where indexing is done, and manually copy the chunks of changed data over to the grid. Each slave copies from the grid to a local version of the index and uses the local version for search.
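As a rough illustration of the incremental copy mentioned above, here is a minimal sketch. It is not the FSMaster/SlaveDirectoryProvider code: plain java.util.Map instances stand in for the local cache and the grid, and the names Chunk, IncrementalCopy and sync are hypothetical. It assumes each chunk carries its own last-update timestamp.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical chunk value: the payload plus the time it was last written. */
final class Chunk {
    final byte[] data;
    final long lastUpdate;

    Chunk(byte[] data, long lastUpdate) {
        this.data = data;
        this.lastUpdate = lastUpdate;
    }
}

/** Hypothetical helper; the maps are stand-ins for the local cache and the grid. */
final class IncrementalCopy {
    private IncrementalCopy() {}

    /**
     * Copies every chunk that is missing from the target or older there,
     * then removes chunks the source no longer lists.
     * Returns the number of chunks written to the target.
     */
    static int sync(Map<String, Chunk> source, Map<String, Chunk> target) {
        int copied = 0;
        for (Map.Entry<String, Chunk> entry : source.entrySet()) {
            Chunk existing = target.get(entry.getKey());
            if (existing == null || existing.lastUpdate < entry.getValue().lastUpdate) {
                target.put(entry.getKey(), entry.getValue());
                copied++;
            }
        }
        // Chunks deleted from the source (e.g. merged segments) must go too.
        target.keySet().retainAll(source.keySet());
        return copied;
    }
}
```

The same routine could serve both directions in approach 3: master to grid, then grid to each slave's local copy.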

When writing the InfinispanDirectory (inspired by the RAMDirectory and the JBossCacheDirectory), one needs to consider that Infinispan has a flat structure. The key has to contain:
- the index name
- the chunk name
Together they will essentially be the unique identifier.
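A composite key along these lines could look like the sketch below. The class name ChunkKey and its fields are hypothetical, not an existing Infinispan or Hibernate Search type. It is made Serializable on the assumption that keys have to travel across the cluster.

```java
import java.io.Serializable;
import java.util.Objects;

/** Hypothetical flat-cache key: index name plus chunk name. */
final class ChunkKey implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String indexName;
    private final String chunkName;

    ChunkKey(String indexName, String chunkName) {
        this.indexName = Objects.requireNonNull(indexName, "indexName");
        this.chunkName = Objects.requireNonNull(chunkName, "chunkName");
    }

    String getIndexName() { return indexName; }

    String getChunkName() { return chunkName; }

    // equals/hashCode use both parts, so the pair acts as the unique identifier.
    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof ChunkKey)) return false;
        ChunkKey that = (ChunkKey) other;
        return indexName.equals(that.indexName) && chunkName.equals(that.chunkName);
    }

    @Override
    public int hashCode() {
        return Objects.hash(indexName, chunkName);
    }

    @Override
    public String toString() {
        return indexName + "/" + chunkName;
    }
}
```

Two keys built from the same index and chunk names are equal, so either node can look up a chunk without sharing object identity.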
Each chunk should have its size limited (Lucene does that already AFAIK).

Question on the metadata: one needs to keep the last update and the list of chunks. Because Infinispan is not queryable, we need to store that as metadata. Should it be:
- on each chunk (i.e. the last update time and the size stored on each chunk)
- on a dedicated metadata chunk, i.e. one metadata chunk per chunk plus a chunk containing the list
- on a single metadata chunk (I fear conflicts and inconsistencies)
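To make the second option more concrete, here is a minimal sketch of one metadata entry per chunk plus one entry holding the chunk list, all in a single flat map. A ConcurrentHashMap stands in for the Infinispan cache, and FlatIndexStore, ChunkMetadata and the key layout are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical per-chunk metadata entry. */
final class ChunkMetadata {
    final long lastUpdate;
    final long size;

    ChunkMetadata(long lastUpdate, long size) {
        this.lastUpdate = lastUpdate;
        this.size = size;
    }
}

/** Hypothetical store; the flat map is a stand-in for the grid cache. */
final class FlatIndexStore {
    private final Map<String, Object> cache = new ConcurrentHashMap<>();

    private static String chunkKey(String index, String chunk) { return index + "|chunk|" + chunk; }
    private static String metaKey(String index, String chunk) { return index + "|meta|" + chunk; }
    private static String listKey(String index) { return index + "|list"; }

    /**
     * Writes the chunk, its metadata entry, and the chunk list entry.
     * synchronized only protects this single-JVM stand-in; on a real grid
     * the three writes would need a transaction or lock to stay consistent.
     */
    synchronized void writeChunk(String index, String chunk, byte[] data, long now) {
        cache.put(chunkKey(index, chunk), data.clone());
        cache.put(metaKey(index, chunk), new ChunkMetadata(now, data.length));
        List<String> names = listChunks(index);
        if (!names.contains(chunk)) {
            names.add(chunk);
            cache.put(listKey(index), names.toArray(new String[0]));
        }
    }

    /** Returns a copy of the chunk names recorded for the index. */
    List<String> listChunks(String index) {
        Object value = cache.get(listKey(index));
        if (value == null) {
            return new ArrayList<String>();
        }
        return new ArrayList<String>(Arrays.asList((String[]) value));
    }

    /** Returns the metadata for a chunk, or null if the chunk is unknown. */
    ChunkMetadata metadata(String index, String chunk) {
        return (ChunkMetadata) cache.get(metaKey(index, chunk));
    }
}
```

With this layout, finding changed chunks means reading the list entry and then one small metadata entry per chunk, without touching the chunk payloads. The list entry is still a single point of contention, which is the same concern raised for the single-metadata-chunk option.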

On changes or reads, explore the use of Infinispan transactions to ensure RR (repeatable read) semantics. Is it necessary? A file system does not guarantee that anyway.

In the case of replication, make sure a FD back end can be activated in case the grid goes to the unreachable clouds of total inactivity.

FD backend? I presume you mean a cache store. Have a look at the different cache stores we ship with, I reckon a FileCacheStore would do the trick for you.

http://infinispan.sourceforge.net/4.0/apidocs/org/infinispan/loaders/CacheStore.html
http://infinispan.sourceforge.net/4.0/apidocs/org/infinispan/loaders/file/FileCacheStore.html

Question to Manik: do you have a cluster to play with once we reach this stage?

The cluster team does have a set of lab servers used to test, benchmark, etc. You will need to "book" time on this cluster though since it is shared between JBC/Infinispan, JGroups and JBoss AS clustering devs.

Cheers
--
Manik Surtani
ma...@jboss.org
Lead, Infinispan
Lead, JBoss Cache
http://www.infinispan.org
http://www.jbosscache.org




_______________________________________________
hibernate-dev mailing list
hibernate-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/hibernate-dev
