Hello JBossCache gurus

I've been toying with the idea of using in-memory replication across a cluster 
to preserve transaction log information for JBossTS, as an alternative to 
writing it to disk. What follows is my current thinking on the matter, which 
you can poke holes in at your leisure.

'Memory is the new disk', so let's use it as such...

Transaction log entries are typically short lived (just the interval between 
the prepare and commit phases if all goes well) but must survive a node 
failure. Or, depending on the degree of user paranoia, perhaps multiple node 
failures. The size is not much - a few hundred bytes per transaction. Writing 
the tx log to RAID is a major bottleneck for transaction systems and hence app 

JBossTS already has a pluggable interface for transaction log ('ObjectStore') 
implementations, so writing one based on JBossCache is not too difficult. The 
relative performance of this approach compared to the existing file system or 
database stores remains to be seen. Of course it largely depends on the disk 
and network hardware and utilization. I should be able to get some preliminary 
numbers without too much work, but first I need to decide what configurations 
to test...

Clearly the number of replicas is critical - it must be high enough to ensure 
at least one node will survive any outage, but low enough to perform well.

Writes must be synchronous for obvious reasons, but ideally a node that is up 
should not halt just because another member of the cluster is down. That model 
would preserve information but reduce availability, which is undesirable.

So my first question is: does the cache support a mode somewhere between async 
and sync, say 'return when at least M of N nodes have acked' ?  I can get 
something similar with buddy replication, but it's not quite the model I want - 
if more than M nodes are available they should be used. Similarly the crash of 
one buddy should not halt the system if there is an additional node available 
such that the total live number remains more than M.  Perhaps I can do this 
only with the raw JGroups API, not the cache?
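
To make the 'at least M of N' idea concrete, here is a minimal sketch in plain 
Java. It is not JBossCache or JGroups code: Replica and QuorumWriter are 
hypothetical names invented for illustration, and the transport behind 
Replica.store() is assumed rather than shown. The write goes to every known 
replica, and the caller unblocks once minAcks of them have acknowledged or the 
timeout expires.

```java
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical replica endpoint; stands in for whatever transport is used. */
interface Replica {
    /** Stores the entry remotely; returns true once the replica has acked. */
    boolean store(String txId, byte[] entry) throws Exception;
}

/** Sends each write to all replicas and returns once minAcks have acked. */
final class QuorumWriter {
    private final List<Replica> replicas;
    private final int minAcks;
    // Daemon threads so that a hung replica cannot keep the JVM alive.
    private final ExecutorService pool = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r, "quorum-writer");
        t.setDaemon(true);
        return t;
    });

    QuorumWriter(List<Replica> replicas, int minAcks) {
        this.replicas = replicas;
        this.minAcks = minAcks;
    }

    /** Returns true if at least minAcks replicas acked within the timeout. */
    boolean write(String txId, byte[] entry, long timeoutMillis) {
        CountDownLatch acks = new CountDownLatch(minAcks);
        for (Replica replica : replicas) {
            pool.execute(() -> {
                try {
                    if (replica.store(txId, entry)) {
                        acks.countDown();
                    }
                } catch (Exception e) {
                    // A dead or failing replica simply contributes no ack.
                }
            });
        }
        try {
            return acks.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

With three replicas and minAcks = 2, one crashed node does not block the 
caller, and any extra live nodes still receive the entry. A limitation of the 
sketch: if too few replicas are alive, the caller waits for the full timeout 
before learning that the quorum was not reached.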

Also, are there any numbers on the performance as a function of group size, 
particularly when mixing nodes on the same or different network segments? I'm 
thinking that to get independent failure characteristics for the nodes will 
probably require a distributed cluster, such that the nodes are on different 
power supplies etc. Having all the nodes in the same rack probably provides a 
false sense of security...

On a similar note, whilst cache puts must be synchronous, my design can 
tolerate asynchronous removes. Is such a hybrid configuration possible?

Transaction log entries fall into two groups: the ones for transactions that 
complete cleanly and the ones for transactions that go wrong. The former set is 
much larger and its members have a lifetime of at most a few seconds. The 
failure set is much smaller (hopefully empty!) but entries may persist 

I'm thinking of setting up the cache loaders such that the eviction time is 
longer than the expected lifetime of members of the first group. What I want 
to achieve is this:

Synchronous write of an entry to at least N in-memory replicas.

If the transaction works, removal, possibly asynchronous, of that information 
from the cluster.

If the transaction fails, writing of the entry to disk for longer term storage.

Critically this is not the same as having all writes go through to disk. Is it 
possible to configure the cache loaders to write only on eviction?

Or I guess there is another possibility: since the loader's writes are 
asynchronous with respect to cache puts, is it possible to have it try to write 
everything, but intelligently remove queued writes from its work list if the 
corresponding node is removed before the write for its 'put' is made? That 
would effectively cause the disk to operate at its max throughput without 
(subject to the size limit of the work q) throttling the in-memory replication. 
It thus provides an extra layer of assurance compared to in-memory only copies 
but without the performance hit of synchronous disk writes.
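
The cancellable work list described above could look roughly like this. It is 
a sketch in plain Java, not an existing cache loader: DiskStore and 
CancellableWriteBehind are hypothetical names, and a real version would need a 
background thread calling flushOne() plus a bound on the queue size.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sink for durable writes, e.g. a file or database store. */
interface DiskStore {
    void write(String txId, byte[] entry);
}

/** Queues disk writes behind cache puts and drops them if removed in time. */
final class CancellableWriteBehind {
    // Insertion-ordered, so the oldest pending entry is flushed first.
    private final Map<String, byte[]> pending = new LinkedHashMap<>();
    private final DiskStore disk;

    CancellableWriteBehind(DiskStore disk) {
        this.disk = disk;
    }

    /** Called on cache put: queue the entry for a later disk write. */
    synchronized void onPut(String txId, byte[] entry) {
        pending.put(txId, entry);
    }

    /** Called on cache remove: true if a queued write was cancelled. */
    synchronized boolean onRemove(String txId) {
        return pending.remove(txId) != null;
    }

    /** Writes the oldest pending entry; false if nothing was queued. */
    boolean flushOne() {
        String txId;
        byte[] entry;
        synchronized (this) {
            Iterator<Map.Entry<String, byte[]>> it = pending.entrySet().iterator();
            if (!it.hasNext()) {
                return false;
            }
            Map.Entry<String, byte[]> oldest = it.next();
            txId = oldest.getKey();
            entry = oldest.getValue();
            it.remove();
        }
        // The disk write happens outside the lock so puts are never blocked by I/O.
        disk.write(txId, entry);
        return true;
    }
}
```

Entries for transactions that commit quickly are removed from the work list 
before the writer reaches them, so only the stragglers reach the disk. The 
sketch leaves out one case: when onRemove() returns false the entry may already 
be on disk, and the disk copy would then need deleting too.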

Also, it is vital to ensure there is no circular dependency between the cache 
and the transaction manager. I'm assuming this can be achieved simply by 
ensuring there is no transaction context on the thread at the time the cache 
API is called. Or does it use JTA transactions anywhere internally?

One final question: Am I totally mad, or only mildly demented?


