Sanne,
Agreed. Could everyone involved please post to both hibernate-dev and
infinispan-dev (rather than jbosscache-dev) when discussing anything
to do with this integration work, as there are parallel efforts which
can be brought together.
Cheers
Manik
On 25 May 2009, at 10:53, Sanne Grinovero wrote:
Hello,
I'm forwarding this email to Emmanuel and Hibernate Search dev, as I
believe we should join the discussion.
Could we keep both dev lists (jbosscache-...@lists.jboss.org,
hibernate-dev@lists.jboss.org) on CC?
Sanne
2009/4/29 Manik Surtani <ma...@jboss.org>:
On 27 Apr 2009, at 05:18, Andrew Duckworth wrote:
Hello,
I have been working on a Lucene Directory provider based on JBoss
Cache. My starting point was an implementation Manik had already
written, which pretty much worked with a few minor tweaks. Our use
case was to cluster a Lucene index being used with Hibernate Search in
our application, with the requirements that searching needed to be
fast, that there was no shared file system, and that the index became
consistent across the cluster in a relatively short time frame.
Manik's code used a token node in the cache to implement the
distributed lock. During my testing I set up multiple cache copies
with multiple threads reading from and writing to each copy. I found
that a lot of the transactions to acquire or release this lock were
timing out. Not understanding JBC well, I modified the distributed
lock to use the JGroups DistributedLockManager. This worked quite
well; however, the time taken to acquire/release the lock (~100 ms for
both) dwarfed the time to process the index update, lowering
throughput. Even using Hibernate Search with an async worker thread,
there was still a lot of contention for the single lock, which seemed
to limit the scalability of the solution. I think part of the problem
was that our use of Hibernate Search generates a lot of small units of
work (remove index entry, add index entry), and each of these units of
work acquires a new IndexWriter and a new write lock on the underlying
Lucene Directory implementation.
Out of curiosity, I created an alternative implementation based on
the Hibernate Search JMS clustering strategy. Inside JBoss Cache I
created a queue node, and each slave node in the cluster creates a
separate queue underneath it where indexing work is written:
/queue/slave1/[work0, work1, work2 ....]
      /slave2
      /slave3
      etc
In each cluster member a background thread runs continuously. When it
wakes up, it decides whether it is the master node or not (currently
it checks whether it is the view coordinator, but I'm considering
changing it to use a longer-lived distributed lock). If it is the
master, it merges the tasks from each slave queue and updates the
JBCDirectory in one go; it can safely do this with only local VM
locking. This approach means that the slave nodes can write to their
queues without needing a global lock that any other slave or the
master would be using. The master, in turn, can perform multiple
updates in the context of a single Lucene index writer. With a cache
loader configured, work that is written into a slave queue is
persistent, so it can survive the master node crashing, with automatic
failover to a new master, meaning that eventually all updates should
be applied to the index. Each work element in the queue is time
stamped so that the master can process them in order (this requires
time synchronisation across the cluster). For our workload the
master/slave pattern seems to improve the throughput of the system.
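
The merge step described above can be sketched roughly as follows.
This is a minimal illustration only, not the code from this thread:
plain Java collections stand in for the /queue/slaveN cache nodes, and
the Work class, its fields and the tie-break on slave name are all
assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

/** Illustrative only: plain collections stand in for the /queue/slaveN nodes. */
final class QueueMergeSketch {

    /** Hypothetical work item; the fields are assumptions, not the original classes. */
    static final class Work {
        final String slave;          // queue the item was written to
        final long timestampMillis;  // creation time on the writing node
        final String description;    // e.g. "add entry 42"

        Work(String slave, long timestampMillis, String description) {
            this.slave = slave;
            this.timestampMillis = timestampMillis;
            this.description = description;
        }
    }

    /**
     * Merges all slave queues into one list ordered by timestamp.
     * Ties are broken by slave name so the result is deterministic.
     */
    static List<Work> mergeByTimestamp(Map<String, List<Work>> queuesBySlave) {
        List<Work> merged = new ArrayList<>();
        for (List<Work> queue : queuesBySlave.values()) {
            merged.addAll(queue);
        }
        merged.sort(Comparator.comparingLong((Work w) -> w.timestampMillis)
                .thenComparing(w -> w.slave));
        return merged;
    }
}
```

The master would then apply the merged list inside a single index
writer session. Ordering by timestamp is only as good as the clock
synchronisation between the nodes.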
Currently I'm refining the code, and I have a few JBoss Cache
questions which I hope you can help me with:
1) I have noticed that under high load I get LockTimeoutExceptions
writing to /queue/slave0 when the lock owner is a transaction working
on /queue/slave1, i.e. the same lock seems to be used for two
unrelated nodes in the cache. I'm assuming this is a result of the
lock striping algorithm; if you could give me some insight into how
this works, that would be very helpful. Bumping up the cache
concurrency level from 500 to 2000 seemed to reduce this problem;
however, I'm not sure whether it just reduces the probability of a
random event or whether there is some level that will be sufficient
to eliminate the issue.
It could well be the lock striping at work. As of JBoss Cache 3.1.0
you can disable lock striping and have one lock per node. This is
expensive in that a lot of nodes means a lot of locks, but if you
have a finite number of nodes it may help you a lot.
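
To illustrate why two unrelated nodes can end up behind the same lock,
here is a minimal sketch of generic lock striping. It is not the JBoss
Cache implementation: the StripedLocks class, the plain hashCode-based
mapping and the method names are assumptions made for the example.

```java
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

/** Generic lock striping sketch; NOT the JBoss Cache implementation. */
final class StripedLocks {
    private final ReentrantLock[] stripes;

    StripedLocks(int concurrencyLevel) {
        if (concurrencyLevel < 1) {
            throw new IllegalArgumentException("concurrencyLevel must be >= 1");
        }
        stripes = new ReentrantLock[concurrencyLevel];
        for (int i = 0; i < stripes.length; i++) {
            stripes[i] = new ReentrantLock();
        }
    }

    /** Index of the stripe guarding the given node path (assumed mapping). */
    int stripeIndex(String nodePath) {
        return Math.floorMod(nodePath.hashCode(), stripes.length);
    }

    ReentrantLock lockFor(String nodePath) {
        return stripes[stripeIndex(nodePath)];
    }

    /** True if both paths are guarded by the same lock instance. */
    boolean sharesLock(String a, String b) {
        return lockFor(a) == lockFor(b);
    }

    /** Counts pairs of paths in the list that map to the same stripe. */
    int collidingPairs(List<String> nodePaths) {
        int count = 0;
        for (int i = 0; i < nodePaths.size(); i++) {
            for (int j = i + 1; j < nodePaths.size(); j++) {
                if (sharesLock(nodePaths.get(i), nodePaths.get(j))) {
                    count++;
                }
            }
        }
        return count;
    }
}
```

With any scheme of this shape, once there are more nodes than stripes
at least two nodes must share a lock. Raising the number of stripes
lowers the chance of a collision but does not by itself rule one out,
which is why one lock per node is the only setting that removes the
effect entirely.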
2) Is there a reason to use separate nodes for each slave queue? Will
it help with locking, or can each slave safely insert into the same
parent node in separate transactions without interfering with or
blocking the others? If I can reduce it to a single queue, I think
that would be a more elegant solution. I am setting
lockParentForChildInsertRemove to false for the queue nodes.
It depends. Are the work objects attributes in /queue/slaveN?
Remember that the granularity for all locks is the node itself, so if
all slaves write to a single node, they will all compete for the same
lock.
3) Similarly, is there any reason why the master should or shouldn't
take responsibility for removing work nodes that have been processed?
Not quite sure I understand your design - so this distributes the
work objects, and each cluster member maintains indexes locally? If
so, you need to know when all members have processed the work objects
before removing them.
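
If each member did maintain its own local index, the bookkeeping
mentioned above could look something like the following minimal
sketch. It only applies under that assumption, and the AckTracker
class, its method names and the string identifiers are hypothetical.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper: tracks which members have applied each work item. */
final class AckTracker {
    private final Set<String> members;
    private final Map<String, Set<String>> acksByWorkId = new HashMap<>();

    AckTracker(Set<String> members) {
        this.members = new HashSet<>(members);
    }

    /**
     * Records that the member has applied the work item.
     * Returns true once every known member has acknowledged it, at which
     * point the caller may remove the work item from the queue.
     * A late duplicate acknowledgement after completion would start a new
     * entry; a real implementation would need to guard against that.
     */
    boolean acknowledge(String workId, String member) {
        if (!members.contains(member)) {
            throw new IllegalArgumentException("unknown member: " + member);
        }
        Set<String> acks = acksByWorkId.computeIfAbsent(workId, id -> new HashSet<>());
        acks.add(member);
        if (acks.containsAll(members)) {
            acksByWorkId.remove(workId);
            return true;
        }
        return false;
    }

    /** Number of work items still waiting for at least one acknowledgement. */
    int pendingCount() {
        return acksByWorkId.size();
    }
}
```

The sketch ignores membership changes: if a node leaves the cluster,
its outstanding acknowledgements would have to be dropped from the
member set before any work item could complete.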
Thanks in advance for your help. I hope to make this solution general
purpose enough to be able to contribute it back to the Hibernate
Search and JBC teams.
Thanks for offering to contribute. :-) One other thing that may be of
interest is that I just launched Infinispan [1] [2] - a new data grid
product. You could implement a directory provider on Infinispan too -
it is a lot more efficient than JBC at many things, including
concurrency. Also, Infinispan's lock granularity is per key/value
pair, so a single distributed cache would be all you need for work
objects. Another thing that could help is the eager locking we have
on the roadmap [3], which may make a more traditional approach of
locking + writing indexes to the cache more feasible. I'd encourage
you to check it out.
[1] http://www.infinispan.org
[2] http://infinispan.blogspot.com/2009/04/infinispan-start-of-new-era-in-open.html
[3] https://jira.jboss.org/jira/browse/ISPN-48
--
Manik Surtani
ma...@jboss.org
Lead, Infinispan
Lead, JBoss Cache
http://www.infinispan.org
http://www.jbosscache.org
_______________________________________________
jbosscache-dev mailing list
jbosscache-...@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/jbosscache-dev
_______________________________________________
hibernate-dev mailing list
hibernate-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/hibernate-dev