Not sure if I've done exactly what you had in mind... here is my updated XML: https://www.refheap.com/paste/12601

I added the loader to the lucene-index namedCache, which is the one I'm
using for distribution. This didn't appear to change anything, as far as
I can see - still seeing a lot of disk I/O with every request.
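
To save a click, the shape of that change is roughly the following - a
simplified sketch rather than the literal contents of the paste, with the
class names, element names and the example index location written from
memory against the 5.x configuration schema, so treat them as approximate:

<namedCache name="lucene-index">
   <clustering mode="distribution">
      <hash numOwners="1"/>
   </clustering>
   <loaders>
      <!-- newly added, per Ray's suggestion: lets a node ask the other
           cluster members for keys it doesn't hold in local memory -->
      <loader class="org.infinispan.loaders.cluster.ClusterCacheLoader"/>
      <!-- existing loader that reads the on-disk Lucene index -->
      <loader class="org.infinispan.lucene.cachestore.LuceneCacheLoader">
         <properties>
            <property name="location" value="/path/to/local/index"/>
         </properties>
      </loader>
   </loaders>
</namedCache>

The intent, as I understood the suggestion, is that a node consults the
ClusterCacheLoader (i.e. the other members) rather than going straight to
its own on-disk index for keys it doesn't own.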
James.

On 15 March 2013 15:54, Ray Tsang <saturn...@gmail.com> wrote:
> Can you try adding a ClusterCacheLoader to see if that helps?
>
> Thanks,
>
> On Fri, Mar 15, 2013 at 8:49 AM, James Aley <james.a...@swiftkey.net> wrote:
>>
>> Apologies - forgot to copy the list.
>>
>> On 15 March 2013 15:48, James Aley <james.a...@swiftkey.net> wrote:
>> > Hey Adrian,
>> >
>> > Thanks for the response. I was chatting to Sanne on IRC yesterday, and
>> > he suggested this to me. Actually, the logging I attached was from a
>> > cluster of 4 servers with numOwners=2. Sorry, I should have mentioned
>> > that, but seeing as it didn't appear to make any difference I thought
>> > I'd just keep things simple in my previous email.
>> >
>> > While it seemed not to make a difference in this case, I can see why
>> > that would make sense. In future tests I guess I should probably stick
>> > with numOwners > 1.
>> >
>> > James.
>> >
>> > On 15 March 2013 15:44, Adrian Nistor <anis...@redhat.com> wrote:
>> >> Hi James,
>> >>
>> >> I'm not an expert on InfinispanDirectory, but I've noticed in [1] that
>> >> the lucene-index cache is distributed with numOwners = 1. That means
>> >> each cache entry is owned by just one cluster node and there's nowhere
>> >> else to go in the cluster if the key is not available in local memory,
>> >> thus it needs fetching from the cache store. This can be solved with
>> >> numOwners > 1. Please let me know if this solves your problem.
>> >>
>> >> Cheers!
>> >>
>> >> On 03/15/2013 05:03 PM, James Aley wrote:
>> >>>
>> >>> Hey all,
>> >>>
>> >>> <OT>
>> >>> Seeing as this is my first post, I wanted to just quickly thank you
>> >>> all for Infinispan. So far I'm really enjoying working with it -
>> >>> great product!
>> >>> </OT>
>> >>>
>> >>> I'm using the InfinispanDirectory for a Lucene project at the moment.
>> >>> We use Lucene directly to build a search product, which has high read
>> >>> requirements and likely very large indexes. I'm hoping to make use of
>> >>> a distribution-mode cache to keep the whole index in memory across a
>> >>> cluster of machines (the index will be too big for one server).
>> >>>
>> >>> The problem I'm having is that after loading a filesystem-based
>> >>> Lucene directory into InfinispanDirectory via the LuceneCacheLoader,
>> >>> no nodes are retrieving data from the cluster - they instead look up
>> >>> keys in their local CacheLoaders, which involves lots of disk I/O and
>> >>> is very slow. I was hoping to use the CacheLoader just to initialize
>> >>> the caches, but from there on to read only from RAM (and the network,
>> >>> of course). Is this supported? Maybe I've misunderstood the purpose
>> >>> of the CacheLoader?
>> >>>
>> >>> To explain my observations in a little more detail:
>> >>> * I start a cluster of two servers, using [1] as the cache config.
>> >>>   Both have a local copy of the Lucene index that will be loaded into
>> >>>   the InfinispanDirectory via the loader. This is a test
>> >>>   configuration, where I've set numOwners=1 so that I only need two
>> >>>   servers for distribution to happen.
>> >>> * Upon startup, things look good. I see the memory usage of the JVM
>> >>>   reflect a pretty near 50/50 split of the data across both servers.
>> >>>   Logging indicates both servers are in the cluster view; all seems
>> >>>   fine.
>> >>> * When I send a search query to either one of the nodes, I notice the
>> >>>   following:
>> >>>   - iotop shows huge (~100MB/s) disk I/O on that node alone, from the
>> >>>     JVM process.
>> >>>   - no change in network activity between nodes (~300b/s, same as
>> >>>     when idle).
>> >>>   - memory usage on the node running the query increases
>> >>>     dramatically, and stays higher even after the query is finished.
>> >>>
>> >>> So it seemed to me like each node was favouring use of the
>> >>> CacheLoader to retrieve keys that are not in memory, instead of using
>> >>> the cluster. Does that seem reasonable? Is this the expected
>> >>> behaviour?
>> >>>
>> >>> I started to investigate this by turning on trace logging, and this
>> >>> made me think that perhaps the cause was the CacheLoader's
>> >>> interceptor being higher priority in the chain than the distribution
>> >>> interceptor? I'm not at all familiar with the design in any level of
>> >>> detail - just what I picked up in the last 24 hours from browsing the
>> >>> code, so I could easily be way off. I've attached the log snippets I
>> >>> thought relevant in [2].
>> >>>
>> >>> Any advice offered much appreciated.
>> >>> Thanks!
>> >>>
>> >>> James.
>> >>>
>> >>> [1] https://www.refheap.com/paste/12531
>> >>> [2] https://www.refheap.com/paste/12543

_______________________________________________
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev