Hi James,

By specifying the LuceneCacheLoader as a loader for the default cache, it will be added to both the "lucene-index" cache (where it is needed) and the other two caches (lucene-metadata and lucene-locks), where I don't think it is needed. I think it should only be configured for the "lucene-index" cache and removed from the default config.

On top of that, you might want to add the ClusterCacheLoader *before* the LuceneCacheLoader, otherwise the LuceneCacheLoader will always be the one queried first. The config I have in mind is [1] - would you mind giving it a try?
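Roughly, the relevant part would look like the snippet below. Please treat it as a sketch rather than a drop-in config: the loader class names and the "location" property are from my memory of the 5.2 configuration and the index path is just a placeholder, so double-check them against [1] and the schema for your version. Loaders are consulted in the order they are declared, which is why the ClusterCacheLoader goes first:

<namedCache name="lucene-index">
   <!-- keep the existing <clustering>/<hash> settings for this cache as they are -->
   <loaders shared="false">
      <!-- consulted first: fetches entries that are already in memory on another node -->
      <loader class="org.infinispan.loaders.cluster.ClusterCacheLoader"/>
      <!-- only hit if the cluster doesn't have the entry: reads the index from local disk -->
      <loader class="org.infinispan.lucene.cachestore.LuceneCacheLoader">
         <properties>
            <!-- placeholder path: point it at the directory holding the existing Lucene index -->
            <property name="location" value="/path/to/local/index"/>
         </properties>
      </loader>
   </loaders>
</namedCache>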
[1] https://gist.github.com/mmarkus/5195400

On 15 Mar 2013, at 16:22, James Aley wrote:

> Not sure if I've done exactly what you had in mind... here is my updated XML:
> https://www.refheap.com/paste/12601
>
> I added the loader to the lucene-index namedCache, which is the one
> I'm using for distribution.
>
> This didn't appear to change anything, as far as I can see. Still
> seeing a lot of disk IO with every request.
>
> James.
>
> On 15 March 2013 15:54, Ray Tsang <saturn...@gmail.com> wrote:
>> Can you try adding a ClusterCacheLoader to see if that helps?
>>
>> Thanks,
>>
>> On Fri, Mar 15, 2013 at 8:49 AM, James Aley <james.a...@swiftkey.net> wrote:
>>>
>>> Apologies - forgot to copy list.
>>>
>>> On 15 March 2013 15:48, James Aley <james.a...@swiftkey.net> wrote:
>>>> Hey Adrian,
>>>>
>>>> Thanks for the response. I was chatting to Sanne on IRC yesterday, and
>>>> he suggested this to me. Actually the logging I attached was from a
>>>> cluster of 4 servers with numOwners=2. Sorry, I should have mentioned
>>>> this, but seeing as it didn't appear to make any difference I thought
>>>> I'd just keep things simple in my previous email.
>>>>
>>>> While it seemed not to make a difference in this case, I can see why
>>>> that would make sense. In future tests I guess I should probably stick
>>>> with numOwners > 1.
>>>>
>>>> James.
>>>>
>>>> On 15 March 2013 15:44, Adrian Nistor <anis...@redhat.com> wrote:
>>>>> Hi James,
>>>>>
>>>>> I'm not an expert on InfinispanDirectory, but I've noticed in [1] that
>>>>> the lucene-index cache is distributed with numOwners = 1. That means
>>>>> each cache entry is owned by just one cluster node and there's nowhere
>>>>> else to go in the cluster if the key is not available in local memory,
>>>>> thus it needs fetching from the cache store. This can be solved with
>>>>> numOwners > 1. Please let me know if this solves your problem.
>>>>>
>>>>> Cheers!
>>>>>
>>>>> On 03/15/2013 05:03 PM, James Aley wrote:
>>>>>>
>>>>>> Hey all,
>>>>>>
>>>>>> <OT>
>>>>>> Seeing as this is my first post, I wanted to just quickly thank you
>>>>>> all for Infinispan. So far I'm really enjoying working with it - great
>>>>>> product!
>>>>>> </OT>
>>>>>>
>>>>>> I'm using the InfinispanDirectory for a Lucene project at the moment.
>>>>>> We use Lucene directly to build a search product, which has high read
>>>>>> requirements and likely very large indexes. I'm hoping to make use of
>>>>>> a distribution mode cache to keep the whole index in memory across a
>>>>>> cluster of machines (the index will be too big for one server).
>>>>>>
>>>>>> The problem I'm having is that after loading a filesystem-based Lucene
>>>>>> directory into InfinispanDirectory via LuceneCacheLoader, no nodes are
>>>>>> retrieving data from the cluster - they instead look up keys in their
>>>>>> local CacheLoaders, which involves lots of disk I/O and is very slow.
>>>>>> I was hoping to just use the CacheLoader to initialize the caches, but
>>>>>> from there on read only from RAM (and network, of course). Is this
>>>>>> supported? Maybe I've misunderstood the purpose of the CacheLoader?
>>>>>>
>>>>>> To explain my observations in a little more detail:
>>>>>> * I start a cluster of two servers, using [1] as the cache config.
>>>>>> Both have a local copy of the Lucene index that will be loaded into
>>>>>> the InfinispanDirectory via the loader. This is a test configuration,
>>>>>> where I've set numOwners=1 so that I only need two servers for
>>>>>> distribution to happen.
>>>>>> * Upon startup, things look good. I see the memory usage of the JVM
>>>>>> reflect a pretty near 50/50 split of the data across both servers.
>>>>>> Logging indicates both servers are in the cluster view, all seems fine.
>>>>>> * When I send a search query to either one of the nodes, I notice the
>>>>>> following:
>>>>>>   - iotop shows huge (~100MB/s) disk I/O on that node alone from the
>>>>>>     JVM process.
>>>>>>   - no change in network activity between nodes (~300b/s, same as when idle)
>>>>>>   - memory usage on the node running the query increases dramatically,
>>>>>>     and stays higher even after the query is finished.
>>>>>>
>>>>>> So it seemed to me like each node was favouring use of the CacheLoader
>>>>>> to retrieve keys that are not in memory, instead of using the cluster.
>>>>>> Does that seem reasonable? Is this the expected behaviour?
>>>>>>
>>>>>> I started to investigate this by turning on trace logging, and this
>>>>>> made me think perhaps the cause was that the CacheLoader's interceptor
>>>>>> is higher priority in the chain than the distribution interceptor?
>>>>>> I'm not at all familiar with the design in any level of detail - just
>>>>>> what I picked up in the last 24 hours from browsing the code, so I
>>>>>> could easily be way off. I've attached the log snippets I thought
>>>>>> relevant in [2].
>>>>>>
>>>>>> Any advice offered much appreciated.
>>>>>> Thanks!
>>>>>>
>>>>>> James.
>>>>>>
>>>>>> [1] https://www.refheap.com/paste/12531
>>>>>> [2] https://www.refheap.com/paste/12543

Cheers,
--
Mircea Markus
Infinispan lead (www.infinispan.org)

_______________________________________________
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev