Update: I tried again - I think I misconfigured the ClusterCacheLoader on my
last attempt. With this configuration [1] it actually appears to be loading
keys over the network from the peer node. I'm seeing a lot of network I/O
between the nodes when requesting from either one of them (30-50 MB/s), and
considerably less disk I/O than previously, though still not negligible.
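In case the paste ever disappears: the shape of the configuration is a
distributed cache with numOwners=1, preload enabled, and a ClusterCacheLoader
chained in front of the LuceneCacheLoader. A minimal sketch of that shape
(Infinispan 5.2-style XML; the timeout, location and chunk-size values here
are illustrative, not the exact ones from [1]):

    <namedCache name="lucene-index">
       <clustering mode="distribution">
          <sync/>
          <hash numOwners="1"/>
       </clustering>
       <!-- Loaders are consulted in chain order: cluster peers first,
            then the local on-disk Lucene index. -->
       <loaders passivation="false" shared="false" preload="true">
          <loader class="org.infinispan.loaders.cluster.ClusterCacheLoader">
             <properties>
                <property name="remoteCallTimeout" value="20000"/>
             </properties>
          </loader>
          <loader class="org.infinispan.lucene.cachestore.LuceneCacheLoader">
             <properties>
                <property name="location" value="/path/to/index"/>
                <property name="autoChunkSize" value="1048576"/>
             </properties>
          </loader>
       </loaders>
    </namedCache>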
I think, however, that both nodes are holding on to any data they retrieve
from the other node. Is this possible? The reason I think this is the case:

* I have a fairly large test index on disk, which the Lucene CacheLoader
loads into memory as soon as the cache is created. It's about a 12GB index,
and after a flurry of disk activity when the processes start, I see about
5-6GB of heap usage on each node -- all seems good.
* When I send requests now (with this ClusterCacheLoader configuration
linked below), I see network activity between nodes, plus some disk I/O.
* After each query, each node's heap usage grows considerably. Eventually
they'll both be using about 11GB of RAM.
* At the point where both nodes have lots of data in RAM, the network I/O
has dropped hugely, to ~100k/s.
* If I repeat an identical query to either node, the response is instant -
O(10ms).

I don't know if this is because they're lazily loading entries from disk
despite the preload=true setting (does the index just take up far more RAM
when loaded as a Cache like this?), or because they're locally caching
entries that should (by the consistent hash and numOwners configuration, at
least) only live on the remote node?

Thanks!

James.

[1] https://www.refheap.com/paste/12685
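P.S. To test the "both nodes end up holding everything" theory, I've been
poking at each node with a quick check along these lines (Infinispan 5.x
API; this is just a throwaway sketch of mine, the class and method names
are made up):

    import org.infinispan.Cache;
    import org.infinispan.container.DataContainer;
    import org.infinispan.distribution.DistributionManager;

    public class LocalityCheck {
        // Compares how many keys this node *should* own per the consistent
        // hash against how many entries it actually holds in memory.
        public static void report(Cache<?, ?> cache, Iterable<?> keys) {
            DistributionManager dm =
                    cache.getAdvancedCache().getDistributionManager();
            DataContainer container =
                    cache.getAdvancedCache().getDataContainer();
            int owned = 0, held = 0, total = 0;
            for (Object key : keys) {
                total++;
                if (dm.getLocality(key).isLocal()) owned++; // owner per hash
                if (container.containsKey(key)) held++;     // actually in memory
            }
            System.out.printf("owns %d of %d keys, holds %d in memory%n",
                    owned, total, held);
        }
    }

If "held" creeps up towards "total" on both nodes while "owned" stays at
roughly half, that would confirm that non-owned entries are being kept
around locally after remote fetches.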
On 16 March 2013 01:19, Sanne Grinovero <sa...@infinispan.org> wrote:
> Hi Adrian,
> let's forget about Lucene details and focus on DIST.
> With numOwners=1 and having two nodes, the entries should be stored
> roughly 50% on each node; I see nothing wrong with that,
> considering you don't need data failover in a read-only use case
> having all the index available in the shared CacheLoader.
>
> In such a scenario, with both nodes having preloaded all data, in the
> case of a get() operation I would expect either:
> A) to be the owner, hence retrieve the value from the local in-JVM
> reference
> B) to not be the owner, so to forward the request to the other node
> with roughly a 50% chance per key of being in case A or B.
>
> But when hitting case B) it seems that instead of loading from the
> other node, it hits the CacheLoader to fetch the value.
>
> I had already asked James to verify with 4 nodes and numOwners=2; the
> result is the same, so I suggested he ask here.
> BTW I think numOwners=1 is perfectly valid and should work just as
> well; the only reason I asked him to repeat the test is that we don't
> have many tests for the numOwners=1 case and I was assuming there
> might be some (wrong) assumptions affecting this.
>
> Note that this is not "just" a critical performance problem; I also
> suspect it could produce inconsistent reads, in two classes of
> problems:
>
> # non-shared CacheStore with stale entries
> For non-owned keys it will hit the local CacheStore first, where you
> might expect not to find anything, and then forward the request to
> the right node. But what if this node was the owner in the past? It
> might have an old entry stored locally, which would be returned
> instead of the correct value owned by a different node.
>
> # shared CacheStore using write-behind
> When using an async CacheStore, by definition the content of the
> CacheStore is not trustworthy unless you first check the owner for
> entries in memory.
>
> Both seem critical to me, but the performance impact is really bad too.
>
> I had hoped to run some more tests myself but couldn't look at this
> yet; any help from the core team would be appreciated.
>
> @Ray, thanks for mentioning the ClusterCacheLoader. Wasn't there
> someone else with a CacheLoader issue recently who had worked around
> the problem by using a ClusterCacheLoader?
> Do you remember what the scenario was?
>
> Cheers,
> Sanne
>
> On 15 March 2013 15:44, Adrian Nistor <anis...@redhat.com> wrote:
>> Hi James,
>>
>> I'm not an expert on InfinispanDirectory, but I've noticed in [1] that
>> the lucene-index cache is distributed with numOwners = 1. That means
>> each cache entry is owned by just one cluster node, and there's nowhere
>> else to go in the cluster if the key is not available in local memory,
>> so it needs fetching from the cache store. This can be solved with
>> numOwners > 1.
>> Please let me know if this solves your problem.
>>
>> Cheers!
>>
>> On 03/15/2013 05:03 PM, James Aley wrote:
>>> Hey all,
>>>
>>> <OT>
>>> Seeing as this is my first post, I wanted to quickly thank you
>>> all for Infinispan. So far I'm really enjoying working with it - great
>>> product!
>>> </OT>
>>>
>>> I'm using the InfinispanDirectory for a Lucene project at the moment.
>>> We use Lucene directly to build a search product, which has high read
>>> requirements and likely very large indexes. I'm hoping to make use of
>>> a distribution-mode cache to keep the whole index in memory across a
>>> cluster of machines (the index will be too big for one server).
>>>
>>> The problem I'm having is that after loading a filesystem-based Lucene
>>> directory into InfinispanDirectory via the LuceneCacheLoader, no nodes
>>> are retrieving data from the cluster - they instead look up keys in
>>> their local CacheLoaders, which involves lots of disk I/O and is very
>>> slow. I was hoping to use the CacheLoader just to initialize the
>>> caches, and from there on read only from RAM (and network, of course).
>>> Is this supported? Maybe I've misunderstood the purpose of the
>>> CacheLoader?
>>>
>>> To explain my observations in a little more detail:
>>> * I start a cluster of two servers, using [1] as the cache config.
>>> Both have a local copy of the Lucene index that will be loaded into
>>> the InfinispanDirectory via the loader. This is a test configuration,
>>> where I've set numOwners=1 so that I only need two servers for
>>> distribution to happen.
>>> * Upon startup, things look good. I see the memory usage of the JVM
>>> reflect a pretty near 50/50 split of the data across both servers.
>>> Logging indicates both servers are in the cluster view, all seems
>>> fine.
>>> * When I send a search query to either one of the nodes, I notice the
>>> following:
>>>   - iotop shows huge (~100MB/s) disk I/O on that node alone from the
>>> JVM process.
>>>   - no change in network activity between nodes (~300b/s, same as
>>> when idle)
>>>   - memory usage on the node running the query increases dramatically,
>>> and stays higher even after the query is finished.
>>>
>>> So it seemed to me like each node was favouring use of the CacheLoader
>>> to retrieve keys that are not in memory, instead of using the cluster.
>>> Does that seem reasonable? Is this the expected behaviour?
>>>
>>> I started to investigate this by turning on trace logging, and this
>>> made me think perhaps the cause is that the CacheLoader's interceptor
>>> sits higher in the chain than the distribution interceptor?
>>> I'm not at all familiar with the design in any level of detail - just
>>> what I picked up in the last 24 hours from browsing the code, so I
>>> could easily be way off. I've attached the log snippets I thought
>>> relevant in [2].
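>>>
>>> For reference, this is roughly how I've been printing the chain while
>>> digging around (Infinispan 5.x API; just a throwaway helper of mine,
>>> not anything official):
>>>
>>>     import org.infinispan.Cache;
>>>     import org.infinispan.interceptors.base.CommandInterceptor;
>>>
>>>     public class ChainDump {
>>>         // Prints the interceptor chain top to bottom, to see whether
>>>         // the cache-loader interceptor sits above distribution.
>>>         public static void dump(Cache<?, ?> cache) {
>>>             for (CommandInterceptor i :
>>>                     cache.getAdvancedCache().getInterceptorChain()) {
>>>                 System.out.println(i.getClass().getName());
>>>             }
>>>         }
>>>     }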
>>>
>>> Any advice offered much appreciated.
>>> Thanks!
>>>
>>> James.
>>>
>>> [1] https://www.refheap.com/paste/12531
>>> [2] https://www.refheap.com/paste/12543

_______________________________________________
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev