On 19 Mar 2013, at 16:15, Dan Berindei wrote:

> Hi Sanne
>
> On Tue, Mar 19, 2013 at 4:12 PM, Sanne Grinovero <sa...@infinispan.org> wrote:
>
> Mircea,
> what I was most looking forward to was your comment on the interceptor
> order generated for DIST+cachestores
> - we don't think the ClusteredCacheLoader should be needed at all
>
> Agreed, ClusteredCacheLoader should not be necessary.
>
> James, if you're still seeing problems with numOwners=1, could you create
> an issue in JIRA?
>
> - each DIST node is loading from the CacheLoader (any) rather than
> loading from its peer nodes for non-owned entries (!!)
>
> Sometimes loading stuff from a local disk is faster than going remote, e.g.
> if you have numOwners=2 and both owners have to load the same entry from
> disk and send it to the originator twice.

The staggering of remote gets should overcome that.

> Still, most of the time the entry is going to be in memory on the owner
> nodes, so the local load is slower (especially with a shared cache store,
> where loading is over the network as well).

+1

> This has come up on several threads now and I think it's critically
> wrong; as I commented previously, this also introduces many
> inconsistencies - as far as I understand it.
>
> Is there a JIRA for this already?
>
> Yes, loading a stale entry from the local cache store is definitely not a
> good thing, but we actually delete the non-owned entries after the initial
> state transfer. There may be some consistency issues if one uses a DIST_SYNC
> cache with a shared async cache store, but fully sync configurations should
> be fine.
>
> OTOH, if the cache store is not shared, the chances of finding the entry in
> the local store on a non-owner are slim to none, so it doesn't make sense to
> do the lookup.
>
> Implementation-wise, just changing the interceptor order is probably not
> enough. If the key doesn't exist in the cache, the CacheLoaderInterceptor
> will still try to load it from the cache store after the remote lookup, so
> we'll need a marker in the invocation context to avoid the extra cache store
> load.

If the key doesn't map to the local node, it should trigger a remote get to
the owners (or allow the dist interceptor to do just that).

> Actually, since this is just a performance issue, it could wait until we
> implement tombstones everywhere.

Hmm, I'm not sure I see the correlation between this and tombstones?
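To make the lookup order being proposed here concrete, below is a minimal
sketch. It is not Infinispan internals: Store, RemoteFetch and OwnerCheck are
illustrative stand-ins for the cache store, the remote get performed by the
dist interceptor, and the DistributionManager's ownership check.

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Stand-ins for the real components; all names here are hypothetical.
interface Store { Object load(Object key); }
interface RemoteFetch { Object fromOwners(Object key); }
interface OwnerCheck { boolean isLocalOwner(Object key); }

class OwnerAwareReadPath {
    private final Map<Object, Object> dataContainer = new ConcurrentHashMap<>();
    private final Store store;
    private final RemoteFetch remote;
    private final OwnerCheck owners;

    OwnerAwareReadPath(Store store, RemoteFetch remote, OwnerCheck owners) {
        this.store = store;
        this.remote = remote;
        this.owners = owners;
    }

    Object get(Object key) {
        Object v = dataContainer.get(key);
        if (v != null) return v;              // 1. in-JVM memory first
        if (!owners.isLocalOwner(key)) {
            // 2. non-owner: ask the owners over the network; consulting the
            //    local store here is where the stale reads would come from.
            return remote.fromOwners(key);
        }
        return store.load(key);               // 3. owner only: fall back to the store
    }
}

The marker Dan mentions would record that step 2 already produced an answer
(even a null one), so the CacheLoaderInterceptor doesn't fall through to
step 3 on the way back up the chain.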
> BTW your gist wouldn't work, the metadata cache needs to load certain
> elements too. But it's nice you spotted the need to potentially filter what
> "preload" means in the scope of each cache, as the metadata one should
> only preload metadata, while in the original configuration this data
> would indeed be duplicated.
> Opened: https://issues.jboss.org/browse/ISPN-2938
>
> Sanne
>
> On 19 March 2013 11:51, Mircea Markus <mmar...@redhat.com> wrote:
> >
> > On 16 Mar 2013, at 01:19, Sanne Grinovero wrote:
> >
> >> Hi Adrian,
> >> let's forget about Lucene details and focus on DIST.
> >> With numOwners=1 and two nodes, the entries should be stored roughly
> >> 50% on each node. I see nothing wrong with that, considering you don't
> >> need data failover in a read-only use case with all of the index
> >> available in the shared CacheLoader.
> >>
> >> In such a scenario, with both nodes having preloaded all the data, on a
> >> get() operation I would expect either:
> >> A) to be the owner, hence retrieve the value from the local in-JVM
> >> reference, or
> >> B) to not be the owner, so to forward the request to the other node,
> >> with roughly a 50% chance per key of being in case A or B.
> >>
> >> But when hitting case B) it seems that instead of loading from the
> >> other node, it hits the CacheLoader to fetch the value.
> >>
> >> I had already asked James to verify with 4 nodes and numOwners=2; the
> >> result is the same, so I suggested he ask here. BTW I think numOwners=1
> >> is perfectly valid and should work just as well; the only reason I asked
> >> him to repeat the test is that we don't have many tests for the
> >> numOwners=1 case, and I was assuming there might be some (wrong)
> >> assumptions affecting this.
> >>
> >> Note that this is not "just" a critical performance problem: I also
> >> suspect it could produce inconsistent reads, in two classes of
> >> problems:
> >>
> >> # non-shared CacheStore with stale entries
> >> For non-owned keys it will hit the local CacheStore first, where you
> >> might expect to not find anything, and then forward the request to the
> >> right node. But what if this node was the owner in the past? It might
> >> have an old entry locally stored, which would be returned instead of
> >> the correct value owned by a different node.
> >>
> >> # shared CacheStore using write-behind
> >> When using an async CacheStore, by definition the content of the
> >> CacheStore is not trustworthy if you don't check the owner first for
> >> entries in memory.
> >>
> >> Both seem critical to me, but the performance impact is really bad too.
> >>
> >> I hoped to run some more tests myself but couldn't look at this yet;
> >> any help from the core team would be appreciated.
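The write-behind hazard above is easy to demonstrate in isolation. The toy
program below simulates it with plain maps standing in for the owner's
in-memory data container and a shared store that is flushed asynchronously;
nothing in it is Infinispan API.

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class WriteBehindStaleRead {
    public static void main(String[] args) {
        Map<String, String> ownerMemory = new ConcurrentHashMap<>();
        Map<String, String> sharedStore = new ConcurrentHashMap<>();
        ScheduledExecutorService flusher = Executors.newSingleThreadScheduledExecutor();

        sharedStore.put("k", "v1");        // old value, already persisted
        ownerMemory.put("k", "v2");        // owner updates in memory first...
        flusher.schedule(() -> sharedStore.put("k", "v2"),
                         500, TimeUnit.MILLISECONDS); // ...the store catches up later

        // A non-owner reading during the flush window:
        System.out.println("via shared store: " + sharedStore.get("k")); // v1, stale
        System.out.println("via owner:        " + ownerMemory.get("k")); // v2, correct

        flusher.shutdown();
    }
}

Until the flush runs, any node that answers a get() from the shared store
instead of the owner returns v1, which is exactly why the owner has to be
consulted first.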
> > I think you have a fair point, and reads/writes to the data should be
> > coordinated through its owners, both for performance and (more
> > importantly) correctness.
> > Mind creating a JIRA for this?
>
> >> @Ray, thanks for mentioning the ClusterCacheLoader. Wasn't there
> >> someone else with a CacheLoader issue recently who had worked around
> >> the problem by using a ClusterCacheLoader?
> >> Do you remember what the scenario was?
> >>
> >> Cheers,
> >> Sanne
> >>
> >> On 15 March 2013 15:44, Adrian Nistor <anis...@redhat.com> wrote:
> >>> Hi James,
> >>>
> >>> I'm not an expert on InfinispanDirectory but I've noticed in [1] that
> >>> the lucene-index cache is distributed with numOwners = 1. That means
> >>> each cache entry is owned by just one cluster node and there's nowhere
> >>> else to go in the cluster if the key is not available in local memory,
> >>> thus it needs fetching from the cache store. This can be solved with
> >>> numOwners > 1.
> >>> Please let me know if this solves your problem.
> >>>
> >>> Cheers!
> >>>
> >>> On 03/15/2013 05:03 PM, James Aley wrote:
> >>>> Hey all,
> >>>>
> >>>> <OT>
> >>>> Seeing as this is my first post, I wanted to just quickly thank you
> >>>> all for Infinispan. So far I'm really enjoying working with it -
> >>>> great product!
> >>>> </OT>
> >>>>
> >>>> I'm using the InfinispanDirectory for a Lucene project at the moment.
> >>>> We use Lucene directly to build a search product, which has high read
> >>>> requirements and likely very large indexes. I'm hoping to make use of
> >>>> a distribution-mode cache to keep the whole index in memory across a
> >>>> cluster of machines (the index will be too big for one server).
> >>>>
> >>>> The problem I'm having is that after loading a filesystem-based Lucene
> >>>> directory into InfinispanDirectory via LuceneCacheLoader, no nodes are
> >>>> retrieving data from the cluster - they instead look up keys in their
> >>>> local CacheLoaders, which involves lots of disk I/O and is very slow.
> >>>> I was hoping to just use the CacheLoader to initialize the caches, but
> >>>> from there on read only from RAM (and network, of course). Is this
> >>>> supported? Maybe I've misunderstood the purpose of the CacheLoader?
> >>>>
> >>>> To explain my observations in a little more detail:
> >>>> * I start a cluster of two servers, using [1] as the cache config.
> >>>> Both have a local copy of the Lucene index that will be loaded into
> >>>> the InfinispanDirectory via the loader. This is a test configuration,
> >>>> where I've set numOwners=1 so that I only need two servers for
> >>>> distribution to happen.
> >>>> * Upon startup, things look good. I see the memory usage of the JVM
> >>>> reflect a pretty near 50/50 split of the data across both servers.
> >>>> Logging indicates both servers are in the cluster view, all seems
> >>>> fine.
> >>>> * When I send a search query to either one of the nodes, I notice the
> >>>> following:
> >>>> - iotop shows huge (~100MB/s) disk I/O on that node alone from the
> >>>> JVM process.
> >>>> - no change in network activity between nodes (~300b/s, same as when
> >>>> idle)
> >>>> - memory usage on the node running the query increases dramatically,
> >>>> and stays higher even after the query is finished.
> >>>>
> >>>> So it seemed to me like each node was favouring use of the CacheLoader
> >>>> to retrieve keys that are not in memory, instead of using the cluster.
> >>>> Does that seem reasonable? Is this the expected behaviour?
> >>>>
> >>>> I started to investigate this by turning on trace logging, and this
> >>>> made me think that perhaps the cause is that the CacheLoader's
> >>>> interceptor is higher priority in the chain than the distribution
> >>>> interceptor? I'm not at all familiar with the design in any level of
> >>>> detail - just what I picked up in the last 24 hours from browsing the
> >>>> code, so I could easily be way off. I've attached the log snippets I
> >>>> thought relevant in [2].
> >>>>
> >>>> Any advice offered much appreciated.
> >>>> Thanks!
> >>>>
> >>>> James.
> >>>>
> >>>> [1] https://www.refheap.com/paste/12531
> >>>> [2] https://www.refheap.com/paste/12543
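As a sanity check for what James describes, one can ask the cache which of a
set of keys are locally owned. A sketch, assuming the Infinispan 5.x API
(AdvancedCache.getDistributionManager() and DistributionManager.locate(key)
returning the owner addresses); treat the exact signatures as assumptions to
verify against your version.

import java.util.List;
import org.infinispan.Cache;
import org.infinispan.distribution.DistributionManager;
import org.infinispan.remoting.transport.Address;

public class OwnershipProbe {
    // Fraction of the given keys owned by this node; with two nodes and
    // numOwners=1 it should hover around 0.5.
    public static double locallyOwnedFraction(Cache<String, ?> cache,
                                              Iterable<String> keys) {
        DistributionManager dm = cache.getAdvancedCache().getDistributionManager();
        Address self = cache.getCacheManager().getAddress();
        long total = 0, local = 0;
        for (String key : keys) {
            total++;
            List<Address> owners = dm.locate(key);
            if (owners.contains(self)) local++;
        }
        return total == 0 ? 0.0 : (double) local / total;
    }
}

Gets for the non-owned half of the keys are the ones that should be going
over the wire rather than to the local CacheLoader.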
Cheers,
--
Mircea Markus
Infinispan lead (www.infinispan.org)

_______________________________________________
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev