FYI I've created a JIRA to track this: https://issues.jboss.org/browse/ISPN-2950
While this is quite a performance issue, I don't think it is a
critical/consistency issue for async stores: by using an async store you might
lose data (expect inconsistencies) during a node crash anyway, so all this
behaviour does is widen the inconsistency window.
 

On 19 Mar 2013, at 16:30, Mircea Markus wrote:
> 
> On 19 Mar 2013, at 16:15, Dan Berindei wrote:
> 
>> Hi Sanne
>> 
>> On Tue, Mar 19, 2013 at 4:12 PM, Sanne Grinovero <sa...@infinispan.org> 
>> wrote:
>> Mircea,
>> what I was most looking forward to was your comment on the interceptor
>> order generated for DIST+cachestores
>> - we don't think the ClusteredCacheLoader should be needed at all
>> 
>> Agree, ClusteredCacheLoader should not be necessary.
>> 
>> James, if you're still seeing problems with numOwners=1, could you create an 
>> issue in JIRA?
>> 
>> 
>> - each DIST node is loading from the CacheLoader (any) rather than
>> loading from its peer nodes for non-owned entries (!!)
>> 
>> 
>> Sometimes loading stuff from a local disk is faster than going remote, e.g. 
>> if you have numOwners=2 and both owners have to load the same entry from 
>> disk and send it to the originator twice. 
> the staggering of remote gets should overcome that. 
>> 
>> Still, most of the time the entry is going to be in memory on the owner 
>> nodes, so the local load is slower (especially with a shared cache store, 
>> where loading is over the network as well).
> +1
>> 
>> 
>> This has come up on several threads now and I think it's critically
>> wrong; as I commented previously, this also introduces many
>> inconsistencies, as far as I understand it.
>> 
>> 
>> Is there a JIRA for this already?
>> 
>> Yes, loading a stale entry from the local cache store is definitely not a 
>> good thing, but we actually delete the non-owned entries after the initial 
>> state transfer. There may be some consistency issues if one uses a DIST_SYNC 
>> cache with a shared async cache store, but fully sync configurations should 
>> be fine.
>> 
>> OTOH, if the cache store is not shared, the chances of finding the entry in 
>> the local store on a non-owner are slim to none, so it doesn't make sense to 
>> do the lookup.
>> 
>> Implementation-wise, just changing the interceptor order is probably not 
>> enough. If the key doesn't exist in the cache, the CacheLoaderInterceptor 
>> will still try to load it from the cache store after the remote lookup, so 
>> we'll need a marker in the invocation context to avoid the extra cache 
>> store load.
> if the key doesn't map to the local node it should trigger a remote get to the
> owners (or allow the dist interceptor to do just that)
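
For illustration, a minimal caller-side sketch of the owner-aware read being
discussed, against the public 5.x API (the class and method names here are made
up, and Flag.SKIP_CACHE_LOAD applies to the whole read, so this only
approximates the behaviour rather than implementing the interceptor-level fix):

    import org.infinispan.AdvancedCache;
    import org.infinispan.Cache;
    import org.infinispan.context.Flag;
    import org.infinispan.distribution.DistributionManager;

    public final class OwnerAwareRead {
       public static Object get(Cache<Object, Object> cache, Object key) {
          AdvancedCache<Object, Object> ac = cache.getAdvancedCache();
          DistributionManager dm = ac.getDistributionManager();
          // Owner (or non-DIST cache): read normally, memory first then the store.
          if (dm == null || dm.getLocality(key).isLocal())
             return ac.get(key);
          // Non-owner: skip the cache store for this read and let the
          // distribution interceptor fetch the value from the owners.
          return ac.withFlags(Flag.SKIP_CACHE_LOAD).get(key);
       }
    }
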
>> Actually, since this is just a performance issue, it could wait until we 
>> implement tombstones everywhere.
> Hmm, not sure I see the correlation between this and tombstones? 
> 
>> 
>> BTW your gist wouldn't work: the metadata cache needs to load certain
>> elements too. But it's good that you spotted the need to potentially filter
>> what "preload" means in the scope of each cache, as the metadata cache should
>> only preload metadata, while in the original configuration this data
>> would indeed be duplicated.
>> Opened: https://issues.jboss.org/browse/ISPN-2938
>> 
>> Sanne
>> 
>> On 19 March 2013 11:51, Mircea Markus <mmar...@redhat.com> wrote:
>>> 
>>> On 16 Mar 2013, at 01:19, Sanne Grinovero wrote:
>>> 
>>>> Hi Adrian,
>>>> let's forget about Lucene details and focus on DIST.
>>>> With numOwners=1 and two nodes, the entries should be stored
>>>> roughly 50% on each node; I see nothing wrong with that,
>>>> considering you don't need data failover in a read-only use case
>>>> where all of the index is available in the shared CacheLoader.
>>>> 
>>>> In such a scenario, with both nodes having preloaded all the data, for a
>>>> get() operation I would expect either:
>>>> A) to be the owner, and hence retrieve the value from the local in-JVM
>>>> reference, or
>>>> B) to not be the owner, and so forward the request to the other node,
>>>> with roughly a 50% chance per key of being in case A or B.
>>>> 
>>>> But when hitting case B) it seems that instead of loading from the
>>>> other node, it hits the CacheLoader to fetch the value.
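
As a side note, a quick way to check which case a given key falls into is to ask
the DistributionManager (a sketch assuming the 5.x API; the helper class name is
made up):

    import java.util.List;
    import org.infinispan.Cache;
    import org.infinispan.remoting.transport.Address;

    public final class KeyLocality {
       // true  -> case A: this node is an owner of the key
       // false -> case B: the read should be served by a remote owner
       public static boolean isLocalOwner(Cache<?, ?> cache, Object key) {
          List<Address> owners = cache.getAdvancedCache().getDistributionManager().locate(key);
          Address self = cache.getAdvancedCache().getRpcManager().getAddress();
          return owners.contains(self);
       }
    }
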
>>>> 
>>>> I had already asked James to verify with 4 nodes and numOwners=2; the
>>>> result is the same, so I suggested he ask here.
>>>> BTW I think numOwners=1 is perfectly valid and should work just as
>>>> well; the only reason I asked him to repeat
>>>> the test is that we don't have many tests for the numOwners=1 case and
>>>> I was assuming there might be some (wrong) assumptions
>>>> affecting this.
>>>> 
>>>> Note that this is not "just" a critical performance problem: I also
>>>> suspect it could produce inconsistent reads, in two classes of
>>>> problems:
>>>> 
>>>> # non-shared CacheStore with stale entries
>>>> For non-owned keys it will hit the local CacheStore first, where
>>>> you might expect not to find anything, and then forward the request to
>>>> the right node. But what if this node was the owner in the past? It
>>>> might have an old entry stored locally, which would be returned
>>>> instead of the correct value owned by a different node.
>>>> 
>>>> # shared CacheStore using write-behind
>>>> When using an async CacheStore, the content of the CacheStore is by
>>>> definition not trustworthy unless you first check the owners for the
>>>> entries they hold in memory.
>>>> 
>>>> Both seem critical to me, but the performance impact is really bad too.
>>>> 
>>>> I hoped to make some more tests myself but couldn't look at this yet,
>>>> any help from the core team would be appreciated.
>>> I think you have a fair point and reads/writes to the data should be 
>>> coordinated through its owners both for performance and (more importantly) 
>>> correctness.
>>> Mind creating a JIRA for this?
>>> 
>>>> 
>>>> @Ray, thanks for mentioning the ClusterCacheLoader. Wasn't there
>>>> someone else with a CacheLoader issue recently who had worked around
>>>> the problem by using a ClusterCacheLoader?
>>>> Do you remember what the scenario was?
>>>> 
>>>> Cheers,
>>>> Sanne
>>>> 
>>>> On 15 March 2013 15:44, Adrian Nistor <anis...@redhat.com> wrote:
>>>>> Hi James,
>>>>> 
>>>>> I'm not an expert on InfinispanDirectory, but I've noticed in [1] that
>>>>> the lucene-index cache is distributed with numOwners = 1. That means
>>>>> each cache entry is owned by just one cluster node, and there's nowhere
>>>>> else to go in the cluster if the key is not available in local memory,
>>>>> so it needs to be fetched from the cache store. This can be solved with
>>>>> numOwners > 1.
>>>>> Please let me know if this solves your problem.
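
For reference, a hedged programmatic sketch of the change Adrian suggests (the
thread's real configuration is the XML in [1]; the class name below is made up):

    import org.infinispan.configuration.cache.CacheMode;
    import org.infinispan.configuration.cache.Configuration;
    import org.infinispan.configuration.cache.ConfigurationBuilder;

    public final class LuceneIndexCacheConfig {
       // Distributed cache with more than one owner per entry, so a read on a
       // non-owner can still be served from another node's memory.
       public static Configuration build() {
          return new ConfigurationBuilder()
                .clustering().cacheMode(CacheMode.DIST_SYNC)
                .hash().numOwners(2)
                .build();
       }
    }
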
>>>>> 
>>>>> Cheers!
>>>>> 
>>>>> On 03/15/2013 05:03 PM, James Aley wrote:
>>>>>> Hey all,
>>>>>> 
>>>>>> <OT>
>>>>>> Seeing as this is my first post, I wanted to just quickly thank you
>>>>>> all for Infinispan. So far I'm really enjoying working with it - great
>>>>>> product!
>>>>>> </OT>
>>>>>> 
>>>>>> I'm using the InfinispanDirectory for a Lucene project at the moment.
>>>>>> We use Lucene directly to build a search product, which has high read
>>>>>> requirements and likely very large indexes. I'm hoping to make use of
>>>>>> a distribution mode cache to keep the whole index in memory across a
>>>>>> cluster of machines (the index will be too big for one server).
>>>>>> 
>>>>>> The problem I'm having is that after loading a filesystem-based Lucene
>>>>>> directory into InfinispanDirectory via LuceneCacheLoader, no nodes are
>>>>>> retrieving data from the cluster - they instead look up keys in their
>>>>>> local CacheLoaders, which involves lots of disk I/O and is very slow.
>>>>>> I was hoping to just use the CacheLoader to initialize the caches, but
>>>>>> from there on read only from RAM (and network, of course). Is this
>>>>>> supported? Maybe I've misunderstood the purpose of the CacheLoader?
>>>>>> 
>>>>>> To explain my observations in a little more detail:
>>>>>> * I start a cluster of two servers, using [1] as the cache config.
>>>>>> Both have a local copy of the Lucene index that will be loaded into
>>>>>> the InfinispanDirectory via the loader. This is a test configuration,
>>>>>> where I've set numOwners=1 so that I only need two servers for
>>>>>> distribution to happen.
>>>>>> * Upon startup, things look good. I see the memory usage of the JVM
>>>>>> reflect a pretty near 50/50 split of the data across both servers.
>>>>>> Logging indicates both servers are in the cluster view, all seems
>>>>>> fine.
>>>>>> * When I send a search query to either one of the nodes, I notice the 
>>>>>> following:
>>>>>>  - iotop shows huge (~100MB/s) disk I/O on that node alone from the
>>>>>> JVM process.
>>>>>>  - no change in network activity between nodes (~300b/s, same as when 
>>>>>> idle)
>>>>>>  - memory usage on the node running the query increases dramatically,
>>>>>> and stays higher even after the query is finished.
>>>>>> 
>>>>>> So it seemed to me like each node was favouring use of the CacheLoader
>>>>>> to retrieve keys that are not in memory, instead of using the cluster.
>>>>>> Does that seem reasonable? Is this the expected behaviour?
>>>>>> 
>>>>>> I started to investigate this by turning on trace logging, and this
>>>>>> made me think that perhaps the cause is that the CacheLoader's interceptor
>>>>>> has higher priority in the chain than the distribution interceptor?
>>>>>> I'm not at all familiar with the design in any level of detail - just
>>>>>> what I picked up in the last 24 hours from browsing the code, so I
>>>>>> could easily be way off. I've attached the log snippets I thought
>>>>>> relevant in [2].
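
One way to verify that suspicion at runtime is to dump the interceptor chain and
see where the cache loader interceptor sits relative to the distribution
interceptor (a sketch assuming the 5.x AdvancedCache API; the helper name is
made up):

    import org.infinispan.Cache;
    import org.infinispan.interceptors.base.CommandInterceptor;

    public final class DumpInterceptors {
       // Prints the interceptor chain from top to bottom.
       public static void dump(Cache<?, ?> cache) {
          for (CommandInterceptor i : cache.getAdvancedCache().getInterceptorChain())
             System.out.println(i.getClass().getName());
       }
    }
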
>>>>>> 
>>>>>> Any advice offered much appreciated.
>>>>>> Thanks!
>>>>>> 
>>>>>> James.
>>>>>> 
>>>>>> 
>>>>>> [1] https://www.refheap.com/paste/12531
>>>>>> [2] https://www.refheap.com/paste/12543
>>> 
>>> Cheers,
>>> --
>>> Mircea Markus
>>> Infinispan lead (www.infinispan.org)
>>> 
>>> 
>>> 
>>> 
>>> 
> 
> Cheers,
> -- 
> Mircea Markus
> Infinispan lead (www.infinispan.org)
> 
> 
> 
> 
> 

Cheers,
-- 
Mircea Markus
Infinispan lead (www.infinispan.org)





_______________________________________________
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev
