Re: [infinispan-dev] Preloading from disk versus state transfer Re: ISPN-1384 - InboundInvocationHandlerImpl should wait for cache to be started? (not just defined)

2011-10-26 Thread Dan Berindei
On Mon, Oct 24, 2011 at 4:42 PM, Sanne Grinovero sa...@infinispan.org wrote:
 On 24 October 2011 12:58, Dan Berindei dan.berin...@gmail.com wrote:
 Hi Galder

 On Mon, Oct 24, 2011 at 1:46 PM, Galder Zamarreño gal...@redhat.com wrote:

 On Oct 24, 2011, at 12:04 PM, Dan Berindei wrote:

 ISPN-1470 (https://issues.jboss.org/browse/ISPN-1470) raises an
 interesting question: if the preloading happens before joining, the
 preloading code won't know anything about the consistent hash. It will
 load everything from the cache store, including the keys that are
 owned by other nodes.

 It's been defined to work that way:
 https://docs.jboss.org/author/display/ISPN/CacheLoaders

 Tbh, that will only happen in shared cache stores. In non-shared ones, 
 you'll only have data that belongs to that node.


 Not really... in distributed mode, every time the cache starts it will
 have another position on the hash wheel.
 That means even with a non-shared cache store, it's likely most of the
 stored keys will no longer be local.

 Actually I just noticed that you've fixed ISPN-1404, which looks like
 it would solve my problem when the cache is created by a HotRod
 server. I would like to extend it to work like this by default, e.g.
 by using the transport's nodeName as the seed.

 I think there is a check in place already so that the joiner won't
 push stale data from its cache store to the other nodes, but we should
 also discard the keys that don't map locally or we'll have stale data
 (since we don't have a way to check if those keys are stale and
 register to receive invalidations for those keys).

 +1, only for shared cache stores.


 What do you think, should I discard the non-local keys with the fix
 for ISPN-1470 or should I let them be and warn the user about
 potentially stale data?

 Discard only for shared cache stores.

 Cache configurations should be symmetrical, so if other nodes preload, 
 they'll preload only data local to them with your change.


 Discarding works fine from the correctness POV, but for performance
 it's not that great: we may do a lot of work to preload keys and have
 nothing to show for it at the end.

 Can't you just skip loading state and be happy with the state you
 receive from peers? More data will be lazily loaded.
 Applying of course only when you're not the only/first node in the
 grid, in which case you have to load.


Right, we could preload only on the first node. With a shared cache
store this should work great, we just have to start preloading after
we connect to the cluster and before we send the join request.
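
A minimal sketch of the rule above, assuming a node can see the cluster
membership once it has connected and before it sends the join request.
PreloadPolicy and shouldPreload are hypothetical names used only for
illustration; they are not Infinispan API.

```java
// Hypothetical sketch, not Infinispan code.
final class PreloadPolicy {
    /**
     * Decides whether a starting node should preload from its cache store.
     * clusterSize is the number of members visible after connecting to the
     * cluster, counting this node itself.
     */
    static boolean shouldPreload(boolean preloadConfigured, int clusterSize) {
        // Only the first (sole) member preloads. Later joiners skip the
        // preload and rely on state transfer plus lazy loading instead.
        return preloadConfigured && clusterSize == 1;
    }

    public static void main(String[] args) {
        System.out.println(shouldPreload(true, 1)); // first node in the grid
        System.out.println(shouldPreload(true, 3)); // joiner to an existing grid
    }
}
```

The check has to run between connecting and joining, because that is the
only point where the node knows whether it is alone but has not yet been
asked to serve or receive state.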

But I have trouble visualizing how a persistent (purgeOnStartup =
false) non-shared cache store should work until we have some
validation mechanism like in
https://issues.jboss.org/browse/ISPN-1195. Should we even allow this
kind of setup?


 The only alternative I see is to be able to find the boundaries of
 keys you own, and change the CacheLoader API to load keys by the
 identified range - should work with multiple boundaries too for
 virtualnodes, but this is something that not all CacheLoaders will be
 able to implement, so it should be an optional API; for now I'd stick
 with the first option above as I don't see how we can be more
 efficient in loading the state from CacheLoaders than via JGroups.

 Sanne

 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev



Re: [infinispan-dev] Preloading from disk versus state transfer Re: ISPN-1384 - InboundInvocationHandlerImpl should wait for cache to be started? (not just defined)

2011-10-26 Thread Sanne Grinovero

 Can't you just skip loading state and be happy with the state you
 receive from peers? More data will be lazily loaded.
 Applying of course only when you're not the only/first node in the
 grid, in which case you have to load.


 Right, we could preload only on the first node. With a shared cache
 store this should work great, we just have to start preloading after
 we connect to the cluster and before we send the join request.

 But I have trouble visualizing how a persistent (purgeOnStartup =
 false) non-shared cache store should work until we have some
 validation mechanism like in
 https://issues.jboss.org/browse/ISPN-1195. Should we even allow this
 kind of setup?

Right, I don't think it makes much sense. The current node might have
been down for a long time and its dedicated cacheloader will likely
contain stale values; we might update older values via versions of
optimistic locking, but we won't be able to remove those which should
have been removed.
I don't think we should support that, at least until these problems are solved.
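
To make the problem concrete, here is a small sketch assuming every entry
carries a version number. StaleStoreDemo, Versioned and mergeByVersion are
made-up names for illustration, not Infinispan classes. Merging by version
refreshes the outdated value, but a key that was removed while the node was
down comes back, because nothing in the cluster state says it was deleted.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not Infinispan code.
final class StaleStoreDemo {
    static final class Versioned {
        final String value;
        final long version;
        Versioned(String value, long version) {
            this.value = value;
            this.version = version;
        }
    }

    /** Merges a restarted node's store with cluster state, keeping the higher version per key. */
    static Map<String, Versioned> mergeByVersion(Map<String, Versioned> localStore,
                                                 Map<String, Versioned> clusterState) {
        Map<String, Versioned> merged = new HashMap<>(localStore);
        clusterState.forEach((key, remote) ->
            merged.merge(key, remote, (local, r) -> r.version > local.version ? r : local));
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Versioned> localStore = new HashMap<>();
        localStore.put("a", new Versioned("old", 1));
        localStore.put("b", new Versioned("removed-while-down", 1));

        // While the node was down, "a" was updated and "b" was removed.
        Map<String, Versioned> clusterState = new HashMap<>();
        clusterState.put("a", new Versioned("new", 2));

        Map<String, Versioned> merged = mergeByVersion(localStore, clusterState);
        System.out.println(merged.get("a").value);   // prints "new"
        System.out.println(merged.containsKey("b")); // prints "true": the removed key is back
    }
}
```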


Re: [infinispan-dev] Preloading from disk versus state transfer Re: ISPN-1384 - InboundInvocationHandlerImpl should wait for cache to be started? (not just defined)

2011-10-26 Thread Galder Zamarreño

On Oct 24, 2011, at 12:58 PM, Dan Berindei wrote:

 Hi Galder
 
 On Mon, Oct 24, 2011 at 1:46 PM, Galder Zamarreño gal...@redhat.com wrote:
 
 On Oct 24, 2011, at 12:04 PM, Dan Berindei wrote:
 
 ISPN-1470 (https://issues.jboss.org/browse/ISPN-1470) raises an
 interesting question: if the preloading happens before joining, the
 preloading code won't know anything about the consistent hash. It will
 load everything from the cache store, including the keys that are
 owned by other nodes.
 
 It's been defined to work that way:
 https://docs.jboss.org/author/display/ISPN/CacheLoaders
 
 Tbh, that will only happen in shared cache stores. In non-shared ones, 
 you'll only have data that belongs to that node.
 
 
 Not really... in distributed mode, every time the cache starts it will
 have another position on the hash wheel.
 That means even with a non-shared cache store, it's likely most of the
 stored keys will no longer be local.
 
 Actually I just noticed that you've fixed ISPN-1404, which looks like
 it would solve my problem when the cache is created by a HotRod
 server. I would like to extend it to work like this by default, e.g.
 by using the transport's nodeName as the seed.
 
 I think there is a check in place already so that the joiner won't
 push stale data from its cache store to the other nodes, but we should
 also discard the keys that don't map locally or we'll have stale data
 (since we don't have a way to check if those keys are stale and
 register to receive invalidations for those keys).
 
 +1, only for shared cache stores.
 
 
 What do you think, should I discard the non-local keys with the fix
 for ISPN-1470 or should I let them be and warn the user about
 potentially stale data?
 
 Discard only for shared cache stores.
 
 Cache configurations should be symmetrical, so if other nodes preload, 
 they'll preload only data local to them with your change.
 
 
 Discarding works fine from the correctness POV, but for performance
 it's not that great: we may do a lot of work to preload keys and have
 nothing to show for it at the end.

I agree, I thought of that when replying to this. It'd be great if you could 
bring in only the data that will belong to you, but for that we'd need to store 
the hash of the key as well.
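
A rough sketch of what storing the hash could buy us, assuming the store
keeps entries indexed by the key's hash. HashIndexedStore, hashOf and
loadRange are hypothetical names, and hashOf is only a stand-in for whatever
hash the consistent hash function really uses.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical store layout, not an Infinispan class: entries are indexed by
// the key's hash so part of the hash wheel can be read without a full scan.
final class HashIndexedStore {
    private final TreeMap<Integer, Map<String, String>> byHash = new TreeMap<>();

    /** Stand-in hash function, always non-negative. */
    static int hashOf(String key) {
        return key.hashCode() & 0x7fffffff;
    }

    void put(String key, String value) {
        byHash.computeIfAbsent(hashOf(key), h -> new HashMap<>()).put(key, value);
    }

    /**
     * Loads only entries whose stored hash lies in [fromInclusive, toExclusive).
     * Requires fromInclusive <= toExclusive; a wrapping range would need two calls.
     */
    Map<String, String> loadRange(int fromInclusive, int toExclusive) {
        Map<String, String> result = new HashMap<>();
        for (Map<String, String> bucket
                : byHash.subMap(fromInclusive, true, toExclusive, false).values()) {
            result.putAll(bucket);
        }
        return result;
    }

    public static void main(String[] args) {
        HashIndexedStore store = new HashIndexedStore();
        store.put("a", "1"); // hash 97
        store.put("b", "2"); // hash 98
        store.put("c", "3"); // hash 99
        System.out.println(store.loadRange(97, 99).keySet()); // contains a and b only
    }
}
```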

 
 Enabling the fixed hash seed by default should make the performance
 issue go away. I think it would also require virtual nodes enabled by
 default and a way to ensure that the nodeNames are unique across the
 cluster.
 
 Cheers
 Dan
 
 
 
 Cheers
 Dan
 
 
 On Mon, Oct 3, 2011 at 3:09 AM, Manik Surtani ma...@jboss.org wrote:
 
 On 28 Sep 2011, at 10:56, Dan Berindei wrote:
 
 I'm not sure if the comment is valid though, since the old
 StateTransferManager had priority 55 and it also cleared the data
 container before applying the state from the coordinator. I'm not sure
 how preloading and state transfer are supposed to interact, maybe
 Manik can help clear this up?
 
 Hmm - this is interesting.  I think preloading should happen first, since
 the cache store may contain old data.
 --
 Manik Surtani
 ma...@jboss.org
 twitter.com/maniksurtani
 Lead, Infinispan
 http://www.infinispan.org
 
 
 
 
 --
 Galder Zamarreño
 Sr. Software Engineer
 Infinispan, JBoss Cache
 
 

--
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache




Re: [infinispan-dev] Preloading from disk versus state transfer Re: ISPN-1384 - InboundInvocationHandlerImpl should wait for cache to be started? (not just defined)

2011-10-26 Thread Galder Zamarreño

On Oct 24, 2011, at 2:42 PM, Sanne Grinovero wrote:

 On 24 October 2011 12:58, Dan Berindei dan.berin...@gmail.com wrote:
 Hi Galder
 
 On Mon, Oct 24, 2011 at 1:46 PM, Galder Zamarreño gal...@redhat.com wrote:
 
 On Oct 24, 2011, at 12:04 PM, Dan Berindei wrote:
 
 ISPN-1470 (https://issues.jboss.org/browse/ISPN-1470) raises an
 interesting question: if the preloading happens before joining, the
 preloading code won't know anything about the consistent hash. It will
 load everything from the cache store, including the keys that are
 owned by other nodes.
 
 It's been defined to work that way:
 https://docs.jboss.org/author/display/ISPN/CacheLoaders
 
 Tbh, that will only happen in shared cache stores. In non-shared ones, 
 you'll only have data that belongs to that node.
 
 
 Not really... in distributed mode, every time the cache starts it will
 have another position on the hash wheel.
 That means even with a non-shared cache store, it's likely most of the
 stored keys will no longer be local.
 
 Actually I just noticed that you've fixed ISPN-1404, which looks like
 it would solve my problem when the cache is created by a HotRod
 server. I would like to extend it to work like this by default, e.g.
 by using the transport's nodeName as the seed.
 
 I think there is a check in place already so that the joiner won't
 push stale data from its cache store to the other nodes, but we should
 also discard the keys that don't map locally or we'll have stale data
 (since we don't have a way to check if those keys are stale and
 register to receive invalidations for those keys).
 
 +1, only for shared cache stores.
 
 
 What do you think, should I discard the non-local keys with the fix
 for ISPN-1470 or should I let them be and warn the user about
 potentially stale data?
 
 Discard only for shared cache stores.
 
 Cache configurations should be symmetrical, so if other nodes preload, 
 they'll preload only data local to them with your change.
 
 
 Discarding works fine from the correctness POV, but for performance
 it's not that great: we may do a lot of work to preload keys and have
 nothing to show for it at the end.
 
 Can't you just skip loading state and be happy with the state you
 receive from peers? More data will be lazily loaded.
 Applying of course only when you're not the only/first node in the
 grid, in which case you have to load.
 
 The only alternative I see is to be able to find the boundaries of
 keys you own, and change the CacheLoader API to load keys by the
 identified range - should work with multiple boundaries too for
 virtualnodes, but this is something that not all CacheLoaders will be
 able to implement, so it should be an optional API; for now I'd stick
 with the first option above as I don't see how we can be more
 efficient in loading the state from CacheLoaders than via JGroups.

Before, when state transfer meant that state came from a single node, that node 
could be overloaded, so cache loader access might have been more efficient, 
particularly if it's a non-shared one that's available on your machine.

The benefit of loading state from a cache loader is that the rest of the nodes 
don't have to stop what they're doing, whereas in the current design they do 
have to when the state is loaded from other nodes.

 
 Sanne

--
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache




Re: [infinispan-dev] Preloading from disk versus state transfer Re: ISPN-1384 - InboundInvocationHandlerImpl should wait for cache to be started? (not just defined)

2011-10-24 Thread Dan Berindei
ISPN-1470 (https://issues.jboss.org/browse/ISPN-1470) raises an
interesting question: if the preloading happens before joining, the
preloading code won't know anything about the consistent hash. It will
load everything from the cache store, including the keys that are
owned by other nodes.

I think there is a check in place already so that the joiner won't
push stale data from its cache store to the other nodes, but we should
also discard the keys that don't map locally or we'll have stale data
(since we don't have a way to check if those keys are stale and
register to receive invalidations for those keys).

What do you think, should I discard the non-local keys with the fix
for ISPN-1470 or should I let them be and warn the user about
potentially stale data?
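
For illustration, the discard option could look roughly like the sketch
below. It assumes a toy hash wheel with a single owner per key, whereas a
real distributed cache has numOwners owners. DiscardNonLocal, ownerOf,
retainLocal and hash are hypothetical names, not Infinispan API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch, not Infinispan code.
final class DiscardNonLocal {
    /** Stand-in hash onto a wheel with positions 0..99. */
    static int hash(String key) {
        return Math.floorMod(key.hashCode(), 100);
    }

    /** The owner is the first node at or after the key's position, wrapping around the wheel. */
    static String ownerOf(TreeMap<Integer, String> wheel, int keyHash) {
        Map.Entry<Integer, String> entry = wheel.ceilingEntry(keyHash);
        return (entry != null ? entry : wheel.firstEntry()).getValue();
    }

    /** Returns a copy of the preloaded entries with every key not owned by self dropped. */
    static Map<String, String> retainLocal(Map<String, String> preloaded,
                                           TreeMap<Integer, String> wheel,
                                           String self) {
        Map<String, String> local = new HashMap<>(preloaded);
        local.keySet().removeIf(key -> !self.equals(ownerOf(wheel, hash(key))));
        return local;
    }

    public static void main(String[] args) {
        TreeMap<Integer, String> wheel = new TreeMap<>();
        wheel.put(30, "A");
        wheel.put(70, "B");

        Map<String, String> preloaded = new HashMap<>();
        preloaded.put("ab", "x"); // position 5, owned by A
        preloaded.put("a", "y");  // position 97, wraps around to A
        preloaded.put("2", "z");  // position 50, owned by B

        System.out.println(retainLocal(preloaded, wheel, "A").keySet());
    }
}
```

Note that this filter can only run after the join, once the consistent hash
is known, so all the preloading work for the discarded keys has already been
done by then.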

Cheers
Dan


On Mon, Oct 3, 2011 at 3:09 AM, Manik Surtani ma...@jboss.org wrote:

 On 28 Sep 2011, at 10:56, Dan Berindei wrote:

 I'm not sure if the comment is valid though, since the old
 StateTransferManager had priority 55 and it also cleared the data
 container before applying the state from the coordinator. I'm not sure
 how preloading and state transfer are supposed to interact, maybe
 Manik can help clear this up?

 Hmm - this is interesting.  I think preloading should happen first, since
 the cache store may contain old data.
 --
 Manik Surtani
 ma...@jboss.org
 twitter.com/maniksurtani
 Lead, Infinispan
 http://www.infinispan.org






Re: [infinispan-dev] Preloading from disk versus state transfer Re: ISPN-1384 - InboundInvocationHandlerImpl should wait for cache to be started? (not just defined)

2011-10-24 Thread Sanne Grinovero
On 24 October 2011 12:58, Dan Berindei dan.berin...@gmail.com wrote:
 Hi Galder

 On Mon, Oct 24, 2011 at 1:46 PM, Galder Zamarreño gal...@redhat.com wrote:

 On Oct 24, 2011, at 12:04 PM, Dan Berindei wrote:

 ISPN-1470 (https://issues.jboss.org/browse/ISPN-1470) raises an
 interesting question: if the preloading happens before joining, the
 preloading code won't know anything about the consistent hash. It will
 load everything from the cache store, including the keys that are
 owned by other nodes.

 It's been defined to work that way:
 https://docs.jboss.org/author/display/ISPN/CacheLoaders

 Tbh, that will only happen in shared cache stores. In non-shared ones, 
 you'll only have data that belongs to that node.


 Not really... in distributed mode, every time the cache starts it will
 have another position on the hash wheel.
 That means even with a non-shared cache store, it's likely most of the
 stored keys will no longer be local.

 Actually I just noticed that you've fixed ISPN-1404, which looks like
 it would solve my problem when the cache is created by a HotRod
 server. I would like to extend it to work like this by default, e.g.
 by using the transport's nodeName as the seed.

 I think there is a check in place already so that the joiner won't
 push stale data from its cache store to the other nodes, but we should
 also discard the keys that don't map locally or we'll have stale data
 (since we don't have a way to check if those keys are stale and
 register to receive invalidations for those keys).

 +1, only for shared cache stores.


 What do you think, should I discard the non-local keys with the fix
 for ISPN-1470 or should I let them be and warn the user about
 potentially stale data?

 Discard only for shared cache stores.

 Cache configurations should be symmetrical, so if other nodes preload, 
 they'll preload only data local to them with your change.


 Discarding works fine from the correctness POV, but for performance
 it's not that great: we may do a lot of work to preload keys and have
 nothing to show for it at the end.

Can't you just skip loading state and be happy with the state you
receive from peers? More data will be lazily loaded.
Applying of course only when you're not the only/first node in the
grid, in which case you have to load.

The only alternative I see is to be able to find the boundaries of
keys you own, and change the CacheLoader API to load keys by the
identified range - should work with multiple boundaries too for
virtualnodes, but this is something that not all CacheLoaders will be
able to implement, so it should be an optional API; for now I'd stick
with the first option above as I don't see how we can be more
efficient in loading the state from CacheLoaders than via JGroups.
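
A sketch of how such an optional API might look, purely as an assumption
about its shape: Loader, RangeLoader, Range and preload are hypothetical
names and do not match the actual CacheLoader interface. Loaders that cannot
query by range fall back to loading everything and filtering, and a range
whose start is greater than its end wraps around the wheel, which also
covers the multiple boundaries you get with virtual nodes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToIntFunction;

// Hypothetical sketch, not the Infinispan CacheLoader API.
final class RangePreload {
    /** Half-open range [start, end) on the hash wheel; start > end means it wraps around. */
    static final class Range {
        final int start;
        final int end;
        Range(int start, int end) {
            this.start = start;
            this.end = end;
        }
        boolean contains(int h) {
            return start <= end ? (h >= start && h < end) : (h >= start || h < end);
        }
    }

    /** Baseline capability that every loader has. */
    interface Loader {
        List<String> loadAllKeys();
    }

    /** Optional capability for loaders that can query by hash range. */
    interface RangeLoader extends Loader {
        List<String> loadKeys(List<Range> ranges);
    }

    static List<String> preload(Loader loader, List<Range> owned, ToIntFunction<String> hash) {
        if (loader instanceof RangeLoader) {
            // The store does the filtering itself, so nothing extra is read.
            return ((RangeLoader) loader).loadKeys(owned);
        }
        // Fallback: read everything and keep only the keys in an owned range.
        List<String> result = new ArrayList<>();
        for (String key : loader.loadAllKeys()) {
            int h = hash.applyAsInt(key);
            if (owned.stream().anyMatch(range -> range.contains(h))) {
                result.add(key);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Loader plain = () -> List.of("a", "b", "c", "z");
        List<Range> owned = List.of(new Range(97, 99), new Range(120, 10));
        System.out.println(preload(plain, owned, key -> key.charAt(0)));
    }
}
```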

Sanne


Re: [infinispan-dev] Preloading from disk versus state transfer Re: ISPN-1384 - InboundInvocationHandlerImpl should wait for cache to be started? (not just defined)

2011-10-04 Thread Mircea Markus

On 3 Oct 2011, at 01:09, Manik Surtani wrote:

 
 On 28 Sep 2011, at 10:56, Dan Berindei wrote:
 
 I'm not sure if the comment is valid though, since the old
 StateTransferManager had priority 55 and it also cleared the data
 container before applying the state from the coordinator. I'm not sure
 how preloading and state transfer are supposed to interact, maybe
 Manik can help clear this up?
 
 Hmm - this is interesting.  I think preloading should happen first, since the 
 cache store may contain old data.

I can't find Dan's original email - was it sent to the entire list?

I don't get the entire context, but I don't think preloading *first* would 
resolve the consistency problem in the case of deletions: what if you preload 
something that was deleted from memory in the meantime?