[infinispan-dev] Local state transfer before going over network
Not sure if the idea has come up before, but while at GeeCON last week I was discussing with one of the attendees about state transfer improvements in replicated environments.

The idea is that in a replicated environment, if a cache manager shuts down, it would dump its memory contents to a cache store (i.e. a local filesystem) and when it starts up, instead of going over the network to do state transfer, it would load the state from the local filesystem, which would be much quicker. Obviously, at times the cache manager would crash or have some failure dumping the memory contents, so in that case it would fall back on state transfer over the network. I think it's an interesting idea since it could reduce the amount of state transfer to be done. It's true though that there are other tricks if you're having issues with state transfer, such as the use of a cluster cache loader.

WDYT?

--
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache

___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev
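The dump-on-shutdown/load-on-startup flow Galder describes could be sketched roughly like this. All names here (`LocalStateStore`, `fetchStateOverNetwork`, the use of plain Java serialization) are illustrative assumptions, not actual Infinispan APIs:

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the proposed shutdown/startup flow.
public class LocalStateStore {
    private final Path dumpFile;

    public LocalStateStore(Path dumpFile) {
        this.dumpFile = dumpFile;
    }

    // On clean shutdown, dump in-memory contents to the local filesystem.
    public void dumpOnShutdown(Map<String, String> cacheContents) {
        try (ObjectOutputStream out =
                 new ObjectOutputStream(Files.newOutputStream(dumpFile))) {
            out.writeObject(new HashMap<>(cacheContents));
        } catch (IOException e) {
            // A failed dump is tolerable: startup falls back to network transfer.
        }
    }

    // On startup, prefer the local dump; fall back to network state transfer
    // if the dump is missing or unreadable (e.g. the node crashed mid-dump).
    @SuppressWarnings("unchecked")
    public Map<String, String> loadOnStartup() {
        try (ObjectInputStream in =
                 new ObjectInputStream(Files.newInputStream(dumpFile))) {
            return (Map<String, String>) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            return fetchStateOverNetwork();
        }
    }

    // Placeholder for the existing network state-transfer path.
    private Map<String, String> fetchStateOverNetwork() {
        return new HashMap<>();
    }
}
```

The fallback lives entirely in `loadOnStartup()`: any failure to read the dump simply degrades to the network path, so a crashed or partial dump never leaves the node without state.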
Re: [infinispan-dev] Local state transfer before going over network
2011/5/16 Galder Zamarreño:
> The idea is that in a replicated environment, if a cache manager shuts down,
> it would dump its memory contents to a cache store (i.e. a local filesystem)
> and when it starts up, instead of going over the network to do state
> transfer, it would load the state from the local filesystem [...]
>
> WDYT?

Well, if it's a shared cache store, then we're using the network at some level anyway. If we're talking about a non-shared cache store, how do you know which keys/values are still valid and which were not updated? And what about new keys?

I like the concept though; let's explore more in this direction.

Sanne
Re: [infinispan-dev] Local state transfer before going over network
On 16 May 2011, at 12:18, Sanne Grinovero wrote:
> Well, if it's a shared cache store, then we're using the network at some
> level anyway. If we're talking about a non-shared cache store, how do you
> know which keys/values are still valid and which were not updated? And
> what about new keys?
>
> I like the concept though; let's explore more in this direction.

+1. During startup the resurrected node can load some state from the local store and the remaining entries/deltas through state transfer. That's if the deltas are determinable - perhaps with some Merkle trees, or by the application. This problem sounds similar to solving a merge.
Re: [infinispan-dev] Local state transfer before going over network
On May 16, 2011, at 1:18 PM, Sanne Grinovero wrote:
> Well, if it's a shared cache store, then we're using the network at some
> level anyway. If we're talking about a non-shared cache store, how do you
> know which keys/values are still valid and which were not updated? And
> what about new keys?

I see this only being useful with a local cache store, cos if you need to go remote over the network, you might as well just do state transfer.

Not sure if the timestamp of creation/update is available for all entries (I'd need to check the code, but maybe immortals do not store it...), but assuming that a timestamp was stored in the local cache store, on startup the node could send this timestamp and the coordinator could send anything new created/updated after that timestamp.

This would be particularly efficient in situations where you have to quickly restart a machine for whatever reason and so the deltas are very small, or when the caches are big and state transfer would cost a lot from a bandwidth perspective.
--
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache
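The timestamp-based delta scheme Galder describes above could be sketched roughly as follows. `TimestampedValue` and `computeDelta` are illustrative names for this sketch, not Infinispan APIs:

```java
import java.util.HashMap;
import java.util.Map;

// Coordinator-side delta computation for the timestamp idea.
public class DeltaTransfer {
    public static final class TimestampedValue {
        final String value;
        final long lastModified; // wall-clock millis of the create/update

        public TimestampedValue(String value, long lastModified) {
            this.value = value;
            this.lastModified = lastModified;
        }
    }

    // The restarting node sends the highest timestamp in its local store;
    // the coordinator ships back only entries touched after that point.
    public static Map<String, TimestampedValue> computeDelta(
            Map<String, TimestampedValue> replicatedState,
            long joinerHighestTimestamp) {
        Map<String, TimestampedValue> delta = new HashMap<>();
        for (Map.Entry<String, TimestampedValue> e : replicatedState.entrySet()) {
            if (e.getValue().lastModified > joinerHighestTimestamp) {
                delta.put(e.getKey(), e.getValue());
            }
        }
        return delta;
    }
}
```

Note this relies on reasonably synchronized clocks between the restarting node and the coordinator; the vector-clock variant discussed later in the thread avoids that assumption.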
Re: [infinispan-dev] Local state transfer before going over network
2011/5/17 Galder Zamarreño:
> I see this only being useful with a local cache store, cos if you need to
> go remote over the network, you might as well just do state transfer.

+1

> Not sure if the timestamp of creation/update is available for all entries,
> but assuming that a timestamp was stored in the local cache store, on
> startup the node could send this timestamp and the coordinator could send
> anything new created/updated after that timestamp.

This means we'll need an API on the cache stores to return the "highest timestamp"; some, like the JDBC cache loader, could implement that with a single query.
Not sure how you would handle deleted entries; the other nodes would need to keep a list of deleted keys with timestamps. Maybe there could be an option to never delete keys from a cache loader, only values - and record the timestamp of the operation.

> This would be particularly efficient in situations where you have to
> quickly restart a machine for whatever reason and so the deltas are very
> small, or when the caches are big and state transfer would cost a lot from
> a bandwidth perspective.

Super; this would be quite useful in the Lucene case, as I can actually figure out which keys should be deleted by inferring which ones are "obsolete" from the common metadata (a known key contains this information); and indeed startup time is a point I'd like to improve.

Sanne
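Sanne's tombstone idea above - never dropping keys on delete, only nulling the value and recording when it happened - could look something like this sketch. The names (`TombstoneStore`, `deletedSince`) are hypothetical, not an Infinispan SPI:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical store that keeps deleted keys as tombstones so a restarting
// node can learn about removals newer than its local state.
public class TombstoneStore {
    public static final class Entry {
        final String value;   // null means "deleted"
        final long timestamp; // when the write/delete happened

        Entry(String value, long timestamp) {
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    private final Map<String, Entry> store = new HashMap<>();

    public void put(String key, String value, long ts) {
        store.put(key, new Entry(value, ts));
    }

    // Delete keeps the key as a tombstone instead of removing it.
    public void delete(String key, long ts) {
        store.put(key, new Entry(null, ts));
    }

    // Keys deleted after the given timestamp; a joiner applies these removals
    // on top of the state it reloaded from its local store.
    public Set<String> deletedSince(long ts) {
        Set<String> deleted = new HashSet<>();
        for (Map.Entry<String, Entry> e : store.entrySet()) {
            if (e.getValue().value == null && e.getValue().timestamp > ts) {
                deleted.add(e.getKey());
            }
        }
        return deleted;
    }
}
```

A real implementation would also need to garbage-collect tombstones eventually, otherwise the store grows without bound.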
Re: [infinispan-dev] Local state transfer before going over network
Interesting discussions. Another approach may be to version data using Lamport clocks or vector clocks. Then at the start of a rehash, a digest of keys and versions can be pushed, and the receiver 'decides' which keys are out of date and need to be pulled from across the network.

On 17 May 2011, at 12:06, Sanne Grinovero wrote:
> This means we'll need an API on the cache stores to return the "highest
> timestamp"; some, like the JDBC cache loader, could implement that with a
> single query.
--
Manik Surtani
ma...@jboss.org
twitter.com/maniksurtani

Lead, Infinispan
http://www.infinispan.org
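Manik's digest scheme above could be sketched as a receiver-side comparison: existing owners push `{key: version}` pairs and the receiver decides which keys are stale locally. `VersionDigest.staleKeys` is an illustrative name, not an Infinispan API:

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Receiver-side digest comparison: given the remote {key: version} digest,
// decide which keys must be pulled over the network.
public class VersionDigest {
    public static Set<String> staleKeys(Map<String, Long> remoteDigest,
                                        Map<String, Long> localVersions) {
        Set<String> toPull = new HashSet<>();
        for (Map.Entry<String, Long> e : remoteDigest.entrySet()) {
            Long localVersion = localVersions.get(e.getKey());
            // Pull keys we have never seen, or hold an older version of.
            if (localVersion == null || localVersion < e.getValue()) {
                toPull.add(e.getKey());
            }
        }
        return toPull;
    }
}
```

With Lamport or vector clocks the comparison would be on clock values rather than plain longs, but the shape of the exchange - digest push, then a targeted pull of only the stale values - stays the same.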
Re: [infinispan-dev] Local state transfer before going over network
This is exactly what JGroups digests do.

On 05/17/2011 10:38 AM, Manik Surtani wrote:
> Interesting discussions. Another approach may be to version data using
> Lamport clocks or vector clocks. Then at the start of a rehash, a digest
> of keys and versions can be pushed, and the receiver 'decides' which keys
> are out of date and need to be pulled from across the network.
--
Bela Ban
Lead JGroups / Clustering Team
JBoss
Re: [infinispan-dev] Local state transfer before going over network
I'll create a wiki to compile the details/ideas of this discussion so that we can hash this out further.

On May 18, 2011, at 1:13 AM, Bela Ban wrote:
> This is exactly what JGroups digests do.
--
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache
Re: [infinispan-dev] Local state transfer before going over network
On 18 May 2011, at 09:40, Galder Zamarreño wrote:
> I'll create a wiki to compile the details/ideas of this discussion so that
> we can hash this out further.

+1. I think this can be a very good optimisation, and further still, even if rehashing needs to happen, the logic should be able to pick "preferred" data owners - e.g., one on the same physical machine or at least the same rack, as opposed to, say, one across a WAN. I believe right now it is just the primary data owner for each entry that does the pushing.

- Manik

--
Manik Surtani
ma...@jboss.org
twitter.com/maniksurtani

Lead, Infinispan
http://www.infinispan.org
Re: [infinispan-dev] Local state transfer before going over network
As someone mentioned, the biggest issue would be to make sure the data read from the file system isn't stale. If a node has been down for an extended period of time, then this process might slow things down.

We'd have to implement some rsync-like algorithm, which checks the local data against the cluster data, and this might be costly if the data set is small. If it's big, then that cost would be amortized over the smaller deltas sent over the network to update a local cache.

I don't think this makes sense, as (1) data sets in replicated mode are usually small and (2) Infinispan's focus is on distributed data.

On 5/17/11 11:25 AM, Galder Zamarreño wrote:
> I see this only being useful with a local cache store, cos if you need to
> go remote over the network, you might as well just do state transfer.
--
Bela Ban
Lead JGroups / Clustering Team
JBoss
Re: [infinispan-dev] Local state transfer before going over network
On 19 May 2011, at 09:51, Bela Ban wrote:
> As someone mentioned, the biggest issue would be to make sure the data
> read from the file system isn't stale. [...]
> I don't think this makes sense, as (1) data sets in replicated mode are
> usually small and (2) Infinispan's focus is on distributed data.

I think in both cases (repl and dist) it may still make sense in some cases. E.g., in dist, if a node joins, existing owners could, rather than push data to the joiner, just push a list of {key: version} tuples, which may be significantly smaller than the values. The joiner can then load stuff from a cache loader based on key/version - we'd need a new API on the CacheLoader, like load(Set keys) - this can be implemented pretty efficiently in many cache stores such as JDBC. The keys that the cache loader doesn't retrieve would need to be pulled back across the network.

Certainly not high prio, but something to think about for Infinispan.next().

Cheers
Manik

--
Manik Surtani
ma...@jboss.org
twitter.com/maniksurtani

Lead, Infinispan
http://www.infinispan.org
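The joiner-side split Manik proposes - a batch load from the local store, then a network pull of only the misses - might look like this. The `BatchCacheLoader` interface is a hypothetical sketch of the proposed `load(Set keys)` API, not the current Infinispan SPI:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Sketch of a joiner loading from a batch-capable cache loader first,
// then computing which keys still need to come over the network.
public class JoinerLoad {
    // Hypothetical batch-load API; returns only the keys it actually has.
    public interface BatchCacheLoader {
        Map<String, String> load(Set<String> keys);
    }

    // Fills 'target' with locally loadable entries and returns the keys
    // that must still be fetched from other nodes.
    public static Set<String> loadLocallyThenRemote(Set<String> wanted,
                                                    BatchCacheLoader loader,
                                                    Map<String, String> target) {
        Map<String, String> found = loader.load(wanted);
        target.putAll(found);
        Set<String> missing = new HashSet<>(wanted);
        missing.removeAll(found.keySet());
        return missing;
    }
}
```

A JDBC-backed loader could satisfy `load(Set keys)` with a single `SELECT ... WHERE key IN (...)` query, which is why that store implements this pattern so efficiently.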
Re: [infinispan-dev] Local state transfer before going over network
On 5/19/11 12:02 PM, Manik Surtani wrote:
> I think in both cases (repl and dist) it may still make sense in some
> cases. E.g., in dist, if a node joins, existing owners could, rather than
> push data to the joiner, just push a list of {key: version} tuples, which
> may be significantly smaller than the values.

How does it know which keys to send? It doesn't know the joiner's local data, so it would have to do a key-by-key comparison of the joiner's local data with its own data, akin to what rsync does. This only makes sense if the data to be shipped to the joiner is large.

--
Bela Ban
Lead JGroups / Clustering Team
JBoss
Re: [infinispan-dev] Local state transfer before going over network
On 19 May 2011, at 11:14, Bela Ban wrote:
> How does it know which keys to send? It doesn't know the joiner's local
> data, so it would have to do a key-by-key comparison of the joiner's local
> data with its own data, akin to what rsync does. This only makes sense if
> the data to be shipped to the joiner is large.

Yes, it needs to be a configurable option. E.g., if you are storing stock prices keyed on ticker symbol/timestamp, this isn't worth it. If you are storing DVDs keyed on title, it certainly is. :)

--
Manik Surtani
ma...@jboss.org
twitter.com/maniksurtani

Lead, Infinispan
http://www.infinispan.org
Re: [infinispan-dev] Local state transfer before going over network
On May 18, 2011, at 11:55 AM, Manik Surtani wrote:
> +1. I think this can be a very good optimisation, and further still, even
> if rehashing needs to happen, the logic should be able to pick "preferred"
> data owners - e.g., one on the same physical machine or at least the same
> rack, as opposed to, say, one across a WAN.
>
> I believe right now it is just the primary data owner for each entry that
> does the pushing.

The primary data owner could still do the initial key+version digest push, and then the node decides where to get the value part from, which is the part that's potentially big.

--
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache
Re: [infinispan-dev] Local state transfer before going over network
I've just created http://community.jboss.org/docs/DOC-16815

The document title is temporary and I'm open to suggestions for different names. Feel free to add more thoughts/comments to it. I'll add a link from the main wiki now.

On May 19, 2011, at 12:18 PM, Manik Surtani wrote:

> On 19 May 2011, at 11:14, Bela Ban wrote:
>
>> On 5/19/11 12:02 PM, Manik Surtani wrote:
>>
>>>> I don't think this makes sense as (1) data sets in replicated mode are
>>>> usually small and (2) Infinispan's focus is on distributed data.
>>>
>>> I think in both cases (repl and dist) it still may make sense in some
>>> cases. E.g., in dist, if a node joins, existing owners could, rather than
>>> push data to the joiner, just push a list of {key: version} tuples, which
>>> may be significantly smaller than the values.
>>
>> How does it know which keys to send? It doesn't know the joiner's local
>> data, so it would have to do a key-by-key comparison of the joiner's
>> local data with its own data, akin to what rsync does. This only makes
>> sense if the data to be shipped to the joiner is large.
>
> Yes, it needs to be a configurable option. E.g., if you are storing stock
> prices keyed on ticker symbol/timestamp, this isn't worth it.
>
> If you are storing DVDs keyed on title, it certainly is. :)

--
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache
Re: [infinispan-dev] Local state transfer before going over network
> I think in both cases (repl and dist) it still may make sense in some cases.
> E.g., in dist, if a node joins, existing owners could, rather than push data
> to the joiner, just push a list of {key: version} tuples, which may be
> significantly smaller than the values. The joiner can then load stuff from a
> cache loader based on key/version - we'd need a new API on the CacheLoader,
> like load(Set keys) - this can be implemented pretty
> efficiently in many cache stores such as JDBC. The keys that the cache
> loader doesn't retrieve would need to be pulled back across the network.

I don't know a lot about the subject, but for comparing state efficiently Merkle trees seem to be heavily used [1].

> Certainly not high prio, but something to think about for Infinispan.next().

+1.

[1] http://en.wikipedia.org/wiki/Hash_tree
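A rough sketch of what the proposed bulk-load extension might look like. The interface and class names here are invented for illustration; the real CacheLoader SPI has no such method, and a JDBC-backed store could implement the batch load as a single `SELECT ... WHERE key IN (...)`:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical bulk-load contract: return whatever subset of 'keys' the
// local store can supply; everything else must come over the network.
interface BulkCacheLoader {
    Map<String, String> load(Set<String> keys);
}

// Trivial in-memory store standing in for e.g. a JDBC-backed cache loader.
final class MapBackedLoader implements BulkCacheLoader {
    private final Map<String, String> store;
    MapBackedLoader(Map<String, String> store) { this.store = store; }

    @Override
    public Map<String, String> load(Set<String> keys) {
        Map<String, String> found = new HashMap<>();
        for (String k : keys)
            if (store.containsKey(k)) found.put(k, store.get(k));
        return found;
    }
}

public final class BulkLoadDemo {
    // Keys the loader could not satisfy locally must be pulled remotely.
    public static Set<String> remoteKeys(BulkCacheLoader loader, Set<String> wanted) {
        Set<String> remaining = new HashSet<>(wanted);
        remaining.removeAll(loader.load(wanted).keySet());
        return remaining;
    }

    public static void main(String[] args) {
        BulkCacheLoader loader = new MapBackedLoader(Map.of("a", "1", "b", "2"));
        // "a" and "b" come from the local store; only "c" goes over the wire.
        System.out.println(remoteKeys(loader, new HashSet<>(Set.of("a", "b", "c"))));
    }
}
```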
Re: [infinispan-dev] Local state transfer before going over network
On 19 May 2011, at 11:14, Bela Ban wrote:

> On 5/19/11 12:02 PM, Manik Surtani wrote:
>
>>> I don't think this makes sense as (1) data sets in replicated mode are
>>> usually small and (2) Infinispan's focus is on distributed data.
>>
>> I think in both cases (repl and dist) it still may make sense in some cases.
>> E.g., in dist, if a node joins, existing owners could, rather than push
>> data to the joiner, just push a list of {key: version} tuples, which may be
>> significantly smaller than the values.
>
> How does it know which keys to send? It doesn't know the joiner's local
> data, so it would have to do a key-by-key comparison of the joiner's
> local data with its own data, akin to what rsync does.

Merkle trees would do much better than key-by-key comparison.

> This only makes
> sense if the data to be shipped to the joiner is large.
>
> --
> Bela Ban
> Lead JGroups / Clustering Team
> JBoss
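A toy illustration of why a hash tree beats key-by-key comparison: compare the roots first, then descend only into subtrees that differ, so identical regions of the key space cost O(1) to skip. This is not tied to any Infinispan internals, and the "hash" is a stand-in; a real implementation would hash serialized bucket contents with e.g. SHA-256.

```java
import java.util.ArrayList;
import java.util.List;

// Toy Merkle-style comparison over fixed key buckets -- purely illustrative.
// Each leaf is the hash of one bucket's contents; a parent combines its
// children. If the roots match, no per-key comparison is needed at all.
public final class MerkleCompare {

    // Stand-in hash combiner (a real tree would use a cryptographic hash).
    static long combine(long left, long right) {
        return left * 1_000_003L + right;
    }

    // Build the tree bottom-up; leafHashes.length is assumed a power of two.
    // Result: level 0 = leaves, last level = { root }.
    static long[][] build(long[] leafHashes) {
        List<long[]> levels = new ArrayList<>();
        long[] level = leafHashes.clone();
        levels.add(level);
        while (level.length > 1) {
            long[] parent = new long[level.length / 2];
            for (int i = 0; i < parent.length; i++)
                parent[i] = combine(level[2 * i], level[2 * i + 1]);
            level = parent;
            levels.add(level);
        }
        return levels.toArray(new long[0][]);
    }

    // Walk both trees from the root, collecting bucket indices that differ.
    static List<Integer> diffBuckets(long[][] a, long[][] b) {
        List<Integer> diffs = new ArrayList<>();
        walk(a, b, a.length - 1, 0, diffs);
        return diffs;
    }

    private static void walk(long[][] a, long[][] b, int level, int idx,
                             List<Integer> out) {
        if (a[level][idx] == b[level][idx]) return;  // identical subtree: skip
        if (level == 0) { out.add(idx); return; }
        walk(a, b, level - 1, 2 * idx, out);
        walk(a, b, level - 1, 2 * idx + 1, out);
    }

    public static void main(String[] args) {
        long[] owner  = {11, 22, 33, 44};
        long[] joiner = {11, 22, 99, 44};   // only bucket 2 differs
        System.out.println(diffBuckets(build(owner), build(joiner)));  // [2]
    }
}
```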
Re: [infinispan-dev] Local state transfer before going over network
On 24 May 2011, at 11:33, Mircea Markus wrote:

> Merkle trees would do much better than key-by-key comparison.

+1.

Is there a JIRA for this? If anyone wants to take this on, I would target it at 6.0 though.

--
Manik Surtani
ma...@jboss.org
twitter.com/maniksurtani

Lead, Infinispan
http://www.infinispan.org
Re: [infinispan-dev] Local state transfer before going over network
On Jun 8, 2011, at 2:02 PM, Manik Surtani wrote:

> On 24 May 2011, at 11:33, Mircea Markus wrote:
>
>> Merkle trees would do much better than key-by-key comparison.
>
> +1.
>
> Is there a JIRA for this? If anyone wants to take this on, I would target
> it at 6.0 though.

I'll take it: https://issues.jboss.org/browse/ISPN-1165

--
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache