[infinispan-dev] What's master?
Hello all, I couldn't find a 5.1 branch, and the current master builds produce snapshot versions of 5.1. Should we change this to 5.2.0-SNAPSHOT or are we expecting some work as 5.1.1-SNAPSHOT? ___ infinispan-dev mailing list infinispan-dev@lists.jboss.org https://lists.jboss.org/mailman/listinfo/infinispan-dev
Re: [infinispan-dev] DIST.retrieveFromRemoteSource
On 25 January 2012 17:09, Dan Berindei wrote: > On Wed, Jan 25, 2012 at 4:22 PM, Mircea Markus > wrote: >> >> One node might be busy doing GC and stay unresponsive for a whole >> >> second or longer, another one might be actually crashed and you didn't >> >> know that yet, these are unlikely but possible. >> >> All these are possible but I would rather consider them as exceptional >> situations, possibly handled by a retry logic. We should *not* optimise for >> that these situations IMO. >> > > As Sanne pointed out, an exceptional situation on a node becomes > ordinary with 100s or 1000s of nodes. > So the default policy should scale the initial number of requests with > numOwners. > >> >> More likely, a rehash is in progress, you could then be asking a node >> >> which doesn't yet (or anymore) have the value. >> >> >> this is a consistency issue and I think we can find a way to handle it some >> other way. >> > > With the current state transfer we always send ClusteredGetCommands to > the old owners (and only the old owners). If a node didn't receive the > entire state, it means that state transfer hasn't finished yet and the > CH will not return it as an owner. But the CH could also return owners > that are no longer members of the cluster, so we have to check for > that before picking one owner to send the command to. > > In Sanne's non-blocking state transfer proposal I think a new owner > may have to ask the old owner for the key value, so it would still > never return null. But it might be less expensive to ask the old owner > directly (assuming it's safe from a consistency POV). > >> >> All good reasons for which imho it makes sense to send out "a couple" >> >> of requests in parallel, but I'd unlikely want to send more than 2, >> >> and I agree often 1 might be enough. >> >> Maybe it should even optimize for the most common case: send out just >> >> one, have a more aggressive timeout and in case of trouble ask for the >> >> next node. 
>> >> +1 >> > > -1 for aggressive timeouts... you're going to do the same work as you > do now, except you're going to wait a bit between sending requests. If > you're really unlucky the first target will return first but you'll > ignore its response because the timeout already expired. Agreed, what I meant by "more aggressive timeouts" is not the overall timeout that fails the get: we would have two timeouts, one for the whole operation, and a second, more aggressive one that decides how long to wait for a single GET RPC before also asking the next node, i.e. when the first request is "starting to not look good". So even if the global timeout is something high like "10 seconds", if after 40 ms I still didn't get a reply from the first node we can start sending to the next one... but still wait to eventually get an answer from the first. > >> >> In addition, sending a single request might spare us some Future, >> >> await+notify messing in terms of CPU cost of sending the request. >> >> it's the remote OOB thread that's the most costly resource imo. >> > > I don't think the OOB thread is that costly, it doesn't block on > anything (not even on state transfer!) so the most expensive part is > reading the key and writing the value. BTW Sanne, we may want to run > Transactional with a smaller payload size ;) > > We could implement our own GroupRequest that sends the requests in > parallel instead implementing FutureCollator on top of UnicastRequest > and save some of that overhead on the caller. > > I think we already have a JIRA to make PutKeyValueCommands return the > previous value, that would eliminate lots of GetKeyValueCommands and > it would actually improve the performance of puts - we should probably > make this a priority. +1 !! > >> >> I think I agree on all points, it makes more sense. 
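The two-timeout scheme Sanne describes (a global timeout for the whole operation plus a short "stagger" delay before involving the next owner) could be sketched roughly like this. All names here are hypothetical, not Infinispan API; `remoteGet` stands in for a unicast GET RPC:

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Function;

// Illustrative sketch: wait on the first owner only for the stagger delay;
// if it hasn't answered, also ask the next owner, while still accepting
// whichever reply arrives first within the global timeout.
public class StaggeredGet {
    static <V> V get(List<String> owners,
                     Function<String, CompletableFuture<V>> remoteGet,
                     long staggerMillis,
                     long globalTimeoutMillis) throws Exception {
        CompletableFuture<V> winner = new CompletableFuture<>();
        CompletableFuture<V> first = remoteGet.apply(owners.get(0));
        first.thenAccept(winner::complete);
        try {
            // Happy path: the first owner answers within the stagger delay.
            return first.get(staggerMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException slow) {
            // First owner is "starting to not look good": involve the next
            // owner, but keep waiting on the first request too.
            remoteGet.apply(owners.get(1)).thenAccept(winner::complete);
            return winner.get(globalTimeoutMillis, TimeUnit.MILLISECONDS);
        }
    }
}
```

With staggerMillis = 40 and globalTimeoutMillis = 10_000 this matches the numbers above: the second request goes out after 40 ms, but a late reply from the first node can still win.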
>> Just that in a large cluster, let's say >> 1000 nodes, maybe I want 20 owners as a sweet spot for read/write >> performance tradeoff, and with such high numbers I guess doing 2-3 >> gets in parallel might make sense as those "unlikely" events, suddenly >> are an almost certain.. especially the rehash in progress. >> >> So I'd propose a separate configuration option for # parallel get >> events, and one to define a "try next node" policy. Or this policy >> should be the whole strategy, and the #gets one of the options for the >> default implementation. >> >> Agreed that having a configurable remote get policy makes sense. >> We already have a JIRA for this[1], I'll start working on it as the >> performance results are hunting me. > > I'd rather focus on implementing one remote get policy that works > instead of making it configurable - even if we make it configurable > we'll have to focus our optimizations on the default policy. > > Keep in mind that we also want to introduce eventual consistency - I > think that's going to eliminate our optimization opportunity here > because we'll need to get the values from a majority of owners (if not all the owners).
Re: [infinispan-dev] DIST.retrieveFromRemoteSource
On 25 Jan 2012, at 17:09, Dan Berindei wrote: > On Wed, Jan 25, 2012 at 4:22 PM, Mircea Markus > wrote: >> >> One node might be busy doing GC and stay unresponsive for a whole >> >> second or longer, another one might be actually crashed and you didn't >> >> know that yet, these are unlikely but possible. >> >> All these are possible but I would rather consider them as exceptional >> situations, possibly handled by a retry logic. We should *not* optimise for >> that these situations IMO. >> > > As Sanne pointed out, an exceptional situation on a node becomes > ordinary with 100s or 1000s of nodes. possible, but still that's not our everyday use case yet. For the *default* value I'd rather consider a 4-20 node cluster. The idea is to have numGets configurable, or even dynamic. > So the default policy should scale the initial number of requests with > numOwners. Not sure what you mean by that. As you mention, there might be a correlation between the number of nodes to which to send the remote get and the cluster size. > >> >> More likely, a rehash is in progress, you could then be asking a node >> >> which doesn't yet (or anymore) have the value. >> >> >> this is a consistency issue and I think we can find a way to handle it some >> other way. >> > > With the current state transfer we always send ClusteredGetCommands to > the old owners (and only the old owners). If a node didn't receive the > entire state, it means that state transfer hasn't finished yet and the > CH will not return it as an owner. But the CH could also return owners > that are no longer members of the cluster, so we have to check for > that before picking one owner to send the command to. > > In Sanne's non-blocking state transfer proposal I think a new owner > may have to ask the old owner for the key value, so it would still > never return null. But it might be less expensive to ask the old owner > directly (assuming it's safe from a consistency POV). 
> >> >> All good reasons for which imho it makes sense to send out "a couple" >> >> of requests in parallel, but I'd unlikely want to send more than 2, >> >> and I agree often 1 might be enough. >> >> Maybe it should even optimize for the most common case: send out just >> >> one, have a more aggressive timeout and in case of trouble ask for the >> >> next node. >> >> +1 >> > > -1 for aggressive timeouts... you're going to do the same work as you > do now, except you're going to wait a bit between sending requests. If > you're really unlucky the first target will return first but you'll > ignore its response because the timeout already expired. > >> >> In addition, sending a single request might spare us some Future, >> >> await+notify messing in terms of CPU cost of sending the request. >> >> it's the remote OOB thread that's the most costly resource imo. >> > > I don't think the OOB thread is that costly, it doesn't block on > anything (not even on state transfer!) so the most expensive part is > reading the key and writing the value. BTW Sanne, we may want to run > Transactional with a smaller payload size ;) Yes, besides using the OOB pool unnecessarily, other resources are also consumed. Not sure I agree that OOB thread usage is not costly: this pool is also used for releasing locks and exhausting it might result in a chained performance degradation. > > We could implement our own GroupRequest that sends the requests in > parallel instead implementing FutureCollator on top of UnicastRequest > and save some of that overhead on the caller. > > I think we already have a JIRA to make PutKeyValueCommands return the > previous value, that would eliminate lots of GetKeyValueCommands and > it would actually improve the performance of puts - we should probably > make this a priority. Not saying that sending requests in parallel doesn't make sense: just questioning whether it makes sense to *always* send them in parallel. 
> >> >> I think I agree on all points, it makes more sense. >> Just that in a large cluster, let's say >> 1000 nodes, maybe I want 20 owners as a sweet spot for read/write >> performance tradeoff, and with such high numbers I guess doing 2-3 >> gets in parallel might make sense as those "unlikely" events, suddenly >> are an almost certain.. especially the rehash in progress. >> >> So I'd propose a separate configuration option for # parallel get >> events, and one to define a "try next node" policy. Or this policy >> should be the whole strategy, and the #gets one of the options for the >> default implementation. >> >> Agreed that having a configurable remote get policy makes sense. >> We already have a JIRA for this[1], I'll start working on it as the >> performance results are hunting me. > > I'd rather focus on implementing one remote get policy that works > instead of making it configurable - even if we make it configurable > we'll have to focus our optimizations on the default policy. This *might* make a significant difference.
Re: [infinispan-dev] DIST.retrieveFromRemoteSource
On Wed, Jan 25, 2012 at 4:22 PM, Mircea Markus wrote: > > One node might be busy doing GC and stay unresponsive for a whole > > second or longer, another one might be actually crashed and you didn't > > know that yet, these are unlikely but possible. > > All these are possible but I would rather consider them as exceptional > situations, possibly handled by a retry logic. We should *not* optimise for > that these situations IMO. > As Sanne pointed out, an exceptional situation on a node becomes ordinary with 100s or 1000s of nodes. So the default policy should scale the initial number of requests with numOwners. > > More likely, a rehash is in progress, you could then be asking a node > > which doesn't yet (or anymore) have the value. > > > this is a consistency issue and I think we can find a way to handle it some > other way. > With the current state transfer we always send ClusteredGetCommands to the old owners (and only the old owners). If a node didn't receive the entire state, it means that state transfer hasn't finished yet and the CH will not return it as an owner. But the CH could also return owners that are no longer members of the cluster, so we have to check for that before picking one owner to send the command to. In Sanne's non-blocking state transfer proposal I think a new owner may have to ask the old owner for the key value, so it would still never return null. But it might be less expensive to ask the old owner directly (assuming it's safe from a consistency POV). > > All good reasons for which imho it makes sense to send out "a couple" > > of requests in parallel, but I'd unlikely want to send more than 2, > > and I agree often 1 might be enough. > > Maybe it should even optimize for the most common case: send out just > > one, have a more aggressive timeout and in case of trouble ask for the > > next node. > > +1 > -1 for aggressive timeouts... 
you're going to do the same work as you do now, except you're going to wait a bit between sending requests. If you're really unlucky the first target will return first but you'll ignore its response because the timeout already expired. > > In addition, sending a single request might spare us some Future, > > await+notify messing in terms of CPU cost of sending the request. > > it's the remote OOB thread that's the most costly resource imo. > I don't think the OOB thread is that costly, it doesn't block on anything (not even on state transfer!) so the most expensive part is reading the key and writing the value. BTW Sanne, we may want to run Transactional with a smaller payload size ;) We could implement our own GroupRequest that sends the requests in parallel instead of implementing FutureCollator on top of UnicastRequest and save some of that overhead on the caller. I think we already have a JIRA to make PutKeyValueCommands return the previous value, that would eliminate lots of GetKeyValueCommands and it would actually improve the performance of puts - we should probably make this a priority. > > I think I agree on all points, it makes more sense. > Just that in a large cluster, let's say > 1000 nodes, maybe I want 20 owners as a sweet spot for read/write > performance tradeoff, and with such high numbers I guess doing 2-3 > gets in parallel might make sense as those "unlikely" events, suddenly > are an almost certain.. especially the rehash in progress. > > So I'd propose a separate configuration option for # parallel get > events, and one to define a "try next node" policy. Or this policy > should be the whole strategy, and the #gets one of the options for the > default implementation. > > Agreed that having a configurable remote get policy makes sense. > We already have a JIRA for this[1], I'll start working on it as the > performance results are hunting me. 
I'd rather focus on implementing one remote get policy that works instead of making it configurable - even if we make it configurable we'll have to focus our optimizations on the default policy. Keep in mind that we also want to introduce eventual consistency - I think that's going to eliminate our optimization opportunity here because we'll need to get the values from a majority of owners (if not all the owners). > I'd like to have Dan's input on this as well first, as he has worked with > remote gets and I still don't know why null results are not considered valid > :) Pre-5.0 during state transfer an owner could return null to mean "I'm not sure", so the caller would ignore it unless every target returned null. That's no longer necessary, but it wasn't broken so I didn't fix it... Cheers Dan > > [1] https://issues.jboss.org/browse/ISPN-825
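The pre-5.0 null-handling rule Dan describes (a null reply means "I'm not sure", and is only accepted as the final answer once every target has replied null) could be sketched like this. Purely illustrative; this is not the actual ClusteredGetResponseFilter code:

```java
import java.util.List;

// Sketch of the "null means not sure" rule: a non-null reply ends the wait
// immediately; an all-null outcome is only conclusive once every targeted
// owner has answered. Names are hypothetical.
public class NullResponsePolicy {
    static <V> boolean isFinal(List<V> repliesSoFar, int expectedReplies) {
        // Any non-null reply is a valid final answer.
        if (repliesSoFar.stream().anyMatch(r -> r != null)) return true;
        // All-null is only conclusive when every target has replied.
        return repliesSoFar.size() == expectedReplies;
    }
}
```

Under this rule a missing target keeps the all-null case inconclusive forever, which is exactly why a key that doesn't exist could hang until the timeout when one owner left the cluster.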
Re: [infinispan-dev] DIST.retrieveFromRemoteSource
>>> >>> One node might be busy doing GC and stay unresponsive for a whole >>> second or longer, another one might be actually crashed and you didn't >>> know that yet, these are unlikely but possible. >> All these are possible but I would rather consider them as exceptional >> situations, possibly handled by a retry logic. We should *not* optimise for >> that these situations IMO. >> Thinking about our last performance results, we have avg 26kgets per >> second. Now with numOwners = 2, these means that each node handles 26k >> *redundant* gets every second: I'm not concerned about the network load, as >> Bela mentioned in a previous mail the network link should not be the >> bottleneck, but there's a huge unnecessary activity in OOB threads which >> should rather be used for releasing locks or whatever needed. On top of >> that, this consuming activity highly encourages GC pauses, as the effort for >> a get is practically numOwners higher than it should be. >> >>> More likely, a rehash is in progress, you could then be asking a node >>> which doesn't yet (or anymore) have the value. >> >> this is a consistency issue and I think we can find a way to handle it some >> other way. >>> >>> All good reasons for which imho it makes sense to send out "a couple" >>> of requests in parallel, but I'd unlikely want to send more than 2, >>> and I agree often 1 might be enough. >>> Maybe it should even optimize for the most common case: send out just >>> one, have a more aggressive timeout and in case of trouble ask for the >>> next node. >> +1 >>> >>> In addition, sending a single request might spare us some Future, >>> await+notify messing in terms of CPU cost of sending the request. >> it's the remote OOB thread that's the most costly resource imo. > > I think I agree on all points, it makes more sense. 
> Just that in a large cluster, let's say > 1000 nodes, maybe I want 20 owners as a sweet spot for read/write > performance tradeoff, and with such high numbers I guess doing 2-3 > gets in parallel might make sense as those "unlikely" events, suddenly > are an almost certain.. especially the rehash in progress. > So I'd propose a separate configuration option for # parallel get > events, and one to define a "try next node" policy. Or this policy > should be the whole strategy, and the #gets one of the options for the > default implementation. Agreed that having a configurable remote get policy makes sense. We already have a JIRA for this[1], I'll start working on it as the performance results are haunting me. I'd like to have Dan's input on this as well first, as he has worked with remote gets and I still don't know why null results are not considered valid :) [1] https://issues.jboss.org/browse/ISPN-825
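The pluggable policy discussed here (how many owners to contact, and a "try next node" fallback as one implementation option) might look something like this. This is only a sketch of the idea, not the ISPN-825 design, and every name is made up:

```java
import java.util.List;

// Hypothetical remote-get policy: given the owners of a key (primary owner
// first), choose which subset to contact in the first round of GET RPCs.
public interface RemoteGetPolicy {
    List<String> targets(List<String> owners);

    // One option from the thread: ask only the main data owner.
    static RemoteGetPolicy primaryOnly() {
        return owners -> owners.subList(0, 1);
    }

    // Another: ask the first maxParallelGets owners in parallel, which may
    // pay off in large clusters where numOwners is high (e.g. 20 of 1000).
    static RemoteGetPolicy parallel(int maxParallelGets) {
        return owners -> owners.subList(0, Math.min(maxParallelGets, owners.size()));
    }
}
```

A "try next node" strategy would then be orthogonal: it decides what to do when the chosen targets fail or time out, rather than how many to ask up front.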
Re: [infinispan-dev] DIST.retrieveFromRemoteSource
On 25 January 2012 13:41, Mircea Markus wrote: > > On 25 Jan 2012, at 13:25, Sanne Grinovero wrote: > >> [cut] I agree, we should not ask all replicas for the same information. Asking only one is the opposite though: I think this should be a configuration option to ask for any value between (1 and numOwner). That's because I understand it might be beneficial to ask to more than one node immediately, >>> why is it more beneficial to ask multiple members than a single one? I >>> guess it doesn't have to do with consistency, as in that case it would be >>> required (vs beneficial). >>> Is it because one of the nodes might reply faster? I'm not that sure that >>> compensates the burden of numOwner-1 additional RPCs, but a benchmark will >>> tell us just that. >> >> One node might be busy doing GC and stay unresponsive for a whole >> second or longer, another one might be actually crashed and you didn't >> know that yet, these are unlikely but possible. > All these are possible but I would rather consider them as exceptional > situations, possibly handled by a retry logic. We should *not* optimise for > that these situations IMO. > Thinking about our last performance results, we have avg 26k gets per > second. Now with numOwners = 2, these means that each node handles 26k > *redundant* gets every second: I'm not concerned about the network load, as > Bela mentioned in a previous mail the network link should not be the > bottleneck, but there's a huge unnecessary activity in OOB threads which > should rather be used for releasing locks or whatever needed. On top of that, > this consuming activity highly encourages GC pauses, as the effort for a get > is practically numOwners higher than it should be. > >> More likely, a rehash is in progress, you could then be asking a node >> which doesn't yet (or anymore) have the value. > > this is a consistency issue and I think we can find a way to handle it some > other way. 
>> >> All good reasons for which imho it makes sense to send out "a couple" >> of requests in parallel, but I'd unlikely want to send more than 2, >> and I agree often 1 might be enough. >> Maybe it should even optimize for the most common case: send out just >> one, have a more aggressive timeout and in case of trouble ask for the >> next node. > +1 >> >> In addition, sending a single request might spare us some Future, >> await+notify messing in terms of CPU cost of sending the request. > it's the remote OOB thread that's the most costly resource imo. I think I agree on all points, it makes more sense. Just that in a large cluster, let's say 1000 nodes, maybe I want 20 owners as a sweet spot for read/write performance tradeoff, and with such high numbers I guess doing 2-3 gets in parallel might make sense as those "unlikely" events suddenly become almost certain... especially the rehash in progress. So I'd propose a separate configuration option for # parallel get events, and one to define a "try next node" policy. Or this policy should be the whole strategy, and the #gets one of the options for the default implementation.
Re: [infinispan-dev] DIST.retrieveFromRemoteSource
On 25 Jan 2012, at 13:25, Sanne Grinovero wrote: > [cut] >>> I agree, we should not ask all replicas for the same information. >>> Asking only one is the opposite though: I think this should be a >>> configuration option to ask for any value between (1 and numOwner). >>> That's because I understand it might be beneficial to ask to more than >>> one node immediately, >> why is it more beneficial to ask multiple members than a single one? I guess >> it doesn't have to do with consistency, as in that case it would be required >> (vs beneficial). >> Is it because one of the nodes might reply faster? I'm not that sure that >> compensates the burden of numOwner-1 additional RPCs, but a benchmark will >> tell us just that. > > One node might be busy doing GC and stay unresponsive for a whole > second or longer, another one might be actually crashed and you didn't > know that yet, these are unlikely but possible. All these are possible but I would rather consider them as exceptional situations, possibly handled by a retry logic. We should *not* optimise for these situations IMO. Thinking about our last performance results, we have avg 26k gets per second. Now with numOwners = 2, this means that each node handles 26k *redundant* gets every second: I'm not concerned about the network load, as Bela mentioned in a previous mail the network link should not be the bottleneck, but there's a huge unnecessary activity in OOB threads which should rather be used for releasing locks or whatever needed. On top of that, this extra activity also encourages GC pauses, as the effort for a get is practically numOwners times higher than it should be. > More likely, a rehash is in progress, you could then be asking a node > which doesn't yet (or anymore) have the value. this is a consistency issue and I think we can find a way to handle it some other way. 
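The redundant-work arithmetic in the paragraph above can be made explicit. This is just back-of-envelope illustration: if every remote get is sent to all numOwners owners but only one reply is used, each get generates (numOwners - 1) redundant RPCs for the OOB pools to absorb:

```java
// Illustrative arithmetic only: redundant GET RPCs per second across the
// cluster when every remote get is sent to all numOwners owners.
public class RedundantGets {
    static long redundantRpcsPerSecond(long getsPerSecond, int numOwners) {
        return getsPerSecond * (numOwners - 1);
    }
}
```

With the thread's numbers, 26k gets/s at numOwners = 2 gives 26k redundant gets/s; at numOwners = 20 (the large-cluster scenario below) the same load would produce 19 times as much redundant work, which is why contacting only a subset matters more there.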
> > All good reasons for which imho it makes sense to send out "a couple" > of requests in parallel, but I'd unlikely want to send more than 2, > and I agree often 1 might be enough. > Maybe it should even optimize for the most common case: send out just > one, have a more aggressive timeout and in case of trouble ask for the > next node. +1 > > In addition, sending a single request might spare us some Future, > await+notify messing in terms of CPU cost of sending the request. it's the remote OOB thread that's the most costly resource imo.
Re: [infinispan-dev] DIST.retrieveFromRemoteSource
[cut] >> I agree, we should not ask all replicas for the same information. >> Asking only one is the opposite though: I think this should be a >> configuration option to ask for any value between (1 and numOwner). >> That's because I understand it might be beneficial to ask to more than >> one node immediately, > why is it more beneficial to ask multiple members than a single one? I guess > it doesn't have to do with consistency, as in that case it would be required > (vs beneficial). > Is it because one of the nodes might reply faster? I'm not that sure that > compensates the burden of numOwner-1 additional RPCs, but a benchmark will > tell us just that. One node might be busy doing GC and stay unresponsive for a whole second or longer, another one might be actually crashed and you didn't know that yet, these are unlikely but possible. More likely, a rehash is in progress, you could then be asking a node which doesn't yet (or anymore) have the value. All good reasons for which imho it makes sense to send out "a couple" of requests in parallel, but I'd unlikely want to send more than 2, and I agree often 1 might be enough. Maybe it should even optimize for the most common case: send out just one, have a more aggressive timeout and in case of trouble ask for the next node. In addition, sending a single request might spare us some Future, await+notify messing in terms of CPU cost of sending the request.
Re: [infinispan-dev] DIST.retrieveFromRemoteSource
On 25 Jan 2012, at 12:06, Sanne Grinovero wrote: > On 25 January 2012 11:48, Mircea Markus wrote: >> >> On 25 Jan 2012, at 08:51, Dan Berindei wrote: >> >>> Hi Sanne >>> >>> On Wed, Jan 25, 2012 at 1:22 AM, Sanne Grinovero >>> wrote: Hello, in the method: org.infinispan.distribution.DistributionManagerImpl.retrieveFromRemoteSource(Object, InvocationContext, boolean) we have: List targets = locate(key); // if any of the recipients has left the cluster since the command was issued, just don't wait for its response targets.retainAll(rpcManager.getTransport().getMembers()); But then then we use ResponseMode.WAIT_FOR_VALID_RESPONSE, which means we're not going to wait for all responses anyway, and I think we might assume to get a reply by a node which actually is in the cluster. So the retainAll method is unneeded and can be removed? I'm wondering, because it's not safe anyway, actually it seems very unlikely to me that just between a locate(key) and the retainAll the view is being changed, so not something we should be relying on anyway. I'd rather assume that such a get method might be checked and eventually dropped by the receiver. >>> >>> The locate method will return a list of owners based on the >>> "committed" cache view, so there is a non-zero probability that one of >>> the owners has already left. >>> >>> If I remember correctly, I added the retainAll call because otherwise >>> ClusteredGetResponseFilter.needMoreResponses() would keep returning >>> true if one of the targets left the cluster. Coupled with the fact >>> that null responses are not considered valid (unless *all* responses >>> are null), this meant that a remote get for a key that doesn't exist >>> would throw a TimeoutException after 15 seconds instead of returning >>> null immediately. >>> >>> We could revisit the decision to make null responses invalid, and then >>> as long as there is still one of the old owners left in the cluster >>> you'll get the null result immediately. 
>> can't we just directly to a single node for getting a remote value? The main >> data owner perhaps. If the node is down we can retry to another node. Going >> to multiple nodes seems like a waist of resources - network in this case. > > I agree, we should not ask all replicas for the same information. > Asking only one is the opposite though: I think this should be a > configuration option to ask for any value between (1 and numOwner). > That's because I understand it might be beneficial to ask to more than > one node immediately, why is it more beneficial to ask multiple members than a single one? I guess it doesn't have to do with consistency, as in that case it would be required (vs beneficial). Is it because one of the nodes might reply faster? I'm not that sure that compensates the burden of numOwner-1 additional RPCs, but a benchmark will tell us just that. > but assuming we have many owners it would be > nice to pick only a subset. > > A second configuration option should care about which strategy the > subset is selected. In case we ask only one node, I'm not sure if the > first node would be the best option. The main data owner would be a good fit if the distribution is even (we can assume that's the case) and all the keys are accessed with the same frequency. The latter is out of our control, though if L1 is enabled (the default) then we can assume that as well. Or we can pick one at random. > > Dan, thank you for your explanation, this makes much more sense now. > Indeed I agree as well that we should revisit the interpretation of > "return null", that's not an unlikely case since every put will run a > get first. I don't see why null responses are not considered valid unless all the responses are null, Dan, can you perhaps comment on this?
Re: [infinispan-dev] DIST.retrieveFromRemoteSource
On 25 Jan 2012, at 12:06, Bela Ban wrote: > > > On 1/25/12 12:58 PM, Mircea Markus wrote: >> >> On 25 Jan 2012, at 09:42, Bela Ban wrote: >> >>> >>> >>> On 1/25/12 9:51 AM, Dan Berindei wrote: >>> Slightly related, I wonder if Manik's comment is still true: if at all possible, try not to use JGroups' ANYCAST for now. Multiple (parallel) UNICASTs are much faster.) Intuitively it shouldn't be true, unicasts+FutureCollator do basically the same thing as anycast+GroupRequest. >>> >>> >>> No, parallel unicasts will be faster, as an anycast to A,B,C sends the >>> unicasts sequentially >> Thanks, very good to know that. >> >> I'm a a bit confused by the jgroups terminology though :-) >> My understanding of the term ANYCAST is that the message is sent to *one* of >> the A,B,C. But from what I read here it is sent to A, B and C - that's what >> I know as MULTICAST. > > > No, here's the definition: > * anycast: message sent to a subset S of members N. The message is sent > to all members in S as sequential unicasts. S <= N > * multicast: cluster-wide message, sent to all members N of a cluster. > This can be done via UDP (IP multicast) or TCP > * IP multicast: the network level datagram packet with a class D address > as destination > * broadcast: IP packet sent to all hosts on a given range same host, > subnet or higher) Thanks for the clarification Bela. I've been using wikipedia[1] as a reference and the terms have a slightly different meaning there. [1] http://en.wikipedia.org/wiki/Anycast#Addressing_methodologies
Re: [infinispan-dev] DIST.retrieveFromRemoteSource
On 25 January 2012 11:48, Mircea Markus wrote: > > On 25 Jan 2012, at 08:51, Dan Berindei wrote: > >> Hi Sanne >> >> On Wed, Jan 25, 2012 at 1:22 AM, Sanne Grinovero >> wrote: >>> Hello, >>> in the method: >>> org.infinispan.distribution.DistributionManagerImpl.retrieveFromRemoteSource(Object, >>> InvocationContext, boolean) >>> >>> we have: >>> >>> List targets = locate(key); >>> // if any of the recipients has left the cluster since the >>> command was issued, just don't wait for its response >>> targets.retainAll(rpcManager.getTransport().getMembers()); >>> >>> But then then we use ResponseMode.WAIT_FOR_VALID_RESPONSE, which means >>> we're not going to wait for all responses anyway, and I think we might >>> assume to get a reply by a node which actually is in the cluster. >>> >>> So the retainAll method is unneeded and can be removed? I'm wondering, >>> because it's not safe anyway, actually it seems very unlikely to me >>> that just between a locate(key) and the retainAll the view is being >>> changed, so not something we should be relying on anyway. >>> I'd rather assume that such a get method might be checked and >>> eventually dropped by the receiver. >>> >> >> The locate method will return a list of owners based on the >> "committed" cache view, so there is a non-zero probability that one of >> the owners has already left. >> >> If I remember correctly, I added the retainAll call because otherwise >> ClusteredGetResponseFilter.needMoreResponses() would keep returning >> true if one of the targets left the cluster. Coupled with the fact >> that null responses are not considered valid (unless *all* responses >> are null), this meant that a remote get for a key that doesn't exist >> would throw a TimeoutException after 15 seconds instead of returning >> null immediately. >> >> We could revisit the decision to make null responses invalid, and then >> as long as there is still one of the old owners left in the cluster >> you'll get the null result immediately. 
> can't we just go directly to a single node for getting a remote value? The main > data owner perhaps. If the node is down we can retry on another node. Going > to multiple nodes seems like a waste of resources - network in this case. I agree, we should not ask all replicas for the same information. Asking only one is the opposite extreme though: I think this should be a configuration option to ask for any value between 1 and numOwners. That's because I understand it might be beneficial to ask more than one node immediately, but assuming we have many owners it would be nice to pick only a subset. A second configuration option should control the strategy by which the subset is selected. In case we ask only one node, I'm not sure if the first node would be the best option. Dan, thank you for your explanation, this makes much more sense now. Indeed I agree as well that we should revisit the interpretation of "return null", that's not an unlikely case since every put will run a get first. ___ infinispan-dev mailing list infinispan-dev@lists.jboss.org https://lists.jboss.org/mailman/listinfo/infinispan-dev
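Sanne's proposal above - query a configurable subset of between 1 and numOwners replicas, with the primary owner as one natural first choice - could be sketched roughly like this. The class and method names are hypothetical, not Infinispan API; taking a prefix of the owner list is just one possible selection strategy.

```java
import java.util.List;

// Hypothetical sketch: choose a subset of the key's owners to send the
// remote get to, rather than all owners or only the primary. The subset
// size would be a configuration option between 1 and numOwners.
public class GetTargetSelector {
    static <T> List<T> selectTargets(List<T> owners, int maxTargets) {
        if (maxTargets < 1 || maxTargets > owners.size()) {
            throw new IllegalArgumentException("maxTargets must be between 1 and numOwners");
        }
        // One possible strategy: primary owner first, then backups in order.
        return owners.subList(0, maxTargets);
    }

    public static void main(String[] args) {
        List<String> owners = List.of("nodeA", "nodeB", "nodeC");
        // Ask only the primary owner:
        System.out.println(selectTargets(owners, 1)); // [nodeA]
        // Ask a subset of two:
        System.out.println(selectTargets(owners, 2)); // [nodeA, nodeB]
    }
}
```

A second knob, as suggested, would swap out the prefix strategy for something else (e.g. random choice among owners, or preferring the least-loaded node).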
Re: [infinispan-dev] DIST.retrieveFromRemoteSource
On 1/25/12 12:58 PM, Mircea Markus wrote: > > On 25 Jan 2012, at 09:42, Bela Ban wrote: > >> >> >> On 1/25/12 9:51 AM, Dan Berindei wrote: >> >>> Slightly related, I wonder if Manik's comment is still true: >>> >>> if at all possible, try not to use JGroups' ANYCAST for now. >>> Multiple (parallel) UNICASTs are much faster. >>> >>> Intuitively it shouldn't be true, unicasts+FutureCollator do basically >>> the same thing as anycast+GroupRequest. >> >> >> No, parallel unicasts will be faster, as an anycast to A,B,C sends the >> unicasts sequentially > Thanks, very good to know that. > > I'm a bit confused by the JGroups terminology though :-) > My understanding of the term ANYCAST is that the message is sent to *one* of > the A,B,C. But from what I read here it is sent to A, B and C - that's what I > know as MULTICAST. No, here's the definition:

* anycast: message sent to a subset S of the members N. The message is sent to all members in S as sequential unicasts. S <= N
* multicast: cluster-wide message, sent to all members N of a cluster. This can be done via UDP (IP multicast) or TCP
* IP multicast: a network-level datagram packet with a class D address as destination
* broadcast: IP packet sent to all hosts on a given range (same host, subnet or higher)

-- Bela Ban Lead JGroups (http://www.jgroups.org) JBoss / Red Hat
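Bela's definitions can be mapped onto target sets in a small sketch. The class and method names are illustrative only, not the JGroups API; the point is just that an anycast addresses a subset S of the members N, while a multicast addresses all of N.

```java
import java.util.List;
import java.util.Set;

// Illustrative model of the delivery scopes defined above (not JGroups API).
public class MessageScope {
    // anycast: delivered to the members of subset S, as sequential unicasts
    static List<String> anycast(List<String> membersN, Set<String> subsetS) {
        return membersN.stream().filter(subsetS::contains).toList();
    }

    // multicast: cluster-wide, every member of N receives the message
    static List<String> multicast(List<String> membersN) {
        return List.copyOf(membersN);
    }

    public static void main(String[] args) {
        List<String> n = List.of("A", "B", "C", "D");
        System.out.println(anycast(n, Set.of("A", "C"))); // [A, C]
        System.out.println(multicast(n)); // [A, B, C, D]
    }
}
```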
Re: [infinispan-dev] DIST.retrieveFromRemoteSource
On 25 Jan 2012, at 09:42, Bela Ban wrote: > > > On 1/25/12 9:51 AM, Dan Berindei wrote: > >> Slightly related, I wonder if Manik's comment is still true: >> >> if at all possible, try not to use JGroups' ANYCAST for now. >> Multiple (parallel) UNICASTs are much faster. >> >> Intuitively it shouldn't be true, unicasts+FutureCollator do basically >> the same thing as anycast+GroupRequest. > > > No, parallel unicasts will be faster, as an anycast to A,B,C sends the > unicasts sequentially Thanks, very good to know that. I'm a bit confused by the JGroups terminology though :-) My understanding of the term ANYCAST is that the message is sent to *one* of the A,B,C. But from what I read here it is sent to A, B and C - that's what I know as MULTICAST. Moreover, in the discussion we had on IRC, JGroups' MULTICAST seemed to mean BROADCAST... I hope I don't sound pedantic here, I just want to understand things correctly :-)
Re: [infinispan-dev] DIST.retrieveFromRemoteSource
On 25 Jan 2012, at 08:51, Dan Berindei wrote: > Hi Sanne > > On Wed, Jan 25, 2012 at 1:22 AM, Sanne Grinovero wrote: >> Hello, >> in the method: >> org.infinispan.distribution.DistributionManagerImpl.retrieveFromRemoteSource(Object, >> InvocationContext, boolean) >> >> we have: >> >> List targets = locate(key); >> // if any of the recipients has left the cluster since the >> command was issued, just don't wait for its response >> targets.retainAll(rpcManager.getTransport().getMembers()); >> >> But then we use ResponseMode.WAIT_FOR_VALID_RESPONSE, which means >> we're not going to wait for all responses anyway, and I think we might >> assume to get a reply by a node which actually is in the cluster. >> >> So the retainAll method is unneeded and can be removed? I'm wondering, >> because it's not safe anyway, actually it seems very unlikely to me >> that just between a locate(key) and the retainAll the view is being >> changed, so not something we should be relying on anyway. >> I'd rather assume that such a get method might be checked and >> eventually dropped by the receiver. >> > > The locate method will return a list of owners based on the > "committed" cache view, so there is a non-zero probability that one of > the owners has already left. > > If I remember correctly, I added the retainAll call because otherwise > ClusteredGetResponseFilter.needMoreResponses() would keep returning > true if one of the targets left the cluster. Coupled with the fact > that null responses are not considered valid (unless *all* responses > are null), this meant that a remote get for a key that doesn't exist > would throw a TimeoutException after 15 seconds instead of returning > null immediately. > > We could revisit the decision to make null responses invalid, and then > as long as there is still one of the old owners left in the cluster > you'll get the null result immediately. can't we just go directly to a single node for getting a remote value? The main data owner perhaps. 
If the node is down we can retry on another node. Going to multiple nodes seems like a waste of resources - network in this case.
Re: [infinispan-dev] DIST.retrieveFromRemoteSource
On 1/25/12 9:51 AM, Dan Berindei wrote: > Slightly related, I wonder if Manik's comment is still true: > > if at all possible, try not to use JGroups' ANYCAST for now. > Multiple (parallel) UNICASTs are much faster. > > Intuitively it shouldn't be true, unicasts+FutureCollator do basically > the same thing as anycast+GroupRequest. No, parallel unicasts will be faster, as an anycast to A,B,C sends the unicasts sequentially -- Bela Ban Lead JGroups (http://www.jgroups.org) JBoss / Red Hat
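Bela's point can be illustrated with a toy cost model (the numbers are made up): when the anycast sends its unicasts sequentially, the last message leaves only after the sum of the individual send costs, whereas parallel unicasts are bounded by the single slowest send.

```java
import java.util.List;

// Toy dispatch-cost model, not real messaging code.
public class DispatchCost {
    // Anycast to {A,B,C}: unicasts are sent one after another.
    static int sequentialDispatchMs(List<Integer> sendCostsMs) {
        return sendCostsMs.stream().mapToInt(Integer::intValue).sum();
    }

    // Parallel unicasts: all messages are dispatched concurrently.
    static int parallelDispatchMs(List<Integer> sendCostsMs) {
        return sendCostsMs.stream().mapToInt(Integer::intValue).max().orElse(0);
    }

    public static void main(String[] args) {
        List<Integer> costs = List.of(5, 7, 6); // hypothetical per-unicast send costs
        System.out.println("anycast (sequential): " + sequentialDispatchMs(costs) + "ms"); // 18ms
        System.out.println("parallel unicasts: " + parallelDispatchMs(costs) + "ms"); // 7ms
    }
}
```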
Re: [infinispan-dev] DIST.retrieveFromRemoteSource
Hi Sanne On Wed, Jan 25, 2012 at 1:22 AM, Sanne Grinovero wrote: > Hello, > in the method: > org.infinispan.distribution.DistributionManagerImpl.retrieveFromRemoteSource(Object, > InvocationContext, boolean) > > we have: > > List targets = locate(key); > // if any of the recipients has left the cluster since the > command was issued, just don't wait for its response > targets.retainAll(rpcManager.getTransport().getMembers()); > > But then we use ResponseMode.WAIT_FOR_VALID_RESPONSE, which means > we're not going to wait for all responses anyway, and I think we might > assume to get a reply by a node which actually is in the cluster. > > So the retainAll method is unneeded and can be removed? I'm wondering, > because it's not safe anyway, actually it seems very unlikely to me > that just between a locate(key) and the retainAll the view is being > changed, so not something we should be relying on anyway. > I'd rather assume that such a get method might be checked and > eventually dropped by the receiver. > The locate method will return a list of owners based on the "committed" cache view, so there is a non-zero probability that one of the owners has already left. If I remember correctly, I added the retainAll call because otherwise ClusteredGetResponseFilter.needMoreResponses() would keep returning true if one of the targets left the cluster. Coupled with the fact that null responses are not considered valid (unless *all* responses are null), this meant that a remote get for a key that doesn't exist would throw a TimeoutException after 15 seconds instead of returning null immediately. We could revisit the decision to make null responses invalid, and then as long as there is still one of the old owners left in the cluster you'll get the null result immediately. You may still get an exception if all the old owners left the cluster, but I'm not sure. I wish I had added a test for this... 
We may be able to add a workaround in FutureCollator as well - just remember that we use the same FutureCollator for writes in REPL mode so it needs to work with GET_ALL as well as with GET_FIRST. Slightly related, I wonder if Manik's comment is still true: if at all possible, try not to use JGroups' ANYCAST for now. Multiple (parallel) UNICASTs are much faster. Intuitively it shouldn't be true, unicasts+FutureCollator do basically the same thing as anycast+GroupRequest. Cheers Dan > Cheers, > Sanne
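The behaviour Dan describes can be modelled in a few lines (a toy model, not Infinispan code): since null responses are not "valid", a get for a missing key only completes once *all* targets have answered null, and a target that already left the cluster never answers, so without the retainAll guard the caller waits until the timeout.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the retainAll guard in retrieveFromRemoteSource: drop
// owners that have already left the cluster so the response filter
// never waits on a reply that can't arrive.
public class RemoteGetModel {
    static List<String> effectiveTargets(List<String> owners, List<String> currentMembers) {
        List<String> targets = new ArrayList<>(owners);
        targets.retainAll(currentMembers); // keep only owners still in the view
        return targets;
    }

    public static void main(String[] args) {
        List<String> owners = List.of("A", "B", "C");
        List<String> members = List.of("A", "C", "D"); // B has left the cluster
        System.out.println(effectiveTargets(owners, members)); // [A, C]
    }
}
```

Treating null as a valid response, as suggested above, would make the guard less critical for the missing-key case: the first null from any surviving owner would complete the request.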