Yes, we've had the dynamic snitch on by default in all the 0.7 releases, so it's pretty well tested by this point.
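For anyone following along, the core idea is simple: instead of always reading from the first natural endpoint, the coordinator sorts replicas by recently observed latency, so a node stuck in a long GC pause drifts to the back of the list. A minimal sketch of that idea (illustrative only, not the actual DynamicEndpointSnitch code; all names here are made up):

    import java.util.Comparator;
    import java.util.List;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Sketch only: sort candidate replicas by an exponentially weighted
    // moving average of observed latency, so an unresponsive node stops
    // being the first choice instead of being picked forever.
    public class LatencyAwareSorter {
        private static final double ALPHA = 0.75; // weight of the newest sample
        private final Map<String, Double> avgLatencyMs = new ConcurrentHashMap<>();

        // Record one latency sample for an endpoint (timeouts can be fed
        // in as a large penalty value).
        public void recordLatency(String endpoint, double latencyMs) {
            avgLatencyMs.merge(endpoint, latencyMs,
                    (oldAvg, sample) -> ALPHA * sample + (1 - ALPHA) * oldAvg);
        }

        // Sort replicas best-first before the coordinator picks one to read from.
        public void sortByProximity(List<String> replicas) {
            replicas.sort(Comparator.comparingDouble(
                    endpoint -> avgLatencyMs.getOrDefault(endpoint, 0.0)));
        }
    }

The real implementation's scoring and reset details differ, but the routing effect is the same: after the first timeout, subsequent reads prefer a replica that is actually answering.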
On Wed, Apr 13, 2011 at 1:17 PM, Erik Onnen <eon...@gmail.com> wrote:
> So we're not currently using a dynamic snitch, only the SimpleSnitch
> is at play (lots of history as to why, I won't go into it). If this
> would solve our problems I'm fine changing it.
>
> Understood re: client contract. I guess in this case my issue is that
> the server we're connected to never tries more than the one failing
> server until the failure detector has kicked in; it keeps flogging the
> bad server, so subsequent requests never produce a different result
> until conviction.
>
> Regarding clients retrying: in this configuration the situation
> doesn't improve, and reads still time out, because our client libraries
> don't try another host. They still have a valid connection to a
> working host; it's just that, given our configuration, that one node
> keeps proxying to a bad server and never routes around it. It sounds
> like switching to the dynamic snitch would adjust for the first
> timeout on subsequent attempts, so maybe that's the most advisable
> thing in this case.
>
> On Wed, Apr 13, 2011 at 10:58 AM, Jonathan Ellis <jbel...@gmail.com> wrote:
>> First, our contract with the client says "we'll give you the answer or
>> a timeout after rpc_timeout." Once we start trying to cheat on that,
>> the client has no guarantee anymore of when it should expect a
>> response. So that feels iffy to me.
>>
>> Second, retrying against a different node isn't expected to give
>> substantially better results than the client issuing a retry itself,
>> if that's what it wants: by the time we time out once, the FD and/or
>> dynamic snitch should route the retry to another node, without adding
>> additional complexity to StorageProxy. (If that's not what you see in
>> practice, then we probably have a dynamic snitch bug.)
>>
>> On Wed, Apr 13, 2011 at 12:32 PM, Erik Onnen <eon...@gmail.com> wrote:
>>> Sorry for the complex setup; it took a while to identify the behavior
>>> and I'm still not sure I'm reading the code correctly.
>>>
>>> Scenario:
>>>
>>> Six-node ring w/ SimpleSnitch and RF=3. For the sake of discussion,
>>> assume the token space looks like:
>>>
>>> node-0  1-10
>>> node-1  11-20
>>> node-2  21-30
>>> node-3  31-40
>>> node-4  41-50
>>> node-5  51-60
>>>
>>> In this scenario we want key 35, where nodes 3, 4 and 5 are the
>>> natural endpoints. The client is connected to node-0, node-1 or
>>> node-2. node-3 goes into a full GC lasting 12 seconds.
>>>
>>> What I think we're seeing is that as long as we read with CL.ONE *and*
>>> are connected to 0, 1 or 2, we'll never get a response for the
>>> requested key until the failure detector kicks in and convicts node-3,
>>> at which point reads spill over to the other endpoints.
>>>
>>> We've tested this by switching to CL.QUORUM, and since then we haven't
>>> seen read timeouts during big GCs.
>>>
>>> Assuming the above, is this behavior really correct? We have copies of
>>> the data on two other nodes, but because this snitch config always
>>> picks node-3, we always time out until conviction, which can sometimes
>>> take up to 8 seconds. Shouldn't the read attempt to pick a different
>>> endpoint after the first timeout rather than repeatedly trying a node
>>> that isn't responding?
>>>
>>> Thanks,
>>> -erik
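To make the quoted scenario concrete: with SimpleStrategy-style placement, the natural endpoints for a key are the node owning its token range plus the next RF-1 nodes clockwise around the ring, and a static snitch returns them in that fixed order, so a CL.ONE data read keeps landing on node-3. A toy model (it assumes exactly the token layout quoted above; this is not Cassandra's actual placement code):

    import java.util.ArrayList;
    import java.util.List;

    // Toy model of the six-node ring quoted above. Each entry in TOKENS
    // is the upper end of that node's range (node-0 owns 1-10, node-1
    // owns 11-20, and so on).
    public class RingDemo {
        static final int[] TOKENS = {10, 20, 30, 40, 50, 60};
        static final int RF = 3;

        // SimpleStrategy-style placement: the node owning the key's
        // range, then the next RF-1 nodes clockwise around the ring.
        static List<String> naturalEndpoints(int key) {
            int owner = 0;
            while (owner < TOKENS.length - 1 && key > TOKENS[owner])
                owner++;
            List<String> endpoints = new ArrayList<String>();
            for (int i = 0; i < RF; i++)
                endpoints.add("node-" + ((owner + i) % TOKENS.length));
            return endpoints;
        }

        public static void main(String[] args) {
            // Prints [node-3, node-4, node-5]. With SimpleSnitch the
            // order is fixed, so a CL.ONE data read always goes to
            // node-3, GC pause or not.
            System.out.println(naturalEndpoints(35));
        }
    }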
--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
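P.S. For completeness, the client-side retry described above looks roughly like this. CassandraClient and TimedOutException are hypothetical stand-ins for whatever surface your client library actually exposes, not a real API:

    import java.util.List;

    // Hypothetical client-library surface, for illustration only.
    interface CassandraClient {
        byte[] get(String key) throws TimedOutException;
    }

    class TimedOutException extends Exception {
    }

    // On a timeout, reissue the read through a different coordinator
    // instead of giving up; by the second attempt the failure detector
    // or dynamic snitch should have steered reads away from the slow
    // replica.
    public class RetryingReader {
        public byte[] readWithRetry(List<CassandraClient> coordinators, String key)
                throws TimedOutException {
            TimedOutException last = null;
            for (CassandraClient coordinator : coordinators) {
                try {
                    return coordinator.get(key); // waits up to rpc_timeout
                } catch (TimedOutException e) {
                    last = e; // try the next coordinator
                }
            }
            throw last != null ? last : new TimedOutException();
        }
    }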