First, our contract with the client says "we'll give you the answer or a timeout after rpc_timeout." Once we start trying to cheat on that, the client no longer has any guarantee of when it should expect a response. So that feels iffy to me.
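
To make that contract concrete, here is a minimal client-side sketch. It assumes every attempt either returns or fails within roughly rpc_timeout, so the caller can budget its own worst-case wait. The names `RpcTimeout`, `read_once`, `worst_case_wait` and `read_with_retry` are hypothetical and not part of any Cassandra client API.

```python
class RpcTimeout(Exception):
    """Hypothetical error: the coordinator gave up after rpc_timeout."""


def worst_case_wait(rpc_timeout_s, max_attempts):
    # Holds only if each attempt is bounded by rpc_timeout_s, i.e. the
    # server does not retry internally behind the client's back.
    return rpc_timeout_s * max_attempts


def read_with_retry(read_once, max_attempts=2):
    """Call read_once() until it returns or max_attempts timeouts occur.

    read_once is an assumed zero-argument callable that performs a single
    read and raises RpcTimeout when the server reports a timeout.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return read_once()
        except RpcTimeout:
            if attempt == max_attempts:
                raise  # budget exhausted: surface the timeout to the caller
```

With rpc_timeout at 10 seconds and two attempts, the caller knows it waits at most about 20 seconds, which it could not know if the server silently retried on its own.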
Second, retrying against a different node isn't expected to give substantially better results than the client issuing a retry itself, if that's what it wants. By the time we time out once, the failure detector (FD) and/or the dynamic snitch should route the retry to another node, without adding complexity to StorageProxy. (If that's not what you see in practice, then we probably have a dynamic snitch bug.)

On Wed, Apr 13, 2011 at 12:32 PM, Erik Onnen <eon...@gmail.com> wrote:
> Sorry for the complex setup, took a while to identify the behavior and
> I'm still not sure I'm reading the code correctly.
>
> Scenario:
>
> Six node ring w/ SimpleSnitch and RF3. For the sake of discussion
> assume the token space looks like:
>
> node-0 1-10
> node-1 11-20
> node-2 21-30
> node-3 31-40
> node-4 41-50
> node-5 51-60
>
> In this scenario we want key 35 where nodes 3,4 and 5 are natural
> endpoints. Client is connected to node-0, node-1 or node-2. node-3
> goes into a full GC lasting 12 seconds.
>
> What I think we're seeing is that as long as we read with CL.ONE *and*
> are connected to 0,1 or 2, we'll never get a response for the
> requested key until the failure detector kicks in and convicts 3
> resulting in reads spilling over to the other endpoints.
>
> We've tested this by switching to CL.QUORUM and since haven't seen
> read timeouts during big GCs.
>
> Assuming the above, is this behavior really correct? We have copies of
> the data on two other nodes but because this snitch config always
> picks node-3, we always timeout until conviction which can take up to
> 8 seconds sometimes. Shouldn't the read attempt to pick a different
> endpoint in the case of the first timeout rather than repeatedly
> trying a node that isn't responding?
>
> Thanks,
> -erik
>

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com