Hi!

I have a question regarding the use of fallback nodes for get requests.

As I read the code, get requests use fallback nodes if any of the nodes
normally holding the key are not reachable. Also a simple test seemed to
confirm this.
What I mean is: if I do a get request for a key with R=N, and one of the
first N nodes in the preflist is down, the request will still succeed.
Why is that? Doesn't that undermine the purpose of setting R to a high
number (specifically, setting it to N)? With fallbacks in play, a request
might succeed even if all primary nodes responsible for the key are
unavailable.
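
To make the scenario concrete, here is a toy Python model of how I
understand the preflist to be built. It is not Riak's code:
build_preflist, get_can_succeed and the six-node ring are names I made up
for illustration, and it assumes that every reply from a preflist member
counts toward R.

```python
def build_preflist(ring, n, down, allow_fallbacks=True):
    """Return (node, kind) pairs for a key whose primaries are ring[:n].

    Hypothetical model: down primaries are replaced by the next
    reachable nodes further along the ring (the fallbacks).
    """
    preflist = [(node, "primary") for node in ring[:n] if node not in down]
    if allow_fallbacks:
        for node in ring[n:]:
            if len(preflist) == n:
                break
            if node not in down:
                preflist.append((node, "fallback"))
    return preflist


def get_can_succeed(ring, n, r, down, allow_fallbacks=True):
    # Assumption: a reply from any preflist member counts toward R.
    return len(build_preflist(ring, n, down, allow_fallbacks)) >= r


ring = ["a", "b", "c", "d", "e", "f"]

# One primary down, R=N=3: a fallback fills the gap.
print(get_can_succeed(ring, n=3, r=3, down={"b"}))
# Same situation if only primaries were allowed to answer.
print(get_can_succeed(ring, n=3, r=3, down={"b"}, allow_fallbacks=False))
# All three primaries down: the preflist is made of fallbacks only.
print(get_can_succeed(ring, n=3, r=3, down={"a", "b", "c"}))
```

In this model the first and third calls report success and the second
does not, which is the behaviour I am asking about.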

I can see that behaviour might make sense if a node is down for a long
time, but in that case it should just be removed from the cluster.

On a similar note, why does riak_kv_get_fsm wait for at least
(N/2)+1 responses when there are only not_found responses, effectively
ignoring a smaller R value on the request if the key does not exist?
My guess is that this also has to do with the use of fallback nodes:
Since the partition will usually be very small on the fallback/handoff
node, it is likely to be the first to answer. So to avoid returning
false not_found responses, a basic quorum is required.
Am I on the right track here?
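
Here is a small Python sketch of the rule as I read it. It is a
simplification and not the actual riak_kv_get_fsm logic:
replies_consumed is a made-up name, replies are just the strings "ok"
and "not_found" in arrival order, and the basic quorum is taken to be
(N/2)+1 with integer division.

```python
def replies_consumed(n, r, replies):
    """Return (number of replies waited for, result) in this toy model.

    Assumed rule: answer "ok" as soon as r "ok" replies have arrived;
    answer "not_found" only once a basic quorum of n // 2 + 1 replies
    has arrived and none of them was "ok".
    """
    basic_quorum = n // 2 + 1
    oks = 0
    for count, reply in enumerate(replies, start=1):
        if reply == "ok":
            oks += 1
        if oks >= r:
            return count, "ok"
        if oks == 0 and count >= basic_quorum:
            return count, "not_found"
    # Not enough replies yet to decide either way.
    return len(replies), "pending"


# N=3, R=1, key exists and a primary answers first: one reply is enough.
print(replies_consumed(3, 1, ["ok", "ok", "ok"]))
# N=3, R=1, key does not exist: two replies are needed despite R=1.
print(replies_consumed(3, 1, ["not_found", "not_found", "not_found"]))
# N=3, R=1, a nearly empty fallback answers not_found first: no false miss.
print(replies_consumed(3, 1, ["not_found", "ok", "ok"]))
```

The last case is the one I suspect the basic quorum is meant to protect
against; the middle case is the cost I describe next.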

The problem is that this is imposed even when all nodes are up.
If one requires very low latency or very high availability (that's why
one uses a small R value in the first place) and does a lot of gets for
nonexistent keys, Riak silently screws you over by raising R for those
keys.

I most likely missed something here, but the ad hoc tests I did seem to
be consistent with my understanding of the code.


Thanks,
Nico







_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
