Hi, Marc. I understand your confusion, as that code is a bit subtle.
The reason this isn't a bug is that upon receiving the very first notfound in your situation, the "FailThreshold" case in the clause for notfound messages would return true -- since it would already know that it could never get 3 ok responses after that. The FSM would immediately send a notfound to the client and would not wait for the subsequent vnode responses.

I hope that this explanation was helpful.

Best,

-Justin

On Tue, Apr 13, 2010 at 9:00 AM, Marc Worrell <m...@worrell.nl> wrote:
> Hi,
>
> I was reading the source code of riak_get_fsm to see how failure is handled.
> I stumbled on a construction that I don't understand.
>
> In waiting_vnode_r/2 I see that:
> 1. on receiving an ok: there is a check if there are R ok replies
> 2. on receiving notfound: there is a check if there are R (ok + notfound)
> replies
>
> Now suppose I have R = N = 3.
> And I get back from the nodes the sequence: [notfound, ok, ok]
> Then #state.replied_r = 2, and #state.replied_notfound = 1.
> This will let "waiting_vnode_r({r, {ok, RObj}, ...)" stay in the state
> "waiting_vnode_r".
> Though we know we got an answer from all R (N) nodes, only a timeout will
> move the fsm further.
>
> Could this be handled differently or am I missing something?
>
> - Marc
> _______________________________________________
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
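
For anyone following along, the early exit described in the reply can be sketched roughly as below. This is a simplified Python model, not the actual riak_get_fsm Erlang code. The names GetState, fail_threshold, on_reply and run are hypothetical, and the threshold formula N - R + 1 is an assumption inferred from the statement that the FSM already knows it can never collect R ok responses.

```python
from dataclasses import dataclass


@dataclass
class GetState:
    """Hypothetical stand-in for the FSM state; not Riak's #state record."""
    n: int              # number of vnodes asked
    r: int              # number of ok replies required
    ok: int = 0         # ok replies seen so far
    notfound: int = 0   # notfound replies seen so far


def fail_threshold(n: int, r: int) -> int:
    # Assumption: after n - r + 1 notfound replies, fewer than r vnodes
    # remain that could still answer ok, so the read can no longer succeed.
    return n - r + 1


def on_reply(state: GetState, reply: str) -> str:
    """Record one vnode reply and return 'ok', 'notfound' or 'waiting'."""
    if reply == "ok":
        state.ok += 1
        if state.ok >= state.r:
            return "ok"
    elif reply == "notfound":
        state.notfound += 1
        if state.notfound >= fail_threshold(state.n, state.r):
            return "notfound"
    else:
        raise ValueError(f"unexpected reply: {reply!r}")
    return "waiting"


def run(n: int, r: int, replies: list[str]) -> tuple[str, int]:
    """Feed replies in order; return the outcome and how many were consumed."""
    state = GetState(n=n, r=r)
    for count, reply in enumerate(replies, start=1):
        outcome = on_reply(state, reply)
        if outcome != "waiting":
            return outcome, count
    return "waiting", len(replies)
```

Under these assumptions, run(3, 3, ["notfound", "ok", "ok"]) returns ("notfound", 1): with R = N = 3 the threshold is 1, so the very first notfound settles the request and the two later ok replies are never waited for.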