Hi, Marc.

I understand your confusion, as that code is a bit subtle.

The reason this isn't a bug is that, in your situation, the
"FailThreshold" case in the clause for notfound messages would return
true upon receiving the very first notfound, since the FSM would
already know that it could never get 3 ok responses after that.  The
FSM would immediately send a notfound to the client and would not wait
for the subsequent vnode responses, so it never ends up sitting in
waiting_vnode_r with 2 ok replies and 1 notfound.
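
To make that concrete, here is a rough sketch of the idea in Python
rather than the actual Erlang from riak_get_fsm. The names
fail_threshold and simulate are made up for illustration, and the
formula N - R + 1 is my assumption about what "can never get R ok
responses" means; the real code may compute its threshold differently.

```python
def fail_threshold(n: int, r: int) -> int:
    # Assumed formula: once this many vnodes have answered notfound,
    # fewer than r vnodes remain that could still answer ok.
    return n - r + 1


def simulate(n: int, r: int, replies: list[str]) -> tuple[str, int]:
    """Return (result, number of replies consumed before answering)."""
    oks = 0
    notfounds = 0
    for count, reply in enumerate(replies, start=1):
        if reply == "ok":
            oks += 1
            if oks >= r:
                # Enough ok replies: answer the client with the value.
                return ("ok", count)
        elif reply == "notfound":
            notfounds += 1
            if notfounds >= fail_threshold(n, r):
                # R ok replies are no longer possible: fail right away.
                return ("notfound", count)
    # Neither condition was met; only a timeout would end the wait.
    return ("timeout", len(replies))


if __name__ == "__main__":
    # Marc's case: N = R = 3 and the replies [notfound, ok, ok].
    print(simulate(3, 3, ["notfound", "ok", "ok"]))
```

With N = R = 3 the assumed threshold is 1, so the sketch answers
notfound after the first reply and never looks at the two ok replies.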

I hope that this explanation was helpful.

Best,

-Justin



On Tue, Apr 13, 2010 at 9:00 AM, Marc Worrell <m...@worrell.nl> wrote:
> Hi,
>
> I was reading the source code of riak_get_fsm to see how failure is handled.
> I stumbled on a construction that I don't understand.
>
> In waiting_vnode_r/2 I see that:
> 1. on receiving an ok: there is a check if there are R ok replies
> 2. on receiving notfound: there is a check if there are R (ok + notfound)
> replies
>
> Now suppose I have R = N = 3.
> And I get back from the nodes the sequence: [notfound, ok, ok]
> Then #state.replied_r = 2, and #state.replied_notfound = 1.
> This will let "waiting_vnode_r({r, {ok, RObj}, ...)" stay in the state 
> "waiting_vnode_r".
> Though we know we got an answer from all R (N) nodes, only a timeout will 
> move the fsm further.
>
> Could this be handled differently or am I missing something?
>
> - Marc
> _______________________________________________
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>
