Re: Handoffs are too slow after netsplit

Charlie Voiselle Fri, 24 Feb 2017 09:22:05 -0800

Andrey:

Another thing that can stall handoff is running commands that reset the vnode 
activity timer.  The `riak-admin vnode-status` command resets the activity 
timer and should never be run more frequently than the vnode inactivity 
timeout; if you do, that can permanently stall handoff.  We have seen this 
before at a customer site where they were collecting the statistics from the 
vnode-status command into their metrics system.
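
For reference, the timeout in question is the
riak_core.vnode_inactivity_timeout setting that Doug mentions below.  A
minimal sketch of an advanced.config fragment for a demo setup, assuming the
value is given in milliseconds (which would match the 60-second wait Doug
describes) and using 5000 purely as an illustrative value:

```erlang
%% advanced.config (sketch only; merge into your existing file)
[
 {riak_core, [
   %% Assumed to be in milliseconds.  5000 is an illustrative demo value,
   %% not a recommendation for production.
   {vnode_inactivity_timeout, 5000}
 ]}
].
```

Whatever value is used, any periodic vnode-status collection should run at an
interval longer than it.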


Regards,
Charlie Voiselle


> On Feb 23, 2017, at 7:14 AM, Douglas Rohrer <droh...@basho.com> wrote:
> 
> Andrey:
> 
> It's waiting for 60 seconds, literally...
> 
> See 
> https://github.com/basho/riak_core/search?utf8=%E2%9C%93&q=vnode_inactivity_timeout
>  - handoff is not initiated until a vnode has been inactive for the specified 
> inactivity period.
> 
> For demonstration purposes, if you want to reduce this time, you could lower 
> the riak_core.vnode_inactivity_timeout setting, which can be set in 
> advanced.config. Also note that, depending on the backend you use, if other 
> settings are set lower than the vnode inactivity timeout, you can actually 
> prevent handoff completely - see 
> http://docs.basho.com/riak/kv/2.2.0/setup/planning/backend/bitcask/#sync-strategy
> for example.
> 
> Hope this helps.
> 
> Doug
> 
> On Thu, Feb 23, 2017 at 6:40 AM Andrey Ershov <andrers...@gmail.com> wrote:
> Hi, guys!
> 
> I'd like to follow up on handoff behaviour after a netsplit. The problem is 
> that right after the network partition is healed, the "riak-admin transfers" 
> command says that there are X partitions waiting to transfer from one node to 
> another, and Y partitions waiting to transfer in the opposite direction. What 
> are they waiting for? The active transfers section is always empty. It takes 
> about 1 minute for the transfer to occur. I've increased transfer_limit to 100 
> and it does not help.
> Also I've tried to attach to the Erlang VM and execute 
> riak_core_vnode_manager:force_handoff() on each node. This command returns 
> 'ok', but it seems that it does not work right after the network is healed. 
> After some time (30-60 s), force_handoff() works as expected, but the latency 
> is effectively the same as in the automatic handoff case.
> 
> So what is it waiting for? Any ideas?
> 
> I'm preparing a real-time coding demo to be shown at a conference, so waiting 
> 1 minute for handoff to occur just for a couple of keys is too long...
> -- 
> Thanks,
> Andrey
> _______________________________________________
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
