Andrey:

Another thing that can stall handoff is running commands that reset the vnode activity timer. The `riak-admin vnode-status` command resets the activity timer and should never be run more frequently than the vnode inactivity timeout; if it is, handoff can stall permanently. We have seen this before at a customer site where they were collecting the statistics from the vnode-status command into their metrics system.
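
For reference, the setting Doug mentions in the quoted message below goes in the riak_core section of advanced.config. A minimal sketch, assuming the value is in milliseconds (consistent with the 60-second default he describes) and using 5000 purely as an illustrative demo value, not a recommendation:

```erlang
%% advanced.config (sketch; merge into any existing riak_core section)
[
 {riak_core, [
   %% Assumed to be milliseconds: a vnode must be idle this long
   %% before handoff is initiated. 5000 is a hypothetical demo value.
   {vnode_inactivity_timeout, 5000}
 ]}
].
```

With a value like this, anything that polls `riak-admin vnode-status` would need to run less often than every 5 seconds, otherwise the vnodes never look idle and handoff does not start. Changes to advanced.config normally need a node restart to take effect.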

Regards,
Charlie Voiselle

> On Feb 23, 2017, at 7:14 AM, Douglas Rohrer <droh...@basho.com> wrote:
>
> Andrey:
>
> It's waiting for 60 seconds, literally...
>
> See
> https://github.com/basho/riak_core/search?utf8=%E2%9C%93&q=vnode_inactivity_timeout
> - handoff is not initiated until a vnode has been inactive for the specified
> inactivity period.
>
> For demonstration purposes, if you want to reduce this time, you could set
> the riak_core.vnode_inactivity_timeout period lower, which can be set in
> advanced.config. Also note that, depending on the backend you use, if other
> settings are set lower than the vnode inactivity timeout, you can actually
> prevent handoff completely - see
> http://docs.basho.com/riak/kv/2.2.0/setup/planning/backend/bitcask/#sync-strategy
> for example.
>
> Hope this helps.
>
> Doug
>
> On Thu, Feb 23, 2017 at 6:40 AM Andrey Ershov <andrers...@gmail.com> wrote:
> Hi, guys!
>
> I'd like to follow up on handoff behaviour after a netsplit. The problem is
> that right after the network partition is healed, the "riak-admin transfers"
> command says that there are X partitions waiting to transfer from one node to
> another, and Y partitions waiting to transfer in the opposite direction. What
> are they waiting for? The active transfers section is always empty. It takes
> about 1 minute for the transfer to occur. I've increased transfer_limit to
> 100 and it does not help.
> I've also tried attaching to the Erlang VM and executing
> riak_core_vnode_manager:force_handoff() on each node. This command returns
> 'ok', but it seems not to work right after the network is healed. After
> some time (30-60 s), force_handoff() works as expected, but that is
> the same latency as in the automatic handoff case.
>
> So what is it waiting for? Any ideas?
>
> I'm preparing a real-time coding demo to be shown at a conference, so
> waiting 1 minute for handoff to occur just for a couple of keys is too
> long...
> --
> Thanks,
> Andrey
> _______________________________________________
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com