Hi Ryan,

Yes, you can change a number of settings. Have you had a look at
http://docs.basho.com/riak/kv/2.1.4/using/admin/riak-admin/#transfer-limit
and
http://lists.basho.com/pipermail/riak-users_lists.basho.com/2014-July/015529.html ?
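For a rough sense of scale, here is a back-of-the-envelope estimate built only from the figures in the `riak-admin transfers` output quoted below. It is a sketch, not a measurement: it assumes the reported 1.53 MB/s is sustained, that "MB" means 10^6 bytes, and the function name is made up for illustration.

```python
def estimate_transfer(total_bytes, rate_bytes_per_s, elapsed_s):
    """Return (hours for a full pass, fraction expected to be done after elapsed_s)."""
    total_hours = total_bytes / rate_bytes_per_s / 3600
    fraction_done = min(1.0, rate_bytes_per_s * elapsed_s / total_bytes)
    return total_hours, fraction_done


# Figures copied from the quoted transfer status.
TOTAL_BYTES = 78393086512   # "total size"
RATE = 1.53e6               # "1.53 MB/s", assuming 1 MB = 10^6 bytes
ELAPSED = 2.10 * 3600       # "started ... [2.10 hr ago]"

hours, done = estimate_transfer(TOTAL_BYTES, RATE, ELAPSED)
print(f"full pass: about {hours:.1f} h; expected progress so far: {done:.0%}")
```

At that rate a single pass over this partition takes roughly 14 hours, and the expected progress matches the 15% shown in the status output. If each restart begins from zero (an assumption on my part), a timeout anywhere inside that window throws away hours of transfer.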
-Alexander

On Tue, Nov 1, 2016 at 2:43 AM, Ryan Maclear <ry...@miranetworks.net> wrote:
> Good Day,
>
> We have a 4 node riak cluster running inside AWS. The riak is riak-kv 2.1.2
> with AAE enabled on Ubuntu 14.04.4 LTS.
>
> We are in the process of replacing one node with another using the process
> described here:
>
> http://docs.basho.com/riak/kv/2.1.4/using/cluster-operations/replacing-node/
>
> We have successfully replaced two of the nodes so far but we are having a
> problem with the third. If we look at /var/log/riak/console.log we see the
> start of the hinted handoff, and some time later (sometimes minutes and
> sometimes hours) we see:
>
> 2016-10-31 06:30:40.090 [error]
> <0.19834.2101>@riak_core_handoff_sender:start_fold:272 hinted transfer of
> riak_kv_vnode from 'r...@aew54.miranetworks.net'
> 274031556999544297163190906134303066185487351808 to
> 'r...@aew75.miranetworks.net'
> 274031556999544297163190906134303066185487351808 failed because of TCP recv
> timeout
> 2016-10-31 06:30:40.090 [error]
> <0.187.0>@riak_core_handoff_manager:handle_info:303 An outbound handoff of
> partition riak_kv_vnode 274031556999544297163190906134303066185487351808 was
> terminated for reason: {shutdown,timeout}
>
> So the handoff was terminated due to a TCP timeout. The handoff then starts
> again.
>
> This has been going on for some time (about two weeks now).
>
> The current member status is as follows:
>
> riak-admin member-status
> ================================= Membership ==================================
> Status     Ring    Pending    Node
> -------------------------------------------------------------------------------
> leaving     0.0%      --      'r...@aew54.miranetworks.net'
> valid      25.0%      --      'r...@aew59.miranetworks.net'
> valid      25.0%      --      'r...@aew73.miranetworks.net'
> valid      25.0%      --      'r...@aew74.miranetworks.net'
> valid      25.0%      --      'r...@aew75.miranetworks.net'
> -------------------------------------------------------------------------------
> Valid:4 / Leaving:1 / Exiting:0 / Joining:0 / Down:0
>
>
> Here are some questions:
>
> 1. What is the default tcp timeout?
> 2. Is there any way to increase this timeout?
> 3. Is there any way to increase the rate of handoff?
> 4. Are there any other parameters we can tune to try and avoid this?
>
> The output from riak-admin transfers is as follows:
>
> 'r...@aew54.miranetworks.net' waiting to handoff 1 partitions
>
> Active Transfers:
>
> transfer type: hinted
> vnode type: riak_kv_vnode
> partition: 274031556999544297163190906134303066185487351808
> started: 2016-11-01 05:30:47 [2.10 hr ago]
> last update: 2016-11-01 07:36:51 [3.03 s ago]
> total size: 78393086512 bytes
> objects transferred: 11440967
>
>                          1513 Objs/s
> riak@aew54.miranetworks.n  =======>  riak@aew75.miranetworks.n
>            et                                   et
>         |======                                     |  15%
>                          1.53 MB/s
>
>
> Thanks,
> Ryan Maclear
>
> _______________________________________________
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>