Re: Understanding Riaks rebalancing and handoff behaviour

Scott Lystig Fritchie Thu, 11 Nov 2010 14:17:45 -0800

Nico Meyer <[email protected]> wrote:

nm> I discovered another problem while debugging this. If you restart (or
nm> it crashes) a node that you removed from the cluster which still has
nm> data, it won't start handing off its data afterwards. The reason
nm> is that the node watcher also does not get notified that the
nm> other nodes are up, and so all of them are considered down. This
nm> also can only be worked around manually via the Erlang console.


Nico, I've opened ticket 878 after scripting your scenario and
duplicating it on an Ubuntu9 32-bit box using the Riak package
riak_0.13.0-2_i386.deb.

    https://issues.basho.com/show_bug.cgi?id=878

On to Sven's problem that started this thread ... I've a larger script
that attempts to reproduce his problem, using 12 nodes installed on a
single Ubuntu9 32-bit machine (though reading carefully, Sven doesn't
get around to using EC2 instance number D, so only 9 nodes are used).

I have the script and output available at
http://www.snookles.com/scotttmp/riedel-scenario.tar.gz.  Sorry, I don't
have the rest of the basho_expect infrastructure available to outside
users right now(*), so it isn't possible for outsiders to re-run the
test, but it should show what's being done at a high level (the Python
script) and the detailed output (the other file, search for the regexp
"\*\*" for major section headings).

Sven, if I've made a major mistake in the script, please let me know
outside of the mailing list.  I'll try to fix the script and, if
necessary, open another Bugzilla ticket.

-Scott

(*) Releasing basho_expect with a reasonable open source license is on
the Basho todo list.

_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
