On May 2, 2012, at 6:12 PM, Jon Meredith wrote:

> Hi Nitish, for this to work you'll have to stop all the nodes at the same
> time, clear the ring on all nodes, start up all nodes, then rejoin
>
> If you clear the rings one node at a time, when you rejoin the nodes the ring
> with the old and new style names will be gossipped back to it and you'll
> still have both names.

Sorry for the confusion. I didn't clear the rings one node at a time while keeping other nodes live. Following are the steps I followed:
1. Stop Riak on all the nodes.
2. Remove ring directory from all nodes.
3. Start the nodes and rejoin.
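
For reference, the three steps above can be sketched as a dry-run plan. This is only an illustration, not something taken from the thread: the host names, the ring directory path, and the backup path are assumptions, and the sketch moves the ring files aside (as suggested earlier in the thread) rather than deleting them. It prints the commands in order and does not execute anything; the point is that every node is stopped before any ring is touched, and every node is started before any join is issued.

```python
from typing import List, Tuple

# A step is (host to run the command on, shell command to run there).
Step = Tuple[str, str]


def build_rejoin_plan(hosts: List[str], ring_dir: str, backup_dir: str,
                      node_prefix: str = "riak") -> List[Step]:
    """Return the ordered commands for a full stop / clear ring / rejoin cycle.

    hosts, ring_dir and backup_dir are hypothetical inputs; adjust them to
    the actual installation. Nothing is executed here.
    """
    if len(hosts) < 2:
        raise ValueError("need at least two hosts to form a cluster")

    plan: List[Step] = []

    # Phase 1: stop Riak on every node before touching any ring files.
    for host in hosts:
        plan.append((host, "riak stop"))

    # Phase 2: move the ring files to a safe place on every node.
    for host in hosts:
        plan.append((host, f"mkdir -p {backup_dir} && mv {ring_dir}/* {backup_dir}/"))

    # Phase 3: start every node; each comes up as a single-node cluster.
    for host in hosts:
        plan.append((host, "riak start"))

    # Phase 4: join all remaining nodes to the first one.
    seed = f"{node_prefix}@{hosts[0]}"
    for host in hosts[1:]:
        plan.append((host, f"riak-admin join {seed}"))

    return plan


if __name__ == "__main__":
    # Hypothetical host names and paths, for illustration only.
    for host, command in build_rejoin_plan(
            ["node1", "node2", "node3"], "/var/lib/riak/ring", "/tmp/ring-backup"):
        print(f"{host}: {command}")
```

Issuing all the joins in one pass matches the advice elsewhere in the thread to add as many nodes as possible at once so that they rebalance together.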

> I didn't realize you had a large amount of data - originally you said
> "Currently, we are hosting limited amount of data", but 200mil docs per node
> seems like a fair amount. Rebuilding that size cluster may take a long time.
>

Yeah, we are currently serving a very limited amount of data because of the Riak shortage. In total, we have almost 750 million documents served by Riak.

> Your options as I see them are
> 1) If you have backups of the ring files, you could revert the node name
> changes and get the cluster stable again on riak@IP. The ring files have a
> timestamp associated with them, but we only keep a few of the last ring
> files, so if enough gossip has happened then the pre-rename rings will have
> been destroyed. You will have to stop all nodes, put the ring files back as
> they were before the change and fix the names in vm.args and then restart the
> nodes.
>
> 2) You can continue on the rebuild plan. Stop all nodes, set the new names
> in vm.args, start the nodes again and rebuild the cluster, adding as many
> nodes as you can at once so they rebalance at the same time. When new nodes
> are added the claimant node works out ownership changes and will start a
> sequence of transfers. If new nodes are added once a sequence is under way
> the claimant will wait for that to complete, then check if there are any new
> nodes and repeat until all nodes are assigned. If you add all the nodes at
> once you will do fewer transfers overall.
>
> If the cluster cannot be stopped, there are other things we might be able to
> do, but they're a bit more complex. What are your uptime requirements?
>

We have currently stopped the cluster and are running on a small amount of data. We can wait for the partition re-distribution to complete on Riak, but I don't have a strong feeling about it. "member_status" doesn't give us a correct picture: http://pastie.org/3849548. Is this expected behavior?
I should also mention that all the nodes are still loading existing data and it will take a few hours (2-3) until Riak KV is running on all of them.

Cheers
Nitish

> Jon
>
> On Wed, May 2, 2012 at 9:57 AM, Nitish Sharma <[email protected]>
> wrote:
> Hi Jon,
> Thanks for your input. I've already started working along those lines.
> I stopped all the nodes, moved ring directory from one node, brought that one
> up, and issued join command to one other node (after moving the ring
> directory) - node2. While they were busy re-distributing the partitions, I
> started another node (node3) and issued join command (before riak_kv was
> running, since it takes some time to load existing data).
> But after this, data handoffs are occurring only between node1 and node2.
> "member_status" says that node 3 owns 0% of the ring and 0% are pending.
> We have a lot of data - each node serves around 200 million documents. Riak
> cluster is running 1.1.2.
> Any suggestions?
>
> Cheers
> Nitish
> On May 2, 2012, at 5:31 PM, Jon Meredith wrote:
>
>> Hi Nitish,
>>
>> If you rebuild the cluster with the same ring size, the data will eventually
>> get back to the right place. While the rebuild is taking place you may have
>> notfounds for gets until the data has been handed off to the newly assigned
>> owner (as it will be secondary handoff, not primary ownership handoff to get
>> the data back). If you don't have a lot of data stored in the cluster it
>> shouldn't take too long.
>>
>> The process would be to stop all nodes, move the files out of the ring
>> directory to a safe place, start all nodes and rejoin. If you're using
>> 1.1.x and you have capacity in your hardware you may want to increase
>> handoff_concurrency to something like 4 to permit more transfers to happen
>> across the cluster.
>>
>> Jon.
>>
>> On Wed, May 2, 2012 at 9:05 AM, Nitish Sharma <[email protected]>
>> wrote:
>> Hi,
>> We have a 12-node Riak cluster. Until now we were naming every new node as
>> riak@<ip_address>. We then decided to rename all the nodes to
>> riak@<hostname>, which makes troubleshooting easier.
>> After issuing reip command to two nodes, we noticed in the "status" that
>> those 2 nodes were now appearing in the cluster with the old name as well as
>> the new name. Other nodes were trying to handoff partitions to the "new"
>> nodes, but apparently they were not able to. After this the whole cluster
>> went down and completely stopped responding to any read/write requests.
>> member_status displayed old Riak name in "legacy" mode. Since this is our
>> production cluster, we are desperately looking for some quick remedies.
>> Issuing "force-remove" to the old names, restarting all the nodes, changing
>> the riak names back to the old ones - none of it helped.
>> Currently, we are hosting limited amount of data. What's an elegant way to
>> recover from this mess? Would shutting off all the nodes, deleting the ring
>> directory, and again forming the cluster work?
>>
>> Cheers
>> Nitish
>> _______________________________________________
>> riak-users mailing list
>> [email protected]
>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>
>> --
>> Jon Meredith
>> Platform Engineering Manager
>> Basho Technologies, Inc.
>> [email protected]
>>
>
> --
> Jon Meredith
> Platform Engineering Manager
> Basho Technologies, Inc.
> [email protected]
>
