Hey Joe, the upgrade itself appears to be fine on stage. But I think I might have an issue with the capability negotiation.
Server A thinks that Server B is a legacy node. Server B thinks the same of Server A.

On Server A (running riak-admin member_status):

(legacy)  50.0%  --  'riak@SERVER_B'
valid     50.0%  --  'riak@SERVER_A'

and on Server B:

valid     50.0%  --  'riak@SERVER_B'
(legacy)  50.0%  --  'riak@SERVER_A'

riak-admin transfers does not work properly either. Each node thinks that the other node is down. riak-admin ring_status just prints: "Currently in legacy gossip mode."

Restarting the nodes did not have an effect. Any ideas?

Best

Sebastian

On 09.08.2012, at 23:14, Sebastian Cohnen <[email protected]> wrote:

> Hey Joe,
>
> thanks for your detailed description of the problem.
>
> I already assumed that this is not necessarily an indicator for problems. I
> just wanted to make sure I'm not missing anything important. ring_status just
> tells me "Currently in legacy gossip mode.", but member_status looks very
> informative.
>
> It's getting a bit too late (I'm on CEST) to continue to work on the
> migration testing, but I'll continue tomorrow.
>
> Thanks again for your help!
>
> Sebastian
>
> On 09.08.2012, at 22:47, Joseph Blomstedt <[email protected]> wrote:
>
>> Yes, this makes sense unfortunately. 'riak-admin transfers' isn't
>> going to work for you in a mixed 0.14.2 and 1.2 cluster.
>>
>> Between 0.14.2 and 1.0, the entire cluster system was revamped. One
>> consequence of this change was that 'riak-admin transfers' would only
>> work on the 1.0+ nodes in the cluster, not any of the 0.14.2 nodes. At
>> the time, this wasn't a major issue because you could just use the
>> command on the right nodes and get the information you needed until
>> all nodes were eventually upgraded.
>>
>> For Riak 1.2, 'riak-admin transfers' has been changed again. This
>> time, in a mixed cluster 'riak-admin transfers' only works on the
>> older nodes, not the Riak 1.2 nodes.
>> For example, in a mixed 1.1 and
>> 1.2 cluster, you can only use riak-admin transfers on the 1.1 nodes
>> until all have been upgraded.
>>
>> Unfortunately, the combination doesn't work out well for you in this
>> case. Riak 0.14.2 transfers fails if there are any 1.0+ nodes in the
>> cluster, and Riak 1.2 transfers fails if there are any <1.2 nodes in
>> the cluster. Both are true, and therefore neither version of Riak
>> can properly give you transfer information.
>>
>> Of course, the lack of being able to monitor transfers doesn't mean
>> things aren't actually working. Running 'riak-admin member_status' and
>> 'riak-admin ring_status' on the newer nodes should provide enough
>> detail about what's going on to see if your cluster is moving along.
>>
>> Regards,
>> Joe
>>
>> On Thu, Aug 9, 2012 at 1:25 PM, Sebastian Cohnen
>> <[email protected]> wrote:
>>> I forgot to mention that I also ran
>>> "riak_core_node_watcher:service_up(riak_pipe, self())." on the 0.14.2 node
>>> (got that from here: http://wiki.basho.com/Rolling-Upgrades.html)
>>>
>>> On 09.08.2012, at 22:16, Sebastian Cohnen <[email protected]>
>>> wrote:
>>>
>>>> Hey all,
>>>>
>>>> looks like I'm already stuck :-/
>>>>
>>>> I'm trying to test the upgrade on a stage cluster (with 2 nodes). What I
>>>> did so far:
>>>> * downloaded 1.2
>>>> * stopped riak
>>>> * backup /var/lib/riak/ring and /etc/riak
>>>> * installed 1.2
>>>> * changed app.config and vm.args (just node name, ring creation size,
>>>>   config for our multi-backends)
>>>> * started riak again
>>>>
>>>> riak-admin status looked fine, ring membership is fine, both nodes answer
>>>> requests. As hinted by Jon, I attached to riak console and ran
>>>> riak_core_capability:all(). As far as I can tell, everything looks okay
>>>> here too.
>>>>
>>>> What is not working is: riak-admin transfers. It is not working on either
>>>> node. For the stage situation this is not a big deal, for production this
>>>> would be a potential problem.
>>>>
>>>> I've pasted the output of "riak_core_capability:all()." and command output
>>>> of riak-admin transfers here: https://gist.github.com/3307714
>>>>
>>>> Is there anything I can do about that?
>>>>
>>>> Best
>>>>
>>>> Sebastian
>>>>
>>>> PS: What's interesting is that I think that I saw a similar behavior while
>>>> trying to upgrade to 1.1.4 a few days ago. I have to double check that
>>>> though.
>>>>
>>>> On 09.08.2012, at 14:08, Sebastian Cohnen <[email protected]>
>>>> wrote:
>>>>
>>>>> I'm actually thinking about taking the risk. We only have a small 3-node
>>>>> cluster with ~50GB of data with relatively little traffic (and we don't
>>>>> have any 2i, nor do we use search or MR).
>>>>>
>>>>> I'll backup the data files, the ring state and everything else I find and
>>>>> give it a try. If anything strange happens, we roll back and do the
>>>>> additional 1.1.4 step.
>>>>>
>>>>> Thanks for the information and help so far!
>>>>>
>>>>> On 08.08.2012, at 19:57, Jon Meredith <[email protected]> wrote:
>>>>>
>>>>>> Only test coverage. We didn't run direct testing to 0.14.2 - we also
>>>>>> deliberately made the decision not to remove some older code that would
>>>>>> have broken 0.14 upgrades until the next major release.
>>>>>>
>>>>>> It all depends on your risk tolerance - we didn't make any file format
>>>>>> changes to bitcask so your data should be safe. If you wanted to try
>>>>>> it, I would take a backup of the ring directory in case you had to
>>>>>> downgrade the node again for any reason.
>>>>>>
>>>>>> On the newly upgraded node you could run riak_core_capability:all(). on
>>>>>> the riak console, that would double-check that the settings matched the
>>>>>> required rolling upgrade settings, and make sure you do a diff of your
>>>>>> app.config/vm.args against the new package to check there aren't any
>>>>>> settings missing.
>>>>>>
>>>>>> Jon.
>>>>>>
>>>>>> On Wed, Aug 8, 2012 at 11:39 AM, Sebastian Cohnen
>>>>>> <[email protected]> wrote:
>>>>>> I'm curious, are there any special reasons for your recommendation?
>>>>>>
>>>>>> On 08.08.2012, at 19:38, Jon Meredith <[email protected]> wrote:
>>>>>>
>>>>>>> I would recommend going 0.14.2 -> 1.1.4 -> 1.2, making sure you follow
>>>>>>> the pre-1.0 upgrade instructions on
>>>>>>> http://wiki.basho.com/Rolling-Upgrades.html
>>>>>>>
>>>>>>> Once you do the upgrade from 1.2, the capabilities system will kick in
>>>>>>> and the old legacy settings mentioned in the rolling upgrade will no
>>>>>>> longer be used (if you need to you can override them with the new
>>>>>>> capability override mechanism).
>>>>>>>
>>>>>>> Jon.
>>>>>>>
>>>>>>> On Wed, Aug 8, 2012 at 10:23 AM, Nathan Wilken <[email protected]> wrote:
>>>>>>> Is an intermediate upgrade recommended? 0.14.2 --> 1.0/1.1 --> 1.2?
>>>>>>>
>>>>>>> From: [email protected]
>>>>>>> [[email protected]] on behalf of Sean Cribbs
>>>>>>> [[email protected]]
>>>>>>> Sent: Wednesday, August 08, 2012 6:35 AM
>>>>>>> To: Sebastian Cohnen
>>>>>>> Cc: [email protected]
>>>>>>> Subject: Re: Upgrading 0.14.2 cluster to 1.2
>>>>>>>
>>>>>>> Sebastian,
>>>>>>>
>>>>>>> While it might work, we did not specifically test upgrades from 0.14.2,
>>>>>>> only 1.0 and 1.1.
>>>>>>>
>>>>>>> On Wed, Aug 8, 2012 at 7:08 AM, Sebastian Cohnen
>>>>>>> <[email protected]> wrote:
>>>>>>> Hey list,
>>>>>>>
>>>>>>> is it a good idea to upgrade a small (3 node) cluster straight to 1.2
>>>>>>> from 0.14.2. Especially with riak's 1.2 capabilities negotiation, it
>>>>>>> feels like the upgrade process should be much simpler now? We don't do
>>>>>>> any M/R jobs currently and we are only using bitcask right now.
>>>>>>>
>>>>>>> Best
>>>>>>>
>>>>>>> Sebastian
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> riak-users mailing list
>>>>>>> [email protected]
>>>>>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>>>>>>
>>>>>>> --
>>>>>>> Sean Cribbs <[email protected]>
>>>>>>> Software Engineer
>>>>>>> Basho Technologies, Inc.
>>>>>>> http://basho.com/
>>>>>>>
>>>>>>> --
>>>>>>> Jon Meredith
>>>>>>> Platform Engineering Manager
>>>>>>> Basho Technologies, Inc.
>>>>>>> [email protected]
>>>>>>
>>>>>> --
>>>>>> Jon Meredith
>>>>>> Platform Engineering Manager
>>>>>> Basho Technologies, Inc.
>>>>>> [email protected]
>>
>> --
>> Joseph Blomstedt <[email protected]>
>> Senior Software Engineer
>> Basho Technologies, Inc.
>> http://www.basho.com/
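
For anyone skimming the thread: Joe's explanation of where 'riak-admin transfers' works in a mixed-version cluster can be written down as a small Python sketch. This only models the rule as described in his mail; it does not talk to Riak. The names `parse_version` and `transfers_works_on` are hypothetical helpers made up for illustration, and the 1.0/1.1 branch is an assumption based on the thread saying the command works on those nodes in both kinds of mixed cluster.

```python
def parse_version(version):
    """Turn a version string such as '0.14.2' into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def transfers_works_on(node_version, cluster_versions):
    """Model of the rule from the thread (hypothetical helper, not a Riak API).

    node_version:     version of the node where 'riak-admin transfers' is run
    cluster_versions: versions of all nodes in the cluster, including that node
    """
    node = parse_version(node_version)
    cluster = [parse_version(v) for v in cluster_versions]

    if node < (1, 0):
        # Pre-1.0 (e.g. 0.14.2): fails if there are any 1.0+ nodes.
        return not any(v >= (1, 0) for v in cluster)
    if node >= (1, 2):
        # 1.2: fails if there are any nodes older than 1.2.
        return not any(v < (1, 2) for v in cluster)
    # Assumption: 1.0 and 1.1 nodes can run the command in a mixed cluster.
    return True


if __name__ == "__main__":
    mixed = ["0.14.2", "1.2"]
    for version in mixed:
        print(version, transfers_works_on(version, mixed))
```

With a 0.14.2 node and a 1.2 node in the same cluster, the function returns False for both, which matches the situation described above: neither side can report transfers until all nodes are upgraded.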
