Hey Joe,

the upgrade itself appears to be fine on stage. But I think I might have an 
issue with the capability negotiation.

Server A thinks that Server B is a legacy node. Server B thinks the same of 
Server A.

On Server A (running riak-admin member_status):

(legacy)   50.0%      --      'riak@SERVER_B'
valid      50.0%      --      'riak@SERVER_A'

and on Server B:

valid      50.0%      --      'riak@SERVER_B'
(legacy)   50.0%      --      'riak@SERVER_A'
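
To compare the two views mechanically, here is a minimal Python sketch. It assumes member_status rows always have the four columns shown above (status, ring share, pending, quoted node name). The helper names parse_member_status and mutually_legacy are made up for illustration and are not part of Riak.

```python
import re

# Assumed row shape, taken from the output pasted above:
#   (legacy)   50.0%      --      'riak@SERVER_B'
#   valid      50.0%      --      'riak@SERVER_A'
ROW_RE = re.compile(r"^\(?(\w+)\)?\s+([\d.]+)%\s+(\S+)\s+'([^']+)'$")

SERVER_A_OUTPUT = """\
(legacy)   50.0%      --      'riak@SERVER_B'
valid      50.0%      --      'riak@SERVER_A'
"""

SERVER_B_OUTPUT = """\
valid      50.0%      --      'riak@SERVER_B'
(legacy)   50.0%      --      'riak@SERVER_A'
"""


def parse_member_status(text):
    """Map each node name to its status column, with parentheses stripped."""
    members = {}
    for line in text.splitlines():
        match = ROW_RE.match(line.strip())
        if match:  # skip headers, separators, and blank lines
            status, _ring, _pending, node = match.groups()
            members[node] = status
    return members


def mutually_legacy(view_a, view_b, node_a, node_b):
    """True if each node's view marks the *other* node as legacy."""
    return view_a.get(node_b) == "legacy" and view_b.get(node_a) == "legacy"


if __name__ == "__main__":
    view_a = parse_member_status(SERVER_A_OUTPUT)
    view_b = parse_member_status(SERVER_B_OUTPUT)
    print(view_a)
    print(view_b)
    print(mutually_legacy(view_a, view_b, "riak@SERVER_A", "riak@SERVER_B"))
```

Run against the two outputs above, this reports that each node sees itself as valid and the other as legacy, which is exactly the symmetric situation I'm describing.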


riak-admin transfers does not work properly either. Each node thinks that the 
other node is down.

riak-admin ring_status just prints: "Currently in legacy gossip mode."


Restarting the nodes did not have an effect. Any ideas?


Best

Sebastian


On 09.08.2012, at 23:14, Sebastian Cohnen <[email protected]> wrote:

> Hey Joe,
> 
> thanks for your detailed description of the problem.
> 
> I already assumed that this is not necessarily an indicator of problems. I 
> just wanted to make sure I'm not missing anything important. ring_status just 
> tells me "Currently in legacy gossip mode.", but member_status looks very 
> informative.
> 
> It's getting a bit too late (I'm on CEST) to continue to work on the 
> migration testing, but I'll continue tomorrow.
> 
> 
> Thanks again for your help!
> 
> Sebastian
> 
> On 09.08.2012, at 22:47, Joseph Blomstedt <[email protected]> wrote:
> 
>> Yes, this makes sense unfortunately. 'riak-admin transfers' isn't
>> going to work for you in a mixed 0.14.2 and 1.2 cluster.
>> 
>> Between 0.14.2 and 1.0, the entire cluster system was revamped. One
>> consequence of this change was that 'riak-admin transfers' would only
>> work on the 1.0+ nodes in the cluster, not any of the 0.14.2 nodes. At
>> the time, this wasn't a major issue because you could just use the
>> command on the right nodes and get the information you needed until
>> all nodes were eventually upgraded.
>> 
>> For Riak 1.2, 'riak-admin transfers' has been changed again. This
>> time, in a mixed cluster 'riak-admin transfers' only works on the
>> older nodes, not the Riak 1.2 nodes. For example, in a mixed 1.1 and
>> 1.2 cluster, you can only use riak-admin transfers on the 1.1 nodes
>> until all have been upgraded.
>> 
>> Unfortunately, the combination doesn't work out well for you in this
>> case. Riak 0.14.2 transfers fails if there are any 1.0+ nodes in the
>> cluster, and Riak 1.2 transfers fails if there are any <1.2 nodes in
>> the cluster. Both are true, and therefore neither version of Riak
>> can properly give you transfer information.
>> 
>> Of course, the lack of being able to monitor transfers doesn't mean
>> things aren't actually working. Running 'riak-admin member_status' and
>> 'riak-admin ring_status' on the newer nodes should provide enough
>> detail about what's going on to see if your cluster is moving along.
>> 
>> Regards,
>> Joe
>> 
>> 
>> On Thu, Aug 9, 2012 at 1:25 PM, Sebastian Cohnen
>> <[email protected]> wrote:
>>> I forgot to mention that I also ran 
>>> "riak_core_node_watcher:service_up(riak_pipe, self())." on the 0.14.2 node 
>>> (got that from here: http://wiki.basho.com/Rolling-Upgrades.html)
>>> 
>>> On 09.08.2012, at 22:16, Sebastian Cohnen <[email protected]> 
>>> wrote:
>>> 
>>>> Hey all,
>>>> 
>>>> looks like I'm already stuck :-/
>>>> 
>>>> I'm trying to test the upgrade on a stage cluster (with 2 nodes). What I 
>>>> did so far:
>>>> * downloaded 1.2
>>>> * stopped riak
>>>> * backed up /var/lib/riak/ring and /etc/riak
>>>> * installed 1.2
>>>> * changed app.config and vm.args (just node name, ring creation size, 
>>>> config for our multi-backends)
>>>> * started riak again
>>>> 
>>>> riak-admin status looked fine, ring membership is fine, both nodes answer 
>>>> requests. As hinted by Jon, I attached to riak console and ran 
>>>> riak_core_capability:all(). As far as I can tell, everything looks okay 
>>>> here too.
>>>> 
>>>> What is not working is: riak-admin transfers. It does not work on either 
>>>> node. For the stage situation this is not a big deal, but for production 
>>>> this would be a potential problem.
>>>> 
>>>> I've pasted the output of "riak_core_capability:all()." and command output 
>>>> of riak-admin transfers here: https://gist.github.com/3307714
>>>> 
>>>> Is there anything I can do about that?
>>>> 
>>>> 
>>>> Best
>>>> 
>>>> Sebastian
>>>> 
>>>> 
>>>> PS: What's interesting is that I think that I saw a similar behavior while 
>>>> trying to upgrade to 1.1.4 a few days ago. I have to double check that 
>>>> though.
>>>> 
>>>> On 09.08.2012, at 14:08, Sebastian Cohnen <[email protected]> 
>>>> wrote:
>>>> 
>>>>> I'm actually thinking about taking the risk. We only have a small 3-node 
>>>>> cluster with ~50GB of data with relatively little traffic (and we don't 
>>>>> have any 2i, nor do we use search or MR).
>>>>> 
>>>>> I'll back up the data files, the ring state and everything else I find and 
>>>>> give it a try. If anything strange happens, we roll back and do the 
>>>>> additional 1.1.4 step.
>>>>> 
>>>>> Thanks for the information and help so far!
>>>>> 
>>>>> On 08.08.2012, at 19:57, Jon Meredith <[email protected]> wrote:
>>>>> 
>>>>>> Only test coverage.  We didn't run direct testing to 0.14.2 - we also 
>>>>>> deliberately made the decision not to remove some older code that would 
>>>>>> have broken 0.14 upgrades until the next major release.
>>>>>> 
>>>>>> It all depends on your risk tolerance - we didn't make any file format 
>>>>>> changes to bitcask so your data should be safe.  If you wanted to try 
>>>>>> it, I would take a backup of the ring directory in case you had to 
>>>>>> downgrade the node again for any reason.
>>>>>> 
>>>>>> On the newly upgraded node you could run riak_core_capability:all(). on 
>>>>>> the riak console, that would double-check that the settings matched the 
>>>>>> required rolling upgrade settings, and make sure you do a diff of your 
>>>>>> app.config/vm.args against the new package to check there aren't any 
>>>>>> settings missing.
>>>>>> 
>>>>>> Jon.
>>>>>> 
>>>>>> On Wed, Aug 8, 2012 at 11:39 AM, Sebastian Cohnen 
>>>>>> <[email protected]> wrote:
>>>>>> I'm curious, are there any special reasons for your recommendation?
>>>>>> 
>>>>>> On 08.08.2012, at 19:38, Jon Meredith <[email protected]> wrote:
>>>>>> 
>>>>>>> I would recommend going 0.14.2 -> 1.1.4 -> 1.2, making sure you follow 
>>>>>>> the pre-1.0 upgrade instructions on 
>>>>>>> http://wiki.basho.com/Rolling-Upgrades.html
>>>>>>> 
>>>>>>> Once you do the upgrade from 1.2, the capabilities system will kick in 
>>>>>>> and the old legacy settings mentioned in the rolling upgrade will no 
>>>>>>> longer be used (if you need to you can override them with the new 
>>>>>>> capability override mechanism).
>>>>>>> 
>>>>>>> Jon.
>>>>>>> 
>>>>>>> On Wed, Aug 8, 2012 at 10:23 AM, Nathan Wilken <[email protected]> wrote:
>>>>>>> Is an intermediate upgrade recommended?  0.14.2 --> 1.0/1.1 --> 1.2?
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> From: [email protected] 
>>>>>>> [[email protected]] on behalf of Sean Cribbs 
>>>>>>> [[email protected]]
>>>>>>> Sent: Wednesday, August 08, 2012 6:35 AM
>>>>>>> To: Sebastian Cohnen
>>>>>>> Cc: [email protected]
>>>>>>> Subject: Re: Upgrading 0.14.2 cluster to 1.2
>>>>>>> 
>>>>>>> Sebastian,
>>>>>>> 
>>>>>>> While it might work, we did not specifically test upgrades from 0.14.2, 
>>>>>>> only 1.0 and 1.1.
>>>>>>> 
>>>>>>> On Wed, Aug 8, 2012 at 7:08 AM, Sebastian Cohnen 
>>>>>>> <[email protected]> wrote:
>>>>>>> Hey list,
>>>>>>> 
>>>>>>> is it a good idea to upgrade a small (3 node) cluster straight to 1.2 
>>>>>>> from 0.14.2? Especially with Riak 1.2's capability negotiation, it 
>>>>>>> feels like the upgrade process should be much simpler now. We don't do 
>>>>>>> any M/R jobs currently and we are only using bitcask right now.
>>>>>>> 
>>>>>>> 
>>>>>>> Best
>>>>>>> 
>>>>>>> Sebastian
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> _______________________________________________
>>>>>>> riak-users mailing list
>>>>>>> [email protected]
>>>>>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> --
>>>>>>> Sean Cribbs <[email protected]>
>>>>>>> Software Engineer
>>>>>>> Basho Technologies, Inc.
>>>>>>> http://basho.com/
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> --
>>>>>>> Jon Meredith
>>>>>>> Platform Engineering Manager
>>>>>>> Basho Technologies, Inc.
>>>>>>> [email protected]
>>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> --
>>>>>> Jon Meredith
>>>>>> Platform Engineering Manager
>>>>>> Basho Technologies, Inc.
>>>>>> [email protected]
>>>>>> 
>>>>> 
>>>> 
>>> 
>>> 
>> 
>> 
>> 
>> -- 
>> Joseph Blomstedt <[email protected]>
>> Senior Software Engineer
>> Basho Technologies, Inc.
>> http://www.basho.com/
> 


