0.14 node leaving mixed cluster

Timo Gatsonides Fri, 24 Feb 2012 23:09:26 -0800

I have a 6 node cluster. One node is running 1.1 already, the other 5 are still 
0.14.


I want not only to upgrade from 0.14 to 1.1 but also to change the backend 
from Bitcask to LevelDB, so I thought I would do a "leave cluster, wait for 
handoff, upgrade and change config, join cluster" cycle for each node.

Now I have run "riak-admin leave" and am getting some errors. Both on the node 
that has left and on the other nodes I get this:

$ sudo /usr/sbin/riak-admin ringready
Attempting to restart script through sudo -u riak
RPC to 'riak@….' failed: {'EXIT',
                                          {{badrecord,chstate},
                                           [{riak_core_ring,owner_node,1},
                                            {riak_kv_status,
                                             '-get_rings/0-lc$^0/1-0-',1},
                                            {riak_kv_status,
                                             '-get_rings/0-lc$^0/1-0-',1},
                                            {riak_kv_status,get_rings,0},
                                            {riak_kv_status,ringready,0},
                                            {riak_kv_console,ringready,1},
                                            {rpc,
                                             '-handle_call_call/6-fun-0-',
                                             5}]}}

I'm not sure whether handoffs are happening now, but CPU usage of the beam.smp 
process is very high on two of the nodes, including the one that is leaving.

Has anyone seen this before? And can someone please tell me what is the best 
way to monitor handoff progress?

Thanks,
Timo

P.S. riak-admin transfers shows the same error:
$ sudo /usr/sbin/riak-admin transfers
Attempting to restart script through sudo -u riak
RPC to 'riak@….' failed: {'EXIT',
                                          {{badrecord,chstate},
                                           [{riak_core_ring,owner_node,1},
                                            {riak_kv_status,
                                             '-get_rings/0-lc$^0/1-0-',1},
                                            {riak_kv_status,
                                             '-get_rings/0-lc$^0/1-0-',1},
                                            {riak_kv_status,get_rings,0},
                                            {riak_kv_status,transfers,0},
                                            {riak_kv_console,transfers,1},
                                            {rpc,
                                             '-handle_call_call/6-fun-0-',
                                             5}]}}
 
P.P.S. Could it be that in 1.1, or with LevelDB, the paths need to be absolute? 
I got all kinds of strange errors on the first 1.1 node when /var/lib/riak was 
a symlink to /data/riak, and these disappeared when I changed the 
configuration to use /data/riak directly in app.config.
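
For reference, the kind of app.config change described above looks roughly 
like the excerpt below. This is only a sketch, not a copy of the actual file: 
the subdirectory names (ring, leveldb) are assumed to follow the usual default 
layout, and only the /data/riak prefix comes from the setup described here.

```erlang
%% app.config excerpt (sketch; subdirectory names are assumptions)
[
 {riak_core, [
     %% point directly at /data/riak instead of the /var/lib/riak symlink
     {ring_state_dir, "/data/riak/ring"}
 ]},
 {riak_kv, [
     %% switch the storage backend from Bitcask to LevelDB
     {storage_backend, riak_kv_eleveldb_backend}
 ]},
 {eleveldb, [
     %% LevelDB data directory, again given as a real (non-symlinked) path
     {data_root, "/data/riak/leveldb"}
 ]}
].
```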
