Hi,

My 5-node cluster exhibits a strange latency spike on one particular node.

Across the cluster, the mean GET time is about 1 ms, but this node
occasionally shoots up to 40 ms.

During those spikes, %iowait stays the same as it was before the spike, and
no errors are logged. The console log shows many lines like the one below,
which I don't think are relevant to the spike.

2012-05-30 21:29:50.591 [info] <0.72.0>@riak_core_sysmon_handler:handle_event:85 monitor long_gc <0.938.0> [{initial_call,{riak_core_vnode,init,1}},{almost_current_function,{gen_fsm,loop,7}},{message_queue_len,0}] [{timeout,185},{old_heap_block_size,0},{heap_block_size,2584},{mbuf_size,0},{stack_size,55},{old_heap_size,0},{heap_size,804}]
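
To double-check that, the long_gc entries could be lined up against the spike
times. Below is a minimal Python sketch that pulls the timestamp, pid, and GC
duration out of one entry. It assumes each entry sits on a single line of the
console log, and that the timeout value is the GC time in milliseconds, as the
Erlang system_monitor documentation describes for long_gc. The name
parse_long_gc is made up for illustration and is not part of Riak.

```python
import re
from datetime import datetime

# Matches the leading timestamp of a console log entry, the pid reported
# after "monitor long_gc", and the {timeout,N} tuple in the info list.
LONG_GC = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) .*"
    r"monitor long_gc (?P<pid><[\d.]+>).*\{timeout,(?P<ms>\d+)\}"
)


def parse_long_gc(line):
    """Return (timestamp, pid, gc_ms) for a long_gc entry, else None."""
    m = LONG_GC.search(line)
    if m is None:
        return None
    ts = datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S.%f")
    # Assumption: the timeout value is the GC duration in milliseconds.
    return ts, m.group("pid"), int(m.group("ms"))


# The entry quoted above, joined onto one line.
sample = (
    "2012-05-30 21:29:50.591 [info] "
    "<0.72.0>@riak_core_sysmon_handler:handle_event:85 "
    "monitor long_gc <0.938.0> "
    "[{initial_call,{riak_core_vnode,init,1}},"
    "{almost_current_function,{gen_fsm,loop,7}},{message_queue_len,0}] "
    "[{timeout,185},{old_heap_block_size,0},{heap_block_size,2584},"
    "{mbuf_size,0},{stack_size,55},{old_heap_size,0},{heap_size,804}]"
)

print(parse_long_gc(sample))
```

Running this over every line of the log and comparing the timestamps with the
times of the GET spikes would show whether the two coincide.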

All nodes are set up identically: 64-bit Ubuntu on m2.2xlarge instances,
running Riak 1.1.2 with the LevelDB backend.

What would be the best course of action for me?

I plan to:

- riak-admin leave on that node
- set up a new instance
- riak-admin reip the new instance
- riak-admin join it to the cluster

Cheers,
Nam


_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
