Ingo,
I have two guesses that might explain the symptoms:
- there is a bad drive in one of the nodes, or
- one or more nodes begin to use swap space during a compaction or 2i
iteration.
I might be able to describe / isolate the problem by examining the "LOG" files
produced by leveldb. Wou…
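Both guesses can be checked with standard tools. A minimal sketch, assuming Linux nodes and a leveldb data dir such as /var/lib/riak/leveldb (the LOG line below is illustrative only; the exact format varies by leveldb version):

```shell
# 1) How much swap is currently in use on a node, in kB (Linux /proc/meminfo)
swap_used_kb=$(awk '/^SwapTotal/ {t=$2} /^SwapFree/ {f=$2} END {print t-f}' /proc/meminfo)
echo "swap in use: ${swap_used_kb} kB"

# 2) Compaction activity in the leveldb LOG files; on a live node you might run:
#      grep -h "Compacting" /var/lib/riak/leveldb/*/LOG
#    A hypothetical LOG line, to show what to look for:
line="2013/03/19-10:41:00.123456 7f2b3a Compacting 4@1 + 3@2 files"
# The source level of the compaction is the digit after the first '@'
echo "$line" | sed -n 's/.*Compacting [0-9]*@\([0-9]*\).*/\1/p'
```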
Hi Mark,
we have updated to riak 1.3 and raised zdbbl to 32 MB but still run into
the phenomenon described. 9 of our 12 nodes suddenly drastically
drop their cpu utilisation; 3 nodes still have a normal cpu load, but
have some "busy_dist_port" messages in the console.log (a lot less since…
Hi, Ingo.
On Mar 19, 2013, at 10:41 AM, Ingo Rockel wrote:
> and the riak-users mailer-daemon should really set a "reply-to"…
Most email client programs have two well-understood controls for replies, one
for "reply (to sender)" and one for "reply to all."
We are not going to make one of them b…
> Subject: Re: riak cluster suddenly became unresponsive
> Date: Tue, 19 Mar 2013 15:40:12 +0100
> From: Ingo Rockel
> To: Mark Phillips
>
> Hi Mark,
>
> thanks!
>
> The 1.3 update is already planned.
>
> But we will add the zdbbl first as we ran into the same is…
and the riak-users mailer-daemon should really set a "reply-to"...
Hi Ingo,
Sorry for the delay in getting back to you.
This looks symptomatic of some of the scheduler issues we fixed in 1.3. A
few of the eleveldb issues in the release notes [1] provide precise
details. Is upgrading a possibility?
Tweaking your zdbbl in vm.args should alleviate some of t…
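For reference, the knob is the Erlang VM's distribution buffer busy limit, set in vm.args. The value below is only an example; +zdbbl takes kilobytes, so 32768 corresponds to the 32 MB mentioned elsewhere in the thread:

```
## vm.args -- raise the distribution buffer busy limit (value in KB)
+zdbbl 32768
```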
Hi,
we have a 12-node cluster running riak 1.2.1 which went live a week
ago. Yesterday, suddenly from one minute to the next, the put_fsm_time_95
and the get_fsm_time_95 rose from something below 100 ms to several
seconds. This went on for about 25 min and then went away.
Checking the ria…
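Those percentiles can be watched from the command line. A sketch using invented sample output (on a live node, pipe the real `riak-admin status` instead; Riak reports FSM times in microseconds):

```shell
# Illustrative sample of `riak-admin status` output (values invented)
status_sample='node_get_fsm_time_95 : 84000
node_put_fsm_time_95 : 91000'
# Convert the 95th-percentile FSM times from microseconds to milliseconds
echo "$status_sample" | awk -F' : ' '/fsm_time_95/ {printf "%s = %.0f ms\n", $1, $2/1000}'
```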