Re: Update to 1.4.8

2014-04-08 Thread Edgar Veiga
Hi Timo,

...So I stopped AAE on all nodes (with riak attach), removed the AAE
folders on all the nodes. And then restarted them one-by-one, so they
all started with a clean AAE state. Then about a day later the cluster
was finally in a normal state.

I don't understand the difference between what you did and what I described
in my earlier emails. I stopped AAE via riak attach, and then, one node at a
time, stopped the node, removed the anti-entropy data and started it again.
Is there any subtle difference I'm not getting?
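
To be concrete, the per-node sequence I mean is roughly the one below. This is
only a sketch, assuming the default package layout where the AAE trees live
under /var/lib/riak/anti_entropy; adjust paths to your install:

    # on each node, one at a time:
    riak attach                          # at the Erlang prompt:
                                         #   riak_kv_entropy_manager:disable().
                                         # ...then detach from the console
    riak stop
    rm -rf /var/lib/riak/anti_entropy/*
    riak start

(As far as I understand, disable() only affects the running node and doesn't
survive the restart, so AAE is active again as soon as the node comes back up.)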

I'm asking this because it hasn't proved to be enough to bring the load down
across the entire cluster. Another thing is the size of the anti-entropy data
dir: since the upgrade it has grown to very high values compared to what it
was before...

Best regards


Re: Update to 1.4.8

2014-04-08 Thread Timo Gatsonides

Unfortunately I don't have a 100% definitive answer for you; maybe someone from
Basho can advise.

In my case I noticed that after running riak_kv_entropy_manager:disable() the
I/O load did not decrease immediately, and on some servers it took quite a while
before iostat showed disk I/O returning to normal levels. I only removed the AAE
folders after the I/O load was back to normal.
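
Concretely, what I did per node was something like this (just a sketch; use
whatever iostat invocation and interval you prefer):

    # disable AAE on the running node via its console:
    #   riak attach
    #   riak_kv_entropy_manager:disable().
    # then watch the disks until I/O settles back to normal before touching anything:
    iostat -x 5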

Now that you have mentioned it, I just took a look at my servers and the
anti-entropy dir is large (500MB) on my servers too, although it varies from
one server to the next.

Best regards,
Timo



Re: Update to 1.4.8

2014-04-08 Thread Edgar Veiga
Well, my anti-entropy folders hold ~120GB on each machine. That's quite a
lot!

I have ~600GB of data per server and a cluster of 6 servers with LevelDB.
Just for comparison's sake, what about you?
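
If you want to compare like for like, I'm measuring it with something like
this (default package paths; adjust to your data dirs):

    du -sh /var/lib/riak/anti_entropy   # AAE hash trees
    du -sh /var/lib/riak/leveldb        # LevelDB data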

Someone from Basho, can you please advise on this one?

Best regards! :)






Re: Update to 1.4.8

2014-04-08 Thread Timo Gatsonides

I also have 6 servers. Each server has about 2TB of data, so maybe my
anti_entropy dir size is “normal”.

Kind regards,
Timo

p.s. I’m using the multi_backend as I have some data only in memory 
(riak_kv_memory_backend); all data on disk is in riak_kv_eleveldb_backend.
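
For context, the relevant riak_kv section looks roughly like the fragment
below (backend names and paths are simplified placeholders, not my literal
config):

    {riak_kv, [
      {storage_backend, riak_kv_multi_backend},
      {multi_backend_default, <<"eleveldb_be">>},
      {multi_backend, [
        %% all persistent data
        {<<"eleveldb_be">>, riak_kv_eleveldb_backend,
          [{data_root, "/var/lib/riak/leveldb"}]},
        %% memory-only buckets select this via the bucket's "backend" property
        {<<"memory_be">>, riak_kv_memory_backend, []}
      ]}
    ]}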






Re: Update to 1.4.8

2014-04-08 Thread Edgar Veiga
So Basho, to summarize:

I upgraded to the latest 1.4.8 version without removing the anti-entropy
data dir, because at the time that note wasn't yet in the 1.4.8 release
notes.

A few days later I did it: stopped AAE via riak attach, then restarted all
the nodes one by one, removing the anti-entropy data in between.

The expected results didn't happen. My LevelDB data dir has ~600GB per
server and the anti-entropy data dir has ~120GB, which seems to be quite a
lot :(
The cluster load is still high... write and read times are inconsistent and
both high.

I have a 6-machine cluster with LevelDB as the backend. The total number of
keys is about 2.5 billion.
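
In case it helps the diagnosis, this is what I'm watching on each node
(assuming the 1.4 admin tooling is available):

    riak-admin aae-status   # exchange times, entropy tree builds, keys repaired
    iostat -x 5             # per-device I/O while AAE is running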

Best regards!








Re: Update to 1.4.8

2014-03-07 Thread Timo Gatsonides

Hi Edgar,

Please note that I've run into the same problem, and I did what you're
describing: restarting the nodes one by one and removing the AAE folders.
However, I strongly suspect that this didn't solve it completely for me,
since a few nodes kept thrashing the disk continuously. My gut feeling is
that this is because AAE starts immediately when a node is started, and at
that point some of the other nodes still have the incorrect/faulty AAE data,
immediately triggering repairs or something.

So I stopped AAE on all nodes (with riak attach), removed the AAE folders on 
all the nodes. And then restarted them one-by-one, so they all started with a 
clean AAE state. Then about a day later the cluster was finally in a normal 
state.
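
If it helps, in shell terms the whole difference is the ordering; roughly
like this (hostnames, paths and the ssh loop are purely illustrative):

    # 1. on EVERY node first, disable AAE from the running node's console:
    #      riak attach
    #      riak_kv_entropy_manager:disable().
    #      (detach without stopping the node)

    # 2. with AAE disabled everywhere, remove the AAE trees on all nodes
    #    (default package path shown; adjust to your data dir):
    for n in riak1 riak2 riak3 riak4 riak5 riak6; do
      ssh "$n" 'rm -rf /var/lib/riak/anti_entropy/*'
    done

    # 3. restart the nodes one by one, so each comes back with a clean AAE
    #    state and none of its peers is still holding stale trees:
    for n in riak1 riak2 riak3 riak4 riak5 riak6; do
      ssh "$n" 'riak stop && riak start'
      # wait until the node has rejoined (riak-admin member-status) before the next
    done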

Regards,
Timo

p.s. Of course I could be wrong and maybe the I/O was caused by something else.







Re: Update to 1.4.8

2014-03-06 Thread Edgar Veiga
Hi Scott,

Thanks for replying.

After this problem, I've been faced with a huge amount of disk consumption
in the AAE folder (it's now on the order of 100GB).

Indeed, I've been talking with your colleague Brian Sparrow and he advised
me to remove the anti-entropy folder contents. Before that I'm going to
attach to the riak process and stop AAE. After that I'll restart all the
nodes one by one, removing the anti-entropy contents in between.

I should have done this when I updated to 1.4.8, but unfortunately the note
only appeared in the release notes after I had started the upgrade
process...

Best regards!





Re: Update to 1.4.8

2014-03-05 Thread Scott Lystig Fritchie
Edgar Veiga edgarmve...@gmail.com wrote:

ev> Is this normal?

Yes.  Either one or more of your vnodes can't keep up with the workload
generated by AAE repair, or a vnode can't keep up for some other reason, and
AAE repair shouldn't actively make things worse.
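
If it turns out a vnode really can't keep up, the knobs that bound how much
work AAE generates live in the riak_kv section of app.config. Purely as an
illustration (values are the defaults as I remember them; double-check the
docs for your version):

    {riak_kv, [
      %% max concurrent AAE exchanges / tree builds per node
      {anti_entropy_concurrency, 2},
      %% at most N hash tree builds per Interval (in ms) per node
      {anti_entropy_build_limit, {1, 3600000}}
    ]}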

The logging is done by:

https://github.com/basho/riak_kv/blob/1.4/src/riak_kv_entropy_manager.erl#L821

-Scott
