Re: Rebuilding AAE hashes - small question

2014-04-11 Thread Timo Gatsonides
 Date: Thu, 10 Apr 2014 14:42:23 +0100
 From: Guido Medina guido.med...@temetra.com
 To: Engel Sanchez en...@basho.com, Luke Bakken lbak...@basho.com
 Cc: riak-users riak-users@lists.basho.com
 Subject: Re: Rebuilding AAE hashes - small question
 Message-ID: 53469fbf.1000...@temetra.com
 Content-Type: text/plain; charset=iso-8859-1; Format=flowed
 
 Thanks Engel,
 
 That approach looks very accurate. I would only suggest adding a
 riak-admin cluster stop-aae command, and a similar one for start, for
 the dummies ;-)
 

I agree with Guido that a riak-admin command would be the best solution for
the dummies. A good interim solution would be for Basho to publish a
“definitive guide to enabling and disabling AAE” somewhere. I have now seen a
lot of different commands to achieve this: in this thread, in the “RIAK 1.4.6 -
Mass key deletion” thread, and in some older threads.

There is the per-host approach and there seem to be several cluster-wide / rpc 
approaches. I have seen:

a.
rpc:multicall(riak_kv_entropy_manager, disable, []).
rpc:multicall(riak_kv_entropy_manager, cancel_exchanges, []).
z.

(with questions about the a. and z.) and there is:

riak_core_util:rpc_every_member_ann(riak_kv_entropy_manager, disable, [], 6).

with a similar call to enable.

Can someone please advise what the “authoritative” way to enable/disable AAE is?

Kind regards,
Timo



___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Rebuilding AAE hashes - small question

2014-04-10 Thread Engel Sanchez
Hey there. There are a couple of things to keep in mind when deleting
invalid AAE trees from the 1.4.3-1.4.7 series after upgrading to 1.4.8:

* If AAE is disabled, you don't have to stop the node to delete the data in
the anti_entropy directories
* If AAE is enabled, deleting the AAE data in a rolling manner may trigger
an avalanche of read repairs between nodes with the bad trees and nodes
with good trees as the data seems to diverge.

If your nodes are already up, with AAE enabled and with old incorrect trees
in the mix, there is a better way.  You can dynamically disable AAE with
some console commands. At that point, without stopping the nodes, you can
delete all AAE data across the cluster.  At a convenient time, re-enable
AAE.  I say convenient because all trees will start to rebuild, and that
can be problematic in an overloaded cluster.  Doing this over the weekend
might be a good idea unless your cluster can take the extra load.

To dynamically disable AAE from the Riak console, you can run this command:

 riak_core_util:rpc_every_member_ann(riak_kv_entropy_manager, disable, [], 6).

and enable with the similar:

 riak_core_util:rpc_every_member_ann(riak_kv_entropy_manager, enable, [], 6).

That last number is just a timeout for the RPC operation.  I hope this
saves you some extra load on your clusters.
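
The delete step described above could be scripted per node along these
lines. This is only a minimal sketch: it assumes AAE has already been
disabled cluster-wide from the console, clear_aae_data is a hypothetical
helper name, and the default path is the one from Guido's original command
(it may differ on your install).

```shell
#!/bin/sh
# Hypothetical helper: empty the anti_entropy directory on the local node
# while leaving the directory itself in place.
# Assumes AAE was already disabled from the Riak console.
clear_aae_data() {
    aae_dir="${1:-/var/lib/riak/anti_entropy}"
    if [ ! -d "$aae_dir" ]; then
        echo "no such directory: $aae_dir" >&2
        return 1
    fi
    # ${aae_dir:?} aborts if the variable is ever empty, so this can
    # never expand to "rm -rf /*".
    rm -rf -- "${aae_dir:?}"/*
}
```

Run clear_aae_data on each node (optionally passing a different path),
then re-enable AAE from the console at a convenient time.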




Re: Rebuilding AAE hashes - small question

2014-04-10 Thread Guido Medina

Thanks Engel,

That approach looks very accurate. I would only suggest adding a
riak-admin cluster stop-aae command, and a similar one for start, for
the dummies ;-)


Guido.



Rebuilding AAE hashes - small question

2014-04-09 Thread Guido Medina

Hi,

If nodes are already upgraded to 1.4.8 (and they went all the way from
1.4.0 to 1.4.8, including the AAE buggy versions), will the following
command (as root) on Ubuntu Server 12.04:

   riak stop; rm -Rf /var/lib/riak/anti_entropy/*; riak start

executed on each node be enough to rebuild AAE hashes?

Regards,

Guido.


Re: Rebuilding AAE hashes - small question

2014-04-09 Thread Luke Bakken
Hi Guido,

That is the correct process. Be sure to use the rolling restart procedure
when restarting nodes (i.e. wait for handoff to finish before moving on).
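
For illustration, the per-node step could be wrapped roughly as below. This
is a sketch under assumptions, not an official procedure:
rebuild_aae_on_node and the RIAK variable are hypothetical names, and the
default path is taken from the command in the original question.

```shell
#!/bin/sh
# RIAK can be overridden to point at a different binary (or a test stub).
RIAK="${RIAK:-riak}"

# Hypothetical helper: stop the local node, remove its AAE data, start it.
rebuild_aae_on_node() {
    aae_dir="${1:-/var/lib/riak/anti_entropy}"
    # Check the path before stopping anything.
    [ -d "$aae_dir" ] || { echo "no such directory: $aae_dir" >&2; return 1; }
    # Do not delete anything unless the node actually stopped.
    "$RIAK" stop || return 1
    rm -rf -- "${aae_dir:?}"/*
    "$RIAK" start
}
```

Run it on one node at a time, and wait for handoff to finish before moving
on to the next node.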

--
Luke Bakken
CSE
lbak...@basho.com




Re: Rebuilding AAE hashes - small question

2014-04-09 Thread Guido Medina

What do you mean by wait for handoff to finish?

Are you referring to waiting for the service to be fully started, i.e.
riak-admin wait-for-service riak_kv riak@node?


Or do you mean checking riak-admin transfers on the started node and
waiting until those handoffs/transfers are gone?


Guido.




Re: Rebuilding AAE hashes - small question

2014-04-09 Thread Luke Bakken
Hi Guido,

I specifically meant riak-admin transfers; however, using riak-admin
wait-for-service riak_kv riak@node is a good first step before waiting for
transfers.
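
A rough sketch of the "wait for transfers" part is below. The names
wait_for_transfers and RIAK_ADMIN are hypothetical, and the check assumes
that an idle cluster makes riak-admin transfers print a line containing
"No transfers active"; verify that against the output of your version
before relying on it.

```shell
#!/bin/sh
# RIAK_ADMIN can be overridden to point at a different binary (or a stub).
RIAK_ADMIN="${RIAK_ADMIN:-riak-admin}"

# Hypothetical helper: poll `riak-admin transfers` until nothing is pending.
# Assumption: an idle cluster prints a line with "No transfers active".
wait_for_transfers() {
    interval="${1:-10}"     # seconds between polls
    max_tries="${2:-360}"   # give up after this many polls
    tries=0
    while [ "$tries" -lt "$max_tries" ]; do
        if "$RIAK_ADMIN" transfers 2>/dev/null | grep -q "No transfers active"; then
            return 0
        fi
        tries=$((tries + 1))
        sleep "$interval"
    done
    echo "transfers still active after $max_tries checks" >&2
    return 1
}
```

After restarting a node, this could be combined with the first step as:

 riak-admin wait-for-service riak_kv riak@node && wait_for_transfers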

Thanks!
--
Luke Bakken
CSE
lbak...@basho.com

