Update PB ip in runtime

2016-04-11 Thread Edgar Veiga
Hi everyone!

Quick question: is it possible to update the PB ip:port entry at runtime
without restarting the node?
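
For context, the PB listener is normally defined statically in the node's
config; a minimal sketch, assuming a Riak 2.x riak.conf (the IP/port are
placeholders, and on 1.x the equivalent lives under riak_api's pb entry in
app.config):

## riak.conf: protocol buffers listener (example address)
listener.protobuf.internal = 10.0.0.5:8087

%% via riak attach, inspect what the running node loaded (assumes the
%% setting is exposed as riak_api's 'pb' application env)
application:get_env(riak_api, pb).

Editing the file only takes effect on the next start; whether the listener
can be rebound on a live node is exactly the open question here.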

Best regards!


-- 
*Edgar Veiga*
__
*BySide*
*edgar.ve...@byside.com *
http://www.byside.com
Rua Visconde Bóbeda, 70 r/c
4000-108 Porto
__

Warning

This e-mail is privileged, confidential and contains private information.
Any reading, retention, distribution or copying of this communication by
any person other than its intended recipient is prohibited.
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Update PB ip in runtime

2016-04-06 Thread Edgar Veiga
Hi everyone!

Quick question: is it possible to update the PB ip:port entry at runtime
without restarting the node?

Best regards!
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Memory usage

2015-02-13 Thread Edgar Veiga
Hi again everyone!

- The memory usage keeps growing day by day:
https://dl.dropboxusercontent.com/u/1962284/riak2.png

- The handoffs keep on going, with strange things like a transfer that started
1.5 days ago (see the diagnostic commands sketched at the end of this message):
riak-admin transfers
'riak@192.168.20.112' waiting to handoff 51 partitions
'riak@192.168.20.111' waiting to handoff 74 partitions
'riak@192.168.20.110' waiting to handoff 86 partitions
'riak@192.168.20.109' waiting to handoff 191 partitions
'riak@192.168.20.108' waiting to handoff 67 partitions
'riak@192.168.20.107' waiting to handoff 177 partitions

transfer type: hinted_handoff
vnode type: riak_kv_vnode
partition: 51380916937414555718098294900181824909778878464
started: 2015-02-11 21:54:07 [1.53 d ago]
last update: no updates seen
total size: unknown
objects transferred: unknown

- I'm starting to see some entries in the error log:
2015-02-12 19:58:54.026 [error]
<0.184.0>@riak_core_handoff_manager:handle_info:289 An outbound handoff of
partition riak_kv_vnode 936274486415109681974235595958868809467081785344
was terminated for reason: noproc
2015-02-12 20:27:34.092 [error]
<0.21096.1867>@riak_core_handoff_sender:start_fold:263 hinted_handoff
transfer of riak_kv_vnode from 'riak@192.168.20.112'
1210306043414653979137426502093171875652569137152 to 'riak@192.168.20.109'
1210306043414653979137426502093171875652569137152 failed because of TCP
recv timeout
2015-02-12 20:27:34.092 [error]
<0.184.0>@riak_core_handoff_manager:handle_info:289 An outbound handoff of
partition riak_kv_vnode 1210306043414653979137426502093171875652569137152
was terminated for reason: {shutdown,timeout}
2015-02-12 21:25:32.852 [error]
<0.184.0>@riak_core_handoff_manager:handle_info:289 An outbound handoff of
partition riak_kv_vnode 742168800207099138150308704113737470919028244480
was terminated for reason: noproc


Please, can anyone give me a hand with this? I'm starting to get worried about
this behaviour. Tell me if you need more info!
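
A few hedged diagnostics, for the record; riak-admin transfers / ring-status
/ transfer-limit are standard commands, while the attach call is the one
usually suggested on this list for kicking stalled handoffs, so treat it as
an assumption and verify it against your version first:

riak-admin transfers          # what is queued / in flight (as above)
riak-admin ring-status        # ring ownership and any blocked transfers
riak-admin transfer-limit 2   # throttle concurrent handoffs cluster-wide

%% via riak attach: ask vnodes to retry any pending handoffs
riak_core_vnode_manager:force_handoffs().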

Thanks and Best regards,
Edgar Veiga

On 10 February 2015 at 16:16, Edgar Veiga  wrote:

> Hi all!
>
> I have a riak cluster, working smoothly in production for about one year, 
> with the following characteristics:
>
> - Version 1.4.12
>
> - 6 nodes
>
> - leveldb backend
>
> - replication (n) = 3
>
> ~ 3 billion keys
>
> ~ 1.2Tb per node
>
> - AAE disabled
>
>
> Two days ago I've upgraded all of the 6 nodes from riak v1.4.8 to v1.4.12, 
> and two things started happening that are a little bit odd
>
> 1) The first is the memory consumption, please check the next imagem to 
> understand what I mean:
>
> - https://dl.dropboxusercontent.com/u/1962284/riak.png
>
> 2) All of the machines keep logging hinted handoffs after the rolling 
> restart. I've made the upgrade on non-busy hours and assured that the rolling 
> restart was concluded only when all the in-progress handoffs were concluded, 
> but on the next day when checking the logs I've realised that they keep 
> appearing... Heres are some random examples:
>
> 2015-02-10 16:11:55.547 [info] 
> <0.3070.753>@riak_core_handoff_sender:start_fold:148 Starting hinted_handoff 
> transfer of riak_kv_vnode from 'riak@192.168.20.112' 
> 765004763290394496247241279624929393101152190464 to 'riak@192.168.20.109' 
> 765004763290394496247241279624929393101152190464
>
> 2015-02-10 16:11:55.548 [info] 
> <0.3070.753>@riak_core_handoff_sender:start_fold:236 hinted_handoff transfer 
> of riak_kv_vnode from 'riak@192.168.20.112' 
> 765004763290394496247241279624929393101152190464 to 'riak@192.168.20.109' 
> 765004763290394496247241279624929393101152190464 completed: sent 3.15 KB 
> bytes in 1 of 1 objects in 0.00 seconds (3.99 MB/second)
>
> 2015-02-10 16:12:05.803 [info] 
> <0.3434.753>@riak_core_handoff_sender:start_fold:148 Starting hinted_handoff 
> transfer of riak_kv_vnode from 'riak@192.168.20.112' 
> 902020541790166644828836732692080926193895866368 to 'riak@192.168.20.109' 
> 902020541790166644828836732692080926193895866368
>
> 2015-02-10 16:12:05.856 [info] 
> <0.3368.753>@riak_core_handoff_sender:start_fold:148 Starting hinted_handoff 
> transfer of riak_kv_vnode from 'riak@192.168.20.112' 
> 570899077082383952423314387779798054553098649600 to 'riak@192.168.20.111' 
> 570899077082383952423314387779798054553098649600
>
> 2015-02-10 16:12:05.860 [info] 
> <0.3434.753>@riak_core_handoff_sender:start_fold:236 hinted_handoff transfer 
> of riak_kv_vnode from 'riak@192.168.20.112' 
> 902020541790166644828836732692080926193895866368 to 'riak@192.168.20.109' 
> 902020541790166644828836732692080926193895866368 comp

Memory usage

2015-02-10 Thread Edgar Veiga
Hi all!

I have a riak cluster, working smoothly in production for about one
year, with the following characteristics:

- Version 1.4.12

- 6 nodes

- leveldb backend

- replication (n) = 3

~ 3 billion keys

~ 1.2 TB per node

- AAE disabled


Two days ago I upgraded all 6 nodes from Riak v1.4.8 to
v1.4.12, and two things started happening that are a little bit odd:

1) The first is the memory consumption; please check the following image
to understand what I mean:

- https://dl.dropboxusercontent.com/u/1962284/riak.png

2) All of the machines keep logging hinted handoffs after the rolling
restart. I did the upgrade during off-peak hours and made sure the
rolling restart was only concluded after all in-progress handoffs had
finished, but the next day, when checking the logs, I realised that they
keep appearing... Here are some random examples:

2015-02-10 16:11:55.547 [info]
<0.3070.753>@riak_core_handoff_sender:start_fold:148 Starting
hinted_handoff transfer of riak_kv_vnode from 'riak@192.168.20.112'
765004763290394496247241279624929393101152190464 to
'riak@192.168.20.109' 765004763290394496247241279624929393101152190464

2015-02-10 16:11:55.548 [info]
<0.3070.753>@riak_core_handoff_sender:start_fold:236 hinted_handoff
transfer of riak_kv_vnode from 'riak@192.168.20.112'
765004763290394496247241279624929393101152190464 to
'riak@192.168.20.109' 765004763290394496247241279624929393101152190464
completed: sent 3.15 KB bytes in 1 of 1 objects in 0.00 seconds (3.99
MB/second)

2015-02-10 16:12:05.803 [info]
<0.3434.753>@riak_core_handoff_sender:start_fold:148 Starting
hinted_handoff transfer of riak_kv_vnode from 'riak@192.168.20.112'
902020541790166644828836732692080926193895866368 to
'riak@192.168.20.109' 902020541790166644828836732692080926193895866368

2015-02-10 16:12:05.856 [info]
<0.3368.753>@riak_core_handoff_sender:start_fold:148 Starting
hinted_handoff transfer of riak_kv_vnode from 'riak@192.168.20.112'
570899077082383952423314387779798054553098649600 to
'riak@192.168.20.111' 570899077082383952423314387779798054553098649600

2015-02-10 16:12:05.860 [info]
<0.3434.753>@riak_core_handoff_sender:start_fold:236 hinted_handoff
transfer of riak_kv_vnode from 'riak@192.168.20.112'
902020541790166644828836732692080926193895866368 to
'riak@192.168.20.109' 902020541790166644828836732692080926193895866368
completed: sent 39.79 KB bytes in 1 of 1 objects in 0.06 seconds
(699.32 KB/second)

2015-02-10 16:12:05.886 [info]
<0.3368.753>@riak_core_handoff_sender:start_fold:236 hinted_handoff
transfer of riak_kv_vnode from 'riak@192.168.20.112'
570899077082383952423314387779798054553098649600 to
'riak@192.168.20.111' 570899077082383952423314387779798054553098649600
completed: sent 3.55 KB bytes in 1 of 1 objects in 0.03 seconds
(118.58 KB/second)


Should I be worried or is this normal on this version?
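
A minimal sketch for narrowing down where the memory goes (standard
commands, paths aside): riak-admin status exposes Riak's memory_* stats,
and erlang:memory(). from riak attach breaks down the Erlang VM's own
allocations. Since eleveldb allocates its block cache natively, a large gap
between OS-level RSS and erlang:memory() usually points at leveldb rather
than the beam itself.

riak-admin status | egrep 'memory'   # memory_total, memory_processes, ...

%% via riak attach
erlang:memory().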


Best regards,

Edgar
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Adding nodes to cluster

2015-02-09 Thread Edgar Veiga
Thanks Christopher,

BTW, a few more questions:
- If something goes wrong during the process of joining the nodes, is it
possible to roll back? If so, how? (see the command sketch below)
- Should I revisit the leveldb configuration? It has been working
pretty smoothly for more than a year, but 6 more nodes suggest that some
tuning may be needed..
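
On the rollback point, a sketch using the staged clustering commands (node
name hypothetical); nothing is moved until commit, so before that point the
staged plan can simply be discarded:

riak-admin cluster join riak@newnode1   # stage each new node
riak-admin cluster plan                 # review the proposed transition
riak-admin cluster clear                # abort: drop all staged changes
riak-admin cluster commit               # only then does data start moving

After a commit, backing out means staging a riak-admin cluster leave for
the new nodes and sitting through another round of ownership handoff.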

Best regards,
Edgar

On 9 February 2015 at 14:54, Christopher Meiklejohn 
wrote:

>
> > On Feb 6, 2015, at 1:53 AM, Edgar Veiga  wrote:
> >
> > It is expected that the total amount of data per node lowers quite a
> lot, correct? I'm doubling the size of the cluster (6 more nodes).
> >
> > I ask this because the actual 6 machines have 1.5Tb in disks, but the
> new ones ( for now) have only 1Tb.
>
> That’s correct; you should reduce per-node allocation by adding additional
> nodes.
>
> However, to minimize the amount of data that needs to be relocated, you
> should add all nodes together rather than one at a time.  This prevents the
> situation where some percentage of the data gets moved between nodes
> multiple times.
>
> - Chris
>
> Christopher Meiklejohn
> Senior Software Engineer
> Basho Technologies, Inc.
> cmeiklej...@basho.com
>
>
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Adding nodes to cluster

2015-02-05 Thread Edgar Veiga
It is expected that the total amount of data per node will drop quite a lot,
correct? I'm doubling the size of the cluster (6 more nodes).




I ask this because the current 6 machines have 1.5 TB disks, but the new ones
(for now) have only 1 TB.
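
Rough arithmetic, assuming n=3 stays the same and the ring rebalances
evenly: 6 nodes x ~1.1-1.2 TB ≈ 7 TB of replicated data in total, so spread
across 12 nodes that is roughly 0.6 TB per node, which fits the new 1 TB
disks with room to spare.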




Best regards



—
Sent from my iPhone

On Sat, Jan 24, 2015 at 9:49 PM, Edgar Veiga 
wrote:

> Yeah, after sending the email I realized both! :)
> Thanks! Have a nice weekend
> On 24 January 2015 at 21:46, Sargun Dhillon  wrote:
>> 1) Potentially re-enable AAE after migration. As your cluster gets
>> bigger, the likelihood of any node failing in the cluster goes up.
>> Replica divergence only becomes scarier in light of this. Losing data
>> != awesome.
>>
>> 6) There shouldn't be any problems, but for safe measures you should
>> probably upgrade the old ones before the migration.
>>
>>
>>
>> On Sat, Jan 24, 2015 at 1:31 PM, Edgar Veiga 
>> wrote:
>> > Sargun,
>> >
>> > Regarding 1) - AAE is disabled. We had a problems with it and there's a
>> lot
>> > of threads here in the mailing list regarding this. AAE won't stop using
>> > more and more disk space and the only solution was disabling it! Since
>> then
>> > the cluster has been pretty stable...
>> >
>> > Regarding 6) Can you or anyone in basho confirm that there won't be any
>> > problems using the latest (1.4.12) version of riak in the new nodes and
>> only
>> > upgrading the old ones after this process is completed?
>> >
>> > Thanks a lot for the other tips, you've been very helpful!
>> >
>> > Best regards,
>> > Edgar
>> >
>> > On 24 January 2015 at 21:09, Sargun Dhillon  wrote:
>> >>
>> >> Several things:
>> >> 1) If you have data at rest that doesn't change, make sure you have
>> >> AAE, and it's ran before your cluster is manipulated. Given that
>> >> you're running at 85% space, I would be a little worried to turn it
>> >> on, because you might run out of disk space. You can also pretty
>> >> reasonably put the AAE trees on magnetic storage. AAE is nice in the
>> >> sense that you _know_ your cluster is consistent at a point in time.
>> >>
>> >> 2) Make sure you're getting SSDs of roughly the same quality. I've
>> >> seen enterprise SSDs get higher and higher latency as time goes on,
>> >> due to greater data protection features. We don't need any of that.
>> >> Basho_bench is your friend if you have the time.
>> >>
>> >> 3) Do it all in one go. This will enable handoffs more cleanly, and all
>> at
>> >> once.
>> >>
>> >> 4) Do not add the new nodes to the load balancer until handoff is
>> >> done. At least experimentally, latency increases slightly on the
>> >> original cluster, but the target nodes have pretty awful latency.
>> >>
>> >> 5) Start with a handoff_limit of 1. You can easily raise this. If
>> >> things look good, you can increase it. We're not optimizing for the
>> >> total time to handoff, we really should be optimizing for individual
>> >> vnode handoff time.
>> >>
>> >> 6) If you're using Leveldb, upgrade to the most recent version of Riak
>> >> 1.4. There have been some improvements. 1.4.9 made me happier. I think
>> >> it's reasonable for the new nodes to start on 1.4.12, and the old
>> >> nodes to be switched over later.
>> >>
>> >> 7) Watch your network utilization. Keep your disk latency flat. Stop
>> >> it if it spikes. Start from enabling one node with the lowest usage
>> >> and see if it works.
>> >>
>> >>
>> >> These are the things I can think of immediately.
>> >>
>> >> On Sat, Jan 24, 2015 at 12:42 PM, Alexander Sicular > >
>> >> wrote:
>> >> > I would probably add them all in one go so you have one vnode
>> migration
>> >> > plan that gets executed. What is your ring size? How much data are we
>> >> > talking about? It's not necessarily the number of keys but rather the
>> total
>> >> > amount of data and how quickly that data can move en mass between
>> machines.
>> >> >
>> >> > -Alexander
>> >> >
>> >> >
>> >> > @siculars
>> >> > http://siculars.posthaven.com
>> >> >
>> >&

Re: Adding nodes to cluster

2015-01-24 Thread Edgar Veiga
Yeah, after sending the email I realized both! :)

Thanks! Have a nice weekend



On 24 January 2015 at 21:46, Sargun Dhillon  wrote:

> 1) Potentially re-enable AAE after migration. As your cluster gets
> bigger, the likelihood of any node failing in the cluster goes up.
> Replica divergence only becomes scarier in light of this. Losing data
> != awesome.
>
> 6) There shouldn't be any problems, but for safe measures you should
> probably upgrade the old ones before the migration.
>
>
>
> On Sat, Jan 24, 2015 at 1:31 PM, Edgar Veiga 
> wrote:
> > Sargun,
> >
> > Regarding 1) - AAE is disabled. We had a problems with it and there's a
> lot
> > of threads here in the mailing list regarding this. AAE won't stop using
> > more and more disk space and the only solution was disabling it! Since
> then
> > the cluster has been pretty stable...
> >
> > Regarding 6) Can you or anyone in basho confirm that there won't be any
> > problems using the latest (1.4.12) version of riak in the new nodes and
> only
> > upgrading the old ones after this process is completed?
> >
> > Thanks a lot for the other tips, you've been very helpful!
> >
> > Best regards,
> > Edgar
> >
> > On 24 January 2015 at 21:09, Sargun Dhillon  wrote:
> >>
> >> Several things:
> >> 1) If you have data at rest that doesn't change, make sure you have
> >> AAE, and it's ran before your cluster is manipulated. Given that
> >> you're running at 85% space, I would be a little worried to turn it
> >> on, because you might run out of disk space. You can also pretty
> >> reasonably put the AAE trees on magnetic storage. AAE is nice in the
> >> sense that you _know_ your cluster is consistent at a point in time.
> >>
> >> 2) Make sure you're getting SSDs of roughly the same quality. I've
> >> seen enterprise SSDs get higher and higher latency as time goes on,
> >> due to greater data protection features. We don't need any of that.
> >> Basho_bench is your friend if you have the time.
> >>
> >> 3) Do it all in one go. This will enable handoffs more cleanly, and all
> at
> >> once.
> >>
> >> 4) Do not add the new nodes to the load balancer until handoff is
> >> done. At least experimentally, latency increases slightly on the
> >> original cluster, but the target nodes have pretty awful latency.
> >>
> >> 5) Start with a handoff_limit of 1. You can easily raise this. If
> >> things look good, you can increase it. We're not optimizing for the
> >> total time to handoff, we really should be optimizing for individual
> >> vnode handoff time.
> >>
> >> 6) If you're using Leveldb, upgrade to the most recent version of Riak
> >> 1.4. There have been some improvements. 1.4.9 made me happier. I think
> >> it's reasonable for the new nodes to start on 1.4.12, and the old
> >> nodes to be switched over later.
> >>
> >> 7) Watch your network utilization. Keep your disk latency flat. Stop
> >> it if it spikes. Start from enabling one node with the lowest usage
> >> and see if it works.
> >>
> >>
> >> These are the things I can think of immediately.
> >>
> >> On Sat, Jan 24, 2015 at 12:42 PM, Alexander Sicular  >
> >> wrote:
> >> > I would probably add them all in one go so you have one vnode
> migration
> >> > plan that gets executed. What is your ring size? How much data are we
> >> > talking about? It's not necessarily the number of keys but rather the
> total
> >> > amount of data and how quickly that data can move en mass between
> machines.
> >> >
> >> > -Alexander
> >> >
> >> >
> >> > @siculars
> >> > http://siculars.posthaven.com
> >> >
> >> > Sent from my iRotaryPhone
> >> >
> >> >> On Jan 24, 2015, at 15:37, Ed  wrote:
> >> >>
> >> >> Hi everyone!
> >> >>
> >> >> I have a riak cluster, working in production for about one year, with
> >> >> the following characteristics:
> >> >> - Version 1.4.8
> >> >> - 6 nodes
> >> >> - leveldb backend
> >> >> - replication (n) = 3
> >> >> ~ 3 billion keys
> >> >>
> >> >> My ssd's are reaching 85% of capacity and we have decided to buy 6
> more
> >> >> nodes to expand the cluster.
> >> >>
> >> >> Have you got any kind of advice on executing this operation or
> should I
> >> >> just follow the documentation on adding new nodes to a cluster?
> >> >>
> >> >> Best regards!
> >> >> Edgar
> >> >>
> >> >> ___
> >> >> riak-users mailing list
> >> >> riak-users@lists.basho.com
> >> >> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> >> >
> >> > ___
> >> > riak-users mailing list
> >> > riak-users@lists.basho.com
> >> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> >
> >
>
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Adding nodes to cluster

2015-01-24 Thread Edgar Veiga
Sargun,

Regarding 1) - AAE is disabled. We had problems with it and there are a lot
of threads here on the mailing list about it. AAE wouldn't stop using
more and more disk space and the only solution was disabling it! Since then
the cluster has been pretty stable...

Regarding 6) - Can you or anyone at Basho confirm that there won't be any
problems using the latest (1.4.12) version of Riak on the new nodes and
only upgrading the old ones after this process is completed?

Thanks a lot for the other tips, you've been very helpful!

Best regards,
Edgar

On 24 January 2015 at 21:09, Sargun Dhillon  wrote:

> Several things:
> 1) If you have data at rest that doesn't change, make sure you have
> AAE, and it's ran before your cluster is manipulated. Given that
> you're running at 85% space, I would be a little worried to turn it
> on, because you might run out of disk space. You can also pretty
> reasonably put the AAE trees on magnetic storage. AAE is nice in the
> sense that you _know_ your cluster is consistent at a point in time.
>
> 2) Make sure you're getting SSDs of roughly the same quality. I've
> seen enterprise SSDs get higher and higher latency as time goes on,
> due to greater data protection features. We don't need any of that.
> Basho_bench is your friend if you have the time.
>
> 3) Do it all in one go. This will enable handoffs more cleanly, and all at
> once.
>
> 4) Do not add the new nodes to the load balancer until handoff is
> done. At least experimentally, latency increases slightly on the
> original cluster, but the target nodes have pretty awful latency.
>
> 5) Start with a handoff_limit of 1. You can easily raise this. If
> things look good, you can increase it. We're not optimizing for the
> total time to handoff, we really should be optimizing for individual
> vnode handoff time.
>
> 6) If you're using Leveldb, upgrade to the most recent version of Riak
> 1.4. There have been some improvements. 1.4.9 made me happier. I think
> it's reasonable for the new nodes to start on 1.4.12, and the old
> nodes to be switched over later.
>
> 7) Watch your network utilization. Keep your disk latency flat. Stop
> it if it spikes. Start from enabling one node with the lowest usage
> and see if it works.
>
>
> These are the things I can think of immediately.
>
> On Sat, Jan 24, 2015 at 12:42 PM, Alexander Sicular 
> wrote:
> > I would probably add them all in one go so you have one vnode migration
> plan that gets executed. What is your ring size? How much data are we
> talking about? It's not necessarily the number of keys but rather the total
> amount of data and how quickly that data can move en mass between machines.
> >
> > -Alexander
> >
> >
> > @siculars
> > http://siculars.posthaven.com
> >
> > Sent from my iRotaryPhone
> >
> >> On Jan 24, 2015, at 15:37, Ed  wrote:
> >>
> >> Hi everyone!
> >>
> >> I have a riak cluster, working in production for about one year, with
> the following characteristics:
> >> - Version 1.4.8
> >> - 6 nodes
> >> - leveldb backend
> >> - replication (n) = 3
> >> ~ 3 billion keys
> >>
> >> My ssd's are reaching 85% of capacity and we have decided to buy 6 more
> nodes to expand the cluster.
> >>
> >> Have you got any kind of advice on executing this operation or should I
> just follow the documentation on adding new nodes to a cluster?
> >>
> >> Best regards!
> >> Edgar
> >>
> >> ___
> >> riak-users mailing list
> >> riak-users@lists.basho.com
> >> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> >
> > ___
> > riak-users mailing list
> > riak-users@lists.basho.com
> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Adding nodes to cluster

2015-01-24 Thread Edgar Veiga
Hi Alexander! Thanks for the reply.

Current ring size: 256;
Total amount of data in the cluster: ~6.6 TB (~1.1 TB per node)
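
Back-of-the-envelope, assuming claim spreads evenly: 256 partitions / 6
nodes ≈ 43 vnodes per node today, and 256 / 12 ≈ 21 after doubling, so
roughly half of the partitions (and of the ~6.6 TB) would hand off to the
new machines, leaving around 0.55 TB per node.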

Best regards,
Edgar

On 24 January 2015 at 20:42, Alexander Sicular  wrote:

> I would probably add them all in one go so you have one vnode migration
> plan that gets executed. What is your ring size? How much data are we
> talking about? It's not necessarily the number of keys but rather the total
> amount of data and how quickly that data can move en mass between machines.
>
> -Alexander
>
>
> @siculars
> http://siculars.posthaven.com
>
> Sent from my iRotaryPhone
>
> > On Jan 24, 2015, at 15:37, Ed  wrote:
> >
> > Hi everyone!
> >
> > I have a riak cluster, working in production for about one year, with
> the following characteristics:
> > - Version 1.4.8
> > - 6 nodes
> > - leveldb backend
> > - replication (n) = 3
> > ~ 3 billion keys
> >
> > My ssd's are reaching 85% of capacity and we have decided to buy 6 more
> nodes to expand the cluster.
> >
> > Have you got any kind of advice on executing this operation or should I
> just follow the documentation on adding new nodes to a cluster?
> >
> > Best regards!
> > Edgar
> >
> > ___
> > riak-users mailing list
> > riak-users@lists.basho.com
> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: RIAK 1.4.6 - Mass key deletion

2014-04-10 Thread Edgar Veiga
Thanks, I'll start the process and give you guys some feedback in the
meantime.

The plan is:
1 - Disable AAE in the cluster via riak attach:


a.
rpc:multicall(riak_kv_entropy_manager, disable, []).
rpc:multicall(riak_kv_entropy_manager, cancel_exchanges, []).
z.

2 - Update the app.config, changing the AAE dir to the mechanical disk (see
the config sketch after this list);

3 - Restart the Riak process on each machine, one by one;

4 - Remove the old AAE data.
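
For step 2, a minimal app.config sketch; the key name is the 1.4.x
anti_entropy_data_dir setting in the riak_kv section as far as I recall,
and the mount point is of course hypothetical:

%% app.config -> riak_kv section (everything else unchanged)
{anti_entropy_data_dir, "/mnt/sata7200/riak/anti_entropy"},

The leveldb data_root stays where it is; only the AAE trees move.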

By the way, I've seen different ways of disabling AAE via riak attach here
on the list... The one above is the most complete. What do the a. and z.
stand for? I've been disabling AAE just by running
"rpc:multicall(riak_kv_entropy_manager, disable, [])." - is there any
difference if we ignore the a., z. and the cancel_exchanges?

Best regards!



On 10 April 2014 13:41, Matthew Von-Maszewski  wrote:

> Yes, you can send the AAE (active anti-entropy) data to a different disk.
>
> AAE calculates a hash each time you PUT new data to the regular database.
>  AAE then buffers around 1,000 hashes (I forget the exact value) to write
> as a block to the AAE database.  The AAE write is NOT in series with the
> user database writes.  Your throughput should not be impacted.  But this is
> not something I have personally measured/validated.
>
> Matthew
>
>
> On Apr 10, 2014, at 7:33 AM, Edgar Veiga  wrote:
>
> Hi Matthew!
>
> I have a possibility of moving the data of anti-entropy directory to a
> mechanic disk 7200, that exists on each of the machines. I was thinking of
> changing the anti_entropy data dir config in app.config file and restart
> the riak process.
>
> Is there any problem using a mechanic disk to store the anti-entropy data?
>
> Best regards!
>
>
> On 8 April 2014 23:58, Edgar Veiga  wrote:
>
>> I'll wait a few more days, see if the AAE maybe "stabilises" and only
>> after that make a decision regarding this.
>> The cluster expanding was on the roadmap, but not right now :)
>>
>> I've attached a few screenshot, you can clearly observe  the evolution of
>> one of the machines after the anti-entropy data removal and consequent
>> restart  (5th of April).
>>
>> https://cloudup.com/cB0a15lCMeS
>>
>> Best regards!
>>
>>
>> On 8 April 2014 23:44, Matthew Von-Maszewski  wrote:
>>
>>> No.  I do not see a problem with your plan.  But ...
>>>
>>> I would prefer to see you add servers to your cluster.  Scalabilty is
>>> one of Riak's fundamental characteristics.  As your database needs grow, we
>>> grow with you ... just add another server and migrate some of the vnodes
>>> there.
>>>
>>> I obviously cannot speak to your budgetary constraints.  All of the
>>> engineers at Basho, I am just one, are focused upon providing you
>>> performance and features along with your scalability needs.  This seems to
>>> be a situation where you might be sacrificing data integrity where another
>>> server or two would address the situation.
>>>
>>> And if 2.0 makes things better ... sell the extra servers on Ebay.
>>>
>>> Matthew
>>>
>>>
>>> On Apr 8, 2014, at 6:31 PM, Edgar Veiga  wrote:
>>>
>>> Thanks Matthew!
>>>
>>> Today this situation has become unsustainable, In two of the machines I
>>> have an anti-entropy dir of 250G... It just keeps growing and growing and
>>> I'm almost reaching max size of the disks.
>>>
>>> Maybe I'll just turn off aae in the cluster, remove all the data in the
>>> anti-entropy directory and wait for the v2 of riak. Do you see any problem
>>> with this?
>>>
>>> Best regards!
>>>
>>>
>>> On 8 April 2014 22:11, Matthew Von-Maszewski  wrote:
>>>
>>>> Edgar,
>>>>
>>>> Today we disclosed a new feature for Riak's leveldb, Tiered Storage.
>>>>  The details are here:
>>>>
>>>> https://github.com/basho/leveldb/wiki/mv-tiered-options
>>>>
>>>> This feature might give you another option in managing your storage
>>>> volume.
>>>>
>>>>
>>>> Matthew
>>>>
>>>> On Apr 8, 2014, at 11:07 AM, Edgar Veiga  wrote:
>>>>
>>>> It makes sense, I do a lot, and I really mean a LOT of updates per key,
>>>> maybe thousands a day! The cluster is experiencing a lot more updates per
>>>> each key, than new keys being inserted.
>>>>
>>>> The hash trees will rebuild during the next weekend (normally it takes
>>>> about two d

Re: RIAK 1.4.6 - Mass key deletion

2014-04-10 Thread Edgar Veiga
Hi Matthew!

I have the possibility of moving the anti-entropy directory data to a
7200 rpm mechanical disk that exists in each of the machines. I was thinking
of changing the anti_entropy data dir config in the app.config file and
restarting the Riak process.

Is there any problem with using a mechanical disk to store the anti-entropy data?

Best regards!


On 8 April 2014 23:58, Edgar Veiga  wrote:

> I'll wait a few more days, see if the AAE maybe "stabilises" and only
> after that make a decision regarding this.
> The cluster expanding was on the roadmap, but not right now :)
>
> I've attached a few screenshot, you can clearly observe  the evolution of
> one of the machines after the anti-entropy data removal and consequent
> restart  (5th of April).
>
> https://cloudup.com/cB0a15lCMeS
>
> Best regards!
>
>
> On 8 April 2014 23:44, Matthew Von-Maszewski  wrote:
>
>> No.  I do not see a problem with your plan.  But ...
>>
>> I would prefer to see you add servers to your cluster.  Scalabilty is one
>> of Riak's fundamental characteristics.  As your database needs grow, we
>> grow with you ... just add another server and migrate some of the vnodes
>> there.
>>
>> I obviously cannot speak to your budgetary constraints.  All of the
>> engineers at Basho, I am just one, are focused upon providing you
>> performance and features along with your scalability needs.  This seems to
>> be a situation where you might be sacrificing data integrity where another
>> server or two would address the situation.
>>
>> And if 2.0 makes things better ... sell the extra servers on Ebay.
>>
>> Matthew
>>
>>
>> On Apr 8, 2014, at 6:31 PM, Edgar Veiga  wrote:
>>
>> Thanks Matthew!
>>
>> Today this situation has become unsustainable, In two of the machines I
>> have an anti-entropy dir of 250G... It just keeps growing and growing and
>> I'm almost reaching max size of the disks.
>>
>> Maybe I'll just turn off aae in the cluster, remove all the data in the
>> anti-entropy directory and wait for the v2 of riak. Do you see any problem
>> with this?
>>
>> Best regards!
>>
>>
>> On 8 April 2014 22:11, Matthew Von-Maszewski  wrote:
>>
>>> Edgar,
>>>
>>> Today we disclosed a new feature for Riak's leveldb, Tiered Storage.
>>>  The details are here:
>>>
>>> https://github.com/basho/leveldb/wiki/mv-tiered-options
>>>
>>> This feature might give you another option in managing your storage
>>> volume.
>>>
>>>
>>> Matthew
>>>
>>> On Apr 8, 2014, at 11:07 AM, Edgar Veiga  wrote:
>>>
>>> It makes sense, I do a lot, and I really mean a LOT of updates per key,
>>> maybe thousands a day! The cluster is experiencing a lot more updates per
>>> each key, than new keys being inserted.
>>>
>>> The hash trees will rebuild during the next weekend (normally it takes
>>> about two days to complete the operation) so I'll come back and give you
>>> some feedback (hopefully good) on the next Monday!
>>>
>>> Again, thanks a lot, You've been very helpful.
>>> Edgar
>>>
>>>
>>> On 8 April 2014 15:47, Matthew Von-Maszewski  wrote:
>>>
>>>> Edgar,
>>>>
>>>> The test I have running currently has reach 1 Billion keys.  It is
>>>> running against a single node with N=1.  It has 42G of AAE data.  Here is
>>>> my extrapolation to compare your numbers:
>>>>
>>>> You have ~2.5 Billion keys.  I assume you are running N=3 (the
>>>> default).  AAE therefore is actually tracking ~7.5 Billion keys.  You have
>>>> six nodes, therefore tracking ~1.25 Billion keys per node.
>>>>
>>>> Raw math would suggest that my 42G of AAE data for 1 billion keys would
>>>> extrapolate to 52.5G of AAE data for you.  Yet you have ~120G of AAE data.
>>>>  Is something wrong?  No.  My data is still loading and has experience zero
>>>> key/value updates/edits.
>>>>
>>>> AAE hashes get rewritten every time a user updates the value of a key.
>>>>  AAE's leveldb is just like the user leveldb, all prior values of a key
>>>> accumulate in the .sst table files until compaction removes duplicates.
>>>>  Similarly, a user delete of a key causes a delete tombstone in the AAE
>>>> hash tree.  Those delete tombstones have to await compactions too before
>>>> leveldb recovers the disk space.
>>>>
&

Re: RIAK 1.4.6 - Mass key deletion

2014-04-08 Thread Edgar Veiga
I'll wait a few more days, see if the AAE maybe "stabilises", and only after
that make a decision regarding this.
Cluster expansion was on the roadmap, but not right now :)

I've attached a few screenshots; you can clearly observe the evolution of
one of the machines after the anti-entropy data removal and consequent
restart (5th of April).

https://cloudup.com/cB0a15lCMeS

Best regards!


On 8 April 2014 23:44, Matthew Von-Maszewski  wrote:

> No.  I do not see a problem with your plan.  But ...
>
> I would prefer to see you add servers to your cluster.  Scalabilty is one
> of Riak's fundamental characteristics.  As your database needs grow, we
> grow with you ... just add another server and migrate some of the vnodes
> there.
>
> I obviously cannot speak to your budgetary constraints.  All of the
> engineers at Basho, I am just one, are focused upon providing you
> performance and features along with your scalability needs.  This seems to
> be a situation where you might be sacrificing data integrity where another
> server or two would address the situation.
>
> And if 2.0 makes things better ... sell the extra servers on Ebay.
>
> Matthew
>
>
> On Apr 8, 2014, at 6:31 PM, Edgar Veiga  wrote:
>
> Thanks Matthew!
>
> Today this situation has become unsustainable, In two of the machines I
> have an anti-entropy dir of 250G... It just keeps growing and growing and
> I'm almost reaching max size of the disks.
>
> Maybe I'll just turn off aae in the cluster, remove all the data in the
> anti-entropy directory and wait for the v2 of riak. Do you see any problem
> with this?
>
> Best regards!
>
>
> On 8 April 2014 22:11, Matthew Von-Maszewski  wrote:
>
>> Edgar,
>>
>> Today we disclosed a new feature for Riak's leveldb, Tiered Storage.  The
>> details are here:
>>
>> https://github.com/basho/leveldb/wiki/mv-tiered-options
>>
>> This feature might give you another option in managing your storage
>> volume.
>>
>>
>> Matthew
>>
>> On Apr 8, 2014, at 11:07 AM, Edgar Veiga  wrote:
>>
>> It makes sense, I do a lot, and I really mean a LOT of updates per key,
>> maybe thousands a day! The cluster is experiencing a lot more updates per
>> each key, than new keys being inserted.
>>
>> The hash trees will rebuild during the next weekend (normally it takes
>> about two days to complete the operation) so I'll come back and give you
>> some feedback (hopefully good) on the next Monday!
>>
>> Again, thanks a lot, You've been very helpful.
>> Edgar
>>
>>
>> On 8 April 2014 15:47, Matthew Von-Maszewski  wrote:
>>
>>> Edgar,
>>>
>>> The test I have running currently has reach 1 Billion keys.  It is
>>> running against a single node with N=1.  It has 42G of AAE data.  Here is
>>> my extrapolation to compare your numbers:
>>>
>>> You have ~2.5 Billion keys.  I assume you are running N=3 (the default).
>>>  AAE therefore is actually tracking ~7.5 Billion keys.  You have six nodes,
>>> therefore tracking ~1.25 Billion keys per node.
>>>
>>> Raw math would suggest that my 42G of AAE data for 1 billion keys would
>>> extrapolate to 52.5G of AAE data for you.  Yet you have ~120G of AAE data.
>>>  Is something wrong?  No.  My data is still loading and has experience zero
>>> key/value updates/edits.
>>>
>>> AAE hashes get rewritten every time a user updates the value of a key.
>>>  AAE's leveldb is just like the user leveldb, all prior values of a key
>>> accumulate in the .sst table files until compaction removes duplicates.
>>>  Similarly, a user delete of a key causes a delete tombstone in the AAE
>>> hash tree.  Those delete tombstones have to await compactions too before
>>> leveldb recovers the disk space.
>>>
>>> AAE's hash trees rebuild weekly.  I am told that the rebuild operation
>>> will actually destroy the existing files and start over.  That is when you
>>> should see AAE space usage dropping dramatically.
>>>
>>> Matthew
>>>
>>>
>>> On Apr 8, 2014, at 9:31 AM, Edgar Veiga  wrote:
>>>
>>> Thanks a lot Matthew!
>>>
>>> A little bit of more info, I've gathered a sample of the contents of
>>> anti-entropy data of one of my machines:
>>> - 44 folders with the name equal to the name of the folders in level-db
>>> dir (i.e. 393920363186844927172086927568060657641638068224/)
>>> - each folder has a 5 files (log, current, log, etc) and 5 sst_* folde

Re: RIAK 1.4.6 - Mass key deletion

2014-04-08 Thread Edgar Veiga
Thanks Matthew!

Today this situation has become unsustainable. On two of the machines I
have an anti-entropy dir of 250G... It just keeps growing and growing and
I'm almost at the maximum size of the disks.

Maybe I'll just turn off AAE in the cluster, remove all the data in the
anti-entropy directory and wait for v2 of Riak. Do you see any problem
with this?

Best regards!


On 8 April 2014 22:11, Matthew Von-Maszewski  wrote:

> Edgar,
>
> Today we disclosed a new feature for Riak's leveldb, Tiered Storage.  The
> details are here:
>
> https://github.com/basho/leveldb/wiki/mv-tiered-options
>
> This feature might give you another option in managing your storage
> volume.
>
>
> Matthew
>
> On Apr 8, 2014, at 11:07 AM, Edgar Veiga  wrote:
>
> It makes sense, I do a lot, and I really mean a LOT of updates per key,
> maybe thousands a day! The cluster is experiencing a lot more updates per
> each key, than new keys being inserted.
>
> The hash trees will rebuild during the next weekend (normally it takes
> about two days to complete the operation) so I'll come back and give you
> some feedback (hopefully good) on the next Monday!
>
> Again, thanks a lot, You've been very helpful.
> Edgar
>
>
> On 8 April 2014 15:47, Matthew Von-Maszewski  wrote:
>
>> Edgar,
>>
>> The test I have running currently has reach 1 Billion keys.  It is
>> running against a single node with N=1.  It has 42G of AAE data.  Here is
>> my extrapolation to compare your numbers:
>>
>> You have ~2.5 Billion keys.  I assume you are running N=3 (the default).
>>  AAE therefore is actually tracking ~7.5 Billion keys.  You have six nodes,
>> therefore tracking ~1.25 Billion keys per node.
>>
>> Raw math would suggest that my 42G of AAE data for 1 billion keys would
>> extrapolate to 52.5G of AAE data for you.  Yet you have ~120G of AAE data.
>>  Is something wrong?  No.  My data is still loading and has experience zero
>> key/value updates/edits.
>>
>> AAE hashes get rewritten every time a user updates the value of a key.
>>  AAE's leveldb is just like the user leveldb, all prior values of a key
>> accumulate in the .sst table files until compaction removes duplicates.
>>  Similarly, a user delete of a key causes a delete tombstone in the AAE
>> hash tree.  Those delete tombstones have to await compactions too before
>> leveldb recovers the disk space.
>>
>> AAE's hash trees rebuild weekly.  I am told that the rebuild operation
>> will actually destroy the existing files and start over.  That is when you
>> should see AAE space usage dropping dramatically.
>>
>> Matthew
>>
>>
>> On Apr 8, 2014, at 9:31 AM, Edgar Veiga  wrote:
>>
>> Thanks a lot Matthew!
>>
>> A little bit of more info, I've gathered a sample of the contents of
>> anti-entropy data of one of my machines:
>> - 44 folders with the name equal to the name of the folders in level-db
>> dir (i.e. 393920363186844927172086927568060657641638068224/)
>> - each folder has a 5 files (log, current, log, etc) and 5 sst_* folders.
>> - The biggest sst folder is sst_3 with 4.3G
>> - Inside sst_3 folder there are 1219 files name 00.sst.
>> - Each of the 00*****.sst files has ~3.7M
>>
>> Hope this info gives you some more help!
>>
>> Best regards, and again, thanks a lot
>> Edgar
>>
>>
>> On 8 April 2014 13:24, Matthew Von-Maszewski  wrote:
>>
>>> Argh. Missed where you said you had upgraded. Ok it will proceed with
>>> getting you comparison numbers.
>>>
>>> Sent from my iPhone
>>>
>>> On Apr 8, 2014, at 6:51 AM, Edgar Veiga  wrote:
>>>
>>> Thanks again Matthew, you've been very helpful!
>>>
>>> Maybe you can give me some kind of advise on this issue I'm having since
>>> I've upgraded to 1.4.8.
>>>
>>> Since I've upgraded my anti-entropy data has been growing a lot and has
>>> only stabilised in very high values... Write now my cluster has 6 machines
>>> each one with ~120G of anti-entropy data and 600G of level-db data. This
>>> seems to be quite a lot no? My total amount of keys is ~2.5 Billions.
>>>
>>> Best regards,
>>> Edgar
>>>
>>> On 6 April 2014 23:30, Matthew Von-Maszewski  wrote:
>>>
>>>> Edgar,
>>>>
>>>> This is indirectly related to you key deletion discussion.  I made
>>>> changes recently to the aggressive delete code.  The second section of the
>>>> following (updated) web page d

Re: RIAK 1.4.6 - Mass key deletion

2014-04-08 Thread Edgar Veiga
It makes sense; I do a lot, and I really mean a LOT, of updates per key,
maybe thousands a day! The cluster is experiencing far more updates to
existing keys than new keys being inserted.

The hash trees will rebuild during the next weekend (normally it takes
about two days to complete the operation), so I'll come back and give you
some feedback (hopefully good) next Monday!

Again, thanks a lot, You've been very helpful.
Edgar


On 8 April 2014 15:47, Matthew Von-Maszewski  wrote:

> Edgar,
>
> The test I have running currently has reach 1 Billion keys.  It is running
> against a single node with N=1.  It has 42G of AAE data.  Here is my
> extrapolation to compare your numbers:
>
> You have ~2.5 Billion keys.  I assume you are running N=3 (the default).
>  AAE therefore is actually tracking ~7.5 Billion keys.  You have six nodes,
> therefore tracking ~1.25 Billion keys per node.
>
> Raw math would suggest that my 42G of AAE data for 1 billion keys would
> extrapolate to 52.5G of AAE data for you.  Yet you have ~120G of AAE data.
>  Is something wrong?  No.  My data is still loading and has experience zero
> key/value updates/edits.
>
> AAE hashes get rewritten every time a user updates the value of a key.
>  AAE's leveldb is just like the user leveldb, all prior values of a key
> accumulate in the .sst table files until compaction removes duplicates.
>  Similarly, a user delete of a key causes a delete tombstone in the AAE
> hash tree.  Those delete tombstones have to await compactions too before
> leveldb recovers the disk space.
>
> AAE's hash trees rebuild weekly.  I am told that the rebuild operation
> will actually destroy the existing files and start over.  That is when you
> should see AAE space usage dropping dramatically.
>
> Matthew
>
>
> On Apr 8, 2014, at 9:31 AM, Edgar Veiga  wrote:
>
> Thanks a lot Matthew!
>
> A little bit of more info, I've gathered a sample of the contents of
> anti-entropy data of one of my machines:
> - 44 folders with the name equal to the name of the folders in level-db
> dir (i.e. 393920363186844927172086927568060657641638068224/)
> - each folder has a 5 files (log, current, log, etc) and 5 sst_* folders.
> - The biggest sst folder is sst_3 with 4.3G
> - Inside sst_3 folder there are 1219 files name 00.sst.
> - Each of the 00*.sst files has ~3.7M
>
> Hope this info gives you some more help!
>
> Best regards, and again, thanks a lot
> Edgar
>
>
> On 8 April 2014 13:24, Matthew Von-Maszewski  wrote:
>
>> Argh. Missed where you said you had upgraded. Ok it will proceed with
>> getting you comparison numbers.
>>
>> Sent from my iPhone
>>
>> On Apr 8, 2014, at 6:51 AM, Edgar Veiga  wrote:
>>
>> Thanks again Matthew, you've been very helpful!
>>
>> Maybe you can give me some kind of advise on this issue I'm having since
>> I've upgraded to 1.4.8.
>>
>> Since I've upgraded my anti-entropy data has been growing a lot and has
>> only stabilised in very high values... Write now my cluster has 6 machines
>> each one with ~120G of anti-entropy data and 600G of level-db data. This
>> seems to be quite a lot no? My total amount of keys is ~2.5 Billions.
>>
>> Best regards,
>> Edgar
>>
>> On 6 April 2014 23:30, Matthew Von-Maszewski  wrote:
>>
>>> Edgar,
>>>
>>> This is indirectly related to you key deletion discussion.  I made
>>> changes recently to the aggressive delete code.  The second section of the
>>> following (updated) web page discusses the adjustments:
>>>
>>> https://github.com/basho/leveldb/wiki/Mv-aggressive-delete
>>>
>>> Matthew
>>>
>>>
>>> On Apr 6, 2014, at 4:29 PM, Edgar Veiga  wrote:
>>>
>>> Matthew, thanks again for the response!
>>>
>>> That said, I'll wait again for the 2.0 (and maybe buy some bigger disks
>>> :)
>>>
>>> Best regards
>>>
>>>
>>> On 6 April 2014 15:02, Matthew Von-Maszewski  wrote:
>>>
>>>> Edgar,
>>>>
>>>> In Riak 1.4, there is no advantage to using empty values versus
>>>> deleting.
>>>>
>>>> leveldb is a "write once" data store.  New data for a given key never
>>>> physically overwrites old data for the same key.  New data "hides" the old
>>>> data by being in a lower level, and therefore picked first.
>>>>
>>>> leveldb's compaction operation will remove older key/value pairs only
>>>> when the newer key/value is pair is part of a compa

Re: RIAK 1.4.6 - Mass key deletion

2014-04-08 Thread Edgar Veiga
Thanks a lot Matthew!

A little bit more info: I've gathered a sample of the contents of the
anti-entropy data on one of my machines (see also the note after the list):
- 44 folders whose names match the folder names in the leveldb dir
(e.g. 393920363186844927172086927568060657641638068224/)
- each folder has 5 files (LOG, CURRENT, etc.) and 5 sst_* folders.
- The biggest sst folder is sst_3, at 4.3G
- Inside the sst_3 folder there are 1219 files named 00*.sst.
- Each of the 00*.sst files is ~3.7M
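
For anyone wanting to pull the same numbers, something like this works
(default packaged paths assumed; adjust to your data_root):

du -sh /var/lib/riak/anti_entropy/*   # per-partition AAE tree sizes
du -sh /var/lib/riak/leveldb/*        # compare against the data itself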

Hope this info gives you some more help!

Best regards, and again, thanks a lot
Edgar


On 8 April 2014 13:24, Matthew Von-Maszewski  wrote:

> Argh. Missed where you said you had upgraded. Ok it will proceed with
> getting you comparison numbers.
>
> Sent from my iPhone
>
> On Apr 8, 2014, at 6:51 AM, Edgar Veiga  wrote:
>
> Thanks again Matthew, you've been very helpful!
>
> Maybe you can give me some kind of advise on this issue I'm having since
> I've upgraded to 1.4.8.
>
> Since I've upgraded my anti-entropy data has been growing a lot and has
> only stabilised in very high values... Write now my cluster has 6 machines
> each one with ~120G of anti-entropy data and 600G of level-db data. This
> seems to be quite a lot no? My total amount of keys is ~2.5 Billions.
>
> Best regards,
> Edgar
>
> On 6 April 2014 23:30, Matthew Von-Maszewski  wrote:
>
>> Edgar,
>>
>> This is indirectly related to you key deletion discussion.  I made
>> changes recently to the aggressive delete code.  The second section of the
>> following (updated) web page discusses the adjustments:
>>
>> https://github.com/basho/leveldb/wiki/Mv-aggressive-delete
>>
>> Matthew
>>
>>
>> On Apr 6, 2014, at 4:29 PM, Edgar Veiga  wrote:
>>
>> Matthew, thanks again for the response!
>>
>> That said, I'll wait again for the 2.0 (and maybe buy some bigger disks :)
>>
>> Best regards
>>
>>
>> On 6 April 2014 15:02, Matthew Von-Maszewski  wrote:
>>
>>> Edgar,
>>>
>>> In Riak 1.4, there is no advantage to using empty values versus deleting.
>>>
>>> leveldb is a "write once" data store.  New data for a given key never
>>> physically overwrites old data for the same key.  New data "hides" the old
>>> data by being in a lower level, and therefore picked first.
>>>
>>> leveldb's compaction operation will remove older key/value pairs only
>>> when the newer key/value is pair is part of a compaction involving both new
>>> and old.  The new and the old key/value pairs must have migrated to
>>> adjacent levels through normal compaction operations before leveldb will
>>> see them in the same compaction.  The migration could take days, weeks, or
>>> even months depending upon the size of your entire dataset and the rate of
>>> incoming write operations.
>>>
>>> leveldb's "delete" object is exactly the same as your empty JSON object.
>>>  The delete object simply has one more flag set that allows it to also be
>>> removed if and only if there is no chance for an identical key to exist on
>>> a higher level.
>>>
>>> I apologize that I cannot give you a more useful answer.  2.0 is on the
>>> horizon.
>>>
>>> Matthew
>>>
>>>
>>> On Apr 6, 2014, at 7:04 AM, Edgar Veiga  wrote:
>>>
>>> Hi again!
>>>
>>> Sorry to reopen this discussion, but I have another question regarding
>>> the former post.
>>>
>>> What if, instead of doing a mass deletion (We've already seen that it
>>> will be non profitable, regarding disk space) I update all the values with
>>> an empty JSON object "{}" ? Do you see any problem with this? I no longer
>>> need those millions of values that are living in the cluster...
>>>
>>> When the version 2.0 of riak runs stable I'll do the update and only
>>> then delete those keys!
>>>
>>> Best regards
>>>
>>>
>>> On 18 February 2014 16:32, Edgar Veiga  wrote:
>>>
>>>> Ok, thanks a lot Matthew.
>>>>
>>>>
>>>> On 18 February 2014 16:18, Matthew Von-Maszewski wrote:
>>>>
>>>>> Riak 2.0 is coming.  Hold your mass delete until then.  The "bug" is
>>>>> within Google's original leveldb architecture.  Riak 2.0 sneaks around to
>>>>> get the disk space freed.
>>>>>
>>>>> Matthew
>>>>>
>>>>>
>>

Re: RIAK 1.4.6 - Mass key deletion

2014-04-08 Thread Edgar Veiga
Thanks again Matthew, you've been very helpful!

Maybe you can give me some kind of advice on this issue I'm having since
I upgraded to 1.4.8.

Since I upgraded, my anti-entropy data has been growing a lot and has
only stabilised at very high values... Right now my cluster has 6 machines,
each one with ~120G of anti-entropy data and 600G of leveldb data. This
seems to be quite a lot, no? My total number of keys is ~2.5 billion.

Best regards,
Edgar

On 6 April 2014 23:30, Matthew Von-Maszewski  wrote:

> Edgar,
>
> This is indirectly related to you key deletion discussion.  I made changes
> recently to the aggressive delete code.  The second section of the
> following (updated) web page discusses the adjustments:
>
> https://github.com/basho/leveldb/wiki/Mv-aggressive-delete
>
> Matthew
>
>
> On Apr 6, 2014, at 4:29 PM, Edgar Veiga  wrote:
>
> Matthew, thanks again for the response!
>
> That said, I'll wait again for the 2.0 (and maybe buy some bigger disks :)
>
> Best regards
>
>
> On 6 April 2014 15:02, Matthew Von-Maszewski  wrote:
>
>> Edgar,
>>
>> In Riak 1.4, there is no advantage to using empty values versus deleting.
>>
>> leveldb is a "write once" data store.  New data for a given key never
>> physically overwrites old data for the same key.  New data "hides" the old
>> data by being in a lower level, and therefore picked first.
>>
>> leveldb's compaction operation will remove older key/value pairs only
>> when the newer key/value is pair is part of a compaction involving both new
>> and old.  The new and the old key/value pairs must have migrated to
>> adjacent levels through normal compaction operations before leveldb will
>> see them in the same compaction.  The migration could take days, weeks, or
>> even months depending upon the size of your entire dataset and the rate of
>> incoming write operations.
>>
>> leveldb's "delete" object is exactly the same as your empty JSON object.
>>  The delete object simply has one more flag set that allows it to also be
>> removed if and only if there is no chance for an identical key to exist on
>> a higher level.
>>
>> I apologize that I cannot give you a more useful answer.  2.0 is on the
>> horizon.
>>
>> Matthew
>>
>>
>> On Apr 6, 2014, at 7:04 AM, Edgar Veiga  wrote:
>>
>> Hi again!
>>
>> Sorry to reopen this discussion, but I have another question regarding
>> the former post.
>>
>> What if, instead of doing a mass deletion (We've already seen that it
>> will be non profitable, regarding disk space) I update all the values with
>> an empty JSON object "{}" ? Do you see any problem with this? I no longer
>> need those millions of values that are living in the cluster...
>>
>> When the version 2.0 of riak runs stable I'll do the update and only then
>> delete those keys!
>>
>> Best regards
>>
>>
>> On 18 February 2014 16:32, Edgar Veiga  wrote:
>>
>>> Ok, thanks a lot Matthew.
>>>
>>>
>>> On 18 February 2014 16:18, Matthew Von-Maszewski wrote:
>>>
>>>> Riak 2.0 is coming.  Hold your mass delete until then.  The "bug" is
>>>> within Google's original leveldb architecture.  Riak 2.0 sneaks around to
>>>> get the disk space freed.
>>>>
>>>> Matthew
>>>>
>>>>
>>>>
>>>> On Feb 18, 2014, at 11:10 AM, Edgar Veiga 
>>>> wrote:
>>>>
>>>> The only/main purpose is to free disk space..
>>>>
>>>> I was a little bit concerned regarding this operation, but now with
>>>> your feedback I'm tending to don't do nothing, I can't risk the growing of
>>>> space...
>>>> Regarding the overhead I think that with a tight throttling system I
>>>> could control and avoid overloading the cluster.
>>>>
>>>> Mixed feelings :S
>>>>
>>>>
>>>>
>>>> On 18 February 2014 15:45, Matthew Von-Maszewski wrote:
>>>>
>>>>> Edgar,
>>>>>
>>>>> The first "concern" I have is that leveldb's delete does not free disk
>>>>> space.  Others have executed mass delete operations only to discover they
>>>>> are now using more disk space instead of less.  Here is a discussion of 
>>>>> the
>>>>> problem:
>>>>>
>>>>> https://github.com/bash

Re: Update to 1.4.8

2014-04-08 Thread Edgar Veiga
So Basho, to summarise:

I upgraded to the latest 1.4.8 version without removing the anti-entropy
data dir, because at the time that note wasn't yet in the 1.4.8 release
notes.

A few days later I did it: stopped AAE via riak attach and restarted
all the nodes one by one, removing the anti-entropy data in between.
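
(A sketch of that per-node sequence, assuming the packaged defaults for
paths and node name; the wait-for-service line is just a sanity check, not
part of the original procedure:

riak stop
rm -rf /var/lib/riak/anti_entropy
riak start
riak-admin wait-for-service riak_kv riak@thisnode)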

The expected results didn't happen... My leveldb data dir has ~600G per
server and the anti-entropy data dir has ~120G; this seems to be quite a
lot :(
The cluster load is still high... Write and read times are erratic and
both high.

I have a 6-machine cluster with leveldb as the backend. The total number of
keys is about 2.5 billion.

Best regards!




On 8 April 2014 11:21, Timo Gatsonides  wrote:

>
> I also have 6 servers. Each server has about 2Tb of data. So maybe my
> anti_entropy dir size is "normal".
>
> Kind regards,
> Timo
>
> p.s. I'm using the multi_backend as I have some data only in memory
> (riak_kv_memory_backend); all data on disk is in riak_kv_eleveldb_backend.
>
> > Well, my anti-entropy folders in each machine have ~120G, It's quite a
> lot!!!
> >
> > I have ~600G of data per server and a cluster of 6 servers with
> level-db. Just for comparison effects, what about you?
> >
> > Someone of basho, can you please advise on this one?
> >
> > Best regards! :)
> >
>
>
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Update to 1.4.8

2014-04-08 Thread Edgar Veiga
Well, my anti-entropy folders on each machine have ~120G; it's quite a
lot!!!

I have ~600G of data per server and a cluster of 6 servers with leveldb.
Just for comparison purposes, what about you?

Someone from Basho, can you please advise on this one?

Best regards! :)


On 8 April 2014 11:02, Timo Gatsonides  wrote:

> "...So I stopped AAE on all nodes (with riak attach), removed the AAE folders 
> on all the nodes. And then restarted them one-by-one, so they all started 
> with a clean AAE state. Then about a day later the cluster was finally in a 
> normal state."
>
> I don't understand the difference between what you did and what I'm
> describing in the former emails? I've stopped the aae via riak attach, and
> then one by one, I've stopped the node, removed the anti-entropy data and
> started the node. Is there any subtle difference I'm not getting?
>
> I'm asking this because indeed this hasn't proved to be enough to stop the
> cluster entire cluster load. Another thing is the anti-entropy dir data
> size, since the upgrade it has reached very high values comparing to the
> previous ones...
>
>
> Unfortunately I don't have a 100% definitive answer for you, maybe someone
> from Basho can advise.
>
> In my case I noticed that after running riak_kv_entropy_manager:disable()
> the IO load did not decrease immediately and on some servers it took quite
> a while before iostat showed disk I/O going to normal levels. I only
> removed the AAE folders after IO load was normal.
>
> Now that you have mentioned it I just took a look at my servers and the
> anti-entropy dir is large (>500Mb) on my servers too, although it varies
> from one server to the next.
>
> Best regards,
> Timo
>
>
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Update to 1.4.8

2014-04-08 Thread Edgar Veiga
Hi Timo,

"...So I stopped AAE on all nodes (with riak attach), removed the AAE
folders on all the nodes. And then restarted them one-by-one, so they
all started with a clean AAE state. Then about a day later the cluster
was finally in a normal state."

I don't understand the difference between what you did and what I'm
describing in the previous emails. I stopped AAE via riak attach, and then,
one by one, I stopped each node, removed the anti-entropy data and started
the node again. Is there any subtle difference I'm not getting?

I'm asking this because, indeed, this hasn't proved to be enough to bring
the cluster load down. Another thing is the anti-entropy dir data size;
since the upgrade it has reached very high values compared to the previous
ones...

Best regards
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: RIAK 1.4.6 - Mass key deletion

2014-04-06 Thread Edgar Veiga
Matthew, thanks again for the response!

That said, I'll keep waiting for 2.0 (and maybe buy some bigger disks :)

Best regards


On 6 April 2014 15:02, Matthew Von-Maszewski  wrote:

> Edgar,
>
> In Riak 1.4, there is no advantage to using empty values versus deleting.
>
> leveldb is a "write once" data store.  New data for a given key never
> physically overwrites old data for the same key.  New data "hides" the old
> data by being in a lower level, and therefore picked first.
>
> leveldb's compaction operation will remove older key/value pairs only when
> the newer key/value is pair is part of a compaction involving both new and
> old.  The new and the old key/value pairs must have migrated to adjacent
> levels through normal compaction operations before leveldb will see them in
> the same compaction.  The migration could take days, weeks, or even months
> depending upon the size of your entire dataset and the rate of incoming
> write operations.
>
> leveldb's "delete" object is exactly the same as your empty JSON object.
>  The delete object simply has one more flag set that allows it to also be
> removed if and only if there is no chance for an identical key to exist on
> a higher level.
>
> I apologize that I cannot give you a more useful answer.  2.0 is on the
> horizon.
>
> Matthew
>
>
> On Apr 6, 2014, at 7:04 AM, Edgar Veiga  wrote:
>
> Hi again!
>
> Sorry to reopen this discussion, but I have another question regarding the
> former post.
>
> What if, instead of doing a mass deletion (We've already seen that it will
> be non profitable, regarding disk space) I update all the values with an
> empty JSON object "{}" ? Do you see any problem with this? I no longer need
> those millions of values that are living in the cluster...
>
> When the version 2.0 of riak runs stable I'll do the update and only then
> delete those keys!
>
> Best regards
>
>
> On 18 February 2014 16:32, Edgar Veiga  wrote:
>
>> Ok, thanks a lot Matthew.
>>
>>
>> On 18 February 2014 16:18, Matthew Von-Maszewski wrote:
>>
>>> Riak 2.0 is coming.  Hold your mass delete until then.  The "bug" is
>>> within Google's original leveldb architecture.  Riak 2.0 sneaks around to
>>> get the disk space freed.
>>>
>>> Matthew
>>>
>>>
>>>
>>> On Feb 18, 2014, at 11:10 AM, Edgar Veiga  wrote:
>>>
>>> The only/main purpose is to free disk space..
>>>
>>> I was a little bit concerned regarding this operation, but now with your
>>> feedback I'm tending to don't do nothing, I can't risk the growing of
>>> space...
>>> Regarding the overhead I think that with a tight throttling system I
>>> could control and avoid overloading the cluster.
>>>
>>> Mixed feelings :S
>>>
>>>
>>>
>>> On 18 February 2014 15:45, Matthew Von-Maszewski wrote:
>>>
>>>> Edgar,
>>>>
>>>> The first "concern" I have is that leveldb's delete does not free disk
>>>> space.  Others have executed mass delete operations only to discover they
>>>> are now using more disk space instead of less.  Here is a discussion of the
>>>> problem:
>>>>
>>>> https://github.com/basho/leveldb/wiki/mv-aggressive-delete
>>>>
>>>> The link also describes Riak's database operation overhead.  This is a
>>>> second "concern".  You will need to carefully throttle your delete rate or
>>>> the overhead will likely impact your production throughput.
>>>>
>>>> We have new code to help quicken the actual purge of deleted data in
>>>> Riak 2.0.  But that release is not quite ready for production usage.
>>>>
>>>>
>>>> What do you hope to achieve by the mass delete?
>>>>
>>>> Matthew
>>>>
>>>>
>>>>
>>>>
>>>> On Feb 18, 2014, at 10:29 AM, Edgar Veiga 
>>>> wrote:
>>>>
>>>> Sorry, forgot that info!
>>>>
>>>> It's leveldb.
>>>>
>>>> Best regards
>>>>
>>>>
>>>> On 18 February 2014 15:27, Matthew Von-Maszewski wrote:
>>>>
>>>>> Which Riak backend are you using:  bitcask, leveldb, multi?
>>>>>
>>>>> Matthew
>>>>>
>>>>>
>>>>> On Feb 18, 2014, at 10:17 AM, Edgar Veiga 
>&g

Re: RIAK 1.4.6 - Mass key deletion

2014-04-06 Thread Edgar Veiga
Hi again!

Sorry to reopen this discussion, but I have another question regarding the
previous post.

What if, instead of doing a mass deletion (we've already seen that it won't
pay off in terms of disk space), I update all the values with an empty JSON
object "{}"? Do you see any problem with this? I no longer need those
millions of values that are living in the cluster...

When Riak 2.0 is running stable I'll do the update, and only then delete
those keys!
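
Just to make the idea concrete, the update would be something like this (a
minimal Python sketch over the HTTP API; the host, bucket and file name are
placeholders, not my real setup):

    import time
    import requests  # third-party HTTP client: pip install requests

    RIAK = "http://127.0.0.1:8098"   # placeholder: any node's HTTP listener
    BUCKET = "mybucket"              # placeholder: bucket holding the old values
    KEY_FILE = "keys_to_blank.txt"   # placeholder: one URL-safe key per line

    with open(KEY_FILE) as f:
        for n, line in enumerate(f, 1):
            key = line.strip()
            if not key:
                continue
            # Overwrite the stored document with an empty JSON object.
            resp = requests.put(
                "%s/buckets/%s/keys/%s" % (RIAK, BUCKET, key),
                data="{}",
                headers={"Content-Type": "application/json"},
            )
            if resp.status_code not in (200, 204):
                print("unexpected status %s for key %s" % (resp.status_code, key))
            if n % 200 == 0:
                time.sleep(1)  # throttle so the cluster isn't overloaded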

Best regards


On 18 February 2014 16:32, Edgar Veiga  wrote:

> Ok, thanks a lot Matthew.
>
>
> On 18 February 2014 16:18, Matthew Von-Maszewski wrote:
>
>> Riak 2.0 is coming.  Hold your mass delete until then.  The "bug" is
>> within Google's original leveldb architecture.  Riak 2.0 sneaks around to
>> get the disk space freed.
>>
>> Matthew
>>
>>
>>
>> On Feb 18, 2014, at 11:10 AM, Edgar Veiga  wrote:
>>
>> The only/main purpose is to free disk space..
>>
>> I was a little bit concerned regarding this operation, but now with your
>> feedback I'm tending to don't do nothing, I can't risk the growing of
>> space...
>> Regarding the overhead I think that with a tight throttling system I
>> could control and avoid overloading the cluster.
>>
>> Mixed feelings :S
>>
>>
>>
>> On 18 February 2014 15:45, Matthew Von-Maszewski wrote:
>>
>>> Edgar,
>>>
>>> The first "concern" I have is that leveldb's delete does not free disk
>>> space.  Others have executed mass delete operations only to discover they
>>> are now using more disk space instead of less.  Here is a discussion of the
>>> problem:
>>>
>>> https://github.com/basho/leveldb/wiki/mv-aggressive-delete
>>>
>>> The link also describes Riak's database operation overhead.  This is a
>>> second "concern".  You will need to carefully throttle your delete rate or
>>> the overhead will likely impact your production throughput.
>>>
>>> We have new code to help quicken the actual purge of deleted data in
>>> Riak 2.0.  But that release is not quite ready for production usage.
>>>
>>>
>>> What do you hope to achieve by the mass delete?
>>>
>>> Matthew
>>>
>>>
>>>
>>>
>>> On Feb 18, 2014, at 10:29 AM, Edgar Veiga  wrote:
>>>
>>> Sorry, forgot that info!
>>>
>>> It's leveldb.
>>>
>>> Best regards
>>>
>>>
>>> On 18 February 2014 15:27, Matthew Von-Maszewski wrote:
>>>
>>>> Which Riak backend are you using:  bitcask, leveldb, multi?
>>>>
>>>> Matthew
>>>>
>>>>
>>>> On Feb 18, 2014, at 10:17 AM, Edgar Veiga 
>>>> wrote:
>>>>
>>>> > Hi all!
>>>> >
>>>> > I have a fairly trivial question regarding mass deletion on a riak
>>>> cluster, but firstly let me give you just some context. My cluster is
>>>> running with riak 1.4.6 on 6 machines with a ring of 256 nodes and 1Tb ssd
>>>> disks.
>>>> >
>>>> > I need to execute a massive object deletion on a bucket, I'm talking
>>>> of ~1 billion keys (The object average size is ~1Kb). I will not retrive
>>>> the keys from riak because a I have a file with all of them. I'll just
>>>> start a script that reads them from the file and triggers an HTTP DELETE
>>>> for each one.
>>>> > The cluster will continue running on production with a quite high
>>>> load serving all other applications, while running this deletion.
>>>> >
>>>> > My question is simple, do I need to have any kind of extra concerns
>>>> regarding this action? Do you advise me on taking special attention to any
>>>> kind of metrics regarding riak or event the servers where it's running?
>>>> >
>>>> > Best regards!
>>>> > ___
>>>> > riak-users mailing list
>>>> > riak-users@lists.basho.com
>>>> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>>>
>>>>
>>>
>>>
>>
>>
>
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Riak node down after ssd failure

2014-03-18 Thread Edgar Veiga
Thanks Evan!


Is it really imperative that the new name be different from the old one?




Best regards
—
Sent from my iPhone

On Tue, Mar 18, 2014 at 8:47 PM, Evan Vigil-McClanahan
 wrote:

> Probably the easiest thing would be to wait until the new machine is
> ready to join, add it with a nodename distinct from that of the last
> one, and force-replace the new node for the old, dead one.
> On Tue, Mar 18, 2014 at 1:03 PM, Edgar Veiga  wrote:
>> Hello all!
>>
>> I have a 6 machine cluster with leveldb as backend, using riak 1.4.8
>> version.
>>
>> Today, the ssd of one of the machine's has failed and is not recoverable, so
>> I have already ordered a new one that should arrive by thursday!
>>
>> In the meanwhile what should I do? I have no backup of the old ssd data...
>> Should I remove the machine of the cluster and when the new ssd arrives join
>> the machine again?
>>
>> Best regards!
>>
>> ___
>> riak-users mailing list
>> riak-users@lists.basho.com
>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Riak node down after ssd failure

2014-03-18 Thread Edgar Veiga
Hello all!

I have a 6 machine cluster with leveldb as backend, using riak 1.4.8
version.

Today, the SSD of one of the machines failed and is not recoverable, so I
have already ordered a new one that should arrive by Thursday!

In the meantime, what should I do? I have no backup of the old SSD data...
Should I remove the machine from the cluster and, when the new SSD arrives,
join it again?

Best regards!
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Update to 1.4.8

2014-03-06 Thread Edgar Veiga
Hi Scott,

Thanks for replying.

After this problem, I've been faced with a huge amount of disk consumption
in the AAE folder (it's now on the order of 100G).

Indeed, I've been talking with Brian Sparrow from your team and he advised
me to remove the anti-entropy folder contents. Before that I'm going to
attach to the riak process and stop AAE. After that I'll restart all the
nodes one by one, removing the anti-entropy contents in between.

I should have done this when I updated to 1.4.8, but unfortunately the note
only appeared in the release notes after I had started the upgrade
process...

Best regards!


On 6 March 2014 03:08, Scott Lystig Fritchie  wrote:

> Edgar Veiga  wrote:
>
> ev> Is this normal?
>
> Yes.  One or more of your vnodes can't keep up with the workload
> generated by AAE repair  or a vnode can't keep up for another
> reason, and AAE repair shouldn't actively make things worse.
>
> The logging is done by:
>
>
> https://github.com/basho/riak_kv/blob/1.4/src/riak_kv_entropy_manager.erl#L821
>
> -Scott
>
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Update to 1.4.8

2014-03-03 Thread Edgar Veiga
Hi all,

I have a cluster of 6 servers with leveldb as the backend. Previously I was
on riak version 1.4.6.

I've updated to 1.4.8 and since then the output logs of all the nodes have
been flooded with messages like this:

2014-03-03 08:52:21.619 [info]
<0.776.0>@riak_kv_entropy_manager:perhaps_log_throttle_change:826 Changing
AAE throttle from 5000 -> 0 msec/key, based on maximum vnode mailbox size 3
from 'riak@192.168.20.112'
2014-03-03 08:55:36.754 [info]
<0.776.0>@riak_kv_entropy_manager:perhaps_log_throttle_change:826 Changing
AAE throttle from 0 -> 10 msec/key, based on maximum vnode mailbox size 215
from 'riak@192.168.20.108'
2014-03-03 08:55:51.760 [info]
<0.776.0>@riak_kv_entropy_manager:perhaps_log_throttle_change:826 Changing
AAE throttle from 10 -> 50 msec/key, based on maximum vnode mailbox size
550 from 'riak@192.168.20.108'
2014-03-03 08:56:21.776 [info]
<0.776.0>@riak_kv_entropy_manager:perhaps_log_throttle_change:826 Changing
AAE throttle from 50 -> 5000 msec/key, based on maximum vnode mailbox size
3080 from 'riak@192.168.20.108'
2014-03-03 08:56:36.788 [info]
<0.776.0>@riak_kv_entropy_manager:perhaps_log_throttle_change:826 Changing
AAE throttle from 5000 -> 0 msec/key, based on maximum vnode mailbox size 9
from 'riak@192.168.20.110'
2014-03-03 09:00:06.912 [info]
<0.776.0>@riak_kv_entropy_manager:perhaps_log_throttle_change:826 Changing
AAE throttle from 0 -> 10 msec/key, based on maximum vnode mailbox size 404
from 'riak@192.168.20.108'
2014-03-03 09:00:21.924 [info]
<0.776.0>@riak_kv_entropy_manager:perhaps_log_throttle_change:826 Changing
AAE throttle from 10 -> 0 msec/key, based on maximum vnode mailbox size 0
from 'riak@192.168.20.112'
2014-03-03 09:00:36.933 [info]
<0.776.0>@riak_kv_entropy_manager:perhaps_log_throttle_change:826 Changing
AAE throttle from 0 -> 10 msec/key, based on maximum vnode mailbox size 359
from 'riak@192.168.20.108'
2014-03-03 09:00:52.024 [info]
<0.776.0>@riak_kv_entropy_manager:perhaps_log_throttle_change:826 Changing
AAE throttle from 10 -> 5000 msec/key, based on maximum vnode mailbox size
3507 from 'riak@192.168.20.108'
2014-03-03 09:01:07.033 [info]
<0.776.0>@riak_kv_entropy_manager:perhaps_log_throttle_change:826 Changing
AAE throttle from 5000 -> 0 msec/key, based on maximum vnode mailbox size 2
from 'riak@192.168.20.111'

Is this normal?

Best regards
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


1.4.8 - AAE regression fixed

2014-02-27 Thread Edgar Veiga
In the release notes, there's a new section recommending the deletion of
all previous AAE info before upgrading to 1.4.8.

What are the risks (if any) of not doing this (the deletion), besides
wasting resources?

Best Regards
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: RIAK 1.4.6 - Mass key deletion

2014-02-18 Thread Edgar Veiga
Ok, thanks a lot Matthew.


On 18 February 2014 16:18, Matthew Von-Maszewski  wrote:

> Riak 2.0 is coming.  Hold your mass delete until then.  The "bug" is
> within Google's original leveldb architecture.  Riak 2.0 sneaks around to
> get the disk space freed.
>
> Matthew
>
>
>
> On Feb 18, 2014, at 11:10 AM, Edgar Veiga  wrote:
>
> The only/main purpose is to free disk space..
>
> I was a little bit concerned regarding this operation, but now with your
> feedback I'm tending to don't do nothing, I can't risk the growing of
> space...
> Regarding the overhead I think that with a tight throttling system I could
> control and avoid overloading the cluster.
>
> Mixed feelings :S
>
>
>
> On 18 February 2014 15:45, Matthew Von-Maszewski wrote:
>
>> Edgar,
>>
>> The first "concern" I have is that leveldb's delete does not free disk
>> space.  Others have executed mass delete operations only to discover they
>> are now using more disk space instead of less.  Here is a discussion of the
>> problem:
>>
>> https://github.com/basho/leveldb/wiki/mv-aggressive-delete
>>
>> The link also describes Riak's database operation overhead.  This is a
>> second "concern".  You will need to carefully throttle your delete rate or
>> the overhead will likely impact your production throughput.
>>
>> We have new code to help quicken the actual purge of deleted data in Riak
>> 2.0.  But that release is not quite ready for production usage.
>>
>>
>> What do you hope to achieve by the mass delete?
>>
>> Matthew
>>
>>
>>
>>
>> On Feb 18, 2014, at 10:29 AM, Edgar Veiga  wrote:
>>
>> Sorry, forgot that info!
>>
>> It's leveldb.
>>
>> Best regards
>>
>>
>> On 18 February 2014 15:27, Matthew Von-Maszewski wrote:
>>
>>> Which Riak backend are you using:  bitcask, leveldb, multi?
>>>
>>> Matthew
>>>
>>>
>>> On Feb 18, 2014, at 10:17 AM, Edgar Veiga  wrote:
>>>
>>> > Hi all!
>>> >
>>> > I have a fairly trivial question regarding mass deletion on a riak
>>> cluster, but firstly let me give you just some context. My cluster is
>>> running with riak 1.4.6 on 6 machines with a ring of 256 nodes and 1Tb ssd
>>> disks.
>>> >
>>> > I need to execute a massive object deletion on a bucket, I'm talking
>>> of ~1 billion keys (The object average size is ~1Kb). I will not retrive
>>> the keys from riak because a I have a file with all of them. I'll just
>>> start a script that reads them from the file and triggers an HTTP DELETE
>>> for each one.
>>> > The cluster will continue running on production with a quite high load
>>> serving all other applications, while running this deletion.
>>> >
>>> > My question is simple, do I need to have any kind of extra concerns
>>> regarding this action? Do you advise me on taking special attention to any
>>> kind of metrics regarding riak or event the servers where it's running?
>>> >
>>> > Best regards!
>>> > ___
>>> > riak-users mailing list
>>> > riak-users@lists.basho.com
>>> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>>
>>>
>>
>>
>
>
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: RIAK 1.4.6 - Mass key deletion

2014-02-18 Thread Edgar Veiga
The only/main purpose is to free disk space..

I was a little bit concerned about this operation, and now with your
feedback I'm leaning towards doing nothing; I can't risk the disk usage
growing...
Regarding the overhead, I think that with a tight throttling system I could
control it and avoid overloading the cluster.
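
By a throttling system I mean something adaptive that backs off when Riak
slows down, roughly along these lines (an illustrative Python sketch only;
the host, bucket and target latency are placeholders):

    import time
    import requests  # third-party HTTP client: pip install requests

    RIAK = "http://127.0.0.1:8098"  # placeholder: any node's HTTP listener
    BUCKET = "mybucket"             # placeholder
    TARGET_MS = 20.0                # placeholder: acceptable per-delete latency

    def throttled_delete(keys):
        delay = 0.0  # extra sleep between deletes, adapted below
        for key in keys:
            start = time.time()
            requests.delete("%s/buckets/%s/keys/%s" % (RIAK, BUCKET, key))
            elapsed_ms = (time.time() - start) * 1000.0
            # Back off when deletes get slow, speed up again when they are fast.
            if elapsed_ms > TARGET_MS:
                delay = min(delay + 0.01, 1.0)
            else:
                delay = max(delay - 0.001, 0.0)
            if delay:
                time.sleep(delay)

    # e.g. throttled_delete(k.strip() for k in open("keys_to_delete.txt"))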

Mixed feelings :S



On 18 February 2014 15:45, Matthew Von-Maszewski  wrote:

> Edgar,
>
> The first "concern" I have is that leveldb's delete does not free disk
> space.  Others have executed mass delete operations only to discover they
> are now using more disk space instead of less.  Here is a discussion of the
> problem:
>
> https://github.com/basho/leveldb/wiki/mv-aggressive-delete
>
> The link also describes Riak's database operation overhead.  This is a
> second "concern".  You will need to carefully throttle your delete rate or
> the overhead will likely impact your production throughput.
>
> We have new code to help quicken the actual purge of deleted data in Riak
> 2.0.  But that release is not quite ready for production usage.
>
>
> What do you hope to achieve by the mass delete?
>
> Matthew
>
>
>
>
> On Feb 18, 2014, at 10:29 AM, Edgar Veiga  wrote:
>
> Sorry, forgot that info!
>
> It's leveldb.
>
> Best regards
>
>
> On 18 February 2014 15:27, Matthew Von-Maszewski wrote:
>
>> Which Riak backend are you using:  bitcask, leveldb, multi?
>>
>> Matthew
>>
>>
>> On Feb 18, 2014, at 10:17 AM, Edgar Veiga  wrote:
>>
>> > Hi all!
>> >
>> > I have a fairly trivial question regarding mass deletion on a riak
>> cluster, but firstly let me give you just some context. My cluster is
>> running with riak 1.4.6 on 6 machines with a ring of 256 nodes and 1Tb ssd
>> disks.
>> >
>> > I need to execute a massive object deletion on a bucket, I'm talking of
>> ~1 billion keys (The object average size is ~1Kb). I will not retrive the
>> keys from riak because a I have a file with all of them. I'll just start a
>> script that reads them from the file and triggers an HTTP DELETE for each
>> one.
>> > The cluster will continue running on production with a quite high load
>> serving all other applications, while running this deletion.
>> >
>> > My question is simple, do I need to have any kind of extra concerns
>> regarding this action? Do you advise me on taking special attention to any
>> kind of metrics regarding riak or event the servers where it's running?
>> >
>> > Best regards!
>> > ___
>> > riak-users mailing list
>> > riak-users@lists.basho.com
>> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>
>>
>
>
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: RIAK 1.4.6 - Mass key deletion

2014-02-18 Thread Edgar Veiga
Sorry, forgot that info!

It's leveldb.

Best regards


On 18 February 2014 15:27, Matthew Von-Maszewski  wrote:

> Which Riak backend are you using:  bitcask, leveldb, multi?
>
> Matthew
>
>
> On Feb 18, 2014, at 10:17 AM, Edgar Veiga  wrote:
>
> > Hi all!
> >
> > I have a fairly trivial question regarding mass deletion on a riak
> cluster, but firstly let me give you just some context. My cluster is
> running with riak 1.4.6 on 6 machines with a ring of 256 nodes and 1Tb ssd
> disks.
> >
> > I need to execute a massive object deletion on a bucket, I'm talking of
> ~1 billion keys (The object average size is ~1Kb). I will not retrive the
> keys from riak because a I have a file with all of them. I'll just start a
> script that reads them from the file and triggers an HTTP DELETE for each
> one.
> > The cluster will continue running on production with a quite high load
> serving all other applications, while running this deletion.
> >
> > My question is simple, do I need to have any kind of extra concerns
> regarding this action? Do you advise me on taking special attention to any
> kind of metrics regarding riak or event the servers where it's running?
> >
> > Best regards!
> > ___
> > riak-users mailing list
> > riak-users@lists.basho.com
> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>
>
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


RIAK 1.4.6 - Mass key deletion

2014-02-18 Thread Edgar Veiga
Hi all!

I have a fairly trivial question regarding mass deletion on a Riak cluster,
but first let me give you some context. My cluster is running riak 1.4.6 on
6 machines, with a ring size of 256 and 1Tb SSD disks.

I need to execute a massive object deletion on a bucket; I'm talking about
~1 billion keys (the average object size is ~1Kb). I will not retrieve the
keys from Riak, because I have a file with all of them. I'll just start a
script that reads them from the file and triggers an HTTP DELETE for each
one (roughly the sketch below).
The cluster will keep running in production with quite a high load, serving
all other applications, while this deletion runs.

My question is simple: do I need to have any extra concerns regarding this
action? Do you advise paying special attention to any particular metrics of
Riak, or even of the servers where it's running?
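
For reference, the script will be something along these lines (a rough
Python sketch; the host, bucket, file name and rate are placeholders):

    import time
    import requests  # third-party HTTP client: pip install requests

    RIAK = "http://127.0.0.1:8098"    # placeholder: any node's HTTP listener
    BUCKET = "mybucket"               # placeholder
    KEY_FILE = "keys_to_delete.txt"   # placeholder: one URL-safe key per line
    DELETES_PER_SECOND = 200          # throttle knob, to be tuned against load

    with open(KEY_FILE) as f:
        for n, line in enumerate(f, 1):
            key = line.strip()
            if not key:
                continue
            # Riak HTTP API: DELETE /buckets/<bucket>/keys/<key>
            resp = requests.delete("%s/buckets/%s/keys/%s" % (RIAK, BUCKET, key))
            if resp.status_code not in (204, 404):
                print("unexpected status %s for key %s" % (resp.status_code, key))
            if n % DELETES_PER_SECOND == 0:
                time.sleep(1)  # crude fixed-rate throttle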

Best regards!
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Level-db cluster

2014-02-11 Thread Edgar Veiga
Thanks for replying :)

Best regards!


On 11 February 2014 23:28, John Daily  wrote:

> After asking around a bit, I think we're likely to remove that
> recommendation from the documentation.
>
> Tuning is always a bit of a dark art, so as always your mileage may vary,
> but there doesn't seem to be any real advantage to lowering the thread
> count.
>
> Thanks for raising the issue.
>
> -John
>
>
> On Feb 3, 2014, at 10:51 AM, Edgar Veiga  wrote:
>
> Hi all,
>
> I have a 6 machines cluster with a ring of 256 nodes with levelDB as
> backend.
>
> I've seen that recently in the documentation, this has appeared:
>
> If using LevelDB as the storage backend (which maintains its own I/O
> thread pool), the number of async threads in Riak's default pool can be
> decreased in the /etc/riak/vm.args file.
>
> How much is this relevant regarding the overall cluster performance? Right
> now I've the defaul 64 threads.
>
> Best regards.
>  ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>
>
>
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Level-db cluster

2014-02-03 Thread Edgar Veiga
Hi all,

I have a 6-machine cluster with a ring size of 256 and levelDB as the
backend.

I've seen that this has recently appeared in the documentation:

If using LevelDB as the storage backend (which maintains its own I/O thread
pool), the number of async threads in Riak's default pool can be decreased
in the /etc/riak/vm.args file.

How relevant is this to the overall cluster performance? Right now I have
the default 64 threads.

Best regards.
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: last_write_wins

2014-01-30 Thread Edgar Veiga
No problem Jason, I'm glad you tried to help :)

I'm using the leveldb backend, and the system has been running in
production for about 6 months. It's been quite an interesting experience,
but now that the load is getting bigger, and the amount of data in Riak
too, we need to start tuning these little things.

Best regards!


On 30 January 2014 23:17, Jason Campbell  wrote:

> Oh, I completely misunderstood, I'm sorry for that.  I was thinking of
> your application as a typical web application which could regenerate the
> data at any time (making that the authoritative source, not Riak).
>
> In that case, Riak does sound perfect, but I would definitely not use the
> memory backend if that is the only copy of the data.
>
> Eric, I'm sorry if I made is sound like Riak is a poor cache in all
> situations, I just didn't think it fit here (although I clearly
> misunderstood).  There is a tradeoff between speed and
> consistency/reliability, and the whole application has to take advantage of
> the extra consistency and reliability for it to make sense.
>
> Sorry again,
> Jason Campbell
>
> - Original Message -
> From: "Edgar Veiga" 
> To: "Eric Redmond" 
> Cc: "Jason Campbell" , "riak-users" <
> riak-users@lists.basho.com>, "Russell Brown" 
> Sent: Friday, 31 January, 2014 9:54:33 AM
> Subject: Re: last_write_wins
>
>
> Hi!
>
>
> I think that you are making some kind of confusion here... I'm not using
> riak for cache purposes, thats exactly the opposite! Riak is my end
> persistence system, I need to store the documents in a strong, secure,
> available and consistent place. That's riak.
>
>
> It's like I've said before, just make an analogy with the linux file cache
> system. Node.js workers simulate that in-memory cache, php applications
> write and read from them and when something is dirty, it's persisted to
> riak...
>
>
> Best regards
>
>
>
>
>
>
>
> On 30 January 2014 22:26, Eric Redmond < eredm...@basho.com > wrote:
>
>
>
>
> Actually people use Riak as a distributed cache all the time. In fact,
> many customers use it exclusively as a cache system. Not all backends write
> to disk. Riak supports a main memory backend[1], complete with size limits
> and TTL.
>
>
> Eric
>
>
> [1]: http://docs.basho.com/riak/latest/ops/advanced/backends/memory/
>
>
>
>
>
>
> On Jan 30, 2014, at 1:48 PM, Jason Campbell < xia...@xiaclo.net > wrote:
>
>
> I'm not sure Riak is the best fit for this. Riak is great for applications
> where it is the source of data, and has very strong consistency when used
> in this way. You are using it as a cache, where Riak will be significantly
> slower than other cache solutions. Especially since you say that each
> worker will have a set of documents it is responsible for. Something like a
> local memcache or redis would likely suit this use case just as well, but
> do it much faster with less overhead.
>
> Riak will guarantee 3 writes to disk (by default), where something like
> memcache or redis will stay in memory, and if local, won't have network
> latency either. In the worst case where a node goes offline, the real data
> can be pulled from the backend again, so it isn't a big deal. It will also
> simplify your application, because node.js can always request from cache
> and not worry about the speed, instead of maintaining it's own cache layer.
>
> I'm as happy as the next person on this list to see Riak being used for
> all sorts of uses, but I believe in the right tool for the right job.
> Unless there is something I don't understand, Riak is probably the wrong
> tool. It will work, but there is other software that will work much better.
>
> I hope this helps,
> Jason Campbell
>
> - Original Message -
> From: "Edgar Veiga" < edgarmve...@gmail.com >
> To: "Russell Brown" < russell.br...@me.com >
> Cc: "riak-users" < riak-users@lists.basho.com >
> Sent: Friday, 31 January, 2014 3:20:42 AM
> Subject: Re: last_write_wins
>
>
>
> I'll try to explain this the best I can, although it's a simples
> architecture I'm not describing it in my native language :)
>
>
> I have a set of node.js workers (64 for now) that serve as a
> cache/middleware layer for a dozen of php applications. Each worker deals
> with a set of documents (it's not a distributed cache system). Each worker
> updates the documents in memory, and tags them as dirty (just like OS file
> cache), and from time to time (for now, it's a 5 seconds w

Re: last_write_wins

2014-01-30 Thread Edgar Veiga
Here's a (bad) mockup of the solution:

https://cloudup.com/cOMhcPry38U

Hope that this time I've made myself a little more clear :)

Regards


On 30 January 2014 23:04, Edgar Veiga  wrote:

> Yes Eric, I understood :)
>
>
> On 30 January 2014 23:00, Eric Redmond  wrote:
>
>> For clarity, I was responding to Jason's assertion that Riak shouldn't be
>> used as a cache, not to your specific issue, Edgar.
>>
>> Eric
>>
>> On Jan 30, 2014, at 2:54 PM, Edgar Veiga  wrote:
>>
>> Hi!
>>
>> I think that you are making some kind of confusion here... I'm not using
>> riak for cache purposes, thats exactly the opposite! Riak is my end
>> persistence system, I need to store the documents in a strong, secure,
>> available and consistent place. That's riak.
>>
>> It's like I've said before, just make an analogy with the linux file
>> cache system. Node.js workers simulate that in-memory cache, php
>> applications write and read from them and when something is dirty, it's
>> persisted to riak...
>>
>> Best regards
>>
>>
>>
>>
>> On 30 January 2014 22:26, Eric Redmond  wrote:
>>
>>> Actually people use Riak as a distributed cache all the time. In fact,
>>> many customers use it exclusively as a cache system. Not all backends write
>>> to disk. Riak supports a main memory backend[1], complete with size limits
>>> and TTL.
>>>
>>> Eric
>>>
>>> [1]: http://docs.basho.com/riak/latest/ops/advanced/backends/memory/
>>>
>>>
>>> On Jan 30, 2014, at 1:48 PM, Jason Campbell  wrote:
>>>
>>> I'm not sure Riak is the best fit for this.  Riak is great for
>>> applications where it is the source of data, and has very strong
>>> consistency when used in this way.  You are using it as a cache, where Riak
>>> will be significantly slower than other cache solutions.  Especially since
>>> you say that each worker will have a set of documents it is responsible
>>> for.  Something like a local memcache or redis would likely suit this use
>>> case just as well, but do it much faster with less overhead.
>>>
>>> Riak will guarantee 3 writes to disk (by default), where something like
>>> memcache or redis will stay in memory, and if local, won't have network
>>> latency either.  In the worst case where a node goes offline, the real data
>>> can be pulled from the backend again, so it isn't a big deal.  It will also
>>> simplify your application, because node.js can always request from cache
>>> and not worry about the speed, instead of maintaining it's own cache layer.
>>>
>>> I'm as happy as the next person on this list to see Riak being used for
>>> all sorts of uses, but I believe in the right tool for the right job.
>>>  Unless there is something I don't understand, Riak is probably the wrong
>>> tool.  It will work, but there is other software that will work much better.
>>>
>>> I hope this helps,
>>> Jason Campbell
>>>
>>> - Original Message -
>>> From: "Edgar Veiga" 
>>> To: "Russell Brown" 
>>> Cc: "riak-users" 
>>> Sent: Friday, 31 January, 2014 3:20:42 AM
>>> Subject: Re: last_write_wins
>>>
>>>
>>>
>>> I'll try to explain this the best I can, although it's a simples
>>> architecture I'm not describing it in my native language :)
>>>
>>>
>>> I have a set of node.js workers (64 for now) that serve as a
>>> cache/middleware layer for a dozen of php applications. Each worker deals
>>> with a set of documents (it's not a distributed cache system). Each worker
>>> updates the documents in memory, and tags them as dirty (just like OS file
>>> cache), and from time to time (for now, it's a 5 seconds window interval),
>>> a persister module will deal with the persistence of those dirty documents
>>> to riak.
>>> If the document isn't in memory, it will be fetched from riak.
>>>
>>>
>>> If you want document X, you need to ask to the corresponding worker
>>> dealing with it. Two different workers, don't deal with the same document.
>>> That way we can guarantee that there will be no concurrent writes to
>>> riak.
>>>
>>>
>>> Best Regards,
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> On 30

Re: last_write_wins

2014-01-30 Thread Edgar Veiga
Yes Eric, I understood :)


On 30 January 2014 23:00, Eric Redmond  wrote:

> For clarity, I was responding to Jason's assertion that Riak shouldn't be
> used as a cache, not to your specific issue, Edgar.
>
> Eric
>
> On Jan 30, 2014, at 2:54 PM, Edgar Veiga  wrote:
>
> Hi!
>
> I think that you are making some kind of confusion here... I'm not using
> riak for cache purposes, thats exactly the opposite! Riak is my end
> persistence system, I need to store the documents in a strong, secure,
> available and consistent place. That's riak.
>
> It's like I've said before, just make an analogy with the linux file cache
> system. Node.js workers simulate that in-memory cache, php applications
> write and read from them and when something is dirty, it's persisted to
> riak...
>
> Best regards
>
>
>
>
> On 30 January 2014 22:26, Eric Redmond  wrote:
>
>> Actually people use Riak as a distributed cache all the time. In fact,
>> many customers use it exclusively as a cache system. Not all backends write
>> to disk. Riak supports a main memory backend[1], complete with size limits
>> and TTL.
>>
>> Eric
>>
>> [1]: http://docs.basho.com/riak/latest/ops/advanced/backends/memory/
>>
>>
>> On Jan 30, 2014, at 1:48 PM, Jason Campbell  wrote:
>>
>> I'm not sure Riak is the best fit for this.  Riak is great for
>> applications where it is the source of data, and has very strong
>> consistency when used in this way.  You are using it as a cache, where Riak
>> will be significantly slower than other cache solutions.  Especially since
>> you say that each worker will have a set of documents it is responsible
>> for.  Something like a local memcache or redis would likely suit this use
>> case just as well, but do it much faster with less overhead.
>>
>> Riak will guarantee 3 writes to disk (by default), where something like
>> memcache or redis will stay in memory, and if local, won't have network
>> latency either.  In the worst case where a node goes offline, the real data
>> can be pulled from the backend again, so it isn't a big deal.  It will also
>> simplify your application, because node.js can always request from cache
>> and not worry about the speed, instead of maintaining it's own cache layer.
>>
>> I'm as happy as the next person on this list to see Riak being used for
>> all sorts of uses, but I believe in the right tool for the right job.
>>  Unless there is something I don't understand, Riak is probably the wrong
>> tool.  It will work, but there is other software that will work much better.
>>
>> I hope this helps,
>> Jason Campbell
>>
>> - Original Message -
>> From: "Edgar Veiga" 
>> To: "Russell Brown" 
>> Cc: "riak-users" 
>> Sent: Friday, 31 January, 2014 3:20:42 AM
>> Subject: Re: last_write_wins
>>
>>
>>
>> I'll try to explain this the best I can, although it's a simples
>> architecture I'm not describing it in my native language :)
>>
>>
>> I have a set of node.js workers (64 for now) that serve as a
>> cache/middleware layer for a dozen of php applications. Each worker deals
>> with a set of documents (it's not a distributed cache system). Each worker
>> updates the documents in memory, and tags them as dirty (just like OS file
>> cache), and from time to time (for now, it's a 5 seconds window interval),
>> a persister module will deal with the persistence of those dirty documents
>> to riak.
>> If the document isn't in memory, it will be fetched from riak.
>>
>>
>> If you want document X, you need to ask to the corresponding worker
>> dealing with it. Two different workers, don't deal with the same document.
>> That way we can guarantee that there will be no concurrent writes to
>> riak.
>>
>>
>> Best Regards,
>>
>>
>>
>>
>>
>>
>>
>> On 30 January 2014 10:46, Russell Brown < russell.br...@me.com > wrote:
>>
>>
>>
>>
>>
>>
>>
>> On 30 Jan 2014, at 10:37, Edgar Veiga < edgarmve...@gmail.com > wrote:
>>
>>
>>
>> Also,
>>
>>
>> Using last_write_wins = true, do I need to always send the vclock while
>> on a PUT request? In the official documention it says that riak will look
>> only at the timestamp of the requests.
>>
>>
>> Ok, from what you've said it sounds like you are always wanting to
>> replace what is at a key w

Re: last_write_wins

2014-01-30 Thread Edgar Veiga
Hi!

I think there's some confusion here... I'm not using Riak for cache
purposes, it's exactly the opposite! Riak is my final persistence system; I
need to store the documents in a strong, secure, available and consistent
place. That's Riak.

As I've said before, just make an analogy with the Linux file cache system:
the node.js workers simulate that in-memory cache, the PHP applications
write to and read from them, and when something is dirty, it's persisted to
Riak...

Best regards




On 30 January 2014 22:26, Eric Redmond  wrote:

> Actually people use Riak as a distributed cache all the time. In fact,
> many customers use it exclusively as a cache system. Not all backends write
> to disk. Riak supports a main memory backend[1], complete with size limits
> and TTL.
>
> Eric
>
> [1]: http://docs.basho.com/riak/latest/ops/advanced/backends/memory/
>
>
> On Jan 30, 2014, at 1:48 PM, Jason Campbell  wrote:
>
> I'm not sure Riak is the best fit for this.  Riak is great for
> applications where it is the source of data, and has very strong
> consistency when used in this way.  You are using it as a cache, where Riak
> will be significantly slower than other cache solutions.  Especially since
> you say that each worker will have a set of documents it is responsible
> for.  Something like a local memcache or redis would likely suit this use
> case just as well, but do it much faster with less overhead.
>
> Riak will guarantee 3 writes to disk (by default), where something like
> memcache or redis will stay in memory, and if local, won't have network
> latency either.  In the worst case where a node goes offline, the real data
> can be pulled from the backend again, so it isn't a big deal.  It will also
> simplify your application, because node.js can always request from cache
> and not worry about the speed, instead of maintaining it's own cache layer.
>
> I'm as happy as the next person on this list to see Riak being used for
> all sorts of uses, but I believe in the right tool for the right job.
>  Unless there is something I don't understand, Riak is probably the wrong
> tool.  It will work, but there is other software that will work much better.
>
> I hope this helps,
> Jason Campbell
>
> - Original Message -
> From: "Edgar Veiga" 
> To: "Russell Brown" 
> Cc: "riak-users" 
> Sent: Friday, 31 January, 2014 3:20:42 AM
> Subject: Re: last_write_wins
>
>
>
> I'll try to explain this the best I can, although it's a simples
> architecture I'm not describing it in my native language :)
>
>
> I have a set of node.js workers (64 for now) that serve as a
> cache/middleware layer for a dozen of php applications. Each worker deals
> with a set of documents (it's not a distributed cache system). Each worker
> updates the documents in memory, and tags them as dirty (just like OS file
> cache), and from time to time (for now, it's a 5 seconds window interval),
> a persister module will deal with the persistence of those dirty documents
> to riak.
> If the document isn't in memory, it will be fetched from riak.
>
>
> If you want document X, you need to ask to the corresponding worker
> dealing with it. Two different workers, don't deal with the same document.
> That way we can guarantee that there will be no concurrent writes to riak.
>
>
> Best Regards,
>
>
>
>
>
>
>
> On 30 January 2014 10:46, Russell Brown < russell.br...@me.com > wrote:
>
>
>
>
>
>
>
> On 30 Jan 2014, at 10:37, Edgar Veiga < edgarmve...@gmail.com > wrote:
>
>
>
> Also,
>
>
> Using last_write_wins = true, do I need to always send the vclock while on
> a PUT request? In the official documention it says that riak will look only
> at the timestamp of the requests.
>
>
> Ok, from what you've said it sounds like you are always wanting to replace
> what is at a key with the new information you are putting. If that is the
> case, then you have the perfect use case for LWW=true. And indeed, you do
> not need to pass a vclock with your put request. And it sounds like there
> is no need for you to fetch-before-put since that is only to get context
> /resolve siblings. Curious about your use case if you can share more.
>
>
> Cheers
>
>
> Russell
>
>
>
>
>
>
>
>
>
>
> Best regards,
>
>
>
> On 29 January 2014 10:29, Edgar Veiga < edgarmve...@gmail.com > wrote:
>
>
>
> Hi Russel,
>
>
> No, it doesn't depend. It's always a new value.
>
>
> Best regards
>
>
>
>
>
> On 29 January 2014 10:10, Russe

Re: last_write_wins

2014-01-30 Thread Edgar Veiga
I'll try to explain this as best I can; although it's a simple
architecture, I'm not describing it in my native language :)

I have a set of node.js workers (64 for now) that serve as a
cache/middleware layer for a dozen PHP applications. Each worker deals with
a set of documents (it's not a distributed cache system). Each worker
updates the documents in memory and tags them as dirty (just like the OS
file cache), and from time to time (for now, on a 5-second window interval)
a persister module handles the persistence of those dirty documents to
Riak.
If a document isn't in memory, it is fetched from Riak.

If you want document X, you need to ask the corresponding worker dealing
with it. Two different workers don't deal with the same document.
That way we can guarantee that there will be no concurrent writes to Riak.
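
If it helps to picture it, here is a very rough sketch of one worker's
logic (illustrative Python only; the real workers are node.js, and the
names and endpoints are placeholders):

    import json
    import time
    import requests  # third-party HTTP client: pip install requests

    RIAK = "http://127.0.0.1:8098"  # placeholder
    BUCKET = "documents"            # placeholder

    cache = {}     # key -> document held in memory by this worker
    dirty = set()  # keys changed since the last flush

    def get_document(key):
        # Serve from memory when possible, otherwise fall back to Riak.
        if key not in cache:
            resp = requests.get("%s/buckets/%s/keys/%s" % (RIAK, BUCKET, key))
            cache[key] = resp.json() if resp.status_code == 200 else {}
        return cache[key]

    def update_document(key, doc):
        cache[key] = doc
        dirty.add(key)  # tagged dirty; persisted by the next flush

    def flush_dirty():
        # Only this worker ever owns these keys, so no two writers can race
        # on the same Riak object.
        for key in list(dirty):
            requests.put(
                "%s/buckets/%s/keys/%s" % (RIAK, BUCKET, key),
                data=json.dumps(cache[key]),
                headers={"Content-Type": "application/json"},
            )
            dirty.discard(key)

    if __name__ == "__main__":
        while True:        # the "persister" loop
            time.sleep(5)  # 5-second flush window
            flush_dirty()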

Best Regards,




On 30 January 2014 10:46, Russell Brown  wrote:

>
> On 30 Jan 2014, at 10:37, Edgar Veiga  wrote:
>
> Also,
>
> Using last_write_wins = true, do I need to always send the vclock while on
> a PUT request? In the official documention it says that riak will look only
> at the timestamp of the requests.
>
>
> Ok, from what you've said it sounds like you are always wanting to replace
> what is at a key with the new information you are putting. If that is the
> case, then you have the perfect use case for LWW=true. And indeed, you do
> not need to pass a vclock with your put request. And it sounds like there
> is no need for you to fetch-before-put since that is only to get context
> /resolve siblings. Curious about your use case if you can share more.
>
> Cheers
>
> Russell
>
>
>
> Best regards,
>
>
> On 29 January 2014 10:29, Edgar Veiga  wrote:
>
>> Hi Russel,
>>
>> No, it doesn't depend. It's always a new value.
>>
>> Best regards
>>
>>
>> On 29 January 2014 10:10, Russell Brown  wrote:
>>
>>>
>>> On 29 Jan 2014, at 09:57, Edgar Veiga  wrote:
>>>
>>> tl;dr
>>>
>>> If I guarantee that the same key is only written with a 5 second
>>> interval, is last_write_wins=true profitable?
>>>
>>>
>>> It depends. Does the value you write depend in anyway on the value you
>>> read, or is it always that you are just getting a totally new value that
>>> replaces what is in Riak (regardless what is in Riak)?
>>>
>>>
>>>
>>> On 27 January 2014 23:25, Edgar Veiga  wrote:
>>>
>>>> Hi there everyone!
>>>>
>>>> I would like to know, if my current application is a good use case to
>>>> set last_write_wins to true.
>>>>
>>>> Basically I have a cluster of node.js workers reading and writing to
>>>> riak. Each node.js worker is responsible for a set of keys, so I can
>>>> guarantee some kind of non distributed cache...
>>>> The real deal here is that the writing operation is not run evertime an
>>>> object is changed but each 5 seconds in a "batch insertion/update" style.
>>>> This brings the guarantee that the same object cannot be write to riak at
>>>> the same time, not event at the same seconds, there's always a 5 second
>>>> window between each insertion/update.
>>>>
>>>> That said, is it profitable to me if I set last_write_wins to true?
>>>> I've been facing some massive writting delays under high loads and it would
>>>> be nice if I have some kind of way to tune riak.
>>>>
>>>> Thanks a lot and keep up the good work!
>>>>
>>>>
>>> ___
>>> riak-users mailing list
>>> riak-users@lists.basho.com
>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>>
>>>
>>>
>>
>
>
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: last_write_wins

2014-01-30 Thread Edgar Veiga
Also,

Using last_write_wins = true, do I need to always send the vclock with a
PUT request? The official documentation says that riak will look only at
the timestamp of the requests.
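
For context, the two write paths I'm weighing look roughly like this (a
Python sketch over the HTTP API; the host and bucket are placeholders):

    import json
    import requests  # third-party HTTP client: pip install requests

    RIAK = "http://127.0.0.1:8098"             # placeholder
    URL = RIAK + "/buckets/documents/keys/%s"  # placeholder bucket

    def put_with_vclock(key, doc):
        # The usual pattern: read first and echo the X-Riak-Vclock header
        # back on the PUT, so Riak knows this write descends from the read.
        resp = requests.get(URL % key)
        headers = {"Content-Type": "application/json"}
        if "X-Riak-Vclock" in resp.headers:
            headers["X-Riak-Vclock"] = resp.headers["X-Riak-Vclock"]
        requests.put(URL % key, data=json.dumps(doc), headers=headers)

    def put_blind(key, doc):
        # What my workers do today: the new value fully replaces the old
        # one, so the PUT carries no vclock at all (hence this question).
        requests.put(URL % key, data=json.dumps(doc),
                     headers={"Content-Type": "application/json"})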

Best regards,


On 29 January 2014 10:29, Edgar Veiga  wrote:

> Hi Russel,
>
> No, it doesn't depend. It's always a new value.
>
> Best regards
>
>
> On 29 January 2014 10:10, Russell Brown  wrote:
>
>>
>> On 29 Jan 2014, at 09:57, Edgar Veiga  wrote:
>>
>> tl;dr
>>
>> If I guarantee that the same key is only written with a 5 second
>> interval, is last_write_wins=true profitable?
>>
>>
>> It depends. Does the value you write depend in anyway on the value you
>> read, or is it always that you are just getting a totally new value that
>> replaces what is in Riak (regardless what is in Riak)?
>>
>>
>>
>> On 27 January 2014 23:25, Edgar Veiga  wrote:
>>
>>> Hi there everyone!
>>>
>>> I would like to know, if my current application is a good use case to
>>> set last_write_wins to true.
>>>
>>> Basically I have a cluster of node.js workers reading and writing to
>>> riak. Each node.js worker is responsible for a set of keys, so I can
>>> guarantee some kind of non distributed cache...
>>> The real deal here is that the writing operation is not run evertime an
>>> object is changed but each 5 seconds in a "batch insertion/update" style.
>>> This brings the guarantee that the same object cannot be write to riak at
>>> the same time, not event at the same seconds, there's always a 5 second
>>> window between each insertion/update.
>>>
>>> That said, is it profitable to me if I set last_write_wins to true? I've
>>> been facing some massive writting delays under high loads and it would be
>>> nice if I have some kind of way to tune riak.
>>>
>>> Thanks a lot and keep up the good work!
>>>
>>>
>> ___
>> riak-users mailing list
>> riak-users@lists.basho.com
>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>
>>
>>
>
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: last_write_wins

2014-01-29 Thread Edgar Veiga
Hi Russell,

No, it doesn't depend. It's always a new value.

Best regards


On 29 January 2014 10:10, Russell Brown  wrote:

>
> On 29 Jan 2014, at 09:57, Edgar Veiga  wrote:
>
> tl;dr
>
> If I guarantee that the same key is only written with a 5 second interval,
> is last_write_wins=true profitable?
>
>
> It depends. Does the value you write depend in anyway on the value you
> read, or is it always that you are just getting a totally new value that
> replaces what is in Riak (regardless what is in Riak)?
>
>
>
> On 27 January 2014 23:25, Edgar Veiga  wrote:
>
>> Hi there everyone!
>>
>> I would like to know, if my current application is a good use case to set
>> last_write_wins to true.
>>
>> Basically I have a cluster of node.js workers reading and writing to
>> riak. Each node.js worker is responsible for a set of keys, so I can
>> guarantee some kind of non distributed cache...
>> The real deal here is that the writing operation is not run evertime an
>> object is changed but each 5 seconds in a "batch insertion/update" style.
>> This brings the guarantee that the same object cannot be write to riak at
>> the same time, not event at the same seconds, there's always a 5 second
>> window between each insertion/update.
>>
>> That said, is it profitable to me if I set last_write_wins to true? I've
>> been facing some massive writting delays under high loads and it would be
>> nice if I have some kind of way to tune riak.
>>
>> Thanks a lot and keep up the good work!
>>
>>
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>
>
>
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: last_write_wins

2014-01-29 Thread Edgar Veiga
tl;dr

If I guarantee that the same key is only written at 5-second intervals, is
last_write_wins=true worthwhile?


On 27 January 2014 23:25, Edgar Veiga  wrote:

> Hi there everyone!
>
> I would like to know, if my current application is a good use case to set
> last_write_wins to true.
>
> Basically I have a cluster of node.js workers reading and writing to riak.
> Each node.js worker is responsible for a set of keys, so I can guarantee
> some kind of non distributed cache...
> The real deal here is that the writing operation is not run evertime an
> object is changed but each 5 seconds in a "batch insertion/update" style.
> This brings the guarantee that the same object cannot be write to riak at
> the same time, not event at the same seconds, there's always a 5 second
> window between each insertion/update.
>
> That said, is it profitable to me if I set last_write_wins to true? I've
> been facing some massive writting delays under high loads and it would be
> nice if I have some kind of way to tune riak.
>
> Thanks a lot and keep up the good work!
>
>
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


last_write_wins

2014-01-27 Thread Edgar Veiga
Hi there everyone!

I would like to know if my current application is a good use case for
setting last_write_wins to true.

Basically, I have a cluster of node.js workers reading from and writing to
Riak. Each node.js worker is responsible for a set of keys, so I can
guarantee some kind of non-distributed cache...
The real deal here is that the write operation is not run every time an
object is changed, but every 5 seconds in a "batch insert/update" style.
This guarantees that the same object cannot be written to Riak at the same
time, not even in the same second; there's always a 5-second window between
each insert/update.

That said, would it pay off for me to set last_write_wins to true? I've
been facing some massive write delays under high load and it would be nice
to have some way to tune Riak.

Thanks a lot and keep up the good work!
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: anti_entropy_expire

2014-01-03 Thread Edgar Veiga
By the way,

I think that the number of repaired keys is pretty high.

2014-01-03 06:33:42.857 [info]
<0.31440.2586>@riak_kv_exchange_fsm:key_exchange:206 Repaired 1491787 keys
during active anti-entropy exchange of
{468137243207554840987117797979434404733540892672,3} between
{473846233978378680511350941857232385279071879168,'riak@192.168.20.112'}
and {479555224749202520035584085735030365824602865664,'riak@192.168.20.107'}

I get a few lines like this consistently (every two hours, during this
process).

Best regards.

On 2 January 2014 10:05, Edgar Veiga  wrote:

> This is the only thing related to AAE that exists in my app.config. I
> haven't changed any default values...
>
> %% Enable active anti-entropy subsystem + optional debug
> messages:
> %%   {anti_entropy, {on|off, []}},
> %%   {anti_entropy, {on|off, [debug]}},
> {anti_entropy, {on, []}},
>
> %% Restrict how fast AAE can build hash trees. Building the
> tree
> %% for a given partition requires a full scan over that
> partition's
> %% data. Once built, trees stay built until they are expired.
> %% Config is of the form:
> %%   {num-builds, per-timespan-in-milliseconds}
> %% Default is 1 build per hour.
> {anti_entropy_build_limit, {1, 360}},
>
> %% Determine how often hash trees are expired after being
> built.
> %% Periodically expiring a hash tree ensures the on-disk hash
> tree
> %% data stays consistent with the actual k/v backend data. It
> also
> %% helps Riak identify silent disk failures and bit rot.
> However,
> %% expiration is not needed for normal AAE operation and
> should be
> %% infrequent for performance reasons. The time is specified in
> %% milliseconds. The default is 1 week.
> {anti_entropy_expire, 60480},
>
> %% Limit how many AAE exchanges/builds can happen concurrently.
> {anti_entropy_concurrency, 2},
>
> %% The tick determines how often the AAE manager looks for work
> %% to do (building/expiring trees, triggering exchanges, etc).
> %% The default is every 15 seconds. Lowering this value will
> %% speedup the rate that all replicas are synced across the
> cluster.
> %% Increasing the value is not recommended.
> {anti_entropy_tick, 15000},
>
> %% The directory where AAE hash trees are stored.
> {anti_entropy_data_dir, "/var/lib/riak/anti_entropy"},
>
> %% The LevelDB options used by AAE to generate the
> LevelDB-backed
> %% on-disk hashtrees.
> {anti_entropy_leveldb_opts, [{write_buffer_size, 4194304},
>  {max_open_files, 20}]},
>
>  I'll update the bloom filters value and see what happens...
>
> It's thursday again, and the regeneration process has started again. Since
> I've updated to 1.4.6, I have another thing different. The get/put values
> for each cluster node now have a "random" behaviour. Take a look at the
> next screenshot
>
> https://cloudup.com/cgbu9VNhSo1
>
> Best regards
>
>
> On 31 December 2013 21:16, Charlie Voiselle  wrote:
>
>> Edgar:
>>
>> Could you attach the AAE section of your app.config?  I’d like to look
>> into this issue further for you.  Something I think you might be running
>> into is https://github.com/basho/riak_core/pull/483.
>>
>> The issue of concern is that the LevelDB bloom filter is not enabled
>> properly for the instance into which the AAE data is stored.  You can
>> mitigate this particular issue by adding *{use_bloomfilter, true}* as
>> shown below:
>>
>> %% The LevelDB options used by AAE to generate the LevelDB-backed
>>
>> %% on-disk hashtrees.
>> {anti_entropy_leveldb_opts, [{write_buffer_size, 4194304},
>>  {max_open_files, 20}]},
>>
>>
>> Becomes:
>>
>>
>>
>> %% The LevelDB options used by AAE to generate the LevelDB-backed
>> %% on-disk hashtrees.
>>
>>
>> {anti_entropy_leveldb_opts, [{write_buffer_size, 4194304},
>>       {use_bloomfilter, true},
>>
>>  {max_open_files, 20}]},
>>
>>
>> This might not solve your specific problem, but it will certainly improve
>> your AAE performance.
>>

Re: anti_entropy_expire

2014-01-02 Thread Edgar Veiga
This is the only thing related to AAE that exists in my app.config. I
haven't changed any default values...

%% Enable active anti-entropy subsystem + optional debug
messages:
%%   {anti_entropy, {on|off, []}},
%%   {anti_entropy, {on|off, [debug]}},
{anti_entropy, {on, []}},

%% Restrict how fast AAE can build hash trees. Building the tree
%% for a given partition requires a full scan over that
partition's
%% data. Once built, trees stay built until they are expired.
%% Config is of the form:
%%   {num-builds, per-timespan-in-milliseconds}
%% Default is 1 build per hour.
{anti_entropy_build_limit, {1, 3600000}},

%% Determine how often hash trees are expired after being built.
%% Periodically expiring a hash tree ensures the on-disk hash
tree
%% data stays consistent with the actual k/v backend data. It
also
%% helps Riak identify silent disk failures and bit rot.
However,
%% expiration is not needed for normal AAE operation and should
be
%% infrequent for performance reasons. The time is specified in
%% milliseconds. The default is 1 week.
{anti_entropy_expire, 604800000},

%% Limit how many AAE exchanges/builds can happen concurrently.
{anti_entropy_concurrency, 2},

%% The tick determines how often the AAE manager looks for work
%% to do (building/expiring trees, triggering exchanges, etc).
%% The default is every 15 seconds. Lowering this value will
%% speedup the rate that all replicas are synced across the
cluster.
%% Increasing the value is not recommended.
{anti_entropy_tick, 15000},

%% The directory where AAE hash trees are stored.
{anti_entropy_data_dir, "/var/lib/riak/anti_entropy"},

%% The LevelDB options used by AAE to generate the
LevelDB-backed
%% on-disk hashtrees.
{anti_entropy_leveldb_opts, [{write_buffer_size, 4194304},
 {max_open_files, 20}]},

I'll update the bloom filter setting and see what happens...
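Concretely, I'm planning to change the stanza to this (the same change Charlie suggests below; I haven't applied it yet):

{anti_entropy_leveldb_opts, [{write_buffer_size, 4194304},
                             {use_bloomfilter, true},
                             {max_open_files, 20}]},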

It's Thursday again, and the regeneration process has started once more. Since I updated to 1.4.6 I've also noticed something different: the get/put values for each cluster node now show a "random" behaviour. Take a look at the next screenshot:

https://cloudup.com/cgbu9VNhSo1

Best regards


On 31 December 2013 21:16, Charlie Voiselle  wrote:

> Edgar:
>
> Could you attach the AAE section of your app.config?  I’d like to look
> into this issue further for you.  Something I think you might be running
> into is https://github.com/basho/riak_core/pull/483.
>
> The issue of concern is that the LevelDB bloom filter is not enabled
> properly for the instance into which the AAE data is stored.  You can
> mitigate this particular issue by adding *{use_bloomfilter, true}* as
> shown below:
>
> %% The LevelDB options used by AAE to generate the LevelDB-backed
> %% on-disk hashtrees.
> {anti_entropy_leveldb_opts, [{write_buffer_size, 4194304},
>  {max_open_files, 20}]},
>
>
> Becomes:
>
>
> %% The LevelDB options used by AAE to generate the LevelDB-backed
> %% on-disk hashtrees.
>
> {anti_entropy_leveldb_opts, [{write_buffer_size, 4194304},
>{use_bloomfilter, true},
>  {max_open_files, 20}]},
>
>
> This might not solve your specific problem, but it will certainly improve
> your AAE performance.
>
> Thanks,
> Charlie Voiselle
>
> On Dec 31, 2013, at 12:04 PM, Edgar Veiga  wrote:
>
> Hey guys!
>
> Nothing on this one?
>
> Btw: Happy new year :)
>
>
> On 27 December 2013 22:35, Edgar Veiga  wrote:
>
>> This is a du -hs * of the riak folder:
>>
>> 44G anti_entropy
>> 1.1M kv_vnode
>> 252G leveldb
>> 124K ring
>>
>> It's a 6 machine cluster, so ~1512G of levelDB.
>>
>> Thanks for the tip, I'll upgrade in a near future!
>>
>> Best regards
>>
>>
>> On 27 December 2013 21:41, Matthew Von-Maszewski wrote:
>>
>>> I have a query out to the developer that can better respond to your
>>> follow-up questions.  It might be Monday before we get a reply due to the
>>> holidays.
>>>
>>> Do you happen to know how much data is in the leveldb dataset and/or one
>>> vnode?  Not sure it will change the response, but might be nice to have
>>> that info available.
>>

Re: anti_entropy_expire

2013-12-31 Thread Edgar Veiga
Hey guys!

Nothing on this one?

Btw: Happy new year :)


On 27 December 2013 22:35, Edgar Veiga  wrote:

> This is a du -hs * of the riak folder:
>
> 44G anti_entropy
> 1.1M kv_vnode
> 252G leveldb
> 124K ring
>
> It's a 6 machine cluster, so ~1512G of levelDB.
>
> Thanks for the tip, I'll upgrade in a near future!
>
> Best regards
>
>
> On 27 December 2013 21:41, Matthew Von-Maszewski wrote:
>
>> I have a query out to the developer that can better respond to your
>> follow-up questions.  It might be Monday before we get a reply due to the
>> holidays.
>>
>> Do you happen to know how much data is in the leveldb dataset and/or one
>> vnode?  Not sure it will change the response, but might be nice to have
>> that info available.
>>
>> Matthew
>>
>> P.S.  Unrelated to your question:  Riak 1.4.4 is available for download.
>>  It has a couple of nice bug fixes for leveldb.
>>
>>
>> On Dec 27, 2013, at 2:08 PM, Edgar Veiga  wrote:
>>
>> Ok, thanks for confirming!
>>
>> Is it normal, that this action affects the overall state of the cluster?
>> On the 26th It started the regeneration and the the response times of the
>> cluster raised to never seen values. It was a day of heavy traffic but
>> everything was going quite ok until it started the regeneration process..
>>
>> Have you got any advices about changing those app.config values? My
>> cluster is running smoothly for the past 6 months and I don't want to start
>> all over again :)
>>
>> Best Regards
>>
>>
>> On 27 December 2013 18:56, Matthew Von-Maszewski wrote:
>>
>>> Yes.  Confirmed.
>>>
>>> There are options available in app.config to control how often this
>>> occurs and how many vnodes rehash at once:  defaults are every 7 days and
>>> two vnodes per server at a time.
>>>
>>> Matthew Von-Maszewski
>>>
>>>
>>> On Dec 27, 2013, at 13:50, Edgar Veiga  wrote:
>>>
>>> Hi!
>>>
>>> I've been trying to find what may be the cause of this.
>>>
>>> Every once in a week, all the nodes in my riak cluster start to do some
>>> kind of operation that lasts at least for two days.
>>>
>>> You can watch a sample of my munin logs regarding the last week in here:
>>>
>>> https://cloudup.com/imWiBwaC6fm
>>> Take a look at the days 19 and 20, and now it has started again on the
>>> 26...
>>>
>>> I'm suspecting that this may be caused by the aae hash trees being
>>> regenerated, as you say in your documentation:
>>> For added protection, Riak periodically (default: once a week) clears
>>> and regenerates all hash trees from the on-disk K/V data.
>>> Can you confirm me that this may be the root of the "problem" and if
>>> it's normal for the action to last for two days?
>>>
>>> I'm using riak 1.4.2 on 6 machines, with centOS. The backend is levelDB.
>>>
>>> Best Regards,
>>> Edgar Veiga
>>>
>>> ___
>>> riak-users mailing list
>>> riak-users@lists.basho.com
>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>>
>>>
>>
>>
>
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: anti_entropy_expire

2013-12-27 Thread Edgar Veiga
This is a du -hs * of the riak folder:

44G anti_entropy
1.1M kv_vnode
252G leveldb
124K ring

It's a 6-machine cluster, so ~1512G of levelDB in total.

Thanks for the tip, I'll upgrade in the near future!

Best regards


On 27 December 2013 21:41, Matthew Von-Maszewski  wrote:

> I have a query out to the developer that can better respond to your
> follow-up questions.  It might be Monday before we get a reply due to the
> holidays.
>
> Do you happen to know how much data is in the leveldb dataset and/or one
> vnode?  Not sure it will change the response, but might be nice to have
> that info available.
>
> Matthew
>
> P.S.  Unrelated to your question:  Riak 1.4.4 is available for download.
>  It has a couple of nice bug fixes for leveldb.
>
>
> On Dec 27, 2013, at 2:08 PM, Edgar Veiga  wrote:
>
> Ok, thanks for confirming!
>
> Is it normal, that this action affects the overall state of the cluster?
> On the 26th It started the regeneration and the the response times of the
> cluster raised to never seen values. It was a day of heavy traffic but
> everything was going quite ok until it started the regeneration process..
>
> Have you got any advices about changing those app.config values? My
> cluster is running smoothly for the past 6 months and I don't want to start
> all over again :)
>
> Best Regards
>
>
> On 27 December 2013 18:56, Matthew Von-Maszewski wrote:
>
>> Yes.  Confirmed.
>>
>> There are options available in app.config to control how often this
>> occurs and how many vnodes rehash at once:  defaults are every 7 days and
>> two vnodes per server at a time.
>>
>> Matthew Von-Maszewski
>>
>>
>> On Dec 27, 2013, at 13:50, Edgar Veiga  wrote:
>>
>> Hi!
>>
>> I've been trying to find what may be the cause of this.
>>
>> Every once in a week, all the nodes in my riak cluster start to do some
>> kind of operation that lasts at least for two days.
>>
>> You can watch a sample of my munin logs regarding the last week in here:
>>
>> https://cloudup.com/imWiBwaC6fm
>> Take a look at the days 19 and 20, and now it has started again on the
>> 26...
>>
>> I'm suspecting that this may be caused by the aae hash trees being
>> regenerated, as you say in your documentation:
>> For added protection, Riak periodically (default: once a week) clears and
>> regenerates all hash trees from the on-disk K/V data.
>> Can you confirm me that this may be the root of the "problem" and if it's
>> normal for the action to last for two days?
>>
>> I'm using riak 1.4.2 on 6 machines, with centOS. The backend is levelDB.
>>
>> Best Regards,
>> Edgar Veiga
>>
>> ___
>> riak-users mailing list
>> riak-users@lists.basho.com
>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>
>>
>
>
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: anti_entropy_expire

2013-12-27 Thread Edgar Veiga
Ok, thanks for confirming!

Is it normal that this action affects the overall state of the cluster? On the 26th it started the regeneration and the response times of the cluster rose to values we had never seen before. It was a day of heavy traffic, but everything was going quite ok until the regeneration process started.

Do you have any advice about changing those app.config values? My cluster has been running smoothly for the past 6 months and I don't want to start all over again :)
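For reference, I assume these are the app.config knobs involved (the values below are just the documented defaults, which I believe I'm currently running, not a recommendation):

%% Expire and rebuild each hash tree after this many milliseconds.
%% 604800000 ms = 7 days.
{anti_entropy_expire, 604800000},

%% How many AAE exchanges/builds may run concurrently on a node.
{anti_entropy_concurrency, 2},

%% Limit tree builds to {num-builds, per-timespan-in-ms}; 1 per hour here.
{anti_entropy_build_limit, {1, 3600000}},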

Best Regards


On 27 December 2013 18:56, Matthew Von-Maszewski  wrote:

> Yes.  Confirmed.
>
> There are options available in app.config to control how often this occurs
> and how many vnodes rehash at once:  defaults are every 7 days and two
> vnodes per server at a time.
>
> Matthew Von-Maszewski
>
>
> On Dec 27, 2013, at 13:50, Edgar Veiga  wrote:
>
> Hi!
>
> I've been trying to find what may be the cause of this.
>
> Every once in a week, all the nodes in my riak cluster start to do some
> kind of operation that lasts at least for two days.
>
> You can watch a sample of my munin logs regarding the last week in here:
>
> https://cloudup.com/imWiBwaC6fm
> Take a look at the days 19 and 20, and now it has started again on the
> 26...
>
> I'm suspecting that this may be caused by the aae hash trees being
> regenerated, as you say in your documentation:
> For added protection, Riak periodically (default: once a week) clears and
> regenerates all hash trees from the on-disk K/V data.
> Can you confirm me that this may be the root of the "problem" and if it's
> normal for the action to last for two days?
>
> I'm using riak 1.4.2 on 6 machines, with centOS. The backend is levelDB.
>
> Best Regards,
> Edgar Veiga
>
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>
>
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


anti_entropy_expire

2013-12-27 Thread Edgar Veiga
Hi!

I've been trying to find what may be the cause of this.

About once a week, all the nodes in my riak cluster start some kind of operation that lasts for at least two days.

You can see a sample of my munin graphs for the last week here:

https://cloudup.com/imWiBwaC6fm
Take a look at days 19 and 20; now it has started again on the 26th...

I suspect this may be caused by the AAE hash trees being regenerated, as your documentation says:
"For added protection, Riak periodically (default: once a week) clears and regenerates all hash trees from the on-disk K/V data."
Can you confirm that this may be the root of the "problem", and whether it's normal for the operation to last two days?

I'm using riak 1.4.2 on 6 machines, with centOS. The backend is levelDB.
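In the meantime I'll keep an eye on the AAE status on each node; if I remember correctly, riak-admin aae-status (added around 1.3) reports, per partition, when the entropy tree was last built and when exchanges last completed:

riak-admin aae-status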

Best Regards,
Edgar Veiga
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Riak cluster all nodes down

2013-09-20 Thread Edgar Veiga
I was preparing a mail making that suggestion!

All the instances in my 6-machine cluster went down because of this :S

Thanks again, keep up the good work!


On 20 September 2013 16:30, Jon Meredith  wrote:

> Hi Edgar,
>
> I'm pleased to hear you've resolved the issue.
> https://github.com/basho/riak_kv/issues/666 will track adding some
> validation code to protect against similar incidents.
>
> Jon
>
>
> On Fri, Sep 20, 2013 at 8:59 AM, Edgar Veiga wrote:
>
>> Problem solved.
>>
>> The n_val = "3" caused the crash! I had a window of time while starting a
>> node to send a new PUT command and restore the correct value.
>>
>> Best regards, thanks Jon
>>
>>
>> On 20 September 2013 15:42, Edgar Veiga  wrote:
>>
>>> Yes I did, via CURL command:
>>>
>>> curl -v -XPUT http://dat7:8098/riak/visitors -H "Content-Type:
>>> application/json" -d '{"props":{"n_val":"3"}}'
>>>
>>> How can I update the n_val of an already existing bucket, while the
>>> instance is down?
>>>
>>>
>>> On 20 September 2013 15:32, Jon Meredith  wrote:
>>>
>>>> Looks like the nval is set to a binary <<"3">> rather than an integer.
>>>> Have you changed it recently and how?
>>>>
>>>> On Sep 20, 2013, at 8:25 AM, Edgar Veiga  wrote:
>>>>
>>>> Hello everyone,
>>>>
>>>> Please lend me a hand here... I'm running a riak cluster of 6 machines
>>>> (version 1.4.1).
>>>>
>>>> Suddenly all the nodes in the cluster went down and they are refusing
>>>> to go up again. It keeps crashing all the the time, this is just a sample
>>>> of what I get when starting a node:
>>>>
>>>> 2013-09-20 15:16:14.016 [info] <0.7.0> Application lager started on
>>>> node 'riak@192.168.20.107'
>>>> 2013-09-20 15:16:14.024 [info] <0.7.0> Application sasl started on node
>>>> 'riak@192.168.20.107'
>>>> 2013-09-20 15:16:14.026 [info] <0.7.0> Application crypto started on
>>>> node 'riak@192.168.20.107'
>>>> 2013-09-20 15:16:14.026 [info] <0.7.0> Application public_key started
>>>> on node 'riak@192.168.20.107'
>>>> 2013-09-20 15:16:14.029 [info] <0.7.0> Application ssl started on node '
>>>> riak@192.168.20.107'
>>>> 2013-09-20 15:16:14.031 [info] <0.7.0> Application riak_sysmon started
>>>> on node 'riak@192.168.20.107'
>>>> 2013-09-20 15:16:14.036 [info] <0.7.0> Application os_mon started on
>>>> node 'riak@192.168.20.107'
>>>> 2013-09-20 15:16:14.044 [info] <0.7.0> Application runtime_tools
>>>> started on node 'riak@192.168.20.107'
>>>> 2013-09-20 15:16:14.047 [info] <0.7.0> Application erlang_js started on
>>>> node 'riak@192.168.20.107'
>>>> 2013-09-20 15:16:14.058 [info] <0.7.0> Application inets started on
>>>> node 'riak@192.168.20.107'
>>>> 2013-09-20 15:16:14.059 [info] <0.7.0> Application mochiweb started on
>>>> node 'riak@192.168.20.107'
>>>> 2013-09-20 15:16:14.063 [info] <0.7.0> Application webmachine started
>>>> on node 'riak@192.168.20.107'
>>>> 2013-09-20 15:16:14.064 [info] <0.7.0> Application basho_stats started
>>>> on node 'riak@192.168.20.107'
>>>> 2013-09-20 15:16:14.067 [info] <0.7.0> Application bitcask started on
>>>> node 'riak@192.168.20.107'
>>>> 2013-09-20 15:16:14.152 [info]
>>>> <0.162.0>@riak_core_capability:process_capability_changes:530 New
>>>> capability: {riak_core,vnode_routing} = proxy
>>>> 2013-09-20 15:16:14.160 [info]
>>>> <0.162.0>@riak_core_capability:process_capability_changes:530 New
>>>> capability: {riak_core,staged_joins} = true
>>>> 2013-09-20 15:16:14.166 [info]
>>>> <0.162.0>@riak_core_capability:process_capability_changes:530 New
>>>> capability: {riak_core,resizable_ring} = true
>>>> 2013-09-20 15:16:14.172 [info] <0.7.0> Application riak_core started on
>>>> node 'riak@192.168.20.107'
>>>> 2013-09-20 15:16:14.177 [info]
>>>> <0.162.0>@riak_core_capability:process_capability_changes:530 New
>>>> capability: {riak_

Re: Riak cluster all nodes down

2013-09-20 Thread Edgar Veiga
Problem solved.

The n_val = "3" caused the crash! I had a window of time while starting a
node to send a new PUT command and restore the correct value.
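For the archive, the corrected command (n_val as a JSON integer rather than a string) was along these lines:

curl -v -XPUT http://dat7:8098/riak/visitors -H "Content-Type: application/json" -d '{"props":{"n_val":3}}'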

Best regards, thanks Jon


On 20 September 2013 15:42, Edgar Veiga  wrote:

> Yes I did, via CURL command:
>
> curl -v -XPUT http://dat7:8098/riak/visitors -H "Content-Type:
> application/json" -d '{"props":{"n_val":"3"}}'
>
> How can I update the n_val of an already existing bucket, while the
> instance is down?
>
>
> On 20 September 2013 15:32, Jon Meredith  wrote:
>
>> Looks like the nval is set to a binary <<"3">> rather than an integer.
>> Have you changed it recently and how?
>>
>> On Sep 20, 2013, at 8:25 AM, Edgar Veiga  wrote:
>>
>> Hello everyone,
>>
>> Please lend me a hand here... I'm running a riak cluster of 6 machines
>> (version 1.4.1).
>>
>> Suddenly all the nodes in the cluster went down and they are refusing to
>> go up again. It keeps crashing all the the time, this is just a sample of
>> what I get when starting a node:
>>
>> 2013-09-20 15:16:14.016 [info] <0.7.0> Application lager started on node '
>> riak@192.168.20.107'
>> 2013-09-20 15:16:14.024 [info] <0.7.0> Application sasl started on node '
>> riak@192.168.20.107'
>> 2013-09-20 15:16:14.026 [info] <0.7.0> Application crypto started on node
>> 'riak@192.168.20.107'
>> 2013-09-20 15:16:14.026 [info] <0.7.0> Application public_key started on
>> node 'riak@192.168.20.107'
>> 2013-09-20 15:16:14.029 [info] <0.7.0> Application ssl started on node '
>> riak@192.168.20.107'
>> 2013-09-20 15:16:14.031 [info] <0.7.0> Application riak_sysmon started on
>> node 'riak@192.168.20.107'
>> 2013-09-20 15:16:14.036 [info] <0.7.0> Application os_mon started on node
>> 'riak@192.168.20.107'
>> 2013-09-20 15:16:14.044 [info] <0.7.0> Application runtime_tools started
>> on node 'riak@192.168.20.107'
>> 2013-09-20 15:16:14.047 [info] <0.7.0> Application erlang_js started on
>> node 'riak@192.168.20.107'
>> 2013-09-20 15:16:14.058 [info] <0.7.0> Application inets started on node '
>> riak@192.168.20.107'
>> 2013-09-20 15:16:14.059 [info] <0.7.0> Application mochiweb started on
>> node 'riak@192.168.20.107'
>> 2013-09-20 15:16:14.063 [info] <0.7.0> Application webmachine started on
>> node 'riak@192.168.20.107'
>> 2013-09-20 15:16:14.064 [info] <0.7.0> Application basho_stats started on
>> node 'riak@192.168.20.107'
>> 2013-09-20 15:16:14.067 [info] <0.7.0> Application bitcask started on
>> node 'riak@192.168.20.107'
>> 2013-09-20 15:16:14.152 [info]
>> <0.162.0>@riak_core_capability:process_capability_changes:530 New
>> capability: {riak_core,vnode_routing} = proxy
>> 2013-09-20 15:16:14.160 [info]
>> <0.162.0>@riak_core_capability:process_capability_changes:530 New
>> capability: {riak_core,staged_joins} = true
>> 2013-09-20 15:16:14.166 [info]
>> <0.162.0>@riak_core_capability:process_capability_changes:530 New
>> capability: {riak_core,resizable_ring} = true
>> 2013-09-20 15:16:14.172 [info] <0.7.0> Application riak_core started on
>> node 'riak@192.168.20.107'
>> 2013-09-20 15:16:14.177 [info]
>> <0.162.0>@riak_core_capability:process_capability_changes:530 New
>> capability: {riak_pipe,trace_format} = ordsets
>> 2013-09-20 15:16:14.270 [info] <0.522.0>@riak_kv_env:doc_env:42
>> Environment and OS variables:
>> 2013-09-20 15:16:14.446 [notice] <0.41.0>@lager_app:119 Deprecated
>> lager_file_backend config detected, please consider updating it
>> 2013-09-20 15:16:14.786 [info] <0.573.0>@riak_core:wait_for_service:470
>> Waiting for service riak_kv to start (0 seconds)
>> 2013-09-20 15:16:14.821 [info] <0.590.0>@riak_kv_js_vm:init:76
>> Spidermonkey VM (thread stack: 16MB, max heap: 8MB, pool: riak_kv_js_map)
>> host starting (<0.590.0>)
>> 2013-09-20 15:16:14.823 [info] <0.591.0>@riak_kv_js_vm:init:76
>> Spidermonkey VM (thread stack: 16MB, max heap: 8MB, pool: riak_kv_js_map)
>> host starting (<0.591.0>)
>> 2013-09-20 15:16:14.825 [info] <0.592.0>@riak_kv_js_vm:init:76
>> Spidermonkey VM (thread stack: 16MB, max heap: 8MB, pool: riak_kv_js_map)
>> host starting (<0.592.0>)
>> 2013-09-20 15:16:14.826 [info] <0.593.0>@riak_

Riak cluster all nodes down

2013-09-20 Thread Edgar Veiga
Hello everyone,

Please lend me a hand here... I'm running a riak cluster of 6 machines
(version 1.4.1).

Suddenly all the nodes in the cluster went down and they are refusing to come back up. They keep crashing all the time; this is just a sample of what I get when starting a node:

2013-09-20 15:16:14.016 [info] <0.7.0> Application lager started on node '
riak@192.168.20.107'
2013-09-20 15:16:14.024 [info] <0.7.0> Application sasl started on node '
riak@192.168.20.107'
2013-09-20 15:16:14.026 [info] <0.7.0> Application crypto started on node '
riak@192.168.20.107'
2013-09-20 15:16:14.026 [info] <0.7.0> Application public_key started on
node 'riak@192.168.20.107'
2013-09-20 15:16:14.029 [info] <0.7.0> Application ssl started on node '
riak@192.168.20.107'
2013-09-20 15:16:14.031 [info] <0.7.0> Application riak_sysmon started on
node 'riak@192.168.20.107'
2013-09-20 15:16:14.036 [info] <0.7.0> Application os_mon started on node '
riak@192.168.20.107'
2013-09-20 15:16:14.044 [info] <0.7.0> Application runtime_tools started on
node 'riak@192.168.20.107'
2013-09-20 15:16:14.047 [info] <0.7.0> Application erlang_js started on
node 'riak@192.168.20.107'
2013-09-20 15:16:14.058 [info] <0.7.0> Application inets started on node '
riak@192.168.20.107'
2013-09-20 15:16:14.059 [info] <0.7.0> Application mochiweb started on node
'riak@192.168.20.107'
2013-09-20 15:16:14.063 [info] <0.7.0> Application webmachine started on
node 'riak@192.168.20.107'
2013-09-20 15:16:14.064 [info] <0.7.0> Application basho_stats started on
node 'riak@192.168.20.107'
2013-09-20 15:16:14.067 [info] <0.7.0> Application bitcask started on node '
riak@192.168.20.107'
2013-09-20 15:16:14.152 [info]
<0.162.0>@riak_core_capability:process_capability_changes:530 New
capability: {riak_core,vnode_routing} = proxy
2013-09-20 15:16:14.160 [info]
<0.162.0>@riak_core_capability:process_capability_changes:530 New
capability: {riak_core,staged_joins} = true
2013-09-20 15:16:14.166 [info]
<0.162.0>@riak_core_capability:process_capability_changes:530 New
capability: {riak_core,resizable_ring} = true
2013-09-20 15:16:14.172 [info] <0.7.0> Application riak_core started on
node 'riak@192.168.20.107'
2013-09-20 15:16:14.177 [info]
<0.162.0>@riak_core_capability:process_capability_changes:530 New
capability: {riak_pipe,trace_format} = ordsets
2013-09-20 15:16:14.270 [info] <0.522.0>@riak_kv_env:doc_env:42 Environment
and OS variables:
2013-09-20 15:16:14.446 [notice] <0.41.0>@lager_app:119 Deprecated
lager_file_backend config detected, please consider updating it
2013-09-20 15:16:14.786 [info] <0.573.0>@riak_core:wait_for_service:470
Waiting for service riak_kv to start (0 seconds)
2013-09-20 15:16:14.821 [info] <0.590.0>@riak_kv_js_vm:init:76 Spidermonkey
VM (thread stack: 16MB, max heap: 8MB, pool: riak_kv_js_map) host starting
(<0.590.0>)
2013-09-20 15:16:14.823 [info] <0.591.0>@riak_kv_js_vm:init:76 Spidermonkey
VM (thread stack: 16MB, max heap: 8MB, pool: riak_kv_js_map) host starting
(<0.591.0>)
2013-09-20 15:16:14.825 [info] <0.592.0>@riak_kv_js_vm:init:76 Spidermonkey
VM (thread stack: 16MB, max heap: 8MB, pool: riak_kv_js_map) host starting
(<0.592.0>)
2013-09-20 15:16:14.826 [info] <0.593.0>@riak_kv_js_vm:init:76 Spidermonkey
VM (thread stack: 16MB, max heap: 8MB, pool: riak_kv_js_map) host starting
(<0.593.0>)
2013-09-20 15:16:14.827 [info] <0.594.0>@riak_kv_js_vm:init:76 Spidermonkey
VM (thread stack: 16MB, max heap: 8MB, pool: riak_kv_js_map) host starting
(<0.594.0>)
2013-09-20 15:16:14.829 [info] <0.595.0>@riak_kv_js_vm:init:76 Spidermonkey
VM (thread stack: 16MB, max heap: 8MB, pool: riak_kv_js_map) host starting
(<0.595.0>)
2013-09-20 15:16:14.831 [info] <0.596.0>@riak_kv_js_vm:init:76 Spidermonkey
VM (thread stack: 16MB, max heap: 8MB, pool: riak_kv_js_map) host starting
(<0.596.0>)
2013-09-20 15:16:14.833 [info] <0.597.0>@riak_kv_js_vm:init:76 Spidermonkey
VM (thread stack: 16MB, max heap: 8MB, pool: riak_kv_js_map) host starting
(<0.597.0>)
2013-09-20 15:16:14.837 [info] <0.600.0>@riak_kv_js_vm:init:76 Spidermonkey
VM (thread stack: 16MB, max heap: 8MB, pool: riak_kv_js_reduce) host
starting (<0.600.0>)
2013-09-20 15:16:14.841 [info] <0.601.0>@riak_kv_js_vm:init:76 Spidermonkey
VM (thread stack: 16MB, max heap: 8MB, pool: riak_kv_js_reduce) host
starting (<0.601.0>)
2013-09-20 15:16:14.845 [info] <0.602.0>@riak_kv_js_vm:init:76 Spidermonkey
VM (thread stack: 16MB, max heap: 8MB, pool: riak_kv_js_reduce) host
starting (<0.602.0>)
2013-09-20 15:16:14.848 [info] <0.603.0>@riak_kv_js_vm:init:76 Spidermonkey
VM (thread stack: 16MB, max heap: 8MB, pool: riak_kv_js_reduce) host
starting (<0.603.0>)
2013-09-20 15:16:14.850 [info] <0.605.0>@riak_kv_js_vm:init:76 Spidermonkey
VM (thread stack: 16MB, max heap: 8MB, pool: riak_kv_js_reduce) host
starting (<0.605.0>)
2013-09-20 15:16:14.852 [info] <0.606.0>@riak_kv_js_vm:init:76 Spidermonkey
VM (thread stack: 16MB, max heap: 8MB, pool: riak_kv_js_reduce) host
starting (<0.606.0>)
2013-09-20 15:16:14.8

Re: Riak cluster all nodes down

2013-09-20 Thread Edgar Veiga
Yes I did, via a curl command:

curl -v -XPUT http://dat7:8098/riak/visitors -H "Content-Type:
application/json" -d '{"props":{"n_val":"3"}}'

How can I update the n_val of an already existing bucket while the instance is down?


On 20 September 2013 15:32, Jon Meredith  wrote:

> Looks like the nval is set to a binary <<"3">> rather than an integer.
> Have you changed it recently and how?
>
> On Sep 20, 2013, at 8:25 AM, Edgar Veiga  wrote:
>
> Hello everyone,
>
> Please lend me a hand here... I'm running a riak cluster of 6 machines
> (version 1.4.1).
>
> Suddenly all the nodes in the cluster went down and they are refusing to
> go up again. It keeps crashing all the the time, this is just a sample of
> what I get when starting a node:
>
> 2013-09-20 15:16:14.016 [info] <0.7.0> Application lager started on node '
> riak@192.168.20.107'
> 2013-09-20 15:16:14.024 [info] <0.7.0> Application sasl started on node '
> riak@192.168.20.107'
> 2013-09-20 15:16:14.026 [info] <0.7.0> Application crypto started on node '
> riak@192.168.20.107'
> 2013-09-20 15:16:14.026 [info] <0.7.0> Application public_key started on
> node 'riak@192.168.20.107'
> 2013-09-20 15:16:14.029 [info] <0.7.0> Application ssl started on node '
> riak@192.168.20.107'
> 2013-09-20 15:16:14.031 [info] <0.7.0> Application riak_sysmon started on
> node 'riak@192.168.20.107'
> 2013-09-20 15:16:14.036 [info] <0.7.0> Application os_mon started on node '
> riak@192.168.20.107'
> 2013-09-20 15:16:14.044 [info] <0.7.0> Application runtime_tools started
> on node 'riak@192.168.20.107'
> 2013-09-20 15:16:14.047 [info] <0.7.0> Application erlang_js started on
> node 'riak@192.168.20.107'
> 2013-09-20 15:16:14.058 [info] <0.7.0> Application inets started on node '
> riak@192.168.20.107'
> 2013-09-20 15:16:14.059 [info] <0.7.0> Application mochiweb started on
> node 'riak@192.168.20.107'
> 2013-09-20 15:16:14.063 [info] <0.7.0> Application webmachine started on
> node 'riak@192.168.20.107'
> 2013-09-20 15:16:14.064 [info] <0.7.0> Application basho_stats started on
> node 'riak@192.168.20.107'
> 2013-09-20 15:16:14.067 [info] <0.7.0> Application bitcask started on node
> 'riak@192.168.20.107'
> 2013-09-20 15:16:14.152 [info]
> <0.162.0>@riak_core_capability:process_capability_changes:530 New
> capability: {riak_core,vnode_routing} = proxy
> 2013-09-20 15:16:14.160 [info]
> <0.162.0>@riak_core_capability:process_capability_changes:530 New
> capability: {riak_core,staged_joins} = true
> 2013-09-20 15:16:14.166 [info]
> <0.162.0>@riak_core_capability:process_capability_changes:530 New
> capability: {riak_core,resizable_ring} = true
> 2013-09-20 15:16:14.172 [info] <0.7.0> Application riak_core started on
> node 'riak@192.168.20.107'
> 2013-09-20 15:16:14.177 [info]
> <0.162.0>@riak_core_capability:process_capability_changes:530 New
> capability: {riak_pipe,trace_format} = ordsets
> 2013-09-20 15:16:14.270 [info] <0.522.0>@riak_kv_env:doc_env:42
> Environment and OS variables:
> 2013-09-20 15:16:14.446 [notice] <0.41.0>@lager_app:119 Deprecated
> lager_file_backend config detected, please consider updating it
> 2013-09-20 15:16:14.786 [info] <0.573.0>@riak_core:wait_for_service:470
> Waiting for service riak_kv to start (0 seconds)
> 2013-09-20 15:16:14.821 [info] <0.590.0>@riak_kv_js_vm:init:76
> Spidermonkey VM (thread stack: 16MB, max heap: 8MB, pool: riak_kv_js_map)
> host starting (<0.590.0>)
> 2013-09-20 15:16:14.823 [info] <0.591.0>@riak_kv_js_vm:init:76
> Spidermonkey VM (thread stack: 16MB, max heap: 8MB, pool: riak_kv_js_map)
> host starting (<0.591.0>)
> 2013-09-20 15:16:14.825 [info] <0.592.0>@riak_kv_js_vm:init:76
> Spidermonkey VM (thread stack: 16MB, max heap: 8MB, pool: riak_kv_js_map)
> host starting (<0.592.0>)
> 2013-09-20 15:16:14.826 [info] <0.593.0>@riak_kv_js_vm:init:76
> Spidermonkey VM (thread stack: 16MB, max heap: 8MB, pool: riak_kv_js_map)
> host starting (<0.593.0>)
> 2013-09-20 15:16:14.827 [info] <0.594.0>@riak_kv_js_vm:init:76
> Spidermonkey VM (thread stack: 16MB, max heap: 8MB, pool: riak_kv_js_map)
> host starting (<0.594.0>)
> 2013-09-20 15:16:14.829 [info] <0.595.0>@riak_kv_js_vm:init:76
> Spidermonkey VM (thread stack: 16MB, max heap: 8MB, pool: riak_kv_js_map)
> host starting (<0.595.0>)
> 2013-09-20 15:16:14.831 [info] <0.596.0>@riak_kv_js_vm:init:76

Re: Migration from memcachedb to riak

2013-07-10 Thread Edgar Veiga
Hi Damien,

Well let's dive into this a little bit.

I told you guys that bitcask was not an option due to a bad past experience with couchbase (sorry, in the previous post I wrote couchdb), which uses the same architecture as bitcask: keys in memory and values on disk.

We started the migration to couchbase and were already using 3 physical nodes with only 5% of the real data imported! That was one of the main reasons for choosing a solution like riak + leveldb: to get rid of the keys-must-fit-in-memory bottleneck.

Now, here's a typical key (it's large :D, x = letter and 0 = number):

xxx___0x_00_000_000xx

We are using PHP's native serialize function!
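To make that concrete, here is roughly how a value is produced and read back (illustrative sketch only; the class and field names are invented, not our real schema):

<?php
// Illustrative only: class and field names are invented for this example.
class Visitor {
    public $id;
    public $pages;
}

$v = new Visitor();
$v->id = 'xxx___0x_00_000_000xx';        // key pattern as above
$v->pages = array('/home', '/products');

$value = serialize($v);                  // the string we store in riak
echo strlen($value) . "\n";              // our real values average ~2000 bytes

$restored = unserialize($value);         // what we do after a GET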

Best regards



On 10 July 2013 11:43, damien krotkine  wrote:

>
>
>
> On 10 July 2013 11:03, Edgar Veiga  wrote:
>
>> Hi Guido.
>>
>> Thanks for your answer!
>>
>> Bitcask it's not an option due to the amount of ram needed.. We would
>> need a lot more of physical nodes so more money spent...
>>
>
> Why is it not an option?
>
> If you use Bitcask, then each node needs to store its keys in memory. It's
> usually not a lot. In a precedent email I asked you the average lenght of
> *keys*, but you gave us the average length of *values* :)
>
> We have 1 billion keys and fits on a 5 nodes Ring. ( check out
> http://docs.basho.com/riak/1.2.0/references/appendices/Bitcask-Capacity-Planning/).
>  Our bucket names are 1 letter, our keys are 10 chars long.
>
> What does a typical key look like ? Also, what are you using to serialize
> your php objects? Maybe you could paste a typical value somewhere as well
>
> Damien
>
>
>>
>> Instead we're using less machines with SSD disks to improve elevelDB
>> performance.
>>
>> Best regards
>>
>>
>>
>> On 10 July 2013 09:58, Guido Medina  wrote:
>>
>>>  Well, I rushed my answer before, if you want performance, you probably
>>> want Bitcask, if you want compression then LevelDB, the following links
>>> should help you decide better:
>>>
>>> http://docs.basho.com/riak/1.2.0/tutorials/choosing-a-backend/Bitcask/
>>> http://docs.basho.com/riak/1.2.0/tutorials/choosing-a-backend/LevelDB/
>>>
>>> Or multi, use one as default and then the other for specific buckets:
>>>
>>> http://docs.basho.com/riak/1.2.0/tutorials/choosing-a-backend/Multi/
>>>
>>> HTH,
>>>
>>> Guido.
>>>
>>>
>>>
>>> On 10/07/13 09:53, Guido Medina wrote:
>>>
>>> Then you are better off with Bitcask, that will be the fastest in your
>>> case (no 2i, no searches, no M/R)
>>>
>>> HTH,
>>>
>>> Guido.
>>>
>>> On 10/07/13 09:49, Edgar Veiga wrote:
>>>
>>> Hello all!
>>>
>>>  I have a couple of questions that I would like to address all of you
>>> guys, in order to start this migration the best as possible.
>>>
>>>  Context:
>>> - I'm responsible for the migration of a pure key/value store that for
>>> now is being stored on memcacheDB.
>>> - We're serializing php objects and storing them.
>>> - The total size occupied it's ~2TB.
>>>
>>>  - The idea it's to migrate this data to a riak cluster with elevelDB
>>> backend (starting with 6 nodes, 256 partitions. This thing is scaling very
>>> fast).
>>> - We only need to access the information by key. *We won't need neither
>>> map/reduces, searches or secondary indexes*. It's a pure key/value
>>> store!
>>>
>>>  My questions are:
>>> - Do you have any riak fine tunning tip regarding this use case (due to
>>> the fact that we will only use the key/value capabilities of riak)?
>>>  - It's expected that those 2TB would be reduced due to the levelDB
>>> compression. Do you think we should compress our objects to on the client?
>>>
>>>  Best regards,
>>> Edgar Veiga
>>>
>>>
>>> ___
>>> riak-users mailing 
>>> listriak-users@lists.basho.comhttp://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>>
>>>
>>>
>>>
>>> ___
>>> riak-users mailing list
>>> riak-users@lists.basho.com
>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>>
>>>
>>
>> ___
>> riak-users mailing list
>> riak-users@lists.basho.com
>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>
>>
>
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Migration from memcachedb to riak

2013-07-10 Thread Edgar Veiga
Guido, we're not using Java and that won't be an option.

Our technology stack is PHP and/or node.js.

Thanks anyway :)

Best regards


On 10 July 2013 10:35, Edgar Veiga  wrote:

> Hi Damien,
>
> We have ~11 keys and we are using ~2TB of disk space.
> (The average object length will be ~2000 bytes).
>
> This is a lot to fit in memory (We have bad past experiencies with
> couchDB...).
>
> Thanks for the rest of the tips!
>
>
> On 10 July 2013 10:13, damien krotkine  wrote:
>
>>
>> ( first post here, hi everybody... )
>>
>> If you don't need MR, 2i, etc, then BitCask will be faster. You just need
>> to make sure all your keys fit in memory, which should not be a problem.
>> How many keys do you have and what's their average length ?
>>
>> About the values,you can save a lot of space by choosing an appropriate
>> serialization. We use Sereal[1] to serialize our data, and it's small
>> enough that we don't need to compress it further (it can automatically use
>> snappy to compress further). There is a php client [2]
>>
>> If you use leveldb, it can compress using snappy, but I've been a bit
>> disappointed by snappy, because it didn't work well with our data. If you
>> serialize your php object as verbose string (I don't know what's the usual
>> way to serialize php objects), then you should probably benchmark different
>> compressions algorithms on the application side.
>>
>>
>> [1]: https://github.com/Sereal/Sereal/wiki/Sereal-Comparison-Graphs
>> [2]: https://github.com/tobyink/php-sereal/tree/master/PHP
>>
>> On 10 July 2013 10:49, Edgar Veiga  wrote:
>>
>>>  Hello all!
>>>
>>> I have a couple of questions that I would like to address all of you
>>> guys, in order to start this migration the best as possible.
>>>
>>> Context:
>>> - I'm responsible for the migration of a pure key/value store that for
>>> now is being stored on memcacheDB.
>>> - We're serializing php objects and storing them.
>>> - The total size occupied it's ~2TB.
>>>
>>> - The idea it's to migrate this data to a riak cluster with elevelDB
>>> backend (starting with 6 nodes, 256 partitions. This thing is scaling very
>>> fast).
>>> - We only need to access the information by key. *We won't need neither
>>> map/reduces, searches or secondary indexes*. It's a pure key/value
>>> store!
>>>
>>> My questions are:
>>> - Do you have any riak fine tunning tip regarding this use case (due to
>>> the fact that we will only use the key/value capabilities of riak)?
>>> - It's expected that those 2TB would be reduced due to the levelDB
>>> compression. Do you think we should compress our objects to on the client?
>>>
>>> Best regards,
>>> Edgar Veiga
>>>
>>> ___
>>> riak-users mailing list
>>> riak-users@lists.basho.com
>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>>
>>>
>>
>
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Migration from memcachedb to riak

2013-07-10 Thread Edgar Veiga
Hi Damien,

We have ~11 keys and we are using ~2TB of disk space.
(The average object length will be ~2000 bytes).

This is a lot to fit in memory (we have bad past experiences with couchDB...).

Thanks for the rest of the tips!


On 10 July 2013 10:13, damien krotkine  wrote:

>
> ( first post here, hi everybody... )
>
> If you don't need MR, 2i, etc, then BitCask will be faster. You just need
> to make sure all your keys fit in memory, which should not be a problem.
> How many keys do you have and what's their average length ?
>
> About the values,you can save a lot of space by choosing an appropriate
> serialization. We use Sereal[1] to serialize our data, and it's small
> enough that we don't need to compress it further (it can automatically use
> snappy to compress further). There is a php client [2]
>
> If you use leveldb, it can compress using snappy, but I've been a bit
> disappointed by snappy, because it didn't work well with our data. If you
> serialize your php object as verbose string (I don't know what's the usual
> way to serialize php objects), then you should probably benchmark different
> compressions algorithms on the application side.
>
>
> [1]: https://github.com/Sereal/Sereal/wiki/Sereal-Comparison-Graphs
> [2]: https://github.com/tobyink/php-sereal/tree/master/PHP
>
> On 10 July 2013 10:49, Edgar Veiga  wrote:
>
>>  Hello all!
>>
>> I have a couple of questions that I would like to address all of you
>> guys, in order to start this migration the best as possible.
>>
>> Context:
>> - I'm responsible for the migration of a pure key/value store that for
>> now is being stored on memcacheDB.
>> - We're serializing php objects and storing them.
>> - The total size occupied it's ~2TB.
>>
>> - The idea it's to migrate this data to a riak cluster with elevelDB
>> backend (starting with 6 nodes, 256 partitions. This thing is scaling very
>> fast).
>> - We only need to access the information by key. *We won't need neither
>> map/reduces, searches or secondary indexes*. It's a pure key/value store!
>>
>> My questions are:
>> - Do you have any riak fine tunning tip regarding this use case (due to
>> the fact that we will only use the key/value capabilities of riak)?
>> - It's expected that those 2TB would be reduced due to the levelDB
>> compression. Do you think we should compress our objects to on the client?
>>
>> Best regards,
>> Edgar Veiga
>>
>> ___
>> riak-users mailing list
>> riak-users@lists.basho.com
>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>
>>
>
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Migration from memcachedb to riak

2013-07-10 Thread Edgar Veiga
Hi Guido.

Thanks for your answer!

Bitcask is not an option due to the amount of RAM needed... We would need a lot more physical nodes, and so more money spent...

Instead we're using fewer machines with SSD disks to improve eleveldb performance.

Best regards



On 10 July 2013 09:58, Guido Medina  wrote:

>  Well, I rushed my answer before, if you want performance, you probably
> want Bitcask, if you want compression then LevelDB, the following links
> should help you decide better:
>
> http://docs.basho.com/riak/1.2.0/tutorials/choosing-a-backend/Bitcask/
> http://docs.basho.com/riak/1.2.0/tutorials/choosing-a-backend/LevelDB/
>
> Or multi, use one as default and then the other for specific buckets:
>
> http://docs.basho.com/riak/1.2.0/tutorials/choosing-a-backend/Multi/
>
> HTH,
>
> Guido.
>
>
>
> On 10/07/13 09:53, Guido Medina wrote:
>
> Then you are better off with Bitcask, that will be the fastest in your
> case (no 2i, no searches, no M/R)
>
> HTH,
>
> Guido.
>
> On 10/07/13 09:49, Edgar Veiga wrote:
>
> Hello all!
>
>  I have a couple of questions that I would like to address all of you
> guys, in order to start this migration the best as possible.
>
>  Context:
> - I'm responsible for the migration of a pure key/value store that for now
> is being stored on memcacheDB.
> - We're serializing php objects and storing them.
> - The total size occupied it's ~2TB.
>
>  - The idea it's to migrate this data to a riak cluster with elevelDB
> backend (starting with 6 nodes, 256 partitions. This thing is scaling very
> fast).
> - We only need to access the information by key. *We won't need neither
> map/reduces, searches or secondary indexes*. It's a pure key/value store!
>
>  My questions are:
> - Do you have any riak fine tunning tip regarding this use case (due to
> the fact that we will only use the key/value capabilities of riak)?
>  - It's expected that those 2TB would be reduced due to the levelDB
> compression. Do you think we should compress our objects to on the client?
>
>  Best regards,
> Edgar Veiga
>
>
> ___
> riak-users mailing 
> listriak-users@lists.basho.comhttp://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>
>
>
>
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>
>
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Migration from memcachedb to riak

2013-07-10 Thread Edgar Veiga
Hello all!

I have a couple of questions that I would like to put to all of you, so that this migration starts off as well as possible.

Context:
- I'm responsible for the migration of a pure key/value store that is currently held in memcacheDB.
- We're serializing PHP objects and storing them.
- The total size occupied is ~2TB.

- The idea is to migrate this data to a riak cluster with the eleveldb backend (starting with 6 nodes and 256 partitions; this thing is scaling very fast).
- We only need to access the information by key. *We won't need map/reduce, search or secondary indexes*. It's a pure key/value store!

My questions are:
- Do you have any riak fine-tuning tips for this use case (given that we will only use the key/value capabilities of riak)?
- We expect those 2TB to shrink thanks to levelDB compression. Do you think we should also compress our objects on the client (see the sketch below)?
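To illustrate what I mean by compressing on the client, a minimal sketch using PHP's zlib functions (nothing we have benchmarked yet):

<?php
// Sketch only: compress the serialized object before the PUT and
// decompress after the GET. gzcompress/gzuncompress are zlib functions.
$phpObject  = array('example' => str_repeat('x', 2000));   // stand-in object

$value      = serialize($phpObject);        // what we store today
$compressed = gzcompress($value, 6);        // what we could store instead

printf("raw: %d bytes, compressed: %d bytes\n", strlen($value), strlen($compressed));

// read path:
$restored = unserialize(gzuncompress($compressed));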

Best regards,
Edgar Veiga
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com