Edgar,

This is indirectly related to your key deletion discussion.  I made changes 
recently to the aggressive delete code.  The second section of the following 
(updated) web page discusses the adjustments:

    https://github.com/basho/leveldb/wiki/Mv-aggressive-delete

Matthew


On Apr 6, 2014, at 4:29 PM, Edgar Veiga <edgarmve...@gmail.com> wrote:

> Matthew, thanks again for the response!
> 
> That said, I'll wait again for the 2.0 (and maybe buy some bigger disks :)
> 
> Best regards
> 
> 
> On 6 April 2014 15:02, Matthew Von-Maszewski <matth...@basho.com> wrote:
> Edgar,
> 
> In Riak 1.4, there is no advantage to using empty values versus deleting.
> 
> leveldb is a "write once" data store.  New data for a given key never 
> physically overwrites old data for the same key.  New data "hides" the old 
> data by being in a lower level, and is therefore picked first.
> 
> leveldb's compaction operation will remove older key/value pairs only when 
> the newer key/value pair is part of a compaction involving both new and 
> old.  The new and the old key/value pairs must have migrated to adjacent 
> levels through normal compaction operations before leveldb will see them in 
> the same compaction.  The migration could take days, weeks, or even months 
> depending upon the size of your entire dataset and the rate of incoming write 
> operations.
> 
> leveldb's "delete" object is exactly the same as your empty JSON object.  The 
> delete object simply has one more flag set that allows it to also be removed 
> if and only if there is no chance for an identical key to exist on a higher 
> level.
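A minimal sketch of the behaviour described above, as a toy model only. The names ToyLevels and TOMBSTONE are hypothetical and not part of leveldb; the real store uses sorted table files and sequence numbers rather than one dict per level. The sketch assumes level 0 holds the newest data, that a compaction merges one level into the next, and that a delete marker is dropped only when no deeper level can still hold the key.

```python
TOMBSTONE = object()  # stand-in for leveldb's delete marker (hypothetical name)


class ToyLevels:
    """Toy model: levels[0] holds the newest data, higher indexes hold older data."""

    def __init__(self, depth=3):
        self.levels = [dict() for _ in range(depth)]

    def put(self, key, value):
        # A write only lands in level 0; older copies elsewhere are untouched.
        self.levels[0][key] = value

    def delete(self, key):
        # A delete is just another write, carrying a marker instead of a value.
        self.levels[0][key] = TOMBSTONE

    def get(self, key):
        # Lowest level first, so the newest entry hides any older one.
        for level in self.levels:
            if key in level:
                value = level[key]
                return None if value is TOMBSTONE else value
        return None

    def compact(self, i):
        """Merge level i into level i + 1 (level i + 1 must exist)."""
        src, dst = self.levels[i], self.levels[i + 1]
        for key, value in src.items():
            deeper = any(key in lvl for lvl in self.levels[i + 2:])
            if value is TOMBSTONE and not deeper:
                # No older copy can exist below: marker and old value both go.
                dst.pop(key, None)
            else:
                # The newer entry replaces the older one in the same compaction.
                dst[key] = value
        src.clear()

    def stored_entries(self):
        # Number of physical entries across all levels.
        return sum(len(level) for level in self.levels)
```

In this model, a key that has already migrated to the last level and is then deleted occupies two entries (old value plus marker) until the marker has been compacted down to the same level. Overwriting with "{}" instead follows the same path but leaves one entry behind at the end, where the delete leaves none.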
> 
> I apologize that I cannot give you a more useful answer.  2.0 is on the 
> horizon.
> 
> Matthew
> 
> 
> On Apr 6, 2014, at 7:04 AM, Edgar Veiga <edgarmve...@gmail.com> wrote:
> 
>> Hi again!
>> 
>> Sorry to reopen this discussion, but I have another question regarding the 
>> former post.
>> 
>> What if, instead of doing a mass deletion (we've already seen that it will 
>> not pay off in terms of disk space), I update all the values with an 
>> empty JSON object "{}"? Do you see any problem with this? I no longer need 
>> those millions of values that are living in the cluster... 
>> 
>> When version 2.0 of Riak is stable, I'll do the update and only then 
>> delete those keys!
>> 
>> Best regards
>> 
>> 
>> On 18 February 2014 16:32, Edgar Veiga <edgarmve...@gmail.com> wrote:
>> Ok, thanks a lot Matthew.
>> 
>> 
>> On 18 February 2014 16:18, Matthew Von-Maszewski <matth...@basho.com> wrote:
>> Riak 2.0 is coming.  Hold your mass delete until then.  The "bug" is within 
>> Google's original leveldb architecture.  Riak 2.0 sneaks around to get the 
>> disk space freed.
>> 
>> Matthew
>> 
>> 
>> 
>> On Feb 18, 2014, at 11:10 AM, Edgar Veiga <edgarmve...@gmail.com> wrote:
>> 
>>> The only/main purpose is to free disk space..
>>> 
>>> I was a little bit concerned regarding this operation, but now with your 
>>> feedback I'm leaning towards doing nothing; I can't risk the disk usage 
>>> growing... 
>>> Regarding the overhead, I think that with a tight throttling system I could 
>>> control it and avoid overloading the cluster.
>>> 
>>> Mixed feelings :S
>>> 
>>> 
>>> 
>>> On 18 February 2014 15:45, Matthew Von-Maszewski <matth...@basho.com> wrote:
>>> Edgar,
>>> 
>>> The first "concern" I have is that leveldb's delete does not free disk 
>>> space.  Others have executed mass delete operations only to discover they 
>>> are now using more disk space instead of less.  Here is a discussion of the 
>>> problem:
>>> 
>>> https://github.com/basho/leveldb/wiki/mv-aggressive-delete
>>> 
>>> The link also describes Riak's database operation overhead.  This is a 
>>> second "concern".  You will need to carefully throttle your delete rate or 
>>> the overhead will likely impact your production throughput.
>>> 
>>> We have new code to help quicken the actual purge of deleted data in Riak 
>>> 2.0.  But that release is not quite ready for production usage.
>>> 
>>> 
>>> What do you hope to achieve by the mass delete?
>>> 
>>> Matthew
>>> 
>>> 
>>> 
>>> 
>>> On Feb 18, 2014, at 10:29 AM, Edgar Veiga <edgarmve...@gmail.com> wrote:
>>> 
>>>> Sorry, forgot that info!
>>>> 
>>>> It's leveldb.
>>>> 
>>>> Best regards
>>>> 
>>>> 
>>>> On 18 February 2014 15:27, Matthew Von-Maszewski <matth...@basho.com> 
>>>> wrote:
>>>> Which Riak backend are you using:  bitcask, leveldb, multi?
>>>> 
>>>> Matthew
>>>> 
>>>> 
>>>> On Feb 18, 2014, at 10:17 AM, Edgar Veiga <edgarmve...@gmail.com> wrote:
>>>> 
>>>> > Hi all!
>>>> >
>>>> > I have a fairly trivial question regarding mass deletion on a Riak 
>>>> > cluster, but first let me give you some context. My cluster is 
>>>> > running Riak 1.4.6 on 6 machines with a ring of 256 vnodes and 1 TB 
>>>> > SSD disks.
>>>> >
>>>> > I need to execute a massive object deletion on a bucket; I'm talking about 
>>>> > ~1 billion keys (the average object size is ~1 KB). I will not retrieve 
>>>> > the keys from Riak because I have a file with all of them. I'll just 
>>>> > start a script that reads them from the file and triggers an HTTP DELETE 
>>>> > for each one.
>>>> > The cluster will continue running in production under quite a high load, 
>>>> > serving all other applications, while this deletion runs.
>>>> >
>>>> > My question is simple: do I need to take any extra precautions 
>>>> > for this operation? Would you advise paying special attention to 
>>>> > any particular metrics for Riak or even the servers where it's 
>>>> > running?
>>>> >
>>>> > Best regards!
>>>> > _______________________________________________
>>>> > riak-users mailing list
>>>> > riak-users@lists.basho.com
>>>> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
