Edgar,

This is indirectly related to your key deletion discussion. I recently made changes to the aggressive delete code. The second section of the following (updated) web page discusses the adjustments:
https://github.com/basho/leveldb/wiki/Mv-aggressive-delete

Matthew

On Apr 6, 2014, at 4:29 PM, Edgar Veiga <edgarmve...@gmail.com> wrote:

> Matthew, thanks again for the response!
>
> That said, I'll wait again for the 2.0 (and maybe buy some bigger disks :)
>
> Best regards
>
>
> On 6 April 2014 15:02, Matthew Von-Maszewski <matth...@basho.com> wrote:
> Edgar,
>
> In Riak 1.4, there is no advantage to using empty values versus deleting.
>
> leveldb is a "write once" data store. New data for a given key never
> physically overwrites old data for the same key. New data "hides" the old
> data by being in a lower level, and therefore picked first.
>
> leveldb's compaction operation will remove older key/value pairs only when
> the newer key/value pair is part of a compaction involving both new and
> old. The new and the old key/value pairs must have migrated to adjacent
> levels through normal compaction operations before leveldb will see them in
> the same compaction. The migration could take days, weeks, or even months
> depending upon the size of your entire dataset and the rate of incoming write
> operations.
>
> leveldb's "delete" object is exactly the same as your empty JSON object. The
> delete object simply has one more flag set that allows it to also be removed
> if and only if there is no chance for an identical key to exist on a higher
> level.
>
> I apologize that I cannot give you a more useful answer. 2.0 is on the
> horizon.
>
> Matthew
>
>
> On Apr 6, 2014, at 7:04 AM, Edgar Veiga <edgarmve...@gmail.com> wrote:
>
>> Hi again!
>>
>> Sorry to reopen this discussion, but I have another question regarding the
>> former post.
>>
>> What if, instead of doing a mass deletion (We've already seen that it will
>> be non profitable, regarding disk space) I update all the values with an
>> empty JSON object "{}" ? Do you see any problem with this? I no longer need
>> those millions of values that are living in the cluster...
>>
>> When the version 2.0 of riak runs stable I'll do the update and only then
>> delete those keys!
>>
>> Best regards
>>
>>
>> On 18 February 2014 16:32, Edgar Veiga <edgarmve...@gmail.com> wrote:
>> Ok, thanks a lot Matthew.
>>
>>
>> On 18 February 2014 16:18, Matthew Von-Maszewski <matth...@basho.com> wrote:
>> Riak 2.0 is coming. Hold your mass delete until then. The "bug" is within
>> Google's original leveldb architecture. Riak 2.0 sneaks around to get the
>> disk space freed.
>>
>> Matthew
>>
>>
>>
>> On Feb 18, 2014, at 11:10 AM, Edgar Veiga <edgarmve...@gmail.com> wrote:
>>
>>> The only/main purpose is to free disk space..
>>>
>>> I was a little bit concerned regarding this operation, but now with your
>>> feedback I'm tending to do nothing, I can't risk the growing of
>>> space...
>>> Regarding the overhead I think that with a tight throttling system I could
>>> control and avoid overloading the cluster.
>>>
>>> Mixed feelings :S
>>>
>>>
>>>
>>> On 18 February 2014 15:45, Matthew Von-Maszewski <matth...@basho.com> wrote:
>>> Edgar,
>>>
>>> The first "concern" I have is that leveldb's delete does not free disk
>>> space. Others have executed mass delete operations only to discover they
>>> are now using more disk space instead of less. Here is a discussion of the
>>> problem:
>>>
>>> https://github.com/basho/leveldb/wiki/mv-aggressive-delete
>>>
>>> The link also describes Riak's database operation overhead. This is a
>>> second "concern". You will need to carefully throttle your delete rate or
>>> the overhead will likely impact your production throughput.
>>>
>>> We have new code to help quicken the actual purge of deleted data in Riak
>>> 2.0. But that release is not quite ready for production usage.
>>>
>>>
>>> What do you hope to achieve by the mass delete?
>>>
>>> Matthew
>>>
>>>
>>>
>>>
>>> On Feb 18, 2014, at 10:29 AM, Edgar Veiga <edgarmve...@gmail.com> wrote:
>>>
>>>> Sorry, forgot that info!
>>>>
>>>> It's leveldb.
>>>>
>>>> Best regards
>>>>
>>>>
>>>> On 18 February 2014 15:27, Matthew Von-Maszewski <matth...@basho.com>
>>>> wrote:
>>>> Which Riak backend are you using: bitcask, leveldb, multi?
>>>>
>>>> Matthew
>>>>
>>>>
>>>> On Feb 18, 2014, at 10:17 AM, Edgar Veiga <edgarmve...@gmail.com> wrote:
>>>>
>>>> > Hi all!
>>>> >
>>>> > I have a fairly trivial question regarding mass deletion on a riak
>>>> > cluster, but firstly let me give you just some context. My cluster is
>>>> > running with riak 1.4.6 on 6 machines with a ring of 256 nodes and 1Tb
>>>> > ssd disks.
>>>> >
>>>> > I need to execute a massive object deletion on a bucket, I'm talking of
>>>> > ~1 billion keys (The object average size is ~1Kb). I will not retrieve
>>>> > the keys from riak because I have a file with all of them. I'll just
>>>> > start a script that reads them from the file and triggers an HTTP DELETE
>>>> > for each one.
>>>> > The cluster will continue running on production with a quite high load
>>>> > serving all other applications, while running this deletion.
>>>> >
>>>> > My question is simple, do I need to have any kind of extra concerns
>>>> > regarding this action? Do you advise me on taking special attention to
>>>> > any kind of metrics regarding riak or even the servers where it's
>>>> > running?
>>>> >
>>>> > Best regards!
>>>> > _______________________________________________
>>>> > riak-users mailing list
>>>> > riak-users@lists.basho.com
>>>> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
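For anyone reading the quoted thread later, the "write once" behaviour described above (a delete is just another record, and space comes back only when the new and old records meet in one compaction) can be sketched with a toy model. This is only an illustration under simplifying assumptions, not leveldb's actual code: the class ToyLevels, its methods, and the three-level layout are all hypothetical names made up for this example, and real leveldb works on sorted files rather than dictionaries.

```python
# Toy model of level shadowing and compaction. NOT leveldb's implementation;
# every name here is hypothetical and exists only for illustration.

TOMBSTONE = object()  # stands in for leveldb's "delete" object


class ToyLevels:
    def __init__(self, depth=3):
        # levels[0] holds the newest data, levels[-1] the oldest.
        self.levels = [dict() for _ in range(depth)]

    def put(self, key, value):
        # New data always lands in the first level; older copies stay put.
        self.levels[0][key] = value

    def delete(self, key):
        # A delete is a write of a marker, so it consumes space too.
        self.levels[0][key] = TOMBSTONE

    def get(self, key):
        # The first level that holds the key wins and hides deeper copies.
        for level in self.levels:
            if key in level:
                value = level[key]
                return None if value is TOMBSTONE else value
        return None

    def stored_entries(self):
        # Rough proxy for disk usage: every record in every level counts.
        return sum(len(level) for level in self.levels)

    def compact(self, n):
        """Merge level n into level n + 1."""
        upper, lower = self.levels[n], self.levels[n + 1]
        deeper = self.levels[n + 2:]
        for key, value in upper.items():
            if value is TOMBSTONE and not any(key in lvl for lvl in deeper):
                # No deeper copy can exist, so marker and old value both go.
                lower.pop(key, None)
            else:
                # Newer record replaces the older one in the adjacent level.
                lower[key] = value
        upper.clear()


db = ToyLevels()
db.put("k", "v1")
db.compact(0)
db.compact(1)                 # "k" has migrated to the deepest level
db.delete("k")
print(db.get("k"))            # None: the marker hides the old value
print(db.stored_entries())    # 2: the delete added a record, freed nothing
db.compact(0)
print(db.stored_entries())    # 2: marker and value are now in adjacent levels
db.compact(1)
print(db.stored_entries())    # 0: both records met in one compaction
```

In this sketch the record count rises after the delete and drops only after the marker has migrated down to the level holding the old value, which mirrors why a mass delete can increase disk usage for a long time on a large dataset.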