Matthew,

Thanks for the details about LevelDB. Is there a way to trigger compaction from Erlang, or any other way to get rid of tombstones faster with 1.4? If there is no such thing, I guess waiting is my only option.
Thanks to everybody helping with this issue.

Regards,
Istvan

On Sat, Mar 22, 2014 at 5:33 AM, Matthew Von-Maszewski <matth...@basho.com> wrote:
> Leveldb, as written by Google, does not actively clean up delete "tombstones"
> or prior data records with the same key. The old data and tombstones stay on
> disk until they happen to participate in compaction at the highest "level".
> The clean up can therefore happen days, weeks, or even months later depending
> upon the size of your dataset, speed of incoming writes, and distribution of
> new keys versus deleted keys.
>
> Basho has added code to leveldb in Riak 2.0 to more aggressively free up disk
> space. Details on this 2.0 feature are here:
>
> https://github.com/basho/leveldb/wiki/mv-aggressive-delete
>
> Matthew Von-Maszewski
>
>
> On Mar 22, 2014, at 1:53, István <lecc...@gmail.com> wrote:
>
>> All good, all the keys are gone! :)
>>
>> I am just waiting for Riak to free up the space. It seems it is not
>> instant... Or I am missing something. I need to read up on how LevelDB
>> actually frees up space. I have updated the code to stop on {ReqID,
>> done}. I think you get this only when you have no keys left. I have
>> verified that there are no keys left in the bucket.
>> >> >> # curl -XGET -i http://127.0.0.1:8098/buckets/test/keys?keys=stream >> HTTP/1.1 200 OK >> Vary: Accept-Encoding >> Transfer-Encoding: chunked >> Server: MochiWeb/1.1 WebMachine/1.10.0 (never breaks eye contact) >> Date: Sat, 22 Mar 2014 05:51:48 GMT >> Content-Type: application/json >> >> {"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]} >> >> Thanks Evan! >> I. >> >> On Fri, Mar 21, 2014 at 9:44 PM, Evan Vigil-McClanahan >> <emcclana...@basho.com> wrote: >>> Did some double checking on the off chance that I gave you some bad >>> advice. Here's the function that the erlang client uses to accumulate >>> the outcome of stream_list_keys et al: >>> https://github.com/basho/riak-erlang-client/blob/master/src/riakc_pb_socket.erl#L2146-L2155 >>> >>> here is how you get the request id: >>> https://github.com/basho/riak-erlang-client/blob/master/src/riakc_pb_socket.erl#L490-L494 >>> >>> On Fri, Mar 21, 2014 at 9:29 PM, Evan Vigil-McClanahan >>> <emcclana...@basho.com> wrote: >>>> You don't want to recurse when you get the {ReqID, done} message, you >>>> should just stop there. >>>> >>>> On Fri, Mar 21, 2014 at 6:20 PM, István <lecc...@gmail.com> wrote: >>>>> With help of Evan (evanmcc) on the IRC channel I was able to kick off >>>>> the clean up job using riak-erlang-client. 
>>>>>
>>>>> Here is the code:
>>>>>
>>>>> https://gist.github.com/l1x/9698847
>>>>>
>>>>> It sometimes behaves a bit weirdly: the PB client returns {40127151,
>>>>> done} or something similar, and I can't figure out why, but it
>>>>> definitely deleted some of the keys so far. I am letting it run for a
>>>>> while to see what happens.
>>>>>
>>>>> Regards,
>>>>> Istvan
>>>>>
>>>>>
>>>>> On Wed, Mar 19, 2014 at 1:02 AM, Christian Dahlqvist
>>>>> <christ...@basho.com> wrote:
>>>>>> Hi Istvan,
>>>>>>
>>>>>> Did you run the Basho Bench clean-up job with the following settings?
>>>>>>
>>>>>> {driver, basho_bench_driver_riakc_pb}.
>>>>>> {key_generator, {int_to_bin, {partitioned_sequential_int, 10000000}}}.
>>>>>> {operations, [{delete, 1}]}.
>>>>>>
>>>>>> Also, how did you verify that the data was not deleted?
>>>>>>
>>>>>> Best regards,
>>>>>>
>>>>>> Christian
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, Mar 19, 2014 at 6:49 AM, István <lecc...@gmail.com> wrote:
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I was trying to delete all of the keys generated with the following:
>>>>>>>
>>>>>>> {key_generator, {int_to_bin, {uniform_int, 10000000}}}.
>>>>>>>
>>>>>>> I have used this for the deletion:
>>>>>>>
>>>>>>> {key_generator, {int_to_bin, {partitioned_sequential_int, 10000000}}}.
>>>>>>>
>>>>>>> It has completed but unfortunately was not deleting any data....
>>>>>>>
>>>>>>> Next is to use the Erlang client and see if I can list the keys and
>>>>>>> delete them, or try to use the Erlang interface for MR.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Istvan
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Sat, Mar 15, 2014 at 1:38 PM, Christian Dahlqvist
>>>>>>> <christ...@basho.com> wrote:
>>>>>>>> Hi Istvan,
>>>>>>>>
>>>>>>>> Depending on how you have run your Basho Bench job(s), you could try
>>>>>>>> deleting the generated keys by running a separate Basho Bench job based
>>>>>>>> on a partitioned_sequential_int key generator and only delete operations.
>>>>>>>>
>>>>>>>> Best regards,
>>>>>>>>
>>>>>>>> Christian
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Fri, Mar 14, 2014 at 5:00 PM, István <lecc...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I am trying to clean up some of the test data that was inserted by
>>>>>>>>> basho_bench. The first approach, using curl to stream the keys,
>>>>>>>>> fails like this:
>>>>>>>>>
>>>>>>>>> # curl -XGET -i http://127.0.0.1:8098/buckets/test/keys?keys=stream
>>>>>>>>> HTTP/1.1 200 OK
>>>>>>>>> Vary: Accept-Encoding
>>>>>>>>> Transfer-Encoding: chunked
>>>>>>>>> Server: MochiWeb/1.1 WebMachine/1.10.0 (never breaks eye contact)
>>>>>>>>> Date: Fri, 14 Mar 2014 16:59:08 GMT
>>>>>>>>> Content-Type: application/json
>>>>>>>>>
>>>>>>>>> curl: (18) transfer closed with outstanding read data remaining
>>>>>>>>>
>>>>>>>>> When I try to do the same thing with MapReduce it fails like this:
>>>>>>>>>
>>>>>>>>> curl -X POST "http://localhost:8098/mapred" -H "Content-Type: application/json" -d '{
>>>>>>>>>   "inputs": "test",
>>>>>>>>>   "query": [
>>>>>>>>>     {
>>>>>>>>>       "map": {
>>>>>>>>>         "language": "javascript",
>>>>>>>>>         "source": "function(riakObject) { return [riakObject.key]; }"
>>>>>>>>>       }
>>>>>>>>>     }
>>>>>>>>>   ]
>>>>>>>>> }'
>>>>>>>>>
>>>>>>>>> Error:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> {"phase":0,"error":"bad_utf8_character_code","input":"{ok,{r_object,<<\"test\">>,<<0,116,71,0>>,[{r_content,{dict,3,16,16,8,80,48,{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},{{[],[],[],[],[],[],[],[],[],[],[[<<\"X-Riak-VTag\">>,71,81,80,81,87,76,105,54,113,120,97,116,114,106,51,86,72,53,67,50,82]],[[<<\"index\">>]],[],[[<<\"X-Riak-Last-Modified\">>|{1391,27501,255280}]],[],[]}}},<<75,191,51,171,193,113,206,163,24,68,247,188,84,72,5,72,179,195,99,44,202,122,136,31,250,94,166,5,160,199,182,137,40,6,253,115,100,4,34,67,64,10,25,210,58,23,104,97,228,...>>}],...},...}"}
>>>>>>>>>
>>>>>>>>> I am wondering how else I could just get a list of keys in that
>>>>>>>>> bucket. The ultimate goal is to be able to delete them all.
>>>>>>>>>
>>>>>>>>> Thank you in advance,
>>>>>>>>> Istvan
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> the sun shines for all
>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> riak-users mailing list
>>>>>>>>> riak-users@lists.basho.com
>>>>>>>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
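
A note on the streamed key listing quoted in this thread: with keys=stream the HTTP body arrives as back-to-back JSON objects ({"keys":[]}{"keys":[]}...) rather than one JSON document, so a single json.loads call on the whole body will not accept it. Below is a minimal Python sketch of collecting the keys from such a body. The function name collect_streamed_keys and the sample string are made up for illustration, and the assumption that every chunk is an object with a "keys" list comes only from the output quoted above, not from the Riak documentation.

```python
import json


def collect_streamed_keys(body):
    """Collect keys from a body made of back-to-back JSON objects.

    Assumption (hypothetical helper): each object looks like
    {"keys": [...]}, as in the curl output quoted in this thread.
    """
    decoder = json.JSONDecoder()
    keys = []
    pos = 0
    while pos < len(body):
        # raw_decode does not skip leading whitespace, so do it here.
        if body[pos].isspace():
            pos += 1
            continue
        # Decode one object and get the index where it ended.
        obj, pos = decoder.raw_decode(body, pos)
        keys.extend(obj.get("keys", []))
    return keys


if __name__ == "__main__":
    # Made-up sample in the same shape as the streamed response.
    sample = '{"keys":[]}{"keys":["a","b"]}{"keys":[]}{"keys":["c"]}'
    print(collect_streamed_keys(sample))
```

An empty result from a body like the one quoted above only says that the listing returned no keys; as Matthew explains, the disk space held by tombstones is reclaimed later by compaction.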