LevelDB, as written by Google, does not actively clean up delete "tombstones" 
or prior data records with the same key. The old data and tombstones stay on 
disk until they happen to participate in a compaction at the highest "level".  
Cleanup can therefore happen days, weeks, or even months later, depending 
on the size of your dataset, the speed of incoming writes, and the 
distribution of new keys versus deleted keys.
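
To make the delayed cleanup concrete, here is a minimal toy model in Python. 
It is only a sketch of the behaviour described above, not LevelDB's actual 
code: the class ToyLSM, its method names, and the three-level layout are all 
made up for illustration, and each "level" is just an in-memory dict.

```python
TOMBSTONE = object()  # hypothetical sentinel standing in for a delete marker


class ToyLSM:
    """Toy leveled store: levels[0] is newest, levels[-1] is the highest level."""

    def __init__(self, num_levels=3):
        self.levels = [dict() for _ in range(num_levels)]

    def put(self, key, value):
        self.levels[0][key] = value

    def delete(self, key):
        # A delete only writes a tombstone; older copies stay where they are.
        self.levels[0][key] = TOMBSTONE

    def get(self, key):
        # Search newest to oldest; a tombstone hides any older value.
        for level in self.levels:
            if key in level:
                value = level[key]
                return None if value is TOMBSTONE else value
        return None

    def compact(self, n):
        """Merge level n into level n + 1; entries from level n win."""
        src, dst = self.levels[n], self.levels[n + 1]
        dst.update(src)
        src.clear()
        if n + 1 == len(self.levels) - 1:
            # Only at the highest level are tombstones dropped in this model:
            # no older copy of the key can exist anywhere else.
            for key in [k for k, v in dst.items() if v is TOMBSTONE]:
                del dst[key]

    def records_on_disk(self):
        return sum(len(level) for level in self.levels)
```

In this model, if a value has already reached the highest level and is then 
deleted, get() returns None right away, but records_on_disk() does not drop 
until the tombstone has been compacted all the way down to meet the old value.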

Basho has added code to leveldb in Riak 2.0 to more aggressively free up disk 
space.  Details on this 2.0 feature are here:

   https://github.com/basho/leveldb/wiki/mv-aggressive-delete 

Matthew Von-Maszewski


On Mar 22, 2014, at 1:53, István <lecc...@gmail.com> wrote:

> All good, all the keys are gone! :)
> 
> I am just waiting for Riak to free up the space. It seems it is not
> instant... Or I am missing something. I need to read up on how LevelDB
> actually frees up space.  I have updated the code to stop on {ReqID,
> done}. I think you get this only when you have no keys left. I have
> verified that there are no keys left in the bucket.
> 
> 
> # curl -XGET -i http://127.0.0.1:8098/buckets/test/keys?keys=stream
> HTTP/1.1 200 OK
> Vary: Accept-Encoding
> Transfer-Encoding: chunked
> Server: MochiWeb/1.1 WebMachine/1.10.0 (never breaks eye contact)
> Date: Sat, 22 Mar 2014 05:51:48 GMT
> Content-Type: application/json
> 
> {"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}
> 
> Thanks Evan!
> I.
> 
> On Fri, Mar 21, 2014 at 9:44 PM, Evan Vigil-McClanahan
> <emcclana...@basho.com> wrote:
>> Did some double checking on the off chance that I gave you some bad
>> advice.  Here's the function that the erlang client uses to accumulate
>> the outcome of stream_list_keys et al:
>> https://github.com/basho/riak-erlang-client/blob/master/src/riakc_pb_socket.erl#L2146-L2155
>> 
>> here is how you get the request id:
>> https://github.com/basho/riak-erlang-client/blob/master/src/riakc_pb_socket.erl#L490-L494
>> 
>> On Fri, Mar 21, 2014 at 9:29 PM, Evan Vigil-McClanahan
>> <emcclana...@basho.com> wrote:
>>> You don't want to recurse when you get the {ReqID, done} message, you
>>> should just stop there.
>>> 
>>> On Fri, Mar 21, 2014 at 6:20 PM, István <lecc...@gmail.com> wrote:
>>>> With the help of Evan (evanmcc) on the IRC channel I was able to kick off
>>>> the cleanup job using riak-erlang-client.
>>>> 
>>>> Here is the code:
>>>> 
>>>> https://gist.github.com/l1x/9698847
>>>> 
>>>> It sometimes behaves a bit weirdly: the PB client returns {40127151,
>>>> done} or something similar, and I can't work out why, but it has
>>>> definitely deleted some of the keys so far. I am letting it run for a
>>>> while to see what happens.
>>>> 
>>>> Regards,
>>>> Istvan
>>>> 
>>>> 
>>>> On Wed, Mar 19, 2014 at 1:02 AM, Christian Dahlqvist
>>>> <christ...@basho.com> wrote:
>>>>> Hi Istvan,
>>>>> 
>>>>> Did you run the Basho Bench clean-up job with the following settings?
>>>>> 
>>>>> {driver, basho_bench_driver_riakc_pb}.
>>>>> {key_generator, {int_to_bin, {partitioned_sequential_int, 10000000}}}.
>>>>> {operations, [{delete, 1}]}.
>>>>> 
>>>>> Also, how did you verify that the data was not deleted?
>>>>> 
>>>>> Best regards,
>>>>> 
>>>>> Christian
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> On Wed, Mar 19, 2014 at 6:49 AM, István <lecc...@gmail.com> wrote:
>>>>>> 
>>>>>> Hi,
>>>>>> 
>>>>>> I was trying to delete all of the keys generated with the following:
>>>>>> 
>>>>>> {key_generator, {int_to_bin, {uniform_int, 10000000}}}.
>>>>>> 
>>>>>> I have used this for the deletion:
>>>>>> 
>>>>>> {key_generator, {int_to_bin, {partitioned_sequential_int, 10000000}}}.
>>>>>> 
>>>>>> It completed but unfortunately did not delete any data.
>>>>>> 
>>>>>> Next is to use the Erlang client and see if I can list the keys and
>>>>>> delete them, or try to use the Erlang interface for MR.
>>>>>> 
>>>>>> Regards,
>>>>>> Istvan
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On Sat, Mar 15, 2014 at 1:38 PM, Christian Dahlqvist
>>>>>> <christ...@basho.com> wrote:
>>>>>>> Hi Istvan,
>>>>>>> 
>>>>>>> Depending on how you have run your Basho Bench job(s), you could try
>>>>>>> deleting the generated keys by running a separate Basho Bench job based
>>>>>>> on a partitioned_sequential_int key generator and only delete operations.
>>>>>>> 
>>>>>>> Best regards,
>>>>>>> 
>>>>>>> Christian
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> On Fri, Mar 14, 2014 at 5:00 PM, István <lecc...@gmail.com> wrote:
>>>>>>>> 
>>>>>>>> Hi,
>>>>>>>> 
>>>>>>>> I am trying to clean up some of the test data that was inserted by
>>>>>>>> basho_bench. The first approach, using curl to stream the keys,
>>>>>>>> fails like this:
>>>>>>>> 
>>>>>>>> # curl -XGET -i http://127.0.0.1:8098/buckets/test/keys?keys=stream
>>>>>>>> HTTP/1.1 200 OK
>>>>>>>> Vary: Accept-Encoding
>>>>>>>> Transfer-Encoding: chunked
>>>>>>>> Server: MochiWeb/1.1 WebMachine/1.10.0 (never breaks eye contact)
>>>>>>>> Date: Fri, 14 Mar 2014 16:59:08 GMT
>>>>>>>> Content-Type: application/json
>>>>>>>> 
>>>>>>>> curl: (18) transfer closed with outstanding read data remaining
>>>>>>>> 
>>>>>>>> When I try to do the same thing with MapReduce, it fails like this:
>>>>>>>> 
>>>>>>>> curl -X POST "http://localhost:8098/mapred" -H "Content-Type: application/json" -d '{
>>>>>>>>    "inputs": "test",
>>>>>>>>    "query": [
>>>>>>>>        {
>>>>>>>>            "map": {
>>>>>>>>                "language": "javascript",
>>>>>>>>                "source": "function(riakObject) { return [riakObject.key]; }"
>>>>>>>>            }
>>>>>>>>        }
>>>>>>>>    ]
>>>>>>>> }'
>>>>>>>> 
>>>>>>>> Error:
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> {"phase":0,"error":"bad_utf8_character_code","input":"{ok,{r_object,<<\"test\">>,<<0,116,71,0>>,[{r_content,{dict,3,16,16,8,80,48,{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},{{[],[],[],[],[],[],[],[],[],[],[[<<\"X-Riak-VTag\">>,71,81,80,81,87,76,105,54,113,120,97,116,114,106,51,86,72,53,67,50,82]],[[<<\"index\">>]],[],[[<<\"X-Riak-Last-Modified\">>|{1391,27501,255280}]],[],[]}}},<<75,191,51,171,193,113,206,163,24,68,247,188,84,72,5,72,179,195,99,44,202,122,136,31,250,94,166,5,160,199,182,137,40,6,253,115,100,4,34,67,64,10,25,210,58,23,104,97,228,...>>}],...},...}"}
>>>>>>>> 
>>>>>>>> I am wondering how else I could get a list of the keys in that
>>>>>>>> bucket. The ultimate goal is to be able to delete them all.
>>>>>>>> 
>>>>>>>> Thank you in advance,
>>>>>>>> Istvan
>>>>>>>> 
>>>>>>>> --
>>>>>>>> the sun shines for all
>>>>>>>> 
>>>>>>>> _______________________________________________
>>>>>>>> riak-users mailing list
>>>>>>>> riak-users@lists.basho.com
>>>>>>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
