Matthew,

Thanks for the details about LevelDB. Is there a way to trigger
compaction from Erlang, or any other way to get rid of tombstones
faster with 1.4? If there is no such thing, I guess waiting is my
only option.

Thanks to everybody helping with this issue.

Regards,
Istvan




On Sat, Mar 22, 2014 at 5:33 AM, Matthew Von-Maszewski
<matth...@basho.com> wrote:
> Leveldb, as written by Google, does not actively clean up delete "tombstones" 
> or prior data records with the same key. The old data and tombstones stay on 
> disk until they happen to participate in compaction at the highest "level".  
> The clean up can therefore happen days, weeks, or even months later depending 
> upon the size of your dataset, speed of incoming writes, and distribution of 
> new keys versus deleted keys.
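
The behaviour described above can be illustrated with a toy model. This is only a sketch under simplifying assumptions: the `compact` function, the three-level layout, and the dict-based levels are hypothetical and do not reflect LevelDB's actual file format or compaction scheduling. It shows only why a delete does not free space until the tombstone reaches the highest level.

```python
# Toy model only: real LevelDB compaction is far more involved.
TOMBSTONE = object()  # marker written in place of a deleted value


def compact(upper, lower, lower_is_highest):
    """Merge `upper` into `lower`; newer entries in `upper` win.

    Tombstones are dropped only when the target is the highest level,
    because only then can no older copy of the key remain beneath it.
    """
    merged = dict(lower)
    merged.update(upper)
    if lower_is_highest:
        merged = {k: v for k, v in merged.items() if v is not TOMBSTONE}
    return merged


# Three levels: index 0 holds the newest data, index 2 is the highest level.
levels = [{}, {}, {"k1": "old value"}]
levels[0]["k1"] = TOMBSTONE  # deleting k1 only writes a marker

# Compacting level 0 into level 1 keeps the tombstone, so the marker
# and the old value both still occupy space.
levels[1] = compact(levels[0], levels[1], lower_is_highest=False)
levels[0] = {}
entries_after_first = sum(len(level) for level in levels)

# Only the compaction into the highest level removes both.
levels[2] = compact(levels[1], levels[2], lower_is_highest=True)
levels[1] = {}
entries_after_second = sum(len(level) for level in levels)

print(entries_after_first, entries_after_second)
```

In this model the entry count stays at two after the first compaction and drops to zero only after the second, which mirrors the delay described in the message.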
>
> Basho has added code to leveldb in Riak 2.0 to more aggressively free up disk 
> space.  Details on this 2.0 feature are here:
>
>    https://github.com/basho/leveldb/wiki/mv-aggressive-delete
>
> Matthew Von-Maszewski
>
>
> On Mar 22, 2014, at 1:53, István <lecc...@gmail.com> wrote:
>
>> All good, all the keys are gone! :)
>>
>> I am just waiting for Riak to free up the space. It seems it is not
>> instant... Or I am missing something. I need to read up on how LevelDB
>> actually frees up space.  I have updated the code to stop on {ReqID,
>> done}. I think you get this only when you have no keys left. I have
>> verified that there are no keys left in the bucket.
>>
>>
>> # curl -XGET -i http://127.0.0.1:8098/buckets/test/keys?keys=stream
>> HTTP/1.1 200 OK
>> Vary: Accept-Encoding
>> Transfer-Encoding: chunked
>> Server: MochiWeb/1.1 WebMachine/1.10.0 (never breaks eye contact)
>> Date: Sat, 22 Mar 2014 05:51:48 GMT
>> Content-Type: application/json
>>
>> {"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}
>>
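
The streamed response above is a run of JSON objects written back to back, so it cannot be parsed as a single JSON document. Below is a minimal sketch of one way to check such a body for leftover keys, using the standard library's `json.JSONDecoder.raw_decode`. The helper names `iter_key_chunks` and `collect_keys` are made up for illustration, and the sketch assumes the whole body is already available as a string.

```python
import json


def iter_key_chunks(body):
    """Yield each JSON object from a string of back-to-back objects."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(body):
        # Skip any whitespace between objects.
        while pos < len(body) and body[pos].isspace():
            pos += 1
        if pos == len(body):
            break
        # raw_decode returns the parsed object and the index where it ended.
        obj, pos = decoder.raw_decode(body, pos)
        yield obj


def collect_keys(body):
    """Flatten the "keys" lists of all chunks into one list."""
    keys = []
    for chunk in iter_key_chunks(body):
        keys.extend(chunk.get("keys", []))
    return keys


body = '{"keys":[]}{"keys":[]}{"keys":[]}'
print(collect_keys(body))
```

An empty result means that no chunk carried any key, which is what the output above shows.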
>> Thanks Evan!
>> I.
>>
>> On Fri, Mar 21, 2014 at 9:44 PM, Evan Vigil-McClanahan
>> <emcclana...@basho.com> wrote:
>>> Did some double checking on the off chance that I gave you some bad
>>> advice.  Here's the function that the erlang client uses to accumulate
>>> the outcome of stream_list_keys et al:
>>> https://github.com/basho/riak-erlang-client/blob/master/src/riakc_pb_socket.erl#L2146-L2155
>>>
>>> here is how you get the request id:
>>> https://github.com/basho/riak-erlang-client/blob/master/src/riakc_pb_socket.erl#L490-L494
>>>
>>> On Fri, Mar 21, 2014 at 9:29 PM, Evan Vigil-McClanahan
>>> <emcclana...@basho.com> wrote:
>>>> You don't want to recurse when you get the {ReqID, done} message, you
>>>> should just stop there.
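
The advice above amounts to treating the done message as the end of the stream rather than a cue to list keys again. The following is a sketch of that control flow in Python, not the riak-erlang-client API: the `collect_stream` function, the `deque` standing in for the process mailbox, and the tuple shapes are assumptions made for illustration.

```python
from collections import deque


def collect_stream(mailbox, req_id):
    """Accumulate keys for `req_id` until its `done` message arrives."""
    keys = []
    while mailbox:
        msg_id, payload = mailbox.popleft()
        if msg_id != req_id:
            continue  # message for another request; ignore it
        if payload == "done":
            break  # end of stream: stop here, do not start another listing
        kind, batch = payload
        if kind == "keys":
            keys.extend(batch)
    return keys


# Simulated messages: (request id, ("keys", [...])) or (request id, "done").
mailbox = deque([
    (40127151, ("keys", ["a", "b"])),
    (40127151, ("keys", ["c"])),
    (40127151, "done"),
])
print(collect_stream(mailbox, 40127151))
```

Seen this way, a message such as {40127151, done} is the normal terminator for request 40127151 rather than an error.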
>>>>
>>>> On Fri, Mar 21, 2014 at 6:20 PM, István <lecc...@gmail.com> wrote:
>>>>> With the help of Evan (evanmcc) on the IRC channel I was able to kick off
>>>>> the clean-up job using riak-erlang-client.
>>>>>
>>>>> Here is the code:
>>>>>
>>>>> https://gist.github.com/l1x/9698847
>>>>>
>>>>> It sometimes behaves a bit weirdly: the PB client returns {40127151,
>>>>> done} or something similar, and I can't tell why, but it has
>>>>> definitely deleted some of the keys so far. I am letting it run for a
>>>>> while to see what happens.
>>>>>
>>>>> Regards,
>>>>> Istvan
>>>>>
>>>>>
>>>>> On Wed, Mar 19, 2014 at 1:02 AM, Christian Dahlqvist
>>>>> <christ...@basho.com> wrote:
>>>>>> Hi Istvan,
>>>>>>
>>>>>> Did you run the Basho Bench clean-up job with the following settings?
>>>>>>
>>>>>> {driver, basho_bench_driver_riakc_pb}.
>>>>>> {key_generator, {int_to_bin, {partitioned_sequential_int, 10000000}}}.
>>>>>> {operations, [{delete, 1}]}.
>>>>>>
>>>>>> Also, how did you verify that the data was not deleted?
>>>>>>
>>>>>> Best regards,
>>>>>>
>>>>>> Christian
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, Mar 19, 2014 at 6:49 AM, István <lecc...@gmail.com> wrote:
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I was trying to delete all of the keys generated with the following:
>>>>>>>
>>>>>>> {key_generator, {int_to_bin, {uniform_int, 10000000}}}.
>>>>>>>
>>>>>>> I have used this for the deletion:
>>>>>>>
>>>>>>> {key_generator, {int_to_bin, {partitioned_sequential_int, 10000000}}}.
>>>>>>>
>>>>>>> It has completed but unfortunately did not delete any data.
>>>>>>>
>>>>>>> Next is to use the Erlang client and see if I can list the keys and
>>>>>>> delete them, or try to use the Erlang interface for MR.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Istvan
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Sat, Mar 15, 2014 at 1:38 PM, Christian Dahlqvist
>>>>>>> <christ...@basho.com> wrote:
>>>>>>>> Hi Istvan,
>>>>>>>>
>>>>>>>> Depending on how you have run your Basho Bench job(s), you could try
>>>>>>>> deleting the generated keys by running a separate Basho Bench job based
>>>>>>>> on a partitioned_sequential_int key generator and only delete operations.
>>>>>>>>
>>>>>>>> Best regards,
>>>>>>>>
>>>>>>>> Christian
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Fri, Mar 14, 2014 at 5:00 PM, István <lecc...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I am trying to clean up some of the test data that was inserted by
>>>>>>>>> basho_bench. The first approach, using curl to stream the keys,
>>>>>>>>> fails like this:
>>>>>>>>>
>>>>>>>>> # curl -XGET -i http://127.0.0.1:8098/buckets/test/keys?keys=stream
>>>>>>>>> HTTP/1.1 200 OK
>>>>>>>>> Vary: Accept-Encoding
>>>>>>>>> Transfer-Encoding: chunked
>>>>>>>>> Server: MochiWeb/1.1 WebMachine/1.10.0 (never breaks eye contact)
>>>>>>>>> Date: Fri, 14 Mar 2014 16:59:08 GMT
>>>>>>>>> Content-Type: application/json
>>>>>>>>>
>>>>>>>>> curl: (18) transfer closed with outstanding read data remaining
>>>>>>>>>
>>>>>>>>> When I try to do the same thing with MapReduce, it fails like this:
>>>>>>>>>
>>>>>>>>> curl -X POST "http://localhost:8098/mapred" -H "Content-Type: application/json" -d '{
>>>>>>>>>    "inputs": "test",
>>>>>>>>>    "query": [
>>>>>>>>>        {
>>>>>>>>>            "map": {
>>>>>>>>>                "language": "javascript",
>>>>>>>>>                "source": "function(riakObject) { return [riakObject.key]; }"
>>>>>>>>>            }
>>>>>>>>>        }
>>>>>>>>>    ]
>>>>>>>>> }'
>>>>>>>>>
>>>>>>>>> Error:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> {"phase":0,"error":"bad_utf8_character_code","input":"{ok,{r_object,<<\"test\">>,<<0,116,71,0>>,[{r_content,{dict,3,16,16,8,80,48,{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},{{[],[],[],[],[],[],[],[],[],[],[[<<\"X-Riak-VTag\">>,71,81,80,81,87,76,105,54,113,120,97,116,114,106,51,86,72,53,67,50,82]],[[<<\"index\">>]],[],[[<<\"X-Riak-Last-Modified\">>|{1391,27501,255280}]],[],[]}}},<<75,191,51,171,193,113,206,163,24,68,247,188,84,72,5,72,179,195,99,44,202,122,136,31,250,94,166,5,160,199,182,137,40,6,253,115,100,4,34,67,64,10,25,210,58,23,104,97,228,...>>}],...},...}"}
>>>>>>>>>
>>>>>>>>> I am wondering how else I could get a list of keys in that
>>>>>>>>> bucket. The ultimate goal is to be able to delete them all.
>>>>>>>>>
>>>>>>>>> Thank you in advance,
>>>>>>>>> Istvan
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> the sun shines for all
>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> riak-users mailing list
>>>>>>>>> riak-users@lists.basho.com
>>>>>>>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> the sun shines for all
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> the sun shines for all
>>>>>
>>
>>
>>
>> --
>> the sun shines for all
>>



-- 
the sun shines for all

