For what it's worth, in the integration tests of our client libraries we have moved to generating random bucket and key names for each test/example. This reduces setup/teardown time and is less susceptible to the types of unexpected behaviors you are seeing from list-keys. If possible, I highly recommend this approach in your suite.
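As a rough sketch of that approach: the helper name `random_name` and the test class below are made up for illustration and are not taken from our client libraries' test suites. Each test derives its own bucket and key names from a UUID:

```python
import unittest
import uuid


def random_name(prefix):
    # Hypothetical helper: uuid4 yields a fresh random hex string per call,
    # so names do not collide across tests or across test runs.
    return "{0}-{1}".format(prefix, uuid.uuid4().hex)


class RandomBucketTestCase(unittest.TestCase):
    def setUp(self):
        # Fresh names for every test method; nothing to clean up afterwards.
        self.bucket_name = random_name("bucket")
        self.key_name = random_name("key")
        # With a connected client, the bucket could then be obtained with the
        # same call used earlier in this thread, e.g.:
        #   self.bucket = client.bucket(name=self.bucket_name,
        #                               bucket_type='strongly_consistent')

    def test_names_are_unique(self):
        self.assertNotEqual(self.bucket_name, random_name("bucket"))
```

Because no two tests share a bucket, there is nothing to delete between runs, and tombstones left behind by earlier runs never appear in the key listing of the bucket a test is looking at.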

On Tue, May 20, 2014 at 9:25 AM, Dmitri Zagidulin <dzagidu...@basho.com> wrote:

> Ok, so, from what I understand, this is going to be expected behavior from
> strongly consistent buckets. (I'm in the process of confirming this, and
> we'll see if we can add it to the documentation). The delete_mode:
> immediate is ignored, and the tombstone is kept around, to ensure the
> consistency of not found, etc. (In the context of further over-writes of
> that key).
>
> So, unfortunately that may be bad news in terms of deleting a
> strongly_consistent bucket via keylist for unit testing. :)
>
> You may want to switch to method #2, for your test suite. (Write a shell
> script to stop the node, delete the bitcask & aae dirs, and restart. And
> invoke it as a shell script command from your test suite. Or just call
> those commands directly.).
>
> On Tue, May 20, 2014 at 5:44 AM, Paweł Królikowski <rabb...@gmail.com> wrote:
>
>> Ok then,
>>
>> I've stopped riak, wiped bitcask and anti_entropy directories, updated
>> config, started riak.
>>
>> I've tried to verify it with:
>>
>> riak config generate -l debug
>>
>> Got output:
>>
>> [...]
>>
>> 10:25:46.260 [info] /etc/riak/advanced.config detected, overlaying
>> proplists
>> -config /var/lib/riak/generated.configs/app.2014.05.20.10.25.46.config
>> -args_file /var/lib/riak/generated.configs/vm.2014.05.20.10.25.46.args
>> -vm_args /var/lib/riak/generated.configs/vm.2014.05.20.10.25.46.args
>>
>> And at the very end of the config file there's:
>>
>> {k_kv,[{delete_mode,immediate}]}].
>>
>> So, it worked.
>>
>> Then did this:
>>
>> >>> import riak
>> >>> c = riak.RiakClient(pb_port=8087, protocol='pbc', host='db-13')
>> >>> b = c.bucket(name='locate', bucket_type='strongly_consistent')
>> >>> o = b.get('foo')
>> >>> o.data = 3
>> >>> o.store()
>> <riak.riak_object.RiakObject object at 0x2b2ce90>
>> >>> o.delete()
>> <riak.riak_object.RiakObject object at 0x2b2ce90>
>> >>> b.delete('foo')
>> <riak.riak_object.RiakObject object at 0x2b55d90>
>> >>> o.exists
>> False
>> >>> b.get_keys()
>> ['foo']
>>
>> So, it didn't work.
>>
>> It's not just the python client, because if I do this, I get the key back:
>>
>> http://db-13:8098/types/strongly_consistent/buckets/locate/keys?keys=true
>> {"keys":["foo"]}
>>
>> I've tried deleting the key via http request (curl -v -X DELETE
>> http://db-13:8098/types/strongly_consistent/buckets/locate/keys/bar),
>> but it still remains.
>>
>> http://db-13:8098/types/strongly_consistent/buckets/locate/keys/foo
>>
>> returns
>>
>> not found
>>
>> but
>>
>> http://db-13:8098/types/strongly_consistent/buckets/locate/keys?keys=true
>>
>> gives
>>
>> {"keys":["foo","bar"]}
>>
>> I've tried looking for detailed logs, but console.log, even on debug,
>> doesn't print anything useful.
>> I've also tried looking inside bitcask directory, and there's definitely
>> 'some' binary data there, even after deletion.
>>
>> On 19 May 2014 23:23, Dmitri Zagidulin <dzagidu...@basho.com> wrote:
>>
>>> Ah, that's interesting, let's see if we can test this.
>>>
>>> The 'delete_mode' configuration is not supported in the regular
>>> riak.conf file, from what I understand.
>>> However, you can still set it in the 'advanced.config' file, as
>>> described here:
>>>
>>> https://github.com/basho/basho_docs/blob/features/lp/advanced-conf/source/languages/en/riak/ops/advanced/configs/configuration-files.md#the-advancedconfig-file
>>> (those docs are a current work-in-progress, mind you)
>>>
>>> So, create an advanced.config file in your riak etc/ directory (this
>>> will be in addition to your existing riak.conf), with the following
>>> contents:
>>>
>>> [
>>>   {riak_kv, [
>>>     {delete_mode, immediate}
>>>   ]}
>>> ].
>>>
>>> Restart the node, and try your tests again. The tombstones should
>>> disappear now on every delete request. (You should probably also wipe all
>>> of the old data, by deleting the contents of the bitcask and anti_entropy
>>> directories in your riak data dir, just to make sure the old ones are gone.
>>> This should be done while the node is down, of course.)
>>>
>>> On Mon, May 19, 2014 at 4:33 PM, Paweł Królikowski <rabb...@gmail.com> wrote:
>>>
>>>> The problem is that the tombstones never disappear - they keep coming
>>>> back through bucket.get_keys() hours after deletion, even after a restart.
>>>>
>>>> I said I'm using the delete_mode default configuration, because I
>>>> didn't change it. I now tried, and apparently it's not supported any more
>>>> in Riak 2.0.
>>>>
>>>> 17:16:56.318 [error] You've tried to set delete_mode, but there is no
>>>> setting with that name.
>>>> 17:16:56.318 [error] Did you mean one of these?
>>>> 17:16:56.335 [error] dtrace
>>>> 17:16:56.335 [error] nodename
>>>> 17:16:56.335 [error] ssl.keyfile
>>>> 17:16:56.335 [error] Error generating configuration in phase
>>>> transform_datatypes
>>>> 17:16:56.335 [error] Conf file attempted to set unknown variable:
>>>> delete_mode
>>>> Error generating config with cuttlefish
>>>>
>>>> I'm using Riak 2.0.0pre20, on strongly consistent buckets, on a single
>>>> node cluster. Can this be the reason?
>>>> I guess what I need is a confirmation
>>>> that something is broken/that I'm doing something stupid.
>>>>
>>>> I've tried looking for similar issues (github.com/basho/riak/issues),
>>>> didn't find any -> I guess that suggests I'm doing something stupid, I just
>>>> don't know what yet.
>>>>
>>>> Thanks again :)
>>>>
>>>> --
>>>> Paweł
>>>>
>>>> On 19 May 2014 18:00, Dmitri Zagidulin <dzagidu...@basho.com> wrote:
>>>>
>>>>> Ah, yes, you bring up a good point. (And, that's another subtlety to
>>>>> keep in mind, with Option #1).
>>>>>
>>>>> Tombstones are definitely something to keep in mind, when deleting
>>>>> unit test data.
>>>>> As you mentioned in your earlier question, if you're using default
>>>>> delete_mode configuration (3 seconds), it means that if you issue a
>>>>> delete, a tombstone object is going to be written (and stick around for at
>>>>> least 3 seconds), and unfortunately, it is going to show up as a false
>>>>> positive on a List Keys call.
>>>>>
>>>>> The easiest thing to try, in your case, is to set 'delete_mode' to
>>>>> 'immediate', restart the test cluster, and retest. With an immediate
>>>>> delete, your second test with 10 keys should not take as long as the
>>>>> previous delete with 10000 keys.
>>>>>
>>>>> On Mon, May 19, 2014 at 11:46 AM, Paweł Królikowski <rabb...@gmail.com> wrote:
>>>>>
>>>>>> Hi Dmitri,
>>>>>>
>>>>>> Thanks a lot for the answer. Option #1 seems the best, but I have a
>>>>>> follow up question:
>>>>>>
>>>>>> - when do the deleted keys disappear from Riak: a part of my problem
>>>>>> (have not explained it correctly the first time), is that get_keys()
>>>>>> returns keys that no longer exist. So, I run a test with 10 000 keys, I
>>>>>> remove them, it takes N seconds. I then follow with a test with 10 keys,
>>>>>> but removing them takes just as much time - I imagine it's because I'm
>>>>>> going over that 10 000 keys again.
>>>>>>
>>>>>> This article seems relevant:
>>>>>> http://basho.com/riaks-config-behaviors-part-3/ - it seems like the
>>>>>> tombstones simply remain in my system indefinitely.
>>>>>>
>>>>>> --
>>>>>> Paweł
>>>>>>
>>>>>> On 19 May 2014 15:32, Dmitri Zagidulin <dzagidu...@basho.com> wrote:
>>>>>>
>>>>>>> Hi Pawel,
>>>>>>>
>>>>>>> There's basically three ways to clear data from Riak (for the
>>>>>>> purposes of automated testing):
>>>>>>>
>>>>>>> 1. Iterate through the keys via get_keys(), and delete each one.
>>>>>>> This is what you're currently doing, except you don't need to invoke
>>>>>>> if.exists().
>>>>>>> if.exists() makes an additional API call to Riak, and it takes twice
>>>>>>> as long as just calling delete() (and trapping a potential 404 doesn't
>>>>>>> exist error).
>>>>>>>
>>>>>>> Advantages: Easy to understand, can be done entirely in code
>>>>>>> (without invoking OS/shell commands).
>>>>>>>
>>>>>>> Disadvantages: It can get slow, for large data sets. Another subtle
>>>>>>> disadvantage is that, as your app grows, it can get difficult to keep
>>>>>>> track of which buckets you've created and need to be cleared.
>>>>>>>
>>>>>>> 2. Stop the Riak cluster, delete the riak data directory, and
>>>>>>> re-start.
>>>>>>>
>>>>>>> Advantages: Very fast, and you can be sure that you're deleting all
>>>>>>> buckets.
>>>>>>>
>>>>>>> Disadvantages: Involves invoking OS/shell commands. This is fairly
>>>>>>> easy if your Riak node is running on the same machine as your tests
>>>>>>> (and if it's a single node). To delete the data directories of a
>>>>>>> multi-node cluster, now you need to involve either a bash script that
>>>>>>> uses SSH to log in and restart, or a coordination framework like
>>>>>>> Ansible.
>>>>>>>
>>>>>>> 3. Use an in-memory back end. (And to drop all data, just restart
>>>>>>> the node(s)).
>>>>>>>
>>>>>>> Advantages: Same as #2 - fast, thorough.
>>>>>>>
>>>>>>> Disadvantages: Same as #2 (involves shell commands, potentially SSH
>>>>>>> etc). In addition, since you're likely not going to be running your
>>>>>>> production code on an in-memory back end, this method introduces a
>>>>>>> potential environmental/functional difference between your testing and
>>>>>>> production clusters.
>>>>>>>
>>>>>>> I generally use method #1 in my unit tests, and manually delete each
>>>>>>> key.
>>>>>>>
>>>>>>> Dmitri
>>>>>>>
>>>>>>> On Mon, May 19, 2014 at 8:53 AM, Paweł Królikowski <rabb...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> For testing, I'd like to be able to throw a large number of data at
>>>>>>>> Riak (100k+ entries), check how it performed, change something in the
>>>>>>>> application, run the test again. I'd like to use the same data every
>>>>>>>> time, so, I'd like to clear the bucket between every test.
>>>>>>>>
>>>>>>>> The documentation (
>>>>>>>> http://docs.basho.com/riak/2.0.0beta1/dev/references/http/) says:
>>>>>>>>
>>>>>>>> *Delete Buckets*
>>>>>>>> There is no straightforward way to delete an entire Bucket. To
>>>>>>>> delete all the keys in a bucket, you’ll need to delete them all
>>>>>>>> individually.
>>>>>>>>
>>>>>>>> So, I'm currently using something like:
>>>>>>>>
>>>>>>>> for k in r_bk.get_keys():
>>>>>>>>     v = r_bk.get(k)
>>>>>>>>     if v.exists:
>>>>>>>>         r_bk.delete(v)
>>>>>>>>
>>>>>>>> The problem is that r_bk.get_keys() returns a lot of elements that
>>>>>>>> don't exist (tombstones?) and iterating over all of them takes time.
>>>>>>>>
>>>>>>>> Is that the way it's supposed to work? Or am I missing something?
>>>>>>>>
>>>>>>>> - I'm using default delete_mode configuration (3 seconds)
>>>>>>>> - I'm using Riak 2.0 alpha 19 with Python. (there's a bug with
>>>>>>>> strong consistency in Beta1, cannot use it)
>>>>>>>> - changing the bucket name for every run seems .. impractical?
>>>>>>>>
>>>>>>>> Any advice welcome,
>>>>>>>>
>>>>>>>> --
>>>>>>>> Thanks,
>>>>>>>> Paweł
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> riak-users mailing list
>>>>>>>> riak-users@lists.basho.com
>>>>>>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

--
Sean Cribbs <s...@basho.com>
Software Engineer
Basho Technologies, Inc.
http://basho.com/