Re: Riak LevelDB Deletion Problem
Timo,

After one hour it spat out the data we needed. For the record:

    curl YOURIP:8098/buckets/YOURBUCKET/index/\$bucket/_

Cheers,
Antonio

2015-07-15 21:51 GMT+01:00 Timo Gatsonides t...@me.com:

From: Antonio Teixeira eagle.anto...@gmail.com
To: Matthew Von-Maszewski matth...@basho.com
Cc: riak-users riak-users@lists.basho.com
Subject: Re: Riak LevelDB Deletion Problem

Hello Matthew,

Space is freeing up slowly, but now we are faced with another problem: a lot of applications store data on this node and we don't have the bucket keys (they are uuid4). We are using listkeys (I know it's bad) on the Erlang client, and we also tried curl using both the blocking and streaming methods. They all return {error, timeout}. We are 95% sure that not all data has been migrated, so is there any way to get the keys of the bucket, even if we have to shut down the node (uptime/availability is not important for us)? We are currently looking at MapReduce?!

What are you using to list keys? Are you using secondary indexes? You can get all the keys for a bucket using the $bucket index, or do a range scan on $key. See http://docs.basho.com/riak/latest/dev/using/2i/ . If your cluster is healthy that should return the keys without throwing a timeout error.

Kind regards,
Timo
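For anyone following along in the Erlang client rather than curl, the same $bucket listing can be done with the 2i helpers in riakc_pb_socket. This is only a sketch under assumptions: host, port and bucket name are placeholders, the leveldb backend is in use (2i is not available on bitcask), and the match value for the $bucket index is ignored, so "_" mirrors the curl URL.

    %% Connect over protocol buffers (host/port are placeholders for your node).
    {ok, Pid} = riakc_pb_socket:start_link("127.0.0.1", 8087),

    %% List every key in the bucket via the special $bucket index
    %% (the PB equivalent of: curl HOST:8098/buckets/YOURBUCKET/index/$bucket/_).
    {ok, Results} = riakc_pb_socket:get_index_eq(Pid, <<"YOURBUCKET">>,
                                                 <<"$bucket">>, <<"_">>),

    %% Results is an #index_results_v1{} record (defined in riakc.hrl);
    %% its keys field holds the matching keys.
    Keys = Results#index_results_v1.keys,
    io:format("found ~p keys~n", [length(Keys)]).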
Re: Riak LevelDB Deletion Problem
Hello Matthew,

Space is freeing up slowly, but now we are faced with another problem: a lot of applications store data on this node and we don't have the bucket keys (they are uuid4). We are using listkeys (I know it's bad) on the Erlang client, and we also tried curl using both the blocking and streaming methods. They all return {error, timeout}. We are 95% sure that not all data has been migrated, so is there any way to get the keys of the bucket, even if we have to shut down the node (uptime/availability is not important for us)? We are currently looking at MapReduce?!

Thanks for all the patience,
Antonio

2015-07-14 22:28 GMT+01:00 Matthew Von-Maszewski matth...@basho.com:

Antonio,

Someone reminded me that you could make temporary space on your servers by deactivating active_anti_entropy, then deleting its data. Of course, this assumes you are running "anti_entropy = active" in your riak.conf file. I will send you some better notes if you think this is worth researching. Let me know your thoughts.

Matthew

On Jul 14, 2015, at 4:21 PM, Antonio Teixeira eagle.anto...@gmail.com wrote:

Ok Matthew,

We will proceed with the deletion, monitor the disk space, and come back to the list with further reports.

Thanks for your time,
Antonio

2015-07-14 18:32 GMT+01:00 Matthew Von-Maszewski matth...@basho.com:

Antonio,

A Riak delete operation happens in these steps:

- Riak writes a "tombstone" value for the key to the N vnodes that contain it (this is a new record)
- Riak, by default, waits 3 seconds to verify all vnodes agree to the tombstone/delete
- Riak issues an actual delete operation against the key to leveldb
- leveldb creates its own tombstone
- the leveldb tombstone "floats" through level-0 and level-1 as part of normal compactions
- upon reaching level-2, leveldb will initiate immediate compaction and propagation of tombstones in .sst table files containing 1000 or more tombstones; this is when disk space recovery begins

Yes, this means that initially the leveldb vnodes will grow in size until enough stuff (new data and/or tombstones) forces the tombstones to level-2 via normal compaction operations. "Enough stuff" to fill levels 0 and 1 is about 4.2 GB of compressed Riak objects.

The "get" operation you mentioned is something that happens internally. A manual "get" by your code will not influence the operation.

Matthew

On Jul 14, 2015, at 1:01 PM, Antonio Teixeira eagle.anto...@gmail.com wrote:

Hi Matthew,

We will be removing close to 1 TB of data from the node, and since we are short on disk space, we halted the data removal when we saw that disk usage was actually rising. Now, according to some docs I have read, if we do a .get() a few seconds after a deletion, it forces the release of the disk space. Is this true? It's not possible for us to move data to another node. Is there any way, even a manual one, to release the space, or at least to force this step: "The actual release occurs significantly later (days, weeks, or even months later) when the tombstone record merges with the actual data in a background compaction."

Thanks for all the help,
Antonio

2015-07-14 17:43 GMT+01:00 Matthew Von-Maszewski matth...@basho.com:

Antonio,

Here is a detailed discussion of the Riak / leveldb delete scenario: https://github.com/basho/leveldb/wiki/mv-aggressive-delete

Pay close attention to the section titled "Update April 6, 2014". This explains why as much as 4.2 GB per vnode might remain within leveldb after deleting all keys.
There is no mechanism to override the logic that causes the disk space retention. One workaround is to use Riak's handoff mechanism to transfer vnodes from one physical server to another. The vnode transfer will remove all deletion tombstones on the destination. The last step of the transfer then deletes all leveldb files on the original server, recovering the space.

Matthew

On Jul 14, 2015, at 12:32 PM, Antonio Teixeira eagle.anto...@gmail.com wrote:

Hello,

We have been migrating our Riak database to another infrastructure through a streaming process, and right now we should have somewhere around 2 GB of free space on the hard disk; however, those 2 GB are still being used by Riak. After some research, I believe the problem is that the objects are only being marked for deletion and not actually deleted at runtime. What we need is a way to aggressively delete those keys, or some way to force Riak to delete the marked keys and subsequently release the disk space. The Riak version we are using is v2.0.2.
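For readers who want to try Matthew's temporary-space suggestion, the sketch below shows the relevant riak.conf setting and where the AAE data usually lives. Treat it as an assumption-laden sketch: the config path and data directory are the Debian/Ubuntu package defaults and may differ on your install, and Matthew only proposed this as a stopgap.

    # Sketch only, assuming stock riak.conf and Debian/Ubuntu package paths.
    # 1. In /etc/riak/riak.conf, switch active anti-entropy off
    #    (the default is "active"):
    #        anti_entropy = passive
    # 2. Restart the node, then remove the AAE hash trees to reclaim space:
    riak stop
    rm -rf /var/lib/riak/anti_entropy
    riak start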
Re: Riak LevelDB Deletion Problem
Hi Matthew,

We will be removing close to 1 TB of data from the node, and since we are short on disk space, we halted the data removal when we saw that disk usage was actually rising. Now, according to some docs I have read, if we do a .get() a few seconds after a deletion, it forces the release of the disk space. Is this true? It's not possible for us to move data to another node. Is there any way, even a manual one, to release the space, or at least to force this step: "The actual release occurs significantly later (days, weeks, or even months later) when the tombstone record merges with the actual data in a background compaction."

Thanks for all the help,
Antonio

2015-07-14 17:43 GMT+01:00 Matthew Von-Maszewski matth...@basho.com:

Antonio,

Here is a detailed discussion of the Riak / leveldb delete scenario: https://github.com/basho/leveldb/wiki/mv-aggressive-delete

Pay close attention to the section titled "Update April 6, 2014". This explains why as much as 4.2 GB per vnode might remain within leveldb after deleting all keys.

There is no mechanism to override the logic that causes the disk space retention. One workaround is to use Riak's handoff mechanism to transfer vnodes from one physical server to another. The vnode transfer will remove all deletion tombstones on the destination. The last step of the transfer then deletes all leveldb files on the original server, recovering the space.

Matthew

On Jul 14, 2015, at 12:32 PM, Antonio Teixeira eagle.anto...@gmail.com wrote:

Hello,

We have been migrating our Riak database to another infrastructure through a streaming process, and right now we should have somewhere around 2 GB of free space on the hard disk; however, those 2 GB are still being used by Riak. After some research, I believe the problem is that the objects are only being marked for deletion and not actually deleted at runtime. What we need is a way to aggressively delete those keys, or some way to force Riak to delete the marked keys and subsequently release the disk space. The Riak version we are using is v2.0.2.
Riak LevelDB Deletion Problem
Hello,

We have been migrating our Riak database to another infrastructure through a streaming process, and right now we should have somewhere around 2 GB of free space on the hard disk; however, those 2 GB are still being used by Riak. After some research, I believe the problem is that the objects are only being marked for deletion and not actually deleted at runtime. What we need is a way to aggressively delete those keys, or some way to force Riak to delete the marked keys and subsequently release the disk space. The Riak version we are using is v2.0.2.
Re: Riak LevelDB Deletion Problem
Ok Matthew,

We will proceed with the deletion, monitor the disk space, and come back to the list with further reports.

Thanks for your time,
Antonio

2015-07-14 18:32 GMT+01:00 Matthew Von-Maszewski matth...@basho.com:

Antonio,

A Riak delete operation happens in these steps:

- Riak writes a "tombstone" value for the key to the N vnodes that contain it (this is a new record)
- Riak, by default, waits 3 seconds to verify all vnodes agree to the tombstone/delete
- Riak issues an actual delete operation against the key to leveldb
- leveldb creates its own tombstone
- the leveldb tombstone "floats" through level-0 and level-1 as part of normal compactions
- upon reaching level-2, leveldb will initiate immediate compaction and propagation of tombstones in .sst table files containing 1000 or more tombstones; this is when disk space recovery begins

Yes, this means that initially the leveldb vnodes will grow in size until enough stuff (new data and/or tombstones) forces the tombstones to level-2 via normal compaction operations. "Enough stuff" to fill levels 0 and 1 is about 4.2 GB of compressed Riak objects.

The "get" operation you mentioned is something that happens internally. A manual "get" by your code will not influence the operation.

Matthew

On Jul 14, 2015, at 1:01 PM, Antonio Teixeira eagle.anto...@gmail.com wrote:

Hi Matthew,

We will be removing close to 1 TB of data from the node, and since we are short on disk space, we halted the data removal when we saw that disk usage was actually rising. Now, according to some docs I have read, if we do a .get() a few seconds after a deletion, it forces the release of the disk space. Is this true? It's not possible for us to move data to another node. Is there any way, even a manual one, to release the space, or at least to force this step: "The actual release occurs significantly later (days, weeks, or even months later) when the tombstone record merges with the actual data in a background compaction."

Thanks for all the help,
Antonio

2015-07-14 17:43 GMT+01:00 Matthew Von-Maszewski matth...@basho.com:

Antonio,

Here is a detailed discussion of the Riak / leveldb delete scenario: https://github.com/basho/leveldb/wiki/mv-aggressive-delete

Pay close attention to the section titled "Update April 6, 2014". This explains why as much as 4.2 GB per vnode might remain within leveldb after deleting all keys.

There is no mechanism to override the logic that causes the disk space retention. One workaround is to use Riak's handoff mechanism to transfer vnodes from one physical server to another. The vnode transfer will remove all deletion tombstones on the destination. The last step of the transfer then deletes all leveldb files on the original server, recovering the space.

Matthew

On Jul 14, 2015, at 12:32 PM, Antonio Teixeira eagle.anto...@gmail.com wrote:

Hello,

We have been migrating our Riak database to another infrastructure through a streaming process, and right now we should have somewhere around 2 GB of free space on the hard disk; however, those 2 GB are still being used by Riak. After some research, I believe the problem is that the objects are only being marked for deletion and not actually deleted at runtime. What we need is a way to aggressively delete those keys, or some way to force Riak to delete the marked keys and subsequently release the disk space.
The Riak version we are using is v2.0.2.
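To tie Matthew's walk-through back to the client API, here is a minimal sketch of what the application actually sees during a delete. Bucket, key and connection details are placeholders; the point is that only the first step of the sequence above is visible to the client, and a follow-up get does nothing to accelerate disk space recovery.

    %% Sketch only; bucket, key, host and port are placeholders.
    {ok, Pid} = riakc_pb_socket:start_link("127.0.0.1", 8087),

    %% This triggers the first step above (writing the Riak tombstone);
    %% everything after that happens inside Riak and leveldb.
    ok = riakc_pb_socket:delete(Pid, <<"mybucket">>, <<"somekey">>),

    %% A later get returns notfound almost immediately, but, as Matthew notes,
    %% issuing it manually does not influence when leveldb frees the space.
    {error, notfound} = riakc_pb_socket:get(Pid, <<"mybucket">>, <<"somekey">>).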
Re: Riak Search 2.0 Tagging
Hi Zee,

Sorry for the extremely late response, and I would like to say I'm very thankful for your time.

Antonio

2015-04-07 18:29 GMT+01:00 Zeeshan Lakhani zlakh...@basho.com:

Hello Antonio,

Firstly, please always reply to the list and not via personal email. In regards to your question, I wrote a test that showcases how to write objects with tags in PB for search: https://github.com/basho/yokozuna/pull/479/files#diff-f9f0e102b2a5208f41b2f304ada0ee5cR306. It's a bit of a workaround, needing the 'x-riak-meta*' prefix, which gets stripped in preparation for query, but it does work. Please make sure you've also created an index and associated it with a bucket_type/bucket, i.e. http://docs.basho.com/riak/latest/dev/using/search/#Simple-Setup. To query, it'd be the same as in the docs (field:*query*): http://docs.basho.com/riak/latest/dev/using/search/#Querying.

Thanks.

Zeeshan Lakhani
programmer | software engineer at @basho | org. member/founder of @papers_we_love | paperswelove.org
twitter = @zeeshanlakhani

On Apr 7, 2015, at 4:28 AM, Antonio Teixeira eagle.anto...@gmail.com wrote:

Hi,

I've been experimenting with Riak Search and tags, but I've reached a dead end. I can't find any documentation on querying tags. I have followed your example at https://github.com/basho/riak-erlang-client/blob/7487c90275c88dbe8ef4c2fed6540864364ca3d4/src/riakc_pb_socket.erl#L3348

Here is my code:

    O0 = riakc_obj:new(<<"test">>, <<"key0">>, <<"value0">>),
    MD0 = riakc_obj:get_update_metadata(O0),
    MD1 = riakc_obj:set_user_metadata_entry(MD0, {<<"ola_s">>, <<"nuno">>}),
    O1 = riakc_obj:update_metadata(O0, MD1),
    ok = riakc_pb_socket:put(Pid, O1).

My question is: how do I query this data without knowing the key of the object?

Regards,
Antonio

2015-04-02 16:10 GMT+01:00 Zeeshan Lakhani zlakh...@basho.com:

Hello Antonio,

You can insert an object and tag it in the same operation, and you can query on that tag via Riak Search. Before writing the object, just apply/set the metadata accordingly. Here's an example in the Erlang client: https://github.com/basho/riak-erlang-client/blob/7487c90275c88dbe8ef4c2fed6540864364ca3d4/src/riakc_pb_socket.erl#L3348 . Also, this can be done via HTTP: https://github.com/basho/yokozuna/blob/5868266b11f131d14c85495e50f899f3fe8158ba/riak_test/yokozuna_essential.erl#L281 .

Thanks.

Zeeshan Lakhani
programmer | software engineer at @basho | org. member/founder of @papers_we_love | paperswelove.org
twitter = @zeeshanlakhani

On Apr 2, 2015, at 5:24 AM, Antonio Teixeira eagle.anto...@gmail.com wrote:

I've been using Riak as my main database for a few months, and now I've been experimenting with Riak Search 2.0. From what I read in your documentation, there is no way to insert an object and tag it in the same operation. Right now we have an opaque object and we query it through secondary indexes, and this is becoming unbearable. What we need is to store an object and tag it (https://github.com/basho/yokozuna/blob/develop/docs/TAGGING.md) all in one step. Our objects consist of Erlang dictionaries, and it would be a bit expensive (performance-wise) to convert the dictionary to a list and then to JSON in every database-related operation. We are using Riak 2.0.5, Erlang 17.0 and the PB driver.
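To make Zeeshan's workaround concrete, here is a hedged sketch of the tag-and-query flow over PB, patterned on the yokozuna tagging docs and the test he links. The index name ("famous"), bucket, key and tag field are placeholders, it assumes a search index already exists and is associated with the bucket, and it assumes the default schema (where a *_s dynamic field is indexed as a string).

    %% Sketch only; index, bucket, key and field names are placeholders.
    {ok, Pid} = riakc_pb_socket:start_link("127.0.0.1", 8087),

    O0  = riakc_obj:new(<<"test">>, <<"key0">>, <<"value0">>, "text/plain"),
    MD0 = riakc_obj:get_update_metadata(O0),
    %% x-riak-meta-yz-tags names which metadata entries Yokozuna indexes as tags;
    %% the x-riak-meta- prefix is stripped from the indexed field name.
    MD1 = riakc_obj:set_user_metadata_entry(
            MD0, {<<"x-riak-meta-yz-tags">>, <<"x-riak-meta-ola_s">>}),
    MD2 = riakc_obj:set_user_metadata_entry(
            MD1, {<<"x-riak-meta-ola_s">>, <<"nuno">>}),
    O1  = riakc_obj:update_metadata(O0, MD2),
    ok  = riakc_pb_socket:put(Pid, O1),

    %% Query by tag field, no key required (field:query, as in the docs).
    {ok, Results} = riakc_pb_socket:search(Pid, <<"famous">>, <<"ola_s:nuno">>).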
Riak Search 2.0 Tagging
Hello,

I've been using Riak as my main database for a few months, and now I've been experimenting with Riak Search 2.0. From what I read in your documentation, there is no way to insert an object and tag it in the same operation. Right now we have an opaque object and we query it through secondary indexes, and this is becoming unbearable. What we need is to store an object and tag it (https://github.com/basho/yokozuna/blob/develop/docs/TAGGING.md) all in one step. Our objects consist of Erlang dictionaries, and it would be a bit expensive (performance-wise) to convert the dictionary to a list and then to JSON in every database-related operation. We are using Riak 2.0.5, Erlang 17.0 and the PB driver.

Thank you,
Antonio
Re: Riak Corruption
Hi there,

The following started appearing in our logs every second: http://pastebin.com/Lgaqw2Wu

I grabbed the manual and found this: http://docs.basho.com/riak/latest/ops/running/recovery/repairing-partitions/

Will this cause any data corruption/loss?

Running version:
Riak: riak_2.0.2-1_amd64.deb
OS: Ubuntu 14.04.1 LTS (trusty)

2015-02-12 9:13 GMT+00:00 Antonio Teixeira eagle.anto...@gmail.com:

Hi there,

The following started appearing in our logs every second: http://pastebin.com/Lgaqw2Wu

I grabbed the manual and found this:
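For reference, the repair procedure on that docs page is driven from the node's Erlang console (riak attach); it rebuilds a partition's data from adjacent replicas, but check the linked page for the caveats before running it. This is a sketch only: the node name is a placeholder for your own, and progress can be watched afterwards with riak-admin transfers.

    %% Inside `riak attach` on the affected node; node name is a placeholder.
    {ok, Ring} = riak_core_ring_manager:get_my_ring().
    %% Collect the partitions owned by this node.
    Partitions = [P || {P, Owner} <- riak_core_ring:all_owners(Ring),
                       Owner =:= 'riak@your-node-hostname'].
    %% Queue a KV repair for each owned partition.
    [riak_kv_vnode:repair(P) || P <- Partitions].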
Riak Corruption
Hi there,

The following started appearing in our logs every second: http://pastebin.com/Lgaqw2Wu

I grabbed the manual and found this: