Re: Clarifying withoutFetch() with LevelDB and
Hi Daniel,

What version of the Java client are you using? Any reason you're on such an old version of Riak? What is the size of each object written?

--
Luke Bakken
Engineer
lbak...@basho.com

On Wed, May 13, 2015 at 4:48 AM, Daniel Iwan iwan.dan...@gmail.com wrote:
> Hi,
>
> I'm using a 4-node Riak cluster, v1.3.1. I'd like to know a bit more about
> the withoutFetch() option when used with LevelDB.
>
> I'm trying to write to a single key as fast as I can with n=3. I
> deliberately create siblings by writing with a stale vclock. I limit the
> number of writes per key to 1000 to keep the size of the Riak object under
> control, then switch to another key. Siblings will probably never be
> resolved (or will be resolved in real time during sporadic reads).
>
> A single write operation is about 250 bytes, at a rate of 10-80 events per
> second, which gives 3-20 kB/s per node, so roughly 100 kB/s for the
> cluster. During the test, iostat shows disk write activity of 20-30 MB/s on
> each node. Even taking multiple copies and Riak overhead (vclocks etc.)
> into account, this seems like a pretty high rate. I don't see any read
> activity, which suggests withoutFetch() works as expected. After 2 minutes
> of testing, LevelDB on each node is 250 MB in size (11 MB before the test).
>
> Am I using it incorrectly? Is writing to a single key this way a good idea,
> or will I be bitten by something? How can I explain the high number of MB
> written to disk?
>
> Regards,
> Daniel
>
> --
> View this message in context: http://riak-users.197444.n3.nabble.com/Clarifying-withoutFetch-with-LevelDB-and-tp4033051.html
> Sent from the Riak Users mailing list archive at Nabble.com.

_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
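As a back-of-envelope check of the numbers quoted above (a sketch — all constants are assumptions taken from the reported figures: 250-byte writes, up to 80 events/s, n=3, ~25 MB/s observed on each of 4 nodes), the application-level write rate is tiny compared with the observed disk traffic:

```java
// Back-of-envelope check using only the numbers reported in the email above.
// All constants are assumptions from that report, not measured values.
public class RateCheck {
    static long logicalBytesPerSec(int bytesPerEvent, int eventsPerSec, int replicas) {
        return (long) bytesPerEvent * eventsPerSec * replicas;
    }

    public static void main(String[] args) {
        long logical = logicalBytesPerSec(250, 80, 3);   // 60000 -> ~60 kB/s cluster-wide
        long observed = 25L * 1000 * 1000 * 4;           // ~25 MB/s x 4 nodes
        System.out.println("logical bytes/s: " + logical);
        System.out.println("amplification: ~" + (observed / logical) + "x");
    }
}
```

Even with replication included, that is a three-orders-of-magnitude gap between the data the application hands to Riak and what hits the disks, which is what the question is really about.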
Re: Clarifying withoutFetch() with LevelDB and
We are using Java client 1.1.4. We haven't moved to a newer version of Riak as, for the moment, we don't need any new features. Also, rolling out a new version may be complicated since we have multiple clusters.

As regards object size, it's ~250-300 bytes per write. We store simple JSON structures. Is there anything in newer versions that would limit the amount of data going to disk? And more importantly, is there a way of determining why LevelDB grows so big?

I'm using a ring size of 128, which is probably too high at the moment, but after switching to 64 not much changed. I also disabled the 2i indexes that I thought might matter (four 16-byte fields), and that did not make any difference: still 25-38 MB/s written to LevelDB per node.

D.

--
View this message in context: http://riak-users.197444.n3.nabble.com/Clarifying-withoutFetch-with-LevelDB-and-tp4033051p4033053.html
Sent from the Riak Users mailing list archive at Nabble.com.
Re: Clarifying withoutFetch() with LevelDB and
Alex,

Thanks for answering this one and pointing me in the right direction.

I did an experiment and wrote 0 bytes instead of a JSON value and got the same effect: the LevelDB folder is 80-220 MB in size, with write activity around 20 MB/s to disk and no reads from disk. The Java client reports 45 seconds for 1000 entries, so avg. 45 ms per entry.

Then I changed the code so it writes to unique keys. Astonishing difference. Very little write activity to disk (~600 kB/s per node) and the db is only 6 MB big! The Java client reports 2.5 seconds for 1000 entries! The difference is huge, both in speed and in storage.

Now, I was always under the impression that writing with a stale vclock using withoutFetch() would be the quickest way to put data into Riak. Looks like I was wrong. Would all the overhead basically be vclocks? I did not know that even if I'm using withoutFetch(), data is still read in the background (?)

Regarding the data model: I'm trying to solve a particular problem. I'm modeling a timeline in Riak and I wanted to group events into batches of 1-hour windows, so basically timeboxing. Data has to go to disk, so there's no option for me to delay the write. Once 1000 events per key is reached, the next key is selected. Keys are predictable, so I can calculate them when a read operation happens. I want to grab as many events in one read operation as possible, hence the idea of writing in a controlled way to the same key with a stale vclock.

Is there any better way to model that? Obviously the next thing I will try is resolving siblings during writes, but I hoped I could avoid/delay that until a read happens. This vclock/storage/bandwidth explosion really surprised me.

Regards,
Daniel

--
View this message in context: http://riak-users.197444.n3.nabble.com/Clarifying-withoutFetch-with-LevelDB-and-tp4033051p4033057.html
Sent from the Riak Users mailing list archive at Nabble.com.
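For what it's worth, the predictable-key timeboxing scheme described above can be sketched as a pure key-derivation function. The class name, key layout, and separators here are hypothetical illustrations, not any Riak API — only the 1-hour window and 1000-events-per-key batch size come from the email:

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical sketch of the timeboxing scheme described above: events are
// grouped into 1-hour windows, and a batch counter rolls the key over every
// 1000 events so no single Riak object grows unbounded.
public class TimelineKeys {
    static final DateTimeFormatter HOUR =
            DateTimeFormatter.ofPattern("yyyyMMdd'T'HH").withZone(ZoneOffset.UTC);

    // Key for the 1-hour window containing ts, plus a batch index so a
    // reader can enumerate a window's keys without any lookup.
    static String key(String stream, Instant ts, long eventIndexInWindow) {
        long batch = eventIndexInWindow / 1000;   // 1000 events per key
        return stream + "/" + HOUR.format(ts) + "/" + batch;
    }

    public static void main(String[] args) {
        Instant t = Instant.parse("2015-05-13T11:03:00Z");
        System.out.println(key("events", t, 0));     // events/20150513T11/0
        System.out.println(key("events", t, 2500));  // events/20150513T11/2
    }
}
```

A reader wanting an hour's worth of events would fetch batch 0, 1, 2, ... for that window until a key comes back not-found, resolving any siblings at read time.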
Re: Clarifying withoutFetch() with LevelDB and
Hey Daniel,

> I wanted to know a little bit more about the withoutFetch() option when
> used with LevelDB. I'm trying to write to a single key as fast as I can
> with n=3. I deliberately create siblings by writing with a stale vclock.
> ...
> During the test, iostat shows disk write activity of 20-30 MB/s on each
> node. Even taking multiple copies and Riak overhead (vclocks etc.) into
> account, this seems like a pretty high rate. I don't see any read activity,
> which suggests withoutFetch() works as expected. After 2 minutes of
> testing, LevelDB on each node is 250 MB in size (11 MB before the test).
>
> Am I using it incorrectly? Is writing to a single key this way a good idea,
> or will I be bitten by something? How can I explain the high number of MB
> written to disk?

We call this problem a hot key. When you write with a stale vclock, it will generate a new sibling every time. For example, the first time you store your object it's just {v1}; the next time it gets a sibling: {v1, v2}; eventually it's {v1, ..., v1000}, since the siblings are never resolved. That data is read, updated, the old version tombstoned, and the new data written with every PUT. Based on your info, I would expect to see about 250 MB of raw data there if LevelDB hasn't compacted the tombstones away.

RiakObject.withoutFetch() tells your Java client to store data without fetching the most current value first. During that fetch, it would resolve siblings before writing the value back. You may get better throughput by resolving your siblings (fewer writes overall), or by rethinking your data model so you're not always writing to the same key repeatedly.

Is this just a benchmark, or are you modeling something in your application?

Thanks,
Alex

On Wed, May 13, 2015 at 11:03 AM, Daniel Iwan iwan.dan...@gmail.com wrote:
> We are using Java client 1.1.4. We haven't moved to a newer version of Riak
> as, for the moment, we don't need any new features. Also, rolling out a new
> version may be complicated since we have multiple clusters.
> As regards object size, it's ~250-300 bytes per write. We store simple JSON
> structures. Is there anything in newer versions that would limit the amount
> of data going to disk? And more importantly, is there a way of determining
> why LevelDB grows so big?
>
> I'm using a ring size of 128, which is probably too high at the moment, but
> after switching to 64 not much changed. I also disabled the 2i indexes that
> I thought might matter (four 16-byte fields), and that did not make any
> difference: still 25-38 MB/s written to LevelDB per node.
>
> D.
>
> --
> View this message in context: http://riak-users.197444.n3.nabble.com/Clarifying-withoutFetch-with-LevelDB-and-tp4033051p4033053.html
> Sent from the Riak Users mailing list archive at Nabble.com.
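The hot-key read-modify-write cycle Alex describes explains the numbers in the thread: each stale-vclock PUT rewrites the whole accumulated object, which now carries one more sibling, so total bytes written grow quadratically with the write count. A minimal simulation of that growth (a sketch that ignores vclock, key, and LevelDB overhead):

```java
// Minimal model of the hot-key write amplification described above: every
// stale-vclock PUT appends one unresolved sibling and rewrites the whole
// object, so cumulative bytes written grow quadratically in the write count.
public class SiblingGrowth {
    // Total bytes written to one replica after n PUTs of valueBytes-sized values.
    static long totalBytesWritten(int n, int valueBytes) {
        long total = 0;
        long objectSize = 0;
        for (int i = 0; i < n; i++) {
            objectSize += valueBytes;  // one more sibling accumulates
            total += objectSize;       // the whole object is rewritten
        }
        return total;
    }

    public static void main(String[] args) {
        long perReplica = totalBytesWritten(1000, 250); // 1000 writes of ~250 B
        System.out.println(perReplica);       // 125125000 -> ~125 MB per replica
        System.out.println(perReplica * 3);   // ~375 MB across n=3 replicas
    }
}
```

That is the right order of magnitude for the 250 MB LevelDB growth and the 20-30 MB/s per-node write rates reported at the top of the thread, before compaction reclaims anything.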