First of all, thank you for your reply. Well, if you're telling me that I can beat myself up trying to match what another person gets in a benchmark, then I don't understand the point of posting your results here. I thought you were trying to tell me that in a similar setup you could do much better, and that I probably had some problem with my setup (and you were right). Considering how different our results are, I imagine something is wrong. By "wrong" I mean: I have a setup problem; my hardware is not up to the task and our environments are not that similar; etc. That's what I'm trying to figure out, and I'm also trying to get to know the tool.
I have a few different scenarios to face: heavy write, heavy read, a mix of both, etc. Right now I'm considering the heavy-write scenario, and later I'll deal with the others. If I jump from one scenario to another without at least a minimal solid conclusion, it won't help me.

Based on what you said, I replaced the machine that was producing the load with a new one identical to the cluster machines (Intel Core i3 2.3 GHz, 4 GB RAM, 1 TB HD). Now I have three machines with the same setup on a gigabit network.

I started with the bitcask backend: https://dl.dropbox.com/u/308392/sum_bit.png

Then I tried the memory backend: https://dl.dropbox.com/u/308392/sum_mem.png

Now we can see a major impact on the results. The memory backend did much better, at roughly 4000 ops/sec, but bitcask not so much: about 200 ops/sec more than the last result. Changing the third machine really worked.

Just to see what would happen, I decided to run the benchmark locally (of course, I knew it would put a heavy load on the CPU). I ran the test from one of the cluster machines. Results:

Bitcask: https://dl.dropbox.com/u/308392/sum_bit_local.png
Memory: https://dl.dropbox.com/u/308392/sum_mem_local.png

Well, it seems that my bottleneck is related to my disks. Once again, thank you. You helped me a lot.

On Mon, Nov 5, 2012 at 4:36 PM, Jared Morrow <ja...@basho.com> wrote:
> So if you are getting close to similar numbers with the memory and bitcask
> backends, you know that file IO isn't your bottleneck in the load test. My
> guess is that the machine you are running the load with (where you are
> running basho_bench) can't keep up with the test. I'm running basho_bench
> on a core i7 on the same switch as the riak nodes, so that could be enough
> of the difference. Again though, you can beat yourself up trying to just
> get what another person gets in a benchmark, so it really comes down to
> what you want out of it. What are your app's requirements? What do you
> expect to be the requirements a year from now?
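For reference, a write-only bulk-load run like the one above is typically driven by a basho_bench config file along these lines. The IPs, worker count, key space, and value size here are placeholders for illustration, not my actual settings:

```erlang
%% Sketch of a write-heavy basho_bench config (placeholder values,
%% not my real file). Run with: ./basho_bench write_heavy.config
{mode, max}.                          % generate load as fast as possible
{duration, 10}.                       % run for 10 minutes
{concurrent, 8}.                      % number of concurrent workers
{driver, basho_bench_driver_riakc_pb}.
{riakc_pb_ips, [{192,168,1,10}, {192,168,1,11}]}.  % placeholder node IPs
{key_generator, {int_to_bin, {uniform_int, 10000000}}}.
{value_generator, {fixed_bin, 1000}}. % 1 KB values, an assumed size
{operations, [{put, 1}]}.             % 100% puts for the heavy-write case
```

With `{operations, [{put, 1}]}` the workload is pure writes, which is what makes the bitcask-vs-memory comparison isolate disk I/O.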
> What do your access patterns
> look like, more read heavy or more write heavy?
>
> Right now, you are only doing bulk load experiments, what about
> Write/Read/Update that simulate your expected usage pattern? What we have
> done in the past as a good test is to do a bulk load like you are doing
> with say 10 or 100 million keys. That'll make sure Riak is loaded with a
> real world amount of data to start. Then run another pareto distribution
> run (using something like {key_generator, {int_to_bin, {pareto_int,
> 10000000}}}) with a spread of puts/gets/updates like {operations, [{get,
> 4},{put, 1},{update, 1}]}. I'll tell you right now with a two node
> cluster you probably aren't going to be happy with the results as that is
> not how Riak was designed to work at its best. If you can swing it, add two
> more nodes to the test, set N=2 instead of N=1 so you are getting at least
> some data safety and run the above. If it meets the needs of your app,
> awesome. If it doesn't, come back and discuss it or try another solution
> with the same requirements.
>
> -Jared
>
> <snip'd the history to pass through the mailing-list size requirement>
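Putting Jared's suggestion into a full config, the mixed run might look like the sketch below. Only the key_generator and operations terms come from his mail; everything else (IPs, worker count, value size, duration) is my own placeholder:

```erlang
%% Mixed read/write/update sketch based on Jared's advice.
%% Only key_generator and operations are from his mail; the
%% surrounding terms are assumed placeholders.
{mode, max}.
{duration, 30}.
{concurrent, 8}.
{driver, basho_bench_driver_riakc_pb}.
%% Four nodes, as he recommends (placeholder IPs):
{riakc_pb_ips, [{192,168,1,10}, {192,168,1,11},
                {192,168,1,12}, {192,168,1,13}]}.
%% Pareto-skewed access over the 10M keys loaded in the bulk run:
{key_generator, {int_to_bin, {pareto_int, 10000000}}}.
{value_generator, {fixed_bin, 1000}}.
%% 4 gets for every put and every update:
{operations, [{get, 4}, {put, 1}, {update, 1}]}.
```

The pareto key generator concentrates most operations on a hot subset of the keyspace, which is closer to real access patterns than the uniform distribution used during the bulk load.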
_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com