First of all, thank you for your reply.

Well, if you tell me that I can beat myself up trying to get what another
person gets in a benchmark, then I don't understand the point of posting
your results here. I thought you were trying to tell me that with a
similar setup you could do much better, and that I probably had some
problem in my setup (and you were right). Considering how different our
results are, I imagine that something is wrong; by "wrong" I mean: I have
a setup problem, my hardware is not up to the task, our environments are
not that similar, etc. That's what I'm trying to figure out, and I'm also
trying to get to know the tool.

I have a few different scenarios to face: heavy write, heavy read, a mix
of both, etc. For now I'm focusing on the heavy-write scenario, and I'll
deal with the others later. Jumping from one scenario to another without
at least a minimally solid conclusion won't help me.
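
For reference, a write-heavy run like mine can be described with a
basho_bench config along these lines (just a sketch; the duration,
concurrency, key count, and node IPs below are placeholders, not my
exact values):

  %% write-heavy basho_bench config (sketch, placeholder values)
  {mode, max}.                  % push as many ops as the cluster takes
  {duration, 10}.               % minutes
  {concurrent, 10}.             % worker processes
  {driver, basho_bench_driver_riakc_pb}.
  {riakc_pb_ips, [{192,168,0,1}, {192,168,0,2}]}.
  {key_generator, {int_to_bin, {uniform_int, 10000000}}}.
  {value_generator, {fixed_bin, 1000}}.    % 1 KB objects
  {operations, [{put, 1}]}.     % writes only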

Based on what you said, I replaced the machine that was generating the
load with a new one identical to the cluster machines (Intel Core i3
2.3GHz, 4GB RAM, 1TB HD). Now I have 3 machines with the same setup on a
gigabit network.

I started with the bitcask backend:
https://dl.dropbox.com/u/308392/sum_bit.png

Then I tried the memory backend:
https://dl.dropbox.com/u/308392/sum_mem.png

Now we can see a major impact on the results. The memory backend did much
better, at roughly 4000 ops/sec; bitcask not so much, only about 200
ops/sec more than the previous result. Replacing the third machine really
worked.
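
In case it helps anyone reproducing this, switching backends is a
one-line change in each node's app.config (a sketch; the riak_kv section
carries other settings that I'm omitting, and each node needs a restart
after the change):

  %% app.config excerpt on each cluster node
  {riak_kv, [
      {storage_backend, riak_kv_bitcask_backend}
      %% for the memory run, use riak_kv_memory_backend instead
  ]}.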

Just to see what would happen, I decided to run the benchmark locally (of
course, knowing it would put a heavy load on the CPU). I ran the test from
one of the cluster machines.
Results:

Bitcask:
https://dl.dropbox.com/u/308392/sum_bit_local.png

Memory:
https://dl.dropbox.com/u/308392/sum_mem_local.png
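
The only config change for the local runs was pointing the load generator
at the node it was running on (a sketch, assuming the default protocol
buffers port):

  %% same basho_bench config as before, targeting the local node
  {riakc_pb_ips, [{127,0,0,1}]}.
  {riakc_pb_port, 8087}.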

Well, it seems that my bottleneck is disk-related.
Once again, thank you.
You helped me a lot.


On Mon, Nov 5, 2012 at 4:36 PM, Jared Morrow <ja...@basho.com> wrote:

> So if you are getting close to similar numbers with the memory and bitcask
> backends, you know that file I/O isn't your bottleneck in the load test. My
> guess is that the machine you are generating the load with (where you are
> running basho_bench) can't keep up with the test. I'm running basho_bench
> on a Core i7 on the same switch as the Riak nodes, so that could be enough
> of the difference. Again, though, you can beat yourself up trying to get
> exactly what another person gets in a benchmark, so it really comes down to
> what you want out of it. What are your app's requirements? What do you
> expect the requirements to be a year from now? What do your access patterns
> look like, more read-heavy or more write-heavy?
>
> Right now, you are only doing bulk-load experiments; what about a
> Write/Read/Update mix that simulates your expected usage pattern? What we
> have done in the past as a good test is to do a bulk load like you are
> doing, with say 10 or 100 million keys. That'll make sure Riak is loaded
> with a real-world amount of data to start. Then run another
> pareto-distribution run (using something like {key_generator,
> {int_to_bin, {pareto_int, 10000000}}}) with a spread of puts/gets/updates
> like {operations, [{get, 4}, {put, 1}, {update, 1}]}. I'll tell you right
> now, with a two-node cluster you probably aren't going to be happy with
> the results, as that is not how Riak was designed to work at its best. If
> you can swing it, add two more nodes to the test, set N=2 instead of N=1
> so you are getting at least some data safety, and run the above. If it
> meets the needs of your app, awesome. If it doesn't, come back and
> discuss it or try another solution with the same requirements.
>
> -Jared
>
>
> <snip'd the history to pass through the mailing-list size requirement>
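
Putting Jared's two snippets together, the follow-up run he describes
would look roughly like this as a complete config (a sketch; everything
besides the key_generator and operations lines is an assumption on my
part):

  %% mixed-workload run against the pre-loaded data set (sketch)
  {mode, max}.
  {duration, 30}.               % minutes
  {concurrent, 10}.
  {driver, basho_bench_driver_riakc_pb}.
  {riakc_pb_ips, [{192,168,0,1}, {192,168,0,2}]}.
  {key_generator, {int_to_bin, {pareto_int, 10000000}}}.
  {value_generator, {fixed_bin, 1000}}.
  {operations, [{get, 4}, {put, 1}, {update, 1}]}.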