The high-end numbers are due to pipelining responses, i.e. ASCII multiget,
which reduces the number of syscalls. You can see how the tests were run via
the links to the source code in the blog.
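To make the multiget point concrete, here is a minimal sketch in Python. It only builds the request bytes for the ASCII protocol ("get <key> <key> ...\r\n"); the helper name build_get and the key names are made up for illustration, and this is not how the blog's tests were actually driven.

```python
def build_get(keys):
    """Build one ASCII-protocol 'get' request covering every key in `keys`."""
    if not keys:
        raise ValueError("at least one key is required")
    # ASCII protocol: "get <key1> <key2> ...\r\n"
    return b"get " + b" ".join(k.encode("ascii") for k in keys) + b"\r\n"


keys = ["key1", "key2", "key3"]  # hypothetical key names

# One request per key: three separate writes on the socket.
separate = [build_get([k]) for k in keys]

# One multiget: a single write carries all three keys.
batched = build_get(keys)
```

Sending `batched` takes one write where `separate` takes three, which is roughly where the syscall savings come from.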
I was running some pure get tests on a dual 8-core machine yesterday with
memcached pinned to one NUMA node. Without
Thanks for the article link. That is some comprehensive benchmarking.
Compared to the article's numbers, my latency numbers are sane enough. I hit
~120 us, while you get similar numbers at the 99th percentile.
However, my throughput numbers seem to be wrong. I hit a throughput
knee point at 500K
It'll depend on your hardware/test/etc.
https://memcached.org/blog/persistent-memory/ - a thorough performance
test with some higher-end numbers on both throughput and latency, along
with 50th/90th/95th/99th/etc. percentiles and latency point clouds for each
sub-test. That was a big machine though.
Hi,
Thanks for the help!
After some trial-and-error configs, I found that the concurrency
parameter used in memaslap was the culprit.
In my configs I was using a constant 16 as the concurrency input. Scaling
the value along with the thread count gave me sane numbers.
The average latency is 120
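For anyone who hits the same thing, a rough sketch of scaling the concurrency with the thread count when building the memaslap command line. The helper name memaslap_args and the per-thread multiplier of 16 are assumptions for illustration, not the values used in the test above; -s, -T, -c and -t are the memaslap options for servers, threads, concurrency and run time.

```python
def memaslap_args(servers, threads, conns_per_thread=16, seconds=60):
    """Return a memaslap argv whose concurrency grows with the thread count.

    conns_per_thread is a hypothetical multiplier, not a recommended value.
    """
    if threads < 1 or conns_per_thread < 1:
        raise ValueError("threads and conns_per_thread must be >= 1")
    concurrency = threads * conns_per_thread  # always a multiple of threads
    return [
        "memaslap",
        "-s", servers,           # e.g. "127.0.0.1:11211"
        "-T", str(threads),      # worker threads
        "-c", str(concurrency),  # total simulated concurrency
        "-t", f"{seconds}s",     # run time, with the 's' suffix for seconds
    ]


args = memaslap_args("127.0.0.1:11211", threads=4)
```

With a constant -c 16, each thread gets fewer connections as the thread count goes up; the sketch keeps the per-thread load fixed instead.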
Hey,
Sorry; I'm not going to have any other major insights :) I'd have to sit
here playing 20 questions to figure out your test setup. If you're running
memaslap from another box, that one needs to be CPU-pinned as well. If
it's a VM, the governor/etc might not even matter.
Also I don't use
Hi Dormando,
That is great insight!
However, it did not solve the problem. I disabled turbo, as per your
instructions.
I even set the CPU to operate at maximum performance with
> cpupower frequency-set --governor performance (I verified this by
monitoring the CPU frequency)
Still the same
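One way to double-check the governor on every core is to read the cpufreq entries in sysfs. This is a small sketch, assuming a Linux box that exposes /sys/devices/system/cpu/cpuN/cpufreq/scaling_governor (a VM may not); the function names are made up for illustration.

```python
from pathlib import Path


def read_governors(sysfs_root="/sys/devices/system/cpu"):
    """Map each cpuN directory name to its current scaling governor."""
    governors = {}
    pattern = "cpu[0-9]*/cpufreq/scaling_governor"
    for path in sorted(Path(sysfs_root).glob(pattern)):
        cpu = path.parent.parent.name  # e.g. "cpu0"
        governors[cpu] = path.read_text().strip()
    return governors


def not_performance(governors):
    """Return the CPUs whose governor is anything other than 'performance'."""
    return sorted(cpu for cpu, gov in governors.items() if gov != "performance")


if __name__ == "__main__":
    print(not_performance(read_governors()))
```

An empty list means every core that exposes cpufreq reports the performance governor; an empty result from read_governors() means the host does not expose cpufreq at all.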
Hi,
First, as an aside: a 1/1 get/set ratio is unusual for mc. Gets scale a
lot better than sets. If you get into testing more "realistic" perf
numbers, make sure to increase the get rate.
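As a sketch of what raising the get rate could look like, the snippet below writes out a memaslap config (the file passed with -F) where cmd type 0 is set and 1 is get, as I understand the memaslap config format. The 0.9 get ratio and the key/value sizes are placeholder assumptions, not recommended values, and memaslap_config is a made-up helper name.

```python
def memaslap_config(get_ratio, key_len=64, value_len=1024):
    """Render a memaslap config with the given get proportion.

    key_len and value_len are hypothetical fixed sizes in bytes.
    """
    if not 0.0 <= get_ratio <= 1.0:
        raise ValueError("get_ratio must be between 0 and 1")
    set_ratio = round(1.0 - get_ratio, 6)  # the two proportions must sum to 1
    return (
        "key\n"
        f"{key_len} {key_len} 1\n"      # start_len end_len proportion
        "value\n"
        f"{value_len} {value_len} 1\n"  # start_len end_len proportion
        "cmd\n"
        f"0 {set_ratio:g}\n"            # cmd type 0 = set
        f"1 {get_ratio:g}\n"            # cmd type 1 = get
    )


config_text = memaslap_config(0.9)
```

Calling memaslap_config(0.5) would reproduce a 1/1 get/set mix for comparison.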
You're probably just running into CPU scaling. OSes come with a "battery
saver" or "ondemand" performance
Hi Devs,
I ran memaslap to understand the performance characteristics of memcached.
My setup: both memcached and memaslap running on a single machine with
NUMA. memcached is bound to NUMA node 1 and was given 3GB of memory.
Workload: get/set 0.5/0.5
I increase the number of threads from memaslap