Sounds good! fwiw binprot simply lacks a method for batching responses. I'll be fixing it this year but that's going to be a big refactor of the old frontend code. There's some use to it but there should be an option for batching anyway.
VM's fine. I do most of my quick tests on a NUC, which is essentially a laptop with better cooling. Have some donated hardware from packet.net for bigger tests, thankfully :) mc-crusher is nice since it uses very little CPU compared to something like memtier. A lot of "memcached" benchmarks end up benchmarking the benchmark, and are only useful for measuring the relative latency of changes in the daemon, not scalability/throughput.

Just uploaded some fixes and an updated README file. Pretty sure I'm the only regular user of the thing, so please ask away if anything is weird/confusing.

have fun,
-Dormando

On Sun, 17 Mar 2019, 'Eamonn Nugent' via memcached wrote:
> Hi,
>
> I've been meaning to get myself dedicated test environments at one point, for my own stuff as well. Guess now is as good a time as any to try it out.
>
> I'll start using mc-crusher for benchmarks. I'm going to throw it on at least a VM for now, run some long-term benchmarks, and likely try out a couple algos to figure out which one runs best. Good to know that ascii is a different workload; I wasn't sure if the protocols were essentially aliased or each individually unique. Maybe if I get annoyed enough at syscalls, I'll find a way of writing an OS that lets user applications run safely without the same overhead. Tempting, tempting...
>
> No worries about it being a pain in the ass - this is the process behind good and reliable software, and I personally wouldn't want it any other way. I just wanted to get back with early results to get your feedback.
>
> Thanks,
>
> Eamonn
>
> On Sun, Mar 17, 2019 at 4:16 PM dormando <dorma...@rydia.net> wrote:
> > Yo,
> >
> > Fwiw, I use mc-crusher as the "official" benchmark:
> > https://github.com/memcached/mc-crusher
> > though I need to update the README slightly; will do that in a bit.
> > I also test on hardware uninterrupted by other things, with turbo disabled :) Testing on laptops can be really tough since you'll get different frequencies minute to minute. You have to interleave test runs A/B/A/B/A/B, then average to get through the noise.
> >
> > Also, make sure to test both binprot/ascii. With ascii multiget you can deeply pipeline requests and get more hashing vs syscalls.
> >
> > Also, test short vs long keys. A for loop and some scripting should get you there. :)
> >
> > I don't really want to add a ton of algos. More or less a "best average" is fine. Given the syscall overhead, hashing is barely measurable for a lot of workloads. When I switched from jenkins to murmur I did a somewhat comprehensive set of tests, then swapped the default and left the old one in just in case. I highly value good defaults! libmemcached ended up kitchen-sinking hash algos, and I think that didn't work out well in the long run.
> >
> > I did also test hash chain bucket depth a bit. Finally, load up different counts of keys (1m, 10m, 100m, etc) and re-run uniform random benchmarks, since fairness will affect the bucket depth and thus latency.
> >
> > Sorry if that's a pain in the ass, but this thing is quite widely used and there aren't really beta testers :) Have to be thorough these days.
> >
> > -Dormando
> >
> > On Sun, 17 Mar 2019, 'Eamonn Nugent' via memcached wrote:
> > > Reporting back with very preliminary benchmarks. Somehow, xxh64 is actually faster than xxh3 on my machine. One thing I forgot to mention before - I also looked at latencies with xxh32/64, and saw the 99th-percentile latency drop by about half compared to mmh3, so it could be beneficial in that sense. Latencies with xxh3 are in the 3.6ms 99% range, xxh64 goes down to about 3.0ms (I saw 2.5 yesterday; maybe testing on a laptop with about a billion Chrome tabs open isn't a brilliant idea), and mmh3 was in the 4.x ms range.
> > > All of this was with modern options, but with non-modern, xxh64 shone quite a bit; I was doing my testing there yesterday. I used the following memtier_benchmark command to stress test:
> > >
> > > ./memtier_benchmark -P memcache_binary -p 11211 --key-prefix="" --key-maximum=9999999999999999999
> > >
> > > A lot of this seems to be very architecture dependent. Maybe it would make sense to include a lot of hash algos long term, and let power users figure out which they feel like using? Not sure, though, and you're the expert here :P
> > >
> > > Thanks,
> > >
> > > Eamonn
> > >
> > > On Sun, Mar 17, 2019 at 2:59 PM Eamonn Nugent <eamonn.nug...@demilletech.net> wrote:
> > > > Hiya,
> > > >
> > > > Last night, I was running memtier_benchmark on my laptop (mid-2015 15" MBP, 2.5GHz 4c i7) and achieved about a 10-15% throughput improvement on both modern and non-modern settings with the 64-bit variant. The 32-bit variant was about equal in performance (the results showed them within about 3% of each other, but most of the difference was probably just noise). I was able to solve the 32/64-bit compile-time problem by adding a wrapper and some compile-time declarations, so I'd say that's about 50% solved for x86-based systems. But yeah, with ARM, it could turn interesting.
> > > >
> > > > As a next-ish step, I'm going to attempt to drop in xxh3, but since it's still in active development, it's probably not good for anything more than a tech demo. I'm happy to just go nuts adding a dozen different algos into hash.c, though, if it would help (cityhash/farmhash, as you mentioned). In xxhash's implementation, I played with some compile-time flags to make it a bit faster, and I've been toying with the idea of modifying it so no seed logic ever occurs, to maybe gain a couple cycles.
> > > > I'm also looking into finding a pure assembly version to squeeze a bit more speed out of the x86 and ARM versions. I should probably get one of my ARM systems running and test the difference...
> > > >
> > > > But hey, thanks for humoring me. Maybe next I'll take a look at the command reading & processing steps and see if there's anything I can do. Or maybe parallelizing rotl... Hm. I'll keep trying it out :)
> > > >
> > > > Thanks,
> > > >
> > > > Eamonn
> > > >
> > > > On Sun, Mar 17, 2019 at 2:46 PM dormando <dorma...@rydia.net> wrote:
> > > > > Hey,
> > > > >
> > > > > What exact test did you do?
> > > > >
> > > > > Well, to be honest I've been wanting to swap in xxhash for a long time, but in my own profiling other things show up higher than murmur, so I keep deprioritizing it :)
> > > > >
> > > > > One big problem with the hash algo is that mc keys can be short and are hashed one at a time. xxhash is more optimized for longer data (kilobytes to megabytes). The original author tries to address this with an updated algorithm:
> > > > > https://fastcompression.blogspot.com/2019/03/presenting-xxh3.html
> > > > >
> > > > > xxhash makes significant use of instruction parallelism, such that if a key is 8 bytes or less you could end up waiting on the pipeline more than with murmur. Other algos like cityhash/farmhash are better at short keys, IIRC. Also, xx's 32-bit algo is a bit slower on 64-bit machines... so if I wanted to use it, I was going to test both the 32-bit and 64-bit hashes and then do compile-time testing to figure out which to use. It's also heavily x86-optimized, so we might have to default to something else for ARM.
> > > > >
> > > > > Sorry, not debated on the list, just in my own head :) It's not quite as straightforward as just dropping it in. If you're willing to get all the conditions tested, go nuts!
> > > > > :)
> > > > >
> > > > > -Dormando
> > > > >
> > > > > On Sat, 16 Mar 2019, eamonn.nugent via memcached wrote:
> > > > > > Hi there,
> > > > > >
> > > > > > I started using memcached in prod a week or two ago, and am loving it. I wanted to give back, and took a look through the issues board, but most of them looked solved. So, in my usual "it's never fast enough" style, I went and profiled its performance, and had some fun.
> > > > > >
> > > > > > After seeing that MurmurHash3 was taking a good amount of the execution time, I decided to run a test integrating one of my old favorite hash functions, xxhash. My guess is that memcached could benefit from this hash function, as it is faster than MMH3 and has several native variants. I ran some of my own tests and found roughly equal performance, but with no tuning performed on xxhash. For example, using an assembly (x86/arm/etc) version could likely speed up hashing, along with properly implementing it in memcached. However, I was also running this on a much older Nehalem CPU, so there could be unseen advantages to one or both of the algorithms on a newer CPU. I'm in the process of fighting with my newer systems to get libevent installed properly, so I'll report back with more up-to-date tests later.
> > > > > >
> > > > > > I did a cursory search but didn't find any mention of the algo on the mailing list. If this has been discussed, though, apologies for bringing it up again. Otherwise, I would be happy to write a PR to add it, using the `hash_algorithm` CLI arg.
> > > > > >
> > > > > > Thanks,
> > > > > > Eamonn
> > > > > >
> > > > > > --
> > > > > > ---
> > > > > > You received this message because you are subscribed to the Google Groups "memcached" group.
> > > > > > To unsubscribe from this group and stop receiving emails from it, send an email to memcached+unsubscr...@googlegroups.com.
> > > > > > For more options, visit https://groups.google.com/d/optout.