Another interesting set of tests would involve stability. I recall that Consul has not done well in the Aphyr/Jepsen tests and ZK was perfect.
When I get back in front of a computer I'll open a Jira master task for this (or someone else can now).

====================
Jordan Zimmerman

> On Feb 22, 2017, at 2:40 PM, Dan Benediktson <[email protected]> wrote:
>
> Performance benchmarking is a very hard problem, so let's keep that in mind
> before criticizing this one overmuch, and before going too far trying to
> build our own. I do agree that the benchmark chosen here is probably not
> the most useful in guiding customers to select among their options for
> coordination databases, so I like Jordan's suggestion: first define a small
> number of interesting benchmarks, based on common use cases for these
> coordination databases. On the topic of service discovery, I agree that's
> probably the #1 use case, so a benchmark trying to replicate that scenario
> would likely be the first and most important one to go after.
>
> To be honest, I would expect all existing ZK releases to perform much
> worse, by comparison, to etcd and consul, with any kind of mixed read and
> write workload, and I think it would help demonstrate the benefits of the
> patch that recently landed in trunk, and any other subsequent
> performance-oriented patches we might go after, if we had some ready
> benchmarks which could clearly demonstrate the beneficial results of those
> patches.
>
> Thanks,
> Dan
>
> On Wed, Feb 22, 2017 at 6:52 AM, Camille Fournier <[email protected]> wrote:
>
>> Even just writing about what objective tests might look like would be a
>> good start! I'm happy to read draft posts by anyone who wishes to write
>> on the topic.
>>
>> C
>>
>> On Wed, Feb 22, 2017 at 9:36 AM, Jordan Zimmerman <[email protected]> wrote:
>>
>>> IMO there is tremendous FUD in the etcd world. It's the new cool toy and
>>> ZK feels old. To suggest that ZK does not do Service Discovery is
>>> ludicrous. That was one of the very first Curator recipes.
>>>
>>> It might be useful to counter this trend objectively. I'd be interested
>>> in helping. Anyone else? We can create objective tests that compare
>>> common use cases.
>>>
>>> ====================
>>> Jordan Zimmerman
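For reference, the service-discovery recipe mentioned above lives in Curator's curator-x-discovery module. A minimal sketch, with an assumed connect string, service name, address, and port (none of these come from the thread), might look like:

    import org.apache.curator.framework.CuratorFramework;
    import org.apache.curator.framework.CuratorFrameworkFactory;
    import org.apache.curator.retry.ExponentialBackoffRetry;
    import org.apache.curator.x.discovery.ServiceDiscovery;
    import org.apache.curator.x.discovery.ServiceDiscoveryBuilder;
    import org.apache.curator.x.discovery.ServiceInstance;

    public class DiscoveryExample {
        public static void main(String[] args) throws Exception {
            // Connect to an assumed local ensemble.
            CuratorFramework client = CuratorFrameworkFactory.newClient(
                    "localhost:2181", new ExponentialBackoffRetry(1000, 3));
            client.start();

            // Describe this instance; name, address, and port are made up.
            ServiceInstance<Void> instance = ServiceInstance.<Void>builder()
                    .name("my-service")
                    .address("10.0.0.5")
                    .port(8080)
                    .build();

            // Register under /services and start the recipe.
            ServiceDiscovery<Void> discovery = ServiceDiscoveryBuilder.builder(Void.class)
                    .client(client)
                    .basePath("/services")
                    .thisInstance(instance)
                    .build();
            discovery.start();

            // Query all live instances of the named service.
            for (ServiceInstance<Void> si : discovery.queryForInstances("my-service")) {
                System.out.println(si.getAddress() + ":" + si.getPort());
            }

            discovery.close();
            client.close();
        }
    }

By default instances are registered as ephemeral znodes, so a crashed service drops out of the registry when its session expires - which is exactly the property discovery needs.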
>>>
>>>> On Feb 22, 2017, at 11:21 AM, Camille Fournier <[email protected]> wrote:
>>>>
>>>> I think that my biggest feeling about this blog post (besides not
>>>> disclosing the disk setup clearly) is that ZK is really not designed to
>>>> have massive write throughput. I would not traditionally recommend
>>>> someone use ZK in that manner. If we think that evolving it to be useful
>>>> for such workloads would be good, it could be an interesting community
>>>> discussion, but it's really not the purpose of the system design.
>>>>
>>>> I'd love to see a more read/write mixed load test for the systems, as
>>>> well as a blog post about why you might choose different systems for
>>>> different workloads. I think developers have a hard time really
>>>> understanding the tradeoffs they are choosing in these systems, because
>>>> of the nuance around them.
>>>>
>>>> For me, I'm more concerned about the fact that I saw a talk yesterday
>>>> that mentioned both etcd and consul as options for service discovery but
>>>> not ZK. That feels like a big hit for our community. Orthogonal to this
>>>> topic, just feels worth mentioning.
>>>>
>>>> C
>>>>
>>>> On Wed, Feb 22, 2017 at 4:05 AM, Alexander Binzberger <[email protected]> wrote:
>>>>
>>>>> 1. Seems like it might make sense to increase snapCount for those tests.
>>>>>
>>>>> 2. ZK write performance also depends on the number of watches, AFAIK.
>>>>> This is not mentioned and not tested.
>>>>>
>>>>> 3. Does it really make sense to "blast" the store? Wouldn't it make
>>>>> more sense to compare fixed write/read rates per client?
>>>>>
>>>>> On Feb 22, 2017, at 05:53, Michael Han wrote:
>>>>>
>>>>>> Kudos to the etcd team for making this blog, and thanks for sharing.
>>>>>>
>>>>>>> I feel like they're running a questionable configuration.
>>>>>>
>>>>>> Looks like the test configuration
>>>>>> <https://github.com/coreos/dbtester/blob/89eb8d31addff1d9538235c20878a8637f24608c/agent/agent_zookeeper.go#L29>
>>>>>> does not have a separate directory for transaction logs and snapshots,
>>>>>> as it does not set dataLogDir. So the configuration is not optimal. It
>>>>>> would be interesting to see the numbers with an updated configuration.
>>>>>>
>>>>>>> They mention that ZK snapshots "stop the world", and maybe I'm
>>>>>>> mistaken, but I didn't think that was right
>>>>>>
>>>>>> Right, ZK snapshots do not block the processing pipeline: they are
>>>>>> fuzzy and taken on a separate thread. The warning message "*Too busy
>>>>>> to snap, skipping*" mentioned in the blog is a sign that a snapshot is
>>>>>> already in progress, which could be caused by write contention from
>>>>>> serializing transaction logs, leading to longer-than-expected snapshot
>>>>>> generation. So "stop the world" is a side effect of resource
>>>>>> contention, not a design intention, IMO.
>>>>>>
>>>>>> Also, the blog describes ZooKeeper as a key-value store. I want to
>>>>>> point out that ZooKeeper is more than a (metadata) key-value store: it
>>>>>> has features such as sessions, ephemerals, and watchers. These design
>>>>>> choices were made, I believe, to make ZK more useful as a coordination
>>>>>> kernel, and they also contribute (negatively) to ZooKeeper's
>>>>>> performance and scalability.
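To make the configuration point concrete: a sketch of a zoo.cfg addressing both the missing dataLogDir and Alexander's snapCount suggestion might look like the following. All paths, hostnames, and the snapCount value are illustrative, not taken from the benchmark.

    tickTime=2000
    initLimit=10
    syncLimit=5
    clientPort=2181
    # Keep fuzzy snapshots and the transaction log on separate devices so
    # snapshot writes don't randomize the log's sequential I/O pattern.
    dataDir=/disk1/zookeeper/data
    dataLogDir=/disk2/zookeeper/txlog
    # Snapshot roughly every snapCount transactions (default 100000).
    # Raising it trades longer recovery for fewer snapshot-induced stalls.
    snapCount=500000
    server.1=zk1:2888:3888
    server.2=zk2:2888:3888
    server.3=zk3:2888:3888

With dataLogDir on its own device, the commit path keeps its sequential write pattern even while a snapshot is streaming to dataDir.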
>>>>>>
>>>>>> On Tue, Feb 21, 2017 at 4:32 PM, Dan Benediktson <[email protected]> wrote:
>>>>>>
>>>>>>> I kind of wonder about them only using one disk. I haven't
>>>>>>> experimented with this in ZooKeeper, nor have I ever been a DBA, but
>>>>>>> with traditional database systems (which ZooKeeper should be basically
>>>>>>> identical to, in this regard), it's a pretty common recommendation to
>>>>>>> put snapshots and TxLogs on different drives, for the obvious reason
>>>>>>> of avoiding one of the biggest problems laid out in that blog post:
>>>>>>> when a snapshot happens, it contends with your log flushes, causing
>>>>>>> write latencies to explode. Suddenly you have tons more IO, and where
>>>>>>> it used to be nicely sequential, now it's heavily randomized because
>>>>>>> of the two competing writers. It's kind of the nature of benchmarks
>>>>>>> that there's always something you can nitpick, but still, I feel like
>>>>>>> they're running a questionable configuration.
>>>>>>>
>>>>>>> They mention that ZK snapshots "stop the world", and maybe I'm
>>>>>>> mistaken, but I didn't think that was right - I thought they were just
>>>>>>> slowing everything down because they write a lot and contend a lot.
>>>>>>> I'm pretty sure ZK snapshots are fuzzy over a range of transactions,
>>>>>>> and transactions keep applying during the snapshot, right?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Dan
>>>>>>>
>>>>>>> On Tue, Feb 21, 2017 at 2:24 PM, Benjamin Mahler <[email protected]> wrote:
>>>>>>>
>>>>>>>> I'm curious if folks here have seen the following write performance
>>>>>>>> comparison that was done by CoreOS on etcd, Consul, and ZooKeeper:
>>>>>>>> https://coreos.com/blog/performance-of-etcd.html
>>>>>>>>
>>>>>>>> Sounds like a performance comparison of reads and updates is coming
>>>>>>>> next. Are there any thoughts from folks here on this comparison so
>>>>>>>> far?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Ben
>>>>>
>>>>> --
>>>>> Alexander Binzberger
>>>>> System Designer - WINGcon AG
>>>>> Tel. +49 7543 966-119
>>>>>
>>>>> Registered office: Langenargen
>>>>> Register court: Ulm, HRB 734260
>>>>> VAT ID: DE232931635, WEEE ID: DE74015979
>>>>> Management board: Thomas Ehrle (chair), Fritz R. Paul (deputy), Tobias Treß
>>>>> Supervisory board: Jürgen Maucher (chair), Andreas Paul (deputy), Martin Sauter
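As a strawman for the mixed read/write test several people asked for above, here is a deliberately naive single-client sketch against the plain ZooKeeper API. The connect string, znode path, payload size, operation count, and the 90/10 read/write split are arbitrary assumptions:

    import java.util.Random;
    import java.util.concurrent.CountDownLatch;
    import org.apache.zookeeper.CreateMode;
    import org.apache.zookeeper.Watcher;
    import org.apache.zookeeper.ZooDefs;
    import org.apache.zookeeper.ZooKeeper;

    public class MixedLoadSketch {
        public static void main(String[] args) throws Exception {
            // Wait for the session to be established before issuing requests.
            CountDownLatch connected = new CountDownLatch(1);
            ZooKeeper zk = new ZooKeeper("localhost:2181", 30000, event -> {
                if (event.getState() == Watcher.Event.KeeperState.SyncConnected) {
                    connected.countDown();
                }
            });
            connected.await();

            // Assumes /bench does not already exist.
            byte[] payload = new byte[256];
            zk.create("/bench", payload, ZooDefs.Ids.OPEN_ACL_UNSAFE,
                    CreateMode.PERSISTENT);

            Random rnd = new Random();
            int ops = 100_000;
            long start = System.nanoTime();
            for (int i = 0; i < ops; i++) {
                if (rnd.nextInt(10) == 0) {
                    zk.setData("/bench", payload, -1);  // ~10% writes
                } else {
                    zk.getData("/bench", false, null);  // ~90% reads
                }
            }
            long ms = (System.nanoTime() - start) / 1_000_000;
            System.out.printf("%d ops in %d ms (%.0f ops/sec)%n",
                    ops, ms, ops * 1000.0 / ms);

            zk.close();
        }
    }

A real benchmark would need many concurrent clients (and likely the async API) to saturate an ensemble; a sketch like this only pins down the shape of the workload so that results are comparable across systems.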
