Thanks, Jordan, for driving this. A side note on the ZooKeeper Jepsen report <https://aphyr.com/posts/291-jepsen-zookeeper>: it is a little bit outdated, so I created https://issues.apache.org/jira/browse/ZOOKEEPER-2704.

On Wed, Feb 22, 2017 at 10:39 AM, Jordan Zimmerman <[email protected]> wrote:

> Another interesting set of tests would involve stability. I recall that
> consul has not done well in the Aphyr/Jepsen tests and ZK was perfect.
>
> When I get back in front of a computer I'll open a Jira master task for
> this (or someone else can now).
>
> ====================
> Jordan Zimmerman
>
> > On Feb 22, 2017, at 2:40 PM, Dan Benediktson <[email protected]> wrote:
> >
> > Performance benchmarking is a very hard problem, so let's keep that in
> > mind before criticizing this one overmuch, and before going too far
> > trying to build our own. I do agree that the benchmark chosen here is
> > probably not the most useful in guiding customers to select among their
> > options for coordination databases, so I like Jordan's suggestion: first
> > define a small number of interesting benchmarks, based on common use
> > cases for these coordination databases. On the topic of service
> > discovery, I agree that's probably the #1 use case, so a benchmark
> > trying to replicate that scenario would likely be the first and most
> > important one to go after.
> >
> > To be honest, I would expect all existing ZK releases to perform much
> > worse, by comparison, than etcd and consul with any kind of mixed read
> > and write workload, and I think it would help demonstrate the benefits
> > of the patch that recently landed in trunk, and any other subsequent
> > performance-oriented patches we might go after, if we had some ready
> > benchmarks which could clearly demonstrate the beneficial results of
> > those patches.
> >
> > Thanks,
> > Dan
> >
> > On Wed, Feb 22, 2017 at 6:52 AM, Camille Fournier <[email protected]> wrote:
> >
> >> Even just writing about what objective tests might look like would be
> >> a good start! I'm happy to read draft posts by anyone who wishes to
> >> write on the topic.
> >>
> >> C
> >>
> >> On Wed, Feb 22, 2017 at 9:36 AM, Jordan Zimmerman <[email protected]> wrote:
> >>
> >>> IMO there is tremendous FUD in the etcd world. It's the new cool toy
> >>> and ZK feels old. To suggest that ZK does not do Service Discovery is
> >>> ludicrous. That was one of the very first Curator recipes.
> >>>
> >>> It might be useful to counter this trend objectively. I'd be
> >>> interested in helping. Anyone else? We can create objective tests
> >>> that compare common use cases.
> >>>
> >>> ====================
> >>> Jordan Zimmerman
> >>>
> >>>> On Feb 22, 2017, at 11:21 AM, Camille Fournier <[email protected]> wrote:
> >>>>
> >>>> I think that my biggest feeling about this blog post (besides not
> >>>> disclosing the disk setup clearly) is that ZK is really not designed
> >>>> to have massive write throughput. I would not traditionally
> >>>> recommend someone use ZK in that manner. If we think that evolving
> >>>> it to be useful for such workloads would be good, it could be an
> >>>> interesting community discussion, but it's really not the purpose of
> >>>> the system design.
> >>>>
> >>>> I'd love to see a more read/write mixed load test for the systems,
> >>>> as well as a blog post about why you might choose different systems
> >>>> for different workloads. I think developers have a hard time really
> >>>> understanding the tradeoffs they are choosing in these systems,
> >>>> because of the nuance around them.
> >>>>
> >>>> For me, I'm more concerned about the fact that I saw a talk
> >>>> yesterday that mentioned both etcd and consul as options for service
> >>>> discovery but not ZK. That feels like a big hit for our community.
> >>>> Orthogonal to this topic, just feels worth mentioning.
> >>>>
> >>>> C
> >>>>
> >>>> On Wed, Feb 22, 2017 at 4:05 AM, Alexander Binzberger <[email protected]> wrote:
> >>>>
> >>>>> 1. Seems like it might make sense to increase snapCount for those
> >>>>> tests.
> >>>>>
> >>>>> 2. ZK write performance also depends on the number of watches,
> >>>>> afaik. This is not mentioned and not tested.
> >>>>>
> >>>>> 3. Does it really make sense to "blast" the store? Wouldn't it make
> >>>>> more sense to compare fixed per-client write/read rates?
> >>>>>
> >>>>>> On 22.02.2017 at 05:53, Michael Han wrote:
> >>>>>>
> >>>>>> Kudos to the etcd team for making this blog and thanks for sharing.
> >>>>>>
> >>>>>>> I feel like they're running a questionable configuration.
> >>>>>>
> >>>>>> Looks like the test configuration
> >>>>>> <https://github.com/coreos/dbtester/blob/89eb8d31addff1d9538235c20878a8637f24608c/agent/agent_zookeeper.go#L29>
> >>>>>> does not have a separate directory for transaction logs and
> >>>>>> snapshots, as it does not have a configuration for dataLogDir. So
> >>>>>> the configuration is not optimal. It would be interesting to see
> >>>>>> the numbers with an updated configuration.
> >>>>>>
> >>>>>>> They mention that ZK snapshots "stop the world", and maybe I'm
> >>>>>>> mistaken, but I didn't think that was right
> >>>>>>
> >>>>>> Right, ZK snapshots do not block the processing pipeline, as
> >>>>>> snapshotting is fuzzy and is done in a separate thread. The
> >>>>>> warning message "*Too busy to snap, skipping*" mentioned in the
> >>>>>> blog is a sign that a snapshot is already in progress, which could
> >>>>>> be caused by write contention from serializing transaction logs
> >>>>>> that leads to longer than expected snapshot generation. So "stop
> >>>>>> the world" is a side effect of resource contention, but not a
> >>>>>> design intention IMO.
> >>>>>>
> >>>>>> Also the blog mentions ZooKeeper as a key value store, and I also
> >>>>>> want to point out that ZooKeeper is more than a (metadata) key
> >>>>>> value store: it has features such as sessions, ephemerals, and
> >>>>>> watchers. These design choices were made, I believe, to make ZK
> >>>>>> more useful as a coordination kernel, and these design choices
> >>>>>> also (negatively) contribute to the performance and scalability
> >>>>>> of ZooKeeper.
> >>>>>>
> >>>>>> On Tue, Feb 21, 2017 at 4:32 PM, Dan Benediktson <[email protected]> wrote:
> >>>>>>
> >>>>>>> I kind of wonder about them only using one disk. I haven't
> >>>>>>> experimented with this in ZooKeeper, nor have I ever been a DBA,
> >>>>>>> but with traditional database systems (which ZooKeeper should be
> >>>>>>> basically identical to, in this regard), it's a pretty common
> >>>>>>> recommendation to put snapshots and TxLogs on different drives,
> >>>>>>> for the obvious reason of avoiding one of the biggest problems
> >>>>>>> laid out in that blog post: when a snapshot happens, it contends
> >>>>>>> with your log flushes, causing write latencies to explode.
> >>>>>>> Suddenly you have tons more IO, and where it used to be nicely
> >>>>>>> sequential, now it's heavily randomized because of the two
> >>>>>>> competing writers. It's kind of the nature of benchmarks that
> >>>>>>> there's always something you can nitpick, but still, I feel like
> >>>>>>> they're running a questionable configuration.
> >>>>>>>
> >>>>>>> They mention that ZK snapshots "stop the world", and maybe I'm
> >>>>>>> mistaken, but I didn't think that was right - I thought they were
> >>>>>>> just slowing everything down because they write a lot and contend
> >>>>>>> a lot. I'm pretty sure ZK snapshots are fuzzy over a range of
> >>>>>>> transactions, and transactions keep applying during the snapshot,
> >>>>>>> right?
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>> Dan
> >>>>>>>
> >>>>>>> On Tue, Feb 21, 2017 at 2:24 PM, Benjamin Mahler <[email protected]> wrote:
> >>>>>>>
> >>>>>>>> I'm curious if folks here have seen the following write
> >>>>>>>> performance comparison that was done by CoreOS on etcd, Consul,
> >>>>>>>> and ZooKeeper:
> >>>>>>>> https://coreos.com/blog/performance-of-etcd.html
> >>>>>>>>
> >>>>>>>> Sounds like performance comparisons of reads and updates are
> >>>>>>>> coming next. Are there any thoughts from folks here on this
> >>>>>>>> comparison so far?
> >>>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>> Ben
> >>>>>
> >>>>> --
> >>>>> Alexander Binzberger
> >>>>> System Designer - WINGcon AG
> >>>>> Tel. +49 7543 966-119
> >>>>>
> >>>>> Registered office: Langenargen
> >>>>> Court of registration: Ulm, HRB 734260
> >>>>> VAT ID: DE232931635, WEEE ID: DE74015979
> >>>>> Executive board: Thomas Ehrle (chair), Fritz R. Paul (deputy),
> >>>>> Tobias Treß
> >>>>> Supervisory board: Jürgen Maucher (chair), Andreas Paul (deputy),
> >>>>> Martin Sauter
