1. Seems like it might make sense to increase snapCount for those tests (for example, see the config sketch after this list).

2. ZK write performance also depends on the number of watches set, AFAIK. This is neither mentioned nor tested.

3. Does it really make sense to "blast" the store? Wouldn't it make more sense to compare fixed write/read rates per client?
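
Regarding point 1, a hypothetical zoo.cfg tweak along these lines could reduce
snapshot frequency during a sustained write test (the value is illustrative,
not a tuned recommendation):

    # snapCount defaults to 100000 transactions between snapshots;
    # raising it makes snapshots rarer during a write blast.
    snapCount=500000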


On 22.02.2017 at 05:53, Michael Han wrote:
Kudos to the etcd team for writing this blog post, and thanks for sharing.

> I feel like they're running a questionable configuration.
Looks like the test configuration
<https://github.com/coreos/dbtester/blob/89eb8d31addff1d9538235c20878a8637f24608c/agent/agent_zookeeper.go#L29>
does not use separate directories for transaction logs and snapshots, since it
does not set dataLogDir. So the configuration is not optimal. It would be
interesting to see the numbers with an updated configuration.
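
For illustration, the split would look roughly like this in zoo.cfg (paths are
hypothetical; the point is that the two directories sit on different devices):

    # snapshots on one device, transaction log on a separate dedicated device
    dataDir=/mnt/disk1/zookeeper
    dataLogDir=/mnt/disk2/zookeeper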

> They mention that ZK snapshots "stop the world", and maybe I'm mistaken, but
> I didn't think that was right

Right, ZK snapshots do not block the processing pipeline: a snapshot is fuzzy
and is taken on a separate thread. The warning "*Too busy to snap, skipping*"
mentioned in the blog means that a previous snapshot is still being generated,
which can be caused by write contention from serializing transaction logs,
making snapshot generation take longer than expected. So "stop the world" is a
side effect of resource contention, not a design intention IMO.
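
As a toy illustration of that behavior (this is not ZooKeeper source, just the
pattern described above: snapshots are handed off to a background thread, and
the warning only means the previous snapshot thread is still running):

    public class FuzzySnapshotSketch {
        private Thread snapInProcess;
        private int logCount;
        private static final int SNAP_COUNT = 100000; // ZooKeeper's default snapCount

        void afterAppendToTxnLog() {
            logCount++;
            if (logCount > SNAP_COUNT / 2) { // the real server also adds a random offset
                if (snapInProcess != null && snapInProcess.isAlive()) {
                    System.out.println("Too busy to snap, skipping"); // previous snapshot still running
                } else {
                    snapInProcess = new Thread(this::takeFuzzySnapshot, "Snapshot Thread");
                    snapInProcess.start(); // request processing continues meanwhile
                }
                logCount = 0;
            }
        }

        private void takeFuzzySnapshot() {
            // Serialize the in-memory tree here; new transactions keep applying,
            // which is what makes the snapshot "fuzzy".
        }
    }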

The blog also describes ZooKeeper as a key-value store, and I want to point
out that ZooKeeper is more than a (metadata) key-value store: it has features
such as sessions, ephemeral nodes, and watches. I believe these design choices
were made to make ZK more useful as a coordination kernel, and they also
(negatively) affect ZooKeeper's performance and scalability.
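
For example, this is the kind of coordination primitive a plain key-value
store doesn't give you. A rough sketch using the standard Java client API
(connection string, paths, and error handling are simplified, and it assumes
the /workers parent node already exists):

    import org.apache.zookeeper.CreateMode;
    import org.apache.zookeeper.WatchedEvent;
    import org.apache.zookeeper.ZooDefs;
    import org.apache.zookeeper.ZooKeeper;

    public class EphemeralSketch {
        public static void main(String[] args) throws Exception {
            // The session carries the lifetime of any ephemeral nodes it creates.
            ZooKeeper zk = new ZooKeeper("localhost:2181", 30000, (WatchedEvent e) -> {});

            // Ephemeral node: removed automatically when this session ends.
            zk.create("/workers/worker-1", new byte[0],
                      ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);

            // Watch: one-shot notification when the node changes or disappears.
            zk.exists("/workers/worker-1",
                      (WatchedEvent e) -> System.out.println("event: " + e.getType()));
        }
    }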


On Tue, Feb 21, 2017 at 4:32 PM, Dan Benediktson <
[email protected]> wrote:

I kind of wonder about them only using one disk. I haven't experimented
with this in ZooKeeper, nor have I ever been a DBA, but with traditional
database systems (which ZooKeeper should be basically identical to, in this
regard), it's a pretty common recommendation to put snapshots and TxLogs on
different drives, for the obvious reason of avoiding one of the biggest
problems laid out in that blog post: when a snapshot happens, it contends
with your log flushes, causing write latencies to explode. Suddenly you
have tons more IO, and where it used to be nicely sequential, now it's
heavily randomized because of the two competing writers. It's kind of the
nature of benchmarks that there's always something you can nitpick, but
still, I feel like they're running a questionable configuration.

They mention that ZK snapshots "stop the world", and maybe I'm mistaken,
but I didn't think that was right - I thought they were just slowing
everything down because they write a lot and contend a lot. I'm pretty sure
ZK snapshots are fuzzy over a range of transactions, and transactions keep
applying during the snapshot, right?

Thanks,
Dan

On Tue, Feb 21, 2017 at 2:24 PM, Benjamin Mahler <[email protected]>
wrote:

I'm curious if folks here have seen the following write performance
comparison that was done by CoreOS on etcd, Consul, and ZooKeeper:
https://coreos.com/blog/performance-of-etcd.html

Sounds like a performance comparison of reads and updates is coming next.
Are there any thoughts from folks here on this comparison so far?

Thanks,
Ben


--
Alexander Binzberger
System Designer - WINGcon AG
Tel. +49 7543 966-119

