Single machine performance looks great. I'm assuming this was done with replication factor = 1, right?
In reality, the only factor that can affect this number is the time it takes for the followers to fetch the data. This matters for end-to-end latency: until the follower replicas fetch the data, the leader will not advance the high watermark to expose that data to the consumer.

Thanks,
Neha

On Thu, Oct 25, 2012 at 5:08 PM, Jay Kreps <jay.kr...@gmail.com> wrote:
> I ran the end-to-end latency test on the 0.8 branch and it actually looks
> pretty good even with the remaining 0.8 perf bugs. I get an average of
> 0.29 ms for the producer=>broker=>consumer roundtrip over localhost.
>
> Some background for those not following along:
>
> End-to-end latency is the time it takes for a message sent from the
> producer to arrive at the consumer. It actually does make sense to measure
> this over localhost, since the network's contribution is going to vary
> enormously from network to network.
>
> In 0.7 there was a pretty strong tradeoff between latency and throughput:
> you could have great throughput with high latency (hundreds of ms to
> seconds) or bad throughput with low latency (bad at least in terms of the
> disk's capabilities, though comparable to other popular messaging
> systems). There were two causes for this: (1) we were using the disk flush
> to guarantee durability, so we wouldn't hand out messages to consumers
> until the flush occurred, and (2) the consumer would back off whenever it
> reached the end of the log to avoid "spin waiting" (polling for new data
> in a tight loop).
>
> Both of these issues are resolved in 0.8, which is what leads to the
> improvement:
> - In 0.8 we have replication, which gives a stronger durability guarantee
> without requiring that we block on the disk flush.
> - We added "long poll" for the consumer, which ensures immediate message
> delivery without adding any polling overhead.
>
> -Jay
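For anyone who wants to reproduce a rough version of this measurement, here is a minimal sketch. It uses the modern Java clients rather than the 0.8-era Scala API the original test was built on, and the topic name "latency-test" and group id are made-up placeholders, so treat the whole thing as illustrative:

import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

public class EndToEndLatencyTest {
    public static void main(String[] args) {
        Properties prod = new Properties();
        prod.put("bootstrap.servers", "localhost:9092"); // measured over localhost, as in the test
        prod.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        prod.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        Properties cons = new Properties();
        cons.put("bootstrap.servers", "localhost:9092");
        cons.put("group.id", "latency-test");            // hypothetical group id
        cons.put("auto.offset.reset", "latest");
        cons.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        cons.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(prod);
             KafkaConsumer<String, String> consumer = new KafkaConsumer<>(cons)) {
            consumer.subscribe(Collections.singletonList("latency-test")); // hypothetical topic
            // Wait until the consumer has its partition assignment before timing anything.
            while (consumer.assignment().isEmpty()) {
                consumer.poll(Duration.ofMillis(100));
            }

            int n = 1000;
            long totalNanos = 0;
            for (int i = 0; i < n; i++) {
                long start = System.nanoTime();
                producer.send(new ProducerRecord<>("latency-test", "msg-" + i));
                ConsumerRecords<String, String> records;
                do { // block until the message makes the producer=>broker=>consumer roundtrip
                    records = consumer.poll(Duration.ofMillis(100));
                } while (records.isEmpty());
                totalNanos += System.nanoTime() - start;
            }
            System.out.printf("average roundtrip: %.2f ms%n", totalNanos / (double) n / 1_000_000.0);
        }
    }
}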
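Neha's point about follower fetches also shows up on the producer side through the acks setting. A sketch, again using the modern client's config names as an assumption (in 0.8 this was request.required.acks):

Properties prod = new Properties();
// acks=all: the leader acknowledges only after the in-sync followers have fetched
// the message, i.e. after the high watermark can advance past it. With acks=1 the
// producer is acked on the leader's append alone, but the consumer still cannot
// see the message until the followers fetch it, which is exactly why follower
// fetch time dominates end-to-end latency with replication factor > 1.
prod.put("acks", "all");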
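The "long poll" Jay describes is controlled by the consumer's fetch settings: the broker parks the fetch request server-side instead of the consumer backing off and re-polling. A sketch with the modern Java consumer's property names (the 0.8 consumer used slightly different names, so these are illustrative):

Properties cons = new Properties();
// Respond as soon as any data is available...
cons.put("fetch.min.bytes", "1");
// ...but hold the fetch request on the broker for up to 500 ms when the log is
// empty, giving immediate delivery without tight-loop polling overhead.
cons.put("fetch.max.wait.ms", "500");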