Hi Jerry,

Thanks for the clarification. Just to check my understanding: the reason we got the lower latency is the "window" mechanism in Flink. I guess the stream in Flink is flushed as one or several batches per window, so at lower throughputs this leads to extra waiting at the source component. If so, it should be possible to lower the latency of Flink by adjusting its configuration.

Actually, my point here is that if we want to compete with Flink or Spark Streaming on at-least-once or exactly-once processing (high throughput and low latency), the acking mechanism of Storm needs to be improved. Currently, the acking mechanism in Storm generates too many extra messages. Sometimes the throughput of a topology depends on the throughput of the acker.
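To make the "extra messages" point concrete, here is a rough sketch in Python. It is a simplified model of XOR-based ack tracking, not Storm's actual code. It assumes a linear topology (one spout followed by a chain of bolts) in which every bolt anchors exactly one output tuple and acks its input. The names Acker and run_chain are hypothetical, and the sketch counts messages only, not their size or cost.

```python
import random


class Acker:
    """Simplified model of an acker task: one running XOR value per tuple tree."""

    def __init__(self):
        self.pending = {}      # root id -> running XOR of outstanding edge ids
        self.messages_in = 0   # messages received by the acker
        self.completed = []    # root ids reported back to the spout

    def update(self, root_id, xor_value):
        self.messages_in += 1
        value = self.pending.get(root_id, 0) ^ xor_value
        if value == 0:
            # Every edge was seen twice (emitted and acked): the tree is done.
            self.pending.pop(root_id, None)
            self.completed.append(root_id)  # costs one acker -> spout message
        else:
            self.pending[root_id] = value


def run_chain(num_tuples, num_bolts, seed=0):
    """Return (data_messages, ack_messages) for a chain with num_bolts >= 1."""
    rng = random.Random(seed)
    acker = Acker()
    data_messages = 0
    for root_id in range(num_tuples):
        edge = rng.getrandbits(64) or 1      # non-zero id for the spout's tuple
        acker.update(root_id, edge)          # spout registers the tuple tree
        data_messages += 1                   # spout -> first bolt
        for i in range(num_bolts):
            if i == num_bolts - 1:
                acker.update(root_id, edge)  # last bolt acks, emits nothing
            else:
                new_edge = rng.getrandbits(64) or 1
                data_messages += 1           # this bolt -> next bolt
                acker.update(root_id, edge ^ new_edge)  # ack input, anchor output
                edge = new_edge
    ack_messages = acker.messages_in + len(acker.completed)
    return data_messages, ack_messages
```

Under these assumptions, run_chain(1000, 3) returns (3000, 5000): a chain of k bolts moves k data messages per spout tuple but needs k + 2 acking messages (one registration from the spout, one per bolt, and one completion back to the spout), all of which pass through the acker.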
Regards
Basti

-----Original Message-----
From: Boyang(Jerry) Peng [mailto:[email protected]]
Sent: Friday, December 18, 2015 7:08 AM
To: [email protected]
Subject: Re: Benchmarking Streaming Computation Engines at Yahoo!

Hello Satish,

One of the experiments we wish to do in the future is to compare Flink with checkpointing with Storm with acking. If you look at our results, Storm with acking does have lower latency than Flink without checkpointing at lower throughputs. The keyword here is lower throughputs. What we were trying to say is that Storm with the optimizations we proposed can be comparable with Flink without checkpointing at higher throughputs even with acking turned on.

Best,
Jerry

On Thursday, December 17, 2015 1:27 PM, Satish Duggana <[email protected]> wrote:

Hi Jerry,

Thanks for updating the blog. Storm with acking should be compared with a similar configuration on Flink, which may be with checkpointing enabled or some other configuration which gives an at-least-once guarantee. But the paragraph below gives the impression that Storm with acking is equivalent to Flink without checkpointing, which is not right.

"Without acking, Storm even beat Flink at very high throughput, and we expect that with further optimizations like combining bolts, more intelligent routing of tuples, and improved acking, Storm with acking enabled would compete with Flink at very high throughput too."

Thanks,
Satish.

On Thu, Dec 17, 2015 at 10:47 PM, Boyang(Jerry) Peng <[email protected]> wrote:

> Hello Satish,
> You are correct, there was a typo. The sentence should be:
> "Flink uses a mechanism called checkpointing to guarantee processing.
> Unless checkpointing is used in the Flink job, Flink offers at most once
> processing similar to Storm with acking turned OFF. For the Flink
> benchmark we did not use checkpointing."
>
> We have already fixed the typo on the blog. Thanks!
> Best,
> Boyang Jerry Peng
>
>
> On Thursday, December 17, 2015 4:12 AM, Satish Duggana <
> [email protected]> wrote:
>
>
> Hi Bobby etal,
> Thanks for publishing blog post on “Benchmarking streaming computation
> engines<
> http://yahooeng.tumblr.com/post/135321837876/benchmarking-streaming-computation-engines-at>”.
> It gives good insights on how different streaming engines perform with the
> usecase mentioned.
>
> “Flink uses a mechanism called checkpointing to guarantee processing.
> Unless checkpointing is used in the Flink job, Flink offers at most once
> processing similar to Storm with acking turned on. For the Flink benchmark
> we did not use checkpointing."
>
> Above snippet in your blog was confusing regarding at-most-once guarantee.
> My understanding is that Storm gives at-most-once without acking. But
> at-least-once guarantee requires acking on. So, Storm’s acking should be
> compared with Flink’s at-least-once guarantee which may be by enabling
> checkpointing or any other required configuration. Am I missing anything
> here?
>
> Thanks,
> Satish.
