Re: benchmarking flink streaming

2017-01-25 Thread Stephan Ewen
The latency markers "pass through windows" so they do not take the latency of windows into account. They represent only the latency of the actual streams and their backpressure. On Wed, Jan 25, 2017 at 6:08 PM, Dominik Safaric wrote: > Hi Stephan, > > As I’m already familiar with the latency mar

Re: benchmarking flink streaming

2017-01-25 Thread Meghashyam Sandeep V
Hi Stephan, That's great to hear. We are using EMR, which is still on Flink 1.1.3. I'll use the latency markers when Flink on EMR is upgraded. Thanks, Sandeep On Wed, Jan 25, 2017 at 9:08 AM, Dominik Safaric wrote: > Hi Stephan, > > As I’m already familiar with the latency markers of Flink 1.2,

Re: benchmarking flink streaming

2017-01-25 Thread Dominik Safaric
Hi Stephan, As I’m already familiar with the latency markers of Flink 1.2, there is one question that bothers me in regard to them - how does Flink measure end-to-end latency when dealing with e.g. aggregations? Suppose you have a topology ingesting data from Kafka, and you want to output fre

Re: benchmarking flink streaming

2017-01-25 Thread Stephan Ewen
Hi! There are new latency metrics in Flink 1.2 that you can use. They are sampled, so they are not measured on every record. You can always attach your own timestamps in order to measure the latency of specific records. Stephan On Fri, Dec 16, 2016 at 5:02 PM, Meghashyam Sandeep V < vr1meghash...@gmail.com> wr
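
The suggestion above to attach your own timestamps can be illustrated with a minimal, framework-independent sketch. It does not use any Flink API. The names `Stamped`, `stamp`, `latency_ms`, and `percentile` are hypothetical helpers introduced only for illustration, and it assumes that the source and sink clocks are synchronized well enough for wall-clock differences to be meaningful.

```python
import math
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class Stamped:
    """A record wrapped with the wall-clock time at which it entered the pipeline."""
    payload: object
    ingest_ms: int


def _now_ms():
    # Wall-clock time in milliseconds; assumes source and sink clocks are in sync.
    return int(time.time() * 1000)


def stamp(payload, now_ms=None):
    """Attach an ingestion timestamp at the source (hypothetical helper)."""
    return Stamped(payload, _now_ms() if now_ms is None else now_ms)


def latency_ms(record, now_ms=None):
    """Compute end-to-end latency at the sink for one stamped record."""
    return (_now_ms() if now_ms is None else now_ms) - record.ingest_ms


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of latency samples."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


if __name__ == "__main__":
    # Simulate a source stamping records and a sink measuring them later.
    records = [stamp(f"event-{i}") for i in range(5)]
    time.sleep(0.01)  # stands in for processing between source and sink
    samples = [latency_ms(r) for r in records]
    print("p50 latency (ms):", percentile(samples, 50))
    print("p99 latency (ms):", percentile(samples, 99))
```

In a real job the timestamp would travel as a field of the record from the source to the sink, so that only the records of interest are measured.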

Re: benchmarking flink streaming

2016-12-16 Thread Meghashyam Sandeep V
Hi Stephan, Thanks for your answer. Is there a way to get metrics such as the latency of each message in the stream? For example, I have a Kafka source and a Cassandra sink, and I do some processing in between. I would like to know how long it takes for each message from the beginning (entering Flink str

Re: benchmarking flink streaming

2016-12-16 Thread Stephan Ewen
Hi! I am not sure there exists a recommended benchmarking tool. Performance comparisons depend heavily on the scenarios you are looking at: simple event processing, shuffles (grouping aggregation), joins, small state, large state, etc. As far as I know, most people try to write a "mock" version

benchmarking flink streaming

2016-12-15 Thread Meghashyam Sandeep V
Hi there, We are evaluating Flink streaming for real-time data analysis. I have my Flink job running in EMR with YARN. What are the possible benchmarking tools that work best with Flink? I couldn't find this information on the Apache website. Thanks, Sandeep