Lower Parallelism derives better latency

2018-01-02 Thread Netzer, Liron
Hi group, We have a standalone Flink cluster that is running on a UNIX host with 40 CPUs and 256GB RAM. There is one task manager and 24 slots were defined. When we decrease the parallelism of the Stream graph operators(each operator has the same parallelism), we see a consistent change in the

JobManager not receiving resource offers from Mesos

2018-01-02 Thread 김동원
Hi, I try to launch a Flink cluster on top of dc/os but TaskManagers are not launched at all. What I do to launch a Flink cluster is as follows: - Click "flink" from "Catalog" on the left panel of dc/os GUI. - Click "Run service" without any modification on configuration for the purpose of

Re: BackPressure handling

2018-01-02 Thread Vishal Santoshi
Also note that if I were to start 2 pipelines 1. Working off the head of the topic and thus not prone to the pathological case described above 2. Doing a replay and thus prone to the pathological case described above Than the 2nd pipe will stall the 1st pipeline. This seems to to point to -

Re: BackPressure handling

2018-01-02 Thread Vishal Santoshi
Thank you. On Tue, Jan 2, 2018 at 1:31 PM, Nico Kruber wrote: > Hi Vishal, > let me already point you towards the JIRA issue for the credit-based > flow control: https://issues.apache.org/jira/browse/FLINK-7282 > > I'll have a look at the rest of this email thread

Re: BackPressure handling

2018-01-02 Thread Nico Kruber
Hi Vishal, let me already point you towards the JIRA issue for the credit-based flow control: https://issues.apache.org/jira/browse/FLINK-7282 I'll have a look at the rest of this email thread tomorrow... Regards, Nico On 02/01/18 17:52, Vishal Santoshi wrote: > Could you please point me to

Re: BackPressure handling

2018-01-02 Thread Vishal Santoshi
Could you please point me to any documentation on the "credit-based flow control" approach On Tue, Jan 2, 2018 at 10:35 AM, Timo Walther wrote: > Hi Vishal, > > your assumptions sound reasonable to me. The community is currently > working on a more fine-grained back

Re: keyby() issue

2018-01-02 Thread Timo Walther
Hi Jinhua, did you check the key group assignments? What is the distribution of "MathUtils.murmurHash(keyHash) % maxParallelism" on a sample of your data? This also depends on the hashCode on the output of your KeySelector. keyBy should handle high traffic well, but it is designed for key

Re: How to stop FlinkKafkaConsumer and make job finished?

2018-01-02 Thread Timo Walther
Hi Arnaud, thanks for letting us know your workaround. I agree that this is a frequently asked topic and important in certain use cases. I'm sure that it will be solved in the near future depending on the priorities. My 2 cents: Flink is an open source project maybe somebody is willing to

RE: How to stop FlinkKafkaConsumer and make job finished?

2018-01-02 Thread LINZ, Arnaud
Hi, My 2 cents: not being able to programmatically nicely stop a Flink stream is what lacks most to the framework IMHO. It's a very common use case: each time you want to update the application or change its configuration you need to nicely stop & restart it, without triggering alerts, data

Re: Flink Kafka Consumer stops fetching records

2018-01-02 Thread Timo Walther
Hi Teena, could you tell us a bit more about your job. Are you using event-time semantics? Regards, Timo Am 1/2/18 um 6:14 AM schrieb Teena K: Hi, I am using Flink 1.4 along with Kafka 0.11. My stream job has 4 Kafka consumers each subscribing to 4 different topics. The stream from each

Re: BackPressure handling

2018-01-02 Thread Timo Walther
Hi Vishal, your assumptions sound reasonable to me. The community is currently working on a more fine-grained back pressuring with credit-based flow control. It is on the roamap for 1.5 [1]/[2]. I will loop in Nico that might tell you more about the details. Until then I guess you have to

BackPressure handling

2018-01-02 Thread Vishal Santoshi
I did a simulation on session windows ( in 2 modes ) and let it rip for about 12 hours 1. Replay where a kafka topic with retention of 7 days was the source ( earliest ) 2. Start the pipe with kafka source ( latest ) I saw results that differed dramatically. On replay the pipeline stalled after

Re: S3 Access in eu-central-1

2018-01-02 Thread Nico Kruber
Sorry for the late response, but I finally got around adding this workaround to our "common issues" section with PR https://github.com/apache/flink/pull/5231 Nico On 29/11/17 09:31, Ufuk Celebi wrote: > Hey Dominik, > > yes, we should definitely add this to the docs. > > @Nico: You recently