Fwd: Decrease initial source read speed

2018-05-16 Thread Andrei Shumanski
Hi, I am trying to use Flink for data ingestion. Input is a Kafka topic with strings - paths to incoming archive files. The job is unpacking the archives, reads data in them, parses and stores data in another format. Everything works fine if the topic is empty at the beginning of execution and th

Re: Fwd: Decrease initial source read speed

2018-05-18 Thread makeyang
Andrei Shumanski: which source are u using? -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: Fwd: Decrease initial source read speed

2018-05-18 Thread Andrei Shumanski
Hi, Right now it is a Kafka source, but I had the same issue when reading data from local FS. It looks like a common problem for many (all?) sources. When incoming data is very small (paths to large archives) but each entry requires a significant time to process (unpack, parse, etc.) Flink detec

Re: Fwd: Decrease initial source read speed

2018-05-23 Thread Fabian Hueske
Hi Andrei, With the current version of Flink, there is no general solution to this problem. The upcoming version 1.5.0 of Flink adds a feature called credit-based flow control which might help here. I'm adding @Piotr to this thread who knows more about the details of this new feature. Best, Fabi