Hi Andrei,

With the current version of Flink, there is no general solution to this
problem. The upcoming version 1.5.0 of Flink adds a feature called
credit-based flow control, which might help here.
I'm adding @Piotr to this thread, who knows more about the details of this
new feature.

Best, Fabian

2018-05-18 11:59 GMT+02:00 Andrei Shumanski <and...@shumanski.com>:

> Hi,
>
> Right now it is a Kafka source, but I had the same issue when reading data
> from the local FS.
>
> It looks like a common problem for many (all?) sources: when the incoming
> records are very small (paths to large archives) but each entry requires
> significant time to process (unpack, parse, etc.), Flink detects the back
> pressure with a delay, and too much data becomes part of the first
> transaction.
>
> --
> Best regards,
> Andrei Shumanski
>
> On Fri, May 18, 2018 at 11:44 AM, makeyang <riverbuild...@hotmail.com> wrote:
>
>> Andrei Shumanski:
>> which source are you using?
>>
>> --
>> Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
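For readers unfamiliar with the feature mentioned above: the rough idea of
credit-based flow control is that a receiver announces how many free buffers
("credits") it has, and the sender transmits at most that many records, so a
slow consumer throttles the producer immediately instead of only after
intermediate buffers fill up. Below is a minimal, self-contained Java sketch
of that idea. All class and method names here are invented for illustration;
this is not Flink's actual network-stack API.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Simplified illustration of credit-based flow control: the receiver
// announces its free buffer count as "credits", and the sender ships
// at most that many records per round. Names are hypothetical.
public class CreditFlowSketch {

    // Receiver with a fixed buffer pool; credits == free buffers.
    static class Receiver {
        private final int capacity;
        private final Queue<String> buffers = new ArrayDeque<>();

        Receiver(int capacity) { this.capacity = capacity; }

        // Number of records the sender is allowed to ship right now.
        int credits() { return capacity - buffers.size(); }

        void accept(String record) { buffers.add(record); }

        // Simulates the (slow) downstream work; frees one buffer/credit.
        String process() { return buffers.poll(); }
    }

    // The sender only ships as many records as the receiver has credits
    // for, so back pressure from a slow receiver is visible at once
    // rather than after network buffers have filled with in-flight data.
    static int send(Queue<String> pending, Receiver rx) {
        int sent = 0;
        while (!pending.isEmpty() && rx.credits() > 0) {
            rx.accept(pending.poll());
            sent++;
        }
        return sent;
    }

    public static void main(String[] args) {
        Queue<String> pending = new ArrayDeque<>();
        for (int i = 0; i < 10; i++) {
            // Small records (paths) whose processing is expensive,
            // as in the scenario described in this thread.
            pending.add("archive-" + i + ".tar.gz");
        }

        Receiver rx = new Receiver(2);   // only 2 free buffers downstream
        int first = send(pending, rx);
        System.out.println("sent in first round: " + first);  // 2, not 10

        rx.process();                    // slow consumer frees one buffer
        int second = send(pending, rx);
        System.out.println("sent after one record processed: " + second); // 1
    }
}
```

In this toy model, only two of the ten pending records are in flight at any
time; the remaining eight stay at the sender, which is the behavior that
limits how much data ends up in the first transaction.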