Flink 1.3.1 (I'm waiting 1.5 before upgrading..) On Fri, May 4, 2018 at 2:50 PM, Amit Jain <aj201...@gmail.com> wrote:
> Hi Flavio, > > Which version of Flink are you using? > > -- > Thanks, > Amit > > On Fri, May 4, 2018 at 6:14 PM, Flavio Pompermaier <pomperma...@okkam.it> > wrote: > > Hi all, > > I've a Flink batch job that reads a parquet dataset and then applies 2 > > flatMap to it (see pseudocode below). > > The problem is that this dataset is quite big and Flink duplicates it > before > > sending the data to these 2 operators (I've guessed this from the > doubling > > amount of sent bytes) . > > Is there a way to avoid this behaviour? > > > > ------------------------------------------------------- > > Here's the pseudo code of my job: > > > > DataSet X = readParquetDir(); > > X1 = X.flatMap(...); > > X2 = X.flatMap(...); > > > > Best, > > Flavio >