Hi Guys,

This is what my pipeline looks like (DataStream API):

[1] Read the data from a CSV file
[2] keyBy on some id
[3] Do the enrichment and write the result to the DB

Step [1] reads the data sequentially because the source has a parallelism
of 1; the other operators run with the default parallelism.

I want to generate a response (ack) once all the data in the file has been
processed. How can I achieve this?

One solution I can think of is to append a dummy EOF record to the file and
to add a field that has the same value for every record in that file. Doing
a keyBy on this field ensures that all records of the file are sent to a
single slot, so when the dummy EOF record is read I can generate the
response/ack.
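
To make the EOF idea above concrete, here is a rough sketch in plain Java
with no Flink dependency. The names Row, AckTracker and run are made up for
illustration; AckTracker only stands in for whatever keyed operator would
do the enrichment. It assumes every row of a file, including the EOF
marker, reaches the same instance in file order, which is what the keyBy on
the shared field is meant to guarantee.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class EofAckSketch {
    // Hypothetical record type: fileId is the field shared by every row of one file.
    static final class Row {
        final String fileId;
        final String payload;
        final boolean eof; // true only for the dummy record appended at the end of the file

        Row(String fileId, String payload, boolean eof) {
            this.fileId = fileId;
            this.payload = payload;
            this.eof = eof;
        }
    }

    // Stands in for the keyed operator: counts rows per fileId and emits an ack on EOF.
    static final class AckTracker {
        private final Map<String, Integer> processed = new HashMap<>();

        // Returns an ack message when the EOF marker arrives, otherwise null.
        String process(Row row) {
            if (row.eof) {
                Integer count = processed.remove(row.fileId);
                return "ACK file=" + row.fileId + " rows=" + (count == null ? 0 : count);
            }
            // The enrichment and the DB write would happen here.
            processed.merge(row.fileId, 1, Integer::sum);
            return null;
        }
    }

    // Feeds the rows to one tracker in order and collects the acks it produces.
    static List<String> run(List<Row> rows) {
        AckTracker tracker = new AckTracker();
        List<String> acks = new ArrayList<>();
        for (Row row : rows) {
            String ack = tracker.process(row);
            if (ack != null) {
                acks.add(ack);
            }
        }
        return acks;
    }
}
```

The drawback I see is that keying on a per-file value sends the whole file
through one parallel instance, so the enrichment for that file is no longer
spread across slots.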

Is there a better way to handle this?


Regards,
Vinay Patil
