Hello everyone,

I need to join several files to perform some processing. The DataSet API is a perfect fit for this, and I already have it working when I read the files in batch (CSV).
However, in the production environment I will receive those files as Kafka messages (one message = one line of a file). So I am considering using a global window with a custom trigger that fires on an end-of-file message, followed by a process window function.

But I cannot get very far with that, because the process window function is a single function and chaining several processing steps will be a pain. I also don't think that emitting a DataStream and adding a window/trigger on EOF before every process function is a good idea.

What I would like is to work in a bounded way once I have received all of my elements (after the trigger on the global window fires), as with the DataSet API, since I will join on my whole dataset.

I thought it might be a good idea to go for the Table API and a group window, but it seems you cannot have a custom trigger or a global group window on a table (like the global window on a DataStream)?

The best alternative would be to create a DataSet as the result of my process window function, but I don't think this is possible. Is it?

Best regards,
Bastien
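
P.S. To make the intended behaviour concrete, here is a minimal plain-Python sketch. It is not Flink code: it only mimics a global window whose trigger fires and purges on an end-of-file message, and then runs a bounded join on the complete files. The `<EOF>` marker, the `(file_id, line)` message shape, the file names and the join on the first CSV column are all assumptions made up for illustration.

```python
from collections import defaultdict

# Hypothetical end-of-file marker; the real message format is not specified.
EOF = "<EOF>"


def collect_complete_files(messages):
    """Buffer (file_id, line) messages and yield (file_id, lines) on EOF.

    Mimics a global window with a trigger that fires and purges on EOF.
    """
    buffers = defaultdict(list)
    for file_id, line in messages:
        if line == EOF:
            # Fire and purge: emit the whole file and drop its buffer.
            yield file_id, buffers.pop(file_id, [])
        else:
            buffers[file_id].append(line)


def join_on_first_column(left_lines, right_lines, sep=","):
    """Inner join of two complete files on their first CSV column."""
    index = defaultdict(list)
    for row in right_lines:
        key, _, rest = row.partition(sep)
        index[key].append(rest)
    joined_rows = []
    for row in left_lines:
        key, _, rest = row.partition(sep)
        for other in index.get(key, []):
            joined_rows.append((key, rest, other))
    return joined_rows


# One message = one line of a file; lines of both files arrive interleaved.
messages = [
    ("users", "1,alice"),
    ("orders", "1,book"),
    ("users", "2,bob"),
    ("orders", "2,pen"),
    ("orders", "1,lamp"),
    ("users", EOF),
    ("orders", EOF),
]

# Everything after this point is bounded, as it would be with the DataSet API.
files = dict(collect_complete_files(messages))
joined = join_on_first_column(files["users"], files["orders"])
```

The point of the sketch is the split: the streaming part only buffers until EOF, and everything after that operates on complete, bounded collections. That second part is what I would like to express with something like the DataSet API.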