Hi all, I would like to discuss CSV parsing with the dev group. Ever since I started using Flink with CSV files I have kept running into small problems here and there, and the new tickets about CSV parsing seem to confirm that this part is still problematic. In my production jobs I gave up on Flink's CSV parsing in favour of Apache commons-csv, and it works great: it is highly configurable and robust. A working example is available at [1].
So why not use that library directly and, if improvements are needed to speed up the parsing, contribute them back to another Apache library? Has anyone ever compared the performance of the two parsers?

Best,
Flavio

[1] https://github.com/okkam-it/flink-examples/blob/master/src/main/java/it/okkam/datalinks/batch/flink/datasourcemanager/importers/Csv2RowExample.java