Hey!

You can try setting the minimum split size of the file input format to a
value larger than any of your files, so that only one split gets created
per file.

You can probably reuse the delimited input format if you choose a
delimiter that does not occur as a character sequence anywhere in the file.
But just reading the file stream into a string builder (through a reader
that decodes the charset) is probably quite straightforward as well.
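
A rough sketch of that second variant, using only plain java.io and no
Flink classes. The names WholeFileReader and readFully are made up for
illustration, and the sketch assumes you already have the file's
InputStream and know its charset:

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of any library.
public class WholeFileReader {

    // Reads the entire stream into one String, decoding bytes with the given charset.
    public static String readFully(InputStream in, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        // The reader does the charset decoding; try-with-resources closes the stream.
        try (Reader reader = new BufferedReader(new InputStreamReader(in, charset))) {
            char[] buf = new char[8192];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a real file stream, so the sketch runs on its own.
        byte[] bytes = "first line\nsecond line: \u00e4\u00f6\u00fc\n".getBytes(StandardCharsets.UTF_8);
        String content = readFully(new ByteArrayInputStream(bytes), StandardCharsets.UTF_8);
        System.out.print(content);
    }
}
```

Inside a custom input format you would call something like readFully once
per split. That only gives you the whole file if each file really ends up
in a single split, and if the file fits in memory as one string.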

It may make sense to add an option to the file input format to not split up
files...

Stephan
