Hello,

I'm proposing to remove the Twitter-InputFormat in FLINK-6710 <https://issues.apache.org/jira/browse/FLINK-6710>; the open PR is here <https://github.com/apache/flink/pull/3984>. The PR currently has a +1 from Robert, but Timo raised some concerns, saying that the format is useful for prototyping, and advised me to start a discussion on the ML.

This format is a DelimitedInputFormat that reads JSON objects and turns them into a custom tweet class. I believe this format doesn't provide much value to Flink; there's nothing interesting about it as an InputFormat, as it is purely an exercise in /manually/ converting a JSON object into a POJO. You could just as well use ExecutionEnvironment#readTextFile(...) and put the parsing logic into a subsequent MapFunction.
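
As a rough sketch of that alternative (not code from the PR): the parsing is just a function from a line of text to a POJO. The names TweetMapSketch, Tweet, parse and parseAll are made up for illustration, a plain java.util.stream stands in for the DataSet<String> that readTextFile(...) would return, and the regex extraction is a naive stand-in for a real JSON library.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical names throughout; these are not classes from the Flink code base.
class TweetMapSketch {

    /** Minimal stand-in for the custom tweet POJO. */
    static final class Tweet {
        final long id;
        final String text;

        Tweet(long id, String text) {
            this.id = id;
            this.text = text;
        }
    }

    // Naive field extractors: they pick the first "id" / "text" field and do not
    // handle escaped quotes or nesting. A real job would use a JSON library.
    private static final Pattern ID = Pattern.compile("\"id\"\\s*:\\s*(\\d+)");
    private static final Pattern TEXT = Pattern.compile("\"text\"\\s*:\\s*\"([^\"]*)\"");

    /** The logic that would live inside the MapFunction: one line in, one Tweet out. */
    static Tweet parse(String line) {
        Matcher id = ID.matcher(line);
        Matcher text = TEXT.matcher(line);
        if (!id.find() || !text.find()) {
            throw new IllegalArgumentException("Not a tweet: " + line);
        }
        return new Tweet(Long.parseLong(id.group(1)), text.group(1));
    }

    /** Stand-in for "read text lines, then map": applies parse to every line. */
    static List<Tweet> parseAll(List<String> lines) {
        return lines.stream().map(TweetMapSketch::parse).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "{\"id\": 1, \"text\": \"first tweet\"}",
                "{\"id\": 2, \"text\": \"second tweet\"}");
        for (Tweet t : parseAll(lines)) {
            System.out.println(t.id + ": " + t.text);
        }
    }
}
```

In an actual job, I would expect the same parse logic to sit in a MapFunction<String, Tweet> applied to the result of ExecutionEnvironment#readTextFile(...), with no dedicated InputFormat involved.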

In the PR I suggested replacing it with a JsonInputFormat, but this was a misguided attempt at getting Timo to agree to the removal. A JsonInputFormat would have the same problem outlined above, as it could effectively be implemented with a one-liner map function.

So the question now is whether we want to keep it, remove it, or replace it with something more general.

Regards,
Chesnay
