Etienne Chauchot created FLINK-24921:
----------------------------------------
Summary: FileSourceSplit should not be visible in the user API in
ParquetColumnarRowInputFormat
Key: FLINK-24921
URL: https://issues.apache.org/jira/browse/FLINK-24921
Project: Flink
Issue Type: Improvement
Components: Connectors / FileSystem
Reporter: Etienne Chauchot
_FileSourceSplit_ is an internal class that should not be visible in the user
API, as it currently is
[here|https://github.com/apache/flink/blob/6f2d8fe3007464343c5312e27612be448b415148/flink-formats/flink-parquet/src/test/java/org/apache/flink/formats/parquet/ParquetColumnarRowInputFormatTest.java#L235].
Because _FileSourceSplit_ surfaces in the API, users are also pushed toward
using the parameterized class as a raw type, as
[here|https://github.com/apache/flink/blob/6f2d8fe3007464343c5312e27612be448b415148/flink-formats/flink-parquet/src/test/java/org/apache/flink/formats/parquet/ParquetColumnarRowInputFormatTest.java#L407].
It would be better to make the Parquet format a non-parameterized class, as is
done for the Hive connector:
_class HiveInputFormat implements BulkFormat<RowData, HiveSourceSplit>_
rather than
_class ParquetColumnarRowInputFormat<SplitT extends FileSourceSplit> extends ParquetVectorizedInputFormat<RowData, SplitT>_
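
To illustrate the difference between the two declaration styles, here is a minimal, self-contained sketch. It does not use the real Flink classes: _Split_, _Format_, _GenericFormat_ and _FixedFormat_ are hypothetical stand-ins for _FileSourceSplit_, _BulkFormat_, the current parameterized Parquet format and the proposed non-parameterized one. It only shows how the split type shows up (or not) at the call site.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration only; these are NOT the Flink classes.
public class SplitTypeSketch {

    /** Stand-in for the internal split type (FileSourceSplit in the issue). */
    static class Split {
        final String path;

        Split(String path) {
            this.path = path;
        }
    }

    /** Stand-in for a format interface parameterized on the split type. */
    interface Format<T, S extends Split> {
        List<T> read(S split);
    }

    /** Current shape: the split type leaks out as a class type parameter. */
    static class GenericFormat<S extends Split> implements Format<String, S> {
        @Override
        public List<String> read(S split) {
            List<String> rows = new ArrayList<>();
            rows.add("row from " + split.path);
            return rows;
        }
    }

    /** Proposed shape: the split type is fixed once, in the class declaration. */
    static class FixedFormat implements Format<String, Split> {
        @Override
        public List<String> read(Split split) {
            List<String> rows = new ArrayList<>();
            rows.add("row from " + split.path);
            return rows;
        }
    }

    static String readWithGeneric(String path) {
        // The caller has to name the internal split type in the declaration,
        // or drop the type argument and fall back to a raw type.
        GenericFormat<Split> format = new GenericFormat<>();
        return format.read(new Split(path)).get(0);
    }

    static String readWithFixed(String path) {
        // No type argument to supply, so there is no raw-type temptation.
        FixedFormat format = new FixedFormat();
        return format.read(new Split(path)).get(0);
    }

    public static void main(String[] args) {
        System.out.println(readWithGeneric("a.parquet"));
        System.out.println(readWithFixed("a.parquet"));
    }
}
```

Both variants read the same data; the only difference is that declaring a _FixedFormat_ variable never requires the user to write a split type argument. In this sketch the caller still constructs the split by hand, which is a simplification made to keep the example runnable on its own.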
--
This message was sent by Atlassian Jira
(v8.20.1#820001)