Etienne Chauchot created FLINK-24921:
----------------------------------------

             Summary: FileSourceSplit should not be visible in the user API in 
ParquetColumnarRowInputFormat
                 Key: FLINK-24921
                 URL: https://issues.apache.org/jira/browse/FLINK-24921
             Project: Flink
          Issue Type: Improvement
          Components: Connectors / FileSystem
            Reporter: Etienne Chauchot


_FileSourceSplit_ is an internal class and should not be visible in the user 
API, as it is 
[here|https://github.com/apache/flink/blob/6f2d8fe3007464343c5312e27612be448b415148/flink-formats/flink-parquet/src/test/java/org/apache/flink/formats/parquet/ParquetColumnarRowInputFormatTest.java#L235].
 Because _FileSourceSplit_ surfaces in the API, users are also pushed toward 
using the parameterized class as a raw type, as 
[here|https://github.com/apache/flink/blob/6f2d8fe3007464343c5312e27612be448b415148/flink-formats/flink-parquet/src/test/java/org/apache/flink/formats/parquet/ParquetColumnarRowInputFormatTest.java#L407].

It would be better to make the Parquet format a non-parameterized class, as is 
done for the Hive connector:

_class HiveInputFormat implements BulkFormat<RowData, HiveSourceSplit>_

rather than

_class ParquetColumnarRowInputFormat<SplitT extends FileSourceSplit>_
 _extends ParquetVectorizedInputFormat<RowData, SplitT>_ 
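
To illustrate the difference between the two declarations above, here is a minimal sketch. It does not use the real Flink classes: _FileSourceSplit_ and _BulkFormat_ are simplified stand-ins, and _GenericParquetFormat_, _FixedParquetFormat_, _readFirst_ and the _String_ record type are hypothetical names chosen only for this example. It shows that with the parameterized shape the caller has to name the split type (or fall back to a raw type), whereas with the fixed shape the caller never writes it.

```java
public class SplitTypeSketch {

    // Simplified stand-in for Flink's FileSourceSplit (assumption, not the real class).
    static class FileSourceSplit {
        final String path;

        FileSourceSplit(String path) {
            this.path = path;
        }
    }

    // Simplified stand-in for Flink's BulkFormat (assumption, not the real interface).
    interface BulkFormat<T, SplitT extends FileSourceSplit> {
        T readFirst(SplitT split);
    }

    // Parameterized shape: the split type is a type parameter of the format,
    // so every caller has to supply it or use the raw type.
    static class GenericParquetFormat<SplitT extends FileSourceSplit>
            implements BulkFormat<String, SplitT> {
        @Override
        public String readFirst(SplitT split) {
            return "row from " + split.path;
        }
    }

    // Fixed shape, analogous to the HiveInputFormat declaration:
    // the split type is bound once in the class declaration.
    static class FixedParquetFormat implements BulkFormat<String, FileSourceSplit> {
        @Override
        public String readFirst(FileSourceSplit split) {
            return "row from " + split.path;
        }
    }

    static String readWithGeneric(String path) {
        // The caller must spell out FileSourceSplit here to avoid a raw type.
        GenericParquetFormat<FileSourceSplit> format = new GenericParquetFormat<>();
        return format.readFirst(new FileSourceSplit(path));
    }

    static String readWithFixed(String path) {
        // No type argument is needed when declaring the format.
        FixedParquetFormat format = new FixedParquetFormat();
        return format.readFirst(new FileSourceSplit(path));
    }

    public static void main(String[] args) {
        System.out.println(readWithGeneric("a.parquet"));
        System.out.println(readWithFixed("a.parquet"));
    }
}
```

Both helper methods return the same value in this sketch; the only difference is what the caller has to write when declaring the format.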

 



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
