[ https://issues.apache.org/jira/browse/FLINK-20174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Flink Jira Bot updated FLINK-20174:
-----------------------------------
    Labels: stale-major  (was: )

> Make BulkFormat more extensible
> -------------------------------
>
>                 Key: FLINK-20174
>                 URL: https://issues.apache.org/jira/browse/FLINK-20174
>             Project: Flink
>          Issue Type: Improvement
>          Components: Connectors / FileSystem
>    Affects Versions: 1.12.0
>            Reporter: Steven Zhen Wu
>            Priority: Major
>              Labels: stale-major
>
> Right now, BulkFormat has the generic `SplitT` type extending from
> `FileSourceSplit`. We can make BulkFormat take a generic `SplitT` type
> extending from `SourceSplit` instead. This way, IcebergSourceSplit doesn't
> have to extend `FileSourceSplit`, and the Iceberg source can reuse the
> BulkFormat interface, as [~lzljs3620320] suggested. This allows the Iceberg
> source to take advantage of the high-performance
> `ParquetVectorizedInputFormat` provided by Flink.
>
> [~sewen] [~lzljs3620320] if you are on board with the change, I would be
> happy to submit a PR. Since it is a breaking change, maybe we should add it
> to the master branch only after the 1.12 release branch is cut?
>
> The other related question concerns the two APIs `createReader` and
> `restoreReader`. I understand the motivation; I am just wondering whether the
> separation is necessary. If the `SplitT` carries a checkpointed position, the
> seek operation can be handled internally by `createReader`. We could also
> define an abstract `FileSourceSplitBase` that adds a
> `getCheckpointedPosition` API to `SourceSplit`.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
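
For illustration only, the proposal in the description could be sketched roughly as below. This is a self-contained toy, not Flink code: the `SourceSplit`, `Reader`, and `BulkFormat` interfaces are simplified stand-ins for the Flink types of the same name, `FileSourceSplitBase` follows the name suggested in the issue but its shape is assumed, and `IcebergLikeSplit`, `InMemoryFormat`, and `readAll` are hypothetical names. The checkpointed position is reduced to a plain record index.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

public class BulkFormatSketch {

    /** Simplified stand-in for Flink's SourceSplit; only splitId() is modeled. */
    interface SourceSplit {
        String splitId();
    }

    /** Assumed shape of the proposed base type: a split that may carry a position. */
    interface FileSourceSplitBase extends SourceSplit {
        Optional<Long> getCheckpointedPosition(); // simplified to a record index
    }

    /** Simplified reader: returns the next record, or null when exhausted. */
    interface Reader<T> {
        T read();
    }

    /** Proposed shape: SplitT is bounded by SourceSplit, with one factory method. */
    interface BulkFormat<T, SplitT extends SourceSplit> {
        Reader<T> createReader(SplitT split);
    }

    /** Hypothetical split that does not extend any file-specific split class. */
    static final class IcebergLikeSplit implements FileSourceSplitBase {
        private final String id;
        private final List<String> records; // in-memory stand-in for file contents
        private final Long position;        // null means "never checkpointed"

        IcebergLikeSplit(String id, List<String> records, Long position) {
            this.id = id;
            this.records = records;
            this.position = position;
        }

        @Override
        public String splitId() {
            return id;
        }

        @Override
        public Optional<Long> getCheckpointedPosition() {
            return Optional.ofNullable(position);
        }
    }

    /** Toy format: the seek happens inside createReader, so no restoreReader is needed. */
    static final class InMemoryFormat implements BulkFormat<String, IcebergLikeSplit> {
        @Override
        public Reader<String> createReader(IcebergLikeSplit split) {
            // A fresh split starts at 0; a restored split resumes at its position.
            final int start = split.getCheckpointedPosition().orElse(0L).intValue();
            return new Reader<String>() {
                private int next = start;

                @Override
                public String read() {
                    return next < split.records.size() ? split.records.get(next++) : null;
                }
            };
        }
    }

    /** Drains a reader created for the given split into a list. */
    static <T, S extends SourceSplit> List<T> readAll(BulkFormat<T, S> format, S split) {
        List<T> out = new ArrayList<>();
        Reader<T> reader = format.createReader(split);
        for (T rec = reader.read(); rec != null; rec = reader.read()) {
            out.add(rec);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> data = Arrays.asList("a", "b", "c");
        InMemoryFormat format = new InMemoryFormat();
        System.out.println(readAll(format, new IcebergLikeSplit("s-0", data, null))); // [a, b, c]
        System.out.println(readAll(format, new IcebergLikeSplit("s-0", data, 2L)));   // [c]
    }
}
```

In this sketch the fresh and restored cases differ only in the split that is passed in, which is the point the description raises about merging `createReader` and `restoreReader`. Whether that trade-off suits the real interface is for the discussion on the issue to decide.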