[jira] [Commented] (FLINK-19802) Let BulkFormat createReader and restoreReader methods accept Splits directly

2020-11-08 Thread Steven Zhen Wu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-19802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17228338#comment-17228338
 ] 

Steven Zhen Wu commented on FLINK-19802:


[~sewen] Sorry, I didn't mean that we need this before the 1.12.0 release. Yes, 
I am currently also thinking about just reusing some components, such as the 
BulkFormat/reader.

> Let BulkFormat createReader and restoreReader methods accept Splits directly
> 
>
> Key: FLINK-19802
> URL: https://issues.apache.org/jira/browse/FLINK-19802
> Project: Flink
> Issue Type: Sub-task
> Components: Connectors / FileSystem
> Reporter: Stephan Ewen
> Assignee: Stephan Ewen
> Priority: Major
> Fix For: 1.12.0
>
>
> To support sources where the splits communicate additional information, the 
> BulkFormats should accept a generic split type, instead of path/offset/length 
> from the splits directly.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-19802) Let BulkFormat createReader and restoreReader methods accept Splits directly

2020-11-08 Thread Stephan Ewen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-19802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17228018#comment-17228018
 ] 

Stephan Ewen commented on FLINK-19802:

I don't think we'll have the time to investigate this for the 1.12 release.
We may be able to relax this later, or take a different approach in which the 
Iceberg Source does not extend from the FileSource at all but reuses some 
components (maybe the reader?).

Until then, can the splits in the Iceberg source simply put "dummy values" in 
the fields inherited from {{FileSourceSplit}}?
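
A minimal sketch of what this "dummy values" workaround could look like. The classes below are simplified, hypothetical stand-ins written for illustration; they are not Flink's real {{FileSourceSplit}} or any actual Iceberg class, and the field names and placeholder values are assumptions.

```java
// Simplified stand-in for a file-based split (assumption: NOT Flink's real FileSourceSplit).
class FileSourceSplitStandIn {
    private final String id;
    private final String path;
    private final long offset;
    private final long length;

    FileSourceSplitStandIn(String id, String path, long offset, long length) {
        this.id = id;
        this.path = path;
        this.offset = offset;
        this.length = length;
    }

    String id() { return id; }
    String path() { return path; }
    long offset() { return offset; }
    long length() { return length; }
}

// Hypothetical Iceberg-style split: it must extend the file split type,
// so it fills the inherited fields with placeholders it never reads.
class IcebergSplitWithDummies extends FileSourceSplitStandIn {
    static final String DUMMY_PATH = "dummy://unused"; // placeholder value, assumption

    private final String scanTask; // the information the split actually carries

    IcebergSplitWithDummies(String id, String scanTask) {
        super(id, DUMMY_PATH, 0L, 0L);
        this.scanTask = scanTask;
    }

    String scanTask() { return scanTask; }
}

class DummyValueSketch {
    // A reader that knows about the subtype ignores the inherited dummy fields.
    static String describe(FileSourceSplitStandIn split) {
        if (split instanceof IcebergSplitWithDummies) {
            return "iceberg:" + ((IcebergSplitWithDummies) split).scanTask();
        }
        return "file:" + split.path();
    }

    public static void main(String[] args) {
        System.out.println(describe(new IcebergSplitWithDummies("split-1", "scan-task-42")));
        System.out.println(describe(new FileSourceSplitStandIn("split-2", "/data/a.orc", 0L, 64L)));
    }
}
```

The cost of this approach is that any code reading path/offset/length from such a split sees the placeholders, so it only works if the reader knows to ignore them.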



[jira] [Commented] (FLINK-19802) Let BulkFormat createReader and restoreReader methods accept Splits directly

2020-11-06 Thread Steven Zhen Wu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-19802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17227644#comment-17227644
 ] 

Steven Zhen Wu commented on FLINK-19802:


[~sewen] [~lzljs3620320] To make the `BulkFormat` truly generic and reusable by 
the Iceberg source, can we avoid requiring `SplitT` to extend `FileSourceSplit` 
and just leave it as a pure generic type `SplitT`? I think `IcebergSourceSplit` 
shouldn't extend `FileSourceSplit`, because that class contains many fields that 
are not meaningful for `IcebergSourceSplit` (e.g. filePath, offset, length). 
Iceberg already captures that information in its own data structures.
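
A rough sketch of the idea, assuming a format interface whose split type parameter has no upper bound. All type names here are hypothetical stand-ins for illustration; this is not the actual `BulkFormat` signature.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a file-based split (not Flink's real FileSourceSplit).
class FileSplitSketch {
    final String path;
    final long offset;
    final long length;

    FileSplitSketch(String path, long offset, long length) {
        this.path = path;
        this.offset = offset;
        this.length = length;
    }
}

// Hypothetical Iceberg-style split that carries only its own description
// of the data to read; it has no path/offset/length fields at all.
class IcebergSplitSketch {
    final List<String> dataFiles;

    IcebergSplitSketch(List<String> dataFiles) {
        this.dataFiles = dataFiles;
    }
}

// Format with an unbounded split type: SplitT does not extend any file split class.
interface GenericFormatSketch<T, SplitT> {
    List<T> read(SplitT split);
}

class GenericSplitSketch {
    // A file-based format reads path/offset/length from its own split type.
    static final GenericFormatSketch<String, FileSplitSketch> FILE_FORMAT =
            split -> Collections.singletonList(split.path + "@" + split.offset + "+" + split.length);

    // An Iceberg-style format uses a completely unrelated split type.
    static final GenericFormatSketch<String, IcebergSplitSketch> ICEBERG_FORMAT =
            split -> new ArrayList<>(split.dataFiles);

    public static void main(String[] args) {
        System.out.println(FILE_FORMAT.read(new FileSplitSketch("f.orc", 0L, 128L)));
        System.out.println(ICEBERG_FORMAT.read(
                new IcebergSplitSketch(Collections.singletonList("data-0001.parquet"))));
    }
}
```

With an unbounded `SplitT`, the two split types share no base class, so the Iceberg-style split does not inherit fields it has no use for.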
