[jira] [Commented] (FLINK-19481) Add support for a flink native GCS FileSystem

Xintong Song (Jira) Sun, 16 May 2021 19:08:09 -0700


    [ 
https://issues.apache.org/jira/browse/FLINK-19481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17345823#comment-17345823
 ]


Xintong Song commented on FLINK-19481:
--------------------------------------

Hi [~jgrier], thanks for your input.

I have noticed your earlier comment. However, that comment predates Galen's 
PR, and I think things are a bit different with this PR now.
- IIUC, what you've described are the benefits of a native implementation 
*compared to the current status*, where Flink does not provide any specific 
support for GS and users have to deal with the Hadoop dependencies and Flink's 
FS abstractions by themselves. 
- What I'm trying to understand are the benefits *compared to the status once 
Galen's PR is merged*. The PR provides an out-of-the-box GS FS implementation, 
so that users no longer need to deal with the dependencies and abstractions. 
In that case, is it still beneficial for this implementation to be built, 
internally, directly on top of the GCS native SDK, rather than leveraging the 
existing Hadoop stack provided by the Google storage connector?

> Add support for a flink native GCS FileSystem
> ---------------------------------------------
>
>                 Key: FLINK-19481
>                 URL: https://issues.apache.org/jira/browse/FLINK-19481
>             Project: Flink
>          Issue Type: Improvement
>          Components: Connectors / FileSystem, FileSystems
>    Affects Versions: 1.12.0
>            Reporter: Ben Augarten
>            Priority: Minor
>              Labels: auto-deprioritized-major
>
> Currently, GCS is supported, but only through the Hadoop connector [1].
>  
> The objective of this improvement is to add support for checkpointing to 
> Google Cloud Storage with the Flink File System.
>  
> This would allow the `gs://` scheme to be used for savepointing and 
> checkpointing. Long term, it would be nice if we could use the GCS FileSystem 
> as a source and sink in Flink jobs as well. 
>  
> Long term, I hope that implementing a Flink-native GCS FileSystem will 
> simplify usage of GCS, because the Hadoop FileSystem ends up bringing in many 
> unshaded dependencies.
>  
> [1] https://github.com/GoogleCloudDataproc/hadoop-connectors



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
