Fokko Driesprong created FLINK-11838:
----------------------------------------

             Summary: Create RecoverableWriter for GCS
                 Key: FLINK-11838
                 URL: https://issues.apache.org/jira/browse/FLINK-11838
             Project: Flink
          Issue Type: Improvement
          Components: FileSystems
    Affects Versions: 1.8.0
            Reporter: Fokko Driesprong
            Assignee: Fokko Driesprong


GCS supports resumable uploads, which we can use to create a 
RecoverableWriter similar to the S3 implementation:
https://cloud.google.com/storage/docs/json_api/v1/how-tos/resumable-upload

After using the Hadoop-compatible interface: 
https://github.com/apache/flink/pull/7519
we noticed that the current implementation relies heavily on renaming files 
on commit: 
https://github.com/apache/flink/blob/master/flink-filesystems/flink-hadoop-fs/src/main/java/org/apache/flink/runtime/fs/hdfs/HadoopRecoverableFsDataOutputStream.java#L233-L259
This is suboptimal on an object store such as GCS, where a rename is not an 
atomic metadata operation but a copy followed by a delete. Therefore, we would 
like to implement a more GCS-native RecoverableWriter.
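
To illustrate the idea, here is a minimal sketch of how a writer could resume 
from a checkpointed upload session instead of renaming a temporary file on 
commit. It is a toy in-memory model, not the GCS client API and not Flink's 
RecoverableWriter interface: InMemoryObjectStore, Resumable, startSession, 
upload, committedOffset and finalizeSession are all hypothetical names chosen 
for this example. The real resumable upload protocol is the HTTP-based one 
described on the page linked above.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

/** Toy model of commit-by-finalize (hypothetical names, not a real GCS or Flink API). */
class ResumableUploadSketch {

    /** In-memory stand-in for an object store with resumable upload sessions. */
    static final class InMemoryObjectStore {
        private final Map<String, ByteArrayOutputStream> sessions = new HashMap<>();
        private final Map<String, String> sessionTargets = new HashMap<>();
        private final Map<String, byte[]> objects = new HashMap<>();
        private int nextId = 0;

        /** Opens an upload session for the final object name; nothing is visible yet. */
        String startSession(String objectName) {
            String id = "session-" + (nextId++);
            sessions.put(id, new ByteArrayOutputStream());
            sessionTargets.put(id, objectName);
            return id;
        }

        /** Appends a chunk at the given offset and returns the new persisted offset. */
        long upload(String sessionId, long offset, byte[] chunk) {
            ByteArrayOutputStream buf = sessions.get(sessionId);
            if (buf == null) {
                throw new IllegalStateException("Unknown session: " + sessionId);
            }
            if (offset != buf.size()) {
                throw new IllegalArgumentException(
                        "Expected offset " + buf.size() + " but got " + offset);
            }
            buf.write(chunk, 0, chunk.length);
            return buf.size();
        }

        /** Number of bytes the store has persisted for this session so far. */
        long committedOffset(String sessionId) {
            ByteArrayOutputStream buf = sessions.get(sessionId);
            if (buf == null) {
                throw new IllegalStateException("Unknown session: " + sessionId);
            }
            return buf.size();
        }

        /** Commit: the object appears under its final name, with no rename step. */
        void finalizeSession(String sessionId) {
            ByteArrayOutputStream buf = sessions.remove(sessionId);
            if (buf == null) {
                throw new IllegalStateException("Unknown session: " + sessionId);
            }
            objects.put(sessionTargets.remove(sessionId), buf.toByteArray());
        }

        /** Returns the object's bytes, or null if it has not been committed. */
        byte[] read(String objectName) {
            return objects.get(objectName);
        }
    }

    /** State a checkpoint would have to carry in order to resume the upload. */
    static final class Resumable {
        final String sessionId;
        final long offset;

        Resumable(String sessionId, long offset) {
            this.sessionId = sessionId;
            this.offset = offset;
        }
    }

    static String demo() {
        InMemoryObjectStore store = new InMemoryObjectStore();

        // Write the first chunk and record the session state, as a checkpoint would.
        String session = store.startSession("part-0");
        long offset = store.upload(session, 0, "hello ".getBytes(StandardCharsets.UTF_8));
        Resumable checkpoint = new Resumable(session, offset);

        // Simulated failure and recovery: only the checkpointed state is used from here on.
        if (store.committedOffset(checkpoint.sessionId) != checkpoint.offset) {
            throw new IllegalStateException("Store and checkpoint disagree on the offset");
        }
        store.upload(checkpoint.sessionId, checkpoint.offset,
                "world".getBytes(StandardCharsets.UTF_8));

        // Commit by finalizing the session rather than renaming a temporary file.
        store.finalizeSession(checkpoint.sessionId);
        return new String(store.read("part-0"), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In this model the object only becomes readable under its final name once the 
session is finalized, so the commit does not need a copy-and-delete rename. 
How the real writer should handle data uploaded after the last checkpoint is 
left open here.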



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
