[ 
https://issues.apache.org/jira/browse/HDFS-13186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16786039#comment-16786039
 ] 

Steve Loughran commented on HDFS-13186:
---------------------------------------

We are going to have an API to request an MPU on a path of an existing 
filesystem/filecontext. This is needed because the service loader API is 
brittle, inflexible and cannot handle things like proxyfs, viewfs etc.

bq. Would it make sense to make a checksumFS MPU that throws upon creation?

But how do you load it through the service loader API, as that's bound to the 
FS schema? And file:// matches both RawLocal and the checksummed FS.

bq. but using inheritance to remove functionality as checksum FS is doing is 
already broken.

Not sure how else you'd do it.

The HADOOP-15691 PathCapabilities patch is intended to allow callers to probe 
for a feature being available before making the API call. This'd let you go:

{code}
if (fs.hasPathCapability("fs.path.multipart-upload", dest)) {
  uploader = fs.createMultipartUpload(dest);
  ...
} else {
  // fallback
}
{code}

Bear in mind I also want to move the MPU API to async block-upload and 
complete calls. For the classic local and HDFS stores, these would actually be 
done in the current thread. For S3 they'd run in a thread pool, so you could 
trivially kick off a parallel upload of blocks from a single thread without 
even knowing that the FS impl worked that way.
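
To make that concrete, here is a rough caller-side sketch of what an async part 
upload could look like; the {{putPartAsync()}} / {{complete()}} names, the 
{{uploader}} variable and the handle shapes are illustrative assumptions, not 
the committed API:

{code:java}
// Illustrative sketch only: putPartAsync()/complete() are assumed names;
// "blocks" is a list of InputStreams, one per block to upload.
List<CompletableFuture<PartHandle>> futures = new ArrayList<>();
int partNumber = 1;
for (InputStream block : blocks) {
  // each call returns at once; S3A would run the upload in its thread pool,
  // local/HDFS could execute inline and hand back an already-completed future
  futures.add(uploader.putPartAsync(block, partNumber++, uploadId));
}
// wait for every part, collect the handles, then issue the single complete call
List<PartHandle> handles = futures.stream()
    .map(CompletableFuture::join)
    .collect(Collectors.toList());
uploader.complete(dest, handles, uploadId);
{code}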

[~fabbri] another use of this is that it effectively provides a stable API for 
the S3A committers to move to, one which could even be accessed through filter 
filesystems if needed, as well as a high-speed distcp. Currently distcp upload 
of very large files from HDFS to S3 is really slow because it's done a file at 
a time; this will enable block-at-a-time uploads.



> [PROVIDED Phase 2] Multipart Uploader API
> -----------------------------------------
>
>                 Key: HDFS-13186
>                 URL: https://issues.apache.org/jira/browse/HDFS-13186
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Ewan Higgs
>            Assignee: Ewan Higgs
>            Priority: Major
>             Fix For: 3.2.0
>
>         Attachments: HDFS-13186.001.patch, HDFS-13186.002.patch, 
> HDFS-13186.003.patch, HDFS-13186.004.patch, HDFS-13186.005.patch, 
> HDFS-13186.006.patch, HDFS-13186.007.patch, HDFS-13186.008.patch, 
> HDFS-13186.009.patch, HDFS-13186.010.patch
>
>
> To write files in parallel to an external storage system as in HDFS-12090, 
> there are two approaches:
>  # Naive approach: use a single datanode per file that copies blocks locally 
> as it streams data to the external service. This requires a copy for each 
> block inside the HDFS system and then a copy for the block to be sent to the 
> external system.
>  # Better approach: Single point (e.g. Namenode or SPS style external client) 
> and Datanodes coordinate in a multipart - multinode upload.
> This system needs to work with multiple back ends and needs to coordinate 
> across the network. So we propose an API that resembles the following:
> {code:java}
> public UploadHandle multipartInit(Path filePath) throws IOException;
> public PartHandle multipartPutPart(InputStream inputStream,
>     int partNumber, UploadHandle uploadId) throws IOException;
> public void multipartComplete(Path filePath,
>     List<Pair<Integer, PartHandle>> handles, 
>     UploadHandle multipartUploadId) throws IOException;{code}
> Here, UploadHandle and PartHandle are opaque handles in the vein of 
> PathHandle, so they can be serialized and deserialized in the hadoop-hdfs 
> project without knowledge of how to deserialize e.g. S3A's version of an 
> UploadHandle and PartHandle.
> In an object store such as S3A, the implementation is straightforward. In 
> the case of writing multipart/multinode to HDFS, we can write each block as a 
> file part. The complete call will perform a concat on the blocks.
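
For the HDFS case described above, a minimal sketch of how a concat-based 
multipartComplete() could be wired up, using the signatures proposed in the 
description. This is not the attached patch: the staging layout, the 
partPath() helper and the Pair accessors are hypothetical, only 
FileSystem.concat()/rename() are real calls.

{code:java}
// Sketch only: partPath() is a hypothetical helper resolving a PartHandle to
// the staging file holding that part; Pair is assumed to expose getKey()/getValue().
public void multipartComplete(Path filePath,
    List<Pair<Integer, PartHandle>> handles,
    UploadHandle multipartUploadId) throws IOException {
  // order the parts by part number so concatenation matches the upload order
  handles.sort((a, b) -> Integer.compare(a.getKey(), b.getKey()));
  Path first = partPath(handles.get(0).getValue());
  Path[] rest = new Path[handles.size() - 1];
  for (int i = 1; i < handles.size(); i++) {
    rest[i - 1] = partPath(handles.get(i).getValue());
  }
  // concat the remaining part files onto the first, then move it into place
  fs.concat(first, rest);
  fs.rename(first, filePath);
}
{code}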


