rbalamohan opened a new pull request #1655: HADOOP-16629: support copyFile in 
s3afilesystem
URL: https://github.com/apache/hadoop/pull/1655
 
 
   This is a subtask of HADOOP-16604, which aims to provide copy functionality for 
cloud native applications. The intent of this PR is to provide copyFile(URI src, 
URI dst) functionality for S3AFileSystem (HADOOP-16629).
   
   Creating a new PR due to a merge mess-up in 
https://github.com/apache/hadoop/pull/1591.
   
   Changes w.r.t. PR #1591:
   
   1. Fixed the doc (filesystem.md).
   2. Fixed AbstractContractCopyTest.
   3. If the file already exists at the destination, the copy overwrites it.
   4. Added CompletableFuture support: `public CompletableFuture<Void> copyFile(URI srcFile, URI dstFile)`
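
   As a rough illustration of the signature and the overwrite behaviour listed above, here is a minimal sketch. `InMemoryCopyStore` is a hypothetical in-memory stand-in, not S3AFileSystem, and its error handling (an `UncheckedIOException` for a missing source) is an assumption made for the example only.

```java
import java.io.FileNotFoundException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a store exposing the copyFile signature from this PR.
// It is NOT S3AFileSystem; objects live in a map purely for illustration.
class InMemoryCopyStore {
  private final Map<URI, String> objects = new ConcurrentHashMap<>();

  void put(URI uri, String data) {
    objects.put(uri, data);
  }

  String get(URI uri) {
    return objects.get(uri);
  }

  // Same shape as the signature quoted in item 4.
  public CompletableFuture<Void> copyFile(URI srcFile, URI dstFile) {
    return CompletableFuture.runAsync(() -> {
      String data = objects.get(srcFile);
      if (data == null) {
        // Assumed failure mode for the sketch: the future completes exceptionally.
        throw new UncheckedIOException(new FileNotFoundException(srcFile.toString()));
      }
      // Item 3: an existing destination is overwritten.
      objects.put(dstFile, data);
    });
  }
}

class CopyFileSketch {
  public static void main(String[] args) {
    InMemoryCopyStore store = new InMemoryCopyStore();
    URI src = URI.create("s3a://bucket-a/data/src.txt");
    URI dst = URI.create("s3a://bucket-a/data/dst.txt");
    store.put(src, "new contents");
    store.put(dst, "stale contents");

    // join() blocks until the copy finishes; failures surface as CompletionException.
    store.copyFile(src, dst).join();
    System.out.println(store.get(dst));
  }
}
```

   Callers that do not want to block can chain `thenRun(...)` or `exceptionally(...)` on the returned future instead of calling `join()`.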
   
   CompletableFuture makes the API nicer. However, `CompletableFuture::get --> 
waitingGet` invokes `Runtime.availableProcessors` frequently. This can turn out 
to be an expensive native call depending on the workload. We can optimise this 
later if it turns out to be an issue.
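
   One way a caller could reduce the number of blocking waits is to combine many copy futures and block once. This is only a sketch of that calling pattern, not something this patch does; `BatchWaitSketch` and `allDone` are hypothetical names, and the "copies" here are placeholders that just bump a counter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;

class BatchWaitSketch {
  // Combine many copy futures into one so the caller blocks a single time
  // instead of calling get()/join() once per file.
  static CompletableFuture<Void> allDone(List<CompletableFuture<Void>> copies) {
    return CompletableFuture.allOf(copies.toArray(new CompletableFuture[0]));
  }

  public static void main(String[] args) {
    AtomicInteger completed = new AtomicInteger();
    List<CompletableFuture<Void>> copies = new ArrayList<>();
    for (int i = 0; i < 100; i++) {
      // Placeholder for a real copyFile(src, dst) call.
      copies.add(CompletableFuture.runAsync(completed::incrementAndGet));
    }
    allDone(copies).join(); // one blocking wait for the whole batch
    System.out.println("copies completed: " + completed.get());
  }
}
```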
   
   If the destination bucket is different, the relevant permissions/policies have 
to be set up already; without them the copy throws exceptions. Providing a URI 
instead of a path makes it easier to refer to a different bucket when needed. 
Since the implementation is yet to stabilize, we can make relevant changes 
in the store. 
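
   To illustrate why URIs make the bucket explicit, the sketch below compares the authority (bucket) part of two `s3a://` URIs. `BucketUriSketch` and `isCrossBucket` are hypothetical helpers for this example and are not part of the patch; the bucket names are made up.

```java
import java.net.URI;
import java.util.Objects;

class BucketUriSketch {
  // For s3a://bucket/key URIs the authority component carries the bucket name,
  // so comparing authorities tells us whether a copy crosses buckets.
  static boolean isCrossBucket(URI src, URI dst) {
    return !Objects.equals(src.getAuthority(), dst.getAuthority());
  }

  public static void main(String[] args) {
    URI src = URI.create("s3a://bucket-a/data/part-0000");
    URI sameBucket = URI.create("s3a://bucket-a/backup/part-0000");
    URI otherBucket = URI.create("s3a://bucket-b/backup/part-0000");

    System.out.println(isCrossBucket(src, sameBucket));
    System.out.println(isCrossBucket(src, otherBucket));
  }
}
```

   A check like this could be used to log or pre-validate cross-bucket copies, where missing permissions on the destination bucket are the likely cause of failures.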
   
   Testing was done in region=us-west-2 from my local laptop. Contract tests and 
huge file tests passed. Other tests are still running and I will post the 
results. (ITestS3AContractRename failed, but the failure is not related to this patch.)

