[ https://issues.apache.org/jira/browse/HADOOP-12891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15252596#comment-15252596 ]

Hadoop QA commented on HADOOP-12891:
------------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 10s {color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s {color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 34s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 43s {color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 46s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 5s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 14s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 26s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 6s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 5s {color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 17s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s {color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 56s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 44s {color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 5m 44s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 44s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 44s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 2s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 15s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 26s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 0s {color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 31s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 7s {color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 21s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 0s {color} | {color:green} hadoop-common in the patch passed with JDK v1.8.0_77. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 12s {color} | {color:green} hadoop-aws in the patch passed with JDK v1.8.0_77. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 17s {color} | {color:green} hadoop-common in the patch passed with JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 15s {color} | {color:green} hadoop-aws in the patch passed with JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 21s {color} | {color:green} Patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 66m 19s {color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:fbe3e86 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12799994/HADOOP-12891-002.patch |
| JIRA Issue | HADOOP-12891 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  xml  findbugs  checkstyle  |
| uname | Linux 56ea5dbd8dde 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 4838b73 |
| Default Java | 1.7.0_95 |
| Multi-JDK versions |  /usr/lib/jvm/java-8-oracle:1.8.0_77 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_95 |
| findbugs | v3.0.0 |
| JDK v1.7.0_95  Test Results | https://builds.apache.org/job/PreCommit-HADOOP-Build/9144/testReport/ |
| modules | C:  hadoop-common-project/hadoop-common   hadoop-tools/hadoop-aws  U: . |
| Console output | https://builds.apache.org/job/PreCommit-HADOOP-Build/9144/console |
| Powered by | Apache Yetus 0.2.0   http://yetus.apache.org |


This message was automatically generated.



> S3AFileSystem should configure Multipart Copy threshold and chunk size
> ----------------------------------------------------------------------
>
>                 Key: HADOOP-12891
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12891
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs/s3
>    Affects Versions: 2.7.2
>            Reporter: Andrew Olson
>            Assignee: Andrew Olson
>         Attachments: HADOOP-12891-001.patch, HADOOP-12891-002.patch
>
>
> In the AWS S3 Java SDK the defaults for Multipart Copy threshold and chunk 
> size are very high [1],
> {noformat}
>     /** Default size threshold for Amazon S3 object after which multi-part copy is initiated. */
>     private static final long DEFAULT_MULTIPART_COPY_THRESHOLD = 5 * GB;
>     /** Default minimum size of each part for multi-part copy. */
>     private static final long DEFAULT_MINIMUM_COPY_PART_SIZE = 100 * MB;
> {noformat}
> In internal testing we have found that a lower but still reasonable threshold 
> and chunk size can be extremely beneficial. In our case we set both the 
> threshold and size to 25 MB with good results.
> Amazon enforces a minimum of 5 MB [2].
> For the S3A filesystem, file renames are actually implemented via a remote 
> copy request, which is already quite slow compared to a rename on HDFS. This 
> very high threshold for utilizing the multipart functionality can make the 
> performance considerably worse, particularly for files in the 100 MB to 5 GB 
> range, which is fairly typical for MapReduce job outputs.
> Two apparent options are:
> 1) Use the same configuration ({{fs.s3a.multipart.threshold}}, 
> {{fs.s3a.multipart.size}}) for both. This seems preferable as the 
> accompanying documentation [3] for these configuration properties actually 
> already says that they are applicable for either "uploads or copies". We just 
> need to add in the missing 
> {{TransferManagerConfiguration#setMultipartCopyThreshold}} [4] and 
> {{TransferManagerConfiguration#setMultipartCopyPartSize}} [5] calls at [6] 
> like:
> {noformat}
>     /* Handle copies in the same way as uploads. */
>     transferConfiguration.setMultipartCopyPartSize(partSize);
>     transferConfiguration.setMultipartCopyThreshold(multiPartThreshold);
> {noformat}
> 2) Add two new configuration properties so that the copy threshold and part 
> size can be configured independently, possibly with defaults lower than 
> Amazon's, and set them on {{TransferManagerConfiguration}} in the same way.
> In any case, if neither of the above options is acceptable, the config 
> documentation should at a minimum be adjusted to match the code, noting that 
> {{fs.s3a.multipart.threshold}} and {{fs.s3a.multipart.size}} apply only to 
> uploads of new objects and not to copies (i.e. renaming objects).
> [1] https://github.com/aws/aws-sdk-java/blob/1.10.58/aws-java-sdk-s3/src/main/java/com/amazonaws/services/s3/transfer/TransferManagerConfiguration.java#L36-L40
> [2] http://docs.aws.amazon.com/AmazonS3/latest/API/mpUploadUploadPartCopy.html
> [3] https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/index.html#S3A
> [4] http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/transfer/TransferManagerConfiguration.html#setMultipartCopyThreshold(long)
> [5] http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/transfer/TransferManagerConfiguration.html#setMultipartCopyPartSize(long)
> [6] https://github.com/apache/hadoop/blob/release-2.7.2-RC2/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java#L286
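
To make the effect of the threshold in the quoted description concrete, the following is a rough, self-contained sketch of the arithmetic only. It does not call the AWS SDK or Hadoop. The class CopyPlanSketch and its methods are hypothetical names introduced here for illustration, and two simplifying assumptions are made: a copy is split into parts only when the object is strictly larger than the threshold, and parts are sized at exactly the configured part size. The real SDK may choose part sizes differently.

```java
/**
 * Hypothetical illustration only: not part of the AWS SDK or Hadoop.
 * Models how a copy threshold and part size affect the number of
 * copy requests issued for one object.
 */
final class CopyPlanSketch {
    static final long MB = 1024L * 1024L;
    static final long GB = 1024L * MB;

    /** Minimum part size of 5 MB, as stated in the issue description. */
    static final long MIN_PART_SIZE = 5 * MB;

    /** Assumption: multipart copy is used only above the threshold. */
    static boolean usesMultipartCopy(long objectSize, long threshold) {
        return objectSize > threshold;
    }

    /** Number of copy requests under the simplified model described above. */
    static long partCount(long objectSize, long threshold, long partSize) {
        if (partSize < MIN_PART_SIZE) {
            throw new IllegalArgumentException("part size below 5 MB minimum");
        }
        if (!usesMultipartCopy(objectSize, threshold)) {
            return 1; // one single-request copy
        }
        return (objectSize + partSize - 1) / partSize; // ceiling division
    }

    public static void main(String[] args) {
        long size = 1 * GB; // an object in the 100 MB to 5 GB range
        // SDK defaults quoted in the description: 5 GB threshold, 100 MB parts.
        System.out.println(partCount(size, 5 * GB, 100 * MB)); // prints 1
        // Values the reporter used: 25 MB for both threshold and part size.
        System.out.println(partCount(size, 25 * MB, 25 * MB)); // prints 41
    }
}
```

Under these assumptions a 1 GB object is copied in a single request with the SDK defaults, but as 41 parts with the 25 MB settings, which is where the opportunity for parallel part copies comes from.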



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
