[ 
https://issues.apache.org/jira/browse/HBASE-28686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17867407#comment-17867407
 ] 

Hudson commented on HBASE-28686:
--------------------------------

Results for branch branch-3
        [build #254 on 
builds.a.o|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-3/254/]: 
(x) *{color:red}-1 overall{color}*
----
details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-3/254/General_20Nightly_20Build_20Report/]


(/) {color:green}+1 jdk17 hadoop3 checks{color}
-- For more information [see jdk17 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-3/254/JDK17_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(x) {color:red}-1 client integration test{color}
-- Something went wrong with this stage, [check relevant console 
output|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-3/254//console].


> MapReduceBackupCopyJob should support custom DistCp options
> -----------------------------------------------------------
>
>                 Key: HBASE-28686
>                 URL: https://issues.apache.org/jira/browse/HBASE-28686
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 2.6.0
>            Reporter: Ray Mattingly
>            Assignee: Ray Mattingly
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0-alpha-1, 3.0.0-beta-2
>
>
> h4. Problem
> The MapReduceBackupCopyJob class provides no means for updating DistCp job 
> options. This means that you're stuck with defaults, which isn't always 
> desirable. For example, my workplace would like the freedom to deviate from 
> at least two DistCp defaults:
>  # distcp.direct.write — we would like to set this to true, because writing 
> and renaming tmp files is expensive in S3 (where we store our backups).
>  # we would also like control over the number of mappers that DistCp will run
> h4. Proposed Solution
> It is not the prettiest solution, but I'm proposing that we support DistCp 
> customizations via the given backup client configuration, like 
> [this|https://github.com/HubSpot/hbase/compare/hubspot-2.6...HubSpot:hbase:backup-distcp-options].
>  It's necessary to do this conf -> arg conversion because we still want to 
> use [DistCp's run 
> method|https://github.com/HubSpot/hadoop/blob/c4c25b0ea2be1c8bca31d86962597060b2630f62/hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/DistCp.java#L134-L171],
>  which expects args, so as to not change any error codes. Hadoop actually 
> does something similar, but in the opposite direction — the DistCp job has 
> logic to convert the args back to configurations (lol).
> Further, the DistCp API is poorly suited to programmatic use, so it doesn't 
> leave us great alternatives. For example, if you use the run method, it 
> doesn't matter what you pass in as DistCpOptions to the constructor: your 
> options will be overwritten based on the args that you pass in. 
> Alternatively, if you pass in the DistCpOptions in the constructor and use 
> DistCp#execute or DistCp#createAndSubmitJob, then you get none of the error 
> specificity!
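
The conf -> arg conversion described in the quoted issue could look roughly like the sketch below. It is an illustration only, not the code from the linked branch: a plain Map stands in for the Hadoop Configuration so the example runs without Hadoop on the classpath, and the class and method names (DistCpArgSketch, toDistCpArgs) are hypothetical. The keys distcp.direct.write and distcp.max.maps and the flags -direct and -m are the ones DistCp itself defines; the S3 and HDFS paths are made up.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: turns selected configuration entries into DistCp CLI args.
final class DistCpArgSketch {

  // conf is a stand-in for org.apache.hadoop.conf.Configuration (assumption).
  static List<String> toDistCpArgs(Map<String, String> conf, String src, String dst) {
    List<String> args = new ArrayList<>();

    // distcp.direct.write=true maps to the -direct switch (no value).
    if (Boolean.parseBoolean(conf.getOrDefault("distcp.direct.write", "false"))) {
      args.add("-direct");
    }

    // distcp.max.maps=N maps to "-m N" (maximum number of mappers).
    String maxMaps = conf.get("distcp.max.maps");
    if (maxMaps != null) {
      args.add("-m");
      args.add(maxMaps);
    }

    // Source and destination always come last on the DistCp command line.
    args.add(src);
    args.add(dst);
    return args;
  }

  public static void main(String[] unused) {
    Map<String, String> conf = new LinkedHashMap<>();
    conf.put("distcp.direct.write", "true");
    conf.put("distcp.max.maps", "20");
    // Illustrative paths only.
    System.out.println(toDistCpArgs(conf, "hdfs://nn/hbase/snapshot", "s3a://bucket/backup"));
  }
}
```

The resulting list would then be passed as a String[] to DistCp's run method, which keeps its existing error codes intact, as the issue description wants.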



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
