[ https://issues.apache.org/jira/browse/HBASE-13356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14512859#comment-14512859 ]

Hadoop QA commented on HBASE-13356:
-----------------------------------

{color:red}-1 overall{color}.  Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12728190/HBASE-13356.patch
  against master branch at commit cd83d39fb4f50db901b699ba5470b5f709c95c69.
  ATTACHMENT ID: 12728190

    {color:green}+1 @author{color}.  The patch does not contain any @author tags.

    {color:green}+1 tests included{color}.  The patch appears to include 12 new or modified tests.

    {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.1 2.5.2 2.6.0)

    {color:green}+1 javac{color}.  The applied patch does not increase the total number of javac compiler warnings.

    {color:green}+1 protoc{color}.  The applied patch does not increase the total number of protoc compiler warnings.

    {color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 4 warning messages.

    {color:red}-1 checkstyle{color}.  The applied patch generated 1965 checkstyle errors (more than the master's current 1900 errors).

    {color:green}+1 findbugs{color}.  The patch does not introduce any new Findbugs (version 2.0.3) warnings.

    {color:red}-1 release audit{color}.  The applied patch generated 7 release audit warnings (more than the master's current 0 warnings).

    {color:red}-1 lineLengths{color}.  The patch introduces the following lines longer than 100:
    + * MultiTableSnapshotInputFormat generalizes {@link org.apache.hadoop.hbase.mapred.TableSnapshotInputFormat}
    + * allowing a MapReduce job to run over one or more table snapshots, with one or more scans configured for each.
    + * Internally, the input format delegates to {@link org.apache.hadoop.hbase.mapreduce.TableSnapshotInputFormat}
    + * and thus has the same performance advantages; see {@link org.apache.hadoop.hbase.mapreduce.TableSnapshotInputFormat} for
    + * Usage is similar to TableSnapshotInputFormat, with the following exception: initMultiTableSnapshotMapperJob takes in a map
    + * from snapshot name to a collection of scans. For each snapshot in the map, each corresponding scan will be applied;
    + * the overall dataset for the job is defined by the concatenation of the regions and tables included in each snapshot/scan
    + * {@link org.apache.hadoop.hbase.mapred.TableMapReduceUtil#initMultiTableSnapshotMapperJob(Map, Class, Class, Class, JobConf, boolean, Path)}
    + * Internally, this input format restores each snapshot into a subdirectory of the given tmp directory. Input splits and
    + * record readers are created as described in {@link org.apache.hadoop.hbase.mapreduce.TableSnapshotInputFormat}

    {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

    {color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/13816//testReport/
Release audit warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13816//artifact/patchprocess/patchReleaseAuditWarnings.txt
Release Findbugs (version 2.0.3) warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13816//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/13816//artifact/patchprocess/checkstyle-aggregate.html
Javadoc warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13816//artifact/patchprocess/patchJavadocWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/13816//console

This message is automatically generated.

> HBase should provide an InputFormat supporting multiple scans in mapreduce jobs over snapshots
> ----------------------------------------------------------------------------------------------
>
>                 Key: HBASE-13356
>                 URL: https://issues.apache.org/jira/browse/HBASE-13356
>             Project: HBase
>          Issue Type: New Feature
>          Components: mapreduce
>            Reporter: Andrew Mains
>            Assignee: Andrew Mains
>            Priority: Minor
>         Attachments: HBASE-13356.patch
>
>
> Currently, HBase supports the pushing of multiple scans to mapreduce jobs 
> over live tables (via MultiTableInputFormat) but only supports a single scan 
> for mapreduce jobs over table snapshots. It would be handy to support 
> multiple scans over snapshots as well, probably through another input format 
> (MultiTableSnapshotInputFormat?). To mimic the functionality present in 
> MultiTableInputFormat, the new input format would likely have to take in the 
> names of all snapshots used in addition to the scans.
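
The input shape this issue asks for (and that the javadoc lines quoted in the QA report describe) is a map from snapshot name to a collection of scans, with the job's dataset being the concatenation of every snapshot/scan pair. The following is only a minimal, HBase-free sketch of that flattening, not the patch's implementation: the class SnapshotScanPlan, its flatten method, and the use of plain strings in place of real Scan objects are all hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not part of HBase or of the attached patch.
public class SnapshotScanPlan {

    /** One unit of work: a single scan applied to a single snapshot. */
    static final class SnapshotScan {
        final String snapshotName;
        final String scanLabel; // stand-in for a real Scan object (assumption)

        SnapshotScan(String snapshotName, String scanLabel) {
            this.snapshotName = snapshotName;
            this.scanLabel = scanLabel;
        }

        @Override
        public String toString() {
            return snapshotName + " <- " + scanLabel;
        }
    }

    /**
     * Flattens a map of snapshot name -> scans into one list, visiting the
     * snapshots in the map's iteration order and each snapshot's scans in
     * collection order.
     */
    static List<SnapshotScan> flatten(Map<String, Collection<String>> snapshotScans) {
        List<SnapshotScan> plan = new ArrayList<>();
        for (Map.Entry<String, Collection<String>> entry : snapshotScans.entrySet()) {
            for (String scan : entry.getValue()) {
                plan.add(new SnapshotScan(entry.getKey(), scan));
            }
        }
        return plan;
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order so the plan is deterministic.
        Map<String, Collection<String>> snapshotScans = new LinkedHashMap<>();
        snapshotScans.put("snapshot1", Arrays.asList("rows a-m", "rows n-z"));
        snapshotScans.put("snapshot2", Arrays.asList("rows 0-9"));

        for (SnapshotScan unit : flatten(snapshotScans)) {
            System.out.println(unit);
        }
    }
}
```

In the real input format each pair would presumably be expanded further into per-region input splits; that step is omitted here because the message does not show how the patch does it.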



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
