[ https://issues.apache.org/jira/browse/HBASE-13356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14512859#comment-14512859 ]
Hadoop QA commented on HBASE-13356:
-----------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12728190/HBASE-13356.patch
  against master branch at commit cd83d39fb4f50db901b699ba5470b5f709c95c69.
  ATTACHMENT ID: 12728190

    {color:green}+1 @author{color}. The patch does not contain any @author tags.

    {color:green}+1 tests included{color}. The patch appears to include 12 new or modified tests.

    {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.1 2.5.2 2.6.0)

    {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

    {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings.

    {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 4 warning messages.

    {color:red}-1 checkstyle{color}. The applied patch generated 1965 checkstyle errors (more than the master's current 1900 errors).

    {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

    {color:red}-1 release audit{color}. The applied patch generated 7 release audit warnings (more than the master's current 0 warnings).

    {color:red}-1 lineLengths{color}. The patch introduces the following lines longer than 100:
    + * MultiTableSnapshotInputFormat generalizes {@link org.apache.hadoop.hbase.mapred.TableSnapshotInputFormat}
    + * allowing a MapReduce job to run over one or more table snapshots, with one or more scans configured for each.
    + * Internally, the input format delegates to {@link org.apache.hadoop.hbase.mapreduce.TableSnapshotInputFormat}
    + * and thus has the same performance advantages; see {@link org.apache.hadoop.hbase.mapreduce.TableSnapshotInputFormat} for
    + * Usage is similar to TableSnapshotInputFormat, with the following exception: initMultiTableSnapshotMapperJob takes in a map
    + * from snapshot name to a collection of scans. For each snapshot in the map, each corresponding scan will be applied;
    + * the overall dataset for the job is defined by the concatenation of the regions and tables included in each snapshot/scan
    + * {@link org.apache.hadoop.hbase.mapred.TableMapReduceUtil#initMultiTableSnapshotMapperJob(Map, Class, Class, Class, JobConf, boolean, Path)}
    + * Internally, this input format restores each snapshot into a subdirectory of the given tmp directory. Input splits and
    + * record readers are created as described in {@link org.apache.hadoop.hbase.mapreduce.TableSnapshotInputFormat}

    {color:green}+1 site{color}. The mvn site goal succeeds with this patch.

    {color:green}+1 core tests{color}. The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/13816//testReport/
Release audit warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13816//artifact/patchprocess/patchReleaseAuditWarnings.txt
Release Findbugs (version 2.0.3) warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13816//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/13816//artifact/patchprocess/checkstyle-aggregate.html
Javadoc warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13816//artifact/patchprocess/patchJavadocWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/13816//console

This message is automatically generated.
> HBase should provide an InputFormat supporting multiple scans in mapreduce jobs over snapshots
> ----------------------------------------------------------------------------------------------
>
> Key: HBASE-13356
> URL: https://issues.apache.org/jira/browse/HBASE-13356
> Project: HBase
> Issue Type: New Feature
> Components: mapreduce
> Reporter: Andrew Mains
> Assignee: Andrew Mains
> Priority: Minor
> Attachments: HBASE-13356.patch
>
>
> Currently, HBase supports the pushing of multiple scans to mapreduce jobs
> over live tables (via MultiTableInputFormat) but only supports a single scan
> for mapreduce jobs over table snapshots. It would be handy to support
> multiple scans over snapshots as well, probably through another input format
> (MultiTableSnapshotInputFormat?). To mimic the functionality present in
> MultiTableInputFormat, the new input format would likely have to take in the
> names of all snapshots used in addition to the scans.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)