[ https://issues.apache.org/jira/browse/HBASE-3996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053039#comment-13053039 ]
Ted Yu edited comment on HBASE-3996 at 6/22/11 5:17 AM:
--------------------------------------------------------

FYI, the patch has a bunch of tabs in it instead of two-space indentation, and some lines are longer than 80 chars, but no biggie -- I can fix that on commit.

Here are a few comments.

In TableSplit you create an HTable instance. Do you need to? And when you create it, though I believe it will be less of a problem going forward, can you use the constructor that takes a Configuration and table name? Is there a close in the Split interface? If so, you might want to call close on your HTable in there. (Where is it used? Does each split need its own HTable?)

Use the constructor that takes a Configuration here too...

{noformat}
+ HTable table = new HTable(tic.getTableName());$
{noformat}

You don't need the e.printStackTrace in the code below:

{code}
+ Log.warn("Failed to convert Scan to Strting", e);$
+ e.printStackTrace();$
{code}

The line above it will output the stack trace (spelling too!).

Nice javadoc.

By any chance is the code here in MultiTableInputFormatBase where we are checking start and end rows copied from elsewhere?

Otherwise patch looks great. Test too.

You remove the hashCode in TableSplit. Should it have one?

was (Author: stack):

FYI, the patch has a bunch of tabs in it instead of two-space indentation, and some lines are longer than 80 chars, but no biggie -- I can fix that on commit.

Here are a few comments.

In TableSplit you create an HTable instance. Do you need to? And when you create it, though I believe it will be less of a problem going forward, can you use the constructor that takes a Configuration and table name? Is there a close in the Split interface? If so, you might want to call close on your HTable in there. (Where is it used? Does each split need its own HTable?)

Use the constructor that takes a Configuration here too...

+ HTable table = new HTable(tic.getTableName());$

You don't need the e.printStackTrace in the code below:

{code}
+ Log.warn("Failed to convert Scan to Strting", e);$
+ e.printStackTrace();$
{code}

The line above it will output the stack trace (spelling too!).

Nice javadoc.

By any chance is the code here in MultiTableInputFormatBase where we are checking start and end rows copied from elsewhere?

Otherwise patch looks great. Test too.

You remove the hashCode in TableSplit. Should it have one?

> Support multiple tables and scanners as input to the mapper in map/reduce jobs
> ------------------------------------------------------------------------------
>
>                 Key: HBASE-3996
>                 URL: https://issues.apache.org/jira/browse/HBASE-3996
>             Project: HBase
>          Issue Type: Improvement
>          Components: mapreduce
>            Reporter: Eran Kutner
>             Fix For: 0.90.4
>
>         Attachments: MultiTableInputFormat.patch, TestMultiTableInputFormat.java.patch
>
>
> It seems that in many cases feeding data from multiple tables or multiple scanners on a single table can save a lot of time when running map/reduce jobs.
> I propose a new MultiTableInputFormat class that would allow doing this.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira