[ https://issues.apache.org/jira/browse/HBASE-5741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13251896#comment-13251896 ]
jirapos...@reviews.apache.org commented on HBASE-5741:
------------------------------------------------------

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4700/#review6860
-----------------------------------------------------------

Ship it!

Please attach new patch to JIRA after addressing minor comments.


src/main/java/org/apache/hadoop/hbase/mapreduce/ImportTsv.java
<https://reviews.apache.org/r/4700/#comment15276>

    The parentheses around hbaseAdmin.tableExists() are not needed.


src/main/java/org/apache/hadoop/hbase/mapreduce/ImportTsv.java
<https://reviews.apache.org/r/4700/#comment15277>

    Move this to line 263.


src/main/java/org/apache/hadoop/hbase/mapreduce/ImportTsv.java
<https://reviews.apache.org/r/4700/#comment15278>

    'bothered about' -> 'concerned with'


- Ted


On 2012-04-11 19:01:39, Himanshu Vashishtha wrote:
bq.
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/4700/
bq.  -----------------------------------------------------------
bq.
bq.  (Updated 2012-04-11 19:01:39)
bq.
bq.
bq.  Review request for hbase.
bq.
bq.
bq.  Summary
bq.  -------
bq.
bq.  There is a bulk output option in the importtsv workload. It outputs the HFiles in a user defined directory. The current code assumes that a table with its name equal to the given output directory exists, and throws an exception otherwise. Here is a patch for creating a table in case it doesn't exist.
bq.
bq.
bq.  This addresses bug HBase-5741.
bq.      https://issues.apache.org/jira/browse/HBase-5741
bq.
bq.
bq.  Diffs
bq.  -----
bq.
bq.    src/main/java/org/apache/hadoop/hbase/mapreduce/ImportTsv.java ab22fc4
bq.    src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportTsv.java ac30a62
bq.
bq.  Diff: https://reviews.apache.org/r/4700/diff
bq.
bq.
bq.  Testing
bq.  -------
bq.
bq.  Added a new test for bulkoutput; All importtsv tests pass.
bq.
bq.
bq.  Thanks,
bq.
bq.  Himanshu
bq.
bq.


> ImportTsv does not check for table existence
> ---------------------------------------------
>
>                 Key: HBASE-5741
>                 URL: https://issues.apache.org/jira/browse/HBASE-5741
>             Project: HBase
>          Issue Type: Bug
>          Components: mapreduce
>    Affects Versions: 0.90.4
>            Reporter: Clint Heath
>            Assignee: Himanshu Vashishtha
>         Attachments: HBase-5741-v2.patch, HBase-5741.patch
>
>
> The usage statement for the "importtsv" command to hbase claims this:
> "Note: if you do not use this option, then the target table must already
> exist in HBase" (in reference to the "importtsv.bulk.output" command-line
> option)
> The truth is, the table must exist no matter what, importtsv cannot and will
> not create it for you.
> This is the case because the createSubmittableJob method of ImportTsv does
> not even attempt to check if the table exists already, much less create it:
> (From org.apache.hadoop.hbase.mapreduce.ImportTsv.java)
> 305 HTable table = new HTable(conf, tableName);
> The HTable method signature in use there assumes the table exists and runs a
> meta scan on it:
> (From org.apache.hadoop.hbase.client.HTable.java)
> 142 * Creates an object to access a HBase table.
> ...
> 151 public HTable(Configuration conf, final String tableName)
> What we should do inside of createSubmittableJob is something similar to what
> the "completebulkloads" command would do:
> (Taken from org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.java)
> 690 boolean tableExists = this.doesTableExist(tableName);
> 691 if (!tableExists) this.createTable(tableName,dirPath);
> Currently the docs are misleading, the table in fact must exist prior to
> running importtsv. We should check if it exists rather than assume it's
> already there and throw the below exception:
> 12/03/14 17:15:42 WARN client.HConnectionManager$HConnectionImplementation:
> Encountered problems when prefetch META table:
> org.apache.hadoop.hbase.TableNotFoundException: Cannot find row in .META. for
> table: myTable2, row=myTable2,,99999999999999
>     at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:150)
> ...
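
For readers following the thread, the check-then-create step discussed above can be sketched roughly as follows. This is not the code from the attached patch. `TableAdmin`, `InMemoryAdmin`, `familiesOf` and `ensureTable` are hypothetical names made up for illustration, with an in-memory map standing in for the cluster so the sketch runs on its own. Deriving the column families from an importtsv-style column list is likewise an assumption, not something taken from the patch.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class EnsureTableSketch {

    /** Hypothetical stand-in for the two admin operations the fix relies on. */
    interface TableAdmin {
        boolean tableExists(String tableName);
        void createTable(String tableName, Set<String> columnFamilies);
    }

    /** In-memory admin so the sketch can run without an HBase cluster. */
    static class InMemoryAdmin implements TableAdmin {
        final Map<String, Set<String>> tables = new LinkedHashMap<>();

        @Override
        public boolean tableExists(String tableName) {
            return tables.containsKey(tableName);
        }

        @Override
        public void createTable(String tableName, Set<String> columnFamilies) {
            if (tables.containsKey(tableName)) {
                throw new IllegalStateException("Table already exists: " + tableName);
            }
            tables.put(tableName, columnFamilies);
        }
    }

    /**
     * Assumption: column families are taken from a column list such as
     * "HBASE_ROW_KEY,d:c1,d:c2", skipping the row key marker.
     */
    static Set<String> familiesOf(List<String> columns) {
        Set<String> families = new LinkedHashSet<>();
        for (String column : columns) {
            if (column.equals("HBASE_ROW_KEY")) {
                continue; // the row key is not a column family
            }
            int sep = column.indexOf(':');
            families.add(sep < 0 ? column : column.substring(0, sep));
        }
        return families;
    }

    /** Creates the table only if it is missing; returns true if it was created. */
    static boolean ensureTable(TableAdmin admin, String tableName, List<String> columns) {
        if (admin.tableExists(tableName)) {
            return false; // nothing to do, the job can open the table as before
        }
        admin.createTable(tableName, familiesOf(columns));
        return true;
    }

    public static void main(String[] args) {
        InMemoryAdmin admin = new InMemoryAdmin();
        List<String> columns = Arrays.asList("HBASE_ROW_KEY", "d:c1", "d:c2");
        System.out.println("created on first call:  " + ensureTable(admin, "myTable2", columns));
        System.out.println("created on second call: " + ensureTable(admin, "myTable2", columns));
        System.out.println("families: " + admin.tables.get("myTable2"));
    }
}
```

In the real job the existence check would presumably go through `hbaseAdmin.tableExists()`, as the first review comment suggests, and would run before the `HTable` is constructed, so that the `TableNotFoundException` from the meta scan is not raised.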