[ https://issues.apache.org/jira/browse/HCATALOG-623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605673#comment-13605673 ]
Sushanth Sowmyan commented on HCATALOG-623:
-------------------------------------------
After some investigation, Nick and I found that this problem occurs because
the HBaseStorageHandler uses an HBase method,
TableMapReduceUtil.findContainingJar, to locate jars and ship them with
secondary jobs. That method includes a filter that checks that the file
associated with a class is a .jar file. This is not the case when an HBase
job is kicked off through HCat: the code is already running at the backend,
where the .jar files have been exploded into individual .class files. The
method therefore reports that it cannot find the jars, and the job
subsequently fails.
Nick has created HBASE-8140 to fix this.
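
To illustrate the jar-only filter described above, here is a minimal Java
sketch. It is not the HBase source, only an approximation of that kind of
check. The class and method names (JarLookupSketch, jarPathOf) and the
example paths are hypothetical. It shows why a class loaded from an exploded
directory yields no jar to ship:

```java
import java.net.URL;

public class JarLookupSketch {

    /**
     * Returns the jar file path if the resource URL points inside a jar,
     * or null otherwise (for example, a loose .class file in a directory).
     * Simplified: no URL decoding is performed.
     */
    public static String jarPathOf(String resourceUrl) {
        if (resourceUrl == null || !resourceUrl.startsWith("jar:")) {
            return null; // not packaged in a jar, so nothing to ship
        }
        String path = resourceUrl.substring("jar:".length());
        if (path.startsWith("file:")) {
            path = path.substring("file:".length());
        }
        // Drop the "!/pkg/Name.class" entry suffix, keeping only the jar path.
        int bang = path.indexOf('!');
        return bang >= 0 ? path.substring(0, bang) : path;
    }

    /** Looks up where a class was loaded from and applies the jar-only filter. */
    public static String findContainingJar(Class<?> clazz) {
        String resource = clazz.getName().replace('.', '/') + ".class";
        ClassLoader loader = clazz.getClassLoader();
        URL url = (loader != null)
                ? loader.getResource(resource)
                : ClassLoader.getSystemResource(resource);
        return url == null ? null : jarPathOf(url.toString());
    }

    public static void main(String[] args) {
        // Class packaged in a jar: the jar path is recovered.
        System.out.println(jarPathOf(
            "jar:file:/tmp/lib/hbase.jar!/org/apache/hadoop/hbase/client/HTable.class"));
        // prints /tmp/lib/hbase.jar

        // Class exploded into a directory: the filter rejects it.
        System.out.println(jarPathOf(
            "file:/tmp/exploded/org/apache/hadoop/hbase/client/HTable.class"));
        // prints null

        // Result depends on how this class itself was packaged and run.
        System.out.println(findContainingJar(JarLookupSketch.class));
    }
}
```

A null result here corresponds to the "Could not find jar for class" warnings
quoted below: the class is on the classpath, but not inside a jar.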
> Understanding how to use the HBase bulk import feature
> ------------------------------------------------------
>
> Key: HCATALOG-623
> URL: https://issues.apache.org/jira/browse/HCATALOG-623
> Project: HCatalog
> Issue Type: Documentation
> Components: hbase
> Affects Versions: 0.5
> Reporter: Nick Dimiduk
> Attachments: simple.bulkload.pig, simple.ddl, simple.tsv
>
>
> I'm working through use of the HBaseBulkOutputFormat and I'm getting stuck. I
> have a simple example that replicates the [ImportTsv
> example|http://hbase.apache.org/book/ops_mgt.html#importtsv] from the HBase
> documentation. The end result is the ImportSequenceFile job failing due to
> jars missing from its classpath. Presumably I've not configured something
> correctly. In this example I'm using Pig.
> Below are the commands I run (the command files are attached), followed by
> the error message.
> {noformat}
> $ hadoop fs -put simple.tsv /tmp/
> $ HCAT_CLASSPATH=$(hbase classpath) hcat -f simple.ddl
> $ PIG_CLASSPATH=$(hbase classpath) pig -v -useHCatalog simple.bulkload.pig
> {noformat}
> Error message:
> {noformat}
> 2013-02-19 19:55:30,354 WARN
> org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil: Could not find jar for
> class class org.apache.zookeeper.ZooKeeper in order to ship it to the cluster.
> 2013-02-19 19:55:30,355 WARN
> org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil: Could not find jar for
> class class org.apache.hadoop.hbase.client.HTable in order to ship it to the
> cluster.
> 2013-02-19 19:55:30,357 WARN
> org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil: Could not find jar for
> class class org.apache.hadoop.hive.ql.metadata.HiveException in order to ship
> it to the cluster.
> 2013-02-19 19:55:30,358 WARN
> org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil: Could not find jar for
> class class org.apache.hcatalog.mapreduce.HCatOutputFormat in order to ship
> it to the cluster.
> 2013-02-19 19:55:30,359 WARN
> org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil: Could not find jar for
> class class org.apache.hcatalog.hbase.HBaseHCatStorageHandler in order to
> ship it to the cluster.
> 2013-02-19 19:55:30,360 WARN
> org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil: Could not find jar for
> class class org.apache.hadoop.hive.hbase.HBaseSerDe in order to ship it to
> the cluster.
> 2013-02-19 19:55:30,361 WARN
> org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil: Could not find jar for
> class class org.apache.hadoop.hive.metastore.api.Table in order to ship it to
> the cluster.
> 2013-02-19 19:55:30,363 WARN
> org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil: Could not find jar for
> class interface org.apache.thrift.TBase in order to ship it to the cluster.
> 2013-02-19 19:55:30,364 WARN
> org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil: Could not find jar for
> class class org.apache.hadoop.hbase.util.Bytes in order to ship it to the
> cluster.
> 2013-02-19 19:55:30,365 WARN
> org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil: Could not find jar for
> class class com.facebook.fb303.FacebookBase in order to ship it to the
> cluster.
> 2013-02-19 19:55:30,366 WARN
> org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil: Could not find jar for
> class class com.google.common.util.concurrent.ThreadFactoryBuilder in order
> to ship it to the cluster.
> {noformat}