Sergey Soldatov created PHOENIX-3835:
----------------------------------------
Summary: CSV Bulkload fails if hbase mapredcp was used for classpath
Key: PHOENIX-3835
URL: https://issues.apache.org/jira/browse/PHOENIX-3835
Project: Phoenix
Issue Type: Bug
Reporter: Sergey Soldatov
For a long time our documentation has recommended using hbase mapredcp
for HADOOP_CLASSPATH when running the MR bulk load. In practice this does
not work: the job fails with the following exception:
{noformat}
Exception in thread "main" java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.phoenix.mapreduce.bulkload.TableRowkeyPair not found
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2246)
    at org.apache.hadoop.mapred.JobConf.getMapOutputKeyClass(JobConf.java:813)
    at org.apache.hadoop.mapreduce.task.JobContextImpl.getMapOutputKeyClass(JobContextImpl.java:142)
    at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.addDependencyJars(TableMapReduceUtil.java:779)
    at org.apache.phoenix.mapreduce.MultiHfileOutputFormat.configureIncrementalLoad(MultiHfileOutputFormat.java:698)
    at org.apache.phoenix.mapreduce.AbstractBulkLoadTool.submitJob(AbstractBulkLoadTool.java:330)
    at org.apache.phoenix.mapreduce.AbstractBulkLoadTool.loadData(AbstractBulkLoadTool.java:299)
    at org.apache.phoenix.mapreduce.AbstractBulkLoadTool.run(AbstractBulkLoadTool.java:182)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
    at org.apache.phoenix.mapreduce.CsvBulkLoadTool.main(CsvBulkLoadTool.java:117)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:233)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.phoenix.mapreduce.bulkload.TableRowkeyPair not found
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2214)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2238)
    ... 16 more
Caused by: java.lang.ClassNotFoundException: Class org.apache.phoenix.mapreduce.bulkload.TableRowkeyPair not found
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2120)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2212)
    ... 17 more
{noformat}
I may be wrong, but this looks like a side effect of HBASE-12108. I'm not
sure whether it can be fixed on the Phoenix side or whether we just need to
update the documentation so that the recommendation applies only to specific
versions of HBase. In most cases everything works fine without setting
HADOOP_CLASSPATH at all.
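
For reference, the two invocations discussed above might look roughly like
the sketch below. The jar name, table name and input path are placeholders
I made up for illustration, and the function names are hypothetical wrappers
rather than anything shipped with Phoenix; only `hbase mapredcp`,
HADOOP_CLASSPATH and the CsvBulkLoadTool class come from the report itself.

```shell
# Placeholders (assumptions, not taken from the report): adjust to your setup.
PHOENIX_CLIENT_JAR="phoenix-<version>-client.jar"
TABLE="EXAMPLE"
INPUT="/data/example.csv"

# Variant described as failing: HADOOP_CLASSPATH is set from `hbase mapredcp`.
run_with_mapredcp() {
  HADOOP_CLASSPATH="$(hbase mapredcp)" hadoop jar "$PHOENIX_CLIENT_JAR" \
    org.apache.phoenix.mapreduce.CsvBulkLoadTool \
    --table "$TABLE" --input "$INPUT"
}

# Variant described as working in most cases: HADOOP_CLASSPATH is left alone.
run_without_mapredcp() {
  hadoop jar "$PHOENIX_CLIENT_JAR" \
    org.apache.phoenix.mapreduce.CsvBulkLoadTool \
    --table "$TABLE" --input "$INPUT"
}
```

Calling run_with_mapredcp should reproduce the ClassNotFoundException on an
affected HBase version, while run_without_mapredcp corresponds to the case
that works without HADOOP_CLASSPATH.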