It seems that you need to use a Phoenix build for CDH, since CDH has made some changes to the HBase API. One of the recent threads on this list has several links on how to build it.
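A small diagnostic sketch that can help confirm that kind of mismatch: print which jar the HTableDescriptor class from the stack trace is actually loaded from, running it on the same classpath as the failing job:

    // Prints the jar that HTableDescriptor is loaded from at runtime.
    // A NoSuchMethodError like the one below usually means the stock Phoenix
    // build and the HBase classes that win on the classpath (CDH's here)
    // disagree about this API.
    println(classOf[org.apache.hadoop.hbase.HTableDescriptor]
      .getProtectionDomain.getCodeSource.getLocation)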
Thanks,
Sergey

On Wed, Oct 26, 2016 at 2:31 AM, min zou <zoumin1...@gmail.com> wrote:

> Hi Sergey, I used the advice you gave me, and then I got this error:
>
> Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.hbase.HTableDescriptor.setValue(Ljava/lang/String;Ljava/lang/String;)Lorg/apache/hadoop/hbase/HTableDescriptor;
>     at org.apache.phoenix.query.ConnectionQueryServicesImpl.generateTableDescriptor(ConnectionQueryServicesImpl.java:756)
>     at org.apache.phoenix.query.ConnectionQueryServicesImpl.ensureTableCreated(ConnectionQueryServicesImpl.java:1020)
>     at org.apache.phoenix.query.ConnectionQueryServicesImpl.createTable(ConnectionQueryServicesImpl.java:1396)
>     at org.apache.phoenix.schema.MetaDataClient.createTableInternal(MetaDataClient.java:2302)
>     at org.apache.phoenix.schema.MetaDataClient.createTable(MetaDataClient.java:922)
>     at org.apache.phoenix.compile.CreateTableCompiler$2.execute(CreateTableCompiler.java:194)
>     at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:343)
>     at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:331)
>     at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
>     at org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:329)
>     at org.apache.phoenix.jdbc.PhoenixStatement.executeUpdate(PhoenixStatement.java:1421)
>     at org.apache.phoenix.query.ConnectionQueryServicesImpl$13.call(ConnectionQueryServicesImpl.java:2353)
>     at org.apache.phoenix.query.ConnectionQueryServicesImpl$13.call(ConnectionQueryServicesImpl.java:2300)
>     at org.apache.phoenix.util.PhoenixContextExecutor.call(PhoenixContextExecutor.java:78)
>     at org.apache.phoenix.query.ConnectionQueryServicesImpl.init(ConnectionQueryServicesImpl.java:2300)
>     at org.apache.phoenix.jdbc.PhoenixDriver.getConnectionQueryServices(PhoenixDriver.java:231)
>     at org.apache.phoenix.jdbc.PhoenixEmbeddedDriver.createConnection(PhoenixEmbeddedDriver.java:144)
>     at org.apache.phoenix.jdbc.PhoenixDriver.connect(PhoenixDriver.java:202)
>     at java.sql.DriverManager.getConnection(DriverManager.java:664)
>     at java.sql.DriverManager.getConnection(DriverManager.java:208)
>     at org.apache.phoenix.mapreduce.util.ConnectionUtil.getConnection(ConnectionUtil.java:98)
>     at org.apache.phoenix.mapreduce.util.ConnectionUtil.getInputConnection(ConnectionUtil.java:57)
>     at org.apache.phoenix.mapreduce.PhoenixInputFormat.getQueryPlan(PhoenixInputFormat.java:114)
>     at org.apache.phoenix.mapreduce.PhoenixInputFormat.getSplits(PhoenixInputFormat.java:81)
>     at org.apache.spark.rdd.NewHadoopRDD.getPartitions(NewHadoopRDD.scala:120)
>     at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
>     at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
>     at scala.Option.getOrElse(Option.scala:120)
>     at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
>     at org.apache.phoenix.spark.PhoenixRDD.getPartitions(PhoenixRDD.scala:52)
>     at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
>     at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
>     at scala.Option.getOrElse(Option.scala:120)
>     at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
>     at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
>     at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
>     at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
>     at scala.Option.getOrElse(Option.scala:120)
>     at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
>     at org.apache.spark.SparkContext.runJob(SparkContext.scala:1940)
>     at org.apache.spark.rdd.RDD$$anonfun$foreach$1.apply(RDD.scala:912)
>     at org.apache.spark.rdd.RDD$$anonfun$foreach$1.apply(RDD.scala:910)
>     at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
>     at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111)
>     at org.apache.spark.rdd.RDD.withScope(RDD.scala:316)
>     at org.apache.spark.rdd.RDD.foreach(RDD.scala:910)
>     at com.linkstec.bigdata.main.PhoenixTest$.main(PhoenixTest.scala:48)
>     at com.linkstec.bigdata.main.PhoenixTest.main(PhoenixTest.scala)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
>     at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
>     at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
>     at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
>     at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>
> thanks
>
>
> 2016-10-26 15:09 GMT+08:00 Sergey Soldatov <sergeysolda...@gmail.com>:
>
>> (1) You only need the client jar (phoenix-xxxx-client.jar).
>> (2) Set spark.executor.extraClassPath in spark-defaults.conf to that client jar.
>>
>> Hope that helps.
>>
>> Thanks,
>> Sergey
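For example, that spark-defaults.conf entry could look like the lines below, reusing the jar path from the original message. The jar has to be readable at that path on every node; setting the driver-side classpath as well is usually needed, although it is not mentioned above:

    # spark-defaults.conf -- example entries; the path must exist on every node
    spark.executor.extraClassPath  /root/apache-phoenix-4.8.0-HBase-1.2-bin/phoenix-4.8.0-HBase-1.2-client.jar
    spark.driver.extraClassPath    /root/apache-phoenix-4.8.0-HBase-1.2-bin/phoenix-4.8.0-HBase-1.2-client.jar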
>> On Tue, Oct 25, 2016 at 9:31 PM, min zou <zoumin1...@gmail.com> wrote:
>>
>>> Dear all, I use Spark to do data analysis and then save the result to Phoenix.
>>> When I run the application from IntelliJ IDEA in local mode it runs fine, but
>>> when I run it on my cluster with spark-submit:
>>>
>>>     spark-submit --class com.bigdata.main.RealTimeMain --master yarn \
>>>       --driver-memory 2G --executor-memory 2G --num-executors 5 \
>>>       /home/zt/rt-analyze-1.0-SNAPSHOT.jar
>>>
>>> I get an error: Caused by: java.lang.ClassNotFoundException: Class
>>> org.apache.phoenix.mapreduce.PhoenixOutputFormat not found.
>>>
>>> Exception in thread "main" java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.phoenix.mapreduce.PhoenixOutputFormat not found
>>>     at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2112)
>>>     at org.apache.hadoop.mapreduce.task.JobContextImpl.getOutputFormatClass(JobContextImpl.java:232)
>>>     at org.apache.spark.rdd.PairRDDFunctions.saveAsNewAPIHadoopDataset(PairRDDFunctions.scala:971)
>>>     at org.apache.spark.rdd.PairRDDFunctions.saveAsNewAPIHadoopFile(PairRDDFunctions.scala:903)
>>>     at org.apache.phoenix.spark.ProductRDDFunctions.saveToPhoenix(ProductRDDFunctions.scala:51)
>>>     at com.mypackage.save(DAOImpl.scala:41)
>>>     at com.mypackage.ProtoStreamingJob.execute(ProtoStreamingJob.scala:58)
>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>     at java.lang.reflect.Method.invoke(Method.java:606)
>>>     at com.mypackage.SparkApplication.sparkRun(SparkApplication.scala:95)
>>>     at com.mypackage.SparkApplication$delayedInit$body.apply(SparkApplication.scala:112)
>>>     at scala.Function0$class.apply$mcV$sp(Function0.scala:40)
>>>     at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
>>>     at scala.App$$anonfun$main$1.apply(App.scala:71)
>>>     at scala.App$$anonfun$main$1.apply(App.scala:71)
>>>     at scala.collection.immutable.List.foreach(List.scala:318)
>>>     at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:32)
>>>     at scala.App$class.main(App.scala:71)
>>>     at com.mypackage.SparkApplication.main(SparkApplication.scala:15)
>>>     at com.mypackage.ProtoStreamingJobRunner.main(ProtoStreamingJob.scala)
>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>     at java.lang.reflect.Method.invoke(Method.java:606)
>>>     at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:569)
>>>     at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:166)
>>>     at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:189)
>>>     at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:110)
>>>     at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>>> Caused by: java.lang.ClassNotFoundException: Class org.apache.phoenix.mapreduce.PhoenixOutputFormat not found
>>>     at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2018)
>>>     at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2110)
>>>     ... 30 more
>>>
>>> Then I tried spark-submit with --jars:
>>>
>>>     spark-submit --class com.bigdata.main.RealTimeMain --master yarn \
>>>       --jars /root/apache-phoenix-4.8.0-HBase-1.2-bin/phoenix-spark-4.8.0-HBase-1.2.jar,/root/apache-phoenix-4.8.0-HBase-1.2-bin/phoenix-4.8.0-HBase-1.2-client.jar,/root/apache-phoenix-4.8.0-HBase-1.2-bin/phoenix-core-4.8.0-HBase-1.2.jar \
>>>       --driver-memory 2G --executor-memory 2G --num-executors 5 \
>>>       /home/zm/rt-analyze-1.0-SNAPSHOT.jar
>>>
>>> and I get the same error. My cluster is CDH 5.7, Phoenix 4.8.0, HBase 1.2,
>>> Spark 1.6. How can I solve this problem? Please help me. Thanks.
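For reference, a minimal sketch of the save path that fails above, using the phoenix-spark RDD API that appears in the stack trace (the table name, columns, and ZooKeeper quorum below are made-up examples):

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.phoenix.spark._  // adds saveToPhoenix to RDDs of tuples/case classes

    object PhoenixSaveSketch {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("phoenix-save-sketch"))

        // Hypothetical target table, e.g.:
        //   CREATE TABLE OUTPUT_TEST_TABLE (ID BIGINT NOT NULL PRIMARY KEY, COL1 VARCHAR, COL2 INTEGER);
        val rows = sc.parallelize(Seq((1L, "a", 1), (2L, "b", 2)))

        // Writes through PhoenixOutputFormat, which is why that class must be
        // resolvable on the driver and executor classpaths.
        rows.saveToPhoenix(
          "OUTPUT_TEST_TABLE",
          Seq("ID", "COL1", "COL2"),
          zkUrl = Some("zk-host:2181")  // ZooKeeper quorum of the HBase cluster (example value)
        )

        sc.stop()
      }
    }

If the Phoenix client jar is visible on both the driver and executor classpaths (for example via the spark-defaults.conf entries sketched earlier in this thread), PhoenixOutputFormat should resolve and a job like this should go through with the same spark-submit command.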