Re:Re: Phoenix + Spark

程磊 Wed, 26 Oct 2016 03:15:07 -0700

This problem occurs because the HTableDescriptor.setValue method in HBase 1.2.0
is declared as:
    public HTableDescriptor setValue(String key, String value)
but in HBase 1.2.0-cdh5.7.0 the same method is declared as:
    public void setValue(String key, String value)


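The return type is part of the method descriptor the JVM links against, so
classes compiled against the first signature fail with NoSuchMethodError when
they run against the second. Phoenix 4.8.0-HBase-1.2 is built against HBase
1.2.0, not HBase 1.2.0-cdh5.7.0, so you may need to rebuild Phoenix against
HBase 1.2.0-cdh5.7.0.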
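
To confirm which signature a given cluster actually exposes, a small reflection check along the following lines may help. This is only a sketch: the class name ReturnTypeCheck and the helper returnTypeOf are made up for illustration, and it assumes the program is run with the cluster's HBase jars on the classpath.

```java
import java.lang.reflect.Method;

public class ReturnTypeCheck {

    /**
     * Returns the name of the return type of a public method, or a short
     * message if the class or method cannot be found on the classpath.
     * (Hypothetical helper, not part of HBase or Phoenix.)
     */
    public static String returnTypeOf(String className, String methodName,
                                      Class<?>... paramTypes) {
        try {
            Method m = Class.forName(className).getMethod(methodName, paramTypes);
            return m.getReturnType().getName();
        } catch (ClassNotFoundException e) {
            return "class not found: " + className;
        } catch (NoSuchMethodException e) {
            return "method not found: " + methodName;
        }
    }

    public static void main(String[] args) {
        // Prints the return type of setValue(String, String) for whichever
        // HBase jar is first on the classpath.
        System.out.println(returnTypeOf(
                "org.apache.hadoop.hbase.HTableDescriptor", "setValue",
                String.class, String.class));
    }
}
```

If the printed type is void, the HBase jar on the classpath does not match the signature that the stock Phoenix build expects.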




At 2016-10-26 17:31:30, "min zou" <[email protected]> wrote:

Hi Sergey, I followed the advice you gave me, but then I got an error:
Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.hbase.HTableDescriptor.setValue(Ljava/lang/String;Ljava/lang/String;)Lorg/apache/hadoop/hbase/HTableDescriptor;
    at org.apache.phoenix.query.ConnectionQueryServicesImpl.generateTableDescriptor(ConnectionQueryServicesImpl.java:756)
    at org.apache.phoenix.query.ConnectionQueryServicesImpl.ensureTableCreated(ConnectionQueryServicesImpl.java:1020)
    at org.apache.phoenix.query.ConnectionQueryServicesImpl.createTable(ConnectionQueryServicesImpl.java:1396)
    at org.apache.phoenix.schema.MetaDataClient.createTableInternal(MetaDataClient.java:2302)
    at org.apache.phoenix.schema.MetaDataClient.createTable(MetaDataClient.java:922)
    at org.apache.phoenix.compile.CreateTableCompiler$2.execute(CreateTableCompiler.java:194)
    at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:343)
    at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:331)
    at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
    at org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:329)
    at org.apache.phoenix.jdbc.PhoenixStatement.executeUpdate(PhoenixStatement.java:1421)
    at org.apache.phoenix.query.ConnectionQueryServicesImpl$13.call(ConnectionQueryServicesImpl.java:2353)
    at org.apache.phoenix.query.ConnectionQueryServicesImpl$13.call(ConnectionQueryServicesImpl.java:2300)
    at org.apache.phoenix.util.PhoenixContextExecutor.call(PhoenixContextExecutor.java:78)
    at org.apache.phoenix.query.ConnectionQueryServicesImpl.init(ConnectionQueryServicesImpl.java:2300)
    at org.apache.phoenix.jdbc.PhoenixDriver.getConnectionQueryServices(PhoenixDriver.java:231)
    at org.apache.phoenix.jdbc.PhoenixEmbeddedDriver.createConnection(PhoenixEmbeddedDriver.java:144)
    at org.apache.phoenix.jdbc.PhoenixDriver.connect(PhoenixDriver.java:202)
    at java.sql.DriverManager.getConnection(DriverManager.java:664)
    at java.sql.DriverManager.getConnection(DriverManager.java:208)
    at org.apache.phoenix.mapreduce.util.ConnectionUtil.getConnection(ConnectionUtil.java:98)
    at org.apache.phoenix.mapreduce.util.ConnectionUtil.getInputConnection(ConnectionUtil.java:57)
    at org.apache.phoenix.mapreduce.PhoenixInputFormat.getQueryPlan(PhoenixInputFormat.java:114)
    at org.apache.phoenix.mapreduce.PhoenixInputFormat.getSplits(PhoenixInputFormat.java:81)
    at org.apache.spark.rdd.NewHadoopRDD.getPartitions(NewHadoopRDD.scala:120)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
    at org.apache.phoenix.spark.PhoenixRDD.getPartitions(PhoenixRDD.scala:52)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1940)
    at org.apache.spark.rdd.RDD$$anonfun$foreach$1.apply(RDD.scala:912)
    at org.apache.spark.rdd.RDD$$anonfun$foreach$1.apply(RDD.scala:910)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111)
    at org.apache.spark.rdd.RDD.withScope(RDD.scala:316)
    at org.apache.spark.rdd.RDD.foreach(RDD.scala:910)
    at com.linkstec.bigdata.main.PhoenixTest$.main(PhoenixTest.scala:48)
    at com.linkstec.bigdata.main.PhoenixTest.main(PhoenixTest.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)


thanks




2016-10-26 15:09 GMT+08:00 Sergey Soldatov <[email protected]>:

(1) You only need the client jar (phoenix-xxxx-client.jar).

(2) Set spark.executor.extraClassPath in spark-defaults.conf to the path of
the client jar.
Hope that helps.
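
As a rough sketch of step (2), the entries in spark-defaults.conf might look like the lines below. The jar path is a placeholder rather than a path taken from this thread, and the spark.driver.extraClassPath line is an assumption added for the case where the driver also needs the Phoenix classes.

```
# spark-defaults.conf (hypothetical path; the jar must exist at this location on every node)
spark.executor.extraClassPath  /path/to/phoenix-xxxx-client.jar
spark.driver.extraClassPath    /path/to/phoenix-xxxx-client.jar
```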


Thanks,
Sergey


On Tue, Oct 25, 2016 at 9:31 PM, min zou <[email protected]> wrote:

Hi all, I use Spark to do data analysis and then save the results to Phoenix.
When I run the application from IntelliJ IDEA in local mode, it runs fine, but
when I run it on my cluster with spark-submit:

spark-submit --class com.bigdata.main.RealTimeMain --master yarn --driver-memory 2G --executor-memory 2G --num-executors 5 /home/zt/rt-analyze-1.0-SNAPSHOT.jar

I get an error: Caused by: java.lang.ClassNotFoundException: Class
org.apache.phoenix.mapreduce.PhoenixOutputFormat not found.


Exception in thread "main" java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.phoenix.mapreduce.PhoenixOutputFormat not found
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2112)
    at org.apache.hadoop.mapreduce.task.JobContextImpl.getOutputFormatClass(JobContextImpl.java:232)
    at org.apache.spark.rdd.PairRDDFunctions.saveAsNewAPIHadoopDataset(PairRDDFunctions.scala:971)
    at org.apache.spark.rdd.PairRDDFunctions.saveAsNewAPIHadoopFile(PairRDDFunctions.scala:903)
    at org.apache.phoenix.spark.ProductRDDFunctions.saveToPhoenix(ProductRDDFunctions.scala:51)
    at com.mypackage.save(DAOImpl.scala:41)
    at com.mypackage.ProtoStreamingJob.execute(ProtoStreamingJob.scala:58)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at com.mypackage.SparkApplication.sparkRun(SparkApplication.scala:95)
    at com.mypackage.SparkApplication$delayedInit$body.apply(SparkApplication.scala:112)
    at scala.Function0$class.apply$mcV$sp(Function0.scala:40)
    at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
    at scala.App$$anonfun$main$1.apply(App.scala:71)
    at scala.App$$anonfun$main$1.apply(App.scala:71)
    at scala.collection.immutable.List.foreach(List.scala:318)
    at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:32)
    at scala.App$class.main(App.scala:71)
    at com.mypackage.SparkApplication.main(SparkApplication.scala:15)
    at com.mypackage.ProtoStreamingJobRunner.main(ProtoStreamingJob.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:569)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:166)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:189)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:110)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: Class org.apache.phoenix.mapreduce.PhoenixOutputFormat not found
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2018)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2110)
    ... 30 more





Then I tried spark-submit with --jars:

spark-submit --class com.bigdata.main.RealTimeMain --master yarn --jars /root/apache-phoenix-4.8.0-HBase-1.2-bin/phoenix-spark-4.8.0-HBase-1.2.jar,/root/apache-phoenix-4.8.0-HBase-1.2-bin/phoenix-4.8.0-HBase-1.2-client.jar,/root/apache-phoenix-4.8.0-HBase-1.2-bin/phoenix-core-4.8.0-HBase-1.2.jar --driver-memory 2G --executor-memory 2G --num-executors 5 /home/zm/rt-analyze-1.0-SNAPSHOT.jar

but I got the same error. My cluster runs CDH 5.7, Phoenix 4.8.0, HBase 1.2,
and Spark 1.6. How can I solve this problem? Please help me. Thanks.