Josh,
I’d like to give you a little more information regarding this error. It looks
like adding the Phoenix client JAR to Spark’s classpath causes Spark itself to fail:
spark.executor.extraClassPath
/usr/local/phoenix/phoenix-4.6.0-HBase-1.1-client.jar
spark.driver.extraClassPath
/usr/local/phoenix/phoenix-4.6.0-HBase-1.1-client.jar
After adding this JAR, I get the following error when executing this simple
command:
scala> val textFile = sc.textFile("README.md")
java.lang.NoSuchMethodError:
com.fasterxml.jackson.module.scala.deser.BigDecimalDeserializer$.handledType()Ljava/lang/Class;
at
com.fasterxml.jackson.module.scala.deser.NumberDeserializers$.<init>(ScalaNumberDeserializersModule.scala:49)
at
com.fasterxml.jackson.module.scala.deser.NumberDeserializers$.<clinit>(ScalaNumberDeserializersModule.scala)
As you can see, adding this Phoenix JAR is breaking other Spark functionality
for me. My naïve guess is that a different version of the FasterXML Jackson
classes is packaged inside phoenix-4.6.0-HBase-1.1-client.jar and is conflicting
with the version Spark expects.
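If it helps narrow things down, the following one-liner in spark-shell should print which JAR the Jackson databind classes are actually being loaded from (just a diagnostic sketch):
scala> classOf[com.fasterxml.jackson.databind.ObjectMapper].getProtectionDomain.getCodeSource.getLocation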
Have you seen anything like this before?
Regards,
Jonathan
From: Cox, Jonathan A [mailto:[email protected]]
Sent: Wednesday, December 09, 2015 11:58 AM
To: [email protected]
Subject: [EXTERNAL] Re: Confusion Installing Phoenix Spark Plugin / Various
Errors
Josh,
So switching to user-provided Hadoop 2.6 solved the immediate Phoenix / Spark
integration problem I was having. However, I now have another problem, which
seems similar to:
https://issues.apache.org/jira/browse/SPARK-8332
java.lang.NoSuchMethodError:
com.fasterxml.jackson.module.scala.deser.BigDecimalDeserializer
I’m getting this error when executing the simple example on the Phoenix / Spark
plugin page:
Spark context available as sc.
15/12/09 11:51:02 INFO repl.SparkILoop: Created sql context..
SQL context available as sqlContext.
scala> val df = sqlContext.load(
| "org.apache.phoenix.spark",
| Map("table" -> "TABLE1", "zkUrl" -> "phoenix-server:2181")
| )
warning: there were 1 deprecation warning(s); re-run with -deprecation for
details
java.lang.NoSuchMethodError:
com.fasterxml.jackson.module.scala.deser.BigDecimalDeserializer$.handledType()Ljava/lang/Class;
I did try upgrading the Hadoop Jackson JARs from 2.2.3 to 2.4.3, as some
suggested in the link above, and including them in Spark’s classpath. However,
the error was the same.
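For concreteness, "including them in Spark's classpath" meant something roughly like the following in spark-defaults.conf, with the upgraded Jackson JARs listed ahead of the Phoenix client JAR (the Jackson path is just a placeholder):
spark.driver.extraClassPath /path/to/jackson-databind-2.4.3.jar:/usr/local/phoenix/phoenix-4.6.0-HBase-1.1-client.jar
spark.executor.extraClassPath /path/to/jackson-databind-2.4.3.jar:/usr/local/phoenix/phoenix-4.6.0-HBase-1.1-client.jar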
From: Josh Mahonin [mailto:[email protected]]
Sent: Wednesday, December 09, 2015 11:21 AM
To: [email protected]<mailto:[email protected]>
Subject: Re: [EXTERNAL] Re: Confusion Installing Phoenix Spark Plugin / Various
Errors
Definitely. I'd like to dig into what the root cause is, but it might be
optimistic to think I'll be able to get to that any time soon.
I'll try to get the docs updated today.
On Wed, Dec 9, 2015 at 1:09 PM, James Taylor
<[email protected]<mailto:[email protected]>> wrote:
Would it make sense to tweak the Spark installation instructions slightly with
this information, Josh?
On Wed, Dec 9, 2015 at 9:11 AM, Cox, Jonathan A
<[email protected]<mailto:[email protected]>> wrote:
Josh,
Previously, I was using SPARK_CLASSPATH, but then read that it was deprecated
and switched to the spark-defaults.conf file. The result was the same.
Also, I was using ‘spark-1.5.2-bin-hadoop2.6.tgz’, which bundles some Hadoop
2.6 JARs, and that was what caused the trouble. After separately downloading
Hadoop 2.6 and the Spark build without Hadoop, the errors went away.
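For anyone else who hits this: as I understand it, the Hadoop-free Spark build gets pointed at the separate Hadoop install through conf/spark-env.sh, roughly like this (the Hadoop path is just an example):
# tell the Hadoop-free Spark build where the separately installed Hadoop JARs live
export SPARK_DIST_CLASSPATH=$(/usr/local/hadoop/bin/hadoop classpath)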
-Jonathan
From: Josh Mahonin [mailto:[email protected]<mailto:[email protected]>]
Sent: Wednesday, December 09, 2015 5:57 AM
To: [email protected]<mailto:[email protected]>
Subject: Re: [EXTERNAL] Re: Confusion Installing Phoenix Spark Plugin / Various
Errors
Hi Jonathan,
Thanks for the information. If you're able, could you also try the
'SPARK_CLASSPATH' environment variable instead of the spark-defaults.conf
setting, and let us know if that works? The exact Spark package you're using
would also be helpful (built from source, prebuilt for Hadoop 2.6+, 2.4+, CDH, etc.).
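For instance, something along these lines before launching spark-shell (reusing the JAR path from your config) should be roughly equivalent to the spark-defaults.conf entries:
export SPARK_CLASSPATH=/usr/local/phoenix/phoenix-4.6.0-HBase-1.1-client.jar
bin/spark-shell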
Thanks,
Josh
On Wed, Dec 9, 2015 at 12:08 AM, Cox, Jonathan A
<[email protected]<mailto:[email protected]>> wrote:
Alright, I reproduced what you did exactly, and it now works. The problem is
that the Phoenix client JAR does not work correctly with the Spark builds that
include Hadoop.
When I downloaded the Spark build for user-provided Hadoop and installed Hadoop
manually, Spark worked with Phoenix correctly!
Thank you much,
Jonathan
Sent from my iPhone
On Dec 8, 2015, at 8:54 PM, Josh Mahonin
<[email protected]<mailto:[email protected]>> wrote:
Hi Jonathan,
Spark only needs the client JAR. It contains all the other Phoenix dependencies
as well.
I'm not sure exactly what the issue you're seeing is. I just downloaded and
extracted fresh copies of Spark 1.5.2 (pre-built with user-provided Hadoop),
and the latest Phoenix 4.6.0 binary release.
I copied the 'phoenix-4.6.0-HBase-1.1-client.jar' to /tmp and created a
'spark-defaults.conf' in the 'conf' folder of the Spark install with the
following:
spark.executor.extraClassPath /tmp/phoenix-4.6.0-HBase-1.1-client.jar
spark.driver.extraClassPath /tmp/phoenix-4.6.0-HBase-1.1-client.jar
I then launched the 'spark-shell', and was able to execute:
import org.apache.phoenix.spark._
From there, you should be able to use the methods provided by the phoenix-spark
integration within the Spark shell.
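For example, against an existing Phoenix table it should look something like the
following (the table name, columns and ZooKeeper URL are placeholders):
import org.apache.phoenix.spark._
// load a Phoenix table as a DataFrame
val df = sqlContext.load("org.apache.phoenix.spark",
  Map("table" -> "TABLE1", "zkUrl" -> "phoenix-server:2181"))
df.show()
// or load it as an RDD of column -> value maps
val rdd = sc.phoenixTableAsRDD("TABLE1", Seq("ID", "COL1"),
  zkUrl = Some("phoenix-server:2181"))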
Good luck,
Josh
On Tue, Dec 8, 2015 at 8:51 PM, Cox, Jonathan A
<[email protected]<mailto:[email protected]>> wrote:
I am trying to get Spark up and running with Phoenix, but the installation
instructions are not clear to me, or there is something else wrong. I’m using
Spark 1.5.2, HBase 1.1.2 and Phoenix 4.6.0 in a standalone install (no HDFS
or cluster) on Debian Linux 8 (Jessie) x64. I’m also using Java 1.8.0_40.
The instructions state:
1. Ensure that all requisite Phoenix / HBase platform dependencies are
available on the classpath for the Spark executors and drivers
2. One method is to add the phoenix-4.4.0-client.jar to ‘SPARK_CLASSPATH’
in spark-env.sh, or to set both ‘spark.executor.extraClassPath’ and
‘spark.driver.extraClassPath’ in spark-defaults.conf
First off, what are “all requisite Phoenix / HBase platform dependencies”? #2
suggests that all I need to do is add ‘phoenix-4.6.0-HBase-1.1-client.jar’ to
Spark’s class path. But what about ‘phoenix-spark-4.6.0-HBase-1.1.jar’ or
‘phoenix-core-4.6.0-HBase-1.1.jar’? Do either of these (or anything else) need
to be added to Spark’s class path?
Secondly, if I follow the instructions exactly, and add only
‘phoenix-4.6.0-HBase-1.1-client.jar’ to ‘spark-defaults.conf’:
spark.executor.extraClassPath
/usr/local/phoenix/phoenix-4.6.0-HBase-1.1-client.jar
spark.driver.extraClassPath
/usr/local/phoenix/phoenix-4.6.0-HBase-1.1-client.jar
Then I get the following error when starting the interactive Spark shell with
‘spark-shell’:
15/12/08 18:38:05 WARN ObjectStore: Version information not found in metastore.
hive.metastore.schema.verification is not enabled so recording the schema
version 1.2.0
15/12/08 18:38:05 WARN ObjectStore: Failed to get database default, returning
NoSuchObjectException
15/12/08 18:38:05 WARN Hive: Failed to access metastore. This class should not
accessed in runtime.
org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException:
Unable to instantiate
org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
at
org.apache.hadoop.hive.ql.metadata.Hive.getAllDatabases(Hive.java:1236)
…
<console>:10: error: not found: value sqlContext
import sqlContext.implicits._
^
<console>:10: error: not found: value sqlContext
import sqlContext.sql
On the other hand, if I include all three of the aforementioned JARs, I get the
same error. However, if I include only the ‘phoenix-spark-4.6.0-HBase-1.1.jar’,
spark-shell seems to launch without error. Nevertheless, if I then try the
simple tutorial commands in spark-shell, I get the following:
Spark output: SQL context available as sqlContext.
scala> import org.apache.spark.SparkContext
import org.apache.spark.sql.SQLContext
import org.apache.phoenix.spark._
val sqlContext = new SQLContext(sc)
val df = sqlContext.load("org.apache.phoenix.spark",
  Map("table" -> "TABLE1", "zkUrl" -> "phoenix-server:2181"))
Spark error:
java.lang.NoClassDefFoundError:
org/apache/hadoop/hbase/HBaseConfiguration
at
org.apache.phoenix.spark.PhoenixRDD.getPhoenixConfiguration(PhoenixRDD.scala:71)
at
org.apache.phoenix.spark.PhoenixRDD.phoenixConf$lzycompute(PhoenixRDD.scala:39)
at
org.apache.phoenix.spark.PhoenixRDD.phoenixConf(PhoenixRDD.scala:38)
at
org.apache.phoenix.spark.PhoenixRDD.<init>(PhoenixRDD.scala:42)
at
org.apache.phoenix.spark.PhoenixRelation.schema(PhoenixRelation.scala:50)
at
org.apache.spark.sql.execution.datasources.LogicalRelation.<init>(LogicalRelation.scala:37)
at
org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:120)
This final error seems similar to the one in the mailing list post “Phoenix-spark:
NoClassDefFoundError: HBaseConfiguration”
<http://mail-archives.apache.org/mod_mbox/phoenix-user/201511.mbox/ajax/%3CCAKwwsRSEJHkotiF28kzumDZM6kgBVeTJNGUoJnZcLiuEGCTjHQ%40mail.gmail.com%3E>.
But the question does not seem to have been answered satisfactorily. Also note
that if I include all three JARs, as he did, I get an error when launching
spark-shell.
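As a sanity check, something like this in spark-shell should show whether the HBase classes are visible at all (it throws ClassNotFoundException if they are not on the classpath):
scala> Class.forName("org.apache.hadoop.hbase.HBaseConfiguration")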
Can you please clarify what is the proper way to install and configure Phoenix
with Spark?
Sincerely,
Jonathan