[ https://issues.apache.org/jira/browse/PHOENIX-2503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15065525#comment-15065525 ]

Josh Mahonin commented on PHOENIX-2503:
---------------------------------------

Looking through the various fixes they've applied, I'm not sure any of them 
benefit us, since they're mostly in the scope of Spark's JDBC RDD integration 
(it parallelizes most standard JDBC drivers, and if the query has a numeric 
primary key it can partition on it). The phoenix-spark integration bypasses 
that code path completely, so we lose some of the niceties there.
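
For reference, the partitioning nicety in question looks roughly like the 
sketch below on the Spark side (this is the generic JDBC path, not 
phoenix-spark; the URL, table, column, and bounds are hypothetical):

    import java.util.Properties

    // Spark's JDBC data source splits the read into numPartitions range
    // scans over a numeric column; this is the code path we bypass.
    val props = new Properties()
    val df = sqlContext.read.jdbc(
      "jdbc:phoenix:localhost", // JDBC URL (hypothetical)
      "MY_TABLE",               // table to read (hypothetical)
      "ID",                     // numeric column to partition on (hypothetical)
      0L,                       // lower bound of ID
      10000L,                   // upper bound of ID
      4,                        // number of partitions
      props)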

I did find a comment about java.sql.DriverManager being loaded by the 
bootstrap classloader, and therefore not seeing JDBC drivers loaded by Spark's 
classloader [1], along with the accompanying class that acts as a thin wrapper 
for their JDBC integration.

One option is to call Spark's DriverRegistry wrapper the way the JDBC RDD 
does, instead of java.sql.DriverManager directly, but I'm hesitant to rely too 
much on internal Spark APIs, since they tend to change from version to 
version. Looking at the code, it eventually just invokes 
DriverManager.registerDriver() as well.
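
For comparison, a minimal sketch of what that registration boils down to 
(assuming the Phoenix client jar is on the classpath):

    import java.sql.DriverManager

    // Loading the driver class runs its static initializer, which calls
    // DriverManager.registerDriver() on a singleton instance of itself.
    Class.forName("org.apache.phoenix.jdbc.PhoenixDriver")

    // Once registered, plain JDBC lookups resolve the driver as usual.
    val conn = DriverManager.getConnection("jdbc:phoenix:localhost")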

[1] 
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/DriverRegistry.scala#L27-L33

> Multiple Java NoClass/Method Errors with Spark and Phoenix
> ----------------------------------------------------------
>
>                 Key: PHOENIX-2503
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2503
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.6.0
>         Environment: Debian 8 (Jessie) x64
> hadoop-2.6.2
> hbase-1.1.2
> phoenix-4.6.0-HBase-1.1
> spark-1.5.2-bin-without-hadoop
>            Reporter: Jonathan Cox
>            Priority: Blocker
>             Fix For: 4.7.0
>
>         Attachments: PHOENIX-2503.patch
>
>
> I have encountered a variety of Java errors while trying to get Apache 
> Phoenix working with Spark. In particular, I encounter these errors both 
> when submitting Python jobs and when running interactively in the Scala 
> Spark shell.
> ------- Issue 1 -------
> The first issue I encountered was that Phoenix would not work with the binary 
> Spark release that includes Hadoop 2.6 (spark-1.5.2-bin-hadoop2.6.tgz). I 
> tried adding the phoenix-4.6.0-HBase-1.1-client.jar to both spark-env.sh and 
> spark-defaults.conf, but encountered the same error when launching 
> spark-shell:
> 15/12/08 18:38:05 WARN ObjectStore: Version information not found in 
> metastore. hive.metastore.schema.verification is not enabled so recording the 
> schema version 1.2.0
> 15/12/08 18:38:05 WARN ObjectStore: Failed to get database default, returning 
> NoSuchObjectException
> 15/12/08 18:38:05 WARN Hive: Failed to access metastore. This class should 
> not accessed in runtime.
> org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: 
> Unable to instantiate 
> org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
>                 at 
> org.apache.hadoop.hive.ql.metadata.Hive.getAllDatabases(Hive.java:1236)
> ----- Issue 2 -----
> Alright, having given up on getting Phoenix to work with the Spark package 
> that includes Hadoop, I decided to download hadoop-2.6.2.tar.gz and 
> spark-1.5.2-bin-without-hadoop.tgz. I installed these, and again added 
> phoenix-4.6.0-HBase-1.1-client.jar to spark-defaults.conf. In addition, I 
> added the following lines to spark-env.sh:
> SPARK_DIST_CLASSPATH=$(/usr/local/hadoop/bin/hadoop classpath)
> export 
> SPARK_DIST_CLASSPATH="$SPARK_DIST_CLASSPATH:/usr/local/hadoop/share/hadoop/tools/lib/*"
>  
> export HADOOP_CONF_DIR=/usr/local/hadoop/etc/hadoop
> This solved "Issue 1" described above, and now spark-shell launches without 
> generating an error. Nevertheless, other Spark functionality is now broken:
> 15/12/09 13:55:46 INFO repl.SparkILoop: Created spark context..
> Spark context available as sc.
> 15/12/09 13:55:46 INFO repl.SparkILoop: Created sql context..
> SQL context available as sqlContext.
> scala> val textFile = sc.textFile("README.md")
> java.lang.NoSuchMethodError: 
> com.fasterxml.jackson.module.scala.deser.BigDecimalDeserializer$.handledType()Ljava/lang/Class;
>       at 
> com.fasterxml.jackson.module.scala.deser.NumberDeserializers$.<init>(ScalaNumberDeserializersModule.scala:49)
>       at 
> com.fasterxml.jackson.module.scala.deser.NumberDeserializers$.<clinit>(ScalaNumberDeserializersModule.scala)
> Note, this error goes away if I omit phoenix-4.6.0-HBase-1.1-client.jar (but 
> then I have no Phoenix support, obviously). This makes me believe that 
> phoenix-4.6.0-HBase-1.1-client.jar contains a conflicting version of the 
> FasterXML Jackson classes, which overrides Spark's Jackson classes with an 
> earlier version that lacks this particular method. In other words, Spark 
> needs one version of the Jackson JARs, but Phoenix is bundling another that 
> breaks Spark. Does this make any sense?
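> A quick way to confirm which jar wins would be to check where a Jackson 
> class is actually loaded from in spark-shell, e.g. (a rough sketch):
> scala> classOf[com.fasterxml.jackson.databind.ObjectMapper].getProtectionDomain.getCodeSource.getLocation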
> Sincerely,
> Jonathan



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)