Thanks Jonathan, I'm making some headway on getting the client library working again. I thought I saw a mention that you were also using pyspark with the DataFrame support. Are you able to confirm that this works as well?
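If it does work for you, the snippet I have in mind is roughly the following. This is only a sketch on my end (not something I've verified), and it assumes the same phoenix-4.6.0-HBase-1.1-client.jar is on the driver and executor classpaths, with "TABLE1" and "phoenix-server:2181" being the placeholder table and zkUrl from the examples further down the thread:

# Rough pyspark sketch (untested); TABLE1 and phoenix-server:2181 are the
# placeholder table name and zkUrl from the spark-shell examples below.
from pyspark import SparkContext
from pyspark.sql import SQLContext

# In the interactive pyspark shell, sc and sqlContext already exist;
# these two lines are only needed when running as a standalone script.
sc = SparkContext(appName="phoenix-pyspark-test")
sqlContext = SQLContext(sc)

# Load a Phoenix table as a DataFrame via the phoenix-spark data source.
df = sqlContext.read \
    .format("org.apache.phoenix.spark") \
    .option("table", "TABLE1") \
    .option("zkUrl", "phoenix-server:2181") \
    .load()

df.show()

In the interactive pyspark shell, only the read/load portion should be needed, since sc and sqlContext are already provided.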
Thanks!

Josh

On Wed, Dec 9, 2015 at 7:51 PM, Cox, Jonathan A <ja...@sandia.gov> wrote:

> Josh,
>
> I added all of those JARs separately to Spark's classpaths, and it seems to be working fine now.
>
> Thanks a lot for your help!
>
> Sent from my iPhone
>
> On Dec 9, 2015, at 2:30 PM, Josh Mahonin <jmaho...@gmail.com> wrote:
>
> Thanks Jonathan,
>
> I'll follow up with the issue there. In the meantime, you may have some luck just submitting a fat (assembly) JAR to a Spark cluster.
>
> If you really want to dive into the nitty-gritty, I'm decomposing the client JAR down to the required components that allow the Spark integration to work (especially excluding the fasterxml JARs). If you were to manually assemble the following libraries into the Spark classpath, I believe you'll be able to get the spark-shell going:
>
> guava-12.0.1.jar
> hbase-common-1.1.0.jar
> hbase-server-1.1.0.jar
> phoenix-core-4.6.0-HBase-1.1.jar
> hbase-client-1.1.0.jar
> hbase-protocol-1.1.0.jar
> htrace-core-3.1.0-incubating.jar
> phoenix-spark-4.6.0-HBase-1.1.jar
>
> Thanks for the report.
>
> Josh
>
> On Wed, Dec 9, 2015 at 4:00 PM, Cox, Jonathan A <ja...@sandia.gov> wrote:
>
>> Thanks, Josh. I submitted the issue, which can be found at:
>> https://issues.apache.org/jira/browse/PHOENIX-2503
>>
>> Multiple Java NoClass/Method Errors with Spark and Phoenix
>>
>> *From:* Josh Mahonin [mailto:jmaho...@gmail.com]
>> *Sent:* Wednesday, December 09, 2015 1:15 PM
>> *To:* user@phoenix.apache.org
>> *Subject:* Re: [EXTERNAL] Re: Confusion Installing Phoenix Spark Plugin / Various Errors
>>
>> Hi Jonathan,
>>
>> Thanks, I'm digging into this as we speak. That SPARK-8332 issue looks like the same issue, and to quote one of the comments in that issue, 'Classpath hell is hell'.
>>
>> What is interesting is that the unit tests in Phoenix 4.6.0 successfully run against Spark 1.5.2 [1], so I wonder if this issue is specific to the spark-shell. You may have some success compiling your app as an assembly JAR and submitting it to a Spark cluster instead.
>>
>> Could you do me a favour and file a JIRA ticket for this, and copy all the relevant information you've posted there?
>>
>> Thanks!
>>
>> Josh
>>
>> [1] https://github.com/apache/phoenix/blob/master/phoenix-spark/src/it/scala/org/apache/phoenix/spark/PhoenixSparkIT.scala
>>
>> On Wed, Dec 9, 2015 at 2:52 PM, Cox, Jonathan A <ja...@sandia.gov> wrote:
>>
>> Josh,
>>
>> I'd like to give you a little more information regarding this error. It looks like when I add the Phoenix client JAR to Spark, it causes Spark to fail:
>>
>> spark.executor.extraClassPath /usr/local/phoenix/phoenix-4.6.0-HBase-1.1-client.jar
>> spark.driver.extraClassPath /usr/local/phoenix/phoenix-4.6.0-HBase-1.1-client.jar
>>
>> After adding this JAR, I get the following error when executing the following command:
>>
>> scala> val textFile = sc.textFile("README.md")
>> java.lang.NoSuchMethodError: com.fasterxml.jackson.module.scala.deser.BigDecimalDeserializer$.handledType()Ljava/lang/Class;
>>     at com.fasterxml.jackson.module.scala.deser.NumberDeserializers$.<init>(ScalaNumberDeserializersModule.scala:49)
>>     at com.fasterxml.jackson.module.scala.deser.NumberDeserializers$.<clinit>(ScalaNumberDeserializersModule.scala)
>>
>> As you can see, adding this Phoenix JAR is breaking other Spark functionality for me.
>>
>> My naïve guess is that there is a different version of the Jackson FasterXML classes packaged inside phoenix-4.6.0-HBase-1.1-client.jar that is breaking Spark.
>>
>> Have you seen anything like this before?
>>
>> Regards,
>> Jonathan
>>
>> *From:* Cox, Jonathan A [mailto:ja...@sandia.gov]
>> *Sent:* Wednesday, December 09, 2015 11:58 AM
>> *To:* user@phoenix.apache.org
>> *Subject:* [EXTERNAL] Re: Confusion Installing Phoenix Spark Plugin / Various Errors
>>
>> Josh,
>>
>> So using user-provided Hadoop 2.6 solved the immediate Phoenix / Spark integration problem I was having. However, I now have another problem, which seems to be similar to:
>> https://issues.apache.org/jira/browse/SPARK-8332
>> java.lang.NoSuchMethodError: com.fasterxml.jackson.module.scala.deser.BigDecimalDeserializer
>>
>> I'm getting this error when executing the simple example on the Phoenix / Spark Plugin page:
>>
>> Spark context available as sc.
>> 15/12/09 11:51:02 INFO repl.SparkILoop: Created sql context..
>> SQL context available as sqlContext.
>>
>> scala> val df = sqlContext.load(
>>      | "org.apache.phoenix.spark",
>>      | Map("table" -> "TABLE1", "zkUrl" -> "phoenix-server:2181")
>>      | )
>> warning: there were 1 deprecation warning(s); re-run with -deprecation for details
>> java.lang.NoSuchMethodError: com.fasterxml.jackson.module.scala.deser.BigDecimalDeserializer$.handledType()Ljava/lang/Class;
>>
>> I did try upgrading the Hadoop Jackson JARs from 2.2.3 to 2.4.3, as some suggested in the link above, and including them in Spark's classpath. However, the error was the same.
>>
>> *From:* Josh Mahonin [mailto:jmaho...@gmail.com]
>> *Sent:* Wednesday, December 09, 2015 11:21 AM
>> *To:* user@phoenix.apache.org
>> *Subject:* Re: [EXTERNAL] Re: Confusion Installing Phoenix Spark Plugin / Various Errors
>>
>> Definitely. I'd like to dig into what the root cause is, but it might be optimistic to think I'll be able to get to that any time soon.
>>
>> I'll try to get the docs updated today.
>>
>> On Wed, Dec 9, 2015 at 1:09 PM, James Taylor <jamestay...@apache.org> wrote:
>>
>> Would it make sense to tweak the Spark installation instructions slightly with this information, Josh?
>>
>> On Wed, Dec 9, 2015 at 9:11 AM, Cox, Jonathan A <ja...@sandia.gov> wrote:
>>
>> Josh,
>>
>> Previously, I was using SPARK_CLASSPATH, but then read that it was deprecated and switched to the spark-defaults.conf file. The result was the same.
>>
>> Also, I was using 'spark-1.5.2-bin-hadoop2.6.tgz', which includes some Hadoop 2.6 JARs. This caused the trouble. However, by separately downloading Hadoop 2.6 and Spark without Hadoop, the errors went away.
>>
>> -Jonathan
>>
>> *From:* Josh Mahonin [mailto:jmaho...@gmail.com]
>> *Sent:* Wednesday, December 09, 2015 5:57 AM
>> *To:* user@phoenix.apache.org
>> *Subject:* Re: [EXTERNAL] Re: Confusion Installing Phoenix Spark Plugin / Various Errors
>>
>> Hi Jonathan,
>>
>> Thanks for the information. If you're able, could you also try the 'SPARK_CLASSPATH' environment variable instead of the spark-defaults.conf setting, and let us know if that works? Also, the exact Spark package you're using would be helpful as well (from source, prebuilt for 2.6+, 2.4+, CDH, etc.)
>>
>> Thanks,
>>
>> Josh
>>
>> On Wed, Dec 9, 2015 at 12:08 AM, Cox, Jonathan A <ja...@sandia.gov> wrote:
>>
>> Alright, I reproduced what you did exactly, and it now works. The problem is that the Phoenix client JAR is not working correctly with the Spark builds that include Hadoop.
>>
>> When I downloaded the Spark build with user-provided Hadoop, and also installed Hadoop manually, Spark works with Phoenix correctly!
>>
>> Thank you much,
>> Jonathan
>>
>> Sent from my iPhone
>>
>> On Dec 8, 2015, at 8:54 PM, Josh Mahonin <jmaho...@gmail.com> wrote:
>>
>> Hi Jonathan,
>>
>> Spark only needs the client JAR. It contains all the other Phoenix dependencies as well.
>>
>> I'm not sure exactly what the issue you're seeing is. I just downloaded and extracted fresh copies of Spark 1.5.2 (pre-built with user-provided Hadoop) and the latest Phoenix 4.6.0 binary release.
>>
>> I copied the 'phoenix-4.6.0-HBase-1.1-client.jar' to /tmp and created a 'spark-defaults.conf' in the 'conf' folder of the Spark install with the following:
>>
>> spark.executor.extraClassPath /tmp/phoenix-4.6.0-HBase-1.1-client.jar
>> spark.driver.extraClassPath /tmp/phoenix-4.6.0-HBase-1.1-client.jar
>>
>> I then launched the 'spark-shell', and was able to execute:
>>
>> import org.apache.phoenix.spark._
>>
>> From there, you should be able to use the methods provided by the phoenix-spark integration within the Spark shell.
>>
>> Good luck,
>>
>> Josh
>>
>> On Tue, Dec 8, 2015 at 8:51 PM, Cox, Jonathan A <ja...@sandia.gov> wrote:
>>
>> I am trying to get Spark up and running with Phoenix, but the installation instructions are not clear to me, or there is something else wrong. I'm using Spark 1.5.2, HBase 1.1.2 and Phoenix 4.6.0 with a standalone install (no HDFS or cluster) on Debian Linux 8 (Jessie) x64. I'm also using Java 1.8.0_40.
>>
>> The instructions state:
>>
>> 1. Ensure that all requisite Phoenix / HBase platform dependencies are available on the classpath for the Spark executors and drivers
>> 2. One method is to add the phoenix-4.4.0-client.jar to 'SPARK_CLASSPATH' in spark-env.sh, or to set both 'spark.executor.extraClassPath' and 'spark.driver.extraClassPath' in spark-defaults.conf
>>
>> *First off, what are "all requisite Phoenix / HBase platform dependencies"?* #2 suggests that all I need to do is add 'phoenix-4.6.0-HBase-1.1-client.jar' to Spark's classpath. But what about 'phoenix-spark-4.6.0-HBase-1.1.jar' or 'phoenix-core-4.6.0-HBase-1.1.jar'? Do either of these (or anything else) need to be added to Spark's classpath?
>>
>> Secondly, if I follow the instructions exactly, and add only 'phoenix-4.6.0-HBase-1.1-client.jar' to 'spark-defaults.conf':
>>
>> spark.executor.extraClassPath /usr/local/phoenix/phoenix-4.6.0-HBase-1.1-client.jar
>> spark.driver.extraClassPath /usr/local/phoenix/phoenix-4.6.0-HBase-1.1-client.jar
>>
>> Then I get the following error when starting the interactive Spark shell with 'spark-shell':
>>
>> 15/12/08 18:38:05 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
>> 15/12/08 18:38:05 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException
>> 15/12/08 18:38:05 WARN Hive: Failed to access metastore. This class should not accessed in runtime.
>>
>> org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
>>     at org.apache.hadoop.hive.ql.metadata.Hive.getAllDatabases(Hive.java:1236)
>>     …
>>
>> <console>:10: error: not found: value sqlContext
>>        import sqlContext.implicits._
>>               ^
>> <console>:10: error: not found: value sqlContext
>>        import sqlContext.sql
>>
>> On the other hand, if I include all three of the aforementioned JARs, I get the same error. However, *if I include only the 'phoenix-spark-4.6.0-HBase-1.1.jar'*, spark-shell seems to launch without error. Nevertheless, if I then try the simple tutorial commands in spark-shell, I get the following:
>>
>> *Spark output:*
>> SQL context available as sqlContext.
>>
>> *scala>*
>> import org.apache.spark.SparkContext
>> import org.apache.spark.sql.SQLContext
>> import org.apache.phoenix.spark._
>>
>> val sqlContext = new SQLContext(sc)
>>
>> val df = sqlContext.load("org.apache.phoenix.spark", Map("table" -> "TABLE1", "zkUrl" -> "phoenix-server:2181"))
>>
>> *Spark error:*
>> *java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/HBaseConfiguration*
>>     at org.apache.phoenix.spark.PhoenixRDD.getPhoenixConfiguration(PhoenixRDD.scala:71)
>>     at org.apache.phoenix.spark.PhoenixRDD.phoenixConf$lzycompute(PhoenixRDD.scala:39)
>>     at org.apache.phoenix.spark.PhoenixRDD.phoenixConf(PhoenixRDD.scala:38)
>>     at org.apache.phoenix.spark.PhoenixRDD.<init>(PhoenixRDD.scala:42)
>>     at org.apache.phoenix.spark.PhoenixRelation.schema(PhoenixRelation.scala:50)
>>     at org.apache.spark.sql.execution.datasources.LogicalRelation.<init>(LogicalRelation.scala:37)
>>     at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:120)
>>
>> This final error seems similar to the one in the mailing list post "Phoenix-spark : NoClassDefFoundError: HBaseConfiguration" <http://mail-archives.apache.org/mod_mbox/phoenix-user/201511.mbox/ajax/%3CCAKwwsRSEJHkotiF28kzumDZM6kgBVeTJNGUoJnZcLiuEGCTjHQ%40mail.gmail.com%3E>. But the question does not seem to have been answered satisfactorily. Also note, if I include all three JARs, as he did, I get an error when launching spark-shell.
>>
>> *Can you please clarify what is the proper way to install and configure Phoenix with Spark?*
>>
>> Sincerely,
>>
>> Jonathan