Robert, you should use the phoenix-4*-spark.jar that is located in the Phoenix root directory.

Thanks,
Sergey
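For concreteness, a minimal sketch of what that looks like in /etc/spark/conf/spark-defaults.conf, using the same config keys as in the thread below. The JAR file name and path here are placeholders, not the exact names from any particular release; substitute whichever phoenix-4*-spark fat client JAR your install ships in the Phoenix root directory:

    # placeholder path -- point at the fat client JAR in your Phoenix root directory
    spark.driver.extraClassPath   /usr/hdp/current/phoenix-client/phoenix-client-spark.jar
    spark.executor.extraClassPath /usr/hdp/current/phoenix-client/phoenix-client-spark.jar

Both the driver and executor classpaths need the JAR: the Phoenix configuration is built on the driver, and the actual table reads run on the executors.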
On Tue, Jul 5, 2016 at 8:06 AM, Josh Elser <[email protected]> wrote:

> Looking into this on the HDP side. Please feel free to reach out via HDP
> channels instead of Apache channels.
>
> Thanks for letting us know as well.
>
> Josh Mahonin wrote:
>
>> Hi Robert,
>>
>> I recommend following up with HDP on this issue.
>>
>> The underlying problem is that the 'phoenix-spark-4.4.0.2.4.0.0-169.jar'
>> they've provided isn't actually a fat client JAR; it's missing many of
>> the required dependencies. They might be able to provide the correct JAR
>> for you, but you'd have to check with them. It may also be possible for
>> you to manually include all of the necessary JARs on the Spark classpath
>> to mimic the fat JAR, but that's fairly ugly and time-consuming.
>>
>> FWIW, the HDP 2.5 Tech Preview seems to include the correct JAR, though
>> I haven't personally tested it out yet.
>>
>> Good luck,
>>
>> Josh
>>
>> On Tue, Jul 5, 2016 at 2:00 AM, Robert James <[email protected]> wrote:
>>
>> I'm trying to use Phoenix on Spark, and can't get around this error:
>>
>> java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/HBaseConfiguration
>>   at org.apache.phoenix.spark.PhoenixRDD.getPhoenixConfiguration(PhoenixRDD.scala:82)
>>
>> DETAILS:
>> 1. I'm running HDP 2.4.0.0-169.
>> 2. Using phoenix-sqlline, I can access Phoenix perfectly.
>> 3. Using hbase shell, I can access HBase perfectly.
>> 4. I added the following lines to /etc/spark/conf/spark-defaults.conf:
>>
>> spark.driver.extraClassPath /usr/hdp/current/phoenix-client/lib/phoenix-spark-4.4.0.2.4.0.0-169.jar
>> spark.executor.extraClassPath /usr/hdp/current/phoenix-client/lib/phoenix-spark-4.4.0.2.4.0.0-169.jar
>>
>> 5. Steps to reproduce the error:
>>
>> # spark-shell
>> ...
>> scala> import org.apache.phoenix.spark._
>> import org.apache.phoenix.spark._
>>
>> scala> sqlContext.load("org.apache.phoenix.spark", Map("table" -> "EMAIL_ENRON", "zkUrl" -> "localhost:2181"))
>> warning: there were 1 deprecation warning(s); re-run with -deprecation for details
>> java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/HBaseConfiguration
>>   at org.apache.phoenix.spark.PhoenixRDD.getPhoenixConfiguration(PhoenixRDD.scala:82)
>>
>> // Or, this gets the same error
>> scala> val rdd = sc.phoenixTableAsRDD("EMAIL_ENRON", Seq("MAIL_FROM", "MAIL_TO"), zkUrl=Some("localhost"))
>> java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/HBaseConfiguration
>>   at org.apache.phoenix.spark.PhoenixRDD.getPhoenixConfiguration(PhoenixRDD.scala:82)
>>   at org.apache.phoenix.spark.PhoenixRDD.phoenixConf$lzycompute(PhoenixRDD.scala:38)
>>
>> 6. I've tried every permutation I can think of, and also spent hours
>> Googling. Sometimes I get different errors, but always errors.
>> Interestingly, if I manage to load the HBaseConfiguration class
>> manually (by specifying classpaths and then importing it), I get a
>> "phoenixTableAsRDD is not a member of SparkContext" error.
>>
>> How can I use Phoenix from within Spark? I'm really eager to do so,
>> but haven't been able to.
>>
>> Also: Can someone give me some background on the underlying issues
>> here? Trial-and-error-plus-Google is not exactly high-quality
>> engineering; I'd like to understand the problem better.
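On the background question: org.apache.hadoop.hbase.HBaseConfiguration lives in HBase's hbase-common JAR, and the thin phoenix-spark module JAR doesn't bundle it, so the class is simply absent from Spark's classpath; a fat client JAR bundles the HBase and Hadoop dependencies alongside the Phoenix classes, which is why swapping JARs fixes it. Once a genuine fat client JAR is on both classpaths, the calls from Robert's message should work unchanged. A minimal sketch, assuming a placeholder JAR path as above (table, columns, and zkUrl are taken from the thread):

    # launch spark-shell with the fat client JAR (placeholder path)
    $ spark-shell --jars /usr/hdp/current/phoenix-client/phoenix-client-spark.jar

    scala> import org.apache.phoenix.spark._

    // DataFrame-style load, as in the original message
    scala> val df = sqlContext.load("org.apache.phoenix.spark", Map("table" -> "EMAIL_ENRON", "zkUrl" -> "localhost:2181"))

    // RDD-style load of selected columns
    scala> val rdd = sc.phoenixTableAsRDD("EMAIL_ENRON", Seq("MAIL_FROM", "MAIL_TO"), zkUrl = Some("localhost:2181"))
    scala> rdd.count()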
