Josh,

My production environment at our company is:

CDH 5.4.8
Hadoop 2.6.0-cdh5.4.8
YARN 2.6.0-cdh5.4.8
HBase 1.0.0-cdh5.4.8
Apache HBase 1.1.3
Spark 1.6.0
Phoenix 4.7.0
I tried to use the Phoenix Spark Plugin against both versions of HBase. I hope this helps.

Thanks,
Ben

> On Feb 20, 2016, at 7:37 AM, Josh Mahonin <jmaho...@gmail.com> wrote:
>
> Hi Ben,
>
> Can you describe in more detail what your environment is? Are you using stock installs of HBase, Spark and Phoenix? Are you using the hadoop2.4 pre-built Spark distribution as per the documentation [1]?
>
> The unread block data error is commonly traced back to this issue [2], which indicates some sort of mismatched version problem.
>
> Thanks,
>
> Josh
>
> [1] https://phoenix.apache.org/phoenix_spark.html
> [2] https://issues.apache.org/jira/browse/SPARK-1867
>
> On Fri, Feb 19, 2016 at 2:18 PM, Benjamin Kim <bbuil...@gmail.com> wrote:
> Hi Josh,
>
> When I run the following code in spark-shell for Spark 1.6:
>
> import org.apache.phoenix.spark._
> val df = sqlContext.load("org.apache.phoenix.spark", Map("table" -> "TEST.MY_TEST", "zkUrl" -> "zk1,zk2,zk3:2181"))
> df.select(df("ID")).show()
>
> I get this error:
>
> java.lang.IllegalStateException: unread block data
>
> Thanks,
> Ben
>
>> On Feb 19, 2016, at 11:12 AM, Josh Mahonin <jmaho...@gmail.com> wrote:
>>
>> What specifically doesn't work for you?
>>
>> I have a Docker image that I used to do some basic testing with, and I haven't run into any problems:
>> https://github.com/jmahonin/docker-phoenix/tree/phoenix_spark
>>
>> On Fri, Feb 19, 2016 at 12:40 PM, Benjamin Kim <bbuil...@gmail.com> wrote:
>> All,
>>
>> Thanks for the help. I have switched out Cloudera's HBase 1.0.0 with the current Apache HBase 1.1.3. Also, I installed Phoenix 4.7.0, and everything works fine except for the Phoenix Spark Plugin.
>> I wonder if it's a version incompatibility issue with Spark 1.6. Has anyone tried compiling 4.7.0 against Spark 1.6?
>>
>> Thanks,
>> Ben
>>
>>> On Feb 12, 2016, at 6:33 AM, Benjamin Kim <bbuil...@gmail.com> wrote:
>>>
>>> Does anyone know when Phoenix 4.7 will be officially released, and which Cloudera distribution versions it will be compatible with?
>>>
>>> Thanks,
>>> Ben
>>>
>>>> On Feb 10, 2016, at 11:03 AM, Benjamin Kim <bbuil...@gmail.com> wrote:
>>>>
>>>> Hi Pierre,
>>>>
>>>> I am getting this error now:
>>>>
>>>> Error: org.apache.phoenix.exception.PhoenixIOException: org.apache.hadoop.hbase.DoNotRetryIOException: SYSTEM.CATALOG,,1453397732623.8af7b44f3d7609eb301ad98641ff2611.: org.apache.hadoop.hbase.client.Delete.setAttribute(Ljava/lang/String;[B)Lorg/apache/hadoop/hbase/client/Delete;
>>>>
>>>> I even tried to use sqlline.py to do some queries, and it resulted in the same error. I followed the installation instructions. Is there something missing?
>>>>
>>>> Thanks,
>>>> Ben
>>>>
>>>>> On Feb 9, 2016, at 10:20 AM, Ravi Kiran <maghamraviki...@gmail.com> wrote:
>>>>>
>>>>> Hi Pierre,
>>>>>
>>>>> Try your luck building the artifacts from https://github.com/chiastic-security/phoenix-for-cloudera. Hopefully it helps.
>>>>>
>>>>> Regards,
>>>>> Ravi
>>>>>
>>>>> On Tue, Feb 9, 2016 at 10:04 AM, Benjamin Kim <bbuil...@gmail.com> wrote:
>>>>> Hi Pierre,
>>>>>
>>>>> I found this article about how Cloudera's version of HBase is very different from Apache HBase, so Phoenix must be compiled against Cloudera's repo and versions. But I'm not having any success with it.
>>>>> http://stackoverflow.com/questions/31849454/using-phoenix-with-cloudera-hbase-installed-from-repo
>>>>>
>>>>> There's also a Chinese site that does the same thing:
>>>>>
>>>>> https://www.zybuluo.com/xtccc/note/205739
>>>>>
>>>>> I keep getting errors like the ones below:
>>>>>
>>>>> [ERROR] /opt/tools/phoenix/phoenix-core/src/main/java/org/apache/hadoop/hbase/regionserver/LocalIndexMerger.java:[110,29] cannot find symbol
>>>>> [ERROR] symbol: class Region
>>>>> [ERROR] location: class org.apache.hadoop.hbase.regionserver.LocalIndexMerger
>>>>> …
>>>>>
>>>>> Have you tried this also?
>>>>>
>>>>> As a last resort, we will have to abandon Cloudera's HBase for Apache's HBase.
>>>>>
>>>>> Thanks,
>>>>> Ben
>>>>>
>>>>>> On Feb 8, 2016, at 11:04 PM, pierre lacave <pie...@lacave.me> wrote:
>>>>>>
>>>>>> Haven't met that one.
>>>>>>
>>>>>> According to SPARK-1867, the real issue is hidden. I'd proceed by elimination; maybe try in local[*] mode first.
>>>>>>
>>>>>> https://issues.apache.org/jira/plugins/servlet/mobile#issue/SPARK-1867
>>>>>>
>>>>>> On Tue, 9 Feb 2016, 04:58 Benjamin Kim <bbuil...@gmail.com> wrote:
>>>>>> Pierre,
>>>>>>
>>>>>> I got it to work using phoenix-4.7.0-HBase-1.0-client-spark.jar.
>>>>>> But now I get this error:
>>>>>>
>>>>>> org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 3, prod-dc1-datanode151.pdc1i.gradientx.com): java.lang.IllegalStateException: unread block data
>>>>>>
>>>>>> It happens when I do:
>>>>>>
>>>>>> df.show()
>>>>>>
>>>>>> Getting closer…
>>>>>>
>>>>>> Thanks,
>>>>>> Ben
>>>>>>
>>>>>>> On Feb 8, 2016, at 2:57 PM, pierre lacave <pie...@lacave.me> wrote:
>>>>>>>
>>>>>>> This is the wrong client jar. Try the one named phoenix-4.7.0-HBase-1.1-client-spark.jar.
>>>>>>>
>>>>>>> On Mon, 8 Feb 2016, 22:29 Benjamin Kim <bbuil...@gmail.com> wrote:
>>>>>>> Hi Josh,
>>>>>>>
>>>>>>> I tried again by putting the settings within spark-defaults.conf:
>>>>>>>
>>>>>>> spark.driver.extraClassPath=/opt/tools/phoenix/phoenix-4.7.0-HBase-1.0-client.jar
>>>>>>> spark.executor.extraClassPath=/opt/tools/phoenix/phoenix-4.7.0-HBase-1.0-client.jar
>>>>>>>
>>>>>>> I still get the same error using the code below:
>>>>>>>
>>>>>>> import org.apache.phoenix.spark._
>>>>>>> val df = sqlContext.load("org.apache.phoenix.spark", Map("table" -> "TEST.MY_TEST", "zkUrl" -> "zk1,zk2,zk3:2181"))
>>>>>>>
>>>>>>> Can you tell me what else you're doing?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Ben
>>>>>>>
>>>>>>>> On Feb 8, 2016, at 1:44 PM, Josh Mahonin <jmaho...@gmail.com> wrote:
>>>>>>>>
>>>>>>>> Hi Ben,
>>>>>>>>
>>>>>>>> I'm not sure about the format of those command-line options you're passing. I've had success with spark-shell just by setting the 'spark.executor.extraClassPath' and 'spark.driver.extraClassPath' options in the Spark config, as per the docs [1].
>>>>>>>> I'm not sure if there's anything special needed for CDH or not, though. I also have a Docker image I've been toying with which has a working Spark/Phoenix setup using the Phoenix 4.7.0 RC and Spark 1.6.0. It might be a useful reference for you as well [2].
>>>>>>>>
>>>>>>>> Good luck,
>>>>>>>>
>>>>>>>> Josh
>>>>>>>>
>>>>>>>> [1] https://phoenix.apache.org/phoenix_spark.html
>>>>>>>> [2] https://github.com/jmahonin/docker-phoenix/tree/phoenix_spark
>>>>>>>>
>>>>>>>> On Mon, Feb 8, 2016 at 4:29 PM, Benjamin Kim <bbuil...@gmail.com> wrote:
>>>>>>>> Hi Pierre,
>>>>>>>>
>>>>>>>> I tried to run spark-shell with Spark 1.6.0 like this:
>>>>>>>>
>>>>>>>> spark-shell --master yarn-client --driver-class-path /opt/tools/phoenix/phoenix-4.7.0-HBase-1.0-client.jar --driver-java-options "-Dspark.executor.extraClassPath=/opt/tools/phoenix/phoenix-4.7.0-HBase-1.0-client.jar"
>>>>>>>>
>>>>>>>> The version of HBase is the one in CDH 5.4.8, which is 1.0.0-cdh5.4.8.
>>>>>>>>
>>>>>>>> When I get to the line:
>>>>>>>>
>>>>>>>> val df = sqlContext.load("org.apache.phoenix.spark", Map("table" -> "TEST.MY_TEST", "zkUrl" -> "zk1,zk2,zk3:2181"))
>>>>>>>>
>>>>>>>> I get this error:
>>>>>>>>
>>>>>>>> java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.rdd.RDDOperationScope$
>>>>>>>>
>>>>>>>> Any ideas?
>>>>>>>> Thanks,
>>>>>>>> Ben
>>>>>>>>
>>>>>>>>> On Feb 5, 2016, at 1:36 PM, pierre lacave <pie...@lacave.me> wrote:
>>>>>>>>>
>>>>>>>>> I don't know when the full release will be; RC1 just got pulled, and RC2 is expected soon.
>>>>>>>>>
>>>>>>>>> You can find them here:
>>>>>>>>>
>>>>>>>>> https://dist.apache.org/repos/dist/dev/phoenix/
>>>>>>>>>
>>>>>>>>> There is a new phoenix-4.7.0-HBase-1.1-client-spark.jar that is all you need to have in the Spark classpath.
>>>>>>>>>
>>>>>>>>> Pierre Lacave
>>>>>>>>> 171 Skellig House, Custom House, Lower Mayor Street, Dublin 1, Ireland
>>>>>>>>> Phone: +353879128708
>>>>>>>>>
>>>>>>>>> On Fri, Feb 5, 2016 at 9:28 PM, Benjamin Kim <bbuil...@gmail.com> wrote:
>>>>>>>>> Hi Pierre,
>>>>>>>>>
>>>>>>>>> When will I be able to download this version?
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Ben
>>>>>>>>>
>>>>>>>>> On Friday, February 5, 2016, pierre lacave <pie...@lacave.me> wrote:
>>>>>>>>> This was addressed in Phoenix 4.7 (currently in RC):
>>>>>>>>> https://issues.apache.org/jira/browse/PHOENIX-2503
>>>>>>>>>
>>>>>>>>> On Fri, Feb 5, 2016 at 6:17 PM, Benjamin Kim <bbuil...@gmail.com> wrote:
>>>>>>>>> I cannot get this plugin to work in CDH 5.4.8 using Phoenix 4.5.2 and Spark 1.6.
>>>>>>>>> When I try to launch spark-shell, I get:
>>>>>>>>>
>>>>>>>>> java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
>>>>>>>>>
>>>>>>>>> I continue on and run the example code. When I get to the line below:
>>>>>>>>>
>>>>>>>>> val df = sqlContext.load("org.apache.phoenix.spark", Map("table" -> "TEST.MY_TEST", "zkUrl" -> "zookeeper1,zookeeper2,zookeeper3:2181"))
>>>>>>>>>
>>>>>>>>> I get this error:
>>>>>>>>>
>>>>>>>>> java.lang.NoSuchMethodError: com.fasterxml.jackson.module.scala.deser.BigDecimalDeserializer$.handledType()Ljava/lang/Class;
>>>>>>>>>
>>>>>>>>> Can someone help?
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Ben
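
Pulling the thread together, the fix that emerges is: put the Phoenix client-spark jar that matches the HBase version actually running on both the driver and executor classpaths, then load the table through the phoenix-spark data source. A minimal sketch of that setup, assuming an HBase 1.1 cluster (the jar path, table name, and ZooKeeper quorum are illustrative, and this requires a live HBase/Phoenix cluster):

```
// spark-defaults.conf -- both driver and executors need the *-client-spark jar
// matching the running HBase version (HBase 1.1 here; paths are illustrative):
//   spark.driver.extraClassPath=/opt/tools/phoenix/phoenix-4.7.0-HBase-1.1-client-spark.jar
//   spark.executor.extraClassPath=/opt/tools/phoenix/phoenix-4.7.0-HBase-1.1-client-spark.jar

// Then, in spark-shell (e.g. launched with `spark-shell --master yarn-client`):
import org.apache.phoenix.spark._

// Load a Phoenix table as a DataFrame via the phoenix-spark data source
// (Spark 1.x sqlContext.load API, as used throughout the thread).
val df = sqlContext.load("org.apache.phoenix.spark",
  Map("table" -> "TEST.MY_TEST", "zkUrl" -> "zk1,zk2,zk3:2181"))

df.select(df("ID")).show()
```

When the jar does not match the cluster's HBase or Spark version, the mismatch tends to surface as serialization-level failures on the executors, such as the `java.lang.IllegalStateException: unread block data` seen above (per SPARK-1867), rather than as a clear version error.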