Re: Spark Phoenix Plugin

Josh Mahonin Sat, 20 Feb 2016 07:38:33 -0800

Hi Ben,

Can you describe in more detail what your environment is? Are you using
stock installs of HBase, Spark and Phoenix? Are you using the hadoop2.4
pre-built Spark distribution as per the documentation [1]?


The "unread block data" error is commonly traced back to this issue [2],
which indicates some sort of version mismatch.
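
One way to narrow down that kind of mismatch is to check which jar each JVM
actually loads a given class from, on the driver and on the executors. A
minimal sketch, assuming nothing beyond the JDK and the Scala standard
library; `ClasspathProbe`, `sourceOf` and `report` are hypothetical names
for illustration, not part of Spark or Phoenix:

```scala
// Hypothetical helper (not a Spark or Phoenix API): reports where a class
// would be loaded from in the current JVM.
object ClasspathProbe {

  /** Location (jar or directory) that `className` is loaded from.
    * The class is looked up without running its static initializers.
    */
  def sourceOf(className: String): String =
    try {
      val loader = Thread.currentThread.getContextClassLoader
      val cls = Class.forName(className, false, loader)
      // JDK bootstrap classes have no code source, hence the Option wrappers.
      Option(cls.getProtectionDomain.getCodeSource)
        .flatMap(cs => Option(cs.getLocation))
        .map(_.toString)
        .getOrElse("(bootstrap or unknown source)")
    } catch {
      case e @ (_: ClassNotFoundException | _: LinkageError) =>
        s"(not loadable: ${e.getClass.getSimpleName})"
    }

  /** One line per class name, in the form "name <- location". */
  def report(classNames: Seq[String]): Seq[String] =
    classNames.map(name => s"$name <- ${sourceOf(name)}")
}
```

In spark-shell, `ClasspathProbe.sourceOf("org.apache.phoenix.jdbc.PhoenixDriver")`
would show the driver's view. Running the same lookup inside a small job
(for example a `map` over `sc.parallelize(1 to 100, 10)` followed by
`distinct().collect()`) should show the executors' view, provided the job
itself gets past deserialization. If the two sides report different jars,
or one side cannot load the class, that points at the classpath settings.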

Thanks,

Josh

[1] https://phoenix.apache.org/phoenix_spark.html
[2] https://issues.apache.org/jira/browse/SPARK-1867

On Fri, Feb 19, 2016 at 2:18 PM, Benjamin Kim <[email protected]> wrote:

> Hi Josh,
>
> When I run the following code in spark-shell for spark 1.6:
>
> import org.apache.phoenix.spark._
> val df = sqlContext.load("org.apache.phoenix.spark", Map("table" ->
> "TEST.MY_TEST", "zkUrl" -> "zk1,zk2,zk3:2181"))
> df.select(df("ID")).show()
>
> I get this error:
>
> java.lang.IllegalStateException: unread block data
>
> Thanks,
> Ben
>
>
> On Feb 19, 2016, at 11:12 AM, Josh Mahonin <[email protected]> wrote:
>
> What specifically doesn't work for you?
>
> I have a Docker image that I used for some basic testing, and I haven't
> run into any problems with it:
> https://github.com/jmahonin/docker-phoenix/tree/phoenix_spark
>
> On Fri, Feb 19, 2016 at 12:40 PM, Benjamin Kim <[email protected]> wrote:
>
>> All,
>>
>> Thanks for the help. I have switched out Cloudera’s HBase 1.0.0 with the
>> current Apache HBase 1.1.3. Also, I installed Phoenix 4.7.0, and everything
>> works fine except for the Phoenix Spark Plugin. I wonder if it’s a version
>> incompatibility issue with Spark 1.6. Has anyone tried compiling 4.7.0
>> using Spark 1.6?
>>
>> Thanks,
>> Ben
>>
>> On Feb 12, 2016, at 6:33 AM, Benjamin Kim <[email protected]> wrote:
>>
>> Anyone know when Phoenix 4.7 will be officially released? And what
>> Cloudera distribution versions will it be compatible with?
>>
>> Thanks,
>> Ben
>>
>> On Feb 10, 2016, at 11:03 AM, Benjamin Kim <[email protected]> wrote:
>>
>> Hi Pierre,
>>
>> I am getting this error now.
>>
>> Error: org.apache.phoenix.exception.PhoenixIOException:
>> org.apache.hadoop.hbase.DoNotRetryIOException:
>> SYSTEM.CATALOG,,1453397732623.8af7b44f3d7609eb301ad98641ff2611.:
>> org.apache.hadoop.hbase.client.Delete.setAttribute(Ljava/lang/String;[B)Lorg/apache/hadoop/hbase/client/Delete;
>>
>> I even tried to use sqlline.py to do some queries too. It resulted in the
>> same error. I followed the installation instructions. Is there something
>> missing?
>>
>> Thanks,
>> Ben
>>
>>
>> On Feb 9, 2016, at 10:20 AM, Ravi Kiran <[email protected]>
>> wrote:
>>
>> Hi Pierre,
>>
>>   Try your luck building the artifacts from
>> https://github.com/chiastic-security/phoenix-for-cloudera. Hopefully it
>> helps.
>>
>> Regards
>> Ravi .
>>
>> On Tue, Feb 9, 2016 at 10:04 AM, Benjamin Kim <[email protected]> wrote:
>>
>>> Hi Pierre,
>>>
>>> I found this article about how Cloudera’s version of HBase is very
>>> different from Apache HBase, so Phoenix must be compiled using Cloudera’s
>>> repo and versions. But I’m not having any success with it.
>>>
>>>
>>> http://stackoverflow.com/questions/31849454/using-phoenix-with-cloudera-hbase-installed-from-repo
>>>
>>> There’s also a Chinese site that does the same thing.
>>>
>>> https://www.zybuluo.com/xtccc/note/205739
>>>
>>> I keep getting errors like the ones below.
>>>
>>> [ERROR]
>>> /opt/tools/phoenix/phoenix-core/src/main/java/org/apache/hadoop/hbase/regionserver/LocalIndexMerger.java:[110,29]
>>> cannot find symbol
>>> [ERROR] symbol:   class Region
>>> [ERROR] location: class
>>> org.apache.hadoop.hbase.regionserver.LocalIndexMerger
>>> …
>>>
>>> Have you tried this also?
>>>
>>> As a last resort, we will have to abandon Cloudera’s HBase for Apache’s
>>> HBase.
>>>
>>> Thanks,
>>> Ben
>>>
>>>
>>> On Feb 8, 2016, at 11:04 PM, pierre lacave <[email protected]> wrote:
>>>
>>> Haven't met that one.
>>>
>>> According to SPARK-1867, the real issue is hidden.
>>>
>>> I'd proceed by elimination; maybe try local[*] mode first.
>>>
>>> https://issues.apache.org/jira/plugins/servlet/mobile#issue/SPARK-1867
>>>
>>> On Tue, 9 Feb 2016, 04:58 Benjamin Kim <[email protected]> wrote:
>>>
>>>> Pierre,
>>>>
>>>> I got it to work using phoenix-4.7.0-HBase-1.0-client-spark.jar. But,
>>>> now, I get this error:
>>>>
>>>> org.apache.spark.SparkException: Job aborted due to stage failure: Task
>>>> 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage
>>>> 0.0 (TID 3, prod-dc1-datanode151.pdc1i.gradientx.com):
>>>> java.lang.IllegalStateException: unread block data
>>>>
>>>> It happens when I do:
>>>>
>>>> df.show()
>>>>
>>>> Getting closer…
>>>>
>>>> Thanks,
>>>> Ben
>>>>
>>>>
>>>>
>>>> On Feb 8, 2016, at 2:57 PM, pierre lacave <[email protected]> wrote:
>>>>
>>>> This is the wrong client jar. Try the one named
>>>> phoenix-4.7.0-HBase-1.1-client-spark.jar
>>>>
>>>> On Mon, 8 Feb 2016, 22:29 Benjamin Kim <[email protected]> wrote:
>>>>
>>>>> Hi Josh,
>>>>>
>>>>> I tried again by putting the settings in spark-defaults.conf.
>>>>>
>>>>>
>>>>> spark.driver.extraClassPath=/opt/tools/phoenix/phoenix-4.7.0-HBase-1.0-client.jar
>>>>>
>>>>> spark.executor.extraClassPath=/opt/tools/phoenix/phoenix-4.7.0-HBase-1.0-client.jar
>>>>>
>>>>> I still get the same error using the code below.
>>>>>
>>>>> import org.apache.phoenix.spark._
>>>>> val df = sqlContext.load("org.apache.phoenix.spark", Map("table" ->
>>>>> "TEST.MY_TEST", "zkUrl" -> "zk1,zk2,zk3:2181"))
>>>>>
>>>>> Can you tell me what else you’re doing?
>>>>>
>>>>> Thanks,
>>>>> Ben
>>>>>
>>>>>
>>>>> On Feb 8, 2016, at 1:44 PM, Josh Mahonin <[email protected]> wrote:
>>>>>
>>>>> Hi Ben,
>>>>>
>>>>> I'm not sure about the format of those command line options you're
>>>>> passing. I've had success with spark-shell just by setting the
>>>>> 'spark.executor.extraClassPath' and 'spark.driver.extraClassPath' options
>>>>> on the spark config, as per the docs [1].
>>>>>
>>>>> I'm not sure if there's anything special needed for CDH or not though.
>>>>> I also have a docker image I've been toying with which has a working
>>>>> Spark/Phoenix setup using the Phoenix 4.7.0 RC and Spark 1.6.0. It might
>>>>> be a useful reference for you as well [2].
>>>>>
>>>>> Good luck,
>>>>>
>>>>> Josh
>>>>>
>>>>> [1] https://phoenix.apache.org/phoenix_spark.html
>>>>> [2] https://github.com/jmahonin/docker-phoenix/tree/phoenix_spark
>>>>>
>>>>> On Mon, Feb 8, 2016 at 4:29 PM, Benjamin Kim <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Hi Pierre,
>>>>>>
>>>>>> I tried to run in spark-shell using spark 1.6.0 by running this:
>>>>>>
>>>>>> spark-shell --master yarn-client \
>>>>>>   --driver-class-path /opt/tools/phoenix/phoenix-4.7.0-HBase-1.0-client.jar \
>>>>>>   --driver-java-options "-Dspark.executor.extraClassPath=/opt/tools/phoenix/phoenix-4.7.0-HBase-1.0-client.jar"
>>>>>>
>>>>>> The version of HBase is the one in CDH5.4.8, which is 1.0.0-cdh5.4.8.
>>>>>>
>>>>>> When I get to the line:
>>>>>>
>>>>>> val df = sqlContext.load("org.apache.phoenix.spark", Map("table" ->
>>>>>> "TEST.MY_TEST", "zkUrl" -> "zk1,zk2,zk3:2181"))
>>>>>>
>>>>>> I get this error:
>>>>>>
>>>>>> java.lang.NoClassDefFoundError: Could not initialize class
>>>>>> org.apache.spark.rdd.RDDOperationScope$
>>>>>>
>>>>>> Any ideas?
>>>>>>
>>>>>> Thanks,
>>>>>> Ben
>>>>>>
>>>>>>
>>>>>> On Feb 5, 2016, at 1:36 PM, pierre lacave <[email protected]> wrote:
>>>>>>
>>>>>> I don't know when the full release will be. RC1 just got pulled, and
>>>>>> RC2 is expected soon.
>>>>>>
>>>>>> You can find them here:
>>>>>>
>>>>>> https://dist.apache.org/repos/dist/dev/phoenix/
>>>>>>
>>>>>>
>>>>>> There is a new phoenix-4.7.0-HBase-1.1-client-spark.jar that is all
>>>>>> you need to have in the Spark classpath.
>>>>>>
>>>>>>
>>>>>> *Pierre Lacave*
>>>>>> 171 Skellig House, Custom House, Lower Mayor street, Dublin 1, Ireland
>>>>>> Phone :       +353879128708
>>>>>>
>>>>>> On Fri, Feb 5, 2016 at 9:28 PM, Benjamin Kim <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Pierre,
>>>>>>>
>>>>>>> When will I be able to download this version?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Ben
>>>>>>>
>>>>>>>
>>>>>>> On Friday, February 5, 2016, pierre lacave <[email protected]> wrote:
>>>>>>>
>>>>>>>> This was addressed in Phoenix 4.7 (currently in RC)
>>>>>>>> https://issues.apache.org/jira/browse/PHOENIX-2503
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> *Pierre Lacave*
>>>>>>>> 171 Skellig House, Custom House, Lower Mayor street, Dublin 1,
>>>>>>>> Ireland
>>>>>>>> Phone :       +353879128708
>>>>>>>>
>>>>>>>> On Fri, Feb 5, 2016 at 6:17 PM, Benjamin Kim <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> I cannot get this plugin to work in CDH 5.4.8 using Phoenix 4.5.2
>>>>>>>>> and Spark 1.6. When I try to launch spark-shell, I get:
>>>>>>>>>
>>>>>>>>>         java.lang.RuntimeException: java.lang.RuntimeException:
>>>>>>>>> Unable to instantiate
>>>>>>>>> org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
>>>>>>>>>
>>>>>>>>> I continue on and run the example code. When I get to the line
>>>>>>>>> below:
>>>>>>>>>
>>>>>>>>>         val df = sqlContext.load("org.apache.phoenix.spark",
>>>>>>>>> Map("table" -> "TEST.MY_TEST", "zkUrl" ->
>>>>>>>>> "zookeeper1,zookeeper2,zookeeper3:2181"))
>>>>>>>>>
>>>>>>>>> I get this error:
>>>>>>>>>
>>>>>>>>>         java.lang.NoSuchMethodError:
>>>>>>>>> com.fasterxml.jackson.module.scala.deser.BigDecimalDeserializer$.handledType()Ljava/lang/Class;
>>>>>>>>>
>>>>>>>>> Can someone help?
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Ben
