Re: Read-Write data to/from Phoenix 4.13 or 4.14 with Spark SQL Dataframe 2.1.0

2018-09-18 Thread Josh Elser

Ok, glad to hear you got it working.

I don't do any work with the CDH-based releases. Pedro is the primary 
maintainer of these -- I imagine he's watching this email list. Does the 
4.14.0-cdh5.11.2 release not contain a phoenix-spark2 jar? Maybe the 
explanation is just that the spark2 integration wasn't backported into 
that release... not sure.


Re: Read-Write data to/from Phoenix 4.13 or 4.14 with Spark SQL Dataframe 2.1.0

2018-09-18 Thread lkyaes
Sorry for the bad copy-paste; the versions we combined are
phoenix-4.14.0-cdh5.11.2-client.jar and
phoenix-spark2-4.7.0.2.6.5.3002-10.jar (from the last release,
4.7.0.2.6.5.3002-10, 17.8.2018). That pair worked for us.

Br.


Re: Read-Write data to/from Phoenix 4.13 or 4.14 with Spark SQL Dataframe 2.1.0

2018-09-18 Thread lkyaes
Hello, thank you for your response. It gave me a tip :)

I've reviewed our JARs one more time.
Before, we used

   - phoenix-4.14.0-cdh5.11.2-client.jar
   - phoenix-spark-4.14.0-cdh5.11.2.jar

because they came together with the APACHE_PHOENIX 4.14.0-cdh5.11.2.p0.3
*Cloudera parcel* from
http://www.apache.org/dist/phoenix/apache-phoenix-4.14.0-cdh5.11.2/parcels/
the only version available for our environment.

But now I've found Phoenix-Spark2:
https://javalibs.com/artifact/org.apache.phoenix/phoenix-spark2
I've installed and configured this one, and it works.
One thing: the last release, 4.14.0-cdh5.11.2 (9.6.2018), which has the
Spark2 JAR phoenix-spark2-4.7.0.2.6.5.3002-10.jar, really confused us,
because the release number 4.7 looks like an old Phoenix version and
because this build is for Hortonworks.
In any case, *phoenix-4.14.0-cdh5.11.2-client.jar* and
*phoenix-spark2-4.7.0.2.6.5.3002-10.jar* are working for us; at least we
can load and save data from/to Phoenix.
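
For reference, a minimal sketch of the pattern (a sketch only: the jar
paths, table name, and ZooKeeper quorum below are placeholders; the
documented phoenix-spark approach is to put the client and spark jars on
both the driver and executor classpaths):

    # spark-defaults.conf (placeholder paths):
    #   spark.driver.extraClassPath   /opt/phoenix/phoenix-4.14.0-cdh5.11.2-client.jar:/opt/phoenix/phoenix-spark2-4.7.0.2.6.5.3002-10.jar
    #   spark.executor.extraClassPath /opt/phoenix/phoenix-4.14.0-cdh5.11.2-client.jar:/opt/phoenix/phoenix-spark2-4.7.0.2.6.5.3002-10.jar
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("phoenix-save-check").getOrCreate()
    df = spark.createDataFrame([(1, "one"), (2, "two")], ["ID", "VAL"])

    # phoenix-spark only supports SaveMode.Overwrite, which maps to
    # Phoenix UPSERT semantics rather than truncate-and-replace.
    (df.write
       .format("org.apache.phoenix.spark")
       .mode("overwrite")
       .option("table", "OUTPUT_TABLE")    # placeholder table name
       .option("zkUrl", "zk-host:2181")    # placeholder ZooKeeper quorum
       .save())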

Regards,
Liubov
Data Engineer
IR.ee



Re: Read-Write data to/from Phoenix 4.13 or 4.14 with Spark SQL Dataframe 2.1.0

2018-09-10 Thread Josh Elser
Lots of details are missing here about how you're trying to submit these
Spark jobs, but let me try to explain how things work now:

Phoenix provides spark(1) and spark2 jars. These JARs provide the
implementation for Spark *on top* of what the phoenix-client.jar provides.
You want to include both the phoenix-client and the relevant phoenix-spark
jar when you submit your application.


This should be how things are meant to work with Phoenix 4.13 and 4.14. 
If this doesn't help you, please give us some more specifics about the 
commands you run and the output you get. Thanks!
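
For example (a sketch with assumed paths and jar names -- adjust to
whatever your Phoenix distribution actually ships):

    # Hypothetical spark-submit invocation shipping both jars:
    #
    #   spark-submit \
    #     --jars /opt/phoenix/phoenix-client.jar,/opt/phoenix/phoenix-spark2.jar \
    #     my_app.py
    #
    # With both jars on the classpath, the Phoenix DataSource resolves:
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("my_app").getOrCreate()
    df = (spark.read
          .format("org.apache.phoenix.spark")
          .option("table", "MY_TABLE")      # placeholder table name
          .option("zkUrl", "zk-host:2181")  # placeholder ZooKeeper quorum
          .load())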


Read-Write data to/from Phoenix 4.13 or 4.14 with Spark SQL Dataframe 2.1.0

2018-09-10 Thread lkyaes
Hello!

I wonder if there is any way to get Phoenix 4.13 or 4.14 working with
Spark 2.1.0.

In production we used Spark SQL DataFrames to load data from and write data
to HBase with Apache Phoenix (Spark 1.6 and Phoenix 4.7), and it worked well.

After the upgrade, we face issues with loading and writing; it is no longer
possible.

Our environment:

   - Cloudera 5.11.2
   - HBase 1.2
   - Spark 2.1.0 (parcel, compatible with Cloudera 5.11.2)
   - APACHE_PHOENIX 4.14.0-cdh5.11.2.p0.3 (we tested 4.13 as well)

We read/write the data from Python (the PySpark library), but the same
errors also occur when writing in Scala.
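
For concreteness, the calls in question have this shape (a reconstruction
with placeholder table and ZooKeeper names, not our exact code):

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("phoenix-io").getOrCreate()

    # Read: fails at .load() with the errors quoted below.
    df = (spark.read
          .format("org.apache.phoenix.spark")
          .option("table", "MY_TABLE")      # placeholder table name
          .option("zkUrl", "zk-host:2181")  # placeholder ZooKeeper quorum
          .load())

    # Write: fails at .save() with the AbstractMethodError quoted below.
    (df.write
       .format("org.apache.phoenix.spark")
       .mode("overwrite")
       .option("table", "MY_TABLE")
       .option("zkUrl", "zk-host:2181")
       .save())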

*Read data from Phoenix 4.13 with Spark 2.1.0 error:*

Py4JJavaError: An error occurred while calling o213.load.
: java.lang.NoClassDefFoundError: org/apache/spark/sql/DataFrame

*Read data from Phoenix 4.14 with Spark 2.1.0 error:*

Py4JJavaError: An error occurred while calling o89.load. :
com.google.common.util.concurrent.ExecutionError:
java.lang.NoSuchMethodError:
com.lmax.disruptor.dsl.Disruptor.<init>(Lcom/lmax/disruptor/EventFactory;ILjava/util/concurrent/ThreadFactory;Lcom/lmax/disruptor/dsl/ProducerType;Lcom/lmax/disruptor/WaitStrategy;)V

(Changing the disruptor jar version did not solve the issue.)

*Insert data to Phoenix 4.14 with Spark 2.1.0 error:*

Py4JJavaError: An error occurred while calling o186.save. :
java.lang.AbstractMethodError:
org.apache.phoenix.spark.DefaultSource.createRelation(Lorg/apache/spark/sql/SQLContext;Lorg/apache/spark/sql/SaveMode;Lscala/collection/immutable/Map;Lorg/apache/spark/sql/Dataset;)Lorg/apache/spark/sql/sources/BaseRelation;



Actually, we are aware that Spark2 fails to read and write Phoenix because
Spark changed the DataFrame API; combined with a Scala version change, the
resulting JAR isn't binary compatible with Spark versions < 2.0.

*The DataFrame class is missing from Spark 2*, and this issue was fixed
once by a patch for Phoenix version 4.10:
https://issues.apache.org/jira/browse/PHOENIX-

Unfortunately, this patch is not suitable for our environment. Could you
please comment on whether other versions of Phoenix have such a fix?

How can we read/write data from Phoenix 4.13 or 4.14 using Spark2?

Regards, and hoping for your help,
Liubov Kyaes
Data Engineer
ir.ee