[jira] [Comment Edited] (PHOENIX-3814) Unable to connect to Phoenix via Spark

Josh Mahonin (JIRA) Tue, 02 May 2017 11:24:19 -0700

    [ 
https://issues.apache.org/jira/browse/PHOENIX-3814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15993446#comment-15993446
 ]


Josh Mahonin edited comment on PHOENIX-3814 at 5/2/17 6:23 PM:
---------------------------------------------------------------

Sorry for the delay in responding, was out of town and returned to a flooded 
house...

On the surface, that exception doesn't seem Spark-specific at all, but perhaps 
there's some sort of mismatch between the HBase/Hadoop JARs within Spark itself. I 
assume the SYSTEM.MUTEX issue doesn't crop up through any other usage pattern, 
only through Spark? Also, since there are multiple pre-built Spark packages for 
different versions of Hadoop, which one are you using? 
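
One rough way to look for such a mismatch is to list which Hadoop/HBase/Phoenix 
JAR versions sit side by side in a directory on the Spark classpath. The sketch 
below is only an illustration: the helper names are made up, it assumes JARs 
follow the usual "artifact-version.jar" naming, and it only inspects file names, 
so it will not see classes bundled inside a fat client JAR.

```python
import re
from collections import defaultdict
from pathlib import Path

# Splits e.g. "hadoop-common-2.7.3.jar" into ("hadoop-common", "2.7.3").
# Assumption: the version is the first dash-separated part starting with a digit.
JAR_NAME = re.compile(r"^(?P<artifact>.+?)-(?P<version>\d[\w.\-]*)\.jar$")


def jar_versions(jar_dir, prefixes=("hadoop", "hbase", "phoenix")):
    """Map artifact name -> set of versions found in jar_dir (hypothetical helper)."""
    found = defaultdict(set)
    for jar in Path(jar_dir).glob("*.jar"):
        match = JAR_NAME.match(jar.name)
        if match and match.group("artifact").startswith(prefixes):
            found[match.group("artifact")].add(match.group("version"))
    return dict(found)


def conflicting(found):
    """Keep only the artifacts that are present in more than one version."""
    return {name: versions for name, versions in found.items() if len(versions) > 1}
```

Pointing jar_versions() at the directory holding Spark's bundled JARs (for a 
pre-built Spark 2.x package that is normally the "jars" directory under the 
install root) and comparing the result with the Hadoop/HBase versions on the 
cluster should at least show whether two different versions are in play.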

Note that the Spark functionality has only been tested up to Spark 2.0, and 
the Spark project has a habit of breaking things between minor releases. If 
you're looking for a more stable solution, I would suggest looking at either 
Spark 1.6 or 2.0 with Phoenix 4.10. The release is binary compatible with 
Spark 2.0, but if you compile your own you can specify Spark 1.6 compatibility 
with the 'spark16' Maven profile.

Re: SaveModes, PHOENIX-2745 describes a similar issue. At this point, only 
'Overwrite' is supported since the DataFrame save() does a blind upsert, 
without checking if the data is already present. Patches would be greatly 
appreciated to update this behaviour.
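
For reference, a minimal sketch of a save using the one supported mode, written 
against the PySpark DataFrameWriter API. The helper names, the table name and 
the ZooKeeper address are placeholders for illustration, not anything taken 
from the reporter's setup.

```python
def phoenix_write_options(table, zk_url):
    """Options read by the phoenix-spark data source: target table and ZooKeeper URL."""
    return {"table": table, "zkUrl": zk_url}


def save_to_phoenix(df, table, zk_url):
    """Upsert every row of a Spark DataFrame into a Phoenix table (hypothetical helper).

    The mode is fixed to 'overwrite' because that is the only SaveMode accepted.
    The write is a blind upsert: there is no check for rows already present.
    """
    (df.write
       .format("org.apache.phoenix.spark")
       .mode("overwrite")
       .options(**phoenix_write_options(table, zk_url))
       .save())
```

A call would look like save_to_phoenix(df, "OUTPUT_TABLE", "localhost:2181"), 
where df is an existing DataFrame whose columns match the target table.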








> Unable to connect to Phoenix via Spark
> --------------------------------------
>
>                 Key: PHOENIX-3814
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3814
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.10.0
>         Environment: Ubuntu 16.04.1, Apache Spark 2.1.0, Hbase 1.2.5, Phoenix 
> 4.10.0
>            Reporter: Wajid Khattak
>
> Please see 
> http://stackoverflow.com/questions/43640864/apache-phoenix-for-spark-not-working



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
