Re: Apache Spark Integration
Hi Luqman,

At present, the phoenix-spark integration relies on the schema having already been created. There has been some discussion of augmenting the supported Spark SaveModes to include "CREATE IF NOT EXISTS" logic:

https://issues.apache.org/jira/browse/PHOENIX-2745
https://issues.apache.org/jira/browse/PHOENIX-2632

Contributions would be most welcome!

Josh

On Tue, Jul 18, 2017 at 6:50 AM, Luqman Ghani <lgsa...@gmail.com> wrote:
> Hi,
>
> I was wondering whether the phoenix-spark connector creates a new table if
> one doesn't already exist, or whether I have to create the table myself
> before calling saveToPhoenix on a DataFrame? It is not evident from the
> tests link Ankit provided.
>
> Thanks,
> Luqman
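Until those JIRAs land, the usual workaround is to issue the DDL yourself before writing. A minimal sketch of that pre-create-then-save flow, assuming a hypothetical OUTPUT_TABLE and a ZooKeeper quorum at localhost:2181 (adjust to your cluster):

    import java.sql.DriverManager
    import org.apache.spark.sql.SparkSession
    import org.apache.phoenix.spark._   // adds the saveToPhoenix implicit to DataFrame

    val zkUrl = "localhost:2181"        // placeholder ZooKeeper quorum

    // phoenix-spark will not create the table, so issue the DDL up front.
    val conn = DriverManager.getConnection(s"jdbc:phoenix:$zkUrl")
    conn.createStatement().execute(
      "CREATE TABLE IF NOT EXISTS OUTPUT_TABLE (ID BIGINT NOT NULL PRIMARY KEY, COL1 VARCHAR)")
    conn.close()

    val spark = SparkSession.builder().master("local[*]").appName("phoenix-save").getOrCreate()
    import spark.implicits._
    val df = Seq((1L, "a"), (2L, "b")).toDF("ID", "COL1")

    // Column names must line up with the Phoenix table's columns.
    df.saveToPhoenix("OUTPUT_TABLE", zkUrl = Some(zkUrl))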
Re: Apache Spark Integration
Hi,

I was wondering whether the phoenix-spark connector creates a new table if one doesn't already exist, or whether I have to create the table myself before calling saveToPhoenix on a DataFrame? It is not evident from the tests link Ankit provided.

Thanks,
Luqman

On Mon, Jul 17, 2017 at 11:23 PM, Luqman Ghani <lgsa...@gmail.com> wrote:
> Thanks, Ankit. I am sure this will help.
Re: Apache Spark Integration
You can take a look at our IT tests for the phoenix-spark module:
https://github.com/apache/phoenix/blob/master/phoenix-spark/src/it/scala/org/apache/phoenix/spark/PhoenixSparkIT.scala
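The read/write round trip those tests exercise looks roughly like the following sketch; the table names, columns, and ZK quorum are placeholders, and the target table must already exist in Phoenix:

    import org.apache.spark.sql.SparkSession
    import org.apache.phoenix.spark._   // phoenixTableAsDataFrame / saveToPhoenix implicits

    val spark = SparkSession.builder().master("local[*]").appName("phoenix-it-style").getOrCreate()
    val zkUrl = Some("localhost:2181")  // placeholder ZooKeeper quorum

    // Load an existing Phoenix table into a DataFrame (TABLE1, ID, COL1 are placeholders).
    val df = spark.sqlContext.phoenixTableAsDataFrame("TABLE1", Seq("ID", "COL1"), zkUrl = zkUrl)

    // ... transform as needed ...

    // Write to another Phoenix table; it must already exist with matching columns.
    df.saveToPhoenix("OUTPUT_TABLE", zkUrl = zkUrl)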
Fwd: Apache Spark Integration
-- Forwarded message --
From: Luqman Ghani <lgsa...@gmail.com>
Date: Sat, Jul 15, 2017 at 2:38 PM
Subject: Apache Spark Integration
To: user@phoenix.apache.org

Hi,

I am evaluating which approach to use for integrating Phoenix with Spark, namely JDBC versus phoenix-spark. I have one question regarding the following point in the Limitations section of the Apache Spark Integration page <https://phoenix.apache.org/phoenix_spark.html>:

"The Data Source API does not support passing custom Phoenix settings in configuration, you must create the DataFrame or RDD directly if you need fine-grained configuration."

Can someone point me to, or give, an example of how to pass such configuration?

Also, the docs <https://phoenix.apache.org/phoenix_spark.html#Saving_DataFrames> say there is a 'save' function for saving a DataFrame to a table, but there is none. Instead, 'saveToPhoenix' shows up in my IntelliJ IDEA suggestions. I'm using phoenix-4.11.0-HBase-1.2 and Spark 2.0.2. Is this an error in the docs?

Thanks,
Luqman
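On the configuration point, a sketch of what "create the DataFrame directly" means in practice: custom Phoenix/HBase client settings go into a Hadoop Configuration that is passed to phoenixTableAsDataFrame (or phoenixTableAsRDD) instead of Data Source API options. The table, columns, and property values below are illustrative assumptions:

    import org.apache.hadoop.conf.Configuration
    import org.apache.spark.sql.SparkSession
    import org.apache.phoenix.spark._

    val spark = SparkSession.builder().master("local[*]").appName("phoenix-conf").getOrCreate()

    // Fine-grained settings go into a Hadoop Configuration rather than Data Source options.
    val conf = new Configuration()
    conf.set("hbase.zookeeper.quorum", "localhost")      // placeholder quorum
    conf.set("phoenix.query.timeoutMs", "120000")        // example client-side Phoenix property

    // Create the DataFrame directly, passing the Configuration
    // (MY_TABLE, ID, COL1 are placeholders).
    val df = spark.sqlContext.phoenixTableAsDataFrame("MY_TABLE", Seq("ID", "COL1"), conf = conf)

    df.show()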