Re: Apache Spark Integration

2017-07-19 Thread Josh Mahonin
Hi Luqman,

At present, the phoenix-spark integration relies on the schema already
having been created.

There has been some discussion of augmenting the supported Spark
'SaveMode's to include 'CREATE IF NOT EXISTS' logic.

https://issues.apache.org/jira/browse/PHOENIX-2745
https://issues.apache.org/jira/browse/PHOENIX-2632
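In the meantime the target table has to be created up front, e.g. via the
Phoenix JDBC driver, before saving to it. A minimal sketch, run from
spark-shell where sqlContext is in scope (the table, columns, and zkUrl
below are just placeholders):

import java.sql.DriverManager
import org.apache.phoenix.spark._

// Create the target table first; saveToPhoenix won't do this for you
val conn = DriverManager.getConnection("jdbc:phoenix:localhost")
conn.createStatement().execute(
  "CREATE TABLE IF NOT EXISTS OUTPUT_TABLE " +
  "(ID BIGINT NOT NULL PRIMARY KEY, COL1 VARCHAR)")
conn.close()

// Then save; the DataFrame column names must match the table's columns
val df = sqlContext.createDataFrame(Seq((1L, "x"), (2L, "y"))).toDF("ID", "COL1")
df.saveToPhoenix(Map("table" -> "OUTPUT_TABLE", "zkUrl" -> "localhost:2181"))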

Contributions would be most welcome!

Josh


Re: Apache Spark Integration

2017-07-18 Thread Luqman Ghani
Hi,

I was wondering whether the phoenix-spark connector creates a new table if
one doesn't already exist, or whether I have to create the table before
calling the saveToPhoenix function on a DataFrame. It isn't evident from
the tests link Ankit provided.

Thanks,
Luqman


Re: Apache Spark Integration

2017-07-17 Thread Ankit Singhal
You can take a look at our IT tests for the phoenix-spark module.
https://github.com/apache/phoenix/blob/master/phoenix-spark/src/it/scala/org/apache/phoenix/spark/PhoenixSparkIT.scala
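For a rough idea of what those tests exercise, the read/write round trip
looks something like this from spark-shell (the table names and zkUrl are
illustrative, and both tables must already exist):

import org.apache.phoenix.spark._

// Load a Phoenix table into a DataFrame
val df = sqlContext.phoenixTableAsDataFrame(
  "TABLE1", Array("ID", "COL1"), zkUrl = Some("localhost:2181"))

// Save it to another, pre-created table with matching columns
df.saveToPhoenix(Map("table" -> "OUTPUT_TABLE", "zkUrl" -> "localhost:2181"))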


Fwd: Apache Spark Integration

2017-07-17 Thread Luqman Ghani
-- Forwarded message --
From: Luqman Ghani <lgsa...@gmail.com>
Date: Sat, Jul 15, 2017 at 2:38 PM
Subject: Apache Spark Integration
To: user@phoenix.apache.org


Hi,

I am evaluating which approach to use for integrating Phoenix with Spark:
JDBC or phoenix-spark. I have one query regarding the following point,
stated under the limitations in the Apache Spark Integration
<https://phoenix.apache.org/phoenix_spark.html> section:
"

   - The Data Source API does not support passing custom Phoenix settings
   in configuration, you must create the DataFrame or RDD directly if you need
   fine-grained configuration.

"

Can someone point me to, or give, an example of how to pass such
configuration?
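For context, what I have in mind is roughly the following, guessing at the
conf parameter of phoenixTableAsDataFrame (the property name below is just
an example of a custom Phoenix setting):

import org.apache.hadoop.conf.Configuration
import org.apache.phoenix.spark._

// Hand custom Phoenix settings to the direct DataFrame constructor
val conf = new Configuration()
conf.set("phoenix.mutate.batchSize", "10000")
val df = sqlContext.phoenixTableAsDataFrame(
  "TABLE1", Array("ID", "COL1"), conf = conf,
  zkUrl = Some("localhost:2181"))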

Also, the docs
<https://phoenix.apache.org/phoenix_spark.html#Saving_DataFrames> say there
is a 'save' function for saving a DataFrame to a table, but I can't find
one; only 'saveToPhoenix' shows up in my IntelliJ IDE suggestions. I'm
using phoenix-4.11.0-HBase-1.2 and Spark 2.0.2. Is this an error in the docs?
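One guess: the docs may be showing the Spark 1.x DataFrame.save API, which
was removed in Spark 2.x. If so, I'd expect the Spark 2.x equivalent to go
through DataFrameWriter, along these lines (table name and zkUrl are
placeholders):

import org.apache.spark.sql.SaveMode

// Data Source API write path; the pre-created table is named via options
df.write
  .format("org.apache.phoenix.spark")
  .mode(SaveMode.Overwrite)
  .option("table", "OUTPUT_TABLE")
  .option("zkUrl", "localhost:2181")
  .save()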

Thanks,
Luqman