Re: Configuring phoenix.query.dateFormatTimeZone

2015-08-17 Thread Gabriel Reid
If I'm understanding correctly, you're using DateTimeFormatter (from
joda-time) to convert a String to a long, and then instantiating a
java.sql.Timestamp from the long -- could you confirm that that is
correct? And could you also explain how you're checking what is stored
in Phoenix? Is that via sqlline or Squirrel or something similar, or
your own JDBC code?

Even better would be if you could write a little test class that
demonstrates the issue that you're running into. My gut feeling is
that things are working as intended and that it's timezone weirdness
in the JDBC spec that is causing the issue, but I need to nail down
your exact use case to verify this.
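
For what it's worth, a minimal sketch of the conversion you're describing
might look like the following (this assumes joda-time, and the pattern and
input value are made up):

    import java.sql.Timestamp;
    import org.joda.time.format.DateTimeFormat;
    import org.joda.time.format.DateTimeFormatter;

    public class TimestampTzDemo {
        public static void main(String[] args) {
            // Parse a GMT+3 date string into epoch millis with joda-time.
            DateTimeFormatter fmt =
                DateTimeFormat.forPattern("yyyy-MM-dd HH:mm:ss Z");
            long millis = fmt.parseMillis("2015-01-01 10:00:00 +0300");
            // java.sql.Timestamp just wraps the epoch millis; an apparent
            // three-hour shift usually comes from rendering the value in a
            // different default time zone, not from the stored instant.
            Timestamp ts = new Timestamp(millis);
            System.out.println(millis); // time-zone-independent instant
            System.out.println(ts);     // rendered in the JVM default zone
        }
    }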

- Gabriel


On Sun, Aug 16, 2015 at 5:54 PM, Naor David  wrote:
> I'm upserting timestamps by setting a java.sql.Timestamp object at its
> proper index in my PreparedStatement object, so it's something like this:
> ps.setObject(1, ts);
> By doing so I am not using the TO_DATE Phoenix function, nor am I parsing
> the timestamp as a String, so I think that setting the parameter in
> hbase-site.xml wouldn't help.
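>
> Roughly, the full statement looks like this (the table and column names
> here are just placeholders):
>
>     Connection conn = DriverManager.getConnection("jdbc:phoenix:zk-host");
>     PreparedStatement ps = conn.prepareStatement(
>         "UPSERT INTO EVENTS (CREATED_AT, ID) VALUES (?, ?)");
>     ps.setObject(1, ts); // ts is the java.sql.Timestamp built via joda-time
>     ps.setObject(2, 1L);
>     ps.executeUpdate();
>     conn.commit();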
>
> For example, if I insert the timestamp corresponding to 1-1-2015 10:00:00,
> the inserted timestamp column shows 1-1-2015 07:00:00, and so on.
> FYI, I use the DateTimeFormatter class to convert the date string (which
> comes with a GMT+3 suffix) into a Timestamp object, as above, before
> inserting it.
>
> - David
>
> On 14 Aug 2015 at 16:31, "Gabriel Reid"  wrote:
>
>> Hi David,
>>
>> How are you upserting timestamps? The phoenix.query.dateFormatTimeZone
>> config property only affects string parsing or the TO_DATE function (docs on
>> this are at [1]). If you're using the TO_DATE function, it's also possible
>> to supply a custom time zone in the function call (docs on this are at [2]).
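>>
>> For example (the format and time zone here are only illustrative):
>>
>>     UPSERT INTO MY_TABLE (ID, TS_COL)
>>     VALUES (1, TO_DATE('2015-01-01 10:00:00', 'yyyy-MM-dd HH:mm:ss', 'GMT+3'));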
>>
>> Regardless, if you want to use this setting, you need to update the
>> hbase-site.xml on the client machine where you're connecting to
>> HBase/Phoenix. This configuration file is typically in /etc/hbase/conf,
>> although if you're using Cloudera Manager (or similar cluster management
>> software), hbase-site.xml is overwritten automatically by CM, so you'll
>> need to configure this within Cloudera Manager itself (via configuration
>> settings called "Gateway safety valve", or something along those lines).
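>>
>> For example, in the client-side hbase-site.xml (the time zone id here is
>> just an example):
>>
>>     <property>
>>       <name>phoenix.query.dateFormatTimeZone</name>
>>       <value>Asia/Jerusalem</value>
>>     </property>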
>>
>> In any case, there are often issues due to the odd way in which JDBC
>> itself handles (or doesn't handle) time zones, so the best way to resolve
>> this is probably for you to post some examples of the statements you're
>> running, the output you're getting, and the output you would expect
>> instead.
>>
>> - Gabriel
>>
>> 1. http://phoenix.apache.org/tuning.html
>> 2. https://phoenix.apache.org/language/functions.html#to_date
>>
>> On Fri, Aug 14, 2015 at 1:59 PM Naor David  wrote:
>>>
>>> Hello,
>>> I recently installed Apache Phoenix 4.3 on a Cloudera cluster via parcel
>>> installation.
>>> My problem is that when inserting a java.sql.Timestamp object via JDBC,
>>> the corresponding inserted timestamp column is converted to the GMT+0
>>> time zone (while my local time zone is GMT+3).
>>> I understand that one can configure the Phoenix time zone by setting
>>> phoenix.query.dateFormatTimeZone to the desired time zone.
>>> My problem is that I don't know which hbase-site.xml I should edit (or
>>> where to find it).
>>>
>>> Any help would be appreciated.
>>>
>>> Regards,
>>> David.


Re: REG: Using Sequences in Phoenix Data Frame

2015-08-17 Thread Ns G
It would be really helpful if you could provide links to resources where
sequences are used in MapReduce, which I will try to replicate in Spark.

Thank you James and Josh for your answers.
On 17-Aug-2015 8:25 pm, "Josh Mahonin"  wrote:

> Oh, neat! I was looking for some references to it in code, unit tests and
> docs and didn't see anything relevant.
>
> It's possible they might "just work" then, although it's definitely an
> untested scenario.
>
> On Mon, Aug 17, 2015 at 10:48 AM, James Taylor 
> wrote:
>
>> Sequences are supported by MR integration, but I'm not sure if their
>> usage by the Spark integration would cause any issues.
>>
>>
>> On Monday, August 17, 2015, Josh Mahonin  wrote:
>>
>>> Hi Satya,
>>>
>>> I don't believe sequences are supported by the broader Phoenix
>>> map-reduce integration, which the phoenix-spark module uses under the hood.
>>>
>>> One workaround that would give you sequential IDs is to use the
>>> 'zipWithIndex' method on the underlying Spark RDD, with a small 'map()'
>>> operation to unpack / reorganize the tuple, before saving it to Phoenix.
>>>
>>> Good luck!
>>>
>>> Josh
>>>
>>> On Sat, Aug 15, 2015 at 10:02 AM, Ns G  wrote:
>>>
 Hi All,

 I hope that someone will reply to this email, as all my previous emails
 have gone unanswered.

 I have 10-20 million records in a file and I want to insert them through
 Phoenix-Spark.
 The table's primary ID is generated by a sequence, so every time an
 upsert is done, a new sequence ID is generated.

 Now I want to implement this in Spark, more precisely using data frames.
 Since RDDs are immutable, how can I add a sequence to the rows in a
 dataframe?

 Thanks for any help or direction or suggestion.

 Satya

>>>
>>>
>


Re: REG: Using Sequences in Phoenix Data Frame

2015-08-17 Thread Josh Mahonin
Oh, neat! I was looking for some references to it in code, unit tests and
docs and didn't see anything relevant.

It's possible they might "just work" then, although it's definitely an
untested scenario.

On Mon, Aug 17, 2015 at 10:48 AM, James Taylor 
wrote:

> Sequences are supported by MR integration, but I'm not sure if their
> usage by the Spark integration would cause any issues.
>
>
> On Monday, August 17, 2015, Josh Mahonin  wrote:
>
>> Hi Satya,
>>
>> I don't believe sequences are supported by the broader Phoenix map-reduce
>> integration, which the phoenix-spark module uses under the hood.
>>
>> One workaround that would give you sequential IDs is to use the
>> 'zipWithIndex' method on the underlying Spark RDD, with a small 'map()'
>> operation to unpack / reorganize the tuple, before saving it to Phoenix.
>>
>> Good luck!
>>
>> Josh
>>
>> On Sat, Aug 15, 2015 at 10:02 AM, Ns G  wrote:
>>
>>> Hi All,
>>>
>>> I hope that someone will reply to this email, as all my previous emails
>>> have gone unanswered.
>>>
>>> I have 10-20 million records in a file and I want to insert them through
>>> Phoenix-Spark.
>>> The table's primary ID is generated by a sequence, so every time an
>>> upsert is done, a new sequence ID is generated.
>>>
>>> Now I want to implement this in Spark, more precisely using data frames.
>>> Since RDDs are immutable, how can I add a sequence to the rows in a
>>> dataframe?
>>>
>>> Thanks for any help or direction or suggestion.
>>>
>>> Satya
>>>
>>
>>


Re: how to write a row_number function in phoenix?

2015-08-17 Thread James Taylor
You might be able to mimic what NEXT VALUE FOR does for sequences, but you
wouldn't need any server-side code. We do something similar for "query
more" support by creating a sequence on the fly and using a very large
cache value (see QueryMoreIT).
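
A rough sketch of that idea in Phoenix SQL (the sequence name, cache size,
and table/column names are illustrative, and I haven't run this):

    CREATE SEQUENCE IF NOT EXISTS row_num_seq CACHE 100000;
    SELECT NEXT VALUE FOR row_num_seq AS row_num, my_col FROM my_table;
    DROP SEQUENCE row_num_seq;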

On Monday, August 17, 2015, 曾柏棠  wrote:

> How to write a row_number function in Phoenix?
>


Re: REG: Using Sequences in Phoenix Data Frame

2015-08-17 Thread James Taylor
Sequences are supported by MR integration, but I'm not sure if their usage
by the Spark integration would cause any issues.

On Monday, August 17, 2015, Josh Mahonin  wrote:

> Hi Satya,
>
> I don't believe sequences are supported by the broader Phoenix map-reduce
> integration, which the phoenix-spark module uses under the hood.
>
> One workaround that would give you sequential IDs is to use the
> 'zipWithIndex' method on the underlying Spark RDD, with a small 'map()'
> operation to unpack / reorganize the tuple, before saving it to Phoenix.
>
> Good luck!
>
> Josh
>
> On Sat, Aug 15, 2015 at 10:02 AM, Ns G  wrote:
>
>> Hi All,
>>
>> I hope that someone will reply to this email, as all my previous emails
>> have gone unanswered.
>>
>> I have 10-20 million records in a file and I want to insert them through
>> Phoenix-Spark.
>> The table's primary ID is generated by a sequence, so every time an
>> upsert is done, a new sequence ID is generated.
>>
>> Now I want to implement this in Spark, more precisely using data frames.
>> Since RDDs are immutable, how can I add a sequence to the rows in a
>> dataframe?
>>
>> Thanks for any help or direction or suggestion.
>>
>> Satya
>>
>
>


Re: REG: Using Sequences in Phoenix Data Frame

2015-08-17 Thread Josh Mahonin
Hi Satya,

I don't believe sequences are supported by the broader Phoenix map-reduce
integration, which the phoenix-spark module uses under the hood.

One workaround that would give you sequential IDs is to use the
'zipWithIndex' method on the underlying Spark RDD, with a small 'map()'
operation to unpack / reorganize the tuple, before saving it to Phoenix.
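
Untested, but in Java the shape of that workaround is roughly the following
(the starting id is a placeholder, and the actual save to Phoenix is left
out):

    import java.util.Arrays;
    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaPairRDD;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;
    import scala.Tuple2;

    public class ZipWithIndexDemo {
        public static void main(String[] args) {
            SparkConf conf = new SparkConf()
                .setAppName("zipWithIndexDemo").setMaster("local[2]");
            try (JavaSparkContext sc = new JavaSparkContext(conf)) {
                JavaRDD<String> values =
                    sc.parallelize(Arrays.asList("a", "b", "c"));
                long startId = 1000L; // e.g. reserved from a Phoenix sequence
                // zipWithIndex assigns a stable, contiguous 0-based index.
                JavaPairRDD<String, Long> indexed = values.zipWithIndex();
                // A small map() unpacks (value, index) into (id, value).
                JavaRDD<Tuple2<Long, String>> rows =
                    indexed.map(t -> new Tuple2<>(startId + t._2(), t._1()));
                rows.collect().forEach(r ->
                    System.out.println(r._1() + " -> " + r._2()));
            }
        }
    }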

Good luck!

Josh

On Sat, Aug 15, 2015 at 10:02 AM, Ns G  wrote:

> Hi All,
>
> I hope that someone will reply to this email, as all my previous emails
> have gone unanswered.
>
> I have 10-20 million records in a file and I want to insert them through
> Phoenix-Spark.
> The table's primary ID is generated by a sequence, so every time an
> upsert is done, a new sequence ID is generated.
>
> Now I want to implement this in Spark, more precisely using data frames.
> Since RDDs are immutable, how can I add a sequence to the rows in a
> dataframe?
>
> Thanks for any help or direction or suggestion.
>
> Satya
>


how to write a row_number function in phoenix?

2015-08-17 Thread 曾柏棠
How to write a row_number function in Phoenix?
