For better performance, I'd recommend this, as otherwise your needlessly
creating a bunch of strings on the server-side when your query is running:

select to_char(trunc(date - 8.0/24.0,'DAY') , 'yyyy-MM-dd'), count(1) from
events where date > to_date('2013-02-01 PST','yyyy-MM-dd Z') group by
trunc(date - 8.0/24.0,'DAY')


On Tue, Feb 25, 2014 at 3:48 PM, Sean Huo <[email protected]> wrote:

> Thanks.
>
> Looks like using this query
>
> select to_char(trunc(date - 8.0/24.0,'DAY') , 'yyyy-MM-dd') as day,
> count(1) from events where date > to_date('2013-02-01 PST','yyyy-MM-dd Z')
> group by day
>
> will achieve aggregation by day in pst zone.
>
>
>
> On Tue, Feb 25, 2014 at 2:44 PM, James Taylor <[email protected]>wrote:
>
>> Run this if you want to see the difference between doing and not doing
>> TRUNC:
>> select to_char(trunc(date,'DAY'),'yyyy-MM-dd hh'),
>> to_char(date,'yyyy-MM-dd hh') from bi.events limit 1;
>>
>> Sqlline is formatting the date if you don't do a TO_CHAR on it, not
>> Phoenix. I still don't understand why it matters how it's printed out. The
>> underlying date is not changing.
>>
>>
>> On Tue, Feb 25, 2014 at 1:15 PM, Sean Huo <[email protected]> wrote:
>>
>>> here is what i get
>>>
>>> select trunc(date,'DAY'), to_char(date,'yyyy-MM-dd hh'),date from
>>> bi.events limit 1;
>>>
>>> *+----------------------+---------------+--------------+*
>>>
>>> *| **FLOOR(TO_DATE(DATE))** | **TO_CHAR(DATE)** | **    DATE    ** |*
>>>
>>> *+----------------------+---------------+--------------+*
>>>
>>> *| *2013-07-14          * | *2013-07-15 07* | *2013-07-15 00:01:02.346*
>>> |*
>>>
>>> *+----------------------+---------------+--------------+*
>>>
>>>
>>> looks like to_char is formatting using gmt, where date is in pst.
>>>
>>>
>>> On Tue, Feb 25, 2014 at 1:10 PM, James Taylor <[email protected]>wrote:
>>>
>>>> I'm not sure how sqlline prints date fields by default. Can you do a
>>>> TO_CHAR(date) with a format string instead for both dates?
>>>>
>>>> The TRUNC function just truncates the date to a day boundary when you
>>>> specify the 'DAY' argument.
>>>>
>>>>
>>>> On Tue, Feb 25, 2014 at 12:59 PM, Sean Huo <[email protected]>wrote:
>>>>
>>>>> Hi James,
>>>>>
>>>>> I still don't understand the semantics of trunc function.
>>>>>
>>>>> Here is what is displayed on sqlline
>>>>>
>>>>> select trunc(date,'DAY'), date from events limit 1;
>>>>>
>>>>> *+----------------------+--------------+*
>>>>>
>>>>> *| **FLOOR(TO_DATE(DATE))** | **    DATE    ** |*
>>>>>
>>>>> *+----------------------+--------------+*
>>>>>
>>>>> *| *2013-07-14          * | *2013-07-15 00:01:02.346* |*
>>>>>
>>>>> *+----------------------+--------------+*
>>>>>
>>>>> Can you tell me why it returns '2013-07-14' rather than '2013-07-15'
>>>>>
>>>>> Thanks
>>>>> Sean
>>>>>
>>>>>
>>>>> On Tue, Feb 25, 2014 at 12:18 PM, James Taylor <[email protected]
>>>>> > wrote:
>>>>>
>>>>>> You'll get much better performance using the TRUNC function. See
>>>>>> org.apache.phoenix.end2end.ProductMetricTest for some examples.
>>>>>>
>>>>>> Thanks,
>>>>>> James
>>>>>>
>>>>>>
>>>>>> On Tue, Feb 25, 2014 at 12:11 PM, Sean Huo <[email protected]>wrote:
>>>>>>
>>>>>>> My usecase is simple. I have a event table that has timestamp as
>>>>>>> part of the key. I want to do a event count per day
>>>>>>>
>>>>>>> I could do this in phoenix
>>>>>>>
>>>>>>> select to_char(ts,'yyyy-MM-dd') day, count(1) from events group by
>>>>>>> day;
>>>>>>>
>>>>>>> it returns event count per day per GMT time.
>>>>>>>
>>>>>>> Now if I am interested in doing a event count per PST timezone,
>>>>>>>
>>>>>>> That query doesn't work any more.
>>>>>>>
>>>>>>> The round about way to do this is to execute a query per day like
>>>>>>> this
>>>>>>>
>>>>>>> select count(1) from events where ts between to_date('2014-02-20
>>>>>>> PST','yyyy-MM-dd Z') and to_date('2014-02-21 13 PST','yyyy-MM-dd Z')
>>>>>>>
>>>>>>> I will look into trunc function. There is not much documentation and
>>>>>>> usage on the function.
>>>>>>>
>>>>>>> Thanks
>>>>>>> Sean
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Feb 25, 2014 at 11:53 AM, James Taylor <
>>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>>> Would it be possible to get a bit more info on your use case?
>>>>>>>> Usually showing a date/time using a different timezone is a end-user
>>>>>>>> display issue. How does this impact your group by? Grouping by a 
>>>>>>>> date/time
>>>>>>>> will be the same regardless of the timezone you use to format your 
>>>>>>>> date.
>>>>>>>>
>>>>>>>> Do you know about our TRUNC and ROUND functions?
>>>>>>>> http://phoenix.incubator.apache.org/language/functions.html#/truncate
>>>>>>>>
>>>>>>>> This is typically a good way to "bucketize" a date when you do a
>>>>>>>> group by, like this:
>>>>>>>>     SELECT count(*) FROM t GROUP BY TRUNC(my_date,'DAY')
>>>>>>>>
>>>>>>>> You can use date arithmetic if you wanted to "shift" all the dates
>>>>>>>> based on a timezone offset, like this (shifting 8 hours forward):
>>>>>>>>    SELECT count(*) FROM t GROUP BY TRUNC(my_date + 8.0/24.0,'DAY')
>>>>>>>>
>>>>>>>> HTH,
>>>>>>>> James
>>>>>>>>
>>>>>>>> On Tue, Feb 25, 2014 at 11:41 AM, Sean Huo <[email protected]>wrote:
>>>>>>>>
>>>>>>>>> Well, I can not use to_date function since it expects a string
>>>>>>>>> input while I have a timestamp.
>>>>>>>>> Also doing is in java is not a solution since I want to do a group
>>>>>>>>> by on the timestamp in a customized timezone.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Tue, Feb 25, 2014 at 11:25 AM, James Taylor <
>>>>>>>>> [email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Have you tried using the TO_DATE in conjunction with the TO_CHAR,
>>>>>>>>>> where you specify a different timezone in the TO_DATE format_arg?
>>>>>>>>>>
>>>>>>>>>> Another option is to do this in Java. When you do a
>>>>>>>>>> resultSet.getDate("MY_DATE_COL"), you can do whatever you want with 
>>>>>>>>>> the
>>>>>>>>>> Date you get back.
>>>>>>>>>>
>>>>>>>>>> We're definitely open to taking contributions for new built-in
>>>>>>>>>> functions. They're pretty easy to add. Just follow this guide:
>>>>>>>>>> http://phoenix-hbase.blogspot.com/2013/04/how-to-add-your-own-built-in-function.html
>>>>>>>>>>
>>>>>>>>>> Adding more date manipulation functions would be much appreciated.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> James
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Tue, Feb 25, 2014 at 11:16 AM, Sean Huo 
>>>>>>>>>> <[email protected]>wrote:
>>>>>>>>>>
>>>>>>>>>>> Well, to be frankly, the example on the to_char udf is wrong
>>>>>>>>>>>
>>>>>>>>>>> TO_CHAR(myDate, '2001-02-03 04:05:06')
>>>>>>>>>>> does not produce the right result and is misleading.
>>>>>>>>>>>
>>>>>>>>>>> This function does not give one the ability to format the date
>>>>>>>>>>> in a customized timezone.
>>>>>>>>>>> ALl it does is to allow timezone to be included in the output,
>>>>>>>>>>> but it is is always GMT.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Feb 25, 2014 at 11:04 AM, James Taylor <
>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> http://phoenix.incubator.apache.org/language/functions.html#/to_charwith
>>>>>>>>>>>>  a formatString argument.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Feb 25, 2014 at 10:58 AM, Sean Huo <
>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> It seems that to_char udf always produces timestamp/date
>>>>>>>>>>>>> string in GMT.
>>>>>>>>>>>>> Is there a function that allows users to pass in a timezone
>>>>>>>>>>>>> string so
>>>>>>>>>>>>> that timestamp can be displayed accordingly?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>
>>>>>>>>>>>>> Sean
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

Reply via email to