Hi Maciek, Jason

Sorry I could not find my old code but I came up with a little code as much
as I can remember.
you can try the following jar
https://github.com/nitinpawar/hive-udfs/tree/master/FromUnixtimeWithTZ/dist

and let me know if this works for you guys.
I can change it the way it needs to be

PS: I am not a java dev so forgive anything bad I have done in there

On Thu, Nov 6, 2014 at 3:44 PM, Maciek <mac...@sonra.io> wrote:

> @Jason:
> re. "Hive (…) just assumes things are in the system's local timezone",
> just to clarify - this is not true in case of conversions (from_unixtime())
> as it respects the local system TZ settings hence the problem.
> TZ itself is a very hairy subject and would definitely be a big
> undertaking. Extending from_unixtime seems like easiest solution for now.
> Happy to do ER in JIRA but haven't done this for before...
>
> @Nitin
> Would be very grateful if you're able to dig it out! Thanks!
>
> Best Regards
>
>
> On Thu, Nov 6, 2014 at 7:48 AM, Jason Dere <jd...@hortonworks.com> wrote:
>
>> That would be great!
>>
>> On Nov 5, 2014, at 10:49 PM, Nitin Pawar <nitinpawar...@gmail.com> wrote:
>>
>> May be a JIRA ?
>>
>> I remember having my own UDF for doing this. If possible I will share the
>> code
>>
>> On Thu, Nov 6, 2014 at 6:22 AM, Jason Dere <jd...@hortonworks.com> wrote:
>>
>>> Hive should probably at least provide a timezone option to
>>> from_unixtime().
>>> As you mentioned, Hive doesn't really do any timezone handling, just
>>> assumes things are in the system's local timezone. It will be a bit of a
>>> bigger project to add better time zone handling to Hive timestamps.
>>>
>>>
>>> On Nov 5, 2014, at 7:18 AM, Maciek <mac...@sonra.io> wrote:
>>>
>>> I see… and confirm, it's consistent with Linux/Unix output I get:
>>> date -r 0
>>> Thu  1 Jan 1970 01:00:00 IST
>>>
>>> date
>>> Wed  5 Nov 2014 14:49:52 GMT
>>> Got some digging and it actually makes sense. Turns out Ireland didn't
>>> observe daylight saving time in years 1968-1972 as set permanently to
>>> GMT+1=IST.
>>>
>>> Anyway, back to Hive
>>> I'm trying to convert unix_times to UTC (using from_unixtime UDF )but
>>> due to the issue it I'm getting different results on different servers (TZ
>>> settings)
>>> Is there any way influence that behaviour without changing timezone on
>>> the server?
>>>
>>> Oracle for that instance offers a good few options to facilitate
>>> timezone conversion, among the others:
>>> 'AT TIME ZONE [GMT]' clause
>>> ALTER SESSION SET TIME_ZONE [= 'GMT']
>>> or
>>> to_timestamp_tz() function
>>>
>>> Currently it seems, the only way to perform this conversion is to detect
>>> server settings first (won't work at all for some cases like though JDBC
>>> connection I think) and apply the shift during the process.
>>>
>>> Would be really nice if Hive offers some elegant way to support this.
>>> I'm thinking of similar ALTER SESSION statement equivalent, maybe
>>> parameter SET in hive or extra parameter for the from_unixtime() Hive
>>> function?
>>>
>>> On Mon, Nov 3, 2014 at 10:33 PM, Jason Dere <jd...@hortonworks.com>
>>> wrote:
>>>
>>>>
>>>> As Nitin mentions, the behavior is "to a string representing the
>>>> timestamp of that moment in the current system time zone".  What are the
>>>> timezone settings on your machine?
>>>>
>>>> $ TZ="GMT" date -r 0
>>>> Thu Jan  1 00:00:00 GMT 1970
>>>>
>>>> $ TZ="UTC" date -r 0
>>>> Thu Jan  1 00:00:00 UTC 1970
>>>>
>>>> $ TZ="Europe/London" date -r 0
>>>> Thu Jan  1 01:00:00 BST 1970
>>>>
>>>> $ TZ="Europe/Dublin" date -r 0
>>>> Thu Jan  1 01:00:00 IST 1970
>>>>
>>>> On Nov 3, 2014, at 12:50 PM, Maciek <mac...@sonra.io> wrote:
>>>>
>>>> I'd consider this behaviour as a bug and would like to raise it as such.
>>>> Is there anyone to confirm it's the same on Hive 0.14?
>>>>
>>>> On Fri, Oct 31, 2014 at 3:41 PM, Maciek <mac...@sonra.io> wrote:
>>>>
>>>>> Actually confirmed! It's down to the timezone settings
>>>>> I've moved temporarily server/client settings to 'Atlantic/Reykjavik'
>>>>> (no change in time comparing to what I was on (GMT), but it's permanent 
>>>>> UTC
>>>>> and as such doesn't observe daylight saving.
>>>>> I believe this shouldn't matter (see my points from previous mail) but
>>>>> apparently there's an issue with it.
>>>>> Not sure how to deal with this situation (can't just change TZ
>>>>> settings everywhere because of Hive) and don't want to hardcode anything.
>>>>> I'm on Hive 0.13.
>>>>> Does Hive 0.14 provide better support for TimeZones?
>>>>>
>>>>>
>>>>> On Fri, Oct 31, 2014 at 3:25 PM, Maciek <mac...@sonra.io> wrote:
>>>>>
>>>>>> Thought about that myself based on my prior (bad) experience when
>>>>>> tried to working with timezones in Hive (functionality pretty much 
>>>>>> doesn't
>>>>>> exists)
>>>>>> That shouldn't be the case here though, here's why:
>>>>>> in Oracle [timestamp with timezone] can be adjusted when
>>>>>> sent/displayed on the client based on client's settings. This may be also
>>>>>> relevant if the timestamp in question would fall onto client's daily 
>>>>>> saving
>>>>>> time period. This behaviour would make sense to me, however:
>>>>>>
>>>>>> • this is server, not client settings we're talking about here
>>>>>> • the server and client do reside in the same timezone anyway, which
>>>>>> is currently GMT [UTC]
>>>>>>
>>>>>> • while we observe the daily saving here [Dublin] the time in
>>>>>> question ("1970-01-01 00:00:00") is not in that period, neither the time
>>>>>> I'm sending the query (now).
>>>>>>
>>>>>>
>>>>>>
>>>>>> Based on all above, I don't see the reason the time gets shifted by
>>>>>> one hour, but I realise the issue might be down to the general problems 
>>>>>> in
>>>>>> Hive' implementation of timezones…
>>>>>>
>>>>>> On Fri, Oct 31, 2014 at 12:26 PM, Nitin Pawar <
>>>>>> nitinpawar...@gmail.com> wrote:
>>>>>>
>>>>>>> In hive from_unixtime is returned from the timezone which you belong
>>>>>>> to
>>>>>>> "From document : from_unixtime(bigint unixtime[, string format]) :
>>>>>>> Converts the number of seconds from unix epoch (1970-01-01 00:00:00 
>>>>>>> UTC) to
>>>>>>> a string representing the timestamp of that moment in the current system
>>>>>>> time zone in the format of "1970-01-01 00:00:00".
>>>>>>>
>>>>>>> if possible can you also check by changing the timezone to UTC on
>>>>>>> your machine?
>>>>>>>
>>>>>>>>
>>>>>>>> On Fri, Oct 31, 2014 at 12:00 PM, Maciek <mac...@sonra.io> wrote:
>>>>>>>>
>>>>>>>>> Any reason why
>>>>>>>>>
>>>>>>>>> select from_unixtime(0) t0 FROM …
>>>>>>>>> gives
>>>>>>>>>
>>>>>>>>> 1970-01-01 01:00:00
>>>>>>>>> ?
>>>>>>>>>
>>>>>>>>> By all available definitions (epoch, from_unixtime etc..) I would
>>>>>>>>> expect it to be 1970-01-01 00:00:00…?
>>>>>>>>>
>>>>>>>>


-- 
Nitin Pawar

Reply via email to