Hi Maciek, Jason Sorry I could not find my old code but I came up with a little code as much as I can remember. you can try the following jar https://github.com/nitinpawar/hive-udfs/tree/master/FromUnixtimeWithTZ/dist
and let me know if this works for you guys. I can change it the way it needs to be PS: I am not a java dev so forgive anything bad I have done in there On Thu, Nov 6, 2014 at 3:44 PM, Maciek <mac...@sonra.io> wrote: > @Jason: > re. "Hive (…) just assumes things are in the system's local timezone", > just to clarify - this is not true in case of conversions (from_unixtime()) > as it respects the local system TZ settings hence the problem. > TZ itself is a very hairy subject and would definitely be a big > undertaking. Extending from_unixtime seems like easiest solution for now. > Happy to do ER in JIRA but haven't done this for before... > > @Nitin > Would be very grateful if you're able to dig it out! Thanks! > > Best Regards > > > On Thu, Nov 6, 2014 at 7:48 AM, Jason Dere <jd...@hortonworks.com> wrote: > >> That would be great! >> >> On Nov 5, 2014, at 10:49 PM, Nitin Pawar <nitinpawar...@gmail.com> wrote: >> >> May be a JIRA ? >> >> I remember having my own UDF for doing this. If possible I will share the >> code >> >> On Thu, Nov 6, 2014 at 6:22 AM, Jason Dere <jd...@hortonworks.com> wrote: >> >>> Hive should probably at least provide a timezone option to >>> from_unixtime(). >>> As you mentioned, Hive doesn't really do any timezone handling, just >>> assumes things are in the system's local timezone. It will be a bit of a >>> bigger project to add better time zone handling to Hive timestamps. >>> >>> >>> On Nov 5, 2014, at 7:18 AM, Maciek <mac...@sonra.io> wrote: >>> >>> I see… and confirm, it's consistent with Linux/Unix output I get: >>> date -r 0 >>> Thu 1 Jan 1970 01:00:00 IST >>> >>> date >>> Wed 5 Nov 2014 14:49:52 GMT >>> Got some digging and it actually makes sense. Turns out Ireland didn't >>> observe daylight saving time in years 1968-1972 as set permanently to >>> GMT+1=IST. >>> >>> Anyway, back to Hive >>> I'm trying to convert unix_times to UTC (using from_unixtime UDF )but >>> due to the issue it I'm getting different results on different servers (TZ >>> settings) >>> Is there any way influence that behaviour without changing timezone on >>> the server? >>> >>> Oracle for that instance offers a good few options to facilitate >>> timezone conversion, among the others: >>> 'AT TIME ZONE [GMT]' clause >>> ALTER SESSION SET TIME_ZONE [= 'GMT'] >>> or >>> to_timestamp_tz() function >>> >>> Currently it seems, the only way to perform this conversion is to detect >>> server settings first (won't work at all for some cases like though JDBC >>> connection I think) and apply the shift during the process. >>> >>> Would be really nice if Hive offers some elegant way to support this. >>> I'm thinking of similar ALTER SESSION statement equivalent, maybe >>> parameter SET in hive or extra parameter for the from_unixtime() Hive >>> function? >>> >>> On Mon, Nov 3, 2014 at 10:33 PM, Jason Dere <jd...@hortonworks.com> >>> wrote: >>> >>>> >>>> As Nitin mentions, the behavior is "to a string representing the >>>> timestamp of that moment in the current system time zone". What are the >>>> timezone settings on your machine? >>>> >>>> $ TZ="GMT" date -r 0 >>>> Thu Jan 1 00:00:00 GMT 1970 >>>> >>>> $ TZ="UTC" date -r 0 >>>> Thu Jan 1 00:00:00 UTC 1970 >>>> >>>> $ TZ="Europe/London" date -r 0 >>>> Thu Jan 1 01:00:00 BST 1970 >>>> >>>> $ TZ="Europe/Dublin" date -r 0 >>>> Thu Jan 1 01:00:00 IST 1970 >>>> >>>> On Nov 3, 2014, at 12:50 PM, Maciek <mac...@sonra.io> wrote: >>>> >>>> I'd consider this behaviour as a bug and would like to raise it as such. >>>> Is there anyone to confirm it's the same on Hive 0.14? >>>> >>>> On Fri, Oct 31, 2014 at 3:41 PM, Maciek <mac...@sonra.io> wrote: >>>> >>>>> Actually confirmed! It's down to the timezone settings >>>>> I've moved temporarily server/client settings to 'Atlantic/Reykjavik' >>>>> (no change in time comparing to what I was on (GMT), but it's permanent >>>>> UTC >>>>> and as such doesn't observe daylight saving. >>>>> I believe this shouldn't matter (see my points from previous mail) but >>>>> apparently there's an issue with it. >>>>> Not sure how to deal with this situation (can't just change TZ >>>>> settings everywhere because of Hive) and don't want to hardcode anything. >>>>> I'm on Hive 0.13. >>>>> Does Hive 0.14 provide better support for TimeZones? >>>>> >>>>> >>>>> On Fri, Oct 31, 2014 at 3:25 PM, Maciek <mac...@sonra.io> wrote: >>>>> >>>>>> Thought about that myself based on my prior (bad) experience when >>>>>> tried to working with timezones in Hive (functionality pretty much >>>>>> doesn't >>>>>> exists) >>>>>> That shouldn't be the case here though, here's why: >>>>>> in Oracle [timestamp with timezone] can be adjusted when >>>>>> sent/displayed on the client based on client's settings. This may be also >>>>>> relevant if the timestamp in question would fall onto client's daily >>>>>> saving >>>>>> time period. This behaviour would make sense to me, however: >>>>>> >>>>>> • this is server, not client settings we're talking about here >>>>>> • the server and client do reside in the same timezone anyway, which >>>>>> is currently GMT [UTC] >>>>>> >>>>>> • while we observe the daily saving here [Dublin] the time in >>>>>> question ("1970-01-01 00:00:00") is not in that period, neither the time >>>>>> I'm sending the query (now). >>>>>> >>>>>> >>>>>> >>>>>> Based on all above, I don't see the reason the time gets shifted by >>>>>> one hour, but I realise the issue might be down to the general problems >>>>>> in >>>>>> Hive' implementation of timezones… >>>>>> >>>>>> On Fri, Oct 31, 2014 at 12:26 PM, Nitin Pawar < >>>>>> nitinpawar...@gmail.com> wrote: >>>>>> >>>>>>> In hive from_unixtime is returned from the timezone which you belong >>>>>>> to >>>>>>> "From document : from_unixtime(bigint unixtime[, string format]) : >>>>>>> Converts the number of seconds from unix epoch (1970-01-01 00:00:00 >>>>>>> UTC) to >>>>>>> a string representing the timestamp of that moment in the current system >>>>>>> time zone in the format of "1970-01-01 00:00:00". >>>>>>> >>>>>>> if possible can you also check by changing the timezone to UTC on >>>>>>> your machine? >>>>>>> >>>>>>>> >>>>>>>> On Fri, Oct 31, 2014 at 12:00 PM, Maciek <mac...@sonra.io> wrote: >>>>>>>> >>>>>>>>> Any reason why >>>>>>>>> >>>>>>>>> select from_unixtime(0) t0 FROM … >>>>>>>>> gives >>>>>>>>> >>>>>>>>> 1970-01-01 01:00:00 >>>>>>>>> ? >>>>>>>>> >>>>>>>>> By all available definitions (epoch, from_unixtime etc..) I would >>>>>>>>> expect it to be 1970-01-01 00:00:00…? >>>>>>>>> >>>>>>>> -- Nitin Pawar