@Jason: re. "Hive (…) just assumes things are in the system's local timezone", just to clarify - this is not true in case of conversions (from_unixtime()) as it respects the local system TZ settings hence the problem. TZ itself is a very hairy subject and would definitely be a big undertaking. Extending from_unixtime seems like easiest solution for now. Happy to do ER in JIRA but haven't done this for before...
@Nitin Would be very grateful if you're able to dig it out! Thanks! Best Regards On Thu, Nov 6, 2014 at 7:48 AM, Jason Dere <jd...@hortonworks.com> wrote: > That would be great! > > On Nov 5, 2014, at 10:49 PM, Nitin Pawar <nitinpawar...@gmail.com> wrote: > > May be a JIRA ? > > I remember having my own UDF for doing this. If possible I will share the > code > > On Thu, Nov 6, 2014 at 6:22 AM, Jason Dere <jd...@hortonworks.com> wrote: > >> Hive should probably at least provide a timezone option to >> from_unixtime(). >> As you mentioned, Hive doesn't really do any timezone handling, just >> assumes things are in the system's local timezone. It will be a bit of a >> bigger project to add better time zone handling to Hive timestamps. >> >> >> On Nov 5, 2014, at 7:18 AM, Maciek <mac...@sonra.io> wrote: >> >> I see… and confirm, it's consistent with Linux/Unix output I get: >> date -r 0 >> Thu 1 Jan 1970 01:00:00 IST >> >> date >> Wed 5 Nov 2014 14:49:52 GMT >> Got some digging and it actually makes sense. Turns out Ireland didn't >> observe daylight saving time in years 1968-1972 as set permanently to >> GMT+1=IST. >> >> Anyway, back to Hive >> I'm trying to convert unix_times to UTC (using from_unixtime UDF )but >> due to the issue it I'm getting different results on different servers (TZ >> settings) >> Is there any way influence that behaviour without changing timezone on >> the server? >> >> Oracle for that instance offers a good few options to facilitate timezone >> conversion, among the others: >> 'AT TIME ZONE [GMT]' clause >> ALTER SESSION SET TIME_ZONE [= 'GMT'] >> or >> to_timestamp_tz() function >> >> Currently it seems, the only way to perform this conversion is to detect >> server settings first (won't work at all for some cases like though JDBC >> connection I think) and apply the shift during the process. >> >> Would be really nice if Hive offers some elegant way to support this. >> I'm thinking of similar ALTER SESSION statement equivalent, maybe >> parameter SET in hive or extra parameter for the from_unixtime() Hive >> function? >> >> On Mon, Nov 3, 2014 at 10:33 PM, Jason Dere <jd...@hortonworks.com> >> wrote: >> >>> >>> As Nitin mentions, the behavior is "to a string representing the >>> timestamp of that moment in the current system time zone". What are the >>> timezone settings on your machine? >>> >>> $ TZ="GMT" date -r 0 >>> Thu Jan 1 00:00:00 GMT 1970 >>> >>> $ TZ="UTC" date -r 0 >>> Thu Jan 1 00:00:00 UTC 1970 >>> >>> $ TZ="Europe/London" date -r 0 >>> Thu Jan 1 01:00:00 BST 1970 >>> >>> $ TZ="Europe/Dublin" date -r 0 >>> Thu Jan 1 01:00:00 IST 1970 >>> >>> On Nov 3, 2014, at 12:50 PM, Maciek <mac...@sonra.io> wrote: >>> >>> I'd consider this behaviour as a bug and would like to raise it as such. >>> Is there anyone to confirm it's the same on Hive 0.14? >>> >>> On Fri, Oct 31, 2014 at 3:41 PM, Maciek <mac...@sonra.io> wrote: >>> >>>> Actually confirmed! It's down to the timezone settings >>>> I've moved temporarily server/client settings to 'Atlantic/Reykjavik' >>>> (no change in time comparing to what I was on (GMT), but it's permanent UTC >>>> and as such doesn't observe daylight saving. >>>> I believe this shouldn't matter (see my points from previous mail) but >>>> apparently there's an issue with it. >>>> Not sure how to deal with this situation (can't just change TZ settings >>>> everywhere because of Hive) and don't want to hardcode anything. >>>> I'm on Hive 0.13. >>>> Does Hive 0.14 provide better support for TimeZones? >>>> >>>> >>>> On Fri, Oct 31, 2014 at 3:25 PM, Maciek <mac...@sonra.io> wrote: >>>> >>>>> Thought about that myself based on my prior (bad) experience when >>>>> tried to working with timezones in Hive (functionality pretty much doesn't >>>>> exists) >>>>> That shouldn't be the case here though, here's why: >>>>> in Oracle [timestamp with timezone] can be adjusted when >>>>> sent/displayed on the client based on client's settings. This may be also >>>>> relevant if the timestamp in question would fall onto client's daily >>>>> saving >>>>> time period. This behaviour would make sense to me, however: >>>>> >>>>> • this is server, not client settings we're talking about here >>>>> • the server and client do reside in the same timezone anyway, which >>>>> is currently GMT [UTC] >>>>> >>>>> • while we observe the daily saving here [Dublin] the time in question >>>>> ("1970-01-01 00:00:00") is not in that period, neither the time I'm >>>>> sending >>>>> the query (now). >>>>> >>>>> >>>>> >>>>> Based on all above, I don't see the reason the time gets shifted by >>>>> one hour, but I realise the issue might be down to the general problems in >>>>> Hive' implementation of timezones… >>>>> >>>>> On Fri, Oct 31, 2014 at 12:26 PM, Nitin Pawar <nitinpawar...@gmail.com >>>>> > wrote: >>>>> >>>>>> In hive from_unixtime is returned from the timezone which you belong >>>>>> to >>>>>> "From document : from_unixtime(bigint unixtime[, string format]) : >>>>>> Converts the number of seconds from unix epoch (1970-01-01 00:00:00 UTC) >>>>>> to >>>>>> a string representing the timestamp of that moment in the current system >>>>>> time zone in the format of "1970-01-01 00:00:00". >>>>>> >>>>>> if possible can you also check by changing the timezone to UTC on >>>>>> your machine? >>>>>> >>>>>>> >>>>>>> On Fri, Oct 31, 2014 at 12:00 PM, Maciek <mac...@sonra.io> wrote: >>>>>>> >>>>>>>> Any reason why >>>>>>>> >>>>>>>> select from_unixtime(0) t0 FROM … >>>>>>>> gives >>>>>>>> >>>>>>>> 1970-01-01 01:00:00 >>>>>>>> ? >>>>>>>> >>>>>>>> By all available definitions (epoch, from_unixtime etc..) I would >>>>>>>> expect it to be 1970-01-01 00:00:00…? >>>>>>>> >>>>>>>