[ https://issues.apache.org/jira/browse/OOZIE-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Robert Kanter updated OOZIE-2155: --------------------------------- Attachment: OOZIE-2155.patch The patch add a note to the Database Configuration docs. > Incorrect DST Shifts are occurring based on the Database timezone > ----------------------------------------------------------------- > > Key: OOZIE-2155 > URL: https://issues.apache.org/jira/browse/OOZIE-2155 > Project: Oozie > Issue Type: Bug > Components: core > Affects Versions: trunk, 4.1.0 > Reporter: Robert Kanter > Assignee: Robert Kanter > Priority: Blocker > Attachments: OOZIE-2155.patch, oracle-fix-not-working.patch > > > We discovered a critical bug where incorrect Daylight Saving Time shifts were > occurring based on the Database timezone. > Try running this Coordinator: > {code:xml} > <coordinator-app name=“foo” frequency="${coord:hours(1)}" > start="2014-11-02T04:15Z" end="2014-11-02T16:00Z" timezone=“UTC” > xmlns="uri:oozie:coordinator:0.1"> > <controls> > <concurrency>10</concurrency> > </controls> > <action> > <workflow> > <app-path>${appPath}</app-path> > <configuration> > <property> > <name>jobTracker</name> > <value>${jobTracker}</value> > </property> > <property> > <name>nameNode</name> > <value>${nameNode}</value> > </property> > <property> > <name>queueName</name> > <value>${queueName}</value> > </property> > </configuration> > </workflow> > </action> > </coordinator-app> > {code} > Note that it runs over a DST shift (at least in most US timezones). > Here's a sample of some of the actions, along with their Nominal Times: > {noformat} > 0000013-150212140838307-oozie-rkan-C@4 2014-11-02 07:15 GMT > 0000013-150212140838307-oozie-rkan-C@6 2014-11-02 09:15 GMT > 0000013-150212140838307-oozie-rkan-C@5 2014-11-02 09:15 GMT > 0000013-150212140838307-oozie-rkan-C@7 2014-11-02 10:15 GMT > {noformat} > Note that actions 5 and 6 have the same time. This is incorrect because the > times are in GMT (with no DST). I've also confirmed via a debugger that > these dates have the same number of seconds since epoch (so it's not just a > rendering issue). > By the way, if you're in a different timezone, you'll see this problem occur > on different actions because the issue is related to the timezone that your > database is in. It depends on when the DST shift occurs in your timezone; > for example, in "America/New_York", it happens with actions 2 and 3. > On that note, if I ask Oozie to use "America/Los_Angeles" to print the dates, > I get this: > {noformat} > 0000013-150212140838307-oozie-rkan-C@4 2014-11-02 00:15 PDT > 0000013-150212140838307-oozie-rkan-C@6 2014-11-02 01:15 PST > 0000013-150212140838307-oozie-rkan-C@5 2014-11-02 01:15 PST > 0000013-150212140838307-oozie-rkan-C@7 2014-11-02 02:15 PST > {noformat} > Action 5's nominal time should be {{2014-11-02 01:15 PDT}}, not {{2014-11-02 > 01:15 PST}}. > Using the debugger some more, I verified that Oozie is creating the nominal > times correctly, and writing them to the database correctly (at least, it's > converting them to Java's SQL TimeStamp objects correctly; OpenJPA handles > writing them). But when the problematic value is read back from the > database, it has the wrong value! > Here's something interesting from the [MySQL > documentation|http://dev.mysql.com/doc/refman/5.5/en/datetime.html]: > {quote}MySQL converts TIMESTAMP values from the current time zone to UTC for > storage, and back from UTC to the current time zone for retrieval. (This does > not occur for other types such as DATETIME.) By default, the current time > zone for each connection is the server's time. The time zone can be set on a > per-connection basis. As long as the time zone setting remains constant, you > get back the same value you store. If you store a TIMESTAMP value, and then > change the time zone and retrieve the value, the retrieved value is different > from the value you stored. This occurs because the same time zone was not > used for conversion in both directions. > {quote} > So, I think what's happening is that it's interpreting everything in PST, and > not in PST and PDT depending on the time itself. Ideally, it would just > store the time since epoch, like Java's Date does, but it's doing this broken > interpretation instead. In fact, this [Stack > Overflow|http://stackoverflow.com/questions/1646171/mysql-datetime-fields-and-daylight-savings-time-how-do-i-reference-the-extra/1650910#1650910] > I found talks about this problem. > I tried to create a unit test to show the problem without all the action > stuff, but it seems like HSQLDB doesn't have this problem, so the test > passes. I've confirmed that this affects Derby and MySQL; I'm not sure about > Postgres, Oracle, or SQLServer. > I was able to find a workaround for Derby and one of my colleagues found one > for MySQL: > - For Derby, you just need to change the JVM's timezone to GMT (see > [here|http://objectmix.com/apache/647950-moving-derby-database-across-timezones.html]). > All you have to do is add {{-Duser.timezone=GMT}} to {{CATALINA_OPTS}}. > - For MySQL, you can either change the global timezone to GMT (which the DB > admin probably won't go for), or you can add > {{useLegacyDatetimeCode=false&serverTimezone=GMT}} to JDBC URL. For example: > {code:xml} > <property> > <name>oozie.service.JPAService.jdbc.url</name> > > <value>jdbc:mysql://HOST/oozie?useUnicode=true&characterEncoding=UTF-8&useLegacyDatetimeCode=false&serverTimezone=GMT</value> > </property> > {code} > I imagine the Derby workaround won't work for anything else, but perhaps the > MySQL workaround would? > The easiest way to fix this is to add the {{user.timezone}} thing (to fix > Derby), and have Oozie always add the JDBC configs to the JDBC URL (for the > others). If Oracle, Postgres, and SQLServer don't support that, we'll have > to come up with other ideas. The other ideas I had were to replace TIMESTAMP > with DATETIME or the SQL equivalent of {{long}}; though these have the > downside of being schema changes and possibly other side effects. > I have some homework for everyone :) > Please check if the DB type listed below is affected by this, and if so, does > the above workaround (or some similar configs) fix it? > - [~bowenzhangusa]: SQLServer > - [~chitnis], [~puru], [~rohini], or [~egashira]: Oracle > - Me: Postgres -- This message was sent by Atlassian JIRA (v6.3.4#6332)