[
https://issues.apache.org/jira/browse/OOZIE-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Robert Kanter updated OOZIE-2155:
---------------------------------
Attachment: OOZIE-2155.patch
The patch add a note to the Database Configuration docs.
> Incorrect DST Shifts are occurring based on the Database timezone
> -----------------------------------------------------------------
>
> Key: OOZIE-2155
> URL: https://issues.apache.org/jira/browse/OOZIE-2155
> Project: Oozie
> Issue Type: Bug
> Components: core
> Affects Versions: trunk, 4.1.0
> Reporter: Robert Kanter
> Assignee: Robert Kanter
> Priority: Blocker
> Attachments: OOZIE-2155.patch, oracle-fix-not-working.patch
>
>
> We discovered a critical bug where incorrect Daylight Saving Time shifts were
> occurring based on the Database timezone.
> Try running this Coordinator:
> {code:xml}
> <coordinator-app name=“foo” frequency="${coord:hours(1)}"
> start="2014-11-02T04:15Z" end="2014-11-02T16:00Z" timezone=“UTC”
> xmlns="uri:oozie:coordinator:0.1">
> <controls>
> <concurrency>10</concurrency>
> </controls>
> <action>
> <workflow>
> <app-path>${appPath}</app-path>
> <configuration>
> <property>
> <name>jobTracker</name>
> <value>${jobTracker}</value>
> </property>
> <property>
> <name>nameNode</name>
> <value>${nameNode}</value>
> </property>
> <property>
> <name>queueName</name>
> <value>${queueName}</value>
> </property>
> </configuration>
> </workflow>
> </action>
> </coordinator-app>
> {code}
> Note that it runs over a DST shift (at least in most US timezones).
> Here's a sample of some of the actions, along with their Nominal Times:
> {noformat}
> 0000013-150212140838307-oozie-rkan-C@4 2014-11-02 07:15 GMT
> 0000013-150212140838307-oozie-rkan-C@6 2014-11-02 09:15 GMT
> 0000013-150212140838307-oozie-rkan-C@5 2014-11-02 09:15 GMT
> 0000013-150212140838307-oozie-rkan-C@7 2014-11-02 10:15 GMT
> {noformat}
> Note that actions 5 and 6 have the same time. This is incorrect because the
> times are in GMT (with no DST). I've also confirmed via a debugger that
> these dates have the same number of seconds since epoch (so it's not just a
> rendering issue).
> By the way, if you're in a different timezone, you'll see this problem occur
> on different actions because the issue is related to the timezone that your
> database is in. It depends on when the DST shift occurs in your timezone;
> for example, in "America/New_York", it happens with actions 2 and 3.
> On that note, if I ask Oozie to use "America/Los_Angeles" to print the dates,
> I get this:
> {noformat}
> 0000013-150212140838307-oozie-rkan-C@4 2014-11-02 00:15 PDT
> 0000013-150212140838307-oozie-rkan-C@6 2014-11-02 01:15 PST
> 0000013-150212140838307-oozie-rkan-C@5 2014-11-02 01:15 PST
> 0000013-150212140838307-oozie-rkan-C@7 2014-11-02 02:15 PST
> {noformat}
> Action 5's nominal time should be {{2014-11-02 01:15 PDT}}, not {{2014-11-02
> 01:15 PST}}.
> Using the debugger some more, I verified that Oozie is creating the nominal
> times correctly, and writing them to the database correctly (at least, it's
> converting them to Java's SQL TimeStamp objects correctly; OpenJPA handles
> writing them). But when the problematic value is read back from the
> database, it has the wrong value!
> Here's something interesting from the [MySQL
> documentation|http://dev.mysql.com/doc/refman/5.5/en/datetime.html]:
> {quote}MySQL converts TIMESTAMP values from the current time zone to UTC for
> storage, and back from UTC to the current time zone for retrieval. (This does
> not occur for other types such as DATETIME.) By default, the current time
> zone for each connection is the server's time. The time zone can be set on a
> per-connection basis. As long as the time zone setting remains constant, you
> get back the same value you store. If you store a TIMESTAMP value, and then
> change the time zone and retrieve the value, the retrieved value is different
> from the value you stored. This occurs because the same time zone was not
> used for conversion in both directions.
> {quote}
> So, I think what's happening is that it's interpreting everything in PST, and
> not in PST and PDT depending on the time itself. Ideally, it would just
> store the time since epoch, like Java's Date does, but it's doing this broken
> interpretation instead. In fact, this [Stack
> Overflow|http://stackoverflow.com/questions/1646171/mysql-datetime-fields-and-daylight-savings-time-how-do-i-reference-the-extra/1650910#1650910]
> I found talks about this problem.
> I tried to create a unit test to show the problem without all the action
> stuff, but it seems like HSQLDB doesn't have this problem, so the test
> passes. I've confirmed that this affects Derby and MySQL; I'm not sure about
> Postgres, Oracle, or SQLServer.
> I was able to find a workaround for Derby and one of my colleagues found one
> for MySQL:
> - For Derby, you just need to change the JVM's timezone to GMT (see
> [here|http://objectmix.com/apache/647950-moving-derby-database-across-timezones.html]).
> All you have to do is add {{-Duser.timezone=GMT}} to {{CATALINA_OPTS}}.
> - For MySQL, you can either change the global timezone to GMT (which the DB
> admin probably won't go for), or you can add
> {{useLegacyDatetimeCode=false&serverTimezone=GMT}} to JDBC URL. For example:
> {code:xml}
> <property>
> <name>oozie.service.JPAService.jdbc.url</name>
>
> <value>jdbc:mysql://HOST/oozie?useUnicode=true&characterEncoding=UTF-8&useLegacyDatetimeCode=false&serverTimezone=GMT</value>
> </property>
> {code}
> I imagine the Derby workaround won't work for anything else, but perhaps the
> MySQL workaround would?
> The easiest way to fix this is to add the {{user.timezone}} thing (to fix
> Derby), and have Oozie always add the JDBC configs to the JDBC URL (for the
> others). If Oracle, Postgres, and SQLServer don't support that, we'll have
> to come up with other ideas. The other ideas I had were to replace TIMESTAMP
> with DATETIME or the SQL equivalent of {{long}}; though these have the
> downside of being schema changes and possibly other side effects.
> I have some homework for everyone :)
> Please check if the DB type listed below is affected by this, and if so, does
> the above workaround (or some similar configs) fix it?
> - [~bowenzhangusa]: SQLServer
> - [~chitnis], [~puru], [~rohini], or [~egashira]: Oracle
> - Me: Postgres
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)