[ https://issues.apache.org/jira/browse/DRILL-8100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17481095#comment-17481095 ]
James Turton commented on DRILL-8100:
-------------------------------------

Hi [~Paul.Rogers], I was just looking at some JSON timestamps written by Drill in connection with this issue. They were zoneless so I proceeded to try to reproduce the abovementioned CTAS round trip problem in a sqlline session on my UTC+2 laptop, but could not. Was I missing something in the below?
{code:java}
Apache Drill 1.20.0-SNAPSHOT
"Everything is easier with Drill."
apache drill> alter session set `store.format` = 'json';
ok true
summary store.format updated.
1 row selected (0.476 seconds)
apache drill> create table dfs.tmp.foo as select now();
Fragment 0_0
Number of records written 1
1 row selected (2.189 seconds)
apache drill> select * from dfs.tmp.foo;
EXPR$0 2022-01-24 15:02:22.319
1 row selected (0.562 seconds)
apache drill> create table dfs.tmp.foo2 as select * from dfs.tmp.foo;
Fragment 0_0
Number of records written 1
1 row selected (0.33 seconds)
apache drill> create table dfs.tmp.foo3 as select * from dfs.tmp.foo2;
Fragment 0_0
Number of records written 1
1 row selected (0.291 seconds)
apache drill> select * from dfs.tmp.foo3;
EXPR$0 2022-01-24 15:02:22.319
1 row selected (0.228 seconds)
{code}

> JSON record writer does not convert Drill local timestamp to UTC
> ----------------------------------------------------------------
>
>                 Key: DRILL-8100
>                 URL: https://issues.apache.org/jira/browse/DRILL-8100
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.19.0
>            Reporter: Paul Rogers
>            Assignee: Paul Rogers
>            Priority: Major
>
> Drill follows the old SQL engine convention to store the `TIMESTAMP` type in
> the local time zone. This is, of course, highly awkward in today's age when
> UTC is used as the standard timestamp in most products. However, it is how
> Drill works. (It would be great to add a `UTC_TIMESTAMP` type, but that is
> another topic.)
> Each reader or writer that works with files that hold UTC timestamps must
> convert to (reader) or from (writer) Drill's local-time timestamp. Otherwise,
> Drill works correctly only when the server time zone is set to UTC.
> The JSON writer does not do the proper conversion, causing tests to fail when
> run in a time zone other than UTC.
> {noformat}
>   @Override
>   public void writeTimestamp(FieldReader reader) throws IOException {
>     if (reader.isSet()) {
>       writeTimestamp(reader.readLocalDateTime());
>     } else {
>       writeTimeNull();
>     }
>   }
> {noformat}
> Basically, it takes a {{LocalDateTime}} and formats it as if it were in the
> UTC time zone (using the "Z" suffix). This is only valid if the machine is in
> the UTC time zone, which is why the test for this class attempts to force the
> local time zone to UTC, something that most users will not do.
> A consequence of this bug is that "round trip" CTAS will change dates by the
> UTC offset of the machine running the CTAS. In the Pacific time zone, each
> "round trip" subtracts 8 hours from the time. After three round trips, the
> "UTC" date in the Parquet file or JSON will be a day earlier than the
> original data. One might argue that this "feature" is not always helpful.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
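
For anyone trying to reproduce the drift described in the issue, here is a minimal standalone sketch of the mechanism using only {{java.time}}. It is not Drill code: the class {{TimestampRoundTrip}} and its methods are hypothetical names invented for illustration, and it assumes a reader that converts the "Z"-suffixed text back to local time, which is what the issue description implies. The sqlline session in the comment above did not show the drift, so the real reader may behave differently. A fixed UTC-8 offset stands in for the Pacific time zone.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical illustration only; none of these names exist in Drill.
public class TimestampRoundTrip {

  // Pattern described in the issue: the local wall-clock value is labeled
  // as UTC (a "Z" suffix is attached) without any conversion.
  static String writeMislabeled(LocalDateTime local) {
    return local.atOffset(ZoneOffset.UTC)
        .format(DateTimeFormatter.ISO_OFFSET_DATE_TIME);
  }

  // One possible correction: interpret the value in the server's zone,
  // then convert to a UTC instant before formatting.
  static String writeConverted(LocalDateTime local, ZoneId zone) {
    return local.atZone(zone).toInstant().toString();
  }

  // Assumed reader behavior: parse the UTC text and convert it back to
  // local time in the server's zone.
  static LocalDateTime readToLocal(String utcText, ZoneId zone) {
    return LocalDateTime.ofInstant(Instant.parse(utcText), zone);
  }

  public static void main(String[] args) {
    ZoneId zone = ZoneOffset.ofHours(-8); // stand-in for Pacific time
    LocalDateTime original = LocalDateTime.of(2022, 1, 24, 15, 2, 22, 319_000_000);

    // Three "round trips" with the mislabeling writer: each one shifts
    // the value by the UTC offset.
    LocalDateTime drifting = original;
    for (int i = 0; i < 3; i++) {
      drifting = readToLocal(writeMislabeled(drifting), zone);
    }
    System.out.println("mislabeled writer: " + original + " -> " + drifting);

    // Same three round trips with the converting writer: value is stable.
    LocalDateTime stable = original;
    for (int i = 0; i < 3; i++) {
      stable = readToLocal(writeConverted(stable, zone), zone);
    }
    System.out.println("converting writer: " + original + " -> " + stable);
  }
}
```

Under these assumptions the mislabeling writer loses 8 hours per round trip, so three round trips land exactly one day earlier, while the converting writer returns the original value. If the actual JSON reader does not convert on the way in, the two errors would cancel out, which might explain why the session above round-trips cleanly.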