[ 
https://issues.apache.org/jira/browse/DRILL-8100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17481095#comment-17481095
 ] 

James Turton commented on DRILL-8100:
-------------------------------------

Hi [~Paul.Rogers], I was just looking at some JSON timestamps written by Drill 
in connection with this issue.  They carried no time zone, so I tried to 
reproduce the CTAS round-trip problem described above in a sqlline session on 
my UTC+2 laptop, but could not.  Am I missing something in the session below?

{code:java}
Apache Drill 1.20.0-SNAPSHOT
"Everything is easier with Drill."
apache drill> alter session set `store.format` = 'json';
ok       true
summary  store.format updated.
1 row selected (0.476 seconds)
apache drill> create table dfs.tmp.foo as select now();
Fragment                   0_0
Number of records written  1
1 row selected (2.189 seconds)
apache drill> select * from dfs.tmp.foo;
EXPR$0  2022-01-24 15:02:22.319
1 row selected (0.562 seconds)
apache drill> create table dfs.tmp.foo2 as select * from dfs.tmp.foo;
Fragment                   0_0
Number of records written  1
1 row selected (0.33 seconds)
apache drill> create table dfs.tmp.foo3 as select * from dfs.tmp.foo2;
Fragment                   0_0
Number of records written  1
1 row selected (0.291 seconds)
apache drill> select * from dfs.tmp.foo3;
EXPR$0  2022-01-24 15:02:22.319
1 row selected (0.228 seconds)
{code}
 

> JSON record writer does not convert Drill local timestamp to UTC
> ----------------------------------------------------------------
>
>                 Key: DRILL-8100
>                 URL: https://issues.apache.org/jira/browse/DRILL-8100
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.19.0
>            Reporter: Paul Rogers
>            Assignee: Paul Rogers
>            Priority: Major
>
> Drill follows the old SQL engine convention to store the `TIMESTAMP` type in 
> the local time zone. This is, of course, highly awkward in today's age when 
> UTC is used as the standard timestamp in most products. However, it is how 
> Drill works. (It would be great to add a `UTC_TIMESTAMP` type, but that is 
> another topic.)
> Each reader or writer that works with files that hold UTC timestamps must 
> convert to (reader) or from (writer) Drill's local-time timestamp. Otherwise, 
> Drill works correctly only when the server time zone is set to UTC.
> The JSON writer does not do the proper conversion, causing tests to fail when 
> run in a time zone other than UTC.
> {noformat}
>   @Override
>   public void writeTimestamp(FieldReader reader) throws IOException {
>     if (reader.isSet()) {
>       writeTimestamp(reader.readLocalDateTime());
>     } else {
>       writeTimeNull();
>     }
>   }
> {noformat}
> Basically, it takes a {{LocalDateTime}} and formats it as a UTC timestamp 
> (using the "Z" suffix). This is only valid if the machine is in the UTC time 
> zone, which is why the test for this class attempts to force the local time 
> zone to UTC, something that most users will not do.
> A consequence of this bug is that "round trip" CTAS will change dates by the 
> UTC offset of the machine running the CTAS. In the Pacific time zone, each 
> "round trip" subtracts 8 hours from the time. After three round trips, the 
> "UTC" date in the Parquet file or JSON will be a day earlier than the 
> original data. One might argue that this "feature" is not always helpful.
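
The drift described in the issue can be illustrated with a minimal sketch in plain java.time, outside of Drill. Everything here is hypothetical: the class and its methods ({{writeNaive}}, {{writeConverted}}, {{read}}) are not Drill code, and the reader is assumed to parse the "Z" text as UTC and convert it back to local time. Whether Drill's JSON reader actually does that is exactly what the sqlline session above calls into question. A fixed UTC-8 offset stands in for the Pacific time zone.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

public class TimestampRoundTrip {

  // Hypothetical stand-in for the writer in the issue: the local
  // wall-clock value is printed unchanged and merely labelled "Z".
  static String writeNaive(LocalDateTime local) {
    return local.format(DateTimeFormatter.ISO_LOCAL_DATE_TIME) + "Z";
  }

  // Hypothetical corrected writer: interpret the value in the server's
  // zone, then print the equivalent UTC instant.
  static String writeConverted(LocalDateTime local, ZoneId zone) {
    return local.atZone(zone).toInstant().toString();
  }

  // Assumed reader behaviour: parse the text as a UTC instant and
  // convert it to a local timestamp in the server's zone.
  static LocalDateTime read(String text, ZoneId zone) {
    return LocalDateTime.ofInstant(Instant.parse(text), zone);
  }

  public static void main(String[] args) {
    ZoneId zone = ZoneOffset.ofHours(-8); // fixed offset, no DST
    LocalDateTime original = LocalDateTime.of(2022, 1, 24, 15, 2, 22);

    LocalDateTime naive = original;
    LocalDateTime converted = original;
    for (int trip = 0; trip < 3; trip++) {
      naive = read(writeNaive(naive), zone);
      converted = read(writeConverted(converted, zone), zone);
    }

    System.out.println("original:  " + original);
    System.out.println("naive:     " + naive);     // drifts 8 hours per trip
    System.out.println("converted: " + converted); // unchanged
  }
}
```

Under these assumptions the naive path loses eight hours per trip, so three trips land exactly one day earlier, while the converted path returns the original value. If the reader instead ignores the "Z" and treats the text as local time, the naive path also round-trips unchanged, which would be consistent with the result in the session above.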



--
This message was sent by Atlassian Jira
(v8.20.1#820001)