Paul Rogers created DRILL-8099:
----------------------------------
Summary: Parquet record writer does not convert Drill local
timestamp to UTC
Key: DRILL-8099
URL: https://issues.apache.org/jira/browse/DRILL-8099
Project: Apache Drill
Issue Type: Bug
Affects Versions: 1.19.0
Reporter: Paul Rogers
Assignee: Paul Rogers
Drill follows the old SQL engine convention of storing the `TIMESTAMP` type in
the local time zone. This is, of course, highly awkward today, when UTC
is the standard timestamp in most products. However, it is how Drill
works. (It would be great to add a `UTC_TIMESTAMP` type, but that is another
topic.)
Each reader or writer that works with files that hold UTC timestamps must
convert to (reader) or from (writer) Drill's local-time timestamp. Otherwise,
Drill works correctly only when the server time zone is set to UTC.
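To make the two directions concrete, here is a minimal sketch using only {{java.time}}. It assumes that a Drill timestamp is a count of milliseconds that encodes the local wall-clock time as if it were UTC; that encoding, and the class and method names below, are assumptions for illustration and are not taken from the Drill code base.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;

// Hypothetical helper, for illustration only; not part of Drill.
class LocalTimestampSketch {

  // Writer direction: local wall-clock millis -> true UTC epoch millis.
  static long localToUtc(long localMillis, ZoneId zone) {
    // Read the value as a zone-less wall-clock time.
    LocalDateTime wallClock =
        LocalDateTime.ofInstant(Instant.ofEpochMilli(localMillis), ZoneOffset.UTC);
    // Attach the server zone; java.time resolves DST gaps and overlaps by its own rules.
    return wallClock.atZone(zone).toInstant().toEpochMilli();
  }

  // Reader direction: true UTC epoch millis -> local wall-clock millis.
  static long utcToLocal(long utcMillis, ZoneId zone) {
    LocalDateTime wallClock =
        Instant.ofEpochMilli(utcMillis).atZone(zone).toLocalDateTime();
    // Re-encode the wall-clock time as if it were UTC.
    return wallClock.toInstant(ZoneOffset.UTC).toEpochMilli();
  }
}
```

When the zone is UTC both methods return their input unchanged, which matches the observation that an unconverted reader or writer only works correctly on a UTC server.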
Now, perhaps we can convince most shops to run their Drill server in UTC, or at
least set the JVM timezone to UTC. However, this still leaves developers in the
lurch: if the development machine timezone is not UTC, then some tests fail. In
particular:
{{TestNestedDateTimeTimestamp.testNestedDateTimeCTASParquet}}
The reason that the above test fails is that the generated Parquet writer code
assumes (incorrectly) that the Drill timestamp is in UTC and so no conversion
is needed to write that data into Parquet. In particular, in
{{ParquetOutputRecordWriter.getNewTimeStampConverter()}}:
{noformat}
reader.read(holder);
consumer.addLong(holder.value);
{noformat}
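As a rough illustration of the difference, the sketch below contrasts a pass-through write with one that converts first. It is not Drill's generated code: {{LongConsumer}} stands in for the Parquet record consumer, the class and method names are hypothetical, and the encoding of the Drill value as local wall-clock milliseconds is an assumption.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.util.function.LongConsumer;

// Hypothetical stand-in for the generated converter; not Drill's actual code.
class TimestampWriteSketch {

  // Mirrors the snippet above: the Drill value is passed through unchanged,
  // which is only correct when the server zone is UTC.
  static void writeUnconverted(long drillValue, LongConsumer consumer) {
    consumer.accept(drillValue);
  }

  // Converts the local wall-clock value to a UTC instant before writing.
  static void writeConverted(long drillValue, ZoneId zone, LongConsumer consumer) {
    long utcMillis =
        LocalDateTime.ofInstant(Instant.ofEpochMilli(drillValue), ZoneOffset.UTC)
            .atZone(zone)
            .toInstant()
            .toEpochMilli();
    consumer.accept(utcMillis);
  }
}
```

With a server zone eight hours behind UTC, the two methods write values that differ by eight hours for the same input, which is the kind of shift a test would see on a non-UTC development machine.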