Paul Rogers created DRILL-8099:
----------------------------------

             Summary: Parquet record writer does not convert Drill local 
timestamp to UTC
                 Key: DRILL-8099
                 URL: https://issues.apache.org/jira/browse/DRILL-8099
             Project: Apache Drill
          Issue Type: Bug
    Affects Versions: 1.19.0
            Reporter: Paul Rogers
            Assignee: Paul Rogers


Drill follows the old SQL engine convention of storing the `TIMESTAMP` type in 
the local time zone. This is, of course, highly awkward today, when UTC is 
used as the standard timestamp in most products. However, it is how Drill 
works. (It would be great to add a `UTC_TIMESTAMP` type, but that is another 
topic.)

Each reader or writer that works with files that hold UTC timestamps must 
convert to (reader) or from (writer) Drill's local-time timestamp. Otherwise, 
Drill works correctly only when the server time zone is set to UTC.

Now, perhaps we can convince most shops to run their Drill server in UTC, or at 
least set the JVM timezone to UTC. However, this still leaves developers in the 
lurch: if the development machine timezone is not UTC, then some tests fail. In 
particular:

{{TestNestedDateTimeTimestamp.testNestedDateTimeCTASParquet}}

The reason that the above test fails is that the generated Parquet writer code 
assumes (incorrectly) that the Drill timestamp is in UTC and so no conversion 
is needed to write that data into Parquet. In particular, in 
{{ParquetOutputRecordWriter.getNewTimeStampConverter()}}:

{noformat}
    reader.read(holder);
    consumer.addLong(holder.value);
{noformat}
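
For illustration only, the missing conversion might look roughly like the 
sketch below. It assumes that the Drill timestamp value is the local wall-clock 
time encoded as milliseconds since the epoch, as if that wall-clock time were 
UTC. The class and method names are hypothetical and are not part of Drill; 
this is not the actual fix.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;

// Hypothetical helper, not part of Drill.
final class LocalToUtc {

    // Assumption: localMillis holds a local wall-clock time encoded as
    // epoch milliseconds, as if the wall-clock time were UTC.
    static long localMillisToUtcMillis(long localMillis, ZoneId zone) {
        // Recover the wall-clock date and time without applying any offset.
        LocalDateTime wallClock = LocalDateTime.ofInstant(
            Instant.ofEpochMilli(localMillis), ZoneOffset.UTC);
        // Interpret that wall-clock time in the given zone to obtain the
        // true instant, then express the instant as UTC epoch milliseconds.
        return wallClock.atZone(zone).toInstant().toEpochMilli();
    }
}
```

Under that assumption, the writer would pass something like 
{{localMillisToUtcMillis(holder.value, ZoneId.systemDefault())}} to 
{{consumer.addLong()}} rather than the raw {{holder.value}}. Wall-clock times 
that fall in a daylight-saving gap or overlap are ambiguous; {{atZone()}} 
resolves them using the zone's rules, which may or may not be what Drill wants.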



