bersprockets commented on a change in pull request #34741:
URL: https://github.com/apache/spark/pull/34741#discussion_r759707950
##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcUtils.scala
##########
@@ -531,13 +533,16 @@ object OrcUtils extends Logging {
   }

   def fromOrcNTZ(ts: Timestamp): Long = {
-    DateTimeUtils.millisToMicros(ts.getTime) +
+    val utcMicros = DateTimeUtils.millisToMicros(ts.getTime) +
       (ts.getNanos / NANOS_PER_MICROS) % MICROS_PER_MILLIS
+    val micros = DateTimeUtils.fromUTCTime(utcMicros, TimeZone.getDefault.getID)
+    micros
   }

   def toOrcNTZ(micros: Long): OrcTimestamp = {
-    val seconds = Math.floorDiv(micros, MICROS_PER_SECOND)
-    val nanos = (micros - seconds * MICROS_PER_SECOND) * NANOS_PER_MICROS
+    val utcMicros = DateTimeUtils.toUTCTime(micros, TimeZone.getDefault.getID)

Review comment:
> I'm trying to understand this issue better. From the ORC source code, seems like

Almost: for the first point, the ORC writer stores the timestamp value passed by the caller as-is and records the writer's time zone in the file footer (that's why we need to shift the value before passing it to ORC).

The devil, however, is in the details. _I know enough to know there are issues with the above, but not enough to know every issue or solution_.

For example, the offset for a particular time zone is not fixed; it changes depending on the point on the timeline. When shifting, ORC determines the offsets for the two time zones based on the stored timestamp value (millis) ([see here](https://github.com/apache/orc/blob/334bf1f2c605f38c7e75ec81d1dab93c31fc8459/java/core/src/java/org/apache/orc/impl/SerializationUtils.java#L1444)). Because ORC uses the old time APIs to do this, we should ensure we pass a correct Hybrid Julian epoch time to ORC (via a Timestamp object), or ORC could derive the wrong offsets.

Because of that, I don't think you can pass an arbitrary long value in the Timestamp object and hope to properly reconstruct it on read: you might not know how ORC shifted it.
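To illustrate the point that an offset is not fixed, here is a minimal sketch using the same old `java.util.TimeZone` API that ORC relies on (the instants and expected offsets are my own illustration, not taken from the PR):

```scala
import java.util.TimeZone

// The offset for a zone depends on the instant you ask about:
// America/Los_Angeles is UTC-8 (PST) in winter and UTC-7 (PDT) in summer.
val la = TimeZone.getTimeZone("America/Los_Angeles")

val winterMillis = 1609459200000L  // 2021-01-01T00:00:00Z
val summerMillis = 1625097600000L  // 2021-07-01T00:00:00Z

val winterOffset = la.getOffset(winterMillis)  // -8 hours, in millis
val summerOffset = la.getOffset(summerMillis)  // -7 hours, in millis
```

This is why a shift computed at one point on the timeline cannot simply be reused at another.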
In addition, the fromUTCTime/toUTCTime/convertTZ utility methods work correctly, but are not appropriate for our needs here. They use the new Java time APIs, which don't work as we'd like for timestamps before the introduction of Railway time (circa 1883-11-17). Emulating here what fromUTCTime does, you can see the issue:
```
scala> val ldtUtc = ZonedDateTime.of(1883, 11, 16, 0, 0, 0, 0, ZoneId.of("UTC")).toLocalDateTime
ldtUtc: java.time.LocalDateTime = 1883-11-16T00:00

scala> val zdtShifted = ldtUtc.atZone(ZoneId.of("America/Los_Angeles"))
zdtShifted: java.time.ZonedDateTime = 1883-11-16T00:00-07:52:58[America/Los_Angeles]
```
Note how the offset that the new APIs applied to the shifted value is -07:52:58, which is not evenly divisible by 1 hour or 30 minutes. As a result, you end up with a timestamp like this:
```
scala> val ts = new Timestamp(zdtShifted.toInstant.toEpochMilli)
ts: java.sql.Timestamp = 1883-11-15 23:52:58.0
```
In fact, you can see a compounded version of this with this PR:
```
scala> TimeZone.setDefault(TimeZone.getTimeZone("UTC"))

scala> sql("select timestamp_ntz'1883-11-16 00:00:00.0' as ts_ntz").write.mode("overwrite").orc("test")

scala> TimeZone.setDefault(TimeZone.getTimeZone("America/Los_Angeles"))

scala> spark.read.orc("test").show(false)
+-------------------+
|ts_ntz             |
+-------------------+
|1883-11-16 00:07:02|
+-------------------+
```
Edit: I assume the above is because of pre-Railway shifting; I didn't verify that.

I updated my [POC](https://github.com/apache/spark/compare/master...bersprockets:orc_ntz_issue_play) to attempt to accommodate these issues by:
- stealing the timestamp shifting technique from the ORC code (thus avoiding fromUTCTime/toUTCTime), and
- passing Hybrid Julian values to ORC on write, and assuming ORC retrieves a Hybrid Julian value on read.

That seems to have eliminated the pre-Railway and pre-Gregorian issues. But there is more...
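For reference, the offset-difference shifting technique can be sketched roughly like this. This is a simplified, hypothetical helper in the spirit of the linked ORC `SerializationUtils` code, not the actual ORC implementation (which also handles edge cases such as a shift that lands across a DST boundary):

```scala
import java.util.TimeZone

// Hypothetical sketch: look up each zone's offset at the stored instant
// (using the old TimeZone API) and adjust by the difference.
def shiftMillis(millis: Long, from: TimeZone, to: TimeZone): Long =
  millis + (from.getOffset(millis) - to.getOffset(millis))

val utc = TimeZone.getTimeZone("UTC")
val la  = TimeZone.getTimeZone("America/Los_Angeles")

// Shifting a winter instant from UTC to Los Angeles adds 8 hours, because
// the old API reports an offset of -8:00 (PST) for that instant.
val shifted = shiftMillis(1609459200000L, utc, la)  // 2021-01-01T00:00:00Z
```

Because the lookups go through `java.util.TimeZone`, the input millis must be on the Hybrid Julian timeline for pre-Gregorian values, which is the point above about passing correct epoch values to ORC.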
The ORC API accepts and returns Timestamp objects for writing and reading ORC timestamp fields. This alone introduces some oddities that will be noticeable to end users. For example, not all timestamp_ntz values can be represented by a Timestamp in all time zones: the date/time '2021-03-14 02:15:00.0' doesn't exist in the America/Los_Angeles time zone.
```
scala> TimeZone.setDefault(TimeZone.getTimeZone("America/Los_Angeles"))

scala> val ts = java.sql.Timestamp.valueOf("2021-03-14 02:15:00.0")
ts: java.sql.Timestamp = 2021-03-14 03:15:00.0
```
So we will have oddities like this (using my POC code):
```
scala> import java.util.TimeZone
import java.util.TimeZone

scala> TimeZone.setDefault(TimeZone.getTimeZone("America/Los_Angeles"))

scala> sql("select timestamp_ntz'2021-03-14 02:15:00.0' as ts_ntz").write.mode("overwrite").orc("test")

scala> spark.read.orc("test").show(false)
+-------------------+
|ts_ntz             |
+-------------------+
|2021-03-14 01:15:00|
+-------------------+
```
With this PR, you actually see it not with "spring forward" but with "fall back" (not sure why):
```
scala> import java.util.TimeZone
import java.util.TimeZone

scala> TimeZone.setDefault(TimeZone.getTimeZone("America/Los_Angeles"))

scala> sql("select timestamp_ntz'1996-10-27T09:10:25.088353' as ts_ntz").write.mode("overwrite").orc("test")

scala> spark.read.orc("test").show(false)
+--------------------------+
|ts_ntz                    |
+--------------------------+
|1996-10-27 08:10:25.088353|
+--------------------------+
```
So to summarize:
- ORC needs the correct epoch values (pre-shifted, rebased to Hybrid Julian) to determine the correct offsets when shifting on read.
- You can't use fromUTCTime/toUTCTime (or convertTZ) for shifting pre-Railway datetime values.
- For the ORC timestamp type, the ORC API receives and returns Timestamp objects, and Timestamp objects alone introduce oddities.
- It would certainly be nice if we could save and retrieve a long value (à la useUTCTime) without affecting timestamp_ltz.

Edit: caveat: I am no expert in the time APIs.
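The nonexistent-local-time oddity above can be reproduced independently of Spark and ORC with plain `java.sql.Timestamp` (a minimal sketch; note it mutates the JVM default time zone):

```scala
import java.sql.Timestamp
import java.util.TimeZone

// 02:15 on 2021-03-14 does not exist in America/Los_Angeles: clocks spring
// forward from 02:00 to 03:00 that morning, so the old API silently
// resolves the parsed value to 03:15.
TimeZone.setDefault(TimeZone.getTimeZone("America/Los_Angeles"))
val ts = Timestamp.valueOf("2021-03-14 02:15:00.0")
// ts.toString == "2021-03-14 03:15:00.0"
```

Any design that round-trips timestamp_ntz through Timestamp inherits this behavior for local times skipped by a DST transition.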