Re: [PR] HIVE-29033: ORC reader should not assume that TimestampColumnVector is in UTC time zone [hive]

via GitHub Wed, 25 Jun 2025 15:47:36 -0700


vrozov commented on code in PR #5868:
URL: https://github.com/apache/hive/pull/5868#discussion_r2167768653



##########
ql/src/java/org/apache/hadoop/hive/ql/io/BatchToRowReader.java:
##########
@@ -518,7 +521,8 @@ public static TimestampWritableV2 
nextTimestamp(ColumnVector vector,
         result = (TimestampWritableV2) previous;
       }
       TimestampColumnVector tcv = (TimestampColumnVector) vector;
-      result.setInternal(tcv.time[row], tcv.nanos[row]);
+      result.set(Timestamp.ofEpochSecond(Math.floorDiv(tcv.time[row], 1000L), 
tcv.nanos[row],
+          tcv.isUTC() ? ZoneOffset.UTC : ZoneId.systemDefault()));

Review Comment:
   1. It passes all Hive tests (that use UTC). As I said I am open to introduce 
new test cases assuming that other changes in the PR are OK and the PR is 
rejected only because of missing unit tests for `isUTC=false`.
   2. It is bad ASF policy to reject PR without giving -1 and detailed 
technical explanation. It is OK to open a second PR that uses completely 
different approach.
   3. The issue that I need to fix is purely in Hive. The problem is that 
without more changes in `RecordReaderImpl` few Hive tests also break, so that 
extra fix is necessary for Hive.
   4. OK, it is regression in Hive 3.x compared to Hive 2.x. Note that Spark 
upgrades from 2.3.10 to 4.x, so for Spark it is a regression.
   5. I don't argue whether or not your changes are compatible with 
https://github.com/apache/orc/pull/2300. They are simply orthogonal to the way 
how `TimestampColumnVector` works.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org
For additional commands, e-mail: gitbox-h...@hive.apache.org

Re: [PR] HIVE-29033: ORC reader should not assume that TimestampColumnVector is in UTC time zone [hive]

Reply via email to