vrozov commented on code in PR #5868: URL: https://github.com/apache/hive/pull/5868#discussion_r2165213074
########## ql/src/java/org/apache/hadoop/hive/ql/io/BatchToRowReader.java: ########## @@ -518,7 +521,8 @@ public static TimestampWritableV2 nextTimestamp(ColumnVector vector, result = (TimestampWritableV2) previous; } TimestampColumnVector tcv = (TimestampColumnVector) vector; - result.setInternal(tcv.time[row], tcv.nanos[row]); + result.set(Timestamp.ofEpochSecond(Math.floorDiv(tcv.time[row], 1000L), tcv.nanos[row], + tcv.isUTC() ? ZoneOffset.UTC : ZoneId.systemDefault())); Review Comment: I still don't see why significant refactoring and propagating flag using `context` is a better solution. What is wrong with the iterating over columns? The PR approach works with existing way how `TimestampTreeReader` sets `isUTC` flag in `TimestampColumnVector` and also aligns with the https://github.com/apache/orc/pull/2300 changes. Once Hive upgrades to ORC with the fix, the workaround in the PR can be kept in place or removed, while suggested refactoring becomes obsolete. > Additionally, it would be nice to reuse the timestamp conversion function to avoid code duplication I can make that change, if that the only remaining concern with the PR. ########## ql/src/java/org/apache/hadoop/hive/ql/io/BatchToRowReader.java: ########## @@ -518,7 +521,8 @@ public static TimestampWritableV2 nextTimestamp(ColumnVector vector, result = (TimestampWritableV2) previous; } TimestampColumnVector tcv = (TimestampColumnVector) vector; - result.setInternal(tcv.time[row], tcv.nanos[row]); + result.set(Timestamp.ofEpochSecond(Math.floorDiv(tcv.time[row], 1000L), tcv.nanos[row], + tcv.isUTC() ? ZoneOffset.UTC : ZoneId.systemDefault())); Review Comment: I still don't see why significant refactoring and propagating flag using `context` is a better solution. What is wrong with the iterating over columns? The PR approach works with existing way how `TimestampTreeReader` sets `isUTC` flag in `TimestampColumnVector` and also aligns with the https://github.com/apache/orc/pull/2300 changes. Once Hive upgrades to ORC with the fix, the workaround in the PR can be kept in place or removed, while suggested refactoring becomes obsolete. > Additionally, it would be nice to reuse the timestamp conversion function to avoid code duplication I can make that change, if that the only remaining concern with the PR. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org