Jackie-Jiang commented on code in PR #18400:
URL: https://github.com/apache/pinot/pull/18400#discussion_r3180035533


##########
pinot-plugins/pinot-input-format/pinot-parquet/src/main/java/org/apache/pinot/plugin/inputformat/parquet/ParquetNativeRecordExtractor.java:
##########
@@ -148,95 +164,74 @@ private Object extractValue(Group from, int fieldIndex, 
Type fieldType, int inde
             return BigDecimal.valueOf(longValue, 
decimalLogicalTypeAnnotation.getScale());
           }
           if (logicalTypeAnnotation instanceof 
LogicalTypeAnnotation.TimeLogicalTypeAnnotation) {
-            // `TIME_MICROS` / `TIME_NANOS` → [LocalTime] preserving full 
nanosecond precision.
+            // `TIME_MICROS` / `TIME_NANOS` → [LocalTime] preserving full 
nanosecond precision, or raw value
+            // (in the column's declared unit) when configured.
+            if (_extractRawTimeValues) {
+              return longValue;
+            }
             LogicalTypeAnnotation.TimeUnit timeUnit =
                 ((LogicalTypeAnnotation.TimeLogicalTypeAnnotation) 
logicalTypeAnnotation).getUnit();
             return LocalTime.ofNanoOfDay(timeUnit == 
LogicalTypeAnnotation.TimeUnit.MICROS ? longValue * 1_000L
                 : longValue);
           }
           if (logicalTypeAnnotation instanceof 
LogicalTypeAnnotation.TimestampLogicalTypeAnnotation) {
-            // `TIMESTAMP_MILLIS` / `TIMESTAMP_MICROS` / `TIMESTAMP_NANOS` → 
`java.sql.Timestamp` preserving
-            // sub-millisecond precision via `setNanos`.
-            return timestampFromLong(longValue,
+            // `TIMESTAMP_MILLIS` / `TIMESTAMP_MICROS` / `TIMESTAMP_NANOS` → 
[Timestamp] preserving sub-millisecond
+            // nanos via `setNanos`. Raw mode returns the value as stored in 
the column's declared unit.
+            // TODO: `isAdjustedToUTC()` is ignored — a `timestamp-* 
adjustToUTC=false` field is local-clock
+            //   time, not a UTC instant, but we treat all values as UTC 
instants per the Pinot TIMESTAMP
+            //   contract. For local-datetime data the produced `Timestamp` 
will be off by the JVM TZ offset.
+            if (_extractRawTimeValues) {
+              return longValue;
+            }
+            return ParquetUtils.convertLongToTimestamp(longValue,
                 ((LogicalTypeAnnotation.TimestampLogicalTypeAnnotation) 
logicalTypeAnnotation).getUnit());

Review Comment:
   Acknowledged, but out of scope for this PR. The `isAdjustedToUTC=false` 
mishandling is pre-existing — the prior code (before this PR) decoded both 
variants identically as UTC instants too. This PR keeps that behavior unchanged 
and adds an explicit TODO at the call site (`isAdjustedToUTC() is ignored — a 
'timestamp-* adjustToUTC=false' field is local-clock time, not a UTC instant, 
but we treat all values as UTC instants per the Pinot TIMESTAMP contract`). 
Fixing it is itself a user-visible behavior change for anyone whose data is 
`adjustToUTC=false` — their TIMESTAMP column values would shift by the JVM TZ 
offset on upgrade — so it deserves its own PR with a release note rather than 
being folded into the contract-refinement scope here.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to