Copilot commented on code in PR #18400:
URL: https://github.com/apache/pinot/pull/18400#discussion_r3175820076


##########
pinot-spi/src/main/java/org/apache/pinot/spi/data/readers/BaseRecordExtractor.java:
##########
@@ -211,22 +194,18 @@ protected Map<Object, Object> convertRecord(Object value) 
{
         getClass().getSimpleName() + " does not support nested records; 
override convertRecord() to enable.");
   }
 
-  /// Converts a single value per the [RecordExtractor] contract:
+  /// Universal normalizations for types that any format extractor might 
produce:
   /// - `Byte` / `Short` widen to `Integer` so all small ints unify behind a 
single Pinot type
   /// - [BigInteger] widens to [BigDecimal] (Pinot has no `BigInteger` data 
type; downstream transforms handle
   ///   `BigDecimal` natively)
   /// - other `Number` (`Integer` / `Long` / `Float` / `Double` / 
`BigDecimal`) passes through
   /// - `Boolean` passes through
   /// - `byte[]` passes through
-  /// - [Timestamp] passes through
   /// - `ByteBuffer` materializes to `byte[]` (sliced so the source buffer's 
position is not advanced)
   ///
-  /// [Temporal] family (`java.time`):
-  /// - [LocalDate] / [LocalTime] pass through (TZ-independent date / 
time-of-day)
-  /// - [Instant] / [OffsetDateTime] / [ZonedDateTime] bridge to [Timestamp] 
via [Timestamp#from] — all three are
-  ///   unambiguously TZ-anchored, so the conversion is lossless (sub-second 
nanos preserved)
-  /// - [LocalDateTime] bridges to [Timestamp] interpreting the wall-clock as 
UTC
-  /// - other [Temporal] types fall through to `value.toString()`
+  /// Date / time types (DATE / TIME / TIMESTAMP) are NOT handled here — see 
[RecordExtractor] for the contract
+  /// and the format-specific extractor (`convertSingleValue` override or 
direct construction in `extractValue`)
+  /// for the native-to-contract conversion.

Review Comment:
   This change silently alters the default SPI behavior for third-party 
`BaseRecordExtractor` subclasses: any extractor that relied on the base 
implementation to normalize `Timestamp`/`Instant`/`LocalDate`/`LocalTime` now 
falls through to `toString()`. That is a breaking contract change for external 
formats unless every downstream subclass is updated in lockstep.



##########
pinot-plugins/pinot-input-format/pinot-arrow/src/main/java/org/apache/pinot/plugin/inputformat/arrow/ArrowToGenericRowConverter.java:
##########
@@ -246,12 +245,10 @@ private Object 
convertArrowTypeToPinotCompatible(@Nullable Object value) {
       return value.toString();
     }
 
-    // Handle Arrow LocalDateTime -> java.sql.Timestamp conversion
+    // Handle Arrow LocalDateTime -> Long epoch millis (per RecordExtractor 
contract: TIMESTAMP → Long).
     if (value instanceof LocalDateTime) {
-      // Arrow TimeStampMilliVector.getObject() returns LocalDateTime, but 
Pinot expects
-      // java.sql.Timestamp objects for proper timestamp handling and native 
support
       LocalDateTime dateTime = (LocalDateTime) value;
-      return Timestamp.from(dateTime.toInstant(ZoneOffset.UTC));
+      return dateTime.toInstant(ZoneOffset.UTC).toEpochMilli();

Review Comment:
   The new `LocalDateTime -> epochMillis` conversion is only covered by a test 
that checks the result is a positive `Long`, so it would not catch a wrong 
epoch calculation or a timezone/offset regression. Since this file already has 
Arrow decoder tests, please add an assertion for the exact expected millis 
value (ideally including a non-midnight case) for the new UTC interpretation.



##########
pinot-spi/src/main/java/org/apache/pinot/spi/data/readers/BaseRecordExtractor.java:
##########
@@ -77,14 +68,6 @@ public void init(@Nullable Set<String> fields, @Nullable 
RecordExtractorConfig c
   protected void initConfig(@Nullable RecordExtractorConfig config) {
   }
 

Review Comment:
   Removing this protected helper is a source-compatible breaking change for 
external `pinot-spi` consumers that subclass `BaseRecordExtractor`. Even if the 
method is unused inside this repository, third-party extractors can still call 
`shouldExtract(...)`, so deleting it from a published SPI will break their 
builds on upgrade.
   



##########
pinot-plugins/pinot-input-format/pinot-orc/src/main/java/org/apache/pinot/plugin/inputformat/orc/ORCRecordExtractor.java:
##########
@@ -206,13 +206,12 @@ private static Object extractSingleValue(String field, 
ColumnVector columnVector
       }
       case TIMESTAMP:
       case TIMESTAMP_INSTANT: {
-        // `time[rowId]` is epoch millis truncated to second precision; 
sub-second nanos (0..999_999_999)
+        // `time[rowId]` is epoch millis (truncated to second precision); 
sub-second nanos (0..999_999_999)
         // live in `nanos[rowId]`. `TIMESTAMP_INSTANT` (UTC instant) shares 
the same vector layout as
-        // `TIMESTAMP` (local). Construct a `Timestamp` and set the full nanos 
field to preserve precision.
+        // `TIMESTAMP` (local). Per the contract, output is `Long` epoch 
millis; sub-millisecond precision is
+        // dropped here to match Pinot's TIMESTAMP physical storage.
         TimestampColumnVector tsv = (TimestampColumnVector) columnVector;
-        Timestamp ts = new Timestamp(tsv.time[rowId]);
-        ts.setNanos(tsv.nanos[rowId]);
-        return ts;
+        return tsv.time[rowId] + tsv.nanos[rowId] / 1_000_000L;

Review Comment:
   This new manual epoch-millis calculation is only exercised with positive 
timestamps in `ORCRecordExtractorTest`. Since the code now combines `time` and 
`nanos` directly instead of delegating to `Timestamp`, it needs at least one 
pre-epoch test case as well to catch sign/rounding regressions for negative 
timestamps.



##########
pinot-spi/src/main/java/org/apache/pinot/spi/utils/MapUtils.java:
##########
@@ -50,11 +49,13 @@ private MapUtils() {
   }
 
   private static final Logger LOGGER = LoggerFactory.getLogger(MapUtils.class);
-  private static final ObjectMapper SORTED_MAPPER =
-      new 
ObjectMapper().configure(SerializationFeature.ORDER_MAP_ENTRIES_BY_KEYS, true);
-  private static final ObjectMapper UNSORTED_MAPPER = new ObjectMapper();
-  private static final TypeReference<Map<String, Object>> MAP_TYPE_REFERENCE = 
new TypeReference<>() {
-  };
+
+  // Pinot's standard ObjectMapper config (JSR-310 / ISO-8601 for LocalDate / 
LocalTime — see JsonUtils),
+  // plus key-ordered output for the sorted variant so byte output is a pure 
function of the logical map.
+  // Writes only — read paths delegate to JsonUtils, whose DEFAULT_READER is 
already JSR-310-aware.
+  private static final ObjectWriter SORTED_WRITER = 
JsonUtils.newObjectMapperWithJavaTime()
+      .configure(SerializationFeature.ORDER_MAP_ENTRIES_BY_KEYS, 
true).writer();
+  private static final ObjectWriter UNSORTED_WRITER = 
JsonUtils.newObjectMapperWithJavaTime().writer();

Review Comment:
   This switches all `MapUtils` serialization paths to the new JSR-310-aware 
mapper, but none of the existing `MapUtilsTest` cases cover 
`LocalDate`/`LocalTime` values or nested maps/lists containing them. Because 
this file already has round-trip tests for other value types, please add 
analogous coverage here so regressions in the new mapper configuration are 
caught in the code path that actually serializes `MAP` columns.



##########
pinot-spi/src/main/java/org/apache/pinot/spi/utils/JsonUtils.java:
##########
@@ -157,11 +188,6 @@ public static JsonNode stringToJsonNode(String jsonString)
     return DEFAULT_READER.readTree(jsonString);
   }
 

Review Comment:
   Deleting `jsonNodeToStringMap(...)` is a backward-incompatible change in the 
published `pinot-spi` API. There are no in-repo callers anymore, but external 
code can still call this public helper, so removing it will break upgrades for 
downstream consumers even though the replacement is only an internal call-site 
change here.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to