anshul98ks123 opened a new pull request, #17534:
URL: https://github.com/apache/pinot/pull/17534

   ## Issue
   
   `FieldSpec.getStringValue()` fails to serialize Scala collections to valid 
JSON
   
   ## Description
   
   When `jackson-module-scala` is on the classpath (via Kafka connectors), 
Jackson deserializes empty JSON objects `{}` as Scala collections instead of 
Java collections. The existing `getStringValue()` method only handles Java 
`Map`/`List` via `instanceof` checks, but **Scala collections don't implement 
`java.util.Map` or `java.util.List`**, causing them to fall through to 
`toString()` which produces invalid JSON.
   
   For example, consider a schema with a `ComplexFieldSpec`:
   
   ```
   "complexFieldSpecs": [
       {
           "name": "dimensions",
           "dataType": "MAP",
           "defaultNullValue": {}
       }
   ]
   ```
   When Jackson deserializes this with `jackson-module-scala` on the classpath:
   1. `"defaultNullValue": {}` → `scala.collection.immutable.Map$EmptyMap$`
   2. `setDefaultNullValue(scalaMap)` calls `getStringValue(scalaMap)`
   3. `instanceof Map` returns `false` (a Scala `Map` is not a `java.util.Map`)
   4. Falls through to `scalaMap.toString()` → `"Map()"`
   5. Later, `setDataType(DataType.MAP)` triggers `getDefaultNullValue()`, which calls `DataType.MAP.convert("Map()")`
   6. `JsonUtils.stringToObject("Map()", Map.class)` fails because `"Map()"` is not valid JSON
   ```
   Caused by: com.fasterxml.jackson.databind.exc.MismatchedInputException: Cannot convert value: 'Map()' to type: MAP
    at [Source: REDACTED; line: 333, column: 33] (through reference chain: TablePreviewApi["tableConfigs"]->TableConfigs["schema"]->Schema["complexFieldSpecs"]->ArrayList[0]->ComplexFieldSpec["dataType"])
   ```
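
   The fall-through in steps 3 and 4 can be reproduced without Scala on the classpath. The sketch below is illustrative only: `FakeScalaEmptyMap` is a hypothetical stand-in for `scala.collection.immutable.Map$EmptyMap$`, and `getStringValueBeforeFix` is a simplified copy of the pre-fix logic, not the actual Pinot method.

   ```java
   import java.math.BigDecimal;
   import java.util.Map;

   public class ToStringFallThroughDemo {
     // Hypothetical stand-in for a Scala empty map: it does NOT implement
     // java.util.Map, and its toString() mimics Scala's "Map()" rendering.
     public static final class FakeScalaEmptyMap {
       @Override
       public String toString() {
         return "Map()";
       }
     }

     // Simplified version of the pre-fix logic (assumption: only the
     // BigDecimal branch is kept here for brevity).
     public static String getStringValueBeforeFix(Object value) {
       if (value instanceof BigDecimal) {
         return ((BigDecimal) value).toPlainString();
       }
       // Anything unrecognized ends up here, including Scala collections.
       return value.toString();
     }

     public static void main(String[] args) {
       Object scalaLike = new FakeScalaEmptyMap();
       System.out.println(scalaLike instanceof Map);            // false
       System.out.println(getStringValueBeforeFix(scalaLike));  // Map()
     }
   }
   ```

   The string `Map()` is what later reaches the JSON parser in step 6.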
   
   ## Bug
   
   The bug is in 
[FieldSpec.getStringValue()](https://github.com/startreedata/pinot/blob/3bb68a04bb03dd108030fd2280aee1701bb46058/pinot-spi/src/main/java/org/apache/pinot/spi/data/FieldSpec.java#L347):
   ```
   public static String getStringValue(Object value) {
       if (value instanceof BigDecimal) {
         return ((BigDecimal) value).toPlainString();
       }
       if (value instanceof byte[]) {
         return BytesUtils.toHexString((byte[]) value);
       }
       return value.toString();  // ← BUG: Scala Map.toString() = "Map()"
   }
   ```
   
   ## Fix
   
   This PR fixes the issue by detecting Scala collections by class name and serializing them to JSON:
   ```
   public static String getStringValue(Object value) {
       // ... BigDecimal, byte[] handling ...
       
       // Handle Java collections AND Scala collections
       if (value instanceof Map || value instanceof List || isScalaCollection(value)) {
         try {
           return JsonUtils.objectToString(value);  // Serialize to JSON properly
         } catch (JsonProcessingException e) {
           throw new RuntimeException("Failed to serialize collection to JSON", e);
         }
         }
       }
       return value.toString();
   }
   ```
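
   The body of `isScalaCollection` is not shown above. A minimal sketch of a class-name check might look like the following, assuming a `scala.collection.` package prefix is what the check matches on. The class name `ScalaCollectionCheck` and the helper `isScalaCollectionClassName` are hypothetical and exist only for this illustration. Checking the name rather than using `instanceof` avoids a compile-time dependency on the Scala library.

   ```java
   import java.util.HashMap;

   public class ScalaCollectionCheck {
     // Assumed prefix; e.g. scala.collection.immutable.Map$EmptyMap$ matches it.
     private static final String SCALA_COLLECTION_PREFIX = "scala.collection.";

     // Hypothetical helper: decides purely from the fully qualified class name.
     public static boolean isScalaCollectionClassName(String className) {
       return className != null && className.startsWith(SCALA_COLLECTION_PREFIX);
     }

     // Null-safe wrapper that inspects the runtime class of the value.
     public static boolean isScalaCollection(Object value) {
       return value != null && isScalaCollectionClassName(value.getClass().getName());
     }

     public static void main(String[] args) {
       System.out.println(
           isScalaCollectionClassName("scala.collection.immutable.Map$EmptyMap$")); // true
       System.out.println(isScalaCollection(new HashMap<String, Object>()));        // false
     }
   }
   ```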
   
   ## Testing
   - [X] UTs
   - [X] Lint Check
   - [X] Ran STP locally and validated Preview API with the following block
   ```
   {
       "schema": {
           ...
           ...
           "complexFieldSpecs": [
               {
                   "fieldType": "COMPLEX",
                   "childFieldSpecs": {
                       ...
                       ...
                   },
                   "singleValueField": true,
                   "notNull": false,
                   "allowTrailingZeros": false,
                   "defaultNullValueString": "{}",
                   "name": "address",
                   "defaultNullValue": {},
                   "dataType": "MAP"
               }
           ]
       }
   }
   ```
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

