aglinxinyuan opened a new issue, #5776:
URL: https://github.com/apache/texera/issues/5776

   ### Task Summary
   
   Add dedicated unit-specs (`PortIdentityKeySerializerSpec.scala` and 
`PortIdentityKeyDeserializerSpec.scala`) that pin the string-key format Texera 
uses for `PortIdentity` JSON **map keys**, and the round-trip between the 
serializer and the deserializer.
   
   ## Background
   
   `PortIdentity` (ScalaPB-generated from `workflow.proto`: `id: Int`, 
`internal: Boolean`) is used as a **map key** in JSON-serialized structures. 
JSON object keys must be strings, so Jackson cannot write a structured object 
as a key without a custom key serializer. `common/workflow-core` therefore 
provides a paired key serializer / deserializer that flattens a `PortIdentity` 
to an `"id_internal"` string and back. Neither has a dedicated unit spec today.
   
   ```scala
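   // Jackson types used by the two files below
   import com.fasterxml.jackson.core.JsonGenerator
   import com.fasterxml.jackson.databind.{DeserializationContext, JsonSerializer, KeyDeserializer, SerializerProvider}

   // PortIdentityKeySerializer.scala
   case object PortIdentityKeySerializer {
     // Flattens a PortIdentity to the "id_internal" key string, e.g. "3_false"
     def portIdToString(portId: PortIdentity): String =
       s"${portId.id}_${portId.internal}"
   }
   class PortIdentityKeySerializer extends JsonSerializer[PortIdentity] {
     override def serialize(key: PortIdentity, gen: JsonGenerator, serializers: SerializerProvider): Unit =
       gen.writeFieldName(PortIdentityKeySerializer.portIdToString(key))
   }
   
   // PortIdentityKeyDeserializer.scala
   class PortIdentityKeyDeserializer extends KeyDeserializer {
     // Splits on "_" and parses both parts; no validation beyond toInt / toBoolean
     override def deserializeKey(key: String, ctxt: DeserializationContext): PortIdentity = {
       val parts = key.split("_")
       PortIdentity(parts(0).toInt, parts(1).toBoolean)
     }
   }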
   ```
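
   As a rough illustration of the key format, the sketch below reproduces the two conversions using only the Scala standard library. `PortId`, `toKey`, and `fromKey` are hypothetical stand-ins for the ScalaPB-generated `PortIdentity`, `portIdToString`, and `deserializeKey`; they are not Texera's classes, and the real specs should exercise the production code instead.

   ```scala
   // Hypothetical stand-in for the ScalaPB-generated PortIdentity (illustration only).
   final case class PortId(id: Int = 0, internal: Boolean = false)

   object KeyFormatSketch {
     // Mirrors portIdToString: "<id>_<internal>".
     def toKey(p: PortId): String = s"${p.id}_${p.internal}"

     // Mirrors deserializeKey: split on "_" and parse both parts.
     def fromKey(key: String): PortId = {
       val parts = key.split("_")
       PortId(parts(0).toInt, parts(1).toBoolean)
     }

     def main(args: Array[String]): Unit = {
       // Exact string format for the two examples named in this issue.
       assert(toKey(PortId(3, internal = false)) == "3_false")
       assert(toKey(PortId(0, internal = true)) == "0_true")

       // Round-trip over zero, large, and negative ids with both flag values.
       val samples = for {
         id <- Seq(0, 3, Int.MaxValue, -1, Int.MinValue)
         internal <- Seq(true, false)
       } yield PortId(id, internal)
       samples.foreach(p => assert(fromKey(toKey(p)) == p))
     }
   }
   ```

   Negative ids survive the round trip because the sign character is `-`, which does not collide with the `_` separator.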
   
   ## Behavior to pin
   
   | Surface | Contract |
   | --- | --- |
   | `PortIdentityKeySerializer.portIdToString` | `PortIdentity(3, internal = false)` → `"3_false"`; `PortIdentity(0, internal = true)` → `"0_true"` (format is exactly `"${id}_${internal}"`) |
   | `PortIdentityKeyDeserializer.deserializeKey` | `"3_false"` → `PortIdentity(3, internal = false)`; `"0_true"` → `PortIdentity(0, internal = true)` |
   | Round-trip | `deserializeKey(portIdToString(p)) == p` for representative `p`: `internal` both `true` and `false`; `id` zero, large, and negative |
   | Full Jackson map round-trip (recommended) | a `Map[PortIdentity, V]` serialized with the project mapper (`org.apache.texera.amber.util.JSONUtils.objectMapper`, which registers these as the key (de)serializers) reads back with identical keys |
   
   > The deserializer currently assumes exactly two `_`-separated parts and a 
numeric id — pin **today's** behavior; this task does not add new validation.
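
   To make "today's behavior" concrete, here is a minimal sketch of what the same `split` / `toInt` / `toBoolean` steps do with malformed keys. It uses only the Scala standard library; `parse` is a hypothetical helper that returns a tuple instead of a `PortIdentity`. Whether the spec should pin any of these cases is left to the implementer.

   ```scala
   import scala.util.Try

   object MalformedKeySketch {
     // Same parsing steps as the deserializer shown in the Background section.
     def parse(key: String): (Int, Boolean) = {
       val parts = key.split("_")
       (parts(0).toInt, parts(1).toBoolean)
     }

     def main(args: Array[String]): Unit = {
       // Extra "_"-separated parts are silently ignored.
       assert(parse("3_false_extra") == (3, false))
       // String.toBoolean accepts any letter case.
       assert(parse("3_TRUE") == (3, true))
       // A key with no "_" has no second part to index.
       assert(Try(parse("3")).failed.get.isInstanceOf[ArrayIndexOutOfBoundsException])
       // A non-numeric id fails in toInt.
       assert(Try(parse("abc_true")).failed.get.isInstanceOf[NumberFormatException])
       // A flag other than true/false fails in toBoolean.
       assert(Try(parse("3_maybe")).failed.get.isInstanceOf[IllegalArgumentException])
     }
   }
   ```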
   
   ## Scope
   
   - New spec files under 
`common/workflow-core/src/test/scala/org/apache/texera/amber/util/serde/` — one 
per source class (`<srcClassName>Spec.scala` convention). Bundling both in a 
single PR is fine.
   - Exercise the serializer via `portIdToString` and/or a Jackson `Map` 
round-trip; no engine/runtime needed.
   - No production-code changes. Follow the module's existing spec style (e.g. 
`AnyFunSuite`/`AnyFlatSpec`).
   
   ### Task Type
   - [ ] Refactor / Cleanup
   - [ ] DevOps / Deployment / CI
   - [x] Testing / QA
   - [ ] Documentation
   - [ ] Performance
   - [ ] Other
   

