eugenegujing opened a new pull request, #5722:
URL: https://github.com/apache/texera/pull/5722

   ### What changes were proposed in this PR?
   
   This PR adds a dedicated `GlobalPortIdentitySerdeSpec` for 
`GlobalPortIdentitySerde`, the helper that serializes/deserializes a 
`GlobalPortIdentity` to the compact, underscore-free string used in VFS URIs 
and file paths.
   
   `GlobalPortIdentitySerde` was previously exercised inside 
`PortIdentitySerdeSpec`, which mixed it with the unrelated `PortIdentity` 
Jackson key (de)serializer tests. Rather than duplicate that coverage, this PR:
   
   - **Moves** the `GlobalPortIdentitySerde` cases into their own file (the 
name issue #5717 asks for), so the two serde concerns are tested independently.
   - **Trims** `PortIdentitySerdeSpec` down to only the `PortIdentity` Jackson 
key (de)serialization tests, dropping the now-unused imports.
   - **Adds** seven edge cases that were previously unpinned (see below).
   
   There are **no production-code changes** and **no duplication** between the 
two specs.
   
   New edge cases added beyond the migrated coverage:
   
   1. **serialize/deserialize asymmetry** — the deserializer regex accepts 
underscores even though the serializer rejects them; characterized so a future 
tightening of the regex breaks the test on purpose.
   2. **unescaped comma** — a `logicalOpId` containing a `,` fails to 
round-trip, because the format does not escape its separator.
   3. **portId boundary** — `Int.MaxValue` round-trips, and an out-of-Int-range 
value (`9999999999`) is rejected with `NumberFormatException` on deserialize.
   4. **mixed-case booleans** — `True` / `FALSE` are accepted on deserialize, 
since Scala's `String.toBoolean` is case-insensitive.
   5. **`=` in a value round-trips** — only `,` is a sensitive separator; an 
embedded `=` is captured fine, the counterpart to the comma case.
   6. **empty `logicalOpId`** — serializes into `(logicalOpId=,...)` but the 
result can no longer be deserialized (`[^,]+` requires ≥1 char), the serialize 
side of the empty-field asymmetry.
   7. **require messages name the field** — the serialize-side guards identify 
the offending field (`logicalOpId` / `layerName` / `portId`) in their message, 
catching a regression that wires a check to the wrong field.
   
   | File | Change |
   | --- | --- |
   | `common/workflow-core/.../util/serde/GlobalPortIdentitySerdeSpec.scala` | 
**new** — 23 tests (`AnyFlatSpec with Matchers`) |
   | `common/workflow-core/.../util/serde/PortIdentitySerdeSpec.scala` | 
removed the migrated `GlobalPortIdentitySerde` block + now-unused imports (−189 
lines); `PortIdentity` Jackson key (de)serialization tests kept intact |
   
   ### Any related issues, documentation, discussions?
   
   Closes #5717
   
   ### How was this PR tested?
   
   Test-only PR. All checks below were run locally from the repository root and 
pass:
   
   1. **Unit tests** — the new and modified specs:
      ```
      sbt "WorkflowCore/testOnly 
org.apache.texera.amber.util.serde.GlobalPortIdentitySerdeSpec 
org.apache.texera.amber.util.serde.PortIdentitySerdeSpec"
      ```
      Result: **31 tests, 0 failures** (1 `pendingUntilFixed`, pre-existing in 
`PortIdentitySerdeSpec`).
   
   2. **Lint (scalafix)** — `sbt "scalafixAll --check"` → success.
   
   3. **Format (scalafmt)** — `sbt scalafmtCheckAll` → success.
   
   ### Was this PR authored or co-authored using generative AI tooling?
   
   Co-authored by: Claude Opus 4.8
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to