aglinxinyuan opened a new issue, #5735:
URL: https://github.com/apache/texera/issues/5735
### Task Summary
Add dedicated unit specs for three small operator-package utility / contract
classes in `common/workflow-operator/`. None of them carries heavy infrastructure
dependencies; the value is regression coverage against accidental renames,
default-value drift, and Jackson-annotation drift.
## Background
| Source class | Package | Purpose |
| --- | --- | --- |
| `OperatorDescriptorUtils` | `operator.util` | Holds `equallyPartitionGoal(goal, totalNumWorkers)` and `toImmutableMap(javaMap)`, pure utility functions used by worker / partition planning |
| `PropertyNameConstants` | `operator.metadata` | Holds the canonical JSON-property-name constants used across the engine (e.g. `OPERATOR_ID`, `INPUT_PORTS`, `WORKFLOW_ID`). Drift in any constant breaks every persisted workflow JSON that uses it. |
| `PortDescriptor` (+ `PortDescription`) | `operator` | `PortDescriptor` is a trait with mutable `inputPorts` / `outputPorts` lists; `PortDescription` is a case class carrying the Jackson annotation `@JsonIgnoreProperties("allowMultiInputs")` as a backward-compat shim |
## Behavior to pin
### `OperatorDescriptorUtils.equallyPartitionGoal`
| Surface | Contract |
| --- | --- |
| Exactly divisible (`goal % totalNumWorkers == 0`) | every worker gets `goal / totalNumWorkers` |
| Inexact (`goal % totalNumWorkers > 0`) | first `goal % totalNumWorkers` workers get an extra `1`; the rest get the floor |
| Result length | always equals `totalNumWorkers` |
| Sum of the result | always equals `goal` |
| `goal = 0` | all workers get 0 |
| Single-worker | one entry equal to `goal` |
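
The rows above can be exercised without any Texera dependency. The following is a minimal sketch, assuming a signature of `(Int, Int) => Seq[Int]`, a non-negative `goal`, and a positive `totalNumWorkers`. `PartitionSketch` and `holdsInvariants` are hypothetical names, and the body is written from the table rather than taken from the production code, whose return type may differ.

```scala
// Hypothetical stand-in that follows the contract table; not the Texera source.
object PartitionSketch {
  // Splits `goal` across `totalNumWorkers` workers as evenly as possible.
  // Assumes goal >= 0 and totalNumWorkers > 0.
  def equallyPartitionGoal(goal: Int, totalNumWorkers: Int): Seq[Int] = {
    val floor = goal / totalNumWorkers
    val remainder = goal % totalNumWorkers
    // The first `remainder` workers receive one extra unit.
    (0 until totalNumWorkers).map(i => if (i < remainder) floor + 1 else floor)
  }

  // Invariants from the table: result length and result sum.
  def holdsInvariants(goal: Int, totalNumWorkers: Int): Boolean = {
    val parts = equallyPartitionGoal(goal, totalNumWorkers)
    parts.length == totalNumWorkers && parts.sum == goal
  }
}
```

With this stand-in, `PartitionSketch.equallyPartitionGoal(10, 3)` yields `4, 3, 3`: the single leftover unit goes to the first worker. A real spec would make the same assertions against `OperatorDescriptorUtils` itself.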
### `OperatorDescriptorUtils.toImmutableMap`
| Surface | Contract |
| --- | --- |
| Empty java map | returns the empty Scala map |
| Populated java map | returns a Scala immutable map with the same entries |
| Subsequent mutation of the source java map | does NOT leak into the returned Scala map |
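
The mutation-isolation row is the least obvious of the three, so here is a small sketch of how it might be checked. It assumes Scala 2.13 or later (for `scala.jdk.CollectionConverters`); `ImmutableMapSketch` and `snapshotThenMutate` are hypothetical names, and the one-line body is an assumed implementation, not necessarily what the production method does.

```scala
import java.util.{HashMap => JHashMap, Map => JMap}
import scala.jdk.CollectionConverters._

// Hypothetical stand-in; the production method may be implemented differently.
object ImmutableMapSketch {
  // Copies the entries into a Scala immutable Map, so the result
  // no longer shares state with the Java map.
  def toImmutableMap[K, V](javaMap: JMap[K, V]): Map[K, V] =
    javaMap.asScala.toMap

  // Converts first, mutates the source afterwards, and returns the
  // earlier snapshot so a caller can confirm nothing leaked into it.
  def snapshotThenMutate(): Map[String, Int] = {
    val source = new JHashMap[String, Int]()
    source.put("a", 1)
    val snapshot = toImmutableMap(source)
    source.put("b", 2) // must not show up in `snapshot`
    snapshot
  }
}
```

The design point is that `asScala` alone only wraps the Java map as a live view; it is the `toMap` copy that gives the isolation this row asks for.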
### `PropertyNameConstants`
| Surface | Contract |
| --- | --- |
| Logical-plan keys (`OPERATOR_ID` / `OPERATOR_TYPE` / `OPERATOR_LIST` / `OPERATOR_LINK_LIST` / `OPERATOR_VERSION` / `ORIGIN_OPERATOR_ID` / `DESTINATION_OPERATOR_ID`) | exact string values |
| Common operator keys (`ATTRIBUTE_NAMES` / `ATTRIBUTE_NAME` / `RESULT_ATTRIBUTE_NAME` / `SPAN_LIST_NAME` / `TABLE_NAME`) | exact string values |
| Physical-plan keys (`WORKFLOW_ID` / `EXECUTION_ID` / `PARALLELIZABLE` / `LOCATION_PREFERENCE` / `PARTITION_REQUIREMENT` / `INPUT_PORTS` / `OUTPUT_PORTS` / `IS_ONE_TO_MANY_OP` / `SUGGESTED_WORKER_NUM`) | exact string values |
| All constants together | distinct (no accidental aliasing) |
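
The distinctness row could be checked with a small duplicate finder, sketched below. The string values in `SampleConstants` are placeholders invented for illustration and are NOT the real `PropertyNameConstants` values; `SampleConstants` and `DistinctnessCheck` are hypothetical names. A real spec would list the actual constants instead.

```scala
// Placeholder constants for illustration only; these are NOT the real
// PropertyNameConstants values.
object SampleConstants {
  val OPERATOR_ID = "placeholder-operator-id"
  val WORKFLOW_ID = "placeholder-workflow-id"
  val INPUT_PORTS = "placeholder-input-ports"
  val all: Seq[String] = Seq(OPERATOR_ID, WORKFLOW_ID, INPUT_PORTS)
}

object DistinctnessCheck {
  // Returns the values that occur more than once; empty means no aliasing.
  def duplicates(values: Seq[String]): Set[String] =
    values.groupBy(identity).collect { case (v, vs) if vs.size > 1 => v }.toSet
}
```

Returning the offending values, rather than a plain boolean, makes a failing assertion name the aliased constant directly.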
### `PortDescriptor` / `PortDescription`
| Surface | Contract |
| --- | --- |
| `PortDescriptor` (mixed into a test class) | defaults `inputPorts` / `outputPorts` to `null` (Jackson sets them post-construction); both are mutable `var`s |
| `PortDescription()` constructor | preserves every field |
| `PortDescription` equality / `copy` | case-class semantics |
| `@JsonIgnoreProperties("allowMultiInputs")` | present on `PortDescription`; verified via reflection (backward-compat shim) |
| JSON round-trip | preserves every field |
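
The null-default and case-class rows can be illustrated without Jackson on the classpath. The sketch below uses hypothetical shapes (`SamplePort`, `SamplePortHolder`, `PortSketch`); the field names and types are assumptions and do not mirror the real `PortDescription`. The annotation and JSON round-trip rows are left out because they need the Jackson library.

```scala
// Hypothetical shapes for illustration; field names and types are assumptions.
case class SamplePort(portID: String, displayName: String)

trait SamplePortHolder {
  // Left unset at construction time; a deserializer would assign them later.
  var inputPorts: List[SamplePort] = null
  var outputPorts: List[SamplePort] = null
}

object PortSketch {
  // Minimal concrete class that mixes in the trait, as a test class would.
  class Holder extends SamplePortHolder

  // Both lists start as null on a freshly constructed holder.
  def defaultsAreNull: Boolean = {
    val h = new Holder
    h.inputPorts == null && h.outputPorts == null
  }

  // Being `var`s, the lists can be reassigned after construction.
  def reassignable: Boolean = {
    val h = new Holder
    h.inputPorts = List(SamplePort("in-0", "input"))
    h.inputPorts.size == 1
  }

  // Structural equality, and `copy` changing only the named field.
  def caseClassSemantics: Boolean = {
    val p = SamplePort("in-0", "input")
    val renamed = p.copy(displayName = "renamed")
    p == SamplePort("in-0", "input") && renamed.portID == p.portID && renamed != p
  }
}
```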
## Scope
- New spec files (one per source class per the spec-filename convention):
  - `OperatorDescriptorUtilsSpec.scala`
  - `PropertyNameConstantsSpec.scala`
  - `PortDescriptorSpec.scala`
- No production-code changes.
### Task Type
- [ ] Refactor / Cleanup
- [ ] DevOps / Deployment / CI
- [x] Testing / QA
- [ ] Documentation
- [ ] Performance
- [ ] Other