The GitHub Actions job "Required Checks" on texera.git/gh-readonly-queue/main/pr-5738-8a90f1f667c44bc26c0faf9eee619392e3f57ddf has succeeded. Run started by GitHub user aglinxinyuan (triggered by aglinxinyuan).
Head commit for run:
0efbc0f59cad0a660912fab63de04a4860d8b42c / Xinyuan Lin <[email protected]>
test(workflow-operator): add unit test coverage for SET-family LogicalOp descriptors (#5738)

### What changes were proposed in this PR?

Pin behavior of three previously-uncovered `LogicalOp` descriptors in the SET / cleaning operator family. Each descriptor wires a physical-op class name + port shape + (where applicable) partitioning + schema-propagation contract through `getPhysicalOp`. No production-code changes.

| Spec | Source class | Tests |
| --- | --- | --- |
| `UnionOpDescSpec` | `UnionOpDesc` | 5 |
| `DistinctOpDescSpec` | `DistinctOpDesc` | 7 |
| `DifferenceOpDescSpec` | `DifferenceOpDesc` | 9 |

All three spec files follow the `<srcClassName>Spec.scala` one-to-one convention. `IntersectOpDescSpec` already exists and gave us the spec-shape template.

**Behavior pinned: `UnionOpDesc`**

| Surface | Contract |
| --- | --- |
| `operatorInfo` | name `"Union"`, group `SET_GROUP`, description mentions "Union" |
| Ports | one input, one non-blocking output |
| `getPhysicalOp` | wires `OpExecWithClassName("…operator.union.UnionOpExec")` |
| Partition requirement | empty (no hash-alignment forced; unlike Distinct / Difference / Intersect, Union preserves whatever the upstream produced) |
| Independent instances | no static state shared across `new UnionOpDesc` |

**Behavior pinned: `DistinctOpDesc`**

| Surface | Contract |
| --- | --- |
| `operatorInfo` | name `"Distinct"`, group `CLEANING_GROUP`, description mentions "duplicate" |
| Ports | one input, one **blocking** output |
| `getPhysicalOp` | wires `OpExecWithClassName("…operator.distinct.DistinctOpExec")`; `partitionRequirement` is `List(Option(HashPartition()))`; `derivePartition` always returns `HashPartition` regardless of input partition kind |

**Behavior pinned: `DifferenceOpDesc`**

| Surface | Contract |
| --- | --- |
| `operatorInfo` | name `"Difference"`, group `SET_GROUP`, description mentions "difference"; two input ports with `displayName` `"left"` (PortIdentity 0) and `"right"` (PortIdentity 1); one **blocking** output |
| `getPhysicalOp` | wires `OpExecWithClassName("…operator.difference.DifferenceOpExec")`; `partitionRequirement` is `List(Option(HashPartition()), Option(HashPartition()))` (both inputs); `derivePartition` always returns `HashPartition` |
| Schema propagation | accepts a single shared input schema and produces that schema on every output port; throws `IllegalArgumentException` when the two inputs do not share one schema |

### Any related issues, documentation, discussions?

Closes #5734.

### How was this PR tested?

Pure unit-test additions; verified locally with:

- `sbt "WorkflowOperator/testOnly org.apache.texera.amber.operator.union.UnionOpDescSpec org.apache.texera.amber.operator.distinct.DistinctOpDescSpec org.apache.texera.amber.operator.difference.DifferenceOpDescSpec"` (21 tests, all green)
- `sbt scalafmtCheckAll` (clean)
- CI to confirm

### Was this PR authored or co-authored using generative AI tooling?

Generated-by: Claude Code (Opus 4.7 [1M context])

Report URL: https://github.com/apache/texera/actions/runs/27722882811

With regards,
GitHub Actions via GitBox
