aglinxinyuan opened a new pull request, #5813: URL: https://github.com/apache/texera/pull/5813
### What changes were proposed in this PR?

Pin behavior of four previously-untested text-search/match descriptors in `common/workflow-operator/`. They share the same shape (match/filter tuples by a string predicate on a column) and contribute operator metadata, physical-op wiring, and (for DictionaryMatcher) output-schema propagation. No production-code changes.

| Spec | Source class | Tests |
| --- | --- | --- |
| `KeywordSearchOpDescSpec` | `KeywordSearchOpDesc` | 6 |
| `SubstringSearchOpDescSpec` | `SubstringSearchOpDesc` | 4 |
| `RegexOpDescSpec` | `RegexOpDesc` | 3 |
| `DictionaryMatcherOpDescSpec` | `DictionaryMatcherOpDesc` | 5 |

All four spec files follow the `<srcClassName>Spec.scala` one-to-one convention.

**Behavior pinned**

| Surface | Contract |
| --- | --- |
| `operatorInfo` | exact `userFriendlyName`; group `SEARCH_GROUP`; one input / one output port; `supportReconfiguration == true` |
| Field defaults | `KeywordSearch`/`Substring` `isCaseSensitive == false` |
| `getPhysicalOp` | `opExecInitInfo` pattern-matches `OpExecWithClassName(<FQCN>, descString)` with the exact executor class name and a non-empty payload; ports carried forward from `operatorInfo` |
| Polymorphic JSON round-trip | serialize → deserialize via `classOf[LogicalOp]` → correct subtype with fields preserved (pins the `@JsonTypeInfo` discriminator + `@JsonProperty` wire-keys) |
| `DictionaryMatcher` schema propagation | `getExternalOutputSchemas` appends a `BOOLEAN` column named by `resultAttribute` to the input schema |
| `DictionaryMatcher` MatchingType | serializes via its `@JsonValue` name (`SCANBASED` → `"Scan"`) and round-trips |

Mirrors the established `SleepOpDescSpec` / `SortOpDescSpec` patterns (AnyFlatSpec + Matchers; `OpExecWithClassName` match instead of brittle `toString`; polymorphic deserialize via `classOf[LogicalOp]`).

### Any related issues, documentation, discussions?

Closes #5806.

### How was this PR tested?
Pure unit-test additions; verified locally with:

- `sbt "WorkflowOperator/testOnly org.apache.texera.amber.operator.keywordSearch.KeywordSearchOpDescSpec org.apache.texera.amber.operator.substringSearch.SubstringSearchOpDescSpec org.apache.texera.amber.operator.regex.RegexOpDescSpec org.apache.texera.amber.operator.dictionary.DictionaryMatcherOpDescSpec"`: 18 tests, all green
- `sbt "WorkflowOperator/Test/scalafmtCheck"` and `sbt "WorkflowOperator/Test/scalafix --check"`: clean
- CI to confirm

### Was this PR authored or co-authored using generative AI tooling?

Generated-by: Claude Code (Opus 4.8 [1M context])

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
