aglinxinyuan opened a new pull request, #5768:
URL: https://github.com/apache/texera/pull/5768

   ### What changes were proposed in this PR?
   
   Pin the behavior of four previously uncovered sklearn-trainer descriptors in `common/workflow-operator/operator/machineLearning/sklearnAdvanced/`. Each is a 30-line override of `SklearnMLOperatorDescriptor` that contributes just two values: the Python `import` statement and the operator-info label. Drift in either one silently breaks the generated Python code or the UI label, respectively. No production-code changes.
   
   | Spec | Source class | Tests |
   | --- | --- | --- |
   | `SklearnAdvancedKNNClassifierTrainerOpDescSpec` | `SklearnAdvancedKNNClassifierTrainerOpDesc` | 5 |
   | `SklearnAdvancedKNNRegressorTrainerOpDescSpec` | `SklearnAdvancedKNNRegressorTrainerOpDesc` | 6 |
   | `SklearnAdvancedSVCTrainerOpDescSpec` | `SklearnAdvancedSVCTrainerOpDesc` | 5 |
   | `SklearnAdvancedSVRTrainerOpDescSpec` | `SklearnAdvancedSVRTrainerOpDesc` | 6 |
   
   All four spec files follow the `<srcClassName>Spec.scala` one-to-one 
convention.
   
   **Behavior pinned (per descriptor)**
   
   | Surface | Contract |
   | --- | --- |
   | `getImportStatements` | exact canonical Python import (`KNeighborsClassifier` / `KNeighborsRegressor` / `SVC` / `SVR` from the appropriate sklearn module) |
   | `getOperatorInfo` | exact canonical label (`"KNN Classifier"` / `"KNN Regressor"` / `"SVM Classifier"` / `"SVM Regressor"`) |
   | Stability across two instances | both methods return the same string regardless of which instance is queried |
   | Type assignability | extends `SklearnMLOperatorDescriptor[ParamsT]` (compile-time enforced through a typed `val` binding) |
   | Type-pattern matching | `case _: SklearnMLOperatorDescriptor[_]` matches a concrete instance |
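
The five contracts in the table could be exercised roughly as follows. This is only a sketch: `MLDescriptorSketch`, `KnnParams`, and `KnnClassifierDescSketch` are hypothetical stand-ins rather than the real Texera classes, the exact import string is an assumption (the table only says "exact canonical Python import"), and plain `assert` is used instead of the project's test framework so the block stays self-contained.

```scala
// Hypothetical stand-ins for illustration; not the real Texera classes.
trait KnnParams // placeholder for the descriptor's parameter type

abstract class MLDescriptorSketch[P] {
  def getImportStatements: String
  def getOperatorInfo: String
}

class KnnClassifierDescSketch extends MLDescriptorSketch[KnnParams] {
  // Assumed string; the real descriptor's exact text is not shown in the PR.
  override def getImportStatements: String =
    "from sklearn.neighbors import KNeighborsClassifier"
  override def getOperatorInfo: String = "KNN Classifier"
}

object PinnedContractSketch {
  def main(args: Array[String]): Unit = {
    val a = new KnnClassifierDescSketch
    val b = new KnnClassifierDescSketch

    // Exact-string pins: any drift in either value fails here.
    assert(a.getImportStatements == "from sklearn.neighbors import KNeighborsClassifier")
    assert(a.getOperatorInfo == "KNN Classifier")

    // Stability: a second instance returns the same strings.
    assert(a.getImportStatements == b.getImportStatements)
    assert(a.getOperatorInfo == b.getOperatorInfo)

    // Type assignability: this binding compiles only if the subclass relation holds.
    val typed: MLDescriptorSketch[KnnParams] = a
    assert(typed.getOperatorInfo == "KNN Classifier")

    // Type-pattern matching against the erased base type.
    val matched = (a: Any) match {
      case _: MLDescriptorSketch[_] => true
      case _                        => false
    }
    assert(matched)
  }
}
```

The typed `val` binding is the only check that fails at compile time rather than at run time, which is why the table lists it separately from the pattern match.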
   
   Each Regressor spec additionally cross-checks against its Classifier sibling (KNN Regressor vs KNN Classifier, SVR vs SVC), which accounts for the sixth test in those two specs. This catches copy-paste regressions where one subclass accidentally returns the other's strings.
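
A sibling cross-check of this kind might look like the sketch below. The class names (`TrainerDescSketch`, `KnnClassifierSketch`, `KnnRegressorSketch`) and the helper `differsFrom` are hypothetical, and the import strings are assumed rather than taken from the Texera sources.

```scala
// Hypothetical stand-ins for illustration; not the real Texera classes.
abstract class TrainerDescSketch {
  def getImportStatements: String
  def getOperatorInfo: String
}

class KnnClassifierSketch extends TrainerDescSketch {
  override def getImportStatements: String =
    "from sklearn.neighbors import KNeighborsClassifier" // assumed string
  override def getOperatorInfo: String = "KNN Classifier"
}

class KnnRegressorSketch extends TrainerDescSketch {
  override def getImportStatements: String =
    "from sklearn.neighbors import KNeighborsRegressor" // assumed string
  override def getOperatorInfo: String = "KNN Regressor"
}

object SiblingCrossCheckSketch {
  // True only when BOTH surfaces differ between the two descriptors,
  // so a partial copy-paste (one string duplicated) is also flagged.
  def differsFrom(a: TrainerDescSketch, b: TrainerDescSketch): Boolean =
    a.getImportStatements != b.getImportStatements &&
      a.getOperatorInfo != b.getOperatorInfo

  def main(args: Array[String]): Unit = {
    val classifier = new KnnClassifierSketch
    val regressor = new KnnRegressorSketch
    assert(differsFrom(regressor, classifier))

    // Simulated copy-paste regression: a regressor returning classifier strings.
    val broken = new KnnRegressorSketch {
      override def getImportStatements: String = classifier.getImportStatements
      override def getOperatorInfo: String = classifier.getOperatorInfo
    }
    assert(!differsFrom(broken, classifier))
  }
}
```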
   
   ### Any related issues, documentation, discussions?
   
   Closes #5765.
   
   ### How was this PR tested?
   
   Pure unit-test additions; verified locally with:
   
   - `sbt "WorkflowOperator/testOnly org.apache.texera.amber.operator.machineLearning.sklearnAdvanced.KNNTrainer.SklearnAdvancedKNNClassifierTrainerOpDescSpec org.apache.texera.amber.operator.machineLearning.sklearnAdvanced.KNNTrainer.SklearnAdvancedKNNRegressorTrainerOpDescSpec org.apache.texera.amber.operator.machineLearning.sklearnAdvanced.SVCTrainer.SklearnAdvancedSVCTrainerOpDescSpec org.apache.texera.amber.operator.machineLearning.sklearnAdvanced.SVRTrainer.SklearnAdvancedSVRTrainerOpDescSpec"` (22 tests, all green)
   - `sbt scalafmtCheckAll` (clean)
   - CI to confirm
   
   ### Was this PR authored or co-authored using generative AI tooling?
   
   Generated-by: Claude Code (Opus 4.7 [1M context])

