eugenegujing opened a new pull request, #4406:
URL: https://github.com/apache/texera/pull/4406
PR Description
## Purpose
Adds a new **Empirical Cumulative Distribution Function (ECDF)** plot
operator to the Statistical Visualization group, letting users visualize the
cumulative distribution of a numeric column and easily compare distributions
across groups.
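
For readers unfamiliar with the term: the ECDF at a value x is the fraction (or count) of observations at or below x. The sketch below is a minimal pure-Python illustration, not the operator's implementation. The names `ecdf_points` and `ecdf_by_group` are hypothetical, and the `complementary` and `reversed` modes are assumed to follow the usual definitions (observations strictly above x, and at or above x, respectively).

```python
def ecdf_points(values, mode="standard", as_probability=True):
    """Return sorted (x, y) pairs of the empirical CDF of `values`.

    Hypothetical helper for illustration only.
    """
    n = len(values)
    if n == 0:
        raise ValueError("values must not be empty")

    points = []
    for x in sorted(set(values)):
        if mode == "standard":
            # Observations at or below x.
            count = sum(1 for v in values if v <= x)
        elif mode == "complementary":
            # Observations strictly above x.
            count = sum(1 for v in values if v > x)
        elif mode == "reversed":
            # Observations at or above x.
            count = sum(1 for v in values if v >= x)
        else:
            raise ValueError(f"unknown mode: {mode!r}")
        points.append((x, count / n if as_probability else count))
    return points


def ecdf_by_group(rows, value_key, group_key):
    """Compute one standard ECDF per group, as a color column would."""
    groups = {}
    for row in rows:
        groups.setdefault(row[group_key], []).append(row[value_key])
    return {group: ecdf_points(vals) for group, vals in groups.items()}
```

With the values `[3, 1, 2, 2]`, standard mode gives 0.25 at 1, 0.75 at 2, and 1.0 at 3. Comparing the per-group curves returned by `ecdf_by_group` is the comparison the Color Column is meant to make visual.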
## Summary
- New operator `ECDFPlotOpDesc` under `operator/visualization/ecdfPlot/`,
rendered via `plotly.express.ecdf`.
- Configurable fields:
  - **Value Column** (required, numeric): column to compute the ECDF on
  - **Color Column** (optional): group and color lines by category
  - **SeparateBy Column** (optional): split the plot into facets
  - **Y Axis Mode**: `probability` / `count` / `sum`
  - **CDF Mode**: `standard` / `reversed` / `complementary`
  - **Orientation**: `vertical` / `horizontal`
  - **Show Markers / Show Lines** toggles
  - **Marginal Plot**: `""` (none) / `histogram` / `rug`
- Registered in `LogicalOp.scala` as the `ECDFPlot` operator type.
- Added operator icon `frontend/src/assets/operator_images/ECDFPlot.png`.
- User-provided enum fields (`cdfMode`, `orientation`, `marginal`) use
`EncodableString` so generated Python safely passes
`PythonCodeRawInvalidTextSpec`.
- Unit tests in `ECDFPlotOpDescSpec` covering the empty-value assertion and
the generated figure with all optional parameters.
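
As a rough guide to how the fields above could map onto `plotly.express.ecdf`, here is a hedged sketch. It is not the code this PR generates: `ecdf_kwargs`, `make_figure`, and their parameter names are hypothetical, and mapping SeparateBy to `facet_col` is an assumption. Only the `probability` and `count` Y-axis modes are covered, because the operator's handling of `sum` is not described here.

```python
def ecdf_kwargs(value_col, color_col=None, facet_col=None,
                y_axis_mode="probability", cdf_mode="standard",
                orientation="vertical", show_markers=False,
                show_lines=True, marginal=""):
    """Translate operator-style fields into plotly.express.ecdf keywords.

    Hypothetical mapping for illustration; the real operator may differ.
    """
    if y_axis_mode not in ("probability", "count"):
        # "sum" is intentionally left out of this sketch.
        raise ValueError(f"unsupported y_axis_mode: {y_axis_mode!r}")
    if cdf_mode not in ("standard", "reversed", "complementary"):
        raise ValueError(f"unknown cdf_mode: {cdf_mode!r}")
    if orientation not in ("vertical", "horizontal"):
        raise ValueError(f"unknown orientation: {orientation!r}")

    horizontal = orientation == "horizontal"
    kwargs = {
        # The value column goes on x for vertical plots, y for horizontal.
        ("y" if horizontal else "x"): value_col,
        "orientation": "h" if horizontal else "v",
        # ecdfnorm=None makes plotly show raw counts instead of probability.
        "ecdfnorm": "probability" if y_axis_mode == "probability" else None,
        "ecdfmode": cdf_mode,
        "markers": show_markers,
        "lines": show_lines,
        # An empty string means "no marginal plot".
        "marginal": marginal or None,
    }
    if color_col:
        kwargs["color"] = color_col
    if facet_col:
        kwargs["facet_col"] = facet_col  # assumed mapping for SeparateBy
    return kwargs


def make_figure(data_frame, **fields):
    """Build the figure; plotly is imported lazily and must be installed."""
    import plotly.express as px
    return px.ecdf(data_frame, **ecdf_kwargs(**fields))
```

For example, `make_figure(df, value_col="score", color_col="team")` would draw one ECDF line per team for a data frame `df` with those (hypothetical) columns.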
## Test
- [x] `sbt scalafmtCheckAll` passes
- [x] `sbt "scalafixAll --check"` passes
- [x] `sbt "WorkflowOperator/testOnly org.apache.texera.amber.operator.visualization.ecdfPlot.ECDFPlotOpDescSpec org.apache.texera.amber.util.PythonCodeRawInvalidTextSpec"` passes: all tests pass (4/4), 110/110 raw-invalid OK, 110/110 py_compile OK
- [x] Manually tested end-to-end in the UI with CSV source → ECDF Plot
operator; verified the rendered plot in the result panel for all combinations
of color/facet/CDF mode/orientation/marginal options.
## Screenshots
<img width="3529" height="1962" alt="image"
src="https://github.com/user-attachments/assets/44392eaa-e6bb-48ee-80dc-a4c128425255"
/>
[ecdf_demo.csv](https://github.com/user-attachments/files/26813799/ecdf_demo.csv)