aminghadersohi commented on code in PR #38955:
URL: https://github.com/apache/superset/pull/38955#discussion_r3009828778
##########
superset/mcp_service/chart/validation/dataset_validator.py:
##########
@@ -426,6 +426,8 @@ def _validate_aggregations(
errors = []
for col_ref in column_refs:
+ if col_ref.saved_metric:
Review Comment:
When `saved_metric=True`, this skips aggregation validation (correct), but
there's no check that `col_ref.name` actually exists in
`dataset_context.available_metrics` vs `available_columns`.
A user could set `saved_metric=True` on a regular column name like
`"product_line"` — it would pass `_column_exists` (which checks both columns
and metrics), then emit a plain string in `form_data["metrics"]`, causing a
cryptic query-time failure since `"product_line"` isn't in `metrics_by_name`.
**Suggestion:** Add a check here (or in `validate_against_dataset`) that
verifies saved metric names match an entry in `available_metrics` specifically.
##########
superset/mcp_service/chart/schemas.py:
##########
@@ -485,6 +485,19 @@ class ColumnRef(BaseModel):
]
| None
) = Field(None, description="SQL aggregate function")
+ saved_metric: bool = Field(
Review Comment:
The `sanitize_name` field validator (below) calls `sanitize_user_input` with
`check_sql_keywords=True`. Some valid saved metric names might contain SQL
keywords (e.g., a metric literally named `count` or `sum`). Since saved metrics
reference existing dataset objects rather than being injected into raw SQL,
should `check_sql_keywords` be relaxed when `saved_metric=True`?
##########
superset/mcp_service/chart/chart_utils.py:
##########
@@ -381,8 +381,8 @@ def map_table_config(config: TableChartConfig) -> Dict[str,
Any]:
aggregated_metrics = []
for col in config.columns:
- if col.aggregate:
- # Column has aggregation - treat as metric
+ if col.saved_metric or col.aggregate:
Review Comment:
The return type widened from `Dict[str, Any]` to `Dict[str, Any] | str`.
Current callers handle this fine since Superset supports mixed `str | dict` in
metric lists, but future callers doing `result["aggregate"]` without a type
check would break silently. Maybe worth a brief comment at the call sites
noting the mixed-type list?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]