codeant-ai-for-open-source[bot] commented on code in PR #37639:
URL: https://github.com/apache/superset/pull/37639#discussion_r2761890858
##########
superset/mcp_service/chart/preview_utils.py:
##########
@@ -36,6 +36,23 @@
logger = logging.getLogger(__name__)
+def _build_query_columns(form_data: Dict[str, Any]) -> list[str]:
+ """Build query columns list from form_data, including both x_axis and
groupby."""
+ x_axis_config = form_data.get("x_axis")
+ groupby_columns: list[str] = form_data.get("groupby", [])
+ raw_columns: list[str] = form_data.get("columns", [])
Review Comment:
**Suggestion:** If `form_data["columns"]` exists but is `None` (or another
non-list type), assigning it directly to `raw_columns` and then calling
`.copy()` will raise an `AttributeError`, breaking previews for such requests
instead of safely falling back to an empty list. [possible bug]
<details>
<summary><b>Severity Level:</b> Major ⚠️</summary>
```mdx
- ❌ Preview generation errors when "columns" key is null
- ⚠️ Table and XY previews may be blocked intermittently
```
</details>
```suggestion
raw_columns = form_data.get("columns") or []
if not isinstance(raw_columns, list):
raw_columns = []
```
<details>
<summary><b>Steps of Reproduction ✅ </b></summary>
```mdx
1. Invoke generate_preview_from_form_data(form_data, dataset_id,
preview_format) in
superset/mcp_service/chart/preview_utils.py (caller uses
_build_query_columns at line 86).
2. Provide form_data that contains the "columns" key with a null or non-list
value, e.g.
form_data = {"viz_type": "table", "columns": None}. The presence of the key
is important.
3. Inside _build_query_columns (lines 39-53), line 43 sets raw_columns =
form_data.get("columns", []) which yields None. Because "columns" key
exists, the later
expression at line 45 uses raw_columns.copy() and attempts to call .copy()
on None.
4. This causes AttributeError: 'NoneType' object has no attribute 'copy',
failing preview
creation along the same path as ChartDataCommand execution. This reproduces
whenever a
request includes "columns": null.
```
</details>
<details>
<summary><b>Prompt for AI Agent 🤖 </b></summary>
```mdx
This is a comment left during a code review.
**Path:** superset/mcp_service/chart/preview_utils.py
**Line:** 43:43
**Comment:**
*Possible Bug: If `form_data["columns"]` exists but is `None` (or
another non-list type), assigning it directly to `raw_columns` and then calling
`.copy()` will raise an `AttributeError`, breaking previews for such requests
instead of safely falling back to an empty list.
Validate the correctness of the flagged issue. If correct, How can I resolve
this? If you propose a fix, implement it and please make it concise.
```
</details>
##########
superset/mcp_service/chart/preview_utils.py:
##########
@@ -36,6 +36,23 @@
logger = logging.getLogger(__name__)
+def _build_query_columns(form_data: Dict[str, Any]) -> list[str]:
+ """Build query columns list from form_data, including both x_axis and
groupby."""
+ x_axis_config = form_data.get("x_axis")
+ groupby_columns: list[str] = form_data.get("groupby", [])
+ raw_columns: list[str] = form_data.get("columns", [])
+
+ columns = raw_columns.copy() if "columns" in form_data else
groupby_columns.copy()
+ if x_axis_config and isinstance(x_axis_config, str):
+ if x_axis_config not in columns:
+ columns.insert(0, x_axis_config)
+ elif x_axis_config and isinstance(x_axis_config, dict):
+ col_name = x_axis_config.get("column_name")
+ if col_name and col_name not in columns:
+ columns.insert(0, col_name)
Review Comment:
**Suggestion:** When the x-axis is configured as an ad-hoc column (a dict
without `column_name` but with `sqlExpression`), the current logic ignores it
entirely, so the generated query columns omit the x-axis dimension and the
preview query may not match the chart configuration; adding a fallback to
include the dict itself when `column_name` is missing fixes this. [logic error]
<details>
<summary><b>Severity Level:</b> Major ⚠️</summary>
```mdx
- ❌ x_axis omitted from preview query for ad-hoc expressions
- ⚠️ Vega-Lite/table previews mismatch when using SQL expressions
```
</details>
```suggestion
if col_name:
if col_name not in columns:
columns.insert(0, col_name)
elif "sqlExpression" in x_axis_config and x_axis_config not in
columns:
columns.insert(0, x_axis_config)
```
<details>
<summary><b>Steps of Reproduction ✅ </b></summary>
```mdx
1. Call generate_preview_from_form_data(form_data, dataset_id,
preview_format) in
superset/mcp_service/chart/preview_utils.py (the call to
_build_query_columns is at line
86).
2. Provide form_data where x_axis is an ad-hoc column dict without
"column_name" but with
a SQL expression, e.g. form_data["x_axis"] = {"sqlExpression": "YEAR(date)"}
(this format
is used by ad-hoc columns).
3. _build_query_columns (lines 39-53) hits the dict branch at line 49,
extracts col_name
(line 50) which is None, and because no fallback exists the ad-hoc dict is
ignored. The
resulting columns list does not include the x_axis expression.
4. The QueryContext created from these columns therefore omits the x-axis
expression from
SELECT and GROUP BY. The preview (via ChartDataCommand and subsequent
preview generators)
shows missing or incorrect x-axis, reproducing for ad-hoc x_axis usages.
```
</details>
<details>
<summary><b>Prompt for AI Agent 🤖 </b></summary>
```mdx
This is a comment left during a code review.
**Path:** superset/mcp_service/chart/preview_utils.py
**Line:** 51:52
**Comment:**
*Logic Error: When the x-axis is configured as an ad-hoc column (a dict
without `column_name` but with `sqlExpression`), the current logic ignores it
entirely, so the generated query columns omit the x-axis dimension and the
preview query may not match the chart configuration; adding a fallback to
include the dict itself when `column_name` is missing fixes this.
Validate the correctness of the flagged issue. If correct, How can I resolve
this? If you propose a fix, implement it and please make it concise.
```
</details>
##########
superset/mcp_service/chart/preview_utils.py:
##########
@@ -36,6 +36,23 @@
logger = logging.getLogger(__name__)
+def _build_query_columns(form_data: Dict[str, Any]) -> list[str]:
+ """Build query columns list from form_data, including both x_axis and
groupby."""
+ x_axis_config = form_data.get("x_axis")
+ groupby_columns: list[str] = form_data.get("groupby", [])
Review Comment:
**Suggestion:** When `form_data["groupby"]` is present but set to `None` (or
any non-list value), assigning it directly to `groupby_columns` and then
calling `.copy()` later will raise an `AttributeError`, causing preview
generation to fail instead of gracefully treating it as no groupby columns.
[possible bug]
<details>
<summary><b>Severity Level:</b> Major ⚠️</summary>
```mdx
- ❌ generate_preview_from_form_data preview fails (AttributeError)
- ⚠️ LLM/form_data with null groupby triggers failure
```
</details>
```suggestion
groupby_columns = form_data.get("groupby") or []
if not isinstance(groupby_columns, list):
groupby_columns = []
```
<details>
<summary><b>Steps of Reproduction ✅ </b></summary>
```mdx
1. Call generate_preview_from_form_data(...) in
superset/mcp_service/chart/preview_utils.py (the function defined at the top
of the file;
its usage of _build_query_columns is at line 86 in the PR hunk).
2. Pass form_data containing "groupby": null (JSON null -> Python None).
Example payload:
form_data = {"viz_type": "echarts_timeseries_bar", "x_axis": "territory",
"groupby":
None}. The call site that triggers this is the line: columns =
_build_query_columns(form_data) at line 86.
3. Execution enters _build_query_columns (defined in the same file at lines
39-53). Line
42 assigns groupby_columns = form_data.get("groupby", []) which returns None
(because the
key exists with null).
4. Later, at line 45 (columns = raw_columns.copy() if "columns" in form_data
else
groupby_columns.copy()), Python attempts groupby_columns.copy() and raises
AttributeError:
'NoneType' object has no attribute 'copy'. The preview generation path blows
up and
returns a ChartError fallback. This reproduces deterministically when
"groupby" is present
and null.
```
</details>
<details>
<summary><b>Prompt for AI Agent 🤖 </b></summary>
```mdx
This is a comment left during a code review.
**Path:** superset/mcp_service/chart/preview_utils.py
**Line:** 42:42
**Comment:**
*Possible Bug: When `form_data["groupby"]` is present but set to `None`
(or any non-list value), assigning it directly to `groupby_columns` and then
calling `.copy()` later will raise an `AttributeError`, causing preview
generation to fail instead of gracefully treating it as no groupby columns.
Validate the correctness of the flagged issue. If correct, How can I resolve
this? If you propose a fix, implement it and please make it concise.
```
</details>
##########
superset/mcp_service/chart/tool/get_chart_preview.py:
##########
@@ -56,6 +56,22 @@ class ChartLike(Protocol):
uuid: Any
+def _build_query_columns(form_data: Dict[str, Any]) -> list[str]:
+ """Build query columns list from form_data, including both x_axis and
groupby."""
+ x_axis_config = form_data.get("x_axis")
+ groupby_columns: list[str] = form_data.get("groupby", [])
Review Comment:
**Suggestion:** `_build_query_columns` assumes that `form_data["groupby"]`
is always a list, but if it is explicitly `None` or a scalar (e.g. a single
string), calling `.copy()` will raise an AttributeError or produce an incorrect
type, breaking table/vega previews for such charts. Normalizing the value to a
list before copying avoids this runtime failure. [type error]
<details>
<summary><b>Severity Level:</b> Major ⚠️</summary>
```mdx
- ❌ Table preview generation may raise exceptions.
- ❌ Vega-Lite preview queries can be aborted.
- ⚠️ get_chart_preview tool returns ChartError instead.
```
</details>
```suggestion
groupby_columns = form_data.get("groupby") or []
if not isinstance(groupby_columns, list):
groupby_columns = [groupby_columns]
```
<details>
<summary><b>Steps of Reproduction ✅ </b></summary>
```mdx
1. Load a chart preview using get_chart_preview tool at
superset/mcp_service/chart/tool/get_chart_preview.py: async
_get_chart_preview_internal(...) which eventually instantiates
PreviewFormatGenerator and
calls a strategy (TablePreviewStrategy or VegaLitePreviewStrategy).
2. The TablePreviewStrategy.generate path calls
_build_query_columns(form_data) (callsite
added at get_chart_preview.py:301 in the PR) passing form_data parsed from
chart.params
(from ChartDAO or transient chart).
3. If the stored chart.params contains "groupby": null or "groupby": "year"
(a scalar) — a
realistic payload when form_data JSON comes from external LLMs or older
saved exports —
_build_query_columns executes groupby_columns: list[str] =
form_data.get("groupby", [])
then columns = groupby_columns.copy() at get_chart_preview.py:64.
4. When groupby_columns is None, None.copy() raises AttributeError; when
groupby_columns
is a string, .copy() is a method error (strings don't have copy), causing
the preview path
to raise and the Table/Vega preview to fail. The failure appears in logs from
get_chart_preview and results in ChartError returned to the caller.
```
</details>
<details>
<summary><b>Prompt for AI Agent 🤖 </b></summary>
```mdx
This is a comment left during a code review.
**Path:** superset/mcp_service/chart/tool/get_chart_preview.py
**Line:** 62:62
**Comment:**
*Type Error: `_build_query_columns` assumes that `form_data["groupby"]`
is always a list, but if it is explicitly `None` or a scalar (e.g. a single
string), calling `.copy()` will raise an AttributeError or produce an incorrect
type, breaking table/vega previews for such charts. Normalizing the value to a
list before copying avoids this runtime failure.
Validate the correctness of the flagged issue. If correct, How can I resolve
this? If you propose a fix, implement it and please make it concise.
```
</details>
##########
superset/mcp_service/chart/tool/get_chart_data.py:
##########
@@ -178,6 +178,17 @@ async def get_chart_data( # noqa: C901
metrics = form_data.get("metrics", [])
groupby_columns = form_data.get("groupby", [])
+ # Build query columns list: include both x_axis and groupby
+ x_axis_config = form_data.get("x_axis")
+ query_columns = groupby_columns.copy()
Review Comment:
**Suggestion:** The code assumes that the `groupby` value from `form_data`
is always a list and calls `.copy()` on it directly; if an existing chart was
saved with `groupby` as `null` or a scalar (string), this will raise an
`AttributeError` at runtime in the fallback path, preventing chart data
retrieval. [type error]
<details>
<summary><b>Severity Level:</b> Major ⚠️</summary>
```mdx
- ❌ get_chart_data MCP tool crashes on fallback charts.
- ⚠️ Chart data previews fail for older unsaved charts.
- ⚠️ Series/group_by chart previews may error inconsistently.
```
</details>
```suggestion
if isinstance(groupby_columns, list):
query_columns = groupby_columns.copy()
elif groupby_columns:
# Handle older or malformed form_data where groupby is a
scalar
query_columns = [groupby_columns]
else:
query_columns = []
```
<details>
<summary><b>Steps of Reproduction ✅ </b></summary>
```mdx
1. Create or find a chart record where chart.query_context is NULL and
chart.params
contains form_data with `"groupby": null` or `"groupby": "year"`. This falls
into the
fallback path in get_chart_data (file
superset/mcp_service/chart/tool/get_chart_data.py).
The fallback form_data is loaded at the line `form_data =
utils_json.loads(chart.params)
if chart.params else {}` (the fallback block around lines ~172-180).
2. Invoke the MCP data tool for that chart by calling the get_chart_data
tool function
(superset/mcp_service/chart/tool/get_chart_data.py:def get_chart_data). Call
it with
request.identifier set to the chart's id so the code loads that chart (the
ChartDAO lookup
happens earlier in the same function).
3. Execution reaches the fallback branch where groupby is read with
`groupby_columns =
form_data.get("groupby", [])` (line 179). If the saved form_data had
`groupby: null` (->
None) or `groupby: "year"` (-> str), groupby_columns becomes None or a
string.
4. The code then executes `query_columns = groupby_columns.copy()` (line
183). This raises
an AttributeError ('NoneType' object has no attribute 'copy' or 'str' object
has no
attribute 'copy'), causing the get_chart_data invocation to fail and return
an error from
the inner except block that logs "Error retrieving chart data" and returns a
ChartError.
```
</details>
<details>
<summary><b>Prompt for AI Agent 🤖 </b></summary>
```mdx
This is a comment left during a code review.
**Path:** superset/mcp_service/chart/tool/get_chart_data.py
**Line:** 183:183
**Comment:**
*Type Error: The code assumes that the `groupby` value from `form_data`
is always a list and calls `.copy()` on it directly; if an existing chart was
saved with `groupby` as `null` or a scalar (string), this will raise an
`AttributeError` at runtime in the fallback path, preventing chart data
retrieval.
Validate the correctness of the flagged issue. If correct, How can I resolve
this? If you propose a fix, implement it and please make it concise.
```
</details>
##########
superset/mcp_service/chart/tool/get_chart_preview.py:
##########
@@ -56,6 +56,22 @@ class ChartLike(Protocol):
uuid: Any
+def _build_query_columns(form_data: Dict[str, Any]) -> list[str]:
+ """Build query columns list from form_data, including both x_axis and
groupby."""
+ x_axis_config = form_data.get("x_axis")
+ groupby_columns: list[str] = form_data.get("groupby", [])
+
+ columns = groupby_columns.copy()
+ if x_axis_config and isinstance(x_axis_config, str):
+ if x_axis_config not in columns:
+ columns.insert(0, x_axis_config)
+ elif x_axis_config and isinstance(x_axis_config, dict):
+ col_name = x_axis_config.get("column_name")
+ if col_name and col_name not in columns:
+ columns.insert(0, col_name)
Review Comment:
**Suggestion:** When `x_axis` is provided as a dict (for example, an ad‑hoc
or SQL expression column configuration), `_build_query_columns` currently
extracts only `column_name` and drops the rest of the configuration; if
`column_name` is missing this silently omits the x‑axis from the query, so
previews for such charts won't include the intended x‑axis dimension. Using the
entire dict as the column specification ensures ad‑hoc x‑axis definitions are
preserved in the query. [logic error]
<details>
<summary><b>Severity Level:</b> Critical 🚨</summary>
```mdx
- ❌ Vega-Lite previews missing ad-hoc x-axis dimension.
- ❌ Table previews omit intended x-axis column.
- ⚠️ Preview SQL GROUP BY incorrect for ad-hoc x_axis.
```
</details>
```suggestion
if x_axis_config not in columns:
columns.insert(0, x_axis_config)
```
<details>
<summary><b>Steps of Reproduction ✅ </b></summary>
```mdx
1. Create or have a chart whose form_data.x_axis is an ad-hoc or SQL
expression dict (for
example: {"sqlExpression": "DATE_TRUNC('year', date)", "label": "year"})
stored in
chart.params (ChartDAO result used in VegaLitePreviewStrategy or transient
chart creation
in _get_chart_preview_internal).
2. Call get_chart_preview
(superset/mcp_service/chart/tool/get_chart_preview.py), which
selects VegaLitePreviewStrategy.generate or TablePreviewStrategy.generate
and calls
columns = _build_query_columns(form_data) (callsites in
get_chart_preview.py:204 and
get_chart_preview.py:301 in the PR).
3. Inside _build_query_columns (defined at get_chart_preview.py:59), the dict
x_axis_config path looks only for "column_name" (get("column_name")) and if
absent does
not add any column; the ad-hoc dict therefore is silently omitted from the
columns list.
4. QueryContextFactory.create
(superset/common/query_context_factory.py:create at ~line
39) receives a queries[0]["columns"] lacking the x_axis dict, so the
generated SQL and
GROUP BY omit the intended x-axis expression, producing incorrect preview
results (missing
dimension in SELECT/GROUP BY) visible in Table/Vega previews.
```
</details>
<details>
<summary><b>Prompt for AI Agent 🤖 </b></summary>
```mdx
This is a comment left during a code review.
**Path:** superset/mcp_service/chart/tool/get_chart_preview.py
**Line:** 69:71
**Comment:**
*Logic Error: When `x_axis` is provided as a dict (for example, an
ad‑hoc or SQL expression column configuration), `_build_query_columns`
currently extracts only `column_name` and drops the rest of the configuration;
if `column_name` is missing this silently omits the x‑axis from the query, so
previews for such charts won't include the intended x‑axis dimension. Using the
entire dict as the column specification ensures ad‑hoc x‑axis definitions are
preserved in the query.
Validate the correctness of the flagged issue. If correct, How can I resolve
this? If you propose a fix, implement it and please make it concise.
```
</details>
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]