codeant-ai-for-open-source[bot] commented on code in PR #37639:
URL: https://github.com/apache/superset/pull/37639#discussion_r2761890858


##########
superset/mcp_service/chart/preview_utils.py:
##########
@@ -36,6 +36,23 @@
 logger = logging.getLogger(__name__)
 
 
+def _build_query_columns(form_data: Dict[str, Any]) -> list[str]:
+    """Build query columns list from form_data, including both x_axis and 
groupby."""
+    x_axis_config = form_data.get("x_axis")
+    groupby_columns: list[str] = form_data.get("groupby", [])
+    raw_columns: list[str] = form_data.get("columns", [])

Review Comment:
   **Suggestion:** If `form_data["columns"]` exists but is `None` (or another 
non-list type), assigning it directly to `raw_columns` and then calling 
`.copy()` will raise an `AttributeError`, breaking previews for such requests 
instead of safely falling back to an empty list. [possible bug]
   
   <details>
   <summary><b>Severity Level:</b> Major ⚠️</summary>
   
   ```mdx
   - ❌ Preview generation errors when "columns" key is null
   - ⚠️ Table and XY previews may be blocked intermittently
   ```
   </details>
   
   ```suggestion
       raw_columns = form_data.get("columns") or []
       if not isinstance(raw_columns, list):
           raw_columns = []
   ```
   <details>
   <summary><b>Steps of Reproduction ✅ </b></summary>
   
   ```mdx
   1. Invoke generate_preview_from_form_data(form_data, dataset_id, 
preview_format) in
   superset/mcp_service/chart/preview_utils.py (caller uses 
_build_query_columns at line 86).
   
   2. Provide form_data that contains the "columns" key with a null or non-list 
value, e.g.
   form_data = {"viz_type": "table", "columns": None}. The presence of the key 
is important.
   
   3. Inside _build_query_columns (lines 39-53), line 43 sets raw_columns =
   form_data.get("columns", []) which yields None. Because "columns" key 
exists, the later
   expression at line 45 uses raw_columns.copy() and attempts to call .copy() 
on None.
   
   4. This causes AttributeError: 'NoneType' object has no attribute 'copy', 
failing preview
   creation along the same path as ChartDataCommand execution. This reproduces 
whenever a
   request includes "columns": null.
   ```
   </details>
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** superset/mcp_service/chart/preview_utils.py
   **Line:** 43:43
   **Comment:**
        *Possible Bug: If `form_data["columns"]` exists but is `None` (or 
another non-list type), assigning it directly to `raw_columns` and then calling 
`.copy()` will raise an `AttributeError`, breaking previews for such requests 
instead of safely falling back to an empty list.
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   ```
   </details>



##########
superset/mcp_service/chart/preview_utils.py:
##########
@@ -36,6 +36,23 @@
 logger = logging.getLogger(__name__)
 
 
+def _build_query_columns(form_data: Dict[str, Any]) -> list[str]:
+    """Build query columns list from form_data, including both x_axis and 
groupby."""
+    x_axis_config = form_data.get("x_axis")
+    groupby_columns: list[str] = form_data.get("groupby", [])
+    raw_columns: list[str] = form_data.get("columns", [])
+
+    columns = raw_columns.copy() if "columns" in form_data else 
groupby_columns.copy()
+    if x_axis_config and isinstance(x_axis_config, str):
+        if x_axis_config not in columns:
+            columns.insert(0, x_axis_config)
+    elif x_axis_config and isinstance(x_axis_config, dict):
+        col_name = x_axis_config.get("column_name")
+        if col_name and col_name not in columns:
+            columns.insert(0, col_name)

Review Comment:
   **Suggestion:** When the x-axis is configured as an ad-hoc column (a dict 
without `column_name` but with `sqlExpression`), the current logic ignores it 
entirely, so the generated query columns omit the x-axis dimension and the 
preview query may not match the chart configuration; adding a fallback to 
include the dict itself when `column_name` is missing fixes this. [logic error]
   
   <details>
   <summary><b>Severity Level:</b> Major ⚠️</summary>
   
   ```mdx
   - ❌ x_axis omitted from preview query for ad-hoc expressions
   - ⚠️ Vega-Lite/table previews mismatch when using SQL expressions
   ```
   </details>
   
   ```suggestion
           if col_name:
               if col_name not in columns:
                   columns.insert(0, col_name)
           elif "sqlExpression" in x_axis_config and x_axis_config not in 
columns:
               columns.insert(0, x_axis_config)
   ```
   <details>
   <summary><b>Steps of Reproduction ✅ </b></summary>
   
   ```mdx
   1. Call generate_preview_from_form_data(form_data, dataset_id, 
preview_format) in
   superset/mcp_service/chart/preview_utils.py (the call to 
_build_query_columns is at line
   86).
   
   2. Provide form_data where x_axis is an ad-hoc column dict without 
"column_name" but with
   a SQL expression, e.g. form_data["x_axis"] = {"sqlExpression": "YEAR(date)"} 
(this format
   is used by ad-hoc columns).
   
   3. _build_query_columns (lines 39-53) hits the dict branch at line 49, 
extracts col_name
   (line 50) which is None, and because no fallback exists the ad-hoc dict is 
ignored. The
   resulting columns list does not include the x_axis expression.
   
   4. The QueryContext created from these columns therefore omits the x-axis 
expression from
   SELECT and GROUP BY. The preview (via ChartDataCommand and subsequent 
preview generators)
   shows missing or incorrect x-axis, reproducing for ad-hoc x_axis usages.
   ```
   </details>
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** superset/mcp_service/chart/preview_utils.py
   **Line:** 51:52
   **Comment:**
        *Logic Error: When the x-axis is configured as an ad-hoc column (a dict 
without `column_name` but with `sqlExpression`), the current logic ignores it 
entirely, so the generated query columns omit the x-axis dimension and the 
preview query may not match the chart configuration; adding a fallback to 
include the dict itself when `column_name` is missing fixes this.
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   ```
   </details>



##########
superset/mcp_service/chart/preview_utils.py:
##########
@@ -36,6 +36,23 @@
 logger = logging.getLogger(__name__)
 
 
+def _build_query_columns(form_data: Dict[str, Any]) -> list[str]:
+    """Build query columns list from form_data, including both x_axis and 
groupby."""
+    x_axis_config = form_data.get("x_axis")
+    groupby_columns: list[str] = form_data.get("groupby", [])

Review Comment:
   **Suggestion:** When `form_data["groupby"]` is present but set to `None` (or 
any non-list value), assigning it directly to `groupby_columns` and then 
calling `.copy()` later will raise an `AttributeError`, causing preview 
generation to fail instead of gracefully treating it as no groupby columns. 
[possible bug]
   
   <details>
   <summary><b>Severity Level:</b> Major ⚠️</summary>
   
   ```mdx
   - ❌ generate_preview_from_form_data preview fails (AttributeError)
   - ⚠️ LLM/form_data with null groupby triggers failure
   ```
   </details>
   
   ```suggestion
       groupby_columns = form_data.get("groupby") or []
       if not isinstance(groupby_columns, list):
           groupby_columns = []
   ```
   <details>
   <summary><b>Steps of Reproduction ✅ </b></summary>
   
   ```mdx
   1. Call generate_preview_from_form_data(...) in
   superset/mcp_service/chart/preview_utils.py (the function defined at the top 
of the file;
   its usage of _build_query_columns is at line 86 in the PR hunk).
   
   2. Pass form_data containing "groupby": null (JSON null -> Python None). 
Example payload:
   form_data = {"viz_type": "echarts_timeseries_bar", "x_axis": "territory", 
"groupby":
   None}. The call site that triggers this is the line: columns =
   _build_query_columns(form_data) at line 86.
   
   3. Execution enters _build_query_columns (defined in the same file at lines 
39-53). Line
   42 assigns groupby_columns = form_data.get("groupby", []) which returns None 
(because the
   key exists with null).
   
   4. Later, at line 45 (columns = raw_columns.copy() if "columns" in form_data 
else
   groupby_columns.copy()), Python attempts groupby_columns.copy() and raises 
AttributeError:
   'NoneType' object has no attribute 'copy'. The preview generation path blows 
up and
   returns a ChartError fallback. This reproduces deterministically when 
"groupby" is present
   and null.
   ```
   </details>
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** superset/mcp_service/chart/preview_utils.py
   **Line:** 42:42
   **Comment:**
        *Possible Bug: When `form_data["groupby"]` is present but set to `None` 
(or any non-list value), assigning it directly to `groupby_columns` and then 
calling `.copy()` later will raise an `AttributeError`, causing preview 
generation to fail instead of gracefully treating it as no groupby columns.
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   ```
   </details>



##########
superset/mcp_service/chart/tool/get_chart_preview.py:
##########
@@ -56,6 +56,22 @@ class ChartLike(Protocol):
     uuid: Any
 
 
+def _build_query_columns(form_data: Dict[str, Any]) -> list[str]:
+    """Build query columns list from form_data, including both x_axis and 
groupby."""
+    x_axis_config = form_data.get("x_axis")
+    groupby_columns: list[str] = form_data.get("groupby", [])

Review Comment:
   **Suggestion:** `_build_query_columns` assumes that `form_data["groupby"]` 
is always a list, but if it is explicitly `None` or a scalar (e.g. a single 
string), calling `.copy()` will raise an AttributeError or produce an incorrect 
type, breaking table/vega previews for such charts. Normalizing the value to a 
list before copying avoids this runtime failure. [type error]
   
   <details>
   <summary><b>Severity Level:</b> Major ⚠️</summary>
   
   ```mdx
   - ❌ Table preview generation may raise exceptions.
   - ❌ Vega-Lite preview queries can be aborted.
   - ⚠️ get_chart_preview tool returns ChartError instead.
   ```
   </details>
   
   ```suggestion
       groupby_columns = form_data.get("groupby") or []
   
       if not isinstance(groupby_columns, list):
           groupby_columns = [groupby_columns]
   ```
   <details>
   <summary><b>Steps of Reproduction ✅ </b></summary>
   
   ```mdx
   1. Load a chart preview using get_chart_preview tool at
   superset/mcp_service/chart/tool/get_chart_preview.py: async
   _get_chart_preview_internal(...) which eventually instantiates 
PreviewFormatGenerator and
   calls a strategy (TablePreviewStrategy or VegaLitePreviewStrategy).
   
   2. The TablePreviewStrategy.generate path calls 
_build_query_columns(form_data) (callsite
   added at get_chart_preview.py:301 in the PR) passing form_data parsed from 
chart.params
   (from ChartDAO or transient chart).
   
   3. If the stored chart.params contains "groupby": null or "groupby": "year" 
(a scalar) — a
   realistic payload when form_data JSON comes from external LLMs or older 
saved exports —
   _build_query_columns executes groupby_columns: list[str] = 
form_data.get("groupby", [])
   then columns = groupby_columns.copy() at get_chart_preview.py:64.
   
   4. When groupby_columns is None, None.copy() raises AttributeError; when 
groupby_columns
   is a string, .copy() is a method error (strings don't have copy), causing 
the preview path
   to raise and the Table/Vega preview to fail. The failure appears in logs from
   get_chart_preview and results in ChartError returned to the caller.
   ```
   </details>
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** superset/mcp_service/chart/tool/get_chart_preview.py
   **Line:** 62:62
   **Comment:**
        *Type Error: `_build_query_columns` assumes that `form_data["groupby"]` 
is always a list, but if it is explicitly `None` or a scalar (e.g. a single 
string), calling `.copy()` will raise an AttributeError or produce an incorrect 
type, breaking table/vega previews for such charts. Normalizing the value to a 
list before copying avoids this runtime failure.
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   ```
   </details>



##########
superset/mcp_service/chart/tool/get_chart_data.py:
##########
@@ -178,6 +178,17 @@ async def get_chart_data(  # noqa: C901
                     metrics = form_data.get("metrics", [])
                     groupby_columns = form_data.get("groupby", [])
 
+                # Build query columns list: include both x_axis and groupby
+                x_axis_config = form_data.get("x_axis")
+                query_columns = groupby_columns.copy()

Review Comment:
   **Suggestion:** The code assumes that the `groupby` value from `form_data` 
is always a list and calls `.copy()` on it directly; if an existing chart was 
saved with `groupby` as `null` or a scalar (string), this will raise an 
`AttributeError` at runtime in the fallback path, preventing chart data 
retrieval. [type error]
   
   <details>
   <summary><b>Severity Level:</b> Major ⚠️</summary>
   
   ```mdx
   - ❌ get_chart_data MCP tool crashes on fallback charts.
   - ⚠️ Chart data previews fail for older unsaved charts.
   - ⚠️ Series/group_by chart previews may error inconsistently.
   ```
   </details>
   
   ```suggestion
                   if isinstance(groupby_columns, list):
                       query_columns = groupby_columns.copy()
                   elif groupby_columns:
                       # Handle older or malformed form_data where groupby is a 
scalar
                       query_columns = [groupby_columns]
                   else:
                       query_columns = []
   ```
   <details>
   <summary><b>Steps of Reproduction ✅ </b></summary>
   
   ```mdx
   1. Create or find a chart record where chart.query_context is NULL and 
chart.params
   contains form_data with `"groupby": null` or `"groupby": "year"`. This falls 
into the
   fallback path in get_chart_data (file 
superset/mcp_service/chart/tool/get_chart_data.py).
   The fallback form_data is loaded at the line `form_data = 
utils_json.loads(chart.params)
   if chart.params else {}` (the fallback block around lines ~172-180).
   
   2. Invoke the MCP data tool for that chart by calling the get_chart_data 
tool function
   (superset/mcp_service/chart/tool/get_chart_data.py:def get_chart_data). Call 
it with
   request.identifier set to the chart's id so the code loads that chart (the 
ChartDAO lookup
   happens earlier in the same function).
   
   3. Execution reaches the fallback branch where groupby is read with 
`groupby_columns =
   form_data.get("groupby", [])` (line 179). If the saved form_data had 
`groupby: null` (->
   None) or `groupby: "year"` (-> str), groupby_columns becomes None or a 
string.
   
   4. The code then executes `query_columns = groupby_columns.copy()` (line 
183). This raises
   an AttributeError ('NoneType' object has no attribute 'copy' or 'str' object 
has no
   attribute 'copy'), causing the get_chart_data invocation to fail and return 
an error from
   the inner except block that logs "Error retrieving chart data" and returns a 
ChartError.
   ```
   </details>
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** superset/mcp_service/chart/tool/get_chart_data.py
   **Line:** 183:183
   **Comment:**
        *Type Error: The code assumes that the `groupby` value from `form_data` 
is always a list and calls `.copy()` on it directly; if an existing chart was 
saved with `groupby` as `null` or a scalar (string), this will raise an 
`AttributeError` at runtime in the fallback path, preventing chart data 
retrieval.
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   ```
   </details>



##########
superset/mcp_service/chart/tool/get_chart_preview.py:
##########
@@ -56,6 +56,22 @@ class ChartLike(Protocol):
     uuid: Any
 
 
+def _build_query_columns(form_data: Dict[str, Any]) -> list[str]:
+    """Build query columns list from form_data, including both x_axis and 
groupby."""
+    x_axis_config = form_data.get("x_axis")
+    groupby_columns: list[str] = form_data.get("groupby", [])
+
+    columns = groupby_columns.copy()
+    if x_axis_config and isinstance(x_axis_config, str):
+        if x_axis_config not in columns:
+            columns.insert(0, x_axis_config)
+    elif x_axis_config and isinstance(x_axis_config, dict):
+        col_name = x_axis_config.get("column_name")
+        if col_name and col_name not in columns:
+            columns.insert(0, col_name)

Review Comment:
   **Suggestion:** When `x_axis` is provided as a dict (for example, an ad‑hoc 
or SQL expression column configuration), `_build_query_columns` currently 
extracts only `column_name` and drops the rest of the configuration; if 
`column_name` is missing this silently omits the x‑axis from the query, so 
previews for such charts won't include the intended x‑axis dimension. Using the 
entire dict as the column specification ensures ad‑hoc x‑axis definitions are 
preserved in the query. [logic error]
   
   <details>
   <summary><b>Severity Level:</b> Critical 🚨</summary>
   
   ```mdx
   - ❌ Vega-Lite previews missing ad-hoc x-axis dimension.
   - ❌ Table previews omit intended x-axis column.
   - ⚠️ Preview SQL GROUP BY incorrect for ad-hoc x_axis.
   ```
   </details>
   
   ```suggestion
           if x_axis_config not in columns:
               columns.insert(0, x_axis_config)
   ```
   <details>
   <summary><b>Steps of Reproduction ✅ </b></summary>
   
   ```mdx
   1. Create or have a chart whose form_data.x_axis is an ad-hoc or SQL 
expression dict (for
   example: {"sqlExpression": "DATE_TRUNC('year', date)", "label": "year"}) 
stored in
   chart.params (ChartDAO result used in VegaLitePreviewStrategy or transient 
chart creation
   in _get_chart_preview_internal).
   
   2. Call get_chart_preview 
(superset/mcp_service/chart/tool/get_chart_preview.py), which
   selects VegaLitePreviewStrategy.generate or TablePreviewStrategy.generate 
and calls
   columns = _build_query_columns(form_data) (callsites in 
get_chart_preview.py:204 and
   get_chart_preview.py:301 in the PR).
   
   3. Inside _build_query_columns (defined at get_chart_preview.py:59), the dict
   x_axis_config path looks only for "column_name" (get("column_name")) and if 
absent does
   not add any column; the ad-hoc dict therefore is silently omitted from the 
columns list.
   
   4. QueryContextFactory.create 
(superset/common/query_context_factory.py:create at ~line
   39) receives a queries[0]["columns"] lacking the x_axis dict, so the 
generated SQL and
   GROUP BY omit the intended x-axis expression, producing incorrect preview 
results (missing
   dimension in SELECT/GROUP BY) visible in Table/Vega previews.
   ```
   </details>
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** superset/mcp_service/chart/tool/get_chart_preview.py
   **Line:** 69:71
   **Comment:**
        *Logic Error: When `x_axis` is provided as a dict (for example, an 
ad‑hoc or SQL expression column configuration), `_build_query_columns` 
currently extracts only `column_name` and drops the rest of the configuration; 
if `column_name` is missing this silently omits the x‑axis from the query, so 
previews for such charts won't include the intended x‑axis dimension. Using the 
entire dict as the column specification ensures ad‑hoc x‑axis definitions are 
preserved in the query.
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   ```
   </details>



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to