codeant-ai-for-open-source[bot] commented on code in PR #36350:
URL: https://github.com/apache/superset/pull/36350#discussion_r2582899831


##########
superset/charts/schemas.py:
##########
@@ -1415,6 +1415,12 @@ class ChartDataQueryContextSchema(Schema):
 
     form_data = fields.Raw(allow_none=True, required=False)
 
+    client_id = fields.String(
+        required=False,
+        allow_none=True,
+        metadata={"description": "Optional client-generated id for this chart 
query"},

Review Comment:
   **Suggestion:** Adding `client_id` as a loadable field means 
`ChartDataQueryContextSchema().load()` will include `client_id` in the returned 
data and then pass it as an unexpected keyword argument to 
`QueryContextFactory.create(**data)`, causing a runtime TypeError whenever a 
client sends this field; to avoid this, the field should not be part of the 
loaded kwargs (e.g. by making it dump-only) until the factory and QueryContext 
are updated to accept it. [type error]
   
   **Severity Level:** Minor ⚠️
   ```suggestion
           dump_only=True,
   ```
   <details>
   <summary><b>Why it matters? ⭐ </b></summary>
   
   This is a real runtime bug, not cosmetics.
   
   `ChartDataQueryContextSchema` does a `@post_load` that calls
   `self.get_query_context_factory().create(**data)`. The current
   `QueryContextFactory.create` signature is:
   
       def create(
           *,
           datasource: DatasourceDict,
           queries: list[dict[str, Any]],
           form_data: dict[str, Any] | None = None,
           result_type: ChartDataResultType | None = None,
           result_format: ChartDataResultFormat | None = None,
           force: bool = False,
           custom_cache_timeout: int | None = None,
       ) -> QueryContext:
   
   There is no `client_id` parameter. With the new field defined as a normal
   loadable field, any request body that includes `client_id` will deserialize
   into `data["client_id"]` and then `create(**data)` will raise:
   
       TypeError: QueryContextFactory.create() got an unexpected keyword 
argument 'client_id'
   
   Marking the field as `dump_only=True` keeps it in the schema (for output
   / documentation) but excludes it from deserialization, so `client_id`
   will not be included in `data` and no unexpected-kwarg error occurs.
   Given that `QueryContext` and `QueryContextFactory` do not currently use
   a client id at all, dropping it on load is functionally safe and prevents
   a production TypeError.
   
   So this change directly fixes a concrete runtime failure and is therefore
   not undesirable.
   </details>
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** superset/charts/schemas.py
   **Line:** 1421:1421
   **Comment:**
        *Type Error: Adding `client_id` as a loadable field means 
`ChartDataQueryContextSchema().load()` will include `client_id` in the returned 
data and then pass it as an unexpected keyword argument to 
`QueryContextFactory.create(**data)`, causing a runtime TypeError whenever a 
client sends this field; to avoid this, the field should not be part of the 
loaded kwargs (e.g. by making it dump-only) until the factory and QueryContext 
are updated to accept it.
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   ```
   </details>



##########
superset-frontend/src/explore/components/ExploreViewContainer/index.jsx:
##########
@@ -398,9 +399,22 @@ function ExploreViewContainer(props) {
   );
 
   function onStop() {
-    if (props.chart && props.chart.queryController) {
+    // Method 1: Abort the in-flight HTTP request using AbortController
+    if (props.chart?.queryController) {
       props.chart.queryController.abort();
     }
+
+    // Method 2: Send stop request to backend if we have a query ID
+    const queryId = props.chart?.latestQueryId;
+    if (queryId) {
+      SupersetClient.post({
+        endpoint: '/api/v1/query/stop',
+        body: JSON.stringify({ client_id: queryId }),
+        headers: { 'Content-Type': 'application/json' },
+      }).catch(error => {

Review Comment:
   **Suggestion:** The stop handler reads `latestQueryId` from the chart state, 
but `ChartState` (and the chart reducer) never defines or sets this property, 
so `queryId` will typically be `undefined` and the `/api/v1/query/stop` call 
will never execute; instead, derive the query id from the existing 
`queriesResponse` array (which contains `QueryData` objects with `query_id`) so 
that the backend stop endpoint is actually invoked for the running query. 
[logic error]
   
   **Severity Level:** Minor ⚠️
   ```suggestion
       const queryId =
         props.chart?.queriesResponse && props.chart.queriesResponse.length > 0
           ? props.chart.queriesResponse[0]?.query_id
           : undefined;
   
       if (queryId) {
         SupersetClient.post({
           endpoint: '/api/v1/query/stop',
           body: JSON.stringify({ client_id: queryId }),
           headers: { 'Content-Type': 'application/json' },
         }).catch(() => {
   ```
   <details>
   <summary><b>Why it matters? ⭐ </b></summary>
   
   In the actual Explore chart state (`ChartState` from `src/explore/types.ts` 
and the `chartReducer`),
   there is no `latestQueryId` field at all: the reducer never sets it, and the 
type definition omits it.
   That means `props.chart?.latestQueryId` will almost always be `undefined`, 
so the `/api/v1/query/stop`
   branch *never runs* and the backend stop endpoint is effectively dead code.
   
   The suggestion switches to deriving the identifier from 
`props.chart.queriesResponse[0]?.query_id`.
   `queriesResponse` is a real, well-defined part of `ChartState`, and 
`QueryData` in
   `@superset-ui/core` has an optional `query_id` field, so at least this code 
is operating on
   actual, populated state instead of a fantasy property. It eliminates a 
genuine logic bug
   (reading a non-existent field and silently never sending the stop request) 
and replaces it
   with a call based on the only available query identifier, without 
introducing syntax or
   type errors. This is a behavioral fix, not just cosmetic.
   </details>
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** 
superset-frontend/src/explore/components/ExploreViewContainer/index.jsx
   **Line:** 408:414
   **Comment:**
        *Logic Error: The stop handler reads `latestQueryId` from the chart 
state, but `ChartState` (and the chart reducer) never defines or sets this 
property, so `queryId` will typically be `undefined` and the 
`/api/v1/query/stop` call will never execute; instead, derive the query id from 
the existing `queriesResponse` array (which contains `QueryData` objects with 
`query_id`) so that the backend stop endpoint is actually invoked for the 
running query.
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   ```
   </details>



##########
superset/common/query_context.py:
##########
@@ -122,8 +123,10 @@ def get_df_payload(
             force_cached=force_cached,
         )
 
-    def get_query_result(self, query_object: QueryObject) -> QueryResult:
-        return self._processor.get_query_result(query_object)
+    def get_query_result(
+        self, query_object: QueryObject, query: Query | None = None
+    ) -> QueryResult:
+        return self._processor.get_query_result(query_object, query=query)

Review Comment:
   **Suggestion:** The new `get_query_result` method forwards a `query` keyword 
argument to `_processor.get_query_result`, but 
`QueryContextProcessor.get_query_result` only accepts a single `query_object` 
parameter; this will raise a `TypeError` at runtime whenever 
`QueryContext.get_query_result` is called, preventing queries from running. 
[type error]
   
   **Severity Level:** Minor ⚠️
   ```suggestion
           return self._processor.get_query_result(query_object)
   ```
   <details>
   <summary><b>Why it matters? ⭐ </b></summary>
   
   This is a real, concrete runtime bug. In `QueryContextProcessor` the method
   is defined as:
   
       def get_query_result(self, query_object: QueryObject) -> QueryResult:
   
   It takes only `query_object`. The new wrapper in `QueryContext` forwards a
   `query=` keyword argument:
   
       self._processor.get_query_result(query_object, query=query)
   
   which will raise `TypeError: QueryContextProcessor.get_query_result()
   got an unexpected keyword argument 'query'` the moment the optional
   parameter is used. The suggested change removes the unsupported keyword
   and calls `get_query_result` with only the allowed argument, eliminating
   the TypeError while preserving the wrapper's signature. This directly
   fixes a correctness issue, not a stylistic nit.
   </details>
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** superset/common/query_context.py
   **Line:** 129:129
   **Comment:**
        *Type Error: The new `get_query_result` method forwards a `query` 
keyword argument to `_processor.get_query_result`, but 
`QueryContextProcessor.get_query_result` only accepts a single `query_object` 
parameter; this will raise a `TypeError` at runtime whenever 
`QueryContext.get_query_result` is called, preventing queries from running.
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   ```
   </details>



##########
superset/common/query_context_processor.py:
##########
@@ -219,15 +279,17 @@ def query_cache_key(self, query_obj: QueryObject, 
**kwargs: Any) -> str | None:
         )
         return cache_key
 
-    def get_query_result(self, query_object: QueryObject) -> QueryResult:
+    def get_query_result(
+        self, query_object: QueryObject, query: Query | None = None
+    ) -> QueryResult:
         """
         Returns a pandas dataframe based on the query object.
 
         This method delegates to the datasource's get_query_result method,
         which handles query execution, normalization, time offsets, and
         post-processing.
         """
-        return self._qc_datasource.get_query_result(query_object)
+        return self._qc_datasource.get_query_result(query_object, query=query)

Review Comment:
   **Suggestion:** The wrapper `get_query_result` passes an extra `query` 
argument down to the datasource's `get_query_result` method, but 
`BaseDatasource.get_query_result` only accepts a single `query_object` 
parameter, so this will raise a TypeError at runtime when executing chart 
queries. [type error]
   
   **Severity Level:** Minor ⚠️
   ```suggestion
   
           The optional `query` parameter is accepted for compatibility with
           callers but is not passed down, since datasources currently do not
           support it.
           """
           return self._qc_datasource.get_query_result(query_object)
   ```
   <details>
   <summary><b>Why it matters? ⭐ </b></summary>
   
   `BaseDatasource.get_query_result` (see superset/models/helpers.py) is 
defined as
   `def get_query_result(self, query_object: QueryObject) -> QueryResult` and 
does
   not accept a `query` parameter. The PR's wrapper now calls
   `self._qc_datasource.get_query_result(query_object, query=query)`, which will
   raise a `TypeError: get_query_result() takes 2 positional arguments but 3 
were
   given` at runtime for all chart queries. The suggested change keeps the 
wrapper
   signature (so callers like `get_df_payload(..., query=query_model)` still 
compile)
   but stops passing the extra argument down to datasources, matching the actual
   method signature and eliminating a real runtime error. This is a concrete 
bug fix,
   not cosmetic.
   </details>
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** superset/common/query_context_processor.py
   **Line:** 291:292
   **Comment:**
        *Type Error: The wrapper `get_query_result` passes an extra `query` 
argument down to the datasource's `get_query_result` method, but 
`BaseDatasource.get_query_result` only accepts a single `query_object` 
parameter, so this will raise a TypeError at runtime when executing chart 
queries.
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   ```
   </details>



##########
superset/connectors/sqla/models.py:
##########
@@ -1637,16 +1637,16 @@ def _get_top_groups(
 
         return or_(*groups)
 
-    def query(self, query_obj: QueryObjectDict) -> QueryResult:
+    def query(
+        self, query_obj: QueryObjectDict, query: "Query" | None = None
+    ) -> QueryResult:
         """
         Executes the query for SqlaTable with additional column ordering logic.
 
         This overrides ExploreMixin.query() to add SqlaTable-specific behavior
         for handling column_order from extras.
         """
-        # Get the base result from ExploreMixin
-        # (explicitly, not super() which would hit BaseDatasource first)
-        result = ExploreMixin.query(self, query_obj)
+        result = ExploreMixin.query(self, query_obj, query=query)

Review Comment:
   **Suggestion:** The overridden query method passes an unexpected keyword 
argument `query` into `ExploreMixin.query`, whose current signature only 
accepts `(self, query_obj)`; this will raise a TypeError at runtime as soon as 
SqlaTable.query is invoked, preventing any chart query from executing. [type 
error]
   
   **Severity Level:** Minor ⚠️
   ```suggestion
           result = ExploreMixin.query(self, query_obj)
   ```
   <details>
   <summary><b>Why it matters? ⭐ </b></summary>
   
   This is a real runtime bug, not cosmetic nonsense. ExploreMixin.query is 
defined
   in superset/models/helpers.py as `def query(self, query_obj: 
QueryObjectDict) ->
   QueryResult` and does not accept a `query` parameter or **kwargs. Calling it 
as
   `ExploreMixin.query(self, query_obj, query=query)` will raise a `TypeError:
   query() got an unexpected keyword argument 'query'` the moment
   SqlaTable.query is used. Dropping the bogus keyword so the call matches the
   actual signature fixes a concrete, verifiable error that would break chart
   queries.
   </details>
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** superset/connectors/sqla/models.py
   **Line:** 1649:1649
   **Comment:**
        *Type Error: The overridden query method passes an unexpected keyword 
argument `query` into `ExploreMixin.query`, whose current signature only 
accepts `(self, query_obj)`; this will raise a TypeError at runtime as soon as 
SqlaTable.query is invoked, preventing any chart query from executing.
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   ```
   </details>



##########
superset/models/helpers.py:
##########
@@ -1182,6 +1185,7 @@ def assign_column_label(df: pd.DataFrame) -> 
Optional[pd.DataFrame]:
                 self.catalog,
                 self.schema,
                 mutator=assign_column_label,
+                query=query,

Review Comment:
   **Suggestion:** The call to the database layer passes an unexpected keyword 
argument `query` to `Database.get_df`, whose current signature does not accept 
this parameter, causing a `TypeError` at runtime when executing queries via 
this mixin. [type error]
   
   **Severity Level:** Minor ⚠️
   ```suggestion
   
   ```
   <details>
   <summary><b>Why it matters? ⭐ </b></summary>
   
   In the current code, `ExploreMixin.query` calls `self.database.get_df(..., 
query=query)`, but
   `Database.get_df` is defined as `get_df(self, sql, catalog=None, 
schema=None, mutator=None)`
   and does not accept a `query` keyword argument. This will cause Python to 
raise
   `TypeError: get_df() got an unexpected keyword argument 'query'` at runtime 
whenever this path
   is executed. The suggested change (removing `query=query`) directly fixes a 
real, verified
   runtime bug rather than being cosmetic.
   </details>
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** superset/models/helpers.py
   **Line:** 1188:1188
   **Comment:**
        *Type Error: The call to the database layer passes an unexpected 
keyword argument `query` to `Database.get_df`, whose current signature does not 
accept this parameter, causing a `TypeError` at runtime when executing queries 
via this mixin.
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   ```
   </details>



##########
superset/common/query_context_processor.py:
##########
@@ -199,6 +250,15 @@ def get_df_payload(
             "label_map": label_map,
         }
 
+        # expose client id so callers (and ultimately the frontend) can stop 
queries
+        if query_model is not None:

Review Comment:
   **Suggestion:** The variable `query_model` is only defined inside the 
cache-miss branch, but is later accessed unconditionally when building the 
payload, so when the result comes from cache (or the branch is skipped) this 
will raise an UnboundLocalError at `if query_model is not None`, breaking 
cached chart requests. [logic error]
   
   **Severity Level:** Minor ⚠️
   ```suggestion
           if "query_model" in locals() and query_model is not None:
   ```
   <details>
   <summary><b>Why it matters? ⭐ </b></summary>
   
   In the current implementation `query_model` is only defined inside the 
cache-miss
   branch in `get_df_payload`. When the result is served from cache (or the 
whole
   branch is skipped), the later `if query_model is not None:` reads a local 
variable
   that was never assigned, which will raise `UnboundLocalError` and blow up 
cached
   chart responses. Guarding the access with `if "query_model" in locals() and
   query_model is not None:` avoids reading an uninitialized local, fixing a 
real
   runtime bug, not just style. A cleaner alternative would be initializing
   `query_model = None` at the start of the function, but the proposed change 
is still
   functionally correct and addresses a genuine issue.
   </details>
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** superset/common/query_context_processor.py
   **Line:** 254:254
   **Comment:**
        *Logic Error: The variable `query_model` is only defined inside the 
cache-miss branch, but is later accessed unconditionally when building the 
payload, so when the result comes from cache (or the branch is skipped) this 
will raise an UnboundLocalError at `if query_model is not None`, breaking 
cached chart requests.
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   ```
   </details>



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to