codeant-ai-for-open-source[bot] commented on code in PR #37890:
URL: https://github.com/apache/superset/pull/37890#discussion_r2985810114
##########
superset/db_engine_specs/kusto.py:
##########
@@ -213,3 +266,33 @@ def convert_dttm(
return f"""datetime({dttm.isoformat(timespec="microseconds")})"""
return None
+
+ @classmethod
+ def execute(
+ cls,
+ cursor: Any,
+ query: str,
+ database: "Database",
+ **kwargs: Any,
+ ) -> None:
+ """
+ Execute a KQL query, fixing ARRAY() wrappers around
+ bracket-quoted identifiers.
+
+ Example:
+ ARRAY(["age"]) -> ["age"]
+ ARRAY(["user_name"]) -> ["user_name"]
+ """
+ # Replace ARRAY(["identifier"]) with ["identifier"]
+ processed_query, num_replacements = re.subn(
+ r'ARRAY\(\[("(?:[^"\\]|\\.)*")\]\)',
Review Comment:
**Suggestion:** The `ARRAY(...)` cleanup regex can rewrite substrings inside
longer identifiers/functions (for example `PACK_ARRAY(["x"])`), producing
invalid KQL like `PACK_["x"]`. Restrict the match so `ARRAY` is treated as a
standalone token before replacing. [logic error]
<details>
<summary><b>Severity Level:</b> Major ⚠️</summary>
```mdx
- ⚠️ The text of KQL queries run from SQL Lab can be corrupted.
- ⚠️ Affects `kustokql` execution path before cursor execution.
```
</details>
```suggestion
r'(?<![A-Za-z0-9_])ARRAY\(\[("(?:[^"\\]|\\.)*")\]\)',
```
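The effect of the lookbehind can be checked in isolation. The following is a minimal sketch, not the PR's code: the diff above is cut off before the replacement argument of `re.subn`, so the replacement string `[\1]` and the helper name `strip_array_wrappers` are assumptions made for illustration.

```python
import re

# Pattern as it appears in the PR diff.
ORIGINAL = r'ARRAY\(\[("(?:[^"\\]|\\.)*")\]\)'

# Pattern from the suggestion: the negative lookbehind rejects a match when
# ARRAY is immediately preceded by an identifier character.
GUARDED = r'(?<![A-Za-z0-9_])ARRAY\(\[("(?:[^"\\]|\\.)*")\]\)'

# Assumed replacement (not visible in the diff): keep only the
# bracket-quoted identifier captured by group 1.
REPLACEMENT = r"[\1]"


def strip_array_wrappers(pattern: str, query: str) -> tuple[str, int]:
    """Hypothetical helper: return (rewritten query, number of replacements)."""
    return re.subn(pattern, REPLACEMENT, query)


pack_query = 'T | extend y = PACK_ARRAY(["x"]) | take 1'
plain_query = 'T | project ARRAY(["age"])'

# The original pattern matches inside PACK_ARRAY and corrupts the call.
print(strip_array_wrappers(ORIGINAL, pack_query))

# The guarded pattern leaves PACK_ARRAY alone ...
print(strip_array_wrappers(GUARDED, pack_query))

# ... and still unwraps a standalone ARRAY(["..."]).
print(strip_array_wrappers(GUARDED, plain_query))
```

Under these assumptions, only the first call changes the `PACK_ARRAY` query; the guarded pattern reports zero replacements for it and one for the standalone wrapper.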
<details>
<summary><b>Steps of Reproduction ✅ </b></summary>
```mdx
1. Execute the KQL engine path by calling `KustoKqlEngineSpec.execute()` in
`superset/db_engine_specs/kusto.py:271`; this method is used by SQL execution flows
(`superset/sql/execution/executor.py:63`, `superset/models/core.py:43`, and SQL Lab via
`superset/db_engine_specs/base.py:1533-1545`).
2. Pass a query containing a longer token ending with `ARRAY(...)`, e.g.
`T | extend y = PACK_ARRAY(["x"]) | take 1`, into `execute()` at
`superset/db_engine_specs/kusto.py:287-291`.
3. The current regex `ARRAY\(\[...\]\)` at `kusto.py:288` matches inside
`PACK_ARRAY(["x"])` and rewrites only that substring; verified by execution: output
becomes `PACK_["x"]` (1 replacement).
4. `super().execute()` (`superset/db_engine_specs/base.py:2107-2141`) then sends the
corrupted query string to `cursor.execute(query)`, so the query text is modified
incorrectly before DB execution.
```
</details>
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]