Copilot commented on code in PR #37890:
URL: https://github.com/apache/superset/pull/37890#discussion_r2848858188


##########
superset/db_engine_specs/kusto.py:
##########
@@ -213,3 +264,35 @@ def convert_dttm(
             return f"""datetime({dttm.isoformat(timespec="microseconds")})"""
 
         return None
+
+    @classmethod
+    def execute(
+        cls,
+        cursor: Any,
+        query: str,
+        database: "Database",
+        **kwargs: Any,
+    ) -> None:
+        """
+        Execute a KQL query, fixing ARRAY() wrappers around
+        bracket-quoted identifiers.
+
+        Example:
+            ARRAY(["age"]) -> ["age"]
+            ARRAY(["user_name"]) -> ["user_name"]
+        """
+        import logging
+        import re
+
+        logger = logging.getLogger(__name__)
+        logger.debug("execute: input query: %s", query)
+
+        # Replace ARRAY(["identifier"]) with ["identifier"]
+        processed_query = re.sub(
+            r'ARRAY\(\[("[^"]+")\]\)',

Review Comment:
   The regex pattern `r'ARRAY\(\[("[^"]+")\]\)'` has a potential issue. It uses 
`[^"]+` which matches one or more non-quote characters, but this won't handle 
edge cases properly:
   
   1. It won't match identifiers with escaped quotes inside them (e.g., 
`["col\"name"]`)
   2. It requires at least one character between quotes, so `ARRAY([""])` 
wouldn't match
   3. The pattern is too restrictive for bracket-quoted identifiers in KQL
   
   Consider using a more robust pattern like 
`r'ARRAY\(\[("(?:[^"\\]|\\.)*")\]\)'` which handles escaped characters within 
quotes, or if identifiers can't contain quotes, use `r'ARRAY\(\[("[^"]*")\]\)'` 
to allow empty identifiers.
   ```suggestion
               r'ARRAY\(\[("(?:[^"\\]|\\.)*")\]\)',
   ```



##########
superset/db_engine_specs/kusto.py:
##########
@@ -201,6 +212,46 @@ def get_dbapi_exception_mapping(cls) -> 
dict[type[Exception], type[Exception]]:
             kusto_exceptions.ProgrammingError: SupersetDBAPIProgrammingError,
         }
 
+    @classmethod
+    def handle_null_filter(
+        cls,
+        sqla_col: Any,
+        op: Any,
+    ) -> Any:
+        """
+        Handle null/not null filter operations for KQL.
+
+        In KQL, null checks use functions:
+        - isnull(col) for IS NULL
+        - isnotnull(col) for IS NOT NULL
+
+        :param sqla_col: SQLAlchemy column element
+        :param op: Filter operator (IS_NULL or IS_NOT_NULL)
+        :return: SQLAlchemy expression for the null filter
+        """
+        from superset.utils import core as utils
+
+        if op == utils.FilterOperator.IS_NULL:
+            return text(f"isnull({sqla_col})")
+        if op == utils.FilterOperator.IS_NOT_NULL:
+            return text(f"isnotnull({sqla_col})")

Review Comment:
   Using f-string interpolation with `text()` bypasses SQLAlchemy's query 
compilation and parameter binding mechanisms. While sqla_col is a SQLAlchemy 
ColumnElement, converting it to a string with str() and interpolating it 
directly can lead to issues:
   
   1. It doesn't respect SQLAlchemy's dialect-specific quoting rules
   2. If sqla_col contains complex expressions or subqueries, the string 
representation may not be syntactically valid in all contexts
   3. It prevents SQLAlchemy from applying proper escaping for the target 
database
   
   Use `literal_column()` instead, which preserves SQLAlchemy's expression 
compilation system while generating raw SQL/KQL. For example: `return 
literal_column(f"isnull({sqla_col})")`. This is safer and more maintainable as 
it works with SQLAlchemy's query builder rather than bypassing it.



##########
superset/db_engine_specs/kusto.py:
##########
@@ -201,6 +212,46 @@ def get_dbapi_exception_mapping(cls) -> 
dict[type[Exception], type[Exception]]:
             kusto_exceptions.ProgrammingError: SupersetDBAPIProgrammingError,
         }
 
+    @classmethod
+    def handle_null_filter(
+        cls,
+        sqla_col: Any,
+        op: Any,

Review Comment:
   The type hint for the `op` parameter should be `utils.FilterOperator` (or 
fully qualified as seen in the base class) instead of `Any` to maintain type 
safety and consistency with the base class definition. This ensures proper type 
checking and makes the expected parameter type clear.



##########
superset/db_engine_specs/kusto.py:
##########
@@ -213,3 +264,35 @@ def convert_dttm(
             return f"""datetime({dttm.isoformat(timespec="microseconds")})"""
 
         return None
+
+    @classmethod
+    def execute(
+        cls,
+        cursor: Any,
+        query: str,
+        database: "Database",
+        **kwargs: Any,
+    ) -> None:
+        """
+        Execute a KQL query, fixing ARRAY() wrappers around
+        bracket-quoted identifiers.
+
+        Example:
+            ARRAY(["age"]) -> ["age"]
+            ARRAY(["user_name"]) -> ["user_name"]
+        """
+        import logging
+        import re

Review Comment:
   The imports for `logging` and `re` should be moved to the module level 
rather than being imported within the method. The `re` module is already 
imported at line 17, so the redundant import on line 285 should be removed. The 
`logging` module should be imported at the top of the file alongside other 
module-level imports, following the codebase convention seen in other 
db_engine_specs files.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to