codeant-ai-for-open-source[bot] commented on code in PR #36856:
URL: https://github.com/apache/superset/pull/36856#discussion_r2651188849


##########
superset/db_engine_specs/snowflake.py:
##########
@@ -135,6 +161,71 @@ class SnowflakeEngineSpec(PostgresBaseEngineSpec):
         ),
     }
 
+    # OAuth 2.0 support
+    supports_oauth2 = True
+    oauth2_exception = CustomSnowflakeAuthError
+
+    @classmethod
+    def impersonate_user(
+        cls,
+        database: Database,
+        username: str | None,
+        user_token: str | None,
+        url: URL,
+        engine_kwargs: dict[str, Any],
+    ) -> tuple[URL, dict[str, Any]]:
+        """
+        Modify URL and/or engine kwargs to impersonate a different user.
+        """
+        connect_args = engine_kwargs.setdefault("connect_args", {})
+
+        # When test_connection is executed (i.e., when 
validate_default_parameters is
+        # set to True in connect_args), authentication via OAuth is not 
performed.
+        if (
+            not connect_args.get("validate_default_parameters", False)
+            and cls.is_oauth2_enabled()
+        ):
+            url = url.update_query_dict({"authenticator": "oauth"})
+            connect_args["authenticator"] = "oauth"
+
+        if user_token:
+            if username is not None:
+                user = security_manager.find_user(username=username)
+                if user and user.email:
+                    if is_feature_enabled("IMPERSONATE_WITH_EMAIL_PREFIX"):
+                        url = url.set(username=user.email.split("@")[0])
+                    else:
+                        url = url.set(username=user.email)
+
+            url = url.update_query_dict({"token": user_token})
+
+        return url, engine_kwargs
+
+    @classmethod
+    def get_oauth2_authorization_uri(
+        cls,
+        config: OAuth2ClientConfig,
+        state: OAuth2State,
+    ) -> str:
+        """
+        Return URI for initial OAuth2 request.
+        """
+        uri = config["authorization_request_uri"]
+        # When calling the Snowflake OAuth authorization endpoint for a custom 
client,
+        # specify only the query parameters documented in the URL below.
+        # Adding unsupported parameters
+        # (e.g., `prompt` as used in 
BaseEngineSpec.get_oauth2_authorization_uri)
+        # will cause an error.
+        # https://docs.snowflake.com/user-guide/oauth-custom#query-parameters
+        params = {
+            "scope": config["scope"],
+            "response_type": "code",
+            "state": encode_oauth2_state(state),
+            "redirect_uri": config["redirect_uri"],
+            "client_id": config["id"],
+        }
+        return parse.urljoin(uri, "?" + parse.urlencode(params))

Review Comment:
   **Suggestion:** Bug in building the authorization URI: using 
urllib.parse.urljoin with a query string will drop the path portion of 
`authorization_request_uri` (urljoin treats a leading '?' as a network-location 
relative URL), causing the OAuth2 authorization endpoint path to be removed and 
resulting in invalid URIs. Construct the URL by parsing and updating the query 
component to preserve the original path. [logic error]
   
   **Severity Level:** Minor ⚠️
   ```suggestion
           # Build the URL by updating the query component to preserve the 
original path.
           parsed = parse.urlparse(uri)
           existing_qs = dict(parse.parse_qsl(parsed.query, 
keep_blank_values=True))
           existing_qs.update(params)
           new_qs = parse.urlencode(existing_qs)
           return parse.urlunparse(parsed._replace(query=new_qs))
   ```
   <details>
   <summary><b>Why it matters? ⭐ </b></summary>
   
   This is a real logic bug: urllib.parse.urljoin(uri, '?...') can replace the 
path component and yield an authorization endpoint with the wrong path (or no 
path) depending on the base URI, producing invalid OAuth2 redirects. The 
proposed change to parse the URI and update its query component preserves the 
original path and merges params correctly. This directly fixes a correctness 
issue visible in the PR.
   </details>
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** superset/db_engine_specs/snowflake.py
   **Line:** 227:227
   **Comment:**
        *Logic Error: Bug in building the authorization URI: using 
urllib.parse.urljoin with a query string will drop the path portion of 
`authorization_request_uri` (urljoin treats a leading '?' as a network-location 
relative URL), causing the OAuth2 authorization endpoint path to be removed and 
resulting in invalid URIs. Construct the URL by parsing and updating the query 
component to preserve the original path.
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   ```
   </details>



##########
superset/db_engine_specs/snowflake.py:
##########
@@ -33,19 +33,45 @@
 from sqlalchemy import types
 from sqlalchemy.engine.reflection import Inspector
 from sqlalchemy.engine.url import URL
+from sqlalchemy.exc import DatabaseError as SqlalchemyDatabaseError
 
+from superset import is_feature_enabled, security_manager
 from superset.constants import TimeGrain
 from superset.databases.utils import make_url_safe
 from superset.db_engine_specs.base import BaseEngineSpec, BasicPropertiesType
 from superset.db_engine_specs.postgres import PostgresBaseEngineSpec
 from superset.errors import ErrorLevel, SupersetError, SupersetErrorType
 from superset.models.sql_lab import Query
+from superset.superset_typing import (
+    OAuth2ClientConfig,
+    OAuth2State,
+)
 from superset.utils import json
 from superset.utils.core import get_user_agent, QuerySource
+from superset.utils.oauth2 import encode_oauth2_state
 
 if TYPE_CHECKING:
     from superset.models.core import Database
 
+try:
+    from snowflake.connector.errors import DatabaseError
+except ImportError:
+    DatabaseError = Exception

Review Comment:
   **Suggestion:** Overly-broad fallback when the snowflake connector is not 
installed: assigning `DatabaseError = Exception` will make the custom exception 
matcher treat many unrelated exceptions as Snowflake errors (since almost 
anything is an Exception). Use a distinct sentinel exception class to avoid 
false-positive matches when Snowflake isn't available. [possible bug]
   
   **Severity Level:** Critical 🚨
   ```suggestion
       # Use a distinct sentinel type when snowflake is not installed to avoid
       # matching unrelated exception types (using `Exception` would be too 
broad).
       class _SnowflakeDatabaseError(Exception):
           """Sentinel type to stand in for 
snowflake.connector.errors.DatabaseError."""
           pass
       DatabaseError = _SnowflakeDatabaseError
   ```
   <details>
   <summary><b>Why it matters? ⭐ </b></summary>
   
   Assigning DatabaseError = Exception when the snowflake package is absent 
will make isinstance(..., DatabaseError) match basically any Exception, 
producing false positives in the custom exception meta-class logic. Defining a 
distinct sentinel exception type avoids accidental matches while keeping 
runtime behavior when the connector is missing.
   </details>
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** superset/db_engine_specs/snowflake.py
   **Line:** 59:59
   **Comment:**
        *Possible Bug: Overly-broad fallback when the snowflake connector is 
not installed: assigning `DatabaseError = Exception` will make the custom 
exception matcher treat many unrelated exceptions as Snowflake errors (since 
almost anything is an Exception). Use a distinct sentinel exception class to 
avoid false-positive matches when Snowflake isn't available.
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   ```
   </details>



##########
superset/db_engine_specs/snowflake.py:
##########
@@ -377,6 +468,18 @@ def update_params_from_encrypted_extra(
         database: "Database",
         params: dict[str, Any],
     ) -> None:
+        # To use OAuth authentication, a database connection must first be 
created using
+        # another authenticator (typically key-pair authentication)
+        # with “Impersonate logged in user” enabled.
+        # Key-pair authentication is used for connection tests,
+        # while OAuth authentication is used when executing actual queries,
+        # such as in SQL Lab or dashboards.
+        # Therefore, when using OAuth authentication, the key-pair 
authentication
+        # settings are not loaded, and the connection is established using 
OAuth only.
+        connects_args = params.get("connect_args", {})
+        if connects_args.get("authenticator") == "oauth":

Review Comment:
   **Suggestion:** Variable name typo and inconsistent handling of connect 
args: `connects_args` is a misspelling and may be confusing; use `connect_args` 
consistently so the early check references the same semantics as the rest of 
the function and avoids accidental logical errors when `params` doesn't contain 
`connect_args`. [possible bug]
   
   **Severity Level:** Critical 🚨
   ```suggestion
           connect_args = params.get("connect_args") or {}
           if connect_args.get("authenticator") == "oauth":
   ```
   <details>
   <summary><b>Why it matters? ⭐ </b></summary>
   
   The code uses a different variable name ("connects_args") here and 
"connect_args" later, which is confusing and risks maintainability issues. 
While it doesn't currently introduce a runtime bug (both read from params), 
normalizing to a single name prevents accidental bugs if the code is modified 
later and improves readability.
   </details>
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** superset/db_engine_specs/snowflake.py
   **Line:** 479:480
   **Comment:**
        *Possible Bug: Variable name typo and inconsistent handling of connect 
args: `connects_args` is a misspelling and may be confusing; use `connect_args` 
consistently so the early check references the same semantics as the rest of 
the function and avoids accidental logical errors when `params` doesn't contain 
`connect_args`.
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   ```
   </details>



##########
tests/unit_tests/db_engine_specs/test_snowflake.py:
##########
@@ -438,3 +439,122 @@ def test_unmask_encrypted_extra() -> None:
             },
         }
     )
+
+
[email protected]
+def oauth2_config() -> OAuth2ClientConfig:
+    """
+    Config for Snowflake OAuth2.
+    """
+    return {
+        "id": "snowflake-oauth2-client-id",
+        "secret": "snowflake-oauth2-client-secret",
+        "scope": "refesh_token",

Review Comment:
   **Suggestion:** Typo in the OAuth2 `scope` value: it's spelled 
"refesh_token" but should be "refresh_token". This will cause the token request 
to send the wrong scope and can lead to authentication failures or unexpected 
token responses. [logic error]
   
   **Severity Level:** Minor ⚠️
   ```suggestion
           "scope": "refresh_token",
   ```
   <details>
   <summary><b>Why it matters? ⭐ </b></summary>
   
   This is a clear typo in the test fixture ("refesh_token" → "refresh_token"). 
Fixing it makes the fixture accurate and avoids confusion or subtle test drift 
if production code ever marshals the scope through. It's a 
correctness/readability fix with negligible risk.
   </details>
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** tests/unit_tests/db_engine_specs/test_snowflake.py
   **Line:** 452:452
   **Comment:**
        *Logic Error: Typo in the OAuth2 `scope` value: it's spelled 
"refesh_token" but should be "refresh_token". This will cause the token request 
to send the wrong scope and can lead to authentication failures or unexpected 
token responses.
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   ```
   </details>



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to