codeant-ai-for-open-source[bot] commented on code in PR #36850:
URL: https://github.com/apache/superset/pull/36850#discussion_r2649909893


##########
superset/db_engine_specs/databricks.py:
##########
@@ -218,43 +228,221 @@ class DatabricksPythonConnectorPropertiesType(TypedDict):
 
 
 class DatabricksHiveEngineSpec(HiveEngineSpec):
-    engine_name = "Databricks Interactive Cluster"
+    """
+    Engine spec for Databricks Interactive Cluster using PyHive.
+    Note: This inherits from HiveEngineSpec, not DatabricksBaseEngineSpec,
+    to preserve Hive-specific functionality.
+    """
 
+    engine_name = "Databricks Interactive Cluster"
     engine = "databricks"
     drivers = {"pyhive": "Hive driver for Interactive Cluster"}
     default_driver = "pyhive"
 
     _show_functions_column = "function"
-
     _time_grain_expressions = time_grain_expressions
 
+    # OAuth 2.0 support - replicate from DatabricksBaseEngineSpec since we 
inherit from HiveEngineSpec
+    allows_user_impersonation = True
+    supports_oauth2 = True
+    oauth2_scope = "all-apis offline_access"
+    oauth2_exception = RequestError
+
+    @classmethod
+    def is_oauth2_enabled(cls) -> bool:
+        """Check if OAuth2 is enabled for this engine."""
+        return (
+            cls.supports_oauth2
+            and cls.engine_name in app.config.get("DATABASE_OAUTH2_CLIENTS", 
{})
+        )
+
+    @classmethod
+    def needs_oauth2(cls, ex: Exception) -> bool:
+        """Check if the exception indicates that OAuth2 authentication is 
needed."""
+        if not (g and hasattr(g, "user")):
+            return False
+
+        error_msg = str(ex).lower()
+        auth_error_patterns = [
+            "credential was not sent",
+            "unsupported type for this api",
+            "no valid authentication settings",
+            "authentication failed",
+            "invalid access token",
+            "token expired",
+            "unauthorized",
+            "401",
+            "403",
+        ]
+        return any(pattern in error_msg for pattern in auth_error_patterns)
+
+    @classmethod
+    def impersonate_user(
+        cls,
+        database: "Database",
+        username: str | None,
+        user_token: str | None,
+        url: URL,
+        engine_kwargs: dict[str, Any],
+    ) -> tuple[URL, dict[str, Any]]:
+        """Update the connection to use OAuth2 access token."""
+        if user_token:
+            url = url.set(password=user_token)
+        return url, engine_kwargs
+
+    @staticmethod
+    def update_params_from_encrypted_extra(
+        database: "Database", params: dict[str, Any]
+    ) -> None:
+        """Remove OAuth2 client info from engine params."""
+        HiveEngineSpec.update_params_from_encrypted_extra(database, params)
+        if "oauth2_client_info" in params:
+            del params["oauth2_client_info"]
+
 
 class DatabricksBaseEngineSpec(BaseEngineSpec):
+    """
+    Base class for all Databricks engine specifications.
+    Contains common OAuth2 and configuration logic shared across all 
Databricks connectors.
+    """
+
     _time_grain_expressions = time_grain_expressions
 
+    # OAuth 2.0 support - common configuration for all Databricks engines
+    allows_user_impersonation = True
+    supports_oauth2 = True
+    oauth2_scope = "all-apis offline_access"
+    oauth2_exception = RequestError
+
+    @classmethod
+    def is_oauth2_enabled(cls) -> bool:
+        """Check if OAuth2 is enabled for this engine."""
+        return (
+            cls.supports_oauth2
+            and cls.engine_name in app.config.get("DATABASE_OAUTH2_CLIENTS", 
{})
+        )
+
+    @classmethod
+    def needs_oauth2(cls, ex: Exception) -> bool:
+        """
+        Check if the exception indicates that OAuth2 authentication is needed.
+        
+        :param ex: Exception raised during connection or query
+        :return: True if OAuth2 is required, False otherwise
+        """
+        if not (g and hasattr(g, "user")):

Review Comment:
   **Suggestion:** Referencing Flask's `g` proxy directly (`if not (g and 
hasattr(g, "user"))`) will raise a RuntimeError when there is no app/context; 
check for the presence of `g.user` in a safe way (catch RuntimeError) before 
using it. [possible bug]
   
   **Severity Level:** Critical 🚨
   ```suggestion
           try:
               user = getattr(g, "user", None)
           except RuntimeError:
               # No request/app context: cannot determine user, so OAuth2 
shouldn't be assumed needed
               return False
           if not user:
   ```
   <details>
   <summary><b>Why it matters? ⭐ </b></summary>
   
   This is a valid and practical improvement. Accessing the `g` proxy when no
   app/request context exists will raise RuntimeError. The suggested try/getattr
   pattern (or catching RuntimeError) safely handles cases where the method is
   invoked outside a request lifecycle and avoids spurious exceptions. The
   change is localized and preserves existing logic around pattern matching.
   </details>
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** superset/db_engine_specs/databricks.py
   **Line:** 333:333
   **Comment:**
        *Possible Bug: Referencing Flask's `g` proxy directly (`if not (g and 
hasattr(g, "user"))`) will raise a RuntimeError when there is no app/context; 
check for the presence of `g.user` in a safe way (catch RuntimeError) before 
using it.
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   ```
   </details>



##########
superset/db_engine_specs/databricks.py:
##########
@@ -670,6 +870,24 @@ def adjust_engine_params(
 
         return uri, connect_args
 
+    @classmethod
+    def get_oauth2_authorization_uri(cls, database: "Database") -> str:
+        """
+        Return the OAuth2 authorization URI for this Databricks workspace.
+        Dynamically generated from the database host.
+        """
+        host = cls.get_parameters_from_uri(database.sqlalchemy_uri)["host"]
+        return f"https://{host}/oidc/v1/authorize";
+
+    @classmethod
+    def get_oauth2_token_uri(cls, database: "Database") -> str:
+        """
+        Return the OAuth2 token URI for this Databricks workspace.
+        Dynamically generated from the database host.
+        """
+        host = cls.get_parameters_from_uri(database.sqlalchemy_uri)["host"]

Review Comment:
   **Suggestion:** The Python connector's OAuth2 URI helpers assume `host` is 
always present via dictionary indexing and will raise KeyError if missing; 
mirror the ODBC implementation by using `.get("host")` and falling back to 
parsing the URL with `make_url_safe()` to obtain the host. [possible bug]
   
   **Severity Level:** Critical 🚨
   ```suggestion
           host = 
cls.get_parameters_from_uri(database.sqlalchemy_uri).get("host")
           if not host:
               url = make_url_safe(database.sqlalchemy_uri)
               host = url.host
           return f"https://{host}/oidc/v1/authorize";
   
       @classmethod
       def get_oauth2_token_uri(cls, database: "Database") -> str:
           """
           Return the OAuth2 token URI for this Databricks workspace.
           Dynamically generated from the database host.
           """
           host = 
cls.get_parameters_from_uri(database.sqlalchemy_uri).get("host")
           if not host:
               url = make_url_safe(database.sqlalchemy_uri)
               host = url.host
   ```
   <details>
   <summary><b>Why it matters? ⭐ </b></summary>
   
   Mirroring the ODBC implementation's defensive approach is sensible. While
   current get_parameters_from_uri likely returns a host in normal cases, using
   .get("host") with a fallback to parsing the URL (make_url_safe) prevents a
   KeyError if the parameters dict is missing 'host' for any reason. It's a
   low-risk hardening change and consistent with nearby code in the file.
   </details>
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** superset/db_engine_specs/databricks.py
   **Line:** 879:888
   **Comment:**
        *Possible Bug: The Python connector's OAuth2 URI helpers assume `host` 
is always present via dictionary indexing and will raise KeyError if missing; 
mirror the ODBC implementation by using `.get("host")` and falling back to 
parsing the URL with `make_url_safe()` to obtain the host.
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   ```
   </details>



##########
superset/db_engine_specs/databricks.py:
##########
@@ -218,43 +228,221 @@ class DatabricksPythonConnectorPropertiesType(TypedDict):
 
 
 class DatabricksHiveEngineSpec(HiveEngineSpec):
-    engine_name = "Databricks Interactive Cluster"
+    """
+    Engine spec for Databricks Interactive Cluster using PyHive.
+    Note: This inherits from HiveEngineSpec, not DatabricksBaseEngineSpec,
+    to preserve Hive-specific functionality.
+    """
 
+    engine_name = "Databricks Interactive Cluster"
     engine = "databricks"
     drivers = {"pyhive": "Hive driver for Interactive Cluster"}
     default_driver = "pyhive"
 
     _show_functions_column = "function"
-
     _time_grain_expressions = time_grain_expressions
 
+    # OAuth 2.0 support - replicate from DatabricksBaseEngineSpec since we 
inherit from HiveEngineSpec
+    allows_user_impersonation = True
+    supports_oauth2 = True
+    oauth2_scope = "all-apis offline_access"
+    oauth2_exception = RequestError
+
+    @classmethod
+    def is_oauth2_enabled(cls) -> bool:
+        """Check if OAuth2 is enabled for this engine."""
+        return (
+            cls.supports_oauth2
+            and cls.engine_name in app.config.get("DATABASE_OAUTH2_CLIENTS", 
{})
+        )

Review Comment:
   **Suggestion:** Accessing Flask's `current_app` proxy (`app`) without an 
application context will raise a RuntimeError when `is_oauth2_enabled` is 
called outside a request/app context; guard access to `app.config` (or catch 
RuntimeError) and return False if the app context is not available. [possible 
bug]
   
   **Severity Level:** Critical 🚨
   ```suggestion
           try:
               clients = app.config.get("DATABASE_OAUTH2_CLIENTS", {})
           except RuntimeError:
               # No app context available; treat OAuth2 as disabled in this 
context
               return False
           return cls.supports_oauth2 and cls.engine_name in clients
   ```
   <details>
   <summary><b>Why it matters? ⭐ </b></summary>
   
   The suggestion is correct and actionable: accessing the `current_app` proxy
   (aliased as `app`) can raise a RuntimeError when called outside an 
application
   context (e.g., background tasks, unit tests). Wrapping the access in a
   try/except and returning False when no app context exists is a safe,
   low-risk change that prevents unexpected crashes and aligns with the ODBC
   implementation pattern in the same file. The proposed improved code is a
   local, minimal change and does not require broader refactors.
   </details>
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** superset/db_engine_specs/databricks.py
   **Line:** 254:257
   **Comment:**
        *Possible Bug: Accessing Flask's `current_app` proxy (`app`) without an 
application context will raise a RuntimeError when `is_oauth2_enabled` is 
called outside a request/app context; guard access to `app.config` (or catch 
RuntimeError) and return False if the app context is not available.
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   ```
   </details>



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to