Copilot commented on code in PR #63775:
URL: https://github.com/apache/airflow/pull/63775#discussion_r3025383633


##########
providers/databricks/src/airflow/providers/databricks/utils/openlineage.py:
##########
@@ -24,7 +24,7 @@
 import requests
 
 from airflow.providers.common.compat.openlineage.check import require_openlineage_version
-from airflow.utils import timezone
+from airflow.utils import timezone  # type: ignore[attr-defined]

Review Comment:
   `from airflow.utils import timezone  # type: ignore[attr-defined]` relies on Airflow’s runtime deprecation redirect and silences type checking. Elsewhere in the repo the cross-version pattern is to import timezone via `airflow.providers.common.compat.sdk` (or `try: from airflow.sdk import timezone`). Consider switching to the compat import to avoid the type ignore and keep a consistent Airflow 2/3 import style.
   ```suggestion
   from airflow.providers.common.compat.sdk import timezone
   ```
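
For reference, the try/except variant mentioned above can be sketched like this. This is a minimal sketch: the `airflow.sdk` path is taken from the comment text, and the final `None` fallback exists only so the snippet imports even where Airflow is not installed.

```python
# Cross-version import fallback, as described in the review comment above.
try:
    from airflow.sdk import timezone  # Airflow 3.x location (per the comment)
except ImportError:
    try:
        from airflow.utils import timezone  # Airflow 2.x fallback
    except ImportError:
        timezone = None  # Airflow absent; keeps this sketch importable

# When either import succeeds, timezone.utcnow() yields an aware datetime.
aware = timezone.utcnow() if timezone is not None else None
```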



##########
providers/databricks/src/airflow/providers/databricks/hooks/databricks_base.py:
##########
@@ -1156,18 +1214,25 @@ async def _a_do_api_call(self, endpoint_info: tuple[str, str], json: dict[str, A
         method, endpoint = endpoint_info
 
         full_endpoint = f"api/{endpoint}"
-        url = self._endpoint_url(full_endpoint)
+        url = await self._a_endpoint_url(full_endpoint)
 
         aad_headers = await self._a_get_aad_headers()
         headers = {**self.user_agent_header, **aad_headers}
 
-        auth: aiohttp.BasicAuth
+        auth: aiohttp.BasicAuth | BearerAuth
         token = await self._a_get_token()
         if token:
             auth = BearerAuth(token)
         else:
             self.log.info("Using basic auth.")
-            auth = aiohttp.BasicAuth(self._get_connection_attr("login"), self.databricks_conn.password)
+            conn = await self.a_databricks_conn()
+
+            # We access login and password directly from the connection object to avoid
+            # calling self.databricks_conn which would trigger sync get_connection().
+            # Behavior matches sync _get_connection_attr() but without the cache-clash.

Review Comment:
   The inline comment mentions a “cache-clash”, but the actual issue being avoided is triggering the synchronous `get_connection()` path (and AsyncToSync failures) inside the triggerer event loop. Consider rewording this comment to reflect that motivation, and optionally reuse `_get_connection_attr("login", conn)` here for consistency with other async auth paths.
   ```suggestion
            # Access login and password directly on the async connection object to avoid
            # calling the synchronous get_connection() path (and AsyncToSync failures)
            # inside the triggerer event loop. This mirrors sync _get_connection_attr()
            # behavior without invoking the sync hook property.
   ```



##########
providers/databricks/src/airflow/providers/databricks/hooks/databricks_base.py:
##########
@@ -1156,18 +1214,25 @@ async def _a_do_api_call(self, endpoint_info: tuple[str, str], json: dict[str, A
         method, endpoint = endpoint_info
 
         full_endpoint = f"api/{endpoint}"
-        url = self._endpoint_url(full_endpoint)
+        url = await self._a_endpoint_url(full_endpoint)
 
         aad_headers = await self._a_get_aad_headers()
         headers = {**self.user_agent_header, **aad_headers}
 
-        auth: aiohttp.BasicAuth
+        auth: aiohttp.BasicAuth | BearerAuth
         token = await self._a_get_token()
         if token:
             auth = BearerAuth(token)
         else:
             self.log.info("Using basic auth.")
-            auth = aiohttp.BasicAuth(self._get_connection_attr("login"), self.databricks_conn.password)
+            conn = await self.a_databricks_conn()
+
+            # We access login and password directly from the connection object to avoid
+            # calling self.databricks_conn which would trigger sync get_connection().
+            # Behavior matches sync _get_connection_attr() but without the cache-clash.

Review Comment:
   The comment says this avoids a “cache-clash”, but the main concern here is avoiding the sync `get_connection()` path inside the triggerer event loop. Consider updating the wording to reflect the actual risk (AsyncToSync in a running event loop) and/or reference that `a_databricks_conn()` primes the cached_property.
   ```suggestion
            # We access login and password directly from the async connection object to avoid
            # going through self.databricks_conn, which would invoke the sync get_connection()
            # (AsyncToSync) path inside the triggerer event loop. a_databricks_conn() also primes
            # the databricks_conn cached_property so later access does not re-enter get_connection().
   ```
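
The failure mode both comments describe can be shown with a toy example. This is a hedged sketch: `sync_get_connection` is a hypothetical stand-in for any sync API that internally drives the event loop (as AsyncToSync does), not the actual hook code.

```python
import asyncio

def sync_get_connection():
    # Hypothetical stand-in: a sync call that internally drives the event
    # loop, as AsyncToSync does. Calling it from inside an already-running
    # loop is illegal.
    return asyncio.get_event_loop().run_until_complete(asyncio.sleep(0))

async def trigger():
    # Simulates the triggerer: we are already inside a running event loop.
    try:
        sync_get_connection()
        return "ok"
    except RuntimeError:
        # asyncio refuses: the event loop is already running.
        return "runtime-error"

result = asyncio.run(trigger())
```

This is why the async path reads `conn.login` / `conn.password` off the awaited connection object instead of going through the sync property.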



##########
providers/databricks/src/airflow/providers/databricks/hooks/databricks_base.py:
##########
@@ -1156,18 +1214,25 @@ async def _a_do_api_call(self, endpoint_info: tuple[str, str], json: dict[str, A
         method, endpoint = endpoint_info
 
         full_endpoint = f"api/{endpoint}"
-        url = self._endpoint_url(full_endpoint)
+        url = await self._a_endpoint_url(full_endpoint)
 
         aad_headers = await self._a_get_aad_headers()
         headers = {**self.user_agent_header, **aad_headers}
 

Review Comment:
   `headers` is already computed as `{**self.user_agent_header, **aad_headers}`; later, the request passes `headers={**headers, **self.user_agent_header}`, re-merging the user-agent header and making precedence harder to reason about. Consider passing `headers=headers` (or explicitly document why user-agent should override AAD headers).



##########
providers/databricks/src/airflow/providers/databricks/hooks/databricks_base.py:
##########
@@ -1156,18 +1214,25 @@ async def _a_do_api_call(self, endpoint_info: tuple[str, str], json: dict[str, A
         method, endpoint = endpoint_info
 
         full_endpoint = f"api/{endpoint}"
-        url = self._endpoint_url(full_endpoint)
+        url = await self._a_endpoint_url(full_endpoint)
 
         aad_headers = await self._a_get_aad_headers()
         headers = {**self.user_agent_header, **aad_headers}
 

Review Comment:
   `headers` is built as `{**self.user_agent_header, **aad_headers}` but later the request call re-merges `self.user_agent_header` again. Consider making the header precedence explicit in one place (either keep the merge here and pass `headers=headers`, or build `headers` in the final form once) to avoid redundant dict copies and reduce ambiguity about which values win.
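
To make the precedence point concrete, here is a toy dict-merge sketch (the key values are illustrative, not the hook's actual headers): in `{**a, **b}` the right-hand mapping wins on duplicate keys, so re-merging at call time quietly reverses the winner.

```python
# Toy stand-ins for the hook's two header sources.
user_agent_header = {"User-Agent": "airflow-databricks"}
aad_headers = {"Authorization": "Bearer token", "User-Agent": "azure-client"}

# First merge: aad_headers is unpacked last, so its User-Agent wins.
headers = {**user_agent_header, **aad_headers}

# Re-merging user_agent_header at the request call flips precedence back.
final = {**headers, **user_agent_header}
```

Passing `headers=headers` once would keep a single, explicit precedence order.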



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
