rawwar opened a new issue, #43080:
URL: https://github.com/apache/airflow/issues/43080
### Description
When SSL handshake issues occur (usually intermittently), all Databricks
operators running in deferrable mode fail without retrying, because
`aiohttp.client_exceptions.ClientConnectorError` is not treated as a retryable
error. As of now, we only consider `aiohttp.ClientResponseError` to be
retryable. I would like to make
`aiohttp.client_exceptions.ClientConnectorError` retryable as well.
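
To illustrate the proposal, here is a minimal sketch of a retry predicate that
also accepts connection-level errors. It is not the provider's actual code:
`is_retryable` and `call_with_retries` are hypothetical names, the two
exception classes are local stand-ins for the aiohttp types so the snippet runs
without aiohttp installed, and the status check for `ClientResponseError` is an
assumption about which responses are worth retrying.

```python
import asyncio


# Stand-ins for the aiohttp exception types named in this issue. In real code
# these would be aiohttp.ClientResponseError and
# aiohttp.client_exceptions.ClientConnectorError.
class ClientResponseError(Exception):
    def __init__(self, status):
        super().__init__(f"HTTP {status}")
        self.status = status


class ClientConnectorError(OSError):
    pass


def is_retryable(exc):
    """Hypothetical predicate: should a failed API call be retried?"""
    if isinstance(exc, ClientConnectorError):
        # Proposed addition: connection-level failures, for example an
        # SSL handshake that timed out.
        return True
    if isinstance(exc, ClientResponseError):
        # Assumption: only server errors and rate limiting are retried.
        return exc.status >= 500 or exc.status == 429
    return False


async def call_with_retries(func, attempts=3, delay=0.0):
    """Await func() up to `attempts` times, retrying only retryable errors."""
    for attempt in range(1, attempts + 1):
        try:
            return await func()
        except Exception as exc:
            # Give up on the last attempt or on a non-retryable error.
            if attempt == attempts or not is_retryable(exc):
                raise
            await asyncio.sleep(delay)
```

With a predicate like this, a call that fails with `ClientConnectorError` is
attempted again instead of failing the trigger on the first connection error.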
### Use case/motivation
When the SSL handshake takes longer than the timeout (60 seconds by default),
the trigger fails with the error below:
```
[2024-10-16, 09:27:20 UTC] {warnings.py:109} WARNING -
/usr/local/lib/python3.10/site-packages/airflow/models/baseoperator.py:1214:
AirflowProviderDeprecationWarning: Call to deprecated class
DatabricksRunNowDeferrableOperator. (`DatabricksRunNowDeferrableOperator` has
been deprecated. Please use
`airflow.providers.databricks.operators.DatabricksRunNowOperator` with
`deferrable=True` instead.)
result = cls.__new__(cls)
[2024-10-16, 09:27:20 UTC] {taskinstance.py:1598} ERROR - Trigger failed:
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/aiohttp/connector.py", line
1098, in _wrap_create_connection
return await self._loop.create_connection(*args, **kwargs, sock=sock)
File "/usr/local/lib/python3.10/asyncio/base_events.py", line 1103, in
create_connection
transport, protocol = await self._create_connection_transport(
File "/usr/local/lib/python3.10/asyncio/base_events.py", line 1133, in
_create_connection_transport
await waiter
ConnectionAbortedError: SSL handshake is taking longer than 60.0 seconds:
aborting the connection
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File
"/usr/local/lib/python3.10/site-packages/airflow/jobs/triggerer_job_runner.py",
line 529, in cleanup_finished_triggers
result = details["task"].result()
File
"/usr/local/lib/python3.10/site-packages/airflow/jobs/triggerer_job_runner.py",
line 607, in run_trigger
async for event in trigger.run():
File
"/usr/local/lib/python3.10/site-packages/airflow/providers/databricks/triggers/databricks.py",
line 86, in run
run_state = await self.hook.a_get_run_state(self.run_id)
File
"/usr/local/lib/python3.10/site-packages/airflow/providers/databricks/hooks/databricks.py",
line 417, in a_get_run_state
response = await self._a_do_api_call(GET_RUN_ENDPOINT, json)
File
"/usr/local/lib/python3.10/site-packages/airflow/providers/databricks/hooks/databricks_base.py",
line 651, in _a_do_api_call
async for attempt in self._a_get_retry_object():
File "/usr/local/lib/python3.10/site-packages/tenacity/_asyncio.py", line
71, in __anext__
do = self.iter(retry_state=self._retry_state)
File "/usr/local/lib/python3.10/site-packages/tenacity/__init__.py", line
314, in iter
return fut.result()
File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 451, in
result
return self.__get_result()
File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 403, in
__get_result
raise self._exception
File
"/usr/local/lib/python3.10/site-packages/airflow/providers/databricks/hooks/databricks_base.py",
line 653, in _a_do_api_call
async with request_func(
File "/usr/local/lib/python3.10/site-packages/aiohttp/client.py", line
1359, in __aenter__
self._resp: _RetType = await self._coro
File "/usr/local/lib/python3.10/site-packages/aiohttp/client.py", line
663, in _request
conn = await self._connector.connect(
File "/usr/local/lib/python3.10/site-packages/aiohttp/connector.py", line
563, in connect
proto = await self._create_connection(req, traces, timeout)
File "/usr/local/lib/python3.10/site-packages/aiohttp/connector.py", line
1032, in _create_connection
_, proto = await self._create_direct_connection(req, traces, timeout)
File "/usr/local/lib/python3.10/site-packages/aiohttp/connector.py", line
1366, in _create_direct_connection
raise last_exc
File "/usr/local/lib/python3.10/site-packages/aiohttp/connector.py", line
1335, in _create_direct_connection
transp, proto = await self._wrap_create_connection(
File "/usr/local/lib/python3.10/site-packages/aiohttp/connector.py", line
1106, in _wrap_create_connection
raise client_error(req.connection_key, exc) from exc
aiohttp.client_exceptions.ClientConnectorError: Cannot connect to host
adb-******.***REDACTED****.azuredatabricks.net:443 ssl:default [None]
```
The failure is intermittent, so making this error retryable would help.
### Related issues
NA
### Are you willing to submit a PR?
- [X] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's [Code of
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)