nathadfield opened a new issue, #38532: URL: https://github.com/apache/airflow/issues/38532
### Apache Airflow version main (development) ### If "Other Airflow 2 version" selected, which one? _No response_ ### What happened? Occasionally, a `BigQueryInsertJobOperator` task can fail when in deferred mode due to an inability to acquire impersonated credentials when checking the job status. Here is an example of the task log. ``` [2024-03-26, 05:38:04 GMT] {credentials_provider.py:353} INFO - Getting connection using `google.auth.default()` since no explicit credentials are provided. [2024-03-26, 05:38:04 GMT] {bigquery.py:62} INFO - Using the connection google_cloud_default . [2024-03-26, 05:38:04 GMT] {taskinstance.py:2370} INFO - Pausing task as DEFERRED. dag_id=my-dag, task_id=my-task, execution_date=20240325T000000, start_date=20240326T053802 [2024-03-26, 05:38:04 GMT] {local_task_job_runner.py:231} INFO - Task exited with return code 100 (task deferral) [2024-03-26, 05:38:07 GMT] {bigquery.py:110} INFO - Bigquery job status is running. Sleeping for 4.0 seconds. [2024-03-26, 05:38:08 GMT] {gcs_task_handler.py:158} INFO - Error checking for previous log; if exists, may be overwritten: 'NoneType' object has no attribute 'decode' [2024-03-26, 05:38:12 GMT] {bigquery.py:110} INFO - Bigquery job status is running. Sleeping for 4.0 seconds. [2024-03-26, 05:38:16 GMT] {bigquery.py:110} INFO - Bigquery job status is running. Sleeping for 4.0 seconds. [2024-03-26, 05:38:21 GMT] {bigquery.py:110} INFO - Bigquery job status is running. Sleeping for 4.0 seconds. ... ... [2024-03-26, 06:05:44 GMT] {bigquery.py:110} INFO - Bigquery job status is running. Sleeping for 4.0 seconds. [2024-03-26, 06:05:48 GMT] {bigquery.py:117} ERROR - Exception occurred while checking for query completion Traceback (most recent call last): File "/usr/local/lib/python3.11/site-packages/airflow/providers/google/cloud/triggers/bigquery.py", line 94, in run job_status = await hook.get_job_status( ^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/site-packages/airflow/providers/google/cloud/hooks/bigquery.py", line 3292, in get_job_status job = await self._get_job(job_id=job_id, project_id=project_id, location=location) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/site-packages/airflow/providers/google/cloud/hooks/bigquery.py", line 3268, in _get_job job = await loop.run_in_executor(None, self._get_job_sync, job_id, project_id, location) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/concurrent/futures/thread.py", line 58, in run result = self.fn(*self.args, **self.kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/site-packages/airflow/providers/google/cloud/hooks/bigquery.py", line 3287, in _get_job_sync return hook.get_job(job_id=job_id, project_id=project_id, location=location) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/site-packages/airflow/providers/google/common/hooks/base_google.py", line 485, in inner_wrapper return func(self, *args, **kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/site-packages/airflow/providers/google/cloud/hooks/bigquery.py", line 1546, in get_job job = client.get_job(job_id=job_id, project=project_id, location=location) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/site-packages/google/cloud/bigquery/client.py", line 2107, in get_job resource = self._call_api( ^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/site-packages/google/cloud/bigquery/client.py", line 827, in _call_api return call() ^^^^^^ File "/usr/local/lib/python3.11/site-packages/google/api_core/retry/retry_unary.py", line 293, in retry_wrapped_func return retry_target( ^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/site-packages/google/api_core/retry/retry_unary.py", line 153, in retry_target _retry_error_helper( File "/usr/local/lib/python3.11/site-packages/google/api_core/retry/retry_base.py", line 212, in _retry_error_helper raise final_exc from source_exc File "/usr/local/lib/python3.11/site-packages/google/api_core/retry/retry_unary.py", line 144, in retry_target result = target() ^^^^^^^^ File "/usr/local/lib/python3.11/site-packages/google/cloud/_http/__init__.py", line 482, in api_request response = self._make_request( ^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/site-packages/google/cloud/_http/__init__.py", line 341, in _make_request return self._do_request( ^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/site-packages/google/cloud/_http/__init__.py", line 379, in _do_request return self.http.request( ^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/site-packages/google/auth/transport/requests.py", line 537, in request self.credentials.before_request(auth_request, method, url, request_headers) File "/usr/local/lib/python3.11/site-packages/google/auth/credentials.py", line 230, in before_request self._blocking_refresh(request) File "/usr/local/lib/python3.11/site-packages/google/auth/credentials.py", line 193, in _blocking_refresh self.refresh(request) File "/usr/local/lib/python3.11/site-packages/google/auth/impersonated_credentials.py", line 250, in refresh self._update_token(request) File "/usr/local/lib/python3.11/site-packages/google/auth/impersonated_credentials.py", line 282, in _update_token self.token, self.expiry = _make_iam_token_request( ^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/site-packages/google/auth/impersonated_credentials.py", line 100, in _make_iam_token_request raise exceptions.RefreshError(_REFRESH_ERROR, response_body) ``` This also returns an exception error such as the following: ``` ('Unable to acquire impersonated credentials', '<!DOCTYPE html>\n<html lang=en>\n <meta charset=utf-8>\n <meta name=viewport content="initial-scale=1, minimum-scale=1, width=device-width">\n <title>Error 502 (Server Error)!!1</title>\n <style>\n {margin:0;padding:0}html,code{font:15px/22px arial,sans-serif}html{background:#fff;color:#222;padding:15px}body{margin:7% auto 0;max-width:390px;min-height:180px;padding:30px 0 15px} > body{background:url(//www.google.com/images/errors/robot.png) 100% 5px no-repeat;padding-right:205px}p{margin:11px 0 22px;overflow:hidden}ins{color:#777;text-decoration:none}a img{border:0}@media screen and (max-width:772px){body{background:none;margin-top:0;max-width:none;padding-right:0}}#logo{background:url(//www.google.com/images/branding/googlelogo/1x/googlelogo_color_150x54dp.png) no-repeat;margin-left:-5px}@media only screen and (min-resolution:192dpi){#logo{background:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_colo ``` ### What you think should happen instead? A problem with this is that, although the task can enter the retry state, the initial BQ job can still be running which can have unexpected downstream effect; such as failures relating to concurrent updates on the same table. ``` airflow.exceptions.AirflowException: Query error: Transaction is aborted due to concurrent update against table ``` Ideally, an issue with acquiring the impersonated credentials when checking the job status wouldn't immediately result in the task failing. ### How to reproduce Unfortunately, this is not possible to replicate consistently due to the unpredictable nature of the scenario. ### Operating System PRETTY_NAME="Debian GNU/Linux 11 (bullseye)" NAME="Debian GNU/Linux" VERSION_ID="11" VERSION="11 (bullseye)" VERSION_CODENAME=bullseye ID=debian HOME_URL="https://www.debian.org/" SUPPORT_URL="https://www.debian.org/support" BUG_REPORT_URL="https://bugs.debian.org/" ### Versions of Apache Airflow Providers apache-airflow-providers-google==10.16.0 ### Deployment Astronomer ### Deployment details _No response_ ### Anything else? _No response_ ### Are you willing to submit PR? - [ ] Yes I am willing to submit a PR! ### Code of Conduct - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org