Srabasti opened a new issue, #32107: URL: https://github.com/apache/airflow/issues/32107
### Apache Airflow version

Other Airflow 2 version (please specify below)

### What happened

When running a Dataflow job in Cloud Composer (composer-1.20.12-airflow-1.10.15, Airflow 1.10.15), the job fails with a generic error, "Exception: DataFlow failed with return code 1", and the reason for the failure is not evident from the logs.

This issue is in Airflow 1: https://github.com/apache/airflow/blob/d3b066931191b82880d216af103517ea941c74ba/airflow/contrib/hooks/gcp_dataflow_hook.py#L172

This issue still exists in Airflow 2: https://github.com/apache/airflow/blob/main/airflow/providers/google/cloud/hooks/dataflow.py#L1019

Can the error logging be improved to show the exact reason for the failure, along with a few lines of standard error from the Dataflow command that was run, so that the log gives some context? This would help Dataflow users identify the root cause directly from the Airflow logs, without additional research and troubleshooting through the log details in Cloud Logging.

I am happy to contribute and raise a PR to help implement the bug fix. I am looking for input on how to integrate with the existing code base.

Thanks for your help in advance!
Srabasti Banerjee

### What you think should happen instead

[2023-06-15 14:04:37,071] {taskinstance.py:1152} ERROR - DataFlow failed with return code 1
Traceback (most recent call last):
  File "/usr/local/lib/airflow/airflow/models/taskinstance.py", line 985, in _run_raw_task
    result = task_copy.execute(context=context)
  File "/usr/local/lib/airflow/airflow/operators/python_operator.py", line 113, in execute
    return_value = self.execute_callable()
  File "/usr/local/lib/airflow/airflow/operators/python_operator.py", line 118, in execute_callable
    return self.python_callable(*self.op_args, **self.op_kwargs)
  File "/home/airflow/gcs/dags/X.zip/X.py", line Y, in task
    DataFlowPythonOperator(
  File "/usr/local/lib/airflow/airflow/contrib/operators/dataflow_operator.py", line 379, in execute
    hook.start_python_dataflow(
  File "/usr/local/lib/airflow/airflow/contrib/hooks/gcp_dataflow_hook.py", line 245, in start_python_dataflow
    self._start_dataflow(variables, name, ["python"] + py_options + [dataflow],
  File "/usr/local/lib/airflow/airflow/contrib/hooks/gcp_api_base_hook.py", line 363, in wrapper
    return func(self, *args, **kwargs)
  File "/usr/local/lib/airflow/airflow/contrib/hooks/gcp_dataflow_hook.py", line 204, in _start_dataflow
    job_id = _Dataflow(cmd).wait_for_done()
  File "/usr/local/lib/airflow/airflow/contrib/hooks/gcp_dataflow_hook.py", line 178, in wait_for_done
    raise Exception("DataFlow failed with return code {}".format(
Exception: DataFlow failed with return code 1

### How to reproduce

Any Dataflow job that fails because a file is deleted while it is being ingested by a Dataflow task run via Cloud Composer. Please let me know if any further details are needed.

### Operating System

Cloud Composer

### Versions of Apache Airflow Providers

_No response_

### Deployment

Google Cloud Composer

### Deployment details

_No response_

### Anything else

_No response_

### Are you willing to submit PR?

- [X] Yes I am willing to submit a PR!
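To make the request under "What happened" more concrete, here is a minimal standalone sketch of the idea: run a command, and if it exits with a non-zero code, include the last few lines of its standard error in the exception message. This is not the hook's actual code. The helper name `run_and_tail_stderr` and the `tail_lines` parameter are hypothetical, and an actual fix would have to fit into how the hook already reads the subprocess output.

```python
import subprocess


def run_and_tail_stderr(cmd, tail_lines=10):
    """Run ``cmd`` and return its stdout.

    Hypothetical helper for illustration only (not part of Airflow).
    On a non-zero exit code, raise an error whose message includes the
    last ``tail_lines`` lines of the command's standard error.
    """
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # decode output as text
    )
    # communicate() drains both pipes, so the child cannot block on a full pipe.
    stdout, stderr = proc.communicate()

    if proc.returncode != 0:
        lines = stderr.splitlines()
        # Guard against tail_lines == 0, where lines[-0:] would be the whole list.
        tail = lines[-tail_lines:] if tail_lines > 0 else []
        raise RuntimeError(
            "DataFlow failed with return code {}\n"
            "Last {} line(s) of stderr:\n{}".format(
                proc.returncode, len(tail), "\n".join(tail)
            )
        )
    return stdout
```

With something along these lines, the task log would show the tail of the Dataflow command's standard error next to the return code, rather than the return code alone.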
### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org