Srabasti opened a new issue, #32107:
URL: https://github.com/apache/airflow/issues/32107

   ### Apache Airflow version
   
   Other Airflow 2 version (please specify below)
   
   ### What happened
   
   When running a Dataflow job in Cloud Composer composer-1.20.12-airflow-1.10.15 (Airflow 1.10.15), the Dataflow job fails with the generic error "Exception: DataFlow failed with return code 1", and the reason for the failure is not clearly evident from the logs. The issue is in Airflow 1:
   
https://github.com/apache/airflow/blob/d3b066931191b82880d216af103517ea941c74ba/airflow/contrib/hooks/gcp_dataflow_hook.py#L172
   The issue still exists in Airflow 2:
   
https://github.com/apache/airflow/blob/main/airflow/providers/google/cloud/hooks/dataflow.py#L1019
   
   Can the error logging be improved to show the exact reason for the failure, along with a few lines of the standard error output from the Dataflow command, so that the message gives context?
   This would help Dataflow users identify the root cause directly from the task logs, avoiding the additional research and troubleshooting of digging through log details in Cloud Logging.
   
   I am happy to contribute and raise a PR to implement the fix. I am looking for input on how best to integrate it with the existing code base.
   
   Thanks for your help in advance!
   Srabasti Banerjee
   
   
   ### What you think should happen instead
   
   [2023-06-15 14:04:37,071] {taskinstance.py:1152} ERROR - DataFlow failed with return code 1
   Traceback (most recent call last):
     File "/usr/local/lib/airflow/airflow/models/taskinstance.py", line 985, in _run_raw_task
       result = task_copy.execute(context=context)
     File "/usr/local/lib/airflow/airflow/operators/python_operator.py", line 113, in execute
       return_value = self.execute_callable()
     File "/usr/local/lib/airflow/airflow/operators/python_operator.py", line 118, in execute_callable
       return self.python_callable(*self.op_args, **self.op_kwargs)
     File "/home/airflow/gcs/dags/X.zip/X.py", line Y, in task
       DataFlowPythonOperator(
     File "/usr/local/lib/airflow/airflow/contrib/operators/dataflow_operator.py", line 379, in execute
       hook.start_python_dataflow(
     File "/usr/local/lib/airflow/airflow/contrib/hooks/gcp_dataflow_hook.py", line 245, in start_python_dataflow
       self._start_dataflow(variables, name, ["python"] + py_options + [dataflow],
     File "/usr/local/lib/airflow/airflow/contrib/hooks/gcp_api_base_hook.py", line 363, in wrapper
       return func(self, *args, **kwargs)
     File "/usr/local/lib/airflow/airflow/contrib/hooks/gcp_dataflow_hook.py", line 204, in _start_dataflow
       job_id = _Dataflow(cmd).wait_for_done()
     File "/usr/local/lib/airflow/airflow/contrib/hooks/gcp_dataflow_hook.py", line 178, in wait_for_done
       raise Exception("DataFlow failed with return code {}".format(
   Exception: DataFlow failed with return code 1
   
   ### How to reproduce
   
   Any failed Dataflow job; for example, one where a file is deleted while it is being ingested by a Dataflow job task run via Cloud Composer. Please let me know if any details are needed.
   
   ### Operating System
   
   Cloud Composer
   
   ### Versions of Apache Airflow Providers
   
   _No response_
   
   ### Deployment
   
   Google Cloud Composer
   
   ### Deployment details
   
   _No response_
   
   ### Anything else
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
