kristopherkane opened a new issue, #29109:
URL: https://github.com/apache/airflow/issues/29109

   ### Apache Airflow version
   
   main (development)
   
   ### What happened
   
   The provider operator and hooks for Google Cloud Dataproc have two bugs:
   
   1. The running [operator](https://github.com/apache/airflow/blob/main/airflow/providers/google/cloud/operators/dataproc.py#L2123-L2124) reports success even if the job transitions to `State.CANCELLED` or `State.CANCELLING`.
   2. It [attempts](https://github.com/apache/airflow/blob/main/airflow/providers/google/cloud/operators/dataproc.py#L2154) to 'reattach' to a potentially running job if it gets `AlreadyExists`, but it passes the wrong type: 'result' is a `Batch`, while an `Operation` is required.
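
A minimal sketch of the check the first bug calls for, assuming the operator has the batch's final state in hand once waiting ends. `BatchState` is a local stand-in that mirrors the state names mentioned above (it is not the real Dataproc enum), and `raise_if_not_succeeded` is a hypothetical helper, not existing provider code:

```python
from enum import Enum


class BatchState(Enum):
    """Local stand-in for the Dataproc batch states (assumption, not the real enum)."""

    PENDING = "PENDING"
    RUNNING = "RUNNING"
    CANCELLING = "CANCELLING"
    CANCELLED = "CANCELLED"
    SUCCEEDED = "SUCCEEDED"
    FAILED = "FAILED"


# States that should make the Airflow task fail instead of succeed.
UNSUCCESSFUL_STATES = {BatchState.CANCELLING, BatchState.CANCELLED, BatchState.FAILED}


def raise_if_not_succeeded(batch_id: str, state: BatchState) -> None:
    """Hypothetical helper: raise unless the batch finished successfully."""
    if state in UNSUCCESSFUL_STATES:
        raise RuntimeError(f"Batch {batch_id} ended in state {state.name}")
    if state is not BatchState.SUCCEEDED:
        raise RuntimeError(f"Batch {batch_id} is not finished (state {state.name})")
```

In the operator itself the error would more likely be an `AirflowException`; a plain `RuntimeError` is used here only to keep the sketch free of Airflow imports.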
   
   ### What you think should happen instead
   
   Add a new hook method that polls for batch job completion. There is precedent for this in traditional Dataproc with `wait_for_job`.
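
One way such a polling method could look, as a sketch under stated assumptions rather than the eventual implementation. The name `wait_for_batch`, its parameters, and the injected `get_state` callable are all hypothetical; in a real hook, `get_state` would presumably wrap a call that fetches the batch from the Dataproc API and returns its state name:

```python
import time
from typing import Callable

# Terminal state names, written as strings to keep the sketch self-contained.
SUCCESS = "SUCCEEDED"
FAILURE_STATES = {"CANCELLING", "CANCELLED", "FAILED"}


def wait_for_batch(
    get_state: Callable[[str], str],
    batch_id: str,
    wait_seconds: float = 10.0,
    timeout_seconds: float = 3600.0,
) -> str:
    """Poll get_state(batch_id) until the batch succeeds, fails, or times out."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        state = get_state(batch_id)
        if state == SUCCESS:
            return state
        if state in FAILURE_STATES:
            raise RuntimeError(f"Batch {batch_id} ended in state {state}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Batch {batch_id} still {state} after {timeout_seconds}s")
        time.sleep(wait_seconds)


# Usage with a fake state source standing in for the Dataproc API.
fake_states = iter(["PENDING", "RUNNING", "SUCCEEDED"])
final_state = wait_for_batch(lambda _batch_id: next(fake_states), "example-batch", wait_seconds=0)
```

Because this polls by batch ID and does not need an `Operation`, the same function could also serve the reattach path after `AlreadyExists`, which would sidestep the type mismatch described in the second bug.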
   
   ### How to reproduce
   
   Use the Breeze environment and a DAG that runs DataprocCreateBatchOperator.  
Allow the first instance to start. 
   
   Use the gcloud CLI to cancel the job. 
   
   `gcloud dataproc batches cancel <batch_id> --project <project_id> --region <region>`
   
   Observe that the task completes successfully after a 3-5 minute timeout, 
even though the job was cancelled. 
   
   Run the task again with the same batch_id. Observe the ValueError: the code expects an Operation but receives a Batch.
   
   
   
   ### Operating System
   
   Darwin 5806 21.6.0 Darwin Kernel Version 21.6.0: Mon Aug 22 20:17:10 PDT 
2022; root:xnu-8020.140.49~2/RELEASE_X86_64 x86_64
   
   ### Versions of Apache Airflow Providers
   
   Same as dev (main) version. 
   
   ### Deployment
   
   Other Docker-based deployment
   
   ### Deployment details
   
   Observable in the Breeze environment, when running against real Google 
Infrastructure. 
   
   ### Anything else
   
   Every time. 
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


