[jira] [Commented] (AIRFLOW-3149) GCP dataproc cluster creation should have the option to delete an ERROR cluster
[ https://issues.apache.org/jira/browse/AIRFLOW-3149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16864144#comment-16864144 ]

ASF GitHub Bot commented on AIRFLOW-3149:
-----------------------------------------

dossett commented on pull request #4064: AIRFLOW-3149 Support dataproc cluster deletion on ERROR
URL: https://github.com/apache/airflow/pull/4064

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

> GCP dataproc cluster creation should have the option to delete an ERROR cluster
> ---
>
> Key: AIRFLOW-3149
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3149
> Project: Apache Airflow
> Issue Type: Improvement
> Components: gcp
> Affects Versions: 1.10.0
> Reporter: Aaron Dossett
> Assignee: Aaron Dossett
> Priority: Minor
>
> We sometimes encounter issues where a dataproc cluster creation ends up in
> ERROR state. That is, the cluster “exists” but in the state of ERROR[1] (not
> just that the cluster creation API call failed). This makes retries
> impossible: since the cluster name already exists, subsequent retried
> creations are guaranteed to fail.
>
> A `delete_cluster_on_error` parameter should be added to the
> `DataprocClusterCreateOperator` operator that controls whether or not an
> attempt to delete an ERROR cluster is made.
>
> [1] - I’ve seen that happen in two ways: 1) a purely transient error from GCP,
> `Internal server error` or the like; 2) when the request is rejected because
> it would exceed the project quota.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Commented] (AIRFLOW-3149) GCP dataproc cluster creation should have the option to delete an ERROR cluster
[ https://issues.apache.org/jira/browse/AIRFLOW-3149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16864146#comment-16864146 ]

ASF GitHub Bot commented on AIRFLOW-3149:
-----------------------------------------

dossett commented on pull request #4064: AIRFLOW-3149 Support dataproc cluster deletion on ERROR
URL: https://github.com/apache/airflow/pull/4064

Sometimes a dataproc cluster creation results in a cluster in a state of
ERROR, which makes it unusable. Subsequent Airflow retries will fail
because a cluster with that name already exists. This change adds the
option to delete an ERROR cluster on creation so that subsequent attempts
might succeed.

There are also some other small cleanups.

Make sure you have checked _all_ steps below.

### Jira

- [X] My PR addresses the following [Airflow Jira](https://issues.apache.org/jira/browse/AIRFLOW-3149/) issues and references them in the PR title.

### Description

- [X] See commit message above

### Tests

- [ ] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason:

My change does not include tests; I did not see any integration tests in the code base that this could fit into.

### Commits

- [X] My commits all reference Jira issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
  1. Subject is separated from body by a blank line
  1. Subject is limited to 50 characters (not including Jira issue reference)
  1. Subject does not end with a period
  1. Subject uses the imperative mood ("add", not "adding")
  1. Body wraps at 72 characters
  1. Body explains "what" and "why", not "how"

### Documentation

- [X] In case of new functionality, my PR adds documentation that describes how to use it.
  - When adding new operators/hooks/sensors, the autoclass documentation generation needs to be added.

### Code Quality

- [X] Passes `flake8`
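
For readers of this thread, the behaviour requested in the issue can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from pull request #4064: `FakeDataprocClient`, `create_cluster`, and `create_with_retries` are hypothetical names, and the in-memory client only imitates the one property that matters here, namely that a cluster left in ERROR still occupies its name. Only the `delete_cluster_on_error` parameter name comes from the issue text.

```python
class FakeDataprocClient:
    """Hypothetical in-memory stand-in for the Dataproc API (not a real library)."""

    def __init__(self, failures_before_success=0):
        self.clusters = {}  # cluster name -> state
        self._failures_left = failures_before_success

    def create(self, name):
        # A cluster in ERROR still "exists", so its name stays taken.
        if name in self.clusters:
            raise RuntimeError("cluster %r already exists" % name)
        if self._failures_left > 0:
            self._failures_left -= 1
            self.clusters[name] = "ERROR"
        else:
            self.clusters[name] = "RUNNING"
        return self.clusters[name]

    def delete(self, name):
        del self.clusters[name]


def create_cluster(client, name, delete_cluster_on_error=True):
    """Create a cluster; if it lands in ERROR, optionally delete it, then fail."""
    state = client.create(name)
    if state == "ERROR":
        if delete_cluster_on_error:
            # Free the name so that a later retry can start from scratch.
            client.delete(name)
        raise RuntimeError("cluster %r was created in ERROR state" % name)
    return state


def create_with_retries(client, name, retries, delete_cluster_on_error):
    """Imitate task retries: call create_cluster up to retries + 1 times."""
    last_error = None
    for _ in range(retries + 1):
        try:
            return create_cluster(client, name, delete_cluster_on_error)
        except RuntimeError as err:
            last_error = err
    raise last_error
```

With one simulated transient failure, `create_with_retries(client, "etl", 2, True)` succeeds on the second attempt, whereas passing `False` makes every retry fail with the "already exists" error, which is the situation the issue describes.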
[jira] [Commented] (AIRFLOW-3149) GCP dataproc cluster creation should have the option to delete an ERROR cluster
[ https://issues.apache.org/jira/browse/AIRFLOW-3149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16931285#comment-16931285 ]

ASF GitHub Bot commented on AIRFLOW-3149:
-----------------------------------------

mik-laj commented on pull request #4064: AIRFLOW-3149 Support dataproc cluster deletion on ERROR
URL: https://github.com/apache/airflow/pull/4064
[jira] [Commented] (AIRFLOW-3149) GCP dataproc cluster creation should have the option to delete an ERROR cluster
[ https://issues.apache.org/jira/browse/AIRFLOW-3149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16931286#comment-16931286 ]

ASF subversion and git services commented on AIRFLOW-3149:
----------------------------------------------------------

Commit 578c57f1ccac0ef8b5d17b0c6d7b0fa9accff8e2 in airflow's branch refs/heads/master from Aaron Niskode-Dossett
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=578c57f ]

[AIRFLOW-3149] Support Dataproc cluster deletion on ERROR (#4064)