[ https://issues.apache.org/jira/browse/AIRFLOW-15?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16280644#comment-16280644 ]
Barry Hart commented on AIRFLOW-15:
-----------------------------------

I understand. "Wait and see" might be a reasonable strategy (i.e. until Google clarifies the message).

The specific reason for my comment is that one of our DAGs transfers a very large number of files to and from Google Storage. With this number of files, we almost always see some transient 5XX errors from the Google side, so we see some value in the google-cloud-python library, which has retry logic built in, both [generally|https://github.com/GoogleCloudPlatform/google-cloud-python/blob/master/api_core/google/api_core/retry.py] and [specifically for Google Storage|https://github.com/GoogleCloudPlatform/google-cloud-python/blob/master/storage/google/cloud/storage/blob.py#L84-L91].

Although Airflow has its own retry support, I see it as intended for coarse-grained retries (i.e. when one task does a few things). When one task is transferring thousands of files, it seems useful to retry within the task as well (per file).

Let me know what you think. It may be worth creating a ticket about retries to perhaps get input from other users. For now, we can use the google-cloud-python library directly from our DAGs.

> Remove GCloud from Airflow
> --------------------------
>
>                 Key: AIRFLOW-15
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-15
>             Project: Apache Airflow
>          Issue Type: Task
>          Components: gcp
>            Reporter: Chris Riccomini
>            Assignee: Chris Riccomini
>              Labels: gcp
>
> After speaking with Google, there was some concern about using the
> [gcloud-python|https://github.com/GoogleCloudPlatform/gcloud-python] library
> for Airflow. There are several concerns:
> # It's not clear (even to people at Google) what this library is, who owns
> it, etc.
> # It does not support all services (the way
> [google-api-python-client|https://github.com/google/google-api-python-client]
> does).
> # There are compatibility issues between google-api-python-client and
> gcloud-python.
> We currently support both, with different libraries depending on which
> package you install: {{airflow[gcp_api]}} or {{airflow[gcloud]}}. This ticket
> is to remove the {{airflow[gcloud]}} package, and all associated code.
> The main associated code, afaik, is the use of the {{gcloud}} library in the
> Google cloud storage hooks/operators--specifically for Google cloud storage
> Airflow logging.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
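
To make the per-file retry idea in the comment above concrete, here is a minimal sketch using only the Python standard library. It is not the retry code from google-cloud-python or from Airflow. The names {{TransientError}}, {{retry_call}} and {{transfer_all}} are hypothetical, and {{transfer_one}} stands in for whatever function actually uploads or downloads a single file.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical stand-in for a retryable 5XX response."""


def retry_call(func, *args, attempts=5, base_delay=0.5, max_delay=30.0,
               sleep=time.sleep):
    """Call func(*args), retrying on TransientError with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return func(*args)
        except TransientError:
            if attempt == attempts:
                # Out of attempts: re-raise so the caller can decide what to do.
                raise
            # Delay doubles each attempt, capped at max_delay, with jitter.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(delay * random.uniform(0.5, 1.0))


def transfer_all(paths, transfer_one, **retry_kwargs):
    """Transfer each file with its own retry budget.

    Returns the list of paths that still failed after all attempts, so one
    bad file does not force the whole batch to be redone.
    """
    failed = []
    for path in paths:
        try:
            retry_call(transfer_one, path, **retry_kwargs)
        except TransientError:
            failed.append(path)
    return failed
```

A task could raise if {{transfer_all}} returns a non-empty list, so that Airflow's own coarse-grained retry only kicks in for files that exhausted their per-file attempts.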