[ https://issues.apache.org/jira/browse/AIRFLOW-5046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Joel Croteau updated AIRFLOW-5046:
----------------------------------
    Priority: Major  (was: Minor)

> Allow GoogleCloudStorageToBigQueryOperator to accept source_objects as a string or otherwise take input from XCom
> -----------------------------------------------------------------------------------------------------------------
>
>                 Key: AIRFLOW-5046
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-5046
>             Project: Apache Airflow
>          Issue Type: Improvement
>          Components: contrib, gcp
>    Affects Versions: 1.10.2
>            Reporter: Joel Croteau
>            Priority: Major
>
> `GoogleCloudStorageToBigQueryOperator` should be able to have its
> `source_objects` dynamically determined by the results of a previous
> workflow. This is hard to do with it expecting a list, as any template
> expansion will render as a string. This could be implemented either as a
> check for whether `source_objects` is a string, and trying to parse it as a
> list if it is, or a separate argument for a string encoded as a list.
>
> My particular use case for this is as follows:
> # A daily DAG scans a GCS bucket for all objects created in the last day and
> loads them into BigQuery.
> # To find these objects, a `PythonOperator` scans the bucket and returns a
> list of object names.
> # A `GoogleCloudStorageToBigQueryOperator` is used to load these objects
> into BigQuery.
>
> The operator should be able to have its list of objects provided by XCom, but
> there is no functionality to do this, and trying to do a template expansion
> along the lines of
> `source_objects='{{ task_instance.xcom_pull(key="KEY") }}'`
> doesn't work because this is rendered as a string, which
> `GoogleCloudStorageToBigQueryOperator` will try to treat as a list, with each
> character being a single item.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
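
The first option proposed in the description (check whether `source_objects` is a string and, if so, try to parse it as a list) could look roughly like the sketch below. This is only an illustration: `coerce_source_objects` is a hypothetical helper name, not part of Airflow, and it assumes the template renders the XCom value as the Python `repr` of a list, e.g. `"['a.csv', 'b/c.csv']"`.

```python
import ast
from typing import List, Sequence, Union


def coerce_source_objects(value: Union[str, Sequence[str]]) -> List[str]:
    """Return a list of object names from a list or a string-encoded list.

    Hypothetical helper, not an Airflow API.
    """
    # Already a list or tuple of names: copy it and return.
    if not isinstance(value, str):
        return list(value)

    try:
        # literal_eval only accepts Python literals, so it is safe to call
        # on rendered template output (unlike eval).
        parsed = ast.literal_eval(value.strip())
    except (ValueError, SyntaxError):
        # Not a Python literal: treat the whole string as one object name.
        return [value]

    if isinstance(parsed, (list, tuple)):
        return [str(item) for item in parsed]

    # Some other literal (e.g. a number): keep the original string as a name.
    return [value]
```

Under these assumptions, the operator could call such a helper on `source_objects` after template rendering, so a rendered string is no longer iterated character by character.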