manesioz commented on issue #11911:
URL: https://github.com/apache/airflow/issues/11911#issuecomment-718057825


   I agree this functionality would be super useful! I would love to hear what 
you and the maintainers think about how to design this. 
   
   It seems like we need to decide what should happen when we try to create a 
table and a table with the same name already exists in the dataset. We 
could: 
   
   1) Fail. This is consistent with the plain `CREATE TABLE` DDL (note that 
`CREATE TABLE IF NOT EXISTS` would silently skip creation rather than fail), 
and it seems like the current `BigQueryCreateEmptyTableOperator` works like 
this as well. 
   
   2) Overwrite the existing table with the new table. This is consistent with 
the `CREATE OR REPLACE TABLE` DDL. This would be a new feature, and could 
possibly be added as a boolean option in the existing operator, or a completely 
new operator altogether. 
   
   Example: 
   
   ```python
   create_table = BigQueryCreateTableOperator(
       task_id="create_table",
       dataset_id=DATASET_NAME,
       table_id="test_table",
       # this will create AND replace any existing table with the same name
       # in the same dataset
       replace=True,
       schema_fields=[
           {"name": "emp_name", "type": "STRING", "mode": "REQUIRED"},
           {"name": "salary", "type": "INTEGER", "mode": "NULLABLE"},
       ],
   )
   ```
   If `replace=False` then it should fail if a table exists by the same name in 
the same dataset. 
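
To make the two behaviours concrete, here is a minimal sketch of the intended semantics. It does not use Airflow or BigQuery at all: the dataset is just an in-memory dict, and `create_table` and `TableAlreadyExistsError` are hypothetical names made up for illustration, not existing APIs.

```python
class TableAlreadyExistsError(Exception):
    """Hypothetical error raised when the table exists and replace=False."""


def create_table(dataset, table_id, schema_fields, replace=False):
    """Create a table in an in-memory 'dataset' (a plain dict).

    replace=False: fail if a table with the same name already exists.
    replace=True:  overwrite the existing table (CREATE OR REPLACE TABLE style).
    """
    if table_id in dataset and not replace:
        raise TableAlreadyExistsError(f"Table {table_id!r} already exists")
    # Creating and replacing are the same write here: the new definition wins.
    dataset[table_id] = {"schema_fields": list(schema_fields)}
    return dataset[table_id]


dataset = {}
create_table(
    dataset,
    "test_table",
    [{"name": "emp_name", "type": "STRING", "mode": "REQUIRED"}],
)

# Second call with replace=True swaps in the new schema instead of failing.
create_table(
    dataset,
    "test_table",
    [{"name": "salary", "type": "INTEGER", "mode": "NULLABLE"}],
    replace=True,
)
```

A real operator would of course delegate to the BigQuery hook rather than a dict; the sketch only pins down which of the two outcomes the `replace` flag selects.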
   
   Am I missing something? What are everyone's thoughts? I'd love to work on a 
PR for this if possible. 

