shahar1 commented on issue #11911: URL: https://github.com/apache/airflow/issues/11911#issuecomment-1517652005
> It doesn't look like the [API supports a replace_if_exists / delete_if_exists parameter](https://github.com/googleapis/python-bigquery/blob/v2.34.4/google/cloud/bigquery/client.py#L702-L761). It will simply raise a `google.cloud.exceptions.Conflict` if the table exists, unless `exists_ok` is set to True.
>
> Is the desired behavior for a `delete_if_exists` or `replace_if_exists` flag to delete the table and recreate it if the table already exists? Also, is it okay if this type of operation is not atomic? We will need to delete the table first and then recreate it. I'm not sure how CREATE OR REPLACE TABLE in standard BQ SQL is implemented under the hood.
>
> It's also unclear how such a `replace_if_exists` parameter would work with the `exists_ok` parameter. Should the `replace_if_exists` parameter take precedence? Or should we only do the delete/recreate if `exists_ok` is set to False and `replace_if_exists` is set to True? I kinda prefer that approach so it's clear that the user is not okay with the table existing. (exists_ok currently defaults to False and a replace_if_exists param would obviously default to False as well).
>
> An aside - for tables that get appended to, this operation could be quite dangerous - since a user will lose all their historical data as a result. Just worth mentioning that this should only be done for tables that are truncated / recreated and are okay with non-atomicity.

I agree with your statement - as long as the BQ API doesn't natively support this as an atomic operation, I don't see a good reason to maintain a specific operator for it.
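
For reference, here is a rough sketch of the delete-then-recreate behavior discussed in the quote, mostly to make the non-atomicity concrete. The helper name `create_or_replace_table` and the `replace_if_exists` parameter are hypothetical (they exist in neither Airflow nor the BigQuery client), and `client` is assumed to be a `google.cloud.bigquery.Client`, whose `delete_table(..., not_found_ok=...)` and `create_table(..., exists_ok=...)` methods are the only real calls used:

```python
def create_or_replace_table(client, table, exists_ok=False, replace_if_exists=False):
    """Hypothetical helper: create `table`, optionally dropping an existing one first.

    Follows the precedence suggested in the quoted comment: the delete/recreate
    path is taken only when exists_ok is False and replace_if_exists is True.
    """
    if replace_if_exists and not exists_ok:
        # Step 1: drop the table. not_found_ok=True makes this a no-op
        # when the table does not exist yet.
        client.delete_table(table, not_found_ok=True)
        # NOT ATOMIC: between this delete and the create below, the table
        # does not exist. If the create fails, the old table and its data
        # are already gone.

    # Step 2: create the table. With exists_ok=False the client raises
    # google.cloud.exceptions.Conflict if the table is (still) present,
    # e.g. because another job recreated it in the meantime.
    return client.create_table(table, exists_ok=exists_ok)
```

With `exists_ok=True` the flag is ignored and an existing table is left untouched, so the destructive path always has to be requested explicitly.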
