moiseenkov opened a new pull request, #28111:
URL: https://github.com/apache/airflow/pull/28111
Changes:
fixed `GCSToGCSOperator` for the case of copying a list of objects without a wildcard
fixed and slightly refactored the unit tests
fixed the example DAG
The changes to `GCSToGCSOperator` cover the following cases:
1. Copy a list of files into the folder.
```
copy_files = GCSToGCSOperator(
    task_id='copy_files_without_wildcard',
    source_bucket=SOURCE_BUCKET,
    source_objects=['src/file_1.txt', 'src/file_2.csv'],
    destination_bucket=TARGET_BUCKET,
    destination_object='new_folder/',
)
```
The previous implementation did not actually copy the files; it only
created an empty destination folder. This fix makes the operator copy the
listed files into the specified destination folder.
2. Copy a folder without a trailing slash.
```
copy_files_from_folder = GCSToGCSOperator(
    task_id='copy_folder_without_trailing_slash',
    source_bucket=SOURCE_BUCKET,
    source_objects=['test_folder'],
    destination_bucket=TARGET_BUCKET,
    destination_object='new_folder/',
)
```
For example, suppose we have a folder `test_folder/` with a file
`test_folder/file.txt` inside it. If the trailing slash is omitted from the
source folder name, the previous implementation, instead of copying the file
`file.txt`, created two files: `test_folder` and `new_folderfile.txt`. There
appear to be two bugs here:
a) a file `new_folder` was created instead of a folder `new_folder/`;
b) the wrong path `new_folderfile.txt` was generated for the copied file
instead of `new_folder/file.txt`.
This fix resolves both problems.
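To make the expected destination paths in both cases concrete, here is a minimal standalone sketch of the path mapping. It is not the operator's actual implementation: the helper names `destination_for_file` and `destination_for_folder_entry` are hypothetical, and the rule that a listed file keeps only its base name under the destination folder is an assumption based on the description of case 1.

```python
import posixpath


def destination_for_file(source_object: str, destination_object: str) -> str:
    """Case 1 (assumed rule): place a listed file inside the destination folder."""
    if destination_object.endswith("/"):
        # Keep only the file name and put it under the destination folder.
        return destination_object + posixpath.basename(source_object)
    # Otherwise treat the destination as the full target object name.
    return destination_object


def destination_for_folder_entry(
    listed_object: str, source_folder: str, destination_object: str
) -> str:
    """Case 2: map an object found under a source folder to the destination folder."""
    # Normalize both folder names so a missing trailing slash does not matter.
    source_prefix = source_folder.rstrip("/") + "/"
    destination_prefix = destination_object.rstrip("/") + "/"
    if not listed_object.startswith(source_prefix):
        raise ValueError(f"{listed_object!r} is not inside {source_prefix!r}")
    # Path of the object relative to the source folder ('' for the folder itself).
    relative = listed_object[len(source_prefix):]
    return destination_prefix + relative


# Case 1: listed files end up inside 'new_folder/'.
print(destination_for_file("src/file_1.txt", "new_folder/"))
# Case 2: 'test_folder' without a trailing slash still maps to 'new_folder/file.txt'.
print(destination_for_folder_entry("test_folder/file.txt", "test_folder", "new_folder/"))
```

Normalizing the trailing slash on both sides before joining is what avoids a path like `new_folderfile.txt`, and the folder placeholder `test_folder/` maps to `new_folder/` rather than to a file named `new_folder`.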