deepujain opened a new pull request, #63201:
URL: https://github.com/apache/airflow/pull/63201
## Summary
`sync_to_local_dir` (used by `S3DagBundle`) cleaned up stale files with
`Path.iterdir()`, which only lists direct children, so stale DAG files in
subfolders were never removed. This change uses `Path.rglob("*")` to traverse
the local directory recursively, deletes any file not in the current S3 object
set, then removes empty directories (deepest first).
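
To illustrate why the old listing missed nested files, here is a small standalone sketch. It is not Airflow code: the helper `list_relative` and the `dag.py` file are made up for the example, and only the `nested/subdir/stale_in_subdir.py` path is taken from the test described in this PR.

```python
import tempfile
from pathlib import Path


def list_relative(root: Path, recursive: bool) -> list[str]:
    """Return sorted POSIX-style paths, relative to root, of the entries found."""
    # iterdir() yields direct children only; rglob("*") walks the whole tree.
    entries = root.rglob("*") if recursive else root.iterdir()
    return sorted(p.relative_to(root).as_posix() for p in entries)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "nested" / "subdir").mkdir(parents=True)
    (root / "dag.py").touch()
    (root / "nested" / "subdir" / "stale_in_subdir.py").touch()

    shallow = list_relative(root, recursive=False)
    deep = list_relative(root, recursive=True)

print(shallow)  # ['dag.py', 'nested']
print(deep)     # ['dag.py', 'nested', 'nested/subdir', 'nested/subdir/stale_in_subdir.py']
```

With `iterdir()`, the stale file never appears in the listing, so a cleanup pass built on it has nothing to delete.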
## Change
- **providers/amazon/src/airflow/providers/amazon/aws/hooks/s3.py**:
`_sync_to_local_dir_delete_stale_local_files` now recursively collects all
paths under `local_dir` with `rglob("*")`, deletes stale files, then removes
empty dirs in depth-descending order so nested stale folders are fully removed.
- **providers/amazon/tests/unit/amazon/aws/hooks/test_s3.py**: Extended
`test_sync_to_local_dir_behaviour` with a case that creates a stale file in a
nested subfolder (`nested/subdir/stale_in_subdir.py`), runs sync with
`delete_stale=True`, and asserts the file and empty subdirs are deleted.
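
The cleanup described above can be sketched roughly as follows. This is a simplified approximation, not the code in `s3.py`: the function name `prune_stale`, its signature, and the `keep` parameter (the set of local paths that correspond to current S3 objects) are assumptions made for illustration.

```python
from pathlib import Path


def prune_stale(local_dir: Path, keep: set[Path]) -> None:
    """Hypothetical helper: delete files not in `keep`, then prune empty dirs."""
    # Materialise the listing first so deletions do not disturb the traversal.
    all_paths = list(local_dir.rglob("*"))

    # Pass 1: remove every regular file that is not part of the current set.
    for path in all_paths:
        if path.is_file() and path not in keep:
            path.unlink()

    # Pass 2: visit directories deepest first, so a parent is only checked
    # after its children have had the chance to be removed.
    dirs = sorted(
        (p for p in all_paths if p.is_dir() and not p.is_symlink()),
        key=lambda p: len(p.parts),
        reverse=True,
    )
    for directory in dirs:
        if not any(directory.iterdir()):
            directory.rmdir()
```

Sorting by path depth is one simple way to get the depth-descending order. A directory that still holds a kept file is never empty, so it survives the second pass.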
## Related issue
Fixes #62622