potiuk opened a new pull request, #67509: URL: https://github.com/apache/airflow/pull/67509
`GCSHook.sync_to_local_dir` and `GCSTimeSpanFileTransformOperator._download` joined GCS blob names into local paths without verifying the resolved path stayed within the intended directory. GCS allows object names containing `..` segments, so a hostile blob name could cause files to be written outside `local_dir` / the operator's temp dir, a classic CWE-22 path-traversal sink. The trust model matters: a DAG author's own bucket is trusted, but these operators are routinely pointed at buckets shared with external partners or other tenants, where the write side may not be fully trusted.

Reported as **F-005** + **F-006** in the [`apache/tooling-agents` L3 providers/google sweep `b1aec75`](https://github.com/apache/tooling-agents/issues/34).

## Change

At both sites, resolve the destination path and assert `is_relative_to` the target root before any download. On violation, raise `ValueError` with a clear message instead of silently writing outside the target.

Sites touched:

- [`hooks/gcs.py` `sync_to_local_dir`](https://github.com/apache/airflow/blob/main/providers/google/src/airflow/providers/google/cloud/hooks/gcs.py#L1370): check before `_sync_to_local_dir_if_changed`.
- [`operators/gcs.py` `GCSTimeSpanFileTransformOperator._download`](https://github.com/apache/airflow/blob/main/providers/google/src/airflow/providers/google/cloud/operators/gcs.py#L894): check inside the per-blob download worker.

## Test plan

- [x] `test_sync_to_local_dir_rejects_path_traversal` (hook): a `../escape.py` blob raises `ValueError` and no file is created outside `local_dir`.
- [x] `test_execute_rejects_path_traversal_in_blob_name` (operator): a `../escape.py` blob raises `ValueError` and `download_to_filename` is never called.
- [x] `prek run ruff` clean on touched files.
- [x] Existing `test_sync_to_local_dir_behaviour` still passes (no behaviour change on safe blob names).

---

##### Was generative AI tooling used to co-author this PR?

- [X] Yes, Claude Code (Opus 4.7)

Generated-by: Claude Code (Opus 4.7) following [the guidelines](https://github.com/apache/airflow/blob/main/contributing-docs/05_pull_requests.rst#gen-ai-assisted-contributions)
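
For readers who have not opened the diff, the containment check described under "Change" can be sketched roughly as follows. This is an illustration only, not the code from the PR: the helper name `resolve_within` and the error message wording are hypothetical, and the sketch assumes Python 3.9+ for `Path.is_relative_to`.

```python
from pathlib import Path


def resolve_within(root: Path, blob_name: str) -> Path:
    """Return the local destination for blob_name, or raise if it escapes root.

    Hypothetical helper for illustration; the PR applies an equivalent check
    inline at each download site.
    """
    root_resolved = root.resolve()
    # resolve() collapses ".." segments (and follows symlinks) so the
    # comparison below is made on the real destination, not the raw join.
    destination = (root_resolved / blob_name).resolve()
    if not destination.is_relative_to(root_resolved):
        raise ValueError(
            f"Blob name {blob_name!r} resolves outside of {str(root_resolved)!r}"
        )
    return destination
```

Resolving before comparing is the important design choice here: a plain string-prefix test on the joined path would accept `local_dir/../escape.py`, and an absolute blob name such as `/etc/passwd` would replace the root entirely when joined with `/`. Both cases are rejected by the check above, while a name like `a/../b.py` that stays inside the root is still accepted.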
