rpofuk opened a new issue, #64254:
URL: https://github.com/apache/airflow/issues/64254
### Description
Add support for **pre-signed URL based log I/O** as a `RemoteLogIO`
implementation, where:
- **API server (reader)**: Uses the built-in `S3RemoteLogIO` with direct IAM
access for reading logs (no change needed).
- **Worker (uploader)**: Uses a new `RemoteLogIO` that requests a pre-signed
PUT URL from the API server and uploads via plain HTTP. No AWS credentials
needed on the worker.
### How it works
```
Worker (after task)          API Server              S3
|                            |                       |
|-- POST /presigned-url ---->|                       |
|                            |-- generate PUT URL -->|
|<-- { presigned_url } ------|                       |
|                                                    |
|------------- HTTP PUT log file ------------------->|
```
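The worker side of this flow could look roughly like the sketch below. It is only an illustration built on the standard library: the `/presigned-url` path and the `presigned_url` response field come from the diagram, while the JSON request body (`log_path`), the bearer-token header, and the function names are assumptions, not an existing Airflow API.

```python
import json
import urllib.request


def build_presign_request(api_base, token, log_path):
    """Build the POST asking the API server for a pre-signed PUT URL.

    The request body shape and the bearer token are assumptions for this sketch.
    """
    body = json.dumps({"log_path": log_path}).encode("utf-8")
    return urllib.request.Request(
        url=f"{api_base.rstrip('/')}/presigned-url",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )


def build_upload_request(presigned_url, log_bytes):
    """Build the plain HTTP PUT that sends the log file to the pre-signed URL."""
    return urllib.request.Request(url=presigned_url, data=log_bytes, method="PUT")


def upload_log(api_base, token, log_path, log_bytes, opener=urllib.request.urlopen):
    """Request a pre-signed URL, then upload the log. No AWS credentials involved."""
    with opener(build_presign_request(api_base, token, log_path)) as resp:
        presigned_url = json.load(resp)["presigned_url"]
    with opener(build_upload_request(presigned_url, log_bytes)) as resp:
        return resp.status
```

The `opener` parameter exists only so the two HTTP calls can be replaced in tests; a real implementation would also need retries and error handling.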
The API server endpoint that generates pre-signed URLs can enforce **custom
authorization rules** before issuing the URL - e.g. verifying the worker's
service account is only allowed to upload logs for DAGs in its bundle.
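On the API server side, the check-then-sign step might be sketched as follows. `generate_presigned_url` is the real boto3 S3 client method; everything else (`issue_upload_url`, the `is_allowed` callback, the 300-second expiry) is a hypothetical name or value chosen for illustration. The client is passed in so the sketch does not depend on how the server obtains its IAM credentials.

```python
from typing import Callable


def issue_upload_url(
    s3_client,
    bucket: str,
    log_key: str,
    principal: str,
    is_allowed: Callable[[str, str], bool],
    expires_in: int = 300,
) -> str:
    """Authorize the caller, then return a short-lived pre-signed PUT URL.

    s3_client is expected to behave like boto3.client("s3").
    is_allowed(principal, log_key) is a hypothetical authorization callback.
    """
    if not is_allowed(principal, log_key):
        raise PermissionError(f"{principal} may not upload {log_key}")
    # The URL is valid only for this bucket/key and only until it expires.
    return s3_client.generate_presigned_url(
        "put_object",
        Params={"Bucket": bucket, "Key": log_key},
        ExpiresIn=expires_in,
    )
```

Because the URL is bound to a single object key, a worker that obtains it can write only that one log file, which is what keeps the worker free of S3 credentials.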
The only change a worker deployment needs is to enable the new functionality:
```ini
[logging]
remote_log_io_role = worker
```
### Optional: custom auth hook
By default, the pre-signed URL endpoints use standard Airflow authentication
(the requesting user must be authenticated). For deployments that need
additional authorization logic (e.g. bundle-scoped access, tenant isolation),
an optional callable can be configured:
```ini
[logging]
presigned_url_auth_hook = mypackage.auth.validate_log_access
```
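As a rough idea of what such a callable could look like, here is a minimal sketch. The hook signature is not defined yet, so the `LogAccessRequest` payload, the boolean return value, the static `BUNDLE_DAGS` mapping, and the `load_hook` helper are all assumptions made for illustration.

```python
import importlib
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class LogAccessRequest:
    """Hypothetical payload handed to the hook; not an Airflow class."""

    principal: str  # authenticated identity of the caller
    dag_id: str  # DAG whose log is being accessed
    operation: str  # "put" or "get"


# Hypothetical mapping of worker identities to the DAGs in their bundle.
BUNDLE_DAGS = {"worker-team-a": {"etl_daily", "etl_hourly"}}


def validate_log_access(request: LogAccessRequest) -> bool:
    """Allow uploads only for DAGs that belong to the caller's bundle."""
    allowed = BUNDLE_DAGS.get(request.principal, set())
    return request.operation == "put" and request.dag_id in allowed


def load_hook(dotted_path: str) -> Callable:
    """Resolve a 'package.module.callable' config value to the callable."""
    module_name, _, attr = dotted_path.rpartition(".")
    return getattr(importlib.import_module(module_name), attr)
```

With this shape, the endpoint would resolve the configured path once at startup and call the hook after standard authentication has succeeded, rejecting the request when it returns `False`.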
I'm deploying this solution to production on our side and would gladly
contribute it if it would be accepted. (I don't want to go to the trouble of
getting approval to open-source it if nobody is interested :))
### Use case/motivation
Airflow 3.x introduced `RemoteLogIO` as the protocol for remote log
upload/download from the supervisor process. Currently, the only built-in
implementation uses direct S3 access (`S3RemoteLogIO`), which requires the
worker to have S3 credentials.
In multi-account or zero-trust deployments, workers run in a separate AWS
account and should **not** be trusted with S3 write credentials. There is
currently no way to plug in a custom log upload/download mechanism that uses
pre-signed URLs instead of direct S3 access.
### Related issues
_No response_
### Are you willing to submit a PR?
- [x] Yes I am willing to submit a PR!
### Code of Conduct
- [x] I agree to follow this project's [Code of
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)