alifaz31 opened a new issue, #63242:
URL: https://github.com/apache/airflow/issues/63242

   ### Apache Airflow Provider(s)
   
   alibaba
   
   ### Versions of Apache Airflow Providers
   
   3.3.2+
   
   ### Apache Airflow version
   
   3.1.6
   
   ### Operating System
   
   Debian GNU/Linux 12 (bookworm)
   
   ### Deployment
   
   Official Apache Airflow Helm Chart
   
   ### Deployment details
   
   helm chart version 1.19
   
   ```
   config:
     logging:
       remote_logging: 'True'
       remote_base_log_folder: 'oss://my-airflow-logs/airflow/logs'
       remote_log_conn_id: oss_logs
       base_log_folder: '/opt/airflow/logs'
   ```
   
   ### What happened
   
   Task logs are written to a malformed path in OSS storage instead of the 
correct location.
   
   Actual path written in OSS:
   `oss://my-airflow-logs/airflow/logs/oss://my-airflow-logs/airflow/logs/dag_id=.../run_id=.../task_id=.../attempt=1.log`
   
   The OSS key prefix is being prepended to the full OSS URI, producing a 
double path.
   As a result, logs cannot be retrieved and the Airflow UI shows no logs for 
any task.
   
   ### What you think should happen instead
   
   Logs should be written to the correct path:
   `oss://my-airflow-logs/airflow/logs/dag_id=.../run_id=.../task_id=.../attempt=1.log`
   
   `oss_write()` expects a path relative to the bucket's key prefix (e.g. `dag_id=.../attempt=1.log`),
   not a full OSS URI, so the `upload()` method should pass only the relative path.
   
   ### How to reproduce
   
   1. Deploy Airflow 3.x using the official Helm chart with 
apache-airflow-providers-alibaba >= 3.3.2
   2. Configure remote logging with an OSS bucket:
        remote_logging = True
        remote_base_log_folder = oss://my-airflow-logs/airflow/logs
        remote_log_conn_id = oss_logs
   3. Trigger any DAG and let a task complete
   4. Check the OSS bucket — log files are created under a malformed path
      (bucket key prefix prepended to the full OSS URI)
   5. Open the Airflow UI → task logs → no logs are displayed (a "No host supplied" error)
   
   ### Anything else
   
   ## Root Cause
   
   The bug is in `OSSRemoteLogIO.upload()` in `oss_task_handler.py`.
   
   When the log path is relative, `upload()` builds a full OSS URI and passes 
it to `oss_write()`.
   But `oss_write()` expects a relative path — it prepends `base_folder` 
internally, resulting in a double path.
   
   **Buggy code:**
   ```python
   # upload() — relative path branch
   remote_loc = os.path.join(self.remote_base, path)  # builds full OSS URI: oss://bucket/prefix/dag_id=...
   has_uploaded = self.oss_write(log, remote_loc)     # passes the full URI

   # oss_write() then does:
   oss_remote_log_location = f"{self.base_folder}/{remote_log_location}"
   # result → "prefix/logs/oss://bucket/prefix/logs/dag_id=..."  ← malformed
   ```
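   The double-path construction can be reproduced in isolation. The sketch below uses illustrative stand-in values (plain strings, not the provider's actual classes or attributes) to show how prepending the bucket's key prefix to an already-full URI yields the malformed key:

   ```python
   # Hypothetical stand-ins for the handler's configuration, for illustration only
   remote_base = "oss://my-airflow-logs/airflow/logs"   # remote_base_log_folder
   base_folder = "airflow/logs"                         # key prefix inside the bucket
   rel_path = "dag_id=demo/run_id=r1/task_id=t1/attempt=1.log"

   # upload() builds a full OSS URI from the relative path...
   remote_loc = f"{remote_base}/{rel_path}"

   # ...but oss_write() treats its argument as bucket-relative and prepends
   # base_folder again, producing the malformed key observed in the bucket:
   oss_key = f"{base_folder}/{remote_loc}"
   print(oss_key)
   # → airflow/logs/oss://my-airflow-logs/airflow/logs/dag_id=demo/run_id=r1/task_id=t1/attempt=1.log
   ```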
   
   ## Fix

   Pass only the relative path to `oss_write()`:
   ```python
   relative = str(path.relative_to(self.base_log_folder) if path.is_absolute() else path)
   has_uploaded = self.oss_write(log, relative)
   ```
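   As a self-contained illustration of the normalization in the proposed fix (the folder and file names here are hypothetical examples, not taken from the handler):

   ```python
   from pathlib import Path

   # Example values: an absolute local log path under the base log folder
   base_log_folder = Path("/opt/airflow/logs")
   path = Path("/opt/airflow/logs/dag_id=demo/run_id=r1/task_id=t1/attempt=1.log")

   # Absolute paths are rebased against base_log_folder; relative paths pass through
   relative = str(path.relative_to(base_log_folder) if path.is_absolute() else path)
   print(relative)
   # → dag_id=demo/run_id=r1/task_id=t1/attempt=1.log
   ```

   The resulting bucket-relative key is what `oss_write()` can safely prepend its `base_folder` to.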
   
   We have already applied this fix on our end by patching the installed 
provider file during image build, and it fully resolves the issue — logs are 
now written to the correct OSS path and display correctly in the Airflow UI.
   
   ### Are you willing to submit PR?
   
   - [x] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [x] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   

