Vasu-Madaan opened a new issue, #68333:
URL: https://github.com/apache/airflow/issues/68333

   ### Apache Airflow version
   
   3.2.1
   
   ### If "Other Airflow 2 version" selected, which one?
   
   _No response_
   
   ### What happened?
   
   After upgrading an environment from Airflow 2.10.5 to Airflow 3.2.1, the 
Audit Log page fails with a 500 response from the Airflow 3 public API when the 
`log` table contains historical malformed rows where `dttm` is `NULL`.
   
   The failing request is:
   
   ```text
   GET /api/v2/eventLogs?limit=50&offset=0&order_by=-when
   ```
   
   The API server raises a Pydantic serialization error because the ORM model 
allows `Log.dttm` to be nullable, but the public API response model requires 
`dttm` / `when` to be a valid datetime:
   
   ```text
   pydantic_core._pydantic_core.PydanticSerializationError: Error serializing to JSON: ValidationError: 1 validation error for ValidatorIterator
   0.dttm
     Input should be a valid datetime [type=datetime_type, input_value=None, input_type=NoneType]
   ```
   
   In the affected metadata database, the malformed rows had only the `id` 
column populated and all other audit fields were `NULL`.
   
   Example diagnostic query:
   
   ```sql
   SELECT COUNT(*) AS null_dttm_count
   FROM log
   WHERE dttm IS NULL;
   ```
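
The check above can also be scripted. The following is a minimal sketch that runs the same query from Python; it uses an in-memory SQLite table as a hypothetical stand-in for the metadata database (only three of the real `log` columns, with made-up demo rows), and the helper names are illustrative, not part of Airflow.

```python
import sqlite3


def count_null_dttm(conn: sqlite3.Connection) -> int:
    """Return how many rows in `log` have no timestamp."""
    (count,) = conn.execute(
        "SELECT COUNT(*) FROM log WHERE dttm IS NULL"
    ).fetchone()
    return count


def null_dttm_ids(conn: sqlite3.Connection, limit: int = 10) -> list:
    """Return the ids of the first `limit` rows that have no timestamp."""
    rows = conn.execute(
        "SELECT id FROM log WHERE dttm IS NULL ORDER BY id LIMIT ?", (limit,)
    ).fetchall()
    return [row[0] for row in rows]


def make_demo_db() -> sqlite3.Connection:
    """Build a throwaway table (assumption: a reduced copy of the `log` schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE log (id INTEGER PRIMARY KEY, dttm TEXT, event TEXT)")
    conn.executemany(
        "INSERT INTO log (id, dttm, event) VALUES (?, ?, ?)",
        [
            (1, "2025-01-01T00:00:00+00:00", "trigger"),  # well-formed row
            (602051, None, None),  # malformed: only `id` populated
            (602052, None, None),
        ],
    )
    return conn


if __name__ == "__main__":
    demo = make_demo_db()
    print(count_null_dttm(demo), null_dttm_ids(demo))
```

Against a real deployment the same two queries would be run through whatever driver matches the metadata database; only the SQL is taken from this report.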
   
   Example affected rows:
   
   ```text
   id      dttm  dag_id  task_id  map_index  event  logical_date  owner  extra  owner_display_name  run_id  try_number
   602051
   602052
   602053
   ...
   ```
   
   ### What you think should happen instead?
   
   The `/api/v2/eventLogs` endpoint and Audit Log UI should handle malformed 
historical rows gracefully instead of failing the whole response with HTTP 500.
   
   Possible approaches:
   
   - filter out rows where `log.dttm IS NULL` from `/api/v2/eventLogs`
   - make the API response model tolerate `when: null`
   - add migration or validation logic to prevent or clean up fully malformed 
audit-log rows
   
   The current behavior makes the full Audit Log page unusable until an 
operator manually deletes or repairs the malformed rows in each affected 
metadata database.
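
To make the first two approaches concrete, here is a minimal sketch in plain Python. It does not use Airflow or Pydantic: `LogRow` is a hypothetical stand-in for the ORM row, `serialize_strict` only imitates a response model that requires a datetime, and the output keys are illustrative.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class LogRow:
    """Hypothetical stand-in for an ORM `Log` row; `dttm` may be missing."""

    id: int
    dttm: Optional[datetime]
    event: Optional[str]


def serialize_strict(rows):
    """Imitate a model that requires a datetime: one bad row fails the whole page."""
    out = []
    for row in rows:
        if row.dttm is None:
            raise ValueError(f"row {row.id}: dttm should be a valid datetime")
        out.append({"id": row.id, "when": row.dttm.isoformat(), "event": row.event})
    return out


def serialize_skipping_nulls(rows):
    """Approach 1: drop rows without a timestamp before serializing."""
    return serialize_strict([row for row in rows if row.dttm is not None])


def serialize_tolerant(rows):
    """Approach 2: keep every row and emit `when: None` where the timestamp is missing."""
    return [
        {
            "id": row.id,
            "when": row.dttm.isoformat() if row.dttm is not None else None,
            "event": row.event,
        }
        for row in rows
    ]


if __name__ == "__main__":
    rows = [
        LogRow(1, datetime(2025, 1, 1, tzinfo=timezone.utc), "trigger"),
        LogRow(602051, None, None),
    ]
    print(serialize_skipping_nulls(rows))
    print(serialize_tolerant(rows))
```

Filtering hides the malformed rows from the UI, while the tolerant variant keeps them visible so an operator can still find them; which trade-off is preferable is a decision for the maintainers.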
   
   ### How to reproduce
   
   1. Start with an Airflow 3.x metadata database.
   2. Insert an audit-log row with `dttm IS NULL`.
   
   ```sql
   INSERT INTO log (
       id, dttm, dag_id, task_id, map_index, event, logical_date,
       owner, extra, owner_display_name, run_id, try_number
   )
   VALUES (
       99999999, NULL, NULL, NULL, NULL, 'malformed_test_event', NULL,
       NULL, NULL, NULL, NULL, NULL
   );
   ```
   
   3. Call the event logs API:
   
   ```text
   GET /api/v2/eventLogs?limit=50&offset=0&order_by=-when
   ```
   
   4. Observe the API returning HTTP 500 with the Pydantic serialization error 
for `dttm`.
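
Steps 3 and 4 could be scripted roughly as follows, using only the Python standard library. The base URL and the bearer-token header are assumptions about the deployment (authentication setups differ), and the function names are illustrative.

```python
import urllib.error
import urllib.parse
import urllib.request


def event_logs_url(base_url, limit=50, offset=0, order_by="-when"):
    """Build the request URL from this report for a given API server."""
    query = urllib.parse.urlencode(
        {"limit": limit, "offset": offset, "order_by": order_by}
    )
    return f"{base_url.rstrip('/')}/api/v2/eventLogs?{query}"


def event_logs_status(base_url, token):
    """Return the HTTP status of the event logs request.

    Assumption: the server accepts `Authorization: Bearer <token>`.
    """
    request = urllib.request.Request(
        event_logs_url(base_url),
        headers={"Authorization": f"Bearer {token}"},
    )
    try:
        with urllib.request.urlopen(request) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urlopen raises on 4xx/5xx; the status code is what matters here.
        return err.code


if __name__ == "__main__":
    # Placeholder values; replace with the real server address and token.
    print(event_logs_url("http://localhost:8080"))
```

With a malformed row present, `event_logs_status` would be expected to return 500 per this report.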
   
   ### Operating System
   
   Debian GNU/Linux 12
   
   ### Versions of Apache Airflow Providers
   
   _No response_
   
   ### Deployment
   
   Other
   
   ### Deployment details
   
   Observed after an upgrade path that included Airflow 2.10.5 before moving to 
Airflow 3.2.1.
   
   ### Anything else?
   
   Relevant upstream code on `main`:
   
   - `Log.dttm` is nullable in `airflow-core/src/airflow/models/log.py`
   - `EventLogResponse.dttm` is required as `datetime` in 
`airflow-core/src/airflow/api_fastapi/core_api/datamodels/event_logs.py`
   - `/eventLogs` selects `Log` rows without filtering `Log.dttm IS NOT NULL` 
in `airflow-core/src/airflow/api_fastapi/core_api/routes/public/event_logs.py`
   
   I found related `/eventLogs` issues, but they appear to cover different 
failure modes:
   
   - #53695 is a PostgreSQL `GROUP BY log.dttm` error
   - #59965 is an event-log/task-instance join issue
   
   This report is specifically for null `log.dttm` rows causing response 
serialization to fail.
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [x] I agree to follow this project's Code of Conduct
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
