Vasu-Madaan opened a new issue, #68333:
URL: https://github.com/apache/airflow/issues/68333
### Apache Airflow version
3.2.1
### If "Other Airflow 2 version" selected, which one?
_No response_
### What happened?
After upgrading an environment from Airflow 2.10.5 to Airflow 3.2.1, the
Audit Log page fails with a 500 response from the Airflow 3 public API when the
`log` table contains historical malformed rows where `dttm` is `NULL`.
The failing request is:
```text
GET /api/v2/eventLogs?limit=50&offset=0&order_by=-when
```
The API server raises a Pydantic serialization error because the ORM model
allows `Log.dttm` to be nullable, but the public API response model requires
`dttm` / `when` to be a valid datetime:
```text
pydantic_core._pydantic_core.PydanticSerializationError: Error serializing to JSON: ValidationError: 1 validation error for ValidatorIterator
0.dttm
  Input should be a valid datetime [type=datetime_type, input_value=None, input_type=NoneType]
```
In the affected metadata database, the malformed rows had only the `id`
column populated and all other audit fields were `NULL`.
Example diagnostic query:
```sql
SELECT COUNT(*) AS null_dttm_count
FROM log
WHERE dttm IS NULL;
```
Example affected rows:
```text
id      dttm  dag_id  task_id  map_index  event  logical_date  owner  extra  owner_display_name  run_id  try_number
602051
602052
602053
...
```
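To tell fully malformed rows (only `id` set) apart from rows that merely lack `dttm`, a check along the following lines may help. This is a minimal sketch that runs against an in-memory SQLite stand-in with a reduced column set, not the real Airflow schema or metadata connection; `AUDIT_COLUMNS` and `count_malformed` are hypothetical names introduced for illustration.

```python
import sqlite3

# Reduced stand-in for the Airflow `log` table (assumption: only a subset of
# the real columns is modeled here, all nullable except `id`).
AUDIT_COLUMNS = ["dttm", "dag_id", "task_id", "event", "owner", "run_id"]


def count_malformed(conn: sqlite3.Connection) -> dict:
    """Count rows with NULL dttm and rows where every audit column is NULL."""
    all_null = " AND ".join(f"{col} IS NULL" for col in AUDIT_COLUMNS)
    null_dttm = conn.execute(
        "SELECT COUNT(*) FROM log WHERE dttm IS NULL"
    ).fetchone()[0]
    fully_null = conn.execute(
        f"SELECT COUNT(*) FROM log WHERE {all_null}"
    ).fetchone()[0]
    return {"null_dttm": null_dttm, "fully_null": fully_null}


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE log (id INTEGER PRIMARY KEY, dttm TEXT, dag_id TEXT, "
    "task_id TEXT, event TEXT, owner TEXT, run_id TEXT)"
)
conn.executemany(
    "INSERT INTO log (id, dttm, dag_id, event) VALUES (?, ?, ?, ?)",
    [
        (1, "2025-01-01T00:00:00+00:00", "example_dag", "trigger"),  # healthy row
        (602051, None, None, None),  # only `id` populated
        (602052, None, None, "cli_task_run"),  # NULL dttm, but not fully empty
    ],
)

counts = count_malformed(conn)
print(counts)
```

Against a real metadata database the same two `WHERE` clauses would need to list the actual `log` columns for that Airflow version.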
### What you think should happen instead?
The `/api/v2/eventLogs` endpoint and Audit Log UI should handle malformed
historical rows gracefully instead of failing the whole response with HTTP 500.
Possible approaches:
- filter out rows where `log.dttm IS NULL` from `/api/v2/eventLogs`
- make the API response model tolerate `when: null`
- add migration or validation logic to prevent or clean up fully malformed
audit-log rows
The current behavior makes the full Audit Log page unusable until an
operator manually deletes or repairs the malformed rows in each affected
metadata database.
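The first two approaches can be contrasted in a small sketch. It uses plain dataclasses as a stand-in for the ORM row and the response model, so it is not Airflow's or Pydantic's actual code; `LogRow`, `serialize_strict`, and `serialize_tolerant` are hypothetical names.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class LogRow:
    """Hypothetical stand-in for an ORM `Log` row; `dttm` may be None."""

    id: int
    dttm: Optional[datetime]
    event: Optional[str]


def serialize_strict(row: LogRow) -> dict:
    """Mimic a response model that requires `when` to be a datetime."""
    if row.dttm is None:
        raise ValueError(f"log row {row.id}: dttm is required")
    return {"event_log_id": row.id, "when": row.dttm.isoformat(), "event": row.event}


def serialize_tolerant(row: LogRow) -> dict:
    """Mimic a response model where `when` is allowed to be null."""
    when = row.dttm.isoformat() if row.dttm is not None else None
    return {"event_log_id": row.id, "when": when, "event": row.event}


rows = [
    LogRow(1, datetime(2025, 1, 1, tzinfo=timezone.utc), "trigger"),
    LogRow(602051, None, None),  # malformed historical row
]

# Approach 1: drop NULL-dttm rows before serialization (in SQL this would be
# a `WHERE dttm IS NOT NULL` filter on the query).
filtered = [serialize_strict(r) for r in rows if r.dttm is not None]

# Approach 2: keep every row and emit `"when": null` for malformed ones.
tolerant = [serialize_tolerant(r) for r in rows]

print(json.dumps(filtered))
print(json.dumps(tolerant))
```

Filtering keeps the response schema unchanged but hides the malformed rows from operators; tolerating `null` keeps them visible but changes the contract for API clients that expect `when` to always be set.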
### How to reproduce
1. Start with an Airflow 3.x metadata database.
2. Insert an audit-log row whose `dttm` is `NULL`:
```sql
INSERT INTO log (id, dttm, dag_id, task_id, map_index, event, logical_date,
owner, extra, owner_display_name, run_id, try_number)
VALUES (99999999, NULL, NULL, NULL, NULL, 'malformed_test_event', NULL,
NULL, NULL, NULL, NULL, NULL);
```
3. Call the event logs API:
```text
GET /api/v2/eventLogs?limit=50&offset=0&order_by=-when
```
4. Observe the API returning HTTP 500 with the Pydantic serialization error
for `dttm`.
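Steps 3 and 4 can also be scripted. The sketch below uses only the Python standard library and assumes the API server is reachable at a base URL such as `http://localhost:8080` and accepts a bearer token; both are assumptions about the deployment, and `event_logs_url` and `event_logs_status` are hypothetical helper names.

```python
import urllib.error
import urllib.request
from urllib.parse import urlencode


def event_logs_url(base_url: str, limit: int = 50, offset: int = 0,
                   order_by: str = "-when") -> str:
    """Build the event logs request URL used in the reproduction steps."""
    query = urlencode({"limit": limit, "offset": offset, "order_by": order_by})
    return f"{base_url.rstrip('/')}/api/v2/eventLogs?{query}"


def event_logs_status(base_url: str, token: str) -> int:
    """Return the HTTP status code of the event logs request."""
    request = urllib.request.Request(
        event_logs_url(base_url),
        headers={"Authorization": f"Bearer {token}", "Accept": "application/json"},
    )
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses raise HTTPError; the status code is what matters here.
        return err.code


# Assumed base URL; adjust to the deployment under test.
print(event_logs_url("http://localhost:8080"))
```

Calling `event_logs_status(base_url, token)` with a valid token should return 500 while the malformed row is present, per the behavior described above.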
### Operating System
Debian GNU/Linux 12
### Versions of Apache Airflow Providers
_No response_
### Deployment
Other
### Deployment details
Observed after an upgrade path that included Airflow 2.10.5 before moving to
Airflow 3.2.1.
### Anything else?
Relevant upstream code on `main`:
- `Log.dttm` is nullable in `airflow-core/src/airflow/models/log.py`
- `EventLogResponse.dttm` is required as `datetime` in
`airflow-core/src/airflow/api_fastapi/core_api/datamodels/event_logs.py`
- `/eventLogs` selects `Log` rows without filtering `Log.dttm IS NOT NULL`
in `airflow-core/src/airflow/api_fastapi/core_api/routes/public/event_logs.py`
I found related `/eventLogs` issues, but they appear to cover different
failure modes:
- #53695 is a PostgreSQL `GROUP BY log.dttm` error
- #59965 is an event-log/task-instance join issue
This report is specifically for null `log.dttm` rows causing response
serialization to fail.
### Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
### Code of Conduct
- [x] I agree to follow this project's Code of Conduct