kaxil commented on code in PR #67418:
URL: https://github.com/apache/airflow/pull/67418#discussion_r3294916823
##########
airflow-core/src/airflow/api_fastapi/execution_api/routes/asset_state.py:
##########
@@ -105,7 +106,7 @@ def set_asset_state_by_name(
) -> None:
"""Set an asset state value by asset name."""
asset_id = _resolve_asset_id_by_name(name, session)
- get_state_backend().set(AssetScope(asset_id=asset_id), key, body.value,
session=session)
+ get_state_backend().set(AssetScope(asset_id=asset_id), key,
json.dumps(body.value), session=session)
Review Comment:
The matching `test_asset_state.py` was not updated alongside
`test_task_state.py`, so these assertions will now fail under CI:
-
`airflow-core/tests/unit/api_fastapi/execution_api/versions/head/test_asset_state.py:116`
-- `assert row.value == "2026-04-29"`
-
`airflow-core/tests/unit/api_fastapi/execution_api/versions/head/test_asset_state.py:211`
-- `assert row.value == "2026-04-29"`
After `json.dumps(body.value)` lands here (and at the by-uri equivalent on
line 159), the DB stores `'"2026-04-29"'` (JSON-quoted), so both assertions
need the symmetric change made in `test_task_state.py` (e.g. `assert row.value
== '"2026-04-29"'`). Worth adding the typed-value round-trip tests
(int/dict/list) on the asset side for parity too.
##########
airflow-core/src/airflow/api_fastapi/execution_api/datamodels/task_state.py:
##########
@@ -19,17 +19,26 @@
from datetime import datetime
+from pydantic import JsonValue, field_validator
+
from airflow.api_fastapi.core_api.base import StrictBaseModel
class TaskStateResponse(StrictBaseModel):
"""Task state value returned to a worker."""
- value: str
+ value: JsonValue
Review Comment:
NaN/Inf floats round-trip through JSON to `null`, so a worker sending
`{"value": float("nan")}` ends up tripping this validator and bubbling `"value
cannot be null"` back, which is confusing because the caller never sent `null`.
Verified:
>>> TaskStatePutBody(value=float("nan"))
ValidationError: 1 validation error for TaskStatePutBody
value
Value error, value cannot be null
Two ways out:
1. Reject NaN/Inf explicitly with a clearer message (e.g. `"NaN/Inf are not
JSON-representable"`).
2. Keep the current behavior but reword the error to mention both `null` and
non-finite floats.
The same applies to the duplicate validator in
`datamodels/asset_state.py:31`.
##########
task-sdk/src/airflow/sdk/execution_time/comms.py:
##########
@@ -923,7 +923,7 @@ class GetTaskState(BaseModel):
class SetTaskState(BaseModel):
ti_id: UUID
key: str
- value: str
+ value: JsonValue
Review Comment:
The matching client wrappers in `task-sdk/src/airflow/sdk/api/client.py`
still declare `value: str`:
- `TaskStateOperations.set` (client.py:724): `def set(self, ti_id:
uuid.UUID, key: str, value: str, expires_at: datetime | None)`
- `AssetStateOperations.set` (client.py:777): `def set(self, key: str,
value: str, *, name: str | None = None, uri: str | None = None)`
Now that `SetTaskState` / `SetAssetStateBy*` accept `JsonValue`, those
signatures lie to callers and mypy will flag any direct caller that passes
non-str. Bump them to `JsonValue` for consistency so the public client signals
the new contract.
##########
task-sdk/src/airflow/sdk/execution_time/context.py:
##########
@@ -635,13 +639,15 @@ def get(self, key: str) -> str | None:
raise AirflowRuntimeError(resp)
if isinstance(resp, AssetStateResult):
stored = resp.value
- # if custom backend is configured, the stored value in DB is a
reference, fetch the actual value from
- # custom backend using the reference
backend = _get_worker_state_backend()
- return backend.deserialize_asset_state_from_ref(stored) if backend
else stored
+ if backend is not None:
+ # serialize_asset_state_to_ref always returns str by contract;
stored is the ref.
+ # stored is always str here: serialize_asset_state_to_ref
always returns str
Review Comment:
The task_state branch above narrows the same situation with a
`TYPE_CHECKING` assert (around line 511), but here it falls back to `# type:
ignore[arg-type]` plus two adjacent comment lines that say the same thing
("stored is the ref" / "stored is always str here"). Worth either using the
`TYPE_CHECKING` assert here too, or applying the type-ignore on both sides --
symmetry between the two accessors helps the next reader.
Nit: the two comment lines duplicate each other; one is enough.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]