kaxil commented on code in PR #67418:
URL: https://github.com/apache/airflow/pull/67418#discussion_r3294916823


##########
airflow-core/src/airflow/api_fastapi/execution_api/routes/asset_state.py:
##########
@@ -105,7 +106,7 @@ def set_asset_state_by_name(
 ) -> None:
     """Set an asset state value by asset name."""
     asset_id = _resolve_asset_id_by_name(name, session)
-    get_state_backend().set(AssetScope(asset_id=asset_id), key, body.value, 
session=session)
+    get_state_backend().set(AssetScope(asset_id=asset_id), key, 
json.dumps(body.value), session=session)

Review Comment:
   The matching `test_asset_state.py` was not updated alongside 
`test_task_state.py`, so these assertions will now fail under CI:
   
   - 
`airflow-core/tests/unit/api_fastapi/execution_api/versions/head/test_asset_state.py:116`
 -- `assert row.value == "2026-04-29"`
   - 
`airflow-core/tests/unit/api_fastapi/execution_api/versions/head/test_asset_state.py:211`
 -- `assert row.value == "2026-04-29"`
   
   After `json.dumps(body.value)` lands here (and at the by-uri equivalent on 
line 159), the DB stores `'"2026-04-29"'` (JSON-quoted), so both assertions 
need the symmetric change made in `test_task_state.py` (e.g. `assert row.value 
== '"2026-04-29"'`). Worth adding the typed-value round-trip tests 
(int/dict/list) on the asset side for parity too.



##########
airflow-core/src/airflow/api_fastapi/execution_api/datamodels/task_state.py:
##########
@@ -19,17 +19,26 @@
 
 from datetime import datetime
 
+from pydantic import JsonValue, field_validator
+
 from airflow.api_fastapi.core_api.base import StrictBaseModel
 
 
 class TaskStateResponse(StrictBaseModel):
     """Task state value returned to a worker."""
 
-    value: str
+    value: JsonValue
 
 

Review Comment:
   NaN/Inf floats round-trip through JSON to `null`, so a worker sending 
`{"value": float("nan")}` ends up tripping this validator and bubbling `"value 
cannot be null"` back, which is confusing because the caller never sent `null`. 
Verified:
   
       >>> TaskStatePutBody(value=float("nan"))
       ValidationError: 1 validation error for TaskStatePutBody
       value
         Value error, value cannot be null
   
   Two ways out:
   1. Reject NaN/Inf explicitly with a clearer message (e.g. `"NaN/Inf are not 
JSON-representable"`).
   2. Keep the current behavior but reword the error to mention both `null` and 
non-finite floats.
   
   The same applies to the duplicate validator in 
`datamodels/asset_state.py:31`.



##########
task-sdk/src/airflow/sdk/execution_time/comms.py:
##########
@@ -923,7 +923,7 @@ class GetTaskState(BaseModel):
 class SetTaskState(BaseModel):
     ti_id: UUID
     key: str
-    value: str
+    value: JsonValue

Review Comment:
   The matching client wrappers in `task-sdk/src/airflow/sdk/api/client.py` 
still declare `value: str`:
   
   - `TaskStateOperations.set` (client.py:724): `def set(self, ti_id: 
uuid.UUID, key: str, value: str, expires_at: datetime | None)`
   - `AssetStateOperations.set` (client.py:777): `def set(self, key: str, 
value: str, *, name: str | None = None, uri: str | None = None)`
   
   Now that `SetTaskState` / `SetAssetStateBy*` accept `JsonValue`, those 
signatures lie to callers and mypy will flag any direct caller that passes 
non-str. Bump them to `JsonValue` for consistency so the public client signals 
the new contract.



##########
task-sdk/src/airflow/sdk/execution_time/context.py:
##########
@@ -635,13 +639,15 @@ def get(self, key: str) -> str | None:
             raise AirflowRuntimeError(resp)
         if isinstance(resp, AssetStateResult):
             stored = resp.value
-            # if custom backend is configured, the stored value in DB is a 
reference, fetch the actual value from
-            # custom backend using the reference
             backend = _get_worker_state_backend()
-            return backend.deserialize_asset_state_from_ref(stored) if backend 
else stored
+            if backend is not None:
+                # serialize_asset_state_to_ref always returns str by contract; 
stored is the ref.
+                # stored is always str here: serialize_asset_state_to_ref 
always returns str

Review Comment:
   The task_state branch above narrows the same situation with a 
`TYPE_CHECKING` assert (around line 511), but here it falls back to `# type: 
ignore[arg-type]` plus two adjacent comment lines that say the same thing 
("stored is the ref" / "stored is always str here"). Worth either using the 
`TYPE_CHECKING` assert here too, or applying the type-ignore on both sides -- 
symmetry between the two accessors helps the next reader.
   
   Nit: the two comment lines duplicate each other; one is enough.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to