amoghrajesh commented on code in PR #66859:
URL: https://github.com/apache/airflow/pull/66859#discussion_r3263996661
##########
shared/state/src/airflow_shared/state/__init__.py:
##########
@@ -166,3 +173,41 @@ def cleanup(self) -> None:
retention policy. The backend is responsible for reading any relevant
config (e.g.
``[state_store] default_retention_days``) and deciding what to delete.
"""
+
+ def serialize_task_state_value(self, *, value: str, key: str, ti_id: str)
-> str:
+ """
+ Serialize a task state value before it is sent to the execution API
for db persistence.
+
+ Called by ``TaskStateAccessor.set()`` on the worker. The return value
is what gets
+ stored in the DB — typically a reference path (e.g. an S3 key) rather
than the
+ actual value. Default: return ``value`` unchanged.
+ """
+ return value
+
+ def deserialize_task_state_value(self, stored: str) -> str:
+ """
+ Resolve a stored task state string back to the actual value.
+
+ Called by ``TaskStateAccessor.get()`` after the stored string is
retrieved from
+ the execution API. Default: return ``stored`` unchanged.
+ """
+ return stored
+
+ def serialize_asset_state_value(self, *, value: str, key: str, asset_name:
str) -> str:
Review Comment:
I think the naming is confusing you?
`asset_name` is probably misleading. It is actually `self._name or
self._uri` like this:
```python
asset_name = self._name or self._uri or ""
stored = (
backend.serialize_asset_state_value(value=value, key=key,
asset_name=asset_name)
if backend
else value
)
```
so if the accessor was created from an `AssetUriRef`, the value passed is
the URI string, not a name. Will rename the param to `asset_ref` and clarify
help?
Regarding the inactive asset qn, for the default metastore path: no stale
state issue — the execution API resolver filters that:
https://github.com/apache/airflow/blob/main/airflow-core/src/airflow/api_fastapi/execution_api/routes/asset_state.py#L73,
so inactive assets return 404 and no state is written at all.
For custom backends: there is a chance of orphaned external write if the
asset becomes inactive after the backend stores the value but before the
execution API confirms the reference. This is an edge case the custom backend
author needs to handle in their garbage cleanup imo.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]