amoghrajesh commented on code in PR #66463:
URL: https://github.com/apache/airflow/pull/66463#discussion_r3234087336
##########
shared/state/src/airflow_shared/state/__init__.py:
##########
@@ -122,3 +122,14 @@ async def aclear(self, scope: StateScope, *,
all_map_indices: bool = False) -> N
scope are cleared. Pass ``all_map_indices=True`` to wipe state across
every
mapped instance of the task. For ``AssetScope`` the flag has no effect.
"""
+
+ def cleanup(self) -> None:
+ """
+ Remove expired and orphaned state records.
+
+ This is a no-op by default. Custom backends override this to implement
their own
+ retention policy. The backend is responsible for reading any relevant
config (e.g.
+ ``[state_store] default_retention_days``) and deciding what to delete.
+ Airflow does not call this from any standard job — the scheduler
triggers it via
+ ``call_regular_interval`` for the default backend.
Review Comment:
Yeah, earlier it was added to scheduler, but now we do not do that and have
a CLI command specifically because scheduler running in its main loop can have
some performance implications, which is better to avoid. Updated this in
77450e580c
##########
airflow-core/src/airflow/config_templates/config.yml:
##########
@@ -3025,6 +3025,24 @@ state_store:
type: string
example: "mypackage.state.CustomStateBackend"
default: "airflow.state.metastore.MetastoreStateBackend"
+ default_retention_days:
+ description: |
+ Number of days to retain task state after their last update.
+ Rows older than this are removed by the scheduler's periodic cleanup.
Review Comment:
Yeah, i have pushed a fix for it: 77450e580c
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]