OscarLigthart commented on code in PR #59764:
URL: https://github.com/apache/airflow/pull/59764#discussion_r2824283014
##########
airflow-core/src/airflow/serialization/definitions/dag.py:
##########
@@ -1008,7 +1012,42 @@ def clear(
tuples that should not be cleared
:param exclude_run_ids: A set of ``run_id`` or (``run_id``)
"""
- from airflow.models.taskinstance import clear_task_instances
+ from airflow.models.taskinstance import (
+ _get_new_task_ids,
+ _update_dagrun_to_latest_version,
+ clear_task_instances,
+ )
+
+ if only_new:
+ if not run_id:
+ raise ValueError("only_new requires run_id to be specified")
+ task_ids = _get_new_task_ids(self.dag_id, run_id, session)
+
+ if dry_run:
+ # For dry run, create temporary TaskInstance objects without database changes
+ # The dag_run is not affected in dry run mode
+ from airflow.models.dag_version import DagVersion
+ from airflow.models.dagbag import DBDagBag
+ from airflow.models.taskinstance import TaskInstance
+
+ scheduler_dagbag = DBDagBag(load_op_links=False)
+ latest_dag = scheduler_dagbag.get_latest_version_of_dag(self.dag_id, session=session)
+ dag_version = DagVersion.get_latest_version(self.dag_id, session=session)
+
+ tis = []
+ for task_id in sorted(task_ids):
+ task = latest_dag.get_task(task_id)
+ ti = TaskInstance(
Review Comment:
At the same time, the TaskInstance constructed above carries little meaning,
so it is probably better to return `set[str]`. That at least makes it clearer
what to expect from this function.
In the API the use case is better defined, so doing that extra step there
might be a cleaner solution than doing it in the serialized DAG.
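
A rough sketch of the split I have in mind, using plain Python rather than Airflow objects. All names here (`get_new_task_ids`, `DryRunEntry`, `build_dry_run_preview`) are hypothetical and only illustrate the shape: the DAG-level helper returns bare task ids, and the API layer decides how to present them.

```python
from __future__ import annotations

from dataclasses import dataclass


def get_new_task_ids(latest_task_ids: set[str], existing_task_ids: set[str]) -> set[str]:
    """Hypothetical DAG-level helper: ids in the latest DAG version with no instance in the run."""
    return latest_task_ids - existing_task_ids


@dataclass(frozen=True)
class DryRunEntry:
    """Hypothetical API-side preview row; deliberately not a TaskInstance."""

    dag_id: str
    run_id: str
    task_id: str


def build_dry_run_preview(dag_id: str, run_id: str, task_ids: set[str]) -> list[DryRunEntry]:
    """Hypothetical API-side step: turn the bare ids into whatever the response needs."""
    # Sort so the preview order is deterministic, as the loop in the diff does.
    return [DryRunEntry(dag_id, run_id, task_id) for task_id in sorted(task_ids)]
```

With this shape the serialized DAG only has to answer "which task ids are new", and any placeholder objects live next to the endpoint that actually needs them.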