kzosabe commented on issue #38632: URL: https://github.com/apache/airflow/issues/38632#issuecomment-2031136250
@uranusjr Thanks for the explanation! Just to confirm, does a DEFFERED task really transition directly to RUNNING? I may be missing something, but I could not find a code path that directly transitions from DEFFERED to RUNNING. As far as I have investigated, it seems that DEFFERED tasks are first rolled back to SCHEDULED, as in the following test code. https://github.com/apache/airflow/blob/db3181c27bc52ebf0167fb6b2f8181a13e7d02b9/tests/models/test_trigger.py#L93-L118 However, whether there is a direct transition or not, your explanation seems to be correct. As I was checking your comment, I noticed that, at least the presence or absence of start_date is not be determined by the value of ti.state. So the form of matrix I wrote above may be nonsense. From reading your comment, in my interpretation, the current desired behavior of `start_date` is: - The datetime when the task is first touched by worker is stored. - And is normally retained no matter how the state changes thereafter (including changes to pre-running state). - If there is a action that can be regarded as `restart` , it will be overwritten with the datetime of it. - Perhaps we don't want to consider a run from DEFERRED as a `restart`, but we want to consider a run from UP_FOR_RESCHEDULE as a `restart`? Is this understanding correct? If this is correct, then at least the changes I've made in https://github.com/apache/airflow/pull/38631 should be reasonable in terms of conforming to this interpretation. Because in the current implementation, the change from QUEUED to SCHEDULED also sets the end_date, which is also clearly contrary to the above. However, it seems that changes such as putting None in start_date when reverting to a pre-running state, as described in https://github.com/apache/airflow/pull/38631#discussion_r1545086130 , are wrong and should not be done. > I think your reference on UP_FOR_RETRY’s end_date is wrong? According to my referenced code, `ti.set_state(TaskInstanceState.UP_FOR_RETRY)` actually sets current time as ti.end_date (because of `or ti.state == TaskInstanceState.UP_FOR_RETRY`). However, from reading your commentary, this may also be the wrong behavior. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org