Commented as well. I think we are really going in a good direction!

On Wed, Jan 8, 2020 at 9:22 AM Driesprong, Fokko <fo...@driesprong.frl> wrote:
> Thanks Jacob for building the document. I think we're on the right track.
> I've added some comments and clarifications to the document, to validate
> that we're looking in the same direction. Would love to get more people's
> opinions on this.
>
> Cheers, Fokko
>
> On Wed, Jan 8, 2020 at 03:31, Jacob Ferriero
> <jferri...@google.com.invalid> wrote:
>
> > The image is not working on the dev list; here is a link to the GitHub
> > review comment containing said image:
> > https://github.com/apache/airflow/pull/6370#issuecomment-546582724.
> >
> > On Tue, Jan 7, 2020 at 5:40 PM Jacob Ferriero <jferri...@google.com>
> > wrote:
> >
> >> Hello Dev List,
> >>
> >> The inspiration for this is to allow operators to start a long-running
> >> task on an external system and reschedule pokes for completion (e.g. a
> >> Spark job on Dataproc) instead of blocking a worker (sketched out in
> >> #6210 <https://github.com/apache/airflow/pull/6210>), so that slots are
> >> freed up between pokes. Doing this requires a method for storing task
> >> state between reschedules.
> >>
> >> It's worth noting that a task would maintain state only during
> >> reschedules but clear state on retries. In this way the task is
> >> idempotent before reaching a terminal state [SUCCESS, FAIL,
> >> UP_FOR_RETRY]. This brings up a question about the scope of our
> >> commitment to idempotency of operators. If it is deemed acceptable for
> >> reschedules to maintain some state, then we can free up workers between
> >> pokes.
> >>
> >> Because this is very similar to the purpose of XCom, it's been
> >> postulated that we should support this behavior in XCom rather than
> >> provide a new model in the db for TaskState. (Though discussion here on
> >> which is more appropriate is more than welcome.)
> >>
> >> I'd like to put forward a proposal to resurrect the reverted #6370
> >> <https://github.com/apache/airflow/pull/6370> in order to provide a
> >> modification to the lifetime of XComs under certain conditions. The
> >> diagram below helps illustrate the change originally proposed in #6370.
> >> There was concern about changing existing behavior (potentially
> >> breaking) and the fact that this makes operators stateful. Per the
> >> review comments and an informal discussion (meeting notes
> >> <https://docs.google.com/document/d/1uuNCPAcwnn0smcDUJPDFMMjrK-z6Z0osesPG7jVZ3oU/edit#>
> >> and #sig-async-operators), I'd like to modify the approach of #6370 to
> >> only skip clearing of an XCom if the XCom key is prefixed with
> >> `airflow.models.xcom.DO_NOT_CLEAR_PREFIX = "_STATEFUL_"` or similar.
> >>
> >> [image: image.png]
> >> --
> >>
> >> *Jacob Ferriero*
> >>
> >> Strategic Cloud Engineer: Data Engineering
> >>
> >> jferri...@google.com
> >>
> >> 617-714-2509
> >>
>

--
Jarek Potiuk
Polidea <https://www.polidea.com/> | Principal Software Engineer
M: +48 660 796 129
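
For reference, the prefix-based rule proposed in the quoted message could be sketched roughly as below. This is a standalone illustration and not Airflow code: the `XComEntry` class and the `surviving_xcoms` helper are hypothetical names made up for this sketch, and only the `_STATEFUL_` prefix value comes from the thread (which itself says "or similar").

```python
from dataclasses import dataclass
from typing import Any, List

# Prefix value quoted in the proposal; the final name and location of the
# constant are assumptions, since the thread leaves them open.
DO_NOT_CLEAR_PREFIX = "_STATEFUL_"


@dataclass
class XComEntry:
    """Hypothetical stand-in for a stored XCom row (not the Airflow model)."""
    key: str
    value: Any


def surviving_xcoms(entries: List[XComEntry], is_reschedule: bool) -> List[XComEntry]:
    """Return the entries that would survive clearing before the task runs again.

    On a reschedule, only keys carrying the prefix are kept.
    On a retry (or any other fresh run), everything is cleared.
    """
    if not is_reschedule:
        return []
    return [e for e in entries if e.key.startswith(DO_NOT_CLEAR_PREFIX)]


entries = [
    XComEntry(DO_NOT_CLEAR_PREFIX + "job_id", "job-123"),  # state kept between pokes
    XComEntry("return_value", 42),                         # ordinary XCom
]
kept_on_reschedule = surviving_xcoms(entries, is_reschedule=True)
kept_on_retry = surviving_xcoms(entries, is_reschedule=False)
```

Under these assumptions, a rescheduled poke would still see the prefixed key (for example an external job id), while a retry would start from a clean slate, which is the idempotency boundary described in the proposal.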