Commented as well. I think we are really going in a good direction!

On Wed, Jan 8, 2020 at 9:22 AM Driesprong, Fokko <fo...@driesprong.frl>
wrote:

> Thanks Jacob for building the document. I think we're on the right track.
> I've added some comments and clarification to the document, to validate
> we're looking in the same direction. Would love to get more people's
> opinion on this.
>
> Cheers, Fokko
>
> On Wed, Jan 8, 2020 at 03:31, Jacob Ferriero
> <jferri...@google.com.invalid> wrote:
>
> > The image is not working on the dev list, so here is a link to the
> > GitHub review comment containing said image:
> > https://github.com/apache/airflow/pull/6370#issuecomment-546582724.
> >
> > On Tue, Jan 7, 2020 at 5:40 PM Jacob Ferriero <jferri...@google.com>
> > wrote:
> >
> >> Hello Dev List,
> >>
> >> The inspiration for this is to allow operators to start a long-running
> >> task on an external system (e.g. a Spark job on Dataproc) and reschedule
> >> pokes for completion instead of blocking a worker (sketched out in #6210
> >> <https://github.com/apache/airflow/pull/6210>), freeing up slots between
> >> pokes. Doing this requires a method for storing task state between
> >> reschedules.
> >> It's worth noting that a task would maintain state only during
> >> reschedules but clear state on retries. In this way the task is
> >> idempotent before reaching a terminal state [SUCCESS, FAIL,
> >> UP_FOR_RETRY]. This brings up a question of the scope of commitment to
> >> idempotency of operators. If it is deemed acceptable for reschedules to
> >> maintain some state, then we can free up workers between pokes.
> >>
> >> Because this is very similar to the purpose of XCom it's been postulated
> >> that we should support this behavior in XCom rather than provide a new
> >> model in the db for TaskState. (Though discussion here on which is more
> >> appropriate is more than welcome.)
> >>
> >> I'd like to put forward a proposal to resurrect the reverted #6370
> >> <https://github.com/apache/airflow/pull/6370> in order to provide a
> >> modification to the lifetime of XComs under certain conditions. The
> >> diagram below helps illustrate the change originally proposed in #6370.
> >> There was concern about changing existing behavior (potentially
> >> breaking) and the fact that this makes operators stateful. Per the
> >> review comments and an informal discussion (meeting notes
> >> <https://docs.google.com/document/d/1uuNCPAcwnn0smcDUJPDFMMjrK-z6Z0osesPG7jVZ3oU/edit#>
> >> and #sig-async-operators), I'd like to modify the approach of #6370 to
> >> only skip clearing of an XCom if the XCom key is prefixed with
> >> `airflow.models.xcom.DO_NOT_CLEAR_PREFIX = "_STATEFUL_"` or similar.
> >>
> >> [image: image.png]
> >> --
> >>
> >> *Jacob Ferriero*
> >>
> >> Strategic Cloud Engineer: Data Engineering
> >>
> >> jferri...@google.com
> >>
> >> 617-714-2509
> >>
> >
> >
> > --
> >
> > *Jacob Ferriero*
> >
> > Strategic Cloud Engineer: Data Engineering
> >
> > jferri...@google.com
> >
> > 617-714-2509
> >
>
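For anyone skimming the thread, the prefix rule proposed above could be sketched roughly as follows. This is only an illustration, not code from #6370: `keys_to_clear` and its `is_reschedule` flag are hypothetical names, and the `_STATEFUL_` value is the one floated in the proposal ("or similar"). It assumes prefixed keys survive a reschedule, while retries and fresh runs clear everything.

```python
# Value floated in the proposal; the final name and value are not settled.
DO_NOT_CLEAR_PREFIX = "_STATEFUL_"


def keys_to_clear(xcom_keys, is_reschedule):
    """Return the XCom keys that would be cleared before a task instance runs.

    Hypothetical helper: on a reschedule, keys carrying the prefix are kept so
    the task can resume polling; on a retry or fresh run, every key is cleared.
    """
    if not is_reschedule:
        # Retries and fresh runs start from a clean slate.
        return list(xcom_keys)
    # Reschedules keep only the explicitly stateful keys.
    return [key for key in xcom_keys if not key.startswith(DO_NOT_CLEAR_PREFIX)]


if __name__ == "__main__":
    keys = ["return_value", "_STATEFUL_job_id"]
    print(keys_to_clear(keys, is_reschedule=True))
    print(keys_to_clear(keys, is_reschedule=False))
```

Under this reading, an operator could stash something like an external job ID under a prefixed key on the first poke and read it back on the next one.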


-- 

Jarek Potiuk
Polidea <https://www.polidea.com/> | Principal Software Engineer

M: +48 660 796 129
