I agree this looks great, one question, how does the tree view look? James Coder
> On Aug 11, 2020, at 6:48 PM, Gerard Casas Saez > <gcasass...@twitter.com.invalid> wrote: > > First of all, this is awesome!! > > Secondly, checking your UI code, seems you are loading all operators at > once. Wondering if we can load them as needed (aka load whenever we click > the TaskGroup). Some of our DAGs are so large that take forever to load on > the Graph view, so worried about this still being an issue here. It may be > easily solvable by implementing lazy loading of the graph. Not sure how > easy to implement/add to the UI extension (and dont want to push for early > optimization as its the root of all evil). > Gerard Casas Saez > Twitter | Cortex | @casassaez <http://twitter.com/casassaez> > > >> On Tue, Aug 11, 2020 at 10:35 AM Xinbin Huang <bin.huan...@gmail.com> wrote: >> >> Hi Yu, >> >> Thank you so much for taking on this. I was fairly distracted previously >> and I didn't have the time to update the proposal. In fact, after >> discussing with Ash, Kaxil and Daniel, the direction of this AIP has been >> changed to favor the concept of TaskGroup instead of rewriting >> SubDagOperator (though it may may sense to deprecate SubDag in a future >> date.). >> >> Your PR is amazing and it has implemented the desire features. I think we >> can focus on your new PR instead. Do you mind updating the AIP based on >> what you have done in your PR? >> >> Best, >> Bin >> >> >>> On Tue, Aug 11, 2020 at 7:11 AM Yu Qian <yuqian1...@gmail.com> wrote: >>> >>> Hi, all, I've added the basic UI changes to my proposed implementation of >>> TaskGroup as UI grouping concept: >>> https://github.com/apache/airflow/pull/10153 >>> >>> I think Chris had a pretty good specification of TaskGroup so i'm quoting >>> it here. The only thing I don't fully agree with is the restriction >>> "... **cannot* >>> have dependencies between a Task in a TaskGroup and either a* >>> * Task in a different TaskGroup or a Task not in any group*". I think >>> this is over restrictive. Since TaskGroup is a UI concept, tasks can have >>> dependencies on tasks in other TaskGroup or not in any TaskGroup. In my >> PR, >>> this is allowed. The graph edges will update accordingly when TaskGroups >>> are expanded/collapsed. TaskGroup is only helping to make the UI look >> less >>> crowded. Under the hood, everything is still a DAG of tasks and edges so >>> things work normally. Here's a screenshot >>> < >>> >> https://raw.githubusercontent.com/yuqian90/airflow/gif_for_demo/airflow/www/static/screen-shot-short.gif >>>> >>> of the UI interaction. >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> * - Tasks can be added to a TaskGroup - You *can* have dependencies >>> between Tasks in the same TaskGroup, but *cannot* have dependencies >>> between a Task in a TaskGroup and either a Task in a different >> TaskGroup >>> or a Task not in any group - You *can* have dependencies between a >>> TaskGroup and either other TaskGroups or Tasks not in any group - The >>> UI will by default render a TaskGroup as a single "object", but which >> you >>> expand or zoom into in some way - You'd need some way to determine what >>> the "status" of a TaskGroup was at least for UI display purposes* >>> >>> >>> Regarding Jake's comment, I agree it's possible to implement the >> "retrying >>> tasks in a group" pattern he mentioned as an optional feature of >> TaskGroup >>> although that may go against having TaskGroup as a pure UI concept. For >> the >>> motivating example Jake provided, I suggest implementing both >>> SubmitLongRunningJobTask and PollJobStatusSensor in a single operator. It >>> can do something like BaseSensorOperator.execute() does in "reschedule" >>> mode, i.e. it first executes some code to submit the long running job to >>> the external service, and store the state (e.g. in XCom). Then reschedule >>> itself. Subsequent runs then pokes for the completion state. >>> >>> >>> On Thu, Aug 6, 2020 at 2:08 AM Jacob Ferriero >> <jferri...@google.com.invalid >>>> >>> wrote: >>> >>>> I really like this idea of a TaskGroup container as I think this will >> be >>>> much easier to use than SubDag. >>>> >>>> I'd like to propose an optional behavior for special retry mechanics >> via >>> a >>>> TaskGroup.retry_all property. >>>> This way I could use TaskGroup to replace my favorite use of SubDag for >>>> atomically retrying tasks of the pattern "act on external state then >>>> reschedule poll until desired state reached". >>>> >>>> Motivating use case I have for a SubDag is very simple two task group >>>> [SubmitLongRunningJobTask >> PollJobStatusSensor]. >>>> I use SubDag is because it gives me an easy way to retry the >>> SubmitJobTask >>>> if something about the PollJobSensor fails. >>>> This pattern would be really nice for jobs that are expected to run a >>> long >>>> time (because we can use sensor can use reschedule mode freeing up >> slots) >>>> but might fail for a retryable reason. >>>> However, using SubDag to meet this use case defeats the purpose because >>>> SubDag infamously >>>> < >>>> >>> >> https://medium.com/@team_24989/fixing-subdagoperator-deadlock-in-airflow-6c64312ebb10 >>>>> >>>> blocks a "controller" slot for the entire duration. >>>> This may feel like a cyclic behavior but reality it is very common for >> a >>>> single operator to submit job / wait til done. >>>> We could use this case refactor many operators (e.g. BQ, Dataproc, >>>> Dataflow) to be implemented as TaskGroup[SubmitTask >> PollTask] with >> an >>>> optional reschedule mode if user knows that this job may take a long >>> time. >>>> >>>> I'd be happy to the development work on adding this specific retry >>> behavior >>>> to TaskGroup once the base concept is implemented if others in the >>>> community would find this a useful feature. >>>> >>>> Cheers, >>>> Jake >>>> >>>> On Tue, Aug 4, 2020 at 10:07 AM Jarek Potiuk <jarek.pot...@polidea.com >>> >>>> wrote: >>>> >>>>> All for it :) . I think we are getting closer to have regular >> planning >>>> and >>>>> making some structured approach to 2.0 and starting task force for it >>>> soon, >>>>> so I think this should be perfectly fine to discuss and even start >>>>> implementing what's beyond as soon as we make sure that we are >>>> prioritizing >>>>> 2.0 work. >>>>> >>>>> J, >>>>> >>>>> >>>>> On Tue, Aug 4, 2020 at 12:09 PM Yu Qian <yuqian1...@gmail.com> >> wrote: >>>>> >>>>>> Hi Jarek, >>>>>> >>>>>> I agree we should not change the behaviour of the existing >>>> SubDagOperator >>>>>> till Airflow 2.1. Is it okay to continue the discussion about >>> TaskGroup >>>>> as >>>>>> a brand new concept/feature independent from the existing >>>> SubDagOperator? >>>>>> In other words, shall we add TaskGroup as a UI grouping concept >> like >>>> Ash >>>>>> suggested, and not touch SubDagOperator atl all. Whenever we are >>> ready >>>>> with >>>>>> TaskGroup, we then deprecate SubDagOperator in Airflow 2.1. >>>>>> >>>>>> I really like Ash's idea of simplifying the SubDagOperator idea >> into >>> a >>>>>> simple UI grouping concept. I think Xinbin's idea of "reattaching >> all >>>> the >>>>>> tasks to the root DAG" is the way to go. And I see James pointed >> out >>> we >>>>>> need some helper functions to simplify dependencies setting of >>>> TaskGroup. >>>>>> Xinbin put up a pretty elegant example in his PR >>>>>> <https://github.com/apache/airflow/pull/9243>. I think having >>>> TaskGroup >>>>> as >>>>>> a UI concept should be a relatively small change. We can simplify >>>>> Xinbin's >>>>>> PR further. So I put up this alternative proposal here: >>>>>> https://github.com/apache/airflow/pull/10153 >>>>>> >>>>>> I have not done any UI changes due to lack of experience with web >> UI. >>>> If >>>>>> anyone's interested, please take a look at the PR. >>>>>> >>>>>> Qian >>>>>> >>>>>> On Mon, Jun 22, 2020 at 5:15 AM Jarek Potiuk < >>> jarek.pot...@polidea.com >>>>> >>>>>> wrote: >>>>>> >>>>>>> Similar point here to the other ideas that are popping up. Maybe >> we >>>>>> should >>>>>>> just focus on completing 2.0 and make all discussions about >> further >>>>>>> improvements to 2.1? While those are important discussions (and >> we >>>>> should >>>>>>> continue them in the near future !) I think at this point >> focusing >>>> on >>>>>>> delivering 2.0 in its current shape should be our focus now ? >>>>>>> >>>>>>> J. >>>>>>> >>>>>>> On Thu, Jun 18, 2020 at 6:35 PM Xinbin Huang < >>> bin.huan...@gmail.com> >>>>>>> wrote: >>>>>>> >>>>>>>> Hi Daniel >>>>>>>> >>>>>>>> I agree that the TaskGroup should have the same API as a DAG >>> object >>>>>>> related >>>>>>>> to task dependencies, but it will not have anything related to >>>> actual >>>>>>>> execution or scheduling. >>>>>>>> I will update the AIP according to this over the weekend. >>>>>>>> >>>>>>>>> We could even make a “DAGTemplate” object s.t. when you >> import >>>> the >>>>>>> object >>>>>>>> you can import it with parameters to determine the shape of the >>>> DAG. >>>>>>>> >>>>>>>> Can you elaborate a bit more on this? Does it serve a similar >>>> purpose >>>>>> as >>>>>>> a >>>>>>>> DAG factory function? >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On Thu, Jun 18, 2020 at 9:13 AM Daniel Imberman < >>>>>>> daniel.imber...@gmail.com >>>>>>>>> >>>>>>>> wrote: >>>>>>>> >>>>>>>>> Hi Bin, >>>>>>>>> >>>>>>>>> Why not give the TaskGroup the same API as a DAG object (e.g. >>> the >>>>>>> bitwise >>>>>>>>> operator fro task dependencies). We could even make a >>>> “DAGTemplate” >>>>>>>> object >>>>>>>>> s.t. when you import the object you can import it with >>> parameters >>>>> to >>>>>>>>> determine the shape of the DAG. >>>>>>>>> >>>>>>>>> >>>>>>>>> On Wed, Jun 17, 2020 at 8:54 PM, Xinbin Huang < >>>>> bin.huan...@gmail.com >>>>>>> >>>>>>>>> wrote: >>>>>>>>> The TaskGroup will not take schedule interval as a parameter >>>>> itself, >>>>>>> and >>>>>>>> it >>>>>>>>> depends on the DAG where it attaches to. In my opinion, the >>>>> TaskGroup >>>>>>>> will >>>>>>>>> only contain a group of tasks with interdependencies, and the >>>>>> TaskGroup >>>>>>>>> behaves like a task. It doesn't contain any >>> execution/scheduling >>>>>> logic >>>>>>>>> (i.e. schedule_interval, concurrency, max_active_runs etc.) >>> like >>>> a >>>>>> DAG >>>>>>>>> does. >>>>>>>>> >>>>>>>>>> For example, there is the scenario that the schedule >> interval >>>> of >>>>>> DAG >>>>>>> is >>>>>>>>> 1 hour and the schedule interval of TaskGroup is 20 min. >>>>>>>>> >>>>>>>>> I am curious why you ask this. Is this a use case that you >> want >>>> to >>>>>>>> achieve? >>>>>>>>> >>>>>>>>> Bin >>>>>>>>> >>>>>>>>> On Wed, Jun 17, 2020 at 7:59 PM 蒋晓峰 < >> thanosxnicho...@gmail.com >>>> >>>>>> wrote: >>>>>>>>> >>>>>>>>>> Hi Bin, >>>>>>>>>> Using TaskGroup, Is the schedule interval of TaskGroup the >>> same >>>>> as >>>>>>> the >>>>>>>>>> parent DAG? My main concern is whether the schedule >> interval >>> of >>>>>>>> TaskGroup >>>>>>>>>> could be different with that of the DAG? For example, there >>> is >>>>> the >>>>>>>>> scenario >>>>>>>>>> that the schedule interval of DAG is 1 hour and the >> schedule >>>>>> interval >>>>>>>> of >>>>>>>>>> TaskGroup is 20 min. >>>>>>>>>> >>>>>>>>>> Cheers, >>>>>>>>>> Nicholas >>>>>>>>>> >>>>>>>>>> On Thu, Jun 18, 2020 at 10:30 AM Xinbin Huang < >>>>>> bin.huan...@gmail.com >>>>>>>> >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>>> Hi Nicholas, >>>>>>>>>>> >>>>>>>>>>> I am not sure about the old behavior of SubDagOperator, >>> maybe >>>>> it >>>>>>> will >>>>>>>>>> throw >>>>>>>>>>> an error? But in the original proposal, the subdag's >>>>>>>> schedule_interval >>>>>>>>>> will >>>>>>>>>>> be ignored. Or if we decide to use TaskGroup to replace >>>> SubDag, >>>>>>> there >>>>>>>>>> will >>>>>>>>>>> be no subdag schedule_interval. >>>>>>>>>>> >>>>>>>>>>> Bin >>>>>>>>>>> >>>>>>>>>>> On Wed, Jun 17, 2020 at 6:21 PM 蒋晓峰 < >>>> thanosxnicho...@gmail.com >>>>>> >>>>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>>> Hi Bin, >>>>>>>>>>>> Thanks for your good proposal. I was confused whether >> the >>>>>>> schedule >>>>>>>>>>>> interval of SubDAG is different from that of the parent >>>> DAG? >>>>> I >>>>>>> have >>>>>>>>>>>> discussed with Jiajie Zhong about the schedule interval >>> of >>>>>>> SubDAG. >>>>>>>> If >>>>>>>>>> the >>>>>>>>>>>> SubDagOperator has a different schedule interval, what >>> will >>>>>>> happen >>>>>>>>> for >>>>>>>>>>> the >>>>>>>>>>>> scheduler to schedule the parent DAG? >>>>>>>>>>>> >>>>>>>>>>>> Regards, >>>>>>>>>>>> Nicholas Jiang >>>>>>>>>>>> >>>>>>>>>>>> On Thu, Jun 18, 2020 at 8:04 AM Xinbin Huang < >>>>>>>> bin.huan...@gmail.com> >>>>>>>>>>>> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> Thank you, Max, Kaxil, and everyone's feedback! >>>>>>>>>>>>> >>>>>>>>>>>>> I have rethought about the concept of subdag and task >>>>>> groups. I >>>>>>>>> think >>>>>>>>>>> the >>>>>>>>>>>>> better way to approach this is to entirely remove >>> subdag >>>>> and >>>>>>>>>> introduce >>>>>>>>>>>> the >>>>>>>>>>>>> concept of TaskGroup, which is a container of tasks >>> along >>>>>> with >>>>>>>>> their >>>>>>>>>>>>> dependencies *without execution/scheduling logic as a >>>> DAG*. >>>>>> The >>>>>>>>> only >>>>>>>>>>>>> purpose of it is to group a list of tasks, but you >>> still >>>>> need >>>>>>> to >>>>>>>>> add >>>>>>>>>> it >>>>>>>>>>>> to >>>>>>>>>>>>> a DAG for execution. >>>>>>>>>>>>> >>>>>>>>>>>>> Here is a small code snippet. >>>>>>>>>>>>> >>>>>>>>>>>>> ``` >>>>>>>>>>>>> class TaskGroup: >>>>>>>>>>>>> """ >>>>>>>>>>>>> A TaskGroup contains a group of tasks. >>>>>>>>>>>>> >>>>>>>>>>>>> If default_args is missing, it will take default args >>>> from >>>>>> the >>>>>>>>>> DAG. >>>>>>>>>>>>> """ >>>>>>>>>>>>> def __init__(self, group_id, default_args): >>>>>>>>>>>>> pass >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> """ >>>>>>>>>>>>> You can add tasks to a task group similar to adding >>> tasks >>>>> to >>>>>> a >>>>>>>> DAG >>>>>>>>>>>>> >>>>>>>>>>>>> This can be declared in a separate file from the dag >>> file >>>>>>>>>>>>> """ >>>>>>>>>>>>> download_group = TaskGroup(group_id='download', >>>>>>>>>>>> default_args=default_args) >>>>>>>>>>>>> download_group.add_task(task1) >>>>>>>>>>>>> task2.dag = download_group >>>>>>>>>>>>> >>>>>>>>>>>>> with download_group: >>>>>>>>>>>>> task3 = DummyOperator(task_id='task3') >>>>>>>>>>>>> >>>>>>>>>>>>> [task, task2] >> task3 >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> """Add it to a DAG for execution""" >>>>>>>>>>>>> with DAG(dag_id='start_download_dag', >>>>>>> default_args=default_args, >>>>>>>>>>>>> schedule_interval='@daily', ...) as dag: >>>>>>>>>>>>> start = DummyOperator(task_id='start') >>>>>>>>>>>>> start >> download_group >>>>>>>>>>>>> # this is equivalent to >>>>>>>>>>>>> # start >> [task, task2] >> task3 >>>>>>>>>>>>> ``` >>>>>>>>>>>>> >>>>>>>>>>>>> With this, we can still reuse a group of tasks and >> set >>>>>>>> dependencies >>>>>>>>>>>> between >>>>>>>>>>>>> them; it avoids the boilerplate code from using >>>>>> SubDagOperator, >>>>>>>> and >>>>>>>>>> we >>>>>>>>>>>> can >>>>>>>>>>>>> declare dependencies as `task >> task_group >> task`. >>>>>>>>>>>>> >>>>>>>>>>>>> User migration wise, we can introduce it before >> Airflow >>>> 2.0 >>>>>> and >>>>>>>>> allow >>>>>>>>>>>>> gradual transition. Then we can decide if we still >> want >>>> to >>>>>> keep >>>>>>>> the >>>>>>>>>>>>> SubDagOperator or simply remove it. >>>>>>>>>>>>> >>>>>>>>>>>>> Any thoughts? >>>>>>>>>>>>> >>>>>>>>>>>>> Cheers, >>>>>>>>>>>>> Bin >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On Wed, Jun 17, 2020 at 7:37 AM Maxime Beauchemin < >>>>>>>>>>>>> maximebeauche...@gmail.com> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> +1, proposal looks good. >>>>>>>>>>>>>> >>>>>>>>>>>>>> The original intention was really to have tasks >>> groups >>>>> and >>>>>> a >>>>>>>>>>>> zoom-in/out >>>>>>>>>>>>> in >>>>>>>>>>>>>> the UI. The original reasoning was to reuse the DAG >>>>> object >>>>>>>> since >>>>>>>>> it >>>>>>>>>>> is >>>>>>>>>>>> a >>>>>>>>>>>>>> group of tasks, but as highlighted here it does >>> create >>>>>>>> underlying >>>>>>>>>>>>>> confusions since a DAG is much more than just a >> group >>>> of >>>>>>> tasks. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Max >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Mon, Jun 15, 2020 at 2:43 AM Poornima Joshi < >>>>>>>>>>>>> joshipoornim...@gmail.com> >>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thank you for your email. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Sat, Jun 13, 2020 at 12:18 AM Xinbin Huang < >>>>>>>>>>> bin.huan...@gmail.com >>>>>>>>>>>>> >>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> - *Unpack SubDags during dag parsing*: This >>>>>> rewrites >>>>>>>> the >>>>>>>>>>>>>>>> *DagBag.bag_dag* >>>>>>>>>>>>>>>>>> method to unpack subdag while parsing, and >> it >>>>> will >>>>>>>> give a >>>>>>>>>>>> flat >>>>>>>>>>>>>>>>>> structure at >>>>>>>>>>>>>>>>>> the task level >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> The serialized_dag representation already >> does >>>>> this I >>>>>>>>> think. >>>>>>>>>> At >>>>>>>>>>>>> least >>>>>>>>>>>>>>> if >>>>>>>>>>>>>>>>> I've understood your idea here correctly. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I am not sure about serialized_dag >>> representation, >>>>> but >>>>>> at >>>>>>>>> least >>>>>>>>>>> it >>>>>>>>>>>>> will >>>>>>>>>>>>>>>> still keep the subdag entry in the DAG table? >> In >>> my >>>>>>>> proposal >>>>>>>>> as >>>>>>>>>>>> also >>>>>>>>>>>>> in >>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>> draft PR, the idea is to *extract the tasks >> from >>>> the >>>>>>> subdag >>>>>>>>> and >>>>>>>>>>> add >>>>>>>>>>>>>> them >>>>>>>>>>>>>>>> back to the root_dag. *So the runtime DAG graph >>>> will >>>>>> look >>>>>>>>>> exactly >>>>>>>>>>>> the >>>>>>>>>>>>>>>> same as without subdag but with metadata >> attached >>>> to >>>>>>> those >>>>>>>>>>>> sections. >>>>>>>>>>>>>>> These >>>>>>>>>>>>>>>> metadata will be later on used to render in the >>> UI. >>>>> So >>>>>>>> after >>>>>>>>>>>> parsing >>>>>>>>>>>>> ( >>>>>>>>>>>>>>>> *DagBag.process_file()*), it will just output >> the >>>>>>> *root_dag >>>>>>>>>>>> *instead >>>>>>>>>>>>> of >>>>>>>>>>>>>>> *root_dag + >>>>>>>>>>>>>>>> subdag + subdag + nested subdag* etc. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> - e.g. section-1-* will have metadata >>>>>>>>>> current_group=section-1, >>>>>>>>>>>>>>>> parent_group=<the-root-dag-id> (welcome for >>> naming >>>>>>>>>>> suggestions), >>>>>>>>>>>>> the >>>>>>>>>>>>>>>> reason for parent_group is that we can have >>> nested >>>>>> group >>>>>>>> and >>>>>>>>>>>> still >>>>>>>>>>>>>> be >>>>>>>>>>>>>>>> able to capture the dependency. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Runtime DAG: >>>>>>>>>>>>>>>> [image: image.png] >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> While at the UI, what we see would be something >>>> like >>>>>> this >>>>>>>> by >>>>>>>>>>>>> utilizing >>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>> metadata, and then we can expand or zoom into >> in >>>> some >>>>>>> way. >>>>>>>>>>>>>>>> [image: image.png] >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> The benefits I can see is that: >>>>>>>>>>>>>>>> 1. We don't need to deal with the extra >>> complexity >>>> of >>>>>>>> SubDag >>>>>>>>>> for >>>>>>>>>>>>>>> execution >>>>>>>>>>>>>>>> and scheduling. It will be the same as not >> using >>>>>> SubDag. >>>>>>>>>>>>>>>> 2. Still have the benefits of modularized and >>>>> reusable >>>>>>> dag >>>>>>>>> code >>>>>>>>>>> and >>>>>>>>>>>>>>>> declare dependencies between them. And with the >>> new >>>>>>>>>>> SubDagOperator >>>>>>>>>>>>> (see >>>>>>>>>>>>>>> AIP >>>>>>>>>>>>>>>> or draft PR), we can use the same dag_factory >>>>> function >>>>>>> for >>>>>>>>>>>>> generating 1 >>>>>>>>>>>>>>>> dag, a lot of dynamic dags, or used for SubDag >>> (in >>>>> this >>>>>>>> case, >>>>>>>>>> it >>>>>>>>>>>> will >>>>>>>>>>>>>>> just >>>>>>>>>>>>>>>> extract all underlying tasks and append to the >>> root >>>>>> dag). >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> - Then it gets to the idea of replacing subdag >>>> with a >>>>>>>>>> simpler >>>>>>>>>>>>>> concept >>>>>>>>>>>>>>>> by Ash: the proposed change basically drains >> out >>>> the >>>>>>>>>> contents >>>>>>>>>>>> of >>>>>>>>>>>>> a >>>>>>>>>>>>>>> SubDag >>>>>>>>>>>>>>>> and becomes more like >>>>>>>>>>>> ExtractSubdagTasksAndAppendToRootdagOperator >>>>>>>>>>>>>>> (forgive >>>>>>>>>>>>>>>> me about the crazy name..). In this case, it is >>>> still >>>>>>>>>>> necessary >>>>>>>>>>>> to >>>>>>>>>>>>>>> keep the >>>>>>>>>>>>>>>> concept of subdag as it is nothing more than a >>>> name? >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> That's why the TaskGroup idea comes up. Thanks >>>> Chris >>>>>>> Palmer >>>>>>>>> for >>>>>>>>>>>>> helping >>>>>>>>>>>>>>>> conceptualize the functionality of TaskGroup, I >>>> will >>>>>> just >>>>>>>>> paste >>>>>>>>>>> it >>>>>>>>>>>>>> here. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> - Tasks can be added to a TaskGroup >>>>>>>>>>>>>>>>> - You *can* have dependencies between Tasks >> in >>>> the >>>>>> same >>>>>>>>>>>> TaskGroup, >>>>>>>>>>>>>> but >>>>>>>>>>>>>>>>> *cannot* have dependencies between a Task in >> a >>>>>>> TaskGroup >>>>>>>>>> and >>>>>>>>>>>>>> either a >>>>>>>>>>>>>>>>> Task in a different TaskGroup or a Task not >> in >>>> any >>>>>>> group >>>>>>>>>>>>>>>>> - You *can* have dependencies between a >>> TaskGroup >>>>> and >>>>>>>>>> either >>>>>>>>>>>>> other >>>>>>>>>>>>>>>>> TaskGroups or Tasks not in any group >>>>>>>>>>>>>>>>> - The UI will by default render a TaskGroup >> as >>> a >>>>>> single >>>>>>>>>>>> "object", >>>>>>>>>>>>>> but >>>>>>>>>>>>>>>>> which you expand or zoom into in some way >>>>>>>>>>>>>>>>> - You'd need some way to determine what the >>>>> "status" >>>>>>> of a >>>>>>>>>>>>> TaskGroup >>>>>>>>>>>>>>> was >>>>>>>>>>>>>>>>> at least for UI display purposes >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I agree with Chris: >>>>>>>>>>>>>>>> - From the backend's view (scheduler & >>> executor), I >>>>>> think >>>>>>>>>>> TaskGroup >>>>>>>>>>>>>>> should >>>>>>>>>>>>>>>> be ignored during execution. (unless we decide >> to >>>>>>> implement >>>>>>>>>> some >>>>>>>>>>>>>> metadata >>>>>>>>>>>>>>>> operations that allows start/stop a group of >>> tasks >>>>>> etc.) >>>>>>>>>>>>>>>> - From the UI's View, it should be able to pick >>> up >>>>> the >>>>>>>>>> individual >>>>>>>>>>>>>> tasks' >>>>>>>>>>>>>>>> status and then determine the TaskGroup's >> status >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Bin >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On Fri, Jun 12, 2020 at 10:28 AM Daniel >> Imberman >>> < >>>>>>>>>>>>>>>> daniel.imber...@gmail.com> wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I hadn’t thought about using the `>>` operator >>> to >>>>> tie >>>>>>> dags >>>>>>>>>>>> together >>>>>>>>>>>>>> but >>>>>>>>>>>>>>> I >>>>>>>>>>>>>>>>> think that sounds pretty great! I wonder if we >>>> could >>>>>>>>>> essentially >>>>>>>>>>>>> write >>>>>>>>>>>>>>> in >>>>>>>>>>>>>>>>> the ability to set dependencies to all >>>> starter-tasks >>>>>> for >>>>>>>>> that >>>>>>>>>>> DAG. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I’m personally ok with SubDag being a mostly >> UI >>>>>> concept. >>>>>>>> It >>>>>>>>>>>> doesn’t >>>>>>>>>>>>>> need >>>>>>>>>>>>>>>>> to execute separately, you’re just adding more >>>> tasks >>>>>> to >>>>>>>> the >>>>>>>>>>> queue >>>>>>>>>>>>> that >>>>>>>>>>>>>>> will >>>>>>>>>>>>>>>>> be executed when there are resources >> available. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> via Newton Mail [ >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> >> https://cloudmagic.com/k/d/mailapp?ct=dx&cv=10.0.50&pv=10.14.6&source=email_footer_2 >>>>>>>>>>>>>>>>> ] >>>>>>>>>>>>>>>>> On Fri, Jun 12, 2020 at 9:45 AM, Chris Palmer >> < >>>>>>>>>>> ch...@crpalmer.com >>>>>>>>>>>>> >>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>> I agree that SubDAGs are an overly complex >>>>>> abstraction. >>>>>>> I >>>>>>>>>> think >>>>>>>>>>>> what >>>>>>>>>>>>>> is >>>>>>>>>>>>>>>>> needed/useful is a TaskGroup concept. On a >> high >>>>> level >>>>>> I >>>>>>>>> think >>>>>>>>>>> you >>>>>>>>>>>>> want >>>>>>>>>>>>>>>>> this >>>>>>>>>>>>>>>>> functionality: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> - Tasks can be added to a TaskGroup >>>>>>>>>>>>>>>>> - You *can* have dependencies between Tasks in >>> the >>>>>> same >>>>>>>>>>> TaskGroup, >>>>>>>>>>>>> but >>>>>>>>>>>>>>>>> *cannot* have dependencies between a Task in a >>>>>> TaskGroup >>>>>>>> and >>>>>>>>>>>> either >>>>>>>>>>>>> a >>>>>>>>>>>>>>>>> Task in a different TaskGroup or a Task not in >>> any >>>>>> group >>>>>>>>>>>>>>>>> - You *can* have dependencies between a >>> TaskGroup >>>>> and >>>>>>>> either >>>>>>>>>>> other >>>>>>>>>>>>>>>>> TaskGroups or Tasks not in any group >>>>>>>>>>>>>>>>> - The UI will by default render a TaskGroup >> as a >>>>>> single >>>>>>>>>>> "object", >>>>>>>>>>>>> but >>>>>>>>>>>>>>>>> which you expand or zoom into in some way >>>>>>>>>>>>>>>>> - You'd need some way to determine what the >>>> "status" >>>>>> of >>>>>>> a >>>>>>>>>>>> TaskGroup >>>>>>>>>>>>>> was >>>>>>>>>>>>>>>>> at least for UI display purposes >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Not sure if it would need to be a top level >>> object >>>>>> with >>>>>>>> its >>>>>>>>>> own >>>>>>>>>>>>>> database >>>>>>>>>>>>>>>>> table and model or just another attribute on >>>> tasks. >>>>> I >>>>>>>> think >>>>>>>>>> you >>>>>>>>>>>>> could >>>>>>>>>>>>>>>>> build >>>>>>>>>>>>>>>>> it in a way such that from the schedulers >> point >>> of >>>>>> view >>>>>>> a >>>>>>>>> DAG >>>>>>>>>>> with >>>>>>>>>>>>>>>>> TaskGroups doesn't get treated any >> differently. >>> So >>>>> it >>>>>>>> really >>>>>>>>>>> just >>>>>>>>>>>>>>> becomes >>>>>>>>>>>>>>>>> a >>>>>>>>>>>>>>>>> shortcut for setting dependencies between sets >>> of >>>>>> Tasks, >>>>>>>> and >>>>>>>>>>>> allows >>>>>>>>>>>>>> the >>>>>>>>>>>>>>> UI >>>>>>>>>>>>>>>>> to simplify the render of the DAG structure. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Chris >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On Fri, Jun 12, 2020 at 12:12 PM Dan Davydov >>>>>>>>>>>>>>> <ddavy...@twitter.com.invalid >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Agree with James (and think it's actually >> the >>>> more >>>>>>>>> important >>>>>>>>>>>> issue >>>>>>>>>>>>>> to >>>>>>>>>>>>>>>>> fix), >>>>>>>>>>>>>>>>>> but I am still convinced Ash' idea is the >>> right >>>>> way >>>>>>>>> forward >>>>>>>>>>>> (just >>>>>>>>>>>>> it >>>>>>>>>>>>>>>>> might >>>>>>>>>>>>>>>>>> require a bit more work to deprecate than >>> adding >>>>>>> visual >>>>>>>>>>> grouping >>>>>>>>>>>>> in >>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>> UI). >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> There was a previous thread about this FYI >>> with >>>>> more >>>>>>>>> context >>>>>>>>>>> on >>>>>>>>>>>>> why >>>>>>>>>>>>>>>>> subdags >>>>>>>>>>>>>>>>>> are bad and potential solutions: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>> >> https://www.mail-archive.com/dev@airflow.apache.org/msg01202.html >>>>>>>>>>>>>> . A >>>>>>>>>>>>>>>>>> solution I outline there to Jame's problem >> is >>>> e.g. >>>>>>>>> enabling >>>>>>>>>>> the >>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> operator >>>>>>>>>>>>>>>>>> for Airflow operators to work with DAGs as >>>> well. I >>>>>> see >>>>>>>>> this >>>>>>>>>>>> being >>>>>>>>>>>>>>>>> separate >>>>>>>>>>>>>>>>>> from Ash' solution for DAG grouping in the >> UI >>>> but >>>>>> one >>>>>>> of >>>>>>>>> the >>>>>>>>>>> two >>>>>>>>>>>>>> items >>>>>>>>>>>>>>>>>> required to replace all existing subdag >>>>>> functionality. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> I've been working with subdags for 3 years >> and >>>>> they >>>>>>> are >>>>>>>>>>> always a >>>>>>>>>>>>>> giant >>>>>>>>>>>>>>>>> pain >>>>>>>>>>>>>>>>>> to use. They are a constant source of user >>>>> confusion >>>>>>> and >>>>>>>>>>>> breakages >>>>>>>>>>>>>>>>> during >>>>>>>>>>>>>>>>>> upgrades. Would love to see them gone :). >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> On Fri, Jun 12, 2020 at 11:11 AM James >> Coder < >>>>>>>>>>>> jcode...@gmail.com> >>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> I'm not sure I totally agree it's just a >> UI >>>>>>> concept. I >>>>>>>>> use >>>>>>>>>>> the >>>>>>>>>>>>>>> subdag >>>>>>>>>>>>>>>>>>> operator to simplify dependencies too. If >>> you >>>>>> have a >>>>>>>>> group >>>>>>>>>>> of >>>>>>>>>>>>>> tasks >>>>>>>>>>>>>>>>> that >>>>>>>>>>>>>>>>>>> need to finish before another group of >> tasks >>>>>> start, >>>>>>>>> using >>>>>>>>>> a >>>>>>>>>>>>> subdag >>>>>>>>>>>>>>> is >>>>>>>>>>>>>>>>> a >>>>>>>>>>>>>>>>>>> pretty quick way to set those dependencies >>>> and I >>>>>>> think >>>>>>>>>> also >>>>>>>>>>>> make >>>>>>>>>>>>>> it >>>>>>>>>>>>>>>>>> easier >>>>>>>>>>>>>>>>>>> to follow the dag code. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> On Fri, Jun 12, 2020 at 9:53 AM Kyle >> Hamlin >>> < >>>>>>>>>>>>> hamlin...@gmail.com> >>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> I second Ash’s grouping concept. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> On Fri, Jun 12, 2020 at 5:10 AM Ash >>>>>> Berlin-Taylor >>>>>>> < >>>>>>>>>>>>>> a...@apache.org >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Question: >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Do we even need the SubDagOperator >>>> anymore? >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Would removing it entirely and just >>>>> replacing >>>>>> it >>>>>>>>> with >>>>>>>>>> a >>>>>>>>>>> UI >>>>>>>>>>>>>>>>> grouping >>>>>>>>>>>>>>>>>>>>> concept be conceptually simpler, less >> to >>>> get >>>>>>>> wrong, >>>>>>>>>> and >>>>>>>>>>>>> closer >>>>>>>>>>>>>>> to >>>>>>>>>>>>>>>>>> what >>>>>>>>>>>>>>>>>>>>> users actually want to achieve with >>>> subdags? >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> With your proposed change, tasks in >>>> subdags >>>>>>> could >>>>>>>>>> start >>>>>>>>>>>>>> running >>>>>>>>>>>>>>> in >>>>>>>>>>>>>>>>>>>>> parallel (a good change) -- so should >> we >>>> not >>>>>>> also >>>>>>>>> just >>>>>>>>>>>>>>> _enitrely_ >>>>>>>>>>>>>>>>>>> remove >>>>>>>>>>>>>>>>>>>>> the concept of a sub dag and replace >> it >>>> with >>>>>>>>> something >>>>>>>>>>>>>> simpler. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Problems with subdags (I think. I >>> haven't >>>>> used >>>>>>>> them >>>>>>>>>>>>>> extensively >>>>>>>>>>>>>>> so >>>>>>>>>>>>>>>>>> may >>>>>>>>>>>>>>>>>>>>> be wrong on some of these): >>>>>>>>>>>>>>>>>>>>> - They need their own dag_id, but it >>>> has(?) >>>>> to >>>>>>> be >>>>>>>> of >>>>>>>>>> the >>>>>>>>>>>>> form >>>>>>>>>>>>>>>>>>>>> `parent_dag_id.subdag_id`. >>>>>>>>>>>>>>>>>>>>> - They need their own >> schedule_interval, >>>> but >>>>>> it >>>>>>>> has >>>>>>>>> to >>>>>>>>>>>> match >>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>> parent >>>>>>>>>>>>>>>>>>>> dag >>>>>>>>>>>>>>>>>>>>> - Sub dags can be paused on their own. >>>> (Does >>>>>> it >>>>>>>> make >>>>>>>>>>> sense >>>>>>>>>>>>> to >>>>>>>>>>>>>> do >>>>>>>>>>>>>>>>>> this? >>>>>>>>>>>>>>>>>>>>> Pausing just a sub dag would mean the >>> sub >>>>> dag >>>>>>>> would >>>>>>>>>>> never >>>>>>>>>>>>>>>>> execute, so >>>>>>>>>>>>>>>>>>>>> the SubDagOperator would fail too. >>>>>>>>>>>>>>>>>>>>> - You had to choose the executor to >>>>> operator a >>>>>>>>> subdag >>>>>>>>>>> with >>>>>>>>>>>>> -- >>>>>>>>>>>>>>>>> always >>>>>>>>>>>>>>>>>> a >>>>>>>>>>>>>>>>>>>>> bit of a kludge. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Thoughts? >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> -ash >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> On Jun 12 2020, at 12:01 pm, Ash >>>>>> Berlin-Taylor < >>>>>>>>>>>>>> a...@apache.org> >>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Workon sub-dags is much needed, I'm >>>>> excited >>>>>> to >>>>>>>> see >>>>>>>>>> how >>>>>>>>>>>>> this >>>>>>>>>>>>>>>>>>> progresses. >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> - *Unpack SubDags during dag >>> parsing*: >>>>> This >>>>>>>>>> rewrites >>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>> *DagBag.bag_dag* >>>>>>>>>>>>>>>>>>>>>>> method to unpack subdag while >>> parsing, >>>>> and >>>>>> it >>>>>>>>> will >>>>>>>>>>>> give a >>>>>>>>>>>>>>> flat >>>>>>>>>>>>>>>>>>>>>>> structure at >>>>>>>>>>>>>>>>>>>>>>> the task level >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> The serialized_dag representation >>>> already >>>>>> does >>>>>>>>> this >>>>>>>>>> I >>>>>>>>>>>>> think. >>>>>>>>>>>>>>> At >>>>>>>>>>>>>>>>>> least >>>>>>>>>>>>>>>>>>>> if >>>>>>>>>>>>>>>>>>>>>> I've understood your idea here >>>> correctly. >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> -ash >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> On Jun 12 2020, at 9:51 am, Xinbin >>>> Huang < >>>>>>>>>>>>>>> bin.huan...@gmail.com >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Hi everyone, >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Sending a message to everyone and >>>> collect >>>>>>>>> feedback >>>>>>>>>> on >>>>>>>>>>>> the >>>>>>>>>>>>>>>>> AIP-34 >>>>>>>>>>>>>>>>>> on >>>>>>>>>>>>>>>>>>>>>>> rewriting SubDagOperator. This was >>>>>> previously >>>>>>>>>> briefly >>>>>>>>>>>>>>>>> mentioned in >>>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>>> discussion about what needs to be >>> done >>>>> for >>>>>>>>> Airflow >>>>>>>>>>> 2.0, >>>>>>>>>>>>> and >>>>>>>>>>>>>>>>> one of >>>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>>> ideas is to make SubDagOperator >>> attach >>>>>> tasks >>>>>>>> back >>>>>>>>>> to >>>>>>>>>>>> the >>>>>>>>>>>>>> root >>>>>>>>>>>>>>>>> DAG. >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> This AIP-34 focuses on solving >>>>>> SubDagOperator >>>>>>>>>> related >>>>>>>>>>>>>> issues >>>>>>>>>>>>>>> by >>>>>>>>>>>>>>>>>>>>> reattaching >>>>>>>>>>>>>>>>>>>>>>> all tasks back to the root dag >> while >>>>>>> respecting >>>>>>>>>>>>>> dependencies >>>>>>>>>>>>>>>>>> during >>>>>>>>>>>>>>>>>>>>>>> parsing. The original grouping >> effect >>>> on >>>>>> the >>>>>>> UI >>>>>>>>>> will >>>>>>>>>>> be >>>>>>>>>>>>>>>>> achieved >>>>>>>>>>>>>>>>>>>> through >>>>>>>>>>>>>>>>>>>>>>> grouping related tasks by metadata. >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> This also makes the dag_factory >>>> function >>>>>> more >>>>>>>>>>> reusable >>>>>>>>>>>>>>> because >>>>>>>>>>>>>>>>> you >>>>>>>>>>>>>>>>>>>> don't >>>>>>>>>>>>>>>>>>>>>>> need to have parent_dag_name and >>>>>>> child_dag_name >>>>>>>>> in >>>>>>>>>>> the >>>>>>>>>>>>>>> function >>>>>>>>>>>>>>>>>>>>> signature >>>>>>>>>>>>>>>>>>>>>>> anymore. >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Changes proposed: >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> - *Unpack SubDags during dag >>> parsing*: >>>>> This >>>>>>>>>> rewrites >>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>> *DagBag.bag_dag* >>>>>>>>>>>>>>>>>>>>>>> method to unpack subdag while >>> parsing, >>>>> and >>>>>> it >>>>>>>>> will >>>>>>>>>>>> give a >>>>>>>>>>>>>>> flat >>>>>>>>>>>>>>>>>>>>>>> structure at >>>>>>>>>>>>>>>>>>>>>>> the task level >>>>>>>>>>>>>>>>>>>>>>> - *Simplify SubDagOperator*: The >> new >>>>>>>>> SubDagOperator >>>>>>>>>>>> acts >>>>>>>>>>>>>>> like a >>>>>>>>>>>>>>>>>>>>>>> container and most of the original >>>>> methods >>>>>>> are >>>>>>>>>>> removed. >>>>>>>>>>>>> The >>>>>>>>>>>>>>>>>>>>>>> signature is >>>>>>>>>>>>>>>>>>>>>>> also changed to *subdag_factory >> *with >>>>>>>>> *subdag_args >>>>>>>>>>> *and >>>>>>>>>>>>>>>>>>>>> *subdag_kwargs*. >>>>>>>>>>>>>>>>>>>>>>> This is similar to the >> PythonOperator >>>>>>>> signature. >>>>>>>>>>>>>>>>>>>>>>> - *Add a TaskGroup model and add >>>>>>> current_group >>>>>>>> & >>>>>>>>>>>>>> parent_group >>>>>>>>>>>>>>>>>>>>> attributes >>>>>>>>>>>>>>>>>>>>>>> to BaseOperator*: This metadata is >>> used >>>>> to >>>>>>>> group >>>>>>>>>>> tasks >>>>>>>>>>>>> for >>>>>>>>>>>>>>>>>>>>>>> rendering at >>>>>>>>>>>>>>>>>>>>>>> UI level. It may potentially extend >>>>> further >>>>>>> to >>>>>>>>>> group >>>>>>>>>>>>>>> arbitrary >>>>>>>>>>>>>>>>>>> tasks >>>>>>>>>>>>>>>>>>>>>>> outside the context of subdag to >>> allow >>>>>>>>> group-level >>>>>>>>>>>>>> operations >>>>>>>>>>>>>>>>>>> (i.e. >>>>>>>>>>>>>>>>>>>>>>> stop/trigger a group of task within >>> the >>>>>> dag) >>>>>>>>>>>>>>>>>>>>>>> - *Webserver UI for SubDag*: >> Proposed >>>> UI >>>>>>>>>> modification >>>>>>>>>>>> to >>>>>>>>>>>>>>> allow >>>>>>>>>>>>>>>>>>>>>>> (un)collapse a group of tasks for a >>>> flat >>>>>>>>> structure >>>>>>>>>> to >>>>>>>>>>>>> pair >>>>>>>>>>>>>>> with >>>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>> first >>>>>>>>>>>>>>>>>>>>>>> change instead of the original >>>>> hierarchical >>>>>>>>>>> structure. >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Please see related documents and >> PRs >>>> for >>>>>>>> details: >>>>>>>>>>>>>>>>>>>>>>> AIP: >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> >> https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-34+Rewrite+SubDagOperator >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Original Issue: >>>>>>>>>>>>>>> https://github.com/apache/airflow/issues/8078 >>>>>>>>>>>>>>>>>>>>>>> Draft PR: >>>>>>>>>>> https://github.com/apache/airflow/pull/9243 >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Please let me know if there are any >>>>> aspects >>>>>>>> that >>>>>>>>>> you >>>>>>>>>>>>>>>>>> agree/disagree >>>>>>>>>>>>>>>>>>>>>>> with or >>>>>>>>>>>>>>>>>>>>>>> need more clarification (especially >>> the >>>>>> third >>>>>>>>>> change >>>>>>>>>>>>>>> regarding >>>>>>>>>>>>>>>>>>>>> TaskGroup). >>>>>>>>>>>>>>>>>>>>>>> Any comments are welcome and I am >>>> looking >>>>>>>> forward >>>>>>>>>> to >>>>>>>>>>>> it! >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Cheers >>>>>>>>>>>>>>>>>>>>>>> Bin >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>>>>>> Kyle Hamlin >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>> Thanks & Regards >>>>>>>>>>>>>>> Poornima >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> >>>>>>> Jarek Potiuk >>>>>>> Polidea <https://www.polidea.com/> | Principal Software Engineer >>>>>>> >>>>>>> M: +48 660 796 129 <+48%20660%20796%20129> <+48660796129 >>>>> <+48%20660%20796%20129>> >>>>>>> [image: Polidea] <https://www.polidea.com/> >>>>>>> >>>>>> >>>>> >>>>> >>>>> -- >>>>> >>>>> Jarek Potiuk >>>>> Polidea <https://www.polidea.com/> | Principal Software Engineer >>>>> >>>>> M: +48 660 796 129 <+48%20660%20796%20129> <+48660796129 >>>>> <+48%20660%20796%20129>> >>>>> [image: Polidea] <https://www.polidea.com/> >>>>> >>>> >>>> >>>> -- >>>> >>>> *Jacob Ferriero* >>>> >>>> Strategic Cloud Engineer: Data Engineering >>>> >>>> jferri...@google.com >>>> >>>> 617-714-2509 >>>> >>> >>