Hi Yu Qian,

Sorry, I somehow missed this email 😢 It is great to hear that you are also interested in adding TaskGroup to the Tree View; I hope you still are 😀 I like how the TaskGroup is collapsible in the prototype, though I wonder if we can replicate the zoom-in behavior of SubDag. That might be the most useful SubDag feature for owners of big DAGs.
Cheers,
Kevin Y

On Mon, Apr 19, 2021 at 7:18 AM Yu Qian <yuqian1...@gmail.com> wrote:

> Hi, all,
>
> I'm interested in contributing to adding TaskGroup to Tree View. Here's a
> prototype <https://yuqian90.github.io/task_group_tree/> of how it can
> look. Suggestions are welcome.
>
> I understand AIP-38
> <https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-38+Modern+Web+Application>
> plans to use modern web technology such as React to revamp the Airflow UI.
> I'm not sure where we are on that front. With AIP-38 in mind, if we make
> improvements to pages such as Tree View, it makes a lot of sense to
> contribute them as a reusable TaskInstanceTree component so that it can be
> used in more than just the Tree View page itself. One other example where a
> TaskInstanceTree component can be useful is the confirmation page shown
> when a user clears/marks success/marks failed task instances. Right now the
> confirmation page just concatenates the text representations of all the
> TaskInstances. It's very difficult to read when a lot of task instances are
> cleared or marked. If we put the task instances in the confirmation page
> into a TaskInstanceTree component with TaskGroup support, it can be much
> easier to read. If you have ideas regarding how to contribute to Tree View
> so it's easy to incorporate into AIP-38, definitely let me know.
>
> One
>
> On Tue, Mar 9, 2021 at 1:19 AM Ash Berlin-Taylor <a...@apache.org> wrote:
>
>> I agree, but we should see which of those we can implement just on the
>> parsing side - i.e. can we continue to make the scheduler not have to care
>> about Task Groups?
>>
>> If so, then things like the default args example are a small enough change
>> that they don't need an AIP (IMO).
>>
>> -ash
>>
>> On 8 March 2021 17:12:07 GMT, Daniel Imberman <daniel.imber...@gmail.com>
>> wrote:
>>>
>>> I personally think that TaskGroup should go beyond being “just” a UI
>>> concept. I think that there are a lot of use-cases where people might want
>>> to perform a single operation across an entire group of tasks. Bin points
>>> out a few really good examples (default arguments and group delete). I
>>> also have a proposal coming out, hopefully later this week, that will
>>> offer some more functionality to TaskGroup objects as well.
>>>
>>> I don’t personally see the benefit of keeping them “UI only.” If we want
>>> to be able to group delete or add external sensors to a group of tasks,
>>> we’d basically need to create another concept that centers around “a
>>> grouping of tasks”, which I think might create confusion.
>>>
>>> On Mon, Mar 8, 2021 at 7:19 AM, Yu Qian <yuqian1...@gmail.com> wrote:
>>>
>>> Hi, all, it's really exciting to see the great discussions about
>>> TaskGroup.
>>>
>>> There are some interesting ideas here.
>>> - Tree View support for TaskGroup: I think this can mostly be achieved
>>> at the web layer? Changes probably involve tree.html and www/views.py.
>>> Should we change Tree View to organize tasks based on the TaskGroup
>>> hierarchy (no need to duplicate tasks in Tree View)? Currently the Tree
>>> View is organized into a flattened graph hierarchy, which means the same
>>> task can appear multiple times in Tree View.
>>> - Clear an entire TaskGroup: We should be able to do this in graph.html
>>> and www/views.py too. E.g. the UI passes the group_id of the TaskGroup
>>> to the web server, which then clears the tasks in that TaskGroup. A
>>> TaskGroup is already an iterable of its child tasks, so this should be
>>> possible.
>>> In fact, I've heard from several users that they sometimes want to select
>>> multiple tasks on Graph View with the mouse and then clear all of them at
>>> once. This is actually a very similar problem to clearing a TaskGroup.
>>>
>>> Some other ideas such as default_args and ExternalTaskSensor support
>>> sound good too. We can probably continue the discussion on those individual
>>> issues/PRs.
>>>
>>> On Sun, Mar 7, 2021 at 3:55 AM Xinbin Huang <bin.huan...@gmail.com>
>>> wrote:
>>>
>>>> Hi Kaxil,
>>>>
>>>> One use case I have is to reuse TaskGroup across different DAGs as a
>>>> predefined sub-workflow. For example, my team is currently building out a
>>>> data platform that will allow a certain level of self-serve ability. Users
>>>> of the platform (mostly analysts and scientists) should focus on business
>>>> logic - the transformation part - and shouldn't need to pay much attention
>>>> to standard operations (e.g. from S3 to Redshift staging table - validate
>>>> data - swap to production table), as these types of tasks are boring and
>>>> repetitive. Reusing these sub-workflows also enables us to load data to a
>>>> different destination/warehouse without users needing to change their code.
>>>> We can also have a notification sub-workflow that allows us to swap in and
>>>> out Slack/PagerDuty/etc. over time without impacting the user.
>>>>
>>>> Other use cases:
>>>> - allow default_args at TaskGroup level as in this issue:
>>>> https://github.com/apache/airflow/issues/13911
>>>> - ExternalTaskSensor on TaskGroup as mentioned by Nathan:
>>>> https://github.com/apache/airflow/issues/14563
>>>> - delete an entire TaskGroup:
>>>> https://github.com/apache/airflow/issues/14529
>>>>
>>>> All these use cases go beyond the pure UI level and require operations
>>>> (viewing/triggering/deleting/waiting/etc.) on *a group of tasks.* I
>>>> think we can easily implement/formalize this with the current API without
>>>> changing the backend too much (this PR
>>>> https://github.com/apache/airflow/pull/14640 shows a small example).
>>>>
>>>> What do other people think?
>>>>
>>>> Best
>>>> Bin
>>>>
>>>> On Sat, Mar 6, 2021 at 4:51 AM Kaxil Naik <kaxiln...@gmail.com> wrote:
>>>>
>>>>> Hi all, interesting discussion. I would love to hear about some more
>>>>> use-cases where TaskGroup needs to be something more than a UI concept.
>>>>>
>>>>> All of Kevin's use-cases can be achieved while keeping it as a UI
>>>>> concept. Xinbin, can you please expand a bit on your use case?
>>>>>
>>>>> Regards,
>>>>> Kaxil
>>>>>
>>>>> On Sat, Mar 6, 2021, 10:08 Xinbin Huang <bin.huan...@gmail.com> wrote:
>>>>>
>>>>>> Hi Kevin, Vikram, and Nathan,
>>>>>>
>>>>>> I think we don't need to be too strict about keeping TaskGroup only
>>>>>> as a UI concept. We are already using TaskGroup to author DAGs and create
>>>>>> dependencies, which already lies a bit outside the UI.
>>>>>> To fully replace SubDagOperator, I think it's necessary to expand
>>>>>> TaskGroup into a *container for tasks* rather than just a UI concept.
>>>>>>
>>>>>> As for TaskGroupSensor specifically, I landed on the same approach as
>>>>>> Kevin, and I have created a draft PR here:
>>>>>> https://github.com/apache/airflow/pull/14640
>>>>>>
>>>>>> Cheers
>>>>>> Bin
>>>>>>
>>>>>> On Fri, Mar 5, 2021 at 10:00 PM Kevin Yang <yrql...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi Vikram,
>>>>>>>
>>>>>>> Good point. What I had in mind was getting the TaskGroup definition
>>>>>>> in a sensor, e.g. extracting the _task_group field from the serialized
>>>>>>> DAG and querying the DB for the TI states within.
>>>>>>>
>>>>>>> You are right that it might not be clean, nor does it keep TaskGroup
>>>>>>> as a UI concept.
>>>>>>>
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Kevin Y
>>>>>>>
>>>>>>> On Fri, Mar 5, 2021 at 8:19 PM Vikram Koka
>>>>>>> <vik...@astronomer.io.invalid> wrote:
>>>>>>>
>>>>>>>> Kevin,
>>>>>>>>
>>>>>>>> I am not sure I understand your response to Nathan.
>>>>>>>>
>>>>>>>> I agree that it is also a valid use case, but I don't see how it
>>>>>>>> can be cleanly done while keeping TaskGroup only as a UI concept.
>>>>>>>> Would this require extending the TaskGroup concept to the backend?
>>>>>>>>
>>>>>>>> Best regards,
>>>>>>>> Vikram
>>>>>>>>
>>>>>>>> On Fri, Mar 5, 2021 at 1:31 AM Kevin Yang <yrql...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi Nathan,
>>>>>>>>>
>>>>>>>>> Thanks a lot for your input and it is indeed a valid use case.
>>>>>>>>> This can be done either keeping TaskGroup as a UI concept or
>>>>>>>>> bringing it into the backend. I'm curious to hear what others think.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>> Kevin Y
>>>>>>>>>
>>>>>>>>> On Thu, Mar 4, 2021 at 12:57 AM Nathan Hadfield
>>>>>>>>> <nathan.hadfi...@king.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Kevin,
>>>>>>>>>>
>>>>>>>>>> A quick piece of input from our recent experiences of working
>>>>>>>>>> with TaskGroup is that we often have dependencies across DAGs that
>>>>>>>>>> require waiting upon the completion of all the tasks in a group.
>>>>>>>>>> At the moment, you basically have two options:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> 1. Create a sensor task in a DAG for every task in the group
>>>>>>>>>> 2. Create a Dummy task after the group that a sensor waits on
>>>>>>>>>>
>>>>>>>>>> So, I would certainly like TaskGroups to have some notion of run
>>>>>>>>>> status so as to better enable downstream decision making.
>>>>>>>>>>
>>>>>>>>>> I’ve already created a feature ticket to try to add some kind of
>>>>>>>>>> TaskGroup Sensor but perhaps this can also form part of the wider
>>>>>>>>>> discussions here.
>>>>>>>>>>
>>>>>>>>>> https://github.com/apache/airflow/issues/14563
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>>
>>>>>>>>>> Nathan
>>>>>>>>>>
>>>>>>>>>> *From: *Kevin Yang <yrql...@gmail.com>
>>>>>>>>>> *Date: *Thursday, 4 March 2021 at 05:21
>>>>>>>>>> *To: *dev@airflow.apache.org <dev@airflow.apache.org>
>>>>>>>>>> *Subject: *[DISCUSS] TaskGroup in Tree View
>>>>>>>>>>
>>>>>>>>>> Hi team,
>>>>>>>>>>
>>>>>>>>>> We are very glad to see the introduction of TaskGroup in Airflow
>>>>>>>>>> 2.0 and really like it. Thanks to Yu Qian and everyone who
>>>>>>>>>> contributed to it. To continue moving towards the goal of replacing
>>>>>>>>>> SubDagOperator with TaskGroup, I'd like to kick off a discussion on
>>>>>>>>>> bringing TaskGroup into Tree View.
>>>>>>>>>>
>>>>>>>>>> *Why do we need TaskGroup in Tree View?*
>>>>>>>>>>
>>>>>>>>>> For owners of larger DAGs, say a DAG with 500 tasks, Tree View is
>>>>>>>>>> the preferred view for its loading speed and simpler representation.
>>>>>>>>>> SubDagOperator is often used to provide an isolated view into a
>>>>>>>>>> subset of tasks in such large DAGs. To replace such SubDag use
>>>>>>>>>> cases, TaskGroup will need to support Tree View.
>>>>>>>>>>
>>>>>>>>>> *What should TaskGroup look like in Tree View?*
>>>>>>>>>>
>>>>>>>>>> We didn't reach a conclusion during the 1st iteration of
>>>>>>>>>> TaskGroup. At Airbnb, we use SubDag mostly for providing a zoom-in
>>>>>>>>>> view on a small set of tasks, and the SubDag zoom-in feature worked
>>>>>>>>>> well for us. We'd like to see TaskGroup provide a zoom-in option for
>>>>>>>>>> both Graph View and Tree View, but we'd also like to hear everyone's
>>>>>>>>>> thoughts.
>>>>>>>>>>
>>>>>>>>>> *What needs to be in TaskGroup and what doesn't?*
>>>>>>>>>>
>>>>>>>>>> TaskGroup started off as a pure UI concept while SubDag is
>>>>>>>>>> something more, e.g. it has its own DagRun and thus isolated
>>>>>>>>>> scheduling decisions, it can serve as a logical isolation layer that
>>>>>>>>>> holds different sets of DAG level params, etc. While we only use
>>>>>>>>>> SubDag as a UI feature, I think it would be a good opportunity for
>>>>>>>>>> us to discuss what should be in TaskGroup and what shouldn't.
>>>>>>>>>>
>>>>>>>>>> Please don't hesitate to share your thoughts.
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>>
>>>>>>>>>> Kevin Y
>>>>>>>>>>
>>>>>>>>>
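As a rough illustration of two ideas raised in this thread (organizing Tree View by TaskGroup hierarchy, and giving a group some notion of run status), here is a minimal sketch in plain Python. It does not use Airflow itself. It assumes task ids carry their group path as a dotted prefix (e.g. "load.validate"), and the function names build_group_tree and group_state, as well as the "failed beats running beats success" roll-up rule, are hypothetical choices made only for this example.

```python
from typing import Dict, List, Optional


def build_group_tree(task_ids: List[str]) -> Dict[str, dict]:
    """Nest dotted task ids into a dict tree; leaf tasks map to empty dicts.

    Assumption: a task inside a group has an id of the form "group.task".
    """
    tree: Dict[str, dict] = {}
    for task_id in task_ids:
        node = tree
        for part in task_id.split("."):
            # Reuse the existing child node or create it on first sight.
            node = node.setdefault(part, {})
    return tree


def group_state(states: Dict[str, str], group_id: str) -> Optional[str]:
    """Roll up the states of all tasks under group_id into one state.

    Hypothetical rule: any failure wins, all successes give success,
    anything else counts as still running. Returns None for an empty group.
    """
    prefix = group_id + "."
    child_states = [s for t, s in states.items() if t.startswith(prefix)]
    if not child_states:
        return None
    if "failed" in child_states:
        return "failed"
    if all(s == "success" for s in child_states):
        return "success"
    return "running"


if __name__ == "__main__":
    ids = ["extract", "load.stage", "load.validate", "load.swap"]
    print(build_group_tree(ids))
    print(group_state({"load.stage": "success", "load.validate": "running"}, "load"))
```

Because each task appears exactly once under its group, a tree built this way avoids the duplication of the flattened graph hierarchy, and a collapsed group row could display the rolled-up state. How a real implementation would obtain task ids and states from the web server is left open here.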