Hi Yu Qian,

Sorry, I somehow missed this email 😢 It is great to hear that you are also
interested in adding TaskGroup to the Tree View; I hope you still are 😀 I like
how the TaskGroup is collapsible in the prototype, though I wonder if we
can replicate the zoom-in behavior of SubDag. That might be the most useful
feature in SubDag for the big DAG owners.


Cheers,
Kevin Y

On Mon, Apr 19, 2021 at 7:18 AM Yu Qian <yuqian1...@gmail.com> wrote:

> Hi, all,
>
> I'm interested in contributing by adding TaskGroup to Tree View. Here's a
> prototype <https://yuqian90.github.io/task_group_tree/> of how it could
> look. Suggestions are welcome.
>
> I understand AIP-38
> <https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-38+Modern+Web+Application>
> plans to use modern web technology such as React, etc. to revamp the
> Airflow UI. I'm not sure where we are on that front. With AIP-38 in
> mind, if we make improvements to pages such as Tree View, it makes a lot of
> sense to contribute it as a reusable TaskInstanceTree component so that it
> can be used in more than just the Tree View page itself. One other example
> where a TaskInstanceTree component can be useful is when displaying the
> confirmation page when a user clears/marks success/marks failed task
> instances. Right now the confirmation page just concatenates all the
> TaskInstance text representations. It's very difficult to read when a lot of
> task instances are cleared or marked. If we put the task instances in the
> confirmation page into a TaskInstanceTree component with TaskGroup support,
> it can be much easier to read. If you have ideas regarding how to
> contribute to Tree View so it's easy to incorporate into AIP-38, definitely
> let me know.
>
> One
>
> On Tue, Mar 9, 2021 at 1:19 AM Ash Berlin-Taylor <a...@apache.org> wrote:
>
>> I agree, but we should see which of those we can implement just on the
>> parsing side - i.e. can we continue to make the scheduler not have to care
>> about Task Groups?
>>
>> If so, then something like the default args example is a small enough change
>> that it doesn't need an AIP (IMO)
>>
>> -ash
>>
>> On 8 March 2021 17:12:07 GMT, Daniel Imberman <daniel.imber...@gmail.com>
>> wrote:
>>>
>>> I personally think that TaskGroup should go beyond being “just” a UI
>>> concept. I think that there are a lot of use-cases where people might want
>>> to perform a single operation across an entire group of tasks. I think that
>>> Bin points out a few really good examples (default arguments and group
>>> delete). I also have a proposal coming out, hopefully
>>> later this week that will offer some more functionality to TaskGroup
>>> objects as well.
>>>
>>> I don’t personally see the benefit of keeping them “UI only.” If we want
>>> to be able to group delete or add external sensors to a group of tasks we’d
>>> basically need to create another concept that centers around “a grouping of
>>> tasks” which I think might create confusion.
>>>
>>> On Mon, Mar 8, 2021 at 7:19 AM, Yu Qian <yuqian1...@gmail.com> wrote:
>>>
>>> Hi, all, it's really exciting to see the great discussions about
>>> TaskGroup.
>>>
>>> There are some interesting ideas here.
>>> - Tree View support for TaskGroup: I think this can mostly be achieved
>>> at the web layer? Changes probably involve tree.html and www/views.py.
>>> Should we change Tree View to organize tasks based on the TaskGroup
>>> hierarchy (no need to duplicate tasks in Tree View)? Currently the Tree
>>> View is organized into a flattened graph hierarchy, which means the same
>>> task can appear multiple times in Tree View.
>>> - Clear an entire TaskGroup: We should be able to do this in graph.html
>>> and www/views.py too. E.g. the UI passes the group_id of the TaskGroup
>>> to the web server, which then clears the tasks in that TaskGroup. A
>>> TaskGroup is already an iterable of its child tasks, so this should be possible.
>>> In fact, I've heard from several users that they sometimes want to select
>>> multiple tasks on Graph View with the mouse and then clear all of them at
>>> once. This is actually a very similar problem to clearing a TaskGroup.
>>>
>>> Some other ideas such as default_args and ExternalTaskSensor support
>>> sound good too. We can probably continue the discussion on those individual
>>> issues/PRs.
>>>
>>> On Sun, Mar 7, 2021 at 3:55 AM Xinbin Huang <bin.huan...@gmail.com>
>>> wrote:
>>>
>>>> Hi Kaxil,
>>>>
>>>> One use case I have is to reuse TaskGroup across different DAGs as a
>>>> predefined sub-workflow. For example, my team is currently building out a
>>>> data platform that will allow a certain level of self-serve ability. Users
>>>> of the platform (mostly analysts and scientists) should focus on the business
>>>> logic (the transformation part) without needing to pay much attention to
>>>> standard operations (e.g. from S3 to Redshift staging table - validate
>>>> data - swap to production table), as these types of tasks are boring and
>>>> repetitive. Reusing these sub-workflows also enables us to load data to a
>>>> different destination/warehouse without users needing to change their code.
>>>> We can also have a notification sub-workflow that allows us to swap in and
>>>> out Slack/PagerDuty/etc. over time without impacting the user.
>>>>
>>>> Other use cases
>>>> - allow default_args at TaskGroup level as in this issue:
>>>> https://github.com/apache/airflow/issues/13911
>>>> - ExternalTaskSensor on TaskGroup as mentioned by Nathan:
>>>> https://github.com/apache/airflow/issues/14563
>>>> - delete an entire TaskGroup:
>>>> https://github.com/apache/airflow/issues/14529
>>>>
>>>> All these use cases go beyond the pure UI level and require operations
>>>> (viewing/triggering/deleting/waiting/etc.) on *a group of tasks*. I
>>>> think we can easily implement/formalize this with the current API without
>>>> changing the backend too much (this PR
>>>> https://github.com/apache/airflow/pull/14640 shows a small example).
>>>>
>>>> What do other people think?
>>>>
>>>> Best
>>>> Bin
>>>>
>>>> On Sat, Mar 6, 2021 at 4:51 AM Kaxil Naik <kaxiln...@gmail.com> wrote:
>>>>
>>>>> Hi all, interesting discussion. I would love to hear about some more
>>>>> use-cases where TaskGroup needs to be something more than the UI concept.
>>>>>
>>>>> All of Kevin's use-cases can be achieved while keeping it as a UI
>>>>> concept. Xinbin, can you please expand a bit on your use case?
>>>>>
>>>>> Regards,
>>>>> Kaxil
>>>>>
>>>>> On Sat, Mar 6, 2021, 10:08 Xinbin Huang <bin.huan...@gmail.com> wrote:
>>>>>
>>>>>> Hi Kevin, Vikram, and Nathan,
>>>>>>
>>>>>> I think we don't need to be too strict about keeping TaskGroup only
>>>>>> as a UI concept. We are already using TaskGroup to author DAGs and create
>>>>>> dependencies, which already lies a bit outside the UI.
>>>>>> To fully replace SubDagOperator, I think it's necessary to expand
>>>>>> TaskGroup into a *container for tasks* rather than just a UI concept.
>>>>>>
>>>>>> As for TaskGroupSensor specifically, I landed on the same approach as
>>>>>> Kevin, and I have created a draft PR here:
>>>>>> https://github.com/apache/airflow/pull/14640
>>>>>>
>>>>>> Cheers
>>>>>> Bin
>>>>>>
>>>>>> On Fri, Mar 5, 2021 at 10:00 PM Kevin Yang <yrql...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi Vikram,
>>>>>>>
>>>>>>> Good point. What I had in mind was getting the TaskGroup definition
>>>>>>> in a sensor, e.g. extract the _task_group field from serialized DAG, and
>>>>>>> query the DB for the TI states within.
>>>>>>>
>>>>>>> You are right that it might not be clean, and it does not keep
>>>>>>> TaskGroup as a UI concept.
>>>>>>>
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Kevin Y
>>>>>>>
>>>>>>> On Fri, Mar 5, 2021 at 8:19 PM Vikram Koka
>>>>>>> <vik...@astronomer.io.invalid> wrote:
>>>>>>>
>>>>>>>> Kevin,
>>>>>>>>
>>>>>>>> I am not sure I understand your response to Nathan.
>>>>>>>>
>>>>>>>> I agree that it is also a valid use case, but I don't see how it
>>>>>>>> can be cleanly done while keeping TaskGroup only as a UI concept.
>>>>>>>> Would this require extending the TaskGroup concept to the backend?
>>>>>>>>
>>>>>>>> Best regards,
>>>>>>>> Vikram
>>>>>>>>
>>>>>>>> On Fri, Mar 5, 2021 at 1:31 AM Kevin Yang <yrql...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi Nathan,
>>>>>>>>>
>>>>>>>>> Thanks a lot for your input and it is indeed a valid use case.
>>>>>>>>> This can be done either keeping TaskGroup as a UI concept or bringing
>>>>>>>>> it into the backend. I'm curious to hear what others think.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>> Kevin Y
>>>>>>>>>
>>>>>>>>> On Thu, Mar 4, 2021 at 12:57 AM Nathan Hadfield <
>>>>>>>>> nathan.hadfi...@king.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Kevin,
>>>>>>>>>>
>>>>>>>>>> A quick piece of input from our recent experiences of working
>>>>>>>>>> with TaskGroup is that we often have dependencies across DAGs that
>>>>>>>>>> require waiting upon the completion of all the tasks in a group. At
>>>>>>>>>> the moment, you basically have two options:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>    1. Create a sensor task in a DAG for every task in the group
>>>>>>>>>>    2. Create a Dummy task after the group that a sensor waits on
>>>>>>>>>>
>>>>>>>>>> So, I would certainly like TaskGroups to have some notion of run
>>>>>>>>>> status so as to better enable downstream decision making.
>>>>>>>>>>
>>>>>>>>>> I’ve already created a feature ticket to try to add some kind of
>>>>>>>>>> TaskGroup Sensor but perhaps this can also form part of the wider
>>>>>>>>>> discussions here.
>>>>>>>>>>
>>>>>>>>>> https://github.com/apache/airflow/issues/14563
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>>
>>>>>>>>>> Nathan
>>>>>>>>>>
>>>>>>>>>> *From: *Kevin Yang <yrql...@gmail.com>
>>>>>>>>>> *Date: *Thursday, 4 March 2021 at 05:21
>>>>>>>>>> *To: *dev@airflow.apache.org <dev@airflow.apache.org>
>>>>>>>>>> *Subject: *[DISCUSS] TaskGroup in Tree View
>>>>>>>>>>
>>>>>>>>>> Hi team,
>>>>>>>>>>
>>>>>>>>>> We are very glad to see the introduction of TaskGroup in Airflow
>>>>>>>>>> 2.0 and really like it. Thanks to Yu Qian and everyone that
>>>>>>>>>> contributed to it. To continue moving towards the goal of replacing
>>>>>>>>>> SubDagOperator with TaskGroup, I'd like to kick off a discussion on
>>>>>>>>>> bringing TaskGroup into Tree View.
>>>>>>>>>>
>>>>>>>>>> *Why do we need TaskGroup in Tree View?*
>>>>>>>>>>
>>>>>>>>>> For owners of larger DAGs, say a DAG with 500 tasks, Tree View is
>>>>>>>>>> the preferred view for its loading speed and simpler representation.
>>>>>>>>>> SubDagOperator is often used to provide an isolated view into a
>>>>>>>>>> subset of tasks in such large DAGs. To replace such SubDag use cases,
>>>>>>>>>> TaskGroup will need to support Tree View.
>>>>>>>>>>
>>>>>>>>>> *What should TaskGroup look like in Tree View?*
>>>>>>>>>>
>>>>>>>>>> We didn't reach a conclusion during the 1st iteration of
>>>>>>>>>> TaskGroup. At Airbnb, we use SubDag mostly for providing a zoom-in
>>>>>>>>>> view on a small set of tasks, and the SubDag zoom-in feature worked
>>>>>>>>>> well for us. We'd like to see TaskGroup provide a zoom-in option for
>>>>>>>>>> both Graph View and Tree View, but we'd also like to hear everyone's
>>>>>>>>>> thoughts.
>>>>>>>>>>
>>>>>>>>>> *What needs to be in TaskGroup and what doesn't?*
>>>>>>>>>>
>>>>>>>>>> TaskGroup started off as a pure UI concept while SubDag is
>>>>>>>>>> something more, e.g. it has its own DagRun and thus isolated scheduling
>>>>>>>>>> decisions, it can serve as a logical isolation layer that holds
>>>>>>>>>> different sets of DAG level params, etc. While we only use SubDag as a
>>>>>>>>>> UI feature, I think it would be a good opportunity for us to discuss
>>>>>>>>>> what should be in TaskGroup and what shouldn't.
>>>>>>>>>>
>>>>>>>>>> Please don't hesitate to share your thoughts.
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>>
>>>>>>>>>> Kevin Y
>>>>>>>>>>
>>>>>>>>>
