I agree, but we should see what of those we can implement just on the parsing 
side - i.e. can we continue to make the scheduler not have to care about Task 
Groups?

If so, then things like the default args example are small enough changes that 
they don't need an AIP (IMO).
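
To illustrate what "parsing side only" could mean for the default args case, here is a minimal sketch. It assumes the group-level defaults are folded into each task's arguments while the DAG file is parsed, so the scheduler only ever sees ordinary tasks. The function name and the precedence order (DAG defaults, then group defaults, then explicit task arguments) are hypothetical and not taken from Airflow's code.

```python
# Hypothetical sketch: none of these names are Airflow APIs.
from typing import Any, Dict, Optional


def resolve_task_args(
    dag_default_args: Optional[Dict[str, Any]],
    group_default_args: Optional[Dict[str, Any]],
    task_kwargs: Optional[Dict[str, Any]],
) -> Dict[str, Any]:
    """Merge arguments at parse time; later sources override earlier ones."""
    merged: Dict[str, Any] = {}
    # Assumed precedence: DAG defaults < group defaults < explicit task kwargs.
    for source in (dag_default_args, group_default_args, task_kwargs):
        merged.update(source or {})
    return merged


# Example: the group raises retries, the task adds its own queue.
effective = resolve_task_args(
    {"retries": 1, "owner": "data-eng"},
    {"retries": 3},
    {"queue": "high_mem"},
)
```

With this kind of merge the result is just a plain dict of task arguments, so nothing about the group needs to reach the scheduler.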

-ash

On 8 March 2021 17:12:07 GMT, Daniel Imberman <daniel.imber...@gmail.com> wrote:
>I personally think that TaskGroup should go beyond being “just” a UI concept. 
>I think that there are a lot of use-cases where people might want to perform a 
>single operation across an entire group of tasks. I think that Bin points out 
>a few really good examples (default arguments and group delete). I also have a 
>proposal coming out, hopefully later this week, that will offer some more 
>functionality to TaskGroup objects as well.
>I don’t personally see the benefit of keeping them “UI only.” If we want to be 
>able to group delete or add external sensors to a group of tasks we’d 
>basically need to create another concept that centers around “a grouping of 
>tasks” which I think might create confusion.
>On Mon, Mar 8, 2021 at 7:19 AM, Yu Qian <yuqian1...@gmail.com> wrote:
>Hi, all, it's really exciting to see the great discussions about TaskGroup.
>There are some interesting ideas here.
>- Tree View support for TaskGroup: I think this can mostly be achieved at the 
>  web layer? Changes probably involve tree.html and www/views.py. Should we 
>  change Tree View to organize tasks based on the TaskGroup hierarchy (no need 
>  to duplicate tasks in Tree View)? Currently the Tree View is organized into 
>  a flattened graph hierarchy, which means the same task can appear multiple 
>  times in Tree View.
>- Clear an entire TaskGroup: We should be able to do this in graph.html and 
>  www/views.py too. E.g. the UI passes the group_id of the TaskGroup to the 
>  web server, which then clears the list of tasks in the TaskGroup. The 
>  TaskGroup is already an iterable of its child tasks, so this should be 
>  possible. In fact, I've heard from several users that they sometimes want to 
>  select multiple tasks on Graph View with the mouse and then clear all of 
>  them at once. This is actually a very similar problem to clearing a 
>  TaskGroup.
>Some other ideas such as default_args and ExternalTaskSensor support sound 
>good too. We can probably continue the discussion on those individual 
>issues/PRs.
>On Sun, Mar 7, 2021 at 3:55 AM Xinbin Huang <bin.huan...@gmail.com> wrote:
>Hi Kaxil,
>One use case I have is to reuse TaskGroup across different DAGs as a 
>predefined sub-workflow. For example, my team is currently building out a data 
>platform that will allow a certain level of self-serve ability. Users of the 
>platform (mostly analysts and scientists) should focus on business logic (the 
>transformation part) and shouldn't need to pay much attention to standard 
>operations (e.g. from S3 to a Redshift staging table - validate data - swap to 
>the production table), as these types of tasks are boring and repetitive. 
>Reusing these sub-workflows also enables us to load data to a different 
>destination/warehouse without users needing to change their code. We can also 
>have a notification sub-workflow that allows us to swap Slack/PagerDuty/etc. 
>in and out over time without impacting the user.
>Other use cases:
>- allow default_args at TaskGroup level as in this issue: 
>  https://github.com/apache/airflow/issues/13911
>- ExternalTaskSensor on TaskGroup as mentioned by Nathan: 
>  https://github.com/apache/airflow/issues/14563
>- delete an entire TaskGroup: 
>  https://github.com/apache/airflow/issues/14529
>All these use cases go beyond the pure UI level and require operations 
>(viewing/triggering/deleting/waiting/etc) on a group of tasks. I think we can 
>easily implement/formalize this with the current API without changing the 
>backend too much (this PR https://github.com/apache/airflow/pull/14640 shows a 
>small example).
>What do other people think?
>Best Bin
>On Sat, Mar 6, 2021 at 4:51 AM Kaxil Naik <kaxiln...@gmail.com> wrote:
>Hi all, interesting discussion. I would love to hear about some more use-cases 
>where TaskGroup needs to be something more than the UI concept.
>All of Kevin's use-cases can be achieved while keeping it as a UI concept. 
>Xinbin, can you please expand a bit on your use case?
>Regards, Kaxil
>On Sat, Mar 6, 2021, 10:08 Xinbin Huang <bin.huan...@gmail.com> wrote:
>Hi Kevin, Vikram, and Nathan,
>I don't think we need to restrict TaskGroup to being only a UI concept. We are 
>already using TaskGroup to author DAGs and create dependencies, which already 
>lies a bit outside the UI. To fully replace SubDagOperator, I think it's 
>necessary to expand TaskGroup into a container for tasks rather than just a UI 
>concept.
>As for TaskGroupSensor specifically, I landed on the same approach as Kevin, 
>and I have created a draft PR here: 
>https://github.com/apache/airflow/pull/14640
>Cheers Bin
>On Fri, Mar 5, 2021 at 10:00 PM Kevin Yang <yrql...@gmail.com> wrote:
>Hi Vikram,
>Good point. What I had in mind was getting the TaskGroup definition in a 
>sensor, e.g. extracting the _task_group field from the serialized DAG and 
>querying the DB for the TI states within.
>You are right that it might not be clean, nor does it keep TaskGroup as a UI 
>concept.
>
>Cheers, Kevin Y
>On Fri, Mar 5, 2021 at 8:19 PM Vikram Koka <vik...@astronomer.io.invalid> 
>wrote:
>Kevin,
>I am not sure I understand your response to Nathan.
>I agree that it is also a valid use case, but I don't see how it can be 
>cleanly done while keeping TaskGroup only as a UI concept. Would this require 
>extending the TaskGroup concept to the backend?
>Best regards, Vikram
>On Fri, Mar 5, 2021 at 1:31 AM Kevin Yang <yrql...@gmail.com> wrote:
>Hi Nathan,
>Thanks a lot for your input; it is indeed a valid use case. This can be done 
>either by keeping TaskGroup as a UI concept or by bringing it into the 
>backend. I'm curious to hear what others think.
>
>Cheers, Kevin Y
>On Thu, Mar 4, 2021 at 12:57 AM Nathan Hadfield <nathan.hadfi...@king.com> 
>wrote:
>Hi Kevin,
>
>A quick piece of input from our recent experiences of working with TaskGroup 
>is that we often have dependencies across DAGs that require waiting upon the 
>completion of all the tasks in a group. At the moment, you basically have two 
>options:
>
> 1. Create a sensor task in a DAG for every task in the group
> 2. Create a Dummy task after the group that a sensor waits on
>
>So, I would certainly like TaskGroups to have some notion of run status so as 
>to better enable downstream decision making.
>
>I’ve already created a feature ticket to try to add some kind of TaskGroup 
>Sensor but perhaps this can also form part of the wider discussions here.
>
>https://github.com/apache/airflow/issues/14563
>
>Cheers,
>Nathan
>
>From: Kevin Yang <yrql...@gmail.com>
>Date: Thursday, 4 March 2021 at 05:21
>To: dev@airflow.apache.org <dev@airflow.apache.org>
>Subject: [DISCUSS] TaskGroup in Tree View
>
>Hi team,
>
>We are very glad to see the introduction of TaskGroup in Airflow 2.0 and 
>really like it. Thanks to Yu Qian and everyone that contributed to it. To 
>continue moving towards the goal of replacing SubDagOperator with TaskGroup, 
>I'd like to kick off a discussion on bringing TaskGroup into Tree View.
>
>Why do we need TaskGroup in Tree View?
>
>For owners of larger DAGs, say a DAG with 500 tasks, Tree View is the 
>preferred view for its loading speed and simpler representation. 
>SubDagOperator is often used to provide an isolated view into a subset of 
>tasks in such large DAGs. To replace such SubDag use cases, TaskGroup will 
>need to support Tree View.
>
>What should TaskGroup look like in Tree View?
>
>We didn't reach a conclusion during the 1st iteration of TaskGroup. At Airbnb, 
>we use SubDag mostly for providing a zoom-in view on a small set of tasks and 
>the SubDag zoom-in feature worked well for us. We'd like to see TaskGroup 
>provide a zoom-in option for both Graph View and Tree View but would also like 
>to hear everyone's thoughts.
>
>What needs to be in TaskGroup and what doesn't?
>
>TaskGroup started off as a pure UI concept while SubDag is something more, 
>e.g. it has its own DagRun and thus isolated scheduling decisions, it can 
>serve as a logical isolation layer that holds different sets of DAG level 
>params, etc. While we only use SubDag as a UI feature, I think it would be a 
>good opportunity for us to discuss what should be in TaskGroup and what 
>shouldn't.
>
>Please don't hesitate to share your thoughts.
>
>Cheers,
>Kevin Y
