100 SQL tasks belonging to one business package is very heavyweight business
logic! It will be difficult to maintain.
Maybe we can create a new JOB type called DAG, which splits the 100 SQLs
according to their data lineage.
The DAG job would also have some limits, for example at most 25 subtasks.
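
As a rough sketch of that idea (not an existing DolphinScheduler API): the
names split_into_dag_jobs, topological_order and MAX_SUBTASKS are hypothetical,
and the lineage is assumed to be available as a map from each task to its
upstream tasks. The tasks are ordered by lineage, then cut into DAG jobs of at
most 25 subtasks, and lineage edges that cross a cut become job-level
dependencies.

```python
from collections import defaultdict, deque

MAX_SUBTASKS = 25  # the example cap from the proposal; not a real DS setting


def topological_order(tasks, deps):
    """Order tasks so that every upstream task comes first (Kahn's algorithm).

    deps maps a task name to the set of its upstream task names.
    """
    indegree = {t: 0 for t in tasks}
    downstream = defaultdict(list)
    for task, upstreams in deps.items():
        for up in upstreams:
            indegree[task] += 1
            downstream[up].append(task)
    ready = deque(t for t in tasks if indegree[t] == 0)
    order = []
    while ready:
        task = ready.popleft()
        order.append(task)
        for down in downstream[task]:
            indegree[down] -= 1
            if indegree[down] == 0:
                ready.append(down)
    if len(order) != len(tasks):
        raise ValueError("lineage contains a cycle")
    return order


def split_into_dag_jobs(tasks, deps, max_subtasks=MAX_SUBTASKS):
    """Cut the lineage order into jobs of at most max_subtasks tasks.

    Returns (jobs, job_deps). job_deps maps a job index to the indexes of
    the jobs it depends on. Because the cut follows a topological order,
    a job can only depend on earlier jobs, so the job graph has no cycle.
    """
    order = topological_order(tasks, deps)
    jobs = [order[i:i + max_subtasks] for i in range(0, len(order), max_subtasks)]
    job_of = {t: idx for idx, job in enumerate(jobs) for t in job}
    job_deps = defaultdict(set)
    for task, upstreams in deps.items():
        for up in upstreams:
            if job_of[up] != job_of[task]:
                job_deps[job_of[task]].add(job_of[up])
    return jobs, dict(job_deps)


# Made-up example: 100 SQL tasks in a simple chain sql_0 -> sql_1 -> ... -> sql_99
tasks = [f"sql_{i}" for i in range(100)]
deps = {f"sql_{i}": {f"sql_{i - 1}"} for i in range(1, 100)}
jobs, job_deps = split_into_dag_jobs(tasks, deps)
```

Cutting along a topological order is only one possible way to split; grouping
by business meaning may give jobs that are easier to maintain.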

Too many jobs under one label should not have any impact on the master; only
jobs that reach the same schedule time will.
After all, jobs under one label will not all be triggered at the same time.
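
To illustrate the point, here is a small sketch of how a master could pick
jobs. It is an assumption for discussion, not current DolphinScheduler code:
the Job class, its fields and the due_jobs function are all hypothetical. A
job is picked only when it has reached its schedule time and all of its
upstream jobs have finished, so jobs under one label arrive in waves.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    name: str
    label: str
    schedule_time: int  # hypothetical unit: minutes since midnight
    upstream: set = field(default_factory=set)  # names of upstream jobs


def due_jobs(jobs, now, finished):
    """Return the names of jobs the master would dispatch at time `now`.

    A job qualifies when its schedule time has been reached, it has not
    finished yet, and every upstream job is in `finished`.
    """
    return [
        job.name
        for job in jobs
        if job.schedule_time <= now
        and job.name not in finished
        and job.upstream <= finished
    ]


# Made-up jobs: three share the label "sales" and the same schedule time.
jobs = [
    Job("load_orders", "sales", 60),
    Job("load_users", "sales", 60),
    Job("daily_report", "sales", 60, {"load_orders", "load_users"}),
    Job("cleanup", "ops", 120),
]
first_wave = due_jobs(jobs, now=60, finished=set())
second_wave = due_jobs(jobs, now=60, finished={"load_orders", "load_users"})
```

Here daily_report shares a label and a schedule time with the two load jobs,
but it is only picked in the second wave, after both of them have finished.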

Hemin Wen <[email protected]> wrote on Thu, May 21, 2020 at 10:34 AM:

> First, if there are 100 (for example) SQL tasks belonging to one business
> package, they do not need to be split;
> scheduling still has to serve the actual business.
>
> I agree with the concept of JOB. As I understand it, you mean that the
> master's scheduling granularity changes to the job level.
> But if there are too many jobs under one label, it will affect the
> execution of jobs under other labels.
> So I think the current master is correct to work at the workflow level:
> parallel between workflows, queued within a workflow.
>
> As I understand it, what needs to be solved here is the batch creation of
> tasks and the dependencies between batch tasks.
> The list is only for easier viewing of tasks. Maintaining batch tasks
> by hand is a time-consuming operation for users.
> So I suggest starting with how to create batch tasks and how to
> resolve dependencies between tasks.
>
>
> --------------------
> DolphinScheduler(Incubator) Committer
> Hemin Wen  温合民
> [email protected]
> --------------------
>
>
> GabryWu <[email protected]> wrote on Wed, May 20, 2020 at 9:58 PM:
>
> > If DAG is removed, the master only needs to pick up the JOBs that have
> > reached their schedule time and dispatch them to a worker. The JOBs will
> > wait in a queue before they are dispatched.
> > The master will be lightweight, and complement work will be easy. The
> > JOBs that have the same label will be dispatched to workers and will
> > execute in parallel or in sequence, depending on their DEPENDENCY.
> >
> > ---Original---
> > *From:* "GabryWu"<[email protected]>
> > *Date:* Wed, May 20, 2020 18:41 PM
> > *To:* "dev"<[email protected]>;
> > *Cc:* "wenhemin"<[email protected]>;
> > *Subject:* Re: How do you think Task DAG dependency and List dependency?
> >
> > In that case, I recommend splitting the 100 SQLs into different JOBs that
> > share the same label, each JOB with its own business logic. Finally, add a
> > DEPENDENCY to each JOB. JOBs with the same label can be shown on the front
> > page as a DAG.
> >
> >
> > ---Original---
> > *From:* "Hemin Wen"<[email protected]>
> > *Date:* Wed, May 20, 2020 17:38 PM
> > *To:* "GabryWu"<[email protected]>;
> > *Subject:* Re: How do you think Task DAG dependency and List dependency?
> >
> > For example, a workflow has 100 SQL script tasks, and the tasks depend on
> > each other through data lineage.
> > This requirement is not suitable for DAG configuration.
> >
> >
> > --------------------
> > DolphinScheduler(Incubator) Committer
> > Hemin Wen  温合民
> > [email protected]
> > --------------------
> >
> >
> > GabryWu <[email protected]> wrote on Wed, May 20, 2020 at 5:31 PM:
> >
> >> What do you mean by batch tasks?
> >>
> >> ------------------ Original Message ------------------
> >> *From:* "wenhemin"<[email protected]>;
> >> *Date:* Wed, May 20, 2020 5:04 PM
> >> *To:* "dev"<[email protected]>;
> >> *Subject:* Re: How do you think Task DAG dependency and List dependency?
> >>
> >> I do not recommend removing DAG.
> >> I think DS lacks support for batch tasks.
> >> DAG solves the expression of different types of tasks, but there is no
> >> way to express batch tasks of the same type.
> >>
> >> I don't know whether my understanding is accurate. I recommend
> >> understanding in depth the problem that users
> >> want to solve by using "List dependency".
> >>
> >> --------------------
> >> DolphinScheduler(Incubator) Committer
> >> Hemin Wen  温合民
> >> [email protected]
> >> --------------------
> >>
> >>
> >> lidong dai <[email protected]> wrote on Wed, May 20, 2020 at 3:40 PM:
> >>
> >> > I understand what you said. Maybe more people need to discuss this
> >> > topic; I want to know others' opinions on this question.
> >> >
> >> >
> >> >
> >> > Best Regards
> >> > ---------------
> >> > DolphinScheduler(Incubator) PPMC
> >> > Lidong Dai 代立冬
> >> > [email protected]
> >> > ---------------
> >> >
> >> >
> >> > GabryWu <[email protected]> wrote on Tue, May 19, 2020 at 7:19 AM:
> >> >
> >> > > Not exactly. I mean refactoring the backend code to remove DAG, so
> >> > > that only JOB and DEPENDENCY remain.
> >> > >
> >> > > ---Original---
> >> > > *From:* "lidong dai"<[email protected]>
> >> > > *Date:* Mon, May 18, 2020 22:47 PM
> >> > > *To:* "GabryWu"<[email protected]>;
> >> > > *Cc:* "dev"<[email protected]>;
> >> > > *Subject:* Re: How do you think Task DAG dependency and List dependency?
> >> > >
> >> > > I think what you describe is "List dependency". List dependency is
> >> > > also a DAG; it only differs in how it is used. With List dependency,
> >> > > you can add the upstream dependency for a task, which is convenient
> >> > > when one workflow has a huge number of tasks.
> >> > >
> >> > >
> >> > > Best Regards
> >> > > ---------------
> >> > > DolphinScheduler(Incubator) PPMC
> >> > > Lidong Dai 代立冬
> >> > > [email protected]
> >> > > ---------------
> >> > >
> >> > >
> >> > > GabryWu <[email protected]> wrote on Wed, May 13, 2020 at 11:08 AM:
> >> > >
> >> > >> Until now, DAG has been a physical concept, which means that DAG is
> >> > >> a heavy class that pulls in other classes and is stored in one big
> >> > >> JSON field.
> >> > >> On a big data platform, DAG is not a good concept for schedulers and
> >> > >> can be abandoned.
> >> > >> If DAG were abandoned, JOB and DEPENDENCY would simplify the
> >> > >> architecture, make DolphinScheduler more stable and easier to
> >> > >> extend, and make it easier to search for jobs.
> >> > >> However, abandoning DAG doesn't mean removing the DAG graph in the
> >> > >> front end, which is also an important way to visualize JOBs.
> >> > >> We can add a Job and a Dependency manually, and visualize them in a
> >> > >> DAG graph automatically.
> >> > >>
> >> > >>
> >> > >>
> >> > >> ------------------ Original Message ------------------
> >> > >> *From:* "lidong dai"<[email protected]>;
> >> > >> *Date:* Tue, May 12, 2020 5:26 PM
> >> > >> *To:* "dev"<[email protected]>;
> >> > >> *Subject:* Re: How do you think Task DAG dependency and List dependency?
> >> > >>
> >> > >> Yes, your description is accurate, thanks.
> >> > >>
> >> > >>
> >> > >> Best Regards
> >> > >> ---------------
> >> > >> DolphinScheduler(Incubator) PPMC
> >> > >> Lidong Dai 代立冬
> >> > >> [email protected]
> >> > >> ---------------
> >> > >>
> >> > >>
> >> > >> leon bao <[email protected]> wrote on Tue, May 12, 2020 at 5:06 PM:
> >> > >>
> >> > >> > I think you want to show the DAG in list mode.
> >> > >> > DS shows the DAG in graphic mode, which has some problems:
> >> > >> >
> >> > >> > 1. Once the number of tasks is large, graphic mode becomes confusing.
> >> > >> > 2. It is difficult to find a specific task in a complex DAG.
> >> > >> >
> >> > >> > lidong dai <[email protected]> wrote on Tue, May 12, 2020 at 4:31 PM:
> >> > >> >
> >> > >> > > What I want to ask is: do we need to implement List dependency?
> >> > >> > >
> >> > >> > >
> >> > >> > >
> >> > >> > > Best Regards
> >> > >> > > ---------------
> >> > >> > > DolphinScheduler(Incubator) PPMC
> >> > >> > > Lidong Dai 代立冬
> >> > >> > > [email protected]
> >> > >> > > ---------------
> >> > >> > >
> >> > >> > >
> >> > >> > > JUN GAO <[email protected]> wrote on Tue, May 12, 2020 at 12:26 PM:
> >> > >> > >
> >> > >> > > > Hi, @lidong dai
> >> > >> > > > Sorry, I don't understand what you mean by this.
> >> > >> > > > Do you want to discuss how to implement Task DAG dependency and
> >> > >> > > > List dependency? Or to discuss what Task DAG dependency and List
> >> > >> > > > dependency are?
> >> > >> > > > Please show us your needs and ideas, so that we can know the
> >> > >> > > > topic of discussion.
> >> > >> > > >
> >> > >> > > > Thank you !
> >> > >> > > >
> >> > >> > > > lidong dai <[email protected]> wrote on Mon, May 11, 2020 at 10:34 PM:
> >> > >> > > >
> >> > >> > > > > Hi,
> >> > >> > > > >
> >> > >> > > > >    Task dependency between upstream and downstream is called
> >> > >> > > > > List dependency.
> >> > >> > > > > Do you have any ideas about this topic?
> >> > >> > > > >
> >> > >> > > > >
> >> > >> > > > >
> >> > >> > > > > Best Regards
> >> > >> > > > > ---------------
> >> > >> > > > > DolphinScheduler(Incubator) PPMC
> >> > >> > > > > Lidong Dai 代立冬
> >> > >> > > > > [email protected]
> >> > >> > > > > ---------------
> >> > >> > > > >
> >> > >> > > >
> >> > >> > > >
> >> > >> > > > --
> >> > >> > > >
> >> > >> > > > DolphinScheduler(Incubator)  PPMC
> >> > >> > > > Jun Gao 高俊
> >> > >> > > > [email protected]
> >> > >> > > >
> >> > >> > >
> >> > >> >
> >> > >> >
> >> > >> > --
> >> > >> > DolphinScheduler(Incubator)  PPMC
> >> > >> > BaoLiang 鲍亮
> >> > >> > [email protected]
> >> > >> >
> >> > >>
> >> > >>
> >> >
> >>
> >>
>
