I would question whether hooking into DAG.run is really any more "graceful" 
than having a root task node that does the pipeline environment setup.
IMO it would be easier and cleaner to catch setup errors when the setup is 
done in a separate task.
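
To illustrate the point, here is a minimal sketch in plain Python. It is a
toy model and not Airflow's API: run_pipeline, setup_environment, extract and
load are hypothetical names, and the state strings only loosely mirror
Airflow's task states. The assumption is that every other task depends,
directly or transitively, on one root setup task.

```python
# Plain-Python model of a "root setup task" layout. This is NOT Airflow's API;
# every name below is hypothetical and exists only for illustration.

def run_pipeline(tasks, upstream):
    """Run callables in the listed order.

    A task is skipped (marked "upstream_failed") if any of its upstream
    tasks did not finish with state "success".
    """
    state = {}
    for name, func in tasks:
        if any(state.get(dep) != "success" for dep in upstream.get(name, [])):
            state[name] = "upstream_failed"
            continue
        try:
            func()
            state[name] = "success"
        except Exception as exc:
            # The failure is recorded against the task that raised it.
            state[name] = "failed: {}".format(exc)
    return state


def setup_environment():
    # Stand-in for the one-time pipeline environment setup; it fails here
    # on purpose to show where the error surfaces.
    raise RuntimeError("scratch directory missing")


def extract():
    pass


def load():
    pass


tasks = [("setup", setup_environment), ("extract", extract), ("load", load)]
# "extract" depends on the root "setup" task, and "load" depends on "extract".
upstream = {"extract": ["setup"], "load": ["extract"]}

result = run_pipeline(tasks, upstream)
print(result)
```

In this model the setup error is attributed to the "setup" task alone, and
the downstream tasks are simply not run, which is the kind of visibility I
would expect from a dedicated root task.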

Best regards,
Jiening

-----Original Message-----
From: Song Liu [mailto:song...@outlook.com] 
Sent: Saturday 12 May 2018 9:06 AM
To: dev@airflow.incubator.apache.org
Subject: Re: Re: How to know the DAG is starting to run [External]

Yes, I want to be notified when a DagRun is created.
________________________________
From: crisp...@gmail.com <crisp...@gmail.com> on behalf of Chris Palmer 
<ch...@crpalmer.com>
Sent: 11 May 2018 15:46
To: dev@airflow.incubator.apache.org
Subject: Re: Re: How to know the DAG is starting to run

It's not even clear to me what it means for a DAG to start running. The 
creation of a DagRun for a specific execution date is completely independent of 
the scheduling of any TaskInstances for that DagRun. There could be a 
significant delay between those two events, either deliberately encoded into 
the DAG or due to resource constraints.

What event are you actually interested in knowing about? The creation of a 
DagRun? The starting of any task for a DagRun? Something else?

Maybe if you provided more details on exactly what "pipeline environment 
setup" you are trying to do, it would help others understand the problem you 
are trying to solve.

Chris

On Fri, May 11, 2018 at 10:59 AM, Song Liu <song...@outlook.com> wrote:

> Overriding "DAG.run" sounds like a workaround: if the DAG is running 
> its first operation, then do some setup, etc.
>
> ________________________________
> From: Victor Noagbodji <vnoagbo...@amplify-nation.com>
> Sent: 11 May 2018 12:50
> To: dev@airflow.incubator.apache.org
> Subject: Re: How to know the DAG is starting to run
>
> Hey,
>
> I don't know if Airflow has a concept of DAG-level events or callbacks.
> (Operators do have callbacks, though.) You might get away with 
> subclassing the DAG class or using a class decorator.
>
> The source suggests that ".run()" is the method you want to override. 
> You may want to call the original "super().run()" then do what you 
> need to do afterwards.
>
> Let's see if that works for you.
>
> > On May 11, 2018, at 8:26 AM, Song Liu <song...@outlook.com> wrote:
> >
> > Yes, I have thought about this approach, but a more elegant way would 
> > be to do it in the DAG itself. We don't want to add this "pipeline 
> > environment setup" as a single operator; it should be done in the DAG 
> > more gracefully.
> > ________________________________
> > From: James Meickle <jmeic...@quantopian.com>
> > Sent: 11 May 2018 12:09
> > To: dev@airflow.incubator.apache.org
> > Subject: Re: How to know the DAG is starting to run
> >
> > Song:
> >
> > You can put an operator as the very first node in the DAG, and have 
> > everything else in the DAG depend on it. For example, this is the 
> > approach we use to only execute DAG tasks on stock market trading days.
> >
> > -James M.
> >
> > On Fri, May 11, 2018 at 3:57 AM, Song Liu <song...@outlook.com> wrote:
> >
> >> Hi,
> >>
> >> I have something that I want to be done only once, when the DAG is 
> >> constructed, but it seems that the DAG is instantiated every time 
> >> one of its operators is run.
> >>
> >> So is there a function in the DAG that tells us it is starting to 
> >> run now?
> >>
> >> Thanks,
> >> Song
> >>
>
>
