Thank you Daniel and Jarek for your helpful replies.
- I didn't know about the "DAG Dependencies" view in the Browse menu.
Really cool, it could indeed show the relationships and confirm that
the triggered DAGs are the right ones (useful when we have multiple
instances of the same DAG...)
- OK for the conf parameter of the TriggerDagRunOperator. It seems
useful in certain cases, but in our case we didn't find a way to
retrieve the values of the variables in the triggered DAG (we only get
the "content" inside the Jinja template, i.e. the string!)
- Regarding the datasets feature: yes, we have been exploring this new
feature for a few days, and it looks amazing!
Our use case seems specific: in short, we have thousands of objects to
process. For each object, we want to launch an instance of a DAG (the
DAG is the same for all the objects). We haven't sized our project yet
but, let's say, 500 objects should be processed in parallel (note that
the "initial" DAG triggers some other DAGs).
We are studying several ways to design this with Airflow, but for the
moment we are running into some issues... We are just at the beginning
of the study and maybe we haven't yet explored all of Airflow's
capabilities.
We will surely have questions afterwards... ;)
Best,
Hervé
On 07/11/2022 at 21:21, Daniel Standish via users wrote:
There is a DAG Dependencies view which I believe should show you the
relationships.
In your trigger dag operator, you can add a dag run conf where you can
supply json data. You could include any info you like there
(including where the triggering is coming from) and you can read it in
the downstream dag.
You could also consider looking at using datasets (a feature added in
2.4) for more event-based dag triggering.
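A minimal sketch of the conf approach described above, not an official recipe. The key names (`parent_dag_id`, `parent_run_id`, `object_id`) and both helper functions are hypothetical. The idea is that the triggering DAG passes the dict as `conf=` to the TriggerDagRunOperator, and a Python task in the triggered DAG reads it back from the `dag_run` object in its context, which yields real values rather than a template string. A `SimpleNamespace` stands in for the real DagRun so the snippet runs without Airflow.

```python
from types import SimpleNamespace


def build_conf(parent_dag_id: str, parent_run_id: str, object_id: str) -> dict:
    """Build the JSON-serializable payload to pass as conf= when triggering."""
    return {
        "parent_dag_id": parent_dag_id,
        "parent_run_id": parent_run_id,
        "object_id": object_id,
    }


def read_parent(dag_run) -> dict:
    """Read the payload back inside the triggered DAG.

    Meant to be called from a Python task with context["dag_run"].
    A run started without any conf may carry None or an empty dict,
    hence the fallback to {}.
    """
    conf = dag_run.conf or {}
    return {
        "parent_dag_id": conf.get("parent_dag_id"),
        "parent_run_id": conf.get("parent_run_id"),
        "object_id": conf.get("object_id"),
    }


# Stand-in for the DagRun object that Airflow puts in the task context.
fake_run = SimpleNamespace(
    conf=build_conf("parent_dag", "manual__2022-11-07", "obj_42")
)
info = read_parent(fake_run)
print(info["parent_dag_id"], info["object_id"])
```

Reading `dag_run.conf` in Python code, rather than writing `{{ dag_run.conf[...] }}` in an argument that is not templated, is one way to avoid ending up with the raw Jinja string.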
On Mon, Nov 7, 2022 at 11:30 AM Jarek Potiuk <[email protected]> wrote:
Parent_dag was (is) used for SubDags (which are discouraged). When you
trigger a DAG, the best way to make the link is to generate the run_id
in such a way that you can uniquely identify the "triggering" DAG. Also,
in the XCom of the triggering task you will find the "triggered" run_id
and execution date (this is used to generate the extra link). I don't
think there is any other place where this information is kept.
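A rough sketch of the run_id idea, under stated assumptions: the `triggered__<parent dag>__<object>__<parent run>` layout, the `__` separator and both helper names are hypothetical, and the parsing assumes that neither the parent DAG id nor the object id contains a double underscore. The generated string could then be given to the operator's `trigger_run_id` argument, assuming your Airflow version provides it.

```python
SEP = "__"


def make_run_id(parent_dag_id: str, parent_run_id: str, object_id: str) -> str:
    """Encode the triggering DAG, the object, and the triggering run in a run_id.

    The parent run_id goes last because Airflow-generated run ids
    (e.g. "scheduled__<date>") already contain the separator.
    """
    return SEP.join(("triggered", parent_dag_id, object_id, parent_run_id))


def parse_run_id(run_id: str) -> dict:
    """Recover the parts; maxsplit=3 keeps the parent run_id in one piece."""
    prefix, parent_dag_id, object_id, parent_run_id = run_id.split(SEP, 3)
    if prefix != "triggered":
        raise ValueError(f"not a run_id produced by make_run_id: {run_id!r}")
    return {
        "parent_dag_id": parent_dag_id,
        "object_id": object_id,
        "parent_run_id": parent_run_id,
    }


run_id = make_run_id("parent_dag", "scheduled__2022-11-07T00:00:00+00:00", "obj_42")
parts = parse_run_id(run_id)
print(run_id)
print(parts["parent_dag_id"])
```

Including the object id keeps the run_id unique when the same parent run triggers many instances of the same DAG.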
On Mon, Nov 7, 2022 at 4:48 PM Hervé Ballans
<[email protected]> wrote:
>
> Hi Airflowers,
>
> I have a question about the TriggerDagRunOperator in a
> single-environment cross-DAG dependencies design:
>
> I thought that when a DAG was triggered by another DAG with this
> operator, the triggered DAG had the information that it was triggered
> by this other DAG...
> But it seems that this is not the case?
>
> When I go through the GUI, into the "Details" tab of the triggered
> DAG, I can't see this information anywhere. There is, however, a
> "parent_dag" attribute, but with a 'None' value (though maybe it has
> nothing to do with this...)
>
> Also, is there a way to retrieve this information from the triggered
> DAG? Especially the name of the "parent" DAG that triggered the
> current DAG?
>
> Maybe I missed something in the understanding of this operator?
>
> Thanks,
>
> Hervé
>