Ah, I completely missed the question in my haste to do too many things.

Assuming you have a DAG named process_my_data with 3 tasks:
read_from_source_table --> transform --> write_to_new_table. This DAG
should have schedule_interval=None, i.e., no schedule, so it only runs
when triggered.
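
A rough sketch of what that DAG could look like (Airflow ~1.8 style; the
'source_table' conf key and the callables are illustrative assumptions,
not a prescribed layout):

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python_operator import PythonOperator


    def read_from_source_table(**context):
        # whatever you pass via `airflow trigger_dag -c '{"source_table": ...}'`
        # shows up in the dag_run conf
        table = context['dag_run'].conf['source_table']
        print('reading from %s' % table)


    def transform(**context):
        print('transforming')


    def write_to_new_table(**context):
        print('writing results to new table')


    dag = DAG(dag_id='process_my_data',
              schedule_interval=None,  # no schedule: runs only when triggered
              start_date=datetime(2017, 6, 1))

    t1 = PythonOperator(task_id='read_from_source_table',
                        python_callable=read_from_source_table,
                        provide_context=True, dag=dag)
    t2 = PythonOperator(task_id='transform',
                        python_callable=transform, dag=dag)
    t3 = PythonOperator(task_id='write_to_new_table',
                        python_callable=write_to_new_table, dag=dag)

    t1.set_downstream(t2)
    t2.set_downstream(t3)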

You could write a script that reads your list of source tables and, for
each table, calls airflow trigger_dag process_my_data -c '<a JSON string
with the params you want to pass to your first task>' -e '<execution
date>'. This will launch a DAG run for each input that you pass. I
believe the execution dates must differ by at least 1 second (timestamp
granularity in the db), so avoid a tight loop: put a 1-second sleep
between triggers.
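
For instance, a rough trigger loop might look like this (the table list,
the 'source_table' conf key, and using the current UTC timestamp as the
execution date are all illustrative assumptions):

    import json
    import subprocess
    import time
    from datetime import datetime

    # placeholder list: replace with however you read your source tables
    source_tables = ['table_a', 'table_b', 'table_c']

    for table in source_tables:
        conf = json.dumps({'source_table': table})
        exec_date = datetime.utcnow().isoformat()
        subprocess.check_call(['airflow', 'trigger_dag', 'process_my_data',
                               '-c', conf,
                               '-e', exec_date])
        time.sleep(1)  # execution dates must differ (1s db granularity)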

You will see N DAG runs, one for each of the N source tables that you pass
in.

-s

On Tue, Jun 20, 2017 at 12:22 PM, Maxime Beauchemin <
maximebeauche...@gmail.com> wrote:

> One DAG cannot have multiple shapes at one time, by design. You cannot
> parameterize things that will affect the shape of your DAG (though note
> that you can fully parameterize what happens within individual task
> instances). Think about it: a DAG is one (and only one) graph. It's NOT a
> shapeshifting thing.
>
> As a workaround, and this may or may not be the right thing to do, you can
> write a DAG factory function that returns a DAG object given parameters,
> but any given DAG instance (with a unique dag_id) has a single shape. If
> you do want to go that route, you may want to use
> `schedule_interval='@once'`
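>
> A rough sketch of such a factory (the parameter names and the
> module-level registration loop are illustrative assumptions, not a
> prescribed pattern):
>
>     from datetime import datetime
>
>     from airflow import DAG
>     from airflow.operators.python_operator import PythonOperator
>
>     def create_dag(dag_id, source_table):
>         # each call returns one fixed-shape DAG for the given params
>         dag = DAG(dag_id=dag_id, schedule_interval='@once',
>                   start_date=datetime(2017, 6, 1))
>
>         def process():
>             print('processing %s' % source_table)
>
>         PythonOperator(task_id='process', python_callable=process,
>                        dag=dag)
>         return dag
>
>     # DAG objects must be module-level globals for the scheduler to
>     # discover them
>     for table in ['table_a', 'table_b']:
>         globals()['process_%s' % table] = create_dag(
>             'process_%s' % table, table)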
>
> If you think the shape of your DAG needs to change from one DAG run to the
> next, you may want to re-think what is static and what is dynamic. Are your
> database tables' schemas changing from one DAG run to the next? No, right?
> That'd be crazy! Most likely you want to think about the shape of your DAG
> the same way you think about the schema of your tables: static or
> slowly changing.
>
> Max
>
> On Mon, Jun 19, 2017 at 4:11 AM, Rob Harrison <robh...@gmail.com> wrote:
>
> > Hi,
> >
> > I would like to pass a variable to my Airflow DAG and would like to
> > know if there is a recommended method for doing this.
> >
> > I am hoping to create a DAG with Python operators and tasks that read
> > data from a Parquet table, perform a calculation, then write the results
> > into a new table. I'd like to pass the source table name in along with
> > the task when calling the DAG from the command line.
> >
> > From what I have read, the following can be used to set a variable from
> > the command line:
> >
> > airflow variables -s myvar "value"
> >
> > Does anyone have an example of this they can share?
> >
> > Thank you,
> > Rob
> >
>
