I think it is a useful feature that nonetheless adds disproportionate
complexity -- for example, is there logic for when there is a task
downstream from an ad-hoc task, and the ad-hoc task isn't being run?

Perhaps there is a way to reengineer it around current Airflow idioms.
Maybe we can start by figuring out what exactly it's being used for? Here
are a few use-cases that come to mind (under the heading of "actions that
relate to my DAG but I don't want them to run every time... just at the
very beginning or occasionally on demand"):
- initializing a database table (and making sure it exists before running
downstream tasks)
- periodic maintenance, for example pruning, truncating tables, etc.
- initial logins, connection testing, etc.
- issuing some sort of debug command to a third party system before running
the rest of the DAG
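
To make the downstream question concrete, here is a toy sketch in plain Python. It is not Airflow code: the `Task` class and the `in_scope` / `orphaned` helpers are names I made up for illustration. The only behaviour it assumes is what Max describes further down the thread (ad-hoc tasks are skipped unless explicitly included, as with backfill's `--include_adhoc`).

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """Toy stand-in for an operator; NOT Airflow's BaseOperator."""
    task_id: str
    adhoc: bool = False
    upstream: list = field(default_factory=list)  # task_ids this task depends on


def in_scope(tasks, include_adhoc=False):
    """Task ids a run would pick up: ad-hoc ones only when asked for."""
    return {t.task_id for t in tasks if include_adhoc or not t.adhoc}


def orphaned(tasks, include_adhoc=False):
    """In-scope tasks with at least one upstream task that is out of scope."""
    scope = in_scope(tasks, include_adhoc)
    return sorted(
        t.task_id
        for t in tasks
        if t.task_id in scope and any(u not in scope for u in t.upstream)
    )


# Hypothetical DAG: an ad-hoc table initializer feeding two regular tasks.
tasks = [
    Task("create_table", adhoc=True),
    Task("load_data", upstream=["create_table"]),
    Task("report", upstream=["load_data"]),
]

print(orphaned(tasks))                      # run that skips ad-hoc tasks
print(orphaned(tasks, include_adhoc=True))  # run that includes them
```

In the first call `load_data` is reported, because its only upstream task is never going to run. What should happen to a task like that (wait forever, run anyway, fail?) is exactly the logic I'm not sure exists.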

On Wed, May 18, 2016 at 11:23 AM Maxime Beauchemin <
maximebeauche...@gmail.com> wrote:

> The idea there is to be able to ship on-demand tasks along with your DAG.
> Is it not used because it's not documented?
>
> Deprecating may be harder than maintaining it. We'd have to start warning
> about deprecation in 2.0 soon and add the PR that removes this in the
> [eventual] 2.0 branch.
>
> I don't feel strongly about the feature, I added it because we had use
> cases for it, and we didn't have externally triggered DAGs at the time. I
> get how it can become confusing, both from a usability and code maintenance
> perspective.
>
> Max
>
> On Tue, May 17, 2016 at 12:49 PM, Chris Riccomini <criccom...@apache.org>
> wrote:
>
> > Yea, it feels like a pretty edge-use case. It's not even documented. In the
> > interest of simplifying and reducing bugs it seems like we might just
> > want to nuke this, or completely rethink the use cases.
> >
> > On Tue, May 17, 2016 at 12:22 PM, Jeremiah Lowin <jlo...@apache.org>
> > wrote:
> >
> > > Perhaps ad-hoc tasks could be refactored as ad-hoc DAGs? It sounds like
> > > they are for infrequent initialization or maintenance tasks.
> > >
> > > On Tue, May 17, 2016 at 11:21 AM Arthur Wiedmer <art...@apache.org>
> > wrote:
> > >
> > > > We still have tasks in production that use this feature.
> > > >
> > > > > Sometimes, it has been used for one-off tasks that create simple
> > > > > static mapping tables (tables loaded from a static file that also
> > > > > lives in source control, creating a programmatically generated time
> > > > > dimension, etc.).
> > > >
> > > > > Of course, maybe just having the task in question as a script that
> > > > > uses the airflow utilities would be sufficient.
> > > >
> > > > Best,
> > > > Arthur
> > > >
> > > > On Tue, May 17, 2016 at 10:40 AM, Chris Riccomini <
> > criccom...@apache.org
> > > >
> > > > wrote:
> > > >
> > > > > @Bolke/@Jeremiah
> > > > >
> > > > > > When you make your changes to unify the backfiller and scheduler,
> > > > > > it sounds like this can go away, right?
> > > > >
> > > > > On Tue, May 17, 2016 at 10:38 AM, Maxime Beauchemin <
> > > > > maximebeauche...@gmail.com> wrote:
> > > > >
> > > > > > > The scheduler won't trigger where `adhoc=True`. The CLI's
> > > > > > > backfill/test/run is the only way to trigger where `adhoc=True`.
> > > > > > > For backfill specifically, there's a `-a`, `--include_adhoc` flag
> > > > > > > to make these tasks in-scope to the backfill.
> > > > > >
> > > > > > Max
> > > > > >
> > > > > > On Tue, May 17, 2016 at 10:14 AM, Chris Riccomini <
> > > > criccom...@apache.org
> > > > > >
> > > > > > wrote:
> > > > > >
> > > > > > > Hey all,
> > > > > > >
> > > > > > > > Curious about what the 'adhoc' property is in BaseOperator. It
> > > > > > > > appears to be completely undocumented. What is this?
> > > > > > >
> > > > > > > Cheers,
> > > > > > > Chris
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
