Thanks a lot for the useful info.

Regards
Shubham Gupta

On Wed, Jul 25, 2018 at 7:48 PM Sid Anand <san...@apache.org> wrote:

> I will +1 James's comment and add to it. At Agari, one of our DAGs had as a
> final step the sending of an alert. The alerts only made sense when the DAG
> was current. But, sometimes, we did need to recompute some metrics based on
> historical data, but not alert on them. The LatestOnlyOperator was a good
> fit for this case.
>
> George/Ben,
> It would be great to document this discussion -- i.e. when to use one over
> another.
>
> -s
>
>
> On Mon, Jul 23, 2018 at 2:03 PM George Leslie-Waksman <waks...@gmail.com>
> wrote:
>
> > Ok, not so fringe; I'm glad it's working well for your use case, James.
> >
> > I retract my suggestion of deprecation.
> >
> > On Mon, Jul 23, 2018 at 12:58 PM James Meickle
> > <jmeic...@quantopian.com.invalid> wrote:
> >
> > > We use LatestOnlyOperator in production. Generally our data is
> > > available on a regular schedule, and we update production services
> > > with it as soon as it is available; we might occasionally want to
> > > re-run historical days, in which case we want to run the same DAG but
> > > without interacting with live production services at all.
> > >
> > > On Mon, Jul 23, 2018 at 2:18 PM, George Leslie-Waksman
> > > <waks...@gmail.com> wrote:
> > >
> > > > As the author of LatestOnlyOperator, the goal was a stopgap until
> > > > catchup=False landed.
> > > >
> > > > There are some (very) fringe use cases where you might still want
> > > > LatestOnlyOperator but in almost all cases what you want is probably
> > > > catchup=False.
> > > >
> > > > The situations where LatestOnlyOperator is still useful are where
> > > > you want to run most of your DAG for every schedule interval but you
> > > > want some of the tasks to run only on the latest run (not catching
> > > > up, not backfilling).
> > > >
> > > > It may be best to deprecate LatestOnlyOperator at this point to
> > > > avoid confusion.
> > > >
> > > > --George
> > > >
> > > > On Sat, Jul 21, 2018 at 7:34 PM Ben Tallman <btall...@gmail.com>
> > > > wrote:
> > > >
> > > > > As the author of catch-up, the idea is that in many cases your
> > > > > data doesn't "window" nicely and you want instead to just run as
> > > > > if it were a brilliant Cron...
> > > > >
> > > > > Ben
> > > > >
> > > > > Sent from my iPhone
> > > > >
> > > > > > On Jul 20, 2018, at 11:39 PM, Shah Altaf <mend...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > Hi, my understanding is: if you use the LatestOnlyOperator then
> > > > > > when you run the DAG for the first time you'll see a whole bunch
> > > > > > of DAG runs queued up, and in each run the LatestOnlyOperator
> > > > > > will cause the rest of the DAG run to be skipped. Only the
> > > > > > latest DAG run will run in 'full'.
> > > > > >
> > > > > > With catchup = False, you should get just the latest DAG run.
> > > > > >
> > > > > >
> > > > > > On Fri, Jul 20, 2018 at 10:58 PM Shubham Gupta
> > > > > > <shubham180695...@gmail.com> wrote:
> > > > > >
> > > > > >> ---------- Forwarded message ---------
> > > > > >> From: Shubham Gupta <shubham180695...@gmail.com>
> > > > > >> Date: Fri, Jul 20, 2018 at 2:38 PM
> > > > > >> Subject: Catchup By default = False vs LatestOnlyOperator
> > > > > >> To: <dev-subscr...@airflow.incubator.apache.org>
> > > > > >>
> > > > > >>
> > > > > >> Hi!
> > > > > >>
> > > > > >> Can someone please explain the difference b/w catchup by
> > > > > >> default = False and LatestOnlyOperator?
> > > > > >>
> > > > > >> Regards
> > > > > >> Shubham Gupta
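
The distinction discussed in this thread can be illustrated with a small toy model in plain Python. It does not import Airflow and is not how the scheduler is implemented. The function names (`daily_intervals`, `plan_runs`), the task names, and the dates are all hypothetical, and "latest" is simplified to mean the last interval in the list. The sketch only mirrors the behavior described in the messages: catchup=False creates just the latest run, whereas a LatestOnlyOperator-style gate still creates every run but skips the gated tasks in all runs except the latest.

```python
from datetime import date, timedelta


def daily_intervals(start, end):
    """Return every daily schedule date from start to end, inclusive."""
    days = (end - start).days
    return [start + timedelta(days=n) for n in range(days + 1)]


def plan_runs(intervals, catchup, latest_only_tasks=frozenset(),
              tasks=("extract", "compute_metrics", "send_alert")):
    """Toy model: map each created run to the tasks that actually execute.

    Assumption: the "latest" run is simply the last interval in the list.
    """
    if not intervals:
        return {}
    latest = intervals[-1]
    # With catchup disabled, only the most recent interval gets a run at all.
    scheduled = intervals if catchup else [latest]
    plan = {}
    for day in scheduled:
        # Tasks gated as "latest only" are skipped in every run but the latest.
        plan[day] = [t for t in tasks
                     if t not in latest_only_tasks or day == latest]
    return plan


# Hypothetical three-day history, first scheduled on the last day.
intervals = daily_intervals(date(2018, 7, 18), date(2018, 7, 20))

# catchup=False: a single run, with every task executed.
no_catchup = plan_runs(intervals, catchup=False)

# Latest-only gate on the alert: three runs, alert only in the latest one.
latest_only = plan_runs(intervals, catchup=True,
                        latest_only_tasks={"send_alert"})

for day, executed in latest_only.items():
    print(day, executed)
```

In this model the second configuration matches the Agari and Quantopian use cases quoted in the thread: historical days are still recomputed, but the step that touches live systems (here the hypothetical `send_alert`) runs only for the current interval.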