Hi,

What do you mean by "discarding"? What is the outcome you are after?
If what you want is a DagRun that matches your start_date, you can do that
from the UI: create a new DagRun that matches your desired start_date, which
essentially "re-seeds" the point from which future DagRuns will be created.
You may also want to deactivate older `running` DagRuns, which you can also
do from the UI.

Max

On Tue, Aug 9, 2016 at 9:24 PM, הילה ויזן <[email protected]> wrote:

> Hi Maxime,
>
> Thanks for the clarifications.
> I've already read this page while trying to find a solution to my problem.
>
> But I still have the question - is there any way to discard the previous
> definitions? (for example the 'start_date' of a DAG)
>
> Thanks
>
> On Wed, Aug 10, 2016 at 1:37 AM, Maxime Beauchemin
> <[email protected]> wrote:
>
> > From http://pythonhosted.org/airflow/faq.html:
> >
> > *What's the deal with ``start_date``?*
> >
> > start_date is partly legacy from the pre-DagRun era, but it is still
> > relevant in many ways. When creating a new DAG, you probably want to set
> > a global start_date for your tasks using default_args. The first DagRun
> > to be created will be based on the min(start_date) for all your tasks.
> > From that point on, the scheduler creates new DagRuns based on your
> > schedule_interval, and the corresponding task instances run as your
> > dependencies are met. When introducing new tasks to your DAG, you need
> > to pay special attention to start_date, and may want to reactivate
> > inactive DagRuns to get the new task onboarded properly.
> >
> > We recommend against using dynamic values as start_date, especially
> > datetime.now(), as it can be quite confusing. The task is triggered once
> > the period closes, and in theory an @hourly DAG would never get to an
> > hour after now, as now() moves along.
> >
> > Previously we also recommended using a rounded start_date in relation to
> > your schedule_interval. This meant an @hourly job would be at 00:00
> > minutes:seconds, a @daily job at midnight, a @monthly job on the first
> > of the month. This is no longer required. Airflow will now auto-align
> > the start_date and the schedule_interval, by using the start_date as
> > the moment to start looking.
> >
> > You can use any sensor or a TimeDeltaSensor to delay the execution of
> > tasks within the schedule interval. While schedule_interval does allow
> > specifying a datetime.timedelta object, we recommend using the macros or
> > cron expressions instead, as it enforces this idea of rounded schedules.
> >
> > When using depends_on_past=True it's important to pay special attention
> > to start_date, as the past dependency is not enforced only on the
> > specific schedule of the start_date specified for the task. It's also
> > important to watch DagRun activity status in time when introducing new
> > depends_on_past=True tasks, unless you are planning on running a
> > backfill for the new task(s).
> >
> > Also important to note is that the task's start_date, in the context of
> > a backfill CLI command, gets overridden by the backfill command's
> > start_date. This allows a backfill on tasks that have
> > depends_on_past=True to actually start; if that weren't the case, the
> > backfill just wouldn't start.
> >
> > On Tue, Aug 9, 2016 at 7:44 AM, הילה ויזן <[email protected]> wrote:
> >
> > > Hi,
> > >
> > > We're experiencing a strange problem with the start_date configuration
> > > in Airflow.
> > >
> > > When we first ran the DAGs, we defined the start_date as
> > > 'datetime.now()', which at the time was 01/08/2016. This worked fine.
> > > A week afterwards, we changed the DAGs to a specific newer date -
> > > 08/08/2016 - and reset all of the tasks. After resetting Airflow and
> > > all of the DAGs, *we are still seeing the tasks running from the
> > > original date (01/08)*. Why is this happening?
> > >
> > > We don't understand why the tasks are still using the old date. Is
> > > there a cache/DB/persistent file that the DAG reads on startup that
> > > overrides our definition? Is it maybe Celery? We would really
> > > appreciate your input because we are totally stuck.
> > >
> > > We use Airflow version 1.7.1.3 with Postgres as the backend DB.
> > > In addition, we run in CeleryExecutor mode with RabbitMQ as the Celery
> > > backend.
> > >
> > > Thank you,
> > > Hila
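[Editor's note] For readers landing on this thread: below is a minimal sketch
of the pattern the quoted FAQ recommends - a static, rounded start_date passed
through default_args instead of datetime.now(). The DAG id, dates, and task
are illustrative placeholders, not taken from the thread, and the import
paths are for Airflow 1.x, so they may differ slightly between versions.

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.bash_operator import BashOperator

    # A static, rounded start_date; avoid datetime.now(), which moves every
    # time the scheduler parses the file and confuses DagRun creation.
    default_args = {
        'owner': 'airflow',
        'start_date': datetime(2016, 8, 8),  # placeholder date for illustration
        'depends_on_past': False,
    }

    dag = DAG(
        dag_id='example_static_start_date',  # hypothetical DAG id
        default_args=default_args,
        schedule_interval='@daily',  # a preset or cron expression keeps the schedule rounded
    )

    print_date = BashOperator(
        task_id='print_date',
        bash_command='date',
        dag=dag,
    )

To move an existing DAG forward after changing start_date, Max's suggestion
above (creating a new DagRun from the UI and deactivating the older running
ones) is the way to re-seed scheduling rather than relying on the metadata
the scheduler has already recorded.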
