These are the default args for my DAG.
I am trying to run a standard hourly job (basically, at the end of
each hour, process the previous hour's data).
I noticed that my pipeline runs 1 hour late.
I suspect I am setting start_date incorrectly.
What is the best practice for setting start_date?
from datetime import datetime, timedelta

# Top of the current hour (UTC) plus 15 minutes
scheduling_start_date = (datetime.utcnow().replace(minute=0, second=0, microsecond=0)
                         + timedelta(minutes=15))
default_schedule_interval = timedelta(minutes=60)
default_args = {
'owner': 'airflow',
'depends_on_past': False,
'start_date': scheduling_start_date,
'email': ['[email protected]'],
'email_on_failure': False,
'email_on_retry': False,
'retries': 2,
'retry_delay': default_retries_delay,
'schedule_interval': default_schedule_interval,
# 'queue': 'bash_queue',
# 'pool': 'backfill',
# 'priority_weight': 10,
# 'end_date': datetime(2016, 1, 1),
}
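
To make the timing in question concrete, here is a minimal sketch in plain Python (no Airflow import). It assumes that a scheduled run covers the interval starting at its execution date and is triggered only once that interval has ended. The helper name `first_run_times` and the fixed start date are hypothetical and used only for illustration.

```python
from datetime import datetime, timedelta


def first_run_times(start_date, interval, count=3):
    """Return (execution_date, trigger_time) pairs for the first `count` runs.

    Assumption: a run labeled with execution_date covers
    [execution_date, execution_date + interval) and is triggered
    only after that interval has ended.
    """
    runs = []
    for i in range(count):
        execution_date = start_date + i * interval
        runs.append((execution_date, execution_date + interval))
    return runs


# Fixed, hypothetical start date used only for illustration.
start = datetime(2016, 1, 1, 0, 0)
hourly = timedelta(hours=1)

for execution_date, trigger_time in first_run_times(start, hourly):
    print(f"run for {execution_date:%H:%M} triggers at {trigger_time:%H:%M}")
```

Under that assumption, each run starts one interval after the timestamp it is labeled with, which may be what looks like a 1 hour delay. A fixed start_date such as the one in the sketch also stays the same on every parse of the DAG file, whereas a value derived from utcnow() changes each time.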