yuqian90 commented on issue #7038: [AIRFLOW-4495] allow externally triggered 
dags to run for future exec dates
URL: https://github.com/apache/airflow/pull/7038#issuecomment-572931603
 
 
   @potiuk  @kaxil 
   
   I want to bring this PR to the attention of the committers. I brought this 
up in an older PR, but let me explain again why this PR is useful.
   
   In our use of Airflow, we have to download a file named 
some_name_20190801.txt at 20190801 08:00 Tokyo time. Intuitively, we want to 
put this task on the DAG with execution_date 20190801. At the moment this is 
not possible with Airflow, even with externally triggered DAGs (i.e. 
`schedule_interval=None`), because the Airflow scheduler does not consider 
tasks on the 20190801 DAG for execution until 20190801 00:00 UTC, which is 
20190801 09:00 Tokyo time. So the earliest we can run this task is 09:00 Tokyo 
time, one hour later than we need.
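
   To make the timing concrete, here is a small sketch using only the Python 
standard library. It assumes Tokyo is a fixed UTC+9 offset (Japan has no 
daylight saving time). The variable names are only for illustration.

```python
from datetime import datetime, timedelta, timezone

# Tokyo is UTC+9 year-round, so a fixed offset is enough for this sketch.
JST = timezone(timedelta(hours=9), "JST")

# When we want the download to run: 20190801 08:00 Tokyo time.
desired_run = datetime(2019, 8, 1, 8, 0, tzinfo=JST)

# When the scheduler first considers the 20190801 DAG: 20190801 00:00 UTC.
earliest_allowed = datetime(2019, 8, 1, 0, 0, tzinfo=timezone.utc)

print(desired_run.astimezone(timezone.utc))  # 2019-07-31 23:00:00+00:00
print(earliest_allowed.astimezone(JST))      # 2019-08-01 09:00:00+09:00
print(earliest_allowed - desired_run)        # 1:00:00
```

   The desired run time falls on 20190731 in UTC, which is why the 20190801 
DAG cannot start at 08:00 Tokyo time.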
   
   With this constraint in mind, we came up with a temporary workaround: we 
put the downloader for some_name_20190801.txt on the execution_date 20190731 
DAG. This works, but it is not great because the Web UI shows the DAG 
execution_date as 20190731. Anyone looking at the DAG gets confused because 
they assume it is downloading files for 20190731.
   
   With this PR by @tooptoop4, we will finally be able to put the downloader 
for some_name_20190801.txt on the 20190801 DAG. This greatly simplifies things.
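
   The difference between the workaround and what this PR allows can be 
sketched as a mapping from execution_date to file name. The helper names below 
are hypothetical and not part of Airflow; they only illustrate the off-by-one 
day that confuses people.

```python
from datetime import date, timedelta


def file_name(file_date: date) -> str:
    """Build the name of the file published for a given calendar date."""
    return f"some_name_{file_date:%Y%m%d}.txt"


def file_for_run_workaround(execution_date: date) -> str:
    """Workaround: the run for day D downloads the file for day D + 1."""
    return file_name(execution_date + timedelta(days=1))


def file_for_run_with_pr(execution_date: date) -> str:
    """With this PR: the run for day D downloads the file for day D."""
    return file_name(execution_date)


print(file_for_run_workaround(date(2019, 7, 31)))  # some_name_20190801.txt
print(file_for_run_with_pr(date(2019, 8, 1)))      # some_name_20190801.txt
```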
   
   There are other examples of why this is useful. What I brought up here is 
only a simple one for illustration.
   
   This PR should be pretty safe to merge because it only changes the behaviour 
if the DAG has `schedule_interval` set to `None` and `run_future_exec_dates` is 
set to True in `airflow.cfg`. Even then, it only makes a difference if the user 
externally triggers the DAG before midnight UTC of the execution_date.
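
   Put as a predicate, the conditions above look roughly like this. This is 
not the code in the PR, only my restatement of when the behaviour differs; the 
function name and the `triggered_at` parameter are hypothetical.

```python
from datetime import datetime, timezone
from typing import Optional


def behaviour_changes(
    schedule_interval: Optional[str],
    run_future_exec_dates: bool,
    execution_date: datetime,
    triggered_at: datetime,
) -> bool:
    """Return True only when all three conditions described above hold."""
    return bool(
        schedule_interval is None             # externally triggered DAG only
        and run_future_exec_dates             # option enabled in airflow.cfg
        and triggered_at < execution_date     # triggered before midnight UTC
    )


execution_date = datetime(2019, 8, 1, 0, 0, tzinfo=timezone.utc)
early_trigger = datetime(2019, 7, 31, 23, 0, tzinfo=timezone.utc)  # 08:00 Tokyo

print(behaviour_changes(None, True, execution_date, early_trigger))      # True
print(behaviour_changes(None, False, execution_date, early_trigger))     # False
print(behaviour_changes("@daily", True, execution_date, early_trigger))  # False
```

   In every other combination, including any DAG that has a schedule, nothing 
changes.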
   
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services
