ubermen commented on issue #3840: [AIRFLOW-3001] Add task_instance table index 
'ti_dag_date'
URL: 
https://github.com/apache/incubator-airflow/pull/3840#issuecomment-419279096
 
 
   There was no index on (dag_id, execution_date). So when the scheduler 
looks up all task instances (tis) of a dagrun with a query like "select * from 
task_instance where dag_id = '~' and execution_date = '~' ", the query uses 
the ti_dag_state index (I tested this in MySQL Workbench). This is probably 
not a problem when the range of execution_date values is small (under 1000 
dagruns), but I experienced slow allocation of tis once a DAG had accumulated 
more than 1000 dagruns. I am now running Airflow with a new index on 
(dag_id, execution_date) added to the task_instance table. I have attached 
the results of my test:
   
![image](https://user-images.githubusercontent.com/6738941/45191171-bc525000-b27c-11e8-9762-bfd18cf99011.png)
   
![image](https://user-images.githubusercontent.com/6738941/45191184-d2f8a700-b27c-11e8-8739-fda9742985ff.png)
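
For readers who cannot view the screenshots, here is a minimal sketch of the same kind of comparison. It rests on several assumptions: it uses an in-memory SQLite database rather than MySQL, a simplified four-column `task_instance` table, and a `ti_dag_state` index defined on (dag_id, state). The helper `query_plan` and the sample parameter values are hypothetical. Query planners differ between engines, so this only illustrates the idea and does not reproduce the measurements above.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the planner's description of how it would run `sql`."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row holds the human-readable plan detail.
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")

# Simplified stand-in for the real table (assumption: only these columns).
conn.execute(
    """
    CREATE TABLE task_instance (
        task_id        TEXT,
        dag_id         TEXT,
        execution_date TEXT,
        state          TEXT
    )
    """
)

# Existing index, assumed here to cover (dag_id, state).
conn.execute("CREATE INDEX ti_dag_state ON task_instance (dag_id, state)")

lookup = "SELECT * FROM task_instance WHERE dag_id = ? AND execution_date = ?"
params = ("example_dag", "2018-09-06T00:00:00")  # hypothetical values

# Without a (dag_id, execution_date) index, only dag_id can be narrowed by index.
plan_before = query_plan(conn, lookup, params)

# The index proposed in this PR.
conn.execute(
    "CREATE INDEX ti_dag_date ON task_instance (dag_id, execution_date)"
)
plan_after = query_plan(conn, lookup, params)

print("before:", plan_before)
print("after: ", plan_after)
```

On MySQL the equivalent check would be to run `EXPLAIN` on the same select before and after creating the index and compare the `key` and `rows` columns.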
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services
