[GitHub] ubermen edited a comment on issue #3840: [AIRFLOW-3001] Add task_instance table index 'ti_dag_date'

2018-09-09 Thread GitBox
ubermen edited a comment on issue #3840: [AIRFLOW-3001] Add task_instance table 
index 'ti_dag_date'
URL: 
https://github.com/apache/incubator-airflow/pull/3840#issuecomment-419279096
 
 
   I have updated the PR, thanks! I will reopen the PR after writing a new 
Alembic script.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ubermen edited a comment on issue #3840: [AIRFLOW-3001] Add task_instance table index 'ti_dag_date'

2018-09-06 Thread GitBox
ubermen edited a comment on issue #3840: [AIRFLOW-3001] Add task_instance table 
index 'ti_dag_date'
URL: 
https://github.com/apache/incubator-airflow/pull/3840#issuecomment-419279096
 
 
   I have updated the PR, thanks! If this PR is reasonable, do I have to create 
a new version of the schema file?
   




[GitHub] ubermen edited a comment on issue #3840: [AIRFLOW-3001] Add task_instance table index 'ti_dag_date'

2018-09-06 Thread GitBox
ubermen edited a comment on issue #3840: [AIRFLOW-3001] Add task_instance table 
index 'ti_dag_date'
URL: 
https://github.com/apache/incubator-airflow/pull/3840#issuecomment-419279096
 
 
   I have updated the PR, thanks!
   




[GitHub] ubermen edited a comment on issue #3840: [AIRFLOW-3001] Add task_instance table index 'ti_dag_date'

2018-09-06 Thread GitBox
ubermen edited a comment on issue #3840: [AIRFLOW-3001] Add task_instance table 
index 'ti_dag_date'
URL: 
https://github.com/apache/incubator-airflow/pull/3840#issuecomment-419279096
 
 
   There was no index composed of dag_id and execution_date. So when the 
scheduler finds all TIs of a DAG run with a query like "select * from 
task_instance where dag_id = 'some_id' and execution_date = '2018-09-01 ...'", 
the query uses the ti_dag_state index (I tested this in MySQL Workbench). This 
is probably not a problem when the range of execution_date is small (under 
1000 DAG runs), but I experienced slow allocation of TIs once a DAG had 
accumulated 1000+ DAG runs. So I am now running Airflow with a new index, 
ti_dag_date (dag_id, execution_date), added to the task_instance table. I have 
attached the results of my test:
   
![image](https://user-images.githubusercontent.com/6738941/45191171-bc525000-b27c-11e8-9762-bfd18cf99011.png)
   
![image](https://user-images.githubusercontent.com/6738941/45191184-d2f8a700-b27c-11e8-8739-fda9742985ff.png)
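
   For anyone who wants to reproduce the effect without a MySQL instance, here is a minimal sketch using Python's built-in sqlite3 module. It is only an illustration under assumptions: the table is a simplified stand-in for task_instance (four columns only), the data is synthetic, and SQLite's planner is not MySQL's, so the plans are analogous rather than identical to the screenshots above. The helper name `query_plan` is made up for this example.

```python
import sqlite3
from datetime import datetime, timedelta

# The query shape described in the comment above.
QUERY = "SELECT * FROM task_instance WHERE dag_id = ? AND execution_date = ?"


def query_plan(conn, sql, params):
    """Return SQLite's EXPLAIN QUERY PLAN detail text for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each plan row is the human-readable detail string.
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")

# Simplified stand-in for the task_instance table (assumption: the real
# table has more columns; only the ones relevant to the query are kept).
conn.execute(
    """
    CREATE TABLE task_instance (
        task_id        TEXT NOT NULL,
        dag_id         TEXT NOT NULL,
        execution_date TEXT NOT NULL,
        state          TEXT,
        PRIMARY KEY (task_id, dag_id, execution_date)
    )
    """
)
# The existing index mentioned in the comment.
conn.execute("CREATE INDEX ti_dag_state ON task_instance (dag_id, state)")

# Synthetic data: one DAG with 1200 runs of 5 tasks each.
base = datetime(2016, 1, 1)
execution_dates = [(base + timedelta(days=i)).isoformat(sep=" ") for i in range(1200)]
conn.executemany(
    "INSERT INTO task_instance VALUES (?, ?, ?, ?)",
    [(f"task_{t}", "some_id", d, "success") for d in execution_dates for t in range(5)],
)

params = ("some_id", execution_dates[600])
plan_before = query_plan(conn, QUERY, params)

# The index proposed in this PR.
conn.execute("CREATE INDEX ti_dag_date ON task_instance (dag_id, execution_date)")
plan_after = query_plan(conn, QUERY, params)

matching_rows = conn.execute(QUERY, params).fetchall()

print("before:", plan_before)
print("after: ", plan_after)
print("rows:  ", len(matching_rows))
```

   Comparing the two printed plans shows whether the planner switches to ti_dag_date once it exists. With only an index that leads with dag_id, every row of the DAG has to be examined and then filtered by execution_date, which is why the cost grows with the number of accumulated DAG runs.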
   




[GitHub] ubermen edited a comment on issue #3840: [AIRFLOW-3001] Add task_instance table index 'ti_dag_date'

2018-09-06 Thread GitBox
ubermen edited a comment on issue #3840: [AIRFLOW-3001] Add task_instance table 
index 'ti_dag_date'
URL: 
https://github.com/apache/incubator-airflow/pull/3840#issuecomment-419279096
 
 
   There was no index composed of dag_id and execution_date. So when the 
scheduler finds all TIs of a DAG run with a query like "select * from 
task_instance where dag_id = 'some_id' and execution_date = '2018-09-01 ...'", 
the query uses the ti_dag_state index (I tested this in MySQL Workbench). This 
is probably not a problem when the range of execution_date is small (under 
1000 DAG runs), but I experienced slow allocation of TIs once a DAG had 
accumulated 1000+ DAG runs. So I am now running Airflow with a new index on 
(dag_id, execution_date) added to the task_instance table. I have attached the 
results of my test:
   
![image](https://user-images.githubusercontent.com/6738941/45191171-bc525000-b27c-11e8-9762-bfd18cf99011.png)
   
![image](https://user-images.githubusercontent.com/6738941/45191184-d2f8a700-b27c-11e8-8739-fda9742985ff.png)
   

