chigili commented on PR #63166:
URL: https://github.com/apache/airflow/pull/63166#issuecomment-4033722552

   > Hmmmm. 4ms vs 1.2ms -- I'm not sure that absolute speed difference is worth the cost of a) applying a migration, and the impact on creating every single dag run.
   > 
   > If it's not much work are you able to run against 500k or 5m dag runs?
   
   @ashb 
   
   
     ┌──────┬────────────────┬────────────────┬─────────┐
     │ Rows │ OLD (COALESCE) │ NEW (sargable) │ Speedup │
     ├──────┼────────────────┼────────────────┼─────────┤
     │ 50k  │ 4ms            │ 1.3ms          │ 3x      │
     ├──────┼────────────────┼────────────────┼─────────┤
     │ 500k │ 18ms           │ 13ms           │ 1.4x    │
     ├──────┼────────────────┼────────────────┼─────────┤
     │ 5M   │ 234ms          │ 22ms           │ 10x     │
     └──────┴────────────────┴────────────────┴─────────┘
   
     At 5M rows:
     - OLD: Parallel Seq Scan. Reads all 5M rows and removes 1.65M by filter. 234ms
     - NEW: Bitmap Index Scan on idx_dag_run_end_date. Jumps straight to matching rows. 22ms
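
To illustrate why wrapping the column in COALESCE pushes the planner to a full scan, here is a minimal sketch using Python's built-in `sqlite3` rather than PostgreSQL. The table layout, the sentinel date, the cutoff, and both queries are assumptions made up for illustration; they are not the queries from this PR, and SQLite's planner output differs from PostgreSQL's `EXPLAIN`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, stripped-down stand-in for the dag_run table.
conn.execute("CREATE TABLE dag_run (id INTEGER PRIMARY KEY, end_date TEXT)")
conn.execute("CREATE INDEX idx_dag_run_end_date ON dag_run (end_date)")

# Every third run is still in progress (end_date IS NULL).
data = [(None if i % 3 == 0 else f"2024-01-{i % 28 + 1:02d}",) for i in range(3000)]
conn.executemany("INSERT INTO dag_run (end_date) VALUES (?)", data)

CUTOFF = "2024-01-10"
# Non-sargable: the indexed column is hidden inside a function call.
OLD_SQL = "SELECT COUNT(*) FROM dag_run WHERE COALESCE(end_date, '9999-12-31') < ?"
# Sargable: the bare column is compared, so the index can bound the range.
# NULL end_date rows are excluded by both forms, so the results match.
NEW_SQL = "SELECT COUNT(*) FROM dag_run WHERE end_date < ?"


def plan(sql):
    """Return the planner's description (last column of EXPLAIN QUERY PLAN)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, (CUTOFF,)).fetchall()
    return " | ".join(row[3] for row in rows)


def count(sql):
    return conn.execute(sql, (CUTOFF,)).fetchone()[0]


old_plan, new_plan = plan(OLD_SQL), plan(NEW_SQL)
old_count, new_count = count(OLD_SQL), count(NEW_SQL)
print("OLD:", old_plan)
print("NEW:", new_plan)
print("same result:", old_count == new_count)
```

In SQLite the first plan is reported as a SCAN (every entry is visited) and the second as a SEARCH on `idx_dag_run_end_date`, which is the same distinction as Seq Scan versus Bitmap Index Scan above.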
   
     The seq scan time grows roughly in proportion to table size (50k → 5M is 100x the rows for ~58x the time), while the index scan grows far more slowly (1.3ms → 22ms, ~17x).
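
For anyone who wants to check the ratios, here is a small sketch that recomputes them from the table. The timings are copied from the table above; the variable names are just for illustration.

```python
# rows -> (old_ms, new_ms), copied from the benchmark table
timings_ms = {
    50_000: (4.0, 1.3),
    500_000: (18.0, 13.0),
    5_000_000: (234.0, 22.0),
}

# Speedup of the sargable query at each table size.
speedups = {rows: old / new for rows, (old, new) in timings_ms.items()}

# Growth from the smallest to the largest table.
small, large = min(timings_ms), max(timings_ms)
row_growth = large / small
old_growth = timings_ms[large][0] / timings_ms[small][0]
new_growth = timings_ms[large][1] / timings_ms[small][1]

for rows, ratio in speedups.items():
    print(f"{rows:>9,} rows: {ratio:.1f}x faster")
print(f"rows grew {row_growth:.0f}x, OLD grew {old_growth:.1f}x, NEW grew {new_growth:.1f}x")
```

This works out to 100x the rows, about 58.5x the time for the old query, and about 17x for the new one.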

