chigili commented on PR #63166:
URL: https://github.com/apache/airflow/pull/63166#issuecomment-4033375328

   > Do you have a query plan showing that coalesce makes it not using indexes?
   
   @ashb , 
   
     Generated on Postgres with 50,000 DagRun rows. Key takeaway: OLD forces a 
Seq Scan (scans all 50k rows, removes 43,200 by filter) because COALESCE() 
wraps the column. NEW uses Bitmap Index Scan on 
     idx_dag_run_start_date — touches only the relevant rows. Execution time 
drops from 4.0ms → 1.3ms on this test dataset; the gap widens with larger 
tables. 
     
     Please find the query plan as below:
   
     ======================================================================     
                                                                                
                                          
     QUERY PLAN: OLD — func.coalesce (non-sargable)                             
                                                                                
                                          
     ======================================================================     
                                                                                
                                          
     HashAggregate  (cost=2284.78..2284.79 rows=1 width=18) (actual 
time=3.986..3.987 rows=1 loops=1)                                               
                                                      
       Group Key: run_type                                                      
                                                                                
                                          
       Batches: 1  Memory Usage: 24kB                                           
                                                                                
                                          
       ->  Seq Scan on dag_run  (cost=0.00..2257.00 rows=5556 width=27) (actual 
time=2.851..3.471 rows=6800 loops=1)                                            
                                          
             Filter: ((COALESCE(start_date, '2026-03-10 
17:29:41.071763+00'::timestamp with time zone) >= '2026-02-08 
17:29:41.071763+00'::timestamp with time zone) AND (COALESCE(end_date, 
'2026-03-10  
     17:29:41.071763+00'::timestamp with time zone) <= '2026-03-10 
17:29:41.071763+00'::timestamp with time zone))                                 
                                                       
             Rows Removed by Filter: 43200                                      
                                                                                
                                          
     Planning Time: 0.169 ms                                                    
                                                                                
                                          
     Execution Time: 4.000 ms                                                   
                                                                                
                                          
                                                                                
                                                                                
                                          
     ======================================================================     
                                                                                
                                          
     QUERY PLAN: NEW — sargable OR/IS NULL                                      
                                                                                
                                          
     ======================================================================     
                                                                                
                                          
     HashAggregate  (cost=1800.33..1800.34 rows=1 width=26) (actual 
time=1.277..1.277 rows=1 loops=1)
       Group Key: run_type, state                                               
                                                                                
                                          
       Batches: 1  Memory Usage: 24kB
       ->  Bitmap Heap Scan on dag_run  (cost=139.34..1749.00 rows=6844 
width=35) (actual time=0.107..0.604 rows=6800 loops=1)                          
                                                  
             Recheck Cond: ((start_date >= '2026-02-08 
17:29:41.071763+00'::timestamp with time zone) OR (start_date IS NULL))         
                                                                   
             Filter: ((end_date <= '2026-03-10 17:29:41.071763+00'::timestamp 
with time zone) OR (end_date IS NULL))                                          
                                            
             Heap Blocks: exact=207                                             
                                                                                
                                          
             ->  BitmapOr  (cost=139.34..139.34 rows=6844 width=0) (actual 
time=0.095..0.095 rows=0 loops=1)                                               
                                               
                   ->  Bitmap Index Scan on idx_dag_run_start_date  
(cost=0.00..131.62 rows=6844 width=0) (actual time=0.094..0.094 rows=6800 
loops=1)                                                    
                         Index Cond: (start_date >= '2026-02-08 
17:29:41.071763+00'::timestamp with time zone)                                  
                                                          
                   ->  Bitmap Index Scan on idx_dag_run_start_date  
(cost=0.00..4.30 rows=1 width=0) (actual time=0.000..0.000 rows=0 loops=1)      
                                                      
                         Index Cond: (start_date IS NULL)                       
                                                                                
                                          
     Planning Time: 0.101 ms                                                    
                                                                                
                                          
     Execution Time: 1.290 ms                                                   
                                                                                
                                          
                                                                                
                                                                                
                                          
   
     
     I am going to include the benchmark code for this in the dev folder in the 
next commit.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to