All,
   On 9.1, I am running into a curious issue. I will explain the issue in high 
level terms using psuedo SQL statements. Consider a SQL statement:

SELECT a, b, c FROM tab WHERE a = value1;

- This does an index scan followed by a merge join and takes about 37 secs to 
execute

If I change the query to:

SELECT a, b, c FROM tab WHERE a BETWEEN value1 AND value1; -- The start and end 
are the same

- This query does a index scan and nested loop join and takes forever to run 
(as a matter of fact, it has never finished even after 8+ hours!!!!)

My actual query:

First Run with BETWEEN clause range of 4 days (Execution time = about 5 mins)

  select a.date_id Date,
    a.page_group page_group,
    a.page page,
    ...  --{snipped for brevity}
 from bi2003.alps_agg a left outer join bi2003.event_agg b
    on (a.date_id = b.date_id and upper(a.adc_visit) = 
upper(bicommon.get_value('adc_visit', b.vcset)))
 WHERE a.date_id between 20120228 and 20120302
   and b.date_id between 20120228 and 20120302
   and a.page = 'ddi_671'
 group by 1,2,3,4,5,6,7,8,9;

Explain Analyze:

 "GroupAggregate  (cost=1214313.29..1923536.69 rows=274096 width=3040) (actual 
time=275946.288..312348.732 rows=861 loops=1)"
"  Output: a.date_id, a.page_group, a.page, a.int_alloc_type, 
(bicommon.get_value('browserfamily'::text, b.vcset)), 
(bicommon.get_value('trafficsource'::text, b.vcset)), 
(bicommon.get_value('e671_hl_p1'::text, a.componentset)), 
(bicommon.get_value('e671_img_p1'::text, a.componentset)), 
(bicommon.get_value('e671_formLabels'::text, a.componentset)), 
count(a.adc_visit), sum((bicommon.get_value('revenue'::text, 
b.eventvalueset))::numeric(9,0)), sum((bicommon.get_value('Impression'::text, 
b.eventcountset))::numeric(9,0)), sum((bicommon.get_value('page1submit'::text, 
b.eventcountset))::numeric(9,0)), sum((bicommon.get_value('conversion'::text, 
b.eventcountset))::numeric(9,0)), sum((bicommon.get_value('lead'::text, 
b.eventcountset))::numeric(9,0))"
"  ->  Sort  (cost=1214313.29..1214998.53 rows=274096 width=3040) (actual 
time=275785.085..293169.601 rows=147480 loops=1)"
"        Output: a.date_id, a.page_group, a.page, a.int_alloc_type, b.vcset, 
a.componentset, a.adc_visit, b.eventvalueset, b.eventcountset, 
(bicommon.get_value('browserfamily'::text, b.vcset)), 
(bicommon.get_value('trafficsource'::text, b.vcset)), 
(bicommon.get_value('e671_hl_p1'::text, a.componentset)), 
(bicommon.get_value('e671_img_p1'::text, a.componentset)), 
(bicommon.get_value('e671_formLabels'::text, a.componentset))"
"        Sort Key: a.date_id, a.page_group, a.page, a.int_alloc_type, 
(bicommon.get_value('browserfamily'::text, b.vcset)), 
(bicommon.get_value('trafficsource'::text, b.vcset)), 
(bicommon.get_value('e671_hl_p1'::text, a.componentset)), 
(bicommon.get_value('e671_img_p1'::text, a.componentset)), 
(bicommon.get_value('e671_formLabels'::text, a.componentset))"
"        Sort Method: external merge  Disk: 447168kB"
"        ->  Merge Join  (cost=380263.82..830740.00 rows=274096 width=3040) 
(actual time=81438.225..257963.708 rows=147480 loops=1)"
"              Output: a.date_id, a.page_group, a.page, a.int_alloc_type, 
b.vcset, a.componentset, a.adc_visit, b.eventvalueset, b.eventcountset, 
bicommon.get_value('browserfamily'::text, b.vcset), 
bicommon.get_value('trafficsource'::text, b.vcset), 
bicommon.get_value('e671_hl_p1'::text, a.componentset), 
bicommon.get_value('e671_img_p1'::text, a.componentset), 
bicommon.get_value('e671_formLabels'::text, a.componentset)"
"              Merge Cond: (((upper((a.adc_visit)::text)) = 
(upper(bicommon.get_value('adc_visit'::text, b.vcset)))) AND (a.date_id = 
b.date_id))"
"              ->  Sort  (cost=169041.20..169352.99 rows=124717 width=1350) 
(actual time=15181.446..25806.356 rows=147480 loops=1)"
"                    Output: a.date_id, a.page_group, a.page, a.int_alloc_type, 
a.componentset, a.adc_visit, (upper((a.adc_visit)::text))"
"                    Sort Key: (upper((a.adc_visit)::text)), a.date_id"
"                    Sort Method: external merge  Disk: 205824kB"
"                    ->  Index Scan using alps_agg_date_id on bi2003.alps_agg a 
 (cost=0.00..85163.47 rows=124717 width=1350) (actual time=28.843..11369.048 
rows=147480 loops=1)"
"                          Output: a.date_id, a.page_group, a.page, 
a.int_alloc_type, a.componentset, a.adc_visit, upper((a.adc_visit)::text)"
"                          Index Cond: ((a.date_id >= 20120228) AND (a.date_id 
<= 20120302))"
"                          Filter: ((a.page)::text = 'ddi_671'::text)"
"              ->  Materialize  (cost=211222.61..212076.23 rows=170723 
width=1694) (actual time=66254.779..80862.998 rows=187870 loops=1)"
"                    Output: b.vcset, b.eventvalueset, b.eventcountset, 
b.date_id, (upper(bicommon.get_value('adc_visit'::text, b.vcset)))"
"                    ->  Sort  (cost=211222.61..211649.42 rows=170723 
width=1694) (actual time=66254.773..80680.870 rows=187870 loops=1)"
"                          Output: b.vcset, b.eventvalueset, b.eventcountset, 
b.date_id, (upper(bicommon.get_value('adc_visit'::text, b.vcset)))"
"                          Sort Key: 
(upper(bicommon.get_value('adc_visit'::text, b.vcset))), b.date_id"
"                          Sort Method: external merge  Disk: 312864kB"
"                          ->  Index Scan using event_agg_date_id on 
bi2003.event_agg b  (cost=0.00..70924.68 rows=170723 width=1694) (actual 
time=54.028..61463.318 rows=187870 loops=1)"
"                                Output: b.vcset, b.eventvalueset, 
b.eventcountset, b.date_id, upper(bicommon.get_value('adc_visit'::text, 
b.vcset))"
"                                Index Cond: ((b.date_id >= 20120228) AND 
(b.date_id <= 20120302))"
"Total runtime: 312461.516 ms"

Second Run with EQUALS clause of only one day (Execution time is 37 secs):

  select a.date_id Date,
    a.page_group page_group,
    a.page page,
    ... --{snipped for brevity}
 from bi2003.alps_agg a left outer join bi2003.event_agg b
    on (a.date_id = b.date_id and upper(a.adc_visit) = 
upper(bicommon.get_value('adc_visit', b.vcset)))
 WHERE a.date_id = 20120228      ----------------- CHANGE TO STATEMENT IS HERE 
------------------
   and a.page = 'ddi_671'
 group by 1,2,3,4,5,6,7,8,9;

"GroupAggregate  (cost=18528595.55..29763160.89 rows=4341861 width=3040) 
(actual time=33762.882..37750.500 rows=208 loops=1)"
"  Output: a.date_id, a.page_group, a.page, a.int_alloc_type, 
(bicommon.get_value('browserfamily'::text, b.vcset)), 
(bicommon.get_value('trafficsource'::text, b.vcset)), 
(bicommon.get_value('e671_hl_p1'::text, a.componentset)), 
(bicommon.get_value('e671_img_p1'::text, a.componentset)), 
(bicommon.get_value('e671_formLabels'::text, a.componentset)), 
count(a.adc_visit), sum((bicommon.get_value('revenue'::text, 
b.eventvalueset))::numeric(9,0)), sum((bicommon.get_value('Impression'::text, 
b.eventcountset))::numeric(9,0)), sum((bicommon.get_value('page1submit'::text, 
b.eventcountset))::numeric(9,0)), sum((bicommon.get_value('conversion'::text, 
b.eventcountset))::numeric(9,0)), sum((bicommon.get_value('lead'::text, 
b.eventcountset))::numeric(9,0))"
"  ->  Sort  (cost=18528595.55..18539450.20 rows=4341861 width=3040) (actual 
time=33681.585..34796.148 rows=36132 loops=1)"
"        Output: a.date_id, a.page_group, a.page, a.int_alloc_type, b.vcset, 
a.componentset, a.adc_visit, b.eventvalueset, b.eventcountset, 
(bicommon.get_value('browserfamily'::text, b.vcset)), 
(bicommon.get_value('trafficsource'::text, b.vcset)), 
(bicommon.get_value('e671_hl_p1'::text, a.componentset)), 
(bicommon.get_value('e671_img_p1'::text, a.componentset)), 
(bicommon.get_value('e671_formLabels'::text, a.componentset))"
"        Sort Key: a.date_id, a.page_group, a.page, a.int_alloc_type, 
(bicommon.get_value('browserfamily'::text, b.vcset)), 
(bicommon.get_value('trafficsource'::text, b.vcset)), 
(bicommon.get_value('e671_hl_p1'::text, a.componentset)), 
(bicommon.get_value('e671_img_p1'::text, a.componentset)), 
(bicommon.get_value('e671_formLabels'::text, a.componentset))"
"        Sort Method: external merge  Disk: 108232kB"
"        ->  Merge Join  (cost=75654.72..6682201.93 rows=4341861 width=3040) 
(actual time=5486.181..30565.791 rows=36132 loops=1)"
"              Output: a.date_id, a.page_group, a.page, a.int_alloc_type, 
b.vcset, a.componentset, a.adc_visit, b.eventvalueset, b.eventcountset, 
bicommon.get_value('browserfamily'::text, b.vcset), 
bicommon.get_value('trafficsource'::text, b.vcset), 
bicommon.get_value('e671_hl_p1'::text, a.componentset), 
bicommon.get_value('e671_img_p1'::text, a.componentset), 
bicommon.get_value('e671_formLabels'::text, a.componentset)"
"              Merge Cond: ((upper((a.adc_visit)::text)) = 
(upper(bicommon.get_value('adc_visit'::text, b.vcset))))"
"              ->  Sort  (cost=35259.64..35325.37 rows=26292 width=1350) 
(actual time=878.497..985.331 rows=36132 loops=1)"
"                    Output: a.date_id, a.page_group, a.page, a.int_alloc_type, 
a.componentset, a.adc_visit, (upper((a.adc_visit)::text))"
"                    Sort Key: (upper((a.adc_visit)::text))"
"                    Sort Method: external merge  Disk: 49480kB"
"                    ->  Index Scan using alps_agg_date_id on bi2003.alps_agg a 
 (cost=0.00..17870.00 rows=26292 width=1350) (actual time=0.047..142.383 
rows=36132 loops=1)"
"                          Output: a.date_id, a.page_group, a.page, 
a.int_alloc_type, a.componentset, a.adc_visit, upper((a.adc_visit)::text)"
"                          Index Cond: (a.date_id = 20120228)"
"                          Filter: ((a.page)::text = 'ddi_671'::text)"
"              ->  Materialize  (cost=40395.08..40560.22 rows=33028 width=1694) 
(actual time=4606.461..4809.163 rows=45428 loops=1)"
"                    Output: b.vcset, b.eventvalueset, b.eventcountset, 
b.date_id, (upper(bicommon.get_value('adc_visit'::text, b.vcset)))"
"                    ->  Sort  (cost=40395.08..40477.65 rows=33028 width=1694) 
(actual time=4606.455..4778.491 rows=45428 loops=1)"
"                          Output: b.vcset, b.eventvalueset, b.eventcountset, 
b.date_id, (upper(bicommon.get_value('adc_visit'::text, b.vcset)))"
"                          Sort Key: 
(upper(bicommon.get_value('adc_visit'::text, b.vcset)))"
"                          Sort Method: external merge  Disk: 75440kB"
"                          ->  Index Scan using event_agg_date_id on 
bi2003.event_agg b  (cost=0.00..13643.60 rows=33028 width=1694) (actual 
time=0.201..3865.567 rows=45428 loops=1)"
"                                Output: b.vcset, b.eventvalueset, 
b.eventcountset, b.date_id, upper(bicommon.get_value('adc_visit'::text, 
b.vcset))"
"                                Index Cond: (b.date_id = 20120228)"
"Total runtime: 37797.536 ms"

Now, The problem run:
If I change the above query to use one day in a BETWEEN clause where start and 
end are the same, the query never completes even after 6+ hours.

  select a.date_id Date,
    a.page_group page_group,
    a.page page,
    a.int_alloc_type int_alloc_type,
    ... --{snipped for brevity}
 from bi2003.alps_agg a left outer join bi2003.event_agg b
    on (a.date_id = b.date_id and upper(a.adc_visit) = 
upper(bicommon.get_value('adc_visit', b.vcset)))
 WHERE a.date_id BETWEEN 20120228 AND 20120228     ----------------- CHANGE TO 
STATEMENT IS HERE ------------------
   and a.page = 'ddi_671'
 group by 1,2,3,4,5,6,7,8,9;

"HashAggregate  (cost=23.23..24.49 rows=1 width=3040)"
"  Output: a.date_id, a.page_group, a.page, a.int_alloc_type, 
(bicommon.get_value('browserfamily'::text, b.vcset)), 
(bicommon.get_value('trafficsource'::text, b.vcset)), 
(bicommon.get_value('e671_hl_p1'::text, a.componentset)), 
(bicommon.get_value('e671_img_p1'::text, a.componentset)), 
(bicommon.get_value('e671_formLabels'::text, a.componentset)), 
count(a.adc_visit), sum((bicommon.get_value('revenue'::text, 
b.eventvalueset))::numeric(9,0)), sum((bicommon.get_value('Impression'::text, 
b.eventcountset))::numeric(9,0)), sum((bicommon.get_value('page1submit'::text, 
b.eventcountset))::numeric(9,0)), sum((bicommon.get_value('conversion'::text, 
b.eventcountset))::numeric(9,0)), sum((bicommon.get_value('lead'::text, 
b.eventcountset))::numeric(9,0))"
"  ->  Nested Loop  (cost=0.00..21.91 rows=1 width=3040)"
"        Output: a.date_id, a.page_group, a.page, a.int_alloc_type, b.vcset, 
a.componentset, a.adc_visit, b.eventvalueset, b.eventcountset, 
bicommon.get_value('browserfamily'::text, b.vcset), 
bicommon.get_value('trafficsource'::text, b.vcset), 
bicommon.get_value('e671_hl_p1'::text, a.componentset), 
bicommon.get_value('e671_img_p1'::text, a.componentset), 
bicommon.get_value('e671_formLabels'::text, a.componentset)"
"        Join Filter: ((a.date_id = b.date_id) AND (upper((a.adc_visit)::text) 
= upper(bicommon.get_value('adc_visit'::text, b.vcset))))"
"        ->  Index Scan using alps_agg_date_id on bi2003.alps_agg a  
(cost=0.00..10.12 rows=1 width=1350)"
"              Output: a.date_id, a.adc_visit, a.page_group, a.page, 
a.int_alloc_type, a.componentset, a.variation_tagset, a.page_instance"
"              Index Cond: ((a.date_id >= 20120228) AND (a.date_id <= 
20120228))"
"              Filter: ((a.page)::text = 'ddi_671'::text)"
"        ->  Index Scan using event_agg_date_id on bi2003.event_agg b  
(cost=0.00..10.27 rows=1 width=1694)"
"              Output: b.date_id, b.vcset, b.eventcountset, b.eventvalueset"
"              Index Cond: ((b.date_id >= 20120228) AND (b.date_id <= 
20120228))"

Has anyone experienced a similar issue? We have checked the tables and indexes 
for consistency, VACUUM'ed, ANALYZEd and REINDEX'ed them many times. But the 
between clause with start and end the same runs for hours and never completes. 
But if we use a range where start and end is not the same, it comes back in a 
few minutes. If we use a single day in an '=' clause, it also comes back in 30+ 
secs.

What could be the problem here? Does anyone have any insights?


________________________________________

Venky Kandaswamy

Principal Engineer, Adchemy Inc.

925-200-7124

Reply via email to