Hi
On Wed, Sep 25, 2013 at 6:42 PM, Sam Wong <s...@hellosam.net> wrote:
>
> ---
> Excuse me, this is the actual query.
>
> with Q AS (
>   select event
>   from event_log
>   WHERE log_id>1010000 and log_id<1050000
>   order by event
> )
> SELECT * FROM Q LIMIT 1
>
> Limit  (cost=2521.82..2521.83 rows=1 width=32) (actual time=88.342..88.342 rows=1 loops=1)
>   Output: q.event
>   Buffers: shared hit=93 read=622
>   CTE q
>     ->  Sort  (cost=2502.07..2521.82 rows=39502 width=6) (actual time=88.335..88.335 rows=1 loops=1)
>           Output: event_log.event
>           Sort Key: event_log.event
>           Sort Method: quicksort  Memory: 3486kB
>           Buffers: shared hit=93 read=622
>           ->  Index Scan using event_log_pkey on uco.event_log  (cost=0.00..1898.89 rows=39502 width=6) (actual time=13.918..65.573 rows=39999 loops=1)
>                 Output: event_log.event
>                 Index Cond: ((event_log.log_id > 1010000) AND (event_log.log_id < 1050000))
>                 Buffers: shared hit=93 read=622
>   ->  CTE Scan on q  (cost=0.00..237.01 rows=39502 width=32) (actual time=88.340..88.340 rows=1 loops=1)
>         Output: q.event
>         Buffers: shared hit=93 read=622
> Total runtime: 89.039 ms
>
> ---
> Slow Query
>
> select min(event)
> from event_log
> WHERE log_id>1010000 and log_id<1050000
>
> Result  (cost=1241.05..1241.05 rows=1 width=0) (actual time=1099.532..1099.533 rows=1 loops=1)
>   Output: $0
>   Buffers: shared hit=49029 read=57866
>   InitPlan 1 (returns $0)
>     ->  Limit  (cost=0.00..1241.05 rows=1 width=6) (actual time=1099.527..1099.527 rows=1 loops=1)
>           Output: uco.event_log.event
>           Buffers: shared hit=49029 read=57866
>           ->  Index Scan using event_data_search on uco.event_log  (cost=0.00..49024009.79 rows=39502 width=6) (actual time=1099.523..1099.523 rows=1 loops=1)
>                 Output: uco.event_log.event
>                 Index Cond: (uco.event_log.event IS NOT NULL)
>                 Filter: ((uco.event_log.log_id > 1010000) AND (uco.event_log.log_id < 1050000))
>                 Rows Removed by Filter: 303884
>                 Buffers: shared hit=49029 read=57866
> Total runtime: 1099.568 ms
>
> (Note: things got buffered, so it goes down to about 1 second, but comparing the buffer count with the query above, this is a few orders of magnitude slower.)
>
> ---
> The CTE "fast query" works: it completes with an index scan over "event_log_pkey", which is what I am expecting, and that is good.
> But it is much more straightforward to write it as the "slow query", and I would expect the planner to give the same optimal plan for both.
>
> The plan for the "slow query" has an estimated total cost of 1241.05, and the "fast query" has 2521.83, hence the planner prefers the former - the scan over the "event_data_search" index - but this choice doesn't make sense to me.
>
> This is the point I want to bring up: why would the planner do that, instead of scanning over "event_log_pkey"?

Maybe there's a bug, but I don't think so. The PostgreSQL optimizer is strongly biased toward uncorrelated data, but in your case log_id and event are highly correlated, right?

With your example it has to choose between:
1- Play safe and use event_log_pkey: scan 39502 rows and select the smallest event.
2- Use event_data_search: 4,000,000 rows in total, 40,000 of them with a log_id in the interval, so either win big and find a match within the first ~100 scanned rows, or lose big and scan all 4,000,000 rows.

With uncorrelated data, 2- is 400 times faster than 1-: 100 rows vs 40,000.

PostgreSQL is a high-stakes gambler :)

Didier
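For reference, the InitPlan in the "slow query" plan comes from PostgreSQL's min()/max() optimization, which rewrites the aggregate into an ordered LIMIT 1 scan. A sketch of the equivalent query the planner actually costs (table and column names taken from the thread):

```sql
-- Equivalent form of "select min(event) ..." after the min() rewrite:
-- walk event_data_search in event order and stop at the first row
-- whose log_id passes the filter. With event uncorrelated to log_id,
-- ~1% of rows qualify and the first hit comes quickly; with the
-- correlated data here, the matching rows sit far into the index.
SELECT event
FROM event_log
WHERE event IS NOT NULL
  AND log_id > 1010000
  AND log_id < 1050000
ORDER BY event
LIMIT 1;
```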
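To see the gamble Didier describes in the statistics themselves, one can inspect the planner's column correlation estimates. A sketch against the pg_stats view, assuming the schema/table names from the thread:

```sql
-- correlation is the planner's estimate of how well each column's
-- values track the physical row order: near +1/-1 means ordered,
-- near 0 means scattered. The cost model has no notion of the
-- cross-column correlation between log_id and event, which is why
-- it underestimates the cost of the event_data_search scan.
SELECT tablename, attname, correlation
FROM pg_stats
WHERE schemaname = 'uco'
  AND tablename = 'event_log'
  AND attname IN ('log_id', 'event');
```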