Re: [PERFORM] Simple join doesn't use index

Alex Vinnik Tue, 29 Jan 2013 12:49:34 -0800

index definition
CREATE INDEX views_visit_id_visit_buoy_index ON views USING btree
(visit_id, visit_buoy)




On Tue, Jan 29, 2013 at 1:35 PM, Merlin Moncure <[email protected]> wrote:

> On Tue, Jan 29, 2013 at 12:59 PM, Alex Vinnik <[email protected]>
> wrote:
> >
> >
> >
> > On Tue, Jan 29, 2013 at 11:39 AM, Ben Chobot <[email protected]>
> wrote:
> >>
> >> On Jan 29, 2013, at 6:24 AM, Alex Vinnik wrote:
> >>
> >>> random_page_cost=1 might be not what you really want.
> >>> it would mean that random reads are as fast as as sequential reads,
> which
> >>> probably is true only for SSD
> >>
> >> What randon_page_cost would be more appropriate for EC2 EBS Provisioned
> >> volume that can handle 2,000 IOPS?
> >>
> >>
> >> For EC2 Provisioned IOPS volumes - not standard EBS - random_page_cost=1
> >> is exactly what you want.
> >>
> > Well... after some experimentation it turned out that
> random_page_cost=0.6
> > gives me fast query
> >
> > QUERY PLAN
> > Sort  (cost=754114.96..754510.46 rows=158199 width=8) (actual
> > time=1839.324..2035.405 rows=209401 loops=1)
> >   Sort Key: visits.id, views.id
> >   Sort Method: quicksort  Memory: 15960kB
> >   ->  Nested Loop  (cost=0.00..740453.38 rows=158199 width=8) (actual
> > time=0.048..1531.592 rows=209401 loops=1)
> >         ->  Index Scan using visits_created_at_index on visits
> > (cost=0.00..5929.82 rows=115492 width=4) (actual time=0.032..161.488
> > rows=131311 loops=1)
> >               Index Cond: ((created_at >= '2013-01-15
> 00:00:00'::timestamp
> > without time zone) AND (created_at < '2013-01-16 00:00:00'::timestamp
> > without time zone))
> >         ->  Index Scan using views_visit_id_index on views
>  (cost=0.00..6.26
> > rows=10 width=8) (actual time=0.003..0.005 rows=2 loops=131311)
> >               Index Cond: (visit_id = visits.id)
> > Total runtime: 2234.142 ms
> >
> > random_page_cost=0.7 slows it down 16 times
> >
> > Sort  (cost=804548.42..804943.92 rows=158199 width=8) (actual
> > time=37011.337..37205.449 rows=209401 loops=1)
> >   Sort Key: visits.id, views.id
> >   Sort Method: quicksort  Memory: 15960kB
> >   ->  Merge Join  (cost=15871.37..790886.85 rows=158199 width=8) (actual
> > time=35673.602..36714.056 rows=209401 loops=1)
> >         Merge Cond: (visits.id = views.visit_id)
> >         ->  Sort  (cost=15824.44..16113.17 rows=115492 width=4) (actual
> > time=335.486..463.085 rows=131311 loops=1)
> >               Sort Key: visits.id
> >               Sort Method: quicksort  Memory: 12300kB
> >               ->  Index Scan using visits_created_at_index on visits
> > (cost=0.00..6113.04 rows=115492 width=4) (actual time=0.034..159.326
> > rows=131311 loops=1)
> >                     Index Cond: ((created_at >= '2013-01-15
> > 00:00:00'::timestamp without time zone) AND (created_at < '2013-01-16
> > 00:00:00'::timestamp without time zone))
>
> >         ->  Index Scan using views_visit_id_visit_buoy_index on views
> > (cost=0.00..757596.22 rows=6122770 width=8) (actual time=0.017..30765.316
> > rows=5145902 loops=1)
>
> Something is awry here. pg is doing an index scan via
> views_visit_id_visit_buoy_index with no matching condition.  What's
> the definition of that index? The reason why the random_page_cost
> adjustment is working is that you are highly penalizing sequential
> type scans so that the database is avoiding the merge (sort A, sort B,
> stepwise compare).
>
> SQL server is doing a nestloop/index scan, just like the faster pg
> plan, but is a bit faster because it's parallelizing.
>
>  merlin
>

Re: [PERFORM] Simple join doesn't use index

Reply via email to