Tom,

Just to clarify, is the lefthand input customers or balances?

And turning off merge joins "fixes" everything, including the runtime:

production=> set enable_mergejoin = off;
SET
production=> explain analyze SELECT ac.* FROM balances ac JOIN customers o
ON (o.id = ac.customer_id AND o.group_id = 45);
                                                           QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=20288.24..20722.35 rows=7 width=80) (actual
time=133.202..140.876 rows=7318 loops=1)
   Hash Cond: (ac.customer_id = o.id)
   ->  Seq Scan on balances ac  (cost=0.00..370.76 rows=16876 width=80)
(actual time=0.015..5.853 rows=16876 loops=1)
   ->  Hash  (cost=20285.94..20285.94 rows=184 width=8) (actual
time=126.768..126.768 rows=430 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 12kB
         ->  Seq Scan on customers o  (cost=0.00..20285.94 rows=184
width=8) (actual time=16.901..126.606 rows=430 loops=1)
               Filter: (group_id = 45)
               Rows Removed by Filter: 476221
 Total runtime: 142.089 ms
(9 rows)

Sean




On Tue, Oct 13, 2015 at 11:20 AM, Tom Lane <t...@sss.pgh.pa.us> wrote:

> Sean Rhea <sean.c.r...@gmail.com> writes:
> > No, the customers table is not 100% the same. This is a live production
> > system, so the data is (unfortunately) changing under us a bit here. That
> > said, there are still some strange things going on. I just reran
> > everything. The query plan time hasn't changed, but as Jeremy, Igor, and
> > David all pointed out, there's something funky going on with the apparent
> > size of the customers table. These queries were all run within 5 minutes
> of
> > each other:
>
> > production=> explain analyze SELECT ac.* FROM balances ac JOIN customers
> o
> > ON (o.id= ac.customer_id AND o.group_id = 45);
> >                                                                     QUERY
> > PLAN
> >
> -----------------------------------------------------------------------------------------------------------------------------------------------------------
> >  Merge Join  (cost=2475.89..20223.08 rows=7 width=80) (actual
> > time=157.437..243670.853 rows=7318 loops=1)
> >    Merge Cond: (ac.customer_id = o.id)   ->  Index Scan using
> > balances_customer_id_index on balances ac  (cost=0.00..727.42 rows=16876
> > width=80) (actual time=0.489..30.573 rows=16876 loops=1)
> >    ->  Index Scan using customers_pkey on customers o
> (cost=0.00..65080.01
> > rows=184 width=8) (actual time=127.266..243582.767 rows=*7672* loops=1)
> >          Filter: (group_id = 45)
> >          Rows Removed by Filter: *212699113*
> >  Total runtime: 243674.288 ms
> > (7 rows)
>
> > production=> select count(*) from customers where group_id = 45;
> >  count
> > -------
> >    430
> > (1 row)
>
> What you're looking at there is rows being read repeatedly as a
> consequence of the mergejoin applying mark/restore operations to rescan
> portions of its righthand input.  This will happen whenever there are
> duplicate keys in the lefthand input.
>
> I think the planner does take the possibility of rescans into account
> in its cost estimates, but perhaps it's not weighing it heavily
> enough.  It would be interesting to see what you get as a second-choice
> plan if you set enable_mergejoin = off.
>
>                         regards, tom lane
>

Reply via email to