Re: [BUGS] BUG #5885: Strange rows estimation for left join

Tom Lane Tue, 15 Feb 2011 09:18:28 -0800

Maxim Boguk <[email protected]> writes:
> Test case looks like:


> create table "references" ( attr_id integer, reference integer,
> object_id integer );
> insert into "references" select 100*(random()),
> 100000*(random()^10), 1000000*(random()) from
> generate_series(1,10000000);
> create index xif01references on "references" ( reference, attr_id );
> create index xif02references on "references" ( object_id, attr_id, reference );

> analyze "references";

> explain select * from "references" rs left join "references" vm on
> vm.reference = rs.reference and vm.attr_id = 10 where rs.object_id =
> 1000;

I don't believe there's actually anything very wrong here.  The
large-looking estimate for the join size is not out of line: if you try
different values for object_id you will find that some produce more rows
than that and some produce less.  If we had cross-column stats we could
maybe derive a better estimate, but as-is you're getting an estimate
that is probably about right on the average, depending on whether the
particular object_id matches to more common or less common reference
values.
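
One rough way to see this spread without loading ten million rows is a scaled-down simulation of the test case. The sketch below is only an illustration under stated assumptions: the table sizes (N_ROWS, N_REF, N_OBJ) are shrunk from the original, round() stands in for the cast to integer, and the join is counted by hand rather than by the planner or executor.

```python
import random
from collections import Counter, defaultdict

random.seed(0)

# Scaled-down stand-ins for the test case (assumed sizes, not the original ones).
N_ROWS = 200_000   # original: 10,000,000 rows
N_REF = 2_000      # original multiplier for reference: 100000
N_OBJ = 20_000     # original multiplier for object_id: 1000000

# Each row is (attr_id, reference, object_id), generated like the INSERT above.
rows = [
    (
        round(100 * random.random()),
        round(N_REF * random.random() ** 10),
        round(N_OBJ * random.random()),
    )
    for _ in range(N_ROWS)
]

# vm side of the join: how many rows with attr_id = 10 exist per reference value.
vm_matches = Counter(ref for attr, ref, obj in rows if attr == 10)

# Join output per object_id. A left join emits at least one row per rs row.
join_rows = defaultdict(int)
for attr, ref, obj in rows:
    join_rows[obj] += max(1, vm_matches[ref])

sizes = sorted(join_rows.values())
mean_size = sum(sizes) / len(sizes)

print("distinct object_ids:", len(sizes))
print("min / median / max join rows:", sizes[0], sizes[len(sizes) // 2], sizes[-1])
print("mean join rows per object_id:", round(mean_size, 1))
```

The per-object_id join sizes land on both sides of the mean, which is the sense in which a single estimate can only be about right on the average.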

The thing that looks funny is that the inner indexscan rowcount estimate
is so small, which is because that's being done on the assumption that
the passed-in rs.reference value is random.  It's not really --- it's
more likely to be one of the more common reference values --- which is
something that's correctly accounted for in the join size estimate but
not in the inner indexscan estimate.
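
The gap between the two estimates can be sketched numerically. The example below is a simplified model, not the planner's actual code: it assumes a scaled-down copy of the test data (N_ROWS and N_REF are made-up sizes), treats "the reference value is random" as "every distinct reference value is equally likely", and compares that with picking the reference value from a random row of the outer table.

```python
import random
from collections import Counter

random.seed(0)

# Scaled-down stand-ins for the test case (assumed sizes, not the original ones).
N_ROWS = 200_000
N_REF = 2_000

# Each row is (attr_id, reference), skewed the same way as the INSERT above.
rows = [
    (round(100 * random.random()), round(N_REF * random.random() ** 10))
    for _ in range(N_ROWS)
]

all_refs = Counter(ref for _, ref in rows)                       # rows per reference
attr10_refs = Counter(ref for attr, ref in rows if attr == 10)   # attr_id = 10 only

# Inner-scan view: expected matches if every distinct reference value
# were equally likely to be passed in.
per_distinct_value = sum(attr10_refs.values()) / len(all_refs)

# Join view: the reference value comes from an actual outer row, so common
# values are passed in more often (weighted by how many rows carry them).
per_outer_row = sum(n * attr10_refs[ref] for ref, n in all_refs.items()) / N_ROWS

print("matches per probe, uniform over distinct values:", round(per_distinct_value, 2))
print("matches per probe, weighted by outer rows:      ", round(per_outer_row, 2))
```

With a distribution this skewed, the row-weighted figure comes out far larger than the uniform one, which matches the small inner rowcount sitting next to a large join size.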

                        regards, tom lane

-- 
Sent via pgsql-bugs mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
