On Fri, Sep 2, 2011 at 12:45 PM, Tom Lane wrote:
> column values). But GROUP BY or DISTINCT would entirely invalidate the
> column frequency statistics, which makes me think that ignoring the
> pg_statistic entry might be the thing to do. Comments?
There's a possible problem there in that you m
I wrote:
> On a longer-term basis, I'm looking into what we could do with
> extracting stats from subqueries, but that doesn't seem like material
> for a backpatch. I have a draft patch that I've been playing with
> (attached).
I've committed a heavily rewritten version of that patch. Git HEAD
s
Peter Eisentraut writes:
> On lör, 2011-08-27 at 13:32 -0400, Tom Lane wrote:
>> The larger problem is that if a subquery didn't get flattened, it's
>> often because it's got LIMIT, or GROUP BY, or some similar clause that
>> makes it highly suspect whether the statistics available for the table
Robert Haas writes:
> On Wed, Aug 31, 2011 at 6:22 AM, Peter Eisentraut wrote:
>> I liked the old one better. ;-)
> AFAICS, those plans are identical, except for a minor difference in
> the cost of scanning test2.
The point is that the estimate of the result size is worse in 8.4.8.
I am not, h
On Wed, Aug 31, 2011 at 6:22 AM, Peter Eisentraut wrote:
> On lör, 2011-08-27 at 13:32 -0400, Tom Lane wrote:
>> > EXPLAIN SELECT * FROM test1 WHERE sha1 in (SELECT sha1 FROM test2
>> LIMIT 200);
>>
>> > Here, however, it has apparently not passed this knowledge through
>> the
>> > LIMIT.
>>
>> T
On lör, 2011-08-27 at 13:32 -0400, Tom Lane wrote:
> > EXPLAIN SELECT * FROM test1 WHERE sha1 in (SELECT sha1 FROM test2
> LIMIT 200);
>
> > Here, however, it has apparently not passed this knowledge through
> the
> > LIMIT.
>
> The LIMIT prevents the subquery from being flattened entirely, ie w
On Sat, Aug 27, 2011 at 1:32 PM, Tom Lane wrote:
> Peter Eisentraut writes:
>> EXPLAIN SELECT * FROM test1 WHERE sha1 in (SELECT sha1 FROM test2);
>> QUERY PLAN
>> --
>> Hash Semi Join (cost=30.52
Peter Eisentraut writes:
> EXPLAIN SELECT * FROM test1 WHERE sha1 in (SELECT sha1 FROM test2);
> QUERY PLAN
> --
> Hash Semi Join (cost=30.52..61.27 rows=1000 width=27)
>Hash Cond: (test1.sha1
This is an artificial test case shrunk down from a much larger real
query.
CREATE TABLE test1 (
sha1 bytea PRIMARY KEY,
something text
);
CREATE TABLE test2 (
sha1 bytea PRIMARY KEY,
blah text
);
Fill those with 1000 random rows each, e.g.,
for i in $(seq 1 1000); do sha1=$(echo