Re: [PERFORM] not in(subselect) in 8.4

2009-02-22 Thread Grzegorz Jaśkiewicz
but then you have 10 questions a week from windows people about
password, and yet you haven't remove that :P

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] not in(subselect) in 8.4

2009-02-21 Thread Grzegorz Jaśkiewicz
after your recent commit Tom, the cost is sky-high, and also it takes
ages again with subselect version. In case of two table join. I have
to try the three way one.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] not in(subselect) in 8.4

2009-02-21 Thread Tom Lane
=?UTF-8?Q?Grzegorz_Ja=C5=9Bkiewicz?= gryz...@gmail.com writes:
 after your recent commit Tom, the cost is sky-high, and also it takes
 ages again with subselect version. In case of two table join. I have
 to try the three way one.

Which commit, and what example are you talking about?

regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] not in(subselect) in 8.4

2009-02-21 Thread Grzegorz Jaśkiewicz
the foo bar example above, with notion that all columns are NOT NULL
behaves much different now. I noticed, that some of the 'anti join'
stuff has changed in cvs recently, but I don't know if that's to
blame.
Basically, what I can see, is that the subselect case is no longer of
lower cost, to the left join - but is quite substantially more
expensive.

Just an observation, I don't intend to use subselects anyhow, because
these are very much slower on 8.3, which we use in production here.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] not in(subselect) in 8.4

2009-02-21 Thread Tom Lane
=?UTF-8?Q?Grzegorz_Ja=C5=9Bkiewicz?= gryz...@gmail.com writes:
 the foo bar example above, with notion that all columns are NOT NULL
 behaves much different now.

AFAIK the treatment of NOT IN subselects hasn't changed a bit since 8.3.
So I still find your complaint uninformative.

regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] not in(subselect) in 8.4

2009-02-21 Thread Scott Carey
Are there any optimizations planned for the case where columns are defined as 
NOT NULL?  Or other special path filtering for cases where the planner can know 
that the set of values in the subselect won't contain NULLs  (such as in 
(select a from b where (a  0 and a  1).

It turns out to be a rare use case for someone to write a subselect for a NOT 
IN  or IN clause that will have NULL values.  In the common case, the subselect 
does not contain nulls.  I would like to see Postgres optimize for the common 
case.


From: pgsql-performance-ow...@postgresql.org 
[pgsql-performance-ow...@postgresql.org] On Behalf Of Tom Lane 
[...@sss.pgh.pa.us]
Sent: Friday, February 20, 2009 7:33 AM
To: Grzegorz Jaśkiewicz
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] not in(subselect) in 8.4

=?UTF-8?Q?Grzegorz_Ja=C5=9Bkiewicz?= gryz...@gmail.com writes:
 I mean query like:
 select id from foo where id not in ( select id from bar);
 into:
 select f.id from foo f left join bar b on f.id=b.id where b.id is null;

Postgres does not do that, because they don't mean the same thing ---
the behavior for NULLs in bar.id is different.

8.4 does understand that NOT EXISTS is an antijoin, though.

regards, tom lane

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] not in(subselect) in 8.4

2009-02-21 Thread Tom Lane
Scott Carey sc...@richrelevance.com writes:
 Are there any optimizations planned for the case where columns are
 defined as NOT NULL?

We might get around to recognizing that case as an antijoin sometime.
It's nontrivial though, because you have to check for an intermediate
outer join causing the column to be possibly nullable after all.

 It turns out to be a rare use case for someone to write a subselect
 for a NOT IN  or IN clause that will have NULL values.

Judging from the steady flow of why doesn't my NOT IN query work
newbie questions, I don't think it's so rare as all that.

There's surely some population of people who know enough or could be
trained to be careful about using NOT NULL columns, but they could also
be trained to use NOT EXISTS, and dodge the whole bullet from the start.

regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] not in(subselect) in 8.4

2009-02-21 Thread Robert Haas
On Sat, Feb 21, 2009 at 10:41 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Scott Carey sc...@richrelevance.com writes:
 Are there any optimizations planned for the case where columns are
 defined as NOT NULL?

 We might get around to recognizing that case as an antijoin sometime.
 It's nontrivial though, because you have to check for an intermediate
 outer join causing the column to be possibly nullable after all.

 It turns out to be a rare use case for someone to write a subselect
 for a NOT IN  or IN clause that will have NULL values.

 Judging from the steady flow of why doesn't my NOT IN query work
 newbie questions, I don't think it's so rare as all that.

I think it's rare to do it on purpose, precisely because of the weird
semantics we all hate.  I have done it by accident, more than once,
and then fixed it by adding WHERE blah IS NOT NULL to the subquery.
So I think Scott is basically right.

 There's surely some population of people who know enough or could be
 trained to be careful about using NOT NULL columns, but they could also
 be trained to use NOT EXISTS, and dodge the whole bullet from the start.

There are far more important reasons to make columns NOT NULL than
avoiding strange results from NOT IN.  Personally, I have gotten used
to the fact that the planner sucks at handling NOT IN and so always
write LEFT JOIN ... WHERE pk IS NULL, so it's not important to me that
we fix it.  But it's certainly a foot-gun for the inexperienced, as it
is both the most compact and (at least IMO) the most intuitive
formulation of an anti-join.

...Robert

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] not in(subselect) in 8.4

2009-02-20 Thread Grzegorz Jaśkiewicz
Just as a question to Tom and team,
I saw a post a bit ago, about plans for 8.4, and Tom said it is very
likely that 8.4 will rewrite subselects into left joins, is it still
in plans?

I mean query like:
select id from foo where id not in ( select id from bar);
into:

select f.id from foo f left join bar b on f.id=b.id where b.id is null;

the latter is most often much much faster on 8.1-8.3;

thanks.

-- 
GJ

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] not in(subselect) in 8.4

2009-02-20 Thread Grzegorz Jaśkiewicz
On Fri, Feb 20, 2009 at 11:14 AM, marcin mank marcin.m...@gmail.com wrote:
 Just as a question to Tom and team,

 maybe it`s time for asktom.postgresql.org?  Oracle has it :)

hehe,
on the other hand - that would make my ppl here very skilfull, the
only reason I started to praise them about joins, and stuff - is
because subselects were slow. (no wonder, when you check two tables
against each other, and each holds few M of rows).




-- 
GJ

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] not in(subselect) in 8.4

2009-02-20 Thread marcin mank
 Just as a question to Tom and team,

maybe it`s time for asktom.postgresql.org?  Oracle has it :)

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] not in(subselect) in 8.4

2009-02-20 Thread Tom Lane
=?UTF-8?Q?Grzegorz_Ja=C5=9Bkiewicz?= gryz...@gmail.com writes:
 I mean query like:
 select id from foo where id not in ( select id from bar);
 into:
 select f.id from foo f left join bar b on f.id=b.id where b.id is null;

Postgres does not do that, because they don't mean the same thing ---
the behavior for NULLs in bar.id is different.

8.4 does understand that NOT EXISTS is an antijoin, though.

regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] not in(subselect) in 8.4

2009-02-20 Thread Grzegorz Jaśkiewicz
On Fri, Feb 20, 2009 at 3:33 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 =?UTF-8?Q?Grzegorz_Ja=C5=9Bkiewicz?= gryz...@gmail.com writes:
 I mean query like:
 select id from foo where id not in ( select id from bar);
 into:
 select f.id from foo f left join bar b on f.id=b.id where b.id is null;

 Postgres does not do that, because they don't mean the same thing ---
 the behavior for NULLs in bar.id is different.
yes, the obvious assumption here is that all columns are 'not null';


 8.4 does understand that NOT EXISTS is an antijoin, though.

Yes, I noticed that it actually assumes lesser cost.



-- 
GJ

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] not in(subselect) in 8.4

2009-02-20 Thread Rodrigo E . De León Plicet
On Fri, Feb 20, 2009 at 6:14 AM, marcin mank marcin.m...@gmail.com wrote:
 On Fri, Feb 20, 2009 at 4:56 AM, Grzegorz Jaśkiewicz gryz...@gmail.com 
 wrote:
 Just as a question to Tom and team,

 maybe it`s time for asktom.postgresql.org?  Oracle has it :)

+1

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance