Re: Fwd: [PERFORM] 8.2 Query 10 times slower than 8.1 (view-heavy)

2007-08-31 Thread Tom Lane
I wrote:
 Mark Lewis [EMAIL PROTECTED] writes:
 We've been holding back from upgrading to 8.2 because this one is a
 show-stopper for us.

 Well, you could always make your own version with this patch reverted:
 http://archives.postgresql.org/pgsql-committers/2006-11/msg00066.php
 I might end up doing that in the 8.2 branch if a better solution
 seems too large to back-patch.

I thought of a suitably small hack that should cover at least the main
problem without going so far as to revert that patch entirely.  What we
can do is have the IS NULL estimator recognize when the clause is being
applied at an outer join, and not believe the table statistics in that
case.  I've applied the attached patch for this --- are you interested
in trying it out on your queries before 8.2.5 comes out?

regards, tom lane

Index: src/backend/optimizer/path/clausesel.c
===
RCS file: /cvsroot/pgsql/src/backend/optimizer/path/clausesel.c,v
retrieving revision 1.82
diff -c -r1.82 clausesel.c
*** src/backend/optimizer/path/clausesel.c  4 Oct 2006 00:29:53 -   
1.82
--- src/backend/optimizer/path/clausesel.c  31 Aug 2007 23:29:01 -
***
*** 218,224 
s2 = rqlist-hibound + rqlist-lobound - 1.0;
  
/* Adjust for double-exclusion of NULLs */
!   s2 += nulltestsel(root, IS_NULL, rqlist-var, 
varRelid);
  
/*
 * A zero or slightly negative s2 should be 
converted into a
--- 218,226 
s2 = rqlist-hibound + rqlist-lobound - 1.0;
  
/* Adjust for double-exclusion of NULLs */
!   /* HACK: disable nulltestsel's special 
outer-join logic */
!   s2 += nulltestsel(root, IS_NULL, rqlist-var,
! varRelid, 
JOIN_INNER);
  
/*
 * A zero or slightly negative s2 should be 
converted into a
***
*** 701,707 
s1 = nulltestsel(root,
 ((NullTest *) 
clause)-nulltesttype,
 (Node *) ((NullTest *) 
clause)-arg,
!varRelid);
}
else if (IsA(clause, BooleanTest))
{
--- 703,710 
s1 = nulltestsel(root,
 ((NullTest *) 
clause)-nulltesttype,
 (Node *) ((NullTest *) 
clause)-arg,
!varRelid,
!jointype);
}
else if (IsA(clause, BooleanTest))
{
Index: src/backend/utils/adt/selfuncs.c
===
RCS file: /cvsroot/pgsql/src/backend/utils/adt/selfuncs.c,v
retrieving revision 1.214.2.5
diff -c -r1.214.2.5 selfuncs.c
*** src/backend/utils/adt/selfuncs.c5 May 2007 17:05:55 -   
1.214.2.5
--- src/backend/utils/adt/selfuncs.c31 Aug 2007 23:29:02 -
***
*** 1386,1396 
   */
  Selectivity
  nulltestsel(PlannerInfo *root, NullTestType nulltesttype,
!   Node *arg, int varRelid)
  {
VariableStatData vardata;
double  selec;
  
examine_variable(root, arg, varRelid, vardata);
  
if (HeapTupleIsValid(vardata.statsTuple))
--- 1386,1409 
   */
  Selectivity
  nulltestsel(PlannerInfo *root, NullTestType nulltesttype,
!   Node *arg, int varRelid, JoinType jointype)
  {
VariableStatData vardata;
double  selec;
  
+   /*
+* Special hack: an IS NULL test being applied at an outer join should 
not
+* be taken at face value, since it's very likely being used to select 
the
+* outer-side rows that don't have a match, and thus its selectivity has
+* nothing whatever to do with the statistics of the original table
+* column.  We do not have nearly enough context here to determine its
+* true selectivity, so for the moment punt and guess at 0.5.  
Eventually
+* the planner should be made to provide enough info about the clause's
+* context to let us do better.
+*/
+   if (IS_OUTER_JOIN(jointype)  nulltesttype == IS_NULL)
+   return (Selectivity) 0.5;
+ 
examine_variable(root, arg, varRelid, vardata);
  
if (HeapTupleIsValid(vardata.statsTuple))
Index: src/include/utils/selfuncs.h
===
RCS file: /cvsroot/pgsql/src/include/utils/selfuncs.h,v
retrieving revision 1.36
diff -c -r1.36 selfuncs.h

Re: Fwd: [PERFORM] 8.2 Query 10 times slower than 8.1 (view-heavy)

2007-08-31 Thread Mark Lewis
On Fri, 2007-08-31 at 19:39 -0400, Tom Lane wrote:
 I wrote:
  Mark Lewis [EMAIL PROTECTED] writes:
  We've been holding back from upgrading to 8.2 because this one is a
  show-stopper for us.
 
  Well, you could always make your own version with this patch reverted:
  http://archives.postgresql.org/pgsql-committers/2006-11/msg00066.php
  I might end up doing that in the 8.2 branch if a better solution
  seems too large to back-patch.
 
 I thought of a suitably small hack that should cover at least the main
 problem without going so far as to revert that patch entirely.  What we
 can do is have the IS NULL estimator recognize when the clause is being
 applied at an outer join, and not believe the table statistics in that
 case.  I've applied the attached patch for this --- are you interested
 in trying it out on your queries before 8.2.5 comes out?

Wish I could, but I'm afraid that I'm not going to be in a position to
try out the patch on the application that exhibits the problem for at
least the next few weeks.

-- Mark

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: Fwd: [PERFORM] 8.2 Query 10 times slower than 8.1 (view-heavy)

2007-08-28 Thread Mark Lewis
It looks like your view is using a left join to look for rows in one
table without matching rows in the other, i.e. a SQL construct similar
in form to the query below:

SELECT ...
FROM A LEFT JOIN B ON (...)
WHERE B.primary_key IS NULL

Unfortunately there has been a planner regression in 8.2 in some cases
with these forms of queries.  This was discussed a few weeks (months?)
ago on this forum.  I haven't looked closely enough to confirm that this
is the problem in your case, but it seems likely.  Is it possible to
refactor the query to avoid using this construct to see if that helps?

We've been holding back from upgrading to 8.2 because this one is a
show-stopper for us.

-- Mark Lewis

On Tue, 2007-08-28 at 11:24 -0500, Evan Carroll wrote:
 -- Forwarded message --
 From: Evan Carroll [EMAIL PROTECTED]
 Date: Aug 28, 2007 11:23 AM
 Subject: Re: [PERFORM] 8.2 Query 10 times slower than 8.1 (view-heavy)
 To: Scott Marlowe [EMAIL PROTECTED]
 
 
 On 8/28/07, Scott Marlowe [EMAIL PROTECTED] wrote:
  I looked through your query plan, and this is what stood out in the 8.2 
  plan:
 
   -  Nested Loop Left Join  (cost=8830.30..10871.27 rows=1
  width=102) (actual time=2148.444..236018.971 rows=62 loops=1)
 Join Filter: ((public.contact.pkid =
  public.contact.pkid) AND (public.event.ts_in  public.event.ts_in))
 Filter: (public.event.pkid IS NULL)
 
  Notice the misestimation is by a factor of 62, and the actual time
  goes from 2149 to 236018 ms.
 
  Again, have you analyzed your tables  / databases?
 
 contacts=# \o scott_marlowe_test
 contacts=# VACUUM FULL ANALYZE;
 contacts=# SELECT * FROM test_view WHERE U_ID = 8;
 Cancel request sent
 ERROR:  canceling statement due to user request
 contacts=# EXPLAIN ANALYZE SELECT * FROM test_view WHERE U_ID = 8;
 
 output found at http://rafb.net/p/EQouMI82.html
 
 --
 Evan Carroll
 System Lord of the Internets
 [EMAIL PROTECTED]
 832-445-8877
 
 

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: Fwd: [PERFORM] 8.2 Query 10 times slower than 8.1 (view-heavy)

2007-08-28 Thread Tom Lane
Mark Lewis [EMAIL PROTECTED] writes:
 Unfortunately there has been a planner regression in 8.2 in some cases
 with these forms of queries.  This was discussed a few weeks (months?)
 ago on this forum.  I haven't looked closely enough to confirm that this
 is the problem in your case, but it seems likely.

Yeah, the EXPLAIN ANALYZE output clearly shows a drastic underestimate
of the number of rows out of a join like this, and a consequent choice
of a nestloop above it that performs terribly.

 We've been holding back from upgrading to 8.2 because this one is a
 show-stopper for us.

Well, you could always make your own version with this patch reverted:
http://archives.postgresql.org/pgsql-committers/2006-11/msg00066.php

I might end up doing that in the 8.2 branch if a better solution
seems too large to back-patch.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org