Re: [GENERAL] Lifting WHERE conditions out of inner select

2007-10-09 Thread John D. Burger

Tom Lane wrote:


The planner does not look for this type of situation though, and after
some study I think it'd be more trouble than it was worth.  It'd be
less than trivial to determine whether the upper references occurred
only in places where it was safe to pull them up, and the actual  
pulling

would take some code that doesn't exist now, too,


Okay, good to know.  The situation is obviously easy to avoid, I just  
found the contrast surprising.


Thanks.

- John D. Burger
  MITRE



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


[GENERAL] Lifting WHERE conditions out of inner select

2007-10-08 Thread John D. Burger

Hi -

A colleague presented the following very slow query to me:

SELECT DISTINCT lemma FROM word
JOIN sense USING (wordid)
JOIN synset USING (synsetid)
  WHERE sense.synsetid
IN (SELECT synset2id FROM semlinkref
 WHERE synset1id
   IN (SELECT synsetid FROM sense
WHERE wordid = (SELECT wordid FROM word WHERE  
lemma='scramble'))

 AND linkid=1
 AND synset.pos='v')
  ORDER BY lemma;

I realized that the last constraint, synset.pos='v', actually applies  
to one of the tables in the main join, and could be lifted out of the  
double IN clause.  Doing so sped the query up by a factor of 10,000.


My question is, should the planner have figured this out, and we're  
just losing out because we're stuck in 7.4?  Or is there some subtle  
difference in semantics I'm missing?  The select results were the  
same in both cases, but I'm willing to believe that's an accident of  
our data.


(Sorry if no one can answer my question without the table  
definitions, etc. - it seemed worthwhile trying to get away without  
that for now.)


Thanks.

- John D. Burger
  MITRE



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [GENERAL] Lifting WHERE conditions out of inner select

2007-10-08 Thread Richard Broersma Jr
--- John D. Burger [EMAIL PROTECTED] wrote:
 My question is, should the planner have figured this out, and we're  
 just losing out because we're stuck in 7.4?  Or is there some subtle  
 difference in semantics I'm missing?  The select results were the  
 same in both cases, but I'm willing to believe that's an accident of  
 our data.

I don't know if this helps.

SELECT lemma 
  FROM word AS W
INNER JOIN sense AS S
ON W.wordid = S.wordid
INNER JOIN semlinkref AS R
ON R.synset1id = S.synsetid
   AND R.synset2id = S.synsetid
INNER JOIN synset AS T
ON S1.synsetid = T.synsetid
 WHERE W.lemma = 'scramble'
   AND R.linked = 1
   AND R.pos='v'
  ORDER BY lemma;

Regards,
Richard Broersma Jr.

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [GENERAL] Lifting WHERE conditions out of inner select

2007-10-08 Thread Richard Broersma Jr
OOPs!
--- Richard Broersma Jr [EMAIL PROTECTED] wrote:

 --- John D. Burger [EMAIL PROTECTED] wrote:
  My question is, should the planner have figured this out, and we're  
  just losing out because we're stuck in 7.4?  Or is there some subtle  
  difference in semantics I'm missing?  The select results were the  
  same in both cases, but I'm willing to believe that's an accident of  
  our data.
 
 I don't know if this helps.
 
 SELECT lemma 
   FROM word AS W
 INNER JOIN sense AS S
 ON W.wordid = S.wordid
 INNER JOIN semlinkref AS R
 ON R.synset1id = S.synsetid
AND R.synset2id = S.synsetid
 INNER JOIN synset AS T
 ON S.synsetid = T.synsetid
  WHERE W.lemma = 'scramble'
AND R.linked = 1
AND T.pos='v'
   ORDER BY lemma;
 
 Regards,
 Richard Broersma Jr.
 
 ---(end of broadcast)---
 TIP 3: Have you checked our extensive FAQ?
 
http://www.postgresql.org/docs/faq
 


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] Lifting WHERE conditions out of inner select

2007-10-08 Thread Tom Lane
John D. Burger [EMAIL PROTECTED] writes:
 ... WHERE sense.synsetid
  IN (SELECT synset2id FROM semlinkref
   WHERE synset1id
 IN (SELECT synsetid FROM sense
  WHERE wordid = (SELECT wordid FROM word WHERE  
 lemma='scramble'))
   AND linkid=1
   AND synset.pos='v')

 I realized that the last constraint, synset.pos='v', actually applies  
 to one of the tables in the main join, and could be lifted out of the  
 double IN clause.  Doing so sped the query up by a factor of 10,000.

 My question is, should the planner have figured this out, and we're  
 just losing out because we're stuck in 7.4?  Or is there some subtle  
 difference in semantics I'm missing?

As long as the condition involving the upper variable is STABLE
(including IMMUTABLE), I think your analysis is correct --- at least for
upper references that're within top-level WHERE clauses of a simple
subquery.  An example where it would not work is

foo IN (SELECT COUNT(*) FROM sometable
WHERE something-involving-upper-variable)

Here the upper condition doesn't simply filter out all the rows of the
subquery but actually changes the value of the (one) returned row.

The planner does not look for this type of situation though, and after
some study I think it'd be more trouble than it was worth.  It'd be
less than trivial to determine whether the upper references occurred
only in places where it was safe to pull them up, and the actual pulling
would take some code that doesn't exist now, too,

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend