Hi -

A colleague presented the following very slow query to me:

SELECT DISTINCT lemma FROM word
        JOIN sense USING (wordid)
        JOIN synset USING (synsetid)
  WHERE sense.synsetid
    IN (SELECT synset2id FROM semlinkref
         WHERE synset1id
           IN (SELECT synsetid FROM sense
                WHERE wordid = (SELECT wordid FROM word WHERE lemma='scramble'))
         AND linkid=1
         AND synset.pos='v')
  ORDER BY lemma;

I realized that the last constraint, synset.pos='v', actually applies to one of the tables in the main join, and could be lifted out of the double IN clause. Doing so sped the query up by a factor of 10,000.
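
For reference, here is a sketch of the rewritten form, with the pos test moved into the outer WHERE. It is reconstructed from the description above, so it may not match character for character what was actually run, and it assumes that synset.pos inside the subquery refers to the synset table of the outer join (that is, that semlinkref has nothing named synset in scope):

```sql
-- Same query, with synset.pos='v' lifted out of the nested IN clauses.
-- Assumption: the original reference to synset.pos was a correlated
-- reference to the outer join's synset table.
SELECT DISTINCT lemma FROM word
        JOIN sense USING (wordid)
        JOIN synset USING (synsetid)
  WHERE synset.pos='v'
    AND sense.synsetid
    IN (SELECT synset2id FROM semlinkref
         WHERE synset1id
           IN (SELECT synsetid FROM sense
                WHERE wordid = (SELECT wordid FROM word WHERE lemma='scramble'))
         AND linkid=1)
  ORDER BY lemma;
```

In the original form the test on synset.pos makes the IN subquery depend on the outer row, so it is presumably re-evaluated for each candidate row; in this form the subquery no longer references the outer query at all.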

My question is: should the planner have figured this out, and are we just losing out because we're stuck on 7.4? Or is there some subtle difference in semantics that I'm missing? The select results were the same in both cases, but I'm willing to believe that's an accident of our data.

(Sorry if no one can answer my question without the table definitions, etc. - it seemed worthwhile trying to get away without that for now.)

Thanks.

- John D. Burger
  MITRE


