Re: [GENERAL] Lifting WHERE conditions out of inner select
Tom Lane wrote: The planner does not look for this type of situation though, and after some study I think it'd be more trouble than it was worth. It'd be less than trivial to determine whether the upper references occurred only in places where it was safe to pull them up, and the actual pulling would take some code that doesn't exist now, too, Okay, good to know. The situation is obviously easy to avoid, I just found the contrast surprising. Thanks. - John D. Burger MITRE ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[GENERAL] Lifting WHERE conditions out of inner select
Hi - A colleague presented the following very slow query to me: SELECT DISTINCT lemma FROM word JOIN sense USING (wordid) JOIN synset USING (synsetid) WHERE sense.synsetid IN (SELECT synset2id FROM semlinkref WHERE synset1id IN (SELECT synsetid FROM sense WHERE wordid = (SELECT wordid FROM word WHERE lemma='scramble')) AND linkid=1 AND synset.pos='v') ORDER BY lemma; I realized that the last constraint, synset.pos='v', actually applies to one of the tables in the main join, and could be lifted out of the double IN clause. Doing so sped the query up by a factor of 10,000. My question is, should the planner have figured this out, and we're just losing out because we're stuck in 7.4? Or is there some subtle difference in semantics I'm missing? The select results were the same in both cases, but I'm willing to believe that's an accident of our data. (Sorry if no one can answer my question without the table definitions, etc. - it seemed worthwhile trying to get away without that for now.) Thanks. - John D. Burger MITRE ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] Lifting WHERE conditions out of inner select
--- John D. Burger [EMAIL PROTECTED] wrote: My question is, should the planner have figured this out, and we're just losing out because we're stuck in 7.4? Or is there some subtle difference in semantics I'm missing? The select results were the same in both cases, but I'm willing to believe that's an accident of our data. I don't know if this helps. SELECT lemma FROM word AS W INNER JOIN sense AS S ON W.wordid = S.wordid INNER JOIN semlinkref AS R ON R.synset1id = S.synsetid AND R.synset2id = S.synsetid INNER JOIN synset AS T ON S1.synsetid = T.synsetid WHERE W.lemma = 'scramble' AND R.linked = 1 AND R.pos='v' ORDER BY lemma; Regards, Richard Broersma Jr. ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [GENERAL] Lifting WHERE conditions out of inner select
OOPs! --- Richard Broersma Jr [EMAIL PROTECTED] wrote: --- John D. Burger [EMAIL PROTECTED] wrote: My question is, should the planner have figured this out, and we're just losing out because we're stuck in 7.4? Or is there some subtle difference in semantics I'm missing? The select results were the same in both cases, but I'm willing to believe that's an accident of our data. I don't know if this helps. SELECT lemma FROM word AS W INNER JOIN sense AS S ON W.wordid = S.wordid INNER JOIN semlinkref AS R ON R.synset1id = S.synsetid AND R.synset2id = S.synsetid INNER JOIN synset AS T ON S.synsetid = T.synsetid WHERE W.lemma = 'scramble' AND R.linked = 1 AND T.pos='v' ORDER BY lemma; Regards, Richard Broersma Jr. ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [GENERAL] Lifting WHERE conditions out of inner select
John D. Burger [EMAIL PROTECTED] writes: ... WHERE sense.synsetid IN (SELECT synset2id FROM semlinkref WHERE synset1id IN (SELECT synsetid FROM sense WHERE wordid = (SELECT wordid FROM word WHERE lemma='scramble')) AND linkid=1 AND synset.pos='v') I realized that the last constraint, synset.pos='v', actually applies to one of the tables in the main join, and could be lifted out of the double IN clause. Doing so sped the query up by a factor of 10,000. My question is, should the planner have figured this out, and we're just losing out because we're stuck in 7.4? Or is there some subtle difference in semantics I'm missing? As long as the condition involving the upper variable is STABLE (including IMMUTABLE), I think your analysis is correct --- at least for upper references that're within top-level WHERE clauses of a simple subquery. An example where it would not work is foo IN (SELECT COUNT(*) FROM sometable WHERE something-involving-upper-variable) Here the upper condition doesn't simply filter out all the rows of the subquery but actually changes the value of the (one) returned row. The planner does not look for this type of situation though, and after some study I think it'd be more trouble than it was worth. It'd be less than trivial to determine whether the upper references occurred only in places where it was safe to pull them up, and the actual pulling would take some code that doesn't exist now, too, regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend