>>I might still be misunderstanding, but I think what you are suggesting >>is that in the loop in make_rels_by_clause_joins, if we find that the >>old_rel doesn't have a join clause/restriction with the current >>other_rel, we check to see whether other_rel has any join clauses at >>all, and force the join to occur anyway if it doesn't.
It is on similar lines, but the only difference is that it will try to join old_rel with other_rel list incase old_rel is not able to join with any of other_rel in the list with proper join clause between them. >>In the second place, instead of N tests to see whether a rel lacks any join clauses, >>we'd now have something approaching O(N^2) such tests, in the typical >>case where most base rels are directly joined to only a few other rels. In the typical case where most base rels are directly joined to only a few other rels, it will not go in the path where it has to try new combinations, as the idea is that if in first pass the old_rel is not able find any suitable rel to join then only it will go in second pass to try to join with other_rel list. >>In the first place, it seems to miss the need to >>clauseless-join two relations when neither of them have any join >>clauses, for instance plain old "SELECT * FROM a, b". So you still need >>something like the make_rels_by_clauseless_joins code path, and it's >>not entirely clear how to avoid duplicated work there. I believe this will not be related to clause-less joins because we have entered the function make_rels_by_clause_joins() for relation which has join clause and which can be joined to one of the other_rel. So in most cases I believe it will find one of the members in other_rel list with a proper join clause. Only in some cases when it is not able to find any of the other_rel with proper join clause between them, it will go in new path. So in a way if we see this case handling is different from handling clause-less join case. The other way like you said "is there any join clause that both these relations participate in?" Does this mean that we will evaluate the join list to see if there can be any valid join between 2 relations. Is it possible that in all scenarios or cases we will be able to find the existence of any valid join which can be possible may be not at this level but at next level. -----Original Message----- From: Tom Lane [mailto:t...@sss.pgh.pa.us] Sent: Tuesday, April 17, 2012 2:05 AM To: Amit Kapila Cc: pgsql-hackers@postgresql.org Subject: Re: [HACKERS] Improving our clauseless-join heuristics Amit Kapila <amit.kap...@huawei.com> writes: > For this kind of query, currently (referring 9.0.3 code) also it considers > join of b,c and b,d. > As there is no join clause between b,c,d so it will go in path of > make_rels_by_clauseless_joins() where it considers join of b,c and b,d. > In this kind of query, if the suggestion by me in below mail is followed, > then it will consider joining a,b a,c a,d at level-2 in function > make_rels_by_clause_joins() which it currently doesn't do which may generate > useless join paths. > However in real-world scenario's this kind of queries where 2 cols of > different tables are > used in one side expression (b.y + c.z) of where clause may be less. > On the other hand, when we come to consider d, it will have no join > clauses so we will consider joining it to each other rel in turn. > When it come to consider d, as at level -2 it only consider later rels. So > it should not consider joining with each other rel. I might still be misunderstanding, but I think what you are suggesting is that in the loop in make_rels_by_clause_joins, if we find that the old_rel doesn't have a join clause/restriction with the current other_rel, we check to see whether other_rel has any join clauses at all, and force the join to occur anyway if it doesn't. I can't really get excited about doing it that way instead of the current way. In the first place, it seems to miss the need to clauseless-join two relations when neither of them have any join clauses, for instance plain old "SELECT * FROM a, b". So you still need something like the make_rels_by_clauseless_joins code path, and it's not entirely clear how to avoid duplicated work there. In the second place, instead of N tests to see whether a rel lacks any join clauses, we'd now have something approaching O(N^2) such tests, in the typical case where most base rels are directly joined to only a few other rels. So it seems to make things slower for little obvious benefit. In general, queries with join-clause-less rels are pretty uncommon, so I don't want to introduce extra work into make_rels_by_clause_joins to handle the case. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers