Hi,

On 12/16/2015 01:27 AM, David Rowley wrote:

I've attached a rebased patch against current master as there were
some conflicts from the recent changes to LATERAL join.

Thanks. I've looked at the rebased patch and have a few minor comments.

0) I know the patch does not tweak costing - any plans in this
   direction? Would it be possible to simply use the costing used by
   semijoin?

1) nodeHashjoin.c (and other join nodes)

I've noticed we have this in the ExecHashJoin() method:

 /*
  * When the inner side is unique or we're performing a
  * semijoin, we'll consider returning the first match, but
  * after that we're done with this outer tuple.
  */
 if (node->js.unique_inner)
     node->hj_JoinState = HJ_NEED_NEW_OUTER;

That seems a bit awkward because the comment speaks about unique joins *OR* semijoins, but the check only references unique_inner. That of course works because we do this in ExecInitHashJoin():

        switch (node->join.jointype)
        {
                case JOIN_SEMI:
                        hjstate->js.unique_inner = true;
                        /* fall through */

Either some semijoins may not be unique - in that case the naming is misleading at best, and we either need to find a better name or simply check two conditions like this:

 if (node->js.unique_inner || node->join.type == JOIN_SEMI)
    node->hj_JoinState = HJ_NEED_NEW_OUTER;

Or all semijoins are unique joins, and in that case the comment may need rephrasing. But more importantly, it begs the question why we're detecting this in the executor and not in the planner? Because if we detect it in executor, we can't use this bit in costing, for example.

FWIW the misleading naming was actually mentioned in the thread by Horiguchi-san, but I don't see any response to that (might have missed it though).

2) analyzejoins.c

I see that this code in join_is_removable()

    innerrel = find_base_rel(root, innerrelid);

    if (innerrel->reloptkind != RELOPT_BASEREL)
        return false;

was rewritten like this:

    innerrel = find_base_rel(root, innerrelid);

    Assert(innerrel->reloptkind == RELOPT_BASEREL);

That suggests that previously the function expected cases where reloptkind may not be RELOPT_BASEREL, but now it'll error out int such cases. I haven't noticed any changes in the surrounding code that'd guarantee this won't happen, but I also haven't been able to come up with an example triggering the assert (haven't been trying too hard). How do we know the assert() makes sense?


3) joinpath.c

- either "were" or "been" seems redundant

/* left joins were already been analyzed for uniqueness in mark_unique_joins() */


4) analyzejoins.c

comment format broken

/*
 * mark_unique_joins
        Analyze joins in order to determine if their inner side is unique based
        on the join condition.
 */

5) analyzejoins.c

missing line before the comment

}
/*
 * rel_supports_distinctness
 *              Returns true if rel has some properties which can prove the 
relation
 *              to be unique over some set of columns.
 *

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to