On Thu, Apr 6, 2017 at 2:13 PM, Jeff Davis <pg...@j-davis.com> wrote:

> ========
> Example:
> ========
>
> Find different people using the same website at the same time:
>
> create table session(sessionid text, username text, during tstzrange);
> SELECT s1.username, s2.username, s1.during * s2.during
>   FROM session s1, session s2
>   WHERE s1.during && s2.during AND s1.username < s2.username
>
>
> =====================================
> Brief summary of previous discussion:
> =====================================
>
> http://www.postgresql.org/message-id/1334554850.10878.52.camel@jdavis
>
> - can indexes solve it already (parameterized paths, etc.)?
> - planner complexity (track path keys, mergejoinable equality, etc.)
> - spatial join algorithm choice
> - compare range merge join against existing indexes to see if any
>   indexing approach is viable
>
> ==========
> Externals:
> ==========
>
> No new syntax or other externals. Just improved execution strategy for
> joining ranges using the overlaps operator (&&).
>
> New syntax is possible, but I don't see a strong reason to support
> special range join syntax at this time.
>
> =============
> Optimizer Design
> =============
>
> I took the path of least resistance, so if I am doing something wrong
> please let me know. I hardwired it to look for the range overlaps
> operator when checking if it could do a merge join, and then add
> range_ops to restrictinfo->mergeopfamilies.
>
> Then, I have to suppress adding it as an equivalence class, because
> overlaps is not equality. It still adds single-member ECs to
> restrictinfo->right_ec and left_ec, but (I think) that's OK.
>
> Also, I have to prevent generating paths for right/full join, because
> the range join algorithm can't (currently) handle those.
>
> =============
> Costing
> =============
>
> Needs more consideration. Seems to do reasonable things in the few
> examples I tried.
>
> =============
> Executor Design
> =============
>
> See detailed comments in nodeMergejoin.c
>
> =============
> Performance
> =============
>
> Seems much better than index nest loop join when the join is not
> selective. I will post detailed numbers soon.
>
> If no index is available, range merge join is the only reasonable way
> to execute a range join. For instance, if the inner is not a leaf in
> the plan.
>
> =============
> Alternatives:
> =============
>
> It was suggested that I approach it as a general spatial-join
> problem. That has merit, but I rejected it for now because the
> algorithm that I think has the most promise is range-oriented
> anyway. By that I mean we would need to extract all of the dimensions
> into ranges first, and then perform the join. With that in mind, it
> makes sense to implement range joins first; and then later provide the
> APIs to get at the spatial dimensions so that we can implement other
> spatial joins as range joins.
>
> Another suggestion was to base it on hash join, but I never understood
> that proposal and it didn't seem to map very well to a spatial join.
>
> Yet another suggestion was to use some kind of temporary index. Some
> brief experiments I did indicated that it would be fairly slow (though
> more investigation might be useful here). Also, it doesn't provide any
> alternative to the nestloop-with-inner-index we already offer at the
> leaf level today.
>
> Regards,
>      Jeff Davis
>

Looks like an interesting idea, however, in an attempt to test this patch I
found following error when compiling,
selfuncs.c: In function ‘mergejoinscansel’:
selfuncs.c:2901:12: error: ‘op_strategy’ undeclared (first use in this
function)
           &op_strategy,

-- 
Regards,
Rafia Sabih
EnterpriseDB: http://www.enterprisedb.com/

Reply via email to