[ 
https://issues.apache.org/jira/browse/PHOENIX-1179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14211536#comment-14211536
 ] 

James Taylor commented on PHOENIX-1179:
---------------------------------------

Thanks for the contribution, [~maryannxue]. This is fantastic! First, some 
high-level questions before any detailed feedback:
- how is it determined if a hash-join is possible?
- is the hint USE_SORT_MERGE_JOIN an all-or-nothing hint, in that if more than 
two tables are joined together, they'd all use the sort-merge algorithm 
instead of a hash join? Should we have a way of hinting each part of the join 
(kind of like we have for our index order hint)?
- what's the high-level algorithm? That'd be good to capture in a code comment 
(sorry in advance if I missed it). Does it push an ORDER BY to the server based 
on the join keys? What happens if there's already an ORDER BY clause?
- it would be interesting to get your feedback on which statistics we need to 
drive the optimizer decisions: PHOENIX-1178, PHOENIX-1453, and maybe others? 
[~julianhyde] may have ideas.

> Support many-to-many joins
> --------------------------
>
>                 Key: PHOENIX-1179
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-1179
>             Project: Phoenix
>          Issue Type: Sub-task
>            Reporter: James Taylor
>            Assignee: Maryann Xue
>             Fix For: 3.0.0, 4.0.0, 5.0.0
>
>         Attachments: 1179.patch
>
>
> Enhance our join capabilities to support many-to-many joins where both sides 
> of the join are too big to fit into memory (and thus cannot use our hash 
> join mechanism). One technique would be to order both sides of the join by 
> their join key and merge sort the results on the client.
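
For illustration, the technique in the description (order both sides by the join key, then merge) can be sketched roughly as below. This is not Phoenix code and not taken from 1179.patch: the class name SortMergeJoinSketch, the join method, and the {joinKey, payload} row layout are all hypothetical, and both the sort and the merge run in memory here purely for brevity, whereas the description has the merge happening on the client.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Illustrative only, not Phoenix code. Each row is a {joinKey, payload} pair. */
class SortMergeJoinSketch {

    /** Inner-joins two row lists on element 0, emitting {key, leftPayload, rightPayload}. */
    static List<String[]> join(List<String[]> left, List<String[]> right) {
        // Step 1: order both sides by the join key (on copies, so inputs stay untouched).
        Comparator<String[]> byKey = (String[] a, String[] b) -> a[0].compareTo(b[0]);
        List<String[]> l = new ArrayList<>(left);
        List<String[]> r = new ArrayList<>(right);
        l.sort(byKey);
        r.sort(byKey);

        // Step 2: walk both sorted lists in a single merge pass.
        List<String[]> out = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < l.size() && j < r.size()) {
            int cmp = l.get(i)[0].compareTo(r.get(j)[0]);
            if (cmp < 0) {
                i++; // left key is smaller and has no match: advance left
            } else if (cmp > 0) {
                j++; // right key is smaller and has no match: advance right
            } else {
                // Many-to-many case: find the run of equal keys on each side...
                String key = l.get(i)[0];
                int iEnd = i;
                while (iEnd < l.size() && l.get(iEnd)[0].equals(key)) {
                    iEnd++;
                }
                int jEnd = j;
                while (jEnd < r.size() && r.get(jEnd)[0].equals(key)) {
                    jEnd++;
                }
                // ...and emit the cross product of the two runs.
                for (int a = i; a < iEnd; a++) {
                    for (int b = j; b < jEnd; b++) {
                        out.add(new String[] {key, l.get(a)[1], r.get(b)[1]});
                    }
                }
                i = iEnd;
                j = jEnd;
            }
        }
        return out;
    }
}
```

With two left rows and two right rows sharing a key, the merge step emits all four combinations, which is the many-to-many behavior the issue asks for. Only the current run of equal keys has to be held at once during the merge, which is why the approach does not depend on either side fitting into memory once the inputs arrive already sorted.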



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
