[ https://issues.apache.org/jira/browse/PHOENIX-1179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14211536#comment-14211536 ]
James Taylor commented on PHOENIX-1179:
---------------------------------------

Thanks for the contribution, [~maryannxue]. This is fantastic! First, some high-level questions before any detailed feedback:
- How is it determined whether a hash join is possible?
- Is the USE_SORT_MERGE_JOIN hint all-or-nothing, in that if more than two tables are joined together, they'd all use the merge sort algorithm instead of a hash join? Should we have a way of hinting each part of the join (kind of like we have for our index order hint)?
- What's the high-level algorithm? That'd be good to capture in a code comment (sorry in advance if I missed it). Does it push an ORDER BY to the server based on the join keys? What happens if there's already an ORDER BY clause?
- It would be interesting to get your feedback on what we need in terms of statistics to drive the optimizer decisions: PHOENIX-1178, PHOENIX-1453, maybe others needed? [~julianhyde] may have ideas.

> Support many-to-many joins
> --------------------------
>
>                 Key: PHOENIX-1179
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-1179
>             Project: Phoenix
>          Issue Type: Sub-task
>            Reporter: James Taylor
>            Assignee: Maryann Xue
>             Fix For: 3.0.0, 4.0.0, 5.0.0
>
>         Attachments: 1179.patch
>
>
> Enhance our join capabilities to support many-to-many joins where the size of
> both sides of the join are too big to fit into memory (and thus cannot use
> our hash join mechanism). One technique would be to order both sides of the
> join by their join key and merge sort the results on the client.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
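
For reference, the technique named in the issue description (order both sides by the join key, then merge on the client) can be sketched roughly as below. This is a generic inner sort-merge join and not the code in 1179.patch. The `Row` class, the `join` method, and the in-memory lists are hypothetical stand-ins for illustration, and the local sort stands in for whatever ordering each side of the join would arrive in.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class SortMergeJoinSketch {
    /** Hypothetical row shape: a join key plus an opaque payload. */
    public static final class Row {
        final int key;
        final String value;
        public Row(int key, String value) { this.key = key; this.value = value; }
    }

    /** Inner-joins two row lists on key and returns "left|right" pairs. */
    public static List<String> join(List<Row> left, List<Row> right) {
        // Order both sides by the join key. Sorting local copies here is an
        // assumption made only to keep the sketch self-contained.
        List<Row> l = new ArrayList<>(left);
        List<Row> r = new ArrayList<>(right);
        l.sort(Comparator.comparingInt((Row row) -> row.key));
        r.sort(Comparator.comparingInt((Row row) -> row.key));

        List<String> out = new ArrayList<>();
        int i = 0, j = 0;
        while (i < l.size() && j < r.size()) {
            int lk = l.get(i).key;
            int rk = r.get(j).key;
            if (lk < rk) {
                i++;                       // left key has no match yet, advance left
            } else if (lk > rk) {
                j++;                       // right key has no match yet, advance right
            } else {
                // Many-to-many case: find the run of equal keys on each side
                // and emit the cross product of the two runs.
                int iEnd = i;
                while (iEnd < l.size() && l.get(iEnd).key == lk) iEnd++;
                int jEnd = j;
                while (jEnd < r.size() && r.get(jEnd).key == lk) jEnd++;
                for (int a = i; a < iEnd; a++) {
                    for (int b = j; b < jEnd; b++) {
                        out.add(l.get(a).value + "|" + r.get(b).value);
                    }
                }
                i = iEnd;
                j = jEnd;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Row> left = List.of(new Row(2, "b1"), new Row(1, "a1"), new Row(2, "b2"));
        List<Row> right = List.of(new Row(2, "x"), new Row(3, "y"), new Row(2, "z"));
        System.out.println(join(left, right)); // [b1|x, b1|z, b2|x, b2|z]
    }
}
```

A streaming version would read both sides from sorted iterators and buffer only the current run of equal keys, so neither side needs to fit in memory as a whole.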