[
https://issues.apache.org/jira/browse/PHOENIX-852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14097775#comment-14097775
]
James Taylor commented on PHOENIX-852:
--------------------------------------
Yes, correct - must be leading columns. There's a little bit of work required
to support the case of a join condition that includes both c0 and c1. We handle
the basic case (i.e. WHERE c0=1 and c1=2), but not the IN case (i.e. WHERE
(c0, c1) IN ((?,?),(?,?)) ). Your case is really the latter. This would be easy
to add, though, and I think well worth it.
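
To make the two predicate shapes concrete, here is a minimal Python sketch that
builds both WHERE clauses for a list of leading PK columns. The helper names
(point_predicate, rvc_in_predicate) are hypothetical and exist only for this
illustration; they are not part of Phoenix.

```python
def point_predicate(columns):
    """Basic case: one equality per leading PK column, ANDed together."""
    return " AND ".join(f"{col} = ?" for col in columns)


def rvc_in_predicate(columns, num_keys):
    """IN case: a row value constructor matched against num_keys key tuples."""
    if num_keys < 1:
        raise ValueError("need at least one key tuple")
    # One parenthesised group of placeholders per key tuple.
    row = "(" + ", ".join("?" for _ in columns) + ")"
    return f"({', '.join(columns)}) IN ({', '.join([row] * num_keys)})"


print(point_predicate(["c0", "c1"]))      # c0 = ? AND c1 = ?
print(rvc_in_predicate(["c0", "c1"], 2))  # (c0, c1) IN ((?, ?), (?, ?))
```

A join that carries several (c0, c1) pairs from one side to the other has the
second shape: one key tuple per distinct pair.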
FWIW, we can handle non-leading columns or gaps in columns, but we don't by
default today. The reason is that we don't know the cardinality of these
missing columns, so we don't know whether doing a skip scan would be better or
worse than not doing one. When we start collecting histogram information, we
can start to change this.
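
A toy Python model of why the cardinality matters, assuming a key of (c0, c1)
where only c1 is constrained. The cost model, the function names, and the
seek_cost ratio are all made up for illustration; this is not how Phoenix
decides, only a sketch of the trade-off.

```python
def skip_scan_seeks(leading_cardinality, filter_values):
    """Seeks needed when c0 is unconstrained: one per
    (distinct c0 value, wanted c1 value) pair."""
    return leading_cardinality * filter_values


def prefer_skip_scan(total_rows, leading_cardinality, filter_values,
                     seek_cost=10.0):
    """Return True if the skip scan looks cheaper than reading every row.

    seek_cost is an assumed ratio: the cost of one seek relative to
    reading one row sequentially.
    """
    seeks = skip_scan_seeks(leading_cardinality, filter_values)
    return seeks * seek_cost < total_rows


# Few distinct c0 values: a handful of seeks beats reading a million rows.
print(prefer_skip_scan(1_000_000, 10, 5))
# Many distinct c0 values: the seeks cost more than just reading everything.
print(prefer_skip_scan(1_000_000, 500_000, 5))
```

Without an estimate for leading_cardinality there is no way to evaluate the
comparison, which is the gap that histogram information would fill.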
> Optimize child/parent foreign key joins
> ---------------------------------------
>
> Key: PHOENIX-852
> URL: https://issues.apache.org/jira/browse/PHOENIX-852
> Project: Phoenix
> Issue Type: Improvement
> Reporter: James Taylor
> Assignee: Maryann Xue
>
> Oftentimes a join will occur from a child to a parent. Our current algorithm
> would do a full scan of one side or the other. We can do much better than
> that if the HashCache contains the PK (or even part of the PK) from the table
> being joined to. In these cases, we should drive the second scan through a
> skip scan on the server side.
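
The idea in the description can be sketched in Python as follows. This is a
simplified in-memory model, not Phoenix code: parent_rows stands in for a
parent table sorted by primary key, wanted_keys stands in for the PK values
held in the hash cache, and both function names are hypothetical.

```python
from bisect import bisect_left


def full_scan_join(parent_rows, wanted_keys):
    """Read every parent row and keep those whose PK is wanted.
    Returns (matches, rows_examined)."""
    wanted = set(wanted_keys)
    matches = [row for row in parent_rows if row[0] in wanted]
    return matches, len(parent_rows)


def skip_scan_join(parent_rows, wanted_keys):
    """Seek directly to each wanted PK in the sorted parent rows.
    Returns (matches, seeks_performed)."""
    keys = [row[0] for row in parent_rows]
    distinct = sorted(set(wanted_keys))
    matches = []
    for key in distinct:
        i = bisect_left(keys, key)  # binary search stands in for a seek
        if i < len(keys) and keys[i] == key:
            matches.append(parent_rows[i])
    return matches, len(distinct)


parent = [(i, f"parent-{i}") for i in range(1000)]
child_fks = [3, 500, 3, 998]  # foreign keys seen on the child side

print(full_scan_join(parent, child_fks))  # same matches, 1000 rows examined
print(skip_scan_join(parent, child_fks))  # same matches, 3 seeks
```

Both functions return the same matching rows; the difference is that the second
touches the parent side once per distinct key rather than once per row.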