[jira] [Commented] (PHOENIX-852) Optimize child/parent foreign key joins

James Taylor (JIRA) Thu, 14 Aug 2014 15:09:17 -0700

    [ 
https://issues.apache.org/jira/browse/PHOENIX-852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14097775#comment-14097775
 ]


James Taylor commented on PHOENIX-852:
--------------------------------------

Yes, correct - must be leading columns. There's a little bit of work required 
to support the case of a join condition that includes both c0 and c1. We handle 
the basic case (i.e. WHERE c0=1 AND c1=2), but not the IN case (i.e. WHERE 
(c0, c1) IN ((?,?),(?,?)) ). Your case is really the latter. This would be easy 
to add, though, and I think well worth it.
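
To make the IN case concrete, here is a minimal sketch of turning a set of
(c0, c1) key pairs, such as the join keys held on the hash side, into a
row-value IN predicate with bind parameters. The function name
build_row_value_in and the sample keys are hypothetical and only for
illustration; this is not Phoenix code, just the shape of the predicate
described above.

```python
from typing import Iterable, List, Sequence, Tuple


def build_row_value_in(columns: Sequence[str],
                       keys: Iterable[Tuple]) -> Tuple[str, List]:
    """Build a row-value IN predicate and its flat list of bind values.

    Hypothetical helper for illustration only (not part of Phoenix).
    Duplicate key tuples are collapsed and the result is sorted so the
    generated predicate is deterministic.
    """
    distinct = sorted(set(keys))
    if not distinct:
        raise ValueError("at least one key tuple is required")
    # One "(?,?)" group per distinct key tuple.
    group = "(" + ",".join("?" for _ in columns) + ")"
    predicate = "({}) IN ({})".format(
        ", ".join(columns), ",".join(group for _ in distinct))
    # Flatten the tuples in the same order as the placeholders.
    binds = [value for key in distinct for value in key]
    return predicate, binds


predicate, binds = build_row_value_in(["c0", "c1"], [(1, 2), (3, 4), (1, 2)])
print("WHERE " + predicate)
print(binds)
```

With two distinct pairs this yields the same form as the IN example above,
one (?,?) group per key pair.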

FWIW, we can handle non-leading columns or gaps in columns, but we don't by 
default today. The reason is that we don't know the cardinality of these 
missing columns, so we don't know whether a skip scan would be better or worse 
than a range scan. When we start collecting histogram information, we can 
start to change this.
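
As a rough illustration of why the cardinality of a missing column matters,
here is a toy cost comparison. It assumes a key of (leading column, gap
column, trailing column), that a skip scan needs one seek per combination of
leading key, gap value and trailing key, and that the alternative is a plain
range scan over all rows in the range. The function names and the
seek_cost_in_rows weight are hypothetical; this is not how Phoenix decides
today.

```python
def skip_scan_seeks(num_leading_keys: int,
                    gap_cardinality: int,
                    num_trailing_keys: int) -> int:
    """Seeks needed when every gap value must be enumerated (toy model)."""
    return num_leading_keys * gap_cardinality * num_trailing_keys


def prefer_skip_scan(rows_in_range: int,
                     num_leading_keys: int,
                     gap_cardinality: int,
                     num_trailing_keys: int,
                     seek_cost_in_rows: int = 10) -> bool:
    """Return True if the toy model favors a skip scan over a range scan.

    seek_cost_in_rows is an assumed weight: one seek is treated as costing
    as much as reading that many rows sequentially.
    """
    seeks = skip_scan_seeks(num_leading_keys, gap_cardinality,
                            num_trailing_keys)
    return seeks * seek_cost_in_rows < rows_in_range


# Low-cardinality gap column: few seeks, so skipping wins in this model.
low = prefer_skip_scan(1_000_000, 2, 5, 3)
# High-cardinality gap column: the seeks outnumber the rows saved.
high = prefer_skip_scan(1_000_000, 2, 100_000, 3)
print(low, high)  # True False
```

Without an estimate for gap_cardinality the comparison cannot be made, which
is where histogram information would come in.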

> Optimize child/parent foreign key joins
> ---------------------------------------
>
>                 Key: PHOENIX-852
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-852
>             Project: Phoenix
>          Issue Type: Improvement
>            Reporter: James Taylor
>            Assignee: Maryann Xue
>
> Oftentimes a join will occur from a child to a parent. Our current algorithm 
> would do a full scan of one side or the other. We can do much better than 
> that if the HashCache contains the PK (or even part of the PK) from the table 
> being joined to. In these cases, we should drive the second scan through a 
> skip scan on the server side.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
