[ 
https://issues.apache.org/jira/browse/PHOENIX-852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14104083#comment-14104083
 ] 

James Taylor commented on PHOENIX-852:
--------------------------------------

I see - so without a filter, there's no need for using the skip scan because 
the matching rows on the LHS will be contiguous, right? Now I understand why 
you need two hints.

Orthogonal to this change, what about the child PK -> parent FK case, like the 
following?
{code}
select * from A join B on A.fk = B.id
{code}
Assume that there's an index on A.fk. Do we add a A.fk IS NOT NULL filter to 
the LHS query? That would help if A is much bigger than B and there are lots of 
null values for A.fk.

And related to this, what about the partial key match scenario when there's no 
where clause? For example, assume table A has PK of (a, b) and table B has PK 
of (a), and you have a query like this one:
{code}
select * from A join B on A.a = B.a
{code}
Does the LHS scan have a start row equal to the min value in the RHS scan? I 
suppose this would cover the null case too.


> Optimize child/parent foreign key joins
> ---------------------------------------
>
>                 Key: PHOENIX-852
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-852
>             Project: Phoenix
>          Issue Type: Improvement
>            Reporter: James Taylor
>            Assignee: Maryann Xue
>         Attachments: PHOENIX-852.patch
>
>
> Often times a join will occur from a child to a parent. Our current algorithm 
> would do a full scan of one side or the other. We can do much better than 
> that if the HashCache contains the PK (or even part of the PK) from the table 
> being joined to. In these cases, we should drive the second scan through a 
> skip scan on the server side.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

Reply via email to