[
https://issues.apache.org/jira/browse/PHOENIX-852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14104083#comment-14104083
]
James Taylor commented on PHOENIX-852:
--------------------------------------
I see - so without a filter, there's no need for using the skip scan because
the matching rows on the LHS will be contiguous, right? Now I understand why
you need two hints.
Orthogonal to this change, what about the child PK -> parent FK case, like the
following?
{code}
select * from A join B on A.fk = B.id
{code}
Assume that there's an index on A.fk. Do we add a A.fk IS NOT NULL filter to
the LHS query? That would help if A is much bigger than B and there are lots of
null values for A.fk.
And related to this, what about the partial key match scenario when there's no
where clause? For example, assume table A has PK of (a, b) and table B has PK
of (a), and you have a query like this one:
{code}
select * from A join B on A.a = B.a
{code}
Does the LHS scan have a start row equal to the min value in the RHS scan? I
suppose this would cover the null case too.
> Optimize child/parent foreign key joins
> ---------------------------------------
>
> Key: PHOENIX-852
> URL: https://issues.apache.org/jira/browse/PHOENIX-852
> Project: Phoenix
> Issue Type: Improvement
> Reporter: James Taylor
> Assignee: Maryann Xue
> Attachments: PHOENIX-852.patch
>
>
> Often times a join will occur from a child to a parent. Our current algorithm
> would do a full scan of one side or the other. We can do much better than
> that if the HashCache contains the PK (or even part of the PK) from the table
> being joined to. In these cases, we should drive the second scan through a
> skip scan on the server side.
--
This message was sent by Atlassian JIRA
(v6.2#6252)