[jira] [Commented] (PHOENIX-2797) Ideas to speed up MIN/MAX/DISTINCT for prefixes of the PK

Lars Hofhansl (JIRA) Fri, 25 Mar 2016 12:08:15 -0700

    [ 
https://issues.apache.org/jira/browse/PHOENIX-2797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15212264#comment-15212264
 ]


Lars Hofhansl commented on PHOENIX-2797:
----------------------------------------

Oops. Should have done some checking first. Just wanted to capture it before I 
forget about it.

> Ideas to speed up MIN/MAX/DISTINCT for prefixes of the PK
> ---------------------------------------------------------
>
>                 Key: PHOENIX-2797
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2797
>             Project: Phoenix
>          Issue Type: Improvement
>            Reporter: Lars Hofhansl
>            Priority: Minor
>
> All of MIN, MAX, and DISTINCT always perform a full scan, even when they are 
> on a prefix of a compound key.
> For MIN and MAX one only needs to find the first and last row (respectively) 
> and we'll have our answer. This works for the full key or a prefix of the key.
> This should work fine with or without a WHERE clause, as long as we can 
> identify the first and last row.
> For DISTINCT we could do a skip scan to the next prefix (only helps with a 
> true prefix of a compound key).
> Say the key is (K1, K2), and say further that we're doing DISTINCT(K1). We 
> can skip to the next value of K1 once we have found a value. This should have 
> a dramatic impact when the cardinality of K2 is high.
> With a WHERE clause that might itself be causing a SKIP SCAN, this might be 
> quite tricky. Would need to think about it.
> Both of these statements hold equally when querying against an index.
> Anyway... Just filing this as an idea for now.
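
To make the two ideas in the description concrete, here is a rough sketch in
Python. It uses a sorted in-memory list as a stand-in for a table ordered by
the compound key (K1, K2). The names ROWS, seek_past, min_max_k1 and
distinct_k1 are made up for illustration; none of this is Phoenix or HBase
API, and a real implementation would seek inside a region scanner rather than
binary-search a list.

```python
# Hypothetical stand-in for a table sorted by the compound key (K1, K2):
# 3 distinct K1 values, each with 1000 K2 values.
ROWS = sorted((k1, k2) for k1 in ("a", "b", "c") for k2 in range(1000))


def seek_past(rows, k1, lo):
    """Return the index of the first row at or after lo whose K1 is > k1.

    This plays the role of a seek to the next key prefix; it is a plain
    binary search over the sorted rows.
    """
    hi = len(rows)
    while lo < hi:
        mid = (lo + hi) // 2
        if rows[mid][0] <= k1:
            lo = mid + 1
        else:
            hi = mid
    return lo


def min_max_k1(rows):
    """MIN and MAX of the leading key column: only the first and last
    row need to be read, because the rows are sorted by (K1, K2)."""
    if not rows:
        return None, None
    return rows[0][0], rows[-1][0]


def distinct_k1(rows):
    """DISTINCT on K1: record a value, then seek past every row sharing it.

    Returns the distinct values and the number of seeks performed.
    """
    values, seeks, pos = [], 0, 0
    while pos < len(rows):
        k1 = rows[pos][0]
        values.append(k1)
        pos = seek_past(rows, k1, pos)  # skip all remaining rows with this K1
        seeks += 1
    return values, seeks


if __name__ == "__main__":
    print(min_max_k1(ROWS))
    print(distinct_k1(ROWS))
```

In this toy setup the DISTINCT loop does one seek per distinct K1 value
instead of visiting every row, which is where the gain would come from when
the cardinality of K2 is high. The sketch ignores WHERE clauses entirely.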



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
