[ 
https://issues.apache.org/jira/browse/IMPALA-8755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Norbert Luksa resolved IMPALA-8755.
-----------------------------------
    Target Version: Impala 4.0
        Resolution: Implemented

> Implement Z-ordering for Impala
> -------------------------------
>
>                 Key: IMPALA-8755
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8755
>             Project: IMPALA
>          Issue Type: New Feature
>            Reporter: Zoltán Borók-Nagy
>            Assignee: Norbert Luksa
>            Priority: Major
>
> Implement Z-ordering for Impala: [https://en.wikipedia.org/wiki/Z-order_curve]
> A Z-order curve defines an ordering on multi-dimensional data. Data sorted 
> that way can be filtered efficiently using min/max statistics on the 
> columns participating in the ordering.
> Impala currently only supports lexicographic ordering via the SORT BY clause. 
> This strongly favors the first column: given a "SORT BY A, B, C" clause, 
> A will be totally ordered (hence filtering on A will be very efficient), 
> but the values of B and C will be scattered throughout the data set 
> (hence filtering on B or C will barely help).
> We could add a new clause to Impala, e.g. a "ZSORT BY" clause, that writes 
> the data in Z-order.
> "ZSORT BY A, B, C" would cluster the rows in a way that filtering on A, B, or 
> C would be equally efficient.
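
To illustrate the description above, here is a minimal sketch (not Impala's implementation) that compares lexicographic order with Z-order on a toy table. The helper names `z_value` and `groups_scanned`, the 8x8 value grid, and the group size of 4 rows are all assumptions made for this example. A "row group" stands in for any unit of storage that keeps min/max statistics per column, and the bit-interleaving convention (first column in the lowest bit) is one of several possible choices.

```python
from itertools import product


def z_value(coords, bits=4):
    """Interleave the low `bits` bits of each coordinate into one Z-order key.

    Hypothetical helper: bit `b` of dimension `d` lands at position
    b * len(coords) + d of the resulting key.
    """
    key = 0
    n = len(coords)
    for bit in range(bits):
        for dim, c in enumerate(coords):
            key |= ((c >> bit) & 1) << (bit * n + dim)
    return key


def groups_scanned(rows, column, value, group_size):
    """Count row groups whose [min, max] range on `column` contains `value`.

    A group that cannot contain `value` would be skipped by a min/max filter.
    """
    hits = 0
    for start in range(0, len(rows), group_size):
        vals = [r[column] for r in rows[start:start + group_size]]
        if min(vals) <= value <= max(vals):
            hits += 1
    return hits


# Toy table: 64 rows with columns (A, B), each taking every value in 0..7.
rows = list(product(range(8), repeat=2))

lexicographic = sorted(rows)                                 # like SORT BY A, B
z_ordered = sorted(rows, key=lambda r: z_value(r, bits=3))   # Z-order on (A, B)

# Row groups (out of 16) that must be read for an equality filter on 5.
lex_a = groups_scanned(lexicographic, 0, 5, group_size=4)
lex_b = groups_scanned(lexicographic, 1, 5, group_size=4)
z_a = groups_scanned(z_ordered, 0, 5, group_size=4)
z_b = groups_scanned(z_ordered, 1, 5, group_size=4)

print("lexicographic: A=5 reads", lex_a, "groups, B=5 reads", lex_b)
print("z-order:       A=5 reads", z_a, "groups, B=5 reads", z_b)
```

In this toy setup the lexicographic layout reads 2 of 16 groups for a filter on A but 8 for a filter on B, while the Z-ordered layout reads 4 groups for either column. This is the trade-off the description refers to: Z-order gives up some selectivity on the first column in exchange for comparable selectivity on all participating columns.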



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

