[ 
https://issues.apache.org/jira/browse/IMPALA-8755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16944106#comment-16944106
 ] 

ASF subversion and git services commented on IMPALA-8755:
---------------------------------------------------------

Commit 8f3733910811eeb7d0018a5284f0a170f618e092 in impala's branch 
refs/heads/master from norbert.luksa
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=8f37339 ]

IMPALA-8996: fix test_show_create_table in test_zorder.py

IMPALA-8593 extended the filtered tbl properties in
test_show_create_table.py. This change was not propagated to
test_zorder.py, which duplicates the former in order to run its
tests in a custom cluster environment.

Removed the "show create table" tests from test_zorder.py.
These tests are not critical, but can cause more issues like this.
Also, the Z-order tests live in a separate file only as long as a
custom cluster is required. After merging IMPALA-8755 and removing
the feature flag (which is the reason for the custom cluster), the
tests can be merged into show-create-table.test (from
show-create-table-zorder.test).

Change-Id: Ic316224325eec64d9b1e570854a74c0372084a4a
Reviewed-on: http://gerrit.cloudera.org:8080/14359
Reviewed-by: Csaba Ringhofer <csringho...@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>


> Implement Z-ordering for Impala
> -------------------------------
>
>                 Key: IMPALA-8755
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8755
>             Project: IMPALA
>          Issue Type: New Feature
>            Reporter: Zoltán Borók-Nagy
>            Assignee: Norbert Luksa
>            Priority: Major
>
> Implement Z-ordering for Impala: [https://en.wikipedia.org/wiki/Z-order_curve]
> A Z-order curve defines an ordering on multi-dimensional data. Data sorted 
> that way can be efficiently filtered using min/max statistics on the 
> columns participating in the ordering.
> Impala currently only supports lexicographic ordering via the SORT BY clause. 
> This strongly favors the first column: given a "SORT BY A, B, C" clause, 
> A will be totally ordered (hence filtering on A will be very efficient), 
> but the values of B and C will be scattered throughout the data set (hence 
> filtering on B or C will barely help).
> We could add a new clause, e.g. a "ZSORT BY" clause to Impala that writes the 
> data in Z-order.
> "ZSORT BY A, B, C" would cluster the rows in such a way that filtering on 
> A, B, or C would be equally efficient.
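
The difference between the two orderings can be illustrated with a small
sketch. It is not Impala's implementation: the helper names
(interleave_bits, chunks_matching), the restriction to small non-negative
integers, and the fixed chunk size standing in for a block with min/max
statistics are all assumptions made for illustration. The Z-order key is
built by interleaving the bits of the column values, and the sketch counts
how many chunks a min/max filter would have to read.

```python
def interleave_bits(values, bits=8):
    """Build a Z-order (Morton) key from non-negative ints (hypothetical helper)."""
    key = 0
    n = len(values)
    for bit in range(bits):
        for i, v in enumerate(values):
            # Bit `bit` of column i lands at position bit*n + (n-1-i), so
            # within each group of n bits the first column is most significant.
            key |= ((v >> bit) & 1) << (bit * n + (n - 1 - i))
    return key


def chunks_matching(rows, col, value, chunk_size=4):
    """Count chunks whose [min, max] range on `col` contains `value`.

    Each chunk stands in for a block of rows carrying min/max statistics
    (an assumption for this sketch).
    """
    hits = 0
    for start in range(0, len(rows), chunk_size):
        column = [row[col] for row in rows[start:start + chunk_size]]
        if min(column) <= value <= max(column):
            hits += 1
    return hits


# A 4x4 grid of (A, B) pairs, sorted two ways.
rows = [(a, b) for a in range(4) for b in range(4)]
lex_sorted = sorted(rows)  # like SORT BY A, B
z_sorted = sorted(rows, key=lambda r: interleave_bits(r, bits=2))

for name, data in (("lexicographic", lex_sorted), ("z-order", z_sorted)):
    print(name,
          "A=0 reads", chunks_matching(data, 0, 0), "chunks;",
          "B=0 reads", chunks_matching(data, 1, 0), "chunks")
```

On this toy grid the lexicographic order reads one chunk when filtering on
A but all four when filtering on B, while the Z-order reads two chunks for
either column, which is the trade-off the description refers to.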



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

