[ https://issues.apache.org/jira/browse/HUDI-2814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alexey Kudinkin reopened HUDI-2814:
-----------------------------------

> Address issues w/ Z-order Layout Optimization
> ---------------------------------------------
>
>                 Key: HUDI-2814
>                 URL: https://issues.apache.org/jira/browse/HUDI-2814
>             Project: Apache Hudi
>          Issue Type: Task
>          Components: index
>            Reporter: Alexey Kudinkin
>            Assignee: Alexey Kudinkin
>            Priority: Blocker
>              Labels: pull-request-available
>             Fix For: 0.10.0
>
>
> During extensive testing, the following issues were discovered, which we
> plan to address in the upcoming PR:
>  * The data-skipping sequence incorrectly handles queries that reference
> columns that are not Z-sorted (it simply ignores this fact, while it
> should abandon pruning altogether[1])
>  * An exception within the file-pruning sequence should not affect the
> overall query (in the worst case it should fall back to a full scan)
>  * The merging sequence prefers records from the old Z-index table, while
> it should prefer those from the new one.
>  * After the clustering columns change, the Z-index should simply overwrite
> the index (currently it does the opposite: it skips updating the index
> when the old and new tables diverge in schema)
>  * Incorrect type conversions (for example, Decimal is converted to Double)
> Additionally, we plan to beef up the current Z-index implementation's
> test suite, making sure that all critical flows of Z-indexing have
> appropriate coverage.
> [1] With more advanced analysis we could still prune the search space, but
> that would require substantially more sophisticated analysis, which is
> beyond our current focus



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
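
For illustration only, the intended behaviour of the first three items in the list above could be sketched roughly as follows. This is not Hudi code: the message does not show the implementation, and every name here (`prune_files`, `merge_index`, `index_stats`, `may_match`) is hypothetical. The sketch assumes the index maps a file path to per-column (min, max) ranges.

```python
from typing import Callable, Dict, Iterable, List, Tuple

# Hypothetical shape: column name -> (min, max) for one file.
ColumnRanges = Dict[str, Tuple[object, object]]


def prune_files(
    all_files: List[str],
    index_stats: Dict[str, ColumnRanges],
    query_columns: Iterable[str],
    indexed_columns: Iterable[str],
    may_match: Callable[[ColumnRanges], bool],
) -> List[str]:
    """Return the files that still have to be scanned for a query."""
    # Item 1: if the query touches a column the index does not cover,
    # abandon pruning altogether instead of ignoring that column.
    if not set(query_columns) <= set(indexed_columns):
        return list(all_files)
    try:
        # Files without index entries are kept, since nothing is known
        # about them (an assumption of this sketch).
        return [
            f for f in all_files
            if f not in index_stats or may_match(index_stats[f])
        ]
    except Exception:
        # Item 2: a failure while pruning must not fail the query;
        # fall back to a full scan.
        return list(all_files)


def merge_index(
    old: Dict[str, ColumnRanges], new: Dict[str, ColumnRanges]
) -> Dict[str, ColumnRanges]:
    """Item 3: on conflict, entries from the new index table win."""
    return {**old, **new}
```

In this sketch both the unsupported-column case and the exception case return the full file list, so pruning can only ever reduce the work done and never change the query result.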