[ 
https://issues.apache.org/jira/browse/HUDI-2814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Mahindra updated HUDI-2814:
----------------------------------
    Fix Version/s:     (was: 0.10.0)
                   0.11.0

> Address issues w/ Z-order Layout Optimization
> ---------------------------------------------
>
>                 Key: HUDI-2814
>                 URL: https://issues.apache.org/jira/browse/HUDI-2814
>             Project: Apache Hudi
>          Issue Type: Sub-task
>          Components: Index
>            Reporter: Alexey Kudinkin
>            Assignee: Alexey Kudinkin
>            Priority: Blocker
>              Labels: pull-request-available
>             Fix For: 0.11.0
>
>
> During extensive testing, the following issues were discovered, which we're 
> planning to address in the upcoming PR:
>  * Data-skipping seq incorrectly handles cases when columns that are not 
> Z-sorted are present in the query (it simply ignores this fact, while it 
> should abandon pruning altogether[1])
>  * An exception w/in the file-pruning seq should not affect the overall 
> query (in the worst case it should fall back to a full scan)
>  * Merging seq prefers records from the old Z-index table, while it should 
> prefer those from the new one.
>  * After clustering columns change, Z-index should simply overwrite index 
> (currently it actually does the opposite – it skips updating the index in 
> case old and new tables diverge in schemas)
>  * Incorrect type conversions (e.g., Decimal is converted to Double)
> Additionally, we're planning to beef up the test suite of the current 
> Z-index implementation, making sure that all critical flows of Z-indexing 
> have appropriate coverage.
> [1] Actually, with more advanced analysis we could still prune the search 
> space, but this would require a substantially more sophisticated analysis, 
> which is beyond our current focus.
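
For illustration only, the intended behavior of the first two items (abandon
pruning when the query references a column that is not Z-sorted, and fall back
to a full scan if pruning itself fails) might look roughly like the sketch
below. This is not Hudi code: FileStats, prune_files, and the (column, lo, hi)
predicate shape are hypothetical names made up for this example.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass
class FileStats:
    """Per-file min/max values per indexed column (hypothetical layout)."""
    path: str
    ranges: Dict[str, Tuple[int, int]]  # column -> (min, max)


# Hypothetical predicate shape: keep rows where lo <= column <= hi.
Predicate = Tuple[str, int, int]


def _overlaps(file_range: Tuple[int, int], lo: int, hi: int) -> bool:
    """True if the file's [min, max] range intersects the queried [lo, hi]."""
    file_min, file_max = file_range
    return file_min <= hi and lo <= file_max


def prune_files(
    files: List[FileStats],
    predicates: List[Predicate],
    indexed_columns: Set[str],
) -> List[str]:
    """Return the paths that may contain matching rows."""
    all_paths = [f.path for f in files]

    # Item 1: a predicate on a column that is not covered by the index
    # means pruning is abandoned altogether rather than silently ignored.
    if any(col not in indexed_columns for col, _, _ in predicates):
        return all_paths

    try:
        return [
            f.path
            for f in files
            if all(_overlaps(f.ranges[col], lo, hi) for col, lo, hi in predicates)
        ]
    except Exception:
        # Item 2: a failure inside pruning must not fail the query;
        # the worst case is a full scan.
        return all_paths
```

With two files covering x in [0, 10] and [20, 30], a query on x in [5, 8] keeps
only the first file, while adding a predicate on an unindexed column y, or
hitting a file with missing statistics, returns every file.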



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
