[ https://issues.apache.org/jira/browse/HUDI-5557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alexey Kudinkin updated HUDI-5557:
----------------------------------
    Priority: Blocker  (was: Critical)

> Wrong candidate files found in metadata table
> ----------------------------------------------
>
>                 Key: HUDI-5557
>                 URL: https://issues.apache.org/jira/browse/HUDI-5557
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: metadata, spark-sql
>    Affects Versions: 0.12.1
>            Reporter: ruofan
>            Priority: Blocker
>              Labels: pull-request-available
>             Fix For: 0.13.1
>
>
> Suppose a Hudi table has five fields, but only two of them are indexed. When
> part of the filter condition in a SQL query refers to indexed fields and the
> other part refers to non-indexed fields, the candidate files returned from the
> metadata table are wrong.
> For example, given the following Hudi table schema:
> {code:java}
> name: varchar(128)
> age: int
> addr: varchar(128)
> city: varchar(32)
> job: varchar(32) {code}
> Table properties:
> {code:java}
> hoodie.table.type=MERGE_ON_READ
> hoodie.metadata.enable=true
> hoodie.metadata.index.column.stats.enable=true
> hoodie.metadata.index.column.stats.column.list='name,city'
> hoodie.enable.data.skipping=true {code}
> SQL:
> {code:java}
> select * from hudi_table where name='tom' and age=18; {code}
> If we set hoodie.enable.data.skipping=false, the query returns the expected
> data. But if we set hoodie.enable.data.skipping=true, the expected data is
> not found.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
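
The behavior the report expects from data skipping can be illustrated with a small Python sketch. This is not Hudi's implementation: `FileStats`, `may_match`, and `candidate_files` are hypothetical names, and the file names and min/max values are invented for illustration. The sketch assumes per-file min/max statistics exist only for the indexed columns (`name`, `city`), so a predicate on a non-indexed column such as `age` cannot be used to rule any file out.

```python
from dataclasses import dataclass


@dataclass
class FileStats:
    """Per-file min/max values for indexed columns (hypothetical structure)."""
    path: str
    min_max: dict  # column name -> (min value, max value)


def may_match(stats, column, value):
    """Return False only when the stats prove the file cannot contain `value`."""
    if column not in stats.min_max:
        # No stats for this column: the file cannot be ruled out.
        return True
    lo, hi = stats.min_max[column]
    return lo <= value <= hi


def candidate_files(files, equality_filters):
    """Keep files that may satisfy every `column = value` conjunct."""
    return [
        f.path
        for f in files
        if all(may_match(f, col, val) for col, val in equality_filters.items())
    ]


# Stats exist only for the indexed columns `name` and `city` (made-up values).
files = [
    FileStats("file-1.parquet", {"name": ("alice", "tom"), "city": ("austin", "paris")}),
    FileStats("file-2.parquet", {"name": ("ursula", "zoe"), "city": ("berlin", "rome")}),
]

# Mirrors: select * from hudi_table where name='tom' and age=18
filters = {"name": "tom", "age": 18}
print(candidate_files(files, filters))
```

In this sketch the `age = 18` conjunct never removes a file because `age` has no statistics; only the `name = 'tom'` conjunct prunes. A pruner that instead dropped files for which a column has no statistics would lose rows in the way the report describes, although the report itself does not state the root cause.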