Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/9358 )
Change subject: IMPALA-6538: Fix read path when Parquet min/max statistics contain NaN
......................................................................

IMPALA-6538: Fix read path when Parquet min/max statistics contain NaN

If the first number in a row group written by Impala is NaN, then Impala
writes incorrect statistics in the metadata. This leads to incorrect
results when filtering the data. This commit fixes the read path when
encountering NaNs in Parquet min/max statistics. If min and max are both
NaN, we can't use the statistics at all. If only one of them is NaN, the
other can still be used.

I added some tests to QueryTest/parquet-stats.test

Change-Id: If3897fc1426541239223670812f59e2bed32f455
Reviewed-on: http://gerrit.cloudera.org:8080/9358
Reviewed-by: Tim Armstrong <tarmstr...@cloudera.com>
Tested-by: Impala Public Jenkins
---
M be/src/exec/parquet-column-stats.cc
M testdata/data/README
A testdata/data/min_max_is_nan.parquet
A testdata/workloads/functional-query/queries/QueryTest/parquet-invalid-minmax-stats.test
M testdata/workloads/functional-query/queries/QueryTest/parquet-stats.test
M tests/query_test/test_parquet_stats.py
6 files changed, 137 insertions(+), 2 deletions(-)

Approvals:
  Tim Armstrong: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/9358
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: If3897fc1426541239223670812f59e2bed32f455
Gerrit-Change-Number: 9358
Gerrit-PatchSet: 10
Gerrit-Owner: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <gaborkas...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker <l...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Reviewer: Zoltan Ivanfi <zi+ger...@cloudera.com>
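
For illustration only: the rule stated in the commit message (both bounds NaN means the statistics are unusable; a single NaN bound leaves the other usable) could be sketched roughly as below. This is not the code from be/src/exec/parquet-column-stats.cc. The names UsableStats, CheckFloatStats and CanSkipForGreaterThan are hypothetical, and the "col > threshold" predicate is an assumed example of a filter that needs only the max statistic.

```cpp
#include <cmath>

// Hypothetical type for this sketch; not Impala's actual API.
struct UsableStats {
  bool min_usable;
  bool max_usable;
};

// A NaN bound carries no ordering information, so it is ignored.
// If both bounds are NaN, neither is usable; if only one is NaN,
// the other bound can still be used for filtering.
template <typename T>
UsableStats CheckFloatStats(T min_value, T max_value) {
  return UsableStats{!std::isnan(min_value), !std::isnan(max_value)};
}

// Assumed example predicate: `col > threshold`.
// Returns true only when the max statistic proves no row can match.
template <typename T>
bool CanSkipForGreaterThan(T min_value, T max_value, T threshold) {
  UsableStats stats = CheckFloatStats(min_value, max_value);
  if (!stats.max_usable) return false;  // nothing can be proven; read the row group
  return max_value <= threshold;
}
```

With this shape, a NaN min does not prevent skipping on the max bound, while a NaN max forces the row group to be read.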