Pooja Nilangekar has uploaded a new patch set (#4). Change subject: IMPALA-5061: Populate null_count in parquet::statistics ......................................................................
IMPALA-5061: Populate null_count in parquet::statistics The null_count in the statistics field is updated each time a null value is encountered by paquet table writer. The value is written to the parquet header if it has one or more null values in the row_group. Testing: Modified the existing end-to-end test in the test_insert_parquet.py file to make sure each parquet header has the appropriate null_count. Verified the correctness of the nulltable test and added an additional test which populates a parquet file with the functional_parquet.zipcode_incomes table and ensures that the expected null_count is populated. Change-Id: I4c49a63af84c2234f0633be63206cb52eb7e8ebb --- M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-parquet-table-writer.cc M be/src/exec/parquet-column-stats.cc M be/src/exec/parquet-column-stats.h M be/src/exec/parquet-column-stats.inline.h M tests/query_test/test_insert_parquet.py 6 files changed, 180 insertions(+), 100 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/58/7058/4 -- To view, visit http://gerrit.cloudera.org:8080/7058 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I4c49a63af84c2234f0633be63206cb52eb7e8ebb Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Pooja Nilangekar <pooja.nilange...@cloudera.com> Gerrit-Reviewer: Lars Volker <l...@cloudera.com> Gerrit-Reviewer: Pooja Nilangekar <pooja.nilange...@cloudera.com>