Zoltan Borok-Nagy has uploaded this change for review. ( http://gerrit.cloudera.org:8080/17071 )
Change subject: IMPALA-10501: Hit DCHECK in parquet-column-readers.cc: def_levels_.CacheRemaining() <= num_buffered_values_
......................................................................

IMPALA-10501: Hit DCHECK in parquet-column-readers.cc: def_levels_.CacheRemaining() <= num_buffered_values_

We had a DCHECK in ScalarColumnReader::MaterializeValueBatch() that
checked that 'num_buffered_values_' is greater than or equal to the
number of cached values in the Parquet definition level decoder.

However, the decoder might contain more values because literal runs
are stored in groups of 8, i.e. there might be padding zeros at the
end. Also, the decoder doesn't know the exact number of actual
values; it is up to the client of the decoder to keep track of the
number of values.

I removed this wrong assumption from MaterializeValueBatch() and
modified the code accordingly.

Testing
 * until this patch, TestParquetStats::test_page_index was flaky
   because of this issue
 * I tested the solution on a hacked Impala that randomly generated
   skip ranges

Change-Id: Ic071473e7b315300fd5e163225d3e39735f09c4f
---
M be/src/exec/parquet/parquet-column-readers.cc
1 file changed, 6 insertions(+), 2 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/71/17071/1
-- 
To view, visit http://gerrit.cloudera.org:8080/17071
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ic071473e7b315300fd5e163225d3e39735f09c4f
Gerrit-Change-Number: 17071
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy <borokna...@cloudera.com>
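
A rough illustration of why the removed DCHECK could fire, assuming the padding behaviour described in the commit message (literal runs stored in groups of 8). This is not the patch itself: the helper names PaddedLiteralRunLength() and UsableCachedLevels() are hypothetical and do not appear in Impala. The sketch only shows that a decoder cache can legitimately hold more entries than the client's count of real values, so the client has to clamp instead of asserting.

```cpp
#include <cstdint>

// Hypothetical helper (not from Impala): number of slots a literal run
// occupies when it is padded up to a whole number of groups of 8.
constexpr int32_t PaddedLiteralRunLength(int32_t num_actual_values) {
  return (num_actual_values + 7) / 8 * 8;
}

// Hypothetical helper (not from Impala): the decoder cannot tell real
// levels from trailing padding, so the client limits what it consumes
// to the number of values it knows are still buffered.
constexpr int32_t UsableCachedLevels(int32_t cache_remaining,
                                     int32_t num_buffered_values) {
  return cache_remaining < num_buffered_values ? cache_remaining
                                               : num_buffered_values;
}

// 13 real definition levels are padded to 16 cached entries, so
// "cache remaining <= buffered values" does not hold (16 > 13).
static_assert(PaddedLiteralRunLength(13) == 16, "padded to two groups of 8");
static_assert(UsableCachedLevels(16, 13) == 13, "client clamps to real count");
```

With 13 values left in a page, the cache may report 16 entries; the last 3 are padding zeros that the client must ignore.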