Looking at the failed Jenkins runs for HIVE-5998, I see there are diffs in the statistics in the EXPLAIN:
Running: diff -a /root/hive/itests/qtest/../../itests/qtest/target/qfile-results/clientpositive/vectorized_parquet.q.out /root/hive/itests/qtest/../../ql/src/test/results/clientpositive/vectorized_parquet.q.out
72c72
< Statistics: Num rows: 12288 Data size: 73728 Basic stats: COMPLETE Column stats: NONE
---
> Statistics: Num rows: 2072 Data size: 257046 Basic stats: COMPLETE Column stats: NONE
75c75
< Statistics: Num rows: 6144 Data size: 36864 Basic stats: COMPLETE Column stats: NONE
---
> Statistics: Num rows: 1036 Data size: 128523 Basic stats: COMPLETE Column stats: NONE
79c79
< Statistics: Num rows: 6144 Data size: 36864 Basic stats: COMPLETE Column stats: NONE
---
> Statistics: Num rows: 1036 Data size: 128523 Basic stats: COMPLETE Column stats: NONE
82c82
< Statistics: Num rows: 10 Data size: 60 Basic stats: COMPLETE Column stats: NONE
---
> Statistics: Num rows: 10 Data size: 1240 Basic stats: COMPLETE Column stats: NONE

What would cause such statistics diffs?

The Parquet file is created as:

create table if not exists alltypes_parquet (
  cint int,
  ctinyint tinyint,
  csmallint smallint,
  cfloat float,
  cdouble double,
  cstring1 string) stored as parquet;

insert overwrite table alltypes_parquet
  select cint, ctinyint, csmallint, cfloat, cdouble, cstring1
  from alltypesorc;

Note that there are no diffs in the actual query results.

Thanks,
~Remus