Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/10951 )
Change subject: IMPALA-7304: Don't write floating column index until PARQUET-1222 is resolved.
......................................................................

Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10951/1/be/src/exec/hdfs-parquet-table-writer.cc
File be/src/exec/hdfs-parquet-table-writer.cc:

http://gerrit.cloudera.org:8080/#/c/10951/1/be/src/exec/hdfs-parquet-table-writer.cc@343
PS1, Line 343: if (std::is_floating_point<T>::value) valid_column_index_ = false;
> Will this still allow page skipping for predicates on other columns?
Yes. At L1261 you can see that we skip column index writing only for columns that have valid_column_index_ == false, and we still write the offset index for all columns. So if, for example, we have a predicate on an int32 column, and the page index (column index + offset index) tells us that we only need the rows between 100 and 200, we can still use that row range to filter pages from all of the columns with the help of the offset index.

-- 
To view, visit http://gerrit.cloudera.org:8080/10951
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I50aa2e6607de6a8943eb068b8162b0506763078b
Gerrit-Change-Number: 10951
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Reviewer: Lars Volker <l...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Comment-Date: Mon, 16 Jul 2018 21:48:47 +0000
Gerrit-HasComments: Yes
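
For illustration only, the reply to the review comment can be sketched roughly as follows. This is not the code in hdfs-parquet-table-writer.cc: ColumnSketch, PageLocation and PagesOverlapping are hypothetical names, and the layout of the offset index is a simplifying assumption (one first-row/row-count pair per page). The sketch only shows why a row range derived from an int32 column can still be used to skip pages of a float column that has no column index.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>
#include <vector>

// Hypothetical, simplified stand-in for one offset index entry.
struct PageLocation {
  int64_t first_row;  // index of the first row stored in the page
  int64_t num_rows;   // number of rows stored in the page
};

// Hypothetical per-column state, loosely mirroring the check at PS1 line 343.
template <typename T>
struct ColumnSketch {
  // Floating-point columns get no column index (min/max per page).
  bool valid_column_index = !std::is_floating_point<T>::value;
  // The offset index is kept for every column, whatever its type.
  std::vector<PageLocation> offset_index;
};

// Returns the indices of the pages that overlap the inclusive row range
// [first, last]. Only the offset index is needed, so this also works for
// columns whose column index was not written.
inline std::vector<size_t> PagesOverlapping(
    const std::vector<PageLocation>& offset_index, int64_t first, int64_t last) {
  assert(first <= last);
  std::vector<size_t> pages;
  for (size_t i = 0; i < offset_index.size(); ++i) {
    const int64_t page_first = offset_index[i].first_row;
    const int64_t page_last = page_first + offset_index[i].num_rows - 1;
    if (page_first <= last && page_last >= first) pages.push_back(i);
  }
  return pages;
}
```

Under these assumptions, a predicate on the int32 column yields the row range, and PagesOverlapping applied to the float column's offset_index picks out the pages to read there, even though that column's valid_column_index is false.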