Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/11949 )
Change subject: IMPALA-7869: break up parquet-column-readers.cc ...................................................................... Patch Set 7: (4 comments) http://gerrit.cloudera.org:8080/#/c/11949/7/be/src/exec/parquet-bool-decoder.h File be/src/exec/parquet-bool-decoder.h: http://gerrit.cloudera.org:8080/#/c/11949/7/be/src/exec/parquet-bool-decoder.h@17 PS7, Line 17: We have several parquet_* files in exec/ - it would be nice to move this to a separate directory like exex/parquet. I am not sure that would worth the extra pain during future merges, but if we want to do this, then this commit seems the right time. http://gerrit.cloudera.org:8080/#/c/11949/7/be/src/exec/parquet-column-readers.cc File be/src/exec/parquet-column-readers.cc: http://gerrit.cloudera.org:8080/#/c/11949/7/be/src/exec/parquet-column-readers.cc@74 PS7, Line 74: i Where is this new variable used? http://gerrit.cloudera.org:8080/#/c/11949/7/be/src/exec/parquet-column-readers.cc@284 PS7, Line 284: /// The size of this column with plain encoding for FIXED_LEN_BYTE_ARRAY, or : /// the max length for VARCHAR columns. Unused otherwise. : int fixed_len_size_; : : /// Contains extra data needed for Timestamp decoding. : ParquetTimestampDecoder timestamp_decoder_; : : /// Contains extra state required to decode boolean values. Only initialised for : /// BOOLEAN columns. : unique_ptr<ParquetBoolDecoder> bool_decoder_; A note about members that are used only in specific types: I have a plan to create a class like template <typename InternalType, parquet::Type::type PARQUET_TYPE> ParquetTypeDecoder {} whose specializations could add members needed for special cases like TIMESTAMP/CHAR(N)/BOOlEAN. The main benefit of moving this logic into another class is that it could be reused in DictDecoder during the implementation of IMPALA-4994. http://gerrit.cloudera.org:8080/#/c/11949/7/be/src/exec/parquet-level-decoder.h File be/src/exec/parquet-level-decoder.h: http://gerrit.cloudera.org:8080/#/c/11949/7/be/src/exec/parquet-level-decoder.h@122 PS7, Line 122: /// The parquet encoding used for the levels. Only RLE is supported for now. : parquet::Encoding::type encoding_ = parquet::Encoding::PLAIN; This was not changed, but looks a bit weird (only RLE is supported, that it is set to PLAIN) 'encoding_' could be removed to simplify code - it looks unlikely that there will be another encoding for levels in the future. -- To view, visit http://gerrit.cloudera.org:8080/11949 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0efd5c50b781fe9e3c022b33c66c06cfb529c0b8 Gerrit-Change-Number: 11949 Gerrit-PatchSet: 7 Gerrit-Owner: Tim Armstrong <tarmstr...@cloudera.com> Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Comment-Date: Tue, 20 Nov 2018 15:11:03 +0000 Gerrit-HasComments: Yes