Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/8034 )
Change subject: IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder ...................................................................... Patch Set 8: (8 comments) http://gerrit.cloudera.org:8080/#/c/8034/8/be/src/exec/hdfs-parquet-scanner.cc File be/src/exec/hdfs-parquet-scanner.cc: http://gerrit.cloudera.org:8080/#/c/8034/8/be/src/exec/hdfs-parquet-scanner.cc@274 PS8, Line 274: // Before closing the Column readers, the accounted bytes in mem_tracker for : // dict_decoder_ is released. : if (mem_tracker_ != nullptr) { : for (ParquetColumnReader* col_reader : column_readers_) col_reader->Cleanup(); : } I think this code won't be necessary if col_reader->Close() does the dictionary close. http://gerrit.cloudera.org:8080/#/c/8034/8/be/src/exec/hdfs-parquet-scanner.cc@348 PS8, Line 348: CloseAndUnregisterFromParent Same question here as elsewhere: Do we really need CloseAndUnregisterFromParent() or will Close() do? http://gerrit.cloudera.org:8080/#/c/8034/8/be/src/exec/hdfs-parquet-table-writer.cc File be/src/exec/hdfs-parquet-table-writer.cc: http://gerrit.cloudera.org:8080/#/c/8034/8/be/src/exec/hdfs-parquet-table-writer.cc@178 PS8, Line 178: dict_encoder_base_->ClearIndices(); This should call the new base Close() function (which should do ClearIndices()). http://gerrit.cloudera.org:8080/#/c/8034/8/be/src/exec/hdfs-parquet-table-writer.cc@1057 PS8, Line 1057: CloseAndUnregisterFromParent(); I think we want to limit use of CloseAndUnregisterFromParent() to cases that really need it. I think the ordinary Close() should work here. Check if that works. http://gerrit.cloudera.org:8080/#/c/8034/8/be/src/exec/parquet-column-readers.h File be/src/exec/parquet-column-readers.h: http://gerrit.cloudera.org:8080/#/c/8034/8/be/src/exec/parquet-column-readers.h@234 PS8, Line 234: /// Delete the bytes used for memory tracking. : virtual void Cleanup() {} I think this can be folded into Close() and removed. http://gerrit.cloudera.org:8080/#/c/8034/8/be/src/exec/parquet-column-readers.cc File be/src/exec/parquet-column-readers.cc: http://gerrit.cloudera.org:8080/#/c/8034/8/be/src/exec/parquet-column-readers.cc@269 PS8, Line 269: virtual void Cleanup() { : dict_decoder_.Close(); : } I think this can be folded into Close() without a separate Cleanup() function. http://gerrit.cloudera.org:8080/#/c/8034/8/be/src/util/dict-encoding.h File be/src/util/dict-encoding.h: http://gerrit.cloudera.org:8080/#/c/8034/8/be/src/util/dict-encoding.h@107 PS8, Line 107: DictEncoder(MemPool* pool, int encoded_value_size, MemTracker* mem_tracker) : : DictEncoderBase(pool), buckets_(HASH_TABLE_SIZE, Node::INVALID_INDEX), : encoded_value_size_(encoded_value_size), : dict_mem_tracker_(mem_tracker), : dict_bytes_cnt_(0) { } As I understand it, the DictEncoderBase/DictEncoder split is that DictEncoderBase contains everything that does not rely on T and DictEncoder contains the type-specific stuff. The same split applies for DictDecoderBase/DictDecoder. Since the memory tracking is not type specific, I think it makes sense to put it in the base classes. This should simplify the code in HdfsParquetWriter, because then you can call Close() on the DictEncoderBase rather than having to go to the typed version. Close() on DictEncoderBase should also do ClearIndices(). http://gerrit.cloudera.org:8080/#/c/8034/8/be/src/util/dict-encoding.h@251 PS8, Line 251: ReleaseBytes(); Is there any codepath that should legitimately use this? If not, DCHECK that dict_bytes_cnt_ is zero. -- To view, visit http://gerrit.cloudera.org:8080/8034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299 Gerrit-Change-Number: 8034 Gerrit-PatchSet: 8 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Bikramjeet Vig <bikramjeet....@cloudera.com> Gerrit-Reviewer: Joe McDonnell <joemcdonn...@cloudera.com> Gerrit-Reviewer: Pranay Singh Gerrit-Comment-Date: Thu, 12 Oct 2017 17:23:41 +0000 Gerrit-HasComments: Yes