Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8034 )

Change subject: IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder
......................................................................


Patch Set 8:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/8034/8/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/8034/8/be/src/exec/hdfs-parquet-scanner.cc@274
PS8, Line 274:   // Before closing the Column readers, the accounted bytes in 
mem_tracker for
             :   // dict_decoder_ is released.
             :   if (mem_tracker_ != nullptr) {
             :     for (ParquetColumnReader* col_reader : column_readers_) 
col_reader->Cleanup();
             :   }
I think this code won't be necessary if col_reader->Close() does the dictionary 
close.


http://gerrit.cloudera.org:8080/#/c/8034/8/be/src/exec/hdfs-parquet-scanner.cc@348
PS8, Line 348: CloseAndUnregisterFromParent
Same question here as elsewhere: Do we really need 
CloseAndUnregisterFromParent() or will Close() do?


http://gerrit.cloudera.org:8080/#/c/8034/8/be/src/exec/hdfs-parquet-table-writer.cc
File be/src/exec/hdfs-parquet-table-writer.cc:

http://gerrit.cloudera.org:8080/#/c/8034/8/be/src/exec/hdfs-parquet-table-writer.cc@178
PS8, Line 178: dict_encoder_base_->ClearIndices();
This should call the new base Close() function (which should do ClearIndices()).


http://gerrit.cloudera.org:8080/#/c/8034/8/be/src/exec/hdfs-parquet-table-writer.cc@1057
PS8, Line 1057: CloseAndUnregisterFromParent();
I think we want to limit use of CloseAndUnregisterFromParent() to cases that 
really need it. I think the ordinary Close() should work here. Check if that 
works.


http://gerrit.cloudera.org:8080/#/c/8034/8/be/src/exec/parquet-column-readers.h
File be/src/exec/parquet-column-readers.h:

http://gerrit.cloudera.org:8080/#/c/8034/8/be/src/exec/parquet-column-readers.h@234
PS8, Line 234:   /// Delete the bytes used for memory tracking.
             :   virtual void Cleanup() {}
I think this can be folded into Close() and removed.


http://gerrit.cloudera.org:8080/#/c/8034/8/be/src/exec/parquet-column-readers.cc
File be/src/exec/parquet-column-readers.cc:

http://gerrit.cloudera.org:8080/#/c/8034/8/be/src/exec/parquet-column-readers.cc@269
PS8, Line 269:   virtual void Cleanup() {
             :     dict_decoder_.Close();
             :   }
I think this can be folded into Close() without a separate Cleanup() function.


http://gerrit.cloudera.org:8080/#/c/8034/8/be/src/util/dict-encoding.h
File be/src/util/dict-encoding.h:

http://gerrit.cloudera.org:8080/#/c/8034/8/be/src/util/dict-encoding.h@107
PS8, Line 107:   DictEncoder(MemPool* pool, int encoded_value_size, MemTracker* 
mem_tracker) :
             :       DictEncoderBase(pool), buckets_(HASH_TABLE_SIZE, 
Node::INVALID_INDEX),
             :       encoded_value_size_(encoded_value_size),
             :       dict_mem_tracker_(mem_tracker),
             :       dict_bytes_cnt_(0) { }
As I understand it, the DictEncoderBase/DictEncoder split is that 
DictEncoderBase contains everything that does not rely on T and DictEncoder 
contains the type-specific stuff. The same split applies for 
DictDecoderBase/DictDecoder. Since the memory tracking is not type specific, I 
think it makes sense to put it in the base classes. This should simplify the 
code in HdfsParquetWriter, because then you can call Close() on the 
DictEncoderBase rather than having to go to the typed version. Close() on 
DictEncoderBase should also do ClearIndices().


http://gerrit.cloudera.org:8080/#/c/8034/8/be/src/util/dict-encoding.h@251
PS8, Line 251: ReleaseBytes();
Is there any codepath that should legitimately use this? If not, DCHECK that 
dict_bytes_cnt_ is zero.



--
To view, visit http://gerrit.cloudera.org:8080/8034
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299
Gerrit-Change-Number: 8034
Gerrit-PatchSet: 8
Gerrit-Owner: Pranay Singh
Gerrit-Reviewer: Bikramjeet Vig <bikramjeet....@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <joemcdonn...@cloudera.com>
Gerrit-Reviewer: Pranay Singh
Gerrit-Comment-Date: Thu, 12 Oct 2017 17:23:41 +0000
Gerrit-HasComments: Yes

Reply via email to