Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/17980 )
Change subject: IMPALA-10984: Improve TimestampValue to String casting ...................................................................... Patch Set 3: (4 comments) Thank you everyone for the feedbacks. Will address them in the next patch set. I'd like to answer some of the questions first. http://gerrit.cloudera.org:8080/#/c/17980/3/be/src/runtime/timestamp-parse-util.h File be/src/runtime/timestamp-parse-util.h: http://gerrit.cloudera.org:8080/#/c/17980/3/be/src/runtime/timestamp-parse-util.h@128 PS3, Line 128: int max_length > I wonder if we need this argument, since the source is 'str' which has a le max_length here is precautionary hint passed from TimestampParser::Format. When writing, we should not hit case where pos becomes > max_length. If that does happens, DCHECK error should be raised at TimestampParser::Format. http://gerrit.cloudera.org:8080/#/c/17980/2/be/src/runtime/timestamp-parse-util.cc File be/src/runtime/timestamp-parse-util.cc: http://gerrit.cloudera.org:8080/#/c/17980/2/be/src/runtime/timestamp-parse-util.cc@271 PS2, Line 271: > optional: we could optimize the case when day/month/year are all needed by This is interesting. We should look at it in the next JIRA. http://gerrit.cloudera.org:8080/#/c/17980/3/be/src/runtime/timestamp-value.cc File be/src/runtime/timestamp-value.cc: http://gerrit.cloudera.org:8080/#/c/17980/3/be/src/runtime/timestamp-value.cc@86 PS3, Line 86: dst.resize(written) > I wonder why resize once dst is populated. It is possible that we allocate larger than we actually needs. For example, "cast(now() as string format 'HH24:mm:ss')" will allocate 10 bytes at first. But the output result in only 8 bytes, like "17:10:57". So the second resize here is done to update the dst.length(). http://gerrit.cloudera.org:8080/#/c/17980/3/be/src/runtime/timestamp-value.cc@222 PS3, Line 222: StringVal result(ctx, max_length); > When is this memory released? Seems this will still allocate via ctx for ev The allocation comes from exps_results_pool_. In between RowBatch processing, exec nodes will call QueryMaintenance() method in some frequencies to return all allocated memory so far back to MemPool though expr_results_pool_->Clear(). https://github.com/apache/impala/blob/b42c649/be/src/exec/exec-node.cc#L483 This, however, does not free the memory. MemPool will retain the memory chunks so that it can be reused in the next RowBatch processing. After the exec node completes, it will call Close() and free the memory chunks back by calling expr_results_pool_->FreeAll(). https://github.com/apache/impala/blob/b42c649/be/src/exec/exec-node.cc#L320 Contrast to Clear(), FreeAll() does free the memory. IMPALA-5844 explain more about this detail. There is also IMPALA-2399 that discuss about plan to remove QueryMaintenance(). -- To view, visit http://gerrit.cloudera.org:8080/17980 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4fcb4545d9c9a3fdb38c4db58bb4b1321a429d61 Gerrit-Change-Number: 17980 Gerrit-PatchSet: 3 Gerrit-Owner: Riza Suminto <riza.sumi...@cloudera.com> Gerrit-Reviewer: Bikramjeet Vig <bikramjeet....@cloudera.com> Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Kurt Deschler <kdesc...@cloudera.com> Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com> Gerrit-Reviewer: Riza Suminto <riza.sumi...@cloudera.com> Gerrit-Comment-Date: Fri, 29 Oct 2021 19:32:16 +0000 Gerrit-HasComments: Yes