Alexander Saydakov has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17869 )

Change subject: IMPALA-10956: datasketches UDFs: memory leak and merge overhead
......................................................................


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/17869/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17869/3//COMMIT_MSG@9
PS3, Line 9: - call destructors of sketch and union objects
> Thanks a lot for fixing this!
Sorry for the late reply. It did not seem to me that these comments invite a 
discussion or something might be expected from me. I am replying just in case I 
misunderstood, and I hope to move this forward.

Yes, ideally we should have an allocator, and I even started writing one, but 
it turned out that the version of Apache DataSketches currently used in Impala 
is not ready for this yet. There are some changes in the library to support 
this, but not released yet. It should happen soon. In the meantime this 
proposed change should make things better than before. I believe that the leaks 
should not happen in normal circumstances anymore (only in case of thrown 
exceptions).

The next step after adopting this change would be to fix the dependency on 
Apache Datasketches, so we could easily upgrade to the latest version, and then 
work on the allocator to take this to the next level. I believe there is a 
ticket open for the dependency issue.


http://gerrit.cloudera.org:8080/#/c/17869/3/be/src/exprs/aggregate-functions-ir.cc
File be/src/exprs/aggregate-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/17869/3/be/src/exprs/aggregate-functions-ir.cc@1689
PS3, Line 1689:   agg_state_ptr->second = new 
(ctx->Allocate<datasketches::hll_sketch>())
              :       datasketches::hll_sketch(DS_SKETCH_CONFIG, DS_HLL_TYPE);
> DsHllInit always initializes a hll_sketch while we'll only modify it if DsH
As I understand, there is always the update phase of the processing before 
moving on to the merge phase. Even if I am wrong, and it is not so, the 
overhead of this unnecessary constructor is not that high, and it was there 
before, so no regression here. On the contrary, some unnecessary complexity was 
eliminated in this change. Even if not perfect, this should be an improvement.



--
To view, visit http://gerrit.cloudera.org:8080/17869
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8dd0e6736f4266f74f5f265f58d40a4e4707287f
Gerrit-Change-Number: 17869
Gerrit-PatchSet: 3
Gerrit-Owner: Alexander Saydakov <al...@apache.org>
Gerrit-Reviewer: Alexander Saydakov <al...@apache.org>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Fucun Chu <chufu...@hotmail.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Comment-Date: Sat, 04 Dec 2021 05:28:01 +0000
Gerrit-HasComments: Yes

Reply via email to