Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20716 )

Change subject: Issue a warning on parameter heap_sample_every_n_bytes.
......................................................................


Patch Set 6:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/20716/6//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/20716/6//COMMIT_MSG@7
PS6, Line 7: Issue a warning
With this patch, no warning is actually issued :) This patch just adds a 
warning into the description of the flag, but the description already contains 
information about the overhead of the allocation sampling.

I guess proper summary for this patch could be:

[util] add a warning into 'heap_sample_every_n_bytes' flag description


http://gerrit.cloudera.org:8080/#/c/20716/6//COMMIT_MSG@10
PS6, Line 10: thread
            : cache request
What is 'thread cache request'?


http://gerrit.cloudera.org:8080/#/c/20716/6//COMMIT_MSG@12
PS6, Line 12: occupy
nit: 'acquire' or 'hold'


http://gerrit.cloudera.org:8080/#/c/20716/6//COMMIT_MSG@15
PS6, Line 15: (especially for upserts)
Is this essential? Any allocation on the heap will add to the lock contention, 
no?


http://gerrit.cloudera.org:8080/#/c/20716/6//COMMIT_MSG@16
PS6, Line 16: so much performance because of the lock contention of tcmalloc
Even without enabling tcmalloc sampling a system with quite high throughput 
already shows significant lock contention in tcmalloc -- you could see it in 
the thread stacks traces captured in the diagnostic logs :)


http://gerrit.cloudera.org:8080/#/c/20716/6//COMMIT_MSG@18
PS6, Line 18: 1MB
This is quite low, actually.  Empirically, setting it to hundreds of gigabytes 
makes more sense in a single tablet server of a production system, if trying 
the sampling at all.


http://gerrit.cloudera.org:8080/#/c/20716/6//COMMIT_MSG@20
PS6, Line 20: So give a warning on the parameter heap_sample_every_n_bytes in 
case
            : that some users do not know the terrible influence of it.
Even with the extra warning, users wouldn't know how bad the lock contention in 
tcmalloc could be before they try it: no quantitative information is provided 
on that in the description of the flag, before or after this patch.

If trying to add guardrails for people who don't quite understand what they are 
doing, maybe consider tagging this flag with the 'unsafe' attribute: it's 
currently tagged with just 'experimental' attribute, but adding 'unsafe' looks 
reasonable.

In addition, consider issuing a warning into the log if the 
heap_sample_every_n_bytes flag is set to a positive value -- that could be done 
in the validator for the flag.


http://gerrit.cloudera.org:8080/#/c/20716/6/src/kudu/util/flags.cc
File src/kudu/util/flags.cc:

http://gerrit.cloudera.org:8080/#/c/20716/6/src/kudu/util/flags.cc@100
PS6, Line 100: Warning: Setting it to a value greater than 0 will probably "
             :              "lead to lock contention in tcmalloc, try setting 
it manually when a sample "
             :              "is desired.
Try "setting manually" what?

how about:

WARNING: setting this flag to a positive value increases lock contention in 
tcmalloc; the lower the value, the higher the lock contention



--
To view, visit http://gerrit.cloudera.org:8080/20716
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I00d55126f0cad0ac6cc130576cc45808f96f0a47
Gerrit-Change-Number: 20716
Gerrit-PatchSet: 6
Gerrit-Owner: Song Jiacheng <songjiach...@thinkingdata.cn>
Gerrit-Reviewer: Alexey Serbin <ale...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Marton Greber <greber...@gmail.com>
Gerrit-Reviewer: Song Jiacheng <songjiach...@thinkingdata.cn>
Gerrit-Comment-Date: Tue, 21 Nov 2023 17:27:44 +0000
Gerrit-HasComments: Yes

Reply via email to