Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20377 )

Change subject: IMPALA-12385: Enable Periodic metrics by default
......................................................................


Patch Set 2:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/20377/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/20377/1//COMMIT_MSG@12
PS1, Line 12: resource_trace_ratio to 1
AFAIK, there is a pretty significant overhead in always sampling these metrics. 
Parsing /proc/stat, /proc/net/dev, and /proc/diskstats does not seem to come 
cheap. I'm not sure this should be enabled by default.
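
One rough way to put a number on this is sketched below. It is only a 
micro-benchmark idea, not code from the patch: MeanReadMicros is a 
hypothetical helper, and it measures just the cost of reading a file in full, 
not the parsing Impala does on top of it.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <fstream>
#include <sstream>
#include <string>

// Hypothetical helper (not part of Impala): returns the mean wall-clock time,
// in microseconds, to open and fully read `path`, averaged over `iterations`
// reads. Returns -1.0 if the file cannot be opened.
double MeanReadMicros(const std::string& path, int iterations) {
  assert(iterations > 0);
  using Clock = std::chrono::steady_clock;
  std::size_t total_bytes = 0;
  const Clock::time_point start = Clock::now();
  for (int i = 0; i < iterations; ++i) {
    std::ifstream in(path);
    if (!in.is_open()) return -1.0;
    std::ostringstream contents;
    contents << in.rdbuf();  // Read the whole file, as a sampler would.
    total_bytes += contents.str().size();
  }
  const std::chrono::duration<double, std::micro> elapsed = Clock::now() - start;
  (void)total_bytes;  // Only accumulated so the reads are actually consumed.
  return elapsed.count() / iterations;
}
```

Calling MeanReadMicros("/proc/stat", 1000), and the same for /proc/net/dev 
and /proc/diskstats, on a busy host would give a lower bound on the per-sample 
cost to weigh against the sampling period.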


http://gerrit.cloudera.org:8080/#/c/20377/1/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

http://gerrit.cloudera.org:8080/#/c/20377/1/be/src/runtime/query-state.cc@221
PS1, Line 221: AddSamplingTimeSeriesCounter
Will this cause interpretation problems if different hosts happen to resize 
their sampling periods differently?
In contrast, ChunkedTimeSeriesCounter does not resize its sampling period, right?


http://gerrit.cloudera.org:8080/#/c/20377/1/be/src/util/periodic-counter-updater.cc
File be/src/util/periodic-counter-updater.cc:

http://gerrit.cloudera.org:8080/#/c/20377/1/be/src/util/periodic-counter-updater.cc@30
PS1, Line 30: periodic_counter_update_period_ms, 50
I'm a bit concerned about lowering this by 10x. Can the code in 
PeriodicCounterUpdater::UpdateLoop() keep up with such a short sampling period 
under highly concurrent queries? It looks like PeriodicCounterUpdater is a 
singleton per impalad.


http://gerrit.cloudera.org:8080/#/c/20377/1/be/src/util/runtime-profile-counters.h
File be/src/util/runtime-profile-counters.h:

http://gerrit.cloudera.org:8080/#/c/20377/1/be/src/util/runtime-profile-counters.h@807
PS1, Line 807: typedef StreamingSampler<int64_t, 64> StreamingCounterSampler;
If initial_period = 50ms and MAX_SAMPLES = 64, it will take 3200ms 
before the sampling period doubles to 100ms. Will this hurt the performance of 
short-latency queries?
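
To make the arithmetic concrete, here is a small model of the doubling 
behavior. It is a sketch, not Impala's StreamingSampler: the function names 
are hypothetical, and it assumes that when the buffer fills, samples are 
collapsed pairwise into half the buffer and the period doubles.

```cpp
#include <cassert>
#include <cstdint>

// Simplified model, not Impala's StreamingSampler. Assumes that when the
// buffer of max_samples fills, samples are collapsed pairwise into half the
// buffer and the sampling period doubles.

// Milliseconds until the buffer first fills and the period doubles.
int64_t MsUntilFirstDoubling(int64_t initial_period_ms, int max_samples) {
  assert(initial_period_ms > 0 && max_samples > 0);
  return initial_period_ms * max_samples;
}

// Sampling period in effect once elapsed_ms have passed. Under the assumption
// above, a full buffer always spans period * max_samples milliseconds, so the
// period keeps doubling until that span exceeds the elapsed time.
int64_t PeriodAfterMs(int64_t initial_period_ms, int max_samples,
                      int64_t elapsed_ms) {
  assert(initial_period_ms > 0 && max_samples > 0 && elapsed_ms >= 0);
  int64_t period_ms = initial_period_ms;
  while (elapsed_ms >= period_ms * max_samples) period_ms *= 2;
  return period_ms;
}
```

Under this model, MsUntilFirstDoubling(50, 64) is 3200, and 
PeriodAfterMs(50, 64, t) is 50 for t below 3200, 100 at t = 3200, and 200 at 
t = 6400. With an initial period of 500, the period is still 500 at t = 6400.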


http://gerrit.cloudera.org:8080/#/c/20377/1/be/src/util/streaming-sampler.h
File be/src/util/streaming-sampler.h:

http://gerrit.cloudera.org:8080/#/c/20377/1/be/src/util/streaming-sampler.h@40
PS1, Line 40: int initial_period
I'd rather keep this default at 500, and add a new parameter to 
AddSamplingTimeSeriesCounter for a customized initial_period. I see this kind of 
counter being used in other places, such as the following:

be/src/runtime/fragment-instance-state.cc:  mem_usage_sampled_counter_ = 
profile()->AddSamplingTimeSeriesCounter("MemoryUsage",
be/src/runtime/fragment-instance-state.cc:  thread_usage_sampled_counter_ = 
profile()->AddSamplingTimeSeriesCounter("ThreadUsage",
be/src/runtime/krpc-data-stream-recvr.cc:      
enqueue_profile_->AddSamplingTimeSeriesCounter("DeferredQueueSize", TUnit::UNIT,

Their sampling period should probably stay at 500, while sampling counters from 
host_profile_ start at a lower initial_period.
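
Roughly what I have in mind is sketched below. ProfileSketch, 
AddSamplingCounter, and kDefaultInitialPeriodMs are hypothetical stand-ins 
for illustration only; the real AddSamplingTimeSeriesCounter takes other 
arguments that are not reproduced here.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical default, mirroring the 500ms value suggested above.
constexpr int kDefaultInitialPeriodMs = 500;

// Stand-in for a runtime profile; NOT Impala's RuntimeProfile class.
class ProfileSketch {
 public:
  // Registers a sampling counter. Callers that do not pass
  // initial_period_ms keep the 500ms default; only callers that opt in
  // (e.g. host-level counters) get a shorter initial period.
  int AddSamplingCounter(const std::string& name,
                         int initial_period_ms = kDefaultInitialPeriodMs) {
    assert(initial_period_ms > 0);
    initial_periods_ms_[name] = initial_period_ms;
    return initial_period_ms;
  }

  // Returns the initial period recorded for a previously added counter.
  int InitialPeriodMs(const std::string& name) const {
    return initial_periods_ms_.at(name);
  }

 private:
  std::map<std::string, int> initial_periods_ms_;
};
```

With this shape, existing call sites such as the "MemoryUsage" one stay 
unchanged and keep 500, while a host_profile_ call site would pass 50 
explicitly.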



--
To view, visit http://gerrit.cloudera.org:8080/20377
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic8e5cbfd4b324081158574ceb8f4b3a062a69fd1
Gerrit-Change-Number: 20377
Gerrit-PatchSet: 2
Gerrit-Owner: Kurt Deschler <kdesc...@cloudera.com>
Gerrit-Reviewer: David Rorke <dro...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Riza Suminto <riza.sumi...@cloudera.com>
Gerrit-Reviewer: Surya Hebbar <sheb...@cloudera.com>
Gerrit-Comment-Date: Fri, 18 Aug 2023 16:23:59 +0000
Gerrit-HasComments: Yes
