Song Jiacheng has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20166 )

Change subject: eKUDU-3407 Avoid unchecked scheduling of flush operations.
......................................................................


Patch Set 35:

(5 comments)

Thanks for the review~

http://gerrit.cloudera.org:8080/#/c/20166/29//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/20166/29//COMMIT_MSG@12
PS29, Line 12: eventually break
             : due to OOM
> Right -- overall performance is dropping, so every operation takes longer t
Theoretically, yes, it should not be much higher than 80%. But I could always 
find a tserver whose memory is higher than 90, sometimes 100%. I had looked 
into the memory profile and I think everything is OK except some big SchemaPBs 
from tombstoned replicas, which I have raised another issue KUDU-3486 to fix it.
But the memory of tombstoned replicas is also tracked by the memory tracker, so 
the memory pressure mechanism should still work.
I think the reason might be the scan requests, or the concurrent writes or I/O 
problem. And all of them could be optimized by maintenance operations.
And also, sometimes the performance might not be the root cause of OOM. I still 
think this mechanism is needed since I always find that there are many 
operations pending with high perf improvement score when the tserver is under 
memory pressure.


http://gerrit.cloudera.org:8080/#/c/20166/34/src/kudu/util/maintenance_manager-test.cc
File src/kudu/util/maintenance_manager-test.cc:

http://gerrit.cloudera.org:8080/#/c/20166/34/src/kudu/util/maintenance_manager-test.cc@584
PS34, Line 584:   me
> nit: fix the alignment for this line
Done


http://gerrit.cloudera.org:8080/#/c/20166/34/src/kudu/util/maintenance_manager-test.cc@1022
PS34, Line 1022: l
> What's "e"?  This looks like an incomplete word or a typo.
Done


http://gerrit.cloudera.org:8080/#/c/20166/34/src/kudu/util/maintenance_manager-test.cc@1023
PS34, Line 1023: s * pr
> nit: greater
Done


http://gerrit.cloudera.org:8080/#/c/20166/34/src/kudu/util/maintenance_manager-test.cc@1026
PS34, Line 1026:   // take time, so the other_ops_running_times might be 
greater than expected.
               :   const int64_t memory_op_running_times = op2.run_count();
> These margins make sense from the theoretical perspective, but how did you
I ran it with the parameters and it turns out that AssertEventually will time 
out until memory_op run 1000 times.
No other problem except that, the range is big enough to avoid the flakiness.
Done.



--
To view, visit http://gerrit.cloudera.org:8080/20166
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idc2fd3a850cf99d54ef2980211b712468440ed80
Gerrit-Change-Number: 20166
Gerrit-PatchSet: 35
Gerrit-Owner: Song Jiacheng <songjiach...@thinkingdata.cn>
Gerrit-Reviewer: Abhishek Chennaka <achenn...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <ale...@apache.org>
Gerrit-Reviewer: Ashwani Raina <ara...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Song Jiacheng <songjiach...@thinkingdata.cn>
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Wang Xixu <1450306...@qq.com>
Gerrit-Comment-Date: Tue, 14 Nov 2023 17:19:27 +0000
Gerrit-HasComments: Yes

Reply via email to