Song Jiacheng has posted comments on this change. ( http://gerrit.cloudera.org:8080/20166 )
Change subject: KUDU-3407 Avoid unchecked scheduling of flush operations. ...................................................................... Patch Set 23: (21 comments) http://gerrit.cloudera.org:8080/#/c/20166/17//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/20166/17//COMMIT_MSG@7 PS17, Line 7: Avoid unchecked scheduli > nit: Avoid unchecked scheduling of flush operations Done http://gerrit.cloudera.org:8080/#/c/20166/17//COMMIT_MSG@20 PS17, Line 20: higher the probability to flush > nit: higher the probability to flush Done http://gerrit.cloudera.org:8080/#/c/20166/17//COMMIT_MSG@20 PS17, Line 20: Higher the > nit: Higher the Done http://gerrit.cloudera.org:8080/#/c/20166/17//COMMIT_MSG@27 PS17, Line 27: e > nit: remove whitespace Done http://gerrit.cloudera.org:8080/#/c/20166/17//COMMIT_MSG@27 PS17, Line 27: Yes.. I modified the flag name and forgot to correct the commit message. Done. http://gerrit.cloudera.org:8080/#/c/20166/17/src/kudu/util/maintenance_manager-test.cc File src/kudu/util/maintenance_manager-test.cc: http://gerrit.cloudera.org:8080/#/c/20166/17/src/kudu/util/maintenance_manager-test.cc@151 PS17, Line 151: gister_self_) { > Consider removing this once done debugging/troubleshooting the new test -- TSAN will report data race if I remove this, which I think is a wrong report. Please check the patch 18~22 to view the test log. http://gerrit.cloudera.org:8080/#/c/20166/17/src/kudu/util/maintenance_manager-test.cc@314 PS17, Line 314: MonoTim > nit: why not to use the MonoDelta type here? Done http://gerrit.cloudera.org:8080/#/c/20166/17/src/kudu/util/maintenance_manager-test.cc@315 PS17, Line 315: Sum of schedul > nit: How many times the operation has been run. Done http://gerrit.cloudera.org:8080/#/c/20166/17/src/kudu/util/maintenance_manager-test.cc@989 PS17, Line 989: ob = 1; > Is it possible to assert on the range of the expected counters, etc.? Give Yes, mostly the counter is in a range, but I am afraid that there will be some accidental compile failures if I assert that the counter is in a range, which will be confusing. http://gerrit.cloudera.org:8080/#/c/20166/17/src/kudu/util/maintenance_manager.h File src/kudu/util/maintenance_manager.h: http://gerrit.cloudera.org:8080/#/c/20166/17/src/kudu/util/maintenance_manager.h@379 PS17, Line 379: memory pressure > the memory pressure Done http://gerrit.cloudera.org:8080/#/c/20166/17/src/kudu/util/maintenance_manager.h@386 PS17, Line 386: bool ProceedWithFlush(double* used_memory_percentage); > nit: you could use a free format in a non-public API when documenting metho Got it, thanks! http://gerrit.cloudera.org:8080/#/c/20166/17/src/kudu/util/maintenance_manager.cc File src/kudu/util/maintenance_manager.cc: http://gerrit.cloudera.org:8080/#/c/20166/17/src/kudu/util/maintenance_manager.cc@105 PS17, Line 105: th > nit: Who's "we" here? It's better to be more specific on who does what her Thans you for the review! Done. http://gerrit.cloudera.org:8080/#/c/20166/17/src/kudu/util/maintenance_manager.cc@108 PS17, Line 108: run > run Done http://gerrit.cloudera.org:8080/#/c/20166/17/src/kudu/util/maintenance_manager.cc@110 PS17, Line 110: system adm > nit: maybe you could use "system admin" or "user"? Thank you for the review! A user should turn on this switch if the tablet server is under memory pressure and there are some high-perf-score operations not scheduled for a long time. If I try to warn the users according to the logic above, some extra and complex codes need to be added to the maintenance scheduler thread. I guess this might make a bad impact on the scheduler. Also, it not worth to create another thread just for making a warn log. http://gerrit.cloudera.org:8080/#/c/20166/17/src/kudu/util/maintenance_manager.cc@111 PS17, Line 111: there is a significant degrada > nit: "there is a significant degradation in performance." Done http://gerrit.cloudera.org:8080/#/c/20166/17/src/kudu/util/maintenance_manager.cc@136 PS17, Line 136: emory_pressure_percent > What is the unit of the usage? Done http://gerrit.cloudera.org:8080/#/c/20166/17/src/kudu/util/maintenance_manager.cc@539 PS17, Line 539: pc > under Done http://gerrit.cloudera.org:8080/#/c/20166/17/src/kudu/util/maintenance_manager.cc@542 PS17, Line 542: > Alternatively, consider calling memory_pressure_func_() functor as it is, b Yes, this is much more reasonable. Done http://gerrit.cloudera.org:8080/#/c/20166/17/src/kudu/util/maintenance_manager.cc@730 PS17, Line 730: tatic_cast > FlushOrNot becomes confusing when return type is bool. It is not clearly ev Yes, That is much better, thank you! http://gerrit.cloudera.org:8080/#/c/20166/17/src/kudu/util/maintenance_manager.cc@730 PS17, Line 730: double soft_limit = static_cast<double>(FLAGS_memory_limit_soft_p > nit: Add a comment above that describes the purpose of this method, out par Sorry, I'm not sure. There is a comment of this method in file maintenance_manager.h, should I add another one here? http://gerrit.cloudera.org:8080/#/c/20166/17/src/kudu/util/maintenance_manager.cc@732 PS17, Line 732: rand_.NextDoubleFraction() >= FLAGS_run_non_memory_ops_prob * : (soft_limit - *used_memory_percentage) / (soft_limit - pressure_threshold); : } > Instead of introducing this extra test flag and adding this extra logic, co Got it, done. -- To view, visit http://gerrit.cloudera.org:8080/20166 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idc2fd3a850cf99d54ef2980211b712468440ed80 Gerrit-Change-Number: 20166 Gerrit-PatchSet: 23 Gerrit-Owner: Song Jiacheng <songjiach...@thinkingdata.cn> Gerrit-Reviewer: Abhishek Chennaka <achenn...@cloudera.com> Gerrit-Reviewer: Alexey Serbin <ale...@apache.org> Gerrit-Reviewer: Ashwani Raina <ara...@cloudera.com> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Song Jiacheng <songjiach...@thinkingdata.cn> Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: Wang Xixu <1450306...@qq.com> Gerrit-Comment-Date: Fri, 22 Sep 2023 09:47:59 +0000 Gerrit-HasComments: Yes