Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/19473 )
Change subject: [clock] add sanity check to detect wall clock jumps ...................................................................... Patch Set 1: (9 comments) http://gerrit.cloudera.org:8080/#/c/19473/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/19473/1//COMMIT_MSG@23 PS1, Line 23: condition > nit: condition is Done http://gerrit.cloudera.org:8080/#/c/19473/1//COMMIT_MSG@40 PS1, Line 40: However, if a problem is observed at some inadequate VMs such : as ones in Azure Cloud, it's now possible to enable the guardrail : to detect such an issue. > There are chances that this clock issue may go unnoticed unless there is a The idea is to turn on this flag if it's found that the cluster is prone clock jump issues. Doing so, it would help to detect such issues in the future. >From the other side, I have a follow-up WIP patch where the flag changes its >default value to 'true' if a Kudu server founds itself running at an Azure VM. Another alternative would be enabling this check by default, but the issue hasn't been observed to happen often enough in the wild. http://gerrit.cloudera.org:8080/#/c/19473/1//COMMIT_MSG@46 PS1, Line 46: opeations > nit: operations Done http://gerrit.cloudera.org:8080/#/c/19473/1/src/kudu/clock/hybrid_clock.h File src/kudu/clock/hybrid_clock.h: http://gerrit.cloudera.org:8080/#/c/19473/1/src/kudu/clock/hybrid_clock.h@238 PS1, Line 238: captured > nit: Did you mean "captured at" ? Done http://gerrit.cloudera.org:8080/#/c/19473/1/src/kudu/clock/hybrid_clock.cc File src/kudu/clock/hybrid_clock.cc: http://gerrit.cloudera.org:8080/#/c/19473/1/src/kudu/clock/hybrid_clock.cc@165 PS1, Line 165: 15 * 60 > Any specific reason to choose 15 minutes of max allowed divergence? Nothing very specific except that the idea was to keep this under 1000 seconds. That's the default threshold for ntpd unless '-g' options is provided. ntpd would exit without trying to adjust time if it sees the difference between the reference time and local clock time is greater than 1000 seconds. I added corresponding comment. http://gerrit.cloudera.org:8080/#/c/19473/1/src/kudu/clock/hybrid_clock.cc@287 PS1, Line 287: uint64_t now_latest = GetPhysicalValueMicros(now) + error; > warning: The right operand of '+' is a garbage value [clang-analyzer-core.U This warning is bogus. http://gerrit.cloudera.org:8080/#/c/19473/1/src/kudu/clock/hybrid_clock.cc@353 PS1, Line 353: uint64_t now_earliest_usec = now_usec - error; > warning: The right operand of '-' is a garbage value [clang-analyzer-core.U This warning is bogus. http://gerrit.cloudera.org:8080/#/c/19473/1/src/kudu/clock/hybrid_clock.cc@444 PS1, Line 444: monotomic > nit: monotonic Done http://gerrit.cloudera.org:8080/#/c/19473/1/src/kudu/clock/hybrid_clock.cc@773 PS1, Line 773: return error; > warning: Undefined or garbage value returned to caller [clang-analyzer-core This warning is bogus. -- To view, visit http://gerrit.cloudera.org:8080/19473 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I630783653717d975a9b2ad668e8bd47b7796d275 Gerrit-Change-Number: 19473 Gerrit-PatchSet: 1 Gerrit-Owner: Alexey Serbin <ale...@apache.org> Gerrit-Reviewer: Alexey Serbin <ale...@apache.org> Gerrit-Reviewer: Ashwani Raina <ara...@cloudera.com> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: Yingchun Lai <laiyingc...@apache.org> Gerrit-Reviewer: Yuqi Du <shenxingwuy...@gmail.com> Gerrit-Reviewer: Zoltan Chovan <zcho...@cloudera.com> Gerrit-Comment-Date: Tue, 07 Feb 2023 02:43:50 +0000 Gerrit-HasComments: Yes