Hi all,

I am implementing stats for the Geode membership service health monitor, which uses heartbeats to monitor the health of the members of the distributed system. The stats I plan to implement are described below. Please take a look and let me know what you think.
Assuming you have basic knowledge of Geode, here is a very brief description of how the health monitor works. Every member exchanges heartbeat messages with its neighbor to make sure that the neighbor is alive. If, for some reason, a member doesn't receive a heartbeat from its neighbor, it sends a suspect member message to the coordinator reporting the issue. Upon receiving the suspect member message, the coordinator performs a final check by exchanging final check messages (similar to heartbeats) with the suspect member. Depending on the result of the final check, the coordinator decides whether to keep the suspect member in the membership or remove it. For details of the health monitor, please refer to GEODE-77 and/or GMSHealthMonitor.java.

The proposed stats for the health monitor are:

1) The number of heartbeat requests a member has sent
2) The number of heartbeat requests a member has received
3) The number of heartbeats (responses) a member has sent
4) The number of heartbeats (responses) a member has received
5) The number of suspect member messages a member has sent
6) The number of suspect member messages a member has received
7) The number of final check requests a member has sent
8) The number of final check requests a member has received
9) The number of final check responses a member has sent
10) The number of final check responses a member has received

Note that there are two different types of final check (TCP based and UDP based), so there are additional stats for each type:

11) The number of TCP final check requests a member has sent
12) The number of TCP final check requests a member has received
13) The number of TCP final check responses a member has sent
14) The number of TCP final check responses a member has received
15) The number of UDP final check requests a member has sent
16) The number of UDP final check requests a member has received
17) The number of UDP final check responses a member has sent
18) The number of UDP final check responses a member has received

Thanks,
Jianxia
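P.S. To make the list concrete, here is a minimal sketch of how the 18 counters could be held in memory. It is plain Java and does not use Geode's statistics classes. The class name HealthMonitorStatsSketch, the Stat enum, and all of its constant names are hypothetical and only for illustration. The proposal does not say whether the generic final check counters (7 to 10) should also be bumped when a TCP or UDP final check is counted, so the sketch keeps every counter independent and leaves that decision to the caller.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical holder for the proposed counters; not Geode's actual stats API.
class HealthMonitorStatsSketch {

  // One constant per proposed stat, in the same order as the list (names are assumptions).
  public enum Stat {
    HEARTBEAT_REQUESTS_SENT,
    HEARTBEAT_REQUESTS_RECEIVED,
    HEARTBEATS_SENT,
    HEARTBEATS_RECEIVED,
    SUSPECTS_SENT,
    SUSPECTS_RECEIVED,
    FINAL_CHECK_REQUESTS_SENT,
    FINAL_CHECK_REQUESTS_RECEIVED,
    FINAL_CHECK_RESPONSES_SENT,
    FINAL_CHECK_RESPONSES_RECEIVED,
    TCP_FINAL_CHECK_REQUESTS_SENT,
    TCP_FINAL_CHECK_REQUESTS_RECEIVED,
    TCP_FINAL_CHECK_RESPONSES_SENT,
    TCP_FINAL_CHECK_RESPONSES_RECEIVED,
    UDP_FINAL_CHECK_REQUESTS_SENT,
    UDP_FINAL_CHECK_REQUESTS_RECEIVED,
    UDP_FINAL_CHECK_RESPONSES_SENT,
    UDP_FINAL_CHECK_RESPONSES_RECEIVED
  }

  // The map is fully populated in the constructor and never modified afterwards,
  // so concurrent reads of the map are safe; LongAdder handles concurrent increments.
  private final Map<Stat, LongAdder> counters = new EnumMap<>(Stat.class);

  public HealthMonitorStatsSketch() {
    for (Stat stat : Stat.values()) {
      counters.put(stat, new LongAdder());
    }
  }

  // Adds one to the given counter.
  public void increment(Stat stat) {
    counters.get(stat).increment();
  }

  // Returns the current value of the given counter.
  public long get(Stat stat) {
    return counters.get(stat).sum();
  }

  public static void main(String[] args) {
    HealthMonitorStatsSketch stats = new HealthMonitorStatsSketch();

    // A member sends a heartbeat request and later receives the response.
    stats.increment(Stat.HEARTBEAT_REQUESTS_SENT);
    stats.increment(Stat.HEARTBEATS_RECEIVED);

    // Print every counter with its current value.
    for (Stat stat : Stat.values()) {
      System.out.println(stat + " = " + stats.get(stat));
    }
  }
}
```

The message-handling code would call increment(...) at the point where each message is sent or received. In the real implementation these counters would presumably be registered with Geode's existing statistics framework rather than kept in a standalone class like this.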