-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/9647/#review17390
-----------------------------------------------------------

Ship it!

- Hugo Trippaers


On Feb. 27, 2013, 9:06 a.m., Brenn Oosterbaan wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/9647/
> -----------------------------------------------------------
>
> (Updated Feb. 27, 2013, 9:06 a.m.)
>
>
> Review request for cloudstack and Hugo Trippaers.
>
>
> Description
> -------
>
> In some storage failure scenarios the NFS timeout can cause writing the
> heartbeat to take longer than expected. By comparing the last successful
> heartbeat epoch with the current epoch we check if the timeout value has been
> met.
>
>
> Diffs
> -----
>
> scripts/vm/hypervisor/xenserver/xenheartbeat.sh 5edacf7
>
> Diff: https://reviews.apache.org/r/9647/diff/
>
>
> Testing
> -------
>
> Tested on hostxxx with an empty heartbeat file:
> Feb 26 21:54:13 hostxxx heartbeat: Problem with heartbeat, no iSCSI or NFS mount defined in /opt/xensource/bin/heartbeat!
>
> Tested on hostxxx with a 120 second timeout value by causing a storage
> failover (hits NFS timeout):
> Feb 26 08:04:15 hostxxx heartbeat: Potential problem with /var/run/sr-mount/d392d770-330b-bdbf-9c07-e1c38af81c6e/hb-faecefb3-9ac0-47a2-b0fb-ae383762ba13: not reachable since 18 seconds
> Feb 26 08:04:48 hostxxx heartbeat: Potential problem with /var/run/sr-mount/d392d770-330b-bdbf-9c07-e1c38af81c6e/hb-faecefb3-9ac0-47a2-b0fb-ae383762ba13: not reachable since 51 seconds
> Feb 26 08:05:20 hostxxx heartbeat: Potential problem with /var/run/sr-mount/d392d770-330b-bdbf-9c07-e1c38af81c6e/hb-faecefb3-9ac0-47a2-b0fb-ae383762ba13: not reachable since 83 seconds
> The storage failover stayed within the 120 second timeout value, so no reboot
> occurred.
>
> Tested on hostxxx with a 120 second timeout by removing the storage
> altogether (hits NFS timeout):
> Feb 26 10:08:52 hostxxx heartbeat: Potential problem with /var/run/sr-mount/test/hb-test: not reachable since 32 seconds
> Feb 26 10:09:24 hostxxx heartbeat: Potential problem with /var/run/sr-mount/test/hb-test: not reachable since 64 seconds
> Feb 26 10:09:57 hostxxx heartbeat: Potential problem with /var/run/sr-mount/test/hb-test: not reachable since 97 seconds
> Feb 26 10:10:29 hostxxx heartbeat: Potential problem with /var/run/sr-mount/test/hb-test: not reachable since 129 seconds
> Feb 26 10:10:29 hostxxx heartbeat: Problem with /var/run/sr-mount/test/hb-test: not reachable since 129 seconds, rebooting system!
>
> Tested on hostxxx with a 120 second timeout by removing write rights on the
> storage (does not hit NFS timeout):
> Feb 26 10:22:13 hostxxx heartbeat: Potential problem with /var/run/sr-mount/test/hb-test: not reachable since 5 seconds
> Feb 26 10:22:18 hostxxx heartbeat: Potential problem with /var/run/sr-mount/test/hb-test: not reachable since 10 seconds
> Feb 26 10:22:23 hostxxx heartbeat: Potential problem with /var/run/sr-mount/test/hb-test: not reachable since 15 seconds
> Feb 26 10:22:28 hostxxx heartbeat: Potential problem with /var/run/sr-mount/test/hb-test: not reachable since 20 seconds
> Feb 26 10:22:33 hostxxx heartbeat: Potential problem with /var/run/sr-mount/test/hb-test: not reachable since 25 seconds
> Feb 26 10:22:38 hostxxx heartbeat: Potential problem with /var/run/sr-mount/test/hb-test: not reachable since 30 seconds
> Feb 26 10:22:43 hostxxx heartbeat: Potential problem with /var/run/sr-mount/test/hb-test: not reachable since 35 seconds
> Feb 26 10:22:48 hostxxx heartbeat: Potential problem with /var/run/sr-mount/test/hb-test: not reachable since 40 seconds
> Feb 26 10:22:53 hostxxx heartbeat: Potential problem with /var/run/sr-mount/test/hb-test: not reachable since 45 seconds
> Feb 26 10:22:58 hostxxx heartbeat: Potential problem with /var/run/sr-mount/test/hb-test: not reachable since 50 seconds
> Feb 26 10:23:03 hostxxx heartbeat: Potential problem with /var/run/sr-mount/test/hb-test: not reachable since 55 seconds
> Feb 26 10:23:08 hostxxx heartbeat: Potential problem with /var/run/sr-mount/test/hb-test: not reachable since 60 seconds
> Feb 26 10:23:13 hostxxx heartbeat: Potential problem with /var/run/sr-mount/test/hb-test: not reachable since 65 seconds
> Feb 26 10:23:18 hostxxx heartbeat: Potential problem with /var/run/sr-mount/test/hb-test: not reachable since 70 seconds
> Feb 26 10:23:23 hostxxx heartbeat: Potential problem with /var/run/sr-mount/test/hb-test: not reachable since 75 seconds
> Feb 26 10:23:28 hostxxx heartbeat: Potential problem with /var/run/sr-mount/test/hb-test: not reachable since 80 seconds
> Feb 26 10:23:33 hostxxx heartbeat: Potential problem with /var/run/sr-mount/test/hb-test: not reachable since 85 seconds
> Feb 26 10:23:38 hostxxx heartbeat: Potential problem with /var/run/sr-mount/test/hb-test: not reachable since 90 seconds
> Feb 26 10:23:43 hostxxx heartbeat: Potential problem with /var/run/sr-mount/test/hb-test: not reachable since 95 seconds
> Feb 26 10:23:48 hostxxx heartbeat: Potential problem with /var/run/sr-mount/test/hb-test: not reachable since 100 seconds
> Feb 26 10:23:53 hostxxx heartbeat: Potential problem with /var/run/sr-mount/test/hb-test: not reachable since 105 seconds
> Feb 26 10:23:58 hostxxx heartbeat: Potential problem with /var/run/sr-mount/test/hb-test: not reachable since 110 seconds
> Feb 26 10:24:03 hostxxx heartbeat: Potential problem with /var/run/sr-mount/test/hb-test: not reachable since 115 seconds
> Feb 26 10:24:08 hostxxx heartbeat: Potential problem with /var/run/sr-mount/test/hb-test: not reachable since 120 seconds
> Feb 26 10:24:08 hostxxx heartbeat: Problem with /var/run/sr-mount/test/hb-test: not reachable for 120 seconds, rebooting system!
>
>
> Thanks,
>
> Brenn Oosterbaan
>
>
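
For readers following the thread, the check described in the quoted Description (compare the epoch of the last successful heartbeat write with the current epoch) could be sketched roughly as below. This is only an illustration and not the code from the diff: the function name heartbeat_verdict, its arguments, and the "warn"/"reboot" strings are hypothetical. The ">=" comparison is inferred from the quoted logs, where the reboot fires at exactly 120 seconds with a 120 second timeout.

```shell
#!/bin/sh
# Illustrative sketch only: the function name, arguments, and return
# strings are hypothetical and are not taken from xenheartbeat.sh.

# heartbeat_verdict LAST_OK NOW TIMEOUT
#   LAST_OK  epoch (seconds) of the last successful heartbeat write
#   NOW      current epoch (seconds), e.g. from `date +%s`
#   TIMEOUT  seconds the heartbeat file may stay unwritable
# Intended to be called after a failed write. Prints "warn" while the
# outage is shorter than TIMEOUT, and "reboot" once it has lasted
# TIMEOUT seconds or more.
heartbeat_verdict() {
    if [ $(( $2 - $1 )) -ge "$3" ]; then
        echo "reboot"
    else
        echo "warn"
    fi
}

# Elapsed times taken from the quoted test logs, 120 second timeout.
heartbeat_verdict 0 83 120    # prints "warn"
heartbeat_verdict 0 120 120   # prints "reboot"
heartbeat_verdict 0 129 120   # prints "reboot"
```

Measuring elapsed wall-clock time this way does not depend on how long each write attempt blocks. In the quoted NFS-timeout tests the log entries are roughly 32 seconds apart, while in the write-permission test they are 5 seconds apart, yet both are judged against the same 120 second limit.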