Ok, not sure what the difference is, but this is a fix; kvm-probes.d/monitor_ds.sh
--- monitor_ds.sh.orig 2013-12-05 17:06:35.251763742 +0000
+++ monitor_ds.sh 2013-12-05 16:55:11.699743037 +0000
@@ -31,6 +31,10 @@
 TOTAL_MB=$(df -B1M -P $dir 2>/dev/null | tail -n 1 | awk '{print $2}')
 FREE_MB=$(df -B1M -P $dir 2>/dev/null | tail -n 1 | awk '{print $4}')
 
+USED_MB=${USED_MB:-"0"}
+TOTAL_MB=${TOTAL_MB:-"0"}
+FREE_MB=${FREE_MB:-"0"}
+
 if [ -n "$LVM_SIZE_CMD" ]; then
 LVM_SIZE=$($LVM_SIZE_CMD ${LVM_VG_PREFIX}${ds} 2>/dev/null)
 LVM_STATUS=$?

The host monitor seems to be calling a generic monitor which also grabs information about datastores. In an environment (such as VDC) where the data exists remotely, the standard monitor (which uses 'du') isn't going to provide any useful results. Indeed, in this instance it errors and the script returns blanks, which breaks the host monitoring. Returning '0' on failure fixes the issue ...

Any chance this could be inserted in the next release please?

--
Gareth Bult
“The odds of hitting your target go up dramatically when you aim at it.”

----- Original Message -----
From: "Gareth Bult" <gar...@linux.co.uk>
To: users@lists.opennebula.org
Sent: Thursday, 5 December, 2013 3:47:12 PM
Subject: [one-users] Breakage when upgrading from 4.38 to 4.4 ...

Hi,

I've run the upgrade just to see if a number of issues I was having were fixed, and I seem to have ended up with a fairly terminal issue I can't spot.

Host monitoring is now failing for hosts that previously seemed to be working. The reason seems to be that the host is somehow looking at the datastores, and this in turn is failing. The datastore monitoring is correct.

The error in oned.log is as follows;

Thu Dec 5 15:24:47 2013 [ONE][E]: Error parsing host information: syntax error, unexpected EQUAL, expecting COMMA or CBRACKET at line 30, columns 494:497.
Monitoring information:

ARCH=x86_64
MODELNAME="AMD Phenom(tm) II X6 1100T Processor"
HYPERVISOR=kvm
TOTALCPU=600
CPUSPEED=3300
TOTALMEMORY=16175216
USEDMEMORY=399320
FREEMEMORY=15775896
FREECPU=600.0
USEDCPU=0.0
NETRX=33791417902
NETTX=25290741468
DS_LOCATION_USED_MB=2637
DS_LOCATION_TOTAL_MB=40190
DS_LOCATION_FREE_MB=28202
DS = [ ID = 0, USED_MB = 29, TOTAL_MB = 40190, FREE_MB = 28202 ]
DS = [ ID = 1, USED_MB = 2609, TOTAL_MB = 40190, FREE_MB = 28202 ]
DS = [ ID = 107, USED_MB = , TOTAL_MB = , FREE_MB = ]
DS = [ ID = 109, USED_MB = , TOTAL_MB = , FREE_MB = ]
DS = [ ID = 114, USED_MB = 1, TOTAL_MB = 40190, FREE_MB = 28202 ]
DS = [ ID = 2, USED_MB = 1, TOTAL_MB = 40190, FREE_MB = 28202 ]
HOSTNAME=node1
VM_POLL=YES
VERSION="4.4.0"

So it would appear that it's calling datastore/monitor, and it's failing .. however the script does work and is reporting the correct size in the datastore monitor. I've added syslog to the monitor script, it is being called and it is returning the correct result ...
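
The entries for datastores 107 and 109 above carry empty values, which is what the parser appears to trip over. As a rough sketch only: the helper `format_ds` below is hypothetical (it is not part of OpenNebula) and simply mimics the shape of the DS lines, to show how empty probe results produce such an entry and how the `${VAR:-"0"}` defaults from the patch avoid it.

```shell
#!/bin/sh
# Hypothetical helper (not part of OpenNebula): prints one DS entry in
# roughly the same shape as the monitoring output above.
# $1 = datastore ID, $2..$4 = used/total/free in MB (any may be empty)
format_ds() {
    echo "DS = [ ID = $1, USED_MB = $2, TOTAL_MB = $3, FREE_MB = $4 ]"
}

# Simulate a probe whose du/df calls failed and printed nothing.
USED_MB=""
TOTAL_MB=""
FREE_MB=""

# Without defaults: the values after each '=' are missing.
BROKEN=$(format_ds 107 "$USED_MB" "$TOTAL_MB" "$FREE_MB")

# With the defaults from the patch: empty or unset values become 0.
USED_MB=${USED_MB:-"0"}
TOTAL_MB=${TOTAL_MB:-"0"}
FREE_MB=${FREE_MB:-"0"}
FIXED=$(format_ds 107 "$USED_MB" "$TOTAL_MB" "$FREE_MB")

echo "$BROKEN"
echo "$FIXED"
```

Note that `:-` substitutes the default for both unset and empty variables, which matters here because a failed command substitution leaves the variable set but empty.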
If I call the script on the command line with a pasted in base64 and host it, this too gives the correct results;

oneadmin@nebula:~/remotes$ ssh node1 /var/lib/one/remotes/datastore/vdc/monitor "PERTX0RSSVZFUl9BQ1RJT05fREFUQT48REFUQVNUT1JFPjxJRD4xMDk8L0lEPjxVSUQ+MDwvVUlEPjxHSUQ+MTwvR0lEPjxVTkFNRT5vbmVhZG1pbjwvVU5BTUU+PEdOQU1FPnVzZXJzPC9HTkFNRT48TkFNRT5kYXRhMjwvTkFNRT48UEVSTUlTU0lPTlM+PE9XTkVSX1U+MTwvT1dORVJfVT48T1dORVJfTT4xPC9PV05FUl9NPjxPV05FUl9BPjA8L09XTkVSX0E+PEdST1VQX1U+MTwvR1JPVVBfVT48R1JPVVBfTT4wPC9HUk9VUF9NPjxHUk9VUF9BPjA8L0dST1VQX0E+PE9USEVSX1U+MTwvT1RIRVJfVT48T1RIRVJfTT4wPC9PVEhFUl9NPjxPVEhFUl9BPjA8L09USEVSX0E+PC9QRVJNSVNTSU9OUz48RFNfTUFEPnZkYzwvRFNfTUFEPjxUTV9NQUQ+dmRjPC9UTV9NQUQ+PEJBU0VfUEFUSD4vdmFyL2xpYi9vbmUvL2RhdGFzdG9yZXMvMTA5PC9CQVNFX1BBVEg+PFRZUEU+MDwvVFlQRT48RElTS19UWVBFPjA8L0RJU0tfVFlQRT48Q0xVU1RFUl9JRD4tMTwvQ0xVU1RFUl9JRD48Q0xVU1RFUj48L0NMVVNURVI+PFRPVEFMX01CPjE1MDAxMDE8L1RPVEFMX01CPjxGUkVFX01CPjk2MzIzMDwvRlJFRV9NQj48VVNFRF9NQj41MzY4NzE8L1VTRURfTUI+PElNQUdFUz48SUQ+MzE8L0lEPjxJRD4zMjwvSUQ+PElEPjYyPC9JRD48SUQ+ODQ8L0lEPjwvSU1BR0VTPjxURU1QTEFURT48Q0xPTkVfVEFSR0VUPjwhW0NEQVRBW1NZU1RFTV1dPjwvQ0xPTkVfVEFSR0VUPjxESVNLX1RZUEU+PCFbQ0RBVEFbRklMRV1dPjwvRElTS19UWVBFPjxEU19NQUQ+PCFbQ0RBVEFbdmRjXV0+PC9EU19NQUQ+PExOX1RBUkdFVD48IVtDREFUQVtTWVNURU1dXT48L0xOX1RBUkdFVD48TU9VTlRQT0lOVD48IVtDREFUQVsvdm9scy92bXNdXT48L01PVU5UUE9JTlQ+PFNBRkVfRElSUz48IVtDREFUQVsvdmFyL2xpYi9vbmUvaW1hZ2VzXV0+PC9TQUZFX0RJUlM+PFRNX01BRD48IVtDREFUQVt2ZGNdXT48L1RNX01BRD48VFlQRT48IVtDREFUQVtJTUFHRV9EU11dPjwvVFlQRT48VkdfTkFNRT48IVtDREFUQVt2b2xzXV0+PC9WR19OQU1FPjwvVEVNUExBVEU+PC9EQVRBU1RPUkU+PC9EU19EUklWRVJfQUNUSU9OX0RBVEE+" 109
TOTAL_MB=1500101.21
FREE_MB=963230.30
USED_MB=536871

Short of trying to dig through oned's source code I'm stuck - can anyone help?

My datastore driver type is "vdc", even if I insert a static script (I've also tried the monitor driver from "fs") that just prints out "TOTAL_MB=0" etc .. I still get the same issue ... ??? (and it was working fine .. before the upgrade??)
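
For anyone wanting to look at what the monitor script is actually handed, the base64 argument can be decoded back to XML. This is only a sketch: `SAMPLE_XML` is a shortened stand-in I made up for illustration (not the real payload above), and `base64 -d` assumes the GNU coreutils tool as found on a typical Linux host.

```shell
#!/bin/sh
# Shortened stand-in for the driver action XML (assumption, for illustration
# only; the real payload carries many more elements).
SAMPLE_XML='<DS_DRIVER_ACTION_DATA><DATASTORE><ID>109</ID></DATASTORE></DS_DRIVER_ACTION_DATA>'

# Encode it as a single unwrapped line, the way the argument appears on
# the command line.
DRV_ACTION=$(printf '%s' "$SAMPLE_XML" | base64 | tr -d '\n')

# Decode it again to inspect what the script receives.
DECODED=$(printf '%s' "$DRV_ACTION" | base64 -d)

echo "$DECODED"
```

Substituting the quoted string from the ssh command for `$DRV_ACTION` in the decode step should show the datastore definition the driver was given.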
--
Gareth Bult
“The odds of hitting your target go up dramatically when you aim at it.”

_______________________________________________
Users mailing list
Users@lists.opennebula.org
http://lists.opennebula.org/listinfo.cgi/users-opennebula.org