I was wrong: just changing the guest OS type did not fix my problem. The morning after I sent the email below, I found the CPU pegged again.
I ended up installing the i386 version of 4.9 and used FreeBSD 32-bit
as the guest os type. These VMs have been running for four days
without a problem. If it occurs again I'll try the other suggestions
provided here.

-Gene

On Sun, Oct 23, 2011 at 10:09 PM, Gene <gh5...@gmail.com> wrote:
> This problem appears to be resolved. By changing the guest os type
> from "FreeBSD (64-bit)" to "Other (64-bit)" these vm guests perform
> much better.
>
> I found out I could easily duplicate the problem with the following command:
>
> find / -type f -exec grep -i moo {} \;
>
> After ten or so minutes dmesg would be flooded with the "vmware:
> sending length failed" messages. Looking at the ESXi system
> performance, that vm guest would have its core pegged.
>
> After changing the guest os type I ran that find repeatedly in a loop
> for 30 minutes, and the problem didn't come back. I switched back and
> forth between the OS types a couple of times to confirm my findings.
> With the fix in place the CPU utilisation for that vm guest's core did
> not go above 75%.
>
> Once again, thank you for your help everyone.
>
> -Gene
>
> On Sun, Oct 23, 2011 at 12:10 PM, Gene <gh5...@gmail.com> wrote:
>> This is just an update, I've still got to try everything that was
>> suggested before.
>>
>> This issue is finally occurring again, and I have been able to collect
>> more information about it:
>>
>> # uptime
>> 11:46AM up 3 days, 22:50, 1 user, load averages: 1.33, 1.12, 1.10
>>
>> # ps aux
>> USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND
>> root 1 0.0 0.2 364 376 ?? Is Wed12PM 0:00.09 /sbin/init
>> root 17473 0.0 0.3 412 812 ?? Is Wed12PM 0:00.09
>> syslogd: [priv] (syslogd)
>> _syslogd 4944 0.0 0.3 420 860 ?? S Wed12PM 1:59.70
>> syslogd -a /var/www/dev/log -a /var/empty/dev/log
>> root 17203 0.0 0.2 572 464 ?? Is Wed12PM 0:00.01
>> pflogd: [priv] (pflogd)
>> _pflogd 25836 0.0 0.2 636 384 ?? S Wed12PM 1:18.70
>> pflogd: [running] -s 160 -i pflog0 -f /var/log/pflog (pflogd)
>> root 20453 0.0 0.4 496 1020 ?? Is Wed12PM 0:02.17
>> ntpd: [priv] (ntpd)
>> _ntp 27033 0.0 0.4 548 1092 ?? S Wed12PM 0:36.73
>> ntpd: ntp engine (ntpd)
>> _ntp 30318 0.0 0.4 676 1008 ?? I Wed12PM 0:00.02
>> ntpd: dns engine (ntpd)
>> root 12410 0.0 0.5 616 1384 ?? Is Wed12PM 0:00.02 /usr/sbin/sshd
>> root 18650 0.0 0.3 412 832 ?? Is Wed12PM 0:00.06 inetd
>> root 13652 0.0 0.4 668 912 ?? Is Wed12PM 0:04.15 cron
>> root 12191 0.0 0.8 1216 2116 ?? Ss Wed12PM 1:36.36
>> sendmail: accepting connections (sendmail)
>> root 18822 0.0 1.2 3452 3084 ?? Is 11:22AM 0:00.13
>> sshd: gene [priv] (sshd)
>> gene 27682 0.3 0.9 3420 2312 ?? S 11:22AM 0:00.55
>> sshd: gene@ttyp0 (sshd)
>> gene 18431 0.0 0.2 616 492 p0 Ss 11:22AM 0:00.14 -ksh (ksh)
>> root 23079 0.1 0.2 692 536 p0 S 11:46AM 0:00.07 -ksh (ksh)
>> root 19366 0.0 0.1 516 328 p0 R+ 11:47AM 0:00.00 ps -aux
>> root 17451 0.0 0.3 280 864 C0 Is+ Wed12PM 0:00.02
>> /usr/libexec/getty std.9600 ttyC0
>> root 23962 0.0 0.3 324 864 C1 Is+ Wed12PM 0:00.01
>> /usr/libexec/getty std.9600 ttyC1
>> root 2571 0.0 0.3 272 860 C2 Is+ Wed12PM 0:00.01
>> /usr/libexec/getty std.9600 ttyC2
>> root 9191 0.0 0.3 296 864 C3 Is+ Wed12PM 0:00.02
>> /usr/libexec/getty std.9600 ttyC3
>> root 2812 0.0 0.3 416 868 C5 Is+ Wed12PM 0:00.01
>> /usr/libexec/getty std.9600 ttyC5
>>
>> # vmstat -i
>> interrupt          total   rate
>> irq0/clock      34043772     99
>> irq97/mpi0        772066      2
>> irq112/em0         96237      0
>> Total           34912075    102
>>
>> # systat
>> 1 users Load 1.10 1.07 1.08 PAUSED Sun Oct 23 11:46:02 2011
>>
>> memory totals (in KB) PAGING SWAPPING Interrupts
>> real virtual free in out in out 105 total
>> Active 12420 12420 185072 ops 100 clock
>> All 55712 55712 447212 pages 4 mpi0
>> 1 em0
>> Proc:r d s w Csw Trp Sys Int Sof Flt forks
>> 6 21 17 88 4 102 21 fkppw
>> fksvm
>> 0.0%Int 0.2%Sys 0.4%Usr 0.0%Nic 99.4%Idle pwait
>> | | | | | | | | | | | 2 relck
>> 2 rlkok
>> noram
>> Namei Sys-cache Proc-cache No-cache ndcpy
>> Calls hits % hits % miss % fltcp
>> 14 14 100 2 zfod
>> cow
>> Disks cd0 sd0 fd0 2006 fmin
>> seeks 2674 ftarg
>> xfers 4 itarg
>> speed 67K 1 wired
>> sec 0.0 pdfre
>> pdscn
>> pzidle
>> 10 kmapent
>>
>> # dmesg | tail
>> vmware: sending length failed, eax=00000000, ecx=00000000
>> vmt0: failed to send TCLO outgoing ping
>> vmware: sending length failed, eax=00000000, ecx=00000000
>> vmt0: failed to send TCLO outgoing ping
>> vmware: sending length failed, eax=00000000, ecx=00000000
>> vmt0: failed to send TCLO outgoing ping
>> vmware: sending length failed, eax=00000000, ecx=00000000
>> vmt0: failed to send TCLO outgoing ping
>> vmware: sending length failed, eax=00000000, ecx=00000000
>> vmt0: failed to send TCLO outgoing ping
>>
>>
>> My /var/log/messages* files have that pair of error messages in them
>> over 16,000 times.
>>
>> I will go through and try what has been suggested, starting with
>> changing the guest OS type. Unfortunately it appears it can be days
>> apart when this problem occurs. I'll send an update when I have
>> something more concrete.
>>
>> If anyone would like to try recreating this problem on their ESXi host
>> I'll make a .tar.gz of this vm guest for you to download.
>>
>> Thanks again.
>>
>> -Gene
>>
>>
>> On Wed, Oct 19, 2011 at 8:23 PM, Gene <gh5...@gmail.com> wrote:
>>> I haven't been able to reproduce the problem since this morning.
>>> Nothing has been changed on the vmhosts so I'm at a bit of a loss at
>>> the moment.
>>>
>>> When the issue reoccurs I'll try everything that has been suggested today.
>>>
>>> Thank you very much for your help everyone.
>>>
>>> -Gene
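
For anyone wanting to repeat the load test described in the Oct 23 message (running the find in a loop for a fixed time, then checking for the vmware errors), a rough sketch in plain sh follows. The function names run_for and count_failures are made up for illustration and are not part of OpenBSD or VMware tooling. It also assumes that date +%s is available, which is a common extension rather than strict POSIX.

```shell
# Hypothetical helpers, not from the original thread.

# run_for SECONDS COMMAND [ARGS...]
# Rerun COMMAND until SECONDS have elapsed, then print how many runs finished.
# Assumes `date +%s` prints seconds since the epoch.
run_for() {
    limit=$1
    shift
    start=$(date +%s)
    runs=0
    while [ $(( $(date +%s) - start )) -lt "$limit" ]; do
        # Discard output; only the load matters. Ignore a non-zero exit
        # (find often reports unreadable paths) so the loop keeps going.
        "$@" >/dev/null 2>&1 || :
        runs=$((runs + 1))
    done
    echo "$runs"
}

# count_failures
# Read log text on stdin and print the number of lines containing the
# error string quoted in the thread.
count_failures() {
    # grep -c exits 1 when the count is 0; mask that so callers see success.
    grep -c 'vmware: sending length failed' || :
}
```

Something like `run_for 1800 find / -type f -exec grep -i moo {} \;` followed by `dmesg | count_failures` would approximate the 30-minute test. `cat /var/log/messages | count_failures` counts the logged occurrences in the current file; rotated copies may be compressed and would then need zcat instead of cat.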