On Wed, Oct 26, 2016 at 06:36:25PM -0500, Ax0n wrote:
> I'm running vmd with the options you specified, and using tee(1) to peel it
> off to a file while I can still watch what happens in the foreground. It
> hasn't happened again yet, but I haven't been messing with the VMs as much
> this week as I was over the weekend.
> 
> One thing of interest: inside the VM running the Oct 22 snapshot, top(1)
> reports the load average hovering over 1.0, with nearly 100% of CPU time in
> interrupt state, which seems pretty odd to me. I am also running an i386 vm
> and an amd64 vm at the same time, both on 6.0-RELEASE, and neither of them
> is exhibiting this high load. I'll probably update the snapshot of the
> -CURRENT(ish) VM tonight, and the snapshot of my host system (which is also
> my daily driver) this weekend.
> 

I've seen that (and have seen it reported) from time to time as well. It is
unlikely that this is time actually being spent in interrupt; more likely it's
a time-accounting error that makes the guest think it is spending more time
servicing interrupts than it really is. This happens because both the
statclock and the hardclock are running at 100Hz (or close to it), since the
host is unable to inject interrupts any more frequently than that.
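To illustrate the effect, here is a small toy simulation (not OpenBSD code;
the handler cost and the sampling model are invented purely for illustration).
With a 0.5% real interrupt load, sampling only at the instant the tick is
injected reports ~100% interrupt time, while sampling at random offsets within
the tick would report roughly the real figure:

/*
 * Toy simulation of the sampling artifact described above.  Nothing here
 * is real kernel code; HANDLER_US is an invented per-tick handler cost.
 */
#include <stdio.h>
#include <stdlib.h>

#define NTICKS		100000	/* number of simulated ticks */
#define TICK_US		10000	/* 100Hz -> 10000us per tick */
#define HANDLER_US	50	/* invented time spent in the clock handler */

int
main(void)
{
	long aligned_hits = 0, random_hits = 0;
	int t;

	srand(1);
	for (t = 0; t < NTICKS; t++) {
		/*
		 * Sample taken exactly when the host injects the tick: the
		 * guest is always inside its clock interrupt handler at that
		 * instant, so every sample is charged to interrupt time.
		 */
		aligned_hits++;
		/*
		 * Hypothetical sample taken at a random offset within the
		 * tick: it only lands in the handler HANDLER_US/TICK_US of
		 * the time.
		 */
		if (rand() % TICK_US < HANDLER_US)
			random_hits++;
	}

	printf("tick-aligned sampling reports %.1f%% interrupt time\n",
	    100.0 * aligned_hits / NTICKS);
	printf("random-offset sampling reports %.1f%% interrupt time\n",
	    100.0 * random_hits / NTICKS);
	return 0;
}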

You might try running the host at 1000Hz and see if that fixes the problem;
it did for me. Note that such an adjustment is really a hack and should only
be viewed as a temporary workaround. And of course don't run your guests at
1000Hz as well, since that would defeat the purpose of cranking the host.
That parameter can be adjusted in param.c.
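Roughly, the adjustment looks like this (a hedged sketch only: the exact
contents of sys/conf/param.c differ between releases, so check the file in the
tree you are building rather than pasting this in). The change goes on the
host kernel only, followed by a kernel rebuild and reboot:

/*
 * Approximate excerpt of sys/conf/param.c, shown only to indicate where
 * the knob lives -- verify against the actual file in your source tree.
 */
#ifndef HZ
#define	HZ	1000		/* default is 100 */
#endif
int	hz = HZ;
int	tick = 1000000 / HZ;	/* microseconds per hardclock tick */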

-ml

> load averages:  1.07,  1.09,  0.94    vmmbsd.labs.h-i-r.net 05:05:27
> 26 processes: 1 running, 24 idle, 1 on processor    up  0:28
> CPU states:  0.0% user,  0.0% nice,  0.4% system, 99.6% interrupt,  0.0% idle
> Memory: Real: 21M/130M act/tot Free: 355M Cache: 74M Swap: 0K/63M
> 
>   PID USERNAME PRI NICE  SIZE   RES STATE     WAIT      TIME    CPU COMMAND
>     1 root      10    0  420K  496K idle      wait      0:01  0.00% init
> 13415 _ntp       2  -20  888K 2428K sleep     poll      0:00  0.00% ntpd
> 15850 axon       3    0  724K  760K sleep     ttyin     0:00  0.00% ksh
> 42990 _syslogd   2    0  972K 1468K sleep     kqread    0:00  0.00% syslogd
> 89057 _pflogd    4    0  672K  424K sleep     bpf       0:00  0.00% pflogd
>  2894 root       2    0  948K 3160K sleep     poll      0:00  0.00% sshd
> 85054 _ntp       2    0  668K 2316K idle      poll      0:00  0.00% ntpd
> 
> 
> 
> On Tue, Oct 25, 2016 at 2:09 AM, Mike Larkin <mlar...@azathoth.net> wrote:
> 
> > On Mon, Oct 24, 2016 at 11:07:32PM -0500, Ax0n wrote:
> > > Thanks for the update, ml.
> > >
> > > The VM just did it again in the middle of backspacing over uname -a...
> > >
> > > $ uname -a
> > > OpenBSD vmmbsd.labs.h-i-r.net 6.0 GENERIC.MP#0 amd64
> > > $ un   <-- frozen
> > >
> > > Spinning like mad.
> > >
> >
> > Bizarre. If it were I, I'd next try killing all the vmd processes,
> > running vmd -dvvv from a root console window, and looking for what it
> > dumps out when it hangs like this (if anything).
> >
> > You'll see a fair number of "vmd: unknown exit code 1" (and 48) messages;
> > those are harmless and can be ignored, as can anything that vmd dumps out
> > before the vm gets stuck like this.
> >
> > If you capture this and post it somewhere, I can take a look. You may need
> > to extract the content out of /var/log/messages if a bunch gets printed.
> >
> > If this fails to diagnose what happens, I can work with you off-list on
> > how to debug further.
> >
> > -ml
> >
> > > [axon@transient ~]$ vmctl status
> > >    ID   PID VCPUS    MAXMEM    CURMEM              TTY NAME
> > >     2  2769     1     512MB     149MB       /dev/ttyp3 -c
> > >     1 48245     1     512MB     211MB       /dev/ttyp0 obsdvmm.vm
> > > [axon@transient ~]$ ps aux | grep 48245
> > > _vmd     48245 98.5  2.3 526880 136956 ??  Rp     1:54PM   47:08.30 vmd: obsdvmm.vm (vmd)
> > >
> > > load averages:  2.43,  2.36,  2.26    transient.my.domain 18:29:10
> > > 56 processes: 53 idle, 3 on processor    up  4:35
> > > CPU0 states:  3.8% user,  0.0% nice, 15.4% system,  0.6% interrupt, 80.2% idle
> > > CPU1 states: 15.3% user,  0.0% nice, 49.3% system,  0.0% interrupt, 35.4% idle
> > > CPU2 states:  6.6% user,  0.0% nice, 24.3% system,  0.0% interrupt, 69.1% idle
> > > CPU3 states:  4.7% user,  0.0% nice, 18.1% system,  0.0% interrupt, 77.2% idle
> > > Memory: Real: 1401M/2183M act/tot Free: 3443M Cache: 536M Swap: 0K/4007M
> > >
> > >   PID USERNAME PRI NICE  SIZE   RES STATE     WAIT      TIME    CPU COMMAND
> > > 48245 _vmd      43    0  515M  134M onproc    thrslee  47:37 98.00% vmd
> > >  7234 axon       2    0  737M  715M sleep     poll     33:18 19.14% firefox
> > > 42481 _x11      55    0   16M   42M onproc    -         2:53  9.96% Xorg
> > >  2769 _vmd      29    0  514M   62M idle      thrslee   2:29  9.62% vmd
> > > 13503 axon      10    0  512K 2496K sleep     nanosle   0:52  1.12% wmapm
> > > 76008 axon      10    0  524K 2588K sleep     nanosle   0:10  0.73% wmmon
> > > 57059 axon      10    0  248M  258M sleep     nanosle   0:08  0.34% wmnet
> > > 23088 axon       2    0  580K 2532K sleep     select    0:10  0.00% wmclockmon
> > > 64041 axon       2    0 3752K   10M sleep     poll      0:05  0.00% wmaker
> > > 16919 axon       2    0 7484K   20M sleep     poll      0:04  0.00% xfce4-terminal
> > >     1 root      10    0  408K  460K idle      wait      0:01  0.00% init
> > > 80619 _ntp       2  -20  880K 2480K sleep     poll      0:01  0.00% ntpd
> > >  9014 _pflogd    4    0  672K  408K sleep     bpf       0:01  0.00% pflogd
> > > 58764 root      10    0 2052K 7524K idle      wait      0:01  0.00% slim
> > >
> > >
> > >
> > > On Mon, Oct 24, 2016 at 10:47 PM, Mike Larkin <mlar...@azathoth.net> wrote:
> > >
> > > > On Mon, Oct 24, 2016 at 07:36:48PM -0500, Ax0n wrote:
> > > > > I suppose I'll ask here since it seems on-topic for this thread.
> > > > > Let me know if I shouldn't do this in the future. I've been testing
> > > > > vmm for exactly a week on two different snapshots. I have two VMs:
> > > > > One running the same snapshot (amd64, Oct 22) I'm running on the
> > > > > host, the other running amd64 6.0-RELEASE with no patches of any
> > > > > kind.
> > > > >
> > > > > For some reason, the vm running a recent snapshot locks up
> > > > > occasionally while I'm interacting with it via cu or occasionally
> > > > > ssh. Should I expect a ddb prompt and/or kernel panic messages via
> > > > > the virtualized serial console? Is there some kind of "break"
> > > > > command on the console to get into ddb when it appears to hang? A
> > > > > "No" or "Not yet" on those two questions would suffice if not
> > > > > possible. I know this isn't supported, and appreciate the hard work.
> > > > >
> > > > > Host dmesg:
> > > > > http://stuff.h-i-r.net/2016-10-22.Aspire5733Z.dmesg.txt
> > > > >
> > > > > VM (Oct 22 Snapshot) dmesg:
> > > > > http://stuff.h-i-r.net/2016-10-22.vmm.dmesg.txt
> > > > >
> > > >
> > > > These look fine. Not sure why it would have locked up. Is the
> > > > associated vmd process idle, or spinning like mad?
> > > >
> > > > -ml
> > > >
> > > > > Second:
> > > > > I'm using vm.conf (contents below) to start the aforementioned
> > > > > snapshot vm at boot. There's a "disable" line inside vm.conf to keep
> > > > > one VM from spinning up with vmd. Is there a way to start this one
> > > > > with vmctl aside from passing all the options to vmctl as below?
> > > > >
> > > > > doas vmctl start -c -d OBSD-RELa -i 1 -k /home/axon/obsd/amd64/bsd -m 512M
> > > > >
> > > > > I've tried stuff along the lines of:
> > > > > doas vmctl start OBSD-RELa.vm
> > > > >
> > > > > vm "obsdvmm.vm" {
> > > > >         memory 512M
> > > > >         kernel "bsd"
> > > > >         disk "/home/axon/vmm/OBSD6"
> > > > >         interface tap
> > > > > }
> > > > > vm "OBSD-RELa.vm" {
> > > > >         memory 512M
> > > > >         kernel "/home/axon/obsd/amd64/bsd"
> > > > >         disk "/home/axon/vmm/OBSD-RELa"
> > > > >         interface tap
> > > > >         disable
> > > > > }
> > > > >
> > > >
> > > > I think this is being worked on, but not done yet.
> > > >
> > > > -ml
