On Wed, Apr 07, 2021 at 07:47:28PM -0400, Dave Voutila wrote:
>
> Thomas L. writes:
>
> >> > Thomas: I looked at your host dmesg and your provided vm.conf. It
> >> > looks like 11 VMs with the default 512M memory and one (minecraft)
> >> > with 8G. Your host seems to have only 16GB of memory, some of which
> >> > is probably unavailable as it's used by the integrated gpu. I'm
> >> > wondering if you are effectively oversubscribing your memory here.
> >> >
> >> > I know we currently don't support swapping guest memory out, but not
> >> > sure what happens if we don't have the physical memory to fault a
> >> > page in and wire it.
> >> >
> >>
> >> Something else gets swapped out.
> >
> > Wire == Can't swap out?
>
> Yes.
>
> > top shows 15G real memory available. That should be enough (8G + 11 *
> > 0.5G = 13.5G), or is this inherently risky with 6.8?
>
> With 6.8, the guests might have memory swapped out and worst case you'll
> see some performance issues. That shouldn't cause unexpected
> termination.
>

Depends on the exact content that got swapped out (as we didn't handle
TLB flushes correctly), so a crash was certainly a possibility. That's why
I wanted to see the VMM_DEBUG output.

In any case, Thomas should try -current and see if this problem is even
reproducible.

-ml

> > I can try -current as suggested in the other mail. Is this a likely
> > cause or should I run with VMM_DEBUG for further investigation? Is
> > "somewhat slower" from VMM_DEBUG still usable? I don't need full
> > performance, but ~month downtime until the problem shows again would be
> > too much.
>
> A fix is more likely to land in -current if an issue can be
> identified. Since the issue doesn't sound like it's easily reproducible
> yet, VMM_DEBUG is the best bet for having the information you'd need to
> share when the issue occurs.
>
> >> > Even without a custom kernel with VMM_DEBUG, if it's a uvm_fault
> >> > issue you should see a message in the kernel buffer. Something like:
> >> >
> >> >   vmx_fault_page: uvm_fault returns N, GPA=0x...., rip=0x....
> >> >
> >> > mlarkin: thoughts on my hypothesis? Am I wildly off course?
> >> >
> >> > -dv
> >> >
> >>
> >> Yeah I was trying to catch the big dump when a VM resets. That would
> >> tell us if the vm caused the reset or if vmd(8) crashed for some
> >> reason.
> >
> > But if vmd crashed, it wouldn't restart automatically, would it?
> > All VMs down from vmd crashing would have been noticed.
> > That kernel message would have shown in the dmesg too, wouldn't it?
> >
>
> There are multiple factors. First is vmd(8) is multi-process and a vm's
> process can die without impacting others. Second is the vcpu could be
> reset making the guest "reboot." There are numerous reasons these things
> could happen, hence needing debug logging.
>
> -dv
>
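
The memory arithmetic from the thread (one 8G guest plus eleven guests at the
default 512M, against the 15G that top reports) can be checked with a short
sketch. The function names are hypothetical and are not part of vmd(8) or
vmctl(8); the figures are the ones quoted above, and the result says nothing
about memory the host itself needs.

```python
# Minimal sketch of the guest memory budget discussed in the thread.
# Function names are hypothetical; the sizes come from the quoted vm.conf summary.

def guest_memory_total_mib(guest_sizes_mib):
    """Sum the configured memory of all guests, in MiB."""
    return sum(guest_sizes_mib)


def headroom_mib(host_available_mib, guest_sizes_mib):
    """Host memory left over if every guest page were resident, in MiB.

    A negative result means the guests are oversubscribed.
    """
    return host_available_mib - guest_memory_total_mib(guest_sizes_mib)


# One 8G guest (minecraft) plus eleven guests at the default 512M.
guests = [8 * 1024] + [512] * 11

total = guest_memory_total_mib(guests)   # 8192 + 11 * 512 MiB
left = headroom_mib(15 * 1024, guests)   # against the 15G shown by top

print(f"guests: {total / 1024:.1f}G, headroom: {left / 1024:.1f}G")
```

With these numbers the guests fit on paper, but the margin is small, and it
shrinks further if part of the 16GB is reserved for the integrated GPU as
suggested earlier in the thread.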

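To check a saved copy of the kernel message buffer for the uvm_fault message
mentioned above, something like the following could be used. It is only a
sketch: the pattern is built from the message shape quoted in the thread
(vmx_fault_page: uvm_fault returns N, GPA=0x...., rip=0x....), and it assumes
N is printed as a decimal integer and the addresses as hex. The sample line is
made up for illustration and is not real output.

```python
import re

# Pattern derived from the message shape quoted in the thread. The exact
# formatting of N (decimal) and of the addresses (hex) is an assumption.
FAULT_RE = re.compile(
    r"vmx_fault_page: uvm_fault returns (-?\d+), "
    r"GPA=(0x[0-9a-fA-F]+), rip=(0x[0-9a-fA-F]+)"
)


def find_vmx_faults(dmesg_text):
    """Return (error, gpa, rip) tuples for each matching line in dmesg_text."""
    return [
        (int(m.group(1)), int(m.group(2), 16), int(m.group(3), 16))
        for m in FAULT_RE.finditer(dmesg_text)
    ]


# Made-up sample text, for illustration only.
sample = (
    "vmm0 at mainbus0: VMX/EPT\n"
    "vmx_fault_page: uvm_fault returns 14, GPA=0x1000, rip=0xffffffff81000000\n"
)

for error, gpa, rip in find_vmx_faults(sample):
    print(f"uvm_fault error {error} at GPA {gpa:#x}, rip {rip:#x}")
```

An empty result only means no such message was found in the text given; it
does not rule out a vcpu reset or a crashed vm process, which is why the
VMM_DEBUG output is still the more useful data.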