On 12/7/23 2:49 PM, Warner Losh wrote:


On Thu, Dec 7, 2023 at 3:38 PM Pete Wright <p...@nomadlogic.org> wrote:



    On 10/13/23 7:34 PM, Warner Losh wrote:
     >

     >
     >     the messages i posted in the start of the thread are from the VM itself
     >     (13.2-RELEASE).  The zpool on the hypervisor (13.2-RELEASE) showed no
     >     such issues.
     >
     >     Based on your comment about the improvements in 14 I'll focus my
     >     efforts on my workstation, it seemed to happen regularly so hopefully
     >     I can find a repro case.
     >
     >
     > Let me know if you see similar messages in stable/14. I think I've fixed
     > all the issues with timeouts, though you shouldn't ever see them in a VM
     > setup unless something else weird is going on.
     >


    Hi Warner, just resurfacing this thread because I've had a few lockups
    on my workstation running 14.0-STABLE.  I was able to capture a photo of
    the hang and this seems to be the most important line:

    nvme0: Resetting controller due to a timeout and possible hot unplug.

    When I scan the device after reboot I don't see any errors, but if there
    is a particular thing I should check via nvmecontrol please let me know.
    Also, since it mentions possible hot unplug I wonder if this is
    hardware/firmware related to my system?

    Anyway, I haven't found a repro case yet, but it has locked up a few times
    in the past two weeks.


What the message means is that (a) we stopped getting interrupts from the device and (b) when we went to check on the status of the device it read back like missing hardware.
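
As a rough illustration of that two-part check, the sketch below classifies a timeout from the value read back from the controller status register (CSTS). It is not the FreeBSD nvme(4) driver code: the function name, the constant name, and the returned strings are assumptions made up for this example. It relies only on two general facts: a read from a PCIe device that is no longer present typically returns all ones, and bit 1 of CSTS is the Controller Fatal Status flag in the NVMe specification.

```python
# Illustrative sketch only; names here are hypothetical, not driver source.

# An all-ones 32-bit read is what a register read usually returns when the
# PCIe device behind it is no longer responding ("missing hardware").
NVME_GONE = 0xFFFFFFFF

# CSTS.CFS (Controller Fatal Status) is bit 1 of the CSTS register.
CSTS_CFS_MASK = 0x2


def classify_timeout(csts: int) -> str:
    """Describe why a controller that stopped raising interrupts gets reset.

    csts is the 32-bit value read from the controller status register
    after a command has timed out.
    """
    if csts == NVME_GONE:
        # The register read back as if no hardware were there.
        return "timeout and possible hot unplug"
    if csts & CSTS_CFS_MASK:
        # The controller itself reports a fatal error.
        return "timeout and fatal error status"
    # The device still answers and reports no fatal error.
    return "timeout"
```

In this sketch, the "possible hot unplug" wording corresponds only to the all-ones case, which is why the message can appear even when nothing was physically removed.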

So is this from inside the VM running under bhyve, or in the host that's hosting the VM? We have different next steps depending on where it is.


OK, awesome, thanks for that context. This is on a bare-metal workstation.

-pete


--
Pete Wright
p...@nomadlogic.org
