Hi,

What QEMU version are you using? I cannot reproduce this with QEMU 7.2. Can you try a newer QEMU?

Cheers,
Stefan

On 25.04.23 at 14:53, Aaron Mason wrote:
Yeah I'm getting the same thing. Trying a build in QEMU and
transferring in to see if that helps. Will report back.


Ok, good news, it still crashes at the same spot, but this time I've
got more data. Copying in tech@ - if I've forgotten anything let me
know and I'll fire up a fresh instance.

[REDACTED]
vioscsi_req_done(e,ffff800000024a00,fffffd803f81c338,e,ffff800000024a00,ffff8000000d3228) at vioscsi_req_done+0x26
[REDACTED]

Ok, so based on the trace I got, I was able to trace the stop itself
back to line 299 of vioscsi.c (thank. you. random relink. And
anonymous CVS):

    293  vioscsi_req_done(struct vioscsi_softc *sc, struct virtio_softc *vsc,
    294      struct vioscsi_req *vr)
    295  {
    296          struct scsi_xfer *xs = vr->vr_xs;
    297          DPRINTF("vioscsi_req_done: enter vr: %p xs: %p\n", vr, xs);
    298
-->299          int isread = !!(xs->flags & SCSI_DATA_IN);
    300          bus_dmamap_sync(vsc->sc_dmat, vr->vr_control,
    301              offsetof(struct vioscsi_req, vr_req),
    302              sizeof(struct virtio_scsi_req_hdr),
    303              BUS_DMASYNC_POSTWRITE);

Maybe if I follow the rabbit hole enough, I might find out what's
going wrong between the driver and OCI. I've got a day off tomorrow
(yay for war I guess), I'll give it a bash and see where we end up.

--
Aaron Mason - Programmer, open source addict
I've taken my software vows - for beta or for worse

I enabled debugging on the vioscsi driver, rebuilt the RAMDISK kernel
with those drivers enabled, and got this:

vioscsi0 at virtio1: qsize 128
scsibus0 at vioscsi0: 255 targets
vioscsi_req_get: 0xfffffd803f80d338
vioscsi_scsi_cmd: enter
vioscsi_scsi_cmd: polling...
vioscsi_scsi_cmd: polling timeout
vioscsi_scsi_cmd: done (timeout=0)
vioscsi_scsi_cmd: enter
vioscsi_scsi_cmd: polling...
vioscsi_vq_done: enter
vioscsi_vq_done: slot=127
vioscsi_req_done: enter vr: 0xfffffd803f80d338 xs: 0xfffffd803f8a5e58
vioscsi_req_done: done 0, 2, 0
vioscsi_vq_done: slot=127
vioscsi_req_done: enter vr: 0xfffffd803f80d338 xs: 0x0
uvm_fault(0xffffffff813ec2e0, 0x8, 0, 1) -> e
fatal page fault in supervisor mode
trap type 6 code 0 rip ffffffff810e6190 cs 8 rflags 10286 cr2 8 cpl e
rsp ffffffff81606670
gsbase 0xffffffff813dfff0  kgsbase 0x0
panic: trap type 6, code=0, pc=ffffffff810e6190

That "xs: 0x0" bit feels like a clue. It should be trivial to pick up
and handle, but what would be the correct way to handle that?
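
For what it's worth, the fault address (cr2 8) fits a NULL "xs": reading xs->flags through a NULL pointer touches address 0 plus the offset of the flags field. A minimal sketch of that arithmetic, assuming a layout with one pointer-sized member ahead of flags (struct mock_xfer and flags_addr are made-up stand-ins, not the real struct scsi_xfer):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in, NOT the real struct scsi_xfer: a single
 * pointer-sized list link followed by the flags word. The layout is
 * an assumption made only to illustrate the arithmetic.
 */
struct mock_xfer {
	struct mock_xfer *link;
	int flags;
};

/* With this layout, flags sits exactly one pointer past the start. */
static_assert(offsetof(struct mock_xfer, flags) == sizeof(void *),
    "flags follows the pointer-sized link");

/*
 * Address that reading xs->flags would touch, computed without
 * actually dereferencing xs.
 */
static uintptr_t
flags_addr(const struct mock_xfer *xs)
{
	return (uintptr_t)xs + offsetof(struct mock_xfer, flags);
}
```

On an LP64 machine, flags_addr(NULL) comes out as 8 under this assumed layout, which would match both the uvm_fault address and cr2 in the panic above.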

If I have it return when "xs" is NULL, it continues - the debugging
output suggests it goes through each possible target before finishing
up. I don't know if that's correct, but it carries on booting after
that, even though my test kernel didn't detect the drive (I used the
RAMDISK kernel, which is pretty stripped down).
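
To make the idea concrete, here is a rough userland sketch of the guard, replaying the two completions for the same request seen in the debug output. Everything in it (mock_xfer, mock_req, mock_req_done, replay_log) is a hypothetical stand-in rather than the driver's code, and it assumes the request's transfer pointer is cleared once the first completion has been handled, which is what the "xs: 0x0" on the second pass suggests:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's transfer and request types. */
struct mock_xfer {
	int flags;
	int done;
};

struct mock_req {
	struct mock_xfer *vr_xs;
};

/*
 * Completion handler with the NULL guard.
 * Returns 1 if a transfer was completed, 0 if the call was ignored.
 */
static int
mock_req_done(struct mock_req *vr)
{
	struct mock_xfer *xs = vr->vr_xs;

	if (xs == NULL)
		return 0;	/* duplicate or stale completion: nothing to finish */

	xs->done = 1;
	vr->vr_xs = NULL;	/* assumption: the request drops its transfer here */
	return 1;
}

/* Replay the log: the same request is reported done twice. */
static int
replay_log(void)
{
	struct mock_xfer xs = { 0, 0 };
	struct mock_req vr = { &xs };
	int handled = 0;

	handled += mock_req_done(&vr);	/* first pass: xs is valid */
	handled += mock_req_done(&vr);	/* second pass: xs is NULL, ignored */

	assert(xs.done == 1);
	return handled;
}
```

Here replay_log() counts one handled completion and the second call returns without touching the NULL pointer. This only shows that the guard avoids the fault; it doesn't explain why slot 127 is reported twice in the first place.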

I'm about to attempt a -STABLE build (I've got 7.3 installed and thus
can't yet build a snapshot, but I will do that if this test succeeds)
- here's the patch that hopefully fixes the problem. (and hopefully
gmail doesn't clobber the tabs)

Index: sys/dev/pv/vioscsi.c
===================================================================
RCS file: /cvs/src/sys/dev/pv/vioscsi.c,v
retrieving revision 1.30
diff -u -p -u -p -r1.30 vioscsi.c
--- sys/dev/pv/vioscsi.c 16 Apr 2022 19:19:59 -0000 1.30
+++ sys/dev/pv/vioscsi.c 25 Apr 2023 12:51:16 -0000
@@ -296,6 +296,7 @@ vioscsi_req_done(struct vioscsi_softc *s
   struct scsi_xfer *xs = vr->vr_xs;
   DPRINTF("vioscsi_req_done: enter vr: %p xs: %p\n", vr, xs);

+ if (xs == NULL) return;
   int isread = !!(xs->flags & SCSI_DATA_IN);
   bus_dmamap_sync(vsc->sc_dmat, vr->vr_control,
       offsetof(struct vioscsi_req, vr_req),


