On Wednesday, 14 June 2023 Simon Rowe wrote:
> We've also seen a handful of similar reports. Again, just the MBR sector
> overwritten by what looks to be guest data (e.g. log messages). The
> common thread with our incidents is again a SATA disk under the AHCI
> controller, we have a network backend (iSCSI) which has experienced a
> failure.
>
> I've tried to repro this with blkdebug and simulated write errors,
> without success.

I’ve finally had some success in reproducing this issue. I have a test
environment set up as follows:

* QEMU 4.2
* guest booting from CD with a small SATA disk
* guest test harness partitions the disk then continually writes data to the
  partition while checking the integrity of the MBR
* filter script that interposes between QEMU and the iSCSI backend; this drops
  writes and then resets the connection after a period of time

From tracing in the filter script I can see unsolicited writes to LBA 0 once
the SATA controller is reset:

Data in: iSCSI op 01 SCSI op 28 LBA 0 NOP count 5 wait for read False
Data in: iSCSI op 01 SCSI op 28 LBA 0 NOP count 6 wait for read False
Data in: iSCSI op 01 SCSI op 2a LBA 0 NOP count 0 wait for read True
Data in: iSCSI op 01 SCSI op 28 LBA 0 NOP count 0 wait for read False

I have a stack trace at the time that the write occurs:

#0  iscsi_co_writev (bs=0x564322ecc220, sector_num=<optimized out>,
    nb_sectors=1, iov=0x7fc20c045860, flags=<optimized out>)
    at block/iscsi.c:641
#1  0x00005643220e780b in bdrv_driver_pwritev (bs=bs@entry=0x564322ecc220,
    offset=offset@entry=0, bytes=bytes@entry=512,
    qiov=qiov@entry=0x7fc20c045860, qiov_offset=qiov_offset@entry=0,
    flags=flags@entry=0) at block/io.c:1216
#2  0x00005643220e9985 in bdrv_aligned_pwritev (
    child=child@entry=0x564322ecb050, req=req@entry=0x7fc2aa90bb00, offset=0,
    bytes=512, align=align@entry=512, qiov=0x7fc20c045860, qiov_offset=0,
    flags=flags@entry=0) at block/io.c:1980
#3  0x00005643220ea25b in bdrv_co_pwritev_part (child=0x564322ecb050,
    offset=offset@entry=0, bytes=bytes@entry=512,
    qiov=qiov@entry=0x7fc20c045860, qiov_offset=qiov_offset@entry=0, flags=0)
    at block/io.c:2137
#4  0x00005643220ea55b in bdrv_co_pwritev (child=<optimized out>,
    offset=offset@entry=0, bytes=bytes@entry=512,
    qiov=qiov@entry=0x7fc20c045860, flags=<optimized out>)
    at block/io.c:2087
#5  0x00005643220aa64d in raw_co_pwritev (bs=0x564322ec4a00, offset=0,
    bytes=512, qiov=0x7fc20c045860, flags=<optimized out>)
    at block/raw-format.c:258
#6  0x00005643220e7702 in bdrv_driver_pwritev (bs=bs@entry=0x564322ec4a00,
    offset=offset@entry=0, bytes=bytes@entry=512,
    qiov=qiov@entry=0x7fc20c045860, qiov_offset=qiov_offset@entry=0,
    flags=flags@entry=0) at block/io.c:1183
#7  0x00005643220e9985 in bdrv_aligned_pwritev (
    child=child@entry=0x564322ed28c0, req=req@entry=0x7fc2aa90be70, offset=0,
    bytes=512, align=align@entry=1, qiov=0x7fc20c045860, qiov_offset=0,
    flags=flags@entry=0) at block/io.c:1980
#8  0x00005643220ea25b in bdrv_co_pwritev_part (child=0x564322ed28c0,
    offset=offset@entry=0, bytes=bytes@entry=512,
    qiov=qiov@entry=0x7fc20c045860, qiov_offset=qiov_offset@entry=0, flags=0)
    at block/io.c:2137
#9  0x00005643220d63b4 in blk_do_pwritev_part (blk=0x564322ec4570, offset=0,
    bytes=512, qiov=0x7fc20c045860, qiov_offset=qiov_offset@entry=0,
    flags=<optimized out>) at block/block-backend.c:1231
#10 0x00005643220d650d in blk_aio_write_entry (opaque=0x7fc20c045520)
    at block/block-backend.c:1439
#11 0x000056432218706a in coroutine_trampoline (i0=<optimized out>,
    i1=<optimized out>) at util/coroutine-ucontext.c:115
#12 0x00007fc2afa20190 in ?? () from /lib64/libc.so.6
#13 0x00007fc2b3e01aa0 in ?? ()
#14 0x0000000000000000 in ?? ()

I’m not familiar with QEMU’s storage code. Any suggestions on how to proceed
with debugging this?

Regards
Simon
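
For reference when reading the filter trace in this message: SCSI opcode 0x28 is READ(10) and 0x2a is WRITE(10), so the third trace line is the unsolicited write to LBA 0. Below is a minimal sketch of how such trace lines could be scanned for writes to LBA 0. It is not the filter script used in the test. The function name find_lba0_writes is hypothetical, and the sketch assumes the LBA field is printed in decimal, which the trace does not confirm.

```python
import re

# SCSI opcodes from the block command set; only the two seen in the trace.
SCSI_OPS = {0x28: "READ(10)", 0x2A: "WRITE(10)"}

# Line layout copied from the filter-script output quoted in the message.
# Assumption: the LBA field is decimal (the trace only ever shows 0).
LINE_RE = re.compile(
    r"iSCSI op (?P<iscsi>[0-9a-f]{2}) "
    r"SCSI op (?P<scsi>[0-9a-f]{2}) "
    r"LBA (?P<lba>\d+)"
)


def find_lba0_writes(lines):
    """Return (line_index, decoded_op) for every WRITE(10) aimed at LBA 0."""
    hits = []
    for idx, line in enumerate(lines):
        match = LINE_RE.search(line)
        if match is None:
            continue  # not a trace line in the expected layout
        opcode = int(match.group("scsi"), 16)
        if opcode == 0x2A and int(match.group("lba")) == 0:
            hits.append((idx, SCSI_OPS[opcode]))
    return hits


trace = [
    "Data in: iSCSI op 01 SCSI op 28 LBA 0 NOP count 5 wait for read False",
    "Data in: iSCSI op 01 SCSI op 28 LBA 0 NOP count 6 wait for read False",
    "Data in: iSCSI op 01 SCSI op 2a LBA 0 NOP count 0 wait for read True",
    "Data in: iSCSI op 01 SCSI op 28 LBA 0 NOP count 0 wait for read False",
]

print(find_lba0_writes(trace))  # [(2, 'WRITE(10)')]
```

Matching on the opcode rather than on the "wait for read" flag keeps the check independent of the filter script's internal state, whose meaning is not described in the message.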