On Mon, Jun 24, 2024 at 6:06 PM Thomas Huth <th...@redhat.com> wrote:

> On 10/06/2024 19.02, Paolo Bonzini wrote:
> > From: Hyman Huang <yong.hu...@smartx.com>
> >
> > For VMs configured with the USB CDROM device:
> >
> > -drive
> file=/path/to/local/file,id=drive-usb-disk0,media=cdrom,readonly=on...
> > -device usb-storage,drive=drive-usb-disk0,id=usb-disk0...
> >
> > QEMU process may crash after live migration, to reproduce the issue,
> > configure VM (Guest OS ubuntu 20.04 or 21.10) with the following XML:
> >
> > <disk type='file' device='cdrom'>
> >    <driver name='qemu' type='raw'/>
> >    <source file='/path/to/share_fs/cdrom.iso'/>
> >    <target dev='sda' bus='usb'/>
> >    <readonly/>
> >    <address type='usb' bus='0' port='2'/>
> > </disk>
> > <controller type='usb' index='0' model='piix3-uhci'/>
> >
> > Do the live migration repeatedly, crash may happen after live migratoin,
> > trace log at the source before live migration is as follows:
> >
> > 324808@1711972823.521945:usb_uhci_frame_start nr 319
> > 324808@1711972823.521978:usb_uhci_qh_load qh 0x35cb5400
> > 324808@1711972823.521989:usb_uhci_qh_load qh 0x35cb5480
> > 324808@1711972823.521997:usb_uhci_td_load qh 0x35cb5480, td 0x35cbe000,
> ctrl 0x0, token 0xffe07f69
> > 324808@1711972823.522010:usb_uhci_td_nextqh qh 0x35cb5480, td 0x35cbe000
> > 324808@1711972823.522022:usb_uhci_qh_load qh 0x35cb5680
> > 324808@1711972823.522030:usb_uhci_td_load qh 0x35cb5680, td 0x75ac5180,
> ctrl 0x19800000, token 0x3c903e1
> > 324808@1711972823.522045:usb_uhci_packet_add token 0x103e1, td
> 0x75ac5180
> > 324808@1711972823.522056:usb_packet_state_change bus 0, port 2, ep 2,
> packet 0x559f9ba14b00, state undef -> setup
> > 324808@1711972823.522079:usb_msd_cmd_submit lun 0, tag 0x472, flags
> 0x00000080, len 10, data-len 8
> > 324808@1711972823.522107:scsi_req_parsed target 0 lun 0 tag 1138
> command 74 dir 1 length 8
> > 324808@1711972823.522124:scsi_req_parsed_lba target 0 lun 0 tag 1138
> command 74 lba 4096
> > 324808@1711972823.522139:scsi_req_alloc target 0 lun 0 tag 1138
> > 324808@1711972823.522169:scsi_req_continue target 0 lun 0 tag 1138
> > 324808@1711972823.522181:scsi_req_data target 0 lun 0 tag 1138 len 8
> > 324808@1711972823.522194:usb_packet_state_change bus 0, port 2, ep 2,
> packet 0x559f9ba14b00, state setup -> complete
> > 324808@1711972823.522209:usb_uhci_packet_complete_success token
> 0x103e1, td 0x75ac5180
> > 324808@1711972823.522219:usb_uhci_packet_del token 0x103e1, td
> 0x75ac5180
> > 324808@1711972823.522232:usb_uhci_td_complete qh 0x35cb5680, td
> 0x75ac5180
> >
> > trace log at the destination after live migration is as follows:
> >
> > 3286206@1711972823.951646:usb_uhci_frame_start nr 320
> > 3286206@1711972823.951663:usb_uhci_qh_load qh 0x35cb5100
> > 3286206@1711972823.951671:usb_uhci_qh_load qh 0x35cb5480
> > 3286206@1711972823.951680:usb_uhci_td_load qh 0x35cb5480, td
> 0x35cbe000, ctrl 0x1000000, token 0xffe07f69
> > 3286206@1711972823.951693:usb_uhci_td_nextqh qh 0x35cb5480, td
> 0x35cbe000
> > 3286206@1711972823.951702:usb_uhci_qh_load qh 0x35cb5700
> > 3286206@1711972823.951709:usb_uhci_td_load qh 0x35cb5700, td
> 0x75ac5240, ctrl 0x39800000, token 0xe08369
> > 3286206@1711972823.951727:usb_uhci_queue_add token 0x8369
> > 3286206@1711972823.951735:usb_uhci_packet_add token 0x8369, td
> 0x75ac5240
> > 3286206@1711972823.951746:usb_packet_state_change bus 0, port 2, ep 1,
> packet 0x56066b2fb5a0, state undef -> setup
> > 3286206@1711972823.951766:usb_msd_data_in 8/8 (scsi 8)
> > 2024-04-01 12:00:24.665+0000: shutting down, reason=crashed
> >
> > The backtrace reveals the following:
> >
> > Program terminated with signal SIGSEGV, Segmentation fault.
> > 0  __memmove_sse2_unaligned_erms () at
> ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:312
> > 312        movq    -8(%rsi,%rdx), %rcx
> > [Current thread is 1 (Thread 0x7f0a9025fc00 (LWP 3286206))]
> > (gdb) bt
> > 0  __memmove_sse2_unaligned_erms () at
> ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:312
> > 1  memcpy (__len=8, __src=<optimized out>, __dest=<optimized out>) at
> /usr/include/bits/string_fortified.h:34
> > 2  iov_from_buf_full (iov=<optimized out>, iov_cnt=<optimized out>,
> offset=<optimized out>, buf=0x0, bytes=bytes@entry=8) at ../util/iov.c:33
> > 3  iov_from_buf (bytes=8, buf=<optimized out>, offset=<optimized out>,
> iov_cnt=<optimized out>, iov=<optimized out>)
> >     at
> /usr/src/debug/qemu-6-6.2.0-75.7.oe1.smartx.git.40.x86_64/include/qemu/iov.h:49
> > 4  usb_packet_copy (p=p@entry=0x56066b2fb5a0, ptr=<optimized out>,
> bytes=bytes@entry=8) at ../hw/usb/core.c:636
> > 5  usb_msd_copy_data (s=s@entry=0x56066c62c770, p=p@entry=0x56066b2fb5a0)
> at ../hw/usb/dev-storage.c:186
> > 6  usb_msd_handle_data (dev=0x56066c62c770, p=0x56066b2fb5a0) at
> ../hw/usb/dev-storage.c:496
> > 7  usb_handle_packet (dev=0x56066c62c770, p=p@entry=0x56066b2fb5a0) at
> ../hw/usb/core.c:455
> > 8  uhci_handle_td (s=s@entry=0x56066bd5f210, q=0x56066bb7fbd0, q@entry=0x0,
> qh_addr=qh_addr@entry=902518530, td=td@entry=0x7fffe6e788f0,
> td_addr=<optimized out>,
> >     int_mask=int_mask@entry=0x7fffe6e788e4) at ../hw/usb/hcd-uhci.c:885
> > 9  uhci_process_frame (s=s@entry=0x56066bd5f210) at
> ../hw/usb/hcd-uhci.c:1061
> > 10 uhci_frame_timer (opaque=opaque@entry=0x56066bd5f210) at
> ../hw/usb/hcd-uhci.c:1159
> > 11 timerlist_run_timers (timer_list=0x56066af26bd0) at
> ../util/qemu-timer.c:642
> > 12 qemu_clock_run_timers (type=QEMU_CLOCK_VIRTUAL) at
> ../util/qemu-timer.c:656
> > 13 qemu_clock_run_all_timers () at ../util/qemu-timer.c:738
> > 14 main_loop_wait (nonblocking=nonblocking@entry=0) at
> ../util/main-loop.c:542
> > 15 qemu_main_loop () at ../softmmu/runstate.c:739
> > 16 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized
> out>) at ../softmmu/main.c:52
> > (gdb) frame 5
> > (gdb) p ((SCSIDiskReq *)s->req)->iov
> > $1 = {iov_base = 0x0, iov_len = 0}
> > (gdb) p/x s->req->tag
> > $2 = 0x472
> >
> > When designing the USB mass storage device model, QEMU places SCSI disk
> > device as the backend of USB mass storage device. In addition, USB mass
> > device driver in Guest OS conforms to the "Universal Serial Bus Mass
> > Storage Class Bulk-Only Transport" specification in order to simulate
> > the transform behavior between a USB controller and a USB mass device.
> > The following shows the protocol hierarchy:
> >
> >                        +----------------+
> >   CDROM driver         |  scsi command  |        CDROM
> >                        +----------------+
> >
> >                     +-----------------------+
> >   USB mass          | USB Mass Storage Class|    USB mass
> >   storage driver    | Bulk-Only Transport   |    storage device
> >                     +-----------------------+
> >
> >                        +----------------+
> >   USB Controller       |  USB Protocol  |        USB device
> >                        +----------------+
> >
> > In the USB protocol layer, between the USB controller and USB device, at
> > least two USB packets will be transformed when guest OS send a
> > read operation to USB mass storage device:
> >
> > 1. The CBW packet, which will be delivered to the USB device's Bulk-Out
> > endpoint. In order to simulate a read operation, the USB mass storage
> > device parses the CBW and converts it to a SCSI command, which would be
> > executed by CDROM(represented as SCSI disk in QEMU internally), and store
> > the result data of the SCSI command in a buffer.
> >
> > 2. The DATA-IN packet, which will be delivered from the USB device's
> > Bulk-In endpoint(fetched directly from the preceding buffer) to the USB
> > controller.
> >
> > We consider UHCI to be the controller. The two packets mentioned above
> may
> > have been processed by UHCI in two separate frame entries of the Frame
> List
> > , and also described by two different TDs. Unlike the physical
> environment,
> > a virtualized environment requires the QEMU to make sure that the result
> > data of CBW is not lost and is delivered to the UHCI controller.
> >
> > Currently, these types of SCSI requests are not migrated, so QEMU cannot
> > ensure the result data of the IO operation is not lost if there are
> > inflight emulated SCSI requests during the live migration.
> >
> > Assume for the moment that the USB mass storage device is processing the
> > CBW and storing the result data of the read operation to a buffre, live
> > migration happens and moves the VM to the destination while not migrating
> > the result data of the read operation.
> >
> > After migration, when UHCI at the destination issues a DATA-IN request to
> > the USB mass storage device, a crash happens because USB mass storage
> device
> > fetches the result data and get nothing.
> >
> > The scenario this patch addresses is this one.
> >
> > Theoretically, any device that uses the SCSI disk as a back-end would be
> > affected by this issue. In this case, it is the USB CDROM.
> >
> > To fix it, inflight emulated SCSI request be migrated during live
> migration,
> > similar to the DMA SCSI request.
> >
> > Signed-off-by: Hyman Huang <yong.hu...@smartx.com>
> > Message-ID: <
> 878c8f093f3fc2f584b5c31cb2490d9f6a12131a.1716531409.git.yong.hu...@smartx.com
> >
> > [Do not bump migration version, introduce compat property instead. -
> Paolo]
> > Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
> > ---
> >   hw/core/machine.c   |  1 +
> >   hw/scsi/scsi-disk.c | 24 +++++++++++++++++++++++-
> >   2 files changed, 24 insertions(+), 1 deletion(-)
> >
> > diff --git a/hw/core/machine.c b/hw/core/machine.c
> > index c93d2492443..655d75c21fc 100644
> > --- a/hw/core/machine.c
> > +++ b/hw/core/machine.c
> > @@ -36,6 +36,7 @@
> >
> >   GlobalProperty hw_compat_9_0[] = {
> >       {"arm-cpu", "backcompat-cntfrq", "true" },
> > +    {"scsi-disk-base", "migrate-emulated-scsi-request", "false" },
> >       {"vfio-pci", "skip-vsc-check", "false" },
> >   };
>
>   Hi Paolo, hi Hyman,
>
> this patch introduced a problem with device introspection on older machine
> types. Running "make check-qtest SPEED=slow" is now failing for older
> machine types.
>
> Or if you want to reproduce it manually:
>
>   $ ./qemu-system-x86_64 -M pc-q35-8.0 -monitor stdio
>   QEMU 9.0.50 monitor - type 'help' for more information
>   (qemu) device_add scsi-block,help
>   Unexpected error in object_property_find_err() at
>   ../../devel/qemu/qom/object.c:1357:
>   can't apply global scsi-disk-base.migrate-emulated-scsi-request=false:
>   Property 'scsi-block.migrate-emulated-scsi-request' not found
>   Aborted (core dumped)
>
> I think the problem is that the property is only added to certain SCSI
> devices via the DEFINE_SCSI_DISK_PROPERTIES macro, but "scsi-block" device
> does not use this macro and thus does not have this property. So when the
> compat code tries to set this property for "scsi-block" (which is also
> derived from "scsi-disk-base"), it fails, of course.
>
"scsi-block" appears to handle two different types of devices:
passthrough-ed block devices and scsi-block devices.

Since each of them is capable of supporting the load/save scsi request,
and this patchset does not address any problems for them.

So, IMHO, The "scsi-disk-base" class may not be the best place for the
"migrate-emulated-scsi-request" attribute?


>
> Should the "migrate-emulated-scsi-request" property maybe rather be added
> to
> the "scsi-disk-base" class instead? Or should hw_compat_9_0 rather list
> the
> devices that have the property instead of using the parent class? Could
> you
> please have a look?
>
>   Thanks,
>    Thomas
>
>

-- 
Best regards

Reply via email to