Ping On Mon, Apr 8, 2024 at 8:08 PM Hyman Huang <yong.hu...@smartx.com> wrote:
> When configuring VMs with the CDROM device using the USB bus > in Libvirt, do as follows: > > <disk type='file' device='cdrom'> > <driver name='qemu' type='raw'/> > <source file='/path/to/share_fs/cdrom.iso'/> > <target dev='sda' bus='usb'/> > <readonly/> > <address type='usb' bus='0' port='2'/> > </disk> > <controller type='usb' index='0' model='piix3-uhci'/> > > The destination Qemu process crashed, causing the VM migration > to fail; the backtrace reveals the following: > > Program terminated with signal SIGSEGV, Segmentation fault. > 0 __memmove_sse2_unaligned_erms () at > ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:312 > 312 movq -8(%rsi,%rdx), %rcx > [Current thread is 1 (Thread 0x7f0a9025fc00 (LWP 3286206))] > (gdb) bt > 0 __memmove_sse2_unaligned_erms () at > ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:312 > 1 memcpy (__len=8, __src=<optimized out>, __dest=<optimized out>) at > /usr/include/bits/string_fortified.h:34 > 2 iov_from_buf_full (iov=<optimized out>, iov_cnt=<optimized out>, > offset=<optimized out>, buf=0x0, bytes=bytes@entry=8) at ../util/iov.c:33 > 3 iov_from_buf (bytes=8, buf=<optimized out>, offset=<optimized out>, > iov_cnt=<optimized out>, iov=<optimized out>) > at > /usr/src/debug/qemu-6-6.2.0-75.7.oe1.smartx.git.40.x86_64/include/qemu/iov.h:49 > 4 usb_packet_copy (p=p@entry=0x56066b2fb5a0, ptr=<optimized out>, > bytes=bytes@entry=8) at ../hw/usb/core.c:636 > 5 usb_msd_copy_data (s=s@entry=0x56066c62c770, p=p@entry=0x56066b2fb5a0) > at ../hw/usb/dev-storage.c:186 > 6 usb_msd_handle_data (dev=0x56066c62c770, p=0x56066b2fb5a0) at > ../hw/usb/dev-storage.c:496 > 7 usb_handle_packet (dev=0x56066c62c770, p=p@entry=0x56066b2fb5a0) at > ../hw/usb/core.c:455 > 8 uhci_handle_td (s=s@entry=0x56066bd5f210, q=0x56066bb7fbd0, q@entry=0x0, > qh_addr=qh_addr@entry=902518530, td=td@entry=0x7fffe6e788f0, > td_addr=<optimized out>, > int_mask=int_mask@entry=0x7fffe6e788e4) at ../hw/usb/hcd-uhci.c:885 > 9 uhci_process_frame (s=s@entry=0x56066bd5f210) at > ../hw/usb/hcd-uhci.c:1061 > 10 uhci_frame_timer (opaque=opaque@entry=0x56066bd5f210) at > ../hw/usb/hcd-uhci.c:1159 > 11 timerlist_run_timers (timer_list=0x56066af26bd0) at > ../util/qemu-timer.c:642 > 12 qemu_clock_run_timers (type=QEMU_CLOCK_VIRTUAL) at > ../util/qemu-timer.c:656 > 13 qemu_clock_run_all_timers () at ../util/qemu-timer.c:738 > 14 main_loop_wait (nonblocking=nonblocking@entry=0) at > ../util/main-loop.c:542 > 15 qemu_main_loop () at ../softmmu/runstate.c:739 > 16 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) > at ../softmmu/main.c:52 > (gdb) frame 5 > (gdb) p ((SCSIDiskReq *)s->req)->iov > $1 = {iov_base = 0x0, iov_len = 0} > (gdb) p/x s->req->tag > $2 = 0x472 > > The scsi commands that the CDROM issued are wrapped as the > payload of the USB protocol in Qemu's implementation of a > USB mass storage device, which is used to implement a > CDROM device that uses a USB bus. > > In general, the USB controller processes SCSI commands in > two phases. Sending the OUT USB package that encapsulates > the SCSI command is the first stage; scsi-disk would handle > this by emulating the SCSI operation. Receiving the IN USB > package containing the SCSI operation's output is the second > stage. Additionally, the SCSI request tag tracks the request > during the procedure. > > Since QEMU did not migrate the flying SCSI request, the > output of the SCSI may be lost if the live migration is > initiated between the two previously mentioned steps. > > In our scenario, the SCSI command is GET_EVENT_STATUS_NOTIFICATION, > the QEMU log information below demonstrates how the SCSI command > is being handled (first step) on the source: > > usb_packet_state_change bus 0, port 2, ep 2, packet 0x559f9ba14b00, state > undef -> setup > usb_msd_cmd_submit lun 0, tag 0x472, flags 0x00000080, len 10, data-len 8 > > After migration, the VM crashed as soon as the destination's UHCI > controller began processing the remaining portion of the SCSI > request (second step)! Here is how the QEMU logged out: > > usb_packet_state_change bus 0, port 2, ep 1, packet 0x56066b2fb5a0, state > undef -> setup > usb_msd_data_in 8/8 (scsi 8) > shutting down, reason=crashed > > To summarize, the missing scsi request during a live migration > may cause a VM configured with a CDROM to crash. > > Migrating the SCSI request that the scsi-disk is handling is > the simple approach, assuming that it actually exists. > > Signed-off-by: Hyman Huang <yong.hu...@smartx.com> > --- > hw/scsi/scsi-disk.c | 24 +++++++++++++++++++++++- > 1 file changed, 23 insertions(+), 1 deletion(-) > > diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c > index 0985676f73..d6e9d9e8d4 100644 > --- a/hw/scsi/scsi-disk.c > +++ b/hw/scsi/scsi-disk.c > @@ -160,6 +160,16 @@ static void scsi_disk_save_request(QEMUFile *f, > SCSIRequest *req) > } > } > > +static void scsi_disk_emulate_save_request(QEMUFile *f, SCSIRequest *req) > +{ > + SCSIDiskReq *r = DO_UPCAST(SCSIDiskReq, req, req); > + SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, r->req.dev); > + > + if (s->migrate_emulate_scsi_request) { > + scsi_disk_save_request(f, req); > + } > +} > + > static void scsi_disk_load_request(QEMUFile *f, SCSIRequest *req) > { > SCSIDiskReq *r = DO_UPCAST(SCSIDiskReq, req, req); > @@ -183,6 +193,16 @@ static void scsi_disk_load_request(QEMUFile *f, > SCSIRequest *req) > qemu_iovec_init_external(&r->qiov, &r->iov, 1); > } > > +static void scsi_disk_emulate_load_request(QEMUFile *f, SCSIRequest *req) > +{ > + SCSIDiskReq *r = DO_UPCAST(SCSIDiskReq, req, req); > + SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, r->req.dev); > + > + if (s->migrate_emulate_scsi_request) { > + scsi_disk_load_request(f, req); > + } > +} > + > /* > * scsi_handle_rw_error has two return values. False means that the error > * must be ignored, true means that the error has been processed and the > @@ -2593,6 +2613,8 @@ static const SCSIReqOps scsi_disk_emulate_reqops = { > .read_data = scsi_disk_emulate_read_data, > .write_data = scsi_disk_emulate_write_data, > .get_buf = scsi_get_buf, > + .load_request = scsi_disk_emulate_load_request, > + .save_request = scsi_disk_emulate_save_request, > }; > I would like to know why load_request and save_request hooks were not initially present in the scsi_disk_emulate_reqops. Are there any historical considerations? > static const SCSIReqOps scsi_disk_dma_reqops = { > @@ -3137,7 +3159,7 @@ static Property scsi_hd_properties[] = { > static int scsi_disk_pre_save(void *opaque) > { > SCSIDiskState *dev = opaque; > - dev->migrate_emulate_scsi_request = false; > + dev->migrate_emulate_scsi_request = true; > > return 0; > } > -- > 2.39.3 > > Thanks, Yong -- Best regards