On Wed, Jul 17, 2024 at 09:40:06AM -0400, Michael S. Tsirkin wrote:
> On Wed, Jul 17, 2024 at 09:33:01AM -0400, Peter Xu wrote:
> > Hi, Michael,
> > 
> > On Wed, Jul 17, 2024 at 04:55:52AM -0400, Michael S. Tsirkin wrote:
> > > I just want to understand how we managed to have two threads
> > > talking in parallel. BQL is normally enough, which path
> > > manages to invoke vhost-user with BQL not taken?
> > > Just check BQL taken on each vhost user invocation and
> > > you will figure it out.
> > 
> > Prasad mentioned how the race happened in the cover letter:
> > 
> > https://lore.kernel.org/r/20240711131424.181615-1-ppan...@redhat.com
> > 
> >      Thread-1                                  Thread-2
> > 
> > vhost_dev_start                        postcopy_ram_incoming_cleanup
> >  vhost_device_iotlb_miss                postcopy_notify
> >   vhost_backend_update_device_iotlb      vhost_user_postcopy_notifier
> >    vhost_user_send_device_iotlb_msg       vhost_user_postcopy_end
> >     process_message_reply                  process_message_reply
> >      vhost_user_read                        vhost_user_read
> >       vhost_user_read_header                 vhost_user_read_header
> > >        "Fail to update device iotlb"          "Failed to receive reply to postcopy_end"
> > 
> > The normal case should be that thread-2 is postcopy_ram_listen_thread(),
> > and this happens when postcopy migration is close to the end.
> > 
> > Thanks,
> > 
> > -- 
> > Peter Xu
> 
> 
> OK, so postcopy_ram_ things run without the BQL?

There are a lot of postcopy_ram_* functions. I didn't check all of them, but
I think it's true in this case.  Thanks.
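
To illustrate the check Michael suggests, here is a rough standalone
sketch. It is not QEMU code: big_lock, big_lock_held, channel_request()
and run_demo() are hypothetical names standing in for the BQL and for a
vhost-user request/reply exchange. In QEMU itself I believe the matching
test would be bql_locked() on each vhost-user path. The sketch only counts
callers that reach the channel without holding the lock, which is the
situation thread-2 is in above.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the BQL; an assumption, not QEMU's lock. */
static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;
/* Per-thread flag: does the current thread hold big_lock? */
static _Thread_local bool big_lock_held;

static void big_lock_acquire(void)
{
    pthread_mutex_lock(&big_lock);
    big_lock_held = true;
}

static void big_lock_release(void)
{
    big_lock_held = false;
    pthread_mutex_unlock(&big_lock);
}

/* Number of channel uses made without the big lock. */
static int unlocked_calls;
static pthread_mutex_t count_lock = PTHREAD_MUTEX_INITIALIZER;

/* Stand-in for one request/reply exchange on the shared channel. */
static void channel_request(const char *caller)
{
    if (!big_lock_held) {
        fprintf(stderr, "%s: channel used without the big lock\n", caller);
        pthread_mutex_lock(&count_lock);
        unlocked_calls++;
        pthread_mutex_unlock(&count_lock);
    }
    /* A real implementation would send the request and read the reply. */
}

/* Models thread-1: takes the lock before talking on the channel. */
static void *locked_caller(void *arg)
{
    (void)arg;
    big_lock_acquire();
    channel_request("thread-1");
    big_lock_release();
    return NULL;
}

/* Models thread-2: talks on the channel without taking the lock. */
static void *unlocked_caller(void *arg)
{
    (void)arg;
    channel_request("thread-2");
    return NULL;
}

/* Runs both callers once and returns how many skipped the lock. */
int run_demo(void)
{
    pthread_t t1, t2;
    int rc;

    unlocked_calls = 0;
    rc = pthread_create(&t1, NULL, locked_caller, NULL);
    assert(rc == 0);
    rc = pthread_create(&t2, NULL, unlocked_caller, NULL);
    assert(rc == 0);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return unlocked_calls;
}
```

Calling run_demo() from a small main() reports thread-2 as the only
caller without the lock. A check like this only identifies the offending
path; it does not say whether the fix should be taking the BQL there or
serializing the channel some other way.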

-- 
Peter Xu

