> -----Original Message-----
> From: Michael S. Tsirkin [mailto:m...@redhat.com]
> Sent: Friday, April 20, 2018 12:56 AM
> To: Liang, Cunming <cunming.li...@intel.com>
> Cc: Paolo Bonzini <pbonz...@redhat.com>; Bie, Tiwei <tiwei....@intel.com>;
> jasow...@redhat.com; alex.william...@redhat.com; stefa...@redhat.com;
> qemu-de...@nongnu.org; virtio-dev@lists.oasis-open.org; Daly, Dan
> <dan.d...@intel.com>; Tan, Jianfeng <jianfeng....@intel.com>; Wang, Zhihong
> <zhihong.w...@intel.com>; Wang, Xiao W <xiao.w.w...@intel.com>
> Subject: Re: [virtio-dev] RE: [PATCH v3 6/6] vhost-user: support registering
> external host notifiers
> 
> On Thu, Apr 19, 2018 at 04:24:29PM +0000, Liang, Cunming wrote:
> >
> >
> > > -----Original Message-----
> > > From: Michael S. Tsirkin [mailto:m...@redhat.com]
> > > Sent: Thursday, April 19, 2018 11:19 PM
> > > To: Paolo Bonzini <pbonz...@redhat.com>
> > > Cc: Liang, Cunming <cunming.li...@intel.com>; Bie, Tiwei
> > > <tiwei....@intel.com>; jasow...@redhat.com;
> > > alex.william...@redhat.com; stefa...@redhat.com;
> > > qemu-de...@nongnu.org; virtio-dev@lists.oasis-open.org; Daly, Dan
> > > <dan.d...@intel.com>; Tan, Jianfeng <jianfeng....@intel.com>; Wang,
> > > Zhihong <zhihong.w...@intel.com>; Wang, Xiao W
> > > <xiao.w.w...@intel.com>
> > > Subject: Re: [virtio-dev] RE: [PATCH v3 6/6] vhost-user: support
> > > registering external host notifiers
> > >
> > > On Thu, Apr 19, 2018 at 03:02:40PM +0200, Paolo Bonzini wrote:
> > > > On 19/04/2018 14:43, Liang, Cunming wrote:
> > > > >> 2. Memory barriers. Right now after updating the avail idx,
> > > > >> virtio does smp_wmb() and then the MMIO write. Normal hardware
> > > > >> drivers do
> > > > >> wmb() which is an sfence. Can a PCI device read bypass index
> > > > >> write and see a stale index value?
> > > > >
> > > > > > A compiler barrier is enough on strongly-ordered memory
> > > > > > platform. As it doesn't re-order store, PCI device won't see a
> > > > > > stale index value.
> > > > > > But a weakly-ordered memory needs sfence.
> > > >
> > > > That is complicated then.  We need to define a feature bit and (in
> > > > the Linux driver) propagate it to vring_create_virtqueue's
> > > > weak_barrier argument.  However:
> > > >
> > > > - if we make it 1 when weak barriers are needed, the device also
> > > > needs to nack feature negotiation (not allow setting the
> > > > FEATURES_OK) if the bit is not set by the driver.
> > > >  However, that is not enough.  Live migration assumes that it is
> > > > okay to migrate a virtual machine from a source that doesn't
> > > > support a feature to a destination that supports it.
> > > >  In this case, it would assume that it is okay to migrate from
> > > > software virtio to hardware virtio.  This is wrong because the
> > > > destination would use weak barriers
> > >
> > > You can't migrate between systems with different sets of device
> > > features right now.
> > >
> > > > > - if we make it 1 when strong barriers are enough, software virtio
> > > > > devices need to be updated to expose the bit.  This works,
> > > > > including live migration, but updated drivers will now go slower
> > > > > when run against an old device that doesn't know the feature bit.
> > > >
> > > > Maybe bump the PCI revision, so that only the new revision has the bit?
> > > >
> > > > Thanks,
> > > >
> > > > Paolo
> > >
> > > As a first step, if you want to migrate to a HW offloaded solution
> > > then you need to enable the feature.
> >
> > > It does mean it will go a bit slower when run with software, so it's
> > > only good if most systems in your cluster do have the HW offload.
> > > To clarify a bit more, it's suboptimal to always use mandatory barriers
> > > for MMIO. On strongly-ordered memory, 'weak barriers' (smp_wmb) are
> > > pretty good for MMIO. The tradeoff doesn't always happen; software and
> > > HW offload can align on the same page.
> 
> I agree to all of the above except where you say smp_wmb.
> 
> smp_wmb is for controlling SMP effects on Linux, and I suspect it will not do
> the right thing on some non-Intel architectures.
> 
> The claim is I think correct for Intel/AMD platforms, and probably other
> strongly ordered ones. I suspect it's incorrect for ARM and power.
> 
> Replace smp_wmb with 'asm volatile ("") on Intel' and I'll agree.

Yeah, that's more accurate. 
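
To make the distinction concrete, here is a rough userspace sketch of the
"update avail idx, barrier, then kick" sequence with both barrier flavours.
It is only an illustration under assumptions: struct demo_ring,
compiler_barrier(), mandatory_wmb() and publish_and_kick() are made-up names,
the doorbell is a plain memory field standing in for the MMIO notify
register, and the non-x86 branch is a placeholder rather than the correct
barrier for any particular architecture. It uses GCC/Clang syntax.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a virtqueue: not the real vring layout. */
struct demo_ring {
    volatile uint16_t avail_idx; /* index the device reads from shared memory */
    volatile uint16_t doorbell;  /* plain field standing in for the MMIO register */
};

/* Compiler-only barrier: emits no instruction, it only stops the compiler
 * from moving memory accesses across this point. */
static inline void compiler_barrier(void)
{
    __asm__ __volatile__("" ::: "memory");
}

/* "Mandatory" write barrier in the sense used for hardware drivers. */
static inline void mandatory_wmb(void)
{
#if defined(__x86_64__)
    __asm__ __volatile__("sfence" ::: "memory");
#else
    /* Assumption: placeholder only. A real driver would use the platform's
     * own mandatory write barrier here. */
    __atomic_thread_fence(__ATOMIC_SEQ_CST);
#endif
}

/* Publish a new avail index, then notify the device.
 * strong == 0: compiler barrier only (the strongly-ordered case above).
 * strong != 0: mandatory barrier before the kick. */
static inline void publish_and_kick(struct demo_ring *r, uint16_t new_idx,
                                    uint16_t queue, int strong)
{
    assert(r != NULL);
    r->avail_idx = new_idx;   /* 1. make the new index visible */
    if (strong)
        mandatory_wmb();      /* 2a. order the store with a real fence */
    else
        compiler_barrier();   /* 2b. only forbid compiler reordering */
    r->doorbell = queue;      /* 3. kick the device */
}
```

A feature bit would effectively select the value of the strong argument per
device; on a strongly-ordered platform the cheap path would then be taken for
both software and HW offload backends.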

> 
> 
> 
> > > I think we can start by getting that working and think about ways to
> > > improve down the road.
> > >
> > >
> > > That's the usecase we designed FEATURES_OK for though, so I do
> > > think/hope it's enough and we don't need to play with revisions.
> > >
> > >
> > > --
> > > MST
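
For reference, the FEATURES_OK behaviour discussed in the quoted text (the
device refusing negotiation when the driver does not accept a bit it
requires) could look roughly like the sketch below. This is an assumption
for illustration, not QEMU or vhost-user code: struct demo_device,
demo_set_features_ok() and DEMO_F_STRONG_BARRIER (with an arbitrary bit
number) are hypothetical. Only the 0x08 value mirrors the FEATURES_OK status
bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature bit; the bit number is arbitrary for this demo. */
#define DEMO_F_STRONG_BARRIER   (1ULL << 40)
/* Same value as the FEATURES_OK device status bit. */
#define DEMO_STATUS_FEATURES_OK 0x08

struct demo_device {
    uint64_t offered;  /* features the device offers */
    uint64_t required; /* subset the device cannot work without */
    uint64_t accepted; /* features recorded after a successful negotiation */
    uint8_t  status;   /* device status register */
};

/* Called when the driver writes FEATURES_OK. The device keeps the bit set
 * only if the driver's feature set is acceptable; otherwise it clears the
 * bit, which the driver sees when it reads the status back. */
static inline bool demo_set_features_ok(struct demo_device *d,
                                        uint64_t driver_features)
{
    assert(d != NULL);
    bool subset = (driver_features & ~d->offered) == 0;
    bool has_required = (driver_features & d->required) == d->required;

    if (subset && has_required) {
        d->accepted = driver_features;
        d->status |= DEMO_STATUS_FEATURES_OK;
        return true;
    }
    d->status &= (uint8_t)~DEMO_STATUS_FEATURES_OK; /* nack the negotiation */
    return false;
}
```

With a HW offload backend that sets the bit in required, an old driver that
does not know the bit fails negotiation instead of silently running with
barriers that are too weak.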

---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org
