> From: Paul Durrant <xadimg...@gmail.com>
> Sent: Friday, March 13, 2020 5:26 PM
> 
> > -----Original Message-----
> > From: Tian, Kevin <kevin.t...@intel.com>
> > Sent: 13 March 2020 03:23
> > To: p...@xen.org; 'Jan Beulich' <jbeul...@suse.com>
> > Cc: xen-devel@lists.xenproject.org; 'Andrew Cooper' <andrew.coop...@citrix.com>
> > Subject: RE: [Xen-devel] [PATCH v3] IOMMU: make DMA containment of quarantined devices optional
> >
> > > From: Paul Durrant <xadimg...@gmail.com>
> > > Sent: Wednesday, March 11, 2020 12:05 AM
> > >
> > [...]
> > >
> > > >
> > > > > > However, is it really saying that things will break if any of the
> > > > > PTEs has their present bit clear?
> > > >
> > > > Well, you said that read faults are fatal (to the host). Reads will,
> > > > for any address with an unpopulated PTE, result in a fault and hence
> > > > by implication be fatal.
> > >
> > > > Oh I see. I thought there was an implication that the IOMMU could not
> > > > cope with non-present PTEs in some way. Agreed that, when the device is
> > > > assigned to the guest, then it can arrange (via ballooning) for a
> > > > non-present entry to be hit by a read transaction, resulting in a lock-up.
> > > > But dealing with a malicious guest was not the issue at hand... dealing
> > > > with a buggy device that still tried to DMA after reset and whilst in
> > > > quarantine was the problem.
> > >
> >
> > > Thinking more about this, I wonder whether the scratch page is sufficient, or
> > > whether we should support such a device in the first place. Looking at
> > 0c35d446:
> > --
> >     The reason for doing this is that some hardware may continue to re-try
> > >     DMA (despite FLR) in the event of an error, or even BME being cleared,
> > >     and will fail to deal with DMA read faults gracefully. Having a scratch page
> >     mapped will allow pending DMA reads to complete and thus such buggy
> >     hardware will eventually be quiesced.
> > --
> >
> > > 'eventually'... what exactly does it mean?
> 
> It means after a period of time that we can only determine empirically.
> 
> > How would a user know a
> > device has been quiesced before attempting to re-assign the device
> > to another domU or dom0? By guessing?
> 
> Yes, a guess, but an educated one.
> 
> > Note the exact behavior of such
> > a device, after different guest behaviors (hang, kill, bug, etc.), is not
> > documented. Who knows whether an in-flight DMA may be triggered when
> > the new owner starts to initialize the device again? How much stale
> > state remains on such a device which, even if it does not trigger
> > in-flight DMAs, may change the behavior the new owner expects? E.g. it's
> > possible that a control register was configured by the old owner but is
> > not touched by the new owner. If it cannot be reset, what's the point of
> > supporting assignment of such a bogus device?
> >
> 
> Because I'm afraid it is quite ubiquitous and we need to deal with it.

It sounds like passthrough as a whole is in danger if your statement is true...

> 
> > Therefore I feel any support for such bogus devices should be maintained
> > out of tree, instead of in upstream Xen. Thoughts?
> >
> 
> I don't see the harm in the code being upstream. There may well be other
> devices with similar issues and it provides an option for an admin to try.
> 