On 04/02/2013 08:35:54 PM, Timur Tabi wrote:
On Tue, Apr 2, 2013 at 11:18 AM, Joerg Roedel <j...@8bytes.org> wrote:

> > +     panic("\n");
>
> A kernel panic seems like an over-reaction to an access violation.

We have no way of determining what code caused the violation, so we
can't just kill the process.  I agree it seems like overkill, but what
else should we do?  Does the IOMMU layer have a way for the IOMMU
driver to stop the device that caused the problem?

At a minimum, log a message and continue. Probably turn off the LIODN, at least if it continues to be noisy (otherwise we could get stuck in an interrupt storm as you note). Possibly let the user know somehow, especially if it's a VFIO domain.

Don't take down the whole kernel. It's not just overkill; it undermines VFIO's efforts to make it safe for users to control devices.
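
As a rough illustration of the "log, continue, and turn off the LIODN if it stays noisy" policy described above, here is a minimal userspace sketch in C. It is not the PAMU driver's code: the table size, the threshold, and every name in it (liodn_state, handle_access_violation, NOISY_THRESHOLD) are hypothetical, and the "disable" step is only a flag standing in for whatever the hardware actually requires.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define MAX_LIODN        64  /* hypothetical table size */
#define NOISY_THRESHOLD  8   /* hypothetical: violations tolerated before disabling */

/* Per-LIODN bookkeeping (hypothetical structure, for illustration only). */
struct liodn_state {
    unsigned int violations;  /* violations logged so far */
    bool disabled;            /* set once the LIODN has been turned off */
};

static struct liodn_state liodn_table[MAX_LIODN];

/*
 * Handle one access violation: log it and continue, and disable the
 * LIODN once it has reached NOISY_THRESHOLD violations, so a misbehaving
 * device cannot keep the handler busy indefinitely.
 *
 * Returns true if this call disabled the LIODN.
 */
static bool handle_access_violation(unsigned int liodn, unsigned long long addr)
{
    assert(liodn < MAX_LIODN);
    struct liodn_state *st = &liodn_table[liodn];

    /* Already turned off: nothing more to log or do. */
    if (st->disabled)
        return false;

    st->violations++;
    fprintf(stderr, "access violation: liodn %u, addr 0x%llx (count %u)\n",
            liodn, addr, st->violations);

    if (st->violations >= NOISY_THRESHOLD) {
        st->disabled = true;
        fprintf(stderr, "liodn %u disabled after %u violations\n",
                liodn, st->violations);
        return true;
    }
    return false;
}
```

The point of the sketch is only that the response is scoped to the offending LIODN: other entries in the table are untouched, and nothing takes down the rest of the system.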

> Besides the device that caused the violation the system should still
> work, no?

Not really.  The PAMU was designed to add IOMMU support to legacy
devices, which have no concept of an MMU.  If the PAMU detects an
access violation, there's no way for the device to recover, because it
has no idea that a violation has occurred.  It's going to keep on
writing to bad data.

I think that's only the case for posted writes (or devices which fail to take a hint and stop even after they see an I/O error).

-Scott