Hi Jean-Philippe,

On Fri, Jan 06, 2017 at 05:48:33PM +0000, Jean-Philippe Brucker wrote:
> On 20/12/16 15:14, Will Deacon wrote:
> > Booting Linux on an ARM fastmodel containing an SMMU emulation results
> > in an unexpected I/O page fault from the legacy virtio-blk PCI device:
> > 
> > [    1.211721] arm-smmu-v3 2b400000.smmu: event 0x10 received:
> > [    1.211800] arm-smmu-v3 2b400000.smmu:   0x00000000fffff010
> > [    1.211880] arm-smmu-v3 2b400000.smmu:   0x0000020800000000
> > [    1.211959] arm-smmu-v3 2b400000.smmu:   0x00000008fa081002
> > [    1.212075] arm-smmu-v3 2b400000.smmu:   0x0000000000000000
> > [    1.212155] arm-smmu-v3 2b400000.smmu: event 0x10 received:
> > [    1.212234] arm-smmu-v3 2b400000.smmu:   0x00000000fffff010
> > [    1.212314] arm-smmu-v3 2b400000.smmu:   0x0000020800000000
> > [    1.212394] arm-smmu-v3 2b400000.smmu:   0x00000008fa081000
> > [    1.212471] arm-smmu-v3 2b400000.smmu:   0x0000000000000000
> > 
> > <system hangs failing to read partition table>
> > 
> > This is because the virtio-blk is behind an SMMU, so we have consequently
> > swizzled its DMA ops and configured the SMMU to translate accesses. This
> > then requires the vring code to use the DMA API to establish translations,
> > otherwise all transactions will result in fatal faults and termination.
> > 
> > Given that ARM-based systems only see an SMMU if one is really present
> > (the topology is all described by firmware tables such as device-tree or
> > IORT), then we can safely use the DMA API for all virtio devices.
> 
> There is a problem with the platform block device on that same model.
> Since it's not behind the SMMU, the DMA ops fall back to swiotlb, which
> limits the number of mappings.
> 
> It used to work with 4.9, but since 9491ae4 ("mm: don't cap request size
> based on read-ahead setting") unlocked read-ahead, we quickly run into
> the limit of swiotlb and panic:
> 
> [    5.382359] virtio-mmio 1c130000.virtio_block: swiotlb buffer is full (sz: 491520 bytes)
> [    5.382452] virtio-mmio 1c130000.virtio_block: DMA: Out of SW-IOMMU space for 491520 bytes
> [    5.382531] Kernel panic - not syncing: DMA: Random memory could be DMA written
> ...
> [    5.383148] [<ffff0000083ad754>] swiotlb_map_page+0x194/0x1a0
> [    5.383226] [<ffff000008096bb8>] __swiotlb_map_page+0x20/0x88
> [    5.383320] [<ffff0000084bf738>] vring_map_one_sg.isra.1+0x70/0x88
> [    5.383417] [<ffff0000084c04fc>] virtqueue_add_sgs+0x2ec/0x4e8
> [    5.383505] [<ffff00000856d99c>] __virtblk_add_req+0x9c/0x1a8
> ...
> [    5.384449] [<ffff0000081829c4>] ondemand_readahead+0xfc/0x2b8

Oh, lovely!

> Commit 9491ae4 caps the read-ahead request to a limit set by the backing
> device. For virtio-blk, it is infinite (as set by the call to
> blk_queue_max_hw_sectors in virtblk_probe).
> 
> I'm not sure how to fix this. Setting an arbitrary sector limit in the
> virtio-blk driver seems unfair to other users. Maybe we should check if
> the device is behind a hardware IOMMU before using the DMA API?

Couldn't the same issue potentially occur with a hardware IOMMU, where
we run out of IOVA space due to unlimited readahead? I think it might be
best to enforce a finite limit for virtio devices when the DMA API is in
use.
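
To make the idea concrete, here is a rough userspace sketch of the clamp I
have in mind. It is not a patch: the sketch_* names are made up for
illustration, and the slab constants are my assumption of what swiotlb uses
(2KiB slabs, at most 128 contiguous slabs per mapping), so please check
them against the tree. UINT32_MAX stands in for the "no limit" value that
virtblk_probe currently passes to blk_queue_max_hw_sectors.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed to mirror swiotlb's slab geometry; verify against the tree. */
#define SKETCH_SLAB_SHIFT	11	/* 2KiB per slab */
#define SKETCH_SEGSIZE		128	/* max contiguous slabs per mapping */
#define SKETCH_SECTOR_SHIFT	9	/* 512-byte sectors */

/* Largest single mapping a bounce buffer with this geometry can satisfy. */
static inline size_t sketch_max_mapping_size(void)
{
	return (size_t)SKETCH_SEGSIZE << SKETCH_SLAB_SHIFT;
}

/*
 * Hypothetical helper: clamp the queue's sector limit when the device
 * goes through the DMA API. 'requested' is the limit the driver would
 * otherwise set (UINT32_MAX meaning "unlimited"); 'max_mapping' is the
 * largest mapping, in bytes, that the DMA backend can handle at once.
 */
static inline uint32_t sketch_clamp_max_sectors(uint32_t requested,
						int uses_dma_api,
						size_t max_mapping)
{
	size_t cap = max_mapping >> SKETCH_SECTOR_SHIFT;

	/* Devices bypassing the DMA API keep whatever the driver asked for. */
	if (!uses_dma_api)
		return requested;

	/* Avoid truncation if the backend limit exceeds 32 bits of sectors. */
	if (cap > UINT32_MAX)
		cap = UINT32_MAX;

	return requested < (uint32_t)cap ? requested : (uint32_t)cap;
}
```

With those assumed constants the largest single mapping is 256KiB, i.e. a
cap of 512 sectors, which is below the 491520-byte (960-sector) request in
your log. A hardware IOMMU would presumably want a different bound, so the
limit is passed in rather than hard-coded.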

Do any drivers for physical (i.e. non-virtual) hardware make use of
unlimited readahead?

Will
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization