On Wed, Jul 05, 2023 at 10:13:11AM +0000, Duan, Zhenzhong wrote:
> >-----Original Message-----
> >From: Jean-Philippe Brucker <jean-phili...@linaro.org>
> >Sent: Wednesday, July 5, 2023 4:29 PM
> >Subject: Re: [PATCH 1/2] virtio-iommu: Fix 64kB host page size VFIO device
> >assignment
> >
> >On Wed, Jul 05, 2023 at 04:52:09AM +0000, Duan, Zhenzhong wrote:
> >> Hi Eric,
> >>
> >> >-----Original Message-----
> >> >From: Eric Auger <eric.au...@redhat.com>
> >> >Sent: Tuesday, July 4, 2023 7:15 PM
> >> >Subject: [PATCH 1/2] virtio-iommu: Fix 64kB host page size VFIO
> >> >device assignment
> >> >
> >> >When running on a 64kB page size host and protecting a VFIO device
> >> >with the virtio-iommu, qemu crashes with this kind of message:
> >> >
> >> >qemu-kvm: virtio-iommu page mask 0xfffffffffffff000 is incompatible
> >> >with mask 0x20010000
> >>
> >> Does 0x20010000 mean only 512MB and 64KB super page mapping is
> >> supported for host iommu hw? 4KB mapping not supported?
> >
> >It's not a restriction by the HW IOMMU, but the host kernel. An Arm SMMU
> >can implement 4KB, 16KB and/or 64KB granules, but the host kernel only
> >advertises through VFIO the granule corresponding to host PAGE_SIZE. This
> >restriction is done by arm_lpae_restrict_pgsizes() in order to choose a page
> >size when a device is driven by the host.
>
> Just curious why not advertises the Arm SMMU implemented granules to VFIO
> Eg:4KB, 16KB or 64KB granules?

That's possible, but the difficulty is setting up the page table
configuration afterwards. At the moment the host installs the HW page
tables early, when QEMU sets up the VFIO container. That initializes the
page size bitmap because configuring the HW page tables requires picking
one of the supported granules (setting TG0 in the SMMU Context
Descriptor).

If the guest could pick a granule via an ATTACH request, then QEMU would
need to tell the host kernel to install page tables with the desired
granule at that point. That would require a new interface in VFIO to
reconfigure a live container and replace the existing HW page tables
configuration (before ATTACH, the container must already be configured
with working page tables in order to implement boot-bypass, I think).

> But arm_lpae_restrict_pgsizes() restricted ones,
> Eg: for SZ_4K, (SZ_4K | SZ_2M | SZ_1G).
> (SZ_4K | SZ_2M | SZ_1G) looks not real hardware granules of Arm SMMU.

Yes, the granule here is 4K, and other bits only indicate huge page sizes,
so the user can try to optimize large mappings to use fewer TLB entries
where possible.

Thanks,
Jean
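
P.S. For reference, a rough sketch of how the page size bitmaps in this thread can be read. It is only an illustration in Python, not code from QEMU or the kernel: the helper names `page_sizes` and `granule` are made up here, and the `SZ_*` constants are redefined locally. The only assumption is the one stated above, that each set bit N means "a mapping of size 2^N is supported" and the lowest set bit is the granule.

```python
# Local stand-ins for the kernel's SZ_* constants (illustrative only).
SZ_4K = 1 << 12
SZ_64K = 1 << 16
SZ_2M = 1 << 21
SZ_512M = 1 << 29
SZ_1G = 1 << 30


def page_sizes(bitmap):
    """Return the page sizes (in bytes) encoded in a 64-bit page size bitmap."""
    return [1 << bit for bit in range(64) if bitmap & (1 << bit)]


def granule(bitmap):
    """Return the smallest supported page size, i.e. the lowest set bit."""
    return bitmap & -bitmap


# The two masks quoted in the error message.
host_mask = 0x20010000
viommu_mask = 0xFFFFFFFFFFFFF000

print([hex(size) for size in page_sizes(host_mask)])  # 64K and 512M bits
print(hex(granule(host_mask)))                        # host granule
print(hex(granule(viommu_mask)))                      # virtio-iommu granule
```

Read this way, 0x20010000 is a 64K granule plus a 512M huge page size, while the virtio-iommu mask starts at 4K, which is the mismatch reported on a 64kB page size host.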