On Tue, 14 Jan 2020 17:14:33 +1100 Alexey Kardashevskiy <a...@ozlabs.ru> wrote:
> On 14/01/2020 03:28, Alex Williamson wrote:
> > On Mon, 13 Jan 2020 18:49:21 +0300
> > yurij <lnk...@gmail.com> wrote:
> >
> >> Hello everybody!
> >>
> >> I have a specific PCIe device (sorry, but I can't tell what it is or
> >> what it does) whose PCI configuration space exposes 4 BARs (abridged
> >> lspci output):
> >>
> >> lspci -s 84:00.00 -vvv
> >>
> >> . . .
> >> Region 0: Memory at fa000000 (64-bit, non-prefetchable) [size=16M]
> >> Region 2: Memory at fb001000 (32-bit, non-prefetchable) [size=4K]
> >> Region 3: Memory at fb000000 (32-bit, non-prefetchable) [size=4K]
> >> Region 4: Memory at f9000000 (64-bit, non-prefetchable) [size=16M]
> >> . . .
> >> Kernel driver in use: vfio-pci
> >> . . .
> >>
> >> BAR0 is merged with BAR1, and BAR4 with BAR5, so both are 64 bits
> >> wide.
> >>
> >> I pass this PCIe device to a virtual machine via vfio:
> >>
> >> -device vfio-pci,host=84:00.0,id=hostdev0,bus=pci.6,addr=0x0
> >>
> >> The virtual machine boots successfully, and the PCI configuration
> >> space in the virtual environment looks OK (abridged lspci output):
> >>
> >> lspci -s 06:00.0 -vvv
> >>
> >> . . .
> >> Region 0: Memory at f8000000 (64-bit, non-prefetchable) [size=16M]
> >> Region 2: Memory at fa000000 (32-bit, non-prefetchable) [size=4K]
> >> Region 3: Memory at fa001000 (32-bit, non-prefetchable) [size=4K]
> >> Region 4: Memory at f9000000 (64-bit, non-prefetchable) [size=16M]
> >> . . .
> >> Kernel driver in use: custom_driver
> >>
> >> BAR0 is again merged with BAR1, and BAR4 with BAR5, so they are also
> >> 64 bits wide.
> >>
> >> The main problem is a 4K hole in Region 0 in the virtual environment,
> >> which breaks some device features.
> >>
> >> I enabled iommu tracing on the host (trace_event=iommu) and turned on
> >> all iommu events (for i in $(find
> >> /sys/kernel/debug/tracing/events/iommu/ -name enable); do echo 1 > $i;
> >> done). I saw the following events while the virtual machine booted:
> >>
> >> # cat /sys/kernel/debug/tracing/trace
> >> . . .
> >> CPU 0/KVM-3046 [051] .... 63113.338894: map: IOMMU:
> >> iova=0x00000000f8000000 paddr=0x00000000fa000000 size=24576
> >> CPU 0/KVM-3046 [051] .... 63113.339177: map: IOMMU:
> >> iova=0x00000000f8007000 paddr=0x00000000fa007000 size=16748544
> >> CPU 0/KVM-3046 [051] .... 63113.339444: map: IOMMU:
> >> iova=0x00000000fa000000 paddr=0x00000000fb001000 size=4096
> >> CPU 0/KVM-3046 [051] .... 63113.339697: map: IOMMU:
> >> iova=0x00000000fa001000 paddr=0x00000000fb000000 size=4096
> >> CPU 0/KVM-3046 [051] .... 63113.340209: map: IOMMU:
> >> iova=0x00000000f9000000 paddr=0x00000000f9000000 size=16777216
> >> . . .
> >>
> >> I also enabled qemu tracing (-trace events=/root/qemu/trace_events).
> >> The trace file contains the following functions:
> >> vfio_region_mmap
> >> vfio_get_dev_region
> >> vfio_pci_size_rom
> >> vfio_pci_read_config
> >> vfio_pci_write_config
> >> vfio_iommu_map_notify
> >> vfio_listener_region_add_iommu
> >> vfio_listener_region_add_ram
> >>
> >> Some important excerpts from the qemu trace:
> >> . . .
> >> Jan 13 18:17:24 VM qemu-system-x86_64[7131]: vfio_region_mmap Region
> >> 0000:84:00.0 BAR 0 mmaps[0] [0x0 - 0xffffff]
> >> Jan 13 18:17:24 VM qemu-system-x86_64[7131]: vfio_region_mmap Region
> >> 0000:84:00.0 BAR 2 mmaps[0] [0x0 - 0xfff]
> >> Jan 13 18:17:24 VM qemu-system-x86_64[7131]: vfio_region_mmap Region
> >> 0000:84:00.0 BAR 3 mmaps[0] [0x0 - 0xfff]
> >> Jan 13 18:17:24 VM qemu-system-x86_64[7131]: vfio_region_mmap Region
> >> 0000:84:00.0 BAR 4 mmaps[0] [0x0 - 0xffffff]
> >> . . .
> >> Jan 13 18:17:37 VM qemu-system-x86_64[7131]:
> >> vfio_listener_region_add_ram region_add [ram] 0xf8000000 - 0xf8005fff
> >> [0x7f691e800000]
> >> Jan 13 18:17:37 VM qemu-system-x86_64[7131]:
> >> vfio_listener_region_add_ram region_add [ram] 0xf8007000 - 0xf8ffffff
> >> [0x7f691e807000]
> >> Jan 13 18:17:37 VM qemu-system-x86_64[7131]:
> >> vfio_listener_region_add_ram region_add [ram] 0xfa000000 - 0xfa000fff
> >> [0x7f6b5de37000]
> >> Jan 13 18:17:37 VM qemu-system-x86_64[7131]:
> >> vfio_listener_region_add_ram region_add [ram] 0xfa001000 - 0xfa001fff
> >> [0x7f6b58004000]
> >> Jan 13 18:17:37 VM qemu-system-x86_64[7131]:
> >> vfio_listener_region_add_ram region_add [ram] 0xf9000000 - 0xf9ffffff
> >> [0x7f691d800000]
> >>
> >> I use qemu 4.0.0, which I rebuilt with tracing support
> >> (--enable-trace-backends=syslog).
> >>
> >> Please help me solve this issue. Thank you!
> >
> > Something has probably created a QEMU MemoryRegion overlapping the
> > BAR; we do this for quirks where we want to intercept a range of MMIO
> > for emulation, but offset 0x6000 on BAR0 doesn't sound familiar to me.
> > Run the VM with a monitor and see if 'info mtree' provides any info on
> > the handling of that overlap. Thanks,
>
> Couldn't it be an MSI-X region? 'info mtree -f' should tell exactly
> what is going on.

Oh, good call, that's probably it. The PCI spec specifically recommends
against placing non-MSI-X related registers within the same 4K page as
the vector table, to avoid exactly this sort of thing:

  If a Base Address register that maps address space for the MSI-X Table
  or MSI-X PBA also maps other usable address space that is not
  associated with MSI-X structures, locations (e.g., for CSRs) used in
  the other address space must not share any naturally aligned 4-KB
  address range with one where either MSI-X structure resides. This
  allows system software where applicable to use different processor
  attributes for MSI-X structures and the other address space.
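[Editor's note: the trace excerpts quoted above already pin down the hole.
The first BAR 0 mapping covers 0xf8000000 for 24576 (0x6000) bytes and the
next one resumes at 0xf8007000, leaving exactly one unmapped, naturally
aligned 4-KB page at BAR0 + 0x6000 — which is consistent with QEMU carving
out the MSI-X table page so it can trap accesses to it. A small sketch of
that arithmetic, using the addresses from the trace:]

```python
# Derive the unmapped hole in BAR 0 from the two iommu "map" trace
# entries quoted above (all values are taken from this thread).
bar0_base = 0xF8000000

# (iova, size) pairs for the two mappings that cover BAR 0
mappings = [(0xF8000000, 24576), (0xF8007000, 16748544)]

hole_start = mappings[0][0] + mappings[0][1]  # end of the first mapping
hole_end = mappings[1][0]                     # start of the second mapping

print(hex(hole_start))              # 0xf8006000
print(hex(hole_end - hole_start))   # 0x1000 -> a single 4K page
print(hex(hole_start - bar0_base))  # 0x6000 -> offset into BAR 0
```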
We have the following QEMU vfio-pci device option to relocate the MSI-X
structures to another BAR, for hardware that violates that recommendation
or for cases where the PCI spec's recommended alignment isn't sufficient:

  x-msix-relocation=<OffAutoPCIBAR> - off/auto/bar0/bar1/bar2/bar3/bar4/bar5

In this case I'd probably recommend bar2 or bar3, as those BARs would only
be extended to 8K, whereas bar0/bar4 would be extended to 32M. Thanks,

Alex
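[Editor's note: applied to the device option quoted earlier in the thread,
the suggestion would look like the following. This is an untested sketch;
every option other than x-msix-relocation is copied from the original
report.]

```shell
# Sketch: relocate the MSI-X structures to BAR 2. bar2 only grows that
# BAR from 4K to 8K, whereas choosing bar0 or bar4 would grow a 16M BAR
# to 32M.
-device vfio-pci,host=84:00.0,id=hostdev0,bus=pci.6,addr=0x0,x-msix-relocation=bar2
```

After relocation, lspci in the guest should report the MSI-X table in
BAR 2, and the vfio_listener_region_add_ram trace should show Region 0
mapped as a single contiguous range with no 4K hole.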