Re: [RESEND PATCH v8 00/11] Fix BUG_ON in vfio_iommu_group_notifier()
On Mon, Apr 18, 2022 at 08:49:49AM +0800, Lu Baolu wrote:
> Hi Joerg,
>
> This is a resend version of v8 posted here:
> https://lore.kernel.org/linux-iommu/20220308054421.847385-1-baolu...@linux.intel.com/
> as we discussed in this thread:
> https://lore.kernel.org/linux-iommu/yk%2fq1bgn8pc5h...@8bytes.org/
>
> All patches can be applied perfectly except this one:
> - [PATCH v8 02/11] driver core: Add dma_cleanup callback in bus_type
> It conflicts with below refactoring commit:
> - 4b775aaf1ea99 "driver core: Refactor sysfs and drv/bus remove hooks"
> The conflict has been fixed in this post.
>
> No functional changes in this series. I suppress cc-ing this series to
> all v8 reviewers in order to avoid spam.
>
> Please consider it for your iommu tree.

Reverting this series fixed a use-after-free while doing SR-IOV.

BUG: KASAN: use-after-free in __lock_acquire
Read of size 8 at addr 080279825d78 by task qemu-system-aar/22429
CPU: 24 PID: 22429 Comm: qemu-system-aar Not tainted 5.18.0-rc5-next-20220502 #69
Call trace:
 dump_backtrace
 show_stack
 dump_stack_lvl
 print_address_description.constprop.0
 print_report
 kasan_report
 __asan_report_load8_noabort
 __lock_acquire
 lock_acquire.part.0
 lock_acquire
 _raw_spin_lock_irqsave
 arm_smmu_detach_dev
 arm_smmu_detach_dev at drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c:2377
 arm_smmu_attach_dev
 __iommu_attach_group
 __iommu_attach_device at drivers/iommu/iommu.c:1942
 (inlined by) iommu_group_do_attach_device at drivers/iommu/iommu.c:2058
 (inlined by) __iommu_group_for_each_dev at drivers/iommu/iommu.c:989
 (inlined by) __iommu_attach_group at drivers/iommu/iommu.c:2069
 iommu_group_release_dma_owner
 __vfio_group_unset_container
 vfio_group_try_dissolve_container
 vfio_group_put_external_user
 kvm_vfio_destroy
 kvm_destroy_vm
 kvm_vm_release
 __fput
 fput
 task_work_run
 do_exit
 do_group_exit
 get_signal
 do_signal
 do_notify_resume
 el0_svc
 el0t_64_sync_handler
 el0t_64_sync

Allocated by task 22427:
 kasan_save_stack
 __kasan_kmalloc
 kmem_cache_alloc_trace
 arm_smmu_domain_alloc
 iommu_domain_alloc
 vfio_iommu_type1_attach_group
 vfio_ioctl_set_iommu
 vfio_fops_unl_ioctl
 __arm64_sys_ioctl
 invoke_syscall
 el0_svc_common.constprop.0
 do_el0_svc
 el0_svc
 el0t_64_sync_handler
 el0t_64_sync

Freed by task 22429:
 kasan_save_stack
 kasan_set_track
 kasan_set_free_info
 kasan_slab_free
 __kasan_slab_free
 slab_free_freelist_hook
 kfree
 arm_smmu_domain_free
 arm_smmu_domain_free at iommu/arm/arm-smmu-v3/arm-smmu-v3.c:2067
 iommu_domain_free
 vfio_iommu_type1_detach_group
 __vfio_group_unset_container
 vfio_group_try_dissolve_container
 vfio_group_put_external_user
 kvm_vfio_destroy
 kvm_destroy_vm
 kvm_vm_release
 __fput
 fput
 task_work_run
 do_exit
 do_group_exit
 get_signal
 do_signal
 do_notify_resume
 el0_svc
 el0t_64_sync_handler
 el0t_64_sync

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v14 06/12] swiotlb: Use is_swiotlb_force_bounce for swiotlb data bouncing
On 6/24/2021 7:48 AM, Will Deacon wrote:
> Ok, diff below which attempts to tackle the offset issue I mentioned as
> well. Qian Cai -- please can you try with these changes?

This works fine.

> Will
>
> --->8
>
> diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
> index 175b6c113ed8..39284ff2a6cd 100644
> --- a/include/linux/swiotlb.h
> +++ b/include/linux/swiotlb.h
> @@ -116,7 +116,9 @@ static inline bool is_swiotlb_buffer(struct device *dev, phys_addr_t paddr)
>
>  static inline bool is_swiotlb_force_bounce(struct device *dev)
>  {
> -	return dev->dma_io_tlb_mem->force_bounce;
> +	struct io_tlb_mem *mem = dev->dma_io_tlb_mem;
> +
> +	return mem && mem->force_bounce;
>  }
>
>  void __init swiotlb_exit(void);
> diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
> index 44be8258e27b..0ffbaae9fba2 100644
> --- a/kernel/dma/swiotlb.c
> +++ b/kernel/dma/swiotlb.c
> @@ -449,6 +449,7 @@ static int swiotlb_find_slots(struct device *dev, phys_addr_t orig_addr,
>  		dma_get_min_align_mask(dev) & ~(IO_TLB_SIZE - 1);
>  	unsigned int nslots = nr_slots(alloc_size), stride;
>  	unsigned int index, wrap, count = 0, i;
> +	unsigned int offset = swiotlb_align_offset(dev, orig_addr);
>  	unsigned long flags;
>
>  	BUG_ON(!nslots);
> @@ -497,7 +498,7 @@ static int swiotlb_find_slots(struct device *dev, phys_addr_t orig_addr,
>  	for (i = index; i < index + nslots; i++) {
>  		mem->slots[i].list = 0;
>  		mem->slots[i].alloc_size =
> -			alloc_size - ((i - index) << IO_TLB_SHIFT);
> +			alloc_size - (offset + ((i - index) << IO_TLB_SHIFT));
>  	}
>  	for (i = index - 1;
>  	     io_tlb_offset(i) != IO_TLB_SEGSIZE - 1 &&
Re: [PATCH v14 06/12] swiotlb: Use is_swiotlb_force_bounce for swiotlb data bouncing
On 6/23/2021 2:37 PM, Will Deacon wrote:
> On Wed, Jun 23, 2021 at 12:39:29PM -0400, Qian Cai wrote:
>>
>> On 6/18/2021 11:40 PM, Claire Chang wrote:
>>> Propagate the swiotlb_force into io_tlb_default_mem->force_bounce and
>>> use it to determine whether to bounce the data or not. This will be
>>> useful later to allow for different pools.
>>>
>>> Signed-off-by: Claire Chang
>>> Reviewed-by: Christoph Hellwig
>>> Tested-by: Stefano Stabellini
>>> Tested-by: Will Deacon
>>> Acked-by: Stefano Stabellini
>>
>> Reverting the rest of the series up to this patch fixed a boot crash with
>> NVMe on today's linux-next.
>
> Hmm, so that makes patch 7 the suspicious one, right?

Will, no. It is rather patch #6 (this patch). Only patches #6 to #12 were
reverted to fix the issue. Also, looking at the offset of the crash,

pc : dma_direct_map_sg+0x304/0x8f0
is_swiotlb_force_bounce at /usr/src/linux-next/./include/linux/swiotlb.h:119

is_swiotlb_force_bounce() was the new function introduced in this patch:

+static inline bool is_swiotlb_force_bounce(struct device *dev)
+{
+	return dev->dma_io_tlb_mem->force_bounce;
+}

> Looking at that one more closely, it looks like swiotlb_find_slots() takes
> 'alloc_size + offset' as its 'alloc_size' parameter from
> swiotlb_tbl_map_single() and initialises 'mem->slots[i].alloc_size' based
> on 'alloc_size + offset', which looks like a change in behaviour from the
> old code, which didn't include the offset there.
>
> swiotlb_release_slots() then adds the offset back on afaict, so we end up
> accounting for it twice and possibly unmap more than we're supposed to?
>
> Will
Re: [PATCH v14 06/12] swiotlb: Use is_swiotlb_force_bounce for swiotlb data bouncing
On 6/18/2021 11:40 PM, Claire Chang wrote: > Propagate the swiotlb_force into io_tlb_default_mem->force_bounce and > use it to determine whether to bounce the data or not. This will be > useful later to allow for different pools. > > Signed-off-by: Claire Chang > Reviewed-by: Christoph Hellwig > Tested-by: Stefano Stabellini > Tested-by: Will Deacon > Acked-by: Stefano Stabellini Reverting the rest of the series up to this patch fixed a boot crash with NVMe on today's linux-next. [ 22.286574][T7] Unable to handle kernel paging request at virtual address dfff800e [ 22.295225][T7] Mem abort info: [ 22.298743][T7] ESR = 0x9604 [ 22.302496][T7] EC = 0x25: DABT (current EL), IL = 32 bits [ 22.308525][T7] SET = 0, FnV = 0 [ 22.312274][T7] EA = 0, S1PTW = 0 [ 22.316131][T7] FSC = 0x04: level 0 translation fault [ 22.321704][T7] Data abort info: [ 22.325278][T7] ISV = 0, ISS = 0x0004 [ 22.329840][T7] CM = 0, WnR = 0 [ 22.333503][T7] [dfff800e] address between user and kernel address ranges [ 22.338543][ T256] igb 0006:01:00.0: Intel(R) Gigabit Ethernet Network Connection [ 22.341400][T7] Internal error: Oops: 9604 [#1] SMP [ 22.348915][ T256] igb 0006:01:00.0: eth0: (PCIe:2.5Gb/s:Width x1) 4c:38:d5:09:c8:83 [ 22.354458][T7] Modules linked in: igb(+) i2c_algo_bit nvme mlx5_core(+) i2c_core nvme_core firmware_class [ 22.362512][ T256] igb 0006:01:00.0: eth0: PBA No: G69016-004 [ 22.372287][T7] CPU: 13 PID: 7 Comm: kworker/u64:0 Not tainted 5.13.0-rc7-next-20210623+ #47 [ 22.372293][T7] Hardware name: MiTAC RAPTOR EV-883832-X3-0001/RAPTOR, BIOS 1.6 06/28/2020 [ 22.372298][T7] Workqueue: nvme-reset-wq nvme_reset_work [nvme] [ 22.378145][ T256] igb 0006:01:00.0: Using MSI-X interrupts. 
4 rx queue(s), 4 tx queue(s) [ 22.386901][T7] [ 22.386905][T7] pstate: 1005 (nzcV daif -PAN -UAO -TCO BTYPE=--) [ 22.386910][T7] pc : dma_direct_map_sg+0x304/0x8f0 is_swiotlb_force_bounce at /usr/src/linux-next/./include/linux/swiotlb.h:119 (inlined by) dma_direct_map_page at /usr/src/linux-next/kernel/dma/direct.h:90 (inlined by) dma_direct_map_sg at /usr/src/linux-next/kernel/dma/direct.c:428 [ 22.386919][T7] lr : dma_map_sg_attrs+0x6c/0x118 [ 22.386924][T7] sp : 80001dc8eac0 [ 22.386926][T7] x29: 80001dc8eac0 x28: 199e70b0 x27: [ 22.386935][T7] x26: 000847ee7000 x25: 80001158e570 x24: 0002 [ 22.386943][T7] x23: dfff8000 x22: 0100 x21: 199e7460 [ 22.386951][T7] x20: 199e7488 x19: 0001 x18: 10062670 [ 22.386955][ T253] Unable to handle kernel paging request at virtual address dfff800e [ 22.386958][T7] x17: 8000109f6a90 x16: 8000109e1b4c x15: 89303420 [ 22.386965][ T253] Mem abort info: [ 22.386967][T7] x14: 0001 x13: 80001158e000 [ 22.386970][ T253] ESR = 0x9604 [ 22.386972][T7] x12: 1fffe00108fdce01 [ 22.386975][ T253] EC = 0x25: DABT (current EL), IL = 32 bits [ 22.386976][T7] x11: 1fffe00108fdce03 x10: 000847ee700c x9 : 0004 [ 22.386981][ T253] SET = 0, FnV = 0 [ 22.386983][T7] [ 22.386985][T7] x8 : 73b91d72 [ 22.386986][ T253] EA = 0, S1PTW = 0 [ 22.386987][T7] x7 : x6 : 000e [ 22.386990][ T253] FSC = 0x04: level 0 translation fault [ 22.386992][T7] [ 22.386994][T7] x5 : dfff8000 [ 22.386995][ T253] Data abort info: [ 22.386997][T7] x4 : 0008c7ede000 [ 22.386999][ T253] ISV = 0, ISS = 0x0004 [ 22.386999][T7] x3 : 0008c7ede000 [ 22.387003][T7] x2 : 1000 [ 22.387003][ T253] CM = 0, WnR = 0 [ 22.387006][T7] x1 : x0 : 0071 [ 22.387008][ T253] [dfff800e] address between user and kernel address ranges [ 22.387011][T7] [ 22.387013][T7] Call trace: [ 22.387016][T7] dma_direct_map_sg+0x304/0x8f0 [ 22.387022][T7] dma_map_sg_attrs+0x6c/0x118 [ 22.387026][T7] nvme_map_data+0x2ec/0x21d8 [nvme] [ 22.387040][T7] nvme_queue_rq+0x274/0x3f0 [nvme] [ 22.387052][T7] 
blk_mq_dispatch_rq_list+0x2ec/0x18a0 [ 22.387060][T7] __blk_mq_sched_dispatch_requests+0x2a0/0x3e8 [ 22.387065][T7] blk_mq_sched_dispatch_requests+0xa4/0x100 [ 22.387070][T7] __blk_mq_run_hw_queue+0x148/0x1d8 [ 22.387075][T7] __blk_mq_delay_run_hw_queue+0x3f8/0x730 [ 22.414539][ T269] igb 0006:01:00.0 enP6p1s0: renamed from eth0 [ 22.418957][T7] blk_mq_run_hw_queue+0x148/0x248 [ 22.418969][T7] blk_mq_sched_insert_request+0x2a4/0x330 [ 22.418975][T7] blk_execute_rq_nowait+0xc8/0x118 [ 22.418981][T7] blk_execute_rq+0xd4/0x188 [ 22.453203][ T255] udeva
Re: [PATCH v3 19/35] x86/io_apic: Cleanup trigger/polarity helpers
On Sat, 2020-10-24 at 22:35 +0100, David Woodhouse wrote:
> From: Thomas Gleixner
>
> 'trigger' and 'polarity' are used throughout the I/O-APIC code for handling
> the trigger type (edge/level) and the active low/high configuration. While
> there are defines for initializing these variables and struct members, they
> are not used consequently and the meaning of 'trigger' and 'polarity' is
> opaque and confusing at best.
>
> Rename them to 'is_level' and 'active_low' and make them boolean in various
> structs so it's entirely clear what the meaning is.
>
> Signed-off-by: Thomas Gleixner
> Signed-off-by: David Woodhouse
> ---
>  arch/x86/include/asm/hw_irq.h       |   6 +-
>  arch/x86/kernel/apic/io_apic.c      | 244 +---
>  arch/x86/pci/intel_mid_pci.c        |   8 +-
>  drivers/iommu/amd/iommu.c           |  10 +-
>  drivers/iommu/intel/irq_remapping.c |   9 +-
>  5 files changed, 130 insertions(+), 147 deletions(-)

Reverting the rest of the patchset up to this commit on next-20201109 fixed
an endless soft-lockup issue booting the AMD server below. I noticed that
the failed boots always have this IOMMU IO_PAGE_FAULT before those soft
lockups:

[ 3404.093354][T1] AMD-Vi: Interrupt remapping enabled
[ 3404.098593][T1] AMD-Vi: Virtual APIC enabled
[ 3404.107783][ T340] pci :00:14.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x address=0xfffdf8020200 flags=0x0008]
[ 3404.120644][T1] AMD-Vi: Lazy IO/TLB flushing enabled
[ 3404.126011][T1] PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
[ 3404.133173][T1] software IO TLB: mapped [mem 0x68dcf000-0x6cdcf000] (64MB)

.config (if ever matters):
https://cailca.coding.net/public/linux/mm/git/files/master/x86.config

good boot dmesg (with the commits reverted):
http://people.redhat.com/qcai/dmesg.txt

== system info ==
Dell Poweredge R6415
AMD EPYC 7401P 24-Core Processor
24576 MB memory, 239 GB disk space

[  OK  ] Started Flush Journal to Persistent Storage.
[  OK  ] Started udev Kernel Device Manager.
[  OK  ] Started udev Coldplug all Devices.
[ OK ] Started Monitoring of LVM2 mirrors,…sing dmeventd or progress polling. [ OK ] Reached target Local File Systems (Pre). Mounting /boot... [ OK ] Created slice system-lvm2\x2dpvscan.slice. [ 3740.376500][ T1058] XFS (sda1): Mounting V5 Filesystem [ 3740.438474][ T1058] XFS (sda1): Ending clean mount [ 3765.159433][C0] watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [systemd:1] [ 3765.166929][C0] Modules linked in: acpi_cpufreq(+) ip_tables x_tables sd_mod ahci libahci tg3 bnxt_en megaraid_sas libata firmware_class libphy dm_mirror dm_region_hash dm_log dm_mod [ 3765.183576][C0] irq event stamp: 26230104 [ 3765.187954][C0] hardirqs last enabled at (26230103): [] asm_common_interrupt+0x1e/0x40 [ 3765.197873][C0] hardirqs last disabled at (26230104): [] sysvec_apic_timer_interrupt+0xa/0xa0 [ 3765.208303][C0] softirqs last enabled at (26202664): [] __do_softirq+0x61b/0x95d [ 3765.217699][C0] softirqs last disabled at (26202591): [] asm_call_irq_on_stack+0x12/0x20 [ 3765.227702][C0] CPU: 0 PID: 1 Comm: systemd Not tainted 5.10.0-rc2-next-20201109+ #2 [ 3765.235793][C0] Hardware name: Dell Inc. 
PowerEdge R6415/07YXFK, BIOS 1.9.3 06/25/2019 [ 3765.244065][C0] RIP: 0010:lock_acquire+0x1f4/0x820 lock_acquire at kernel/locking/lockdep.c:5404 [ 3765.249211][C0] Code: ff ff ff 48 83 c4 20 65 0f c1 05 a7 ba 9e 7e 83 f8 01 4c 8b 54 24 08 0f 85 60 04 00 00 41 52 9d 48 b8 00 00 00 00 00 fc ff df <48> 01 c3 c7 03 00 00 00 00 c7 43 08 00 00 00 00 48 8b0 [ 3765.268657][C0] RSP: 0018:c906f9f8 EFLAGS: 0246 [ 3765.274587][C0] RAX: dc00 RBX: 1920df42 RCX: 1920df28 [ 3765.282420][C0] RDX: 111020645769 RSI: RDI: 0001 [ 3765.290256][C0] RBP: 0001 R08: fbfff164cb10 R09: fbfff164cb10 [ 3765.298090][C0] R10: 0246 R11: fbfff164cb0f R12: 88812be555b0 [ 3765.305922][C0] R13: R14: R15: [ 3765.313750][C0] FS: 7f12bb8c59c0() GS:8881b700() knlGS: [ 3765.322537][C0] CS: 0010 DS: ES: CR0: 80050033 [ 3765.328985][C0] CR2: 7f0c2d828fd0 CR3: 00011868a000 CR4: 003506f0 [ 3765.336820][C0] Call Trace: [ 3765.339979][C0] ? rcu_read_unlock+0x40/0x40 [ 3765.344609][C0] __d_move+0x2a2/0x16f0 __seqprop_spinlock_assert at include/linux/seqlock.h:277 (inlined by) __d_move at fs/dcache.c:2861 [ 3765.348711][C0] ? d_move+0x47/0x70 [ 3765.352560][C0] ? _raw_spin_unlock+0x1a/0x30 [ 3765.357275][C0] d_move+0x47/0x70 write_seqcount_t_end at include/linux/seqlock.h:560 (inlined by) write_sequnlock at include/linux/seqlock.h:901 (inlined by) d_move at fs/dcache.c:2916 [ 3765.360951][C0] ? vfs_rename+0x9ac/0x1270
Re: [patch V2 00/46] x86, PCI, XEN, genirq ...: Prepare for device MSI
On Wed, 2020-08-26 at 13:16 +0200, Thomas Gleixner wrote:
> This is the second version of providing a base to support device MSI (non
> PCI based) and on top of that support for IMS (Interrupt Message Storm)
> based devices in a halfways architecture independent way.
>
> The first version can be found here:
>
> https://lore.kernel.org/r/20200821002424.119492...@linutronix.de
>
> It's still a mixed bag of bug fixes, cleanups and general improvements
> which are worthwhile independent of device MSI.

Reverting the part of this patchset on the top of today's linux-next fixed
a boot issue on HPE ProLiant DL560 Gen10, i.e.,

$ git revert --no-edit 13b90cadfc29..bc95fd0d7c42

.config: https://gitlab.com/cailca/linux-mm/-/blob/master/x86.config

It looks like the crashes happen in the interrupt remapping code where
they are only able to generate partial call traces.

[1.912386][T0] ACPI: X2APIC_NMI (uid[0xf5] high level 9983][T0] ... MAX_LOCK_DEPTH: 48
[7.914876][T0] ... MAX_LOCKDEP_KEYS:8192
[7.919942][T0] ... CLASSHASH_SIZE: 4096
[7.925009][T0] ... MAX_LOCKDEP_ENTRIES: 32768
[7.930163][T0] ... MAX_LOCKDEP_CHAINS: 65536
[7.935318][T0] ... CHAINHASH_SIZE: 32768
[7.940473][T0] memory used by lock dependency info: 6301 kB
[7.946586][T0] memory used for stack traces: 4224 kB
[7.952088][T0] per task-struct memory footprint: 1920 bytes
[7.968312][T0] mempolicy: Enabling automatic NUMA balancing.
Configure with numa_balancing= or the kernel.numa_balancing sysctl [7.980281][T0] ACPI: Core revision 20200717 [7.993343][T0] clocksource: hpet: mask: 0x max_cycles: 0x, max_idle_ns: 79635855245 ns [8.003270][T0] APIC: Switch to symmetric I/O mode setup [8.008951][T0] DMAR: Host address width 46 [8.013512][T0] DMAR: DRHD base: 0x00e5ffc000 flags: 0x0 [8.019680][T0] DMAR: dmar0: reg_base_addr e5ffc000 ver 1:0 cap 8d2078c106f0466 [T0] DMAR-IR: IOAPIC id 15 under DRHD base 0xe5ffc000 IOMMU 0 [8.420990][T0] DMAR-IR: IOAPIC id 8 under DRHD base 0xddffc000 IOMMU 15 [8.428166][T0] DMAR-IR: IOAPIC id 9 under DRHD base 0xddffc000 IOMMU 15 [8.435341][T0] DMAR-IR: HPET id 0 under DRHD base 0xddffc000 [8.441456][T0] DMAR-IR: Queued invalidation will be enabled to support x2apic and Intr-remapping. [8.457911][T0] DMAR-IR: Enabled IRQ remapping in x2apic mode [8.466614][T0] BUG: kernel NULL pointer dereference, address: [8.474295][T0] #PF: supervisor instruction fetch in kernel mode [8.480669][T0] #PF: error_code(0x0010) - not-present page [8.486518][T0] PGD 0 P4D 0 [8.489757][T0] Oops: 0010 [#1] SMP KASAN PTI [8.494476][T0] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G I 5.9.0-rc6-next-20200925 #2 [8.503987][T0] Hardware name: HPE ProLiant DL560 Gen10/ProLiant DL560 Gen10, BIOS U34 11/13/2019 [8.513238][T0] RIP: 0010:0x0 [8.516562][T0] Code: Bad RIP v or [2.906744][T0] ACPI: X2API32, address 0xfec68000, GSI 128-135 [2.907063][T0] IOAPIC[15]: apic_id 29, version 32, address 0xfec7, GSI 136-143 [2.907071][T0] IOAPIC[16]: apic_id 30, version 32, address 0xfec78000, GSI 144-151 [2.907079][T0] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl) [2.907084][T0] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level) [2.907100][T0] Using ACPI (MADT) for SMP configuration information [2.907105][T0] ACPI: HPET id: 0x8086a701 base: 0xfed0 [2.907116][T0] ACPI: SPCR: console: uart,mmio,0x0,115200 [2.907121][T0] TSC deadline timer available [2.907126][T0] smpboot: Allowing 144 CPUs, 0 
hotplug CPUs [2.907163][T0] [mem 0xd000-0xfdff] available for PCI devices [2.907175][T0] clocksource: refined-jiffies: mask: 0x max_cycles: 0x, max_idle_ns: 1911260446275 ns [2.914541][T0] setup_percpu: NR_CPUS:256 nr_cpumask_bits:144 nr_cpu_ids:144 nr_node_ids:4 [2.926109][ 466 ecap f020df [9.134709][T0] DMAR: DRHD base: 0x00f5ffc000 flags: 0x0 [9.140867][T0] DMAR: dmar8: reg_base_addr f5ffc000 ver 1:0 cap 8d2078c106f0466 ecap f020df [9.149610][T0] DMAR: DRHD base: 0x00f7ffc000 flags: 0x0 [9.155762][T0] DMAR: dmar9: reg_base_addr f7ffc000 ver 1:0 cap 8d2078c106f0466 ecap f020df [9.164491][T0] DMAR: DRHD base: 0x00f9ffc000 flags: 0x0 [9.170645][T0] DMAR: dmar10: reg_base_addr f9ffc000 ver 1:0 cap 8d2078c106f0466 ecap f020df [9.179476][T0] DMAR: DRHD base: 0x00fbffc000 flags: 0x0 [9.185626][T0] DMAR: dmar11: reg_base_addr fbffc000 ver 1:0 cap 8d2078c106f0466 ecap f020df [9.194442][T0] DMAR: DRHD base: 0x00dfffc000 flags: 0x0 [9.200587][T0] DMAR: dmar12:
Re: [patch V2 34/46] PCI/MSI: Make arch_.*_msi_irq[s] fallbacks selectable
On Wed, 2020-08-26 at 13:17 +0200, Thomas Gleixner wrote: > From: Thomas Gleixner > > The arch_.*_msi_irq[s] fallbacks are compiled in whether an architecture > requires them or not. Architectures which are fully utilizing hierarchical > irq domains should never call into that code. > > It's not only architectures which depend on that by implementing one or > more of the weak functions, there is also a bunch of drivers which relies > on the weak functions which invoke msi_controller::setup_irq[s] and > msi_controller::teardown_irq. > > Make the architectures and drivers which rely on them select them in Kconfig > and if not selected replace them by stub functions which emit a warning and > fail the PCI/MSI interrupt allocation. > > Signed-off-by: Thomas Gleixner Today's linux-next will have some warnings on s390x: .config: https://gitlab.com/cailca/linux-mm/-/blob/master/s390.config WARNING: unmet direct dependencies detected for PCI_MSI_ARCH_FALLBACKS Depends on [n]: PCI [=n] Selected by [y]: - S390 [=y] WARNING: unmet direct dependencies detected for PCI_MSI_ARCH_FALLBACKS Depends on [n]: PCI [=n] Selected by [y]: - S390 [=y] > --- > V2: Make the architectures (and drivers) which need the fallbacks select them > and not the other way round (Bjorn). 
> --- > arch/ia64/Kconfig |1 + > arch/mips/Kconfig |1 + > arch/powerpc/Kconfig |1 + > arch/s390/Kconfig |1 + > arch/sparc/Kconfig |1 + > arch/x86/Kconfig |1 + > drivers/pci/Kconfig|3 +++ > drivers/pci/controller/Kconfig |3 +++ > drivers/pci/msi.c |3 ++- > include/linux/msi.h| 31 ++- > 10 files changed, 40 insertions(+), 6 deletions(-) > > --- a/arch/ia64/Kconfig > +++ b/arch/ia64/Kconfig > @@ -56,6 +56,7 @@ config IA64 > select NEED_DMA_MAP_STATE > select NEED_SG_DMA_LENGTH > select NUMA if !FLATMEM > + select PCI_MSI_ARCH_FALLBACKS > default y > help > The Itanium Processor Family is Intel's 64-bit successor to > --- a/arch/mips/Kconfig > +++ b/arch/mips/Kconfig > @@ -86,6 +86,7 @@ config MIPS > select MODULES_USE_ELF_REL if MODULES > select MODULES_USE_ELF_RELA if MODULES && 64BIT > select PERF_USE_VMALLOC > + select PCI_MSI_ARCH_FALLBACKS > select RTC_LIB > select SYSCTL_EXCEPTION_TRACE > select VIRT_TO_BUS > --- a/arch/powerpc/Kconfig > +++ b/arch/powerpc/Kconfig > @@ -246,6 +246,7 @@ config PPC > select OLD_SIGACTIONif PPC32 > select OLD_SIGSUSPEND > select PCI_DOMAINS if PCI > + select PCI_MSI_ARCH_FALLBACKS > select PCI_SYSCALL if PCI > select PPC_DAWR if PPC64 > select RTC_LIB > --- a/arch/s390/Kconfig > +++ b/arch/s390/Kconfig > @@ -185,6 +185,7 @@ config S390 > select OLD_SIGSUSPEND3 > select PCI_DOMAINS if PCI > select PCI_MSI if PCI > + select PCI_MSI_ARCH_FALLBACKS > select SPARSE_IRQ > select SYSCTL_EXCEPTION_TRACE > select THREAD_INFO_IN_TASK > --- a/arch/sparc/Kconfig > +++ b/arch/sparc/Kconfig > @@ -43,6 +43,7 @@ config SPARC > select GENERIC_STRNLEN_USER > select MODULES_USE_ELF_RELA > select PCI_SYSCALL if PCI > + select PCI_MSI_ARCH_FALLBACKS > select ODD_RT_SIGACTION > select OLD_SIGSUSPEND > select CPU_NO_EFFICIENT_FFS > --- a/arch/x86/Kconfig > +++ b/arch/x86/Kconfig > @@ -225,6 +225,7 @@ config X86 > select NEED_SG_DMA_LENGTH > select PCI_DOMAINS if PCI > select PCI_LOCKLESS_CONFIG if PCI > + select PCI_MSI_ARCH_FALLBACKS > select 
PERF_EVENTS > select RTC_LIB > select RTC_MC146818_LIB > --- a/drivers/pci/Kconfig > +++ b/drivers/pci/Kconfig > @@ -56,6 +56,9 @@ config PCI_MSI_IRQ_DOMAIN > depends on PCI_MSI > select GENERIC_MSI_IRQ_DOMAIN > > +config PCI_MSI_ARCH_FALLBACKS > + bool > + > config PCI_QUIRKS > default y > bool "Enable PCI quirk workarounds" if EXPERT > --- a/drivers/pci/controller/Kconfig > +++ b/drivers/pci/controller/Kconfig > @@ -41,6 +41,7 @@ config PCI_TEGRA > bool "NVIDIA Tegra PCIe controller" > depends on ARCH_TEGRA || COMPILE_TEST > depends on PCI_MSI_IRQ_DOMAIN > + select PCI_MSI_ARCH_FALLBACKS > help > Say Y here if you want support for the PCIe host controller found > on NVIDIA Tegra SoCs. > @@ -67,6 +68,7 @@ config PCIE_RCAR_HOST > bool "Renesas R-Car PCIe host controller" > depends on ARCH_RENESAS || COMPILE_TEST > depends on PCI_MSI_IRQ_DOMAIN > + select PCI_MSI_ARCH_FALLBACKS > help > Say Y here if you want PCIe controller support on R-Car SoCs in host > mode. > @@ -103,6 +105,7 @@ config PCIE_X
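The s390x warning above comes from an unconditional `select` of a symbol whose dependency (`PCI`) is not met in that config. One way to resolve it, sketched here as an assumption rather than the fix that was actually merged, is to make the select conditional, matching how the surrounding `PCI_DOMAINS`/`PCI_MSI` selects are already guarded:

```kconfig
# arch/s390/Kconfig -- hedged sketch, not necessarily the committed fix
config S390
	select PCI_DOMAINS		if PCI
	select PCI_MSI			if PCI
	select PCI_MSI_ARCH_FALLBACKS	if PCI
```

With `if PCI` on the select, a `PCI=n` s390 config no longer forces `PCI_MSI_ARCH_FALLBACKS=y` against its dependency, which is exactly what the "unmet direct dependencies" diagnostic complains about.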
Re: [PATCH v3 00/34] iommu: Move iommu_group setup to IOMMU core code
On Tue, Jun 30, 2020 at 08:40:28PM -0400, Qian Cai wrote:
> On Wed, Apr 29, 2020 at 03:36:38PM +0200, Joerg Roedel wrote:
> > Hi,
> >
> > here is the third version of this patch-set. Older versions can be found
> > here:
> >
> > v1: https://lore.kernel.org/lkml/20200407183742.4344-1-j...@8bytes.org/
> > (Has some more introductory text)
> >
> > v2: https://lore.kernel.org/lkml/20200414131542.25608-1-j...@8bytes.org/
> >
> > Changes v2 -> v3:
> >
> > 	* Rebased v5.7-rc3
> >
> > 	* Added a missing iommu_group_put() as reported by Lu Baolu.
> >
> > 	* Added a patch to consolidate more initialization work in
> > 	  __iommu_probe_device(), fixing a bug where no 'struct
> > 	  device_iommu' was allocated in the hotplug path.
> >
> > There is also a git-branch available with these patches applied:
> >
> > 	https://git.kernel.org/pub/scm/linux/kernel/git/joro/linux.git/log/?h=iommu-probe-device-v3
> >
> > Please review. If there are no objections I plan to put these patches
> > into the IOMMU tree early next week.
>
> Looks like this patchset introduced a use-after-free on arm-smmu-v3.
>
> Reproduced using mlx5,
>
> # echo 1 > /sys/class/net/enp11s0f1np1/device/sriov_numvfs
> # echo 0 > /sys/class/net/enp11s0f1np1/device/sriov_numvfs
>
> The .config,
> https://github.com/cailca/linux-mm/blob/master/arm64.config
>
> Looking at the free stack,
>
> iommu_release_device->iommu_group_remove_device
>
> was introduced in 07/34 ("iommu: Add probe_device() and release_device()
> call-backs").
FYI, I have just sent a patch to fix this, https://lore.kernel.org/linux-iommu/20200704001003.2303-1-...@lca.pw/ > > [ 9426.724641][ T3356] pci :0b:01.2: Removing from iommu group 3 > [ 9426.731347][ T3356] > == > [ 9426.739263][ T3356] BUG: KASAN: use-after-free in > __lock_acquire+0x3458/0x4440 > __lock_acquire at kernel/locking/lockdep.c:4250 > [ 9426.746477][ T3356] Read of size 8 at addr 0089df1a6f68 by task > bash/3356 > [ 9426.753601][ T3356] > [ 9426.755782][ T3356] CPU: 5 PID: 3356 Comm: bash Not tainted > 5.8.0-rc3-next-20200630 #2 > [ 9426.763687][ T3356] Hardware name: HPE Apollo 70 > /C01_APACHE_MB , BIOS L50_5.13_1.11 06/18/2019 > [ 9426.774111][ T3356] Call trace: > [ 9426.777245][ T3356] dump_backtrace+0x0/0x398 > [ 9426.781593][ T3356] show_stack+0x14/0x20 > [ 9426.785596][ T3356] dump_stack+0x140/0x1b8 > [ 9426.789772][ T3356] print_address_description.isra.12+0x54/0x4a8 > [ 9426.795855][ T3356] kasan_report+0x134/0x1b8 > [ 9426.800203][ T3356] __asan_report_load8_noabort+0x2c/0x50 > [ 9426.805679][ T3356] __lock_acquire+0x3458/0x4440 > [ 9426.810373][ T3356] lock_acquire+0x204/0xf10 > [ 9426.814722][ T3356] _raw_spin_lock_irqsave+0xf8/0x180 > [ 9426.819853][ T3356] arm_smmu_detach_dev+0xd8/0x4a0 > arm_smmu_detach_dev at drivers/iommu/arm-smmu-v3.c:2776 > [ 9426.824721][ T3356] arm_smmu_release_device+0xb4/0x1c8 > arm_smmu_disable_pasid at drivers/iommu/arm-smmu-v3.c:2754 > (inlined by) arm_smmu_release_device at drivers/iommu/arm-smmu-v3.c:3000 > [ 9426.829937][ T3356] iommu_release_device+0xc0/0x178 > iommu_release_device at drivers/iommu/iommu.c:302 > [ 9426.834892][ T3356] iommu_bus_notifier+0x118/0x160 > [ 9426.839762][ T3356] notifier_call_chain+0xa4/0x128 > [ 9426.844630][ T3356] __blocking_notifier_call_chain+0x70/0xa8 > [ 9426.850367][ T3356] blocking_notifier_call_chain+0x14/0x20 > [ 9426.855929][ T3356] device_del+0x618/0xa00 > [ 9426.860105][ T3356] pci_remove_bus_device+0x108/0x2d8 > [ 9426.865233][ T3356] 
pci_stop_and_remove_bus_device+0x1c/0x28 > [ 9426.870972][ T3356] pci_iov_remove_virtfn+0x228/0x368 > [ 9426.876100][ T3356] sriov_disable+0x8c/0x348 > [ 9426.880447][ T3356] pci_disable_sriov+0x5c/0x70 > [ 9426.885117][ T3356] mlx5_core_sriov_configure+0xd8/0x260 [mlx5_core] > [ 9426.891549][ T3356] sriov_numvfs_store+0x240/0x318 > [ 9426.896417][ T3356] dev_attr_store+0x38/0x68 > [ 9426.900766][ T3356] sysfs_kf_write+0xdc/0x128 > [ 9426.905200][ T3356] kernfs_fop_write+0x23c/0x448 > [ 9426.909897][ T3356] __vfs_write+0x54/0xe8 > [ 9426.913984][ T3356] vfs_write+0x124/0x3f0 > [ 9426.918070][ T3356] ksys_write+0xe8/0x1b8 > [ 9426.922157][ T3356] __arm64_sys_write+0x68/0x98 > [ 9426.926766][ T3356] do_el0_svc+0x124/0x220 > [ 9426.930941][ T3356] el0_sync_handler+0x260/0x408 > [ 9426.935634][ T3356] el0_sync+0x140/0x180 > [ 9426.939633][ T3356] > [ 9426.941810][ T3356] Allocated by task 3356: > [ 9426.9
[PATCH] iommu: fix use-after-free in iommu_release_device
b fb fb fb fb
 0089df1a6e80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>0089df1a6f00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                                            ^
 0089df1a6f80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 0089df1a7000: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc

Fixes: a6a4c7e2c5b8 ("iommu: Add probe_device() and release_device() call-backs")
Signed-off-by: Qian Cai
---
 drivers/iommu/iommu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 1ed1e14a1f0c..0eeb3251266a 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -295,10 +295,10 @@ void iommu_release_device(struct device *dev)
 		return;
 
 	iommu_device_unlink(dev->iommu->iommu_dev, dev);
-	iommu_group_remove_device(dev);
 	ops->release_device(dev);
+	iommu_group_remove_device(dev);
 	module_put(ops->owner);
 	dev_iommu_free(dev);
 }
-- 
2.21.0 (Apple Git-122.2)
Re: [PATCH v3 00/34] iommu: Move iommu_group setup to IOMMU core code
On Wed, Apr 29, 2020 at 03:36:38PM +0200, Joerg Roedel wrote:
> Hi,
>
> here is the third version of this patch-set. Older versions can be found
> here:
>
> v1: https://lore.kernel.org/lkml/20200407183742.4344-1-j...@8bytes.org/
> (Has some more introductory text)
>
> v2: https://lore.kernel.org/lkml/20200414131542.25608-1-j...@8bytes.org/
>
> Changes v2 -> v3:
>
> 	* Rebased v5.7-rc3
>
> 	* Added a missing iommu_group_put() as reported by Lu Baolu.
>
> 	* Added a patch to consolidate more initialization work in
> 	  __iommu_probe_device(), fixing a bug where no 'struct
> 	  device_iommu' was allocated in the hotplug path.
>
> There is also a git-branch available with these patches applied:
>
> 	https://git.kernel.org/pub/scm/linux/kernel/git/joro/linux.git/log/?h=iommu-probe-device-v3
>
> Please review. If there are no objections I plan to put these patches
> into the IOMMU tree early next week.

Looks like this patchset introduced a use-after-free on arm-smmu-v3.

Reproduced using mlx5,

# echo 1 > /sys/class/net/enp11s0f1np1/device/sriov_numvfs
# echo 0 > /sys/class/net/enp11s0f1np1/device/sriov_numvfs

The .config,
https://github.com/cailca/linux-mm/blob/master/arm64.config

Looking at the free stack,

iommu_release_device->iommu_group_remove_device

was introduced in 07/34 ("iommu: Add probe_device() and release_device()
call-backs").
[ 9426.724641][ T3356] pci :0b:01.2: Removing from iommu group 3 [ 9426.731347][ T3356] == [ 9426.739263][ T3356] BUG: KASAN: use-after-free in __lock_acquire+0x3458/0x4440 __lock_acquire at kernel/locking/lockdep.c:4250 [ 9426.746477][ T3356] Read of size 8 at addr 0089df1a6f68 by task bash/3356 [ 9426.753601][ T3356] [ 9426.755782][ T3356] CPU: 5 PID: 3356 Comm: bash Not tainted 5.8.0-rc3-next-20200630 #2 [ 9426.763687][ T3356] Hardware name: HPE Apollo 70 /C01_APACHE_MB , BIOS L50_5.13_1.11 06/18/2019 [ 9426.774111][ T3356] Call trace: [ 9426.777245][ T3356] dump_backtrace+0x0/0x398 [ 9426.781593][ T3356] show_stack+0x14/0x20 [ 9426.785596][ T3356] dump_stack+0x140/0x1b8 [ 9426.789772][ T3356] print_address_description.isra.12+0x54/0x4a8 [ 9426.795855][ T3356] kasan_report+0x134/0x1b8 [ 9426.800203][ T3356] __asan_report_load8_noabort+0x2c/0x50 [ 9426.805679][ T3356] __lock_acquire+0x3458/0x4440 [ 9426.810373][ T3356] lock_acquire+0x204/0xf10 [ 9426.814722][ T3356] _raw_spin_lock_irqsave+0xf8/0x180 [ 9426.819853][ T3356] arm_smmu_detach_dev+0xd8/0x4a0 arm_smmu_detach_dev at drivers/iommu/arm-smmu-v3.c:2776 [ 9426.824721][ T3356] arm_smmu_release_device+0xb4/0x1c8 arm_smmu_disable_pasid at drivers/iommu/arm-smmu-v3.c:2754 (inlined by) arm_smmu_release_device at drivers/iommu/arm-smmu-v3.c:3000 [ 9426.829937][ T3356] iommu_release_device+0xc0/0x178 iommu_release_device at drivers/iommu/iommu.c:302 [ 9426.834892][ T3356] iommu_bus_notifier+0x118/0x160 [ 9426.839762][ T3356] notifier_call_chain+0xa4/0x128 [ 9426.844630][ T3356] __blocking_notifier_call_chain+0x70/0xa8 [ 9426.850367][ T3356] blocking_notifier_call_chain+0x14/0x20 [ 9426.855929][ T3356] device_del+0x618/0xa00 [ 9426.860105][ T3356] pci_remove_bus_device+0x108/0x2d8 [ 9426.865233][ T3356] pci_stop_and_remove_bus_device+0x1c/0x28 [ 9426.870972][ T3356] pci_iov_remove_virtfn+0x228/0x368 [ 9426.876100][ T3356] sriov_disable+0x8c/0x348 [ 9426.880447][ T3356] pci_disable_sriov+0x5c/0x70 [ 9426.885117][ 
T3356] mlx5_core_sriov_configure+0xd8/0x260 [mlx5_core] [ 9426.891549][ T3356] sriov_numvfs_store+0x240/0x318 [ 9426.896417][ T3356] dev_attr_store+0x38/0x68 [ 9426.900766][ T3356] sysfs_kf_write+0xdc/0x128 [ 9426.905200][ T3356] kernfs_fop_write+0x23c/0x448 [ 9426.909897][ T3356] __vfs_write+0x54/0xe8 [ 9426.913984][ T3356] vfs_write+0x124/0x3f0 [ 9426.918070][ T3356] ksys_write+0xe8/0x1b8 [ 9426.922157][ T3356] __arm64_sys_write+0x68/0x98 [ 9426.926766][ T3356] do_el0_svc+0x124/0x220 [ 9426.930941][ T3356] el0_sync_handler+0x260/0x408 [ 9426.935634][ T3356] el0_sync+0x140/0x180 [ 9426.939633][ T3356] [ 9426.941810][ T3356] Allocated by task 3356: [ 9426.945985][ T3356] save_stack+0x24/0x50 [ 9426.949986][ T3356] __kasan_kmalloc.isra.13+0xc4/0xe0 [ 9426.955114][ T3356] kasan_kmalloc+0xc/0x18 [ 9426.959288][ T3356] kmem_cache_alloc_trace+0x1ec/0x318 [ 9426.964503][ T3356] arm_smmu_domain_alloc+0x54/0x148 [ 9426.969545][ T3356] iommu_group_alloc_default_domain+0xc0/0x440 [ 9426.975541][ T3356] iommu_probe_device+0x1c0/0x308 [ 9426.980409][ T3356] iort_iommu_configure+0x434/0x518 [ 9426.985452][ T3356] acpi_dma_configure+0xf0/0x128 [ 9426.990235][ T3356] pci_dma_configure+0x114/0x160 [ 9426.995017][ T3356] really_probe+0x124/0x6d8 [ 9426.999364][ T3356] driver_probe_device+0xc4/0x180 [ 9427.004232][ T3356] __device_attach_driver+0x184/0x1e8 [ 9427.009447][ T3356] bus_for_each_drv+0x114/0x1a0 [ 9427.014142][ T
Re: [PATCH v2 0/2] iommu/amd: Don't use atomic64_t for domain->pt_root
> On Jun 26, 2020, at 4:05 AM, Joerg Roedel wrote: > > a previous discussion pointed out that using atomic64_t for that > purpose is a bit of overkill. This patch-set replaces it with unsigned > long and introduces some helpers first to make the change more easy. BTW, from the previous discussion, Linus mentioned, “ The thing is, the 64-bit atomic reads/writes are very expensive on 32-bit x86. If it was just a native pointer, it would be much cheaper than an "atomic64_t". “ However, here we have AMD_IOMMU depend on x86_64, so I am wondering if it makes any sense to run this code on 32-bit x86 at all? ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 2/2] iommu/amd: Use 'unsigned long' for domain->pt_root
On Thu, Jun 25, 2020 at 04:52:27PM +0200, Joerg Roedel wrote: > From: Joerg Roedel > > Using atomic64_t can be quite expensive, so use unsigned long instead. > This is safe because the write becomes visible atomically. > > Signed-off-by: Joerg Roedel > --- > drivers/iommu/amd/amd_iommu_types.h | 2 +- > drivers/iommu/amd/iommu.c | 10 -- > 2 files changed, 9 insertions(+), 3 deletions(-) > > diff --git a/drivers/iommu/amd/amd_iommu_types.h > b/drivers/iommu/amd/amd_iommu_types.h > index 30a5d412255a..f6f102282dda 100644 > --- a/drivers/iommu/amd/amd_iommu_types.h > +++ b/drivers/iommu/amd/amd_iommu_types.h > @@ -468,7 +468,7 @@ struct protection_domain { > iommu core code */ > spinlock_t lock;/* mostly used to lock the page table*/ > u16 id; /* the domain id written to the device table */ > - atomic64_t pt_root; /* pgtable root and pgtable mode */ > + unsigned long pt_root; /* pgtable root and pgtable mode */ > int glx;/* Number of levels for GCR3 table */ > u64 *gcr3_tbl; /* Guest CR3 table */ > unsigned long flags;/* flags to find out type of domain */ > diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c > index 5286ddcfc2f9..b0e1dc58244e 100644 > --- a/drivers/iommu/amd/iommu.c > +++ b/drivers/iommu/amd/iommu.c > @@ -156,7 +156,7 @@ static struct protection_domain *to_pdomain(struct > iommu_domain *dom) > static void amd_iommu_domain_get_pgtable(struct protection_domain *domain, >struct domain_pgtable *pgtable) > { > - u64 pt_root = atomic64_read(&domain->pt_root); > + unsigned long pt_root = domain->pt_root; The pt_root might be reloaded later under register pressure, if the compiler decides not to keep it in a stack variable, so it needs smp_rmb() here to match the smp_wmb() in amd_iommu_domain_set_pt_root() and make the load visible to all CPUs. 
Then, smp_rmb()/smp_wmb() alone wouldn't be able to deal with the data race, so it also needs: unsigned long pt_root = READ_ONCE(domain->pt_root); > > pgtable->root = (u64 *)(pt_root & PAGE_MASK); > pgtable->mode = pt_root & 7; /* lowest 3 bits encode pgtable mode */ > @@ -164,7 +164,13 @@ static void amd_iommu_domain_get_pgtable(struct > protection_domain *domain, > > static void amd_iommu_domain_set_pt_root(struct protection_domain *domain, > u64 root) > { > - atomic64_set(&domain->pt_root, root); > + domain->pt_root = root; WRITE_ONCE(domain->pt_root, root); > + > + /* > + * The new value needs to be gobally visible in case pt_root gets > + * cleared, so that the page-table can be safely freed. > + */ > + smp_wmb(); > } > > static void amd_iommu_domain_clr_pt_root(struct protection_domain *domain) > -- > 2.27.0
[PATCH -next] iommu/vt-d: fix a GCC warning
The commit 6ee1b77ba3ac ("iommu/vt-d: Add svm/sva invalidate function") introduced a GCC warning, drivers/iommu/intel-iommu.c:5330:1: warning: 'static' is not at beginning of declaration [-Wold-style-declaration] const static int ^~~~~ Signed-off-by: Qian Cai --- drivers/iommu/intel-iommu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index f75d7d9c231f..ff5a30a94679 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -5327,7 +5327,7 @@ static void intel_iommu_aux_detach_device(struct iommu_domain *domain, * [IOMMU_CACHE_INV_TYPE_IOTLB][IOMMU_INV_GRANU_ADDR] */ -const static int +static const int inv_type_granu_table[IOMMU_CACHE_INV_TYPE_NR][IOMMU_INV_GRANU_NR] = { /* * PASID based IOTLB invalidation: PASID selective (per PASID), -- 2.17.2 (Apple Git-113)
[PATCH -next] iommu/amd: fix variable "iommu" set but not used
The commit dce8d6964ebd ("iommu/amd: Convert to probe/release_device() call-backs") introduced an unused variable, drivers/iommu/amd_iommu.c: In function 'amd_iommu_uninit_device': drivers/iommu/amd_iommu.c:422:20: warning: variable 'iommu' set but not used [-Wunused-but-set-variable] struct amd_iommu *iommu; ^ Signed-off-by: Qian Cai --- drivers/iommu/amd_iommu.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c index fef3689ee535..2b8eb58d2bea 100644 --- a/drivers/iommu/amd_iommu.c +++ b/drivers/iommu/amd_iommu.c @@ -419,15 +419,12 @@ static void iommu_ignore_device(struct device *dev) static void amd_iommu_uninit_device(struct device *dev) { struct iommu_dev_data *dev_data; - struct amd_iommu *iommu; int devid; devid = get_device_id(dev); if (devid < 0) return; - iommu = amd_iommu_rlookup_table[devid]; - dev_data = search_dev_data(devid); if (!dev_data) return; -- 2.21.0 (Apple Git-122.2)
Re: [RFC PATCH] iommu/amd: fix a race in fetch_pte()
> On May 3, 2020, at 2:39 PM, Joerg Roedel wrote: > > Can I add your Tested-by when I > send them to the mailing list tomorrow? Sure. Feel free to add, Tested-by: Qian Cai
Re: [RFC PATCH] iommu/amd: fix a race in fetch_pte()
> On Apr 29, 2020, at 7:20 AM, Joerg Roedel wrote: > > On Mon, Apr 20, 2020 at 09:26:12AM -0400, Qian Cai wrote: >> No dice. There could be some other races. For example, > > Can you please test this branch: > > > https://git.kernel.org/pub/scm/linux/kernel/git/joro/linux.git/log/?h=amd-iommu-fixes > > It has the previous fix in it and a couple more to make sure the > device-table is updated and flushed before increase_address_space() > updates domain->pt_root. I believe this closed the existing races as it had survived for many days. Great work!
Re: [RFC PATCH] iommu/amd: fix a race in fetch_pte()
> On Apr 29, 2020, at 7:20 AM, Joerg Roedel wrote: > > On Mon, Apr 20, 2020 at 09:26:12AM -0400, Qian Cai wrote: >> No dice. There could be some other races. For example, > > Can you please test this branch: > > > https://git.kernel.org/pub/scm/linux/kernel/git/joro/linux.git/log/?h=amd-iommu-fixes > > It has the previous fix in it and a couple more to make sure the > device-table is updated and flushed before increase_address_space() > updates domain->pt_root. It looks promising as it has survived a day of stress testing. I’ll keep it running for a few days to be sure.
Re: [RFC PATCH] iommu/amd: fix a race in fetch_pte()
> On Apr 18, 2020, at 2:34 PM, Joerg Roedel wrote: > > On Sat, Apr 18, 2020 at 09:01:35AM -0400, Qian Cai wrote: >> Hard to tell without testing further. I’ll leave that optimization in >> the future, and focus on fixing those races first. > > Yeah right, we should fix the existing races first before introducing > new ones ;) > > Btw, THANKS A LOT for tracking down all these race condition bugs, I am > not even remotely able to trigger them with the hardware I have around. > > I did some hacking and the attached diff shows how I think this race > condition needs to be fixed. I boot-tested this fix on-top of v5.7-rc1, > but did no further testing. Can you test it please? No dice. There could be some other races. For example, > @@ -1536,16 +1571,19 @@ static u64 *fetch_pte(struct protection_domain > *domain, > unsigned long address, > unsigned long *page_size) ... > amd_iommu_domain_get_pgtable(domain, &pgtable); > > if (address > PM_LEVEL_SIZE(pgtable.mode)) > return NULL; > > level = pgtable.mode - 1; > pte= &pgtable.root[PM_LEVEL_INDEX(level, address)]; <— increase_address_space() > *page_size = PTE_LEVEL_PAGE_SIZE(level); > while (level > 0) { *page_size = PTE_LEVEL_PAGE_SIZE(level); Then in iommu_unmap_page(), while (unmapped < page_size) { pte = fetch_pte(dom, bus_addr, &unmap_size); … bus_addr = (bus_addr & ~(unmap_size - 1)) + unmap_size; bus_addr would be out of sync with dom->mode when it enters fetch_pte() again. Could that be a problem? 
[ 5159.274829][ T4057] LTP: starting oom02 [ 5160.382787][ C52] perf: interrupt took too long (7443 > 6208), lowering kernel.perf_event_max_sample_rate to 26800 [ 5167.951785][ T812] smartpqi :23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0027 address=0xfffc flags=0x0010] [ 5167.964540][ T812] smartpqi :23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0027 address=0xfffc1000 flags=0x0010] [ 5167.977442][ T812] smartpqi :23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0027 address=0xfffc1900 flags=0x0010] [ 5167.989901][ T812] smartpqi :23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0027 address=0xfffc1d00 flags=0x0010] [ 5168.002291][ T812] smartpqi :23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0027 address=0xfffc2000 flags=0x0010] [ 5168.014665][ T812] smartpqi :23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0027 address=0xfffc2400 flags=0x0010] [ 5168.027132][ T812] smartpqi :23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0027 address=0xfffc2800 flags=0x0010] [ 5168.039566][ T812] smartpqi :23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0027 address=0xfffc2c00 flags=0x0010] [ 5168.051956][ T812] smartpqi :23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0027 address=0xfffc3000 flags=0x0010] [ 5168.064310][ T812] smartpqi :23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0027 address=0xfffc3400 flags=0x0010] [ 5168.076652][ T812] AMD-Vi: Event logged [IO_PAGE_FAULT device=23:00.0 domain=0x0027 address=0xfffc3800 flags=0x0010] [ 5168.088290][ T812] AMD-Vi: Event logged [IO_PAGE_FAULT device=23:00.0 domain=0x0027 address=0xfffc3c00 flags=0x0010] [ 5183.695829][C8] smartpqi :23:00.0: controller is offline: status code 0x14803 [ 5183.704390][C8] smartpqi :23:00.0: controller offline [ 5183.756594][ C101] blk_update_request: I/O error, dev sda, sector 22306304 op 0x1:(WRITE) flags 0x800 phys_seg 4 prio class 0 [ 5183.756628][ C34] sd 0:1:0:0: [sda] tag#655 UNKNOWN(0x2003) Result: hostbyte=0x01 
driverbyte=0x00 cmd_age=15s [ 5183.756759][ C56] blk_update_request: I/O error, dev sda, sector 58480128 op 0x1:(WRITE) flags 0x8004000 phys_seg 4 prio class 0 [ 5183.756810][ C79] sd 0:1:0:0: [sda] tag#234 UNKNOWN(0x2003) Result: hostbyte=0x01 driverbyte=0x00 cmd_age=15s [ 5183.756816][ C121] sd 0:1:0:0: [sda] tag#104 UNKNOWN(0x2003) Result: hostbyte=0x01 driverbyte=0x00 cmd_age=15s [ 5183.756837][ T53] sd 0:1:0:0: [sda] tag#4 UNKNOWN(0x2003) Result: hostbyte=0x01 driverbyte=0x00 cmd_age=15s [ 5183.756882][ C121] sd 0:1:0:0: [sda] tag#104 CDB: opcode=0x2a 2a 00 00 4d d4 00 00 02 00 00 [ 5183.756892][ C79] sd 0:1:0:0: [sda] tag#234 CDB: opcode=0x2a 2a 00 02 03 e4 00 00 02 00 00 [ 5183.756909][ C121] blk_update_request: I/O error, dev sda, sector 5100544 op 0x1:(WRITE) flags 0x8004000 phys_seg 4 prio class 0 [ 5183.756920][ C79] blk_update_request: I/O error, dev sda, sector 33809408 op 0x1:(WRITE) flags 0x8004000 phys_seg 4 prio class 0 [ 5183.756939][ T53] sd 0:1:0:0: [sda] tag#4 CDB: o
Re: [RFC PATCH] iommu/amd: fix a race in fetch_pte()
> On Apr 18, 2020, at 2:34 PM, Joerg Roedel wrote: > > On Sat, Apr 18, 2020 at 09:01:35AM -0400, Qian Cai wrote: >> Hard to tell without testing further. I’ll leave that optimization in >> the future, and focus on fixing those races first. > > Yeah right, we should fix the existing races first before introducing > new ones ;) > > Btw, THANKS A LOT for tracking down all these race condition bugs, I am > not even remotely able to trigger them with the hardware I have around. > > I did some hacking and the attached diff shows how I think this race > condition needs to be fixed. I boot-tested this fix on-top of v5.7-rc1, > but did no further testing. Can you test it please? Sure, give it a few days to see if it could survive.
Re: [RFC PATCH] iommu/amd: fix a race in fetch_pte()
> On Apr 18, 2020, at 8:10 AM, Joerg Roedel wrote: > > Yes, your patch still looks racy. You need to atomically read > domain->pt_root to a stack variable and derive the pt_root pointer and > the mode from that variable instead of domain->pt_root directly. If you > read the domain->pt_root twice there could still be an update between > the two reads. > Probably the lock in increase_address_space() can also be avoided if > pt_root is updated using cmpxchg()? Hard to tell without testing further. I’ll leave that optimization in the future, and focus on fixing those races first.
Re: [RFC PATCH] iommu/amd: fix a race in fetch_pte()
> On Apr 13, 2020, at 9:36 PM, Qian Cai wrote: > > > >> On Apr 8, 2020, at 10:19 AM, Joerg Roedel wrote: >> >> Hi Qian, >> >> On Tue, Apr 07, 2020 at 11:36:05AM -0400, Qian Cai wrote: >>> After further testing, the change along is insufficient. What I am chasing >>> right >>> now is the swap device will go offline after heavy memory pressure below. >>> The >>> symptom is similar to what we have in the commit, >>> >>> 754265bcab78 (“iommu/amd: Fix race in increase_address_space()”) >>> >>> Apparently, it is no possible to take the domain->lock in fetch_pte() >>> because it >>> could sleep. >> >> Thanks a lot for finding and tracking down another race in the AMD IOMMU >> page-table code. The domain->lock is a spin-lock and taking it can't >> sleep. But fetch_pte() is a fast-path and must not take any locks. >> >> I think the best fix is to update the pt_root and mode of the domain >> atomically by storing the mode in the lower 12 bits of pt_root. This way >> they are stored together and can be read/write atomically. > > Like this? So, this is still not enough; it would still take the storage driver offline under memory pressure, just a bit later. It looks to me that fetch_pte() could still be racy: level = domain->mode - 1; pte= &domain->pt_root[PM_LEVEL_INDEX(level, address)]; <— increase_address_space(); *page_size = PTE_LEVEL_PAGE_SIZE(level); while (level > 0) { *page_size = PTE_LEVEL_PAGE_SIZE(level); Then in iommu_unmap_page(), while (unmapped < page_size) { pte = fetch_pte(dom, bus_addr, &unmap_size); … bus_addr = (bus_addr & ~(unmap_size - 1)) + unmap_size; bus_addr would be out of sync with dom->mode when it enters fetch_pte() again. Could that be a problem? 
> > diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c > index 20cce366e951..b36c6b07cbfd 100644 > --- a/drivers/iommu/amd_iommu.c > +++ b/drivers/iommu/amd_iommu.c > @@ -1397,13 +1397,13 @@ static struct page *free_sub_pt(unsigned long root, > int mode, > > static void free_pagetable(struct protection_domain *domain) > { > - unsigned long root = (unsigned long)domain->pt_root; > + int level = iommu_get_mode(domain->pt_root); > + unsigned long root = iommu_get_root(domain->pt_root); > struct page *freelist = NULL; > > - BUG_ON(domain->mode < PAGE_MODE_NONE || > -domain->mode > PAGE_MODE_6_LEVEL); > + BUG_ON(level < PAGE_MODE_NONE || level > PAGE_MODE_6_LEVEL); > > - freelist = free_sub_pt(root, domain->mode, freelist); > + freelist = free_sub_pt(root, level, freelist); > > free_page_list(freelist); > } > @@ -1417,24 +1417,27 @@ static bool increase_address_space(struct > protection_domain *domain, > unsigned long address, > gfp_t gfp) > { > + int level; > unsigned long flags; > bool ret = false; > u64 *pte; > > spin_lock_irqsave(&domain->lock, flags); > > - if (address <= PM_LEVEL_SIZE(domain->mode) || > - WARN_ON_ONCE(domain->mode == PAGE_MODE_6_LEVEL)) > + level = iommu_get_mode(domain->pt_root); > + > + if (address <= PM_LEVEL_SIZE(level) || > + WARN_ON_ONCE(level == PAGE_MODE_6_LEVEL)) > goto out; > > pte = (void *)get_zeroed_page(gfp); > if (!pte) > goto out; > > - *pte = PM_LEVEL_PDE(domain->mode, > - iommu_virt_to_phys(domain->pt_root)); > - domain->pt_root = pte; > - domain->mode+= 1; > + *pte = PM_LEVEL_PDE(level, > + iommu_virt_to_phys((void *)iommu_get_root(domain->pt_root))); > + > + WRITE_ONCE(domain->pt_root, (unsigned long)pte + level + 1); > > ret = true; > > @@ -1452,15 +1455,17 @@ static u64 *alloc_pte(struct protection_domain > *domain, > bool *updated) > { > int level, end_lvl; > - u64 *pte, *page; > + u64 *pte, *page, *pt_root, *root; > > BUG_ON(!is_power_of_2(page_size)); > > - while (address > PM_LEVEL_SIZE(domain->mode)) > + 
while (address > PM_LEVEL_SIZE(iommu_get_mode(domain->pt_root))) > *updated = increase_address_space(domain, address, gfp) || > *updated; >
Re: [RFC PATCH] iommu/amd: fix a race in fetch_pte()
> On Apr 8, 2020, at 10:19 AM, Joerg Roedel wrote: > > Hi Qian, > > On Tue, Apr 07, 2020 at 11:36:05AM -0400, Qian Cai wrote: >> After further testing, the change along is insufficient. What I am chasing >> right >> now is the swap device will go offline after heavy memory pressure below. The >> symptom is similar to what we have in the commit, >> >> 754265bcab78 (“iommu/amd: Fix race in increase_address_space()”) >> >> Apparently, it is no possible to take the domain->lock in fetch_pte() >> because it >> could sleep. > > Thanks a lot for finding and tracking down another race in the AMD IOMMU > page-table code. The domain->lock is a spin-lock and taking it can't > sleep. But fetch_pte() is a fast-path and must not take any locks. > > I think the best fix is to update the pt_root and mode of the domain > atomically by storing the mode in the lower 12 bits of pt_root. This way > they are stored together and can be read/write atomically. Like this? diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c index 20cce366e951..b36c6b07cbfd 100644 --- a/drivers/iommu/amd_iommu.c +++ b/drivers/iommu/amd_iommu.c @@ -1397,13 +1397,13 @@ static struct page *free_sub_pt(unsigned long root, int mode, static void free_pagetable(struct protection_domain *domain) { - unsigned long root = (unsigned long)domain->pt_root; + int level = iommu_get_mode(domain->pt_root); + unsigned long root = iommu_get_root(domain->pt_root); struct page *freelist = NULL; - BUG_ON(domain->mode < PAGE_MODE_NONE || - domain->mode > PAGE_MODE_6_LEVEL); + BUG_ON(level < PAGE_MODE_NONE || level > PAGE_MODE_6_LEVEL); - freelist = free_sub_pt(root, domain->mode, freelist); + freelist = free_sub_pt(root, level, freelist); free_page_list(freelist); } @@ -1417,24 +1417,27 @@ static bool increase_address_space(struct protection_domain *domain, unsigned long address, gfp_t gfp) { + int level; unsigned long flags; bool ret = false; u64 *pte; spin_lock_irqsave(&domain->lock, flags); - if (address <= 
PM_LEVEL_SIZE(domain->mode) || - WARN_ON_ONCE(domain->mode == PAGE_MODE_6_LEVEL)) + level = iommu_get_mode(domain->pt_root); + + if (address <= PM_LEVEL_SIZE(level) || + WARN_ON_ONCE(level == PAGE_MODE_6_LEVEL)) goto out; pte = (void *)get_zeroed_page(gfp); if (!pte) goto out; - *pte = PM_LEVEL_PDE(domain->mode, - iommu_virt_to_phys(domain->pt_root)); - domain->pt_root = pte; - domain->mode+= 1; + *pte = PM_LEVEL_PDE(level, + iommu_virt_to_phys((void *)iommu_get_root(domain->pt_root))); + + WRITE_ONCE(domain->pt_root, (unsigned long)pte + level + 1); ret = true; @@ -1452,15 +1455,17 @@ static u64 *alloc_pte(struct protection_domain *domain, bool *updated) { int level, end_lvl; - u64 *pte, *page; + u64 *pte, *page, *pt_root, *root; BUG_ON(!is_power_of_2(page_size)); - while (address > PM_LEVEL_SIZE(domain->mode)) + while (address > PM_LEVEL_SIZE(iommu_get_mode(domain->pt_root))) *updated = increase_address_space(domain, address, gfp) || *updated; - level = domain->mode - 1; - pte = &domain->pt_root[PM_LEVEL_INDEX(level, address)]; + pt_root = READ_ONCE(domain->pt_root); + root= (void *)iommu_get_root(pt_root); + level = iommu_get_mode(pt_root) - 1; + pte = &root[PM_LEVEL_INDEX(level, address)]; address = PAGE_SIZE_ALIGN(address, page_size); end_lvl = PAGE_SIZE_LEVEL(page_size); @@ -1536,16 +1541,18 @@ static u64 *fetch_pte(struct protection_domain *domain, unsigned long address, unsigned long *page_size) { - int level; u64 *pte; + u64 *pt_root = READ_ONCE(domain->pt_root); + u64 *root= (void *)iommu_get_root(pt_root); + int level= iommu_get_mode(pt_root); *page_size = 0; - if (address > PM_LEVEL_SIZE(domain->mode)) + if (address > PM_LEVEL_SIZE(level)) return NULL; - level = domain->mode - 1; - pte= &domain->pt_root[PM_LEVEL_INDEX(level, address)]; + level--; + pte= &root[PM_LEVEL_INDEX(level, address)]; *page_size = PTE_LEVEL_PAGE_SIZE(level); while (level > 0) { @@ -1814,12 +1821,13 @@ static struct protection_domain *dma_ops_domain_alloc(void) if (prot
Re: [RFC PATCH] iommu/amd: fix a race in fetch_pte()
> On Apr 6, 2020, at 10:12 PM, Qian Cai wrote: > > fetch_pte() could race with increase_address_space() because it held no > lock from iommu_unmap_page(). On the CPU that runs fetch_pte() it could > see a stale domain->pt_root and a new increased domain->mode from > increase_address_space(). As the result, it could trigger invalid > accesses later on. Fix it by using a pair of smp_[w|r]mb in those > places. After further testing, the change alone is insufficient. What I am chasing right now is that the swap device will go offline after heavy memory pressure, as below. The symptom is similar to what we have in the commit, 754265bcab78 (“iommu/amd: Fix race in increase_address_space()”) Apparently, it is not possible to take the domain->lock in fetch_pte() because it could sleep. Thoughts? [ 7292.727377][ T814] smartpqi :23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0027 address=0xffd8 flags=0x0010] [ 7292.740571][ T814] smartpqi :23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0027 address=0xffd81000 flags=0x0010] [ 7292.753268][ T814] smartpqi :23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0027 address=0xffd81900 flags=0x0010] [ 7292.766078][ T814] smartpqi :23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0027 address=0xffd81d00 flags=0x0010] [ 7292.778447][ T814] smartpqi :23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0027 address=0xffd82000 flags=0x0010] [ 7292.790724][ T814] smartpqi :23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0027 address=0xffd82400 flags=0x0010] [ 7292.803195][ T814] smartpqi :23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0027 address=0xffd82800 flags=0x0010] [ 7292.815664][ T814] smartpqi :23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0027 address=0xffd82c00 flags=0x0010] [ 7292.828089][ T814] smartpqi :23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0027 address=0xffd83000 flags=0x0010] [ 7292.840468][ T814] smartpqi :23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0027 
address=0xffd83400 flags=0x0010] [ 7292.852795][ T814] AMD-Vi: Event logged [IO_PAGE_FAULT device=23:00.0 domain=0x0027 address=0xffd83800 flags=0x0010] [ 7292.864566][ T814] AMD-Vi: Event logged [IO_PAGE_FAULT device=23:00.0 domain=0x0027 address=0xffd83c00 flags=0x0010] [ 7326.583908][C8] smartpqi :23:00.0: controller is offline: status code 0x14803 [ 7326.592386][C8] smartpqi :23:00.0: controller offline [ 7326.663728][ C66] sd 0:1:0:0: [sda] tag#467 UNKNOWN(0x2003) Result: hostbyte=0x01 driverbyte=0x00 cmd_age=6s [ 7326.664321][ T2738] smartpqi :23:00.0: resetting scsi 0:1:0:0 [ 7326.673849][ C66] sd 0:1:0:0: [sda] tag#467 CDB: opcode=0x28 28 00 02 ee 2e 60 00 00 08 00 [ 7326.680118][ T2738] smartpqi :23:00.0: reset of scsi 0:1:0:0: FAILED [ 7326.688612][ C66] blk_update_request: I/O error, dev sda, sector 49163872 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0 [ 7326.695456][ T2738] sd 0:1:0:0: Device offlined - not ready after error recovery [ 7326.706632][ C66] Read-error on swap-device (254:1:45833824) [ 7326.714030][ T2738] sd 0:1:0:0: Device offlined - not ready after error recovery [ 7326.723382][T45523] sd 0:1:0:0: rejecting I/O to offline device [ 7326.727352][ T2738] sd 0:1:0:0: Device offlined - not ready after error recovery [ 7326.727379][ T2738] sd 0:1:0:0: Device offlined - not ready after error recovery [ 7326.727442][ T2738] sd 0:1:0:0: Device offlined - not ready after error recovery > > kernel BUG at drivers/iommu/amd_iommu.c:1704! 
> BUG_ON(unmapped && !is_power_of_2(unmapped)); > Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 > 07/10/2019 > RIP: 0010:amd_iommu_unmap+0x1b2/0x1d0 > Call Trace: > > __iommu_unmap+0x106/0x320 > iommu_unmap_fast+0xe/0x10 > __iommu_dma_unmap+0xdc/0x1a0 > iommu_dma_unmap_sg+0xae/0xd0 > scsi_dma_unmap+0xe7/0x150 > pqi_raid_io_complete+0x37/0x60 [smartpqi] > pqi_irq_handler+0x1fc/0x13f0 [smartpqi] > __handle_irq_event_percpu+0x78/0x4f0 > handle_irq_event_percpu+0x70/0x100 > handle_irq_event+0x5a/0x8b > handle_edge_irq+0x10c/0x370 > do_IRQ+0x9e/0x1e0 > common_interrupt+0xf/0xf > > > Signed-off-by: Qian Cai > --- > drivers/iommu/amd_iommu.c | 9 + > 1 file changed, 9 insertions(+) > > diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c > index 20cce366e951..22328a23335f 100644 > --- a/drivers/iommu/amd_iommu.c > +++ b/drivers/iommu/amd_iommu.c > @@ -1434,6 +1434,11 @@ static bool increase_address_space(struct > protection_domain *domain, > *pte = PM_LEVEL_PDE(domain->mode, > iommu_virt_to_phys(domain->pt_root)
[RFC PATCH] iommu/amd: fix a race in fetch_pte()
fetch_pte() could race with increase_address_space() because it holds no lock when called from iommu_unmap_page(). On the CPU that runs fetch_pte() it could see a stale domain->pt_root and a new increased domain->mode from increase_address_space(). As a result, it could trigger invalid accesses later on. Fix it by using a pair of smp_[w|r]mb in those places. kernel BUG at drivers/iommu/amd_iommu.c:1704! BUG_ON(unmapped && !is_power_of_2(unmapped)); Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 07/10/2019 RIP: 0010:amd_iommu_unmap+0x1b2/0x1d0 Call Trace: __iommu_unmap+0x106/0x320 iommu_unmap_fast+0xe/0x10 __iommu_dma_unmap+0xdc/0x1a0 iommu_dma_unmap_sg+0xae/0xd0 scsi_dma_unmap+0xe7/0x150 pqi_raid_io_complete+0x37/0x60 [smartpqi] pqi_irq_handler+0x1fc/0x13f0 [smartpqi] __handle_irq_event_percpu+0x78/0x4f0 handle_irq_event_percpu+0x70/0x100 handle_irq_event+0x5a/0x8b handle_edge_irq+0x10c/0x370 do_IRQ+0x9e/0x1e0 common_interrupt+0xf/0xf Signed-off-by: Qian Cai --- drivers/iommu/amd_iommu.c | 9 + 1 file changed, 9 insertions(+) diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c index 20cce366e951..22328a23335f 100644 --- a/drivers/iommu/amd_iommu.c +++ b/drivers/iommu/amd_iommu.c @@ -1434,6 +1434,11 @@ static bool increase_address_space(struct protection_domain *domain, *pte = PM_LEVEL_PDE(domain->mode, iommu_virt_to_phys(domain->pt_root)); domain->pt_root = pte; + /* +* Make sure fetch_pte() will see the new domain->pt_root before it +* snapshots domain->mode. +*/ + smp_wmb(); domain->mode+= 1; ret = true; @@ -1460,6 +1465,8 @@ static u64 *alloc_pte(struct protection_domain *domain, *updated = increase_address_space(domain, address, gfp) || *updated; level = domain->mode - 1; + /* To pair with smp_wmb() in increase_address_space(). 
*/ + smp_rmb(); pte = &domain->pt_root[PM_LEVEL_INDEX(level, address)]; address = PAGE_SIZE_ALIGN(address, page_size); end_lvl = PAGE_SIZE_LEVEL(page_size); @@ -1545,6 +1552,8 @@ static u64 *fetch_pte(struct protection_domain *domain, return NULL; level = domain->mode - 1; + /* To pair with smp_wmb() in increase_address_space(). */ + smp_rmb(); pte= &domain->pt_root[PM_LEVEL_INDEX(level, address)]; *page_size = PTE_LEVEL_PAGE_SIZE(level); -- 2.21.0 (Apple Git-122.2)
Re: [PATCH] iommu/vt-d: silence a RCU-list debugging warning
> On Mar 18, 2020, at 1:27 AM, Lu Baolu wrote: > > How about changing the commit subject to > "iommu/vt-d: Silence RCU-list debugging warning in dmar_find_atsr()"? It's just a bit long, which should be a non-issue.
[PATCH] iommu/vt-d: silence a RCU-list debugging warning
dmar_find_atsr() calls list_for_each_entry_rcu() outside of an RCU read side critical section but with dmar_global_lock held. Silence this false positive. drivers/iommu/intel-iommu.c:4504 RCU-list traversed in non-reader section!! 1 lock held by swapper/0/1: #0: 9755bee8 (dmar_global_lock){+.+.}, at: intel_iommu_init+0x1a6/0xe19 Call Trace: dump_stack+0xa4/0xfe lockdep_rcu_suspicious+0xeb/0xf5 dmar_find_atsr+0x1ab/0x1c0 dmar_parse_one_atsr+0x64/0x220 dmar_walk_remapping_entries+0x130/0x380 dmar_table_init+0x166/0x243 intel_iommu_init+0x1ab/0xe19 pci_iommu_init+0x1a/0x44 do_one_initcall+0xae/0x4d0 kernel_init_freeable+0x412/0x4c5 kernel_init+0x19/0x193 Signed-off-by: Qian Cai --- drivers/iommu/intel-iommu.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index 4be549478691..ef0a5246700e 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -4501,7 +4501,8 @@ static struct dmar_atsr_unit *dmar_find_atsr(struct acpi_dmar_atsr *atsr) struct dmar_atsr_unit *atsru; struct acpi_dmar_atsr *tmp; - list_for_each_entry_rcu(atsru, &dmar_atsr_units, list) { + list_for_each_entry_rcu(atsru, &dmar_atsr_units, list, + dmar_rcu_check()) { tmp = (struct acpi_dmar_atsr *)atsru->hdr; if (atsr->segment != tmp->segment) continue; -- 2.21.0 (Apple Git-122.2)
[PATCH] iommu/vt-d: fix RCU-list bugs in intel_iommu_init
There are several places that traverse an RCU-list without holding any lock in intel_iommu_init(). Fix them by acquiring dmar_global_lock. WARNING: suspicious RCU usage - drivers/iommu/intel-iommu.c:5216 RCU-list traversed in non-reader section!! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 no locks held by swapper/0/1. Call Trace: dump_stack+0xa0/0xea lockdep_rcu_suspicious+0x102/0x10b intel_iommu_init+0x947/0xb13 pci_iommu_init+0x26/0x62 do_one_initcall+0xfe/0x500 kernel_init_freeable+0x45a/0x4f8 kernel_init+0x11/0x139 ret_from_fork+0x3a/0x50 DMAR: Intel(R) Virtualization Technology for Directed I/O Fixes: d8190dc63886 ("iommu/vt-d: Enable DMA remapping after rmrr mapped") Signed-off-by: Qian Cai --- drivers/iommu/intel-iommu.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index 6fa6de2b6ad5..bc138ceb07bc 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -5193,6 +5193,7 @@ int __init intel_iommu_init(void) init_iommu_pm_ops(); + down_read(&dmar_global_lock); for_each_active_iommu(iommu, drhd) { iommu_device_sysfs_add(&iommu->iommu, NULL, intel_iommu_groups, @@ -5200,6 +5201,7 @@ int __init intel_iommu_init(void) iommu_device_set_ops(&iommu->iommu, &intel_iommu_ops); iommu_device_register(&iommu->iommu); } + up_read(&dmar_global_lock); bus_set_iommu(&pci_bus_type, &intel_iommu_ops); if (si_domain && !hw_pass_through) @@ -5210,7 +5212,6 @@ int __init intel_iommu_init(void) down_read(&dmar_global_lock); if (probe_acpi_namespace_devices()) pr_warn("ACPI name space devices didn't probe correctly\n"); - up_read(&dmar_global_lock); /* Finally, we enable the DMA remapping hardware. 
*/ for_each_iommu(iommu, drhd) { @@ -5219,6 +5220,8 @@ int __init intel_iommu_init(void) iommu_disable_protect_mem_regions(iommu); } + up_read(&dmar_global_lock); + pr_info("Intel(R) Virtualization Technology for Directed I/O\n"); intel_iommu_enabled = 1; -- 1.8.3.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH -next] iommu/dmar: silence RCU-list debugging warnings
Similar to the commit 02d715b4a818 ("iommu/vt-d: Fix RCU list debugging warnings"), there are several other places that call list_for_each_entry_rcu() outside of an RCU read side critical section but with dmar_global_lock held. Silence those false positives as well. drivers/iommu/intel-iommu.c:4288 RCU-list traversed in non-reader section!! 1 lock held by swapper/0/1: #0: 935892c8 (dmar_global_lock){+.+.}, at: intel_iommu_init+0x1ad/0xb97 drivers/iommu/dmar.c:366 RCU-list traversed in non-reader section!! 1 lock held by swapper/0/1: #0: 935892c8 (dmar_global_lock){+.+.}, at: intel_iommu_init+0x125/0xb97 drivers/iommu/intel-iommu.c:5057 RCU-list traversed in non-reader section!! 1 lock held by swapper/0/1: #0: a71892c8 (dmar_global_lock){}, at: intel_iommu_init+0x61a/0xb13 Signed-off-by: Qian Cai --- drivers/iommu/dmar.c | 3 ++- include/linux/dmar.h | 6 -- 2 files changed, 6 insertions(+), 3 deletions(-) diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c index 071bb42bbbc5..7b16c4db40b4 100644 --- a/drivers/iommu/dmar.c +++ b/drivers/iommu/dmar.c @@ -363,7 +363,8 @@ static int dmar_pci_bus_notifier(struct notifier_block *nb, { struct dmar_drhd_unit *dmaru; - list_for_each_entry_rcu(dmaru, &dmar_drhd_units, list) + list_for_each_entry_rcu(dmaru, &dmar_drhd_units, list, + dmar_rcu_check()) if (dmaru->segment == drhd->segment && dmaru->reg_base_addr == drhd->address) return dmaru; diff --git a/include/linux/dmar.h b/include/linux/dmar.h index 712be8bc6a7c..d7bf029df737 100644 --- a/include/linux/dmar.h +++ b/include/linux/dmar.h @@ -74,11 +74,13 @@ struct dmar_pci_notify_info { dmar_rcu_check()) #define for_each_active_drhd_unit(drhd) \ - list_for_each_entry_rcu(drhd, &dmar_drhd_units, list) \ + list_for_each_entry_rcu(drhd, &dmar_drhd_units, list, \ + dmar_rcu_check()) \ if (drhd->ignored) {} else #define for_each_active_iommu(i, drhd) \ - list_for_each_entry_rcu(drhd, &dmar_drhd_units, list) \ + list_for_each_entry_rcu(drhd, &dmar_drhd_units, list, \ + 
dmar_rcu_check()) \ if (i=drhd->iommu, drhd->ignored) {} else #define for_each_iommu(i, drhd) \ -- 1.8.3.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] iommu/dma: fix variable 'cookie' set but not used
> On Jan 6, 2020, at 1:19 PM, Robin Murphy wrote: > > Fair enough... I guess this is a W=1 thing? Yes.
[PATCH] iommu/dma: fix variable 'cookie' set but not used
The commit c18647900ec8 ("iommu/dma: Relax locking in iommu_dma_prepare_msi()") introduced a compilation warning, drivers/iommu/dma-iommu.c: In function 'iommu_dma_prepare_msi': drivers/iommu/dma-iommu.c:1206:27: warning: variable 'cookie' set but not used [-Wunused-but-set-variable] struct iommu_dma_cookie *cookie; ^~ Fixes: c18647900ec8 ("iommu/dma: Relax locking in iommu_dma_prepare_msi()") Signed-off-by: Qian Cai --- drivers/iommu/dma-iommu.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c index c363294b3bb9..a2e96a5fd9a7 100644 --- a/drivers/iommu/dma-iommu.c +++ b/drivers/iommu/dma-iommu.c @@ -1203,7 +1203,6 @@ int iommu_dma_prepare_msi(struct msi_desc *desc, phys_addr_t msi_addr) { struct device *dev = msi_desc_to_dev(desc); struct iommu_domain *domain = iommu_get_domain_for_dev(dev); - struct iommu_dma_cookie *cookie; struct iommu_dma_msi_page *msi_page; static DEFINE_MUTEX(msi_prepare_lock); /* see below */ @@ -1212,8 +1211,6 @@ int iommu_dma_prepare_msi(struct msi_desc *desc, phys_addr_t msi_addr) return 0; } - cookie = domain->iova_cookie; - /* * In fact the whole prepare operation should already be serialised by * irq_domain_mutex further up the callchain, but that's pretty subtle -- 2.21.0 (Apple Git-122.2) ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v3] iommu: fix KASAN use-after-free in iommu_insert_resv_region
> On Nov 26, 2019, at 5:27 AM, Eric Auger wrote: > > In case the new region gets merged into another one, the nr > list node is freed. Checking its type while completing the > merge algorithm leads to a use-after-free. Use new->type > instead. > > Fixes: 4dbd258ff63e ("iommu: Revisit iommu_insert_resv_region() > implementation") > Signed-off-by: Eric Auger > Reported-by: Qian Cai > Cc: Stable #v5.3+ Looks like Joerg is away for a few weeks. Could Andrew or Linus pick up this use-after-free? > > --- > > v2 -> v3: > - directly use new->type > > v1 -> v2: > - remove spurious new line > --- > drivers/iommu/iommu.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c > index d658c7c6a2ab..285ad4a4c7f2 100644 > --- a/drivers/iommu/iommu.c > +++ b/drivers/iommu/iommu.c > @@ -313,7 +313,7 @@ int iommu_insert_resv_region(struct iommu_resv_region > *new, > phys_addr_t top_end, iter_end = iter->start + iter->length - 1; > > /* no merge needed on elements of different types than @nr */ > - if (iter->type != nr->type) { > + if (iter->type != new->type) { > list_move_tail(&iter->list, &stack); > continue; > } > -- > 2.20.1 > ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [Patch v3 0/3] iommu: reduce spinlock contention on fast path
> On Dec 6, 2019, at 4:38 PM, Cong Wang wrote: > > This patchset contains three small optimizations for the global spinlock > contention in IOVA cache. Our memcache perf test shows this reduced its > p999 latency down by 45% on AMD when IOMMU is enabled. Can you at least have a changelog compared to previous versions? > > Cong Wang (3): > iommu: avoid unnecessary magazine allocations > iommu: optimize iova_magazine_free_pfns() > iommu: avoid taking iova_rbtree_lock twice > --- > drivers/iommu/iova.c | 75 ++-- > 1 file changed, 45 insertions(+), 30 deletions(-) > > -- > 2.21.0
Re: [PATCH 1/3] iommu: match the original algorithm
> On Nov 27, 2019, at 1:01 PM, John Garry wrote: > > I haven't gone into the details, but this patch alone is giving this: > > root@(none)$ [ 123.857024] kmemleak: 8 new suspected memory leaks (see > /sys/kernel/debug/kmemleak) > > root@(none)$ cat /sys/kernel/debug/kmemleak > unreferenced object 0x002339843000 (size 2048): > comm "swapper/0", pid 1, jiffies 4294898165 (age 122.688s) > hex dump (first 32 bytes): >00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > backtrace: >[<1d2710bf>] kmem_cache_alloc+0x188/0x260 >[] init_iova_domain+0x1e8/0x2a8 >[<2646fc92>] iommu_setup_dma_ops+0x200/0x710 >[ ] arch_setup_dma_ops+0x80/0x128 >[<994e1e43>] acpi_dma_configure+0x11c/0x140 >[ ] pci_dma_configure+0xe0/0x108 >[ ] really_probe+0x210/0x548 >[<87884b1b>] driver_probe_device+0x7c/0x148 >[<10af2936>] device_driver_attach+0x94/0xa0 >[ ] __driver_attach+0xa4/0x110 >[ ] bus_for_each_dev+0xe8/0x158 >[ ] driver_attach+0x30/0x40 >[<3cf39ba8>] bus_add_driver+0x234/0x2f0 >[<43830a45>] driver_register+0xbc/0x1d0 >[ ] __pci_register_driver+0xb0/0xc8 >[ ] sas_v3_pci_driver_init+0x20/0x28 > unreferenced object 0x002339844000 (size 2048): > comm "swapper/0", pid 1, jiffies 4294898165 (age 122.688s) > > [snip] > > And I don't feel like continuing until it's resolved Thanks for talking a hit by this before me. It is frustrating that people tend not to test their patches properly with things like kmemleak. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v4] iommu/iova: silence warnings under memory pressure
:34403 active_file:2285 inactive_file:1838 isolated_file:0 unevictable:0 dirty:1 writeback:5 unstable:0 slab_reclaimable:13972 slab_unreclaimable:453879 mapped:2380 shmem:154 pagetables:6948 bounce:0 free:19133 free_pcp:7363 free_cma:0 Signed-off-by: Qian Cai --- v4: use dev_err_once() instead as mentioned above. v3: insert a "\n" in dev_err_ratelimited() per Joe. v2: use dev_err_ratelimited() and improve the commit messages. drivers/iommu/intel-iommu.c | 3 ++- drivers/iommu/iova.c| 2 +- 2 files changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index 0c8d81f56a30..8c944bbbed8a 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -3406,7 +3406,8 @@ static unsigned long intel_alloc_iova(struct device *dev, iova_pfn = alloc_iova_fast(&domain->iovad, nrpages, IOVA_PFN(dma_mask), true); if (unlikely(!iova_pfn)) { - dev_err(dev, "Allocating %ld-page iova failed", nrpages); + dev_err_once(dev, "Allocating %ld-page iova failed\n", +nrpages); return 0; } diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c index 41c605b0058f..aa1a56aaa5ee 100644 --- a/drivers/iommu/iova.c +++ b/drivers/iommu/iova.c @@ -233,7 +233,7 @@ static int __alloc_and_insert_iova_range(struct iova_domain *iovad, struct iova *alloc_iova_mem(void) { - return kmem_cache_alloc(iova_cache, GFP_ATOMIC); + return kmem_cache_alloc(iova_cache, GFP_ATOMIC | __GFP_NOWARN); } EXPORT_SYMBOL(alloc_iova_mem); -- 1.8.3.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v3] iommu/iova: silence warnings under memory pressure
lated_file:0 unevictable:0 dirty:1 writeback:5 unstable:0 slab_reclaimable:13972 slab_unreclaimable:453879 mapped:2380 shmem:154 pagetables:6948 bounce:0 free:19133 free_pcp:7363 free_cma:0 Signed-off-by: Qian Cai --- v3: insert a "\n" in dev_err_ratelimited() per Joe. v2: use dev_err_ratelimited() and improve the commit messages. drivers/iommu/intel-iommu.c | 3 ++- drivers/iommu/iova.c| 2 +- 2 files changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index 0c8d81f56a30..855a7bb272d8 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -3406,7 +3406,8 @@ static unsigned long intel_alloc_iova(struct device *dev, iova_pfn = alloc_iova_fast(&domain->iovad, nrpages, IOVA_PFN(dma_mask), true); if (unlikely(!iova_pfn)) { - dev_err(dev, "Allocating %ld-page iova failed", nrpages); + dev_err_ratelimited(dev, "Allocating %ld-page iova failed\n", + nrpages); return 0; } diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c index 41c605b0058f..aa1a56aaa5ee 100644 --- a/drivers/iommu/iova.c +++ b/drivers/iommu/iova.c @@ -233,7 +233,7 @@ static int __alloc_and_insert_iova_range(struct iova_domain *iovad, struct iova *alloc_iova_mem(void) { - return kmem_cache_alloc(iova_cache, GFP_ATOMIC); + return kmem_cache_alloc(iova_cache, GFP_ATOMIC | __GFP_NOWARN); } EXPORT_SYMBOL(alloc_iova_mem); -- 1.8.3.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2] iommu/iova: silence warnings under memory pressure
On Fri, 2019-11-22 at 08:28 -0800, Joe Perches wrote: > On Fri, 2019-11-22 at 09:59 -0500, Qian Cai wrote: > > On Thu, 2019-11-21 at 20:37 -0800, Joe Perches wrote: > > > On Thu, 2019-11-21 at 21:55 -0500, Qian Cai wrote: > > > > When running heavy memory pressure workloads, this 5+ old system is > > > > throwing endless warnings below because disk IO is too slow to recover > > > > from swapping. Since the volume from alloc_iova_fast() could be large, > > > > once it calls printk(), it will trigger disk IO (writing to the log > > > > files) and pending softirqs which could cause an infinite loop and make > > > > no progress for days by the ongoimng memory reclaim. This is the counter > > > > part for Intel where the AMD part has already been merged. See the > > > > commit 3d708895325b ("iommu/amd: Silence warnings under memory > > > > pressure"). Since the allocation failure will be reported in > > > > intel_alloc_iova(), so just call printk_ratelimted() there and silence > > > > the one in alloc_iova_mem() to avoid the expensive warn_alloc(). > > > > > > [] > > > > v2: use dev_err_ratelimited() and improve the commit messages. > > > > > > [] > > > > diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c > > > > > > [] > > > > @@ -3401,7 +3401,8 @@ static unsigned long intel_alloc_iova(struct > > > > device *dev, > > > > iova_pfn = alloc_iova_fast(&domain->iovad, nrpages, > > > >IOVA_PFN(dma_mask), true); > > > > if (unlikely(!iova_pfn)) { > > > > - dev_err(dev, "Allocating %ld-page iova failed", > > > > nrpages); > > > > + dev_err_ratelimited(dev, "Allocating %ld-page iova > > > > failed", > > > > + nrpages); > > > > > > Trivia: > > > > > > This should really have a \n termination on the format string > > > > > > dev_err_ratelimited(dev, "Allocating %ld-page iova failed\n", > > > > > > > > > > Why do you say so? It is right now printing with a newline added anyway. 
> > > > hpsa :03:00.0: DMAR: Allocating 1-page iova failed > > If another process uses pr_cont at the same time, > it can be interleaved. I lean towards fixing that in a separate patch if ever needed, as the original dev_err() has no "\n" enclosed either.
"Revisit iommu_insert_resv_region() implementation" causes use-after-free
Read files under /sys/kernel/iommu_groups/ triggers an use-after-free. Reverted the commit 4dbd258ff63e ("iommu: Revisit iommu_insert_resv_region() implementation") fixed the issue. /* no merge needed on elements of different types than @nr */ if (iter->type != nr->type) { list_move_tail(&iter->list, &stack); continue; [ 160.156964][ T3100] BUG: KASAN: use-after-free in iommu_insert_resv_region+0x34b/0x520 [ 160.197758][ T3100] Read of size 4 at addr 8887aba78464 by task cat/3100 [ 160.230645][ T3100] [ 160.240907][ T3100] CPU: 14 PID: 3100 Comm: cat Not tainted 5.4.0-rc8-next- 20191122+ #11 [ 160.278671][ T3100] Hardware name: HP ProLiant XL420 Gen9/ProLiant XL420 Gen9, BIOS U19 12/27/2015 [ 160.320589][ T3100] Call Trace: [ 160.335229][ T3100] dump_stack+0xa0/0xea [ 160.354011][ T3100] print_address_description.constprop.5.cold.7+0x9/0x384 [ 160.386569][ T3100] __kasan_report.cold.8+0x7a/0xc0 [ 160.409811][ T3100] ? iommu_insert_resv_region+0x34b/0x520 [ 160.435668][ T3100] kasan_report+0x12/0x20 [ 160.455387][ T3100] __asan_load4+0x95/0xa0 [ 160.474808][ T3100] iommu_insert_resv_region+0x34b/0x520 [ 160.500228][ T3100] ? iommu_bus_notifier+0xe0/0xe0 [ 160.522904][ T3100] ? intel_iommu_get_resv_regions+0x348/0x400 [ 160.550461][ T3100] iommu_get_group_resv_regions+0x16d/0x2f0 [ 160.577611][ T3100] ? iommu_insert_resv_region+0x520/0x520 [ 160.603756][ T3100] ? register_lock_class+0x940/0x940 [ 160.628265][ T3100] iommu_group_show_resv_regions+0x8d/0x1f0 [ 160.655370][ T3100] ? iommu_get_group_resv_regions+0x2f0/0x2f0 [ 160.684168][ T3100] iommu_group_attr_show+0x34/0x50 [ 160.708395][ T3100] sysfs_kf_seq_show+0x11c/0x220 [ 160.731758][ T3100] ? 
iommu_default_passthrough+0x20/0x20 [ 160.756898][ T3100] kernfs_seq_show+0xa4/0xb0 [ 160.777097][ T3100] seq_read+0x27e/0x710 [ 160.795195][ T3100] kernfs_fop_read+0x7d/0x2c0 [ 160.815349][ T3100] __vfs_read+0x50/0xa0 [ 160.834154][ T3100] vfs_read+0xcb/0x1e0 [ 160.852332][ T3100] ksys_read+0xc6/0x160 [ 160.871028][ T3100] ? kernel_write+0xc0/0xc0 [ 160.891307][ T3100] ? do_syscall_64+0x79/0xaec [ 160.912446][ T3100] ? do_syscall_64+0x79/0xaec [ 160.933640][ T3100] __x64_sys_read+0x43/0x50 [ 160.953957][ T3100] do_syscall_64+0xcc/0xaec [ 160.974322][ T3100] ? trace_hardirqs_on_thunk+0x1a/0x1c [ 160.999130][ T3100] ? syscall_return_slowpath+0x580/0x580 [ 161.024753][ T3100] ? entry_SYSCALL_64_after_hwframe+0x3e/0xbe [ 161.052416][ T3100] ? trace_hardirqs_off_caller+0x3a/0x150 [ 161.078400][ T3100] ? trace_hardirqs_off_thunk+0x1a/0x1c [ 161.103711][ T3100] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 161.130793][ T3100] RIP: 0033:0x7f33e0d89d75 [ 161.150732][ T3100] Code: fe ff ff 50 48 8d 3d 4a dc 09 00 e8 25 0e 02 00 0f 1f 44 00 00 f3 0f 1e fa 48 8d 05 a5 59 2d 00 8b 00 85 c0 75 0f 31 c0 0f 05 <48> 3d 00 f0 ff ff 77 53 c3 66 90 41 54 49 89 d4 55 48 89 f5 53 89 [ 161.245503][ T3100] RSP: 002b:7fff88f0db88 EFLAGS: 0246 ORIG_RAX: [ 161.284547][ T3100] RAX: ffda RBX: 0002 RCX: 7f33e0d89d75 [ 161.321123][ T3100] RDX: 0002 RSI: 7f33e1201000 RDI: 0003 [ 161.357617][ T3100] RBP: 7f33e1201000 R08: R09: [ 161.394173][ T3100] R10: 0022 R11: 0246 R12: 7f33e1201000 [ 161.430736][ T3100] R13: 0003 R14: 0fff R15: 0002 [ 161.467337][ T3100] [ 161.477529][ T3100] Allocated by task 3100: [ 161.497133][ T3100] save_stack+0x21/0x90 [ 161.515777][ T3100] __kasan_kmalloc.constprop.13+0xc1/0xd0 [ 161.541743][ T3100] kasan_kmalloc+0x9/0x10 [ 161.561330][ T3100] kmem_cache_alloc_trace+0x1f8/0x470 [ 161.585949][ T3100] iommu_insert_resv_region+0xeb/0x520 [ 161.610876][ T3100] iommu_get_group_resv_regions+0x16d/0x2f0 [ 161.638318][ T3100] iommu_group_show_resv_regions+0x8d/0x1f0 [ 
161.665322][ T3100] iommu_group_attr_show+0x34/0x50 [ 161.688526][ T3100] sysfs_kf_seq_show+0x11c/0x220 [ 161.711992][ T3100] kernfs_seq_show+0xa4/0xb0 [ 161.734252][ T3100] seq_read+0x27e/0x710 [ 161.754412][ T3100] kernfs_fop_read+0x7d/0x2c0 [ 161.775493][ T3100] __vfs_read+0x50/0xa0 [ 161.794328][ T3100] vfs_read+0xcb/0x1e0 [ 161.812559][ T3100] ksys_read+0xc6/0x160 [ 161.831554][ T3100] __x64_sys_read+0x43/0x50 [ 161.851772][ T3100] do_syscall_64+0xcc/0xaec [ 161.872098][ T3100] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 161.898919][ T3100] [ 161.909113][ T3100] Freed by task 3100: [ 161.927070][ T3100] save_stack+0x21/0x90 [ 161.945711][ T3100] __kasan_slab_free+0x11c/0x170 [ 161.968112][ T3100] kasan_slab_free+0xe/0x10 [ 161.988601][ T3100] slab_free_freelist_hook+0x5f/0x1d0 [ 162.012918][ T3100] kfree+0xe9/0x410 [ 162.029454][ T3100] iommu_insert_resv_region+0x47d/0x520 [ 162.053701][ T3100] iommu_get_group_resv_regions+0x16d/0x2f0 [ 162.079671][ T3100] iommu_group_show
Re: [PATCH v2] iommu/iova: silence warnings under memory pressure
On Thu, 2019-11-21 at 20:37 -0800, Joe Perches wrote: > On Thu, 2019-11-21 at 21:55 -0500, Qian Cai wrote: > > When running heavy memory pressure workloads, this 5+ old system is > > throwing endless warnings below because disk IO is too slow to recover > > from swapping. Since the volume from alloc_iova_fast() could be large, > > once it calls printk(), it will trigger disk IO (writing to the log > > files) and pending softirqs which could cause an infinite loop and make > > no progress for days by the ongoimng memory reclaim. This is the counter > > part for Intel where the AMD part has already been merged. See the > > commit 3d708895325b ("iommu/amd: Silence warnings under memory > > pressure"). Since the allocation failure will be reported in > > intel_alloc_iova(), so just call printk_ratelimted() there and silence > > the one in alloc_iova_mem() to avoid the expensive warn_alloc(). > > [] > > v2: use dev_err_ratelimited() and improve the commit messages. > > [] > > diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c > > [] > > @@ -3401,7 +3401,8 @@ static unsigned long intel_alloc_iova(struct device > > *dev, > > iova_pfn = alloc_iova_fast(&domain->iovad, nrpages, > >IOVA_PFN(dma_mask), true); > > if (unlikely(!iova_pfn)) { > > - dev_err(dev, "Allocating %ld-page iova failed", nrpages); > > + dev_err_ratelimited(dev, "Allocating %ld-page iova failed", > > + nrpages); > > Trivia: > > This should really have a \n termination on the format string > > dev_err_ratelimited(dev, "Allocating %ld-page iova failed\n", > > Why do you say so? It is right now printing with a newline added anyway. 
hpsa :03:00.0: DMAR: Allocating 1-page iova failed hpsa :03:00.0: DMAR: Allocating 1-page iova failed hpsa :03:00.0: DMAR: Allocating 1-page iova failed hpsa :03:00.0: DMAR: Allocating 1-page iova failed hpsa :03:00.0: DMAR: Allocating 1-page iova failed hpsa :03:00.0: DMAR: Allocating 1-page iova failed hpsa :03:00.0: DMAR: Allocating 1-page iova failed hpsa :03:00.0: DMAR: Allocating 1-page iova failed ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v2] iommu/iova: silence warnings under memory pressure
lated_file:0 unevictable:0 dirty:1 writeback:5 unstable:0 slab_reclaimable:13972 slab_unreclaimable:453879 mapped:2380 shmem:154 pagetables:6948 bounce:0 free:19133 free_pcp:7363 free_cma:0 Signed-off-by: Qian Cai --- v2: use dev_err_ratelimited() and improve the commit messages. drivers/iommu/intel-iommu.c | 3 ++- drivers/iommu/iova.c| 2 +- 2 files changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index 6db6d969e31c..c01a7bc99385 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -3401,7 +3401,8 @@ static unsigned long intel_alloc_iova(struct device *dev, iova_pfn = alloc_iova_fast(&domain->iovad, nrpages, IOVA_PFN(dma_mask), true); if (unlikely(!iova_pfn)) { - dev_err(dev, "Allocating %ld-page iova failed", nrpages); + dev_err_ratelimited(dev, "Allocating %ld-page iova failed", + nrpages); return 0; } diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c index 41c605b0058f..aa1a56aaa5ee 100644 --- a/drivers/iommu/iova.c +++ b/drivers/iommu/iova.c @@ -233,7 +233,7 @@ static DEFINE_MUTEX(iova_cache_mutex); struct iova *alloc_iova_mem(void) { - return kmem_cache_alloc(iova_cache, GFP_ATOMIC); + return kmem_cache_alloc(iova_cache, GFP_ATOMIC | __GFP_NOWARN); } EXPORT_SYMBOL(alloc_iova_mem); -- 2.21.0 (Apple Git-122.2) ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2 1/1] iommu/vt-d: Add Kconfig option to enable/disable scalable mode
> On Nov 11, 2019, at 12:23 AM, Lu Baolu wrote: > > The scalable mode is defined in VT-d 3.0. The scalable mode capability > could be checked by reading /sys/devices/virtual/iommu/dmar*/intel-iommu/ecap. > It's currently not friendly for reading. You need to decode > it according to the spec. This looks like perfect information to put in the Kconfig description.
Re: [PATCH v2 1/1] iommu/vt-d: Add Kconfig option to enable/disable scalable mode
> On Nov 8, 2019, at 10:43 PM, Lu Baolu wrote: > > +config INTEL_IOMMU_SCALABLE_MODE_DEFAULT_ON > +prompt "Enable Intel IOMMU scalable mode by default" > +depends on INTEL_IOMMU > +help > + Selecting this option will enable the scalable mode if > + hardware presents the capability. If this option is not > + selected, scalable mode support could also be enabled > + by passing intel_iommu=sm_on to the kernel. > + Does it also make sense to mention which hardware presents this capability or how to check it?
Re: [PATCH v2 1/1] iommu/vt-d: Add Kconfig option to enable/disable scalable mode
> On Nov 10, 2019, at 8:30 PM, Lu Baolu wrote: > > How about "pasid based multiple stages DMA translation"? It is better, but I am still not sure how developers would decide whether to select it. Ideally, should it mention the pros and cons of this? At minimum, there should be a line saying “if not sure what this is, select N”?
Re: [PATCH v2 1/1] iommu/vt-d: Add Kconfig option to enable/disable scalable mode
> On Nov 8, 2019, at 10:40 PM, Lu Baolu wrote: > > This adds Kconfig option INTEL_IOMMU_SCALABLE_MODE_DEFAULT_ON > to make it easier for distributions to enable or disable the > Intel IOMMU scalable mode by default during kernel build. > > Signed-off-by: Lu Baolu > --- > drivers/iommu/Kconfig | 9 + > drivers/iommu/intel-iommu.c | 7 ++- > 2 files changed, 15 insertions(+), 1 deletion(-) > > diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig > index e3842eabcfdd..fbdf3fd291d9 100644 > --- a/drivers/iommu/Kconfig > +++ b/drivers/iommu/Kconfig > @@ -242,6 +242,15 @@ config INTEL_IOMMU_FLOPPY_WA > workaround will setup a 1:1 mapping for the first > 16MiB to make floppy (an ISA device) work. > > +config INTEL_IOMMU_SCALABLE_MODE_DEFAULT_ON > + prompt "Enable Intel IOMMU scalable mode by default" > + depends on INTEL_IOMMU > + help > + Selecting this option will enable the scalable mode if > + hardware presents the capability. If this option is not > + selected, scalable mode support could also be enabled > + by passing intel_iommu=sm_on to the kernel. Maybe a sentence or two to describe what the scalable mode is in layman's terms could be useful, so developers don’t need to search around for the Kconfig selection? 
> + > config IRQ_REMAP > bool "Support for Interrupt Remapping" > depends on X86_64 && X86_IO_APIC && PCI_MSI && ACPI > diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c > index 6db6d969e31c..6051fe790c61 100644 > --- a/drivers/iommu/intel-iommu.c > +++ b/drivers/iommu/intel-iommu.c > @@ -355,9 +355,14 @@ static phys_addr_t intel_iommu_iova_to_phys(struct > iommu_domain *domain, > int dmar_disabled = 0; > #else > int dmar_disabled = 1; > -#endif /*CONFIG_INTEL_IOMMU_DEFAULT_ON*/ > +#endif /* CONFIG_INTEL_IOMMU_DEFAULT_ON */ > > +#ifdef INTEL_IOMMU_SCALABLE_MODE_DEFAULT_ON > +int intel_iommu_sm = 1; > +#else > int intel_iommu_sm; > +#endif /* INTEL_IOMMU_SCALABLE_MODE_DEFAULT_ON */ > + > int intel_iommu_enabled = 0; > EXPORT_SYMBOL_GPL(intel_iommu_enabled); > > -- > 2.17.1 > > ___ > iommu mailing list > iommu@lists.linux-foundation.org > https://lists.linuxfoundation.org/mailman/listinfo/iommu ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: "Convert the AMD iommu driver to the dma-iommu api" is buggy
On Wed, 2019-10-16 at 17:44 +0200, Joerg Roedel wrote: > On Wed, Oct 16, 2019 at 10:59:42AM -0400, Qian Cai wrote: > > BTW, the previous x86 warning was from only reverted one patch "iommu: Add > > gfp > > parameter to iommu_ops::map" where proved to be insufficient. Now, pasting > > the > > correct warning. > > Can you please test this small fix: This works fine so far. > > diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c > index 78a2cca3ac5c..e7a4464e8594 100644 > --- a/drivers/iommu/amd_iommu.c > +++ b/drivers/iommu/amd_iommu.c > @@ -2562,7 +2562,7 @@ static int amd_iommu_map(struct iommu_domain *dom, > unsigned long iova, > if (iommu_prot & IOMMU_WRITE) > prot |= IOMMU_PROT_IW; > > - ret = iommu_map_page(domain, iova, paddr, page_size, prot, GFP_KERNEL); > + ret = iommu_map_page(domain, iova, paddr, page_size, prot, gfp); > > domain_flush_np_cache(domain, iova, page_size); > > ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH -next] iommu/amd: fix a warning in increase_address_space
> On Oct 16, 2019, at 6:59 PM, Jerry Snitselaar wrote: > > I guess the mode level 6 check is really for other potential callers of > increase_address_space; none exist at the moment, and the condition > of the while loop in alloc_pte should fail if the mode level is 6. Because there is no locking around iommu_map_page(), if there are several concurrent callers of it for the same domain, could it be that it silently corrupts data due to invalid access?
[PATCH -next] iommu/amd: fix a warning in increase_address_space
After the commit 754265bcab78 ("iommu/amd: Fix race in increase_address_space()"), it could still possible trigger a race condition under some heavy memory pressure below. The race to trigger a warning is, CPU0: CPU1: in alloc_pte(): in increase_address_space(): while (address > PM_LEVEL_SIZE(domain->mode)) [1] spin_lock_irqsave(&domain->lock domain->mode+= 1; spin_unlock_irqrestore(&domain->lock in increase_address_space(): spin_lock_irqsave(&domain->lock if (WARN_ON_ONCE(domain->mode == PAGE_MODE_6_LEVEL)) [1] domain->mode = 5 It is unclear the triggering of the warning is the root cause of the smartpqi offline yet, but let's fix it first by lifting the locking. WARNING: CPU: 57 PID: 124314 at drivers/iommu/amd_iommu.c:1474 iommu_map_page+0x718/0x7e0 smartpqi :23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x address=0xffec flags=0x0010] smartpqi :23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x address=0xffec1000 flags=0x0010] CPU: 57 PID: 124314 Comm: oom01 Tainted: G O Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 07/10/2019 RIP: 0010:iommu_map_page+0x718/0x7e0 Code: 88 a5 70 ff ff ff e9 5d fa ff ff 48 8b b5 70 ff ff ff 4c 89 ef e8 08 32 2f 00 41 80 fc 01 0f 87 b7 3d 00 00 41 83 e4 01 eb be <0f> 0b 48 8b b5 70 ff ff ff 4c 89 ef e8 e7 31 2f 00 eb dd 0f 0b 48 RSP: 0018:888da4816cb8 EFLAGS: 00010046 RAX: RBX: 8885fe689000 RCX: 96f4a6c4 RDX: 0007 RSI: dc00 RDI: 8885fe689124 RBP: 888da4816da8 R08: ed10bfcd120e R09: ed10bfcd120e R10: ed10bfcd120d R11: 8885fe68906b R12: smartpqi :23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x address=0xffec1a00 flags=0x0010] R13: 8885fe689068 R14: 8885fe689124 R15: smartpqi :23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x address=0xffec1e00 flags=0x0010] FS: 7f29722ba700() GS:88902f88() knlGS: CS: 0010 DS: ES: CR0: 80050033 CR2: 7f27f82d8000 CR3: 00102ed9c000 CR4: 003406e0 Call Trace: smartpqi :23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x address=0xffec2000 flags=0x0010] 
map_sg+0x1ce/0x2f0 smartpqi :23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x address=0xffec2400 flags=0x0010] scsi_dma_map+0xd7/0x160 pqi_raid_submit_scsi_cmd_with_io_request+0x1b8/0x420 [smartpqi] smartpqi :23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x address=0xffec2800 flags=0x0010] pqi_scsi_queue_command+0x8ab/0xe00 [smartpqi] smartpqi :23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x address=0xffec2c00 flags=0x0010] scsi_queue_rq+0xd19/0x1360 smartpqi :23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x address=0xffec3000 flags=0x0010] __blk_mq_try_issue_directly+0x295/0x3f0 smartpqi :23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x address=0xffec3400 flags=0x0010] AMD-Vi: Event logged [IO_PAGE_FAULT device=23:00.0 domain=0x address=0xffec3800 flags=0x0010] blk_mq_request_issue_directly+0xb5/0x100 AMD-Vi: Event logged [IO_PAGE_FAULT device=23:00.0 domain=0x address=0xffec3c00 flags=0x0010] blk_mq_try_issue_list_directly+0xa9/0x160 blk_mq_sched_insert_requests+0x228/0x380 blk_mq_flush_plug_list+0x448/0x7e0 blk_flush_plug_list+0x1eb/0x230 blk_finish_plug+0x43/0x5d shrink_node_memcg+0x9c5/0x1550 smartpqi :23:00.0: controller is offline: status code 0x14803 smartpqi :23:00.0: controller offline Fixes: 754265bcab78 ("iommu/amd: Fix race in increase_address_space()") Signed-off-by: Qian Cai --- BTW, Joerg, this line from the commit "iommu/amd: Remove domain->updated" looks suspicious. Not sure what the purpose of it. 
	*updated = increase_address_space(domain, gfp) || *updated;

 drivers/iommu/amd_iommu.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 2369b8af81f3..a5754068aa29 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -1465,12 +1465,9 @@ static void free_pagetable(struct protection_domain *domain)
 static bool increase_address_space(struct protection_domain *domain, gfp_t gfp)
 {
-	unsigned long flags;
 	bool ret = false;
 	u64 *pte;
 
-	spin_lock_irqsave(&domain->lock, flags);
-
 	if (WARN_ON_ONCE(domain->mode == PAGE_MODE_6_LEVEL))
 		/* address space already 64 bit large */
 		goto out;
@@ -1487,8 +1484,6 @@ static bool increase_address_space(struct protection_domai
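The race boils down to checking domain->mode outside the lock that protects it: one CPU samples the mode, another CPU grows the table, and the first CPU then grows it again past the limit. It can be reproduced in miniature with plain pthreads. As long as the check and the bump happen under the same lock (whether taken inside increase_address_space() or by its caller, as this patch arranges), concurrent growers cannot push the mode past the maximum. The struct, constants, and function names below are userspace stand-ins, not the kernel API:

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define PAGE_MODE_6_LEVEL 6

struct protection_domain {
	pthread_mutex_t lock;	/* stand-in for domain->lock */
	int mode;		/* page-table levels, may grow up to 6 */
};

/* Fixed version: the mode check is done while holding the lock, so a
 * concurrent increase cannot slip in between the check and the bump. */
static bool increase_address_space(struct protection_domain *domain)
{
	bool ret = false;

	pthread_mutex_lock(&domain->lock);
	if (domain->mode == PAGE_MODE_6_LEVEL)
		goto out;	/* address space already 64 bit large */
	domain->mode += 1;
	ret = true;
out:
	pthread_mutex_unlock(&domain->lock);
	return ret;
}

static void *grower(void *arg)
{
	for (int i = 0; i < 1000; i++)
		increase_address_space(arg);
	return NULL;
}

/* Hammer one domain from several threads and report the final mode. */
static int run_demo(void)
{
	struct protection_domain dom = {
		.lock = PTHREAD_MUTEX_INITIALIZER, .mode = 1 };
	pthread_t t[4];

	for (int i = 0; i < 4; i++)
		pthread_create(&t[i], NULL, grower, &dom);
	for (int i = 0; i < 4; i++)
		pthread_join(t[i], NULL);
	return dom.mode;
}
```

With the check moved back outside the lock, the same harness can finish with dom.mode above 6, which is exactly the WARN_ON_ONCE condition in the splat above.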
Re: "Convert the AMD iommu driver to the dma-iommu api" is buggy
On Wed, 2019-10-16 at 18:03 +0200, Joerg Roedel wrote: > On Wed, Oct 16, 2019 at 11:53:33AM -0400, Qian Cai wrote: > > On Wed, 2019-10-16 at 17:31 +0200, Joerg Roedel wrote: > > The x86 one might just be a mistake. > > > > diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c > > index ad05484d0c80..63c4b894751d 100644 > > --- a/drivers/iommu/amd_iommu.c > > +++ b/drivers/iommu/amd_iommu.c > > @@ -2542,7 +2542,7 @@ static int amd_iommu_map(struct iommu_domain *dom, > > unsigned long iova, > > if (iommu_prot & IOMMU_WRITE) > > prot |= IOMMU_PROT_IW; > > > > - ret = iommu_map_page(domain, iova, paddr, page_size, prot, > > GFP_KERNEL); > > + ret = iommu_map_page(domain, iova, paddr, page_size, prot, gfp); > > Yeah, that is a bug, I spotted that too. > > > @@ -1185,7 +1185,7 @@ static struct iommu_dma_msi_page > > *iommu_dma_get_msi_page(struct device *dev, > > if (!iova) > > goto out_free_page; > > > > - if (iommu_map(domain, iova, msi_addr, size, prot)) > > + if (iommu_map_atomic(domain, iova, msi_addr, size, prot)) > > goto out_free_iova; > > Not so sure this is a bug, this code is only about setting up MSIs on > ARM. It probably doesn't need to be atomic. The patch "iommu: Add gfp parameter to iommu_ops::map" does this. iommu_map() could be called from an atomic context, as shown in the arm64 call traces:

+int iommu_map(struct iommu_domain *domain, unsigned long iova,
+	      phys_addr_t paddr, size_t size, int prot)
+{
+	might_sleep();
+	return __iommu_map(domain, iova, paddr, size, prot, GFP_KERNEL);
+}
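The split under discussion — a sleeping iommu_map() that asserts process context up front, and an iommu_map_atomic() variant for callers such as the MSI setup path that run under a spinlock — can be sketched in userspace. The in_atomic_ctx flag and the simplified signatures are illustrative stand-ins for the kernel's in_atomic()/might_sleep() machinery and the gfp_t parameter, not the real API:

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the kernel's in_atomic() state. */
static bool in_atomic_ctx;

/* Models might_sleep(): in the kernel this splats "sleeping function
 * called from invalid context"; here it simply asserts. */
static void might_sleep(void)
{
	assert(!in_atomic_ctx);
}

/* Common worker taking an explicit "may block" flag, mirroring the
 * gfp_t parameter added to iommu_ops::map. */
static int do_iommu_map(unsigned long iova, bool may_block)
{
	(void)iova;
	(void)may_block;	/* GFP_KERNEL vs. GFP_ATOMIC in the kernel */
	return 0;
}

static int iommu_map(unsigned long iova)
{
	might_sleep();			/* catches atomic-context callers early */
	return do_iommu_map(iova, true);	/* GFP_KERNEL */
}

static int iommu_map_atomic(unsigned long iova)
{
	return do_iommu_map(iova, false);	/* GFP_ATOMIC */
}

/* Simulate the MSI setup path, which runs with a spinlock held:
 * calling iommu_map() here would trip the might_sleep() assertion. */
static int map_msi_page(void)
{
	int ret;

	in_atomic_ctx = true;		/* spin_lock_irqsave(...) */
	ret = iommu_map_atomic(0x1000);
	in_atomic_ctx = false;		/* spin_unlock_irqrestore(...) */
	return ret;
}
```

The design point is that the sleeping wrapper fails loudly and immediately in any atomic caller, instead of only when the allocator happens to block under memory pressure.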
Re: "Convert the AMD iommu driver to the dma-iommu api" is buggy
On Wed, 2019-10-16 at 17:31 +0200, Joerg Roedel wrote: > Hi Qian, > > thanks for the report! > > On Wed, Oct 16, 2019 at 10:59:42AM -0400, Qian Cai wrote: > > On Wed, 2019-10-16 at 10:55 -0400, Qian Cai wrote: > > > Today's linux-next generates a lot of warnings on multiple servers during > > > boot > > > due to the series "iommu/amd: Convert the AMD iommu driver to the > > > dma-iommu api" > > > [1]. Reverted the whole things fixed them. > > > > > > [1] https://lore.kernel.org/lkml/20190908165642.22253-1-murph...@tcd.ie/ > > > > > > > BTW, the previous x86 warning was from only reverted one patch "iommu: Add > > gfp > > parameter to iommu_ops::map" where proved to be insufficient. Now, pasting > > the > > correct warning. The x86 one might just be a mistake.

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index ad05484d0c80..63c4b894751d 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -2542,7 +2542,7 @@ static int amd_iommu_map(struct iommu_domain *dom, unsigned long iova,
 	if (iommu_prot & IOMMU_WRITE)
 		prot |= IOMMU_PROT_IW;
 
-	ret = iommu_map_page(domain, iova, paddr, page_size, prot, GFP_KERNEL);
+	ret = iommu_map_page(domain, iova, paddr, page_size, prot, gfp);
 
 	domain_flush_np_cache(domain, iova, page_size);

The arm64 one -- did it forget to do this?

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index ecc08aef9b58..8dd0ef0656f4 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -1185,7 +1185,7 @@ static struct iommu_dma_msi_page *iommu_dma_get_msi_page(struct device *dev,
 	if (!iova)
 		goto out_free_page;
 
-	if (iommu_map(domain, iova, msi_addr, size, prot))
+	if (iommu_map_atomic(domain, iova, msi_addr, size, prot))
 		goto out_free_iova;
 
 	INIT_LIST_HEAD(&msi_page->list);

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: "Convert the AMD iommu driver to the dma-iommu api" is buggy
On Wed, 2019-10-16 at 10:55 -0400, Qian Cai wrote: > Today's linux-next generates a lot of warnings on multiple servers during boot > due to the series "iommu/amd: Convert the AMD iommu driver to the dma-iommu > api" > [1]. Reverted the whole things fixed them. > > [1] https://lore.kernel.org/lkml/20190908165642.22253-1-murph...@tcd.ie/ > BTW, the previous x86 warning came from reverting only one patch, "iommu: Add gfp parameter to iommu_ops::map", which proved to be insufficient. Now, pasting the correct warning. [ 564.365768][ T6222] BUG: sleeping function called from invalid context at mm/page_alloc.c:4692 [ 564.374447][ T6222] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 6222, name: git [ 564.382969][ T6222] INFO: lockdep is turned off. [ 564.387644][ T6222] CPU: 25 PID: 6222 Comm: git Tainted: GW 5.4.0-rc3-next-20191016 #6 [ 564.397011][ T6222] Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 07/10/2019 [ 564.406291][ T6222] Call Trace: [ 564.409470][ T6222] dump_stack+0x86/0xca [ 564.413517][ T6222] ___might_sleep.cold.92+0xd2/0x122 [ 564.418694][ T6222] __might_sleep+0x73/0xe0 [ 564.422999][ T6222] __alloc_pages_nodemask+0x442/0x720 [ 564.428265][ T6222] ? __alloc_pages_slowpath+0x18d0/0x18d0 [ 564.433883][ T6222] ? arch_stack_walk+0x7f/0xf0 [ 564.438534][ T6222] ? create_object+0x4a2/0x540 [ 564.443188][ T6222] alloc_pages_current+0x9c/0x110 [ 564.448098][ T6222] __get_free_pages+0x12/0x60 [ 564.452659][ T6222] get_zeroed_page+0x16/0x20 [ 564.457137][ T6222] amd_iommu_map+0x504/0x850 [ 564.461612][ T6222] ? amd_iommu_domain_direct_map+0x60/0x60 [ 564.467312][ T6222] ? lockdep_hardirqs_on+0x16/0x2a0 [ 564.472400][ T6222] ? alloc_iova+0x189/0x210 [ 564.476790][ T6222] __iommu_map+0x1c1/0x4e0 [ 564.481090][ T6222] ? iommu_get_dma_domain+0x40/0x40 [ 564.486181][ T6222] ? alloc_iova_fast+0x258/0x3d1 [ 564.491009][ T6222] ? 
create_object+0x4a2/0x540 [ 564.495656][ T6222] __iommu_map_sg+0xa5/0x130 [ 564.500135][ T6222] iommu_map_sg_atomic+0x14/0x20 [ 564.504958][ T6222] iommu_dma_map_sg+0x2c3/0x450 [ 564.509699][ T6222] scsi_dma_map+0xd7/0x160 [ 564.514010][ T6222] pqi_raid_submit_scsi_cmd_with_io_request+0x392/0x420 [smartpqi] [ 564.521811][ T6222] ? pqi_alloc_io_request+0x127/0x140 [smartpqi] [ 564.528037][ T6222] pqi_scsi_queue_command+0x8ab/0xe00 [smartpqi] [ 564.534264][ T6222] ? pqi_eh_device_reset_handler+0x9c0/0x9c0 [smartpqi] [ 564.541100][ T6222] ? sd_init_command+0xa25/0x1346 [sd_mod] [ 564.546802][ T6222] scsi_queue_rq+0xd19/0x1360 [ 564.551372][ T6222] __blk_mq_try_issue_directly+0x295/0x3f0 [ 564.557071][ T6222] ? blk_mq_request_bypass_insert+0xd0/0xd0 [ 564.562860][ T6222] ? debug_lockdep_rcu_enabled+0x27/0x60 [ 564.568384][ T6222] blk_mq_try_issue_directly+0xad/0x130 [ 564.573821][ T6222] ? __blk_mq_try_issue_directly+0x3f0/0x3f0 [ 564.579693][ T6222] ? blk_add_rq_to_plug+0xcd/0x110 [ 564.584693][ T6222] blk_mq_make_request+0xcee/0x1120 [ 564.589777][ T6222] ? lock_downgrade+0x3c0/0x3c0 [ 564.594517][ T6222] ? blk_mq_try_issue_directly+0x130/0x130 [ 564.600218][ T6222] ? blk_queue_enter+0x78d/0x810 [ 564.605041][ T6222] ? generic_make_request_checks+0xf30/0xf30 [ 564.610915][ T6222] ? lock_downgrade+0x3c0/0x3c0 [ 564.615655][ T6222] ? __srcu_read_unlock+0x24/0x50 [ 564.620565][ T6222] ? generic_make_request+0x150/0x650 [ 564.625833][ T6222] generic_make_request+0x196/0x650 [ 564.630921][ T6222] ? blk_queue_enter+0x810/0x810 [ 564.635747][ T6222] submit_bio+0xaa/0x270 [ 564.639873][ T6222] ? submit_bio+0xaa/0x270 [ 564.644172][ T6222] ? generic_make_request+0x650/0x650 [ 564.649437][ T6222] ? iomap_readpage+0x260/0x260 [ 564.654173][ T6222] iomap_readpages+0x154/0x3d0 [ 564.658823][ T6222] ? iomap_zero_range_actor+0x330/0x330 [ 564.664257][ T6222] ? 
__might_sleep+0x73/0xe0 [ 564.668836][ T6222] xfs_vm_readpages+0xaf/0x1f0 [xfs] [ 564.674016][ T6222] read_pages+0xe2/0x3b0 [ 564.678142][ T6222] ? read_cache_pages+0x350/0x350 [ 564.683057][ T6222] ? __page_cache_alloc+0x12c/0x230 [ 564.688148][ T6222] __do_page_cache_readahead+0x346/0x3a0 [ 564.693670][ T6222] ? read_pages+0x3b0/0x3b0 [ 564.698059][ T6222] ? lockdep_hardirqs_on+0x16/0x2a0 [ 564.703247][ T6222] ? __xfs_filemap_fault+0x167/0x4a0 [xfs] [ 564.708947][ T6222] filemap_fault+0xa13/0xe70 [ 564.713528][ T6222] __xfs_filemap_fault+0x167/0x4a0 [xfs] [ 564.719059][ T6222] ? kmemleak_alloc+0x57/0x90 [ 564.723724][ T6222] ? xfs_file_read_iter+0x3c0/0x3c0 [xfs] [ 564.729337][ T6222] ? debug_check_no_locks_freed+0x2c/0xe0 [ 564.734946][ T6222] ? lockdep_init_map+0x8b/0x2b0 [ 564.739872][ T6222] xfs_filemap_fault+0x68/0x70 [xfs] [ 564.745046][ T6222] __do_fault+0x83/0x220 [ 564.749172][ T6222] __handle_mm_fault+0xd76/0x1f40 [ 564.754084][ T6222] ? __pmd_alloc+0x280/0x280 [ 564.758559][ T6222] ? debug_loc
"Convert the AMD iommu driver to the dma-iommu api" is buggy
Today's linux-next generates a lot of warnings on multiple servers during boot due to the series "iommu/amd: Convert the AMD iommu driver to the dma-iommu api" [1]. Reverting the whole series fixed them. [1] https://lore.kernel.org/lkml/20190908165642.22253-1-murph...@tcd.ie/ [ 257.785749][ T6184] BUG: sleeping function called from invalid context at mm/page_alloc.c:4692 [ 257.794886][ T6184] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 6184, name: readelf [ 257.803574][ T6184] INFO: lockdep is turned off. [ 257.808233][ T6184] CPU: 86 PID: 6184 Comm: readelf Tainted: GW 5.4.0-rc3-next-20191016+ #7 [ 257.818035][ T6184] Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 07/10/2019 [ 257.827313][ T6184] Call Trace: [ 257.830487][ T6184] dump_stack+0x86/0xca [ 257.834530][ T6184] ___might_sleep.cold.92+0xd2/0x122 [ 257.839708][ T6184] __might_sleep+0x73/0xe0 [ 257.844011][ T6184] __alloc_pages_nodemask+0x442/0x720 [ 257.849274][ T6184] ? __alloc_pages_slowpath+0x18d0/0x18d0 [ 257.854886][ T6184] ? debug_lockdep_rcu_enabled+0x27/0x60 [ 257.860415][ T6184] ? lock_downgrade+0x3c0/0x3c0 [ 257.865156][ T6184] alloc_pages_current+0x9c/0x110 [ 257.870071][ T6184] __get_free_pages+0x12/0x60 [ 257.874634][ T6184] get_zeroed_page+0x16/0x20 [ 257.879112][ T6184] amd_iommu_map+0x504/0x850 [ 257.883588][ T6184] ? amd_iommu_domain_direct_map+0x60/0x60 [ 257.889286][ T6184] ? lockdep_hardirqs_on+0x16/0x2a0 [ 257.894373][ T6184] ? alloc_iova+0x189/0x210 [ 257.898765][ T6184] ? trace_hardirqs_on+0x3a/0x160 [ 257.903677][ T6184] iommu_map+0x1b3/0x4d0 [ 257.907802][ T6184] ? iommu_unmap+0xf0/0xf0 [ 257.912104][ T6184] ? alloc_iova_fast+0x258/0x3d1 [ 257.916929][ T6184] ? create_object+0x4a2/0x540 [ 257.921579][ T6184] iommu_map_sg+0x9d/0x120 [ 257.925882][ T6184] iommu_dma_map_sg+0x2c3/0x450 [ 257.930627][ T6184] scsi_dma_map+0xd7/0x160 [ 257.934936][ T6184] pqi_raid_submit_scsi_cmd_with_io_request+0x392/0x420 [smartpqi] [ 257.942735][ T6184] ? 
pqi_alloc_io_request+0x127/0x140 [smartpqi] [ 257.948962][ T6184] pqi_scsi_queue_command+0x8ab/0xe00 [smartpqi] [ 257.955192][ T6184] ? pqi_eh_device_reset_handler+0x9c0/0x9c0 [smartpqi] [ 257.962029][ T6184] ? sd_init_command+0xa25/0x1346 [sd_mod] [ 257.967730][ T6184] scsi_queue_rq+0xd19/0x1360 [ 257.972298][ T6184] __blk_mq_try_issue_directly+0x295/0x3f0 [ 257.977999][ T6184] ? blk_mq_request_bypass_insert+0xd0/0xd0 [ 257.983787][ T6184] ? debug_lockdep_rcu_enabled+0x27/0x60 [ 257.989312][ T6184] blk_mq_request_issue_directly+0xb5/0x100 [ 257.995098][ T6184] ? blk_mq_flush_plug_list+0x7e0/0x7e0 [ 258.000537][ T6184] ? blk_mq_sched_insert_requests+0xd6/0x380 [ 258.006409][ T6184] ? lock_downgrade+0x3c0/0x3c0 [ 258.011147][ T6184] blk_mq_try_issue_list_directly+0xa9/0x160 [ 258.017023][ T6184] blk_mq_sched_insert_requests+0x228/0x380 [ 258.022810][ T6184] blk_mq_flush_plug_list+0x448/0x7e0 [ 258.028073][ T6184] ? blk_mq_insert_requests+0x380/0x380 [ 258.033516][ T6184] blk_flush_plug_list+0x1eb/0x230 [ 258.038515][ T6184] ? blk_insert_cloned_request+0x1b0/0x1b0 [ 258.044215][ T6184] blk_finish_plug+0x43/0x5d [ 258.048695][ T6184] read_pages+0xf6/0x3b0 [ 258.052823][ T6184] ? read_cache_pages+0x350/0x350 [ 258.057737][ T6184] ? __page_cache_alloc+0x12c/0x230 [ 258.062826][ T6184] __do_page_cache_readahead+0x346/0x3a0 [ 258.068348][ T6184] ? read_pages+0x3b0/0x3b0 [ 258.072738][ T6184] ? lockdep_hardirqs_on+0x16/0x2a0 [ 258.077928][ T6184] ? __xfs_filemap_fault+0x167/0x4a0 [xfs] [ 258.083625][ T6184] filemap_fault+0xa13/0xe70 [ 258.088201][ T6184] __xfs_filemap_fault+0x167/0x4a0 [xfs] [ 258.093731][ T6184] ? kmemleak_alloc+0x57/0x90 [ 258.098397][ T6184] ? xfs_file_read_iter+0x3c0/0x3c0 [xfs] [ 258.104009][ T6184] ? debug_check_no_locks_freed+0x2c/0xe0 [ 258.109618][ T6184] ? 
lockdep_init_map+0x8b/0x2b0 [ 258.114543][ T6184] xfs_filemap_fault+0x68/0x70 [xfs] [ 258.119720][ T6184] __do_fault+0x83/0x220 [ 258.123847][ T6184] __handle_mm_fault+0xd76/0x1f40 [ 258.128757][ T6184] ? __pmd_alloc+0x280/0x280 [ 258.133231][ T6184] ? debug_lockdep_rcu_enabled+0x27/0x60 [ 258.138755][ T6184] ? handle_mm_fault+0x178/0x4c0 [ 258.143581][ T6184] ? lockdep_hardirqs_on+0x16/0x2a0 [ 258.148674][ T6184] ? __do_page_fault+0x29c/0x640 [ 258.153501][ T6184] handle_mm_fault+0x205/0x4c0 [ 258.158151][ T6184] __do_page_fault+0x29c/0x640 [ 258.162800][ T6184] do_page_fault+0x50/0x37f [ 258.167189][ T6184] page_fault+0x34/0x40 [ 258.171228][ T6184] RIP: 0010:__clear_user+0x3b/0x70 [ 183.553150] BUG: sleeping function called from invalid context at drivers/iommu/iommu.c:1950 [ 183.562306] in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 1486, name: kworker/0:3 [ 183.571450] 5 locks held by kworker/0:3/1486: [ 183.576510] #0: 44ff0008000ce128 ((wq_completion)events){+.+.}, at: process_one_work+0x25c/0x948 [ 183.5861
Re: [PATCH] dma/coherent: remove unused dma_get_device_base()
Ping. Please take a look at this trivial patch. On Tue, 2019-09-17 at 09:00 -0400, Qian Cai wrote: > dma_get_device_base() was first introduced in the commit c41f9ea998f3 > ("drivers: dma-coherent: Account dma_pfn_offset when used with device > tree"). Later, it was removed by the commit 43fc509c3efb ("dma-coherent: > introduce interface for default DMA pool") > > Found by the -Wunused-function compilation warning. > > Signed-off-by: Qian Cai > --- > kernel/dma/coherent.c | 9 - > 1 file changed, 9 deletions(-) > > diff --git a/kernel/dma/coherent.c b/kernel/dma/coherent.c > index 29fd6590dc1e..909b38e1c29b 100644 > --- a/kernel/dma/coherent.c > +++ b/kernel/dma/coherent.c > @@ -28,15 +28,6 @@ static inline struct dma_coherent_mem > *dev_get_coherent_memory(struct device *de > return NULL; > } > > -static inline dma_addr_t dma_get_device_base(struct device *dev, > - struct dma_coherent_mem * mem) > -{ > - if (mem->use_dev_dma_pfn_offset) > - return (mem->pfn_base - dev->dma_pfn_offset) << PAGE_SHIFT; > - else > - return mem->device_base; > -} > - > static int dma_init_coherent_memory(phys_addr_t phys_addr, > dma_addr_t device_addr, size_t size, > struct dma_coherent_mem **mem)
[PATCH] dma/coherent: remove unused dma_get_device_base()
dma_get_device_base() was first introduced in commit c41f9ea998f3 ("drivers: dma-coherent: Account dma_pfn_offset when used with device tree"). Later, its last user was removed by commit 43fc509c3efb ("dma-coherent: introduce interface for default DMA pool"), leaving the function unused.

Found by the -Wunused-function compilation warning.

Signed-off-by: Qian Cai
---
 kernel/dma/coherent.c | 9 -
 1 file changed, 9 deletions(-)

diff --git a/kernel/dma/coherent.c b/kernel/dma/coherent.c
index 29fd6590dc1e..909b38e1c29b 100644
--- a/kernel/dma/coherent.c
+++ b/kernel/dma/coherent.c
@@ -28,15 +28,6 @@ static inline struct dma_coherent_mem *dev_get_coherent_memory(struct device *de
 	return NULL;
 }
 
-static inline dma_addr_t dma_get_device_base(struct device *dev,
-					     struct dma_coherent_mem * mem)
-{
-	if (mem->use_dev_dma_pfn_offset)
-		return (mem->pfn_base - dev->dma_pfn_offset) << PAGE_SHIFT;
-	else
-		return mem->device_base;
-}
-
 static int dma_init_coherent_memory(phys_addr_t phys_addr,
 				    dma_addr_t device_addr, size_t size,
 				    struct dma_coherent_mem **mem)
-- 
1.8.3.1
Re: [PATCH] iommu/arm-smmu: Report USF more clearly
On Fri, 2019-09-13 at 12:48 +0100, Robin Murphy wrote: > Although CONFIG_ARM_SMMU_DISABLE_BYPASS_BY_DEFAULT is a welcome tool > for smoking out inadequate firmware, the failure mode is non-obvious > and can be confusing for end users. Add some special-case reporting of > Unidentified Stream Faults to help clarify this particular symptom. > > CC: Douglas Anderson > Signed-off-by: Robin Murphy > --- > drivers/iommu/arm-smmu.c | 5 + > drivers/iommu/arm-smmu.h | 2 ++ > 2 files changed, 7 insertions(+) > > diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c > index b7cf24402a94..76ac8c180695 100644 > --- a/drivers/iommu/arm-smmu.c > +++ b/drivers/iommu/arm-smmu.c > @@ -499,6 +499,11 @@ static irqreturn_t arm_smmu_global_fault(int irq, void > *dev) > dev_err_ratelimited(smmu->dev, > "\tGFSR 0x%08x, GFSYNR0 0x%08x, GFSYNR1 0x%08x, GFSYNR2 > 0x%08x\n", > gfsr, gfsynr0, gfsynr1, gfsynr2); > + if (IS_ENABLED(CONFIG_ARM_SMMU_DISABLE_BYPASS_BY_DEFAULT) && > + (gfsr & sGFSR_USF)) > + dev_err_ratelimited(smmu->dev, > + "Stream ID %hu may not be described by firmware, try > booting with \"arm-smmu.disable_bypass=0\"\n", > + (u16)gfsynr1); How about dev_err_once() instead? I.e., there is no need to remind people to set "arm-smmu.disable_bypass=0" multiple times. > > arm_smmu_gr0_write(smmu, ARM_SMMU_GR0_sGFSR, gfsr); > return IRQ_HANDLED; > diff --git a/drivers/iommu/arm-smmu.h b/drivers/iommu/arm-smmu.h > index c9c13b5785f2..46f7e161e83e 100644 > --- a/drivers/iommu/arm-smmu.h > +++ b/drivers/iommu/arm-smmu.h > @@ -79,6 +79,8 @@ > #define ID7_MINORGENMASK(3, 0) > > #define ARM_SMMU_GR0_sGFSR 0x48 > +#define sGFSR_USFBIT(2) > + > #define ARM_SMMU_GR0_sGFSYNR00x50 > #define ARM_SMMU_GR0_sGFSYNR10x54 > #define ARM_SMMU_GR0_sGFSYNR20x58
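The dev_err_once() suggestion amounts to latching a static flag the first time the message fires, versus the ratelimited variant which keeps emitting (subject to rate limiting). A rough userspace model of the "_once" behaviour — the names and counter are illustrative stand-ins for the kernel helpers, and the code counts emissions instead of printing so the behaviour is easy to check:

```c
#include <assert.h>

/* Count instead of printing so the behaviour is observable. */
static int once_emitted;

/* Models dev_err_once(): a function-local static flag ensures only the
 * very first call ever emits the message. */
static void err_once(const char *msg)
{
	static int done;

	(void)msg;
	if (!done) {
		done = 1;
		once_emitted++;	/* would be dev_err(dev, ...) here */
	}
}

/* Simulate the global fault IRQ handler firing n times for the same
 * unidentified stream fault. */
static int report_usf(int n_faults)
{
	for (int i = 0; i < n_faults; i++)
		err_once("try booting with \"arm-smmu.disable_bypass=0\"");
	return once_emitted;
}
```

The trade-off raised in the thread: "_once" avoids spamming the console with the same advice, at the cost of hiding later faults from *different* stream IDs, which is why the ratelimited form was proposed in the patch.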
Re: [PATCH] iommu/vt-d: Make function signatures static
On Tue, 2019-09-10 at 10:15 +0200, Joerg Roedel wrote: > On Sat, Sep 07, 2019 at 04:49:33PM +1000, Adam Zerella wrote: > > drivers/iommu/intel-iommu.c | 6 +++--- > > 1 file changed, 3 insertions(+), 3 deletions(-) > > Applied, thanks. Joerg, not sure if you saw the reply from Lu, https://lore.kernel.org/linux-iommu/ba8d4792-3b62-98a8-31d8-74a08be2f983@linux.intel.com/ This patch does not even compile for me, either:

WARNING: "intel_iommu_gfx_mapped" [vmlinux] is a static EXPORT_SYMBOL_GPL
drivers/iommu/dmar.o: In function `iommu_device_set_ops':
/home/linux-mm/linux/./include/linux/iommu.h:382: undefined reference to `intel_iommu_ops'
make: *** [Makefile:1096: vmlinux] Error 1
Re: [RFC PATCH] iommu/amd: fix a race in increase_address_space()
On Thu, 2019-09-05 at 13:43 +0200, Joerg Roedel wrote: > Hi Qian, > > On Wed, Sep 04, 2019 at 05:24:22PM -0400, Qian Cai wrote: > > if (domain->mode == PAGE_MODE_6_LEVEL) > > /* address space already 64 bit large */ > > return false; > > > > This gives a clue that there must be a race between multiple concurrent > > threads in increase_address_space(). > > Thanks for tracking this down, there is a race indeed. > > > + mutex_lock(&domain->api_lock); > > *dma_addr = __map_single(dev, dma_dom, page_to_phys(page), > > size, DMA_BIDIRECTIONAL, dma_mask); > > + mutex_unlock(&domain->api_lock); > > > > if (*dma_addr == DMA_MAPPING_ERROR) > > goto out_free; > > @@ -2696,7 +2698,9 @@ static void free_coherent(struct device *dev, size_t > > size, > > > > dma_dom = to_dma_ops_domain(domain); > > > > + mutex_lock(&domain->api_lock); > > __unmap_single(dma_dom, dma_addr, size, DMA_BIDIRECTIONAL); > > + mutex_unlock(&domain->api_lock); > > But I think the right fix is to lock the operation in > increase_address_space() directly, and not the calls around it, like in > the diff below. It is untested, so can you please try it and report back > if it fixes your issue? Yes, it works great so far. > > diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c > index b607a92791d3..1ff705f16239 100644 > --- a/drivers/iommu/amd_iommu.c > +++ b/drivers/iommu/amd_iommu.c > @@ -1424,18 +1424,21 @@ static void free_pagetable(struct protection_domain > *domain) > * another level increases the size of the address space by 9 bits to a size > up > * to 64 bits. 
> */ > -static bool increase_address_space(struct protection_domain *domain, > +static void increase_address_space(struct protection_domain *domain, > gfp_t gfp) > { > + unsigned long flags; > u64 *pte; > > - if (domain->mode == PAGE_MODE_6_LEVEL) > + spin_lock_irqsave(&domain->lock, flags); > + > + if (WARN_ON_ONCE(domain->mode == PAGE_MODE_6_LEVEL)) > /* address space already 64 bit large */ > - return false; > + goto out; > > pte = (void *)get_zeroed_page(gfp); > if (!pte) > - return false; > + goto out; > > *pte = PM_LEVEL_PDE(domain->mode, > iommu_virt_to_phys(domain->pt_root)); > @@ -1443,7 +1446,10 @@ static bool increase_address_space(struct > protection_domain *domain, > domain->mode+= 1; > domain->updated = true; > > - return true; > +out: > + spin_unlock_irqrestore(&domain->lock, flags); > + > + return; > } > > static u64 *alloc_pte(struct protection_domain *domain,
[RFC PATCH] iommu/amd: fix a race in increase_address_space()
When the system is under some memory pressure, it can cause disks behind the "smartpqi" driver to go offline, as shown below. The UBSAN report indicates that "domain->mode" becomes 7. Further investigation shows that the only place that could increase "domain->mode" is increase_address_space(), but it has this check before actually increasing it:

	if (domain->mode == PAGE_MODE_6_LEVEL)
		/* address space already 64 bit large */
		return false;

This gives a clue that there must be a race between multiple concurrent threads in increase_address_space(). In amd_iommu_map() and amd_iommu_unmap(), there are mutex_lock()/mutex_unlock() pairs around iommu_map_page(). There are several places that could call iommu_map_page() with interrupts enabled. One for sure is alloc_coherent(), which calls __map_single(); the latter has a comment saying that a lock needs to be held, but none is taken. I am not sure about map_page() and map_sg(), where my investigation so far indicates that many call sites are in atomic context. 
smartpqi :23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x address=0xff702a00 flags=0x0010] UBSAN: Undefined behaviour in drivers/iommu/amd_iommu.c:1464:29 shift exponent 66 is too large for 64-bit type 'long unsigned int' CPU: 39 PID: 821 Comm: kswapd4 Tainted: G O 5.3.0-rc7-next-20190903+ #4 Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 07/10/2019 Call Trace: dump_stack+0x62/0x9a ubsan_epilogue+0xd/0x3a smartpqi :23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x address=0xff702c80 flags=0x0010] __ubsan_handle_shift_out_of_bounds.cold.12+0x21/0x68 iommu_map_page.cold.39+0x7f/0x84 map_sg+0x1ce/0x2f0 scsi_dma_map+0xc6/0x160 pqi_raid_submit_scsi_cmd_with_io_request+0x1c3/0x470 [smartpqi] smartpqi :23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x address=0xff702900 flags=0x0010] pqi_scsi_queue_command+0x7d6/0xeb0 [smartpqi] scsi_queue_rq+0x7ee/0x12b0 __blk_mq_try_issue_directly+0x295/0x3f0 blk_mq_try_issue_directly+0xad/0x130 blk_mq_make_request+0xb12/0xee0 generic_make_request+0x179/0x4a0 submit_bio+0xaa/0x270 __swap_writepage+0x8f5/0xba0 swap_writepage+0x65/0xb0 pageout.isra.3+0x3e5/0xa00 shrink_page_list+0x15a0/0x2660 shrink_inactive_list+0x373/0x770 shrink_node_memcg+0x4ff/0x1550 shrink_node+0x123/0x800 balance_pgdat+0x493/0x9b0 kswapd+0x39b/0x7f0 kthread+0x1df/0x200 ret_from_fork+0x22/0x40 smartpqi :23:00.0: resetting scsi 0:1:0:0 INFO: task khugepaged:797 blocked for more than 122 seconds. Tainted: G O 5.3.0-rc7-next-20190903+ #4 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
khugepaged D22656 797 2 0x80004000 Call Trace: __schedule+0x4bb/0xc20 schedule+0x6c/0x140 io_schedule+0x21/0x50 rq_qos_wait+0x18e/0x2b0 wbt_wait+0x12b/0x1b0 __rq_qos_throttle+0x3b/0x60 blk_mq_make_request+0x217/0xee0 generic_make_request+0x179/0x4a0 submit_bio+0xaa/0x270 __swap_writepage+0x8f5/0xba0 swap_writepage+0x65/0xb0 pageout.isra.3+0x3e5/0xa00 shrink_page_list+0x15a0/0x2660 shrink_inactive_list+0x373/0x770 shrink_node_memcg+0x4ff/0x1550 shrink_node+0x123/0x800 do_try_to_free_pages+0x22f/0x820 try_to_free_pages+0x242/0x4d0 __alloc_pages_nodemask+0x9f6/0x1bb0 khugepaged_alloc_page+0x6f/0xf0 collapse_huge_page+0x103/0x1060 khugepaged_scan_pmd+0x840/0xa70 khugepaged+0x1571/0x18d0 kthread+0x1df/0x200 ret_from_fork+0x22/0x40 INFO: task systemd-journal:3112 blocked for more than 123 seconds. Tainted: G O 5.3.0-rc7-next-20190903+ #4 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. systemd-journal D20744 3112 1 0x4120 Call Trace: __do_page_cache_readahead+0x149/0x3a0 filemap_fault+0x9da/0x1050 __xfs_filemap_fault+0x167/0x450 [xfs] xfs_filemap_fault+0x68/0x70 [xfs] __do_fault+0x83/0x220 __handle_mm_fault+0x1034/0x1a50 handle_mm_fault+0x17f/0x37e __do_page_fault+0x369/0x630 do_page_fault+0x50/0x2d3 page_fault+0x2f/0x40 RIP: 0033:0x56088e81faa0 Code: Bad RIP value. 
RSP: 002b:7ffdc2158ba8 EFLAGS: 000102_iter+0x135/0x850 shrink_node+0x123/0x800 do_try_to_free_pages+0x22f/0x820 try_to_free_pages+0x242/0x4d0 __alloc_pages_nodemask+0x9f6/0x1bb0 [11967.13967 __swap_writepage+0x8f5/0xba0 swap_writepage+0x65/0xb0 pageout.isra.3+0x3e5/0xa00 shrink_page_list+0x15a0/0x2660 io_schedule+0x21/0x50 rq_qos_wait+0x18e/0x2b0 wbt_wait+0x12b/0x1b0 __rq_qos_throttle+0x3b/0x60 blk_mq_make_request+0x217/0xee0 schedule+0x6c/0x140 io_schedule+0x21/0x50 rq_qos_wait+0x18e/0x2b0 wbt_wait+0x12b/0x1b0 __rq_qos_throttle+0x3b/0x60 blk_mq_make_request+0x217/0xee0 [11968.69198.0+0x60/0x60 swap_writepage+0x65/0xb0 pageout.isra.3+0x3e5/0xa00 shrink_page_list+0x15a0/0x2660 generic_make_request+0x179/0x4a0 submit_bio+0xaa/0x270 __swap_writepage+0x8f5/0xba0 swap_writepage+0x65/0xb0 pageo
"Rework enabling/disabling of ATS for PCI masters" failed to compile on arm64
The linux-next commit "iommu/arm-smmu-v3: Rework enabling/disabling of ATS for PCI masters" [1] causes a compilation error when PCI_ATS=n on arm64.

[1] https://lore.kernel.org/linux-iommu/20190820154549.17018-3-w...@kernel.org/

drivers/iommu/arm-smmu-v3.c:2325:35: error: no member named 'ats_cap' in 'struct pci_dev'
        return !pdev->untrusted && pdev->ats_cap;
                                         ^
For example,

Symbol: PCI_ATS [=n]
Type  : bool
Defined at drivers/pci/Kconfig:118
Depends on: PCI [=y]
Selected by [n]:
  - PCI_IOV [=n] && PCI [=y]
  - PCI_PRI [=n] && PCI [=y]
  - PCI_PASID [=n] && PCI [=y]
  - AMD_IOMMU [=n] && IOMMU_SUPPORT [=y] && X86_64 && PCI [=y] && ACPI [=y]
[PATCH] iommu/amd: silence warnings under memory pressure
When running heavy memory pressure workloads, the system is throwing endless warnings:

smartpqi :23:00.0: AMD-Vi: IOMMU mapping error in map_sg (io-pages: 5 reason: -12) Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 07/10/2019 swapper/10: page allocation failure: order:0, mode:0xa20(GFP_ATOMIC), nodemask=(null),cpuset=/,mems_allowed=0,4 Call Trace: dump_stack+0x62/0x9a warn_alloc.cold.43+0x8a/0x148 __alloc_pages_nodemask+0x1a5c/0x1bb0 get_zeroed_page+0x16/0x20 iommu_map_page+0x477/0x540 map_sg+0x1ce/0x2f0 scsi_dma_map+0xc6/0x160 pqi_raid_submit_scsi_cmd_with_io_request+0x1c3/0x470 [smartpqi] do_IRQ+0x81/0x170 common_interrupt+0xf/0xf

because the allocation in iommu_map_page() can fail, and the volume of these calls can be huge, which may generate a lot of serial console output and consume all CPUs. Fix it by silencing the warning at this call site; there is still a dev_err() later to report the failure.

Signed-off-by: Qian Cai
---
 drivers/iommu/amd_iommu.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index b607a92791d3..19eef1edf8ed 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -2547,7 +2547,9 @@ static int map_sg(struct device *dev, struct scatterlist *sglist,
 			bus_addr  = address + s->dma_address + (j << PAGE_SHIFT);
 			phys_addr = (sg_phys(s) & PAGE_MASK) + (j << PAGE_SHIFT);
-			ret = iommu_map_page(domain, bus_addr, phys_addr, PAGE_SIZE, prot, GFP_ATOMIC);
+			ret = iommu_map_page(domain, bus_addr, phys_addr,
+					     PAGE_SIZE, prot,
+					     GFP_ATOMIC | __GFP_NOWARN);
 			if (ret)
 				goto out_unmap;
-- 
1.8.3.1
[PATCH v2] iommu/amd: fix a crash in iova_magazine_free_pfns
Commit b3aa14f02254 ("iommu: remove the mapping_error dma_map_ops method") incorrectly changed the check on the return value of dma_ops_alloc_iova() in map_sg(), which causes a crash under memory pressure: dma_ops_alloc_iova() never returns DMA_MAPPING_ERROR on failure but 0, so the error handling is all wrong.

kernel BUG at drivers/iommu/iova.c:801! Workqueue: kblockd blk_mq_run_work_fn RIP: 0010:iova_magazine_free_pfns+0x7d/0xc0 Call Trace: free_cpu_cached_iovas+0xbd/0x150 alloc_iova_fast+0x8c/0xba dma_ops_alloc_iova.isra.6+0x65/0xa0 map_sg+0x8c/0x2a0 scsi_dma_map+0xc6/0x160 pqi_aio_submit_io+0x1f6/0x440 [smartpqi] pqi_scsi_queue_command+0x90c/0xdd0 [smartpqi] scsi_queue_rq+0x79c/0x1200 blk_mq_dispatch_rq_list+0x4dc/0xb70 blk_mq_sched_dispatch_requests+0x249/0x310 __blk_mq_run_hw_queue+0x128/0x200 blk_mq_run_work_fn+0x27/0x30 process_one_work+0x522/0xa10 worker_thread+0x63/0x5b0 kthread+0x1d2/0x1f0 ret_from_fork+0x22/0x40

Fixes: b3aa14f02254 ("iommu: remove the mapping_error dma_map_ops method")
Signed-off-by: Qian Cai
---
v2: Fix the offending commit directly.

 drivers/iommu/amd_iommu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 73740b969e62..b607a92791d3 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -2533,7 +2533,7 @@ static int map_sg(struct device *dev, struct scatterlist *sglist,
 	npages = sg_num_pages(dev, sglist, nelems);
 
 	address = dma_ops_alloc_iova(dev, dma_dom, npages, dma_mask);
-	if (address == DMA_MAPPING_ERROR)
+	if (!address)
 		goto out_err;
 
 	prot = dir2prot(direction);
-- 
1.8.3.1
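The bug class here is an allocator and its caller disagreeing on the failure sentinel: dma_ops_alloc_iova() reports failure as 0, while the converted caller tested for DMA_MAPPING_ERROR (all-ones), so the error path could never fire and a bogus address 0 leaked downstream. A minimal sketch of why the old check is dead code — the constants and function bodies are simplified stand-ins, not the driver code:

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned long long dma_addr_t;

#define DMA_MAPPING_ERROR	(~(dma_addr_t)0)

/* Stand-in allocator: returns a page-aligned address, or 0 on failure --
 * never DMA_MAPPING_ERROR. */
static dma_addr_t dma_ops_alloc_iova(bool out_of_memory)
{
	return out_of_memory ? 0 : 0x100000ULL;
}

/* The broken check from b3aa14f02254: compares against the wrong
 * sentinel, so allocation failure is never detected. */
static bool map_sg_broken(bool out_of_memory)
{
	dma_addr_t address = dma_ops_alloc_iova(out_of_memory);

	if (address == DMA_MAPPING_ERROR)
		return false;	/* error path: never taken */
	return true;		/* proceeds with address == 0, crashes later */
}

/* The fixed check matches what the allocator actually returns. */
static bool map_sg_fixed(bool out_of_memory)
{
	dma_addr_t address = dma_ops_alloc_iova(out_of_memory);

	if (!address)
		return false;	/* failure correctly propagated */
	return true;
}
```

This is also why the v1 patch below took the opposite approach — making the allocator return DMA_MAPPING_ERROR instead of changing the caller: either side can be fixed, as long as both agree on one sentinel.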
[PATCH] iommu/amd: fix a crash in iova_magazine_free_pfns
When a system is under heavy memory pressure, the allocation in alloc_iova_fast() can still fail even with flush_rcache=true, which then causes dma_ops_alloc_iova() to return 0:

pqi_scsi_queue_command
  pqi_raid_submit_scsi_cmd_with_io_request
    scsi_dma_map
      map_sg
        dma_ops_alloc_iova
          alloc_iova_fast

Later, map_sg()->iommu_map_page() would probably fail due to the invalid PFN 0, and call free_iova_fast()->iova_rcache_insert() to insert it into the rcache. Finally, it triggers the BUG_ON(!iova) here:

kernel BUG at drivers/iommu/iova.c:801! Workqueue: kblockd blk_mq_run_work_fn RIP: 0010:iova_magazine_free_pfns+0x7d/0xc0 Call Trace: free_cpu_cached_iovas+0xbd/0x150 alloc_iova_fast+0x8c/0xba dma_ops_alloc_iova.isra.6+0x65/0xa0 map_sg+0x8c/0x2a0 scsi_dma_map+0xc6/0x160 pqi_aio_submit_io+0x1f6/0x440 [smartpqi] pqi_scsi_queue_command+0x90c/0xdd0 [smartpqi] scsi_queue_rq+0x79c/0x1200 blk_mq_dispatch_rq_list+0x4dc/0xb70 blk_mq_sched_dispatch_requests+0x249/0x310 __blk_mq_run_hw_queue+0x128/0x200 blk_mq_run_work_fn+0x27/0x30 process_one_work+0x522/0xa10 worker_thread+0x63/0x5b0 kthread+0x1d2/0x1f0 ret_from_fork+0x22/0x40

Fix it by validating the return value from the 2nd alloc_iova_fast() in dma_ops_alloc_iova(), so map_sg() can handle the error condition immediately.

Signed-off-by: Qian Cai
---
 drivers/iommu/amd_iommu.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 73740b969e62..f24c689b4e01 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -1697,6 +1697,8 @@ static unsigned long dma_ops_alloc_iova(struct device *dev,
 	if (!pfn)
 		pfn = alloc_iova_fast(&dma_dom->iovad, pages,
 				      IOVA_PFN(dma_mask), true);
+	if (!pfn)
+		return DMA_MAPPING_ERROR;
 
 	return (pfn << PAGE_SHIFT);
 }
-- 
1.8.3.1
[PATCH] iommu/intel: remove an unused variable "length"
The linux-next commit "iommu/vt-d: Duplicate iommu_resv_region objects per device list" [1] left behind an unused variable,

drivers/iommu/intel-iommu.c: In function 'dmar_parse_one_rmrr':
drivers/iommu/intel-iommu.c:4014:9: warning: variable 'length' set but not used [-Wunused-but-set-variable]

[1] https://lore.kernel.org/patchwork/patch/1083073/

Signed-off-by: Qian Cai
---
 drivers/iommu/intel-iommu.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 478ac186570b..d86d4ee5cc78 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -4011,7 +4011,6 @@ int __init dmar_parse_one_rmrr(struct acpi_dmar_header *header, void *arg)
 {
 	struct acpi_dmar_reserved_memory *rmrr;
 	struct dmar_rmrr_unit *rmrru;
-	size_t length;
 
 	rmrru = kzalloc(sizeof(*rmrru), GFP_KERNEL);
 	if (!rmrru)
@@ -4022,8 +4021,6 @@ int __init dmar_parse_one_rmrr(struct acpi_dmar_header *header, void *arg)
 	rmrru->base_address = rmrr->base_address;
 	rmrru->end_address = rmrr->end_address;
 
-	length = rmrr->end_address - rmrr->base_address + 1;
-
 	rmrru->devices = dmar_alloc_dev_scope((void *)(rmrr + 1),
 					      ((void *)rmrr) + rmrr->header.length,
 					      &rmrru->devices_cnt);
-- 
2.20.1 (Apple Git-117)
Re: [PATCH 0/6] iommu/vt-d: Fixes and cleanups for linux-next
On Sun, 2019-06-09 at 10:37 +0800, Lu Baolu wrote: > Hi Joerg, > > This series includes several fixes and cleanups after delegating > DMA domain to generic iommu. Please review and consider them for > linux-next. > > Best regards, > Baolu > > Lu Baolu (5): > iommu/vt-d: Don't return error when device gets right domain > iommu/vt-d: Set domain type for a private domain > iommu/vt-d: Don't enable iommu's which have been ignored > iommu/vt-d: Fix suspicious RCU usage in probe_acpi_namespace_devices() > iommu/vt-d: Consolidate domain_init() to avoid duplication > > Sai Praneeth Prakhya (1): > iommu/vt-d: Cleanup after delegating DMA domain to generic iommu > > drivers/iommu/intel-iommu.c | 210 +--- > 1 file changed, 53 insertions(+), 157 deletions(-) > BTW, the linux-next commit "iommu/vt-d: Expose ISA direct mapping region via iommu_get_resv_regions" [1] also introduced a memory leak below, as it forgets to ask intel_iommu_put_resv_regions() to call kfree() when CONFIG_INTEL_IOMMU_FLOPPY_WA=y. [1] https://lore.kernel.org/patchwork/patch/1078963/ unreferenced object 0x88912ef789c8 (size 64): comm "swapper/0", pid 1, jiffies 4294946232 (age 5399.530s) hex dump (first 32 bytes): 48 83 f7 2e 91 88 ff ff 30 fa e3 00 82 88 ff ff H...0... 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00 00 backtrace: [] kmem_cache_alloc_trace+0x266/0x380 [ ] iommu_alloc_resv_region+0x40/0xb0 [ ] intel_iommu_get_resv_regions+0x25e/0x2d0 [<21fbc6c3>] iommu_group_create_direct_mappings+0x159/0x3d0 [<22259268>] iommu_group_add_device+0x17b/0x4f0 [<28b91093>] iommu_group_get_for_dev+0x153/0x460 [<577c33b4>] intel_iommu_add_device+0xc4/0x210 [<587b7492>] iommu_probe_device+0x63/0x80 [<4aa997d1>] add_iommu_group+0xe/0x20 [ ] bus_for_each_dev+0xf0/0x150 [ ] bus_set_iommu+0xc6/0x100 [ ] intel_iommu_init+0x682/0xb0a [<226f7444>] pci_iommu_init+0x26/0x62 [<2d8694f5>] do_one_initcall+0xe5/0x3ea [<4bc60101>] kernel_init_freeable+0x5ad/0x640 [<91b0bad6>] kernel_init+0x11/0x138
Re: "iommu/vt-d: Delegate DMA domain to generic iommu" series breaks megaraid_sas
> On Jun 10, 2019, at 9:41 PM, Lu Baolu wrote: > > Ah, good catch! > > The device failed to be attached by a DMA domain. Can you please try the > attached fix patch? It works fine. > > [ 101.885468] pci :06:00.0: DMAR: Device is ineligible for IOMMU > domain attach due to platform RMRR requirement. Contact your platform > vendor. > [ 101.900801] pci :06:00.0: Failed to add to iommu group 23: -1 > > Best regards, > Baolu > > On 6/10/19 10:54 PM, Qian Cai wrote: >> On Mon, 2019-06-10 at 09:44 -0400, Qian Cai wrote: >>> On Sun, 2019-06-09 at 10:43 +0800, Lu Baolu wrote: >>>> Hi Qian, >>>> >>>> I just posted some fix patches. I cc'ed them in your email inbox as >>>> well. Can you please check whether they happen to fix your issue? >>>> If not, do you mind posting more debug messages? >>> >>> Unfortunately, it does not work. Here is the dmesg. >>> >>> https://raw.githubusercontent.com/cailca/tmp/master/dmesg?token=AMC35QKPIZBYUM >>> FUQKLW4ZC47ZPIK >> This one should be good to view. >> https://cailca.github.io/files/dmesg.txt > <0001-iommu-vt-d-Allow-DMA-domain-attaching-to-rmrr-locked.patch>
Re: "iommu/vt-d: Delegate DMA domain to generic iommu" series breaks megaraid_sas
On Mon, 2019-06-10 at 09:44 -0400, Qian Cai wrote: > On Sun, 2019-06-09 at 10:43 +0800, Lu Baolu wrote: > > Hi Qian, > > > > I just posted some fix patches. I cc'ed them in your email inbox as > > well. Can you please check whether they happen to fix your issue? > > If not, do you mind posting more debug messages? > > Unfortunately, it does not work. Here is the dmesg. > > https://raw.githubusercontent.com/cailca/tmp/master/dmesg?token=AMC35QKPIZBYUM > FUQKLW4ZC47ZPIK This one should be good to view. https://cailca.github.io/files/dmesg.txt
Re: "iommu/vt-d: Delegate DMA domain to generic iommu" series breaks megaraid_sas
On Sun, 2019-06-09 at 10:43 +0800, Lu Baolu wrote: > Hi Qian, > > I just posted some fix patches. I cc'ed them in your email inbox as > well. Can you please check whether they happen to fix your issue? > If not, do you mind posting more debug messages? Unfortunately, it does not work. Here is the dmesg. https://raw.githubusercontent.com/cailca/tmp/master/dmesg?token=AMC35QKPIZBYUMFU QKLW4ZC47ZPIK ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
"iommu/vt-d: Delegate DMA domain to generic iommu" series breaks megaraid_sas
The linux-next series "iommu/vt-d: Delegate DMA domain to generic iommu" [1] leaves a system with the rootfs on a megaraid_sas card unable to boot. Reverting the whole series on top of linux-next (next-20190607) fixes the issue. The information regarding this storage card is:

[ 116.466810][ T324] megaraid_sas :06:00.0: FW provided supportMaxExtLDs: 0 max_lds: 32
[ 116.476052][ T324] megaraid_sas :06:00.0: controller type : iMR(0MB)
[ 116.483646][ T324] megaraid_sas :06:00.0: Online Controller Reset(OCR) : Enabled
[ 116.492403][ T324] megaraid_sas :06:00.0: Secure JBOD support : Yes
[ 116.499887][ T324] megaraid_sas :06:00.0: NVMe passthru support : No
[ 116.507480][ T324] megaraid_sas :06:00.0: FW provided
[ 116.612523][ T324] megaraid_sas :06:00.0: NVME page size: (0)
[ 116.629991][ T324] megaraid_sas :06:00.0: INIT adapter done
[ 116.714789][ T324] megaraid_sas :06:00.0: pci id: (0x1000)/(0x0017)/(0x1d49)/(0x0500)
[ 116.724228][ T324] megaraid_sas :06:00.0: unevenspan support: no
[ 116.731518][ T324] megaraid_sas :06:00.0: firmware crash dump : no
[ 116.738981][ T324] megaraid_sas :06:00.0: jbod sync map : yes
[ 116.787433][ T324] scsi host0: Avago SAS based MegaRAID driver
[ 117.081088][ T324] scsi 0:0:0:0: Direct-Access LENOVO ST900MM0168 L587 PQ: 0 ANSI: 6

[1] https://lore.kernel.org/patchwork/cover/1078960/
[PATCH -next] iommu/intel: silence a variable set but not used
The commit "iommu/vt-d: Probe DMA-capable ACPI name space devices" introduced a compilation warning, because the "iommu" variable in for_each_active_iommu() is assigned but each element, i.e., "drhd->iommu", is never used afterwards.

drivers/iommu/intel-iommu.c: In function 'probe_acpi_namespace_devices':
drivers/iommu/intel-iommu.c:4639:22: warning: variable 'iommu' set but not used [-Wunused-but-set-variable]
  struct intel_iommu *iommu;

Silence the warning the same way as in commit d3ed71e5cc50 ("drivers/iommu/intel-iommu.c: fix variable 'iommu' set but not used").

Signed-off-by: Qian Cai
---
 drivers/iommu/intel-iommu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index b431cc6f6ba4..2897354a341a 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -4636,7 +4636,8 @@ static int __init platform_optin_force_iommu(void)
 static int __init probe_acpi_namespace_devices(void)
 {
 	struct dmar_drhd_unit *drhd;
-	struct intel_iommu *iommu;
+	/* To avoid a -Wunused-but-set-variable warning. */
+	struct intel_iommu *iommu __maybe_unused;
 	struct device *dev;
 	int i, ret = 0;
-- 
1.8.3.1
Re: [PATCH] iommu: replace single-char identifiers in macros
On Mon, 2019-06-03 at 14:07 +0100, Robin Murphy wrote: > On 03/06/2019 13:59, Qian Cai wrote: > > There are a few macros in IOMMU have single-char identifiers make the > > code hard to read and debug. Replace them with meaningful names. > > > > Suggested-by: Andrew Morton > > Signed-off-by: Qian Cai > > --- > > include/linux/dmar.h | 14 -- > > 1 file changed, 8 insertions(+), 6 deletions(-) > > > > diff --git a/include/linux/dmar.h b/include/linux/dmar.h > > index f8af1d770520..eb634912f475 100644 > > --- a/include/linux/dmar.h > > +++ b/include/linux/dmar.h > > @@ -104,12 +104,14 @@ static inline bool dmar_rcu_check(void) > > > > #define dmar_rcu_dereference(p) rcu_dereference_check((p), > > dmar_rcu_check()) > > > > -#definefor_each_dev_scope(a, c, p, d) \ > > - for ((p) = 0; ((d) = (p) < (c) ? dmar_rcu_dereference((a)[(p)].dev) > > : \ > > - NULL, (p) < (c)); (p)++) > > - > > -#definefor_each_active_dev_scope(a, c, p, d) \ > > - for_each_dev_scope((a), (c), (p), (d)) if (!(d)) { continue; > > } else > > +#define for_each_dev_scope(devs, cnt, i, tmp) > > \ > > + for ((i) = 0; ((tmp) = (i) < (cnt) ? > > \ > > Given that "tmp" actually appears to be some sort of device cursor, I'm > not sure that that naming really achieves the stated goal of clarity :/ "tmp" is used in the callers everywhere though, although I suppose something like "tmp_dev" can be used if you prefer. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH] iommu: replace single-char identifiers in macros
There are a few macros in the IOMMU code with single-char identifiers that make the code hard to read and debug. Replace them with meaningful names.

Suggested-by: Andrew Morton
Signed-off-by: Qian Cai
---
 include/linux/dmar.h | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/include/linux/dmar.h b/include/linux/dmar.h
index f8af1d770520..eb634912f475 100644
--- a/include/linux/dmar.h
+++ b/include/linux/dmar.h
@@ -104,12 +104,14 @@ static inline bool dmar_rcu_check(void)
 
 #define	dmar_rcu_dereference(p)	rcu_dereference_check((p), dmar_rcu_check())
 
-#define	for_each_dev_scope(a, c, p, d)	\
-	for ((p) = 0; ((d) = (p) < (c) ? dmar_rcu_dereference((a)[(p)].dev) : \
-			NULL, (p) < (c)); (p)++)
-
-#define	for_each_active_dev_scope(a, c, p, d)	\
-	for_each_dev_scope((a), (c), (p), (d))	if (!(d)) { continue; } else
+#define for_each_dev_scope(devs, cnt, i, tmp)				\
+	for ((i) = 0; ((tmp) = (i) < (cnt) ?				\
+	    dmar_rcu_dereference((devs)[(i)].dev) : NULL, (i) < (cnt)); \
+	    (i)++)
+
+#define for_each_active_dev_scope(devs, cnt, i, tmp)			\
+	for_each_dev_scope((devs), (cnt), (i), (tmp))			\
+		if (!(tmp)) { continue; } else
 
 extern int dmar_table_init(void);
 extern int dmar_dev_scope_init(void);
-- 
1.8.3.1
[PATCH -next] intel-iommu: fix a variable set but not used
The commit "iommu/vt-d: Delegate the dma domain to upper layer" left an unused variable,

drivers/iommu/intel-iommu.c: In function 'disable_dmar_iommu':
drivers/iommu/intel-iommu.c:1652:23: warning: variable 'domain' set but not used [-Wunused-but-set-variable]

Signed-off-by: Qian Cai
---
 drivers/iommu/intel-iommu.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index b431cc6f6ba4..073c547f247a 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -1649,16 +1649,12 @@ static void disable_dmar_iommu(struct intel_iommu *iommu)
 	spin_lock_irqsave(&device_domain_lock, flags);
 	list_for_each_entry_safe(info, tmp, &device_domain_list, global) {
-		struct dmar_domain *domain;
-
 		if (info->iommu != iommu)
 			continue;
 
 		if (!info->dev || !info->domain)
 			continue;
 
-		domain = info->domain;
-
 		__dmar_remove_one_dev_info(info);
 	}
 	spin_unlock_irqrestore(&device_domain_lock, flags);
-- 
1.8.3.1
[PATCH] dma: replace single-char identifiers in macros
There are a few macros in the DMA code with single-char identifiers that make the code hard to read. Replace them with meaningful names.

Signed-off-by: Qian Cai
---
 include/linux/dma-mapping.h | 24 ++++++++++++++++--------
 include/linux/dmar.h        | 14 ++++++++------
 2 files changed, 24 insertions(+), 14 deletions(-)

diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index 6309a721394b..2f0151dcfa4e 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -602,14 +602,22 @@ static inline void dma_sync_single_range_for_device(struct device *dev,
 	return dma_sync_single_for_device(dev, addr + offset, size, dir);
 }
 
-#define dma_map_single(d, a, s, r) dma_map_single_attrs(d, a, s, r, 0)
-#define dma_unmap_single(d, a, s, r) dma_unmap_single_attrs(d, a, s, r, 0)
-#define dma_map_sg(d, s, n, r) dma_map_sg_attrs(d, s, n, r, 0)
-#define dma_unmap_sg(d, s, n, r) dma_unmap_sg_attrs(d, s, n, r, 0)
-#define dma_map_page(d, p, o, s, r) dma_map_page_attrs(d, p, o, s, r, 0)
-#define dma_unmap_page(d, a, s, r) dma_unmap_page_attrs(d, a, s, r, 0)
-#define dma_get_sgtable(d, t, v, h, s) dma_get_sgtable_attrs(d, t, v, h, s, 0)
-#define dma_mmap_coherent(d, v, c, h, s) dma_mmap_attrs(d, v, c, h, s, 0)
+#define dma_map_single(dev, ptr, size, direction) \
+	dma_map_single_attrs(dev, ptr, size, direction, 0)
+#define dma_unmap_single(dev, addr, size, direction) \
+	dma_unmap_single_attrs(dev, addr, size, direction, 0)
+#define dma_map_sg(dev, sg, mapped_ents, direction) \
+	dma_map_sg_attrs(dev, sg, mapped_ents, direction, 0)
+#define dma_unmap_sg(dev, sg, mapped_ents, direction) \
+	dma_unmap_sg_attrs(dev, sg, mapped_ents, direction, 0)
+#define dma_map_page(dev, page, offset, size, direction) \
+	dma_map_page_attrs(dev, page, offset, size, direction, 0)
+#define dma_unmap_page(dev, addr, size, direction) \
+	dma_unmap_page_attrs(dev, addr, size, direction, 0)
+#define dma_get_sgtable(dev, sgt, cpu_addr, dma_addr, size) \
+	dma_get_sgtable_attrs(dev, sgt, cpu_addr, dma_addr, size, 0)
+#define dma_mmap_coherent(dev, vma, cpu_addr, dma_addr, size) \
+	dma_mmap_attrs(dev, vma, cpu_addr, dma_addr, size, 0)
 
 extern int dma_common_mmap(struct device *dev, struct vm_area_struct *vma,
 		void *cpu_addr, dma_addr_t dma_addr, size_t size,
diff --git a/include/linux/dmar.h b/include/linux/dmar.h
index f8af1d770520..eb634912f475 100644
--- a/include/linux/dmar.h
+++ b/include/linux/dmar.h
@@ -104,12 +104,14 @@ static inline bool dmar_rcu_check(void)
 
 #define	dmar_rcu_dereference(p)	rcu_dereference_check((p), dmar_rcu_check())
 
-#define	for_each_dev_scope(a, c, p, d)	\
-	for ((p) = 0; ((d) = (p) < (c) ? dmar_rcu_dereference((a)[(p)].dev) : \
-			NULL, (p) < (c)); (p)++)
-
-#define	for_each_active_dev_scope(a, c, p, d)	\
-	for_each_dev_scope((a), (c), (p), (d))	if (!(d)) { continue; } else
+#define for_each_dev_scope(devs, cnt, i, tmp)				\
+	for ((i) = 0; ((tmp) = (i) < (cnt) ?				\
+	    dmar_rcu_dereference((devs)[(i)].dev) : NULL, (i) < (cnt)); \
+	    (i)++)
+
+#define for_each_active_dev_scope(devs, cnt, i, tmp)			\
+	for_each_dev_scope((devs), (cnt), (i), (tmp))			\
+		if (!(tmp)) { continue; } else
 
 extern int dmar_table_init(void);
 extern int dmar_dev_scope_init(void);
-- 
1.8.3.1
[PATCH v2] iommu/intel: fix variable 'iommu' set but not used
The commit cf04eee8bf0e ("iommu/vt-d: Include ACPI devices in iommu=pt") added for_each_active_iommu() in iommu_prepare_static_identity_mapping() but never used each element, i.e., "drhd->iommu".

drivers/iommu/intel-iommu.c: In function 'iommu_prepare_static_identity_mapping':
drivers/iommu/intel-iommu.c:3037:22: warning: variable 'iommu' set but not used [-Wunused-but-set-variable]
  struct intel_iommu *iommu;

Fix the warning by appending the compiler attribute __maybe_unused to the variable.

Suggested-by: Andrew Morton
Signed-off-by: Qian Cai
---
 drivers/iommu/intel-iommu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index a209199f3af6..09b8ff0d856a 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -3034,7 +3034,8 @@ static int __init iommu_prepare_static_identity_mapping(int hw)
 {
 	struct pci_dev *pdev = NULL;
 	struct dmar_drhd_unit *drhd;
-	struct intel_iommu *iommu;
+	/* To avoid a -Wunused-but-set-variable warning. */
+	struct intel_iommu *iommu __maybe_unused;
 	struct device *dev;
 	int i;
 	int ret = 0;
-- 
2.20.1 (Apple Git-117)
[RESEND PATCH] iommu/intel: fix variable 'iommu' set but not used
The commit cf04eee8bf0e ("iommu/vt-d: Include ACPI devices in iommu=pt") added for_each_active_iommu() in iommu_prepare_static_identity_mapping() but never used each element, i.e., "drhd->iommu".

drivers/iommu/intel-iommu.c: In function 'iommu_prepare_static_identity_mapping':
drivers/iommu/intel-iommu.c:3037:22: warning: variable 'iommu' set but not used [-Wunused-but-set-variable]
  struct intel_iommu *iommu;

Fix the warning by passing "drhd->iommu" directly to for_each_active_iommu(); the resulting self-assignments should be discarded by the compiler anyway.

Signed-off-by: Qian Cai
---
 drivers/iommu/intel-iommu.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index a209199f3af6..86e1ddcb4a8e 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -3034,7 +3034,6 @@ static int __init iommu_prepare_static_identity_mapping(int hw)
 {
 	struct pci_dev *pdev = NULL;
 	struct dmar_drhd_unit *drhd;
-	struct intel_iommu *iommu;
 	struct device *dev;
 	int i;
 	int ret = 0;
@@ -3045,7 +3044,7 @@ static int __init iommu_prepare_static_identity_mapping(int hw)
 		return ret;
 	}
 
-	for_each_active_iommu(iommu, drhd)
+	for_each_active_iommu(drhd->iommu, drhd)
 		for_each_active_dev_scope(drhd->devices, drhd->devices_cnt,
 					  i, dev) {
 			struct acpi_device_physical_node *pn;
 			struct acpi_device *adev;
-- 
1.8.3.1
Re: [PATCH] iommu/amd: print out "tag" in INVALID_PPR_REQUEST
On Tue, 2019-05-07 at 13:47 +, Gary R Hook wrote: > On 5/5/19 11:11 PM, Qian Cai wrote: > > [CAUTION: External Email] > > > > The commit e7f63ffc1bf1 ("iommu/amd: Update logging information for new > > event type") introduced a variable "tag" but had never used it which > > generates a warning below, > > > > drivers/iommu/amd_iommu.c: In function 'iommu_print_event': > > drivers/iommu/amd_iommu.c:567:33: warning: variable 'tag' set but not > > used [-Wunused-but-set-variable] > > int type, devid, pasid, flags, tag; > > ^~~ > > so just use it during the logging. > > > > Signed-off-by: Qian Cai > > --- > > drivers/iommu/amd_iommu.c | 4 ++-- > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c > > index f7cdd2ab7f11..52f41369c5b3 100644 > > --- a/drivers/iommu/amd_iommu.c > > +++ b/drivers/iommu/amd_iommu.c > > @@ -631,9 +631,9 @@ static void iommu_print_event(struct amd_iommu *iommu, > > void *__evt) > > pasid = ((event[0] >> 16) & 0x) > > | ((event[1] << 6) & 0xF); > > tag = event[1] & 0x03FF; > > - dev_err(dev, "Event logged [INVALID_PPR_REQUEST > > device=%02x:%02x.%x pasid=0x%05x address=0x%llx flags=0x%04x]\n", > > + dev_err(dev, "Event logged [INVALID_PPR_REQUEST > > device=%02x:%02x.%x pasid=0x%05x tag=0x%04x address=0x%llx flags=0x%04x]\n", > > PCI_BUS_NUM(devid), PCI_SLOT(devid), > > PCI_FUNC(devid), > > - pasid, address, flags); > > + pasid, tag, address, flags); > > break; > > default: > > dev_err(dev, "Event logged [UNKNOWN event[0]=0x%08x > > event[1]=0x%08x event[2]=0x%08x event[3]=0x%08x\n", > > I did manage to overlook that variable when I posted the original patch. > But it looks to me like 41e59a41fc5d1 (iommu tree) already fixed this... > I'm not sure why it never got pushed to the main tree. Jroedel, I am wondering what the plan for 41e59a41fc5d1 (iommu tree) or this patch to be pushed to the linux-next or mainline... 
[PATCH] iommu/intel: fix variable 'iommu' set but not used
The commit cf04eee8bf0e ("iommu/vt-d: Include ACPI devices in iommu=pt") added for_each_active_iommu() in iommu_prepare_static_identity_mapping() but never used each element, i.e., "drhd->iommu".

drivers/iommu/intel-iommu.c: In function 'iommu_prepare_static_identity_mapping':
drivers/iommu/intel-iommu.c:3037:22: warning: variable 'iommu' set but not used [-Wunused-but-set-variable]
  struct intel_iommu *iommu;

Fix the warning by passing "drhd->iommu" directly to for_each_active_iommu(); the resulting self-assignments should be discarded by the compiler anyway.

Signed-off-by: Qian Cai
---
 drivers/iommu/intel-iommu.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index a209199f3af6..86e1ddcb4a8e 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -3034,7 +3034,6 @@ static int __init iommu_prepare_static_identity_mapping(int hw)
 {
 	struct pci_dev *pdev = NULL;
 	struct dmar_drhd_unit *drhd;
-	struct intel_iommu *iommu;
 	struct device *dev;
 	int i;
 	int ret = 0;
@@ -3045,7 +3044,7 @@ static int __init iommu_prepare_static_identity_mapping(int hw)
 		return ret;
 	}
 
-	for_each_active_iommu(iommu, drhd)
+	for_each_active_iommu(drhd->iommu, drhd)
 		for_each_active_dev_scope(drhd->devices, drhd->devices_cnt,
 					  i, dev) {
 			struct acpi_device_physical_node *pn;
 			struct acpi_device *adev;
-- 
1.8.3.1
[PATCH -next v2] iommu/amd: fix a null-ptr-deref in map_sg()
The commit 1a1079011da3 ("iommu/amd: Flush not present cache in iommu_map_page") added domain_flush_np_cache() to map_sg(), which triggered the crash below during boot. sg_next() can return NULL if sg_is_last() is true, so after for_each_sg(sglist, s, nelems, i), "s" can be NULL, which ends up dereferencing a NULL pointer later here,

domain_flush_np_cache(domain, s->dma_address, s->dma_length);

so move the domain_flush_np_cache() call inside for_each_sg() to loop over each sg element.

BUG: kernel NULL pointer dereference, address: 0018
PGD 0 P4D 0
Oops: [#1] SMP DEBUG_PAGEALLOC KASAN NOPTI
CPU: 8 PID: 659 Comm: kworker/8:1 Tainted: GB 5.1.0-rc7-next-20190506+ #20
Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 01/25/2019
Workqueue: events work_for_cpu_fn
RIP: 0010:map_sg+0x297/0x2e0
Call Trace:
 scsi_dma_map+0xc6/0x160
 pqi_raid_submit_scsi_cmd_with_io_request+0x3b4/0x470 [smartpqi]
 pqi_scsi_queue_command+0x791/0xdd0 [smartpqi]
 scsi_queue_rq+0x79c/0x1200
 blk_mq_dispatch_rq_list+0x4dc/0xb70
 blk_mq_sched_dispatch_requests+0x2e1/0x310
 __blk_mq_run_hw_queue+0x128/0x200
 __blk_mq_delay_run_hw_queue+0x2b7/0x2d0
 blk_mq_run_hw_queue+0x127/0x1d0
 blk_mq_sched_insert_request+0x25c/0x320
 __scsi_scan_target+0x14d/0x790
 scsi_scan_target+0x115/0x120
 sas_rphy_add+0x1d1/0x280 [scsi_transport_sas]
 pqi_add_sas_device+0x187/0x1e0 [smartpqi]
 pqi_update_device_list+0x1227/0x1460 [smartpqi]
 pqi_update_scsi_devices+0x755/0x1980 [smartpqi]
 pqi_scan_scsi_devices+0x57/0xf0 [smartpqi]
 pqi_ctrl_init+0x149e/0x14df [smartpqi]
 pqi_pci_probe.cold.49+0x808/0x818 [smartpqi]
 local_pci_probe+0x7a/0xc0
 work_for_cpu_fn+0x2e/0x50
 process_one_work+0x522/0xa10
 worker_thread+0x363/0x5b0
 kthread+0x1d2/0x1f0
 ret_from_fork+0x22/0x40

Signed-off-by: Qian Cai
---
v2: Call domain_flush_np_cache() inside for_each_sg().

 drivers/iommu/amd_iommu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 867f8b155000..b7132812ce59 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -2576,9 +2576,9 @@ static int map_sg(struct device *dev, struct scatterlist *sglist,
 		 */
 		s->dma_address += address + (s->offset & ~PAGE_MASK);
 		s->dma_length = s->length;
-	}
 
-	domain_flush_np_cache(domain, s->dma_address, s->dma_length);
+		domain_flush_np_cache(domain, s->dma_address, s->dma_length);
+	}
 
 	return nelems;
-- 
2.20.1 (Apple Git-117)
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH -next] iommu/amd: fix a null-ptr-deref in map_sg()
The commit 1a1079011da3 ("iommu/amd: Flush not present cache in iommu_map_page") added domain_flush_np_cache() to map_sg(), which triggered the crash below during boot. sg_next() can return NULL if sg_is_last() is true, so after for_each_sg(sglist, s, nelems, i), "s" can be NULL, which ends up dereferencing a NULL pointer later here,

domain_flush_np_cache(domain, s->dma_address, s->dma_length);

BUG: kernel NULL pointer dereference, address: 0018
PGD 0 P4D 0
Oops: [#1] SMP DEBUG_PAGEALLOC KASAN NOPTI
CPU: 8 PID: 659 Comm: kworker/8:1 Tainted: GB 5.1.0-rc7-next-20190506+ #20
Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 01/25/2019
Workqueue: events work_for_cpu_fn
RIP: 0010:map_sg+0x297/0x2e0
Call Trace:
 scsi_dma_map+0xc6/0x160
 pqi_raid_submit_scsi_cmd_with_io_request+0x3b4/0x470 [smartpqi]
 pqi_scsi_queue_command+0x791/0xdd0 [smartpqi]
 scsi_queue_rq+0x79c/0x1200
 blk_mq_dispatch_rq_list+0x4dc/0xb70
 blk_mq_sched_dispatch_requests+0x2e1/0x310
 __blk_mq_run_hw_queue+0x128/0x200
 __blk_mq_delay_run_hw_queue+0x2b7/0x2d0
 blk_mq_run_hw_queue+0x127/0x1d0
 blk_mq_sched_insert_request+0x25c/0x320
 __scsi_scan_target+0x14d/0x790
 scsi_scan_target+0x115/0x120
 sas_rphy_add+0x1d1/0x280 [scsi_transport_sas]
 pqi_add_sas_device+0x187/0x1e0 [smartpqi]
 pqi_update_device_list+0x1227/0x1460 [smartpqi]
 pqi_update_scsi_devices+0x755/0x1980 [smartpqi]
 pqi_scan_scsi_devices+0x57/0xf0 [smartpqi]
 pqi_ctrl_init+0x149e/0x14df [smartpqi]
 pqi_pci_probe.cold.49+0x808/0x818 [smartpqi]
 local_pci_probe+0x7a/0xc0
 work_for_cpu_fn+0x2e/0x50
 process_one_work+0x522/0xa10
 worker_thread+0x363/0x5b0
 kthread+0x1d2/0x1f0
 ret_from_fork+0x22/0x40

Signed-off-by: Qian Cai
---
 drivers/iommu/amd_iommu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 867f8b155000..908f5618fb5c 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -2578,7 +2578,8 @@ static int map_sg(struct device *dev, struct scatterlist *sglist,
 		s->dma_length = s->length;
 	}
 
-	domain_flush_np_cache(domain, s->dma_address, s->dma_length);
+	if (s)
+		domain_flush_np_cache(domain, s->dma_address, s->dma_length);
 
 	return nelems;
-- 
2.20.1 (Apple Git-117)
[PATCH] iommu/amd: print out "tag" in INVALID_PPR_REQUEST
The commit e7f63ffc1bf1 ("iommu/amd: Update logging information for new event type") introduced a variable "tag" but never used it, which generates the warning below,

drivers/iommu/amd_iommu.c: In function 'iommu_print_event':
drivers/iommu/amd_iommu.c:567:33: warning: variable 'tag' set but not used [-Wunused-but-set-variable]
  int type, devid, pasid, flags, tag;
                                 ^~~

so just use it in the logging.

Signed-off-by: Qian Cai
---
 drivers/iommu/amd_iommu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index f7cdd2ab7f11..52f41369c5b3 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -631,9 +631,9 @@ static void iommu_print_event(struct amd_iommu *iommu, void *__evt)
 		pasid = ((event[0] >> 16) & 0xFFFF)
 			| ((event[1] << 6) & 0xF0000);
 		tag = event[1] & 0x03FF;
-		dev_err(dev, "Event logged [INVALID_PPR_REQUEST device=%02x:%02x.%x pasid=0x%05x address=0x%llx flags=0x%04x]\n",
+		dev_err(dev, "Event logged [INVALID_PPR_REQUEST device=%02x:%02x.%x pasid=0x%05x tag=0x%04x address=0x%llx flags=0x%04x]\n",
 			PCI_BUS_NUM(devid), PCI_SLOT(devid), PCI_FUNC(devid),
-			pasid, address, flags);
+			pasid, tag, address, flags);
 		break;
 	default:
 		dev_err(dev, "Event logged [UNKNOWN event[0]=0x%08x event[1]=0x%08x event[2]=0x%08x event[3]=0x%08x\n",
-- 
2.20.1 (Apple Git-117)
Re: "iommu/amd: Set exclusion range correctly" causes smartpqi offline
On 4/26/19 10:52 AM, Qian Cai wrote:
> Applying some memory pressure would cause smartpqi to go offline even in
> today's linux-next. This can always be reproduced by an LTP test case [1]
> or sometimes just by compiling kernels.
>
> Reverting the commit "iommu/amd: Set exclusion range correctly" fixed the
> issue.
>
> [ 213.437112] smartpqi :23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x address=0x1000 flags=0x]
> [ 213.447659] smartpqi :23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x address=0x1800 flags=0x]
> [ 233.362013] smartpqi :23:00.0: controller is offline: status code 0x14803
> [ 233.369359] smartpqi :23:00.0: controller offline
> [ 233.388915] print_req_error: I/O error, dev sdb, sector 3317352 flags 201
> [ 233.388921] sd 0:0:0:0: [sdb] tag#95 UNKNOWN(0x2003) Result: hostbyte=0x01 driverbyte=0x00
> [ 233.388931] sd 0:0:0:0: [sdb] tag#95 CDB: opcode=0x2a 2a 00 00 55 89 00 00 01 08 00
> [ 233.389003] Write-error on swap-device (254:1:4474640)
> [ 233.389015] Write-error on swap-device (254:1:2190776)
> [ 233.389023] Write-error on swap-device (254:1:8351936)
>
> [1] /opt/ltp/testcases/bin/mtest01 -p80 -w

It turned out another linux-next commit is needed to reproduce this, i.e., 7a5dbf3ab2f0 ("iommu/amd: Remove the leftover of bypass support"), specifically the chunks for map_sg() and unmap_sg(). This has been reproduced on 3 different HPE ProLiant DL385 Gen10 systems so far. Either reverting those chunks (map_sg() and unmap_sg()) on top of the latest linux-next fixes the issue, or applying them on top of mainline v5.1 reproduces it immediately.

A lot of the time it triggers this BUG_ON(!iova) in iova_magazine_free_pfns() instead of the smartpqi offline.

kernel BUG at drivers/iommu/iova.c:813!
Workqueue: kblockd blk_mq_run_work_fn
RIP: 0010:iova_magazine_free_pfns+0x7d/0xc0
Call Trace:
 free_cpu_cached_iovas+0xbd/0x150
 alloc_iova_fast+0x8c/0xba
 dma_ops_alloc_iova.isra.6+0x65/0xa0
 map_sg+0x8c/0x2a0
 scsi_dma_map+0xc6/0x160
 pqi_aio_submit_io+0x1f6/0x440 [smartpqi]
 pqi_scsi_queue_command+0x90c/0xdd0 [smartpqi]
 scsi_queue_rq+0x79c/0x1200
 blk_mq_dispatch_rq_list+0x4dc/0xb70
 blk_mq_sched_dispatch_requests+0x249/0x310
 __blk_mq_run_hw_queue+0x128/0x200
 blk_mq_run_work_fn+0x27/0x30
 process_one_work+0x522/0xa10
 worker_thread+0x63/0x5b0
 kthread+0x1d2/0x1f0
 ret_from_fork+0x22/0x40
Re: "iommu/amd: Set exclusion range correctly" causes smartpqi offline
On Mon, 2019-04-29 at 16:23 +0200, Joerg Roedel wrote: > On Fri, Apr 26, 2019 at 11:55:12AM -0400, Qian Cai wrote: > > https://git.sr.ht/~cai/linux-debug/blob/master/dmesg > > Thanks, I can't see any definitions for unity ranges or exclusion ranges > in the IVRS table dump, which makes it even more weird. > > Can you please send me the output of > > for f in `ls -1 /sys/kernel/iommu_groups/*/reserved_regions`; do echo "- > --$f"; cat $f;done > > to double-check? https://git.sr.ht/~cai/linux-debug/blob/master/iommu ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: "iommu/amd: Set exclusion range correctly" causes smartpqi offline
On 4/29/19 10:23 AM, Joerg Roedel wrote: > On Fri, Apr 26, 2019 at 11:55:12AM -0400, Qian Cai wrote: >> https://git.sr.ht/~cai/linux-debug/blob/master/dmesg > > Thanks, I can't see any definitions for unity ranges or exclusion ranges > in the IVRS table dump, which makes it even more weird. > > Can you please send me the output of > > for f in `ls -1 /sys/kernel/iommu_groups/*/reserved_regions`; do echo > "---$f"; cat $f;done > > to double-check? It is going to take a while to reserve that system again to gather the information. BTW, this is only reproducible on linux-next but not mainline. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: "iommu/amd: Set exclusion range correctly" causes smartpqi offline
On Fri, 2019-04-26 at 17:26 +0200, Joerg Roedel wrote:
> On Fri, Apr 26, 2019 at 10:52:28AM -0400, Qian Cai wrote:
> > Applying some memory pressure causes smartpqi to go offline even in
> > today's linux-next. This can always be reproduced by an LTP test case [1]
> > or sometimes just by compiling kernels.
> >
> > Reverting the commit "iommu/amd: Set exclusion range correctly" fixed
> > the issue.
> >
> > [  213.437112] smartpqi :23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x address=0x1000 flags=0x]
> > [  213.447659] smartpqi :23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x address=0x1800 flags=0x]
> > [  233.362013] smartpqi :23:00.0: controller is offline: status code 0x14803
> > [  233.369359] smartpqi :23:00.0: controller offline
> > [  233.388915] print_req_error: I/O error, dev sdb, sector 3317352 flags 201
> > [  233.388921] sd 0:0:0:0: [sdb] tag#95 UNKNOWN(0x2003) Result: hostbyte=0x01 driverbyte=0x00
> > [  233.388931] sd 0:0:0:0: [sdb] tag#95 CDB: opcode=0x2a 2a 00 00 55 89 00 00 01 08 00
> > [  233.389003] Write-error on swap-device (254:1:4474640)
> > [  233.389015] Write-error on swap-device (254:1:2190776)
> > [  233.389023] Write-error on swap-device (254:1:8351936)
> >
> > [1] /opt/ltp/testcases/bin/mtest01 -p80 -w
>
> I can't explain that, can you please boot with 'amd_iommu_dump' on the
> kernel command line and send me dmesg after boot?

https://git.sr.ht/~cai/linux-debug/blob/master/dmesg
"iommu/amd: Set exclusion range correctly" causes smartpqi offline
Applying some memory pressure causes smartpqi to go offline even in today's
linux-next. This can always be reproduced by an LTP test case [1] or sometimes
just by compiling kernels.

Reverting the commit "iommu/amd: Set exclusion range correctly" fixed the
issue.

[  213.437112] smartpqi :23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x address=0x1000 flags=0x]
[  213.447659] smartpqi :23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x address=0x1800 flags=0x]
[  233.362013] smartpqi :23:00.0: controller is offline: status code 0x14803
[  233.369359] smartpqi :23:00.0: controller offline
[  233.388915] print_req_error: I/O error, dev sdb, sector 3317352 flags 201
[  233.388921] sd 0:0:0:0: [sdb] tag#95 UNKNOWN(0x2003) Result: hostbyte=0x01 driverbyte=0x00
[  233.388931] sd 0:0:0:0: [sdb] tag#95 CDB: opcode=0x2a 2a 00 00 55 89 00 00 01 08 00
[  233.389003] Write-error on swap-device (254:1:4474640)
[  233.389015] Write-error on swap-device (254:1:2190776)
[  233.389023] Write-error on swap-device (254:1:8351936)

[1] /opt/ltp/testcases/bin/mtest01 -p80 -w
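[Editorial note: the symptom is consistent with a mis-programmed exclusion range, i.e. device DMA addresses landing on the wrong side of the excluded interval. As a toy model only (not the AMD IOMMU driver code; the numbers below merely echo the fault log), an exclusion range is an interval check on the DMA address:]

```python
def in_exclusion_range(addr, base, length):
    """Toy check: is a DMA address inside the exclusion range [base, base+length)?
    Addresses inside the range bypass IOMMU translation."""
    return base <= addr < base + length

# If the base is computed wrongly (e.g. derived from the limit instead of
# the real base), the interval shifts, and addresses such as 0x1000/0x1800
# from the fault log end up outside the intended range and fault.
correct = in_exclusion_range(0x1000, base=0x0, length=0x2000)     # True
shifted = in_exclusion_range(0x1000, base=0x2000, length=0x2000)  # False
```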
Re: [PATCH v3 0/7] dma-debug cleanup and dynamic allocation
On 12/10/18 9:00 AM, Robin Murphy wrote:
> Hi all,
>
> Here's some assorted cleanup and improvements to dma-debug which grew
> out of the problem that certain drivers use very large numbers of DMA
> mappings, and knowing when to override "dma_debug_entries=..." and what
> value to override it with can be a less-than-obvious task for users.
>
> The main part is patch #3, wherein we make dma-debug clever enough
> to allocate more entries dynamically if needed, such that the
> preallocation value becomes more of a quality-of-life option than a
> necessity. Patches #5 and #6 do some cruft-removal to allow patch #7
> to make the allocation behaviour more efficient in general.
>
> Patches #1, #2 and #4 are some other cleanup and handy features which
> fell out of the discussion/development.
>
> Robin.
>
>
> Robin Murphy (7):
>   dma-debug: Use pr_fmt()
>   dma-debug: Expose nr_total_entries in debugfs
>   dma-debug: Dynamically expand the dma_debug_entry pool
>   dma-debug: Make leak-like behaviour apparent
>   x86/dma/amd-gart: Stop resizing dma_debug_entry pool
>   dma/debug: Remove dma_debug_resize_entries()
>   dma-debug: Batch dma_debug_entry allocation
>
>  Documentation/DMA-API.txt                 |  20 +-
>  Documentation/x86/x86_64/boot-options.txt |   5 +-
>  arch/x86/kernel/amd_gart_64.c             |  23 ---
>  include/linux/dma-debug.h                 |   7 -
>  kernel/dma/debug.c                        | 217 ++
>  5 files changed, 109 insertions(+), 163 deletions(-)

Tested-by: Qian Cai
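[Editorial note: a simplified model of the idea behind the dynamic-expansion patch, not the kernel implementation; the class name and batch size are made up. The point is that exhausting the preallocated pool grows it on demand instead of disabling DMA-API debugging.]

```python
class DebugEntryPool:
    """Toy model: preallocated pool of debug entries that expands on demand."""

    def __init__(self, prealloc, batch=256):
        self.free = prealloc    # entries currently available
        self.total = prealloc   # entries ever allocated
        self.batch = batch      # grow in batches, echoing the batching patch

    def alloc(self):
        if self.free == 0:
            # dynamic expansion instead of "out of memory - disabling"
            self.free += self.batch
            self.total += self.batch
        self.free -= 1

    def release(self):
        self.free += 1

pool = DebugEntryPool(prealloc=4)
for _ in range(10):      # deliberately exceed the preallocation
    pool.alloc()
print(pool.total)        # → 260: the pool grew past the initial 4
```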
[PATCH v2] dma-debug: Kconfig for PREALLOC_DMA_DEBUG_ENTRIES
The amount of DMA mappings from Hisilicon HNS ethernet devices is huge, so
it could trigger "DMA-API: debugging out of memory - disabling".

hnae_get_handle [1]
  hnae_init_queue
    hnae_init_ring
      hnae_alloc_buffers [2]
        debug_dma_map_page
          dma_entry_alloc

[1] for (i = 0; i < handle->q_num; i++)
[2] for (i = 0; i < ring->desc_num; i++)

Also, "#define HNS_DSAF_MAX_DESC_CNT 1024"

On this Huawei TaiShan 2280 aarch64 server, the limit has already been
reached:

4 (NICs) x 16 (queues) x 1024 (descriptors per ring) = 65536

Add a Kconfig entry for PREALLOC_DMA_DEBUG_ENTRIES to make it easier for
users to deal with special cases like this.

Signed-off-by: Qian Cai
---
Changes since v1:
* Increased the default value if HNS_ENET is enabled, as suggested by Robin.

 kernel/dma/debug.c |  9 ++---
 lib/Kconfig.debug  | 10 ++
 2 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/kernel/dma/debug.c b/kernel/dma/debug.c
index 231ca4628062..3752fb23f72f 100644
--- a/kernel/dma/debug.c
+++ b/kernel/dma/debug.c
@@ -41,11 +41,6 @@
 #define HASH_FN_SHIFT   13
 #define HASH_FN_MASK    (HASH_SIZE - 1)
 
-/* allow architectures to override this if absolutely required */
-#ifndef PREALLOC_DMA_DEBUG_ENTRIES
-#define PREALLOC_DMA_DEBUG_ENTRIES (1 << 16)
-#endif
-
 enum {
 	dma_debug_single,
 	dma_debug_page,
@@ -132,7 +127,7 @@ static u32 min_free_entries;
 static u32 nr_total_entries;
 
 /* number of preallocated entries requested by kernel cmdline */
-static u32 nr_prealloc_entries = PREALLOC_DMA_DEBUG_ENTRIES;
+static u32 nr_prealloc_entries = CONFIG_PREALLOC_DMA_DEBUG_ENTRIES;
 
 /* debugfs dentry's for the stuff above */
 static struct dentry *dma_debug_dent __read_mostly;
@@ -1063,7 +1058,7 @@ static __init int dma_debug_entries_cmdline(char *str)
 	if (!str)
 		return -EINVAL;
 	if (!get_option(&str, &nr_prealloc_entries))
-		nr_prealloc_entries = PREALLOC_DMA_DEBUG_ENTRIES;
+		nr_prealloc_entries = CONFIG_PREALLOC_DMA_DEBUG_ENTRIES;
 	return 0;
 }
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 1af29b8224fd..9f85a7a13647 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1659,6 +1659,16 @@ config DMA_API_DEBUG
 	  If unsure, say N.
 
+config PREALLOC_DMA_DEBUG_ENTRIES
+	int "Preallocated DMA-API debugging entries"
+	depends on DMA_API_DEBUG
+	default 131072 if HNS_ENET
+	default 65536
+	help
+	  The number of preallocated entries for DMA-API debugging code. One
+	  entry is required per DMA-API allocation. Increase this if the DMA-API
+	  debugging code disables itself because the default is too low.
+
 config DMA_API_DEBUG_SG
 	bool "Debug DMA scatter-gather usage"
 	default y
-- 
2.17.2 (Apple Git-113)
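[Editorial note: what the `dma_debug_entries=` handling in the patch boils down to, sketched in Python for illustration. This simplifies the kernel's `get_option()` (which parses a leading integer) down to `int()`; the function name is made up, and 65536 stands in for `CONFIG_PREALLOC_DMA_DEBUG_ENTRIES`.]

```python
KCONFIG_DEFAULT = 65536  # stand-in for CONFIG_PREALLOC_DMA_DEBUG_ENTRIES

def parse_dma_debug_entries(arg):
    """Parse the dma_debug_entries= value, falling back to the Kconfig
    default when the value is missing or unparsable."""
    try:
        return int(arg)
    except (TypeError, ValueError):
        return KCONFIG_DEFAULT

print(parse_dma_debug_entries("131072"))  # → 131072
print(parse_dma_debug_entries(""))        # → 65536 (fallback)
```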
[PATCH] dma-debug: Kconfig for PREALLOC_DMA_DEBUG_ENTRIES
The amount of DMA mappings from Hisilicon HNS ethernet devices is huge, so
it could trigger "DMA-API: debugging out of memory - disabling".

hnae_get_handle [1]
  hnae_init_queue
    hnae_init_ring
      hnae_alloc_buffers [2]
        debug_dma_map_page
          dma_entry_alloc

[1] for (i = 0; i < handle->q_num; i++)
[2] for (i = 0; i < ring->desc_num; i++)

On this Huawei TaiShan 2280 aarch64 server, the limit has already been
reached:

4 (ports) x 16 (handles) x 1024 (rings) = 65536

Add a Kconfig entry for PREALLOC_DMA_DEBUG_ENTRIES to make it easier for
users to deal with special cases like this.

Signed-off-by: Qian Cai
---
 kernel/dma/debug.c | 9 ++---
 lib/Kconfig.debug  | 9 +
 2 files changed, 11 insertions(+), 7 deletions(-)

diff --git a/kernel/dma/debug.c b/kernel/dma/debug.c
index 231ca4628062..3752fb23f72f 100644
--- a/kernel/dma/debug.c
+++ b/kernel/dma/debug.c
@@ -41,11 +41,6 @@
 #define HASH_FN_SHIFT   13
 #define HASH_FN_MASK    (HASH_SIZE - 1)
 
-/* allow architectures to override this if absolutely required */
-#ifndef PREALLOC_DMA_DEBUG_ENTRIES
-#define PREALLOC_DMA_DEBUG_ENTRIES (1 << 16)
-#endif
-
 enum {
 	dma_debug_single,
 	dma_debug_page,
@@ -132,7 +127,7 @@ static u32 min_free_entries;
 static u32 nr_total_entries;
 
 /* number of preallocated entries requested by kernel cmdline */
-static u32 nr_prealloc_entries = PREALLOC_DMA_DEBUG_ENTRIES;
+static u32 nr_prealloc_entries = CONFIG_PREALLOC_DMA_DEBUG_ENTRIES;
 
 /* debugfs dentry's for the stuff above */
 static struct dentry *dma_debug_dent __read_mostly;
@@ -1063,7 +1058,7 @@ static __init int dma_debug_entries_cmdline(char *str)
 	if (!str)
 		return -EINVAL;
 	if (!get_option(&str, &nr_prealloc_entries))
-		nr_prealloc_entries = PREALLOC_DMA_DEBUG_ENTRIES;
+		nr_prealloc_entries = CONFIG_PREALLOC_DMA_DEBUG_ENTRIES;
 	return 0;
 }
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 1af29b8224fd..2c281edcb5ad 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1659,6 +1659,15 @@ config DMA_API_DEBUG
 	  If unsure, say N.
 
+config PREALLOC_DMA_DEBUG_ENTRIES
+	int "Preallocated DMA-API debugging entries"
+	depends on DMA_API_DEBUG
+	default 65536
+	help
+	  The number of preallocated entries for DMA-API debugging code. One
+	  entry is required per DMA-API allocation. Increase this if the DMA-API
+	  debugging code disables itself because the default is too low.
+
 config DMA_API_DEBUG_SG
 	bool "Debug DMA scatter-gather usage"
 	default y
-- 
2.17.2 (Apple Git-113)
[PATCH] dma-debug: hns_enet_drv could use more DMA entries
The amount of DMA mappings from Hisilicon HNS ethernet devices is huge, so
it could trigger "DMA-API: debugging out of memory - disabling".

hnae_get_handle [1]
  hnae_init_queue
    hnae_init_ring
      hnae_alloc_buffers [2]
        debug_dma_map_page
          dma_entry_alloc

[1] for (i = 0; i < handle->q_num; i++)
[2] for (i = 0; i < ring->desc_num; i++)

On this Huawei TaiShan 2280 aarch64 server, the limit has already been
reached:

4 (ports) x 16 (handles) x 1024 (rings) = 65536

Signed-off-by: Qian Cai
---
 kernel/dma/debug.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/kernel/dma/debug.c b/kernel/dma/debug.c
index 231ca4628062..ae91689cc9ad 100644
--- a/kernel/dma/debug.c
+++ b/kernel/dma/debug.c
@@ -43,8 +43,13 @@
 
 /* allow architectures to override this if absolutely required */
 #ifndef PREALLOC_DMA_DEBUG_ENTRIES
+/* amount of DMA mappings on this driver is huge. */
+#ifdef HNS_ENET
+#define PREALLOC_DMA_DEBUG_ENTRIES (1 << 17)
+#else
 #define PREALLOC_DMA_DEBUG_ENTRIES (1 << 16)
 #endif
+#endif
 
 enum {
 	dma_debug_single,
-- 
2.17.2 (Apple Git-113)
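[Editorial note: the entry-count arithmetic from the commit message can be checked directly. A trivial sketch with a made-up function name; the figures (4 ports x 16 handles x 1024 ring descriptors, one debug entry per mapped buffer) come from the message above.]

```python
def prealloc_entries_needed(ports, handles_per_port, descs_per_ring):
    """One DMA mapping -- hence one dma-debug entry -- per ring descriptor."""
    return ports * handles_per_port * descs_per_ring

DEFAULT_PREALLOC = 1 << 16              # old PREALLOC_DMA_DEBUG_ENTRIES
needed = prealloc_entries_needed(4, 16, 1024)
print(needed)                           # → 65536: exactly at the old limit
print(needed >= DEFAULT_PREALLOC)       # → True: any extra mapping disables
                                        #   DMA-API debugging; hence 1 << 17
```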