[Kernel-packages] [Bug 1970453] Re: DMAR: ERROR: DMA PTE for vPFN 0x7bf32 already set
Not running any VMs, but I did have SR-IOV and VT-x enabled in the BIOS. The server has an HP B120i, which is a fakeraid device, and the drives are accessed as simple AHCI devices.

-- 
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1970453

Title:
  DMAR: ERROR: DMA PTE for vPFN 0x7bf32 already set

Status in linux package in Ubuntu:
  Confirmed

Bug description:
  I'm running Ubuntu 22.04 with kernel 5.15.0.27.30 on an HPE ProLiant DL20 Gen9 server. The server has an HPE Smart HBA H240 SATA controller.

  Since Ubuntu 22.04, the kernel runs into trouble after a few hours of uptime. The problem starts with a few instances of a message such as this:

  Apr 26 12:03:37 kernel: DMAR: ERROR: DMA PTE for vPFN 0x7bf32 already set (to 7bf32003 not 24c563801)
  Apr 26 12:03:37 kernel: [ cut here ]
  Apr 26 12:03:37 kernel: WARNING: CPU: 1 PID: 10171 at drivers/iommu/intel/iommu.c:2391 __domain_mapping.cold+0x94/0xcb
  Apr 26 12:03:37 kernel: Modules linked in: tls rpcsec_gss_krb5 binfmt_misc ip6t_REJECT nf_reject_ipv6 xt_hl ip6_tables ip6t_rt ipt_REJECT nf_reject_ipv4 xt_LOG nf_log_syslog nft_limit xt_limi>
  Apr 26 12:03:37 kernel: drm_kms_helper aesni_intel syscopyarea sysfillrect sysimgblt fb_sys_fops xhci_pci cec crypto_simd i2c_i801 rc_core cryptd drm xhci_pci_renesas ahci i2c_smbus tg3 hpsa>
  Apr 26 12:03:37 kernel: CPU: 1 PID: 10171 Comm: kworker/u4:0 Not tainted 5.15.0-27-generic #28-Ubuntu
  Apr 26 12:03:37 kernel: Hardware name: HP ProLiant DL20 Gen9/ProLiant DL20 Gen9, BIOS U22 04/01/2021
  Apr 26 12:03:37 kernel: Workqueue: writeback wb_workfn (flush-253:2)
  Apr 26 12:03:37 kernel: RIP: 0010:__domain_mapping.cold+0x94/0xcb
  Apr 26 12:03:37 kernel: Code: 27 9d 4c 89 4d b8 4c 89 45 c0 e8 03 c5 fa ff 8b 05 e7 e6 40 01 4c 8b 45 c0 4c 8b 4d b8 85 c0 74 09 83 e8 01 89 05 d2 e6 40 01 <0f> 0b e9 7e b2 b1 ff 89 ca 48 83 >
  Apr 26 12:03:37 kernel: RSP: 0018:c077826b2fa0 EFLAGS: 00010202
  Apr 26 12:03:37 kernel: RAX: 0004 RBX: 9f0042062990 RCX:
  Apr 26 12:03:37 kernel: RDX: RSI: 9f02b3d20980 RDI: 9f02b3d20980
  Apr 26 12:03:37 kernel: RBP: c077826b2ff0 R08: 00024c563801 R09: 0024c563
  Apr 26 12:03:37 kernel: R10: R11: c01550e0 R12: 000f
  Apr 26 12:03:37 kernel: R13: 0007bf32 R14: 9f00412f5800 R15: 9f0042062938
  Apr 26 12:03:37 kernel: FS: () GS:9f02b3d0() knlGS:
  Apr 26 12:03:37 kernel: CS: 0010 DS: ES: CR0: 80050033
  Apr 26 12:03:37 kernel: CR2: 1530f676a01c CR3: 00029c210001 CR4: 002706e0
  Apr 26 12:03:37 kernel: Call Trace:
  Apr 26 12:03:37 kernel:
  Apr 26 12:03:37 kernel: intel_iommu_map_pages+0xdc/0x120
  Apr 26 12:03:37 kernel: ? __alloc_and_insert_iova_range+0x203/0x240
  Apr 26 12:03:37 kernel: __iommu_map+0xda/0x270
  Apr 26 12:03:37 kernel: __iommu_map_sg+0x8e/0x120
  Apr 26 12:03:37 kernel: iommu_map_sg_atomic+0x14/0x20
  Apr 26 12:03:37 kernel: iommu_dma_map_sg+0x345/0x4d0
  Apr 26 12:03:37 kernel: __dma_map_sg_attrs+0x68/0x70
  Apr 26 12:03:37 kernel: dma_map_sg_attrs+0xe/0x20
  Apr 26 12:03:37 kernel: scsi_dma_map+0x39/0x50
  Apr 26 12:03:37 kernel: hpsa_scsi_ioaccel2_queue_command.constprop.0+0x11e/0x570 [hpsa]
  Apr 26 12:03:37 kernel: ? __blk_rq_map_sg+0x36/0x160
  Apr 26 12:03:37 kernel: hpsa_scsi_ioaccel_queue_command+0x82/0xd0 [hpsa]
  Apr 26 12:03:37 kernel: hpsa_ioaccel_submit+0x174/0x190 [hpsa]
  Apr 26 12:03:37 kernel: hpsa_scsi_queue_command+0x19c/0x240 [hpsa]
  Apr 26 12:03:37 kernel: ? recalibrate_cpu_khz+0x10/0x10
  Apr 26 12:03:37 kernel: scsi_dispatch_cmd+0x93/0x1f0
  Apr 26 12:03:37 kernel: scsi_queue_rq+0x2d1/0x690
  Apr 26 12:03:37 kernel: blk_mq_dispatch_rq_list+0x126/0x600
  Apr 26 12:03:37 kernel: ? __sbitmap_queue_get+0x1/0x10
  Apr 26 12:03:37 kernel: __blk_mq_do_dispatch_sched+0xba/0x2d0
  Apr 26 12:03:37 kernel: __blk_mq_sched_dispatch_requests+0x104/0x150
  Apr 26 12:03:37 kernel: blk_mq_sched_dispatch_requests+0x35/0x60
  Apr 26 12:03:37 kernel: __blk_mq_run_hw_queue+0x34/0xb0
  Apr 26 12:03:37 kernel: __blk_mq_delay_run_hw_queue+0x162/0x170
  Apr 26 12:03:37 kernel: blk_mq_run_hw_queue+0x83/0x120
  Apr 26 12:03:37 kernel: blk_mq_sched_insert_requests+0x69/0xf0
  Apr 26 12:03:37 kernel: blk_mq_flush_plug_list+0x103/0x1c0
  Apr 26 12:03:37 kernel: blk_flush_plug_list+0xdd/0x100
  Apr 26 12:03:37 kernel: blk_mq_submit_bio+0x2bd/0x600
  Apr 26 12:03:37 kernel: __submit_bio+0x1ea/0x220
  Apr 26 12:03:37 kernel: ? mempool_alloc_slab+0x17/0x20
  Apr 26 12:03:37 kernel: __submit_bio_noacct+0x85/0x1f0
  Apr 26 12:03:37 kernel: submit_bio_noacct+0x4e/0x120
  Apr 26 12:03:37 kernel: ? radix_tree_lookup+0xd/0x10
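
For anyone triaging similar logs, the "already set (to X not Y)" values in the DMAR error line can be decoded into a page frame number and permission bits. The following is a minimal sketch, not part of the bug report: the function names are hypothetical, 4 KiB pages are assumed, and the flag table assumes the Intel VT-d second-level PTE layout (bit 0 read, bit 1 write, bit 11 snoop). Reserved and high-order bits are ignored for simplicity.

```python
import re

# Matches the message format quoted in the bug description.
DMAR_RE = re.compile(
    r"DMA PTE for vPFN 0x([0-9a-fA-F]+) already set "
    r"\(to ([0-9a-fA-F]+) not ([0-9a-fA-F]+)\)"
)

PAGE_SHIFT = 12  # assumption: 4 KiB pages
# Assumption: VT-d second-level PTE flag bits (read, write, snoop).
FLAG_BITS = {0: "read", 1: "write", 11: "snoop"}


def decode_pte(value):
    """Split a raw PTE value into a page frame number and known flags."""
    flags = [name for bit, name in FLAG_BITS.items() if value & (1 << bit)]
    return {"pfn": value >> PAGE_SHIFT, "flags": flags}


def parse_dmar_error(line):
    """Return the decoded fields of a DMAR 'already set' line, or None."""
    m = DMAR_RE.search(line)
    if m is None:
        return None
    vpfn, existing, requested = (int(g, 16) for g in m.groups())
    return {
        "vpfn": vpfn,
        "existing": decode_pte(existing),    # the PTE already in the table
        "requested": decode_pte(requested),  # the PTE the kernel tried to write
    }
```

Applied to the line from the report, the existing entry decodes to PFN 0x7bf32 (the same number as the vPFN) with read and write set, while the requested entry points at PFN 0x24c563. That would be consistent with the existing entry being an identity mapping, though the log alone does not establish why that entry exists.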
[Kernel-packages] [Bug 1970453] Re: DMAR: ERROR: DMA PTE for vPFN 0x7bf32 already set
I hit this bug when upgrading my home server (ProLiant MicroServer Gen9), and it seems to cause memory corruption when it occurs (at least in combination with ZFS).

With a ZFS mirrored root, I experienced the issue after only a few minutes of uptime, with DMAR messages flooding the log and very high CPU usage. After rebooting with intel_iommu=off, things are back to normal, but a ZFS scrub detected several thousand checksum errors on the root volume. Some of them were unrecoverable, and the affected data had to be restored from backup. A separate ZFS RAIDZ1 volume had corrupted metadata and had to be rolled back, with some data loss.
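
To confirm that a machine was actually booted with the intel_iommu=off workaround mentioned in this comment, the kernel command line can be inspected. This is a rough sketch with hypothetical function names; it assumes the last on/off value on the command line takes effect, and it does not account for the kernel's build-time default when the parameter is absent.

```python
def intel_iommu_setting(cmdline):
    """Return "on", "off", or None if intel_iommu= gives no on/off value."""
    state = None
    for token in cmdline.split():
        if not token.startswith("intel_iommu="):
            continue
        # The value may be a comma-separated list, e.g. "on,igfx_off".
        for opt in token.split("=", 1)[1].split(","):
            if opt in ("on", "off"):
                state = opt  # assumption: later occurrences override earlier ones
    return state


def read_running_setting(path="/proc/cmdline"):
    """Check the command line of the currently running kernel (Linux only)."""
    with open(path) as f:
        return intel_iommu_setting(f.read())
```

Calling `read_running_setting()` on the affected server after a reboot should return "off" if the workaround is active.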