Re: DMAR and DRHD errors[DMAR:[fault reason 06] PTE Read access is not set] Vt-d & intel_iommu
On 12/15/2012 05:16 PM, Robert Hancock wrote:
> On 12/14/2012 03:32 PM, Don Dutile wrote:
> > On 12/13/2012 04:50 AM, Jason Gao wrote:
> > > Dear List:
> > >
> > > Description of problem:
> > > After installing CentOS 6.3 (RHEL 6.3) on my Dell R710 (latest BIOS: Version: 6.3.0, Release Date: 07/24/2012) server and updating to the latest kernel "2.6.32-279.14.1.el6.x86_64", I want to use the Intel 82576 ET Dual Port NIC's SR-IOV feature, assigning VFs to a KVM guest.
> > >
> > > I appended the kernel boot parameter intel_iommu=on; after boot, the following messages appear:
> > >
> > > Dec 13 16:58:15 2 kernel: DRHD: handling fault status reg 2
> > > Dec 13 16:58:15 2 kernel: DMAR:[DMA Read] Request device [03:00.0] fault addr ffe65000
> > > Dec 13 16:58:15 2 kernel: DMAR:[fault reason 06] PTE Read access is not set
> > > Dec 13 16:58:15 2 kernel: DRHD: handling fault status reg 102
> > > Dec 13 16:58:15 2 kernel: DMAR:[DMA Read] Request device [03:00.0] fault addr ffe8a000
> > > Dec 13 16:58:15 2 kernel: DMAR:[fault reason 06] PTE Read access is not set
> > > Dec 13 16:58:15 2 kernel: scsi 0:0:32:0: Enclosure DP BACKPLANE 1.07 PQ: 0 ANSI: 5
> > > Dec 13 16:58:15 2 kernel: DRHD: handling fault status reg 202
> > > Dec 13 16:58:15 2 kernel: DMAR:[DMA Read] Request device [03:00.0] fault addr ffe89000
> > > Dec 13 16:58:15 2 kernel: DMAR:[fault reason 06] PTE Read access is not set
> > >
> > > full dmesg detail: http://pastebin.com/BzFQV0jU
> > > lspci -vvv full detail: http://pastebin.com/9rP2d1br
> > >
> > > It's a production server, and I'm not sure if this is a critical problem or how to fix it. Any help would be greatly appreciated.
> >
> > The DMAR table does not have an entry for this device to this region. Once the driver reconfigs/resets the device to stop polling bios-boot cmd rings and use (new) OS (dma-mapped) rings, there's a period of time during this transition when the hw is babbling away to an area that is no longer mapped.
>
> Maybe some kind of boot PCI quirk is needed to stop the device DMA activity before enabling the IOMMU?

No, lack of a *proper* RMRR for this device is the source of the problem; that's why the RMRRs exist -- so this transition state does not cause these types of problems.

--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
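To make the RMRR point concrete: a fault like the ones reported here means the device read an address that no reserved-region entry (and no OS mapping) covers. The following is only a minimal Python sketch of that coverage check. The fault addresses are the six named in this thread; the RMRR range is purely HYPOTHETICAL (it is not taken from the R710's DMAR table), and `uncovered_addresses` is a name made up for this illustration.

```python
# Fault addresses reported for device 03:00.0 in this thread.
FAULT_ADDRS = [0xffe65000, 0xffe8a000, 0xffe89000,
               0xffe86000, 0xffe87000, 0xffe84000]

# HYPOTHETICAL RMRR range (inclusive start, inclusive end).
# The real ranges would have to be read from the platform's DMAR table.
HYPOTHETICAL_RMRR = [(0xcf4c8000, 0xcf4dffff)]


def uncovered_addresses(addrs, regions):
    """Return the addresses that fall outside every (start, end) region."""
    return [
        addr for addr in addrs
        if not any(start <= addr <= end for start, end in regions)
    ]


if __name__ == "__main__":
    for addr in uncovered_addresses(FAULT_ADDRS, HYPOTHETICAL_RMRR):
        print(f"{addr:#x} is not covered by any listed RMRR")
```

With a range that did span these addresses, the function would return an empty list, which is the situation a correct RMRR is meant to produce.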
Re: DMAR and DRHD errors[DMAR:[fault reason 06] PTE Read access is not set] Vt-d & intel_iommu
On 12/14/2012 03:32 PM, Don Dutile wrote:
> On 12/13/2012 04:50 AM, Jason Gao wrote:
> > Dear List:
> >
> > Description of problem:
> > After installing CentOS 6.3 (RHEL 6.3) on my Dell R710 (latest BIOS: Version: 6.3.0, Release Date: 07/24/2012) server and updating to the latest kernel "2.6.32-279.14.1.el6.x86_64", I want to use the Intel 82576 ET Dual Port NIC's SR-IOV feature, assigning VFs to a KVM guest.
> >
> > I appended the kernel boot parameter intel_iommu=on; after boot, the following messages appear:
> >
> > Dec 13 16:58:15 2 kernel: DRHD: handling fault status reg 2
> > Dec 13 16:58:15 2 kernel: DMAR:[DMA Read] Request device [03:00.0] fault addr ffe65000
> > Dec 13 16:58:15 2 kernel: DMAR:[fault reason 06] PTE Read access is not set
> > Dec 13 16:58:15 2 kernel: DRHD: handling fault status reg 102
> > Dec 13 16:58:15 2 kernel: DMAR:[DMA Read] Request device [03:00.0] fault addr ffe8a000
> > Dec 13 16:58:15 2 kernel: DMAR:[fault reason 06] PTE Read access is not set
> > Dec 13 16:58:15 2 kernel: scsi 0:0:32:0: Enclosure DP BACKPLANE 1.07 PQ: 0 ANSI: 5
> > Dec 13 16:58:15 2 kernel: DRHD: handling fault status reg 202
> > Dec 13 16:58:15 2 kernel: DMAR:[DMA Read] Request device [03:00.0] fault addr ffe89000
> > Dec 13 16:58:15 2 kernel: DMAR:[fault reason 06] PTE Read access is not set
> >
> > full dmesg detail: http://pastebin.com/BzFQV0jU
> > lspci -vvv full detail: http://pastebin.com/9rP2d1br
> >
> > It's a production server, and I'm not sure if this is a critical problem or how to fix it. Any help would be greatly appreciated.
>
> The DMAR table does not have an entry for this device to this region. Once the driver reconfigs/resets the device to stop polling bios-boot cmd rings and use (new) OS (dma-mapped) rings, there's a period of time during this transition when the hw is babbling away to an area that is no longer mapped.

Maybe some kind of boot PCI quirk is needed to stop the device DMA activity before enabling the IOMMU?
Re: DMAR and DRHD errors[DMAR:[fault reason 06] PTE Read access is not set] Vt-d & intel_iommu
On Sat, Dec 15, 2012 at 5:55 AM, Don Dutile wrote:
> forgot: did you check that all the bios settings are the same btwn
> the 710 systems?

The BIOS settings should be the same between the servers. I've ignored these errors and am running KVM on this server, with non-critical Java production applications deployed on the KVM guests.

thanks
Re: DMAR and DRHD errors[DMAR:[fault reason 06] PTE Read access is not set] Vt-d & intel_iommu
On Sat, Dec 15, 2012 at 5:54 AM, Don Dutile wrote:
> mptsas or smi fw has to be different

this server:

# inventory_firmware
Wait while we inventory system:
System inventory:
BIOS = 6.3.0
SAS/SATA Backplane 0:0 Backplane Firmware = 1.07
PERC 6/i Integrated Controller 0 Firmware = 6.3.1-0003
Dell OS Drivers Pack, v.6.5.3, A00 = 6.5.3
Dell Lifecycle Controller = 1.5.5.27
NetXtreme II BCM5709 Gigabit Ethernet rev 20 (eth1) = 7.2.20
NetXtreme II BCM5709 Gigabit Ethernet rev 20 (eth0) = 7.2.20
ST3600057SS Firmware = es66
iDRAC6 = 1.92
NetXtreme II BCM5709 Gigabit Ethernet rev 20 (eth2) = 7.2.20
NetXtreme II BCM5709 Gigabit Ethernet rev 20 (eth3) = 7.2.20
Dell 32 Bit Diagnostics, v.5154A0, 5154.1 = 5154a0
System BIOS for PowerEdge R710 = 6.3.0

other servers:

# inventory_firmware
Wait while we inventory system:
System inventory:
BIOS = 6.3.0
SAS/SATA Backplane 0:0 Backplane Firmware = 1.07
PERC H700 Integrated Controller 0 Firmware = 12.10.4-0001
Dell OS Drivers Pack, 7.1.0.9, A00 = 7.1.0.9
Dell Lifecycle Controller, 1.5.5.27, A00 = 1.5.5.27
ST3600057SS Firmware = es65
iDRAC6 = 1.90
Dell 32 Bit Diagnostics, v.5154A0, 5154.
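The two inventories are easier to compare mechanically than by eye. Below is a minimal Python sketch that diffs a subset of the `name = version` lines posted above; the function names `parse_inventory` and `diff_inventories` are made up for this example, and only a few entries from each listing are included (the truncated last line is left out). On this subset, the comparison reports differing disk and iDRAC6 firmware, and shows that the RAID controller model itself differs (PERC 6/i on this server, PERC H700 on the others).

```python
THIS_SERVER = """
BIOS = 6.3.0
SAS/SATA Backplane 0:0 Backplane Firmware = 1.07
PERC 6/i Integrated Controller 0 Firmware = 6.3.1-0003
ST3600057SS Firmware = es66
iDRAC6 = 1.92
"""

OTHER_SERVERS = """
BIOS = 6.3.0
SAS/SATA Backplane 0:0 Backplane Firmware = 1.07
PERC H700 Integrated Controller 0 Firmware = 12.10.4-0001
ST3600057SS Firmware = es65
iDRAC6 = 1.90
"""


def parse_inventory(text):
    """Turn 'component = version' lines into a {component: version} dict."""
    items = {}
    for line in text.strip().splitlines():
        # Split on the last " = " so component names may contain "=".
        name, sep, version = line.rpartition(" = ")
        if sep:
            items[name.strip()] = version.strip()
    return items


def diff_inventories(a, b):
    """Return (changed versions, components only in a, components only in b)."""
    changed = {k: (a[k], b[k]) for k in a.keys() & b.keys() if a[k] != b[k]}
    only_a = sorted(a.keys() - b.keys())
    only_b = sorted(b.keys() - a.keys())
    return changed, only_a, only_b


if __name__ == "__main__":
    changed, only_a, only_b = diff_inventories(
        parse_inventory(THIS_SERVER), parse_inventory(OTHER_SERVERS)
    )
    for name, (mine, theirs) in sorted(changed.items()):
        print(f"differs: {name}: {mine} vs {theirs}")
    for name in only_a:
        print(f"only on this server: {name}")
    for name in only_b:
        print(f"only on other servers: {name}")
```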
Re: DMAR and DRHD errors[DMAR:[fault reason 06] PTE Read access is not set] Vt-d & intel_iommu
On 12/13/2012 09:01 PM, Jason Gao wrote:
> On Fri, Dec 14, 2012 at 12:23 AM, Alex Williamson wrote:
> > Device 03:00.0 is your raid controller:
> >
> > 03:00.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS 1078 (rev 04)
> >
> > For some reason it's trying to read from ffe65000, ffe8a000, ffe89000, ffe86000, ffe87000, ffe84000. Those are in reserved memory regions, so it's not reading an OS allocated buffer, which probably means it's some kind of side-band communication with a management controller. I'd guess it's a BIOS bug and there should be an RMRR covering those accesses.
> > Thanks,
>
> First of all, I want to know whether I can ignore these errors on the production server, and whether these errors may affect the system.
>
> By the way, when I removed "intel_iommu=on" from /etc/grub.conf, no DMAR-related errors occur.
>
> It's a strange thing: three other Dell R710 servers with the same BIOS version 6.3.0 and the same kernel 2.6.32-279.14.1 on RHEL6u3 (CentOS 6u3) don't show these errors.

forgot: did you check that all the bios settings are the same btwn the 710 systems?

> Anyone have any idea about this? thanks
Re: DMAR and DRHD errors[DMAR:[fault reason 06] PTE Read access is not set] Vt-d & intel_iommu
On 12/13/2012 09:01 PM, Jason Gao wrote:
> On Fri, Dec 14, 2012 at 12:23 AM, Alex Williamson wrote:
> > Device 03:00.0 is your raid controller:
> >
> > 03:00.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS 1078 (rev 04)
> >
> > For some reason it's trying to read from ffe65000, ffe8a000, ffe89000, ffe86000, ffe87000, ffe84000. Those are in reserved memory regions, so it's not reading an OS allocated buffer, which probably means it's some kind of side-band communication with a management controller. I'd guess it's a BIOS bug and there should be an RMRR covering those accesses.
> > Thanks,
>
> First of all, I want to know whether I can ignore these errors on the production server, and whether these errors may affect the system.
>
> By the way, when I removed "intel_iommu=on" from /etc/grub.conf, no DMAR-related errors occur.

well, if you don't enable the IOMMU, then it won't have IOMMU faults! ;-)

> It's a strange thing: three other Dell R710 servers with the same BIOS version 6.3.0 and the same kernel 2.6.32-279.14.1 on RHEL6u3 (CentOS 6u3) don't show these errors.

mptsas or smi fw has to be different

> Anyone have any idea about this? thanks
Re: DMAR and DRHD errors[DMAR:[fault reason 06] PTE Read access is not set] Vt-d & intel_iommu
On 12/13/2012 04:50 AM, Jason Gao wrote:
> Dear List:
>
> Description of problem:
> After installing CentOS 6.3 (RHEL 6.3) on my Dell R710 (latest BIOS: Version: 6.3.0, Release Date: 07/24/2012) server and updating to the latest kernel "2.6.32-279.14.1.el6.x86_64", I want to use the Intel 82576 ET Dual Port NIC's SR-IOV feature, assigning VFs to a KVM guest.
>
> I appended the kernel boot parameter intel_iommu=on; after boot, the following messages appear:
>
> Dec 13 16:58:15 2 kernel: DRHD: handling fault status reg 2
> Dec 13 16:58:15 2 kernel: DMAR:[DMA Read] Request device [03:00.0] fault addr ffe65000
> Dec 13 16:58:15 2 kernel: DMAR:[fault reason 06] PTE Read access is not set
> Dec 13 16:58:15 2 kernel: DRHD: handling fault status reg 102
> Dec 13 16:58:15 2 kernel: DMAR:[DMA Read] Request device [03:00.0] fault addr ffe8a000
> Dec 13 16:58:15 2 kernel: DMAR:[fault reason 06] PTE Read access is not set
> Dec 13 16:58:15 2 kernel: scsi 0:0:32:0: Enclosure DP BACKPLANE 1.07 PQ: 0 ANSI: 5
> Dec 13 16:58:15 2 kernel: DRHD: handling fault status reg 202
> Dec 13 16:58:15 2 kernel: DMAR:[DMA Read] Request device [03:00.0] fault addr ffe89000
> Dec 13 16:58:15 2 kernel: DMAR:[fault reason 06] PTE Read access is not set
>
> full dmesg detail: http://pastebin.com/BzFQV0jU
> lspci -vvv full detail: http://pastebin.com/9rP2d1br
>
> It's a production server, and I'm not sure if this is a critical problem or how to fix it. Any help would be greatly appreciated.

The DMAR table does not have an entry for this device to this region. Once the driver reconfigs/resets the device to stop polling bios-boot cmd rings and use (new) OS (dma-mapped) rings, there's a period of time during this transition when the hw is babbling away to an area that is no longer mapped.
Re: DMAR and DRHD errors[DMAR:[fault reason 06] PTE Read access is not set] Vt-d & intel_iommu
On Fri, Dec 14, 2012 at 2:56 PM, Jason Gao wrote:
> On Fri, Dec 14, 2012 at 12:45 PM, Alex Williamson wrote:
>> Is the MegaRAID firmware and system management firmware the same as
>> well? Thanks.
>
> I've updated all the firmware using Dell's firmware-tools:
>
> # inventory_firmware
> Wait while we inventory system:
> System inventory:
> BIOS = 6.3.0
> SAS/SATA Backplane 0:0 Backplane Firmware = 1.07
> PERC 6/i Integrated Controller 0 Firmware = 6.3.1-0003
> Dell OS Drivers Pack, v.6.5.3, A00 = 6.5.3
> Dell Lifecycle Controller = 1.5.5.27
> NetXtreme II BCM5709 Gigabit Ethernet rev 20 (eth1) = 7.2.20
> NetXtreme II BCM5709 Gigabit Ethernet rev 20 (eth0) = 7.2.20
> ST3600057SS Firmware = es66
> iDRAC6 = 1.92
> NetXtreme II BCM5709 Gigabit Ethernet rev 20 (eth2) = 7.2.20
> NetXtreme II BCM5709 Gigabit Ethernet rev 20 (eth3) = 7.2.20
> Dell 32 Bit Diagnostics, v.5154A0, 5154.1 = 5154a0
> System BIOS for PowerEdge R710 = 6.3.0
>
> Thanks

#lspci - -s 03:00.0|grep fail:
pcilib: sysfs_read_vpd: read failed: Connection timed out

#strace lspci - -s 03:00.0|grep fail:
open("/sys/bus/pci/devices/:03:00.0/vpd", O_RDONLY) = 4
pread(4, 0x7fff30670b3f, 1, 0) = -1 ETIMEDOUT (Connection timed out)
write(2, "pcilib: ", 8pcilib: ) = 8
write(2, "sysfs_read_vpd: read failed: Con"..., 49sysfs_read_vpd: read failed: Connection timed out) = 49
write(2, "\n", 1
Re: DMAR and DRHD errors[DMAR:[fault reason 06] PTE Read access is not set] Vt-d & intel_iommu
On Fri, Dec 14, 2012 at 12:45 PM, Alex Williamson wrote:
> Is the MegaRAID firmware and system management firmware the same as
> well? Thanks.

I've updated all the firmware using Dell's firmware-tools:

# inventory_firmware
Wait while we inventory system:
System inventory:
BIOS = 6.3.0
SAS/SATA Backplane 0:0 Backplane Firmware = 1.07
PERC 6/i Integrated Controller 0 Firmware = 6.3.1-0003
Dell OS Drivers Pack, v.6.5.3, A00 = 6.5.3
Dell Lifecycle Controller = 1.5.5.27
NetXtreme II BCM5709 Gigabit Ethernet rev 20 (eth1) = 7.2.20
NetXtreme II BCM5709 Gigabit Ethernet rev 20 (eth0) = 7.2.20
ST3600057SS Firmware = es66
iDRAC6 = 1.92
NetXtreme II BCM5709 Gigabit Ethernet rev 20 (eth2) = 7.2.20
NetXtreme II BCM5709 Gigabit Ethernet rev 20 (eth3) = 7.2.20
Dell 32 Bit Diagnostics, v.5154A0, 5154.1 = 5154a0
System BIOS for PowerEdge R710 = 6.3.0

Thanks
Re: DMAR and DRHD errors[DMAR:[fault reason 06] PTE Read access is not set] Vt-d & intel_iommu
On Fri, 2012-12-14 at 10:01 +0800, Jason Gao wrote:
> On Fri, Dec 14, 2012 at 12:23 AM, Alex Williamson wrote:
> >
> > Device 03:00.0 is your raid controller:
> >
> > 03:00.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS 1078 (rev 04)
> >
> > For some reason it's trying to read from ffe65000, ffe8a000, ffe89000,
> > ffe86000, ffe87000, ffe84000. Those are in reserved memory regions, so
> > it's not reading an OS allocated buffer, which probably means it's some
> > kind of side-band communication with a management controller. I'd guess
> > it's a BIOS bug and there should be an RMRR covering those accesses.
> > Thanks,
>
> First of all, I want to know whether I can ignore these errors on the
> production server, and whether these errors may affect the system.

You'll have to make that call: the device is being blocked from reading a memory address, and we don't know what it's reading or why.

> By the way, when I removed "intel_iommu=on" from /etc/grub.conf, no
> DMAR-related errors occur.

Of course. One option you have is to use the IOMMU in passthrough mode, which allows host-used devices unrestricted, identity-mapped access to the system while still offering PCI device assignment. I wouldn't try assigning device 03:00.0 though. Add iommu=pt to enable this.

> It's a strange thing: three other Dell R710 servers with the same BIOS
> version 6.3.0 and the same kernel 2.6.32-279.14.1 on RHEL6u3 (CentOS 6u3)
> don't show these errors.

Is the MegaRAID firmware and system management firmware the same as well? Thanks,

Alex
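For the iommu=pt suggestion, the change amounts to adding one more parameter next to intel_iommu=on on the kernel line in /etc/grub.conf. Here is a minimal Python sketch of doing that without duplicating parameters that are already present. The kernel line shown is HYPOTHETICAL (the real root device and other options on this server are not given in the thread), and `add_kernel_params` is a helper name invented for this illustration.

```python
def add_kernel_params(kernel_line, params):
    """Append each boot parameter to a grub 'kernel' line unless already there."""
    tokens = kernel_line.split()
    for param in params:
        if param not in tokens:
            tokens.append(param)
    return " ".join(tokens)


if __name__ == "__main__":
    # HYPOTHETICAL kernel line; only the kernel version and intel_iommu=on
    # come from the thread, the root= value is a placeholder.
    line = ("kernel /vmlinuz-2.6.32-279.14.1.el6.x86_64 ro "
            "root=/dev/mapper/vg-root intel_iommu=on")
    print(add_kernel_params(line, ["intel_iommu=on", "iommu=pt"]))
```

The sketch only edits a string; writing the result back to the boot configuration and rebooting are left to the administrator.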
Re: DMAR and DRHD errors[DMAR:[fault reason 06] PTE Read access is not set] Vt-d & intel_iommu
On Fri, Dec 14, 2012 at 12:23 AM, Alex Williamson wrote:
>
> Device 03:00.0 is your raid controller:
>
> 03:00.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS 1078 (rev 04)
>
> For some reason it's trying to read from ffe65000, ffe8a000, ffe89000,
> ffe86000, ffe87000, ffe84000. Those are in reserved memory regions, so
> it's not reading an OS allocated buffer, which probably means it's some
> kind of side-band communication with a management controller. I'd guess
> it's a BIOS bug and there should be an RMRR covering those accesses.
> Thanks,

First of all, I want to know whether I can ignore these errors on the production server, and whether these errors may affect the system.

By the way, when I removed "intel_iommu=on" from /etc/grub.conf, no DMAR-related errors occur.

It's a strange thing: three other Dell R710 servers with the same BIOS version 6.3.0 and the same kernel 2.6.32-279.14.1 on RHEL6u3 (CentOS 6u3) don't show these errors.

Anyone have any idea about this?

thanks
Re: DMAR and DRHD errors[DMAR:[fault reason 06] PTE Read access is not set] Vt-d & intel_iommu
On Thu, 2012-12-13 at 17:50 +0800, Jason Gao wrote:
> Dear List:
>
> Description of problem:
> After installing CentOS 6.3 (RHEL 6.3) on my Dell R710 (latest
> BIOS: Version: 6.3.0, Release Date: 07/24/2012) server and updating to
> the latest kernel "2.6.32-279.14.1.el6.x86_64", I want to use the Intel
> 82576 ET Dual Port NIC's SR-IOV feature, assigning VFs to a KVM guest.
>
> I appended the kernel boot parameter intel_iommu=on; after boot, the
> following messages appear:
>
> Dec 13 16:58:15 2 kernel: DRHD: handling fault status reg 2
> Dec 13 16:58:15 2 kernel: DMAR:[DMA Read] Request device [03:00.0] fault addr ffe65000
> Dec 13 16:58:15 2 kernel: DMAR:[fault reason 06] PTE Read access is not set
> Dec 13 16:58:15 2 kernel: DRHD: handling fault status reg 102
> Dec 13 16:58:15 2 kernel: DMAR:[DMA Read] Request device [03:00.0] fault addr ffe8a000
> Dec 13 16:58:15 2 kernel: DMAR:[fault reason 06] PTE Read access is not set
> Dec 13 16:58:15 2 kernel: scsi 0:0:32:0: Enclosure DP BACKPLANE 1.07 PQ: 0 ANSI: 5
> Dec 13 16:58:15 2 kernel: DRHD: handling fault status reg 202
> Dec 13 16:58:15 2 kernel: DMAR:[DMA Read] Request device [03:00.0] fault addr ffe89000
> Dec 13 16:58:15 2 kernel: DMAR:[fault reason 06] PTE Read access is not set
>
> full dmesg detail:
> http://pastebin.com/BzFQV0jU
> lspci -vvv full detail:
> http://pastebin.com/9rP2d1br
>
> It's a production server, and I'm not sure if this is a critical
> problem or how to fix it. Any help would be greatly appreciated.

Device 03:00.0 is your raid controller:

03:00.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS 1078 (rev 04)

For some reason it's trying to read from ffe65000, ffe8a000, ffe89000, ffe86000, ffe87000, ffe84000. Those are in reserved memory regions, so it's not reading an OS allocated buffer, which probably means it's some kind of side-band communication with a management controller. I'd guess it's a BIOS bug and there should be an RMRR covering those accesses.

Thanks,

Alex
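The device and address list in this reply can be pulled out of the kernel log mechanically. What follows is a minimal Python sketch that extracts the requesting device and the distinct fault addresses from the excerpt quoted in this thread. The regular expression and the `collect_faults` name are assumptions made for this example; the pattern targets only the message format shown here and may not match other kernel versions.

```python
import re
from collections import defaultdict

# Matches the "Request device [...] fault addr ..." line as printed in the
# log excerpt from this thread. \s+ tolerates a line wrap before "fault addr".
REQUEST_RE = re.compile(
    r"DMAR:\[DMA (?P<kind>Read|Write)\] Request device \[(?P<dev>[0-9a-f:.]+)\]"
    r"\s+fault addr (?P<addr>[0-9a-f]+)"
)

LOG = """\
Dec 13 16:58:15 2 kernel: DRHD: handling fault status reg 2
Dec 13 16:58:15 2 kernel: DMAR:[DMA Read] Request device [03:00.0] fault addr ffe65000
Dec 13 16:58:15 2 kernel: DMAR:[fault reason 06] PTE Read access is not set
Dec 13 16:58:15 2 kernel: DRHD: handling fault status reg 102
Dec 13 16:58:15 2 kernel: DMAR:[DMA Read] Request device [03:00.0] fault addr ffe8a000
Dec 13 16:58:15 2 kernel: DMAR:[fault reason 06] PTE Read access is not set
Dec 13 16:58:15 2 kernel: DRHD: handling fault status reg 202
Dec 13 16:58:15 2 kernel: DMAR:[DMA Read] Request device [03:00.0] fault addr ffe89000
Dec 13 16:58:15 2 kernel: DMAR:[fault reason 06] PTE Read access is not set
"""


def collect_faults(log_text):
    """Return {device: sorted list of distinct fault addresses as ints}."""
    faults = defaultdict(set)
    for match in REQUEST_RE.finditer(log_text):
        faults[match.group("dev")].add(int(match.group("addr"), 16))
    return {dev: sorted(addrs) for dev, addrs in faults.items()}


if __name__ == "__main__":
    for dev, addrs in collect_faults(LOG).items():
        print(dev, [hex(a) for a in addrs])
```

The device string it reports (03:00.0) is what gets looked up in the lspci output to identify the RAID controller.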