Re: How to reduce PCI initialization from 5 s (1.5 s adding them to IOMMU groups)
Hi Paul,

> Thank you for your reply.

Thank you for getting back to us with good insight.

[...]

> > I am curious - why is this a problem? Are you power-cycling your servers
> > so often to the point where the cumulative time spent in enumerating PCI
> > devices and adding them later to IOMMU groups is a problem?
> >
> > I am simply wondering why you decided to single out the PCI enumeration
> > as slow in particular, especially given that large server hardware tends
> > to have (most of the time, as per my experience) rather long
> > initialisation times either from being powered off or after being power
> > cycled. It can take a while before the actual operating system itself
> > starts.
>
> It’s not a problem per se, more a pet peeve of mine. Systems get faster
> and faster, and boot times slower and slower. On desktop systems it
> matters much more, with firmware like coreboot taking less than one
> second to initialize the hardware and pass control to the payload/
> operating system. If we are lucky, we are going to have servers with
> FLOSS firmware.
>
> But already now, using kexec to reboot a system avoids the problems you
> pointed out on servers, and being able to reboot a system as quickly as
> possible lowers the bar for people to reboot systems more often so that,
> for example, updates take effect.

A very good point about the kexec usage. It is often invaluable for getting
security updates out of the door quickly, updating the kernel version, or
switching operating systems quickly (a trick that companies like Equinix
Metal use when offering their bare metal as a service).

> > We talked about this briefly with Bjorn, and there might be an option to
> > perhaps add some caching, as we suspect that the culprit here is doing
> > PCI configuration space reads for each device, which can be slow on some
> > platforms.
> >
> > However, we would need to profile this to get some quantitative data to
> > see whether doing anything would even be worthwhile. It would definitely
> > help us understand better where the bottlenecks really are and of what
> > magnitude.
> >
> > I personally don't have access to such large hardware as the one you
> > have access to, thus I was wondering whether you would have some time,
> > and be willing, to profile this for us on the hardware you have.
> >
> > Let me know what you think.
>
> Sounds good. I’d be willing to help. Note that I won’t have time before
> Wednesday next week, though.

Not a problem! I am very grateful you are willing to devote some of your
time to help with this.

I only have access to a few systems: commodity hardware such as a desktop
PC and notebooks, plus some assorted SoCs. These are sadly not even close
to proper server platforms, and trying to measure anything on them does not
really yield any useful data, as the delays related to PCI enumeration at
startup are quite insignificant in comparison - there is just not enough
hardware there, so to speak.

I am really looking forward to the data you can gather for us and what
insight it might provide us with.

Thank you again!

	Krzysztof
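One way to gather such quantitative data is sketched below; it assumes the
kernel command line can be edited and that the util-linux version of dmesg
is available, and the grep filters are only illustrative:

```
# Sketch only: boot once with "initcall_debug" (and, if needed, a larger log
# buffer such as "log_buf_len=8M") added to the kernel command line, then
# list the slowest initcalls recorded in the kernel log.
dmesg | awk '/initcall .* returned .* after/ { print $(NF-1), $0 }' | sort -rn | head -20

# Show per-message time deltas around the PCI enumeration and IOMMU group
# phase to see where the 1.5 s is actually spent.
dmesg --show-delta | grep -E 'iommu group|pci '
```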
Re: How to reduce PCI initialization from 5 s (1.5 s adding them to IOMMU groups)
Dear Robin,


Thank you for your reply.


Am 09.11.21 um 16:31 schrieb Robin Murphy:

On 2021-11-06 10:42, Paul Menzel wrote:

Am 05.11.21 um 19:53 schrieb Bjorn Helgaas:

On Fri, Nov 05, 2021 at 12:56:09PM +0100, Paul Menzel wrote:

On a PowerEdge T440/021KCD, BIOS 2.11.2 04/22/2021, Linux 5.10.70 takes
almost five seconds to initialize PCI. According to the timestamps, 1.5 s
are from assigning the PCI devices to the 142 IOMMU groups.

```
$ lspci | wc -l
281
$ dmesg
[…]
[ 2.918411] PCI: Using host bridge windows from ACPI; if necessary, use "pci=nocrs" and report a bug
[ 2.933841] ACPI: Enabled 5 GPEs in block 00 to 7F
[ 2.973739] ACPI: PCI Root Bridge [PC00] (domain 0000 [bus 00-16])
[ 2.980398] acpi PNP0A08:00: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI HPX-Type3]
[ 2.989457] acpi PNP0A08:00: _OSC: platform does not support [LTR]
[ 2.995451] acpi PNP0A08:00: _OSC: OS now controls [PME PCIeCapability]
[ 3.001394] acpi PNP0A08:00: FADT indicates ASPM is unsupported, using BIOS configuration
[ 3.010511] PCI host bridge to bus 0000:00
[…]
[ 6.233508] system 00:05: [io 0x1000-0x10fe] has been reserved
[ 6.239420] system 00:05: Plug and Play ACPI device, IDs PNP0c02 (active)
[ 6.239906] pnp: PnP ACPI: found 6 devices
```

For ~280 PCI devices, (6.24-2.92)/280 = 0.012 s/dev. On my laptop I have
about (.66-.37)/36 = 0.008 s/dev (on v5.4), so about the same ballpark.

Though if it was on average 0.008 s/dev here, around a second could be
saved.

The integrated Matrox G200eW3 graphics controller (102b:0536) and the two
Broadcom NetXtreme BCM5720 2-port Gigabit Ethernet PCIe cards (14e4:165f)
take 150 ms to be initialized.

[ 3.454409] pci 0000:03:00.0: [102b:0536] type 00 class 0x030000
[ 3.460411] pci 0000:03:00.0: reg 0x10: [mem 0x9100-0x91ff pref]
[ 3.467403] pci 0000:03:00.0: reg 0x14: [mem 0x92808000-0x9280bfff]
[ 3.473402] pci 0000:03:00.0: reg 0x18: [mem 0x9200-0x927f]
[ 3.479437] pci 0000:03:00.0: BAR 0: assigned to efifb

The timestamp in each line differs by around 6 ms. Could printing the
messages to the console (VGA) hold this up (line 373 to line 911 makes
(6.24 s-2.92 s)/(538 lines) = (3.32 s)/(538 lines) = 6 ms)?

[ 3.484480] pci 0000:02:00.0: PCI bridge to [bus 03]
[ 3.489401] pci 0000:02:00.0: bridge window [mem 0x9200-0x928f]
[ 3.496398] pci 0000:02:00.0: bridge window [mem 0x9100-0x91ff 64bit pref]
[ 3.504446] pci 0000:04:00.0: [14e4:165f] type 00 class 0x020000
[ 3.510415] pci 0000:04:00.0: reg 0x10: [mem 0x92e3-0x92e3 64bit pref]
[ 3.517408] pci 0000:04:00.0: reg 0x18: [mem 0x92e4-0x92e4 64bit pref]
[ 3.524407] pci 0000:04:00.0: reg 0x20: [mem 0x92e5-0x92e5 64bit pref]
[ 3.532402] pci 0000:04:00.0: reg 0x30: [mem 0xfffc-0x pref]
[ 3.538483] pci 0000:04:00.0: PME# supported from D0 D3hot D3cold
[ 3.544437] pci 0000:04:00.0: 4.000 Gb/s available PCIe bandwidth, limited by 5.0 GT/s PCIe x1 link at 0000:00:1c.5 (capable of 8.000 Gb/s with 5.0 GT/s PCIe x2 link)
[ 3.559493] pci 0000:04:00.1: [14e4:165f] type 00 class 0x020000

Here is a 15 ms delay.

[ 3.565415] pci 0000:04:00.1: reg 0x10: [mem 0x92e0-0x92e0 64bit pref]
[ 3.573407] pci 0000:04:00.1: reg 0x18: [mem 0x92e1-0x92e1 64bit pref]
[ 3.580407] pci 0000:04:00.1: reg 0x20: [mem 0x92e2-0x92e2 64bit pref]
[ 3.587402] pci 0000:04:00.1: reg 0x30: [mem 0xfffc-0x pref]
[ 3.594483] pci 0000:04:00.1: PME# supported from D0 D3hot D3cold
[ 3.600502] pci 0000:00:1c.5: PCI bridge to [bus 04]

Can the 6 ms – also from your system – be explained by the PCI
specification? Seeing how fast PCI nowadays is, 6 ms sounds like a long
time. ;-)

Faster would always be better, of course. I assume this is not really a
regression?
Correct, as far as I know of, this is no regression.

[ 6.989016] pci 0000:d7:05.0: disabled boot interrupts on device [8086:2034]
[ 6.996063] PCI: CLS 0 bytes, default 64
[ 7.08] Trying to unpack rootfs image as initramfs...
[ 7.065281] Freeing initrd memory: 5136K

The PCI resource assignment(?) also seems to take 670 ms:

[ 6.319656] pci 0000:04:00.0: can't claim BAR 6 [mem 0xfffc-0x pref]: no compatible bridge window
[…]
[ 6.989016] pci 0000:d7:05.0: disabled boot interrupts on device [8086:2034]
[…]
[ 7.079098] DMAR: dmar7: Using Queued invalidation
[ 7.083983] pci 0000:00:00.0: Adding to iommu group 0
[…]
[ 8.537808] pci 0000:d7:17.1: Adding to iommu group 141

I don't have this iommu stuff turned on and don't know what's happening
here.

There is a lock in `iommu_group_add_device()` in `drivers/iommu/iommu.c`:

        mutex_lock(&group->mutex);
        list_add_tail(&device->list, &group->devices);
[…]
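Whether the VGA console is indeed what costs those ~6 ms per printed line
can be checked without any code changes; a sketch, assuming the kernel
command line can be edited:

```
# Sketch only: boot once as-is and once with "quiet" (or "loglevel=3")
# appended to the kernel command line, so messages are no longer written to
# the slow VGA console during boot (they still land in the log buffer).
# Then compare the same two timestamps in both boots:
dmesg | grep -E 'PCI: Using host bridge windows|pnp: PnP ACPI: found'
# If the span between these lines shrinks markedly with "quiet", console
# output - not the PCI accesses themselves - is the bottleneck.
```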
Re: How to reduce PCI initialization from 5 s (1.5 s adding them to IOMMU groups)
Dear Krzysztof,


Thank you for your reply.


Am 08.11.21 um 18:18 schrieb Krzysztof Wilczyński:

On a PowerEdge T440/021KCD, BIOS 2.11.2 04/22/2021, Linux 5.10.70 takes
almost five seconds to initialize PCI. According to the timestamps, 1.5 s
are from assigning the PCI devices to the 142 IOMMU groups.

[...]

Is there anything that could be done to reduce the time?

I am curious - why is this a problem? Are you power-cycling your servers
so often to the point where the cumulative time spent in enumerating PCI
devices and adding them later to IOMMU groups is a problem?

I am simply wondering why you decided to single out the PCI enumeration as
slow in particular, especially given that large server hardware tends to
have (most of the time, as per my experience) rather long initialisation
times either from being powered off or after being power cycled. It can
take a while before the actual operating system itself starts.

It’s not a problem per se, more a pet peeve of mine. Systems get faster and
faster, and boot times slower and slower. On desktop systems it matters
much more, with firmware like coreboot taking less than one second to
initialize the hardware and pass control to the payload/operating system.
If we are lucky, we are going to have servers with FLOSS firmware.

But already now, using kexec to reboot a system avoids the problems you
pointed out on servers, and being able to reboot a system as quickly as
possible lowers the bar for people to reboot systems more often so that,
for example, updates take effect.

We talked about this briefly with Bjorn, and there might be an option to
perhaps add some caching, as we suspect that the culprit here is doing PCI
configuration space reads for each device, which can be slow on some
platforms.

However, we would need to profile this to get some quantitative data to see
whether doing anything would even be worthwhile. It would definitely help
us understand better where the bottlenecks really are and of what
magnitude.

I personally don't have access to such large hardware as the one you have
access to, thus I was wondering whether you would have some time, and be
willing, to profile this for us on the hardware you have.

Let me know what you think.

Sounds good. I’d be willing to help. Note that I won’t have time before
Wednesday next week, though.


Kind regards,

Paul
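For reference, the kexec-based reboot mentioned above can look roughly like
this; a sketch only, with placeholder kernel/initrd paths, assuming
kexec-tools (and optionally systemd) is installed:

```
# Sketch only: load the currently running kernel and initrd for kexec,
# reuse the current command line, then jump into the new kernel without
# going through firmware/BIOS and option-ROM initialization again.
kexec -l "/boot/vmlinuz-$(uname -r)" \
      --initrd="/boot/initrd.img-$(uname -r)" \
      --reuse-cmdline
systemctl kexec   # clean service shutdown, then kexec
# (or "kexec -e" to execute the loaded kernel immediately)
```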
Re: How to reduce PCI initialization from 5 s (1.5 s adding them to IOMMU groups)
On 2021-11-06 10:42, Paul Menzel wrote:

Dear Bjorn,


Thank you for your quick reply.

Am 05.11.21 um 19:53 schrieb Bjorn Helgaas:

On Fri, Nov 05, 2021 at 12:56:09PM +0100, Paul Menzel wrote:

On a PowerEdge T440/021KCD, BIOS 2.11.2 04/22/2021, Linux 5.10.70 takes
almost five seconds to initialize PCI. According to the timestamps, 1.5 s
are from assigning the PCI devices to the 142 IOMMU groups.

```
$ lspci | wc -l
281
$ dmesg
[…]
[ 2.918411] PCI: Using host bridge windows from ACPI; if necessary, use "pci=nocrs" and report a bug
[ 2.933841] ACPI: Enabled 5 GPEs in block 00 to 7F
[ 2.973739] ACPI: PCI Root Bridge [PC00] (domain 0000 [bus 00-16])
[ 2.980398] acpi PNP0A08:00: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI HPX-Type3]
[ 2.989457] acpi PNP0A08:00: _OSC: platform does not support [LTR]
[ 2.995451] acpi PNP0A08:00: _OSC: OS now controls [PME PCIeCapability]
[ 3.001394] acpi PNP0A08:00: FADT indicates ASPM is unsupported, using BIOS configuration
[ 3.010511] PCI host bridge to bus 0000:00
[…]
[ 6.233508] system 00:05: [io 0x1000-0x10fe] has been reserved
[ 6.239420] system 00:05: Plug and Play ACPI device, IDs PNP0c02 (active)
[ 6.239906] pnp: PnP ACPI: found 6 devices
```

For ~280 PCI devices, (6.24-2.92)/280 = 0.012 s/dev. On my laptop I have
about (.66-.37)/36 = 0.008 s/dev (on v5.4), so about the same ballpark.

Though if it was on average 0.008 s/dev here, around a second could be
saved.

The integrated Matrox G200eW3 graphics controller (102b:0536) and the two
Broadcom NetXtreme BCM5720 2-port Gigabit Ethernet PCIe cards (14e4:165f)
take 150 ms to be initialized.

[ 3.454409] pci 0000:03:00.0: [102b:0536] type 00 class 0x030000
[ 3.460411] pci 0000:03:00.0: reg 0x10: [mem 0x9100-0x91ff pref]
[ 3.467403] pci 0000:03:00.0: reg 0x14: [mem 0x92808000-0x9280bfff]
[ 3.473402] pci 0000:03:00.0: reg 0x18: [mem 0x9200-0x927f]
[ 3.479437] pci 0000:03:00.0: BAR 0: assigned to efifb

The timestamp in each line differs by around 6 ms. Could printing the
messages to the console (VGA) hold this up (line 373 to line 911 makes
(6.24 s-2.92 s)/(538 lines) = (3.32 s)/(538 lines) = 6 ms)?

[ 3.484480] pci 0000:02:00.0: PCI bridge to [bus 03]
[ 3.489401] pci 0000:02:00.0: bridge window [mem 0x9200-0x928f]
[ 3.496398] pci 0000:02:00.0: bridge window [mem 0x9100-0x91ff 64bit pref]
[ 3.504446] pci 0000:04:00.0: [14e4:165f] type 00 class 0x020000
[ 3.510415] pci 0000:04:00.0: reg 0x10: [mem 0x92e3-0x92e3 64bit pref]
[ 3.517408] pci 0000:04:00.0: reg 0x18: [mem 0x92e4-0x92e4 64bit pref]
[ 3.524407] pci 0000:04:00.0: reg 0x20: [mem 0x92e5-0x92e5 64bit pref]
[ 3.532402] pci 0000:04:00.0: reg 0x30: [mem 0xfffc-0x pref]
[ 3.538483] pci 0000:04:00.0: PME# supported from D0 D3hot D3cold
[ 3.544437] pci 0000:04:00.0: 4.000 Gb/s available PCIe bandwidth, limited by 5.0 GT/s PCIe x1 link at 0000:00:1c.5 (capable of 8.000 Gb/s with 5.0 GT/s PCIe x2 link)
[ 3.559493] pci 0000:04:00.1: [14e4:165f] type 00 class 0x020000

Here is a 15 ms delay.

[ 3.565415] pci 0000:04:00.1: reg 0x10: [mem 0x92e0-0x92e0 64bit pref]
[ 3.573407] pci 0000:04:00.1: reg 0x18: [mem 0x92e1-0x92e1 64bit pref]
[ 3.580407] pci 0000:04:00.1: reg 0x20: [mem 0x92e2-0x92e2 64bit pref]
[ 3.587402] pci 0000:04:00.1: reg 0x30: [mem 0xfffc-0x pref]
[ 3.594483] pci 0000:04:00.1: PME# supported from D0 D3hot D3cold
[ 3.600502] pci 0000:00:1c.5: PCI bridge to [bus 04]

Can the 6 ms – also from your system – be explained by the PCI
specification? Seeing how fast PCI nowadays is, 6 ms sounds like a long
time. ;-)

Faster would always be better, of course. I assume this is not really a
regression?

Correct, as far as I know of, this is no regression.
[ 6.989016] pci 0000:d7:05.0: disabled boot interrupts on device [8086:2034]
[ 6.996063] PCI: CLS 0 bytes, default 64
[ 7.08] Trying to unpack rootfs image as initramfs...
[ 7.065281] Freeing initrd memory: 5136K

The PCI resource assignment(?) also seems to take 670 ms:

[ 6.319656] pci 0000:04:00.0: can't claim BAR 6 [mem 0xfffc-0x pref]: no compatible bridge window
[…]
[ 6.989016] pci 0000:d7:05.0: disabled boot interrupts on device [8086:2034]
[…]
[ 7.079098] DMAR: dmar7: Using Queued invalidation
[ 7.083983] pci 0000:00:00.0: Adding to iommu group 0
[…]
[ 8.537808] pci 0000:d7:17.1: Adding to iommu group 141

I don't have this iommu stuff turned on and don't know what's happening
here.

There is a lock in `iommu_group_add_device()` in `drivers/iommu/iommu.c`:

        mutex_lock(&group->mutex);
        list_add_tail(&device->list, &group->devices);
        if (group->domain && !iommu_is_attach_deferred(group->domain, dev))
                ret = __iommu_attach_device(group->domain, dev);
[…]
Re: How to reduce PCI initialization from 5 s (1.5 s adding them to IOMMU groups)
Hi Paul,

> On a PowerEdge T440/021KCD, BIOS 2.11.2 04/22/2021, Linux 5.10.70 takes
> almost five seconds to initialize PCI. According to the timestamps, 1.5 s
> are from assigning the PCI devices to the 142 IOMMU groups.

[...]

> Is there anything that could be done to reduce the time?

I am curious - why is this a problem? Are you power-cycling your servers
so often to the point where the cumulative time spent in enumerating PCI
devices and adding them later to IOMMU groups is a problem?

I am simply wondering why you decided to single out the PCI enumeration as
slow in particular, especially given that large server hardware tends to
have (most of the time, as per my experience) rather long initialisation
times either from being powered off or after being power cycled. It can
take a while before the actual operating system itself starts.

We talked about this briefly with Bjorn, and there might be an option to
perhaps add some caching, as we suspect that the culprit here is doing PCI
configuration space reads for each device, which can be slow on some
platforms.

However, we would need to profile this to get some quantitative data to see
whether doing anything would even be worthwhile. It would definitely help
us understand better where the bottlenecks really are and of what
magnitude.

I personally don't have access to such large hardware as the one you have
access to, thus I was wondering whether you would have some time, and be
willing, to profile this for us on the hardware you have.

Let me know what you think.

	Krzysztof
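As a very rough userspace proxy for how expensive configuration space
accesses are on a given platform (the kernel's accesses during early boot
may well behave differently), something like the following could be timed;
a sketch only, run as root:

```
# Sketch only: read the first 256 bytes of config space of every PCI device
# via sysfs and time the whole loop. With ~280 devices, microsecond-class
# accesses and millisecond-class accesses give very different totals.
time sh -c '
    for dev in /sys/bus/pci/devices/*/config; do
        dd if="$dev" of=/dev/null bs=256 count=1 2>/dev/null
    done
'
```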
Re: How to reduce PCI initialization from 5 s (1.5 s adding them to IOMMU groups)
Dear Bjorn,


Thank you for your quick reply.

Am 05.11.21 um 19:53 schrieb Bjorn Helgaas:

On Fri, Nov 05, 2021 at 12:56:09PM +0100, Paul Menzel wrote:

On a PowerEdge T440/021KCD, BIOS 2.11.2 04/22/2021, Linux 5.10.70 takes
almost five seconds to initialize PCI. According to the timestamps, 1.5 s
are from assigning the PCI devices to the 142 IOMMU groups.

```
$ lspci | wc -l
281
$ dmesg
[…]
[ 2.918411] PCI: Using host bridge windows from ACPI; if necessary, use "pci=nocrs" and report a bug
[ 2.933841] ACPI: Enabled 5 GPEs in block 00 to 7F
[ 2.973739] ACPI: PCI Root Bridge [PC00] (domain 0000 [bus 00-16])
[ 2.980398] acpi PNP0A08:00: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI HPX-Type3]
[ 2.989457] acpi PNP0A08:00: _OSC: platform does not support [LTR]
[ 2.995451] acpi PNP0A08:00: _OSC: OS now controls [PME PCIeCapability]
[ 3.001394] acpi PNP0A08:00: FADT indicates ASPM is unsupported, using BIOS configuration
[ 3.010511] PCI host bridge to bus 0000:00
[…]
[ 6.233508] system 00:05: [io 0x1000-0x10fe] has been reserved
[ 6.239420] system 00:05: Plug and Play ACPI device, IDs PNP0c02 (active)
[ 6.239906] pnp: PnP ACPI: found 6 devices
```

For ~280 PCI devices, (6.24-2.92)/280 = 0.012 s/dev. On my laptop I have
about (.66-.37)/36 = 0.008 s/dev (on v5.4), so about the same ballpark.

Though if it was on average 0.008 s/dev here, around a second could be
saved.

The integrated Matrox G200eW3 graphics controller (102b:0536) and the two
Broadcom NetXtreme BCM5720 2-port Gigabit Ethernet PCIe cards (14e4:165f)
take 150 ms to be initialized.

[ 3.454409] pci 0000:03:00.0: [102b:0536] type 00 class 0x030000
[ 3.460411] pci 0000:03:00.0: reg 0x10: [mem 0x9100-0x91ff pref]
[ 3.467403] pci 0000:03:00.0: reg 0x14: [mem 0x92808000-0x9280bfff]
[ 3.473402] pci 0000:03:00.0: reg 0x18: [mem 0x9200-0x927f]
[ 3.479437] pci 0000:03:00.0: BAR 0: assigned to efifb

The timestamp in each line differs by around 6 ms. Could printing the
messages to the console (VGA) hold this up (line 373 to line 911 makes
(6.24 s-2.92 s)/(538 lines) = (3.32 s)/(538 lines) = 6 ms)?

[ 3.484480] pci 0000:02:00.0: PCI bridge to [bus 03]
[ 3.489401] pci 0000:02:00.0: bridge window [mem 0x9200-0x928f]
[ 3.496398] pci 0000:02:00.0: bridge window [mem 0x9100-0x91ff 64bit pref]
[ 3.504446] pci 0000:04:00.0: [14e4:165f] type 00 class 0x020000
[ 3.510415] pci 0000:04:00.0: reg 0x10: [mem 0x92e3-0x92e3 64bit pref]
[ 3.517408] pci 0000:04:00.0: reg 0x18: [mem 0x92e4-0x92e4 64bit pref]
[ 3.524407] pci 0000:04:00.0: reg 0x20: [mem 0x92e5-0x92e5 64bit pref]
[ 3.532402] pci 0000:04:00.0: reg 0x30: [mem 0xfffc-0x pref]
[ 3.538483] pci 0000:04:00.0: PME# supported from D0 D3hot D3cold
[ 3.544437] pci 0000:04:00.0: 4.000 Gb/s available PCIe bandwidth, limited by 5.0 GT/s PCIe x1 link at 0000:00:1c.5 (capable of 8.000 Gb/s with 5.0 GT/s PCIe x2 link)
[ 3.559493] pci 0000:04:00.1: [14e4:165f] type 00 class 0x020000

Here is a 15 ms delay.

[ 3.565415] pci 0000:04:00.1: reg 0x10: [mem 0x92e0-0x92e0 64bit pref]
[ 3.573407] pci 0000:04:00.1: reg 0x18: [mem 0x92e1-0x92e1 64bit pref]
[ 3.580407] pci 0000:04:00.1: reg 0x20: [mem 0x92e2-0x92e2 64bit pref]
[ 3.587402] pci 0000:04:00.1: reg 0x30: [mem 0xfffc-0x pref]
[ 3.594483] pci 0000:04:00.1: PME# supported from D0 D3hot D3cold
[ 3.600502] pci 0000:00:1c.5: PCI bridge to [bus 04]

Can the 6 ms – also from your system – be explained by the PCI
specification? Seeing how fast PCI nowadays is, 6 ms sounds like a long
time. ;-)

Faster would always be better, of course. I assume this is not really a
regression?

Correct, as far as I know of, this is no regression.
[ 6.989016] pci 0000:d7:05.0: disabled boot interrupts on device [8086:2034]
[ 6.996063] PCI: CLS 0 bytes, default 64
[ 7.08] Trying to unpack rootfs image as initramfs...
[ 7.065281] Freeing initrd memory: 5136K

The PCI resource assignment(?) also seems to take 670 ms:

[ 6.319656] pci 0000:04:00.0: can't claim BAR 6 [mem 0xfffc-0x pref]: no compatible bridge window
[…]
[ 6.989016] pci 0000:d7:05.0: disabled boot interrupts on device [8086:2034]
[…]
[ 7.079098] DMAR: dmar7: Using Queued invalidation
[ 7.083983] pci 0000:00:00.0: Adding to iommu group 0
[…]
[ 8.537808] pci 0000:d7:17.1: Adding to iommu group 141

I don't have this iommu stuff turned on and don't know what's happening
here.

There is a lock in `iommu_group_add_device()` in `drivers/iommu/iommu.c`:

        mutex_lock(&group->mutex);
        list_add_tail(&device->list, &group->devices);
        if (group->domain && !iommu_is_attach_deferred(group->domain, dev))
                ret = __iommu_attach_device(group->domain, dev);
[…]
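If the time really is spent under that group mutex or in the per-device
attach, boot-time ftrace could narrow it down; a sketch, assuming a kernel
with CONFIG_FUNCTION_GRAPH_TRACER and function names as in v5.10
(iommu_probe_device, intel_iommu_probe_device, iommu_group_add_device):

```
# Sketch only: add these to the kernel command line to record per-call
# durations of the IOMMU probe path during boot:
#   ftrace=function_graph ftrace_graph_filter=iommu_probe_device
#
# After boot, read the recorded durations back:
mount -t tracefs nodev /sys/kernel/tracing 2>/dev/null || true
head -50 /sys/kernel/tracing/trace
```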
Re: How to reduce PCI initialization from 5 s (1.5 s adding them to IOMMU groups)
On Fri, Nov 05, 2021 at 12:56:09PM +0100, Paul Menzel wrote:
> Dear Linux folks,
>
>
> On a PowerEdge T440/021KCD, BIOS 2.11.2 04/22/2021, Linux 5.10.70 takes
> almost five seconds to initialize PCI. According to the timestamps, 1.5 s
> are from assigning the PCI devices to the 142 IOMMU groups.
>
> ```
> $ lspci | wc -l
> 281
> $ dmesg
> […]
> [ 2.918411] PCI: Using host bridge windows from ACPI; if necessary, use
> "pci=nocrs" and report a bug
> [ 2.933841] ACPI: Enabled 5 GPEs in block 00 to 7F
> [ 2.973739] ACPI: PCI Root Bridge [PC00] (domain 0000 [bus 00-16])
> [ 2.980398] acpi PNP0A08:00: _OSC: OS supports [ExtendedConfig ASPM
> ClockPM Segments MSI HPX-Type3]
> [ 2.989457] acpi PNP0A08:00: _OSC: platform does not support [LTR]
> [ 2.995451] acpi PNP0A08:00: _OSC: OS now controls [PME PCIeCapability]
> [ 3.001394] acpi PNP0A08:00: FADT indicates ASPM is unsupported, using
> BIOS configuration
> [ 3.010511] PCI host bridge to bus 0000:00
> […]
> [ 6.233508] system 00:05: [io 0x1000-0x10fe] has been reserved
> [ 6.239420] system 00:05: Plug and Play ACPI device, IDs PNP0c02 (active)
> [ 6.239906] pnp: PnP ACPI: found 6 devices

For ~280 PCI devices, (6.24-2.92)/280 = 0.012 s/dev. On my laptop I have
about (.66-.37)/36 = 0.008 s/dev (on v5.4), so about the same ballpark.

Faster would always be better, of course. I assume this is not really a
regression?

> [ 6.989016] pci 0000:d7:05.0: disabled boot interrupts on device
> [8086:2034]
> [ 6.996063] PCI: CLS 0 bytes, default 64
> [ 7.08] Trying to unpack rootfs image as initramfs...
> [ 7.065281] Freeing initrd memory: 5136K
> […]
> [ 7.079098] DMAR: dmar7: Using Queued invalidation
> [ 7.083983] pci 0000:00:00.0: Adding to iommu group 0
> […]
> [ 8.537808] pci 0000:d7:17.1: Adding to iommu group 141

I don't have this iommu stuff turned on and don't know what's happening
here.

> Is there anything that could be done to reduce the time?
>
>
> Kind regards,
>
> Paul
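The per-device estimate above can be reproduced on any machine from the
same two log lines and the lspci count; a small sketch, assuming the
default dmesg timestamp format:

```
# Sketch only: seconds per device = (end of PnP scan - first host bridge
# message) / number of PCI devices, mirroring the (6.24-2.92)/280 estimate.
ndev=$(lspci | wc -l)
dmesg | awk -F'[][]' -v n="$ndev" \
    '/PCI: Using host bridge windows/ { start = $2 }
     /pnp: PnP ACPI: found/           { end   = $2 }
     END { printf "%.4f s/dev over %d devices\n", (end - start) / n, n }'
```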
Re: How to reduce PCI initialization from 5 s (1.5 s adding them to IOMMU groups)
Dear Linux folks,


Am 05.11.21 um 12:56 schrieb Paul Menzel:

On a PowerEdge T440/021KCD, BIOS 2.11.2 04/22/2021, Linux 5.10.70 takes
almost five seconds to initialize PCI. According to the timestamps, 1.5 s
are from assigning the PCI devices to the 142 IOMMU groups.

```
$ lspci | wc -l
281
$ dmesg
[…]
[ 2.918411] PCI: Using host bridge windows from ACPI; if necessary, use "pci=nocrs" and report a bug
[ 2.933841] ACPI: Enabled 5 GPEs in block 00 to 7F
[ 2.973739] ACPI: PCI Root Bridge [PC00] (domain 0000 [bus 00-16])
[ 2.980398] acpi PNP0A08:00: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI HPX-Type3]
[ 2.989457] acpi PNP0A08:00: _OSC: platform does not support [LTR]
[ 2.995451] acpi PNP0A08:00: _OSC: OS now controls [PME PCIeCapability]
[ 3.001394] acpi PNP0A08:00: FADT indicates ASPM is unsupported, using BIOS configuration
[ 3.010511] PCI host bridge to bus 0000:00
[…]
[ 6.233508] system 00:05: [io 0x1000-0x10fe] has been reserved
[ 6.239420] system 00:05: Plug and Play ACPI device, IDs PNP0c02 (active)
[ 6.239906] pnp: PnP ACPI: found 6 devices
[…]
[ 6.989016] pci 0000:d7:05.0: disabled boot interrupts on device [8086:2034]
[ 6.996063] PCI: CLS 0 bytes, default 64
[ 7.08] Trying to unpack rootfs image as initramfs...
[ 7.065281] Freeing initrd memory: 5136K
[…]
[ 7.079098] DMAR: dmar7: Using Queued invalidation
[ 7.083983] pci 0000:00:00.0: Adding to iommu group 0
[…]
[ 8.537808] pci 0000:d7:17.1: Adding to iommu group 141
[ 8.571191] DMAR: Intel(R) Virtualization Technology for Directed I/O
[ 8.577618] PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
[…]
```

Is there anything that could be done to reduce the time?

I created an issue at the Kernel.org Bugzilla, and attached the output of
`dmesg` there [1].


Kind regards,

Paul


[1]: https://bugzilla.kernel.org/show_bug.cgi?id=214953
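For the record, the 1.5 s figure follows directly from the timestamps
above; a small sketch of pulling the relevant span out of the log (assuming
the default dmesg timestamp format):

```
# Sketch only: the first and last "Adding to iommu group" messages bracket
# the IOMMU group assignment phase: 8.537808 s - 7.083983 s ≈ 1.45 s.
dmesg | grep 'Adding to iommu group' | sed -n '1p;$p'
```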