Re: [RFC] PCI: Unassigned Expansion ROM BARs
Here it is a year later and there has been basically no progress on this
ongoing situation. I still often encounter bugs raised against the kernel
w.r.t. unmet resource allocations. Here is the most recent example; I'll
attach the 'dmesg' log from the platform at
https://bugzilla.kernel.org/show_bug.cgi?id=104931.

Researching device :04:00.3 as it's the device with the issue (and all other
devices/functions under PCI bus 04 due to possible competing resource needs).

Analysis from v4.7.0 kernel run 'dmesg' log with comments interspersed ...

This platform has two PCI Root Bridges. Limiting analysis to the first Root
Bridge handling PCI buses 0x00 through 0x7e as it contains the PCI bus in
question - bus 04.

  ACPI: PCI Root Bridge [PCI0] (domain [bus 00-7e])
  PCI host bridge to bus :00
  pci_bus :00: root bus resource [io 0x-0x03bb window]
  pci_bus :00: root bus resource [io 0x03bc-0x03df window]
  pci_bus :00: root bus resource [io 0x03e0-0x0cf7 window]
  pci_bus :00: root bus resource [io 0x1000-0x7fff window]
  pci_bus :00: root bus resource [mem 0x000a-0x000b window]
  pci_bus :00: root bus resource [mem 0x9000-0xc7ffbfff window]
  pci_bus :00: root bus resource [mem 0x300-0x33f window]

CPU addresses falling into the above resource ranges will get intercepted by
the host controller and converted into PCI bus transactions.

Looking further into the log we find the set of resource ranges (PCI-to-PCI
bridge apertures) corresponding to PCI bus 04.

  pci :00:02.0: PCI bridge to [bus 04]
  pci :00:02.0: bridge window [io 0x2000-0x2fff]
  pci :00:02.0: bridge window [mem 0x9200-0x940f]      33M

The following shows what the platform's BIOS programmed into the BARs of the
device(s) under PCI bus 04.
  pci :04:00.0: [1924:0923] type 00 class 0x02
  pci :04:00.0: reg 0x10: [io 0x2300-0x23ff]
  pci :04:00.0: reg 0x18: [mem 0x9380-0x93ff 64bit]           BAR2
  pci :04:00.0: reg 0x20: [mem 0x9400c000-0x9400 64bit]       BAR4
  pci :04:00.0: reg 0x30: [mem 0xfffc-0x pref]                E ROM
  pci :04:00.1: [1924:0923] type 00 class 0x02
  pci :04:00.1: reg 0x10: [io 0x2200-0x22ff]
  pci :04:00.1: reg 0x18: [mem 0x9300-0x937f 64bit]
  pci :04:00.1: reg 0x20: [mem 0x94008000-0x9400bfff 64bit]
  pci :04:00.1: reg 0x30: [mem 0xfffc-0x pref]
  pci :04:00.2: [1924:0923] type 00 class 0x02
  pci :04:00.2: reg 0x10: [io 0x2100-0x21ff]
  pci :04:00.2: reg 0x18: [mem 0x9280-0x92ff 64bit]
  pci :04:00.2: reg 0x20: [mem 0x94004000-0x94007fff 64bit]
  pci :04:00.2: reg 0x30: [mem 0xfffc-0x pref]
  pci :04:00.3: [1924:0923] type 00 class 0x02
  pci :04:00.3: reg 0x10: [io 0x2000-0x20ff]
  pci :04:00.3: reg 0x18: [mem 0x9200-0x927f 64bit]           8M
  pci :04:00.3: reg 0x20: [mem 0x9400-0x94003fff 64bit]       16K
  pci :04:00.3: reg 0x30: [mem 0xfffc-0x pref]                256K

It's already obvious that the 33M of MMIO space that the PCI-to-PCI bridge
leading to PCI bus 04 provides (0x9200-0x940f) is not enough to fully satisfy
the MMIO addressing needs of all the BARs of the devices below it. The four
ports combined need ((8M + 16K + 256K) * 4) = 33M + 64K. This is _without_
taking into account any alignment constraints, which would likely increase
the aperture the bus needs even further.

Note that the values programmed into the devices' Expansion ROM BARs do not
fit within any of the immediately upstream bridge's MMIO apertures.
  pci :04:00.0: can't claim BAR 6 [mem 0xfffc-0x pref]: no compatible bridge window
  pci :04:00.1: can't claim BAR 6 [mem 0xfffc-0x pref]: no compatible bridge window
  pci :04:00.2: can't claim BAR 6 [mem 0xfffc-0x pref]: no compatible bridge window
  pci :04:00.3: can't claim BAR 6 [mem 0xfffc-0x pref]: no compatible bridge window

The kernel notices this and attempts to allocate appropriate space for them
from any remaining, available MMIO space that meets all the alignment
constraints and such.

  pci :04:00.0: BAR 6: assigned [mem 0x9404-0x9407 pref]
  pci :04:00.1: BAR 6: assigned [mem 0x9408-0x940b pref]
  pci :04:00.2: BAR 6: assigned [mem 0x940c-0x940f pref]
  pci :04:00.3: BAR 6: no space for [mem size 0x0004 pref]
  pci :04:00.3: BAR 6: failed to assign [mem size 0x0004 pref]

The kernel was able to satisfy the first three ports' MMIO needs but was
_not_ able to for the last port - there is no remaining available addressing
space within the range to satisfy its needs!

At this point the :04:00.3 device just happens to work by luck, because the
unmet resource needs correspond to its Expansion ROM BAR [1].

Next a "user" initiates a
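The arithmetic above can be checked with a small stand-alone C sketch. This
is not kernel code. It works in offsets from the start of the bridge window
rather than real addresses, assumes the window base is itself at least
256K-aligned, assumes the 8M and 16K BARs are packed at the bottom of the
window (roughly what the BIOS layout in the log suggests), and the helper
names (align_up, total_mmio_demand, rom_bars_that_fit) are made up for
illustration.

```c
#include <assert.h>
#include <stdint.h>

#define KiB(x) ((uint64_t)(x) << 10)
#define MiB(x) ((uint64_t)(x) << 20)

/* Round 'value' up to the next multiple of 'align' (must be a power of two). */
static uint64_t align_up(uint64_t value, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (value + align - 1) & ~(align - 1);
}

/* Raw MMIO demand of 'functions' identical functions, ignoring alignment. */
static uint64_t total_mmio_demand(unsigned int functions, uint64_t bar2,
				  uint64_t bar4, uint64_t rom)
{
	return functions * (bar2 + bar4 + rom);
}

/*
 * Count how many naturally aligned ROM BARs of 'rom_size' bytes still fit
 * in a window of 'window_size' bytes whose first 'used' bytes are taken.
 * Simple first-fit from the bottom; all values are offsets into the window.
 */
static unsigned int rom_bars_that_fit(uint64_t window_size, uint64_t used,
				      uint64_t rom_size, unsigned int wanted)
{
	unsigned int placed = 0;
	uint64_t cursor = used;

	while (placed < wanted) {
		uint64_t start = align_up(cursor, rom_size);

		if (start + rom_size > window_size)
			break;		/* no room left for this one */
		cursor = start + rom_size;
		placed++;
	}
	return placed;
}
```

With a 33M window in which four 8M BARs and four 16K BARs are already placed,
rom_bars_that_fit(MiB(33), 4 * MiB(8) + 4 * KiB(16), KiB(256), 4) returns 3:
three of the four 256K ROM BARs can be placed and the fourth cannot, which
matches the log.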
Re: [RFC] PCI: Unassigned Expansion ROM BARs
On Wed, Sep 23, 2015 at 8:47 PM, Myron Stowe wrote:

snip
>
> There is a kernel boot parameter, pci=norom, that is intended to disable the
> kernel's resource assignment actions for Expansion ROMs that do not already
> have BIOS assigned address ranges. Note however, if I remember correctly,
> that this only works if the Expansion ROM BAR is set to "0" by the BIOS
> before hand-off.

In private conversations I was asked: Why do you propose asking the BIOS to
set Expansion ROM BARs to "0"?

That is not what I'm advocating. I think it's a complete hack.

Some background context - this is effectively the de facto detente that has
somehow come to be with one of the major vendors' BIOSes, to avoid 'dmesg'
entries corresponding to unassigned Expansion ROM BARs, which draw customer
attention.

Unless something has changed recently, specifying "pci=norom" when booting
does not cause the kernel to ignore Expansion ROM BARs altogether, as one
would expect. The kernel still outputs 'dmesg' entries corresponding to
unassigned Expansion ROM BARs and also attempts to allocate them. This is a
kernel bug in my opinion. It's only if both "pci=norom" has been specified
and the BIOS has set the Expansion ROM BARs to "0" that the kernel completely
ignores Expansion ROM BARs and no 'dmesg' entries are output.

Customers don't want to, and shouldn't have to, utilize kernel parameters.
They are indispensable for kernel engineers to use for debugging and such,
but not for normal, everyday (i.e. customer) use.

So, no, I am not advocating the current de facto detente that is in place
today.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] PCI: Unassigned Expansion ROM BARs
On Wed, Sep 23, 2015 at 8:47 PM, Myron Stowe wrote:

snip
>
> The kernel expects device Expansion ROM BARs to be programmed with valid
> values - even if the respective Expansion ROM's Enable bit is 0 (i.e. the
> device's expansion ROM address space is disabled). This seems to be the
> main contention point with said BIOS engineers. If an Expansion ROM BAR is
> not programmed, the kernel will attempt to find available resources and, if
> successful, program it. As this occurs various 'dmesg' entries
> related to kernel's actions are output.
>

The respective BIOS engineers from the two major vendors exhibiting the
behavior outlined are aware of, and monitoring, this thread. With the
exception of Daniel's recent post, there hasn't been much substance presented
supporting the OS's viewpoint to encourage the BIOS engineers to enter into
any kind of discussion. The majority of the responses have gone straight
towards how the OS can effectively work around platforms that exhibit such
setups.

I'd like to step back and present known instances of the OS's need(s) to
access the Expansion (a.k.a. option) ROMs - something for the BIOS engineers
to consider; something with which to start a dialogue.

There are at least three known major reasons why Linux uses the ROMs:

1) For many of the video cards, Linux has drivers that assume the card has
   been initialized by the ROM. The drivers work fine, but they aren't smart
   enough to work with the card straight out of reset - a lot of which is due
   to specific vendors keeping their devices closed; the code remains
   proprietary. When such devices are reset while the OS is running (e.g.
   when X is restarted), the OS has to run the ROM before the driver works
   again. Alex Williamson and Daniel Blueman both covered this fairly well,
   including the current deficiencies of such, in prior threads.

2) As Daniel further expressed, hot-plug scenarios, and PCI domains which may
   not be visible to the BIOS at initial boot, may need to access the ROMs.
   In these environments - PCI hierarchies shared among multiple, distinct
   servers; hierarchies using non-transparent bridges - option ROMs handed
   off by the BIOS unassigned need to be allocated by the OS so that they can
   be accessed under these circumstances.

3) Virtualized guest environments, where a device may be assigned to a
   virtualized guest, are an interesting case. In such environments the host
   OS effectively functions as the meta-level BIOS, setting up a guest's
   environment (virtual platform) prior to instantiating it.

   Within such a context consider a simple example: NIC devices often have
   Preboot Execution Environment (PXE) code in their ROMs. In a bare-metal
   scenario, the BIOS (a.k.a. platform firmware) obtains the PXE code from
   the ROM and initiates its execution. In this scenario, once the OS is up
   and running there would seem to be no further need to access such a
   device's ROM.

   If we now extend the scenario one meta-level to include virtualization,
   the host OS [1] assumes the role of the bare-metal environment's BIOS and
   the virtualized guest takes on the role of the bare-metal OS. As such, if
   the guest is booted via PXE from a NIC device, the meta-level BIOS
   (QEMU/seabios) needs the ROM's content in order to initiate PXE execution
   to bring up the guest.

   So in virtualized environments, it becomes obvious that all the
   traditional BIOS ROM related actions extend to the (host) OS - PXE
   booting, device initialization, hot-plug, directly assigning physical
   devices to virtualized guests, etc.

[1] "host OS" is used here in the generalized sense (i.e. it is in control,
    and thus the subsequent uses of QEMU and seabios are not specifically
    differentiated).
Re: [RFC] PCI: Unassigned Expansion ROM BARs
On Thursday, September 24, 2015 at 10:50:07 AM UTC+8, Myron Stowe wrote:
> I've encountered numerous bugzilla reports related to platform BIOS' not
> programming valid values into a PCI device's Type 0 Configuration space
> "Expansion ROM Base Address" field (a.k.a. Expansion ROM BAR). The main
> observed consequence being 'dmesg' entries like the following that get
> customers excited enough to file reports against the kernel.

PCI option ROMs legitimately hold real-mode/EFI code needed to initialise
devices; the problem is, we can't guarantee that the BIOS has initialised all
devices with the option ROM code, so Linux must ensure they are correctly
accessible.

In addition to VMs, as Alex points out, hotplug (e.g. Thunderbolt GPUs) and
PCI domains which may not be visible to the BIOS at early boot may need the
option ROM.

Nvidia GPUs primarily have had a lot of encoder/connector (HDCP?) and
product-specific voltage-frequency setup code and tables in the ROM. As such,
in my NumaConnect open firmware, which maps the PCI domains of multiple
servers into one, I also have to reallocate PCI option ROMs [1] to guarantee
GPU VBIOS execution in Linux.

That said, option ROMs are a dying trend in favour of shipped binary blobs
and open-coded initialisation for cross-platform support, and there are only
10 users of pci_map_rom().

Thanks,
  Daniel

[1] https://github.com/numascale/nc-utils/blob/master/bootloader/dnc-mmio.c
--
Daniel J Blueman
Re: [RFC] PCI: Unassigned Expansion ROM BARs
On Fri, Sep 25, 2015 at 9:18 AM, Alex Williamson wrote:
>> > > Or do we want to keep a white list to say which device should have
>> > > ROM bar as must have, and other is optional to have ?
>> >
>> > Subject: [RFC PATCH] PCI: Add pci_dev_need_rom_bar()
>> >
>> > Only set that for
>> > 1. if BIOS/firmware already set ROM bar.
>> > 2. via quirks for some devices.
>> >
>> > We assign those needed ROM bar as required
>> > and other ROM bars as optional resources.
>>
>> I'd rather not have a whitelist if we can avoid it. We'd be
>> continually adding new devices to it, and it makes the system harder
>> to understand because there's no consistent rule about how ROMs are
>> handled.

I don't like the whitelist way either, and hope we can find a better way to
handle all the cases elegantly.

>>
>> Alex mentioned the idea of ripping the ROM, and I'd like to explore
>> that idea a little more. What if we could temporarily assign space
>> for the ROM during enumeration, read the contents, cache the contents
>> somewhere, then deallocate the actual BAR space? We could hang onto
>> the cache and give it to anybody who later needs the ROM.
>
> That sounds pretty good, so long as we can consider the ROM to be
> perfectly static. I don't know if anything relies on an in-place update
> or if there are ROMs that are less read-only than others. Maybe it's
> safe to assume that or at least safe to assume that if the BIOS didn't
> leave room for them, then we can consider them static. It might be
> interesting to strace some of the userspace firmware update programs for
> add-in cards to see if they re-read the ROM to determine if it has
> changed.

That should cover most cases. But there are some drivers, like
drivers/mtd/maps/pci.c, that do write to the MMIO range.

>
> We already sort of do this for VGA ROMs. When a driver tries to map the
> boot VGA ROM they actually map the shadow copy at 0xC (iirc) rather
> than the one on the device. This actually sort of sucks because this
> particular shadow copy certainly is not read-only and in all the glory
> that is VGA, the shadow sometimes gets modified by the execution of the
> VGA ROM itself and we no longer have access to the pristine device
> version of it (bad for device assignment of primary graphics).
>
> Now, if we want to make our own shadow copies for all ROMs, it seems
> pretty clear that we first need to get access to the ROM, which means we
> need to figure out if the BIOS mapped it. If the ROM BAR is outside of
> the bridge aperture (catching both 0 and 0xFFF..000) or overlaps a
> standard ROM BAR, we can consider it unprogrammed. In that case we need
> to try to do the trick above with unmapping standard BARs to get the
> shadow copy. Otherwise we should be able to get the shadow copy in
> place (maybe a question of which we prefer to use in this case). I
> would be willing to risk that if the BIOS didn't leave room for the ROM
> and we can't map it into the space used by other BARs or it doesn't fit
> the bridge aperture, we can spit out a printk warning and skip it. I
> expect a very low failure rate.

Maybe we can combine the two methods:

1. Have NEED_ROM_BAR, and set it
   a) if the BIOS has already allocated a resource to it (maybe not, so we
      leave space for MMIO bars), or
   b) via a limited whitelist of devices that will not support a static copy.
2. The kernel will try to allocate resources to ROM bars with NEED_ROM_BAR
   as required, and to the others as optional.
3. For a ROM bar that cannot get assigned, the kernel tries to borrow an MMIO
   range from another MMIO bar on the device, saves a copy, and exposes that
   via /sys/.../rom.

That should happen at the FINAL_FIXUP stage, before the driver gets loaded.

---
There is a risk with this: some add-on card firmware would stop working if
the kernel ever tried to change an MMIO bar. Then we would need a blacklist
to skip the BARs on those devices.
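Alex's test for an "unprogrammed" ROM BAR (outside the bridge aperture, or
overlapping another BAR of the device) can be sketched as a stand-alone C
predicate. This is only an illustration of the rule as worded in the quoted
text, not kernel code; struct range and the function names below are
hypothetical, and ranges are treated as inclusive [start, end].

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Inclusive address range, similar in spirit to a resource's start/end. */
struct range {
	uint64_t start;
	uint64_t end;
};

/* True if 'inner' lies completely inside 'outer'. */
static bool range_contains(const struct range *outer, const struct range *inner)
{
	return inner->start >= outer->start && inner->end <= outer->end;
}

/* True if the two inclusive ranges share at least one address. */
static bool ranges_overlap(const struct range *a, const struct range *b)
{
	return a->start <= b->end && b->start <= a->end;
}

/*
 * Treat the ROM BAR as unprogrammed if it does not fit in the upstream
 * bridge window (this catches values such as 0 or all-ones left by the
 * BIOS) or if it collides with one of the device's other BARs.
 */
static bool rom_bar_unprogrammed(const struct range *rom,
				 const struct range *window,
				 const struct range *bars, size_t nbars)
{
	size_t i;

	assert(rom != NULL && window != NULL);

	if (!range_contains(window, rom))
		return true;

	for (i = 0; i < nbars; i++)
		if (ranges_overlap(rom, &bars[i]))
			return true;

	return false;
}
```

Only when this returns true would the "borrow space from another BAR, copy,
and give it back" step from the proposal above be needed; otherwise the copy
could be taken in place.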
Re: [RFC] PCI: Unassigned Expansion ROM BARs
On Fri, Sep 25, 2015 at 7:31 AM, Myron Stowe wrote:
> On Thu, Sep 24, 2015 at 10:35 PM, Yinghai Lu wrote:
>> On Thu, Sep 24, 2015 at 12:01 PM, Yinghai Lu wrote:
>>
>>> Or do we want to keep a white list to say which device should have
>>> ROM bar as must have, and other is optional to have ?
>
> I suppose this idea is one possible outcome that could occur but I
> think we need to have a discussion first before we start making a lot
> of changes. We need to try and come to some consensus with BIOS
> engineers. I know that both sets have been alerted to this
> conversation so *if* we come up with some good arguments to support
> the kernel's current view/actions perhaps things will start to
> progress.
>
> In the prior thread you replied:
>   "They are wrong.
>    It still depends on how addon card firmware and drivers
>    from OS to use it."
>
> So continuing my suggested thought experiment where you are sitting
> across the table from your platform's BIOS engineers having this
> discussion ... Do you *really* think your reply was helpful in any
> way? Do you *really* think you did anything whatsoever to get the
> BIOS engineers to consider something they hadn't before? Do you
> *really* think your reply was technically based in any way? Do you
> have any specification references or such to back up such a strong
> claim?
>
> Come on here Yinghai - you are an intelligent person. Take 1/10th the
> time that you spent developing this patch and think, gather your
> thoughts, and then sit down calmly, have a beer or coffee or tea
> (whichever you prefer), relax, and take some time to reply in a
> logical, thoughtful manner here with enough expression that others can
> understand what you are getting at and hopefully even with some
> passion or logic to try to convince the BIOS engineers that the
> kernel's current behavior is correct. This is your area of expertise
> - so stand up and rise to the occasion here!
>
> Hacking out a patch before this thread has played out serves no
> purpose whatsoever so I'm not even going to waste my time and look
> at it. It serves no purpose and will only make matters worse as there
> is already strong disagreement with the kernel's actions in these
> regards.

Sigh, that was a terrible response on my part - I'm trying to encourage
engagement in this discussion and yet my response likely has exactly the
opposite result: shutting you down.

Yinghai, I apologize, truly!

Others have privately said that you may be uncomfortable with expressing your
views due to language skills. If that's the case then please don't be
intimidated and limit your contributions. I expect you know at least two
languages which is 50% more than me so don't worry about grammar or such -
that's not important and we could benefit from your experience and knowledge.
English is my only language and I still too often find it difficult to
express my opinions.

Again, I'm sorry for my rash, harsh response,
 Myron

>
>>
>> Subject: [RFC PATCH] PCI: Add pci_dev_need_rom_bar()
>>
>> Only set that for
>> 1. if BIOS/firmware already set ROM bar.
>> 2. via quirks for some devices.
>>
>> We assign those needed ROM bar as required
>> and other ROM bars as optional resources.
>>
>> Signed-off-by: Yinghai Lu
>>
>> ---
>>  arch/x86/pci/i386.c        |  9 +-
>>  drivers/dma/pch_dma.c      |  1
>>  drivers/gpio/gpio-ml-ioh.c |  2 -
>>  drivers/misc/pch_phub.c    | 12
>>  drivers/pci/probe.c        |  7 +
>>  drivers/pci/quirks.c       | 63 +
>>  drivers/pci/setup-bus.c    | 18 ++--
>>  include/linux/pci.h        | 13 +
>>  include/linux/pci_ids.h    |  7 +
>>  9 files changed, 112 insertions(+), 20 deletions(-)
>>
>> Index: linux-2.6/arch/x86/pci/i386.c
>> ===================================================================
>> --- linux-2.6.orig/arch/x86/pci/i386.c
>> +++ linux-2.6/arch/x86/pci/i386.c
>> @@ -377,11 +377,16 @@ static void pcibios_allocate_rom_resourc
>>  	}
>>  }
>>
>> +bool pci_assign_roms(void)
>> +{
>> +	return !!(pci_probe & PCI_ASSIGN_ROMS);
>> +}
>> +
>>  static int __init pcibios_assign_resources(void)
>>  {
>>  	struct pci_host_bridge *host_bridge = NULL;
>>
>> -	if (!(pci_probe & PCI_ASSIGN_ROMS))
>> +	if (!pci_assign_roms())
>>  		for_each_pci_host_bridge(host_bridge)
>>  			pcibios_allocate_rom_resources(host_bridge->bus);
>>
>> @@ -406,7 +411,7 @@ void pcibios_resource_survey_bus(struct
>>  	pcibios_allocate_resources(bus, 0);
>>  	pcibios_allocate_resources(bus, 1);
>>
>> -	if (!(pci_probe & PCI_ASSIGN_ROMS))
>> +	if (!pci_assign_roms())
>>  		pcibios_allocate_rom_resources(bus);
>>  }
>>
>> Index: linux-2.6/drivers/pci/probe.c
>> ===================================================================
>> --- linux-2.6.orig/drivers/pci/probe.c
>> +++ linux-2.6/drivers/pci/probe.c
>> @@ -224,6 +224,13 @@ int __pci_read_base(struct
Re: [RFC] PCI: Unassigned Expansion ROM BARs
On Fri, 2015-09-25 at 09:35 -0500, Bjorn Helgaas wrote:
> On Thu, Sep 24, 2015 at 09:35:20PM -0700, Yinghai Lu wrote:
> > On Thu, Sep 24, 2015 at 12:01 PM, Yinghai Lu wrote:
> >
> > > Or do we want to keep a white list to say which device should have
> > > ROM bar as must have, and other is optional to have ?
> >
> > Subject: [RFC PATCH] PCI: Add pci_dev_need_rom_bar()
> >
> > Only set that for
> > 1. if BIOS/firmware already set ROM bar.
> > 2. via quirks for some devices.
> >
> > We assign those needed ROM bar as required
> > and other ROM bars as optional resources.
>
> I'd rather not have a whitelist if we can avoid it.  We'd be
> continually adding new devices to it, and it makes the system harder
> to understand because there's no consistent rule about how ROMs are
> handled.
>
> Alex mentioned the idea of ripping the ROM, and I'd like to explore
> that idea a little more.  What if we could temporarily assign space
> for the ROM during enumeration, read the contents, cache the contents
> somewhere, then deallocate the actual BAR space?  We could hang onto
> the cache and give it to anybody who later needs the ROM.
>
> I know there are probably issues here, but I don't know what they are,
> so I'd like to at least have a conversation about it.

Ok, so for background in the case I mentioned, we can often use the
pci-sysfs rom interface to get a copy of the device ROM, which we then
pass to QEMU and we avoid any access to the physical ROM from the VM.
We can obviously get the ROM from other sources too, but that's not
really relevant here.

If we want to extend this idea into the kernel, creating a buffer that
holds the ROM contents that we access rather than mapping and enabling
the ROM to provide access, we first need to get access to the ROM at
least once.  The simplification here is that we can do this on boot and
we can re-use the space allocated to the standard BARs since we don't
need space for both the ROM and the standard BARs at the same time.  I
think that in the vast majority of cases, we're going to find that the
ROM BAR is smaller than the largest standard MMIO BAR of the device or
at least smaller than the minimum bridge aperture.  This suggests that
we could simply record the BAR values, reprogram them to zero, then
overlap the ROM BAR to the original addresses, enable, rip the ROM, then
restore the configuration without needing to adjust any bridge
apertures.

That sounds pretty good, so long as we can consider the ROM to be
perfectly static.  I don't know if anything relies on an in-place update
or if there are ROMs that are less read-only than others.  Maybe it's
safe to assume that or at least safe to assume that if the BIOS didn't
leave room for them, then we can consider them static.  It might be
interesting to strace some of the userspace firmware update programs for
add-in cards to see if they re-read the ROM to determine if it has
changed.

We already sort of do this for VGA ROMs.  When a driver tries to map the
boot VGA ROM they actually map the shadow copy at 0xC (iirc) rather
than the one on the device.  This actually sort of sucks because this
particular shadow copy certainly is not read-only and in all the glory
that is VGA, the shadow sometimes gets modified by the execution of the
VGA ROM itself and we no longer have access to the pristine device
version of it (bad for device assignment of primary graphics).

Now, if we want to make our own shadow copies for all ROMs, it seems
pretty clear that we first need to get access to the ROM, which means we
need to figure out if the BIOS mapped it.  If the ROM BAR is outside of
the bridge aperture (catching both 0 and 0xFFF..000) or overlaps a
standard ROM BAR, we can consider it unprogrammed.  In that case we need
to try to do the trick above with unmapping standard BARs to get the
shadow copy.  Otherwise we should be able to get the shadow copy in
place (maybe a question of which we prefer to use in this case).  I
would be willing to risk that if the BIOS didn't leave room for the ROM
and we can't map it into the space used by other BARs or it doesn't fit
the bridge aperture, we can spit out a printk warning and skip it.  I
expect a very low failure rate.

What dependencies would we have on the BIOS programming of the ROM BAR
if we took such an approach?  It seems like we should always be able to
detect invalid contents in the ROM BAR and it doesn't matter if they
think we should have access or not, we own the device and can give
ourselves access.  Thanks,

Alex

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
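For illustration only, the "is the ROM BAR unprogrammed?" test described in the message above could be sketched in plain user-space C as follows.  This is not kernel code and not part of any posted patch: `struct range`, `rom_bar_unprogrammed()` and the helper names are hypothetical, ranges are inclusive, and the check treats a ROM BAR as unprogrammed if it falls outside the bridge window or overlaps any standard BAR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Inclusive address range (hypothetical type, not the kernel's struct resource). */
struct range {
	uint64_t start;
	uint64_t end;
};

/* True if 'inner' lies completely inside 'outer'. */
static bool range_contains(const struct range *outer, const struct range *inner)
{
	return inner->start >= outer->start && inner->end <= outer->end;
}

/* True if the two inclusive ranges share at least one address. */
static bool range_overlaps(const struct range *a, const struct range *b)
{
	return a->start <= b->end && b->start <= a->end;
}

/*
 * Hypothetical helper: decide whether the value firmware left in the ROM BAR
 * should be treated as unprogrammed.  Values such as 0 or 0xFFF..000 fall
 * outside the upstream bridge window and are caught by the first test.
 */
static bool rom_bar_unprogrammed(const struct range *rom,
				 const struct range *bridge_window,
				 const struct range *bars, size_t nbars)
{
	assert(rom->start <= rom->end);

	if (!range_contains(bridge_window, rom))
		return true;

	for (size_t i = 0; i < nbars; i++)
		if (range_overlaps(rom, &bars[i]))
			return true;

	return false;
}
```

A caller would pass the upstream bridge's memory window and the device's standard memory BARs; a `true` result means the ROM contents have to be fetched by temporarily borrowing address space rather than read in place.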
Re: [RFC] PCI: Unassigned Expansion ROM BARs
On Thu, Sep 24, 2015 at 09:35:20PM -0700, Yinghai Lu wrote:
> On Thu, Sep 24, 2015 at 12:01 PM, Yinghai Lu wrote:
>
> > Or do we want to keep a white list to say which device should have
> > ROM bar as must have, and other is optional to have ?
>
> Subject: [RFC PATCH] PCI: Add pci_dev_need_rom_bar()
>
> Only set that for
> 1. if BIOS/firmware already set ROM bar.
> 2. via quirks for some devices.
>
> We assign those needed ROM bar as required
> and other ROM bars as optional resources.

I'd rather not have a whitelist if we can avoid it.  We'd be
continually adding new devices to it, and it makes the system harder
to understand because there's no consistent rule about how ROMs are
handled.

Alex mentioned the idea of ripping the ROM, and I'd like to explore
that idea a little more.  What if we could temporarily assign space
for the ROM during enumeration, read the contents, cache the contents
somewhere, then deallocate the actual BAR space?  We could hang onto
the cache and give it to anybody who later needs the ROM.

I know there are probably issues here, but I don't know what they are,
so I'd like to at least have a conversation about it.

Bjorn
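To make the "temporarily assign, read, cache, deallocate" sequence concrete, here is a toy user-space model in C.  It is only a sketch under stated assumptions: `struct toy_dev`, `toy_read_rom()` and `toy_rip_rom()` are hypothetical names, the device is simulated in memory rather than accessed through config space, and the temporary space is obtained by borrowing the address of the largest standard BAR, as suggested elsewhere in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define NUM_BARS 6

/* Toy model of one PCI function; not a kernel structure. */
struct toy_dev {
	uint64_t bar_addr[NUM_BARS];	/* programmed base of each standard BAR */
	uint64_t bar_size[NUM_BARS];	/* 0 means the BAR is not implemented */
	uint64_t rom_addr;		/* programmed base of the ROM BAR */
	bool rom_enabled;		/* models the ROM BAR enable bit */
	const uint8_t *rom_data;	/* what the physical ROM contains */
	size_t rom_size;
};

/* Models an MMIO read: succeeds only while the ROM decodes at 'addr'. */
static bool toy_read_rom(const struct toy_dev *d, uint64_t addr, uint8_t *out)
{
	if (!d->rom_enabled || d->rom_addr != addr)
		return false;
	memcpy(out, d->rom_data, d->rom_size);
	return true;
}

/*
 * Returns a malloc'ed copy of the ROM (caller frees), or NULL if no
 * standard BAR is large enough to lend its address range.
 */
static uint8_t *toy_rip_rom(struct toy_dev *d)
{
	uint64_t saved[NUM_BARS];
	uint64_t saved_rom = d->rom_addr;
	int lender = -1;

	assert(d->rom_size > 0);

	/* Pick the largest standard BAR that can hold the ROM. */
	for (int i = 0; i < NUM_BARS; i++)
		if (d->bar_size[i] >= d->rom_size &&
		    (lender < 0 || d->bar_size[i] > d->bar_size[lender]))
			lender = i;
	if (lender < 0)
		return NULL;

	uint8_t *copy = malloc(d->rom_size);
	if (!copy)
		return NULL;

	/* 1. Record the BAR values and reprogram them to zero. */
	memcpy(saved, d->bar_addr, sizeof(saved));
	memset(d->bar_addr, 0, sizeof(d->bar_addr));

	/* 2. Overlap the ROM BAR onto the lender's old address and enable it. */
	d->rom_addr = saved[lender];
	d->rom_enabled = true;

	/* 3. Rip the ROM into the cache buffer. */
	bool ok = toy_read_rom(d, saved[lender], copy);

	/* 4. Disable the ROM and restore the original configuration. */
	d->rom_enabled = false;
	d->rom_addr = saved_rom;
	memcpy(d->bar_addr, saved, sizeof(saved));

	if (!ok) {
		free(copy);
		return NULL;
	}
	return copy;
}
```

The model leaves out everything that makes the real case risky, such as devices that misbehave when their BARs change and ROMs that are not static; it only shows the ordering of the save, borrow, read and restore steps.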
Re: [RFC] PCI: Unassigned Expansion ROM BARs
On Thu, Sep 24, 2015 at 10:35 PM, Yinghai Lu wrote:
> On Thu, Sep 24, 2015 at 12:01 PM, Yinghai Lu wrote:
>
>> Or do we want to keep a white list to say which device should have
>> ROM bar as must have, and other is optional to have ?

I suppose this idea is one possible outcome that could occur but I
think we need to have a discussion first before we start making a lot
of changes.  We need to try and come to some consensus with BIOS
engineers.  I know that both sets have been alerted to this
conversation so *if* we come up with some good arguments to support
the kernel's current view/actions perhaps things will start to
progress.

In the prior thread you replied:
  "They are wrong.
   It still depends on how addon card firmware and drivers
   from OS to use it."

So continuing my suggested thought experiment where you are sitting
across the table from your platform's BIOS engineers having this
discussion ...  Do you *really* think your reply was helpful in any
way?  Do you *really* think you did anything whatsoever to get the
BIOS engineers to consider something they hadn't before?  Do you
*really* think your reply was technically based in any way?  Do you
have any specification references or such to back up such a strong
claim?

Come on here Yinghai - you are an intelligent person.  Take 1/10th the
time that you spent developing this patch and think, gather your
thoughts, and then sit down calmly, have a beer or coffee or tea
(whichever you prefer), relax, and take some time to reply in a
logical, thoughtful manner here with enough expression that others can
understand what you are getting at and hopefully even with some
passion or logic to try to convince the BIOS engineers that the
kernel's current behavior is correct.  This is your area of expertise
- so stand up and rise to the occasion here!

Hacking out a patch before this thread has played out serves no
purpose whatsoever so I'm not even going to waste my time and look
at it.  It serves no purpose and will only make matters worse as there
is already strong disagreement with the kernel's actions in these
regards.

>
> Subject: [RFC PATCH] PCI: Add pci_dev_need_rom_bar()
>
> Only set that for
> 1. if BIOS/firmware already set ROM bar.
> 2. via quirks for some devices.
>
> We assign those needed ROM bar as required
> and other ROM bars as optional resources.
>
> Signed-off-by: Yinghai Lu
>
> ---
>  arch/x86/pci/i386.c        |    9 +-
>  drivers/dma/pch_dma.c      |    1
>  drivers/gpio/gpio-ml-ioh.c |    2 -
>  drivers/misc/pch_phub.c    |   12
>  drivers/pci/probe.c        |    7 +
>  drivers/pci/quirks.c       |   63 +
>  drivers/pci/setup-bus.c    |   18 ++--
>  include/linux/pci.h        |   13 +
>  include/linux/pci_ids.h    |    7 +
>  9 files changed, 112 insertions(+), 20 deletions(-)
>
> Index: linux-2.6/arch/x86/pci/i386.c
> ===
> --- linux-2.6.orig/arch/x86/pci/i386.c
> +++ linux-2.6/arch/x86/pci/i386.c
> @@ -377,11 +377,16 @@ static void pcibios_allocate_rom_resourc
>  	}
>  }
>
> +bool pci_assign_roms(void)
> +{
> +	return !!(pci_probe & PCI_ASSIGN_ROMS);
> +}
> +
>  static int __init pcibios_assign_resources(void)
>  {
>  	struct pci_host_bridge *host_bridge = NULL;
>
> -	if (!(pci_probe & PCI_ASSIGN_ROMS))
> +	if (!pci_assign_roms())
>  		for_each_pci_host_bridge(host_bridge)
>  			pcibios_allocate_rom_resources(host_bridge->bus);
>
> @@ -406,7 +411,7 @@ void pcibios_resource_survey_bus(struct
>  	pcibios_allocate_resources(bus, 0);
>  	pcibios_allocate_resources(bus, 1);
>
> -	if (!(pci_probe & PCI_ASSIGN_ROMS))
> +	if (!pci_assign_roms())
>  		pcibios_allocate_rom_resources(bus);
>  }
>
> Index: linux-2.6/drivers/pci/probe.c
> ===
> --- linux-2.6.orig/drivers/pci/probe.c
> +++ linux-2.6/drivers/pci/probe.c
> @@ -224,6 +224,13 @@ int __pci_read_base(struct pci_dev *dev,
>  		l64 = l & PCI_ROM_ADDRESS_MASK;
>  		sz64 = sz & PCI_ROM_ADDRESS_MASK;
>  		mask64 = (u32)PCI_ROM_ADDRESS_MASK;
> +		/* simple validation */
> +		if (l64 && sz64 &&
> +		    (l64 & 0xff00) != 0xff00 &&
> +		    system_state == SYSTEM_BOOTING) {
> +			dev_printk(KERN_DEBUG, &dev->dev, "set dev_flags NEED_ROM_BAR\n");
> +			pci_dev_set_need_rom_bar(dev);
> +		}
>  	}
>
>  	if (res->flags & IORESOURCE_MEM_64) {
> Index: linux-2.6/drivers/pci/quirks.c
> ===
> --- linux-2.6.orig/drivers/pci/quirks.c
> +++ linux-2.6/drivers/pci/quirks.c
> @@ -4197,3 +4197,66 @@ static void quirk_intel_qat_vf_cap(struc
>  	}
>  }
>  DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x443, quirk_intel_qat_vf_cap);
> +
> +/* from drivers/mtd/maps/pci.c */
> +static void
Re: [RFC] PCI: Unassigned Expansion ROM BARs
On Fri, Sep 25, 2015 at 9:18 AM, Alex Williamson wrote:
>> > > Or do we want to keep a white list to say which device should have
>> > > ROM bar as must have, and other is optional to have ?
>> >
>> > Subject: [RFC PATCH] PCI: Add pci_dev_need_rom_bar()
>> >
>> > Only set that for
>> > 1. if BIOS/firmware already set ROM bar.
>> > 2. via quirks for some devices.
>> >
>> > We assign those needed ROM bar as required
>> > and other ROM bars as optional resources.
>>
>> I'd rather not have a whitelist if we can avoid it.  We'd be
>> continually adding new devices to it, and it makes the system harder
>> to understand because there's no consistent rule about how ROMs are
>> handled.

I don't like the whitelist way either, and hope we can find a better way
to handle all the cases elegantly.

>>
>> Alex mentioned the idea of ripping the ROM, and I'd like to explore
>> that idea a little more.  What if we could temporarily assign space
>> for the ROM during enumeration, read the contents, cache the contents
>> somewhere, then deallocate the actual BAR space?  We could hang onto
>> the cache and give it to anybody who later needs the ROM.
>
> That sounds pretty good, so long as we can consider the ROM to be
> perfectly static.  I don't know if anything relies on an in-place update
> or if there are ROMs that are less read-only than others.  Maybe it's
> safe to assume that or at least safe to assume that if the BIOS didn't
> leave room for them, then we can consider them static.  It might be
> interesting to strace some of the userspace firmware update programs for
> add-in cards to see if they re-read the ROM to determine if it has
> changed.

That should cover most cases.  But there are some drivers, like
drivers/mtd/maps/pci.c, that do have write operations to the MMIO range.

>
> We already sort of do this for VGA ROMs.  When a driver tries to map the
> boot VGA ROM they actually map the shadow copy at 0xC (iirc) rather
> than the one on the device.  This actually sort of sucks because this
> particular shadow copy certainly is not read-only and in all the glory
> that is VGA, the shadow sometimes gets modified by the execution of the
> VGA ROM itself and we no longer have access to the pristine device
> version of it (bad for device assignment of primary graphics).
>
> Now, if we want to make our own shadow copies for all ROMs, it seems
> pretty clear that we first need to get access to the ROM, which means we
> need to figure out if the BIOS mapped it.  If the ROM BAR is outside of
> the bridge aperture (catching both 0 and 0xFFF..000) or overlaps a
> standard ROM BAR, we can consider it unprogrammed.  In that case we need
> to try to do the trick above with unmapping standard BARs to get the
> shadow copy.  Otherwise we should be able to get the shadow copy in
> place (maybe a question of which we prefer to use in this case).  I
> would be willing to risk that if the BIOS didn't leave room for the ROM
> and we can't map it into the space used by other BARs or it doesn't fit
> the bridge aperture, we can spit out a printk warning and skip it.  I
> expect a very low failure rate.

Maybe we can combine the two methods:

1. Have NEED_ROM_BAR, and set it
   a) if the BIOS has already allocated a resource to the ROM BAR (it
      may not have, so that it leaves space for the MMIO BARs)
   b) via a limited whitelist of devices that will not support a static
      copy.
2. The kernel will try to allocate resources to ROM BARs with
   NEED_ROM_BAR as required, and to the others as optional.
3. For a ROM BAR that cannot get assigned, the kernel tries to borrow
   the MMIO range from another MMIO BAR on the device, saves a copy, and
   exposes that via /sys/.../rom.
   That should happen at the FINAL_FIXUP stage, before the driver gets
   loaded.

---

There is a risk in this: some add-on card firmware would stop working if
the kernel ever tries to change an MMIO BAR.  Then we will need a
blacklist to skip the BARs on those devices.
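As a rough illustration of step 2 in the combined proposal above, the "required first, then optional" ordering could look like the user-space C sketch below.  Everything here is an assumption made for the example: `struct rom_req`, `assign_roms()` and the simple bump allocator are hypothetical and far simpler than the kernel's real resource assignment code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One ROM BAR request (hypothetical type for this sketch). */
struct rom_req {
	uint64_t size;		/* power of two, as for any PCI BAR */
	bool need_rom_bar;	/* models the proposed NEED_ROM_BAR flag */
	uint64_t addr;		/* assigned base, valid only if 'assigned' */
	bool assigned;
};

/* Round 'v' up to a multiple of 'a'; 'a' must be a power of two. */
static uint64_t align_up(uint64_t v, uint64_t a)
{
	return (v + a - 1) & ~(a - 1);
}

/* Try to place one request at the next naturally aligned free address. */
static bool place(struct rom_req *r, uint64_t *next, uint64_t win_end)
{
	assert(r->size != 0 && (r->size & (r->size - 1)) == 0);

	uint64_t start = align_up(*next, r->size);

	if (start + r->size - 1 > win_end)
		return false;
	r->addr = start;
	r->assigned = true;
	*next = start + r->size;
	return true;
}

/*
 * Assign ROM BARs inside the inclusive window [win_start, win_end].
 * Requests flagged need_rom_bar are placed first; the rest are optional
 * and silently left unassigned when space runs out.  Returns the number
 * of required requests that could not be placed.
 */
static size_t assign_roms(struct rom_req *reqs, size_t n,
			  uint64_t win_start, uint64_t win_end)
{
	uint64_t next = win_start;
	size_t failed_required = 0;

	for (size_t i = 0; i < n; i++)
		reqs[i].assigned = false;

	for (size_t i = 0; i < n; i++)		/* pass 1: required */
		if (reqs[i].need_rom_bar && !place(&reqs[i], &next, win_end))
			failed_required++;

	for (size_t i = 0; i < n; i++)		/* pass 2: optional */
		if (!reqs[i].need_rom_bar)
			place(&reqs[i], &next, win_end);

	return failed_required;
}
```

Requests left unassigned after both passes would be the candidates for step 3, borrowing an MMIO BAR's range to take a copy.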
Re: [RFC] PCI: Unassigned Expansion ROM BARs
On Thu, Sep 24, 2015 at 12:01 PM, Yinghai Lu wrote:
> Or do we want to keep a white list to say which device should have
> ROM bar as must have, and other is optional to have ?

Subject: [RFC PATCH] PCI: Add pci_dev_need_rom_bar()

Only set that for
1. if BIOS/firmware already set ROM bar.
2. via quirks for some devices.

We assign those needed ROM bar as required and other ROM bars as
optional resources.

Signed-off-by: Yinghai Lu

---
 arch/x86/pci/i386.c        |    9 +-
 drivers/dma/pch_dma.c      |    1
 drivers/gpio/gpio-ml-ioh.c |    2 -
 drivers/misc/pch_phub.c    |   12
 drivers/pci/probe.c        |    7 +
 drivers/pci/quirks.c       |   63 +
 drivers/pci/setup-bus.c    |   18 ++--
 include/linux/pci.h        |   13 +
 include/linux/pci_ids.h    |    7 +
 9 files changed, 112 insertions(+), 20 deletions(-)

Index: linux-2.6/arch/x86/pci/i386.c
===
--- linux-2.6.orig/arch/x86/pci/i386.c
+++ linux-2.6/arch/x86/pci/i386.c
@@ -377,11 +377,16 @@ static void pcibios_allocate_rom_resourc
 	}
 }
 
+bool pci_assign_roms(void)
+{
+	return !!(pci_probe & PCI_ASSIGN_ROMS);
+}
+
 static int __init pcibios_assign_resources(void)
 {
 	struct pci_host_bridge *host_bridge = NULL;
 
-	if (!(pci_probe & PCI_ASSIGN_ROMS))
+	if (!pci_assign_roms())
 		for_each_pci_host_bridge(host_bridge)
 			pcibios_allocate_rom_resources(host_bridge->bus);
 
@@ -406,7 +411,7 @@ void pcibios_resource_survey_bus(struct
 	pcibios_allocate_resources(bus, 0);
 	pcibios_allocate_resources(bus, 1);
 
-	if (!(pci_probe & PCI_ASSIGN_ROMS))
+	if (!pci_assign_roms())
 		pcibios_allocate_rom_resources(bus);
 }
Index: linux-2.6/drivers/pci/probe.c
===
--- linux-2.6.orig/drivers/pci/probe.c
+++ linux-2.6/drivers/pci/probe.c
@@ -224,6 +224,13 @@ int __pci_read_base(struct pci_dev *dev,
 		l64 = l & PCI_ROM_ADDRESS_MASK;
 		sz64 = sz & PCI_ROM_ADDRESS_MASK;
 		mask64 = (u32)PCI_ROM_ADDRESS_MASK;
+		/* simple validation */
+		if (l64 && sz64 &&
+		    (l64 & 0xff00) != 0xff00 &&
+		    system_state == SYSTEM_BOOTING) {
+			dev_printk(KERN_DEBUG, &dev->dev, "set dev_flags NEED_ROM_BAR\n");
+			pci_dev_set_need_rom_bar(dev);
+		}
 	}
 
 	if (res->flags & IORESOURCE_MEM_64) {
Index: linux-2.6/drivers/pci/quirks.c
===
--- linux-2.6.orig/drivers/pci/quirks.c
+++ linux-2.6/drivers/pci/quirks.c
@@ -4197,3 +4197,66 @@ static void quirk_intel_qat_vf_cap(struc
 	}
 }
 DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x443, quirk_intel_qat_vf_cap);
+
+/* from drivers/mtd/maps/pci.c */
+static void quirk_set_need_rom_bar(struct pci_dev *pdev)
+{
+	if (!pci_dev_need_rom_bar(pdev)) {
+		dev_printk(KERN_DEBUG, &pdev->dev, "set dev_flags NEED_ROM_BAR\n");
+		pci_dev_set_need_rom_bar(pdev);
+	}
+}
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_DEC, PCI_DEVICE_ID_DEC_21285,
+			 quirk_set_need_rom_bar);
+
+#ifdef CONFIG_PARISC
+/* from drivers/video/console/sticore.c */
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_HP, PCI_DEVICE_ID_HP_VISUALIZE_EG,
+			 quirk_set_need_rom_bar);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_HP, PCI_DEVICE_ID_HP_VISUALIZE_FX6,
+			 quirk_set_need_rom_bar);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_HP, PCI_DEVICE_ID_HP_VISUALIZE_FX4,
+			 quirk_set_need_rom_bar);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_HP, PCI_DEVICE_ID_HP_VISUALIZE_FX2,
+			 quirk_set_need_rom_bar);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_HP, PCI_DEVICE_ID_HP_VISUALIZE_FXE,
+			 quirk_set_need_rom_bar);
+#endif
+
+/* from drivers/misc/pch_phub.c */
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_PCH1_PHUB,
+			 quirk_set_need_rom_bar);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ROHM, PCI_DEVICE_ID_ROHM_ML7213_PHUB,
+			 quirk_set_need_rom_bar);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ROHM, PCI_DEVICE_ID_ROHM_ML7223_mPHUB,
+			 quirk_set_need_rom_bar);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ROHM, PCI_DEVICE_ID_ROHM_ML7223_nPHUB,
+			 quirk_set_need_rom_bar);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ROHM, PCI_DEVICE_ID_ROHM_ML7831_PHUB,
+			 quirk_set_need_rom_bar);
+
+/* from drivers/net/ethernet/sun/sungem.c */
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_SUN, PCI_DEVICE_ID_SUN_GEM,
+			 quirk_set_need_rom_bar);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_SUN, PCI_DEVICE_ID_SUN_RIO_GEM,
+			 quirk_set_need_rom_bar);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_APPLE, PCI_DEVICE_ID_APPLE_UNI_N_GMAC,
+			 quirk_set_need_rom_bar);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_APPLE, PCI_DEVICE_ID_APPLE_UNI_N_GMACP,
+			 quirk_set_need_rom_bar);
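
For readers who want to follow the two rules in the changelog without the kernel tree at hand, here is a minimal userspace sketch of the idea. It is not the patch's code: the include/linux/pci.h hunk is not shown above, so the struct, the flag bit and every demo_* name are made up for illustration, and unset_pattern is only a placeholder for whatever bit pattern marks a ROM BAR that firmware left alone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bit; the real definition is not shown in this mail. */
#define DEMO_NEED_ROM_BAR (1u << 0)

struct demo_dev {
	uint32_t rom_addr;	/* address bits read back from the ROM BAR */
	uint32_t rom_size;	/* value read back while sizing the ROM BAR */
	uint32_t flags;
};

static bool demo_need_rom_bar(const struct demo_dev *d)
{
	assert(d != NULL);
	return (d->flags & DEMO_NEED_ROM_BAR) != 0;
}

static void demo_set_need_rom_bar(struct demo_dev *d)
{
	assert(d != NULL);
	d->flags |= DEMO_NEED_ROM_BAR;
}

/*
 * Rule 1: firmware already programmed the ROM BAR. Only trusted while
 * booting; unset_pattern is an assumed placeholder, not the patch's value.
 */
static void demo_probe_rom_bar(struct demo_dev *d, uint32_t unset_pattern,
			       bool booting)
{
	if (d->rom_addr && d->rom_size &&
	    (d->rom_addr & unset_pattern) != unset_pattern && booting)
		demo_set_need_rom_bar(d);
}

/* Rule 2: a per-device quirk marks the ROM BAR as needed. */
static void demo_quirk_need_rom_bar(struct demo_dev *d)
{
	if (!demo_need_rom_bar(d))
		demo_set_need_rom_bar(d);
}
```

Devices that end up with the flag set would be treated as required during assignment; every other ROM BAR stays optional.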
Re: [RFC] PCI: Unassigned Expansion ROM BARs
On Thu, Sep 24, 2015 at 10:06 AM, Myron Stowe wrote:
> On Wed, Sep 23, 2015 at 9:21 PM, Yinghai Lu wrote:
>> On Wed, Sep 23, 2015 at 7:47 PM, Myron Stowe wrote:
>>>
>>> The kernel expects device Expansion ROM BARs to be programmed with valid
>>> values - even if the respective Expansion ROM's Enable bit is 0 (i.e. the
>>> device’s expansion ROM address space is disabled). This seems to be the
>>> main contention point with said BIOS engineers. If an Expansion ROM BAR is
>>> not programmed, the kernel will attempt to find available resources and, if
>>> successful, program it. As this occurs various 'dmesg' entries
>>> related to kernel's actions are output.
>> ...
>>> There is a kernel boot parameter, pci=norom, that is intended to disable the
>>> kernel's resource assignment actions for Expansion ROMs that do not already
>>> have BIOS assigned address ranges. Note however, if I remember correctly,
>>> that this only works if the Expansion ROM BAR is set to "0" by the BIOS
>>> before hand-off.
>>
>> option rom is used by legacy bios to enable booting from external device.
>> usually BIOS call the option rom, so the firmware will be loaded to
>> add on cards.
>> and firmware get started.
>> Also option rom would include tools that is used to configure behavior of
>> cards
>> like add/remove raid.
>
> I'm not sure what you are getting at here but yes, there are use cases
> where the BIOS needs to access the Expansion ROM, one of the more common
> being PXE booting from a NIC device where the PXE boot content is
> retrieved from the ROM, but that has little, if anything, to do with
> what I'm after here.
>
> The BIOS engineers are expressing that the kernel should *never* need
> to access the Expansion ROM, and thus should *never* try to allocate
> resources for these BARs and program them to sane address range values.

They are wrong. It still depends on how addon card firmware and drivers
from OS to use it.

>
>
> I know you work with bringing up new hardware. So picture yourself
> sitting with some members from your platform's BIOS team. They tell
> you: "The OS is incorrect in thinking it needs to find, and program,
> sane resource range values into a device's Expansion ROM BAR. We (the
> BIOS) hand-off the platform with these disabled, thus whatever values
> are in the ROMs BAR should be totally ignored, and the OS should never
> touch them." What would you reply with to them in an attempt to show
> that your position (i.e. the kernel finding, and programming values
> under these circumstances) is correct and that the BIOS opinion is
> incorrect? That is what I'm after.

Some addon cards drivers are use ROM bar to talk to card.
like Intel DC21285 driver in drivers/mtd/maps/pci.c

>
>>
>> Also there is some use case that kernel driver try to get some parameters
>> from
>> BIOS. like intel soft raid ? --- bad practice !
>
> Again, your replies are so terse I have no idea what you are saying; it's
> undecipherable! Are you indicating that you agree with the BIOS
> engineers views?

No. Just some addon card/drivers would avoid accessing expand rom to
get parameter or settings.

>
>>
>> I would like to treat option rom BAR as optional resources during
>> resource allocation.
>>
>> https://git.kernel.org/cgit/linux/kernel/git/yinghai/linux-yinghai.git/patch/?id=7f689da33302e4871fd18aee2c19abb5e3ea5261
>>
>> Subject: PCI: Treat ROM resource as optional during realloc
>>
>> Current on realloc path, we just ignore ROM resource if we can not assign
>> them in first try.
>>
>> Treat ROM resources as optional resources,so try to allocate them together
>> with required ones, if can not assign them, could go with other required
>> resources only, and try to allocate them second time in expand path.
>
> Yes, while they may have lower priority in obtaining resources, you're
> still attempting to do so initially. The BIOS engineers seem to believe
> that this is incorrect - the OS should not even attempt to allocate
> them in the first try.

Some drivers may use it, and we don't know who is really need them.
so just give it a try.

Or do we want to keep a white list to say which device should have
ROM bar as must have, and other is optional to have ?

Thanks

Yinghai
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] PCI: Unassigned Expansion ROM BARs
On Wed, Sep 23, 2015 at 9:21 PM, Yinghai Lu wrote:
> On Wed, Sep 23, 2015 at 7:47 PM, Myron Stowe wrote:
>>
>> The kernel expects device Expansion ROM BARs to be programmed with valid
>> values - even if the respective Expansion ROM's Enable bit is 0 (i.e. the
>> device’s expansion ROM address space is disabled). This seems to be the
>> main contention point with said BIOS engineers. If an Expansion ROM BAR is
>> not programmed, the kernel will attempt to find available resources and, if
>> successful, program it. As this occurs various 'dmesg' entries
>> related to kernel's actions are output.
> ...
>> There is a kernel boot parameter, pci=norom, that is intended to disable the
>> kernel's resource assignment actions for Expansion ROMs that do not already
>> have BIOS assigned address ranges. Note however, if I remember correctly,
>> that this only works if the Expansion ROM BAR is set to "0" by the BIOS
>> before hand-off.
>
> option rom is used by legacy bios to enable booting from external device.
> usually BIOS call the option rom, so the firmware will be loaded to
> add on cards.
> and firmware get started.
> Also option rom would include tools that is used to configure behavior of
> cards
> like add/remove raid.

I'm not sure what you are getting at here but yes, there are use cases
where the BIOS needs to access the Expansion ROM, one of the more common
being PXE booting from a NIC device where the PXE boot content is
retrieved from the ROM, but that has little, if anything, to do with
what I'm after here.

The BIOS engineers are expressing that the kernel should *never* need
to access the Expansion ROM, and thus should *never* try to allocate
resources for these BARs and program them to sane address range values.

I know you work with bringing up new hardware. So picture yourself
sitting with some members from your platform's BIOS team. They tell
you: "The OS is incorrect in thinking it needs to find, and program,
sane resource range values into a device's Expansion ROM BAR. We (the
BIOS) hand-off the platform with these disabled, thus whatever values
are in the ROMs BAR should be totally ignored, and the OS should never
touch them." What would you reply with to them in an attempt to show
that your position (i.e. the kernel finding, and programming values
under these circumstances) is correct and that the BIOS opinion is
incorrect? That is what I'm after.

>
> Also there is some use case that kernel driver try to get some parameters from
> BIOS. like intel soft raid ? --- bad practice !

Again, your replies are so terse I have no idea what you are saying; it's
undecipherable! Are you indicating that you agree with the BIOS
engineers views?

>
> I would like to treat option rom BAR as optional resources during
> resource allocation.
>
> https://git.kernel.org/cgit/linux/kernel/git/yinghai/linux-yinghai.git/patch/?id=7f689da33302e4871fd18aee2c19abb5e3ea5261
>
> Subject: PCI: Treat ROM resource as optional during realloc
>
> Current on realloc path, we just ignore ROM resource if we can not assign
> them in first try.
>
> Treat ROM resources as optional resources,so try to allocate them together
> with required ones, if can not assign them, could go with other required
> resources only, and try to allocate them second time in expand path.

Yes, while they may have lower priority in obtaining resources, you're
still attempting to do so initially. The BIOS engineers seem to believe
that this is incorrect - the OS should not even attempt to allocate
them in the first try.

So, which side are you on and can you support your view with some
technical based argument (and any references from the specifications)?

Please take some time and respond with some thought out explanations and
opinions. I value your opinion because I have seen your work but your
terse replies are going to do nothing whatsoever in trying to convince
BIOS engineers that the OS should, or needs to, access such.

Otherwise: "Why are we (the kernel) allocating resources for them?"

Myron

>
> Thanks
>
> Yinghai
Re: [RFC] PCI: Unassigned Expansion ROM BARs
On Wed, 2015-09-23 at 20:21 -0700, Yinghai Lu wrote:
> On Wed, Sep 23, 2015 at 7:47 PM, Myron Stowe wrote:
> >
> > The kernel expects device Expansion ROM BARs to be programmed with valid
> > values - even if the respective Expansion ROM's Enable bit is 0 (i.e. the
> > device’s expansion ROM address space is disabled). This seems to be the
> > main contention point with said BIOS engineers. If an Expansion ROM BAR is
> > not programmed, the kernel will attempt to find available resources and, if
> > successful, program it. As this occurs various 'dmesg' entries
> > related to kernel's actions are output.
> ...
> > There is a kernel boot parameter, pci=norom, that is intended to disable the
> > kernel's resource assignment actions for Expansion ROMs that do not already
> > have BIOS assigned address ranges. Note however, if I remember correctly,
> > that this only works if the Expansion ROM BAR is set to "0" by the BIOS
> > before hand-off.
>
> option rom is used by legacy bios to enable booting from external device.
> usually BIOS call the option rom, so the firmware will be loaded to
> add on cards.
> and firmware get started.
> Also option rom would include tools that is used to configure behavior of
> cards
> like add/remove raid.
>
> Also there is some use case that kernel driver try to get some parameters from
> BIOS. like intel soft raid ? --- bad practice !
>
> I would like to treat option rom BAR as optional resources during
> resource allocation.
>
> https://git.kernel.org/cgit/linux/kernel/git/yinghai/linux-yinghai.git/patch/?id=7f689da33302e4871fd18aee2c19abb5e3ea5261
>
> Subject: PCI: Treat ROM resource as optional during realloc
>
> Current on realloc path, we just ignore ROM resource if we can not assign
> them in first try.
>
> Treat ROM resources as optional resources,so try to allocate them together
> with required ones, if can not assign them, could go with other required
> resources only, and try to allocate them second time in expand path.

Don't forget that the physical system boot may not be the only "boot" of
the PCI device. We can assign a PCI device to a VM running on top of
the bare-metal OS, at which point the option ROM of the device may be
re-executed and the device re-initialized by the VM BIOS. A BIOS
engineer that argues that the option ROM is unnecessary after bare-metal
BIOS boot is completely disregarding this use case. We do have ways to
make this be a soft requirement, we can pass the option ROM as a file to
the VM, but we need to be able to rip the option ROM from the device in
order to do that, likely from a better behaved platform wrt option ROM
mapping. Thanks,

Alex
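
When an option ROM is ripped to a file as described above (for instance by reading the device's sysfs "rom" attribute on a platform where the ROM BAR is mapped), it is worth a quick sanity check before handing the file to a VM. The following is a minimal sketch of such a check, assuming the standard PCI expansion ROM layout: a 0x55 0xAA signature at offset 0 and, at offset 0x18, a little-endian pointer to a data structure that begins with "PCIR". The function name is made up for illustration and only the first image in the file is examined.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Return true if buf looks like the start of a PCI expansion ROM image. */
static bool rom_image_looks_valid(const uint8_t *buf, size_t len)
{
	size_t pcir;

	assert(buf != NULL);

	/* Need at least the header up to the data structure pointer. */
	if (len < 0x1a)
		return false;

	/* ROM header signature. */
	if (buf[0] != 0x55 || buf[1] != 0xaa)
		return false;

	/* 16-bit little-endian offset of the PCI Data Structure. */
	pcir = (size_t)buf[0x18] | ((size_t)buf[0x19] << 8);
	if (pcir + 4 > len)
		return false;

	return memcmp(buf + pcir, "PCIR", 4) == 0;
}
```

A dump that fails this check (all 0xff bytes are a common result when the BAR was never assigned) is not usable as a ROM file.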
Re: [RFC] PCI: Unassigned Expansion ROM BARs
On Wed, Sep 23, 2015 at 7:47 PM, Myron Stowe wrote:
>
> The kernel expects device Expansion ROM BARs to be programmed with valid
> values - even if the respective Expansion ROM's Enable bit is 0 (i.e. the
> device’s expansion ROM address space is disabled). This seems to be the
> main contention point with said BIOS engineers. If an Expansion ROM BAR is
> not programmed, the kernel will attempt to find available resources and, if
> successful, program it. As this occurs various 'dmesg' entries
> related to kernel's actions are output.
...
> There is a kernel boot parameter, pci=norom, that is intended to disable the
> kernel's resource assignment actions for Expansion ROMs that do not already
> have BIOS assigned address ranges. Note however, if I remember correctly,
> that this only works if the Expansion ROM BAR is set to "0" by the BIOS
> before hand-off.

option rom is used by legacy bios to enable booting from external device.
usually BIOS call the option rom, so the firmware will be loaded to
add on cards.
and firmware get started.
Also option rom would include tools that is used to configure behavior of cards
like add/remove raid.

Also there is some use case that kernel driver try to get some parameters from
BIOS. like intel soft raid ? --- bad practice !

I would like to treat option rom BAR as optional resources during
resource allocation.

https://git.kernel.org/cgit/linux/kernel/git/yinghai/linux-yinghai.git/patch/?id=7f689da33302e4871fd18aee2c19abb5e3ea5261

Subject: PCI: Treat ROM resource as optional during realloc

Current on realloc path, we just ignore ROM resource if we can not assign
them in first try.

Treat ROM resources as optional resources,so try to allocate them together
with required ones, if can not assign them, could go with other required
resources only, and try to allocate them second time in expand path.

Thanks

Yinghai
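
The "optional resource" policy described in that changelog can be pictured with a small userspace model. This is only a sketch of the idea, not the code in the linked patch: the demo_* names are invented, sizes are in arbitrary units, and alignment and bridge window resizing are ignored entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct demo_res {
	uint64_t size;
	bool optional;	/* e.g. a ROM BAR nobody is known to need */
	bool assigned;
};

/* Assign, in array order, each unassigned resource of one kind that fits. */
static uint64_t demo_fit(struct demo_res *r, size_t n, bool optional,
			 uint64_t window, uint64_t used)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (r[i].optional != optional || r[i].assigned)
			continue;
		if (r[i].size <= window - used) {
			r[i].assigned = true;
			used += r[i].size;
		}
	}
	return used;
}

/* Returns how many resources ended up assigned inside the window. */
static size_t demo_assign(struct demo_res *r, size_t n, uint64_t window)
{
	uint64_t total = 0, used;
	size_t i, count = 0;

	assert(r != NULL || n == 0);

	for (i = 0; i < n; i++) {
		r[i].assigned = false;
		total += r[i].size;
	}

	/* First try: required and optional resources together. */
	if (total <= window) {
		for (i = 0; i < n; i++)
			r[i].assigned = true;
		return n;
	}

	/* Fall back to required only, then retry the optional ones. */
	used = demo_fit(r, n, false, window, 0);
	(void)demo_fit(r, n, true, window, used);

	for (i = 0; i < n; i++)
		if (r[i].assigned)
			count++;
	return count;
}
```

With a window of 100 units, required sizes 60 and 30 and optional sizes 20 and 10, the first try fails (120 > 100), both required resources are placed, and only the 10-unit optional resource fits in what is left.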
[RFC] PCI: Unassigned Expansion ROM BARs
I've encountered numerous bugzilla reports related to platform BIOS' not
programming valid values into a PCI device's Type 0 Configuration space
"Expansion ROM Base Address" field (a.k.a. Expansion ROM BAR). The main
observed consequence being 'dmesg' entries like the following that get
customers excited enough to file reports against the kernel.

  pci :01:00.0: can't claim BAR 6 [mem 0xfff0-0x pref]: no compatible bridge window
  pci :04:03.0: can't claim BAR 6 [mem 0x-0x pref]: no compatible bridge window

After I've provided an analysis similar to [1] the respective BIOS response
(teams from two of the major vendors) is typically: "The OS has no business
touching the Expansion ROM BARs and it provides no value to the equation
here. The Expansion ROM BAR is only useful in pre-boot for the BIOS to get
boot code from a device."

This scenario has occurred enough times now that I'd like to attempt to
"raise the bar" and invite a technically merit based discussion concerning
this topic - via a public forum that is archived and provides a source of
reference for use upon future occurrences - and see if a consensus can be
reached between the various vendor's BIOS engineers and kernel engineers.

A little more background context -

The kernel expects device Expansion ROM BARs to be programmed with valid
values - even if the respective Expansion ROM's Enable bit is 0 (i.e. the
device’s expansion ROM address space is disabled). This seems to be the
main contention point with said BIOS engineers. If an Expansion ROM BAR is
not programmed, the kernel will attempt to find available resources and, if
successful, program it. As this occurs various 'dmesg' entries
related to kernel's actions are output.
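
To make the distinction between "programmed" and "enabled" concrete, here is a minimal sketch (plain C, not kernel code) that splits a raw Expansion ROM BAR value into its address and Enable bit. The three PCI_ROM_* values match include/uapi/linux/pci_regs.h; the rom_bar_state type, the helper names and the register values in the self-check are made up for illustration, and "programmed" here simply means "non-zero address bits", which is a simplification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Register layout as defined in include/uapi/linux/pci_regs.h. */
#define PCI_ROM_ADDRESS		0x30		/* offset in a type 0 header */
#define PCI_ROM_ADDRESS_ENABLE	0x01u		/* bit 0: address decode enable */
#define PCI_ROM_ADDRESS_MASK	(~0x7ffu)	/* bits 31:11: base address */

/* Hypothetical helper type, for illustration only. */
struct rom_bar_state {
	uint32_t base;		/* address bits of the register */
	bool enabled;		/* Enable bit set */
	bool programmed;	/* simplification: non-zero address bits */
};

static struct rom_bar_state decode_rom_bar(uint32_t reg)
{
	struct rom_bar_state s;

	s.base = reg & PCI_ROM_ADDRESS_MASK;
	s.enabled = (reg & PCI_ROM_ADDRESS_ENABLE) != 0;
	s.programmed = s.base != 0;
	return s;
}

/* Self-check with made-up register values. */
static void rom_bar_selfcheck(void)
{
	/* An address is present but decoding is disabled. */
	struct rom_bar_state a = decode_rom_bar(0x93000000u);
	/* Nothing programmed at all. */
	struct rom_bar_state b = decode_rom_bar(0x00000000u);

	assert(a.programmed && !a.enabled && a.base == 0x93000000u);
	assert(!b.programmed && !b.enabled);
}
```

The case under discussion is the first one in the self-check: the kernel still cares about the address bits even though the Enable bit is clear.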
Note that for devices that share decoders between the Expansion ROM BAR and
other BARs the firmware (probably) should not enable the Expansion ROM BAR
at hand-off to the operating system (see the last paragraph of the PCI
Firmware Specification, Rev 3.2, Section 3.5 "Device State at
Firmware/Operating System Handoff").

There is a kernel boot parameter, pci=norom, that is intended to disable the
kernel's resource assignment actions for Expansion ROMs that do not already
have BIOS assigned address ranges. Note however, if I remember correctly,
that this only works if the Expansion ROM BAR is set to "0" by the BIOS
before hand-off.

I've opened https://bugzilla.kernel.org/show_bug.cgi?id=104931 and attached
the full 'dmesg' that exhibits a typical occurrence as an example. I'd like
to use the bugzilla to archive any discussion that takes place. I'll copy
all relevant discussion that takes place here into the bugzilla as
"Additional Comments".

Please continue with this thread, adding your views in these regards.
Citations from pertinent specifications that back up your position would be
appreciated.

Thanks,
 Myron

[1] Annotated 'dmesg' log concerning Expansion ROM BARs not setup by BIOS

The "can't claim" messages of interest are:

  pci :01:00.0: can't claim BAR 6 [mem 0xfff0-0x pref]: no compatible bridge window
  pci :04:03.0: can't claim BAR 6 [mem 0x-0x pref]: no compatible bridge window

The PCI devices of interest are a device at PCI Bus 1, Device 0, Function 0
(01:00.0) and another device at PCI Bus 4, Device 3, Function 0 (04:03.0).
The "root bridge" that leads to PCI buses 1 and 4 - the buses of interest -
is "PCI0" and its I/O Port space and Memory Mapped I/O (MMIO) space are:

  ACPI: PCI Root Bridge [PCI0] (domain [bus 00-fe])
  PCI host bridge to bus :00
  pci_bus :00: root bus resource [bus 00-fe]
  pci_bus :00: root bus resource [io 0x-0x0cf7]
  pci_bus :00: root bus resource [io 0x0d00-0x]
  pci_bus :00: root bus resource [mem 0x000a-0x000b]
  pci_bus :00: root bus resource [mem 0xc000-0xfeaf]

It's helpful to gather up all the resource related information pertaining
to the devices of interest in one place.

Concentrating on the PCI-to-PCI bridges and individual PCI devices that
lead to 01:00.0, the first device exhibiting the "can't claim" message
(everything that is consuming resources on PCI bus 0 and PCI bus 1):

  pci :00:1a.0: [8086:1c2d] type 00 class 0x0c0320
  pci :00:1a.0: reg 0x10: [mem 0xc1305000-0xc13053ff]
  pci :00:1d.0: [8086:1c26] type 00 class 0x0c0320
  pci :00:1d.0: reg 0x10: [mem 0xc1304000-0xc13043ff]
  pci :00:1f.2: [8086:1c00] type 00 class 0x01018f
  pci :00:1f.2: reg 0x10: [io 0x3078-0x307f]
  pci :00:1f.2: reg 0x14: [io 0x308c-0x308f]
  pci :00:1f.2: reg 0x18: [io 0x3070-0x3077]
  pci :00:1f.2: reg 0x1c: [io 0x3088-0x308b]
  pci :00:1f.2: reg 0x20: [io 0x3050-0x305f]
  pci :00:1f.2: reg 0x24: [io 0x3040-0x304f]
  pci :00:1f.3: [8086:1c22] type 00 class 0x0c0500
  pci :00:1f.3: reg 0x10: [mem 0xc1302000-0xc13020ff 64bit]
  pci
Re: [RFC] PCI: Unassigned Expansion ROM BARs
On Wed, 2015-09-23 at 20:21 -0700, Yinghai Lu wrote:
> On Wed, Sep 23, 2015 at 7:47 PM, Myron Stowe wrote:
> >
> > The kernel expects device Expansion ROM BARs to be programmed with valid
> > values - even if the respective Expansion ROM's Enable bit is 0 (i.e. the
> > device's expansion ROM address space is disabled). This seems to be the
> > main contention point with said BIOS engineers. If an Expansion ROM BAR is
> > not programmed, the kernel will attempt to find available resources and, if
> > successful, program it. As this occurs various 'dmesg' entries
> > related to kernel's actions are output.
> ...
> > There is a kernel boot parameter, pci=norom, that is intended to disable the
> > kernel's resource assignment actions for Expansion ROMs that do not already
> > have BIOS assigned address ranges. Note however, if I remember correctly,
> > that this only works if the Expansion ROM BAR is set to "0" by the BIOS
> > before hand-off.
>
> Option ROMs are used by legacy BIOS to enable booting from an external
> device. Usually the BIOS calls the option ROM, so the firmware gets loaded
> onto the add-on card and started.
>
> An option ROM can also include tools used to configure the card's
> behavior, like adding/removing RAID.
>
> There are also use cases where a kernel driver tries to get some
> parameters from the BIOS, like Intel soft RAID? --- bad practice!
>
> I would like to treat option ROM BARs as optional resources during
> resource allocation.
>
> https://git.kernel.org/cgit/linux/kernel/git/yinghai/linux-yinghai.git/patch/?id=7f689da33302e4871fd18aee2c19abb5e3ea5261
>
> Subject: PCI: Treat ROM resource as optional during realloc
>
> Currently, on the realloc path, we just ignore ROM resources if we cannot
> assign them on the first try.
>
> Treat ROM resources as optional resources: try to allocate them together
> with the required ones; if they cannot be assigned, go ahead with the
> required resources only, and try to allocate the ROMs a second time in
> the expand path.
Don't forget that the physical system boot may not be the only "boot" of
the PCI device. We can assign a PCI device to a VM running on top of the
bare-metal OS, at which point the option ROM of the device may be
re-executed and the device re-initialized by the VM BIOS. A BIOS engineer
who argues that the option ROM is unnecessary after bare-metal BIOS boot is
completely disregarding this use case.

We do have ways to make this a soft requirement: we can pass the option ROM
as a file to the VM, but we need to be able to rip the option ROM from the
device in order to do that, likely from a better-behaved platform wrt
option ROM mapping.

Thanks,
Alex
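If an option ROM is dumped to a file so it can be handed to a VM, it is worth a quick sanity check that the dump really is an expansion ROM image. The sketch below assumes the standard image layout: a 0x55 0xAA signature at offset 0, a 16-bit pointer at offset 0x18 to the PCI Data Structure, which starts with "PCIR" and carries the vendor ID, device ID, image length in 512-byte units, and an indicator byte whose bit 7 marks the last image. The function names, the synthetic image builder and the IDs 0x1234/0x5678 are hypothetical. How the dump is obtained (for example via the sysfs `rom` attribute on Linux) is outside the sketch.

```python
import struct


def parse_rom_image(image: bytes) -> dict:
    """Parse the header of the first image in an expansion ROM dump."""
    if len(image) < 0x1A or image[0:2] != b"\x55\xaa":
        raise ValueError("missing 0x55 0xAA expansion ROM signature")
    (pcir_off,) = struct.unpack_from("<H", image, 0x18)
    if pcir_off + 0x16 > len(image) or image[pcir_off:pcir_off + 4] != b"PCIR":
        raise ValueError("missing PCIR data structure")
    vendor_id, device_id = struct.unpack_from("<HH", image, pcir_off + 0x04)
    (length_units,) = struct.unpack_from("<H", image, pcir_off + 0x10)
    code_type, indicator = struct.unpack_from("<BB", image, pcir_off + 0x14)
    return {
        "vendor_id": vendor_id,
        "device_id": device_id,
        "size_bytes": length_units * 512,  # image length is in 512-byte units
        "code_type": code_type,
        "last_image": bool(indicator & 0x80),
    }


def make_test_image(vendor_id: int, device_id: int, size: int = 512) -> bytes:
    """Build a synthetic, otherwise empty ROM image for demonstration."""
    image = bytearray(size)
    image[0:2] = b"\x55\xaa"
    pcir_off = 0x1C
    struct.pack_into("<H", image, 0x18, pcir_off)
    image[pcir_off:pcir_off + 4] = b"PCIR"
    struct.pack_into("<HH", image, pcir_off + 0x04, vendor_id, device_id)
    struct.pack_into("<H", image, pcir_off + 0x10, size // 512)
    struct.pack_into("<BB", image, pcir_off + 0x14, 0x00, 0x80)
    return bytes(image)


if __name__ == "__main__":
    # Hypothetical IDs; a real check would read the dumped file instead.
    print(parse_rom_image(make_test_image(0x1234, 0x5678)))
```

A dump taken while the ROM BAR was unmapped or disabled would typically fail the first signature check.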
Re: [RFC] PCI: Unassigned Expansion ROM BARs
On Wed, Sep 23, 2015 at 7:47 PM, Myron Stowe wrote:
>
> The kernel expects device Expansion ROM BARs to be programmed with valid
> values - even if the respective Expansion ROM's Enable bit is 0 (i.e. the
> device's expansion ROM address space is disabled). This seems to be the
> main contention point with said BIOS engineers. If an Expansion ROM BAR is
> not programmed, the kernel will attempt to find available resources and, if
> successful, program it. As this occurs various 'dmesg' entries
> related to kernel's actions are output.
...
> There is a kernel boot parameter, pci=norom, that is intended to disable the
> kernel's resource assignment actions for Expansion ROMs that do not already
> have BIOS assigned address ranges. Note however, if I remember correctly,
> that this only works if the Expansion ROM BAR is set to "0" by the BIOS
> before hand-off.

Option ROMs are used by legacy BIOS to enable booting from an external
device. Usually the BIOS calls the option ROM, so the firmware gets loaded
onto the add-on card and started.

An option ROM can also include tools used to configure the card's behavior,
like adding/removing RAID.

There are also use cases where a kernel driver tries to get some parameters
from the BIOS, like Intel soft RAID? --- bad practice!

I would like to treat option ROM BARs as optional resources during resource
allocation.

https://git.kernel.org/cgit/linux/kernel/git/yinghai/linux-yinghai.git/patch/?id=7f689da33302e4871fd18aee2c19abb5e3ea5261

Subject: PCI: Treat ROM resource as optional during realloc

Currently, on the realloc path, we just ignore ROM resources if we cannot
assign them on the first try.

Treat ROM resources as optional resources: try to allocate them together
with the required ones; if they cannot be assigned, go ahead with the
required resources only, and try to allocate the ROMs a second time in the
expand path.
Thanks

Yinghai
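The "optional resource" idea in the patch description above can be illustrated with a toy allocator. This is only a sketch under stated assumptions, not the kernel's algorithm: it uses a naive largest-first, naturally aligned first-fit inside a single window, and the "expand path" is modeled simply as a second call with a larger window. The sizes (8M and 16K BARs plus a 256K ROM per function, four functions, a 33M window) follow the example given earlier in the thread. The window base address and all names are hypothetical.

```python
KiB = 1 << 10
MiB = 1 << 20


def first_fit(window, resources):
    """Place (name, size) resources largest-first with natural alignment.

    Sizes are assumed to be powers of two. Returns {name: base} or None
    if the resources do not all fit inside the inclusive window.
    """
    start, end = window
    cursor = start
    layout = {}
    for name, size in sorted(resources, key=lambda r: r[1], reverse=True):
        base = (cursor + size - 1) & ~(size - 1)  # round up to alignment
        if base + size - 1 > end:
            return None
        layout[name] = base
        cursor = base + size
    return layout


def assign(window, required, optional_roms):
    """Try required + ROMs together, then fall back to required only."""
    layout = first_fit(window, required + optional_roms)
    if layout is not None:
        return layout, []
    layout = first_fit(window, required)
    if layout is None:
        raise RuntimeError("required resources do not fit in the window")
    return layout, list(optional_roms)  # ROMs stay unassigned for now


if __name__ == "__main__":
    required = [(f"fn{i}.bar2", 8 * MiB) for i in range(4)]
    required += [(f"fn{i}.bar4", 16 * KiB) for i in range(4)]
    roms = [(f"fn{i}.rom", 256 * KiB) for i in range(4)]

    base = 0x10000000  # hypothetical window base
    for size in (33 * MiB, 34 * MiB):  # second pass stands in for "expand"
        layout, unassigned = assign((base, base + size - 1), required, roms)
        names = [name for name, _ in unassigned]
        print(f"{size // MiB}M window: {len(layout)} placed, unassigned: {names}")
```

With the 33M window the required BARs fit but the ROMs do not, so they are skipped rather than failing the whole assignment. With the larger window everything is placed.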