Re: [PATCH 4/5] x86: acpi: Print warning for malformed host bridge resources
Ping?

On Sun, 2012-11-11 at 09:49 -0500, Peter Hurley wrote:
> On Sat, 2012-11-10 at 14:52 -0700, Bjorn Helgaas wrote:
> > On Wed, Nov 7, 2012 at 7:55 PM, Peter Hurley wrote:
> > > An incorrectly specified host bridge window may prevent
> > > other devices from claiming assigned resources. For example,
> > > this flawed _CRS resource descriptor from a Dell T5400:
> > >   DWordMemory (ResourceProducer, PosDecode, MinFixed, MaxFixed,
> > >       NonCacheable, ReadWrite,
> > >       0x, // Granularity
> > >       0xF000, // Range Minimum
> > >       0xFE00, // Range Maximum
> > >       0x, // Translation Offset
> > >       0x0E00, // Length
> > >       ,, , AddressRangeMemory, TypeStatic)
> >
> > I think the problem here is that the Range Maximum should be
> > 0xFDFF, not 0xFE00, right?
>
> I presume so.
>
> > > diff --git a/arch/x86/pci/acpi.c b/arch/x86/pci/acpi.c
> > > index 192397c..3468d16 100644
> > > --- a/arch/x86/pci/acpi.c
> > > +++ b/arch/x86/pci/acpi.c
> > > @@ -298,6 +298,10 @@ setup_resource(struct acpi_resource *acpi_res, void *data)
> > >                          "host bridge window [%#llx-%#llx] "
> > >                          "([%#llx-%#llx] ignored, not CPU addressable)\n",
> > >                          start, orig_end, end + 1, orig_end);
> > > +        } else if (flags & IORESOURCE_MEM && (start & 0x0f || ~end & 0x0f)) {
> > > +                dev_warn(&info->bridge->dev,
> > > +                         "invalid host bridge window [%#llx-%#llx]\n",
> > > +                         start, end);
> >
> > We didn't actually *fix* anything here, so I guess we're just pointing
> > out the reason for a subsequent failure to claim the adjacent
> > resource.
>
> Correct. There is no fix; only a diagnostic warning.
>
> The warning is also a 'red flag' that, on this machine, it might be
> better to boot the kernel with the "pci=nocrs" option.
>
> > As far as I know, the spec doesn't actually require resources of ACPI
> > devices to be non-overlapping. Windows accepts overlapping resources,
> > and I think Linux probably should, too, but right now we trip over
> > this.
>
> (note: I included a link below to the defect report which has
> the /proc/iomem, dmesg & dmidecode)
>
> The situation is this:
>
> The adjacent resources (northbridge & southbridge) are not defined by
> ACPI, but rather reserved with an e820 address descriptor from
> [0xfe00-0xfeff], so strictly speaking there is no overlapping
> ACPI resource.
>
> The e820 descriptor is bumped out to [0xf000-0xfeff] and the
> malformed host bridge window is reparented to it.
>
> At this point in the boot, there is no resource conflict.
>
> Later in the boot, the i5k_amb driver tries to map
> [0xfe00-0xfe01] which is the FB-DIMM AMB register window on the
> Intel 5400 MCH and is rejected. The request is rejected because the
> requested range does not map completely to a single parent and this is
> not allowed. (The i5k_amb driver exposes the FB-DIMM temperature sensors
> through sysfs).
>
> There is no problem in Windows because no driver attempts to allocate
> [0xfe00-0xfe01]. However, I doubt the PNP Manager would allow
> another bus pdo to claim an overlapping resource with PCI bus 0. I
> suspect the offending device would yellow bang. (That would be an
> interesting experiment...)
>
> > In the meantime (until we figure out how to handle overlapping
> > resources better), can we do something to actually fix this? Maybe we
> > should truncate the end of the range to 0xFDFF like we do for
> > non-addressable parts of the range?
>
> Auto-fixing this seems problematic because it's essentially impossible
> to determine if the resource length or the resource end or both is
> wrong.
>
> > Is there a bugzilla or a complete dmesg log to look at?
>
> https://bugzilla.kernel.org/show_bug.cgi?id=50161
>
> Regards,
> Peter Hurley

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 4/5] x86: acpi: Print warning for malformed host bridge resources
On Sat, 2012-11-10 at 14:52 -0700, Bjorn Helgaas wrote:
> On Wed, Nov 7, 2012 at 7:55 PM, Peter Hurley wrote:
> > An incorrectly specified host bridge window may prevent
> > other devices from claiming assigned resources. For example,
> > this flawed _CRS resource descriptor from a Dell T5400:
> >   DWordMemory (ResourceProducer, PosDecode, MinFixed, MaxFixed,
> >       NonCacheable, ReadWrite,
> >       0x, // Granularity
> >       0xF000, // Range Minimum
> >       0xFE00, // Range Maximum
> >       0x, // Translation Offset
> >       0x0E00, // Length
> >       ,, , AddressRangeMemory, TypeStatic)
>
> I think the problem here is that the Range Maximum should be
> 0xFDFF, not 0xFE00, right?

I presume so.

> > diff --git a/arch/x86/pci/acpi.c b/arch/x86/pci/acpi.c
> > index 192397c..3468d16 100644
> > --- a/arch/x86/pci/acpi.c
> > +++ b/arch/x86/pci/acpi.c
> > @@ -298,6 +298,10 @@ setup_resource(struct acpi_resource *acpi_res, void *data)
> >                          "host bridge window [%#llx-%#llx] "
> >                          "([%#llx-%#llx] ignored, not CPU addressable)\n",
> >                          start, orig_end, end + 1, orig_end);
> > +        } else if (flags & IORESOURCE_MEM && (start & 0x0f || ~end & 0x0f)) {
> > +                dev_warn(&info->bridge->dev,
> > +                         "invalid host bridge window [%#llx-%#llx]\n",
> > +                         start, end);
>
> We didn't actually *fix* anything here, so I guess we're just pointing
> out the reason for a subsequent failure to claim the adjacent
> resource.

Correct. There is no fix; only a diagnostic warning.

The warning is also a 'red flag' that, on this machine, it might be
better to boot the kernel with the "pci=nocrs" option.

> As far as I know, the spec doesn't actually require resources of ACPI
> devices to be non-overlapping. Windows accepts overlapping resources,
> and I think Linux probably should, too, but right now we trip over
> this.

(note: I included a link below to the defect report which has
the /proc/iomem, dmesg & dmidecode)

The situation is this:

The adjacent resources (northbridge & southbridge) are not defined by
ACPI, but rather reserved with an e820 address descriptor from
[0xfe00-0xfeff], so strictly speaking there is no overlapping
ACPI resource.

The e820 descriptor is bumped out to [0xf000-0xfeff] and the
malformed host bridge window is reparented to it.

At this point in the boot, there is no resource conflict.

Later in the boot, the i5k_amb driver tries to map
[0xfe00-0xfe01] which is the FB-DIMM AMB register window on the
Intel 5400 MCH and is rejected. The request is rejected because the
requested range does not map completely to a single parent and this is
not allowed. (The i5k_amb driver exposes the FB-DIMM temperature sensors
through sysfs).

There is no problem in Windows because no driver attempts to allocate
[0xfe00-0xfe01]. However, I doubt the PNP Manager would allow
another bus pdo to claim an overlapping resource with PCI bus 0. I
suspect the offending device would yellow bang. (That would be an
interesting experiment...)

> In the meantime (until we figure out how to handle overlapping
> resources better), can we do something to actually fix this? Maybe we
> should truncate the end of the range to 0xFDFF like we do for
> non-addressable parts of the range?

Auto-fixing this seems problematic because it's essentially impossible
to determine if the resource length or the resource end or both is
wrong.

> Is there a bugzilla or a complete dmesg log to look at?

https://bugzilla.kernel.org/show_bug.cgi?id=50161

Regards,
Peter Hurley
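The rejection described in this message can be modelled in a few lines of C. This is a simplified sketch of the rule as stated here (a request must either avoid an existing window or sit entirely inside it), not the kernel's resource-tree code. `struct range`, `enum placement` and `classify_request` are hypothetical names, and the addresses in the usage note are illustrative rather than taken from the T5400.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical inclusive address range [start, end]. */
struct range {
	uint64_t start;
	uint64_t end;
};

enum placement {
	PLACE_FREE,		/* overlaps no existing window */
	PLACE_INSIDE,		/* fully contained in one window */
	PLACE_STRADDLES		/* partial overlap: would be rejected */
};

/*
 * Classify a request against the windows already registered under the
 * same parent. The first window that overlaps the request decides the
 * outcome: containment is acceptable, a partial overlap is not.
 */
static enum placement classify_request(const struct range *windows, size_t n,
				       const struct range *req)
{
	assert(req->end >= req->start);

	for (size_t i = 0; i < n; i++) {
		const struct range *w = &windows[i];

		if (req->end < w->start || req->start > w->end)
			continue;	/* no overlap with this window */
		if (req->start >= w->start && req->end <= w->end)
			return PLACE_INSIDE;
		return PLACE_STRADDLES;	/* overlaps but is not contained */
	}
	return PLACE_FREE;
}
```

With a window of `{0x1000, 0x2000}`, whose end is one byte too far, a request for `{0x2000, 0x2fff}` is classified as `PLACE_STRADDLES`, while the same request against a correctly terminated window `{0x1000, 0x1fff}` is `PLACE_FREE`.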
Re: [PATCH 4/5] x86: acpi: Print warning for malformed host bridge resources
On Wed, Nov 7, 2012 at 7:55 PM, Peter Hurley wrote:
> An incorrectly specified host bridge window may prevent
> other devices from claiming assigned resources. For example,
> this flawed _CRS resource descriptor from a Dell T5400:
>   DWordMemory (ResourceProducer, PosDecode, MinFixed, MaxFixed,
>       NonCacheable, ReadWrite,
>       0x, // Granularity
>       0xF000, // Range Minimum
>       0xFE00, // Range Maximum
>       0x, // Translation Offset
>       0x0E00, // Length
>       ,, , AddressRangeMemory, TypeStatic)

I think the problem here is that the Range Maximum should be
0xFDFF, not 0xFE00, right?

> prevents the adjacent device from claiming [mem 0xfe000-0xfe01]
>
> Sanity check that the resource at least conforms to a valid
> PCI BAR; if not, emit a diagnostic warning.
>
> Cc: Bjorn Helgaas
> Cc: Thomas Gleixner
> Cc: Ingo Molnar
> Cc: H. Peter Anvin
> Cc: x...@kernel.org
> Signed-off-by: Peter Hurley
> ---
>  arch/x86/pci/acpi.c | 4
>  1 file changed, 4 insertions(+)
>
> diff --git a/arch/x86/pci/acpi.c b/arch/x86/pci/acpi.c
> index 192397c..3468d16 100644
> --- a/arch/x86/pci/acpi.c
> +++ b/arch/x86/pci/acpi.c
> @@ -298,6 +298,10 @@ setup_resource(struct acpi_resource *acpi_res, void *data)
>                          "host bridge window [%#llx-%#llx] "
>                          "([%#llx-%#llx] ignored, not CPU addressable)\n",
>                          start, orig_end, end + 1, orig_end);
> +        } else if (flags & IORESOURCE_MEM && (start & 0x0f || ~end & 0x0f)) {
> +                dev_warn(&info->bridge->dev,
> +                         "invalid host bridge window [%#llx-%#llx]\n",
> +                         start, end);

We didn't actually *fix* anything here, so I guess we're just pointing
out the reason for a subsequent failure to claim the adjacent
resource.

As far as I know, the spec doesn't actually require resources of ACPI
devices to be non-overlapping. Windows accepts overlapping resources,
and I think Linux probably should, too, but right now we trip over
this.

In the meantime (until we figure out how to handle overlapping
resources better), can we do something to actually fix this? Maybe we
should truncate the end of the range to 0xFDFF like we do for
non-addressable parts of the range?

Is there a bugzilla or a complete dmesg log to look at?

Bjorn

>         }
>
>         res = &info->res[info->res_num];
> --
> 1.7.12.3
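The off-by-one raised in this message can be stated as a consistency rule: when both MinFixed and MaxFixed are set, Length is expected to equal Range Maximum - Range Minimum + 1. The following is a minimal sketch of that rule, with hypothetical names (`struct fixed_window`, `window_is_consistent`, `end_from_length`) and illustrative addresses. It is not existing kernel code, and it only shows what the Length field implies; it does not decide which field is the wrong one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of a fixed-range address descriptor. */
struct fixed_window {
	uint64_t min;	/* Range Minimum */
	uint64_t max;	/* Range Maximum (inclusive) */
	uint64_t len;	/* Length */
};

/* True when Length agrees with the inclusive [min, max] range. */
static bool window_is_consistent(const struct fixed_window *w)
{
	assert(w->max >= w->min);
	return w->len == w->max - w->min + 1;
}

/*
 * End address implied by Range Minimum and Length alone. When this
 * differs from Range Maximum, the descriptor itself does not say
 * whether Length or Range Maximum is the faulty field.
 */
static uint64_t end_from_length(const struct fixed_window *w)
{
	assert(w->len > 0);
	return w->min + w->len - 1;
}
```

For a descriptor such as `{ .min = 0x1000, .max = 0x2000, .len = 0x1000 }`, `window_is_consistent()` returns false and `end_from_length()` yields `0x1fff`, which is the value a truncation of the kind proposed here would pick.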
[PATCH 4/5] x86: acpi: Print warning for malformed host bridge resources
An incorrectly specified host bridge window may prevent
other devices from claiming assigned resources. For example,
this flawed _CRS resource descriptor from a Dell T5400:
  DWordMemory (ResourceProducer, PosDecode, MinFixed, MaxFixed,
      NonCacheable, ReadWrite,
      0x, // Granularity
      0xF000, // Range Minimum
      0xFE00, // Range Maximum
      0x, // Translation Offset
      0x0E00, // Length
      ,, , AddressRangeMemory, TypeStatic)
prevents the adjacent device from claiming [mem 0xfe000-0xfe01]

Sanity check that the resource at least conforms to a valid
PCI BAR; if not, emit a diagnostic warning.

Cc: Bjorn Helgaas
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: H. Peter Anvin
Cc: x...@kernel.org
Signed-off-by: Peter Hurley
---
 arch/x86/pci/acpi.c | 4
 1 file changed, 4 insertions(+)

diff --git a/arch/x86/pci/acpi.c b/arch/x86/pci/acpi.c
index 192397c..3468d16 100644
--- a/arch/x86/pci/acpi.c
+++ b/arch/x86/pci/acpi.c
@@ -298,6 +298,10 @@ setup_resource(struct acpi_resource *acpi_res, void *data)
                         "host bridge window [%#llx-%#llx] "
                         "([%#llx-%#llx] ignored, not CPU addressable)\n",
                         start, orig_end, end + 1, orig_end);
+        } else if (flags & IORESOURCE_MEM && (start & 0x0f || ~end & 0x0f)) {
+                dev_warn(&info->bridge->dev,
+                         "invalid host bridge window [%#llx-%#llx]\n",
+                         start, end);
         }

         res = &info->res[info->res_num];
--
1.7.12.3
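The condition added by the patch can be tried outside the kernel. What follows is a minimal user-space sketch, not the kernel code itself: the helper name `window_is_bar_aligned` is hypothetical, and the addresses in the usage note are illustrative values rather than the ones from the T5400. It mirrors the patch's test that a memory window starts on a 16-byte boundary and ends on the last byte of a 16-byte block.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper mirroring the condition in the patch:
 *   start & 0x0f || ~end & 0x0f
 * A PCI memory BAR is at least 16 bytes and naturally aligned, so a
 * window built from such blocks has the low four bits of start clear
 * and the low four bits of end set.
 */
static bool window_is_bar_aligned(uint64_t start, uint64_t end)
{
	assert(end >= start);	/* inclusive range */

	if (start & 0x0f)	/* start is not on a 16-byte boundary */
		return false;
	if (~end & 0x0f)	/* end is not the last byte of a block */
		return false;
	return true;
}
```

For example, `window_is_bar_aligned(0x1000, 0x2000)` is false because the end address has its low four bits clear, which is the pattern an off-by-one Range Maximum produces, while `window_is_bar_aligned(0x1000, 0x1fff)` is true.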