Re: [PATCH 4/5] x86: acpi: Print warning for malformed host bridge resources

2012-12-14 Thread Peter Hurley
Ping?

On Sun, 2012-11-11 at 09:49 -0500, Peter Hurley wrote:
> On Sat, 2012-11-10 at 14:52 -0700, Bjorn Helgaas wrote:
> > On Wed, Nov 7, 2012 at 7:55 PM, Peter Hurley wrote:
> > > An incorrectly specified host bridge window may prevent
> > > other devices from claiming assigned resources. For example,
> > > this flawed _CRS resource descriptor from a Dell T5400:
> > >  DWordMemory (ResourceProducer, PosDecode, MinFixed, MaxFixed, 
> > > NonCacheable, ReadWrite,
> > >  0x, // Granularity
> > >  0xF000, // Range Minimum
> > >  0xFE00, // Range Maximum
> > >  0x, // Translation Offset
> > >  0x0E00, // Length
> > >  ,, , AddressRangeMemory, TypeStatic)
> > 
> > I think the problem here is that the Range Maximum should be
> > 0xFDFF, not 0xFE00, right?
> 
> I presume so.
> 
> > > diff --git a/arch/x86/pci/acpi.c b/arch/x86/pci/acpi.c
> > > index 192397c..3468d16 100644
> > > --- a/arch/x86/pci/acpi.c
> > > +++ b/arch/x86/pci/acpi.c
> > > @@ -298,6 +298,10 @@ setup_resource(struct acpi_resource *acpi_res, void 
> > > *data)
> > > "host bridge window [%#llx-%#llx] "
> > > "([%#llx-%#llx] ignored, not CPU addressable)\n",
> > > start, orig_end, end + 1, orig_end);
> > > > +   } else if (flags & IORESOURCE_MEM && (start & 0x0f || ~end & 0x0f)) {
> > > +   dev_warn(&info->bridge->dev,
> > > +"invalid host bridge window [%#llx-%#llx]\n",
> > > +start, end);
> > 
> > We didn't actually *fix* anything here, so I guess we're just pointing
> > out the reason for a subsequent failure to claim the adjacent
> > resource.
> 
> Correct. There is no fix; only a diagnostic warning.
> 
> The warning is also a 'red flag' that, on this machine, it might be
> better to boot the kernel with the "pci=nocrs" option.
> 
> > As far as I know, the spec doesn't actually require resources of ACPI
> > devices to be non-overlapping.  Windows accepts overlapping resources,
> > and I think Linux probably should, too, but right now we trip over
> > this.
> 
> (note: I included a link below to the defect report which has
> the /proc/iomem, dmesg & dmidecode)
> 
> The situation is this:
> 
> The adjacent resources (northbridge & southbridge) are not defined by
> ACPI, but rather reserved with an e820 address descriptor from
> [0xfe00-0xfeff], so strictly speaking there is no overlapping
> ACPI resource.
> 
> The e820 descriptor is bumped out to [0xf000-0xfeff] and the
> malformed host bridge window is reparented to it.
> 
> At this point in the boot, there is no resource conflict.
> 
> Later in the boot, the i5k_amb driver tries to map
> [0xfe00-0xfe01] which is the FB-DIMM AMB register window on the
> Intel 5400 MCH and is rejected. The request is rejected because the
> requested range does not map completely to a single parent and this is
> not allowed. (The i5k_amb driver exposes the FB-DIMM temperature sensors
> through sysfs).
> 
> There is no problem in Windows because no driver attempts to allocate
> [0xfe00-0xfe01]. However, I doubt the PNP Manager would allow
> another bus pdo to claim an overlapping resource with PCI bus 0. I
> suspect the offending device would yellow bang. (That would be an
> interesting experiment...)
> 
> > In the meantime (until we figure out how to handle overlapping
> > resources better), can we do something to actually fix this?  Maybe we
> > should truncate the end of the range to 0xFDFF like we do for
> > non-addressable parts of the range?
> 
> Auto-fixing this seems problematic because it's essentially impossible
> to determine whether the resource length, the resource end, or both are
> wrong.
> 
> > Is there a bugzilla or a complete dmesg log to look at?
> 
> https://bugzilla.kernel.org/show_bug.cgi?id=50161
> 
> Regards,
> Peter Hurley
> 
> 


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 4/5] x86: acpi: Print warning for malformed host bridge resources

2012-11-11 Thread Peter Hurley
On Sat, 2012-11-10 at 14:52 -0700, Bjorn Helgaas wrote:
> On Wed, Nov 7, 2012 at 7:55 PM, Peter Hurley  wrote:
> > An incorrectly specified host bridge window may prevent
> > other devices from claiming assigned resources. For example,
> > this flawed _CRS resource descriptor from a Dell T5400:
> >  DWordMemory (ResourceProducer, PosDecode, MinFixed, MaxFixed, 
> > NonCacheable, ReadWrite,
> >  0x, // Granularity
> >  0xF000, // Range Minimum
> >  0xFE00, // Range Maximum
> >  0x, // Translation Offset
> >  0x0E00, // Length
> >  ,, , AddressRangeMemory, TypeStatic)
> 
> I think the problem here is that the Range Maximum should be
> 0xFDFF, not 0xFE00, right?

I presume so.

> > diff --git a/arch/x86/pci/acpi.c b/arch/x86/pci/acpi.c
> > index 192397c..3468d16 100644
> > --- a/arch/x86/pci/acpi.c
> > +++ b/arch/x86/pci/acpi.c
> > @@ -298,6 +298,10 @@ setup_resource(struct acpi_resource *acpi_res, void 
> > *data)
> > "host bridge window [%#llx-%#llx] "
> > "([%#llx-%#llx] ignored, not CPU addressable)\n",
> > start, orig_end, end + 1, orig_end);
> > > +   } else if (flags & IORESOURCE_MEM && (start & 0x0f || ~end & 0x0f)) {
> > +   dev_warn(&info->bridge->dev,
> > +"invalid host bridge window [%#llx-%#llx]\n",
> > +start, end);
> 
> We didn't actually *fix* anything here, so I guess we're just pointing
> out the reason for a subsequent failure to claim the adjacent
> resource.

Correct. There is no fix; only a diagnostic warning.

The warning is also a 'red flag' that, on this machine, it might be
better to boot the kernel with the "pci=nocrs" option.
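
For readers who want to see what the new condition tests in isolation, here
is a minimal standalone sketch. The function name window_looks_malformed()
is hypothetical and not part of the patch; only the bit test itself is taken
from the diff. The reasoning, as I understand it, is that a PCI memory BAR
keeps its low four bits for flags, so any region a memory BAR can describe
starts on a 16-byte boundary and ends on the last byte of a 16-byte block.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper mirroring the condition added by the patch.
 * A memory window is flagged when its start is not 16-byte aligned
 * (low four bits of start are non-zero) or when its end is not the
 * last byte of a 16-byte block (low four bits of end are not all set).
 */
bool window_looks_malformed(uint64_t start, uint64_t end)
{
	return (start & 0x0f) || (~end & 0x0f);
}
```

With made-up addresses, a window of [0x1000-0x1fff] passes the check, while
[0x1000-0x2000] is flagged because its end runs one byte into the next block.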

> As far as I know, the spec doesn't actually require resources of ACPI
> devices to be non-overlapping.  Windows accepts overlapping resources,
> and I think Linux probably should, too, but right now we trip over
> this.

(note: I included a link below to the defect report which has
the /proc/iomem, dmesg & dmidecode)

The situation is this:

The adjacent resources (northbridge & southbridge) are not defined by
ACPI, but rather reserved with an e820 address descriptor from
[0xfe00-0xfeff], so strictly speaking there is no overlapping
ACPI resource.

The e820 descriptor is bumped out to [0xf000-0xfeff] and the
malformed host bridge window is reparented to it.

At this point in the boot, there is no resource conflict.

Later in the boot, the i5k_amb driver tries to map
[0xfe00-0xfe01] which is the FB-DIMM AMB register window on the
Intel 5400 MCH and is rejected. The request is rejected because the
requested range does not map completely to a single parent and this is
not allowed. (The i5k_amb driver exposes the FB-DIMM temperature sensors
through sysfs).
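
The rejection can be illustrated with a simplified model. This is only a
sketch of the idea, not the kernel's resource code: struct range and
straddles() are hypothetical names, ranges are inclusive, and the addresses
in the note below are made up rather than taken from this machine.

```c
#include <stdbool.h>
#include <stdint.h>

/* Inclusive address range; a hypothetical stand-in for a resource. */
struct range {
	uint64_t start;
	uint64_t end;
};

/*
 * Returns true when req partially collides with an existing window:
 * the two ranges overlap, yet req does not fit entirely inside the
 * existing one. Such a request has no single parent to live under.
 */
bool straddles(const struct range *existing, const struct range *req)
{
	bool overlaps = req->start <= existing->end &&
			existing->start <= req->end;
	bool inside = existing->start <= req->start &&
		      req->end <= existing->end;

	return overlaps && !inside;
}
```

If a window is declared as [0x1000-0x2000] when it should end at 0x1fff, a
request for [0x2000-0x2fff] overlaps it by a single byte and straddles() is
true. With the window corrected to [0x1000-0x1fff], the same request no
longer touches it.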

There is no problem in Windows because no driver attempts to allocate
[0xfe00-0xfe01]. However, I doubt the PNP Manager would allow
another bus pdo to claim an overlapping resource with PCI bus 0. I
suspect the offending device would yellow bang. (That would be an
interesting experiment...)

> In the meantime (until we figure out how to handle overlapping
> resources better), can we do something to actually fix this?  Maybe we
> should truncate the end of the range to 0xFDFF like we do for
> non-addressable parts of the range?

Auto-fixing this seems problematic because it's essentially impossible
to determine whether the resource length, the resource end, or both are
wrong.
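
To make the ambiguity concrete, here is a small sketch. It assumes the usual
rule for a descriptor with both MinFixed and MaxFixed set, namely that
Length equals Maximum - Minimum + 1. The struct and function names are
hypothetical, the sketch assumes len > 0 and ignores overflow at the top of
the address space.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a window with both ends fixed. */
struct fixed_window {
	uint64_t min;	/* Range Minimum */
	uint64_t max;	/* Range Maximum (inclusive) */
	uint64_t len;	/* Length */
};

/* Consistent when len covers exactly the bytes from min to max. */
bool window_is_consistent(const struct fixed_window *w)
{
	return w->max >= w->min && w->len == w->max - w->min + 1;
}

/* One possible repair: trust len and recompute max. */
struct fixed_window fix_by_trusting_len(struct fixed_window w)
{
	w.max = w.min + w.len - 1;
	return w;
}

/* The other possible repair: trust max and recompute len. */
struct fixed_window fix_by_trusting_max(struct fixed_window w)
{
	w.len = w.max - w.min + 1;
	return w;
}
```

For a made-up descriptor with min 0x1000, max 0x2000 and len 0x1000, the
first repair yields max 0x1fff and the second yields len 0x1001. Both
results are self-consistent, and nothing in the descriptor says which one
the firmware author intended.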

> Is there a bugzilla or a complete dmesg log to look at?

https://bugzilla.kernel.org/show_bug.cgi?id=50161

Regards,
Peter Hurley





Re: [PATCH 4/5] x86: acpi: Print warning for malformed host bridge resources

2012-11-10 Thread Bjorn Helgaas
On Wed, Nov 7, 2012 at 7:55 PM, Peter Hurley  wrote:
> An incorrectly specified host bridge window may prevent
> other devices from claiming assigned resources. For example,
> this flawed _CRS resource descriptor from a Dell T5400:
>  DWordMemory (ResourceProducer, PosDecode, MinFixed, MaxFixed, 
> NonCacheable, ReadWrite,
>  0x, // Granularity
>  0xF000, // Range Minimum
>  0xFE00, // Range Maximum
>  0x, // Translation Offset
>  0x0E00, // Length
>  ,, , AddressRangeMemory, TypeStatic)

I think the problem here is that the Range Maximum should be
0xFDFF, not 0xFE00, right?

> prevents the adjacent device from claiming [mem 0xfe000-0xfe01]
>
> Sanity check that the resource at least conforms to a valid
> PCI BAR; if not, emit a diagnostic warning.
>
> Cc: Bjorn Helgaas 
> Cc: Thomas Gleixner 
> Cc: Ingo Molnar 
> Cc: H. Peter Anvin 
> Cc: x...@kernel.org
> Signed-off-by: Peter Hurley 
> ---
>  arch/x86/pci/acpi.c | 4 
>  1 file changed, 4 insertions(+)
>
> diff --git a/arch/x86/pci/acpi.c b/arch/x86/pci/acpi.c
> index 192397c..3468d16 100644
> --- a/arch/x86/pci/acpi.c
> +++ b/arch/x86/pci/acpi.c
> @@ -298,6 +298,10 @@ setup_resource(struct acpi_resource *acpi_res, void *data)
> "host bridge window [%#llx-%#llx] "
> "([%#llx-%#llx] ignored, not CPU addressable)\n",
> start, orig_end, end + 1, orig_end);
> +   } else if (flags & IORESOURCE_MEM && (start & 0x0f || ~end & 0x0f)) {
> +   dev_warn(&info->bridge->dev,
> +"invalid host bridge window [%#llx-%#llx]\n",
> +start, end);

We didn't actually *fix* anything here, so I guess we're just pointing
out the reason for a subsequent failure to claim the adjacent
resource.

As far as I know, the spec doesn't actually require resources of ACPI
devices to be non-overlapping.  Windows accepts overlapping resources,
and I think Linux probably should, too, but right now we trip over
this.

In the meantime (until we figure out how to handle overlapping
resources better), can we do something to actually fix this?  Maybe we
should truncate the end of the range to 0xFDFF like we do for
non-addressable parts of the range?

Is there a bugzilla or a complete dmesg log to look at?

Bjorn

> }
>
> res = &info->res[info->res_num];
> --
> 1.7.12.3
>


[PATCH 4/5] x86: acpi: Print warning for malformed host bridge resources

2012-11-07 Thread Peter Hurley
An incorrectly specified host bridge window may prevent
other devices from claiming assigned resources. For example,
this flawed _CRS resource descriptor from a Dell T5400:
 DWordMemory (ResourceProducer, PosDecode, MinFixed, MaxFixed, 
NonCacheable, ReadWrite,
 0x, // Granularity
 0xF000, // Range Minimum
 0xFE00, // Range Maximum
 0x, // Translation Offset
 0x0E00, // Length
 ,, , AddressRangeMemory, TypeStatic)
prevents the adjacent device from claiming [mem 0xfe000-0xfe01]

Sanity check that the resource at least conforms to a valid
PCI BAR; if not, emit a diagnostic warning.

Cc: Bjorn Helgaas 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: H. Peter Anvin 
Cc: x...@kernel.org
Signed-off-by: Peter Hurley 
---
 arch/x86/pci/acpi.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/arch/x86/pci/acpi.c b/arch/x86/pci/acpi.c
index 192397c..3468d16 100644
--- a/arch/x86/pci/acpi.c
+++ b/arch/x86/pci/acpi.c
@@ -298,6 +298,10 @@ setup_resource(struct acpi_resource *acpi_res, void *data)
"host bridge window [%#llx-%#llx] "
"([%#llx-%#llx] ignored, not CPU addressable)\n", 
start, orig_end, end + 1, orig_end);
+   } else if (flags & IORESOURCE_MEM && (start & 0x0f || ~end & 0x0f)) {
+   dev_warn(&info->bridge->dev,
+"invalid host bridge window [%#llx-%#llx]\n",
+start, end);
}
 
res = &info->res[info->res_num];
-- 
1.7.12.3
