Re: [PATCH]PCI:disable resource decode in PCI BAR detection

2007-09-26 Thread Matthew Wilcox
On Thu, Sep 27, 2007 at 10:40:33AM +1000, Benjamin Herrenschmidt wrote:
> 
> > How is this a change in behavior as far as this device is concerned? If 
> > we are doing BAR sizing and moving the base address around, it's going 
> > to cause problems if you try to access the device during this time 
> > whether we disable decode or not.
> 
> True. The window is smaller tho if the upper bridge decode hasn't been
> disabled. I suppose I'll have to find a way to "pin" those devices and
> generate the BAR info from the firmware data.

The upper bridge decode only gets disabled while we size the upper
bridge's BARs.  Then we reenable it.

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH]PCI:disable resource decode in PCI BAR detection

2007-09-26 Thread Benjamin Herrenschmidt

> How is this a change in behavior as far as this device is concerned? If 
> we are doing BAR sizing and moving the base address around, it's going 
> to cause problems if you try to access the device during this time 
> whether we disable decode or not.

True. The window is smaller tho if the upper bridge decode hasn't been
disabled. I suppose I'll have to find a way to "pin" those devices and
generate the BAR info from the firmware data.

Ben.


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH]PCI:disable resource decode in PCI BAR detection

2007-09-26 Thread Robert Hancock

Benjamin Herrenschmidt wrote:

On Mon, 2007-09-17 at 14:22 +0400, Ivan Kokshaysky wrote:

On Sun, Sep 16, 2007 at 10:01:52PM +0200, Benjamin Herrenschmidt wrote:

Agreed. I have a similar problem on ppc where it's common to have things
like the main PIC on a PCI device. Note that another problem is (or at
least was, i haven't checked recently) the P2P bridge scanning code
that, in a similar way, can block the path to all devices below it. I
-do- have a case for example with Apple Xserve G4's where the main Apple
IO ASIC, which is a PCI device containing the PIC, the power management
controller, and various low level system control IOs is behind a pair of
P2P bridges.

I think the P2P probing code is pretty safe now - there are read-only
accesses to the bridge config, unless you request to reassign the bus
numbers. Though it won't be safe anymore with the patch in question.


In which case I will need to NAK the patch... Note that those Xserve
G4's still have the subtle issue that they -also- reassign bus
numbers :-) But that's going away the day I finally enable domains
support for ppc32 (it's been off for now due to problems with X)


How is this a change in behavior as far as this device is concerned? If 
we are doing BAR sizing and moving the base address around, it's going 
to cause problems if you try to access the device during this time 
whether we disable decode or not.


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH]PCI:disable resource decode in PCI BAR detection

2007-09-19 Thread Jesse Barnes
On Tuesday, September 18, 2007 2:53 am Ivan Kokshaysky wrote:
> On Mon, Sep 17, 2007 at 09:21:47AM +0800, Shaohua Li wrote:
> > I can confirm this is an add-in graphics card. the bfd is 00:02.0,
> > so it's not behind any AGP/PCI-E bridge.
>
> AFAIKS, 00:02.0 is *integrated* controller. Can you check that
> "graphic adapter priority" setting in BIOS is "PCI Express" and not
> "Internal VGA"? In the latter case an add-on card might be turned
> completely off, so it doesn't even show up in lspci output.

Yeah, it's integrated gfx.  See the PRM at intel.com for the decode 
rules.

I've been following this thread and I see a lot of fear about moving to 
probing BARs as outlined in the PCI spec.  I haven't seen much in the 
way of hard evidence though, mostly just handwaving or red herrings 
(and even one comment implying that -mm wasn't a real testbed for 
patches) that don't have much to do with the actual "disable, size, 
re-enable" logic.  Has a conclusion been reached yet?

Keep in mind that any failures that occur due to this patch should be 
easy to track down (boot hang), but we have yet to see any in real 
life.  Moreover, reversion is trivial, and we could move to a more 
complex scheme at that time if needed, but why bother unless we're sure 
we need to?

Thanks,
Jesse
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH]PCI:disable resource decode in PCI BAR detection

2007-09-18 Thread Ivan Kokshaysky
On Tue, Sep 18, 2007 at 06:30:23AM +1000, Benjamin Herrenschmidt wrote:
> At this stage (but we are getting a bit OT), ppc has something like 3
> different PCI code implementations :-) I do have some plans to fix that
> by switching everybody to use pci_assign_unassigned_resources() and
> friends but last time I tried, everything blew up :-) I suspect I'll
> need a quirk or two in the generic code, but I'll let you know when I
> get to it.

Ok, I'll be happy to look into that :-)

Ivan.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH]PCI:disable resource decode in PCI BAR detection

2007-09-18 Thread Ivan Kokshaysky
On Mon, Sep 17, 2007 at 09:21:47AM +0800, Shaohua Li wrote:
> I can confirm this is an add-in graphics card. the bfd is 00:02.0, so
> it's not behind any AGP/PCI-E bridge.

AFAIKS, 00:02.0 is *integrated* controller. Can you check that "graphic
adapter priority" setting in BIOS is "PCI Express" and not "Internal VGA"?
In the latter case an add-on card might be turned completely off, so
it doesn't even show up in lspci output.

Ivan.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH]PCI:disable resource decode in PCI BAR detection

2007-09-17 Thread Benjamin Herrenschmidt
On Mon, 2007-09-17 at 14:22 +0400, Ivan Kokshaysky wrote:
> On Sun, Sep 16, 2007 at 10:01:52PM +0200, Benjamin Herrenschmidt wrote:
> > Agreed. I have a similar problem on ppc where it's common to have things
> > like the main PIC on a PCI device. Note that another problem is (or at
> > least was, i haven't checked recently) the P2P bridge scanning code
> > that, in a similar way, can block the path to all devices below it. I
> > -do- have a case for example with Apple Xserve G4's where the main Apple
> > IO ASIC, which is a PCI device containing the PIC, the power management
> > controller, and various low level system control IOs is behind a pair of
> > P2P bridges.
> 
> I think the P2P probing code is pretty safe now - there are read-only
> accesses to the bridge config, unless you request to reassign the bus
> numbers. Though it won't be safe anymore with the patch in question.

In which case I will need to NAK the patch... Note that those Xserve
G4's still have the subtle issue that they -also- reassign bus
numbers :-) But that's going away the day I finally enable domains
support for ppc32 (it's been off for now due to problems with X)

> > One solution for us (PPC) is to enforce those devices and bridges to be
> > described in the OF tree, and generalize a bit the code we have for some
> > 64 bits machines, that synthetizes the pci_dev's from the OF nodes
> > rather than probing. But that's not going to help other archs.
> 
> If you can get reliable PCI info from firmware, it should be relatively easy
> to avoid at least a bar sizing. You can install an "early" fixup for
> PCI_ANY_ID and fill the resource fields of the pci_dev with values obtained
> from firmware. Then all we need in probe.c is just to check that the resource
> is already non-zero and skip the sizing of respective BAR, if so.

Right now, we have code to completely build a pci_dev from the firmware
infos. We only use it on 64 bits pseries currently though. 

> > In fact, that's a problem we also have with
> > pci_assign_unassigned_resources() which will happily move things around
> > that must not be moved, especially when sitting behind P2P bridges.
> 
> It's not supposed to do that. Certainly, there were problems of that sort,
> but hopefully they are in the past.

At this stage (but we are getting a bit OT), ppc has something like 3
different PCI code implementations :-) I do have some plans to fix that
by switching everybody to use pci_assign_unassigned_resources() and
friends but last time I tried, everything blew up :-) I suspect I'll
need a quirk or two in the generic code, but I'll let you know when I
get to it.

Cheers,
Ben.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH]PCI:disable resource decode in PCI BAR detection

2007-09-17 Thread Robert Hancock

Ivan Kokshaysky wrote:

On Sun, Sep 16, 2007 at 01:52:59PM -0600, Matthew Wilcox wrote:

I don't think it would be that complicated.  We could just delay probing
for mmconfig until after the pci bus probes are done.  No changes to
other architectures needed.


Indeed. Seems like it's a way to go.


With another patch pending for 2.6.24, to validate the MMCONFIG aperture 
against the ACPI motherboard resources instead of the E820 table, we 
have to delay switching to MMCONFIG until the ACPI interpreter is 
available (unless no other configuration mechanism is available). It 
might thus be feasible to delay it further.


Nevertheless, I think this is a rather ugly workaround for the 
underlying problem.


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH]PCI:disable resource decode in PCI BAR detection

2007-09-17 Thread Ivan Kokshaysky
On Sun, Sep 16, 2007 at 10:01:52PM +0200, Benjamin Herrenschmidt wrote:
> Agreed. I have a similar problem on ppc where it's common to have things
> like the main PIC on a PCI device. Note that another problem is (or at
> least was, i haven't checked recently) the P2P bridge scanning code
> that, in a similar way, can block the path to all devices below it. I
> -do- have a case for example with Apple Xserve G4's where the main Apple
> IO ASIC, which is a PCI device containing the PIC, the power management
> controller, and various low level system control IOs is behind a pair of
> P2P bridges.

I think the P2P probing code is pretty safe now - there are read-only
accesses to the bridge config, unless you request to reassign the bus
numbers. Though it won't be safe anymore with the patch in question.

> One solution for us (PPC) is to enforce those devices and bridges to be
> described in the OF tree, and generalize a bit the code we have for some
> 64 bits machines, that synthetizes the pci_dev's from the OF nodes
> rather than probing. But that's not going to help other archs.

If you can get reliable PCI info from firmware, it should be relatively easy
to avoid at least a bar sizing. You can install an "early" fixup for
PCI_ANY_ID and fill the resource fields of the pci_dev with values obtained
from firmware. Then all we need in probe.c is just to check that the resource
is already non-zero and skip the sizing of respective BAR, if so.

> In fact, that's a problem we also have with
> pci_assign_unassigned_resources() which will happily move things around
> that must not be moved, especially when sitting behind P2P bridges.

It's not supposed to do that. Certainly, there were problems of that sort,
but hopefully they are in the past.

Ivan.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH]PCI:disable resource decode in PCI BAR detection

2007-09-17 Thread Ivan Kokshaysky
On Sun, Sep 16, 2007 at 01:52:59PM -0600, Matthew Wilcox wrote:
> I don't think it would be that complicated.  We could just delay probing
> for mmconfig until after the pci bus probes are done.  No changes to
> other architectures needed.

Indeed. Seems like it's a way to go.

> I'm really disappointed with mmconfig.  Here was a great chance to get
> rid of all the sucky performance problems of the past, and BIOS writers
> and chipset designers have fucked up the implementation so much that
> it's now the ugliest method for getting to pci config space.  And it's
> the *only* method of getting to extended config space.

Same feeling...

Fortunately, it seems that we don't access the extended config space
during bus probe.

Ivan.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH]PCI:disable resource decode in PCI BAR detection

2007-09-17 Thread Ivan Kokshaysky
On Sun, Sep 16, 2007 at 11:34:35AM -0600, Robert Hancock wrote:
> > The most interesting fact there is that windows *does not* clear
> > PCI_COMMAND_MEMORY bits during PCI bus enumeration, so BIOS writers
> > have to rely on that. Yet another reason why your patch is unsafe.
> 
> Windows 98 doesn't, it says. That doesn't say anything about what newer 
> versions of Windows are doing (like Vista, which is the first to 
> actually use MMCONFIG).

But we're not going to drop support for hardware "designed for Windows 98",
aren't we?

> Check the code. We have a quirk to handle this, in 
> drivers/usb/host/pci-quirks.c

It doesn't help. It's a "final" fixup, so it runs much later than BAR sizing.

Ivan.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH]PCI:disable resource decode in PCI BAR detection

2007-09-16 Thread Shaohua Li
On Sun, 2007-09-16 at 15:13 +0400, Ivan Kokshaysky wrote:
> On Fri, Sep 14, 2007 at 05:53:58PM -0600, Robert Hancock wrote:
> > In the first one, Linus talks about a USB controller whose SMM code 
> > chokes on the BAR being disabled. That explanation seems odd to me. If 
> > it chokes on the BAR disabled, how doesn't it choke on the BAR being 
> > moved to a different location, which is unavoidable during probing?
> 
> Here is an article with detailed explanation of that USB issue:
> 
>  http://support.microsoft.com/kb/250635
> 
> The most interesting fact there is that windows *does not* clear
> PCI_COMMAND_MEMORY bits during PCI bus enumeration, so BIOS writers
> have to rely on that. Yet another reason why your patch is unsafe.
> 
> > Also, I think we do USB handoff from the BIOS much earlier than we used 
> > to, which likely prevents these issues.
> 
> I don't think so. This can only be done in USB driver, which cannot run
> before PCI device discovery.
Most BIOS has an option called legacy emulation, which will emulate USB
mouse as legacy mouse. BIOS is using USB before driver is loaded.

If moving BAR work, disabling BAR should work too.

> > In the second one, he mentions that "So the secondary rule to "don't 
> > turnoff MEM or IO accesses" is "never allocate real PCI BAR resources at 
> > the top of memory". Well, unfortunately, the second one isn't possible 
> > to meet given that we have boards with the MMCONFIG up there, so 
> > disabling the decode is the only thing we can do. That whole discussion 
> > was also based on the fact that it wasn't necessary to solve problems. 
> > This is no longer a theoretical problem. We now have real problems that 
> > we need this in order to solve.
> 
> I have suggested a *practical* solution that disables the decode only
> when it's unavoidable. If it's not sufficient for some machines, it
> can be extended quite easily.
If you just want to workaroud the issue at hand, I'd take the method to
not use mmcfg at probe stage.

> > > No, it's impossible for several reasons. Most obvious one is that a PCI-E
> > > bridge does isolate stuff quite nicely.
> > 
> > It's not impossible at all. In fact I'm quite sure (Jesse can confirm) 
> > that in the case of the board he was using, it was an add-in graphics 
> > card where he saw this problem.
> 
> Again, it's impossible with AGP or PCI-E cards, which are always
> separated
> from the root bus by AGP or PCI-E bridge. The bridge does decode in
> the
> first place, so when you write the BAR with all ones, you move it
> outside
> the bridge decoding window. Address collision isn't possible in this
> case in principle.
I can confirm this is an add-in graphics card. the bfd is 00:02.0, so
it's not behind any AGP/PCI-E bridge.

Thanks,
Shaohua
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH]PCI:disable resource decode in PCI BAR detection

2007-09-16 Thread Benjamin Herrenschmidt
On Sun, 2007-09-16 at 17:37 -0600, Robert Hancock wrote:
> We would already have this problem, though. If it causes problems to 
> disable decode on the BAR because we try to access it in interrupt 
> context, we would already have problems because we move the thing to 
> 0x during probing anyway.. 

Yes, it's not a new problem, I was just pointing out the fact that there
are deeper issues already there in the same area.

Today, we mostly get lucky because we rarely take an irq during boot
time PCI probe but I've seen it happen (one easy way to screw things up
on -many- machines is to enable the DEBUG stuff in the PCI probe code
and use a serial console for example, especially if access to the device
with the serial ports in it gets temporarily cut off).

Cheers,
Ben.


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH]PCI:disable resource decode in PCI BAR detection

2007-09-16 Thread Robert Hancock

Benjamin Herrenschmidt wrote:

On Thu, 2007-09-13 at 21:32 -0600, Robert Hancock wrote:

If we do encounter other devices that choke on having the BAR
disabled 
during probing then we can add additional quirk logic, but we haven't 
run into anything like that yet.


Well... if the device needs to be accessed to service an interrupt then
you do have a problem. For example... the PIC :-)

Problem is.. it's not practical nor really feasible generally to have
IRQs off on all CPUs during PCI probing neither... Unless we define that
the initial boot time probing is "special", and the first pass that
actually probes devices (and doesn't muck around with the sysfs
hierarchy etc...) can be run in a special context with all interrupt
servicing disabled on the PIC, though that will require some arch
support.

Ben.


We would already have this problem, though. If it causes problems to 
disable decode on the BAR because we try to access it in interrupt 
context, we would already have problems because we move the thing to 
0x during probing anyway..


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH]PCI:disable resource decode in PCI BAR detection

2007-09-16 Thread Benjamin Herrenschmidt
On Thu, 2007-09-13 at 21:32 -0600, Robert Hancock wrote:
> 
> If we do encounter other devices that choke on having the BAR
> disabled 
> during probing then we can add additional quirk logic, but we haven't 
> run into anything like that yet.

Well... if the device needs to be accessed to service an interrupt then
you do have a problem. For example... the PIC :-)

Problem is.. it's not practical nor really feasible generally to have
IRQs off on all CPUs during PCI probing neither... Unless we define that
the initial boot time probing is "special", and the first pass that
actually probes devices (and doesn't muck around with the sysfs
hierarchy etc...) can be run in a special context with all interrupt
servicing disabled on the PIC, though that will require some arch
support.

Ben.


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH]PCI:disable resource decode in PCI BAR detection

2007-09-16 Thread Benjamin Herrenschmidt
On Thu, 2007-09-13 at 15:16 +0400, Ivan Kokshaysky wrote:
> On Thu, Sep 13, 2007 at 02:53:13AM -0700, Greg KH wrote:
> > On Thu, Sep 13, 2007 at 01:55:36AM -0600, Matthew Wilcox wrote:
> > > Unfortunately if this patch does cause any machine to break, these will
> > > be machines that worked fine up until this point, so that would be a
> > > regression, which is worse.  Life sucks.
> > 
> > If, after a while, you think the change should go into the -stable tree,
> > I have no objection.
> 
> I think it shouldn't - this change will almost certainly cause a regression.
> There is a lot of system devices besides the host bridges that shouldn't be
> disabled during BAR probe, like interrupt controllers, power management
> controllers and so on.
> 
> We need a more sophisticated fix - I'm thinking of introducing "probe" field
> in struct pci_dev which can be set by "early" quirk routines.

Agreed. I have a similar problem on ppc where it's common to have things
like the main PIC on a PCI device. Note that another problem is (or at
least was, i haven't checked recently) the P2P bridge scanning code
that, in a similar way, can block the path to all devices below it. I
-do- have a case for example with Apple Xserve G4's where the main Apple
IO ASIC, which is a PCI device containing the PIC, the power management
controller, and various low level system control IOs is behind a pair of
P2P bridges.

One solution for us (PPC) is to enforce those devices and bridges to be
described in the OF tree, and generalize a bit the code we have for some
64 bits machines, that synthetizes the pci_dev's from the OF nodes
rather than probing. But that's not going to help other archs.

In fact, that's a problem we also have with
pci_assign_unassigned_resources() which will happily move things around
that must not be moved, especially when sitting behind P2P bridges.

So the root of the issue is much deeper than just a quirk here I
believe.

Cheers,
Ben.


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH]PCI:disable resource decode in PCI BAR detection

2007-09-16 Thread Matthew Wilcox
On Sun, Sep 16, 2007 at 03:13:55PM +0400, Ivan Kokshaysky wrote:
> Another possible solution is not to use mmconfig for bar sizing at all,
> if it's *that* broken. It would be more complicated though, since it
> probably requires changes to all architectures.

I don't think it would be that complicated.  We could just delay probing
for mmconfig until after the pci bus probes are done.  No changes to
other architectures needed.

I'm really disappointed with mmconfig.  Here was a great chance to get
rid of all the sucky performance problems of the past, and BIOS writers
and chipset designers have fucked up the implementation so much that
it's now the ugliest method for getting to pci config space.  And it's
the *only* method of getting to extended config space.

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH]PCI:disable resource decode in PCI BAR detection

2007-09-16 Thread Robert Hancock

Ivan Kokshaysky wrote:

On Fri, Sep 14, 2007 at 05:53:58PM -0600, Robert Hancock wrote:
In the first one, Linus talks about a USB controller whose SMM code 
chokes on the BAR being disabled. That explanation seems odd to me. If 
it chokes on the BAR disabled, how doesn't it choke on the BAR being 
moved to a different location, which is unavoidable during probing?


Here is an article with detailed explanation of that USB issue:

 http://support.microsoft.com/kb/250635

The most interesting fact there is that windows *does not* clear
PCI_COMMAND_MEMORY bits during PCI bus enumeration, so BIOS writers
have to rely on that. Yet another reason why your patch is unsafe.


Windows 98 doesn't, it says. That doesn't say anything about what newer 
versions of Windows are doing (like Vista, which is the first to 
actually use MMCONFIG).




Also, I think we do USB handoff from the BIOS much earlier than we used 
to, which likely prevents these issues.


I don't think so. This can only be done in USB driver, which cannot run
before PCI device discovery.


Check the code. We have a quirk to handle this, in 
drivers/usb/host/pci-quirks.c




In the second one, he mentions that "So the secondary rule to "don't 
turnoff MEM or IO accesses" is "never allocate real PCI BAR resources at 
the top of memory". Well, unfortunately, the second one isn't possible 
to meet given that we have boards with the MMCONFIG up there, so 
disabling the decode is the only thing we can do. That whole discussion 
was also based on the fact that it wasn't necessary to solve problems. 
This is no longer a theoretical problem. We now have real problems that 
we need this in order to solve.


I have suggested a *practical* solution that disables the decode only
when it's unavoidable. If it's not sufficient for some machines, it
can be extended quite easily.


No, it's impossible for several reasons. Most obvious one is that a PCI-E
bridge does isolate stuff quite nicely.
It's not impossible at all. In fact I'm quite sure (Jesse can confirm) 
that in the case of the board he was using, it was an add-in graphics 
card where he saw this problem.


Again, it's impossible with AGP or PCI-E cards, which are always separated
from the root bus by AGP or PCI-E bridge. The bridge does decode in the
first place, so when you write the BAR with all ones, you move it outside
the bridge decoding window. Address collision isn't possible in this
case in principle.


And yet, something was clearly happening to cause this. Jesse, what were 
the bridge decode windows (shown in lspci -v) on one of those boards 
that was having issues?




The fact is that in the case of MMCONFIG overlap with PCI BARs, which 
one takes priority is completely undefined. In the case of this Intel 
chipset, clearly the PCI Express device connected to the northbridge had 
higher decode priority than the MMCONFIG aperture.


Another possible solution is not to use mmconfig for bar sizing at all,
if it's *that* broken. It would be more complicated though, since it
probably requires changes to all architectures.


It's not broken at all, it just requires that we not do silly things 
like stick enabled decode regions over top of the aperture.


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH]PCI:disable resource decode in PCI BAR detection

2007-09-16 Thread Ivan Kokshaysky
On Fri, Sep 14, 2007 at 05:53:58PM -0600, Robert Hancock wrote:
> In the first one, Linus talks about a USB controller whose SMM code 
> chokes on the BAR being disabled. That explanation seems odd to me. If 
> it chokes on the BAR disabled, how doesn't it choke on the BAR being 
> moved to a different location, which is unavoidable during probing?

Here is an article with detailed explanation of that USB issue:

 http://support.microsoft.com/kb/250635

The most interesting fact there is that windows *does not* clear
PCI_COMMAND_MEMORY bits during PCI bus enumeration, so BIOS writers
have to rely on that. Yet another reason why your patch is unsafe.

> Also, I think we do USB handoff from the BIOS much earlier than we used 
> to, which likely prevents these issues.

I don't think so. This can only be done in USB driver, which cannot run
before PCI device discovery.

> In the second one, he mentions that "So the secondary rule to "don't 
> turnoff MEM or IO accesses" is "never allocate real PCI BAR resources at 
> the top of memory". Well, unfortunately, the second one isn't possible 
> to meet given that we have boards with the MMCONFIG up there, so 
> disabling the decode is the only thing we can do. That whole discussion 
> was also based on the fact that it wasn't necessary to solve problems. 
> This is no longer a theoretical problem. We now have real problems that 
> we need this in order to solve.

I have suggested a *practical* solution that disables the decode only
when it's unavoidable. If it's not sufficient for some machines, it
can be extended quite easily.

> > No, it's impossible for several reasons. Most obvious one is that a PCI-E
> > bridge does isolate stuff quite nicely.
> 
> It's not impossible at all. In fact I'm quite sure (Jesse can confirm) 
> that in the case of the board he was using, it was an add-in graphics 
> card where he saw this problem.

Again, it's impossible with AGP or PCI-E cards, which are always separated
from the root bus by AGP or PCI-E bridge. The bridge does decode in the
first place, so when you write the BAR with all ones, you move it outside
the bridge decoding window. Address collision isn't possible in this
case in principle.

> The fact is that in the case of MMCONFIG overlap with PCI BARs, which 
> one takes priority is completely undefined. In the case of this Intel 
> chipset, clearly the PCI Express device connected to the northbridge had 
> higher decode priority than the MMCONFIG aperture.

Another possible solution is not to use mmconfig for bar sizing at all,
if it's *that* broken. It would be more complicated though, since it
probably requires changes to all architectures.

Ivan.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH]PCI:disable resource decode in PCI BAR detection

2007-09-15 Thread Robert Hancock

Yinghai Lu wrote:

On 9/14/07, Robert Hancock <[EMAIL PROTECTED]> wrote:

It's not impossible at all. In fact I'm quite sure (Jesse can confirm)
that in the case of the board he was using, it was an add-in graphics
card where he saw this problem.

The fact is that in the case of MMCONFIG overlap with PCI BARs, which
one takes priority is completely undefined. In the case of this Intel
chipset, clearly the PCI Express device connected to the northbridge had
higher decode priority than the MMCONFIG aperture.


can you relocate the MMCONFIG above RAM range? for example 512G...


On some chipsets that might be possible, however I think that on the 
Intel ones it's not possible to move it above 4G. And in any case this 
would require chipset-specific knowledge, which we would rather not have 
to need.


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH]PCI:disable resource decode in PCI BAR detection

2007-09-14 Thread Yinghai Lu
On 9/14/07, Robert Hancock <[EMAIL PROTECTED]> wrote:
> It's not impossible at all. In fact I'm quite sure (Jesse can confirm)
> that in the case of the board he was using, it was an add-in graphics
> card where he saw this problem.
>
> The fact is that in the case of MMCONFIG overlap with PCI BARs, which
> one takes priority is completely undefined. In the case of this Intel
> chipset, clearly the PCI Express device connected to the northbridge had
> higher decode priority than the MMCONFIG aperture.

can you relocate the MMCONFIG above RAM range? for example 512G...

YH
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH]PCI:disable resource decode in PCI BAR detection

2007-09-14 Thread Robert Hancock

Ivan Kokshaysky wrote:

On Fri, Sep 14, 2007 at 08:30:59AM -0600, Robert Hancock wrote:

Do you have an example of specific hardware that exhibits this problem?


Well, first two results of google search for "disable bar when sizing":
 http://lkml.org/lkml/2002/12/21/95
 http://lkml.org/lkml/2002/12/20/110


In the first one, Linus talks about a USB controller whose SMM code 
chokes on the BAR being disabled. That explanation seems odd to me. If 
it chokes on the BAR disabled, how doesn't it choke on the BAR being 
moved to a different location, which is unavoidable during probing?


Also, I think we do USB handoff from the BIOS much earlier than we used 
to, which likely prevents these issues.


In the second one, he mentions that "So the secondary rule to "don't 
turnoff MEM or IO accesses" is "never allocate real PCI BAR resources at 
the top of memory". Well, unfortunately, the second one isn't possible 
to meet given that we have boards with the MMCONFIG up there, so 
disabling the decode is the only thing we can do. That whole discussion 
was also based on the fact that it wasn't necessary to solve problems. 
This is no longer a theoretical problem. We now have real problems that 
we need this in order to solve.




So far after a similar patch has been in -mm for months there have been 
no reports of any such problems.


You cannot compare user base of -mm and release kernels. Also, note
that we're talking about maybe 0.01% of systems running Linux.
And similar patch appeared in various trees several times over the last
decade and every time it had to be reverted.

This isn't guaranteed to help. I don't think it is only integrated 
graphics that could cause this problem, I think that an add-on PCI 
Express card can do this as well. Depending on the chipset memory decode 
rules, any PCI or PCI Express device with a BAR large enough could cause 
issues.


No, it's impossible for several reasons. Most obvious one is that a PCI-E
bridge does isolate stuff quite nicely.


It's not impossible at all. In fact I'm quite sure (Jesse can confirm) 
that in the case of the board he was using, it was an add-in graphics 
card where he saw this problem.


The fact is that in the case of MMCONFIG overlap with PCI BARs, which 
one takes priority is completely undefined. In the case of this Intel 
chipset, clearly the PCI Express device connected to the northbridge had 
higher decode priority than the MMCONFIG aperture.


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH]PCI:disable resource decode in PCI BAR detection

2007-09-14 Thread Ivan Kokshaysky
On Fri, Sep 14, 2007 at 08:30:59AM -0600, Robert Hancock wrote:
> Do you have an example of specific hardware that exhibits this problem?

Well, first two results of google search for "disable bar when sizing":
 http://lkml.org/lkml/2002/12/21/95
 http://lkml.org/lkml/2002/12/20/110

> So far after a similar patch has been in -mm for months there have been 
> no reports of any such problems.

You cannot compare user base of -mm and release kernels. Also, note
that we're talking about maybe 0.01% of systems running Linux.
And similar patch appeared in various trees several times over the last
decade and every time it had to be reverted.

> This isn't guaranteed to help. I don't think it is only integrated 
> graphics that could cause this problem, I think that an add-on PCI 
> Express card can do this as well. Depending on the chipset memory decode 
> rules, any PCI or PCI Express device with a BAR large enough could cause 
> issues.

No, it's impossible for several reasons. Most obvious one is that a PCI-E
bridge does isolate stuff quite nicely.

Ivan.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH]PCI:disable resource decode in PCI BAR detection

2007-09-14 Thread Robert Hancock

Ivan Kokshaysky wrote:

On Thu, Sep 13, 2007 at 09:32:42PM -0600, Robert Hancock wrote:
Disabling the BAR decoding does not mean disabling the device (aside 
from the one  group of host bridge that apparently disables CPU to RAM 
access, whose designers were apparently on crack when they read the PCI 
spec and which is why we don't disable the decode on host bridges). 
There really is no other sane way to do it.


Disabling IO and MEM decode effectively disables device as a PCI target.
Worse, it's often not just BARs that get disabled, but some non-standard
or hidden address ranges as well. Host bridges are just one common
example, there is a lot of other insane hardware, more than one could
expect.


Do you have an example of specific hardware that exhibits this problem? 
So far after a similar patch has been in -mm for months there have been 
no reports of any such problems.



BTW, the way how the internal address decode works on these latest Intel
whippets doesn't look sane either. ;-)

If we do encounter other devices that choke on having the BAR disabled 
during probing then we can add additional quirk logic, but we haven't 
run into anything like that yet.


As Greg said, the main point is "no regression".

Folks, can you check if the patch below helps?


This isn't guaranteed to help. I don't think it is only integrated 
graphics that could cause this problem, I think that an add-on PCI 
Express card can do this as well. Depending on the chipset memory decode 
rules, any PCI or PCI Express device with a BAR large enough could cause 
issues.




Ivan.

--- 2.6.23-rc6-git4/include/linux/pci.h 2007-09-12 17:22:57.0 +0400
+++ linux/include/linux/pci.h   2007-09-14 10:26:29.0 +0400
@@ -183,6 +183,7 @@ struct pci_dev {
unsigned intmsi_enabled:1;
unsigned intmsix_enabled:1;
unsigned intis_managed:1;
+   unsigned intdisable_while_probe:1; /* Probe BARs with IO and MMIO 
turned off */
atomic_tenable_cnt; /* pci_enable_device has been called */
 
 	u32		saved_config_space[16]; /* config space saved at suspend time */

--- 2.6.23-rc6-git4/arch/i386/pci/fixup.c   2007-09-12 17:22:55.0 
+0400
+++ linux/arch/i386/pci/fixup.c 2007-09-14 14:43:13.0 +0400
@@ -444,3 +444,21 @@ static void __devinit pci_siemens_interr
 }
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_SIEMENS, 0x0015,
  pci_siemens_interrupt_controller);
+
+/*
+ * Work around apparent quirk in internal address decoding on
+ * some Intel chipsets (G31, G33, Q965 etc.): 
+ * when some MMIO address range of integrated peripheral overlaps

+ * mmconfig area, mmconfig accesses are blocked. This can happen
+ * when we size the BAR writing all ones to it.
+ * In practice, only integrated graphics controller has MMIO range
+ * large enough (BAR2, 256Mb) to cause a trouble, so we disable
+ * address decoders just on that function during PCI BAR probe.
+ */
+static void __devinit pci_intel_mmconfig(struct pci_dev *dev)
+{
+   if ((dev->class >> 8) == PCI_CLASS_DISPLAY_VGA &&
+   pci_domain_nr(dev->bus) == 0 && dev->bus->number == 0)
+   dev->disable_while_probe = 1;
+}
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_ANY_ID, pci_intel_mmconfig);
--- 2.6.23-rc6-git4/drivers/pci/probe.c 2007-09-14 00:35:47.0 +0400
+++ linux/drivers/pci/probe.c   2007-09-14 10:27:45.0 +0400
@@ -185,6 +185,13 @@ static void pci_read_bases(struct pci_de
unsigned int pos, reg, next;
u32 l, sz;
struct resource *res;
+   u16 cmd, orig_cmd;
+
+   if (dev->disable_while_probe) {
+   pci_read_config_word(dev, PCI_COMMAND, &orig_cmd);
+   cmd |= orig_cmd & ~(PCI_COMMAND_MEMORY | PCI_COMMAND_IO);
+   pci_write_config_word(dev, PCI_COMMAND, cmd);
+   }
 
 	for(pos=0; pos
u64 l64;
@@ -283,6 +290,8 @@ static void pci_read_bases(struct pci_de
}
}
}
+   if (dev->disable_while_probe)
+   pci_write_config_word(dev, PCI_COMMAND, orig_cmd);
 }
 
 void pci_read_bridge_bases(struct pci_bus *child)



-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH]PCI:disable resource decode in PCI BAR detection

2007-09-14 Thread Ivan Kokshaysky
On Fri, Sep 14, 2007 at 03:14:01PM +0400, Ivan Kokshaysky wrote:
> whippets doesn't look sane either. ;-)
  
It's not me, it's a spellchecker :-) I meant "chipsets" of course, sorry.

Ivan.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH]PCI:disable resource decode in PCI BAR detection

2007-09-14 Thread Ivan Kokshaysky
On Thu, Sep 13, 2007 at 09:32:42PM -0600, Robert Hancock wrote:
> Disabling the BAR decoding does not mean disabling the device (aside 
> from the one  group of host bridge that apparently disables CPU to RAM 
> access, whose designers were apparently on crack when they read the PCI 
> spec and which is why we don't disable the decode on host bridges). 
> There really is no other sane way to do it.

Disabling IO and MEM decode effectively disables device as a PCI target.
Worse, it's often not just BARs that get disabled, but some non-standard
or hidden address ranges as well. Host bridges are just one common
example, there is a lot of other insane hardware, more than one could
expect.
BTW, the way how the internal address decode works on these latest Intel
whippets doesn't look sane either. ;-)

> If we do encounter other devices that choke on having the BAR disabled 
> during probing then we can add additional quirk logic, but we haven't 
> run into anything like that yet.

As Greg said, the main point is "no regression".

Folks, can you check if the patch below helps?

Ivan.

--- 2.6.23-rc6-git4/include/linux/pci.h 2007-09-12 17:22:57.0 +0400
+++ linux/include/linux/pci.h   2007-09-14 10:26:29.0 +0400
@@ -183,6 +183,7 @@ struct pci_dev {
unsigned intmsi_enabled:1;
unsigned intmsix_enabled:1;
unsigned intis_managed:1;
+   unsigned intdisable_while_probe:1; /* Probe BARs with IO and MMIO 
turned off */
atomic_tenable_cnt; /* pci_enable_device has been called */
 
u32 saved_config_space[16]; /* config space saved at 
suspend time */
--- 2.6.23-rc6-git4/arch/i386/pci/fixup.c   2007-09-12 17:22:55.0 
+0400
+++ linux/arch/i386/pci/fixup.c 2007-09-14 14:43:13.0 +0400
@@ -444,3 +444,21 @@ static void __devinit pci_siemens_interr
 }
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_SIEMENS, 0x0015,
  pci_siemens_interrupt_controller);
+
+/*
+ * Work around apparent quirk in internal address decoding on
+ * some Intel chipsets (G31, G33, Q965 etc.): 
+ * when some MMIO address range of integrated peripheral overlaps
+ * mmconfig area, mmconfig accesses are blocked. This can happen
+ * when we size the BAR writing all ones to it.
+ * In practice, only integrated graphics controller has MMIO range
+ * large enough (BAR2, 256Mb) to cause a trouble, so we disable
+ * address decoders just on that function during PCI BAR probe.
+ */
+static void __devinit pci_intel_mmconfig(struct pci_dev *dev)
+{
+   if ((dev->class >> 8) == PCI_CLASS_DISPLAY_VGA &&
+   pci_domain_nr(dev->bus) == 0 && dev->bus->number == 0)
+   dev->disable_while_probe = 1;
+}
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_ANY_ID, pci_intel_mmconfig);
--- 2.6.23-rc6-git4/drivers/pci/probe.c 2007-09-14 00:35:47.0 +0400
+++ linux/drivers/pci/probe.c   2007-09-14 10:27:45.0 +0400
@@ -185,6 +185,13 @@ static void pci_read_bases(struct pci_de
unsigned int pos, reg, next;
u32 l, sz;
struct resource *res;
+   u16 cmd, orig_cmd;
+
+   if (dev->disable_while_probe) {
+   pci_read_config_word(dev, PCI_COMMAND, &orig_cmd);
+   cmd |= orig_cmd & ~(PCI_COMMAND_MEMORY | PCI_COMMAND_IO);
+   pci_write_config_word(dev, PCI_COMMAND, cmd);
+   }
 
for(pos=0; posdisable_while_probe)
+   pci_write_config_word(dev, PCI_COMMAND, orig_cmd);
 }
 
 void pci_read_bridge_bases(struct pci_bus *child)
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH]PCI:disable resource decode in PCI BAR detection

2007-09-13 Thread Robert Hancock

Ivan Kokshaysky wrote:

On Thu, Sep 13, 2007 at 02:53:13AM -0700, Greg KH wrote:

On Thu, Sep 13, 2007 at 01:55:36AM -0600, Matthew Wilcox wrote:

Unfortunately if this patch does cause any machine to break, these will
be machines that worked fine up until this point, so that would be a
regression, which is worse.  Life sucks.

If, after a while, you think the change should go into the -stable tree,
I have no objection.


I think it shouldn't - this change will almost certainly cause a regression.
There is a lot of system devices besides the host bridges that shouldn't be
disabled during BAR probe, like interrupt controllers, power management
controllers and so on.

We need a more sophisticated fix - I'm thinking of introducing "probe" field
in struct pci_dev which can be set by "early" quirk routines.


Disabling the BAR decoding does not mean disabling the device (aside 
from the one  group of host bridge that apparently disables CPU to RAM 
access, whose designers were apparently on crack when they read the PCI 
spec and which is why we don't disable the decode on host bridges). 
There really is no other sane way to do it.


If we do encounter other devices that choke on having the BAR disabled 
during probing then we can add additional quirk logic, but we haven't 
run into anything like that yet.


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH]PCI:disable resource decode in PCI BAR detection

2007-09-13 Thread Greg KH
On Thu, Sep 13, 2007 at 03:16:37PM +0400, Ivan Kokshaysky wrote:
> On Thu, Sep 13, 2007 at 02:53:13AM -0700, Greg KH wrote:
> > On Thu, Sep 13, 2007 at 01:55:36AM -0600, Matthew Wilcox wrote:
> > > Unfortunately if this patch does cause any machine to break, these will
> > > be machines that worked fine up until this point, so that would be a
> > > regression, which is worse.  Life sucks.
> > 
> > If, after a while, you think the change should go into the -stable tree,
> > I have no objection.
> 
> I think it shouldn't - this change will almost certainly cause a regression.
> There is a lot of system devices besides the host bridges that shouldn't be
> disabled during BAR probe, like interrupt controllers, power management
> controllers and so on.

Well, if it's going to cause a regression with machines that currently
work properly, I'll just drop it entirely.  I would much rather not have
a machine work at all with Linux, than break other people's working
machines.

> We need a more sophisticated fix - I'm thinking of introducing "probe" field
> in struct pci_dev which can be set by "early" quirk routines.

That might work out well.  Matthew, want to look into this for a
possible way to get your fix into the tree in a way that will not affect
others?

thanks,

greg k-h
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH]PCI:disable resource decode in PCI BAR detection

2007-09-13 Thread Ivan Kokshaysky
On Thu, Sep 13, 2007 at 02:53:13AM -0700, Greg KH wrote:
> On Thu, Sep 13, 2007 at 01:55:36AM -0600, Matthew Wilcox wrote:
> > Unfortunately if this patch does cause any machine to break, these will
> > be machines that worked fine up until this point, so that would be a
> > regression, which is worse.  Life sucks.
> 
> If, after a while, you think the change should go into the -stable tree,
> I have no objection.

I think it shouldn't - this change will almost certainly cause a regression.
There is a lot of system devices besides the host bridges that shouldn't be
disabled during BAR probe, like interrupt controllers, power management
controllers and so on.

We need a more sophisticated fix - I'm thinking of introducing "probe" field
in struct pci_dev which can be set by "early" quirk routines.

Ivan.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH]PCI:disable resource decode in PCI BAR detection

2007-09-13 Thread Greg KH
On Thu, Sep 13, 2007 at 01:55:36AM -0600, Matthew Wilcox wrote:
> On Thu, Sep 13, 2007 at 03:24:46PM +0800, Shaohua Li wrote:
> > On Thu, 2007-09-13 at 01:31 -0600, Matthew Wilcox wrote:
> > > You missed the part where we have to avoid doing this for host bridges ...
> > > 
> > > http://marc.info/?l=linux-kernel&m=118809338631160&w=2
> > > 
> > > (I believe it's now queued in Greg's tree, and possibly Andrew's tree
> > > too)
> > Hmm, I don't know there is already a fix for this, so waste a whole
> > morning :(. yes, your patch looks better.
> 
> Yes, this bug is causing a lot of people to waste time.  I fielded an
> internal request for this patch this afternoon.  I appreciate we're
> post-rc6 at this point, but it does rather suck to be releasing a kernel
> which freezes on boot on this class of machines.

But it's not like the kernel every worked on this class of machines,
right?  This is not something that was a regression from what I can see.

> Unfortunately if this patch does cause any machine to break, these will
> be machines that worked fine up until this point, so that would be a
> regression, which is worse.  Life sucks.

If, after a while, you think the change should go into the -stable tree,
I have no objection.

thanks,

greg k-h
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH]PCI:disable resource decode in PCI BAR detection

2007-09-13 Thread Matthew Wilcox
On Thu, Sep 13, 2007 at 03:24:46PM +0800, Shaohua Li wrote:
> On Thu, 2007-09-13 at 01:31 -0600, Matthew Wilcox wrote:
> > You missed the part where we have to avoid doing this for host bridges ...
> > 
> > http://marc.info/?l=linux-kernel&m=118809338631160&w=2
> > 
> > (I believe it's now queued in Greg's tree, and possibly Andrew's tree
> > too)
> Hmm, I don't know there is already a fix for this, so waste a whole
> morning :(. yes, your patch looks better.

Yes, this bug is causing a lot of people to waste time.  I fielded an
internal request for this patch this afternoon.  I appreciate we're
post-rc6 at this point, but it does rather suck to be releasing a kernel
which freezes on boot on this class of machines.

Unfortunately if this patch does cause any machine to break, these will
be machines that worked fine up until this point, so that would be a
regression, which is worse.  Life sucks.

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH]PCI:disable resource decode in PCI BAR detection

2007-09-13 Thread Shaohua Li
On Thu, 2007-09-13 at 01:31 -0600, Matthew Wilcox wrote:
> On Thu, Sep 13, 2007 at 02:21:07PM +0800, Shaohua Li wrote:
> > http://bugzilla.kernel.org/show_bug.cgi?id=8833
> > We write 0x to BARs to detect BAR size, this will change BAR
> > base to 0xfxx depends on BAR size. In the bug, PCI MCFG base address
> > is 0xf400. One PCI device (gfx) has a 256M BAR, the detection code
> > will temprarily change it to 0xf000, so conflict with MCFG decode
> > range. Later memory based config space read/write address is decoded by
> > both MCH and gfx and cause a hang. This patch disables resource decode
> > in BAR size detection to avoid resource conflict. 
> 
> You missed the part where we have to avoid doing this for host bridges ...
> 
> http://marc.info/?l=linux-kernel&m=118809338631160&w=2
> 
> (I believe it's now queued in Greg's tree, and possibly Andrew's tree
> too)
Hmm, I don't know there is already a fix for this, so waste a whole
morning :(. yes, your patch looks better.

Thanks,
Shaohua
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH]PCI:disable resource decode in PCI BAR detection

2007-09-13 Thread Matthew Wilcox
On Thu, Sep 13, 2007 at 02:21:07PM +0800, Shaohua Li wrote:
> http://bugzilla.kernel.org/show_bug.cgi?id=8833
> We write 0x to BARs to detect BAR size, this will change BAR
> base to 0xfxx depends on BAR size. In the bug, PCI MCFG base address
> is 0xf400. One PCI device (gfx) has a 256M BAR, the detection code
> will temprarily change it to 0xf000, so conflict with MCFG decode
> range. Later memory based config space read/write address is decoded by
> both MCH and gfx and cause a hang. This patch disables resource decode
> in BAR size detection to avoid resource conflict. 

You missed the part where we have to avoid doing this for host bridges ...

http://marc.info/?l=linux-kernel&m=118809338631160&w=2

(I believe it's now queued in Greg's tree, and possibly Andrew's tree
too)

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/