from:"Robert Hancock"

Re: hisax isdn card (Sedlbauer Speed Fax+) does not get an interrupt

2007-05-31 Thread Robert Hancock

> [...] resetting card
> <06>2007 May 30 14:21:40 cbs kern: Sedlbauer Speed Fax +: IRQ 11 count 10
> <04>2007 May 30 14:21:40 cbs kern: Sedlbauer Speed Fax +: IRQ(11) getting no interrupts during init 1
> <06>2007 May 30 14:21:40 cbs kern: Sedlbauer: resetting card
> <06>2007 May 30 14:21:40 cbs kern: Sedlbauer: resetting card
> <06>2007 May 30 14:21:40 cbs kern: Sedlbauer Speed Fax +: IRQ 11 count 10
> <04>2007 May 30 14:21:40 cbs kern: Sedlbauer Speed Fax +: IRQ(11) getting no interrupts during init 2
> <06>2007 May 30 14:21:40 cbs kern: Sedlbauer: resetting card
> <06>2007 May 30 14:21:40 cbs kern: Sedlbauer: resetting card
> <06>2007 May 30 14:21:40 cbs kern: Sedlbauer Speed Fax +: IRQ 11 count 10
> <04>2007 May 30 14:21:40 cbs kern: Sedlbauer Speed Fax +: IRQ(11) getting no interrupts during init 3
> <06>2007 May 30 14:21:40 cbs kern: Sedlbauer: resetting card
> <04>2007 May 30 14:21:40 cbs kern: HiSax: Card Sedlbauer Speed Fax + not installed !
> ==
>
> While the output seems to suggest a hardware problem, the same system
> loads the hisax driver perfectly on recent 2.4 kernels.
>
> We tried several kernel versions, up to 2.6.21.3.
>
> Any hints are appreciated :)


Likely a driver problem - the device is using IRQ 11, but the driver 
never actually registered a handler for that interrupt (it's not in the 
list of handlers; only USB is). Maybe the driver is retrieving the 
interrupt number before calling pci_enable_device()? (I haven't looked 
at the code in question.)


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] x86: Document the hotplug code is incompatible with x86 irq handling

2007-05-31 Thread Robert Hancock


Eric W. Biederman wrote:

> I just realized that except for doing the code review and noticing
> that the current cpu hotplug code is fundamentally incompatible
> with x86 I haven't done anything about it.  So here is my patch
> to document what is wrong.
>
> The current cpu hotplug code requires irqs to be migrated from a cpu
> outside of irq context.  On x86 ioapics simply do not support this,
> making the code unfixable without major redesign of the generic cpu
> hotplug code.
>
> So this patch makes CPU_HOTPLUG on x86 depend on CONFIG_BROKEN
> and adds a WARN_ON so people that do enable it are not in doubt about
> which part of the code is broken, even if it does work for them.
>
> Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>


I don't think this is useful. Though the code may be problematic, this 
patch will break suspend on all SMP machines with an existing config, 
which is a major regression.


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/


Re: MCP55 NCQ problem?

2007-05-31 Thread Robert Hancock

> : ata2.00: exception Emask 0x0 SAct 0x6 SErr 0x200 action 0x2 frozen
> May 29 12:49:23 localhost kernel: ata2: SError: {UnrecFIS }
> May 29 12:49:23 localhost kernel: ata2.00: cmd 61/40:08:3f:71:fa/00:00:07:00:00/40 tag 1 cdb 0x0 data 32768 out
> May 29 12:49:23 localhost kernel:  res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
> May 29 12:49:23 localhost kernel: ata2.00: status: {DRDY }
> May 29 12:49:23 localhost kernel: ata2.00: cmd 61/10:10:7f:e7:fa/00:00:07:00:00/40 tag 2 cdb 0x0 data 8192 out
> May 29 12:49:23 localhost kernel:  res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
> May 29 12:49:23 localhost kernel: ata2.00: status: {DRDY }
> May 29 12:49:23 localhost kernel: ata2: hard resetting port
> May 29 12:49:24 localhost kernel: ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> May 29 12:49:24 localhost kernel: ata2.00: ata_hpa_resize: sectors = 625142448, hpa_sectors = 625142448
> May 29 12:49:24 localhost kernel: ata2.00: ata_hpa_resize: sectors = 625142448, hpa_sectors = 625142448
> May 29 12:49:24 localhost kernel: ata2.00: configured for UDMA/133
> May 29 12:49:24 localhost kernel: ata2: EH pending after completion, repeating EH (cnt=4)
> May 29 12:49:24 localhost kernel: ata2: EH complete
> May 29 12:49:24 localhost kernel: sd 1:0:0:0: [sdb] 625142448 512-byte hardware sectors (320073 MB)
> May 29 12:49:24 localhost kernel: sd 1:0:0:0: [sdb] Write Protect is off
> May 29 12:49:24 localhost kernel: sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> May 29 12:50:24 localhost kernel: ata2.00: NCQ disabled due to excessive errors
> May 29 12:50:24 localhost kernel: ata2.00: exception Emask 0x0 SAct 0x6 SErr 0x0 action 0x2 frozen
> May 29 12:50:24 localhost kernel: ata2.00: cmd 61/10:08:c7:88:b8/00:00:0f:00:00/40 tag 1 cdb 0x0 data 8192 out
> May 29 12:50:24 localhost kernel:  res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
> May 29 12:50:24 localhost kernel: ata2.00: status: {DRDY }
> May 29 12:50:24 localhost kernel: ata2.00: cmd 61/10:10:9f:8a:b8/00:00:0f:00:00/40 tag 2 cdb 0x0 data 8192 out
> May 29 12:50:24 localhost kernel:  res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
> May 29 12:50:24 localhost kernel: ata2.00: status: {DRDY }
> May 29 12:50:24 localhost kernel: ata2: hard resetting port
> May 29 12:50:25 localhost kernel: ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> May 29 12:50:25 localhost kernel: ata2.00: ata_hpa_resize: sectors = 625142448, hpa_sectors = 625142448
> May 29 12:50:25 localhost kernel: ata2.00: ata_hpa_resize: sectors = 625142448, hpa_sectors = 625142448
> May 29 12:50:25 localhost kernel: ata2.00: configured for UDMA/133
> May 29 12:50:25 localhost kernel: ata2: EH pending after completion, repeating EH (cnt=4)
> May 29 12:50:25 localhost kernel: ata2: EH complete
> May 29 12:50:25 localhost kernel: sd 1:0:0:0: [sdb] 625142448 512-byte hardware sectors (320073 MB)
> May 29 12:50:25 localhost kernel: sd 1:0:0:0: [sdb] Write Protect is off
> May 29 12:50:25 localhost kernel: sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA


> After the system disabled NCQ there weren't any more ATA resets
> and the disks were working OK.
>
> The strange thing is that when I ran PostgreSQL pgbench with 25 clients
> on 2.6.22-rc2 + cfs-v13 + swncq (which clearly showed a higher transfer
> rate), I had no such problems. The PostgreSQL DB was also on ata2.00.
>
> Best regards,
> Zoltán Böszörményi


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/


Re: Case: 7454422: Re: Kernel 2.6.21.3 does not work with 8GB of RAM on Intel 965WH motherboards. (FULL DMESG)

2007-05-31 Thread Robert Hancock


Justin Piszcz wrote:

> On Thu, 31 May 2007, Parag Warudkar wrote:
>
>> Robert Hancock wrote:
>>> I think that mem=8832M would work as well, to make the kernel use
>>> only the memory that is marked cacheable. (It looks like this
>>> parameter takes the highest memory address we want the kernel to use,
>>> not the highest memory amount.)
>>
>> Yep, and that would be much easier too.
>>
>> I am curious though, as this seems to be a somewhat common problem:
>> could we make the kernel analyze which memory is not cacheable (it
>> already knows this via MTRR) and not use that portion for anything?
>> Plus maybe warn the user to contact their BIOS vendor to correct the
>> problem?
>>
>> I think that would be possible - even if the kernel only learns late that
>> the memory was uncached, we could migrate the pages in that region to
>> someplace else?
>>
>> Parag
>
> That is an excellent question and I wonder the same thing.  I also had
> this problem when I only used 4GB of RAM and upgraded the BIOS (on another
> motherboard; I have two) past version 1666P, and I had no idea what was
> going on other than that the BIOS did not work correctly.
>
> In this case, however, it worked with 4GB with BIOS version 1612P but not
> with 8GB.  Is this a case of a buggy BIOS for the 965 chipset, or do
> Intel boards have a lot of issues?


We could conceivably generate a warning if the MTRRs don't map all of 
the physical memory as write-back. In fact, we could even go and fix up 
the MTRRs if we found them to be wrong according to the E820 memory map. 
That would be more complicated, however.


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/


Re: Compact Flash performance...

2007-05-31 Thread Robert Hancock


Jeff Garzik wrote:

> Mark Lord wrote:
>> To maximize throughput, some kind of host-queuing would be needed,
>> or just have the driver sit in a tight loop, starting the next I/O
>> immediately when the previous one finishes.  Linux isn't that quick
>> (yet).
>
> I was talking on IRC with Tejun just recently.  There are several
> controllers (and/or situations) like this, where some amount of host
> queueing would permit greater throughput, even when NCQ is not
> supported.  sata_sx4 is the most dramatic example, where host queueing
> could potentially increase speed by a factor of 10 or more, since it is
> penalized by an awful two-irq-per-command (w/ a per-host bottleneck to
> boot) setup.  Silicon Image has a command buffer.  And overall, I
> designed the ->qc_prep() hook separate from ->qc_issue() to enable the
> preparation of multiple commands such that it only takes a simple "go"
> I/O to start a transaction, immediately after the previous one ends.
>
> Jeff


Theoretically, NVIDIA nForce4 ADMA could likely do this as well, as it 
seems to allow chaining up multiple commands to execute in succession 
(assuming they're not NCQ).


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/


Re: SELECT() returns 1 But FIONREAD says (Input/output error)

2007-05-31 Thread Robert Hancock


Uncle George wrote:

> David Schwartz wrote:
>> Nope. An errored connection is always ready for read/write -- there is
>> nothing to wait for as far as the kernel is concerned. Your code keeps
>> asking the kernel if something interesting has happened, the kernel keeps
>> telling it yes, and it refuses to do anything about it.
>
> The select() returns because I pulled the USB cable from the hub. Seems
> reasonable.
>
> The next select() found what to be interesting, in order to prematurely
> terminate the select-wait? As far as I can tell, nothing interesting has
> happened since the previous select(). In this case the select() is only
> looking at read()'s.


It's because you haven't done anything to handle the error, which is 
still persisting. Likely the only sane thing you can do in this case is 
close the fd and try to reopen it later.
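
As a rough illustration of that advice, here is a minimal userspace sketch. The helper name wait_and_read() is hypothetical and not from this thread, and it uses read() rather than the FIONREAD ioctl from the original report. The point it shows is that once select() reports the fd as ready and the read fails with a persistent error (or hits EOF), the fd is closed instead of being handed back to select() again.

```c
#include <assert.h>
#include <errno.h>
#include <sys/select.h>
#include <unistd.h>

/*
 * Hypothetical helper: block in select() until fd is readable, then read.
 * Returns the number of bytes read (> 0), or -1 after closing fd on EOF
 * or on an error that will not go away by waiting (e.g. EIO).
 */
static ssize_t wait_and_read(int fd, void *buf, size_t len)
{
	fd_set rfds;
	ssize_t n;

	assert(fd >= 0 && fd < FD_SETSIZE);

	for (;;) {
		FD_ZERO(&rfds);
		FD_SET(fd, &rfds);
		if (select(fd + 1, &rfds, NULL, NULL, NULL) < 0) {
			if (errno == EINTR)
				continue;	/* interrupted by a signal, retry */
			break;
		}
		n = read(fd, buf, len);
		if (n > 0)
			return n;
		if (n < 0 && (errno == EINTR || errno == EAGAIN))
			continue;	/* transient, worth waiting again */
		break;			/* EOF or persistent error */
	}
	/* Calling select() again would just return immediately, so give up. */
	close(fd);
	return -1;
}
```

The caller would treat -1 as "device gone" and try to reopen the device later, rather than looping on select().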


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/


Re: Case: 7454422: Re: Kernel 2.6.21.3 does not work with 8GB of RAM on Intel 965WH motherboards. (FULL DMESG)

2007-05-30 Thread Robert Hancock


Parag Warudkar wrote:

> Robert Hancock wrote:
>> 0-3319MB
>> 4096-8832MB
>>
>> leaving 64MB of memory at the top of RAM uncached. What do you want to
>> bet that something important (kernel code?) is getting loaded there..
>>
>> So essentially it's a BIOS problem, it's not setting up the MTRRs
>> properly in order to map all of RAM as cacheable. As Andi says, complain
>> to Intel.
>
> Could the BADRAM patch be useful for him?
> http://rick.vanrein.org/linux/badram/download.html has a 2.6.21 version.
> It says it supports x86_64. Maybe using this patch he can exclude
> that RAM from being used/accessed?


I think that mem=8832M would work as well, to make the kernel use only 
the memory that is marked cacheable. (It looks like this parameter takes 
the highest memory address we want the kernel to use, not the highest 
memory amount.)


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/


Re: [PATCH -mm] 1/2: MMCONFIG: validate against ACPI motherboard resources

2007-05-30 Thread Robert Hancock

This patch adds validation of the MMCONFIG table against the ACPI reserved
motherboard resources. If the MMCONFIG table is found to be reserved in
ACPI, we don't bother checking the E820 table. The PCI Express firmware spec
apparently tells BIOS developers that reservation in ACPI is required and
E820 reservation is optional, so checking against ACPI first makes sense.
Many BIOSes don't reserve the MMCONFIG region in E820 even though it is
perfectly functional; the existing check needlessly disables MMCONFIG in
these cases.

In order to do this, MMCONFIG setup has been split into two phases. If PCI
configuration type 1 is not available then MMCONFIG is enabled early as before.
Otherwise, it is enabled later after the ACPI interpreter is enabled, since we
need to be able to execute control methods in order to check the ACPI reserved
resources. Presently this is just triggered off the end of ACPI interpreter
initialization.

There are a few other behavioral changes here:

-Validate all MMCONFIG configurations provided, not just the first one.

-Validate that the entire required length of each configuration, according to
the provided ending bus number, is reserved, not just the minimum required
allocation.

-Validate that the area is reserved even if we read it from the chipset directly
and not from the MCFG table. This catches the case where the BIOS didn't set the
location properly in the chipset and has mapped it over other things it
shouldn't have.

Based on an original patch by Rajesh Shah from Intel.

Signed-off-by: Robert Hancock <[EMAIL PROTECTED]>

---

This should fix up some of the whitespace/formatting problems in the previous
version. There were also some bugs in the check_mcfg_resource function: some
comparisons used <= where they should have used <. I had also forgotten the
attribution for Rajesh Shah, who wrote the original version of some of this code.
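
To illustrate why the <= versus < distinction matters in the containment test, here is a small userspace sketch. The helper range_within() is hypothetical and not part of the patch. It assumes, as the patch appears to, that the MMCONFIG range carries an inclusive end address while the ACPI resource is described by a base and a length.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: does the inclusive range [start, end] lie entirely
 * inside the region that begins at base and is len bytes long?
 */
static bool range_within(uint64_t start, uint64_t end,
			 uint64_t base, uint64_t len)
{
	assert(start <= end);

	/*
	 * base + len is the first address past the region, so an inclusive
	 * end must be strictly below it. Using <= here would also accept a
	 * range that runs one byte past the reserved region.
	 */
	return start >= base && end < base + len;
}
```

For a 256MB region at 0xe0000000, the range 0xe0000000-0xefffffff is accepted, while a range ending at 0xf0000000 is rejected; with <= the latter would wrongly pass.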

diff -rup --exclude-from=linux-2.6.22-rc2-mm1/Documentation/dontdiff linux-2.6.22-rc2-mm1/arch/i386/pci/init.c linux-2.6.22-rc2-mm1edit/arch/i386/pci/init.c
--- linux-2.6.22-rc2-mm1/arch/i386/pci/init.c	2007-05-23 21:20:43.000000000 -0600
+++ linux-2.6.22-rc2-mm1edit/arch/i386/pci/init.c	2007-05-23 21:31:50.000000000 -0600
@@ -12,7 +12,7 @@ static __init int pci_access_init(void)
 	type = pci_direct_probe();
 #endif
 #ifdef CONFIG_PCI_MMCONFIG
-	pci_mmcfg_init(type);
+	pci_mmcfg_early_init(type);
 #endif
 	if (raw_pci_ops)
 		return 0;
diff -rup --exclude-from=linux-2.6.22-rc2-mm1/Documentation/dontdiff linux-2.6.22-rc2-mm1/arch/i386/pci/mmconfig-shared.c linux-2.6.22-rc2-mm1edit/arch/i386/pci/mmconfig-shared.c
--- linux-2.6.22-rc2-mm1/arch/i386/pci/mmconfig-shared.c	2007-05-23 21:21:04.000000000 -0600
+++ linux-2.6.22-rc2-mm1edit/arch/i386/pci/mmconfig-shared.c	2007-05-30 18:40:31.000000000 -0600
@@ -206,9 +206,78 @@ static void __init pci_mmcfg_insert_reso
 	pci_mmcfg_resources_inserted = 1;
 }
 
-static void __init pci_mmcfg_reject_broken(int type)
+static acpi_status __init check_mcfg_resource(struct acpi_resource *res,
+					      void *data)
+{
+	struct resource *mcfg_res = data;
+	struct acpi_resource_address64 address;
+	acpi_status status;
+
+	if (res->type == ACPI_RESOURCE_TYPE_FIXED_MEMORY32) {
+		struct acpi_resource_fixed_memory32 *fixmem32 =
+			&res->data.fixed_memory32;
+		if (!fixmem32)
+			return AE_OK;
+		if ((mcfg_res->start >= fixmem32->address) &&
+		    (mcfg_res->end < (fixmem32->address +
+				      fixmem32->address_length))) {
+			mcfg_res->flags = 1;
+			return AE_CTRL_TERMINATE;
+		}
+	}
+	if ((res->type != ACPI_RESOURCE_TYPE_ADDRESS32) &&
+	    (res->type != ACPI_RESOURCE_TYPE_ADDRESS64))
+		return AE_OK;
+
+	status = acpi_resource_to_address64(res, &address);
+	if (ACPI_FAILURE(status) ||
+	   (address.address_length <= 0) ||
+	   (address.resource_type != ACPI_MEMORY_RANGE))
+		return AE_OK;
+
+	if ((mcfg_res->start >= address.minimum) &&
+	    (mcfg_res->end < (address.minimum + address.address_length))) {
+		mcfg_res->flags = 1;
+		return AE_CTRL_TERMINATE;
+	}
+	return AE_OK;
+}
+
+static acpi_status __init find_mboard_resource(acpi_handle handle, u32 lvl,
+					       void *context, void **rv)
+{
+	struct resource *mcfg_res = context;
+
+	acpi_walk_resources(handle, METHOD_NAME__CRS,
+			    check_mcfg_resource, context);
+
+	if (mcfg_res->flags)
+		return AE_CTRL_TERMINATE;
+
+	return AE_OK;
+}
+
+static int __init is_acpi_reserved(unsigned long start, unsigned long end)
+{
+

Re: [PATCH -mm] 1/2: MMCONFIG: validate against ACPI motherboard resources

2007-05-30 Thread Robert Hancock


Mark Lord wrote:

> Linus Torvalds wrote:
>> And once I looked closer, I just went "aiieee, it wasn't all the email
>> client" ;)
>
> Not long ago, Tejun pointed out the "External Editor" extension for
> Thunderbird, which turns out to be the only really sane way to submit
> patches with that client.
>
> Download and install it, then add a button for it using
> View->Toolbars->Customize... and finally just click on it when in the
> Compose dialog.
>
> A very useful tip.  Thanks again to Tejun for pointing it out.


Yes, I've been using that one, as well as changing the word wrap length 
to 0 characters to switch that off. Apparently disabling format=flowed 
is needed as well, however :-)


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/


Re: [PATCH -mm] 0/2: PCI MMCONFIG-related updates

2007-05-30 Thread Robert Hancock


Jesse Barnes wrote:

> On Tuesday, May 29, 2007 9:01:22 Robert Hancock wrote:
>> These two patches implement some changes in behavior related to PCI
>> MMCONFIG configuration space access. One changes the way in which we
>> validate the MCFG table provided by the BIOS by checking it against
>> ACPI motherboard resources instead of the E820 table. The BIOS is not
>> required to reserve this area in the E820 table, so checking that
>> results in MMCONFIG being unnecessarily disabled on some machines.
>>
>> Some Intel chipsets where MMCONFIG was being disabled previously
>> (but won't be with the first patch) had problems, not due to the
>> MCFG table being broken, but because the access was hosed by the way
>> in which we do PCI BAR sizing. The second patch fixes this problem.
>>
>> This is requested for inclusion in the -mm tree for testing.
>
> Robert, should we also pull in the 915 and 965 chipset specific register
> poking code?  It might be a good sanity check against ACPI (i.e. if ACPI and
> the actual register window disagree, we can assume the BIOS is broken and
> MCFG is not safe to use).  If so, I'll update and repost them against your
> patchset.


Probably not a bad idea..

--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/


Re: Case: 7454422: Re: Kernel 2.6.21.3 does not work with 8GB of RAM on Intel 965WH motherboards. (FULL DMESG)

2007-05-30 Thread Robert Hancock

Justin Piszcz wrote:
> That output looked nasty, attaching entries from syslog.
> 
> Justin.

Here's your E820 memory map, from dmesg:

BIOS-e820: 0000000000000000 - 000000000008f000 (usable)
BIOS-e820: 000000000008f000 - 00000000000a0000 (reserved)
BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 00000000cf58f000 (usable)
BIOS-e820: 00000000cf58f000 - 00000000cf59c000 (reserved)
BIOS-e820: 00000000cf59c000 - 00000000cf653000 (usable)
BIOS-e820: 00000000cf653000 - 00000000cf6a5000 (ACPI NVS)
BIOS-e820: 00000000cf6a5000 - 00000000cf6a8000 (ACPI data)
BIOS-e820: 00000000cf6a8000 - 00000000cf6ef000 (ACPI NVS)
BIOS-e820: 00000000cf6ef000 - 00000000cf6f1000 (ACPI data)
BIOS-e820: 00000000cf6f1000 - 00000000cf6f2000 (usable)
BIOS-e820: 00000000cf6f2000 - 00000000cf6ff000 (ACPI data)
BIOS-e820: 00000000cf6ff000 - 00000000cf700000 (usable)
BIOS-e820: 00000000cf700000 - 00000000d0000000 (reserved)
BIOS-e820: 00000000fff00000 - 0000000100000000 (reserved)
BIOS-e820: 0000000100000000 - 000000022c000000 (usable)

so the usable memory ranges are:

0-572K
1MB-3317.55MB
3317.60MB-3317.75MB
3318.94MB-3318.945MB
3318.996MB-3319MB
4096MB-8896MB

and the MTRRs (from /proc/mtrr, from private email):

reg00: base=0x00000000 (   0MB), size=2048MB: write-back, count=1
reg01: base=0x80000000 (2048MB), size=1024MB: write-back, count=1
reg02: base=0xc0000000 (3072MB), size= 256MB: write-back, count=1
reg03: base=0xcf800000 (3320MB), size=   8MB: uncachable, count=1
reg04: base=0xcf700000 (3319MB), size=   1MB: uncachable, count=1
reg05: base=0x100000000 (4096MB), size=4096MB: write-back, count=1
reg06: base=0x200000000 (8192MB), size= 512MB: write-back, count=1
reg07: base=0x220000000 (8704MB), size= 128MB: write-back, count=1

so the ranges mapped as cacheable are:

0-3319MB
4096-8832MB

leaving 64MB of memory at the top of RAM uncached. What do you want to
bet that something important (kernel code?) is getting loaded there..

So essentially it's a BIOS problem, it's not setting up the MTRRs
properly in order to map all of RAM as cacheable. As Andi says, complain
to Intel.
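
The 64MB figure can be cross-checked with a small userspace sketch. This is not kernel code: the struct and function names are made up for illustration, the table is transcribed from the MB figures in the /proc/mtrr listing, and it assumes that memory covered by no MTRR is treated as uncached and that an uncachable range wins over an overlapping write-back one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One variable-range MTRR, in MB units as printed by /proc/mtrr. */
struct mtrr_range {
	uint64_t base_mb;
	uint64_t size_mb;
	int cacheable;		/* 1 = write-back, 0 = uncachable */
};

/* Transcribed from the listing in this mail (hypothetical table name). */
static const struct mtrr_range mtrrs[] = {
	{    0, 2048, 1 },
	{ 2048, 1024, 1 },
	{ 3072,  256, 1 },
	{ 3320,    8, 0 },
	{ 3319,    1, 0 },
	{ 4096, 4096, 1 },
	{ 8192,  512, 1 },
	{ 8704,  128, 1 },
};

/*
 * Return 1 if the 1MB block starting at mb is cacheable: covered by at
 * least one write-back range and by no uncachable range. Blocks covered
 * by no range at all are assumed to be uncached.
 */
static int mb_is_cacheable(uint64_t mb)
{
	int wb = 0;
	size_t i;

	for (i = 0; i < sizeof(mtrrs) / sizeof(mtrrs[0]); i++) {
		const struct mtrr_range *r = &mtrrs[i];

		if (mb < r->base_mb || mb >= r->base_mb + r->size_mb)
			continue;
		if (!r->cacheable)
			return 0;
		wb = 1;
	}
	return wb;
}

/* Count the 1MB blocks in [start_mb, end_mb) that are not cacheable. */
static uint64_t uncached_mb(uint64_t start_mb, uint64_t end_mb)
{
	uint64_t mb, count = 0;

	for (mb = start_mb; mb < end_mb; mb++)
		if (!mb_is_cacheable(mb))
			count++;
	return count;
}
```

Calling uncached_mb(4096, 8896) walks the last usable E820 range (4096MB-8896MB) and counts the blocks that fall outside the write-back MTRRs.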

-- 
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH -mm] 1/2: MMCONFIG: validate against ACPI motherboard resources

2007-05-30 Thread Robert Hancock

Linus Torvalds wrote:
> On Tue, 29 May 2007, Robert Hancock wrote:
>> This path adds validation of the MMCONFIG table against the ACPI reserved
>> motherboard resources.
> 
> Please fix the formatting of your code.
> 
> "for" and "if" are not functions, and they have a space before the 
> parenthesis.
> 
> And pretty much every single conditional in this patch is spread out over 
> two or more lines and has at least three different indentations. There's 
> something wrong here. Code can't look this bad and still be fine. Some of 
> this looks like random whitespace noise:
> 
> +   if(is_acpi_reserved(cfg->address,
> +   cfg->address + size - 1))
> +   printk(KERN_NOTICE "PCI: MCFG area at %Lx reserved "
> +   "in ACPI motherboard resources\n",
> +   cfg->address);
> +   else {
> 
> That's just horrid. Please try to make the code _look_ nicer.

I'll try and fix up the formatting and repost this patch. I suspect some
of the issues are from the added code clashing with the way the existing
code was formatted.

-- 
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/


Re: Kernel 2.6.21.3 does not work with 8GB of RAM on Intel 965WH motherboards.

2007-05-30 Thread Robert Hancock


Justin Piszcz wrote:
> Kernel dmesg attached from 8GB bootup.

It looks like part of the start of the output was truncated..

> Robert, how come the option is not applicable in 64-bit mode? If I want
> to use all 8GB of memory I need to run a 32-bit kernel?
>
> Justin.

Highmem and PAE (which are essentially what the 4GB/64GB memory options
control) are not needed in 64-bit mode, since we can access the entire
64-bit address space directly.
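
For reference, the "4GB" and "64GB" names of those 32-bit options follow from the physical address widths involved: 32 bits without PAE and 36 bits with PAE. A minimal sketch of that arithmetic (the helper names are illustrative only, not kernel functions):

```c
#include <assert.h>
#include <stdint.h>

/* Bytes addressable with the given number of physical address bits. */
static uint64_t addressable_bytes(unsigned int bits)
{
	/* bits must be below 64 so that the shift is well defined */
	return (uint64_t)1 << bits;
}

/* Convert a byte count to whole gigabytes (1GB = 2^30 bytes). */
static uint64_t to_gb(uint64_t bytes)
{
	return bytes >> 30;
}
```

With these helpers, 32 address bits give 4GB and 36 bits give 64GB, which matches the two option names.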


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/


Re: [PATCH -mm] 0/2: PCI MMCONFIG-related updates

2007-05-30 Thread Robert Hancock


Jesse Barnes wrote:
> On Tuesday, May 29, 2007 9:01:22 Robert Hancock wrote:
>> These two patches implement some changes in behavior related to PCI
>> MMCONFIG configuration space access. One changes the way in which we
>> validate the MCFG table provided by the BIOS by checking it against
>> ACPI motherboard resources instead of the E820 table. The BIOS is not
>> required to reserve this area in the E820 table, so checking that
>> results in MMCONFIG being unnecessarily disabled on some machines.
>>
>> Some Intel chipsets where MMCONFIG was being disabled previously
>> (but won't be with the first patch) had problems, not due to the
>> MCFG table being broken, but because the access was hosed by the way
>> in which we do PCI BAR sizing. The second patch fixes this problem.
>>
>> This is requested for inclusion in the -mm tree for testing.
>
> Robert, should we also pull in the 915 and 965 chipset specific register
> poking code?  It might be a good sanity check against ACPI (i.e. if ACPI and
> the actual register window disagree, we can assume the BIOS is broken and
> MCFG is not safe to use).  If so, I'll update and repost them against your
> patchset.

Probably not a bad idea..

--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/


Re: [PATCH -mm] 1/2: MMCONFIG: validate against ACPI motherboard resources

2007-05-30 Thread Robert Hancock


Mark Lord wrote:
> Linus Torvalds wrote:
>> And once I looked closer, I just went aiieee, it wasn't all the email
>> client ;)
>
> Not long ago, Tejun pointed out the External Editor extension for
> Thunderbird, which turns out to be the only really sane way to submit
> patches with that client.
>
> Download and install it, then add a button for it using
> View->Toolbars->Customize...
> and finally just click on it when in the Compose dialog.
>
> A very useful tip.  Thanks again to Tejun for pointing it out.

Yes, I've been using that one, as well as changing the word wrap length
to 0 characters to switch that off. Apparently disabling format=flowed
is needed as well, however :-)


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/


Re: [PATCH -mm] 1/2: MMCONFIG: validate against ACPI motherboard resources

2007-05-30 Thread Robert Hancock

This patch adds validation of the MMCONFIG table against the ACPI reserved
motherboard resources. If the MMCONFIG region is found to be reserved in
ACPI, we don't bother checking the E820 table. The PCI Express firmware spec
apparently tells BIOS developers that reservation in ACPI is required and
E820 reservation is optional, so checking against ACPI first makes sense.
Many BIOSes don't reserve the MMCONFIG region in E820 even though it is
perfectly functional, so the existing check needlessly disables MMCONFIG in
these cases.

In order to do this, MMCONFIG setup has been split into two phases. If PCI
configuration type 1 is not available then MMCONFIG is enabled early as before.
Otherwise, it is enabled later after the ACPI interpreter is enabled, since we
need to be able to execute control methods in order to check the ACPI reserved
resources. Presently this is just triggered off the end of ACPI interpreter
initialization.

There are a few other behavioral changes here:

-Validate all MMCONFIG configurations provided, not just the first one.

-Validate that the entire required length of each configuration, as determined
by the provided ending bus number, is reserved, not just the minimum required
allocation.

-Validate that the area is reserved even if we read it from the chipset directly
and not from the MCFG table. This catches the case where the BIOS didn't set the
location properly in the chipset and has mapped it over other things it
shouldn't have.

Based on an original patch by Rajesh Shah from Intel.

Signed-off-by: Robert Hancock <[EMAIL PROTECTED]>

---

This should fix up some of the whitespace/formatting problems in the previous
version. There were actually some bugs in the check_mcfg_resource function,
there were some <= that should have been <. Also forgot the attribution
for Rajesh Shah who wrote the original version of some of this code.
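
To illustrate why the comparison matters, here is a minimal userspace sketch, not the patch code itself. It assumes the region being checked is passed with an inclusive end address (as in the is_acpi_reserved(cfg->address, cfg->address + size - 1) call from the previous version), while the ACPI resource is described by a base and a length. The function names and the addresses used with them are made up.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return 1 if the inclusive range [start, end] lies entirely inside the
 * resource that begins at res_base and is res_len bytes long, i.e. inside
 * the half-open range [res_base, res_base + res_len).
 */
static int range_is_reserved(uint64_t start, uint64_t end,
			     uint64_t res_base, uint64_t res_len)
{
	return start >= res_base && end < res_base + res_len;
}

/* The same test using "<=", which accepts a range one byte too long. */
static int range_is_reserved_buggy(uint64_t start, uint64_t end,
				   uint64_t res_base, uint64_t res_len)
{
	return start >= res_base && end <= res_base + res_len;
}
```

For a hypothetical 256MB resource at 0xe0000000, a region ending at 0xefffffff passes both checks, but a region ending at 0xf0000000 (one byte past the resource) is rejected only by the "<" version.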

diff -rup --exclude-from=linux-2.6.22-rc2-mm1/Documentation/dontdiff linux-2.6.22-rc2-mm1/arch/i386/pci/init.c linux-2.6.22-rc2-mm1edit/arch/i386/pci/init.c
--- linux-2.6.22-rc2-mm1/arch/i386/pci/init.c	2007-05-23 21:20:43.000000000 -0600
+++ linux-2.6.22-rc2-mm1edit/arch/i386/pci/init.c	2007-05-23 21:31:50.000000000 -0600
@@ -12,7 +12,7 @@ static __init int pci_access_init(void)
 	type = pci_direct_probe();
 #endif
 #ifdef CONFIG_PCI_MMCONFIG
-	pci_mmcfg_init(type);
+	pci_mmcfg_early_init(type);
 #endif
 	if (raw_pci_ops)
 		return 0;
diff -rup --exclude-from=linux-2.6.22-rc2-mm1/Documentation/dontdiff linux-2.6.22-rc2-mm1/arch/i386/pci/mmconfig-shared.c linux-2.6.22-rc2-mm1edit/arch/i386/pci/mmconfig-shared.c
--- linux-2.6.22-rc2-mm1/arch/i386/pci/mmconfig-shared.c	2007-05-23 21:21:04.000000000 -0600
+++ linux-2.6.22-rc2-mm1edit/arch/i386/pci/mmconfig-shared.c	2007-05-30 18:40:31.000000000 -0600
@@ -206,9 +206,78 @@ static void __init pci_mmcfg_insert_reso
 	pci_mmcfg_resources_inserted = 1;
 }
 
-static void __init pci_mmcfg_reject_broken(int type)
+static acpi_status __init check_mcfg_resource(struct acpi_resource *res,
+					      void *data)
+{
+	struct resource *mcfg_res = data;
+	struct acpi_resource_address64 address;
+	acpi_status status;
+
+	if (res->type == ACPI_RESOURCE_TYPE_FIXED_MEMORY32) {
+		struct acpi_resource_fixed_memory32 *fixmem32 =
+			&res->data.fixed_memory32;
+		if (!fixmem32)
+			return AE_OK;
+		if ((mcfg_res->start >= fixmem32->address) &&
+		    (mcfg_res->end < (fixmem32->address +
+				      fixmem32->address_length))) {
+			mcfg_res->flags = 1;
+			return AE_CTRL_TERMINATE;
+		}
+	}
+	if ((res->type != ACPI_RESOURCE_TYPE_ADDRESS32) &&
+	    (res->type != ACPI_RESOURCE_TYPE_ADDRESS64))
+		return AE_OK;
+
+	status = acpi_resource_to_address64(res, &address);
+	if (ACPI_FAILURE(status) ||
+	   (address.address_length <= 0) ||
+	   (address.resource_type != ACPI_MEMORY_RANGE))
+		return AE_OK;
+
+	if ((mcfg_res->start >= address.minimum) &&
+	    (mcfg_res->end < (address.minimum + address.address_length))) {
+		mcfg_res->flags = 1;
+		return AE_CTRL_TERMINATE;
+	}
+	return AE_OK;
+}
+
+static acpi_status __init find_mboard_resource(acpi_handle handle, u32 lvl,
+					       void *context, void **rv)
+{
+	struct resource *mcfg_res = context;
+
+	acpi_walk_resources(handle, METHOD_NAME__CRS,
+			    check_mcfg_resource, context);
+
+	if (mcfg_res->flags)
+		return AE_CTRL_TERMINATE;
+
+	return AE_OK;
+}
+
+static int __init is_acpi_reserved(unsigned long start, unsigned long end)
+{
+	struct resource mcfg_res;
+
+	mcfg_res.start = start;
+	mcfg_res.end = end;
+	mcfg_res.flags

Re: Case: 7454422: Re: Kernel 2.6.21.3 does not work with 8GB of RAM on Intel 965WH motherboards. (FULL DMESG)

2007-05-30 Thread Robert Hancock


Parag Warudkar wrote:
> Robert Hancock wrote:
>> 0-3319MB
>> 4096-8832MB
>>
>> leaving 64MB of memory at the top of RAM uncached. What do you want to
>> bet that something important (kernel code?) is getting loaded there..
>>
>> So essentially it's a BIOS problem, it's not setting up the MTRRs
>> properly in order to map all of RAM as cacheable. As Andi says, complain
>> to Intel.
>
> Could the BADRAM patch be useful for him?
> http://rick.vanrein.org/linux/badram/download.html has 2.6.21 version.
> It says it supports x86_64. May be using this patch he can exclude
> that RAM from being used/accessed?

I think that mem=8832M would work as well, to make the kernel use only
the memory that is marked cacheable. (It looks like this parameter takes
the highest memory address we want the kernel to use, not the highest
memory amount.)
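
To show what treating mem= as an address limit would mean here, the following userspace sketch sums the usable E820 ranges below a given address. It is an illustration under assumptions, not the kernel's implementation: the names are hypothetical, and the table holds the usable entries of the E820 map from earlier in this thread, with the full addresses inferred from the MB figures given there.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MB(x) ((uint64_t)(x) << 20)

/* A usable E820 range, half-open: [start, end). */
struct e820_range {
	uint64_t start;
	uint64_t end;
};

/* Usable entries from the E820 map in this thread (hypothetical table). */
static const struct e820_range usable[] = {
	{ 0x000000000ULL, 0x00008f000ULL },
	{ 0x000100000ULL, 0x0cf58f000ULL },
	{ 0x0cf59c000ULL, 0x0cf653000ULL },
	{ 0x0cf6f1000ULL, 0x0cf6f2000ULL },
	{ 0x0cf6ff000ULL, 0x0cf700000ULL },
	{ 0x100000000ULL, 0x22c000000ULL },
};

/* Sum the usable bytes that lie below limit (an address, not an amount). */
static uint64_t usable_below(uint64_t limit)
{
	uint64_t total = 0;
	size_t i;

	for (i = 0; i < sizeof(usable) / sizeof(usable[0]); i++) {
		uint64_t start = usable[i].start;
		uint64_t end = usable[i].end;

		if (start >= limit)
			continue;	/* range is entirely above the limit */
		if (end > limit)
			end = limit;	/* clip the range at the limit */
		total += end - start;
	}
	return total;
}
```

Under this model a limit of 8832MB drops exactly the top 64MB, and a limit of 4096MB leaves only the usable memory below 3319MB, which would fit the roughly 3.3GB total that top reported with mem=4096M.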


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/


Re: Kernel 2.6.21.3 does not work with 8GB of RAM on Intel 965WH motherboards.

2007-05-29 Thread Robert Hancock


Justin Piszcz wrote:
> Short Description of Problem:
> Linux 2.6.21.3 does not run properly with 8GB of ram on the Intel 965WH
> motherboard.
>
> Long Description of Problem:
> When I use 8GB of memory on my x86_64 system, CPU-bound processes are VERY
> slow, up to 36x slower than usual.  My temporary fix is force Linux to only
> use 4GB of memory, I am currently using mem=4096M.  I ran memtest86 and the
> memory is fine, not a single error.  I tried the following to mem= 1024, 2048
> 4096 and blank "" to let the kernel use all 8GB of memory.  What is wrong
> with the kernel and how come it cannot use 8GB of memory without slowing down
> all CPU-related processes to a snail-like pace?  There is something horribly
> wrong here.
>
> Specifications:
> Intel Motherboard: 965WH
> Linux Kernel: 2.6.21.3
> Distribution: Debian Testing x86_64
> GCC: gcc version 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)
> Target: x86_64-linux-gnu
>
> Tests:
>
> 1. append line = 1024M
> top - 18:28:26 up 1 min,  4 users,  load average: 0.42, 0.17, 0.06
> Tasks: 157 total,   1 running, 156 sleeping,   0 stopped,   0 zombie
> Cpu(s):  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
> Mem:   1027016k total,   964288k used,    62728k free,     1232k buffers
> Swap: 16787768k total,        0k used, 16787768k free,   105168k cached
> ---> STATUS: No problems, box is fine, no lag, etc..
>
> 2. append line = 2048M
> top - 18:34:23 up 2 min,  2 users,  load average: 0.14, 0.14, 0.05
> Tasks: 147 total,   1 running, 146 sleeping,   0 stopped,   0 zombie
> Cpu(s):  1.7%us,  1.2%sy,  0.4%ni, 95.2%id,  1.5%wa,  0.0%hi,  0.0%si,  0.0%st
> Mem:   2059696k total,   956324k used,  1103372k free,     1232k buffers
> Swap: 16787768k total,        0k used, 16787768k free,   102924k cached
> ---> STATUS: No problems, box is fine, no lag, etc..
>
> 3. append line = 4096M
> top - 18:37:55 up 1 min,  1 user,  load average: 0.52, 0.19, 0.07
> Tasks: 143 total,   1 running, 142 sleeping,   0 stopped,   0 zombie
> Cpu(s):  2.9%us,  2.2%sy,  0.7%ni, 91.6%id,  2.6%wa,  0.0%hi,  0.0%si,  0.0%st
> Mem:   3339536k total,   949792k used,  2389744k free,     1232k buffers
> Swap: 16787768k total,        0k used, 16787768k free,    99920k cached
>
> $ time ssh p34 uptime
>  19:00:16 up 1 min,  1 user,  load average: 0.67, 0.18, 0.06
> real    0m0.159s
> user    0m0.013s
> sys     0m0.003s
> ---> STATUS: No problems, box is fine, no lag, etc..
>
> 4. append line = "" (use all 8GB)
>
> top - 18:52:50 up 9 min,  1 user,  load average: 2.88, 2.43, 1.41
> Tasks: 149 total,   3 running, 146 sleeping,   0 stopped,   0 zombie
> Cpu(s): 36.3%us,  2.2%sy, 10.3%ni, 50.8%id,  0.4%wa,  0.0%hi,  0.1%si,  0.0%st
> Mem:   8104460k total,  1064416k used,  7040044k free,     3296k buffers
> Swap: 16787768k total,        0k used, 16787768k free,   201852k cached
>
> $ ssh p34
> ssh: connect to host p34 port 22: Connection refused
>
> Machine takes 5-10 minutes to boot, it acts like a 286 computer, about 8
> minutes later:
>
> $ time ssh p34 uptime  # 5 SECONDS!! 36x slower when using 8GB of RAM
>  18:51:39 up 8 min,  1 user,  load average: 2.74, 2.31, 1.30
>
> real    0m5.757s
> user    0m0.015s
> sys     0m0.004s
>
> The machine is VERY slow and this is on a gigabit network, I/O does not
> seem to be affected but rather, CPU-bound processes.
>
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>  2483 root      25   0 25324 5292 1072 R   96  0.1   4:37.12 mailgraph
>  3604 logcheck  30  10  3408 1120  544 R   91  0.0   0:03.55 grep
>
> These normally take seconds but when I use all 8GB of memory, it runs
> for a very long time.
>
> Conclusion: For now, I will be using mem=4096M until someone can help me
> understand what is happening here.  Can anyone offer any insight?
>
> I found it interesting in make menuconfig on x86_64 there is no 4GB/64GB
> options in the kernel that I remember seeing in 32bit.

That's because that option is not applicable in 64-bit mode.

Can you send your full dmesg output from the 8GB bootup?

--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/


[PATCH -mm] 2/2: PCI: disable decode of IO/memory during BAR sizing

2007-05-29 Thread Robert Hancock


Change PCI BAR sizing to disable the decode of memory or IO, as appropriate,
while we are writing the all-ones value to the BAR to determine the size.
If this is not done, the device may spuriously decode accesses to memory
areas it should not. On some Intel PCI Express chipsets, this breaks
MMCONFIG configuration space access, since the memory the graphics card ends up
decoding during this period overlaps the MMCONFIG area, and thus it steals
the accesses to the area to do any other configuration space access, including
changing the BAR back to its previous value.

However, don't do this disabling on host bridge devices, as it is reported that
some of them do silly things like disable CPU to RAM access if this is done.

Based on an original patch by Jesse Barnes.

Signed-off-by: Robert Hancock <[EMAIL PROTECTED]>

--- linux-2.6.22-rc2-mm1/drivers/pci/probe.c	2007-05-23 21:21:05.000000000 -0600
+++ linux-2.6.22-rc2-mm1edit/drivers/pci/probe.c	2007-05-29 21:31:47.000000000 -0600
@@ -180,6 +180,58 @@ static inline int is_64bit_memory(u32 ma
 	return 0;
 }
 
+#define BAR_IS_MEMORY(bar) (((bar) & PCI_BASE_ADDRESS_SPACE) == \
+			    PCI_BASE_ADDRESS_SPACE_MEMORY)
+
+/**
+ * pci_bar_size - get raw PCI BAR size
+ * @dev: PCI device
+ * @reg: BAR to probe
+ *
+ * Use basic PCI probing:
+ *   - save original BAR value
+ *   - disable MEM or IO decode in PCI_COMMAND reg if appropriate
+ *   - write all 1s to the BAR
+ *   - read back value
+ *   - reenable MEM or IO decode as necessary
+ *   - write original value back
+ *
+ * Returns raw BAR size to caller.
+ */
+static u32 pci_bar_size(struct pci_dev *dev, unsigned int reg)
+{
+	u32 orig_reg, sz;
+	u16 orig_cmd;
+
+	pci_read_config_dword(dev, reg, &orig_reg);
+	pci_read_config_word(dev, PCI_COMMAND, &orig_cmd);
+
+	/*
+	 * Disable memory or IO decode on the device while writing the test
+	 * value to the BAR. This prevents possible spurious decoding
+	 * of random addresses by the device. Don't do this for host bridges,
+	 * however, since some of them do silly things like disable CPU to RAM
+	 * access if this is done.
+	 */
+	if ((dev->class >> 8) != PCI_CLASS_BRIDGE_HOST) {
+		if (BAR_IS_MEMORY(orig_reg))
+			pci_write_config_word(dev, PCI_COMMAND,
+					      orig_cmd & ~PCI_COMMAND_MEMORY);
+		else
+			pci_write_config_word(dev, PCI_COMMAND,
+					      orig_cmd & ~PCI_COMMAND_IO);
+	}
+
+	pci_write_config_dword(dev, reg, 0xffffffff);
+	pci_read_config_dword(dev, reg, &sz);
+	pci_write_config_dword(dev, reg, orig_reg);
+
+	if ((dev->class >> 8) != PCI_CLASS_BRIDGE_HOST)
+		pci_write_config_word(dev, PCI_COMMAND, orig_cmd);
+
+	return sz;
+}
+
 static void pci_read_bases(struct pci_dev *dev, unsigned int howmany, int rom)
 {
 	unsigned int pos, reg, next;
@@ -196,16 +248,13 @@ static void pci_read_bases(struct pci_de
 		res->name = pci_name(dev);
 		reg = PCI_BASE_ADDRESS_0 + (pos << 2);
 		pci_read_config_dword(dev, reg, &l);
-		pci_write_config_dword(dev, reg, ~0);
-		pci_read_config_dword(dev, reg, &sz);
-		pci_write_config_dword(dev, reg, l);
+		sz = pci_bar_size(dev, reg);
 		if (!sz || sz == 0xffffffff)
 			continue;
 		if (l == 0xffffffff)
 			l = 0;
 		raw_sz = sz;
-		if ((l & PCI_BASE_ADDRESS_SPACE) ==
-				PCI_BASE_ADDRESS_SPACE_MEMORY) {
+		if (BAR_IS_MEMORY(l)) {
 			sz = pci_size(l, sz, (u32)PCI_BASE_ADDRESS_MEM_MASK);
 			/*
 			 * For 64bit prefetchable memory sz could be 0, if the
@@ -229,9 +278,7 @@ static void pci_read_bases(struct pci_de
 			u32 szhi, lhi;
 
 			pci_read_config_dword(dev, reg+4, &lhi);
-			pci_write_config_dword(dev, reg+4, ~0);
-			pci_read_config_dword(dev, reg+4, &szhi);
-			pci_write_config_dword(dev, reg+4, lhi);
+			szhi = pci_bar_size(dev, reg+4);
 			sz64 = ((u64)szhi << 32) | raw_sz;
 			l64 = ((u64)lhi << 32) | l;
 			sz64 = pci_size64(l64, sz64, PCI_BASE_ADDRESS_MEM_MASK);
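
For readers unfamiliar with BAR sizing, the sequence this patch wraps can be modelled in userspace roughly as follows. This is only a sketch against a toy device: struct fake_dev, bar_write() and bar_size() are hypothetical, the register constants are simplified stand-ins rather than the kernel's definitions, and the final size computation is the textbook mask-and-negate form, not the kernel's pci_size().

```c
#include <assert.h>
#include <stdint.h>

#define BAR_FLAG_BITS	0xfu		/* low bits of a memory BAR are flags */
#define CMD_IO		0x1u		/* I/O decode enable */
#define CMD_MEMORY	0x2u		/* memory decode enable */

/* A toy device with a single 32-bit memory BAR (hypothetical model). */
struct fake_dev {
	uint32_t bar;			/* current BAR contents */
	uint32_t size;			/* region size in bytes, power of two */
	uint16_t command;		/* stand-in for the PCI_COMMAND register */
	int wrote_while_decoding;	/* set if the BAR changed with decode on */
};

/* Model a BAR write: address bits below the region size read back as zero. */
static void bar_write(struct fake_dev *d, uint32_t val)
{
	if (d->command & CMD_MEMORY)
		d->wrote_while_decoding = 1;
	d->bar = (val & ~(d->size - 1)) | (d->bar & BAR_FLAG_BITS);
}

/* Size the BAR with memory decode switched off, then restore everything. */
static uint32_t bar_size(struct fake_dev *d)
{
	uint32_t orig = d->bar;
	uint16_t cmd = d->command;
	uint32_t raw;

	d->command = (uint16_t)(cmd & ~CMD_MEMORY);	/* disable decode */
	bar_write(d, 0xffffffffu);			/* write all ones */
	raw = d->bar;					/* read back */
	bar_write(d, orig);				/* restore the BAR */
	d->command = cmd;				/* restore decode */

	/* Clear the flag bits, invert and add one to get the size. */
	return ~(raw & ~BAR_FLAG_BITS) + 1;
}
```

With a device whose BAR holds 0xe0000008 and whose region is 256MB, bar_size() returns 0x10000000 and leaves both the BAR and the command register as they were, without the BAR ever holding the all-ones value while memory decode is enabled.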


[PATCH -mm] 0/2: PCI MMCONFIG-related updates

2007-05-29 Thread Robert Hancock


These two patches implement some changes in behavior related to PCI
MMCONFIG configuration space access. One changes the way in which we
validate the MCFG table provided by the BIOS by checking it against
ACPI motherboard resources instead of the E820 table. The BIOS is not
required to reserve this area in the E820 table, so checking that
results in MMCONFIG being unnecessarily disabled on some machines.

Some Intel chipsets where MMCONFIG was being disabled previously
(but won't be with the first patch) had problems, not due to the
MCFG table being broken, but because the access was hosed by the way
in which we do PCI BAR sizing. The second patch fixes this problem.

This is requested for inclusion in the -mm tree for testing.


[PATCH -mm] 1/2: MMCONFIG: validate against ACPI motherboard resources

2007-05-29 Thread Robert Hancock


This patch adds validation of the MMCONFIG table against the ACPI reserved
motherboard resources. If the MMCONFIG region is found to be reserved in
ACPI, we don't bother checking the E820 table. The PCI Express firmware spec
apparently tells BIOS developers that reservation in ACPI is required and
E820 reservation is optional, so checking against ACPI first makes sense.
Many BIOSes don't reserve the MMCONFIG region in E820 even though it is
perfectly functional, so the existing check needlessly disables MMCONFIG in
these cases.

In order to do this, MMCONFIG setup has been split into two phases. If PCI
configuration type 1 is not available then MMCONFIG is enabled early as before.
Otherwise, it is enabled later after the ACPI interpreter is enabled, since we
need to be able to execute control methods in order to check the ACPI reserved
resources. Presently this is just triggered off the end of ACPI interpreter
initialization.

There are a few other behavioral changes here:

-Validate all MMCONFIG configurations provided, not just the first one.

-Validate that the entire required length of each configuration, as determined
by the provided ending bus number, is reserved, not just the minimum required
allocation.

-Validate that the area is reserved even if we read it from the chipset directly
and not from the MCFG table. This catches the case where the BIOS didn't set the
location properly in the chipset and has mapped it over other things it
shouldn't have.
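
As a rough illustration of the "entire required length" item: MMCONFIG space takes 4KB per function, 8 functions per device and 32 devices per bus, so 1MB per bus. The sketch below works out the length implied by an ending bus number. It assumes the region starts at bus 0, the helper names are made up, and it is not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* 32 devices x 8 functions x 4096 bytes of config space = 1MB per bus. */
#define MMCFG_BYTES_PER_BUS	(32u * 8u * 4096u)

/* Bytes of MMCONFIG space needed to cover buses 0..end_bus inclusive. */
static uint64_t mmcfg_required_len(unsigned int end_bus)
{
	return (uint64_t)(end_bus + 1) * MMCFG_BYTES_PER_BUS;
}

/* Last address (inclusive) of a region at base covering buses 0..end_bus. */
static uint64_t mmcfg_last_addr(uint64_t base, unsigned int end_bus)
{
	return base + mmcfg_required_len(end_bus) - 1;
}
```

For example, an ending bus number of 255 implies a 256MB region, so a table entry at a hypothetical base of 0xe0000000 would need everything up to 0xefffffff to be reserved.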

Signed-off-by: Robert Hancock <[EMAIL PROTECTED]>

diff -up linux-2.6.22-rc2-mm1/arch/i386/pci/init.c linux-2.6.22-rc2-mm1edit/arch/i386/pci/init.c
--- linux-2.6.22-rc2-mm1/arch/i386/pci/init.c	2007-05-23 21:20:43.000000000 -0600
+++ linux-2.6.22-rc2-mm1edit/arch/i386/pci/init.c	2007-05-23 21:31:50.000000000 -0600
@@ -12,7 +12,7 @@ static __init int pci_access_init(void)
 	type = pci_direct_probe();
 #endif
 #ifdef CONFIG_PCI_MMCONFIG
-	pci_mmcfg_init(type);
+	pci_mmcfg_early_init(type);
 #endif
 	if (raw_pci_ops)
 		return 0;
diff -up linux-2.6.22-rc2-mm1/arch/i386/pci/mmconfig-shared.c linux-2.6.22-rc2-mm1edit/arch/i386/pci/mmconfig-shared.c
--- linux-2.6.22-rc2-mm1/arch/i386/pci/mmconfig-shared.c	2007-05-23 21:21:04.000000000 -0600
+++ linux-2.6.22-rc2-mm1edit/arch/i386/pci/mmconfig-shared.c	2007-05-23 21:38:19.000000000 -0600
@@ -206,9 +206,77 @@ static void __init pci_mmcfg_insert_reso
 	pci_mmcfg_resources_inserted = 1;
 }
 
-static void __init pci_mmcfg_reject_broken(int type)
+static acpi_status __init check_mcfg_resource(struct acpi_resource *res,
+	void *data)
+{
+	struct resource *mcfg_res = data;
+	struct acpi_resource_address64 address;
+	acpi_status status;
+
+	if (res->type == ACPI_RESOURCE_TYPE_FIXED_MEMORY32) {
+		struct acpi_resource_fixed_memory32 *fixmem32 =
+			&res->data.fixed_memory32;
+		if (!fixmem32)
+			return AE_OK;
+		if ((mcfg_res->start >= fixmem32->address) &&
+			(mcfg_res->end <= (fixmem32->address +
+			 fixmem32->address_length))) {
+			mcfg_res->flags = 1;
+			return AE_CTRL_TERMINATE;
+		}
+	}
+	if ((res->type != ACPI_RESOURCE_TYPE_ADDRESS32) &&
+		(res->type != ACPI_RESOURCE_TYPE_ADDRESS64))
+		return AE_OK;
+
+	status = acpi_resource_to_address64(res, &address);
+	if (ACPI_FAILURE(status) || (address.address_length <= 0) ||
+		(address.resource_type != ACPI_MEMORY_RANGE))
+		return AE_OK;
+
+	if ((mcfg_res->start >= address.minimum) &&
+		(mcfg_res->end <=
+		 (address.minimum +address.address_length))) {
+		mcfg_res->flags = 1;
+		return AE_CTRL_TERMINATE;
+	}
+	return AE_OK;
+}
+
+static acpi_status __init find_mboard_resource(acpi_handle handle, u32 lvl,
+	void *context, void **rv)
+{
+	struct resource *mcfg_res = context;
+
+	acpi_walk_resources(handle, METHOD_NAME__CRS,
+		check_mcfg_resource, context);
+
+	if (mcfg_res->flags)
+		return AE_CTRL_TERMINATE;
+
+	return AE_OK;
+}
+
+static int __init is_acpi_reserved(unsigned long start, unsigned long end)
+{
+	struct resource mcfg_res;
+
+	mcfg_res.start = start;
+	mcfg_res.end = end;
+	mcfg_res.flags = 0;
+
+	acpi_get_devices("PNP0C01", find_mboard_resource, &mcfg_res, NULL);
+	
+	if( !mcfg_res.flags )
+		acpi_get_devices("PNP0C02", find_mboard_resource, &mcfg_res, NULL);
+
+	return mcfg_res.flags;
+}
+
+static void __init pci_mmcfg_reject_broken(void)
 {
 	typeof(pci_mmcfg_config

[PATCH -mm] 1/2: MMCONFIG: validate against ACPI motherboard resources

2007-05-29 Thread Robert Hancock


This path adds validation of the MMCONFIG table against the ACPI reserved
motherboard resources. If the MMCONFIG table is found to be reserved in
ACPI, we don't bother checking the E820 table. The PCI Express firmware spec
apparently tells BIOS developers that reservation in ACPI is required and
E820 reservation is optional, so checking against ACPI first makes sense.
Many BIOSes don't reserve the MMCONFIG region in E820 even though it is
perfectly functional, the existing check needlessly disables MMCONFIG in
these cases.

In order to do this, MMCONFIG setup has been split into two phases. If PCI
configuration type 1 is not available then MMCONFIG is enabled early as before.
Otherwise, it is enabled later after the ACPI interpreter is enabled, since we
need to be able to execute control methods in order to check the ACPI reserved
resources. Presently this is just triggered off the end of ACPI interpreter
initialization.

There are a few other behavioral changes here:

-Validate all MMCONFIG configurations provided, not just the first one.

-Validate that the entire required length of each configuration, according to
the provided ending bus number, is reserved, not just the minimum required
allocation.

-Validate that the area is reserved even if we read it from the chipset directly
and not from the MCFG table. This catches the case where the BIOS didn't set the
location properly in the chipset and has mapped it over other things it
shouldn't have.
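
To illustrate the length and containment checks described above, here is a
minimal standalone sketch. It is not the patch code: the struct and function
names are hypothetical, and the only hard fact it relies on is that each PCI
bus takes 1 MiB of MMCONFIG space (32 devices x 8 functions x 4 KiB).

```c
#include <stdbool.h>
#include <stdint.h>

/* Each bus needs 1 MiB: 32 devices x 8 functions x 4 KiB of config space. */
#define MMCFG_BYTES_PER_BUS (1ULL << 20)

/* Hypothetical stand-in for one MMCONFIG configuration entry. */
struct mmcfg_region {
	uint64_t base;
	uint8_t start_bus;
	uint8_t end_bus;
};

/* Hypothetical stand-in for one range reported by a motherboard resource. */
struct reserved_range {
	uint64_t start;
	uint64_t length;
};

/* Bytes needed to cover every bus from start_bus to end_bus inclusive. */
static uint64_t mmcfg_required_length(const struct mmcfg_region *r)
{
	if (r->end_bus < r->start_bus)
		return 0;
	return ((uint64_t)(r->end_bus - r->start_bus) + 1) * MMCFG_BYTES_PER_BUS;
}

/* True if the whole region lies inside at least one reserved range. */
static bool mmcfg_is_reserved(const struct mmcfg_region *r,
			      const struct reserved_range *ranges, int count)
{
	uint64_t start = r->base;
	uint64_t end = start + mmcfg_required_length(r);	/* exclusive */
	int i;

	for (i = 0; i < count; i++) {
		uint64_t res_end = ranges[i].start + ranges[i].length;

		if (start >= ranges[i].start && end <= res_end)
			return true;
	}
	return false;
}
```

With a 64 MiB reservation, a configuration ending at bus 63 passes while one
ending at bus 255 does not, which is the distinction the second bullet is
about.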

Signed-off-by: Robert Hancock [EMAIL PROTECTED]

diff -up linux-2.6.22-rc2-mm1/arch/i386/pci/init.c linux-2.6.22-rc2-mm1edit/arch/i386/pci/init.c
--- linux-2.6.22-rc2-mm1/arch/i386/pci/init.c   2007-05-23 21:20:43.0 -0600
+++ linux-2.6.22-rc2-mm1edit/arch/i386/pci/init.c   2007-05-23 21:31:50.0 -0600
@@ -12,7 +12,7 @@ static __init int pci_access_init(void)
 	type = pci_direct_probe();
 #endif
 #ifdef CONFIG_PCI_MMCONFIG
-	pci_mmcfg_init(type);
+	pci_mmcfg_early_init(type);
 #endif
 	if (raw_pci_ops)
 		return 0;
diff -up linux-2.6.22-rc2-mm1/arch/i386/pci/mmconfig-shared.c linux-2.6.22-rc2-mm1edit/arch/i386/pci/mmconfig-shared.c
--- linux-2.6.22-rc2-mm1/arch/i386/pci/mmconfig-shared.c   2007-05-23 21:21:04.0 -0600
+++ linux-2.6.22-rc2-mm1edit/arch/i386/pci/mmconfig-shared.c   2007-05-23 21:38:19.0 -0600
@@ -206,9 +206,77 @@ static void __init pci_mmcfg_insert_reso
 	pci_mmcfg_resources_inserted = 1;
 }
 
-static void __init pci_mmcfg_reject_broken(int type)
+static acpi_status __init check_mcfg_resource(struct acpi_resource *res,
+					       void *data)
+{
+	struct resource *mcfg_res = data;
+	struct acpi_resource_address64 address;
+	acpi_status status;
+
+	if (res->type == ACPI_RESOURCE_TYPE_FIXED_MEMORY32) {
+		struct acpi_resource_fixed_memory32 *fixmem32 =
+			&res->data.fixed_memory32;
+		if (!fixmem32)
+			return AE_OK;
+		if ((mcfg_res->start >= fixmem32->address) &&
+		    (mcfg_res->end <= (fixmem32->address +
+				       fixmem32->address_length))) {
+			mcfg_res->flags = 1;
+			return AE_CTRL_TERMINATE;
+		}
+	}
+	if ((res->type != ACPI_RESOURCE_TYPE_ADDRESS32) &&
+	    (res->type != ACPI_RESOURCE_TYPE_ADDRESS64))
+		return AE_OK;
+
+	status = acpi_resource_to_address64(res, &address);
+	if (ACPI_FAILURE(status) || (address.address_length <= 0) ||
+	    (address.resource_type != ACPI_MEMORY_RANGE))
+		return AE_OK;
+
+	if ((mcfg_res->start >= address.minimum) &&
+	    (mcfg_res->end <=
+	     (address.minimum +address.address_length))) {
+		mcfg_res->flags = 1;
+		return AE_CTRL_TERMINATE;
+	}
+	return AE_OK;
+}
+
+static acpi_status __init find_mboard_resource(acpi_handle handle, u32 lvl,
+						void *context, void **rv)
+{
+	struct resource *mcfg_res = context;
+
+	acpi_walk_resources(handle, METHOD_NAME__CRS,
+			    check_mcfg_resource, context);
+
+	if (mcfg_res->flags)
+		return AE_CTRL_TERMINATE;
+
+	return AE_OK;
+}
+
+static int __init is_acpi_reserved(unsigned long start, unsigned long end)
+{
+	struct resource mcfg_res;
+
+	mcfg_res.start = start;
+	mcfg_res.end = end;
+	mcfg_res.flags = 0;
+
+	acpi_get_devices("PNP0C01", find_mboard_resource, &mcfg_res, NULL);
+
+	if( !mcfg_res.flags )
+		acpi_get_devices("PNP0C02", find_mboard_resource, &mcfg_res, NULL);
+
+	return mcfg_res.flags;
+}
+
+static void __init pci_mmcfg_reject_broken(void)
 {
 	typeof(pci_mmcfg_config[0]) *cfg;
+	int i;
 
 	if ((pci_mmcfg_config_num == 0) ||
 	    (pci_mmcfg_config == NULL) ||
@@ -228,18

[PATCH -mm] 2/2: PCI: disable decode of IO/memory during BAR sizing

2007-05-29 Thread Robert Hancock


Change PCI BAR sizing to disable the decode of memory or IO, as appropriate,
while we are writing the all-ones value to the BAR to determine the size.
If this is not done, the device may spuriously decode accesses to memory
areas it should not. On some Intel PCI Express chipsets, this breaks
MMCONFIG configuration space access, since the memory the graphics card ends up
decoding during this period overlaps the MMCONFIG area. The card then steals
the accesses to that area that any other configuration space access needs,
including the one that changes the BAR back to its previous value.

However, don't do this disabling on host bridge devices, as it is reported that
some of them do silly things like disable CPU to RAM access if this is done.
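
As a rough illustration of the sequence described above, the following
standalone sketch sizes a simulated BAR with decode switched off. Everything
in it is an assumption made for the example: struct fake_dev, fake_bar_write()
and sketch_bar_size() are hypothetical names, the device is a toy model, and
the "hazard" counter simply records a write of all ones while decode was
still enabled.

```c
#include <stdint.h>

#define CMD_IO		0x1u	/* same bit positions as the PCI command register */
#define CMD_MEMORY	0x2u
#define BAR_SPACE_IO	0x1u	/* BAR bit 0: 1 = I/O space, 0 = memory space */

/* Toy model of a device with a single 32-bit BAR. */
struct fake_dev {
	uint16_t command;
	uint32_t bar;
	uint32_t size;		/* power of two; low address bits are hardwired to 0 */
	int is_host_bridge;
	int decode_hazards;	/* all-ones writes seen while decode was still enabled */
};

/* Low BAR bits are read-only type flags: 2 bits for I/O, 4 for memory. */
static uint32_t bar_flag_mask(uint32_t bar)
{
	return (bar & BAR_SPACE_IO) ? 0x3u : 0xfu;
}

static void fake_bar_write(struct fake_dev *d, uint32_t val)
{
	uint32_t flags = d->bar & bar_flag_mask(d->bar);
	uint16_t bit = (d->bar & BAR_SPACE_IO) ? CMD_IO : CMD_MEMORY;

	if (val == 0xffffffffu && (d->command & bit))
		d->decode_hazards++;
	d->bar = (val & ~(d->size - 1) & ~bar_flag_mask(d->bar)) | flags;
}

/* Size the BAR, disabling the matching decode bit unless this is a host bridge. */
static uint32_t sketch_bar_size(struct fake_dev *d)
{
	uint32_t orig_bar = d->bar;
	uint16_t orig_cmd = d->command;
	uint32_t readback;

	if (!d->is_host_bridge) {
		uint16_t bit = (orig_bar & BAR_SPACE_IO) ? CMD_IO : CMD_MEMORY;

		d->command = (uint16_t)(orig_cmd & ~bit);
	}

	fake_bar_write(d, 0xffffffffu);
	readback = d->bar;
	fake_bar_write(d, orig_bar);
	d->command = orig_cmd;

	/* Strip the flag bits; the lowest writable address bit gives the size. */
	readback &= ~bar_flag_mask(orig_bar);
	return ~readback + 1;
}
```

In this model a normal device never sees the all-ones value while it is
decoding, whereas a host bridge does, since its command register is left
alone.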

Based on an original patch by Jesse Barnes.

Signed-off-by: Robert Hancock [EMAIL PROTECTED]

--- linux-2.6.22-rc2-mm1/drivers/pci/probe.c   2007-05-23 21:21:05.0 -0600
+++ linux-2.6.22-rc2-mm1edit/drivers/pci/probe.c   2007-05-29 21:31:47.0 -0600
@@ -180,6 +180,58 @@ static inline int is_64bit_memory(u32 ma
 	return 0;
 }
 
+#define BAR_IS_MEMORY(bar) (((bar) & PCI_BASE_ADDRESS_SPACE) == \
+			    PCI_BASE_ADDRESS_SPACE_MEMORY)
+
+/**
+ * pci_bar_size - get raw PCI BAR size
+ * @dev: PCI device
+ * @reg: BAR to probe
+ *
+ * Use basic PCI probing:
+ *   - save original BAR value
+ *   - disable MEM or IO decode in PCI_COMMAND reg if appropriate
+ *   - write all 1s to the BAR
+ *   - read back value
+ *   - reenable MEM or IO decode as necessary
+ *   - write original value back
+ *
+ * Returns raw BAR size to caller.
+ */
+static u32 pci_bar_size(struct pci_dev *dev, unsigned int reg)
+{
+	u32 orig_reg, sz;
+	u16 orig_cmd;
+
+	pci_read_config_dword(dev, reg, &orig_reg);
+	pci_read_config_word(dev, PCI_COMMAND, &orig_cmd);
+
+	/*
+	 * Disable memory or IO decode on the device while writing the test
+	 * value to the BAR. This prevents possible spurious decoding
+	 * of random addresses by the device. Don't do this for host bridges,
+	 * however, since some of them do silly things like disable CPU to RAM
+	 * access if this is done.
+	 */
+	if ((dev->class >> 8) != PCI_CLASS_BRIDGE_HOST) {
+		if (BAR_IS_MEMORY(orig_reg))
+			pci_write_config_word(dev, PCI_COMMAND,
+					      orig_cmd & ~PCI_COMMAND_MEMORY);
+		else
+			pci_write_config_word(dev, PCI_COMMAND,
+					      orig_cmd & ~PCI_COMMAND_IO);
+	}
+
+	pci_write_config_dword(dev, reg, 0xffffffff);
+	pci_read_config_dword(dev, reg, &sz);
+	pci_write_config_dword(dev, reg, orig_reg);
+
+	if ((dev->class >> 8) != PCI_CLASS_BRIDGE_HOST)
+		pci_write_config_word(dev, PCI_COMMAND, orig_cmd);
+
+	return sz;
+}
+
 static void pci_read_bases(struct pci_dev *dev, unsigned int howmany, int rom)
 {
 	unsigned int pos, reg, next;
@@ -196,16 +248,13 @@ static void pci_read_bases(struct pci_de
 		res->name = pci_name(dev);
 		reg = PCI_BASE_ADDRESS_0 + (pos << 2);
 		pci_read_config_dword(dev, reg, &l);
-		pci_write_config_dword(dev, reg, ~0);
-		pci_read_config_dword(dev, reg, &sz);
-		pci_write_config_dword(dev, reg, l);
+		sz = pci_bar_size(dev, reg);
 		if (!sz || sz == 0xffffffff)
 			continue;
 		if (l == 0xffffffff)
 			l = 0;
 		raw_sz = sz;
-		if ((l & PCI_BASE_ADDRESS_SPACE) ==
-				PCI_BASE_ADDRESS_SPACE_MEMORY) {
+		if (BAR_IS_MEMORY(l)) {
 			sz = pci_size(l, sz, (u32)PCI_BASE_ADDRESS_MEM_MASK);
 			/*
 			 * For 64bit prefetchable memory sz could be 0, if the
@@ -229,9 +278,7 @@ static void pci_read_bases(struct pci_de
 				u32 szhi, lhi;
 
 				pci_read_config_dword(dev, reg+4, &lhi);
-				pci_write_config_dword(dev, reg+4, ~0);
-				pci_read_config_dword(dev, reg+4, &szhi);
-				pci_write_config_dword(dev, reg+4, lhi);
+				szhi = pci_bar_size(dev, reg+4);
 				sz64 = ((u64)szhi << 32) | raw_sz;
 				l64 = ((u64)lhi << 32) | l;
 				sz64 = pci_size64(l64, sz64, PCI_BASE_ADDRESS_MEM_MASK);


Re: Kernel 2.6.21.3 does not work with 8GB of RAM on Intel 965WH motherboards.

2007-05-29 Thread Robert Hancock


Justin Piszcz wrote:

Short Description of Problem:
Linux 2.6.21.3 does not run properly with 8GB of ram on the Intel 965WH 
motherboard.


Long Description of Problem:
When I use 8GB of memory on my x86_64 system, CPU-bound processes are VERY
slow, up to 36x slower than usual.  My temporary fix is force Linux to only
use 4GB of memory, I am currently using mem=4096M.  I ran memtest86 and the
memory is fine, not a single error.  I tried the following to mem= 1024, 2048
4096 and blank  to let the kernel use all 8GB of memory.  What is wrong
with the kernel and how come it cannot use 8GB of memory without slowing down
all CPU-related processes to a snail-like pace?  There is something horribly
wrong here.

Specifications:
Intel Motherboard: 965WH
Linux Kernel: 2.6.21.3
Distribution: Debian Testing x86_64
GCC: gcc version 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)
Target: x86_64-linux-gnu

Tests:

1. append line = 1024M
top - 18:28:26 up 1 min,  4 users,  load average: 0.42, 0.17, 0.06
Tasks: 157 total,   1 running, 156 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   1027016k total,   964288k used,    62728k free,     1232k buffers
Swap: 16787768k total,        0k used, 16787768k free,   105168k cached
--- STATUS: No problems, box is fine, no lag, etc..

2. append line = 2048M
top - 18:34:23 up 2 min,  2 users,  load average: 0.14, 0.14, 0.05
Tasks: 147 total,   1 running, 146 sleeping,   0 stopped,   0 zombie
Cpu(s):  1.7%us,  1.2%sy,  0.4%ni, 95.2%id,  1.5%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   2059696k total,   956324k used,  1103372k free,     1232k buffers
Swap: 16787768k total,        0k used, 16787768k free,   102924k cached
--- STATUS: No problems, box is fine, no lag, etc..

3. append line = 4096M
top - 18:37:55 up 1 min,  1 user,  load average: 0.52, 0.19, 0.07
Tasks: 143 total,   1 running, 142 sleeping,   0 stopped,   0 zombie
Cpu(s):  2.9%us,  2.2%sy,  0.7%ni, 91.6%id,  2.6%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   3339536k total,   949792k used,  2389744k free,     1232k buffers
Swap: 16787768k total,        0k used, 16787768k free,    99920k cached

$ time ssh p34 uptime
 19:00:16 up 1 min,  1 user,  load average: 0.67, 0.18, 0.06
real    0m0.159s
user    0m0.013s
sys     0m0.003s
--- STATUS: No problems, box is fine, no lag, etc..

4. append line =  (use all 8GB)

top - 18:52:50 up 9 min,  1 user,  load average: 2.88, 2.43, 1.41
Tasks: 149 total,   3 running, 146 sleeping,   0 stopped,   0 zombie
Cpu(s): 36.3%us,  2.2%sy, 10.3%ni, 50.8%id,  0.4%wa,  0.0%hi,  0.1%si,  0.0%st
Mem:   8104460k total,  1064416k used,  7040044k free,     3296k buffers
Swap: 16787768k total,        0k used, 16787768k free,   201852k cached

$ ssh p34
ssh: connect to host p34 port 22: Connection refused

Machine takes 5-10 minutes to boot, it acts like a 286 computer, about 8
minutes later:

$ time ssh p34 uptime  # 5 SECONDS!! 36x slower when using 8GB of RAM
 18:51:39 up 8 min,  1 user,  load average: 2.74, 2.31, 1.30

real    0m5.757s
user    0m0.015s
sys     0m0.004s

The machine is VERY slow and this is on a gigabit network, I/O does not
seem to be affected but rather, CPU-bound processes.

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 2483 root      25   0 25324 5292 1072 R   96  0.1   4:37.12 mailgraph
 3604 logcheck  30  10  3408 1120  544 R   91  0.0   0:03.55 grep

These normally take seconds but when I use all 8GB of memory, it runs
for a very long time.

Conclusion: For now, I will be using mem=4096M until someone can help me 
understand what is happening here.  Can anyone offer any insight?


I found it interesting in make menuconfig on x86_64 there is no 4GB/64GB
options in the kernel that I remember seeing in 32bit.


That's because that option is not applicable in 64-bit mode.

Can you send your full dmesg output from the 8GB bootup?

--
Robert Hancock  Saskatoon, SK, Canada
To email, remove nospam from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/


Re: Network broken in kernel level.

2007-05-28 Thread Robert Hancock


Wang Penghui wrote:

Hello, list,

Recently, i have messed up with the follow problem, i have two server
both with two ethernet cards. Here are them:

[EMAIL PROTECTED] ~]# lspci | grep -i eth
05:00.0 Ethernet controller: Marvell Technology Group Ltd. Gigabit
Ethernet Controller (rev 18)
07:04.0 Ethernet controller: Intel Corp. 82541GI/PI Gigabit Ethernet
Controller (rev 05)

And they are running MySQL server on both of them. The OS is RHEL 4 with
the default kernel 2.6.9-5.ELsmp. These days there are lots of error
message comming out in /var/log/message and dmesg.


That kernel is very old; you should get the latest RHEL errata update
kernel and see if that helps. There have been hundreds of bugfixes in
RHEL kernels since that version.


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/


Re: [2.6.21.1] resume doesn't run suspended kernel?

2007-05-26 Thread Robert Hancock


Bill Davidsen wrote:
I was testing susp2disk in 2.6.21.1 under FC6, to support reliable 
computing environment (RCE) needs. The idea is that if power fails, 
after some short time on UPS the system does susp2disk with a time set, 
and boots back every so often to see if power is stable.


No, I don't want susp2mem until I debug it; the console comes up in a useless
mode, and console as kaleidoscope is not what I need.


Anyway, I pulled the plug on the UPS, and the system shut down. But when 
it powered up, it booted the default kernel rather than the test kernel, 
decided that it couldn't resume, and then did a cold boot.


I can bypass this by making the debug kernel the default, but WHY? Is 
the kernel not saved such that any kernel can be rolled back into memory 
and run? Actually, the answer is HELL NO, so I really ask if this is the 
intended mode of operation, that only the default boot kernel will restore.


Fedora scripts for hibernation are supposed to tell GRUB to set the 
default kernel on the next boot to be the current one before suspending 
to disk, so that it comes up with the same version it was running and 
the resume can succeed. If the way you're triggering the suspend 
bypasses this mechanism, you'll see this problem.


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/


Re: [PATCH] [scsi] Remove __GFP_DMA

2007-05-23 Thread Robert Hancock


Alan Cox wrote:

On Wed, 23 May 2007 15:17:08 -0400
"Salyzyn, Mark" <[EMAIL PROTECTED]> wrote:


The 31 bit limit for some of these cards is a problem, we currently only
do __GFP_DMA for bounce buffer sg elements allocated for user supplied
references in ioctls.

I figure we should be using pci_alloc_consistent calls for these
allocations to more accurately acquire memory within the 31 bit limit if
necessary, we could switch to these to remove the need for the __GFP_DMA
flag in the aacraid driver?


That didn't used to work right on the AMD boards when I tried it last as
we ended up with a buffer that was mapped by the IOMMU for some reason
and that was not below 2GB.


The physical address, you mean? If that is still happening then it needs
to get fixed. The allocation should not succeed if it can't provide
memory that's inside the DMA mask for the device.
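
For what it's worth, the condition being talked about can be written down in
a few lines. This is only a userspace sketch of the arithmetic, not what the
DMA API does internally; dma_mask_for_bits() and buffer_fits_dma_mask() are
hypothetical names, and the mask is assumed to be a contiguous run of low
bits (e.g. 31 bits for the cards mentioned above).

```c
#include <stdbool.h>
#include <stdint.h>

/* Mask with the low 'bits' bits set, e.g. 31 -> 0x7fffffff. */
static uint64_t dma_mask_for_bits(unsigned int bits)
{
	return bits >= 64 ? ~0ULL : (1ULL << bits) - 1;
}

/* True if every byte of [phys, phys + len) is addressable under the mask. */
static bool buffer_fits_dma_mask(uint64_t phys, uint64_t len, uint64_t mask)
{
	uint64_t last;

	if (len == 0)
		return false;
	last = phys + len - 1;
	if (last < phys)		/* address arithmetic wrapped around */
		return false;
	return (last & ~mask) == 0;
}
```

A buffer that ends exactly at the 2GB boundary passes a 31-bit mask; one that
crosses it by a single byte does not, and an allocator honouring the mask
should fail rather than hand it back.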


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/


Re: [RFC PATCH] PCI MMCONFIG: add validation against ACPI motherboard resources

2007-05-23 Thread Robert Hancock


Jesse Barnes wrote:

On Wednesday, May 23, 2007 8:20:14 Linus Torvalds wrote:

On Wed, 23 May 2007, Linus Torvalds wrote:

Sure. I think mmconfig is perfectly sane if it falls back to conf1
accesses for legacy stuff..

.. but without a regression, it's obviously a post-2.6.22 thing, I guess I
should make that clear, just because I think people send me patches after
-rc1 way too eagerly just because they think it fixes a bug.

Basically if it's not something that has _ever_ worked some way, it's not
a bug, it's a feature ;)


No, I know better than to send something after your merge window closes.  I 
have no desire to be flamed even further on this topic.  :)


And come to think of it, adding the enable/disable bits might be good even 
with the patch to make legacy accesses go through type 1, since PCIe BAR 
probing is probably done the same way (I haven't looked) and so we might run 
into the same problems there.


I think that disabling decode on non-host-bridge devices during the BAR 
sizing is something we should at least try, indeed.


The issue I have with forcing legacy config space accesses to type1 is 
that it would make it much less obvious if the MMCONFIG access wasn't 
working properly. You'd likely be able to boot up but then wonder why 
something that does extended config space accesses didn't work or hung 
the box. As I mentioned before, either we trust the MMCONFIG or we 
don't, and if we decide that we don't on a particular box, we should 
really be shutting it off entirely. Hopefully with the ACPI reservation 
checking patch and the disable-decode-during-BAR-sizing patch we wouldn't
need to add that restriction.
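
To make the legacy vs. extended distinction concrete, here is a small
standalone sketch of how the two mechanisms address a register. The function
names are hypothetical; the bit layouts are the standard ones (type 1 address
written to port 0xCF8, and one 4 KiB page per function in the MMCONFIG
window). Note that type 1 simply has no bits for registers at 0x100 and up.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Type 1 (port 0xCF8) address: enable bit, bus, device, function, and a
 * dword-aligned register number that only has room for offsets 0x00-0xff.
 */
static bool conf1_address(unsigned int bus, unsigned int dev, unsigned int fn,
			  unsigned int reg, uint32_t *out)
{
	if (bus > 255 || dev > 31 || fn > 7 || reg > 0xff)
		return false;
	*out = 0x80000000u | (bus << 16) | (dev << 11) | (fn << 8) |
	       (reg & 0xfcu);
	return true;
}

/*
 * Offset into the MMCONFIG window: each function gets a full 4 KiB page,
 * so registers 0x100-0xfff (extended config space) are reachable.
 */
static bool mmconfig_offset(unsigned int bus, unsigned int dev, unsigned int fn,
			    unsigned int reg, uint64_t *out)
{
	if (bus > 255 || dev > 31 || fn > 7 || reg > 0xfff)
		return false;
	*out = ((uint64_t)bus << 20) | ((uint64_t)dev << 15) |
	       ((uint64_t)fn << 12) | reg;
	return true;
}
```

So a scheme that routes legacy accesses through type 1 would only touch the
MMCONFIG window for registers that conf1_address() rejects here, which is
why breakage could go unnoticed until something needs extended config space.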

But yes, post-2.6.22 for all of this :-)

--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/

Re: [RFC PATCH] PCI MMCONFIG: add validation against ACPI motherboard resources

2007-05-23 Thread Robert Hancock

Jesse Barnes wrote:

On Wednesday, May 23, 2007 4:04 pm David Miller wrote:

From: Linus Torvalds <[EMAIL PROTECTED]>
Date: Wed, 23 May 2007 15:16:23 -0700 (PDT)

That crap should be seen for the crap it is! Dammit, how hard can
it be to just admit that mmconfig isn't that great?

I knew mmconfig was broken conceptually the first time I started
seeing write posting "bug fixes" for it that would do a read back
from PCI config space via mmconfig to post the write, which of course
has potential side-effects on the device and is absolutely illegal if
the write just performed put the device into a PM state or whatever.

I've actually seen that specific form of posted write flushing cause 
crashes on some machines, so yes, it sucks.

Unfortunately, I don't think we have any other way of getting at 
extended config space on x86, unless EFI provides methods or something, 
but I'm not sure that would be an improvement...

That "fix" shouldn't be needed at all, the MMCONFIG memory range 
shouldn't be covered by PCI ordering rules, so there should be no such 
thing as write posting. I suspect that the author of such patch(es) was 
doing so out of some misguided sense that it was needed. (And if there 
is some chipset where it is actually needed, better just disable 
MMCONFIG on that one, as there's no way to use it sanely.)

--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/

Re: [RFC PATCH] PCI MMCONFIG: add validation against ACPI motherboard resources

2007-05-23 Thread Robert Hancock


Linus Torvalds wrote:


On Wed, 23 May 2007, Jesse Barnes wrote:
Fixed it (finally).  I don't think moving the 64 bit probing around 
would make a difference, since we'd restore its original value anyway 
before moving on to the 32 bit probe which is where I think the problem 
is.


Well, the thing is, I'm pretty sure there is at least one northbridge that 
stops memory accesses from the CPU when you turn off the MEM bit on it. 
Oops, you just killed the machine.


Which is retarded, since the command bits are only supposed to be for 
memory ranges that are part of the BARs, it's not supposed to completely 
kill the device function. Unless somehow the memory on that system is 
accessed through the PCI bus or something. Anyway, it's something we 
have to deal with.




Looking at the 925X datasheet (which I happened to have around in my 
google search history because of the discussions of the sky2 DMA 
problems), it looks like at least that one just hardcodes the MEM bit to 
be 1, and thus writing to it is a total no-op.


But I really think that clearing the MEM bit for at least the host bridge 
is conceptually quite wrong, even if it might turn out that all chipsets 
end up just saying (like Intel) "screw it, the user is insane, we're not 
going to actually do what he asks us to do".


Do we really want to be that insane? Turn off memory accesses when probing 
the CPU host bridge?


So at a _minimum_ I would say that that thing needs to be more careful 
about host bridges. Maybe it's not needed, who knows? 


I think we should likely avoid disabling the command bits on host 
bridges (maybe any bridge) due to this risk of disabling something that 
will break things. Ideally we can get around this without doing any 
disabling at all, as noted in my last email.




Linus, since you were the one concerned about breaking working setups, 
what do you think?  Should we use this approach, or specifically quirk 
out cases where mmconfig space might conflict with BAR probing?


So see above. I think at a minimum, we should consider the host bridge 
special.


I also suspect that we'd be simply better off if we didn't use mmconfig at 
all unless we _have_ to. Why use mmconfig for the standard BAR accesses? 
Is there really any reason? I can understand using it for extended config 
space, since then the old-fashioned approach won't work. But for normal 
accesses? What's the point, really?


Why not? Either you trust that the MMCONFIG is working or you don't. If 
you trust it, you might as well use it for everything, and if you don't, 
you can't risk using it for anything. If there are problems that show up 
 only with MMCONFIG, doing what you propose would simply cover them up 
until somebody actually tried accessing extended config space.


mmconfig seems to be fundamentally designed to be impossible to bootstrap 
off, so there's no way you can have a machine that _only_ supports 
mmconfig. So why do people seem to think it's so wonderful? Please fill me 
in on this fundamental mystery.


Sure you can bootstrap off it, you just need to have some way to know 
where to find it (either ACPI or some other system-specific mechanism).




Quite frankly, if we just didn't use mmconfig, the whole issue would go 
away. Isn't _that_ the much better solution?


I don't think that is going to be viable in the long run now that 
Windows Vista is out and MS is actually encouraging HW developers to 
allow using that config space.


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/


Re: [RFC PATCH] PCI MMCONFIG: add validation against ACPI motherboard resources

2007-05-23 Thread Robert Hancock


Jesse Barnes wrote:

On Tuesday, May 22, 2007 6:06 pm Robert Hancock wrote:

There was a big discussion about this back in 2002, in which Linus
wasn't overly enthused about disabling the decode during probing due
to risk of causing problems with some devices:

http://lkml.org/lkml/2002/12/19/145

In this particular case (64-bit BAR) we might be able to avoid the
problem by changing the order in which we probe the two halves of the
address, i.e. change the top half to 0xFFFFFFFF before messing with
the bottom half and then change it back last. That way, we end up
mapping it way to the top of 64-bit address space, which hopefully is
less likely to conflict.


Fixed it (finally).  I don't think moving the 64 bit probing around 
would make a difference, since we'd restore its original value anyway 
before moving on to the 32 bit probe which is where I think the problem 
is.


You couldn't just reorder the code the way it is now, you'd have to 
rearrange the way we do things for 64-bit BARs:


-write 0xFFFFFFFF to high part of 64-bit address (we end up moving the BAR
to 0xFFFFFFFFC0000000 for example)
-If any bits stick, we know what the size is now (more than 4GB of
decode), so just change it back, we're done
-If not, we need to check the low part, so write 0xFFFFFFFF to low part of
64-bit address (BAR moves to 0xFFFFFFFFFFFFFFFF)
-Check which bits stick and calculate the address
-Change the low part of the address back (BAR moves to 0xFFFFFFFFC0000000)
-Change the high part of the address back (BAR moves to the original
0xC0000000 address)


This means that at no point do we map the BAR anywhere near the top of 
32-bit memory, so we should avoid this issue in this particular case. I 
don't think this strategy is too likely to break anything, surely less 
likely than disabling command bits. Jesse, you might want to try hacking 
up something like this and see what happens.
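
A rough standalone sketch of the ordering above, run against a simulated
64-bit BAR rather than real hardware. All names are hypothetical, and the
early exit is my reading of the second step: if some bits of the high half
read back as zero, the decode is larger than 4GB and the low half does not
need to be touched.

```c
#include <stdint.h>

/* Toy 64-bit memory BAR: two 32-bit halves plus a power-of-two size. */
struct fake_bar64 {
	uint32_t lo;	/* low 4 bits are read-only type flags */
	uint32_t hi;
	uint64_t size;
};

static uint64_t bar_addr(const struct fake_bar64 *b)
{
	return ((uint64_t)b->hi << 32) | (b->lo & ~0xfu);
}

static void write_lo(struct fake_bar64 *b, uint32_t val)
{
	uint32_t writable = (uint32_t)~(b->size - 1) & ~0xfu;

	b->lo = (val & writable) | (b->lo & 0xfu);
}

static void write_hi(struct fake_bar64 *b, uint32_t val)
{
	b->hi = val & (uint32_t)(~(b->size - 1) >> 32);
}

/*
 * Probe the high half first so the BAR is parked at the top of the 64-bit
 * address space before the low half is disturbed. Returns the decoded size
 * and reports the lowest address the BAR occupied while it was moved.
 */
static uint64_t probe_high_half_first(struct fake_bar64 *b,
				      uint64_t *lowest_transient)
{
	const uint32_t orig_lo = b->lo, orig_hi = b->hi;
	uint32_t sz_hi, sz_lo = 0;
	uint64_t lowest;

	write_hi(b, 0xffffffffu);
	sz_hi = b->hi;
	lowest = bar_addr(b);

	if (sz_hi == 0xffffffffu) {
		/* Size is under 4GB, so the low half decides it. */
		write_lo(b, 0xffffffffu);
		sz_lo = b->lo & ~0xfu;
		if (bar_addr(b) < lowest)
			lowest = bar_addr(b);
		write_lo(b, orig_lo);
		if (bar_addr(b) < lowest)
			lowest = bar_addr(b);
	}
	write_hi(b, orig_hi);

	*lowest_transient = lowest;
	return ~(((uint64_t)sz_hi << 32) | sz_lo) + 1;
}
```

For a 256MB BAR originally at 0xC0000000, the lowest address it occupies
while being probed is 0xFFFFFFFFC0000000, so it never passes through the top
of 32-bit space.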


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/

Re: [RFC PATCH] PCI MMCONFIG: add validation against ACPI motherboard resources

2007-05-23 Thread Robert Hancock


Linus Torvalds wrote:


On Wed, 23 May 2007, Jesse Barnes wrote:
Fixed it (finally).  I don't think moving the 64 bit probing around 
would make a difference, since we'd restore its original value anyway 
before moving on to the 32 bit probe which is where I think the problem 
is.


Well, the thing is, I'm pretty sure there is at least one northbridge that 
stops memory accesses from the CPU when you turn off the MEM bit on it. 
Oops, you just killed the machine.


That is broken behavior, since the command bits are only supposed to 
control decoding of the memory ranges covered by the BARs; they are not 
supposed to completely kill the device function. Unless somehow the 
memory on that system is accessed through the PCI bus or something. 
Anyway, it's something we have to deal with.




Looking at the 925X datasheet (which I happened to have around in my 
google search history because of the discussions of the sky2 DMA 
problems), it looks like at least that one just hardcodes the MEM bit to 
be 1, and thus writing to it is a total no-op.


But I really think that clearing the MEM bit for at least the host bridge 
is conceptually quite wrong, even if it might turn out that all chipsets 
end up just saying (like Intel) screw it, the user is insane, we're not 
going to actually do what he asks us to do.


Do we really want to be that insane? Turn off memory accesses when probing 
the CPU host bridge?


So at a _minimum_ I would say that that thing needs to be more careful 
about host bridges. Maybe it's not needed, who knows? 


I think we should likely avoid disabling the command bits on host 
bridges (maybe any bridge) due to this risk of disabling something that 
will break things. Ideally we can get around this without doing any 
disabling at all, as noted in my last email.




Linus, since you were the one concerned about breaking working setups, 
what do you think?  Should we use this approach, or specifically quirk 
out cases where mmconfig space might conflict with BAR probing?


So see above. I think at a minimum, we should consider the host bridge 
special.


I also suspect that we'd be simply better off if we didn't use mmconfig at 
all unless we _have_ to. Why use mmconfig for the standard BAR accesses? 
Is there really any reason? I can understand using it for extended config 
space, since then the old-fashioned approach won't work. But for normal 
accesses? What's the point, really?


Why not? Either you trust that the MMCONFIG is working or you don't. If 
you trust it, you might as well use it for everything, and if you don't, 
you can't risk using it for anything. If there are problems that show up 
only with MMCONFIG, doing what you propose would simply cover them up 
until somebody actually tried accessing extended config space.


mmconfig seems to be fundamentally designed to be impossible to bootstrap 
off, so there's no way you can have a machine that _only_ supports 
mmconfig. So why do people seem to think it's so wonderful? Please fill me 
in on this fundamental mystery.


Sure you can bootstrap off it, you just need to have some way to know 
where to find it (either ACPI or some other system-specific mechanism).




Quite frankly, if we just didn't use mmconfig, the whole issue would go 
away. Isn't _that_ the much better solution?


I don't think that is going to be viable in the long run now that 
Windows Vista is out and MS is actually encouraging HW developers to 
allow using that config space..


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove nospam from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/


Re: [RFC PATCH] PCI MMCONFIG: add validation against ACPI motherboard resources

2007-05-23 Thread Robert Hancock

Jesse Barnes wrote:

On Wednesday, May 23, 2007 4:04 pm David Miller wrote:

From: Linus Torvalds [EMAIL PROTECTED]
Date: Wed, 23 May 2007 15:16:23 -0700 (PDT)

That crap should be seen for the crap it is! Dammit, how hard can
it be to just admit that mmconfig isn't that great?

I knew mmconfig was broken conceptually the first time I started
seeing write posting bug fixes for it that would do a read back
from PCI config space via mmconfig to post the write, which of course
has potential side-effects on the device and is absolutely illegal if
the write just performed put the device into a PM state or whatever.

I've actually seen that specific form of posted write flushing cause 
crashes on some machines, so yes, it sucks.

Unfortunately, I don't think we have any other way of getting at 
extended config space on x86, unless EFI provides methods or something, 
but I'm not sure that would be an improvement...

That fix shouldn't be needed at all: the MMCONFIG memory range 
shouldn't be covered by PCI ordering rules, so there should be no such 
thing as write posting. I suspect that the author of such patch(es) 
added the read-back out of some misguided sense that it was needed. (And 
if there is some chipset where it is actually needed, better to just 
disable MMCONFIG on that one, as there's no way to use it sanely.)

--
Robert Hancock  Saskatoon, SK, Canada
To email, remove nospam from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/

Re: [RFC PATCH] PCI MMCONFIG: add validation against ACPI motherboard resources

2007-05-23 Thread Robert Hancock


Jesse Barnes wrote:

On Wednesday, May 23, 2007 8:20:14 Linus Torvalds wrote:

On Wed, 23 May 2007, Linus Torvalds wrote:

Sure. I think mmconfig is perfectly sane if it falls back to conf1
accesses for legacy stuff..

.. but without a regression, it's obviously a post-2.6.22 thing, I guess I
should make that clear, just because I think people send me patches after
-rc1 way too eagerly just because they think it fixes a bug.

Basically if it's not something that has _ever_ worked some way, it's not
a bug, it's a feature ;)


No, I know better than to send something after your merge window closes.  I 
have no desire to be flamed even further on this topic.  :)


And come to think of it, adding the enable/disable bits might be good even 
with the patch to make legacy accesses go through type 1, since PCIe BAR 
probing is probably done the same way (I haven't looked) and so we might run 
into the same problems there.


I think that disabling decode on non-host-bridge devices during the BAR 
sizing is something we should at least try, indeed.


The issue I have with forcing legacy config space accesses to type1 is 
that it would make it much less obvious if the MMCONFIG access wasn't 
working properly. You'd likely be able to boot up but then wonder why 
something that does extended config space accesses didn't work or hung 
the box. As I mentioned before, either we trust the MMCONFIG or we 
don't, and if we decide that we don't on a particular box, we should 
really be shutting it off entirely. Hopefully with the ACPI reservation 
checking patch and the disable-decode-during-BAR-sizing patch we 
wouldn't need to add that restriction.

But yes, post-2.6.22 for all of this :-)

--
Robert Hancock  Saskatoon, SK, Canada
To email, remove nospam from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/

Re: [PATCH] [scsi] Remove __GFP_DMA

2007-05-23 Thread Robert Hancock


Alan Cox wrote:

On Wed, 23 May 2007 15:17:08 -0400
Salyzyn, Mark [EMAIL PROTECTED] wrote:


The 31 bit limit for some of these cards is a problem, we currently only
do __GFP_DMA for bounce buffer sg elements allocated for user supplied
references in ioctls.

I figure we should be using pci_alloc_consistent calls for these
allocations to more accurately acquire memory within the 31 bit limit if
necessary, we could switch to these to remove the need for the __GFP_DMA
flag in the aacraid driver?


That didn't used to work right on the AMD boards when I tried it last as
we ended up with a buffer that was mapped by the IOMMU for some reason
and that was not below 2GB.


The physical address you mean? If that is still happening then it needs 
to get fixed. The allocation should not succeed if it can't provide 
memory that's inside the DMA mask for the device..
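
As a rough illustration of the check I have in mind (a user-space sketch 
only, not the kernel's actual allocator logic; fits_dma_mask() is a 
made-up helper name), a buffer is acceptable only if its last byte is 
still addressable under the device's mask. The 31-bit limit from the 
aacraid discussion corresponds to a mask of 0x7fffffff:

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical helper: true if [bus_addr, bus_addr + size) lies entirely
 * within the addresses a device can reach with the given DMA mask. */
static bool fits_dma_mask(uint64_t bus_addr, uint64_t size, uint64_t mask)
{
    if (size == 0)
        return false;               /* treat empty buffers as invalid */
    if (bus_addr > mask)
        return false;               /* starts above the limit */
    return size - 1 <= mask - bus_addr;  /* last byte must also fit */
}
```

An allocator honouring the mask would have to fail the request rather 
than hand back a buffer for which a check like this returns false.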


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove nospam from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/


Re: [RFC PATCH] PCI MMCONFIG: add validation against ACPI motherboard resources

2007-05-22 Thread Robert Hancock


Jesse Barnes wrote:

On Tuesday, May 22, 2007, Robert Hancock wrote:

Eww. I don't see where we disable the decode at all while we probe the
BARs on the device. That seems like a bad thing, especially with the way
we probe 64-bit BARs (do the low 32 bits first and then the high 32
bits). This means the base address effectively gets set to 0xfff0
momentarily, which might cause some issues.


I'm a bit shocked that things work as well as they do without the 
disabling...



I'd try adding some code inside pci_setup_device (drivers/pci/probe.c)
to disable PCI_COMMAND_IO and PCI_COMMAND_MEMORY on the device when
probing devices with the standard header type and then restoring the
previous command bits afterwards, and see what effect that has. It'll be
  interesting if it does, since obviously it seems to work as it is with
non-MMCONFIG access methods. Maybe the base address being set like that
interferes with MMCONFIG access itself somehow?


I tried that, and it seems to get past probing the graphics device at 
least, but it hangs a bit later.  It could be that the enable/disable I 
added wasn't correct though, I didn't check to see which one I should 
disable in the command word, which may be a problem (just disabled them 
both every probe).  I'll try again with more precise enable/disable 
semantics.


There was a big discussion about this back in 2002, in which Linus 
wasn't overly enthused about disabling the decode during probing due to 
risk of causing problems with some devices:


http://lkml.org/lkml/2002/12/19/145

In this particular case (64-bit BAR) we might be able to avoid the 
problem by changing the order in which we probe the two halves of the 
address, i.e. change the top half to 0x before messing with the 
bottom half and then change it back last. That way, we end up mapping it 
way to the top of 64-bit address space, which hopefully is less likely 
to conflict..


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/

Re: [RFC PATCH] PCI MMCONFIG: add validation against ACPI motherboard resources

2007-05-22 Thread Robert Hancock


Jesse Barnes wrote:

On Tuesday, May 22, 2007, Robert Hancock wrote:

Eww. I don't see where we disable the decode at all while we probe the
BARs on the device. That seems like a bad thing, especially with the way
we probe 64-bit BARs (do the low 32 bits first and then the high 32
bits). This means the base address effectively gets set to 0xfff0
momentarily, which might cause some issues.


I'm a bit shocked that things work as well as they do without the 
disabling...



I'd try adding some code inside pci_setup_device (drivers/pci/probe.c)
to disable PCI_COMMAND_IO and PCI_COMMAND_MEMORY on the device when
probing devices with the standard header type and then restoring the
previous command bits afterwards, and see what effect that has. It'll be
  interesting if it does, since obviously it seems to work as it is with
non-MMCONFIG access methods. Maybe the base address being set like that
interferes with MMCONFIG access itself somehow?


I tried that, and it seems to get past probing the graphics device at 
least, but it hangs a bit later.  It could be that the enable/disable I 
added wasn't correct though, I didn't check to see which one I should 
disable in the command word, which may be a problem (just disabled them 
both every probe).  I'll try again with more precise enable/disable 
semantics.


It'd be interesting to see at what access it ran into trouble next, at 
least if it's consistent. Could be that some device doesn't like having 
the decode disabled..


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/

Re: [RFC PATCH] PCI MMCONFIG: add validation against ACPI motherboard resources

2007-05-22 Thread Robert Hancock


Jesse Barnes wrote:

On Monday, May 21, 2007, Jesse Barnes wrote:

Yeah, I've got that data... just a sec while I make sure it's
reproducible...

Aha, I hadn't decoded the devfn before, looks like it's dying on an
access to the graphics device (bus 0, slot 2, device 0):

...
pci_mmcfg_read: 0, 0, 0x10, 0x18, 4 = 0xc00c
pci_mmcfg_read: 0, 0, 0x10, 0x18, 4 = hang
...

Offset 0x18 into the graphics config space should be the graphics memory
range address, and 0xc00c is the correct value.  But for some reason
it hangs on the second access.

It hangs here every time.


That register is in the config space BAR region, so it should be ok to 
write 0x to it and read it back to size the register.  However, 
it's after writing the 0x to it and trying to read it back that 
the machine hangs.  I didn't see any accesses to the command register to 
disable decoding (at least not via the mmconfig methods), so maybe that's 
broken during MCFG based probing?


Eww. I don't see where we disable the decode at all while we probe the 
BARs on the device. That seems like a bad thing, especially with the way 
we probe 64-bit BARs (do the low 32 bits first and then the high 32 
bits). This means the base address effectively gets set to 0xfff0 
momentarily, which might cause some issues.


I'd try adding some code inside pci_setup_device (drivers/pci/probe.c) 
to disable PCI_COMMAND_IO and PCI_COMMAND_MEMORY on the device when 
probing devices with the standard header type and then restoring the 
previous command bits afterwards, and see what effect that has. It'll be 
interesting if it does, since obviously it seems to work as it is with 
non-MMCONFIG access methods. Maybe the base address being set like that 
interferes with MMCONFIG access itself somehow?
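
Roughly what I mean, as a stand-alone C sketch rather than a patch: the 
three PCI_COMMAND* values match the kernel's register definitions, but 
struct fake_dev, the cfg_* accessors and probe_bars() are invented 
stand-ins for the real device, the config accessors and the BAR probing 
done from pci_setup_device().

```c
#include <stdint.h>

#define PCI_COMMAND         0x04  /* offset of the 16-bit command register */
#define PCI_COMMAND_IO      0x1   /* enable response in I/O space */
#define PCI_COMMAND_MEMORY  0x2   /* enable response in memory space */

/* Hypothetical stand-in for a device's config space. */
struct fake_dev {
    uint16_t command;
    int probes_with_decode_on;  /* probes that ran while decode was enabled */
};

static uint16_t cfg_read_command(struct fake_dev *d)
{
    return d->command;
}

static void cfg_write_command(struct fake_dev *d, uint16_t v)
{
    d->command = v;
}

/* Placeholder for the BAR sizing pass; it only records decode state. */
static void probe_bars(struct fake_dev *d)
{
    if (d->command & (PCI_COMMAND_IO | PCI_COMMAND_MEMORY))
        d->probes_with_decode_on++;
}

/* Disable I/O and memory decode, probe, then restore the old command bits. */
static void probe_with_decode_disabled(struct fake_dev *d)
{
    uint16_t orig = cfg_read_command(d);

    cfg_write_command(d,
        (uint16_t)(orig & ~(PCI_COMMAND_IO | PCI_COMMAND_MEMORY)));
    probe_bars(d);
    cfg_write_command(d, orig);
}
```

The point of saving and restoring the whole command word is that 
unrelated bits (bus mastering and so on) come back exactly as they were.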


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/

Re: Enabling power states for Core 2 Duo

2007-05-22 Thread Robert Hancock


Paa Paa wrote:
For some reason I'm not able to enable processor power states (c1, c2 
etc.) for my Core 2 Duo. This is what I get:


cat /proc/acpi/processor/CPU1/info
processor id:0
acpi id: 1
bus mastering control:   no
power management:no
throttling control:  no
limit interface: no

cat /proc/acpi/processor/CPU1/power
active state:C0
max_cstate:  C8
bus master activity: 
maximum allowed latency: 2000 usec
states:

"dmesg | grep -i power" also gives nothing. I have ACPI enabled in BIOS 
and in kernel I have these set ("grep -i acpi .config | grep =y"):


CONFIG_ACPI=y
CONFIG_ACPI_PROCESSOR=y
CONFIG_ACPI_THERMAL=y
CONFIG_ACPI_EC=y
CONFIG_ACPI_POWER=y
CONFIG_ACPI_SYSTEM=y
CONFIG_X86_ACPI_CPUFREQ=y
CONFIG_PNPACPI=y
CONFIG_SATA_ACPI=y

I'm probably missing something crucial here. So how do I enable power 
states? I'm using 64-bit Gentoo. My mobo is Asus P5B Deluxe. Otherwise 
ACPI works fine.


The BIOS has to expose this support in ACPI. If it doesn't (which is 
often the case on desktop boards), you won't get any C-state support 
(well, except for C1, which is just the normal halt state).


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/


Re: [RFC PATCH] PCI MMCONFIG: add validation against ACPI motherboard resources

2007-05-21 Thread Robert Hancock


Jesse Barnes wrote:

What happens if you take out the chipset register detection, does
the MCFG table give you the same result? Wonder if they're doing
something funny with start/end bus values or something in their
table. There's some code in my patch that prints out the important
data from the MCFG table, can you tell me what that shows with the
chipset detection taken out?


I can't see how any MCFG based accesses could work on this box, but I
don't know why.  According to the boot log (with our code patched in
but disabled after checking the ACPI reserved status), the space is fine:

...
ACPI: (supports S0 S3 S4 S5)
ACPI: Using IOAPIC for interrupt routing
pciexbar lo: 0xf003
pciexbar hi: 0x
Enabled MCFG space at 0xf000, size 134217728
PCI: Found Intel Corporation G965 Express Memory Controller Hub with MMCONFIG 
support.
PCI: MCFG configuration 0: base f000 segment 0 buses 0 - 127
PCI: MCFG area at f000 reserved in ACPI motherboard resources
PCI: Not using MMCONFIG. <-- due to the 'goto reject' after
 if (is_acpi_reserved) { ... }
PM: Adding info for acpi:acpi_system:00
PM: Adding info for acpi:button_power:00
...

Same thing happens if I disable the chipset specific code and just use
the ACPI stuff you added.

If I leave it enabled, several config cycles work fine, but the box
eventually hangs after probing 24 devices or so.  I don't see anything
else mapped into this space, and the MTRRs seem ok, so either there's
something hidden in this memory range or there's another chipset register
that needs poking to fully enable this space properly.

Sysrq doesn't seem to work, and I don't see any events in my machine log,
so figuring out exactly why it's hanging is a bit difficult.

Any ideas on what to try next?  I'll see if I can get some more details
from our BIOS folks and do yet another pass over the documentation to see
if there's something I'm missing.


Can you find out which config access (bus, device, function, address) is 
the one that hangs the box? I assume that either the corresponding 
address in the MCFG table is problematic (i.e. has something else mapped 
over it), or maybe that device just doesn't like being probed with MCFG 
somehow.
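
In case it helps with the tracing, here is a small C sketch of how a 
(bus, devfn, register) triple maps to an offset inside the MMCONFIG 
window and back again. The layout (1MB per bus, 4KB per 
device/function) is the standard one; mmcfg_offset(), mmcfg_decode() and 
struct cfg_access are names I made up for illustration, and the offset 
is relative to whatever base the MCFG table reports.

```c
#include <stdint.h>

/* Offset of a config register inside the MMCONFIG window:
 * 1MB per bus, 4KB per device/function (devfn = slot << 3 | function). */
static uint64_t mmcfg_offset(unsigned bus, unsigned devfn, unsigned reg)
{
    return ((uint64_t)bus << 20) | ((uint64_t)devfn << 12) | (reg & 0xfff);
}

/* Hypothetical container for a decoded access. */
struct cfg_access {
    unsigned bus, slot, func, reg;
};

/* Turn a window offset back into bus/slot/function/register. */
static struct cfg_access mmcfg_decode(uint64_t offset)
{
    struct cfg_access a;

    a.bus  = (offset >> 20) & 0xff;
    a.slot = (offset >> 15) & 0x1f;
    a.func = (offset >> 12) & 0x7;
    a.reg  = offset & 0xfff;
    return a;
}
```

For example, devfn 0x10 with register 0x18 on bus 0 lands at offset 
0x10018 and decodes back to slot 2, function 0.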


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/


Re: Add Seagate STT20000A to DMA blacklist.

2007-05-21 Thread Robert Hancock


Dave Jones wrote:

On Mon, May 21, 2007 at 05:15:51PM +0100, Alan Cox wrote:
 > On Mon, 21 May 2007 10:50:42 -0400
 > Dave Jones <[EMAIL PROTECTED]> wrote:
 > 
 > > http://bugzilla.kernel.org/show_bug.cgi?id=1044

 > > has been open for _four_ years with a patch available.
 > > Here's a rediffed version of the same.
 > 
 > Please update libata as well when you update the blacklists.


Sure, point me at the table(s) ?

Dave



ata_device_blacklist in drivers/ata/libata-core.c

--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/


Re: something strange in libata-core.c for kernel 2.6.22-rc3

2007-05-21 Thread Robert Hancock


Alan Cox wrote:

Yeah, that's consistent with what I've seen on my machine which is a
variant of A8N.  No matter what value I threw at _STM, _GTM just
echoed the result thus always leading to 80c configuration.


I guess this means that what we have to do is trust that the BIOS set up
a reasonable mode and base the cable detect on that (either by reading
back the boot-up controller registers, or by calling GTM). I imagine
this is what the Windows default IDE driver is doing (just using the
boot-up mode and feeding it back using GTM/STM on suspend/resume cycles).

Alan, what do you think?


Interesting, sounds like it is still useful rather than just reading the
registers as the GTM/STM seem to survive resume cycles which drive config
may not (e.g. if the driver is loaded after a s2ram/resume).


I don't think that case is handled in this BIOS anyway - if you call GTM 
after resume without previously calling STM, it's just going to read 
whatever random values are in the controller and give you timings based 
on that, which presumably will be junk.


It looks like the main purpose for what it's doing with saving those 
registers in the _PTS method is to save and restore a couple of 
controller registers called ID20 (PCI config space offset 0x50, 16 bits) 
and ID22 (PCI config space offset 0x5C, 32 bits) which aren't otherwise 
used in the AML. According to pata_amd, for the AMD IDE interface the 
former is some reserved bits as well as the cable detect bits, while the 
latter is the cycle time and address setup time register. Presumably 
those aren't really the cable detect bits though, since the detection 
based on those bits in pata_amd doesn't really work..



If it just echoes back we should also be able to detect this by using
knowingly invalid values.


Well, this implementation doesn't purely echo back the same values, it 
echoes back values derived from what the controller was actually set to, 
so I imagine if you put in something ridiculous it would come back with 
the closest possible mode that it was set to (PIO mode 0, etc.)


I suspect the implementation we would need to use (which doesn't depend 
on anything not given in the spec) would be:


-On driver load, execute _GTM to get the timing mode the BIOS had set. 
Assume this represents the fastest modes the controller supports, and 
set cable detect based on whether it includes UDMA modes > 2.


-If we decide to set a slower mode (speed down due to errors, etc.), set 
it using _STM and then read back the actual values that were set using 
_GTM (for possible use in suspend/resume).


-On resume after suspend, re-set the last mode using _STM followed by 
executing _GTF and running those commands.
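
To sketch the bookkeeping those three steps imply (plain C, nothing 
ACPI-specific): this assumes the timings returned by _GTM have already 
been translated into a UDMA mode number, which is a simplification, and 
struct port_state plus the port_* functions are hypothetical names, not 
anything in libata.

```c
enum cable_type { CABLE_40_WIRE, CABLE_80_WIRE };

/* Hypothetical per-port state kept by the driver. */
struct port_state {
    int bios_udma;          /* UDMA mode the BIOS had set at driver load */
    int current_udma;       /* last mode we set, re-applied on resume */
    enum cable_type cable;
};

/* Driver load: trust the BIOS mode and infer the cable from it. */
static void port_load(struct port_state *p, int gtm_udma_mode)
{
    p->bios_udma = gtm_udma_mode;
    p->current_udma = gtm_udma_mode;
    /* UDMA modes above 2 need an 80-wire cable. */
    p->cable = gtm_udma_mode > 2 ? CABLE_80_WIRE : CABLE_40_WIRE;
}

/* Speed-down: only ever move to a slower mode than the current one. */
static void port_speed_down(struct port_state *p, int new_mode)
{
    if (new_mode < p->current_udma)
        p->current_udma = new_mode;
}

/* Resume: the mode to hand back to _STM before running the _GTF commands. */
static int port_resume_mode(const struct port_state *p)
{
    return p->current_udma;
}
```

The real thing would store the raw _GTM timing values rather than a 
mode number, since those are what _STM takes.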


This won't handle the case where the driver is loaded after the system 
was already suspended to RAM and resumed, however I don't know exactly 
how one could handle that in this situation..


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/


Re: something strange in libata-core.c for kernel 2.6.22-rc3

2007-05-21 Thread Robert Hancock


Tejun Heo wrote:

[EMAIL PROTECTED] wrote:

Maybe I am wrong, but if you are detecting a 40-wire cable in order to
limit the drive to DMA/33, why does the check also include 80-wire
cables, limiting them to DMA/33 too?

With this patch my nvidia4 IDE controller detects the cables correctly
and configures DMA/100 for my HD and DMA/33 for my DVD (the first
uses an 80-wire cable, the second a 40-wire cable).

Am I wrong somewhere?


That's the drive side verification of 80c cable check, so if the
condition triggers we downgrade 80c or unknown to 40c.  Cable detection
on nvidia PATA is a disaster.  You're supposed to do some ACPI dancing
and drive side detection is completely bogus.  Eeeek

Alan, did you have a chance to test the ACPI cable detection?  It just
didn't work when I tried it.  It always returned 80c on my machine.


On a whim I started poking around in the disassembled ACPI DSDT code for
my Asus A8N-SLI Deluxe board, which uses one of these chipsets. The
original idea was that the _STM/_GTM trick on these chipsets would let
us determine which modes to use, based on which of the requested modes
the firmware actually sets up. Unfortunately, unless I'm missing
something in the AML (which is possible), no validation seems to be done
on the settings passed in. They appear to simply get programmed into the
controller when _STM is called and read back on _GTM. The only
complication is some logic in the _PTS (Prepare To Sleep) method, which
stores the current settings into some variables. In _STM, if a flag was
set by the _PTS method, the previous settings for all registers are
restored first, before the requested values are written into the
correct places.


So in this case, obviously the approach used by pata_acpi, etc. won't 
work for cable detection. Whatever magic register on the chipset 
contains the cable detect value, the AML doesn't seem to be accessing 
it. The ACPI spec doesn't really give any guarantee that the "try STM on 
all possible modes" trick will work either, since there seems to be no 
mention of the AML being required to validate the mode and the STM 
function has no return value to indicate failure.
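
To illustrate why the probing trick can't work against firmware like this, here is a toy model in C. Everything in it is hypothetical (fake_channel, fake_stm, fake_gtm are stand-ins, not ACPI or libata calls); it just assumes, as the AML on my board seems to do, that the set method stores whatever it is given and the get method reads it back.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of one channel: the firmware keeps whatever mode it is told. */
struct fake_channel {
    int udma_mode;
};

/* Stand-in for _STM: no validation and no return value. */
static void fake_stm(struct fake_channel *ch, int udma_mode)
{
    ch->udma_mode = udma_mode;
}

/* Stand-in for _GTM: reads back the programmed mode. */
static int fake_gtm(const struct fake_channel *ch)
{
    return ch->udma_mode;
}

/*
 * "Try every mode" probe: walk down from the fastest mode and accept the
 * first one that reads back unchanged.
 */
int probe_highest_accepted_mode(struct fake_channel *ch, int max_mode)
{
    assert(ch != NULL);
    for (int mode = max_mode; mode >= 0; mode--) {
        fake_stm(ch, mode);
        if (fake_gtm(ch) == mode)
            return mode;
    }
    return -1;
}
```

Against this model the probe always accepts the very first mode it tries, so it would report the fastest mode (and hence an 80-wire cable) no matter what is actually plugged in.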


I guess this means that what we have to do is trust that the BIOS set up 
a reasonable mode and base the cable detect on that (either by reading 
back the boot-up controller registers, or by calling GTM). I imagine 
this is what the Windows default IDE driver is doing (just using the 
boot-up mode and feeding it back using GTM/STM on suspend/resume cycles).


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/

Re: IDE/ATA: Intel i865-based mainboard, CDROM not detected

2007-05-21 Thread Robert Hancock


Jonathan Woithe wrote:

I've just tried a quick test after enabling most of the PATA drivers under
the libata section including the Jmicron driver (basically everything except
those labelled "highly experimental").  As far as I can tell the
CDROM/DVDROM is still not detected even with all these built into the
kernel.  Maybe I do need one of those "highly experimental" drivers.


Can you post the entire lspci -v for this board?



Also, it's unrelated to this problem, but you should check the BIOS 
settings for the SATA controller - you really want to get the controller 
into AHCI mode for best performance.


I've often wondered how the BIOS descriptions correlate with the modes the
controller ends up in.  I've always gone for things like "enhanced" or
"SATA" or "native" (the exact string of course being dependent on the BIOS
writer's mood on the day).  This seems to work out OK in practice.  How
can you tell from the Linux boot messages that the controller is in AHCI
mode - is it as simple as looking for AHCI driver messages?  In this case
the 


  scsi0 : ata_piix
  scsi1 : ata_piix

indicate that things are suboptimal I assume.


Right, you should see that showing up as ahci.
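
If you want to check that mechanically, something like the following would do. It is only a sketch that assumes the "scsiN : driver" line format shown in the boot messages above; the function name is made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Return true if a boot-log line of the form "scsiN : <driver>" names
 * the ahci driver. Lines in any other format yield false.
 */
bool scsi_host_line_is_ahci(const char *line)
{
    int host;
    char driver[32];

    assert(line != NULL);
    if (sscanf(line, " scsi%d : %31s", &host, driver) != 2)
        return false;
    return strcmp(driver, "ahci") == 0;
}
```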

--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/


Re: IDE/ATA: Intel i865-based mainboard, CDROM not detected

2007-05-21 Thread Robert Hancock


Jonathan Woithe wrote:

A colleague of mine has an Intel mainboard with the i865 chipset onboard
(DQ965).  All kernels up to and including 2.6.22-rc2 do not detect the IDE
CDROM/DVDROM when booting.  The SATA hard drive is found without any
problems.

Relevant parts from lspci:

  00:1f.2 0101: 8086:2820 (rev 02)
  00:1f.2 IDE interface: Intel Corporation 82801H (ICH8 Family) 4 port SATA
  IDE Controller (rev 02) (prog-if 8f [Master SecP SecO PriP PriO])
        Subsystem: Intel Corporation Unknown device 514d
        Flags: bus master, 66MHz, medium devsel, latency 0, IRQ 19

  00:1f.5 0101: 8086:2825 (rev 02)
  00:1f.5 IDE interface: Intel Corporation 82801H (ICH8 Family) 2 port SATA
  IDE Controller (rev 02) (prog-if 85 [Master SecO PriO])
        Subsystem: Intel Corporation Unknown device 514d
        Flags: bus master, 66MHz, medium devsel, latency 0, IRQ 19

What's interesting here is that 00:1f.2 and 00:1f.5 are both identified as
"n port SATA" controllers even though one of them (I suspect 00:1f.5) is a
PATA controller.  This may just be a typo in lspci's database though.

Boot messages:

  ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
  Probing IDE interface ide0...
  Probing IDE interface ide1...
  :
  ata_piix :00:1f.2: version 2.11
  ata_piix :00:1f.2: MAP [ P0 P2 P1 P3 ]
  ACPI: PCI Interrupt :00:1f.2[A] -> GSI 19 (level, low) -> IRQ 19
  PCI: Setting latency timer of device :00:1f.2 to 64
  scsi0 : ata_piix
  scsi1 : ata_piix
  ata1: SATA max UDMA/133 cmd 0x00012138 ctl 0x00012156 bmdma 0x00012110 irq 0
  ata2: SATA max UDMA/133 cmd 0x00012130 ctl 0x00012152 bmdma 0x00012118 irq 0
  ata1.00: ata_hpa_resize 1: sectors = 488397168, hpa_sectors = 488397168
  ata1.00: ATA-7: WDC WD2500AAJS-00RYA0, 12.01B01, max UDMA/133
  ata1.00: 488397168 sectors, multi 16: LBA48 NCQ (depth 0/32)
  ata1.00: ata_hpa_resize 1: sectors = 488397168, hpa_sectors = 488397168
  ata1.00: configured for UDMA/133
  ATA: abnormal status 0x7F on port 0x00012137
  scsi 0:0:0:0: Direct-Access ATA  WDC WD2500AAJS-0 12.0 PQ: 0 ANSI: 5
  sd 0:0:0:0: [sda] 488397168 512-byte hardware sectors (250059 MB)
  sd 0:0:0:0: [sda] Write Protect is off
  sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
  sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support 
DPO or FUA
  sd 0:0:0:0: [sda] 488397168 512-byte hardware sectors (250059 MB)
  sd 0:0:0:0: [sda] Write Protect is off
  sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
  sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support 
DPO or FUA
  sda: sda1 sda2
  sd 0:0:0:0: [sda] Attached SCSI disk
  ata_piix :00:1f.5: MAP [ P0 P2 P1 P3 ]

Here the HDD is clearly detected while the CDROM/DVDROM (attached to ide0)
isn't.

libata is compiled into the kernel as is the non-libata PATA driver.
In the libata configuration, only SATA_AHCI, ATA_PIIX and ATA_GENERIC are
defined.  For the non-libata side of things most options are selected
including BLK_DEV_IDE, BLK_DEV_IDECD, IDE_GENERIC, BLK_DEV_IDEPCI, 
BLK_DEV_GENERIC, BLK_DEV_IDEDMA_PCI and BLK_DEV_PIIX.


Does anyone have any ideas as to why there is a problem detecting the PATA
(IDE) CDROM/DVDROM in this machine?  Further information/testing can be
provided if requested.


A lot of newer Intel boards have the IDE interface provided by an 
external JMicron, etc. chip so you may need to enable that driver for 
things to work.


Also, it's unrelated to this problem, but you should check the BIOS 
settings for the SATA controller - you really want to get the controller 
into AHCI mode for best performance.


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/

Re: Add Seagate STT20000A to DMA blacklist.

2007-05-21 Thread Robert Hancock


Dave Jones wrote:

On Mon, May 21, 2007 at 05:15:51PM +0100, Alan Cox wrote:
  On Mon, 21 May 2007 10:50:42 -0400
  Dave Jones [EMAIL PROTECTED] wrote:
  
   http://bugzilla.kernel.org/show_bug.cgi?id=1044

   has been open for _four_ years with a patch available.
   Here's a rediffed version of the same.
  
  Please update libata as well when you update the blacklists.


Sure, point me at the table(s) ?

Dave



ata_device_blacklist in drivers/ata/libata-core.c

--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/

Re: [RFC PATCH] PCI MMCONFIG: add validation against ACPI motherboard resources

2007-05-21 Thread Robert Hancock


Jesse Barnes wrote:

What happens if you take out the chipset register detection, does
the MCFG table give you the same result? Wonder if they're doing
something funny with start/end bus values or something in their
table. There's some code in my patch that prints out the important
data from the MCFG table, can you tell me what that shows with the
chipset detection taken out?


I can't see how any MCFG based accesses could work on this box, but I
don't know why.  According to the boot log (with our code patched in
but disabled after checking the ACPI reserved status), the space is fine:

...
ACPI: (supports S0 S3 S4 S5)
ACPI: Using IOAPIC for interrupt routing
pciexbar lo: 0xf003
pciexbar hi: 0x
Enabled MCFG space at 0xf000, size 134217728
PCI: Found Intel Corporation G965 Express Memory Controller Hub with MMCONFIG 
support.
PCI: MCFG configuration 0: base f000 segment 0 buses 0 - 127
PCI: MCFG area at f000 reserved in ACPI motherboard resources
PCI: Not using MMCONFIG. <-- due to the 'goto reject' after
                             if (is_acpi_reserved) { ... }
PM: Adding info for acpi:acpi_system:00
PM: Adding info for acpi:button_power:00
...

Same thing happens if I disable the chipset specific code and just use
the ACPI stuff you added.

If I leave it enabled, several config cycles work fine, but the box
eventually hangs after probing 24 devices or so.  I don't see anything
else mapped into this space, and the MTRRs seem ok, so either there's
something hidden in this memory range or there's another chipset register
that needs poking to fully enable this space properly.

Sysrq doesn't seem to work, and I don't see any events in my machine log,
so figuring out exactly why it's hanging is a bit difficult.

Any ideas on what to try next?  I'll see if I can get some more details
from our BIOS folks and do yet another pass over the documentation to see
if there's something I'm missing.


Can you find out which config access (bus, device, function, address) is 
the one that hangs the box? I assume that either the corresponding 
address in the MCFG table is problematic (i.e. has something else mapped 
over it), or maybe that device just doesn't like being probed with MCFG 
somehow.
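
For working back from a faulting address to the device (or the other way round), this is how the standard PCI Express memory-mapped config layout maps (bus, device, function, register) into the window: 1 MB per bus, 32 KB per device, 4 KB per function. The sketch below is plain C, not the kernel's mmconfig code, and the function names are made up. The base is left as a parameter since it has to come from the MCFG table or the chipset register.

```c
#include <assert.h>
#include <stdint.h>

/* Address of a config register inside the MMCONFIG window. */
uint64_t mmconfig_address(uint64_t base, unsigned bus, unsigned dev,
                          unsigned fn, unsigned reg)
{
    assert(bus < 256 && dev < 32 && fn < 8 && reg < 4096);
    return base + ((uint64_t)bus << 20) + ((uint64_t)dev << 15) +
           ((uint64_t)fn << 12) + reg;
}

/* Size of the window needed to cover buses start_bus..end_bus. */
uint64_t mmconfig_window_size(unsigned start_bus, unsigned end_bus)
{
    assert(start_bus <= end_bus && end_bus < 256);
    return (uint64_t)(end_bus - start_bus + 1) << 20;
}
```

As a sanity check against the log above, buses 0 - 127 give 128 MB, which matches the reported size of 134217728.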


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/

Re: something strange in libata-core.c for kernel 2.6.22-rc3

2007-05-20 Thread Robert Hancock


Tejun Heo wrote:

[EMAIL PROTECTED] wrote:

Maybe I am wrong, but if you are detecting a 40-wire cable in order to
limit the drive to DMA/33, why does the check also include 80-wire
cables, limiting them to DMA/33 too?

With this patch my nvidia4 IDE controller detects the cables correctly
and configures DMA/100 for my HD and DMA/33 for my DVD (the first
uses an 80-wire cable, the second a 40-wire cable).

Am I wrong somewhere?


That's the drive side verification of 80c cable check, so if the
condition triggers we downgrade 80c or unknown to 40c.  Cable detection
on nvidia PATA is a disaster.  You're supposed to do some ACPI dancing
and drive side detection is completely bogus.  Eeeek

Alan, did you have a chance to test the ACPI cable detection?  It just
didn't work when I tried it.  It always returned 80c on my machine.


Hopefully when we get that support in and working it will solve a lot of 
these issues (and others, like the laptops that have a short 40-wire 
cable that is good for high UDMA speeds which we presently have to 
hard-code detection for specific models).


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/

Re: sd_resume redundant? [was: [PATCH] libata: implement ata_wait_after_reset()]

2007-05-20 Thread Robert Hancock


Randy Dunlap wrote:

On Sun, 20 May 2007 11:45:03 -0600 Robert Hancock wrote:


Indan Zupancic wrote:

Everything seems to work fine without sd_resume(), so why is it needed?

Because not all disks spin up without being told to do so and like it or
not spinning disks up on resume is the default behavior.  As I wrote in
the other reply, it would be worthwhile to make it configurable.

Not even after they receive a read command? Ugh.

ATA disks are supposed to spin up, yes. SCSI disks require a command to
tell them to spin up if they're in the "stopped" state.


Good info, but linux-ide was dropped.  Is that due to lack of
reply-to-all or is it a newsgroup thing or what?


That would be a newsgroup thing. It seems that sometimes CCs get dropped 
when the posts are forwarded to fa.linux.kernel where I normally read them.


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/

Re: sd_resume redundant? [was: [PATCH] libata: implement ata_wait_after_reset()]

2007-05-20 Thread Robert Hancock


Indan Zupancic wrote:

Everything seems to work fine without sd_resume(), so why is it needed?

Because not all disks spin up without being told to do so and like it or
not spinning disks up on resume is the default behavior.  As I wrote in
the other reply, it would be worthwhile to make it configurable.


Not even after they receive a read command? Ugh.


ATA disks are supposed to spin up, yes. SCSI disks require a command to 
tell them to spin up if they're in the "stopped" state.
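
For reference, the command in question is START STOP UNIT (opcode 0x1B, a 6-byte CDB with the START bit in byte 4). A minimal sketch of building it in plain C follows; this is not the sd driver's code, and the names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SCSI_OP_START_STOP_UNIT 0x1B
#define SSU_CDB_LEN             6

/* Fill in a START STOP UNIT CDB that asks the device to spin up. */
void build_start_unit_cdb(uint8_t cdb[SSU_CDB_LEN])
{
    assert(cdb != NULL);
    memset(cdb, 0, SSU_CDB_LEN);
    cdb[0] = SCSI_OP_START_STOP_UNIT;
    cdb[4] = 0x01;  /* START = 1: spin up; 0 would spin down */
}
```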


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/

Re: something strange in libata-core.c for kernel 2.6.22-rc3

2007-05-20 Thread Robert Hancock


[EMAIL PROTECTED] wrote:


Maybe I am wrong, but if you are detecting a 40-wire cable in order to
limit the drive to DMA/33, why does the check also include 80-wire
cables, limiting them to DMA/33 too?


With this patch my nvidia4 IDE controller detects the cables correctly
and configures DMA/100 for my HD and DMA/33 for my DVD (the first
uses an 80-wire cable, the second a 40-wire cable).


Am I wrong somewhere?

--- libata-core.c.orig  2007-05-20 14:31:25.0 +0200
+++ libata-core.c   2007-05-20 14:34:01.0 +0200
@@ -3901,8 +3901,7 @@
                 /* UDMA/44 or higher would be available */
                 if((ap->cbl == ATA_CBL_PATA40) ||
                     (ata_drive_40wire(dev->id) &&
-                     (ap->cbl == ATA_CBL_PATA_UNK ||
-                      ap->cbl == ATA_CBL_PATA80))) {
+                     (ap->cbl == ATA_CBL_PATA_UNK))) {
                         ata_dev_printk(dev, KERN_WARNING,
                                  "limited to UDMA/33 due to 40-wire cable\n");
                         xfer_mask &= ~(0xF8 << ATA_SHIFT_UDMA);


It only does that for ATA_CBL_PATA80 if ata_drive_40wire returns true, 
which means that the drive is detecting a 40-wire cable on its side.


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/

Re: sd_resume redundant? [was: [PATCH] libata: implement ata_wait_after_reset()]

2007-05-20 Thread Robert Hancock


Randy Dunlap wrote:

On Sun, 20 May 2007 11:45:03 -0600 Robert Hancock wrote:


Indan Zupancic wrote:

Everything seems to work fine without sd_resume(), so why is it needed?

Because not all disks spin up without being told to do so and like it or
not spinning disks up on resume is the default behavior.  As I wrote in
the other reply, it would be worthwhile to make it configurable.

Not even after they receive a read command? Ugh.
ATA disks are supposed to spin up, yes. SCSI disks require a command to 
tell them to spin up if they're in the stopped state.


Good info, but linux-ide was dropped.  Is that due to lack of
reply-to-all or is it a newsgroup thing or what?


That would be a newsgroup thing. It seems that sometimes CCs get dropped 
when the posts are forwarded to fa.linux.kernel where I normally read them.


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/

Re: something strange in libata-core.c for kernel 2.6.22-rc3

2007-05-20 Thread Robert Hancock


Tejun Heo wrote:

[EMAIL PROTECTED] wrote:

Maybe I am wrong, but if you are detecting a 40-wire cable to limit it to
DMA/33, why does the check also include 80-wire cables, configuring them
for DMA/33 too?

With this patch my nvidia4 IDE controller detects the cables correctly and
configures DMA/100 for my HD and DMA/33 for my DVD (the first
uses an 80-wire cable, the second a 40-wire cable).

Am I wrong somewhere?


That's the drive side verification of 80c cable check, so if the
condition triggers we downgrade 80c or unknown to 40c.  Cable detection
on nvidia PATA is a disaster.  You're supposed to do some ACPI dancing
and drive side detection is completely bogus.  Eeeek

Alan, did you have a chance to test the ACPI cable detection?  It just
didn't work when I tried it.  It always returned 80c on my machine.


Hopefully when we get that support in and working it will solve a lot of 
these issues (and others, like the laptops that have a short 40-wire 
cable that is good for high UDMA speeds which we presently have to 
hard-code detection for specific models).


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/

Re: [PATCH] sata_sil: Greatly improve DMA support

2007-05-19 Thread Robert Hancock


Jeff Garzik wrote:

Since Alan expressed a desire to see Large Block Transfer (LBT) support
in pata_sil680, I thought I would re-post my patch for adding LBT support
to sata_sil.

Silicon Image's Large Block Transfer (LBT) support is a vendor-specific
DMA scatter/gather engine, which enables 64-bit DMA addresses (where
supported by platform) and eliminates the annoying 64k DMA boundary
found in legacy PCI IDE BMDMA engines.


Looks like it doesn't allow 64-bit DMA addresses, it only gets rid of 
the 64K boundary limitation.


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/

Re: [PATCH] drivers/ata: Add the SW NCQ support to sata_nv for MCP51/MCP55/MCP61

2007-05-18 Thread Robert Hancock


Kuan Luo wrote:

Thanks for your comment, see the explanation inline.
We'll apply your advice in later patch.


...

Please don't duplicate this code in the driver, this is part of libata 
core in libata-scsi.c. Add an export for these functions if you need to 
use them in the driver.

[kuan]: These functions are declared static, so I can't export them, and
I don't want to modify libata code.


If you really need these functions then they should be made non-static 
and exported. Duplicating the code is not a solution.


I'm not so sure you actually need all that, though. I suspect you can 
likely handle the deferring of commands if you detect an FPDMA data 
phase inside the qc_issue function only (like you already do in some 
cases) instead of having to mess with deferring them at the SCSI layer.


I'm still puzzling out how this stuff all works, but it looks like this 
code makes you stop sending new commands if:


-the port is in the FPDMA Data Phase (DMA Setup FIS received but the 
transfer is not complete yet) - I assume the hardware doesn't handle 
this itself, which seems rather unique
-we previously deferred a command inside of qc_issue because we were in 
the FPDMA data phase
-we previously saw dhfis_flags not equal to qc_active, or we got a 
BACKOUT interrupt (whatever exactly that means), both of which set some 
value in the back_byte
[kuan]:
-If we got a BACKOUT interrupt, it means that a command just sent by the
driver backed out. The driver should resend the command, so new commands
should be deferred.
-If dhfis_flags != qc_active, it indicates that the last command didn't
generate a device-to-host register FIS.
After sending some commands, I found that the last command sometimes has
this problem but previous commands are normal. In this case, we need to
resend the last command.
Both cases set back_byte.


The case where the command didn't generate a D2H FIS should likely be 
investigated further, otherwise we don't necessarily know that this 
workaround will work in all cases?


This code seems a bit odd. Isn't this tossing out a bunch of potential 
error status, etc?

[kuan]:
If there are commands in the queue, the driver can send a new command
only after receiving the dhfis interrupt of the previous command and before
receiving any dmasetup FIS interrupt.
At this point, I do the last check before sending the command.


But the D2H FIS can contain an error indication, correct? If that 
happens here it won't detect this. In this situation error handling 
should be triggered.



+	done_mask = pp->qc_active ^ sactive;
+	if (unlikely(done_mask & sactive)) {
+		ata_port_printk(ap, KERN_ERR, "illegal qc_active transition "
+				"(%08x->%08x)\n", ap->qc_active, sactive);
+		return -EINVAL;
+	}

Shouldn't this trigger error handling if it happens instead of just 
printing an error?

[kuan]:
I think the error handling can be triggered by timeout. In fact, 
this case should seldom happen.


There have been reports of some drives with bad NCQ implementations that 
return completion status for commands that were not issued. If we detect 
this case we should raise an HSM violation which will disable NCQ on 
this drive if it happens repeatedly. See the code in ahci.c in 
ahci_host_intr.
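
For reference, the transition check being discussed boils down to something like the sketch below. This is plain userspace C with a made-up function name, not the sata_nv or ahci code; a real driver would raise error handling at the point where this returns -1 rather than just reporting it.

```c
#include <stdint.h>

/*
 * qc_active: tags the driver believes are outstanding.
 * sactive:   tags the device still reports as active.
 *
 * Returns 0 and stores the mask of newly completed tags in *done, or -1
 * if the device reports a tag as active that the driver never had
 * outstanding (an illegal transition).
 */
static int check_sactive(uint32_t qc_active, uint32_t sactive, uint32_t *done)
{
	uint32_t done_mask = qc_active ^ sactive;

	if (done_mask & sactive)
		return -1;	/* device claims a tag we do not know about */

	*done = done_mask;
	return 0;
}
```

Tags that were outstanding and are no longer set in SActive show up in the done mask; any bit that is set in SActive but not in qc_active is the case that should not be silently ignored.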


This comment still applies:


Additional/general comments:

Think you need some code to handle suspend and resume (re-enable SATA 
MMIO space, etc.)


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/

Re: [PATCH] drivers/ata: Add the SW NCQ support to sata_nv for MCP51/MCP55/MCP61

2007-05-17 Thread Robert Hancock


Zoltan Boszormenyi wrote:

Hi,

thanks for publishing this.


Add the Software NCQ support to sata_nv.c for MCP51/MCP55/MCP61 SATA
controller.

This patch is based on the sata_nv.c file from kernel 2.6.22-rc1

See attachment for the patch.

Signed-off-by: Kuan Luo <[EMAIL PROTECTED]>
Signed-off-by: Peer Chen <[EMAIL PROTECTED]>
==
See attached file.
==
  


However, I saw this in the patch:

+   /* determine if physical DMA addr spans 64K boundary.
+* Note h/w doesn't support 64-bit, so we unconditionally
+* truncate dma_addr_t to u32.
+*/
+   addr = (u32) sg_dma_address(sg);

Does it mean that I can't upgrade my machine to 4 GB or more
without losing NCQ or risking data corruption?
Can the code be made IOMMU-aware?


That shouldn't be a problem, libata default DMA mask is 32 bits (which 
isn't overridden with this controller) and so the block layer will 
bounce any data being read/written above that point with IOMMU or 
swiotlb. The comment is a bit unnecessarily scary.
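
To illustrate why the cast is harmless, here is a small sketch in plain C. The function names are hypothetical and this is not the driver code: the first helper states the assumption (with a 32-bit DMA mask every address handed to the driver fits in 32 bits), the second shows the usual way a segment is cut at a 64K boundary.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if a bus address can be stored in a u32 without losing bits. */
static bool dma_addr_fits_u32(uint64_t addr)
{
	return (addr >> 32) == 0;
}

/*
 * Number of bytes of the segment [addr, addr + len) that can be
 * transferred before the next 64K boundary is reached.
 */
static uint32_t bytes_before_64k(uint32_t addr, uint32_t len)
{
	uint32_t offset = addr & 0xffffu;	/* position inside the 64K window */
	uint32_t room = 0x10000u - offset;	/* bytes left in this window */

	return len > room ? room : len;
}
```

A segment starting at 0xfff0 with length 0x20 would be split into a 0x10 byte piece and a second piece starting at 0x10000.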


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/

Re: [PATCH] drivers/ata: Add the SW NCQ support to sata_nv for MCP51/MCP55/MCP61

2007-05-17 Thread Robert Hancock


Peer Chen wrote:

 Add the Software NCQ support to sata_nv.c for MCP51/MCP55/MCP61 SATA
controller.

This patch is based on the sata_nv.c file from kernel 2.6.22-rc1

See attachment for the patch.

Signed-off-by: Kuan Luo <[EMAIL PROTECTED]>
Signed-off-by: Peer Chen <[EMAIL PROTECTED]>


Good to finally see this come out. I've pasted the code below (indented) 
in order to make some comments:


	--- linux-2.6.22-rc1/drivers/ata/sata_nv.c.orig	2007-05-17 14:48:26.0 -0400
	+++ linux-2.6.22-rc1/drivers/ata/sata_nv.c	2007-05-17 17:07:28.0 -0400

@@ -46,6 +46,8 @@
 #include <linux/device.h>
 #include <scsi/scsi_host.h>
 #include <scsi/scsi_device.h>
+#include <scsi/scsi.h>
+#include <scsi/scsi_cmnd.h>
 #include <linux/libata.h>

 #define DRV_NAME   "sata_nv"
@@ -169,6 +171,36 @@ enum {
NV_ADMA_PORT_REGISTER_MODE  = (1 << 0),
NV_ADMA_ATAPI_SETUP_COMPLETE= (1 << 1),

+   /* MCP55 reg offset */
+   NV_CTL_MCP55= 0x400,
+   NV_INT_STATUS_MCP55 = 0x440,
+   NV_INT_ENABLE_MCP55 = 0x444,
+   NV_NCQ_REG_MCP55= 0x448,
+   NV_CH1_SACTIVE_MCP55= 0x0C,
+   
+   /* MCP55 */
+   NV_INT_ALL_MCP55= 0x,
+   NV_INT_PORT_SHIFT_MCP55 = 16,   /* each port occupies 16 bits */
+   NV_INT_MASK_MCP55   = NV_INT_ALL_MCP55 & 0xfffd,
+   
+   /* NCQ ENABLE BITS*/
+   NV_CTL_PRI_SWNCQ= 0x02,
+   NV_CTL_SEC_SWNCQ= 0x04,
+   
+   /* MCP55 status bits*/
+   NV_INT_DEV_MCP55= 0x01,
+   NV_INT_PM_MCP55 = 0x02,
+   NV_INT_ADDED_MCP55  = 0x04,
+   NV_INT_REMOVED_MCP55= 0x08,
+   
+   NV_INT_BACKOUT_MCP55= 0x10,
+   NV_INT_SDBFIS_MCP55 = 0x20,
+   NV_INT_DHREGFIS_MCP55   = 0x40,
+   NV_INT_DMASETUP_MCP55   = 0x80,
+   
+   NV_INT_HOTPLUG_MCP55= (NV_INT_ADDED_MCP55 |
+   NV_INT_REMOVED_MCP55),
+
 };

 /* ADMA Physical Region Descriptor - one SG segment */
@@ -264,13 +296,118 @@ static void nv_adma_host_stop(struct ata
 static void nv_adma_post_internal_cmd(struct ata_queued_cmd *qc);
 static void nv_adma_tf_read(struct ata_port *ap, struct ata_taskfile *tf);

+static void ncq_error_handler(struct ata_port *ap);
+static void nv_mcp55_thaw(struct ata_port *ap);
+static void nv_mcp55_freeze(struct ata_port *ap);
+static void ncq_host_init(struct ata_host *host);
+static void nv_bmdma_stop(struct ata_port *ap);
+static int nv_std_qc_defer(struct ata_port *ap);
+static int  nv_port_start(struct ata_port *ap);
+static void nv_port_stop(struct ata_port *ap);
+static void ncq_clear(struct ata_port *ap);
+static void nv_qc_prep(struct ata_queued_cmd *qc);
+static void nv_fill_sg(struct ata_queued_cmd *qc);
+static void ncq_sactive_start (struct ata_queued_cmd *qc);
+static u32 ncq_sactive_value (struct ata_port *ap);
+static unsigned int nv_qc_issue_prot(struct ata_queued_cmd *qc);
+static u32 ncq_tag_value(struct ata_port *ap);
+static int nv_ncqintr_sdbfis(struct ata_port *ap);
+static int nv_ncqintr_dmasetupfis(struct ata_port *ap);
+static void ncq_clear_singlefis(struct ata_port *ap, u32 val);
+static u32 ncq_ownfisintr_value (struct ata_port *ap);
+void ncq_hotplug(struct ata_port *ap, u32 fis);
+static irqreturn_t nv_mcp55_interrupt(int irq, void *dev_instance);
+static int ncq_interrupt(struct ata_port *ap, u32 fis);
+static int nv_scsi_queuecmd(struct scsi_cmnd *cmd,
+   void (*done)(struct scsi_cmnd *));

These functions should use "mcp51" or "swncq" or something in the name 
instead of "ncq", since the latter implies it may be related to ADMA as 
well.


+
+#undef NCQ_DEBUG
+#undef NCQ_VERBOSE_DEBUG
+#ifdef NCQ_DEBUG
+#define NPRINTK(fmt, args...) printk(KERN_ERR "%s: " fmt, __FUNCTION__, ## args)
+#ifdef NCQ_VERBOSE_DEBUG
+#define NVPRINTK(fmt, args...) printk(KERN_ERR "%s: " fmt, __FUNCTION__, ## args)
+#else
+#define NVPRINTK(fmt, args...) do { } while(0)
+#endif /* NCQ_VERBOSE_DEBUG */
+#else
+#define NPRINTK(fmt, args...) do { } while(0)
+#define NVPRINTK(fmt, args...) do { } while(0)
+#endif

We don't need these private helper macros, just use the ones that libata 
defines.


+
+/*cmd_stop

Re: [announce] Intel announces the PowerTOP utility for Linux

2007-05-14 Thread Robert Hancock

Looks like the radeon driver has the same problem as the i915 driver 
mentioned on the known problems page - I get 60 wakeups/sec from it on 
my Compaq X1000 laptop (Radeon 9000 graphics) while in X, which 
essentially prevents entry into C3.


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/

[PATCH] libata: add human-readable error value decoding (v2)

2007-05-14 Thread Robert Hancock


This adds human-readable decoding of the ATA status and error registers (similar
to what drivers/ide does) as well as the SATA Serror register to libata error
handling output. This prevents the need to pore through standards documents
to figure out the meaning of the bits in these registers when looking at error
reports. Some bits that drivers/ide decoded are not decoded here, since the bits
are either command-dependent or obsolete, and properly parsing them would add
too much complexity.

This version reduces the length of the SError parsed output strings relative to
the previous version of this patch.

Signed-off-by: Robert Hancock <[EMAIL PROTECTED]>

--- linux-2.6.21.1/drivers/ata/libata-eh.c  2007-04-27 15:49:26.0 -0600
+++ linux-2.6.21.1edit/drivers/ata/libata-eh.c  2007-05-14 17:38:35.0 -0600
@@ -1523,6 +1523,27 @@ static void ata_eh_report(struct ata_por
ata_port_printk(ap, KERN_ERR, "(%s)\n", desc);
}

+   if (ehc->i.serror)
+   ata_port_printk(ap, KERN_ERR,
+ "SError: {%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s}\n",
+ ehc->i.serror & SERR_DATA_RECOVERED ? "RecovData " : "",
+ ehc->i.serror & SERR_COMM_RECOVERED ? "RecovComm " : "",
+ ehc->i.serror & SERR_DATA ? "UnrecovData " : "",
+ ehc->i.serror & SERR_PERSISTENT ? "Persist " : "",
+ ehc->i.serror & SERR_PROTOCOL ? "Proto " : "",
+ ehc->i.serror & SERR_INTERNAL ? "HostInt " : "",
+ ehc->i.serror & SERR_PHYRDY_CHG ? "PHYRdyChg " : "",
+ ehc->i.serror & SERR_PHY_INT_ERR ? "PHYInt " : "",
+ ehc->i.serror & SERR_COMM_WAKE ? "CommWake " : "",
+ ehc->i.serror & SERR_10B_8B_ERR ? "10B8B " : "",
+ ehc->i.serror & SERR_DISPARITY ? "Dispar " : "",
+ ehc->i.serror & SERR_CRC ? "BadCRC " : "",
+ ehc->i.serror & SERR_HANDSHAKE ? "Handshk " : "",
+ ehc->i.serror & SERR_LINK_SEQ_ERR ? "LinkSeq " : "",
+ ehc->i.serror & SERR_TRANS_ST_ERROR ? "TrStaTrns " : "",
+ ehc->i.serror & SERR_UNRECOG_FIS ? "UnrecFIS " : "",
+ ehc->i.serror & SERR_DEV_XCHG ? "DevExch " : "" );
+
for (tag = 0; tag < ATA_MAX_QUEUE; tag++) {
static const char *dma_str[] = {
[DMA_BIDIRECTIONAL] = "bidi",
@@ -1552,6 +1573,29 @@ static void ata_eh_report(struct ata_por
res->hob_feature, res->hob_nsect,
res->hob_lbal, res->hob_lbam, res->hob_lbah,
res->device, qc->err_mask, ata_err_string(qc->err_mask));
+   
+   if (res->command & (ATA_BUSY | ATA_DRDY | ATA_DF | ATA_DRQ |
+   ATA_ERR) ) {
+   if (res->command & ATA_BUSY)
+   ata_dev_printk(qc->dev, KERN_ERR,
+ "status: {Busy}\n" );
+   else
+   ata_dev_printk(qc->dev, KERN_ERR,
+ "status: {%s%s%s%s}\n",
+ res->command & ATA_DRDY ? "DRDY " : "",
+ res->command & ATA_DF ? "DF " : "",
+ res->command & ATA_DRQ ? "DRQ " : "",
+ res->command & ATA_ERR ? "ERR " : "" );
+   }
+   
+   if (cmd->command != ATA_CMD_PACKET &&
+   (res->feature & (ATA_ICRC | ATA_UNC | ATA_IDNF | ATA_ABORTED)))
+   ata_dev_printk(qc->dev, KERN_ERR,
+ "error: {%s%s%s%s}\n",
+ res->feature & ATA_ICRC ? "ICRC " : "",
+ res->feature & ATA_UNC ? "UNC " : "",
+ res->feature & ATA_IDNF ? "IDNF " : "",
+ res->feature & ATA_ABORTED ? "ABRT " : "" );
}
}

--- linux-2.6.21.1/include/linux/ata.h  2007-04-27 15:49:26.0 -0600
+++ linux-2.6.21.1edit/include/linux/ata.h  2007-05-09 19:25:54.0 -0600
@@ -223,6 +223,15 @@ enum {
 	SERR_PROTOCOL		= (1 << 10), /* protocol violation */
 	SERR_INTERNAL		= (1 << 11), /* host internal error */
 	SERR_PHYRDY_CHG		= (1 << 16), /* PHY RDY changed */
+	SERR_PHY_INT_ERR	= (1 << 17), /* PHY internal error */
+	SERR_COMM_WAKE		= (1 << 18), /* Comm wake */
+	SERR_10B_8B_ERR		= (1 << 19), /* 10b to 8b decode error */
+	SERR_DISPARITY		= (1 << 20), /* Disparity */
+	SERR_CRC		= (1 << 21), /* CRC error */
+	SERR_HANDSHAKE		= (1 << 22), /* Handshake error */
+	SERR_LINK_SEQ_ERR	= (1 << 23), /* Link sequence error */
+	SERR_TRANS_ST_ERROR	= (1 << 24), /* Transport state transition error */
+	SERR_UNRECOG_FIS	= (1 << 25), /* Unrecognized FIS */
 	SERR_DEV_XCHG		= (1 << 26), /* device exchanged */

 	/* struct ata_taskfile flags */
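
For anyone who wants to decode an SError value by hand outside the kernel, a table-driven userspace sketch along these lines should do. It is only an illustration, not part of the patch: the struct and function names are made up, and it covers only the bits whose positions appear in the ata.h hunk above, using the short names from the libata-eh.c hunk.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

struct serr_bit {
	uint32_t bit;
	const char *name;
};

/* Bit positions as listed in the ata.h hunk above; names as printed by the patch. */
static const struct serr_bit serr_bits[] = {
	{ 1u << 10, "Proto" },     { 1u << 11, "HostInt" },
	{ 1u << 16, "PHYRdyChg" }, { 1u << 17, "PHYInt" },
	{ 1u << 18, "CommWake" },  { 1u << 19, "10B8B" },
	{ 1u << 20, "Dispar" },    { 1u << 21, "BadCRC" },
	{ 1u << 22, "Handshk" },   { 1u << 23, "LinkSeq" },
	{ 1u << 24, "TrStaTrns" }, { 1u << 25, "UnrecFIS" },
	{ 1u << 26, "DevExch" },
};

/* Write the names of all set bits into buf, each followed by a space. */
static void decode_serror(uint32_t serror, char *buf, size_t len)
{
	size_t used = 0;
	size_t i;

	if (len == 0)
		return;
	buf[0] = '\0';

	for (i = 0; i < sizeof(serr_bits) / sizeof(serr_bits[0]); i++) {
		int n;

		if (!(serror & serr_bits[i].bit))
			continue;
		n = snprintf(buf + used, len - used, "%s ", serr_bits[i].name);
		if (n < 0 || (size_t)n >= len - used)
			break;	/* out of space: output is truncated here */
		used += (size_t)n;
	}
}
```

The kernel patch uses one long format string with a conditional per bit instead of a table, which avoids needing a scratch buffer in the error handler.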

Re: [PATCH] Workaround for a PCI restoring bug

2007-05-12 Thread Robert Hancock


Lukas Hejtmanek wrote:

Hello,

as of 2.6.21-git16, the bugs related to restoring PCI state are still present. The
PCI save function reads only -1 from the PCI config space, and restoring that
totally messes up most PCI devices. The attached patch is only a workaround
until a proper fix is found and included. Could it be included in mainline
for now?


It's not really a fix; that value might legitimately be supposed to be 
in the config space. It sounds like some driver is disabling the device 
before saving its state, or something similar.
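
To illustrate why a bare "-1" check is a weak signal, here is a minimal user-space sketch. It assumes only two facts: reads from a device that is not responding return all ones, and 0xFFFF is not a valid PCI vendor ID. Any other dword may legitimately contain 0xFFFFFFFF. The function names are hypothetical and are not kernel API.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Dword 0 of PCI config space holds the vendor ID in its low 16 bits.
 * 0xFFFF is never a valid vendor ID, so it indicates that the read
 * did not reach a responding device.
 */
static bool pci_dword0_device_present(uint32_t dword0)
{
    return (dword0 & 0xFFFFu) != 0xFFFFu;
}

/*
 * Hypothetical helper: judge a saved config snapshot by its vendor ID
 * only, instead of rejecting every dword that happens to be all ones.
 */
static bool snapshot_looks_valid(const uint32_t *saved, size_t ndwords)
{
    return ndwords > 0 && pci_dword0_device_present(saved[0]);
}
```

A snapshot whose first dword is 0xFFFFFFFF is rejected, while one with a real vendor ID is accepted even if a later dword is all ones. This only detects the symptom; it does not address a driver that disables the device before its state is saved.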


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/


Re: [2.6.21.1] SATA freeze

2007-05-12 Thread Robert Hancock


Fred Moyer wrote:
This appears to be a different problem. Something is issuing 
SMART-related commands (smartd or smartctl perhaps) which the drive 
seems to be reacting strangely to. It apparently completed the command 
but never raised DRQ to request any data being transferred even though 
we expected it to. Maybe SMART is disabled on the drive and that's 
causing it to just toss these commands? CCing linux-ide in case anyone 
knows what would cause this.


Here's smartctl -a for this drive - same output for both sda and sdb. 
Smartd is currently running.  Any advice appreciated.


Previously on 2.6.15 I was seeing sdb remount as read-only under heavy 
I/O.  I have not seen that issue yet with 2.6.21 (with Robert's patch 
from May 5th for sata_nv), but those read-only remounts occurred 
infrequently, so that issue may be solved.


app2 ~ # smartctl -a /dev/sda
smartctl version 5.36 [x86_64-pc-linux-gnu] Copyright (C) 2002-6 Bruce Allen

Home page is http://smartmontools.sourceforge.net/

Device: ATA  ST3808110AS  Version: n/a
Serial number: 5LR8895K
Device type: disk
Local Time is: Sat May 12 12:05:58 2007 PDT
Device does not support SMART

Error Counter logging not supported

[GLTSD (Global Logging Target Save Disable) set. Enable Save with '-S on']
Device does not support Self Test logging



Sounds like SMART is likely disabled on that drive. You can try doing 
"smartctl -s on /dev/sda" and see if that will turn it on.


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/


Re: [2.6.21.1] SATA freeze

2007-05-12 Thread Robert Hancock


Fred Moyer wrote:
I just joined the list today so apologies if this email breaks any email 
client post threading.


I have been seeing similar errors on two different systems.  I applied 
Robert's sata_nv patch posted to the list on May 5th, and approved today 
by Jeff Garzik.  I've taken several steps to ensure that this isn't a 
faulty cable or drive issue.  This is running on an HP DL145 G2.  Here are 
my lspci, dmesg, and relevant kernel config sections:


(snip)


ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
ata1.00: cmd b0/d2:f1:00:4f:c2/00:00:00:00:00/00 tag 0 cdb 0x0 data 123392 in
         res 50/00:f1:00:4f:c2/00:00:00:00:00/00 Emask 0x202 (HSM violation)

ata1: soft resetting port
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata1.00: configured for UDMA/100
ata1: EH complete


This appears to be a different problem. Something is issuing 
SMART-related commands (smartd or smartctl perhaps) which the drive 
seems to be reacting strangely to. It apparently completed the command 
but never raised DRQ to request any data being transferred even though 
we expected it to. Maybe SMART is disabled on the drive and that's 
causing it to just toss these commands? CCing linux-ide in case anyone 
knows what would cause this.
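
As a rough illustration of the missing-DRQ condition: the first byte of the "res" line (0x50 here) is the ATA status register. The sketch below uses the standard status bit positions (BSY bit 7, DRDY bit 6, DRQ bit 3, ERR bit 0). The enum and function names are made up for this example, and the real libata host state machine checks are more involved than this.

```c
#include <stdbool.h>

/* ATA status register bits (standard positions; names are illustrative). */
enum {
    ST_BSY  = 1 << 7,   /* device busy */
    ST_DRDY = 1 << 6,   /* device ready */
    ST_DRQ  = 1 << 3,   /* data request: device wants to transfer data */
    ST_ERR  = 1 << 0    /* error */
};

/*
 * For a data-in command, a status that is not busy, reports no error,
 * and still has DRQ clear means the device "completed" the command
 * without ever asking to transfer the data we expected.
 */
static bool data_in_missing_drq(unsigned char status)
{
    return !(status & ST_BSY) && !(status & ST_ERR) && !(status & ST_DRQ);
}
```

With status 0x50 the function returns true, matching the log above; with 0x58 (the same status plus DRQ) it returns false.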


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/


Re: [2.6.21.1] SATA freeze

2007-05-12 Thread Robert Hancock


Gerhard Mack wrote:

On Wed, 9 May 2007, Robert Hancock wrote:

Gerhard Mack wrote:

On Wed, 9 May 2007, Jeff Garzik wrote:

Gerhard Mack wrote:

May  9 14:51:35 mgerhard kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x180 action 0x2 frozen
May  9 14:51:35 mgerhard kernel: ata1.00: cmd 35/00:00:80:6d:c8/00:04:09:00:00/e0 tag 0 cdb 0x0 data 524288 out
May  9 14:51:35 mgerhard kernel:          res 40/00:c8:68:65:c8/84:00:09:00:00/e0 Emask 0x4 (timeout)
May  9 14:51:42 mgerhard kernel: ata1: port is slow to respond, please be patient (Status 0xd0)

Anything I can do to figure out what's causing this?

You're showing various flags set in the SError register, which suggests you're
having SATA communication problems with the drive. A bad SATA cable or power
problems would be a strong possibility.

It really would be nice if we decoded these things more usefully for the user
(same with the regular ATA errors, like drivers/ide does), but in general
SError showing up as non-zero is a bad thing:

0x40 = "Handshake error: When set to one, this bit indicates that one or
more R_ERR handshake response was received in response to frame transmission.
Such errors may be the result of a CRC error detected by the recipient, a
disparity or 10b/8b decoding error, or other error condition leading to a
negative handshake on a transmitted frame."

0x180 = "Link Sequence Error: When set to one, this bit indicates that one
or more Link state machine error conditions was encountered since the last
time this bit was cleared. The Link Layer state machine defines the conditions
under which the link layer detects an erroneous transition."

and

"Transport state transition error: When set to one, this bit indicates that an
error has occurred in the transition from one state to another within the
Transport layer since the last time this bit was cleared."



Just out of curiosity how often is that bit cleared?


I believe that is cleared only on error handling or controller reset, so 
it just means that it happened sometime since boot or the last libata 
error recovery.
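
A table-driven decoder is one way to make SError values readable. The sketch below is a user-space illustration only: the bit positions and short labels are taken from the ata.h and libata patches quoted elsewhere in this listing (bits 10 to 26), while the struct, table, and function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct serr_name {
    uint32_t mask;
    const char *name;
};

/* SError bits 10..26 as defined in the quoted ata.h patch. */
static const struct serr_name serr_names[] = {
    { 1u << 10, "ProtocolErr" },
    { 1u << 11, "HostInternalErr" },
    { 1u << 16, "PHYRdyChg" },
    { 1u << 17, "PHYInternalErr" },
    { 1u << 18, "CommWake" },
    { 1u << 19, "10B8BErr" },
    { 1u << 20, "Disparity" },
    { 1u << 21, "CRCErr" },
    { 1u << 22, "HandshakeErr" },
    { 1u << 23, "LinkSeqErr" },
    { 1u << 24, "TransStatTransErr" },
    { 1u << 25, "UnrecogFIS" },
    { 1u << 26, "DevExchanged" },
};

/* Write the space-separated names of all set bits into buf (len >= 1). */
static char *decode_serror(uint32_t serror, char *buf, size_t len)
{
    size_t i;

    buf[0] = '\0';
    for (i = 0; i < sizeof(serr_names) / sizeof(serr_names[0]); i++) {
        if (!(serror & serr_names[i].mask))
            continue;
        if (buf[0] != '\0')
            strncat(buf, " ", len - strlen(buf) - 1);
        strncat(buf, serr_names[i].name, len - strlen(buf) - 1);
    }
    return buf;
}
```

For example, a value with bits 23 and 24 set decodes to "LinkSeqErr TransStatTransErr", and a value with only bit 22 set decodes to "HandshakeErr".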


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/


Re: [PATCH] "volatile considered harmful" document

2007-05-12 Thread Robert Hancock


Bill Davidsen wrote:

Jonathan Corbet wrote:


+There are still a few rare situations where volatile makes sense in the
+kernel:
+
+  - The above-mentioned accessor functions might use volatile on
+    architectures where direct I/O memory access does work.  Essentially,
+    each accessor call becomes a little critical section on its own and
+    ensures that the access happens as expected by the programmer.
+
+  - Inline assembly code which changes memory, but which has no other
+    visible side effects, risks being deleted by GCC.  Adding the volatile
+    keyword to asm statements will prevent this removal.
+
+  - The jiffies variable is special in that it can have a different value
+    every time it is referenced, but it can be read without any special
+    locking.  So jiffies can be volatile, but the addition of other
+    variables of this type is frowned upon.  Jiffies is considered to be a
+    "stupid legacy" issue in this regard.


It would seem that any variable which is (a) subject to change by other 
threads or hardware, and (b) the value of which is going to be used 
without writing the variable, would be a valid use for volatile.


You don't need volatile in that case, rmb() can be used.
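
The kernel's rmb() cannot be called from a standalone program, so the sketch below uses a different technique as a user-space analog: C11 acquire/release atomics. It only shows the general shape of the idea (order the read of a flag before the read of the data it guards, without marking the data volatile). The variable and function names are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

static int payload;        /* plain data, deliberately not volatile */
static atomic_bool ready;  /* flag that publishes payload */

/* Writer side: store the data, then publish it with release ordering. */
static void publish(int value)
{
    payload = value;
    atomic_store_explicit(&ready, true, memory_order_release);
}

/*
 * Reader side: the acquire load orders the flag read before the
 * payload read, which is the role a read barrier plays after a
 * plain load in kernel code.
 */
static bool try_consume(int *out)
{
    if (!atomic_load_explicit(&ready, memory_order_acquire))
        return false;
    *out = payload;
    return true;
}
```

Before publish() is called, try_consume() returns false; afterwards it returns true and yields the published value.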

--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/


Re: [PATCH] libata: add human-readable error value decoding

2007-05-11 Thread Robert Hancock


Tejun Heo wrote:

Chuck Ebbert wrote:

Robert Hancock wrote:

+  ehc->i.serror & SERR_TRANS_ST_ERROR ? "TransStatTransErr " : "",
+  ehc->i.serror & SERR_UNRECOG_FIS ? "UnrecogFIS " : "",
+  ehc->i.serror & SERR_DEV_XCHG ? "DevExchanged " : "" );

I'm not really convinced whether this is necessary.  The human readable
form is also a bit cryptic and can get quite long.  So, mild NACK from
me.


It certainly seems useful when debugging hotplug issues or random SATA
problems which end up being caused by communication problems. Without
this output, Joe User stands no chance of figuring out what's going on,
and neither does Joe libata Developer unless they really care to dig
through the spec and count bits to figure out what they mean. At least
with this you can see that there was a CRC error, etc. and go from that..


Why not just document the error messages?

And the scsi ones too, I can't seem to find what the sense codes mean.


They are well documented elsewhere - the standard documents.  For sense
codes, t10.org.  For SError bits, t13.org.  You can get drafts free of
charge.


The ATA ones are more of a pain in that regard than SCSI though - SCSI 
has all distinct error codes for different errors, whereas ATA has 
bitmasks for everything..
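
The difference can be sketched in a few lines. A SCSI sense key is a single 4-bit code, so decoding is a plain lookup, while the ATA error register is a bitmask in which several conditions can be set at once. The sense key names and the ATA error bit positions (ICRC bit 7, UNC bit 6, IDNF bit 4, ABRT bit 2) follow the standards; the function names are hypothetical and this is user-space illustration, not libata code.

```c
#include <stddef.h>
#include <string.h>

/* SCSI: one distinct code per condition, so a switch is enough. */
static const char *scsi_sense_key_name(unsigned int key)
{
    switch (key & 0xF) {
    case 0x0: return "NO SENSE";
    case 0x1: return "RECOVERED ERROR";
    case 0x2: return "NOT READY";
    case 0x3: return "MEDIUM ERROR";
    case 0x4: return "HARDWARE ERROR";
    case 0x5: return "ILLEGAL REQUEST";
    case 0x6: return "UNIT ATTENTION";
    case 0x7: return "DATA PROTECT";
    case 0xB: return "ABORTED COMMAND";
    default:  return "OTHER";
    }
}

/* ATA: every bit must be tested, and several may be set together. */
static char *ata_error_names(unsigned char err, char *buf, size_t len)
{
    buf[0] = '\0';
    if (err & 0x80)
        strncat(buf, "ICRC ", len - strlen(buf) - 1);
    if (err & 0x40)
        strncat(buf, "UNC ", len - strlen(buf) - 1);
    if (err & 0x10)
        strncat(buf, "IDNF ", len - strlen(buf) - 1);
    if (err & 0x04)
        strncat(buf, "ABRT ", len - strlen(buf) - 1);
    return buf;
}
```

An ATA error value of 0x84 yields "ICRC ABRT " (two conditions at once), whereas a sense key of 0x3 maps to exactly one name, "MEDIUM ERROR".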


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/


Re: [PATCH] libata: add human-readable error value decoding

2007-05-10 Thread Robert Hancock


Jeff Garzik wrote:

Mark Lord wrote:

If we're compiling the messages into the kernel regardless,
then it doesn't really make much sense to NOT show all of them
on the error paths.



Not true.  Uncontrolled message spewage inevitably results in critical 
information scrolling off the screen, before a user can take a digital 
photo of the output...  Or of users being confused by subsequent error 
fallout (i.e. multiple oopses reporting problem).


Moderation and restraint still have roles to play...  :)

Jeff


I don't think this is as big of a deal here as in other cases, like oops 
output. With libata errors, if they're at the console (which they'd have 
to be to see these messages), unless something has actually caused a 
panic the scrollback buffer should still be functional and they'd be 
able to see the entire output..


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/


Re: [PATCH] libata: add human-readable error value decoding

2007-05-10 Thread Robert Hancock


Tejun Heo wrote:

+if (ehc->i.serror)
+ata_port_printk(ap, KERN_ERR,
+  "SError: {%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s}\n",
+  ehc->i.serror & SERR_DATA_RECOVERED ? "RecovDataErr " : "",
+  ehc->i.serror & SERR_COMM_RECOVERED ? "RecovCommErr " : "",
+  ehc->i.serror & SERR_DATA ? "UnrecovDataErr " : "",
+  ehc->i.serror & SERR_PERSISTENT ? "PersistErr " : "",
+  ehc->i.serror & SERR_PROTOCOL ? "ProtocolErr " : "",
+  ehc->i.serror & SERR_INTERNAL ? "HostInternalErr " : "",
+  ehc->i.serror & SERR_PHYRDY_CHG ? "PHYRdyChg " : "",
+  ehc->i.serror & SERR_PHY_INT_ERR ? "PHYInternalErr " : "",
+  ehc->i.serror & SERR_COMM_WAKE ? "CommWake " : "",
+  ehc->i.serror & SERR_10B_8B_ERR ? "10B8BErr " : "",
+  ehc->i.serror & SERR_DISPARITY ? "Disparity " : "",
+  ehc->i.serror & SERR_CRC ? "CRCErr " : "",
+  ehc->i.serror & SERR_HANDSHAKE ? "HandshakeErr " : "",
+  ehc->i.serror & SERR_LINK_SEQ_ERR ? "LinkSeqErr " : "",
+  ehc->i.serror & SERR_TRANS_ST_ERROR ? "TransStatTransErr " : "",
+  ehc->i.serror & SERR_UNRECOG_FIS ? "UnrecogFIS " : "",
+  ehc->i.serror & SERR_DEV_XCHG ? "DevExchanged " : "" );


I'm not really convinced whether this is necessary.  The human readable
form is also a bit cryptic and can get quite long.  So, mild NACK from me.



It certainly seems useful when debugging hotplug issues or random SATA 
problems which end up being caused by communication problems. Without 
this output, Joe User stands no chance of figuring out what's going on, 
and neither does Joe libata Developer unless they really care to dig 
through the spec and count bits to figure out what they mean. At least 
with this you can see that there was a CRC error, etc. and go from that..


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/

