RE: [PATCH 1/6] PCI Express Advanced Error Reporting Driver

2005-03-15 Thread Nguyen, Tom L
Tuesday, March 15, 2005 2:51 PM Linas Vepstas wrote:
>> +void hw_aer_unregister(void)
>> +{
>> +struct pci_dev *dev = (struct pci_dev*)host->dev;
>> +unsigned short id;
>> +
>> +id = (dev->bus->number << 8) | dev->devfn;
>> +
>> +/* Unregister with AER Root driver */
>> +pcie_aer_unregister(id);
>> +}
>
>I don't understand how this can work on a system with 
>more than one domain.  On any midrange/high-end system, 
>you'll have a number of devices with identical values
>for (bus->number << 8) | devfn)

Good catch! I forgot to account for multiple segments. However, based on
LKML input favoring a common interface in the pci_driver data structure,
it appears that pcie_aer_register and pcie_aer_unregister are no longer
required.

Thanks,
Long
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/6] PCI Express Advanced Error Reporting Driver

2005-03-15 Thread Grant Grundler
On Tue, Mar 15, 2005 at 04:51:01PM -0600, Linas Vepstas wrote:
> Hi,
> 
> On Fri, Mar 11, 2005 at 04:12:18PM -0800, long was heard to remark:
> 
> > +void hw_aer_unregister(void)
> > +{
> > +   struct pci_dev *dev = (struct pci_dev*)host->dev;

I'm more nervous about "host" being defined as a global
instead of being passed in. I have not reviewed the
other code and don't know if that's safe.

> > +   unsigned short id;
> > +
> > +   id = (dev->bus->number << 8) | dev->devfn;
> > +   
> > +   /* Unregister with AER Root driver */
> > +   pcie_aer_unregister(id);
> > +}
> 
> I don't understand how this can work on a system with 
> more than one domain.  On any midrange/high-end system, 
> you'll have a number of devices with identical values
> for (bus->number << 8) | devfn)

Yes - this is an error reported within a particular domain.
I'm expecting host-> to refer to a particular domain.
Maybe it doesn't?
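
To make the "global versus passed in" concern concrete, here is a minimal
user-space sketch, not code from the patch. All example_* names are
hypothetical stand-ins for the kernel structures, and the id computation is
copied from the quoted function. The only change is that the host is an
argument, so the caller states which adapter is meant. It does not by itself
answer the domain question.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-ins for struct pci_bus and struct pci_dev. */
struct example_bus {
    uint8_t number;              /* bus number */
};

struct example_pci_dev {
    struct example_bus *bus;     /* bus the device sits on */
    uint8_t devfn;               /* device in bits 7:3, function in bits 2:0 */
};

/* Hypothetical per-adapter state; the quoted patch reaches it via a global. */
struct example_host {
    struct example_pci_dev *dev;
};

/*
 * Same id computation as the quoted patch, but the host is passed in,
 * so nothing depends on which adapter a global happens to point at.
 */
static uint16_t example_aer_id(const struct example_host *host)
{
    const struct example_pci_dev *dev = host->dev;

    return (uint16_t)((dev->bus->number << 8) | dev->devfn);
}
```

With two hosts on buses 0x01 and 0x21, both at devfn 0x08, the function
yields 0x0108 and 0x2108 respectively, independent of any shared state.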

[ example deleted ]

> Or am I being stupid/dense/all-of-the-above?

Probably not.

grant

> 
> --linas


Re: [PATCH 1/6] PCI Express Advanced Error Reporting Driver

2005-03-15 Thread Linas Vepstas
Hi,

On Fri, Mar 11, 2005 at 04:12:18PM -0800, long was heard to remark:

> +void hw_aer_unregister(void)
> +{
> + struct pci_dev *dev = (struct pci_dev*)host->dev;
> + unsigned short id;
> +
> + id = (dev->bus->number << 8) | dev->devfn;
> + 
> + /* Unregister with AER Root driver */
> + pcie_aer_unregister(id);
> +}

I don't understand how this can work on a system with 
more than one domain.  On any midrange/high-end system, 
you'll have a number of devices with identical values
for (bus->number << 8) | devfn)

For example, on my system, lspci prints out:

mosquito:~ # lspci
:00:01.0 Co-processor: IBM: Unknown device 00e0 (rev 01)
:00:03.0 ISA bridge: Symphony Labs W83C553 (rev 10)
0001:00:02.0 PCI bridge: IBM: Unknown device 0188 (rev 02)
0001:00:02.2 PCI bridge: IBM: Unknown device 0188 (rev 02)
0001:00:02.3 PCI bridge: IBM: Unknown device 0188 (rev 02)
0001:00:02.4 PCI bridge: IBM: Unknown device 0188 (rev 02)
0001:00:02.6 PCI bridge: IBM: Unknown device 0188 (rev 02)
0001:01:01.0 SCSI storage controller: LSI Logic / Symbios Logic 53c1010
66MHz  Ultra3 SCSI Adapter (rev 01)
0001:01:01.1 SCSI storage controller: LSI Logic / Symbios Logic 53c1010
66MHz  Ultra3 SCSI Adapter (rev 01)
0001:21:01.0 Ethernet controller: Intel Corp. 82557/8/9 [Ethernet Pro
100] (rev 0d)
0002:00:02.0 PCI bridge: IBM: Unknown device 0188 (rev 02)
0002:00:02.2 PCI bridge: IBM: Unknown device 0188 (rev 02)
0002:00:02.4 PCI bridge: IBM: Unknown device 0188 (rev 02)
0002:00:02.6 PCI bridge: IBM: Unknown device 0188 (rev 02)


Here, 'Unknown device' is actually an empty slot.

If I plugged the ethernet card in a few slots over, it would 
show up as

0002:01:01.0 Ethernet controller: Intel Corp. 82557/8/9 [Ethernet Pro

and so it would have the exact same (bus->number << 8) | devfn)
as the scsi device.
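
The collision can be shown with a small user-space sketch. This is an
illustration only, not part of the patch: struct example_pci_addr and both
helper functions are hypothetical, and folding the domain into the upper 16
bits is just one possible way to keep the ids distinct.

```c
#include <stdint.h>

/* Hypothetical stand-in for the address fields of a PCI device. */
struct example_pci_addr {
    uint16_t domain;   /* PCI domain (segment) number */
    uint8_t  bus;      /* bus number within the domain */
    uint8_t  devfn;    /* device in bits 7:3, function in bits 2:0 */
};

/* The 16-bit id as computed in the patch: bus and devfn only. */
static uint16_t id_without_domain(const struct example_pci_addr *a)
{
    return (uint16_t)((a->bus << 8) | a->devfn);
}

/* A wider id that also carries the domain, here in the upper 16 bits. */
static uint32_t id_with_domain(const struct example_pci_addr *a)
{
    return ((uint32_t)a->domain << 16) | id_without_domain(a);
}
```

For 0001:01:01.0 and 0002:01:01.0 (devfn = (1 << 3) | 0 = 0x08), the 16-bit
id is 0x0108 in both cases, while the 32-bit ids are 0x00010108 and
0x00020108.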

Or am I being stupid/dense/all-of-the-above?

--linas



RE: [PATCH 1/6] PCI Express Advanced Error Reporting Driver

2005-03-14 Thread Nguyen, Tom L
On Saturday, March 12, 2005 1:38 AM Andi Kleen wrote:
>I haven't read your code in detail, just a high level remark.
>
>> +6. Enabling AER Aware Support in PCI Express Device Driver
>> +
>> +To enable AER aware support requires a software driver to configure
>> +the AER capability structure within its device, to initialize its AER
>> +aware callback handle and to call pcie_aer_register. Sections 6.1,
>> +6.2, and 6.3 describe how to enable AER aware support in details.
>
>There is currently discussion underway for a generic portable PCI 
>error reporting interface for drivers. This is already being worked
>on by some PPC64 and IA64 people. I don't think it would be a good idea
>to add another incompatible PCI-E specific interface.
>
>So I would recommend to not apply pcie_aer_register() et.al.
>and coordinate with the others working on this area on a common
>interface.
>
>This would only impact the device driver interface; having
>a PCI Express specific interface in sysfs is probably ok.
>
>Otherwise we would end up with tons of ifdefs in the drivers
>supporting multiple error reporting interfaces for different platforms,
>which would be bad.
>
>Also in general I think the necessary callbacks should
>be part of the basic device; not provided in a separate structure.

Agreed. We will coordinate with the others working in this area on a
common interface.

Thanks for your suggestions,
Long



Re: [PATCH 1/6] PCI Express Advanced Error Reporting Driver

2005-03-14 Thread Randy.Dunlap
Nguyen, Tom L wrote:
> Monday, March 14, 2005 3:01 AM David Vrabel wrote:
>
>>> This patch includes PCIEAER-HOWTO.txt, which describes how the PCI
>>> Express Advanced Error Reporting Root driver works.
>>>
>>> --- linux-2.6.11-rc5/Documentation/PCIEAER-HOWTO.txt
>>>
>> Could this be placed in a sub-system subdirectory (creating one if
>> necessary, e.g., pci/)?  The root of Documentation/ is rather full of
>> random files as is.
>
> Most of the HOWTO documents are under Documentation/ directory. I have
> no problem of placing it in a sub-system subdirectory if it is OK with
> Linux community?

It should remain in the Documentation/ directory or a (new)
subdirectory under Documentation/ .
--
~Randy


RE: [PATCH 1/6] PCI Express Advanced Error Reporting Driver

2005-03-14 Thread Nguyen, Tom L
Monday, March 14, 2005 3:01 AM David Vrabel wrote:

>> This patch includes PCIEAER-HOWTO.txt, which describes how the PCI
>> Express Advanced Error Reporting Root driver works.
>>
>> --- linux-2.6.11-rc5/Documentation/PCIEAER-HOWTO.txt
>>
>Could this be placed in a sub-system subdirectory (creating one if
>necessary, e.g., pci/)?  The root of Documentation/ is rather full of
>random files as is.

Most of the HOWTO documents are directly under the Documentation/
directory. I have no problem placing it in a sub-system subdirectory if
that is OK with the Linux community.

Thanks,
Long


Re: [PATCH 1/6] PCI Express Advanced Error Reporting Driver

2005-03-14 Thread David Vrabel
long wrote:
> This patch includes PCIEAER-HOWTO.txt, which describes how the PCI
> Express Advanced Error Reporting Root driver works.

> --- linux-2.6.11-rc5/Documentation/PCIEAER-HOWTO.txt

Could this be placed in a sub-system subdirectory (creating one if
necessary, e.g., pci/)?  The root of Documentation/ is rather full of
random files as is.

David Vrabel


Re: [PATCH 1/6] PCI Express Advanced Error Reporting Driver

2005-03-12 Thread Andi Kleen
long <[EMAIL PROTECTED]> writes:

I haven't read your code in detail, just a high level remark.

> +6. Enabling AER Aware Support in PCI Express Device Driver
> +
> +To enable AER aware support requires a software driver to configure
> +the AER capability structure within its device, to initialize its AER
> +aware callback handle and to call pcie_aer_register. Sections 6.1,
> +6.2, and 6.3 describe how to enable AER aware support in details.

[...]

There is currently discussion underway for a generic portable PCI 
error reporting interface for drivers. This is already being worked
on by some PPC64 and IA64 people. I don't think it would be a good idea
to add another incompatible PCI-E specific interface.

So I would recommend not applying pcie_aer_register() et al.,
and instead coordinating with the others working in this area on a
common interface.

This would only impact the device driver interface; having
a PCI Express specific interface in sysfs is probably ok.

Otherwise we would end up with tons of ifdefs in the drivers
supporting multiple error reporting interfaces for different platforms, 
which would be bad.

Also in general I think the necessary callbacks should
be part of the basic device; not provided in a separate structure.
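
As a rough illustration of that last point, the sketch below puts the error
callback directly in the driver structure, so the core can reach it through
the device without any separate registration call. This is a user-space
sketch under stated assumptions: every example_* name is hypothetical, and
nothing here is the interface under discussion or an existing kernel API.

```c
#include <stddef.h>

struct example_dev;

/* Hypothetical driver structure: the error callback lives next to probe(). */
struct example_driver {
    const char *name;
    int  (*probe)(struct example_dev *dev);
    void (*error_detected)(struct example_dev *dev, int severity);
};

/* Hypothetical device: it already points at its driver. */
struct example_dev {
    const char *addr;
    const struct example_driver *driver;
    int last_severity;           /* set by the sample callback below */
};

/*
 * Core-side dispatch: returns 1 if the bound driver was notified, 0 if the
 * device has no driver or the driver did not supply a callback.
 */
static int example_report_error(struct example_dev *dev, int severity)
{
    if (dev->driver == NULL || dev->driver->error_detected == NULL)
        return 0;
    dev->driver->error_detected(dev, severity);
    return 1;
}

/* Sample driver callback: just records the reported severity. */
static void example_nic_error_detected(struct example_dev *dev, int severity)
{
    dev->last_severity = severity;
}

static const struct example_driver example_nic_driver = {
    .name = "example_nic",
    .probe = NULL,
    .error_detected = example_nic_error_detected,
};
```

A driver that does not care about errors simply leaves the pointer NULL, and
no #ifdef is needed in the driver to pick a platform-specific registration
call.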

-Andi



[PATCH 1/6] PCI Express Advanced Error Reporting Driver

2005-03-11 Thread long
This patch includes PCIEAER-HOWTO.txt, which describes how the PCI
Express Advanced Error Reporting Root driver works.

Signed-off-by: T. Long Nguyen <[EMAIL PROTECTED]>


diff -urpN linux-2.6.11-rc5/Documentation/PCIEAER-HOWTO.txt patch-2.6.11-rc5-aerc3-split1/Documentation/PCIEAER-HOWTO.txt
--- linux-2.6.11-rc5/Documentation/PCIEAER-HOWTO.txt	1969-12-31 19:00:00.0 -0500
+++ patch-2.6.11-rc5-aerc3-split1/Documentation/PCIEAER-HOWTO.txt	2005-03-11 10:25:21.0 -0500
@@ -0,0 +1,712 @@
+   The PCI Express Advanced Error Reporting Root Driver Guide HOWTO
+   T. Long Nguyen <[EMAIL PROTECTED]>
+   02/23/2005
+
+1. About this guide
+
+This guide describes the basics of the PCI Express Advanced Error
+Reporting (AER) Root driver and provides information on how to enable
+the drivers of AER endpoint devices to register/un-register with PCI
+Express Root AER driver.
+
+2. Copyright © Intel Corporation 2005. 
+
+3. What is the PCI Express AER Root Driver?
+
+PCI Express error signaling can occur on the PCI Express link itself
+or on behalf of transactions initiated on the link. PCI Express
+defines two error reporting paradigms: the baseline capability and
+the Advanced Error Reporting capability. The baseline capability is
+required of all PCI Express components providing a minimum defined
+set of error reporting requirements. Advanced Error Reporting
+capability is implemented with a PCI Express advanced error reporting
+extended capability structure providing more robust error reporting.
+
+The PCI Express AER Root driver provides the mechanism to support PCI
+Express Advanced Error Reporting capability. The PCI Express AER Root
+driver provides the following basic functions:
+
+-  A mechanism to allow a driver of a PCI Express component to
+   register/un-register its AER aware callback handle with the
+   PCI Express AER Root driver. This mechanism is provided as
+   an option to allow the PCI Express AER Root driver to query
+   the PCI Express component device driver to determine more
+   precisely which error and what severity occurred.
+   
+-  A mechanism to process the error reporting message detected
+   by Root Ports, and report the errors to the user.
+
+4. Why Use the PCI Express AER Root Driver?
+
+In a PCI Express-aware system, a PCI Express component in a hierarchy
+associated with the Root Port can send an error reporting message
+to the Root Port. The Root Port, upon receiving an error reporting
+message, internally processes and logs the error message in its PCI
+Express capability structure. Error information being logged includes
+storing the error reporting agent's requestor ID into the Error
+Source Identification Registers and setting the error bits of the
+Root Error Status Register accordingly. If AER error reporting is
+enabled in Root Error Command Register, the Root Port generates an
+interrupt if an error is detected.
+
+In existing Linux kernels, 2.4.x and 2.6.x, there is no root service
+driver available to manage the PCI Express advanced error reporting
+extended capability structure. If an error is detected by the Root
+Port, the baseline capability, as described above, therefore must
+be provided by platform hardware to provide the platform-specific
+system with minimum defined error reporting requirements. Such a 
+platform-specific system may have BIOS not only configure the Root
+Control Register of the Root Ports' PCI Express capability structure
+to generate the system error accordingly but also handle error 
+reporting messages. Using platform-specific BIOS to handle system
+error reporting has three key issues:
+
+-  Inability to coordinate with the downstream device drivers
+   to determine more precisely which error and what severity.
+
+-  Inability to reset the downstream links while handling fatal
+   error recovery.
+
+-  Platform-specific dependency.
+
+To provide a solution to these issues requires the PCI Express AER
+Root driver that provides:
+
+-  A mechanism for the OS and application to determine if a fatal
+   error is fatal to the system, OS, or application, increasing
+   uptime.
+
+-  A mechanism to notify the downstream device drivers if errors
+   occurred.
+
+-  A mechanism to dynamically perform error recovery actions
+   based on configuration options.
+
+-  Platform-specific independence.
+
+5. Including the PCI Express AER Root Driver into the Linux Kernel
+
+The PCI Express AER Root driver is a Root Port service driver attached
+to the PCI Express Port Bus driver. Its service must be registered
+with the PCI Express Port Bus driver and users are required to include
+the PCI Express Port Bus driver in the kernel (refer to
+PCIEBUS-HOWTO.txt). Once the kernel config CONFIG_PCIEPORTBUS is
+included, the PCI Express AER Root driver is automatically
