RE: [PATCH 0/6] PCI Express Advanced Error Reporting Driver

2005-03-18 Thread Nguyen, Tom L
On Friday, March 18, 2005 10:26 AM Grant Grundler wrote:
>> He was referring to an unpublished draft "Error Reporting ECN".
>> You'll have to talk to Intel's PCI-SIG representative to get a copy.
>
>Good News: the "Error Reporting ECN" is now posted on the PCISIG website.
>
>Tom, please review and see if/how that changes your implementation.

Agreed. Thanks for the update.

Thanks,
Long
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 0/6] PCI Express Advanced Error Reporting Driver

2005-03-18 Thread Grant Grundler
On Tue, Mar 15, 2005 at 07:12:07PM -0700, Grant Grundler wrote:
...
> He was referring to an unpublished draft "Error Reporting ECN".
> You'll have to talk to Intel's PCI-SIG representative to get a copy.

Good News: the "Error Reporting ECN" is now posted on the PCISIG website.


http://www.pcisig.com/specifications/pciexpress/specifications/ECN_Error_Reporting_050127_clean.pdf

Tom, please review and see if/how that changes your implementation.

thanks,
grant


Re: [PATCH 0/6] PCI Express Advanced Error Reporting Driver

2005-03-15 Thread Greg KH
On Tue, Mar 15, 2005 at 07:12:07PM -0700, Grant Grundler wrote:
> On Tue, Mar 15, 2005 at 01:11:39PM -0700, Grant Grundler wrote:
> > Tom,
> > A co-worker made the following observation (I'm paraphrasing):
> > ...this proposal does not deal with the Error Reporting ECN.
> > For example, they do not show the advisory non-fatal bit in
> > the correctable error status register.
> > 
> > I believe he is referring to the "Error Clarifications ECN":
> > 
> > 
> > http://www.pcisig.com/specifications/pciexpress/ECN_-_Error_Clarifications.pdf
> 
> Tom,
> Sorry - I got this wrong.
> He was referring to an unpublished draft "Error Reporting ECN".
> You'll have to talk to Intel's PCI-SIG representative to get a copy.
> [ Ugh. And everyone else is SOL - sorry ]

Then we have no obligation to be compliant with an unpublished spec :)

greg k-h


Re: [PATCH 0/6] PCI Express Advanced Error Reporting Driver

2005-03-15 Thread Grant Grundler
On Tue, Mar 15, 2005 at 01:11:39PM -0700, Grant Grundler wrote:
> Tom,
> A co-worker made the following observation (I'm paraphrasing):
>   ...this proposal does not deal with the Error Reporting ECN.
>   For example, they do not show the advisory non-fatal bit in
>   the correctable error status register.
> 
> I believe he is referring to the "Error Clarifications ECN":
> 
>   
> http://www.pcisig.com/specifications/pciexpress/ECN_-_Error_Clarifications.pdf

Tom,
Sorry - I got this wrong.
He was referring to an unpublished draft "Error Reporting ECN".
You'll have to talk to Intel's PCI-SIG representative to get a copy.
[ Ugh. And everyone else is SOL - sorry ]

I'm annoyed he wanted me to raise this in a public forum without
having a public document to point at. And I'm annoyed at myself
for being lazy and not verifying that beforehand...

sorry,
grant


RE: [PATCH 0/6] PCI Express Advanced Error Reporting Driver

2005-03-15 Thread Nguyen, Tom L
On Tuesday, March 15, 2005 2:38 PM Grant Grundler wrote:
>> >A co-worker made the following observation (I'm paraphrasing):
>> >...this proposal does not deal with the Error Reporting ECN.
>> >For example, they do not show the advisory non-fatal bit in
>> >the correctable error status register.
>> 
>> Does he refer to the ECN update on the Received Error Bit[0] of the
>> Correctable Error Status Register and on the Training Error Bit[0] of
>> the Uncorrectable Error Status Register? If not, please clarify his
>> comments for us.

>Yes - I believe so.

Great! I will make changes to reflect this update. Thanks for pointing
it out.

Thanks,
Long


Re: [PATCH 0/6] PCI Express Advanced Error Reporting Driver

2005-03-15 Thread Grant Grundler
On Tue, Mar 15, 2005 at 01:54:32PM -0800, Nguyen, Tom L wrote:
> On Tuesday, March 15, 2005 12:12 PM Grant Grundler wrote:
> >Tom,
> >A co-worker made the following observation (I'm paraphrasing):
> > ...this proposal does not deal with the Error Reporting ECN.
> > For example, they do not show the advisory non-fatal bit in
> > the correctable error status register.
> 
> Does he refer to the ECN update on the Received Error Bit[0] of the
> Correctable Error Status Register and on the Training Error Bit[0] of
> the Uncorrectable Error Status Register? If not, please clarify his
> comments for us.


Yes - I believe so.

grant


RE: [PATCH 0/6] PCI Express Advanced Error Reporting Driver

2005-03-15 Thread Nguyen, Tom L
On Tuesday, March 15, 2005 12:12 PM Grant Grundler wrote:
>Tom,
>A co-worker made the following observation (I'm paraphrasing):
>   ...this proposal does not deal with the Error Reporting ECN.
>   For example, they do not show the advisory non-fatal bit in
>   the correctable error status register.

Does he refer to the ECN update on the Received Error Bit[0] of the
Correctable Error Status Register and on the Training Error Bit[0] of
the Uncorrectable Error Status Register? If not, please clarify his
comments for us.

Thanks,
Long
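
The bits asked about above can be sketched as a small decoder for the
Correctable Error Status register. This is only an illustration: the
mask values are the ones Linux's pci_regs.h uses as far as I know
(Receiver Error in bit 0, Advisory Non-Fatal in bit 13) and should be
checked against the spec and the ECN. The names COR_* and
decode_cor_status() are made up for this sketch and are not from the
patchset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Correctable Error Status bits. Values are an assumption taken from
 * Linux's pci_regs.h; verify against the spec and the ECN. */
#define COR_RCVR      0x00000001u  /* bit 0: Receiver Error */
#define COR_BAD_TLP   0x00000040u  /* bit 6: Bad TLP */
#define COR_BAD_DLLP  0x00000080u  /* bit 7: Bad DLLP */
#define COR_REP_ROLL  0x00000100u  /* bit 8: REPLAY_NUM Rollover */
#define COR_REP_TIMER 0x00001000u  /* bit 12: Replay Timer Timeout */
#define COR_ADV_NFAT  0x00002000u  /* bit 13: Advisory Non-Fatal Error */

struct cor_bit {
	uint32_t mask;
	const char *name;
};

static const struct cor_bit cor_bits[] = {
	{ COR_RCVR,      "Receiver Error" },
	{ COR_BAD_TLP,   "Bad TLP" },
	{ COR_BAD_DLLP,  "Bad DLLP" },
	{ COR_REP_ROLL,  "REPLAY_NUM Rollover" },
	{ COR_REP_TIMER, "Replay Timer Timeout" },
	{ COR_ADV_NFAT,  "Advisory Non-Fatal Error" },
};

/* Hypothetical helper: store the names of the set bits in names[]
 * (at most max entries) and return how many were stored. */
size_t decode_cor_status(uint32_t status, const char **names, size_t max)
{
	size_t i, n = 0;

	assert(names != NULL);
	for (i = 0; i < sizeof(cor_bits) / sizeof(cor_bits[0]); i++) {
		if ((status & cor_bits[i].mask) && n < max)
			names[n++] = cor_bits[i].name;
	}
	return n;
}
```

A status value with only bits 0 and 13 set would decode to the
Receiver Error and Advisory Non-Fatal Error entries, in that order.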


Re: [PATCH 0/6] PCI Express Advanced Error Reporting Driver

2005-03-15 Thread Grant Grundler
Tom,
A co-worker made the following observation (I'm paraphrasing):
...this proposal does not deal with the Error Reporting ECN.
For example, they do not show the advisory non-fatal bit in
the correctable error status register.

I believe he is referring to the "Error Clarifications ECN":


http://www.pcisig.com/specifications/pciexpress/ECN_-_Error_Clarifications.pdf

Looks like all PCI-E ECNs are available [just not the original docs :^( ]:
http://www.pcisig.com/specifications/pciexpress/specifications

hth,
grant


RE: [PATCH 0/6] PCI Express Advanced Error Reporting Driver

2005-03-14 Thread Nguyen, Tom L
On Friday, March 11, 2005 11:21 PM Greg KH wrote:
>> 
>> -Report the errors to user.
>>
>This is done through the syslog, right?  Is that acceptable?

Errors reported to the user can be written automatically to
/var/log/messages or read manually through the syslog. I am not
sure whether that is acceptable, but I like your suggestion below.

>It looks like you are logging a lot of stuff, all without a kernel log
>level, which is going to really mess up syslog parsers.
>
>Have you thought about just providing userspace with access to the error
>message, in binary form, from a sysfs file, and causing a kevent to wake
>userspace up to know to read from the file?  That way all of the parsing
>of the error log can be done in userspace, and there is no formatting of
>the messages from within the kernel.

Again, I like this suggestion.

Thanks,
Long


Re: [PATCH 0/6] PCI Express Advanced Error Reporting Driver

2005-03-12 Thread Greg KH
On Fri, Mar 11, 2005 at 04:10:28PM -0800, long wrote:
> 
> - Report the errors to user.

This is done through the syslog, right?  Is that acceptable?

It looks like you are logging a lot of stuff, all without a kernel log
level, which is going to really mess up syslog parsers.

Have you thought about just providing userspace with access to the error
message, in binary form, from a sysfs file, and causing a kevent to wake
userspace up to know to read from the file?  That way all of the parsing
of the error log can be done in userspace, and there is no formatting of
the messages from within the kernel.

thanks,

greg k-h
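
The userspace half of the suggestion above could look roughly like the
sketch below: the kernel hands over a raw binary record and userspace
does all the decoding. Everything here is an assumption made for
illustration. The 8-byte record layout, struct aer_record and
aer_parse_record() are invented; the thread does not define any
format, and the sysfs file and the wake-up event are left out
entirely. Only the bus/device/function split of the 16-bit source ID
follows the usual PCI Express Requester ID convention.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 8-byte little-endian record, invented for this sketch:
 *   bytes 0-1  source ID (bus << 8 | device << 3 | function)
 *   byte  2    severity (0 = correctable, 1 = non-fatal, 2 = fatal)
 *   byte  3    reserved
 *   bytes 4-7  raw error status register
 */
#define AER_REC_SIZE 8

struct aer_record {
	uint8_t bus;
	uint8_t dev;
	uint8_t fn;
	uint8_t severity;
	uint32_t status;
};

/* Decode one record from buf. Returns 0 on success, -1 if buf is
 * shorter than one record. */
int aer_parse_record(const uint8_t *buf, size_t len, struct aer_record *rec)
{
	uint16_t id;

	assert(buf != NULL && rec != NULL);
	if (len < AER_REC_SIZE)
		return -1;

	id = (uint16_t)(buf[0] | (buf[1] << 8));
	rec->bus = (uint8_t)(id >> 8);
	rec->dev = (uint8_t)((id >> 3) & 0x1f);
	rec->fn = (uint8_t)(id & 0x07);
	rec->severity = buf[2];
	rec->status = (uint32_t)buf[4] | ((uint32_t)buf[5] << 8) |
		      ((uint32_t)buf[6] << 16) | ((uint32_t)buf[7] << 24);
	return 0;
}
```

With a fixed layout like this, the kernel side never formats a string
and a log parser never has to guess at message text; a daemon can pick
its own output format after decoding.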


[PATCH 0/6] PCI Express Advanced Error Reporting Driver

2005-03-11 Thread long
PCI Express error signaling can occur on the PCI Express link itself
or on behalf of transactions initiated on the link. PCI Express
defines the Advanced Error Reporting capability, which is implemented
with a PCI Express advanced error reporting extended capability
structure, to provide more robust error reporting. With the Advanced
Error Reporting capability, a PCI Express component that detects an
error can send an error message to the Root Port associated with
its hierarchy.
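
As a rough illustration of how the extended capability structure
mentioned above is located, the sketch below walks the extended
capability list in a copy of a device's 4096-byte configuration space.
It assumes the usual layout: the list starts at offset 0x100, each
32-bit header carries the capability ID in bits 15:0 and the next
offset in bits 31:20, and the AER capability ID is 0x0001. The
function names and the in-memory buffer are made up for the example
and are not taken from the patches.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CFG_SPACE_SIZE 4096
#define EXT_CAP_START  0x100
#define EXT_CAP_ID_AER 0x0001

/* Read a little-endian 32-bit value from a config space copy. */
static uint32_t cfg_read32(const uint8_t *cfg, unsigned int off)
{
	return (uint32_t)cfg[off] | ((uint32_t)cfg[off + 1] << 8) |
	       ((uint32_t)cfg[off + 2] << 16) | ((uint32_t)cfg[off + 3] << 24);
}

/* Return the offset of the extended capability with the given ID, or
 * 0 if it is absent. cfg must point to CFG_SPACE_SIZE bytes. */
unsigned int find_ext_cap(const uint8_t *cfg, uint16_t id)
{
	unsigned int off = EXT_CAP_START;
	/* Bound the walk so a malformed, looping list cannot hang us. */
	int ttl = (CFG_SPACE_SIZE - EXT_CAP_START) / 8;

	assert(cfg != NULL);
	while (ttl-- > 0 && off >= EXT_CAP_START &&
	       off <= CFG_SPACE_SIZE - 4) {
		uint32_t hdr = cfg_read32(cfg, off);

		/* All zeros or all ones means no capability here. */
		if (hdr == 0 || hdr == 0xffffffffu)
			break;
		if ((hdr & 0xffff) == id)
			return off;
		off = (hdr >> 20) & 0xffc;	/* next capability offset */
	}
	return 0;
}
```

Calling find_ext_cap(cfg, EXT_CAP_ID_AER) on such a buffer gives the
offset of the AER structure, from which the status registers are read.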

The PCI Express Advanced Error Reporting driver is a PCI Express bus
service driver that handles Advanced Error Reporting on Root Ports. The
PCI Express AER Root driver provides the following functions:

-   A mechanism to allow a driver of a PCI Express component to
    register/un-register its AER-aware callback handle with the
    PCI Express AER Root driver. This mechanism is provided as
    an option to allow the PCI Express AER Root driver to query
    the PCI Express component device driver to determine more
    precisely which error occurred and at what severity.

-   A mechanism to process the error reporting messages detected
    by Root Ports, and

-   A mechanism to report the errors to the user.
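
The register/un-register mechanism in the first item can be pictured
with the userspace sketch below. It is not the interface from the
patches: aer_register(), aer_unregister(), aer_query(), the fixed-size
table and struct example_dev are all hypothetical names chosen for
illustration. The idea shown is only that a device driver hands the
root driver a callback, and the root driver later calls it to learn
the severity of the error on that device.

```c
#include <assert.h>
#include <stddef.h>

enum aer_severity { AER_CORRECTABLE, AER_NONFATAL, AER_FATAL };

/* Callback a device driver supplies so the root driver can ask it
 * what happened on its device. */
typedef enum aer_severity (*aer_query_fn)(void *dev);

#define MAX_HANDLERS 8

struct aer_handler {
	void *dev;
	aer_query_fn query;
};

static struct aer_handler handlers[MAX_HANDLERS];

/* Register a callback for dev. Returns 0, or -1 if the table is full. */
int aer_register(void *dev, aer_query_fn query)
{
	size_t i;

	assert(dev != NULL && query != NULL);
	for (i = 0; i < MAX_HANDLERS; i++) {
		if (handlers[i].dev == NULL) {
			handlers[i].dev = dev;
			handlers[i].query = query;
			return 0;
		}
	}
	return -1;
}

/* Remove the callback for dev, if one is registered. */
void aer_unregister(void *dev)
{
	size_t i;

	for (i = 0; i < MAX_HANDLERS; i++) {
		if (handlers[i].dev == dev) {
			handlers[i].dev = NULL;
			handlers[i].query = NULL;
		}
	}
}

/* Root driver side: ask the registered driver for the severity.
 * Returns 0 and fills *sev, or -1 if dev has no callback. */
int aer_query(void *dev, enum aer_severity *sev)
{
	size_t i;

	assert(sev != NULL);
	for (i = 0; i < MAX_HANDLERS; i++) {
		if (dev != NULL && handlers[i].dev == dev) {
			*sev = handlers[i].query(dev);
			return 0;
		}
	}
	return -1;
}

/* Hypothetical device driver state and its callback. */
struct example_dev {
	enum aer_severity last_error;
};

enum aer_severity example_query(void *dev)
{
	return ((struct example_dev *)dev)->last_error;
}
```

Because registration is optional, aer_query() returning -1 is the
normal case for a device whose driver is not AER aware; the root
driver then has only the information from the Root Port itself.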

This patchset, which is based on Linux kernel 2.6.11-rc5, consists
of six patches, to be applied in numeric order:

[PATCH 1/6] <- first patch to be applied
[PATCH 2/6] <- second patch to be applied
[PATCH 3/6] <- third patch to be applied
[PATCH 4/6] <- fourth patch to be applied
[PATCH 5/6] <- fifth patch to be applied
[PATCH 6/6] <- last patch to be applied

Please send us any suggestions, feedback, comments or alternative
designs.

Signed-off-by: T. Long Nguyen <[EMAIL PROTECTED]>

