On Mon, Apr 13, 2015 at 12:34:34PM +0100, Jan Beulich wrote:
> >>> On 13.04.15 at 13:19, <m...@redhat.com> wrote:
> > Yes Linux can't fix firmware 1st mode, but
> > PCI express spec says what firmware should do in this case:
> > 
> > IMPLEMENTATION NOTE Software UR Reporting Compatibility with 1.0a Devices
> > 
> >         With 1.0a device Functions, if the Unsupported Request Reporting
> >         Enable bit is set, the Function when operating as a Completer will
> >         send an uncorrectable error Message (if enabled) when a UR error
> >         is detected. On platforms where an uncorrectable error Message is
> >         handled as a System Error, this will break PC-compatible
> >         Configuration Space probing, so software/firmware on such
> >         platforms may need to avoid setting the Unsupported Request
> >         Reporting Enable bit.
> >
> >         With device Functions implementing Role-Based Error Reporting,
> >         setting the Unsupported Request Reporting Enable bit will not
> >         interfere with PC-compatible Configuration Space probing, assuming
> >         that the severity for UR is left at its default of non-fatal.
> >         However, setting the Unsupported Request Reporting Enable bit will
> >         enable the Function to report UR errors detected with posted
> >         Requests, helping avoid this case for potential silent data
> >         corruption.
> >
> >         On platforms where robust error handling and PC-compatible
> >         Configuration Space probing is required, it is suggested that
> >         software or firmware have the Unsupported Request Reporting Enable
> >         bit Set for Role-Based Error Reporting Functions, but clear for
> >         1.0a Functions. Software or firmware can distinguish the two
> >         classes of Functions by examining the Role-Based Error Reporting
> >         bit in the Device Capabilities register.
> > 
> > 
> > What I think you have is a very old 1.0a system, and you set Unsupported
> > Request Reporting Enable.
> > 
> > Can you confirm?
> 
> No. In at least one of the two cases where we got reports of the
> original problem (which triggered the finding of this issue), the
> system is a brand new one, only soon to become publicly available.
> Furthermore, I'm confused by the mention of PC-compatible config space
> probing above: the URs we are talking about here don't result from
> config space accesses at all.

OK. Can you please explain why a UR causes a system error then?
It looks like a hardware bug: PCIe 1.1 seems to say it shouldn't.
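
For reference, the policy suggested by the implementation note quoted
above could be sketched roughly as follows. This is only an illustration:
urre_policy() is a made-up helper, not existing Linux or Xen code, and it
works on register values the caller has already read. The bit positions
are the ones the spec defines (Role-Based Error Reporting is bit 15 of
Device Capabilities, Unsupported Request Reporting Enable is bit 3 of
Device Control).

```c
#include <stdint.h>

/* Bit positions as defined by the PCI Express spec. */
#define DEVCAP_RBER 0x00008000u /* Device Capabilities: Role-Based Error Reporting */
#define DEVCTL_URRE 0x0008u     /* Device Control: Unsupported Request Reporting Enable */

/*
 * Hypothetical helper: given a Function's Device Capabilities value and
 * its current Device Control value, return the Device Control value the
 * implementation note suggests. All other Device Control bits are kept.
 */
static uint16_t urre_policy(uint32_t devcap, uint16_t devctl)
{
    if (devcap & DEVCAP_RBER)
        return (uint16_t)(devctl | DEVCTL_URRE);  /* RBER Function: set URRE */

    return (uint16_t)(devctl & ~DEVCTL_URRE);     /* 1.0a Function: clear URRE */
}
```

A caller would read both registers from the Function's PCI Express
capability, pass them in, and write the result back to Device Control.
None of this helps if a UR is escalated to a system error even with
Role-Based Error Reporting in place, which is the open question here.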

> > You will have other problems if your firmware doesn't follow the spec.
> > So how about either
> > 
> > - Don't use firmware 1st mode with pci express
> >   (Seems no reason to do firmware 1st for PCIE, architecture is completely
> >    standard. I saw mentions of using combined/parallel mode, using AER for
> >    some devices but not others, but I don't know how this is supposed to
> >    be enabled. Any idea?)
> > 
> > or
> > 
> > - ask your vendor to update firmware if it doesn't do the right thing
> 
> Both not very practical suggestions, based on experience.
> 
> Jan

Well, using OS native mode is definitely practical; the question
is how to detect the problematic configurations.

There's always XSA-124, which says buggy hardware can cause
security problems.

-- 
MST

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
