On Tue, May 2, 2017 at 12:34 PM, Raj, Ashok <ashok....@intel.com> wrote:
> On Tue, May 02, 2017 at 11:10:22AM -0700, Alexander Duyck wrote:
>> On Tue, May 2, 2017 at 9:53 AM, Raj, Ashok <ashok....@intel.com> wrote:
>> > On Tue, May 02, 2017 at 09:39:34AM -0700, Alexander Duyck wrote:
>> >> On Mon, May 1, 2017 at 4:13 PM, Casey Leedom <lee...@chelsio.com> wrote:
>> >> > The new flag PCI_DEV_FLAGS_NO_RELAXED_ORDERING indicates that the 
>> >> > Relaxed
>> >> > Ordering Attribute should not be used on Transaction Layer Packets 
>> >> > destined
>> >> > for the PCIe End Node so flagged.  Initially flagged this way are Intel
>> >> > E5-26xx Root Complex Ports which suffer from a Flow Control Credit
>> >> > Performance Problem and AMD A1100 ARM ("SEATTLE") Root Complex Ports 
>> >> > which
>> >> > don't obey PCIe 3.0 ordering rules which can lead to Data Corruption.
>> >>
>> >> So this is a good first step though might I suggest one other change.
>> >>
>> >> We may want to add logic to the core PCIe code that clears the "Enable
>> >> Relaxed Ordering" bit in the device control register for all devices
>> >> hanging off of this root complex. Assuming the devices conform to the
>> >> PCIe spec doing that should disable relaxed ordering in a device
>> >> agnostic way that then enables us at a driver level to just enable the
>> >> feature always without having to perform any checks for your flag. We
>> >> could probably do that as a part of probing the PCIe interfaces
>> >> hanging off of these devices.
>> >
>> > I suppose you don't want to turn off RO completely on the device. When
>> > traffic is targetted to mmio for peer to peer reasons RO has performance
>> > upside. The specific issue with these root ports indicate limitation using
>> > RO for traffic targetting coherent memory.
>>
>> Actually my main concern here is virtualization. If I take the PCIe
>> function and direct assign it I have no way of seeing the root complex
>> flag as it is now virtualized away. In the meantime the guest now has
>> the ability to enable the function and sees nothing that says you
>> can't enable relaxed ordering which in turn ends up potentially
>> causing data corruption on the system. I want relaxed ordering
>> disabled before I even consider assigning it to the guest on the
>> systems where this would be an issue.
>>
>> I prefer to err on the side of caution with this. Enabling Relaxed
>> Ordering is technically a performance enhancement, so we function but
>> not as well as we would like, while having it enabled when there are
>> issues can lead to data corruption. I would weigh the risk of data
>> corruption the thing to be avoided and of much higher priority than
>> enabling improved performance. As such I say we should default the
>> relaxed ordering attribute to off in general and look at
>> "white-listing" it in for various architectures and/or chipsets that
>> support/need it rather than having it enabled by default and trying to
>> switch it off after the fact when we find some new issue.
>
> I agree, after thinking about it a bit more.. even for transactions going to
> p2p, i'm just reading the pcie spec and some sections aren't super clear
> about completion redirect and ACS rules for p2p.
>
> Also it appears the device control default value is 1 for enabling
> Relaxed Ordering. This means we should probably save these states across
> resets/FLR for e.g. To ensure perf isn't affected after a FLR.

Right. That should happen automatically with the PCIe configuration
being saved/restored.

- Alex

Reply via email to