On Thu, Jan 19, 2017 at 3:43 AM, David Laight <david.lai...@aculab.com> wrote:
> From: Alexander Duyck
>> Sent: 18 January 2017 17:25
>> On Wed, Jan 18, 2017 at 8:22 AM, David Laight <david.lai...@aculab.com> 
>> wrote:
>> > From: David Miller
>> >> Sent: 17 January 2017 19:16
>> >> > Relax ordering(RO) is one feature of 82599 NIC, to enable this feature 
>> >> > can
>> >> > enhance the performance for some cpu architecure, such as SPARC and so 
>> >> > on.
>> >> > Currently it only supports one special cpu architecture(SPARC) in 82599
>> >> > driver to enable RO feature, this is not very common for other cpu 
>> >> > architecture
>> >> > which really needs RO feature.
>> >> > This patch add one common config CONFIG_ARCH_WANT_RELAX_ORDER to set RO 
>> >> > feature,
>> >> > and should define CONFIG_ARCH_WANT_RELAX_ORDER in sparc Kconfig firstly.
>> >> >
>> >> > Signed-off-by: Mao Wenan <maowe...@huawei.com>
>> >>
>> >> Since no-one has reviewed this patch, and I do not feel comfortable with 
>> >> applying
>> >> it without such review, I am tossing this patch.
>> >>
>> >> If someone eventually reviews it, repost this patch.
>> >
>> > Having re-read parts of the PCIe spec I think I'd like someone to
>> > explain exactly which transfers are affected by the 'relaxed ordering'
>> > bit and why any re-ordered transactions aren't a problem.
>> >
>> > In particular I believe RO allows the write to update the receive
>> > descriptor ring to overtake a write of receive packet data.
>> > That could lead to the network stack processing a receive frame
>> > before it has actually been written.
>> >
>> >         David
>> >
>>
>> The Relaxed Ordering attribute doesn't get applied across the board.
>> It ends up being limited to a subset of the transactions if I recall
>> correctly.  In this case it is the Tx descriptor write back, and the
>> Rx data write back.  We don't apply the RO bit to any other
>> transactions.
>>
>> In the case of Tx descriptor there is no harm in allowing it to be
>> reordered because we only really read the DD bit so we don't care
>> about the ordering of the write back.  In the case of the Rx data the
>> Rx descriptor essentially acts as a flush since it is sent without the
>> RO bit set.  So all the writes before it must be completed before the
>> Rx descriptor write back.
>
> In which case why not set it unconditionally for all architectures?
>
> I'm surprised (I often am) that allowing those re-orderings makes
> any significant difference.
> Unfortunately you need a PCIe analyser to see what is really happening
> and they don't come cheap.
>
> What I do vaguely remember is that some hosts don't always implement
> the 'normal' re-ordering of reads and read completions.
> Re-ordering of reads allows descriptor reads to overtake transmit
> traffic which is likely to make a difference.

I think part of the issue, at least in the case of SPARC, is that the
handling of the memory writes in the PCIe root complex is impacted by
the RO attribute.  On the bus itself it doesn't matter much, but at
the root complex it can become expensive to have to wait on a partial
write to complete while there are other writes pending.  This is why
the IOMMU for SPARC now has a WEAK_ORDERING attribute you can add so
that it can write the data in whatever order it wants in relation to
other writes in that region.

- Alex

Reply via email to