From: Alexander Duyck
> Sent: 18 January 2017 17:25
> On Wed, Jan 18, 2017 at 8:22 AM, David Laight <david.lai...@aculab.com> wrote:
> > From: David Miller
> >> Sent: 17 January 2017 19:16
> >> > Relaxed ordering (RO) is a feature of the 82599 NIC; enabling it can
> >> > improve performance on some CPU architectures, such as SPARC.
> >> > Currently the 82599 driver enables RO only for one specific CPU
> >> > architecture (SPARC), which does not help other CPU architectures
> >> > that really need the RO feature.
> >> > This patch adds a common config option, CONFIG_ARCH_WANT_RELAX_ORDER,
> >> > to enable RO; CONFIG_ARCH_WANT_RELAX_ORDER should first be defined in
> >> > the sparc Kconfig.
> >> >
> >> > Signed-off-by: Mao Wenan <maowe...@huawei.com>
> >>
> >> Since no-one has reviewed this patch, and I do not feel comfortable with
> >> applying it without such review, I am tossing this patch.
> >>
> >> If someone eventually reviews it, repost this patch.
> >
> > Having re-read parts of the PCIe spec I think I'd like someone to
> > explain exactly which transfers are affected by the 'relaxed ordering'
> > bit and why any re-ordered transactions aren't a problem.
> >
> > In particular I believe RO allows the write to update the receive
> > descriptor ring to overtake a write of receive packet data.
> > That could lead to the network stack processing a receive frame
> > before it has actually been written.
> >
> >         David
> >
> 
> The Relaxed Ordering attribute doesn't get applied across the board.
> It ends up being limited to a subset of the transactions if I recall
> correctly.  In this case it is the Tx descriptor write back, and the
> Rx data write back.  We don't apply the RO bit to any other
> transactions.
> 
> In the case of Tx descriptor there is no harm in allowing it to be
> reordered because we only really read the DD bit so we don't care
> about the ordering of the write back.  In the case of the Rx data the
> Rx descriptor essentially acts as a flush since it is sent without the
> RO bit set.  So all the writes before it must be completed before the
> Rx descriptor write back.
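
The flush argument quoted above can be sketched as a toy model in C. This is
only an illustration under a simplified reading of the PCIe posted-write
ordering rule (a later posted write may pass an earlier one only if the later
write carries the RO attribute). The names posted_write, may_pass and
ordered_after_all_earlier are hypothetical and are not taken from the ixgbe
driver or from the spec.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a posted memory write issued by the NIC. */
enum wr_kind { RX_DATA, RX_DESC };

struct posted_write {
    enum wr_kind kind;
    bool ro;    /* Relaxed Ordering attribute carried by this write */
};

/*
 * Simplified posted-write rule (an assumption of this sketch): a later
 * write may pass an earlier one only if the LATER write has RO set.
 */
static bool may_pass(const struct posted_write *later,
                     const struct posted_write *earlier)
{
    (void)earlier;  /* the earlier write's RO bit does not matter here */
    return later->ro;
}

/*
 * Return true if seq[idx] cannot pass any write issued before it,
 * i.e. it is guaranteed to land after seq[0..idx-1].
 */
static bool ordered_after_all_earlier(const struct posted_write *seq,
                                      size_t n, size_t idx)
{
    assert(idx < n);
    for (size_t i = 0; i < idx; i++) {
        if (may_pass(&seq[idx], &seq[i]))
            return false;
    }
    return true;
}
```

With the sequence { data (RO), data (RO), descriptor (no RO) } the descriptor
write is ordered after both data writes, while the two data writes may still
pass each other. If the descriptor write also carried RO, the model reports
that it could overtake the data, which is the failure described earlier in
the thread.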

In which case why not set it unconditionally for all architectures?

I'm surprised (I often am) that allowing those re-orderings makes
any significant difference.
Unfortunately you need a PCIe analyser to see what is really happening
and they don't come cheap.

What I do vaguely remember is that some hosts don't always implement
the 'normal' re-ordering of reads and read completions.
Re-ordering of reads allows descriptor reads to overtake transmit
traffic, which is likely to make a difference.

        David
