On Fri, Aug 26, 2016 at 3:18 AM, Auer, Jens <jens.a...@cgi.com> wrote:
> Hi,
>
> we are experiencing packet reordering and thus bandwidth drops due to 
> retransmissions on our servers. We are using RedHat 7.2 on HP Servers 
> connected to a HP Aruba 1920 10GB switch. I can easily reproduce the issues 
> by running iperf3 between the two servers:

<snipping out test results>

> In this test, I get some retransmissions every couple of seconds. We analyzed 
> the retransmissions by capturing a sample with Wireshark and we can confirm 
> that the retransmissions are fast retransmissions caused by reordering of the 
> traffic at the receiving server only.
>
> I have found a description of problems caused by Flow Director and migrating 
> processes from Fermilab (https://arxiv.org/pdf/1106.0443.pdf) together with a 
> support case at RedHat (https://access.redhat.com/solutions/2403071). What is
> the official position concerning this issue? Is it considered a bug?
>
> The RedHat case recommends either switching to Perfect Filtering or fixing the
> CPU assignment of processes. I have tried both solutions, but the first one still 
> produces retransmissions even when pinning both iperf processes to a fixed 
> CPU:

<snipping out some more test results>

>
> Switching to Perfect Forwarding mode fixed the problems and iperf runs 
> without retransmissions. Are there other factors that can cause 
> retransmissions, e.g. irq balancing at the receiver?
>
> Best wishes,
>   Jens

Hi Jens,

The main cause of the reordering is likely the ATR (Application
Targeted Routing) feature on the adapter.  ATR samples transmitted
packets and tries to steer the Rx traffic for a given flow to the Rx
queue paired with the Tx queue that the flow was last transmitted on.
When combined with XPS, which aligns the Tx queue selection with the
CPU that the transmitting application is scheduled on, this can lead
to reordering: whenever the Tx queue for a flow changes, ATR re-steers
the flow to a different Rx queue while packets may still be pending on
the old one.  The Tx queue changes if the scheduler moves the
application between CPUs.  It can also bounce if IRQ balancing moves
the Rx queue to a new CPU, because acknowledgements are then generated
by the stack on a different CPU from the application's transmissions.
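
To make the mechanism a bit more concrete, here is a toy model in
Python.  It is only a sketch under stated assumptions, not a model of
the actual driver or hardware: the function names, the per-queue
delays and the packet interval are all made up for illustration.  One
flow starts out on an "old" Rx queue whose CPU is slow to service it,
and from packet switch_at onward it is steered to a "new" queue that
is serviced immediately.

```python
def delivered_order(n_packets, switch_at, delay_old=5, delay_new=0, interval=2):
    """Return the sequence numbers in the order the stack would see them.

    Hypothetical model: packet `seq` arrives at time seq * interval.
    Packets before `switch_at` land on the old Rx queue (serviced after
    `delay_old` ticks); later packets land on the new Rx queue (serviced
    after `delay_new` ticks).
    """
    events = []
    for seq in range(n_packets):
        arrival = seq * interval
        delay = delay_old if seq < switch_at else delay_new
        events.append((arrival + delay, seq))
    events.sort()  # order by the time each packet reaches the stack
    return [seq for _, seq in events]


def count_out_of_order(order):
    """Count packets that show up after a higher sequence number."""
    highest = -1
    late = 0
    for seq in order:
        if seq < highest:
            late += 1
        else:
            highest = seq
    return late


if __name__ == "__main__":
    moved = delivered_order(8, switch_at=4)
    pinned = delivered_order(8, switch_at=8)  # flow never changes queue
    print("flow re-steered:", moved, "late packets:", count_out_of_order(moved))
    print("flow stays put: ", pinned, "late packets:", count_out_of_order(pinned))
```

In this model the re-steered flow is seen out of order by the stack,
which is the kind of pattern that triggers duplicate ACKs and fast
retransmits, while the flow that stays on one queue is delivered in
order regardless of how slow that queue is.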

One additional cause for possible issues is running flows with a high
connection rate.  The filter table used for ATR has a limited amount
of space available, and if the table overflows its contents are
flushed.  This can also lead to reordering related to ATR.
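
The table overflow case can be sketched the same way.  Again this is
a toy model and not the driver's data structure: the class name, the
capacity and the flow/queue labels are hypothetical.  It only shows
that a long-lived flow loses its steering entry once enough
short-lived connections fill the table and force a flush.

```python
class ToyFlowTable:
    """Fixed-size flow -> Rx queue table that is flushed when it overflows."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = {}
        self.flushes = 0

    def learn(self, flow, queue):
        # Assumption for this sketch: a new flow arriving at a full
        # table wipes every entry, then gets inserted itself.
        if flow not in self.entries and len(self.entries) >= self.capacity:
            self.entries.clear()
            self.flushes += 1
        self.entries[flow] = queue

    def lookup(self, flow):
        # None means "no filter"; the packet would then be steered by
        # whatever default queue selection is in place.
        return self.entries.get(flow)


if __name__ == "__main__":
    table = ToyFlowTable(capacity=4)
    table.learn("long-lived", queue=3)
    for i in range(4):  # a burst of short connections
        table.learn(f"short-{i}", queue=i)
    print("flushes:", table.flushes)
    print("long-lived flow steered to:", table.lookup("long-lived"))
```

After the flush the long-lived flow has no entry until it is learned
again, so its packets can land on a different queue in the meantime
and then move back, which gives the same reordering pattern as above.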

Hope this helps to clarify some of what is going on.

- Alex

------------------------------------------------------------------------------
_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel® Ethernet, visit 
http://communities.intel.com/community/wired
