Hi all,

I have developed a variation of the pipeline example. I am using the
ROUTING pipeline along with a new pipeline that I have developed
specifically for ARP handling.


   - The ROUTING pipeline creates one tx and one rx queue on each
   interface, where it captures everything (promiscuous mode) and performs
   LPM table (table 0) classification, setting the next hop in the packet
   headroom. It then applies the ARP table (table 1), where the destination
   and source MAC addresses are written into the outgoing packet header
   before the packet is transmitted on the corresponding tx queue (a rough
   sketch of the port/queue setup follows this list).
   - The ARP pipeline also creates a separate tx and rx queue on the same
   interfaces, with the particularity that a filter is set on its rx
   queues so that only ARP frames are captured there (see the filter
   sketch below). It then processes the ARP requests and provides the
   IP-to-MAC address translations to the ROUTING pipeline via the MASTER
   pipeline (by sending IP-MAC translation events). It is as simple as
   that.
   - There is also a MASTER pipeline (executed on logical core 0), which
   sends messages to and receives events (a new feature we developed) from
   each pipeline.

The thing is that, as the pipeline example allows, I can assign the core
on which each pipeline is executed. When the ROUTING pipeline runs on
core 1 and the ARP pipeline on core 2, everything works perfectly:
maximum throughput is reached in my environment (1 Gbps for a single TCP
flow), with no packet loss at all.
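
Schematically, the working configuration is one lcore per pipeline
(pipeline_loop here is a hypothetical wrapper, not our actual code):

    #include <rte_launch.h>
    #include <rte_pipeline.h>

    /* One lcore polls a single pipeline forever. */
    static int
    pipeline_loop(void *arg)
    {
        struct rte_pipeline *p = arg;

        for ( ; ; )
            rte_pipeline_run(p);
        return 0;
    }

    /* Working case: ROUTING on lcore 1, ARP on lcore 2, MASTER on lcore 0. */
    static void
    launch_separate(struct rte_pipeline *routing, struct rte_pipeline *arp)
    {
        rte_eal_remote_launch(pipeline_loop, routing, 1);
        rte_eal_remote_launch(pipeline_loop, arp, 2);
    }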

However, when I assign both the ROUTING and ARP pipelines to the same
core (logical core 1), the behaviour degrades considerably. Packet loss
and reordering are detected (and confirmed by Wireshark captures). As a
result, the same TCP flow cannot sustain a constant rate because of the
retransmissions triggered by its loss detection.
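
The degraded case is a single thread on lcore 1 interleaving both
pipelines, something like the following sketch (the flush interval is
arbitrary here, in the spirit of the ip_pipeline example):

    #include <stdint.h>
    #include <rte_pipeline.h>

    /* One thread on lcore 1 services both pipelines in turn. */
    static int
    shared_core_loop(void *arg)
    {
        struct rte_pipeline **pipes = arg;  /* { routing, arp } */
        uint64_t iter;

        for (iter = 0; ; iter++) {
            rte_pipeline_run(pipes[0]);  /* ROUTING */
            rte_pipeline_run(pipes[1]);  /* ARP */

            /* Flush the buffered output ports every 64 iterations. */
            if ((iter & 0x3F) == 0) {
                rte_pipeline_flush(pipes[0]);
                rte_pipeline_flush(pipes[1]);
            }
        }
        return 0;
    }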

Both the ROUTING and ARP pipelines use the same mempool when receiving
packets or ARP frames.

One thing to take into account is that the Ethernet interface statistics
reveal no missed packets at all in either case. So the thread executing
both pipelines on the same logical core is not so slow at packet
processing that it causes rx misses that could explain this TCP
degradation.

My question is: if the code is exactly the same, why is the behaviour so
different when I execute both pipelines on the same core? It seems to me
that the issue comes from the fact that I am receiving and transmitting
packets from the same thread on more than one queue of each interface.
All the examples I have seen on the DPDK web page map each thread
directly to one hw queue per interface, and this is where the case I have
described differs.

What do you think? Is it not possible to receive packets from several hw
queues of the same interface within the same thread? Could this be the
cause of the packet loss/reordering, for example due to slow memory
access?
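
To make the pattern concrete, this is what I mean by one thread draining
several hw queues of the same port (names are illustrative; the real code
goes through the pipeline ports instead of calling rx_burst directly):

    #include <rte_ethdev.h>
    #include <rte_mbuf.h>

    #define BURST_SIZE 32

    /* One thread polling both rx queues of the same port:
     * queue 0 feeds ROUTING, queue 1 feeds ARP. */
    static void
    poll_both_queues(uint16_t port)
    {
        struct rte_mbuf *burst[BURST_SIZE];
        uint16_t q;

        for (q = 0; q < 2; q++) {
            uint16_t n = rte_eth_rx_burst(port, q, burst, BURST_SIZE);

            /* ... route (q == 0) or answer ARP (q == 1), then tx ... */
            while (n > 0)
                rte_pktmbuf_free(burst[--n]);  /* placeholder */
        }
    }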

Thanks for your attention,

-- 
Victor
