Hello Pavel,

Thank you for your detailed explanation. Please bear with my verbose questions.

Based on your explanation, it looks like your application (running at layer 7) 
uses at least 3 threads: (1) the first thread does the DPDK burst read 
(rte_eth_rx_burst()); after reading the data from layer 2, it puts it into a 
queue; (2) the second thread (layer 2) reads the data from the queue and uses 
F-Stack to handle the TCP processing, then puts the resulting layer-4 data into 
the TCP socket buffer; (3) the third thread uses epoll_wait() to read the 
layer-7 data from the TCP socket buffer and "forwards" it to the outgoing TCP 
socket for rte_eth_tx_burst().

Is my understanding right?


Thanks

________________________________
From: Pavel Vazharov <frea...@gmail.com>
Sent: Wednesday, April 14, 2021 23:57
To: Hao Chen <earthlovepyt...@outlook.com>
Cc: users@dpdk.org <users@dpdk.org>
Subject: Re: [dpdk-users] What is TCP read performance by using DPDK?

Hi,

"Does it mean your code just look at IPHeader and TCPheader without handling 
TCP payload?"
The proxy works in the application layer. I mean, it works with regular BSD 
sockets. As I said, we use a modified version of F-Stack 
(https://github.com/F-Stack/f-stack) for this. Basically, our version is very 
close to the original libuinet (https://github.com/pkelsey/libuinet) but is 
based on a newer version of the FreeBSD networking stack (FreeBSD 11). Here is 
a rough description of how it works (a code sketch of the per-thread loop 
follows the list):
1. Every thread of our application reads packets in bursts from a single RX 
queue using the DPDK API.
2. These packets are then passed/injected into the FreeBSD/F-Stack networking 
stack. We use a separate networking stack per thread.
3. The networking stack processes the packets, queueing them in the receive 
buffers of the TCP sockets. These are regular sockets.
4. Every application thread also regularly calls an epoll_wait API provided by 
the F-Stack library. It's just a wrapper over the kevent API provided by 
FreeBSD.
5. The application gets the read/write events from epoll_wait and reads 
from/writes to the corresponding sockets. Again, this is done exactly like in a 
regular Linux application where you read/write data from/to the sockets.
6. Our test proxy application used sockets in pairs, and all data read from a 
given TCP socket were written to the corresponding TCP socket in the other 
direction.
7. The data written to a given socket are put into its send buffers and 
eventually sent out via the given TX queue using the DPDK API. This happens via 
a callback that is provided to the F-Stack. The callback is called for every 
single packet that needs to be sent out by the F-Stack, and our application 
implements this callback using the DPDK functionality. In our design the 
F-Stack/FreeBSD stack doesn't know about DPDK; it could work with a different 
packet processing framework.
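
To make the above more concrete, here is a minimal, heavily simplified sketch 
of such a per-thread loop. It assumes our modified F-Stack build where the 
application owns the DPDK RX/TX path; stack_input(), on_stack_output() and 
peer_of() are hypothetical placeholder names (the injection point, the TX 
callback and the socket-pairing lookup), while rte_eth_rx_burst(), 
rte_eth_tx_burst(), ff_epoll_wait(), ff_read() and ff_write() are the actual 
DPDK/F-Stack calls:

/*
 * Rough per-thread loop sketch. stack_input(), on_stack_output() and
 * peer_of() are hypothetical placeholders; the rte_eth_*_burst() and
 * ff_* calls are the real DPDK/F-Stack APIs.
 */
#include <sys/types.h>
#include <rte_ethdev.h>
#include <rte_mbuf.h>
#include "ff_api.h"     /* ff_read, ff_write, ff_epoll_wait */
#include "ff_epoll.h"   /* struct epoll_event, EPOLLIN */

#define BURST      32
#define MAX_EVENTS 256

/* Hypothetical: inject a burst of mbufs into this thread's FreeBSD stack. */
void stack_input(struct rte_mbuf **pkts, uint16_t n);
/* Hypothetical: return the paired socket of a proxied connection. */
int peer_of(int fd);

/* Step 7: callback invoked by the stack for every outgoing packet; it is
 * registered with the stack instance at init time (not shown). No TX
 * batching here, for brevity. */
void on_stack_output(struct rte_mbuf *pkt, uint16_t port, uint16_t txq)
{
    if (rte_eth_tx_burst(port, txq, &pkt, 1) == 0)
        rte_pktmbuf_free(pkt);            /* drop if the TX ring is full */
}

void thread_loop(uint16_t port, uint16_t rxq, int epfd)
{
    struct rte_mbuf *rx_pkts[BURST];
    struct epoll_event events[MAX_EVENTS];
    char buf[16 * 1024];

    for (;;) {
        /* Steps 1-2: burst-read from this thread's RX queue and inject
         * the packets into the per-thread networking stack. */
        uint16_t n = rte_eth_rx_burst(port, rxq, rx_pkts, BURST);
        if (n > 0)
            stack_input(rx_pkts, n);

        /* Steps 4-6: poll the stack for socket events and proxy the data
         * between the paired sockets. */
        int nev = ff_epoll_wait(epfd, events, MAX_EVENTS, 0 /* don't block */);
        for (int i = 0; i < nev; ++i) {
            if (events[i].events & EPOLLIN) {
                int fd = events[i].data.fd;
                ssize_t len = ff_read(fd, buf, sizeof(buf));
                if (len > 0)
                    ff_write(peer_of(fd), buf, (size_t)len);
            }
        }
    }
}

A real implementation would batch the TX side instead of sending packets one at 
a time; the sketch only shows the flow of the steps above.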

"Does it mean UDP-payload-size is NOT 1400 bytes (MTU size)? And it is as 
smaller as 64 bytes for example?"
My personal observation is that, for the same amount of traffic, UTP generates 
many more packets per second than the corresponding HTTP traffic running over 
TCP. These are the two tests that we did. I can't provide you with numbers 
about this at the moment, but usually there are lots of packets smaller than 
the MTU size. I think they come from things like the internal ACK packets, 
which seem to be sent more frequently than in TCP. Also, the request, cancel, 
have, etc. messages from the BitTorrent protocol are most of the time sent in 
smaller packets.

"Do you handle UTP payload, or just "relay" it like proxy?"
Our proxies always work with sockets. We have application business logic built 
over the socket layer. For the test case we just proxied the data between pairs 
of UTP sockets in the same way we did it for the TCP proxy above.
We have implementation of the UTP protocol which provides a socket API similar 
to the BSD socket API with read/write/shutdown/close/etc functions. As you 
probably may have read, the UTP protocol is, kind of, a simplified version of 
the TCP protocol but also more suitable for the needs of the BitTorrent 
traffic. So this is a reliable protocol and this means that there is a need for 
socket buffers. Our implementation is built over the UDP sockets provided by 
the F-stack. The data are read from the UDP sockets and put into the buffers of 
the corresponding UTP socket. If contiguous data are collected into the 
buffers, the implementation fires notification to the application layer. The 
write direction works in the opposite way. The data from the application are 
first written to the buffers of the UTP socket and then later send via the 
internal UDP socket from the F-stack.
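
As an illustration only, here is a very rough sketch of that layering. All the 
names (utp_socket, utp_read, utp_write, utp_flush) are hypothetical; only 
ff_sendto() and struct linux_sockaddr are the real F-Stack API, and all of the 
sequencing, acknowledgement and congestion-control logic of UTP is omitted:

#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <netinet/in.h>
#include "ff_api.h"     /* ff_sendto, struct linux_sockaddr */

/* Hypothetical UTP socket: reliability buffers layered over an F-Stack UDP fd. */
struct utp_socket {
    int                udp_fd;               /* underlying F-Stack UDP socket   */
    struct sockaddr_in peer;                 /* remote endpoint                 */
    uint8_t            rx_buf[64 * 1024];    /* contiguous, in-order bytes      */
    size_t             rx_len;
    uint8_t            tx_buf[64 * 1024];    /* bytes waiting to be packetized  */
    size_t             tx_len;
    void (*on_readable)(struct utp_socket *); /* fired when contiguous data arrive */
};

/* Read side: the application pulls the contiguous bytes collected in the buffer. */
ssize_t utp_read(struct utp_socket *s, void *buf, size_t len)
{
    size_t n = len < s->rx_len ? len : s->rx_len;
    memcpy(buf, s->rx_buf, n);
    memmove(s->rx_buf, s->rx_buf + n, s->rx_len - n);
    s->rx_len -= n;
    return (ssize_t)n;
}

/* Write side: bytes are buffered first and sent later via the internal UDP socket. */
ssize_t utp_write(struct utp_socket *s, const void *buf, size_t len)
{
    size_t room = sizeof(s->tx_buf) - s->tx_len;
    size_t n = len < room ? len : room;
    memcpy(s->tx_buf + s->tx_len, buf, n);
    s->tx_len += n;
    return (ssize_t)n;
}

/* Flush sketch: a real UTP stack packetizes, sequences and paces this data. */
void utp_flush(struct utp_socket *s)
{
    if (s->tx_len == 0)
        return;
    ssize_t sent = ff_sendto(s->udp_fd, s->tx_buf, s->tx_len, 0,
                             (struct linux_sockaddr *)&s->peer, sizeof(s->peer));
    if (sent > 0) {
        memmove(s->tx_buf, s->tx_buf + (size_t)sent, s->tx_len - (size_t)sent);
        s->tx_len -= (size_t)sent;
    }
}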

So, to summarize the above: we handle the TCP/UDP payload using the regular BSD 
socket API provided by the F-Stack library and by our UTP stack library. For 
the test we just relayed the data between a few thousand pairs of sockets. 
Currently we do much more complex manipulation of this data, but that is still 
work in progress and the final performance has not been tested yet.

Hope the above explanations help.
Pavel.
