This is an automated email from the ASF dual-hosted git repository.

jerpelea pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/nuttx.git

commit cd713f00fc29a16fd4d8c3b233661c818c7974c5
Author: Patrick José Pereira <[email protected]>
AuthorDate: Thu May 28 11:24:39 2026 -0300

    Documentaiton: net: tcp_network_perf: Remove trailing spaces

    Signed-off-by: Patrick José Pereira <[email protected]>
---
 Documentation/components/net/tcp_network_perf.rst | 252 +++++++++++----------
 1 file changed, 126 insertions(+), 126 deletions(-)

diff --git a/Documentation/components/net/tcp_network_perf.rst b/Documentation/components/net/tcp_network_perf.rst
index 33e89d8efd2..903995334d0 100644
--- a/Documentation/components/net/tcp_network_perf.rst
+++ b/Documentation/components/net/tcp_network_perf.rst
@@ -2,8 +2,8 @@
 TCP Network Performance
 =======================
 
-.. warning:: 
-    Migrated from: 
+.. warning::
+    Migrated from:
     https://cwiki.apache.org/confluence/display/NUTTX/TCP+Network+Performance
 
 
@@ -22,172 +22,172 @@ First let's talk about TCP send performance.
 Source of Performance Bottlenecks
 ---------------------------------
 
-General TCP send performance is not determined by the TCP stack as much 
-as it is by the network device driver. Bad network performance is due 
-to time lost `BETWEEN` packet transfers. The packet transfers themselves 
-go at the wire speed*. So if you want to improve performance on a 
-given network, you have to reduce time lost between transfers. 
+General TCP send performance is not determined by the TCP stack as much
+as it is by the network device driver. Bad network performance is due
+to time lost `BETWEEN` packet transfers. The packet transfers themselves
+go at the wire speed*. So if you want to improve performance on a
+given network, you have to reduce time lost between transfers.
 There is no other way.
 
-Ignoring Ethernet issues like collisions, back-off delays, 
+Ignoring Ethernet issues like collisions, back-off delays,
 inter-packet gaps (IPG), etc.
 
-The time between packets is limited primarily by the buffering 
-design of the network driver. If you want to improve performance, 
-then you must improve the buffering at the network driver. 
-You need to support many full size (1500 byte) packet buffers. 
-You must be able to query the network for new data to transfer, 
-and queue those transfers in packet buffers. In order to reach 
-peak performance, the network driver must have the next transfer 
-buffered and ready-to-go before the previous transfer is finished 
+The time between packets is limited primarily by the buffering
+design of the network driver. If you want to improve performance,
+then you must improve the buffering at the network driver.
+You need to support many full size (1500 byte) packet buffers.
+You must be able to query the network for new data to transfer,
+and queue those transfers in packet buffers. In order to reach
+peak performance, the network driver must have the next transfer
+buffered and ready-to-go before the previous transfer is finished
 to minimize the GAP between packet transfers.
 
-Different network devices also support more or less efficient 
-interfaces: The worst performing support interfaces that can 
-handle only one packet at a time, the best performing are able 
-to retain linked lists of packet buffers in memory and perform 
+Different network devices also support more or less efficient
+interfaces: The worst performing support interfaces that can
+handle only one packet at a time, the best performing are able
+to retain linked lists of packet buffers in memory and perform
 scatter-gather DMA for a sequence of packets.
 
-In the NuttX TCP stack, you can also improve performance by 
+In the NuttX TCP stack, you can also improve performance by
 enabling TCP write buffering. But the driver is the real key.
 
-It would be good to have a real in-depth analysis of the 
-network stack performance to identify bottlenecks and 
-generate ideas for performance improvement. No one has 
-ever done that. If I were aware of any stack related 
+It would be good to have a real in-depth analysis of the
+network stack performance to identify bottlenecks and
+generate ideas for performance improvement. No one has
+ever done that. If I were aware of any stack related
 performance issue, I would certainly address it.
 
 RFC 1122
 --------
 
-There is one important feature missing the NuttX TCP that 
-can help when there is no write buffering: Without write 
-buffering send() will not return until the transfer has 
-been ACKed by the recipient. But under RFC 1122, the host 
-need not ACK each packet immediately; the host may wait 
-for 500 MS before ACKing. This combination can cause very 
-slow performance when small, non-buffered transfers are 
-made to an RFC 1122 client. However, the RFC 1122 must 
-ACK at least every second (odd) packet so sequences of 
-packets with write buffering enabled do not suffer from 
+There is one important feature missing the NuttX TCP that
+can help when there is no write buffering: Without write
+buffering send() will not return until the transfer has
+been ACKed by the recipient. But under RFC 1122, the host
+need not ACK each packet immediately; the host may wait
+for 500 MS before ACKing. This combination can cause very
+slow performance when small, non-buffered transfers are
+made to an RFC 1122 client. However, the RFC 1122 must
+ACK at least every second (odd) packet so sequences of
+packets with write buffering enabled do not suffer from
 this problem.
 
-`Update: RFC 1122 support was added to the NuttX TCP 
-stack with commit 66ef6d143a627738ad7f3ce1c065f9b1f3f303b0 
-in December of 2019. That, however, that affects only 
-received packet ACK behavior and has no impact on transmitted 
+`Update: RFC 1122 support was added to the NuttX TCP
+stack with commit 66ef6d143a627738ad7f3ce1c065f9b1f3f303b0
+in December of 2019. That, however, that affects only
+received packet ACK behavior and has no impact on transmitted
 packet performance; write buffering is still recommended.`
 
 TCPBlaster
 ----------
 
-I created a new test application at ``apps/examples/tcpblaster`` to 
-measure TCP performance and collected some data for the 
-configuration that happens to be on my desk. The `tcpblaster` 
-test gives you the read and write transfer rates in ``Kb/sec`` 
-(I won't mention the numbers because I don't believe they 
-would translate any other setup and, hence, would be 
+I created a new test application at ``apps/examples/tcpblaster`` to
+measure TCP performance and collected some data for the
+configuration that happens to be on my desk. The `tcpblaster`
+test gives you the read and write transfer rates in ``Kb/sec``
+(I won't mention the numbers because I don't believe they
+would translate any other setup and, hence, would be
 misleading).
 
-There is a nifty `TCP Throughput Tool <https://www.switch.ch/network/tools/tcp_throughput/>`_ 
-that gives some theoretical upper limits on performance. 
-The tool needs to know the ``MSS`` (which is the Ethernet 
-packet size that you configured minus the size of the 
-Ethernet header, 14), the round-trip time (``RTT``)in 
-milliseconds (which you can 
-get from the Linux host ping), and a loss constant (which 
-I left at the default). With these values, I can determine 
-that the throughput for the NuttX TCP stack is approximately 
-at the theoretical limits. You should not be able to do 
-better any better than that (actually, it performs above 
-the theoretical limit, but I suppose that is why it is 
+There is a nifty `TCP Throughput Tool <https://www.switch.ch/network/tools/tcp_throughput/>`_
+that gives some theoretical upper limits on performance.
+The tool needs to know the ``MSS`` (which is the Ethernet
+packet size that you configured minus the size of the
+Ethernet header, 14), the round-trip time (``RTT``)in
+milliseconds (which you can
+get from the Linux host ping), and a loss constant (which
+I left at the default). With these values, I can determine
+that the throughput for the NuttX TCP stack is approximately
+at the theoretical limits. You should not be able to do
+better any better than that (actually, it performs above
+the theoretical limit, but I suppose that is why it is
 "theoretical").
 
-So, If you are unhappy with your network performance, the I 
-suggest you run the `tcpblaster` test, use that data 
-(along with the ``RTT`` from ping) with the 
-`TCP Throughput Tool <https://www.switch.ch/network/tools/tcp_throughput/>`_. 
-If you are still unhappy with the performance, don't go 
-immediately pointing fingers at the stack (which everyone does). 
-Instead, you should focus on optimizing your network 
-configuration settings and reviewing the buffer handling 
+So, If you are unhappy with your network performance, the I
+suggest you run the `tcpblaster` test, use that data
+(along with the ``RTT`` from ping) with the
+`TCP Throughput Tool <https://www.switch.ch/network/tools/tcp_throughput/>`_.
+If you are still unhappy with the performance, don't go
+immediately pointing fingers at the stack (which everyone does).
+Instead, you should focus on optimizing your network
+configuration settings and reviewing the buffer handling
 of the Ethernet driver in you MCU.
 
-If you do discover any significant performance issues 
-with the stack I will of course gladly help you resolve 
-them. Or if you have ideas for improved performance, 
+If you do discover any significant performance issues
+with the stack I will of course gladly help you resolve
+them. Or if you have ideas for improved performance,
 I would also be happy to hear those.
 
 What about Receive Performance?
 -------------------------------
 
-All of the above discussion concerns `transmit performance`, 
-i.e., "How fast can we send data over the network?" The other 
-side is receive performance. Receive performance is very 
-different thing. In this case it is the remote peer who is 
-in complete control of the rate at which packets appear on 
-the network and, hence, responsible for all of the raw bit 
+All of the above discussion concerns `transmit performance`,
+i.e., "How fast can we send data over the network?" The other
+side is receive performance. Receive performance is very
+different thing. In this case it is the remote peer who is
+in complete control of the rate at which packets appear on
+the network and, hence, responsible for all of the raw bit
 transfer rates.
 
-However, we might also redefine performance as the number of 
-bytes that were `successfully` transferred. In order for the 
-bytes to be successfully transferred they must be successfully 
-received and processed on the NuttX target. If we fail in 
-this if the packet is `lost` or `dropped`. A packet is lost if 
-the network driver is not prepared to receive the packet when 
-it was sent. A packet is dropped by the network if it is 
-received but could not be processed either because there 
-is some logical issue with the packet (not the case here) 
+However, we might also redefine performance as the number of
+bytes that were `successfully` transferred. In order for the
+bytes to be successfully transferred they must be successfully
+received and processed on the NuttX target. If we fail in
+this if the packet is `lost` or `dropped`. A packet is lost if
+the network driver is not prepared to receive the packet when
+it was sent. A packet is dropped by the network if it is
+received but could not be processed either because there
+is some logical issue with the packet (not the case here)
 or if we have no space to buffer the newly received packet.
 
-If a TCP packet is lost or dropped, then the penalty is big: 
-The packet will not be ACKed, the remote peer may send a 
-few more out-of-sequence packets which will also be dropped. 
-Eventually, the remote peer will time out and retransmit 
+If a TCP packet is lost or dropped, then the penalty is big:
+The packet will not be ACKed, the remote peer may send a
+few more out-of-sequence packets which will also be dropped.
+Eventually, the remote peer will time out and retransmit
 the data from the point of the lost packet.
 
-There is logic in the TCP protocol to help manage these data 
-overruns. The TCP header includes a TCP `receive window` which 
-tells the remote peer how much data the receiver is able to 
-buffer. This value is sent in the ACK to each received 
-packet. If well tuned, this receive window could possibly 
-prevent packets from being lost due to the lack of 
-read-ahead storage. This is a little better. The remote 
-peer will hold off sending data instead of timing out and 
-re-transmitting. But this is still a loss of performance; 
-the gap between the transfer of packets caused by the hold-off 
+There is logic in the TCP protocol to help manage these data
+overruns. The TCP header includes a TCP `receive window` which
+tells the remote peer how much data the receiver is able to
+buffer. This value is sent in the ACK to each received
+packet. If well tuned, this receive window could possibly
+prevent packets from being lost due to the lack of
+read-ahead storage. This is a little better. The remote
+peer will hold off sending data instead of timing out and
+re-transmitting. But this is still a loss of performance;
+the gap between the transfer of packets caused by the hold-off
 will result in a reduced transfer rate.
 
-So the issues for good reception are buffering and processing 
-time. Buffering again applies to handling within the driver 
-but unlike the transmit performance, this is not typically 
-the bottleneck. And there is also a NuttX configuration 
-option that controls `read-ahead` buffering of TCP packets. 
-The buffering in the driver must be optimized to avoid lost 
-packets; the ` buffering can be tuned to minimize 
+So the issues for good reception are buffering and processing
+time. Buffering again applies to handling within the driver
+but unlike the transmit performance, this is not typically
+the bottleneck. And there is also a NuttX configuration
+option that controls `read-ahead` buffering of TCP packets.
+The buffering in the driver must be optimized to avoid lost
+packets; the ` buffering can be tuned to minimize
 the number packets dropped because we have no space to buffer them.
 
-But the key to receive perform is management of processing 
-delays. Small processing delays can occur in the network 
-driver or in the TCP stack. But the major source of 
-processing delay is the application which is the ultimate 
-consumer of the incoming data. Imagine, for example, 
-and FTP application that is receiving a file over a 
-TCP and writing the file into FLASH memory. The primary 
-bottleneck here will be the write to FLASH memory which 
+But the key to receive perform is management of processing
+delays. Small processing delays can occur in the network
+driver or in the TCP stack. But the major source of
+processing delay is the application which is the ultimate
+consumer of the incoming data. Imagine, for example,
+and FTP application that is receiving a file over a
+TCP and writing the file into FLASH memory. The primary
+bottleneck here will be the write to FLASH memory which
 is out of the control of software.
 
-We obtain optimal receive performance when the processing 
-delays keep up with the rate of the incoming packets. 
-If the processing data rate is even slightly slower 
-then the receive data rate, then there will be a 
-growing `backlog` of buffered, incoming data to be 
-processed. If this backlog continues to grow then 
-eventually our ability to buffer data will be exhausted, 
-packets will be held off or dropped, and performance 
-will deteriorate. In an environment where a high-end, 
-remote peer is interacting with the low-end, embedded 
-system, that remote peer can easily overrun the 
-embedded system due to the embedded system's limited 
-buffering space, its much lower processing capability, 
-and its slower storage peripherals. 
\ No newline at end of file
+We obtain optimal receive performance when the processing
+delays keep up with the rate of the incoming packets.
+If the processing data rate is even slightly slower
+then the receive data rate, then there will be a
+growing `backlog` of buffered, incoming data to be
+processed. If this backlog continues to grow then
+eventually our ability to buffer data will be exhausted,
+packets will be held off or dropped, and performance
+will deteriorate. In an environment where a high-end,
+remote peer is interacting with the low-end, embedded
+system, that remote peer can easily overrun the
+embedded system due to the embedded system's limited
+buffering space, its much lower processing capability,
+and its slower storage peripherals.
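
As a rough illustration of the backlog argument in the final paragraph of the patched document, the sketch below estimates how long a fixed amount of receive buffering lasts when the application drains data more slowly than the peer sends it. It is not part of this commit or of NuttX. The function name, its parameters, and the example numbers are all hypothetical, and it assumes constant rates while ignoring the TCP receive-window hold-off described earlier.

```python
def seconds_until_overrun(buffer_bytes, rx_bytes_per_sec, proc_bytes_per_sec):
    """Estimate how long buffering lasts under constant data rates.

    Returns None when the consumer keeps up, i.e. no backlog builds.
    All names are hypothetical and for illustration only.
    """
    # Rate at which the backlog grows, in bytes per second.
    surplus = rx_bytes_per_sec - proc_bytes_per_sec
    if surplus <= 0:
        return None  # processing keeps up with reception
    return buffer_bytes / surplus


if __name__ == "__main__":
    # Assumed numbers: 16 KiB of buffering, peer sending 1,000,000 B/s,
    # application draining 990,000 B/s (only 1% slower).
    print(seconds_until_overrun(16 * 1024, 1_000_000, 990_000))
```

With these assumed numbers the buffering is exhausted after roughly 1.6 seconds, which is the sense in which an "even slightly slower" consumer eventually forces packets to be held off or dropped.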
