> > First of all, 6 pbufs seems not to be very much, and as I understood they
> > can be chained together. I changed this to 36 pbufs with 256 bytes each,
> > noticing that now the TCP connections won't work with the 1200 byte
> > MSS.
> I wonder if your driver can handle that case correctly? It has to know
> to split the received packet across pbufs. A naive driver might ignore
> pbuf chains (on send as well as receive).
>
> Kieran

Correct. I wrote my own driver for the ARM7 LPC23XX/24XX, and I wrote it so that it could handle pbufs of a different size than the EMAC DMA buffers: equal, bigger, or smaller. Unless your driver is specifically written to handle all of this data reorganization between the pbufs and the DMA buffers, it's best to assume they need to be exactly the same size.

I was curious how much difference in speed there would be with various buffer sizes, equal or not, so I tested with a 4 MB stream of TCP data. The data rate is currently around 2.2 MB/sec. I tested with buffer sizes from 128 bytes up to 1536 bytes, but in each case the total RAM was the same, about 12 KB. There wasn't much difference in speed, maybe +/- 10%. I expected a larger penalty when the buffer sizes were not equal, and/or when they were smaller, but it really didn't affect things significantly. Nor was there much difference between a large number of small buffers (24 of 256 bytes) and a small number of large buffers (4 of 1536 bytes). I could see no significant speed penalty from chaining buffers either; the chaining appears very efficient.

With all that in mind, the most flexible combination is a larger number of smaller buffers. 256 bytes seems about optimal and provides more resources for higher-frequency small-packet traffic. Several of the other TCP stacks I have used employed dual small/big buffers as a solution to this problem. I have to give lwIP credit here: the pbuf chaining approach gives all of this small/big buffer-size flexibility with virtually no speed penalty. My experience also suggests that lwIP [RAW] is about the fastest, smallest, and most capable TCP stack for embedded systems with minimal RAM resources.
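To illustrate the kind of data reorganization the driver has to do, here is a minimal sketch of copying one received frame out of a DMA buffer into a chain of pool-sized segments. The `struct pbuf` here is a simplified stand-in that only mirrors lwIP's field names (`next`, `payload`, `len`, `tot_len`); `pbuf_chain_alloc` and `dma_to_pbuf` are illustrative names, not lwIP or LPC23XX driver APIs.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for lwIP's struct pbuf: a chain of fixed-size
 * payload segments. Field names mirror lwIP; the rest is illustrative. */
struct pbuf {
    struct pbuf *next;     /* next segment in the chain, or NULL */
    uint8_t *payload;      /* this segment's data */
    uint16_t len;          /* bytes in this segment */
    uint16_t tot_len;      /* bytes in this and all following segments */
};

#define PBUF_POOL_BUFSIZE 256  /* small pool buffers, as discussed above */

/* Allocate a chain of pool-sized segments big enough for 'length' bytes. */
static struct pbuf *pbuf_chain_alloc(uint16_t length)
{
    struct pbuf *head = NULL, **tail = &head;
    uint16_t remaining = length;
    while (remaining > 0) {
        uint16_t seg = remaining < PBUF_POOL_BUFSIZE ? remaining
                                                     : PBUF_POOL_BUFSIZE;
        struct pbuf *p = malloc(sizeof *p);
        if (p == NULL) return NULL;        /* error handling elided */
        p->payload = malloc(seg);
        p->len = seg;
        p->tot_len = remaining;            /* lwIP convention */
        p->next = NULL;
        *tail = p;
        tail = &p->next;
        remaining -= seg;
    }
    return head;
}

/* Copy one received frame from a (larger or smaller) DMA buffer into the
 * chain, walking it segment by segment. This is the "split the received
 * packet across pbufs" step a naive driver skips. */
static void dma_to_pbuf(const uint8_t *dma_buf, struct pbuf *p)
{
    size_t offset = 0;
    for (; p != NULL; p = p->next) {
        memcpy(p->payload, dma_buf + offset, p->len);
        offset += p->len;
    }
}
```

In a real driver against lwIP you would instead allocate with `pbuf_alloc(PBUF_RAW, len, PBUF_POOL)` and can let `pbuf_take()` do the chain-walking copy for you.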
I am using this with FreeRTOS, and my profiling shows only about 75% CPU utilization during a saturated TX/RX transfer. I think even more speed is possible with further optimization and tuning, which is what I am working on now. The memcpy and checksum routines are very important, as are many other somewhat obscure areas. I am curious to see if I can increase the speed further.

Chris.

_______________________________________________
lwip-users mailing list
lwip-users@nongnu.org
http://lists.nongnu.org/mailman/listinfo/lwip-users
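On the checksum point above: a plain RFC 1071 Internet checksum in C is a useful baseline when tuning, since lwIP lets a port substitute an optimized routine (e.g. word-at-a-time or assembly) via the LWIP_CHKSUM macro in the port headers. The sketch below is a straightforward reference version, not lwIP's own implementation; `inet_chksum_ref` is a made-up name.

```c
#include <stdint.h>
#include <stddef.h>

/* Straightforward RFC 1071 Internet checksum over 'len' bytes, treating
 * the data as big-endian 16-bit words. A port would typically replace
 * this with an optimized routine and point LWIP_CHKSUM at it. */
static uint16_t inet_chksum_ref(const void *data, size_t len)
{
    const uint8_t *p = data;
    uint32_t sum = 0;

    while (len > 1) {
        sum += ((uint32_t)p[0] << 8) | p[1];   /* one 16-bit word */
        p += 2;
        len -= 2;
    }
    if (len)                                    /* odd trailing byte */
        sum += (uint32_t)p[0] << 8;

    while (sum >> 16)                           /* fold carries back in */
        sum = (sum & 0xffffu) + (sum >> 16);

    return (uint16_t)~sum;
}
```

Because the ones'-complement sum is associative, an optimized version can sum 32 bits at a time and fold once at the end, which is where most of the speedup comes from on ARM7.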