On 15/12/2019 04:19, Heinrich Schuchardt wrote:
> The failure starts when the receive buffer is filled up. No message is sent by iPXE for a few seconds. Afterwards a new login to the iSCSI server occurs.
This is curious. The iSCSI receive datapath is essentially zero-copy: U-Boot provides the destination buffer when issuing the EFI_BLOCK_IO_PROTOCOL.ReadBlocks() call, and iPXE will copy data to this buffer as each new packet arrives. There is no internal iSCSI- or SCSI-level buffer that could be filling up.
The only plausible place I can find where buffering could occur is the TCP receive queue. It's very likely that the difference in speeds would cause iPXE to suffer from a high received packet loss rate, in which case it will end up maintaining a queue of what are effectively packets received out of order. This would be visible in a detailed packet trace from the presence of TCP SACKs sent by iPXE.
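As a rough illustration of how one might check a capture for this, here is a minimal Python sketch that counts TCP segments carrying a SACK option (option kind 5, per RFC 2018). It assumes a classic microsecond-resolution pcap file (not pcapng) containing Ethernet frames with IPv4, and the function names count_sack_packets and has_sack are made up for this example. Wireshark or tcpdump will of course show the same information.

```python
import struct

def has_sack(frame):
    """Return True if an Ethernet/IPv4/TCP frame carries a SACK option (kind 5)."""
    # Assumption: untagged Ethernet II framing with an IPv4 payload.
    if len(frame) < 14 + 20 or frame[12:14] != b"\x08\x00":
        return False
    ip = frame[14:]
    ihl = (ip[0] & 0x0F) * 4              # IPv4 header length in bytes
    if ip[0] >> 4 != 4 or ihl < 20 or ip[9] != 6 or len(ip) < ihl + 20:
        return False                      # not IPv4/TCP, or truncated
    tcp = ip[ihl:]
    data_off = (tcp[12] >> 4) * 4         # TCP header length in bytes
    options = tcp[20:data_off]
    i = 0
    while i < len(options):
        kind = options[i]
        if kind == 0:                     # end of option list
            break
        if kind == 1:                     # NOP, single byte
            i += 1
            continue
        if i + 1 >= len(options):
            break
        length = options[i + 1]
        if length < 2:                    # malformed option, stop parsing
            break
        if kind == 5:                     # SACK blocks
            return True
        i += length
    return False

def count_sack_packets(pcap_bytes):
    """Count frames with a TCP SACK option in a classic pcap capture."""
    magic = pcap_bytes[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"
    else:
        raise ValueError("not a classic microsecond pcap file")
    (linktype,) = struct.unpack_from(endian + "I", pcap_bytes, 20)
    if linktype != 1:
        raise ValueError("only Ethernet captures are handled")
    count = 0
    offset = 24                           # skip the global header
    while offset + 16 <= len(pcap_bytes):
        _, _, incl_len, _ = struct.unpack_from(endian + "IIII", pcap_bytes, offset)
        frame = pcap_bytes[offset + 16:offset + 16 + incl_len]
        offset += 16 + incl_len
        if has_sack(frame):
            count += 1
    return count
```

This counts SACKs from both endpoints; filtering on iPXE's source address would narrow it down to the SACKs that matter here.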
However, the TCP SACK mechanism should already be robust enough to recover from this situation. iPXE will discard packets from the receive queue if it runs out of memory for any reason, so the network driver will always be able to allocate space for newly received packets, allowing forward progress to be made once the server retransmits the missing packets (which happens almost immediately when using SACK).
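To make the recovery argument concrete, here is a toy Python model of an out-of-order receive queue that discards queued segments under memory pressure. It is only a sketch and not iPXE's actual code: the RxQueue class, the byte budget, and the policy of dropping the highest-sequence segments first are all assumptions for illustration, and it ignores sequence number wraparound and partially overlapping segments.

```python
class RxQueue:
    """Toy model of a TCP receive queue with a bounded out-of-order cache."""

    def __init__(self, next_seq, budget):
        self.next_seq = next_seq      # next in-order sequence number expected
        self.budget = budget          # hypothetical memory budget in bytes
        self.pending = {}             # seq -> payload, out-of-order segments
        self.delivered = bytearray()  # stands in for the caller's buffer

    def _deliver(self, payload):
        self.delivered += payload
        self.next_seq += len(payload)

    def receive(self, seq, payload):
        if seq < self.next_seq:
            return                    # duplicate of already-delivered data
        if seq == self.next_seq:
            # In-order data goes straight to the destination buffer.
            self._deliver(payload)
            # Drain any queued segments that are now contiguous.
            while self.next_seq in self.pending:
                self._deliver(self.pending.pop(self.next_seq))
            return
        # Out-of-order data is cached until the gap is filled.
        self.pending[seq] = payload
        # Under memory pressure, discard the segments farthest ahead.
        # The sender still holds them and will retransmit.
        while sum(len(p) for p in self.pending.values()) > self.budget:
            del self.pending[max(self.pending)]
```

For example, with a budget of 8 bytes, segments arriving at sequence numbers 4, 8 and 12 before segment 0 cause the segment at 12 to be discarded. Once segment 0 arrives, segments 0, 4 and 8 are delivered in order, and a retransmission of segment 12 then completes the transfer.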
Could you publish the full .pcap file for your reproducible failure case?

Thanks,
Michael