There are two threads involved:
1. The user thread that calls the UDP sendto() interface:
* Lock the network.
* Call netdev_txnotify_dev() to inform the driver that TX data is
available. The driver should schedule the TX poll on the LP work queue.
* If CONFIG_NET_UDP_WRITE_BUFFERS is enabled, the UDP sendto()
will copy the UDP packet into a write buffer, unlock the network,
and return to the caller immediately.
* If CONFIG_NET_UDP_WRITE_BUFFERS is NOT enabled, the UDP sendto()
will unlock the network and wait for the driver TX poll.
2. The LP work queue thread. Work was scheduled here
when netdev_txnotify_dev() was called:
* Lock the network (perhaps waiting for the user thread to unlock it).
* Perform the TX poll:
* If CONFIG_NET_UDP_WRITE_BUFFERS is enabled, it will copy the
buffered UDP packet into the driver packet buffer.
* If CONFIG_NET_UDP_WRITE_BUFFERS is NOT enabled, it will copy the
user data directly into the driver packet buffer.
* When the packet buffer is filled, the Ethernet driver will send
(or schedule to send) the packet.
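To make the hand-off between the two threads concrete, here is a rough host-side model using plain pthreads. It is only a sketch and not NuttX code: struct model_s, model_init(), model_sendto(), model_tx_poll() and model_flush() are all made-up names, a mutex stands in for the network lock, and creating a thread stands in for netdev_txnotify_dev() scheduling work on the LP work queue. It handles one packet at a time and simply counts copies.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <string.h>

#define PKT_MAX 64

/* Hypothetical model state; none of these names exist in NuttX. */

struct model_s
{
  pthread_mutex_t net_lock;        /* Stands in for the network lock */
  pthread_cond_t  tx_done;         /* Signalled when the poll took the data */
  pthread_t       worker;          /* Stands in for the LP work queue thread */
  bool            write_buffers;   /* Models CONFIG_NET_UDP_WRITE_BUFFERS */
  bool            pending;         /* Data is waiting for the TX poll */
  const char     *src;             /* Where the TX poll copies from */
  size_t          len;             /* Packet length */
  char            wrbuf[PKT_MAX];  /* Write buffer (buffered mode only) */
  char            drvbuf[PKT_MAX]; /* Driver packet buffer */
  int             copies;          /* Number of packet copies performed */
};

void model_init(struct model_s *m, bool write_buffers)
{
  assert(m != NULL);
  memset(m, 0, sizeof(*m));
  pthread_mutex_init(&m->net_lock, NULL);
  pthread_cond_init(&m->tx_done, NULL);
  m->write_buffers = write_buffers;
}

/* LP worker side: lock the network, then perform the TX poll. */

static void *model_tx_poll(void *arg)
{
  struct model_s *m = arg;

  /* Blocks here until the user thread unlocks the network */

  pthread_mutex_lock(&m->net_lock);
  memcpy(m->drvbuf, m->src, m->len);  /* Fill the driver packet buffer */
  m->copies++;
  m->pending = false;
  pthread_cond_signal(&m->tx_done);   /* Wake an unbuffered sender */
  pthread_mutex_unlock(&m->net_lock);
  return NULL;
}

/* User thread side.  Returns len on success, -1 on error. */

int model_sendto(struct model_s *m, const char *data, size_t len)
{
  if (len > PKT_MAX)
    {
      return -1;
    }

  pthread_mutex_lock(&m->net_lock);
  m->len     = len;
  m->pending = true;

  /* Stand-in for netdev_txnotify_dev(): schedule the TX poll */

  if (pthread_create(&m->worker, NULL, model_tx_poll, m) != 0)
    {
      m->pending = false;
      pthread_mutex_unlock(&m->net_lock);
      return -1;
    }

  if (m->write_buffers)
    {
      /* Extra copy into the write buffer, then return immediately */

      memcpy(m->wrbuf, data, len);
      m->copies++;
      m->src = m->wrbuf;
    }
  else
    {
      /* No extra copy; waiting releases the lock for the TX poll */

      m->src = data;
      while (m->pending)
        {
          pthread_cond_wait(&m->tx_done, &m->net_lock);
        }
    }

  pthread_mutex_unlock(&m->net_lock);
  return (int)len;
}

/* Model-only helper: wait until the TX poll has finished. */

void model_flush(struct model_s *m)
{
  pthread_join(m->worker, NULL);
}
```

Calling model_init(), one model_sendto() and then model_flush() leaves copies at 2 in the buffered case and at 1 in the unbuffered case, which is the one-copy difference that matters for a single packet.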
For single packet transfers, I would think that the latency would be a
little less if CONFIG_NET_UDP_WRITE_BUFFERS were disabled. That would
save one packet copy with the side effect of making the user
application wait until the data is accepted by the driver.
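If you want to check this on your own board, one simple approach is to time a single sendto() call with each configuration. The helper below is only a sketch: time_udp_sendto() is a made-up name, and it assumes clock_gettime() with CLOCK_MONOTONIC is available in your build. Keep in mind that it measures how long the user thread spends inside sendto(), not when the packet actually reaches the wire. With write buffers enabled, the call returns before the TX poll runs, so the number can look better even though the packet takes one more copy.

```c
#define _POSIX_C_SOURCE 200809L

#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: returns the time spent inside one UDP sendto()
 * call in nanoseconds, or -1 on any error.
 */

long long time_udp_sendto(const char *ip, uint16_t port,
                          const void *payload, size_t len)
{
  struct sockaddr_in dest;
  struct timespec t0;
  struct timespec t1;
  ssize_t sent;
  int sd;

  /* Build the destination address; reject anything that is not IPv4 */

  memset(&dest, 0, sizeof(dest));
  dest.sin_family = AF_INET;
  dest.sin_port   = htons(port);
  if (inet_pton(AF_INET, ip, &dest.sin_addr) != 1)
    {
      return -1;
    }

  sd = socket(AF_INET, SOCK_DGRAM, 0);
  if (sd < 0)
    {
      return -1;
    }

  /* Time only the sendto() call itself */

  clock_gettime(CLOCK_MONOTONIC, &t0);
  sent = sendto(sd, payload, len, 0,
                (const struct sockaddr *)&dest, sizeof(dest));
  clock_gettime(CLOCK_MONOTONIC, &t1);
  close(sd);

  if (sent < 0 || (size_t)sent != len)
    {
      return -1;
    }

  return (long long)(t1.tv_sec - t0.tv_sec) * 1000000000LL +
         (long long)(t1.tv_nsec - t0.tv_nsec);
}
```

Run it several times with the same payload in both builds and compare the results. A single sample is too noisy to tell you much.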
LP worker thread priority could have some effect in certain situations.
See also the slide entitled /Tx Event Handler “Rendezvous”/ in
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=139629397