Eric Dumazet wrote:
David S. Miller wrote:

The only truly expensive operations in tg3_tx() are the DMA unmaps on some
platforms, particularly if the platform uses an IOMMU.  Unfortunately,
you won't see the IOMMU unmapping overhead in the oprofile traces
because the locking done by IOMMU layers must disable interrupts.

Another idea, if the DMA unmapping isn't the main culprit, is to batch
the freeing.  As we walk through the TX entries, just add the SKB to a
linked list maintained at the top level of the function.  Then drop
the lock and pass each SKB on the list to dev_kfree_skb() (do not
forget to NULL out skb->next or you'll trigger debugging checks).
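
A minimal user-space sketch of the batching idea, for illustration only. Everything here is a hypothetical stand-in: mock_skb, mock_ring, the lock_held flag and reclaim_batched() are not tg3 or kernel names, and free() takes the place of dev_kfree_skb(). The point is only the shape: unlink under the lock, free after dropping it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical user-space stand-ins, not the real sk_buff or tg3 types. */
struct mock_skb {
    struct mock_skb *next;
};

#define MOCK_RING_SIZE 8u   /* power of two so the index wraps with a mask */

struct mock_ring {
    int lock_held;                          /* plays the role of tp->tx_lock */
    struct mock_skb *slot[MOCK_RING_SIZE];  /* one SKB per TX entry */
    unsigned int cons;                      /* software consumer index */
    unsigned int freed;                     /* freed SKB count, for checking */
};

/* Put `count` freshly allocated SKBs into slots 0..count-1. */
static void mock_ring_fill(struct mock_ring *r, unsigned int count)
{
    assert(count < MOCK_RING_SIZE);
    for (unsigned int i = 0; i < count; i++) {
        r->slot[i] = calloc(1, sizeof(*r->slot[i]));
        assert(r->slot[i] != NULL);
    }
}

/* Stand-in for dev_kfree_skb(): must run without the lock, with next cleared. */
static void mock_free_skb(struct mock_ring *r, struct mock_skb *skb)
{
    assert(!r->lock_held);
    assert(skb->next == NULL);
    free(skb);
    r->freed++;
}

/* Walk entries from cons up to hw_idx, chaining SKBs on a local list while
 * the lock is held, then drop the lock and free the whole list. */
static unsigned int reclaim_batched(struct mock_ring *r, unsigned int hw_idx)
{
    struct mock_skb *head = NULL, *skb;
    unsigned int n = 0;

    r->lock_held = 1;
    while (r->cons != hw_idx) {
        skb = r->slot[r->cons];
        r->slot[r->cons] = NULL;
        skb->next = head;           /* push onto the local list */
        head = skb;
        r->cons = (r->cons + 1) & (MOCK_RING_SIZE - 1);
        n++;
    }
    r->lock_held = 0;

    while (head) {
        skb = head;
        head = skb->next;
        skb->next = NULL;           /* clear the link before freeing */
        mock_free_skb(r, skb);
    }
    return n;
}
```

In this model the expensive part (the free, and in a real driver presumably the unmap as well) runs entirely outside the locked region.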

A hybrid scheme would only hold the lock for a certain number of SKBs
at a time, drop the lock and free the list, then regrab the lock and
ACK some more entries, again and again until all finished TX'd frames
are processed.

In fact that might work quite well.
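
The hybrid scheme can be sketched the same way. This is a user-space model under stated assumptions, not driver code: hy_skb, hy_ring, hybrid_reclaim() and the chunk parameter are all hypothetical names, the lock is a plain flag, and free() stands in for dev_kfree_skb(). Each round holds the lock for at most chunk entries, then drops it to free that round's list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins, not the real sk_buff or tg3 types. */
struct hy_skb {
    struct hy_skb *next;
};

#define HY_RING_SIZE 16u    /* power of two so the index wraps with a mask */

struct hy_ring {
    int lock_held;                      /* plays the role of tp->tx_lock */
    struct hy_skb *slot[HY_RING_SIZE];
    unsigned int cons;                  /* software consumer index */
    unsigned int freed;                 /* freed SKB count, for checking */
    unsigned int lock_rounds;           /* how many times the lock was taken */
};

/* Put `count` freshly allocated SKBs into slots 0..count-1. */
static void hy_ring_fill(struct hy_ring *r, unsigned int count)
{
    assert(count < HY_RING_SIZE);
    for (unsigned int i = 0; i < count; i++) {
        r->slot[i] = calloc(1, sizeof(*r->slot[i]));
        assert(r->slot[i] != NULL);
    }
}

/* ACK at most `chunk` entries per lock hold, free them unlocked, repeat
 * until the consumer index reaches hw_idx. Returns the number reclaimed. */
static unsigned int hybrid_reclaim(struct hy_ring *r, unsigned int hw_idx,
                                   unsigned int chunk)
{
    unsigned int total = 0;
    int more;

    assert(chunk > 0);
    do {
        struct hy_skb *head = NULL, *skb;
        unsigned int n = 0;

        r->lock_held = 1;
        r->lock_rounds++;
        while (r->cons != hw_idx && n < chunk) {
            skb = r->slot[r->cons];
            r->slot[r->cons] = NULL;
            skb->next = head;
            head = skb;
            r->cons = (r->cons + 1) & (HY_RING_SIZE - 1);
            n++;
        }
        more = (r->cons != hw_idx);     /* decided while still locked */
        r->lock_held = 0;

        while (head) {                  /* free this round's list unlocked */
            skb = head;
            head = skb->next;
            skb->next = NULL;
            assert(!r->lock_held);
            free(skb);
            r->freed++;
        }
        total += n;
    } while (more);

    return total;
}
```

Note that nothing in this loop bounds the total work: if hw_idx kept advancing, it would keep going.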


Excellent

Well, the hybrid scheme could lead to RX starvation: if some fool manages to transmit at full Gigabit speed, tg3_tx() could loop forever, never returning to tg3_poll().

As for the first idea, maybe we should defer handling the list of skbs to be DMA-unmapped until *after* tg3_rx(), in order to reduce the chance of dropping incoming frames...

Eric

Looking at tg3_tx() more closely, I am not convinced it really needs to hold
tp->tx_lock during the loop.
tp->tx_cons (swidx) is modified only in this function, and could be made
an atomic_t.

The tx_lock would then be needed only for the final check:

if (netif_queue_stopped(tp->dev) &&
        (TX_BUFFS_AVAIL(tp) > TG3_TX_WAKEUP_THRESH))
        netif_wake_queue(tp->dev);
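
A rough user-space model of that idea, for illustration only. C11 atomic_uint stands in for the kernel's atomic_t, and at_ring, at_avail(), at_tx_complete(), the ring size and the wake threshold are all hypothetical; the available-slots formula is an assumption made for this sketch, not the driver's TX_BUFFS_AVAIL() definition. It shows the single-writer consumer index being advanced without the lock, with the lock taken only around the wake check.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define AT_RING_SIZE   16u  /* hypothetical ring size, power of two */
#define AT_WAKE_THRESH 4u   /* hypothetical wake-up threshold */

struct at_ring {
    atomic_uint cons;        /* consumer index, written only by at_tx_complete() */
    unsigned int prod;       /* producer index, owned by the transmit path */
    bool stopped;            /* models netif_queue_stopped() */
    bool lock_held;          /* plays the role of tp->tx_lock */
    unsigned int lock_taken; /* how many times the lock was taken */
};

/* Free slots, assuming one slot is always left unused (sketch-only formula). */
static unsigned int at_avail(struct at_ring *r)
{
    unsigned int in_flight =
        (r->prod - atomic_load(&r->cons)) & (AT_RING_SIZE - 1);
    return AT_RING_SIZE - 1 - in_flight;
}

static void at_tx_complete(struct at_ring *r, unsigned int hw_idx)
{
    unsigned int sw_idx = atomic_load(&r->cons);

    /* Reclaim loop: no lock held, this function is the only writer of cons. */
    while (sw_idx != hw_idx) {
        assert(!r->lock_held);
        /* a real driver would unmap and free entry sw_idx here */
        sw_idx = (sw_idx + 1) & (AT_RING_SIZE - 1);
    }
    atomic_store(&r->cons, sw_idx);     /* publish the new consumer index */

    /* Only the stopped-queue check and wake-up run under the lock. */
    r->lock_held = true;
    r->lock_taken++;
    if (r->stopped && at_avail(r) > AT_WAKE_THRESH)
        r->stopped = false;             /* models netif_wake_queue() */
    r->lock_held = false;
}
```

Whether this is safe in the real driver depends on how the transmit path reads tx_cons and on memory ordering, which this model does not attempt to capture.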

