On Thu, Sep 21, 2006 at 04:59:37PM -0700, Jesse Brandeburg wrote: > Herbert can you help describe why we need this patch on this thread on > netdev? We reproduced the race without it and it appears to go away > with it.
Of course you can resolve it by adding a lock to the txclean path. However, the advantages of a lockless txclean path is obvious. Here's the original changelog: [NETDRV] e1000: Fix potential tx wake up race Jamal posted a patch which improved e1000 TX performance. That in turn spawned a discussion on potential tx wake up races caused by the change. During that discussion, Michael Chan posted the following scenario (redrawn by me for e1000): CPU0 CPU1 e1000_xmit_frame tx_lock if (tx_ring_empty) e1000_clean_tx_irq if (netif_queue_stopped) tx_lock netif_wake_queue tx_unlock netif_stop_queue tx_unlock As you can see, even with the locks around netif_wake_queue this still allows the race to occur because the netif_wake_queue code path is conditional on netif_queue_stopped. So here is a patch that uses the memory barriers rather than spin locks to solve the problem. First of all I've made netif_wake_queue independent of netif_queue_stopped. This means that we operate on the stopped bit only once which closes a race window. Furthermore, netif_wake_queue itself is already a memory barrier because it does a test_and_clear_bit. This means that if the other side can observe the netif_wake_queue then it can observe the extra room we've made on the queue. On the e1000_xmit_frame side, I've changed netif_stop_queue into a new function netif_test_and_stop_queue which does test_and_set_bit. This means that it also is a full memory barrier. In order to close the race then, we simply recheck the tx queue size if netif_test_and_stop_queue returns 0 indicating that the queue was previously started. There are two possibilities: 1) netif_wake_queue is about to occur. In this case it doesn't really matter what we do as netif_wake_queue will always wake the queue for us. In the worst case, it'll wake up the queue with too little room but that's really uncommon and we can cope with that anyway. 2) netif_wake_queue has just occured. Because it is a memory barrier, when we recheck the queue size we're guaranteed to see the new indices which means that we'll see the extra room it has made for us. As a result we'll simply restart the queue. The one wart in the patch is that we now have an unconditional atomic operation in e1000_clean_tx_irq (netif_wake_queue). If anyone can think of a way to get rid of it without opening up the race, please let us know. Signed-off-by: Herbert Xu <[EMAIL PROTECTED]> Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html