On Thu, 2014-08-21 at 16:06 -0700, Benjamin Poirier wrote:
> On 2014/08/21 15:32, Michael Chan wrote:
> > On Thu, 2014-08-21 at 15:04 -0700, Benjamin Poirier wrote:
> > > On 2014/08/19 15:00, Michael Chan wrote:
> > > > On Tue, 2014-08-19 at 11:52 -0700, Benjamin Poirier wrote:
> > > > > diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c
> > > > > index 3ac5d23..b11c0fd 100644
> > > > > --- a/drivers/net/ethernet/broadcom/tg3.c
> > > > > +++ b/drivers/net/ethernet/broadcom/tg3.c
> > > > > @@ -202,7 +202,8 @@ static inline void _tg3_flag_clear(enum TG3_FLAGS flag, unsigned long *bits)
> > > > >  #endif
> > > > >
> > > > >  /* minimum number of free TX descriptors required to wake up TX process */
> > > > > -#define TG3_TX_WAKEUP_THRESH(tnapi) ((tnapi)->tx_pending / 4)
> > > > > +#define TG3_TX_WAKEUP_THRESH(tnapi) max_t(u32, (tnapi)->tx_pending / 4, \
> > > > > +                                          MAX_SKB_FRAGS + 1)
> > > >
> > > > I think we should precompute this and store it in something like
> > > > tp->tx_wake_thresh.
> > >
> > > I've tried this by adding the following patch at the end of the v2
> > > series but I did not measure a significant latency improvement. Was
> > > there another reason for the change?
> >
> > Just performance. The wake up threshold is checked in the tx fast path
> > in both start_xmit() and tg3_tx(). I would optimize such code for speed
>
> I don't see what you mean. The code in those two functions that used to
> invoke TG3_TX_WAKEUP_THRESH is wrapped in unlikely() conditions. You
> can't tell me that's the fast path ;) It's only checked when the queue
> is stopped.

I missed the unlikely(). So you're right. It's not really in the fast
path.

>
> Moreover, the patches I've sent already add tg3_napi.wakeup_thresh. It
> is over those patches that I've made the measurements.

Right. But my original comment was about your original patch #1, which
added max_t() to the macro TG3_TX_WAKEUP_THRESH without adding the
wakeup_thresh field. All my comments (performance and smaller code)
were based on your original patch #1. Later I did see that your patch 3
converted TG3_TX_WAKEUP_THRESH to a structure field, so it's no longer
an issue.

>
> > as much as possible. In the current code, it was just a right shift
> > operation. Now, with max_t() added, I think I prefer having it
> > pre-computed. The performance difference may not be measurable, but I
> > think the compiled code size may be smaller too.
>
> Maybe in certain areas, but not overall:
>
> with v2 patches 1-3
>    text    data     bss     dec     hex filename
>  149495    1247       0  150742   24cd6 drivers/net/ethernet/broadcom/tg3.o
> with v2 patches 1-3 + tx_wake_thresh_def
>    text    data     bss     dec     hex filename
>  149524    1247       0  150771   24cf3 drivers/net/ethernet/broadcom/tg3.o
>
> I really don't see a gain.
>

Agreed. Once you have converted TG3_TX_WAKEUP_THRESH to a structure
field, that's sufficient. No need to have multiple fields.

Thanks.
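
For anyone following along, here is a rough userspace sketch of the
"precompute into a structure field" idea discussed above. It is not the
driver code: struct sketch_napi, sketch_set_tx_pending(),
sketch_should_wake() and SKETCH_MAX_SKB_FRAGS are hypothetical stand-ins,
the value 17 is only an assumed typical MAX_SKB_FRAGS, and the strict
"greater than" wake test is an assumption as well.

```c
#include <stdint.h>

/* Assumption: stand-in for the kernel's MAX_SKB_FRAGS. The real value
 * depends on the kernel configuration; 17 is used here for illustration. */
#define SKETCH_MAX_SKB_FRAGS 17u

/* Hypothetical, trimmed-down stand-in for struct tg3_napi. */
struct sketch_napi {
	uint32_t tx_pending;    /* configured TX ring size */
	uint32_t wakeup_thresh; /* precomputed wake-up threshold */
};

/* Slow path: recompute the threshold whenever the ring size changes.
 * Equivalent to max_t(u32, tx_pending / 4, MAX_SKB_FRAGS + 1). */
static void sketch_set_tx_pending(struct sketch_napi *tnapi,
				  uint32_t tx_pending)
{
	uint32_t quarter = tx_pending / 4;
	uint32_t floor = SKETCH_MAX_SKB_FRAGS + 1;

	tnapi->tx_pending = tx_pending;
	tnapi->wakeup_thresh = quarter > floor ? quarter : floor;
}

/* Wake test: a single compare against the stored field, with no
 * division or max() at the point of use. Assumed to be a strict ">". */
static int sketch_should_wake(const struct sketch_napi *tnapi,
			      uint32_t tx_avail)
{
	return tx_avail > tnapi->wakeup_thresh;
}
```

With a 512-entry ring the stored threshold is 128; with a small ring of
32 entries the floor of SKETCH_MAX_SKB_FRAGS + 1 = 18 takes over, which
is the case the max_t() in the original patch was meant to cover.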