Hi Ido,

I'm looking at your patch "net/mlx4_en: Configure the XPS queue mapping on driver load". We're testing on a 40-CPU system, and XPS appears to be configured by default with forty TX queues, 0-39, where the xps_cpus mask of queue i is (1 << i).

The problem is that this does not align well with the RX queues: the TX completion interrupt for TX queue z happens in the interrupt for (z % num_rx_queues). So the default XPS mapping doesn't respect the RX interrupt affinities, and we end up with TX queues that are bound to a CPU on one NUMA node but have their TX completions happen on another, which is suboptimal.

It would be nice if the default XPS configuration could somehow take into account where the completion interrupt happens.
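To make the mismatch concrete, here is a rough sketch in Python (not driver code). Only the 40 TX queues, the (1 << i) masks, and the (z % num_rx_queues) rule come from what is described above. The RX queue count of 8, the two-node CPU layout, and the assumption that RX interrupt i is affine to CPU i are made up for illustration and are not our actual test configuration.

```python
# Sketch of the default XPS mapping versus where TX completions run.
NUM_TX_QUEUES = 40   # from the report: TX queues 0-39 on a 40-CPU system
NUM_RX_QUEUES = 8    # ASSUMPTION: hypothetical RX queue count
CPUS_PER_NODE = 20   # ASSUMPTION: two NUMA nodes, CPUs 0-19 and 20-39


def xps_cpu(txq):
    """CPU selected by the default mask xps_cpus = (1 << txq)."""
    mask = 1 << txq
    return mask.bit_length() - 1  # index of the single set bit


def completion_irq(txq, num_rx_queues=NUM_RX_QUEUES):
    """Interrupt that handles TX completions for txq: (z % num_rx_queues)."""
    return txq % num_rx_queues


def irq_cpu(irq):
    """ASSUMPTION: RX interrupt i is affine to CPU i."""
    return irq


def node_of(cpu):
    """NUMA node of a CPU under the assumed two-node layout."""
    return cpu // CPUS_PER_NODE


def cross_node_queues():
    """TX queues whose XPS CPU and completion CPU sit on different nodes."""
    return [
        q for q in range(NUM_TX_QUEUES)
        if node_of(xps_cpu(q)) != node_of(irq_cpu(completion_irq(q)))
    ]


if __name__ == "__main__":
    for q in cross_node_queues():
        print(f"tx-{q}: xps cpu {xps_cpu(q)} (node {node_of(xps_cpu(q))}), "
              f"completion on cpu {irq_cpu(completion_irq(q))} "
              f"(node {node_of(irq_cpu(completion_irq(q)))})")
```

Under these assumed numbers, every TX queue whose XPS CPU is on the second node has its completions handled on the first node. The exact set depends on the real RX queue count and IRQ affinities.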
It looks like a few other drivers also set XPS; they might have a similar issue.

Thanks,
Tom
