On Thu, Oct 30, 2025 at 11:39:30AM +0100, Przemek Kitszel wrote:
> On 10/30/25 10:37, Michal Swiatkowski wrote:
> > On Thu, Oct 30, 2025 at 10:10:32AM +0100, Paul Menzel wrote:
> > > Dear Michal,
> > >
> > > Thank you for your patch. For the summary, I’d add:
> > >
> > > ice: Use netif_get_num_default_rss_queues() to decrease queue number
>
> I would instead just say:
> ice: cap the default number of queues to 64
>
> as this is exactly what happens. Then the next paragraph could be:
> Use netif_get_num_default_rss_queues() as a better base (instead of
> the number of CPU cores), but still cap it to 64 to avoid excess IRQs
> assigned to the PF (which would leave, in some cases, nothing for VFs).
>
> sorry for such late nitpicks
> and, see below too
I moved away from capping to 64; now it is just a call to
netif_get_num_default_rss_queues(). Following Olek's comment, dividing
by 2 is just fine now, and it looks like there is no good reason to cap
it further in the driver, but let's discuss it here if you have a
different opinion.

> > > On 30.10.25 at 09:30, Michal Swiatkowski wrote:
> > > > On some high-core systems (like AMD EPYC Bergamo, Intel Clearwater
> > > > Forest) loading the ice driver with default values can lead to
> > > > queue/irq exhaustion. It will result in no additional resources
> > > > for SR-IOV.
> > >
> > > Could you please elaborate how to make the queue/irq exhaustion
> > > visible?
> >
> > What do you mean? On a high core system, let's say num_online_cpus()
> > returns 288; on an 8-port card we have 256 IRQs per each PF (2k in
> > total). The driver will load with the 256 queues (and IRQs) on each
> > PF. Any VF creation command will then fail due to no free IRQs being
> > available.
>
> this clearly means this is -net material,
> even if this commit will be rather unpleasant for backports to stable
>
In my opinion it isn't; it is only about default values. Even in the
described case the user can still call ethtool -L to lower the queue
count and then create VFs without a problem.

> > (echo X > /sys/class/net/ethX/device/sriov_numvfs)
>
> > > > In most cases there is no performance reason for more than half
> > > > num_cpus(). Limit the default value to it using the generic
> > > > netif_get_num_default_rss_queues().
> > > >
> > > > Still, using ethtool the number of queues can be changed up to
> > > > num_online_cpus(). It can be done by calling:
> > > > $ethtool -L ethX combined $(nproc)
> > > >
> > > > This change affects only the default queue amount.
> > >
> > > How would you judge the regression potential, that is, for people
> > > where the defaults work well enough, and the queue number is
> > > reduced now?
> >
> > You can take a look at the commit that introduced the /2 change in
> > netif_get_num_default_rss_queues() [1]. There is a good justification
> > for it there. In short, defaulting to the full physical core count is
> > just a waste of CPU resources.
> >
> > [1]
> > https://lore.kernel.org/netdev/[email protected]/
> [...]
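
For reference, the helper does roughly the following after [1] (a sketch
of the logic as I understand it, not a verbatim copy of net/core/dev.c):
it counts physical cores among the housekeeping CPUs by masking out SMT
siblings, then halves the result once there are more than two cores:

int netif_get_num_default_rss_queues(void)
{
	cpumask_var_t cpus;
	int cpu, count = 0;

	/* Fall back to a single queue for kdump or on allocation failure. */
	if (unlikely(is_kdump_kernel() || !zalloc_cpumask_var(&cpus, GFP_KERNEL)))
		return 1;

	/* Count physical cores: drop the SMT siblings of each visited CPU. */
	cpumask_copy(cpus, housekeeping_cpumask(HK_TYPE_DOMAIN));
	for_each_cpu(cpu, cpus) {
		++count;
		cpumask_andnot(cpus, cpus, topology_sibling_cpumask(cpu));
	}
	free_cpumask_var(cpus);

	/* 1 or 2 cores: use them all; otherwise use half of the cores. */
	return count > 2 ? DIV_ROUND_UP(count, 2) : count;
}

Taking the example above: if the 288 online CPUs are 144 physical cores
with two SMT threads each, this returns 72, so an 8-port card would
request 576 default vectors instead of ~2k, leaving plenty for VFs.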
