On Thu, Oct 30, 2025 at 11:39:30AM +0100, Przemek Kitszel wrote:
> On 10/30/25 10:37, Michal Swiatkowski wrote:
> > On Thu, Oct 30, 2025 at 10:10:32AM +0100, Paul Menzel wrote:
> > > Dear Michal,
> > > 
> > > 
> > > Thank you for your patch. For the summary, I’d add:
> > > 
> > > ice: Use netif_get_num_default_rss_queues() to decrease queue number
> 
> I would instead just say:
> ice: cap the default number of queues to 64
> 
> as this is exactly what happens. Then next paragraph could be:
> Use netif_get_num_default_rss_queues() as a better base (instead of
> the number of CPU cores), but still cap it to 64 to avoid excess IRQs
> assigned to PF (which would leave, in some cases, nothing for VFs).
> 
> sorry for such late nitpicks
> and, see below too

I moved away from capping to 64; now it is just a call to
netif_get_num_default_rss_queues(). Following Olek's comment, dividing
by 2 is just fine now, and it looks like there is no good reason to cap
it further in the driver, but let's discuss it here if you have a
different opinion.
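
For reference, a rough userspace model of the default being discussed.
It is a sketch, not the kernel code: it assumes the helper counts
physical cores (one per SMT sibling group) and halves that count,
rounding up, once there are more than two. default_rss_queues() and its
argument are made-up names for illustration, not kernel API.

```python
def default_rss_queues(physical_cores: int) -> int:
    """Model of the default queue count: half the physical cores, rounded up.

    Assumption: systems with one or two physical cores keep that count as-is.
    """
    if physical_cores < 1:
        raise ValueError("need at least one physical core")
    if physical_cores <= 2:
        return physical_cores
    # Integer ceiling of physical_cores / 2.
    return (physical_cores + 1) // 2


# Print the modelled default for a few core counts.
for cores in (2, 64, 144, 288):
    print(cores, "physical cores ->", default_rss_queues(cores), "queues")
```

The input is the physical core count, not num_online_cpus(), so with
SMT enabled the modelled default is about a quarter of the online CPUs.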

> 
> > > 
> > > Am 30.10.25 um 09:30 schrieb Michal Swiatkowski:
> > > > On some high-core systems (like AMD EPYC Bergamo, Intel Clearwater
> > > > Forest) loading ice driver with default values can lead to queue/irq
> > > > exhaustion. It will result in no additional resources for SR-IOV.
> > > 
> > > Could you please elaborate how to make the queue/irq exhaustion visible?
> > > 
> > 
> > What do you mean? On a high-core system, let's say num_online_cpus()
> > returns 288, on an 8-port card we have only 256 irqs per each PF (2k
> > in total). The driver will load with 256 queues (and irqs) on each PF.
> > Any VF creation command will fail due to no free irqs available.
> 
> this clearly means this is -net material,
> even if this commit will be rather unpleasant for backports to stable
>

In my opinion it isn't. It is just about default values. Even in the
described case, the user can call ethtool -L and lower the queue count
to create VFs without a problem.
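
To put numbers on the case above, a back-of-the-envelope sketch. The
assumptions are loud ones: one irq per combined queue, no vectors
reserved for anything else, and the 2k pool shared by the 8 PFs as
described. irqs_left_for_vfs() is a hypothetical helper, not driver
code, and 144 is simply half of 288 cores assuming no SMT.

```python
def irqs_left_for_vfs(total_irqs: int, num_pfs: int, queues_per_pf: int) -> int:
    """Return the irqs remaining after every PF takes one irq per queue."""
    used = num_pfs * queues_per_pf
    if used > total_irqs:
        raise ValueError("PFs would need more irqs than the device has")
    return total_irqs - used


# 256 queues per PF on an 8-port card: the whole 2k pool is consumed.
print(irqs_left_for_vfs(2048, 8, 256))  # 0

# 144 queues per PF (assumed: half of 288 physical cores).
print(irqs_left_for_vfs(2048, 8, 144))  # 896
```

Lowering the per-PF queue count by hand with ethtool -L has the same
effect on this budget as a smaller default.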

> > (echo X > /sys/class/net/ethX/device/sriov_numvfs)
> > 
> > > > In most cases there is no performance reason for more than half
> > > > num_cpus(). Limit the default value to it using generic
> > > > netif_get_num_default_rss_queues().
> > > > 
> > > > Still, using ethtool the number of queues can be changed up to
> > > > num_online_cpus(). It can be done by calling:
> > > > $ ethtool -L ethX combined $(nproc)
> > > > 
> > > > This change affects only the default queue amount.
> > > 
> > > How would you judge the regression potential, that means for people where
> > > the defaults work good enough, and the queue number is reduced now?
> > > 
> > 
> > You can take a look at the commit that introduced the /2 change in
> > netif_get_num_default_rss_queues() [1]. There is a good justification
> > for such a situation. In short, having as many queues as physical
> > cores just wastes CPU resources.
> > 
> > [1] https://lore.kernel.org/netdev/[email protected]/
> > 
> [...]
