On 10/31/25 14:17, Michal Swiatkowski wrote:
On Thu, Oct 30, 2025 at 11:39:30AM +0100, Przemek Kitszel wrote:
On 10/30/25 10:37, Michal Swiatkowski wrote:
On Thu, Oct 30, 2025 at 10:10:32AM +0100, Paul Menzel wrote:
Dear Michal,
Thank you for your patch. For the summary, I’d add:
ice: Use netif_get_num_default_rss_queues() to decrease queue number
I would instead just say:
ice: cap the default number of queues to 64
as this is exactly what happens. Then next paragraph could be:
Use netif_get_num_default_rss_queues() as a better base (instead of
the number of CPU cores), but still cap it to 64 to avoid excess IRQs
assigned to PF (what would leave, in some cases, nothing for VFs).
sorry for such late nitpicks
and, see below too
I moved away from capping to 64, now it is just a call to
netif_get_num_default_rss_queues(). Following Olek's comment, dividing
by 2 is just fine now and it looks like there is no good reason to cap it
further in the driver, but let's discuss it here if you have a different
opinion.
I see, sorry for the confusion
with that I'm fine with the change being -next material, and commit
message is good (not sure if perfect, but it does not need to be)
Reviewed-by: Przemek Kitszel <[email protected]>
On 30.10.25 at 09:30, Michal Swiatkowski wrote:
On some high-core systems (like AMD EPYC Bergamo, Intel Clearwater
Forest) loading ice driver with default values can lead to queue/irq
exhaustion. It will result in no additional resources for SR-IOV.
Could you please elaborate how to make the queue/irq exhaustion visible?
What do you mean? On a high-core system, let's say num_online_cpus()
returns 288, on an 8-port card we have only 256 irqs per PF (2k in
total). The driver will load with 256 queues (and irqs) on each PF.
Any VF creation command will fail because no free irqs are available.
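
To put numbers on that, here is a minimal sketch of the per-PF IRQ budget. It assumes one IRQ per queue and ignores the few vectors the driver keeps for other purposes. irqs_left_for_vfs is a hypothetical helper, and 72 is only an example of a lower queue count, not a value taken from the patch.

```shell
#!/bin/sh
# Simplified model (assumption): every queue consumes exactly one IRQ
# and nothing else on the PF needs one.

# irqs_left_for_vfs <irqs_per_pf> <queues_requested>
irqs_left_for_vfs() {
    irqs_per_pf=$1
    queues=$2
    # The PF cannot take more IRQs than it owns.
    if [ "$queues" -gt "$irqs_per_pf" ]; then
        queues=$irqs_per_pf
    fi
    echo $((irqs_per_pf - queues))
}

irqs_left_for_vfs 256 288   # 288 CPUs, 256 IRQs per PF: prints 0
irqs_left_for_vfs 256 72    # hypothetical lower queue count: prints 184
```

With the first call nothing is left over, which is the case where VF creation fails.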
this clearly means this is a -net material,
even if this commit will be rather unpleasant for backports to stable
In my opinion it isn't. It is just about default values. Still in the
described case user can call ethtool -L and lower the queues to create
VFs without a problem.
(echo X > /sys/class/net/ethX/device/sriov_numvfs)
In most cases there is no performance reason for more than half
num_cpus(). Limit the default value to it using generic
netif_get_num_default_rss_queues().
Still, using ethtool the number of queues can be changed up to
num_online_cpus(). It can be done by calling:
$ ethtool -L ethX combined $(nproc)
This change affects only the default queue amount.
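
As a rough illustration of the new default, the sketch below models the halving in userspace. It assumes the helper leaves counts of 2 or fewer untouched and rounds up otherwise, which is how I recall the code in net/core/dev.c; treat the rounding and the use of physical cores (SMT siblings not counted) as assumptions. default_rss_queues is a hypothetical name used only here.

```shell
#!/bin/sh
# Userspace model of the default queue count (assumption, not kernel code).

# default_rss_queues <physical_cores>
default_rss_queues() {
    cores=$1
    if [ "$cores" -gt 2 ]; then
        # Half of the physical cores, rounded up.
        echo $(( (cores + 1) / 2 ))
    else
        # Small systems keep one queue per core.
        echo "$cores"
    fi
}

default_rss_queues 288   # prints 144
default_rss_queues 3     # prints 2
default_rss_queues 2     # prints 2
```

Anything above that default is still reachable with ethtool -L, as the commit message says.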
How would you judge the regression potential, that means for people where
the defaults work good enough, and the queue number is reduced now?
You can take a look at the commit that introduced the /2 change in
netif_get_num_default_rss_queues() [1]. There is a good justification
for such a situation. In short, having one queue per physical core is
just a waste of CPU resources.
[1] https://lore.kernel.org/netdev/[email protected]/
[...]