On Mon, 16 Mar 2026 05:52:55 +0000
KAVYA AV <[email protected]> wrote:
> When using DCB mode with VT disabled and requesting more queues than
> traffic classes (e.g., rxq=64 with 8 TCs), testpmd crashes with null
> pointer errors because it artificially limits queue allocation to
> num_tcs.
>
> For VMDq devices, use device-specific queue count (nb_rx_queues/
> nb_tx_queues) instead of limiting to num_tcs. This allows VMDq devices
> to utilize their full queue capacity while maintaining compatibility
> with non VMDq devices.
>
> Fixes null pointer dereference when queue structures are accessed
> beyond the allocated range.
>
> Fixes: 2169699b15fc ("app/testpmd: add queue restriction in DCB command")
> Cc: [email protected]
>
> Signed-off-by: KAVYA AV <[email protected]>
> ---
It makes sense to do this, but AI review raised a number of issues:

Error: wrong field. nb_rx_queues reflects the previous configure, not the
device's DCB queue capacity

The patch changes the vmdq_pool_base > 0 path from num_tcs to
dev_info.nb_rx_queues / dev_info.nb_tx_queues. However, dev_info.nb_rx_queues
is the configured queue count from the rte_eth_dev_configure() call that
happened earlier in this same function (line 4413: rte_eth_dev_configure(pid,
nb_rxq, nb_rxq, &port_conf)). So dev_info.nb_rx_queues just reflects whatever
nb_rxq was before entering this block; it is not the device's inherent DCB
queue capacity.

If nb_rxq was previously set to 64 by the user (rxq=64), then after configure
dev_info.nb_rx_queues will be 64, and this code sets nb_rxq = 64, which is
circular. It does avoid the crash from using num_tcs (which could be too
small), but it doesn't set the queue count to a value derived from the device's
VMDq/DCB capability. Compare with the DCB_VT_ENABLED branch just above, which
uses dev_info.nb_rx_queues only when max_vfs > 0 because the VF driver
legitimately constrains nb_rx_queues during configure.

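
A minimal sketch of the circularity, using mock types rather than the real
ethdev structures. mock_dev_info, mock_port, mock_configure() and
patched_nb_rxq() are hypothetical names for illustration; the only behavior
modeled is that configure records the requested count and dev_info reports it
back:

```c
#include <stdint.h>

/* Mock of the two ethdev fields involved; NOT the real DPDK structs. */
struct mock_dev_info {
	uint16_t max_rx_queues; /* hardware limit */
	uint16_t nb_rx_queues;  /* count recorded by the last configure */
};

struct mock_port {
	struct mock_dev_info info;
};

/* Stand-in for rte_eth_dev_configure(): stores the requested count. */
static int
mock_configure(struct mock_port *p, uint16_t nb_rxq)
{
	if (nb_rxq > p->info.max_rx_queues)
		return -1;
	p->info.nb_rx_queues = nb_rxq;
	return 0;
}

/*
 * The patched VT_DISABLED + vmdq_pool_base > 0 path, reduced to its core:
 * configure with the user's count, then read nb_rx_queues back.
 */
static uint16_t
patched_nb_rxq(struct mock_port *p, uint16_t requested)
{
	if (mock_configure(p, requested) != 0)
		return 0;
	return p->info.nb_rx_queues;
}
```

Whatever count is passed in comes straight back out, so under this model the
assignment adds no device-derived constraint.
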
For the VT_DISABLED + vmdq_pool_base > 0 case, the intent is to limit queues to
those available to the PF (since VMDq pools consume some). The original num_tcs
was an approximation; nb_rx_queues is another approximation that happens to be
the user's requested count echoed back. Consider whether the correct value
should be derived from the vmdq_queue_base or vmdq_queue_num fields instead,
which describe the actual PF/VMDq queue layout.

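
One possible shape for such a derivation, as a sketch only. It assumes that
when vmdq_pool_base > 0 the PF owns the queue ids below vmdq_queue_base, which
would need checking against the drivers involved. struct queue_layout,
pf_queue_limit() and clamp_dcb_queues() are hypothetical; only the field names
mirror rte_eth_dev_info:

```c
#include <stdint.h>

/* Hypothetical mirror of the rte_eth_dev_info fields discussed above. */
struct queue_layout {
	uint16_t max_rx_queues;   /* total RX queues on the device */
	uint16_t vmdq_queue_base; /* first queue id used by VMDq pools */
	uint16_t vmdq_queue_num;  /* queues reserved for VMDq pools */
	uint16_t vmdq_pool_base;  /* first VMDq pool id */
};

/*
 * ASSUMPTION: with vmdq_pool_base > 0, PF queues are [0, vmdq_queue_base).
 * If vmdq_queue_base is not populated, fall back to subtracting the
 * VMDq-reserved queues from the device total.
 */
static uint16_t
pf_queue_limit(const struct queue_layout *l)
{
	if (l->vmdq_pool_base == 0)
		return l->max_rx_queues;
	if (l->vmdq_queue_base > 0)
		return l->vmdq_queue_base;
	if (l->vmdq_queue_num < l->max_rx_queues)
		return (uint16_t)(l->max_rx_queues - l->vmdq_queue_num);
	return 0;
}

/* Clamp the user's request (e.g. rxq=64) to what the PF can use. */
static uint16_t
clamp_dcb_queues(uint16_t requested, const struct queue_layout *l)
{
	uint16_t limit = pf_queue_limit(l);

	return requested < limit ? requested : limit;
}
```

Unlike nb_rx_queues, the limit here does not depend on what the user asked
for, so a too-large rxq is reduced rather than echoed back.
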
Warning: Comment is misleading

The added comment says "Use device queue counts to prevent null pointer errors"
but dev_info.nb_rx_queues is the configured count, not a device-intrinsic
limit. The comment should describe why this value is appropriate for the
VMDq-with-pool-base case.