Hi,

On 17/11/2025 9:15 AM, Maayan Kashani wrote:
The mlx5_ipool_free() function was called with a NULL pool pointer
during HW flow destruction, causing a segmentation fault. This occurred
when flow creation failed and the cleanup path attempted to free
resources from an uninitialized flow pool.

The crash happened in the following scenario:
1. During device start, a default NTA copy action flow is created
2. If the flow creation fails, mlx5_flow_hw_list_destroy() is called
3. In hw_cmpl_flow_update_or_destroy(), the table->flow pool could be NULL
4. mlx5_ipool_free(table->flow, flow->idx) was called without checking
    if table->flow is NULL
5. Inside mlx5_ipool_free(), accessing pool->cfg.per_core_cache caused
    a segmentation fault due to NULL pointer dereference

The fix adds two layers of protection:
1. Add NULL check for table->flow before calling mlx5_ipool_free() in
    hw_cmpl_flow_update_or_destroy(), consistent with the existing check
    for table->resource on the previous line
2. Add NULL check for pool parameter in mlx5_ipool_free() as a defensive
    measure to prevent similar crashes in other code paths
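
For illustration only, the two layers can be sketched in standalone C as
below. This is not the driver code: struct ipool, struct table,
ipool_free() and table_release_flow() are hypothetical stand-ins, and the
int return values are an assumption added so the behaviour can be observed.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an indexed pool; not the real mlx5 structure. */
struct ipool {
	uint32_t freed; /* number of entries released so far */
};

/* Hypothetical stand-in for a template table that owns two pools. */
struct table {
	struct ipool *resource;
	struct ipool *flow_pool; /* named after the renamed 'flow' field */
};

/*
 * Layer 2: the callee tolerates a NULL pool instead of dereferencing it.
 * Returns 1 if an entry was released, 0 if the call was ignored
 * (the return value is an assumption made for this sketch).
 */
static int
ipool_free(struct ipool *pool, uint32_t idx)
{
	(void)idx; /* the sketch only counts releases, it does not index */
	if (pool == NULL)
		return 0;
	pool->freed++;
	return 1;
}

/*
 * Layer 1: the caller checks each pool before trying to free from it,
 * so a table whose pools were never created is skipped cleanly.
 */
static int
table_release_flow(struct table *tbl, uint32_t res_idx, uint32_t flow_idx)
{
	int released = 0;

	if (tbl->resource != NULL)
		released += ipool_free(tbl->resource, res_idx);
	if (tbl->flow_pool != NULL)
		released += ipool_free(tbl->flow_pool, flow_idx);
	return released;
}
```

With both checks in place, a table whose pools are all NULL releases
nothing and does not fault, and a direct ipool_free(NULL, idx) call from
another path is likewise a no-op.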

The fix also renames the ‘flow’ field in rte_flow_template_table
to ‘flow_pool’ for better code readability.

Stack trace of the fault:
   mlx5_ipool_free (pool=0x0) at mlx5_utils.c:753
   hw_cmpl_flow_update_or_destroy at mlx5_flow_hw.c:4481
   mlx5_flow_hw_destroy at mlx5_flow_hw.c:14219
   mlx5_flow_hw_list_destroy at mlx5_flow_hw.c:14279
   flow_hw_list_create at mlx5_flow_hw.c:14415
   mlx5_flow_start_default at mlx5_flow.c:8263
   mlx5_dev_start at mlx5_trigger.c:1420

Fixes: 27d171b88031 ("net/mlx5: abstract flow action and enable reconfigure")
Cc: [email protected]

Signed-off-by: Maayan Kashani <[email protected]>
Acked-by: Bing Zhao <[email protected]>

Patch applied to next-net-mlx,
Kindest regards
Raslan Darawsheh
