From: 'Simon Horman' <[email protected]>
This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
Full review at: https://sashiko.dev
---
Subject: ice: use element-by-element comparison for DCB config changes
This commit replaces the use of memcmp() for DCB configurations with an
element-by-element comparison using ice_dcb_need_recfg(). It exports this
function and applies it to the SW DCB netlink setters to prevent issues
with uninitialized padding bytes.
> diff --git a/drivers/net/ethernet/intel/ice/ice_dcb_nl.c
> b/drivers/net/ethernet/intel/ice/ice_dcb_nl.c
> index a10c1c8d86977..13a52c1a4da3e 100644
> --- a/drivers/net/ethernet/intel/ice/ice_dcb_nl.c
> +++ b/drivers/net/ethernet/intel/ice/ice_dcb_nl.c
> @@ -108,11 +108,17 @@ static int ice_dcbnl_setets(struct net_device *netdev,
> struct ieee_ets *ets)
> if (!bwrec)
> new_cfg->etsrec.tcbwtable[0] = 100;
>
> + if (!ice_dcb_need_recfg(pf, &pf->hw.port_info->qos_cfg.local_dcbx_cfg,
> + new_cfg)) {
> + err = ICE_DCB_NO_HW_CHG;
> + goto ets_out;
> + }
Does ice_dcb_need_recfg() correctly detect bandwidth changes?
Looking at the implementation of ice_dcb_need_recfg(), it only logs a debug
message but doesn't set need_reconfig to true when only the ETS Bandwidth
Table or TSA Table changes:
drivers/net/ethernet/intel/ice/ice_dcb_lib.c:ice_dcb_need_recfg() {
...
if (memcmp(&new_cfg->etscfg.tcbwtable,
&old_cfg->etscfg.tcbwtable,
sizeof(new_cfg->etscfg.tcbwtable)))
dev_dbg(dev, "ETS TC BW Table changed.\n");
...
}
Since the function returns false, will this cause ice_dcbnl_setets() to
abort early and silently drop user-requested bandwidth modifications?
Also, since new_cfg points to the shared desired_dcbx_cfg state, does this
early return leave uncommitted modifications in the software state?
If a subsequent unrelated operation (like setting PFC) correctly returns
true from ice_dcb_need_recfg(), it appears it could unintentionally push
these abandoned ETS bandwidth changes to the hardware.
> err = ice_pf_dcb_cfg(pf, new_cfg, true);
> /* return of zero indicates new cfg applied */
> - if (err == ICE_DCB_HW_CHG_RST)
> + if (!err)
> ice_dcbnl_devreset(netdev);
This isn't a bug introduced by this patch, but can ice_dcbnl_devreset()
cause a deadlock against hardware reset tasks?
The ice_dcbnl_devreset() function spins in a usleep_range() loop while
ice_is_reset_in_progress() is true.
Since DCBNL callbacks like ice_dcbnl_setets() run with the global
rtnl_lock held, and a concurrent hardware or PCIe reset worker likely
requires rtnl_lock to complete the reset, will the reset task block
forever while this thread waits for it?
> - if (err == ICE_DCB_NO_HW_CHG)
> + else if (err == ICE_DCB_NO_HW_CHG)
> err = ICE_DCB_HW_CHG_RST;
>
> ets_out:
Does this change cause the function to return a positive error code to
userspace when applying an identical configuration?
By bypassing the translation from ICE_DCB_NO_HW_CHG (which is 1) to
ICE_DCB_HW_CHG_RST (which evaluates to 0) in the early goto added
above, err remains 1 at the ets_out label.
The DCBNL core evaluates any non-zero return value as an error:
net/dcb/dcbnl.c:dcbnl_ieee_set() {
...
err = netdev->dcbnl_ops->ieee_setets(netdev, ets);
if (err)
goto err;
...
}
Will this cause tools like lldptool to fail when configuring an
existing setup?
[ ... ]
> @@ -845,6 +858,12 @@ static int ice_dcbnl_setapp(struct net_device *netdev,
> struct dcb_app *app)
> new_cfg->dscp_map[app->protocol] = app->priority;
> new_cfg->app[new_cfg->numapps++] = new_app;
This isn't a bug introduced by this patch, but does this array append
lack bounds checking against ICE_DCBX_MAX_APPS?
The app array has a capacity of 64. The function restricts duplicate
DSCP protocols using the dscp_mapped bitmap, but since DSCP_MAX is also
64, a user could map all 64 unique values.
If numapps starts greater than 0 due to existing EtherType applications,
adding 64 valid DSCP mappings would cause numapps to exceed 64, leading
to an out-of-bounds write that overwrites the adjacent dscp_mapped
bitmap.