On 10/3/2025 9:27 AM, Dave Ertman wrote:
> When two E8XX interfaces are placed into a bond, and are correctly
> configured for supporting SRIOV traffic over the bonded interfaces,
> there is a problem with traffic aimed directly at the bond netdev. By
> conjoining both interfaces onto a single switch block in the NIC, all
> unicast and broadcast traffic is being directed to the primary interface's
> set of resources no matter which interface is the active/targeting one.
>
> To fix this, add a set of rules into the switch block that combines both
> target MAC address and source logical port to direct packets to the
> active/targeted VSI. This change will not touch traffic directed to SRIOV
> VF targets.
>
> Fixes: ec5a6c5f79ed ("ice: process events created by lag netdev event
> handler")
> Signed-off-by: Dave Ertman <[email protected]>
> ---
> drivers/net/ethernet/intel/ice/ice_lag.c | 101 +++++++++++++++++++++++
> drivers/net/ethernet/intel/ice/ice_lag.h | 5 ++
> 2 files changed, 106 insertions(+)
>
> diff --git a/drivers/net/ethernet/intel/ice/ice_lag.c
> b/drivers/net/ethernet/intel/ice/ice_lag.c
> index d2576d606e10..7773d5b9bae9 100644
> --- a/drivers/net/ethernet/intel/ice/ice_lag.c
> +++ b/drivers/net/ethernet/intel/ice/ice_lag.c
> @@ -17,6 +17,7 @@ static const u8 lacp_train_pkt[ICE_TRAIN_PKT_LEN] = { 0, 0,
> 0, 0, 0, 0,
> static const u8 act_act_train_pkt[ICE_TRAIN_PKT_LEN] = { 0, 0, 0, 0, 0, 0,
> 0, 0, 0, 0, 0, 0,
> 0, 0, 0, 0 };
> +static u8 mac_train_pkt[ICE_TRAIN_PKT_LEN] = { 0 };
>
Is there any way this static global variable could be either allocated
or made part of the LAG structure or something?
You're using it as some sort of storage from what I can tell, but I
really don't like that it's a driver global and open to a lot of
potential race conditions.
For that matter, it's only accessed a couple of times, and each time it's
used to copy a value into it and then copy that into something else.
Can you explain what's going on with that and why it even needs a global
variable like this?
> #define ICE_RECIPE_LEN 64
> #define ICE_LAG_SRIOV_CP_RECIPE 10
> @@ -29,6 +30,10 @@ static const u8 ice_lport_rcp[ICE_RECIPE_LEN] = {
> 0x05, 0, 0, 0, 0x20, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
> 0x85, 0, 0x16, 0, 0, 0, 0xff, 0xff, 0x07, 0, 0, 0, 0, 0, 0, 0,
> 0, 0, 0, 0, 0, 0, 0x30 };
> +static const u8 ice_pfmac_rcp[ICE_RECIPE_LEN] = {
> + 0x05, 0, 0, 0, 0x20, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0x85, 0, 0x16,
> + 0x05, 0x06, 0x07, 0xff, 0xff, 0x07, 0x00, 0xff, 0xff, 0xff, 0xff,
> + 0xff, 0xff, 0, 0, 0, 0, 0, 0, 0x30 };
>
> /**
> * ice_lag_set_primary - set PF LAG state as Primary
> @@ -1336,6 +1341,89 @@ ice_lag_reclaim_vf_nodes(struct ice_lag *lag, struct
> ice_hw *src_hw)
> ice_lag_reclaim_vf_tc(lag, src_hw, i, tc);
> }
>
> +/**
> + * ice_lag_cfg_pfmac_fltrs
> + * @lag: local lag info struct
> + * @link: is this a linking action
> + *
> + * Configure lport/MAC filters for this interfaces PF traffic in the
> + * current interfaces SWID
> + */
> +static void ice_lag_cfg_pfmac_fltrs(struct ice_lag *lag, bool link)
> +{
> + u8 lport = lag->pf->hw.port_info->lport;
> + struct ice_sw_rule_lkup_rx_tx *s_rule;
> + struct ice_vsi *vsi = lag->pf->vsi[0];
> + struct ice_hw *hw = &lag->pf->hw;
> + u16 s_rule_sz;
> + u32 act;
> +
> + act = ICE_FWD_TO_VSI | ICE_SINGLE_ACT_LAN_ENABLE |
> ICE_SINGLE_ACT_VALID_BIT |
> + FIELD_PREP(ICE_SINGLE_ACT_VSI_ID_M, vsi->vsi_num);
> +
> + s_rule_sz = ICE_SW_RULE_RX_TX_HDR_SIZE(s_rule, ICE_TRAIN_PKT_LEN);
> + s_rule = kzalloc(s_rule_sz, GFP_KERNEL);
> + if (!s_rule) {
> + netdev_warn(lag->netdev, "-ENOMEM error configuring PFMAC
> filters\n");
> + return;
> + }
> +
> + if (link) {
> + u8 broadcast[ETH_ALEN] = { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff };
> +
> + /* unicast */
> + ether_addr_copy(mac_train_pkt, lag->upper_netdev->dev_addr);
> + memcpy(s_rule->hdr_data, mac_train_pkt, ICE_TRAIN_PKT_LEN);
Here, you copy dev_addr into it, then you copy that on into
s_rule->hdr_data...
> + s_rule->recipe_id = cpu_to_le16(lag->pfmac_recipe);
> + s_rule->src = cpu_to_le16(lport);
> + s_rule->act = cpu_to_le32(act);
> + s_rule->hdr_len = cpu_to_le16(ICE_TRAIN_PKT_LEN);
> + s_rule->hdr.type = cpu_to_le16(ICE_AQC_SW_RULES_T_LKUP_RX);
> +
> + if (ice_aq_sw_rules(hw, s_rule, s_rule_sz, 1,
> + ice_aqc_opc_add_sw_rules, NULL)) {
> + netdev_warn(lag->netdev, "Error ADDING Unicast PFMAC
> rule for aggregate\n");
> + goto err_pfmac_free;
> + }
> +
> + lag->pfmac_unicst_idx = le16_to_cpu(s_rule->index);
> +
> + /* broadcast */
> + ether_addr_copy(mac_train_pkt, broadcast);
> + memcpy(s_rule->hdr_data, mac_train_pkt, ICE_TRAIN_PKT_LEN);
And here, you copy the broadcast address into it, and then copy that into
s_rule->hdr_data...
But why not just copy directly into s_rule->hdr_data instead of
copying twice? Nothing else in this patch interacts with mac_train_pkt,
so we copy needlessly and end up using a value that could be modified by
another thread, possibly even on another PF, since it's a global
variable...
Please fix this.
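
To illustrate what I mean, here is a rough, untested userspace sketch of
writing the MAC straight into the rule's dummy packet with no shared
scratch buffer. Everything prefixed sketch_/SKETCH_ is a made-up stand-in
and not driver code, and the length of 16 is only my assumption based on
the initializers of the existing training packets.

```c
#include <stdint.h>
#include <string.h>

/* Stand-ins for ETH_ALEN and ICE_TRAIN_PKT_LEN. The value 16 is an
 * assumption taken from the initializers of the existing training packets.
 */
#define SKETCH_ETH_ALEN		6
#define SKETCH_TRAIN_PKT_LEN	16

/* Hypothetical cut-down rule holding only the field this sketch touches. */
struct sketch_rule {
	uint8_t hdr_data[SKETCH_TRAIN_PKT_LEN];
};

/* Write the destination MAC directly into the rule's dummy packet.
 * The buffer is cleared first so that reusing the same rule for a second
 * address (unicast, then broadcast) leaves no stale bytes behind.
 */
static void sketch_set_rule_mac(struct sketch_rule *rule, const uint8_t *mac)
{
	memset(rule->hdr_data, 0, SKETCH_TRAIN_PKT_LEN);
	memcpy(rule->hdr_data, mac, SKETCH_ETH_ALEN);
}
```

In the patch itself I would expect this to boil down to something like
ether_addr_copy(s_rule->hdr_data, lag->upper_netdev->dev_addr) for the
unicast rule and the same with the broadcast address for the second one,
assuming hdr_data meets ether_addr_copy()'s alignment requirement. If it
does not, a plain memcpy() of ETH_ALEN bytes would do.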
