On Thu, Mar 12, 2026 at 12:13:52PM +0100, Sabrina Dubroca wrote:
> > >> This patch calls netdev_change_features() after
> > >> __netdev_upper_dev_link(),
> > >> Which trigger a NETDEV_FEAT_CHANGE notify and calls rtmsg_ifinfo_event()
> > >> to fill the new link info. Maybe the event is a bit early and macsec has
> > >> data not ready?
> > >
> > > But this would still mean that there's a mismatch between
> > > if_nlmsg_size() and rtnl_fill_ifinfo(), and your patch is only
> > > revealing it.
> > >
> > > I'll send fixes for the stuff I mentioned, no idea if that's what
> > > syzbot saw since we don't have a repro.
> >
> > It looks like even the nipa CI is reproducing the issue, i.e.:
> >
> > https://netdev-ctrl.bots.linux.dev/logs/vmksft/net-dbg/results/554921/17-rtnetlink-sh/
> >
> > more failures here:
> >
> > https://netdev.bots.linux.dev/contest.html?pw-n=0&branch=net-next-2026-03-12--06-00&pw-n=0&pass=0
> >
> > the fail in mascsec-offload looks quite deterministic, could you please
> > have a look?
>
> Ah crap, sorry Hangbin, you were right. macsec_fill_info() returns
> -EMSGSIZE when the key length is unexpected, and at this point we
> haven't set it to its proper value yet.
>
> Bandaid solution would be:
>
> diff --git a/drivers/net/macsec.c b/drivers/net/macsec.c
> index f6cad0746a02..0f7ef7bfbdde 100644
> --- a/drivers/net/macsec.c
> +++ b/drivers/net/macsec.c
> @@ -4337,7 +4337,7 @@ static int macsec_fill_info(struct sk_buff *skb,
> csid = secy->xpn ? MACSEC_CIPHER_ID_GCM_AES_XPN_256 :
> MACSEC_CIPHER_ID_GCM_AES_256;
> break;
> default:
> - goto nla_put_failure;
> + return 0;
> }
>
> if (nla_put_sci(skb, IFLA_MACSEC_SCI, secy->sci,
>
>
> Proper fix (so that the notification we're sending during
> upper_dev_link has full linkinfo) would be to move
> netdev_upper_dev_link() to after macsec_changelink_common() and fix up
> the error handling. I don't see anything in macsec_add_dev or
> macsec_changelink_common that needs the device to be linked. But
If we move the netdev_upper_dev_link() after macsec_changelink_common(),
we will not goto nla_put_failure via default, right?
> anyway it doesn't make sense for macsec_fill_info to return -EMSGSIZE
> on invalid data, so the "bandaid" should be included as well.
>
> Should this be part of this series (either just the "bandaid" or the
> "proper fix"+bandaid), since we never saw a problem before?
Since macsec need the "bandaid" fix either way. How about you post the
"bandaid" fix to net. And I include the "proper fix" in this series for
net-next?
BTW, here is my draft patch, would you like to review it first?
diff --git a/drivers/net/macsec.c b/drivers/net/macsec.c
index f6cad0746a02..1f8da4c291c6 100644
--- a/drivers/net/macsec.c
+++ b/drivers/net/macsec.c
@@ -4161,10 +4161,6 @@ static int macsec_newlink(struct net_device *dev,
lockdep_set_class(&dev->addr_list_lock,
&macsec_netdev_addr_lock_key);
- err = netdev_upper_dev_link(real_dev, dev, extack);
- if (err < 0)
- goto unregister;
-
/* need to be already registered so that ->init has run and
* the MAC addr is set
*/
@@ -4177,12 +4173,12 @@ static int macsec_newlink(struct net_device *dev,
if (rx_handler && sci_exists(real_dev, sci)) {
err = -EBUSY;
- goto unlink;
+ goto unregister;
}
err = macsec_add_dev(dev, sci, icv_len);
if (err)
- goto unlink;
+ goto unregister;
if (data) {
err = macsec_changelink_common(dev, data);
@@ -4190,6 +4186,10 @@ static int macsec_newlink(struct net_device *dev,
goto del_dev;
}
+ err = netdev_upper_dev_link(real_dev, dev, extack);
+ if (err < 0)
+ goto unlink;
+
/* If h/w offloading is available, propagate to the device */
if (macsec_is_offloaded(macsec)) {
const struct macsec_ops *ops;
@@ -4200,7 +4200,7 @@ static int macsec_newlink(struct net_device *dev,
ctx.secy = &macsec->secy;
err = macsec_offload(ops->mdo_add_secy, &ctx);
if (err)
- goto del_dev;
+ goto unlink;
macsec->insert_tx_tag =
macsec_needs_tx_tag(macsec, ops);
@@ -4209,7 +4209,7 @@ static int macsec_newlink(struct net_device *dev,
err = register_macsec_dev(real_dev, dev);
if (err < 0)
- goto del_dev;
+ goto unlink;
netdev_update_features(dev);
netif_stacked_transfer_operstate(real_dev, dev);
@@ -4219,10 +4219,10 @@ static int macsec_newlink(struct net_device *dev,
return 0;
-del_dev:
- macsec_del_dev(macsec);
unlink:
netdev_upper_dev_unlink(real_dev, dev);
+del_dev:
+ macsec_del_dev(macsec);
unregister:
unregister_netdevice(dev);
return err;
Thanks
Hangbin