On Thu, Mar 12, 2026 at 12:13:52PM +0100, Sabrina Dubroca wrote:
> > >> This patch calls netdev_change_features() after 
> > >> __netdev_upper_dev_link(),
> > >> Which trigger a NETDEV_FEAT_CHANGE notify and calls rtmsg_ifinfo_event()
> > >> to fill the new link info. Maybe the event is a bit early and macsec has
> > >> data not ready?
> > > 
> > > But this would still mean that there's a mismatch between
> > > if_nlmsg_size() and rtnl_fill_ifinfo(), and your patch is only
> > > revealing it.
> > > 
> > > I'll send fixes for the stuff I mentioned, no idea if that's what
> > > syzbot saw since we don't have a repro.
> > 
> > It looks like even the nipa CI is reproducing the issue, i.e.:
> > 
> > https://netdev-ctrl.bots.linux.dev/logs/vmksft/net-dbg/results/554921/17-rtnetlink-sh/
> > 
> > more failures here:
> > 
> > https://netdev.bots.linux.dev/contest.html?pw-n=0&branch=net-next-2026-03-12--06-00&pw-n=0&pass=0
> > 
> > the fail in mascsec-offload looks quite deterministic, could you please
> > have a look?
> 
> Ah crap, sorry Hangbin, you were right.  macsec_fill_info() returns
> -EMSGSIZE when the key length is unexpected, and at this point we
> haven't set it to its proper value yet.
> 
> Bandaid solution would be:
> 
> diff --git a/drivers/net/macsec.c b/drivers/net/macsec.c
> index f6cad0746a02..0f7ef7bfbdde 100644
> --- a/drivers/net/macsec.c
> +++ b/drivers/net/macsec.c
> @@ -4337,7 +4337,7 @@ static int macsec_fill_info(struct sk_buff *skb,
>                 csid = secy->xpn ? MACSEC_CIPHER_ID_GCM_AES_XPN_256 : 
> MACSEC_CIPHER_ID_GCM_AES_256;
>                 break;
>         default:
> -               goto nla_put_failure;
> +               return 0;
>         }
>  
>         if (nla_put_sci(skb, IFLA_MACSEC_SCI, secy->sci,
> 
> 
> Proper fix (so that the notification we're sending during
> upper_dev_link has full linkinfo) would be to move
> netdev_upper_dev_link() to after macsec_changelink_common() and fix up
> the error handling. I don't see anything in macsec_add_dev or
> macsec_changelink_common that needs the device to be linked. But

If we move the netdev_upper_dev_link() after macsec_changelink_common(),
we will not goto nla_put_failure via default, right?

> anyway it doesn't make sense for macsec_fill_info to return -EMSGSIZE
> on invalid data, so the "bandaid" should be included as well.
> 
> Should this be part of this series (either just the "bandaid" or the
> "proper fix"+bandaid), since we never saw a problem before?

Since macsec need the "bandaid" fix either way. How about you post the
"bandaid" fix to net. And I include the "proper fix" in this series for
net-next?

BTW, here is my draft patch, would you like to review it first?

diff --git a/drivers/net/macsec.c b/drivers/net/macsec.c
index f6cad0746a02..1f8da4c291c6 100644
--- a/drivers/net/macsec.c
+++ b/drivers/net/macsec.c
@@ -4161,10 +4161,6 @@ static int macsec_newlink(struct net_device *dev,
        lockdep_set_class(&dev->addr_list_lock,
                          &macsec_netdev_addr_lock_key);
 
-       err = netdev_upper_dev_link(real_dev, dev, extack);
-       if (err < 0)
-               goto unregister;
-
        /* need to be already registered so that ->init has run and
         * the MAC addr is set
         */
@@ -4177,12 +4173,12 @@ static int macsec_newlink(struct net_device *dev,
 
        if (rx_handler && sci_exists(real_dev, sci)) {
                err = -EBUSY;
-               goto unlink;
+               goto unregister;
        }
 
        err = macsec_add_dev(dev, sci, icv_len);
        if (err)
-               goto unlink;
+               goto unregister;
 
        if (data) {
                err = macsec_changelink_common(dev, data);
@@ -4190,6 +4186,10 @@ static int macsec_newlink(struct net_device *dev,
                        goto del_dev;
        }
 
+       err = netdev_upper_dev_link(real_dev, dev, extack);
+       if (err < 0)
+               goto unlink;
+
        /* If h/w offloading is available, propagate to the device */
        if (macsec_is_offloaded(macsec)) {
                const struct macsec_ops *ops;
@@ -4200,7 +4200,7 @@ static int macsec_newlink(struct net_device *dev,
                        ctx.secy = &macsec->secy;
                        err = macsec_offload(ops->mdo_add_secy, &ctx);
                        if (err)
-                               goto del_dev;
+                               goto unlink;
 
                        macsec->insert_tx_tag =
                                macsec_needs_tx_tag(macsec, ops);
@@ -4209,7 +4209,7 @@ static int macsec_newlink(struct net_device *dev,
 
        err = register_macsec_dev(real_dev, dev);
        if (err < 0)
-               goto del_dev;
+               goto unlink;
 
        netdev_update_features(dev);
        netif_stacked_transfer_operstate(real_dev, dev);
@@ -4219,10 +4219,10 @@ static int macsec_newlink(struct net_device *dev,
 
        return 0;
 
-del_dev:
-       macsec_del_dev(macsec);
 unlink:
        netdev_upper_dev_unlink(real_dev, dev);
+del_dev:
+       macsec_del_dev(macsec);
 unregister:
        unregister_netdevice(dev);
        return err;

Thanks
Hangbin

Reply via email to