From: Geert Uytterhoeven <ge...@linux-m68k.org> Date: Fri, 11 Sep 2020 08:32:55 +0200
> Hi David, > > On Thu, Sep 10, 2020 at 9:20 PM David Miller <da...@davemloft.net> wrote: >> From: Geert Uytterhoeven <geert+rene...@glider.be> >> Date: Tue, 1 Sep 2020 17:02:37 +0200 >> >> > This reverts commit 124eee3f6955f7aa19b9e6ff5c9b6d37cb3d1e2c. >> > >> > Inami-san reported that this commit breaks bridge support in a Xen >> > environment, and that reverting it fixes this. >> > >> > During system resume, bridge ports are no longer enabled, as that relies >> > on the receipt of the NETDEV_CHANGE notification. This notification is >> > not sent, as netdev_state_change() is no longer called. >> > >> > Note that the condition this commit intended to fix never existed >> > upstream, as the patch triggering it and referenced in the commit was >> > never applied upstream. Hence I can confirm s2ram on r8a73a4/ape6evm >> > and sh73a0/kzm9g works fine before/after this revert. >> > >> > Reported-by Gaku Inami <gaku.inami...@renesas.com> >> > Signed-off-by: Geert Uytterhoeven <geert+rene...@glider.be> >> >> Maybe you cannot reproduce it, but the problem is there and it still >> looks very real to me. >> >> netdev_state_change() does two things: >> >> 1) Emit the NETDEV_CHANGE notification >> >> 2) Emit an rtmsg_ifinfo() netlink message, which in turn tries to access >> the device statistics via ->ndo_get_stats*(). >> >> It is absolutely wrong to do #2 when netif_device_present() is false. >> >> So I cannot apply this patch as-is, sorry. > > Thanks a lot for looking into this! > > But doing #1 is still safe? That is the part that calls into the bridge > code. So would moving the netif_device_present() check from > linkwatch_do_dev() to netdev_state_change(), to prevent doing #2, be > acceptable? I have a better question. Why is a software device like the bridge, that wants to effectively exist and still receive netdev event notifications, marking itself as not present? That's seems like the real bug here.