Re: ieee80211.h virtual_map splat
On Tue, 25 Jun 2024 09:56:35 +0300 Kalle Valo wrote: > > Adding netdev to the initial message in the thread. > > https://lore.kernel.org/all/CAPh3n83zb1PwFBFijJKChBqY95zzpYh=2iPf8tmh=yts6e3...@mail.gmail.com/ > > > > There was some discussion in the thread, with the observation that the > > splat > > is fixed by: > > 2ae5c9248e06 ("wifi: mac80211: Use flexible array in struct > > ieee80211_tim_ie") > > > > Followed by discussion if this should be backported. > > > > Kees said that "netdev [...] maintainers have asked that contributors not > > include "Cc: stable" tags, as they want to evaluate for themselves whether > > patches should go to stable or not" > > BTW this rule doesn't apply to wireless subsystem. For wireless patches > it's ok to "Cc: stable" in patches or anyone can send a request to > stable maintainers to pick a patch. It's an old rule. Quoting documentation: Stable tree ~~~ While it used to be the case that netdev submissions were not supposed to carry explicit ``CC: sta...@vger.kernel.org`` tags that is no longer the case today. Please follow the standard stable rules in :ref:`Documentation/process/stable-kernel-rules.rst `, and make sure you include appropriate Fixes tags! See: https://www.kernel.org/doc/html/next/process/maintainer-netdev.html#stable-tree
Re: [PATCH net-next v6 03/10] net: create a dummy net_device allocator
On Thu, 11 Apr 2024 06:59:27 -0700 Breno Leitao wrote: > +/** > + * alloc_netdev_dummy - Allocate and initialize a dummy net device. > + * @sizeof_priv: size of private data to allocate space for > + */ > +struct net_device *alloc_netdev_dummy(int sizeof_priv) Sorry, one more round :) We started using -Wall for kdoc (./scripts/kernel-doc -Wall $files) recently and it now complains about missing return values...
Re: [PATCH net-next v5 00/10] allocate dummy device dynamically
On Wed, 10 Apr 2024 06:13:41 -0700 Breno Leitao wrote: > wifi: ath11k: allocate dummy net_device dynamically Sorry Breno, I didn't notice earlier, patch 10 didn't make it to the list. The series wasn't ingested by CI and tested because of this. Could you repost? -- pw-bot: cr
Re: [PATCH net-next v4 1/9] net: free_netdev: exit earlier if dummy
On Tue, 9 Apr 2024 05:57:15 -0700 Breno Leitao wrote: > For dummy devices, exit earlier at free_netdev() instead of executing > the whole function. This is necessary, because dummy devices are > special, and shouldn't have the second part of the function executed. > > Otherwise reg_state, which is NETREG_DUMMY for dummy devices, will be > overwritten and there will be no way to identify that this is a dummy > device. Also, this device do not need the final put_device(), since > dummy devices are not registered (through register_netdevice()), where > the device reference is increased (at netdev_register_kobject() -> > device_add()). There's a small fuzz when applying due to the phy topo changes landing, please rebase, the CI didn't ingest it right. -- pw-bot: cr
Re: [PATCH] ath10k: allocate dummy net_device dynamically
On Wed, 27 Mar 2024 08:42:55 -0700 Jeff Johnson wrote: > On 3/27/2024 7:45 AM, Jakub Kicinski wrote: > > On Wed, 27 Mar 2024 07:38:05 -0700 Breno Leitao wrote: > >> -void init_dummy_netdev(struct net_device *dev) > >> +void init_dummy_netdev_core(struct net_device *dev) > > > > Can init_dummy_netdev_core() be a static function (and no export)? > > alloc_netdev_dummy() is probably going to be the only user. > > > > I'd also lean towards squashing the two commits into one. > > And once all of the conversions are complete, won't init_dummy_netdev() no > longer be used and can be removed? That's right, it should be removed. But it may take another release cycle until we can do that, depending on how quickly the patches propagate :(
Re: [PATCH] ath10k: allocate dummy net_device dynamically
On Wed, 27 Mar 2024 07:38:05 -0700 Breno Leitao wrote: > -void init_dummy_netdev(struct net_device *dev) > +void init_dummy_netdev_core(struct net_device *dev) Can init_dummy_netdev_core() be a static function (and no export)? alloc_netdev_dummy() is probably going to be the only user. I'd also lean towards squashing the two commits into one.
Re: [PATCH] ath10k: allocate dummy net_device dynamically
On Fri, 22 Mar 2024 07:58:02 -0700 Breno Leitao wrote: > > Looks like init_dummy_netdev wipes the netdev structure clean, so I > > don't think we can use it directly as the setup function, Breno :( > > Before my patch, init_dummy_netdev was being also used. The patch was > basically replacing the init_dummy_netdev by alloc_netdev() with will > call "setup(dev);" later. > > - init_dummy_netdev(&irq_grp->napi_ndev); > + irq_grp->napi_ndev = alloc_netdev(0, "dummy", > NET_NAME_UNKNOWN, > + init_dummy_netdev); > > I am wondering if alloc_netdev() is messing with something instead of > init_dummy_netdev(). alloc_netdev() allocates some memory and initializes lists which free_netdev() wants to free, basically. But init_dummy_netdev() does: /* Clear everything. Note we don't initialize spinlocks * are they aren't supposed to be taken by any of the * NAPI code and this dummy netdev is supposed to be * only ever used for NAPI polls */ memset(dev, 0, sizeof(struct net_device)); so all those pointers and init alloc_netdev() did before calling setup will get wiped. > Also, Kalle's crash is during rmmod, and not during initialization. > getting NULL after free_netdev() is called. > > > Maybe we should add a new helper to "alloc dummy netdev" which can > > call alloc_netdev() with right arguments and do necessary init? > > What are the right arguments in this case? I'm not sure we have a noop setup() callback today. If you define a wrapper to allocate a dummy netdev you can define a new empty function next to it and pass that as init? Hope I got the question right.
Re: [PATCH] ath10k: allocate dummy net_device dynamically
On Thu, 21 Mar 2024 15:02:39 -0700 Jeff Johnson wrote: > >> As suggested there we should just use kmalloc/kfree to match the existing > >> logic. > > > > Please no. There is no magic here. alloc + free must match whether > > you're using magic object alloc wrapper (alloc_netdev()) or straight > > up kzalloc(). > > Based upon the ath11k patch there must be something going on with > alloc_netdev()/free_netdev() that doesn't occur when these aren't used. Looks like init_dummy_netdev wipes the netdev structure clean, so I don't think we can use it directly as the setup function, Breno :( Maybe we should add a new helper to "alloc dummy netdev" which can call alloc_netdev() with right arguments and do necessary init? > So I'm just suggesting that instead we use kmalloc() and kfree(), which are > matching functions, and which, like the existing code, are not subject to > whatever is happening in alloc_netdev()/free_netdev(). > > I don't understand your objection. Using subsystem APIs to allocate its objects is preferable to ad hoc kmalloc(). We're working upstream, basic alloc/free of an object should work. Took me 5 min to realize what the problem is.
Re: [PATCH] ath10k: allocate dummy net_device dynamically
On Wed, 20 Mar 2024 08:12:46 -0700 Jeff Johnson wrote: > NAK this based upon the ath11k patch results. The ath11 patch is much more complex, I'd wager this one is fine. > As suggested there we should just use kmalloc/kfree to match the existing > logic. Please no. There is no magic here. alloc + free must match whether you're using magic object alloc wrapper (alloc_netdev()) or straight up kzalloc().
Re: [net-next PATCH v2 3/4] netdev: replace napi_reschedule with napi_schedule
On Thu, 5 Oct 2023 18:11:56 +0200 Eric Dumazet wrote: > OK, but I suspect some users of napi_reschedule() might not be race-free... What's the race you're thinking of? ___ ath10k mailing list ath10k@lists.infradead.org http://lists.infradead.org/mailman/listinfo/ath10k
Re: [net-next PATCH 1/4] netdev: replace simple napi_schedule_prep/__napi_schedule to napi_schedule
On Mon, 2 Oct 2023 17:10:20 +0200 Christian Marangi wrote: > queue_work(priv->xfer_wq, &priv->rx_work); > - else if (napi_schedule_prep(&priv->napi)) > - __napi_schedule(&priv->napi); > + else > + napi_schedule(&priv->napi) Missing semi-colon, please make sure each patch builds cleanly with allmodconfig. -- pw-bot: cr ___ ath10k mailing list ath10k@lists.infradead.org http://lists.infradead.org/mailman/listinfo/ath10k
Re: [PATCH v2] wifi: ath10k: Fix return value in ath10k_pci_init()
On Thu, 10 Nov 2022 11:12:28 +0800 Xiu Jianfeng wrote: > To: , , , > , , > CC: , , > How did you come up with this CC list? If you're CCing netdev maintainers you should also CC net...@vger.kernel.org Somehow you managed to CC LMKL but not netdev :S ___ ath10k mailing list ath10k@lists.infradead.org http://lists.infradead.org/mailman/listinfo/ath10k
Re: [PATCH wireless-next 2/3] wifi: ath10k: remove a copy of the NAPI_POLL_WEIGHT define
On Fri, 29 Apr 2022 22:06:20 +0300 Kalle Valo wrote: > This failed to build as recently usb.c got napi poll as well, I fixed it > in the pending branch: > > https://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git/commit/?h=pending&id=89fc2e14d3e50cad605104572228d3740df7ca77 Sorry & thanks! ___ ath10k mailing list ath10k@lists.infradead.org http://lists.infradead.org/mailman/listinfo/ath10k
[PATCH wireless-next 2/3] wifi: ath10k: remove a copy of the NAPI_POLL_WEIGHT define
Defining local versions of NAPI_POLL_WEIGHT with the same values in the drivers just makes refactoring harder. Signed-off-by: Jakub Kicinski --- CC: kv...@kernel.org CC: ath10k@lists.infradead.org CC: linux-wirel...@vger.kernel.org --- drivers/net/wireless/ath/ath10k/core.h | 3 --- drivers/net/wireless/ath/ath10k/pci.c | 2 +- drivers/net/wireless/ath/ath10k/sdio.c | 2 +- drivers/net/wireless/ath/ath10k/snoc.c | 2 +- 4 files changed, 3 insertions(+), 6 deletions(-) diff --git a/drivers/net/wireless/ath/ath10k/core.h b/drivers/net/wireless/ath/ath10k/core.h index 9f6680b3be0a..8bfabbcfdb14 100644 --- a/drivers/net/wireless/ath/ath10k/core.h +++ b/drivers/net/wireless/ath/ath10k/core.h @@ -59,9 +59,6 @@ #define ATH10K_KEEPALIVE_MAX_IDLE 3895 #define ATH10K_KEEPALIVE_MAX_UNRESPONSIVE 3900 -/* NAPI poll budget */ -#define ATH10K_NAPI_BUDGET 64 - /* SMBIOS type containing Board Data File Name Extension */ #define ATH10K_SMBIOS_BDF_EXT_TYPE 0xF8 diff --git a/drivers/net/wireless/ath/ath10k/pci.c b/drivers/net/wireless/ath/ath10k/pci.c index 4d4e2f91e15c..bf1c938be7d0 100644 --- a/drivers/net/wireless/ath/ath10k/pci.c +++ b/drivers/net/wireless/ath/ath10k/pci.c @@ -3216,7 +3216,7 @@ static void ath10k_pci_free_irq(struct ath10k *ar) void ath10k_pci_init_napi(struct ath10k *ar) { netif_napi_add(&ar->napi_dev, &ar->napi, ath10k_pci_napi_poll, - ATH10K_NAPI_BUDGET); + NAPI_POLL_WEIGHT); } static int ath10k_pci_init_irq(struct ath10k *ar) diff --git a/drivers/net/wireless/ath/ath10k/sdio.c b/drivers/net/wireless/ath/ath10k/sdio.c index 73693c66cef1..24283c02a5ef 100644 --- a/drivers/net/wireless/ath/ath10k/sdio.c +++ b/drivers/net/wireless/ath/ath10k/sdio.c @@ -2532,7 +2532,7 @@ static int ath10k_sdio_probe(struct sdio_func *func, } netif_napi_add(&ar->napi_dev, &ar->napi, ath10k_sdio_napi_poll, - ATH10K_NAPI_BUDGET); + NAPI_POLL_WEIGHT); ath10k_dbg(ar, ATH10K_DBG_BOOT, "sdio new func %d vendor 0x%x device 0x%x block 0x%x/0x%x\n", diff --git a/drivers/net/wireless/ath/ath10k/snoc.c b/drivers/net/wireless/ath/ath10k/snoc.c index 8328966a0471..607e8164bf98 100644 --- a/drivers/net/wireless/ath/ath10k/snoc.c +++ b/drivers/net/wireless/ath/ath10k/snoc.c @@ -1243,7 +1243,7 @@ static int ath10k_snoc_napi_poll(struct napi_struct *ctx, int budget) static void ath10k_snoc_init_napi(struct ath10k *ar) { netif_napi_add(&ar->napi_dev, &ar->napi, ath10k_snoc_napi_poll, - ATH10K_NAPI_BUDGET); + NAPI_POLL_WEIGHT); } static int ath10k_snoc_request_irq(struct ath10k *ar) -- 2.34.1 ___ ath10k mailing list ath10k@lists.infradead.org http://lists.infradead.org/mailman/listinfo/ath10k
Re: [PATCH] ath10k: enable threaded napi on ath10k driver
On Tue, 14 Dec 2021 22:39:36 + Abhishek Kumar wrote: > NAPI poll can be done in threaded context along with soft irq > context. Threaded context can be scheduled efficiently, thus > creating less of bottleneck during Rx processing. This patch is > to enable threaded NAPI on ath10k driver. You need to explain in more detail what you mean by "can be scheduled efficiently". mt76 had an issue where Rx and Tx would use the same IRQ and threaded NAPI allowed them to be run on separate cores. What's the challenge for ath10k HW? ___ ath10k mailing list ath10k@lists.infradead.org http://lists.infradead.org/mailman/listinfo/ath10k
Re: [Bug 215129] New: Linux kernel hangs during power down
On Thu, 25 Nov 2021 08:32:18 +0100 Heiner Kallweit wrote: > I think the reference to ath10k_pci is misleading, Kalle isn't needed here. > The actual issue is a RTNL deadlock in igb_resume(). See log snippet: > > Nov 24 18:56:19 MartinsPc kernel: igb_resume+0xff/0x1e0 [igb > 21bf6a00cb1f20e9b0e8434f7f8748a0504e93f8] > Nov 24 18:56:19 MartinsPc kernel: pci_pm_runtime_resume+0xa7/0xd0 > Nov 24 18:56:19 MartinsPc kernel: ? pci_pm_freeze_noirq+0x110/0x110 > Nov 24 18:56:19 MartinsPc kernel: __rpm_callback+0x41/0x120 > Nov 24 18:56:19 MartinsPc kernel: ? pci_pm_freeze_noirq+0x110/0x110 > Nov 24 18:56:19 MartinsPc kernel: rpm_callback+0x35/0x70 > Nov 24 18:56:19 MartinsPc kernel: rpm_resume+0x567/0x810 > Nov 24 18:56:19 MartinsPc kernel: __pm_runtime_resume+0x4a/0x80 > Nov 24 18:56:19 MartinsPc kernel: dev_ethtool+0xd4/0x2d80 > > We have at least two places in net core where runtime_resume() is called > under RTNL. This conflicts with the current structure in few Intel drivers > that have something like the following in their resume path. > > rtnl_lock(); > if (!err && netif_running(netdev)) > err = __igb_open(netdev, true); > > if (!err) > netif_device_attach(netdev); > rtnl_unlock(); > > Other drivers don't do this, so it's the question whether it's actually > needed here to take RTNL. Some discussion was started [0], but it ended > w/o tangible result and since then it has been surprisingly quiet. > > [0] https://www.spinics.net/lists/netdev/msg736880.html Ah, that makes perfect sense, I didn't see that stack trace. Dropping Kalle from CC. Let's hear what Intel folks have to say.. ___ ath10k mailing list ath10k@lists.infradead.org http://lists.infradead.org/mailman/listinfo/ath10k
Re: [Bug 215129] New: Linux kernel hangs during power down
Adding Kalle and Hainer. On Wed, 24 Nov 2021 14:45:05 -0800 Stephen Hemminger wrote: > Begin forwarded message: > > Date: Wed, 24 Nov 2021 21:14:53 + > From: bugzilla-dae...@bugzilla.kernel.org > To: step...@networkplumber.org > Subject: [Bug 215129] New: Linux kernel hangs during power down > > > https://bugzilla.kernel.org/show_bug.cgi?id=215129 > > Bug ID: 215129 >Summary: Linux kernel hangs during power down >Product: Networking >Version: 2.5 > Kernel Version: 5.15 > Hardware: All > OS: Linux > Tree: Mainline > Status: NEW > Severity: normal > Priority: P1 > Component: Other > Assignee: step...@networkplumber.org > Reporter: martin.sto...@gmail.com > Regression: No > > Created attachment 299703 > --> https://bugzilla.kernel.org/attachment.cgi?id=299703&action=edit > Kernel log after timeout occured > > On my system the kernel is waiting for a task during shutdown which doesn't > complete. > > The commit which causes this behavior is: > [f32a213765739f2a1db319346799f130a3d08820] ethtool: runtime-resume netdev > parent before ethtool ioctl ops > > This bug causes also that the system gets unresponsive after starting Steam: > https://steamcommunity.com/app/221410/discussions/2/3194736442566303600/ > ___ ath10k mailing list ath10k@lists.infradead.org http://lists.infradead.org/mailman/listinfo/ath10k
Re: [PATCH net-next 1/2] soc: qcom: ipa: Constify static qmi structs
On Wed, 25 Nov 2020 15:45:05 -0600 Alex Elder wrote: > On 11/22/20 5:40 PM, Rikard Falkeborn wrote: > > These are only used as input arguments to qmi_handle_init() which > > accepts const pointers to both qmi_ops and qmi_msg_handler. Make them > > const to allow the compiler to put them in read-only memory. > > > > Signed-off-by: Rikard Falkeborn > > Acked-by: Alex Elder Applied to net-next, thanks! ___ ath10k mailing list ath10k@lists.infradead.org http://lists.infradead.org/mailman/listinfo/ath10k
Re: [PATCH net-next 1/2] soc: qcom: ipa: Constify static qmi structs
On Mon, 23 Nov 2020 00:40:30 +0100 Rikard Falkeborn wrote: > These are only used as input arguments to qmi_handle_init() which > accepts const pointers to both qmi_ops and qmi_msg_handler. Make them > const to allow the compiler to put them in read-only memory. > > Signed-off-by: Rikard Falkeborn I can take this one if Alex acks it. The other patch is probably best handled by Kalle. ___ ath10k mailing list ath10k@lists.infradead.org http://lists.infradead.org/mailman/listinfo/ath10k
Re: [PATCH net v2 00/21] net: avoid to remove module when its debugfs is being used
On Sat, 7 Nov 2020 17:21:31 + Taehee Yoo wrote: > When debugfs file is opened, its module should not be removed until > it's closed. > Because debugfs internally uses the module's data. > So, it could access freed memory. > > In order to avoid panic, it just sets .owner to THIS_MODULE. > So that all modules will be held when its debugfs file is opened. Hm, looks like some of the patches need to be revised because .owner is already set in the ops, and a warning gets generated. Also it'd be good to mention why Johannes's approach was abandoned. When you repost please separate out all the patches for drivers/net/wireless/ and send that to Kalle's wireless drivers tree. Patch 1 needs to be split in two. Patches 2 and 3 would go via Johannes. The wimax patch needs to go to staging (wimax code has been moved). The remaining patches can be posted individually, not as a series. ___ ath10k mailing list ath10k@lists.infradead.org http://lists.infradead.org/mailman/listinfo/ath10k
Re: [PATCH v2 0/3]
On Fri, 31 Jul 2020 23:57:19 +0530 Rakesh Pillai wrote: > The history recording will be compiled only if > ATH10K_DEBUG is enabled, and also enabled via > the module parameter. Once the history recording > is enabled via module parameter, it can be enabled > or disabled runtime via debugfs. Have you seen the trace_devlink_hwmsg() interface? Could it be used here? ___ ath10k mailing list ath10k@lists.infradead.org http://lists.infradead.org/mailman/listinfo/ath10k
Re: [RFC 0/7] Add support to process rx packets in thread
On Tue, 21 Jul 2020 19:25:14 +0200 Andrew Lunn wrote: > On Tue, Jul 21, 2020 at 10:44:19PM +0530, Rakesh Pillai wrote: > > NAPI gets scheduled on the CPU core which got the > > interrupt. The linux scheduler cannot move it to a > > different core, even if the CPU on which NAPI is running > > is heavily loaded. This can lead to degraded wifi > > performance when running traffic at peak data rates. > > > > A thread on the other hand can be moved to different > > CPU cores, if the one on which its running is heavily > > loaded. During high incoming data traffic, this gives > > better performance, since the thread can be moved to a > > less loaded or sometimes even a more powerful CPU core > > to account for the required CPU performance in order > > to process the incoming packets. > > > > This patch series adds the support to use a high priority > > thread to process the incoming packets, as opposed to > > everything being done in NAPI context. > > I don't see why this problem is limited to the ath10k driver. I expect > it applies to all drivers using NAPI. So shouldn't you be solving this > in the NAPI core? Allow a driver to request the NAPI core uses a > thread? Agreed, this is a problem we have with all drivers today. We see seriously sub-optimal behavior in data center workloads, because kernel overloads the cores doing packet processing. I think the fix may actually be in the scheduler. AFAIU the scheduler counts the softIRQ time towards the interrupted process, and on top of that tries to move processes to the cores handling their IO. In the end the configuration which works somewhat okay is when each core has its own IRQ and queues, which is seriously sub-optimal. ___ ath10k mailing list ath10k@lists.infradead.org http://lists.infradead.org/mailman/listinfo/ath10k
Re: [RFC 1/2] devlink: add simple fw crash helpers
On Fri, 22 May 2020 22:46:07 +0200 Johannes Berg wrote: > > The irony is you have a problem with a networking device and all the > > devices your initial set touched are networking. Two of the drivers > > you touched either have or will soon have devlink health reporters > > implemented. > > Like I said above, do you think it'd be feasible to make a devcoredump > out of devlink health reports? And can the report be in a way that we > control the file format, or are there limits? I guess I should read the > code to find out, but I figure you probably just know. But feel free to > tell me to read it :) > > The reason I'm asking is that it's starting to sound like we really > ought to be implementing devlink, but we've got a bunch of > infrastructure that uses the devcoredump, and it'll take time > (significantly so) to change all that... In devlink world pure FW core dumps are exposed by devlink regions. An API allowing reading device memory, registers, etc., but also creating dumps of memory regions when things go wrong. It should be a fairly straightforward migration. Devlink health is more targeted, the dump is supposed to contain only relevant information, selected and formatted by the driver. When device misbehaves driver reads the relevant registers and FW state and creates a formatted state dump. I believe each element of the dump must fit into a netlink message (but there may be multiple elements, see devlink_fmsg_prepare_skb()). We should be able to convert dl-regions dumps and dl-health dumps into devcoredumps, but since health reporters are supposed to be more targeted there's usually multiple of them per device. Conversely devcoredumps can be trivially exposed as dl-region dumps, but I believe dl-health would require driver changes to format the information appropriately. ___ ath10k mailing list ath10k@lists.infradead.org http://lists.infradead.org/mailman/listinfo/ath10k
Re: [RFC 1/2] devlink: add simple fw crash helpers
On Fri, 22 May 2020 05:20:46 + Luis Chamberlain wrote: > > diff --git a/net/core/Makefile b/net/core/Makefile > > index 3e2c378e5f31..6f1513781c17 100644 > > --- a/net/core/Makefile > > +++ b/net/core/Makefile > > @@ -31,7 +31,7 @@ obj-$(CONFIG_LWTUNNEL_BPF) += lwt_bpf.o > > obj-$(CONFIG_BPF_STREAM_PARSER) += sock_map.o > > obj-$(CONFIG_DST_CACHE) += dst_cache.o > > obj-$(CONFIG_HWBM) += hwbm.o > > -obj-$(CONFIG_NET_DEVLINK) += devlink.o > > +obj-$(CONFIG_NET_DEVLINK) += devlink.o devlink_simple_fw_reporter.o > > This was looking super sexy up to here. This is networking specific. > We want something generic for *anything* that requests firmware. You can't be serious. It's network specific because of how the Kconfig is named? Working for a company operating large data centers I would strongly prefer if we didn't have ten different ways of reporting firmware problems in the fleet. > I'm afraid this won't work for something generic. I don't think its > throw-away work though, the idea to provide a generic interface to > dump firmware through netlink might be nice for networking, or other > things. > > But I have a feeling we'll want something still more generic than this. Please be specific. Saying generic a lot is not helpful. The code (as you can see in this patch) is in no way network specific. Or are you saying there are machines out there running without netlink sockets? > So networking may want to be aware that a firmware crash happened as > part of this network device health thing, but firmware crashing is a > generic thing. > > I have now extended my patch set to include uvents and I am more set on > that we need the taint now more than ever. Please expect my nack if you're trying to add this to networking drivers. The irony is you have a problem with a networking device and all the devices your initial set touched are networking. Two of the drivers you touched either have or will soon have devlink health reporters implemented. ___ ath10k mailing list ath10k@lists.infradead.org http://lists.infradead.org/mailman/listinfo/ath10k
[RFC 2/2] i2400m: use devlink health reporter
It builds. Signed-off-by: Jakub Kicinski --- drivers/net/wimax/i2400m/rx.c | 2 ++ drivers/net/wimax/i2400m/usb.c | 5 + 2 files changed, 7 insertions(+) diff --git a/drivers/net/wimax/i2400m/rx.c b/drivers/net/wimax/i2400m/rx.c index c9fb619a9e01..cc7fe78f2df0 100644 --- a/drivers/net/wimax/i2400m/rx.c +++ b/drivers/net/wimax/i2400m/rx.c @@ -144,6 +144,7 @@ * i2400m_msg_size_check * wimax_msg */ +#include #include #include #include @@ -712,6 +713,7 @@ void __i2400m_roq_queue(struct i2400m *i2400m, struct i2400m_roq *roq, dev_err(dev, "SW BUG? failed to insert packet\n"); dev_err(dev, "ERX: roq %p [ws %u] skb %p nsn %d sn %u\n", roq, roq->ws, skb, nsn, roq_data->sn); + devlink_simple_fw_reporter_report_crash(dev); skb_queue_walk(&roq->queue, skb_itr) { roq_data_itr = (struct i2400m_roq_data *) &skb_itr->cb; nsn_itr = __i2400m_roq_nsn(roq, roq_data_itr->sn); diff --git a/drivers/net/wimax/i2400m/usb.c b/drivers/net/wimax/i2400m/usb.c index 9659f9e1aaa6..5c811dccbf1d 100644 --- a/drivers/net/wimax/i2400m/usb.c +++ b/drivers/net/wimax/i2400m/usb.c @@ -49,6 +49,7 @@ * usb_reset_device() */ #include "i2400m-usb.h" +#include #include #include #include @@ -423,6 +424,8 @@ int i2400mu_probe(struct usb_interface *iface, if (usb_dev->speed != USB_SPEED_HIGH) dev_err(dev, "device not connected as high speed\n"); + devlink_simple_fw_reporter_prepare(dev); + /* Allocate instance [calls i2400m_netdev_setup() on it]. */ result = -ENOMEM; net_dev = alloc_netdev(sizeof(*i2400mu), "wmx%d", NET_NAME_UNKNOWN, @@ -506,6 +509,7 @@ int i2400mu_probe(struct usb_interface *iface, usb_put_dev(i2400mu->usb_dev); free_netdev(net_dev); error_alloc_netdev: + devlink_simple_fw_reporter_cleanup(dev); return result; } @@ -532,6 +536,7 @@ void i2400mu_disconnect(struct usb_interface *iface) usb_set_intfdata(iface, NULL); usb_put_dev(i2400mu->usb_dev); free_netdev(net_dev); + devlink_simple_fw_reporter_cleanup(dev); d_fnend(3, dev, "(iface %p i2400m %p) = void\n", iface, i2400m); } -- 2.25.4 ___ ath10k mailing list ath10k@lists.infradead.org http://lists.infradead.org/mailman/listinfo/ath10k
[RFC 1/2] devlink: add simple fw crash helpers
Add infra for creating devlink instances for a device to report fw crashes. This patch expects the devlink instance to be registered at probe time. I belive to be the cleanest. We can also add a devm version of the helpers, so that we don't have to do the clean up. Or we can go even further and register the devlink instance only once error has happened (for the first time, then we can just find out if already registered by traversing the list like we do here). With the patch applied and a sample driver converted we get: $ devlink dev pci/:07:00.0 Then monitor for errors: $ devlink mon health [health,status] pci/:07:00.0: reporter fw state error error 1 recover 0 [health,status] pci/:07:00.0: reporter fw state error error 2 recover 0 These are the events I triggered on purpose. One can also inspect the health of all devices capable of reporting fw errors: $ devlink health pci/:07:00.0: reporter fw state error error 7 recover 0 Obviously drivers may upgrade to the full devlink health API which includes state dump, state dump auto-collect and automatic error recovery control. Signed-off-by: Jakub Kicinski --- include/linux/devlink.h | 11 +++ net/core/Makefile | 2 +- net/core/devlink_simple_fw_reporter.c | 101 ++ 3 files changed, 113 insertions(+), 1 deletion(-) create mode 100644 include/linux/devlink.h create mode 100644 net/core/devlink_simple_fw_reporter.c diff --git a/include/linux/devlink.h b/include/linux/devlink.h new file mode 100644 index ..2b73987eefca --- /dev/null +++ b/include/linux/devlink.h @@ -0,0 +1,11 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later */ +#ifndef _LINUX_DEVLINK_H_ +#define _LINUX_DEVLINK_H_ + +struct device; + +void devlink_simple_fw_reporter_prepare(struct device *dev); +void devlink_simple_fw_reporter_cleanup(struct device *dev); +void devlink_simple_fw_reporter_report_crash(struct device *dev); + +#endif diff --git a/net/core/Makefile b/net/core/Makefile index 3e2c378e5f31..6f1513781c17 100644 --- a/net/core/Makefile +++ b/net/core/Makefile @@ -31,7 +31,7 @@ obj-$(CONFIG_LWTUNNEL_BPF) += lwt_bpf.o obj-$(CONFIG_BPF_STREAM_PARSER) += sock_map.o obj-$(CONFIG_DST_CACHE) += dst_cache.o obj-$(CONFIG_HWBM) += hwbm.o -obj-$(CONFIG_NET_DEVLINK) += devlink.o +obj-$(CONFIG_NET_DEVLINK) += devlink.o devlink_simple_fw_reporter.o obj-$(CONFIG_GRO_CELLS) += gro_cells.o obj-$(CONFIG_FAILOVER) += failover.o obj-$(CONFIG_BPF_SYSCALL) += bpf_sk_storage.o diff --git a/net/core/devlink_simple_fw_reporter.c b/net/core/devlink_simple_fw_reporter.c new file mode 100644 index ..48dde9123c3c --- /dev/null +++ b/net/core/devlink_simple_fw_reporter.c @@ -0,0 +1,101 @@ +#include +#include +#include +#include + +struct devlink_simple_fw_reporter { + struct list_head list; + struct devlink_health_reporter *reporter; +}; + + +static LIST_HEAD(devlink_simple_fw_reporters); +static DEFINE_MUTEX(devlink_simple_fw_reporters_mutex); + +static const struct devlink_health_reporter_ops simple_devlink_health = { + .name = "fw", +}; + +static const struct devlink_ops simple_devlink_ops = { +}; + +static struct devlink_simple_fw_reporter * +devlink_simple_fw_reporter_find_for_dev(struct device *dev) +{ + struct devlink_simple_fw_reporter *simple_devlink, *ret = NULL; + struct devlink *devlink; + + mutex_lock(&devlink_simple_fw_reporters_mutex); + list_for_each_entry(simple_devlink, &devlink_simple_fw_reporters, + list) { + devlink = priv_to_devlink(simple_devlink); + if (devlink->dev == dev) { + ret = simple_devlink; + break; + } + } + mutex_unlock(&devlink_simple_fw_reporters_mutex); + + return ret; +} + +void devlink_simple_fw_reporter_report_crash(struct device *dev) +{ + struct devlink_simple_fw_reporter *simple_devlink; + + simple_devlink = devlink_simple_fw_reporter_find_for_dev(dev); + if (!simple_devlink) + return; + + devlink_health_report(simple_devlink->reporter, "firmware crash", NULL); +} +EXPORT_SYMBOL_GPL(devlink_simple_fw_reporter_report_crash); + +void devlink_simple_fw_reporter_prepare(struct device *dev) +{ + struct devlink_simple_fw_reporter *simple_devlink; + struct devlink *devlink; + + devlink = devlink_alloc(&simple_devlink_ops, + sizeof(struct devlink_simple_fw_reporter)); + if (!devlink) + return; + + if (devlink_register(devlink, dev)) + goto err_free; + + simple_devlink = devlink_priv(devlink); + simple_devlink->reporter = + devlink_health_reporter_create(devlink, &simple_devlink_health, + 0, NULL); + if (IS_ERR(simple_d
Re: [PATCH v2 12/15] ath10k: use new module_firmware_crashed()
On Mon, 18 May 2020 21:22:02 + Luis Chamberlain wrote: > Indeed my issue with devlink is that it did not seem generic enough for > all devices which use firmware and for which firmware can crash. Support > should not have to be "add devlink support" + "now use this new hook", > but rather a very lighweight devlink_crash(device) call we can sprinkly > with only the device as a functional requirement. We can provide a lightweight devlink_crash(device) which only generates the notification, without the need to register the health reporter or a devlink instance upfront. But then we loose the ability to control the recovery, count errors, etc. So I'd think that's not the direction we want to go in. ___ ath10k mailing list ath10k@lists.infradead.org http://lists.infradead.org/mailman/listinfo/ath10k
Re: [PATCH v2 12/15] ath10k: use new module_firmware_crashed()
On Mon, 18 May 2020 22:41:48 +0200 Johannes Berg wrote: > On Mon, 2020-05-18 at 13:35 -0700, Jakub Kicinski wrote: > > It's intended to be a generic netlink channel for configuring devices. > > > > All the firmware-related interfaces have no dependencies on netdevs, > > in fact that's one of the reasons we moved to devlink - we don't want > > to hold rtnl lock just for talking to device firmware. > > Sounds good :) > > So I guess Luis just has to add some way in devlink to hook up devlink > health in a simple way to drivers, perhaps? I mean, many drivers won't > really want to use devlink for anything else, so I guess it should be as > simple as the API that Luis proposed ("firmware crashed for this struct > device"), if nothing more interesting is done with devlink? > > Dunno. But anyway sounds like it should somehow integrate there rather > than the way this patchset proposed? Right, that'd be great. Simple API to register a devlink instance with whatever number of reporters the device would need. I'm happy to help. ___ ath10k mailing list ath10k@lists.infradead.org http://lists.infradead.org/mailman/listinfo/ath10k
Re: [PATCH v2 12/15] ath10k: use new module_firmware_crashed()
On Mon, 18 May 2020 22:29:53 +0200 Johannes Berg wrote: > On Mon, 2020-05-18 at 13:28 -0700, Jakub Kicinski wrote: > > On Mon, 18 May 2020 21:25:09 +0200 Johannes Berg wrote: > > > It's pretty clear, but even then, first of all I doubt this is the case > > > for many of the places that you've sprinkled the annotation on, and > > > secondly it actually hides useful information. > > > > > > Regardless of the support issue, I think this hiding of information is > > > also problematic. > > > > > > I really think we'd all be better off if you just made a sysfs file (I > > > mistyped debugfs in some other email, sorry, apparently you didn't see > > > the correction in time) that listed which device(s) crashed and how many > > > times. That would actually be useful. Because honestly, if a random > > > device crashed for some random reason, that's pretty much a non-event. > > > If it keeps happening, then we might even want to know about it. > > > > Johannes - have you seen devlink health? I think we should just use > > that interface, since it supports all the things you're requesting, > > rather than duplicate it in sysfs. > > I haven't, and I'm glad to hear that's there, sounds good! > > I suspect that Luis wants something more generic though, that isn't just > applicable to netdevices, unless devlink grew some kind of non-netdev > stuff while I wasn't looking? :) It's intended to be a generic netlink channel for configuring devices. All the firmware-related interfaces have no dependencies on netdevs, in fact that's one of the reasons we moved to devlink - we don't want to hold rtnl lock just for talking to device firmware. ___ ath10k mailing list ath10k@lists.infradead.org http://lists.infradead.org/mailman/listinfo/ath10k
Re: [PATCH v2 12/15] ath10k: use new module_firmware_crashed()
On Mon, 18 May 2020 21:25:09 +0200 Johannes Berg wrote: > It's pretty clear, but even then, first of all I doubt this is the case > for many of the places that you've sprinkled the annotation on, and > secondly it actually hides useful information. > > Regardless of the support issue, I think this hiding of information is > also problematic. > > I really think we'd all be better off if you just made a sysfs file (I > mistyped debugfs in some other email, sorry, apparently you didn't see > the correction in time) that listed which device(s) crashed and how many > times. That would actually be useful. Because honestly, if a random > device crashed for some random reason, that's pretty much a non-event. > If it keeps happening, then we might even want to know about it. Johannes - have you seen devlink health? I think we should just use that interface, since it supports all the things you're requesting, rather than duplicate it in sysfs. ___ ath10k mailing list ath10k@lists.infradead.org http://lists.infradead.org/mailman/listinfo/ath10k
Re: [PATCH 1/3] ath9k: remove stray backslash in Makefile
On Mon, 27 Nov 2017 18:56:21 +0100, Matthias Schiffer wrote: > Signed-off-by: Matthias Schiffer > --- > drivers/net/wireless/ath/ath9k/Makefile | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/net/wireless/ath/ath9k/Makefile > b/drivers/net/wireless/ath/ath9k/Makefile > index 36a40ffdce15..90e4a341076c 100644 > --- a/drivers/net/wireless/ath/ath9k/Makefile > +++ b/drivers/net/wireless/ath/ath9k/Makefile > @@ -59,7 +59,7 @@ obj-$(CONFIG_ATH9K_HW) += ath9k_hw.o > obj-$(CONFIG_ATH9K_COMMON) += ath9k_common.o > ath9k_common-y:= common.o \ > common-init.o \ > - common-beacon.o \ > + common-beacon.o > It's not necessarily stray, there is nothing on the next line so it's OK, and if you add \ at the end of all lines, you don't have to touch the last line every time you add/remove something. Sort of like putting a , after last enum value. Just my $.02, not that I mind either way :) > ath9k_common-$(CONFIG_ATH9K_COMMON_DEBUG) += common-debug.o \ >common-spectral.o ___ ath10k mailing list ath10k@lists.infradead.org http://lists.infradead.org/mailman/listinfo/ath10k