Re: [PATCH 09/10] wireless: hostap: remove unused ioctl function
Arnd Bergmann writes: > From: Arnd Bergmann > > The ioctl handler has no actual callers in the kernel and is useless. > All the functionality should be reachable through the regualar interfaces. > > Signed-off-by: Arnd Bergmann In the title we prefer "wifi:" over "wireless:" but that's nitpicking. I assume this goes via a net tree so: Acked-by: Kalle Valo Let me know if I should take this to wireless-next instead. -- https://patchwork.kernel.org/project/linux-wireless/list/ https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches
Re: [PATCH 08/10] wireless: atmel: remove unused ioctl function
Arnd Bergmann writes: > From: Arnd Bergmann > > This function has no callers, and for the past 20 years, the request_firmware > interface has been in place instead of the custom firmware loader. > > Signed-off-by: Arnd Bergmann Yuck, good riddance. In the title we prefer "wifi:" over "wireless:" but that's nitpicking. I assume this goes via a net tree so: Acked-by: Kalle Valo Let me know if I should take this to wireless-next instead. -- https://patchwork.kernel.org/project/linux-wireless/list/ https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches
Re: [PATCH v1] Ftrace: make sched_wakeup can focus on the target process
On 10/9/23 17:37, Jinyu Tang wrote: > $ cyclictest --mlockall --smp --priority=99 & rtla timerlat -a will give you an structured analysis of your latency... https://bristot.me/linux-scheduling-latency-debug-and-analysis/ -- Daniel
Re: [PATCH] remoteproc: st: Use device_get_match_data()
On 10/9/23 23:13, Rob Herring wrote: > Use preferred device_get_match_data() instead of of_match_device() to > get the driver match data. With this, adjust the includes to explicitly > include the correct headers. > > Signed-off-by: Rob Herring > --- > drivers/remoteproc/st_remoteproc.c | 14 -- > 1 file changed, 4 insertions(+), 10 deletions(-) > > diff --git a/drivers/remoteproc/st_remoteproc.c > b/drivers/remoteproc/st_remoteproc.c > index e3ce01d98b4c..b0638f984842 100644 > --- a/drivers/remoteproc/st_remoteproc.c > +++ b/drivers/remoteproc/st_remoteproc.c > @@ -16,10 +16,9 @@ > #include > #include > #include > -#include > -#include > #include > #include > +#include > #include > #include > #include > @@ -341,7 +340,6 @@ static int st_rproc_parse_dt(struct platform_device *pdev) > static int st_rproc_probe(struct platform_device *pdev) > { > struct device *dev = &pdev->dev; > - const struct of_device_id *match; > struct st_rproc *ddata; > struct device_node *np = dev->of_node; > struct rproc *rproc; > @@ -349,19 +347,15 @@ static int st_rproc_probe(struct platform_device *pdev) > int enabled; > int ret, i; > > - match = of_match_device(st_rproc_match, dev); > - if (!match || !match->data) { > - dev_err(dev, "No device match found\n"); > - return -ENODEV; > - } > - > rproc = rproc_alloc(dev, np->name, &st_rproc_ops, NULL, sizeof(*ddata)); > if (!rproc) > return -ENOMEM; > > rproc->has_iommu = false; > ddata = rproc->priv; > - ddata->config = (struct st_rproc_config *)match->data; > + ddata->config = (struct st_rproc_config *)device_get_match_data(dev); > + if (!ddata->config) > + goto free_rproc; > > platform_set_drvdata(pdev, rproc); > Reviewed-by: Patrice Chotard Thanks Patrice
Re: [PATCH net-next v7] net/core: Introduce netdev_core_stats_inc()
On 2023/10/9 22:28, Steven Rostedt wrote: On Mon, 9 Oct 2023 18:58:27 +0800 Yajun Deng wrote: C compiler decides to inline or not, depending on various factors. The most efficient (and small) code is generated by this_cpu_inc() version, allowing the compiler to inline it. If you copy/paste this_cpu_inc() twenty times, then the compiler would not inline the function anymore. Yes, if you want something to be visible by ftrace, it must not be inlined (as inlined functions are not function calls by definition). And as Eric stated, the compiler is perfectly allowed to inline something if it believes it will be more efficient. i.e. There may be code around the function call that could be more efficient if it wasn't change to parameters. If you want to make sure a function stays out of line, you must explicitly tell the compiler you want the function not to ever be inlined (hence the "noinline" attribute). Thanks for the details. Got it. Thank you. Great. -- Steve
Re: [PATCH v5 12/18] x86/sgx: Add EPC OOM path to forcefully reclaim EPC
On Mon, 2023-10-09 at 20:42 -0500, Haitao Huang wrote: > Hi Sean > > On Mon, 09 Oct 2023 19:23:04 -0500, Sean Christopherson > wrote: > > > On Mon, Oct 09, 2023, Kai Huang wrote: > > > On Fri, 2023-09-22 at 20:06 -0700, Haitao Huang wrote: > > > > +/** > > > > + * sgx_epc_oom() - invoke EPC out-of-memory handling on target LRU > > > > + * @lru: LRU that is low > > > > + * > > > > + * Return: %true if a victim was found and kicked. > > > > + */ > > > > +bool sgx_epc_oom(struct sgx_epc_lru_lists *lru) > > > > +{ > > > > + struct sgx_epc_page *victim; > > > > + > > > > + spin_lock(&lru->lock); > > > > + victim = sgx_oom_get_victim(lru); > > > > + spin_unlock(&lru->lock); > > > > + > > > > + if (!victim) > > > > + return false; > > > > + > > > > + if (victim->flags & SGX_EPC_OWNER_PAGE) > > > > + return sgx_oom_encl_page(victim->encl_page); > > > > + > > > > + if (victim->flags & SGX_EPC_OWNER_ENCL) > > > > + return sgx_oom_encl(victim->encl); > > > > > > I hate to bring this up, at least at this stage, but I am wondering why > > > we need > > > to put VA and SECS pages to the unreclaimable list, but cannot keep an > > > "enclave_list" instead? > > > > The motivation for tracking EPC pages instead of enclaves was so that > > the EPC > > OOM-killer could "kill" VMs as well as host-owned enclaves. The virtual > > EPC code > > didn't actually kill the VM process, it instead just freed all of the > > EPC pages > > and abused the SGX architecture to effectively make the guest recreate > > all its > > enclaves (IIRC, QEMU does the same thing to "support" live migration). > > > > Looks like y'all punted on that with: > > > > The EPC pages allocated for KVM guests by the virtual EPC driver are > > not > > reclaimable by the host kernel [5]. Therefore they are not tracked by > > any > > LRU lists for reclaiming purposes in this implementation, but they are > > charged toward the cgroup of the user processs (e.g., QEMU) launching > > the > > guest. And when the cgroup EPC usage reaches its limit, the virtual > > EPC > > driver will stop allocating more EPC for the VM, and return SIGBUS to > > the > > user process which would abort the VM launch. > > > > which IMO is a hack, unless returning SIGBUS is actually enforced > > somehow. Relying > > on userspace to be kind enough to kill its VMs kinda defeats the purpose > > of cgroup > > enforcement. E.g. if the hard limit for a EPC cgroup is lowered, > > userspace running > > encalves in a VM could continue on and refuse to give up its EPC, and > > thus run above > > its limit in perpetuity. > > > Cgroup would refuse to allocate more when limit is reached so VMs can not > run above limit. > > IIRC VMs only support static EPC size right now, reaching limit at launch > means the EPC size given in command line for QEMU is not appropriate. So > VM should not launch, hence the current behavior. > > [all EPC pages in guest are allocated on page fault caused by the > sensitization process in guest kernel during init, which is part of the VM > Launch process. So SIGNBUS will turn into failed VM launch.] > > Once it is launched, guest kernel would have 'total capacity' given by the > static value from QEMU option. And it would start paging when it is used > up, never would ask for more from host. > > For future with dynamic EPC for running guests, QEMU could handle > allocation failure and pass SIGBUS to the running guest kernel. Is that > correct understanding? > > > > I can see userspace wanting to explicitly terminate the VM instead of > > "silently" > > the VM's enclaves, but that seems like it should be a knob in the > > virtual EPC > > code. > > If my understanding above is correct and understanding your statement > above correctly, then don't see we really need separate knob for vEPC > code. Reaching a cgroup limit by a running guest (assuming dynamic > allocation implemented) should not translate automatically killing the VM. > Instead, it's user space job to work with guest to handle allocation > failure. Guest could page and kill enclaves. > IIUC Sean was talking about changing misc.max _after_ you launch SGX VMs: 1) misc.max = 100M 2) Launch VMs with total virtual EPC size = 100M<- success 3) misc.max = 50M 3) will also succeed, but nothing will happen, the VMs will be still holding 100M EPC. You need to somehow track virtual EPC and kill VM instead. (or somehow fail to do 3) if it is also an acceptable option.)
Re: [PATCH v5 12/18] x86/sgx: Add EPC OOM path to forcefully reclaim EPC
> > > > > > > Later the hosting process could migrated/reassigned to another cgroup? > > > What to do when the new cgroup is OOM? > > > > > > > You addressed in the documentation, no? > > > > +Migration > > +- > > + > > +Once an EPC page is charged to a cgroup (during allocation), it > > +remains charged to the original cgroup until the page is released > > +or reclaimed. Migrating a process to a different cgroup doesn't > > +move the EPC charges that it incurred while in the previous cgroup > > +to its new cgroup. > > Should we kill the enclave though because some VA pages may be in the new > group? > I guess acceptable? And any difference if you keep VA/SECS to unreclaimabe list? If you migrate one enclave to another cgroup, the old EPC pages stay in the old cgroup while the new one is charged to the new group IIUC. I am not cgroup expert, but by searching some old thread it appears this isn't a supported model: https://lore.kernel.org/lkml/yeyr9181qgzt+...@mtj.duckdns.org/
Re: [PATCH v5 12/18] x86/sgx: Add EPC OOM path to forcefully reclaim EPC
Hi Sean On Mon, 09 Oct 2023 19:23:04 -0500, Sean Christopherson wrote: On Mon, Oct 09, 2023, Kai Huang wrote: On Fri, 2023-09-22 at 20:06 -0700, Haitao Huang wrote: > +/** > + * sgx_epc_oom() - invoke EPC out-of-memory handling on target LRU > + * @lru: LRU that is low > + * > + * Return:%true if a victim was found and kicked. > + */ > +bool sgx_epc_oom(struct sgx_epc_lru_lists *lru) > +{ > + struct sgx_epc_page *victim; > + > + spin_lock(&lru->lock); > + victim = sgx_oom_get_victim(lru); > + spin_unlock(&lru->lock); > + > + if (!victim) > + return false; > + > + if (victim->flags & SGX_EPC_OWNER_PAGE) > + return sgx_oom_encl_page(victim->encl_page); > + > + if (victim->flags & SGX_EPC_OWNER_ENCL) > + return sgx_oom_encl(victim->encl); I hate to bring this up, at least at this stage, but I am wondering why we need to put VA and SECS pages to the unreclaimable list, but cannot keep an "enclave_list" instead? The motivation for tracking EPC pages instead of enclaves was so that the EPC OOM-killer could "kill" VMs as well as host-owned enclaves. The virtual EPC code didn't actually kill the VM process, it instead just freed all of the EPC pages and abused the SGX architecture to effectively make the guest recreate all its enclaves (IIRC, QEMU does the same thing to "support" live migration). Looks like y'all punted on that with: The EPC pages allocated for KVM guests by the virtual EPC driver are not reclaimable by the host kernel [5]. Therefore they are not tracked by any LRU lists for reclaiming purposes in this implementation, but they are charged toward the cgroup of the user processs (e.g., QEMU) launching the guest. And when the cgroup EPC usage reaches its limit, the virtual EPC driver will stop allocating more EPC for the VM, and return SIGBUS to the user process which would abort the VM launch. which IMO is a hack, unless returning SIGBUS is actually enforced somehow. Relying on userspace to be kind enough to kill its VMs kinda defeats the purpose of cgroup enforcement. E.g. if the hard limit for a EPC cgroup is lowered, userspace running encalves in a VM could continue on and refuse to give up its EPC, and thus run above its limit in perpetuity. Cgroup would refuse to allocate more when limit is reached so VMs can not run above limit. IIRC VMs only support static EPC size right now, reaching limit at launch means the EPC size given in command line for QEMU is not appropriate. So VM should not launch, hence the current behavior. [all EPC pages in guest are allocated on page fault caused by the sensitization process in guest kernel during init, which is part of the VM Launch process. So SIGNBUS will turn into failed VM launch.] Once it is launched, guest kernel would have 'total capacity' given by the static value from QEMU option. And it would start paging when it is used up, never would ask for more from host. For future with dynamic EPC for running guests, QEMU could handle allocation failure and pass SIGBUS to the running guest kernel. Is that correct understanding? I can see userspace wanting to explicitly terminate the VM instead of "silently" the VM's enclaves, but that seems like it should be a knob in the virtual EPC code. If my understanding above is correct and understanding your statement above correctly, then don't see we really need separate knob for vEPC code. Reaching a cgroup limit by a running guest (assuming dynamic allocation implemented) should not translate automatically killing the VM. Instead, it's user space job to work with guest to handle allocation failure. Guest could page and kill enclaves. Haitao
Re: [PATCH v5 12/18] x86/sgx: Add EPC OOM path to forcefully reclaim EPC
On Mon, 09 Oct 2023 20:18:00 -0500, Huang, Kai wrote: On Mon, 2023-10-09 at 20:04 -0500, Haitao Huang wrote: On Mon, 09 Oct 2023 18:45:06 -0500, Huang, Kai wrote: > On Fri, 2023-09-22 at 20:06 -0700, Haitao Huang wrote: > > From: Sean Christopherson > > > > Introduce the OOM path for killing an enclave with a reclaimer that is > > no > > longer able to reclaim enough EPC pages. Find a victim enclave, which > > will be an enclave with only "unreclaimable" EPC pages left in the > > cgroup LRU lists. Once a victim is identified, mark the enclave as OOM > > and zap the enclave's entire page range, and drain all mm references in > > encl->mm_list. Block allocating any EPC pages in #PF handler, or > > reloading any pages in all paths, or creating any new mappings. > > > > The OOM killing path may race with the reclaimers: in some cases, the > > victim enclave is in the process of reclaiming the last EPC pages when > > OOM happens, that is, all pages other than SECS and VA pages are in > > RECLAIMING_IN_PROGRESS state. The reclaiming process requires access to > > the enclave backing, VA pages as well as SECS. So the OOM killer does > > not directly release those enclave resources, instead, it lets all > > reclaiming in progress to finish, and relies (as currently done) on > > kref_put on encl->refcount to trigger sgx_encl_release() to do the > > final cleanup. > > > > Signed-off-by: Sean Christopherson > > Co-developed-by: Kristen Carlson Accardi > > Signed-off-by: Kristen Carlson Accardi > > Co-developed-by: Haitao Huang > > Signed-off-by: Haitao Huang > > Cc: Sean Christopherson > > --- > > V5: > > - Rename SGX_ENCL_OOM to SGX_ENCL_NO_MEMORY > > > > V4: > > - Updates for patch reordering and typo fixes. > > > > V3: > > - Rebased to use the new VMA_ITERATOR to zap VMAs. > > - Fixed the racing cases by blocking new page allocation/mapping and > > reloading when enclave is marked for OOM. And do not release any enclave > > resources other than draining mm_list entries, and let pages in > > RECLAIMING_IN_PROGRESS to be reaped by reclaimers. > > - Due to above changes, also removed the no-longer needed encl->lock in > > the OOM path which was causing deadlocks reported by the lock prover. > > > > [...] > > > + > > +/** > > + * sgx_epc_oom() - invoke EPC out-of-memory handling on target LRU > > + * @lru: LRU that is low > > + * > > + * Return: %true if a victim was found and kicked. > > + */ > > +bool sgx_epc_oom(struct sgx_epc_lru_lists *lru) > > +{ > > + struct sgx_epc_page *victim; > > + > > + spin_lock(&lru->lock); > > + victim = sgx_oom_get_victim(lru); > > + spin_unlock(&lru->lock); > > + > > + if (!victim) > > + return false; > > + > > + if (victim->flags & SGX_EPC_OWNER_PAGE) > > + return sgx_oom_encl_page(victim->encl_page); > > + > > + if (victim->flags & SGX_EPC_OWNER_ENCL) > > + return sgx_oom_encl(victim->encl); > > I hate to bring this up, at least at this stage, but I am wondering why > we need > to put VA and SECS pages to the unreclaimable list, but cannot keep an > "enclave_list" instead? > > So by looking the patch (" x86/sgx: Limit process EPC usage with misc > cgroup > controller"), if I am not missing anything, the whole "unreclaimable" > list is > just used to find the victim enclave when OOM needs to be done. Thus, I > don't > see why "enclave_list" cannot be used to achieve this. > > The reason that I am asking is because it seems using "enclave_list" we > can > simplify the code. At least the patches related to track VA/SECS pages, > and the > SGX_EPC_OWNER_PAGE/SGX_EPC_OWNER_ENCL thing can be eliminated > completely. > Using "enclave_list", I guess you just need to put the enclave to the > current > EPC cgroup when SECS page is allocated. > Later the hosting process could migrated/reassigned to another cgroup? What to do when the new cgroup is OOM? You addressed in the documentation, no? +Migration +- + +Once an EPC page is charged to a cgroup (during allocation), it +remains charged to the original cgroup until the page is released +or reclaimed. Migrating a process to a different cgroup doesn't +move the EPC charges that it incurred while in the previous cgroup +to its new cgroup. Should we kill the enclave though because some VA pages may be in the new group? Haitao
Re: [PATCH v5 12/18] x86/sgx: Add EPC OOM path to forcefully reclaim EPC
On Tue, 2023-10-10 at 00:50 +, Huang, Kai wrote: > On Mon, 2023-10-09 at 17:23 -0700, Sean Christopherson wrote: > > On Mon, Oct 09, 2023, Kai Huang wrote: > > > On Fri, 2023-09-22 at 20:06 -0700, Haitao Huang wrote: > > > > +/** > > > > + * sgx_epc_oom() - invoke EPC out-of-memory handling on target LRU > > > > + * @lru: LRU that is low > > > > + * > > > > + * Return: %true if a victim was found and kicked. > > > > + */ > > > > +bool sgx_epc_oom(struct sgx_epc_lru_lists *lru) > > > > +{ > > > > + struct sgx_epc_page *victim; > > > > + > > > > + spin_lock(&lru->lock); > > > > + victim = sgx_oom_get_victim(lru); > > > > + spin_unlock(&lru->lock); > > > > + > > > > + if (!victim) > > > > + return false; > > > > + > > > > + if (victim->flags & SGX_EPC_OWNER_PAGE) > > > > + return sgx_oom_encl_page(victim->encl_page); > > > > + > > > > + if (victim->flags & SGX_EPC_OWNER_ENCL) > > > > + return sgx_oom_encl(victim->encl); > > > > > > I hate to bring this up, at least at this stage, but I am wondering why > > > we need > > > to put VA and SECS pages to the unreclaimable list, but cannot keep an > > > "enclave_list" instead? > > > > The motivation for tracking EPC pages instead of enclaves was so that the > > EPC > > OOM-killer could "kill" VMs as well as host-owned enclaves. > > > > Ah this seems a fair argument. :-) > > > The virtual EPC code > > didn't actually kill the VM process, it instead just freed all of the EPC > > pages > > and abused the SGX architecture to effectively make the guest recreate all > > its > > enclaves (IIRC, QEMU does the same thing to "support" live migration). > > It returns SIGBUS. SGX VM live migration also requires enough EPC being able > to > be allocated on the destination machine to work AFAICT. > > > > > Looks like y'all punted on that with: > > > > The EPC pages allocated for KVM guests by the virtual EPC driver are not > > reclaimable by the host kernel [5]. Therefore they are not tracked by any > > LRU lists for reclaiming purposes in this implementation, but they are > > charged toward the cgroup of the user processs (e.g., QEMU) launching the > > guest. And when the cgroup EPC usage reaches its limit, the virtual EPC > > driver will stop allocating more EPC for the VM, and return SIGBUS to the > > user process which would abort the VM launch. > > > > which IMO is a hack, unless returning SIGBUS is actually enforced somehow. > > > > "enforced" do you mean? > > Currently the sgx_vepc_fault() returns VM_FAULT_SIGBUS when it cannot allocate > EPC page. And when this happens, KVM returns KVM_PFN_ERR_FAULT in > hva_to_pfn(), > which eventually results in KVM returning -EFAULT to userspace in vcpu_run(). > And Qemu then kills the VM with some nonsense message: > > error: kvm run failed Bad address > > > > Relying > > on userspace to be kind enough to kill its VMs kinda defeats the purpose of > > cgroup > > enforcement. E.g. if the hard limit for a EPC cgroup is lowered, userspace > > running > > encalves in a VM could continue on and refuse to give up its EPC, and thus > > run above > > its limit in perpetuity. > > > > > I can see userspace wanting to explicitly terminate the VM instead of > > "silently" > > the VM's enclaves, but that seems like it should be a knob in the virtual > > EPC > > code. I guess I slightly misunderstood your words. You mean we want to kill VM when the limit is set to be lower than virtual EPC size. This patch adds SGX_ENCL_NO_MEMORY. I guess we can use it for virtual EPC too? In the sgx_vepc_fault(), we check this flag at early time and return SIGBUS if it is set. But this also requires keeping virtual EPC pages in some list, and handles them in sgx_epc_oom() too. And for virtual EPC pages, I guess the "young" logic can be applied thus probably it's better to keep the actual virtual EPC pages to a (separate?) list instead of keeping the virtual EPC instance. struct sgx_epc_lru { struct list_head reclaimable; struct sgx_encl *enclaves; struct list_head vepc_pages; } Or still tracking VA/SECS and virtual EPC pages in a single unrecliamable list? I don't know :-)
Re: [PATCH v5 12/18] x86/sgx: Add EPC OOM path to forcefully reclaim EPC
On Mon, 2023-10-09 at 20:04 -0500, Haitao Huang wrote: > On Mon, 09 Oct 2023 18:45:06 -0500, Huang, Kai wrote: > > > On Fri, 2023-09-22 at 20:06 -0700, Haitao Huang wrote: > > > From: Sean Christopherson > > > > > > Introduce the OOM path for killing an enclave with a reclaimer that is > > > no > > > longer able to reclaim enough EPC pages. Find a victim enclave, which > > > will be an enclave with only "unreclaimable" EPC pages left in the > > > cgroup LRU lists. Once a victim is identified, mark the enclave as OOM > > > and zap the enclave's entire page range, and drain all mm references in > > > encl->mm_list. Block allocating any EPC pages in #PF handler, or > > > reloading any pages in all paths, or creating any new mappings. > > > > > > The OOM killing path may race with the reclaimers: in some cases, the > > > victim enclave is in the process of reclaiming the last EPC pages when > > > OOM happens, that is, all pages other than SECS and VA pages are in > > > RECLAIMING_IN_PROGRESS state. The reclaiming process requires access to > > > the enclave backing, VA pages as well as SECS. So the OOM killer does > > > not directly release those enclave resources, instead, it lets all > > > reclaiming in progress to finish, and relies (as currently done) on > > > kref_put on encl->refcount to trigger sgx_encl_release() to do the > > > final cleanup. > > > > > > Signed-off-by: Sean Christopherson > > > Co-developed-by: Kristen Carlson Accardi > > > Signed-off-by: Kristen Carlson Accardi > > > Co-developed-by: Haitao Huang > > > Signed-off-by: Haitao Huang > > > Cc: Sean Christopherson > > > --- > > > V5: > > > - Rename SGX_ENCL_OOM to SGX_ENCL_NO_MEMORY > > > > > > V4: > > > - Updates for patch reordering and typo fixes. > > > > > > V3: > > > - Rebased to use the new VMA_ITERATOR to zap VMAs. > > > - Fixed the racing cases by blocking new page allocation/mapping and > > > reloading when enclave is marked for OOM. And do not release any enclave > > > resources other than draining mm_list entries, and let pages in > > > RECLAIMING_IN_PROGRESS to be reaped by reclaimers. > > > - Due to above changes, also removed the no-longer needed encl->lock in > > > the OOM path which was causing deadlocks reported by the lock prover. > > > > > > > [...] > > > > > + > > > +/** > > > + * sgx_epc_oom() - invoke EPC out-of-memory handling on target LRU > > > + * @lru: LRU that is low > > > + * > > > + * Return: %true if a victim was found and kicked. > > > + */ > > > +bool sgx_epc_oom(struct sgx_epc_lru_lists *lru) > > > +{ > > > + struct sgx_epc_page *victim; > > > + > > > + spin_lock(&lru->lock); > > > + victim = sgx_oom_get_victim(lru); > > > + spin_unlock(&lru->lock); > > > + > > > + if (!victim) > > > + return false; > > > + > > > + if (victim->flags & SGX_EPC_OWNER_PAGE) > > > + return sgx_oom_encl_page(victim->encl_page); > > > + > > > + if (victim->flags & SGX_EPC_OWNER_ENCL) > > > + return sgx_oom_encl(victim->encl); > > > > I hate to bring this up, at least at this stage, but I am wondering why > > we need > > to put VA and SECS pages to the unreclaimable list, but cannot keep an > > "enclave_list" instead? > > > > So by looking the patch (" x86/sgx: Limit process EPC usage with misc > > cgroup > > controller"), if I am not missing anything, the whole "unreclaimable" > > list is > > just used to find the victim enclave when OOM needs to be done. Thus, I > > don't > > see why "enclave_list" cannot be used to achieve this. > > > > The reason that I am asking is because it seems using "enclave_list" we > > can > > simplify the code. At least the patches related to track VA/SECS pages, > > and the > > SGX_EPC_OWNER_PAGE/SGX_EPC_OWNER_ENCL thing can be eliminated > > completely. > > Using "enclave_list", I guess you just need to put the enclave to the > > current > > EPC cgroup when SECS page is allocated. > > > Later the hosting process could migrated/reassigned to another cgroup? > What to do when the new cgroup is OOM? > You addressed in the documentation, no? +Migration +- + +Once an EPC page is charged to a cgroup (during allocation), it +remains charged to the original cgroup until the page is released +or reclaimed. Migrating a process to a different cgroup doesn't +move the EPC charges that it incurred while in the previous cgroup +to its new cgroup.
Re: [PATCH v5 12/18] x86/sgx: Add EPC OOM path to forcefully reclaim EPC
On Mon, 09 Oct 2023 18:45:06 -0500, Huang, Kai wrote: On Fri, 2023-09-22 at 20:06 -0700, Haitao Huang wrote: From: Sean Christopherson Introduce the OOM path for killing an enclave with a reclaimer that is no longer able to reclaim enough EPC pages. Find a victim enclave, which will be an enclave with only "unreclaimable" EPC pages left in the cgroup LRU lists. Once a victim is identified, mark the enclave as OOM and zap the enclave's entire page range, and drain all mm references in encl->mm_list. Block allocating any EPC pages in #PF handler, or reloading any pages in all paths, or creating any new mappings. The OOM killing path may race with the reclaimers: in some cases, the victim enclave is in the process of reclaiming the last EPC pages when OOM happens, that is, all pages other than SECS and VA pages are in RECLAIMING_IN_PROGRESS state. The reclaiming process requires access to the enclave backing, VA pages as well as SECS. So the OOM killer does not directly release those enclave resources, instead, it lets all reclaiming in progress to finish, and relies (as currently done) on kref_put on encl->refcount to trigger sgx_encl_release() to do the final cleanup. Signed-off-by: Sean Christopherson Co-developed-by: Kristen Carlson Accardi Signed-off-by: Kristen Carlson Accardi Co-developed-by: Haitao Huang Signed-off-by: Haitao Huang Cc: Sean Christopherson --- V5: - Rename SGX_ENCL_OOM to SGX_ENCL_NO_MEMORY V4: - Updates for patch reordering and typo fixes. V3: - Rebased to use the new VMA_ITERATOR to zap VMAs. - Fixed the racing cases by blocking new page allocation/mapping and reloading when enclave is marked for OOM. And do not release any enclave resources other than draining mm_list entries, and let pages in RECLAIMING_IN_PROGRESS to be reaped by reclaimers. - Due to above changes, also removed the no-longer needed encl->lock in the OOM path which was causing deadlocks reported by the lock prover. [...] + +/** + * sgx_epc_oom() - invoke EPC out-of-memory handling on target LRU + * @lru: LRU that is low + * + * Return: %true if a victim was found and kicked. + */ +bool sgx_epc_oom(struct sgx_epc_lru_lists *lru) +{ + struct sgx_epc_page *victim; + + spin_lock(&lru->lock); + victim = sgx_oom_get_victim(lru); + spin_unlock(&lru->lock); + + if (!victim) + return false; + + if (victim->flags & SGX_EPC_OWNER_PAGE) + return sgx_oom_encl_page(victim->encl_page); + + if (victim->flags & SGX_EPC_OWNER_ENCL) + return sgx_oom_encl(victim->encl); I hate to bring this up, at least at this stage, but I am wondering why we need to put VA and SECS pages to the unreclaimable list, but cannot keep an "enclave_list" instead? So by looking the patch (" x86/sgx: Limit process EPC usage with misc cgroup controller"), if I am not missing anything, the whole "unreclaimable" list is just used to find the victim enclave when OOM needs to be done. Thus, I don't see why "enclave_list" cannot be used to achieve this. The reason that I am asking is because it seems using "enclave_list" we can simplify the code. At least the patches related to track VA/SECS pages, and the SGX_EPC_OWNER_PAGE/SGX_EPC_OWNER_ENCL thing can be eliminated completely. Using "enclave_list", I guess you just need to put the enclave to the current EPC cgroup when SECS page is allocated. Later the hosting process could migrated/reassigned to another cgroup? What to do when the new cgroup is OOM? Thanks Haitao
Re: [PATCH v5 12/18] x86/sgx: Add EPC OOM path to forcefully reclaim EPC
On Mon, 2023-10-09 at 17:23 -0700, Sean Christopherson wrote: > On Mon, Oct 09, 2023, Kai Huang wrote: > > On Fri, 2023-09-22 at 20:06 -0700, Haitao Huang wrote: > > > +/** > > > + * sgx_epc_oom() - invoke EPC out-of-memory handling on target LRU > > > + * @lru: LRU that is low > > > + * > > > + * Return: %true if a victim was found and kicked. > > > + */ > > > +bool sgx_epc_oom(struct sgx_epc_lru_lists *lru) > > > +{ > > > + struct sgx_epc_page *victim; > > > + > > > + spin_lock(&lru->lock); > > > + victim = sgx_oom_get_victim(lru); > > > + spin_unlock(&lru->lock); > > > + > > > + if (!victim) > > > + return false; > > > + > > > + if (victim->flags & SGX_EPC_OWNER_PAGE) > > > + return sgx_oom_encl_page(victim->encl_page); > > > + > > > + if (victim->flags & SGX_EPC_OWNER_ENCL) > > > + return sgx_oom_encl(victim->encl); > > > > I hate to bring this up, at least at this stage, but I am wondering why we > > need > > to put VA and SECS pages to the unreclaimable list, but cannot keep an > > "enclave_list" instead? > > The motivation for tracking EPC pages instead of enclaves was so that the EPC > OOM-killer could "kill" VMs as well as host-owned enclaves. > Ah this seems a fair argument. :-) > The virtual EPC code > didn't actually kill the VM process, it instead just freed all of the EPC > pages > and abused the SGX architecture to effectively make the guest recreate all its > enclaves (IIRC, QEMU does the same thing to "support" live migration). It returns SIGBUS. SGX VM live migration also requires enough EPC being able to be allocated on the destination machine to work AFAICT. > > Looks like y'all punted on that with: > > The EPC pages allocated for KVM guests by the virtual EPC driver are not > reclaimable by the host kernel [5]. Therefore they are not tracked by any > LRU lists for reclaiming purposes in this implementation, but they are > charged toward the cgroup of the user processs (e.g., QEMU) launching the > guest. And when the cgroup EPC usage reaches its limit, the virtual EPC > driver will stop allocating more EPC for the VM, and return SIGBUS to the > user process which would abort the VM launch. > > which IMO is a hack, unless returning SIGBUS is actually enforced somehow. > "enforced" do you mean? Currently the sgx_vepc_fault() returns VM_FAULT_SIGBUS when it cannot allocate EPC page. And when this happens, KVM returns KVM_PFN_ERR_FAULT in hva_to_pfn(), which eventually results in KVM returning -EFAULT to userspace in vcpu_run(). And Qemu then kills the VM with some nonsense message: error: kvm run failed Bad address > Relying > on userspace to be kind enough to kill its VMs kinda defeats the purpose of > cgroup > enforcement. E.g. if the hard limit for a EPC cgroup is lowered, userspace > running > encalves in a VM could continue on and refuse to give up its EPC, and thus > run above > its limit in perpetuity. Agreed. But this looks cannot resolved until we can reclaim EPC page from VM. Or in the EPC cgroup code we can refuse to set the maximum which cannot be supported, e.g., less the total virtual EPC size. I guess the second is acceptable for now? > > I can see userspace wanting to explicitly terminate the VM instead of > "silently" > the VM's enclaves, but that seems like it should be a knob in the virtual EPC > code. See above for the second option.
Re: [PATCH 01/10] appletalk: remove localtalk and ppp support
On Mon, 9 Oct 2023 16:18:59 +0200 Arnd Bergmann wrote: > The last localtalk driver is gone now, and ppp support was never fully > merged, so clean up the appletalk code by removing the obvious dead > code paths. > > Notably, this removes one of the two callers of the old .ndo_do_ioctl() > callback that was abused for getting device addresses and is now > only used in the ieee802154 subsystem, which still uses the same trick. > > The include/uapi/linux/if_ltalk.h header might still be required > for building userspace programs, but I made sure that debian code > search and the netatalk upstream have no references it it, so it > should be fine to remove. Looks like it depends on the ipddp driver removal. Could you repost once that one is merged (~tomorrow)? -- pw-bot: cr
Re: [PATCH v5 16/18] x86/sgx: Limit process EPC usage with misc cgroup controller
> @@ -332,6 +336,7 @@ void sgx_isolate_epc_pages(struct sgx_epc_lru_lists *lru, > size_t nr_to_scan, > * sgx_reclaim_epc_pages() - Reclaim EPC pages from the consumers > * @nr_to_scan: Number of EPC pages to scan for reclaim > * @ignore_age: Reclaim a page even if it is young > + * @epc_cg: EPC cgroup from which to reclaim > * > * Take a fixed number of pages from the head of the active page pool and > * reclaim them to the enclave's private shmem files. Skip the pages, which > have > @@ -345,7 +350,8 @@ void sgx_isolate_epc_pages(struct sgx_epc_lru_lists *lru, > size_t nr_to_scan, > * problematic as it would increase the lock contention too much, which would > * halt forward progress. > */ > -size_t sgx_reclaim_epc_pages(size_t nr_to_scan, bool ignore_age) > +size_t sgx_reclaim_epc_pages(size_t nr_to_scan, bool ignore_age, > + struct sgx_epc_cgroup *epc_cg) > { > struct sgx_backing backing[SGX_NR_TO_SCAN_MAX]; > struct sgx_epc_page *epc_page, *tmp; > @@ -355,7 +361,15 @@ size_t sgx_reclaim_epc_pages(size_t nr_to_scan, bool > ignore_age) > LIST_HEAD(iso); > size_t ret, i; > > - sgx_isolate_epc_pages(&sgx_global_lru, nr_to_scan, &iso); > + /* > + * If a specific cgroup is not being targeted, take from the global > + * list first, even when cgroups are enabled. If there are > + * pages on the global LRU then they should get reclaimed asap. > + */ > + if (!IS_ENABLED(CONFIG_CGROUP_SGX_EPC) || !epc_cg) > + sgx_isolate_epc_pages(&sgx_global_lru, &nr_to_scan, &iso); > + > + sgx_epc_cgroup_isolate_pages(epc_cg, &nr_to_scan, &iso); (I wish such code can be somehow moved to the earlier patches, so that we can get early idea that how sgx_reclaim_epc_pages() is supposed to be used.) So here when we are not targeting a specific EPC cgroup, we always reclaim from the global list first, ... [...] > > if (list_empty(&iso)) > return 0; > @@ -423,7 +437,7 @@ static bool sgx_should_reclaim(unsigned long watermark) > void sgx_reclaim_direct(void) > { > if (sgx_should_reclaim(SGX_NR_LOW_PAGES)) > - sgx_reclaim_epc_pages(SGX_NR_TO_SCAN, false); > + sgx_reclaim_epc_pages(SGX_NR_TO_SCAN, false, NULL); ... and we always try to reclaim the global list first when directly reclaim is desired, even the enclave is within some EPC cgroup. ... > } > > static int ksgxd(void *p) > @@ -446,7 +460,7 @@ static int ksgxd(void *p) >sgx_should_reclaim(SGX_NR_HIGH_PAGES)); > > if (sgx_should_reclaim(SGX_NR_HIGH_PAGES)) > - sgx_reclaim_epc_pages(SGX_NR_TO_SCAN, false); > + sgx_reclaim_epc_pages(SGX_NR_TO_SCAN, false, NULL); ... and in ksgxd() as well, which I guess is somehow acceptable. ... > > cond_resched(); > } > @@ -600,6 +614,11 @@ int sgx_drop_epc_page(struct sgx_epc_page *page) > struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim) > { > struct sgx_epc_page *page; > + struct sgx_epc_cgroup *epc_cg; > + > + epc_cg = sgx_epc_cgroup_try_charge(reclaim); > + if (IS_ERR(epc_cg)) > + return ERR_CAST(epc_cg); > > for ( ; ; ) { > page = __sgx_alloc_epc_page(); > @@ -608,8 +627,10 @@ struct sgx_epc_page *sgx_alloc_epc_page(void *owner, > bool reclaim) > break; > } > > - if (!sgx_can_reclaim()) > - return ERR_PTR(-ENOMEM); > + if (!sgx_can_reclaim()) { > + page = ERR_PTR(-ENOMEM); > + break; > + } > > if (!reclaim) { > page = ERR_PTR(-EBUSY); > @@ -621,10 +642,17 @@ struct sgx_epc_page *sgx_alloc_epc_page(void *owner, > bool reclaim) > break; > } > > - sgx_reclaim_epc_pages(SGX_NR_TO_SCAN, false); > + sgx_reclaim_epc_pages(SGX_NR_TO_SCAN, false, NULL); ... and when an EPC page is allocated, no matter whether the EPC page belongs to any cgroup or not. When we are allocating EPC page for one enclave, if that enclave belongs to some cgroup, is it more reasonable to reclaim EPC pages from it's own group (and the children under it)? You already got the current EPC cgroup at the beginning of sgx_alloc_epc_page() when you want to charge the EPC allocation. > cond_resched(); > } >
Re: [PATCH v5 12/18] x86/sgx: Add EPC OOM path to forcefully reclaim EPC
On Mon, Oct 09, 2023, Kai Huang wrote: > On Fri, 2023-09-22 at 20:06 -0700, Haitao Huang wrote: > > +/** > > + * sgx_epc_oom() - invoke EPC out-of-memory handling on target LRU > > + * @lru: LRU that is low > > + * > > + * Return: %true if a victim was found and kicked. > > + */ > > +bool sgx_epc_oom(struct sgx_epc_lru_lists *lru) > > +{ > > + struct sgx_epc_page *victim; > > + > > + spin_lock(&lru->lock); > > + victim = sgx_oom_get_victim(lru); > > + spin_unlock(&lru->lock); > > + > > + if (!victim) > > + return false; > > + > > + if (victim->flags & SGX_EPC_OWNER_PAGE) > > + return sgx_oom_encl_page(victim->encl_page); > > + > > + if (victim->flags & SGX_EPC_OWNER_ENCL) > > + return sgx_oom_encl(victim->encl); > > I hate to bring this up, at least at this stage, but I am wondering why we > need > to put VA and SECS pages to the unreclaimable list, but cannot keep an > "enclave_list" instead? The motivation for tracking EPC pages instead of enclaves was so that the EPC OOM-killer could "kill" VMs as well as host-owned enclaves. The virtual EPC code didn't actually kill the VM process, it instead just freed all of the EPC pages and abused the SGX architecture to effectively make the guest recreate all its enclaves (IIRC, QEMU does the same thing to "support" live migration). Looks like y'all punted on that with: The EPC pages allocated for KVM guests by the virtual EPC driver are not reclaimable by the host kernel [5]. Therefore they are not tracked by any LRU lists for reclaiming purposes in this implementation, but they are charged toward the cgroup of the user processs (e.g., QEMU) launching the guest. And when the cgroup EPC usage reaches its limit, the virtual EPC driver will stop allocating more EPC for the VM, and return SIGBUS to the user process which would abort the VM launch. which IMO is a hack, unless returning SIGBUS is actually enforced somehow. Relying on userspace to be kind enough to kill its VMs kinda defeats the purpose of cgroup enforcement. E.g. if the hard limit for a EPC cgroup is lowered, userspace running encalves in a VM could continue on and refuse to give up its EPC, and thus run above its limit in perpetuity. I can see userspace wanting to explicitly terminate the VM instead of "silently" the VM's enclaves, but that seems like it should be a knob in the virtual EPC code.
Re: [PATCH v5 16/18] x86/sgx: Limit process EPC usage with misc cgroup controller
> +static inline struct sgx_epc_lru_lists *epc_cg_lru(struct sgx_epc_cgroup > *epc_cg) > +{ > + if (epc_cg) > + return &epc_cg->lru; > + return NULL; > +} > It's legal to return NULL EPC cgroup for a given EPC page, i.e., when the enclave isn't assigned to any cgroup. But ... > > static inline struct sgx_epc_lru_lists *sgx_lru_lists(struct sgx_epc_page > *epc_page) > { > + if (IS_ENABLED(CONFIG_CGROUP_SGX_EPC)) > + return epc_cg_lru(epc_page->epc_cg); > + > return &sgx_global_lru; > } ... here is it legal to return a NULL LRU list? It appears you always want to return a valid LRU list. That is, if EPC cgroup is enabled, and when the EPC page doesn't belong to any cgroup, then you want to return the sgx_global_lru ?
Re: [PATCH v5 16/18] x86/sgx: Limit process EPC usage with misc cgroup controller
> +/** > + * sgx_epc_cgroup_lru_empty() - check if a cgroup tree has no pages on its > lrus > + * @root:root of the tree to check > + * > + * Return: %true if all cgroups under the specified root have empty LRU > lists. > + * Used to avoid livelocks due to a cgroup having a non-zero charge count but > + * no pages on its LRUs, e.g. due to a dead enclave waiting to be released or > + * because all pages in the cgroup are unreclaimable. > + */ > +bool sgx_epc_cgroup_lru_empty(struct sgx_epc_cgroup *root) > +{ > + struct cgroup_subsys_state *css_root; > + struct cgroup_subsys_state *pos; > + struct sgx_epc_cgroup *epc_cg; > + bool ret = true; > + > + /* > + * Caller ensure css_root ref acquired > + */ > + css_root = root ? &root->cg->css : &(misc_cg_root()->css); > + > + rcu_read_lock(); > + css_for_each_descendant_pre(pos, css_root) { > + if (!css_tryget(pos)) > + break; > + > + rcu_read_unlock(); > + > + epc_cg = sgx_epc_cgroup_from_misc_cg(css_misc(pos)); > + > + spin_lock(&epc_cg->lru.lock); > + ret = list_empty(&epc_cg->lru.reclaimable); > + spin_unlock(&epc_cg->lru.lock); > + > + rcu_read_lock(); > + css_put(pos); > + if (!ret) > + break; > + } > + > + rcu_read_unlock(); > + > + return ret; > +} > [...] > > static inline bool sgx_can_reclaim(void) > { > + if (IS_ENABLED(CONFIG_CGROUP_SGX_EPC)) > + return !sgx_epc_cgroup_lru_empty(NULL); > + Is it better to keep a root sgx_epc_cgroup and pass the root instead of NULL? > return !list_empty(&sgx_global_lru.reclaimable); > } >
[ANNOUNCE] 5.10.197-rt96
Hello RT-list! I'm pleased to announce the 5.10.197-rt96 stable release. This release is just an update to the new stable 5.10.197 version and no RT-specific changes have been performed. You can get this release via the git tree at: git://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-stable-rt.git branch: v5.10-rt Head SHA1: 2140f78d1da2b7ba8913fb8d009d6b394bf1b813 Or to build 5.10.197-rt96 directly, the following patches should be applied: https://www.kernel.org/pub/linux/kernel/v5.x/linux-5.10.tar.xz https://www.kernel.org/pub/linux/kernel/v5.x/patch-5.10.197.xz https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patch-5.10.197-rt96.patch.xz Signing key fingerprint: 9354 0649 9972 8D31 D464 D140 F394 A423 F8E6 7C26 All keys used for the above files and repositories can be found on the following git repository: git://git.kernel.org/pub/scm/docs/kernel/pgpkeys.git Enjoy! Luis
Re: [PATCH v5 12/18] x86/sgx: Add EPC OOM path to forcefully reclaim EPC
On Fri, 2023-09-22 at 20:06 -0700, Haitao Huang wrote: > From: Sean Christopherson > > Introduce the OOM path for killing an enclave with a reclaimer that is no > longer able to reclaim enough EPC pages. Find a victim enclave, which > will be an enclave with only "unreclaimable" EPC pages left in the > cgroup LRU lists. Once a victim is identified, mark the enclave as OOM > and zap the enclave's entire page range, and drain all mm references in > encl->mm_list. Block allocating any EPC pages in #PF handler, or > reloading any pages in all paths, or creating any new mappings. > > The OOM killing path may race with the reclaimers: in some cases, the > victim enclave is in the process of reclaiming the last EPC pages when > OOM happens, that is, all pages other than SECS and VA pages are in > RECLAIMING_IN_PROGRESS state. The reclaiming process requires access to > the enclave backing, VA pages as well as SECS. So the OOM killer does > not directly release those enclave resources, instead, it lets all > reclaiming in progress to finish, and relies (as currently done) on > kref_put on encl->refcount to trigger sgx_encl_release() to do the > final cleanup. > > Signed-off-by: Sean Christopherson > Co-developed-by: Kristen Carlson Accardi > Signed-off-by: Kristen Carlson Accardi > Co-developed-by: Haitao Huang > Signed-off-by: Haitao Huang > Cc: Sean Christopherson > --- > V5: > - Rename SGX_ENCL_OOM to SGX_ENCL_NO_MEMORY > > V4: > - Updates for patch reordering and typo fixes. > > V3: > - Rebased to use the new VMA_ITERATOR to zap VMAs. > - Fixed the racing cases by blocking new page allocation/mapping and > reloading when enclave is marked for OOM. And do not release any enclave > resources other than draining mm_list entries, and let pages in > RECLAIMING_IN_PROGRESS to be reaped by reclaimers. > - Due to above changes, also removed the no-longer needed encl->lock in > the OOM path which was causing deadlocks reported by the lock prover. > [...] > + > +/** > + * sgx_epc_oom() - invoke EPC out-of-memory handling on target LRU > + * @lru: LRU that is low > + * > + * Return: %true if a victim was found and kicked. > + */ > +bool sgx_epc_oom(struct sgx_epc_lru_lists *lru) > +{ > + struct sgx_epc_page *victim; > + > + spin_lock(&lru->lock); > + victim = sgx_oom_get_victim(lru); > + spin_unlock(&lru->lock); > + > + if (!victim) > + return false; > + > + if (victim->flags & SGX_EPC_OWNER_PAGE) > + return sgx_oom_encl_page(victim->encl_page); > + > + if (victim->flags & SGX_EPC_OWNER_ENCL) > + return sgx_oom_encl(victim->encl); I hate to bring this up, at least at this stage, but I am wondering why we need to put VA and SECS pages to the unreclaimable list, but cannot keep an "enclave_list" instead? So by looking the patch (" x86/sgx: Limit process EPC usage with misc cgroup controller"), if I am not missing anything, the whole "unreclaimable" list is just used to find the victim enclave when OOM needs to be done. Thus, I don't see why "enclave_list" cannot be used to achieve this. The reason that I am asking is because it seems using "enclave_list" we can simplify the code. At least the patches related to track VA/SECS pages, and the SGX_EPC_OWNER_PAGE/SGX_EPC_OWNER_ENCL thing can be eliminated completely. Using "enclave_list", I guess you just need to put the enclave to the current EPC cgroup when SECS page is allocated. In fact, putting "unreclaimable" list to LRU itself is a little bit confusing because: 1) you cannot really reclaim anything from the list; 2) VA/SECS pages don't have the concept of "young" at all, thus makes no sense to annotate they as LRU. Thus putting VA/SECS to "unreclaimable" list, instead of keeping an "enclave_list" seems won't have any benefit but will only make the code more complicated. Or am I missing anything?
Re: [PATCH 01/10] appletalk: remove localtalk and ppp support
On Mon, 9 Oct 2023 18:49:43 +0200 Rodolfo Zitellini wrote: > Hi! > I’ve been working on a new LocalTalk interface driver for the last couple > months, do you think it would be possible to at least postpone the removal of > LT a bit? > > It is a driver for an open source device called TashTalk > (https://github.com/lampmerchant/tashtalk), which runs on a PIC micro that > does all the LT interfacing, and communicates back via serial to the host > system. My driver is relatively simple and works very well with netatalk 2.2 > (which is still maintained and still has support for AppleTalk). The driver > is basically complete and trsted and I was preparing to submit a patch. > > Still having LocalTalk in my view has many advantages for us enthusiasts that > still want to bridge old machines to the current world without modifications, > for example for printing on modern printers, netbooting, sharing files and > even tcp/ip. All this basically works out of the box via the driver, Linux > and available userspace tools (netatalk, macipgw). > > The old ISA cards supported by COPS were basically unobtanium even 20 years > ago, but the solution of using a PIC and a serial port is very robust and > much more furure-proof. We also already have a device that can interface a > modern machine directly via USB to LocalTalk. > > The development of the TashTalk has been also extensively discussed on thr > 68KMLA forum > (https://68kmla.org/bb/index.php?threads/modtashtalk-lt0-driver-for-linux.45031/) > > I hope the decision to remove LocalTalk can be reconsidered at least for the > time being so there is a chance to submit a new, modern device making use of > this stack. > > Many Thanks, > Rodolfo Zitellini Does it really need it to be a kernel protocol stack? What about doing it in userspace or with BPF?
[ANNOUNCE] 4.14.326-rt155
Hello RT-list! I'm pleased to announce the 4.14.326-rt155 stable release. This release is an update to the new stable 4.14.326-rt155 version and no RT-specific changes have been performed. You can get this release via the git tree at: git://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-stable-rt.git branch: v4.14-rt Head SHA1: 1d10d09bd19b5eb5e90c5585afb308207e9c683f Or to build 4.14.326-rt155 directly, the following patches should be applied: https://www.kernel.org/pub/linux/kernel/v4.x/linux-4.14.tar.xz https://www.kernel.org/pub/linux/kernel/v4.x/patch-4.14.326.xz https://www.kernel.org/pub/linux/kernel/projects/rt/4.14/older/patch-4.14.326-rt155.patch.xz Signing key fingerprint: 9354 0649 9972 8D31 D464 D140 F394 A423 F8E6 7C26 All keys used for the above files and repositories can be found on the following git repository: git://git.kernel.org/pub/scm/docs/kernel/pgpkeys.git Enjoy! Luis
Re: [PATCH] KVM: deprecate KVM_WERROR in favor of general WERROR
On Mon, 9 Oct 2023 12:33:53 -0700 Sean Christopherson wrote: > > We do have sympathy for these folks, we are mostly volunteers after > > all. At the same time someone's under-investment should not be causing > > pain to those of us who _do_ build test stuff carefully. > > This is a bit over the top. Yeah, I need to add W=1 to my build scripts, but > that's > not a lack of investment, just an oversight. Though in this case it likely > wouldn't > have made any difference since Paolo grabbed the patches directly and might > have > even bypassed linux-next. But again I would argue that's bad process, not a > lack > of investment. If you do invest in build testing automation, why can't your automation count warnings rather than depend on WERROR? I don't understand. > > Rather than tweak stuff I'd prefer if we could agree that local -Werror > > is anti-social :( > > > > The global WERROR seems to be a good compromise. > > I disagree. WERROR simply doesn't provide the same coverage. E.g. it can't > be > enabled for i386 without tuning FRAME_WARN, which (a) won't be at all obvious > to > the average contributor and (b) increasing FRAME_WARN effectively reduces the > test coverage of KVM i386. > > For KVM x86, I want the rules for contributing to be clearly documented, and > as > simple as possible. I don't see a sane way to achieve that with WERROR=y. Linus, you created the global WERROR option. Do you have an opinion on whether random subsystems should create their own WERROR flags? W=1 warning got in thru KVM and since they have a KVM_WERROR which defaults to enabled it broke build testing in networking. Randomly sprinkled -Werrors are fragile. Can we ask people to stop using them now that the global ERROR exists?
Re: [PATCH 7/7] arm64: dts: qcom: Add support for Xiaomi Redmi Note 9S
On 10/7/23 15:58, David Wronek wrote: From: Joe Mason Add a device tree for the Xiaomi Redmi Note 9S (curtana) phone, based on sm7125-xiaomi-common.dtsi. Signed-off-by: Joe Mason Signed-off-by: David Wronek --- arch/arm64/boot/dts/qcom/Makefile| 1 + .../boot/dts/qcom/sm7125-xiaomi-curtana.dts | 16 2 files changed, 17 insertions(+) create mode 100644 arch/arm64/boot/dts/qcom/sm7125-xiaomi-curtana.dts diff --git a/arch/arm64/boot/dts/qcom/Makefile b/arch/arm64/boot/dts/qcom/Makefile index d6cb840b7050..57974fb0c580 100644 --- a/arch/arm64/boot/dts/qcom/Makefile +++ b/arch/arm64/boot/dts/qcom/Makefile @@ -207,6 +207,7 @@ dtb-$(CONFIG_ARCH_QCOM) += sm6125-sony-xperia-seine-pdx201.dtb dtb-$(CONFIG_ARCH_QCOM) += sm6125-xiaomi-laurel-sprout.dtb dtb-$(CONFIG_ARCH_QCOM) += sm6350-sony-xperia-lena-pdx213.dtb dtb-$(CONFIG_ARCH_QCOM) += sm6375-sony-xperia-murray-pdx225.dtb +dtb-$(CONFIG_ARCH_QCOM)+= sm7125-xiaomi-curtana.dtb dtb-$(CONFIG_ARCH_QCOM) += sm7125-xiaomi-joyeuse.dtb dtb-$(CONFIG_ARCH_QCOM) += sm7225-fairphone-fp4.dtb dtb-$(CONFIG_ARCH_QCOM) += sm8150-hdk.dtb diff --git a/arch/arm64/boot/dts/qcom/sm7125-xiaomi-curtana.dts b/arch/arm64/boot/dts/qcom/sm7125-xiaomi-curtana.dts new file mode 100644 index ..12f517a8492c --- /dev/null +++ b/arch/arm64/boot/dts/qcom/sm7125-xiaomi-curtana.dts @@ -0,0 +1,16 @@ +// SPDX-License-Identifier: GPL-2.0 I have no idea if you can just include a bunch of bsd3 source and slap gpl2 atop your changes.. Somebody else could probably chime in Konrad
Re: [PATCH 6/7] arm64: dts: qcom: sm7125-xiaomi-common: Add UFS nodes
On 10/7/23 15:58, David Wronek wrote: Enable the UFS found on the SM7125 Xiaomi smartphones. Signed-off-by: David Wronek --- Reviewed-by: Konrad Dybcio Konrad
[PATCH] remoteproc: st: Use device_get_match_data()
Use preferred device_get_match_data() instead of of_match_device() to get the driver match data. With this, adjust the includes to explicitly include the correct headers. Signed-off-by: Rob Herring --- drivers/remoteproc/st_remoteproc.c | 14 -- 1 file changed, 4 insertions(+), 10 deletions(-) diff --git a/drivers/remoteproc/st_remoteproc.c b/drivers/remoteproc/st_remoteproc.c index e3ce01d98b4c..b0638f984842 100644 --- a/drivers/remoteproc/st_remoteproc.c +++ b/drivers/remoteproc/st_remoteproc.c @@ -16,10 +16,9 @@ #include #include #include -#include -#include #include #include +#include #include #include #include @@ -341,7 +340,6 @@ static int st_rproc_parse_dt(struct platform_device *pdev) static int st_rproc_probe(struct platform_device *pdev) { struct device *dev = &pdev->dev; - const struct of_device_id *match; struct st_rproc *ddata; struct device_node *np = dev->of_node; struct rproc *rproc; @@ -349,19 +347,15 @@ static int st_rproc_probe(struct platform_device *pdev) int enabled; int ret, i; - match = of_match_device(st_rproc_match, dev); - if (!match || !match->data) { - dev_err(dev, "No device match found\n"); - return -ENODEV; - } - rproc = rproc_alloc(dev, np->name, &st_rproc_ops, NULL, sizeof(*ddata)); if (!rproc) return -ENOMEM; rproc->has_iommu = false; ddata = rproc->priv; - ddata->config = (struct st_rproc_config *)match->data; + ddata->config = (struct st_rproc_config *)device_get_match_data(dev); + if (!ddata->config) + goto free_rproc; platform_set_drvdata(pdev, rproc); -- 2.42.0
Re: [PATCH 5/7] arm64: dts: qcom: sc7180: Add UFS nodes
On 10/7/23 15:58, David Wronek wrote: Add the UFS and QMP PHY nodes for the Qualcomm SC7180 SoC. Signed-off-by: David Wronek --- arch/arm64/boot/dts/qcom/sc7180.dtsi | 70 1 file changed, 70 insertions(+) diff --git a/arch/arm64/boot/dts/qcom/sc7180.dtsi b/arch/arm64/boot/dts/qcom/sc7180.dtsi index 11f353d416b4..9f18be4fd61a 100644 --- a/arch/arm64/boot/dts/qcom/sc7180.dtsi +++ b/arch/arm64/boot/dts/qcom/sc7180.dtsi @@ -1532,6 +1532,76 @@ mmss_noc: interconnect@174 { qcom,bcm-voters = <&apps_bcm_voter>; }; + ufs_mem_hc: ufshc@1d84000 { + compatible = "qcom,sc7180-ufshc", "qcom,ufshc", +"jedec,ufs-2.0"; + reg = <0 0x01d84000 0 0x3000>, + <0 0x01d9 0 0x8000>; + reg-names = "std", "ice"; Recently the ICE was separated into its own node + interrupts = ; + phys = <&ufs_mem_phy>; + phy-names = "ufsphy"; + lanes-per-direction = <1>; + power-domains = <&gcc UFS_PHY_GDSC>; + #reset-cells = <1>; + resets = <&gcc GCC_UFS_PHY_BCR>; + reset-names = "rst"; + + iommus = <&apps_smmu 0xa0 0x0>; + + clock-names = + "core_clk", Remove the newline after the =, here and below + "bus_aggr_clk", + "iface_clk", + "core_clk_unipro", + "ref_clk", + "tx_lane0_sync_clk", + "rx_lane0_sync_clk", + "ice_core_clk"; + clocks = + <&gcc GCC_UFS_PHY_AXI_CLK>, + <&gcc GCC_AGGRE_UFS_PHY_AXI_CLK>, + <&gcc GCC_UFS_PHY_AHB_CLK>, + <&gcc GCC_UFS_PHY_UNIPRO_CORE_CLK>, + <&rpmhcc RPMH_CXO_CLK>, + <&gcc GCC_UFS_PHY_TX_SYMBOL_0_CLK>, + <&gcc GCC_UFS_PHY_RX_SYMBOL_0_CLK>, + <&gcc GCC_UFS_PHY_ICE_CORE_CLK>; + freq-table-hz = + <5000 2>, + <0 0>, + <0 0>, + <3750 15000>, + <0 0>, + <0 0>, + <0 0>, + <0 3>; + + interconnects = <&aggre1_noc MASTER_UFS_MEM 0 &mc_virt SLAVE_EBI1 0>, + <&gem_noc MASTER_APPSS_PROC 0 &config_noc SLAVE_UFS_MEM_CFG 0>; Please make the include/dt-bindings/interconnect/qcom,icc.h like in sa8775p.dtsi + interconnect-names = "ufs-ddr", "cpu-ufs"; + + status = "disabled"; + }; + + ufs_mem_phy: phy@1d87000 { + compatible = "qcom,sc7180-qmp-ufs-phy"; + reg = <0 0x01d87000 0 0x1000>; + + clocks = <&gcc GCC_UFS_MEM_CLKREF_CLK>, + <&gcc GCC_UFS_PHY_PHY_AUX_CLK>; Please align the
Re: [PATCH] KVM: deprecate KVM_WERROR in favor of general WERROR
On Mon, Oct 09, 2023, Jakub Kicinski wrote: > On Mon, 9 Oct 2023 10:43:43 -0700 Sean Christopherson wrote: > > On Fri, Oct 06, 2023, Jakub Kicinski wrote: > > On a related topic, this is comically stale as WERROR is on by default for > > both > > allmodconfig and allyesconfig, which work because they trigger 64-bit > > builds. > > And KASAN on x86 is 64-bit only. > > > > Rather than yank out KVM_WERROR entirely, what if we make default=n and > > trim the > > depends down to "KVM && EXPERT && !KASAN"? E.g. > > IMO setting WERROR is a bit perverse. The way I see it WERROR is a > crutch for people who don't have the time / infra to properly build > test changes they send to Linus. Or wait for build bots to do their job. KVM_WERROR reduces the probability of issues in patches being sent to *me*. The reality is that most contributors do not have the knowledge and/or resources to "properly" build test changes without specific guidance on what/how to test, or what configs to prioritize. Nor is it realistic to expect that build bots will detect every issue in every possible configuration in every patch that's posted. Call -Werror a crutch if you will, but for me it's a crutch that I'm more than willing to lean on in order to increase the overall quality of KVM x86 submissions. > We do have sympathy for these folks, we are mostly volunteers after > all. At the same time someone's under-investment should not be causing > pain to those of us who _do_ build test stuff carefully. This is a bit over the top. Yeah, I need to add W=1 to my build scripts, but that's not a lack of investment, just an oversight. Though in this case it likely wouldn't have made any difference since Paolo grabbed the patches directly and might have even bypassed linux-next. But again I would argue that's bad process, not a lack of investment. > Rather than tweak stuff I'd prefer if we could agree that local -Werror > is anti-social :( > > The global WERROR seems to be a good compromise. I disagree. WERROR simply doesn't provide the same coverage. E.g. it can't be enabled for i386 without tuning FRAME_WARN, which (a) won't be at all obvious to the average contributor and (b) increasing FRAME_WARN effectively reduces the test coverage of KVM i386. For KVM x86, I want the rules for contributing to be clearly documented, and as simple as possible. I don't see a sane way to achieve that with WERROR=y.
Re: [PATCH] KVM: deprecate KVM_WERROR in favor of general WERROR
On Mon, 9 Oct 2023 10:43:43 -0700 Sean Christopherson wrote: > On Fri, Oct 06, 2023, Jakub Kicinski wrote: > > Setting WERROR for random subsystems make life really hard > > for subsystems which want to build-test their stuff with W=1. > > WERROR for the entire kernel now exists and can be used > > instead. W=1 people probably know how to deal with the global > > W=1 already, tracking all per-subsystem WERRORs is too much... > > I assume s/W=1/WERROR=y in this line? Yes, sorry about that. > > Link: > > https://lore.kernel.org/all/0da9874b6e9fcbaaa5edeb345d7e2a7c859fc818.1696271334.git.thomas.lenda...@amd.com/ > > Signed-off-by: Jakub Kicinski > > --- > > Documentation/process/maintainer-kvm-x86.rst | 2 +- > > arch/x86/kvm/Kconfig | 14 -- > > arch/x86/kvm/Makefile| 1 - > > 3 files changed, 1 insertion(+), 16 deletions(-) > > > > diff --git a/Documentation/process/maintainer-kvm-x86.rst > > b/Documentation/process/maintainer-kvm-x86.rst > > index 9183bd449762..cd70c0351108 100644 > > --- a/Documentation/process/maintainer-kvm-x86.rst > > +++ b/Documentation/process/maintainer-kvm-x86.rst > > @@ -243,7 +243,7 @@ context and disambiguate the reference. > > Testing > > --- > > At a bare minimum, *all* patches in a series must build cleanly for > > KVM_INTEL=m > > -KVM_AMD=m, and KVM_WERROR=y. Building every possible combination of > > Kconfigs > > +KVM_AMD=m, and WERROR=y. Building every possible combination of Kconfigs > > isn't feasible, but the more the merrier. KVM_SMM, KVM_XEN, > > PROVE_LOCKING, and > > X86_64 are particularly interesting knobs to turn. > > > > diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig > > index ed90f148140d..12929324ac3e 100644 > > --- a/arch/x86/kvm/Kconfig > > +++ b/arch/x86/kvm/Kconfig > > @@ -63,20 +63,6 @@ config KVM > > > > If unsure, say N. > > > > -config KVM_WERROR > > - bool "Compile KVM with -Werror" > > - # KASAN may cause the build to fail due to larger frames > > - default y if X86_64 && !KASAN > > Hrm, I am loath to give up KVM's targeted -Werror as it allows for more > aggresive > enabling, e.g. enabling CONFIG_WERROR for i386 builds with other defaults > doesn't > work because of CONFIG_FRAME_WARN=1024. That in turns means making WERROR=y a > requirement in maintainer-kvm-x86.rst is likely unreasonable. > > And arguably KVM_WERROR is doing its job by flagging the linked W=1 error. > The > problem there lies more in my build testing, which I'll go fix by adding a W=1 > configuration or three. As the changelog notes, I highly doubt W=1 builds > work > with WERROR, whereas keeping KVM x86 warning-free even with W=1 is feasible. > > > - # We use the dependency on !COMPILE_TEST to not be enabled > > - # blindly in allmodconfig or allyesconfig configurations > > - depends on KVM > > - depends on (X86_64 && !KASAN) || !COMPILE_TEST > > On a related topic, this is comically stale as WERROR is on by default for > both > allmodconfig and allyesconfig, which work because they trigger 64-bit builds. > And KASAN on x86 is 64-bit only. > > Rather than yank out KVM_WERROR entirely, what if we make default=n and trim > the > depends down to "KVM && EXPERT && !KASAN"? E.g. IMO setting WERROR is a bit perverse. The way I see it WERROR is a crutch for people who don't have the time / infra to properly build test changes they send to Linus. Or wait for build bots to do their job. We do have sympathy for these folks, we are mostly volunteers after all. At the same time someone's under-investment should not be causing pain to those of us who _do_ build test stuff carefully. Rather than tweak stuff I'd prefer if we could agree that local -Werror is anti-social :( The global WERROR seems to be a good compromise.
Re: [PATCH v2 3/6] ACPI: AC: Replace acpi_driver with platform_driver
On Mon, Oct 9, 2023 at 3:04 PM Wilczynski, Michal wrote: > > > > On 10/9/2023 2:27 PM, Rafael J. Wysocki wrote: > > On Mon, Oct 9, 2023 at 10:40 AM Wilczynski, Michal > > wrote: > >> [cut] > >> Yeah we could add platform device without removing acpi device, and > >> yes that would introduce data duplication, like Andy noticed. > > But he hasn't answered my question regarding what data duplication he > > meant, exactly. > > > > So what data duplication do you mean? > > So what I meant was that many drivers would find it useful to have > a struct device embedded in their 'private structure', and in that case > there would be a duplication of platform_device and acpi_device as > both pointers would be needed. It all depends on how often each of them is going to be used in the given driver. This particular driver only needs a struct acpi_device pointer if I'm not mistaken. > Mind this if you only have struct device > it's easy to get acpi device, but it's not the case the other way around. > > So for this driver it's just a matter of sticking to convention, There is no convention in this respect and there is always a tradeoff between using more memory and using more CPU time in computing in general, but none of them should be wasted just for the sake of following a convention. > for the others like it would be duplication. So I'm only talking about the driver modified by the patch at hand. > In my version of this patch we managed to avoid this duplication, thanks > to the contextual argument introduced before, but look at this patch: > https://github.com/mwilczy/linux-pm/commit/cc8ef52707341e67a12067d6ead991d56ea017ca > > Author of this patch had to introduce platform_device and acpi_device to the > struct ac, so > there was some duplication. That is the case for many drivers to come, so I > decided it's better > to change this and have a pattern with keeping only 'struct device'. Well, if the only thing you need from a struct device is its ACPI_COMPANION(), it is better to store a pointer to the latter.
Re: [PATCH] KVM: deprecate KVM_WERROR in favor of general WERROR
On Fri, Oct 06, 2023, Jakub Kicinski wrote: > Setting WERROR for random subsystems make life really hard > for subsystems which want to build-test their stuff with W=1. > WERROR for the entire kernel now exists and can be used > instead. W=1 people probably know how to deal with the global > W=1 already, tracking all per-subsystem WERRORs is too much... I assume s/W=1/WERROR=y in this line? > Link: > https://lore.kernel.org/all/0da9874b6e9fcbaaa5edeb345d7e2a7c859fc818.1696271334.git.thomas.lenda...@amd.com/ > Signed-off-by: Jakub Kicinski > --- > Documentation/process/maintainer-kvm-x86.rst | 2 +- > arch/x86/kvm/Kconfig | 14 -- > arch/x86/kvm/Makefile| 1 - > 3 files changed, 1 insertion(+), 16 deletions(-) > > diff --git a/Documentation/process/maintainer-kvm-x86.rst > b/Documentation/process/maintainer-kvm-x86.rst > index 9183bd449762..cd70c0351108 100644 > --- a/Documentation/process/maintainer-kvm-x86.rst > +++ b/Documentation/process/maintainer-kvm-x86.rst > @@ -243,7 +243,7 @@ context and disambiguate the reference. > Testing > --- > At a bare minimum, *all* patches in a series must build cleanly for > KVM_INTEL=m > -KVM_AMD=m, and KVM_WERROR=y. Building every possible combination of Kconfigs > +KVM_AMD=m, and WERROR=y. Building every possible combination of Kconfigs > isn't feasible, but the more the merrier. KVM_SMM, KVM_XEN, PROVE_LOCKING, > and > X86_64 are particularly interesting knobs to turn. > > diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig > index ed90f148140d..12929324ac3e 100644 > --- a/arch/x86/kvm/Kconfig > +++ b/arch/x86/kvm/Kconfig > @@ -63,20 +63,6 @@ config KVM > > If unsure, say N. > > -config KVM_WERROR > - bool "Compile KVM with -Werror" > - # KASAN may cause the build to fail due to larger frames > - default y if X86_64 && !KASAN Hrm, I am loath to give up KVM's targeted -Werror as it allows for more aggresive enabling, e.g. enabling CONFIG_WERROR for i386 builds with other defaults doesn't work because of CONFIG_FRAME_WARN=1024. That in turns means making WERROR=y a requirement in maintainer-kvm-x86.rst is likely unreasonable. And arguably KVM_WERROR is doing its job by flagging the linked W=1 error. The problem there lies more in my build testing, which I'll go fix by adding a W=1 configuration or three. As the changelog notes, I highly doubt W=1 builds work with WERROR, whereas keeping KVM x86 warning-free even with W=1 is feasible. > - # We use the dependency on !COMPILE_TEST to not be enabled > - # blindly in allmodconfig or allyesconfig configurations > - depends on KVM > - depends on (X86_64 && !KASAN) || !COMPILE_TEST On a related topic, this is comically stale as WERROR is on by default for both allmodconfig and allyesconfig, which work because they trigger 64-bit builds. And KASAN on x86 is 64-bit only. Rather than yank out KVM_WERROR entirely, what if we make default=n and trim the depends down to "KVM && EXPERT && !KASAN"? E.g. diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig index 8452ed0228cb..c2466304aa6a 100644 --- a/arch/x86/kvm/Kconfig +++ b/arch/x86/kvm/Kconfig @@ -65,13 +65,12 @@ config KVM config KVM_WERROR bool "Compile KVM with -Werror" - # KASAN may cause the build to fail due to larger frames - default y if X86_64 && !KASAN - # We use the dependency on !COMPILE_TEST to not be enabled - # blindly in allmodconfig or allyesconfig configurations - depends on KVM - depends on (X86_64 && !KASAN) || !COMPILE_TEST - depends on EXPERT + # Disallow KVM's -Werror if KASAN=y, e.g. to guard against randomized + # configs from selecting KVM_WERROR=y. KASAN builds generates warnings + # for the default FRAME_WARN, i.e. KVM_WERROR=y with KASAN=y requires + # special tuning. Building KVM with -Werror and KASAN is still doable + * via enabling the kernel-wide WERROR=y. + depends on KVM && EXPERT && !KASAN help Add -Werror to the build flags for KVM.
Re: [PATCH 01/10] appletalk: remove localtalk and ppp support
On Mon, Oct 9, 2023, at 18:49, Rodolfo Zitellini wrote: >> From: Arnd Bergmann >> >> The last localtalk driver is gone now, and ppp support was never fully >> merged, so clean up the appletalk code by removing the obvious dead >> code paths. >> >> Notably, this removes one of the two callers of the old .ndo_do_ioctl() >> callback that was abused for getting device addresses and is now >> only used in the ieee802154 subsystem, which still uses the same trick. >> >> The include/uapi/linux/if_ltalk.h header might still be required >> for building userspace programs, but I made sure that debian code >> search and the netatalk upstream have no references it it, so it >> should be fine to remove. >> >> Signed-off-by: Arnd Bergmann > > Hi! > I’ve been working on a new LocalTalk interface driver for the last > couple months, do you think it would be possible to at least postpone > the removal of LT a bit? > > It is a driver for an open source device called TashTalk > (https://github.com/lampmerchant/tashtalk), which runs on a PIC micro > that does all the LT interfacing, and communicates back via serial to > the host system. My driver is relatively simple and works very well > with netatalk 2.2 (which is still maintained and still has support for > AppleTalk). The driver is basically complete and trsted and I was > preparing to submit a patch. > > Still having LocalTalk in my view has many advantages for us > enthusiasts that still want to bridge old machines to the current world > without modifications, for example for printing on modern printers, > netbooting, sharing files and even tcp/ip. All this basically works out > of the box via the driver, Linux and available userspace tools > (netatalk, macipgw). > > The old ISA cards supported by COPS were basically unobtanium even 20 > years ago, but the solution of using a PIC and a serial port is very > robust and much more furure-proof. We also already have a device that > can interface a modern machine directly via USB to LocalTalk. > > The development of the TashTalk has been also extensively discussed on > thr 68KMLA forum > (https://68kmla.org/bb/index.php?threads/modtashtalk-lt0-driver-for-linux.45031/) > > I hope the decision to remove LocalTalk can be reconsidered at least > for the time being so there is a chance to submit a new, modern device > making use of this stack. Nothing is decided, I'm just proposing my patch as a cleanup for now. It would be nice to still drop the ndo_do_ioctl function though, at least in some form. When your driver actually makes it into the kernel, you can find a different method of communicating the address between the socket interface and the device driver. I can see a few ways this could work out: - add a custom callback pointer to struct atalk_iface to get and set the address for phase1 probing instead of going through the ioctl - rewrite the probing logic in aarp.c more widely, and improve the userspace interface in the process by introducing a netlink interface - Move your entire driver into userspace and go to the kernel using tun/tap. This has the added benefit of avoiding a lot of the complexity of the tty line discipline code you have. Arnd
Re: [PATCH] KVM: deprecate KVM_WERROR in favor of general WERROR
On Fri, Oct 06, 2023 at 01:54:15PM -0700, Jakub Kicinski wrote: > Setting WERROR for random subsystems make life really hard > for subsystems which want to build-test their stuff with W=1. > WERROR for the entire kernel now exists and can be used > instead. W=1 people probably know how to deal with the global > W=1 already, tracking all per-subsystem WERRORs is too much... > > Link: > https://lore.kernel.org/all/0da9874b6e9fcbaaa5edeb345d7e2a7c859fc818.1696271334.git.thomas.lenda...@amd.com/ > Signed-off-by: Jakub Kicinski Yeah, best to have just the global option. Reviewed-by: Kees Cook -- Kees Cook
Re: [PATCH 01/10] appletalk: remove localtalk and ppp support
> From: Arnd Bergmann > > The last localtalk driver is gone now, and ppp support was never fully > merged, so clean up the appletalk code by removing the obvious dead > code paths. > > Notably, this removes one of the two callers of the old .ndo_do_ioctl() > callback that was abused for getting device addresses and is now > only used in the ieee802154 subsystem, which still uses the same trick. > > The include/uapi/linux/if_ltalk.h header might still be required > for building userspace programs, but I made sure that debian code > search and the netatalk upstream have no references it it, so it > should be fine to remove. > > Signed-off-by: Arnd Bergmann Hi! I’ve been working on a new LocalTalk interface driver for the last couple months, do you think it would be possible to at least postpone the removal of LT a bit? It is a driver for an open source device called TashTalk (https://github.com/lampmerchant/tashtalk), which runs on a PIC micro that does all the LT interfacing, and communicates back via serial to the host system. My driver is relatively simple and works very well with netatalk 2.2 (which is still maintained and still has support for AppleTalk). The driver is basically complete and trsted and I was preparing to submit a patch. Still having LocalTalk in my view has many advantages for us enthusiasts that still want to bridge old machines to the current world without modifications, for example for printing on modern printers, netbooting, sharing files and even tcp/ip. All this basically works out of the box via the driver, Linux and available userspace tools (netatalk, macipgw). The old ISA cards supported by COPS were basically unobtanium even 20 years ago, but the solution of using a PIC and a serial port is very robust and much more furure-proof. We also already have a device that can interface a modern machine directly via USB to LocalTalk. The development of the TashTalk has been also extensively discussed on thr 68KMLA forum (https://68kmla.org/bb/index.php?threads/modtashtalk-lt0-driver-for-linux.45031/) I hope the decision to remove LocalTalk can be reconsidered at least for the time being so there is a chance to submit a new, modern device making use of this stack. Many Thanks, Rodolfo Zitellini
Re: [PATCH 5/5] kbuild: unify no-compiler-targets and no-sync-config-targets
On Mon, Oct 09, 2023 at 09:42:10PM +0900, Masahiro Yamada wrote: > Now that vdso_install does not depend on any in-tree build artifact, > it no longer invokes a compiler, making no-compiler-targets the same > as no-sync-config-targets. > > Signed-off-by: Masahiro Yamada > --- > > Makefile | 13 + > 1 file changed, 1 insertion(+), 12 deletions(-) > > diff --git a/Makefile b/Makefile > index 2170d56630e8..982b1ad33287 100644 > --- a/Makefile > +++ b/Makefile > @@ -277,10 +277,6 @@ no-dot-config-targets := $(clean-targets) \ >$(version_h) headers headers_% archheaders archscripts > \ >%asm-generic kernelversion %src-pkg dt_binding_check \ >outputmakefile rustavailable rustfmt rustfmtcheck > -# Installation targets should not require compiler. Unfortunately, > vdso_install > -# is an exception where build artifacts may be updated. This must be fixed. > -no-compiler-targets := $(no-dot-config-targets) install dtbs_install \ > - headers_install modules_install modules_sign > kernelrelease image_name > no-sync-config-targets := $(no-dot-config-targets) %install modules_sign > kernelrelease \ > image_name > single-targets := %.a %.i %.ko %.lds %.ll %.lst %.mod %.o %.rsi %.s > %.symtypes %/ > @@ -288,7 +284,6 @@ single-targets := %.a %.i %.ko %.lds %.ll %.lst %.mod %.o > %.rsi %.s %.symtypes % > config-build := > mixed-build := > need-config := 1 > -need-compiler:= 1 > may-sync-config := 1 > single-build := > > @@ -298,12 +293,6 @@ ifneq ($(filter $(no-dot-config-targets), > $(MAKECMDGOALS)),) > endif > endif > > -ifneq ($(filter $(no-compiler-targets), $(MAKECMDGOALS)),) > - ifeq ($(filter-out $(no-compiler-targets), $(MAKECMDGOALS)),) > - need-compiler := > - endif > -endif > - MIPS and LoongArch seem to have grown a usage of need-compiler in 4fe4a6374c4d ("MIPS: Only fiddle with CHECKFLAGS if `need-compiler'") and 54c2c9df083f ("LoongArch: Only fiddle with CHECKFLAGS if `need-compiler'"). With this removal, should those be updated as well? > ifneq ($(filter $(no-sync-config-targets), $(MAKECMDGOALS)),) > ifeq ($(filter-out $(no-sync-config-targets), $(MAKECMDGOALS)),) > may-sync-config := > @@ -675,7 +664,7 @@ endif > > # Include this also for config targets because some architectures need > # cc-cross-prefix to determine CROSS_COMPILE. > -ifdef need-compiler > +ifdef may-sync-config > include $(srctree)/scripts/Makefile.compiler > endif > > -- > 2.39.2 >
Re: [PATCH v1] Ftrace: make sched_wakeup can focus on the target process
On Mon, 9 Oct 2023 23:37:14 +0800 Jinyu Tang wrote: > When we want to know what happened in kernel when our app > has more latency than we hope, but the larger latency of > our app may be lower than other process in the syetem. > We feel sad after waiting a long time but only get other > process sched_wakeup trace. > > This Patch can let us only trace target process sched-wakeup > time, other process sched-wakeup will be dropped and won't > change tracing_max_latency. > > The patch is tested by the following commands: > > $ mount -t tracefs none /sys/kernel/tracing > $ echo wakeup_rt > /sys/kernel/tracing/current_tracer > # some other stress-ng options are also tested > $ stress-ng --cpu 4 & > $ cyclictest --mlockall --smp --priority=99 & > $ cyclictest_pid=$! > # child thread of cyclictest main process > $ thread_pid=$((cyclictest_pid + 1)) > > $ echo ${thread_pid} > /sys/kernel/tracing/set_wakeup_pid > > $ echo 1 > /sys/kernel/tracing/tracing_on > $ echo 0 > /sys/kernel/tracing/tracing_max_latency > $ wait ${cyclictest_pid} > $ echo 0 > /sys/kernel/tracing/tracing_on > $ cat /sys/kernel/tracing/trace > > The maximum latency and backtrace recorded in the trace file will be only > generated by the target process. > Then we can eliminate interference from other programs, making it easier > to identify the cause of latency. > > Tested-by: Jiexun Wang > Signed-off-by: Jinyu Tang > --- Honestly, the wakeup tracers are obsolete. I haven't used them in years. I use synthetic events instead: # cd /sys/kernel/tracing # echo 'wakeup_lat pid_t pid; u64 delay;' > synthetic_events # echo 'hist:keys=pid:ts=common_timestamp.usecs' if pid==$thread_pid > events/sched/sched_waking/trigger # echo 'hist:keys=next_pid:delta=common_timestamp.usecs-$ts:onmax($delta).trace(wakeup_lat,next_pid,$delta)' > events/sched/sched_switch/trigger # echo 1 > events/synthetic/wakeup_lat/enable # cat trace # tracer: nop # # entries-in-buffer/entries-written: 3/3 #P:8 # #_-=> irqs-off/BH-disabled # / _=> need-resched # | / _---=> hardirq/softirq # || / _--=> preempt-depth # ||| / _-=> migrate-disable # / delay # TASK-PID CPU# | TIMESTAMP FUNCTION # | | | | | | -0 [000] d..4. 350799.423428: wakeup_lat: pid=59921 delay=1281 -0 [000] d..4. 350800.423441: wakeup_lat: pid=59921 delay=1317 -0 [000] d..4. 350801.423445: wakeup_lat: pid=59921 delay=1331 I could also make it record stack traces, disable tracing, and all sorts of other nifty things. -- Steve
Re: [PATCH rebased] kbuild: rpm-pkg: Fix build with non-default MODLIB
On Monday 2023-10-09 17:14, Masahiro Yamada wrote: > >Let me add more context to my question. > >I am interested in the timing when >'pkg-config --print-variables kmod | grep module_directory' >is executed. > >1. Build a SRPM on machine A >2. Copy the SRPM from machine A to machine B >3. Run rpmbuild on machine B to build the SRPM into a RPM >4. Copy the RPM from machine B to machine C >5. Install the RPM to machine C In over 20 years of Linux distros existing, the one thing that everyone has learned is that installing foreign RPM packages (or any other format) is probably not going to work for one reason or another. Different package names in Require: lines (just think of the switch from modutils to kmod), different ABIs.. The overwhelming amount of package production that is going on these days targets distro(A) == distro(B) == distro(C). Yeah, the kernel package is kinda special because the files in it are freestanding executables, but still..
[PATCH v1] Ftrace: make sched_wakeup can focus on the target process
When we want to know what happened in kernel when our app has more latency than we hope, but the larger latency of our app may be lower than other process in the syetem. We feel sad after waiting a long time but only get other process sched_wakeup trace. This Patch can let us only trace target process sched-wakeup time, other process sched-wakeup will be dropped and won't change tracing_max_latency. The patch is tested by the following commands: $ mount -t tracefs none /sys/kernel/tracing $ echo wakeup_rt > /sys/kernel/tracing/current_tracer # some other stress-ng options are also tested $ stress-ng --cpu 4 & $ cyclictest --mlockall --smp --priority=99 & $ cyclictest_pid=$! # child thread of cyclictest main process $ thread_pid=$((cyclictest_pid + 1)) $ echo ${thread_pid} > /sys/kernel/tracing/set_wakeup_pid $ echo 1 > /sys/kernel/tracing/tracing_on $ echo 0 > /sys/kernel/tracing/tracing_max_latency $ wait ${cyclictest_pid} $ echo 0 > /sys/kernel/tracing/tracing_on $ cat /sys/kernel/tracing/trace The maximum latency and backtrace recorded in the trace file will be only generated by the target process. Then we can eliminate interference from other programs, making it easier to identify the cause of latency. Tested-by: Jiexun Wang Signed-off-by: Jinyu Tang --- kernel/trace/trace.h | 3 + kernel/trace/trace_sched_wakeup.c | 179 ++ 2 files changed, 182 insertions(+) diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h index 77debe53f07c..c6f212e8bfd2 100644 --- a/kernel/trace/trace.h +++ b/kernel/trace/trace.h @@ -403,6 +403,9 @@ struct trace_array { #endif /* function tracing enabled */ int function_enabled; +#endif +#ifdef CONFIG_SCHED_TRACER + struct trace_pid_list __rcu *wakeup_pids; #endif int no_filter_buffering_ref; struct list_headhist_vars; diff --git a/kernel/trace/trace_sched_wakeup.c b/kernel/trace/trace_sched_wakeup.c index 0469a04a355f..b6cb9391e120 100644 --- a/kernel/trace/trace_sched_wakeup.c +++ b/kernel/trace/trace_sched_wakeup.c @@ -10,6 +10,9 @@ * Copyright (C) 2004-2006 Ingo Molnar * Copyright (C) 2004 Nadia Yvette Chambers */ +#include +#include +#include #include #include #include @@ -17,6 +20,7 @@ #include #include #include +#include #include "trace.h" static struct trace_array *wakeup_trace; @@ -361,6 +365,169 @@ static bool report_latency(struct trace_array *tr, u64 delta) return true; } +DEFINE_MUTEX(sched_wakeup_mutex); +static void * +p_next(struct seq_file *m, void *v, loff_t *pos) +{ + struct trace_array *tr = m->private; + struct trace_pid_list *pid_list = rcu_dereference_sched(tr->wakeup_pids); + + return trace_pid_next(pid_list, v, pos); +} + +static void *p_start(struct seq_file *m, loff_t *pos) + __acquires(RCU) +{ + struct trace_pid_list *pid_list; + struct trace_array *tr = m->private; + + /* +* Grab the mutex, to keep calls to p_next() having the same +* tr->wakeup_pids as p_start() has. +* If we just passed the tr->wakeup_pids around, then RCU would +* have been enough, but doing that makes things more complex. +*/ + mutex_lock(&sched_wakeup_mutex); + rcu_read_lock_sched(); + + pid_list = rcu_dereference_sched(tr->wakeup_pids); + + if (!pid_list) + return NULL; + + return trace_pid_start(pid_list, pos); +} + +static void p_stop(struct seq_file *m, void *p) + __releases(RCU) +{ + rcu_read_unlock_sched(); + mutex_unlock(&sched_wakeup_mutex); +} + +static const struct seq_operations show_set_pid_seq_ops = { + .start = p_start, + .next = p_next, + .show = trace_pid_show, + .stop = p_stop, +}; + +static int +ftrace_wakeup_open(struct inode *inode, struct file *file, + const struct seq_operations *seq_ops) +{ + struct seq_file *m; + int ret; + + ret = seq_open(file, seq_ops); + if (ret < 0) + return ret; + m = file->private_data; + /* copy tr over to seq ops */ + m->private = inode->i_private; + + return ret; +} + +static void __clear_wakeup_pids(struct trace_array *tr) +{ + struct trace_pid_list *pid_list; + + pid_list = rcu_dereference_protected(tr->wakeup_pids, + lockdep_is_held(&sched_wakeup_mutex)); + if (!pid_list) + return; + + rcu_assign_pointer(tr->wakeup_pids, NULL); + + synchronize_rcu(); + trace_pid_list_free(pid_list); +} + +static void clear_wakeup_pids(struct trace_array *tr) +{ + mutex_lock(&sched_wakeup_mutex); + __clear_wakeup_pids(tr); + mutex_unlock(&sched_wakeup_mutex); + +} + +static int +ftrace_set_wakeup_pid_open(struct inode *inode, struct file *file) +{ + const struct seq_operations *seq_ops = &sh
Re: [PATCH 04/10] staging: ks7010: remove unused ioctl handler
On Mon, Oct 09, 2023 at 04:19:02PM +0200, Arnd Bergmann wrote: > From: Arnd Bergmann > > The ndo_do_ioctl function has no actual callers, and doesn't do much here, > so just remove it entirely as preparation for removing the callback pointer > from net_device_ops. > > Signed-off-by: Arnd Bergmann Reviewed-by: Greg Kroah-Hartman
Re: [PATCH 05/10] staging: rtl8192: remove unused legacy ioctl handlers
On Mon, Oct 09, 2023 at 04:19:03PM +0200, Arnd Bergmann wrote: > From: Arnd Bergmann > > The .ndo_do_ioctl functions are never called, and can just be removed, > especially since this is a staging driver. > > Signed-off-by: Arnd Bergmann > --- > drivers/staging/rtl8192u/ieee80211/dot11d.c | 41 -- > drivers/staging/rtl8192u/ieee80211/dot11d.h | 2 - > .../staging/rtl8192u/ieee80211/ieee80211.h| 12 - > .../rtl8192u/ieee80211/ieee80211_softmac.c| 563 -- > drivers/staging/rtl8192u/r8192U.h | 2 - > drivers/staging/rtl8192u/r8192U_core.c| 109 > 6 files changed, 729 deletions(-) Reviewed-by: Greg Kroah-Hartman
Re: [PATCH 07/10] staging: rtl8723bs: remove dead code
On Mon, Oct 09, 2023 at 04:19:05PM +0200, Arnd Bergmann wrote: > From: Arnd Bergmann > > The .ndo_do_ioctl functions are never called, so the three implementation here > is useless but only works as a way to identify the device in the notifiers, > which can really be removed as well. > > Looking through the exported functions, I found a bunch more that have > no callers, so just drop all of those. > > Signed-off-by: Arnd Bergmann Reviewed-by: Greg Kroah-Hartman
Re: [PATCH 06/10] staging: rtl8712: remove unused legacy ioctl handlers
On Mon, Oct 09, 2023 at 04:19:04PM +0200, Arnd Bergmann wrote: > From: Arnd Bergmann > > The .ndo_do_ioctl functions are never called, and can just be removed, > especially since this is a staging driver. > > Signed-off-by: Arnd Bergmann > --- > drivers/staging/rtl8712/os_intfs.c| 1 - > drivers/staging/rtl8712/osdep_intf.h | 2 - > drivers/staging/rtl8712/rtl871x_ioctl_linux.c | 124 -- > 3 files changed, 127 deletions(-) Reviewed-by: Greg Kroah-Hartman
Re: [PATCH 1/5] csky: remove unused cmd_vdso_install
On Mon, Oct 9, 2023 at 8:42 PM Masahiro Yamada wrote: > > You cannot run this code because arch/csky/Makefile does not define the > vdso_install target. > > It appears that this code was blindly copied from another architecture. Yes, I do that. Thx for pointing it out. Acked-by: Guo Ren > > Remove the dead code. > > Signed-off-by: Masahiro Yamada > --- > > arch/csky/kernel/vdso/Makefile | 10 -- > 1 file changed, 10 deletions(-) > > diff --git a/arch/csky/kernel/vdso/Makefile b/arch/csky/kernel/vdso/Makefile > index 299e4e41ebc5..ddf784a62c11 100644 > --- a/arch/csky/kernel/vdso/Makefile > +++ b/arch/csky/kernel/vdso/Makefile > @@ -58,13 +58,3 @@ quiet_cmd_vdsold = VDSOLD $@ > # that contains the same symbols at the same offsets. > quiet_cmd_so2s = SO2S$@ >cmd_so2s = $(NM) -D $< | $(srctree)/$(src)/so2s.sh > $@ > - > -# install commands for the unstripped file > -quiet_cmd_vdso_install = INSTALL $@ > - cmd_vdso_install = cp $(obj)/$@.dbg $(MODLIB)/vdso/$@ > - > -vdso.so: $(obj)/vdso.so.dbg > - @mkdir -p $(MODLIB)/vdso > - $(call cmd,vdso_install) > - > -vdso_install: vdso.so > -- > 2.39.2 > -- Best Regards Guo Ren
Re: [PATCH v5 1/2] mm/memory_hotplug: split memmap_on_memory requests across memblocks
On 07.10.23 00:01, Verma, Vishal L wrote: On Fri, 2023-10-06 at 14:52 +0200, David Hildenbrand wrote: On 05.10.23 20:31, Vishal Verma wrote: <..> @@ -2167,47 +2221,28 @@ static int __ref try_remove_memory(u64 start, u64 size) if (rc) return rc; + mem_hotplug_begin(); + /* - * We only support removing memory added with MHP_MEMMAP_ON_MEMORY in - * the same granularity it was added - a single memory block. + * For memmap_on_memory, the altmaps could have been added on + * a per-memblock basis. Loop through the entire range if so, + * and remove each memblock and its altmap. */ if (mhp_memmap_on_memory()) { - rc = walk_memory_blocks(start, size, &mem, test_has_altmap_cb); - if (rc) { - if (size != memory_block_size_bytes()) { - pr_warn("Refuse to remove %#llx - %#llx," - "wrong granularity\n", - start, start + size); - return -EINVAL; - } - altmap = mem->altmap; - /* - * Mark altmap NULL so that we can add a debug - * check on memblock free. - */ - mem->altmap = NULL; - } + unsigned long memblock_size = memory_block_size_bytes(); + u64 cur_start; + + for (cur_start = start; cur_start < start + size; + cur_start += memblock_size) + remove_memory_block_and_altmap(nid, cur_start, + memblock_size); + } else { + remove_memory_block_and_altmap(nid, start, size); Better call remove_memory_block_devices() and arch_remove_memory(start, size, altmap) here explicitly instead of using remove_memory_block_and_altmap() that really can only handle a single memory block with any inputs. I'm not sure I follow. Even in the non memmap_on_memory case, we'd have to walk_memory_blocks() to get to the memory_block->altmap, right? See my other reply to, at least with mhp_memmap_on_memory()==false, we don't have to worry about the altmap. Or is there a more direct way? If we have to walk_memory_blocks, what's the advantage of calling those directly instead of calling the helper created above? I think we have two cases to handle 1) All have an altmap. Remove them block-by-block. Probably we should call a function remove_memory_blocks(altmap=true) [or alternatively remove_memory_blocks_and_altmaps()] and just handle iterating internally. 2) All don't have an altmap. We can remove them in one go. Probably we should call that remove_memory_blocks(altmap=false) [or alternatively remove_memory_blocks_no_altmaps()]. I guess it's best to do a walk upfront to make sure either all have an altmap or none has one. Then we can branch off to the right function knowing whether we have to process altmaps or not. The existing if (mhp_memmap_on_memory()) { ... } Can be extended for that case. Please let me know if I failed to express what I mean, then I can briefly prototype it on top of your changes. -- Cheers, David / dhildenb
Re: [PATCH rebased] kbuild: rpm-pkg: Fix build with non-default MODLIB
On Mon, Oct 9, 2023 at 11:07 PM Michal Suchánek wrote: > > On Mon, Oct 09, 2023 at 09:34:10PM +0900, Masahiro Yamada wrote: > > On Mon, Oct 9, 2023 at 5:52 PM Michal Suchánek wrote: > > > > > > Hello, > > > > > > On Mon, Oct 09, 2023 at 05:31:02PM +0900, Masahiro Yamada wrote: > > > > On Fri, Oct 6, 2023 at 12:49 AM Michal Suchanek > > > > wrote: > > > > > > > > > > The default MODLIB value is composed of two variables and the > > > > > hardcoded > > > > > string '/lib/modules/'. > > > > > > > > > > MODLIB = $(INSTALL_MOD_PATH)/lib/modules/$(KERNELRELEASE) > > > > > > > > > > Defining this middle part as a variable was rejected on the basis that > > > > > users can pass the whole MODLIB to make, such as > > > > > > > > > > > > In other words, do you want to say > > > > > > > > "If defining this middle part as a variable had been accepted, > > > > this patch would have been unneeded." ? > > > > > > If it were accepted I would not have to guess what the middle part is, > > > and could use the variable that unambiguosly defines it instead. > > > > > > How? > > > > scripts/package/kernel.spec hardcodes 'lib/modules' > > in a couple of places. > > > > I am asking how to derive the module path. > > Not sure what you are asking here. The path is hardcoded, everywhere. > > The current Makefile has > > MODLIB = $(INSTALL_MOD_PATH)/lib/modules/$(KERNELRELEASE) > > and there is no reliable way to learn what the middle part was after the > fact - $(INSTALL_MOD_PATH) can be non-empty. > > The rejected patch was changing this to a variable, and also default to > adjusting the content to what kmod exports in pkgconfig after applying a > proposed patch to make this hardcoded part configurable: > > export KERNEL_MODULE_DIRECTORY := $(shell pkg-config --print-variables kmod > 2>/dev/null | grep '^module_directory$$' >/dev/null && pkg-config > --variable=module_directory kmod || echo /lib/modules) > > MODLIB = $(INSTALL_MOD_PATH)$(KERNEL_MODULE_DIRECTORY)/$(KERNELRELEASE) > > It would be completely posible to only define the middle part as a > variable that could then be used in rpm-pkg: > > export KERNEL_MODULE_DIRECTORY := /lib/modules > > MODLIB = $(INSTALL_MOD_PATH)$(KERNEL_MODULE_DIRECTORY)/$(KERNELRELEASE) > > Thanks > > Michal > > Let me add more context to my question. I am interested in the timing when 'pkg-config --print-variables kmod | grep module_directory' is executed. 1. Build a SRPM on machine A 2. Copy the SRPM from machine A to machine B 3. Run rpmbuild on machine B to build the SRPM into a RPM 4. Copy the RPM from machine B to machine C 5. Install the RPM to machine C Of course, we are most interested in the module path of machine C, but it is difficult/impossible to guess it at the time of building. We can assume machine B == machine C. We are the second most interested in the module path on machine B. The module path of machine A is not important. So, I am asking where you would inject 'pkg-config --print-variables kmod | grep module_directory'. -- Best Regards Masahiro Yamada
Re: [PATCH v5 1/2] mm/memory_hotplug: split memmap_on_memory requests across memblocks
On 07.10.23 10:55, Huang, Ying wrote: Vishal Verma writes: The MHP_MEMMAP_ON_MEMORY flag for hotplugged memory is restricted to 'memblock_size' chunks of memory being added. Adding a larger span of memory precludes memmap_on_memory semantics. For users of hotplug such as kmem, large amounts of memory might get added from the CXL subsystem. In some cases, this amount may exceed the available 'main memory' to store the memmap for the memory being added. In this case, it is useful to have a way to place the memmap on the memory being added, even if it means splitting the addition into memblock-sized chunks. Change add_memory_resource() to loop over memblock-sized chunks of memory if caller requested memmap_on_memory, and if other conditions for it are met. Teach try_remove_memory() to also expect that a memory range being removed might have been split up into memblock sized chunks, and to loop through those as needed. Cc: Andrew Morton Cc: David Hildenbrand Cc: Michal Hocko Cc: Oscar Salvador Cc: Dan Williams Cc: Dave Jiang Cc: Dave Hansen Cc: Huang Ying Suggested-by: David Hildenbrand Signed-off-by: Vishal Verma --- mm/memory_hotplug.c | 162 1 file changed, 99 insertions(+), 63 deletions(-) diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c index f8d3e7427e32..77ec6f15f943 100644 --- a/mm/memory_hotplug.c +++ b/mm/memory_hotplug.c @@ -1380,6 +1380,44 @@ static bool mhp_supports_memmap_on_memory(unsigned long size) return arch_supports_memmap_on_memory(vmemmap_size); } +static int add_memory_create_devices(int nid, struct memory_group *group, +u64 start, u64 size, mhp_t mhp_flags) +{ + struct mhp_params params = { .pgprot = pgprot_mhp(PAGE_KERNEL) }; + struct vmem_altmap mhp_altmap = { + .base_pfn = PHYS_PFN(start), + .end_pfn = PHYS_PFN(start + size - 1), + }; + int ret; + + if ((mhp_flags & MHP_MEMMAP_ON_MEMORY)) { + mhp_altmap.free = memory_block_memmap_on_memory_pages(); + params.altmap = kmalloc(sizeof(struct vmem_altmap), GFP_KERNEL); + if (!params.altmap) + return -ENOMEM; + + memcpy(params.altmap, &mhp_altmap, sizeof(mhp_altmap)); + } + + /* call arch's memory hotadd */ + ret = arch_add_memory(nid, start, size, ¶ms); + if (ret < 0) + goto error; + + /* create memory block devices after memory was added */ + ret = create_memory_block_devices(start, size, params.altmap, group); + if (ret) + goto err_bdev; + + return 0; + +err_bdev: + arch_remove_memory(start, size, NULL); +error: + kfree(params.altmap); + return ret; +} + /* * NOTE: The caller must call lock_device_hotplug() to serialize hotplug * and online/offline operations (triggered e.g. by sysfs). @@ -1388,14 +1426,10 @@ static bool mhp_supports_memmap_on_memory(unsigned long size) */ int __ref add_memory_resource(int nid, struct resource *res, mhp_t mhp_flags) { - struct mhp_params params = { .pgprot = pgprot_mhp(PAGE_KERNEL) }; + unsigned long memblock_size = memory_block_size_bytes(); enum memblock_flags memblock_flags = MEMBLOCK_NONE; - struct vmem_altmap mhp_altmap = { - .base_pfn = PHYS_PFN(res->start), - .end_pfn = PHYS_PFN(res->end), - }; struct memory_group *group = NULL; - u64 start, size; + u64 start, size, cur_start; bool new_node = false; int ret; @@ -1436,28 +1470,21 @@ int __ref add_memory_resource(int nid, struct resource *res, mhp_t mhp_flags) /* * Self hosted memmap array */ - if (mhp_flags & MHP_MEMMAP_ON_MEMORY) { - if (mhp_supports_memmap_on_memory(size)) { - mhp_altmap.free = memory_block_memmap_on_memory_pages(); - params.altmap = kmalloc(sizeof(struct vmem_altmap), GFP_KERNEL); - if (!params.altmap) + if ((mhp_flags & MHP_MEMMAP_ON_MEMORY) && + mhp_supports_memmap_on_memory(memblock_size)) { + for (cur_start = start; cur_start < start + size; +cur_start += memblock_size) { + ret = add_memory_create_devices(nid, group, cur_start, + memblock_size, + mhp_flags); + if (ret) goto error; - - memcpy(params.altmap, &mhp_altmap, sizeof(mhp_altmap)); } - /* fallback to not using altmap */ - } - - /* call arch's memory hotadd */ - ret = arch_add_memory(nid, start, size, ¶ms); - if (ret < 0) - goto error_free; - - /* create me
Re: [PATCH v1] samples: kprobes: Fixes a typo
On Mon, 9 Oct 2023 09:51:03 -0400 Steven Rostedt wrote: > On Sat, 7 Oct 2023 21:09:00 +0530 > Atul Kumar Pant wrote: > > > On Sat, Sep 23, 2023 at 11:00:46PM +0530, Atul Kumar Pant wrote: > > > On Thu, Aug 17, 2023 at 10:38:19PM +0530, Atul Kumar Pant wrote: > > > > Fixes typo in a function name. > > > > > > > > Signed-off-by: Atul Kumar Pant > > > > --- > > > > samples/kprobes/kretprobe_example.c | 2 +- > > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > > > diff --git a/samples/kprobes/kretprobe_example.c > > > > b/samples/kprobes/kretprobe_example.c > > > > index cbf16542d84e..ed79fd3d48fb 100644 > > > > --- a/samples/kprobes/kretprobe_example.c > > > > +++ b/samples/kprobes/kretprobe_example.c > > > > @@ -35,7 +35,7 @@ struct my_data { > > > > ktime_t entry_stamp; > > > > }; > > > > > > > > -/* Here we use the entry_hanlder to timestamp function entry */ > > > > +/* Here we use the entry_handler to timestamp function entry */ > > > > static int entry_handler(struct kretprobe_instance *ri, struct pt_regs > > > > *regs) > > > > { > > > > struct my_data *data; > > > > -- > > > > 2.25.1 > > > > > > > > > > Hi all, can someone provide comments on this change. > > > > Hi all, can someone please review this change. It has > > been not > > reviewed for quite some time. > > That's because trivial typos in comments are considered very low priority, > and are usually only added (if they are ever added) if the maintainer has > extra time, which may not be for a while. Anyway, let me pick this. I found this in my inbox now. :) Thank you, > > -- Steve -- Masami Hiramatsu (Google)
Re: [PATCH net-next v7] net/core: Introduce netdev_core_stats_inc()
On Mon, 9 Oct 2023 18:58:27 +0800 Yajun Deng wrote: > > C compiler decides to inline or not, depending on various factors. > > > > The most efficient (and small) code is generated by this_cpu_inc() > > version, allowing the compiler to inline it. > > > > If you copy/paste this_cpu_inc() twenty times, then the compiler > > would not inline the function anymore. Yes, if you want something to be visible by ftrace, it must not be inlined (as inlined functions are not function calls by definition). And as Eric stated, the compiler is perfectly allowed to inline something if it believes it will be more efficient. i.e. There may be code around the function call that could be more efficient if it wasn't change to parameters. If you want to make sure a function stays out of line, you must explicitly tell the compiler you want the function not to ever be inlined (hence the "noinline" attribute). > > > Got it. Thank you. Great. -- Steve
[PATCH 10/10] net: remove ndo_do_ioctl handler
From: Arnd Bergmann All of the references to the callback pointer are gone, so remove the pointer itself before we grow new references to it. Signed-off-by: Arnd Bergmann --- Documentation/networking/netdevices.rst | 8 include/linux/netdevice.h | 7 --- 2 files changed, 15 deletions(-) diff --git a/Documentation/networking/netdevices.rst b/Documentation/networking/netdevices.rst index 9e4cccb90b870..6f9b71c5d37b8 100644 --- a/Documentation/networking/netdevices.rst +++ b/Documentation/networking/netdevices.rst @@ -218,14 +218,6 @@ ndo_stop: Context: process Note: netif_running() is guaranteed false -ndo_do_ioctl: - Synchronization: rtnl_lock() semaphore. - Context: process - -This is only called by network subsystems internally, -not by user space calling ioctl as it was in before -linux-5.14. - ndo_siocbond: Synchronization: rtnl_lock() semaphore. Context: process diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index e070a4540fbaf..8d1cc8f195cb6 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -1121,11 +1121,6 @@ struct netdev_net_notifier { * int (*ndo_validate_addr)(struct net_device *dev); * Test if Media Access Control address is valid for the device. * - * int (*ndo_do_ioctl)(struct net_device *dev, struct ifreq *ifr, int cmd); - * Old-style ioctl entry point. This is used internally by the - * appletalk and ieee802154 subsystems but is no longer called by - * the device ioctl handler. - * * int (*ndo_siocbond)(struct net_device *dev, struct ifreq *ifr, int cmd); * Used by the bonding driver for its device specific ioctls: * SIOCBONDENSLAVE, SIOCBONDRELEASE, SIOCBONDSETHWADDR, SIOCBONDCHANGEACTIVE, @@ -1429,8 +1424,6 @@ struct net_device_ops { int (*ndo_set_mac_address)(struct net_device *dev, void *addr); int (*ndo_validate_addr)(struct net_device *dev); - int (*ndo_do_ioctl)(struct net_device *dev, - struct ifreq *ifr, int cmd); int (*ndo_eth_ioctl)(struct net_device *dev, struct ifreq *ifr, int cmd); int (*ndo_siocbond)(struct net_device *dev, -- 2.39.2
[PATCH 08/10] wireless: atmel: remove unused ioctl function
From: Arnd Bergmann This function has no callers, and for the past 20 years, the request_firmware interface has been in place instead of the custom firmware loader. Signed-off-by: Arnd Bergmann --- drivers/net/wireless/atmel/atmel.c | 72 -- 1 file changed, 72 deletions(-) diff --git a/drivers/net/wireless/atmel/atmel.c b/drivers/net/wireless/atmel/atmel.c index 7c2d1c588156d..461dce21de2b0 100644 --- a/drivers/net/wireless/atmel/atmel.c +++ b/drivers/net/wireless/atmel/atmel.c @@ -571,7 +571,6 @@ static const struct { { REG_DOMAIN_ISRAEL, 3, 9, "Israel"} }; static void build_wpa_mib(struct atmel_private *priv); -static int atmel_ioctl(struct net_device *dev, struct ifreq *rq, int cmd); static void atmel_copy_to_card(struct net_device *dev, u16 dest, const unsigned char *src, u16 len); static void atmel_copy_to_host(struct net_device *dev, unsigned char *dest, @@ -1487,7 +1486,6 @@ static const struct net_device_ops atmel_netdev_ops = { .ndo_stop = atmel_close, .ndo_set_mac_address= atmel_set_mac_address, .ndo_start_xmit = start_tx, - .ndo_do_ioctl = atmel_ioctl, .ndo_validate_addr = eth_validate_addr, }; @@ -2616,76 +2614,6 @@ static const struct iw_handler_def atmel_handler_def = { .get_wireless_stats = atmel_get_wireless_stats }; -static int atmel_ioctl(struct net_device *dev, struct ifreq *rq, int cmd) -{ - int i, rc = 0; - struct atmel_private *priv = netdev_priv(dev); - struct atmel_priv_ioctl com; - struct iwreq *wrq = (struct iwreq *) rq; - unsigned char *new_firmware; - char domain[REGDOMAINSZ + 1]; - - switch (cmd) { - case ATMELIDIFC: - wrq->u.param.value = ATMELMAGIC; - break; - - case ATMELFWL: - if (copy_from_user(&com, rq->ifr_data, sizeof(com))) { - rc = -EFAULT; - break; - } - - if (!capable(CAP_NET_ADMIN)) { - rc = -EPERM; - break; - } - - new_firmware = memdup_user(com.data, com.len); - if (IS_ERR(new_firmware)) { - rc = PTR_ERR(new_firmware); - break; - } - - kfree(priv->firmware); - - priv->firmware = new_firmware; - priv->firmware_length = com.len; - strncpy(priv->firmware_id, com.id, 31); - priv->firmware_id[31] = '\0'; - break; - - case ATMELRD: - if (copy_from_user(domain, rq->ifr_data, REGDOMAINSZ)) { - rc = -EFAULT; - break; - } - - if (!capable(CAP_NET_ADMIN)) { - rc = -EPERM; - break; - } - - domain[REGDOMAINSZ] = 0; - rc = -EINVAL; - for (i = 0; i < ARRAY_SIZE(channel_table); i++) { - if (!strcasecmp(channel_table[i].name, domain)) { - priv->config_reg_domain = channel_table[i].reg_domain; - rc = 0; - } - } - - if (rc == 0 && priv->station_state != STATION_STATE_DOWN) - rc = atmel_open(dev); - break; - - default: - rc = -EOPNOTSUPP; - } - - return rc; -} - struct auth_body { __le16 alg; __le16 trans_seq; -- 2.39.2
[PATCH 09/10] wireless: hostap: remove unused ioctl function
From: Arnd Bergmann The ioctl handler has no actual callers in the kernel and is useless. All the functionality should be reachable through the regualar interfaces. Signed-off-by: Arnd Bergmann --- drivers/net/wireless/intersil/hostap/hostap.h | 1 - .../wireless/intersil/hostap/hostap_ioctl.c | 228 -- .../wireless/intersil/hostap/hostap_main.c| 3 - 3 files changed, 232 deletions(-) diff --git a/drivers/net/wireless/intersil/hostap/hostap.h b/drivers/net/wireless/intersil/hostap/hostap.h index c17ab6dbbb538..552ae33d78751 100644 --- a/drivers/net/wireless/intersil/hostap/hostap.h +++ b/drivers/net/wireless/intersil/hostap/hostap.h @@ -92,7 +92,6 @@ void hostap_info_process(local_info_t *local, struct sk_buff *skb); extern const struct iw_handler_def hostap_iw_handler_def; extern const struct ethtool_ops prism2_ethtool_ops; -int hostap_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd); int hostap_siocdevprivate(struct net_device *dev, struct ifreq *ifr, void __user *data, int cmd); diff --git a/drivers/net/wireless/intersil/hostap/hostap_ioctl.c b/drivers/net/wireless/intersil/hostap/hostap_ioctl.c index b4adfc190ae87..26162f92e3c3d 100644 --- a/drivers/net/wireless/intersil/hostap/hostap_ioctl.c +++ b/drivers/net/wireless/intersil/hostap/hostap_ioctl.c @@ -2316,21 +2316,6 @@ static const struct iw_priv_args prism2_priv[] = { }; -static int prism2_ioctl_priv_inquire(struct net_device *dev, int *i) -{ - struct hostap_interface *iface; - local_info_t *local; - - iface = netdev_priv(dev); - local = iface->local; - - if (local->func->cmd(dev, HFA384X_CMDCODE_INQUIRE, *i, NULL, NULL)) - return -EOPNOTSUPP; - - return 0; -} - - static int prism2_ioctl_priv_prism2_param(struct net_device *dev, struct iw_request_info *info, union iwreq_data *uwrq, char *extra) @@ -2910,146 +2895,6 @@ static int prism2_ioctl_priv_writemif(struct net_device *dev, } -static int prism2_ioctl_priv_monitor(struct net_device *dev, int *i) -{ - struct hostap_interface *iface; - local_info_t *local; - int ret = 0; - union iwreq_data wrqu; - - iface = netdev_priv(dev); - local = iface->local; - - printk(KERN_DEBUG "%s: process %d (%s) used deprecated iwpriv monitor " - "- update software to use iwconfig mode monitor\n", - dev->name, task_pid_nr(current), current->comm); - - /* Backward compatibility code - this can be removed at some point */ - - if (*i == 0) { - /* Disable monitor mode - old mode was not saved, so go to -* Master mode */ - wrqu.mode = IW_MODE_MASTER; - ret = prism2_ioctl_siwmode(dev, NULL, &wrqu, NULL); - } else if (*i == 1) { - /* netlink socket mode is not supported anymore since it did -* not separate different devices from each other and was not -* best method for delivering large amount of packets to -* user space */ - ret = -EOPNOTSUPP; - } else if (*i == 2 || *i == 3) { - switch (*i) { - case 2: - local->monitor_type = PRISM2_MONITOR_80211; - break; - case 3: - local->monitor_type = PRISM2_MONITOR_PRISM; - break; - } - wrqu.mode = IW_MODE_MONITOR; - ret = prism2_ioctl_siwmode(dev, NULL, &wrqu, NULL); - hostap_monitor_mode_enable(local); - } else - ret = -EINVAL; - - return ret; -} - - -static int prism2_ioctl_priv_reset(struct net_device *dev, int *i) -{ - struct hostap_interface *iface; - local_info_t *local; - - iface = netdev_priv(dev); - local = iface->local; - - printk(KERN_DEBUG "%s: manual reset request(%d)\n", dev->name, *i); - switch (*i) { - case 0: - /* Disable and enable card */ - local->func->hw_shutdown(dev, 1); - local->func->hw_config(dev, 0); - break; - - case 1: - /* COR sreset */ - local->func->hw_reset(dev); - break; - - case 2: - /* Disable and enable port 0 */ - local->func->reset_port(dev); - break; - - case 3: - prism2_sta_deauth(local, WLAN_REASON_DEAUTH_LEAVING); - if (local->func->cmd(dev, HFA384X_CMDCODE_DISABLE, 0, NULL, -NULL)) - return -EINVAL; - break; - - case 4: - if (local->func->cmd(dev, HFA384X_CMDCODE_ENABLE, 0, NULL, -NULL)) - return -EINVAL; -
[PATCH 07/10] staging: rtl8723bs: remove dead code
From: Arnd Bergmann The .ndo_do_ioctl functions are never called, so the three implementation here is useless but only works as a way to identify the device in the notifiers, which can really be removed as well. Looking through the exported functions, I found a bunch more that have no callers, so just drop all of those. Signed-off-by: Arnd Bergmann --- drivers/staging/rtl8723bs/Makefile|1 - .../staging/rtl8723bs/include/osdep_intf.h| 32 - drivers/staging/rtl8723bs/include/rtw_io.h|1 - .../staging/rtl8723bs/os_dep/ioctl_linux.c| 1300 - drivers/staging/rtl8723bs/os_dep/os_intfs.c | 29 - drivers/staging/rtl8723bs/os_dep/sdio_intf.c | 23 +- 6 files changed, 1 insertion(+), 1385 deletions(-) delete mode 100644 drivers/staging/rtl8723bs/os_dep/ioctl_linux.c diff --git a/drivers/staging/rtl8723bs/Makefile b/drivers/staging/rtl8723bs/Makefile index 590bde02058c7..0f3f6dea4955e 100644 --- a/drivers/staging/rtl8723bs/Makefile +++ b/drivers/staging/rtl8723bs/Makefile @@ -50,7 +50,6 @@ r8723bs-y = \ hal/HalHWImg8723B_RF.o \ hal/HalPhyRf_8723B.o \ os_dep/ioctl_cfg80211.o \ - os_dep/ioctl_linux.o \ os_dep/mlme_linux.o \ os_dep/osdep_service.o \ os_dep/os_intfs.o \ diff --git a/drivers/staging/rtl8723bs/include/osdep_intf.h b/drivers/staging/rtl8723bs/include/osdep_intf.h index 111e0179712ac..83a25598e9627 100644 --- a/drivers/staging/rtl8723bs/include/osdep_intf.h +++ b/drivers/staging/rtl8723bs/include/osdep_intf.h @@ -8,33 +8,6 @@ #ifndef __OSDEP_INTF_H_ #define __OSDEP_INTF_H_ - -struct intf_priv { - - u8 *intf_dev; - u32 max_iosz; /* USB2.0: 128, USB1.1: 64, SDIO:64 */ - u32 max_xmitsz; /* USB2.0: unlimited, SDIO:512 */ - u32 max_recvsz; /* USB2.0: unlimited, SDIO:512 */ - - volatile u8 *io_rwmem; - volatile u8 *allocated_io_rwmem; - u32 io_wsz; /* unit: 4bytes */ - u32 io_rsz;/* unit: 4bytes */ - u8 intf_status; - - void (*_bus_io)(u8 *priv); - -/* -Under Sync. IRP (SDIO/USB) -A protection mechanism is necessary for the io_rwmem(read/write protocol) - -Under Async. IRP (SDIO/USB) -The protection mechanism is through the pending queue. -*/ - - struct mutex ioctl_mutex; -}; - struct dvobj_priv *devobj_init(void); void devobj_deinit(struct dvobj_priv *pdvobj); @@ -47,17 +20,12 @@ u32 rtw_start_drv_threads(struct adapter *padapter); void rtw_stop_drv_threads(struct adapter *padapter); void rtw_cancel_all_timer(struct adapter *padapter); -int rtw_ioctl(struct net_device *dev, struct ifreq *rq, int cmd); - int rtw_init_netdev_name(struct net_device *pnetdev, const char *ifname); struct net_device *rtw_init_netdev(struct adapter *padapter); void rtw_unregister_netdevs(struct dvobj_priv *dvobj); u16 rtw_recv_select_queue(struct sk_buff *skb); -int rtw_ndev_notifier_register(void); -void rtw_ndev_notifier_unregister(void); - void rtw_ips_dev_unload(struct adapter *padapter); int rtw_ips_pwr_up(struct adapter *padapter); diff --git a/drivers/staging/rtl8723bs/include/rtw_io.h b/drivers/staging/rtl8723bs/include/rtw_io.h index e98083a07a660..f92093e73fe67 100644 --- a/drivers/staging/rtl8723bs/include/rtw_io.h +++ b/drivers/staging/rtl8723bs/include/rtw_io.h @@ -72,7 +72,6 @@ #define _INTF_ASYNC_ BIT(0) /* support async io */ -struct intf_priv; struct intf_hdl; struct io_queue; diff --git a/drivers/staging/rtl8723bs/os_dep/ioctl_linux.c b/drivers/staging/rtl8723bs/os_dep/ioctl_linux.c deleted file mode 100644 index c81b30f1f1b05..0 --- a/drivers/staging/rtl8723bs/os_dep/ioctl_linux.c +++ /dev/null @@ -1,1300 +0,0 @@ -// SPDX-License-Identifier: GPL-2.0 -/** - * - * Copyright(c) 2007 - 2012 Realtek Corporation. All rights reserved. - * - **/ - -#include -#include -#include -#include -#include -#include -#include - -#define RTL_IOCTL_WPA_SUPPLICANT (SIOCIWFIRSTPRIV + 30) - -static int wpa_set_auth_algs(struct net_device *dev, u32 value) -{ - struct adapter *padapter = rtw_netdev_priv(dev); - int ret = 0; - - if ((value & IW_AUTH_ALG_SHARED_KEY) && (value & IW_AUTH_ALG_OPEN_SYSTEM)) { - padapter->securitypriv.ndisencryptstatus = Ndis802_11Encryption1Enabled; - padapter->securitypriv.ndisauthtype = Ndis802_11AuthModeAutoSwitch; - padapter->securitypriv.dot11AuthAlgrthm = dot11AuthAlgrthm_Auto; - } else if (value & IW_AUTH_ALG_SHARED_KEY) { - padapter->securitypriv.ndisencryptstatus = Ndis802_11Encryption1Enabled; - - padapter->securitypriv.ndisauthtype = Ndis802_11AuthModeShared; - padapter->securitypriv.dot11AuthAlgrthm = dot11AuthAlgrthm_Shared; -
[PATCH 06/10] staging: rtl8712: remove unused legacy ioctl handlers
From: Arnd Bergmann The .ndo_do_ioctl functions are never called, and can just be removed, especially since this is a staging driver. Signed-off-by: Arnd Bergmann --- drivers/staging/rtl8712/os_intfs.c| 1 - drivers/staging/rtl8712/osdep_intf.h | 2 - drivers/staging/rtl8712/rtl871x_ioctl_linux.c | 124 -- 3 files changed, 127 deletions(-) diff --git a/drivers/staging/rtl8712/os_intfs.c b/drivers/staging/rtl8712/os_intfs.c index b18e6d9c832b8..121edffbd2507 100644 --- a/drivers/staging/rtl8712/os_intfs.c +++ b/drivers/staging/rtl8712/os_intfs.c @@ -191,7 +191,6 @@ static const struct net_device_ops rtl8712_netdev_ops = { .ndo_start_xmit = r8712_xmit_entry, .ndo_set_mac_address = r871x_net_set_mac_address, .ndo_get_stats = r871x_net_get_stats, - .ndo_do_ioctl = r871x_ioctl, }; struct net_device *r8712_init_netdev(void) diff --git a/drivers/staging/rtl8712/osdep_intf.h b/drivers/staging/rtl8712/osdep_intf.h index 9e75116c987ec..ce823030bfec2 100644 --- a/drivers/staging/rtl8712/osdep_intf.h +++ b/drivers/staging/rtl8712/osdep_intf.h @@ -27,6 +27,4 @@ struct intf_priv { struct completion io_retevt_comp; }; -int r871x_ioctl(struct net_device *dev, struct ifreq *rq, int cmd); - #endif /*_OSDEP_INTF_H_*/ diff --git a/drivers/staging/rtl8712/rtl871x_ioctl_linux.c b/drivers/staging/rtl8712/rtl871x_ioctl_linux.c index 36f6904d25abc..a4a34c9f00b84 100644 --- a/drivers/staging/rtl8712/rtl871x_ioctl_linux.c +++ b/drivers/staging/rtl8712/rtl871x_ioctl_linux.c @@ -36,8 +36,6 @@ #include #include -#define RTL_IOCTL_WPA_SUPPLICANT (SIOCIWFIRSTPRIV + 0x1E) - #define SCAN_ITEM_SIZE 768 #define MAX_CUSTOM_LEN 64 #define RATE_COUNT 4 @@ -2066,128 +2064,6 @@ static int r871x_wps_start(struct net_device *dev, return 0; } -static int wpa_set_param(struct net_device *dev, u8 name, u32 value) -{ - struct _adapter *padapter = netdev_priv(dev); - - switch (name) { - case IEEE_PARAM_WPA_ENABLED: - padapter->securitypriv.AuthAlgrthm = 2; /* 802.1x */ - switch ((value) & 0xff) { - case 1: /* WPA */ - padapter->securitypriv.ndisauthtype = - Ndis802_11AuthModeWPAPSK; /* WPA_PSK */ - padapter->securitypriv.ndisencryptstatus = - Ndis802_11Encryption2Enabled; - break; - case 2: /* WPA2 */ - padapter->securitypriv.ndisauthtype = - Ndis802_11AuthModeWPA2PSK; /* WPA2_PSK */ - padapter->securitypriv.ndisencryptstatus = - Ndis802_11Encryption3Enabled; - break; - } - break; - case IEEE_PARAM_TKIP_COUNTERMEASURES: - break; - case IEEE_PARAM_DROP_UNENCRYPTED: - /* HACK: -* -* wpa_supplicant calls set_wpa_enabled when the driver -* is loaded and unloaded, regardless of if WPA is being -* used. No other calls are made which can be used to -* determine if encryption will be used or not prior to -* association being expected. If encryption is not being -* used, drop_unencrypted is set to false, else true -- we -* can use this to determine if the CAP_PRIVACY_ON bit should -* be set. -*/ - break; - case IEEE_PARAM_PRIVACY_INVOKED: - break; - case IEEE_PARAM_AUTH_ALGS: - return wpa_set_auth_algs(dev, value); - case IEEE_PARAM_IEEE_802_1X: - break; - case IEEE_PARAM_WPAX_SELECT: - /* added for WPA2 mixed mode */ - break; - default: - return -EOPNOTSUPP; - } - return 0; -} - -static int wpa_mlme(struct net_device *dev, u32 command, u32 reason) -{ - struct _adapter *padapter = netdev_priv(dev); - - switch (command) { - case IEEE_MLME_STA_DEAUTH: - if (!r8712_set_802_11_disassociate(padapter)) - return -1; - break; - case IEEE_MLME_STA_DISASSOC: - if (!r8712_set_802_11_disassociate(padapter)) - return -1; - break; - default: - return -EOPNOTSUPP; - } - return 0; -} - -static int wpa_supplicant_ioctl(struct net_device *dev, struct iw_point *p) -{ - struct ieee_param *param; - int ret = 0; - struct _adapter *padapter = netdev_priv(dev); - - if (p->length < sizeof(struct ieee_param) || !p->pointer) - return -EINVAL; - param = memdup_user(p->pointer, p->length); - if (IS_ERR(param)) - return PTR_ERR(param); - switch (param->cmd) { -
[PATCH 05/10] staging: rtl8192: remove unused legacy ioctl handlers
From: Arnd Bergmann The .ndo_do_ioctl functions are never called, and can just be removed, especially since this is a staging driver. Signed-off-by: Arnd Bergmann --- drivers/staging/rtl8192u/ieee80211/dot11d.c | 41 -- drivers/staging/rtl8192u/ieee80211/dot11d.h | 2 - .../staging/rtl8192u/ieee80211/ieee80211.h| 12 - .../rtl8192u/ieee80211/ieee80211_softmac.c| 563 -- drivers/staging/rtl8192u/r8192U.h | 2 - drivers/staging/rtl8192u/r8192U_core.c| 109 6 files changed, 729 deletions(-) diff --git a/drivers/staging/rtl8192u/ieee80211/dot11d.c b/drivers/staging/rtl8192u/ieee80211/dot11d.c index ddaf66fa0f936..8a72c1e9eb1e1 100644 --- a/drivers/staging/rtl8192u/ieee80211/dot11d.c +++ b/drivers/staging/rtl8192u/ieee80211/dot11d.c @@ -97,22 +97,6 @@ void dot11d_update_country_ie(struct ieee80211_device *dev, u8 *pTaddr, } EXPORT_SYMBOL(dot11d_update_country_ie); -u8 dot11d_get_max_tx_pwr_in_dbm(struct ieee80211_device *dev, u8 Channel) -{ - struct rt_dot11d_info *dot11d_info = GET_DOT11D_INFO(dev); - u8 MaxTxPwrInDbm = 255; - - if (Channel > MAX_CHANNEL_NUMBER) { - netdev_err(dev->dev, "%s: Invalid Channel\n", __func__); - return MaxTxPwrInDbm; - } - if (dot11d_info->channel_map[Channel]) - MaxTxPwrInDbm = dot11d_info->max_tx_pwr_dbm_list[Channel]; - - return MaxTxPwrInDbm; -} -EXPORT_SYMBOL(dot11d_get_max_tx_pwr_in_dbm); - void dot11d_scan_complete(struct ieee80211_device *dev) { struct rt_dot11d_info *dot11d_info = GET_DOT11D_INFO(dev); @@ -147,28 +131,3 @@ int is_legal_channel(struct ieee80211_device *dev, u8 channel) return 0; } EXPORT_SYMBOL(is_legal_channel); - -int to_legal_channel(struct ieee80211_device *dev, u8 channel) -{ - struct rt_dot11d_info *dot11d_info = GET_DOT11D_INFO(dev); - u8 default_chn = 0; - u32 i = 0; - - for (i = 1; i <= MAX_CHANNEL_NUMBER; i++) { - if (dot11d_info->channel_map[i] > 0) { - default_chn = i; - break; - } - } - - if (channel > MAX_CHANNEL_NUMBER) { - netdev_err(dev->dev, "%s: Invalid Channel\n", __func__); - return default_chn; - } - - if (dot11d_info->channel_map[channel] > 0) - return channel; - - return default_chn; -} -EXPORT_SYMBOL(to_legal_channel); diff --git a/drivers/staging/rtl8192u/ieee80211/dot11d.h b/drivers/staging/rtl8192u/ieee80211/dot11d.h index 8b485fa180898..fd774265211a5 100644 --- a/drivers/staging/rtl8192u/ieee80211/dot11d.h +++ b/drivers/staging/rtl8192u/ieee80211/dot11d.h @@ -49,9 +49,7 @@ void dot11d_update_country_ie(struct ieee80211_device *dev, u8 *addr, u16 coutry_ie_len, u8 *coutry_ie); -u8 dot11d_get_max_tx_pwr_in_dbm(struct ieee80211_device *dev, u8 channel); void dot11d_scan_complete(struct ieee80211_device *dev); int is_legal_channel(struct ieee80211_device *dev, u8 channel); -int to_legal_channel(struct ieee80211_device *dev, u8 channel); #endif /* #ifndef __INC_DOT11D_H */ diff --git a/drivers/staging/rtl8192u/ieee80211/ieee80211.h b/drivers/staging/rtl8192u/ieee80211/ieee80211.h index 694d1b18f81c7..fc4201757c408 100644 --- a/drivers/staging/rtl8192u/ieee80211/ieee80211.h +++ b/drivers/staging/rtl8192u/ieee80211/ieee80211.h @@ -223,11 +223,7 @@ struct cb_desc { #define MAX_IE_LEN 0xff // added for kernel conflict -#define ieee80211_wake_queue ieee80211_wake_queue_rsl -#define ieee80211_stop_queue ieee80211_stop_queue_rsl #define notify_wx_assoc_event notify_wx_assoc_event_rsl -#define SendDisassociation SendDisassociation_rsl - struct ieee_param { u32 cmd; @@ -2152,7 +2148,6 @@ int ieee80211_wx_set_gen_ie(struct ieee80211_device *ieee, u8 *ie, size_t len); /* ieee80211_softmac.c */ short ieee80211_is_54g(const struct ieee80211_network *net); -short ieee80211_is_shortslot(const struct ieee80211_network *net); int ieee80211_rx_frame_softmac(struct ieee80211_device *ieee, struct sk_buff *skb, struct ieee80211_rx_stats *rx_stats, @@ -2160,7 +2155,6 @@ int ieee80211_rx_frame_softmac(struct ieee80211_device *ieee, void ieee80211_softmac_new_net(struct ieee80211_device *ieee, struct ieee80211_network *net); -void SendDisassociation(struct ieee80211_device *ieee, u8 *asSta, u8 asRsn); void ieee80211_softmac_xmit(struct ieee80211_txb *txb, struct ieee80211_device *ieee); @@ -2182,13 +2176,7 @@ void ieee80211_stop_protocol(struct ieee80211_device *ieee); void ieee80211_softmac_start_protocol(struct ieee80211_device *ieee); void ieee80211_softmac_stop_protocol(struct ieee80211_device *ieee); void ieee80211_reset_queue(struct
[PATCH 04/10] staging: ks7010: remove unused ioctl handler
From: Arnd Bergmann The ndo_do_ioctl function has no actual callers, and doesn't do much here, so just remove it entirely as preparation for removing the callback pointer from net_device_ops. Signed-off-by: Arnd Bergmann --- drivers/staging/ks7010/ks_wlan_net.c | 21 - 1 file changed, 21 deletions(-) diff --git a/drivers/staging/ks7010/ks_wlan_net.c b/drivers/staging/ks7010/ks_wlan_net.c index 0fb97a79ad0b3..ab7463bb25169 100644 --- a/drivers/staging/ks7010/ks_wlan_net.c +++ b/drivers/staging/ks7010/ks_wlan_net.c @@ -51,8 +51,6 @@ static int ks_wlan_close(struct net_device *dev); static void ks_wlan_set_rx_mode(struct net_device *dev); static struct net_device_stats *ks_wlan_get_stats(struct net_device *dev); static int ks_wlan_set_mac_address(struct net_device *dev, void *addr); -static int ks_wlan_netdev_ioctl(struct net_device *dev, struct ifreq *rq, - int cmd); static atomic_t update_phyinfo; static struct timer_list update_phyinfo_timer; @@ -2458,24 +2456,6 @@ static const struct iw_handler_def ks_wlan_handler_def = { .get_wireless_stats = ks_get_wireless_stats, }; -static int ks_wlan_netdev_ioctl(struct net_device *dev, struct ifreq *rq, - int cmd) -{ - int ret; - struct iwreq *wrq = (struct iwreq *)rq; - - switch (cmd) { - case SIOCIWFIRSTPRIV + 20: /* KS_WLAN_SET_STOP_REQ */ - ret = ks_wlan_set_stop_request(dev, NULL, &wrq->u, NULL); - break; - // All other calls are currently unsupported - default: - ret = -EOPNOTSUPP; - } - - return ret; -} - static struct net_device_stats *ks_wlan_get_stats(struct net_device *dev) { @@ -2608,7 +2588,6 @@ static const struct net_device_ops ks_wlan_netdev_ops = { .ndo_start_xmit = ks_wlan_start_xmit, .ndo_open = ks_wlan_open, .ndo_stop = ks_wlan_close, - .ndo_do_ioctl = ks_wlan_netdev_ioctl, .ndo_set_mac_address = ks_wlan_set_mac_address, .ndo_get_stats = ks_wlan_get_stats, .ndo_tx_timeout = ks_wlan_tx_timeout, -- 2.39.2
Re: [PATCH v9 1/5] lib: objpool added: ring-array based lockless MPMC
Hi, On Mon, 9 Oct 2023 02:40:53 +0800 wuqiang wrote: > On 2023/9/23 17:48, Masami Hiramatsu (Google) wrote: > > Hi Wuqiang, > > > > Sorry for replying later. > > > > On Tue, 5 Sep 2023 09:52:51 +0800 > > "wuqiang.matt" wrote: > > > >> The object pool is a scalable implementaion of high performance queue > >> for object allocation and reclamation, such as kretprobe instances. > >> > >> With leveraging percpu ring-array to mitigate the hot spot of memory > >> contention, it could deliver near-linear scalability for high parallel > >> scenarios. The objpool is best suited for following cases: > >> 1) Memory allocation or reclamation are prohibited or too expensive > >> 2) Consumers are of different priorities, such as irqs and threads > >> > >> Limitations: > >> 1) Maximum objects (capacity) is determined during pool initializing > >> and can't be modified (extended) after objpool creation > > > > So the pool size is fixed in initialization. > > Right. The arrary size will be up-rounded to the exponent of 2, but the > actual number of objects (to be allocated) are the exact value specified > by user. Yeah, this makes easy to use the seq-number as index. > > > > >> 2) The memory of objects won't be freed until objpool is finalized > >> 3) Object allocation (pop) may fail after trying all cpu slots > > > > This means that object allocation will fail if the all pools are empty, > > right? > > Yes, pop() will return NULL for this case. pop() does the checking for > only 1 round of all cpu nodes. > > The objpool might not be empty since new object could be inserted back > in the meaintime by other nodes, which is natural for lockless queues. OK. > > > > >> > >> Signed-off-by: wuqiang.matt > >> --- > >> include/linux/objpool.h | 174 + > >> lib/Makefile| 2 +- > >> lib/objpool.c | 338 > >> 3 files changed, 513 insertions(+), 1 deletion(-) > >> create mode 100644 include/linux/objpool.h > >> create mode 100644 lib/objpool.c > >> > >> diff --git a/include/linux/objpool.h b/include/linux/objpool.h > >> new file mode 100644 > >> index ..33c832216b98 > >> --- /dev/null > >> +++ b/include/linux/objpool.h > >> @@ -0,0 +1,174 @@ > >> +/* SPDX-License-Identifier: GPL-2.0 */ > >> + > >> +#ifndef _LINUX_OBJPOOL_H > >> +#define _LINUX_OBJPOOL_H > >> + > >> +#include > >> +#include > >> + > >> +/* > >> + * objpool: ring-array based lockless MPMC queue > >> + * > >> + * Copyright: wuqiang.m...@bytedance.com > >> + * > >> + * The object pool is a scalable implementaion of high performance queue > >> + * for objects allocation and reclamation, such as kretprobe instances. > >> + * > >> + * With leveraging per-cpu ring-array to mitigate the hot spots of memory > >> + * contention, it could deliver near-linear scalability for high parallel > >> + * scenarios. The ring-array is compactly managed in a single cache-line > >> + * to benefit from warmed L1 cache for most cases (<= 4 objects per-core). > >> + * The body of pre-allocated objects is stored in continuous cache-lines > >> + * just after the ring-array. > > > > I consider the size of entries may be big if we have larger number of > > CPU cores, like 72-cores. And if user specifies (2^n) + 1 entries. > > In this case, each CPU has (2^n - 1)/72 objects, but has 2^(n + 1) > > entries in ring buffer. So it should be noted. > > Yes for the arrary size since it‘s up-rounded to the exponent of 2, but the > actual number of pre-allocated objects will stay the same as user specified. > > >> + * > >> + * The object pool is interrupt safe. Both allocation and reclamation > >> + * (object pop and push operations) can be preemptible or interruptable. > > > > You've added raw_spinlock_disable/enable(), so it is not preemptible > > or interruptible anymore. (Anyway, caller doesn't take care of that) > > Sure, this decription is imporper and unnecessary. Will be removed. > > >> + * > >> + * It's best suited for following cases: > >> + * 1) Memory allocation or reclamation are prohibited or too expensive > >> + * 2) Consumers are of different priorities, such as irqs and threads > >> + * > >> + * Limitations: > >> + * 1) Maximum objects (capacity) is determined during pool initializing > >> + * 2) The memory of objects won't be freed until the pool is finalized > >> + * 3) Object allocation (pop) may fail after trying all cpu slots > >> + */ > >> + > >> +/** > >> + * struct objpool_slot - percpu ring array of objpool > >> + * @head: head of the local ring array (to retrieve at) > >> + * @tail: tail of the local ring array (to append at) > >> + * @bits: log2 of capacity (for bitwise operations) > >> + * @mask: capacity - 1 > > > > These description does not give idea what those roles are. > > I'll refine the description. objpool_slot is totally internal to objpool. > > > > >> + * > >> + * Represents a cpu-local array-based ring buffer, its size is spec
[PATCH 03/10] ethernet: sp7021: fix ioctl callback pointer
From: Arnd Bergmann The old .ndo_do_ioctl() callback is never called any more, instead the driver should set .ndo_eth_ioctl() for the phy operations. Fixes: fd3040b9394c5 ("net: ethernet: Add driver for Sunplus SP7021") Signed-off-by: Arnd Bergmann --- drivers/net/ethernet/sunplus/spl2sw_driver.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/ethernet/sunplus/spl2sw_driver.c b/drivers/net/ethernet/sunplus/spl2sw_driver.c index 391a1bc7f4463..bb4514f4e8157 100644 --- a/drivers/net/ethernet/sunplus/spl2sw_driver.c +++ b/drivers/net/ethernet/sunplus/spl2sw_driver.c @@ -199,7 +199,7 @@ static const struct net_device_ops netdev_ops = { .ndo_start_xmit = spl2sw_ethernet_start_xmit, .ndo_set_rx_mode = spl2sw_ethernet_set_rx_mode, .ndo_set_mac_address = spl2sw_ethernet_set_mac_address, - .ndo_do_ioctl = phy_do_ioctl, + .ndo_eth_ioctl = phy_do_ioctl, .ndo_tx_timeout = spl2sw_ethernet_tx_timeout, }; -- 2.39.2
[PATCH 02/10] ieee802154: avoid deprecated .ndo_do_ioctl callback
From: Arnd Bergmann The ieee802154 socket implementation is the last remaining caller of the netdevice ioctl callback. In order to completely remove this, add a custom pointer to the existing wpan_dev specific operations structure. Since that structure is currently only used to wrap the 'create' header operation, adjust the naming slightly to make this more generic. It would be a good idea to adjust the calling conventions and split the get/set operations into separate functions, but that can be a follow-up cleanup. For the moment, I kept the actual changes to a minimum to avoid regressions. Signed-off-by: Arnd Bergmann --- include/net/cfg802154.h | 9 + net/ieee802154/socket.c | 5 +++-- net/mac802154/iface.c | 8 3 files changed, 12 insertions(+), 10 deletions(-) diff --git a/include/net/cfg802154.h b/include/net/cfg802154.h index f79ce133e51a7..e604df98e2ee9 100644 --- a/include/net/cfg802154.h +++ b/include/net/cfg802154.h @@ -433,15 +433,16 @@ struct ieee802154_llsec_device_key { u32 frame_counter; }; -struct wpan_dev_header_ops { +struct wpan_dev_ops { /* TODO create callback currently assumes ieee802154_mac_cb inside * skb->cb. This should be changed to give these information as * parameter. */ - int (*create)(struct sk_buff *skb, struct net_device *dev, + int (*header_create)(struct sk_buff *skb, struct net_device *dev, const struct ieee802154_addr *daddr, const struct ieee802154_addr *saddr, unsigned int len); + int (*ioctl)(struct net_device *dev, struct ifreq *ifr, int cmd); }; struct wpan_dev { @@ -452,7 +453,7 @@ struct wpan_dev { struct list_head list; struct net_device *netdev; - const struct wpan_dev_header_ops *header_ops; + const struct wpan_dev_ops *ops; /* lowpan interface, set when the wpan_dev belongs to one lowpan_dev */ struct net_device *lowpan_dev; @@ -491,7 +492,7 @@ wpan_dev_hard_header(struct sk_buff *skb, struct net_device *dev, { struct wpan_dev *wpan_dev = dev->ieee802154_ptr; - return wpan_dev->header_ops->create(skb, dev, daddr, saddr, len); + return wpan_dev->ops->header_create(skb, dev, daddr, saddr, len); } #endif diff --git a/net/ieee802154/socket.c b/net/ieee802154/socket.c index 00302e8b9615b..27e58237091ca 100644 --- a/net/ieee802154/socket.c +++ b/net/ieee802154/socket.c @@ -139,8 +139,9 @@ static int ieee802154_dev_ioctl(struct sock *sk, struct ifreq __user *arg, if (!dev) return -ENODEV; - if (dev->type == ARPHRD_IEEE802154 && dev->netdev_ops->ndo_do_ioctl) - ret = dev->netdev_ops->ndo_do_ioctl(dev, &ifr, cmd); + if (dev->type == ARPHRD_IEEE802154 && dev->ieee802154_ptr && + dev->ieee802154_ptr->ops) + ret = dev->ieee802154_ptr->ops->ioctl(dev, &ifr, cmd); if (!ret && put_user_ifreq(&ifr, arg)) ret = -EFAULT; diff --git a/net/mac802154/iface.c b/net/mac802154/iface.c index c0e2da5072bea..4937f8c2fb4cc 100644 --- a/net/mac802154/iface.c +++ b/net/mac802154/iface.c @@ -406,8 +406,9 @@ static int ieee802154_header_create(struct sk_buff *skb, return hlen; } -static const struct wpan_dev_header_ops ieee802154_header_ops = { - .create = ieee802154_header_create, +static const struct wpan_dev_ops ieee802154_ops = { + .header_create = ieee802154_header_create, + .ioctl = mac802154_wpan_ioctl, }; /* This header create functionality assumes a 8 byte array for @@ -495,7 +496,6 @@ static const struct net_device_ops mac802154_wpan_ops = { .ndo_open = mac802154_wpan_open, .ndo_stop = mac802154_slave_close, .ndo_start_xmit = ieee802154_subif_start_xmit, - .ndo_do_ioctl = mac802154_wpan_ioctl, .ndo_set_mac_address= mac802154_wpan_mac_addr, }; @@ -581,7 +581,7 @@ ieee802154_setup_sdata(struct ieee802154_sub_if_data *sdata, sdata->dev->netdev_ops = &mac802154_wpan_ops; sdata->dev->ml_priv = &mac802154_mlme_wpan; sdata->iface_default_filtering = IEEE802154_FILTERING_4_FRAME_FIELDS; - wpan_dev->header_ops = &ieee802154_header_ops; + wpan_dev->ops = &ieee802154_ops; mutex_init(&sdata->sec_mtx); -- 2.39.2
[PATCH 01/10] appletalk: remove localtalk and ppp support
From: Arnd Bergmann The last localtalk driver is gone now, and ppp support was never fully merged, so clean up the appletalk code by removing the obvious dead code paths. Notably, this removes one of the two callers of the old .ndo_do_ioctl() callback that was abused for getting device addresses and is now only used in the ieee802154 subsystem, which still uses the same trick. The include/uapi/linux/if_ltalk.h header might still be required for building userspace programs, but I made sure that debian code search and the netatalk upstream have no references it it, so it should be fine to remove. Signed-off-by: Arnd Bergmann --- drivers/net/tun.c | 3 - include/linux/atalk.h | 1 - include/linux/if_ltalk.h | 8 --- include/uapi/linux/if_ltalk.h | 10 net/appletalk/Makefile| 2 +- net/appletalk/aarp.c | 108 +++--- net/appletalk/ddp.c | 97 +- net/appletalk/dev.c | 46 --- 8 files changed, 11 insertions(+), 264 deletions(-) delete mode 100644 include/linux/if_ltalk.h delete mode 100644 include/uapi/linux/if_ltalk.h delete mode 100644 net/appletalk/dev.c diff --git a/drivers/net/tun.c b/drivers/net/tun.c index 89ab9efe522c3..e11476296e253 100644 --- a/drivers/net/tun.c +++ b/drivers/net/tun.c @@ -70,7 +70,6 @@ #include #include #include -#include #include #include #include @@ -3059,8 +3058,6 @@ static unsigned char tun_get_addr_len(unsigned short type) return ROSE_ADDR_LEN; case ARPHRD_NETROM: return AX25_ADDR_LEN; - case ARPHRD_LOCALTLK: - return LTALK_ALEN; default: return 0; } diff --git a/include/linux/atalk.h b/include/linux/atalk.h index a55bfc6567d01..2896f2ac9568e 100644 --- a/include/linux/atalk.h +++ b/include/linux/atalk.h @@ -121,7 +121,6 @@ static inline struct atalk_iface *atalk_find_dev(struct net_device *dev) #endif extern struct atalk_addr *atalk_find_dev_addr(struct net_device *dev); -extern struct net_device *atrtr_get_dev(struct atalk_addr *sa); extern int aarp_send_ddp(struct net_device *dev, struct sk_buff *skb, struct atalk_addr *sa, void *hwaddr); diff --git a/include/linux/if_ltalk.h b/include/linux/if_ltalk.h deleted file mode 100644 index 4cc1c0b778700..0 --- a/include/linux/if_ltalk.h +++ /dev/null @@ -1,8 +0,0 @@ -/* SPDX-License-Identifier: GPL-2.0 */ -#ifndef __LINUX_LTALK_H -#define __LINUX_LTALK_H - -#include - -extern struct net_device *alloc_ltalkdev(int sizeof_priv); -#endif diff --git a/include/uapi/linux/if_ltalk.h b/include/uapi/linux/if_ltalk.h deleted file mode 100644 index fa61e776f598d..0 --- a/include/uapi/linux/if_ltalk.h +++ /dev/null @@ -1,10 +0,0 @@ -/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */ -#ifndef _UAPI__LINUX_LTALK_H -#define _UAPI__LINUX_LTALK_H - -#define LTALK_HLEN 1 -#define LTALK_MTU 600 -#define LTALK_ALEN 1 - - -#endif /* _UAPI__LINUX_LTALK_H */ diff --git a/net/appletalk/Makefile b/net/appletalk/Makefile index 33164d972d379..152312a151800 100644 --- a/net/appletalk/Makefile +++ b/net/appletalk/Makefile @@ -5,6 +5,6 @@ obj-$(CONFIG_ATALK) += appletalk.o -appletalk-y:= aarp.o ddp.o dev.o +appletalk-y:= aarp.o ddp.o appletalk-$(CONFIG_PROC_FS)+= atalk_proc.o appletalk-$(CONFIG_SYSCTL) += sysctl_net_atalk.o diff --git a/net/appletalk/aarp.c b/net/appletalk/aarp.c index 9fa0b246902be..dfcd9f46cb3a6 100644 --- a/net/appletalk/aarp.c +++ b/net/appletalk/aarp.c @@ -432,49 +432,18 @@ static struct atalk_addr *__aarp_proxy_find(struct net_device *dev, return a ? sa : NULL; } -/* - * Probe a Phase 1 device or a device that requires its Net:Node to - * be set via an ioctl. - */ -static void aarp_send_probe_phase1(struct atalk_iface *iface) -{ - struct ifreq atreq; - struct sockaddr_at *sa = (struct sockaddr_at *)&atreq.ifr_addr; - const struct net_device_ops *ops = iface->dev->netdev_ops; - - sa->sat_addr.s_node = iface->address.s_node; - sa->sat_addr.s_net = ntohs(iface->address.s_net); - - /* We pass the Net:Node to the drivers/cards by a Device ioctl. */ - if (!(ops->ndo_do_ioctl(iface->dev, &atreq, SIOCSIFADDR))) { - ops->ndo_do_ioctl(iface->dev, &atreq, SIOCGIFADDR); - if (iface->address.s_net != htons(sa->sat_addr.s_net) || - iface->address.s_node != sa->sat_addr.s_node) - iface->status |= ATIF_PROBE_FAIL; - - iface->address.s_net = htons(sa->sat_addr.s_net); - iface->address.s_node = sa->sat_addr.s_node; - } -} - - void aarp_probe_network(struct atalk_iface *atif) { - if (atif->dev->type == ARPHRD_LOCALTLK
Re: [PATCH] net: appletalk: remove cops support
On Wed, Oct 4, 2023, at 20:52, Jakub Kicinski wrote: > On Wed, 27 Sep 2023 11:00:30 +0200 Greg Kroah-Hartman wrote: >> The COPS Appletalk support is very old, never said to actually work >> properly, and the firmware code for the devices are under a very suspect >> license. Remove it all to clear up the license issue, if it is still >> needed and actually used by anyone, we can add it back later once the >> license is cleared up. > > Nice, Doug and Arnd also mentioned this in the past so let me add > them to the CC as I apply this... Yes, definitely, thanks Greg for getting this done. I think every time this came up we concluded that it can be removed, we just never finished the job. Acked-by: Arnd Bergmann Link: https://lore.kernel.org/netdev/e490dd0c-a65d-4acf-89c6-c06cb48ec...@app.fastmail.com/ Link: https://lore.kernel.org/netdev/9cac4fbd-9557-b0b8-54fa-93f0290a6...@schmorgal.com/ Semi-related: Since this removes one of the two callers of the .ndo_do_ioctl() callback, I've had a new look at that bit as well and ended up with a refresh of the missing bits of [1], which I'll submit next. Arnd [1] https://lore.kernel.org/lkml/20201106221743.3271965-1-a...@kernel.org/
Re: [PATCH rebased] kbuild: rpm-pkg: Fix build with non-default MODLIB
On Mon, Oct 09, 2023 at 09:34:10PM +0900, Masahiro Yamada wrote: > On Mon, Oct 9, 2023 at 5:52 PM Michal Suchánek wrote: > > > > Hello, > > > > On Mon, Oct 09, 2023 at 05:31:02PM +0900, Masahiro Yamada wrote: > > > On Fri, Oct 6, 2023 at 12:49 AM Michal Suchanek wrote: > > > > > > > > The default MODLIB value is composed of two variables and the hardcoded > > > > string '/lib/modules/'. > > > > > > > > MODLIB = $(INSTALL_MOD_PATH)/lib/modules/$(KERNELRELEASE) > > > > > > > > Defining this middle part as a variable was rejected on the basis that > > > > users can pass the whole MODLIB to make, such as > > > > > > > > > In other words, do you want to say > > > > > > "If defining this middle part as a variable had been accepted, > > > this patch would have been unneeded." ? > > > > If it were accepted I would not have to guess what the middle part is, > > and could use the variable that unambiguosly defines it instead. > > > How? > > scripts/package/kernel.spec hardcodes 'lib/modules' > in a couple of places. > > I am asking how to derive the module path. Not sure what you are asking here. The path is hardcoded, everywhere. The current Makefile has MODLIB = $(INSTALL_MOD_PATH)/lib/modules/$(KERNELRELEASE) and there is no reliable way to learn what the middle part was after the fact - $(INSTALL_MOD_PATH) can be non-empty. The rejected patch was changing this to a variable, and also default to adjusting the content to what kmod exports in pkgconfig after applying a proposed patch to make this hardcoded part configurable: export KERNEL_MODULE_DIRECTORY := $(shell pkg-config --print-variables kmod 2>/dev/null | grep '^module_directory$$' >/dev/null && pkg-config --variable=module_directory kmod || echo /lib/modules) MODLIB = $(INSTALL_MOD_PATH)$(KERNEL_MODULE_DIRECTORY)/$(KERNELRELEASE) It would be completely posible to only define the middle part as a variable that could then be used in rpm-pkg: export KERNEL_MODULE_DIRECTORY := /lib/modules MODLIB = $(INSTALL_MOD_PATH)$(KERNEL_MODULE_DIRECTORY)/$(KERNELRELEASE) Thanks Michal
Re: [PATCH v9 1/5] lib: objpool added: ring-array based lockless MPMC
On Mon, 9 Oct 2023 17:23:34 +0800 wuqiang wrote: > Hello Masami, > > Just got time for the new patch and got that ages[] was removed. ages[] is > introduced the way like 2-phase commit to keep consitency and must be kept. > > Thinking of the following 2 cases that two cpu nodes are operating the same > objpool_slot simultaneously: > > Case 1: > >NODE 1: NODE 2: >push to an empty slotpop will get wrong value > >try_cmpxchg_acquire(&slot->tail, &tail, next) > try_cmpxchg_release(&slot->head, &head, head + 1) > return slot->entries[head & slot->mask] >WRITE_ONCE(slot->entries[tail & slot->mask], obj) Oh, good catch! Hmm, indeed. I considered to use another 'commit_tail' but it may not work (small window to leak the object for nested case). Thanks for review it! > > > Case 2: > >NODE 1: NODE 2 >push to slot w/ 1 objpop will get wrong value > > try_cmpxchg_release(&slot->head, &head, head + 1) >try_cmpxchg_acquire(&slot->tail, &tail, next) >WRITE_ONCE(slot->entries[tail & slot->mask], obj) > return slot->entries[head & slot->mask] > > > Regards, > wuqiang > > On 2023/9/25 17:42, Masami Hiramatsu (Google) wrote: > > Hi Wuqiang, > > > > On Tue, 5 Sep 2023 09:52:51 +0800 > > "wuqiang.matt" wrote: > > > >> The object pool is a scalable implementaion of high performance queue > >> for object allocation and reclamation, such as kretprobe instances. > >> > >> With leveraging percpu ring-array to mitigate the hot spot of memory > >> contention, it could deliver near-linear scalability for high parallel > >> scenarios. The objpool is best suited for following cases: > >> 1) Memory allocation or reclamation are prohibited or too expensive > >> 2) Consumers are of different priorities, such as irqs and threads > >> > >> Limitations: > >> 1) Maximum objects (capacity) is determined during pool initializing > >> and can't be modified (extended) after objpool creation > >> 2) The memory of objects won't be freed until objpool is finalized > >> 3) Object allocation (pop) may fail after trying all cpu slots > > > > I made a simplifying patch on this by (mainly) removing ages array. > > I also rename local variable to use more readable names, like slot, > > pool, and obj. > > > > Here the results which I run the test_objpool.ko. > > > > Original: > > [ 50.500235] Summary of testcases: > > [ 50.503139] duration: 1027135us hits: 30628293miss: > >0sync: percpu objpool > > [ 50.510416] duration: 1047977us hits: 30453023miss: > >0sync: percpu objpool from vmalloc > > [ 50.517421] duration: 1047098us hits: 31145034miss: > >0sync & hrtimer: percpu objpool > > [ 50.524671] duration: 1053233us hits: 30919640miss: > >0sync & hrtimer: percpu objpool from vmalloc > > [ 50.531382] duration: 1055822us hits:3407221miss: > > 830219sync overrun: percpu objpool > > [ 50.538135] duration: 1055998us hits:3404624miss: > > 854160sync overrun: percpu objpool from vmalloc > > [ 50.546686] duration: 1046104us hits: 19464798miss: > >0async: percpu objpool > > [ 50.552989] duration: 1033054us hits: 18957836miss: > >0async: percpu objpool from vmalloc > > [ 50.560289] duration: 1041778us hits: 33806868miss: > >0async & hrtimer: percpu objpool > > [ 50.567425] duration: 1048901us hits: 34211860miss: > >0async & hrtimer: percpu objpool from vmalloc > > > > Simplified: > > [ 48.393236] Summary of testcases: > > [ 48.395384] duration: 1013002us hits: 29661448miss: > >0sync: percpu objpool > > [ 48.400351] duration: 1057231us hits: 31035578miss: > >0sync: percpu objpool from vmalloc > > [ 48.405660] duration: 1043196us hits: 30546652miss: > >0sync & hrtimer: percpu objpool > > [ 48.411216] duration: 1047128us hits: 30601306miss: > >0sync & hrtimer: percpu objpool from vmalloc > > [ 48.417231] duration: 1051695us hits:3468287miss: > > 892881sync overrun: percpu objpool > > [ 48.422405] duration: 1054003us hits:3549491miss: > > 898141sync overrun: percpu objpool from vmalloc > > [ 48.428425] duration: 1052946us hits: 19005228miss: > >0async: percpu objpool > > [ 48.433597] duration: 1051757us hits: 19670866miss: > >0async: percpu objpool from vmalloc > > [
Re: [PATCH v1] samples: kprobes: Fixes a typo
On Sat, 7 Oct 2023 21:09:00 +0530 Atul Kumar Pant wrote: > On Sat, Sep 23, 2023 at 11:00:46PM +0530, Atul Kumar Pant wrote: > > On Thu, Aug 17, 2023 at 10:38:19PM +0530, Atul Kumar Pant wrote: > > > Fixes typo in a function name. > > > > > > Signed-off-by: Atul Kumar Pant > > > --- > > > samples/kprobes/kretprobe_example.c | 2 +- > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > diff --git a/samples/kprobes/kretprobe_example.c > > > b/samples/kprobes/kretprobe_example.c > > > index cbf16542d84e..ed79fd3d48fb 100644 > > > --- a/samples/kprobes/kretprobe_example.c > > > +++ b/samples/kprobes/kretprobe_example.c > > > @@ -35,7 +35,7 @@ struct my_data { > > > ktime_t entry_stamp; > > > }; > > > > > > -/* Here we use the entry_hanlder to timestamp function entry */ > > > +/* Here we use the entry_handler to timestamp function entry */ > > > static int entry_handler(struct kretprobe_instance *ri, struct pt_regs > > > *regs) > > > { > > > struct my_data *data; > > > -- > > > 2.25.1 > > > > > > > Hi all, can someone provide comments on this change. > > Hi all, can someone please review this change. It has > been not > reviewed for quite some time. That's because trivial typos in comments are considered very low priority, and are usually only added (if they are ever added) if the maintainer has extra time, which may not be for a while. -- Steve
Re: [PATCH v4] eventfs: Remove eventfs_file and just use eventfs_inode
On Sat, 7 Oct 2023 19:44:45 +0900 Masami Hiramatsu (Google) wrote: > [...] > > @@ -118,6 +72,7 @@ static struct dentry *create_file(const char *name, > > umode_t mode, > > if (WARN_ON_ONCE(!S_ISREG(mode))) > > return NULL; > > > > + WARN_ON_ONCE(!parent); > > Nit: This looks a wrong case and should not create a file under root > directory. > So it should return NULL. (but it shouldn't happen.) Yes it should never happen (hence the warn on), but if it does happen, it shouldn't cause any real harm, so I decided not to return early. > > > dentry = eventfs_start_creating(name, parent); > > > > if (IS_ERR(dentry)) > > > > +static struct dentry * > > +create_file_dentry(struct eventfs_inode *ei, struct dentry **e_dentry, > > + struct dentry *parent, const char *name, umode_t mode, void > > *data, > > + const struct file_operations *fops, bool lookup) > > +{ > > + struct dentry *dentry; > > + bool invalidate = false; > > + > > + mutex_lock(&eventfs_mutex); > > + /* If the e_dentry already has a dentry, use it */ > > + if (*e_dentry) { > > + /* lookup does not need to up the ref count */ > > + if (!lookup) > > + dget(*e_dentry); > > + mutex_unlock(&eventfs_mutex); > > + return *e_dentry; > > + } > > + mutex_unlock(&eventfs_mutex); > > + > > + /* The lookup already has the parent->d_inode locked */ > > + if (!lookup) > > + inode_lock(parent->d_inode); > > + > > + dentry = create_file(name, mode, parent, data, fops); > > + > > + if (!lookup) > > + inode_unlock(parent->d_inode); > > + > > + mutex_lock(&eventfs_mutex); > > + > > + if (IS_ERR_OR_NULL(dentry)) { > > + /* > > +* When the mutex was released, something else could have > > +* created the dentry for this e_dentry. In which case > > +* use that one. > > +* > > +* Note, with the mutex held, the e_dentry cannot have content > > +* and the ei->is_freed be true at the same time. > > +*/ > > + WARN_ON_ONCE(ei->is_freed); > > + dentry = *e_dentry; > > + /* The lookup does not need to up the dentry refcount */ > > + if (dentry && !lookup) > > + dget(dentry); > > + mutex_unlock(&eventfs_mutex); > > + return dentry; > > + } > > + > > + if (!*e_dentry && !ei->is_freed) { > > + *e_dentry = dentry; > > + dentry->d_fsdata = ei; > > Nit: If we use LSB for existing flag, this should be checked. e.g. > WARN_ON_ONCE(ei & 1). But ei->is_freed is set before we set that LSB. Why should we check it here if we already checked ei->is_freed? > > > > + } else { > > + /* > > +* Should never happen unless we get here due to being freed. > > +* Otherwise it means two dentries exist with the same name. > > +*/ > > + WARN_ON_ONCE(!ei->is_freed); > > + invalidate = true; > > + } > > + mutex_unlock(&eventfs_mutex); > > + > > + if (invalidate) > > + d_invalidate(dentry); > > + > > + if (lookup || invalidate) > > + dput(dentry); > > + > > + return invalidate ? NULL : dentry; > > +} > > + > > /** > > * eventfs_post_create_dir - post create dir routine > > - * @ef: eventfs_file of recently created dir > > + * @ei: eventfs_inode of recently created dir > > * > > * Map the meta-data of files within an eventfs dir to their parent dentry > > */ > > -static void eventfs_post_create_dir(struct eventfs_file *ef) > > +static void eventfs_post_create_dir(struct eventfs_inode *ei) > > { > > - struct eventfs_file *ef_child; > > + struct eventfs_inode *ei_child; > > struct tracefs_inode *ti; > > > > /* srcu lock already held */ > > /* fill parent-child relation */ > > - list_for_each_entry_srcu(ef_child, &ef->ei->e_top_files, list, > > + list_for_each_entry_srcu(ei_child, &ei->children, list, > > srcu_read_lock_held(&eventfs_srcu)) { > > - ef_child->d_parent = ef->dentry; > > + ei_child->d_parent = ei->dentry; > > } > > > > - ti = get_tracefs(ef->dentry->d_inode); > > - ti->private = ef->ei; > > + ti = get_tracefs(ei->dentry->d_inode); > > + ti->private = ei; > > } > > > > /** > > - * create_dentry - helper function to create dentry > > - * @ef: eventfs_file of file or directory to create > > - * @parent: parent dentry > > - * @lookup: true if called from lookup routine > > + * create_dir_dentry - Create a directory dentry for the eventfs_inode > > + * @ei: The eventfs_inode to create the directory for > > + * @parent: The dentry of the parent of this directory > > + * @lookup: True if this is called by the lookup code > > * > > - * Used to create a dentry for file/dir, executes post dentry creation > > routine > > + * This creates
Re: [PATCH v5] scripts/link-vmlinux.sh: Add alias to duplicate symbols for kallsyms
Hello Kris, Thank you for your contribution and for having your thought shared with me. Allow me to begin this conversation by explaining what came to mind when I decided to propose a patch that creates aliases. The objective was to address a specific problem I was facing while minimizing any potential impact on other aspects. My initial consideration was the existence of numerous tools, both in the kernel and in userspace, that rely on the current kallsyms implementation. Both Nick and I shared the concern that making changes to elements upon which these tools depend on could have significant consequences. To the best of my knowledge, Nick's strategy has been to duplicate kallsyms with something new - a new, improved kallsyms file. However, even if Nick's patch were to be accepted, it wouldn't fully meet my personal requirements. This is because my goal was to utilize kprobe on a symbol that shares its name with others. Nick's work wouldn't allow me to do this, and that's why, I proposed an alternative. As a result, my strategy was more modest and focused solely on creating aliases for duplicate symbols. By adding these aliases, existing tools would remain unaffected, and the current system state and ecosystem would be preserved. For instance, mechanisms like live patching could continue to use the symbol hit count. On the flip side, introducing these new symbols would enable tracers to directly employ the new names without any modifications, and humans could easily identify the symbol they are dealing with just by examining the name. These are the fundamental principles behind my patch - introducing aliases. Il giorno gio 5 ott 2023 alle ore 18:25 Kris Van Hees ha scritto: > > On Wed, Sep 27, 2023 at 05:35:16PM +, Alessandro Carminati (Red Hat) > wrote: > > It is not uncommon for drivers or modules related to similar peripherals > > to have symbols with the exact same name. > > While this is not a problem for the kernel's binary itself, it becomes an > > issue when attempting to trace or probe specific functions using > > infrastructure like ftrace or kprobe. > > > > The tracing subsystem relies on the `nm -n vmlinux` output, which provides > > symbol information from the kernel's ELF binary. However, when multiple > > symbols share the same name, the standard nm output does not differentiate > > between them. This can lead to confusion and difficulty when trying to > > probe the intended symbol. > > > > ~ # cat /proc/kallsyms | grep " name_show" > > 8c4f76d0 t name_show > > 8c9cccb0 t name_show > > 8cb0ac20 t name_show > > 8cc728c0 t name_show > > 8ce0efd0 t name_show > > 8ce126c0 t name_show > > 8ce1dd20 t name_show > > 8ce24e70 t name_show > > 8d1104c0 t name_show > > 8d1fe480 t name_show > > One problem that remains as far as I can see is that this approach does not > take into consideration that there can be duplicate symbols in the core > kernel, but also between core kernel and loadable modules, and even between > loadable modules. So, I think more is needed to also ensure that this > approach of adding alias symbols is also done for loadable modules. > > Earlier work that cover all symbols (core kernel and loadable modules) was > posted quite a while ago by Nick Alcock: > > https://lore.kernel.org/all/20221205163157.269335-1-nick.alc...@oracle.com/ > > It takes a different approach and adds in other info that is very useful for > tracing, but unfortunately it has been dormant for a long time now. > > While module symbols are handled quite differently (for kallsyms) from the > core kernel symbols, I think that a similar approach tied in with modpost > ought to be quite possible. It will add to the size of modules because the > data needs to be stored in the .ko but that is unavoidable. But not doing it > unfortunately would mean that the duplicate symbol issue remains unresolved > in the presence of loadable modules. > In the current implementation, my work is capable of "addressing" the issue only for duplicate symbols within the core kernel image. This is a fact. Currently, modules are not within the scope of my work. Originally, my intention was to deal with duplicates in modules as the subsequent objective. I agree with you that addressing duplicate symbols in modules is something that needs to be resolved, but from my perspective, it appeared to be of lesser importance. This is because, when trace tools are used manually, there is already an indication of the modules where the name originates. In this context, I would welcome the opportunity to explore additional use cases. By doing so, I can better align any future proposals with these real-world scenarios and requirements I'm not considering in this moment. In my initial intention, I had hoped to extend the use of aliases to modules as well. However, I encountered some challenges that need to be addressed first. If I were to cont
Re: [PATCH 2/5] UML: remove unused cmd_vdso_install
- Ursprüngliche Mail - > Von: "masahiroy" > An: "linux-kbuild" > CC: "linux-kernel" , "linux-arm-kernel" > , > linux-c...@vger.kernel.org, "linux-parisc" , > linux-ri...@lists.infradead.org, > linux-s...@vger.kernel.org, "linux-um" , > "loongarch" , > "sparclinux" , "x86" , > "masahiroy" , "anton ivanov" > , "bp" , "dave hansen" > , "hpa" > , "mingo" , "Johannes Berg" > , "richard" , > "tglx" > Gesendet: Montag, 9. Oktober 2023 14:42:07 > Betreff: [PATCH 2/5] UML: remove unused cmd_vdso_install > You cannot run this code because arch/um/Makefile does not define the > vdso_install target. > > It appears that this code was blindly copied from another architecture. > > Remove the dead code. > > Signed-off-by: Masahiro Yamada Acked-by: Richard Weinberger Thanks, //richard
Re: [PATCH v2 3/6] ACPI: AC: Replace acpi_driver with platform_driver
On 10/9/2023 2:27 PM, Rafael J. Wysocki wrote: > On Mon, Oct 9, 2023 at 10:40 AM Wilczynski, Michal > wrote: >> >> Hi ! >> >> Thanks a lot for a review, to both of you ! :-) >> >> On 10/7/2023 12:43 PM, Rafael J. Wysocki wrote: >>> On Sat, Oct 7, 2023 at 12:41 PM Rafael J. Wysocki wrote: On Sat, Oct 7, 2023 at 9:56 AM Andy Shevchenko wrote: > On Fri, Oct 06, 2023 at 09:47:57PM +0200, Rafael J. Wysocki wrote: >> On Fri, Oct 6, 2023 at 8:33 PM Michal Wilczynski >> wrote: > ... > >>> struct acpi_ac { >>> struct power_supply *charger; >>> struct power_supply_desc charger_desc; >>> - struct acpi_device *device; >>> + struct device *dev; >> I'm not convinced about this change. >> >> If I'm not mistaken, you only use the dev pointer above to get the >> ACPI_COMPANION() of it, but the latter is already found in _probe(), >> so it can be stored in struct acpi_ac for later use and then the dev >> pointer in there will not be necessary any more. >> >> That will save you a bunch of ACPI_HANDLE() evaluations and there's >> nothing wrong with using ac->device->handle. The patch will then >> become almost trivial AFAICS and if you really need to get from ac to >> the underlying platform device, a pointer to it can be added to struct >> acpi_ac without removing the ACPI device pointer from it. >> Yeah we could add platform device without removing acpi device, and >> yes that would introduce data duplication, like Andy noticed. > But he hasn't answered my question regarding what data duplication he > meant, exactly. > > So what data duplication do you mean? So what I meant was that many drivers would find it useful to have a struct device embedded in their 'private structure', and in that case there would be a duplication of platform_device and acpi_device as both pointers would be needed. Mind this if you only have struct device it's easy to get acpi device, but it's not the case the other way around. So for this driver it's just a matter of sticking to convention, for the others like it would be duplication. In my version of this patch we managed to avoid this duplication, thanks to the contextual argument introduced before, but look at this patch: https://github.com/mwilczy/linux-pm/commit/cc8ef52707341e67a12067d6ead991d56ea017ca Author of this patch had to introduce platform_device and acpi_device to the struct ac, so there was some duplication. That is the case for many drivers to come, so I decided it's better to change this and have a pattern with keeping only 'struct device'. > >> And yes, maybe for this particular driver there is little impact (as struct >> device is not >> really used), but in my opinion we should adhere to some common coding >> pattern among all acpi drivers in order for the code to be easier to maintain >> and improve readability, also for any newcomer. > Well, maybe. > > But then please minimize the number of code lines changed in this > particular patch and please avoid changes that are not directly > related to the purpose of the patch. Sure, like I've stated before I felt this is part of this patch, we only narrowly escaped the duplication by introducing contextual argument before ;-) > > The idea behind is to eliminate data duplication. What data duplication exactly do you mean? struct acpi_device *device is replaced with struct device *dev which is the same size. The latter is then used to obtain a struct acpi_device pointer. Why is it better to do this than to store the struct acpi_device itself? >>> This should be "store the struct acpi_device pointer itself", sorry. >> Hi, >> So let me explain the reasoning here: >> >> 1) I've had a pleasure to work with different drivers in ACPI directory. In >> my >> opinion the best ones I've seen, were build around the concept of struct >> device (e.g NFIT). It's a struct that's well understood in the kernel, >> and >> it's easier to understand without having to read any ACPI specific code. >> If I see something like ACPI_HANDLE(dev), I think 'ah-ha - having a >> struct >> device I can easily get struct acpi_device - they're connected'. And I >> think >> using a standardized macro, instead of manually dereferencing pointers is >> also much easier for ACPI newbs reading the code, as it hides a bit >> complexity >> of getting acpi device from struct device. And to be honest I don't >> think there would >> be any measurable performance change, as those code paths are far from >> being considered 'hot paths'. So if we can get the code easier to >> understand >> from a newcomer perspective, why not do it. > I have a differing opinion on a couple of points above, and let's make > one change at a time. OK > >> 2) I think it would be good to stick to the convention, and introduce some >> coding >> pattern, for now some
[PATCH 5/5] kbuild: unify no-compiler-targets and no-sync-config-targets
Now that vdso_install does not depend on any in-tree build artifact, it no longer invokes a compiler, making no-compiler-targets the same as no-sync-config-targets. Signed-off-by: Masahiro Yamada --- Makefile | 13 + 1 file changed, 1 insertion(+), 12 deletions(-) diff --git a/Makefile b/Makefile index 2170d56630e8..982b1ad33287 100644 --- a/Makefile +++ b/Makefile @@ -277,10 +277,6 @@ no-dot-config-targets := $(clean-targets) \ $(version_h) headers headers_% archheaders archscripts \ %asm-generic kernelversion %src-pkg dt_binding_check \ outputmakefile rustavailable rustfmt rustfmtcheck -# Installation targets should not require compiler. Unfortunately, vdso_install -# is an exception where build artifacts may be updated. This must be fixed. -no-compiler-targets := $(no-dot-config-targets) install dtbs_install \ - headers_install modules_install modules_sign kernelrelease image_name no-sync-config-targets := $(no-dot-config-targets) %install modules_sign kernelrelease \ image_name single-targets := %.a %.i %.ko %.lds %.ll %.lst %.mod %.o %.rsi %.s %.symtypes %/ @@ -288,7 +284,6 @@ single-targets := %.a %.i %.ko %.lds %.ll %.lst %.mod %.o %.rsi %.s %.symtypes % config-build := mixed-build:= need-config:= 1 -need-compiler := 1 may-sync-config:= 1 single-build := @@ -298,12 +293,6 @@ ifneq ($(filter $(no-dot-config-targets), $(MAKECMDGOALS)),) endif endif -ifneq ($(filter $(no-compiler-targets), $(MAKECMDGOALS)),) - ifeq ($(filter-out $(no-compiler-targets), $(MAKECMDGOALS)),) - need-compiler := - endif -endif - ifneq ($(filter $(no-sync-config-targets), $(MAKECMDGOALS)),) ifeq ($(filter-out $(no-sync-config-targets), $(MAKECMDGOALS)),) may-sync-config := @@ -675,7 +664,7 @@ endif # Include this also for config targets because some architectures need # cc-cross-prefix to determine CROSS_COMPILE. -ifdef need-compiler +ifdef may-sync-config include $(srctree)/scripts/Makefile.compiler endif -- 2.39.2
[PATCH 4/5] kbuild: unify vdso_install rules
Currently, there is no standard implementation for vdso_install, leading to various issues: 1. Code duplication Many architectures duplicate similar code just for copying files to the install destination. Some architectures (arm, sparc, x86) create build-id symlinks, introducing more code duplication. 2. Accidental updates of in-tree build artifacts The vdso_install rule depends on the vdso files to install. It may update in-tree build artifacts. This can be problematic, as explained in commit 19514fc665ff ("arm, kbuild: make "make install" not depend on vmlinux"). 3. Broken code in some architectures Makefile code is often copied from one architecture to another without proper adaptation or testing. The previous commits removed broken code from csky, UML, and parisc. Another issue is that 'make vdso_install' for ARCH=s390 installs vdso64, but not vdso32. To address these problems, this commit introduces the generic vdso_install. Architectures that support vdso_install need to define vdso-install-y in arch/*/Makefile. vdso-install-y lists the files to install. For example, arch/x86/Makefile looks like this: vdso-install-$(CONFIG_X86_64) += arch/x86/entry/vdso/vdso64.so.dbg vdso-install-$(CONFIG_X86_X32_ABI) += arch/x86/entry/vdso/vdsox32.so.dbg vdso-install-$(CONFIG_X86_32) += arch/x86/entry/vdso/vdso32.so.dbg vdso-install-$(CONFIG_IA32_EMULATION) += arch/x86/entry/vdso/vdso32.so.dbg These files will be installed to $(MODLIB)/vdso/ with the .dbg suffix, if exists, stripped away. vdso-install-y can optionally take the second field after the colon separator. This is needed because some architectures install vdso files as a different base name. The following is a snippet from arch/arm64/Makefile. vdso-install-$(CONFIG_COMPAT_VDSO) += arch/arm64/kernel/vdso32/vdso.so.dbg:vdso32.so This will rename vdso.so.dbg to vdso32.so during installation. If such architectures change their implementation so that the file names match, this workaround will go away. Signed-off-by: Masahiro Yamada --- Makefile | 9 ++ arch/arm/Makefile | 7 +--- arch/arm/vdso/Makefile | 25 -- arch/arm64/Makefile| 9 ++ arch/arm64/kernel/vdso/Makefile| 10 -- arch/arm64/kernel/vdso32/Makefile | 10 -- arch/loongarch/Makefile| 4 +-- arch/loongarch/vdso/Makefile | 10 -- arch/riscv/Makefile| 9 ++ arch/riscv/kernel/compat_vdso/Makefile | 10 -- arch/riscv/kernel/vdso/Makefile| 10 -- arch/s390/Makefile | 6 ++-- arch/s390/kernel/vdso32/Makefile | 10 -- arch/s390/kernel/vdso64/Makefile | 10 -- arch/sparc/Makefile| 5 ++- arch/sparc/vdso/Makefile | 27 arch/x86/Makefile | 7 ++-- arch/x86/entry/vdso/Makefile | 27 scripts/Makefile.vdsoinst | 45 ++ 19 files changed, 71 insertions(+), 179 deletions(-) create mode 100644 scripts/Makefile.vdsoinst diff --git a/Makefile b/Makefile index 373649c7374e..2170d56630e8 100644 --- a/Makefile +++ b/Makefile @@ -1317,6 +1317,14 @@ scripts_unifdef: scripts_basic quiet_cmd_install = INSTALL $(INSTALL_PATH) cmd_install = unset sub_make_done; $(srctree)/scripts/install.sh +# --- +# vDSO install + +PHONY += vdso_install +vdso_install: export INSTALL_FILES = $(vdso-install-y) +vdso_install: + $(Q)$(MAKE) -f $(srctree)/scripts/Makefile.vdsoinst + # --- # Tools @@ -1560,6 +1568,7 @@ help: @echo '* vmlinux - Build the bare kernel' @echo '* modules - Build all modules' @echo ' modules_install - Install all modules to INSTALL_MOD_PATH (default: /)' + @echo ' vdso_install- Install unstripped vdso to INSTALL_MOD_PATH (default: /)' @echo ' dir/- Build all files in dir and below' @echo ' dir/file.[ois] - Build specified target only' @echo ' dir/file.ll - Build the LLVM assembly file' diff --git a/arch/arm/Makefile b/arch/arm/Makefile index 547e5856eaa0..5ba42f69f8ce 100644 --- a/arch/arm/Makefile +++ b/arch/arm/Makefile @@ -304,11 +304,7 @@ $(INSTALL_TARGETS): KBUILD_IMAGE = $(boot)/$(patsubst %install,%Image,$@) $(INSTALL_TARGETS): $(call cmd,install) -PHONY += vdso_install -vdso_install: -ifeq ($(CONFIG_VDSO),y) - $(Q)$(MAKE) $(build)=arch/arm/vdso $@ -endif +vdso-install-$(CONFIG_VDSO) += arch/arm/vdso/vdso.so.dbg # My testing targets (bypasses dependencies) bp:; $(Q)$(MAKE) $(build)=$(boot) $(boot)/bootpImage @@ -331,7 +327,6 @@ define
[PATCH 3/5] parisc: remove broken vdso_install
'make ARCH=parisc vdso_install' has never worked. It attempts to descend into arch/parisc/kernel/vdso/, which does not exist. The command just fails: scripts/Makefile.build:41: arch/parisc/kernel/vdso/Makefile: No such file or directory The second line is also meaningless because parisc does not define CONFIG_COMPAT_VDSO. It appears that this code was copied from another architecture without proper adaptation. Remove the broken code. Signed-off-by: Masahiro Yamada --- arch/parisc/Makefile | 7 --- 1 file changed, 7 deletions(-) diff --git a/arch/parisc/Makefile b/arch/parisc/Makefile index 968ebe17494c..4222fa73c34a 100644 --- a/arch/parisc/Makefile +++ b/arch/parisc/Makefile @@ -177,13 +177,6 @@ vdso_prepare: prepare0 $(Q)$(MAKE) $(build)=arch/parisc/kernel/vdso32 include/generated/vdso32-offsets.h endif -PHONY += vdso_install - -vdso_install: - $(Q)$(MAKE) $(build)=arch/parisc/kernel/vdso $@ - $(if $(CONFIG_COMPAT_VDSO), \ - $(Q)$(MAKE) $(build)=arch/parisc/kernel/vdso32 $@) - install: KBUILD_IMAGE := vmlinux zinstall: KBUILD_IMAGE := vmlinuz install zinstall: -- 2.39.2
[PATCH 2/5] UML: remove unused cmd_vdso_install
You cannot run this code because arch/um/Makefile does not define the vdso_install target. It appears that this code was blindly copied from another architecture. Remove the dead code. Signed-off-by: Masahiro Yamada --- arch/x86/um/vdso/Makefile | 12 1 file changed, 12 deletions(-) diff --git a/arch/x86/um/vdso/Makefile b/arch/x86/um/vdso/Makefile index 6825e146a62f..b86d634730b2 100644 --- a/arch/x86/um/vdso/Makefile +++ b/arch/x86/um/vdso/Makefile @@ -67,15 +67,3 @@ quiet_cmd_vdso = VDSO$@ VDSO_LDFLAGS = -fPIC -shared -Wl,--hash-style=sysv -z noexecstack GCOV_PROFILE := n - -# -# Install the unstripped copy of vdso*.so listed in $(vdso-install-y). -# -quiet_cmd_vdso_install = INSTALL $@ - cmd_vdso_install = cp $(obj)/$@.dbg $(MODLIB)/vdso/$@ -$(vdso-install-y): %.so: $(obj)/%.so.dbg FORCE - @mkdir -p $(MODLIB)/vdso - $(call cmd,vdso_install) - -PHONY += vdso_install $(vdso-install-y) -vdso_install: $(vdso-install-y) -- 2.39.2
[PATCH 1/5] csky: remove unused cmd_vdso_install
You cannot run this code because arch/csky/Makefile does not define the vdso_install target. It appears that this code was blindly copied from another architecture. Remove the dead code. Signed-off-by: Masahiro Yamada --- arch/csky/kernel/vdso/Makefile | 10 -- 1 file changed, 10 deletions(-) diff --git a/arch/csky/kernel/vdso/Makefile b/arch/csky/kernel/vdso/Makefile index 299e4e41ebc5..ddf784a62c11 100644 --- a/arch/csky/kernel/vdso/Makefile +++ b/arch/csky/kernel/vdso/Makefile @@ -58,13 +58,3 @@ quiet_cmd_vdsold = VDSOLD $@ # that contains the same symbols at the same offsets. quiet_cmd_so2s = SO2S$@ cmd_so2s = $(NM) -D $< | $(srctree)/$(src)/so2s.sh > $@ - -# install commands for the unstripped file -quiet_cmd_vdso_install = INSTALL $@ - cmd_vdso_install = cp $(obj)/$@.dbg $(MODLIB)/vdso/$@ - -vdso.so: $(obj)/vdso.so.dbg - @mkdir -p $(MODLIB)/vdso - $(call cmd,vdso_install) - -vdso_install: vdso.so -- 2.39.2
Re: [PATCH rebased] kbuild: rpm-pkg: Fix build with non-default MODLIB
On Mon, Oct 9, 2023 at 5:52 PM Michal Suchánek wrote: > > Hello, > > On Mon, Oct 09, 2023 at 05:31:02PM +0900, Masahiro Yamada wrote: > > On Fri, Oct 6, 2023 at 12:49 AM Michal Suchanek wrote: > > > > > > The default MODLIB value is composed of two variables and the hardcoded > > > string '/lib/modules/'. > > > > > > MODLIB = $(INSTALL_MOD_PATH)/lib/modules/$(KERNELRELEASE) > > > > > > Defining this middle part as a variable was rejected on the basis that > > > users can pass the whole MODLIB to make, such as > > > > > > In other words, do you want to say > > > > "If defining this middle part as a variable had been accepted, > > this patch would have been unneeded." ? > > If it were accepted I would not have to guess what the middle part is, > and could use the variable that unambiguosly defines it instead. How? scripts/package/kernel.spec hardcodes 'lib/modules' in a couple of places. I am asking how to derive the module path. -- Best Regards Masahiro Yamada
Re: [PATCH v2 3/6] ACPI: AC: Replace acpi_driver with platform_driver
On Mon, Oct 9, 2023 at 10:40 AM Wilczynski, Michal wrote: > > > Hi ! > > Thanks a lot for a review, to both of you ! :-) > > On 10/7/2023 12:43 PM, Rafael J. Wysocki wrote: > > On Sat, Oct 7, 2023 at 12:41 PM Rafael J. Wysocki wrote: > >> On Sat, Oct 7, 2023 at 9:56 AM Andy Shevchenko > >> wrote: > >>> On Fri, Oct 06, 2023 at 09:47:57PM +0200, Rafael J. Wysocki wrote: > On Fri, Oct 6, 2023 at 8:33 PM Michal Wilczynski > wrote: > >>> ... > >>> > > struct acpi_ac { > > struct power_supply *charger; > > struct power_supply_desc charger_desc; > > - struct acpi_device *device; > > + struct device *dev; > I'm not convinced about this change. > > If I'm not mistaken, you only use the dev pointer above to get the > ACPI_COMPANION() of it, but the latter is already found in _probe(), > so it can be stored in struct acpi_ac for later use and then the dev > pointer in there will not be necessary any more. > > That will save you a bunch of ACPI_HANDLE() evaluations and there's > nothing wrong with using ac->device->handle. The patch will then > become almost trivial AFAICS and if you really need to get from ac to > the underlying platform device, a pointer to it can be added to struct > acpi_ac without removing the ACPI device pointer from it. > > Yeah we could add platform device without removing acpi device, and > yes that would introduce data duplication, like Andy noticed. But he hasn't answered my question regarding what data duplication he meant, exactly. So what data duplication do you mean? > And yes, maybe for this particular driver there is little impact (as struct > device is not > really used), but in my opinion we should adhere to some common coding > pattern among all acpi drivers in order for the code to be easier to maintain > and improve readability, also for any newcomer. Well, maybe. But then please minimize the number of code lines changed in this particular patch and please avoid changes that are not directly related to the purpose of the patch. > >>> The idea behind is to eliminate data duplication. > >> What data duplication exactly do you mean? > >> > >> struct acpi_device *device is replaced with struct device *dev which > >> is the same size. The latter is then used to obtain a struct > >> acpi_device pointer. Why is it better to do this than to store the > >> struct acpi_device itself? > > This should be "store the struct acpi_device pointer itself", sorry. > > Hi, > So let me explain the reasoning here: > > 1) I've had a pleasure to work with different drivers in ACPI directory. In my > opinion the best ones I've seen, were build around the concept of struct > device (e.g NFIT). It's a struct that's well understood in the kernel, and > it's easier to understand without having to read any ACPI specific code. > If I see something like ACPI_HANDLE(dev), I think 'ah-ha - having a > struct > device I can easily get struct acpi_device - they're connected'. And I > think > using a standardized macro, instead of manually dereferencing pointers is > also much easier for ACPI newbs reading the code, as it hides a bit > complexity > of getting acpi device from struct device. And to be honest I don't think > there would > be any measurable performance change, as those code paths are far from > being considered 'hot paths'. So if we can get the code easier to > understand > from a newcomer perspective, why not do it. I have a differing opinion on a couple of points above, and let's make one change at a time. > > 2) I think it would be good to stick to the convention, and introduce some > coding > pattern, for now some drivers store the struct device pointer, and other > store > acpi device pointer . As I'm doing this change acpi device pointer > become less > relevant, as we're using platform device. So to reflect that loss of > relevance > we can choose to adhere to a pattern where we use it less and less, and > the > winning approach would be to use 'struct device' by default everywhere > we can > so maybe eventually we would be able to lose acpi_device altogether at > some point, > as most of the usage is to retrieve acpi handle and execute some AML > method. > So in my understanding acpi device is already obsolete at this point, if > we can > manage to use it less and less, and eventually wipe it out then why not > ;-) No, ACPI device is not obsolete, it will still be needed for various things, like power management and hotplug. Also, I'd rather store a struct acpi_device than acpi_handle, because the latter is much better from the compile-time type correctness checks and similar. I can send my version of the $subject patch just fine if you strongly disagree with me.
Re: [PATCH net-next v7] net/core: Introduce netdev_core_stats_inc()
On 2023/10/9 18:16, Eric Dumazet wrote: On Mon, Oct 9, 2023 at 11:43 AM Yajun Deng wrote: On 2023/10/9 17:30, Eric Dumazet wrote: On Mon, Oct 9, 2023 at 10:36 AM Yajun Deng wrote: On 2023/10/9 16:20, Eric Dumazet wrote: On Mon, Oct 9, 2023 at 10:14 AM Yajun Deng wrote: On 2023/10/9 15:53, Eric Dumazet wrote: On Mon, Oct 9, 2023 at 5:07 AM Yajun Deng wrote: 'this_cpu_read + this_cpu_write' and 'pr_info + this_cpu_inc' will make the trace work well. They all have 'pop' instructions in them. This may be the key to making the trace work well. Hi all, I need your help on percpu and ftrace. I do not think you made sure netdev_core_stats_inc() was never inlined. Adding more code in it is simply changing how the compiler decides to inline or not. Yes, you are right. It needs to add the 'noinline' prefix. The disassembly code will have 'pop' instruction. The function was fine, you do not need anything like push or pop. The only needed stuff was the call __fentry__. The fact that the function was inlined for some invocations was the issue, because the trace point is only planted in the out of line function. But somehow the following code isn't inline? They didn't need to add the 'noinline' prefix. + field = (unsigned long *)((void *)this_cpu_ptr(p) + offset); + WRITE_ONCE(*field, READ_ONCE(*field) + 1); Or + (*(unsigned long *)((void *)this_cpu_ptr(p) + offset))++; I think you are very confused. You only want to trace netdev_core_stats_inc() entry point, not arbitrary pieces of it. Yes, I will trace netdev_core_stats_inc() entry point. I mean to replace + field = (__force unsigned long __percpu *)((__force void *)p + offset); + this_cpu_inc(*field); with + field = (unsigned long *)((void *)this_cpu_ptr(p) + offset); + WRITE_ONCE(*field, READ_ONCE(*field) + 1); Or + (*(unsigned long *)((void *)this_cpu_ptr(p) + offset))++; The netdev_core_stats_inc() entry point will work fine even if it doesn't have 'noinline' prefix. I don't know why this code needs to add 'noinline' prefix. + field = (__force unsigned long __percpu *)((__force void *)p + offset); + this_cpu_inc(*field); C compiler decides to inline or not, depending on various factors. The most efficient (and small) code is generated by this_cpu_inc() version, allowing the compiler to inline it. If you copy/paste this_cpu_inc() twenty times, then the compiler would not inline the function anymore. Got it. Thank you.
Re: [PATCH net-next v7] net/core: Introduce netdev_core_stats_inc()
On Mon, Oct 9, 2023 at 11:43 AM Yajun Deng wrote: > > > On 2023/10/9 17:30, Eric Dumazet wrote: > > On Mon, Oct 9, 2023 at 10:36 AM Yajun Deng wrote: > >> > >> On 2023/10/9 16:20, Eric Dumazet wrote: > >>> On Mon, Oct 9, 2023 at 10:14 AM Yajun Deng wrote: > On 2023/10/9 15:53, Eric Dumazet wrote: > > On Mon, Oct 9, 2023 at 5:07 AM Yajun Deng wrote: > > > >> 'this_cpu_read + this_cpu_write' and 'pr_info + this_cpu_inc' will make > >> the trace work well. > >> > >> They all have 'pop' instructions in them. This may be the key to making > >> the trace work well. > >> > >> Hi all, > >> > >> I need your help on percpu and ftrace. > >> > > I do not think you made sure netdev_core_stats_inc() was never inlined. > > > > Adding more code in it is simply changing how the compiler decides to > > inline or not. > Yes, you are right. It needs to add the 'noinline' prefix. The > disassembly code will have 'pop' > > instruction. > > >>> The function was fine, you do not need anything like push or pop. > >>> > >>> The only needed stuff was the call __fentry__. > >>> > >>> The fact that the function was inlined for some invocations was the > >>> issue, because the trace point > >>> is only planted in the out of line function. > >> > >> But somehow the following code isn't inline? They didn't need to add the > >> 'noinline' prefix. > >> > >> + field = (unsigned long *)((void *)this_cpu_ptr(p) + > >> offset); > >> + WRITE_ONCE(*field, READ_ONCE(*field) + 1); > >> > >> Or > >> + (*(unsigned long *)((void *)this_cpu_ptr(p) + offset))++; > >> > > I think you are very confused. > > > > You only want to trace netdev_core_stats_inc() entry point, not > > arbitrary pieces of it. > > > Yes, I will trace netdev_core_stats_inc() entry point. I mean to replace > > + field = (__force unsigned long > __percpu *)((__force void *)p + offset); > + this_cpu_inc(*field); > > with > > + field = (unsigned long *)((void *)this_cpu_ptr(p) + offset); > + WRITE_ONCE(*field, READ_ONCE(*field) + 1); > > Or > + (*(unsigned long *)((void *)this_cpu_ptr(p) + offset))++; > > The netdev_core_stats_inc() entry point will work fine even if it doesn't > have 'noinline' prefix. > > I don't know why this code needs to add 'noinline' prefix. > + field = (__force unsigned long __percpu *)((__force void *)p > + offset); > + this_cpu_inc(*field); > C compiler decides to inline or not, depending on various factors. The most efficient (and small) code is generated by this_cpu_inc() version, allowing the compiler to inline it. If you copy/paste this_cpu_inc() twenty times, then the compiler would not inline the function anymore.
Re: [PATCH net-next v7] net/core: Introduce netdev_core_stats_inc()
On 2023/10/9 17:30, Eric Dumazet wrote: On Mon, Oct 9, 2023 at 10:36 AM Yajun Deng wrote: On 2023/10/9 16:20, Eric Dumazet wrote: On Mon, Oct 9, 2023 at 10:14 AM Yajun Deng wrote: On 2023/10/9 15:53, Eric Dumazet wrote: On Mon, Oct 9, 2023 at 5:07 AM Yajun Deng wrote: 'this_cpu_read + this_cpu_write' and 'pr_info + this_cpu_inc' will make the trace work well. They all have 'pop' instructions in them. This may be the key to making the trace work well. Hi all, I need your help on percpu and ftrace. I do not think you made sure netdev_core_stats_inc() was never inlined. Adding more code in it is simply changing how the compiler decides to inline or not. Yes, you are right. It needs to add the 'noinline' prefix. The disassembly code will have 'pop' instruction. The function was fine, you do not need anything like push or pop. The only needed stuff was the call __fentry__. The fact that the function was inlined for some invocations was the issue, because the trace point is only planted in the out of line function. But somehow the following code isn't inline? They didn't need to add the 'noinline' prefix. + field = (unsigned long *)((void *)this_cpu_ptr(p) + offset); + WRITE_ONCE(*field, READ_ONCE(*field) + 1); Or + (*(unsigned long *)((void *)this_cpu_ptr(p) + offset))++; I think you are very confused. You only want to trace netdev_core_stats_inc() entry point, not arbitrary pieces of it. Yes, I will trace netdev_core_stats_inc() entry point. I mean to replace + field = (__force unsigned long __percpu *)((__force void *)p + offset); + this_cpu_inc(*field); with + field = (unsigned long *)((void *)this_cpu_ptr(p) + offset); + WRITE_ONCE(*field, READ_ONCE(*field) + 1); Or + (*(unsigned long *)((void *)this_cpu_ptr(p) + offset))++; The netdev_core_stats_inc() entry point will work fine even if it doesn't have 'noinline' prefix. I don't know why this code needs to add 'noinline' prefix. + field = (__force unsigned long __percpu *)((__force void *)p + offset); + this_cpu_inc(*field);
Re: [PATCH net-next v7] net/core: Introduce netdev_core_stats_inc()
On Mon, Oct 9, 2023 at 10:36 AM Yajun Deng wrote: > > > On 2023/10/9 16:20, Eric Dumazet wrote: > > On Mon, Oct 9, 2023 at 10:14 AM Yajun Deng wrote: > >> > >> On 2023/10/9 15:53, Eric Dumazet wrote: > >>> On Mon, Oct 9, 2023 at 5:07 AM Yajun Deng wrote: > >>> > 'this_cpu_read + this_cpu_write' and 'pr_info + this_cpu_inc' will make > the trace work well. > > They all have 'pop' instructions in them. This may be the key to making > the trace work well. > > Hi all, > > I need your help on percpu and ftrace. > > >>> I do not think you made sure netdev_core_stats_inc() was never inlined. > >>> > >>> Adding more code in it is simply changing how the compiler decides to > >>> inline or not. > >> > >> Yes, you are right. It needs to add the 'noinline' prefix. The > >> disassembly code will have 'pop' > >> > >> instruction. > >> > > The function was fine, you do not need anything like push or pop. > > > > The only needed stuff was the call __fentry__. > > > > The fact that the function was inlined for some invocations was the > > issue, because the trace point > > is only planted in the out of line function. > > > But somehow the following code isn't inline? They didn't need to add the > 'noinline' prefix. > > + field = (unsigned long *)((void *)this_cpu_ptr(p) + offset); > + WRITE_ONCE(*field, READ_ONCE(*field) + 1); > > Or > + (*(unsigned long *)((void *)this_cpu_ptr(p) + offset))++; > I think you are very confused. You only want to trace netdev_core_stats_inc() entry point, not arbitrary pieces of it.
Re: [PATCH v9 1/5] lib: objpool added: ring-array based lockless MPMC
Hello Masami, Just got time for the new patch and got that ages[] was removed. ages[] is introduced the way like 2-phase commit to keep consitency and must be kept. Thinking of the following 2 cases that two cpu nodes are operating the same objpool_slot simultaneously: Case 1: NODE 1: NODE 2: push to an empty slotpop will get wrong value try_cmpxchg_acquire(&slot->tail, &tail, next) try_cmpxchg_release(&slot->head, &head, head + 1) return slot->entries[head & slot->mask] WRITE_ONCE(slot->entries[tail & slot->mask], obj) Case 2: NODE 1: NODE 2 push to slot w/ 1 objpop will get wrong value try_cmpxchg_release(&slot->head, &head, head + 1) try_cmpxchg_acquire(&slot->tail, &tail, next) WRITE_ONCE(slot->entries[tail & slot->mask], obj) return slot->entries[head & slot->mask] Regards, wuqiang On 2023/9/25 17:42, Masami Hiramatsu (Google) wrote: Hi Wuqiang, On Tue, 5 Sep 2023 09:52:51 +0800 "wuqiang.matt" wrote: The object pool is a scalable implementaion of high performance queue for object allocation and reclamation, such as kretprobe instances. With leveraging percpu ring-array to mitigate the hot spot of memory contention, it could deliver near-linear scalability for high parallel scenarios. The objpool is best suited for following cases: 1) Memory allocation or reclamation are prohibited or too expensive 2) Consumers are of different priorities, such as irqs and threads Limitations: 1) Maximum objects (capacity) is determined during pool initializing and can't be modified (extended) after objpool creation 2) The memory of objects won't be freed until objpool is finalized 3) Object allocation (pop) may fail after trying all cpu slots I made a simplifying patch on this by (mainly) removing ages array. I also rename local variable to use more readable names, like slot, pool, and obj. Here the results which I run the test_objpool.ko. Original: [ 50.500235] Summary of testcases: [ 50.503139] duration: 1027135us hits: 30628293miss: 0sync: percpu objpool [ 50.510416] duration: 1047977us hits: 30453023miss: 0sync: percpu objpool from vmalloc [ 50.517421] duration: 1047098us hits: 31145034miss: 0 sync & hrtimer: percpu objpool [ 50.524671] duration: 1053233us hits: 30919640miss: 0 sync & hrtimer: percpu objpool from vmalloc [ 50.531382] duration: 1055822us hits:3407221miss: 830219sync overrun: percpu objpool [ 50.538135] duration: 1055998us hits:3404624miss: 854160sync overrun: percpu objpool from vmalloc [ 50.546686] duration: 1046104us hits: 19464798miss: 0async: percpu objpool [ 50.552989] duration: 1033054us hits: 18957836miss: 0async: percpu objpool from vmalloc [ 50.560289] duration: 1041778us hits: 33806868miss: 0 async & hrtimer: percpu objpool [ 50.567425] duration: 1048901us hits: 34211860miss: 0 async & hrtimer: percpu objpool from vmalloc Simplified: [ 48.393236] Summary of testcases: [ 48.395384] duration: 1013002us hits: 29661448miss: 0sync: percpu objpool [ 48.400351] duration: 1057231us hits: 31035578miss: 0sync: percpu objpool from vmalloc [ 48.405660] duration: 1043196us hits: 30546652miss: 0 sync & hrtimer: percpu objpool [ 48.411216] duration: 1047128us hits: 30601306miss: 0 sync & hrtimer: percpu objpool from vmalloc [ 48.417231] duration: 1051695us hits:3468287miss: 892881sync overrun: percpu objpool [ 48.422405] duration: 1054003us hits:3549491miss: 898141sync overrun: percpu objpool from vmalloc [ 48.428425] duration: 1052946us hits: 19005228miss: 0async: percpu objpool [ 48.433597] duration: 1051757us hits: 19670866miss: 0async: percpu objpool from vmalloc [ 48.439280] duration: 1042951us hits: 37135332miss: 0 async & hrtimer: percpu objpool [ 48.445085] duration: 1029803us hits: 37093248miss: 0 async & hrtimer: percpu objpool from vmalloc Can you test it too? Thanks, From f1f442ff653e329839e5452b8b88463a80a12ff3 Mon Sep 17 00:00:00 2001 From: "Masami Hiramatsu (Google)" Date: Mon, 25 Sep 2023 16:07:12 +0900 Subject: [PATCH] objpool: Simplify objpool by removing ages array Simplify the objpool code by removing ages array from per-cpu slot. It chooses enough big capacity (which is a rounded up power of 2 value of nr_objects
Re: [PATCH rebased] kbuild: rpm-pkg: Fix build with non-default MODLIB
Hello, On Mon, Oct 09, 2023 at 05:31:02PM +0900, Masahiro Yamada wrote: > On Fri, Oct 6, 2023 at 12:49 AM Michal Suchanek wrote: > > > > The default MODLIB value is composed of two variables and the hardcoded > > string '/lib/modules/'. > > > > MODLIB = $(INSTALL_MOD_PATH)/lib/modules/$(KERNELRELEASE) > > > > Defining this middle part as a variable was rejected on the basis that > > users can pass the whole MODLIB to make, such as > > > In other words, do you want to say > > "If defining this middle part as a variable had been accepted, > this patch would have been unneeded." ? If it were accepted I would not have to guess what the middle part is, and could use the variable that unambiguosly defines it instead. Thanks Michal > > > If your original patch were accepted, how would this patch look like? > > kernel.spec needs to know the module directory somehow. > > > Would you add the following in scripts/package/mkspec ? > > %define MODLIB $(pkg-config --print-variables kmod 2>/dev/null | grep > '^module_directory$' >/dev/null && pkg-config > --variable=module_directory kmod || echo /lib/modules) > > > > > > > > > > > > make 'MODLIB=$(INSTALL_MOD_PATH)/usr/lib/modules/$(KERNELRELEASE)' > > > > However, this middle part of MODLIB is independently hardcoded by > > rpm-pkg, and when the user alters MODLIB this is not reflected when > > building the package. > > > > Given that $(INSTALL_MOD_PATH) is overridden during the rpm package build > > it is likely going to be empty. Then MODLIB can be passed to the rpm > > package, and used in place of the whole > > /usr/lib/modules/$(KERNELRELEASE) part. > > > > Signed-off-by: Michal Suchanek > > --- > > scripts/package/kernel.spec | 8 > > scripts/package/mkspec | 1 + > > 2 files changed, 5 insertions(+), 4 deletions(-) > > > > diff --git a/scripts/package/kernel.spec b/scripts/package/kernel.spec > > index 3eee0143e0c5..15f49c5077db 100644 > > --- a/scripts/package/kernel.spec > > +++ b/scripts/package/kernel.spec > > @@ -67,7 +67,7 @@ cp $(%{make} %{makeflags} -s image_name) > > %{buildroot}/boot/vmlinuz-%{KERNELRELEA > > %{make} %{makeflags} INSTALL_HDR_PATH=%{buildroot}/usr headers_install > > cp System.map %{buildroot}/boot/System.map-%{KERNELRELEASE} > > cp .config %{buildroot}/boot/config-%{KERNELRELEASE} > > -ln -fns /usr/src/kernels/%{KERNELRELEASE} > > %{buildroot}/lib/modules/%{KERNELRELEASE}/build > > +ln -fns /usr/src/kernels/%{KERNELRELEASE} %{buildroot}%{MODLIB}/build > > %if %{with_devel} > > %{make} %{makeflags} run-command > > KBUILD_RUN_COMMAND='${srctree}/scripts/package/install-extmod-build > > %{buildroot}/usr/src/kernels/%{KERNELRELEASE}' > > %endif > > @@ -98,8 +98,8 @@ fi > > > > %files > > %defattr (-, root, root) > > -/lib/modules/%{KERNELRELEASE} > > -%exclude /lib/modules/%{KERNELRELEASE}/build > > +%{MODLIB} > > +%exclude %{MODLIB}/build > > /boot/* > > > > %files headers > > @@ -110,5 +110,5 @@ fi > > %files devel > > %defattr (-, root, root) > > /usr/src/kernels/%{KERNELRELEASE} > > -/lib/modules/%{KERNELRELEASE}/build > > +%{MODLIB}/build > > %endif > > diff --git a/scripts/package/mkspec b/scripts/package/mkspec > > index d41608efb747..d41b2e5304ac 100755 > > --- a/scripts/package/mkspec > > +++ b/scripts/package/mkspec > > @@ -18,6 +18,7 @@ fi > > cat< > %define ARCH ${ARCH} > > %define KERNELRELEASE ${KERNELRELEASE} > > +%define MODLIB ${MODLIB} > > %define pkg_release $("${srctree}/init/build-version") > > EOF > > > > -- > > 2.42.0 > > > > > -- > Best Regards > Masahiro Yamada
Re: [PATCH 3/7] dt-bindings: arm: qcom: Add Xiaomi Redmi Note 9S
On 07/10/2023 15:58, David Wronek wrote: > Document the Xiaomi Redmi Note 9S (curtana) smartphone, which is based > on the Qualcomm SM7125 SoC. Reviewed-by: Krzysztof Kozlowski Best regards, Krzysztof
Re: [PATCH 2/7] dt-bindings: phy: Add QMP UFS PHY compatible for SC7180
On 07/10/2023 15:58, David Wronek wrote: > Document the QMP UFS PHY compatible for SC7180 > > Signed-off-by: David Wronek > --- > .../devicetree/bindings/phy/qcom,sc8280xp-qmp-ufs-phy.yaml | 1 + > 1 file changed, 1 insertion(+) > > diff --git > a/Documentation/devicetree/bindings/phy/qcom,sc8280xp-qmp-ufs-phy.yaml > b/Documentation/devicetree/bindings/phy/qcom,sc8280xp-qmp-ufs-phy.yaml > index f3a3296c811c..f2eee8b5326f 100644 > --- a/Documentation/devicetree/bindings/phy/qcom,sc8280xp-qmp-ufs-phy.yaml > +++ b/Documentation/devicetree/bindings/phy/qcom,sc8280xp-qmp-ufs-phy.yaml > @@ -19,6 +19,7 @@ properties: >- qcom,msm8996-qmp-ufs-phy >- qcom,msm8998-qmp-ufs-phy >- qcom,sa8775p-qmp-ufs-phy > + - qcom,sc7180-qmp-ufs-phy You also need to update the if: for clocks. Best regards, Krzysztof
Re: [PATCH 1/7] dt-bindings: ufs: qcom: Add SC7180 compatible string
On 07/10/2023 15:58, David Wronek wrote: > Document the compatible for the UFS found on SC7180. > > Signed-off-by: David Wronek > --- > Documentation/devicetree/bindings/ufs/qcom,ufs.yaml | 2 ++ > 1 file changed, 2 insertions(+) Reviewed-by: Krzysztof Kozlowski Best regards, Krzysztof
Re: [PATCH v2 3/6] ACPI: AC: Replace acpi_driver with platform_driver
Hi ! Thanks a lot for a review, to both of you ! :-) On 10/7/2023 12:43 PM, Rafael J. Wysocki wrote: > On Sat, Oct 7, 2023 at 12:41 PM Rafael J. Wysocki wrote: >> On Sat, Oct 7, 2023 at 9:56 AM Andy Shevchenko >> wrote: >>> On Fri, Oct 06, 2023 at 09:47:57PM +0200, Rafael J. Wysocki wrote: On Fri, Oct 6, 2023 at 8:33 PM Michal Wilczynski wrote: >>> ... >>> > struct acpi_ac { > struct power_supply *charger; > struct power_supply_desc charger_desc; > - struct acpi_device *device; > + struct device *dev; I'm not convinced about this change. If I'm not mistaken, you only use the dev pointer above to get the ACPI_COMPANION() of it, but the latter is already found in _probe(), so it can be stored in struct acpi_ac for later use and then the dev pointer in there will not be necessary any more. That will save you a bunch of ACPI_HANDLE() evaluations and there's nothing wrong with using ac->device->handle. The patch will then become almost trivial AFAICS and if you really need to get from ac to the underlying platform device, a pointer to it can be added to struct acpi_ac without removing the ACPI device pointer from it. Yeah we could add platform device without removing acpi device, and yes that would introduce data duplication, like Andy noticed. And yes, maybe for this particular driver there is little impact (as struct device is not really used), but in my opinion we should adhere to some common coding pattern among all acpi drivers in order for the code to be easier to maintain and improve readability, also for any newcomer. >>> The idea behind is to eliminate data duplication. >> What data duplication exactly do you mean? >> >> struct acpi_device *device is replaced with struct device *dev which >> is the same size. The latter is then used to obtain a struct >> acpi_device pointer. Why is it better to do this than to store the >> struct acpi_device itself? > This should be "store the struct acpi_device pointer itself", sorry. Hi, So let me explain the reasoning here: 1) I've had a pleasure to work with different drivers in ACPI directory. In my opinion the best ones I've seen, were build around the concept of struct device (e.g NFIT). It's a struct that's well understood in the kernel, and it's easier to understand without having to read any ACPI specific code. If I see something like ACPI_HANDLE(dev), I think 'ah-ha - having a struct device I can easily get struct acpi_device - they're connected'. And I think using a standardized macro, instead of manually dereferencing pointers is also much easier for ACPI newbs reading the code, as it hides a bit complexity of getting acpi device from struct device. And to be honest I don't think there would be any measurable performance change, as those code paths are far from being considered 'hot paths'. So if we can get the code easier to understand from a newcomer perspective, why not do it. 2) I think it would be good to stick to the convention, and introduce some coding pattern, for now some drivers store the struct device pointer, and other store acpi device pointer . As I'm doing this change acpi device pointer become less relevant, as we're using platform device. So to reflect that loss of relevance we can choose to adhere to a pattern where we use it less and less, and the winning approach would be to use 'struct device' by default everywhere we can so maybe eventually we would be able to lose acpi_device altogether at some point, as most of the usage is to retrieve acpi handle and execute some AML method. So in my understanding acpi device is already obsolete at this point, if we can manage to use it less and less, and eventually wipe it out then why not ;-) Thanks again ! Michał
Re: [PATCH net-next v7] net/core: Introduce netdev_core_stats_inc()
On 2023/10/9 16:20, Eric Dumazet wrote: On Mon, Oct 9, 2023 at 10:14 AM Yajun Deng wrote: On 2023/10/9 15:53, Eric Dumazet wrote: On Mon, Oct 9, 2023 at 5:07 AM Yajun Deng wrote: 'this_cpu_read + this_cpu_write' and 'pr_info + this_cpu_inc' will make the trace work well. They all have 'pop' instructions in them. This may be the key to making the trace work well. Hi all, I need your help on percpu and ftrace. I do not think you made sure netdev_core_stats_inc() was never inlined. Adding more code in it is simply changing how the compiler decides to inline or not. Yes, you are right. It needs to add the 'noinline' prefix. The disassembly code will have 'pop' instruction. The function was fine, you do not need anything like push or pop. The only needed stuff was the call __fentry__. The fact that the function was inlined for some invocations was the issue, because the trace point is only planted in the out of line function. But somehow the following code isn't inline? They didn't need to add the 'noinline' prefix. + field = (unsigned long *)((void *)this_cpu_ptr(p) + offset); + WRITE_ONCE(*field, READ_ONCE(*field) + 1); Or + (*(unsigned long *)((void *)this_cpu_ptr(p) + offset))++;
Re: [PATCH rebased] kbuild: rpm-pkg: Fix build with non-default MODLIB
On Fri, Oct 6, 2023 at 12:49 AM Michal Suchanek wrote: > > The default MODLIB value is composed of two variables and the hardcoded > string '/lib/modules/'. > > MODLIB = $(INSTALL_MOD_PATH)/lib/modules/$(KERNELRELEASE) > > Defining this middle part as a variable was rejected on the basis that > users can pass the whole MODLIB to make, such as In other words, do you want to say "If defining this middle part as a variable had been accepted, this patch would have been unneeded." ? I do not think so. If your original patch were accepted, how would this patch look like? kernel.spec needs to know the module directory somehow. Would you add the following in scripts/package/mkspec ? %define MODLIB $(pkg-config --print-variables kmod 2>/dev/null | grep '^module_directory$' >/dev/null && pkg-config --variable=module_directory kmod || echo /lib/modules) > > make 'MODLIB=$(INSTALL_MOD_PATH)/usr/lib/modules/$(KERNELRELEASE)' > > However, this middle part of MODLIB is independently hardcoded by > rpm-pkg, and when the user alters MODLIB this is not reflected when > building the package. > > Given that $(INSTALL_MOD_PATH) is overridden during the rpm package build > it is likely going to be empty. Then MODLIB can be passed to the rpm > package, and used in place of the whole > /usr/lib/modules/$(KERNELRELEASE) part. > > Signed-off-by: Michal Suchanek > --- > scripts/package/kernel.spec | 8 > scripts/package/mkspec | 1 + > 2 files changed, 5 insertions(+), 4 deletions(-) > > diff --git a/scripts/package/kernel.spec b/scripts/package/kernel.spec > index 3eee0143e0c5..15f49c5077db 100644 > --- a/scripts/package/kernel.spec > +++ b/scripts/package/kernel.spec > @@ -67,7 +67,7 @@ cp $(%{make} %{makeflags} -s image_name) > %{buildroot}/boot/vmlinuz-%{KERNELRELEA > %{make} %{makeflags} INSTALL_HDR_PATH=%{buildroot}/usr headers_install > cp System.map %{buildroot}/boot/System.map-%{KERNELRELEASE} > cp .config %{buildroot}/boot/config-%{KERNELRELEASE} > -ln -fns /usr/src/kernels/%{KERNELRELEASE} > %{buildroot}/lib/modules/%{KERNELRELEASE}/build > +ln -fns /usr/src/kernels/%{KERNELRELEASE} %{buildroot}%{MODLIB}/build > %if %{with_devel} > %{make} %{makeflags} run-command > KBUILD_RUN_COMMAND='${srctree}/scripts/package/install-extmod-build > %{buildroot}/usr/src/kernels/%{KERNELRELEASE}' > %endif > @@ -98,8 +98,8 @@ fi > > %files > %defattr (-, root, root) > -/lib/modules/%{KERNELRELEASE} > -%exclude /lib/modules/%{KERNELRELEASE}/build > +%{MODLIB} > +%exclude %{MODLIB}/build > /boot/* > > %files headers > @@ -110,5 +110,5 @@ fi > %files devel > %defattr (-, root, root) > /usr/src/kernels/%{KERNELRELEASE} > -/lib/modules/%{KERNELRELEASE}/build > +%{MODLIB}/build > %endif > diff --git a/scripts/package/mkspec b/scripts/package/mkspec > index d41608efb747..d41b2e5304ac 100755 > --- a/scripts/package/mkspec > +++ b/scripts/package/mkspec > @@ -18,6 +18,7 @@ fi > cat< %define ARCH ${ARCH} > %define KERNELRELEASE ${KERNELRELEASE} > +%define MODLIB ${MODLIB} > %define pkg_release $("${srctree}/init/build-version") > EOF > > -- > 2.42.0 > -- Best Regards Masahiro Yamada
Re: [PATCH net-next v7] net/core: Introduce netdev_core_stats_inc()
On Mon, Oct 9, 2023 at 10:14 AM Yajun Deng wrote: > > > On 2023/10/9 15:53, Eric Dumazet wrote: > > On Mon, Oct 9, 2023 at 5:07 AM Yajun Deng wrote: > > > >> 'this_cpu_read + this_cpu_write' and 'pr_info + this_cpu_inc' will make > >> the trace work well. > >> > >> They all have 'pop' instructions in them. This may be the key to making > >> the trace work well. > >> > >> Hi all, > >> > >> I need your help on percpu and ftrace. > >> > > I do not think you made sure netdev_core_stats_inc() was never inlined. > > > > Adding more code in it is simply changing how the compiler decides to > > inline or not. > > > Yes, you are right. It needs to add the 'noinline' prefix. The > disassembly code will have 'pop' > > instruction. > The function was fine, you do not need anything like push or pop. The only needed stuff was the call __fentry__. The fact that the function was inlined for some invocations was the issue, because the trace point is only planted in the out of line function.
Re: [PATCH net-next v7] net/core: Introduce netdev_core_stats_inc()
On 2023/10/9 15:53, Eric Dumazet wrote: On Mon, Oct 9, 2023 at 5:07 AM Yajun Deng wrote: 'this_cpu_read + this_cpu_write' and 'pr_info + this_cpu_inc' will make the trace work well. They all have 'pop' instructions in them. This may be the key to making the trace work well. Hi all, I need your help on percpu and ftrace. I do not think you made sure netdev_core_stats_inc() was never inlined. Adding more code in it is simply changing how the compiler decides to inline or not. Yes, you are right. It needs to add the 'noinline' prefix. The disassembly code will have 'pop' instruction. Thanks.
Re: [PATCH net-next v7] net/core: Introduce netdev_core_stats_inc()
On Mon, Oct 9, 2023 at 5:07 AM Yajun Deng wrote: > 'this_cpu_read + this_cpu_write' and 'pr_info + this_cpu_inc' will make > the trace work well. > > They all have 'pop' instructions in them. This may be the key to making > the trace work well. > > Hi all, > > I need your help on percpu and ftrace. > I do not think you made sure netdev_core_stats_inc() was never inlined. Adding more code in it is simply changing how the compiler decides to inline or not.