[tip: x86/mm] smp: Micro-optimize smp_call_function_many_cond()
The following commit has been merged into the x86/mm branch of tip:

Commit-ID:     d43f17a1da25373580ebb466de7d0641acbf6fd6
Gitweb:        https://git.kernel.org/tip/d43f17a1da25373580ebb466de7d0641acbf6fd6
Author:        Peter Zijlstra
AuthorDate:    Tue, 02 Mar 2021 08:02:43 +01:00
Committer:     Ingo Molnar
CommitterDate: Sat, 06 Mar 2021 13:00:22 +01:00

smp: Micro-optimize smp_call_function_many_cond()

Call the generic send_call_function_single_ipi() function, which
will avoid the IPI when @last_cpu is idle.

Signed-off-by: Peter Zijlstra
Signed-off-by: Ingo Molnar
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar
---
 kernel/smp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/smp.c b/kernel/smp.c
index b6375d7..af0d51d 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -694,7 +694,7 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
 	 * provided mask.
 	 */
 	if (nr_cpus == 1)
-		arch_send_call_function_single_ipi(last_cpu);
+		send_call_function_single_ipi(last_cpu);
 	else if (likely(nr_cpus > 1))
 		arch_send_call_function_ipi_mask(cfd->cpumask_ipi);
 }
[tip: x86/mm] cpumask: Mark functions as pure
The following commit has been merged into the x86/mm branch of tip:

Commit-ID:     291c4011dd7ac0cd0cebb727a75ee5a50d16dcf7
Gitweb:        https://git.kernel.org/tip/291c4011dd7ac0cd0cebb727a75ee5a50d16dcf7
Author:        Nadav Amit
AuthorDate:    Sat, 20 Feb 2021 15:17:10 -08:00
Committer:     Ingo Molnar
CommitterDate: Sat, 06 Mar 2021 12:59:10 +01:00

cpumask: Mark functions as pure

cpumask_next_and() and cpumask_any_but() are pure, and marking them as
such seems to generate different and presumably better code for
native_flush_tlb_multi().

Signed-off-by: Nadav Amit
Signed-off-by: Ingo Molnar
Reviewed-by: Dave Hansen
Link: https://lore.kernel.org/r/20210220231712.2475218-8-na...@vmware.com
---
 include/linux/cpumask.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index 383684e..c53364c 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -235,7 +235,7 @@ static inline unsigned int cpumask_last(const struct cpumask *srcp)
 	return find_last_bit(cpumask_bits(srcp), nr_cpumask_bits);
 }
 
-unsigned int cpumask_next(int n, const struct cpumask *srcp);
+unsigned int __pure cpumask_next(int n, const struct cpumask *srcp);
 
 /**
  * cpumask_next_zero - get the next unset cpu in a cpumask
@@ -252,8 +252,8 @@ static inline unsigned int cpumask_next_zero(int n, const struct cpumask *srcp)
 	return find_next_zero_bit(cpumask_bits(srcp), nr_cpumask_bits, n+1);
 }
 
-int cpumask_next_and(int n, const struct cpumask *, const struct cpumask *);
-int cpumask_any_but(const struct cpumask *mask, unsigned int cpu);
+int __pure cpumask_next_and(int n, const struct cpumask *, const struct cpumask *);
+int __pure cpumask_any_but(const struct cpumask *mask, unsigned int cpu);
 unsigned int cpumask_local_spread(unsigned int i, int node);
 int cpumask_any_and_distribute(const struct cpumask *src1p, const struct cpumask *src2p);
[tip: x86/mm] smp: Inline on_each_cpu_cond() and on_each_cpu()
The following commit has been merged into the x86/mm branch of tip:

Commit-ID:     a5aa5ce300597224ec76dacc8e63ba3ad7a18bbd
Gitweb:        https://git.kernel.org/tip/a5aa5ce300597224ec76dacc8e63ba3ad7a18bbd
Author:        Nadav Amit
AuthorDate:    Sat, 20 Feb 2021 15:17:12 -08:00
Committer:     Ingo Molnar
CommitterDate: Sat, 06 Mar 2021 12:59:10 +01:00

smp: Inline on_each_cpu_cond() and on_each_cpu()

Simplify the code and avoid having an additional function on the stack
by inlining on_each_cpu_cond() and on_each_cpu().

Suggested-by: Peter Zijlstra
Signed-off-by: Nadav Amit
[ Minor edits. ]
Signed-off-by: Ingo Molnar
Link: https://lore.kernel.org/r/20210220231712.2475218-10-na...@vmware.com
---
 include/linux/smp.h | 50 ---
 kernel/smp.c        | 56 +
 kernel/up.c         | 38 +--
 3 files changed, 37 insertions(+), 107 deletions(-)

diff --git a/include/linux/smp.h b/include/linux/smp.h
index 70c6f62..84a0b48 100644
--- a/include/linux/smp.h
+++ b/include/linux/smp.h
@@ -50,30 +50,52 @@ extern unsigned int total_cpus;
 int smp_call_function_single(int cpuid, smp_call_func_t func, void *info,
 			     int wait);
 
+void on_each_cpu_cond_mask(smp_cond_func_t cond_func, smp_call_func_t func,
+			   void *info, bool wait, const struct cpumask *mask);
+
+int smp_call_function_single_async(int cpu, call_single_data_t *csd);
+
 /*
  * Call a function on all processors
  */
-void on_each_cpu(smp_call_func_t func, void *info, int wait);
+static inline void on_each_cpu(smp_call_func_t func, void *info, int wait)
+{
+	on_each_cpu_cond_mask(NULL, func, info, wait, cpu_online_mask);
+}
 
-/*
- * Call a function on processors specified by mask, which might include
- * the local one.
+/**
+ * on_each_cpu_mask(): Run a function on processors specified by
+ * cpumask, which may include the local processor.
+ * @mask: The set of cpus to run on (only runs on online subset).
+ * @func: The function to run. This must be fast and non-blocking.
+ * @info: An arbitrary pointer to pass to the function.
+ * @wait: If true, wait (atomically) until function has completed
+ *        on other CPUs.
+ *
+ * If @wait is true, then returns once @func has returned.
+ *
+ * You must not call this function with disabled interrupts or from a
+ * hardware interrupt handler or from a bottom half handler. The
+ * exception is that it may be used during early boot while
+ * early_boot_irqs_disabled is set.
  */
-void on_each_cpu_mask(const struct cpumask *mask, smp_call_func_t func,
-		      void *info, bool wait);
+static inline void on_each_cpu_mask(const struct cpumask *mask,
+				    smp_call_func_t func, void *info, bool wait)
+{
+	on_each_cpu_cond_mask(NULL, func, info, wait, mask);
+}
 
 /*
  * Call a function on each processor for which the supplied function
  * cond_func returns a positive value. This may include the local
- * processor.
+ * processor. May be used during early boot while early_boot_irqs_disabled is
+ * set. Use local_irq_save/restore() instead of local_irq_disable/enable().
  */
-void on_each_cpu_cond(smp_cond_func_t cond_func, smp_call_func_t func,
-		      void *info, bool wait);
-
-void on_each_cpu_cond_mask(smp_cond_func_t cond_func, smp_call_func_t func,
-			   void *info, bool wait, const struct cpumask *mask);
-
-int smp_call_function_single_async(int cpu, call_single_data_t *csd);
+static inline void on_each_cpu_cond(smp_cond_func_t cond_func,
+				    smp_call_func_t func, void *info, bool wait)
+{
+	on_each_cpu_cond_mask(cond_func, func, info, wait, cpu_online_mask);
+}
 
 #ifdef CONFIG_SMP
 
diff --git a/kernel/smp.c b/kernel/smp.c
index c8a5a1f..b6375d7 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -848,55 +848,6 @@ void __init smp_init(void)
 }
 
 /*
- * Call a function on all processors. May be used during early boot while
- * early_boot_irqs_disabled is set. Use local_irq_save/restore() instead
- * of local_irq_disable/enable().
- */
-void on_each_cpu(smp_call_func_t func, void *info, int wait)
-{
-	unsigned long flags;
-
-	preempt_disable();
-	smp_call_function(func, info, wait);
-	local_irq_save(flags);
-	func(info);
-	local_irq_restore(flags);
-	preempt_enable();
-}
-EXPORT_SYMBOL(on_each_cpu);
-
-/**
- * on_each_cpu_mask(): Run a function on processors specified by
- * cpumask, which may include the local processor.
- * @mask: The set of cpus to run on (only runs on online subset).
- * @func: The function to run. This must be fast and non-blocking.
- * @info: An arbitrary pointer to pass to the function.
- * @wait: If true, wait (atomically) until function has completed
- *        on other CPUs.
- *
- * If @wait is true, then returns once @func has returned.
- *
- * You must not call this
[tip: x86/mm] x86/mm/tlb: Do not make is_lazy dirty for no reason
The following commit has been merged into the x86/mm branch of tip:

Commit-ID:     09c5272e48614a30598e759c3c7bed126d22037d
Gitweb:        https://git.kernel.org/tip/09c5272e48614a30598e759c3c7bed126d22037d
Author:        Nadav Amit
AuthorDate:    Sat, 20 Feb 2021 15:17:09 -08:00
Committer:     Ingo Molnar
CommitterDate: Sat, 06 Mar 2021 12:59:10 +01:00

x86/mm/tlb: Do not make is_lazy dirty for no reason

Blindly writing to is_lazy for no reason, when the written value is
identical to the old value, makes the cacheline dirty for no reason.
Avoid making such writes to prevent cache coherency traffic for no
reason.

Suggested-by: Dave Hansen
Signed-off-by: Nadav Amit
Signed-off-by: Ingo Molnar
Reviewed-by: Dave Hansen
Link: https://lore.kernel.org/r/20210220231712.2475218-7-na...@vmware.com
---
 arch/x86/mm/tlb.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 345a0af..17ec4bf 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -469,7 +469,8 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
 		__flush_tlb_all();
 	}
 #endif
-	this_cpu_write(cpu_tlbstate_shared.is_lazy, false);
+	if (was_lazy)
+		this_cpu_write(cpu_tlbstate_shared.is_lazy, false);
 
 	/*
 	 * The membarrier system call requires a full memory barrier and
Re: [PATCH 1/1] RISC-V: correct enum sbi_ext_rfence_fid
On Sat, Mar 6, 2021 at 11:19 AM Heinrich Schuchardt wrote:
>
> The constants in enum sbi_ext_rfence_fid should match the SBI
> specification. See
> https://github.com/riscv/riscv-sbi-doc/blob/master/riscv-sbi.adoc#78-function-listing
>
> | Function Name               | FID | EID
> | sbi_remote_fence_i          | 0   | 0x52464E43
> | sbi_remote_sfence_vma       | 1   | 0x52464E43
> | sbi_remote_sfence_vma_asid  | 2   | 0x52464E43
> | sbi_remote_hfence_gvma_vmid | 3   | 0x52464E43
> | sbi_remote_hfence_gvma      | 4   | 0x52464E43
> | sbi_remote_hfence_vvma_asid | 5   | 0x52464E43
> | sbi_remote_hfence_vvma      | 6   | 0x52464E43
>
> Fixes: ecbacc2a3efd ("RISC-V: Add SBI v0.2 extension definitions")
> Reported-by: Sean Anderson
> Signed-off-by: Heinrich Schuchardt

Good catch. I guess we never saw any issues because these calls are only
used by KVM RISC-V, which is not merged yet. Further, for KVM RISC-V the
HFENCE instruction is emulated as flush-everything on FPGA, QEMU, and
Spike, so we did not notice any issue with KVM RISC-V either.

Looks good to me.

Reviewed-by: Anup Patel

Regards,
Anup

> ---
>  arch/riscv/include/asm/sbi.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/riscv/include/asm/sbi.h b/arch/riscv/include/asm/sbi.h
> index 99895d9c3bdd..d7027411dde8 100644
> --- a/arch/riscv/include/asm/sbi.h
> +++ b/arch/riscv/include/asm/sbi.h
> @@ -51,10 +51,10 @@ enum sbi_ext_rfence_fid {
>  	SBI_EXT_RFENCE_REMOTE_FENCE_I = 0,
>  	SBI_EXT_RFENCE_REMOTE_SFENCE_VMA,
>  	SBI_EXT_RFENCE_REMOTE_SFENCE_VMA_ASID,
> -	SBI_EXT_RFENCE_REMOTE_HFENCE_GVMA,
>  	SBI_EXT_RFENCE_REMOTE_HFENCE_GVMA_VMID,
> -	SBI_EXT_RFENCE_REMOTE_HFENCE_VVMA,
> +	SBI_EXT_RFENCE_REMOTE_HFENCE_GVMA,
>  	SBI_EXT_RFENCE_REMOTE_HFENCE_VVMA_ASID,
> +	SBI_EXT_RFENCE_REMOTE_HFENCE_VVMA,
>  };
>
>  enum sbi_ext_hsm_fid {
> --
> 2.30.1
>
>
> ___
> linux-riscv mailing list
> linux-ri...@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-riscv
[PATCH] xhci: Remove unused value len from xhci_unmap_temp_buf
From: Zhang Kun

The value assigned to len by sg_pcopy_from_buffer() is never used for
anything, so remove it.

Signed-off-by: Zhang Kun
---
 drivers/usb/host/xhci.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index bd27bd670104..6ebda89d476c 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -1335,7 +1335,6 @@ static bool xhci_urb_temp_buffer_required(struct usb_hcd *hcd,
 
 static void xhci_unmap_temp_buf(struct usb_hcd *hcd, struct urb *urb)
 {
-	unsigned int len;
 	unsigned int buf_len;
 	enum dma_data_direction dir;
 
@@ -1351,7 +1350,7 @@ static void xhci_unmap_temp_buf(struct usb_hcd *hcd, struct urb *urb)
 			 dir);
 
 	if (usb_urb_dir_in(urb))
-		len = sg_pcopy_from_buffer(urb->sg, urb->num_sgs,
+		sg_pcopy_from_buffer(urb->sg, urb->num_sgs,
 					   urb->transfer_buffer,
 					   buf_len,
 					   0);
-- 
2.17.1
[PATCH] media:atomisp: remove duplicate include in sh_css
From: Zhang Yunkai

'ia_css_isys.h' included in 'sh_css.c' is duplicated.
It is also included in the 30th line.

Signed-off-by: Zhang Yunkai
---
 drivers/staging/media/atomisp/pci/sh_css.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/staging/media/atomisp/pci/sh_css.c b/drivers/staging/media/atomisp/pci/sh_css.c
index ddee04c8248d..afddc54094e9 100644
--- a/drivers/staging/media/atomisp/pci/sh_css.c
+++ b/drivers/staging/media/atomisp/pci/sh_css.c
@@ -49,9 +49,6 @@
 #include "ia_css_pipe_util.h"
 #include "ia_css_pipe_binarydesc.h"
 #include "ia_css_pipe_stagedesc.h"
-#ifndef ISP2401
-#include "ia_css_isys.h"
-#endif
 #include "tag.h"
 #include "assert_support.h"
-- 
2.25.1
[tip: x86/cpu] x86/cpu/hygon: Set __max_die_per_package on Hygon
The following commit has been merged into the x86/cpu branch of tip:

Commit-ID:     59eca2fa1934de42d8aa44d3bef655c92ea69703
Gitweb:        https://git.kernel.org/tip/59eca2fa1934de42d8aa44d3bef655c92ea69703
Author:        Pu Wen
AuthorDate:    Tue, 02 Mar 2021 10:02:17 +08:00
Committer:     Ingo Molnar
CommitterDate: Sat, 06 Mar 2021 12:54:59 +01:00

x86/cpu/hygon: Set __max_die_per_package on Hygon

Set the maximum DIE per package variable on Hygon using the
nodes_per_socket value in order to do per-DIE manipulations for drivers
such as powercap.

Signed-off-by: Pu Wen
Signed-off-by: Borislav Petkov
Signed-off-by: Ingo Molnar
Link: https://lkml.kernel.org/r/20210302020217.1827-1-pu...@hygon.cn
---
 arch/x86/kernel/cpu/hygon.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/hygon.c b/arch/x86/kernel/cpu/hygon.c
index ae59115..0bd6c74 100644
--- a/arch/x86/kernel/cpu/hygon.c
+++ b/arch/x86/kernel/cpu/hygon.c
@@ -215,12 +215,12 @@ static void bsp_init_hygon(struct cpuinfo_x86 *c)
 		u32 ecx;
 
 		ecx = cpuid_ecx(0x801e);
-		nodes_per_socket = ((ecx >> 8) & 7) + 1;
+		__max_die_per_package = nodes_per_socket = ((ecx >> 8) & 7) + 1;
 	} else if (boot_cpu_has(X86_FEATURE_NODEID_MSR)) {
 		u64 value;
 
 		rdmsrl(MSR_FAM10H_NODE_ID, value);
-		nodes_per_socket = ((value >> 3) & 7) + 1;
+		__max_die_per_package = nodes_per_socket = ((value >> 3) & 7) + 1;
 	}
 
 	if (!boot_cpu_has(X86_FEATURE_AMD_SSBD) &&
[tip: x86/platform] x86/platform/uv: Fix indentation warning in Documentation/ABI/testing/sysfs-firmware-sgi_uv
The following commit has been merged into the x86/platform branch of tip:

Commit-ID:     e93d757c3f33c8a09f4aae579da4dc4500707471
Gitweb:        https://git.kernel.org/tip/e93d757c3f33c8a09f4aae579da4dc4500707471
Author:        Justin Ernst
AuthorDate:    Fri, 19 Feb 2021 12:28:52 -06:00
Committer:     Borislav Petkov
CommitterDate: Sat, 06 Mar 2021 12:28:35 +01:00

x86/platform/uv: Fix indentation warning in Documentation/ABI/testing/sysfs-firmware-sgi_uv

Commit c9624cb7db1c ("x86/platform/uv: Update sysfs documentation")
misplaced the first line of a codeblock section, causing the reported
warning message:

  Documentation/ABI/testing/sysfs-firmware-sgi_uv:2: WARNING: Unexpected indentation.

Move the misplaced line below the required blank line to remove the
warning message.

Fixes: c9624cb7db1c ("x86/platform/uv: Update sysfs documentation")
Reported-by: Stephen Rothwell
Signed-off-by: Justin Ernst
Signed-off-by: Borislav Petkov
Acked-by: Mike Travis
Link: https://lkml.kernel.org/r/20210219182852.385297-1-justin.er...@hpe.com
---
 Documentation/ABI/testing/sysfs-firmware-sgi_uv | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/ABI/testing/sysfs-firmware-sgi_uv b/Documentation/ABI/testing/sysfs-firmware-sgi_uv
index 637c668..12ed843 100644
--- a/Documentation/ABI/testing/sysfs-firmware-sgi_uv
+++ b/Documentation/ABI/testing/sysfs-firmware-sgi_uv
@@ -39,8 +39,8 @@ Description:
 		The uv_type entry contains the hub revision number.
 		This value can be used to identify the UV system version::
 
-			"0.*" = Hubless UV ('*' is subtype)
 
+			"0.*" = Hubless UV ('*' is subtype)
 			"3.0" = UV2
 			"5.0" = UV3
 			"7.0" = UV4
[tip: timers/urgent] hrtimer: Update softirq_expires_next correctly after __hrtimer_get_next_event()
The following commit has been merged into the timers/urgent branch of tip:

Commit-ID:     eca8f0c80a005aea84df507a446fc0154fc55a32
Gitweb:        https://git.kernel.org/tip/eca8f0c80a005aea84df507a446fc0154fc55a32
Author:        Anna-Maria Behnsen
AuthorDate:    Tue, 23 Feb 2021 17:02:40 +01:00
Committer:     Ingo Molnar
CommitterDate: Sat, 06 Mar 2021 12:53:47 +01:00

hrtimer: Update softirq_expires_next correctly after __hrtimer_get_next_event()

hrtimer_force_reprogram() and hrtimer_interrupt() invoke
__hrtimer_get_next_event() to find the earliest expiry time of hrtimer
bases. __hrtimer_get_next_event() does not update
cpu_base::[softirq_]expires_next, to preserve the reprogramming logic.
That needs to be done at the callsites.

hrtimer_force_reprogram() updates cpu_base::softirq_expires_next only
when the first expiring timer is a softirq timer and the soft interrupt
is not activated. That's wrong, because cpu_base::softirq_expires_next
is left stale when the first expiring timer of all bases is a timer
which expires in hard interrupt context. hrtimer_interrupt() never
updates cpu_base::softirq_expires_next, which is wrong too.

That becomes a problem when clock_settime() sets CLOCK_REALTIME forward
and the first soft expiring timer is in the CLOCK_REALTIME_SOFT base.
Setting CLOCK_REALTIME forward moves the clock MONOTONIC based expiry
time of that timer before the stale cpu_base::softirq_expires_next.

cpu_base::softirq_expires_next is cached to make the check for raising
the soft interrupt fast. In the above case the soft interrupt won't be
raised until clock monotonic reaches the stale
cpu_base::softirq_expires_next value. That's incorrect, but what's worse
is that if the softirq timer becomes the first expiring timer of all
clock bases after the hard expiry timer has been handled, the
reprogramming of the clockevent from hrtimer_interrupt() will result in
an interrupt storm. That happens because the reprogramming does not use
cpu_base::softirq_expires_next, it uses __hrtimer_get_next_event() which
returns the actual expiry time. Once clock MONOTONIC reaches
cpu_base::softirq_expires_next the soft interrupt is raised and the
storm subsides.

Change the logic in hrtimer_force_reprogram() to evaluate the soft and
hard bases separately, update softirq_expires_next and handle the case
when a soft expiring timer is the first of all bases by comparing the
expiry times and updating the required cpu base fields. Split this
functionality into a separate function to be able to use it in
hrtimer_interrupt() as well without copy paste.

Fixes: da70160462e ("hrtimer: Implement support for softirq based hrtimers")
Reported-by: Mikael Beckius
Suggested-by: Thomas Gleixner
Tested-by: Mikael Beckius
Signed-off-by: Anna-Maria Behnsen
Signed-off-by: Thomas Gleixner
Signed-off-by: Ingo Molnar
Link: https://lore.kernel.org/r/20210223160240.27518-1-anna-ma...@linutronix.de
---
 kernel/time/hrtimer.c | 60 +++---
 1 file changed, 39 insertions(+), 21 deletions(-)

diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
index 743c852..788b9d1 100644
--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -546,8 +546,11 @@ static ktime_t __hrtimer_next_event_base(struct hrtimer_cpu_base *cpu_base,
 }
 
 /*
- * Recomputes cpu_base::*next_timer and returns the earliest expires_next but
- * does not set cpu_base::*expires_next, that is done by hrtimer_reprogram.
+ * Recomputes cpu_base::*next_timer and returns the earliest expires_next
+ * but does not set cpu_base::*expires_next, that is done by
+ * hrtimer[_force]_reprogram and hrtimer_interrupt only. When updating
+ * cpu_base::*expires_next right away, reprogramming logic would no longer
+ * work.
  *
  * When a softirq is pending, we can ignore the HRTIMER_ACTIVE_SOFT bases,
  * those timers will get run whenever the softirq gets handled, at the end of
@@ -588,6 +591,37 @@ __hrtimer_get_next_event(struct hrtimer_cpu_base *cpu_base, unsigned int active_
 	return expires_next;
 }
 
+static ktime_t hrtimer_update_next_event(struct hrtimer_cpu_base *cpu_base)
+{
+	ktime_t expires_next, soft = KTIME_MAX;
+
+	/*
+	 * If the soft interrupt has already been activated, ignore the
+	 * soft bases. They will be handled in the already raised soft
+	 * interrupt.
+	 */
+	if (!cpu_base->softirq_activated) {
+		soft = __hrtimer_get_next_event(cpu_base, HRTIMER_ACTIVE_SOFT);
+		/*
+		 * Update the soft expiry time. clock_settime() might have
+		 * affected it.
+		 */
+		cpu_base->softirq_expires_next = soft;
+	}
+
+	expires_next = __hrtimer_get_next_event(cpu_base, HRTIMER_ACTIVE_HARD);
+	/*
+	 * If a softirq timer is expiring first, update cpu_base->next_timer
+	 * and program the hardware with the soft expiry time.
+	 */
+	if (expires_next > soft) {
[tip: perf/urgent] perf/x86/intel: Set PERF_ATTACH_SCHED_CB for large PEBS and LBR
The following commit has been merged into the perf/urgent branch of tip:

Commit-ID:     afbef30149587ad46f4780b1e0cc5e219745ce90
Gitweb:        https://git.kernel.org/tip/afbef30149587ad46f4780b1e0cc5e219745ce90
Author:        Kan Liang
AuthorDate:    Mon, 30 Nov 2020 11:38:41 -08:00
Committer:     Ingo Molnar
CommitterDate: Sat, 06 Mar 2021 12:52:44 +01:00

perf/x86/intel: Set PERF_ATTACH_SCHED_CB for large PEBS and LBR

To supply a PID/TID for large PEBS, it requires flushing the PEBS
buffer in a context switch.

For normal LBRs, a context switch can flip the address space and LBR
entries are not tagged with an identifier, we need to wipe the LBR,
even for per-cpu events.

For LBR callstack, save/restore the stack is required during a context
switch.

Set PERF_ATTACH_SCHED_CB for the event with large PEBS & LBR.

Fixes: 9c964efa4330 ("perf/x86/intel: Drain the PEBS buffer during context switches")
Signed-off-by: Kan Liang
Signed-off-by: Peter Zijlstra (Intel)
Signed-off-by: Ingo Molnar
Link: https://lkml.kernel.org/r/20201130193842.10569-2-kan.li...@linux.intel.com
---
 arch/x86/events/intel/core.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 5bac48d..7bbb5bb 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3662,8 +3662,10 @@ static int intel_pmu_hw_config(struct perf_event *event)
 	if (!(event->attr.freq || (event->attr.wakeup_events && !event->attr.watermark))) {
 		event->hw.flags |= PERF_X86_EVENT_AUTO_RELOAD;
 		if (!(event->attr.sample_type &
-		      ~intel_pmu_large_pebs_flags(event)))
+		      ~intel_pmu_large_pebs_flags(event))) {
 			event->hw.flags |= PERF_X86_EVENT_LARGE_PEBS;
+			event->attach_state |= PERF_ATTACH_SCHED_CB;
+		}
 	}
 	if (x86_pmu.pebs_aliases)
 		x86_pmu.pebs_aliases(event);
@@ -3676,6 +3678,7 @@ static int intel_pmu_hw_config(struct perf_event *event)
 		ret = intel_pmu_setup_lbr_filter(event);
 		if (ret)
 			return ret;
+		event->attach_state |= PERF_ATTACH_SCHED_CB;
 
 		/*
		 * BTS is set up earlier in this path, so don't account twice
[PATCH] sound: soc: codecs: Fix a spello in the file wm8955.c
s/sortd/sorted/

Signed-off-by: Bhaskar Chowdhury
---
 sound/soc/codecs/wm8955.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sound/soc/codecs/wm8955.c b/sound/soc/codecs/wm8955.c
index 513df47bd87d..538bb8b0db39 100644
--- a/sound/soc/codecs/wm8955.c
+++ b/sound/soc/codecs/wm8955.c
@@ -151,7 +151,7 @@ static int wm8955_pll_factors(struct device *dev,
 	/* The oscilator should run at should be 90-100MHz, and
 	 * there's a divide by 4 plus an optional divide by 2 in the
 	 * output path to generate the system clock.  The clock table
-	 * is sortd so we should always generate a suitable target. */
+	 * is sorted so we should always generate a suitable target. */
 	target = Fout * 4;
 	if (target < 9000) {
 		pll->outdiv = 1;
-- 
2.26.2
[tip: perf/urgent] perf/core: Flush PMU internal buffers for per-CPU events
The following commit has been merged into the perf/urgent branch of tip:

Commit-ID:     a5398bffc01fe044848c5024e5e867e407f239b8
Gitweb:        https://git.kernel.org/tip/a5398bffc01fe044848c5024e5e867e407f239b8
Author:        Kan Liang
AuthorDate:    Mon, 30 Nov 2020 11:38:40 -08:00
Committer:     Ingo Molnar
CommitterDate: Sat, 06 Mar 2021 12:52:39 +01:00

perf/core: Flush PMU internal buffers for per-CPU events

Sometimes the PMU internal buffers have to be flushed for per-CPU events
during a context switch, e.g., large PEBS. Otherwise, the perf tool may
report samples in locations that do not belong to the process where the
samples are processed in, because PEBS does not tag samples with
PID/TID.

The current code only flushes the buffers for a per-task event. It
doesn't check a per-CPU event.

Add a new event state flag, PERF_ATTACH_SCHED_CB, to indicate that the
PMU internal buffers have to be flushed for this event during a context
switch.

Add sched_cb_entry and perf_sched_cb_usages back to track the PMU/cpuctx
which is required to be flushed.

Only need to invoke the sched_task() for per-CPU events in this patch.
The per-task events have been handled in
perf_event_context_sched_in/out already.

Fixes: 9c964efa4330 ("perf/x86/intel: Drain the PEBS buffer during context switches")
Reported-by: Gabriel Marin
Originally-by: Namhyung Kim
Signed-off-by: Kan Liang
Signed-off-by: Peter Zijlstra (Intel)
Signed-off-by: Ingo Molnar
Link: https://lkml.kernel.org/r/20201130193842.10569-1-kan.li...@linux.intel.com
---
 include/linux/perf_event.h |  2 ++-
 kernel/events/core.c       | 42 +
 2 files changed, 40 insertions(+), 4 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index fab42cf..3f7f89e 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -606,6 +606,7 @@ struct swevent_hlist {
 #define PERF_ATTACH_TASK	0x04
 #define PERF_ATTACH_TASK_DATA	0x08
 #define PERF_ATTACH_ITRACE	0x10
+#define PERF_ATTACH_SCHED_CB	0x20
 
 struct perf_cgroup;
 struct perf_buffer;
@@ -872,6 +873,7 @@ struct perf_cpu_context {
 	struct list_head		cgrp_cpuctx_entry;
 #endif
 
+	struct list_head		sched_cb_entry;
 	int				sched_cb_usage;
 
 	int				online;
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 0aeca5f..03db40f 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -386,6 +386,7 @@ static DEFINE_MUTEX(perf_sched_mutex);
 static atomic_t perf_sched_count;
 
 static DEFINE_PER_CPU(atomic_t, perf_cgroup_events);
+static DEFINE_PER_CPU(int, perf_sched_cb_usages);
 static DEFINE_PER_CPU(struct pmu_event_list, pmu_sb_events);
 
 static atomic_t nr_mmap_events __read_mostly;
@@ -3461,11 +3462,16 @@ unlock:
 	}
 }
 
+static DEFINE_PER_CPU(struct list_head, sched_cb_list);
+
 void perf_sched_cb_dec(struct pmu *pmu)
 {
 	struct perf_cpu_context *cpuctx = this_cpu_ptr(pmu->pmu_cpu_context);
 
-	--cpuctx->sched_cb_usage;
+	this_cpu_dec(perf_sched_cb_usages);
+
+	if (!--cpuctx->sched_cb_usage)
+		list_del(&cpuctx->sched_cb_entry);
 }
 
 
@@ -3473,7 +3479,10 @@ void perf_sched_cb_inc(struct pmu *pmu)
 {
 	struct perf_cpu_context *cpuctx = this_cpu_ptr(pmu->pmu_cpu_context);
 
-	cpuctx->sched_cb_usage++;
+	if (!cpuctx->sched_cb_usage++)
+		list_add(&cpuctx->sched_cb_entry, this_cpu_ptr(&sched_cb_list));
+
+	this_cpu_inc(perf_sched_cb_usages);
 }
 
 /*
@@ -3502,6 +3511,24 @@ static void __perf_pmu_sched_task(struct perf_cpu_context *cpuctx, bool sched_in
 	perf_ctx_unlock(cpuctx, cpuctx->task_ctx);
 }
 
+static void perf_pmu_sched_task(struct task_struct *prev,
+				struct task_struct *next,
+				bool sched_in)
+{
+	struct perf_cpu_context *cpuctx;
+
+	if (prev == next)
+		return;
+
+	list_for_each_entry(cpuctx, this_cpu_ptr(&sched_cb_list), sched_cb_entry) {
+		/* will be handled in perf_event_context_sched_in/out */
+		if (cpuctx->task_ctx)
+			continue;
+
+		__perf_pmu_sched_task(cpuctx, sched_in);
+	}
+}
+
 static void perf_event_switch(struct task_struct *task,
 			      struct task_struct *next_prev, bool sched_in);
 
@@ -3524,6 +3551,9 @@ void __perf_event_task_sched_out(struct task_struct *task,
 {
 	int ctxn;
 
+	if (__this_cpu_read(perf_sched_cb_usages))
+		perf_pmu_sched_task(task, next, false);
+
 	if (atomic_read(&nr_switch_events))
 		perf_event_switch(task, next, false);
 
@@ -3832,6 +3862,9 @@ void __perf_event_task_sched_in(struct task_struct *prev,
 	if (atomic_read(&nr_switch_events))
 		perf_event_switch(task, prev, true);
 
+	if (__this_cpu_read(perf_sched_cb_usages))
+
[tip: locking/core] static_call: Fix the module key fixup
The following commit has been merged into the locking/core branch of tip:

Commit-ID:     50bf8080a94d171e843fc013abec19d8ab9f50ae
Gitweb:        https://git.kernel.org/tip/50bf8080a94d171e843fc013abec19d8ab9f50ae
Author:        Peter Zijlstra
AuthorDate:    Thu, 25 Feb 2021 23:03:51 +01:00
Committer:     Ingo Molnar
CommitterDate: Sat, 06 Mar 2021 12:49:08 +01:00

static_call: Fix the module key fixup

Provided the target address of a R_X86_64_PC32 relocation is aligned,
the low two bits should be invariant between the relative and absolute
value.

Turns out the address is not aligned and things go sideways, ensure we
transfer the bits in the absolute form when fixing up the key address.

Fixes: 73f44fe19d35 ("static_call: Allow module use without exposing static_call_key")
Reported-by: Steven Rostedt
Signed-off-by: Peter Zijlstra (Intel)
Signed-off-by: Ingo Molnar
Tested-by: Steven Rostedt (VMware)
Link: https://lkml.kernel.org/r/20210225220351.ge4...@worktop.programming.kicks-ass.net
---
 kernel/static_call.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/kernel/static_call.c b/kernel/static_call.c
index 6906c6e..ae82529 100644
--- a/kernel/static_call.c
+++ b/kernel/static_call.c
@@ -349,7 +349,8 @@ static int static_call_add_module(struct module *mod)
 	struct static_call_site *site;
 
 	for (site = start; site != stop; site++) {
-		unsigned long addr = (unsigned long)static_call_key(site);
+		unsigned long s_key = (long)site->key + (long)&site->key;
+		unsigned long addr = s_key & ~STATIC_CALL_SITE_FLAGS;
 		unsigned long key;
 
 		/*
@@ -373,8 +374,8 @@ static int static_call_add_module(struct module *mod)
 			return -EINVAL;
 		}
 
-		site->key = (key - (long)&site->key) |
-			    (site->key & STATIC_CALL_SITE_FLAGS);
+		key |= s_key & STATIC_CALL_SITE_FLAGS;
+		site->key = key - (long)&site->key;
 	}
 
 	return __static_call_init(mod, start, stop);
[tip: locking/core] lockdep: Add lockdep_assert_not_held()
The following commit has been merged into the locking/core branch of tip:

Commit-ID:     3e31f94752e454bdd0ca4a1d046ee21f80c166c5
Gitweb:        https://git.kernel.org/tip/3e31f94752e454bdd0ca4a1d046ee21f80c166c5
Author:        Shuah Khan
AuthorDate:    Fri, 26 Feb 2021 17:06:58 -07:00
Committer:     Ingo Molnar
CommitterDate: Sat, 06 Mar 2021 12:51:05 +01:00

lockdep: Add lockdep_assert_not_held()

Some kernel functions must be called without holding a specific lock.
Add lockdep_assert_not_held() to be used in these functions to detect
incorrect calls while holding a lock.

lockdep_assert_not_held() provides the opposite functionality of
lockdep_assert_held(), which is used to assert calls that require
holding a specific lock.

Incorporates suggestions from Peter Zijlstra to avoid misfires when
lockdep_off() is employed.

The need for lockdep_assert_not_held() came up in a discussion on an
ath10k patch. ath10k_drain_tx() and i915_vma_pin_ww() are examples of
functions that can use lockdep_assert_not_held().

Signed-off-by: Shuah Khan
Signed-off-by: Peter Zijlstra (Intel)
Signed-off-by: Ingo Molnar
Link: https://lore.kernel.org/linux-wireless/871rdmu9z9@codeaurora.org/
---
 include/linux/lockdep.h  | 11 ---
 kernel/locking/lockdep.c |  6 +-
 2 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
index 7b7ebf2..dbd9ea8 100644
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -301,8 +301,12 @@ extern void lock_unpin_lock(struct lockdep_map *lock, struct pin_cookie);
 
 #define lockdep_depth(tsk)	(debug_locks ? (tsk)->lockdep_depth : 0)
 
-#define lockdep_assert_held(l)	do {				\
-		WARN_ON(debug_locks && !lockdep_is_held(l));	\
+#define lockdep_assert_held(l)	do {					\
+		WARN_ON(debug_locks && lockdep_is_held(l) == 0);	\
+	} while (0)
+
+#define lockdep_assert_not_held(l)	do {				\
+		WARN_ON(debug_locks && lockdep_is_held(l) == 1);	\
 	} while (0)
 
 #define lockdep_assert_held_write(l)	do {			\
@@ -393,7 +397,8 @@ extern int lockdep_is_held(const void *);
 #define lockdep_is_held_type(l, r)		(1)
 
 #define lockdep_assert_held(l)			do { (void)(l); } while (0)
-#define lockdep_assert_held_write(l)	do { (void)(l); } while (0)
+#define lockdep_assert_not_held(l)		do { (void)(l); } while (0)
+#define lockdep_assert_held_write(l)	do { (void)(l); } while (0)
 #define lockdep_assert_held_read(l)		do { (void)(l); } while (0)
 #define lockdep_assert_held_once(l)		do { (void)(l); } while (0)
 
diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index c6d0c1d..969736b 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -5539,8 +5539,12 @@ noinstr int lock_is_held_type(const struct lockdep_map *lock, int read)
 	unsigned long flags;
 	int ret = 0;
 
+	/*
+	 * Avoid false negative lockdep_assert_held() and
+	 * lockdep_assert_not_held().
+	 */
 	if (unlikely(!lockdep_enabled()))
-		return 1; /* avoid false negative lockdep_assert_held() */
+		return -1;
 
 	raw_local_irq_save(flags);
 	check_flags(flags);
[tip: locking/core] x86/jump_label: Mark arguments as const to satisfy asm constraints
The following commit has been merged into the locking/core branch of tip: Commit-ID: 864b435514b286c0be2a38a02f487aa28d990ef8 Gitweb: https://git.kernel.org/tip/864b435514b286c0be2a38a02f487aa28d990ef8 Author:Jason Gerecke AuthorDate:Thu, 11 Feb 2021 13:48:48 -08:00 Committer: Ingo Molnar CommitterDate: Sat, 06 Mar 2021 12:51:00 +01:00 x86/jump_label: Mark arguments as const to satisfy asm constraints When compiling an external kernel module with `-O0` or `-O1`, the following compile error may be reported: ./arch/x86/include/asm/jump_label.h:25:2: error: impossible constraint in ‘asm’ 25 | asm_volatile_goto("1:" | ^ It appears that these lower optimization levels prevent GCC from detecting that the key/branch arguments can be treated as constants and used as immediate operands. To work around this, explicitly add the `const` label. Signed-off-by: Jason Gerecke Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Ingo Molnar Reviewed-by: Steven Rostedt (VMware) Acked-by: Josh Poimboeuf Link: https://lkml.kernel.org/r/20210211214848.536626-1-jason.gere...@wacom.com --- arch/x86/include/asm/jump_label.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/x86/include/asm/jump_label.h b/arch/x86/include/asm/jump_label.h index 06c3cc2..7f20066 100644 --- a/arch/x86/include/asm/jump_label.h +++ b/arch/x86/include/asm/jump_label.h @@ -20,7 +20,7 @@ #include #include -static __always_inline bool arch_static_branch(struct static_key *key, bool branch) +static __always_inline bool arch_static_branch(struct static_key * const key, const bool branch) { asm_volatile_goto("1:" ".byte " __stringify(STATIC_KEY_INIT_NOP) "\n\t" @@ -36,7 +36,7 @@ l_yes: return true; } -static __always_inline bool arch_static_branch_jump(struct static_key *key, bool branch) +static __always_inline bool arch_static_branch_jump(struct static_key * const key, const bool branch) { asm_volatile_goto("1:" ".byte 0xe9\n\t .long %l[l_yes] - 2f\n\t"
[tip: locking/core] lockdep: Add lockdep lock state defines
The following commit has been merged into the locking/core branch of tip: Commit-ID: f8cfa46608f8aa5ca5421ce281ab314129c15411 Gitweb: https://git.kernel.org/tip/f8cfa46608f8aa5ca5421ce281ab314129c15411 Author:Shuah Khan AuthorDate:Fri, 26 Feb 2021 17:06:59 -07:00 Committer: Ingo Molnar CommitterDate: Sat, 06 Mar 2021 12:51:10 +01:00 lockdep: Add lockdep lock state defines Adds defines for lock state returns from lock_is_held_type() based on Johannes Berg's suggestions as it make it easier to read and maintain the lock states. These are defines and a enum to avoid changes to lock_is_held_type() and lockdep_is_held() return types. Updates to lock_is_held_type() and __lock_is_held() to use the new defines. Signed-off-by: Shuah Khan Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Ingo Molnar Link: https://lore.kernel.org/linux-wireless/871rdmu9z9@codeaurora.org/ --- include/linux/lockdep.h | 11 +-- kernel/locking/lockdep.c | 11 ++- 2 files changed, 15 insertions(+), 7 deletions(-) diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h index dbd9ea8..17805aa 100644 --- a/include/linux/lockdep.h +++ b/include/linux/lockdep.h @@ -268,6 +268,11 @@ extern void lock_acquire(struct lockdep_map *lock, unsigned int subclass, extern void lock_release(struct lockdep_map *lock, unsigned long ip); +/* lock_is_held_type() returns */ +#define LOCK_STATE_UNKNOWN -1 +#define LOCK_STATE_NOT_HELD0 +#define LOCK_STATE_HELD1 + /* * Same "read" as for lock_acquire(), except -1 means any. */ @@ -302,11 +307,13 @@ extern void lock_unpin_lock(struct lockdep_map *lock, struct pin_cookie); #define lockdep_depth(tsk) (debug_locks ? 
(tsk)->lockdep_depth : 0) #define lockdep_assert_held(l) do {\ - WARN_ON(debug_locks && lockdep_is_held(l) == 0);\ + WARN_ON(debug_locks && \ + lockdep_is_held(l) == LOCK_STATE_NOT_HELD); \ } while (0) #define lockdep_assert_not_held(l) do {\ - WARN_ON(debug_locks && lockdep_is_held(l) == 1);\ + WARN_ON(debug_locks && \ + lockdep_is_held(l) == LOCK_STATE_HELD); \ } while (0) #define lockdep_assert_held_write(l) do {\ diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c index 969736b..c0b8926 100644 --- a/kernel/locking/lockdep.c +++ b/kernel/locking/lockdep.c @@ -54,6 +54,7 @@ #include #include #include +#include #include @@ -5252,13 +5253,13 @@ int __lock_is_held(const struct lockdep_map *lock, int read) if (match_held_lock(hlock, lock)) { if (read == -1 || hlock->read == read) - return 1; + return LOCK_STATE_HELD; - return 0; + return LOCK_STATE_NOT_HELD; } } - return 0; + return LOCK_STATE_NOT_HELD; } static struct pin_cookie __lock_pin_lock(struct lockdep_map *lock) @@ -5537,14 +5538,14 @@ EXPORT_SYMBOL_GPL(lock_release); noinstr int lock_is_held_type(const struct lockdep_map *lock, int read) { unsigned long flags; - int ret = 0; + int ret = LOCK_STATE_NOT_HELD; /* * Avoid false negative lockdep_assert_held() and * lockdep_assert_not_held(). */ if (unlikely(!lockdep_enabled())) - return -1; + return LOCK_STATE_UNKNOWN; raw_local_irq_save(flags); check_flags(flags);
[tip: locking/core] locking/csd_lock: Prepare more CSD lock debugging
The following commit has been merged into the locking/core branch of tip: Commit-ID: de7b09ef658d637eed0584eaba30884e409aef31 Gitweb: https://git.kernel.org/tip/de7b09ef658d637eed0584eaba30884e409aef31 Author:Juergen Gross AuthorDate:Mon, 01 Mar 2021 11:13:35 +01:00 Committer: Ingo Molnar CommitterDate: Sat, 06 Mar 2021 12:49:48 +01:00 locking/csd_lock: Prepare more CSD lock debugging In order to be able to easily add more CSD lock debugging data to struct call_function_data->csd move the call_single_data_t element into a sub-structure. Signed-off-by: Juergen Gross Signed-off-by: Ingo Molnar Link: https://lore.kernel.org/r/20210301101336.7797-3-jgr...@suse.com --- kernel/smp.c | 16 ++-- 1 file changed, 10 insertions(+), 6 deletions(-) diff --git a/kernel/smp.c b/kernel/smp.c index d5f0b21..6d7e6db 100644 --- a/kernel/smp.c +++ b/kernel/smp.c @@ -31,8 +31,12 @@ #define CSD_TYPE(_csd) ((_csd)->node.u_flags & CSD_FLAG_TYPE_MASK) +struct cfd_percpu { + call_single_data_t csd; +}; + struct call_function_data { - call_single_data_t __percpu *csd; + struct cfd_percpu __percpu *pcpu; cpumask_var_t cpumask; cpumask_var_t cpumask_ipi; }; @@ -55,8 +59,8 @@ int smpcfd_prepare_cpu(unsigned int cpu) free_cpumask_var(cfd->cpumask); return -ENOMEM; } - cfd->csd = alloc_percpu(call_single_data_t); - if (!cfd->csd) { + cfd->pcpu = alloc_percpu(struct cfd_percpu); + if (!cfd->pcpu) { free_cpumask_var(cfd->cpumask); free_cpumask_var(cfd->cpumask_ipi); return -ENOMEM; @@ -71,7 +75,7 @@ int smpcfd_dead_cpu(unsigned int cpu) free_cpumask_var(cfd->cpumask); free_cpumask_var(cfd->cpumask_ipi); - free_percpu(cfd->csd); + free_percpu(cfd->pcpu); return 0; } @@ -694,7 +698,7 @@ static void smp_call_function_many_cond(const struct cpumask *mask, cpumask_clear(cfd->cpumask_ipi); for_each_cpu(cpu, cfd->cpumask) { - call_single_data_t *csd = per_cpu_ptr(cfd->csd, cpu); + call_single_data_t *csd = &per_cpu_ptr(cfd->pcpu, cpu)->csd; if (cond_func && !cond_func(cpu, info)) continue; @@ -719,7 +723,7 @@
static void smp_call_function_many_cond(const struct cpumask *mask, for_each_cpu(cpu, cfd->cpumask) { call_single_data_t *csd; - csd = per_cpu_ptr(cfd->csd, cpu); + csd = &per_cpu_ptr(cfd->pcpu, cpu)->csd; csd_lock_wait(csd); } }
[tip: locking/core] locking/csd_lock: Add more data to CSD lock debugging
The following commit has been merged into the locking/core branch of tip: Commit-ID: a5aabace5fb8abf2adcfcf0fe54c089b20d71755 Gitweb: https://git.kernel.org/tip/a5aabace5fb8abf2adcfcf0fe54c089b20d71755 Author:Juergen Gross AuthorDate:Mon, 01 Mar 2021 11:13:36 +01:00 Committer: Ingo Molnar CommitterDate: Sat, 06 Mar 2021 12:49:48 +01:00 locking/csd_lock: Add more data to CSD lock debugging In order to help identifying problems with IPI handling and remote function execution add some more data to IPI debugging code. There have been multiple reports of CPUs looping long times (many seconds) in smp_call_function_many() waiting for another CPU executing a function like tlb flushing. Most of these reports have been for cases where the kernel was running as a guest on top of KVM or Xen (there are rumours of that happening under VMWare, too, and even on bare metal). Finding the root cause hasn't been successful yet, even after more than 2 years of chasing this bug by different developers. Commit: 35feb60474bf4f7 ("kernel/smp: Provide CSD lock timeout diagnostics") tried to address this by adding some debug code and by issuing another IPI when a hang was detected. This helped mitigating the problem (the repeated IPI unlocks the hang), but the root cause is still unknown. Current available data suggests that either an IPI wasn't sent when it should have been, or that the IPI didn't result in the target CPU executing the queued function (due to the IPI not reaching the CPU, the IPI handler not being called, or the handler not seeing the queued request). Try to add more diagnostic data by introducing a global atomic counter which is being incremented when doing critical operations (before and after queueing a new request, when sending an IPI, and when dequeueing a request). The counter value is stored in percpu variables which can be printed out when a hang is detected. 
The data of the last event (consisting of sequence counter, source CPU, target CPU, and event type) is stored in a global variable. When a new event is to be traced, the data of the last event is stored in the event related percpu location and the global data is updated with the new event's data. This allows to track two events in one data location: one by the value of the event data (the event before the current one), and one by the location itself (the current event). A typical printout with a detected hang will look like this: csd: Detected non-responsive CSD lock (#1) on CPU#1, waiting 53 ns for CPU#06 scf_handler_1+0x0/0x50(0xa2a881bb1410). csd: CSD lock (#1) handling prior scf_handler_1+0x0/0x50(0xa2a8813823c0) request. csd: cnt(8cc): -> dequeue (src cpu 0 == empty) csd: cnt(8cd): ->0006 idle csd: cnt(0003668): 0001->0006 queue csd: cnt(0003669): 0001->0006 ipi csd: cnt(0003e0f): 0007->000a queue csd: cnt(0003e10): 0001-> ping csd: cnt(0003e71): 0003-> ping csd: cnt(0003e72): ->0006 gotipi csd: cnt(0003e73): ->0006 handle csd: cnt(0003e74): ->0006 dequeue (src cpu 0 == empty) csd: cnt(0003e7f): 0004->0006 ping csd: cnt(0003e80): 0001-> pinged csd: cnt(0003eb2): 0005->0001 noipi csd: cnt(0003eb3): 0001->0006 queue csd: cnt(0003eb4): 0001->0006 noipi csd: cnt now: 0003f00 The idea is to print only relevant entries. Those are all events which are associated with the hang (so sender side events for the source CPU of the hanging request, and receiver side events for the target CPU), and the related events just before those (for adding data needed to identify a possible race). Printing all available data would be possible, but this would add large amounts of data printed on larger configurations. Signed-off-by: Juergen Gross [ Minor readability edits. Breaks col80 but is far more readable. ] Signed-off-by: Ingo Molnar Tested-by: Paul E. 
McKenney Link: https://lore.kernel.org/r/20210301101336.7797-4-jgr...@suse.com --- Documentation/admin-guide/kernel-parameters.txt | 4 +- kernel/smp.c| 226 ++- 2 files changed, 226 insertions(+), 4 deletions(-) diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index 98dbffa..1fe9d38 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -789,6 +789,10 @@ printed to the console in case a hanging CPU is detected, and that CPU is pinged again in order to try to resolve the hang situation. + 0: disable csdlock debugging (default) + 1: enable basic csdlock debugging (minor impact) + ext: enable extended csdlock debugging (more impact, +but
[tip: locking/core] locking/csd_lock: Add boot parameter for controlling CSD lock debugging
The following commit has been merged into the locking/core branch of tip: Commit-ID: 8d0968cc6b8ffd8496c2ebffdfdc801f949a85e5 Gitweb: https://git.kernel.org/tip/8d0968cc6b8ffd8496c2ebffdfdc801f949a85e5 Author:Juergen Gross AuthorDate:Mon, 01 Mar 2021 11:13:34 +01:00 Committer: Ingo Molnar CommitterDate: Sat, 06 Mar 2021 12:49:48 +01:00 locking/csd_lock: Add boot parameter for controlling CSD lock debugging Currently CSD lock debugging can be switched on and off via a kernel config option only. Unfortunately there is at least one problem with CSD lock handling pending for about 2 years now, which has been seen in different environments (mostly when running virtualized under KVM or Xen, at least once on bare metal). Multiple attempts to catch this issue have finally led to introduction of CSD lock debug code, but this code is not in use in most distros as it has some impact on performance. In order to be able to ship kernels with CONFIG_CSD_LOCK_WAIT_DEBUG enabled even for production use, add a boot parameter for switching the debug functionality on. This will reduce any performance impact of the debug coding to a bare minimum when not being used. Signed-off-by: Juergen Gross [ Minor edits. ] Signed-off-by: Ingo Molnar Link: https://lore.kernel.org/r/20210301101336.7797-2-jgr...@suse.com --- Documentation/admin-guide/kernel-parameters.txt | 6 +++- kernel/smp.c| 38 ++-- 2 files changed, 40 insertions(+), 4 deletions(-) diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index 0454572..98dbffa 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -784,6 +784,12 @@ cs89x0_media= [HW,NET] Format: { rj45 | aui | bnc } + csdlock_debug= [KNL] Enable debug add-ons of cross-CPU function call + handling. 
When switched on, additional debug data is + printed to the console in case a hanging CPU is + detected, and that CPU is pinged again in order to try + to resolve the hang situation. + dasd= [HW,NET] See header of drivers/s390/block/dasd_devmap.c. diff --git a/kernel/smp.c b/kernel/smp.c index aeb0adf..d5f0b21 100644 --- a/kernel/smp.c +++ b/kernel/smp.c @@ -24,6 +24,7 @@ #include #include #include +#include #include "smpboot.h" #include "sched/smp.h" @@ -102,6 +103,20 @@ void __init call_function_init(void) #ifdef CONFIG_CSD_LOCK_WAIT_DEBUG +static DEFINE_STATIC_KEY_FALSE(csdlock_debug_enabled); + +static int __init csdlock_debug(char *str) +{ + unsigned int val = 0; + + get_option(&str, &val); + if (val) + static_branch_enable(&csdlock_debug_enabled); + + return 0; +} +early_param("csdlock_debug", csdlock_debug); + static DEFINE_PER_CPU(call_single_data_t *, cur_csd); static DEFINE_PER_CPU(smp_call_func_t, cur_csd_func); static DEFINE_PER_CPU(void *, cur_csd_info); @@ -110,7 +125,7 @@ static DEFINE_PER_CPU(void *, cur_csd_info); static atomic_t csd_bug_count = ATOMIC_INIT(0); /* Record current CSD work for current CPU, NULL to erase. */ -static void csd_lock_record(call_single_data_t *csd) +static void __csd_lock_record(call_single_data_t *csd) { if (!csd) { smp_mb(); /* NULL cur_csd after unlock. */ @@ -125,7 +140,13 @@ static void csd_lock_record(call_single_data_t *csd) /* Or before unlock, as the case may be. */ } -static __always_inline int csd_lock_wait_getcpu(call_single_data_t *csd) +static __always_inline void csd_lock_record(call_single_data_t *csd) +{ + if (static_branch_unlikely(&csdlock_debug_enabled)) + __csd_lock_record(csd); +} + +static int csd_lock_wait_getcpu(call_single_data_t *csd) { unsigned int csd_type; @@ -140,7 +161,7 @@ static __always_inline int csd_lock_wait_getcpu(call_single_data_t *csd) * the CSD_TYPE_SYNC/ASYNC types provide the destination CPU, * so waiting on other types gets much less information.
*/ -static __always_inline bool csd_lock_wait_toolong(call_single_data_t *csd, u64 ts0, u64 *ts1, int *bug_id) +static bool csd_lock_wait_toolong(call_single_data_t *csd, u64 ts0, u64 *ts1, int *bug_id) { int cpu = -1; int cpux; @@ -204,7 +225,7 @@ static __always_inline bool csd_lock_wait_toolong(call_single_data_t *csd, u64 t * previous function call. For multi-cpu calls its even more interesting * as we'll have to ensure no other cpu is observing our csd. */ -static __always_inline void csd_lock_wait(call_single_data_t *csd) +static void __csd_lock_wait(call_single_data_t *csd) { int bug_id = 0; u64 ts0, ts1; @@ -218,6 +239,15 @@ static __always_inline void csd_lock_wait(call_single_data_t *csd) smp_acquire__after_ctrl_dep(); }
[tip: irq/core] genirq: Add IRQF_NO_AUTOEN for request_irq/nmi()
The following commit has been merged into the irq/core branch of tip: Commit-ID: cbe16f35bee6880becca6f20d2ebf6b457148552 Gitweb: https://git.kernel.org/tip/cbe16f35bee6880becca6f20d2ebf6b457148552 Author:Barry Song AuthorDate:Wed, 03 Mar 2021 11:49:15 +13:00 Committer: Ingo Molnar CommitterDate: Sat, 06 Mar 2021 12:48:00 +01:00 genirq: Add IRQF_NO_AUTOEN for request_irq/nmi() Many drivers don't want interrupts enabled automatically via request_irq(). So they are handling this issue by either way of the below two: (1) irq_set_status_flags(irq, IRQ_NOAUTOEN); request_irq(dev, irq...); (2) request_irq(dev, irq...); disable_irq(irq); The code in the second way is silly and unsafe. In the small time gap between request_irq() and disable_irq(), interrupts can still come. The code in the first way is safe though it's subobtimal. Add a new IRQF_NO_AUTOEN flag which can be handed in by drivers to request_irq() and request_nmi(). It prevents the automatic enabling of the requested interrupt/nmi in the same safe way as #1 above. With that the various usage sites of #1 and #2 above can be simplified and corrected. Signed-off-by: Barry Song Signed-off-by: Thomas Gleixner Signed-off-by: Ingo Molnar Cc: dmitry.torok...@gmail.com Link: https://lore.kernel.org/r/20210302224916.13980-2-song.bao@hisilicon.com --- include/linux/interrupt.h | 4 kernel/irq/manage.c | 11 +-- 2 files changed, 13 insertions(+), 2 deletions(-) diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h index 967e257..76f1161 100644 --- a/include/linux/interrupt.h +++ b/include/linux/interrupt.h @@ -61,6 +61,9 @@ *interrupt handler after suspending interrupts. For system *wakeup devices users need to implement wakeup detection in *their interrupt handlers. + * IRQF_NO_AUTOEN - Don't enable IRQ or NMI automatically when users request it. + *Users will enable it explicitly by enable_irq() or enable_nmi() + *later. 
*/ #define IRQF_SHARED 0x00000080 #define IRQF_PROBE_SHARED 0x00000100 @@ -74,6 +77,7 @@ #define IRQF_NO_THREAD 0x00010000 #define IRQF_EARLY_RESUME 0x00020000 #define IRQF_COND_SUSPEND 0x00040000 +#define IRQF_NO_AUTOEN 0x00080000 #define IRQF_TIMER (__IRQF_TIMER | IRQF_NO_SUSPEND | IRQF_NO_THREAD) diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c index dec3f73..97c231a 100644 --- a/kernel/irq/manage.c +++ b/kernel/irq/manage.c @@ -1693,7 +1693,8 @@ __setup_irq(unsigned int irq, struct irq_desc *desc, struct irqaction *new) irqd_set(&desc->irq_data, IRQD_NO_BALANCING); } - if (irq_settings_can_autoenable(desc)) { + if (!(new->flags & IRQF_NO_AUTOEN) && + irq_settings_can_autoenable(desc)) { irq_startup(desc, IRQ_RESEND, IRQ_START_COND); } else { /* @@ -2086,10 +2087,15 @@ int request_threaded_irq(unsigned int irq, irq_handler_t handler, * which interrupt is which (messes up the interrupt freeing * logic etc). * +* Also shared interrupts do not go well with disabling auto enable. +* The sharing interrupt might request it while it's still disabled +* and then wait for interrupts forever. +* * Also IRQF_COND_SUSPEND only makes sense for shared interrupts and * it cannot be set along with IRQF_NO_SUSPEND. */ if (((irqflags & IRQF_SHARED) && !dev_id) || + ((irqflags & IRQF_SHARED) && (irqflags & IRQF_NO_AUTOEN)) || (!(irqflags & IRQF_SHARED) && (irqflags & IRQF_COND_SUSPEND)) || ((irqflags & IRQF_NO_SUSPEND) && (irqflags & IRQF_COND_SUSPEND))) return -EINVAL; @@ -2245,7 +2251,8 @@ int request_nmi(unsigned int irq, irq_handler_t handler, desc = irq_to_desc(irq); - if (!desc || irq_settings_can_autoenable(desc) || + if (!desc || (irq_settings_can_autoenable(desc) && + !(irqflags & IRQF_NO_AUTOEN)) || !irq_settings_can_request(desc) || WARN_ON(irq_settings_is_per_cpu_devid(desc)) || !irq_supports_nmi(desc))
[tip: locking/core] ath10k: Detect conf_mutex held ath10k_drain_tx() calls
The following commit has been merged into the locking/core branch of tip: Commit-ID: bdb1050ee1faaec1e78c15de8b1959176f26c655 Gitweb: https://git.kernel.org/tip/bdb1050ee1faaec1e78c15de8b1959176f26c655 Author:Shuah Khan AuthorDate:Fri, 26 Feb 2021 17:07:00 -07:00 Committer: Ingo Molnar CommitterDate: Sat, 06 Mar 2021 12:51:15 +01:00 ath10k: Detect conf_mutex held ath10k_drain_tx() calls ath10k_drain_tx() must not be called with conf_mutex held as workers can use that also. Add call to lockdep_assert_not_held() on conf_mutex to detect if conf_mutex is held by the caller. The idea for this patch stemmed from coming across the comment block above ath10k_drain_tx() while reviewing the conf_mutex holds, trying to debug the conf_mutex lock assert in ath10k_debug_fw_stats_request(). Adding detection to assert on conf_mutex hold will help detect incorrect usages that could lead to locking problems when async worker routines try to call this routine. Signed-off-by: Shuah Khan Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Ingo Molnar Acked-by: Kalle Valo Link: https://lore.kernel.org/linux-wireless/871rdmu9z9@codeaurora.org/ --- drivers/net/wireless/ath/ath10k/mac.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/net/wireless/ath/ath10k/mac.c b/drivers/net/wireless/ath/ath10k/mac.c index bb6c5ee..5ce4f8d 100644 --- a/drivers/net/wireless/ath/ath10k/mac.c +++ b/drivers/net/wireless/ath/ath10k/mac.c @@ -4727,6 +4727,8 @@ out: /* Must not be called with conf_mutex held as workers can use that also. */ void ath10k_drain_tx(struct ath10k *ar) { + lockdep_assert_not_held(&ar->conf_mutex); + /* make sure rcu-protected mac80211 tx path itself is drained */ synchronize_net();
[PATCH] soc:litex: remove duplicate include in litex_soc_ctrl
From: Zhang Yunkai 'errno.h' included in 'litex_soc_ctrl.c' is duplicated. It is also included in the 11th line. Signed-off-by: Zhang Yunkai --- drivers/soc/litex/litex_soc_ctrl.c | 1 - 1 file changed, 1 deletion(-) diff --git a/drivers/soc/litex/litex_soc_ctrl.c b/drivers/soc/litex/litex_soc_ctrl.c index 6268bfa7f0d6..c3e379a990f2 100644 --- a/drivers/soc/litex/litex_soc_ctrl.c +++ b/drivers/soc/litex/litex_soc_ctrl.c @@ -13,7 +13,6 @@ #include #include #include -#include #include #include -- 2.25.1
[tip: objtool/core] objtool,x86: More ModRM sugar
The following commit has been merged into the objtool/core branch of tip: Commit-ID: 36d92e43d01cbeeec99abdf405362243051d6b3f Gitweb: https://git.kernel.org/tip/36d92e43d01cbeeec99abdf405362243051d6b3f Author:Peter Zijlstra AuthorDate:Fri, 12 Feb 2021 09:13:00 +01:00 Committer: Ingo Molnar CommitterDate: Sat, 06 Mar 2021 12:44:23 +01:00 objtool,x86: More ModRM sugar Better helpers to decode ModRM. Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Ingo Molnar Acked-by: Josh Poimboeuf Link: https://lkml.kernel.org/r/YCZB/ljatfxqq...@hirez.programming.kicks-ass.net --- tools/objtool/arch/x86/decode.c | 28 +--- 1 file changed, 17 insertions(+), 11 deletions(-) diff --git a/tools/objtool/arch/x86/decode.c b/tools/objtool/arch/x86/decode.c index b42e5ec..431bafb 100644 --- a/tools/objtool/arch/x86/decode.c +++ b/tools/objtool/arch/x86/decode.c @@ -82,15 +82,21 @@ unsigned long arch_jump_destination(struct instruction *insn) * 01 | [r/m + d8]|[S+d]| [r/m + d8] | * 10 | [r/m + d32] |[S+D]| [r/m + d32] | * 11 | r/ m | - * */ + +#define mod_is_mem() (modrm_mod != 3) +#define mod_is_reg() (modrm_mod == 3) + #define is_RIP() ((modrm_rm & 7) == CFI_BP && modrm_mod == 0) -#define have_SIB() ((modrm_rm & 7) == CFI_SP && modrm_mod != 3) +#define have_SIB() ((modrm_rm & 7) == CFI_SP && mod_is_mem()) #define rm_is(reg) (have_SIB() ? 
\ sib_base == (reg) && sib_index == CFI_SP : \ modrm_rm == (reg)) +#define rm_is_mem(reg) (mod_is_mem() && !is_RIP() && rm_is(reg)) +#define rm_is_reg(reg) (mod_is_reg() && modrm_rm == (reg)) + int arch_decode_instruction(const struct elf *elf, const struct section *sec, unsigned long offset, unsigned int maxlen, unsigned int *len, enum insn_type *type, @@ -154,7 +160,7 @@ int arch_decode_instruction(const struct elf *elf, const struct section *sec, case 0x1: case 0x29: - if (rex_w && modrm_mod == 3 && modrm_rm == CFI_SP) { + if (rex_w && rm_is_reg(CFI_SP)) { /* add/sub reg, %rsp */ ADD_OP(op) { @@ -219,7 +225,7 @@ int arch_decode_instruction(const struct elf *elf, const struct section *sec, break; /* %rsp target only */ - if (!(modrm_mod == 3 && modrm_rm == CFI_SP)) + if (!rm_is_reg(CFI_SP)) break; imm = insn.immediate.value; @@ -272,7 +278,7 @@ int arch_decode_instruction(const struct elf *elf, const struct section *sec, if (modrm_reg == CFI_SP) { - if (modrm_mod == 3) { + if (mod_is_reg()) { /* mov %rsp, reg */ ADD_OP(op) { op->src.type = OP_SRC_REG; @@ -308,7 +314,7 @@ int arch_decode_instruction(const struct elf *elf, const struct section *sec, break; } - if (modrm_mod == 3 && modrm_rm == CFI_SP) { + if (rm_is_reg(CFI_SP)) { /* mov reg, %rsp */ ADD_OP(op) { @@ -325,7 +331,7 @@ int arch_decode_instruction(const struct elf *elf, const struct section *sec, if (!rex_w) break; - if ((modrm_mod == 1 || modrm_mod == 2) && modrm_rm == CFI_BP) { + if (rm_is_mem(CFI_BP)) { /* mov reg, disp(%rbp) */ ADD_OP(op) { @@ -338,7 +344,7 @@ int arch_decode_instruction(const struct elf *elf, const struct section *sec, break; } - if (modrm_mod != 3 && rm_is(CFI_SP)) { + if (rm_is_mem(CFI_SP)) { /* mov reg, disp(%rsp) */ ADD_OP(op) { @@ -357,7 +363,7 @@ int arch_decode_instruction(const struct elf *elf, const struct section *sec, if (!rex_w) break; - if ((modrm_mod == 1 || modrm_mod == 2) && modrm_rm == CFI_BP) { + if (rm_is_mem(CFI_BP)) { /* mov disp(%rbp), reg */ ADD_OP(op) { @@ 
-370,7 +376,7 @@ int arch_decode_instruction(const struct elf *elf, const struct section *sec, break; } - if (modrm_mod != 3 && rm_is(CFI_SP)) { + if (rm_is_mem(CFI_SP)) { /* mov disp(%rsp), reg */ ADD_OP(op) { @@ -386,7 +392,7 @@ int arch_decode_instruction(const struct elf *elf, const struct section *sec, break; case 0x8d: - if (modrm_mod == 3) { + if (mod_is_reg()) {
[tip: objtool/core] objtool: Collate parse_options() users
The following commit has been merged into the objtool/core branch of tip: Commit-ID: a2f605f9ff57397d05a8e2f282b78a69f574d305 Gitweb: https://git.kernel.org/tip/a2f605f9ff57397d05a8e2f282b78a69f574d305 Author:Peter Zijlstra AuthorDate:Fri, 26 Feb 2021 11:18:24 +01:00 Committer: Ingo Molnar CommitterDate: Sat, 06 Mar 2021 12:44:23 +01:00 objtool: Collate parse_options() users Ensure there's a single place that parses check_options, in preparation for extending where to get options from. Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Ingo Molnar Acked-by: Josh Poimboeuf Link: https://lkml.kernel.org/r/20210226110004.193108...@infradead.org --- tools/objtool/builtin-check.c | 14 +- tools/objtool/builtin-orc.c | 5 + tools/objtool/include/objtool/builtin.h | 2 ++ 3 files changed, 12 insertions(+), 9 deletions(-) diff --git a/tools/objtool/builtin-check.c b/tools/objtool/builtin-check.c index 97f063d..0399752 100644 --- a/tools/objtool/builtin-check.c +++ b/tools/objtool/builtin-check.c @@ -42,17 +42,21 @@ const struct option check_options[] = { OPT_END(), }; +int cmd_parse_options(int argc, const char **argv, const char * const usage[]) +{ + argc = parse_options(argc, argv, check_options, usage, 0); + if (argc != 1) + usage_with_options(usage, check_options); + return argc; +} + int cmd_check(int argc, const char **argv) { const char *objname; struct objtool_file *file; int ret; - argc = parse_options(argc, argv, check_options, check_usage, 0); - - if (argc != 1) - usage_with_options(check_usage, check_options); - + argc = cmd_parse_options(argc, argv, check_usage); objname = argv[0]; file = objtool_open_read(objname); diff --git a/tools/objtool/builtin-orc.c b/tools/objtool/builtin-orc.c index 8273bbf..17f8b93 100644 --- a/tools/objtool/builtin-orc.c +++ b/tools/objtool/builtin-orc.c @@ -34,10 +34,7 @@ int cmd_orc(int argc, const char **argv) struct objtool_file *file; int ret; - argc = parse_options(argc, argv, check_options, orc_usage, 0); - if (argc != 1) - 
usage_with_options(orc_usage, check_options); - + argc = cmd_parse_options(argc, argv, orc_usage); objname = argv[0]; file = objtool_open_read(objname); diff --git a/tools/objtool/include/objtool/builtin.h b/tools/objtool/include/objtool/builtin.h index d019210..15ac0b7 100644 --- a/tools/objtool/include/objtool/builtin.h +++ b/tools/objtool/include/objtool/builtin.h @@ -11,6 +11,8 @@ extern const struct option check_options[]; extern bool no_fp, no_unreachable, retpoline, module, backtrace, uaccess, stats, validate_dup, vmlinux, mcount, noinstr, backup; +extern int cmd_parse_options(int argc, const char **argv, const char * const usage[]); + extern int cmd_check(int argc, const char **argv); extern int cmd_orc(int argc, const char **argv);
[tip: objtool/core] objtool: Parse options from OBJTOOL_ARGS
The following commit has been merged into the objtool/core branch of tip: Commit-ID: 900b4df347bbac4874149a226143a556909faba8 Gitweb: https://git.kernel.org/tip/900b4df347bbac4874149a226143a556909faba8 Author:Peter Zijlstra AuthorDate:Fri, 26 Feb 2021 11:32:30 +01:00 Committer: Ingo Molnar CommitterDate: Sat, 06 Mar 2021 12:44:23 +01:00 objtool: Parse options from OBJTOOL_ARGS Teach objtool to parse options from the OBJTOOL_ARGS environment variable. This enables things like: $ OBJTOOL_ARGS="--backup" make O=defconfig-build/ kernel/ponies.o to obtain both defconfig-build/kernel/ponies.o{,.orig} and easily inspect what objtool actually did. Suggested-by: Borislav Petkov Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Ingo Molnar Acked-by: Josh Poimboeuf Link: https://lkml.kernel.org/r/20210226110004.252553...@infradead.org --- tools/objtool/builtin-check.c | 25 + 1 file changed, 25 insertions(+) diff --git a/tools/objtool/builtin-check.c b/tools/objtool/builtin-check.c index 0399752..8b38b5d 100644 --- a/tools/objtool/builtin-check.c +++ b/tools/objtool/builtin-check.c @@ -15,6 +15,7 @@ #include #include +#include #include #include @@ -26,6 +27,11 @@ static const char * const check_usage[] = { NULL, }; +static const char * const env_usage[] = { + "OBJTOOL_ARGS=\"\"", + NULL, +}; + const struct option check_options[] = { OPT_BOOLEAN('f', "no-fp", _fp, "Skip frame pointer validation"), OPT_BOOLEAN('u', "no-unreachable", _unreachable, "Skip 'unreachable instruction' warnings"), @@ -44,6 +50,25 @@ const struct option check_options[] = { int cmd_parse_options(int argc, const char **argv, const char * const usage[]) { + const char *envv[16] = { }; + char *env; + int envc; + + env = getenv("OBJTOOL_ARGS"); + if (env) { + envv[0] = "OBJTOOL_ARGS"; + for (envc = 1; envc < ARRAY_SIZE(envv); ) { + envv[envc++] = env; + env = strchr(env, ' '); + if (!env) + break; + *env = '\0'; + env++; + } + + parse_options(envc, envv, check_options, env_usage, 0); + } + argc = 
parse_options(argc, argv, check_options, usage, 0); if (argc != 1) usage_with_options(usage, check_options);
[tip: objtool/core] objtool,x86: Rewrite LEA decode
The following commit has been merged into the objtool/core branch of tip: Commit-ID: 2ee0c363492f1acc1082125218e6a80c0d7d502b Gitweb: https://git.kernel.org/tip/2ee0c363492f1acc1082125218e6a80c0d7d502b Author:Peter Zijlstra AuthorDate:Tue, 09 Feb 2021 21:29:16 +01:00 Committer: Ingo Molnar CommitterDate: Sat, 06 Mar 2021 12:44:23 +01:00 objtool,x86: Rewrite LEA decode Current LEA decoding is a bunch of special cases, properly decode the instruction, with exception of full SIB and RIP-relative modes. Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Ingo Molnar Acked-by: Josh Poimboeuf Tested-by: Nick Desaulniers Link: https://lkml.kernel.org/r/20210211173627.143250...@infradead.org --- tools/objtool/arch/x86/decode.c | 86 ++-- 1 file changed, 28 insertions(+), 58 deletions(-) diff --git a/tools/objtool/arch/x86/decode.c b/tools/objtool/arch/x86/decode.c index 549813c..d8f0138 100644 --- a/tools/objtool/arch/x86/decode.c +++ b/tools/objtool/arch/x86/decode.c @@ -91,9 +91,10 @@ int arch_decode_instruction(const struct elf *elf, const struct section *sec, { struct insn insn; int x86_64, sign; - unsigned char op1, op2, rex = 0, rex_b = 0, rex_r = 0, rex_w = 0, - rex_x = 0, modrm = 0, modrm_mod = 0, modrm_rm = 0, - modrm_reg = 0, sib = 0; + unsigned char op1, op2, + rex = 0, rex_b = 0, rex_r = 0, rex_w = 0, rex_x = 0, + modrm = 0, modrm_mod = 0, modrm_rm = 0, modrm_reg = 0, + sib = 0; struct stack_op *op = NULL; struct symbol *sym; @@ -328,68 +329,37 @@ int arch_decode_instruction(const struct elf *elf, const struct section *sec, break; case 0x8d: - if (sib == 0x24 && rex_w && !rex_b && !rex_x) { - - ADD_OP(op) { - if (!insn.displacement.value) { - /* lea (%rsp), reg */ - op->src.type = OP_SRC_REG; - } else { - /* lea disp(%rsp), reg */ - op->src.type = OP_SRC_ADD; - op->src.offset = insn.displacement.value; - } - op->src.reg = CFI_SP; - op->dest.type = OP_DEST_REG; - op->dest.reg = op_to_cfi_reg[modrm_reg][rex_r]; - } - - } else if (rex == 0x48 && modrm == 0x65) { - 
- /* lea disp(%rbp), %rsp */ - ADD_OP(op) { - op->src.type = OP_SRC_ADD; - op->src.reg = CFI_BP; - op->src.offset = insn.displacement.value; - op->dest.type = OP_DEST_REG; - op->dest.reg = CFI_SP; - } + if (modrm_mod == 3) { + WARN("invalid LEA encoding at %s:0x%lx", sec->name, offset); + break; + } - } else if (rex == 0x49 && modrm == 0x62 && - insn.displacement.value == -8) { + /* skip non 64bit ops */ + if (!rex_w) + break; - /* -* lea -0x8(%r10), %rsp -* -* Restoring rsp back to its original value after a -* stack realignment. -*/ - ADD_OP(op) { - op->src.type = OP_SRC_ADD; - op->src.reg = CFI_R10; - op->src.offset = -8; - op->dest.type = OP_DEST_REG; - op->dest.reg = CFI_SP; - } + /* skip nontrivial SIB */ + if (modrm_rm == 4 && !(sib == 0x24 && rex_b == rex_x)) + break; - } else if (rex == 0x49 && modrm == 0x65 && - insn.displacement.value == -16) { + /* skip RIP relative displacement */ + if (modrm_rm == 5 && modrm_mod == 0) + break; - /* -* lea -0x10(%r13), %rsp -* -* Restoring rsp back to its original value after a -* stack realignment. -*/ - ADD_OP(op) { + /* lea disp(%src), %dst */ + ADD_OP(op) { + op->src.offset = insn.displacement.value; + if (!op->src.offset) { +
[tip: objtool/core] objtool,x86: Rewrite LEAVE
The following commit has been merged into the objtool/core branch of tip: Commit-ID: ffc7e74f36a2c7424da262a32a0bbe59669677ef Gitweb: https://git.kernel.org/tip/ffc7e74f36a2c7424da262a32a0bbe59669677ef Author:Peter Zijlstra AuthorDate:Tue, 09 Feb 2021 21:41:13 +01:00 Committer: Ingo Molnar CommitterDate: Sat, 06 Mar 2021 12:44:23 +01:00 objtool,x86: Rewrite LEAVE Since we can now have multiple stack-ops per instruction, we don't need to special case LEAVE and can simply emit the composite operations. Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Ingo Molnar Acked-by: Josh Poimboeuf Tested-by: Nick Desaulniers Link: https://lkml.kernel.org/r/20210211173627.253273...@infradead.org --- tools/objtool/arch/x86/decode.c | 14 +++--- tools/objtool/check.c| 24 ++-- tools/objtool/include/objtool/arch.h | 1 - 3 files changed, 13 insertions(+), 26 deletions(-) diff --git a/tools/objtool/arch/x86/decode.c b/tools/objtool/arch/x86/decode.c index d8f0138..47b9acf 100644 --- a/tools/objtool/arch/x86/decode.c +++ b/tools/objtool/arch/x86/decode.c @@ -446,9 +446,17 @@ int arch_decode_instruction(const struct elf *elf, const struct section *sec, * mov bp, sp * pop bp */ - ADD_OP(op) - op->dest.type = OP_DEST_LEAVE; - + ADD_OP(op) { + op->src.type = OP_SRC_REG; + op->src.reg = CFI_BP; + op->dest.type = OP_DEST_REG; + op->dest.reg = CFI_SP; + } + ADD_OP(op) { + op->src.type = OP_SRC_POP; + op->dest.type = OP_DEST_REG; + op->dest.reg = CFI_BP; + } break; case 0xe3: diff --git a/tools/objtool/check.c b/tools/objtool/check.c index 12b8f0f..a0f762a 100644 --- a/tools/objtool/check.c +++ b/tools/objtool/check.c @@ -2020,7 +2020,7 @@ static int update_cfi_state(struct instruction *insn, } else if (op->src.reg == CFI_BP && op->dest.reg == CFI_SP && -cfa->base == CFI_BP) { +(cfa->base == CFI_BP || cfa->base == cfi->drap_reg)) { /* * mov %rbp, %rsp @@ -2217,7 +2217,7 @@ static int update_cfi_state(struct instruction *insn, cfa->offset = 0; cfi->drap_offset = -1; - } else if 
(regs[op->dest.reg].offset == -cfi->stack_size) { + } else if (cfi->stack_size == -regs[op->dest.reg].offset) { /* pop %reg */ restore_reg(cfi, op->dest.reg); @@ -2358,26 +2358,6 @@ static int update_cfi_state(struct instruction *insn, break; - case OP_DEST_LEAVE: - if ((!cfi->drap && cfa->base != CFI_BP) || - (cfi->drap && cfa->base != cfi->drap_reg)) { - WARN_FUNC("leave instruction with modified stack frame", - insn->sec, insn->offset); - return -1; - } - - /* leave (mov %rbp, %rsp; pop %rbp) */ - - cfi->stack_size = -cfi->regs[CFI_BP].offset - 8; - restore_reg(cfi, CFI_BP); - - if (!cfi->drap) { - cfa->base = CFI_SP; - cfa->offset -= 8; - } - - break; - case OP_DEST_MEM: if (op->src.type != OP_SRC_POP && op->src.type != OP_SRC_POPF) { WARN_FUNC("unknown stack-related memory operation", diff --git a/tools/objtool/include/objtool/arch.h b/tools/objtool/include/objtool/arch.h index 6ff0685..ff21f38 100644 --- a/tools/objtool/include/objtool/arch.h +++ b/tools/objtool/include/objtool/arch.h @@ -35,7 +35,6 @@ enum op_dest_type { OP_DEST_MEM, OP_DEST_PUSH, OP_DEST_PUSHF, - OP_DEST_LEAVE, }; struct op_dest {
[tip: objtool/core] objtool,x86: Renumber CFI_reg
The following commit has been merged into the objtool/core branch of tip: Commit-ID: d473b18b2ef62563fb874f9cae6e123f99129e3f Gitweb: https://git.kernel.org/tip/d473b18b2ef62563fb874f9cae6e123f99129e3f Author:Peter Zijlstra AuthorDate:Tue, 09 Feb 2021 20:18:21 +01:00 Committer: Ingo Molnar CommitterDate: Sat, 06 Mar 2021 12:44:22 +01:00 objtool,x86: Renumber CFI_reg Make them match the instruction encoding numbering. Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Ingo Molnar Acked-by: Josh Poimboeuf Tested-by: Nick Desaulniers Link: https://lkml.kernel.org/r/20210211173627.033720...@infradead.org --- tools/objtool/arch/x86/include/arch/cfi_regs.h | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/tools/objtool/arch/x86/include/arch/cfi_regs.h b/tools/objtool/arch/x86/include/arch/cfi_regs.h index 79bc517..0579d22 100644 --- a/tools/objtool/arch/x86/include/arch/cfi_regs.h +++ b/tools/objtool/arch/x86/include/arch/cfi_regs.h @@ -4,13 +4,13 @@ #define _OBJTOOL_CFI_REGS_H #define CFI_AX 0 -#define CFI_DX 1 -#define CFI_CX 2 +#define CFI_CX 1 +#define CFI_DX 2 #define CFI_BX 3 -#define CFI_SI 4 -#define CFI_DI 5 -#define CFI_BP 6 -#define CFI_SP 7 +#define CFI_SP 4 +#define CFI_BP 5 +#define CFI_SI 6 +#define CFI_DI 7 #define CFI_R8 8 #define CFI_R9 9 #define CFI_R1010
[tip: objtool/core] objtool: Allow UNWIND_HINT to suppress dodgy stack modifications
The following commit has been merged into the objtool/core branch of tip: Commit-ID: d54dba41999498b38a40940e1123019d50b26496 Gitweb: https://git.kernel.org/tip/d54dba41999498b38a40940e1123019d50b26496 Author:Peter Zijlstra AuthorDate:Thu, 11 Feb 2021 13:03:28 +01:00 Committer: Ingo Molnar CommitterDate: Sat, 06 Mar 2021 12:44:22 +01:00 objtool: Allow UNWIND_HINT to suppress dodgy stack modifications rewind_stack_do_exit() UNWIND_HINT_FUNC /* Prevent any naive code from trying to unwind to our caller. */ xorl%ebp, %ebp movqPER_CPU_VAR(cpu_current_top_of_stack), %rax leaq-PTREGS_SIZE(%rax), %rsp UNWIND_HINT_REGS calldo_exit Does unspeakable things to the stack, which objtool currently fails to detect due to a limitation in instruction decoding. This will be rectified after which the above will result in: arch/x86/entry/entry_64.o: warning: objtool: .text+0xab: unsupported stack register modification Allow the UNWIND_HINT on the next instruction to suppress this, it will overwrite the state anyway. 
Suggested-by: Josh Poimboeuf Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Ingo Molnar Acked-by: Josh Poimboeuf Tested-by: Nick Desaulniers Link: https://lkml.kernel.org/r/20210211173626.918498...@infradead.org --- tools/objtool/check.c | 15 +-- 1 file changed, 9 insertions(+), 6 deletions(-) diff --git a/tools/objtool/check.c b/tools/objtool/check.c index 068cdb4..12b8f0f 100644 --- a/tools/objtool/check.c +++ b/tools/objtool/check.c @@ -1959,8 +1959,9 @@ static void restore_reg(struct cfi_state *cfi, unsigned char reg) * 41 5d pop%r13 * c3retq */ -static int update_cfi_state(struct instruction *insn, struct cfi_state *cfi, -struct stack_op *op) +static int update_cfi_state(struct instruction *insn, + struct instruction *next_insn, + struct cfi_state *cfi, struct stack_op *op) { struct cfi_reg *cfa = >cfa; struct cfi_reg *regs = cfi->regs; @@ -2161,7 +2162,7 @@ static int update_cfi_state(struct instruction *insn, struct cfi_state *cfi, break; } - if (op->dest.reg == cfi->cfa.base) { + if (op->dest.reg == cfi->cfa.base && !(next_insn && next_insn->hint)) { WARN_FUNC("unsupported stack register modification", insn->sec, insn->offset); return -1; @@ -2433,13 +2434,15 @@ static int propagate_alt_cfi(struct objtool_file *file, struct instruction *insn return 0; } -static int handle_insn_ops(struct instruction *insn, struct insn_state *state) +static int handle_insn_ops(struct instruction *insn, + struct instruction *next_insn, + struct insn_state *state) { struct stack_op *op; list_for_each_entry(op, >stack_ops, list) { - if (update_cfi_state(insn, >cfi, op)) + if (update_cfi_state(insn, next_insn, >cfi, op)) return 1; if (op->dest.type == OP_DEST_PUSHF) { @@ -2719,7 +2722,7 @@ static int validate_branch(struct objtool_file *file, struct symbol *func, return 0; } - if (handle_insn_ops(insn, )) + if (handle_insn_ops(insn, next_insn, )) return 1; switch (insn->type) {
[tip: objtool/core] objtool,x86: Simplify register decode
The following commit has been merged into the objtool/core branch of tip: Commit-ID: 16ef7f159c503c7befec7018ee0e82fdc311721e Gitweb: https://git.kernel.org/tip/16ef7f159c503c7befec7018ee0e82fdc311721e Author:Peter Zijlstra AuthorDate:Tue, 09 Feb 2021 19:59:43 +01:00 Committer: Ingo Molnar CommitterDate: Sat, 06 Mar 2021 12:44:23 +01:00 objtool,x86: Simplify register decode Since the CFI_reg number now matches the instruction encoding order do away with the op_to_cfi_reg[] and use direct assignment. Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Ingo Molnar Acked-by: Josh Poimboeuf Tested-by: Nick Desaulniers Link: https://lkml.kernel.org/r/20210211173627.362004...@infradead.org --- tools/objtool/arch/x86/decode.c | 79 +++- 1 file changed, 39 insertions(+), 40 deletions(-) diff --git a/tools/objtool/arch/x86/decode.c b/tools/objtool/arch/x86/decode.c index 47b9acf..5ce7dc4 100644 --- a/tools/objtool/arch/x86/decode.c +++ b/tools/objtool/arch/x86/decode.c @@ -17,17 +17,6 @@ #include #include -static unsigned char op_to_cfi_reg[][2] = { - {CFI_AX, CFI_R8}, - {CFI_CX, CFI_R9}, - {CFI_DX, CFI_R10}, - {CFI_BX, CFI_R11}, - {CFI_SP, CFI_R12}, - {CFI_BP, CFI_R13}, - {CFI_SI, CFI_R14}, - {CFI_DI, CFI_R15}, -}; - static int is_x86_64(const struct elf *elf) { switch (elf->ehdr.e_machine) { @@ -94,7 +83,7 @@ int arch_decode_instruction(const struct elf *elf, const struct section *sec, unsigned char op1, op2, rex = 0, rex_b = 0, rex_r = 0, rex_w = 0, rex_x = 0, modrm = 0, modrm_mod = 0, modrm_rm = 0, modrm_reg = 0, - sib = 0; + sib = 0 /* , sib_scale = 0, sib_index = 0, sib_base = 0 */; struct stack_op *op = NULL; struct symbol *sym; @@ -130,23 +119,29 @@ int arch_decode_instruction(const struct elf *elf, const struct section *sec, if (insn.modrm.nbytes) { modrm = insn.modrm.bytes[0]; modrm_mod = X86_MODRM_MOD(modrm); - modrm_reg = X86_MODRM_REG(modrm); - modrm_rm = X86_MODRM_RM(modrm); + modrm_reg = X86_MODRM_REG(modrm) + 8*rex_r; + modrm_rm = X86_MODRM_RM(modrm) + 
8*rex_b; } - if (insn.sib.nbytes) + if (insn.sib.nbytes) { sib = insn.sib.bytes[0]; + /* + sib_scale = X86_SIB_SCALE(sib); + sib_index = X86_SIB_INDEX(sib) + 8*rex_x; + sib_base = X86_SIB_BASE(sib) + 8*rex_b; +*/ + } switch (op1) { case 0x1: case 0x29: - if (rex_w && !rex_b && modrm_mod == 3 && modrm_rm == 4) { + if (rex_w && modrm_mod == 3 && modrm_rm == CFI_SP) { /* add/sub reg, %rsp */ ADD_OP(op) { op->src.type = OP_SRC_ADD; - op->src.reg = op_to_cfi_reg[modrm_reg][rex_r]; + op->src.reg = modrm_reg; op->dest.type = OP_DEST_REG; op->dest.reg = CFI_SP; } @@ -158,7 +153,7 @@ int arch_decode_instruction(const struct elf *elf, const struct section *sec, /* push reg */ ADD_OP(op) { op->src.type = OP_SRC_REG; - op->src.reg = op_to_cfi_reg[op1 & 0x7][rex_b]; + op->src.reg = (op1 & 0x7) + 8*rex_b; op->dest.type = OP_DEST_PUSH; } @@ -170,7 +165,7 @@ int arch_decode_instruction(const struct elf *elf, const struct section *sec, ADD_OP(op) { op->src.type = OP_SRC_POP; op->dest.type = OP_DEST_REG; - op->dest.reg = op_to_cfi_reg[op1 & 0x7][rex_b]; + op->dest.reg = (op1 & 0x7) + 8*rex_b; } break; @@ -223,7 +218,7 @@ int arch_decode_instruction(const struct elf *elf, const struct section *sec, break; case 0x89: - if (rex_w && !rex_r && modrm_reg == 4) { + if (rex_w && modrm_reg == CFI_SP) { if (modrm_mod == 3) { /* mov %rsp, reg */ @@ -231,17 +226,17 @@ int arch_decode_instruction(const struct elf *elf, const struct section *sec, op->src.type = OP_SRC_REG; op->src.reg = CFI_SP; op->dest.type = OP_DEST_REG; - op->dest.reg = op_to_cfi_reg[modrm_rm][rex_b]; + op->dest.reg = modrm_rm; }
[tip: objtool/core] objtool,x86: Rewrite ADD/SUB/AND
The following commit has been merged into the objtool/core branch of tip: Commit-ID: 961d83b9073b1ce5834af50d3c69e5e2461c6fd3 Gitweb: https://git.kernel.org/tip/961d83b9073b1ce5834af50d3c69e5e2461c6fd3 Author:Peter Zijlstra AuthorDate:Wed, 10 Feb 2021 14:11:30 +01:00 Committer: Ingo Molnar CommitterDate: Sat, 06 Mar 2021 12:44:23 +01:00 objtool,x86: Rewrite ADD/SUB/AND Support sign extending and imm8 forms. Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Ingo Molnar Acked-by: Josh Poimboeuf Tested-by: Nick Desaulniers Link: https://lkml.kernel.org/r/20210211173627.588366...@infradead.org --- tools/objtool/arch/x86/decode.c | 70 +++- 1 file changed, 51 insertions(+), 19 deletions(-) diff --git a/tools/objtool/arch/x86/decode.c b/tools/objtool/arch/x86/decode.c index 78ae5be..b42e5ec 100644 --- a/tools/objtool/arch/x86/decode.c +++ b/tools/objtool/arch/x86/decode.c @@ -98,13 +98,14 @@ int arch_decode_instruction(const struct elf *elf, const struct section *sec, struct list_head *ops_list) { struct insn insn; - int x86_64, sign; + int x86_64; unsigned char op1, op2, rex = 0, rex_b = 0, rex_r = 0, rex_w = 0, rex_x = 0, modrm = 0, modrm_mod = 0, modrm_rm = 0, modrm_reg = 0, sib = 0, /* sib_scale = 0, */ sib_index = 0, sib_base = 0; struct stack_op *op = NULL; struct symbol *sym; + u64 imm; x86_64 = is_x86_64(elf); if (x86_64 == -1) @@ -200,12 +201,54 @@ int arch_decode_instruction(const struct elf *elf, const struct section *sec, *type = INSN_JUMP_CONDITIONAL; break; - case 0x81: - case 0x83: - if (rex != 0x48) + case 0x80 ... 
0x83: + /* +* 1000 00sw : mod OP r/m : immediate +* +* s - sign extend immediate +* w - imm8 / imm32 +* +* OP: 000 ADD100 AND +* 001 OR 101 SUB +* 010 ADC110 XOR +* 011 SBB111 CMP +*/ + + /* 64bit only */ + if (!rex_w) break; - if (modrm == 0xe4) { + /* %rsp target only */ + if (!(modrm_mod == 3 && modrm_rm == CFI_SP)) + break; + + imm = insn.immediate.value; + if (op1 & 2) { /* sign extend */ + if (op1 & 1) { /* imm32 */ + imm <<= 32; + imm = (s64)imm >> 32; + } else { /* imm8 */ + imm <<= 56; + imm = (s64)imm >> 56; + } + } + + switch (modrm_reg & 7) { + case 5: + imm = -imm; + /* fallthrough */ + case 0: + /* add/sub imm, %rsp */ + ADD_OP(op) { + op->src.type = OP_SRC_ADD; + op->src.reg = CFI_SP; + op->src.offset = imm; + op->dest.type = OP_DEST_REG; + op->dest.reg = CFI_SP; + } + break; + + case 4: /* and imm, %rsp */ ADD_OP(op) { op->src.type = OP_SRC_AND; @@ -215,23 +258,12 @@ int arch_decode_instruction(const struct elf *elf, const struct section *sec, op->dest.reg = CFI_SP; } break; - } - if (modrm == 0xc4) - sign = 1; - else if (modrm == 0xec) - sign = -1; - else + default: + /* WARN ? */ break; - - /* add/sub imm, %rsp */ - ADD_OP(op) { - op->src.type = OP_SRC_ADD; - op->src.reg = CFI_SP; - op->src.offset = insn.immediate.value * sign; - op->dest.type = OP_DEST_REG; - op->dest.reg = CFI_SP; } + break; case 0x89:
[tip: objtool/core] objtool,x86: Support %riz encodings
The following commit has been merged into the objtool/core branch of tip: Commit-ID: 78df6245c3c82484200b9f8e306dc86fb19e9c02 Gitweb: https://git.kernel.org/tip/78df6245c3c82484200b9f8e306dc86fb19e9c02 Author:Peter Zijlstra AuthorDate:Wed, 10 Feb 2021 11:47:35 +01:00 Committer: Ingo Molnar CommitterDate: Sat, 06 Mar 2021 12:44:23 +01:00 objtool,x86: Support %riz encodings When there's a SIB byte, the register otherwise denoted by r/m will then be denoted by SIB.base REX.b will now extend this. SIB.index == SP is magic and notes an index value zero. This means that there's a bunch of alternative (longer) encodings for the same thing. Eg. 'ModRM.mod != 3, ModRM.r/m = AX' can be encoded as 'ModRM.mod != 3, ModRM.r/m = SP, SIB.base = AX, SIB.index = SP' which is actually 4 different encodings because the value of SIB.scale is irrelevant, giving rise to 5 different but equal encodings. Support these encodings and clean up the SIB handling in general. Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Ingo Molnar Acked-by: Josh Poimboeuf Tested-by: Nick Desaulniers Link: https://lkml.kernel.org/r/20210211173627.472967...@infradead.org --- tools/objtool/arch/x86/decode.c | 67 ++-- 1 file changed, 48 insertions(+), 19 deletions(-) diff --git a/tools/objtool/arch/x86/decode.c b/tools/objtool/arch/x86/decode.c index 5ce7dc4..78ae5be 100644 --- a/tools/objtool/arch/x86/decode.c +++ b/tools/objtool/arch/x86/decode.c @@ -72,6 +72,25 @@ unsigned long arch_jump_destination(struct instruction *insn) return -1; \ else for (list_add_tail(>list, ops_list); op; op = NULL) +/* + * Helpers to decode ModRM/SIB: + * + * r/m| AX CX DX BX | SP | BP | SI DI | + *| R8 R9 R10 R11 | R12 | R13 | R14 R15 | + * Mod++-+-+-+ + * 00 |[r/m] |[SIB]|[IP+]| [r/m] | + * 01 | [r/m + d8]|[S+d]| [r/m + d8] | + * 10 | [r/m + d32] |[S+D]| [r/m + d32] | + * 11 | r/ m | + * + */ +#define is_RIP() ((modrm_rm & 7) == CFI_BP && modrm_mod == 0) +#define have_SIB() ((modrm_rm & 7) == CFI_SP && modrm_mod != 3) + 
+#define rm_is(reg) (have_SIB() ? \ + sib_base == (reg) && sib_index == CFI_SP : \ + modrm_rm == (reg)) + int arch_decode_instruction(const struct elf *elf, const struct section *sec, unsigned long offset, unsigned int maxlen, unsigned int *len, enum insn_type *type, @@ -83,7 +102,7 @@ int arch_decode_instruction(const struct elf *elf, const struct section *sec, unsigned char op1, op2, rex = 0, rex_b = 0, rex_r = 0, rex_w = 0, rex_x = 0, modrm = 0, modrm_mod = 0, modrm_rm = 0, modrm_reg = 0, - sib = 0 /* , sib_scale = 0, sib_index = 0, sib_base = 0 */; + sib = 0, /* sib_scale = 0, */ sib_index = 0, sib_base = 0; struct stack_op *op = NULL; struct symbol *sym; @@ -125,11 +144,9 @@ int arch_decode_instruction(const struct elf *elf, const struct section *sec, if (insn.sib.nbytes) { sib = insn.sib.bytes[0]; - /* - sib_scale = X86_SIB_SCALE(sib); + /* sib_scale = X86_SIB_SCALE(sib); */ sib_index = X86_SIB_INDEX(sib) + 8*rex_x; sib_base = X86_SIB_BASE(sib) + 8*rex_b; -*/ } switch (op1) { @@ -218,7 +235,10 @@ int arch_decode_instruction(const struct elf *elf, const struct section *sec, break; case 0x89: - if (rex_w && modrm_reg == CFI_SP) { + if (!rex_w) + break; + + if (modrm_reg == CFI_SP) { if (modrm_mod == 3) { /* mov %rsp, reg */ @@ -231,14 +251,17 @@ int arch_decode_instruction(const struct elf *elf, const struct section *sec, break; } else { - /* skip nontrivial SIB */ - if ((modrm_rm & 7) == 4 && !(sib == 0x24 && rex_b == rex_x)) - break; - /* skip RIP relative displacement */ - if ((modrm_rm & 7) == 5 && modrm_mod == 0) + if (is_RIP()) break; + /* skip nontrivial SIB */ + if (have_SIB()) { + modrm_rm = sib_base; + if (sib_index != CFI_SP) + break; + } + /* mov %rsp, disp(%reg) */ ADD_OP(op) {
[tip: objtool/core] objtool: Add --backup
The following commit has been merged into the objtool/core branch of tip: Commit-ID: 8ad15c6900840e8a2163012f4581c52127622e02 Gitweb: https://git.kernel.org/tip/8ad15c6900840e8a2163012f4581c52127622e02 Author:Peter Zijlstra AuthorDate:Fri, 26 Feb 2021 10:59:59 +01:00 Committer: Ingo Molnar CommitterDate: Sat, 06 Mar 2021 12:44:23 +01:00 objtool: Add --backup Teach objtool to write backups files, such that it becomes easier to see what objtool did to the object file. Backup files will be ${name}.orig. Suggested-by: Borislav Petkov Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Ingo Molnar Acked-by: Borislav Petkov Acked-by: Josh Poimboeuf Link: https://lkml.kernel.org/r/yd4obt3aoxpwl...@hirez.programming.kicks-ass.net --- tools/objtool/builtin-check.c | 4 +- tools/objtool/include/objtool/builtin.h | 3 +- tools/objtool/objtool.c | 64 - 3 files changed, 69 insertions(+), 2 deletions(-) diff --git a/tools/objtool/builtin-check.c b/tools/objtool/builtin-check.c index c3a85d8..97f063d 100644 --- a/tools/objtool/builtin-check.c +++ b/tools/objtool/builtin-check.c @@ -18,7 +18,8 @@ #include #include -bool no_fp, no_unreachable, retpoline, module, backtrace, uaccess, stats, validate_dup, vmlinux, mcount, noinstr; +bool no_fp, no_unreachable, retpoline, module, backtrace, uaccess, stats, + validate_dup, vmlinux, mcount, noinstr, backup; static const char * const check_usage[] = { "objtool check [] file.o", @@ -37,6 +38,7 @@ const struct option check_options[] = { OPT_BOOLEAN('n', "noinstr", , "noinstr validation for vmlinux.o"), OPT_BOOLEAN('l', "vmlinux", , "vmlinux.o validation"), OPT_BOOLEAN('M', "mcount", , "generate __mcount_loc"), + OPT_BOOLEAN('B', "backup", , "create .orig files before modification"), OPT_END(), }; diff --git a/tools/objtool/include/objtool/builtin.h b/tools/objtool/include/objtool/builtin.h index 2502bb2..d019210 100644 --- a/tools/objtool/include/objtool/builtin.h +++ b/tools/objtool/include/objtool/builtin.h @@ -8,7 +8,8 @@ #include extern 
const struct option check_options[]; -extern bool no_fp, no_unreachable, retpoline, module, backtrace, uaccess, stats, validate_dup, vmlinux, mcount, noinstr; +extern bool no_fp, no_unreachable, retpoline, module, backtrace, uaccess, stats, +validate_dup, vmlinux, mcount, noinstr, backup; extern int cmd_check(int argc, const char **argv); extern int cmd_orc(int argc, const char **argv); diff --git a/tools/objtool/objtool.c b/tools/objtool/objtool.c index 7b97ce4..43c1836 100644 --- a/tools/objtool/objtool.c +++ b/tools/objtool/objtool.c @@ -17,6 +17,7 @@ #include #include #include +#include #include #include #include @@ -44,6 +45,64 @@ bool help; const char *objname; static struct objtool_file file; +static bool objtool_create_backup(const char *_objname) +{ + int len = strlen(_objname); + char *buf, *base, *name = malloc(len+6); + int s, d, l, t; + + if (!name) { + perror("failed backup name malloc"); + return false; + } + + strcpy(name, _objname); + strcpy(name + len, ".orig"); + + d = open(name, O_CREAT|O_WRONLY|O_TRUNC, 0644); + if (d < 0) { + perror("failed to create backup file"); + return false; + } + + s = open(_objname, O_RDONLY); + if (s < 0) { + perror("failed to open orig file"); + return false; + } + + buf = malloc(4096); + if (!buf) { + perror("failed backup data malloc"); + return false; + } + + while ((l = read(s, buf, 4096)) > 0) { + base = buf; + do { + t = write(d, base, l); + if (t < 0) { + perror("failed backup write"); + return false; + } + base += t; + l -= t; + } while (l); + } + + if (l < 0) { + perror("failed backup read"); + return false; + } + + free(name); + free(buf); + close(d); + close(s); + + return true; +} + struct objtool_file *objtool_open_read(const char *_objname) { if (objname) { @@ -59,6 +118,11 @@ struct objtool_file *objtool_open_read(const char *_objname) if (!file.elf) return NULL; + if (backup && !objtool_create_backup(objname)) { + WARN("can't create backup file"); + return NULL; + } + INIT_LIST_HEAD(_list); 
hash_init(file.insn_hash); INIT_LIST_HEAD(_call_list);
[PATCH] scsi: ufs: remove duplicate include in ufshcd
From: Zhang Yunkai

'blkdev.h' is included twice in 'ufshcd.c'; it is already included at line 18.

Signed-off-by: Zhang Yunkai
---
 drivers/scsi/ufs/ufshcd.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 77161750c9fb..9a564b6fd092 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -24,7 +24,6 @@
 #include "ufs_bsg.h"
 #include "ufshcd-crypto.h"
 #include
-#include <linux/blkdev.h>
 #define CREATE_TRACE_POINTS
 #include
--
2.25.1
[tip: sched/core] sched: Simplify set_affinity_pending refcounts
The following commit has been merged into the sched/core branch of tip: Commit-ID: 50caf9c14b1498c90cf808dbba2ca29bd32ccba4 Gitweb: https://git.kernel.org/tip/50caf9c14b1498c90cf808dbba2ca29bd32ccba4 Author:Peter Zijlstra AuthorDate:Wed, 24 Feb 2021 11:42:08 +01:00 Committer: Ingo Molnar CommitterDate: Sat, 06 Mar 2021 12:40:21 +01:00 sched: Simplify set_affinity_pending refcounts Now that we have set_affinity_pending::stop_pending to indicate if a stopper is in progress, and we have the guarantee that if that stopper exists, it will (eventually) complete our @pending we can simplify the refcount scheme by no longer counting the stopper thread. Fixes: 6d337eab041d ("sched: Fix migrate_disable() vs set_cpus_allowed_ptr()") Cc: sta...@kernel.org Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Ingo Molnar Reviewed-by: Valentin Schneider Link: https://lkml.kernel.org/r/20210224131355.724130...@infradead.org --- kernel/sched/core.c | 32 1 file changed, 20 insertions(+), 12 deletions(-) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 4e4d100..9819121 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -1862,6 +1862,10 @@ struct migration_arg { struct set_affinity_pending *pending; }; +/* + * @refs: number of wait_for_completion() + * @stop_pending: is @stop_work in use + */ struct set_affinity_pending { refcount_t refs; unsigned intstop_pending; @@ -1997,10 +2001,6 @@ out: if (complete) complete_all(>done); - /* For pending->{arg,stop_work} */ - if (pending && refcount_dec_and_test(>refs)) - wake_up_var(>refs); - return 0; } @@ -2199,12 +2199,16 @@ static int affine_move_task(struct rq *rq, struct task_struct *p, struct rq_flag push_task = get_task_struct(p); } + /* +* If there are pending waiters, but no pending stop_work, +* then complete now. 
+*/ pending = p->migration_pending; - if (pending) { - refcount_inc(>refs); + if (pending && !pending->stop_pending) { p->migration_pending = NULL; complete = true; } + task_rq_unlock(rq, p, rf); if (push_task) { @@ -2213,7 +2217,7 @@ static int affine_move_task(struct rq *rq, struct task_struct *p, struct rq_flag } if (complete) - goto do_complete; + complete_all(>done); return 0; } @@ -2264,9 +2268,9 @@ static int affine_move_task(struct rq *rq, struct task_struct *p, struct rq_flag if (!stop_pending) pending->stop_pending = true; - refcount_inc(>refs); /* pending->{arg,stop_work} */ if (flags & SCA_MIGRATE_ENABLE) p->migration_flags &= ~MDF_PUSH; + task_rq_unlock(rq, p, rf); if (!stop_pending) { @@ -2282,12 +2286,13 @@ static int affine_move_task(struct rq *rq, struct task_struct *p, struct rq_flag if (task_on_rq_queued(p)) rq = move_queued_task(rq, rf, p, dest_cpu); - p->migration_pending = NULL; - complete = true; + if (!pending->stop_pending) { + p->migration_pending = NULL; + complete = true; + } } task_rq_unlock(rq, p, rf); -do_complete: if (complete) complete_all(>done); } @@ -2295,7 +2300,7 @@ do_complete: wait_for_completion(>done); if (refcount_dec_and_test(>refs)) - wake_up_var(>refs); + wake_up_var(>refs); /* No UaF, just an address */ /* * Block the original owner of until all subsequent callers @@ -2303,6 +2308,9 @@ do_complete: */ wait_var_event(_pending.refs, !refcount_read(_pending.refs)); + /* ARGH */ + WARN_ON_ONCE(my_pending.stop_pending); + return 0; }
[tip: sched/core] sched: Collate affine_move_task() stoppers
The following commit has been merged into the sched/core branch of tip: Commit-ID: 58b1a45086b5f80f2b2842aa7ed0da51a64a302b Gitweb: https://git.kernel.org/tip/58b1a45086b5f80f2b2842aa7ed0da51a64a302b Author:Peter Zijlstra AuthorDate:Wed, 24 Feb 2021 11:15:23 +01:00 Committer: Ingo Molnar CommitterDate: Sat, 06 Mar 2021 12:40:21 +01:00 sched: Collate affine_move_task() stoppers The SCA_MIGRATE_ENABLE and task_running() cases are almost identical, collapse them to avoid further duplication. Fixes: 6d337eab041d ("sched: Fix migrate_disable() vs set_cpus_allowed_ptr()") Cc: sta...@kernel.org Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Ingo Molnar Reviewed-by: Valentin Schneider Link: https://lkml.kernel.org/r/20210224131355.500108...@infradead.org --- kernel/sched/core.c | 23 --- 1 file changed, 8 insertions(+), 15 deletions(-) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 088e8f4..84b657f 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -2239,30 +2239,23 @@ static int affine_move_task(struct rq *rq, struct task_struct *p, struct rq_flag return -EINVAL; } - if (flags & SCA_MIGRATE_ENABLE) { - - refcount_inc(>refs); /* pending->{arg,stop_work} */ - p->migration_flags &= ~MDF_PUSH; - task_rq_unlock(rq, p, rf); - - stop_one_cpu_nowait(cpu_of(rq), migration_cpu_stop, - >arg, >stop_work); - - return 0; - } - if (task_running(rq, p) || p->state == TASK_WAKING) { /* -* Lessen races (and headaches) by delegating -* is_migration_disabled(p) checks to the stopper, which will -* run on the same CPU as said p. +* MIGRATE_ENABLE gets here because 'p == current', but for +* anything else we cannot do is_migration_disabled(), punt +* and have the stopper function handle it all race-free. 
*/ + refcount_inc(>refs); /* pending->{arg,stop_work} */ + if (flags & SCA_MIGRATE_ENABLE) + p->migration_flags &= ~MDF_PUSH; task_rq_unlock(rq, p, rf); stop_one_cpu_nowait(cpu_of(rq), migration_cpu_stop, >arg, >stop_work); + if (flags & SCA_MIGRATE_ENABLE) + return 0; } else { if (!is_migration_disabled(p)) {
[tip: sched/core] sched: Simplify migration_cpu_stop()
The following commit has been merged into the sched/core branch of tip: Commit-ID: c20cf065d4a619d394d23290093b1002e27dff86 Gitweb: https://git.kernel.org/tip/c20cf065d4a619d394d23290093b1002e27dff86 Author:Peter Zijlstra AuthorDate:Wed, 24 Feb 2021 11:50:39 +01:00 Committer: Ingo Molnar CommitterDate: Sat, 06 Mar 2021 12:40:20 +01:00 sched: Simplify migration_cpu_stop() When affine_move_task() issues a migration_cpu_stop(), the purpose of that function is to complete that @pending, not any random other p->migration_pending that might have gotten installed since. This realization much simplifies migration_cpu_stop() and allows further necessary steps to fix all this as it provides the guarantee that @pending's stopper will complete @pending (and not some random other @pending). Fixes: 6d337eab041d ("sched: Fix migrate_disable() vs set_cpus_allowed_ptr()") Cc: sta...@kernel.org Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Ingo Molnar Reviewed-by: Valentin Schneider Link: https://lkml.kernel.org/r/20210224131355.430014...@infradead.org --- kernel/sched/core.c | 56 ++-- 1 file changed, 8 insertions(+), 48 deletions(-) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 79ddba5..088e8f4 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -1898,8 +1898,8 @@ static struct rq *__migrate_task(struct rq *rq, struct rq_flags *rf, */ static int migration_cpu_stop(void *data) { - struct set_affinity_pending *pending; struct migration_arg *arg = data; + struct set_affinity_pending *pending = arg->pending; struct task_struct *p = arg->task; int dest_cpu = arg->dest_cpu; struct rq *rq = this_rq(); @@ -1921,25 +1921,6 @@ static int migration_cpu_stop(void *data) raw_spin_lock(>pi_lock); rq_lock(rq, ); - pending = p->migration_pending; - if (pending && !arg->pending) { - /* -* This happens from sched_exec() and migrate_task_to(), -* neither of them care about pending and just want a task to -* maybe move about. 
-* -* Even if there is a pending, we can ignore it, since -* affine_move_task() will have it's own stop_work's in flight -* which will manage the completion. -* -* Notably, pending doesn't need to match arg->pending. This can -* happen when tripple concurrent affine_move_task() first sets -* pending, then clears pending and eventually sets another -* pending. -*/ - pending = NULL; - } - /* * If task_rq(p) != rq, it cannot be migrated here, because we're * holding rq->lock, if p->on_rq == 0 it cannot get enqueued because @@ -1950,31 +1931,20 @@ static int migration_cpu_stop(void *data) goto out; if (pending) { - p->migration_pending = NULL; + if (p->migration_pending == pending) + p->migration_pending = NULL; complete = true; } - /* migrate_enable() -- we must not race against SCA */ - if (dest_cpu < 0) { - /* -* When this was migrate_enable() but we no longer -* have a @pending, a concurrent SCA 'fixed' things -* and we should be valid again. Nothing to do. -*/ - if (!pending) { - WARN_ON_ONCE(!cpumask_test_cpu(task_cpu(p), >cpus_mask)); - goto out; - } - + if (dest_cpu < 0) dest_cpu = cpumask_any_distribute(>cpus_mask); - } if (task_on_rq_queued(p)) rq = __migrate_task(rq, , p, dest_cpu); else p->wake_cpu = dest_cpu; - } else if (dest_cpu < 0 || pending) { + } else if (pending) { /* * This happens when we get migrated between migrate_enable()'s * preempt_enable() and scheduling the stopper task. At that @@ -1989,23 +1959,14 @@ static int migration_cpu_stop(void *data) * ->pi_lock, so the allowed mask is stable - if it got * somewhere allowed, we're done. */ - if (pending && cpumask_test_cpu(task_cpu(p), p->cpus_ptr)) { - p->migration_pending = NULL; + if (cpumask_test_cpu(task_cpu(p), p->cpus_ptr)) { + if (p->migration_pending == pending) + p->migration_pending = NULL; complete = true;
[tip: sched/core] sched: Optimize migration_cpu_stop()
The following commit has been merged into the sched/core branch of tip: Commit-ID: 3f1bc119cd7fc987c8ed25ffb717f99403bb308c Gitweb: https://git.kernel.org/tip/3f1bc119cd7fc987c8ed25ffb717f99403bb308c Author:Peter Zijlstra AuthorDate:Wed, 24 Feb 2021 11:21:35 +01:00 Committer: Ingo Molnar CommitterDate: Sat, 06 Mar 2021 12:40:21 +01:00 sched: Optimize migration_cpu_stop() When the purpose of migration_cpu_stop() is to migrate the task to 'any' valid CPU, don't migrate the task when it's already running on a valid CPU. Fixes: 6d337eab041d ("sched: Fix migrate_disable() vs set_cpus_allowed_ptr()") Cc: sta...@kernel.org Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Ingo Molnar Reviewed-by: Valentin Schneider Link: https://lkml.kernel.org/r/20210224131355.569238...@infradead.org --- kernel/sched/core.c | 13 - 1 file changed, 12 insertions(+), 1 deletion(-) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 84b657f..ac05afb 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -1936,14 +1936,25 @@ static int migration_cpu_stop(void *data) complete = true; } - if (dest_cpu < 0) + if (dest_cpu < 0) { + if (cpumask_test_cpu(task_cpu(p), >cpus_mask)) + goto out; + dest_cpu = cpumask_any_distribute(>cpus_mask); + } if (task_on_rq_queued(p)) rq = __migrate_task(rq, , p, dest_cpu); else p->wake_cpu = dest_cpu; + /* +* XXX __migrate_task() can fail, at which point we might end +* up running on a dodgy CPU, AFAICT this can only happen +* during CPU hotplug, at which point we'll get pushed out +* anyway, so it's probably not a big deal. +*/ + } else if (pending) { /* * This happens when we get migrated between migrate_enable()'s
[tip: sched/core] sched: Fix migration_cpu_stop() requeueing
The following commit has been merged into the sched/core branch of tip: Commit-ID: 8a6edb5257e2a84720fe78cb179eca58ba76126f Gitweb: https://git.kernel.org/tip/8a6edb5257e2a84720fe78cb179eca58ba76126f Author:Peter Zijlstra AuthorDate:Sat, 13 Feb 2021 13:10:35 +01:00 Committer: Ingo Molnar CommitterDate: Sat, 06 Mar 2021 12:40:20 +01:00 sched: Fix migration_cpu_stop() requeueing When affine_move_task(p) is called on a running task @p, which is not otherwise already changing affinity, we'll first set p->migration_pending and then do: stop_one_cpu(cpu_of_rq(rq), migration_cpu_stop, &arg); This then gets us to migration_cpu_stop() running on the CPU that was previously running our victim task @p. If we find that our task is no longer on that runqueue (this can happen because of a concurrent migration due to load-balance etc.), then we'll end up at the: } else if (dest_cpu < 0 || pending) { branch. Which we'll take because we set pending earlier. Here we first check if the task @p has already satisfied the affinity constraints, if so we bail early [A]. Otherwise we'll reissue migration_cpu_stop() onto the CPU that is now hosting our task @p: stop_one_cpu_nowait(cpu_of(rq), migration_cpu_stop, &pending->arg, &pending->stop_work); Except, we've never initialized pending->arg, which will be all 0s. This then results in running migration_cpu_stop() on the next CPU with arg->p == NULL, which gives the by now obvious result of fireworks. The cure is to change affine_move_task() to always use pending->arg, furthermore we can use the exact same pattern as the SCA_MIGRATE_ENABLE case, since we'll block on the pending->done completion anyway, no point in adding yet another completion in stop_one_cpu(). This then gives a clear distinction between the two migration_cpu_stop() use cases: - sched_exec() / migrate_task_to() : arg->pending == NULL - affine_move_task() : arg->pending != NULL; And we can have it ignore p->migration_pending when !arg->pending. 
Any stop work from sched_exec() / migrate_task_to() is in addition to stop works from affine_move_task(), which will be sufficient to issue the completion. Fixes: 6d337eab041d ("sched: Fix migrate_disable() vs set_cpus_allowed_ptr()") Cc: sta...@kernel.org Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Ingo Molnar Reviewed-by: Valentin Schneider Link: https://lkml.kernel.org/r/20210224131355.357743...@infradead.org --- kernel/sched/core.c | 39 --- 1 file changed, 28 insertions(+), 11 deletions(-) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index ca2bb62..79ddba5 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -1922,6 +1922,24 @@ static int migration_cpu_stop(void *data) rq_lock(rq, ); pending = p->migration_pending; + if (pending && !arg->pending) { + /* +* This happens from sched_exec() and migrate_task_to(), +* neither of them care about pending and just want a task to +* maybe move about. +* +* Even if there is a pending, we can ignore it, since +* affine_move_task() will have it's own stop_work's in flight +* which will manage the completion. +* +* Notably, pending doesn't need to match arg->pending. This can +* happen when tripple concurrent affine_move_task() first sets +* pending, then clears pending and eventually sets another +* pending. +*/ + pending = NULL; + } + /* * If task_rq(p) != rq, it cannot be migrated here, because we're * holding rq->lock, if p->on_rq == 0 it cannot get enqueued because @@ -2194,10 +2212,6 @@ static int affine_move_task(struct rq *rq, struct task_struct *p, struct rq_flag int dest_cpu, unsigned int flags) { struct set_affinity_pending my_pending = { }, *pending = NULL; - struct migration_arg arg = { - .task = p, - .dest_cpu = dest_cpu, - }; bool complete = false; /* Can the task run on the task's current CPU? 
If so, we're done */ @@ -2235,6 +2249,12 @@ static int affine_move_task(struct rq *rq, struct task_struct *p, struct rq_flag /* Install the request */ refcount_set(_pending.refs, 1); init_completion(_pending.done); + my_pending.arg = (struct migration_arg) { + .task = p, + .dest_cpu = -1, /* any */ + .pending = _pending, + }; + p->migration_pending = _pending; } else { pending = p->migration_pending; @@ -2265,12 +2285,6 @@
[tip: sched/core] sched: Fix affine_move_task() self-concurrency
The following commit has been merged into the sched/core branch of tip: Commit-ID: 9e81889c7648d48dd5fe13f41cbc99f3c362484a Gitweb: https://git.kernel.org/tip/9e81889c7648d48dd5fe13f41cbc99f3c362484a Author:Peter Zijlstra AuthorDate:Wed, 24 Feb 2021 11:31:09 +01:00 Committer: Ingo Molnar CommitterDate: Sat, 06 Mar 2021 12:40:21 +01:00 sched: Fix affine_move_task() self-concurrency Consider: sched_setaffinity(p, X); sched_setaffinity(p, Y); Then the first will install p->migration_pending = _pending; and issue stop_one_cpu_nowait(pending); and the second one will read p->migration_pending and _also_ issue: stop_one_cpu_nowait(pending), the _SAME_ @pending. This causes stopper list corruption. Add set_affinity_pending::stop_pending, to indicate if a stopper is in progress. Fixes: 6d337eab041d ("sched: Fix migrate_disable() vs set_cpus_allowed_ptr()") Cc: sta...@kernel.org Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Ingo Molnar Reviewed-by: Valentin Schneider Link: https://lkml.kernel.org/r/20210224131355.649146...@infradead.org --- kernel/sched/core.c | 15 --- 1 file changed, 12 insertions(+), 3 deletions(-) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index ac05afb..4e4d100 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -1864,6 +1864,7 @@ struct migration_arg { struct set_affinity_pending { refcount_t refs; + unsigned intstop_pending; struct completion done; struct cpu_stop_workstop_work; struct migration_argarg; @@ -1982,12 +1983,15 @@ static int migration_cpu_stop(void *data) * determine is_migration_disabled() and so have to chase after * it. 
*/ + WARN_ON_ONCE(!pending->stop_pending); task_rq_unlock(rq, p, ); stop_one_cpu_nowait(task_cpu(p), migration_cpu_stop, >arg, >stop_work); return 0; } out: + if (pending) + pending->stop_pending = false; task_rq_unlock(rq, p, ); if (complete) @@ -2183,7 +2187,7 @@ static int affine_move_task(struct rq *rq, struct task_struct *p, struct rq_flag int dest_cpu, unsigned int flags) { struct set_affinity_pending my_pending = { }, *pending = NULL; - bool complete = false; + bool stop_pending, complete = false; /* Can the task run on the task's current CPU? If so, we're done */ if (cpumask_test_cpu(task_cpu(p), >cpus_mask)) { @@ -2256,14 +2260,19 @@ static int affine_move_task(struct rq *rq, struct task_struct *p, struct rq_flag * anything else we cannot do is_migration_disabled(), punt * and have the stopper function handle it all race-free. */ + stop_pending = pending->stop_pending; + if (!stop_pending) + pending->stop_pending = true; refcount_inc(>refs); /* pending->{arg,stop_work} */ if (flags & SCA_MIGRATE_ENABLE) p->migration_flags &= ~MDF_PUSH; task_rq_unlock(rq, p, rf); - stop_one_cpu_nowait(cpu_of(rq), migration_cpu_stop, - >arg, >stop_work); + if (!stop_pending) { + stop_one_cpu_nowait(cpu_of(rq), migration_cpu_stop, + >arg, >stop_work); + } if (flags & SCA_MIGRATE_ENABLE) return 0;
[tip: sched/core] sched/membarrier: fix missing local execution of ipi_sync_rq_state()
The following commit has been merged into the sched/core branch of tip: Commit-ID: ce29ddc47b91f97e7f69a0fb7cbb5845f52a9825 Gitweb: https://git.kernel.org/tip/ce29ddc47b91f97e7f69a0fb7cbb5845f52a9825 Author:Mathieu Desnoyers AuthorDate:Wed, 17 Feb 2021 11:56:51 -05:00 Committer: Ingo Molnar CommitterDate: Sat, 06 Mar 2021 12:40:21 +01:00 sched/membarrier: fix missing local execution of ipi_sync_rq_state() The function sync_runqueues_membarrier_state() should copy the membarrier state from the @mm received as parameter to each runqueue currently running tasks using that mm. However, the use of smp_call_function_many() skips the current runqueue, which is unintended. Replace by a call to on_each_cpu_mask(). Fixes: 227a4aadc75b ("sched/membarrier: Fix p->mm->membarrier_state racy load") Reported-by: Nadav Amit Signed-off-by: Mathieu Desnoyers Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Ingo Molnar Cc: sta...@vger.kernel.org # 5.4.x+ Link: https://lore.kernel.org/r/74f1e842-4a84-47bf-b6c2-5407dfdd4...@gmail.com --- kernel/sched/membarrier.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/kernel/sched/membarrier.c b/kernel/sched/membarrier.c index acdae62..b5add64 100644 --- a/kernel/sched/membarrier.c +++ b/kernel/sched/membarrier.c @@ -471,9 +471,7 @@ static int sync_runqueues_membarrier_state(struct mm_struct *mm) } rcu_read_unlock(); - preempt_disable(); - smp_call_function_many(tmpmask, ipi_sync_rq_state, mm, 1); - preempt_enable(); + on_each_cpu_mask(tmpmask, ipi_sync_rq_state, mm, true); free_cpumask_var(tmpmask); cpus_read_unlock();
[tip: sched/core] sched/fair: Remove update of blocked load from newidle_balance
The following commit has been merged into the sched/core branch of tip: Commit-ID: 0826530de3cbdc89e60a89e86def94a5f0fc81ca Gitweb: https://git.kernel.org/tip/0826530de3cbdc89e60a89e86def94a5f0fc81ca Author:Vincent Guittot AuthorDate:Wed, 24 Feb 2021 14:30:01 +01:00 Committer: Ingo Molnar CommitterDate: Sat, 06 Mar 2021 12:40:21 +01:00 sched/fair: Remove update of blocked load from newidle_balance newidle_balance runs with both preempt and irq disabled which prevent local irq to run during this period. The duration for updating the blocked load of CPUs varies according to the number of CPU cgroups with non-decayed load and extends this critical period to an uncontrolled level. Remove the update from newidle_balance and trigger a normal ILB that will take care of the update instead. This reduces the IRQ latency from O(nr_cgroups * nr_nohz_cpus) to O(nr_cgroups). Signed-off-by: Vincent Guittot Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Ingo Molnar Reviewed-by: Valentin Schneider Link: https://lkml.kernel.org/r/20210224133007.28644-2-vincent.guit...@linaro.org --- kernel/sched/fair.c | 33 + 1 file changed, 5 insertions(+), 28 deletions(-) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 794c2cb..806e16f 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -7392,8 +7392,6 @@ enum migration_type { #define LBF_NEED_BREAK 0x02 #define LBF_DST_PINNED 0x04 #define LBF_SOME_PINNED0x08 -#define LBF_NOHZ_STATS 0x10 -#define LBF_NOHZ_AGAIN 0x20 struct lb_env { struct sched_domain *sd; @@ -8397,9 +8395,6 @@ static inline void update_sg_lb_stats(struct lb_env *env, for_each_cpu_and(i, sched_group_span(group), env->cpus) { struct rq *rq = cpu_rq(i); - if ((env->flags & LBF_NOHZ_STATS) && update_nohz_stats(rq, false)) - env->flags |= LBF_NOHZ_AGAIN; - sgs->group_load += cpu_load(rq); sgs->group_util += cpu_util(i); sgs->group_runnable += cpu_runnable(rq); @@ -8940,11 +8935,6 @@ static inline void update_sd_lb_stats(struct lb_env *env, struct 
sd_lb_stats *sd struct sg_lb_stats tmp_sgs; int sg_status = 0; -#ifdef CONFIG_NO_HZ_COMMON - if (env->idle == CPU_NEWLY_IDLE && READ_ONCE(nohz.has_blocked)) - env->flags |= LBF_NOHZ_STATS; -#endif - do { struct sg_lb_stats *sgs = _sgs; int local_group; @@ -8981,14 +8971,6 @@ next_group: /* Tag domain that child domain prefers tasks go to siblings first */ sds->prefer_sibling = child && child->flags & SD_PREFER_SIBLING; -#ifdef CONFIG_NO_HZ_COMMON - if ((env->flags & LBF_NOHZ_AGAIN) && - cpumask_subset(nohz.idle_cpus_mask, sched_domain_span(env->sd))) { - - WRITE_ONCE(nohz.next_blocked, - jiffies + msecs_to_jiffies(LOAD_AVG_PERIOD)); - } -#endif if (env->sd->flags & SD_NUMA) env->fbq_type = fbq_classify_group(>busiest_stat); @@ -10517,16 +10499,11 @@ static void nohz_newidle_balance(struct rq *this_rq) time_before(jiffies, READ_ONCE(nohz.next_blocked))) return; - raw_spin_unlock(_rq->lock); /* -* This CPU is going to be idle and blocked load of idle CPUs -* need to be updated. Run the ilb locally as it is a good -* candidate for ilb instead of waking up another idle CPU. -* Kick an normal ilb if we failed to do the update. +* Blocked load of idle CPUs need to be updated. +* Kick an ILB to update statistics. */ - if (!_nohz_idle_balance(this_rq, NOHZ_STATS_KICK, CPU_NEWLY_IDLE)) - kick_ilb(NOHZ_STATS_KICK); - raw_spin_lock(_rq->lock); + kick_ilb(NOHZ_STATS_KICK); } #else /* !CONFIG_NO_HZ_COMMON */ @@ -10587,8 +10564,6 @@ static int newidle_balance(struct rq *this_rq, struct rq_flags *rf) update_next_balance(sd, _balance); rcu_read_unlock(); - nohz_newidle_balance(this_rq); - goto out; } @@ -10654,6 +10629,8 @@ out: if (pulled_task) this_rq->idle_stamp = 0; + else + nohz_newidle_balance(this_rq); rq_repin_lock(this_rq, rf);
[tip: sched/core] sched/fair: Remove unused parameter of update_nohz_stats
The following commit has been merged into the sched/core branch of tip: Commit-ID: 64f84f273592d17dcdca20244168ad9f525a39c3 Gitweb: https://git.kernel.org/tip/64f84f273592d17dcdca20244168ad9f525a39c3 Author:Vincent Guittot AuthorDate:Wed, 24 Feb 2021 14:30:03 +01:00 Committer: Ingo Molnar CommitterDate: Sat, 06 Mar 2021 12:40:21 +01:00 sched/fair: Remove unused parameter of update_nohz_stats idle load balance is the only user of update_nohz_stats and doesn't use force parameter. Remove it Signed-off-by: Vincent Guittot Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Ingo Molnar Reviewed-by: Valentin Schneider Link: https://lkml.kernel.org/r/20210224133007.28644-4-vincent.guit...@linaro.org --- kernel/sched/fair.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 6a458e9..1b91030 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -8352,7 +8352,7 @@ group_type group_classify(unsigned int imbalance_pct, return group_has_spare; } -static bool update_nohz_stats(struct rq *rq, bool force) +static bool update_nohz_stats(struct rq *rq) { #ifdef CONFIG_NO_HZ_COMMON unsigned int cpu = rq->cpu; @@ -8363,7 +8363,7 @@ static bool update_nohz_stats(struct rq *rq, bool force) if (!cpumask_test_cpu(cpu, nohz.idle_cpus_mask)) return false; - if (!force && !time_after(jiffies, rq->last_blocked_load_update_tick)) + if (!time_after(jiffies, rq->last_blocked_load_update_tick)) return true; update_blocked_averages(cpu); @@ -10401,7 +10401,7 @@ static void _nohz_idle_balance(struct rq *this_rq, unsigned int flags, rq = cpu_rq(balance_cpu); - has_blocked_load |= update_nohz_stats(rq, true); + has_blocked_load |= update_nohz_stats(rq); /* * If time for next balance is due,
[tip: sched/core] kcov: Remove kcov include from sched.h and move it to its users.
The following commit has been merged into the sched/core branch of tip: Commit-ID: 183f47fcaa54a5ffe671d990186d330ac8c63b10 Gitweb: https://git.kernel.org/tip/183f47fcaa54a5ffe671d990186d330ac8c63b10 Author:Sebastian Andrzej Siewior AuthorDate:Thu, 18 Feb 2021 18:31:24 +01:00 Committer: Ingo Molnar CommitterDate: Sat, 06 Mar 2021 12:40:21 +01:00 kcov: Remove kcov include from sched.h and move it to its users. The recent addition of in_serving_softirq() to kcov.h results in compile failure on PREEMPT_RT because it requires task_struct::softirq_disable_cnt. This is not available if kcov.h is included from sched.h. It is not needed to include kcov.h from sched.h. All but the net/ user already include the kcov header file. Move the include of the kcov.h header from sched.h to its users. Additionally include sched.h from kcov.h to ensure that everything task_struct related is available. Signed-off-by: Sebastian Andrzej Siewior Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Ingo Molnar Acked-by: Johannes Berg Acked-by: Andrey Konovalov Link: https://lkml.kernel.org/r/20210218173124.iy5iyqv3a4oia...@linutronix.de --- drivers/usb/usbip/usbip_common.h | 1 + include/linux/kcov.h | 1 + include/linux/sched.h| 1 - net/core/skbuff.c| 1 + net/mac80211/iface.c | 1 + net/mac80211/rx.c| 1 + 6 files changed, 5 insertions(+), 1 deletion(-) diff --git a/drivers/usb/usbip/usbip_common.h b/drivers/usb/usbip/usbip_common.h index d60ce17..a7dd6c6 100644 --- a/drivers/usb/usbip/usbip_common.h +++ b/drivers/usb/usbip/usbip_common.h @@ -18,6 +18,7 @@ #include #include #include +#include #include #undef pr_fmt diff --git a/include/linux/kcov.h b/include/linux/kcov.h index 4e3037d..55dc338 100644 --- a/include/linux/kcov.h +++ b/include/linux/kcov.h @@ -2,6 +2,7 @@ #ifndef _LINUX_KCOV_H #define _LINUX_KCOV_H +#include #include struct task_struct; diff --git a/include/linux/sched.h b/include/linux/sched.h index ef00bb2..cf245bc 100644 --- a/include/linux/sched.h +++ 
b/include/linux/sched.h @@ -14,7 +14,6 @@ #include #include #include -#include #include #include #include diff --git a/net/core/skbuff.c b/net/core/skbuff.c index 545a472..420f23c 100644 --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -60,6 +60,7 @@ #include #include #include +#include #include #include diff --git a/net/mac80211/iface.c b/net/mac80211/iface.c index b80c9b0..c127deb 100644 --- a/net/mac80211/iface.c +++ b/net/mac80211/iface.c @@ -15,6 +15,7 @@ #include #include #include +#include #include #include #include "ieee80211_i.h" diff --git a/net/mac80211/rx.c b/net/mac80211/rx.c index c1343c0..62047e9 100644 --- a/net/mac80211/rx.c +++ b/net/mac80211/rx.c @@ -17,6 +17,7 @@ #include #include #include +#include #include #include #include
[tip: sched/core] sched/fair: Remove unused return of _nohz_idle_balance
The following commit has been merged into the sched/core branch of tip: Commit-ID: ab2dde5e98db23387147fb4e7a52b6cf8141cdb3 Gitweb: https://git.kernel.org/tip/ab2dde5e98db23387147fb4e7a52b6cf8141cdb3 Author:Vincent Guittot AuthorDate:Wed, 24 Feb 2021 14:30:02 +01:00 Committer: Ingo Molnar CommitterDate: Sat, 06 Mar 2021 12:40:21 +01:00 sched/fair: Remove unused return of _nohz_idle_balance The return of _nohz_idle_balance() is not used anymore so we can remove it Signed-off-by: Vincent Guittot Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Ingo Molnar Reviewed-by: Valentin Schneider Link: https://lkml.kernel.org/r/20210224133007.28644-3-vincent.guit...@linaro.org --- kernel/sched/fair.c | 10 +- 1 file changed, 1 insertion(+), 9 deletions(-) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 806e16f..6a458e9 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -10354,10 +10354,8 @@ out: * Internal function that runs load balance for all idle cpus. The load balance * can be a simple update of blocked load or a complete load balance with * tasks movement depending of flags. - * The function returns false if the loop has stopped before running - * through all idle CPUs. 
*/ -static bool _nohz_idle_balance(struct rq *this_rq, unsigned int flags, +static void _nohz_idle_balance(struct rq *this_rq, unsigned int flags, enum cpu_idle_type idle) { /* Earliest time when we have to do rebalance again */ @@ -10367,7 +10365,6 @@ static bool _nohz_idle_balance(struct rq *this_rq, unsigned int flags, int update_next_balance = 0; int this_cpu = this_rq->cpu; int balance_cpu; - int ret = false; struct rq *rq; SCHED_WARN_ON((flags & NOHZ_KICK_MASK) == NOHZ_BALANCE_KICK); @@ -10447,15 +10444,10 @@ static bool _nohz_idle_balance(struct rq *this_rq, unsigned int flags, WRITE_ONCE(nohz.next_blocked, now + msecs_to_jiffies(LOAD_AVG_PERIOD)); - /* The full idle balance loop has been done */ - ret = true; - abort: /* There is still blocked load, enable periodic update */ if (has_blocked_load) WRITE_ONCE(nohz.has_blocked, 1); - - return ret; } /*
[tip: sched/core] sched/fair: Reorder newidle_balance pulled_task tests
The following commit has been merged into the sched/core branch of tip: Commit-ID: 6553fc18179113a11835d5fde1735259f8943a55 Gitweb: https://git.kernel.org/tip/6553fc18179113a11835d5fde1735259f8943a55 Author:Vincent Guittot AuthorDate:Wed, 24 Feb 2021 14:30:05 +01:00 Committer: Ingo Molnar CommitterDate: Sat, 06 Mar 2021 12:40:21 +01:00 sched/fair: Reorder newidle_balance pulled_task tests Reorder the tests and skip useless ones when no load balance has been performed and rq lock has not been released. Signed-off-by: Vincent Guittot Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Ingo Molnar Reviewed-by: Valentin Schneider Link: https://lkml.kernel.org/r/20210224133007.28644-6-vincent.guit...@linaro.org --- kernel/sched/fair.c | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 3c00918..356a245 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -10584,7 +10584,6 @@ static int newidle_balance(struct rq *this_rq, struct rq_flags *rf) if (curr_cost > this_rq->max_idle_balance_cost) this_rq->max_idle_balance_cost = curr_cost; -out: /* * While browsing the domains, we released the rq lock, a task could * have been enqueued in the meantime. Since we're not going idle, @@ -10593,14 +10592,15 @@ out: if (this_rq->cfs.h_nr_running && !pulled_task) pulled_task = 1; - /* Move the next balance forward */ - if (time_after(this_rq->next_balance, next_balance)) - this_rq->next_balance = next_balance; - /* Is there a task of a high priority class? */ if (this_rq->nr_running != this_rq->cfs.h_nr_running) pulled_task = -1; +out: + /* Move the next balance forward */ + if (time_after(this_rq->next_balance, next_balance)) + this_rq->next_balance = next_balance; + if (pulled_task) this_rq->idle_stamp = 0; else
[tip: sched/core] sched: Simplify migration_cpu_stop()
The following commit has been merged into the sched/core branch of tip: Commit-ID: e140749c9f194d65f5984a5941e46758377c93c0 Gitweb: https://git.kernel.org/tip/e140749c9f194d65f5984a5941e46758377c93c0 Author:Valentin Schneider AuthorDate:Thu, 25 Feb 2021 10:22:30 +01:00 Committer: Ingo Molnar CommitterDate: Sat, 06 Mar 2021 12:40:21 +01:00 sched: Simplify migration_cpu_stop() Since, when ->stop_pending, only the stopper can uninstall p->migration_pending. This could simplify a few ifs, because: (pending != NULL) => (pending == p->migration_pending) Also, the fatty comment above affine_move_task() probably needs a bit of gardening. Signed-off-by: Valentin Schneider Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Ingo Molnar --- kernel/sched/core.c | 27 ++- 1 file changed, 18 insertions(+), 9 deletions(-) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 9819121..f9dfb34 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -1927,6 +1927,12 @@ static int migration_cpu_stop(void *data) rq_lock(rq, ); /* +* If we were passed a pending, then ->stop_pending was set, thus +* p->migration_pending must have remained stable. +*/ + WARN_ON_ONCE(pending && pending != p->migration_pending); + + /* * If task_rq(p) != rq, it cannot be migrated here, because we're * holding rq->lock, if p->on_rq == 0 it cannot get enqueued because * we're holding p->pi_lock. @@ -1936,8 +1942,7 @@ static int migration_cpu_stop(void *data) goto out; if (pending) { - if (p->migration_pending == pending) - p->migration_pending = NULL; + p->migration_pending = NULL; complete = true; } @@ -1976,8 +1981,7 @@ static int migration_cpu_stop(void *data) * somewhere allowed, we're done. 
*/ if (cpumask_test_cpu(task_cpu(p), p->cpus_ptr)) { - if (p->migration_pending == pending) - p->migration_pending = NULL; + p->migration_pending = NULL; complete = true; goto out; } @@ -2165,16 +2169,21 @@ void do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask) * * (1) In the cases covered above. There is one more where the completion is * signaled within affine_move_task() itself: when a subsequent affinity request - * cancels the need for an active migration. Consider: + * occurs after the stopper bailed out due to the targeted task still being + * Migrate-Disable. Consider: * * Initial conditions: P0->cpus_mask = [0, 1] * - * P0@CPU0P1 P2 - * - * migrate_disable(); - * + * CPU0 P1P2 + * + * migrate_disable(); + * *set_cpus_allowed_ptr(P0, [1]); * + * + * migration_cpu_stop() + * is_migration_disabled() + * * set_cpus_allowed_ptr(P0, [0, 1]); * *
[tip: sched/core] sched/fair: Trigger the update of blocked load on newly idle cpu
The following commit has been merged into the sched/core branch of tip: Commit-ID: c6f886546cb8a38617cdbe755fe50d3acd2463e4 Gitweb: https://git.kernel.org/tip/c6f886546cb8a38617cdbe755fe50d3acd2463e4 Author:Vincent Guittot AuthorDate:Wed, 24 Feb 2021 14:30:06 +01:00 Committer: Ingo Molnar CommitterDate: Sat, 06 Mar 2021 12:40:22 +01:00 sched/fair: Trigger the update of blocked load on newly idle cpu Instead of waking up a random and already idle CPU, we can take advantage of this_cpu being about to enter idle to run the ILB and update the blocked load. Signed-off-by: Vincent Guittot Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Ingo Molnar Reviewed-by: Valentin Schneider Link: https://lkml.kernel.org/r/20210224133007.28644-7-vincent.guit...@linaro.org --- kernel/sched/core.c | 2 +- kernel/sched/fair.c | 24 +--- kernel/sched/idle.c | 6 ++ kernel/sched/sched.h | 7 +++ 4 files changed, 35 insertions(+), 4 deletions(-) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index f9dfb34..361974e 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -737,7 +737,7 @@ static void nohz_csd_func(void *info) /* * Release the rq::nohz_csd. */ - flags = atomic_fetch_andnot(NOHZ_KICK_MASK, nohz_flags(cpu)); + flags = atomic_fetch_andnot(NOHZ_KICK_MASK | NOHZ_NEWILB_KICK, nohz_flags(cpu)); WARN_ON(!(flags & NOHZ_KICK_MASK)); rq->idle_balance = idle_cpu(cpu); diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 356a245..e87e1b3 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -10453,6 +10453,24 @@ static bool nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle) return true; } +/* + * Check if we need to run the ILB for updating blocked load before entering + * idle state. + */ +void nohz_run_idle_balance(int cpu) +{ + unsigned int flags; + + flags = atomic_fetch_andnot(NOHZ_NEWILB_KICK, nohz_flags(cpu)); + + /* +* Update the blocked load only if no SCHED_SOFTIRQ is about to happen +* (ie NOHZ_STATS_KICK set) and will do the same. 
+*/ + if ((flags == NOHZ_NEWILB_KICK) && !need_resched()) + _nohz_idle_balance(cpu_rq(cpu), NOHZ_STATS_KICK, CPU_IDLE); +} + static void nohz_newidle_balance(struct rq *this_rq) { int this_cpu = this_rq->cpu; @@ -10474,10 +10492,10 @@ static void nohz_newidle_balance(struct rq *this_rq) return; /* -* Blocked load of idle CPUs need to be updated. -* Kick an ILB to update statistics. +* Set the need to trigger ILB in order to update blocked load +* before entering idle state. */ - kick_ilb(NOHZ_STATS_KICK); + atomic_or(NOHZ_NEWILB_KICK, nohz_flags(this_cpu)); } #else /* !CONFIG_NO_HZ_COMMON */ diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c index 7199e6f..7a92d60 100644 --- a/kernel/sched/idle.c +++ b/kernel/sched/idle.c @@ -261,6 +261,12 @@ exit_idle: static void do_idle(void) { int cpu = smp_processor_id(); + + /* +* Check if we need to update blocked load +*/ + nohz_run_idle_balance(cpu); + /* * If the arch has a polling bit, we maintain an invariant: * diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h index 10a1522..0ddc9a6 100644 --- a/kernel/sched/sched.h +++ b/kernel/sched/sched.h @@ -2385,9 +2385,11 @@ extern void cfs_bandwidth_usage_dec(void); #ifdef CONFIG_NO_HZ_COMMON #define NOHZ_BALANCE_KICK_BIT 0 #define NOHZ_STATS_KICK_BIT1 +#define NOHZ_NEWILB_KICK_BIT 2 #define NOHZ_BALANCE_KICK BIT(NOHZ_BALANCE_KICK_BIT) #define NOHZ_STATS_KICKBIT(NOHZ_STATS_KICK_BIT) +#define NOHZ_NEWILB_KICK BIT(NOHZ_NEWILB_KICK_BIT) #define NOHZ_KICK_MASK (NOHZ_BALANCE_KICK | NOHZ_STATS_KICK) @@ -2398,6 +2400,11 @@ extern void nohz_balance_exit_idle(struct rq *rq); static inline void nohz_balance_exit_idle(struct rq *rq) { } #endif +#if defined(CONFIG_SMP) && defined(CONFIG_NO_HZ_COMMON) +extern void nohz_run_idle_balance(int cpu); +#else +static inline void nohz_run_idle_balance(int cpu) { } +#endif #ifdef CONFIG_SMP static inline
[tip: sched/core] sched/fair: Merge for each idle cpu loop of ILB
The following commit has been merged into the sched/core branch of tip: Commit-ID: 7a82e5f52a3506bc35a4dc04d53ad2c9daf82e7f Gitweb: https://git.kernel.org/tip/7a82e5f52a3506bc35a4dc04d53ad2c9daf82e7f Author:Vincent Guittot AuthorDate:Wed, 24 Feb 2021 14:30:04 +01:00 Committer: Ingo Molnar CommitterDate: Sat, 06 Mar 2021 12:40:21 +01:00 sched/fair: Merge for each idle cpu loop of ILB Remove the specific case for handling this_cpu outside for_each_cpu() loop when running ILB. Instead we use for_each_cpu_wrap() and start with the next cpu after this_cpu so we will continue to finish with this_cpu. update_nohz_stats() is now used for this_cpu too and will prevents unnecessary update. We don't need a special case for handling the update of nohz.next_balance for this_cpu anymore because it is now handled by the loop like others. Signed-off-by: Vincent Guittot Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Ingo Molnar Reviewed-by: Valentin Schneider Link: https://lkml.kernel.org/r/20210224133007.28644-5-vincent.guit...@linaro.org --- kernel/sched/fair.c | 32 +++- 1 file changed, 7 insertions(+), 25 deletions(-) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 1b91030..3c00918 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -10043,22 +10043,9 @@ out: * When the cpu is attached to null domain for ex, it will not be * updated. */ - if (likely(update_next_balance)) { + if (likely(update_next_balance)) rq->next_balance = next_balance; -#ifdef CONFIG_NO_HZ_COMMON - /* -* If this CPU has been elected to perform the nohz idle -* balance. Other idle CPUs have already rebalanced with -* nohz_idle_balance() and nohz.next_balance has been -* updated accordingly. This CPU is now running the idle load -* balance for itself and we need to update the -* nohz.next_balance accordingly. 
-*/ - if ((idle == CPU_IDLE) && time_after(nohz.next_balance, rq->next_balance)) - nohz.next_balance = rq->next_balance; -#endif - } } static inline int on_null_domain(struct rq *rq) @@ -10385,8 +10372,12 @@ static void _nohz_idle_balance(struct rq *this_rq, unsigned int flags, */ smp_mb(); - for_each_cpu(balance_cpu, nohz.idle_cpus_mask) { - if (balance_cpu == this_cpu || !idle_cpu(balance_cpu)) + /* +* Start with the next CPU after this_cpu so we will end with this_cpu and let a +* chance for other idle cpu to pull load. +*/ + for_each_cpu_wrap(balance_cpu, nohz.idle_cpus_mask, this_cpu+1) { + if (!idle_cpu(balance_cpu)) continue; /* @@ -10432,15 +10423,6 @@ static void _nohz_idle_balance(struct rq *this_rq, unsigned int flags, if (likely(update_next_balance)) nohz.next_balance = next_balance; - /* Newly idle CPU doesn't need an update */ - if (idle != CPU_NEWLY_IDLE) { - update_blocked_averages(this_cpu); - has_blocked_load |= this_rq->has_blocked_load; - } - - if (flags & NOHZ_BALANCE_KICK) - rebalance_domains(this_rq, CPU_IDLE); - WRITE_ONCE(nohz.next_blocked, now + msecs_to_jiffies(LOAD_AVG_PERIOD));
Re: [PATCH RESEND][next] hwmon: (corsair-cpro) Fix fall-through warnings for Clang
On 06.03.21 at 10:53:59 CET, Gustavo A. R. Silva wrote > In preparation to enable -Wimplicit-fallthrough for Clang, fix a warning > by explicitly adding a break statement instead of letting the code fall > through to the next case. > > Link: https://github.com/KSPP/linux/issues/115 > Acked-by: Guenter Roeck > Signed-off-by: Gustavo A. R. Silva Acked-by: Marius Zachmann > --- > drivers/hwmon/corsair-cpro.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/drivers/hwmon/corsair-cpro.c b/drivers/hwmon/corsair-cpro.c > index 591929ec217a..fa6aa4fc8b52 100644 > --- a/drivers/hwmon/corsair-cpro.c > +++ b/drivers/hwmon/corsair-cpro.c > @@ -310,6 +310,7 @@ static int ccp_write(struct device *dev, enum > hwmon_sensor_types type, > default: > break; > } > + break; > default: > break; > } >
[tip: sched/core] psi: Optimize task switch inside shared cgroups
The following commit has been merged into the sched/core branch of tip: Commit-ID: 4117cebf1a9fcbf35b9aabf0e37b6c5eea296798 Gitweb: https://git.kernel.org/tip/4117cebf1a9fcbf35b9aabf0e37b6c5eea296798 Author:Chengming Zhou AuthorDate:Wed, 03 Mar 2021 11:46:59 +08:00 Committer: Ingo Molnar CommitterDate: Sat, 06 Mar 2021 12:40:23 +01:00 psi: Optimize task switch inside shared cgroups Commit 36b238d57172 ("psi: Optimize switching tasks inside shared cgroups") updates only the cgroups whose state actually changes during a task switch, but only in the task-preempt case, not in the task-sleep case. In the sleep case we don't actually need to clear and set the TSK_ONCPU state for the common cgroups of the next and prev tasks; skipping them saves many psi_group_change() calls, especially when most activity comes from one leaf cgroup.

sleep before:
  psi_dequeue()
    while ((group = iterate_groups(prev)))  # all ancestors
      psi_group_change(prev, .clear=TSK_RUNNING|TSK_ONCPU)
  psi_task_switch()
    while ((group = iterate_groups(next)))  # all ancestors
      psi_group_change(next, .set=TSK_ONCPU)

sleep after:
  psi_dequeue()
    nop
  psi_task_switch()
    while ((group = iterate_groups(next)))  # until (prev & next)
      psi_group_change(next, .set=TSK_ONCPU)
    while ((group = iterate_groups(prev)))  # all ancestors
      psi_group_change(prev, .clear=common?TSK_RUNNING:TSK_RUNNING|TSK_ONCPU)

When a voluntary sleep switches to another task, we remove one call of psi_group_change() for every common cgroup ancestor of the two tasks. 
Co-developed-by: Muchun Song Signed-off-by: Muchun Song Signed-off-by: Chengming Zhou Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Ingo Molnar Acked-by: Johannes Weiner Link: https://lkml.kernel.org/r/20210303034659.91735-5-zhouchengm...@bytedance.com --- kernel/sched/psi.c | 35 +-- kernel/sched/stats.h | 28 2 files changed, 37 insertions(+), 26 deletions(-) diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c index 3907a6b..ee3c5b4 100644 --- a/kernel/sched/psi.c +++ b/kernel/sched/psi.c @@ -840,20 +840,35 @@ void psi_task_switch(struct task_struct *prev, struct task_struct *next, } } - /* -* If this is a voluntary sleep, dequeue will have taken care -* of the outgoing TSK_ONCPU alongside TSK_RUNNING already. We -* only need to deal with it during preemption. -*/ - if (sleep) - return; - if (prev->pid) { - psi_flags_change(prev, TSK_ONCPU, 0); + int clear = TSK_ONCPU, set = 0; + + /* +* When we're going to sleep, psi_dequeue() lets us handle +* TSK_RUNNING and TSK_IOWAIT here, where we can combine it +* with TSK_ONCPU and save walking common ancestors twice. +*/ + if (sleep) { + clear |= TSK_RUNNING; + if (prev->in_iowait) + set |= TSK_IOWAIT; + } + + psi_flags_change(prev, clear, set); iter = NULL; while ((group = iterate_groups(prev, )) && group != common) - psi_group_change(group, cpu, TSK_ONCPU, 0, true); + psi_group_change(group, cpu, clear, set, true); + + /* +* TSK_ONCPU is handled up to the common ancestor. If we're tasked +* with dequeuing too, finish that for the rest of the hierarchy. 
+*/ + if (sleep) { + clear &= ~TSK_ONCPU; + for (; group; group = iterate_groups(prev, )) + psi_group_change(group, cpu, clear, set, true); + } } } diff --git a/kernel/sched/stats.h b/kernel/sched/stats.h index 9e4e67a..dc218e9 100644 --- a/kernel/sched/stats.h +++ b/kernel/sched/stats.h @@ -84,28 +84,24 @@ static inline void psi_enqueue(struct task_struct *p, bool wakeup) static inline void psi_dequeue(struct task_struct *p, bool sleep) { - int clear = TSK_RUNNING, set = 0; + int clear = TSK_RUNNING; if (static_branch_likely(_disabled)) return; - if (!sleep) { - if (p->in_memstall) - clear |= TSK_MEMSTALL; - } else { - /* -* When a task sleeps, schedule() dequeues it before -* switching to the next one. Merge the clearing of -* TSK_RUNNING and TSK_ONCPU to save an unnecessary -* psi_task_change() call in psi_sched_switch(). -*/ - clear |= TSK_ONCPU; + /* +* A voluntary sleep is a dequeue followed by a task switch. To +* avoid walking all ancestors twice, psi_task_switch() handles +* TSK_RUNNING and TSK_IOWAIT for us when it moves TSK_ONCPU. +*
[tip: sched/core] psi: Add PSI_CPU_FULL state
The following commit has been merged into the sched/core branch of tip: Commit-ID: e7fcd762282332f765af2035a9568fb126fa3c01 Gitweb: https://git.kernel.org/tip/e7fcd762282332f765af2035a9568fb126fa3c01 Author:Chengming Zhou AuthorDate:Wed, 03 Mar 2021 11:46:56 +08:00 Committer: Ingo Molnar CommitterDate: Sat, 06 Mar 2021 12:40:22 +01:00 psi: Add PSI_CPU_FULL state The FULL state doesn't exist for the CPU resource at the system level, but exist at the cgroup level, means all non-idle tasks in a cgroup are delayed on the CPU resource which used by others outside of the cgroup or throttled by the cgroup cpu.max configuration. Co-developed-by: Muchun Song Signed-off-by: Muchun Song Signed-off-by: Chengming Zhou Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Ingo Molnar Acked-by: Johannes Weiner Link: https://lkml.kernel.org/r/20210303034659.91735-2-zhouchengm...@bytedance.com --- include/linux/psi_types.h | 3 ++- kernel/sched/psi.c| 14 +++--- 2 files changed, 13 insertions(+), 4 deletions(-) diff --git a/include/linux/psi_types.h b/include/linux/psi_types.h index b95f321..0a23300 100644 --- a/include/linux/psi_types.h +++ b/include/linux/psi_types.h @@ -50,9 +50,10 @@ enum psi_states { PSI_MEM_SOME, PSI_MEM_FULL, PSI_CPU_SOME, + PSI_CPU_FULL, /* Only per-CPU, to weigh the CPU in the global average: */ PSI_NONIDLE, - NR_PSI_STATES = 6, + NR_PSI_STATES = 7, }; enum psi_aggregators { diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c index 967732c..2293c45 100644 --- a/kernel/sched/psi.c +++ b/kernel/sched/psi.c @@ -34,7 +34,10 @@ * delayed on that resource such that nobody is advancing and the CPU * goes idle. This leaves both workload and CPU unproductive. * - * (Naturally, the FULL state doesn't exist for the CPU resource.) 
+ * Naturally, the FULL state doesn't exist for the CPU resource at the + * system level, but exist at the cgroup level, means all non-idle tasks + * in a cgroup are delayed on the CPU resource which used by others outside + * of the cgroup or throttled by the cgroup cpu.max configuration. * * SOME = nr_delayed_tasks != 0 * FULL = nr_delayed_tasks != 0 && nr_running_tasks == 0 @@ -225,6 +228,8 @@ static bool test_state(unsigned int *tasks, enum psi_states state) return tasks[NR_MEMSTALL] && !tasks[NR_RUNNING]; case PSI_CPU_SOME: return tasks[NR_RUNNING] > tasks[NR_ONCPU]; + case PSI_CPU_FULL: + return tasks[NR_RUNNING] && !tasks[NR_ONCPU]; case PSI_NONIDLE: return tasks[NR_IOWAIT] || tasks[NR_MEMSTALL] || tasks[NR_RUNNING]; @@ -678,8 +683,11 @@ static void record_times(struct psi_group_cpu *groupc, int cpu, } } - if (groupc->state_mask & (1 << PSI_CPU_SOME)) + if (groupc->state_mask & (1 << PSI_CPU_SOME)) { groupc->times[PSI_CPU_SOME] += delta; + if (groupc->state_mask & (1 << PSI_CPU_FULL)) + groupc->times[PSI_CPU_FULL] += delta; + } if (groupc->state_mask & (1 << PSI_NONIDLE)) groupc->times[PSI_NONIDLE] += delta; @@ -1018,7 +1026,7 @@ int psi_show(struct seq_file *m, struct psi_group *group, enum psi_res res) group->avg_next_update = update_averages(group, now); mutex_unlock(>avgs_lock); - for (full = 0; full < 2 - (res == PSI_CPU); full++) { + for (full = 0; full < 2; full++) { unsigned long avg[3]; u64 total; int w;
[tip: sched/core] psi: Pressure states are unlikely
The following commit has been merged into the sched/core branch of tip: Commit-ID: fddc8bab531e217806b84906681324377d741c6c Gitweb: https://git.kernel.org/tip/fddc8bab531e217806b84906681324377d741c6c Author:Johannes Weiner AuthorDate:Wed, 03 Mar 2021 11:46:58 +08:00 Committer: Ingo Molnar CommitterDate: Sat, 06 Mar 2021 12:40:23 +01:00 psi: Pressure states are unlikely Move the unlikely branches out of line. This eliminates undesirable jumps during wakeup and sleeps for workloads that aren't under any sort of resource pressure. Signed-off-by: Johannes Weiner Signed-off-by: Chengming Zhou Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Ingo Molnar Link: https://lkml.kernel.org/r/20210303034659.91735-4-zhouchengm...@bytedance.com --- kernel/sched/psi.c | 14 +++--- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c index 0fe6ff6..3907a6b 100644 --- a/kernel/sched/psi.c +++ b/kernel/sched/psi.c @@ -219,17 +219,17 @@ static bool test_state(unsigned int *tasks, enum psi_states state) { switch (state) { case PSI_IO_SOME: - return tasks[NR_IOWAIT]; + return unlikely(tasks[NR_IOWAIT]); case PSI_IO_FULL: - return tasks[NR_IOWAIT] && !tasks[NR_RUNNING]; + return unlikely(tasks[NR_IOWAIT] && !tasks[NR_RUNNING]); case PSI_MEM_SOME: - return tasks[NR_MEMSTALL]; + return unlikely(tasks[NR_MEMSTALL]); case PSI_MEM_FULL: - return tasks[NR_MEMSTALL] && !tasks[NR_RUNNING]; + return unlikely(tasks[NR_MEMSTALL] && !tasks[NR_RUNNING]); case PSI_CPU_SOME: - return tasks[NR_RUNNING] > tasks[NR_ONCPU]; + return unlikely(tasks[NR_RUNNING] > tasks[NR_ONCPU]); case PSI_CPU_FULL: - return tasks[NR_RUNNING] && !tasks[NR_ONCPU]; + return unlikely(tasks[NR_RUNNING] && !tasks[NR_ONCPU]); case PSI_NONIDLE: return tasks[NR_IOWAIT] || tasks[NR_MEMSTALL] || tasks[NR_RUNNING]; @@ -729,7 +729,7 @@ static void psi_group_change(struct psi_group *group, int cpu, * task in a cgroup is in_memstall, the corresponding groupc * on that cpu is in 
PSI_MEM_FULL state. */ - if (groupc->tasks[NR_ONCPU] && cpu_curr(cpu)->in_memstall) + if (unlikely(groupc->tasks[NR_ONCPU] && cpu_curr(cpu)->in_memstall)) state_mask |= (1 << PSI_MEM_FULL); groupc->state_mask = state_mask;
[tip: sched/core] psi: Use ONCPU state tracking machinery to detect reclaim
The following commit has been merged into the sched/core branch of tip: Commit-ID: 7fae6c8171d20ac55402930ee8ae760cf85dff7b Gitweb: https://git.kernel.org/tip/7fae6c8171d20ac55402930ee8ae760cf85dff7b Author:Chengming Zhou AuthorDate:Wed, 03 Mar 2021 11:46:57 +08:00 Committer: Ingo Molnar CommitterDate: Sat, 06 Mar 2021 12:40:22 +01:00 psi: Use ONCPU state tracking machinery to detect reclaim Move the reclaim detection from the timer tick to the task state tracking machinery using the recently added ONCPU state. We also add checking of task psi_flags changes in the psi_task_switch() optimization so the parents are updated properly. In terms of performance and cost, this ONCPU task state tracking is not cheaper than the previous timer tick in aggregate, but the code is simpler and shorter this way, so it's a maintainability win. Johannes did some testing with perf bench; the performance and cost changes should be acceptable for real workloads. Thanks to Johannes Weiner for pointing out the psi_task_switch() optimization issues and for the clearer changelog. 
Co-developed-by: Muchun Song Signed-off-by: Muchun Song Signed-off-by: Chengming Zhou Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Ingo Molnar Acked-by: Johannes Weiner Link: https://lkml.kernel.org/r/20210303034659.91735-3-zhouchengm...@bytedance.com --- include/linux/psi.h | 1 +- kernel/sched/core.c | 1 +- kernel/sched/psi.c | 65 +++ kernel/sched/stats.h | 9 +-- 4 files changed, 24 insertions(+), 52 deletions(-) diff --git a/include/linux/psi.h b/include/linux/psi.h index 7361023..65eb147 100644 --- a/include/linux/psi.h +++ b/include/linux/psi.h @@ -20,7 +20,6 @@ void psi_task_change(struct task_struct *task, int clear, int set); void psi_task_switch(struct task_struct *prev, struct task_struct *next, bool sleep); -void psi_memstall_tick(struct task_struct *task, int cpu); void psi_memstall_enter(unsigned long *flags); void psi_memstall_leave(unsigned long *flags); diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 361974e..d2629fd 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -4551,7 +4551,6 @@ void scheduler_tick(void) update_thermal_load_avg(rq_clock_thermal(rq), rq, thermal_pressure); curr->sched_class->task_tick(rq, curr, 0); calc_global_load_tick(rq); - psi_task_tick(rq); rq_unlock(rq, ); diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c index 2293c45..0fe6ff6 100644 --- a/kernel/sched/psi.c +++ b/kernel/sched/psi.c @@ -644,8 +644,7 @@ static void poll_timer_fn(struct timer_list *t) wake_up_interruptible(>poll_wait); } -static void record_times(struct psi_group_cpu *groupc, int cpu, -bool memstall_tick) +static void record_times(struct psi_group_cpu *groupc, int cpu) { u32 delta; u64 now; @@ -664,23 +663,6 @@ static void record_times(struct psi_group_cpu *groupc, int cpu, groupc->times[PSI_MEM_SOME] += delta; if (groupc->state_mask & (1 << PSI_MEM_FULL)) groupc->times[PSI_MEM_FULL] += delta; - else if (memstall_tick) { - u32 sample; - /* -* Since we care about lost potential, a -* memstall is FULL when there are no 
other -* working tasks, but also when the CPU is -* actively reclaiming and nothing productive -* could run even if it were runnable. -* -* When the timer tick sees a reclaiming CPU, -* regardless of runnable tasks, sample a FULL -* tick (or less if it hasn't been a full tick -* since the last state change). -*/ - sample = min(delta, (u32)jiffies_to_nsecs(1)); - groupc->times[PSI_MEM_FULL] += sample; - } } if (groupc->state_mask & (1 << PSI_CPU_SOME)) { @@ -714,7 +696,7 @@ static void psi_group_change(struct psi_group *group, int cpu, */ write_seqcount_begin(>seq); - record_times(groupc, cpu, false); + record_times(groupc, cpu); for (t = 0, m = clear; m; m &= ~(1 << t), t++) { if (!(m & (1 << t))) @@ -738,6 +720,18 @@ static void psi_group_change(struct psi_group *group, int cpu, if (test_state(groupc->tasks, s)) state_mask |= (1 << s); } + + /* +* Since we care about lost potential, a memstall is FULL +* when there are no other working tasks, but also when +* the CPU is actively reclaiming and nothing productive +* could run even if it were runnable. So when the current +* task in a cgroup is
[tip: sched/core] sched/topology: fix the issue groups don't span domain->span for NUMA diameter > 2
The following commit has been merged into the sched/core branch of tip: Commit-ID: 585b6d2723dc927ebc4ad884c4e879e4da8bc21f Gitweb: https://git.kernel.org/tip/585b6d2723dc927ebc4ad884c4e879e4da8bc21f Author:Barry Song AuthorDate:Wed, 24 Feb 2021 16:09:44 +13:00 Committer: Ingo Molnar CommitterDate: Sat, 06 Mar 2021 12:40:22 +01:00 sched/topology: fix the issue groups don't span domain->span for NUMA diameter > 2 As long as NUMA diameter > 2, building sched_domain by sibling's child domain will definitely create a sched_domain with sched_group which will span out of the sched_domain: +--+ +--++---+ +--+ | node | 12 |node | 20 | node | 12 |node | | 0 +-+1 ++ 2 +---+3 | +--+ +--++---+ +--+ domain0node0node1node2 node3 domain1node0+1 node0+1 node2+3node2+3 + domain2node0+1+2 | group: node0+1 | group:node2+3 <---+ when node2 is added into the domain2 of node0, kernel is using the child domain of node2's domain2, which is domain1(node2+3). Node 3 is outside the span of the domain including node0+1+2. This will make load_balance() run based on screwed avg_load and group_type in the sched_group spanning out of the sched_domain, and it also makes select_task_rq_fair() pick an idle CPU outside the sched_domain. Real servers which suffer from this problem include Kunpeng920 and 8-node Sun Fire X4600-M2, at least. Here we move to use the *child* domain of the *child* domain of node2's domain2 as the new added sched_group. At the same, we re-use the lower level sgc directly. +--+ +--++---+ +--+ | node | 12 |node | 20 | node | 12 |node | | 0 +-+1 ++ 2 +---+3 | +--+ +--++---+ +--+ domain0node0node1 +- node2 node3 | domain1node0+1 node0+1| node2+3node2+3 | domain2node0+1+2 | group: node0+1| group:node2 <---+ While the lower level sgc is re-used, this patch only changes the remote sched_groups for those sched_domains playing grandchild trick, therefore, sgc->next_update is still safe since it's only touched by CPUs that have the group span as local group. 
And sgc->imbalance is also safe because sd_parent remains the same in load_balance and LB only tries other CPUs from the local group. Moreover, since local groups are not touched, they are still getting roughly equal size in a TL. And should_we_balance() only matters with local groups, so the pull probability of those groups are still roughly equal. Tested by the below topology: qemu-system-aarch64 -M virt -nographic \ -smp cpus=8 \ -numa node,cpus=0-1,nodeid=0 \ -numa node,cpus=2-3,nodeid=1 \ -numa node,cpus=4-5,nodeid=2 \ -numa node,cpus=6-7,nodeid=3 \ -numa dist,src=0,dst=1,val=12 \ -numa dist,src=0,dst=2,val=20 \ -numa dist,src=0,dst=3,val=22 \ -numa dist,src=1,dst=2,val=22 \ -numa dist,src=2,dst=3,val=12 \ -numa dist,src=1,dst=3,val=24 \ -m 4G -cpu cortex-a57 -kernel arch/arm64/boot/Image w/o patch, we get lots of "groups don't span domain->span": [0.802139] CPU0 attaching sched-domain(s): [0.802193] domain-0: span=0-1 level=MC [0.802443] groups: 0:{ span=0 cap=1013 }, 1:{ span=1 cap=979 } [0.802693] domain-1: span=0-3 level=NUMA [0.802731]groups: 0:{ span=0-1 cap=1992 }, 2:{ span=2-3 cap=1943 } [0.802811]domain-2: span=0-5 level=NUMA [0.802829] groups: 0:{ span=0-3 cap=3935 }, 4:{ span=4-7 cap=3937 } [0.802881] ERROR: groups don't span domain->span [0.803058] domain-3: span=0-7 level=NUMA [0.803080] groups: 0:{ span=0-5 mask=0-1 cap=5843 }, 6:{ span=4-7 mask=6-7 cap=4077 } [0.804055] CPU1 attaching sched-domain(s): [0.804072] domain-0: span=0-1 level=MC [0.804096] groups: 1:{ span=1 cap=979 }, 0:{ span=0 cap=1013 } [0.804152] domain-1: span=0-3 level=NUMA [0.804170]groups: 0:{ span=0-1 cap=1992 }, 2:{ span=2-3 cap=1943 } [0.804219]domain-2: span=0-5 level=NUMA [0.804236] groups: 0:{ span=0-3 cap=3935 }, 4:{ span=4-7 cap=3937 } [0.804302] ERROR: groups don't span domain->span [0.804520] domain-3: span=0-7 level=NUMA [0.804546] groups: 0:{ span=0-5 mask=0-1 cap=5843 }, 6:{ span=4-7 mask=6-7 cap=4077 } [0.804677] CPU2 attaching
[tip: sched/core] cpu/hotplug: Allowing to reset fail injection
The following commit has been merged into the sched/core branch of tip: Commit-ID: 3ae70c251f344976428d1f6ee61ea7b4e170fec3 Gitweb: https://git.kernel.org/tip/3ae70c251f344976428d1f6ee61ea7b4e170fec3 Author:Vincent Donnefort AuthorDate:Tue, 16 Feb 2021 10:35:04 Committer: Ingo Molnar CommitterDate: Sat, 06 Mar 2021 12:40:22 +01:00 cpu/hotplug: Allowing to reset fail injection Currently, the only way of resetting the fail injection is to trigger a hotplug, hotunplug or both. This is rather annoying for testing and, as the default value for this file is -1, it seems pretty natural to let a user write it. Signed-off-by: Vincent Donnefort Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Ingo Molnar Link: https://lkml.kernel.org/r/20210216103506.416286-2-vincent.donnef...@arm.com --- kernel/cpu.c | 5 + 1 file changed, 5 insertions(+) diff --git a/kernel/cpu.c b/kernel/cpu.c index 1b6302e..9121edf 100644 --- a/kernel/cpu.c +++ b/kernel/cpu.c @@ -2207,6 +2207,11 @@ static ssize_t write_cpuhp_fail(struct device *dev, if (ret) return ret; + if (fail == CPUHP_INVALID) { + st->fail = fail; + return count; + } + if (fail < CPUHP_OFFLINE || fail > CPUHP_ONLINE) return -EINVAL;
[tip: sched/core] cpu/hotplug: Add cpuhp_invoke_callback_range()
The following commit has been merged into the sched/core branch of tip: Commit-ID: 453e41085183980087f8a80dada523caf1131c3c Gitweb: https://git.kernel.org/tip/453e41085183980087f8a80dada523caf1131c3c Author:Vincent Donnefort AuthorDate:Tue, 16 Feb 2021 10:35:06 Committer: Ingo Molnar CommitterDate: Sat, 06 Mar 2021 12:40:22 +01:00 cpu/hotplug: Add cpuhp_invoke_callback_range() Factorizing and unifying cpuhp callback range invocations, especially for the hotunplug path, where two different ways of decrementing were used. The first one, decrements before the callback is called: cpuhp_thread_fun() state = st->state; st->state--; cpuhp_invoke_callback(state); The second one, after: take_down_cpu()|cpuhp_down_callbacks() cpuhp_invoke_callback(st->state); st->state--; This is problematic for rolling back the steps in case of error, as depending on the decrement, the rollback will start from N or N-1. It also makes tracing inconsistent, between steps run in the cpuhp thread and the others. Additionally, avoid useless cpuhp_thread_fun() loops by skipping empty steps. Signed-off-by: Vincent Donnefort Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Ingo Molnar Link: https://lkml.kernel.org/r/20210216103506.416286-4-vincent.donnef...@arm.com --- kernel/cpu.c | 170 ++ 1 file changed, 102 insertions(+), 68 deletions(-) diff --git a/kernel/cpu.c b/kernel/cpu.c index 680ed8f..23505d6 100644 --- a/kernel/cpu.c +++ b/kernel/cpu.c @@ -135,6 +135,11 @@ static struct cpuhp_step *cpuhp_get_step(enum cpuhp_state state) return cpuhp_hp_states + state; } +static bool cpuhp_step_empty(bool bringup, struct cpuhp_step *step) +{ + return bringup ? !step->startup.single : !step->teardown.single; +} + /** * cpuhp_invoke_callback _ Invoke the callbacks for a given state * @cpu: The cpu for which the callback should be invoked @@ -157,26 +162,24 @@ static int cpuhp_invoke_callback(unsigned int cpu, enum cpuhp_state state, if (st->fail == state) { st->fail = CPUHP_INVALID; - - if (!(bringup ? 
step->startup.single : step->teardown.single)) - return 0; - return -EAGAIN; } + if (cpuhp_step_empty(bringup, step)) { + WARN_ON_ONCE(1); + return 0; + } + if (!step->multi_instance) { WARN_ON_ONCE(lastp && *lastp); cb = bringup ? step->startup.single : step->teardown.single; - if (!cb) - return 0; + trace_cpuhp_enter(cpu, st->target, state, cb); ret = cb(cpu); trace_cpuhp_exit(cpu, st->state, state, ret); return ret; } cbm = bringup ? step->startup.multi : step->teardown.multi; - if (!cbm) - return 0; /* Single invocation for instance add/remove */ if (node) { @@ -475,6 +478,15 @@ cpuhp_set_state(struct cpuhp_cpu_state *st, enum cpuhp_state target) static inline void cpuhp_reset_state(struct cpuhp_cpu_state *st, enum cpuhp_state prev_state) { + st->target = prev_state; + + /* +* Already rolling back. No need invert the bringup value or to change +* the current state. +*/ + if (st->rollback) + return; + st->rollback = true; /* @@ -488,7 +500,6 @@ cpuhp_reset_state(struct cpuhp_cpu_state *st, enum cpuhp_state prev_state) st->state++; } - st->target = prev_state; st->bringup = !st->bringup; } @@ -591,10 +602,53 @@ static int finish_cpu(unsigned int cpu) * Hotplug state machine related functions */ -static void undo_cpu_up(unsigned int cpu, struct cpuhp_cpu_state *st) +/* + * Get the next state to run. Empty ones will be skipped. Returns true if a + * state must be run. + * + * st->state will be modified ahead of time, to match state_to_run, as if it + * has already ran. 
+ */ +static bool cpuhp_next_state(bool bringup, +enum cpuhp_state *state_to_run, +struct cpuhp_cpu_state *st, +enum cpuhp_state target) { - for (st->state--; st->state > st->target; st->state--) - cpuhp_invoke_callback(cpu, st->state, false, NULL, NULL); + do { + if (bringup) { + if (st->state >= target) + return false; + + *state_to_run = ++st->state; + } else { + if (st->state <= target) + return false; + + *state_to_run = st->state--; + } + + if (!cpuhp_step_empty(bringup, cpuhp_get_step(*state_to_run))) + break; + } while (true); + +
[tip: sched/core] sched/fair: Fix shift-out-of-bounds in load_balance()
The following commit has been merged into the sched/core branch of tip: Commit-ID: 39a2a6eb5c9b66ea7c8055026303b3aa681b49a5 Gitweb: https://git.kernel.org/tip/39a2a6eb5c9b66ea7c8055026303b3aa681b49a5 Author:Valentin Schneider AuthorDate:Thu, 25 Feb 2021 17:56:56 Committer: Ingo Molnar CommitterDate: Sat, 06 Mar 2021 12:40:22 +01:00 sched/fair: Fix shift-out-of-bounds in load_balance() Syzbot reported a handful of occurrences where an sd->nr_balance_failed can grow to much higher values than one would expect. A successful load_balance() resets it to 0; a failed one increments it. Once it gets to sd->cache_nice_tries + 3, this *should* trigger an active balance, which will either set it to sd->cache_nice_tries+1 or reset it to 0. However, in case the to-be-active-balanced task is not allowed to run on env->dst_cpu, then the increment is done without any further modification. This could then be repeated ad nauseam, and would explain the absurdly high values reported by syzbot (86, 149). VincentG noted there is value in letting sd->cache_nice_tries grow, so the shift itself should be fixed. That means preventing: """ If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined. """ Thus we need to cap the shift exponent to BITS_PER_TYPE(typeof(lefthand)) - 1. I had a look around for other similar cases via coccinelle: @expr@ position pos; expression E1; expression E2; @@ ( E1 >> E2@pos | E1 >> E2@pos ) @cst depends on expr@ position pos; expression expr.E1; constant cst; @@ ( E1 >> cst@pos | E1 << cst@pos ) @script:python depends on !cst@ pos << expr.pos; exp << expr.E2; @@ # Dirty hack to ignore constexpr if exp.upper() != exp: coccilib.report.print_report(pos[0], "Possible UB shift here") The only other match in kernel/sched is rq_clock_thermal() which employs sched_thermal_decay_shift, and that exponent is already capped to 10, so that one is fine. 
Fixes: 5a7f55590467 ("sched/fair: Relax constraint on task's load during load balance") Reported-by: syzbot+d7581744d5fd27c9f...@syzkaller.appspotmail.com Signed-off-by: Valentin Schneider Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Ingo Molnar Link: http://lore.kernel.org/r/ffac1205b9a21...@google.com --- kernel/sched/fair.c | 3 +-- kernel/sched/sched.h | 7 +++ 2 files changed, 8 insertions(+), 2 deletions(-) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 7b2fac0..1af51a6 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -7722,8 +7722,7 @@ static int detach_tasks(struct lb_env *env) * scheduler fails to find a good waiting task to * migrate. */ - - if ((load >> env->sd->nr_balance_failed) > env->imbalance) + if (shr_bound(load, env->sd->nr_balance_failed) > env->imbalance) goto next; env->imbalance -= load; diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h index 0ddc9a6..bb8bb06 100644 --- a/kernel/sched/sched.h +++ b/kernel/sched/sched.h @@ -205,6 +205,13 @@ static inline void update_avg(u64 *avg, u64 sample) } /* + * Shifting a value by an exponent greater *or equal* to the size of said value + * is UB; cap at size-1. + */ +#define shr_bound(val, shift) \ + (val >> min_t(typeof(shift), shift, BITS_PER_TYPE(typeof(val)) - 1)) + +/* * !! For sched_setattr_nocheck() (kernel) only !! * * This is actually gross. :(
[tip: sched/core] cpu/hotplug: CPUHP_BRINGUP_CPU failure exception
The following commit has been merged into the sched/core branch of tip: Commit-ID: 62f250694092dd5fef9900dc3126f07110bf9d48 Gitweb: https://git.kernel.org/tip/62f250694092dd5fef9900dc3126f07110bf9d48 Author:Vincent Donnefort AuthorDate:Tue, 16 Feb 2021 10:35:05 Committer: Ingo Molnar CommitterDate: Sat, 06 Mar 2021 12:40:22 +01:00 cpu/hotplug: CPUHP_BRINGUP_CPU failure exception The atomic states (between CPUHP_AP_IDLE_DEAD and CPUHP_AP_ONLINE) are triggered by the CPUHP_BRINGUP_CPU step. If the latter fails, no atomic state can be rolled back. DEAD callbacks too can't fail and disallow recovery. As a consequence, during hotunplug, the fail injection interface should prohibit all states from CPUHP_BRINGUP_CPU to CPUHP_ONLINE. Signed-off-by: Vincent Donnefort Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Ingo Molnar Link: https://lkml.kernel.org/r/20210216103506.416286-3-vincent.donnef...@arm.com --- kernel/cpu.c | 19 --- 1 file changed, 16 insertions(+), 3 deletions(-) diff --git a/kernel/cpu.c b/kernel/cpu.c index 9121edf..680ed8f 100644 --- a/kernel/cpu.c +++ b/kernel/cpu.c @@ -1045,9 +1045,13 @@ static int __ref _cpu_down(unsigned int cpu, int tasks_frozen, * to do the further cleanups. */ ret = cpuhp_down_callbacks(cpu, st, target); - if (ret && st->state == CPUHP_TEARDOWN_CPU && st->state < prev_state) { - cpuhp_reset_state(st, prev_state); - __cpuhp_kick_ap(st); + if (ret && st->state < prev_state) { + if (st->state == CPUHP_TEARDOWN_CPU) { + cpuhp_reset_state(st, prev_state); + __cpuhp_kick_ap(st); + } else { + WARN(1, "DEAD callback error for CPU%d", cpu); + } } out: @@ -,6 +2226,15 @@ static ssize_t write_cpuhp_fail(struct device *dev, return -EINVAL; /* +* DEAD callbacks cannot fail... +* ... neither can CPUHP_BRINGUP_CPU during hotunplug. The latter +* triggering STARTING callbacks, a failure in this state would +* hinder rollback. 
+*/ + if (fail <= CPUHP_BRINGUP_CPU && st->state > CPUHP_BRINGUP_CPU) + return -EINVAL; + + /* * Cannot fail anything that doesn't have callbacks. */ mutex_lock(_state_mutex);
[tip: sched/core] sched/pelt: Fix task util_est update filtering
The following commit has been merged into the sched/core branch of tip: Commit-ID: b89997aa88f0b07d8a6414c908af75062103b8c9 Gitweb: https://git.kernel.org/tip/b89997aa88f0b07d8a6414c908af75062103b8c9 Author:Vincent Donnefort AuthorDate:Thu, 25 Feb 2021 16:58:20 Committer: Ingo Molnar CommitterDate: Sat, 06 Mar 2021 12:40:22 +01:00 sched/pelt: Fix task util_est update filtering Being called for each dequeue, util_est reduces the number of its updates by filtering out cases where the EWMA signal differs from the task util_avg by less than 1%. This is a problem for a sudden util_avg ramp-up: due to the decay from a previous high util_avg, the EWMA might now be close enough to the new util_avg that no update happens, leaving ue.enqueued with an out-of-date value. Taking both util_est members, EWMA and enqueued, into consideration for the filtering ensures an up-to-date value for both. For now this is an issue only for the trace probe, which might return the stale value. Functionally it isn't a problem, as the value is always accessed through max(enqueued, ewma). This problem has been observed using LISA's UtilConvergence:test_means on the sd845c board. No regression was observed with Hackbench on sd845c or with perf-bench sched pipe on hikey/hikey960. 
Signed-off-by: Vincent Donnefort Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Ingo Molnar Reviewed-by: Dietmar Eggemann Reviewed-by: Vincent Guittot Link: https://lkml.kernel.org/r/20210225165820.1377125-1-vincent.donnef...@arm.com --- kernel/sched/fair.c | 15 --- 1 file changed, 12 insertions(+), 3 deletions(-) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 1af51a6..f5d6541 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -3941,6 +3941,8 @@ static inline void util_est_dequeue(struct cfs_rq *cfs_rq, trace_sched_util_est_cfs_tp(cfs_rq); } +#define UTIL_EST_MARGIN (SCHED_CAPACITY_SCALE / 100) + /* * Check if a (signed) value is within a specified (unsigned) margin, * based on the observation that: @@ -3958,7 +3960,7 @@ static inline void util_est_update(struct cfs_rq *cfs_rq, struct task_struct *p, bool task_sleep) { - long last_ewma_diff; + long last_ewma_diff, last_enqueued_diff; struct util_est ue; if (!sched_feat(UTIL_EST)) @@ -3979,6 +3981,8 @@ static inline void util_est_update(struct cfs_rq *cfs_rq, if (ue.enqueued & UTIL_AVG_UNCHANGED) return; + last_enqueued_diff = ue.enqueued; + /* * Reset EWMA on utilization increases, the moving average is used only * to smooth utilization decreases. @@ -3992,12 +3996,17 @@ static inline void util_est_update(struct cfs_rq *cfs_rq, } /* -* Skip update of task's estimated utilization when its EWMA is +* Skip update of task's estimated utilization when its members are * already ~1% close to its last activation value. */ last_ewma_diff = ue.enqueued - ue.ewma; - if (within_margin(last_ewma_diff, (SCHED_CAPACITY_SCALE / 100))) + last_enqueued_diff -= ue.enqueued; + if (within_margin(last_ewma_diff, UTIL_EST_MARGIN)) { + if (!within_margin(last_enqueued_diff, UTIL_EST_MARGIN)) + goto done; + return; + } /* * To avoid overestimation of actual task utilization, skip updates if
[tip: sched/core] sched/fair: use lsub_positive in cpu_util_next()
The following commit has been merged into the sched/core branch of tip: Commit-ID: 736cc6b31102236a55470c72523ed0a65eb3f804 Gitweb: https://git.kernel.org/tip/736cc6b31102236a55470c72523ed0a65eb3f804 Author:Vincent Donnefort AuthorDate:Thu, 25 Feb 2021 08:36:12 Committer: Ingo Molnar CommitterDate: Sat, 06 Mar 2021 12:40:22 +01:00 sched/fair: use lsub_positive in cpu_util_next() The sub_positive local version is saving an explicit load-store and is enough for the cpu_util_next() usage. Signed-off-by: Vincent Donnefort Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Ingo Molnar Reviewed-by: Quentin Perret Reviewed-by: Dietmar Eggemann Link: https://lkml.kernel.org/r/20210225083612.1113823-3-vincent.donnef...@arm.com --- kernel/sched/fair.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index b994db9..7b2fac0 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -6471,7 +6471,7 @@ static unsigned long cpu_util_next(int cpu, struct task_struct *p, int dst_cpu) * util_avg should already be correct. */ if (task_cpu(p) == cpu && dst_cpu != cpu) - sub_positive(&util, task_util(p)); + lsub_positive(&util, task_util(p)); else if (task_cpu(p) != cpu && dst_cpu == cpu) util += task_util(p);
[tip: sched/core] sched/fair: Reduce the window for duplicated update
The following commit has been merged into the sched/core branch of tip: Commit-ID: 39b6a429c30482c349f1bb3746470fe473cbdb0f Gitweb: https://git.kernel.org/tip/39b6a429c30482c349f1bb3746470fe473cbdb0f Author:Vincent Guittot AuthorDate:Wed, 24 Feb 2021 14:30:07 +01:00 Committer: Ingo Molnar CommitterDate: Sat, 06 Mar 2021 12:40:22 +01:00 sched/fair: Reduce the window for duplicated update Start to update last_blocked_load_update_tick to reduce the possibility of another cpu starting the update one more time Signed-off-by: Vincent Guittot Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Ingo Molnar Reviewed-by: Valentin Schneider Link: https://lkml.kernel.org/r/20210224133007.28644-8-vincent.guit...@linaro.org --- kernel/sched/fair.c | 11 --- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index e87e1b3..f1b55f9 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -7852,16 +7852,20 @@ static inline bool others_have_blocked(struct rq *rq) return false; } -static inline void update_blocked_load_status(struct rq *rq, bool has_blocked) +static inline void update_blocked_load_tick(struct rq *rq) { - rq->last_blocked_load_update_tick = jiffies; + WRITE_ONCE(rq->last_blocked_load_update_tick, jiffies); +} +static inline void update_blocked_load_status(struct rq *rq, bool has_blocked) +{ if (!has_blocked) rq->has_blocked_load = 0; } #else static inline bool cfs_rq_has_blocked(struct cfs_rq *cfs_rq) { return false; } static inline bool others_have_blocked(struct rq *rq) { return false; } +static inline void update_blocked_load_tick(struct rq *rq) {} static inline void update_blocked_load_status(struct rq *rq, bool has_blocked) {} #endif @@ -8022,6 +8026,7 @@ static void update_blocked_averages(int cpu) struct rq_flags rf; rq_lock_irqsave(rq, ); + update_blocked_load_tick(rq); update_rq_clock(rq); decayed |= __update_blocked_others(rq, ); @@ -8363,7 +8368,7 @@ static bool update_nohz_stats(struct rq *rq) if 
(!cpumask_test_cpu(cpu, nohz.idle_cpus_mask)) return false; - if (!time_after(jiffies, rq->last_blocked_load_update_tick)) + if (!time_after(jiffies, READ_ONCE(rq->last_blocked_load_update_tick))) return true; update_blocked_averages(cpu);
[tip: sched/core] sched/fair: Fix task utilization accountability in compute_energy()
The following commit has been merged into the sched/core branch of tip: Commit-ID: 0372e1cf70c28de6babcba38ef97b6ae3400b101 Gitweb: https://git.kernel.org/tip/0372e1cf70c28de6babcba38ef97b6ae3400b101 Author:Vincent Donnefort AuthorDate:Thu, 25 Feb 2021 08:36:11 Committer: Ingo Molnar CommitterDate: Sat, 06 Mar 2021 12:40:22 +01:00 sched/fair: Fix task utilization accountability in compute_energy() find_energy_efficient_cpu() (feec()) computes for each perf_domain (pd) an energy delta as follows: feec(task) for_each_pd base_energy = compute_energy(task, -1, pd) -> for_each_cpu(pd) -> cpu_util_next(cpu, task, -1) energy_delta = compute_energy(task, dst_cpu, pd) -> for_each_cpu(pd) -> cpu_util_next(cpu, task, dst_cpu) energy_delta -= base_energy Then it picks the best CPU as being the one that minimizes energy_delta. cpu_util_next() estimates the CPU utilization that would happen if the task was placed on dst_cpu as follows: max(cpu_util + task_util, cpu_util_est + _task_util_est) The task contribution to the energy delta can then be either: (1) _task_util_est, on a mostly idle CPU, where cpu_util is close to 0 and _task_util_est > cpu_util. (2) task_util, on a mostly busy CPU, where cpu_util > _task_util_est. (cpu_util_est doesn't appear here. It is 0 when a CPU is idle and otherwise must be small enough so that feec() takes the CPU as a potential target for the task placement) This is problematic for feec(), as cpu_util_next() might give an unfair advantage to a CPU which is mostly busy (2) compared to one which is mostly idle (1). _task_util_est being always bigger than task_util in feec() (as the task is waking up), the task contribution to the energy might look smaller on certain CPUs (2) and this breaks the energy comparison. This issue is, moreover, not sporadic. By starving idle CPUs, it keeps their cpu_util < _task_util_est (1) while others will maintain cpu_util > _task_util_est (2). 
Fix this problem by always using max(task_util, _task_util_est) as a task contribution to the energy (ENERGY_UTIL). The new estimated CPU utilization for the energy would then be: max(cpu_util, cpu_util_est) + max(task_util, _task_util_est) compute_energy() still needs to know which OPP would be selected if the task would be migrated in the perf_domain (FREQUENCY_UTIL). Hence, cpu_util_next() is still used to estimate the maximum util within the pd. Signed-off-by: Vincent Donnefort Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Ingo Molnar Reviewed-by: Quentin Perret Reviewed-by: Dietmar Eggemann Link: https://lkml.kernel.org/r/20210225083612.1113823-2-vincent.donnef...@arm.com --- kernel/sched/fair.c | 24 1 file changed, 20 insertions(+), 4 deletions(-) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index f1b55f9..b994db9 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -6518,8 +6518,24 @@ compute_energy(struct task_struct *p, int dst_cpu, struct perf_domain *pd) * its pd list and will not be accounted by compute_energy(). */ for_each_cpu_and(cpu, pd_mask, cpu_online_mask) { - unsigned long cpu_util, util_cfs = cpu_util_next(cpu, p, dst_cpu); - struct task_struct *tsk = cpu == dst_cpu ? p : NULL; + unsigned long util_freq = cpu_util_next(cpu, p, dst_cpu); + unsigned long cpu_util, util_running = util_freq; + struct task_struct *tsk = NULL; + + /* +* When @p is placed on @cpu: +* +* util_running = max(cpu_util, cpu_util_est) + +*max(task_util, _task_util_est) +* +* while cpu_util_next is: max(cpu_util + task_util, +* cpu_util_est + _task_util_est) +*/ + if (cpu == dst_cpu) { + tsk = p; + util_running = + cpu_util_next(cpu, p, -1) + task_util_est(p); + } /* * Busy time computation: utilization clamping is not @@ -6527,7 +6543,7 @@ compute_energy(struct task_struct *p, int dst_cpu, struct perf_domain *pd) * is already enough to scale the EM reported power * consumption at the (eventually clamped) cpu_capacity. 
*/ - sum_util += effective_cpu_util(cpu, util_cfs, cpu_cap, + sum_util += effective_cpu_util(cpu, util_running, cpu_cap, ENERGY_UTIL, NULL); /* @@ -6537,7 +6553,7 @@ compute_energy(struct task_struct *p, int dst_cpu, struct perf_domain *pd) * NOTE: in case RT tasks are running, by default the * FREQUENCY_UTIL's utilization can be max
[PATCH] arm64: dts: imx8mp: add wdog2/3 nodes
From: Peng Fan There is wdog[2,3] in i.MX8MP, so add them, all wdogs share the same clock root, so use the wdog1 clk here. Signed-off-by: Peng Fan --- arch/arm64/boot/dts/freescale/imx8mp.dtsi | 16 1 file changed, 16 insertions(+) diff --git a/arch/arm64/boot/dts/freescale/imx8mp.dtsi b/arch/arm64/boot/dts/freescale/imx8mp.dtsi index c7523fd4eae9..05dd04116f2e 100644 --- a/arch/arm64/boot/dts/freescale/imx8mp.dtsi +++ b/arch/arm64/boot/dts/freescale/imx8mp.dtsi @@ -312,6 +312,22 @@ wdog1: watchdog@3028 { status = "disabled"; }; + wdog2: watchdog@3029 { + compatible = "fsl,imx8mp-wdt", "fsl,imx21-wdt"; + reg = <0x3029 0x1>; + interrupts = ; + clocks = < IMX8MP_CLK_WDOG2_ROOT>; + status = "disabled"; + }; + + wdog3: watchdog@302a { + compatible = "fsl,imx8mp-wdt", "fsl,imx21-wdt"; + reg = <0x302a 0x1>; + interrupts = ; + clocks = < IMX8MP_CLK_WDOG3_ROOT>; + status = "disabled"; + }; + iomuxc: pinctrl@3033 { compatible = "fsl,imx8mp-iomuxc"; reg = <0x3033 0x1>; -- 2.30.0
[PATCH V13 10/10] remoteproc: imx_proc: enable virtio/mailbox
From: Peng Fan Use virtio/mailbox to build connection between Remote Proccessors and Linux. Add work queue to handle incoming messages. Reviewed-by: Richard Zhu Reviewed-by: Mathieu Poirier Signed-off-by: Peng Fan --- drivers/remoteproc/imx_rproc.c | 116 - 1 file changed, 113 insertions(+), 3 deletions(-) diff --git a/drivers/remoteproc/imx_rproc.c b/drivers/remoteproc/imx_rproc.c index 3685bbd135b0..90471790bb24 100644 --- a/drivers/remoteproc/imx_rproc.c +++ b/drivers/remoteproc/imx_rproc.c @@ -7,6 +7,7 @@ #include #include #include +#include #include #include #include @@ -15,6 +16,9 @@ #include #include #include +#include + +#include "remoteproc_internal.h" #define IMX7D_SRC_SCR 0x0C #define IMX7D_ENABLE_M4BIT(3) @@ -86,6 +90,11 @@ struct imx_rproc { const struct imx_rproc_dcfg *dcfg; struct imx_rproc_memmem[IMX7D_RPROC_MEM_MAX]; struct clk *clk; + struct mbox_client cl; + struct mbox_chan*tx_ch; + struct mbox_chan*rx_ch; + struct work_struct rproc_work; + struct workqueue_struct *workqueue; }; static const struct imx_rproc_att imx_rproc_att_imx8mq[] = { @@ -366,9 +375,33 @@ static int imx_rproc_parse_fw(struct rproc *rproc, const struct firmware *fw) return 0; } +static void imx_rproc_kick(struct rproc *rproc, int vqid) +{ + struct imx_rproc *priv = rproc->priv; + int err; + __u32 mmsg; + + if (!priv->tx_ch) { + dev_err(priv->dev, "No initialized mbox tx channel\n"); + return; + } + + /* +* Send the index of the triggered virtqueue as the mu payload. +* Let remote processor know which virtqueue is used. 
+*/ + mmsg = vqid << 16; + + err = mbox_send_message(priv->tx_ch, (void *)); + if (err < 0) + dev_err(priv->dev, "%s: failed (%d, err:%d)\n", + __func__, vqid, err); +} + static const struct rproc_ops imx_rproc_ops = { .start = imx_rproc_start, .stop = imx_rproc_stop, + .kick = imx_rproc_kick, .da_to_va = imx_rproc_da_to_va, .load = rproc_elf_load_segments, .parse_fw = imx_rproc_parse_fw, @@ -444,6 +477,66 @@ static int imx_rproc_addr_init(struct imx_rproc *priv, return 0; } +static void imx_rproc_vq_work(struct work_struct *work) +{ + struct imx_rproc *priv = container_of(work, struct imx_rproc, + rproc_work); + + rproc_vq_interrupt(priv->rproc, 0); + rproc_vq_interrupt(priv->rproc, 1); +} + +static void imx_rproc_rx_callback(struct mbox_client *cl, void *msg) +{ + struct rproc *rproc = dev_get_drvdata(cl->dev); + struct imx_rproc *priv = rproc->priv; + + queue_work(priv->workqueue, >rproc_work); +} + +static int imx_rproc_xtr_mbox_init(struct rproc *rproc) +{ + struct imx_rproc *priv = rproc->priv; + struct device *dev = priv->dev; + struct mbox_client *cl; + int ret; + + if (!of_get_property(dev->of_node, "mbox-names", NULL)) + return 0; + + cl = >cl; + cl->dev = dev; + cl->tx_block = true; + cl->tx_tout = 100; + cl->knows_txdone = false; + cl->rx_callback = imx_rproc_rx_callback; + + priv->tx_ch = mbox_request_channel_byname(cl, "tx"); + if (IS_ERR(priv->tx_ch)) { + ret = PTR_ERR(priv->tx_ch); + return dev_err_probe(cl->dev, ret, +"failed to request tx mailbox channel: %d\n", ret); + } + + priv->rx_ch = mbox_request_channel_byname(cl, "rx"); + if (IS_ERR(priv->rx_ch)) { + mbox_free_channel(priv->tx_ch); + ret = PTR_ERR(priv->rx_ch); + return dev_err_probe(cl->dev, ret, +"failed to request rx mailbox channel: %d\n", ret); + } + + return 0; +} + +static void imx_rproc_free_mbox(struct rproc *rproc) +{ + struct imx_rproc *priv = rproc->priv; + + mbox_free_channel(priv->tx_ch); + mbox_free_channel(priv->rx_ch); +} + static int imx_rproc_probe(struct platform_device 
*pdev) { struct device *dev = >dev; @@ -481,18 +574,28 @@ static int imx_rproc_probe(struct platform_device *pdev) priv->dev = dev; dev_set_drvdata(dev, rproc); + priv->workqueue = create_workqueue(dev_name(dev)); + if (!priv->workqueue) { + dev_err(dev, "cannot create workqueue\n"); + ret = -ENOMEM; + goto err_put_rproc; + } + + ret = imx_rproc_xtr_mbox_init(rproc); + if (ret) + goto err_put_wkq; ret = imx_rproc_addr_init(priv, pdev);
[PATCH V13 08/10] remoteproc: imx_rproc: support i.MX8MQ/M
From: Peng Fan Add i.MX8MQ dev/sys addr map and configuration data structure i.MX8MM share i.MX8MQ settings. Reviewed-by: Richard Zhu Reviewed-by: Mathieu Poirier Signed-off-by: Peng Fan --- drivers/remoteproc/Kconfig | 6 ++--- drivers/remoteproc/imx_rproc.c | 41 +- 2 files changed, 43 insertions(+), 4 deletions(-) diff --git a/drivers/remoteproc/Kconfig b/drivers/remoteproc/Kconfig index 15d1574d129b..7cf3d1b40c55 100644 --- a/drivers/remoteproc/Kconfig +++ b/drivers/remoteproc/Kconfig @@ -24,11 +24,11 @@ config REMOTEPROC_CDEV It's safe to say N if you don't want to use this interface. config IMX_REMOTEPROC - tristate "IMX6/7 remoteproc support" + tristate "i.MX remoteproc support" depends on ARCH_MXC help - Say y here to support iMX's remote processors (Cortex M4 - on iMX7D) via the remote processor framework. + Say y here to support iMX's remote processors via the remote + processor framework. It's safe to say N here. diff --git a/drivers/remoteproc/imx_rproc.c b/drivers/remoteproc/imx_rproc.c index 5ae1f5209548..0124ebf69838 100644 --- a/drivers/remoteproc/imx_rproc.c +++ b/drivers/remoteproc/imx_rproc.c @@ -88,6 +88,34 @@ struct imx_rproc { struct clk *clk; }; +static const struct imx_rproc_att imx_rproc_att_imx8mq[] = { + /* dev addr , sys addr , size , flags */ + /* TCML - alias */ + { 0x, 0x007e, 0x0002, 0 }, + /* OCRAM_S */ + { 0x0018, 0x0018, 0x8000, 0 }, + /* OCRAM */ + { 0x0090, 0x0090, 0x0002, 0 }, + /* OCRAM */ + { 0x0092, 0x0092, 0x0002, 0 }, + /* QSPI Code - alias */ + { 0x0800, 0x0800, 0x0800, 0 }, + /* DDR (Code) - alias */ + { 0x1000, 0x8000, 0x0FFE, 0 }, + /* TCML */ + { 0x1FFE, 0x007E, 0x0002, ATT_OWN }, + /* TCMU */ + { 0x2000, 0x0080, 0x0002, ATT_OWN }, + /* OCRAM_S */ + { 0x2018, 0x0018, 0x8000, ATT_OWN }, + /* OCRAM */ + { 0x2020, 0x0090, 0x0002, ATT_OWN }, + /* OCRAM */ + { 0x2022, 0x0092, 0x0002, ATT_OWN }, + /* DDR (Data) */ + { 0x4000, 0x4000, 0x8000, 0 }, +}; + static const struct imx_rproc_att imx_rproc_att_imx7d[] = { /* dev addr , 
sys addr , size , flags */ /* OCRAM_S (M4 Boot code) - alias */ @@ -138,6 +166,15 @@ static const struct imx_rproc_att imx_rproc_att_imx6sx[] = { { 0x8000, 0x8000, 0x6000, 0 }, }; +static const struct imx_rproc_dcfg imx_rproc_cfg_imx8mq = { + .src_reg= IMX7D_SRC_SCR, + .src_mask = IMX7D_M4_RST_MASK, + .src_start = IMX7D_M4_START, + .src_stop = IMX7D_M4_STOP, + .att= imx_rproc_att_imx8mq, + .att_size = ARRAY_SIZE(imx_rproc_att_imx8mq), +}; + static const struct imx_rproc_dcfg imx_rproc_cfg_imx7d = { .src_reg= IMX7D_SRC_SCR, .src_mask = IMX7D_M4_RST_MASK, @@ -496,6 +533,8 @@ static int imx_rproc_remove(struct platform_device *pdev) static const struct of_device_id imx_rproc_of_match[] = { { .compatible = "fsl,imx7d-cm4", .data = _rproc_cfg_imx7d }, { .compatible = "fsl,imx6sx-cm4", .data = _rproc_cfg_imx6sx }, + { .compatible = "fsl,imx8mq-cm4", .data = _rproc_cfg_imx8mq }, + { .compatible = "fsl,imx8mm-cm4", .data = _rproc_cfg_imx8mq }, {}, }; MODULE_DEVICE_TABLE(of, imx_rproc_of_match); @@ -512,5 +551,5 @@ static struct platform_driver imx_rproc_driver = { module_platform_driver(imx_rproc_driver); MODULE_LICENSE("GPL v2"); -MODULE_DESCRIPTION("IMX6SX/7D remote processor control driver"); +MODULE_DESCRIPTION("i.MX remote processor control driver"); MODULE_AUTHOR("Oleksij Rempel "); -- 2.30.0
[PATCH V13 09/10] remoteproc: imx_rproc: ignore mapping vdev regions
From: Peng Fan vdev regions are vdev0vring0, vdev0vring1, vdevbuffer and similar. They are handled by remoteproc common code, no need to map in imx rproc driver. Signed-off-by: Peng Fan Reviewed-by: Mathieu Poirier --- drivers/remoteproc/imx_rproc.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/remoteproc/imx_rproc.c b/drivers/remoteproc/imx_rproc.c index 0124ebf69838..3685bbd135b0 100644 --- a/drivers/remoteproc/imx_rproc.c +++ b/drivers/remoteproc/imx_rproc.c @@ -417,6 +417,9 @@ static int imx_rproc_addr_init(struct imx_rproc *priv, struct resource res; node = of_parse_phandle(np, "memory-region", a); + /* Not map vdev region */ + if (!strcmp(node->name, "vdev")) + continue; err = of_address_to_resource(node, 0, ); if (err) { dev_err(dev, "unable to resolve memory region\n"); -- 2.30.0
[PATCH V13 06/10] remoteproc: imx_rproc: use devm_ioremap
From: Peng Fan We might need to map a region multiple times, because the region might be shared between remote processors, such as i.MX8QM with dual M4 cores. So use devm_ioremap, not devm_ioremap_resource. Reviewed-by: Oleksij Rempel Reviewed-by: Richard Zhu Signed-off-by: Peng Fan Reviewed-by: Mathieu Poirier --- drivers/remoteproc/imx_rproc.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/remoteproc/imx_rproc.c b/drivers/remoteproc/imx_rproc.c index 2a093cea4997..47fc1d06be6a 100644 --- a/drivers/remoteproc/imx_rproc.c +++ b/drivers/remoteproc/imx_rproc.c @@ -296,7 +296,8 @@ static int imx_rproc_addr_init(struct imx_rproc *priv, if (b >= IMX7D_RPROC_MEM_MAX) break; - priv->mem[b].cpu_addr = devm_ioremap_resource(&pdev->dev, &res); + /* Not use resource version, because we might share region */ + priv->mem[b].cpu_addr = devm_ioremap(&pdev->dev, res.start, resource_size(&res)); if (IS_ERR(priv->mem[b].cpu_addr)) { dev_err(dev, "failed to remap %pr\n", &res); err = PTR_ERR(priv->mem[b].cpu_addr); -- 2.30.0
[PATCH V13 07/10] remoteproc: imx_rproc: add i.MX specific parse fw hook
From: Peng Fan The hook is used to parse memory-regions and load resource table from the address the remote processor published. Reviewed-by: Richard Zhu Reviewed-by: Mathieu Poirier Signed-off-by: Peng Fan --- drivers/remoteproc/imx_rproc.c | 93 ++ 1 file changed, 93 insertions(+) diff --git a/drivers/remoteproc/imx_rproc.c b/drivers/remoteproc/imx_rproc.c index 47fc1d06be6a..5ae1f5209548 100644 --- a/drivers/remoteproc/imx_rproc.c +++ b/drivers/remoteproc/imx_rproc.c @@ -10,6 +10,7 @@ #include #include #include +#include #include #include #include @@ -241,10 +242,102 @@ static void *imx_rproc_da_to_va(struct rproc *rproc, u64 da, size_t len, bool *i return va; } +static int imx_rproc_mem_alloc(struct rproc *rproc, + struct rproc_mem_entry *mem) +{ + struct device *dev = rproc->dev.parent; + void *va; + + dev_dbg(dev, "map memory: %p+%zx\n", >dma, mem->len); + va = ioremap_wc(mem->dma, mem->len); + if (IS_ERR_OR_NULL(va)) { + dev_err(dev, "Unable to map memory region: %p+%zx\n", + >dma, mem->len); + return -ENOMEM; + } + + /* Update memory entry va */ + mem->va = va; + + return 0; +} + +static int imx_rproc_mem_release(struct rproc *rproc, +struct rproc_mem_entry *mem) +{ + dev_dbg(rproc->dev.parent, "unmap memory: %pa\n", >dma); + iounmap(mem->va); + + return 0; +} + +static int imx_rproc_parse_memory_regions(struct rproc *rproc) +{ + struct imx_rproc *priv = rproc->priv; + struct device_node *np = priv->dev->of_node; + struct of_phandle_iterator it; + struct rproc_mem_entry *mem; + struct reserved_mem *rmem; + u32 da; + + /* Register associated reserved memory regions */ + of_phandle_iterator_init(, np, "memory-region", NULL, 0); + while (of_phandle_iterator_next() == 0) { + /* +* Ignore the first memory region which will be used vdev buffer. +* No need to do extra handlings, rproc_add_virtio_dev will handle it. 
+*/ + if (!strcmp(it.node->name, "vdev0buffer")) + continue; + + rmem = of_reserved_mem_lookup(it.node); + if (!rmem) { + dev_err(priv->dev, "unable to acquire memory-region\n"); + return -EINVAL; + } + + /* No need to translate pa to da, i.MX use same map */ + da = rmem->base; + + /* Register memory region */ + mem = rproc_mem_entry_init(priv->dev, NULL, (dma_addr_t)rmem->base, rmem->size, da, + imx_rproc_mem_alloc, imx_rproc_mem_release, + it.node->name); + + if (mem) + rproc_coredump_add_segment(rproc, da, rmem->size); + else + return -ENOMEM; + + rproc_add_carveout(rproc, mem); + } + + return 0; +} + +static int imx_rproc_parse_fw(struct rproc *rproc, const struct firmware *fw) +{ + int ret = imx_rproc_parse_memory_regions(rproc); + + if (ret) + return ret; + + ret = rproc_elf_load_rsc_table(rproc, fw); + if (ret) + dev_info(>dev, "No resource table in elf\n"); + + return 0; +} + static const struct rproc_ops imx_rproc_ops = { .start = imx_rproc_start, .stop = imx_rproc_stop, .da_to_va = imx_rproc_da_to_va, + .load = rproc_elf_load_segments, + .parse_fw = imx_rproc_parse_fw, + .find_loaded_rsc_table = rproc_elf_find_loaded_rsc_table, + .sanity_check = rproc_elf_sanity_check, + .get_boot_addr = rproc_elf_get_boot_addr, }; static int imx_rproc_addr_init(struct imx_rproc *priv, -- 2.30.0
[PATCH V13 05/10] remoteproc: imx_rproc: correct err message
From: Peng Fan The code is using devm_ioremap, not devm_ioremap_resource. Correct the error message and print out the sa/size. Reviewed-by: Bjorn Andersson Reviewed-by: Mathieu Poirier Signed-off-by: Peng Fan --- drivers/remoteproc/imx_rproc.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/remoteproc/imx_rproc.c b/drivers/remoteproc/imx_rproc.c index 6603e00bb6f4..2a093cea4997 100644 --- a/drivers/remoteproc/imx_rproc.c +++ b/drivers/remoteproc/imx_rproc.c @@ -268,7 +268,7 @@ static int imx_rproc_addr_init(struct imx_rproc *priv, priv->mem[b].cpu_addr = devm_ioremap(&pdev->dev, att->sa, att->size); if (!priv->mem[b].cpu_addr) { - dev_err(dev, "devm_ioremap_resource failed\n"); + dev_err(dev, "failed to remap %#x bytes from %#x\n", att->size, att->sa); return -ENOMEM; } priv->mem[b].sys_addr = att->sa; @@ -298,7 +298,7 @@ static int imx_rproc_addr_init(struct imx_rproc *priv, priv->mem[b].cpu_addr = devm_ioremap_resource(&pdev->dev, &res); if (IS_ERR(priv->mem[b].cpu_addr)) { - dev_err(dev, "devm_ioremap_resource failed\n"); + dev_err(dev, "failed to remap %pr\n", &res); err = PTR_ERR(priv->mem[b].cpu_addr); return err; } -- 2.30.0
[PATCH V13 04/10] remoteproc: add is_iomem to da_to_va
From: Peng Fan Introduce an extra parameter is_iomem to da_to_va, then the caller could take the memory as normal memory or io mapped memory. Reviewed-by: Bjorn Andersson Reviewed-by: Mathieu Poirier Reported-by: kernel test robot Signed-off-by: Peng Fan --- drivers/remoteproc/imx_rproc.c | 2 +- drivers/remoteproc/ingenic_rproc.c | 2 +- drivers/remoteproc/keystone_remoteproc.c | 2 +- drivers/remoteproc/mtk_scp.c | 6 +++--- drivers/remoteproc/omap_remoteproc.c | 2 +- drivers/remoteproc/pru_rproc.c | 2 +- drivers/remoteproc/qcom_q6v5_adsp.c| 2 +- drivers/remoteproc/qcom_q6v5_pas.c | 2 +- drivers/remoteproc/qcom_q6v5_wcss.c| 2 +- drivers/remoteproc/qcom_wcnss.c| 2 +- drivers/remoteproc/remoteproc_core.c | 7 +-- drivers/remoteproc/remoteproc_coredump.c | 8 ++-- drivers/remoteproc/remoteproc_debugfs.c| 2 +- drivers/remoteproc/remoteproc_elf_loader.c | 21 +++-- drivers/remoteproc/remoteproc_internal.h | 2 +- drivers/remoteproc/st_slim_rproc.c | 2 +- drivers/remoteproc/ti_k3_dsp_remoteproc.c | 2 +- drivers/remoteproc/ti_k3_r5_remoteproc.c | 2 +- drivers/remoteproc/wkup_m3_rproc.c | 2 +- include/linux/remoteproc.h | 2 +- 20 files changed, 45 insertions(+), 29 deletions(-) diff --git a/drivers/remoteproc/imx_rproc.c b/drivers/remoteproc/imx_rproc.c index 8957ed271d20..6603e00bb6f4 100644 --- a/drivers/remoteproc/imx_rproc.c +++ b/drivers/remoteproc/imx_rproc.c @@ -208,7 +208,7 @@ static int imx_rproc_da_to_sys(struct imx_rproc *priv, u64 da, return -ENOENT; } -static void *imx_rproc_da_to_va(struct rproc *rproc, u64 da, size_t len) +static void *imx_rproc_da_to_va(struct rproc *rproc, u64 da, size_t len, bool *is_iomem) { struct imx_rproc *priv = rproc->priv; void *va = NULL; diff --git a/drivers/remoteproc/ingenic_rproc.c b/drivers/remoteproc/ingenic_rproc.c index e2618c36eaab..a356738160a4 100644 --- a/drivers/remoteproc/ingenic_rproc.c +++ b/drivers/remoteproc/ingenic_rproc.c @@ -121,7 +121,7 @@ static void ingenic_rproc_kick(struct rproc *rproc, int vqid) writel(vqid, 
vpu->aux_base + REG_CORE_MSG); } -static void *ingenic_rproc_da_to_va(struct rproc *rproc, u64 da, size_t len) +static void *ingenic_rproc_da_to_va(struct rproc *rproc, u64 da, size_t len, bool *is_iomem) { struct vpu *vpu = rproc->priv; void __iomem *va = NULL; diff --git a/drivers/remoteproc/keystone_remoteproc.c b/drivers/remoteproc/keystone_remoteproc.c index cd266163a65f..54781f553f4e 100644 --- a/drivers/remoteproc/keystone_remoteproc.c +++ b/drivers/remoteproc/keystone_remoteproc.c @@ -246,7 +246,7 @@ static void keystone_rproc_kick(struct rproc *rproc, int vqid) * can be used either by the remoteproc core for loading (when using kernel * remoteproc loader), or by any rpmsg bus drivers. */ -static void *keystone_rproc_da_to_va(struct rproc *rproc, u64 da, size_t len) +static void *keystone_rproc_da_to_va(struct rproc *rproc, u64 da, size_t len, bool *is_iomem) { struct keystone_rproc *ksproc = rproc->priv; void __iomem *va = NULL; diff --git a/drivers/remoteproc/mtk_scp.c b/drivers/remoteproc/mtk_scp.c index ce727598c41c..9679cc26895e 100644 --- a/drivers/remoteproc/mtk_scp.c +++ b/drivers/remoteproc/mtk_scp.c @@ -272,7 +272,7 @@ static int scp_elf_load_segments(struct rproc *rproc, const struct firmware *fw) } /* grab the kernel address for this device address */ - ptr = (void __iomem *)rproc_da_to_va(rproc, da, memsz); + ptr = (void __iomem *)rproc_da_to_va(rproc, da, memsz, NULL); if (!ptr) { dev_err(dev, "bad phdr da 0x%x mem 0x%x\n", da, memsz); ret = -EINVAL; @@ -509,7 +509,7 @@ static void *mt8192_scp_da_to_va(struct mtk_scp *scp, u64 da, size_t len) return NULL; } -static void *scp_da_to_va(struct rproc *rproc, u64 da, size_t len) +static void *scp_da_to_va(struct rproc *rproc, u64 da, size_t len, bool *is_iomem) { struct mtk_scp *scp = (struct mtk_scp *)rproc->priv; @@ -627,7 +627,7 @@ void *scp_mapping_dm_addr(struct mtk_scp *scp, u32 mem_addr) { void *ptr; - ptr = scp_da_to_va(scp->rproc, mem_addr, 0); + ptr = scp_da_to_va(scp->rproc, mem_addr, 0, 
NULL); if (!ptr) return ERR_PTR(-EINVAL); diff --git a/drivers/remoteproc/omap_remoteproc.c b/drivers/remoteproc/omap_remoteproc.c index d94b7391bf9d..43531caa1959 100644 --- a/drivers/remoteproc/omap_remoteproc.c +++ b/drivers/remoteproc/omap_remoteproc.c @@ -728,7 +728,7 @@ static int omap_rproc_stop(struct rproc *rproc) * Return: translated virtual address in kernel memory space on success, * or NULL on failure. */ -static void *omap_rproc_da_to_va(struct rproc *rproc, u64 da,
Re: [syzbot] upstream boot error: WARNING in kvm_wait
On Fri, Mar 5, 2021 at 9:56 PM syzbot wrote: > > Hello, > > syzbot found the following issue on: > > HEAD commit:280d542f Merge tag 'drm-fixes-2021-03-05' of git://anongit.. > git tree: upstream > console output: https://syzkaller.appspot.com/x/log.txt?x=138c7a92d0 > kernel config: https://syzkaller.appspot.com/x/.config?x=dc4003509ab3fc78 > dashboard link: https://syzkaller.appspot.com/bug?extid=a4c8bc1d1dc7b620630d > > IMPORTANT: if you fix the issue, please add the following tag to the commit: > Reported-by: syzbot+a4c8bc1d1dc7b6206...@syzkaller.appspotmail.com +Mark, I've enabled CONFIG_DEBUG_IRQFLAGS on syzbot and it led to this breakage. Is it a bug in kvm_wait or in the debugging code itself? If it's a real bug, I would assume it's pretty bad as it happens all the time. > [ cut here ] > raw_local_irq_restore() called with IRQs enabled > WARNING: CPU: 2 PID: 213 at kernel/locking/irqflag-debug.c:10 > warn_bogus_irq_restore+0x1d/0x20 kernel/locking/irqflag-debug.c:10 > Modules linked in: > CPU: 2 PID: 213 Comm: kworker/u17:4 Not tainted 5.12.0-rc1-syzkaller #0 > Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014 > Workqueue: events_unbound call_usermodehelper_exec_work > > RIP: 0010:warn_bogus_irq_restore+0x1d/0x20 kernel/locking/irqflag-debug.c:10 > Code: be ff cc cc cc cc cc cc cc cc cc cc cc 80 3d e4 38 af 04 00 74 01 c3 48 > c7 c7 a0 8f 6b 89 c6 05 d3 38 af 04 01 e8 e7 b9 be ff <0f> 0b c3 48 39 77 10 > 0f 84 97 00 00 00 66 f7 47 22 f0 ff 74 4b 48 > RSP: :c9fe7770 EFLAGS: 00010286 > > RAX: RBX: 8c0e9c68 RCX: > RDX: 8880116bc3c0 RSI: 815c0cf5 RDI: f520001fcee0 > RBP: 0200 R08: R09: 0001 > R10: 815b9a5e R11: R12: 0003 > R13: fbfff181d38d R14: 0001 R15: 88802cc36000 > FS: () GS:88802cc0() knlGS: > CS: 0010 DS: ES: CR0: 80050033 > CR2: CR3: 0bc8e000 CR4: 00150ee0 > DR0: DR1: DR2: > DR3: DR6: fffe0ff0 DR7: 0400 > Call Trace: > kvm_wait arch/x86/kernel/kvm.c:860 [inline] > kvm_wait+0xc9/0xe0 arch/x86/kernel/kvm.c:837 > pv_wait 
arch/x86/include/asm/paravirt.h:564 [inline] > pv_wait_head_or_lock kernel/locking/qspinlock_paravirt.h:470 [inline] > __pv_queued_spin_lock_slowpath+0x8b8/0xb40 kernel/locking/qspinlock.c:508 > pv_queued_spin_lock_slowpath arch/x86/include/asm/paravirt.h:554 [inline] > queued_spin_lock_slowpath arch/x86/include/asm/qspinlock.h:51 [inline] > queued_spin_lock include/asm-generic/qspinlock.h:85 [inline] > do_raw_spin_lock+0x200/0x2b0 kernel/locking/spinlock_debug.c:113 > spin_lock include/linux/spinlock.h:354 [inline] > copy_fs_struct+0x1c8/0x340 fs/fs_struct.c:123 > copy_fs kernel/fork.c:1443 [inline] > copy_process+0x4dc2/0x6fd0 kernel/fork.c:2088 > kernel_clone+0xe7/0xab0 kernel/fork.c:2462 > kernel_thread+0xb5/0xf0 kernel/fork.c:2514 > call_usermodehelper_exec_work kernel/umh.c:172 [inline] > call_usermodehelper_exec_work+0xcc/0x180 kernel/umh.c:158 > process_one_work+0x98d/0x1600 kernel/workqueue.c:2275 > worker_thread+0x64c/0x1120 kernel/workqueue.c:2421 > kthread+0x3b1/0x4a0 kernel/kthread.c:292 > ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294 > > > --- > This report is generated by a bot. It may contain errors. > See https://goo.gl/tpsmEJ for more information about syzbot. > syzbot engineers can be reached at syzkal...@googlegroups.com. > > syzbot will keep track of this issue. See: > https://goo.gl/tpsmEJ#status for how to communicate with syzbot. > > -- > You received this message because you are subscribed to the Google Groups > "syzkaller-bugs" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to syzkaller-bugs+unsubscr...@googlegroups.com. > To view this discussion on the web visit > https://groups.google.com/d/msgid/syzkaller-bugs/ccbedd05bcd0504e%40google.com.
[PATCH V13 02/10] dt-bindings: remoteproc: imx_rproc: add i.MX8MQ/M support
From: Peng Fan Add i.MX8MQ/M support, also include mailbox for rpmsg/virtio usage. Signed-off-by: Peng Fan --- .../bindings/remoteproc/fsl,imx-rproc.yaml| 31 ++- 1 file changed, 30 insertions(+), 1 deletion(-) diff --git a/Documentation/devicetree/bindings/remoteproc/fsl,imx-rproc.yaml b/Documentation/devicetree/bindings/remoteproc/fsl,imx-rproc.yaml index 54d2456530a6..208a628f8d6c 100644 --- a/Documentation/devicetree/bindings/remoteproc/fsl,imx-rproc.yaml +++ b/Documentation/devicetree/bindings/remoteproc/fsl,imx-rproc.yaml @@ -4,7 +4,7 @@ $id: "http://devicetree.org/schemas/remoteproc/fsl,imx-rproc.yaml#; $schema: "http://devicetree.org/meta-schemas/core.yaml#; -title: NXP iMX6SX/iMX7D Co-Processor Bindings +title: NXP i.MX Co-Processor Bindings description: This binding provides support for ARM Cortex M4 Co-processor found on some NXP iMX SoCs. @@ -15,6 +15,8 @@ maintainers: properties: compatible: enum: + - fsl,imx8mq-cm4 + - fsl,imx8mm-cm4 - fsl,imx7d-cm4 - fsl,imx6sx-cm4 @@ -26,6 +28,20 @@ properties: description: Phandle to syscon block which provide access to System Reset Controller + mbox-names: +items: + - const: tx + - const: rx + - const: rxdb + + mboxes: +description: + This property is required only if the rpmsg/virtio functionality is used. + List of < type channel> - 1 channel for TX, 1 channel for RX, 1 channel for RXDB. + (see mailbox/fsl,mu.yaml) +minItems: 1 +maxItems: 3 + memory-region: description: If present, a phandle for a reserved memory area that used for vdev buffer, @@ -58,4 +74,17 @@ examples: clocks = < IMX7D_ARM_M4_ROOT_CLK>; }; + - | +#include + +imx8mm-cm4 { + compatible = "fsl,imx8mm-cm4"; + clocks = < IMX8MM_CLK_M4_DIV>; + mbox-names = "tx", "rx", "rxdb"; + mboxes = < 0 1 + 1 1 + 3 1>; + memory-region = <>, <>, <>, <_table>; + syscon = <>; +}; ... -- 2.30.0
[PATCH V13 03/10] remoteproc: introduce is_iomem to rproc_mem_entry
From: Peng Fan

Introduce is_iomem to indicate whether this piece of memory is iomem or not.

Reviewed-by: Bjorn Andersson
Reviewed-by: Mathieu Poirier
Signed-off-by: Peng Fan
---
 include/linux/remoteproc.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/linux/remoteproc.h b/include/linux/remoteproc.h
index f28ee75d1005..a5f6d2d9cde2 100644
--- a/include/linux/remoteproc.h
+++ b/include/linux/remoteproc.h
@@ -315,6 +315,7 @@ struct rproc;
 /**
  * struct rproc_mem_entry - memory entry descriptor
  * @va: virtual address
+ * @is_iomem: io memory
  * @dma: dma address
  * @len: length, in bytes
  * @da: device address
@@ -329,6 +330,7 @@ struct rproc;
 struct rproc_mem_entry {
 	void *va;
+	bool is_iomem;
 	dma_addr_t dma;
 	size_t len;
 	u32 da;
-- 
2.30.0
[PATCH V13 01/10] dt-bindings: remoteproc: convert imx rproc bindings to json-schema
From: Peng Fan Convert the imx rproc binding to DT schema format using json-schema. Reviewed-by: Rob Herring Signed-off-by: Peng Fan --- .../bindings/remoteproc/fsl,imx-rproc.yaml| 61 +++ .../bindings/remoteproc/imx-rproc.txt | 33 -- 2 files changed, 61 insertions(+), 33 deletions(-) create mode 100644 Documentation/devicetree/bindings/remoteproc/fsl,imx-rproc.yaml delete mode 100644 Documentation/devicetree/bindings/remoteproc/imx-rproc.txt diff --git a/Documentation/devicetree/bindings/remoteproc/fsl,imx-rproc.yaml b/Documentation/devicetree/bindings/remoteproc/fsl,imx-rproc.yaml new file mode 100644 index ..54d2456530a6 --- /dev/null +++ b/Documentation/devicetree/bindings/remoteproc/fsl,imx-rproc.yaml @@ -0,0 +1,61 @@ +# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause) +%YAML 1.2 +--- +$id: "http://devicetree.org/schemas/remoteproc/fsl,imx-rproc.yaml#; +$schema: "http://devicetree.org/meta-schemas/core.yaml#; + +title: NXP iMX6SX/iMX7D Co-Processor Bindings + +description: + This binding provides support for ARM Cortex M4 Co-processor found on some NXP iMX SoCs. + +maintainers: + - Peng Fan + +properties: + compatible: +enum: + - fsl,imx7d-cm4 + - fsl,imx6sx-cm4 + + clocks: +maxItems: 1 + + syscon: +$ref: /schemas/types.yaml#/definitions/phandle +description: + Phandle to syscon block which provide access to System Reset Controller + + memory-region: +description: + If present, a phandle for a reserved memory area that used for vdev buffer, + resource table, vring region and others used by remote processor. +minItems: 1 +maxItems: 32 + +required: + - compatible + - clocks + - syscon + +additionalProperties: false + +examples: + - | +#include +m4_reserved_sysmem1: cm4@8000 { + reg = <0x8000 0x8>; +}; + +m4_reserved_sysmem2: cm4@8100 { + reg = <0x8100 0x8>; +}; + +imx7d-cm4 { + compatible = "fsl,imx7d-cm4"; + memory-region= <_reserved_sysmem1>, <_reserved_sysmem2>; + syscon = <>; + clocks = < IMX7D_ARM_M4_ROOT_CLK>; +}; + +... 
diff --git a/Documentation/devicetree/bindings/remoteproc/imx-rproc.txt b/Documentation/devicetree/bindings/remoteproc/imx-rproc.txt deleted file mode 100644 index fbcefd965dc4.. --- a/Documentation/devicetree/bindings/remoteproc/imx-rproc.txt +++ /dev/null @@ -1,33 +0,0 @@ -NXP iMX6SX/iMX7D Co-Processor Bindings - - -This binding provides support for ARM Cortex M4 Co-processor found on some -NXP iMX SoCs. - -Required properties: -- compatible Should be one of: - "fsl,imx7d-cm4" - "fsl,imx6sx-cm4" -- clocks Clock for co-processor (See: ../clock/clock-bindings.txt) -- syscon Phandle to syscon block which provide access to - System Reset Controller - -Optional properties: -- memory-regionlist of phandels to the reserved memory regions. - (See: ../reserved-memory/reserved-memory.txt) - -Example: - m4_reserved_sysmem1: cm4@8000 { - reg = <0x8000 0x8>; - }; - - m4_reserved_sysmem2: cm4@8100 { - reg = <0x8100 0x8>; - }; - - imx7d-cm4 { - compatible = "fsl,imx7d-cm4"; - memory-region = <_reserved_sysmem1>, <_reserved_sysmem2>; - syscon = <>; - clocks = < IMX7D_ARM_M4_ROOT_CLK>; - }; -- 2.30.0
[PATCH V13 00/10] remoteproc: imx_rproc: support iMX8MQ/M
From: Peng Fan V13: Add R-b tag from Rob for patch 1. Drop the reserved memory node from patch 2 per Rob's comment. Mathieu, Bjorn Only patch 2 not have R-b/A-b tag, but since Rob's only has a minor comment, and addressed in this version, is it ok for you take into remoteproc next branch? Thanks. V12: Add maxItems to avoid dt_bindings_check fail Rebased on top of linux-next V11: Per Rob's comments, fix memory-region in patch 1/10 Rebased on top of Linux-next V10: Per Rob's comments, fix patch 1/10 V9: Per Mathieu's comments, update the tile of yaml in patch 2/10 update the Kconfig and MODULE_DESCRIPTION, I merge this change in patch 8/10, since this is a minor change, I still keep Mathieu's R-b tag. If any objection, I could remove. Add R-b tag in Patch 10/10 Rob, please help review patch 1/10 and 2/10 V8: Address sparse warning in patch 4/10 reported by kernel test robot V7: Add R-b tag from Mathieu vdevbuffer->vdev0buffer in patch 1/10, 7/10 correct err msg and shutdown seq per Mathieu's comments in patch 10/10 Hope this version is ok to be merged. V6: Add R-b tag from Mathieu Convert imx-rproc.txt to yaml and add dt-bindings support for i.MX8MQ/M, patch 1/10 2/10 No other changes. V5: Apply on Linux next Add V5 subject prefix Add R-b tag from Bjorn for 1/8, 2/8, 3/8 https://patchwork.kernel.org/project/linux-remoteproc/cover/20201229033019.25899-1-peng@nxp.com/ V4: According to Bjorn's comments, add is_iomem for da to va usage 1/8, 2/8 is new patch 3/8, follow Bjorn's comments to correct/update the err msg. 
6/8, new patch 8/8, use dev_err_probe to simplify code, use queue_work instead schedule_delayed_work V3: Since I was quite busy in the past days, V3 is late Rebased on Linux-next Add R-b tags 1/7: Add R-b tag of Mathieu, add comments 4/7: Typo fix 5/7: Add R-b tag of Mathieu, drop index Per Mathieu's comments 6/7: Add R-b tag of Mathieu 7/7: Add comment for vqid << 16, drop unneeded timeout settings of mailbox Use queue_work instead of schedule_delayed_work free mbox channels when remove https://lkml.org/lkml/2020/12/4/82 V2: Rebased on linux-next Dropped early boot feature to make patchset simple. Drop rsc-da https://patchwork.kernel.org/project/linux-remoteproc/cover/20200927064131.24101-1-peng@nxp.com/ V1: https://patchwork.kernel.org/cover/11682461/ This patchset is to support i.MX8MQ/M coproc. The early boot feature was dropped to make the patchset small in V2. Since i.MX specific TCM memory requirement, add elf platform hook. Several patches have got reviewed by Oleksij and Mathieu in v1. 
Peng Fan (10): dt-bindings: remoteproc: convert imx rproc bindings to json-schema dt-bindings: remoteproc: imx_rproc: add i.MX8MQ/M support remoteproc: introduce is_iomem to rproc_mem_entry remoteproc: add is_iomem to da_to_va remoteproc: imx_rproc: correct err message remoteproc: imx_rproc: use devm_ioremap remoteproc: imx_rproc: add i.MX specific parse fw hook remoteproc: imx_rproc: support i.MX8MQ/M remoteproc: imx_rproc: ignore mapping vdev regions remoteproc: imx_proc: enable virtio/mailbox .../bindings/remoteproc/fsl,imx-rproc.yaml| 90 ++ .../bindings/remoteproc/imx-rproc.txt | 33 --- drivers/remoteproc/Kconfig| 6 +- drivers/remoteproc/imx_rproc.c| 262 +- drivers/remoteproc/ingenic_rproc.c| 2 +- drivers/remoteproc/keystone_remoteproc.c | 2 +- drivers/remoteproc/mtk_scp.c | 6 +- drivers/remoteproc/omap_remoteproc.c | 2 +- drivers/remoteproc/pru_rproc.c| 2 +- drivers/remoteproc/qcom_q6v5_adsp.c | 2 +- drivers/remoteproc/qcom_q6v5_pas.c| 2 +- drivers/remoteproc/qcom_q6v5_wcss.c | 2 +- drivers/remoteproc/qcom_wcnss.c | 2 +- drivers/remoteproc/remoteproc_core.c | 7 +- drivers/remoteproc/remoteproc_coredump.c | 8 +- drivers/remoteproc/remoteproc_debugfs.c | 2 +- drivers/remoteproc/remoteproc_elf_loader.c| 21 +- drivers/remoteproc/remoteproc_internal.h | 2 +- drivers/remoteproc/st_slim_rproc.c| 2 +- drivers/remoteproc/ti_k3_dsp_remoteproc.c | 2 +- drivers/remoteproc/ti_k3_r5_remoteproc.c | 2 +- drivers/remoteproc/wkup_m3_rproc.c| 2 +- include/linux/remoteproc.h| 4 +- 23 files changed, 393 insertions(+), 72 deletions(-) create mode 100644 Documentation/devicetree/bindings/remoteproc/fsl,imx-rproc.yaml delete mode 100644 Documentation/devicetree/bindings/remoteproc/imx-rproc.txt -- 2.30.0
[RFC v2] scripts: kernel-doc: fix attribute capture in function parsing
Currently, kernel-doc warns during function prototype parsing when the attributes "__attribute_const__" and "__flatten" are present in the definition. There are 166 occurrences in ~70 files in the kernel tree for "__attribute_const__" and 5 occurrences in 4 files for "__flatten". Out of the 166, there are 3 occurrences in three different files with "__attribute_const__" and a preceding kernel-doc; and 1 occurrence in ./mm/percpu.c for "__flatten" with a preceding kernel-doc. All other occurrences have no preceding kernel-doc.

Add support for the "__attribute_const__" and "__flatten" attributes. A quick evaluation by running 'kernel-doc -none' on the kernel tree reveals that no additional warning or error has been added or removed by the fix.

Suggested-by: Lukas Bulwahn
Signed-off-by: Aditya Srivastava
---
Changes in v2:
- Remove "__attribute_const__" from the $return_type capture regex and add it to the substituting ones
- Add support for the "__flatten" attribute
- Modify commit message

 scripts/kernel-doc | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/scripts/kernel-doc b/scripts/kernel-doc
index 68df17877384..e1e562b2e2e7 100755
--- a/scripts/kernel-doc
+++ b/scripts/kernel-doc
@@ -1766,12 +1766,14 @@ sub dump_function($$) {
 	$prototype =~ s/^noinline +//;
 	$prototype =~ s/__init +//;
 	$prototype =~ s/__init_or_module +//;
+	$prototype =~ s/__flatten +//;
 	$prototype =~ s/__meminit +//;
 	$prototype =~ s/__must_check +//;
 	$prototype =~ s/__weak +//;
 	$prototype =~ s/__sched +//;
 	$prototype =~ s/__printf\s*\(\s*\d*\s*,\s*\d*\s*\) +//;
 	my $define = $prototype =~ s/^#\s*define\s+//; #ak added
+	$prototype =~ s/__attribute_const__ +//;
 	$prototype =~ s/__attribute__\s*\(\(
 			(?:
 				[\w\s]++	# attribute name
-- 
2.17.1
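The effect of these substitutions can be sketched outside of kernel-doc. The miniature below is in Python rather than the script's Perl, and the attribute list is illustrative only, not the script's full set; it shows how each attribute (plus the trailing spaces the regex requires) is stripped from a prototype before the name and arguments are parsed:

```python
import re

# Illustrative subset of the attributes kernel-doc strips before parsing.
ATTRIBUTES = ["__attribute_const__", "__flatten", "__init", "__must_check", "__weak"]

def strip_attributes(prototype: str) -> str:
    """Remove each attribute and the spaces after it, like the Perl s/attr +// lines."""
    for attr in ATTRIBUTES:
        prototype = re.sub(re.escape(attr) + r" +", "", prototype)
    return prototype

print(strip_attributes("unsigned int __attribute_const__ cpumask_next(int n)"))
# unsigned int cpumask_next(int n)

# The " +" in the pattern matters: "__init" does not eat into
# "__init_or_module", because no space follows the "__init" prefix there.
print(strip_attributes("int __init_or_module setup(void)"))
# int __init_or_module setup(void)
```

This also mirrors why the change is safe: a prototype without any of these attributes passes through the substitutions unchanged.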
[PATCH v3 1/4] ALSA: hda/cirrus: Increase AUTO_CFG_MAX_INS from 8 to 18
In preparation to support the Cirrus Logic CS8409 HDA bridge on new Dell platforms, it is necessary to increase the AUTO_CFG_MAX_INS and HDA_MAX_NUM_INPUTS values.

Currently AUTO_CFG_MAX_INS is limited to 8, but the Cirrus Logic HDA bridge CS8409 has 18 input pins: 16 ASP receivers and 2 DMIC inputs. We have to increase this value to 18 so that generic code can handle this correctly.

Tested on DELL Inspiron-3505, DELL Inspiron-3501, DELL Inspiron-3500

Signed-off-by: Vitaly Rodionov
---
Changes in v3:
- No changes

 sound/pci/hda/hda_auto_parser.h | 2 +-
 sound/pci/hda/hda_local.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/sound/pci/hda/hda_auto_parser.h b/sound/pci/hda/hda_auto_parser.h
index a22ca0e17a08..df63d66af1ab 100644
--- a/sound/pci/hda/hda_auto_parser.h
+++ b/sound/pci/hda/hda_auto_parser.h
@@ -27,7 +27,7 @@ enum {
 };
 
 #define AUTO_CFG_MAX_OUTS	HDA_MAX_OUTS
-#define AUTO_CFG_MAX_INS	8
+#define AUTO_CFG_MAX_INS	18
 
 struct auto_pin_cfg_item {
 	hda_nid_t pin;
diff --git a/sound/pci/hda/hda_local.h b/sound/pci/hda/hda_local.h
index 5beb8aa44ecd..317245a5585d 100644
--- a/sound/pci/hda/hda_local.h
+++ b/sound/pci/hda/hda_local.h
@@ -180,7 +180,7 @@ int snd_hda_create_spdif_in_ctls(struct hda_codec *codec, hda_nid_t nid);
 /*
  * input MUX helper
  */
-#define HDA_MAX_NUM_INPUTS	16
+#define HDA_MAX_NUM_INPUTS	36
 
 struct hda_input_mux_item {
 	char label[32];
 	unsigned int index;
-- 
2.25.1
[PATCH v3 3/4] ALSA: hda/cirrus: Add jack detect interrupt support from CS42L42 companion codec.
In the case of CS8409 we do not have unsol events from NID's 0x24 and 0x34 where hs mic and hp are connected. Companion codec CS42L42 will generate interrupt via gpio 4 to notify jack events. We have to overwrite standard snd_hda_jack_unsol_event(), read CS42L42 jack detect status registers and then notify status via generic snd_hda_jack_unsol_event() call. Tested on DELL Inspiron-3500, DELL Inspiron-3501, DELL Inspiron-3505. Signed-off-by: Vitaly Rodionov --- Changes in v3: - Fixed missing static function declaration warning (Reported-by: kernel test robot ) - Improved unsolicited events handling for headset type 4 sound/pci/hda/patch_cirrus.c | 309 ++- 1 file changed, 307 insertions(+), 2 deletions(-) diff --git a/sound/pci/hda/patch_cirrus.c b/sound/pci/hda/patch_cirrus.c index d664eed5c3cf..1d2f6a1224e6 100644 --- a/sound/pci/hda/patch_cirrus.c +++ b/sound/pci/hda/patch_cirrus.c @@ -9,6 +9,7 @@ #include #include #include +#include #include #include #include @@ -38,6 +39,15 @@ struct cs_spec { /* for MBP SPDIF control */ int (*spdif_sw_put)(struct snd_kcontrol *kcontrol, struct snd_ctl_elem_value *ucontrol); + + unsigned int cs42l42_hp_jack_in:1; + unsigned int cs42l42_mic_jack_in:1; + + struct mutex cs8409_i2c_mux; + + /* verb exec op override */ + int (*exec_verb)(struct hdac_device *dev, unsigned int cmd, +unsigned int flags, unsigned int *res); }; /* available models with CS420x */ @@ -1229,6 +1239,13 @@ static int patch_cs4213(struct hda_codec *codec) #define CS8409_CS42L42_SPK_PIN_NID 0x2c #define CS8409_CS42L42_AMIC_PIN_NID0x34 #define CS8409_CS42L42_DMIC_PIN_NID0x44 +#define CS8409_CS42L42_DMIC_ADC_PIN_NID0x22 + +#define CS42L42_HSDET_AUTO_DONE0x02 +#define CS42L42_HSTYPE_MASK0x03 + +#define CS42L42_JACK_INSERTED 0x0C +#define CS42L42_JACK_REMOVED 0x00 #define GPIO3_INT (1 << 3) #define GPIO4_INT (1 << 4) @@ -1429,6 +1446,7 @@ static const struct cs8409_i2c_param cs42l42_init_reg_seq[] = { { 0x1C03, 0xC0 }, { 0x1105, 0x00 }, { 0x1112, 0xC0 }, + { 0x1101, 
0x02 }, {} /* Terminator */ }; @@ -1565,6 +1583,8 @@ static unsigned int cs8409_i2c_write(struct hda_codec *codec, /* Assert/release RTS# line to CS42L42 */ static void cs8409_cs42l42_reset(struct hda_codec *codec) { + struct cs_spec *spec = codec->spec; + /* Assert RTS# line */ snd_hda_codec_write(codec, codec->core.afg, 0, AC_VERB_SET_GPIO_DATA, 0); @@ -1576,21 +1596,190 @@ static void cs8409_cs42l42_reset(struct hda_codec *codec) /* wait ~10ms */ usleep_range(1, 15000); - /* Clear interrupts status */ + mutex_lock(>cs8409_i2c_mux); + + /* Clear interrupts, by reading interrupt status registers */ cs8409_i2c_read(codec, CS42L42_I2C_ADDR, 0x1308, 1); cs8409_i2c_read(codec, CS42L42_I2C_ADDR, 0x1309, 1); cs8409_i2c_read(codec, CS42L42_I2C_ADDR, 0x130A, 1); cs8409_i2c_read(codec, CS42L42_I2C_ADDR, 0x130F, 1); + mutex_unlock(>cs8409_i2c_mux); + +} + +/* Configure CS42L42 slave codec for jack autodetect */ +static int cs8409_cs42l42_enable_jack_detect(struct hda_codec *codec) +{ + struct cs_spec *spec = codec->spec; + + mutex_lock(>cs8409_i2c_mux); + + /* Set TIP_SENSE_EN for analog front-end of tip sense. 
*/ + cs8409_i2c_write(codec, CS42L42_I2C_ADDR, 0x1b70, 0x0020, 1); + /* Clear WAKE# */ + cs8409_i2c_write(codec, CS42L42_I2C_ADDR, 0x1b71, 0x0001, 1); + /* Wait ~2.5ms */ + usleep_range(2500, 3000); + /* Set mode WAKE# output follows the combination logic directly */ + cs8409_i2c_write(codec, CS42L42_I2C_ADDR, 0x1b71, 0x0020, 1); + /* Clear interrupts status */ + cs8409_i2c_read(codec, CS42L42_I2C_ADDR, 0x130f, 1); + cs8409_i2c_read(codec, CS42L42_I2C_ADDR, 0x1b7b, 1); + /* Enable interrupt */ + cs8409_i2c_write(codec, CS42L42_I2C_ADDR, 0x1320, 0x03, 1); + cs8409_i2c_write(codec, CS42L42_I2C_ADDR, 0x1b79, 0x00, 1); + + mutex_unlock(>cs8409_i2c_mux); + + return 0; +} + +/* Enable and run CS42L42 slave codec jack auto detect */ +static void cs8409_cs42l42_run_jack_detect(struct hda_codec *codec) +{ + struct cs_spec *spec = codec->spec; + + mutex_lock(>cs8409_i2c_mux); + + /* Clear interrupts */ + cs8409_i2c_read(codec, CS42L42_I2C_ADDR, 0x1308, 1); + cs8409_i2c_read(codec, CS42L42_I2C_ADDR, 0x1b77, 1); + + cs8409_i2c_write(codec, CS42L42_I2C_ADDR, 0x1102, 0x87, 1); + cs8409_i2c_write(codec, CS42L42_I2C_ADDR, 0x1f06, 0x86, 1); + cs8409_i2c_write(codec, CS42L42_I2C_ADDR, 0x1b74, 0x07, 1); + cs8409_i2c_write(codec, CS42L42_I2C_ADDR, 0x131b, 0x01, 1); + cs8409_i2c_write(codec, CS42L42_I2C_ADDR, 0x1120, 0x80,
[PATCH v3 0/4] ALSA: hda/cirrus: Add support for CS8409 HDA bridge and CS42L42 companion codec
Dell's laptops Inspiron 3500, Inspiron 3501, Inspiron 3505 are using Cirrus Logic CS8409 HDA bridge with CS42L42 companion codec. The CS8409 is a multichannel HD audio routing controller. CS8409 includes support for four channels of digital microphone data and two bidirectional ASPs for up to 32 channels of TDM data or 4 channels of I2S data. The CS8409 is intended to be used with a remote companion codec that implements high performance analog functions in close physical proximity to the end-equipment audio port or speaker driver. The CS42L42 is a low-power audio codec with integrated MIPI SoundWire interface or I2C/I2S/TDM interfaces designed for portable applications. It provides a high-dynamic range, stereo DAC for audio playback and a mono high-dynamic-range ADC for audio capture Changes since version 1: ALSA: hda/cirrus: Increase AUTO_CFG_MAX_INS from 8 to 18 * No change ALSA: hda/cirrus: Add support for CS8409 HDA bridge and CS42L42 companion codec. * Removed redundant fields in fixup table * Handle gpio via spec->gpio_dir, spec->gpio_data and spec->gpio_mask * Moved cs8409_cs42l42_init() from patch 2, to handle resume correctly ALSA: hda/cirrus: Add jack detect interrupt support from CS42L42 companion codec. * Run scripts/checkpatch.pl, fixed new warnings ALSA: hda/cirrus: Add Headphone and Headset MIC Volume Control * Moved control values to cache to avoid i2c read at each time. Stefan Binding (1): ALSA: hda/cirrus: Add Headphone and Headset MIC Volume Control Vitaly Rodionov (3): ALSA: hda/cirrus: Increase AUTO_CFG_MAX_INS from 8 to 18 ALSA: hda/cirrus: Add support for CS8409 HDA bridge and CS42L42 companion codec. ALSA: hda/cirrus: Add jack detect interrupt support from CS42L42 companion codec. sound/pci/hda/hda_auto_parser.h |2 +- sound/pci/hda/hda_local.h |2 +- sound/pci/hda/patch_cirrus.c| 1081 +++ 3 files changed, 1083 insertions(+), 2 deletions(-) -- 2.25.1
[PATCH v3 2/4] ALSA: hda/cirrus: Add support for CS8409 HDA bridge and CS42L42 companion codec.
Dell's laptops Inspiron 3500, Inspiron 3501, Inspiron 3505 are using Cirrus Logic CS8409 HDA bridge with CS42L42 companion codec. The CS8409 is a multichannel HD audio routing controller. CS8409 includes support for four channels of digital microphone data and two bidirectional ASPs for up to 32 channels of TDM data or 4 channels of I2S data. The CS8409 is intended to be used with a remote companion codec that implements high performance analog functions in close physical proximity to the end-equipment audio port or speaker driver. The CS42L42 is a low-power audio codec with integrated MIPI SoundWire interface or I2C/I2S/TDM interfaces designed for portable applications. It provides a high-dynamic range, stereo DAC for audio playback and a mono high-dynamic-range ADC for audio capture CS42L42 is connected to CS8409 HDA bridge via I2C and I2S. CS8409 CS42L42 --- ASP1.A TX --> ASP_SDIN ASP1.A RX <-- ASP_SDOUT GPIO5 --> RST# GPIO4 <-- INT# GPIO3 <-- WAKE# GPIO7 <-> I2C SDA GPIO6 --> I2C CLK Tested on DELL Inspiron-3500, DELL Inspiron-3501, DELL Inspiron-3505 This patch will register CS8409 with sound card and create input/output paths and two input devices, initialise CS42L42 companion codec and configure it for ASP TX/RX TDM mode, 24bit, 48kHz. 
cat /proc/asound/pcm 00-00: CS8409 Analog : CS8409 Analog : playback 1 : capture 1 00-03: HDMI 0 : HDMI 0 : playback 1 dmesg snd_hda_codec_cirrus hdaudioC0D0: autoconfig for CS8409: line_outs=1 (0x2c/0x0/0x0/0x0/0x0) type:speaker snd_hda_codec_cirrus hdaudioC0D0:speaker_outs=0 (0x0/0x0/0x0/0x0/0x0) snd_hda_codec_cirrus hdaudioC0D0:hp_outs=1 (0x24/0x0/0x0/0x0/0x0) snd_hda_codec_cirrus hdaudioC0D0:mono: mono_out=0x0 snd_hda_codec_cirrus hdaudioC0D0:inputs: snd_hda_codec_cirrus hdaudioC0D0: Internal Mic=0x44 snd_hda_codec_cirrus hdaudioC0D0: Mic=0x34 input: HDA Intel PCH Headphone as /devices/pci:00/:00:1f.3/sound/card0/input8 input: HDA Intel PCH Headset Mic as /devices/pci:00/:00:1f.3/sound/card0/input9 Signed-off-by: Vitaly Rodionov --- Changes in v3: - Fixed uninitialized variable warning (Reported-by: kernel test robot ) - Moved gpio setup into s8409_cs42l42_hw_init() - Improved susped() implementation sound/pci/hda/patch_cirrus.c | 576 +++ 1 file changed, 576 insertions(+) diff --git a/sound/pci/hda/patch_cirrus.c b/sound/pci/hda/patch_cirrus.c index f46204ab0b90..d664eed5c3cf 100644 --- a/sound/pci/hda/patch_cirrus.c +++ b/sound/pci/hda/patch_cirrus.c @@ -9,6 +9,7 @@ #include #include #include +#include #include #include #include "hda_local.h" @@ -1219,6 +1220,580 @@ static int patch_cs4213(struct hda_codec *codec) return err; } +/* Cirrus Logic CS8409 HDA bridge with + * companion codec CS42L42 + */ +#define CS8409_VENDOR_NID 0x47 + +#define CS8409_CS42L42_HP_PIN_NID 0x24 +#define CS8409_CS42L42_SPK_PIN_NID 0x2c +#define CS8409_CS42L42_AMIC_PIN_NID0x34 +#define CS8409_CS42L42_DMIC_PIN_NID0x44 + +#define GPIO3_INT (1 << 3) +#define GPIO4_INT (1 << 4) +#define GPIO5_INT (1 << 5) + +#define CS42L42_I2C_ADDR (0x48 << 1) + +#define CIR_I2C_ADDR 0x0059 +#define CIR_I2C_DATA 0x005A +#define CIR_I2C_CTRL 0x005B +#define CIR_I2C_STATUS 0x005C +#define CIR_I2C_QWRITE 0x005D +#define CIR_I2C_QREAD 0x005E + +struct cs8409_i2c_param { + unsigned int addr; + unsigned int 
reg; +}; + +struct cs8409_cir_param { + unsigned int nid; + unsigned int cir; + unsigned int coeff; +}; + +enum { + CS8409_BULLSEYE, + CS8409_WARLOCK, + CS8409_CYBORG, + CS8409_VERBS, +}; + +/* Dell Inspiron models with cs8409/cs42l42 */ +static const struct hda_model_fixup cs8409_models[] = { + { .id = CS8409_BULLSEYE, .name = "bullseye" }, + { .id = CS8409_WARLOCK, .name = "warlock" }, + { .id = CS8409_CYBORG, .name = "cyborg" }, + {} +}; + +/* Dell Inspiron platforms + * with cs8409 bridge and cs42l42 codec + */ +static const struct snd_pci_quirk cs8409_fixup_tbl[] = { + SND_PCI_QUIRK(0x1028, 0x0A11, "Bullseye", CS8409_BULLSEYE), + SND_PCI_QUIRK(0x1028, 0x0A12, "Bullseye", CS8409_BULLSEYE), + SND_PCI_QUIRK(0x1028, 0x0A23, "Bullseye", CS8409_BULLSEYE), + SND_PCI_QUIRK(0x1028, 0x0A24, "Bullseye", CS8409_BULLSEYE), + SND_PCI_QUIRK(0x1028, 0x0A25, "Bullseye", CS8409_BULLSEYE), + SND_PCI_QUIRK(0x1028, 0x0A29, "Bullseye", CS8409_BULLSEYE), + SND_PCI_QUIRK(0x1028, 0x0A2A, "Bullseye", CS8409_BULLSEYE), + SND_PCI_QUIRK(0x1028, 0x0A2B, "Bullseye", CS8409_BULLSEYE), + SND_PCI_QUIRK(0x1028, 0x0AB0, "Warlock", CS8409_WARLOCK), + SND_PCI_QUIRK(0x1028, 0x0AB2, "Warlock", CS8409_WARLOCK), + SND_PCI_QUIRK(0x1028, 0x0AB1, "Warlock", CS8409_WARLOCK), + SND_PCI_QUIRK(0x1028, 0x0AB3, "Warlock", CS8409_WARLOCK), + SND_PCI_QUIRK(0x1028, 0x0AB4, "Warlock",
[PATCH] media: vidtv: remove duplicate include in vidtv_psi
From: Zhang Yunkai

'string.h' included in 'vidtv_psi.c' is duplicated.

Signed-off-by: Zhang Yunkai
---
 drivers/media/test-drivers/vidtv/vidtv_psi.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/media/test-drivers/vidtv/vidtv_psi.c b/drivers/media/test-drivers/vidtv/vidtv_psi.c
index 47ed7907db8d..c11ac8dca73d 100644
--- a/drivers/media/test-drivers/vidtv/vidtv_psi.c
+++ b/drivers/media/test-drivers/vidtv/vidtv_psi.c
@@ -19,7 +19,6 @@
 #include
 #include
 #include
-#include
 #include
 #include
-- 
2.25.1
[syzbot] bpf boot error: WARNING in kvm_wait
Hello, syzbot found the following issue on: HEAD commit:edbea922 veth: Store queue_mapping independently of XDP pr.. git tree: bpf console output: https://syzkaller.appspot.com/x/log.txt?x=113ae02ad0 kernel config: https://syzkaller.appspot.com/x/.config?x=402784bff477e1ac dashboard link: https://syzkaller.appspot.com/bug?extid=46fc491326a456ff8127 IMPORTANT: if you fix the issue, please add the following tag to the commit: Reported-by: syzbot+46fc491326a456ff8...@syzkaller.appspotmail.com [ cut here ] raw_local_irq_restore() called with IRQs enabled WARNING: CPU: 0 PID: 4787 at kernel/locking/irqflag-debug.c:10 warn_bogus_irq_restore+0x1d/0x20 kernel/locking/irqflag-debug.c:10 Modules linked in: CPU: 0 PID: 4787 Comm: systemd-getty-g Not tainted 5.11.0-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:warn_bogus_irq_restore+0x1d/0x20 kernel/locking/irqflag-debug.c:10 Code: be ff cc cc cc cc cc cc cc cc cc cc cc 80 3d 1e 62 b0 04 00 74 01 c3 48 c7 c7 a0 8e 6b 89 c6 05 0d 62 b0 04 01 e8 57 da be ff <0f> 0b c3 48 39 77 10 0f 84 97 00 00 00 66 f7 47 22 f0 ff 74 4b 48 RSP: 0018:c900012efc40 EFLAGS: 00010282 RAX: RBX: 8be28b80 RCX: RDX: 888023de5340 RSI: 815bea35 RDI: f5200025df7a RBP: 0200 R08: R09: 0001 R10: 815b77be R11: R12: 0003 R13: fbfff17c5170 R14: 0001 R15: 8880b9c35f40 FS: () GS:8880b9c0() knlGS: CS: 0010 DS: ES: CR0: 80050033 CR2: 7fa257bcaab4 CR3: 0bc8e000 CR4: 001506f0 DR0: DR1: DR2: DR3: DR6: fffe0ff0 DR7: 0400 Call Trace: kvm_wait arch/x86/kernel/kvm.c:860 [inline] kvm_wait+0xc9/0xe0 arch/x86/kernel/kvm.c:837 pv_wait arch/x86/include/asm/paravirt.h:564 [inline] pv_wait_head_or_lock kernel/locking/qspinlock_paravirt.h:470 [inline] __pv_queued_spin_lock_slowpath+0x8b8/0xb40 kernel/locking/qspinlock.c:508 pv_queued_spin_lock_slowpath arch/x86/include/asm/paravirt.h:554 [inline] queued_spin_lock_slowpath arch/x86/include/asm/qspinlock.h:51 [inline] queued_spin_lock 
include/asm-generic/qspinlock.h:85 [inline] do_raw_spin_lock+0x200/0x2b0 kernel/locking/spinlock_debug.c:113 spin_lock include/linux/spinlock.h:354 [inline] check_stack_usage kernel/exit.c:715 [inline] do_exit+0x1d6a/0x2ae0 kernel/exit.c:868 do_group_exit+0x125/0x310 kernel/exit.c:922 __do_sys_exit_group kernel/exit.c:933 [inline] __se_sys_exit_group kernel/exit.c:931 [inline] __x64_sys_exit_group+0x3a/0x50 kernel/exit.c:931 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7fa2592a3618 Code: Unable to access opcode bytes at RIP 0x7fa2592a35ee. RSP: 002b:7ffc579980b8 EFLAGS: 0246 ORIG_RAX: 00e7 RAX: ffda RBX: RCX: 7fa2592a3618 RDX: RSI: 003c RDI: RBP: 7fa2595808e0 R08: 00e7 R09: fee8 R10: 7fa25775e158 R11: 0246 R12: 7fa2595808e0 R13: 7fa259585c20 R14: R15: --- This report is generated by a bot. It may contain errors. See https://goo.gl/tpsmEJ for more information about syzbot. syzbot engineers can be reached at syzkal...@googlegroups.com. syzbot will keep track of this issue. See: https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
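The invariant behind the warning above can be modeled in a few lines. This is a toy illustration, not the kernel's implementation: the names mirror the kernel's helpers, but the bodies are invented for the sketch. Restoring a saved IRQ state only makes sense while interrupts are disabled, so a restore that runs with IRQs already enabled (as happens when something like safe_halt() re-enables them inside kvm_wait()) is flagged as bogus:

```python
# Toy model of the irqflag-debug check: a save/restore pair is balanced
# only if interrupts are still disabled at restore time.
irqs_enabled = False
warnings = []

def local_irq_save():
    """Record the current IRQ state and disable interrupts (stand-in)."""
    global irqs_enabled
    flags = irqs_enabled
    irqs_enabled = False
    return flags

def raw_local_irq_restore(flags):
    """Restore a saved state; warn if IRQs are already enabled (stand-in)."""
    global irqs_enabled
    if irqs_enabled:
        warnings.append("raw_local_irq_restore() called with IRQs enabled")
    irqs_enabled = flags

# Correct pairing: no warning is recorded.
f = local_irq_save()
raw_local_irq_restore(f)

# Buggy pattern like the one in kvm_wait(): something re-enabled IRQs
# (here a stand-in for a halt-with-interrupts-enabled path) before restore.
f = local_irq_save()
irqs_enabled = True
raw_local_irq_restore(f)

assert warnings == ["raw_local_irq_restore() called with IRQs enabled"]
```

The fix for such reports is usually to keep the window between save and restore free of anything that re-enables interrupts, or to drop the save/restore pair where it is not needed.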
[tip: x86/seves] x86/sev-es: Remove subtraction of res variable
The following commit has been merged into the x86/seves branch of tip:

Commit-ID:     f3db3365c069c2a8505cdee8033fe3d22d2fe6c0
Gitweb:        https://git.kernel.org/tip/f3db3365c069c2a8505cdee8033fe3d22d2fe6c0
Author:        Borislav Petkov
AuthorDate:    Tue, 23 Feb 2021 12:03:19 +01:00
Committer:     Borislav Petkov
CommitterDate: Sat, 06 Mar 2021 12:08:53 +01:00

x86/sev-es: Remove subtraction of res variable

vc_decode_insn() calls copy_from_kernel_nofault() by way of vc_fetch_insn_kernel() to fetch 15 bytes max of opcodes to decode.

copy_from_kernel_nofault() returns negative on error and 0 on success. The error case is handled by returning ES_EXCEPTION.

In the success case, the ret variable which contains the return value is 0 so there's no need to subtract it from MAX_INSN_SIZE when initializing the insn buffer for further decoding. Remove it.

No functional changes.

Signed-off-by: Borislav Petkov
Reviewed-by: Joerg Roedel
Link: https://lkml.kernel.org/r/2021022330.16201-1...@alien8.de
---
 arch/x86/kernel/sev-es.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/sev-es.c b/arch/x86/kernel/sev-es.c
index 84c1821..1e78f4b 100644
--- a/arch/x86/kernel/sev-es.c
+++ b/arch/x86/kernel/sev-es.c
@@ -267,7 +267,7 @@ static enum es_result vc_decode_insn(struct es_em_ctxt *ctxt)
 		return ES_EXCEPTION;
 	}
 
-	insn_init(&ctxt->insn, buffer, MAX_INSN_SIZE - res, 1);
+	insn_init(&ctxt->insn, buffer, MAX_INSN_SIZE, 1);
 	insn_get_length(&ctxt->insn);
 }
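The arithmetic the commit relies on is small enough to state directly. A sketch follows, with copy_from_kernel_nofault() replaced by a stand-in that only follows the return-value contract described above (0 on success, negative errno on failure, never a byte count):

```python
MAX_INSN_SIZE = 15  # upper bound on x86 instruction length used by the decoder

def copy_from_kernel_nofault_stub(dst: bytearray, src: bytes, size: int) -> int:
    """Stand-in for the kernel helper: negative errno on failure, 0 on success."""
    if len(src) < size:
        return -14  # -EFAULT
    dst[:size] = src[:size]
    return 0

buffer = bytearray(MAX_INSN_SIZE)
res = copy_from_kernel_nofault_stub(buffer, b"\x90" * 64, MAX_INSN_SIZE)

# On the success path res is always 0, so the old expression
# MAX_INSN_SIZE - res is just MAX_INSN_SIZE and the subtraction can go.
assert res == 0
assert MAX_INSN_SIZE - res == MAX_INSN_SIZE
```

The error path never reaches the subtraction, since a negative res makes vc_decode_insn() return ES_EXCEPTION first.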
[PATCH v3 4/4] ALSA: hda/cirrus: Add Headphone and Headset MIC Volume Control
From: Stefan Binding

CS8409 does not support Volume Control for NIDs 0x24 (the Headphones) or 0x34 (the Headset Mic). However, the CS42L42 codec does support gain control for both. We can add support for Volume Controls by writing to the CS42L42 regmap via i2c commands, using custom info, get and put volume functions saved in the control.

Tested on DELL Inspiron-3500, DELL Inspiron-3501, DELL Inspiron-3505

Signed-off-by: Stefan Binding
Signed-off-by: Vitaly Rodionov
---
Changes in v3:
- Added restore volumes after resume
- Removed redundant debug logging after testing

 sound/pci/hda/patch_cirrus.c | 200 +++
 1 file changed, 200 insertions(+)

diff --git a/sound/pci/hda/patch_cirrus.c b/sound/pci/hda/patch_cirrus.c
index 1d2f6a1224e6..6a9e5c803977 100644
--- a/sound/pci/hda/patch_cirrus.c
+++ b/sound/pci/hda/patch_cirrus.c
@@ -21,6 +21,9 @@
 /*
  */
 
+#define CS42L42_HP_CH (2U)
+#define CS42L42_HS_MIC_CH (1U)
+
 struct cs_spec {
 	struct hda_gen_spec gen;
 
@@ -42,6 +45,9 @@ struct cs_spec {
 	unsigned int cs42l42_hp_jack_in:1;
 	unsigned int cs42l42_mic_jack_in:1;
+	unsigned int cs42l42_volume_init:1;
+	char cs42l42_hp_volume[CS42L42_HP_CH];
+	char cs42l42_hs_mic_volume[CS42L42_HS_MIC_CH];
 
 	struct mutex cs8409_i2c_mux;
 
@@ -1260,6 +1266,14 @@ static int patch_cs4213(struct hda_codec *codec)
 #define CIR_I2C_QWRITE			0x005D
 #define CIR_I2C_QREAD			0x005E
 
+#define CS8409_CS42L42_HP_VOL_REAL_MIN		(-63)
+#define CS8409_CS42L42_HP_VOL_REAL_MAX		(0)
+#define CS8409_CS42L42_AMIC_VOL_REAL_MIN	(-97)
+#define CS8409_CS42L42_AMIC_VOL_REAL_MAX	(12)
+#define CS8409_CS42L42_REG_HS_VOLUME_CHA	(0x2301)
+#define CS8409_CS42L42_REG_HS_VOLUME_CHB	(0x2303)
+#define CS8409_CS42L42_REG_AMIC_VOLUME		(0x1D03)
+
 struct cs8409_i2c_param {
 	unsigned int addr;
 	unsigned int reg;
@@ -1580,6 +1594,165 @@ static unsigned int cs8409_i2c_write(struct hda_codec *codec,
 	return retval;
 }
 
+static int cs8409_cs42l42_volume_info(struct snd_kcontrol *kcontrol,
+				      struct snd_ctl_elem_info *uinfo)
+{
+	struct hda_codec *codec = snd_kcontrol_chip(kcontrol);
+	u16 nid = get_amp_nid(kcontrol);
+	u8 chs = get_amp_channels(kcontrol);
+
+	codec_dbg(codec, "%s() nid: %d\n", __func__, nid);
+	switch (nid) {
+	case CS8409_CS42L42_HP_PIN_NID:
+		uinfo->type = SNDRV_CTL_ELEM_TYPE_INTEGER;
+		uinfo->count = chs == 3 ? 2 : 1;
+		uinfo->value.integer.min = CS8409_CS42L42_HP_VOL_REAL_MIN;
+		uinfo->value.integer.max = CS8409_CS42L42_HP_VOL_REAL_MAX;
+		break;
+	case CS8409_CS42L42_AMIC_PIN_NID:
+		uinfo->type = SNDRV_CTL_ELEM_TYPE_INTEGER;
+		uinfo->count = chs == 3 ? 2 : 1;
+		uinfo->value.integer.min = CS8409_CS42L42_AMIC_VOL_REAL_MIN;
+		uinfo->value.integer.max = CS8409_CS42L42_AMIC_VOL_REAL_MAX;
+		break;
+	default:
+		break;
+	}
+	return 0;
+}
+
+static void cs8409_cs42l42_update_volume(struct hda_codec *codec)
+{
+	struct cs_spec *spec = codec->spec;
+
+	mutex_lock(&spec->cs8409_i2c_mux);
+	spec->cs42l42_hp_volume[0] = -(cs8409_i2c_read(codec, CS42L42_I2C_ADDR,
+				CS8409_CS42L42_REG_HS_VOLUME_CHA, 1));
+	spec->cs42l42_hp_volume[1] = -(cs8409_i2c_read(codec, CS42L42_I2C_ADDR,
+				CS8409_CS42L42_REG_HS_VOLUME_CHB, 1));
+	spec->cs42l42_hs_mic_volume[0] = -(cs8409_i2c_read(codec, CS42L42_I2C_ADDR,
+				CS8409_CS42L42_REG_AMIC_VOLUME, 1));
+	mutex_unlock(&spec->cs8409_i2c_mux);
+	spec->cs42l42_volume_init = 1;
+}
+
+static int cs8409_cs42l42_volume_get(struct snd_kcontrol *kcontrol,
+				     struct snd_ctl_elem_value *ucontrol)
+{
+	struct hda_codec *codec = snd_kcontrol_chip(kcontrol);
+	struct cs_spec *spec = codec->spec;
+	hda_nid_t nid = get_amp_nid(kcontrol);
+	int chs = get_amp_channels(kcontrol);
+	long *valp = ucontrol->value.integer.value;
+
+	if (!spec->cs42l42_volume_init) {
+		snd_hda_power_up(codec);
+		cs8409_cs42l42_update_volume(codec);
+		snd_hda_power_down(codec);
+	}
+	switch (nid) {
+	case CS8409_CS42L42_HP_PIN_NID:
+		if (chs & 1)
+			*valp++ = spec->cs42l42_hp_volume[0];
+		if (chs & 2)
+			*valp++ = spec->cs42l42_hp_volume[1];
+		break;
+	case CS8409_CS42L42_AMIC_PIN_NID:
+		if (chs & 1)
+			*valp++ = spec->cs42l42_hs_mic_volume[0];
+		break;
+	default:
+		break;
+	}
+	return 0;
+}
+
+static int cs8409_cs42l42_volume_put(struct snd_kcontrol *kcontrol,
+				     struct
Re: [PATCH v2 4/5] mtd: spi-nor: Move Software Write Protection logic out of the core
Am 2021-03-06 10:50, schrieb Tudor Ambarus: It makes the core file a bit smaller and provides better separation between the Software Write Protection features and the core logic. All the next generic software write protection features (e.g. Individual Block Protection) will reside in swp.c. Signed-off-by: Tudor Ambarus --- [..] @@ -3554,6 +3152,9 @@ int spi_nor_scan(struct spi_nor *nor, const char *name, if (ret) return ret; + if (nor->params->locking_ops) Should this be in spi_nor_register_locking_ops(), too? I.e. void spi_nor_register_locking_ops() { if (!nor->params->locking_ops) return; .. } I don't have a strong opinion on that so far. I just noticed because I put the check into spi_nor_otp_init() for my OTP series. They should be the same though. + spi_nor_register_locking_ops(nor); -michael
Re: [PATCH] x86/smpboot: remove duplicate include in smpboot.c
On Fri, Mar 05, 2021 at 10:56:10PM -0800, menglong8.d...@gmail.com wrote: > From: Zhang Yunkai > > 'cpu_device_id.h' and 'intel_family.h' included in 'smpboot.c' > is duplicated. It is also included in the 80th line. > > Signed-off-by: Zhang Yunkai If you send another person's patch, then your SOB needs to follow his/hers: https://www.kernel.org/doc/html/latest/process/submitting-patches.html#sign-your-work-the-developer-s-certificate-of-origin Also, merge those two x86 patches removing includes into one please. Thx. -- Regards/Gruss, Boris. https://people.kernel.org/tglx/notes-about-netiquette
[PATCH] drm/nouveau: remove duplicate include in nouveau_dmem and base
From: Zhang Yunkai 'if000c.h' included in 'nouveau_dmem.c' is duplicated. 'priv.h' included in 'base.c' is duplicated. Signed-off-by: Zhang Yunkai --- drivers/gpu/drm/nouveau/nouveau_dmem.c | 1 - drivers/gpu/drm/nouveau/nvkm/engine/nvenc/base.c | 2 -- 2 files changed, 3 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c b/drivers/gpu/drm/nouveau/nouveau_dmem.c index 92987daa5e17..f5cc057b123b 100644 --- a/drivers/gpu/drm/nouveau/nouveau_dmem.c +++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c @@ -33,7 +33,6 @@ #include #include #include -#include #include diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/nvenc/base.c b/drivers/gpu/drm/nouveau/nvkm/engine/nvenc/base.c index c39e797dc7c9..09524168431c 100644 --- a/drivers/gpu/drm/nouveau/nvkm/engine/nvenc/base.c +++ b/drivers/gpu/drm/nouveau/nvkm/engine/nvenc/base.c @@ -20,8 +20,6 @@ * OTHER DEALINGS IN THE SOFTWARE. */ #include "priv.h" - -#include "priv.h" #include static void * -- 2.25.1
Re: [PATCH v6 1/2] dt-bindings: hwlock: add sun6i_hwspinlock
On Tue, 2 Mar 2021 18:22:33 +0100 Maxime Ripard wrote: > On Mon, Mar 01, 2021 at 03:06:35PM +0100, Wilken Gottwalt wrote: > > On Mon, 1 Mar 2021 14:12:44 +0100 > > Maxime Ripard wrote: > > > > > On Sat, Feb 27, 2021 at 02:03:28PM +0100, Wilken Gottwalt wrote: > > > > Adds documentation on how to use the sun6i_hwspinlock driver for sun6i > > > > compatible series SoCs. > > > > > > > > Signed-off-by: Wilken Gottwalt > > > > --- > > > > Changes in v6: > > > > - fixed formating and name issues in dt documentation > > > > > > > > Changes in v5: > > > > - changed binding to earliest known supported SoC sun6i-a31 > > > > - dropped unnecessary entries > > > > > > > > Changes in v4: > > > > - changed binding to sun8i-a33-hwpinlock > > > > - added changes suggested by Maxime Ripard > > > > > > > > Changes in v3: > > > > - changed symbols from sunxi to sun8i > > > > > > > > Changes in v2: > > > > - fixed memory ranges > > > > --- > > > > .../hwlock/allwinner,sun6i-hwspinlock.yaml| 45 +++ > > > > > > The name of the file doesn't match the compatible, once fixed: > > > Acked-by: Maxime Ripard > > > > This is something that still confuses me. What if you have more than one > > compatible string? > > In this case, it's fairly easy there's only one :) > > But we're following the same rule than the compatible: the first SoC > that got the compatible wins > > > This won't be solvable. See the qcom binding for example, > > there are two strings and none matches. In the omap bindings are also two > > strings and only one matches. In all cases, including mine, the bindings > > check script is fine with that. > > If other platforms want to follow other rules, good for them :) > > > So, you basically want it to be called > > "allwinner,sun6i-a31-hwspinlock.yaml"? > > Yes Is it okay if I provide only the fixed bindings? I assume the v6 driver is fine now. greetings, Will
Re: [PATCH v6 2/2] hwspinlock: add sun6i hardware spinlock support
On Tue, 2 Mar 2021 18:20:02 +0100 Maxime Ripard wrote: > Hi, > > On Mon, Mar 01, 2021 at 03:06:08PM +0100, Wilken Gottwalt wrote: > > On Mon, 1 Mar 2021 14:13:05 +0100 > > Maxime Ripard wrote: > > > > > On Sat, Feb 27, 2021 at 02:03:54PM +0100, Wilken Gottwalt wrote: > > > > Adds the sun6i_hwspinlock driver for the hardware spinlock unit found in > > > > most of the sun6i compatible SoCs. > > > > > > > > This unit provides at least 32 spinlocks in hardware. The implementation > > > > supports 32, 64, 128 or 256 32bit registers. A lock can be taken by > > > > reading a register and released by writing a 0 to it. This driver > > > > supports all 4 spinlock setups, but for now only the first setup (32 > > > > locks) seem to exist in available devices. This spinlock unit is shared > > > > between all ARM cores and the embedded companion core. All of them can > > > > take/release a lock with a single cycle operation. It can be used to > > > > sync access to devices shared by the ARM cores and the companion core. > > > > > > > > There are two ways to check if a lock is taken. The first way is to read > > > > a lock. If a 0 is returned, the lock was free and is taken now. If an 1 > > > > is returned, the caller has to try again. Which means the lock is taken. > > > > The second way is to read a 32bit wide status register where every bit > > > > represents one of the 32 first locks. According to the datasheets this > > > > status register supports only the 32 first locks. This is the reason the > > > > first way (lock read/write) approach is used to be able to cover all 256 > > > > locks in future devices. The driver also reports the amount of supported > > > > locks via debugfs. > > > > > > > > Signed-off-by: Wilken Gottwalt > > > > Nope, I had to replace the devm_hwspin_lock_register function by the > > hwspin_lock_register function because like Bjorn pointed out that it can > > fail and needs to handled correctly. 
And having a devm_* function does not > > play well with the non-devm clock/reset functions and winding back if an > > error occurs. It also messes with the call order in the remove function. So > > I went back to the classic way where I have full control over the call > > order. > > If you're talking about the clock and reset line reassertion, I don't > really see what the trouble is. Sure, it's not going to be in the exact > same order in remove, but it's still going to execute in the proper > order (ie, clock disable, then reset disable, then clock put and reset > put). And you can use devm_add_action if you want to handle things > automatically. See, in v5 the result of devm_hwspin_lock_register was returned directly. The remove callback or the bank_fail/clk_fail labels would not run, if the registering fails. In v6 it is fixed. + platform_set_drvdata(pdev, priv); + + return devm_hwspin_lock_register(&pdev->dev, priv->bank, &sun6i_hwspinlock_ops, +SPINLOCK_BASE_ID, priv->nlocks); +bank_fail: + clk_disable_unprepare(priv->ahb_clk); +clk_fail: + reset_control_assert(priv->reset); + + return err; +} So, is v6 fine for you even if it uses a more classic approach? greetings, Will
[PATCH] drm/amd/display: remove duplicate include in dcn21 and gpio
From: Zhang Yunkai 'dce110_resource.h' included in 'dcn21_resource.c' is duplicated. 'hw_gpio.h' included in 'hw_factory_dce110.c' is duplicated. Signed-off-by: Zhang Yunkai --- drivers/gpu/drm/amd/display/dc/dcn21/dcn21_resource.c | 1 - .../gpu/drm/amd/display/dc/gpio/dce110/hw_factory_dce110.c| 4 2 files changed, 5 deletions(-) diff --git a/drivers/gpu/drm/amd/display/dc/dcn21/dcn21_resource.c b/drivers/gpu/drm/amd/display/dc/dcn21/dcn21_resource.c index 072f8c880924..8a6a965751e8 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn21/dcn21_resource.c +++ b/drivers/gpu/drm/amd/display/dc/dcn21/dcn21_resource.c @@ -61,7 +61,6 @@ #include "dcn21/dcn21_dccg.h" #include "dcn21_hubbub.h" #include "dcn10/dcn10_resource.h" -#include "dce110/dce110_resource.h" #include "dce/dce_panel_cntl.h" #include "dcn20/dcn20_dwb.h" diff --git a/drivers/gpu/drm/amd/display/dc/gpio/dce110/hw_factory_dce110.c b/drivers/gpu/drm/amd/display/dc/gpio/dce110/hw_factory_dce110.c index 66e4841f41e4..ca335ea60412 100644 --- a/drivers/gpu/drm/amd/display/dc/gpio/dce110/hw_factory_dce110.c +++ b/drivers/gpu/drm/amd/display/dc/gpio/dce110/hw_factory_dce110.c @@ -48,10 +48,6 @@ #define REGI(reg_name, block, id)\ mm ## block ## id ## _ ## reg_name -#include "../hw_gpio.h" -#include "../hw_ddc.h" -#include "../hw_hpd.h" - #include "reg_helper.h" #include "../hpd_regs.h" -- 2.25.1
Re: [PATCH 2/2] MIPS: Loongson64: Move loongson_system_configuration to loongson.h
On Sat, Mar 6, 2021, at 5:53 PM, Thomas Bogendoerfer wrote: > On Sat, Mar 06, 2021 at 05:00:15PM +0800, Jiaxun Yang wrote: > > > > > > On Sat, Mar 6, 2021, at 4:03 PM, Thomas Bogendoerfer wrote: > > > On Thu, Mar 04, 2021 at 07:00:57PM +0800, Qing Zhang wrote: > > > > The purpose of separating loongson_system_configuration from > > > > boot_param.h > > > > is to keep the other structure consistent with the firmware. > > > > > > > > Signed-off-by: Jiaxun Yang > > > > Signed-off-by: Qing Zhang > > > > --- > > > > .../include/asm/mach-loongson64/boot_param.h | 18 -- > > > > .../include/asm/mach-loongson64/loongson.h | 18 ++ > > > > > > as you are already touching mach-loongson64 files... > > > > > > Is there a chance you clean that up even further ? My goal is to > > > have only files in mach- files, which have an mach-generic > > > counterpart. Everything else should go to its own directory. So in > > > case of loongson something > > > > > > like > > > > > > arch/mips/include/asm/loongson for common stuff > > > arch/mips/include/asm/loongson/32 > > > arch/mips/include/asm/loongson/64 > > > > Hi Thomas > > > > I object to this idea as loongson32/2ef/64 have nothing in common. > > at least they share the name loongson, so having > > arch/mips/include/asm/loongson > > sounds like a good move. > > And seeing > > diff -u mach-loongson2ef/ mach-loongson64/loongson.h | diffstat > loongson.h | 137 > + > 1 file changed, 30 insertions(+), 107 deletions(-) > > wc mach-loongson2ef/loongson.h > 318 963 11278 mach-loongson2ef/loongson.h > > so there is something to be shared. To me it looks like 2ef could be merged > into 64, but that's nothing I'm wanting. Hmm there are duplications in loongson.h just because we didn't clean them up when splitting loongson2ef out of loongson64. 
> > Just to understand you, you want > > arch/mips/include/asm/loongson/2ef > arch/mips/include/asm/loongson/32 > arch/mips/include/asm/loongson/64 Yeah it looks reasonable but from my point of view doing these movement brings no actual benefit :-( Thanks. - Jiaxun > > ? > > Thomas. > > -- > Crap can work. Given enough thrust pigs will fly, but it's not necessarily a > good idea.[ RFC1925, 2.3 ] > -- - Jiaxun
Re: [PATCH] arm64: dts: add support for the Pixel 2 XL
On 05.03.2021 22:35, Caleb Connolly wrote: > Add a minimal devicetree capable of booting on the Pixel 2 XL MSM8998 > device. > > It's currently possible to boot the device into postmarketOS with USB > networking, however the display panel depends on Display Stream > Compression which is not yet supported in the kernel. > > The bootloader also requires that the dtbo partition contains a device > tree overlay with a particular id which has to be overlayed onto the > existing dtb. It's possible to use a specially crafted dtbo partition to > workaround this, more information is available here: > > https://gitlab.com/calebccff/dtbo-google-wahoo-mainline > > Signed-off-by: Caleb Connolly > --- > It's possible to get wifi working by running Bjorns diag-router in the > background, without this the wifi firmware crashes every 10 seconds or > so. This is the same issue encountered on the OnePlus 5. > > arch/arm64/boot/dts/qcom/Makefile | 1 + > .../boot/dts/qcom/msm8998-google-taimen.dts | 14 + > .../boot/dts/qcom/msm8998-google-wahoo.dtsi | 391 ++ > 3 files changed, 406 insertions(+) > create mode 100644 arch/arm64/boot/dts/qcom/msm8998-google-taimen.dts > create mode 100644 arch/arm64/boot/dts/qcom/msm8998-google-wahoo.dtsi > > diff --git a/arch/arm64/boot/dts/qcom/Makefile > b/arch/arm64/boot/dts/qcom/Makefile > index 5113fac80b7a..d942d3ec3928 100644 > --- a/arch/arm64/boot/dts/qcom/Makefile > +++ b/arch/arm64/boot/dts/qcom/Makefile > @@ -16,6 +16,7 @@ dtb-$(CONFIG_ARCH_QCOM) += > msm8994-msft-lumia-cityman.dtb > dtb-$(CONFIG_ARCH_QCOM) += msm8994-sony-xperia-kitakami-sumire.dtb > dtb-$(CONFIG_ARCH_QCOM) += msm8996-mtp.dtb > dtb-$(CONFIG_ARCH_QCOM) += msm8998-asus-novago-tp370ql.dtb > +dtb-$(CONFIG_ARCH_QCOM) += msm8998-google-taimen.dtb > dtb-$(CONFIG_ARCH_QCOM) += msm8998-hp-envy-x2.dtb > dtb-$(CONFIG_ARCH_QCOM) += msm8998-lenovo-miix-630.dtb > dtb-$(CONFIG_ARCH_QCOM) += msm8998-mtp.dtb > diff --git a/arch/arm64/boot/dts/qcom/msm8998-google-taimen.dts > 
b/arch/arm64/boot/dts/qcom/msm8998-google-taimen.dts > new file mode 100644 > index ..ffaaafe14037 > --- /dev/null > +++ b/arch/arm64/boot/dts/qcom/msm8998-google-taimen.dts > @@ -0,0 +1,14 @@ > +// SPDX-License-Identifier: GPL-2.0-only > +/* > + * Copyright (c) 2020, Caleb Connolly > + */ > + > +/dts-v1/; > + > +#include "msm8998-google-wahoo.dtsi" > + > +/ { > + model = "Google Pixel 2 XL"; > + compatible = "google,taimen", "google,wahoo", "qcom,msm8998", > "qcom,msm8998-mtp"; Drop the mtp compatible. Also, afaict wahoo is a shared platform name for P2/2XL, so perhaps using the same naming scheme we used for Xperias/Lumias (soc-vendor-platform-board) would clear up some confusion. In this case, I'm not sure about the wahoo compatible, but I reckon it's fine for it to stay so that it's easier to introduce potential quirks that concern both devices. > + qcom,msm-id = <0x124 0x20001>; Move it to the common dtsi, unless the other Pixel ships with a different SoC revision. > +}; > diff --git a/arch/arm64/boot/dts/qcom/msm8998-google-wahoo.dtsi > b/arch/arm64/boot/dts/qcom/msm8998-google-wahoo.dtsi > new file mode 100644 > index ..0c221ead2df7 > --- /dev/null > +++ b/arch/arm64/boot/dts/qcom/msm8998-google-wahoo.dtsi > @@ -0,0 +1,391 @@ > +// SPDX-License-Identifier: GPL-2.0 > +/* Copyright (c) 2020 Caleb Connolly */ > + > +#include "msm8998.dtsi" > +#include "pm8998.dtsi" > +#include "pmi8998.dtsi" > +#include "pm8005.dtsi" > + > +/delete-node/ _mem; > +/delete-node/ _mem; > +/delete-node/ _mem; > +/delete-node/ _mem; > + > +/ { > + aliases { > + }; > + > + chosen { > + #address-cells = <2>; > + #size-cells = <2>; > + ranges; > + > + /* Add "earlycon" intended to be used in combination with UART > serial console */ > + bootargs = "clk_ignore_unused earlycon > console=ttyGS0,115200";// loglevel=10 drm.debug=15 debug"; clk_ignore_unused is a BIG hack! You should trace which clocks are important for it to stay alive and fix it on the driver side. 
What breaks if it's not there? Does it still happen with Angelo's clk patches that got in for the 5.12 window? Aside from that, //loglevel... should also go. > + > + vph_pwr: vph-pwr-regulator { > + compatible = "regulator-fixed"; > + regulator-name = "vph_pwr"; > + regulator-always-on; > + regulator-boot-on; > + }; > +}; Don't you need to specify voltage here? > + > +_uart3 { > + status = "disabled"; > + > + bluetooth { > + compatible = "qcom,wcn3990-bt"; > + > + vddio-supply = <_s4a_1p8>; > + vddxo-supply = <_l7a_1p8>; > + vddrf-supply = <_l17a_1p3>; > + vddch0-supply = <_l25a_3p3>; > + max-speed = <320>; > + }; > +}; Either enable the UART or rid the
[PATCH] drm/amd/display: remove duplicate include in amdgpu_dm.c
From: Zhang Yunkai 'drm/drm_hdcp.h' included in 'amdgpu_dm.c' is duplicated. It is also included in the 79th line. Signed-off-by: Zhang Yunkai --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 1 - 1 file changed, 1 deletion(-) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c index 3e1fd1e7d09f..fee46fbcb0b7 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -44,7 +44,6 @@ #include "amdgpu_dm.h" #ifdef CONFIG_DRM_AMD_DC_HDCP #include "amdgpu_dm_hdcp.h" -#include <drm/drm_hdcp.h> #endif #include "amdgpu_pm.h" -- 2.25.1
[tip: x86/urgent] x86/unwind/orc: Disable KASAN checking in the ORC unwinder, part 2
The following commit has been merged into the x86/urgent branch of tip: Commit-ID: 8bd7b3980ca62904814d536b3a2453001992a0c3 Gitweb: https://git.kernel.org/tip/8bd7b3980ca62904814d536b3a2453001992a0c3 Author:Josh Poimboeuf AuthorDate:Fri, 05 Feb 2021 08:24:02 -06:00 Committer: Borislav Petkov CommitterDate: Sat, 06 Mar 2021 11:37:00 +01:00 x86/unwind/orc: Disable KASAN checking in the ORC unwinder, part 2 KASAN reserves "redzone" areas between stack frames in order to detect stack overruns. A read or write to such an area triggers a KASAN "stack-out-of-bounds" BUG. Normally, the ORC unwinder stays in-bounds and doesn't access the redzone. But sometimes it can't find ORC metadata for a given instruction. This can happen for code which is missing ORC metadata, or for generated code. In such cases, the unwinder attempts to fall back to frame pointers, as a best-effort type thing. This fallback often works, but when it doesn't, the unwinder can get confused and go off into the weeds into the KASAN redzone, triggering the aforementioned KASAN BUG. But in this case, the unwinder's confusion is actually harmless and working as designed. It already has checks in place to prevent off-stack accesses, but those checks get short-circuited by the KASAN BUG. And a BUG is a lot more disruptive than a harmless unwinder warning. Disable the KASAN checks by using READ_ONCE_NOCHECK() for all stack accesses. This finishes the job started by commit 881125bfe65b ("x86/unwind: Disable KASAN checking in the ORC unwinder"), which only partially fixed the issue. 
Fixes: ee9f8fce9964 ("x86/unwind: Add the ORC unwinder") Reported-by: Ivan Babrou Signed-off-by: Josh Poimboeuf Signed-off-by: Peter Zijlstra (Intel) Reviewed-by: Steven Rostedt (VMware) Tested-by: Ivan Babrou Cc: sta...@kernel.org Link: https://lkml.kernel.org/r/9583327904ebbbeda399eca9c56d6c7085ac20fe.1612534649.git.jpoim...@redhat.com --- arch/x86/kernel/unwind_orc.c | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c index 2a1d47f..1bcc14c 100644 --- a/arch/x86/kernel/unwind_orc.c +++ b/arch/x86/kernel/unwind_orc.c @@ -367,8 +367,8 @@ static bool deref_stack_regs(struct unwind_state *state, unsigned long addr, if (!stack_access_ok(state, addr, sizeof(struct pt_regs))) return false; - *ip = regs->ip; - *sp = regs->sp; + *ip = READ_ONCE_NOCHECK(regs->ip); + *sp = READ_ONCE_NOCHECK(regs->sp); return true; } @@ -380,8 +380,8 @@ static bool deref_stack_iret_regs(struct unwind_state *state, unsigned long addr if (!stack_access_ok(state, addr, IRET_FRAME_SIZE)) return false; - *ip = regs->ip; - *sp = regs->sp; + *ip = READ_ONCE_NOCHECK(regs->ip); + *sp = READ_ONCE_NOCHECK(regs->sp); return true; } @@ -402,12 +402,12 @@ static bool get_reg(struct unwind_state *state, unsigned int reg_off, return false; if (state->full_regs) { - *val = ((unsigned long *)state->regs)[reg]; + *val = READ_ONCE_NOCHECK(((unsigned long *)state->regs)[reg]); return true; } if (state->prev_regs) { - *val = ((unsigned long *)state->prev_regs)[reg]; + *val = READ_ONCE_NOCHECK(((unsigned long *)state->prev_regs)[reg]); return true; }
[tip: x86/urgent] x86/entry: Fix entry/exit mismatch on failed fast 32-bit syscalls
The following commit has been merged into the x86/urgent branch of tip: Commit-ID: e59ba7bf71a09e474198741563e0e587ae43d1c7 Gitweb: https://git.kernel.org/tip/e59ba7bf71a09e474198741563e0e587ae43d1c7 Author:Andy Lutomirski AuthorDate:Thu, 04 Mar 2021 11:05:54 -08:00 Committer: Borislav Petkov CommitterDate: Sat, 06 Mar 2021 11:37:00 +01:00 x86/entry: Fix entry/exit mismatch on failed fast 32-bit syscalls On a 32-bit fast syscall that fails to read its arguments from user memory, the kernel currently does syscall exit work but not syscall entry work. This confuses audit and ptrace. For example: $ ./tools/testing/selftests/x86/syscall_arg_fault_32 ... strace: pid 264258: entering, ptrace_syscall_info.op == 2 ... This is a minimal fix intended for ease of backporting. A more complete cleanup is coming. Fixes: 0b085e68f407 ("x86/entry: Consolidate 32/64 bit syscall entry") Signed-off-by: Andy Lutomirski Signed-off-by: Thomas Gleixner Cc: sta...@vger.kernel.org Link: https://lore.kernel.org/r/8c82296ddf803b91f8d1e5eac89e5803ba54ab0e.1614884673.git.l...@kernel.org --- arch/x86/entry/common.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c index a2433ae..4efd39a 100644 --- a/arch/x86/entry/common.c +++ b/arch/x86/entry/common.c @@ -128,7 +128,8 @@ static noinstr bool __do_fast_syscall_32(struct pt_regs *regs) regs->ax = -EFAULT; instrumentation_end(); - syscall_exit_to_user_mode(regs); + local_irq_disable(); + irqentry_exit_to_user_mode(regs); return false; }
[tip: x86/urgent] x86/unwind/orc: Silence warnings caused by missing ORC data
The following commit has been merged into the x86/urgent branch of tip: Commit-ID: d072f941c1e234f8495cc4828370b180318bf49b Gitweb: https://git.kernel.org/tip/d072f941c1e234f8495cc4828370b180318bf49b Author:Josh Poimboeuf AuthorDate:Fri, 05 Feb 2021 08:24:03 -06:00 Committer: Borislav Petkov CommitterDate: Sat, 06 Mar 2021 11:37:00 +01:00 x86/unwind/orc: Silence warnings caused by missing ORC data The ORC unwinder attempts to fall back to frame pointers when ORC data is missing for a given instruction. It sets state->error, but then tries to keep going as a best-effort type of thing. That may result in further warnings if the unwinder gets lost. Until we have some way to register generated code with the unwinder, missing ORC will be expected, and occasionally going off the rails will also be expected. So don't warn about it. Signed-off-by: Josh Poimboeuf Signed-off-by: Peter Zijlstra (Intel) Tested-by: Ivan Babrou Link: https://lkml.kernel.org/r/06d02c4bbb220bd31668db579278b0352538efbb.1612534649.git.jpoim...@redhat.com --- arch/x86/kernel/unwind_orc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c index 1bcc14c..a120253 100644 --- a/arch/x86/kernel/unwind_orc.c +++ b/arch/x86/kernel/unwind_orc.c @@ -13,7 +13,7 @@ #define orc_warn_current(args...) \ ({ \ - if (state->task == current) \ + if (state->task == current && !state->error)\ orc_warn(args); \ })
Re: [PATCH v2] input: s6sy761: fix coordinate read bit shift
Hi Caleb, On Fri, Mar 05, 2021 at 06:58:10PM +, Caleb Connolly wrote:
> The touch coordinate register contains the following:
>
>        byte 3               byte 2            byte 1
> +--------+--------+    +-------------+    +-------------+
> |        |        |    |             |    |             |
> | X[3:0] | Y[3:0] |    |   Y[11:4]   |    |   X[11:4]   |
> |        |        |    |             |    |             |
> +--------+--------+    +-------------+    +-------------+
>
> Bytes 2 and 1 need to be shifted left by 4 bits, the least significant
> nibble of each is stored in byte 3. Currently they are only
> being shifted by 3 causing the reported coordinates to be incorrect.
>
> This matches downstream examples, and has been confirmed on my
> device (OnePlus 7 Pro).
>
> Fixes: 0145a7141e59 ("Input: add support for the Samsung S6SY761
> touchscreen")
> Signed-off-by: Caleb Connolly
Reviewed-by: Andi Shyti
Thanks, Andi
Re: [PATCH] kbuild: dummy-tools, fix inverted tests for gcc
On Wed, Mar 3, 2021 at 7:43 PM Jiri Slaby wrote: > > There is a test in Kconfig which takes inverted value of a compiler > check: > * config CC_HAS_INT128 > def_bool !$(cc-option,$(m64-flag) -D__SIZEOF_INT128__=0) > > This results in CC_HAS_INT128 not being in super-config generated by > dummy-tools. So take this into account in the gcc script. > > Signed-off-by: Jiri Slaby > Cc: Masahiro Yamada > --- Applied to linux-kbuild/fixes. Thanks. We could fix init/Kconfig to use the positive logic as follows, but I guess (hope) this conditional will go away when we raise the GCC min version next time. So, I am fine with fixing this in dummy-tools. diff --git a/init/Kconfig b/init/Kconfig index 22946fe5ded9..502594a78282 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -849,7 +849,7 @@ config ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH bool config CC_HAS_INT128 - def_bool !$(cc-option,$(m64-flag) -D__SIZEOF_INT128__=0) && 64BIT + def_bool $(success,echo '__int128 x;' | $(CC) $(CLANG_FLAGS) -x c - -c -o /dev/null) && 64BIT # # For architectures that know their GCC __int128 support is sound -- Best Regards Masahiro Yamada