[tip: x86/mm] smp: Micro-optimize smp_call_function_many_cond()

2021-03-06 Thread tip-bot2 for Peter Zijlstra
The following commit has been merged into the x86/mm branch of tip:

Commit-ID:     d43f17a1da25373580ebb466de7d0641acbf6fd6
Gitweb:        https://git.kernel.org/tip/d43f17a1da25373580ebb466de7d0641acbf6fd6
Author:        Peter Zijlstra
AuthorDate:    Tue, 02 Mar 2021 08:02:43 +01:00
Committer:     Ingo Molnar
CommitterDate: Sat, 06 Mar 2021 13:00:22 +01:00

smp: Micro-optimize smp_call_function_many_cond()

Call the generic send_call_function_single_ipi() function, which
will avoid the IPI when @last_cpu is idle.
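
For reference, the generic helper being called here looks roughly like
this (a sketch of kernel/smp.c of that era, not part of this diff): if
the target CPU is polling in its idle loop, flagging the idle task is
enough to get the queued work noticed, so the hardware IPI can be
skipped entirely.

  /* Sketch: skip the hardware IPI when the remote CPU is polling in
   * its idle loop and will notice the flag by itself. */
  void send_call_function_single_ipi(int cpu)
  {
          struct rq *rq = cpu_rq(cpu);

          if (!set_nr_if_polling(rq->idle))
                  arch_send_call_function_single_ipi(cpu);
          else
                  trace_sched_wake_idle_without_ipi(cpu);
  }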

Signed-off-by: Peter Zijlstra 
Signed-off-by: Ingo Molnar 
Cc: linux-kernel@vger.kernel.org
---
 kernel/smp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/smp.c b/kernel/smp.c
index b6375d7..af0d51d 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -694,7 +694,7 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
 * provided mask.
 */
if (nr_cpus == 1)
-   arch_send_call_function_single_ipi(last_cpu);
+   send_call_function_single_ipi(last_cpu);
else if (likely(nr_cpus > 1))
arch_send_call_function_ipi_mask(cfd->cpumask_ipi);
}


[tip: x86/mm] cpumask: Mark functions as pure

2021-03-06 Thread tip-bot2 for Nadav Amit
The following commit has been merged into the x86/mm branch of tip:

Commit-ID:     291c4011dd7ac0cd0cebb727a75ee5a50d16dcf7
Gitweb:        https://git.kernel.org/tip/291c4011dd7ac0cd0cebb727a75ee5a50d16dcf7
Author:        Nadav Amit
AuthorDate:    Sat, 20 Feb 2021 15:17:10 -08:00
Committer:     Ingo Molnar
CommitterDate: Sat, 06 Mar 2021 12:59:10 +01:00

cpumask: Mark functions as pure

cpumask_next_and() and cpumask_any_but() are pure, and marking them as
such seems to generate different and presumably better code for
native_flush_tlb_multi().
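
As a rough illustration of what __pure buys (a standalone userspace
sketch; next_set_bit() is a made-up stand-in for the cpumask helpers,
not kernel code):

  #include <stdio.h>

  /* A pure function reads but never writes global state, so the
   * compiler may fold repeated calls with identical arguments into a
   * single call. */
  __attribute__((pure))
  static unsigned int next_set_bit(unsigned long mask, unsigned int n)
  {
          while (n < 64 && !(mask & (1UL << n)))
                  n++;
          return n;
  }

  int main(void)
  {
          unsigned long mask = 0xf0;

          /* Identical arguments: with the pure attribute the compiler
           * is allowed to emit one call and reuse the result. */
          unsigned int a = next_set_bit(mask, 0);
          unsigned int b = next_set_bit(mask, 0);

          printf("%u %u\n", a, b);
          return 0;
  }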

Signed-off-by: Nadav Amit 
Signed-off-by: Ingo Molnar 
Reviewed-by: Dave Hansen 
Link: https://lore.kernel.org/r/20210220231712.2475218-8-na...@vmware.com
---
 include/linux/cpumask.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index 383684e..c53364c 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -235,7 +235,7 @@ static inline unsigned int cpumask_last(const struct cpumask *srcp)
return find_last_bit(cpumask_bits(srcp), nr_cpumask_bits);
 }
 
-unsigned int cpumask_next(int n, const struct cpumask *srcp);
+unsigned int __pure cpumask_next(int n, const struct cpumask *srcp);
 
 /**
  * cpumask_next_zero - get the next unset cpu in a cpumask
@@ -252,8 +252,8 @@ static inline unsigned int cpumask_next_zero(int n, const struct cpumask *srcp)
return find_next_zero_bit(cpumask_bits(srcp), nr_cpumask_bits, n+1);
 }
 
-int cpumask_next_and(int n, const struct cpumask *, const struct cpumask *);
-int cpumask_any_but(const struct cpumask *mask, unsigned int cpu);
+int __pure cpumask_next_and(int n, const struct cpumask *, const struct cpumask *);
+int __pure cpumask_any_but(const struct cpumask *mask, unsigned int cpu);
 unsigned int cpumask_local_spread(unsigned int i, int node);
 int cpumask_any_and_distribute(const struct cpumask *src1p,
   const struct cpumask *src2p);


[tip: x86/mm] smp: Inline on_each_cpu_cond() and on_each_cpu()

2021-03-06 Thread tip-bot2 for Nadav Amit
The following commit has been merged into the x86/mm branch of tip:

Commit-ID:     a5aa5ce300597224ec76dacc8e63ba3ad7a18bbd
Gitweb:        https://git.kernel.org/tip/a5aa5ce300597224ec76dacc8e63ba3ad7a18bbd
Author:        Nadav Amit
AuthorDate:    Sat, 20 Feb 2021 15:17:12 -08:00
Committer:     Ingo Molnar
CommitterDate: Sat, 06 Mar 2021 12:59:10 +01:00

smp: Inline on_each_cpu_cond() and on_each_cpu()

Simplify the code and avoid having an additional function on the stack
by inlining on_each_cpu_cond() and on_each_cpu().

Suggested-by: Peter Zijlstra 
Signed-off-by: Nadav Amit 
[ Minor edits. ]
Signed-off-by: Ingo Molnar 
Link: https://lore.kernel.org/r/20210220231712.2475218-10-na...@vmware.com
---
 include/linux/smp.h | 50 ---
 kernel/smp.c        | 56 +
 kernel/up.c         | 38 +--
 3 files changed, 37 insertions(+), 107 deletions(-)

diff --git a/include/linux/smp.h b/include/linux/smp.h
index 70c6f62..84a0b48 100644
--- a/include/linux/smp.h
+++ b/include/linux/smp.h
@@ -50,30 +50,52 @@ extern unsigned int total_cpus;
 int smp_call_function_single(int cpuid, smp_call_func_t func, void *info,
 int wait);
 
+void on_each_cpu_cond_mask(smp_cond_func_t cond_func, smp_call_func_t func,
+  void *info, bool wait, const struct cpumask *mask);
+
+int smp_call_function_single_async(int cpu, call_single_data_t *csd);
+
 /*
  * Call a function on all processors
  */
-void on_each_cpu(smp_call_func_t func, void *info, int wait);
+static inline void on_each_cpu(smp_call_func_t func, void *info, int wait)
+{
+   on_each_cpu_cond_mask(NULL, func, info, wait, cpu_online_mask);
+}
 
-/*
- * Call a function on processors specified by mask, which might include
- * the local one.
+/**
+ * on_each_cpu_mask(): Run a function on processors specified by
+ * cpumask, which may include the local processor.
+ * @mask: The set of cpus to run on (only runs on online subset).
+ * @func: The function to run. This must be fast and non-blocking.
+ * @info: An arbitrary pointer to pass to the function.
+ * @wait: If true, wait (atomically) until function has completed
+ *on other CPUs.
+ *
+ * If @wait is true, then returns once @func has returned.
+ *
+ * You must not call this function with disabled interrupts or from a
+ * hardware interrupt handler or from a bottom half handler.  The
+ * exception is that it may be used during early boot while
+ * early_boot_irqs_disabled is set.
  */
-void on_each_cpu_mask(const struct cpumask *mask, smp_call_func_t func,
-   void *info, bool wait);
+static inline void on_each_cpu_mask(const struct cpumask *mask,
+   smp_call_func_t func, void *info, bool wait)
+{
+   on_each_cpu_cond_mask(NULL, func, info, wait, mask);
+}
 
 /*
  * Call a function on each processor for which the supplied function
  * cond_func returns a positive value. This may include the local
- * processor.
+ * processor.  May be used during early boot while early_boot_irqs_disabled is
+ * set. Use local_irq_save/restore() instead of local_irq_disable/enable().
  */
-void on_each_cpu_cond(smp_cond_func_t cond_func, smp_call_func_t func,
- void *info, bool wait);
-
-void on_each_cpu_cond_mask(smp_cond_func_t cond_func, smp_call_func_t func,
-  void *info, bool wait, const struct cpumask *mask);
-
-int smp_call_function_single_async(int cpu, call_single_data_t *csd);
+static inline void on_each_cpu_cond(smp_cond_func_t cond_func,
+   smp_call_func_t func, void *info, bool wait)
+{
+   on_each_cpu_cond_mask(cond_func, func, info, wait, cpu_online_mask);
+}
 
 #ifdef CONFIG_SMP
 
diff --git a/kernel/smp.c b/kernel/smp.c
index c8a5a1f..b6375d7 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -848,55 +848,6 @@ void __init smp_init(void)
 }
 
 /*
- * Call a function on all processors.  May be used during early boot while
- * early_boot_irqs_disabled is set.  Use local_irq_save/restore() instead
- * of local_irq_disable/enable().
- */
-void on_each_cpu(smp_call_func_t func, void *info, int wait)
-{
-   unsigned long flags;
-
-   preempt_disable();
-   smp_call_function(func, info, wait);
-   local_irq_save(flags);
-   func(info);
-   local_irq_restore(flags);
-   preempt_enable();
-}
-EXPORT_SYMBOL(on_each_cpu);
-
-/**
- * on_each_cpu_mask(): Run a function on processors specified by
- * cpumask, which may include the local processor.
- * @mask: The set of cpus to run on (only runs on online subset).
- * @func: The function to run. This must be fast and non-blocking.
- * @info: An arbitrary pointer to pass to the function.
- * @wait: If true, wait (atomically) until function has completed
- *on other CPUs.
- *
- * If @wait is true, then returns once @func has returned.
- *
- * You must not call this 

[tip: x86/mm] x86/mm/tlb: Do not make is_lazy dirty for no reason

2021-03-06 Thread tip-bot2 for Nadav Amit
The following commit has been merged into the x86/mm branch of tip:

Commit-ID:     09c5272e48614a30598e759c3c7bed126d22037d
Gitweb:        https://git.kernel.org/tip/09c5272e48614a30598e759c3c7bed126d22037d
Author:        Nadav Amit
AuthorDate:    Sat, 20 Feb 2021 15:17:09 -08:00
Committer:     Ingo Molnar
CommitterDate: Sat, 06 Mar 2021 12:59:10 +01:00

x86/mm/tlb: Do not make is_lazy dirty for no reason

Blindly writing to is_lazy, when the written value is identical to the
old value, makes the cacheline dirty for no reason. Avoid such writes
to prevent needless cache coherency traffic.
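
The pattern is plain read-before-write; a minimal standalone sketch
(types and names here are illustrative, not the kernel's):

  #include <stdbool.h>

  struct tlb_state_shared {
          bool is_lazy;   /* read by other CPUs deciding whether to IPI us */
  };

  static void leave_lazy_mode(struct tlb_state_shared *s, bool was_lazy)
  {
          /* Store only on a real transition: an unconditional write
           * would force the cacheline exclusive (RFO) on every context
           * switch, invalidating remote copies for nothing. */
          if (was_lazy)
                  s->is_lazy = false;
  }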

Suggested-by: Dave Hansen 
Signed-off-by: Nadav Amit 
Signed-off-by: Ingo Molnar 
Reviewed-by: Dave Hansen 
Link: https://lore.kernel.org/r/20210220231712.2475218-7-na...@vmware.com
---
 arch/x86/mm/tlb.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 345a0af..17ec4bf 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -469,7 +469,8 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
__flush_tlb_all();
}
 #endif
-   this_cpu_write(cpu_tlbstate_shared.is_lazy, false);
+   if (was_lazy)
+   this_cpu_write(cpu_tlbstate_shared.is_lazy, false);
 
/*
 * The membarrier system call requires a full memory barrier and


Re: [PATCH 1/1] RISC-V: correct enum sbi_ext_rfence_fid

2021-03-06 Thread Anup Patel
On Sat, Mar 6, 2021 at 11:19 AM Heinrich Schuchardt  wrote:
>
> The constants in enum sbi_ext_rfence_fid should match the SBI
> specification. See
> https://github.com/riscv/riscv-sbi-doc/blob/master/riscv-sbi.adoc#78-function-listing
>
> | Function Name               | FID | EID        |
> | sbi_remote_fence_i          |   0 | 0x52464E43 |
> | sbi_remote_sfence_vma       |   1 | 0x52464E43 |
> | sbi_remote_sfence_vma_asid  |   2 | 0x52464E43 |
> | sbi_remote_hfence_gvma_vmid |   3 | 0x52464E43 |
> | sbi_remote_hfence_gvma      |   4 | 0x52464E43 |
> | sbi_remote_hfence_vvma_asid |   5 | 0x52464E43 |
> | sbi_remote_hfence_vvma      |   6 | 0x52464E43 |
>
> Fixes: ecbacc2a3efd ("RISC-V: Add SBI v0.2 extension definitions")
> Reported-by: Sean Anderson 
> Signed-off-by: Heinrich Schuchardt 

Good catch.

I guess we never saw any issues because these calls are only used by
KVM RISC-V, which is not merged yet. Furthermore, FPGA, QEMU, and Spike
emulate the HFENCE instruction as a flush-everything operation, so we
did not notice any issue with KVM RISC-V either.

Looks good to me.

Reviewed-by: Anup Patel 

Regards,
Anup

> ---
>  arch/riscv/include/asm/sbi.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/riscv/include/asm/sbi.h b/arch/riscv/include/asm/sbi.h
> index 99895d9c3bdd..d7027411dde8 100644
> --- a/arch/riscv/include/asm/sbi.h
> +++ b/arch/riscv/include/asm/sbi.h
> @@ -51,10 +51,10 @@ enum sbi_ext_rfence_fid {
> SBI_EXT_RFENCE_REMOTE_FENCE_I = 0,
> SBI_EXT_RFENCE_REMOTE_SFENCE_VMA,
> SBI_EXT_RFENCE_REMOTE_SFENCE_VMA_ASID,
> -   SBI_EXT_RFENCE_REMOTE_HFENCE_GVMA,
> SBI_EXT_RFENCE_REMOTE_HFENCE_GVMA_VMID,
> -   SBI_EXT_RFENCE_REMOTE_HFENCE_VVMA,
> +   SBI_EXT_RFENCE_REMOTE_HFENCE_GVMA,
> SBI_EXT_RFENCE_REMOTE_HFENCE_VVMA_ASID,
> +   SBI_EXT_RFENCE_REMOTE_HFENCE_VVMA,
>  };
>
>  enum sbi_ext_hsm_fid {
> --
> 2.30.1
>
>


[PATCH] xhci: Remove unused value len from xhci_unmap_temp_buf

2021-03-06 Thread zhangkun4jr
From: Zhang Kun 

The value assigned to len by sg_pcopy_from_buffer() is never used,
so remove it.

Signed-off-by: Zhang Kun 
---
 drivers/usb/host/xhci.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index bd27bd670104..6ebda89d476c 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -1335,7 +1335,6 @@ static bool xhci_urb_temp_buffer_required(struct usb_hcd *hcd,
 
 static void xhci_unmap_temp_buf(struct usb_hcd *hcd, struct urb *urb)
 {
-   unsigned int len;
unsigned int buf_len;
enum dma_data_direction dir;
 
@@ -1351,7 +1350,7 @@ static void xhci_unmap_temp_buf(struct usb_hcd *hcd, struct urb *urb)
 dir);
 
if (usb_urb_dir_in(urb))
-   len = sg_pcopy_from_buffer(urb->sg, urb->num_sgs,
+   sg_pcopy_from_buffer(urb->sg, urb->num_sgs,
   urb->transfer_buffer,
   buf_len,
   0);
-- 
2.17.1




[PATCH] media:atomisp: remove duplicate include in sh_css

2021-03-06 Thread menglong8 . dong
From: Zhang Yunkai 

'ia_css_isys.h' is included twice in 'sh_css.c'; it is already
included at line 30.

Signed-off-by: Zhang Yunkai 
---
 drivers/staging/media/atomisp/pci/sh_css.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/staging/media/atomisp/pci/sh_css.c b/drivers/staging/media/atomisp/pci/sh_css.c
index ddee04c8248d..afddc54094e9 100644
--- a/drivers/staging/media/atomisp/pci/sh_css.c
+++ b/drivers/staging/media/atomisp/pci/sh_css.c
@@ -49,9 +49,6 @@
 #include "ia_css_pipe_util.h"
 #include "ia_css_pipe_binarydesc.h"
 #include "ia_css_pipe_stagedesc.h"
-#ifndef ISP2401
-#include "ia_css_isys.h"
-#endif
 
 #include "tag.h"
 #include "assert_support.h"
-- 
2.25.1



[tip: x86/cpu] x86/cpu/hygon: Set __max_die_per_package on Hygon

2021-03-06 Thread tip-bot2 for Pu Wen
The following commit has been merged into the x86/cpu branch of tip:

Commit-ID:     59eca2fa1934de42d8aa44d3bef655c92ea69703
Gitweb:        https://git.kernel.org/tip/59eca2fa1934de42d8aa44d3bef655c92ea69703
Author:        Pu Wen
AuthorDate:    Tue, 02 Mar 2021 10:02:17 +08:00
Committer:     Ingo Molnar
CommitterDate: Sat, 06 Mar 2021 12:54:59 +01:00

x86/cpu/hygon: Set __max_die_per_package on Hygon

Set the maximum DIE per package variable on Hygon using the
nodes_per_socket value in order to do per-DIE manipulations for drivers
such as powercap.

Signed-off-by: Pu Wen 
Signed-off-by: Borislav Petkov 
Signed-off-by: Ingo Molnar 
Link: https://lkml.kernel.org/r/20210302020217.1827-1-pu...@hygon.cn
---
 arch/x86/kernel/cpu/hygon.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/hygon.c b/arch/x86/kernel/cpu/hygon.c
index ae59115..0bd6c74 100644
--- a/arch/x86/kernel/cpu/hygon.c
+++ b/arch/x86/kernel/cpu/hygon.c
@@ -215,12 +215,12 @@ static void bsp_init_hygon(struct cpuinfo_x86 *c)
u32 ecx;
 
ecx = cpuid_ecx(0x8000001e);
-   nodes_per_socket = ((ecx >> 8) & 7) + 1;
+   __max_die_per_package = nodes_per_socket = ((ecx >> 8) & 7) + 1;
} else if (boot_cpu_has(X86_FEATURE_NODEID_MSR)) {
u64 value;
 
rdmsrl(MSR_FAM10H_NODE_ID, value);
-   nodes_per_socket = ((value >> 3) & 7) + 1;
+   __max_die_per_package = nodes_per_socket = ((value >> 3) & 7) + 1;
}
 
if (!boot_cpu_has(X86_FEATURE_AMD_SSBD) &&


[tip: x86/platform] x86/platform/uv: Fix indentation warning in Documentation/ABI/testing/sysfs-firmware-sgi_uv

2021-03-06 Thread tip-bot2 for Justin Ernst
The following commit has been merged into the x86/platform branch of tip:

Commit-ID:     e93d757c3f33c8a09f4aae579da4dc4500707471
Gitweb:        https://git.kernel.org/tip/e93d757c3f33c8a09f4aae579da4dc4500707471
Author:        Justin Ernst
AuthorDate:    Fri, 19 Feb 2021 12:28:52 -06:00
Committer:     Borislav Petkov
CommitterDate: Sat, 06 Mar 2021 12:28:35 +01:00

x86/platform/uv: Fix indentation warning in Documentation/ABI/testing/sysfs-firmware-sgi_uv

Commit

  c9624cb7db1c ("x86/platform/uv: Update sysfs documentation")

misplaced the first line of a codeblock section, causing the reported
warning message:

  Documentation/ABI/testing/sysfs-firmware-sgi_uv:2: WARNING: Unexpected indentation.

Move the misplaced line below the required blank line to remove the
warning message.

Fixes: c9624cb7db1c ("x86/platform/uv: Update sysfs documentation")
Reported-by: Stephen Rothwell 
Signed-off-by: Justin Ernst 
Signed-off-by: Borislav Petkov 
Acked-by: Mike Travis 
Link: https://lkml.kernel.org/r/20210219182852.385297-1-justin.er...@hpe.com
---
 Documentation/ABI/testing/sysfs-firmware-sgi_uv | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/ABI/testing/sysfs-firmware-sgi_uv b/Documentation/ABI/testing/sysfs-firmware-sgi_uv
index 637c668..12ed843 100644
--- a/Documentation/ABI/testing/sysfs-firmware-sgi_uv
+++ b/Documentation/ABI/testing/sysfs-firmware-sgi_uv
@@ -39,8 +39,8 @@ Description:
 
The uv_type entry contains the hub revision number.
This value can be used to identify the UV system version::
-   "0.*" = Hubless UV ('*' is subtype)
 
+   "0.*" = Hubless UV ('*' is subtype)
"3.0" = UV2
"5.0" = UV3
"7.0" = UV4


[tip: timers/urgent] hrtimer: Update softirq_expires_next correctly after __hrtimer_get_next_event()

2021-03-06 Thread tip-bot2 for Anna-Maria Behnsen
The following commit has been merged into the timers/urgent branch of tip:

Commit-ID:     eca8f0c80a005aea84df507a446fc0154fc55a32
Gitweb:        https://git.kernel.org/tip/eca8f0c80a005aea84df507a446fc0154fc55a32
Author:        Anna-Maria Behnsen
AuthorDate:    Tue, 23 Feb 2021 17:02:40 +01:00
Committer:     Ingo Molnar
CommitterDate: Sat, 06 Mar 2021 12:53:47 +01:00

hrtimer: Update softirq_expires_next correctly after __hrtimer_get_next_event()

hrtimer_force_reprogram() and hrtimer_interrupt() invoke
__hrtimer_get_next_event() to find the earliest expiry time of hrtimer
bases. __hrtimer_get_next_event() does not update
cpu_base::[softirq_]_expires_next to preserve reprogramming logic. That
needs to be done at the callsites.

hrtimer_force_reprogram() updates cpu_base::softirq_expires_next only when
the first expiring timer is a softirq timer and the soft interrupt is not
activated. That's wrong because cpu_base::softirq_expires_next is left
stale when the first expiring timer of all bases is a timer which expires
in hard interrupt context. hrtimer_interrupt() never updates
cpu_base::softirq_expires_next, which is wrong too.

That becomes a problem when clock_settime() sets CLOCK_REALTIME forward and
the first soft expiring timer is in the CLOCK_REALTIME_SOFT base. Setting
CLOCK_REALTIME forward moves the clock MONOTONIC based expiry time of that
timer before the stale cpu_base::softirq_expires_next.

cpu_base::softirq_expires_next is cached to make the check for raising the
soft interrupt fast. In the above case the soft interrupt won't be raised
until clock monotonic reaches the stale cpu_base::softirq_expires_next
value. That's incorrect, but what's worse is that if the softirq timer
becomes the first expiring timer of all clock bases after the hard expiry
timer has been handled the reprogramming of the clockevent from
hrtimer_interrupt() will result in an interrupt storm. That happens because
the reprogramming does not use cpu_base::softirq_expires_next, it uses
__hrtimer_get_next_event() which returns the actual expiry time. Once clock
MONOTONIC reaches cpu_base::softirq_expires_next the soft interrupt is
raised and the storm subsides.

Change the logic in hrtimer_force_reprogram() to evaluate the soft and hard
bases separately, update softirq_expires_next and handle the case when a
soft expiring timer is the first of all bases by comparing the expiry times
and updating the required cpu base fields. Split this functionality into a
separate function to be able to use it in hrtimer_interrupt() as well
without copy paste.

Fixes: da70160462e ("hrtimer: Implement support for softirq based hrtimers")
Reported-by: Mikael Beckius 
Suggested-by: Thomas Gleixner 
Tested-by: Mikael Beckius 
Signed-off-by: Anna-Maria Behnsen 
Signed-off-by: Thomas Gleixner 
Signed-off-by: Ingo Molnar 
Link: https://lore.kernel.org/r/20210223160240.27518-1-anna-ma...@linutronix.de
---
 kernel/time/hrtimer.c | 60 +++---
 1 file changed, 39 insertions(+), 21 deletions(-)

diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
index 743c852..788b9d1 100644
--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -546,8 +546,11 @@ static ktime_t __hrtimer_next_event_base(struct hrtimer_cpu_base *cpu_base,
 }
 
 /*
- * Recomputes cpu_base::*next_timer and returns the earliest expires_next but
- * does not set cpu_base::*expires_next, that is done by hrtimer_reprogram.
+ * Recomputes cpu_base::*next_timer and returns the earliest expires_next
+ * but does not set cpu_base::*expires_next, that is done by
+ * hrtimer[_force]_reprogram and hrtimer_interrupt only. When updating
+ * cpu_base::*expires_next right away, reprogramming logic would no longer
+ * work.
  *
  * When a softirq is pending, we can ignore the HRTIMER_ACTIVE_SOFT bases,
  * those timers will get run whenever the softirq gets handled, at the end of
@@ -588,6 +591,37 @@ __hrtimer_get_next_event(struct hrtimer_cpu_base *cpu_base, unsigned int active_mask)
return expires_next;
 }
 
+static ktime_t hrtimer_update_next_event(struct hrtimer_cpu_base *cpu_base)
+{
+   ktime_t expires_next, soft = KTIME_MAX;
+
+   /*
+* If the soft interrupt has already been activated, ignore the
+* soft bases. They will be handled in the already raised soft
+* interrupt.
+*/
+   if (!cpu_base->softirq_activated) {
+   soft = __hrtimer_get_next_event(cpu_base, HRTIMER_ACTIVE_SOFT);
+   /*
+* Update the soft expiry time. clock_settime() might have
+* affected it.
+*/
+   cpu_base->softirq_expires_next = soft;
+   }
+
+   expires_next = __hrtimer_get_next_event(cpu_base, HRTIMER_ACTIVE_HARD);
+   /*
+* If a softirq timer is expiring first, update cpu_base->next_timer
+* and program the hardware with the soft expiry time.
+*/
+   if (expires_next > soft) {
+   

[tip: perf/urgent] perf/x86/intel: Set PERF_ATTACH_SCHED_CB for large PEBS and LBR

2021-03-06 Thread tip-bot2 for Kan Liang
The following commit has been merged into the perf/urgent branch of tip:

Commit-ID:     afbef30149587ad46f4780b1e0cc5e219745ce90
Gitweb:        https://git.kernel.org/tip/afbef30149587ad46f4780b1e0cc5e219745ce90
Author:        Kan Liang
AuthorDate:    Mon, 30 Nov 2020 11:38:41 -08:00
Committer:     Ingo Molnar
CommitterDate: Sat, 06 Mar 2021 12:52:44 +01:00

perf/x86/intel: Set PERF_ATTACH_SCHED_CB for large PEBS and LBR

To supply a PID/TID for large PEBS, the PEBS buffer has to be flushed
on a context switch.

For normal LBRs, a context switch can flip the address space, and LBR
entries are not tagged with an identifier, so we need to wipe the LBR,
even for per-cpu events.

For LBR callstack, saving/restoring the stack is required during a
context switch.

Set PERF_ATTACH_SCHED_CB for the event with large PEBS & LBR.

Fixes: 9c964efa4330 ("perf/x86/intel: Drain the PEBS buffer during context switches")
Signed-off-by: Kan Liang 
Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Ingo Molnar 
Link: https://lkml.kernel.org/r/20201130193842.10569-2-kan.li...@linux.intel.com
---
 arch/x86/events/intel/core.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 5bac48d..7bbb5bb 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3662,8 +3662,10 @@ static int intel_pmu_hw_config(struct perf_event *event)
if (!(event->attr.freq || (event->attr.wakeup_events && !event->attr.watermark))) {
event->hw.flags |= PERF_X86_EVENT_AUTO_RELOAD;
if (!(event->attr.sample_type &
- ~intel_pmu_large_pebs_flags(event)))
+ ~intel_pmu_large_pebs_flags(event))) {
event->hw.flags |= PERF_X86_EVENT_LARGE_PEBS;
+   event->attach_state |= PERF_ATTACH_SCHED_CB;
+   }
}
if (x86_pmu.pebs_aliases)
x86_pmu.pebs_aliases(event);
@@ -3676,6 +3678,7 @@ static int intel_pmu_hw_config(struct perf_event *event)
ret = intel_pmu_setup_lbr_filter(event);
if (ret)
return ret;
+   event->attach_state |= PERF_ATTACH_SCHED_CB;
 
/*
 * BTS is set up earlier in this path, so don't account twice


[PATCH] sound: soc: codecs: Fix a spello in the file wm8955.c

2021-03-06 Thread Bhaskar Chowdhury


s/sortd/sorted/

Signed-off-by: Bhaskar Chowdhury 
---
 sound/soc/codecs/wm8955.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sound/soc/codecs/wm8955.c b/sound/soc/codecs/wm8955.c
index 513df47bd87d..538bb8b0db39 100644
--- a/sound/soc/codecs/wm8955.c
+++ b/sound/soc/codecs/wm8955.c
@@ -151,7 +151,7 @@ static int wm8955_pll_factors(struct device *dev,
/* The oscilator should run at should be 90-100MHz, and
 * there's a divide by 4 plus an optional divide by 2 in the
 * output path to generate the system clock.  The clock table
-* is sortd so we should always generate a suitable target. */
+* is sorted so we should always generate a suitable target. */
target = Fout * 4;
if (target < 90000000) {
pll->outdiv = 1;
--
2.26.2



[tip: perf/urgent] perf/core: Flush PMU internal buffers for per-CPU events

2021-03-06 Thread tip-bot2 for Kan Liang
The following commit has been merged into the perf/urgent branch of tip:

Commit-ID:     a5398bffc01fe044848c5024e5e867e407f239b8
Gitweb:        https://git.kernel.org/tip/a5398bffc01fe044848c5024e5e867e407f239b8
Author:        Kan Liang
AuthorDate:    Mon, 30 Nov 2020 11:38:40 -08:00
Committer:     Ingo Molnar
CommitterDate: Sat, 06 Mar 2021 12:52:39 +01:00

perf/core: Flush PMU internal buffers for per-CPU events

Sometimes the PMU internal buffers have to be flushed for per-CPU events
during a context switch, e.g., large PEBS. Otherwise, the perf tool may
report samples in locations that do not belong to the process the
samples are processed in, because PEBS does not tag samples with PID/TID.

The current code only flushes the buffers for a per-task event. It
doesn't check per-CPU events.

Add a new event state flag, PERF_ATTACH_SCHED_CB, to indicate that the
PMU internal buffers have to be flushed for this event during a context
switch.

Add sched_cb_entry and perf_sched_cb_usages back to track the PMU/cpuctx
which is required to be flushed.

This patch only needs to invoke sched_task() for per-CPU events; the
per-task events have been handled in perf_event_context_sched_in/out
already.

Fixes: 9c964efa4330 ("perf/x86/intel: Drain the PEBS buffer during context switches")
Reported-by: Gabriel Marin 
Originally-by: Namhyung Kim 
Signed-off-by: Kan Liang 
Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Ingo Molnar 
Link: https://lkml.kernel.org/r/20201130193842.10569-1-kan.li...@linux.intel.com
---
 include/linux/perf_event.h |  2 ++-
 kernel/events/core.c       | 42 +
 2 files changed, 40 insertions(+), 4 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index fab42cf..3f7f89e 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -606,6 +606,7 @@ struct swevent_hlist {
 #define PERF_ATTACH_TASK   0x04
 #define PERF_ATTACH_TASK_DATA  0x08
 #define PERF_ATTACH_ITRACE 0x10
+#define PERF_ATTACH_SCHED_CB   0x20
 
 struct perf_cgroup;
 struct perf_buffer;
@@ -872,6 +873,7 @@ struct perf_cpu_context {
struct list_headcgrp_cpuctx_entry;
 #endif
 
+   struct list_headsched_cb_entry;
int sched_cb_usage;
 
int online;
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 0aeca5f..03db40f 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -386,6 +386,7 @@ static DEFINE_MUTEX(perf_sched_mutex);
 static atomic_t perf_sched_count;
 
 static DEFINE_PER_CPU(atomic_t, perf_cgroup_events);
+static DEFINE_PER_CPU(int, perf_sched_cb_usages);
 static DEFINE_PER_CPU(struct pmu_event_list, pmu_sb_events);
 
 static atomic_t nr_mmap_events __read_mostly;
@@ -3461,11 +3462,16 @@ unlock:
}
 }
 
+static DEFINE_PER_CPU(struct list_head, sched_cb_list);
+
 void perf_sched_cb_dec(struct pmu *pmu)
 {
struct perf_cpu_context *cpuctx = this_cpu_ptr(pmu->pmu_cpu_context);
 
-   --cpuctx->sched_cb_usage;
+   this_cpu_dec(perf_sched_cb_usages);
+
+   if (!--cpuctx->sched_cb_usage)
+   list_del(&cpuctx->sched_cb_entry);
 }
 
 
@@ -3473,7 +3479,10 @@ void perf_sched_cb_inc(struct pmu *pmu)
 {
struct perf_cpu_context *cpuctx = this_cpu_ptr(pmu->pmu_cpu_context);
 
-   cpuctx->sched_cb_usage++;
+   if (!cpuctx->sched_cb_usage++)
+   list_add(&cpuctx->sched_cb_entry, this_cpu_ptr(&sched_cb_list));
+
+   this_cpu_inc(perf_sched_cb_usages);
 }
 
 /*
@@ -3502,6 +3511,24 @@ static void __perf_pmu_sched_task(struct perf_cpu_context *cpuctx, bool sched_in)
perf_ctx_unlock(cpuctx, cpuctx->task_ctx);
 }
 
+static void perf_pmu_sched_task(struct task_struct *prev,
+   struct task_struct *next,
+   bool sched_in)
+{
+   struct perf_cpu_context *cpuctx;
+
+   if (prev == next)
+   return;
+
+   list_for_each_entry(cpuctx, this_cpu_ptr(&sched_cb_list), sched_cb_entry) {
+   /* will be handled in perf_event_context_sched_in/out */
+   if (cpuctx->task_ctx)
+   continue;
+
+   __perf_pmu_sched_task(cpuctx, sched_in);
+   }
+}
+
 static void perf_event_switch(struct task_struct *task,
  struct task_struct *next_prev, bool sched_in);
 
@@ -3524,6 +3551,9 @@ void __perf_event_task_sched_out(struct task_struct *task,
 {
int ctxn;
 
+   if (__this_cpu_read(perf_sched_cb_usages))
+   perf_pmu_sched_task(task, next, false);
+
if (atomic_read(&nr_switch_events))
perf_event_switch(task, next, false);
 
@@ -3832,6 +3862,9 @@ void __perf_event_task_sched_in(struct task_struct *prev,
 
if (atomic_read(&nr_switch_events))
perf_event_switch(task, prev, true);
+
+   if (__this_cpu_read(perf_sched_cb_usages))
+   

[tip: locking/core] static_call: Fix the module key fixup

2021-03-06 Thread tip-bot2 for Peter Zijlstra
The following commit has been merged into the locking/core branch of tip:

Commit-ID:     50bf8080a94d171e843fc013abec19d8ab9f50ae
Gitweb:        https://git.kernel.org/tip/50bf8080a94d171e843fc013abec19d8ab9f50ae
Author:        Peter Zijlstra
AuthorDate:    Thu, 25 Feb 2021 23:03:51 +01:00
Committer:     Ingo Molnar
CommitterDate: Sat, 06 Mar 2021 12:49:08 +01:00

static_call: Fix the module key fixup

Provided the target address of a R_X86_64_PC32 relocation is aligned,
the low two bits should be invariant between the relative and absolute
value.

Turns out the address is not aligned and things go sideways; ensure we
transfer the bits in the absolute form when fixing up the key address.

Fixes: 73f44fe19d35 ("static_call: Allow module use without exposing static_call_key")
Reported-by: Steven Rostedt 
Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Ingo Molnar 
Tested-by: Steven Rostedt (VMware) 
Link: https://lkml.kernel.org/r/20210225220351.ge4...@worktop.programming.kicks-ass.net
---
 kernel/static_call.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/kernel/static_call.c b/kernel/static_call.c
index 6906c6e..ae82529 100644
--- a/kernel/static_call.c
+++ b/kernel/static_call.c
@@ -349,7 +349,8 @@ static int static_call_add_module(struct module *mod)
struct static_call_site *site;
 
for (site = start; site != stop; site++) {
-   unsigned long addr = (unsigned long)static_call_key(site);
+   unsigned long s_key = (long)site->key + (long)&site->key;
+   unsigned long addr = s_key & ~STATIC_CALL_SITE_FLAGS;
unsigned long key;
 
/*
@@ -373,8 +374,8 @@ static int static_call_add_module(struct module *mod)
return -EINVAL;
}
 
-   site->key = (key - (long)&site->key) |
-   (site->key & STATIC_CALL_SITE_FLAGS);
+   key |= s_key & STATIC_CALL_SITE_FLAGS;
+   site->key = key - (long)&site->key;
}
 
return __static_call_init(mod, start, stop);


[tip: locking/core] lockdep: Add lockdep_assert_not_held()

2021-03-06 Thread tip-bot2 for Shuah Khan
The following commit has been merged into the locking/core branch of tip:

Commit-ID:     3e31f94752e454bdd0ca4a1d046ee21f80c166c5
Gitweb:        https://git.kernel.org/tip/3e31f94752e454bdd0ca4a1d046ee21f80c166c5
Author:        Shuah Khan
AuthorDate:    Fri, 26 Feb 2021 17:06:58 -07:00
Committer:     Ingo Molnar
CommitterDate: Sat, 06 Mar 2021 12:51:05 +01:00

lockdep: Add lockdep_assert_not_held()

Some kernel functions must be called without holding a specific lock.
Add lockdep_assert_not_held() to be used in these functions to detect
incorrect calls while holding a lock.

lockdep_assert_not_held() provides the opposite functionality of
lockdep_assert_held() which is used to assert calls that require
holding a specific lock.

Incorporates suggestions from Peter Zijlstra to avoid misfires when
lockdep_off() is employed.

The need for lockdep_assert_not_held() came up in a discussion on an
ath10k patch. ath10k_drain_tx() and i915_vma_pin_ww() are examples
of functions that can use lockdep_assert_not_held().
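
A hedged usage sketch (my_dev and its conf_mutex are hypothetical; the
ath10k patch later in this digest shows the real call site):

  #include <linux/lockdep.h>
  #include <linux/mutex.h>

  struct my_dev {
          struct mutex conf_mutex;
  };

  /* Must not be called with conf_mutex held: work flushed from here
   * may itself take conf_mutex and deadlock. */
  static void my_dev_drain(struct my_dev *dev)
  {
          lockdep_assert_not_held(&dev->conf_mutex);

          /* ... flush/cancel work that may acquire conf_mutex ... */
  }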

Signed-off-by: Shuah Khan 
Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Ingo Molnar 
Link: https://lore.kernel.org/linux-wireless/871rdmu9z9@codeaurora.org/
---
 include/linux/lockdep.h  | 11 ++++++++---
 kernel/locking/lockdep.c |  6 +++++-
 2 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
index 7b7ebf2..dbd9ea8 100644
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -301,8 +301,12 @@ extern void lock_unpin_lock(struct lockdep_map *lock, struct pin_cookie);
 
 #define lockdep_depth(tsk) (debug_locks ? (tsk)->lockdep_depth : 0)
 
-#define lockdep_assert_held(l) do {\
-   WARN_ON(debug_locks && !lockdep_is_held(l));\
+#define lockdep_assert_held(l) do {\
+   WARN_ON(debug_locks && lockdep_is_held(l) == 0);\
+   } while (0)
+
+#define lockdep_assert_not_held(l) do {\
+   WARN_ON(debug_locks && lockdep_is_held(l) == 1);\
} while (0)
 
 #define lockdep_assert_held_write(l)   do {\
@@ -393,7 +397,8 @@ extern int lockdep_is_held(const void *);
 #define lockdep_is_held_type(l, r) (1)
 
 #define lockdep_assert_held(l) do { (void)(l); } while (0)
-#define lockdep_assert_held_write(l)   do { (void)(l); } while (0)
+#define lockdep_assert_not_held(l) do { (void)(l); } while (0)
+#define lockdep_assert_held_write(l)   do { (void)(l); } while (0)
 #define lockdep_assert_held_read(l)do { (void)(l); } while (0)
 #define lockdep_assert_held_once(l)do { (void)(l); } while (0)
 
diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index c6d0c1d..969736b 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -5539,8 +5539,12 @@ noinstr int lock_is_held_type(const struct lockdep_map *lock, int read)
unsigned long flags;
int ret = 0;
 
+   /*
+* Avoid false negative lockdep_assert_held() and
+* lockdep_assert_not_held().
+*/
if (unlikely(!lockdep_enabled()))
-   return 1; /* avoid false negative lockdep_assert_held() */
+   return -1;
 
raw_local_irq_save(flags);
check_flags(flags);


[tip: locking/core] x86/jump_label: Mark arguments as const to satisfy asm constraints

2021-03-06 Thread tip-bot2 for Jason Gerecke
The following commit has been merged into the locking/core branch of tip:

Commit-ID:     864b435514b286c0be2a38a02f487aa28d990ef8
Gitweb:        https://git.kernel.org/tip/864b435514b286c0be2a38a02f487aa28d990ef8
Author:        Jason Gerecke
AuthorDate:    Thu, 11 Feb 2021 13:48:48 -08:00
Committer:     Ingo Molnar
CommitterDate: Sat, 06 Mar 2021 12:51:00 +01:00

x86/jump_label: Mark arguments as const to satisfy asm constraints

When compiling an external kernel module with `-O0` or `-O1`, the following
compile error may be reported:

./arch/x86/include/asm/jump_label.h:25:2: error: impossible constraint in ‘asm’
   25 |  asm_volatile_goto("1:"
      |  ^

It appears that these lower optimization levels prevent GCC from detecting
that the key/branch arguments can be treated as constants and used as
immediate operands. To work around this, explicitly mark the arguments `const`.

Signed-off-by: Jason Gerecke 
Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Ingo Molnar 
Reviewed-by: Steven Rostedt (VMware) 
Acked-by: Josh Poimboeuf 
Link: https://lkml.kernel.org/r/20210211214848.536626-1-jason.gere...@wacom.com
---
 arch/x86/include/asm/jump_label.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/jump_label.h b/arch/x86/include/asm/jump_label.h
index 06c3cc2..7f20066 100644
--- a/arch/x86/include/asm/jump_label.h
+++ b/arch/x86/include/asm/jump_label.h
@@ -20,7 +20,7 @@
 #include 
 #include 
 
-static __always_inline bool arch_static_branch(struct static_key *key, bool branch)
+static __always_inline bool arch_static_branch(struct static_key * const key, const bool branch)
 {
asm_volatile_goto("1:"
".byte " __stringify(STATIC_KEY_INIT_NOP) "\n\t"
@@ -36,7 +36,7 @@ l_yes:
return true;
 }
 
-static __always_inline bool arch_static_branch_jump(struct static_key *key, bool branch)
+static __always_inline bool arch_static_branch_jump(struct static_key * const key, const bool branch)
 {
asm_volatile_goto("1:"
".byte 0xe9\n\t .long %l[l_yes] - 2f\n\t"


[tip: locking/core] lockdep: Add lockdep lock state defines

2021-03-06 Thread tip-bot2 for Shuah Khan
The following commit has been merged into the locking/core branch of tip:

Commit-ID:     f8cfa46608f8aa5ca5421ce281ab314129c15411
Gitweb:        https://git.kernel.org/tip/f8cfa46608f8aa5ca5421ce281ab314129c15411
Author:        Shuah Khan
AuthorDate:    Fri, 26 Feb 2021 17:06:59 -07:00
Committer:     Ingo Molnar
CommitterDate: Sat, 06 Mar 2021 12:51:10 +01:00

lockdep: Add lockdep lock state defines

Add defines for the lock state returns from lock_is_held_type() based on
Johannes Berg's suggestions, as they make the lock states easier to read
and maintain. These are defines rather than an enum to avoid changing
the lock_is_held_type() and lockdep_is_held() return types.

Update lock_is_held_type() and __lock_is_held() to use the new
defines.

Signed-off-by: Shuah Khan 
Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Ingo Molnar 
Link: https://lore.kernel.org/linux-wireless/871rdmu9z9@codeaurora.org/
---
 include/linux/lockdep.h  | 11 +--
 kernel/locking/lockdep.c | 11 ++-
 2 files changed, 15 insertions(+), 7 deletions(-)

diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
index dbd9ea8..17805aa 100644
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -268,6 +268,11 @@ extern void lock_acquire(struct lockdep_map *lock, unsigned int subclass,
 
 extern void lock_release(struct lockdep_map *lock, unsigned long ip);
 
+/* lock_is_held_type() returns */
+#define LOCK_STATE_UNKNOWN -1
+#define LOCK_STATE_NOT_HELD0
+#define LOCK_STATE_HELD1
+
 /*
  * Same "read" as for lock_acquire(), except -1 means any.
  */
@@ -302,11 +307,13 @@ extern void lock_unpin_lock(struct lockdep_map *lock, struct pin_cookie);
 #define lockdep_depth(tsk) (debug_locks ? (tsk)->lockdep_depth : 0)
 
 #define lockdep_assert_held(l) do {\
-   WARN_ON(debug_locks && lockdep_is_held(l) == 0);\
+   WARN_ON(debug_locks &&  \
+   lockdep_is_held(l) == LOCK_STATE_NOT_HELD); \
} while (0)
 
 #define lockdep_assert_not_held(l) do {\
-   WARN_ON(debug_locks && lockdep_is_held(l) == 1);\
+   WARN_ON(debug_locks &&  \
+   lockdep_is_held(l) == LOCK_STATE_HELD); \
} while (0)
 
 #define lockdep_assert_held_write(l)   do {\
diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 969736b..c0b8926 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -54,6 +54,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 
@@ -5252,13 +5253,13 @@ int __lock_is_held(const struct lockdep_map *lock, int read)
 
if (match_held_lock(hlock, lock)) {
if (read == -1 || hlock->read == read)
-   return 1;
+   return LOCK_STATE_HELD;
 
-   return 0;
+   return LOCK_STATE_NOT_HELD;
}
}
 
-   return 0;
+   return LOCK_STATE_NOT_HELD;
 }
 
 static struct pin_cookie __lock_pin_lock(struct lockdep_map *lock)
@@ -5537,14 +5538,14 @@ EXPORT_SYMBOL_GPL(lock_release);
 noinstr int lock_is_held_type(const struct lockdep_map *lock, int read)
 {
unsigned long flags;
-   int ret = 0;
+   int ret = LOCK_STATE_NOT_HELD;
 
/*
 * Avoid false negative lockdep_assert_held() and
 * lockdep_assert_not_held().
 */
if (unlikely(!lockdep_enabled()))
-   return -1;
+   return LOCK_STATE_UNKNOWN;
 
raw_local_irq_save(flags);
check_flags(flags);


[tip: locking/core] locking/csd_lock: Prepare more CSD lock debugging

2021-03-06 Thread tip-bot2 for Juergen Gross
The following commit has been merged into the locking/core branch of tip:

Commit-ID:     de7b09ef658d637eed0584eaba30884e409aef31
Gitweb:        https://git.kernel.org/tip/de7b09ef658d637eed0584eaba30884e409aef31
Author:        Juergen Gross
AuthorDate:    Mon, 01 Mar 2021 11:13:35 +01:00
Committer:     Ingo Molnar
CommitterDate: Sat, 06 Mar 2021 12:49:48 +01:00

locking/csd_lock: Prepare more CSD lock debugging

In order to be able to easily add more CSD lock debugging data to
struct call_function_data->csd, move the call_single_data_t element
into a sub-structure.

Signed-off-by: Juergen Gross 
Signed-off-by: Ingo Molnar 
Link: https://lore.kernel.org/r/20210301101336.7797-3-jgr...@suse.com
---
 kernel/smp.c | 16 ++++++++++------
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/kernel/smp.c b/kernel/smp.c
index d5f0b21..6d7e6db 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -31,8 +31,12 @@
 
 #define CSD_TYPE(_csd) ((_csd)->node.u_flags & CSD_FLAG_TYPE_MASK)
 
+struct cfd_percpu {
+   call_single_data_t  csd;
+};
+
 struct call_function_data {
-   call_single_data_t  __percpu *csd;
+   struct cfd_percpu   __percpu *pcpu;
cpumask_var_t   cpumask;
cpumask_var_t   cpumask_ipi;
 };
@@ -55,8 +59,8 @@ int smpcfd_prepare_cpu(unsigned int cpu)
free_cpumask_var(cfd->cpumask);
return -ENOMEM;
}
-   cfd->csd = alloc_percpu(call_single_data_t);
-   if (!cfd->csd) {
+   cfd->pcpu = alloc_percpu(struct cfd_percpu);
+   if (!cfd->pcpu) {
free_cpumask_var(cfd->cpumask);
free_cpumask_var(cfd->cpumask_ipi);
return -ENOMEM;
@@ -71,7 +75,7 @@ int smpcfd_dead_cpu(unsigned int cpu)
 
free_cpumask_var(cfd->cpumask);
free_cpumask_var(cfd->cpumask_ipi);
-   free_percpu(cfd->csd);
+   free_percpu(cfd->pcpu);
return 0;
 }
 
@@ -694,7 +698,7 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
 
cpumask_clear(cfd->cpumask_ipi);
for_each_cpu(cpu, cfd->cpumask) {
-   call_single_data_t *csd = per_cpu_ptr(cfd->csd, cpu);
+   call_single_data_t *csd = &per_cpu_ptr(cfd->pcpu, cpu)->csd;
 
if (cond_func && !cond_func(cpu, info))
continue;
@@ -719,7 +723,7 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
for_each_cpu(cpu, cfd->cpumask) {
call_single_data_t *csd;
 
-   csd = per_cpu_ptr(cfd->csd, cpu);
csd = &per_cpu_ptr(cfd->pcpu, cpu)->csd;
csd_lock_wait(csd);
}
}


[tip: locking/core] locking/csd_lock: Add more data to CSD lock debugging

2021-03-06 Thread tip-bot2 for Juergen Gross
The following commit has been merged into the locking/core branch of tip:

Commit-ID:     a5aabace5fb8abf2adcfcf0fe54c089b20d71755
Gitweb:        https://git.kernel.org/tip/a5aabace5fb8abf2adcfcf0fe54c089b20d71755
Author:        Juergen Gross
AuthorDate:    Mon, 01 Mar 2021 11:13:36 +01:00
Committer:     Ingo Molnar
CommitterDate: Sat, 06 Mar 2021 12:49:48 +01:00

locking/csd_lock: Add more data to CSD lock debugging

In order to help identifying problems with IPI handling and remote
function execution add some more data to IPI debugging code.

There have been multiple reports of CPUs looping long times (many
seconds) in smp_call_function_many() waiting for another CPU executing
a function like tlb flushing. Most of these reports have been for
cases where the kernel was running as a guest on top of KVM or Xen
(there are rumours of that happening under VMWare, too, and even on
bare metal).

Finding the root cause hasn't been successful yet, even after more than
2 years of chasing this bug by different developers.

Commit:

  35feb60474bf4f7 ("kernel/smp: Provide CSD lock timeout diagnostics")

tried to address this by adding some debug code and by issuing another
IPI when a hang was detected. This helped mitigating the problem
(the repeated IPI unlocks the hang), but the root cause is still unknown.

Current available data suggests that either an IPI wasn't sent when it
should have been, or that the IPI didn't result in the target CPU
executing the queued function (due to the IPI not reaching the CPU,
the IPI handler not being called, or the handler not seeing the queued
request).

Try to add more diagnostic data by introducing a global atomic counter
which is being incremented when doing critical operations (before and
after queueing a new request, when sending an IPI, and when dequeueing
a request). The counter value is stored in percpu variables which can
be printed out when a hang is detected.

The data of the last event (consisting of sequence counter, source
CPU, target CPU, and event type) is stored in a global variable. When
a new event is to be traced, the data of the last event is stored in
the event related percpu location and the global data is updated with
the new event's data. This allows to track two events in one data
location: one by the value of the event data (the event before the
current one), and one by the location itself (the current event).

A typical printout with a detected hang will look like this:

csd: Detected non-responsive CSD lock (#1) on CPU#1, waiting 5000000003 ns for CPU#06 scf_handler_1+0x0/0x50(0xffffa2a881bb1410).
csd: CSD lock (#1) handling prior scf_handler_1+0x0/0x50(0xffffa2a8813823c0) request.
csd: cnt(00008cc): 0000->0000 dequeue (src cpu 0 == empty)
csd: cnt(00008cd): 0000->0006 idle
csd: cnt(0003668): 0001->0006 queue
csd: cnt(0003669): 0001->0006 ipi
csd: cnt(0003e0f): 0007->000a queue
csd: cnt(0003e10): 0001->ffff ping
csd: cnt(0003e71): 0003->0000 ping
csd: cnt(0003e72): ffff->0006 gotipi
csd: cnt(0003e73): ffff->0006 handle
csd: cnt(0003e74): ffff->0006 dequeue (src cpu 0 == empty)
csd: cnt(0003e7f): 0004->0006 ping
csd: cnt(0003e80): 0001->ffff pinged
csd: cnt(0003eb2): 0005->0001 noipi
csd: cnt(0003eb3): 0001->0006 queue
csd: cnt(0003eb4): 0001->0006 noipi
csd: cnt now: 0003f00

The idea is to print only relevant entries. Those are all events which
are associated with the hang (so sender side events for the source CPU
of the hanging request, and receiver side events for the target CPU),
and the related events just before those (for adding data needed to
identify a possible race). Printing all available data would be
possible, but this would add large amounts of data printed on larger
configurations.
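
A hedged userspace sketch of that two-events-per-slot bookkeeping
(field sizes, NR_CPUS and all names are illustrative, not the kernel
code):

  #include <stdatomic.h>
  #include <stdint.h>

  #define NR_CPUS 64

  struct csd_event {
          uint64_t cnt;   /* global sequence number */
          uint16_t src;   /* sending CPU */
          uint16_t dst;   /* target CPU */
          uint16_t type;  /* queue / ipi / gotipi / handle / dequeue */
  };

  static _Atomic uint64_t csd_cnt;
  static struct csd_event last_event;             /* the current event */
  static struct csd_event percpu_slot[NR_CPUS];   /* the one before it */

  static void csd_trace(unsigned int cpu, uint16_t src, uint16_t dst,
                        uint16_t type)
  {
          /* Park the previous global event in this CPU's slot, then
           * publish the new one: each slot encodes two events, one by
           * value (the prior event) and one by location (the current
           * one). */
          percpu_slot[cpu] = last_event;
          last_event = (struct csd_event){
                  .cnt = atomic_fetch_add(&csd_cnt, 1),
                  .src = src, .dst = dst, .type = type,
          };
  }

  int main(void)
  {
          csd_trace(1, 1, 6, 0);  /* would log e.g. "0001->0006 queue" */
          csd_trace(1, 1, 6, 1);  /* would log e.g. "0001->0006 ipi" */
          return 0;
  }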

Signed-off-by: Juergen Gross 
[ Minor readability edits. Breaks col80 but is far more readable. ]
Signed-off-by: Ingo Molnar 
Tested-by: Paul E. McKenney 
Link: https://lore.kernel.org/r/20210301101336.7797-4-jgr...@suse.com
---
 Documentation/admin-guide/kernel-parameters.txt |   4 +-
 kernel/smp.c                                    | 226 ++-
 2 files changed, 226 insertions(+), 4 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 98dbffa..1fe9d38 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -789,6 +789,10 @@
printed to the console in case a hanging CPU is
detected, and that CPU is pinged again in order to try
to resolve the hang situation.
+   0: disable csdlock debugging (default)
+   1: enable basic csdlock debugging (minor impact)
+   ext: enable extended csdlock debugging (more impact,
+but 

[tip: locking/core] locking/csd_lock: Add boot parameter for controlling CSD lock debugging

2021-03-06 Thread tip-bot2 for Juergen Gross
The following commit has been merged into the locking/core branch of tip:

Commit-ID:     8d0968cc6b8ffd8496c2ebffdfdc801f949a85e5
Gitweb:        https://git.kernel.org/tip/8d0968cc6b8ffd8496c2ebffdfdc801f949a85e5
Author:        Juergen Gross
AuthorDate:    Mon, 01 Mar 2021 11:13:34 +01:00
Committer:     Ingo Molnar
CommitterDate: Sat, 06 Mar 2021 12:49:48 +01:00

locking/csd_lock: Add boot parameter for controlling CSD lock debugging

Currently CSD lock debugging can be switched on and off via a kernel
config option only. Unfortunately there is at least one problem with
CSD lock handling pending for about 2 years now, which has been seen
in different environments (mostly when running virtualized under KVM
or Xen, at least once on bare metal). Multiple attempts to catch this
issue have finally led to introduction of CSD lock debug code, but
this code is not in use in most distros as it has some impact on
performance.

In order to be able to ship kernels with CONFIG_CSD_LOCK_WAIT_DEBUG
enabled even for production use, add a boot parameter for switching
the debug functionality on. This reduces the performance impact of the
debug code to a bare minimum when it is not being used.

Signed-off-by: Juergen Gross 
[ Minor edits. ]
Signed-off-by: Ingo Molnar 
Link: https://lore.kernel.org/r/20210301101336.7797-2-jgr...@suse.com
---
 Documentation/admin-guide/kernel-parameters.txt |  6 +++-
 kernel/smp.c                                    | 38 ++--
 2 files changed, 40 insertions(+), 4 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 0454572..98dbffa 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -784,6 +784,12 @@
cs89x0_media=   [HW,NET]
Format: { rj45 | aui | bnc }
 
+   csdlock_debug=  [KNL] Enable debug add-ons of cross-CPU function call
+   handling. When switched on, additional debug data is
+   printed to the console in case a hanging CPU is
+   detected, and that CPU is pinged again in order to try
+   to resolve the hang situation.
+
dasd=   [HW,NET]
See header of drivers/s390/block/dasd_devmap.c.
 
diff --git a/kernel/smp.c b/kernel/smp.c
index aeb0adf..d5f0b21 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -24,6 +24,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "smpboot.h"
 #include "sched/smp.h"
@@ -102,6 +103,20 @@ void __init call_function_init(void)
 
 #ifdef CONFIG_CSD_LOCK_WAIT_DEBUG
 
+static DEFINE_STATIC_KEY_FALSE(csdlock_debug_enabled);
+
+static int __init csdlock_debug(char *str)
+{
+   unsigned int val = 0;
+
+   get_option(&str, &val);
+   if (val)
+   static_branch_enable(&csdlock_debug_enabled);
+
+   return 0;
+}
+early_param("csdlock_debug", csdlock_debug);
+
 static DEFINE_PER_CPU(call_single_data_t *, cur_csd);
 static DEFINE_PER_CPU(smp_call_func_t, cur_csd_func);
 static DEFINE_PER_CPU(void *, cur_csd_info);
@@ -110,7 +125,7 @@ static DEFINE_PER_CPU(void *, cur_csd_info);
 static atomic_t csd_bug_count = ATOMIC_INIT(0);
 
 /* Record current CSD work for current CPU, NULL to erase. */
-static void csd_lock_record(call_single_data_t *csd)
+static void __csd_lock_record(call_single_data_t *csd)
 {
if (!csd) {
smp_mb(); /* NULL cur_csd after unlock. */
@@ -125,7 +140,13 @@ static void csd_lock_record(call_single_data_t *csd)
  /* Or before unlock, as the case may be. */
 }
 
-static __always_inline int csd_lock_wait_getcpu(call_single_data_t *csd)
+static __always_inline void csd_lock_record(call_single_data_t *csd)
+{
+   if (static_branch_unlikely(&csdlock_debug_enabled))
+   __csd_lock_record(csd);
+}
+
+static int csd_lock_wait_getcpu(call_single_data_t *csd)
 {
unsigned int csd_type;
 
@@ -140,7 +161,7 @@ static __always_inline int csd_lock_wait_getcpu(call_single_data_t *csd)
  * the CSD_TYPE_SYNC/ASYNC types provide the destination CPU,
  * so waiting on other types gets much less information.
  */
-static __always_inline bool csd_lock_wait_toolong(call_single_data_t *csd, u64 ts0, u64 *ts1, int *bug_id)
+static bool csd_lock_wait_toolong(call_single_data_t *csd, u64 ts0, u64 *ts1, int *bug_id)
 {
int cpu = -1;
int cpux;
@@ -204,7 +225,7 @@ static __always_inline bool csd_lock_wait_toolong(call_single_data_t *csd, u64 ts0, u64 *ts1, int *bug_id)
  * previous function call. For multi-cpu calls its even more interesting
  * as we'll have to ensure no other cpu is observing our csd.
  */
-static __always_inline void csd_lock_wait(call_single_data_t *csd)
+static void __csd_lock_wait(call_single_data_t *csd)
 {
int bug_id = 0;
u64 ts0, ts1;
@@ -218,6 +239,15 @@ static __always_inline void csd_lock_wait(call_single_data_t *csd)
smp_acquire__after_ctrl_dep();
 }
 

[tip: irq/core] genirq: Add IRQF_NO_AUTOEN for request_irq/nmi()

2021-03-06 Thread tip-bot2 for Barry Song
The following commit has been merged into the irq/core branch of tip:

Commit-ID:     cbe16f35bee6880becca6f20d2ebf6b457148552
Gitweb:        https://git.kernel.org/tip/cbe16f35bee6880becca6f20d2ebf6b457148552
Author:        Barry Song
AuthorDate:    Wed, 03 Mar 2021 11:49:15 +13:00
Committer:     Ingo Molnar
CommitterDate: Sat, 06 Mar 2021 12:48:00 +01:00

genirq: Add IRQF_NO_AUTOEN for request_irq/nmi()

Many drivers don't want interrupts enabled automatically via request_irq().
So they handle this issue in one of the two ways below:

(1)
  irq_set_status_flags(irq, IRQ_NOAUTOEN);
  request_irq(dev, irq...);

(2)
  request_irq(dev, irq...);
  disable_irq(irq);

The code in the second way is silly and unsafe. In the small time gap
between request_irq() and disable_irq(), interrupts can still come.

The code in the first way is safe though it's suboptimal.

Add a new IRQF_NO_AUTOEN flag which can be handed in by drivers to
request_irq() and request_nmi(). It prevents the automatic enabling of the
requested interrupt/nmi in the same safe way as #1 above. With that the
various usage sites of #1 and #2 above can be simplified and corrected.
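
A hedged before/after sketch of a driver call site (handler and names
are illustrative):

  /* Before: racy -- the IRQ can fire between the two calls. */
  ret = request_irq(irq, my_handler, 0, "mydev", dev);
  if (ret)
          return ret;
  disable_irq(irq);

  /* After: the IRQ stays off until the driver is ready for it. */
  ret = request_irq(irq, my_handler, IRQF_NO_AUTOEN, "mydev", dev);
  if (ret)
          return ret;
  /* ... finish device setup ... */
  enable_irq(irq);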

Signed-off-by: Barry Song 
Signed-off-by: Thomas Gleixner 
Signed-off-by: Ingo Molnar 
Cc: dmitry.torok...@gmail.com
Link: https://lore.kernel.org/r/20210302224916.13980-2-song.bao@hisilicon.com
---
 include/linux/interrupt.h |  4 ++++
 kernel/irq/manage.c       | 11 +++++++++--
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
index 967e257..76f1161 100644
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -61,6 +61,9 @@
  *interrupt handler after suspending interrupts. For system
  *wakeup devices users need to implement wakeup detection in
  *their interrupt handlers.
+ * IRQF_NO_AUTOEN - Don't enable IRQ or NMI automatically when users request it.
+ *                  Users will enable it explicitly by enable_irq() or enable_nmi()
+ *                  later.
  */
 #define IRQF_SHARED0x0080
 #define IRQF_PROBE_SHARED  0x0100
@@ -74,6 +77,7 @@
 #define IRQF_NO_THREAD 0x0001
 #define IRQF_EARLY_RESUME  0x0002
 #define IRQF_COND_SUSPEND  0x0004
+#define IRQF_NO_AUTOEN 0x0008
 
 #define IRQF_TIMER (__IRQF_TIMER | IRQF_NO_SUSPEND | 
IRQF_NO_THREAD)
 
diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index dec3f73..97c231a 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -1693,7 +1693,8 @@ __setup_irq(unsigned int irq, struct irq_desc *desc, struct irqaction *new)
irqd_set(&desc->irq_data, IRQD_NO_BALANCING);
}
 
-   if (irq_settings_can_autoenable(desc)) {
+   if (!(new->flags & IRQF_NO_AUTOEN) &&
+   irq_settings_can_autoenable(desc)) {
irq_startup(desc, IRQ_RESEND, IRQ_START_COND);
} else {
/*
@@ -2086,10 +2087,15 @@ int request_threaded_irq(unsigned int irq, irq_handler_t handler,
 * which interrupt is which (messes up the interrupt freeing
 * logic etc).
 *
+* Also shared interrupts do not go well with disabling auto enable.
+* The sharing interrupt might request it while it's still disabled
+* and then wait for interrupts forever.
+*
 * Also IRQF_COND_SUSPEND only makes sense for shared interrupts and
 * it cannot be set along with IRQF_NO_SUSPEND.
 */
if (((irqflags & IRQF_SHARED) && !dev_id) ||
+   ((irqflags & IRQF_SHARED) && (irqflags & IRQF_NO_AUTOEN)) ||
(!(irqflags & IRQF_SHARED) && (irqflags & IRQF_COND_SUSPEND)) ||
((irqflags & IRQF_NO_SUSPEND) && (irqflags & IRQF_COND_SUSPEND)))
return -EINVAL;
@@ -2245,7 +2251,8 @@ int request_nmi(unsigned int irq, irq_handler_t handler,
 
desc = irq_to_desc(irq);
 
-   if (!desc || irq_settings_can_autoenable(desc) ||
+   if (!desc || (irq_settings_can_autoenable(desc) &&
+   !(irqflags & IRQF_NO_AUTOEN)) ||
!irq_settings_can_request(desc) ||
WARN_ON(irq_settings_is_per_cpu_devid(desc)) ||
!irq_supports_nmi(desc))


[tip: locking/core] ath10k: Detect conf_mutex held ath10k_drain_tx() calls

2021-03-06 Thread tip-bot2 for Shuah Khan
The following commit has been merged into the locking/core branch of tip:

Commit-ID:     bdb1050ee1faaec1e78c15de8b1959176f26c655
Gitweb:        https://git.kernel.org/tip/bdb1050ee1faaec1e78c15de8b1959176f26c655
Author:        Shuah Khan
AuthorDate:    Fri, 26 Feb 2021 17:07:00 -07:00
Committer:     Ingo Molnar
CommitterDate: Sat, 06 Mar 2021 12:51:15 +01:00

ath10k: Detect conf_mutex held ath10k_drain_tx() calls

ath10k_drain_tx() must not be called with conf_mutex held as workers can
use that also. Add call to lockdep_assert_not_held() on conf_mutex to
detect if conf_mutex is held by the caller.

The idea for this patch stemmed from coming across the comment block
above ath10k_drain_tx() while reviewing the conf_mutex holds, to debug
the conf_mutex lock assert in ath10k_debug_fw_stats_request().

Adding detection to assert on conf_mutex hold will help detect incorrect
usages that could lead to locking problems when async worker routines try
to call this routine.

Signed-off-by: Shuah Khan 
Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Ingo Molnar 
Acked-by: Kalle Valo 
Link: https://lore.kernel.org/linux-wireless/871rdmu9z9@codeaurora.org/
---
 drivers/net/wireless/ath/ath10k/mac.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/wireless/ath/ath10k/mac.c b/drivers/net/wireless/ath/ath10k/mac.c
index bb6c5ee..5ce4f8d 100644
--- a/drivers/net/wireless/ath/ath10k/mac.c
+++ b/drivers/net/wireless/ath/ath10k/mac.c
@@ -4727,6 +4727,8 @@ out:
 /* Must not be called with conf_mutex held as workers can use that also. */
 void ath10k_drain_tx(struct ath10k *ar)
 {
+	lockdep_assert_not_held(&ar->conf_mutex);
+
/* make sure rcu-protected mac80211 tx path itself is drained */
synchronize_net();
 


[PATCH] soc:litex: remove duplicate include in litex_soc_ctrl

2021-03-06 Thread menglong8 . dong
From: Zhang Yunkai 

'errno.h' is included twice in 'litex_soc_ctrl.c'; it is already
included at line 11.

Signed-off-by: Zhang Yunkai 
---
 drivers/soc/litex/litex_soc_ctrl.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/soc/litex/litex_soc_ctrl.c 
b/drivers/soc/litex/litex_soc_ctrl.c
index 6268bfa7f0d6..c3e379a990f2 100644
--- a/drivers/soc/litex/litex_soc_ctrl.c
+++ b/drivers/soc/litex/litex_soc_ctrl.c
@@ -13,7 +13,6 @@
 #include 
 #include 
 #include 
-#include <linux/errno.h>
 #include 
 #include 
 
-- 
2.25.1



[tip: objtool/core] objtool,x86: More ModRM sugar

2021-03-06 Thread tip-bot2 for Peter Zijlstra
The following commit has been merged into the objtool/core branch of tip:

Commit-ID: 36d92e43d01cbeeec99abdf405362243051d6b3f
Gitweb:
https://git.kernel.org/tip/36d92e43d01cbeeec99abdf405362243051d6b3f
Author:Peter Zijlstra 
AuthorDate:Fri, 12 Feb 2021 09:13:00 +01:00
Committer: Ingo Molnar 
CommitterDate: Sat, 06 Mar 2021 12:44:23 +01:00

objtool,x86: More ModRM sugar

Better helpers to decode ModRM.
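
A standalone sketch of the field split these helpers build on (plain C,
outside objtool; the byte value is a made-up example):

	#include <stdio.h>

	int main(void)
	{
		unsigned char modrm = 0xe5;		/* binary 11 100 101 */
		unsigned char mod = (modrm >> 6) & 3;	/* 3: register operand */
		unsigned char reg = (modrm >> 3) & 7;	/* 4 */
		unsigned char rm  = modrm & 7;		/* 5 */

		printf("mod=%u reg=%u rm=%u\n", mod, reg, rm);
		return 0;
	}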

Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Ingo Molnar 
Acked-by: Josh Poimboeuf 
Link: https://lkml.kernel.org/r/YCZB/ljatfxqq...@hirez.programming.kicks-ass.net
---
 tools/objtool/arch/x86/decode.c | 28 +---
 1 file changed, 17 insertions(+), 11 deletions(-)

diff --git a/tools/objtool/arch/x86/decode.c b/tools/objtool/arch/x86/decode.c
index b42e5ec..431bafb 100644
--- a/tools/objtool/arch/x86/decode.c
+++ b/tools/objtool/arch/x86/decode.c
@@ -82,15 +82,21 @@ unsigned long arch_jump_destination(struct instruction 
*insn)
  * 01 |  [r/m + d8]|[S+d]|   [r/m + d8]  |
  * 10 |  [r/m + d32]   |[S+D]|   [r/m + d32] |
 * 11 |                              r/m                              |
- *
  */
+
+#define mod_is_mem()   (modrm_mod != 3)
+#define mod_is_reg()   (modrm_mod == 3)
+
 #define is_RIP()   ((modrm_rm & 7) == CFI_BP && modrm_mod == 0)
-#define have_SIB() ((modrm_rm & 7) == CFI_SP && modrm_mod != 3)
+#define have_SIB() ((modrm_rm & 7) == CFI_SP && mod_is_mem())
 
 #define rm_is(reg) (have_SIB() ? \
sib_base == (reg) && sib_index == CFI_SP : \
modrm_rm == (reg))
 
+#define rm_is_mem(reg) (mod_is_mem() && !is_RIP() && rm_is(reg))
+#define rm_is_reg(reg) (mod_is_reg() && modrm_rm == (reg))
+
 int arch_decode_instruction(const struct elf *elf, const struct section *sec,
unsigned long offset, unsigned int maxlen,
unsigned int *len, enum insn_type *type,
@@ -154,7 +160,7 @@ int arch_decode_instruction(const struct elf *elf, const 
struct section *sec,
 
case 0x1:
case 0x29:
-   if (rex_w && modrm_mod == 3 && modrm_rm == CFI_SP) {
+   if (rex_w && rm_is_reg(CFI_SP)) {
 
/* add/sub reg, %rsp */
ADD_OP(op) {
@@ -219,7 +225,7 @@ int arch_decode_instruction(const struct elf *elf, const 
struct section *sec,
break;
 
/* %rsp target only */
-   if (!(modrm_mod == 3 && modrm_rm == CFI_SP))
+   if (!rm_is_reg(CFI_SP))
break;
 
imm = insn.immediate.value;
@@ -272,7 +278,7 @@ int arch_decode_instruction(const struct elf *elf, const 
struct section *sec,
 
if (modrm_reg == CFI_SP) {
 
-   if (modrm_mod == 3) {
+   if (mod_is_reg()) {
/* mov %rsp, reg */
ADD_OP(op) {
op->src.type = OP_SRC_REG;
@@ -308,7 +314,7 @@ int arch_decode_instruction(const struct elf *elf, const 
struct section *sec,
break;
}
 
-   if (modrm_mod == 3 && modrm_rm == CFI_SP) {
+   if (rm_is_reg(CFI_SP)) {
 
/* mov reg, %rsp */
ADD_OP(op) {
@@ -325,7 +331,7 @@ int arch_decode_instruction(const struct elf *elf, const 
struct section *sec,
if (!rex_w)
break;
 
-   if ((modrm_mod == 1 || modrm_mod == 2) && modrm_rm == CFI_BP) {
+   if (rm_is_mem(CFI_BP)) {
 
/* mov reg, disp(%rbp) */
ADD_OP(op) {
@@ -338,7 +344,7 @@ int arch_decode_instruction(const struct elf *elf, const 
struct section *sec,
break;
}
 
-   if (modrm_mod != 3 && rm_is(CFI_SP)) {
+   if (rm_is_mem(CFI_SP)) {
 
/* mov reg, disp(%rsp) */
ADD_OP(op) {
@@ -357,7 +363,7 @@ int arch_decode_instruction(const struct elf *elf, const 
struct section *sec,
if (!rex_w)
break;
 
-   if ((modrm_mod == 1 || modrm_mod == 2) && modrm_rm == CFI_BP) {
+   if (rm_is_mem(CFI_BP)) {
 
/* mov disp(%rbp), reg */
ADD_OP(op) {
@@ -370,7 +376,7 @@ int arch_decode_instruction(const struct elf *elf, const 
struct section *sec,
break;
}
 
-   if (modrm_mod != 3 && rm_is(CFI_SP)) {
+   if (rm_is_mem(CFI_SP)) {
 
/* mov disp(%rsp), reg */
ADD_OP(op) {
@@ -386,7 +392,7 @@ int arch_decode_instruction(const struct elf *elf, const 
struct section *sec,
break;
 
case 0x8d:
-   if (modrm_mod == 3) {
+   if (mod_is_reg()) {
  

[tip: objtool/core] objtool: Collate parse_options() users

2021-03-06 Thread tip-bot2 for Peter Zijlstra
The following commit has been merged into the objtool/core branch of tip:

Commit-ID: a2f605f9ff57397d05a8e2f282b78a69f574d305
Gitweb:
https://git.kernel.org/tip/a2f605f9ff57397d05a8e2f282b78a69f574d305
Author:Peter Zijlstra 
AuthorDate:Fri, 26 Feb 2021 11:18:24 +01:00
Committer: Ingo Molnar 
CommitterDate: Sat, 06 Mar 2021 12:44:23 +01:00

objtool: Collate parse_options() users

Ensure there's a single place that parses check_options, in
preparation for extending where to get options from.

Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Ingo Molnar 
Acked-by: Josh Poimboeuf 
Link: https://lkml.kernel.org/r/20210226110004.193108...@infradead.org
---
 tools/objtool/builtin-check.c   | 14 +-
 tools/objtool/builtin-orc.c |  5 +
 tools/objtool/include/objtool/builtin.h |  2 ++
 3 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/tools/objtool/builtin-check.c b/tools/objtool/builtin-check.c
index 97f063d..0399752 100644
--- a/tools/objtool/builtin-check.c
+++ b/tools/objtool/builtin-check.c
@@ -42,17 +42,21 @@ const struct option check_options[] = {
OPT_END(),
 };
 
+int cmd_parse_options(int argc, const char **argv, const char * const usage[])
+{
+   argc = parse_options(argc, argv, check_options, usage, 0);
+   if (argc != 1)
+   usage_with_options(usage, check_options);
+   return argc;
+}
+
 int cmd_check(int argc, const char **argv)
 {
const char *objname;
struct objtool_file *file;
int ret;
 
-   argc = parse_options(argc, argv, check_options, check_usage, 0);
-
-   if (argc != 1)
-   usage_with_options(check_usage, check_options);
-
+   argc = cmd_parse_options(argc, argv, check_usage);
objname = argv[0];
 
file = objtool_open_read(objname);
diff --git a/tools/objtool/builtin-orc.c b/tools/objtool/builtin-orc.c
index 8273bbf..17f8b93 100644
--- a/tools/objtool/builtin-orc.c
+++ b/tools/objtool/builtin-orc.c
@@ -34,10 +34,7 @@ int cmd_orc(int argc, const char **argv)
struct objtool_file *file;
int ret;
 
-   argc = parse_options(argc, argv, check_options, orc_usage, 0);
-   if (argc != 1)
-   usage_with_options(orc_usage, check_options);
-
+   argc = cmd_parse_options(argc, argv, orc_usage);
objname = argv[0];
 
file = objtool_open_read(objname);
diff --git a/tools/objtool/include/objtool/builtin.h 
b/tools/objtool/include/objtool/builtin.h
index d019210..15ac0b7 100644
--- a/tools/objtool/include/objtool/builtin.h
+++ b/tools/objtool/include/objtool/builtin.h
@@ -11,6 +11,8 @@ extern const struct option check_options[];
 extern bool no_fp, no_unreachable, retpoline, module, backtrace, uaccess, 
stats,
 validate_dup, vmlinux, mcount, noinstr, backup;
 
+extern int cmd_parse_options(int argc, const char **argv, const char * const 
usage[]);
+
 extern int cmd_check(int argc, const char **argv);
 extern int cmd_orc(int argc, const char **argv);
 


[tip: objtool/core] objtool: Parse options from OBJTOOL_ARGS

2021-03-06 Thread tip-bot2 for Peter Zijlstra
The following commit has been merged into the objtool/core branch of tip:

Commit-ID: 900b4df347bbac4874149a226143a556909faba8
Gitweb:
https://git.kernel.org/tip/900b4df347bbac4874149a226143a556909faba8
Author:Peter Zijlstra 
AuthorDate:Fri, 26 Feb 2021 11:32:30 +01:00
Committer: Ingo Molnar 
CommitterDate: Sat, 06 Mar 2021 12:44:23 +01:00

objtool: Parse options from OBJTOOL_ARGS

Teach objtool to parse options from the OBJTOOL_ARGS environment
variable.

This enables things like:

  $ OBJTOOL_ARGS="--backup" make O=defconfig-build/ kernel/ponies.o

to obtain both defconfig-build/kernel/ponies.o{,.orig} and easily
inspect what objtool actually did.

Suggested-by: Borislav Petkov 
Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Ingo Molnar 
Acked-by: Josh Poimboeuf 
Link: https://lkml.kernel.org/r/20210226110004.252553...@infradead.org
---
 tools/objtool/builtin-check.c | 25 +
 1 file changed, 25 insertions(+)

diff --git a/tools/objtool/builtin-check.c b/tools/objtool/builtin-check.c
index 0399752..8b38b5d 100644
--- a/tools/objtool/builtin-check.c
+++ b/tools/objtool/builtin-check.c
@@ -15,6 +15,7 @@
 
 #include 
 #include 
+#include <stdlib.h>
 #include 
 #include 
 
@@ -26,6 +27,11 @@ static const char * const check_usage[] = {
NULL,
 };
 
+static const char * const env_usage[] = {
+   "OBJTOOL_ARGS=\"\"",
+   NULL,
+};
+
 const struct option check_options[] = {
	OPT_BOOLEAN('f', "no-fp", &no_fp, "Skip frame pointer validation"),
	OPT_BOOLEAN('u', "no-unreachable", &no_unreachable, "Skip 'unreachable instruction' warnings"),
@@ -44,6 +50,25 @@ const struct option check_options[] = {
 
 int cmd_parse_options(int argc, const char **argv, const char * const usage[])
 {
+   const char *envv[16] = { };
+   char *env;
+   int envc;
+
+   env = getenv("OBJTOOL_ARGS");
+   if (env) {
+   envv[0] = "OBJTOOL_ARGS";
+   for (envc = 1; envc < ARRAY_SIZE(envv); ) {
+   envv[envc++] = env;
+   env = strchr(env, ' ');
+   if (!env)
+   break;
+   *env = '\0';
+   env++;
+   }
+
+   parse_options(envc, envv, check_options, env_usage, 0);
+   }
+
argc = parse_options(argc, argv, check_options, usage, 0);
if (argc != 1)
usage_with_options(usage, check_options);
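
The environment splitting above, as a standalone sketch (plain C; buf
stands in for a mutable copy of getenv("OBJTOOL_ARGS")):

	#include <stdio.h>
	#include <string.h>

	int main(void)
	{
		char buf[] = "--backup --stats";
		const char *argv[16] = { "OBJTOOL_ARGS" };
		int argc = 1;
		char *p = buf;

		/* split on single spaces, as the loop above does */
		while (p && argc < 16) {
			argv[argc++] = p;
			p = strchr(p, ' ');
			if (p)
				*p++ = '\0';
		}

		for (int i = 0; i < argc; i++)
			printf("argv[%d] = %s\n", i, argv[i]);
		return 0;
	}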


[tip: objtool/core] objtool,x86: Rewrite LEA decode

2021-03-06 Thread tip-bot2 for Peter Zijlstra
The following commit has been merged into the objtool/core branch of tip:

Commit-ID: 2ee0c363492f1acc1082125218e6a80c0d7d502b
Gitweb:
https://git.kernel.org/tip/2ee0c363492f1acc1082125218e6a80c0d7d502b
Author:Peter Zijlstra 
AuthorDate:Tue, 09 Feb 2021 21:29:16 +01:00
Committer: Ingo Molnar 
CommitterDate: Sat, 06 Mar 2021 12:44:23 +01:00

objtool,x86: Rewrite LEA decode

Current LEA decoding is a bunch of special cases; properly decode the
instruction, with the exception of full SIB and RIP-relative modes.
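
The general form it now handles, for reference (LEA performs address
arithmetic only, no memory access):

	lea 0x10(%rsp), %rbp	# rbp = rsp + 0x10  ->  OP_SRC_ADD
	lea (%rsp), %rax	# rax = rsp         ->  OP_SRC_REG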

Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Ingo Molnar 
Acked-by: Josh Poimboeuf 
Tested-by: Nick Desaulniers 
Link: https://lkml.kernel.org/r/20210211173627.143250...@infradead.org
---
 tools/objtool/arch/x86/decode.c | 86 ++--
 1 file changed, 28 insertions(+), 58 deletions(-)

diff --git a/tools/objtool/arch/x86/decode.c b/tools/objtool/arch/x86/decode.c
index 549813c..d8f0138 100644
--- a/tools/objtool/arch/x86/decode.c
+++ b/tools/objtool/arch/x86/decode.c
@@ -91,9 +91,10 @@ int arch_decode_instruction(const struct elf *elf, const 
struct section *sec,
 {
struct insn insn;
int x86_64, sign;
-   unsigned char op1, op2, rex = 0, rex_b = 0, rex_r = 0, rex_w = 0,
- rex_x = 0, modrm = 0, modrm_mod = 0, modrm_rm = 0,
- modrm_reg = 0, sib = 0;
+   unsigned char op1, op2,
+ rex = 0, rex_b = 0, rex_r = 0, rex_w = 0, rex_x = 0,
+ modrm = 0, modrm_mod = 0, modrm_rm = 0, modrm_reg = 0,
+ sib = 0;
struct stack_op *op = NULL;
struct symbol *sym;
 
@@ -328,68 +329,37 @@ int arch_decode_instruction(const struct elf *elf, const 
struct section *sec,
break;
 
case 0x8d:
-   if (sib == 0x24 && rex_w && !rex_b && !rex_x) {
-
-   ADD_OP(op) {
-   if (!insn.displacement.value) {
-   /* lea (%rsp), reg */
-   op->src.type = OP_SRC_REG;
-   } else {
-   /* lea disp(%rsp), reg */
-   op->src.type = OP_SRC_ADD;
-   op->src.offset = 
insn.displacement.value;
-   }
-   op->src.reg = CFI_SP;
-   op->dest.type = OP_DEST_REG;
-   op->dest.reg = op_to_cfi_reg[modrm_reg][rex_r];
-   }
-
-   } else if (rex == 0x48 && modrm == 0x65) {
-
-   /* lea disp(%rbp), %rsp */
-   ADD_OP(op) {
-   op->src.type = OP_SRC_ADD;
-   op->src.reg = CFI_BP;
-   op->src.offset = insn.displacement.value;
-   op->dest.type = OP_DEST_REG;
-   op->dest.reg = CFI_SP;
-   }
+   if (modrm_mod == 3) {
+   WARN("invalid LEA encoding at %s:0x%lx", sec->name, 
offset);
+   break;
+   }
 
-   } else if (rex == 0x49 && modrm == 0x62 &&
-  insn.displacement.value == -8) {
+   /* skip non 64bit ops */
+   if (!rex_w)
+   break;
 
-   /*
-* lea -0x8(%r10), %rsp
-*
-* Restoring rsp back to its original value after a
-* stack realignment.
-*/
-   ADD_OP(op) {
-   op->src.type = OP_SRC_ADD;
-   op->src.reg = CFI_R10;
-   op->src.offset = -8;
-   op->dest.type = OP_DEST_REG;
-   op->dest.reg = CFI_SP;
-   }
+   /* skip nontrivial SIB */
+   if (modrm_rm == 4 && !(sib == 0x24 && rex_b == rex_x))
+   break;
 
-   } else if (rex == 0x49 && modrm == 0x65 &&
-  insn.displacement.value == -16) {
+   /* skip RIP relative displacement */
+   if (modrm_rm == 5 && modrm_mod == 0)
+   break;
 
-   /*
-* lea -0x10(%r13), %rsp
-*
-* Restoring rsp back to its original value after a
-* stack realignment.
-*/
-   ADD_OP(op) {
+   /* lea disp(%src), %dst */
+   ADD_OP(op) {
+   op->src.offset = insn.displacement.value;
+   if (!op->src.offset) {
+

[tip: objtool/core] objtool,x86: Rewrite LEAVE

2021-03-06 Thread tip-bot2 for Peter Zijlstra
The following commit has been merged into the objtool/core branch of tip:

Commit-ID: ffc7e74f36a2c7424da262a32a0bbe59669677ef
Gitweb:
https://git.kernel.org/tip/ffc7e74f36a2c7424da262a32a0bbe59669677ef
Author:Peter Zijlstra 
AuthorDate:Tue, 09 Feb 2021 21:41:13 +01:00
Committer: Ingo Molnar 
CommitterDate: Sat, 06 Mar 2021 12:44:23 +01:00

objtool,x86: Rewrite LEAVE

Since we can now have multiple stack-ops per instruction, we don't
need to special case LEAVE and can simply emit the composite
operations.
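
That is, LEAVE is now modeled as its architectural two-instruction
equivalent, one stack-op each:

	mov %rbp, %rsp		# stack-op 1: restore the stack pointer
	pop %rbp		# stack-op 2: restore the frame pointer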

Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Ingo Molnar 
Acked-by: Josh Poimboeuf 
Tested-by: Nick Desaulniers 
Link: https://lkml.kernel.org/r/20210211173627.253273...@infradead.org
---
 tools/objtool/arch/x86/decode.c  | 14 +++---
 tools/objtool/check.c| 24 ++--
 tools/objtool/include/objtool/arch.h |  1 -
 3 files changed, 13 insertions(+), 26 deletions(-)

diff --git a/tools/objtool/arch/x86/decode.c b/tools/objtool/arch/x86/decode.c
index d8f0138..47b9acf 100644
--- a/tools/objtool/arch/x86/decode.c
+++ b/tools/objtool/arch/x86/decode.c
@@ -446,9 +446,17 @@ int arch_decode_instruction(const struct elf *elf, const 
struct section *sec,
 * mov bp, sp
 * pop bp
 */
-   ADD_OP(op)
-   op->dest.type = OP_DEST_LEAVE;
-
+   ADD_OP(op) {
+   op->src.type = OP_SRC_REG;
+   op->src.reg = CFI_BP;
+   op->dest.type = OP_DEST_REG;
+   op->dest.reg = CFI_SP;
+   }
+   ADD_OP(op) {
+   op->src.type = OP_SRC_POP;
+   op->dest.type = OP_DEST_REG;
+   op->dest.reg = CFI_BP;
+   }
break;
 
case 0xe3:
diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 12b8f0f..a0f762a 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -2020,7 +2020,7 @@ static int update_cfi_state(struct instruction *insn,
}
 
		else if (op->src.reg == CFI_BP && op->dest.reg == CFI_SP &&
-			 cfa->base == CFI_BP) {
+			 (cfa->base == CFI_BP || cfa->base == cfi->drap_reg)) {
 
/*
 * mov %rbp, %rsp
@@ -2217,7 +2217,7 @@ static int update_cfi_state(struct instruction *insn,
cfa->offset = 0;
cfi->drap_offset = -1;
 
-			} else if (regs[op->dest.reg].offset == -cfi->stack_size) {
+			} else if (cfi->stack_size == -regs[op->dest.reg].offset) {
 
/* pop %reg */
restore_reg(cfi, op->dest.reg);
@@ -2358,26 +2358,6 @@ static int update_cfi_state(struct instruction *insn,
 
break;
 
-   case OP_DEST_LEAVE:
-   if ((!cfi->drap && cfa->base != CFI_BP) ||
-   (cfi->drap && cfa->base != cfi->drap_reg)) {
-   WARN_FUNC("leave instruction with modified stack frame",
- insn->sec, insn->offset);
-   return -1;
-   }
-
-   /* leave (mov %rbp, %rsp; pop %rbp) */
-
-   cfi->stack_size = -cfi->regs[CFI_BP].offset - 8;
-   restore_reg(cfi, CFI_BP);
-
-   if (!cfi->drap) {
-   cfa->base = CFI_SP;
-   cfa->offset -= 8;
-   }
-
-   break;
-
case OP_DEST_MEM:
if (op->src.type != OP_SRC_POP && op->src.type != OP_SRC_POPF) {
WARN_FUNC("unknown stack-related memory operation",
diff --git a/tools/objtool/include/objtool/arch.h 
b/tools/objtool/include/objtool/arch.h
index 6ff0685..ff21f38 100644
--- a/tools/objtool/include/objtool/arch.h
+++ b/tools/objtool/include/objtool/arch.h
@@ -35,7 +35,6 @@ enum op_dest_type {
OP_DEST_MEM,
OP_DEST_PUSH,
OP_DEST_PUSHF,
-   OP_DEST_LEAVE,
 };
 
 struct op_dest {


[tip: objtool/core] objtool,x86: Renumber CFI_reg

2021-03-06 Thread tip-bot2 for Peter Zijlstra
The following commit has been merged into the objtool/core branch of tip:

Commit-ID: d473b18b2ef62563fb874f9cae6e123f99129e3f
Gitweb:
https://git.kernel.org/tip/d473b18b2ef62563fb874f9cae6e123f99129e3f
Author:Peter Zijlstra 
AuthorDate:Tue, 09 Feb 2021 20:18:21 +01:00
Committer: Ingo Molnar 
CommitterDate: Sat, 06 Mar 2021 12:44:22 +01:00

objtool,x86: Renumber CFI_reg

Make them match the instruction encoding numbering.

Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Ingo Molnar 
Acked-by: Josh Poimboeuf 
Tested-by: Nick Desaulniers 
Link: https://lkml.kernel.org/r/20210211173627.033720...@infradead.org
---
 tools/objtool/arch/x86/include/arch/cfi_regs.h | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/tools/objtool/arch/x86/include/arch/cfi_regs.h 
b/tools/objtool/arch/x86/include/arch/cfi_regs.h
index 79bc517..0579d22 100644
--- a/tools/objtool/arch/x86/include/arch/cfi_regs.h
+++ b/tools/objtool/arch/x86/include/arch/cfi_regs.h
@@ -4,13 +4,13 @@
 #define _OBJTOOL_CFI_REGS_H
 
 #define CFI_AX 0
-#define CFI_DX 1
-#define CFI_CX 2
+#define CFI_CX 1
+#define CFI_DX 2
 #define CFI_BX 3
-#define CFI_SI 4
-#define CFI_DI 5
-#define CFI_BP 6
-#define CFI_SP 7
+#define CFI_SP 4
+#define CFI_BP 5
+#define CFI_SI 6
+#define CFI_DI 7
 #define CFI_R8 8
 #define CFI_R9 9
 #define CFI_R10	10


[tip: objtool/core] objtool: Allow UNWIND_HINT to suppress dodgy stack modifications

2021-03-06 Thread tip-bot2 for Peter Zijlstra
The following commit has been merged into the objtool/core branch of tip:

Commit-ID: d54dba41999498b38a40940e1123019d50b26496
Gitweb:
https://git.kernel.org/tip/d54dba41999498b38a40940e1123019d50b26496
Author:Peter Zijlstra 
AuthorDate:Thu, 11 Feb 2021 13:03:28 +01:00
Committer: Ingo Molnar 
CommitterDate: Sat, 06 Mar 2021 12:44:22 +01:00

objtool: Allow UNWIND_HINT to suppress dodgy stack modifications

rewind_stack_do_exit()
UNWIND_HINT_FUNC
/* Prevent any naive code from trying to unwind to our caller. */

xorl%ebp, %ebp
movqPER_CPU_VAR(cpu_current_top_of_stack), %rax
leaq-PTREGS_SIZE(%rax), %rsp
UNWIND_HINT_REGS

calldo_exit

Does unspeakable things to the stack, which objtool currently fails to
detect due to a limitation in instruction decoding. This will be
rectified after which the above will result in:

arch/x86/entry/entry_64.o: warning: objtool: .text+0xab: unsupported stack 
register modification

Allow the UNWIND_HINT on the next instruction to suppress this, it
will overwrite the state anyway.

Suggested-by: Josh Poimboeuf 
Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Ingo Molnar 
Acked-by: Josh Poimboeuf 
Tested-by: Nick Desaulniers 
Link: https://lkml.kernel.org/r/20210211173626.918498...@infradead.org
---
 tools/objtool/check.c | 15 +--
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 068cdb4..12b8f0f 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -1959,8 +1959,9 @@ static void restore_reg(struct cfi_state *cfi, unsigned 
char reg)
  *   41 5d pop%r13
  *   c3retq
  */
-static int update_cfi_state(struct instruction *insn, struct cfi_state *cfi,
-struct stack_op *op)
+static int update_cfi_state(struct instruction *insn,
+   struct instruction *next_insn,
+   struct cfi_state *cfi, struct stack_op *op)
 {
	struct cfi_reg *cfa = &cfi->cfa;
struct cfi_reg *regs = cfi->regs;
@@ -2161,7 +2162,7 @@ static int update_cfi_state(struct instruction *insn, 
struct cfi_state *cfi,
break;
}
 
-		if (op->dest.reg == cfi->cfa.base) {
+		if (op->dest.reg == cfi->cfa.base && !(next_insn && next_insn->hint)) {
			WARN_FUNC("unsupported stack register modification",
				  insn->sec, insn->offset);
return -1;
@@ -2433,13 +2434,15 @@ static int propagate_alt_cfi(struct objtool_file *file, 
struct instruction *insn
return 0;
 }
 
-static int handle_insn_ops(struct instruction *insn, struct insn_state *state)
+static int handle_insn_ops(struct instruction *insn,
+  struct instruction *next_insn,
+  struct insn_state *state)
 {
struct stack_op *op;
 
	list_for_each_entry(op, &insn->stack_ops, list) {
 
-		if (update_cfi_state(insn, &state->cfi, op))
+		if (update_cfi_state(insn, next_insn, &state->cfi, op))
return 1;
 
if (op->dest.type == OP_DEST_PUSHF) {
@@ -2719,7 +2722,7 @@ static int validate_branch(struct objtool_file *file, 
struct symbol *func,
return 0;
}
 
-	if (handle_insn_ops(insn, &state))
+	if (handle_insn_ops(insn, next_insn, &state))
return 1;
 
switch (insn->type) {


[tip: objtool/core] objtool,x86: Simplify register decode

2021-03-06 Thread tip-bot2 for Peter Zijlstra
The following commit has been merged into the objtool/core branch of tip:

Commit-ID: 16ef7f159c503c7befec7018ee0e82fdc311721e
Gitweb:
https://git.kernel.org/tip/16ef7f159c503c7befec7018ee0e82fdc311721e
Author:Peter Zijlstra 
AuthorDate:Tue, 09 Feb 2021 19:59:43 +01:00
Committer: Ingo Molnar 
CommitterDate: Sat, 06 Mar 2021 12:44:23 +01:00

objtool,x86: Simplify register decode

Since the CFI_reg number now matches the instruction encoding order, do
away with op_to_cfi_reg[] and use direct assignment.
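
With the renumbering in place, a REX-extended register is plain
arithmetic; from the hunk below:

	modrm_reg = X86_MODRM_REG(modrm) + 8*rex_r;	/* e.g. 5 + 8*1 = CFI_R13 */
	modrm_rm  = X86_MODRM_RM(modrm)  + 8*rex_b;	/* e.g. 4 + 8*0 = CFI_SP  */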

Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Ingo Molnar 
Acked-by: Josh Poimboeuf 
Tested-by: Nick Desaulniers 
Link: https://lkml.kernel.org/r/20210211173627.362004...@infradead.org
---
 tools/objtool/arch/x86/decode.c | 79 +++-
 1 file changed, 39 insertions(+), 40 deletions(-)

diff --git a/tools/objtool/arch/x86/decode.c b/tools/objtool/arch/x86/decode.c
index 47b9acf..5ce7dc4 100644
--- a/tools/objtool/arch/x86/decode.c
+++ b/tools/objtool/arch/x86/decode.c
@@ -17,17 +17,6 @@
 #include 
 #include 
 
-static unsigned char op_to_cfi_reg[][2] = {
-   {CFI_AX, CFI_R8},
-   {CFI_CX, CFI_R9},
-   {CFI_DX, CFI_R10},
-   {CFI_BX, CFI_R11},
-   {CFI_SP, CFI_R12},
-   {CFI_BP, CFI_R13},
-   {CFI_SI, CFI_R14},
-   {CFI_DI, CFI_R15},
-};
-
 static int is_x86_64(const struct elf *elf)
 {
switch (elf->ehdr.e_machine) {
@@ -94,7 +83,7 @@ int arch_decode_instruction(const struct elf *elf, const 
struct section *sec,
unsigned char op1, op2,
  rex = 0, rex_b = 0, rex_r = 0, rex_w = 0, rex_x = 0,
  modrm = 0, modrm_mod = 0, modrm_rm = 0, modrm_reg = 0,
- sib = 0;
+		      sib = 0 /* , sib_scale = 0, sib_index = 0, sib_base = 0 */;
struct stack_op *op = NULL;
struct symbol *sym;
 
@@ -130,23 +119,29 @@ int arch_decode_instruction(const struct elf *elf, const 
struct section *sec,
if (insn.modrm.nbytes) {
modrm = insn.modrm.bytes[0];
modrm_mod = X86_MODRM_MOD(modrm);
-   modrm_reg = X86_MODRM_REG(modrm);
-   modrm_rm = X86_MODRM_RM(modrm);
+   modrm_reg = X86_MODRM_REG(modrm) + 8*rex_r;
+   modrm_rm  = X86_MODRM_RM(modrm)  + 8*rex_b;
}
 
-   if (insn.sib.nbytes)
+   if (insn.sib.nbytes) {
sib = insn.sib.bytes[0];
+   /*
+   sib_scale = X86_SIB_SCALE(sib);
+   sib_index = X86_SIB_INDEX(sib) + 8*rex_x;
+   sib_base  = X86_SIB_BASE(sib)  + 8*rex_b;
+*/
+   }
 
switch (op1) {
 
case 0x1:
case 0x29:
-   if (rex_w && !rex_b && modrm_mod == 3 && modrm_rm == 4) {
+   if (rex_w && modrm_mod == 3 && modrm_rm == CFI_SP) {
 
/* add/sub reg, %rsp */
ADD_OP(op) {
op->src.type = OP_SRC_ADD;
-   op->src.reg = op_to_cfi_reg[modrm_reg][rex_r];
+   op->src.reg = modrm_reg;
op->dest.type = OP_DEST_REG;
op->dest.reg = CFI_SP;
}
@@ -158,7 +153,7 @@ int arch_decode_instruction(const struct elf *elf, const 
struct section *sec,
/* push reg */
ADD_OP(op) {
op->src.type = OP_SRC_REG;
-   op->src.reg = op_to_cfi_reg[op1 & 0x7][rex_b];
+   op->src.reg = (op1 & 0x7) + 8*rex_b;
op->dest.type = OP_DEST_PUSH;
}
 
@@ -170,7 +165,7 @@ int arch_decode_instruction(const struct elf *elf, const 
struct section *sec,
ADD_OP(op) {
op->src.type = OP_SRC_POP;
op->dest.type = OP_DEST_REG;
-   op->dest.reg = op_to_cfi_reg[op1 & 0x7][rex_b];
+   op->dest.reg = (op1 & 0x7) + 8*rex_b;
}
 
break;
@@ -223,7 +218,7 @@ int arch_decode_instruction(const struct elf *elf, const 
struct section *sec,
break;
 
case 0x89:
-   if (rex_w && !rex_r && modrm_reg == 4) {
+   if (rex_w && modrm_reg == CFI_SP) {
 
if (modrm_mod == 3) {
/* mov %rsp, reg */
@@ -231,17 +226,17 @@ int arch_decode_instruction(const struct elf *elf, const 
struct section *sec,
op->src.type = OP_SRC_REG;
op->src.reg = CFI_SP;
op->dest.type = OP_DEST_REG;
-   op->dest.reg = 
op_to_cfi_reg[modrm_rm][rex_b];
+   op->dest.reg = modrm_rm;
}

[tip: objtool/core] objtool,x86: Rewrite ADD/SUB/AND

2021-03-06 Thread tip-bot2 for Peter Zijlstra
The following commit has been merged into the objtool/core branch of tip:

Commit-ID: 961d83b9073b1ce5834af50d3c69e5e2461c6fd3
Gitweb:
https://git.kernel.org/tip/961d83b9073b1ce5834af50d3c69e5e2461c6fd3
Author:Peter Zijlstra 
AuthorDate:Wed, 10 Feb 2021 14:11:30 +01:00
Committer: Ingo Molnar 
CommitterDate: Sat, 06 Mar 2021 12:44:23 +01:00

objtool,x86: Rewrite ADD/SUB/AND

Support sign extending and imm8 forms.
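
The sign extension is done with a shift pair; a worked example for the
imm8 case (value chosen for illustration):

	u64 imm = 0xf0;		/* raw immediate byte */
	imm <<= 56;		/* 0xf000000000000000 */
	imm = (s64)imm >> 56;	/* arithmetic shift back: -16,
				   so sub $-0x10 == add $0x10 */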

Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Ingo Molnar 
Acked-by: Josh Poimboeuf 
Tested-by: Nick Desaulniers 
Link: https://lkml.kernel.org/r/20210211173627.588366...@infradead.org
---
 tools/objtool/arch/x86/decode.c | 70 +++-
 1 file changed, 51 insertions(+), 19 deletions(-)

diff --git a/tools/objtool/arch/x86/decode.c b/tools/objtool/arch/x86/decode.c
index 78ae5be..b42e5ec 100644
--- a/tools/objtool/arch/x86/decode.c
+++ b/tools/objtool/arch/x86/decode.c
@@ -98,13 +98,14 @@ int arch_decode_instruction(const struct elf *elf, const 
struct section *sec,
struct list_head *ops_list)
 {
struct insn insn;
-   int x86_64, sign;
+   int x86_64;
unsigned char op1, op2,
  rex = 0, rex_b = 0, rex_r = 0, rex_w = 0, rex_x = 0,
  modrm = 0, modrm_mod = 0, modrm_rm = 0, modrm_reg = 0,
  sib = 0, /* sib_scale = 0, */ sib_index = 0, sib_base = 0;
struct stack_op *op = NULL;
struct symbol *sym;
+   u64 imm;
 
x86_64 = is_x86_64(elf);
if (x86_64 == -1)
@@ -200,12 +201,54 @@ int arch_decode_instruction(const struct elf *elf, const 
struct section *sec,
*type = INSN_JUMP_CONDITIONAL;
break;
 
-   case 0x81:
-   case 0x83:
-   if (rex != 0x48)
+   case 0x80 ... 0x83:
+	/*
+	 * 1000 00sw : mod OP r/m : immediate
+	 *
+	 * s - sign extend immediate
+	 * w - imm8 / imm32
+	 *
+	 * OP: 000 ADD	100 AND
+	 *     001 OR	101 SUB
+	 *     010 ADC	110 XOR
+	 *     011 SBB	111 CMP
+	 */
+
+   /* 64bit only */
+   if (!rex_w)
break;
 
-   if (modrm == 0xe4) {
+   /* %rsp target only */
+   if (!(modrm_mod == 3 && modrm_rm == CFI_SP))
+   break;
+
+   imm = insn.immediate.value;
+   if (op1 & 2) { /* sign extend */
+   if (op1 & 1) { /* imm32 */
+   imm <<= 32;
+   imm = (s64)imm >> 32;
+   } else { /* imm8 */
+   imm <<= 56;
+   imm = (s64)imm >> 56;
+   }
+   }
+
+   switch (modrm_reg & 7) {
+   case 5:
+   imm = -imm;
+   /* fallthrough */
+   case 0:
+   /* add/sub imm, %rsp */
+   ADD_OP(op) {
+   op->src.type = OP_SRC_ADD;
+   op->src.reg = CFI_SP;
+   op->src.offset = imm;
+   op->dest.type = OP_DEST_REG;
+   op->dest.reg = CFI_SP;
+   }
+   break;
+
+   case 4:
/* and imm, %rsp */
ADD_OP(op) {
op->src.type = OP_SRC_AND;
@@ -215,23 +258,12 @@ int arch_decode_instruction(const struct elf *elf, const 
struct section *sec,
op->dest.reg = CFI_SP;
}
break;
-   }
 
-   if (modrm == 0xc4)
-   sign = 1;
-   else if (modrm == 0xec)
-   sign = -1;
-   else
+   default:
+   /* WARN ? */
break;
-
-   /* add/sub imm, %rsp */
-   ADD_OP(op) {
-   op->src.type = OP_SRC_ADD;
-   op->src.reg = CFI_SP;
-   op->src.offset = insn.immediate.value * sign;
-   op->dest.type = OP_DEST_REG;
-   op->dest.reg = CFI_SP;
}
+
break;
 
case 0x89:


[tip: objtool/core] objtool,x86: Support %riz encodings

2021-03-06 Thread tip-bot2 for Peter Zijlstra
The following commit has been merged into the objtool/core branch of tip:

Commit-ID: 78df6245c3c82484200b9f8e306dc86fb19e9c02
Gitweb:
https://git.kernel.org/tip/78df6245c3c82484200b9f8e306dc86fb19e9c02
Author:Peter Zijlstra 
AuthorDate:Wed, 10 Feb 2021 11:47:35 +01:00
Committer: Ingo Molnar 
CommitterDate: Sat, 06 Mar 2021 12:44:23 +01:00

objtool,x86: Support %riz encodings

When there's a SIB byte, the register otherwise denoted by r/m will
then be denoted by SIB.base, and REX.b will now extend this. SIB.index == SP
is magic and denotes an index value of zero.

This means that there's a bunch of alternative (longer) encodings for
the same thing. Eg. 'ModRM.mod != 3, ModRM.r/m = AX' can be encoded as
'ModRM.mod != 3, ModRM.r/m = SP, SIB.base = AX, SIB.index = SP' which is 
actually 4
different encodings because the value of SIB.scale is irrelevant,
giving rise to 5 different but equal encodings.

Support these encodings and clean up the SIB handling in general.
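
A worked instance of that redundancy (byte values are an illustration
worked out from the encoding rules, not taken from the commit):
'mov %rsp, (%rax)' can be encoded five equal ways:

	48 89 20	ModRM only: mod=00 reg=SP rm=AX
	48 89 24 20	SIB: base=AX index=SP, scale bits 00
	48 89 24 60	SIB: base=AX index=SP, scale bits 01
	48 89 24 a0	SIB: base=AX index=SP, scale bits 10
	48 89 24 e0	SIB: base=AX index=SP, scale bits 11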

Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Ingo Molnar 
Acked-by: Josh Poimboeuf 
Tested-by: Nick Desaulniers 
Link: https://lkml.kernel.org/r/20210211173627.472967...@infradead.org
---
 tools/objtool/arch/x86/decode.c | 67 ++--
 1 file changed, 48 insertions(+), 19 deletions(-)

diff --git a/tools/objtool/arch/x86/decode.c b/tools/objtool/arch/x86/decode.c
index 5ce7dc4..78ae5be 100644
--- a/tools/objtool/arch/x86/decode.c
+++ b/tools/objtool/arch/x86/decode.c
@@ -72,6 +72,25 @@ unsigned long arch_jump_destination(struct instruction *insn)
return -1; \
	else for (list_add_tail(&op->list, ops_list); op; op = NULL)
 
+/*
+ * Helpers to decode ModRM/SIB:
+ *
+ * r/m| AX  CX  DX  BX |  SP |  BP |  SI  DI |
+ *| R8  R9 R10 R11 | R12 | R13 | R14 R15 |
+ * Mod++-+-+-+
+ * 00 |[r/m]   |[SIB]|[IP+]|  [r/m]  |
+ * 01 |  [r/m + d8]|[S+d]|   [r/m + d8]  |
+ * 10 |  [r/m + d32]   |[S+D]|   [r/m + d32] |
+ * 11 |                              r/m                              |
+ *
+ */
+#define is_RIP()   ((modrm_rm & 7) == CFI_BP && modrm_mod == 0)
+#define have_SIB() ((modrm_rm & 7) == CFI_SP && modrm_mod != 3)
+
+#define rm_is(reg) (have_SIB() ? \
+   sib_base == (reg) && sib_index == CFI_SP : \
+   modrm_rm == (reg))
+
 int arch_decode_instruction(const struct elf *elf, const struct section *sec,
unsigned long offset, unsigned int maxlen,
unsigned int *len, enum insn_type *type,
@@ -83,7 +102,7 @@ int arch_decode_instruction(const struct elf *elf, const 
struct section *sec,
unsigned char op1, op2,
  rex = 0, rex_b = 0, rex_r = 0, rex_w = 0, rex_x = 0,
  modrm = 0, modrm_mod = 0, modrm_rm = 0, modrm_reg = 0,
-		      sib = 0 /* , sib_scale = 0, sib_index = 0, sib_base = 0 */;
+ sib = 0, /* sib_scale = 0, */ sib_index = 0, sib_base = 0;
struct stack_op *op = NULL;
struct symbol *sym;
 
@@ -125,11 +144,9 @@ int arch_decode_instruction(const struct elf *elf, const 
struct section *sec,
 
if (insn.sib.nbytes) {
sib = insn.sib.bytes[0];
-   /*
-   sib_scale = X86_SIB_SCALE(sib);
+   /* sib_scale = X86_SIB_SCALE(sib); */
sib_index = X86_SIB_INDEX(sib) + 8*rex_x;
sib_base  = X86_SIB_BASE(sib)  + 8*rex_b;
-*/
}
 
switch (op1) {
@@ -218,7 +235,10 @@ int arch_decode_instruction(const struct elf *elf, const 
struct section *sec,
break;
 
case 0x89:
-   if (rex_w && modrm_reg == CFI_SP) {
+   if (!rex_w)
+   break;
+
+   if (modrm_reg == CFI_SP) {
 
if (modrm_mod == 3) {
/* mov %rsp, reg */
@@ -231,14 +251,17 @@ int arch_decode_instruction(const struct elf *elf, const 
struct section *sec,
break;
 
} else {
-   /* skip nontrivial SIB */
-   if ((modrm_rm & 7) == 4 && !(sib == 0x24 && 
rex_b == rex_x))
-   break;
-
/* skip RIP relative displacement */
-   if ((modrm_rm & 7) == 5 && modrm_mod == 0)
+   if (is_RIP())
break;
 
+   /* skip nontrivial SIB */
+   if (have_SIB()) {
+   modrm_rm = sib_base;
+   if (sib_index != CFI_SP)
+   break;
+   }
+
/* mov %rsp, disp(%reg) */
ADD_OP(op) {
   

[tip: objtool/core] objtool: Add --backup

2021-03-06 Thread tip-bot2 for Peter Zijlstra
The following commit has been merged into the objtool/core branch of tip:

Commit-ID: 8ad15c6900840e8a2163012f4581c52127622e02
Gitweb:
https://git.kernel.org/tip/8ad15c6900840e8a2163012f4581c52127622e02
Author:Peter Zijlstra 
AuthorDate:Fri, 26 Feb 2021 10:59:59 +01:00
Committer: Ingo Molnar 
CommitterDate: Sat, 06 Mar 2021 12:44:23 +01:00

objtool: Add --backup

Teach objtool to write backup files, such that it becomes easier to
see what objtool did to the object file.

Backup files will be ${name}.orig.
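
Usage sketch (the object path reuses the OBJTOOL_ARGS example from this
series; output names are what the patch produces):

	$ ./tools/objtool/objtool check --backup defconfig-build/kernel/ponies.o
	$ ls defconfig-build/kernel/ponies.o*
	defconfig-build/kernel/ponies.o
	defconfig-build/kernel/ponies.o.orig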

Suggested-by: Borislav Petkov 
Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Ingo Molnar 
Acked-by: Borislav Petkov 
Acked-by: Josh Poimboeuf 
Link: https://lkml.kernel.org/r/yd4obt3aoxpwl...@hirez.programming.kicks-ass.net
---
 tools/objtool/builtin-check.c   |  4 +-
 tools/objtool/include/objtool/builtin.h |  3 +-
 tools/objtool/objtool.c | 64 -
 3 files changed, 69 insertions(+), 2 deletions(-)

diff --git a/tools/objtool/builtin-check.c b/tools/objtool/builtin-check.c
index c3a85d8..97f063d 100644
--- a/tools/objtool/builtin-check.c
+++ b/tools/objtool/builtin-check.c
@@ -18,7 +18,8 @@
 #include 
 #include 
 
-bool no_fp, no_unreachable, retpoline, module, backtrace, uaccess, stats, 
validate_dup, vmlinux, mcount, noinstr;
+bool no_fp, no_unreachable, retpoline, module, backtrace, uaccess, stats,
+ validate_dup, vmlinux, mcount, noinstr, backup;
 
 static const char * const check_usage[] = {
"objtool check [] file.o",
@@ -37,6 +38,7 @@ const struct option check_options[] = {
	OPT_BOOLEAN('n', "noinstr", &noinstr, "noinstr validation for vmlinux.o"),
	OPT_BOOLEAN('l', "vmlinux", &vmlinux, "vmlinux.o validation"),
	OPT_BOOLEAN('M', "mcount", &mcount, "generate __mcount_loc"),
+	OPT_BOOLEAN('B', "backup", &backup, "create .orig files before modification"),
OPT_END(),
 };
 
diff --git a/tools/objtool/include/objtool/builtin.h 
b/tools/objtool/include/objtool/builtin.h
index 2502bb2..d019210 100644
--- a/tools/objtool/include/objtool/builtin.h
+++ b/tools/objtool/include/objtool/builtin.h
@@ -8,7 +8,8 @@
 #include 
 
 extern const struct option check_options[];
-extern bool no_fp, no_unreachable, retpoline, module, backtrace, uaccess, 
stats, validate_dup, vmlinux, mcount, noinstr;
+extern bool no_fp, no_unreachable, retpoline, module, backtrace, uaccess, 
stats,
+validate_dup, vmlinux, mcount, noinstr, backup;
 
 extern int cmd_check(int argc, const char **argv);
 extern int cmd_orc(int argc, const char **argv);
diff --git a/tools/objtool/objtool.c b/tools/objtool/objtool.c
index 7b97ce4..43c1836 100644
--- a/tools/objtool/objtool.c
+++ b/tools/objtool/objtool.c
@@ -17,6 +17,7 @@
 #include 
 #include 
 #include 
+#include <fcntl.h>
 #include 
 #include 
 #include 
@@ -44,6 +45,64 @@ bool help;
 const char *objname;
 static struct objtool_file file;
 
+static bool objtool_create_backup(const char *_objname)
+{
+   int len = strlen(_objname);
+   char *buf, *base, *name = malloc(len+6);
+   int s, d, l, t;
+
+   if (!name) {
+   perror("failed backup name malloc");
+   return false;
+   }
+
+   strcpy(name, _objname);
+   strcpy(name + len, ".orig");
+
+   d = open(name, O_CREAT|O_WRONLY|O_TRUNC, 0644);
+   if (d < 0) {
+   perror("failed to create backup file");
+   return false;
+   }
+
+   s = open(_objname, O_RDONLY);
+   if (s < 0) {
+   perror("failed to open orig file");
+   return false;
+   }
+
+   buf = malloc(4096);
+   if (!buf) {
+   perror("failed backup data malloc");
+   return false;
+   }
+
+   while ((l = read(s, buf, 4096)) > 0) {
+   base = buf;
+   do {
+   t = write(d, base, l);
+   if (t < 0) {
+   perror("failed backup write");
+   return false;
+   }
+   base += t;
+   l -= t;
+   } while (l);
+   }
+
+   if (l < 0) {
+   perror("failed backup read");
+   return false;
+   }
+
+   free(name);
+   free(buf);
+   close(d);
+   close(s);
+
+   return true;
+}
+
 struct objtool_file *objtool_open_read(const char *_objname)
 {
if (objname) {
@@ -59,6 +118,11 @@ struct objtool_file *objtool_open_read(const char *_objname)
if (!file.elf)
return NULL;
 
+   if (backup && !objtool_create_backup(objname)) {
+   WARN("can't create backup file");
+   return NULL;
+   }
+
	INIT_LIST_HEAD(&file.insn_list);
	hash_init(file.insn_hash);
	INIT_LIST_HEAD(&file.static_call_list);


[PATCH] scsi:ufs: remove duplicate include in ufshcd

2021-03-06 Thread menglong8 . dong
From: Zhang Yunkai 

'blkdev.h' is included twice in 'ufshcd.c'; it is already included at
line 18.

Signed-off-by: Zhang Yunkai 
---
 drivers/scsi/ufs/ufshcd.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 77161750c9fb..9a564b6fd092 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -24,7 +24,6 @@
 #include "ufs_bsg.h"
 #include "ufshcd-crypto.h"
 #include 
-#include <linux/blkdev.h>
 
 #define CREATE_TRACE_POINTS
 #include 
-- 
2.25.1



[tip: sched/core] sched: Simplify set_affinity_pending refcounts

2021-03-06 Thread tip-bot2 for Peter Zijlstra
The following commit has been merged into the sched/core branch of tip:

Commit-ID: 50caf9c14b1498c90cf808dbba2ca29bd32ccba4
Gitweb:
https://git.kernel.org/tip/50caf9c14b1498c90cf808dbba2ca29bd32ccba4
Author:Peter Zijlstra 
AuthorDate:Wed, 24 Feb 2021 11:42:08 +01:00
Committer: Ingo Molnar 
CommitterDate: Sat, 06 Mar 2021 12:40:21 +01:00

sched: Simplify set_affinity_pending refcounts

Now that we have set_affinity_pending::stop_pending to indicate if a
stopper is in progress, and we have the guarantee that if that stopper
exists, it will (eventually) complete our @pending, we can simplify the
refcount scheme by no longer counting the stopper thread.
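
In the simplified scheme only wait_for_completion() callers hold
references; a sketch pieced together from the hunks below:

	refcount_set(&my_pending.refs, 1);	/* the installing waiter */
	...
	wait_for_completion(&pending->done);	/* completed by the stopper */
	if (refcount_dec_and_test(&pending->refs))
		wake_up_var(&pending->refs);	/* no UaF, just an address */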

Fixes: 6d337eab041d ("sched: Fix migrate_disable() vs set_cpus_allowed_ptr()")
Cc: sta...@kernel.org
Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Ingo Molnar 
Reviewed-by: Valentin Schneider 
Link: https://lkml.kernel.org/r/20210224131355.724130...@infradead.org
---
 kernel/sched/core.c | 32 
 1 file changed, 20 insertions(+), 12 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 4e4d100..9819121 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1862,6 +1862,10 @@ struct migration_arg {
struct set_affinity_pending *pending;
 };
 
+/*
+ * @refs: number of wait_for_completion()
+ * @stop_pending: is @stop_work in use
+ */
 struct set_affinity_pending {
refcount_t  refs;
	unsigned int		stop_pending;
@@ -1997,10 +2001,6 @@ out:
if (complete)
		complete_all(&pending->done);
 
-	/* For pending->{arg,stop_work} */
-	if (pending && refcount_dec_and_test(&pending->refs))
-		wake_up_var(&pending->refs);
-
return 0;
 }
 
@@ -2199,12 +2199,16 @@ static int affine_move_task(struct rq *rq, struct 
task_struct *p, struct rq_flag
push_task = get_task_struct(p);
}
 
+   /*
+* If there are pending waiters, but no pending stop_work,
+* then complete now.
+*/
pending = p->migration_pending;
-   if (pending) {
-		refcount_inc(&pending->refs);
+   if (pending && !pending->stop_pending) {
p->migration_pending = NULL;
complete = true;
}
+
task_rq_unlock(rq, p, rf);
 
if (push_task) {
@@ -2213,7 +2217,7 @@ static int affine_move_task(struct rq *rq, struct 
task_struct *p, struct rq_flag
}
 
if (complete)
-   goto do_complete;
+		complete_all(&pending->done);
 
return 0;
}
@@ -2264,9 +2268,9 @@ static int affine_move_task(struct rq *rq, struct 
task_struct *p, struct rq_flag
if (!stop_pending)
pending->stop_pending = true;
 
-	refcount_inc(&pending->refs); /* pending->{arg,stop_work} */
if (flags & SCA_MIGRATE_ENABLE)
p->migration_flags &= ~MDF_PUSH;
+
task_rq_unlock(rq, p, rf);
 
if (!stop_pending) {
@@ -2282,12 +2286,13 @@ static int affine_move_task(struct rq *rq, struct 
task_struct *p, struct rq_flag
if (task_on_rq_queued(p))
rq = move_queued_task(rq, rf, p, dest_cpu);
 
-   p->migration_pending = NULL;
-   complete = true;
+   if (!pending->stop_pending) {
+   p->migration_pending = NULL;
+   complete = true;
+   }
}
task_rq_unlock(rq, p, rf);
 
-do_complete:
if (complete)
		complete_all(&pending->done);
}
@@ -2295,7 +2300,7 @@ do_complete:
	wait_for_completion(&pending->done);
 
	if (refcount_dec_and_test(&pending->refs))
-		wake_up_var(&pending->refs);
+		wake_up_var(&pending->refs); /* No UaF, just an address */
 
/*
 * Block the original owner of  until all subsequent callers
@@ -2303,6 +2308,9 @@ do_complete:
 */
	wait_var_event(&my_pending.refs, !refcount_read(&my_pending.refs));
 
+   /* ARGH */
+   WARN_ON_ONCE(my_pending.stop_pending);
+
return 0;
 }
 


[tip: sched/core] sched: Collate affine_move_task() stoppers

2021-03-06 Thread tip-bot2 for Peter Zijlstra
The following commit has been merged into the sched/core branch of tip:

Commit-ID: 58b1a45086b5f80f2b2842aa7ed0da51a64a302b
Gitweb:
https://git.kernel.org/tip/58b1a45086b5f80f2b2842aa7ed0da51a64a302b
Author:Peter Zijlstra 
AuthorDate:Wed, 24 Feb 2021 11:15:23 +01:00
Committer: Ingo Molnar 
CommitterDate: Sat, 06 Mar 2021 12:40:21 +01:00

sched: Collate affine_move_task() stoppers

The SCA_MIGRATE_ENABLE and task_running() cases are almost identical,
collapse them to avoid further duplication.

Fixes: 6d337eab041d ("sched: Fix migrate_disable() vs set_cpus_allowed_ptr()")
Cc: sta...@kernel.org
Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Ingo Molnar 
Reviewed-by: Valentin Schneider 
Link: https://lkml.kernel.org/r/20210224131355.500108...@infradead.org
---
 kernel/sched/core.c | 23 ---
 1 file changed, 8 insertions(+), 15 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 088e8f4..84b657f 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2239,30 +2239,23 @@ static int affine_move_task(struct rq *rq, struct 
task_struct *p, struct rq_flag
return -EINVAL;
}
 
-   if (flags & SCA_MIGRATE_ENABLE) {
-
-		refcount_inc(&pending->refs); /* pending->{arg,stop_work} */
-   p->migration_flags &= ~MDF_PUSH;
-   task_rq_unlock(rq, p, rf);
-
-   stop_one_cpu_nowait(cpu_of(rq), migration_cpu_stop,
-				    &pending->arg, &pending->stop_work);
-
-   return 0;
-   }
-
if (task_running(rq, p) || p->state == TASK_WAKING) {
/*
-* Lessen races (and headaches) by delegating
-* is_migration_disabled(p) checks to the stopper, which will
-* run on the same CPU as said p.
+* MIGRATE_ENABLE gets here because 'p == current', but for
+* anything else we cannot do is_migration_disabled(), punt
+* and have the stopper function handle it all race-free.
 */
+
		refcount_inc(&pending->refs); /* pending->{arg,stop_work} */
+   if (flags & SCA_MIGRATE_ENABLE)
+   p->migration_flags &= ~MDF_PUSH;
task_rq_unlock(rq, p, rf);
 
stop_one_cpu_nowait(cpu_of(rq), migration_cpu_stop,
			    &pending->arg, &pending->stop_work);
 
+   if (flags & SCA_MIGRATE_ENABLE)
+   return 0;
} else {
 
if (!is_migration_disabled(p)) {


[tip: sched/core] sched: Simplify migration_cpu_stop()

2021-03-06 Thread tip-bot2 for Peter Zijlstra
The following commit has been merged into the sched/core branch of tip:

Commit-ID: c20cf065d4a619d394d23290093b1002e27dff86
Gitweb:
https://git.kernel.org/tip/c20cf065d4a619d394d23290093b1002e27dff86
Author:Peter Zijlstra 
AuthorDate:Wed, 24 Feb 2021 11:50:39 +01:00
Committer: Ingo Molnar 
CommitterDate: Sat, 06 Mar 2021 12:40:20 +01:00

sched: Simplify migration_cpu_stop()

When affine_move_task() issues a migration_cpu_stop(), the purpose of
that function is to complete that @pending, not any random other
p->migration_pending that might have gotten installed since.

This realization much simplifies migration_cpu_stop() and allows
further necessary steps to fix all this as it provides the guarantee
that @pending's stopper will complete @pending (and not some random
other @pending).

Fixes: 6d337eab041d ("sched: Fix migrate_disable() vs set_cpus_allowed_ptr()")
Cc: sta...@kernel.org
Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Ingo Molnar 
Reviewed-by: Valentin Schneider 
Link: https://lkml.kernel.org/r/20210224131355.430014...@infradead.org
---
 kernel/sched/core.c | 56 ++--
 1 file changed, 8 insertions(+), 48 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 79ddba5..088e8f4 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1898,8 +1898,8 @@ static struct rq *__migrate_task(struct rq *rq, struct 
rq_flags *rf,
  */
 static int migration_cpu_stop(void *data)
 {
-   struct set_affinity_pending *pending;
struct migration_arg *arg = data;
+   struct set_affinity_pending *pending = arg->pending;
struct task_struct *p = arg->task;
int dest_cpu = arg->dest_cpu;
struct rq *rq = this_rq();
@@ -1921,25 +1921,6 @@ static int migration_cpu_stop(void *data)
	raw_spin_lock(&p->pi_lock);
	rq_lock(rq, &rf);
 
-   pending = p->migration_pending;
-   if (pending && !arg->pending) {
-   /*
-* This happens from sched_exec() and migrate_task_to(),
-* neither of them care about pending and just want a task to
-* maybe move about.
-*
-* Even if there is a pending, we can ignore it, since
-* affine_move_task() will have it's own stop_work's in flight
-* which will manage the completion.
-*
-* Notably, pending doesn't need to match arg->pending. This can
-* happen when tripple concurrent affine_move_task() first sets
-* pending, then clears pending and eventually sets another
-* pending.
-*/
-   pending = NULL;
-   }
-
/*
 * If task_rq(p) != rq, it cannot be migrated here, because we're
 * holding rq->lock, if p->on_rq == 0 it cannot get enqueued because
@@ -1950,31 +1931,20 @@ static int migration_cpu_stop(void *data)
goto out;
 
if (pending) {
-   p->migration_pending = NULL;
+   if (p->migration_pending == pending)
+   p->migration_pending = NULL;
complete = true;
}
 
-   /* migrate_enable() --  we must not race against SCA */
-   if (dest_cpu < 0) {
-   /*
-* When this was migrate_enable() but we no longer
-* have a @pending, a concurrent SCA 'fixed' things
-* and we should be valid again. Nothing to do.
-*/
-   if (!pending) {
-			WARN_ON_ONCE(!cpumask_test_cpu(task_cpu(p), &p->cpus_mask));
-   goto out;
-   }
-
+   if (dest_cpu < 0)
			dest_cpu = cpumask_any_distribute(&p->cpus_mask);
-   }
 
if (task_on_rq_queued(p))
		rq = __migrate_task(rq, &rf, p, dest_cpu);
else
p->wake_cpu = dest_cpu;
 
-   } else if (dest_cpu < 0 || pending) {
+   } else if (pending) {
/*
 * This happens when we get migrated between migrate_enable()'s
 * preempt_enable() and scheduling the stopper task. At that
@@ -1989,23 +1959,14 @@ static int migration_cpu_stop(void *data)
 * ->pi_lock, so the allowed mask is stable - if it got
 * somewhere allowed, we're done.
 */
-   if (pending && cpumask_test_cpu(task_cpu(p), p->cpus_ptr)) {
-   p->migration_pending = NULL;
+   if (cpumask_test_cpu(task_cpu(p), p->cpus_ptr)) {
+   if (p->migration_pending == pending)
+   p->migration_pending = NULL;
complete = true;
 

[tip: sched/core] sched: Optimize migration_cpu_stop()

2021-03-06 Thread tip-bot2 for Peter Zijlstra
The following commit has been merged into the sched/core branch of tip:

Commit-ID: 3f1bc119cd7fc987c8ed25ffb717f99403bb308c
Gitweb:
https://git.kernel.org/tip/3f1bc119cd7fc987c8ed25ffb717f99403bb308c
Author:Peter Zijlstra 
AuthorDate:Wed, 24 Feb 2021 11:21:35 +01:00
Committer: Ingo Molnar 
CommitterDate: Sat, 06 Mar 2021 12:40:21 +01:00

sched: Optimize migration_cpu_stop()

When the purpose of migration_cpu_stop() is to migrate the task to
'any' valid CPU, don't migrate the task when it's already running on a
valid CPU.

Fixes: 6d337eab041d ("sched: Fix migrate_disable() vs set_cpus_allowed_ptr()")
Cc: sta...@kernel.org
Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Ingo Molnar 
Reviewed-by: Valentin Schneider 
Link: https://lkml.kernel.org/r/20210224131355.569238...@infradead.org
---
 kernel/sched/core.c | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 84b657f..ac05afb 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1936,14 +1936,25 @@ static int migration_cpu_stop(void *data)
complete = true;
}
 
-   if (dest_cpu < 0)
+   if (dest_cpu < 0) {
+		if (cpumask_test_cpu(task_cpu(p), &p->cpus_mask))
+   goto out;
+
		dest_cpu = cpumask_any_distribute(&p->cpus_mask);
+   }
 
if (task_on_rq_queued(p))
		rq = __migrate_task(rq, &rf, p, dest_cpu);
else
p->wake_cpu = dest_cpu;
 
+   /*
+* XXX __migrate_task() can fail, at which point we might end
+* up running on a dodgy CPU, AFAICT this can only happen
+* during CPU hotplug, at which point we'll get pushed out
+* anyway, so it's probably not a big deal.
+*/
+
} else if (pending) {
/*
 * This happens when we get migrated between migrate_enable()'s


[tip: sched/core] sched: Fix migration_cpu_stop() requeueing

2021-03-06 Thread tip-bot2 for Peter Zijlstra
The following commit has been merged into the sched/core branch of tip:

Commit-ID: 8a6edb5257e2a84720fe78cb179eca58ba76126f
Gitweb:
https://git.kernel.org/tip/8a6edb5257e2a84720fe78cb179eca58ba76126f
Author:Peter Zijlstra 
AuthorDate:Sat, 13 Feb 2021 13:10:35 +01:00
Committer: Ingo Molnar 
CommitterDate: Sat, 06 Mar 2021 12:40:20 +01:00

sched: Fix migration_cpu_stop() requeueing

When affine_move_task(p) is called on a running task @p, which is not
otherwise already changing affinity, we'll first set
p->migration_pending and then do:

 stop_one_cpu(cpu_of_rq(rq), migration_cpu_stop, &arg);

This then gets us to migration_cpu_stop() running on the CPU that was
previously running our victim task @p.

If we find that our task is no longer on that runqueue (this can
happen because of a concurrent migration due to load-balance etc.),
then we'll end up at the:

	} else if (dest_cpu < 0 || pending) {

branch. Which we'll take because we set pending earlier. Here we first
check if the task @p has already satisfied the affinity constraints,
if so we bail early [A]. Otherwise we'll reissue migration_cpu_stop()
onto the CPU that is now hosting our task @p:

stop_one_cpu_nowait(cpu_of(rq), migration_cpu_stop,
			    &pending->arg, &pending->stop_work);

Except, we've never initialized pending->arg, which will be all 0s.

This then results in running migration_cpu_stop() on the next CPU with
arg->p == NULL, which gives the by now obvious result of fireworks.

The cure is to change affine_move_task() to always use pending->arg,
furthermore we can use the exact same pattern as the
SCA_MIGRATE_ENABLE case, since we'll block on the pending->done
completion anyway, no point in adding yet another completion in
stop_one_cpu().

This then gives a clear distinction between the two
migration_cpu_stop() use cases:

  - sched_exec() / migrate_task_to() : arg->pending == NULL
  - affine_move_task() : arg->pending != NULL;

And we can have it ignore p->migration_pending when !arg->pending. Any
stop work from sched_exec() / migrate_task_to() is in addition to stop
works from affine_move_task(), which will be sufficient to issue the
completion.

Fixes: 6d337eab041d ("sched: Fix migrate_disable() vs set_cpus_allowed_ptr()")
Cc: sta...@kernel.org
Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Ingo Molnar 
Reviewed-by: Valentin Schneider 
Link: https://lkml.kernel.org/r/20210224131355.357743...@infradead.org
---
 kernel/sched/core.c | 39 ---
 1 file changed, 28 insertions(+), 11 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index ca2bb62..79ddba5 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1922,6 +1922,24 @@ static int migration_cpu_stop(void *data)
rq_lock(rq, );
 
pending = p->migration_pending;
+   if (pending && !arg->pending) {
+   /*
+* This happens from sched_exec() and migrate_task_to(),
+* neither of them care about pending and just want a task to
+* maybe move about.
+*
+* Even if there is a pending, we can ignore it, since
+* affine_move_task() will have it's own stop_work's in flight
+* which will manage the completion.
+*
+* Notably, pending doesn't need to match arg->pending. This can
+* happen when tripple concurrent affine_move_task() first sets
+* pending, then clears pending and eventually sets another
+* pending.
+*/
+   pending = NULL;
+   }
+
/*
 * If task_rq(p) != rq, it cannot be migrated here, because we're
 * holding rq->lock, if p->on_rq == 0 it cannot get enqueued because
@@ -2194,10 +2212,6 @@ static int affine_move_task(struct rq *rq, struct 
task_struct *p, struct rq_flag
int dest_cpu, unsigned int flags)
 {
struct set_affinity_pending my_pending = { }, *pending = NULL;
-   struct migration_arg arg = {
-   .task = p,
-   .dest_cpu = dest_cpu,
-   };
bool complete = false;
 
/* Can the task run on the task's current CPU? If so, we're done */
@@ -2235,6 +2249,12 @@ static int affine_move_task(struct rq *rq, struct 
task_struct *p, struct rq_flag
/* Install the request */
		refcount_set(&my_pending.refs, 1);
		init_completion(&my_pending.done);
+   my_pending.arg = (struct migration_arg) {
+   .task = p,
+   .dest_cpu = -1, /* any */
+			.pending = &my_pending,
+   };
+
+		p->migration_pending = &my_pending;
} else {
pending = p->migration_pending;
@@ -2265,12 +2285,6 @@ 

[tip: sched/core] sched: Fix affine_move_task() self-concurrency

2021-03-06 Thread tip-bot2 for Peter Zijlstra
The following commit has been merged into the sched/core branch of tip:

Commit-ID: 9e81889c7648d48dd5fe13f41cbc99f3c362484a
Gitweb:
https://git.kernel.org/tip/9e81889c7648d48dd5fe13f41cbc99f3c362484a
Author:Peter Zijlstra 
AuthorDate:Wed, 24 Feb 2021 11:31:09 +01:00
Committer: Ingo Molnar 
CommitterDate: Sat, 06 Mar 2021 12:40:21 +01:00

sched: Fix affine_move_task() self-concurrency

Consider:

   sched_setaffinity(p, X); sched_setaffinity(p, Y);

Then the first will install p->migration_pending = _pending; and
issue stop_one_cpu_nowait(pending); and the second one will read
p->migration_pending and _also_ issue: stop_one_cpu_nowait(pending),
the _SAME_ @pending.

This causes stopper list corruption.

Add set_affinity_pending::stop_pending, to indicate if a stopper is in
progress.

Fixes: 6d337eab041d ("sched: Fix migrate_disable() vs set_cpus_allowed_ptr()")
Cc: sta...@kernel.org
Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Ingo Molnar 
Reviewed-by: Valentin Schneider 
Link: https://lkml.kernel.org/r/20210224131355.649146...@infradead.org
---
 kernel/sched/core.c | 15 ---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index ac05afb..4e4d100 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1864,6 +1864,7 @@ struct migration_arg {
 
 struct set_affinity_pending {
refcount_t  refs;
+	unsigned int		stop_pending;
struct completion   done;
struct cpu_stop_workstop_work;
struct migration_argarg;
@@ -1982,12 +1983,15 @@ static int migration_cpu_stop(void *data)
 * determine is_migration_disabled() and so have to chase after
 * it.
 */
+   WARN_ON_ONCE(!pending->stop_pending);
	task_rq_unlock(rq, p, &rf);
	stop_one_cpu_nowait(task_cpu(p), migration_cpu_stop,
			    &pending->arg, &pending->stop_work);
return 0;
}
 out:
+   if (pending)
+   pending->stop_pending = false;
	task_rq_unlock(rq, p, &rf);
 
if (complete)
@@ -2183,7 +2187,7 @@ static int affine_move_task(struct rq *rq, struct 
task_struct *p, struct rq_flag
int dest_cpu, unsigned int flags)
 {
struct set_affinity_pending my_pending = { }, *pending = NULL;
-   bool complete = false;
+   bool stop_pending, complete = false;
 
/* Can the task run on the task's current CPU? If so, we're done */
	if (cpumask_test_cpu(task_cpu(p), &p->cpus_mask)) {
@@ -2256,14 +2260,19 @@ static int affine_move_task(struct rq *rq, struct 
task_struct *p, struct rq_flag
 * anything else we cannot do is_migration_disabled(), punt
 * and have the stopper function handle it all race-free.
 */
+   stop_pending = pending->stop_pending;
+   if (!stop_pending)
+   pending->stop_pending = true;
 
		refcount_inc(&pending->refs); /* pending->{arg,stop_work} */
if (flags & SCA_MIGRATE_ENABLE)
p->migration_flags &= ~MDF_PUSH;
task_rq_unlock(rq, p, rf);
 
-   stop_one_cpu_nowait(cpu_of(rq), migration_cpu_stop,
-				    &pending->arg, &pending->stop_work);
+   if (!stop_pending) {
+   stop_one_cpu_nowait(cpu_of(rq), migration_cpu_stop,
+					    &pending->arg, &pending->stop_work);
+   }
 
if (flags & SCA_MIGRATE_ENABLE)
return 0;


[tip: sched/core] sched/membarrier: fix missing local execution of ipi_sync_rq_state()

2021-03-06 Thread tip-bot2 for Mathieu Desnoyers
The following commit has been merged into the sched/core branch of tip:

Commit-ID: ce29ddc47b91f97e7f69a0fb7cbb5845f52a9825
Gitweb:
https://git.kernel.org/tip/ce29ddc47b91f97e7f69a0fb7cbb5845f52a9825
Author:Mathieu Desnoyers 
AuthorDate:Wed, 17 Feb 2021 11:56:51 -05:00
Committer: Ingo Molnar 
CommitterDate: Sat, 06 Mar 2021 12:40:21 +01:00

sched/membarrier: fix missing local execution of ipi_sync_rq_state()

The function sync_runqueues_membarrier_state() should copy the
membarrier state from the @mm received as a parameter to each runqueue
currently running a task using that mm.

However, the use of smp_call_function_many() skips the current runqueue,
which is unintended. Replace it with a call to on_each_cpu_mask().

Fixes: 227a4aadc75b ("sched/membarrier: Fix p->mm->membarrier_state racy load")
Reported-by: Nadav Amit 
Signed-off-by: Mathieu Desnoyers 
Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Ingo Molnar 
Cc: sta...@vger.kernel.org # 5.4.x+
Link: https://lore.kernel.org/r/74f1e842-4a84-47bf-b6c2-5407dfdd4...@gmail.com
---
 kernel/sched/membarrier.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/kernel/sched/membarrier.c b/kernel/sched/membarrier.c
index acdae62..b5add64 100644
--- a/kernel/sched/membarrier.c
+++ b/kernel/sched/membarrier.c
@@ -471,9 +471,7 @@ static int sync_runqueues_membarrier_state(struct mm_struct 
*mm)
}
rcu_read_unlock();
 
-   preempt_disable();
-   smp_call_function_many(tmpmask, ipi_sync_rq_state, mm, 1);
-   preempt_enable();
+   on_each_cpu_mask(tmpmask, ipi_sync_rq_state, mm, true);
 
free_cpumask_var(tmpmask);
cpus_read_unlock();
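
The semantic difference that caused the bug is easy to model: the
many() variant deliberately skips the calling CPU, so a caller that also
needs the local update must either run the function by hand or use the
on-each variant. A hedged userspace analogy (toy types and names, not
the kernel API):

	#include <stdbool.h>
	#include <stdio.h>

	#define NR_CPUS 4

	typedef void (*call_func_t)(void *info);

	/* Model of smp_call_function_many(): runs func on every CPU in
	 * the mask EXCEPT the calling one. */
	static void call_function_many(int self, const bool *mask,
				       call_func_t func, void *info)
	{
		for (int cpu = 0; cpu < NR_CPUS; cpu++)
			if (mask[cpu] && cpu != self)
				func(info);
	}

	/* Model of on_each_cpu_mask(): same, plus local execution when
	 * the calling CPU is in the mask. */
	static void each_cpu_mask(int self, const bool *mask,
				  call_func_t func, void *info)
	{
		call_function_many(self, mask, func, info);
		if (mask[self])
			func(info);	/* the previously missing local call */
	}

	static void sync_state(void *info)
	{
		(void)info;
		printf("sync rq state\n");
	}

	int main(void)
	{
		bool mask[NR_CPUS] = { true, true, false, false };

		call_function_many(0, mask, sync_state, NULL);	/* 1 call: cpu0 skipped */
		each_cpu_mask(0, mask, sync_state, NULL);	/* 2 calls: cpu0 included */
		return 0;
	}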


[tip: sched/core] sched/fair: Remove update of blocked load from newidle_balance

2021-03-06 Thread tip-bot2 for Vincent Guittot
The following commit has been merged into the sched/core branch of tip:

Commit-ID: 0826530de3cbdc89e60a89e86def94a5f0fc81ca
Gitweb:
https://git.kernel.org/tip/0826530de3cbdc89e60a89e86def94a5f0fc81ca
Author:Vincent Guittot 
AuthorDate:Wed, 24 Feb 2021 14:30:01 +01:00
Committer: Ingo Molnar 
CommitterDate: Sat, 06 Mar 2021 12:40:21 +01:00

sched/fair: Remove update of blocked load from newidle_balance

newidle_balance() runs with both preemption and IRQs disabled, which
prevents local IRQs from running during this period. The time spent
updating the blocked load of CPUs varies with the number of CPU cgroups
that carry non-decayed load, and extends this critical section to an
uncontrolled length.

Remove the update from newidle_balance and trigger a normal ILB that
will take care of the update instead.

This reduces the IRQ latency from O(nr_cgroups * nr_nohz_cpus) to
O(nr_cgroups).

Signed-off-by: Vincent Guittot 
Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Ingo Molnar 
Reviewed-by: Valentin Schneider 
Link: 
https://lkml.kernel.org/r/20210224133007.28644-2-vincent.guit...@linaro.org
---
 kernel/sched/fair.c | 33 +
 1 file changed, 5 insertions(+), 28 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 794c2cb..806e16f 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7392,8 +7392,6 @@ enum migration_type {
 #define LBF_NEED_BREAK 0x02
 #define LBF_DST_PINNED  0x04
#define LBF_SOME_PINNED 0x08
-#define LBF_NOHZ_STATS 0x10
-#define LBF_NOHZ_AGAIN 0x20
 
 struct lb_env {
struct sched_domain *sd;
@@ -8397,9 +8395,6 @@ static inline void update_sg_lb_stats(struct lb_env *env,
for_each_cpu_and(i, sched_group_span(group), env->cpus) {
struct rq *rq = cpu_rq(i);
 
-   if ((env->flags & LBF_NOHZ_STATS) && update_nohz_stats(rq, 
false))
-   env->flags |= LBF_NOHZ_AGAIN;
-
sgs->group_load += cpu_load(rq);
sgs->group_util += cpu_util(i);
sgs->group_runnable += cpu_runnable(rq);
@@ -8940,11 +8935,6 @@ static inline void update_sd_lb_stats(struct lb_env 
*env, struct sd_lb_stats *sd
struct sg_lb_stats tmp_sgs;
int sg_status = 0;
 
-#ifdef CONFIG_NO_HZ_COMMON
-   if (env->idle == CPU_NEWLY_IDLE && READ_ONCE(nohz.has_blocked))
-   env->flags |= LBF_NOHZ_STATS;
-#endif
-
do {
struct sg_lb_stats *sgs = &tmp_sgs;
int local_group;
@@ -8981,14 +8971,6 @@ next_group:
/* Tag domain that child domain prefers tasks go to siblings first */
sds->prefer_sibling = child && child->flags & SD_PREFER_SIBLING;
 
-#ifdef CONFIG_NO_HZ_COMMON
-   if ((env->flags & LBF_NOHZ_AGAIN) &&
-   cpumask_subset(nohz.idle_cpus_mask, sched_domain_span(env->sd))) {
-
-   WRITE_ONCE(nohz.next_blocked,
-  jiffies + msecs_to_jiffies(LOAD_AVG_PERIOD));
-   }
-#endif
 
if (env->sd->flags & SD_NUMA)
env->fbq_type = fbq_classify_group(&sds->busiest_stat);
@@ -10517,16 +10499,11 @@ static void nohz_newidle_balance(struct rq *this_rq)
time_before(jiffies, READ_ONCE(nohz.next_blocked)))
return;
 
-   raw_spin_unlock(&this_rq->lock);
/*
-* This CPU is going to be idle and blocked load of idle CPUs
-* need to be updated. Run the ilb locally as it is a good
-* candidate for ilb instead of waking up another idle CPU.
-* Kick an normal ilb if we failed to do the update.
+* Blocked load of idle CPUs need to be updated.
+* Kick an ILB to update statistics.
 */
-   if (!_nohz_idle_balance(this_rq, NOHZ_STATS_KICK, CPU_NEWLY_IDLE))
-   kick_ilb(NOHZ_STATS_KICK);
-   raw_spin_lock(&this_rq->lock);
+   kick_ilb(NOHZ_STATS_KICK);
 }
 
 #else /* !CONFIG_NO_HZ_COMMON */
@@ -10587,8 +10564,6 @@ static int newidle_balance(struct rq *this_rq, struct 
rq_flags *rf)
update_next_balance(sd, _balance);
rcu_read_unlock();
 
-   nohz_newidle_balance(this_rq);
-
goto out;
}
 
@@ -10654,6 +10629,8 @@ out:
 
if (pulled_task)
this_rq->idle_stamp = 0;
+   else
+   nohz_newidle_balance(this_rq);
 
rq_repin_lock(this_rq, rf);
 


[tip: sched/core] sched/fair: Remove unused parameter of update_nohz_stats

2021-03-06 Thread tip-bot2 for Vincent Guittot
The following commit has been merged into the sched/core branch of tip:

Commit-ID: 64f84f273592d17dcdca20244168ad9f525a39c3
Gitweb:
https://git.kernel.org/tip/64f84f273592d17dcdca20244168ad9f525a39c3
Author:Vincent Guittot 
AuthorDate:Wed, 24 Feb 2021 14:30:03 +01:00
Committer: Ingo Molnar 
CommitterDate: Sat, 06 Mar 2021 12:40:21 +01:00

sched/fair: Remove unused parameter of update_nohz_stats

The idle load balancer is the only user of update_nohz_stats() and
doesn't use the force parameter. Remove it.

Signed-off-by: Vincent Guittot 
Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Ingo Molnar 
Reviewed-by: Valentin Schneider 
Link: 
https://lkml.kernel.org/r/20210224133007.28644-4-vincent.guit...@linaro.org
---
 kernel/sched/fair.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 6a458e9..1b91030 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -8352,7 +8352,7 @@ group_type group_classify(unsigned int imbalance_pct,
return group_has_spare;
 }
 
-static bool update_nohz_stats(struct rq *rq, bool force)
+static bool update_nohz_stats(struct rq *rq)
 {
 #ifdef CONFIG_NO_HZ_COMMON
unsigned int cpu = rq->cpu;
@@ -8363,7 +8363,7 @@ static bool update_nohz_stats(struct rq *rq, bool force)
if (!cpumask_test_cpu(cpu, nohz.idle_cpus_mask))
return false;
 
-   if (!force && !time_after(jiffies, rq->last_blocked_load_update_tick))
+   if (!time_after(jiffies, rq->last_blocked_load_update_tick))
return true;
 
update_blocked_averages(cpu);
@@ -10401,7 +10401,7 @@ static void _nohz_idle_balance(struct rq *this_rq, 
unsigned int flags,
 
rq = cpu_rq(balance_cpu);
 
-   has_blocked_load |= update_nohz_stats(rq, true);
+   has_blocked_load |= update_nohz_stats(rq);
 
/*
 * If time for next balance is due,


[tip: sched/core] kcov: Remove kcov include from sched.h and move it to its users.

2021-03-06 Thread tip-bot2 for Sebastian Andrzej Siewior
The following commit has been merged into the sched/core branch of tip:

Commit-ID: 183f47fcaa54a5ffe671d990186d330ac8c63b10
Gitweb:
https://git.kernel.org/tip/183f47fcaa54a5ffe671d990186d330ac8c63b10
Author:Sebastian Andrzej Siewior 
AuthorDate:Thu, 18 Feb 2021 18:31:24 +01:00
Committer: Ingo Molnar 
CommitterDate: Sat, 06 Mar 2021 12:40:21 +01:00

kcov: Remove kcov include from sched.h and move it to its users.

The recent addition of in_serving_softirq() to kcov.h results in a
compile failure on PREEMPT_RT because it requires
task_struct::softirq_disable_cnt. This is not available if kcov.h is
included from sched.h.

It is not needed to include kcov.h from sched.h. All users but the ones
under net/ already include the kcov header file directly.

Move the include of the kcov.h header from sched.h to its users.
Additionally, include sched.h from kcov.h to ensure that everything
task_struct related is available.

Signed-off-by: Sebastian Andrzej Siewior 
Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Ingo Molnar 
Acked-by: Johannes Berg 
Acked-by: Andrey Konovalov 
Link: https://lkml.kernel.org/r/20210218173124.iy5iyqv3a4oia...@linutronix.de
---
 drivers/usb/usbip/usbip_common.h | 1 +
 include/linux/kcov.h | 1 +
 include/linux/sched.h| 1 -
 net/core/skbuff.c| 1 +
 net/mac80211/iface.c | 1 +
 net/mac80211/rx.c| 1 +
 6 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/usb/usbip/usbip_common.h b/drivers/usb/usbip/usbip_common.h
index d60ce17..a7dd6c6 100644
--- a/drivers/usb/usbip/usbip_common.h
+++ b/drivers/usb/usbip/usbip_common.h
@@ -18,6 +18,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #undef pr_fmt
diff --git a/include/linux/kcov.h b/include/linux/kcov.h
index 4e3037d..55dc338 100644
--- a/include/linux/kcov.h
+++ b/include/linux/kcov.h
@@ -2,6 +2,7 @@
 #ifndef _LINUX_KCOV_H
 #define _LINUX_KCOV_H
 
+#include 
 #include 
 
 struct task_struct;
diff --git a/include/linux/sched.h b/include/linux/sched.h
index ef00bb2..cf245bc 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -14,7 +14,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 545a472..420f23c 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -60,6 +60,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
diff --git a/net/mac80211/iface.c b/net/mac80211/iface.c
index b80c9b0..c127deb 100644
--- a/net/mac80211/iface.c
+++ b/net/mac80211/iface.c
@@ -15,6 +15,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include "ieee80211_i.h"
diff --git a/net/mac80211/rx.c b/net/mac80211/rx.c
index c1343c0..62047e9 100644
--- a/net/mac80211/rx.c
+++ b/net/mac80211/rx.c
@@ -17,6 +17,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 


[tip: sched/core] sched/fair: Remove unused return of _nohz_idle_balance

2021-03-06 Thread tip-bot2 for Vincent Guittot
The following commit has been merged into the sched/core branch of tip:

Commit-ID: ab2dde5e98db23387147fb4e7a52b6cf8141cdb3
Gitweb:
https://git.kernel.org/tip/ab2dde5e98db23387147fb4e7a52b6cf8141cdb3
Author:Vincent Guittot 
AuthorDate:Wed, 24 Feb 2021 14:30:02 +01:00
Committer: Ingo Molnar 
CommitterDate: Sat, 06 Mar 2021 12:40:21 +01:00

sched/fair: Remove unused return of _nohz_idle_balance

The return value of _nohz_idle_balance() is not used anymore, so we can
remove it.

Signed-off-by: Vincent Guittot 
Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Ingo Molnar 
Reviewed-by: Valentin Schneider 
Link: 
https://lkml.kernel.org/r/20210224133007.28644-3-vincent.guit...@linaro.org
---
 kernel/sched/fair.c | 10 +-
 1 file changed, 1 insertion(+), 9 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 806e16f..6a458e9 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -10354,10 +10354,8 @@ out:
  * Internal function that runs load balance for all idle cpus. The load balance
  * can be a simple update of blocked load or a complete load balance with
  * tasks movement depending of flags.
- * The function returns false if the loop has stopped before running
- * through all idle CPUs.
  */
-static bool _nohz_idle_balance(struct rq *this_rq, unsigned int flags,
+static void _nohz_idle_balance(struct rq *this_rq, unsigned int flags,
   enum cpu_idle_type idle)
 {
/* Earliest time when we have to do rebalance again */
@@ -10367,7 +10365,6 @@ static bool _nohz_idle_balance(struct rq *this_rq, 
unsigned int flags,
int update_next_balance = 0;
int this_cpu = this_rq->cpu;
int balance_cpu;
-   int ret = false;
struct rq *rq;
 
SCHED_WARN_ON((flags & NOHZ_KICK_MASK) == NOHZ_BALANCE_KICK);
@@ -10447,15 +10444,10 @@ static bool _nohz_idle_balance(struct rq *this_rq, 
unsigned int flags,
WRITE_ONCE(nohz.next_blocked,
now + msecs_to_jiffies(LOAD_AVG_PERIOD));
 
-   /* The full idle balance loop has been done */
-   ret = true;
-
 abort:
/* There is still blocked load, enable periodic update */
if (has_blocked_load)
WRITE_ONCE(nohz.has_blocked, 1);
-
-   return ret;
 }
 
 /*


[tip: sched/core] sched/fair: Reorder newidle_balance pulled_task tests

2021-03-06 Thread tip-bot2 for Vincent Guittot
The following commit has been merged into the sched/core branch of tip:

Commit-ID: 6553fc18179113a11835d5fde1735259f8943a55
Gitweb:
https://git.kernel.org/tip/6553fc18179113a11835d5fde1735259f8943a55
Author:Vincent Guittot 
AuthorDate:Wed, 24 Feb 2021 14:30:05 +01:00
Committer: Ingo Molnar 
CommitterDate: Sat, 06 Mar 2021 12:40:21 +01:00

sched/fair: Reorder newidle_balance pulled_task tests

Reorder the tests and skip the useless ones when no load balance has been
performed and the rq lock has not been released.

Signed-off-by: Vincent Guittot 
Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Ingo Molnar 
Reviewed-by: Valentin Schneider 
Link: 
https://lkml.kernel.org/r/20210224133007.28644-6-vincent.guit...@linaro.org
---
 kernel/sched/fair.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 3c00918..356a245 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -10584,7 +10584,6 @@ static int newidle_balance(struct rq *this_rq, struct 
rq_flags *rf)
if (curr_cost > this_rq->max_idle_balance_cost)
this_rq->max_idle_balance_cost = curr_cost;
 
-out:
/*
 * While browsing the domains, we released the rq lock, a task could
 * have been enqueued in the meantime. Since we're not going idle,
@@ -10593,14 +10592,15 @@ out:
if (this_rq->cfs.h_nr_running && !pulled_task)
pulled_task = 1;
 
-   /* Move the next balance forward */
-   if (time_after(this_rq->next_balance, next_balance))
-   this_rq->next_balance = next_balance;
-
/* Is there a task of a high priority class? */
if (this_rq->nr_running != this_rq->cfs.h_nr_running)
pulled_task = -1;
 
+out:
+   /* Move the next balance forward */
+   if (time_after(this_rq->next_balance, next_balance))
+   this_rq->next_balance = next_balance;
+
if (pulled_task)
this_rq->idle_stamp = 0;
else


[tip: sched/core] sched: Simplify migration_cpu_stop()

2021-03-06 Thread tip-bot2 for Valentin Schneider
The following commit has been merged into the sched/core branch of tip:

Commit-ID: e140749c9f194d65f5984a5941e46758377c93c0
Gitweb:
https://git.kernel.org/tip/e140749c9f194d65f5984a5941e46758377c93c0
Author:Valentin Schneider 
AuthorDate:Thu, 25 Feb 2021 10:22:30 +01:00
Committer: Ingo Molnar 
CommitterDate: Sat, 06 Mar 2021 12:40:21 +01:00

sched: Simplify migration_cpu_stop()

When ->stop_pending is set, only the stopper can uninstall
p->migration_pending. This simplifies a few ifs, because:

  (pending != NULL) => (pending == p->migration_pending)

Also, the fatty comment above affine_move_task() probably needs a bit
of gardening.

Signed-off-by: Valentin Schneider 
Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Ingo Molnar 
---
 kernel/sched/core.c | 27 ++-
 1 file changed, 18 insertions(+), 9 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 9819121..f9dfb34 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1927,6 +1927,12 @@ static int migration_cpu_stop(void *data)
rq_lock(rq, &rf);
 
/*
+* If we were passed a pending, then ->stop_pending was set, thus
+* p->migration_pending must have remained stable.
+*/
+   WARN_ON_ONCE(pending && pending != p->migration_pending);
+
+   /*
 * If task_rq(p) != rq, it cannot be migrated here, because we're
 * holding rq->lock, if p->on_rq == 0 it cannot get enqueued because
 * we're holding p->pi_lock.
@@ -1936,8 +1942,7 @@ static int migration_cpu_stop(void *data)
goto out;
 
if (pending) {
-   if (p->migration_pending == pending)
-   p->migration_pending = NULL;
+   p->migration_pending = NULL;
complete = true;
}
 
@@ -1976,8 +1981,7 @@ static int migration_cpu_stop(void *data)
 * somewhere allowed, we're done.
 */
if (cpumask_test_cpu(task_cpu(p), p->cpus_ptr)) {
-   if (p->migration_pending == pending)
-   p->migration_pending = NULL;
+   p->migration_pending = NULL;
complete = true;
goto out;
}
@@ -2165,16 +2169,21 @@ void do_set_cpus_allowed(struct task_struct *p, const 
struct cpumask *new_mask)
  *
  * (1) In the cases covered above. There is one more where the completion is
  * signaled within affine_move_task() itself: when a subsequent affinity 
request
- * cancels the need for an active migration. Consider:
+ * occurs after the stopper bailed out due to the targeted task still being
+ * Migrate-Disable. Consider:
  *
  * Initial conditions: P0->cpus_mask = [0, 1]
  *
- * P0@CPU0P1 P2
- *
- * migrate_disable();
- * 
+ * CPU0  P1P2
+ * 
+ *   migrate_disable();
+ *   
  *set_cpus_allowed_ptr(P0, [1]);
  *  
+ * 
+ *   migration_cpu_stop()
+ * is_migration_disabled()
+ *   
  *   
set_cpus_allowed_ptr(P0, [0, 1]);
  * 
  *  


[tip: sched/core] sched/fair: Trigger the update of blocked load on newly idle cpu

2021-03-06 Thread tip-bot2 for Vincent Guittot
The following commit has been merged into the sched/core branch of tip:

Commit-ID: c6f886546cb8a38617cdbe755fe50d3acd2463e4
Gitweb:
https://git.kernel.org/tip/c6f886546cb8a38617cdbe755fe50d3acd2463e4
Author:Vincent Guittot 
AuthorDate:Wed, 24 Feb 2021 14:30:06 +01:00
Committer: Ingo Molnar 
CommitterDate: Sat, 06 Mar 2021 12:40:22 +01:00

sched/fair: Trigger the update of blocked load on newly idle cpu

Instead of waking up a random and already idle CPU, we can take advantage
of this_cpu being about to enter idle to run the ILB and update the
blocked load.

Signed-off-by: Vincent Guittot 
Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Ingo Molnar 
Reviewed-by: Valentin Schneider 
Link: 
https://lkml.kernel.org/r/20210224133007.28644-7-vincent.guit...@linaro.org
---
 kernel/sched/core.c  |  2 +-
 kernel/sched/fair.c  | 24 +---
 kernel/sched/idle.c  |  6 ++
 kernel/sched/sched.h |  7 +++
 4 files changed, 35 insertions(+), 4 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index f9dfb34..361974e 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -737,7 +737,7 @@ static void nohz_csd_func(void *info)
/*
 * Release the rq::nohz_csd.
 */
-   flags = atomic_fetch_andnot(NOHZ_KICK_MASK, nohz_flags(cpu));
+   flags = atomic_fetch_andnot(NOHZ_KICK_MASK | NOHZ_NEWILB_KICK, 
nohz_flags(cpu));
WARN_ON(!(flags & NOHZ_KICK_MASK));
 
rq->idle_balance = idle_cpu(cpu);
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 356a245..e87e1b3 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -10453,6 +10453,24 @@ static bool nohz_idle_balance(struct rq *this_rq, enum 
cpu_idle_type idle)
return true;
 }
 
+/*
+ * Check if we need to run the ILB for updating blocked load before entering
+ * idle state.
+ */
+void nohz_run_idle_balance(int cpu)
+{
+   unsigned int flags;
+
+   flags = atomic_fetch_andnot(NOHZ_NEWILB_KICK, nohz_flags(cpu));
+
+   /*
+* Update the blocked load only if no SCHED_SOFTIRQ is about to happen
+* (ie NOHZ_STATS_KICK set) and will do the same.
+*/
+   if ((flags == NOHZ_NEWILB_KICK) && !need_resched())
+   _nohz_idle_balance(cpu_rq(cpu), NOHZ_STATS_KICK, CPU_IDLE);
+}
+
 static void nohz_newidle_balance(struct rq *this_rq)
 {
int this_cpu = this_rq->cpu;
@@ -10474,10 +10492,10 @@ static void nohz_newidle_balance(struct rq *this_rq)
return;
 
/*
-* Blocked load of idle CPUs need to be updated.
-* Kick an ILB to update statistics.
+* Set the need to trigger ILB in order to update blocked load
+* before entering idle state.
 */
-   kick_ilb(NOHZ_STATS_KICK);
+   atomic_or(NOHZ_NEWILB_KICK, nohz_flags(this_cpu));
 }
 
 #else /* !CONFIG_NO_HZ_COMMON */
diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
index 7199e6f..7a92d60 100644
--- a/kernel/sched/idle.c
+++ b/kernel/sched/idle.c
@@ -261,6 +261,12 @@ exit_idle:
 static void do_idle(void)
 {
int cpu = smp_processor_id();
+
+   /*
+* Check if we need to update blocked load
+*/
+   nohz_run_idle_balance(cpu);
+
/*
 * If the arch has a polling bit, we maintain an invariant:
 *
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 10a1522..0ddc9a6 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -2385,9 +2385,11 @@ extern void cfs_bandwidth_usage_dec(void);
 #ifdef CONFIG_NO_HZ_COMMON
 #define NOHZ_BALANCE_KICK_BIT  0
#define NOHZ_STATS_KICK_BIT    1
+#define NOHZ_NEWILB_KICK_BIT   2
 
 #define NOHZ_BALANCE_KICK  BIT(NOHZ_BALANCE_KICK_BIT)
#define NOHZ_STATS_KICK    BIT(NOHZ_STATS_KICK_BIT)
+#define NOHZ_NEWILB_KICK   BIT(NOHZ_NEWILB_KICK_BIT)
 
 #define NOHZ_KICK_MASK (NOHZ_BALANCE_KICK | NOHZ_STATS_KICK)
 
@@ -2398,6 +2400,11 @@ extern void nohz_balance_exit_idle(struct rq *rq);
 static inline void nohz_balance_exit_idle(struct rq *rq) { }
 #endif
 
+#if defined(CONFIG_SMP) && defined(CONFIG_NO_HZ_COMMON)
+extern void nohz_run_idle_balance(int cpu);
+#else
+static inline void nohz_run_idle_balance(int cpu) { }
+#endif
 
 #ifdef CONFIG_SMP
 static inline
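
The interesting part of this patch is the flag protocol:
nohz_newidle_balance() merely sets NOHZ_NEWILB_KICK, and the CPU claims
the bit on its way into idle with an atomic fetch-and-clear, so the
update runs at most once and is skipped when a regular NOHZ_STATS_KICK
is about to do the same work. A minimal userspace sketch of that claim
idiom using C11 atomics (the flag names mirror the patch, but nothing
here is kernel API; the need_resched() check is omitted):

	#include <stdatomic.h>
	#include <stdio.h>

	#define NOHZ_STATS_KICK		(1u << 1)
	#define NOHZ_NEWILB_KICK	(1u << 2)

	static atomic_uint nohz_flags;

	static void mark_newidle(void)
	{
		atomic_fetch_or(&nohz_flags, NOHZ_NEWILB_KICK);
	}

	/* Claim the kick: clear the bit, look at what was pending. */
	static void run_idle_balance(void)
	{
		unsigned int flags =
			atomic_fetch_and(&nohz_flags, ~NOHZ_NEWILB_KICK);

		/* Update only if NEWILB was the sole pending kick; a
		 * pending STATS kick will do the same work soon. */
		if (flags == NOHZ_NEWILB_KICK)
			printf("updating blocked load before idle\n");
	}

	int main(void)
	{
		mark_newidle();
		run_idle_balance();	/* runs the update */
		run_idle_balance();	/* already claimed: no duplicate */
		return 0;
	}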


[tip: sched/core] sched/fair: Merge for each idle cpu loop of ILB

2021-03-06 Thread tip-bot2 for Vincent Guittot
The following commit has been merged into the sched/core branch of tip:

Commit-ID: 7a82e5f52a3506bc35a4dc04d53ad2c9daf82e7f
Gitweb:
https://git.kernel.org/tip/7a82e5f52a3506bc35a4dc04d53ad2c9daf82e7f
Author:Vincent Guittot 
AuthorDate:Wed, 24 Feb 2021 14:30:04 +01:00
Committer: Ingo Molnar 
CommitterDate: Sat, 06 Mar 2021 12:40:21 +01:00

sched/fair: Merge for each idle cpu loop of ILB

Remove the special case for handling this_cpu outside the for_each_cpu()
loop when running the ILB. Instead, we use for_each_cpu_wrap() and start
with the next CPU after this_cpu, so the loop still finishes with this_cpu.

update_nohz_stats() is now used for this_cpu too, which prevents an
unnecessary update. We don't need a special case for handling the update of
nohz.next_balance for this_cpu anymore because it is now handled by the
loop like the others.

Signed-off-by: Vincent Guittot 
Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Ingo Molnar 
Reviewed-by: Valentin Schneider 
Link: 
https://lkml.kernel.org/r/20210224133007.28644-5-vincent.guit...@linaro.org
---
 kernel/sched/fair.c | 32 +++-
 1 file changed, 7 insertions(+), 25 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 1b91030..3c00918 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -10043,22 +10043,9 @@ out:
 * When the cpu is attached to null domain for ex, it will not be
 * updated.
 */
-   if (likely(update_next_balance)) {
+   if (likely(update_next_balance))
rq->next_balance = next_balance;
 
-#ifdef CONFIG_NO_HZ_COMMON
-   /*
-* If this CPU has been elected to perform the nohz idle
-* balance. Other idle CPUs have already rebalanced with
-* nohz_idle_balance() and nohz.next_balance has been
-* updated accordingly. This CPU is now running the idle load
-* balance for itself and we need to update the
-* nohz.next_balance accordingly.
-*/
-   if ((idle == CPU_IDLE) && time_after(nohz.next_balance, 
rq->next_balance))
-   nohz.next_balance = rq->next_balance;
-#endif
-   }
 }
 
 static inline int on_null_domain(struct rq *rq)
@@ -10385,8 +10372,12 @@ static void _nohz_idle_balance(struct rq *this_rq, 
unsigned int flags,
 */
smp_mb();
 
-   for_each_cpu(balance_cpu, nohz.idle_cpus_mask) {
-   if (balance_cpu == this_cpu || !idle_cpu(balance_cpu))
+   /*
+* Start with the next CPU after this_cpu so we will end with this_cpu
+* and let a chance for other idle cpus to pull load.
+*/
+   for_each_cpu_wrap(balance_cpu, nohz.idle_cpus_mask, this_cpu+1) {
+   if (!idle_cpu(balance_cpu))
continue;
 
/*
@@ -10432,15 +10423,6 @@ static void _nohz_idle_balance(struct rq *this_rq, 
unsigned int flags,
if (likely(update_next_balance))
nohz.next_balance = next_balance;
 
-   /* Newly idle CPU doesn't need an update */
-   if (idle != CPU_NEWLY_IDLE) {
-   update_blocked_averages(this_cpu);
-   has_blocked_load |= this_rq->has_blocked_load;
-   }
-
-   if (flags & NOHZ_BALANCE_KICK)
-   rebalance_domains(this_rq, CPU_IDLE);
-
WRITE_ONCE(nohz.next_blocked,
now + msecs_to_jiffies(LOAD_AVG_PERIOD));
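
The ordering property the comment relies on can be checked in isolation:
starting the wrap at this_cpu+1 visits every CPU in the mask exactly once
and reaches this_cpu last. A small standalone model of the wrap order
(plain C, not the kernel's cpumask implementation):

	#include <stdio.h>

	#define NR_CPUS 8

	int main(void)
	{
		unsigned int mask = 0xff;	/* all 8 CPUs idle */
		int this_cpu = 3;

		/* Toy for_each_cpu_wrap(cpu, mask, this_cpu + 1) */
		for (int i = 0; i < NR_CPUS; i++) {
			int cpu = (this_cpu + 1 + i) % NR_CPUS;

			if (!(mask & (1u << cpu)))
				continue;
			printf("%d ", cpu);	/* 4 5 6 7 0 1 2 3 */
		}
		printf("\n");
		return 0;
	}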
 


Re: [PATCH RESEND][next] hwmon: (corsair-cpro) Fix fall-through warnings for Clang

2021-03-06 Thread Marius Zachmann
On 06.03.21 at 10:53:59 CET, Gustavo A. R. Silva wrote
> In preparation to enable -Wimplicit-fallthrough for Clang, fix a warning
> by explicitly adding a break statement instead of letting the code fall
> through to the next case.
> 
> Link: https://github.com/KSPP/linux/issues/115
> Acked-by: Guenter Roeck 
> Signed-off-by: Gustavo A. R. Silva 

Acked-by: Marius Zachmann 

> ---
>  drivers/hwmon/corsair-cpro.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/hwmon/corsair-cpro.c b/drivers/hwmon/corsair-cpro.c
> index 591929ec217a..fa6aa4fc8b52 100644
> --- a/drivers/hwmon/corsair-cpro.c
> +++ b/drivers/hwmon/corsair-cpro.c
> @@ -310,6 +310,7 @@ static int ccp_write(struct device *dev, enum 
> hwmon_sensor_types type,
>   default:
>   break;
>   }
> + break;
>   default:
>   break;
>   }
> 






[tip: sched/core] psi: Optimize task switch inside shared cgroups

2021-03-06 Thread tip-bot2 for Chengming Zhou
The following commit has been merged into the sched/core branch of tip:

Commit-ID: 4117cebf1a9fcbf35b9aabf0e37b6c5eea296798
Gitweb:
https://git.kernel.org/tip/4117cebf1a9fcbf35b9aabf0e37b6c5eea296798
Author:Chengming Zhou 
AuthorDate:Wed, 03 Mar 2021 11:46:59 +08:00
Committer: Ingo Molnar 
CommitterDate: Sat, 06 Mar 2021 12:40:23 +01:00

psi: Optimize task switch inside shared cgroups

Commit 36b238d57172 ("psi: Optimize switching tasks inside shared
cgroups") only updates cgroups whose state actually changes during a
task switch in the task preempt case, not in the task sleep case.

We actually don't need to clear and set the TSK_ONCPU state for the
common cgroups of the next and prev task in the sleep case either; that
saves many psi_group_change() calls, especially when most activity comes
from one leaf cgroup.

sleep before:
psi_dequeue()
  while ((group = iterate_groups(prev)))  # all ancestors
psi_group_change(prev, .clear=TSK_RUNNING|TSK_ONCPU)
psi_task_switch()
  while ((group = iterate_groups(next)))  # all ancestors
psi_group_change(next, .set=TSK_ONCPU)

sleep after:
psi_dequeue()
  nop
psi_task_switch()
  while ((group = iterate_groups(next)))  # until (prev & next)
psi_group_change(next, .set=TSK_ONCPU)
  while ((group = iterate_groups(prev)))  # all ancestors
psi_group_change(prev, .clear=common?TSK_RUNNING:TSK_RUNNING|TSK_ONCPU)

When a voluntary sleep switches to another task, we remove one call of
psi_group_change() for every common cgroup ancestor of the two tasks.

Co-developed-by: Muchun Song 
Signed-off-by: Muchun Song 
Signed-off-by: Chengming Zhou 
Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Ingo Molnar 
Acked-by: Johannes Weiner 
Link: 
https://lkml.kernel.org/r/20210303034659.91735-5-zhouchengm...@bytedance.com
---
 kernel/sched/psi.c   | 35 +--
 kernel/sched/stats.h | 28 
 2 files changed, 37 insertions(+), 26 deletions(-)

diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
index 3907a6b..ee3c5b4 100644
--- a/kernel/sched/psi.c
+++ b/kernel/sched/psi.c
@@ -840,20 +840,35 @@ void psi_task_switch(struct task_struct *prev, struct 
task_struct *next,
}
}
 
-   /*
-* If this is a voluntary sleep, dequeue will have taken care
-* of the outgoing TSK_ONCPU alongside TSK_RUNNING already. We
-* only need to deal with it during preemption.
-*/
-   if (sleep)
-   return;
-
if (prev->pid) {
-   psi_flags_change(prev, TSK_ONCPU, 0);
+   int clear = TSK_ONCPU, set = 0;
+
+   /*
+* When we're going to sleep, psi_dequeue() lets us handle
+* TSK_RUNNING and TSK_IOWAIT here, where we can combine it
+* with TSK_ONCPU and save walking common ancestors twice.
+*/
+   if (sleep) {
+   clear |= TSK_RUNNING;
+   if (prev->in_iowait)
+   set |= TSK_IOWAIT;
+   }
+
+   psi_flags_change(prev, clear, set);
 
iter = NULL;
while ((group = iterate_groups(prev, &iter)) && group != common)
-   psi_group_change(group, cpu, TSK_ONCPU, 0, true);
+   psi_group_change(group, cpu, clear, set, true);
+
+   /*
+* TSK_ONCPU is handled up to the common ancestor. If we're 
tasked
+* with dequeuing too, finish that for the rest of the 
hierarchy.
+*/
+   if (sleep) {
+   clear &= ~TSK_ONCPU;
+   for (; group; group = iterate_groups(prev, &iter))
+   psi_group_change(group, cpu, clear, set, true);
+   }
}
 }
 
diff --git a/kernel/sched/stats.h b/kernel/sched/stats.h
index 9e4e67a..dc218e9 100644
--- a/kernel/sched/stats.h
+++ b/kernel/sched/stats.h
@@ -84,28 +84,24 @@ static inline void psi_enqueue(struct task_struct *p, bool 
wakeup)
 
 static inline void psi_dequeue(struct task_struct *p, bool sleep)
 {
-   int clear = TSK_RUNNING, set = 0;
+   int clear = TSK_RUNNING;
 
if (static_branch_likely(_disabled))
return;
 
-   if (!sleep) {
-   if (p->in_memstall)
-   clear |= TSK_MEMSTALL;
-   } else {
-   /*
-* When a task sleeps, schedule() dequeues it before
-* switching to the next one. Merge the clearing of
-* TSK_RUNNING and TSK_ONCPU to save an unnecessary
-* psi_task_change() call in psi_sched_switch().
-*/
-   clear |= TSK_ONCPU;
+   /*
+* A voluntary sleep is a dequeue followed by a task switch. To
+* avoid walking all ancestors twice, psi_task_switch() handles
+* TSK_RUNNING and TSK_IOWAIT for us when it moves TSK_ONCPU.
+* Do nothing here.
+*/

[tip: sched/core] psi: Add PSI_CPU_FULL state

2021-03-06 Thread tip-bot2 for Chengming Zhou
The following commit has been merged into the sched/core branch of tip:

Commit-ID: e7fcd762282332f765af2035a9568fb126fa3c01
Gitweb:
https://git.kernel.org/tip/e7fcd762282332f765af2035a9568fb126fa3c01
Author:Chengming Zhou 
AuthorDate:Wed, 03 Mar 2021 11:46:56 +08:00
Committer: Ingo Molnar 
CommitterDate: Sat, 06 Mar 2021 12:40:22 +01:00

psi: Add PSI_CPU_FULL state

The FULL state doesn't exist for the CPU resource at the system level,
but it does exist at the cgroup level. It means that all non-idle tasks
in a cgroup are delayed on the CPU resource, which is used by others
outside of the cgroup or throttled by the cgroup's cpu.max configuration.

Co-developed-by: Muchun Song 
Signed-off-by: Muchun Song 
Signed-off-by: Chengming Zhou 
Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Ingo Molnar 
Acked-by: Johannes Weiner 
Link: 
https://lkml.kernel.org/r/20210303034659.91735-2-zhouchengm...@bytedance.com
---
 include/linux/psi_types.h |  3 ++-
 kernel/sched/psi.c| 14 +++---
 2 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/include/linux/psi_types.h b/include/linux/psi_types.h
index b95f321..0a23300 100644
--- a/include/linux/psi_types.h
+++ b/include/linux/psi_types.h
@@ -50,9 +50,10 @@ enum psi_states {
PSI_MEM_SOME,
PSI_MEM_FULL,
PSI_CPU_SOME,
+   PSI_CPU_FULL,
/* Only per-CPU, to weigh the CPU in the global average: */
PSI_NONIDLE,
-   NR_PSI_STATES = 6,
+   NR_PSI_STATES = 7,
 };
 
 enum psi_aggregators {
diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
index 967732c..2293c45 100644
--- a/kernel/sched/psi.c
+++ b/kernel/sched/psi.c
@@ -34,7 +34,10 @@
  * delayed on that resource such that nobody is advancing and the CPU
  * goes idle. This leaves both workload and CPU unproductive.
  *
- * (Naturally, the FULL state doesn't exist for the CPU resource.)
+ * Naturally, the FULL state doesn't exist for the CPU resource at the
+ * system level, but exist at the cgroup level, means all non-idle tasks
+ * in a cgroup are delayed on the CPU resource which used by others outside
+ * of the cgroup or throttled by the cgroup cpu.max configuration.
  *
  * SOME = nr_delayed_tasks != 0
  * FULL = nr_delayed_tasks != 0 && nr_running_tasks == 0
@@ -225,6 +228,8 @@ static bool test_state(unsigned int *tasks, enum psi_states 
state)
return tasks[NR_MEMSTALL] && !tasks[NR_RUNNING];
case PSI_CPU_SOME:
return tasks[NR_RUNNING] > tasks[NR_ONCPU];
+   case PSI_CPU_FULL:
+   return tasks[NR_RUNNING] && !tasks[NR_ONCPU];
case PSI_NONIDLE:
return tasks[NR_IOWAIT] || tasks[NR_MEMSTALL] ||
tasks[NR_RUNNING];
@@ -678,8 +683,11 @@ static void record_times(struct psi_group_cpu *groupc, int 
cpu,
}
}
 
-   if (groupc->state_mask & (1 << PSI_CPU_SOME))
+   if (groupc->state_mask & (1 << PSI_CPU_SOME)) {
groupc->times[PSI_CPU_SOME] += delta;
+   if (groupc->state_mask & (1 << PSI_CPU_FULL))
+   groupc->times[PSI_CPU_FULL] += delta;
+   }
 
if (groupc->state_mask & (1 << PSI_NONIDLE))
groupc->times[PSI_NONIDLE] += delta;
@@ -1018,7 +1026,7 @@ int psi_show(struct seq_file *m, struct psi_group *group, 
enum psi_res res)
group->avg_next_update = update_averages(group, now);
mutex_unlock(&group->avgs_lock);
 
-   for (full = 0; full < 2 - (res == PSI_CPU); full++) {
+   for (full = 0; full < 2; full++) {
unsigned long avg[3];
u64 total;
int w;
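
Read concretely, the two predicates added above say: SOME means at least
one runnable task in the group is waiting behind whatever occupies the
CPU, while FULL means the group has runnable tasks but none of them is
on the CPU at all. A hedged userspace restatement with example counts
(not the kernel's test_state()):

	#include <stdbool.h>
	#include <stdio.h>

	static bool cpu_some(unsigned int nr_running, unsigned int nr_oncpu)
	{
		return nr_running > nr_oncpu;
	}

	static bool cpu_full(unsigned int nr_running, unsigned int nr_oncpu)
	{
		return nr_running && !nr_oncpu;
	}

	int main(void)
	{
		/* 2 runnable, 1 on-CPU: SOME but not FULL. */
		printf("some=%d full=%d\n", cpu_some(2, 1), cpu_full(2, 1));
		/* 2 runnable, 0 on-CPU (CPU held outside the cgroup): FULL. */
		printf("some=%d full=%d\n", cpu_some(2, 0), cpu_full(2, 0));
		return 0;
	}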


[tip: sched/core] psi: Pressure states are unlikely

2021-03-06 Thread tip-bot2 for Johannes Weiner
The following commit has been merged into the sched/core branch of tip:

Commit-ID: fddc8bab531e217806b84906681324377d741c6c
Gitweb:
https://git.kernel.org/tip/fddc8bab531e217806b84906681324377d741c6c
Author:Johannes Weiner 
AuthorDate:Wed, 03 Mar 2021 11:46:58 +08:00
Committer: Ingo Molnar 
CommitterDate: Sat, 06 Mar 2021 12:40:23 +01:00

psi: Pressure states are unlikely

Move the unlikely branches out of line. This eliminates undesirable
jumps during wakeup and sleeps for workloads that aren't under any
sort of resource pressure.

Signed-off-by: Johannes Weiner 
Signed-off-by: Chengming Zhou 
Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Ingo Molnar 
Link: 
https://lkml.kernel.org/r/20210303034659.91735-4-zhouchengm...@bytedance.com
---
 kernel/sched/psi.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
index 0fe6ff6..3907a6b 100644
--- a/kernel/sched/psi.c
+++ b/kernel/sched/psi.c
@@ -219,17 +219,17 @@ static bool test_state(unsigned int *tasks, enum 
psi_states state)
 {
switch (state) {
case PSI_IO_SOME:
-   return tasks[NR_IOWAIT];
+   return unlikely(tasks[NR_IOWAIT]);
case PSI_IO_FULL:
-   return tasks[NR_IOWAIT] && !tasks[NR_RUNNING];
+   return unlikely(tasks[NR_IOWAIT] && !tasks[NR_RUNNING]);
case PSI_MEM_SOME:
-   return tasks[NR_MEMSTALL];
+   return unlikely(tasks[NR_MEMSTALL]);
case PSI_MEM_FULL:
-   return tasks[NR_MEMSTALL] && !tasks[NR_RUNNING];
+   return unlikely(tasks[NR_MEMSTALL] && !tasks[NR_RUNNING]);
case PSI_CPU_SOME:
-   return tasks[NR_RUNNING] > tasks[NR_ONCPU];
+   return unlikely(tasks[NR_RUNNING] > tasks[NR_ONCPU]);
case PSI_CPU_FULL:
-   return tasks[NR_RUNNING] && !tasks[NR_ONCPU];
+   return unlikely(tasks[NR_RUNNING] && !tasks[NR_ONCPU]);
case PSI_NONIDLE:
return tasks[NR_IOWAIT] || tasks[NR_MEMSTALL] ||
tasks[NR_RUNNING];
@@ -729,7 +729,7 @@ static void psi_group_change(struct psi_group *group, int 
cpu,
 * task in a cgroup is in_memstall, the corresponding groupc
 * on that cpu is in PSI_MEM_FULL state.
 */
-   if (groupc->tasks[NR_ONCPU] && cpu_curr(cpu)->in_memstall)
+   if (unlikely(groupc->tasks[NR_ONCPU] && cpu_curr(cpu)->in_memstall))
state_mask |= (1 << PSI_MEM_FULL);
 
groupc->state_mask = state_mask;


[tip: sched/core] psi: Use ONCPU state tracking machinery to detect reclaim

2021-03-06 Thread tip-bot2 for Chengming Zhou
The following commit has been merged into the sched/core branch of tip:

Commit-ID: 7fae6c8171d20ac55402930ee8ae760cf85dff7b
Gitweb:
https://git.kernel.org/tip/7fae6c8171d20ac55402930ee8ae760cf85dff7b
Author:Chengming Zhou 
AuthorDate:Wed, 03 Mar 2021 11:46:57 +08:00
Committer: Ingo Molnar 
CommitterDate: Sat, 06 Mar 2021 12:40:22 +01:00

psi: Use ONCPU state tracking machinery to detect reclaim

Move the reclaim detection from the timer tick to the task state
tracking machinery using the recently added ONCPU state. And we
also add checking of task psi_flags changes in the psi_task_switch()
optimization to update the parents properly.

In terms of performance and cost, this ONCPU task state tracking
is not cheaper than the previous timer tick in aggregate. But the code is
simpler and shorter this way, so it's a maintainability win. And
Johannes did some testing with perf bench; the performance and cost
changes would be acceptable for real workloads.

Thanks to Johannes Weiner for pointing out the psi_task_switch()
optimization things and the clearer changelog.

Co-developed-by: Muchun Song 
Signed-off-by: Muchun Song 
Signed-off-by: Chengming Zhou 
Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Ingo Molnar 
Acked-by: Johannes Weiner 
Link: 
https://lkml.kernel.org/r/20210303034659.91735-3-zhouchengm...@bytedance.com
---
 include/linux/psi.h  |  1 +-
 kernel/sched/core.c  |  1 +-
 kernel/sched/psi.c   | 65 +++
 kernel/sched/stats.h |  9 +--
 4 files changed, 24 insertions(+), 52 deletions(-)

diff --git a/include/linux/psi.h b/include/linux/psi.h
index 7361023..65eb147 100644
--- a/include/linux/psi.h
+++ b/include/linux/psi.h
@@ -20,7 +20,6 @@ void psi_task_change(struct task_struct *task, int clear, int 
set);
 void psi_task_switch(struct task_struct *prev, struct task_struct *next,
 bool sleep);
 
-void psi_memstall_tick(struct task_struct *task, int cpu);
 void psi_memstall_enter(unsigned long *flags);
 void psi_memstall_leave(unsigned long *flags);
 
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 361974e..d2629fd 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -4551,7 +4551,6 @@ void scheduler_tick(void)
update_thermal_load_avg(rq_clock_thermal(rq), rq, thermal_pressure);
curr->sched_class->task_tick(rq, curr, 0);
calc_global_load_tick(rq);
-   psi_task_tick(rq);
 
rq_unlock(rq, &rf);
 
diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
index 2293c45..0fe6ff6 100644
--- a/kernel/sched/psi.c
+++ b/kernel/sched/psi.c
@@ -644,8 +644,7 @@ static void poll_timer_fn(struct timer_list *t)
wake_up_interruptible(&group->poll_wait);
 }
 
-static void record_times(struct psi_group_cpu *groupc, int cpu,
-bool memstall_tick)
+static void record_times(struct psi_group_cpu *groupc, int cpu)
 {
u32 delta;
u64 now;
@@ -664,23 +663,6 @@ static void record_times(struct psi_group_cpu *groupc, int 
cpu,
groupc->times[PSI_MEM_SOME] += delta;
if (groupc->state_mask & (1 << PSI_MEM_FULL))
groupc->times[PSI_MEM_FULL] += delta;
-   else if (memstall_tick) {
-   u32 sample;
-   /*
-* Since we care about lost potential, a
-* memstall is FULL when there are no other
-* working tasks, but also when the CPU is
-* actively reclaiming and nothing productive
-* could run even if it were runnable.
-*
-* When the timer tick sees a reclaiming CPU,
-* regardless of runnable tasks, sample a FULL
-* tick (or less if it hasn't been a full tick
-* since the last state change).
-*/
-   sample = min(delta, (u32)jiffies_to_nsecs(1));
-   groupc->times[PSI_MEM_FULL] += sample;
-   }
}
 
if (groupc->state_mask & (1 << PSI_CPU_SOME)) {
@@ -714,7 +696,7 @@ static void psi_group_change(struct psi_group *group, int 
cpu,
 */
write_seqcount_begin(&groupc->seq);
 
-   record_times(groupc, cpu, false);
+   record_times(groupc, cpu);
 
for (t = 0, m = clear; m; m &= ~(1 << t), t++) {
if (!(m & (1 << t)))
@@ -738,6 +720,18 @@ static void psi_group_change(struct psi_group *group, int 
cpu,
if (test_state(groupc->tasks, s))
state_mask |= (1 << s);
}
+
+   /*
+* Since we care about lost potential, a memstall is FULL
+* when there are no other working tasks, but also when
+* the CPU is actively reclaiming and nothing productive
+* could run even if it were runnable. So when the current
+* task in a cgroup is in_memstall, the corresponding groupc
+* on that cpu is in PSI_MEM_FULL state.
+*/

[tip: sched/core] sched/topology: fix the issue groups don't span domain->span for NUMA diameter > 2

2021-03-06 Thread tip-bot2 for Barry Song
The following commit has been merged into the sched/core branch of tip:

Commit-ID: 585b6d2723dc927ebc4ad884c4e879e4da8bc21f
Gitweb:
https://git.kernel.org/tip/585b6d2723dc927ebc4ad884c4e879e4da8bc21f
Author:Barry Song 
AuthorDate:Wed, 24 Feb 2021 16:09:44 +13:00
Committer: Ingo Molnar 
CommitterDate: Sat, 06 Mar 2021 12:40:22 +01:00

sched/topology: fix the issue groups don't span domain->span for NUMA diameter > 2

As long as the NUMA diameter is > 2, building a sched_domain from a
sibling's child domain will definitely create a sched_domain with a
sched_group which will span out of the sched_domain:

   +--+ +--++---+   +--+
   | node |  12 |node  | 20 | node  |  12   |node  |
   |  0   +-+1 ++ 2 +---+3 |
   +--+ +--++---+   +--+

domain0node0node1node2  node3

domain1node0+1  node0+1  node2+3node2+3
 +
domain2node0+1+2 |
 group: node0+1  |
   group:node2+3 <---+

When node2 is added into the domain2 of node0, the kernel uses the child
domain of node2's domain2, which is domain1 (node2+3). Node3 is outside
the span of the domain, which includes only node0+1+2.

This will make load_balance() run based on screwed avg_load and group_type
in the sched_group spanning out of the sched_domain, and it also makes
select_task_rq_fair() pick an idle CPU outside the sched_domain.

Real servers which suffer from this problem include Kunpeng920 and 8-node
Sun Fire X4600-M2, at least.

Here we move to using the *child* domain of the *child* domain of node2's
domain2 as the newly added sched_group. At the same time, we re-use the
lower-level sgc directly.
   +--+ +--++---+   +--+
   | node |  12 |node  | 20 | node  |  12   |node  |
   |  0   +-+1 ++ 2 +---+3 |
   +--+ +--++---+   +--+

domain0node0node1  +- node2  node3
   |
domain1node0+1  node0+1| node2+3node2+3
   |
domain2node0+1+2   |
 group: node0+1|
   group:node2 <---+

While the lower level sgc is re-used, this patch only changes the remote
sched_groups for those sched_domains playing the grandchild trick;
therefore, sgc->next_update is still safe, since it's only touched by CPUs
that have the group span as their local group. And sgc->imbalance is also
safe, because sd_parent remains the same in load_balance() and LB only
tries other CPUs from the local group.

Moreover, since local groups are not touched, they still get roughly
equal sizes in a TL. And should_we_balance() only matters for local
groups, so the pull probability of those groups is still roughly equal.

Tested by the below topology:
qemu-system-aarch64  -M virt -nographic \
 -smp cpus=8 \
 -numa node,cpus=0-1,nodeid=0 \
 -numa node,cpus=2-3,nodeid=1 \
 -numa node,cpus=4-5,nodeid=2 \
 -numa node,cpus=6-7,nodeid=3 \
 -numa dist,src=0,dst=1,val=12 \
 -numa dist,src=0,dst=2,val=20 \
 -numa dist,src=0,dst=3,val=22 \
 -numa dist,src=1,dst=2,val=22 \
 -numa dist,src=2,dst=3,val=12 \
 -numa dist,src=1,dst=3,val=24 \
 -m 4G -cpu cortex-a57 -kernel arch/arm64/boot/Image

w/o patch, we get lots of "groups don't span domain->span":
[0.802139] CPU0 attaching sched-domain(s):
[0.802193]  domain-0: span=0-1 level=MC
[0.802443]   groups: 0:{ span=0 cap=1013 }, 1:{ span=1 cap=979 }
[0.802693]   domain-1: span=0-3 level=NUMA
[0.802731]groups: 0:{ span=0-1 cap=1992 }, 2:{ span=2-3 cap=1943 }
[0.802811]domain-2: span=0-5 level=NUMA
[0.802829] groups: 0:{ span=0-3 cap=3935 }, 4:{ span=4-7 cap=3937 }
[0.802881] ERROR: groups don't span domain->span
[0.803058] domain-3: span=0-7 level=NUMA
[0.803080]  groups: 0:{ span=0-5 mask=0-1 cap=5843 }, 6:{ span=4-7 
mask=6-7 cap=4077 }
[0.804055] CPU1 attaching sched-domain(s):
[0.804072]  domain-0: span=0-1 level=MC
[0.804096]   groups: 1:{ span=1 cap=979 }, 0:{ span=0 cap=1013 }
[0.804152]   domain-1: span=0-3 level=NUMA
[0.804170]groups: 0:{ span=0-1 cap=1992 }, 2:{ span=2-3 cap=1943 }
[0.804219]domain-2: span=0-5 level=NUMA
[0.804236] groups: 0:{ span=0-3 cap=3935 }, 4:{ span=4-7 cap=3937 }
[0.804302] ERROR: groups don't span domain->span
[0.804520] domain-3: span=0-7 level=NUMA
[0.804546]  groups: 0:{ span=0-5 mask=0-1 cap=5843 }, 6:{ span=4-7 
mask=6-7 cap=4077 }
[0.804677] CPU2 attaching sched-domain(s):

[tip: sched/core] cpu/hotplug: Allowing to reset fail injection

2021-03-06 Thread tip-bot2 for Vincent Donnefort
The following commit has been merged into the sched/core branch of tip:

Commit-ID: 3ae70c251f344976428d1f6ee61ea7b4e170fec3
Gitweb:
https://git.kernel.org/tip/3ae70c251f344976428d1f6ee61ea7b4e170fec3
Author:Vincent Donnefort 
AuthorDate:Tue, 16 Feb 2021 10:35:04 
Committer: Ingo Molnar 
CommitterDate: Sat, 06 Mar 2021 12:40:22 +01:00

cpu/hotplug: Allowing to reset fail injection

Currently, the only way of resetting the fail injection is to trigger a
hotplug, hotunplug or both. This is rather annoying for testing
and, as the default value for this file is -1, it seems pretty natural to
let a user write it.

Signed-off-by: Vincent Donnefort 
Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Ingo Molnar 
Link: 
https://lkml.kernel.org/r/20210216103506.416286-2-vincent.donnef...@arm.com
---
 kernel/cpu.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/kernel/cpu.c b/kernel/cpu.c
index 1b6302e..9121edf 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -2207,6 +2207,11 @@ static ssize_t write_cpuhp_fail(struct device *dev,
if (ret)
return ret;
 
+   if (fail == CPUHP_INVALID) {
+   st->fail = fail;
+   return count;
+   }
+
if (fail < CPUHP_OFFLINE || fail > CPUHP_ONLINE)
return -EINVAL;
 


[tip: sched/core] cpu/hotplug: Add cpuhp_invoke_callback_range()

2021-03-06 Thread tip-bot2 for Vincent Donnefort
The following commit has been merged into the sched/core branch of tip:

Commit-ID: 453e41085183980087f8a80dada523caf1131c3c
Gitweb:
https://git.kernel.org/tip/453e41085183980087f8a80dada523caf1131c3c
Author:Vincent Donnefort 
AuthorDate:Tue, 16 Feb 2021 10:35:06 
Committer: Ingo Molnar 
CommitterDate: Sat, 06 Mar 2021 12:40:22 +01:00

cpu/hotplug: Add cpuhp_invoke_callback_range()

Factorize and unify the cpuhp callback range invocations, especially for
the hotunplug path, where two different ways of decrementing were used. The
first one decrements before the callback is called:

 cpuhp_thread_fun()
 state = st->state;
 st->state--;
 cpuhp_invoke_callback(state);

The second one, after:

 take_down_cpu()|cpuhp_down_callbacks()
 cpuhp_invoke_callback(st->state);
 st->state--;

This is problematic for rolling back the steps in case of error, as,
depending on the decrement, the rollback will start from N or N-1. It also
makes tracing inconsistent between steps run in the cpuhp thread and
the others.

Additionally, avoid useless cpuhp_thread_fun() loops by skipping empty
steps.

Signed-off-by: Vincent Donnefort 
Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Ingo Molnar 
Link: 
https://lkml.kernel.org/r/20210216103506.416286-4-vincent.donnef...@arm.com
---
 kernel/cpu.c | 170 ++
 1 file changed, 102 insertions(+), 68 deletions(-)

diff --git a/kernel/cpu.c b/kernel/cpu.c
index 680ed8f..23505d6 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -135,6 +135,11 @@ static struct cpuhp_step *cpuhp_get_step(enum cpuhp_state 
state)
return cpuhp_hp_states + state;
 }
 
+static bool cpuhp_step_empty(bool bringup, struct cpuhp_step *step)
+{
+   return bringup ? !step->startup.single : !step->teardown.single;
+}
+
 /**
  * cpuhp_invoke_callback _ Invoke the callbacks for a given state
  * @cpu:   The cpu for which the callback should be invoked
@@ -157,26 +162,24 @@ static int cpuhp_invoke_callback(unsigned int cpu, enum 
cpuhp_state state,
 
if (st->fail == state) {
st->fail = CPUHP_INVALID;
-
-   if (!(bringup ? step->startup.single : step->teardown.single))
-   return 0;
-
return -EAGAIN;
}
 
+   if (cpuhp_step_empty(bringup, step)) {
+   WARN_ON_ONCE(1);
+   return 0;
+   }
+
if (!step->multi_instance) {
WARN_ON_ONCE(lastp && *lastp);
cb = bringup ? step->startup.single : step->teardown.single;
-   if (!cb)
-   return 0;
+
trace_cpuhp_enter(cpu, st->target, state, cb);
ret = cb(cpu);
trace_cpuhp_exit(cpu, st->state, state, ret);
return ret;
}
cbm = bringup ? step->startup.multi : step->teardown.multi;
-   if (!cbm)
-   return 0;
 
/* Single invocation for instance add/remove */
if (node) {
@@ -475,6 +478,15 @@ cpuhp_set_state(struct cpuhp_cpu_state *st, enum 
cpuhp_state target)
 static inline void
 cpuhp_reset_state(struct cpuhp_cpu_state *st, enum cpuhp_state prev_state)
 {
+   st->target = prev_state;
+
+   /*
+* Already rolling back. No need invert the bringup value or to change
+* the current state.
+*/
+   if (st->rollback)
+   return;
+
st->rollback = true;
 
/*
@@ -488,7 +500,6 @@ cpuhp_reset_state(struct cpuhp_cpu_state *st, enum 
cpuhp_state prev_state)
st->state++;
}
 
-   st->target = prev_state;
st->bringup = !st->bringup;
 }
 
@@ -591,10 +602,53 @@ static int finish_cpu(unsigned int cpu)
  * Hotplug state machine related functions
  */
 
-static void undo_cpu_up(unsigned int cpu, struct cpuhp_cpu_state *st)
+/*
+ * Get the next state to run. Empty ones will be skipped. Returns true if a
+ * state must be run.
+ *
+ * st->state will be modified ahead of time, to match state_to_run, as if it
+ * has already ran.
+ */
+static bool cpuhp_next_state(bool bringup,
+enum cpuhp_state *state_to_run,
+struct cpuhp_cpu_state *st,
+enum cpuhp_state target)
 {
-   for (st->state--; st->state > st->target; st->state--)
-   cpuhp_invoke_callback(cpu, st->state, false, NULL, NULL);
+   do {
+   if (bringup) {
+   if (st->state >= target)
+   return false;
+
+   *state_to_run = ++st->state;
+   } else {
+   if (st->state <= target)
+   return false;
+
+   *state_to_run = st->state--;
+   }
+
+   if (!cpuhp_step_empty(bringup, cpuhp_get_step(*state_to_run)))
+   break;
+   } while (true);
+
+   return true;
+}

[tip: sched/core] sched/fair: Fix shift-out-of-bounds in load_balance()

2021-03-06 Thread tip-bot2 for Valentin Schneider
The following commit has been merged into the sched/core branch of tip:

Commit-ID: 39a2a6eb5c9b66ea7c8055026303b3aa681b49a5
Gitweb:
https://git.kernel.org/tip/39a2a6eb5c9b66ea7c8055026303b3aa681b49a5
Author:Valentin Schneider 
AuthorDate:Thu, 25 Feb 2021 17:56:56 
Committer: Ingo Molnar 
CommitterDate: Sat, 06 Mar 2021 12:40:22 +01:00

sched/fair: Fix shift-out-of-bounds in load_balance()

Syzbot reported a handful of occurrences where sd->nr_balance_failed can
grow to much higher values than one would expect.

A successful load_balance() resets it to 0; a failed one increments
it. Once it gets to sd->cache_nice_tries + 3, this *should* trigger an
active balance, which will either set it to sd->cache_nice_tries+1 or reset
it to 0. However, in case the to-be-active-balanced task is not allowed to
run on env->dst_cpu, then the increment is done without any further
modification.

This could then be repeated ad nauseam, and would explain the absurdly high
values reported by syzbot (86, 149). VincentG noted there is value in
letting sd->cache_nice_tries grow, so the shift itself should be
fixed. That means preventing:

  """
  If the value of the right operand is negative or is greater than or equal
  to the width of the promoted left operand, the behavior is undefined.
  """

Thus we need to cap the shift exponent to
  BITS_PER_TYPE(typeof(lefthand)) - 1.

I had a look around for other similar cases via coccinelle:

  @expr@
  position pos;
  expression E1;
  expression E2;
  @@
  (
  E1 >> E2@pos
  |
  E1 << E2@pos
  )

  @cst depends on expr@
  position pos;
  expression expr.E1;
  constant cst;
  @@
  (
  E1 >> cst@pos
  |
  E1 << cst@pos
  )

  @script:python depends on !cst@
  pos << expr.pos;
  exp << expr.E2;
  @@
  # Dirty hack to ignore constexpr
  if exp.upper() != exp:
 coccilib.report.print_report(pos[0], "Possible UB shift here")

The only other match in kernel/sched is rq_clock_thermal() which employs
sched_thermal_decay_shift, and that exponent is already capped to 10, so
that one is fine.

Fixes: 5a7f55590467 ("sched/fair: Relax constraint on task's load during load 
balance")
Reported-by: syzbot+d7581744d5fd27c9f...@syzkaller.appspotmail.com
Signed-off-by: Valentin Schneider 
Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Ingo Molnar 
Link: http://lore.kernel.org/r/ffac1205b9a21...@google.com
---
 kernel/sched/fair.c  | 3 +--
 kernel/sched/sched.h | 7 +++
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 7b2fac0..1af51a6 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7722,8 +7722,7 @@ static int detach_tasks(struct lb_env *env)
 * scheduler fails to find a good waiting task to
 * migrate.
 */
-
-   if ((load >> env->sd->nr_balance_failed) > 
env->imbalance)
+   if (shr_bound(load, env->sd->nr_balance_failed) > 
env->imbalance)
goto next;
 
env->imbalance -= load;
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 0ddc9a6..bb8bb06 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -205,6 +205,13 @@ static inline void update_avg(u64 *avg, u64 sample)
 }
 
 /*
+ * Shifting a value by an exponent greater *or equal* to the size of said value
+ * is UB; cap at size-1.
+ */
+#define shr_bound(val, shift)  \
+   (val >> min_t(typeof(shift), shift, BITS_PER_TYPE(typeof(val)) - 1))
+
+/*
  * !! For sched_setattr_nocheck() (kernel) only !!
  *
  * This is actually gross. :(
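
The effect of the cap is easy to demonstrate outside the kernel. A
standalone sketch with shr_bound() reimplemented locally (the kernel
version above uses min_t()/BITS_PER_TYPE(); this local rewrite is an
assumption-free equivalent for illustration):

	#include <limits.h>
	#include <stdio.h>

	#define BITS(v)	((int)(sizeof(v) * CHAR_BIT))
	/* Clamp the exponent to width-1 so the shift is always defined. */
	#define shr_bound(val, shift) \
		((val) >> ((shift) < BITS(val) ? (shift) : BITS(val) - 1))

	int main(void)
	{
		unsigned long load = 1024;
		int nr_balance_failed = 86;	/* a value syzbot observed */

		/* load >> 86 on a 64-bit long is undefined behaviour;
		 * shr_bound() clamps the exponent to 63 and yields 0. */
		printf("%lu\n", shr_bound(load, nr_balance_failed));
		printf("%lu\n", shr_bound(load, 4));	/* normal case: 64 */
		return 0;
	}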


[tip: sched/core] cpu/hotplug: CPUHP_BRINGUP_CPU failure exception

2021-03-06 Thread tip-bot2 for Vincent Donnefort
The following commit has been merged into the sched/core branch of tip:

Commit-ID: 62f250694092dd5fef9900dc3126f07110bf9d48
Gitweb:
https://git.kernel.org/tip/62f250694092dd5fef9900dc3126f07110bf9d48
Author:Vincent Donnefort 
AuthorDate:Tue, 16 Feb 2021 10:35:05 
Committer: Ingo Molnar 
CommitterDate: Sat, 06 Mar 2021 12:40:22 +01:00

cpu/hotplug: CPUHP_BRINGUP_CPU failure exception

The atomic states (between CPUHP_AP_IDLE_DEAD and CPUHP_AP_ONLINE) are
triggered by the CPUHP_BRINGUP_CPU step. If the latter fails, no atomic
state can be rolled back.

DEAD callbacks too can't fail and disallow recovery. As a consequence,
during hotunplug, the fail injection interface should prohibit all states
from CPUHP_BRINGUP_CPU to CPUHP_ONLINE.

Signed-off-by: Vincent Donnefort 
Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Ingo Molnar 
Link: 
https://lkml.kernel.org/r/20210216103506.416286-3-vincent.donnef...@arm.com
---
 kernel/cpu.c | 19 ---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/kernel/cpu.c b/kernel/cpu.c
index 9121edf..680ed8f 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -1045,9 +1045,13 @@ static int __ref _cpu_down(unsigned int cpu, int 
tasks_frozen,
 * to do the further cleanups.
 */
ret = cpuhp_down_callbacks(cpu, st, target);
-   if (ret && st->state == CPUHP_TEARDOWN_CPU && st->state < prev_state) {
-   cpuhp_reset_state(st, prev_state);
-   __cpuhp_kick_ap(st);
+   if (ret && st->state < prev_state) {
+   if (st->state == CPUHP_TEARDOWN_CPU) {
+   cpuhp_reset_state(st, prev_state);
+   __cpuhp_kick_ap(st);
+   } else {
+   WARN(1, "DEAD callback error for CPU%d", cpu);
+   }
}
 
 out:
@@ -,6 +2226,15 @@ static ssize_t write_cpuhp_fail(struct device *dev,
return -EINVAL;
 
/*
+* DEAD callbacks cannot fail...
+* ... neither can CPUHP_BRINGUP_CPU during hotunplug. The latter
+* triggering STARTING callbacks, a failure in this state would
+* hinder rollback.
+*/
+   if (fail <= CPUHP_BRINGUP_CPU && st->state > CPUHP_BRINGUP_CPU)
+   return -EINVAL;
+
+   /*
 * Cannot fail anything that doesn't have callbacks.
 */
mutex_lock(&cpuhp_state_mutex);


[tip: sched/core] sched/pelt: Fix task util_est update filtering

2021-03-06 Thread tip-bot2 for Vincent Donnefort
The following commit has been merged into the sched/core branch of tip:

Commit-ID: b89997aa88f0b07d8a6414c908af75062103b8c9
Gitweb:
https://git.kernel.org/tip/b89997aa88f0b07d8a6414c908af75062103b8c9
Author:Vincent Donnefort 
AuthorDate:Thu, 25 Feb 2021 16:58:20 
Committer: Ingo Molnar 
CommitterDate: Sat, 06 Mar 2021 12:40:22 +01:00

sched/pelt: Fix task util_est update filtering

Being called for each dequeue, util_est reduces the number of its updates
by filtering out updates when the EWMA signal differs from the task's
util_avg by less than 1%. This is a problem for a sudden util_avg ramp-up:
due to the decay from a previous high util_avg, the EWMA might now be
close enough to the new util_avg that no update happens, leaving
ue.enqueued with an out-of-date value.

Taking both util_est members, EWMA and enqueued, into consideration for
the filtering ensures an up-to-date value for both.

This is for now an issue only for the trace probe that might return the
stale value. Functional-wise, it isn't a problem, as the value is always
accessed through max(enqueued, ewma).

This problem has been observed using LISA's UtilConvergence:test_means on
the sd845c board.

No regression observed with Hackbench on sd845c and Perf-bench sched pipe
on hikey/hikey960.

Signed-off-by: Vincent Donnefort 
Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Ingo Molnar 
Reviewed-by: Dietmar Eggemann 
Reviewed-by: Vincent Guittot 
Link: 
https://lkml.kernel.org/r/20210225165820.1377125-1-vincent.donnef...@arm.com
---
 kernel/sched/fair.c | 15 ---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 1af51a6..f5d6541 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3941,6 +3941,8 @@ static inline void util_est_dequeue(struct cfs_rq *cfs_rq,
trace_sched_util_est_cfs_tp(cfs_rq);
 }
 
+#define UTIL_EST_MARGIN (SCHED_CAPACITY_SCALE / 100)
+
 /*
  * Check if a (signed) value is within a specified (unsigned) margin,
  * based on the observation that:
@@ -3958,7 +3960,7 @@ static inline void util_est_update(struct cfs_rq *cfs_rq,
   struct task_struct *p,
   bool task_sleep)
 {
-   long last_ewma_diff;
+   long last_ewma_diff, last_enqueued_diff;
struct util_est ue;
 
if (!sched_feat(UTIL_EST))
@@ -3979,6 +3981,8 @@ static inline void util_est_update(struct cfs_rq *cfs_rq,
if (ue.enqueued & UTIL_AVG_UNCHANGED)
return;
 
+   last_enqueued_diff = ue.enqueued;
+
/*
 * Reset EWMA on utilization increases, the moving average is used only
 * to smooth utilization decreases.
@@ -3992,12 +3996,17 @@ static inline void util_est_update(struct cfs_rq 
*cfs_rq,
}
 
/*
-* Skip update of task's estimated utilization when its EWMA is
+* Skip update of task's estimated utilization when its members are
 * already ~1% close to its last activation value.
 */
last_ewma_diff = ue.enqueued - ue.ewma;
-   if (within_margin(last_ewma_diff, (SCHED_CAPACITY_SCALE / 100)))
+   last_enqueued_diff -= ue.enqueued;
+   if (within_margin(last_ewma_diff, UTIL_EST_MARGIN)) {
+   if (!within_margin(last_enqueued_diff, UTIL_EST_MARGIN))
+   goto done;
+
return;
+   }
 
/*
 * To avoid overestimation of actual task utilization, skip updates if


[tip: sched/core] sched/fair: use lsub_positive in cpu_util_next()

2021-03-06 Thread tip-bot2 for Vincent Donnefort
The following commit has been merged into the sched/core branch of tip:

Commit-ID: 736cc6b31102236a55470c72523ed0a65eb3f804
Gitweb:
https://git.kernel.org/tip/736cc6b31102236a55470c72523ed0a65eb3f804
Author:Vincent Donnefort 
AuthorDate:Thu, 25 Feb 2021 08:36:12 
Committer: Ingo Molnar 
CommitterDate: Sat, 06 Mar 2021 12:40:22 +01:00

sched/fair: use lsub_positive in cpu_util_next()

The local version of sub_positive() saves an explicit load-store pair
and is sufficient for the cpu_util_next() usage.
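
For reference, the two helpers as defined in kernel/sched/fair.c
(reproduced here as a sketch):

/* Unsigned subtract and clamp on underflow, on a shared *_avg field:
 * the READ_ONCE()/WRITE_ONCE() pair is the explicit load-store. */
#define sub_positive(_ptr, _val) do {				\
	typeof(_ptr) ptr = (_ptr);				\
	typeof(*ptr) val = (_val);				\
	typeof(*ptr) res, var = READ_ONCE(*ptr);		\
	res = var - val;					\
	if (res > var)						\
		res = 0;					\
	WRITE_ONCE(*ptr, res);					\
} while (0)

/* Variant for a local variable: no load-store pair needed, which is
 * all cpu_util_next() requires for its on-stack util value. */
#define lsub_positive(_ptr, _val) do {				\
	typeof(_ptr) ptr = (_ptr);				\
	*ptr -= min_t(typeof(*ptr), *ptr, _val);		\
} while (0)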

Signed-off-by: Vincent Donnefort 
Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Ingo Molnar 
Reviewed-by: Quentin Perret 
Reviewed-by: Dietmar Eggemann 
Link: 
https://lkml.kernel.org/r/20210225083612.1113823-3-vincent.donnef...@arm.com
---
 kernel/sched/fair.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index b994db9..7b2fac0 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6471,7 +6471,7 @@ static unsigned long cpu_util_next(int cpu, struct 
task_struct *p, int dst_cpu)
 * util_avg should already be correct.
 */
if (task_cpu(p) == cpu && dst_cpu != cpu)
-   sub_positive(&util, task_util(p));
+   lsub_positive(, task_util(p));
else if (task_cpu(p) != cpu && dst_cpu == cpu)
util += task_util(p);
 


[tip: sched/core] sched/fair: Reduce the window for duplicated update

2021-03-06 Thread tip-bot2 for Vincent Guittot
The following commit has been merged into the sched/core branch of tip:

Commit-ID: 39b6a429c30482c349f1bb3746470fe473cbdb0f
Gitweb:
https://git.kernel.org/tip/39b6a429c30482c349f1bb3746470fe473cbdb0f
Author:Vincent Guittot 
AuthorDate:Wed, 24 Feb 2021 14:30:07 +01:00
Committer: Ingo Molnar 
CommitterDate: Sat, 06 Mar 2021 12:40:22 +01:00

sched/fair: Reduce the window for duplicated update

Start updating last_blocked_load_update_tick earlier, to reduce the
window in which another CPU can start the same update one more time.
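
A sketch of the window being narrowed (simplified; the stamp used to
be written only after the whole update had finished):

/*
 * Before: CPU B could pass the time_after() check while CPU A was
 * still decaying blocked load, and start the same update again:
 *
 *   CPU A: rq_lock(rq); ...long decay loop...; tick = jiffies
 *   CPU B:              time_after(jiffies, tick) -> true, duplicate
 *
 * After: the stamp moves to the top of update_blocked_averages(),
 * under the rq lock, so CPU B sees a fresh tick for nearly the whole
 * duration of the update and skips it.
 */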

Signed-off-by: Vincent Guittot 
Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Ingo Molnar 
Reviewed-by: Valentin Schneider 
Link: 
https://lkml.kernel.org/r/20210224133007.28644-8-vincent.guit...@linaro.org
---
 kernel/sched/fair.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index e87e1b3..f1b55f9 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7852,16 +7852,20 @@ static inline bool others_have_blocked(struct rq *rq)
return false;
 }
 
-static inline void update_blocked_load_status(struct rq *rq, bool has_blocked)
+static inline void update_blocked_load_tick(struct rq *rq)
 {
-   rq->last_blocked_load_update_tick = jiffies;
+   WRITE_ONCE(rq->last_blocked_load_update_tick, jiffies);
+}
 
+static inline void update_blocked_load_status(struct rq *rq, bool has_blocked)
+{
if (!has_blocked)
rq->has_blocked_load = 0;
 }
 #else
 static inline bool cfs_rq_has_blocked(struct cfs_rq *cfs_rq) { return false; }
 static inline bool others_have_blocked(struct rq *rq) { return false; }
+static inline void update_blocked_load_tick(struct rq *rq) {}
 static inline void update_blocked_load_status(struct rq *rq, bool has_blocked) 
{}
 #endif
 
@@ -8022,6 +8026,7 @@ static void update_blocked_averages(int cpu)
struct rq_flags rf;
 
rq_lock_irqsave(rq, &rf);
+   update_blocked_load_tick(rq);
update_rq_clock(rq);
 
decayed |= __update_blocked_others(rq, );
@@ -8363,7 +8368,7 @@ static bool update_nohz_stats(struct rq *rq)
if (!cpumask_test_cpu(cpu, nohz.idle_cpus_mask))
return false;
 
-   if (!time_after(jiffies, rq->last_blocked_load_update_tick))
+   if (!time_after(jiffies, READ_ONCE(rq->last_blocked_load_update_tick)))
return true;
 
update_blocked_averages(cpu);


[tip: sched/core] sched/fair: Fix task utilization accountability in compute_energy()

2021-03-06 Thread tip-bot2 for Vincent Donnefort
The following commit has been merged into the sched/core branch of tip:

Commit-ID: 0372e1cf70c28de6babcba38ef97b6ae3400b101
Gitweb:
https://git.kernel.org/tip/0372e1cf70c28de6babcba38ef97b6ae3400b101
Author:Vincent Donnefort 
AuthorDate:Thu, 25 Feb 2021 08:36:11 
Committer: Ingo Molnar 
CommitterDate: Sat, 06 Mar 2021 12:40:22 +01:00

sched/fair: Fix task utilization accountability in compute_energy()

find_energy_efficient_cpu() (feec()) computes for each perf_domain (pd) an
energy delta as follows:

  feec(task)
for_each_pd
  base_energy = compute_energy(task, -1, pd)
-> for_each_cpu(pd)
   -> cpu_util_next(cpu, task, -1)

  energy_delta = compute_energy(task, dst_cpu, pd)
-> for_each_cpu(pd)
   -> cpu_util_next(cpu, task, dst_cpu)
  energy_delta -= base_energy

Then it picks the best CPU as being the one that minimizes energy_delta.

cpu_util_next() estimates the CPU utilization that would happen if the
task was placed on dst_cpu as follows:

  max(cpu_util + task_util, cpu_util_est + _task_util_est)

The task contribution to the energy delta can then be either:

  (1) _task_util_est, on a mostly idle CPU, where cpu_util is close to 0
  and _task_util_est > cpu_util.
  (2) task_util, on a mostly busy CPU, where cpu_util > _task_util_est.

  (cpu_util_est doesn't appear here. It is 0 when a CPU is idle and
   otherwise must be small enough so that feec() takes the CPU as a
   potential target for the task placement)

This is problematic for feec(), as cpu_util_next() might give an unfair
advantage to a CPU which is mostly busy (2) compared to one which is
mostly idle (1). Since _task_util_est is always bigger than task_util in
feec() (as the task is waking up), the task contribution to the energy
might look smaller on certain CPUs (2), which breaks the energy
comparison.

This issue is, moreover, not sporadic. By starving idle CPUs, it keeps
their cpu_util < _task_util_est (1) while others will maintain cpu_util >
_task_util_est (2).

Fix this problem by always using max(task_util, _task_util_est) as a task
contribution to the energy (ENERGY_UTIL). The new estimated CPU
utilization for the energy would then be:

  max(cpu_util, cpu_util_est) + max(task_util, _task_util_est)

compute_energy() still needs to know which OPP would be selected if the
task would be migrated in the perf_domain (FREQUENCY_UTIL). Hence,
cpu_util_next() is still used to estimate the maximum util within the pd.
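
A worked example of the resulting unfairness, with illustrative
numbers (task_util = 50 and _task_util_est = 200 for the waking task):

/*
 * Mostly idle CPU:  cpu_util = 0,   cpu_util_est = 0
 *   base = max(0, 0)                 = 0
 *   next = max(0 + 50, 0 + 200)      = 200
 *   task contribution                = 200  (= _task_util_est)
 *
 * Mostly busy CPU:  cpu_util = 400, cpu_util_est = 100
 *   base = max(400, 100)             = 400
 *   next = max(400 + 50, 100 + 200)  = 450
 *   task contribution                = 50   (= task_util)
 *
 * The same task looks 4x cheaper on the busy CPU, purely because
 * task_util hides behind the larger cpu_util.
 */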

Signed-off-by: Vincent Donnefort 
Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Ingo Molnar 
Reviewed-by: Quentin Perret 
Reviewed-by: Dietmar Eggemann 
Link: 
https://lkml.kernel.org/r/20210225083612.1113823-2-vincent.donnef...@arm.com
---
 kernel/sched/fair.c | 24 ++++++++++++++++++++----
 1 file changed, 20 insertions(+), 4 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index f1b55f9..b994db9 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6518,8 +6518,24 @@ compute_energy(struct task_struct *p, int dst_cpu, 
struct perf_domain *pd)
 * its pd list and will not be accounted by compute_energy().
 */
for_each_cpu_and(cpu, pd_mask, cpu_online_mask) {
-   unsigned long cpu_util, util_cfs = cpu_util_next(cpu, p, 
dst_cpu);
-   struct task_struct *tsk = cpu == dst_cpu ? p : NULL;
+   unsigned long util_freq = cpu_util_next(cpu, p, dst_cpu);
+   unsigned long cpu_util, util_running = util_freq;
+   struct task_struct *tsk = NULL;
+
+   /*
+* When @p is placed on @cpu:
+*
+* util_running = max(cpu_util, cpu_util_est) +
+*max(task_util, _task_util_est)
+*
+* while cpu_util_next is: max(cpu_util + task_util,
+* cpu_util_est + _task_util_est)
+*/
+   if (cpu == dst_cpu) {
+   tsk = p;
+   util_running =
+   cpu_util_next(cpu, p, -1) + task_util_est(p);
+   }
 
/*
 * Busy time computation: utilization clamping is not
@@ -6527,7 +6543,7 @@ compute_energy(struct task_struct *p, int dst_cpu, struct 
perf_domain *pd)
 * is already enough to scale the EM reported power
 * consumption at the (eventually clamped) cpu_capacity.
 */
-   sum_util += effective_cpu_util(cpu, util_cfs, cpu_cap,
+   sum_util += effective_cpu_util(cpu, util_running, cpu_cap,
   ENERGY_UTIL, NULL);
 
/*
@@ -6537,7 +6553,7 @@ compute_energy(struct task_struct *p, int dst_cpu, struct 
perf_domain *pd)
 * NOTE: in case RT tasks are running, by default the
 * FREQUENCY_UTIL's utilization can be max OPP.

[PATCH] arm64: dts: imx8mp: add wdog2/3 nodes

2021-03-06 Thread peng . fan
From: Peng Fan 

There are wdog[2,3] in i.MX8MP, so add them. All wdogs share the
same clock root, so use the wdog1 clk here.

Signed-off-by: Peng Fan 
---
 arch/arm64/boot/dts/freescale/imx8mp.dtsi | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/arch/arm64/boot/dts/freescale/imx8mp.dtsi 
b/arch/arm64/boot/dts/freescale/imx8mp.dtsi
index c7523fd4eae9..05dd04116f2e 100644
--- a/arch/arm64/boot/dts/freescale/imx8mp.dtsi
+++ b/arch/arm64/boot/dts/freescale/imx8mp.dtsi
@@ -312,6 +312,22 @@ wdog1: watchdog@30280000 {
status = "disabled";
};
 
+   wdog2: watchdog@30290000 {
+   compatible = "fsl,imx8mp-wdt", "fsl,imx21-wdt";
+   reg = <0x30290000 0x10000>;
+   interrupts = ;
+   clocks = <&clk IMX8MP_CLK_WDOG2_ROOT>;
+   status = "disabled";
+   };
+
+   wdog3: watchdog@302a0000 {
+   compatible = "fsl,imx8mp-wdt", "fsl,imx21-wdt";
+   reg = <0x302a0000 0x10000>;
+   interrupts = ;
+   clocks = <&clk IMX8MP_CLK_WDOG3_ROOT>;
+   status = "disabled";
+   };
+
iomuxc: pinctrl@30330000 {
compatible = "fsl,imx8mp-iomuxc";
reg = <0x30330000 0x10000>;
-- 
2.30.0



[PATCH V13 10/10] remoteproc: imx_proc: enable virtio/mailbox

2021-03-06 Thread peng . fan
From: Peng Fan 

Use virtio/mailbox to establish a connection between remote processors
and Linux. Add a work queue to handle incoming messages.

Reviewed-by: Richard Zhu 
Reviewed-by: Mathieu Poirier 
Signed-off-by: Peng Fan 
---
 drivers/remoteproc/imx_rproc.c | 116 -
 1 file changed, 113 insertions(+), 3 deletions(-)

diff --git a/drivers/remoteproc/imx_rproc.c b/drivers/remoteproc/imx_rproc.c
index 3685bbd135b0..90471790bb24 100644
--- a/drivers/remoteproc/imx_rproc.c
+++ b/drivers/remoteproc/imx_rproc.c
@@ -7,6 +7,7 @@
 #include 
 #include 
 #include 
+#include <linux/mailbox_client.h>
 #include 
 #include 
 #include 
@@ -15,6 +16,9 @@
 #include 
 #include 
 #include 
+#include <linux/workqueue.h>
+
+#include "remoteproc_internal.h"
 
 #define IMX7D_SRC_SCR  0x0C
 #define IMX7D_ENABLE_M4 BIT(3)
@@ -86,6 +90,11 @@ struct imx_rproc {
const struct imx_rproc_dcfg *dcfg;
struct imx_rproc_mem mem[IMX7D_RPROC_MEM_MAX];
struct clk  *clk;
+   struct mbox_client  cl;
+   struct mbox_chan*tx_ch;
+   struct mbox_chan*rx_ch;
+   struct work_struct  rproc_work;
+   struct workqueue_struct *workqueue;
 };
 
 static const struct imx_rproc_att imx_rproc_att_imx8mq[] = {
@@ -366,9 +375,33 @@ static int imx_rproc_parse_fw(struct rproc *rproc, const 
struct firmware *fw)
return 0;
 }
 
+static void imx_rproc_kick(struct rproc *rproc, int vqid)
+{
+   struct imx_rproc *priv = rproc->priv;
+   int err;
+   __u32 mmsg;
+
+   if (!priv->tx_ch) {
+   dev_err(priv->dev, "No initialized mbox tx channel\n");
+   return;
+   }
+
+   /*
+* Send the index of the triggered virtqueue as the mu payload.
+* Let remote processor know which virtqueue is used.
+*/
+   mmsg = vqid << 16;
+
+   err = mbox_send_message(priv->tx_ch, (void *)&mmsg);
+   if (err < 0)
+   dev_err(priv->dev, "%s: failed (%d, err:%d)\n",
+   __func__, vqid, err);
+}
+
 static const struct rproc_ops imx_rproc_ops = {
.start  = imx_rproc_start,
.stop   = imx_rproc_stop,
+   .kick   = imx_rproc_kick,
.da_to_va   = imx_rproc_da_to_va,
.load   = rproc_elf_load_segments,
.parse_fw   = imx_rproc_parse_fw,
@@ -444,6 +477,66 @@ static int imx_rproc_addr_init(struct imx_rproc *priv,
return 0;
 }
 
+static void imx_rproc_vq_work(struct work_struct *work)
+{
+   struct imx_rproc *priv = container_of(work, struct imx_rproc,
+ rproc_work);
+
+   rproc_vq_interrupt(priv->rproc, 0);
+   rproc_vq_interrupt(priv->rproc, 1);
+}
+
+static void imx_rproc_rx_callback(struct mbox_client *cl, void *msg)
+{
+   struct rproc *rproc = dev_get_drvdata(cl->dev);
+   struct imx_rproc *priv = rproc->priv;
+
+   queue_work(priv->workqueue, &priv->rproc_work);
+}
+
+static int imx_rproc_xtr_mbox_init(struct rproc *rproc)
+{
+   struct imx_rproc *priv = rproc->priv;
+   struct device *dev = priv->dev;
+   struct mbox_client *cl;
+   int ret;
+
+   if (!of_get_property(dev->of_node, "mbox-names", NULL))
+   return 0;
+
+   cl = &priv->cl;
+   cl->dev = dev;
+   cl->tx_block = true;
+   cl->tx_tout = 100;
+   cl->knows_txdone = false;
+   cl->rx_callback = imx_rproc_rx_callback;
+
+   priv->tx_ch = mbox_request_channel_byname(cl, "tx");
+   if (IS_ERR(priv->tx_ch)) {
+   ret = PTR_ERR(priv->tx_ch);
+   return dev_err_probe(cl->dev, ret,
+"failed to request tx mailbox channel: 
%d\n", ret);
+   }
+
+   priv->rx_ch = mbox_request_channel_byname(cl, "rx");
+   if (IS_ERR(priv->rx_ch)) {
+   mbox_free_channel(priv->tx_ch);
+   ret = PTR_ERR(priv->rx_ch);
+   return dev_err_probe(cl->dev, ret,
+"failed to request rx mailbox channel: 
%d\n", ret);
+   }
+
+   return 0;
+}
+
+static void imx_rproc_free_mbox(struct rproc *rproc)
+{
+   struct imx_rproc *priv = rproc->priv;
+
+   mbox_free_channel(priv->tx_ch);
+   mbox_free_channel(priv->rx_ch);
+}
+
 static int imx_rproc_probe(struct platform_device *pdev)
 {
struct device *dev = &pdev->dev;
@@ -481,18 +574,28 @@ static int imx_rproc_probe(struct platform_device *pdev)
priv->dev = dev;
 
dev_set_drvdata(dev, rproc);
+   priv->workqueue = create_workqueue(dev_name(dev));
+   if (!priv->workqueue) {
+   dev_err(dev, "cannot create workqueue\n");
+   ret = -ENOMEM;
+   goto err_put_rproc;
+   }
+
+   ret = imx_rproc_xtr_mbox_init(rproc);
+   if (ret)
+   goto err_put_wkq;
 
ret = imx_rproc_addr_init(priv, pdev);

[PATCH V13 08/10] remoteproc: imx_rproc: support i.MX8MQ/M

2021-03-06 Thread peng . fan
From: Peng Fan 

Add the i.MX8MQ dev/sys addr map and configuration data structure.
i.MX8MM shares the i.MX8MQ settings.

Reviewed-by: Richard Zhu 
Reviewed-by: Mathieu Poirier 
Signed-off-by: Peng Fan 
---
 drivers/remoteproc/Kconfig |  6 ++---
 drivers/remoteproc/imx_rproc.c | 41 +-
 2 files changed, 43 insertions(+), 4 deletions(-)

diff --git a/drivers/remoteproc/Kconfig b/drivers/remoteproc/Kconfig
index 15d1574d129b..7cf3d1b40c55 100644
--- a/drivers/remoteproc/Kconfig
+++ b/drivers/remoteproc/Kconfig
@@ -24,11 +24,11 @@ config REMOTEPROC_CDEV
  It's safe to say N if you don't want to use this interface.
 
 config IMX_REMOTEPROC
-   tristate "IMX6/7 remoteproc support"
+   tristate "i.MX remoteproc support"
depends on ARCH_MXC
help
- Say y here to support iMX's remote processors (Cortex M4
- on iMX7D) via the remote processor framework.
+ Say y here to support iMX's remote processors via the remote
+ processor framework.
 
  It's safe to say N here.
 
diff --git a/drivers/remoteproc/imx_rproc.c b/drivers/remoteproc/imx_rproc.c
index 5ae1f5209548..0124ebf69838 100644
--- a/drivers/remoteproc/imx_rproc.c
+++ b/drivers/remoteproc/imx_rproc.c
@@ -88,6 +88,34 @@ struct imx_rproc {
struct clk  *clk;
 };
 
+static const struct imx_rproc_att imx_rproc_att_imx8mq[] = {
+   /* dev addr , sys addr  , size  , flags */
+   /* TCML - alias */
+   { 0x00000000, 0x007e0000, 0x00020000, 0 },
+   /* OCRAM_S */
+   { 0x00180000, 0x00180000, 0x00008000, 0 },
+   /* OCRAM */
+   { 0x00900000, 0x00900000, 0x00020000, 0 },
+   /* OCRAM */
+   { 0x00920000, 0x00920000, 0x00020000, 0 },
+   /* QSPI Code - alias */
+   { 0x08000000, 0x08000000, 0x08000000, 0 },
+   /* DDR (Code) - alias */
+   { 0x10000000, 0x80000000, 0x0FFE0000, 0 },
+   /* TCML */
+   { 0x1FFE0000, 0x007E0000, 0x00020000, ATT_OWN },
+   /* TCMU */
+   { 0x20000000, 0x00800000, 0x00020000, ATT_OWN },
+   /* OCRAM_S */
+   { 0x20180000, 0x00180000, 0x00008000, ATT_OWN },
+   /* OCRAM */
+   { 0x20200000, 0x00900000, 0x00020000, ATT_OWN },
+   /* OCRAM */
+   { 0x20220000, 0x00920000, 0x00020000, ATT_OWN },
+   /* DDR (Data) */
+   { 0x40000000, 0x40000000, 0x80000000, 0 },
+};
+
 static const struct imx_rproc_att imx_rproc_att_imx7d[] = {
/* dev addr , sys addr  , size  , flags */
/* OCRAM_S (M4 Boot code) - alias */
@@ -138,6 +166,15 @@ static const struct imx_rproc_att imx_rproc_att_imx6sx[] = 
{
{ 0x80000000, 0x80000000, 0x60000000, 0 },
 };
 
+static const struct imx_rproc_dcfg imx_rproc_cfg_imx8mq = {
+   .src_reg = IMX7D_SRC_SCR,
+   .src_mask   = IMX7D_M4_RST_MASK,
+   .src_start  = IMX7D_M4_START,
+   .src_stop   = IMX7D_M4_STOP,
+   .att = imx_rproc_att_imx8mq,
+   .att_size   = ARRAY_SIZE(imx_rproc_att_imx8mq),
+};
+
 static const struct imx_rproc_dcfg imx_rproc_cfg_imx7d = {
.src_reg = IMX7D_SRC_SCR,
.src_mask   = IMX7D_M4_RST_MASK,
@@ -496,6 +533,8 @@ static int imx_rproc_remove(struct platform_device *pdev)
 static const struct of_device_id imx_rproc_of_match[] = {
{ .compatible = "fsl,imx7d-cm4", .data = &imx_rproc_cfg_imx7d },
{ .compatible = "fsl,imx6sx-cm4", .data = &imx_rproc_cfg_imx6sx },
+   { .compatible = "fsl,imx8mq-cm4", .data = &imx_rproc_cfg_imx8mq },
+   { .compatible = "fsl,imx8mm-cm4", .data = &imx_rproc_cfg_imx8mq },
{},
 };
 MODULE_DEVICE_TABLE(of, imx_rproc_of_match);
@@ -512,5 +551,5 @@ static struct platform_driver imx_rproc_driver = {
 module_platform_driver(imx_rproc_driver);
 
 MODULE_LICENSE("GPL v2");
-MODULE_DESCRIPTION("IMX6SX/7D remote processor control driver");
+MODULE_DESCRIPTION("i.MX remote processor control driver");
 MODULE_AUTHOR("Oleksij Rempel ");
-- 
2.30.0



[PATCH V13 09/10] remoteproc: imx_rproc: ignore mapping vdev regions

2021-03-06 Thread peng . fan
From: Peng Fan 

vdev regions are vdev0vring0, vdev0vring1, vdevbuffer and similar.
They are handled by the remoteproc common code, so there is no need to
map them in the imx rproc driver.

Signed-off-by: Peng Fan 
Reviewed-by: Mathieu Poirier 
---
 drivers/remoteproc/imx_rproc.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/remoteproc/imx_rproc.c b/drivers/remoteproc/imx_rproc.c
index 0124ebf69838..3685bbd135b0 100644
--- a/drivers/remoteproc/imx_rproc.c
+++ b/drivers/remoteproc/imx_rproc.c
@@ -417,6 +417,9 @@ static int imx_rproc_addr_init(struct imx_rproc *priv,
struct resource res;
 
node = of_parse_phandle(np, "memory-region", a);
+   /* Not map vdevbuffer, vdevring region */
+   if (!strncmp(node->name, "vdev", strlen("vdev")))
+   continue;
err = of_address_to_resource(node, 0, );
if (err) {
dev_err(dev, "unable to resolve memory region\n");
-- 
2.30.0



[PATCH V13 06/10] remoteproc: imx_rproc: use devm_ioremap

2021-03-06 Thread peng . fan
From: Peng Fan 

We might need to map a region multiple times, because the region might
be shared between remote processors, as on i.MX8QM with dual M4 cores.
So use devm_ioremap, not devm_ioremap_resource.
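
A sketch of the difference (hypothetical addresses): the resource
variant exclusively claims the range via devm_request_mem_region()
before mapping, so a second claim of the same range fails.

/* Two rproc instances mapping the same shared TCM window: */
void __iomem *va0 = devm_ioremap(dev0, 0x34000000, SZ_128K); /* ok */
void __iomem *va1 = devm_ioremap(dev1, 0x34000000, SZ_128K); /* ok too */

/*
 * With devm_ioremap_resource(), the second mapping would fail with
 * -EBUSY because the first device already owns the mem region.
 */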

Reviewed-by: Oleksij Rempel 
Reviewed-by: Richard Zhu 
Signed-off-by: Peng Fan 
Reviewed-by: Mathieu Poirier 
---
 drivers/remoteproc/imx_rproc.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/remoteproc/imx_rproc.c b/drivers/remoteproc/imx_rproc.c
index 2a093cea4997..47fc1d06be6a 100644
--- a/drivers/remoteproc/imx_rproc.c
+++ b/drivers/remoteproc/imx_rproc.c
@@ -296,7 +296,8 @@ static int imx_rproc_addr_init(struct imx_rproc *priv,
if (b >= IMX7D_RPROC_MEM_MAX)
break;
 
-   priv->mem[b].cpu_addr = devm_ioremap_resource(&pdev->dev, &res);
+   /* Not use resource version, because we might share region */
+   priv->mem[b].cpu_addr = devm_ioremap(&pdev->dev, res.start, resource_size(&res));
if (IS_ERR(priv->mem[b].cpu_addr)) {
dev_err(dev, "failed to remap %pr\n", &res);
err = PTR_ERR(priv->mem[b].cpu_addr);
-- 
2.30.0



[PATCH V13 07/10] remoteproc: imx_rproc: add i.MX specific parse fw hook

2021-03-06 Thread peng . fan
From: Peng Fan 

The hook is used to parse memory-regions and to load the resource
table from the address the remote processor published.

Reviewed-by: Richard Zhu 
Reviewed-by: Mathieu Poirier 
Signed-off-by: Peng Fan 
---
 drivers/remoteproc/imx_rproc.c | 93 ++
 1 file changed, 93 insertions(+)

diff --git a/drivers/remoteproc/imx_rproc.c b/drivers/remoteproc/imx_rproc.c
index 47fc1d06be6a..5ae1f5209548 100644
--- a/drivers/remoteproc/imx_rproc.c
+++ b/drivers/remoteproc/imx_rproc.c
@@ -10,6 +10,7 @@
 #include 
 #include 
 #include 
+#include <linux/of_reserved_mem.h>
 #include 
 #include 
 #include 
@@ -241,10 +242,102 @@ static void *imx_rproc_da_to_va(struct rproc *rproc, u64 
da, size_t len, bool *i
return va;
 }
 
+static int imx_rproc_mem_alloc(struct rproc *rproc,
+  struct rproc_mem_entry *mem)
+{
+   struct device *dev = rproc->dev.parent;
+   void *va;
+
+   dev_dbg(dev, "map memory: %p+%zx\n", &mem->dma, mem->len);
+   va = ioremap_wc(mem->dma, mem->len);
+   if (IS_ERR_OR_NULL(va)) {
+   dev_err(dev, "Unable to map memory region: %p+%zx\n",
+   &mem->dma, mem->len);
+   return -ENOMEM;
+   }
+
+   /* Update memory entry va */
+   mem->va = va;
+
+   return 0;
+}
+
+static int imx_rproc_mem_release(struct rproc *rproc,
+struct rproc_mem_entry *mem)
+{
+   dev_dbg(rproc->dev.parent, "unmap memory: %pa\n", &mem->dma);
+   iounmap(mem->va);
+
+   return 0;
+}
+
+static int imx_rproc_parse_memory_regions(struct rproc *rproc)
+{
+   struct imx_rproc *priv = rproc->priv;
+   struct device_node *np = priv->dev->of_node;
+   struct of_phandle_iterator it;
+   struct rproc_mem_entry *mem;
+   struct reserved_mem *rmem;
+   u32 da;
+
+   /* Register associated reserved memory regions */
+   of_phandle_iterator_init(&it, np, "memory-region", NULL, 0);
+   while (of_phandle_iterator_next(&it) == 0) {
+   /*
+* Ignore the first memory region which will be used vdev 
buffer.
+* No need to do extra handlings, rproc_add_virtio_dev will 
handle it.
+*/
+   if (!strcmp(it.node->name, "vdev0buffer"))
+   continue;
+
+   rmem = of_reserved_mem_lookup(it.node);
+   if (!rmem) {
+   dev_err(priv->dev, "unable to acquire memory-region\n");
+   return -EINVAL;
+   }
+
+   /* No need to translate pa to da, i.MX use same map */
+   da = rmem->base;
+
+   /* Register memory region */
+   mem = rproc_mem_entry_init(priv->dev, NULL, 
(dma_addr_t)rmem->base, rmem->size, da,
+  imx_rproc_mem_alloc, 
imx_rproc_mem_release,
+  it.node->name);
+
+   if (mem)
+   rproc_coredump_add_segment(rproc, da, rmem->size);
+   else
+   return -ENOMEM;
+
+   rproc_add_carveout(rproc, mem);
+   }
+
+   return  0;
+}
+
+static int imx_rproc_parse_fw(struct rproc *rproc, const struct firmware *fw)
+{
+   int ret = imx_rproc_parse_memory_regions(rproc);
+
+   if (ret)
+   return ret;
+
+   ret = rproc_elf_load_rsc_table(rproc, fw);
+   if (ret)
+   dev_info(&rproc->dev, "No resource table in elf\n");
+
+   return 0;
+}
+
 static const struct rproc_ops imx_rproc_ops = {
.start  = imx_rproc_start,
.stop   = imx_rproc_stop,
.da_to_va   = imx_rproc_da_to_va,
+   .load   = rproc_elf_load_segments,
+   .parse_fw   = imx_rproc_parse_fw,
+   .find_loaded_rsc_table = rproc_elf_find_loaded_rsc_table,
+   .sanity_check   = rproc_elf_sanity_check,
+   .get_boot_addr  = rproc_elf_get_boot_addr,
 };
 
 static int imx_rproc_addr_init(struct imx_rproc *priv,
-- 
2.30.0



[PATCH V13 05/10] remoteproc: imx_rproc: correct err message

2021-03-06 Thread peng . fan
From: Peng Fan 

It is using devm_ioremap, not devm_ioremap_resource. Correct
the error message and print out sa/size.

Reviewed-by: Bjorn Andersson 
Reviewed-by: Mathieu Poirier 
Signed-off-by: Peng Fan 
---
 drivers/remoteproc/imx_rproc.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/remoteproc/imx_rproc.c b/drivers/remoteproc/imx_rproc.c
index 6603e00bb6f4..2a093cea4997 100644
--- a/drivers/remoteproc/imx_rproc.c
+++ b/drivers/remoteproc/imx_rproc.c
@@ -268,7 +268,7 @@ static int imx_rproc_addr_init(struct imx_rproc *priv,
priv->mem[b].cpu_addr = devm_ioremap(&pdev->dev,
 att->sa, att->size);
if (!priv->mem[b].cpu_addr) {
-   dev_err(dev, "devm_ioremap_resource failed\n");
+   dev_err(dev, "failed to remap %#x bytes from %#x\n", 
att->size, att->sa);
return -ENOMEM;
}
priv->mem[b].sys_addr = att->sa;
@@ -298,7 +298,7 @@ static int imx_rproc_addr_init(struct imx_rproc *priv,
 
priv->mem[b].cpu_addr = devm_ioremap_resource(&pdev->dev, &res);
if (IS_ERR(priv->mem[b].cpu_addr)) {
-   dev_err(dev, "devm_ioremap_resource failed\n");
+   dev_err(dev, "failed to remap %pr\n", &res);
err = PTR_ERR(priv->mem[b].cpu_addr);
return err;
}
-- 
2.30.0



[PATCH V13 04/10] remoteproc: add is_iomem to da_to_va

2021-03-06 Thread peng . fan
From: Peng Fan 

Introduce an extra parameter, is_iomem, to da_to_va, so that the caller
can treat the memory as normal memory or as I/O-mapped memory.
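
A sketch of how a caller consumes the flag (simplified from the
ELF-loader usage; error handling trimmed):

bool is_iomem = false;
void *ptr = rproc_da_to_va(rproc, da, memsz, &is_iomem);

if (!ptr)
        return -EINVAL;

if (is_iomem)
        /* io-mapped region (e.g. TCM): copy via the io accessors */
        memcpy_toio((void __iomem *)ptr, elf_data + offset, filesz);
else
        memcpy(ptr, elf_data + offset, filesz);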

Reviewed-by: Bjorn Andersson 
Reviewed-by: Mathieu Poirier 
Reported-by: kernel test robot 
Signed-off-by: Peng Fan 
---
 drivers/remoteproc/imx_rproc.c |  2 +-
 drivers/remoteproc/ingenic_rproc.c |  2 +-
 drivers/remoteproc/keystone_remoteproc.c   |  2 +-
 drivers/remoteproc/mtk_scp.c   |  6 +++---
 drivers/remoteproc/omap_remoteproc.c   |  2 +-
 drivers/remoteproc/pru_rproc.c |  2 +-
 drivers/remoteproc/qcom_q6v5_adsp.c|  2 +-
 drivers/remoteproc/qcom_q6v5_pas.c |  2 +-
 drivers/remoteproc/qcom_q6v5_wcss.c|  2 +-
 drivers/remoteproc/qcom_wcnss.c|  2 +-
 drivers/remoteproc/remoteproc_core.c   |  7 +--
 drivers/remoteproc/remoteproc_coredump.c   |  8 ++--
 drivers/remoteproc/remoteproc_debugfs.c|  2 +-
 drivers/remoteproc/remoteproc_elf_loader.c | 21 +++--
 drivers/remoteproc/remoteproc_internal.h   |  2 +-
 drivers/remoteproc/st_slim_rproc.c |  2 +-
 drivers/remoteproc/ti_k3_dsp_remoteproc.c  |  2 +-
 drivers/remoteproc/ti_k3_r5_remoteproc.c   |  2 +-
 drivers/remoteproc/wkup_m3_rproc.c |  2 +-
 include/linux/remoteproc.h |  2 +-
 20 files changed, 45 insertions(+), 29 deletions(-)

diff --git a/drivers/remoteproc/imx_rproc.c b/drivers/remoteproc/imx_rproc.c
index 8957ed271d20..6603e00bb6f4 100644
--- a/drivers/remoteproc/imx_rproc.c
+++ b/drivers/remoteproc/imx_rproc.c
@@ -208,7 +208,7 @@ static int imx_rproc_da_to_sys(struct imx_rproc *priv, u64 
da,
return -ENOENT;
 }
 
-static void *imx_rproc_da_to_va(struct rproc *rproc, u64 da, size_t len)
+static void *imx_rproc_da_to_va(struct rproc *rproc, u64 da, size_t len, bool 
*is_iomem)
 {
struct imx_rproc *priv = rproc->priv;
void *va = NULL;
diff --git a/drivers/remoteproc/ingenic_rproc.c 
b/drivers/remoteproc/ingenic_rproc.c
index e2618c36eaab..a356738160a4 100644
--- a/drivers/remoteproc/ingenic_rproc.c
+++ b/drivers/remoteproc/ingenic_rproc.c
@@ -121,7 +121,7 @@ static void ingenic_rproc_kick(struct rproc *rproc, int 
vqid)
writel(vqid, vpu->aux_base + REG_CORE_MSG);
 }
 
-static void *ingenic_rproc_da_to_va(struct rproc *rproc, u64 da, size_t len)
+static void *ingenic_rproc_da_to_va(struct rproc *rproc, u64 da, size_t len, 
bool *is_iomem)
 {
struct vpu *vpu = rproc->priv;
void __iomem *va = NULL;
diff --git a/drivers/remoteproc/keystone_remoteproc.c 
b/drivers/remoteproc/keystone_remoteproc.c
index cd266163a65f..54781f553f4e 100644
--- a/drivers/remoteproc/keystone_remoteproc.c
+++ b/drivers/remoteproc/keystone_remoteproc.c
@@ -246,7 +246,7 @@ static void keystone_rproc_kick(struct rproc *rproc, int 
vqid)
  * can be used either by the remoteproc core for loading (when using kernel
  * remoteproc loader), or by any rpmsg bus drivers.
  */
-static void *keystone_rproc_da_to_va(struct rproc *rproc, u64 da, size_t len)
+static void *keystone_rproc_da_to_va(struct rproc *rproc, u64 da, size_t len, 
bool *is_iomem)
 {
struct keystone_rproc *ksproc = rproc->priv;
void __iomem *va = NULL;
diff --git a/drivers/remoteproc/mtk_scp.c b/drivers/remoteproc/mtk_scp.c
index ce727598c41c..9679cc26895e 100644
--- a/drivers/remoteproc/mtk_scp.c
+++ b/drivers/remoteproc/mtk_scp.c
@@ -272,7 +272,7 @@ static int scp_elf_load_segments(struct rproc *rproc, const 
struct firmware *fw)
}
 
/* grab the kernel address for this device address */
-   ptr = (void __iomem *)rproc_da_to_va(rproc, da, memsz);
+   ptr = (void __iomem *)rproc_da_to_va(rproc, da, memsz, NULL);
if (!ptr) {
dev_err(dev, "bad phdr da 0x%x mem 0x%x\n", da, memsz);
ret = -EINVAL;
@@ -509,7 +509,7 @@ static void *mt8192_scp_da_to_va(struct mtk_scp *scp, u64 
da, size_t len)
return NULL;
 }
 
-static void *scp_da_to_va(struct rproc *rproc, u64 da, size_t len)
+static void *scp_da_to_va(struct rproc *rproc, u64 da, size_t len, bool 
*is_iomem)
 {
struct mtk_scp *scp = (struct mtk_scp *)rproc->priv;
 
@@ -627,7 +627,7 @@ void *scp_mapping_dm_addr(struct mtk_scp *scp, u32 mem_addr)
 {
void *ptr;
 
-   ptr = scp_da_to_va(scp->rproc, mem_addr, 0);
+   ptr = scp_da_to_va(scp->rproc, mem_addr, 0, NULL);
if (!ptr)
return ERR_PTR(-EINVAL);
 
diff --git a/drivers/remoteproc/omap_remoteproc.c 
b/drivers/remoteproc/omap_remoteproc.c
index d94b7391bf9d..43531caa1959 100644
--- a/drivers/remoteproc/omap_remoteproc.c
+++ b/drivers/remoteproc/omap_remoteproc.c
@@ -728,7 +728,7 @@ static int omap_rproc_stop(struct rproc *rproc)
  * Return: translated virtual address in kernel memory space on success,
  * or NULL on failure.
  */
-static void *omap_rproc_da_to_va(struct rproc *rproc, u64 da, 

Re: [syzbot] upstream boot error: WARNING in kvm_wait

2021-03-06 Thread Dmitry Vyukov
On Fri, Mar 5, 2021 at 9:56 PM syzbot
 wrote:
>
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit:280d542f Merge tag 'drm-fixes-2021-03-05' of git://anongit..
> git tree:   upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=138c7a92d0
> kernel config:  https://syzkaller.appspot.com/x/.config?x=dc4003509ab3fc78
> dashboard link: https://syzkaller.appspot.com/bug?extid=a4c8bc1d1dc7b620630d
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+a4c8bc1d1dc7b6206...@syzkaller.appspotmail.com

+Mark, I've enabled CONFIG_DEBUG_IRQFLAGS on syzbot and it led to this breakage.
Is it a bug in kvm_wait or in the debugging code itself? If it's a
real bug, I would assume it's pretty bad as it happens all the time.


> [ cut here ]
> raw_local_irq_restore() called with IRQs enabled
> WARNING: CPU: 2 PID: 213 at kernel/locking/irqflag-debug.c:10 
> warn_bogus_irq_restore+0x1d/0x20 kernel/locking/irqflag-debug.c:10
> Modules linked in:
> CPU: 2 PID: 213 Comm: kworker/u17:4 Not tainted 5.12.0-rc1-syzkaller #0
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014
> Workqueue: events_unbound call_usermodehelper_exec_work
>
> RIP: 0010:warn_bogus_irq_restore+0x1d/0x20 kernel/locking/irqflag-debug.c:10
> Code: be ff cc cc cc cc cc cc cc cc cc cc cc 80 3d e4 38 af 04 00 74 01 c3 48 
> c7 c7 a0 8f 6b 89 c6 05 d3 38 af 04 01 e8 e7 b9 be ff <0f> 0b c3 48 39 77 10 
> 0f 84 97 00 00 00 66 f7 47 22 f0 ff 74 4b 48
> RSP: :c9fe7770 EFLAGS: 00010286
>
> RAX:  RBX: 8c0e9c68 RCX: 
> RDX: 8880116bc3c0 RSI: 815c0cf5 RDI: f520001fcee0
> RBP: 0200 R08:  R09: 0001
> R10: 815b9a5e R11:  R12: 0003
> R13: fbfff181d38d R14: 0001 R15: 88802cc36000
> FS:  () GS:88802cc0() knlGS:
> CS:  0010 DS:  ES:  CR0: 80050033
> CR2:  CR3: 0bc8e000 CR4: 00150ee0
> DR0:  DR1:  DR2: 
> DR3:  DR6: fffe0ff0 DR7: 0400
> Call Trace:
>  kvm_wait arch/x86/kernel/kvm.c:860 [inline]
>  kvm_wait+0xc9/0xe0 arch/x86/kernel/kvm.c:837
>  pv_wait arch/x86/include/asm/paravirt.h:564 [inline]
>  pv_wait_head_or_lock kernel/locking/qspinlock_paravirt.h:470 [inline]
>  __pv_queued_spin_lock_slowpath+0x8b8/0xb40 kernel/locking/qspinlock.c:508
>  pv_queued_spin_lock_slowpath arch/x86/include/asm/paravirt.h:554 [inline]
>  queued_spin_lock_slowpath arch/x86/include/asm/qspinlock.h:51 [inline]
>  queued_spin_lock include/asm-generic/qspinlock.h:85 [inline]
>  do_raw_spin_lock+0x200/0x2b0 kernel/locking/spinlock_debug.c:113
>  spin_lock include/linux/spinlock.h:354 [inline]
>  copy_fs_struct+0x1c8/0x340 fs/fs_struct.c:123
>  copy_fs kernel/fork.c:1443 [inline]
>  copy_process+0x4dc2/0x6fd0 kernel/fork.c:2088
>  kernel_clone+0xe7/0xab0 kernel/fork.c:2462
>  kernel_thread+0xb5/0xf0 kernel/fork.c:2514
>  call_usermodehelper_exec_work kernel/umh.c:172 [inline]
>  call_usermodehelper_exec_work+0xcc/0x180 kernel/umh.c:158
>  process_one_work+0x98d/0x1600 kernel/workqueue.c:2275
>  worker_thread+0x64c/0x1120 kernel/workqueue.c:2421
>  kthread+0x3b1/0x4a0 kernel/kthread.c:292
>  ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
>
>
> ---
> This report is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at syzkal...@googlegroups.com.
>
> syzbot will keep track of this issue. See:
> https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
>
> --
> You received this message because you are subscribed to the Google Groups 
> "syzkaller-bugs" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to syzkaller-bugs+unsubscr...@googlegroups.com.
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/syzkaller-bugs/ccbedd05bcd0504e%40google.com.


[PATCH V13 02/10] dt-bindings: remoteproc: imx_rproc: add i.MX8MQ/M support

2021-03-06 Thread peng . fan
From: Peng Fan 

Add i.MX8MQ/M support, also include mailbox for rpmsg/virtio usage.

Signed-off-by: Peng Fan 
---
 .../bindings/remoteproc/fsl,imx-rproc.yaml| 31 ++-
 1 file changed, 30 insertions(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/remoteproc/fsl,imx-rproc.yaml 
b/Documentation/devicetree/bindings/remoteproc/fsl,imx-rproc.yaml
index 54d2456530a6..208a628f8d6c 100644
--- a/Documentation/devicetree/bindings/remoteproc/fsl,imx-rproc.yaml
+++ b/Documentation/devicetree/bindings/remoteproc/fsl,imx-rproc.yaml
@@ -4,7 +4,7 @@
 $id: "http://devicetree.org/schemas/remoteproc/fsl,imx-rproc.yaml#"
 $schema: "http://devicetree.org/meta-schemas/core.yaml#"
 
-title: NXP iMX6SX/iMX7D Co-Processor Bindings
+title: NXP i.MX Co-Processor Bindings
 
 description:
   This binding provides support for ARM Cortex M4 Co-processor found on some 
NXP iMX SoCs.
@@ -15,6 +15,8 @@ maintainers:
 properties:
   compatible:
 enum:
+  - fsl,imx8mq-cm4
+  - fsl,imx8mm-cm4
   - fsl,imx7d-cm4
   - fsl,imx6sx-cm4
 
@@ -26,6 +28,20 @@ properties:
 description:
   Phandle to syscon block which provide access to System Reset Controller
 
+  mbox-names:
+items:
+  - const: tx
+  - const: rx
+  - const: rxdb
+
+  mboxes:
+description:
+  This property is required only if the rpmsg/virtio functionality is used.
+  List of <&phandle type channel> - 1 channel for TX, 1 channel for RX, 1 
channel for RXDB.
+  (see mailbox/fsl,mu.yaml)
+minItems: 1
+maxItems: 3
+
   memory-region:
 description:
   If present, a phandle for a reserved memory area that used for vdev 
buffer,
@@ -58,4 +74,17 @@ examples:
  clocks   = <&clks IMX7D_ARM_M4_ROOT_CLK>;
 };
 
+  - |
+#include <dt-bindings/clock/imx8mm-clock.h>
+
+imx8mm-cm4 {
+  compatible = "fsl,imx8mm-cm4";
+  clocks = <&clk IMX8MM_CLK_M4_DIV>;
+  mbox-names = "tx", "rx", "rxdb";
+  mboxes = <&mu 0 1
+            &mu 1 1
+            &mu 3 1>;
+  memory-region = <&vdev0vring0>, <&vdev0vring1>, <&vdevbuffer>, <&rsc_table>;
+  syscon = <&src>;
+};
 ...
-- 
2.30.0



[PATCH V13 03/10] remoteproc: introduce is_iomem to rproc_mem_entry

2021-03-06 Thread peng . fan
From: Peng Fan 

Introduce is_iomem to indicate this piece memory is iomem or not.

Reviewed-by: Bjorn Andersson 
Reviewed-by: Mathieu Poirier 
Signed-off-by: Peng Fan 
---
 include/linux/remoteproc.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/linux/remoteproc.h b/include/linux/remoteproc.h
index f28ee75d1005..a5f6d2d9cde2 100644
--- a/include/linux/remoteproc.h
+++ b/include/linux/remoteproc.h
@@ -315,6 +315,7 @@ struct rproc;
 /**
  * struct rproc_mem_entry - memory entry descriptor
  * @va:virtual address
+ * @is_iomem: io memory
  * @dma: dma address
  * @len: length, in bytes
  * @da: device address
@@ -329,6 +330,7 @@ struct rproc;
  */
 struct rproc_mem_entry {
void *va;
+   bool is_iomem;
dma_addr_t dma;
size_t len;
u32 da;
-- 
2.30.0



[PATCH V13 01/10] dt-bindings: remoteproc: convert imx rproc bindings to json-schema

2021-03-06 Thread peng . fan
From: Peng Fan 

Convert the imx rproc binding to DT schema format using json-schema.

Reviewed-by: Rob Herring 
Signed-off-by: Peng Fan 
---
 .../bindings/remoteproc/fsl,imx-rproc.yaml| 61 +++
 .../bindings/remoteproc/imx-rproc.txt | 33 --
 2 files changed, 61 insertions(+), 33 deletions(-)
 create mode 100644 
Documentation/devicetree/bindings/remoteproc/fsl,imx-rproc.yaml
 delete mode 100644 Documentation/devicetree/bindings/remoteproc/imx-rproc.txt

diff --git a/Documentation/devicetree/bindings/remoteproc/fsl,imx-rproc.yaml 
b/Documentation/devicetree/bindings/remoteproc/fsl,imx-rproc.yaml
new file mode 100644
index 000000000000..54d2456530a6
--- /dev/null
+++ b/Documentation/devicetree/bindings/remoteproc/fsl,imx-rproc.yaml
@@ -0,0 +1,61 @@
+# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: "http://devicetree.org/schemas/remoteproc/fsl,imx-rproc.yaml#"
+$schema: "http://devicetree.org/meta-schemas/core.yaml#"
+
+title: NXP iMX6SX/iMX7D Co-Processor Bindings
+
+description:
+  This binding provides support for ARM Cortex M4 Co-processor found on some 
NXP iMX SoCs.
+
+maintainers:
+  - Peng Fan 
+
+properties:
+  compatible:
+enum:
+  - fsl,imx7d-cm4
+  - fsl,imx6sx-cm4
+
+  clocks:
+maxItems: 1
+
+  syscon:
+$ref: /schemas/types.yaml#/definitions/phandle
+description:
+  Phandle to syscon block which provide access to System Reset Controller
+
+  memory-region:
+description:
+  If present, a phandle for a reserved memory area that used for vdev 
buffer,
+  resource table, vring region and others used by remote processor.
+minItems: 1
+maxItems: 32
+
+required:
+  - compatible
+  - clocks
+  - syscon
+
+additionalProperties: false
+
+examples:
+  - |
+#include <dt-bindings/clock/imx7d-clock.h>
+m4_reserved_sysmem1: cm4@80000000 {
+  reg = <0x80000000 0x80000>;
+};
+
+m4_reserved_sysmem2: cm4@81000000 {
+  reg = <0x81000000 0x80000>;
+};
+
+imx7d-cm4 {
+  compatible   = "fsl,imx7d-cm4";
+  memory-region= <&m4_reserved_sysmem1>, <&m4_reserved_sysmem2>;
+  syscon   = <&src>;
+  clocks   = <&clks IMX7D_ARM_M4_ROOT_CLK>;
+};
+
+...
diff --git a/Documentation/devicetree/bindings/remoteproc/imx-rproc.txt 
b/Documentation/devicetree/bindings/remoteproc/imx-rproc.txt
deleted file mode 100644
index fbcefd965dc4..000000000000
--- a/Documentation/devicetree/bindings/remoteproc/imx-rproc.txt
+++ /dev/null
@@ -1,33 +0,0 @@
-NXP iMX6SX/iMX7D Co-Processor Bindings
-
-
-This binding provides support for ARM Cortex M4 Co-processor found on some
-NXP iMX SoCs.
-
-Required properties:
-- compatible   Should be one of:
-   "fsl,imx7d-cm4"
-   "fsl,imx6sx-cm4"
-- clocks   Clock for co-processor (See: 
../clock/clock-bindings.txt)
-- syscon   Phandle to syscon block which provide access to
-   System Reset Controller
-
-Optional properties:
-- memory-region   list of phandels to the reserved memory regions.
-   (See: ../reserved-memory/reserved-memory.txt)
-
-Example:
-   m4_reserved_sysmem1: cm4@80000000 {
-   reg = <0x80000000 0x80000>;
-   };
-
-   m4_reserved_sysmem2: cm4@81000000 {
-   reg = <0x81000000 0x80000>;
-   };
-
-   imx7d-cm4 {
-   compatible  = "fsl,imx7d-cm4";
-   memory-region   = <&m4_reserved_sysmem1>, <&m4_reserved_sysmem2>;
-   syscon  = <&src>;
-   clocks  = <&clks IMX7D_ARM_M4_ROOT_CLK>;
-   };
-- 
2.30.0



[PATCH V13 00/10] remoteproc: imx_rproc: support iMX8MQ/M

2021-03-06 Thread peng . fan
From: Peng Fan 

V13:
 Add R-b tag from Rob for patch 1.
 Drop the reserved memory node from patch 2 per Rob's comment.
 Mathieu, Bjorn
 Only patch 2 does not have an R-b/A-b tag, but since Rob only had a minor
 comment, which is addressed in this version, is it OK for you to take this
 into the remoteproc next branch?
 Thanks.

V12:
 Add maxItems to avoid dt_bindings_check fail
 Rebased on top of linux-next

V11:
 Per Rob's comments, fix memory-region in patch 1/10
 Rebased on top of Linux-next

V10:
 Per Rob's comments, fix patch 1/10

V9:
 Per Mathieu's comments,
   update the tile of yaml in patch 2/10
   update the Kconfig and MODULE_DESCRIPTION, I merge this change in patch 8/10,
   since this is a minor change, I still keep Mathieu's R-b tag. If any 
objection, I could remove.
   Add R-b tag in Patch 10/10

 Rob, please help review patch 1/10 and 2/10

V8:
 Address sparse warning in patch 4/10 reported by kernel test robot

V7:
 Add R-b tag from Mathieu
 vdevbuffer->vdev0buffer in patch 1/10, 7/10
 correct err msg and shutdown seq per Mathieu's comments in patch 10/10
 Hope this version is ok to be merged.
 
V6:
 Add R-b tag from Mathieu
 Convert imx-rproc.txt to yaml and add dt-bindings support for i.MX8MQ/M, patch 
1/10 2/10
 No other changes.

V5:
 Apply on Linux next
 Add V5 subject prefix
 Add R-b tag from Bjorn for 1/8, 2/8, 3/8
 
https://patchwork.kernel.org/project/linux-remoteproc/cover/20201229033019.25899-1-peng@nxp.com/

V4:
 According to Bjorn's comments, add is_iomem for da to va usage
 1/8, 2/8 is new patch
 3/8, follow Bjorn's comments to correct/update the err msg.
 6/8, new patch
 8/8, use dev_err_probe to simplify code, use queue_work instead 
schedule_delayed_work

V3:
 Since I was quite busy in the past days, V3 is late
 Rebased on Linux-next
 Add R-b tags
 1/7: Add R-b tag of Mathieu, add comments
 4/7: Typo fix
 5/7: Add R-b tag of Mathieu, drop index Per Mathieu's comments
 6/7: Add R-b tag of Mathieu
 7/7: Add comment for vqid << 16, drop unneeded timeout settings of mailbox
  Use queue_work instead of schedule_delayed_work
  free mbox channels when remove
 https://lkml.org/lkml/2020/12/4/82

V2:
 Rebased on linux-next
 Dropped early boot feature to make patchset simple.
 Drop rsc-da
 
https://patchwork.kernel.org/project/linux-remoteproc/cover/20200927064131.24101-1-peng@nxp.com/

V1:
 https://patchwork.kernel.org/cover/11682461/

This patchset is to support i.MX8MQ/M coproc.
The early boot feature was dropped to make the patchset small in V2.

Because of the i.MX-specific TCM memory requirement, add an ELF platform hook.
Several patches have got reviewed by Oleksij and Mathieu in v1.


Peng Fan (10):
  dt-bindings: remoteproc: convert imx rproc bindings to json-schema
  dt-bindings: remoteproc: imx_rproc: add i.MX8MQ/M support
  remoteproc: introduce is_iomem to rproc_mem_entry
  remoteproc: add is_iomem to da_to_va
  remoteproc: imx_rproc: correct err message
  remoteproc: imx_rproc: use devm_ioremap
  remoteproc: imx_rproc: add i.MX specific parse fw hook
  remoteproc: imx_rproc: support i.MX8MQ/M
  remoteproc: imx_rproc: ignore mapping vdev regions
  remoteproc: imx_proc: enable virtio/mailbox

 .../bindings/remoteproc/fsl,imx-rproc.yaml|  90 ++
 .../bindings/remoteproc/imx-rproc.txt |  33 ---
 drivers/remoteproc/Kconfig|   6 +-
 drivers/remoteproc/imx_rproc.c| 262 +-
 drivers/remoteproc/ingenic_rproc.c|   2 +-
 drivers/remoteproc/keystone_remoteproc.c  |   2 +-
 drivers/remoteproc/mtk_scp.c  |   6 +-
 drivers/remoteproc/omap_remoteproc.c  |   2 +-
 drivers/remoteproc/pru_rproc.c|   2 +-
 drivers/remoteproc/qcom_q6v5_adsp.c   |   2 +-
 drivers/remoteproc/qcom_q6v5_pas.c|   2 +-
 drivers/remoteproc/qcom_q6v5_wcss.c   |   2 +-
 drivers/remoteproc/qcom_wcnss.c   |   2 +-
 drivers/remoteproc/remoteproc_core.c  |   7 +-
 drivers/remoteproc/remoteproc_coredump.c  |   8 +-
 drivers/remoteproc/remoteproc_debugfs.c   |   2 +-
 drivers/remoteproc/remoteproc_elf_loader.c|  21 +-
 drivers/remoteproc/remoteproc_internal.h  |   2 +-
 drivers/remoteproc/st_slim_rproc.c|   2 +-
 drivers/remoteproc/ti_k3_dsp_remoteproc.c |   2 +-
 drivers/remoteproc/ti_k3_r5_remoteproc.c  |   2 +-
 drivers/remoteproc/wkup_m3_rproc.c|   2 +-
 include/linux/remoteproc.h|   4 +-
 23 files changed, 393 insertions(+), 72 deletions(-)
 create mode 100644 
Documentation/devicetree/bindings/remoteproc/fsl,imx-rproc.yaml
 delete mode 100644 Documentation/devicetree/bindings/remoteproc/imx-rproc.txt

-- 
2.30.0



[RFC v2] scripts: kernel-doc: fix attribute capture in function parsing

2021-03-06 Thread Aditya Srivastava
Currently, kernel-doc warns during function prototype parsing when the
attributes "__attribute_const__" or "__flatten" are present in the
definition.

There are 166 occurrences in ~70 files in the kernel tree for
"__attribute_const__" and 5 occurrences in 4 files for "__flatten".

Out of 166, there are 3 occurrences in three different files with
"__attribute_const__" and a preceding kernel-doc; and, 1 occurrence in
./mm/percpu.c for "__flatten" with a preceding kernel-doc. All other
occurrences have no preceding kernel-doc.

Add support for the "__attribute_const__" and "__flatten" attributes.

A quick evaluation by running 'kernel-doc -none' on kernel-tree reveals
that no additional warning or error has been added or removed by the fix.
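
For illustration, a hypothetical prototype of the kind that previously
made the parser warn, and that the two added substitutions now strip
cleanly before matching:

/**
 * foo_checksum - fold a buffer into a 32-bit checksum
 * @buf: buffer to sum
 * @len: length of @buf in bytes
 *
 * Return: the checksum value.
 */
u32 __attribute_const__ foo_checksum(const u8 *buf, size_t len);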

Suggested-by: Lukas Bulwahn 
Signed-off-by: Aditya Srivastava 
---
Changes in v2:
- Remove "__attribute_const__" from the $return_type capture regex and add to 
the substituting ones.
- Add support for "__flatten" attribute
- Modify commit message

 scripts/kernel-doc | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/scripts/kernel-doc b/scripts/kernel-doc
index 68df17877384..e1e562b2e2e7 100755
--- a/scripts/kernel-doc
+++ b/scripts/kernel-doc
@@ -1766,12 +1766,14 @@ sub dump_function($$) {
 $prototype =~ s/^noinline +//;
 $prototype =~ s/__init +//;
 $prototype =~ s/__init_or_module +//;
+$prototype =~ s/__flatten +//;
 $prototype =~ s/__meminit +//;
 $prototype =~ s/__must_check +//;
 $prototype =~ s/__weak +//;
 $prototype =~ s/__sched +//;
 $prototype =~ s/__printf\s*\(\s*\d*\s*,\s*\d*\s*\) +//;
 my $define = $prototype =~ s/^#\s*define\s+//; #ak added
+$prototype =~ s/__attribute_const__ +//;
 $prototype =~ s/__attribute__\s*\(\(
 (?:
  [\w\s]++  # attribute name
-- 
2.17.1



[PATCH v3 1/4] ALSA: hda/cirrus: Increase AUTO_CFG_MAX_INS from 8 to 18

2021-03-06 Thread Vitaly Rodionov
In preparation for supporting the Cirrus Logic CS8409 HDA bridge on new Dell
platforms, it is necessary to increase the AUTO_CFG_MAX_INS and
AUTO_CFG_NUM_INPUTS values. Currently AUTO_CFG_MAX_INS is limited to 8, but
the Cirrus Logic HDA bridge CS8409 has 18 input pins: 16 ASP receivers and 2
DMIC inputs. We have to increase this value to 18 so that generic code can
handle this correctly.

Tested on DELL Inspiron-3505, DELL Inspiron-3501, DELL Inspiron-3500

Signed-off-by: Vitaly Rodionov 
---

Changes in v3:
- No changes

 sound/pci/hda/hda_auto_parser.h | 2 +-
 sound/pci/hda/hda_local.h   | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/sound/pci/hda/hda_auto_parser.h b/sound/pci/hda/hda_auto_parser.h
index a22ca0e17a08..df63d66af1ab 100644
--- a/sound/pci/hda/hda_auto_parser.h
+++ b/sound/pci/hda/hda_auto_parser.h
@@ -27,7 +27,7 @@ enum {
 };
 
 #define AUTO_CFG_MAX_OUTS  HDA_MAX_OUTS
-#define AUTO_CFG_MAX_INS   8
+#define AUTO_CFG_MAX_INS   18
 
 struct auto_pin_cfg_item {
hda_nid_t pin;
diff --git a/sound/pci/hda/hda_local.h b/sound/pci/hda/hda_local.h
index 5beb8aa44ecd..317245a5585d 100644
--- a/sound/pci/hda/hda_local.h
+++ b/sound/pci/hda/hda_local.h
@@ -180,7 +180,7 @@ int snd_hda_create_spdif_in_ctls(struct hda_codec *codec, 
hda_nid_t nid);
 /*
  * input MUX helper
  */
-#define HDA_MAX_NUM_INPUTS 16
+#define HDA_MAX_NUM_INPUTS 36
 struct hda_input_mux_item {
char label[32];
unsigned int index;
-- 
2.25.1



[PATCH v3 3/4] ALSA: hda/cirrus: Add jack detect interrupt support from CS42L42 companion codec.

2021-03-06 Thread Vitaly Rodionov
In the case of CS8409 we do not have unsol events from NIDs 0x24 and 0x34,
where the headset mic and headphones are connected. The companion codec
CS42L42 will generate an interrupt via GPIO 4 to notify jack events. We have
to override the standard snd_hda_jack_unsol_event(), read the CS42L42 jack
detect status registers and then notify the status via a generic
snd_hda_jack_unsol_event() call.
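
In outline, the override reads the companion codec's status registers
first and only then hands off to the generic jack code (hypothetical
helper name; the actual handlers are in the diff below):

static void cs8409_jack_unsol_event(struct hda_codec *codec,
                                    unsigned int res)
{
        /* read CS42L42 tip-sense/HS-type status over the muxed I2C bus */
        cs8409_cs42l42_read_jack_status(codec);

        /* then report through the generic jack infrastructure */
        snd_hda_jack_unsol_event(codec, res);
}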

Tested on DELL Inspiron-3500, DELL Inspiron-3501, DELL Inspiron-3505.

Signed-off-by: Vitaly Rodionov 
---

Changes in v3:
- Fixed missing static function declaration warning (Reported-by: kernel test 
robot )
- Improved unsolicited events handling for headset type 4

 sound/pci/hda/patch_cirrus.c | 309 ++-
 1 file changed, 307 insertions(+), 2 deletions(-)

diff --git a/sound/pci/hda/patch_cirrus.c b/sound/pci/hda/patch_cirrus.c
index d664eed5c3cf..1d2f6a1224e6 100644
--- a/sound/pci/hda/patch_cirrus.c
+++ b/sound/pci/hda/patch_cirrus.c
@@ -9,6 +9,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -38,6 +39,15 @@ struct cs_spec {
/* for MBP SPDIF control */
int (*spdif_sw_put)(struct snd_kcontrol *kcontrol,
struct snd_ctl_elem_value *ucontrol);
+
+   unsigned int cs42l42_hp_jack_in:1;
+   unsigned int cs42l42_mic_jack_in:1;
+
+   struct mutex cs8409_i2c_mux;
+
+   /* verb exec op override */
+   int (*exec_verb)(struct hdac_device *dev, unsigned int cmd,
+unsigned int flags, unsigned int *res);
 };
 
 /* available models with CS420x */
@@ -1229,6 +1239,13 @@ static int patch_cs4213(struct hda_codec *codec)
 #define CS8409_CS42L42_SPK_PIN_NID 0x2c
 #define CS8409_CS42L42_AMIC_PIN_NID0x34
 #define CS8409_CS42L42_DMIC_PIN_NID0x44
+#define CS8409_CS42L42_DMIC_ADC_PIN_NID0x22
+
+#define CS42L42_HSDET_AUTO_DONE0x02
+#define CS42L42_HSTYPE_MASK0x03
+
+#define CS42L42_JACK_INSERTED  0x0C
+#define CS42L42_JACK_REMOVED   0x00
 
 #define GPIO3_INT (1 << 3)
 #define GPIO4_INT (1 << 4)
@@ -1429,6 +1446,7 @@ static const struct cs8409_i2c_param 
cs42l42_init_reg_seq[] = {
{ 0x1C03, 0xC0 },
{ 0x1105, 0x00 },
{ 0x1112, 0xC0 },
+   { 0x1101, 0x02 },
{} /* Terminator */
 };
 
@@ -1565,6 +1583,8 @@ static unsigned int cs8409_i2c_write(struct hda_codec 
*codec,
 /* Assert/release RTS# line to CS42L42 */
 static void cs8409_cs42l42_reset(struct hda_codec *codec)
 {
+   struct cs_spec *spec = codec->spec;
+
/* Assert RTS# line */
snd_hda_codec_write(codec,
codec->core.afg, 0, AC_VERB_SET_GPIO_DATA, 0);
@@ -1576,21 +1596,190 @@ static void cs8409_cs42l42_reset(struct hda_codec 
*codec)
/* wait ~10ms */
usleep_range(1, 15000);
 
-   /* Clear interrupts status */
+   mutex_lock(&spec->cs8409_i2c_mux);
+
+   /* Clear interrupts, by reading interrupt status registers */
cs8409_i2c_read(codec, CS42L42_I2C_ADDR, 0x1308, 1);
cs8409_i2c_read(codec, CS42L42_I2C_ADDR, 0x1309, 1);
cs8409_i2c_read(codec, CS42L42_I2C_ADDR, 0x130A, 1);
cs8409_i2c_read(codec, CS42L42_I2C_ADDR, 0x130F, 1);
 
+   mutex_unlock(&spec->cs8409_i2c_mux);
+
+}
+
+/* Configure CS42L42 slave codec for jack autodetect */
+static int cs8409_cs42l42_enable_jack_detect(struct hda_codec *codec)
+{
+   struct cs_spec *spec = codec->spec;
+
+   mutex_lock(&spec->cs8409_i2c_mux);
+
+   /* Set TIP_SENSE_EN for analog front-end of tip sense. */
+   cs8409_i2c_write(codec, CS42L42_I2C_ADDR, 0x1b70, 0x0020, 1);
+   /* Clear WAKE# */
+   cs8409_i2c_write(codec, CS42L42_I2C_ADDR, 0x1b71, 0x0001, 1);
+   /* Wait ~2.5ms */
+   usleep_range(2500, 3000);
+   /* Set mode WAKE# output follows the combination logic directly */
+   cs8409_i2c_write(codec, CS42L42_I2C_ADDR, 0x1b71, 0x0020, 1);
+   /* Clear interrupts status */
+   cs8409_i2c_read(codec, CS42L42_I2C_ADDR, 0x130f, 1);
+   cs8409_i2c_read(codec, CS42L42_I2C_ADDR, 0x1b7b, 1);
+   /* Enable interrupt */
+   cs8409_i2c_write(codec, CS42L42_I2C_ADDR, 0x1320, 0x03, 1);
+   cs8409_i2c_write(codec, CS42L42_I2C_ADDR, 0x1b79, 0x00, 1);
+
+   mutex_unlock(&spec->cs8409_i2c_mux);
+
+   return 0;
+}
+
+/* Enable and run CS42L42 slave codec jack auto detect */
+static void cs8409_cs42l42_run_jack_detect(struct hda_codec *codec)
+{
+   struct cs_spec *spec = codec->spec;
+
+   mutex_lock(&spec->cs8409_i2c_mux);
+
+   /* Clear interrupts */
+   cs8409_i2c_read(codec, CS42L42_I2C_ADDR, 0x1308, 1);
+   cs8409_i2c_read(codec, CS42L42_I2C_ADDR, 0x1b77, 1);
+
+   cs8409_i2c_write(codec, CS42L42_I2C_ADDR, 0x1102, 0x87, 1);
+   cs8409_i2c_write(codec, CS42L42_I2C_ADDR, 0x1f06, 0x86, 1);
+   cs8409_i2c_write(codec, CS42L42_I2C_ADDR, 0x1b74, 0x07, 1);
+   cs8409_i2c_write(codec, CS42L42_I2C_ADDR, 0x131b, 0x01, 1);
+   cs8409_i2c_write(codec, CS42L42_I2C_ADDR, 0x1120, 0x80, 

[PATCH v3 0/4] ALSA: hda/cirrus: Add support for CS8409 HDA bridge and CS42L42 companion codec

2021-03-06 Thread Vitaly Rodionov
Dell's laptops Inspiron 3500, Inspiron 3501, Inspiron 3505 are using
Cirrus Logic CS8409 HDA bridge with CS42L42 companion codec.

The CS8409 is a multichannel HD audio routing controller.
CS8409 includes support for four channels of digital
microphone data and two bidirectional ASPs for up to 32
channels of TDM data or 4 channels of I2S data. The CS8409 is
intended to be used with a remote companion codec that implements
high performance analog functions in close physical
proximity to the end-equipment audio port or speaker driver.

The CS42L42 is a low-power audio codec with integrated MIPI
SoundWire interface or I2C/I2S/TDM interfaces designed
for portable applications. It provides a high-dynamic-range
stereo DAC for audio playback and a mono high-dynamic-range
ADC for audio capture.

Changes since version 1:

ALSA: hda/cirrus: Increase AUTO_CFG_MAX_INS from 8 to 18
* No change

ALSA: hda/cirrus: Add support for CS8409 HDA bridge and CS42L42
companion codec.
* Removed redundant fields in fixup table
* Handle gpio via spec->gpio_dir, spec->gpio_data and spec->gpio_mask
* Moved cs8409_cs42l42_init() from patch 2, to handle resume correctly

ALSA: hda/cirrus: Add jack detect interrupt support from CS42L42
companion codec.
* Run scripts/checkpatch.pl, fixed new warnings

ALSA: hda/cirrus: Add Headphone and Headset MIC Volume Control
* Moved control values to cache to avoid i2c read at each time.

Stefan Binding (1):
  ALSA: hda/cirrus: Add Headphone and Headset MIC Volume Control

Vitaly Rodionov (3):
  ALSA: hda/cirrus: Increase AUTO_CFG_MAX_INS from 8 to 18
  ALSA: hda/cirrus: Add support for CS8409 HDA bridge and CS42L42
companion codec.
  ALSA: hda/cirrus: Add jack detect interrupt support from CS42L42
companion codec.

 sound/pci/hda/hda_auto_parser.h |2 +-
 sound/pci/hda/hda_local.h   |2 +-
 sound/pci/hda/patch_cirrus.c| 1081 +++
 3 files changed, 1083 insertions(+), 2 deletions(-)

-- 
2.25.1



[PATCH v3 2/4] ALSA: hda/cirrus: Add support for CS8409 HDA bridge and CS42L42 companion codec.

2021-03-06 Thread Vitaly Rodionov
Dell's laptops Inspiron 3500, Inspiron 3501, Inspiron 3505 are using Cirrus 
Logic
CS8409 HDA bridge with CS42L42 companion codec.

The CS8409 is a multichannel HD audio routing controller.
CS8409 includes support for four channels of digital
microphone data and two bidirectional ASPs for up to 32
channels of TDM data or 4 channels of I2S data. The CS8409 is
intended to be used with a remote companion codec that implements
high performance analog functions in close physical
proximity to the end-equipment audio port or speaker driver.

The CS42L42 is a low-power audio codec with integrated MIPI
SoundWire interface or I2C/I2S/TDM interfaces designed
for portable applications. It provides a high-dynamic-range
stereo DAC for audio playback and a mono high-dynamic-range
ADC for audio capture.

CS42L42 is connected to CS8409 HDA bridge via I2C and I2S.

CS8409  CS42L42
--- 
ASP1.A TX  -->  ASP_SDIN
ASP1.A RX  <--  ASP_SDOUT
GPIO5  -->  RST#
GPIO4  <--  INT#
GPIO3  <--  WAKE#
GPIO7  <->  I2C SDA
GPIO6  -->  I2C CLK

Tested on DELL Inspiron-3500, DELL Inspiron-3501, DELL Inspiron-3505

This patch will register CS8409 with sound card and create
input/output paths and two input devices, initialise CS42L42
companion codec and configure it for ASP TX/RX TDM mode,
24bit, 48kHz.

cat /proc/asound/pcm
00-00: CS8409 Analog : CS8409 Analog : playback 1 : capture 1
00-03: HDMI 0 : HDMI 0 : playback 1

dmesg
snd_hda_codec_cirrus hdaudioC0D0: autoconfig for CS8409: line_outs=1 
(0x2c/0x0/0x0/0x0/0x0) type:speaker
snd_hda_codec_cirrus hdaudioC0D0:speaker_outs=0 (0x0/0x0/0x0/0x0/0x0)
snd_hda_codec_cirrus hdaudioC0D0:hp_outs=1 (0x24/0x0/0x0/0x0/0x0)
snd_hda_codec_cirrus hdaudioC0D0:mono: mono_out=0x0
snd_hda_codec_cirrus hdaudioC0D0:inputs:
snd_hda_codec_cirrus hdaudioC0D0:  Internal Mic=0x44
snd_hda_codec_cirrus hdaudioC0D0:  Mic=0x34
input: HDA Intel PCH Headphone as 
/devices/pci:00/:00:1f.3/sound/card0/input8
input: HDA Intel PCH Headset Mic as 
/devices/pci:00/:00:1f.3/sound/card0/input9

Signed-off-by: Vitaly Rodionov 
---

Changes in v3:
- Fixed uninitialized variable warning (Reported-by: kernel test robot)
- Moved gpio setup into cs8409_cs42l42_hw_init()
- Improved suspend() implementation

 sound/pci/hda/patch_cirrus.c | 576 +++
 1 file changed, 576 insertions(+)

diff --git a/sound/pci/hda/patch_cirrus.c b/sound/pci/hda/patch_cirrus.c
index f46204ab0b90..d664eed5c3cf 100644
--- a/sound/pci/hda/patch_cirrus.c
+++ b/sound/pci/hda/patch_cirrus.c
@@ -9,6 +9,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include "hda_local.h"
@@ -1219,6 +1220,580 @@ static int patch_cs4213(struct hda_codec *codec)
return err;
 }
 
+/* Cirrus Logic CS8409 HDA bridge with
+ * companion codec CS42L42
+ */
+#define CS8409_VENDOR_NID 0x47
+
+#define CS8409_CS42L42_HP_PIN_NID  0x24
+#define CS8409_CS42L42_SPK_PIN_NID 0x2c
+#define CS8409_CS42L42_AMIC_PIN_NID0x34
+#define CS8409_CS42L42_DMIC_PIN_NID0x44
+
+#define GPIO3_INT (1 << 3)
+#define GPIO4_INT (1 << 4)
+#define GPIO5_INT (1 << 5)
+
+#define CS42L42_I2C_ADDR   (0x48 << 1)
+
+#define CIR_I2C_ADDR   0x0059
+#define CIR_I2C_DATA   0x005A
+#define CIR_I2C_CTRL   0x005B
+#define CIR_I2C_STATUS 0x005C
+#define CIR_I2C_QWRITE 0x005D
+#define CIR_I2C_QREAD  0x005E
+
+struct cs8409_i2c_param {
+   unsigned int addr;
+   unsigned int reg;
+};
+
+struct cs8409_cir_param {
+   unsigned int nid;
+   unsigned int cir;
+   unsigned int coeff;
+};
+
+enum {
+   CS8409_BULLSEYE,
+   CS8409_WARLOCK,
+   CS8409_CYBORG,
+   CS8409_VERBS,
+};
+
+/* Dell Inspiron models with cs8409/cs42l42 */
+static const struct hda_model_fixup cs8409_models[] = {
+   { .id = CS8409_BULLSEYE, .name = "bullseye" },
+   { .id = CS8409_WARLOCK, .name = "warlock" },
+   { .id = CS8409_CYBORG, .name = "cyborg" },
+   {}
+};
+
+/* Dell Inspiron platforms
+ * with cs8409 bridge and cs42l42 codec
+ */
+static const struct snd_pci_quirk cs8409_fixup_tbl[] = {
+   SND_PCI_QUIRK(0x1028, 0x0A11, "Bullseye", CS8409_BULLSEYE),
+   SND_PCI_QUIRK(0x1028, 0x0A12, "Bullseye", CS8409_BULLSEYE),
+   SND_PCI_QUIRK(0x1028, 0x0A23, "Bullseye", CS8409_BULLSEYE),
+   SND_PCI_QUIRK(0x1028, 0x0A24, "Bullseye", CS8409_BULLSEYE),
+   SND_PCI_QUIRK(0x1028, 0x0A25, "Bullseye", CS8409_BULLSEYE),
+   SND_PCI_QUIRK(0x1028, 0x0A29, "Bullseye", CS8409_BULLSEYE),
+   SND_PCI_QUIRK(0x1028, 0x0A2A, "Bullseye", CS8409_BULLSEYE),
+   SND_PCI_QUIRK(0x1028, 0x0A2B, "Bullseye", CS8409_BULLSEYE),
+   SND_PCI_QUIRK(0x1028, 0x0AB0, "Warlock", CS8409_WARLOCK),
+   SND_PCI_QUIRK(0x1028, 0x0AB2, "Warlock", CS8409_WARLOCK),
+   SND_PCI_QUIRK(0x1028, 0x0AB1, "Warlock", CS8409_WARLOCK),
+   SND_PCI_QUIRK(0x1028, 0x0AB3, "Warlock", CS8409_WARLOCK),
+   SND_PCI_QUIRK(0x1028, 0x0AB4, "Warlock", 

[PATCH] media:vidtv: remove duplicate include in vidtv_psi

2021-03-06 Thread menglong8 . dong
From: Zhang Yunkai 

'string.h' included in 'vidtv_psi.c' is duplicated.

Signed-off-by: Zhang Yunkai 
---
 drivers/media/test-drivers/vidtv/vidtv_psi.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/media/test-drivers/vidtv/vidtv_psi.c 
b/drivers/media/test-drivers/vidtv/vidtv_psi.c
index 47ed7907db8d..c11ac8dca73d 100644
--- a/drivers/media/test-drivers/vidtv/vidtv_psi.c
+++ b/drivers/media/test-drivers/vidtv/vidtv_psi.c
@@ -19,7 +19,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 
-- 
2.25.1



[syzbot] bpf boot error: WARNING in kvm_wait

2021-03-06 Thread syzbot
Hello,

syzbot found the following issue on:

HEAD commit:edbea922 veth: Store queue_mapping independently of XDP pr..
git tree:   bpf
console output: https://syzkaller.appspot.com/x/log.txt?x=113ae02ad0
kernel config:  https://syzkaller.appspot.com/x/.config?x=402784bff477e1ac
dashboard link: https://syzkaller.appspot.com/bug?extid=46fc491326a456ff8127

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+46fc491326a456ff8...@syzkaller.appspotmail.com

[ cut here ]
raw_local_irq_restore() called with IRQs enabled
WARNING: CPU: 0 PID: 4787 at kernel/locking/irqflag-debug.c:10 
warn_bogus_irq_restore+0x1d/0x20 kernel/locking/irqflag-debug.c:10
Modules linked in:
CPU: 0 PID: 4787 Comm: systemd-getty-g Not tainted 5.11.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 
01/01/2011
RIP: 0010:warn_bogus_irq_restore+0x1d/0x20 kernel/locking/irqflag-debug.c:10
Code: be ff cc cc cc cc cc cc cc cc cc cc cc 80 3d 1e 62 b0 04 00 74 01 c3 48 
c7 c7 a0 8e 6b 89 c6 05 0d 62 b0 04 01 e8 57 da be ff <0f> 0b c3 48 39 77 10 0f 
84 97 00 00 00 66 f7 47 22 f0 ff 74 4b 48
RSP: 0018:c900012efc40 EFLAGS: 00010282
RAX:  RBX: 8be28b80 RCX: 
RDX: 888023de5340 RSI: 815bea35 RDI: f5200025df7a
RBP: 0200 R08:  R09: 0001
R10: 815b77be R11:  R12: 0003
R13: fbfff17c5170 R14: 0001 R15: 8880b9c35f40
FS:  () GS:8880b9c0() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 7fa257bcaab4 CR3: 0bc8e000 CR4: 001506f0
DR0:  DR1:  DR2: 
DR3:  DR6: fffe0ff0 DR7: 0400
Call Trace:
 kvm_wait arch/x86/kernel/kvm.c:860 [inline]
 kvm_wait+0xc9/0xe0 arch/x86/kernel/kvm.c:837
 pv_wait arch/x86/include/asm/paravirt.h:564 [inline]
 pv_wait_head_or_lock kernel/locking/qspinlock_paravirt.h:470 [inline]
 __pv_queued_spin_lock_slowpath+0x8b8/0xb40 kernel/locking/qspinlock.c:508
 pv_queued_spin_lock_slowpath arch/x86/include/asm/paravirt.h:554 [inline]
 queued_spin_lock_slowpath arch/x86/include/asm/qspinlock.h:51 [inline]
 queued_spin_lock include/asm-generic/qspinlock.h:85 [inline]
 do_raw_spin_lock+0x200/0x2b0 kernel/locking/spinlock_debug.c:113
 spin_lock include/linux/spinlock.h:354 [inline]
 check_stack_usage kernel/exit.c:715 [inline]
 do_exit+0x1d6a/0x2ae0 kernel/exit.c:868
 do_group_exit+0x125/0x310 kernel/exit.c:922
 __do_sys_exit_group kernel/exit.c:933 [inline]
 __se_sys_exit_group kernel/exit.c:931 [inline]
 __x64_sys_exit_group+0x3a/0x50 kernel/exit.c:931
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7fa2592a3618
Code: Unable to access opcode bytes at RIP 0x7fa2592a35ee.
RSP: 002b:7ffc579980b8 EFLAGS: 0246 ORIG_RAX: 00e7
RAX: ffda RBX:  RCX: 7fa2592a3618
RDX:  RSI: 003c RDI: 
RBP: 7fa2595808e0 R08: 00e7 R09: fee8
R10: 7fa25775e158 R11: 0246 R12: 7fa2595808e0
R13: 7fa259585c20 R14:  R15: 


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkal...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.


[tip: x86/seves] x86/sev-es: Remove subtraction of res variable

2021-03-06 Thread tip-bot2 for Borislav Petkov
The following commit has been merged into the x86/seves branch of tip:

Commit-ID: f3db3365c069c2a8505cdee8033fe3d22d2fe6c0
Gitweb:
https://git.kernel.org/tip/f3db3365c069c2a8505cdee8033fe3d22d2fe6c0
Author:Borislav Petkov 
AuthorDate:Tue, 23 Feb 2021 12:03:19 +01:00
Committer: Borislav Petkov 
CommitterDate: Sat, 06 Mar 2021 12:08:53 +01:00

x86/sev-es: Remove subtraction of res variable

vc_decode_insn() calls copy_from_kernel_nofault() by way of
vc_fetch_insn_kernel() to fetch 15 bytes max of opcodes to decode.

copy_from_kernel_nofault() returns negative on error and 0 on success.
The error case is handled by returning ES_EXCEPTION.

In the success case, the ret variable which contains the return value is
0 so there's no need to subtract it from MAX_INSN_SIZE when initializing
the insn buffer for further decoding. Remove it.

No functional changes.

Signed-off-by: Borislav Petkov 
Reviewed-by: Joerg Roedel 
Link: https://lkml.kernel.org/r/2021022330.16201-1...@alien8.de
---
 arch/x86/kernel/sev-es.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/sev-es.c b/arch/x86/kernel/sev-es.c
index 84c1821..1e78f4b 100644
--- a/arch/x86/kernel/sev-es.c
+++ b/arch/x86/kernel/sev-es.c
@@ -267,7 +267,7 @@ static enum es_result vc_decode_insn(struct es_em_ctxt *ctxt)
return ES_EXCEPTION;
}
 
-   insn_init(&ctxt->insn, buffer, MAX_INSN_SIZE - res, 1);
+   insn_init(&ctxt->insn, buffer, MAX_INSN_SIZE, 1);
insn_get_length(&ctxt->insn);
}
 


[PATCH v3 4/4] ALSA: hda/cirrus: Add Headphone and Headset MIC Volume Control

2021-03-06 Thread Vitaly Rodionov
From: Stefan Binding 

CS8409 does not support Volume Control for NIDs 0x24 (the Headphones),
or 0x34 (The Headset Mic).
However, CS42L42 codec does support gain control for both.
We can add support for Volume Controls by writing to the CS42L42
regmap via i2c commands, using custom info, get and put volume
functions, saved in the control.

Tested on DELL Inspiron-3500, DELL Inspiron-3501, DELL Inspiron-3505

Signed-off-by: Stefan Binding 
Signed-off-by: Vitaly Rodionov 
---

Changes in v3:
- Added restoring volumes after resume
- Removed redundant debug logging after testing


 sound/pci/hda/patch_cirrus.c | 200 +++
 1 file changed, 200 insertions(+)

diff --git a/sound/pci/hda/patch_cirrus.c b/sound/pci/hda/patch_cirrus.c
index 1d2f6a1224e6..6a9e5c803977 100644
--- a/sound/pci/hda/patch_cirrus.c
+++ b/sound/pci/hda/patch_cirrus.c
@@ -21,6 +21,9 @@
 /*
  */
 
+#define CS42L42_HP_CH (2U)
+#define CS42L42_HS_MIC_CH (1U)
+
 struct cs_spec {
struct hda_gen_spec gen;
 
@@ -42,6 +45,9 @@ struct cs_spec {
 
unsigned int cs42l42_hp_jack_in:1;
unsigned int cs42l42_mic_jack_in:1;
+   unsigned int cs42l42_volume_init:1;
+   char cs42l42_hp_volume[CS42L42_HP_CH];
+   char cs42l42_hs_mic_volume[CS42L42_HS_MIC_CH];
 
struct mutex cs8409_i2c_mux;
 
@@ -1260,6 +1266,14 @@ static int patch_cs4213(struct hda_codec *codec)
 #define CIR_I2C_QWRITE 0x005D
 #define CIR_I2C_QREAD  0x005E
 
+#define CS8409_CS42L42_HP_VOL_REAL_MIN   (-63)
+#define CS8409_CS42L42_HP_VOL_REAL_MAX   (0)
+#define CS8409_CS42L42_AMIC_VOL_REAL_MIN (-97)
+#define CS8409_CS42L42_AMIC_VOL_REAL_MAX (12)
+#define CS8409_CS42L42_REG_HS_VOLUME_CHA (0x2301)
+#define CS8409_CS42L42_REG_HS_VOLUME_CHB (0x2303)
+#define CS8409_CS42L42_REG_AMIC_VOLUME   (0x1D03)
+
 struct cs8409_i2c_param {
unsigned int addr;
unsigned int reg;
@@ -1580,6 +1594,165 @@ static unsigned int cs8409_i2c_write(struct hda_codec *codec,
return retval;
 }
 
+static int cs8409_cs42l42_volume_info(struct snd_kcontrol *kcontrol,
+ struct snd_ctl_elem_info *uinfo)
+{
+   struct hda_codec *codec = snd_kcontrol_chip(kcontrol);
+   u16 nid = get_amp_nid(kcontrol);
+   u8 chs = get_amp_channels(kcontrol);
+
+   codec_dbg(codec, "%s() nid: %d\n", __func__, nid);
+   switch (nid) {
+   case CS8409_CS42L42_HP_PIN_NID:
+   uinfo->type = SNDRV_CTL_ELEM_TYPE_INTEGER;
+   uinfo->count = chs == 3 ? 2 : 1;
+   uinfo->value.integer.min = CS8409_CS42L42_HP_VOL_REAL_MIN;
+   uinfo->value.integer.max = CS8409_CS42L42_HP_VOL_REAL_MAX;
+   break;
+   case CS8409_CS42L42_AMIC_PIN_NID:
+   uinfo->type = SNDRV_CTL_ELEM_TYPE_INTEGER;
+   uinfo->count = chs == 3 ? 2 : 1;
+   uinfo->value.integer.min = CS8409_CS42L42_AMIC_VOL_REAL_MIN;
+   uinfo->value.integer.max = CS8409_CS42L42_AMIC_VOL_REAL_MAX;
+   break;
+   default:
+   break;
+   }
+   return 0;
+}
+
+static void cs8409_cs42l42_update_volume(struct hda_codec *codec)
+{
+   struct cs_spec *spec = codec->spec;
+
+   mutex_lock(&spec->cs8409_i2c_mux);
+   spec->cs42l42_hp_volume[0] = -(cs8409_i2c_read(codec, CS42L42_I2C_ADDR,
+   CS8409_CS42L42_REG_HS_VOLUME_CHA, 1));
+   spec->cs42l42_hp_volume[1] = -(cs8409_i2c_read(codec, CS42L42_I2C_ADDR,
+   CS8409_CS42L42_REG_HS_VOLUME_CHB, 1));
+   spec->cs42l42_hs_mic_volume[0] = -(cs8409_i2c_read(codec, CS42L42_I2C_ADDR,
+   CS8409_CS42L42_REG_AMIC_VOLUME, 1));
+   mutex_unlock(&spec->cs8409_i2c_mux);
+   spec->cs42l42_volume_init = 1;
+}
+
+static int cs8409_cs42l42_volume_get(struct snd_kcontrol *kcontrol,
+struct snd_ctl_elem_value *ucontrol)
+{
+   struct hda_codec *codec = snd_kcontrol_chip(kcontrol);
+   struct cs_spec *spec = codec->spec;
+   hda_nid_t nid = get_amp_nid(kcontrol);
+   int chs = get_amp_channels(kcontrol);
+   long *valp = ucontrol->value.integer.value;
+
+   if (!spec->cs42l42_volume_init) {
+   snd_hda_power_up(codec);
+   cs8409_cs42l42_update_volume(codec);
+   snd_hda_power_down(codec);
+   }
+   switch (nid) {
+   case CS8409_CS42L42_HP_PIN_NID:
+   if (chs & 1)
+   *valp++ = spec->cs42l42_hp_volume[0];
+   if (chs & 2)
+   *valp++ = spec->cs42l42_hp_volume[1];
+   break;
+   case CS8409_CS42L42_AMIC_PIN_NID:
+   if (chs & 1)
+   *valp++ = spec->cs42l42_hs_mic_volume[0];
+   break;
+   default:
+   break;
+   }
+   return 0;
+}
+
+static int cs8409_cs42l42_volume_put(struct snd_kcontrol *kcontrol,
+struct 

Re: [PATCH v2 4/5] mtd: spi-nor: Move Software Write Protection logic out of the core

2021-03-06 Thread Michael Walle

On 2021-03-06 10:50, Tudor Ambarus wrote:

It makes the core file a bit smaller and provides better separation
between the Software Write Protection features and the core logic.
All the next generic software write protection features (e.g. Individual
Block Protection) will reside in swp.c.

Signed-off-by: Tudor Ambarus 
---


[..]

@@ -3554,6 +3152,9 @@ int spi_nor_scan(struct spi_nor *nor, const char 
*name,

if (ret)
return ret;

+   if (nor->params->locking_ops)


Should this be in spi_nor_register_locking_ops(), too? I.e.

void spi_nor_register_locking_ops() {
if (!nor->params->locking_ops)
return;
..
}

I don't have a strong opinion on that so far. I just noticed because
I put the check into spi_nor_otp_init() for my OTP series. They should
be the same though.


+   spi_nor_register_locking_ops(nor);


-michael


Re: [PATCH] x86/smpboot: remove duplicate include in smpboot.c

2021-03-06 Thread Borislav Petkov
On Fri, Mar 05, 2021 at 10:56:10PM -0800, menglong8.d...@gmail.com wrote:
> From: Zhang Yunkai 
> 
> 'cpu_device_id.h' and 'intel_family.h' included in 'smpboot.c'
> are duplicated. They are also included at line 80.
> 
> Signed-off-by: Zhang Yunkai 

If you send another person's patch, then your SOB needs to follow his/hers:

https://www.kernel.org/doc/html/latest/process/submitting-patches.html#sign-your-work-the-developer-s-certificate-of-origin

Also, merge those two x86 patches removing includes into one please.

Thx.

-- 
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette


[PATCH] drm/nouveau: remove duplicate include in nouveau_dmem and base

2021-03-06 Thread menglong8 . dong
From: Zhang Yunkai 

'if000c.h' included in 'nouveau_dmem.c' is duplicated.
'priv.h' included in 'base.c' is duplicated.

Signed-off-by: Zhang Yunkai 
---
 drivers/gpu/drm/nouveau/nouveau_dmem.c   | 1 -
 drivers/gpu/drm/nouveau/nvkm/engine/nvenc/base.c | 2 --
 2 files changed, 3 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c 
b/drivers/gpu/drm/nouveau/nouveau_dmem.c
index 92987daa5e17..f5cc057b123b 100644
--- a/drivers/gpu/drm/nouveau/nouveau_dmem.c
+++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c
@@ -33,7 +33,6 @@
 #include 
 #include 
 #include 
-#include 
 
 #include 
 
diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/nvenc/base.c 
b/drivers/gpu/drm/nouveau/nvkm/engine/nvenc/base.c
index c39e797dc7c9..09524168431c 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/nvenc/base.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/nvenc/base.c
@@ -20,8 +20,6 @@
  * OTHER DEALINGS IN THE SOFTWARE.
  */
 #include "priv.h"
-
-#include "priv.h"
 #include 
 
 static void *
-- 
2.25.1



Re: [PATCH v6 1/2] dt-bindings: hwlock: add sun6i_hwspinlock

2021-03-06 Thread Wilken Gottwalt
On Tue, 2 Mar 2021 18:22:33 +0100
Maxime Ripard  wrote:

> On Mon, Mar 01, 2021 at 03:06:35PM +0100, Wilken Gottwalt wrote:
> > On Mon, 1 Mar 2021 14:12:44 +0100
> > Maxime Ripard  wrote:
> > 
> > > On Sat, Feb 27, 2021 at 02:03:28PM +0100, Wilken Gottwalt wrote:
> > > > Adds documentation on how to use the sun6i_hwspinlock driver for sun6i
> > > > compatible series SoCs.
> > > >
> > > > Signed-off-by: Wilken Gottwalt 
> > > > ---
> > > > Changes in v6:
> > > >   - fixed formating and name issues in dt documentation
> > > >
> > > > Changes in v5:
> > > >   - changed binding to earliest known supported SoC sun6i-a31
> > > >   - dropped unnecessary entries
> > > >
> > > > Changes in v4:
> > > >   - changed binding to sun8i-a33-hwpinlock
> > > >   - added changes suggested by Maxime Ripard
> > > >
> > > > Changes in v3:
> > > >   - changed symbols from sunxi to sun8i
> > > >
> > > > Changes in v2:
> > > >   - fixed memory ranges
> > > > ---
> > > >  .../hwlock/allwinner,sun6i-hwspinlock.yaml| 45 +++
> > > 
> > > The name of the file doesn't match the compatible, once fixed:
> > > Acked-by: Maxime Ripard 
> > 
> > This is something that still confuses me. What if you have more than one
> > compatible string?
> 
> In this case, it's fairly easy, there's only one :)
> 
> But we're following the same rule as the compatible: the first SoC
> that got the compatible wins 
> 
> > This won't be solvable. See the qcom binding for example,
> > there are two strings and none matches. In the omap bindings are also two
> > strings and only one matches. In all cases, including mine, the bindings
> > check script is fine with that.
> 
> If other platforms want to follow other rules, good for them :)
> 
> > So, you basically want it to be called
> > "allwinner,sun6i-a31-hwspinlock.yaml"?
> 
> Yes

Is it okay if I provide only the fixed bindings? I assume the v6 driver is
fine now.

greetings,
Will


Re: [PATCH v6 2/2] hwspinlock: add sun6i hardware spinlock support

2021-03-06 Thread Wilken Gottwalt
On Tue, 2 Mar 2021 18:20:02 +0100
Maxime Ripard  wrote:

> Hi,
> 
> On Mon, Mar 01, 2021 at 03:06:08PM +0100, Wilken Gottwalt wrote:
> > On Mon, 1 Mar 2021 14:13:05 +0100
> > Maxime Ripard  wrote:
> > 
> > > On Sat, Feb 27, 2021 at 02:03:54PM +0100, Wilken Gottwalt wrote:
> > > > Adds the sun6i_hwspinlock driver for the hardware spinlock unit found in
> > > > most of the sun6i compatible SoCs.
> > > >
> > > > This unit provides at least 32 spinlocks in hardware. The implementation
> > > > supports 32, 64, 128 or 256 32bit registers. A lock can be taken by
> > > > reading a register and released by writing a 0 to it. This driver
> > > > supports all 4 spinlock setups, but for now only the first setup (32
> > > > locks) seem to exist in available devices. This spinlock unit is shared
> > > > between all ARM cores and the embedded companion core. All of them can
> > > > take/release a lock with a single cycle operation. It can be used to
> > > > sync access to devices shared by the ARM cores and the companion core.
> > > >
> > > > There are two ways to check if a lock is taken. The first way is to read
> > > > a lock. If a 0 is returned, the lock was free and is taken now. If a 1
> > > > is returned, the lock is already taken and the caller has to try again.
> > > > The second way is to read a 32bit wide status register where every bit
> > > > represents one of the 32 first locks. According to the datasheets this
> > > > status register supports only the 32 first locks. This is the reason the
> > > > first way (lock read/write) approach is used to be able to cover all 256
> > > > locks in future devices. The driver also reports the amount of supported
> > > > locks via debugfs.
> > > >
> > > > Signed-off-by: Wilken Gottwalt 
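
For illustration, the read-to-take / write-zero-to-release protocol described
above boils down to the sketch below. The helper names and the iomem pointer
are assumptions made up for this example, not code from the driver:

static int sun6i_hwlock_trylock_sketch(void __iomem *lock_reg)
{
	/* reading the lock register takes the lock; 0 means it was free */
	return readl(lock_reg) == 0;
}

static void sun6i_hwlock_unlock_sketch(void __iomem *lock_reg)
{
	/* writing 0 releases the lock */
	writel(0, lock_reg);
}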
> > 
> > Nope, I had to replace the devm_hwspin_lock_register function with the
> > hwspin_lock_register function because, as Bjorn pointed out, it can
> > fail and needs to be handled correctly. And having a devm_* function does not
> > play well with the non-devm clock/reset functions and winding back if an
> > error occurs. It also messes with the call order in the remove function. So
> > I went back to the classic way where I have full control over the call 
> > order.
> 
> If you're talking about the clock and reset line reassertion, I don't
> really see what the trouble is. Sure, it's not going to be in the exact
> same order in remove, but it's still going to execute in the proper
> order (ie, clock disable, then reset disable, then clock put and reset
> put). And you can use devm_add_action if you want to handle things
> automatically.

See, in v5 the result of devm_hwspin_lock_register() was returned directly. The
remove callback or the bank_fail/clk_fail labels would not run if the
registering fails. In v6 it is fixed.

+   platform_set_drvdata(pdev, priv);
+
+   return devm_hwspin_lock_register(&pdev->dev, priv->bank,
+&sun6i_hwspinlock_ops,
+SPINLOCK_BASE_ID, priv->nlocks);
+bank_fail:
+   clk_disable_unprepare(priv->ahb_clk);
+clk_fail:
+   reset_control_assert(priv->reset);
+
+   return err;
+}
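
For comparison, the classic (non-devm) registration would look roughly like
the sketch below, with the labels unwinding the earlier setup steps in
reverse. The ops name and the priv fields follow the snippet above and are
assumptions, not the literal v6 code:

	err = hwspin_lock_register(priv->bank, &pdev->dev, &sun6i_hwspinlock_ops,
				   SPINLOCK_BASE_ID, priv->nlocks);
	if (err)
		goto bank_fail;

	platform_set_drvdata(pdev, priv);
	return 0;

bank_fail:
	clk_disable_unprepare(priv->ahb_clk);	/* undo clock enable */
clk_fail:
	reset_control_assert(priv->reset);	/* undo reset deassert */
	return err;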

So, is v6 fine for you even if it uses a more classic approach?

greetings,
Will


[PATCH] drm/amd/display: remove duplicate include in dcn21 and gpio

2021-03-06 Thread menglong8 . dong
From: Zhang Yunkai 

'dce110_resource.h' included in 'dcn21_resource.c' is duplicated.
'hw_gpio.h' included in 'hw_factory_dce110.c' is duplicated.

Signed-off-by: Zhang Yunkai 
---
 drivers/gpu/drm/amd/display/dc/dcn21/dcn21_resource.c | 1 -
 .../gpu/drm/amd/display/dc/gpio/dce110/hw_factory_dce110.c| 4 
 2 files changed, 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn21/dcn21_resource.c 
b/drivers/gpu/drm/amd/display/dc/dcn21/dcn21_resource.c
index 072f8c880924..8a6a965751e8 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn21/dcn21_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn21/dcn21_resource.c
@@ -61,7 +61,6 @@
 #include "dcn21/dcn21_dccg.h"
 #include "dcn21_hubbub.h"
 #include "dcn10/dcn10_resource.h"
-#include "dce110/dce110_resource.h"
 #include "dce/dce_panel_cntl.h"
 
 #include "dcn20/dcn20_dwb.h"
diff --git a/drivers/gpu/drm/amd/display/dc/gpio/dce110/hw_factory_dce110.c 
b/drivers/gpu/drm/amd/display/dc/gpio/dce110/hw_factory_dce110.c
index 66e4841f41e4..ca335ea60412 100644
--- a/drivers/gpu/drm/amd/display/dc/gpio/dce110/hw_factory_dce110.c
+++ b/drivers/gpu/drm/amd/display/dc/gpio/dce110/hw_factory_dce110.c
@@ -48,10 +48,6 @@
 #define REGI(reg_name, block, id)\
mm ## block ## id ## _ ## reg_name
 
-#include "../hw_gpio.h"
-#include "../hw_ddc.h"
-#include "../hw_hpd.h"
-
 #include "reg_helper.h"
 #include "../hpd_regs.h"
 
-- 
2.25.1



Re: [PATCH 2/2] MIPS: Loongson64: Move loongson_system_configuration to loongson.h

2021-03-06 Thread Jiaxun Yang



On Sat, Mar 6, 2021, at 5:53 PM, Thomas Bogendoerfer wrote:
> On Sat, Mar 06, 2021 at 05:00:15PM +0800, Jiaxun Yang wrote:
> > 
> > 
> > On Sat, Mar 6, 2021, at 4:03 PM, Thomas Bogendoerfer wrote:
> > > On Thu, Mar 04, 2021 at 07:00:57PM +0800, Qing Zhang wrote:
> > > > The purpose of separating loongson_system_configuration from 
> > > > boot_param.h
> > > > is to keep the other structure consistent with the firmware.
> > > > 
> > > > Signed-off-by: Jiaxun Yang 
> > > > Signed-off-by: Qing Zhang 
> > > > ---
> > > >  .../include/asm/mach-loongson64/boot_param.h   | 18 --
> > > >  .../include/asm/mach-loongson64/loongson.h | 18 ++
> > > 
> > > as you are already touching mach-loongson64 files...
> > > 
> > > Is there a chance you clean up that up even further ? My goal is to
> > > have only files in mach- files, which have an mach-generic
> > > counterpart. Everything else should go to its own directory. So in
> > > case of loongson something
> > > 
> > > like
> > > 
> > > arch/mips/include/asm/loongsonfor common stuff
> > > arch/mips/include/asm/loongson/32
> > > arch/mips/include/asm/loongson/64
> > 
> > Hi Thomas
> > 
> > I object to this idea, as loongson32/2ef/64 have nothing in common.
> 
> at least they share the name loongson, so having
> 
> arch/mips/include/asm/loongson
> 
> sounds like a good move.
> 
> And seeing 
> 
> diff -u mach-loongson2ef/ mach-loongson64/loongson.h  | diffstat
>  loongson.h |  137 
> +
>  1 file changed, 30 insertions(+), 107 deletions(-)
> 
> wc mach-loongson2ef/loongson.h 
>   318   963 11278 mach-loongson2ef/loongson.h
> 
> so there is something to share. To me it looks like 2ef could be merged
> into 64, but that's not something I want.

Hmm there are duplications in loongson.h just because we didn't clean them up 
when splitting loongson2ef out of loongson64.

> 
> Just to understand you, you want
> 
> arch/mips/include/asm/loongson/2ef
> arch/mips/include/asm/loongson/32
> arch/mips/include/asm/loongson/64

Yeah, it looks reasonable, but from my point of view doing this movement brings
no actual benefit :-(

Thanks.

- Jiaxun

> 
> ?
> 
> Thomas.
> 
> -- 
> Crap can work. Given enough thrust pigs will fly, but it's not necessarily a
> good idea.[ RFC1925, 2.3 ]
>

-- 
- Jiaxun


Re: [PATCH] arm64: dts: add support for the Pixel 2 XL

2021-03-06 Thread Konrad Dybcio


On 05.03.2021 22:35, Caleb Connolly wrote:
> Add a minimal devicetree capable of booting on the Pixel 2 XL MSM8998
> device.
>
> It's currently possible to boot the device into postmarketOS with USB
> networking, however the display panel depends on Display Stream
> Compression which is not yet supported in the kernel.
>
> The bootloader also requires that the dtbo partition contains a device
> tree overlay with a particular id which has to be overlayed onto the
> existing dtb. It's possible to use a specially crafted dtbo partition to
> workaround this, more information is available here:
>
> https://gitlab.com/calebccff/dtbo-google-wahoo-mainline
>
> Signed-off-by: Caleb Connolly 
> ---
> It's possible to get wifi working by running Bjorns diag-router in the
> background, without this the wifi firmware crashes every 10 seconds or
> so. This is the same issue encountered on the OnePlus 5.
>
>  arch/arm64/boot/dts/qcom/Makefile |   1 +
>  .../boot/dts/qcom/msm8998-google-taimen.dts   |  14 +
>  .../boot/dts/qcom/msm8998-google-wahoo.dtsi   | 391 ++
>  3 files changed, 406 insertions(+)
>  create mode 100644 arch/arm64/boot/dts/qcom/msm8998-google-taimen.dts
>  create mode 100644 arch/arm64/boot/dts/qcom/msm8998-google-wahoo.dtsi
>
> diff --git a/arch/arm64/boot/dts/qcom/Makefile 
> b/arch/arm64/boot/dts/qcom/Makefile
> index 5113fac80b7a..d942d3ec3928 100644
> --- a/arch/arm64/boot/dts/qcom/Makefile
> +++ b/arch/arm64/boot/dts/qcom/Makefile
> @@ -16,6 +16,7 @@ dtb-$(CONFIG_ARCH_QCOM) += 
> msm8994-msft-lumia-cityman.dtb
>  dtb-$(CONFIG_ARCH_QCOM)  += msm8994-sony-xperia-kitakami-sumire.dtb
>  dtb-$(CONFIG_ARCH_QCOM)  += msm8996-mtp.dtb
>  dtb-$(CONFIG_ARCH_QCOM)  += msm8998-asus-novago-tp370ql.dtb
> +dtb-$(CONFIG_ARCH_QCOM)  += msm8998-google-taimen.dtb
>  dtb-$(CONFIG_ARCH_QCOM)  += msm8998-hp-envy-x2.dtb
>  dtb-$(CONFIG_ARCH_QCOM)  += msm8998-lenovo-miix-630.dtb
>  dtb-$(CONFIG_ARCH_QCOM)  += msm8998-mtp.dtb
> diff --git a/arch/arm64/boot/dts/qcom/msm8998-google-taimen.dts 
> b/arch/arm64/boot/dts/qcom/msm8998-google-taimen.dts
> new file mode 100644
> index ..ffaaafe14037
> --- /dev/null
> +++ b/arch/arm64/boot/dts/qcom/msm8998-google-taimen.dts
> @@ -0,0 +1,14 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (c) 2020, Caleb Connolly 
> + */
> +
> +/dts-v1/;
> +
> +#include "msm8998-google-wahoo.dtsi"
> +
> +/ {
> + model = "Google Pixel 2 XL";
> + compatible = "google,taimen", "google,wahoo", "qcom,msm8998", 
> "qcom,msm8998-mtp";

Drop the mtp compatible. Also, afaict wahoo is a shared platform name for 
P2/2XL, so perhaps using the same naming scheme we used for Xperias/Lumias 
(soc-vendor-platform-board) would clear up some confusion. In this case, I'm 
not sure about the wahoo compatible, but I reckon it's fine for it to stay so 
that it's easier to introduce potential quirks that concern both devices.


> + qcom,msm-id = <0x124 0x20001>;

Move it to the common dtsi, unless the other Pixel ships with a different SoC 
revision.


> +};
> diff --git a/arch/arm64/boot/dts/qcom/msm8998-google-wahoo.dtsi 
> b/arch/arm64/boot/dts/qcom/msm8998-google-wahoo.dtsi
> new file mode 100644
> index ..0c221ead2df7
> --- /dev/null
> +++ b/arch/arm64/boot/dts/qcom/msm8998-google-wahoo.dtsi
> @@ -0,0 +1,391 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/* Copyright (c) 2020 Caleb Connolly  */
> +
> +#include "msm8998.dtsi"
> +#include "pm8998.dtsi"
> +#include "pmi8998.dtsi"
> +#include "pm8005.dtsi"
> +
> +/delete-node/ _mem;
> +/delete-node/ _mem;
> +/delete-node/ _mem;
> +/delete-node/ _mem;
> +
> +/ {
> + aliases {
> + };
> +
> + chosen {
> + #address-cells = <2>;
> + #size-cells = <2>;
> + ranges;
> +
> + /* Add "earlycon" intended to be used in combination with UART 
> serial console */
> + bootargs = "clk_ignore_unused earlycon 
> console=ttyGS0,115200";// loglevel=10 drm.debug=15 debug";

clk_ignore_unused is a BIG hack!

You should trace which clocks are important for it to stay alive and fix it on 
the driver side.

What breaks if it's not there? Does it still happen with Angelo's clk patches
that got in for the 5.12 window?

Aside from that, //loglevel... should also go.


> +
> + vph_pwr: vph-pwr-regulator {
> + compatible = "regulator-fixed";
> + regulator-name = "vph_pwr";
> + regulator-always-on;
> + regulator-boot-on;
> + };
> +};

Don't you need to specify voltage here?


> +
> +&blsp1_uart3 {
> + status = "disabled";
> +
> + bluetooth {
> + compatible = "qcom,wcn3990-bt";
> +
> + vddio-supply = <&vreg_s4a_1p8>;
> + vddxo-supply = <&vreg_l7a_1p8>;
> + vddrf-supply = <&vreg_l17a_1p3>;
> + vddch0-supply = <&vreg_l25a_3p3>;
> + max-speed = <3200000>;
> + };
> +};

Either enable the UART or rid the 

[PATCH] drm/amd/display: remove duplicate include in amdgpu_dm.c

2021-03-06 Thread menglong8 . dong
From: Zhang Yunkai 

'drm/drm_hdcp.h' included in 'amdgpu_dm.c' is duplicated.
It is also included in the 79th line.

Signed-off-by: Zhang Yunkai 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 3e1fd1e7d09f..fee46fbcb0b7 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -44,7 +44,6 @@
 #include "amdgpu_dm.h"
 #ifdef CONFIG_DRM_AMD_DC_HDCP
 #include "amdgpu_dm_hdcp.h"
-#include 
 #endif
 #include "amdgpu_pm.h"
 
-- 
2.25.1



[tip: x86/urgent] x86/unwind/orc: Disable KASAN checking in the ORC unwinder, part 2

2021-03-06 Thread tip-bot2 for Josh Poimboeuf
The following commit has been merged into the x86/urgent branch of tip:

Commit-ID: 8bd7b3980ca62904814d536b3a2453001992a0c3
Gitweb:
https://git.kernel.org/tip/8bd7b3980ca62904814d536b3a2453001992a0c3
Author:Josh Poimboeuf 
AuthorDate:Fri, 05 Feb 2021 08:24:02 -06:00
Committer: Borislav Petkov 
CommitterDate: Sat, 06 Mar 2021 11:37:00 +01:00

x86/unwind/orc: Disable KASAN checking in the ORC unwinder, part 2

KASAN reserves "redzone" areas between stack frames in order to detect
stack overruns.  A read or write to such an area triggers a KASAN
"stack-out-of-bounds" BUG.

Normally, the ORC unwinder stays in-bounds and doesn't access the
redzone.  But sometimes it can't find ORC metadata for a given
instruction.  This can happen for code which is missing ORC metadata, or
for generated code.  In such cases, the unwinder attempts to fall back
to frame pointers, as a best-effort type thing.

This fallback often works, but when it doesn't, the unwinder can get
confused and go off into the weeds into the KASAN redzone, triggering
the aforementioned KASAN BUG.

But in this case, the unwinder's confusion is actually harmless and
working as designed.  It already has checks in place to prevent
off-stack accesses, but those checks get short-circuited by the KASAN
BUG.  And a BUG is a lot more disruptive than a harmless unwinder
warning.

Disable the KASAN checks by using READ_ONCE_NOCHECK() for all stack
accesses.  This finishes the job started by commit 881125bfe65b
("x86/unwind: Disable KASAN checking in the ORC unwinder"), which only
partially fixed the issue.

Fixes: ee9f8fce9964 ("x86/unwind: Add the ORC unwinder")
Reported-by: Ivan Babrou 
Signed-off-by: Josh Poimboeuf 
Signed-off-by: Peter Zijlstra (Intel) 
Reviewed-by: Steven Rostedt (VMware) 
Tested-by: Ivan Babrou 
Cc: sta...@kernel.org
Link: 
https://lkml.kernel.org/r/9583327904ebbbeda399eca9c56d6c7085ac20fe.1612534649.git.jpoim...@redhat.com
---
 arch/x86/kernel/unwind_orc.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c
index 2a1d47f..1bcc14c 100644
--- a/arch/x86/kernel/unwind_orc.c
+++ b/arch/x86/kernel/unwind_orc.c
@@ -367,8 +367,8 @@ static bool deref_stack_regs(struct unwind_state *state, unsigned long addr,
if (!stack_access_ok(state, addr, sizeof(struct pt_regs)))
return false;
 
-   *ip = regs->ip;
-   *sp = regs->sp;
+   *ip = READ_ONCE_NOCHECK(regs->ip);
+   *sp = READ_ONCE_NOCHECK(regs->sp);
return true;
 }
 
@@ -380,8 +380,8 @@ static bool deref_stack_iret_regs(struct unwind_state *state, unsigned long addr
if (!stack_access_ok(state, addr, IRET_FRAME_SIZE))
return false;
 
-   *ip = regs->ip;
-   *sp = regs->sp;
+   *ip = READ_ONCE_NOCHECK(regs->ip);
+   *sp = READ_ONCE_NOCHECK(regs->sp);
return true;
 }
 
@@ -402,12 +402,12 @@ static bool get_reg(struct unwind_state *state, unsigned int reg_off,
return false;
 
if (state->full_regs) {
-   *val = ((unsigned long *)state->regs)[reg];
+   *val = READ_ONCE_NOCHECK(((unsigned long *)state->regs)[reg]);
return true;
}
 
if (state->prev_regs) {
-   *val = ((unsigned long *)state->prev_regs)[reg];
+   *val = READ_ONCE_NOCHECK(((unsigned long *)state->prev_regs)[reg]);
return true;
}
 


[tip: x86/urgent] x86/entry: Fix entry/exit mismatch on failed fast 32-bit syscalls

2021-03-06 Thread tip-bot2 for Andy Lutomirski
The following commit has been merged into the x86/urgent branch of tip:

Commit-ID: e59ba7bf71a09e474198741563e0e587ae43d1c7
Gitweb:
https://git.kernel.org/tip/e59ba7bf71a09e474198741563e0e587ae43d1c7
Author:Andy Lutomirski 
AuthorDate:Thu, 04 Mar 2021 11:05:54 -08:00
Committer: Borislav Petkov 
CommitterDate: Sat, 06 Mar 2021 11:37:00 +01:00

x86/entry: Fix entry/exit mismatch on failed fast 32-bit syscalls

On a 32-bit fast syscall that fails to read its arguments from user
memory, the kernel currently does syscall exit work but not
syscall entry work.  This confuses audit and ptrace.  For example:

$ ./tools/testing/selftests/x86/syscall_arg_fault_32
...
strace: pid 264258: entering, ptrace_syscall_info.op == 2
...

This is a minimal fix intended for ease of backporting.  A more
complete cleanup is coming.

Fixes: 0b085e68f407 ("x86/entry: Consolidate 32/64 bit syscall entry")
Signed-off-by: Andy Lutomirski 
Signed-off-by: Thomas Gleixner 
Cc: sta...@vger.kernel.org
Link: 
https://lore.kernel.org/r/8c82296ddf803b91f8d1e5eac89e5803ba54ab0e.1614884673.git.l...@kernel.org

---
 arch/x86/entry/common.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
index a2433ae..4efd39a 100644
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -128,7 +128,8 @@ static noinstr bool __do_fast_syscall_32(struct pt_regs *regs)
regs->ax = -EFAULT;
 
instrumentation_end();
-   syscall_exit_to_user_mode(regs);
+   local_irq_disable();
+   irqentry_exit_to_user_mode(regs);
return false;
}
 


[tip: x86/urgent] x86/unwind/orc: Silence warnings caused by missing ORC data

2021-03-06 Thread tip-bot2 for Josh Poimboeuf
The following commit has been merged into the x86/urgent branch of tip:

Commit-ID: d072f941c1e234f8495cc4828370b180318bf49b
Gitweb:
https://git.kernel.org/tip/d072f941c1e234f8495cc4828370b180318bf49b
Author:Josh Poimboeuf 
AuthorDate:Fri, 05 Feb 2021 08:24:03 -06:00
Committer: Borislav Petkov 
CommitterDate: Sat, 06 Mar 2021 11:37:00 +01:00

x86/unwind/orc: Silence warnings caused by missing ORC data

The ORC unwinder attempts to fall back to frame pointers when ORC data
is missing for a given instruction.  It sets state->error, but then
tries to keep going as a best-effort type of thing.  That may result in
further warnings if the unwinder gets lost.

Until we have some way to register generated code with the unwinder,
missing ORC will be expected, and occasionally going off the rails will
also be expected.  So don't warn about it.

Signed-off-by: Josh Poimboeuf 
Signed-off-by: Peter Zijlstra (Intel) 
Tested-by: Ivan Babrou 
Link: 
https://lkml.kernel.org/r/06d02c4bbb220bd31668db579278b0352538efbb.1612534649.git.jpoim...@redhat.com
---
 arch/x86/kernel/unwind_orc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c
index 1bcc14c..a120253 100644
--- a/arch/x86/kernel/unwind_orc.c
+++ b/arch/x86/kernel/unwind_orc.c
@@ -13,7 +13,7 @@
 
 #define orc_warn_current(args...)  \
 ({ \
-   if (state->task == current) \
+   if (state->task == current && !state->error)\
orc_warn(args); \
 })
 


Re: [PATCH v2] input: s6sy761: fix coordinate read bit shift

2021-03-06 Thread Andi Shyti
Hi Caleb,

On Fri, Mar 05, 2021 at 06:58:10PM +, Caleb Connolly wrote:
> The touch coordinate register contains the following:
> 
> byte 3 byte 2 byte 1
> +++ +-+ +-+
> ||| | | | |
> | X[3:0] | Y[3:0] | | Y[11:4] | | X[11:4] |
> ||| | | | |
> +++ +-+ +-+
> 
> Bytes 2 and 1 need to be shifted left by 4 bits, the least significant
> nibble of each is stored in byte 3. Currently they are only
> being shifted by 3 causing the reported coordinates to be incorrect.
> 
> This matches downstream examples, and has been confirmed on my
> device (OnePlus 7 Pro).
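
In other words, the corrected decode is, as a sketch (byte names follow the
diagram above; the driver's actual variable names may differ):

	u16 x = (byte1 << 4) | (byte3 >> 4);	/* X[11:4] | X[3:0] */
	u16 y = (byte2 << 4) | (byte3 & 0x0f);	/* Y[11:4] | Y[3:0] */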
> 
> Fixes: 0145a7141e59 ("Input: add support for the Samsung S6SY761
> touchscreen")
> Signed-off-by: Caleb Connolly 

Reviewed-by: Andi Shyti 

Thanks,
Andi


Re: [PATCH] kbuild: dummy-tools, fix inverted tests for gcc

2021-03-06 Thread Masahiro Yamada
On Wed, Mar 3, 2021 at 7:43 PM Jiri Slaby  wrote:
>
> There is a test in Kconfig which takes inverted value of a compiler
> check:
> * config CC_HAS_INT128
> def_bool !$(cc-option,$(m64-flag) -D__SIZEOF_INT128__=0)
>
> This results in CC_HAS_INT128 not being in super-config generated by
> dummy-tools. So take this into account in the gcc script.
>
> Signed-off-by: Jiri Slaby 
> Cc: Masahiro Yamada 
> ---


Applied to linux-kbuild/fixes.
Thanks.


We could fix init/Kconfig to use the positive logic as follows,
but I guess (hope) this conditional will go away
when we raise the GCC min version next time.
So, I am fine with fixing this in dummy-tools.




diff --git a/init/Kconfig b/init/Kconfig
index 22946fe5ded9..502594a78282 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -849,7 +849,7 @@ config ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH
bool

 config CC_HAS_INT128
-   def_bool !$(cc-option,$(m64-flag) -D__SIZEOF_INT128__=0) && 64BIT
+   def_bool $(success,echo '__int128 x;' | $(CC) $(CLANG_FLAGS) -x c - -c -o /dev/null) && 64BIT

 #
 # For architectures that know their GCC __int128 support is sound





-- 
Best Regards
Masahiro Yamada

