Re: [PATCH] powerpc: Avoid clang uninitialized warning in __get_user_size_allowed

2021-04-26 Thread Christophe Leroy




Le 26/04/2021 à 22:35, Nathan Chancellor a écrit :

Commit 9975f852ce1b ("powerpc/uaccess: Remove calls to __get_user_bad()
and __put_user_bad()") switch to BUILD_BUG() in the default case, which
leaves x uninitialized. This will not be an issue because the build will
be broken in that case but clang does static analysis before it realizes
the default case will be done so it warns about x being uninitialized
(trimmed for brevity):

  In file included from mm/mprotect.c:13:
  In file included from ./include/linux/hugetlb.h:28:
  In file included from ./include/linux/mempolicy.h:16:
  ./include/linux/pagemap.h:772:16: warning: variable '__gu_val' is used
  uninitialized whenever switch default is taken [-Wsometimes-uninitialized]
  if (unlikely(__get_user(c, uaddr) != 0))
   ^~~~
  ./arch/powerpc/include/asm/uaccess.h:266:2: note: expanded from macro 
'__get_user'
  __get_user_size_allowed(__gu_val, __gu_addr, __gu_size, __gu_err);
  \
  ^
  ./arch/powerpc/include/asm/uaccess.h:235:2: note: expanded from macro
  '__get_user_size_allowed'
 default: BUILD_BUG();   \
 ^~~

Commit 5cd29b1fd3e8 ("powerpc/uaccess: Use asm goto for get_user when
compiler supports it") added an initialization for x because of the same
reason. Do the same thing here so there is no warning across all
versions of clang.


Ah yes, I tested with Clang 11 which has CONFIG_CC_HAS_ASM_GOTO_OUTPUT, that's the reason why I hit 
that warning only in the CONFIG_CC_HAS_ASM_GOTO_OUTPUT branch.


But regardless, is that normal that Clang warns that on a never taken branch ? 
That's puzzling.



Link: https://github.com/ClangBuiltLinux/linux/issues/1359
Signed-off-by: Nathan Chancellor 


Acked-by: Christophe Leroy 


---
  arch/powerpc/include/asm/uaccess.h | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/uaccess.h 
b/arch/powerpc/include/asm/uaccess.h
index a4e791bcd3fe..a09e4240c5b1 100644
--- a/arch/powerpc/include/asm/uaccess.h
+++ b/arch/powerpc/include/asm/uaccess.h
@@ -232,7 +232,7 @@ do {
\
case 2: __get_user_asm(x, (u16 __user *)ptr, retval, "lhz"); break;   \
case 4: __get_user_asm(x, (u32 __user *)ptr, retval, "lwz"); break;   \
case 8: __get_user_asm2(x, (u64 __user *)ptr, retval);  break;  \
-   default: BUILD_BUG();   \
+   default: x = 0; BUILD_BUG();\
}   \
  } while (0)
  


base-commit: ee6b25fa7c037e42cc5f3b5c024b2a779edab6dd



[PATCH v6] powerpc/kexec_file: use current CPU info while setting up FDT

2021-04-26 Thread Sourabh Jain
kexec_file_load uses initial_boot_params in setting up the device-tree
for the kernel to be loaded. Though initial_boot_params holds info
about CPUs at the time of boot, it doesn't account for hot added CPUs.

So, kexec'ing with kexec_file_load syscall would leave the kexec'ed
kernel with inaccurate CPU info. Also, if kdump kernel is loaded with
kexec_file_load syscall and the system crashes on a hot added CPU,
capture kernel hangs failing to identify the boot CPU.

 Kernel panic - not syncing: sysrq triggered crash
 CPU: 24 PID: 6065 Comm: echo Kdump: loaded Not tainted 5.12.0-rc5upstream #54
 Call Trace:
 [c000e590fac0] [c07b2400] dump_stack+0xc4/0x114 (unreliable)
 [c000e590fb00] [c0145290] panic+0x16c/0x41c
 [c000e590fba0] [c08892e0] sysrq_handle_crash+0x30/0x40
 [c000e590fc00] [c0889cdc] __handle_sysrq+0xcc/0x1f0
 [c000e590fca0] [c088a538] write_sysrq_trigger+0xd8/0x178
 [c000e590fce0] [c05e9b7c] proc_reg_write+0x10c/0x1b0
 [c000e590fd10] [c04f26d0] vfs_write+0xf0/0x330
 [c000e590fd60] [c04f2aec] ksys_write+0x7c/0x140
 [c000e590fdb0] [c0031ee0] system_call_exception+0x150/0x290
 [c000e590fe10] [c000ca5c] system_call_common+0xec/0x278
 --- interrupt: c00 at 0x7fff905b9664
 NIP:  7fff905b9664 LR: 7fff905320c4 CTR: 
 REGS: c000e590fe80 TRAP: 0c00   Not tainted  (5.12.0-rc5upstream)
 MSR:  8280f033   CR: 28000242
   XER: 
 IRQMASK: 0
 GPR00: 0004 75fedf30 7fff906a7300 0001
 GPR04: 01002a7355b0 0002 0001 75fef616
 GPR08: 0001   
 GPR12:  7fff9073a160  
 GPR16:    
 GPR20:  7fff906a4ee0 0002 0001
 GPR24: 7fff906a0898  0002 01002a7355b0
 GPR28: 0002 7fff906a1790 01002a7355b0 0002
 NIP [7fff905b9664] 0x7fff905b9664
 LR [7fff905320c4] 0x7fff905320c4
 --- interrupt: c00

To avoid this from happening, extract current CPU info from of_root
device node and use it for setting up the fdt in kexec_file_load case.

Fixes: 6ecd0163d360 ("powerpc/kexec_file: Add appropriate regions for memory 
reserve map")

Signed-off-by: Sourabh Jain 
Reviewed-by: Hari Bathini 
Cc: 
---
 arch/powerpc/kexec/file_load_64.c | 88 +++
 1 file changed, 88 insertions(+)

 ---
Changelog:

v1 -> v5
  - https://lists.ozlabs.org/pipermail/linuxppc-dev/2021-April/227950.html

v5 -> v6
  - use exiting macro (for_each_property_of_node) to loop through all
properties of a node.
  - removed devtree_lock while accessing the node properties.
  - function name update, add_node_prop to add_node_props.
 ---

diff --git a/arch/powerpc/kexec/file_load_64.c 
b/arch/powerpc/kexec/file_load_64.c
index 02b9e4d0dc40..4f7d4c10f939 100644
--- a/arch/powerpc/kexec/file_load_64.c
+++ b/arch/powerpc/kexec/file_load_64.c
@@ -960,6 +960,89 @@ unsigned int kexec_fdt_totalsize_ppc64(struct kimage 
*image)
return fdt_size;
 }
 
+/**
+ * add_node_props - Reads node properties from device node structure and add
+ *  them to fdt.
+ * @fdt:Flattened device tree of the kernel
+ * @node_offset:offset of the node to add a property at
+ * @dn: device node pointer
+ *
+ * Returns 0 on success, negative errno on error.
+ */
+static int add_node_props(void *fdt, int node_offset, const struct device_node 
*dn)
+{
+   int ret = 0;
+   struct property *pp;
+
+   if (!dn)
+   return -EINVAL;
+
+   for_each_property_of_node(dn, pp) {
+   ret = fdt_setprop(fdt, node_offset, pp->name, pp->value, 
pp->length);
+   if (ret < 0) {
+   pr_err("Unable to add %s property: %s\n", pp->name, 
fdt_strerror(ret));
+   return ret;
+   }
+   }
+   return ret;
+}
+
+/**
+ * update_cpus_node - Update cpus node of flattened device tree using of_root
+ *device node.
+ * @fdt:  Flattened device tree of the kernel.
+ *
+ * Returns 0 on success, negative errno on error.
+ */
+static int update_cpus_node(void *fdt)
+{
+   struct device_node *cpus_node, *dn;
+   int cpus_offset, cpus_subnode_offset, ret = 0;
+
+   cpus_offset = fdt_path_offset(fdt, "/cpus");
+   if (cpus_offset < 0 && cpus_offset != -FDT_ERR_NOTFOUND) {
+   pr_err("Malformed device tree: error reading /cpus node: %s\n",
+  fdt_strerror(cpus_offset));
+   return cpus_offset;
+   }
+
+   if (cpus_offset > 0) {
+   ret = fdt_del_node(fdt, cpus_offset);
+   if (ret < 0) {
+   

Re: [PATCH net-next v4 2/2] of: net: fix of_get_mac_addr_nvmem() for non-platform devices

2021-04-26 Thread Benjamin Herrenschmidt
On Mon, 2021-04-26 at 12:54 +0200, Michael Walle wrote:
> Before I'll try to come up with a patch for this, I'd like to get
> your opinion on it.
> 
> (1) replacing of_get_mac_address(node) with eth_get_mac_address(dev)
>  might sometimes lead to confusing comments like in
>  drivers/net/ethernet/allwinner/sun4i-emac.c:
> 
>  /* Read MAC-address from DT */
>  ret = of_get_mac_address(np, ndev->dev_addr);

You could leave it or turn it into "from platform", doesn't matter...

> (2) What do you think of eth_get_mac_address(ndev). That is, the

Not sure what you mean, eth_platform_get_mac_address() takes the
address as an argument. I think what you want is a consolidated
nvmem_get_mac_address + eth_platform_get_mac_address that takes a
device, which would have no requirement of the bus_type at all.

Cheers,
Ben.



Re: [PATCH 2/9] ARM: PXA: Kill use of irq_create_strict_mappings()

2021-04-26 Thread Guenter Roeck
On Tue, Apr 06, 2021 at 10:35:50AM +0100, Marc Zyngier wrote:
> irq_create_strict_mappings() is a poor way to allow the use of
> a linear IRQ domain as a legacy one. Let's be upfront about
> it and use a legacy domain when appropriate.
> 
> Signed-off-by: Marc Zyngier 
> ---

When running the "mainstone" qemu emulation, this patch results
in many (32, actually) runtime warnings such as the following.

[0.528272] [ cut here ]
[0.528285] WARNING: CPU: 0 PID: 1 at kernel/irq/irqdomain.c:550 
irq_domain_associate+0x194/0x1f0
[0.528315] error: virq335 is not allocated
[0.528325] Modules linked in:
[0.528351] CPU: 0 PID: 1 Comm: swapper Tainted: GW 
5.12.0-rc8-next-20210423 #1
[0.528372] Hardware name: Intel HCDDBBVA0 Development Platform (aka 
Mainstone)
[0.528387] Backtrace:
[0.528406] [] (dump_backtrace) from [] 
(show_stack+0x20/0x24)
[0.528441]  r7:0226 r6:c00796e8 r5:0009 r4:c088d2a0
[0.528454] [] (show_stack) from [] 
(dump_stack+0x28/0x30)
[0.528479] [] (dump_stack) from [] (__warn+0xe8/0x110)
[0.528507]  r5:0009 r4:c0872698
[0.528520] [] (__warn) from [] 
(warn_slowpath_fmt+0xa0/0xe0)
[0.528551]  r7:c00796e8 r6:0226 r5:c0872698 r4:c0872700
[0.528564] [] (warn_slowpath_fmt) from [] 
(irq_domain_associate+0x194/0x1f0)
[0.528597]  r8:0130 r7:014f r6:001f r5: r4:c11bd780
[0.528610] [] (irq_domain_associate) from [] 
(irq_domain_associate_many+0x60/0xa4)
[0.528642]  r8:0130 r7:c11bd780 r6:fed0 r5:0150 r4:0150
[0.528655] [] (irq_domain_associate_many) from [] 
(irq_domain_create_legacy+0x5c/0x68)
[0.528687]  r8:0130 r7:0130 r6:0020 r5: r4:c11bd780
[0.528699] [] (irq_domain_create_legacy) from [] 
(irq_domain_add_legacy+0x34/0x3c)
[0.528730]  r7:c09b1370 r6:c09b1360 r5:c11bd3a0 r4:
[0.528743] [] (irq_domain_add_legacy) from [] 
(cplds_probe+0x170/0x1ac)
[0.528768] [] (cplds_probe) from [] 
(platform_probe+0x50/0xb0)
[0.528800]  r8:c09d2c94 r7:c0aa4f88 r6:c09d2c94 r5:c09b1370 r4:
[0.528814] [] (platform_probe) from [] 
(really_probe+0x100/0x4d4)
[0.528844]  r7:c0aa4f88 r6: r5: r4:c09b1370
[0.528858] [] (really_probe) from [] 
(driver_probe_device+0x88/0x20c)
[0.528892]  r10:c0974830 r9:c0a7 r8:c093b224 r7:c0a31de8 r6:c09d2c94 
r5:c09d2c94
[0.528907]  r4:c09b1370
[0.528919] [] (driver_probe_device) from [] 
(device_driver_attach+0x68/0x70)
[0.528953]  r9:c0a7 r8:c093b224 r7:c0a31de8 r6:c09d2c94 r5: 
r4:c09b1370
[0.528969] [] (device_driver_attach) from [] 
(__driver_attach+0xc0/0x164)
[0.528997]  r7:c0a31de8 r6:c09b1370 r5:c09d2c94 r4:
[0.529009] [] (__driver_attach) from [] 
(bus_for_each_dev+0x84/0xcc)
[0.529039]  r7:c0a31de8 r6:c04306d4 r5:c09d2c94 r4:
[0.529052] [] (bus_for_each_dev) from [] 
(driver_attach+0x28/0x30)
[0.529082]  r6: r5:c11bd200 r4:c09d2c94
[0.529095] [] (driver_attach) from [] 
(bus_add_driver+0x168/0x210)
[0.529122] [] (bus_add_driver) from [] 
(driver_register+0x88/0x120)
[0.529152]  r7:c0a5c7e0 r6: r5:e000 r4:c09d2c94
[0.529165] [] (driver_register) from [] 
(__platform_driver_register+0x2c/0x34)
[0.529191]  r5:e000 r4:c094ba64
[0.529204] [] (__platform_driver_register) from [] 
(cplds_driver_init+0x20/0x28)
[0.529230] [] (cplds_driver_init) from [] 
(do_one_initcall+0x60/0x27c)
[0.529255] [] (do_one_initcall) from [] 
(kernel_init_freeable+0x158/0x1e4)
[0.529284]  r7:c0974850 r6:0007 r5:c0c0f720 r4:c09a02fc
[0.529297] [] (kernel_init_freeable) from [] 
(kernel_init+0x18/0x110)
[0.529328]  r10: r9: r8: r7: r6: 
r5:c06c525c
[0.529343]  r4:
[0.529354] [] (kernel_init) from [] 
(ret_from_fork+0x14/0x2c)
[0.529387] Exception stack(0xc0bdffb0 to 0xc0bdfff8)
[0.529467] ffa0:   
 
[0.529587] ffc0:       
 
[0.529684] ffe0:     0013 
[0.529726]  r5:c06c525c r4:
[0.529752] ---[ end trace d199929d2b87e077 ]---

Bisect log attached.

Guenter

---
# bad: [e3d35712f85ac84fb06234848f6c043ab418cf8b] Add linux-next specific files 
for 20210423
# good: [bf05bf16c76bb44ab5156223e1e58e26dfe30a88] Linux 5.12-rc8
git bisect start 'HEAD' 'v5.12-rc8'
# good: [d4b5d9d94679a18bfa4ccdafd19876d58777911e] Merge remote-tracking branch 
'crypto/master'
git bisect good d4b5d9d94679a18bfa4ccdafd19876d58777911e
# good: [27628e42fe59a698e66b671bf1e1f01f6a3fe765] Merge remote-tracking branch 
'tip/auto-latest'
git bisect good 27628e42fe59a698e66b671bf1e1f01f6a3fe765
# bad: [bc6c3ae4f662fc719d0bf144f150f72cab8912d4] Merge remote-tracking branch 
'vfio/next'
git bisect 

Re: [PATCH] kbuild: replace LANG=C with LC_ALL=C

2021-04-26 Thread Matthieu Baerts
Hi,

Thank you for the patch!

On 24/04/2021 13:48, Masahiro Yamada wrote:
> LANG gives a weak default to each LC_* in case it is not explicitly
> defined. LC_ALL, if set, overrides all other LC_* variables.
> 
>   LANG  <  LC_CTYPE, LC_COLLATE, LC_MONETARY, LC_NUMERIC, ...  <  LC_ALL
> 
> This is why documentation such as [1] suggests to set LC_ALL in build
> scripts to get the deterministic result.
> 
> LANG=C is not strong enough to override LC_* that may be set by end
> users.
> 
> [1]: https://reproducible-builds.org/docs/locales/
> 
> Signed-off-by: Masahiro Yamada 
> ---
> 
>  arch/powerpc/boot/wrapper  | 2 +-
>  scripts/nsdeps | 2 +-
>  scripts/recordmcount.pl| 2 +-
>  scripts/setlocalversion| 2 +-
>  scripts/tags.sh| 2 +-
>  tools/testing/selftests/net/mptcp/mptcp_connect.sh | 2 +-

Acked-by: Matthieu Baerts  (mptcp)

Cheers,
Matt
-- 
Tessares | Belgium | Hybrid Access Solutions
www.tessares.net


Re: [PATCH v2 2/2] powerpc: If kexec_build_elf_info() fails return immediately from elf64_load()

2021-04-26 Thread Rob Herring
On Wed, 21 Apr 2021 09:36:10 -0700, Lakshmi Ramasubramanian wrote:
> Uninitialized local variable "elf_info" would be passed to
> kexec_free_elf_info() if kexec_build_elf_info() returns an error
> in elf64_load().
> 
> If kexec_build_elf_info() returns an error, return the error
> immediately.
> 
> Signed-off-by: Lakshmi Ramasubramanian 
> Reported-by: Dan Carpenter 
> Reviewed-by: Michael Ellerman 
> ---
>  arch/powerpc/kexec/elf_64.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 

Applied, thanks!


Re: [PATCH v2 1/2] powerpc: Free fdt on error in elf64_load()

2021-04-26 Thread Rob Herring
On Wed, 21 Apr 2021 09:36:09 -0700, Lakshmi Ramasubramanian wrote:
> There are a few "goto out;" statements before the local variable "fdt"
> is initialized through the call to of_kexec_alloc_and_setup_fdt() in
> elf64_load().  This will result in an uninitialized "fdt" being passed
> to kvfree() in this function if there is an error before the call to
> of_kexec_alloc_and_setup_fdt().
> 
> If there is any error after fdt is allocated, but before it is
> saved in the arch specific kimage struct, free the fdt.
> 
> Reported-by: kernel test robot 
> Reported-by: Dan Carpenter 
> Signed-off-by: Michael Ellerman 
> Signed-off-by: Lakshmi Ramasubramanian 
> ---
>  arch/powerpc/kexec/elf_64.c | 16 ++--
>  1 file changed, 6 insertions(+), 10 deletions(-)
> 

Applied, thanks!


[PATCH] powerpc: Avoid clang uninitialized warning in __get_user_size_allowed

2021-04-26 Thread Nathan Chancellor
Commit 9975f852ce1b ("powerpc/uaccess: Remove calls to __get_user_bad()
and __put_user_bad()") switch to BUILD_BUG() in the default case, which
leaves x uninitialized. This will not be an issue because the build will
be broken in that case but clang does static analysis before it realizes
the default case will be done so it warns about x being uninitialized
(trimmed for brevity):

 In file included from mm/mprotect.c:13:
 In file included from ./include/linux/hugetlb.h:28:
 In file included from ./include/linux/mempolicy.h:16:
 ./include/linux/pagemap.h:772:16: warning: variable '__gu_val' is used
 uninitialized whenever switch default is taken [-Wsometimes-uninitialized]
 if (unlikely(__get_user(c, uaddr) != 0))
  ^~~~
 ./arch/powerpc/include/asm/uaccess.h:266:2: note: expanded from macro 
'__get_user'
 __get_user_size_allowed(__gu_val, __gu_addr, __gu_size, __gu_err); 
 \
 ^
 ./arch/powerpc/include/asm/uaccess.h:235:2: note: expanded from macro
 '__get_user_size_allowed'
default: BUILD_BUG();   \
^~~

Commit 5cd29b1fd3e8 ("powerpc/uaccess: Use asm goto for get_user when
compiler supports it") added an initialization for x because of the same
reason. Do the same thing here so there is no warning across all
versions of clang.

Link: https://github.com/ClangBuiltLinux/linux/issues/1359
Signed-off-by: Nathan Chancellor 
---
 arch/powerpc/include/asm/uaccess.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/uaccess.h 
b/arch/powerpc/include/asm/uaccess.h
index a4e791bcd3fe..a09e4240c5b1 100644
--- a/arch/powerpc/include/asm/uaccess.h
+++ b/arch/powerpc/include/asm/uaccess.h
@@ -232,7 +232,7 @@ do {
\
case 2: __get_user_asm(x, (u16 __user *)ptr, retval, "lhz"); break; 
\
case 4: __get_user_asm(x, (u32 __user *)ptr, retval, "lwz"); break; 
\
case 8: __get_user_asm2(x, (u64 __user *)ptr, retval);  break;  \
-   default: BUILD_BUG();   \
+   default: x = 0; BUILD_BUG();\
}   \
 } while (0)
 

base-commit: ee6b25fa7c037e42cc5f3b5c024b2a779edab6dd
-- 
2.31.1.362.g311531c9de



Re: [PATCH v7] powerpc/irq: Inline call_do_irq() and call_do_softirq()

2021-04-26 Thread Nathan Chancellor
On Sat, Mar 20, 2021 at 11:22:27PM +1100, Michael Ellerman wrote:
> From: Christophe Leroy 
> 
> call_do_irq() and call_do_softirq() are simple enough to be
> worth inlining.
> 
> Inlining them avoids an mflr/mtlr pair plus a save/reload on stack. It
> also allows GCC to keep the saved ksp_limit in an nonvolatile reg.
> 
> This is inspired from S390 arch. Several other arches do more or
> less the same. The way sparc arch does seems odd thought.
> 
> Signed-off-by: Christophe Leroy 
> Signed-off-by: Michael Ellerman 
> ---
> 
> v2: no change.
> v3: no change.
> v4:
> - comment reminding the purpose of the inline asm block.
> - added r2 as clobbered reg
> v5:
> - Limiting the change to PPC32 for now.
> - removed r2 from the clobbered regs list (on PPC32 r2 points to current all 
> the time)
> - Removed patch 1 and merged ksp_limit handling in here.
> v6:
> - Rebase on top of merge-test (ca6e327fefb2).
> - Remove the ksp_limit stuff as it's doesn't exist anymore.
> 
> v7:
> mpe:
> - Enable for 64-bit too. This all in-kernel code calling in-kernel
>   code, and must use the kernel TOC.
> - Use named parameters for the inline asm.
> - Reformat inline asm.
> - Mark as always_inline.
> - Drop unused ret from call_do_softirq(), add r3 as clobbered.
> ---
>  arch/powerpc/include/asm/irq.h |  2 --
>  arch/powerpc/kernel/irq.c  | 41 ++
>  arch/powerpc/kernel/misc_32.S  | 25 -
>  arch/powerpc/kernel/misc_64.S  | 22 --
>  4 files changed, 41 insertions(+), 49 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/irq.h b/arch/powerpc/include/asm/irq.h
> index f3f264e441a7..b2bd58830430 100644
> --- a/arch/powerpc/include/asm/irq.h
> +++ b/arch/powerpc/include/asm/irq.h
> @@ -53,8 +53,6 @@ extern void *mcheckirq_ctx[NR_CPUS];
>  extern void *hardirq_ctx[NR_CPUS];
>  extern void *softirq_ctx[NR_CPUS];
>  
> -void call_do_softirq(void *sp);
> -void call_do_irq(struct pt_regs *regs, void *sp);
>  extern void do_IRQ(struct pt_regs *regs);
>  extern void __init init_IRQ(void);
>  extern void __do_irq(struct pt_regs *regs);
> diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
> index 5b72abbff96c..260effc0a435 100644
> --- a/arch/powerpc/kernel/irq.c
> +++ b/arch/powerpc/kernel/irq.c
> @@ -667,6 +667,47 @@ static inline void check_stack_overflow(void)
>   }
>  }
>  
> +static __always_inline void call_do_softirq(const void *sp)
> +{
> + /* Temporarily switch r1 to sp, call __do_softirq() then restore r1. */
> + asm volatile (
> +  PPC_STLU " %%r1, %[offset](%[sp])  ;"
> + "mr %%r1, %[sp] ;"
> + "bl %[callee]   ;"
> +  PPC_LL "   %%r1, 0(%%r1)   ;"
> +  : // Outputs
> +  : // Inputs
> +[sp] "b" (sp), [offset] "i" (THREAD_SIZE - 
> STACK_FRAME_OVERHEAD),
> +[callee] "i" (__do_softirq)
> +  : // Clobbers
> +"lr", "xer", "ctr", "memory", "cr0", "cr1", "cr5", "cr6",
> +"cr7", "r0", "r3", "r4", "r5", "r6", "r7", "r8", "r9", "r10",
> +"r11", "r12"
> + );
> +}
> +
> +static __always_inline void call_do_irq(struct pt_regs *regs, void *sp)
> +{
> + register unsigned long r3 asm("r3") = (unsigned long)regs;
> +
> + /* Temporarily switch r1 to sp, call __do_irq() then restore r1. */
> + asm volatile (
> +  PPC_STLU " %%r1, %[offset](%[sp])  ;"
> + "mr %%r1, %[sp] ;"
> + "bl %[callee]   ;"
> +  PPC_LL "   %%r1, 0(%%r1)   ;"
> +  : // Outputs
> +"+r" (r3)
> +  : // Inputs
> +[sp] "b" (sp), [offset] "i" (THREAD_SIZE - 
> STACK_FRAME_OVERHEAD),
> +[callee] "i" (__do_irq)
> +  : // Clobbers
> +"lr", "xer", "ctr", "memory", "cr0", "cr1", "cr5", "cr6",
> +"cr7", "r0", "r4", "r5", "r6", "r7", "r8", "r9", "r10",
> +"r11", "r12"
> + );
> +}
> +
>  void __do_irq(struct pt_regs *regs)
>  {
>   unsigned int irq;
> diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S
> index acc410043b96..6a076bef2932 100644
> --- a/arch/powerpc/kernel/misc_32.S
> +++ b/arch/powerpc/kernel/misc_32.S
> @@ -27,31 +27,6 @@
>  
>   .text
>  
> -_GLOBAL(call_do_softirq)
> - mflrr0
> - stw r0,4(r1)
> - stwur1,THREAD_SIZE-STACK_FRAME_OVERHEAD(r3)
> - mr  r1,r3
> - bl  __do_softirq
> - lwz r1,0(r1)
> - lwz r0,4(r1)
> - mtlrr0
> - blr
> -
> -/*
> - * void call_do_irq(struct pt_regs *regs, void *sp);
> - */
> -_GLOBAL(call_do_irq)
> - mflrr0
> - stw r0,4(r1)
> - stwur1,THREAD_SIZE-STACK_FRAME_OVERHEAD(r4)
> - mr  r1,r4
> - bl  __do_irq
> - lwz 

Re: [PATCH v5 16/16] of: Add plumbing for restricted DMA pool

2021-04-26 Thread Claire Chang
On Fri, Apr 23, 2021 at 9:35 PM Robin Murphy  wrote:
>
> On 2021-04-22 09:15, Claire Chang wrote:
> > If a device is not behind an IOMMU, we look up the device node and set
> > up the restricted DMA when the restricted-dma-pool is presented.
> >
> > Signed-off-by: Claire Chang 
> > ---
> >   drivers/of/address.c| 25 +
> >   drivers/of/device.c |  3 +++
> >   drivers/of/of_private.h |  5 +
> >   3 files changed, 33 insertions(+)
> >
> > diff --git a/drivers/of/address.c b/drivers/of/address.c
> > index 54f221dde267..fff3adfe4986 100644
> > --- a/drivers/of/address.c
> > +++ b/drivers/of/address.c
> > @@ -8,6 +8,7 @@
> >   #include 
> >   #include 
> >   #include 
> > +#include 
> >   #include 
> >   #include 
> >   #include 
> > @@ -1109,6 +1110,30 @@ bool of_dma_is_coherent(struct device_node *np)
> >   }
> >   EXPORT_SYMBOL_GPL(of_dma_is_coherent);
> >
> > +int of_dma_set_restricted_buffer(struct device *dev)
> > +{
> > + struct device_node *node;
> > + int count, i;
> > +
> > + if (!dev->of_node)
> > + return 0;
> > +
> > + count = of_property_count_elems_of_size(dev->of_node, "memory-region",
> > + sizeof(phandle));
> > + for (i = 0; i < count; i++) {
> > + node = of_parse_phandle(dev->of_node, "memory-region", i);
> > + /* There might be multiple memory regions, but only one
> > +  * restriced-dma-pool region is allowed.
> > +  */
>
> What's the use-case for having multiple regions if the restricted pool
> is by definition the only one accessible?

There might be a device coherent pool (shared-dma-pool) and
dma_alloc_attrs might allocate memory from that pool [1].
I'm not sure if it makes sense to have another device coherent pool
while using restricted DMA pool though.

[1] https://elixir.bootlin.com/linux/v5.12/source/kernel/dma/mapping.c#L435


>
> Robin.
>
> > + if (of_device_is_compatible(node, "restricted-dma-pool") &&
> > + of_device_is_available(node))
> > + return of_reserved_mem_device_init_by_idx(
> > + dev, dev->of_node, i);
> > + }
> > +
> > + return 0;
> > +}
> > +
> >   /**
> >* of_mmio_is_nonposted - Check if device uses non-posted MMIO
> >* @np: device node
> > diff --git a/drivers/of/device.c b/drivers/of/device.c
> > index c5a9473a5fb1..d8d865223e51 100644
> > --- a/drivers/of/device.c
> > +++ b/drivers/of/device.c
> > @@ -165,6 +165,9 @@ int of_dma_configure_id(struct device *dev, struct 
> > device_node *np,
> >
> >   arch_setup_dma_ops(dev, dma_start, size, iommu, coherent);
> >
> > + if (!iommu)
> > + return of_dma_set_restricted_buffer(dev);
> > +
> >   return 0;
> >   }
> >   EXPORT_SYMBOL_GPL(of_dma_configure_id);
> > diff --git a/drivers/of/of_private.h b/drivers/of/of_private.h
> > index d717efbd637d..e9237f5eff48 100644
> > --- a/drivers/of/of_private.h
> > +++ b/drivers/of/of_private.h
> > @@ -163,12 +163,17 @@ struct bus_dma_region;
> >   #if defined(CONFIG_OF_ADDRESS) && defined(CONFIG_HAS_DMA)
> >   int of_dma_get_range(struct device_node *np,
> >   const struct bus_dma_region **map);
> > +int of_dma_set_restricted_buffer(struct device *dev);
> >   #else
> >   static inline int of_dma_get_range(struct device_node *np,
> >   const struct bus_dma_region **map)
> >   {
> >   return -ENODEV;
> >   }
> > +static inline int of_dma_get_restricted_buffer(struct device *dev)
> > +{
> > + return -ENODEV;
> > +}
> >   #endif
> >
> >   #endif /* _LINUX_OF_PRIVATE_H */
> >


Re: [PATCH v5 05/16] swiotlb: Add restricted DMA pool initialization

2021-04-26 Thread Claire Chang
On Fri, Apr 23, 2021 at 7:34 PM Steven Price  wrote:
>
> On 22/04/2021 09:14, Claire Chang wrote:
> > Add the initialization function to create restricted DMA pools from
> > matching reserved-memory nodes.
> >
> > Signed-off-by: Claire Chang 
> > ---
> >   include/linux/device.h  |  4 +++
> >   include/linux/swiotlb.h |  3 +-
> >   kernel/dma/swiotlb.c| 80 +
> >   3 files changed, 86 insertions(+), 1 deletion(-)
> >
> > diff --git a/include/linux/device.h b/include/linux/device.h
> > index 38a2071cf776..4987608ea4ff 100644
> > --- a/include/linux/device.h
> > +++ b/include/linux/device.h
> > @@ -416,6 +416,7 @@ struct dev_links_info {
> >* @dma_pools:  Dma pools (if dma'ble device).
> >* @dma_mem:Internal for coherent mem override.
> >* @cma_area:   Contiguous memory area for dma allocations
> > + * @dma_io_tlb_mem: Internal for swiotlb io_tlb_mem override.
> >* @archdata:   For arch-specific additions.
> >* @of_node:Associated device tree node.
> >* @fwnode: Associated device node supplied by platform firmware.
> > @@ -521,6 +522,9 @@ struct device {
> >   #ifdef CONFIG_DMA_CMA
> >   struct cma *cma_area;   /* contiguous memory area for dma
> >  allocations */
> > +#endif
> > +#ifdef CONFIG_DMA_RESTRICTED_POOL
> > + struct io_tlb_mem *dma_io_tlb_mem;
> >   #endif
> >   /* arch specific additions */
> >   struct dev_archdata archdata;
> > diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
> > index 216854a5e513..03ad6e3b4056 100644
> > --- a/include/linux/swiotlb.h
> > +++ b/include/linux/swiotlb.h
> > @@ -72,7 +72,8 @@ extern enum swiotlb_force swiotlb_force;
> >*  range check to see if the memory was in fact allocated by this
> >*  API.
> >* @nslabs: The number of IO TLB blocks (in groups of 64) between @start 
> > and
> > - *   @end. This is command line adjustable via setup_io_tlb_npages.
> > + *   @end. For default swiotlb, this is command line adjustable via
> > + *   setup_io_tlb_npages.
> >* @used:   The number of used IO TLB block.
> >* @list:   The free list describing the number of free entries available
> >*  from each index.
> > diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
> > index 57a9adb920bf..ffbb8724e06c 100644
> > --- a/kernel/dma/swiotlb.c
> > +++ b/kernel/dma/swiotlb.c
> > @@ -39,6 +39,13 @@
> >   #ifdef CONFIG_DEBUG_FS
> >   #include 
> >   #endif
> > +#ifdef CONFIG_DMA_RESTRICTED_POOL
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#endif
> >
> >   #include 
> >   #include 
> > @@ -681,3 +688,76 @@ static int __init swiotlb_create_default_debugfs(void)
> >   late_initcall(swiotlb_create_default_debugfs);
> >
> >   #endif
> > +
> > +#ifdef CONFIG_DMA_RESTRICTED_POOL
> > +static int rmem_swiotlb_device_init(struct reserved_mem *rmem,
> > + struct device *dev)
> > +{
> > + struct io_tlb_mem *mem = rmem->priv;
> > + unsigned long nslabs = rmem->size >> IO_TLB_SHIFT;
> > +
> > + if (dev->dma_io_tlb_mem)
> > + return 0;
> > +
> > + /* Since multiple devices can share the same pool, the private data,
> > +  * io_tlb_mem struct, will be initialized by the first device attached
> > +  * to it.
> > +  */
> > + if (!mem) {
> > + mem = kzalloc(struct_size(mem, slots, nslabs), GFP_KERNEL);
> > + if (!mem)
> > + return -ENOMEM;
> > +#ifdef CONFIG_ARM
> > + if (!PageHighMem(pfn_to_page(PHYS_PFN(rmem->base {
> > + kfree(mem);
> > + return -EINVAL;
> > + }
> > +#endif /* CONFIG_ARM */
> > + swiotlb_init_io_tlb_mem(mem, rmem->base, nslabs, false);
> > +
> > + rmem->priv = mem;
> > + }
> > +
> > +#ifdef CONFIG_DEBUG_FS
> > + if (!io_tlb_default_mem->debugfs)
> > + io_tlb_default_mem->debugfs =
> > + debugfs_create_dir("swiotlb", NULL);
>
> At this point it's possible for io_tlb_default_mem to be NULL, leading
> to a splat.

Thanks for pointing this out.

>
> But even then if it's not and we have the situation where debugfs==NULL
> then the debugfs_create_dir() here will cause a subsequent attempt in
> swiotlb_create_debugfs() to fail (directory already exists) leading to
> mem->debugfs being assigned an error value. I suspect the creation of
> the debugfs directory needs to be separated from io_tlb_default_mem
> being set.

debugfs creation should move into the if (!mem) {...} above to avoid
duplication.
I think having a separated struct dentry pointer for the default
debugfs should be enough?

if (!debugfs)
debugfs = debugfs_create_dir("swiotlb", NULL);
swiotlb_create_debugfs(mem, rmem->name, debugfs);

>
> Other than that I gave this series a go with our 

Re: [PATCH v5 08/16] swiotlb: Update is_swiotlb_active to add a struct device argument

2021-04-26 Thread Claire Chang
On Fri, Apr 23, 2021 at 9:31 PM Robin Murphy  wrote:
>
> On 2021-04-22 09:15, Claire Chang wrote:
> > Update is_swiotlb_active to add a struct device argument. This will be
> > useful later to allow for restricted DMA pool.
> >
> > Signed-off-by: Claire Chang 
> > ---
> >   drivers/gpu/drm/i915/gem/i915_gem_internal.c | 2 +-
> >   drivers/gpu/drm/nouveau/nouveau_ttm.c| 2 +-
> >   drivers/pci/xen-pcifront.c   | 2 +-
> >   include/linux/swiotlb.h  | 4 ++--
> >   kernel/dma/direct.c  | 2 +-
> >   kernel/dma/swiotlb.c | 4 ++--
> >   6 files changed, 8 insertions(+), 8 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_internal.c 
> > b/drivers/gpu/drm/i915/gem/i915_gem_internal.c
> > index ce6b664b10aa..7d48c433446b 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_internal.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_internal.c
> > @@ -42,7 +42,7 @@ static int i915_gem_object_get_pages_internal(struct 
> > drm_i915_gem_object *obj)
> >
> >   max_order = MAX_ORDER;
> >   #ifdef CONFIG_SWIOTLB
> > - if (is_swiotlb_active()) {
> > + if (is_swiotlb_active(NULL)) {
> >   unsigned int max_segment;
> >
> >   max_segment = swiotlb_max_segment();
> > diff --git a/drivers/gpu/drm/nouveau/nouveau_ttm.c 
> > b/drivers/gpu/drm/nouveau/nouveau_ttm.c
> > index e8b506a6685b..2a2ae6d6cf6d 100644
> > --- a/drivers/gpu/drm/nouveau/nouveau_ttm.c
> > +++ b/drivers/gpu/drm/nouveau/nouveau_ttm.c
> > @@ -321,7 +321,7 @@ nouveau_ttm_init(struct nouveau_drm *drm)
> >   }
> >
> >   #if IS_ENABLED(CONFIG_SWIOTLB) && IS_ENABLED(CONFIG_X86)
> > - need_swiotlb = is_swiotlb_active();
> > + need_swiotlb = is_swiotlb_active(NULL);
> >   #endif
> >
> >   ret = ttm_device_init(>ttm.bdev, _bo_driver, 
> > drm->dev->dev,
> > diff --git a/drivers/pci/xen-pcifront.c b/drivers/pci/xen-pcifront.c
> > index b7a8f3a1921f..6d548ce53ce7 100644
> > --- a/drivers/pci/xen-pcifront.c
> > +++ b/drivers/pci/xen-pcifront.c
> > @@ -693,7 +693,7 @@ static int pcifront_connect_and_init_dma(struct 
> > pcifront_device *pdev)
> >
> >   spin_unlock(_dev_lock);
> >
> > - if (!err && !is_swiotlb_active()) {
> > + if (!err && !is_swiotlb_active(NULL)) {
> >   err = pci_xen_swiotlb_init_late();
> >   if (err)
> >   dev_err(>xdev->dev, "Could not setup 
> > SWIOTLB!\n");
> > diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
> > index 2a6cca07540b..c530c976d18b 100644
> > --- a/include/linux/swiotlb.h
> > +++ b/include/linux/swiotlb.h
> > @@ -123,7 +123,7 @@ static inline bool is_swiotlb_buffer(struct device 
> > *dev, phys_addr_t paddr)
> >   void __init swiotlb_exit(void);
> >   unsigned int swiotlb_max_segment(void);
> >   size_t swiotlb_max_mapping_size(struct device *dev);
> > -bool is_swiotlb_active(void);
> > +bool is_swiotlb_active(struct device *dev);
> >   void __init swiotlb_adjust_size(unsigned long size);
> >   #else
> >   #define swiotlb_force SWIOTLB_NO_FORCE
> > @@ -143,7 +143,7 @@ static inline size_t swiotlb_max_mapping_size(struct 
> > device *dev)
> >   return SIZE_MAX;
> >   }
> >
> > -static inline bool is_swiotlb_active(void)
> > +static inline bool is_swiotlb_active(struct device *dev)
> >   {
> >   return false;
> >   }
> > diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
> > index 84c9feb5474a..7a88c34d0867 100644
> > --- a/kernel/dma/direct.c
> > +++ b/kernel/dma/direct.c
> > @@ -495,7 +495,7 @@ int dma_direct_supported(struct device *dev, u64 mask)
> >   size_t dma_direct_max_mapping_size(struct device *dev)
> >   {
> >   /* If SWIOTLB is active, use its maximum mapping size */
> > - if (is_swiotlb_active() &&
> > + if (is_swiotlb_active(dev) &&
> >   (dma_addressing_limited(dev) || swiotlb_force == SWIOTLB_FORCE))
>
> I wonder if it's worth trying to fold these other conditions into
> is_swiotlb_active() itself? I'm not entirely sure what matters for Xen,
> but for the other cases it seems like they probably only care about
> whether bouncing may occur for their particular device or not (possibly
> they want to be using dma_max_mapping_size() now anyway - TBH I'm
> struggling to make sense of what the swiotlb_max_segment business is
> supposed to mean).

I think leaving those conditions outside of is_swiotlb_active() might
help avoid confusion with is_dev_swiotlb_force() in patch #9? We need
is_dev_swiotlb_force() only because the restricted DMA pool supports
memory allocation, but the default swiotlb doesn't.

>
> Otherwise, patch #9 will need to touch here as well to make sure that
> per-device forced bouncing is reflected correctly.

You're right. Otherwise, is_dev_swiotlb_force is needed here.


>
> Robin.
>
> >   return swiotlb_max_mapping_size(dev);
> >   return SIZE_MAX;
> > diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
> > index 

Re: [PATCH] kbuild: replace LANG=C with LC_ALL=C

2021-04-26 Thread Matthias Maennich

On Sat, Apr 24, 2021 at 08:48:41PM +0900, Masahiro Yamada wrote:

LANG gives a weak default to each LC_* in case it is not explicitly
defined. LC_ALL, if set, overrides all other LC_* variables.

 LANG  <  LC_CTYPE, LC_COLLATE, LC_MONETARY, LC_NUMERIC, ...  <  LC_ALL

This is why documentation such as [1] suggests to set LC_ALL in build
scripts to get the deterministic result.

LANG=C is not strong enough to override LC_* that may be set by end
users.

[1]: https://reproducible-builds.org/docs/locales/

Signed-off-by: Masahiro Yamada 


Reviewed-by: Matthias Maennich 

Cheers,
Matthias


---

arch/powerpc/boot/wrapper  | 2 +-
scripts/nsdeps | 2 +-
scripts/recordmcount.pl| 2 +-
scripts/setlocalversion| 2 +-
scripts/tags.sh| 2 +-
tools/testing/selftests/net/mptcp/mptcp_connect.sh | 2 +-
usr/gen_initramfs.sh   | 2 +-
7 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/boot/wrapper b/arch/powerpc/boot/wrapper
index 41fa0a8715e3..cdb796b76e2e 100755
--- a/arch/powerpc/boot/wrapper
+++ b/arch/powerpc/boot/wrapper
@@ -191,7 +191,7 @@ if [ -z "$kernel" ]; then
kernel=vmlinux
fi

-LANG=C elfformat="`${CROSS}objdump -p "$kernel" | grep 'file format' | awk '{print 
$4}'`"
+LC_ALL=C elfformat="`${CROSS}objdump -p "$kernel" | grep 'file format' | awk 
'{print $4}'`"
case "$elfformat" in
elf64-powerpcle)format=elf64lppc;;
elf64-powerpc)  format=elf32ppc ;;
diff --git a/scripts/nsdeps b/scripts/nsdeps
index e8ce2a4d704a..04c4b96e95ec 100644
--- a/scripts/nsdeps
+++ b/scripts/nsdeps
@@ -44,7 +44,7 @@ generate_deps() {
for source_file in $mod_source_files; do
sed '/MODULE_IMPORT_NS/Q' $source_file > 
${source_file}.tmp
offset=$(wc -l ${source_file}.tmp | awk '{print $1;}')
-   cat $source_file | grep MODULE_IMPORT_NS | LANG=C sort -u 
>> ${source_file}.tmp
+   cat $source_file | grep MODULE_IMPORT_NS | LC_ALL=C sort -u 
>> ${source_file}.tmp
tail -n +$((offset +1)) ${source_file} | grep -v 
MODULE_IMPORT_NS >> ${source_file}.tmp
if ! diff -q ${source_file} ${source_file}.tmp; then
mv ${source_file}.tmp ${source_file}
diff --git a/scripts/recordmcount.pl b/scripts/recordmcount.pl
index 867860ea57da..0a7fc9507d6f 100755
--- a/scripts/recordmcount.pl
+++ b/scripts/recordmcount.pl
@@ -497,7 +497,7 @@ sub update_funcs
#
# Step 2: find the sections and mcount call sites
#
-open(IN, "LANG=C $objdump -hdr $inputfile|") || die "error running $objdump";
+open(IN, "LC_ALL=C $objdump -hdr $inputfile|") || die "error running $objdump";

my $text;

diff --git a/scripts/setlocalversion b/scripts/setlocalversion
index bb709eda96cd..db941f6d9591 100755
--- a/scripts/setlocalversion
+++ b/scripts/setlocalversion
@@ -126,7 +126,7 @@ scm_version()
fi

# Check for svn and a svn repo.
-   if rev=$(LANG= LC_ALL= LC_MESSAGES=C svn info 2>/dev/null | grep '^Last 
Changed Rev'); then
+   if rev=$(LC_ALL=C svn info 2>/dev/null | grep '^Last Changed Rev'); then
rev=$(echo $rev | awk '{print $NF}')
printf -- '-svn%s' "$rev"

diff --git a/scripts/tags.sh b/scripts/tags.sh
index fd96734deff1..db8ba411860a 100755
--- a/scripts/tags.sh
+++ b/scripts/tags.sh
@@ -326,5 +326,5 @@ esac

# Remove structure forward declarations.
if [ -n "$remove_structs" ]; then
-LANG=C sed -i -e '/^\([a-zA-Z_][a-zA-Z0-9_]*\)\t.*\t\/\^struct 
\1;.*\$\/;"\tx$/d' $1
+LC_ALL=C sed -i -e '/^\([a-zA-Z_][a-zA-Z0-9_]*\)\t.*\t\/\^struct 
\1;.*\$\/;"\tx$/d' $1
fi
diff --git a/tools/testing/selftests/net/mptcp/mptcp_connect.sh 
b/tools/testing/selftests/net/mptcp/mptcp_connect.sh
index 10a030b53b23..1d2a6e7b877c 100755
--- a/tools/testing/selftests/net/mptcp/mptcp_connect.sh
+++ b/tools/testing/selftests/net/mptcp/mptcp_connect.sh
@@ -273,7 +273,7 @@ check_mptcp_disabled()
ip netns exec ${disabled_ns} sysctl -q net.mptcp.enabled=0

local err=0
-   LANG=C ip netns exec ${disabled_ns} ./mptcp_connect -t $timeout -p 1 -s MPTCP 127.0.0.1 
< "$cin" 2>&1 | \
+   LC_ALL=C ip netns exec ${disabled_ns} ./mptcp_connect -t $timeout -p 1 -s MPTCP 
127.0.0.1 < "$cin" 2>&1 | \
grep -q "^socket: Protocol not available$" && err=1
ip netns delete ${disabled_ns}

diff --git a/usr/gen_initramfs.sh b/usr/gen_initramfs.sh
index 8ae831657e5d..63476bb70b41 100755
--- a/usr/gen_initramfs.sh
+++ b/usr/gen_initramfs.sh
@@ -147,7 +147,7 @@ dir_filelist() {
header "$1"

srcdir=$(echo "$1" | sed -e 's://*:/:g')
-   dirlist=$(find "${srcdir}" -printf "%p %m %U %G\n" | LANG=C sort)
+   dirlist=$(find "${srcdir}" -printf "%p %m %U %G\n" | LC_ALL=C sort)

# If 

Re: linux-next: boot failure in today's linux-next

2021-04-26 Thread Jens Axboe
On 4/26/21 1:43 AM, Stephen Rothwell wrote:
> Hi all,
> 
> On Mon, 26 Apr 2021 16:36:06 +1000 Stephen Rothwell  
> wrote:
>>
>> Today's linux-next build (ipowerpc_pseries_le_defconfig)
>> failed its qemu boot tests like this:
>>
>> [1.833361][T1] ibmvscsi 7103: SRP_VERSION: 16.a
>> [1.834439][T1] ibmvscsi 7103: Maximum ID: 64 Maximum LUN: 32 
>> Maximum Channel: 3
>> [1.834683][T1] scsi host0: IBM POWER Virtual SCSI Adapter 1.5.9
>> [1.842605][C0] ibmvscsi 7103: partner initialization complete
>> [1.844979][C0] ibmvscsi 7103: host srp version: 16.a, host 
>> partition qemu (0), OS 2, max io 2097152
>> [1.845502][C0] ibmvscsi 7103: sent SRP login
>> [1.845853][C0] ibmvscsi 7103: SRP_LOGIN succeeded
>> [1.851447][T1] BUG: Kernel NULL pointer dereference on write at 
>> 0x0390
>> [1.851577][T1] Faulting instruction address: 0xc070386c
>> [1.852171][T1] Oops: Kernel access of bad area, sig: 11 [#1]
>> [1.852324][T1] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA 
>> pSeries
>> [1.852689][T1] Modules linked in:
>> [1.853136][T1] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.12.0 #2
>> [1.853555][T1] NIP:  c070386c LR: c0703a6c CTR: 
>> 
>> [1.853679][T1] REGS: c63a2f40 TRAP: 0380   Not tainted  
>> (5.12.0)
>> [1.853870][T1] MSR:  82009033   
>> CR: 44002240  XER: 
>> [1.854305][T1] CFAR: c0703a68 IRQMASK: 0 
>> [1.854305][T1] GPR00: c0703a6c c63a31e0 
>> c146b200 c80ca800 
>> [1.854305][T1] GPR04: c6067380 c00c00020180 
>> 0024 8500 
>> [1.854305][T1] GPR08:   
>>   
>> [1.854305][T1] GPR12: 2000 c164 
>> c8068508 0020 
>> [1.854305][T1] GPR16:  0024 
>> c0f85f78 c0f0d998 
>> [1.854305][T1] GPR20: c13b59e0 0003 
>> c63a340c 0001 
>> [1.854305][T1] GPR24:  c84a3000 
>> c80ca800 c00c00020180 
>> [1.854305][T1] GPR28: 8500 c80ca800 
>> 0024 c6067380 
>> [1.855486][T1] NIP [c070386c] bio_add_hw_page+0x7c/0x240
>> [1.856357][T1] LR [c0703a6c] bio_add_pc_page+0x3c/0x70
>> [1.856723][T1] Call Trace:
>> [1.856890][T1] [c63a31e0] [0c00] 0xc00 
>> (unreliable)
>> [1.857390][T1] [c63a3230] [c070105c] 
>> bio_kmalloc+0x3c/0xd0
>> [1.857514][T1] [c63a3260] [c0713014] 
>> blk_rq_map_kern+0x164/0x4a0
>> [1.857630][T1] [c63a32d0] [c08e17dc] 
>> __scsi_execute+0x1cc/0x270
>> [1.857746][T1] [c63a3350] [c08e7bf0] 
>> scsi_probe_and_add_lun+0x250/0xd90
>> [1.857887][T1] [c63a34c0] [c08e921c] 
>> __scsi_scan_target+0x17c/0x630
>> [1.858007][T1] [c63a35d0] [c08e9900] 
>> scsi_scan_channel+0x90/0xe0
>> [1.858133][T1] [c63a3620] [c08e9ba8] 
>> scsi_scan_host_selected+0x138/0x1a0
>> [1.858258][T1] [c63a3670] [c08e9fec] 
>> scsi_scan_host+0x2dc/0x320
>> [1.858367][T1] [c63a3710] [c091b2a0] 
>> ibmvscsi_probe+0xa70/0xa80
>> [1.858487][T1] [c63a3800] [c00eb8ac] 
>> vio_bus_probe+0x9c/0x460
>> [1.858616][T1] [c63a38a0] [c08979bc] 
>> really_probe+0x12c/0x6b0
>> [1.858749][T1] [c63a3950] [c0897fd4] 
>> driver_probe_device+0x94/0x130
>> [1.858874][T1] [c63a3980] [c089896c] 
>> device_driver_attach+0x11c/0x130
>> [1.858999][T1] [c63a39c0] [c0898a38] 
>> __driver_attach+0xb8/0x1a0
>> [1.859123][T1] [c63a3a10] [c08941a8] 
>> bus_for_each_dev+0xa8/0x130
>> [1.859257][T1] [c63a3a70] [c0896ef4] 
>> driver_attach+0x34/0x50
>> [1.859381][T1] [c63a3a90] [c0896510] 
>> bus_add_driver+0x170/0x2b0
>> [1.859503][T1] [c63a3b20] [c0899b04] 
>> driver_register+0xb4/0x1c0
>> [1.859626][T1] [c63a3b90] [c00ea808] 
>> __vio_register_driver+0x68/0x90
>> [1.859754][T1] [c63a3bb0] [c10cee74] 
>> ibmvscsi_module_init+0xa4/0xdc
>> [1.859931][T1] [c63a3bf0] [c0012190] 
>> do_one_initcall+0x60/0x2c0
>> [1.860071][T1] [c63a3cc0] [c10846e4] 
>> kernel_init_freeable+0x300/0x3a0
>> [1.860207][T1] [c63a3da0] [c0012764] 
>> kernel_init+0x2c/0x168
>> [1.860336][T1] [c63a3e10] [c000d5ec] 
>> 

Re: [PATCH net-next v4 2/2] of: net: fix of_get_mac_addr_nvmem() for non-platform devices

2021-04-26 Thread Michael Walle

Am 2021-04-16 17:19, schrieb Rob Herring:

On Fri, Apr 16, 2021 at 2:30 AM Michael Walle  wrote:


Am 2021-04-16 05:24, schrieb Benjamin Herrenschmidt:
> On Mon, 2021-04-12 at 19:47 +0200, Michael Walle wrote:
>>
>>  /**
>>   * of_get_phy_mode - Get phy mode for given device_node
>> @@ -59,15 +60,39 @@ static int of_get_mac_addr(struct device_node *np,
>> const char *name, u8 *addr)
>>  static int of_get_mac_addr_nvmem(struct device_node *np, u8 *addr)
>>  {
>> struct platform_device *pdev = of_find_device_by_node(np);
>> +   struct nvmem_cell *cell;
>> +   const void *mac;
>> +   size_t len;
>> int ret;
>>
>> -   if (!pdev)
>> -   return -ENODEV;
>> +   /* Try lookup by device first, there might be a
>> nvmem_cell_lookup
>> +* associated with a given device.
>> +*/
>> +   if (pdev) {
>> +   ret = nvmem_get_mac_address(>dev, addr);
>> +   put_device(>dev);
>> +   return ret;
>> +   }
>> +
>
> This smells like the wrong band aid :)
>
> Any struct device can contain an OF node pointer these days.

But not all nodes might have an associated device, see DSA for 
example.


I believe what Ben is saying and what I said earlier is going from dev
-> OF node is right and OF node -> dev is wrong. If you only have an
OF node, then use an of_* function.


And as the name suggests of_get_mac_address() operates on a node. So
if a driver calls of_get_mac_address() it should work on the node. 
What

is wrong IMHO, is that the ethernet drivers where the corresponding
board
has a nvmem_cell_lookup registered is calling 
of_get_mac_address(node).

It should rather call eth_get_mac_address(dev) in the first place.

One would need to figure out if there is an actual device (with an
assiciated of_node), then call eth_get_mac_address(dev) and if there
isn't a device call of_get_mac_address(node).


Yes, I think we're all in agreement.


But I don't know if that is easy to figure out. Well, one could start
with just the device where nvmem_cell_lookup is used. Then we could
drop the workaround above.


Start with the ones just passing dev.of_node directly:

$ git grep 'of_get_mac_address(.*of_node)'


[..]

Before I'll try to come up with a patch for this, I'd like to get
your opinion on it.

(1) replacing of_get_mac_address(node) with eth_get_mac_address(dev)
might sometimes lead to confusing comments like in
drivers/net/ethernet/allwinner/sun4i-emac.c:

/* Read MAC-address from DT */
ret = of_get_mac_address(np, ndev->dev_addr);

Do we live with that or should the new name somehow reflect that
it is taken from the device tree.

(2) What do you think of eth_get_mac_address(ndev). That is, the
second argument is missing and ndev->dev_addr is used.
I'm unsure about it. We'd still need a second function for drivers
which don't write ndev->dev_addr directly, but have some custom
logic in between. OTOH it would be like eth_hw_addr_random(ndev).

-michael


Guest entry/exit performance work and observations

2021-04-26 Thread Nicholas Piggin
I'm looking at KVM HV P9 path guest exit/entry performance with the Cify
patches, plus some further work to see what we can do.

Measurement is done in the guest making a "NULL hcall" and return back 
to a non-nested guest. Two cases considered: First, returning to guest 
at the "try_real_mode" hcall handler. Second, returning back to guest 
after going around a loop in kvmppc_vcpu_run_hv (i.e., exit into full 
host kernel context, but not to host usermode).

The real-mode test is a proxy for real mode hcall and other interrupt
handlers, and the full exit is a proxy for virtual mode hcalls and 
interrupt handlers.

The test was done with powernv_defconfig, radix guest and radix host on
a POWER9 with meltdown mitigations disabled. A minor hack was made
just to get the immediate return / NULL hcall behaviour to measure
performance.

* Upstream try_real_mode return -  509 cycles
* Upstream virt NULL hcall  - 9587 cycles
* KVM Cify virt NULL hcall  - 9333 cycles
* KVM Cify+opt virt NULL hcall  - 5754 cycles (167% faster than upstream,
   or 60% the cycles required)

The KVM Cify series (which you have already seen) plus the further
optimisations patch series is here:

https://github.com/npiggin/linux/tree/kvm-in-c-new

Some of the important / major further optimisation patches have
individual cycle time improvement contribution annotated. In many cases
things are inter-dependent, e.g., patch A might improve 100 cycles and
B 50 cycles but A+B might be 250 due to together avoiding an SPR stall.
So take the individual numbers with a grain of salt, and the cumulative
result above is most important.

In summary the Cify series does not hurt performance of entry/exit, 
which is good. It actually helps a bit, I'm not sure exactly where.
And we can make quite a lot more improvement with this series.

HOWEVER! The Cify series removes the very fast real mode hcall and 
interrupt handlers (except some things like machine check). So any real 
mode handler will be handled as a virt mode handler on P9 after Cify.

Now I have some further patches in progress that should shave about 1000 
more cycles more from the full exit, but beyond that it gets pretty 
tough to improve. That still leaves it an order of magnitude slower.  

Now I did say this doesn't matter so much with a P9/radix/xive guest
which is true, except possibly for TCE hcalls that Alexey brought to my
attention (any other important cases?). So we will have to think about 
that.

Alexey did say that the real mode TCE hcalls were added for P8, and
were less important for P9, but it is something to keep an eye on. We 
might end up adding a faster handler back, but I would much prefer if
wasn't entirely run in guest context as they do today (maybe switch
MMU context, TB, and a few other important SPRs, and enable translation
so it can run practically as host kernel context). But I think we should
wait and see, and add the complexity only if it comes up as a problem.

The other thing is the P9 path now implements the P9 hash guest support 
after the Cify series. Hash does a lot more exits due to translation 
hcalls and interrupts. I did do some basic measurements (e.g., kernel 
compile) and couldn't see a significant slowdown. But in any case I 
think the P9 hash code is not important to micro optimise, it was only
done to simplify code and remove asm, so I would rather not add 
complexity for that.

Thanks,
Nick


Re: linux-next: boot failure in today's linux-next

2021-04-26 Thread Stephen Rothwell
Hi all,

On Mon, 26 Apr 2021 16:36:06 +1000 Stephen Rothwell  
wrote:
>
> Today's linux-next build (ipowerpc_pseries_le_defconfig)
> failed its qemu boot tests like this:
> 
> [1.833361][T1] ibmvscsi 7103: SRP_VERSION: 16.a
> [1.834439][T1] ibmvscsi 7103: Maximum ID: 64 Maximum LUN: 32 
> Maximum Channel: 3
> [1.834683][T1] scsi host0: IBM POWER Virtual SCSI Adapter 1.5.9
> [1.842605][C0] ibmvscsi 7103: partner initialization complete
> [1.844979][C0] ibmvscsi 7103: host srp version: 16.a, host 
> partition qemu (0), OS 2, max io 2097152
> [1.845502][C0] ibmvscsi 7103: sent SRP login
> [1.845853][C0] ibmvscsi 7103: SRP_LOGIN succeeded
> [1.851447][T1] BUG: Kernel NULL pointer dereference on write at 
> 0x0390
> [1.851577][T1] Faulting instruction address: 0xc070386c
> [1.852171][T1] Oops: Kernel access of bad area, sig: 11 [#1]
> [1.852324][T1] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
> [1.852689][T1] Modules linked in:
> [1.853136][T1] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.12.0 #2
> [1.853555][T1] NIP:  c070386c LR: c0703a6c CTR: 
> 
> [1.853679][T1] REGS: c63a2f40 TRAP: 0380   Not tainted  
> (5.12.0)
> [1.853870][T1] MSR:  82009033   CR: 
> 44002240  XER: 
> [1.854305][T1] CFAR: c0703a68 IRQMASK: 0 
> [1.854305][T1] GPR00: c0703a6c c63a31e0 
> c146b200 c80ca800 
> [1.854305][T1] GPR04: c6067380 c00c00020180 
> 0024 8500 
> [1.854305][T1] GPR08:   
>   
> [1.854305][T1] GPR12: 2000 c164 
> c8068508 0020 
> [1.854305][T1] GPR16:  0024 
> c0f85f78 c0f0d998 
> [1.854305][T1] GPR20: c13b59e0 0003 
> c63a340c 0001 
> [1.854305][T1] GPR24:  c84a3000 
> c80ca800 c00c00020180 
> [1.854305][T1] GPR28: 8500 c80ca800 
> 0024 c6067380 
> [1.855486][T1] NIP [c070386c] bio_add_hw_page+0x7c/0x240
> [1.856357][T1] LR [c0703a6c] bio_add_pc_page+0x3c/0x70
> [1.856723][T1] Call Trace:
> [1.856890][T1] [c63a31e0] [0c00] 0xc00 
> (unreliable)
> [1.857390][T1] [c63a3230] [c070105c] 
> bio_kmalloc+0x3c/0xd0
> [1.857514][T1] [c63a3260] [c0713014] 
> blk_rq_map_kern+0x164/0x4a0
> [1.857630][T1] [c63a32d0] [c08e17dc] 
> __scsi_execute+0x1cc/0x270
> [1.857746][T1] [c63a3350] [c08e7bf0] 
> scsi_probe_and_add_lun+0x250/0xd90
> [1.857887][T1] [c63a34c0] [c08e921c] 
> __scsi_scan_target+0x17c/0x630
> [1.858007][T1] [c63a35d0] [c08e9900] 
> scsi_scan_channel+0x90/0xe0
> [1.858133][T1] [c63a3620] [c08e9ba8] 
> scsi_scan_host_selected+0x138/0x1a0
> [1.858258][T1] [c63a3670] [c08e9fec] 
> scsi_scan_host+0x2dc/0x320
> [1.858367][T1] [c63a3710] [c091b2a0] 
> ibmvscsi_probe+0xa70/0xa80
> [1.858487][T1] [c63a3800] [c00eb8ac] 
> vio_bus_probe+0x9c/0x460
> [1.858616][T1] [c63a38a0] [c08979bc] 
> really_probe+0x12c/0x6b0
> [1.858749][T1] [c63a3950] [c0897fd4] 
> driver_probe_device+0x94/0x130
> [1.858874][T1] [c63a3980] [c089896c] 
> device_driver_attach+0x11c/0x130
> [1.858999][T1] [c63a39c0] [c0898a38] 
> __driver_attach+0xb8/0x1a0
> [1.859123][T1] [c63a3a10] [c08941a8] 
> bus_for_each_dev+0xa8/0x130
> [1.859257][T1] [c63a3a70] [c0896ef4] 
> driver_attach+0x34/0x50
> [1.859381][T1] [c63a3a90] [c0896510] 
> bus_add_driver+0x170/0x2b0
> [1.859503][T1] [c63a3b20] [c0899b04] 
> driver_register+0xb4/0x1c0
> [1.859626][T1] [c63a3b90] [c00ea808] 
> __vio_register_driver+0x68/0x90
> [1.859754][T1] [c63a3bb0] [c10cee74] 
> ibmvscsi_module_init+0xa4/0xdc
> [1.859931][T1] [c63a3bf0] [c0012190] 
> do_one_initcall+0x60/0x2c0
> [1.860071][T1] [c63a3cc0] [c10846e4] 
> kernel_init_freeable+0x300/0x3a0
> [1.860207][T1] [c63a3da0] [c0012764] 
> kernel_init+0x2c/0x168
> [1.860336][T1] [c63a3e10] [c000d5ec] 
> ret_from_kernel_thread+0x5c/0x70
> [1.860690][T1] Instruction dump:
> [1.861072][T1] fba10038 7cbb2b78 7c7d1b78 7cfc3b78 a1440048 2c2a 
> 4082008c a13f004a 
> 

linux-next: boot failure in today's linux-next

2021-04-26 Thread Stephen Rothwell
Hi all,

Today's linux-next build (ipowerpc_pseries_le_defconfig)
failed its qemu boot tests like this:

[1.833361][T1] ibmvscsi 7103: SRP_VERSION: 16.a
[1.834439][T1] ibmvscsi 7103: Maximum ID: 64 Maximum LUN: 32 
Maximum Channel: 3
[1.834683][T1] scsi host0: IBM POWER Virtual SCSI Adapter 1.5.9
[1.842605][C0] ibmvscsi 7103: partner initialization complete
[1.844979][C0] ibmvscsi 7103: host srp version: 16.a, host 
partition qemu (0), OS 2, max io 2097152
[1.845502][C0] ibmvscsi 7103: sent SRP login
[1.845853][C0] ibmvscsi 7103: SRP_LOGIN succeeded
[1.851447][T1] BUG: Kernel NULL pointer dereference on write at 
0x0390
[1.851577][T1] Faulting instruction address: 0xc070386c
[1.852171][T1] Oops: Kernel access of bad area, sig: 11 [#1]
[1.852324][T1] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
[1.852689][T1] Modules linked in:
[1.853136][T1] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.12.0 #2
[1.853555][T1] NIP:  c070386c LR: c0703a6c CTR: 

[1.853679][T1] REGS: c63a2f40 TRAP: 0380   Not tainted  (5.12.0)
[1.853870][T1] MSR:  82009033   CR: 
44002240  XER: 
[1.854305][T1] CFAR: c0703a68 IRQMASK: 0 
[1.854305][T1] GPR00: c0703a6c c63a31e0 
c146b200 c80ca800 
[1.854305][T1] GPR04: c6067380 c00c00020180 
0024 8500 
[1.854305][T1] GPR08:   
  
[1.854305][T1] GPR12: 2000 c164 
c8068508 0020 
[1.854305][T1] GPR16:  0024 
c0f85f78 c0f0d998 
[1.854305][T1] GPR20: c13b59e0 0003 
c63a340c 0001 
[1.854305][T1] GPR24:  c84a3000 
c80ca800 c00c00020180 
[1.854305][T1] GPR28: 8500 c80ca800 
0024 c6067380 
[1.855486][T1] NIP [c070386c] bio_add_hw_page+0x7c/0x240
[1.856357][T1] LR [c0703a6c] bio_add_pc_page+0x3c/0x70
[1.856723][T1] Call Trace:
[1.856890][T1] [c63a31e0] [0c00] 0xc00 (unreliable)
[1.857390][T1] [c63a3230] [c070105c] 
bio_kmalloc+0x3c/0xd0
[1.857514][T1] [c63a3260] [c0713014] 
blk_rq_map_kern+0x164/0x4a0
[1.857630][T1] [c63a32d0] [c08e17dc] 
__scsi_execute+0x1cc/0x270
[1.857746][T1] [c63a3350] [c08e7bf0] 
scsi_probe_and_add_lun+0x250/0xd90
[1.857887][T1] [c63a34c0] [c08e921c] 
__scsi_scan_target+0x17c/0x630
[1.858007][T1] [c63a35d0] [c08e9900] 
scsi_scan_channel+0x90/0xe0
[1.858133][T1] [c63a3620] [c08e9ba8] 
scsi_scan_host_selected+0x138/0x1a0
[1.858258][T1] [c63a3670] [c08e9fec] 
scsi_scan_host+0x2dc/0x320
[1.858367][T1] [c63a3710] [c091b2a0] 
ibmvscsi_probe+0xa70/0xa80
[1.858487][T1] [c63a3800] [c00eb8ac] 
vio_bus_probe+0x9c/0x460
[1.858616][T1] [c63a38a0] [c08979bc] 
really_probe+0x12c/0x6b0
[1.858749][T1] [c63a3950] [c0897fd4] 
driver_probe_device+0x94/0x130
[1.858874][T1] [c63a3980] [c089896c] 
device_driver_attach+0x11c/0x130
[1.858999][T1] [c63a39c0] [c0898a38] 
__driver_attach+0xb8/0x1a0
[1.859123][T1] [c63a3a10] [c08941a8] 
bus_for_each_dev+0xa8/0x130
[1.859257][T1] [c63a3a70] [c0896ef4] 
driver_attach+0x34/0x50
[1.859381][T1] [c63a3a90] [c0896510] 
bus_add_driver+0x170/0x2b0
[1.859503][T1] [c63a3b20] [c0899b04] 
driver_register+0xb4/0x1c0
[1.859626][T1] [c63a3b90] [c00ea808] 
__vio_register_driver+0x68/0x90
[1.859754][T1] [c63a3bb0] [c10cee74] 
ibmvscsi_module_init+0xa4/0xdc
[1.859931][T1] [c63a3bf0] [c0012190] 
do_one_initcall+0x60/0x2c0
[1.860071][T1] [c63a3cc0] [c10846e4] 
kernel_init_freeable+0x300/0x3a0
[1.860207][T1] [c63a3da0] [c0012764] 
kernel_init+0x2c/0x168
[1.860336][T1] [c63a3e10] [c000d5ec] 
ret_from_kernel_thread+0x5c/0x70
[1.860690][T1] Instruction dump:
[1.861072][T1] fba10038 7cbb2b78 7c7d1b78 7cfc3b78 a1440048 2c2a 
4082008c a13f004a 
[1.861328][T1] 7c095040 40810110 e93f0008 811f0028  e9290050 
812903d8 7d3e4850 
[1.863000][T1] ---[ end trace c49ca2d91ee47d7f ]---
[1.879456][T1] 
[2.880941][T1] Kernel panic - not syncing: Attempted to kill init!