date:20250522

Re: [PATCH v3] selftests: riscv: add misaligned access testing

2025-05-22 Thread Alexandre Ghiti


Hi Clément,

On 5/22/25 14:51, Clément Léger wrote:

This selftest tests all the currently emulated instructions (except for
the RV32 compressed ones which are left as a future exercise for a RV32
user). For the FPU instructions, all the FPU registers are tested.

Signed-off-by: Clément Léger 

---

Note: This test can be executed with the FWFT series [1] or using an SBI
firmware that delegates misaligned traps by default. If using QEMU,
you will need the patches mentioned at [2] so that misaligned accesses
will generate a trap.

Note: This commit was part of a series [3] that was partially merged.

Note: the remaining checkpatch errors are not applicable to this test,
which is a user-space one and does not use the kernel headers. Macros
with complex values cannot be enclosed in a do-while loop since they
generate functions.

Link: https://lore.kernel.org/all/20250424173204.1948385-1-cle...@rivosinc.com/ [1]
Link: https://lore.kernel.org/all/20241211211933.198792-1-fkon...@amd.com/ [2]
Link: https://lore.kernel.org/linux-riscv/20250422162324.956065-1-cle...@rivosinc.com/ [3]

V3:
  - Fixed a segfault and a sign extension error found when compiling with
    -O<x>, x != 0 (Alex)
  - Use inline assembly to generate the sigbus and avoid GCC
    optimizations

V2:
  - Fix commit description
  - Fix a few errors reported by checkpatch.pl

  tools/testing/selftests/riscv/Makefile        |   2 +-
  .../selftests/riscv/misaligned/.gitignore     |   1 +
  .../selftests/riscv/misaligned/Makefile       |  12 +
  .../selftests/riscv/misaligned/common.S       |  33 +++
  .../testing/selftests/riscv/misaligned/fpu.S  | 180 +
  tools/testing/selftests/riscv/misaligned/gp.S | 103 +++
  .../selftests/riscv/misaligned/misaligned.c   | 253 ++
  7 files changed, 583 insertions(+), 1 deletion(-)
  create mode 100644 tools/testing/selftests/riscv/misaligned/.gitignore
  create mode 100644 tools/testing/selftests/riscv/misaligned/Makefile
  create mode 100644 tools/testing/selftests/riscv/misaligned/common.S
  create mode 100644 tools/testing/selftests/riscv/misaligned/fpu.S
  create mode 100644 tools/testing/selftests/riscv/misaligned/gp.S
  create mode 100644 tools/testing/selftests/riscv/misaligned/misaligned.c

diff --git a/tools/testing/selftests/riscv/Makefile b/tools/testing/selftests/riscv/Makefile
index 099b8c1f46f8..95a98ceeb3b3 100644
--- a/tools/testing/selftests/riscv/Makefile
+++ b/tools/testing/selftests/riscv/Makefile
@@ -5,7 +5,7 @@
  ARCH ?= $(shell uname -m 2>/dev/null || echo not)
  
  ifneq (,$(filter $(ARCH),riscv))

-RISCV_SUBTARGETS ?= abi hwprobe mm sigreturn vector
+RISCV_SUBTARGETS ?= abi hwprobe mm sigreturn vector misaligned
  else
  RISCV_SUBTARGETS :=
  endif
diff --git a/tools/testing/selftests/riscv/misaligned/.gitignore b/tools/testing/selftests/riscv/misaligned/.gitignore
new file mode 100644
index 000000000000..5eff15a1f981
--- /dev/null
+++ b/tools/testing/selftests/riscv/misaligned/.gitignore
@@ -0,0 +1 @@
+misaligned
diff --git a/tools/testing/selftests/riscv/misaligned/Makefile b/tools/testing/selftests/riscv/misaligned/Makefile
new file mode 100644
index 000000000000..1aa40110c50d
--- /dev/null
+++ b/tools/testing/selftests/riscv/misaligned/Makefile
@@ -0,0 +1,12 @@
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (C) 2021 ARM Limited
+# Originally tools/testing/arm64/abi/Makefile
+
+CFLAGS += -I$(top_srcdir)/tools/include
+
+TEST_GEN_PROGS := misaligned
+
+include ../../lib.mk
+
+$(OUTPUT)/misaligned: misaligned.c fpu.S gp.S
+   $(CC) -g3 -static -o$@ -march=rv64imafdc $(CFLAGS) $(LDFLAGS) $^
diff --git a/tools/testing/selftests/riscv/misaligned/common.S b/tools/testing/selftests/riscv/misaligned/common.S
new file mode 100644
index 000000000000..8fa00035bd5d
--- /dev/null
+++ b/tools/testing/selftests/riscv/misaligned/common.S
@@ -0,0 +1,33 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (c) 2025 Rivos Inc.
+ *
+ * Authors:
+ * Clément Léger 
+ */
+
+.macro lb_sb temp, offset, src, dst
+   lb \temp, \offset(\src)
+   sb \temp, \offset(\dst)
+.endm
+
+.macro copy_long_to temp, src, dst
+   lb_sb \temp, 0, \src, \dst,
+   lb_sb \temp, 1, \src, \dst,
+   lb_sb \temp, 2, \src, \dst,
+   lb_sb \temp, 3, \src, \dst,
+   lb_sb \temp, 4, \src, \dst,
+   lb_sb \temp, 5, \src, \dst,
+   lb_sb \temp, 6, \src, \dst,
+   lb_sb \temp, 7, \src, \dst,
+.endm
+
+.macro sp_stack_prologue offset
+   addi sp, sp, -8
+   sub sp, sp, \offset
+.endm
+
+.macro sp_stack_epilogue offset
+   add sp, sp, \offset
+   addi sp, sp, 8
+.endm
diff --git a/tools/testing/selftests/riscv/misaligned/fpu.S b/tools/testing/selftests/riscv/misaligned/fpu.S
new file mode 100644
index 000000000000..a7ad4430a424
--- /dev/null
+++ b/tools/testing/selftests/riscv/misaligned/fpu.S
@@ -0,0 +1,180 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (c) 2025 Rivos Inc.
+ *
+ * Authors:
+ * Clément

[PATCH v3] selftests: riscv: add misaligned access testing

2025-05-22 Thread Clément Léger

This selftest tests all the currently emulated instructions (except for
the RV32 compressed ones which are left as a future exercise for a RV32
user). For the FPU instructions, all the FPU registers are tested.

Signed-off-by: Clément Léger 

---

Note: This test can be executed with the FWFT series [1] or using an SBI
firmware that delegates misaligned traps by default. If using QEMU,
you will need the patches mentioned at [2] so that misaligned accesses
will generate a trap.

Note: This commit was part of a series [3] that was partially merged.

Note: the remaining checkpatch errors are not applicable to this test,
which is a user-space one and does not use the kernel headers. Macros
with complex values cannot be enclosed in a do-while loop since they
generate functions.

Link: https://lore.kernel.org/all/20250424173204.1948385-1-cle...@rivosinc.com/ [1]
Link: https://lore.kernel.org/all/20241211211933.198792-1-fkon...@amd.com/ [2]
Link: https://lore.kernel.org/linux-riscv/20250422162324.956065-1-cle...@rivosinc.com/ [3]

V3:
 - Fixed a segfault and a sign extension error found when compiling with
   -O<x>, x != 0 (Alex)
 - Use inline assembly to generate the sigbus and avoid GCC
   optimizations

V2:
 - Fix commit description
 - Fix a few errors reported by checkpatch.pl

 tools/testing/selftests/riscv/Makefile        |   2 +-
 .../selftests/riscv/misaligned/.gitignore     |   1 +
 .../selftests/riscv/misaligned/Makefile       |  12 +
 .../selftests/riscv/misaligned/common.S       |  33 +++
 .../testing/selftests/riscv/misaligned/fpu.S  | 180 +
 tools/testing/selftests/riscv/misaligned/gp.S | 103 +++
 .../selftests/riscv/misaligned/misaligned.c   | 253 ++
 7 files changed, 583 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/riscv/misaligned/.gitignore
 create mode 100644 tools/testing/selftests/riscv/misaligned/Makefile
 create mode 100644 tools/testing/selftests/riscv/misaligned/common.S
 create mode 100644 tools/testing/selftests/riscv/misaligned/fpu.S
 create mode 100644 tools/testing/selftests/riscv/misaligned/gp.S
 create mode 100644 tools/testing/selftests/riscv/misaligned/misaligned.c

diff --git a/tools/testing/selftests/riscv/Makefile b/tools/testing/selftests/riscv/Makefile
index 099b8c1f46f8..95a98ceeb3b3 100644
--- a/tools/testing/selftests/riscv/Makefile
+++ b/tools/testing/selftests/riscv/Makefile
@@ -5,7 +5,7 @@
 ARCH ?= $(shell uname -m 2>/dev/null || echo not)
 
 ifneq (,$(filter $(ARCH),riscv))
-RISCV_SUBTARGETS ?= abi hwprobe mm sigreturn vector
+RISCV_SUBTARGETS ?= abi hwprobe mm sigreturn vector misaligned
 else
 RISCV_SUBTARGETS :=
 endif
diff --git a/tools/testing/selftests/riscv/misaligned/.gitignore b/tools/testing/selftests/riscv/misaligned/.gitignore
new file mode 100644
index 000000000000..5eff15a1f981
--- /dev/null
+++ b/tools/testing/selftests/riscv/misaligned/.gitignore
@@ -0,0 +1 @@
+misaligned
diff --git a/tools/testing/selftests/riscv/misaligned/Makefile b/tools/testing/selftests/riscv/misaligned/Makefile
new file mode 100644
index 000000000000..1aa40110c50d
--- /dev/null
+++ b/tools/testing/selftests/riscv/misaligned/Makefile
@@ -0,0 +1,12 @@
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (C) 2021 ARM Limited
+# Originally tools/testing/arm64/abi/Makefile
+
+CFLAGS += -I$(top_srcdir)/tools/include
+
+TEST_GEN_PROGS := misaligned
+
+include ../../lib.mk
+
+$(OUTPUT)/misaligned: misaligned.c fpu.S gp.S
+   $(CC) -g3 -static -o$@ -march=rv64imafdc $(CFLAGS) $(LDFLAGS) $^
diff --git a/tools/testing/selftests/riscv/misaligned/common.S b/tools/testing/selftests/riscv/misaligned/common.S
new file mode 100644
index 000000000000..8fa00035bd5d
--- /dev/null
+++ b/tools/testing/selftests/riscv/misaligned/common.S
@@ -0,0 +1,33 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (c) 2025 Rivos Inc.
+ *
+ * Authors:
+ * Clément Léger 
+ */
+
+.macro lb_sb temp, offset, src, dst
+   lb \temp, \offset(\src)
+   sb \temp, \offset(\dst)
+.endm
+
+.macro copy_long_to temp, src, dst
+   lb_sb \temp, 0, \src, \dst,
+   lb_sb \temp, 1, \src, \dst,
+   lb_sb \temp, 2, \src, \dst,
+   lb_sb \temp, 3, \src, \dst,
+   lb_sb \temp, 4, \src, \dst,
+   lb_sb \temp, 5, \src, \dst,
+   lb_sb \temp, 6, \src, \dst,
+   lb_sb \temp, 7, \src, \dst,
+.endm
+
+.macro sp_stack_prologue offset
+   addi sp, sp, -8
+   sub sp, sp, \offset
+.endm
+
+.macro sp_stack_epilogue offset
+   add sp, sp, \offset
+   addi sp, sp, 8
+.endm
diff --git a/tools/testing/selftests/riscv/misaligned/fpu.S b/tools/testing/selftests/riscv/misaligned/fpu.S
new file mode 100644
index 000000000000..a7ad4430a424
--- /dev/null
+++ b/tools/testing/selftests/riscv/misaligned/fpu.S
@@ -0,0 +1,180 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (c) 2025 Rivos Inc.
+ *
+ * Authors:
+ * Clément Léger 
+ */
+
+#include "common.S"
+
+#define CASE_ALIGN 4
+
+.macro

Re: [PATCH] remoteproc: mediatek: Add SCP watchdog handler in IRQ processing

2025-05-22 Thread Mathieu Poirier

Good day,

On Wed, May 21, 2025 at 10:24:03PM +0800, Wentao Liang wrote:
> In mt8195_scp_c1_irq_handler(), only the IPC interrupt bit
> (MT8192_SCP_IPC_INT_BIT) was checked., but does not handle
> when this bit is not set. This could lead to unhandled watchdog
> events. This could lead to unhandled watchdog events. A proper
> implementation can be found in mt8183_scp_irq_handler().
>

As pointed out by Markus, this changelog needs work.

> Add a new branch to handle SCP watchdog events when the IPC
> interrupt bit is not set.
> 
> Fixes: 6a1c9aaf04eb ("remoteproc: mediatek: Add MT8195 SCP core 1 operations")
> Cc: sta...@vger.kernel.org # v6.7
> Signed-off-by: Wentao Liang 
> ---
>  drivers/remoteproc/mtk_scp.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/remoteproc/mtk_scp.c b/drivers/remoteproc/mtk_scp.c
> index 0f4a7065d0bd..316e8c98a503 100644
> --- a/drivers/remoteproc/mtk_scp.c
> +++ b/drivers/remoteproc/mtk_scp.c
> @@ -273,6 +273,8 @@ static void mt8195_scp_c1_irq_handler(struct mtk_scp *scp)
>  
>   if (scp_to_host & MT8192_SCP_IPC_INT_BIT)
>   scp_ipi_handler(scp);
> + else
> + scp_wdt_handler(scp, scp_to_host);

I would much rather see a test for the watchdog bit than just assuming it is
a watchdog interrupt.  And while at it, please refactor the bit definition to be
platform agnostic rather than reusing 8192 definitions on an 8195 platform.
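
For illustration, a minimal sketch of the kind of explicit check meant
here. The SCP_IPC_INT_BIT and SCP_WDT_INT_BIT names and their bit
positions are hypothetical placeholders, not the driver's actual
definitions, and scp_classify_irq() is only a stand-in for the dispatch
done in the real handler:

```c
#include <stdint.h>

/* Hypothetical platform-agnostic bit names; the positions are assumed. */
#define SCP_IPC_INT_BIT (UINT32_C(1) << 0)
#define SCP_WDT_INT_BIT (UINT32_C(1) << 8)

enum scp_irq_kind {
	SCP_IRQ_NONE,	/* neither bit set: nothing to dispatch */
	SCP_IRQ_IPC,
	SCP_IRQ_WDT,
};

/*
 * Decide what to dispatch from the raw status word. The watchdog path is
 * taken only when its own bit is set, instead of being the fallback for
 * "not an IPC interrupt".
 */
static enum scp_irq_kind scp_classify_irq(uint32_t scp_to_host)
{
	if (scp_to_host & SCP_IPC_INT_BIT)
		return SCP_IRQ_IPC;
	if (scp_to_host & SCP_WDT_INT_BIT)
		return SCP_IRQ_WDT;
	return SCP_IRQ_NONE;
}
```

With this shape a status word with neither bit set is reported as
SCP_IRQ_NONE rather than being treated as a watchdog event.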

Thanks,
Mathieu

>  
>   writel(scp_to_host, scp->cluster->reg_base + MT8195_SSHUB2APMCU_IPC_CLR);
>  }
> -- 
> 2.42.0.windows.2
>

[PATCH] selftests/mm: Deduplicate test names in madv_populate

2025-05-22 Thread Mark Brown

The madv_populate selftest has some repetitive code for several different
cases that it covers, including repeated test names used in ksft_test_result()
reports. This causes problems for automation, since the test name is used both
to track the test between runs and to distinguish between multiple tests within
the same run. Fix this by tweaking the duplicated messages to be more
specific about the contexts they're in.

Signed-off-by: Mark Brown 
---
 tools/testing/selftests/mm/madv_populate.c | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/tools/testing/selftests/mm/madv_populate.c b/tools/testing/selftests/mm/madv_populate.c
index ef7d911da13e..b6fabd5c27ed 100644
--- a/tools/testing/selftests/mm/madv_populate.c
+++ b/tools/testing/selftests/mm/madv_populate.c
@@ -172,12 +172,12 @@ static void test_populate_read(void)
if (addr == MAP_FAILED)
ksft_exit_fail_msg("mmap failed\n");
ksft_test_result(range_is_not_populated(addr, SIZE),
-"range initially not populated\n");
+"read range initially not populated\n");
 
ret = madvise(addr, SIZE, MADV_POPULATE_READ);
ksft_test_result(!ret, "MADV_POPULATE_READ\n");
ksft_test_result(range_is_populated(addr, SIZE),
-"range is populated\n");
+"read range is populated\n");
 
munmap(addr, SIZE);
 }
@@ -194,12 +194,12 @@ static void test_populate_write(void)
if (addr == MAP_FAILED)
ksft_exit_fail_msg("mmap failed\n");
ksft_test_result(range_is_not_populated(addr, SIZE),
-"range initially not populated\n");
+"write range initially not populated\n");
 
ret = madvise(addr, SIZE, MADV_POPULATE_WRITE);
ksft_test_result(!ret, "MADV_POPULATE_WRITE\n");
ksft_test_result(range_is_populated(addr, SIZE),
-"range is populated\n");
+"write range is populated\n");
 
munmap(addr, SIZE);
 }
@@ -247,19 +247,19 @@ static void test_softdirty(void)
/* Clear any softdirty bits. */
clear_softdirty();
ksft_test_result(range_is_not_softdirty(addr, SIZE),
-"range is not softdirty\n");
+"cleared range is not softdirty\n");
 
/* Populating READ should set softdirty. */
ret = madvise(addr, SIZE, MADV_POPULATE_READ);
-   ksft_test_result(!ret, "MADV_POPULATE_READ\n");
+   ksft_test_result(!ret, "softdirty MADV_POPULATE_READ\n");
ksft_test_result(range_is_not_softdirty(addr, SIZE),
-"range is not softdirty\n");
+"range is not softdirty after MADV_POPULATE_READ\n");
 
/* Populating WRITE should set softdirty. */
ret = madvise(addr, SIZE, MADV_POPULATE_WRITE);
-   ksft_test_result(!ret, "MADV_POPULATE_WRITE\n");
+   ksft_test_result(!ret, "softdirty MADV_POPULATE_WRITE\n");
ksft_test_result(range_is_softdirty(addr, SIZE),
-"range is softdirty\n");
+"range is softdirty after MADV_POPULATE_WRITE \n");
 
munmap(addr, SIZE);
 }

---
base-commit: a5806cd506af5a7c19bcd596e4708b5c464bfd21
change-id: 20250521-selftests-mm-madv-populate-dedupe-95faf16c3c8f

Best regards,
-- 
Mark Brown
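
To illustrate why duplicate names matter to tooling, here is a minimal
sketch of the kind of uniqueness check a result consumer might apply.
The count_duplicate_names() helper is hypothetical and not part of
kselftest; the strings are the messages reported by
test_populate_read() and test_populate_write() before and after the
patch above:

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of kselftest: count how many entries in
 * names[] repeat an earlier entry. A consumer keyed on test names cannot
 * tell such results apart.
 */
static size_t count_duplicate_names(const char *const names[], size_t n)
{
	size_t dups = 0;

	for (size_t i = 1; i < n; i++) {
		for (size_t j = 0; j < i; j++) {
			if (strcmp(names[i], names[j]) == 0) {
				dups++;
				break;
			}
		}
	}
	return dups;
}

/* Names reported by the read and write populate tests before the patch. */
static const char *const names_before[] = {
	"range initially not populated",
	"MADV_POPULATE_READ",
	"range is populated",
	"range initially not populated",
	"MADV_POPULATE_WRITE",
	"range is populated",
};

/* The same six results with the names introduced by the patch. */
static const char *const names_after[] = {
	"read range initially not populated",
	"MADV_POPULATE_READ",
	"read range is populated",
	"write range initially not populated",
	"MADV_POPULATE_WRITE",
	"write range is populated",
};
```

Calling count_duplicate_names() on names_before finds two repeated
names, while names_after has none.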

[PATCH v3 0/2] livepatch, arm64/module: Enable late module relocations.

2025-05-22 Thread Dylan Hatch

Late relocations (after the module is initially loaded) are needed when
livepatches change module code. This is supported by x86, ppc, and s390.
This series borrows the x86 methodology to reach the same level of
support on arm64, and moves the text-poke locking into the core livepatch
code to reduce redundancy.

Dylan Hatch (2):
  livepatch, x86/module: Generalize late module relocation locking.
  arm64/module: Use text-poke API for late relocations.

 arch/arm64/kernel/module.c | 114 ++---
 arch/x86/kernel/module.c   |   8 +--
 kernel/livepatch/core.c    |  18 --
 3 files changed, 84 insertions(+), 56 deletions(-)

-- 
2.49.0.1151.ga128411c76-goog

Re: [PATCH] selftests/mm: Deduplicate test names in madv_populate

2025-05-22 Thread Liam R. Howlett

* Mark Brown  [250522 12:30]:
> The madv_populate selftest has some repetitive code for several different
> cases that it covers, included repeated test names used in ksft_test_result()
> reports. This causes problems for automation, the test name is used to both
> track the test between runs and distinguish between multiple tests within
> the same run. Fix this by tweaking the messages with duplication to be more
> specific about the contexts they're in.
> 
> Signed-off-by: Mark Brown 

Reviewed-by: Liam R. Howlett 

> ---
>  tools/testing/selftests/mm/madv_populate.c | 18 +-
>  1 file changed, 9 insertions(+), 9 deletions(-)
> 
> diff --git a/tools/testing/selftests/mm/madv_populate.c b/tools/testing/selftests/mm/madv_populate.c
> index ef7d911da13e..b6fabd5c27ed 100644
> --- a/tools/testing/selftests/mm/madv_populate.c
> +++ b/tools/testing/selftests/mm/madv_populate.c
> @@ -172,12 +172,12 @@ static void test_populate_read(void)
>   if (addr == MAP_FAILED)
>   ksft_exit_fail_msg("mmap failed\n");
>   ksft_test_result(range_is_not_populated(addr, SIZE),
> -  "range initially not populated\n");
> +  "read range initially not populated\n");
>  
>   ret = madvise(addr, SIZE, MADV_POPULATE_READ);
>   ksft_test_result(!ret, "MADV_POPULATE_READ\n");
>   ksft_test_result(range_is_populated(addr, SIZE),
> -  "range is populated\n");
> +  "read range is populated\n");
>  
>   munmap(addr, SIZE);
>  }
> @@ -194,12 +194,12 @@ static void test_populate_write(void)
>   if (addr == MAP_FAILED)
>   ksft_exit_fail_msg("mmap failed\n");
>   ksft_test_result(range_is_not_populated(addr, SIZE),
> -  "range initially not populated\n");
> +  "write range initially not populated\n");
>  
>   ret = madvise(addr, SIZE, MADV_POPULATE_WRITE);
>   ksft_test_result(!ret, "MADV_POPULATE_WRITE\n");
>   ksft_test_result(range_is_populated(addr, SIZE),
> -  "range is populated\n");
> +  "write range is populated\n");
>  
>   munmap(addr, SIZE);
>  }
> @@ -247,19 +247,19 @@ static void test_softdirty(void)
>   /* Clear any softdirty bits. */
>   clear_softdirty();
>   ksft_test_result(range_is_not_softdirty(addr, SIZE),
> -  "range is not softdirty\n");
> +  "cleared range is not softdirty\n");
>  
>   /* Populating READ should set softdirty. */
>   ret = madvise(addr, SIZE, MADV_POPULATE_READ);
> - ksft_test_result(!ret, "MADV_POPULATE_READ\n");
> + ksft_test_result(!ret, "softdirty MADV_POPULATE_READ\n");
>   ksft_test_result(range_is_not_softdirty(addr, SIZE),
> -  "range is not softdirty\n");
> +  "range is not softdirty after MADV_POPULATE_READ\n");
>  
>   /* Populating WRITE should set softdirty. */
>   ret = madvise(addr, SIZE, MADV_POPULATE_WRITE);
> - ksft_test_result(!ret, "MADV_POPULATE_WRITE\n");
> + ksft_test_result(!ret, "softdirty MADV_POPULATE_WRITE\n");
>   ksft_test_result(range_is_softdirty(addr, SIZE),
> -  "range is softdirty\n");
> +  "range is softdirty after MADV_POPULATE_WRITE \n");
>  
>   munmap(addr, SIZE);
>  }
> 
> ---
> base-commit: a5806cd506af5a7c19bcd596e4708b5c464bfd21
> change-id: 20250521-selftests-mm-madv-populate-dedupe-95faf16c3c8f
> 
> Best regards,
> -- 
> Mark Brown 
> 
>

[PATCH v3 1/2] livepatch, x86/module: Generalize late module relocation locking.

2025-05-22 Thread Dylan Hatch

Late module relocations are an issue on any arch that supports
livepatch, so move the text_mutex locking to the livepatch core code.

Signed-off-by: Dylan Hatch 
---
 arch/x86/kernel/module.c |  8 ++--
 kernel/livepatch/core.c  | 18 +-
 2 files changed, 15 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kernel/module.c b/arch/x86/kernel/module.c
index ff07558b7ebc6..38767e0047d0c 100644
--- a/arch/x86/kernel/module.c
+++ b/arch/x86/kernel/module.c
@@ -197,18 +197,14 @@ static int write_relocate_add(Elf64_Shdr *sechdrs,
bool early = me->state == MODULE_STATE_UNFORMED;
void *(*write)(void *, const void *, size_t) = memcpy;
 
-   if (!early) {
+   if (!early)
write = text_poke;
-   mutex_lock(&text_mutex);
-   }
 
ret = __write_relocate_add(sechdrs, strtab, symindex, relsec, me,
   write, apply);
 
-   if (!early) {
+   if (!early)
text_poke_sync();
-   mutex_unlock(&text_mutex);
-   }
 
return ret;
 }
diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
index 0e73fac55f8eb..9968441f73510 100644
--- a/kernel/livepatch/core.c
+++ b/kernel/livepatch/core.c
@@ -294,9 +294,10 @@ static int klp_write_section_relocs(struct module *pmod, Elf_Shdr *sechdrs,
unsigned int symndx, unsigned int secndx,
const char *objname, bool apply)
 {
-   int cnt, ret;
+   int cnt, ret = 0;
char sec_objname[MODULE_NAME_LEN];
Elf_Shdr *sec = sechdrs + secndx;
+   bool early = pmod->state == MODULE_STATE_UNFORMED;
 
/*
 * Format: .klp.rela.sec_objname.section_name
@@ -319,12 +320,19 @@ static int klp_write_section_relocs(struct module *pmod, Elf_Shdr *sechdrs,
  sec, sec_objname);
if (ret)
return ret;
-
-   return apply_relocate_add(sechdrs, strtab, symndx, secndx, pmod);
}
 
-   clear_relocate_add(sechdrs, strtab, symndx, secndx, pmod);
-   return 0;
+   if (!early)
+   mutex_lock(&text_mutex);
+
+   if (apply)
+   ret = apply_relocate_add(sechdrs, strtab, symndx, secndx, pmod);
+   else
+   clear_relocate_add(sechdrs, strtab, symndx, secndx, pmod);
+
+   if (!early)
+   mutex_unlock(&text_mutex);
+   return ret;
 }
 
 int klp_apply_section_relocs(struct module *pmod, Elf_Shdr *sechdrs,
-- 
2.49.0.1151.ga128411c76-goog

Re: [PATCH v3 1/2] livepatch, x86/module: Generalize late module relocation locking.

2025-05-22 Thread Song Liu

On Thu, May 22, 2025 at 11:43 AM Dylan Hatch  wrote:
>
> Late module relocations are an issue on any arch that supports
> livepatch, so move the text_mutex locking to the livepatch core code.
>
> Signed-off-by: Dylan Hatch 

Acked-by: Song Liu

Re: [PATCH] selftests/cpufreq: Fix cpufreq basic read and update testcases

2025-05-22 Thread Shuah Khan


On 5/19/25 01:58, Viresh Kumar wrote:

On 30-04-25, 17:14, Swapnil Sapkal wrote:

In cpufreq basic selftests, one of the testcases is to read all cpufreq
sysfs files and print the values. This testcase assumes all the cpufreq
sysfs files have read permissions. However certain cpufreq sysfs files
(e.g. stats/reset) are write-only files and this testcase errors out
when it is not able to read the file.
Similarly, there is one more testcase which reads the cpufreq sysfs
file data and writes it back to the same file. This testcase also errors out
for sysfs files without read permission.
Fix these testcases by adding proper read permission checks.


Can you share how you ran the test?



Reported-by: Narasimhan V 
Signed-off-by: Swapnil Sapkal 
---
  tools/testing/selftests/cpufreq/cpufreq.sh | 15 +++
  1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/tools/testing/selftests/cpufreq/cpufreq.sh b/tools/testing/selftests/cpufreq/cpufreq.sh
index e350c521b467..3484fa34e8d8 100755
--- a/tools/testing/selftests/cpufreq/cpufreq.sh
+++ b/tools/testing/selftests/cpufreq/cpufreq.sh
@@ -52,7 +52,14 @@ read_cpufreq_files_in_dir()
for file in $files; do
if [ -f $1/$file ]; then
printf "$file:"
-   cat $1/$file
+   #file is readable ?
+   local rfile=$(ls -l $1/$file | awk '$1 ~ /^.*r.*/ { print $NF; }')
+
+   if [ ! -z $rfile ]; then
+   cat $1/$file
+   else
+   printf "$file is not readable\n"
+   fi


What about:

if [ -r $1/$file ]; then
 cat $1/$file
else
 printf "$file is not readable\n"
fi




thanks,
-- Shuah

[PATCH 1/4] selftests/mm: Use standard ksft_finished() in cow and gup_longterm

2025-05-22 Thread Mark Brown

The cow and gup_longterm test programs open code something that looks a
lot like the standard ksft_finished() helper to summarise the test
results and provide an exit code. Convert them to use ksft_finished().

Signed-off-by: Mark Brown 
---
 tools/testing/selftests/mm/cow.c  | 7 +--
 tools/testing/selftests/mm/gup_longterm.c | 8 ++--
 2 files changed, 3 insertions(+), 12 deletions(-)

diff --git a/tools/testing/selftests/mm/cow.c b/tools/testing/selftests/mm/cow.c
index b6cfe0a4b7df..e70cd3d900cc 100644
--- a/tools/testing/selftests/mm/cow.c
+++ b/tools/testing/selftests/mm/cow.c
@@ -1771,7 +1771,6 @@ static int tests_per_non_anon_test_case(void)
 
 int main(int argc, char **argv)
 {
-   int err;
struct thp_settings default_settings;
 
ksft_print_header();
@@ -1811,9 +1810,5 @@ int main(int argc, char **argv)
thp_restore_settings();
}
 
-   err = ksft_get_fail_cnt();
-   if (err)
-   ksft_exit_fail_msg("%d out of %d tests failed\n",
-  err, ksft_test_num());
-   ksft_exit_pass();
+   ksft_finished();
 }
diff --git a/tools/testing/selftests/mm/gup_longterm.c b/tools/testing/selftests/mm/gup_longterm.c
index 21595b20bbc3..e60e62809186 100644
--- a/tools/testing/selftests/mm/gup_longterm.c
+++ b/tools/testing/selftests/mm/gup_longterm.c
@@ -455,7 +455,7 @@ static int tests_per_test_case(void)
 
 int main(int argc, char **argv)
 {
-   int i, err;
+   int i;
 
pagesize = getpagesize();
nr_hugetlbsizes = detect_hugetlb_page_sizes(hugetlbsizes,
@@ -469,9 +469,5 @@ int main(int argc, char **argv)
for (i = 0; i < ARRAY_SIZE(test_cases); i++)
run_test_case(&test_cases[i]);
 
-   err = ksft_get_fail_cnt();
-   if (err)
-   ksft_exit_fail_msg("%d out of %d tests failed\n",
-  err, ksft_test_num());
-   ksft_exit_pass();
+   ksft_finished();
 }

-- 
2.39.5
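
As a rough illustration, the open-coded summary removed from both
main() functions above boils down to the logic below. This is a
simplified model of the removed code only, not the implementation of
ksft_finished(); summarise_results() and the SKETCH_EXIT_* values are
hypothetical stand-ins for the kselftest exit helpers:

```c
#include <stdio.h>

/* Stand-ins for the kselftest exit codes; values assumed for the sketch. */
#define SKETCH_EXIT_PASS 0
#define SKETCH_EXIT_FAIL 1

/*
 * Model of the removed code: given the failure count and the number of
 * tests run, print a summary on failure and return the exit code.
 */
static int summarise_results(unsigned int fail_cnt, unsigned int test_num)
{
	if (fail_cnt) {
		printf("%u out of %u tests failed\n", fail_cnt, test_num);
		return SKETCH_EXIT_FAIL;
	}
	return SKETCH_EXIT_PASS;
}
```

Having every test program carry its own copy of this is what the patch
avoids by calling the shared helper instead.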

[PATCH 3/4] selftests/mm: Report unique test names for each cow test

2025-05-22 Thread Mark Brown

The kselftest framework uses the string logged when a test result is
reported as the unique identifier for a test, using it to track test
results between runs. The cow test completely fails to follow this pattern:
it runs test functions repeatedly with various parameters, with each result
report from those functions being a string logging an error message which
is fixed between runs.

Since the code already logs each test uniquely before it starts, refactor
to also print this to a buffer, then use that name as the test result.
This isn't especially pretty but is relatively straightforward and is a
great help to tooling.

Signed-off-by: Mark Brown 
---
 tools/testing/selftests/mm/cow.c | 333 +--
 1 file changed, 217 insertions(+), 116 deletions(-)

diff --git a/tools/testing/selftests/mm/cow.c b/tools/testing/selftests/mm/cow.c
index e70cd3d900cc..a97f5ef79f17 100644
--- a/tools/testing/selftests/mm/cow.c
+++ b/tools/testing/selftests/mm/cow.c
@@ -112,9 +112,12 @@ struct comm_pipes {
 
 static int setup_comm_pipes(struct comm_pipes *comm_pipes)
 {
-   if (pipe(comm_pipes->child_ready) < 0)
+   if (pipe(comm_pipes->child_ready) < 0) {
+   ksft_perror("pipe()");
return -errno;
+   }
if (pipe(comm_pipes->parent_ready) < 0) {
+   ksft_perror("pipe()");
close(comm_pipes->child_ready[0]);
close(comm_pipes->child_ready[1]);
return -errno;
@@ -207,13 +210,14 @@ static void do_test_cow_in_parent(char *mem, size_t size, bool do_mprotect,
 
ret = setup_comm_pipes(&comm_pipes);
if (ret) {
-   ksft_test_result_fail("pipe() failed\n");
+   log_test_result(KSFT_FAIL);
return;
}
 
ret = fork();
if (ret < 0) {
-   ksft_test_result_fail("fork() failed\n");
+   ksft_perror("fork() failed");
+   log_test_result(KSFT_FAIL);
goto close_comm_pipes;
} else if (!ret) {
exit(fn(mem, size, &comm_pipes));
@@ -228,9 +232,18 @@ static void do_test_cow_in_parent(char *mem, size_t size, bool do_mprotect,
 * write-faults by directly mapping pages writable.
 */
ret = mprotect(mem, size, PROT_READ);
-   ret |= mprotect(mem, size, PROT_READ|PROT_WRITE);
if (ret) {
-   ksft_test_result_fail("mprotect() failed\n");
+   ksft_perror("mprotect() failed");
+   log_test_result(KSFT_FAIL);
+   write(comm_pipes.parent_ready[1], "0", 1);
+   wait(&ret);
+   goto close_comm_pipes;
+   }
+
+   ret = mprotect(mem, size, PROT_READ|PROT_WRITE);
+   if (ret) {
+   ksft_perror("mprotect() failed");
+   log_test_result(KSFT_FAIL);
write(comm_pipes.parent_ready[1], "0", 1);
wait(&ret);
goto close_comm_pipes;
@@ -248,16 +261,16 @@ static void do_test_cow_in_parent(char *mem, size_t size, bool do_mprotect,
ret = -EINVAL;
 
if (!ret) {
-   ksft_test_result_pass("No leak from parent into child\n");
+   log_test_result(KSFT_PASS);
} else if (xfail) {
/*
 * With hugetlb, some vmsplice() tests are currently expected to
 * fail because (a) harder to fix and (b) nobody really cares.
 * Flag them as expected failure for now.
 */
-   ksft_test_result_xfail("Leak from parent into child\n");
+   log_test_result(KSFT_XFAIL);
} else {
-   ksft_test_result_fail("Leak from parent into child\n");
+   log_test_result(KSFT_FAIL);
}
 close_comm_pipes:
close_comm_pipes(&comm_pipes);
@@ -306,26 +319,29 @@ static void do_test_vmsplice_in_parent(char *mem, size_t size,
 
ret = setup_comm_pipes(&comm_pipes);
if (ret) {
-   ksft_test_result_fail("pipe() failed\n");
+   log_test_result(KSFT_FAIL);
goto free;
}
 
if (pipe(fds) < 0) {
-   ksft_test_result_fail("pipe() failed\n");
+   ksft_perror("pipe() failed");
+   log_test_result(KSFT_FAIL);
goto close_comm_pipes;
}
 
if (before_fork) {
transferred = vmsplice(fds[1], &iov, 1, 0);
if (transferred <= 0) {
-   ksft_test_result_fail("vmsplice() failed\n");
+   ksft_print_msg("vmsplice() failed\n");
+   log_test_result(KSFT_FAIL);
goto close_pipe;
}
}
 
ret = fork();
if (ret < 0) {
-   ksft_test_result_

[PATCH 2/4] selftest/mm: Add helper for logging test start and results

2025-05-22 Thread Mark Brown

Several of the MM tests have a pattern of printing a description of the
test to be run then reporting the actual TAP result using a generic string
not connected to the specific test, often in a shared function used by many
tests. The name reported typically varies depending on the specific result
rather than the test too. This causes problems for tooling that works with
test results: the names reported with the results are used to deduplicate
tests and track them between runs, so both duplicated names and changing
names cause trouble for things like UIs and automated bisection.

As a first step towards matching these tests better with the expectations
of kselftest, provide helpers which record the test name as part of the
initial print and then use that as part of reporting a result.

This is not added as a generic kselftest helper partly because the use of
a variable to store the test name doesn't fit well with the header-only
implementation of kselftest.h and partly because it's not really an
intended pattern. Ideally at some point the mm tests that use it will be
updated to not need it.

Signed-off-by: Mark Brown 
---
 tools/testing/selftests/mm/vm_util.h | 20 
 1 file changed, 20 insertions(+)

diff --git a/tools/testing/selftests/mm/vm_util.h b/tools/testing/selftests/mm/vm_util.h
index 6effafdc4d8a..4944e4c79051 100644
--- a/tools/testing/selftests/mm/vm_util.h
+++ b/tools/testing/selftests/mm/vm_util.h
@@ -3,6 +3,7 @@
 #include 
 #include 
 #include 
+#include <stdarg.h>
 #include <strings.h> /* ffsl() */
 #include <unistd.h> /* _SC_PAGESIZE */
 #include "../kselftest.h"
@@ -74,6 +75,25 @@ int uffd_register_with_ioctls(int uffd, void *addr, uint64_t len,
 unsigned long get_free_hugepages(void);
 bool check_vmflag_io(void *addr);
 
+/* These helpers need to be inline to match the kselftest.h idiom. */
+static char test_name[1024];
+
+static inline void log_test_start(const char *name, ...)
+{
+   va_list args;
+   va_start(args, name);
+
+   vsnprintf(test_name, sizeof(test_name), name, args);
+   ksft_print_msg("[RUN] %s\n", test_name);
+
+   va_end(args);
+}
+
+static inline void log_test_result(int result)
+{
+   ksft_test_result_report(result, "%s\n", test_name);
+}
+
 /*
  * On ppc64 this will only work with radix 2M hugepage size
  */

-- 
2.39.5
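
A minimal, self-contained sketch of how the two helpers above are meant
to be used together. The kselftest reporting call is replaced by a
local stub, so stub_report(), last_report, run_case() and the SKETCH_*
result codes are all hypothetical; only the log_test_start() and
log_test_result() shapes follow the patch:

```c
#include <stdarg.h>
#include <stdio.h>

/* Stand-in result codes; the real values come from kselftest.h. */
enum { SKETCH_PASS, SKETCH_FAIL };

static char test_name[1024];
static char last_report[1100];	/* what the stub reported last */

/* Stub standing in for ksft_test_result_report(): record the result line. */
static void stub_report(int result, const char *name)
{
	snprintf(last_report, sizeof(last_report), "%s %s",
		 result == SKETCH_PASS ? "ok" : "not ok", name);
}

/* Same shape as the helper in the patch: format and remember the name. */
static void log_test_start(const char *fmt, ...)
{
	va_list args;

	va_start(args, fmt);
	vsnprintf(test_name, sizeof(test_name), fmt, args);
	va_end(args);
	printf("# [RUN] %s\n", test_name);
}

/* Report the result under the name recorded by log_test_start(). */
static void log_test_result(int result)
{
	stub_report(result, test_name);
}

/* Hypothetical caller: one shared test body run for several page sizes. */
static void run_case(const char *what, unsigned long size_kb, int result)
{
	log_test_start("%s with %lu kB pages", what, size_kb);
	log_test_result(result);
}
```

Because the reported name is whatever was formatted at the start of the
test, the same shared function yields a distinct and stable name per
parameter set, whether the result is a pass or a failure.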

[PATCH 0/4] selftests/mm: cow and gup_longterm cleanups

2025-05-22 Thread Mark Brown

The bulk of these changes modify the cow and gup_longterm tests to
report unique and stable names for each test, bringing them into line
with the expectations of tooling that works with kselftest.  The string
reported as a test result is used by tooling to both deduplicate tests
and track tests between test runs, so using the same string for multiple
tests or changing the string depending on test result causes problems
for user interfaces and automation such as bisection.

It was suggested that converting to use kselftest_harness.h would be a
good way of addressing this. However, that really wants the set of tests
to run to be known at compile time, but both test programs dynamically
enumerate the set of huge page sizes the system supports and test each.
Refactoring to handle this would be even more invasive than these
changes, which are large but straightforward and repetitive.

A version of the main gup_longterm cleanup was previously sent
separately; this version factors out the helpers for logging the start
of the test since the cow test looks very similar.

Signed-off-by: Mark Brown 
---
Mark Brown (4):
  selftests/mm: Use standard ksft_finished() in cow and gup_longterm
  selftest/mm: Add helper for logging test start and results
  selftests/mm: Report unique test names for each cow test
  selftests/mm: Fix test result reporting in gup_longterm

 tools/testing/selftests/mm/cow.c  | 340 +++---
 tools/testing/selftests/mm/gup_longterm.c | 158 --
 tools/testing/selftests/mm/vm_util.h  |  20 ++
 3 files changed, 334 insertions(+), 184 deletions(-)
---
base-commit: a5806cd506af5a7c19bcd596e4708b5c464bfd21
change-id: 20250521-selftests-mm-cow-dedupe-33dcab034558

Best regards,
-- 
Mark Brown
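
The naming scheme described in this cover letter can be sketched in
plain userspace C. This is only an illustration under stated
assumptions: the demo_* names, the result codes and the output strings
are hypothetical stand-ins and are not the kselftest.h API. The point is
that the name is formatted once before the test runs and the same string
is reported for every outcome.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical stand-ins for the kselftest result codes. */
enum { DEMO_PASS = 0, DEMO_FAIL = 1, DEMO_SKIP = 4 };

/* Name of the test currently running, filled in before it starts. */
static char demo_test_name[1024];

/* Format the test name once and remember it for the result line. */
static void demo_log_test_start(const char *fmt, ...)
{
	va_list args;

	va_start(args, fmt);
	vsnprintf(demo_test_name, sizeof(demo_test_name), fmt, args);
	va_end(args);

	printf("# [RUN] %s\n", demo_test_name);
}

/*
 * Build the result line from the remembered name. Only the status
 * prefix or suffix depends on the outcome, never the name itself.
 */
static int demo_format_result(int result, char *buf, size_t len)
{
	switch (result) {
	case DEMO_PASS:
		return snprintf(buf, len, "ok %s", demo_test_name);
	case DEMO_SKIP:
		return snprintf(buf, len, "ok %s # SKIP", demo_test_name);
	default:
		return snprintf(buf, len, "not ok %s", demo_test_name);
	}
}
```

Diagnostics such as "mmap() failed" would go out as separate messages,
so tooling can match the result line across runs whether the test
passed, failed or was skipped.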

[PATCH 4/4] selftests/mm: Fix test result reporting in gup_longterm

2025-05-22 Thread Mark Brown

The kselftest framework uses the string logged when a test result is
reported as the unique identifier for a test, using it to track test
results between runs. The gup_longterm test fails to follow this
pattern: it runs a single test function repeatedly with various
parameters, but each result report is a string logging an error message
which is fixed between runs.

Since the code already logs each test uniquely before it starts, refactor
to also print this to a buffer, then use that name as the test result.
This isn't especially pretty but is relatively straightforward and is a
great help to tooling.

Signed-off-by: Mark Brown 
---
 tools/testing/selftests/mm/gup_longterm.c | 150 +++---
 1 file changed, 94 insertions(+), 56 deletions(-)

diff --git a/tools/testing/selftests/mm/gup_longterm.c 
b/tools/testing/selftests/mm/gup_longterm.c
index e60e62809186..f84ea97c2543 100644
--- a/tools/testing/selftests/mm/gup_longterm.c
+++ b/tools/testing/selftests/mm/gup_longterm.c
@@ -93,33 +93,48 @@ static void do_test(int fd, size_t size, enum test_type 
type, bool shared)
__fsword_t fs_type = get_fs_type(fd);
bool should_work;
char *mem;
+   int result = KSFT_PASS;
int ret;
 
+   if (fd < 0) {
+   result = KSFT_FAIL;
+   goto report;
+   }
+
if (ftruncate(fd, size)) {
if (errno == ENOENT) {
skip_test_dodgy_fs("ftruncate()");
} else {
-   ksft_test_result_fail("ftruncate() failed (%s)\n", 
strerror(errno));
+   ksft_print_msg("ftruncate() failed (%s)\n",
+  strerror(errno));
+   result = KSFT_FAIL;
+   goto report;
}
return;
}
 
if (fallocate(fd, 0, 0, size)) {
-   if (size == pagesize)
-   ksft_test_result_fail("fallocate() failed (%s)\n", 
strerror(errno));
-   else
-   ksft_test_result_skip("need more free huge pages\n");
-   return;
+   if (size == pagesize) {
+   ksft_print_msg("fallocate() failed (%s)\n", 
strerror(errno));
+   result = KSFT_FAIL;
+   } else {
+   ksft_print_msg("need more free huge pages\n");
+   result = KSFT_SKIP;
+   }
+   goto report;
}
 
mem = mmap(NULL, size, PROT_READ | PROT_WRITE,
   shared ? MAP_SHARED : MAP_PRIVATE, fd, 0);
if (mem == MAP_FAILED) {
-   if (size == pagesize || shared)
-   ksft_test_result_fail("mmap() failed (%s)\n", 
strerror(errno));
-   else
-   ksft_test_result_skip("need more free huge pages\n");
-   return;
+   if (size == pagesize || shared) {
+   ksft_print_msg("mmap() failed (%s)\n", strerror(errno));
+   result = KSFT_FAIL;
+   } else {
+   ksft_print_msg("need more free huge pages\n");
+   result = KSFT_SKIP;
+   }
+   goto report;
}
 
/* Fault in the page such that GUP-fast can pin it directly. */
@@ -134,7 +149,8 @@ static void do_test(int fd, size_t size, enum test_type 
type, bool shared)
 */
ret = mprotect(mem, size, PROT_READ);
if (ret) {
-   ksft_test_result_fail("mprotect() failed (%s)\n", 
strerror(errno));
+   ksft_print_msg("mprotect() failed (%s)\n", 
strerror(errno));
+   result = KSFT_FAIL;
goto munmap;
}
/* FALLTHROUGH */
@@ -147,12 +163,14 @@ static void do_test(int fd, size_t size, enum test_type 
type, bool shared)
type == TEST_TYPE_RW_FAST;
 
if (gup_fd < 0) {
-   ksft_test_result_skip("gup_test not available\n");
+   ksft_print_msg("gup_test not available\n");
+   result = KSFT_SKIP;
break;
}
 
if (rw && shared && fs_is_unknown(fs_type)) {
-   ksft_test_result_skip("Unknown filesystem\n");
+   ksft_print_msg("Unknown filesystem\n");
+   result = KSFT_SKIP;
return;
}
/*
@@ -169,14 +187,19 @@ static void do_test(int fd, size_t size, enum test_type 
type, bool shared)
args.flags |= rw ? PIN_LONGTERM_TEST_FLAG_USE_WRITE : 0;
ret = ioctl(gup_fd, PIN_LONGTERM_TEST_START, &args);
if (ret && errno == EINVAL) {
-   ksft_test_result_skip("PIN_LONGTERM_TEST_START failed 
(EINVA

[PATCH v3 2/2] arm64/module: Use text-poke API for late relocations.

2025-05-22 Thread Dylan Hatch

To enable late module patching, livepatch modules need to be able to
apply some of their relocations well after being loaded. In this
scenario, use the text-poking API to allow this, even with
STRICT_MODULE_RWX.

This patch is partially based off commit 88fc078a7a8f6 ("x86/module: Use
text_poke() for late relocations").

Signed-off-by: Dylan Hatch 
Acked-by: Song Liu 
---
 arch/arm64/kernel/module.c | 114 ++---
 1 file changed, 69 insertions(+), 45 deletions(-)

diff --git a/arch/arm64/kernel/module.c b/arch/arm64/kernel/module.c
index 06bb680bfe975..3998fb3322b73 100644
--- a/arch/arm64/kernel/module.c
+++ b/arch/arm64/kernel/module.c
@@ -18,11 +18,13 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
 #include 
 #include 
+#include 
 
 enum aarch64_reloc_op {
RELOC_OP_NONE,
@@ -48,7 +50,8 @@ static u64 do_reloc(enum aarch64_reloc_op reloc_op, __le32 
*place, u64 val)
return 0;
 }
 
-static int reloc_data(enum aarch64_reloc_op op, void *place, u64 val, int len)
+static int reloc_data(enum aarch64_reloc_op op, void *place, u64 val, int len,
+ struct module *me)
 {
s64 sval = do_reloc(op, place, val);
 
@@ -66,7 +69,11 @@ static int reloc_data(enum aarch64_reloc_op op, void *place, 
u64 val, int len)
 
switch (len) {
case 16:
-   *(s16 *)place = sval;
+   if (me->state != MODULE_STATE_UNFORMED)
+   aarch64_insn_set(place, sval, sizeof(s16));
+   else
+   *(s16 *)place = sval;
+
switch (op) {
case RELOC_OP_ABS:
if (sval < 0 || sval > U16_MAX)
@@ -82,7 +89,11 @@ static int reloc_data(enum aarch64_reloc_op op, void *place, 
u64 val, int len)
}
break;
case 32:
-   *(s32 *)place = sval;
+   if (me->state != MODULE_STATE_UNFORMED)
+   aarch64_insn_set(place, sval, sizeof(s32));
+   else
+   *(s32 *)place = sval;
+
switch (op) {
case RELOC_OP_ABS:
if (sval < 0 || sval > U32_MAX)
@@ -98,8 +109,10 @@ static int reloc_data(enum aarch64_reloc_op op, void 
*place, u64 val, int len)
}
break;
case 64:
-   *(s64 *)place = sval;
-   break;
+   if (me->state != MODULE_STATE_UNFORMED)
+   aarch64_insn_set(place, sval, sizeof(s64));
+   else
+   *(s64 *)place = sval;   break;
default:
pr_err("Invalid length (%d) for data relocation\n", len);
return 0;
@@ -113,7 +126,8 @@ enum aarch64_insn_movw_imm_type {
 };
 
 static int reloc_insn_movw(enum aarch64_reloc_op op, __le32 *place, u64 val,
-  int lsb, enum aarch64_insn_movw_imm_type imm_type)
+  int lsb, enum aarch64_insn_movw_imm_type imm_type,
+  struct module *me)
 {
u64 imm;
s64 sval;
@@ -145,7 +159,10 @@ static int reloc_insn_movw(enum aarch64_reloc_op op, 
__le32 *place, u64 val,
 
/* Update the instruction with the new encoding. */
insn = aarch64_insn_encode_immediate(AARCH64_INSN_IMM_16, insn, imm);
-   *place = cpu_to_le32(insn);
+   if (me->state != MODULE_STATE_UNFORMED)
+   aarch64_insn_set(place, cpu_to_le32(insn), sizeof(insn));
+   else
+   *place = cpu_to_le32(insn);
 
if (imm > U16_MAX)
return -ERANGE;
@@ -154,7 +171,8 @@ static int reloc_insn_movw(enum aarch64_reloc_op op, __le32 
*place, u64 val,
 }
 
 static int reloc_insn_imm(enum aarch64_reloc_op op, __le32 *place, u64 val,
- int lsb, int len, enum aarch64_insn_imm_type imm_type)
+ int lsb, int len, enum aarch64_insn_imm_type imm_type,
+ struct module *me)
 {
u64 imm, imm_mask;
s64 sval;
@@ -170,7 +188,10 @@ static int reloc_insn_imm(enum aarch64_reloc_op op, __le32 
*place, u64 val,
 
/* Update the instruction's immediate field. */
insn = aarch64_insn_encode_immediate(imm_type, insn, imm);
-   *place = cpu_to_le32(insn);
+   if (me->state != MODULE_STATE_UNFORMED)
+   aarch64_insn_set(place, cpu_to_le32(insn), sizeof(insn));
+   else
+   *place = cpu_to_le32(insn);
 
/*
 * Extract the upper value bits (including the sign bit) and
@@ -189,17 +210,17 @@ static int reloc_insn_imm(enum aarch64_reloc_op op, 
__le32 *place, u64 val,
 }
 
 static int reloc_insn_adrp(struct module *mod, Elf64_Shdr *sechdrs,
-  __le32 *place, u64 val)
+  __le32 *place, u64 val, struct module *me)
 {
u32 insn;
 
if (!is_forbidden_offset_for_adrp(place))
return

Re: [PATCH v3 2/2] arm64/module: Use text-poke API for late relocations.

2025-05-22 Thread Dylan Hatch

On Thu, May 22, 2025 at 11:43 AM Dylan Hatch  wrote:
> -static int reloc_data(enum aarch64_reloc_op op, void *place, u64 val, int 
> len)
> +static int reloc_data(enum aarch64_reloc_op op, void *place, u64 val, int 
> len,
> + struct module *me)
>  {
> s64 sval = do_reloc(op, place, val);
>
> @@ -66,7 +69,11 @@ static int reloc_data(enum aarch64_reloc_op op, void 
> *place, u64 val, int len)
>
> switch (len) {
> case 16:
> -   *(s16 *)place = sval;
> +   if (me->state != MODULE_STATE_UNFORMED)
> +   aarch64_insn_set(place, sval, sizeof(s16));
> +   else
> +   *(s16 *)place = sval;
> +
> switch (op) {
> case RELOC_OP_ABS:
> if (sval < 0 || sval > U16_MAX)
> @@ -82,7 +89,11 @@ static int reloc_data(enum aarch64_reloc_op op, void 
> *place, u64 val, int len)
> }
> break;
> case 32:
> -   *(s32 *)place = sval;
> +   if (me->state != MODULE_STATE_UNFORMED)
> +   aarch64_insn_set(place, sval, sizeof(s32));
> +   else
> +   *(s32 *)place = sval;
> +
> switch (op) {
> case RELOC_OP_ABS:
> if (sval < 0 || sval > U32_MAX)
> @@ -98,8 +109,10 @@ static int reloc_data(enum aarch64_reloc_op op, void 
> *place, u64 val, int len)
> }
> break;
> case 64:
> -   *(s64 *)place = sval;
> -   break;
> +   if (me->state != MODULE_STATE_UNFORMED)
> +   aarch64_insn_set(place, sval, sizeof(s64));
> +   else
> +   *(s64 *)place = sval;   break;
> default:
> pr_err("Invalid length (%d) for data relocation\n", len);
> return 0;
> @@ -113,7 +126,8 @@ enum aarch64_insn_movw_imm_type {
>  };

Don't merge this. I spotted an issue -- for the data relocations this
looks like an incorrect usage of aarch64_insn_set(). An updated
version will follow soon.

Thanks,
Dylan
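
One plausible reading of the issue, sketched in userspace C. This is an
assumption and not the kernel implementation: it supposes a "set" style
helper that takes a 32-bit pattern by value and replicates it over whole
words, while a "copy" style helper takes a pointer to the source bytes.
The demo_* names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical model of a "set" style helper: replicate one 32-bit
 * pattern over len bytes, whole 32-bit words only.
 */
static void demo_fill_u32(void *dst, uint32_t pattern, size_t len)
{
	unsigned char *p = dst;
	size_t i;

	for (i = 0; i < len / sizeof(pattern); i++)
		memcpy(p + i * sizeof(pattern), &pattern, sizeof(pattern));
}

/*
 * Hypothetical model of a "copy" style helper: copy len bytes from a
 * source buffer, so every byte of a wider value is preserved.
 */
static void demo_copy(void *dst, const void *src, size_t len)
{
	memcpy(dst, src, len);
}
```

Under these assumptions, a 64-bit relocation value passed by value to
the fill helper is truncated to its low 32 bits and repeated, and a
16-bit store writes nothing because the length is shorter than one
word. Passing the address of the value to the copy helper avoids both
problems.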

[PATCH v4 0/2] livepatch, arm64/module: Enable late module relocations.

2025-05-22 Thread Dylan Hatch

Late relocations (after the module is initially loaded) are needed when
livepatches change module code. This is supported by x86, ppc, and s390.
This series borrows the x86 methodology to reach the same level of
support on arm64, and moves the text-poke locking into the core livepatch
code to reduce redundancy.

Dylan Hatch (2):
  livepatch, x86/module: Generalize late module relocation locking.
  arm64/module: Use text-poke API for late relocations.

 arch/arm64/kernel/module.c | 113 ++---
 arch/x86/kernel/module.c   |   8 +--
 kernel/livepatch/core.c|  18 --
 3 files changed, 84 insertions(+), 55 deletions(-)

-- 
2.49.0.1151.ga128411c76-goog

[PATCH v4 1/2] livepatch, x86/module: Generalize late module relocation locking.

2025-05-22 Thread Dylan Hatch

Late module relocations are an issue on any arch that supports
livepatch, so move the text_mutex locking to the livepatch core code.

Signed-off-by: Dylan Hatch 
Acked-by: Song Liu 
---
 arch/x86/kernel/module.c |  8 ++--
 kernel/livepatch/core.c  | 18 +-
 2 files changed, 15 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kernel/module.c b/arch/x86/kernel/module.c
index ff07558b7ebc6..38767e0047d0c 100644
--- a/arch/x86/kernel/module.c
+++ b/arch/x86/kernel/module.c
@@ -197,18 +197,14 @@ static int write_relocate_add(Elf64_Shdr *sechdrs,
bool early = me->state == MODULE_STATE_UNFORMED;
void *(*write)(void *, const void *, size_t) = memcpy;
 
-   if (!early) {
+   if (!early)
write = text_poke;
-   mutex_lock(&text_mutex);
-   }
 
ret = __write_relocate_add(sechdrs, strtab, symindex, relsec, me,
   write, apply);
 
-   if (!early) {
+   if (!early)
text_poke_sync();
-   mutex_unlock(&text_mutex);
-   }
 
return ret;
 }
diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
index 0e73fac55f8eb..9968441f73510 100644
--- a/kernel/livepatch/core.c
+++ b/kernel/livepatch/core.c
@@ -294,9 +294,10 @@ static int klp_write_section_relocs(struct module *pmod, 
Elf_Shdr *sechdrs,
unsigned int symndx, unsigned int secndx,
const char *objname, bool apply)
 {
-   int cnt, ret;
+   int cnt, ret = 0;
char sec_objname[MODULE_NAME_LEN];
Elf_Shdr *sec = sechdrs + secndx;
+   bool early = pmod->state == MODULE_STATE_UNFORMED;
 
/*
 * Format: .klp.rela.sec_objname.section_name
@@ -319,12 +320,19 @@ static int klp_write_section_relocs(struct module *pmod, 
Elf_Shdr *sechdrs,
  sec, sec_objname);
if (ret)
return ret;
-
-   return apply_relocate_add(sechdrs, strtab, symndx, secndx, 
pmod);
}
 
-   clear_relocate_add(sechdrs, strtab, symndx, secndx, pmod);
-   return 0;
+   if (!early)
+   mutex_lock(&text_mutex);
+
+   if (apply)
+   ret = apply_relocate_add(sechdrs, strtab, symndx, secndx, pmod);
+   else
+   clear_relocate_add(sechdrs, strtab, symndx, secndx, pmod);
+
+   if (!early)
+   mutex_unlock(&text_mutex);
+   return ret;
 }
 
 int klp_apply_section_relocs(struct module *pmod, Elf_Shdr *sechdrs,
-- 
2.49.0.1151.ga128411c76-goog

[PATCH v4 2/2] arm64/module: Use text-poke API for late relocations.

2025-05-22 Thread Dylan Hatch

To enable late module patching, livepatch modules need to be able to
apply some of their relocations well after being loaded. In this
scenario, use the text-poking API to allow this, even with
STRICT_MODULE_RWX.

This patch is partially based off commit 88fc078a7a8f6 ("x86/module: Use
text_poke() for late relocations").

Signed-off-by: Dylan Hatch 
Acked-by: Song Liu 
---
 arch/arm64/kernel/module.c | 113 ++---
 1 file changed, 69 insertions(+), 44 deletions(-)

diff --git a/arch/arm64/kernel/module.c b/arch/arm64/kernel/module.c
index 06bb680bfe975..6fbc3dbdcb425 100644
--- a/arch/arm64/kernel/module.c
+++ b/arch/arm64/kernel/module.c
@@ -18,11 +18,13 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
 #include 
 #include 
+#include 
 
 enum aarch64_reloc_op {
RELOC_OP_NONE,
@@ -48,7 +50,8 @@ static u64 do_reloc(enum aarch64_reloc_op reloc_op, __le32 
*place, u64 val)
return 0;
 }
 
-static int reloc_data(enum aarch64_reloc_op op, void *place, u64 val, int len)
+static int reloc_data(enum aarch64_reloc_op op, void *place, u64 val, int len,
+ struct module *me)
 {
s64 sval = do_reloc(op, place, val);
 
@@ -66,7 +69,11 @@ static int reloc_data(enum aarch64_reloc_op op, void *place, 
u64 val, int len)
 
switch (len) {
case 16:
-   *(s16 *)place = sval;
+   if (me->state != MODULE_STATE_UNFORMED)
+   aarch64_insn_copy(place, &sval, sizeof(s16));
+   else
+   *(s16 *)place = sval;
+
switch (op) {
case RELOC_OP_ABS:
if (sval < 0 || sval > U16_MAX)
@@ -82,7 +89,11 @@ static int reloc_data(enum aarch64_reloc_op op, void *place, 
u64 val, int len)
}
break;
case 32:
-   *(s32 *)place = sval;
+   if (me->state != MODULE_STATE_UNFORMED)
+   aarch64_insn_copy(place, &sval, sizeof(s32));
+   else
+   *(s32 *)place = sval;
+
switch (op) {
case RELOC_OP_ABS:
if (sval < 0 || sval > U32_MAX)
@@ -98,7 +109,10 @@ static int reloc_data(enum aarch64_reloc_op op, void 
*place, u64 val, int len)
}
break;
case 64:
-   *(s64 *)place = sval;
+   if (me->state != MODULE_STATE_UNFORMED)
+   aarch64_insn_copy(place, &sval, sizeof(s64));
+   else
+   *(s64 *)place = sval;
break;
default:
pr_err("Invalid length (%d) for data relocation\n", len);
@@ -113,7 +127,8 @@ enum aarch64_insn_movw_imm_type {
 };
 
 static int reloc_insn_movw(enum aarch64_reloc_op op, __le32 *place, u64 val,
-  int lsb, enum aarch64_insn_movw_imm_type imm_type)
+  int lsb, enum aarch64_insn_movw_imm_type imm_type,
+  struct module *me)
 {
u64 imm;
s64 sval;
@@ -145,7 +160,10 @@ static int reloc_insn_movw(enum aarch64_reloc_op op, 
__le32 *place, u64 val,
 
/* Update the instruction with the new encoding. */
insn = aarch64_insn_encode_immediate(AARCH64_INSN_IMM_16, insn, imm);
-   *place = cpu_to_le32(insn);
+   if (me->state != MODULE_STATE_UNFORMED)
+   aarch64_insn_set(place, cpu_to_le32(insn), sizeof(insn));
+   else
+   *place = cpu_to_le32(insn);
 
if (imm > U16_MAX)
return -ERANGE;
@@ -154,7 +172,8 @@ static int reloc_insn_movw(enum aarch64_reloc_op op, __le32 
*place, u64 val,
 }
 
 static int reloc_insn_imm(enum aarch64_reloc_op op, __le32 *place, u64 val,
- int lsb, int len, enum aarch64_insn_imm_type imm_type)
+ int lsb, int len, enum aarch64_insn_imm_type imm_type,
+ struct module *me)
 {
u64 imm, imm_mask;
s64 sval;
@@ -170,7 +189,10 @@ static int reloc_insn_imm(enum aarch64_reloc_op op, __le32 
*place, u64 val,
 
/* Update the instruction's immediate field. */
insn = aarch64_insn_encode_immediate(imm_type, insn, imm);
-   *place = cpu_to_le32(insn);
+   if (me->state != MODULE_STATE_UNFORMED)
+   aarch64_insn_set(place, cpu_to_le32(insn), sizeof(insn));
+   else
+   *place = cpu_to_le32(insn);
 
/*
 * Extract the upper value bits (including the sign bit) and
@@ -189,17 +211,17 @@ static int reloc_insn_imm(enum aarch64_reloc_op op, 
__le32 *place, u64 val,
 }
 
 static int reloc_insn_adrp(struct module *mod, Elf64_Shdr *sechdrs,
-  __le32 *place, u64 val)
+  __le32 *place, u64 val, struct module *me)
 {
u32 insn;
 
if (!is_forbidden_offset_for_adrp(place))
return reloc_insn_imm(RELOC_OP_PAGE, place,

Re: [PATCH] selftests: acct: fix grammar and clarify output messages

2025-05-22 Thread Shuah Khan


On 5/13/25 16:14, Abdelrahman Fekry wrote:

This patch improves the clarity and grammar of output messages in the acct()
selftest. Minor changes were made to user-facing strings and comments to make
them easier to understand and more consistent with the kselftest style.

Changes include:
- Fixing grammar in printed messages and comments.
- Rewording error and success outputs for clarity and professionalism.
- Making the "root check" message more direct.

These updates improve readability without affecting test logic or behavior.

Signed-off-by: Abdelrahman Fekry 


Did you run checkpatch on this patch? I am seeing

CHECK: From:/Signed-off-by: email comments mismatch: 'From: Abdelrahman Fekry 
' != 'Signed-off-by: Abdelrahman Fekry 
'

Please fix it and send v2.

thanks,
-- Shuah

Re: [PATCH] selftests: size: fix grammar and align output formatting

2025-05-22 Thread Shuah Khan


On 5/13/25 15:44, Abdelrahman Fekry wrote:

Improve the grammar in the test name by changing "get runtime memory use"
to "get runtime memory usage". Also adjust spacing in output lines
("Total:", "Free:", etc.) to ensure consistent alignment and readability.

Signed-off-by: Abdelrahman Fekry 


Did you run checkpatch on this patch? I am seeing

CHECK: From:/Signed-off-by: email comments mismatch: 'From: Abdelrahman Fekry 
' != 'Signed-off-by: Abdelrahman Fekry 
'

Please fix it and send v2.

thanks,
-- Shuah

Re: [PATCH] selftests: ir_decoder: Convert header comment to proper multi-line block

2025-05-22 Thread Shuah Khan


On 5/13/25 16:32, Abdelrahman Fekry wrote:

The test file for the IR decoder used single-line  comments at the top
to document its purpose and licensing, which is inconsistent with the style
used throughout the Linux kernel.

in this patch i converted the file header to a proper multi-line comment block
(/*) that aligns with standard kernel practices. This improves
readability, consistency across selftests, and ensures the license and
documentation are clearly visible in a familiar format.

No functional changes have been made.

Signed-off-by: Abdelrahman Fekry 


Did you run checkpatch on this patch? I am seeing

CHECK: From:/Signed-off-by: email comments mismatch: 'From: Abdelrahman Fekry 
' != 'Signed-off-by: Abdelrahman Fekry 
'

Fix these as well.

ERROR: trailing whitespace
#136: FILE: tools/testing/selftests/ir/ir_loopback.c:3:
+ * $

ERROR: trailing whitespace
#137: FILE: tools/testing/selftests/ir/ir_loopback.c:4:
+ * Selftest for IR decoder $

Please fix it and send v2.

thanks,
-- Shuah

Re: [PATCH] selftests: fix "memebers" typo in filesystems/mount-notify

2025-05-22 Thread Shuah Khan


On 5/13/25 08:48, Hendrik Hamerlinck wrote:

Corrects a spelling mistake "memebers" instead of "members" in
tools/testing/selftests/filesystems/mount-notify/mount-notify_test.c


Change the shortlog to clearly indicate the test. Check a few logs
for this file for examples. Here is what the correct format looks
like:

selftests: filesystems: fix "memebers" typo in mount-notify

Send v2 with this correction.

thanks,
-- Shuah

Re: [PATCH] selftests/eventfd: correct test name and improve messages

2025-05-22 Thread Shuah Khan


On 5/13/25 01:44, Ryan Chung wrote:

- Rename test from  to



?? missing description of the change. Looks like the patch
renames the test to fix spelling error in the test name?


- Make the RDWR‐flag comment declarative:
   “The kernel automatically adds the O_RDWR flag.”
- Update semaphore‐flag failure message to:
   “eventfd semaphore flag check failed: …”


There is no need to list all these changes.

Please check a few changelogs as a reference for how to write them.



Signed-off-by: Ryan Chung 
---
  tools/testing/selftests/filesystems/eventfd/eventfd_test.c | 7 +++
  1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/tools/testing/selftests/filesystems/eventfd/eventfd_test.c 
b/tools/testing/selftests/filesystems/eventfd/eventfd_test.c
index 85acb4e3ef00..72d51ad0ee0e 100644
--- a/tools/testing/selftests/filesystems/eventfd/eventfd_test.c
+++ b/tools/testing/selftests/filesystems/eventfd/eventfd_test.c
@@ -50,7 +50,7 @@ TEST(eventfd_check_flag_rdwr)
ASSERT_GE(fd, 0);
  
  	flags = fcntl(fd, F_GETFL);

-   // since the kernel automatically added O_RDWR.
+   // The kernel automatically adds the O_RDWR flag.
EXPECT_EQ(flags, O_RDWR);
  
  	close(fd);

@@ -85,7 +85,7 @@ TEST(eventfd_check_flag_nonblock)
close(fd);
  }
  
-TEST(eventfd_chek_flag_cloexec_and_nonblock)

+TEST(eventfd_check_flag_cloexec_and_nonblock)
  {
int fd, flags;
  
@@ -178,8 +178,7 @@ TEST(eventfd_check_flag_semaphore)

// The semaphore could only be obtained from fdinfo.
ret = verify_fdinfo(fd, &err, "eventfd-semaphore: ", 19, "1\n");
if (ret != 0)
-   ksft_print_msg("eventfd-semaphore check failed, msg: %s\n",
-   err.msg);
+   ksft_print_msg("eventfd semaphore flag check failed: %s\n", 
err.msg);


What's the reason for this change?


EXPECT_EQ(ret, 0);
  
  	close(fd);


thanks,
-- Shuah

[PATCH v6 1/5] x86/sgx: Introduce a counter to count the sgx_(vepc_)open()

2025-05-22 Thread Elena Reshetova

Currently SGX does not have a global counter to count the
active users from userspace or hypervisor. Implement such a counter,
sgx_usage_count. It will be used by the driver when attempting
to call the EUPDATESVN SGX instruction.

Suggested-by: Sean Christopherson 
Signed-off-by: Elena Reshetova 
---
 arch/x86/kernel/cpu/sgx/driver.c | 22 --
 arch/x86/kernel/cpu/sgx/encl.c   |  1 +
 arch/x86/kernel/cpu/sgx/main.c   | 14 ++
 arch/x86/kernel/cpu/sgx/sgx.h|  3 +++
 arch/x86/kernel/cpu/sgx/virt.c   | 16 ++--
 5 files changed, 48 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kernel/cpu/sgx/driver.c b/arch/x86/kernel/cpu/sgx/driver.c
index 7f8d1e11dbee..a2994a74bdff 100644
--- a/arch/x86/kernel/cpu/sgx/driver.c
+++ b/arch/x86/kernel/cpu/sgx/driver.c
@@ -19,9 +19,15 @@ static int sgx_open(struct inode *inode, struct file *file)
struct sgx_encl *encl;
int ret;
 
+   ret = sgx_inc_usage_count();
+   if (ret)
+   return ret;
+
encl = kzalloc(sizeof(*encl), GFP_KERNEL);
-   if (!encl)
-   return -ENOMEM;
+   if (!encl) {
+   ret = -ENOMEM;
+   goto err_usage_count;
+   }
 
kref_init(&encl->refcount);
xa_init(&encl->page_array);
@@ -31,14 +37,18 @@ static int sgx_open(struct inode *inode, struct file *file)
spin_lock_init(&encl->mm_lock);
 
ret = init_srcu_struct(&encl->srcu);
-   if (ret) {
-   kfree(encl);
-   return ret;
-   }
+   if (ret)
+   goto err_encl;
 
file->private_data = encl;
 
return 0;
+
+err_encl:
+   kfree(encl);
+err_usage_count:
+   sgx_dec_usage_count();
+   return ret;
 }
 
 static int sgx_release(struct inode *inode, struct file *file)
diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
index 279148e72459..3b54889ae4a4 100644
--- a/arch/x86/kernel/cpu/sgx/encl.c
+++ b/arch/x86/kernel/cpu/sgx/encl.c
@@ -765,6 +765,7 @@ void sgx_encl_release(struct kref *ref)
WARN_ON_ONCE(encl->secs.epc_page);
 
kfree(encl);
+   sgx_dec_usage_count();
 }
 
 /*
diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
index 2de01b379aa3..a018b01b8736 100644
--- a/arch/x86/kernel/cpu/sgx/main.c
+++ b/arch/x86/kernel/cpu/sgx/main.c
@@ -917,6 +917,20 @@ int sgx_set_attribute(unsigned long *allowed_attributes,
 }
 EXPORT_SYMBOL_GPL(sgx_set_attribute);
 
+/* Counter to count the active SGX users */
+static atomic64_t sgx_usage_count;
+
+int sgx_inc_usage_count(void)
+{
+   atomic64_inc(&sgx_usage_count);
+   return 0;
+}
+
+void sgx_dec_usage_count(void)
+{
+   atomic64_dec(&sgx_usage_count);
+}
+
 static int __init sgx_init(void)
 {
int ret;
diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h
index d2dad21259a8..f5940393d9bd 100644
--- a/arch/x86/kernel/cpu/sgx/sgx.h
+++ b/arch/x86/kernel/cpu/sgx/sgx.h
@@ -102,6 +102,9 @@ static inline int __init sgx_vepc_init(void)
 }
 #endif
 
+int sgx_inc_usage_count(void);
+void sgx_dec_usage_count(void);
+
 void sgx_update_lepubkeyhash(u64 *lepubkeyhash);
 
 #endif /* _X86_SGX_H */
diff --git a/arch/x86/kernel/cpu/sgx/virt.c b/arch/x86/kernel/cpu/sgx/virt.c
index 7aaa3652e31d..6ce908ed51c9 100644
--- a/arch/x86/kernel/cpu/sgx/virt.c
+++ b/arch/x86/kernel/cpu/sgx/virt.c
@@ -255,22 +255,34 @@ static int sgx_vepc_release(struct inode *inode, struct 
file *file)
xa_destroy(&vepc->page_array);
kfree(vepc);
 
+   sgx_dec_usage_count();
return 0;
 }
 
 static int sgx_vepc_open(struct inode *inode, struct file *file)
 {
struct sgx_vepc *vepc;
+   int ret;
+
+   ret = sgx_inc_usage_count();
+   if (ret)
+   return ret;
 
vepc = kzalloc(sizeof(struct sgx_vepc), GFP_KERNEL);
-   if (!vepc)
-   return -ENOMEM;
+   if (!vepc) {
+   ret = -ENOMEM;
+   goto err_usage_count;
+   }
mutex_init(&vepc->lock);
xa_init(&vepc->page_array);
 
file->private_data = vepc;
 
return 0;
+
+err_usage_count:
+   sgx_dec_usage_count();
+   return ret;
 }
 
 static long sgx_vepc_ioctl(struct file *file,
-- 
2.45.2
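
The unwinding order used in this patch can be tried out in userspace
with a minimal sketch. The demo_* names and the fail_alloc/fail_init
knobs are hypothetical and exist only to force each error path; C11
atomics stand in for atomic64_t. The count is taken first, so every
later failure must drop it again, and release drops it exactly once.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Stand-in for the global usage counter. */
static atomic_llong demo_usage_count;

static int demo_inc_usage_count(void)
{
	atomic_fetch_add(&demo_usage_count, 1);
	return 0;
}

static void demo_dec_usage_count(void)
{
	atomic_fetch_sub(&demo_usage_count, 1);
}

/* Hypothetical per-open object, standing in for the enclave struct. */
struct demo_encl {
	int initialized;
};

/* fail_alloc and fail_init are test knobs, not part of any real API. */
static int demo_open(struct demo_encl **out, int fail_alloc, int fail_init)
{
	struct demo_encl *encl;
	int ret;

	ret = demo_inc_usage_count();
	if (ret)
		return ret;

	encl = fail_alloc ? NULL : calloc(1, sizeof(*encl));
	if (!encl) {
		ret = -ENOMEM;
		goto err_usage_count;
	}

	if (fail_init) {
		ret = -EINVAL;
		goto err_encl;
	}

	encl->initialized = 1;
	*out = encl;
	return 0;

	/* Labels run in reverse order of acquisition. */
err_encl:
	free(encl);
err_usage_count:
	demo_dec_usage_count();
	return ret;
}

static void demo_release(struct demo_encl *encl)
{
	free(encl);
	demo_dec_usage_count();
}
```

After any failed demo_open() the counter is back at its previous value,
and after a successful open followed by demo_release() it returns to
zero.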

[PATCH v6 0/5] Enable automatic SVN updates for SGX enclaves

2025-05-22 Thread Elena Reshetova

Changes since v5 following reviews by Ingo, Kai,
Jarkko and Dave:

 - rebase on x86 tip
 - patch 1 is fixed to do correct unwinding in
 case of errors
 - patch 2: add feature flag to cpuid-deps.c
 - patch 3: remove unused SGX_EPC_NOT_READY error code
 - patch 4: fix x86 feature check, return -EAGAIN
 on SGX_INSUFFICIENT_ENTROPY and -EIO on other non-
 expected errors. Comments/style are also fixed.
 - patch 5: rewrite commit message, add comments inline

Changes since v4 following reviews by Dave and Jarkko:

 - break down the single patch into 4 patches as
 suggested by Dave
 - fix error unwinding in sgx_(vepc_)open as 
 suggested by Jarkko
 - numerous code improvements suggested by Dave
 - numerous additional code comments and commit
 message improvements as suggested by Dave
 - switch to usage of atomic64_t for sgx_usage_count
 to ensure overflows cannot happen as suggested by Dave
 - do not return an error when failing with
 SGX_INSUFFICIENT_ENTROPY, fail silently as suggested
 by Dave

Changes since v3 following reviews by Kai and Sean:

 - Change the overall approach to the one suggested
  by Sean and do the EUPDATESVN execution during
  sgx_open() and sgx_vepc_open().
  Note, I do not try to do EUPDATESVN during the release()
  flows since it doesn't save any noticeable amount
  of time during the next open flow per the underlying EUPDATESVN
  microcode logic.
 - In sgx_update_svn() remove the check that we are
  running under VMM and expect the VMM to instead
  expose correct CPUID
 - Move the actual CPUID leaf check out of
  sgx_update_svn() into sgx_init()
 - Use existing RDRAND_RETRY_LOOPS define instead of 10
 - Change the sgx_update_svn() to return 0 only in
 success cases (or if unsupported)
 - Various smaller cosmetic fixes

The still to be discussed question is what sgx_update_svn()
should return in case of various failures. The current version
follows the suggestion by Kai and would return an error (and block
sgx_(vepc_)open()) in all cases, including running out of entropy.
I think this might be the correct approach for SGX_INSUFFICIENT_ENTROPY
since in such cases userspace can retry the open() and also
will get info about what is actually blocking the EUPDATESVN
(and can act on this). However, this is a change in the existing API
and therefore debatable, and I would like to hear people's feedback.

Changes since v2 following review by Jarkko:

 - formatting of comments is fixed
 - change from pr_error to ENCLS_WARN to communicate errors from
 EUPDATESVN
 - In case an unknown error is detected (must not happen per spec),
 make page allocation from EPC fail in order to prevent EPC usage

Changes since v1 following review by Jarkko:

 - first and second patch are squashed together and a better
   explanation of the change is added into the commit message
 - third and fourth patch are also combined for better understanding
   of error code purposes used in 4th patch
 - implementation of sgx_updatesvn adjusted following Jarkko's 
   suggestions
 - minor fixes in both commit messages and code from the review
 - dropping co-developed-by tag since the code now differs enough
   from the original submission. However, the reference where the
   original code came from and credits to original author is kept

Background
----------

In case an SGX vulnerability is discovered and TCB recovery
for SGX is triggered, Intel specifies a process that must be
followed for a given vulnerability. Steps to mitigate can vary
based on vulnerability type, affected components, etc.
In some cases, a vulnerability can be mitigated via a runtime
recovery flow by shutting down all running SGX enclaves,
clearing enclave page cache (EPC), applying a microcode patch
that does not require a reboot (via late microcode loading) and
restarting all SGX enclaves.


Problem statement
-----------------
Even when the above-described runtime recovery flow to mitigate the
SGX vulnerability is followed, the SGX attestation evidence will
still reflect the security SVN version being equal to the previous
state of security SVN (containing vulnerability) that created
and managed the enclave until the runtime recovery event. This
limitation can currently only be overcome via a platform reboot,
which negates all the benefits of the rebootless late microcode
loading and is not required in this case for functional or security
purposes.


Proposed solution
-----------------

The SGX architecture introduced a new instruction called EUPDATESVN [1]
in Ice Lake. It allows updating the security SVN version, given that EPC
is completely empty. The latter is required for security reasons
in order to reason that enclave security posture is as secure as the
security SVN version of the TCB that created it.

This series enables opportunistic execution of EUPDATESVN upon first
EPC page allocation for a first enclave to be run on the platform.

This series is partly based on the previous work done by Cathy Zhang
[2], which attempted to enable forceful destruction of al

[PATCH v6 3/5] x86/sgx: Define error codes for use by ENCLS[EUPDATESVN]

2025-05-22 Thread Elena Reshetova

Add error codes for ENCLS[EUPDATESVN] so that the SGX CPUSVN update
process can know the execution state of EUPDATESVN and notify
userspace.

Signed-off-by: Elena Reshetova 
---
 arch/x86/include/asm/sgx.h | 37 ++---
 1 file changed, 22 insertions(+), 15 deletions(-)

diff --git a/arch/x86/include/asm/sgx.h b/arch/x86/include/asm/sgx.h
index 6a0069761508..1abf1461fab6 100644
--- a/arch/x86/include/asm/sgx.h
+++ b/arch/x86/include/asm/sgx.h
@@ -28,21 +28,22 @@
 #define SGX_CPUID_EPC_MASK GENMASK(3, 0)
 
 enum sgx_encls_function {
-   ECREATE = 0x00,
-   EADD= 0x01,
-   EINIT   = 0x02,
-   EREMOVE = 0x03,
-   EDGBRD  = 0x04,
-   EDGBWR  = 0x05,
-   EEXTEND = 0x06,
-   ELDU= 0x08,
-   EBLOCK  = 0x09,
-   EPA = 0x0A,
-   EWB = 0x0B,
-   ETRACK  = 0x0C,
-   EAUG= 0x0D,
-   EMODPR  = 0x0E,
-   EMODT   = 0x0F,
+   ECREATE = 0x00,
+   EADD= 0x01,
+   EINIT   = 0x02,
+   EREMOVE = 0x03,
+   EDGBRD  = 0x04,
+   EDGBWR  = 0x05,
+   EEXTEND = 0x06,
+   ELDU= 0x08,
+   EBLOCK  = 0x09,
+   EPA = 0x0A,
+   EWB = 0x0B,
+   ETRACK  = 0x0C,
+   EAUG= 0x0D,
+   EMODPR  = 0x0E,
+   EMODT   = 0x0F,
+   EUPDATESVN  = 0x18,
 };
 
 /**
@@ -73,6 +74,10 @@ enum sgx_encls_function {
  * public key does not match IA32_SGXLEPUBKEYHASH.
  * %SGX_PAGE_NOT_MODIFIABLE:   The EPC page cannot be modified because it
  * is in the PENDING or MODIFIED state.
+ * %SGX_INSUFFICIENT_ENTROPY:  Insufficient entropy in RNG.
+ * %SGX_NO_UPDATE: EUPDATESVN was successful, but CPUSVN was not
+ * updated because current SVN was not newer than
+ * CPUSVN.
  * %SGX_UNMASKED_EVENT: An unmasked event, e.g. INTR, was received
  */
 enum sgx_return_code {
@@ -81,6 +86,8 @@ enum sgx_return_code {
SGX_CHILD_PRESENT   = 13,
SGX_INVALID_EINITTOKEN  = 16,
SGX_PAGE_NOT_MODIFIABLE = 20,
+   SGX_INSUFFICIENT_ENTROPY= 29,
+   SGX_NO_UPDATE   = 31,
SGX_UNMASKED_EVENT  = 128,
 };
 
-- 
2.45.2

[PATCH v6 2/5] x86/cpufeatures: Add X86_FEATURE_SGX_EUPDATESVN feature flag

2025-05-22 Thread Elena Reshetova

Add a flag indicating whether the ENCLS[EUPDATESVN] SGX instruction
is supported. It will be used by the SGX driver to perform CPU
SVN updates.

Signed-off-by: Elena Reshetova 
---
 arch/x86/include/asm/cpufeatures.h   | 1 +
 arch/x86/kernel/cpu/cpuid-deps.c | 1 +
 arch/x86/kernel/cpu/scattered.c  | 1 +
 tools/arch/x86/include/asm/cpufeatures.h | 1 +
 4 files changed, 4 insertions(+)

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 5b50e0e35129..ee8f0e30ab6c 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -483,6 +483,7 @@
 #define X86_FEATURE_PREFER_YMM (21*32+ 8) /* Avoid ZMM registers due to downclocking */
 #define X86_FEATURE_APX (21*32+ 9) /* Advanced Performance Extensions */
 #define X86_FEATURE_INDIRECT_THUNK_ITS (21*32+10) /* Use thunk for indirect branches in lower half of cacheline */
+#define X86_FEATURE_SGX_EUPDATESVN (21*32+11) /* Support for ENCLS[EUPDATESVN] instruction */
 
 /*
  * BUG word(s)
diff --git a/arch/x86/kernel/cpu/cpuid-deps.c b/arch/x86/kernel/cpu/cpuid-deps.c
index 46efcbd6afa4..3d9f49ad0efd 100644
--- a/arch/x86/kernel/cpu/cpuid-deps.c
+++ b/arch/x86/kernel/cpu/cpuid-deps.c
@@ -79,6 +79,7 @@ static const struct cpuid_dep cpuid_deps[] = {
{ X86_FEATURE_SGX_LC,   X86_FEATURE_SGX   },
{ X86_FEATURE_SGX1, X86_FEATURE_SGX   },
{ X86_FEATURE_SGX2, X86_FEATURE_SGX1  },
+   { X86_FEATURE_SGX_EUPDATESVN,   X86_FEATURE_SGX1  },
{ X86_FEATURE_SGX_EDECCSSA, X86_FEATURE_SGX1  },
{ X86_FEATURE_XFD,  X86_FEATURE_XSAVES},
{ X86_FEATURE_XFD,  X86_FEATURE_XGETBV1   },
diff --git a/arch/x86/kernel/cpu/scattered.c b/arch/x86/kernel/cpu/scattered.c
index dbf6d71bdf18..2a29fc33a891 100644
--- a/arch/x86/kernel/cpu/scattered.c
+++ b/arch/x86/kernel/cpu/scattered.c
@@ -42,6 +42,7 @@ static const struct cpuid_bit cpuid_bits[] = {
{ X86_FEATURE_PER_THREAD_MBA,   CPUID_ECX,  0, 0x0010, 3 },
{ X86_FEATURE_SGX1, CPUID_EAX,  0, 0x0012, 0 },
{ X86_FEATURE_SGX2, CPUID_EAX,  1, 0x0012, 0 },
+   { X86_FEATURE_SGX_EUPDATESVN,   CPUID_EAX, 10, 0x0012, 0 },
{ X86_FEATURE_SGX_EDECCSSA, CPUID_EAX, 11, 0x0012, 0 },
{ X86_FEATURE_HW_PSTATE,CPUID_EDX,  7, 0x8007, 0 },
{ X86_FEATURE_CPB,  CPUID_EDX,  9, 0x8007, 0 },
diff --git a/tools/arch/x86/include/asm/cpufeatures.h b/tools/arch/x86/include/asm/cpufeatures.h
index bc81b9d1aeca..769ee7e411c3 100644
--- a/tools/arch/x86/include/asm/cpufeatures.h
+++ b/tools/arch/x86/include/asm/cpufeatures.h
@@ -481,6 +481,7 @@
 #define X86_FEATURE_AMD_HTR_CORES  (21*32+ 6) /* Heterogeneous Core Topology */
 #define X86_FEATURE_AMD_WORKLOAD_CLASS (21*32+ 7) /* Workload Classification */
 #define X86_FEATURE_PREFER_YMM (21*32+ 8) /* Avoid ZMM registers due to downclocking */
+#define X86_FEATURE_SGX_EUPDATESVN (21*32+11) /* Support for ENCLS[EUPDATESVN] instruction */
 
 /*
  * BUG word(s)
-- 
2.45.2

[PATCH v6 4/5] x86/sgx: Implement ENCLS[EUPDATESVN]

2025-05-22 Thread Elena Reshetova

All running enclaves and cryptographic assets (such as internal SGX
encryption keys) are assumed to be compromised whenever an SGX-related
microcode update occurs. To mitigate this assumed compromise the new
supervisor SGX instruction ENCLS[EUPDATESVN] can generate fresh
cryptographic assets.

Before executing EUPDATESVN, all SGX memory must be marked as unused.
This requirement ensures that no potentially compromised enclave
survives the update and allows the system to safely regenerate
cryptographic assets.

Add the method to perform ENCLS[EUPDATESVN].

Signed-off-by: Elena Reshetova 
---
 arch/x86/kernel/cpu/sgx/encls.h |  5 +++
 arch/x86/kernel/cpu/sgx/main.c  | 67 +
 2 files changed, 72 insertions(+)

diff --git a/arch/x86/kernel/cpu/sgx/encls.h b/arch/x86/kernel/cpu/sgx/encls.h
index 99004b02e2ed..d9160c89a93d 100644
--- a/arch/x86/kernel/cpu/sgx/encls.h
+++ b/arch/x86/kernel/cpu/sgx/encls.h
@@ -233,4 +233,9 @@ static inline int __eaug(struct sgx_pageinfo *pginfo, void *addr)
return __encls_2(EAUG, pginfo, addr);
 }
 
+/* Attempt to update CPUSVN at runtime. */
+static inline int __eupdatesvn(void)
+{
+   return __encls_ret_1(EUPDATESVN, "");
+}
 #endif /* _X86_ENCLS_H */
diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
index a018b01b8736..109d40c89fe8 100644
--- a/arch/x86/kernel/cpu/sgx/main.c
+++ b/arch/x86/kernel/cpu/sgx/main.c
@@ -16,6 +16,7 @@
 #include 
 #include 
 #include 
+#include 
 #include "driver.h"
 #include "encl.h"
 #include "encls.h"
@@ -920,6 +921,72 @@ EXPORT_SYMBOL_GPL(sgx_set_attribute);
 /* Counter to count the active SGX users */
 static atomic64_t sgx_usage_count;
 
+/**
+ * sgx_updatesvn() - Attempt to call ENCLS[EUPDATESVN].
+ * This instruction attempts to update CPUSVN to the
+ * currently loaded microcode update SVN and generate new
+ * cryptographic assets. Must be called when EPC is empty.
+ * Most of the time, there will be no update and that's OK.
+ * If the failure is due to SGX_INSUFFICIENT_ENTROPY, the
+ * operation can be safely retried. In other failure cases,
+ * the retry should not be attempted.
+ *
+ * Return:
+ * 0: Success or not supported
+ * -EAGAIN: Can be safely retried, failure is due to lack of
+ *  entropy in RNG.
+ * -EIO: Unexpected error, retries are not advisable.
+ */
+static int sgx_update_svn(void)
+{
+   int ret;
+
+   /*
+* If EUPDATESVN is not available, it is ok to
+* silently skip it to comply with legacy behavior.
+*/
+   if (!cpu_feature_enabled(X86_FEATURE_SGX_EUPDATESVN))
+   return 0;
+
+   for (int i = 0; i < RDRAND_RETRY_LOOPS; i++) {
+   ret = __eupdatesvn();
+
+   /* Stop on success or unexpected errors: */
+   if (ret != SGX_INSUFFICIENT_ENTROPY)
+   break;
+   }
+
+   /*
+* SVN was already up-to-date. This is the most
+* common case.
+*/
+   if (ret == SGX_NO_UPDATE)
+   return 0;
+
+   /*
+* SVN update failed due to lack of entropy in DRNG.
+* Indicate to userspace that it should retry.
+*/
+   if (ret == SGX_INSUFFICIENT_ENTROPY)
+   return -EAGAIN;
+
+   if (!ret) {
+   /*
+* SVN successfully updated.
+* Let users know when the update was successful.
+*/
+   pr_info("SVN updated successfully\n");
+   return 0;
+   }
+
+   /*
+* EUPDATESVN was called when EPC is empty, all other error
+* codes are unexpected.
+*/
+   ENCLS_WARN(ret, "EUPDATESVN");
+   return -EIO;
+}
+
 int sgx_inc_usage_count(void)
 {
atomic64_inc(&sgx_usage_count);
-- 
2.45.2

Re: [PATCH] selftests/mm: Fix test result reporting in gup_longterm

2025-05-22 Thread Mark Brown

On Thu, May 22, 2025 at 10:42:50AM +0200, David Hildenbrand wrote:

> Probably, one might be able to revert the logic: instead of running each
> test for each size, run each size for each test: then, the tests are fixed
> and would be covering all available sizes in a single logical test.

Yeah, that should work - it'd lose a bit of resolution in the test
results for automation but I'm not sure how often that's likely to be
relevant and the information would still be there for humans.  

> I agree that that really is a bigger rework. Let me take a look at your
> original patch later (fairly busy today, please poke me if I forget).

Thanks.  I'll take a look at the other similar tests like cow using the
same approach I've used here.


Re: [PATCH] selftests/cpufreq: Fix cpufreq basic read and update testcases

2025-05-22 Thread Viresh Kumar

On 22-05-25, 14:07, Sapkal, Swapnil wrote:
> Initially I tried the same, but it does not work properly with the root user.

Hmm,

Tried chatgpt now and it says this should work:

if ! cat "$1/$file" 2>/dev/null; then
printf "$file is not readable\n"
fi

- This attempts to read the file.
- If it fails, the cat command returns non-zero, and you print a message.
- 2>/dev/null suppresses error messages (Permission denied, etc.)
- This works reliably for both root and non-root users, because it actually 
tests the read action, not just permission bits.

-- 
viresh
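
For comparison, here is a minimal C sketch of the same idea as the shell snippet above: decide readability by attempting the read rather than by inspecting permission bits, which, according to the report in this thread, is not reliable for the root user. The helper name file_is_readable() is hypothetical and is not part of cpufreq.sh or the selftests.

```c
#include <fcntl.h>
#include <stdbool.h>
#include <unistd.h>

/*
 * Hypothetical helper (illustration only): report whether a file can
 * actually be read by trying to read one byte from it, instead of
 * looking at its permission bits.
 */
static bool file_is_readable(const char *path)
{
	char byte;
	ssize_t n;
	int fd;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return false;	/* open() itself was refused */

	n = read(fd, &byte, 1);	/* 0 (empty file) still counts as readable */
	close(fd);

	return n >= 0;
}
```

A caller would print the file contents only when file_is_readable() returns true and report "is not readable" otherwise, mirroring the if/else in the shell version.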

[PATCH bpf-next v2 0/2] bpf, arm64: support up to 12 arguments

2025-05-22 Thread Alexis Lothoré

Hello,

This is v2 of the many-args series for arm64, itself a revival of
Xu Kuohai's work to enable a larger argument count for BPF programs on
ARM64 ([1]).

The discussions in v1 shed some light on issues around specific
cases, for example functions passing structs on the stack with custom
packing/alignment attributes: those cases cannot be properly detected
with the current BTF info. So this new revision aims to separate
concerns with a simpler implementation, accepting additional args
on the stack only if we can be sure about the alignment constraints (and
so refusing attachment to functions passing structs on the stack). I then
checked whether the specific alignment constraints could be checked with
larger scalar types rather than structs, but it appears that this use
case is in fact rejected at the verifier level (see a9b59159d338 ("bpf:
Do not allow btf_ctx_access with __int128 types")). So in the end the
specific alignment corner cases raised in [1] cannot really happen in
the kernel in its current state. This new revision still brings support
for the standard cases as a first step; it will then be possible to
iterate on top of it to add the more specific cases, such as structs
passed on the stack and larger types.

[1] 
https://lore.kernel.org/all/20230917150752.69612-1-xukuo...@huaweicloud.com/#t

Signed-off-by: Alexis Lothoré (eBPF Foundation) 
---
Changes in v2:
- remove alignment computation from btf.c
- deduce alignment constraints directly in jit compiler for simple types
- deny attachment to functions with "corner-cases" arguments (ie:
  structs on stack)
- remove custom tests, as the corresponding use cases are locked either
  by the JIT comp or the verifier
- drop RFC
- Link to v1: 
https://lore.kernel.org/r/20250411-many_args_arm64-v1-0-0a32fe723...@bootlin.com

---
Alexis Lothoré (eBPF Foundation) (1):
  selftests/bpf: enable many-args tests for arm64

Xu Kuohai (1):
  bpf, arm64: Support up to 12 function arguments

 arch/arm64/net/bpf_jit_comp.c| 234 ---
 tools/testing/selftests/bpf/DENYLIST.aarch64 |   2 -
 2 files changed, 180 insertions(+), 56 deletions(-)
---
base-commit: 9435138c069117cd59a4912b5ea2ae44cc2c5ffa
change-id: 20250220-many_args_arm64-8bd3747e6948

Best regards,
-- 
Alexis Lothoré, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com

[PATCH bpf-next v2 2/2] selftests/bpf: enable many-args tests for arm64

2025-05-22 Thread eBPF Foundation

Now that support for up to 12 args is enabled for tracing programs on
ARM64, enable the existing tests for this feature on this architecture.

Signed-off-by: Alexis Lothoré (eBPF Foundation) 
---
Changes in v2:
- keep tracing struct tests disabled, as structs passed on stack are not
  handled by the new revision
---
 tools/testing/selftests/bpf/DENYLIST.aarch64 | 2 --
 1 file changed, 2 deletions(-)

diff --git a/tools/testing/selftests/bpf/DENYLIST.aarch64 b/tools/testing/selftests/bpf/DENYLIST.aarch64
index 6d8feda27ce9de07d77d6e384666082923e3dc76..12e99c0277a8cbf9e63e8f6d3a108c8a1208407b 100644
--- a/tools/testing/selftests/bpf/DENYLIST.aarch64
+++ b/tools/testing/selftests/bpf/DENYLIST.aarch64
@@ -1,3 +1 @@
-fentry_test/fentry_many_args # fentry_many_args:FAIL:fentry_many_args_attach unexpected error: -524
-fexit_test/fexit_many_args   # fexit_many_args:FAIL:fexit_many_args_attach unexpected error: -524
 tracing_struct/struct_many_args  # struct_many_args:FAIL:tracing_struct_many_args__attach unexpected error: -524

-- 
2.49.0

[PATCH v6 5/5] x86/sgx: Enable automatic SVN updates for SGX enclaves

2025-05-22 Thread Elena Reshetova

== Background ==

ENCLS[EUPDATESVN] is a new SGX instruction [1] which allows enclave
attestation to include information about updated microcode SVN without a
reboot. Before an EUPDATESVN operation can be successful, all SGX memory
(aka EPC) must be marked as “unused” in the SGX hardware metadata
(aka EPCM). This requirement ensures that no compromised enclave can
survive the EUPDATESVN procedure and provides an opportunity to generate
new cryptographic assets.

== Patch Contents ==

Attempt to execute ENCLS[EUPDATESVN] every time the first file descriptor
is obtained via sgx_(vepc_)open(). In the most common case the microcode
SVN is already up-to-date, and the operation succeeds without updating SVN.
If it fails with any error code other than SGX_INSUFFICIENT_ENTROPY, this
is considered unexpected and the *open() returns an error. This should not
happen in practice. On the contrary, SGX_INSUFFICIENT_ENTROPY might happen
due to pressure on the system's DRNG (RDSEED), and therefore the *open() can
be safely retried to allow normal enclave operation.

[1] Runtime Microcode Updates with Intel Software Guard Extensions,
https://cdrdv2.intel.com/v1/dl/getContent/648682

Signed-off-by: Elena Reshetova 
---
 arch/x86/kernel/cpu/sgx/main.c | 35 --
 1 file changed, 33 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
index 109d40c89fe8..73ec5ccff3ae 100644
--- a/arch/x86/kernel/cpu/sgx/main.c
+++ b/arch/x86/kernel/cpu/sgx/main.c
@@ -920,6 +920,8 @@ EXPORT_SYMBOL_GPL(sgx_set_attribute);
 
 /* Counter to count the active SGX users */
 static atomic64_t sgx_usage_count;
+/* Mutex to ensure no concurrent EPC accesses during EUPDATESVN */
+static DEFINE_MUTEX(sgx_svn_lock);
 
 /**
  * sgx_updatesvn() - Attempt to call ENCLS[EUPDATESVN].
@@ -989,8 +991,37 @@ static int sgx_update_svn(void)
 
 int sgx_inc_usage_count(void)
 {
-   atomic64_inc(&sgx_usage_count);
-   return 0;
+   int ret;
+
+   /*
+* Increments from non-zero indicate potential other
+* active EPC users and EUPDATESVN is not attempted.
+*/
+   if (atomic64_inc_not_zero(&sgx_usage_count))
+   return 0;
+
+   /*
+* Ensure no other concurrent threads can start
+* touching EPC while EUPDATESVN is running.
+*/
+   guard(mutex)(&sgx_svn_lock);
+
+   if (atomic64_inc_not_zero(&sgx_usage_count))
+   return 0;
+
+   /*
+* Attempt to call EUPDATESVN since EPC must be
+* empty at this point.
+*/
+   ret = sgx_update_svn();
+
+   /*
+* If EUPDATESVN failed, return failure to sgx_(vepc_)open and
+* do not increment the sgx_usage_count.
+*/
+   if (!ret)
+   atomic64_inc(&sgx_usage_count);
+   return ret;
 }
 
 void sgx_dec_usage_count(void)
-- 
2.45.2
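
The counting scheme in sgx_inc_usage_count() above can be modelled in userspace. What follows is only a sketch under stated assumptions, not the kernel code: C11 atomics and a pthread mutex stand in for atomic64_t and guard(mutex), and model_inc_usage_count(), inc_not_zero(), fake_update_svn() and fake_update_result are hypothetical names introduced for illustration. It shows the two properties the patch description relies on: only a 0 -> 1 transition attempts the update, and a failed update leaves the counter at zero so the next open retries.

```c
#include <errno.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_long usage_count;
static pthread_mutex_t svn_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical stand-in for sgx_update_svn(); a caller sets the result. */
static int fake_update_result;

static int fake_update_svn(void)
{
	return fake_update_result;
}

/* Userspace analogue of atomic64_inc_not_zero(): increment unless zero. */
static bool inc_not_zero(atomic_long *v)
{
	long old = atomic_load(v);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(v, &old, old + 1))
			return true;
	}
	return false;
}

static int model_inc_usage_count(void)
{
	int ret = 0;

	/* Fast path: somebody else already holds a reference. */
	if (inc_not_zero(&usage_count))
		return 0;

	pthread_mutex_lock(&svn_lock);
	/* Re-check under the lock: another thread may have won the race. */
	if (!inc_not_zero(&usage_count)) {
		ret = fake_update_svn();
		/* Count the new user only if the update did not fail. */
		if (!ret)
			atomic_fetch_add(&usage_count, 1);
	}
	pthread_mutex_unlock(&svn_lock);

	return ret;
}

static void model_dec_usage_count(void)
{
	atomic_fetch_sub(&usage_count, 1);
}
```

With fake_update_result set to -EAGAIN, a first call returns -EAGAIN and the counter stays at zero; once the stand-in returns 0, the call succeeds and later callers take the fast path without invoking the update again.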

[PATCH][next] selftests/futex: Fix spelling mistake "manaul" -> "manual"

2025-05-22 Thread Colin Ian King

There is a spelling mistake in a ksft_test_result message. Fix it.

Signed-off-by: Colin Ian King 
---
 tools/testing/selftests/futex/functional/futex_priv_hash.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/futex/functional/futex_priv_hash.c b/tools/testing/selftests/futex/functional/futex_priv_hash.c
index 2dca18fefedc..0213eb0bb4af 100644
--- a/tools/testing/selftests/futex/functional/futex_priv_hash.c
+++ b/tools/testing/selftests/futex/functional/futex_priv_hash.c
@@ -242,7 +242,7 @@ int main(int argc, char *argv[])
join_max_threads();
 
ret = futex_hash_slots_get();
-   ksft_test_result(ret == 2, "No more auto-resize after manaul setting, got %d\n",
+   ksft_test_result(ret == 2, "No more auto-resize after manual setting, got %d\n",
 ret);
 
futex_hash_slots_set_must_fail(1 << 29, 0);
-- 
2.49.0

Re: [PATCH net-next v6 3/5] vsock/test: Introduce vsock_wait_sent() helper

2025-05-22 Thread Stefano Garzarella


On Thu, May 22, 2025 at 01:18:23AM +0200, Michal Luczaj wrote:

Distill the virtio_vsock_sock::bytes_unsent checking loop (ioctl SIOCOUTQ)
and move it to utils. Tweak the comment.

Signed-off-by: Michal Luczaj 
---
tools/testing/vsock/util.c   | 25 +
tools/testing/vsock/util.h   |  1 +
tools/testing/vsock/vsock_test.c | 23 ++-
3 files changed, 32 insertions(+), 17 deletions(-)


Reviewed-by: Stefano Garzarella 



diff --git a/tools/testing/vsock/util.c b/tools/testing/vsock/util.c
index de25892f865f07672da0886be8bd1a429ade8b05..4427d459e199f643d415dfc13e071f21a2e4d6ba 100644
--- a/tools/testing/vsock/util.c
+++ b/tools/testing/vsock/util.c
@@ -17,6 +17,7 @@
#include 
#include 
#include 
+#include 

#include "timeout.h"
#include "control.h"
@@ -96,6 +97,30 @@ void vsock_wait_remote_close(int fd)
close(epollfd);
}

+/* Wait until transport reports no data left to be sent.
+ * Return false if transport does not implement the unsent_bytes() callback.
+ */
+bool vsock_wait_sent(int fd)
+{
+   int ret, sock_bytes_unsent;
+
+   timeout_begin(TIMEOUT);
+   do {
+   ret = ioctl(fd, SIOCOUTQ, &sock_bytes_unsent);
+   if (ret < 0) {
+   if (errno == EOPNOTSUPP)
+   break;
+
+   perror("ioctl(SIOCOUTQ)");
+   exit(EXIT_FAILURE);
+   }
+   timeout_check("SIOCOUTQ");
+   } while (sock_bytes_unsent != 0);
+   timeout_end();
+
+   return !ret;
+}
+
/* Create socket , bind to  and return the file descriptor. */
int vsock_bind(unsigned int cid, unsigned int port, int type)
{
diff --git a/tools/testing/vsock/util.h b/tools/testing/vsock/util.h
index d1f765ce3d8f738630846bb47c4f3f6f946f..91f9df12f26a0858777e1a65456f8058544a5f18 100644
--- a/tools/testing/vsock/util.h
+++ b/tools/testing/vsock/util.h
@@ -54,6 +54,7 @@ int vsock_stream_listen(unsigned int cid, unsigned int port);
int vsock_seqpacket_accept(unsigned int cid, unsigned int port,
   struct sockaddr_vm *clientaddrp);
void vsock_wait_remote_close(int fd);
+bool vsock_wait_sent(int fd);
void send_buf(int fd, const void *buf, size_t len, int flags,
  ssize_t expected_ret);
void recv_buf(int fd, void *buf, size_t len, int flags, ssize_t expected_ret);
diff --git a/tools/testing/vsock/vsock_test.c b/tools/testing/vsock/vsock_test.c
index 9ea33b78b9fcb532f4f9616b38b4d2b627b04d31..9d3a77be26f4eb5854629bb1fce08c4ef5485c84 100644
--- a/tools/testing/vsock/vsock_test.c
+++ b/tools/testing/vsock/vsock_test.c
@@ -21,7 +21,6 @@
#include 
#include 
#include 
-#include 
#include 

#include "vsock_test_zerocopy.h"
@@ -1280,7 +1279,7 @@ static void test_unsent_bytes_server(const struct test_opts *opts, int type)
static void test_unsent_bytes_client(const struct test_opts *opts, int type)
{
unsigned char buf[MSG_BUF_IOCTL_LEN];
-   int ret, fd, sock_bytes_unsent;
+   int fd;

fd = vsock_connect(opts->peer_cid, opts->peer_port, type);
if (fd < 0) {
@@ -1297,22 +1296,12 @@ static void test_unsent_bytes_client(const struct test_opts *opts, int type)
/* SIOCOUTQ isn't guaranteed to instantly track sent data. Even though
 * the "RECEIVED" message means that the other side has received the
 * data, there can be a delay in our kernel before updating the "unsent
-* bytes" counter. Repeat SIOCOUTQ until it returns 0.
+* bytes" counter. vsock_wait_sent() will repeat SIOCOUTQ until it
+* returns 0.
 */
-   timeout_begin(TIMEOUT);
-   do {
-   ret = ioctl(fd, SIOCOUTQ, &sock_bytes_unsent);
-   if (ret < 0) {
-   if (errno == EOPNOTSUPP) {
-   fprintf(stderr, "Test skipped, SIOCOUTQ not 
supported.\n");
-   break;
-   }
-   perror("ioctl");
-   exit(EXIT_FAILURE);
-   }
-   timeout_check("SIOCOUTQ");
-   } while (sock_bytes_unsent != 0);
-   timeout_end();
+   if (!vsock_wait_sent(fd))
+   fprintf(stderr, "Test skipped, SIOCOUTQ not supported.\n");
+
close(fd);
}


--
2.49.0

Re: [PATCH net-next v6 4/5] vsock/test: Introduce enable_so_linger() helper

2025-05-22 Thread Stefano Garzarella


On Thu, May 22, 2025 at 01:18:24AM +0200, Michal Luczaj wrote:

Add a helper function that sets SO_LINGER. Adapt the caller.

Signed-off-by: Michal Luczaj 
---
tools/testing/vsock/util.c   | 13 +
tools/testing/vsock/util.h   |  1 +
tools/testing/vsock/vsock_test.c | 10 +-
3 files changed, 15 insertions(+), 9 deletions(-)


Reviewed-by: Stefano Garzarella 



diff --git a/tools/testing/vsock/util.c b/tools/testing/vsock/util.c
index 4427d459e199f643d415dfc13e071f21a2e4d6ba..0c7e9cbcbc85cde9c8764fc3bb623cde2f6c77a6 100644
--- a/tools/testing/vsock/util.c
+++ b/tools/testing/vsock/util.c
@@ -823,3 +823,16 @@ void enable_so_zerocopy_check(int fd)
setsockopt_int_check(fd, SOL_SOCKET, SO_ZEROCOPY, 1,
 "setsockopt SO_ZEROCOPY");
}
+
+void enable_so_linger(int fd, int timeout)
+{
+   struct linger optval = {
+   .l_onoff = 1,
+   .l_linger = timeout
+   };
+
+   if (setsockopt(fd, SOL_SOCKET, SO_LINGER, &optval, sizeof(optval))) {
+   perror("setsockopt(SO_LINGER)");
+   exit(EXIT_FAILURE);
+   }
+}
diff --git a/tools/testing/vsock/util.h b/tools/testing/vsock/util.h
index 91f9df12f26a0858777e1a65456f8058544a5f18..5e2db67072d5053804a9bb93934b625ea78bcd7a 100644
--- a/tools/testing/vsock/util.h
+++ b/tools/testing/vsock/util.h
@@ -80,4 +80,5 @@ void setsockopt_int_check(int fd, int level, int optname, int val,
void setsockopt_timeval_check(int fd, int level, int optname,
  struct timeval val, char const *errmsg);
void enable_so_zerocopy_check(int fd);
+void enable_so_linger(int fd, int timeout);
#endif /* UTIL_H */
diff --git a/tools/testing/vsock/vsock_test.c b/tools/testing/vsock/vsock_test.c
index 9d3a77be26f4eb5854629bb1fce08c4ef5485c84..b3258d6ba21a5f51cf4791514854bb40451399a9 100644
--- a/tools/testing/vsock/vsock_test.c
+++ b/tools/testing/vsock/vsock_test.c
@@ -1813,10 +1813,6 @@ static void test_stream_connect_retry_server(const struct test_opts *opts)

static void test_stream_linger_client(const struct test_opts *opts)
{
-   struct linger optval = {
-   .l_onoff = 1,
-   .l_linger = 1
-   };
int fd;

fd = vsock_stream_connect(opts->peer_cid, opts->peer_port);
@@ -1825,11 +1821,7 @@ static void test_stream_linger_client(const struct test_opts *opts)
exit(EXIT_FAILURE);
}

-   if (setsockopt(fd, SOL_SOCKET, SO_LINGER, &optval, sizeof(optval))) {
-   perror("setsockopt(SO_LINGER)");
-   exit(EXIT_FAILURE);
-   }
-
+   enable_so_linger(fd, 1);
close(fd);
}


--
2.49.0

Re: [PATCH net-next v6 5/5] vsock/test: Add test for an unexpectedly lingering close()

2025-05-22 Thread Stefano Garzarella


On Thu, May 22, 2025 at 01:18:25AM +0200, Michal Luczaj wrote:

There was an issue with SO_LINGER: instead of blocking until all queued
messages for the socket have been successfully sent (or the linger timeout
has been reached), close() would block until packets were handled by the
peer.

Add a test to alert on close() lingering when it should not.

Signed-off-by: Michal Luczaj 
---
tools/testing/vsock/vsock_test.c | 52 
1 file changed, 52 insertions(+)


Reviewed-by: Stefano Garzarella 



diff --git a/tools/testing/vsock/vsock_test.c b/tools/testing/vsock/vsock_test.c
index b3258d6ba21a5f51cf4791514854bb40451399a9..f669baaa0dca3bebc678d00eafa80857d1f0fdd6 100644
--- a/tools/testing/vsock/vsock_test.c
+++ b/tools/testing/vsock/vsock_test.c
@@ -1839,6 +1839,53 @@ static void test_stream_linger_server(const struct test_opts *opts)
close(fd);
}

+/* Half of the default to not risk timing out the control channel */
+#define LINGER_TIMEOUT (TIMEOUT / 2)
+
+static void test_stream_nolinger_client(const struct test_opts *opts)
+{
+   bool waited;
+   time_t ns;
+   int fd;
+
+   fd = vsock_stream_connect(opts->peer_cid, opts->peer_port);
+   if (fd < 0) {
+   perror("connect");
+   exit(EXIT_FAILURE);
+   }
+
+   enable_so_linger(fd, LINGER_TIMEOUT);
+   send_byte(fd, 1, 0); /* Left unread to expose incorrect behaviour. */
+   waited = vsock_wait_sent(fd);
+
+   ns = current_nsec();
+   close(fd);
+   ns = current_nsec() - ns;
+
+   if (!waited) {
+   fprintf(stderr, "Test skipped, SIOCOUTQ not supported.\n");
+   } else if (DIV_ROUND_UP(ns, NSEC_PER_SEC) >= LINGER_TIMEOUT) {
+   fprintf(stderr, "Unexpected lingering\n");
+   exit(EXIT_FAILURE);
+   }
+
+   control_writeln("DONE");
+}
+
+static void test_stream_nolinger_server(const struct test_opts *opts)
+{
+   int fd;
+
+   fd = vsock_stream_accept(VMADDR_CID_ANY, opts->peer_port, NULL);
+   if (fd < 0) {
+   perror("accept");
+   exit(EXIT_FAILURE);
+   }
+
+   control_expectln("DONE");
+   close(fd);
+}
+
static struct test_case test_cases[] = {
{
.name = "SOCK_STREAM connection reset",
@@ -1999,6 +2046,11 @@ static struct test_case test_cases[] = {
.run_client = test_stream_linger_client,
.run_server = test_stream_linger_server,
},
+   {
+   .name = "SOCK_STREAM SO_LINGER close() on unread",
+   .run_client = test_stream_nolinger_client,
+   .run_server = test_stream_nolinger_server,
+   },
{},
};


--
2.49.0
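
The pass/fail decision in test_stream_nolinger_client() above comes down to timing close() and rounding the elapsed time up to whole seconds before comparing it with the linger timeout. A minimal standalone sketch of that arithmetic follows. The names now_nsec() and lingered() are hypothetical, and CLOCK_MONOTONIC is an assumption: the patch uses the test suite's own current_nsec() helper, whose implementation is not shown in this thread.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000LL
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Hypothetical helper: monotonic timestamp in nanoseconds. */
static int64_t now_nsec(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (int64_t)ts.tv_sec * NSEC_PER_SEC + ts.tv_nsec;
}

/*
 * Hypothetical helper: nonzero if the elapsed time, rounded up to whole
 * seconds, reaches the linger timeout (also in seconds).
 */
static int lingered(int64_t elapsed_ns, int timeout_sec)
{
	return DIV_ROUND_UP(elapsed_ns, NSEC_PER_SEC) >= timeout_sec;
}
```

Usage would be: take now_nsec() before and after close(fd), then pass the difference to lingered() together with the timeout. Because of the round-up, anything strictly above (timeout - 1) seconds counts as lingering, which makes the check slightly conservative.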

Re: [PATCH] [next] selftests/ptrace: Fix spelling mistake "multible" -> "multiple"

2025-05-22 Thread Brigham Campbell

On Thu May 1, 2025 at 12:03 AM MDT, Ankit Chauhan wrote:
> Fix the spelling error from "multible" to "multiple".
>
> Signed-off-by: Ankit Chauhan 
> ---
>  tools/testing/selftests/ptrace/peeksiginfo.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/tools/testing/selftests/ptrace/peeksiginfo.c b/tools/testing/selftests/ptrace/peeksiginfo.c
> index a6884f66dc01..2f345d11e4b8 100644
> --- a/tools/testing/selftests/ptrace/peeksiginfo.c
> +++ b/tools/testing/selftests/ptrace/peeksiginfo.c
> @@ -199,7 +199,7 @@ int main(int argc, char *argv[])
>  
>   /*
>* Dump signal from the process-wide queue.
> -  * The number of signals is not multible to the buffer size
> +  * The number of signals is not multiple to the buffer size

Excellent work! This could probably be clarified further by fixing the
grammar a little bit (i.e. "... is not a multiple of ...", assuming that
is actually what is meant).

>*/
>   if (check_direct_path(child, 1, 3))
>   goto out;

Reviewed-by: Brigham Campbell

Re: [PATCH] selftests/cpufreq: Fix cpufreq basic read and update testcases

2025-05-22 Thread Sapkal, Swapnil


Hi Viresh,

On 5/19/2025 1:28 PM, Viresh Kumar wrote:

On 30-04-25, 17:14, Swapnil Sapkal wrote:

In cpufreq basic selftests, one of the testcases is to read all cpufreq
sysfs files and print the values. This testcase assumes all the cpufreq
sysfs files have read permissions. However, certain cpufreq sysfs files
(e.g. stats/reset) are write-only files and this testcase errors out
when it is not able to read the file.
Similarly, there is one more testcase which reads the cpufreq sysfs
file data and writes it back to the same file. This testcase also errors
out for sysfs files without read permission.
Fix these testcases by adding proper read permission checks.

Reported-by: Narasimhan V 
Signed-off-by: Swapnil Sapkal 
---
  tools/testing/selftests/cpufreq/cpufreq.sh | 15 +++
  1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/tools/testing/selftests/cpufreq/cpufreq.sh b/tools/testing/selftests/cpufreq/cpufreq.sh
index e350c521b467..3484fa34e8d8 100755
--- a/tools/testing/selftests/cpufreq/cpufreq.sh
+++ b/tools/testing/selftests/cpufreq/cpufreq.sh
@@ -52,7 +52,14 @@ read_cpufreq_files_in_dir()
for file in $files; do
if [ -f $1/$file ]; then
printf "$file:"
-   cat $1/$file
+   #file is readable ?
+   local rfile=$(ls -l $1/$file | awk '$1 ~ /^.*r.*/ { print $NF; }')
+
+   if [ ! -z $rfile ]; then
+   cat $1/$file
+   else
+   printf "$file is not readable\n"
+   fi


What about:

if [ -r $1/$file ]; then
 cat $1/$file
else
 printf "$file is not readable\n"
fi




Initially I tried the same, but it does not work properly with the root user.

--
Thanks and Regards,
Swapnil

Re: [PATCH net-next 1/3] net: devmem: support single IOV with sendmsg

2025-05-22 Thread Pavel Begunkov


On 5/21/25 18:33, Stanislav Fomichev wrote:

On 05/21, Mina Almasry wrote:

On Tue, May 20, 2025 at 1:30 PM Stanislav Fomichev  wrote:


sendmsg() with a single iov becomes ITER_UBUF, sendmsg() with multiple
iovs becomes ITER_IOVEC. iter_iov_len() does not return the correct
value for UBUF, so teach it to treat UBUF differently.

Cc: Al Viro 
Cc: Pavel Begunkov 
Cc: Mina Almasry 
Fixes: bd61848900bf ("net: devmem: Implement TX path")
Signed-off-by: Stanislav Fomichev 
---
  include/linux/uio.h | 8 +++-
  net/core/datagram.c | 3 ++-
  2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/include/linux/uio.h b/include/linux/uio.h
index 49ece9e1888f..393d0622cc28 100644
--- a/include/linux/uio.h
+++ b/include/linux/uio.h
@@ -99,7 +99,13 @@ static inline const struct iovec *iter_iov(const struct iov_iter *iter)
  }

  #define iter_iov_addr(iter)(iter_iov(iter)->iov_base + (iter)->iov_offset)
-#define iter_iov_len(iter) (iter_iov(iter)->iov_len - (iter)->iov_offset)
+
+static inline size_t iter_iov_len(const struct iov_iter *i)
+{
+   if (i->iter_type == ITER_UBUF)
+   return i->count;
+   return iter_iov(i)->iov_len - i->iov_offset;
+}



This change looks good to me from devmem perspective, but aren't you
potentially breaking all these existing callers to iter_iov_len?

ackc -i iter_iov_len
fs/read_write.c
846:iter_iov_len(iter), ppos);
849:iter_iov_len(iter), ppos);
858:if (nr != iter_iov_len(iter))

mm/madvise.c
1808:   size_t len_in = iter_iov_len(iter);
1838:   iov_iter_advance(iter, iter_iov_len(iter));

io_uring/rw.c
710:len = iter_iov_len(iter);

Or are you confident this change is compatible with these callers for
some reason?
  
Pavel did go over all callers, see:

https://lore.kernel.org/netdev/7f06216e-1e66-433e-a247-2445dac22...@gmail.com/


Yes, the patch should work

Reviewed-by: Pavel Begunkov 




Maybe better to handle this locally in zerocopy_fill_skb_from_devmem,
and then follow up with a more ambitious change that streamlines how
all the iters behave.


Yes, I can definitely do that, but it seems a bit strange that the
callers need to distinguish between IOVEC and UBUF (which is a 1-entry
IOVEC), so having working iter_iov_len seems a bit cleaner.


It might be a good idea to rename it at some point to highlight that
it also works with ubufs (but not as a part of this fix).

--
Pavel Begunkov
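
To make the UBUF versus IOVEC distinction discussed in this thread concrete, here is a simplified userspace model of the patched iter_iov_len(). It is a sketch under assumptions, not the kernel definition: struct model_iov_iter, its field layout and the MODEL_ITER_* constants are hypothetical and mirror only the fields the helper reads.

```c
#include <stddef.h>
#include <sys/uio.h>

/* Hypothetical, heavily simplified stand-in for the kernel's iov_iter. */
enum model_iter_type { MODEL_ITER_UBUF, MODEL_ITER_IOVEC };

struct model_iov_iter {
	enum model_iter_type iter_type;
	size_t iov_offset;	 /* bytes already consumed in the current segment */
	size_t count;		 /* bytes remaining in the whole iterator */
	const struct iovec *iov; /* current segment; unused for UBUF */
};

/*
 * Mirrors the logic of the patch: a UBUF iterator has no iovec segment
 * to consult, so its remaining length comes from 'count'.
 */
static size_t model_iter_iov_len(const struct model_iov_iter *i)
{
	if (i->iter_type == MODEL_ITER_UBUF)
		return i->count;
	return i->iov->iov_len - i->iov_offset;
}
```

For a single 100-byte buffer of which 30 bytes have been consumed, both representations report 70 bytes left in this model, which is the equivalence the "UBUF is a 1-entry IOVEC" remark above appeals to.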

Re: [PATCH net-next v6 0/5] vsock: SOCK_LINGER rework

2025-05-22 Thread Stefano Garzarella


On Thu, May 22, 2025 at 01:18:20AM +0200, Michal Luczaj wrote:

Change vsock's lingering to wait on close() until all data is sent, i.e.
until workers have picked up all the packets for processing.


Thanks for the series and the patience :-)

LGTM! There should be my R-b for all patches.

Thanks,
Stefano



Changes in v6:
- Make vsock_wait_sent() return bool, parametrize enable_so_linger() with
 timeout, don't open code DIV_ROUND_UP [Stefano]
- Link to v5: 
https://lore.kernel.org/r/20250521-vsock-linger-v5-0-94827860d...@rbox.co

Changes in v5:
- Move unsent_bytes fetching logic to utils.c
- Add a helper for enabling SO_LINGER
- Accommodate for close() taking a long time for reasons unrelated to
 lingering
- Separate and redo the testcase [Stefano]
- Enrich the comment [Stefano]
- Link to v4: 
https://lore.kernel.org/r/20250501-vsock-linger-v4-0-beabbd8a0...@rbox.co

Changes in v4:
- While in virtio, stick to virtio_transport_unsent_bytes() [Stefano]
- Squash the indentation reduction [Stefano]
- Pull SOCK_LINGER check into vsock_linger() [Stefano]
- Don't explicitly pass sk->sk_lingertime [Stefano]
- Link to v3: 
https://lore.kernel.org/r/20250430-vsock-linger-v3-0-ddbe73b53...@rbox.co

Changes in v3:
- Set "vsock/virtio" topic where appropriate
- Do not claim that Hyper-V and VMCI ever lingered [Stefano]
- Move lingering to af_vsock core [Stefano]
- Link to v2: 
https://lore.kernel.org/r/20250421-vsock-linger-v2-0-fe9febd64...@rbox.co

Changes in v2:
- Comment that some transports do not implement unsent_bytes [Stefano]
- Reduce the indentation of virtio_transport_wait_close() [Stefano]
- Do not linger on shutdown(), expand the commit messages [Paolo]
- Link to v1: 
https://lore.kernel.org/r/20250407-vsock-linger-v1-0-1458038e3...@rbox.co

Changes in v1:
- Do not assume `unsent_bytes()` is implemented by all transports [Stefano]
- Link to v0: 
https://lore.kernel.org/netdev/df2d51fd-03e7-477f-8aea-938446f47...@rbox.co/

Signed-off-by: Michal Luczaj 
---
Michal Luczaj (5):
 vsock/virtio: Linger on unsent data
 vsock: Move lingering logic to af_vsock core
 vsock/test: Introduce vsock_wait_sent() helper
 vsock/test: Introduce enable_so_linger() helper
 vsock/test: Add test for an unexpectedly lingering close()

include/net/af_vsock.h  |  1 +
net/vmw_vsock/af_vsock.c| 33 +
net/vmw_vsock/virtio_transport_common.c | 21 +
tools/testing/vsock/util.c  | 38 +++
tools/testing/vsock/util.h  |  2 +
tools/testing/vsock/vsock_test.c| 83 +++--
6 files changed, 134 insertions(+), 44 deletions(-)
---
base-commit: f44092606a3f153bb7e6b277006b1f4a5b914cfc
change-id: 20250304-vsock-linger-9026e5f9986c

Best regards,
--
Michal Luczaj

Re: [PATCH] selftests/mm: Fix test result reporting in gup_longterm

2025-05-22 Thread David Hildenbrand


On 21.05.25 20:48, Mark Brown wrote:

On Mon, May 19, 2025 at 03:28:47PM +0200, David Hildenbrand wrote:

On 16.05.25 20:07, Mark Brown wrote:

On Fri, May 16, 2025 at 04:12:08PM +0200, David Hildenbrand wrote:



[Converting to kselftest_harness]

That'd certainly work, though doing that is more surgery on the test
than I personally have the time/enthusiasm for right now.



Same over here.



But probably if we touch it, we should just clean it up right away. Well,
if we decide that that is the right cleanup. (you mention something like that
in your patch description :)



OTOH there's something to be said for just making incremental
improvements in the tests where we can, they tend not to get huge
amounts of love in general which means perfect can very much be the



I would agree if it would be a handful of small changes.



But here we are already at



  1 file changed, 107 insertions(+), 56 deletions(-)


So, I did have a brief poke at this which confirmed my instinct that
blocking a fix for this (and the other similarly structured tests like
cow) seems disproportionate.


Thanks for giving it a try.



The biggest issue is the configuration of fixtures, the harness really
wants the set of test variants to be fixed at compile time (see the
FIXTURE_ macros) but we're covering the dynamically discovered list of
huge page sizes.


Yes.

Probably, one might be able to invert the logic: instead of running each
test for each size, run each size for each test: then, the tests are 
fixed and would be covering all available sizes in a single logical test.
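A rough sketch of that idea, in plain C rather than with the kselftest harness macros: the set of tests stays fixed at compile time, and each test loops over whatever sizes were discovered at run time. All names here (`size_check_fn`, `run_check_for_all_sizes`, `is_power_of_two`) are hypothetical and only illustrate the shape of the loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-size check; returns true when the size passes. */
typedef bool (*size_check_fn)(size_t size);

/*
 * One logical test: passes only if the check passes for every size.
 * The sizes array is filled in at run time, so the number of tests
 * known at compile time does not depend on it.
 */
static bool run_check_for_all_sizes(size_check_fn check,
				    const size_t *sizes, size_t nr_sizes)
{
	bool ok = true;

	assert(check != NULL);

	for (size_t i = 0; i < nr_sizes; i++) {
		if (!check(sizes[i]))
			ok = false; /* keep going so every size is exercised */
	}

	return ok;
}

/* Example check used purely for illustration. */
static bool is_power_of_two(size_t size)
{
	return size != 0 && (size & (size - 1)) == 0;
}
```

The trade-off is that a failure is reported once per test instead of once per (test, size) pair, so the per-size detail would have to go into the log output.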


I agree that that really is a bigger rework. Let me take a look at your 
original patch later (fairly busy today, please poke me if I forget).


--
Cheers,

David / dhildenb

Re: [PATCH 00/19] virtio_ring in order support

2025-05-22 Thread Lei Yang

On Mon, May 19, 2025 at 5:35 AM Michael S. Tsirkin  wrote:
>
> On Wed, Mar 26, 2025 at 07:39:47AM +0100, Eugenio Perez Martin wrote:
> > On Mon, Mar 24, 2025 at 3:44 PM Lei Yang  wrote:
> > >
> > > QE tested this series of patches with virtio-net regression tests,
> > > everything works fine.
> > >
> >
> > Hi Lei,
> >
> > Is it possible to test this series also with virtio-net-pci,...,in_order=on?
> >
> > Thanks!
>
>
> Lei, what do you think?


Sure,  I will test it and provide test results ASAP.

Thanks
Lei
>
>
> > > Tested-by: Lei Yang 
> > >
> > > On Mon, Mar 24, 2025 at 1:45 PM Jason Wang  wrote:
> > > >
> > > > Hello all:
> > > >
> > > > > This series tries to implement VIRTIO_F_IN_ORDER support for
> > > > > virtio_ring. This is done by introducing virtqueue ops so we can
> > > > > implement separate helpers for different virtqueue layouts/features;
> > > > > the in-order support is then implemented on top.
> > > > >
> > > > > Tests show a 5% improvement in RX PPS with KVM guest + testpmd on the
> > > > > host.
> > > >
> > > > Please review.
> > > >
> > > > Thanks
> > > >
> > > > Jason Wang (19):
> > > >   virtio_ring: rename virtqueue_reinit_xxx to virtqueue_reset_xxx()
> > > >   virtio_ring: switch to use vring_virtqueue in virtqueue_poll variants
> > > >   virtio_ring: unify logic of virtqueue_poll() and more_used()
> > > >   virtio_ring: switch to use vring_virtqueue for virtqueue resize
> > > > variants
> > > >   virtio_ring: switch to use vring_virtqueue for virtqueue_kick_prepare
> > > > variants
> > > >   virtio_ring: switch to use vring_virtqueue for virtqueue_add variants
> > > >   virtio: switch to use vring_virtqueue for virtqueue_add variants
> > > >   virtio_ring: switch to use vring_virtqueue for enable_cb_prepare
> > > > variants
> > > >   virtio_ring: use vring_virtqueue for enable_cb_delayed variants
> > > >   virtio_ring: switch to use vring_virtqueue for disable_cb variants
> > > >   virtio_ring: switch to use vring_virtqueue for detach_unused_buf
> > > > variants
> > > >   virtio_ring: use u16 for last_used_idx in virtqueue_poll_split()
> > > >   virtio_ring: introduce virtqueue ops
> > > >   virtio_ring: determine descriptor flags at one time
> > > >   virtio_ring: factor out core logic of buffer detaching
> > > >   virtio_ring: factor out core logic for updating last_used_idx
> > > >   virtio_ring: move next_avail_idx to vring_virtqueue
> > > >   virtio_ring: factor out split indirect detaching logic
> > > >   virtio_ring: add in order support
> > > >
> > > >  drivers/virtio/virtio_ring.c | 856 ++-
> > > >  1 file changed, 653 insertions(+), 203 deletions(-)
> > > >
> > > > --
> > > > 2.42.0
> > > >
> > > >
> > >
>

[PATCH 2/3] modpost: allow "make nsdeps" to skip module-specific symbol namespace

2025-05-22 Thread Masahiro Yamada

When MODULE_IMPORT_NS() is missing, "make nsdeps" runs the Coccinelle
script to automatically add MODULE_IMPORT_NS() to each module.

This should not occur for users of EXPORT_SYMBOL_GPL_FOR_MODULES(), which
is intended to export a symbol to a specific module only. In such cases,
explicitly adding MODULE_IMPORT_NS("module:...") is disallowed.

This commit handles the latter case separately so as not to trigger
the Coccinelle script, and displays the error message:

  ERROR: modpost: module "foo" uses symbol "bar", which is exported only for module "baz"

Apply the same logic for kernel space as well.
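To make the matching rule concrete, here is a userspace sketch of tail-glob matching over a comma-separated pattern list, along the lines of module_match() in the diff below. It is not the patch code: `module_match_sketch` is a hypothetical name, and it uses plain strncmp() and strchr() where the kernel version uses mod_strncmp() and strchrnul().

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Return true if modname matches any pattern in the comma-separated
 * list. A pattern may end in '*' to match any suffix (tail-glob only).
 */
static bool module_match_sketch(const char *modname, const char *patterns)
{
	size_t modlen;

	assert(modname != NULL && patterns != NULL);
	modlen = strlen(modname);

	while (*patterns) {
		const char *sep = strchr(patterns, ',');
		bool glob = false;
		size_t len;

		/* No comma left: the pattern runs to the end of the string. */
		if (!sep)
			sep = patterns + strlen(patterns);
		len = (size_t)(sep - patterns);

		/* A trailing '*' turns the pattern into a prefix match. */
		if (len > 0 && patterns[len - 1] == '*') {
			len--;
			glob = true;
		}

		if (strncmp(patterns, modname, len) == 0 &&
		    (glob || len == modlen))
			return true;

		/* Step over the comma, if there is one. */
		patterns = *sep ? sep + 1 : sep;
	}

	return false;
}
```

For example, a symbol exported for "kvm-*" would be usable by a module named "kvm-intel" but not by one named "foo".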

Fixes: 092a4f5985f2 ("module: Add module specific symbol namespace support")
Signed-off-by: Masahiro Yamada 
---

 kernel/module/main.c  | 37 -
 scripts/mod/modpost.c | 35 ++-
 2 files changed, 38 insertions(+), 34 deletions(-)

diff --git a/kernel/module/main.c b/kernel/module/main.c
index 81035f6552ec..642f790c47e7 100644
--- a/kernel/module/main.c
+++ b/kernel/module/main.c
@@ -65,6 +65,8 @@
 #define CREATE_TRACE_POINTS
 #include 
 
+#define MODULE_NS_PREFIX "module:"
+
 /*
  * Mutex protects:
  * 1) List of modules (also safely readable within RCU read section),
@@ -1108,28 +1110,21 @@ static char *get_modinfo(const struct load_info *info, 
const char *tag)
 }
 
 /**
- * verify_module_namespace() - does @modname have access to this symbol's 
@namespace
- * @namespace: export symbol namespace
+ * module_match() - check if @modname matches @patterns
  * @modname: module name
+ * @patterns: comma separated patterns
  *
- * If @namespace is prefixed with "module:" to indicate it is a module 
namespace
- * then test if @modname matches any of the comma separated patterns.
- *
- * The patterns only support tail-glob.
+ * The @patterns only supports tail-glob.
  */
-static bool verify_module_namespace(const char *namespace, const char *modname)
+static bool module_match(const char *modname, const char *patterns)
 {
size_t len, modlen = strlen(modname);
-   const char *prefix = "module:";
const char *sep;
bool glob;
 
-   if (!strstarts(namespace, prefix))
-   return false;
-
-   for (namespace += strlen(prefix); *namespace; namespace = sep) {
-   sep = strchrnul(namespace, ',');
-   len = sep - namespace;
+   for (; *patterns; patterns = sep) {
+   sep = strchrnul(patterns, ',');
+   len = sep - patterns;
 
glob = false;
if (sep[-1] == '*') {
@@ -1140,7 +1135,7 @@ static bool verify_module_namespace(const char 
*namespace, const char *modname)
if (*sep)
sep++;
 
-   if (mod_strncmp(namespace, modname, len) == 0 && (glob || len 
== modlen))
+   if (mod_strncmp(patterns, modname, len) == 0 && (glob || len == 
modlen))
return true;
}
 
@@ -1157,8 +1152,16 @@ static int verify_namespace_is_imported(const struct 
load_info *info,
namespace = kernel_symbol_namespace(sym);
if (namespace && namespace[0]) {
 
-   if (verify_module_namespace(namespace, mod->name))
+   if (strstarts(namespace, MODULE_NS_PREFIX)) {
+   namespace += strlen(MODULE_NS_PREFIX);
+
+   if (!module_match(mod->name, namespace)) {
+   pr_err("module \"%s\" uses symbol \"%s\", which 
is exported only for module \"%s\"\n",
+  mod->name, kernel_symbol_name(sym), 
namespace);
+   return -EINVAL;
+   }
return 0;
+   }
 
for_each_modinfo_entry(imported_namespace, info, "import_ns") {
if (strcmp(namespace, imported_namespace) == 0)
@@ -1743,7 +1746,7 @@ static int setup_modinfo(struct module *mod, struct 
load_info *info)
 * 'module:' prefixed namespaces are implicit, disallow
 * explicit imports.
 */
-   if (strstarts(imported_namespace, "module:")) {
+   if (strstarts(imported_namespace, MODULE_NS_PREFIX)) {
pr_err("%s: module tries to import module namespace: 
%s\n",
   mod->name, imported_namespace);
return -EPERM;
diff --git a/scripts/mod/modpost.c b/scripts/mod/modpost.c
index 5ca7c268294e..3948a4bc41b3 100644
--- a/scripts/mod/modpost.c
+++ b/scripts/mod/modpost.c
@@ -1690,28 +1690,21 @@ void buf_write(struct buffer *buf, const char *s, int 
len)
 }
 
 /**
- * verify_module_namespace() - does @modname have access to this symbol's 
@namespace
- * @namespace: export symbol namespace
+ * module_match() - check if @modname matches @patterns
  * @modname: module name
+ * @patterns: comma-separated list of module names
  *
- * If @namespace is prefixed with "m

[PATCH bpf-next v2 1/2] bpf, arm64: Support up to 12 function arguments

2025-05-22 Thread Alexis Lothoré

From: Xu Kuohai 

Currently ARM64 bpf trampoline supports up to 8 function arguments.
According to the statistics from commit
473e3150e30a ("bpf, x86: allow function arguments up to 12 for TRACING"),
about 200 functions accept 9 to 12 arguments, so add support
for up to 12 function arguments.

Since bpf only supports function arguments of up to 16 bytes, AAPCS64
passes each argument, starting from the first, in the 1 or 2 lowest
available registers from x0-x7. If there are not enough registers to
hold the entire argument, that argument and all remaining ones are
passed on the stack.
There are some non-trivial cases for which it is not possible to
correctly read arguments from/write arguments to the stack: for example
struct variables may have custom packing/alignment attributes that are
invisible in BTF info. Such cases are denied for now to make sure not to
read incorrect values.
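The register/stack split described above can be sketched in userspace as follows. This mirrors the first loop of calc_arg_aux() in the diff below, but `sketch_arg_split` and `sketch_split_args` are hypothetical names, and stack layout and alignment are deliberately left out.

```c
#include <assert.h>

#define SKETCH_MAX_ARG_REGS 8 /* x0-x7 */

struct sketch_arg_split {
	int args_in_regs;  /* number of leading args passed in registers */
	int regs_for_args; /* number of registers those args occupy */
};

/*
 * Walk the argument sizes (in bytes) in order. Each argument needs one
 * 8-byte slot per started 8 bytes. The first argument that no longer
 * fits in the remaining registers, and every argument after it, is
 * treated as passed on the stack.
 */
static struct sketch_arg_split sketch_split_args(const unsigned char *arg_size,
						 int nr_args)
{
	struct sketch_arg_split s = { 0, 0 };
	int i;

	assert(nr_args >= 0);

	for (i = 0; i < nr_args; i++) {
		int slots = (arg_size[i] + 7) / 8;

		if (s.regs_for_args + slots > SKETCH_MAX_ARG_REGS)
			break;
		s.regs_for_args += slots;
	}
	s.args_in_regs = i;

	return s;
}
```

Note that a 16-byte argument arriving when only x7 is free is not split: it goes to the stack as a whole, and so does everything after it, even an 8-byte argument that would have fit in x7.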

Signed-off-by: Xu Kuohai 
Co-developed-by: Alexis Lothoré (eBPF Foundation) 
Signed-off-by: Alexis Lothoré (eBPF Foundation) 
---
Changes in v2:
- refuse attachment to functions passing structs on stack
- use simpler alignment rules for args passed on stack, assuming that
  exotic types are denied either by the verifier and/or the trampoline
  generation code
---
 arch/arm64/net/bpf_jit_comp.c | 234 --
 1 file changed, 180 insertions(+), 54 deletions(-)

diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
index 
70d7c89d3ac907798e86e0051e7b472c252c1412..8c735bc522e439a4b2e3111fc28b9575a64cdb3a
 100644
--- a/arch/arm64/net/bpf_jit_comp.c
+++ b/arch/arm64/net/bpf_jit_comp.c
@@ -2064,7 +2064,7 @@ bool bpf_jit_supports_subprog_tailcalls(void)
 }
 
 static void invoke_bpf_prog(struct jit_ctx *ctx, struct bpf_tramp_link *l,
-   int args_off, int retval_off, int run_ctx_off,
+   int bargs_off, int retval_off, int run_ctx_off,
bool save_ret)
 {
__le32 *branch;
@@ -2106,7 +2106,7 @@ static void invoke_bpf_prog(struct jit_ctx *ctx, struct 
bpf_tramp_link *l,
branch = ctx->image + ctx->idx;
emit(A64_NOP, ctx);
 
-   emit(A64_ADD_I(1, A64_R(0), A64_SP, args_off), ctx);
+   emit(A64_ADD_I(1, A64_R(0), A64_SP, bargs_off), ctx);
if (!p->jited)
emit_addr_mov_i64(A64_R(1), (const u64)p->insnsi, ctx);
 
@@ -2131,7 +2131,7 @@ static void invoke_bpf_prog(struct jit_ctx *ctx, struct 
bpf_tramp_link *l,
 }
 
 static void invoke_bpf_mod_ret(struct jit_ctx *ctx, struct bpf_tramp_links *tl,
-  int args_off, int retval_off, int run_ctx_off,
+  int bargs_off, int retval_off, int run_ctx_off,
   __le32 **branches)
 {
int i;
@@ -2141,7 +2141,7 @@ static void invoke_bpf_mod_ret(struct jit_ctx *ctx, 
struct bpf_tramp_links *tl,
 */
emit(A64_STR64I(A64_ZR, A64_SP, retval_off), ctx);
for (i = 0; i < tl->nr_links; i++) {
-   invoke_bpf_prog(ctx, tl->links[i], args_off, retval_off,
+   invoke_bpf_prog(ctx, tl->links[i], bargs_off, retval_off,
run_ctx_off, true);
/* if (*(u64 *)(sp + retval_off) !=  0)
 *  goto do_fexit;
@@ -2155,23 +2155,134 @@ static void invoke_bpf_mod_ret(struct jit_ctx *ctx, 
struct bpf_tramp_links *tl,
}
 }
 
-static void save_args(struct jit_ctx *ctx, int args_off, int nregs)
+struct arg_aux {
+   /* how many args are passed through registers, the rest of the args are
+* passed through stack
+*/
+   int args_in_regs;
+   /* how many registers are used to pass arguments */
+   int regs_for_args;
+   /* how much stack is used for additional args passed to bpf program
+* that did not fit in original function registers
+**/
+   int bstack_for_args;
+   /* how much stack is used for additional args passed to the
+* original function when called from trampoline (this one needs
+* arguments to be properly aligned)
+*/
+   int ostack_for_args;
+};
+
+static int calc_arg_aux(const struct btf_func_model *m,
+struct arg_aux *a)
 {
-   int i;
+   int stack_slots, nregs, slots, i;
+
+   /* verifier ensures m->nr_args <= MAX_BPF_FUNC_ARGS */
+   for (i = 0, nregs = 0; i < m->nr_args; i++) {
+   slots = (m->arg_size[i] + 7) / 8;
+   if (nregs + slots <= 8) /* passed through register ? */
+   nregs += slots;
+   else
+   break;
+   }
+
+   a->args_in_regs = i;
+   a->regs_for_args = nregs;
+   a->ostack_for_args = 0;
+
+   /* the rest arguments are passed through stack */
+   for (a->ostack_for_args = 0, a->bstack_for_args = 0;
+i < m->nr_args;

Re: [PATCH v4 4/7] futex: Create set_robust_list2

2025-05-22 Thread kernel test robot

Hi André,

kernel test robot noticed the following build warnings:

[auto build test WARNING on 3ee84e3dd88e39b55b534e17a7b9a181f1d46809]

url:
https://github.com/intel-lab-lkp/linux/commits/Andr-Almeida/selftests-futex-Add-ASSERT_-macros/20250521-045231
base:   3ee84e3dd88e39b55b534e17a7b9a181f1d46809
patch link:
https://lore.kernel.org/r/20250520-tonyk-robust_futex-v4-4-1123093e59de%40igalia.com
patch subject: [PATCH v4 4/7] futex: Create set_robust_list2
config: arm-randconfig-r122-20250522 
(https://download.01.org/0day-ci/archive/20250522/202505221953.jkgfsa3u-...@intel.com/config)
compiler: clang version 17.0.6 (https://github.com/llvm/llvm-project 
6009708b4367171ccdbf4b5905cb6a803753fe18)
reproduce: 
(https://download.01.org/0day-ci/archive/20250522/202505221953.jkgfsa3u-...@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot 
| Closes: 
https://lore.kernel.org/oe-kbuild-all/202505221953.jkgfsa3u-...@intel.com/

sparse warnings: (new ones prefixed by >>)
   kernel/futex/core.c:581:38: sparse: sparse: cast removes address space 
'__user' of expression
   kernel/futex/core.c:581:51: sparse: sparse: incorrect type in initializer 
(different address spaces) @@ expected unsigned int [noderef] [usertype] 
__user *naddr @@ got void * @@
   kernel/futex/core.c:581:51: sparse: expected unsigned int [noderef] 
[usertype] __user *naddr
   kernel/futex/core.c:581:51: sparse: got void *
   kernel/futex/core.c:597:38: sparse: sparse: cast removes address space 
'__user' of expression
   kernel/futex/core.c:597:51: sparse: sparse: incorrect type in initializer 
(different address spaces) @@ expected unsigned int [noderef] [usertype] 
__user *naddr @@ got void * @@
   kernel/futex/core.c:597:51: sparse: expected unsigned int [noderef] 
[usertype] __user *naddr
   kernel/futex/core.c:597:51: sparse: got void *
   kernel/futex/core.c:1268:59: sparse: sparse: cast removes address space 
'__user' of expression
>> kernel/futex/core.c:1268:59: sparse: sparse: incorrect type in argument 3 
>> (different address spaces) @@ expected unsigned int [noderef] [usertype] 
>> __user *head @@ got unsigned int [usertype] * @@
   kernel/futex/core.c:1268:59: sparse: expected unsigned int [noderef] 
[usertype] __user *head
   kernel/futex/core.c:1268:59: sparse: got unsigned int [usertype] *
   kernel/futex/core.c:978:9: sparse: sparse: context imbalance in 
'futex_q_lockptr_lock' - wrong count at exit

vim +1268 kernel/futex/core.c

04e7712f446058 kernel/futex.c  Arnd Bergmann 2018-04-17  1247  
04e7712f446058 kernel/futex.c  Arnd Bergmann 2018-04-17  1248  /*
04e7712f446058 kernel/futex.c  Arnd Bergmann 2018-04-17  1249   * Walk 
curr->robust_list (very carefully, it's a userspace list!)
04e7712f446058 kernel/futex.c  Arnd Bergmann 2018-04-17  1250   * and mark 
any locks found there dead, and notify any waiters.
04e7712f446058 kernel/futex.c  Arnd Bergmann 2018-04-17  1251   *
04e7712f446058 kernel/futex.c  Arnd Bergmann 2018-04-17  1252   * We 
silently return on any sign of list-walking problem.
04e7712f446058 kernel/futex.c  Arnd Bergmann 2018-04-17  1253   */
1c5e99b662506e kernel/futex/core.c André Almeida 2025-05-20  1254  static void 
exit_robust_list32(struct task_struct *curr,
1c5e99b662506e kernel/futex/core.c André Almeida 2025-05-20  1255   
   struct robust_list_head32 __user *head)
04e7712f446058 kernel/futex.c  Arnd Bergmann 2018-04-17  1256  {
04e7712f446058 kernel/futex.c  Arnd Bergmann 2018-04-17  1257   struct 
robust_list __user *entry, *next_entry, *pending;
04e7712f446058 kernel/futex.c  Arnd Bergmann 2018-04-17  1258   
unsigned int limit = ROBUST_LIST_LIMIT, pi, pip;
3f649ab728cda8 kernel/futex.c  Kees Cook 2020-06-03  1259   
unsigned int next_pi;
b9412773325c3a kernel/futex/core.c André Almeida 2025-05-20  1260   u32 
uentry, next_uentry, upending;
b9412773325c3a kernel/futex/core.c André Almeida 2025-05-20  1261   s32 
futex_offset;
04e7712f446058 kernel/futex.c  Arnd Bergmann 2018-04-17  1262   int rc;
04e7712f446058 kernel/futex.c  Arnd Bergmann 2018-04-17  1263  
04e7712f446058 kernel/futex.c  Arnd Bergmann 2018-04-17  1264   /*
04e7712f446058 kernel/futex.c  Arnd Bergmann 2018-04-17  1265* 
Fetch the list head (which was registered earlier, via
04e7712f446058 kernel/futex.c  Arnd Bergmann 2018-04-17  1266* 
sys_set_robust_list()):
04e7712f446058 kernel/futex.c  Arnd Bergmann 2018-04-17  1267*/
b9412773325c3a kernel/futex/core.c André Almeida 2025-05-20 @1268   if 
(fetch_robust_entry32((u32 *)&uentry, &entry, (u32 *)&head->list.next, &pi))
04e7712f446058 kernel/futex

[PATCH net-next v8] selftests/vsock: add initial vmtest.sh for vsock

2025-05-22 Thread Bobby Eshleman

This commit introduces a new vmtest.sh runner for vsock.

It uses virtme-ng/qemu to run tests in a VM. The tests validate G2H,
H2G, and loopback. The testing tools from tools/testing/vsock/ are
reused. Currently, only vsock_test is used.

VMCI and hyperv support is included in the config file to be built with
the -b option, though not used in the tests.

Only tested on x86.

To run:

  $ make -C tools/testing/selftests TARGETS=vsock
  $ tools/testing/selftests/vsock/vmtest.sh

or

  $ make -C tools/testing/selftests TARGETS=vsock run_tests

Example runs (after make -C tools/testing/selftests TARGETS=vsock):

$ ./tools/testing/selftests/vsock/vmtest.sh
1..3
ok 0 vm_server_host_client
ok 1 vm_client_host_server
ok 2 vm_loopback
SUMMARY: PASS=3 SKIP=0 FAIL=0
Log: /tmp/vsock_vmtest_m7DI.log

$ ./tools/testing/selftests/vsock/vmtest.sh vm_loopback
1..1
ok 0 vm_loopback
SUMMARY: PASS=1 SKIP=0 FAIL=0
Log: /tmp/vsock_vmtest_a1IO.log

$ mkdir -p ~/scratch
$ make -C tools/testing/selftests install TARGETS=vsock INSTALL_PATH=~/scratch
 [... omitted ...]
$ cd ~/scratch
$ ./run_kselftest.sh
 TAP version 13
 1..1
 # timeout set to 300
 # selftests: vsock: vmtest.sh
 # 1..3
 # ok 0 vm_server_host_client
 # ok 1 vm_client_host_server
 # ok 2 vm_loopback
 # SUMMARY: PASS=3 SKIP=0 FAIL=0
 # Log: /tmp/vsock_vmtest_svEl.log
 ok 1 selftests: vsock: vmtest.sh

Future work can include vsock_diag_test.

Because vsock requires a VM to test anything other than loopback, this
patch adds vmtest.sh as a kselftest itself. This is different than other
systems that have a "vmtest.sh", where it is used as a utility script to
spin up a VM to run the selftests as a guest (but isn't hooked into
kselftest).

Signed-off-by: Bobby Eshleman 
---
Changes in v8:
- remove NIPA comment from commit msg
- remove tap_* functions and TAP_PREFIX
- add -b for building kernel
- Link to v7: 
https://lore.kernel.org/r/20250515-vsock-vmtest-v7-1-ba6fa86d6...@gmail.com

Changes in v7:
- fix exit code bug when run as kselftest: use cnt_total instead of
  KSFT_NUM_TESTS
- updated commit message with updated output
- updated commit message with commands for installing/running as
  kselftest
- Link to v6: 
https://lore.kernel.org/r/20250515-vsock-vmtest-v6-1-9af1cc023...@gmail.com

Changes in v6:
- add make cmd in commit message in vmtest.sh example (Stefano)
- check nonzero size of QEMU_PIDFILE using -s conditional (Stefano)
- display log file path after tests so it is easier to find amongst other 
random names
- cleanup qemu pidfile if qemu is unable to remove it
- make oops/warning failures more obvious with 'FAIL' prefix in log
  (simply saying 'detected' wasn't clear enough to identify failing
  condition)
- Link to v5: 
https://lore.kernel.org/r/20250513-vsock-vmtest-v5-1-4e75c4a45...@gmail.com

Changes in v5:
- make log file a tmpfile (Paolo)
- make sure both default and user defined QEMU gets handled by the dependency 
check (Paolo)
- increased VM boot up timeout from 1m to 3m for slow hosts (Paolo)
- rename vm_setup -> vm_start (Paolo)
- derive wait_for_listener from selftests/net/net_helper.sh to remove ss usage
- Remove unused 'unset IFS' line (Paolo)
- leave space after variable declarations (Paolo)
- make QEMU_PIDFILE a tmp file (Paolo)
- make everything readonly that is only read (Paolo)
- source ktap_helpers.sh for KSFT_PASS and friends (Paolo)
- don't check for timeout util (Paolo)
- add missing usage string for -q qemu arg
- add tap prefix to SUMMARY line since it isn't part of TAP protocol
- exit with the correct status code based on failure/pass counts
- Link to v4: 
https://lore.kernel.org/r/20250507-vsock-vmtest-v4-1-6e2a97262...@gmail.com

Changes in v4:
- do not use special tab delimiter for help string parsing (Stefano + Paolo)
- fix paths for when installing kselftest and running out-of-tree (Paolo)
- change vng to using running kernel instead of compiled kernel (Paolo)
- use multi-line string for QEMU_OPTS (Stefano)
- change timeout to 300s (Paolo)
- skip if tools are not found and use kselftests status codes (Paolo)
- remove build from vmtest.sh (Paolo)
- change  -> SSH_HOST_PORT (Stefano)
- add tap-format output
- add vmtest.log to gitignore
- check for vsock_test binary and remind user to build it if missing
- create a proper build in makefile
- style fixes
- add ssh, timeout, and pkill to dependency check, just in case
- fix numerical comparison in conditionals
- check qemu pidfile exists before proceeding (avoid wasting time waiting for 
ssh)
- fix tracking of pass/fail bug
- fix stderr redirection bug
- Link to v3: 
https://lore.kernel.org/r/20250428-vsock-vmtest-v3-1-181af6163...@gmail.com

Changes in v3:
- use common conditional syntax for checking variables
- use return value instead of global rc
- fix typo TEST_HOST_LISTENER_PORT -> TEST_HOST_PORT_LISTENER
- use SIGTERM instead of SIGKILL on cleanup
- use peer-cid=1 for loopback
- change sleep delay times into globals
- fix test_vm_loopback logging
- add test selection in a

RE: [PATCH v4 0/2] livepatch, arm64/module: Enable late module relocations.

2025-05-22 Thread Toshiyuki Sato (Fujitsu)

Hi Dylan,

> Late relocations (after the module is initially loaded) are needed when
> livepatches change module code. This is supported by x86, ppc, and s390.
> This series borrows the x86 methodology to reach the same level of support on
> arm64, and moves the text-poke locking into the core livepatch code to reduce
> redundancy.
> 
> Dylan Hatch (2):
>   livepatch, x86/module: Generalize late module relocation locking.
>   arm64/module: Use text-poke API for late relocations.
> 
>  arch/arm64/kernel/module.c | 113
> ++---
>  arch/x86/kernel/module.c   |   8 +--
>  kernel/livepatch/core.c|  18 --
>  3 files changed, 84 insertions(+), 55 deletions(-)
> 
> --
> 2.49.0.1151.ga128411c76-goog

Thanks for posting the new patch.

I ran kpatch's integration tests and no issues were detected.

The livepatch patches [1][2] (Manually adjusting arch/arm64/Kconfig) have been 
applied to the kernel (6.15-rc7).
The kpatch uses the same one as the previous test [3][4].

[1] https://lore.kernel.org/all/2025052000.2237470-1-mark.rutl...@arm.com/
[2] https://lore.kernel.org/all/20250320171559.3423224-3-s...@kernel.org/
[3] 
https://lore.kernel.org/all/ty4pr01mb1377739f1cc08549a619c8635d7...@ty4pr01mb13777.jpnprd01.prod.outlook.com/
[4] https://github.com/dynup/kpatch/pull/1439

Tested-by: Toshiyuki Sato 

Regards,
Toshiyuki Sato

Re: [PATCH v6 4/5] x86/sgx: Implement ENCLS[EUPDATESVN]

2025-05-22 Thread Huang, Kai


>  
> +/**
> + * sgx_updatesvn() - Attempt to call ENCLS[EUPDATESVN].

sgx_updatesvn() -> sgx_update_svn():

arch/x86/kernel/cpu/sgx/main.c:941: warning: expecting prototype for
sgx_updatesvn(). Prototype was for sgx_update_svn() instead


> + * This instruction attempts to update CPUSVN to the
> + * currently loaded microcode update SVN and generate new
> + * cryptographic assets. Must be called when EPC is empty.
> + * Most of the time, there will be no update and that's OK.
> + * If the failure is due to SGX_INSUFFICIENT_ENTROPY, the
> + * operation can be safely retried. In other failure cases,
> + * the retry should not be attempted.
> + *
> + * Return:
> + * 0: Success or not supported
> + * -EAGAIN: Can be safely retried, failure is due to lack of
> + *  entropy in RNG.
> + * -EIO: Unexpected error, retries are not advisable.
> + */
> +static int sgx_update_svn(void)
> +{
> + int ret;
> +
> + /*
> +  * If EUPDATESVN is not available, it is ok to
> +  * silently skip it to comply with legacy behavior.
> +  */
> + if (!cpu_feature_enabled(X86_FEATURE_SGX_EUPDATESVN))
> + return 0;
> +
> + for (int i = 0; i < RDRAND_RETRY_LOOPS; i++) {
> + ret = __eupdatesvn();
> +
> + /* Stop on success or unexpected errors: */
> + if (ret != SGX_INSUFFICIENT_ENTROPY)
> + break;
> + }
> +
> + /*
> +  * SVN was already up-to-date. This is the most
> +  * common case.
> +  */
> + if (ret == SGX_NO_UPDATE)
> + return 0;
> +
> + /*
> +  * SVN update failed due to lack of entropy in DRNG.
> +  * Indicate to userspace that it should retry.
> +  */
> + if (ret == SGX_INSUFFICIENT_ENTROPY)
> + return -EAGAIN;
> +
> + if (!ret) {
> + /*
> +  * SVN successfully updated.
> +  * Let users know when the update was successful.
> +  */
> + pr_info("SVN updated successfully\n");
> + return 0;
> + }
> +
> + /*
> +  * EUPDATESVN was called when EPC is empty, all other error
> +  * codes are unexpected.
> +  */
> + ENCLS_WARN(ret, "EUPDATESVN");
> + return -EIO;
> +}
> +

This patch alone generates below build warning (both w/ and w/o 'W=1'):

khuang2@khuang2-desk:~/work/enabling/src/tip$ make arch/x86/kernel/cpu/sgx/ W=1
  DESCEND objtool
  CALL    scripts/checksyscalls.sh
  INSTALL libsubcmd_headers
  CC  arch/x86/kernel/cpu/sgx/main.o
arch/x86/kernel/cpu/sgx/main.c:940:12: warning: ‘sgx_update_svn’ defined but not
used [-Wunused-function]
  940 | static int sgx_update_svn(void)
  |^~

Regardless of whether this warning is reasonable or not, it is a warning during
build process which may impact bisect.

You can silence it by annotating __maybe_unused attribute to sgx_update_svn() in
this patch, and then remove it in the next one.
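For illustration, the annotation would look roughly like this. The sketch is userspace C: `SKETCH_MAYBE_UNUSED` is a hypothetical stand-in for the kernel's `__maybe_unused` (which expands to the same compiler attribute), and the function body is a stub, not the real sgx_update_svn().

```c
/* Userspace stand-in for the kernel's __maybe_unused. */
#define SKETCH_MAYBE_UNUSED __attribute__((__unused__))

/*
 * A static function with no callers normally triggers
 * -Wunused-function. The attribute tells the compiler that this is
 * intentional, so the intermediate patch builds without the warning.
 */
static int SKETCH_MAYBE_UNUSED sketch_update_svn(void)
{
	return 0; /* stub: the real function calls ENCLS[EUPDATESVN] */
}
```

The annotation would then be dropped in the follow-up patch that adds the first caller.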

But I am not sure whether it is necessary, though.  We can merge the last two
patches together.  The ending patch won't be too big to review IMHO.

We can even merge patch 3 in as well.  The reason is that the current changelog
of that patch doesn't explain why we only define those two error codes (or return
values) but not others, which makes that patch *ALONE* un-reviewable without
looking at further patches.  That being said, it's fine with me if we keep patch 3
separate, but it would be better to clarify this in the changelog.

But just my 2 cents.  Since Dave/Ingo/Jarkko are all on this thread, I'll leave
this to them.

Re: [PATCH v6 2/5] x86/cpufeatures: Add X86_FEATURE_SGX_EUPDATESVN feature flag

2025-05-22 Thread Huang, Kai

On Thu, 2025-05-22 at 12:21 +0300, Elena Reshetova wrote:
> --- a/tools/arch/x86/include/asm/cpufeatures.h
> +++ b/tools/arch/x86/include/asm/cpufeatures.h
> @@ -481,6 +481,7 @@
>  #define X86_FEATURE_AMD_HTR_CORES(21*32+ 6) /* Heterogeneous Core 
> Topology */
>  #define X86_FEATURE_AMD_WORKLOAD_CLASS   (21*32+ 7) /* Workload 
> Classification */
>  #define X86_FEATURE_PREFER_YMM   (21*32+ 8) /* Avoid ZMM 
> registers due to downclocking */
> +#define X86_FEATURE_SGX_EUPDATESVN   (21*32+11) /* Support for 
> ENCLS[EUPDATESVN] instruction */

[Sorry for not mentioning in the previous version.]

Nit:

I am not sure we need to change tool headers.

Per commit

  f6d9883f8e68 ("tools/include: Sync x86 headers with the kernel sources")

.. and tools/include/uapi/README:

  ...

  What we are doing now is a third option:

   - A software-enforced copy-on-write mechanism of kernel headers to
 tooling, driven by non-fatal warnings on the tooling side build when
 kernel headers get modified:

  Warning: Kernel ABI header differences:
diff -u tools/include/uapi/drm/i915_drm.h include/uapi/drm/i915_drm.h
diff -u tools/include/uapi/linux/fs.h include/uapi/linux/fs.h
diff -u tools/include/uapi/linux/kvm.h include/uapi/linux/kvm.h
...

 The tooling policy is to always pick up the kernel side headers as-is,
 and integrate them into the tooling build. The warnings above serve as a
 notification to tooling maintainers that there's changes on the kernel
 side.

  We've been using this for many years now, and it might seem hacky, but
  works surprisingly well.

.. I take it that updating the tools headers is not mandatory (unless building
tools fails w/o the new feature bit definition, which I believe isn't the case for
SGX_EUPDATESVN).  The tools maintainers will eventually do the sync.

But on the other hand, modifying the tools headers in this patch also reduces
the tools maintainers' effort in the future.

That being said, I am unclear about the rule here.  Perhaps Dave/Ingo can help
to clarify.

Re: vmlinux BTF as a module (was Re: [PATCH bpf-next v4 0/3] Allow mmap of /sys/kernel/btf/vmlinux)

2025-05-22 Thread Alexei Starovoitov

On Wed, May 21, 2025 at 8:00 AM Alan Maguire  wrote:
>
> > Hi Alan,
> >
> > Thanks for taking a look at this. I've been following your related effort
> > to allow /sys/kernel/btf/vmlinux as a module in support of small systems
> > with kernel-size constraints, and wondered how this series might affect
> > that work? Such support would be well-received in the embedded space when
> > it happens, so am keen to understand.
> >
> > Thanks,
> > Tony
>
> hi Tony
>
> I had something nearly working a few months back but there are a bunch
> of complications that made it a bit trickier than I'd first anticipated.
> One challenge for example is that we want /sys/kernel/btf to behave just
> as it would if vmlinux BTF was not a module. My original hope was to
> just have the vmlinux BTF module forceload early, but the request module
> approach won't work since the vmlinux_btf.ko module would have to be
> part of the initrd image. A question for you on this - I presume that's
> what you want to avoid, right? So I'm assuming that we need to extract
> the .BTF section out of the vmlinu[xz] binary and out of initrd into a
> later-loading vmlinux_btf.ko module for small-footprint systems. Is that
> correct?
>
> The reason I ask is having a later-loading vmlinux_btf.ko is a bit of a
> pain since we need to walk the set of kernel modules and load their BTF,
> relocate it and do kfunc registration. If we can simplify things via a
> shared module dependency on vmlinux_btf.ko that would be great, but I'd
> like to better understand the constraints from the small system
> perspective first. Thanks!

We cannot require other modules to depend on vmlinux_btf.ko.
Some of them might load during the boot. So adding to the dependency
will defeat the point of vmlinux_btf.ko.
The only option I see is to let modules load and ignore their BTFs
if vmlinux BTF is not present.
Later vmlinux_btf.ko can be loaded and modules loaded after that
time will succeed in loading their BTFs too.
So some modules will have their BTF and some don't.
I don't think it's an issue.

If an admin loads a module with kfuncs and vmlinux_btf.ko is not loaded yet,
the kfunc registration will fail, of course. It's an issue,
but I don't think we need to fix it right now by messing with depmod.

The bigger issue is how to split vmlinux_btf.ko itself.
The kernel has a bunch of kfuncs and they need BTF ids for protos
and for all types they reference, so vmlinux BTF cannot be empty.
minimize_btf() can probably help.
So before we proceed with vmlinux_btf.ko we need to see data on
how big the mandatory part of vmlinux BTF will be vs
the rest of BTF in vmlinux_btf.ko.
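
To make the load ordering above concrete, here is a minimal userspace sketch.
It is a toy model, not kernel code: struct toy_kernel, struct toy_module and
toy_load_module() are made-up names, and the only thing modelled is the policy
described in this mail (module BTF is ignored while vmlinux BTF is absent, and
kfunc registration needs vmlinux BTF).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names exist in the kernel. */
struct toy_kernel {
    bool vmlinux_btf_loaded;    /* has vmlinux_btf.ko been loaded yet? */
};

struct toy_module {
    const char *name;
    bool has_kfuncs;            /* module wants to register kfuncs */
    bool btf_loaded;            /* outcome: was the module's BTF kept? */
    bool kfuncs_registered;     /* outcome: did kfunc registration work? */
};

/*
 * Apply the policy from the mail:
 *  - vmlinux BTF present: the module's BTF is loaded as usual;
 *  - vmlinux BTF absent:  the module still loads, its BTF is ignored,
 *    and any kfunc registration fails because there are no BTF ids.
 */
static void toy_load_module(const struct toy_kernel *k, struct toy_module *m)
{
    m->btf_loaded = k->vmlinux_btf_loaded;
    m->kfuncs_registered = m->has_kfuncs && k->vmlinux_btf_loaded;
}
```

Loading one module before and one after flipping vmlinux_btf_loaded to true
reproduces the "some modules will have their BTF and some don't" outcome.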

[PATCH v2] selftests: ir_decoder: Convert header comment to proper multi-line block

2025-05-22 Thread Abdelrahman Fekry

v2- fixed multiple trailing whitespace errors and
the Signed-off-by mismatch

The test file for the IR decoder used single-line comments
at the top to document its purpose and licensing,
which is inconsistent with the style used throughout the
Linux kernel.

In this patch I converted the file header to
a proper multi-line comment block
(/*) that aligns with standard kernel practices.
This improves readability, consistency across selftests,
and ensures the license and documentation are
clearly visible in a familiar format.

No functional changes have been made.

Signed-off-by: Abdelrahman Fekry 
---
 tools/testing/selftests/ir/ir_loopback.c | 23 +--
 1 file changed, 13 insertions(+), 10 deletions(-)

diff --git a/tools/testing/selftests/ir/ir_loopback.c 
b/tools/testing/selftests/ir/ir_loopback.c
index f4a15cbdd5ea..c94faa975630 100644
--- a/tools/testing/selftests/ir/ir_loopback.c
+++ b/tools/testing/selftests/ir/ir_loopback.c
@@ -1,14 +1,17 @@
 // SPDX-License-Identifier: GPL-2.0
-// test ir decoder
-//
-// Copyright (C) 2018 Sean Young 
-
-// When sending LIRC_MODE_SCANCODE, the IR will be encoded. rc-loopback
-// will send this IR to the receiver side, where we try to read the decoded
-// IR. Decoding happens in a separate kernel thread, so we will need to
-// wait until that is scheduled, hence we use poll to check for read
-// readiness.
-
+/* Copyright (C) 2018 Sean Young 
+ *
+ * Selftest for IR decoder
+ *
+ *
+ * When sending LIRC_MODE_SCANCODE, the IR will be encoded.
+ * rc-loopback will send this IR to the receiver side,
+ * where we try to read the decoded IR.
+ * Decoding happens in a separate kernel thread,
+ * so we will need to wait until that is scheduled,
+ * hence we use poll to check for read
+ * readiness.
+ */
 #include 
 #include 
 #include 
-- 
2.25.1

Re: [PATCH 1/1] Fix typo in cpu-on-off-test selftest script:

2025-05-22 Thread Shuah Khan


On 5/16/25 19:19, Jihed Chaibi wrote:

Fix typo in hotplaggable_offline_cpus function name:

"hotplaggable" is replaced by "hotpluggable"

Signed-off-by: Jihed Chaibi 
---


Change looks good to me. Change log should specify the
subsystem. Check the submitting patches document and refer
to a few change logs for this file using git log.

Send v2 with a proper change log.


  tools/testing/selftests/cpu-hotplug/cpu-on-off-test.sh | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/cpu-hotplug/cpu-on-off-test.sh 
b/tools/testing/selftests/cpu-hotplug/cpu-on-off-test.sh
index d5dc7e0dc..6232a46ca 100755
--- a/tools/testing/selftests/cpu-hotplug/cpu-on-off-test.sh
+++ b/tools/testing/selftests/cpu-hotplug/cpu-on-off-test.sh
@@ -67,7 +67,7 @@ hotpluggable_cpus()
done
  }
  
-hotplaggable_offline_cpus()

+hotpluggable_offline_cpus()
  {
hotpluggable_cpus 0
  }
@@ -151,7 +151,7 @@ offline_cpu_expect_fail()
  
  online_all_hot_pluggable_cpus()

  {
-   for cpu in `hotplaggable_offline_cpus`; do
+   for cpu in `hotpluggable_offline_cpus`; do
online_cpu_expect_success $cpu
done
  }


thanks,
-- Shuah

Re: [PATCH v2] selftests: Add functional test for the abort file in fusectl

2025-05-22 Thread Shuah Khan


On 5/16/25 19:23, Chen Linxuan wrote:

This patch adds a simple functional test for the "abort" file
in fusectlfs (/sys/fs/fuse/connections/ID/abort).

A simple fuse daemon is added for testing.

Related discussion can be found in the link below.

Link: 
https://lore.kernel.org/all/CAOQ4uxjKFXOKQxPpxtS6G_nR0tpw95w0GiO68UcWg_OBhmSY=q...@mail.gmail.com/
Cc: Amir Goldstein 
Signed-off-by: Chen Linxuan 
---
Changes in v2:
- Apply changes suggested by Amir Goldstein
   - Check errno
- Link to v1: 
https://lore.kernel.org/all/20250515073449.346774-2-chenlinx...@uniontech.com/


Short summary should include the test name:

selftests: filesystems: Add functional test for the abort file in fusectl

Also, if this test requires root privilege, add a check for it. The rest
looks good to me.

Acked-by: Shuah Khan 

thanks,
-- Shuah
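
For reference, a root check in a selftest usually boils down to looking at the
effective UID and exiting with the kselftest skip code. The sketch below is
self-contained and does not include kselftest.h; sketch_root_check() and the
SKETCH_* macros are hypothetical names, and 4 mirrors the value of KSFT_SKIP.
A real test would more likely call ksft_exit_skip() directly.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>

#define SKETCH_KSFT_PASS 0
#define SKETCH_KSFT_SKIP 4  /* same value as KSFT_SKIP in kselftest.h */

/* Return the exit code a root-only test should use for this euid. */
static int sketch_root_check(uid_t euid)
{
    if (euid != 0) {
        fprintf(stderr, "SKIP: this test must be run as root\n");
        return SKETCH_KSFT_SKIP;
    }
    return SKETCH_KSFT_PASS;
}
```

A test's main() would call sketch_root_check(geteuid()) first and exit with the
returned code when it is non-zero, before touching anything that needs root.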

Re: [PATCH] selftests: net: fix spelling and grammar mistakes

2025-05-22 Thread Shuah Khan


On 5/16/25 19:59, Praveen Balakrishnan wrote:

Fix several spelling and grammatical mistakes in output messages from
the net selftests to improve readability.

Only the message strings for the test output have been modified. No
changes to the functional logic of the tests have been made.

Signed-off-by: Praveen Balakrishnan 


This patch is missing net maintainers. Run get_maintainer.pl for the
complete list of recipients for this patch.

thanks,
-- Shuah

Re: [PATCH v2] selftests: Improve test output grammar, code style

2025-05-22 Thread Shuah Khan


On 5/16/25 02:42, Hanne-Lotta Mäenpää wrote:

Add small grammar fixes in perf events and Real Time Clock tests'
output messages.

Include braces around a single if statement, when there are multiple
statements in the else branch, to align with the kernel coding style.


This patch combines several changes in one including combining changes
to two tests.



Signed-off-by: Hanne-Lotta Mäenpää 
---

Notes:
 v1 -> v2: Improved wording in RTC tests based on feedback from
 Alexandre Belloni 

  tools/testing/selftests/perf_events/watermark_signal.c |  7 ---
  tools/testing/selftests/rtc/rtctest.c  | 10 +-
  2 files changed, 9 insertions(+), 8 deletions(-)



Send separate patches for selftests/perf_events and selftests/rtc/rtctest.c



diff --git a/tools/testing/selftests/perf_events/watermark_signal.c 
b/tools/testing/selftests/perf_events/watermark_signal.c
index 49dc1e831174..6176afd4950b 100644
--- a/tools/testing/selftests/perf_events/watermark_signal.c
+++ b/tools/testing/selftests/perf_events/watermark_signal.c
@@ -65,8 +65,9 @@ TEST(watermark_signal)
  
  	child = fork();

EXPECT_GE(child, 0);
-   if (child == 0)
+   if (child == 0) {
do_child();
+   }
else if (child < 0) {
perror("fork()");
goto cleanup;
@@ -75,7 +76,7 @@ TEST(watermark_signal)
if (waitpid(child, &child_status, WSTOPPED) != child ||
!(WIFSTOPPED(child_status) && WSTOPSIG(child_status) == SIGSTOP)) {
fprintf(stderr,
-   "failed to sycnhronize with child errno=%d status=%x\n",
+   "failed to synchronize with child errno=%d status=%x\n",


This change is good.


errno,
child_status);
goto cleanup;
@@ -84,7 +85,7 @@ TEST(watermark_signal)
fd = syscall(__NR_perf_event_open, &attr, child, -1, -1,
 PERF_FLAG_FD_CLOEXEC);
if (fd < 0) {
-   fprintf(stderr, "failed opening event %llx\n", attr.config);
+   fprintf(stderr, "failed to setup performance monitoring 
%llx\n", attr.config);


This change makes it hard to understand what went wrong, unlike the original
message.


goto cleanup;
}
  
diff --git a/tools/testing/selftests/rtc/rtctest.c b/tools/testing/selftests/rtc/rtctest.c

index be175c0e6ae3..930bf0ce4fa6 100644
--- a/tools/testing/selftests/rtc/rtctest.c
+++ b/tools/testing/selftests/rtc/rtctest.c
@@ -138,10 +138,10 @@ TEST_F_TIMEOUT(rtc, date_read_loop, 
READ_LOOP_DURATION_SEC + 2) {
rtc_read = rtc_time_to_timestamp(&rtc_tm);
/* Time should not go backwards */
ASSERT_LE(prev_rtc_read, rtc_read);
-   /* Time should not increase more then 1s at a time */
+   /* Time should not increase more than 1s per read */
ASSERT_GE(prev_rtc_read + 1, rtc_read);
  
-		/* Sleep 11ms to avoid killing / overheating the RTC */

+   /* Sleep 11ms to avoid overheating the RTC */


This change removes important information. What is the reason for this
change?


nanosleep_with_retries(READ_LOOP_SLEEP_MS * 100);
  
  		prev_rtc_read = rtc_read;

@@ -236,7 +236,7 @@ TEST_F(rtc, alarm_alm_set) {
if (alarm_state == RTC_ALARM_DISABLED)
SKIP(return, "Skipping test since alarms are not supported.");
if (alarm_state == RTC_ALARM_RES_MINUTE)
-   SKIP(return, "Skipping test since alarms has only minute 
granularity.");
+   SKIP(return, "Skipping test since alarm has only minute 
granularity.");
  
  	rc = ioctl(self->fd, RTC_RD_TIME, &tm);

ASSERT_NE(-1, rc);
@@ -306,7 +306,7 @@ TEST_F(rtc, alarm_wkalm_set) {
if (alarm_state == RTC_ALARM_DISABLED)
SKIP(return, "Skipping test since alarms are not supported.");


This one still says "alarms"


if (alarm_state == RTC_ALARM_RES_MINUTE)
-   SKIP(return, "Skipping test since alarms has only minute 
granularity.");
+   SKIP(return, "Skipping test since alarm has only minute 
granularity.");


Isn't "alarms" consistent with other messages?

  
  	rc = ioctl(self->fd, RTC_RD_TIME, &alarm.time);

ASSERT_NE(-1, rc);
@@ -502,7 +502,7 @@ int main(int argc, char **argv)
if (access(rtc_file, R_OK) == 0)
ret = test_harness_run(argc, argv);
else
-   ksft_exit_skip("[SKIP]: Cannot access rtc file %s - Exiting\n",
+   ksft_exit_skip("Cannot access RTC file %s - exiting\n",
rtc_file);


I don't see any reason for this change either.

  
  	return ret;


thanks,
-- Shuah

Re: [PATCH] [PATCH] Change pidns to pid namespace

2025-05-22 Thread Shuah Khan


On 5/16/25 10:49, rodgeprit...@gmail.com wrote:

From: Pritesh Rodge 

Changed a comment in memfd_test.c: unabbreviated pidns to pid namespace
for better understanding.

Signed-off-by: Pritesh Rodge 
---
  tools/testing/selftests/memfd/memfd_test.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/memfd/memfd_test.c 
b/tools/testing/selftests/memfd/memfd_test.c
index 5b993924cc3f..4e4c46246a4e 100644
--- a/tools/testing/selftests/memfd/memfd_test.c
+++ b/tools/testing/selftests/memfd/memfd_test.c
@@ -1359,7 +1359,7 @@ static int sysctl_nested_child(void *arg)
  
  	printf("%s nested sysctl 0\n", memfd_str);

sysctl_assert_write("0");
-   /* A further nested pidns works the same. */
+   /* A further nested pid-namespace works the same. */
pid = spawn_thread(CLONE_NEWPID, sysctl_simple_child, NULL);
join_thread(pid);
  


Please run get_maintainer.pl to find the complete list of recipients
for this patch.

thanks,
-- Shuah

[PATCH net-next] vsock/test: Cover more CIDs in transport_uaf test

2025-05-22 Thread Michal Luczaj

Increase the coverage of the test for UAF due to socket unbinding, and losing
transport in general. It's a follow-up to commit 301a62dfb0d0 ("vsock/test:
Add test for UAF due to socket unbinding") and discussion in [1].

The idea remains the same: take an unconnected stream socket with a
transport assigned and then attempt to switch the transport by trying (and
failing) to connect to some other CID. Now do this iterating over all the
well known CIDs (plus one).
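
As a rough illustration of the per-CID iteration, the sketch below only does
the first step: try to bind an unconnected stream socket to each candidate CID
and skip the ones that are not available. The helper names are hypothetical,
the exact CID range is an assumption (hypervisor, local, host, and host + 1 as
the "plus one"), and the autobind-pool draining and the failing connect() of
the real test are deliberately left out.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>
#include <linux/vm_sockets.h>

#ifndef VMADDR_CID_LOCAL
#define VMADDR_CID_LOCAL 1  /* older uapi headers lack this name */
#endif

/*
 * Try to bind an unconnected stream socket to (cid, any port).
 * Returns 1 if the bind worked, 0 if this CID is not usable here
 * (or AF_VSOCK is unavailable), -1 on an unexpected bind() error.
 */
static int sketch_probe_cid(unsigned int cid)
{
    struct sockaddr_vm addr;
    int fd, ret;

    memset(&addr, 0, sizeof(addr));
    addr.svm_family = AF_VSOCK;
    addr.svm_cid = cid;
    addr.svm_port = VMADDR_PORT_ANY;

    fd = socket(AF_VSOCK, SOCK_STREAM, 0);
    if (fd < 0)
        return 0;   /* no AF_VSOCK support: nothing to probe */

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) == 0)
        ret = 1;
    else
        ret = (errno == EADDRNOTAVAIL) ? 0 : -1;

    close(fd);
    return ret;
}

/* Walk the well-known CIDs plus one more, counting the bindable ones. */
static int sketch_count_bindable_cids(void)
{
    const unsigned int cids[] = {
        VMADDR_CID_HYPERVISOR, VMADDR_CID_LOCAL,
        VMADDR_CID_HOST, VMADDR_CID_HOST + 1,
    };
    unsigned int i;
    int n = 0;

    for (i = 0; i < sizeof(cids) / sizeof(cids[0]); i++)
        if (sketch_probe_cid(cids[i]) == 1)
            n++;
    return n;
}
```

Which CIDs are bindable depends on the transports loaded, so the count differs
between a guest, a host, and a machine with only the loopback transport.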

Note that having only a virtio transport loaded (without vhost_vsock) is
unsupported; test will always pass. Depending on transports available, a
variety of splats are possible on unpatched machines. After reverting
commit fcdd2242c023 ("vsock: Keep the binding until socket destruction"):

BUG: KASAN: slab-use-after-free in __vsock_bind+0x61f/0x720
Read of size 4 at addr 88811ff46b54 by task vsock_test/1475
Call Trace:
 dump_stack_lvl+0x68/0x90
 print_report+0x170/0x53d
 kasan_report+0xc2/0x180
 __vsock_bind+0x61f/0x720
 vsock_connect+0x727/0xc40
 __sys_connect+0xe8/0x100
 __x64_sys_connect+0x6e/0xc0
 do_syscall_64+0x92/0x1c0
 entry_SYSCALL_64_after_hwframe+0x4b/0x53

WARNING: CPU: 0 PID: 1475 at net/vmw_vsock/virtio_transport_common.c:37 virtio_transport_send_pkt_info+0xb2b/0x1160
Call Trace:
 virtio_transport_connect+0x90/0xb0
 vsock_connect+0x782/0xc40
 __sys_connect+0xe8/0x100
 __x64_sys_connect+0x6e/0xc0
 do_syscall_64+0x92/0x1c0
 entry_SYSCALL_64_after_hwframe+0x4b/0x53

KASAN: null-ptr-deref in range [0x0010-0x0017]
RIP: 0010:sock_has_perm+0xa7/0x2a0
Call Trace:
 selinux_socket_connect_helper.isra.0+0xbc/0x450
 selinux_socket_connect+0x3b/0x70
 security_socket_connect+0x31/0xd0
 __sys_connect_file+0x79/0x1f0
 __sys_connect+0xe8/0x100
 __x64_sys_connect+0x6e/0xc0
 do_syscall_64+0x92/0x1c0
 entry_SYSCALL_64_after_hwframe+0x4b/0x53

refcount_t: addition on 0; use-after-free.
WARNING: CPU: 7 PID: 1518 at lib/refcount.c:25 refcount_warn_saturate+0xdd/0x140
RIP: 0010:refcount_warn_saturate+0xdd/0x140
Call Trace:
 __vsock_bind+0x65e/0x720
 vsock_connect+0x727/0xc40
 __sys_connect+0xe8/0x100
 __x64_sys_connect+0x6e/0xc0
 do_syscall_64+0x92/0x1c0
 entry_SYSCALL_64_after_hwframe+0x4b/0x53

refcount_t: underflow; use-after-free.
WARNING: CPU: 0 PID: 1475 at lib/refcount.c:28 refcount_warn_saturate+0x12b/0x140
RIP: 0010:refcount_warn_saturate+0x12b/0x140
Call Trace:
 vsock_remove_bound+0x18f/0x280
 __vsock_release+0x371/0x480
 vsock_release+0x88/0x120
 __sock_release+0xaa/0x260
 sock_close+0x14/0x20
 __fput+0x35a/0xaa0
 task_work_run+0xff/0x1c0
 do_exit+0x849/0x24c0
 make_task_dead+0xf3/0x110
 rewind_stack_and_make_dead+0x16/0x20

[1]: 
https://lore.kernel.org/netdev/CAGxU2F5zhfWymY8u0hrKksW8PumXAYz-9_qRmW==92oax1b...@mail.gmail.com/

Suggested-by: Stefano Garzarella 
Signed-off-by: Michal Luczaj 
---
 tools/testing/vsock/vsock_test.c | 72 +++-
 1 file changed, 57 insertions(+), 15 deletions(-)

diff --git a/tools/testing/vsock/vsock_test.c b/tools/testing/vsock/vsock_test.c
index 
9ea33b78b9fcb532f4f9616b38b4d2b627b04d31..460a8838e5e6a0f155e66e7720358208bab9520f
 100644
--- a/tools/testing/vsock/vsock_test.c
+++ b/tools/testing/vsock/vsock_test.c
@@ -1729,16 +1729,32 @@ static void 
test_stream_msgzcopy_leak_zcskb_server(const struct test_opts *opts)
 
 #define MAX_PORT_RETRIES   24  /* net/vmw_vsock/af_vsock.c */
 
-/* Test attempts to trigger a transport release for an unbound socket. This can
- * lead to a reference count mishandling.
- */
-static void test_stream_transport_uaf_client(const struct test_opts *opts)
+static bool test_stream_transport_uaf(int cid)
 {
+   struct sockaddr_vm addr = {
+   .svm_family = AF_VSOCK,
+   .svm_cid = cid,
+   .svm_port = VMADDR_PORT_ANY
+   };
int sockets[MAX_PORT_RETRIES];
-   struct sockaddr_vm addr;
-   int fd, i, alen;
+   socklen_t alen;
+   int fd, i, c;
 
-   fd = vsock_bind(VMADDR_CID_ANY, VMADDR_PORT_ANY, SOCK_STREAM);
+   fd = socket(AF_VSOCK, SOCK_STREAM, 0);
+   if (fd < 0) {
+   perror("socket");
+   exit(EXIT_FAILURE);
+   }
+
+   if (bind(fd, (struct sockaddr *)&addr, sizeof(addr))) {
+   if (errno != EADDRNOTAVAIL) {
+   perror("Unexpected bind() errno");
+   exit(EXIT_FAILURE);
+   }
+
+   close(fd);
+   return false;
+   }
 
alen = sizeof(addr);
if (getsockname(fd, (struct sockaddr *)&addr, &alen)) {
@@ -1746,9 +1762,9 @@ static void test_stream_transport_uaf_client(const struct 
test_opts *opts)
exit(EXIT_FAILURE);
}
 
+   /* Drain the autobind pool; see __vsock_bind_connectible(). */
for (i = 0; i < MAX_PORT_RETRIES; ++i)
-   sockets[i] = vsock_bind(VMADDR_CID_ANY, ++addr.svm_port,
-   SOCK_STREAM);
+   sockets[i]

Re: [PATCH net-next v6 0/5] vsock: SOCK_LINGER rework

2025-05-22 Thread Michal Luczaj

On 5/22/25 10:08, Stefano Garzarella wrote:
> On Thu, May 22, 2025 at 01:18:20AM +0200, Michal Luczaj wrote:
>> Change vsock's lingering to wait on close() until all data is sent, i.e.
>> until workers picked all the packets for processing.
> 
> Thanks for the series and the patience :-)
> 
> LGTM! There should be my R-b for all patches.

I think it went smoothly, thanks for the reviews :)

Michal

Re: [PATCH bpf-next v5 1/3] btf: allow mmap of vmlinux btf

2025-05-22 Thread Shakeel Butt

On Tue, May 20, 2025 at 02:01:17PM +0100, Lorenz Bauer wrote:
> User space needs access to kernel BTF for many modern features of BPF.
> Right now each process needs to read the BTF blob either in pieces or
> as a whole. Allow mmaping the sysfs file so that processes can directly
> access the memory allocated for it in the kernel.
> 
> remap_pfn_range is used instead of vm_insert_page due to aarch64
> compatibility issues.
> 
> Tested-by: Alan Maguire 
> Signed-off-by: Lorenz Bauer 
> ---
>  include/asm-generic/vmlinux.lds.h |  3 ++-
>  kernel/bpf/sysfs_btf.c| 32 
>  2 files changed, 34 insertions(+), 1 deletion(-)
> 
> diff --git a/include/asm-generic/vmlinux.lds.h 
> b/include/asm-generic/vmlinux.lds.h
> index 
> 58a635a6d5bdf0c53c267c2a3d21a5ed8678ce73..1750390735fac7637cc4d2fa05f96cb2a36aa448
>  100644
> --- a/include/asm-generic/vmlinux.lds.h
> +++ b/include/asm-generic/vmlinux.lds.h
> @@ -667,10 +667,11 @@ defined(CONFIG_AUTOFDO_CLANG) || 
> defined(CONFIG_PROPELLER_CLANG)
>   */
>  #ifdef CONFIG_DEBUG_INFO_BTF
>  #define BTF  \
> + . = ALIGN(PAGE_SIZE);   \
>   .BTF : AT(ADDR(.BTF) - LOAD_OFFSET) {   \
>   BOUNDED_SECTION_BY(.BTF, _BTF)  \
>   }   \
> - . = ALIGN(4);   \
> + . = ALIGN(PAGE_SIZE);   \
>   .BTF_ids : AT(ADDR(.BTF_ids) - LOAD_OFFSET) {   \
>   *(.BTF_ids) \
>   }
> diff --git a/kernel/bpf/sysfs_btf.c b/kernel/bpf/sysfs_btf.c
> index 
> 81d6cf90584a7157929c50f62a5c6862e7a3d081..941d0d2427e3a2d27e8f1cff7b6424d0d41817c1
>  100644
> --- a/kernel/bpf/sysfs_btf.c
> +++ b/kernel/bpf/sysfs_btf.c
> @@ -7,14 +7,46 @@
>  #include 
>  #include 
>  #include 
> +#include 
> +#include 
> +#include 
>  
>  /* See scripts/link-vmlinux.sh, gen_btf() func for details */
>  extern char __start_BTF[];
>  extern char __stop_BTF[];
>  
> +static int btf_sysfs_vmlinux_mmap(struct file *filp, struct kobject *kobj,
> +   const struct bin_attribute *attr,
> +   struct vm_area_struct *vma)
> +{
> + unsigned long pages = PAGE_ALIGN(attr->size) >> PAGE_SHIFT;
> + size_t vm_size = vma->vm_end - vma->vm_start;
> + phys_addr_t addr = virt_to_phys(__start_BTF);
> + unsigned long pfn = addr >> PAGE_SHIFT;
> +
> + if (attr->private != __start_BTF || !PAGE_ALIGNED(addr))

With vmlinux.lds.h change above, is the page aligned check still needed?

Oh also can the size of btf region be non-page aligned?

> + return -EINVAL;
> +
> + if (vma->vm_pgoff)
> + return -EINVAL;
> +
> + if (vma->vm_flags & (VM_WRITE | VM_EXEC | VM_MAYSHARE))
> + return -EACCES;
> +
> + if (pfn + pages < pfn)
> + return -EINVAL;
> +
> + if ((vm_size >> PAGE_SHIFT) > pages)
> + return -EINVAL;
> +
> + vm_flags_mod(vma, VM_DONTDUMP, VM_MAYEXEC | VM_MAYWRITE);

Is it ok for fork() to keep the mapping in the child? (i.e. do you need
VM_DONTCOPY). BTW VM_DONTDUMP is added by remap_pfn_range(), so if you
want you can remove it here.

> + return remap_pfn_range(vma, vma->vm_start, pfn, vm_size, 
> vma->vm_page_prot);
> +}
> +
>  static struct bin_attribute bin_attr_btf_vmlinux __ro_after_init = {
>   .attr = { .name = "vmlinux", .mode = 0444, },
>   .read_new = sysfs_bin_attr_simple_read,
> + .mmap = btf_sysfs_vmlinux_mmap,
>  };
>  
>  struct kobject *btf_kobj;
> 

Overall this looks good to me, so you can add:

Reviewed-by: Shakeel Butt
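
For anyone wanting to poke at this from user space, the flag checks in the
handler imply a read-only, private mapping starting at offset 0. Below is a
minimal sketch under that assumption; sketch_btf_mmap_check() is a hypothetical
helper, the path is supplied by the caller, and 0xeB9F is the BTF magic from
the uapi btf.h. On a kernel without this patch the mmap() call is expected to
fail, which the sketch reports as -1.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#define SKETCH_BTF_MAGIC 0xeB9F /* BTF_MAGIC from the uapi btf.h */

/*
 * Returns 1 if the file was mapped and starts with the BTF magic,
 * 0 if it was mapped but the magic did not match,
 * -1 if open/fstat/mmap failed (e.g. kernel without mmap support).
 */
static int sketch_btf_mmap_check(const char *path)
{
    struct stat st;
    uint16_t magic;
    void *p;
    int fd, ret;

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    if (fstat(fd, &st) < 0 || st.st_size < (off_t)sizeof(magic)) {
        close(fd);
        return -1;
    }

    /* Read-only and private: writable, executable or shared is rejected. */
    p = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);
    if (p == MAP_FAILED)
        return -1;

    /* The header's first field is the 16-bit magic in native byte order. */
    memcpy(&magic, p, sizeof(magic));
    ret = (magic == SKETCH_BTF_MAGIC);

    munmap(p, st.st_size);
    return ret;
}
```

Calling sketch_btf_mmap_check("/sys/kernel/btf/vmlinux") is the intended use; a
caller that gets -1 would fall back to read()ing the blob as before.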

[PATCH v2] selftests: timers: valid-adjtimex: fix coding style issues

2025-05-22 Thread Rujra Bhatt



This patch corrects minor coding style issues to comply with the Linux kernel 
coding style:

- Align closing parentheses to match opening ones in printf statements.
- Break long lines to keep them within the 100-column limit.

These changes address warnings reported by checkpatch.pl and do not
affect functionality.

Changes in v2:
- Resubmitted the patch with a properly formatted commit message,
following patch submission guidelines, as suggested by Shuah Khan.

Signed-off-by: Rujra Bhatt 
---
 tools/testing/selftests/timers/valid-adjtimex.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/timers/valid-adjtimex.c
b/tools/testing/selftests/timers/valid-adjtimex.c
index 6b7801055ad1..5110f9ee285c 100644
--- a/tools/testing/selftests/timers/valid-adjtimex.c
+++ b/tools/testing/selftests/timers/valid-adjtimex.c
@@ -157,7 +157,7 @@ int validate_freq(void)
if (tx.freq == outofrange_freq[i]) {
printf("[FAIL]\n");
printf("ERROR: out of range value %ld actually set!\n",
-   tx.freq);
+  tx.freq);
pass = -1;
goto out;
}
@@ -172,7 +172,7 @@ int validate_freq(void)
if (ret >= 0) {
printf("[FAIL]\n");
printf("Error: No failure on invalid
ADJ_FREQUENCY %ld\n",
-   invalid_freq[i]);
+  invalid_freq[i]);
pass = -1;
goto out;
}
@@ -238,7 +238,8 @@ int set_bad_offset(long sec, long usec, int use_nano)
tmx.time.tv_usec = usec;
ret = clock_adjtime(CLOCK_REALTIME, &tmx);
if (ret >= 0) {
-   printf("Invalid (sec: %ld  usec: %ld) did not fail! ",
tmx.time.tv_sec, tmx.time.tv_usec);
+   printf("Invalid (sec: %ld  usec: %ld) did not fail! ",
+  tmx.time.tv_sec, tmx.time.tv_usec);
printf("[FAIL]\n");
return -1;
}
--
2.43.0

Re: [PATCH] selftests: firmware: Add details in error logging

2025-05-22 Thread Shuah Khan


On 5/16/25 09:39, Harshal wrote:

Specify details in logs of failed cases

Use die() instead of exit() when write to
sys_path fails


Please explain why this change is needed?



Signed-off-by: Harshal 
---
  tools/testing/selftests/firmware/fw_namespace.c | 17 +
  1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/tools/testing/selftests/firmware/fw_namespace.c 
b/tools/testing/selftests/firmware/fw_namespace.c
index 04757dc7e546..deff7f57b694 100644
--- a/tools/testing/selftests/firmware/fw_namespace.c
+++ b/tools/testing/selftests/firmware/fw_namespace.c
@@ -38,10 +38,11 @@ static void trigger_fw(const char *fw_name, const char 
*sys_path)
  
  	fd = open(sys_path, O_WRONLY);

if (fd < 0)
-   die("open failed: %s\n",
+   die("open of sys_path failed: %s\n",
strerror(errno));
if (write(fd, fw_name, strlen(fw_name)) != strlen(fw_name))
-   exit(EXIT_FAILURE);
+   die("write to sys_path failed: %s\n",
+   strerror(errno));


Hmm. Wrapper scripts key off of the EXIT_FAILURE - how does
the output change with this change?


close(fd);
  }
  
@@ -52,10 +53,10 @@ static void setup_fw(const char *fw_path)
  
  	fd = open(fw_path, O_WRONLY | O_CREAT, 0600);

if (fd < 0)
-   die("open failed: %s\n",
+   die("open of firmware file failed: %s\n",
strerror(errno));
if (write(fd, fw, sizeof(fw) -1) != sizeof(fw) -1)
-   die("write failed: %s\n",
+   die("write to firmware file failed: %s\n",
strerror(errno));
close(fd);
  }
@@ -66,7 +67,7 @@ static bool test_fw_in_ns(const char *fw_name, const char 
*sys_path, bool block_
  
  	if (block_fw_in_parent_ns)

if (mount("test", "/lib/firmware", "tmpfs", MS_RDONLY, NULL) == 
-1)
-   die("blocking firmware in parent ns failed\n");
+   die("blocking firmware in parent namespace failed\n");
  
  	child = fork();

if (child == -1) {
@@ -99,11 +100,11 @@ static bool test_fw_in_ns(const char *fw_name, const char 
*sys_path, bool block_
strerror(errno));
}
if (mount(NULL, "/", NULL, MS_SLAVE|MS_REC, NULL) == -1)
-   die("remount root in child ns failed\n");
+   die("remount root in child namespace failed\n");
  
  	if (!block_fw_in_parent_ns) {

if (mount("test", "/lib/firmware", "tmpfs", MS_RDONLY, NULL) == 
-1)
-   die("blocking firmware in child ns failed\n");
+   die("blocking firmware in child namespace failed\n");
} else
umount("/lib/firmware");
  
@@ -129,8 +130,8 @@ int main(int argc, char **argv)

die("error: failed to build full fw_path\n");
  
  	setup_fw(fw_path);

-
setvbuf(stdout, NULL, _IONBF, 0);
+
/* Positive case: firmware in PID1 mount namespace */
printf("Testing with firmware in parent namespace (assumed to be same file 
system as PID1)\n");
if (!test_fw_in_ns(fw_name, sys_path, false))


The rest look fine.

thanks,
-- Shuah
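
On the exit-status question, here is a small sketch of what a wrapper would
observe. It assumes die() is a print-to-stderr-then-exit(EXIT_FAILURE) helper,
which this thread does not show; sketch_die() and sketch_child_status() are
hypothetical stand-ins, not the selftest's code.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the test's die(): print, then fail. */
static void sketch_die(const char *fmt, ...)
{
    va_list ap;

    va_start(ap, fmt);
    vfprintf(stderr, fmt, ap);
    va_end(ap);
    exit(EXIT_FAILURE);
}

/*
 * Fork a child that fails either silently (plain exit) or verbosely
 * (sketch_die) and return the exit status the parent observes.
 */
static int sketch_child_status(int verbose)
{
    int status;
    pid_t pid;

    fflush(NULL);   /* avoid duplicating buffered output in the child */
    pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        if (verbose)
            sketch_die("write to sys_path failed: %s\n", "example");
        exit(EXIT_FAILURE);
    }

    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Under that assumption both paths end in the same exit status, so a wrapper
keying off EXIT_FAILURE sees no difference; only the extra stderr line is new.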

Re: [PATCH bpf-next v3 0/8] selftests/bpf: Test sockmap/sockhash redirection

2025-05-22 Thread patchwork-bot+netdevbpf

Hello:

This series was applied to bpf/bpf-next.git (master)
by Martin KaFai Lau :

On Thu, 15 May 2025 00:15:23 +0200 you wrote:
> John, this revision introduces one more patch: "selftests/bpf: Introduce
> verdict programs for sockmap_redir". I've kept your cross-series Acked-by. I
> hope it's ok.
> 
> Jiayuan, I haven't heard back from you regarding [*], so I've kept things
> unchanged for now. Please let me know what you think.
> 
> [...]

Here is the summary with links:
  - [bpf-next,v3,1/8] selftests/bpf: Support af_unix SOCK_DGRAM socket pair 
creation
https://git.kernel.org/bpf/bpf-next/c/fb1131d5e181
  - [bpf-next,v3,2/8] selftests/bpf: Add socket_kind_to_str() to socket_helpers
https://git.kernel.org/bpf/bpf-next/c/d87857946ded
  - [bpf-next,v3,3/8] selftests/bpf: Add u32()/u64() to sockmap_helpers
https://git.kernel.org/bpf/bpf-next/c/b57482b0fe8e
  - [bpf-next,v3,4/8] selftests/bpf: Introduce verdict programs for 
sockmap_redir
https://git.kernel.org/bpf/bpf-next/c/f266905bb3d8
  - [bpf-next,v3,5/8] selftests/bpf: Add selftest for sockmap/hashmap 
redirection
https://git.kernel.org/bpf/bpf-next/c/f0709263a07e
  - [bpf-next,v3,6/8] selftests/bpf: sockmap_listen cleanup: Drop af_vsock 
redir tests
https://git.kernel.org/bpf/bpf-next/c/9266e49d608c
  - [bpf-next,v3,7/8] selftests/bpf: sockmap_listen cleanup: Drop af_unix redir 
tests
https://git.kernel.org/bpf/bpf-next/c/f3de1cf621f7
  - [bpf-next,v3,8/8] selftests/bpf: sockmap_listen cleanup: Drop af_inet 
SOCK_DGRAM redir tests
https://git.kernel.org/bpf/bpf-next/c/c04eeeb2af8e

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html

Re: [PATCH] selftests: timers: Fix grammar and clarify comments in nanosleep.c

2025-05-22 Thread Shuah Khan


On 5/14/25 16:21, Rahul Kumar wrote:

Improved the clarity and grammar in the header comment of nanosleep.c
for better readability and consistency with kernel documentation style.


This patch isn't really fixing anything. I won't be taking this one.
Sorry.

thanks,
-- Shuah

Re: Fwd: [PATCH] selftests : timers : valid-adjtimex.c : Fixed style checks

2025-05-22 Thread Shuah Khan


On 5/15/25 19:44, rujra wrote:

fixed style checks according to Linux Kernel Coding Style standards.


Fixes




1 : fixed alignment of parenthesis.
LOG : CHECK: Alignment should match open parenthesis
+   printf("ERROR: out of range value %ld actually set!\n",
+   tx.freq);

2 : fixed alignment of parenthesis.
LOG : CHECK: Alignment should match open parenthesis
+   printf("Error: No failure on invalid
ADJ_FREQUENCY %ld\n",
+   invalid_freq[i]);

3 : fixed line length of 106 to 100 and less.
LOG :  CHECK: line length of 106 exceeds 100 columns
+   printf("Invalid (sec: %ld  usec: %ld) did not fail! ",
tmx.time.tv_sec, tmx.time.tv_usec);


Please refer to a few logs for examples on how to write change logs.
Also check kernel documentation on submitting patches.

thanks,
-- Shuah

RE: [PATCH v2 8/8] irqbypass: Require producers to pass in Linux IRQ number during registration

2025-05-22 Thread Tian, Kevin

> From: Sean Christopherson 
> Sent: Saturday, May 17, 2025 7:08 AM
> 
> Pass in the Linux IRQ associated with an IRQ bypass producer instead of
> relying on the caller to set the field prior to registration, as there's
> no benefit to relying on callers to do the right thing.
> 
> Take care to set producer->irq before __connect(), as KVM expects the IRQ
> to be valid as soon as a connection is possible.
> 
> Signed-off-by: Sean Christopherson 

Reviewed-by: Kevin Tian

[PATCH v2] selftests: acct: fix grammar and clarify output messages - Fixed email case mismatch in Signed-off-by

2025-05-22 Thread Abdelrahman Fekry

This patch improves the clarity and grammar of output messages in the acct()
selftest. Minor changes were made to user-facing strings and comments to make
them easier to understand and more consistent with the kselftest style.

Changes include:
- Fixing grammar in printed messages and comments.
- Rewording error and success outputs for clarity and professionalism.
- Making the "root check" message more direct.

These updates improve readability without affecting test logic or behavior.

Signed-off-by: Abdelrahman Fekry 
---
 tools/testing/selftests/acct/acct_syscall.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/acct/acct_syscall.c 
b/tools/testing/selftests/acct/acct_syscall.c
index 87c044fb9293..2c120a527574 100644
--- a/tools/testing/selftests/acct/acct_syscall.c
+++ b/tools/testing/selftests/acct/acct_syscall.c
@@ -22,9 +22,9 @@ int main(void)
ksft_print_header();
ksft_set_plan(1);
 
-   // Check if test is run a root
+   // Check if test is run as root
if (geteuid()) {
-   ksft_exit_skip("This test needs root to run!\n");
+   ksft_exit_skip("This test must be run as root!\n");
return 1;
}
 
@@ -52,7 +52,7 @@ int main(void)
child_pid = fork();
 
if (child_pid < 0) {
-   ksft_test_result_error("Creating a child process to log 
failed\n");
+   ksft_test_result_error("Failed to create child process for 
logging\n");
acct(NULL);
return 1;
} else if (child_pid > 0) {
-- 
2.25.1

[PATCH v2] selftests: size: fix grammar and align output formatting

2025-05-22 Thread Abdelrahman Fekry

This is v2 of the patch. 
The Signed-off-by line mismatch reported by checkpatch
has been fixed.

Improve the grammar in the test name by changing "get runtime memory use"
to "get runtime memory usage". Also adjust spacing in output lines
("Total:", "Free:", etc.) to ensure consistent alignment and readability.

Signed-off-by: Abdelrahman Fekry 
---
 tools/testing/selftests/size/get_size.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/size/get_size.c 
b/tools/testing/selftests/size/get_size.c
index 2980b1a63366..d5b67c073d8e 100644
--- a/tools/testing/selftests/size/get_size.c
+++ b/tools/testing/selftests/size/get_size.c
@@ -86,7 +86,7 @@ void _start(void)
int ccode;
struct sysinfo info;
unsigned long used;
-   static const char *test_name = " get runtime memory use\n";
+   static const char *test_name = " get runtime memory usage\n";
 
print("TAP version 13\n");
print("# Testing system size.\n");
@@ -105,8 +105,8 @@ void _start(void)
used = info.totalram - info.freeram - info.bufferram;
print("# System runtime memory report (units in Kilobytes):\n");
print(" ---\n");
-   print_k_value(" Total:  ", info.totalram, info.mem_unit);
-   print_k_value(" Free:   ", info.freeram, info.mem_unit);
+   print_k_value(" Total : ", info.totalram, info.mem_unit);
+   print_k_value(" Free  : ", info.freeram, info.mem_unit);
print_k_value(" Buffer: ", info.bufferram, info.mem_unit);
print_k_value(" In use: ", used, info.mem_unit);
print(" ...\n");
-- 
2.25.1

[PATCH v2] selftests: net: fix spelling and grammar mistakes

2025-05-22 Thread Praveen Balakrishnan

Fix several spelling and grammatical mistakes in output messages from
the net selftests to improve readability.

Only the message strings for the test output have been modified. No
changes to the functional logic of the tests have been made.

Signed-off-by: Praveen Balakrishnan 
---
Changes in v2:
- Resending to full recipient list as requested by Shuah Khan. No code
  changes since v1.

 .../testing/selftests/net/netfilter/conntrack_vrf.sh |  4 ++--
 tools/testing/selftests/net/openvswitch/ovs-dpctl.py |  2 +-
 tools/testing/selftests/net/rps_default_mask.sh  | 12 ++--
 3 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/tools/testing/selftests/net/netfilter/conntrack_vrf.sh b/tools/testing/selftests/net/netfilter/conntrack_vrf.sh
index e95ecb37c2b1..806d2bfbd6e7 100755
--- a/tools/testing/selftests/net/netfilter/conntrack_vrf.sh
+++ b/tools/testing/selftests/net/netfilter/conntrack_vrf.sh
@@ -236,9 +236,9 @@ EOF
ip netns exec "$ns1" ping -q -w 1 -c 1 "$DUMMYNET".2 > /dev/null
 
if ip netns exec "$ns0" nft list counter t fibcount | grep -q "packets 1"; then
-   echo "PASS: fib lookup returned exepected output interface"
+   echo "PASS: fib lookup returned expected output interface"
else
-   echo "FAIL: fib lookup did not return exepected output interface"
+   echo "FAIL: fib lookup did not return expected output interface"
ret=1
return
fi
diff --git a/tools/testing/selftests/net/openvswitch/ovs-dpctl.py b/tools/testing/selftests/net/openvswitch/ovs-dpctl.py
index 8a0396bfaf99..b521e0dea506 100644
--- a/tools/testing/selftests/net/openvswitch/ovs-dpctl.py
+++ b/tools/testing/selftests/net/openvswitch/ovs-dpctl.py
@@ -1877,7 +1877,7 @@ class OvsPacket(GenericNetlinkSocket):
 elif msg["cmd"] == OvsPacket.OVS_PACKET_CMD_EXECUTE:
 up.execute(msg)
 else:
-print("Unkonwn cmd: %d" % msg["cmd"])
+print("Unknown cmd: %d" % msg["cmd"])
 except NetlinkError as ne:
 raise ne
 
diff --git a/tools/testing/selftests/net/rps_default_mask.sh b/tools/testing/selftests/net/rps_default_mask.sh
index 4287a8529890..b200019b3c80 100755
--- a/tools/testing/selftests/net/rps_default_mask.sh
+++ b/tools/testing/selftests/net/rps_default_mask.sh
@@ -54,16 +54,16 @@ cleanup
 
 echo 1 > /proc/sys/net/core/rps_default_mask
 setup
-chk_rps "changing rps_default_mask dont affect existing devices" "" lo $INITIAL_RPS_DEFAULT_MASK
+chk_rps "changing rps_default_mask doesn't affect existing devices" "" lo $INITIAL_RPS_DEFAULT_MASK
 
 echo 3 > /proc/sys/net/core/rps_default_mask
-chk_rps "changing rps_default_mask dont affect existing netns" $NETNS lo 0
+chk_rps "changing rps_default_mask doesn't affect existing netns" $NETNS lo 0
 
 ip link add name $VETH type veth peer netns $NETNS name $VETH
 ip link set dev $VETH up
 ip -n $NETNS link set dev $VETH up
-chk_rps "changing rps_default_mask affect newly created devices" "" $VETH 3
-chk_rps "changing rps_default_mask don't affect newly child netns[II]" $NETNS $VETH 0
+chk_rps "changing rps_default_mask affects newly created devices" "" $VETH 3
+chk_rps "changing rps_default_mask doesn't affect newly child netns[II]" $NETNS $VETH 0
 ip link del dev $VETH
 ip netns del $NETNS
 
@@ -72,8 +72,8 @@ chk_rps "rps_default_mask is 0 by default in child netns" "$NETNS" lo 0
 
 ip netns exec $NETNS sysctl -qw net.core.rps_default_mask=1
 ip link add name $VETH type veth peer netns $NETNS name $VETH
-chk_rps "changing rps_default_mask in child ns don't affect the main one" "" lo $INITIAL_RPS_DEFAULT_MASK
+chk_rps "changing rps_default_mask in child ns doesn't affect the main one" "" lo $INITIAL_RPS_DEFAULT_MASK
 chk_rps "changing rps_default_mask in child ns affects new childns devices" $NETNS $VETH 1
-chk_rps "changing rps_default_mask in child ns don't affect existing devices" $NETNS lo 0
+chk_rps "changing rps_default_mask in child ns doesn't affect existing devices" $NETNS lo 0
 
 exit $ret
-- 
2.39.5

[PATCH] fs/dax: Fix "don't skip locked entries when scanning entries"

2025-05-22 Thread Alistair Popple

Commit 6be3e21d25ca ("fs/dax: don't skip locked entries when scanning
entries") introduced a new function, wait_entry_unlocked_exclusive(),
which waits for the current entry to become unlocked without advancing
the XArray iterator state.

Waiting for the entry to become unlocked requires dropping the XArray
lock. This requires calling xas_pause() prior to dropping the lock
which leaves the xas in a suitable state for the next iteration. However
this has the side-effect of advancing the xas state to the next index.
Normally this isn't an issue because xas_for_each() contains code to
detect this state and thus avoid advancing the index a second time on
the next loop iteration.

However, both wait_entry_unlocked_exclusive() itself and its callers
subsequently use the xas state to reload the entry. As xas_pause()
updated the state to the next index, this causes the current entry,
which is being waited on, to be skipped. This caused the following
warning to fire intermittently when running xfstests generic/068 on an XFS
filesystem with FS DAX enabled:

[   35.067397] [ cut here ]
[   35.068229] WARNING: CPU: 21 PID: 1640 at mm/truncate.c:89 
truncate_folio_batch_exceptionals+0xd8/0x1e0
[   35.069717] Modules linked in: nd_pmem dax_pmem nd_btt nd_e820 libnvdimm
[   35.071006] CPU: 21 UID: 0 PID: 1640 Comm: fstest Not tainted 6.15.0-rc7+ 
#77 PREEMPT(voluntary)
[   35.072613] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 
rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/204
[   35.074845] RIP: 0010:truncate_folio_batch_exceptionals+0xd8/0x1e0
[   35.075962] Code: a1 00 00 00 f6 47 0d 20 0f 84 97 00 00 00 4c 63 e8 41 39 
c4 7f 0b eb 61 49 83 c5 01 45 39 ec 7e 58 42 f68
[   35.079522] RSP: 0018:b04e426c7850 EFLAGS: 00010202
[   35.080359] RAX:  RBX: 9d21e3481908 RCX: b04e426c77f4
[   35.081477] RDX: b04e426c79e8 RSI: b04e426c79e0 RDI: 9d21e34816e8
[   35.082590] RBP: b04e426c79e0 R08: 0001 R09: 0003
[   35.083733] R10:  R11: 822b53c0f7a49868 R12: 001f
[   35.084850] R13:  R14: b04e426c78e8 R15: fffe
[   35.085953] FS:  7f9134c87740() GS:9d22abba() 
knlGS:
[   35.087346] CS:  0010 DS:  ES:  CR0: 80050033
[   35.088244] CR2: 7f9134c86000 CR3: 00040afff000 CR4: 06f0
[   35.089354] Call Trace:
[   35.089749]  
[   35.090168]  truncate_inode_pages_range+0xfc/0x4d0
[   35.091078]  truncate_pagecache+0x47/0x60
[   35.091735]  xfs_setattr_size+0xc7/0x3e0
[   35.092648]  xfs_vn_setattr+0x1ea/0x270
[   35.093437]  notify_change+0x1f4/0x510
[   35.094219]  ? do_truncate+0x97/0xe0
[   35.094879]  do_truncate+0x97/0xe0
[   35.095640]  path_openat+0xabd/0xca0
[   35.096278]  do_filp_open+0xd7/0x190
[   35.096860]  do_sys_openat2+0x8a/0xe0
[   35.097459]  __x64_sys_openat+0x6d/0xa0
[   35.098076]  do_syscall_64+0xbb/0x1d0
[   35.098647]  entry_SYSCALL_64_after_hwframe+0x77/0x7f
[   35.099444] RIP: 0033:0x7f9134d81fc1
[   35.100033] Code: 75 57 89 f0 25 00 00 41 00 3d 00 00 41 00 74 49 80 3d 2a 
26 0e 00 00 74 6d 89 da 48 89 ee bf 9c ff ff ff5
[   35.102993] RSP: 002b:7ffcd41e0d10 EFLAGS: 0202 ORIG_RAX: 
0101
[   35.104263] RAX: ffda RBX: 0242 RCX: 7f9134d81fc1
[   35.105452] RDX: 0242 RSI: 7ffcd41e1200 RDI: ff9c
[   35.106663] RBP: 7ffcd41e1200 R08:  R09: 0064
[   35.107923] R10: 01a4 R11: 0202 R12: 0066
[   35.109112] R13: 0010 R14: 0010 R15: 0400
[   35.110357]  
[   35.110769] irq event stamp: 8415587
[   35.111486] hardirqs last  enabled at (8415599): [] 
__up_console_sem+0x52/0x60
[   35.113067] hardirqs last disabled at (8415610): [] 
__up_console_sem+0x37/0x60
[   35.114575] softirqs last  enabled at (8415300): [] 
handle_softirqs+0x315/0x3f0
[   35.115933] softirqs last disabled at (8415291): [] 
__irq_exit_rcu+0xa1/0xc0
[   35.117316] ---[ end trace  ]---

Fix this by using xas_reset() instead, which is equivalent in
implementation to xas_pause() but does not advance the XArray state.
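
As a rough illustration of the difference described above, here is a
user-space toy model. It is only a sketch: struct toy_state and the
toy_*() helpers are hypothetical names invented for this example and are
not the kernel's struct xa_state or XArray API. It assumes the simplest
case, where pausing moves the state on by exactly one index.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for the iterator state; NOT the kernel's struct xa_state. */
struct toy_state {
	size_t index;	/* index that the next reload will look up */
	int restart;	/* non-zero: the walk must restart from index */
};

/* Models the pause behaviour: restart the walk, but from the next index. */
static void toy_pause(struct toy_state *s)
{
	s->index += 1;
	s->restart = 1;
}

/* Models the reset behaviour: restart the walk, keeping the same index. */
static void toy_reset(struct toy_state *s)
{
	s->restart = 1;
}

/* Returns the index that a reload would look up after waiting on 'current'. */
static size_t toy_reload_index_after_wait(size_t current, int use_pause)
{
	struct toy_state s = { .index = current, .restart = 0 };

	if (use_pause)
		toy_pause(&s);
	else
		toy_reset(&s);
	return s.index;
}

/* Checks both variants for an entry waited on at index 5. */
static void toy_demo(void)
{
	/* Pause variant: the reload targets index 6, so entry 5 is skipped. */
	assert(toy_reload_index_after_wait(5, 1) == 6);
	/* Reset variant: the reload targets index 5, the entry waited on. */
	assert(toy_reload_index_after_wait(5, 0) == 5);
}
```

Calling toy_demo() from a test harness exercises both variants; in this
model only the reset variant reloads the entry that was being waited on.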

Fixes: 6be3e21d25ca ("fs/dax: don't skip locked entries when scanning entries")
Signed-off-by: Alistair Popple 
Cc: Dan Williams 
Cc: Alison Schofield 
Cc: Matthew Wilcox (Oracle) 
Cc: Balbir Singh 
Cc: "Darrick J. Wong" 
Cc: Dave Chinner 
Cc: David Hildenbrand 
Cc: Jan Kara 
Cc: John Hubbard 
Cc: Ted Ts'o 
Cc: Alexander Viro 
Cc: Christian Brauner 

---

Hi Andrew,

Apologies for finding this so late in the cycle. This is a very
intermittent issue for me, and it seems it was only exposed by a recent
upgrade to my test machine/setup. The user-visible impact is the same
as for the original commit this fixes: possible file data corruption
if a device has an FS DAX page pinned for DMA.

So in other words it means

RE: [PATCH v6 2/5] x86/cpufeatures: Add X86_FEATURE_SGX_EUPDATESVN feature flag

2025-05-22 Thread Reshetova, Elena

> -Original Message-
> From: Huang, Kai 
> Sent: Friday, May 23, 2025 3:18 AM
> To: Reshetova, Elena ; Hansen, Dave
> 
> Cc: Raynor, Scott ; sea...@google.com;
> mi...@kernel.org; Scarlata, Vincent R ;
> x...@kernel.org; jar...@kernel.org; Annapurve, Vishal
> ; linux-kernel@vger.kernel.org; Mallick, Asit K
> ; Aktas, Erdem ; Cai,
> Chong ; bond...@google.com; linux-
> s...@vger.kernel.org; dionnagl...@google.com
> Subject: Re: [PATCH v6 2/5] x86/cpufeatures: Add
> X86_FEATURE_SGX_EUPDATESVN feature flag
> 
> On Thu, 2025-05-22 at 12:21 +0300, Elena Reshetova wrote:
> > --- a/tools/arch/x86/include/asm/cpufeatures.h
> > +++ b/tools/arch/x86/include/asm/cpufeatures.h
> > @@ -481,6 +481,7 @@
> >  #define X86_FEATURE_AMD_HTR_CORES  (21*32+ 6) /* Heterogeneous Core Topology */
> >  #define X86_FEATURE_AMD_WORKLOAD_CLASS (21*32+ 7) /* Workload Classification */
> >  #define X86_FEATURE_PREFER_YMM (21*32+ 8) /* Avoid ZMM registers due to downclocking */
> > +#define X86_FEATURE_SGX_EUPDATESVN (21*32+11) /* Support for ENCLS[EUPDATESVN] instruction */
> 
> [Sorry for not mentioning in the previous version.]
> 
> Nit:
> 
> I am not sure we need to change tool headers.
> 
> Per commit
> 
>   f6d9883f8e68 ("tools/include: Sync x86 headers with the kernel sources")
> 
> .. and tools/include/uapi/README:
> 
>   ...
> 
>   What we are doing now is a third option:
> 
>- A software-enforced copy-on-write mechanism of kernel headers to
>  tooling, driven by non-fatal warnings on the tooling side build when
>  kernel headers get modified:
> 
>   Warning: Kernel ABI header differences:
> diff -u tools/include/uapi/drm/i915_drm.h include/uapi/drm/i915_drm.h
> diff -u tools/include/uapi/linux/fs.h include/uapi/linux/fs.h
> diff -u tools/include/uapi/linux/kvm.h include/uapi/linux/kvm.h
> ...
> 
>  The tooling policy is to always pick up the kernel side headers as-is,
>  and integrate them into the tooling build. The warnings above serve as a
>  notification to tooling maintainers that there's changes on the kernel
>  side.
> 
>   We've been using this for many years now, and it might seem hacky, but
>   works surprisingly well.
> 
> .. I interpret the updating to tools headers is not mandatory (unless building
> tools fails w/o the new feature bit definition which I believe isn't the case
> of SGX_UPDATESVN).  The tools maintainers will eventually do the sync.
> 
> But on the other hand, modifying tools headers in this patch also reduces
> tools maintainer's effort in the future.
> 
> That being said, I am unclear with the rule here.  Perhaps Dave/Ingo can help
> to clarify.


Thank you Kai! I am also not sure what the rule is, since I checked before
and different patches to x86/cpufeatures.c do it differently (some update
the tools headers and some don't).
I would also like to hear suggestions from Dave and Ingo on this.

Best Regards,
Elena.
