Re: [PATCH/RFC] kunit/rtc: Add real support for very slow tests

2025-03-28 Thread David Gow
Hi Geert,

Thanks for sending this out: I think this raises some good questions
about exactly how to handle long running tests (particularly on
older/slower hardware).

I've put a few notes below, but, tl;dr: I think these are all good
changes, even if there's more we can do to better scale to slower
hardware.

On Fri, 28 Mar 2025 at 00:07, Geert Uytterhoeven  wrote:
>
> When running rtc_lib_test ("lib_test" before my "[PATCH] rtc: Rename
> lib_test to rtc_lib_test") on m68k/ARAnyM:
>
> KTAP version 1
> 1..1
> KTAP version 1
> # Subtest: rtc_lib_test_cases
> # module: rtc_lib_test
> 1..2
> # rtc_time64_to_tm_test_date_range_1000: Test should be marked slow 
> (runtime: 3.222371420s)
> ok 1 rtc_time64_to_tm_test_date_range_1000
> # rtc_time64_to_tm_test_date_range_16: try timed out
> # rtc_time64_to_tm_test_date_range_16: test case timed out
> # rtc_time64_to_tm_test_date_range_16.speed: slow
> not ok 2 rtc_time64_to_tm_test_date_range_16
> # rtc_lib_test_cases: pass:1 fail:1 skip:0 total:2
> # Totals: pass:1 fail:1 skip:0 total:2
> not ok 1 rtc_lib_test_cases
>
> Commit 02c2d0c2a84172c3 ("kunit: Add speed attribute") added the notion
> of "very slow" tests, but this attribute is otherwise unused and unhandled.
>
> Hence:
>   1. Introduce KUNIT_CASE_VERY_SLOW(),

Thanks -- I think we want this regardless.

>   2. Increase the timeout by a factor of ten; ideally this should only be done for very
>  slow tests, but I couldn't find how to access kunit_case.attr.case
>  from kunit_try_catch_run(),


My feeling for tests generally is:
- Normal: effectively instant on modern hardware, O(seconds) on
ancient hardware.
- Slow: takes O(seconds) to run on modern hardware, O(minutes)..O(10s
of minutes) on ancient hardware.
- Very slow: O(minutes) or higher on modern hardware, infeasible on
ancient hardware.

Obviously the definition of "modern" and "ancient" hardware here is
pretty arbitrary: I'm using "modern, high-end x86" ~4GHz as my
"modern" example, and "66MHz 486" as my "ancient" one, but things like
emulation or embedded systems fit in-between.

Ultimately, I think the timeout probably needs to be configurable on a
per-machine basis more than a per-test one, but having a 10x
multiplier (or even a 100x multiplier) for very slow tests would also
work for me.

I quickly tried hacking together something to pass through the
attribute and implement this. Diff (probably mangled by gmail) below:
---
diff --git a/include/kunit/try-catch.h b/include/kunit/try-catch.h
index 7c966a1adbd3..24a29622068b 100644
--- a/include/kunit/try-catch.h
+++ b/include/kunit/try-catch.h
@@ -50,6 +50,13 @@ struct kunit_try_catch {
   void *context;
};

+struct kunit_try_catch_context {
+   struct kunit *test;
+   struct kunit_suite *suite;
+   struct kunit_case *test_case;
+};
+
+
void kunit_try_catch_run(struct kunit_try_catch *try_catch, void *context);

void __noreturn kunit_try_catch_throw(struct kunit_try_catch *try_catch);
diff --git a/lib/kunit/test.c b/lib/kunit/test.c
index 146d1b48a096..79d12c0c2d25 100644
--- a/lib/kunit/test.c
+++ b/lib/kunit/test.c
@@ -420,12 +420,6 @@ static void kunit_run_case_cleanup(struct kunit *test,
   kunit_case_internal_cleanup(test);
}

-struct kunit_try_catch_context {
-   struct kunit *test;
-   struct kunit_suite *suite;
-   struct kunit_case *test_case;
-};
-
static void kunit_try_run_case(void *data)
{
   struct kunit_try_catch_context *ctx = data;
diff --git a/lib/kunit/try-catch.c b/lib/kunit/try-catch.c
index 92099c67bb21..5f62e393d422 100644
--- a/lib/kunit/try-catch.c
+++ b/lib/kunit/try-catch.c
@@ -34,30 +34,15 @@ static int kunit_generic_run_threadfn_adapter(void *data)
   return 0;
}

-static unsigned long kunit_test_timeout(void)
+static unsigned long kunit_test_timeout(struct kunit_try_catch *try_catch)
{
-   /*
-* TODO(brendanhigg...@google.com): We should probably have some type of
-* variable timeout here. The only question is what that timeout value
-* should be.
-*
-* The intention has always been, at some point, to be able to label
-* tests with some type of size bucket (unit/small, integration/medium,
-* large/system/end-to-end, etc), where each size bucket would get a
-* default timeout value kind of like what Bazel does:
-* https://docs.bazel.build/versions/master/be/common-definitions.html#test.size
-* There is still some debate to be had on exactly how we do this. (For
-* one, we probably want to have some sort of test runner level
-* timeout.)
-*
-* For more background on this topic, see:
-* https://mike-bland.com/2011/11/01/small-medium-large.html
-*
-* If tests timeout due to exceeding sysctl_hung_task_timeout_secs,
-* the task will be killed and an oops generated.
-*/
-   // FIXME

Re: [PATCH 1/4] x86/sgx: Add total number of EPC pages

2025-03-28 Thread Jarkko Sakkinen
On Fri, Mar 28, 2025 at 08:07:24AM +, Reshetova, Elena wrote:
> > Yes but obviously I cannot promise that I'll accept this as it is
> > until I see the final version
> 
> Are you saying you prefer *this version with spinlock* vs. 
> simpler version that utilizes the fact that sgx_nr_free_pages is changed
> into tracking of number of used pages? 

I don't really know what I prefer.

Maybe a +1 version would make sense, where you keep the approach
you've chosen (used pages) and better rationalize why it is mandatory,
and why free pages would be worse?

> 
> > 
> > Also you probably should use mutex given the loop where we cannot
> > temporarily exit the lock (like e.g. in keyrings gc we can).
> 
> Not sure I understand this, could you please elaborate on why I need an
> additional mutex here? Or are you suggesting switching the spinlock to a mutex?

In your code example you had a loop inside a spinlock, which was based on
the return code of an opcode, i.e. a potentially infinite loop.

I'd like to remind you that the hardware I have is a NUC7 from 2018, so
you really have to nail down how things will work semantically, as I can
only think about these things at a theoretical level ;-) [1]


> 
> Best Regards,
> Elena.
> 

[1] https://social.kernel.org/notice/AsUbsYH0T4bTcUSdUW

BR, Jarkko



Re: [PATCH 4/4] x86/sgx: Implement ENCLS[EUPDATESVN] and opportunistically call it during first EPC page alloc

2025-03-28 Thread Jarkko Sakkinen
On Fri, Mar 28, 2025 at 08:27:51AM +, Reshetova, Elena wrote:
> 
> > On Thu, Mar 27, 2025 at 03:42:30PM +, Reshetova, Elena wrote:
> > > > > > > + case SGX_NO_UPDATE:
> > > > > > > + pr_debug("EUPDATESVN was successful, but CPUSVN
> > was not
> > > > > > updated, "
> > > > > > > + "because current SVN was not newer than
> > > > > > CPUSVN.\n");
> > > > > > > + break;
> > > > > > > + case SGX_EPC_NOT_READY:
> > > > > > > + pr_debug("EPC is not ready for SVN update.");
> > > > > > > + break;
> > > > > > > + case SGX_INSUFFICIENT_ENTROPY:
> > > > > > > + pr_debug("CPUSVN update is failed due to Insufficient
> > > > > > entropy in RNG, "
> > > > > > > + "please try it later.\n");
> > > > > > > + break;
> > > > > > > + case SGX_EPC_PAGE_CONFLICT:
> > > > > > > + pr_debug("CPUSVN update is failed due to
> > concurrency
> > > > > > violation, please "
> > > > > > > + "stop running any other ENCLS leaf and try it
> > > > > > later.\n");
> > > > > > > + break;
> > > > > > > + default:
> > > > > > > + break;
> > > > > >
> > > > > > Remove pr_debug() statements.
> > > > >
> > > > > This I am not sure it is good idea. I think it would be useful for 
> > > > > system
> > > > > admins to have a way to see that update either happened or not.
> > > > > It is true that you can find this out by requesting a new SGX 
> > > > > attestation
> > > > > quote (and see if newer SVN is used), but it is not the faster way.
> > > >
> > > > Maybe pr_debug() is then the wrong level if they are meant for sysadmins?
> > > >
> > > > I mean these should not happen in normal behavior, like, ever? As
> > > > pr_debug() I don't really get this.
> > >
> > > SGX_NO_UPDATE will absolutely happen normally all the time.
> > > Since EUPDATESVN is executed every time EPC is empty, this is the
> > > most common code you will get back (because microcode updates are rare).
> > > Others yes, that would indicate some error condition.
> > > So, what is the pr_level that you would suggest?
> > 
> > Right, got it. That changes my conclusions:
> > 
> > So I'd reformulate it like:
> > 
> > switch (ret) {
> > case 0:
> > pr_info("EUPDATESVN: success\n");
> > break;
> > case SGX_EPC_NOT_READY:
> > case SGX_INSUFFICIENT_ENTROPY:
> > case SGX_EPC_PAGE_CONFLICT:
> > pr_err("EUPDATESVN: error %d\n", ret);
> > /* TODO: block/teardown driver? */
> > break;
> > case SGX_NO_UPDATE:
> > break;
> > default:
> > pr_err("EUPDATESVN: unknown error %d\n", ret);
> > /* TODO: block/teardown driver? */
> > break;
> > }
> > 
> > Since EPC usage is zero when this is executed, error cases should block
> > or tear down the SGX driver, presuming that they are because of either
> > an incorrect driver state or a spurious error code.
> 
> I agree with the above, but I am not sure at all about blocking/tearing down the
> driver. These are all potentially transient conditions, and SGX_INSUFFICIENT_ENTROPY
> is even outside of SGX driver control and *does not* indicate any error
> condition on the driver side itself. SGX_EPC_NOT_READY and SGX_EPC_PAGE_CONFLICT
> would mean we have a bug somewhere, because we thought we could go
> do EUPDATESVN on an empty EPC and prevented anyone from creating
> pages in the meanwhile, but it looks like we missed something. That said, I don't know
> if we want to fail the whole system in case we have such a code bug; this is very
> aggressive (in case it is some rare edge condition that no one knew about or
> guessed). So, I would propose to print the pr_err() as you have above but
> avoid destroying the driver.
> Would this work?

I think now is the time that you should really roll out a new version in
the way you see fit and we will revisit that.

I already gathered from your example that I got some of the error codes
horribly wrong :-) Still, I think the draft of the error handling I put
together is at least headed in the right direction.

> 
> Best Regards,
> Elena.
> 
> 
> > 
> > If this happens, we definitely do not want service, right?
> > 
> > I'm not sure how serious all of the error codes are, or whether they are all a
> > consequence of an incorrectly working driver.
> > 
> > BR, Jarkko

BR, Jarkko



RE: [PATCH 1/4] x86/sgx: Add total number of EPC pages

2025-03-28 Thread Reshetova, Elena
 
> On Fri, Mar 28, 2025 at 08:07:24AM +, Reshetova, Elena wrote:
> > > Yes but obviously I cannot promise that I'll accept this as it is
> > > until I see the final version
> >
> > Are you saying you prefer *this version with spinlock* vs.
> > simpler version that utilizes the fact that sgx_nr_free_pages is changed
> > into tracking of number of used pages?
> 
> I don't really know what I prefer.
> 
> Maybe a +1 version would make sense, where you keep the approach
> you've chosen (used pages) and better rationalize why it is mandatory,
> and why free pages would be worse?

Sure, let me send out v2 with the old approach, all suggestions and fixes
taken into account and better reasoning. 

> 
> >
> > >
> > > Also you probably should use mutex given the loop where we cannot
> > > temporarily exit the lock (like e.g. in keyrings gc we can).
> >
> > Not sure I understand this, could you please elaborate on why I need an
> > additional mutex here? Or are you suggesting switching the spinlock to a mutex?
> 
> In your code example you had a loop inside a spinlock, which was based on
> the return code of an opcode, i.e. a potentially infinite loop.

Oh, this is a misunderstanding due to my posting only a limited snippet.
The loop was also bounded by a "retry" condition in the while loop, with the
maximum number of retries being 10. It only exits earlier if there is success.


> 
> I'd like to remind you that the hardware I have is a NUC7 from 2018, so
> you really have to nail down how things will work semantically, as I can
> only think about these things at a theoretical level ;-) [1]

Sure, I understand. 

Best Regards,
Elena.


> 
> 
> >
> > Best Regards,
> > Elena.
> >
> 
> [1] https://social.kernel.org/notice/AsUbsYH0T4bTcUSdUW
> 
> BR, Jarkko


Re: [PATCH v3 2/3] openrisc: Introduce new utility functions to flush and invalidate caches

2025-03-28 Thread Sahil Siddiq

Hi,

Thank you for the review.

On 3/26/25 10:41 PM, Stafford Horne wrote:

On Mon, Mar 24, 2025 at 01:25:43AM +0530, Sahil Siddiq wrote:

According to the OpenRISC architecture manual, the dcache and icache may
not be present. When these caches are present, the invalidate and flush
registers may be absent. The current implementation does not perform
checks to verify their presence before utilizing cache registers, or
invalidating and flushing cache blocks.

Introduce new functions to detect the presence of cache components and
related special-purpose registers.

There are a few places where a range of addresses have to be flushed or
invalidated and the implementation is duplicated. Introduce new utility
functions and macros that generalize this implementation and reduce
duplication.

Signed-off-by: Sahil Siddiq 
---
Changes from v2 -> v3:
- arch/openrisc/include/asm/cacheflush.h: Declare new functions and macros.
- arch/openrisc/include/asm/cpuinfo.h: Implement new functions.
   (cpu_cache_is_present):
   1. The implementation of this function was strewn all over the place in
  the previous versions.
   2. Fix condition. The condition in the previous version was incorrect.
   (cb_inv_flush_is_implemented): New function.
- arch/openrisc/kernel/dma.c: Use new functions.
- arch/openrisc/mm/cache.c:
   (cache_loop): Extend function.
   (local_*_page_*): Use new cache_loop interface.
   (local_*_range_*): Implement new functions.
- arch/openrisc/mm/init.c: Use new functions.

  arch/openrisc/include/asm/cacheflush.h | 17 +
  arch/openrisc/include/asm/cpuinfo.h| 42 +
  arch/openrisc/kernel/dma.c | 18 ++---
  arch/openrisc/mm/cache.c   | 52 ++
  arch/openrisc/mm/init.c|  5 ++-
  5 files changed, 110 insertions(+), 24 deletions(-)

diff --git a/arch/openrisc/include/asm/cacheflush.h 
b/arch/openrisc/include/asm/cacheflush.h
index 984c331ff5f4..0e60af486ec1 100644
--- a/arch/openrisc/include/asm/cacheflush.h
+++ b/arch/openrisc/include/asm/cacheflush.h
@@ -23,6 +23,9 @@
   */
  extern void local_dcache_page_flush(struct page *page);
  extern void local_icache_page_inv(struct page *page);
+extern void local_dcache_range_flush(unsigned long start, unsigned long end);
+extern void local_dcache_range_inv(unsigned long start, unsigned long end);
+extern void local_icache_range_inv(unsigned long start, unsigned long end);
  
  /*

   * Data cache flushing always happen on the local cpu. Instruction cache
@@ -38,6 +41,20 @@ extern void local_icache_page_inv(struct page *page);
  extern void smp_icache_page_inv(struct page *page);
  #endif /* CONFIG_SMP */
  
+/*

+ * Even if the actual block size is larger than L1_CACHE_BYTES, paddr
+ * can be incremented by L1_CACHE_BYTES. When paddr is written to the
+ * invalidate register, the entire cache line encompassing this address
+ * is invalidated. Each subsequent reference to the same cache line will
+ * not affect the invalidation process.
+ */
+#define local_dcache_block_flush(addr) \
+   local_dcache_range_flush(addr, addr + L1_CACHE_BYTES)
+#define local_dcache_block_inv(addr) \
+   local_dcache_range_inv(addr, addr + L1_CACHE_BYTES)
+#define local_icache_block_inv(addr) \
+   local_icache_range_inv(addr, addr + L1_CACHE_BYTES)
+
  /*
   * Synchronizes caches. Whenever a cpu writes executable code to memory, this
   * should be called to make sure the processor sees the newly written code.
diff --git a/arch/openrisc/include/asm/cpuinfo.h 
b/arch/openrisc/include/asm/cpuinfo.h
index 82f5d4c06314..7839c41152af 100644
--- a/arch/openrisc/include/asm/cpuinfo.h
+++ b/arch/openrisc/include/asm/cpuinfo.h
@@ -15,6 +15,9 @@
  #ifndef __ASM_OPENRISC_CPUINFO_H
  #define __ASM_OPENRISC_CPUINFO_H
  
+#include 

+#include 
+
  struct cache_desc {
u32 size;
u32 sets;
@@ -34,4 +37,43 @@ struct cpuinfo_or1k {
  extern struct cpuinfo_or1k cpuinfo_or1k[NR_CPUS];
  extern void setup_cpuinfo(void);
  
+/*

+ * Check if the cache component exists.
+ */
+static inline bool cpu_cache_is_present(const unsigned int cache_type)
+{
+   unsigned long upr = mfspr(SPR_UPR);
+
+   return !!(upr & (SPR_UPR_UP | cache_type));
+}
+
+/*
+ * Check if the cache block flush/invalidate register is implemented for the
+ * cache component.
+ */
+static inline bool cb_inv_flush_is_implemented(const unsigned int reg,
+  const unsigned int cache_type)
+{
+   unsigned long cfgr;
+
+   if (cache_type == SPR_UPR_DCP) {
+   cfgr = mfspr(SPR_DCCFGR);
+   if (reg == SPR_DCBFR)
+   return !!(cfgr & SPR_DCCFGR_CBFRI);
+
+   if (reg == SPR_DCBIR)
+   return !!(cfgr & SPR_DCCFGR_CBIRI);
+   }
+
+   /*
+* The cache block flush register is not implemented for the 
instruction cache.
+*/
+   if (cache_type == SPR_UPR_ICP) {
+   

Re: [PATCH] virtio: console: Make resizing compliant with virtio spec

2025-03-28 Thread Halil Pasic
On Thu, 20 Mar 2025 15:09:57 +0100
Halil Pasic  wrote:

> > I already implemented it in my patch v2 (just waiting for Amit to
> > confirm the new commit message). But if you want to split it you can
> > create a separate patch for it as well (I don't really mind either
> > way).
> >   

Your v2 has not been posted yet, has it? I can't find it in my inbox.

I understand that you have confirmed that the byte order handling is
needed but missing, right?

> 
> It is conceptually a different bug and warrants a patch and a commit
> message of its own. At least IMHO.



[PATCH v4 3/3] openrisc: Add cacheinfo support

2025-03-28 Thread Sahil Siddiq
Add cacheinfo support for OpenRISC.

Currently, a few CPU cache attributes pertaining to OpenRISC processors
are exposed along with other unrelated CPU attributes in the procfs file
system (/proc/cpuinfo). However, a few cache attributes remain unexposed.

Provide a mechanism that the generic cacheinfo infrastructure can employ
to expose these attributes via the sysfs file system. These attributes
can then be exposed in /sys/devices/system/cpu/cpuX/cache/indexN. Move
the implementation that pulls cache attributes from the processor's
registers out of arch/openrisc/kernel/setup.c, with a few modifications.

This implementation is based on similar work done for MIPS and LoongArch.

Link: 
https://raw.githubusercontent.com/openrisc/doc/master/openrisc-arch-1.4-rev0.pdf

Signed-off-by: Sahil Siddiq 
---
Changes from v3 -> v4:
- arch/openrisc/kernel/cacheinfo.c: Fix build warning detected by
  kernel test robot.

Changes from v2 -> v3:
- arch/openrisc/kernel/cacheinfo.c:
  1. Use new functions introduced in patch #2.
  2. Address review comments regarding coding style.
- arch/openrisc/kernel/setup.c:
  (print_cpuinfo): Don't remove detection of UPR register.

 arch/openrisc/kernel/Makefile|   2 +-
 arch/openrisc/kernel/cacheinfo.c | 104 +++
 arch/openrisc/kernel/setup.c |  44 +
 3 files changed, 108 insertions(+), 42 deletions(-)
 create mode 100644 arch/openrisc/kernel/cacheinfo.c

diff --git a/arch/openrisc/kernel/Makefile b/arch/openrisc/kernel/Makefile
index 79129161f3e0..e4c7d9bdd598 100644
--- a/arch/openrisc/kernel/Makefile
+++ b/arch/openrisc/kernel/Makefile
@@ -7,7 +7,7 @@ extra-y := vmlinux.lds
 
 obj-y  := head.o setup.o or32_ksyms.o process.o dma.o \
   traps.o time.o irq.o entry.o ptrace.o signal.o \
-  sys_call_table.o unwinder.o
+  sys_call_table.o unwinder.o cacheinfo.o
 
 obj-$(CONFIG_SMP)  += smp.o sync-timer.o
 obj-$(CONFIG_STACKTRACE)   += stacktrace.o
diff --git a/arch/openrisc/kernel/cacheinfo.c b/arch/openrisc/kernel/cacheinfo.c
new file mode 100644
index ..61230545e4ff
--- /dev/null
+++ b/arch/openrisc/kernel/cacheinfo.c
@@ -0,0 +1,104 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * OpenRISC cacheinfo support
+ *
+ * Based on work done for MIPS and LoongArch. All original copyrights
+ * apply as per the original source declaration.
+ *
+ * OpenRISC implementation:
+ * Copyright (C) 2025 Sahil Siddiq 
+ */
+
+#include 
+#include 
+#include 
+#include 
+
+static inline void ci_leaf_init(struct cacheinfo *this_leaf, enum cache_type 
type,
+   unsigned int level, struct cache_desc *cache, 
int cpu)
+{
+   this_leaf->type = type;
+   this_leaf->level = level;
+   this_leaf->coherency_line_size = cache->block_size;
+   this_leaf->number_of_sets = cache->sets;
+   this_leaf->ways_of_associativity = cache->ways;
+   this_leaf->size = cache->size;
+   cpumask_set_cpu(cpu, &this_leaf->shared_cpu_map);
+}
+
+int init_cache_level(unsigned int cpu)
+{
+   struct cpuinfo_or1k *cpuinfo = &cpuinfo_or1k[smp_processor_id()];
+   struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
+   int leaves = 0, levels = 0;
+   unsigned long upr = mfspr(SPR_UPR);
+   unsigned long iccfgr, dccfgr;
+
+   if (!(upr & SPR_UPR_UP)) {
+   printk(KERN_INFO
+  "-- no UPR register... unable to detect 
configuration\n");
+   return -ENOENT;
+   }
+
+   if (cpu_cache_is_present(SPR_UPR_DCP)) {
+   dccfgr = mfspr(SPR_DCCFGR);
+   cpuinfo->dcache.ways = 1 << (dccfgr & SPR_DCCFGR_NCW);
+   cpuinfo->dcache.sets = 1 << ((dccfgr & SPR_DCCFGR_NCS) >> 3);
+   cpuinfo->dcache.block_size = 16 << ((dccfgr & SPR_DCCFGR_CBS) 
>> 7);
+   cpuinfo->dcache.size =
+   cpuinfo->dcache.sets * cpuinfo->dcache.ways * 
cpuinfo->dcache.block_size;
+   leaves += 1;
+   printk(KERN_INFO
+  "-- dcache: %d bytes total, %d bytes/line, %d set(s), %d 
way(s)\n",
+  cpuinfo->dcache.size, cpuinfo->dcache.block_size,
+  cpuinfo->dcache.sets, cpuinfo->dcache.ways);
+   } else
+   printk(KERN_INFO "-- dcache disabled\n");
+
+   if (cpu_cache_is_present(SPR_UPR_ICP)) {
+   iccfgr = mfspr(SPR_ICCFGR);
+   cpuinfo->icache.ways = 1 << (iccfgr & SPR_ICCFGR_NCW);
+   cpuinfo->icache.sets = 1 << ((iccfgr & SPR_ICCFGR_NCS) >> 3);
+   cpuinfo->icache.block_size = 16 << ((iccfgr & SPR_ICCFGR_CBS) 
>> 7);
+   cpuinfo->icache.size =
+   cpuinfo->icache.sets * cpuinfo->icache.ways * 
cpuinfo->icache.block_size;
+   leaves += 1;
+   printk(KERN_INFO
+  "-- icache: %d bytes total, %d bytes/line, %d set(s), %d 
way(s)\n",
+  cp

Re: [PATCH v2 2/2] x86/sgx: Implement EUPDATESVN and opportunistically call it during first EPC page alloc

2025-03-28 Thread Jarkko Sakkinen
On Fri, Mar 28, 2025 at 02:57:41PM +0200, Elena Reshetova wrote:
> The SGX architecture introduced a new instruction called EUPDATESVN
> in Ice Lake. It allows updating the security SVN version, given that the EPC
> is completely empty. The latter is required for security reasons,
> in order to reason that an enclave's security posture is as secure as the
> security SVN version of the TCB that created it.
> 
> Additionally it is important to ensure that while ENCLS[EUPDATESVN]
> runs, no concurrent page creation happens in EPC, because it might
> result in #GP delivered to the creator. Legacy SW might not be prepared
> to handle such unexpected #GPs and therefore this patch introduces
> a locking mechanism to ensure no concurrent EPC allocations can happen.
> 
> It is also ensured that ENCLS[EUPDATESVN] is not called when running
> in a VM, since it has no meaning in this context (applying microcode
> updates is limited to the host OS) and would create
> unnecessary load.
> 
> This patch is based on a previous submission by Cathy Zhang
> https://lore.kernel.org/all/20220520103904.1216-1-cathy.zh...@intel.com/
> 
> Signed-off-by: Elena Reshetova 
> ---
>  arch/x86/include/asm/sgx.h  | 41 +
>  arch/x86/kernel/cpu/sgx/encls.h |  6 
>  arch/x86/kernel/cpu/sgx/main.c  | 63 -
>  arch/x86/kernel/cpu/sgx/sgx.h   |  1 +
>  4 files changed, 95 insertions(+), 16 deletions(-)
> 
> diff --git a/arch/x86/include/asm/sgx.h b/arch/x86/include/asm/sgx.h
> index 6a0069761508..5caf5c31ebc6 100644
> --- a/arch/x86/include/asm/sgx.h
> +++ b/arch/x86/include/asm/sgx.h
> @@ -26,23 +26,26 @@
>  #define SGX_CPUID_EPC_SECTION0x1
>  /* The bitmask for the EPC section type. */
>  #define SGX_CPUID_EPC_MASK   GENMASK(3, 0)
> +/* EUPDATESVN presence indication */
> +#define SGX_CPUID_EUPDATESVN BIT(10)
>  
>  enum sgx_encls_function {
> - ECREATE = 0x00,
> - EADD= 0x01,
> - EINIT   = 0x02,
> - EREMOVE = 0x03,
> - EDGBRD  = 0x04,
> - EDGBWR  = 0x05,
> - EEXTEND = 0x06,
> - ELDU= 0x08,
> - EBLOCK  = 0x09,
> - EPA = 0x0A,
> - EWB = 0x0B,
> - ETRACK  = 0x0C,
> - EAUG= 0x0D,
> - EMODPR  = 0x0E,
> - EMODT   = 0x0F,
> + ECREATE = 0x00,
> + EADD= 0x01,
> + EINIT   = 0x02,
> + EREMOVE = 0x03,
> + EDGBRD  = 0x04,
> + EDGBWR  = 0x05,
> + EEXTEND = 0x06,
> + ELDU= 0x08,
> + EBLOCK  = 0x09,
> + EPA = 0x0A,
> + EWB = 0x0B,
> + ETRACK  = 0x0C,
> + EAUG= 0x0D,
> + EMODPR  = 0x0E,
> + EMODT   = 0x0F,
> + EUPDATESVN  = 0x18,
>  };
>  
>  /**
> @@ -73,6 +76,11 @@ enum sgx_encls_function {
>   *   public key does not match IA32_SGXLEPUBKEYHASH.
>   * %SGX_PAGE_NOT_MODIFIABLE: The EPC page cannot be modified because it
>   *   is in the PENDING or MODIFIED state.
> + * %SGX_INSUFFICIENT_ENTROPY:Insufficient entropy in RNG.
> + * %SGX_EPC_NOT_READY:   EPC is not ready for SVN update.
> + * %SGX_NO_UPDATE:   EUPDATESVN was successful, but CPUSVN was not
> + *   updated because current SVN was not newer than
> + *   CPUSVN.
>   * %SGX_UNMASKED_EVENT:  An unmasked event, e.g. INTR, was 
> received
>   */
>  enum sgx_return_code {
> @@ -81,6 +89,9 @@ enum sgx_return_code {
>   SGX_CHILD_PRESENT   = 13,
>   SGX_INVALID_EINITTOKEN  = 16,
>   SGX_PAGE_NOT_MODIFIABLE = 20,
> + SGX_INSUFFICIENT_ENTROPY= 29,
> + SGX_EPC_NOT_READY   = 30,
> + SGX_NO_UPDATE   = 31,
>   SGX_UNMASKED_EVENT  = 128,
>  };
>  
> diff --git a/arch/x86/kernel/cpu/sgx/encls.h b/arch/x86/kernel/cpu/sgx/encls.h
> index 99004b02e2ed..3d83c76dc91f 100644
> --- a/arch/x86/kernel/cpu/sgx/encls.h
> +++ b/arch/x86/kernel/cpu/sgx/encls.h
> @@ -233,4 +233,10 @@ static inline int __eaug(struct sgx_pageinfo *pginfo, 
> void *addr)
>   return __encls_2(EAUG, pginfo, addr);
>  }
>  
> +/* Update CPUSVN at runtime. */
> +static inline int __eupdatesvn(void)
> +{
> + return __encls_ret_1(EUPDATESVN, "");
> +}
> +
>  #endif /* _X86_ENCLS_H */
> diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
> index b61d3bad0446..24563110811d 100644
> --- a/arch/x86/kernel/cpu/sgx/main.c
> +++ b/arch/x86/kernel/cpu/sgx/main.c
> @@ -32,6 +32,11 @@ static DEFINE_XARRAY(sgx_epc_address_space);
>  static LIST_HEAD(sgx_active_page_list);
>  static DEFINE_SPINLOCK(sgx_reclaimer_lock);
>  
> +/* This lock is held to prevent new EPC pages from being created
> + * during the execution of ENCLS[EUPDATESVN].
> + */
> +static DEFINE_SPINLOCK(sgx_epc_eupdatesvn_lock);
> +
>  static atomic_long_t sgx_nr_used_pages = ATOMIC_LONG_IN

Re: [PATCH] selftests/run_kselftest.sh: Use readlink if realpath is not available

2025-03-28 Thread Shuah Khan

On 3/18/25 10:05, Yosry Ahmed wrote:

'realpath' is not always available; fall back to 'readlink -f' if it is not.
They seem to work equally well in this context.


Can you add more specifics on when "realpath" is not available?

No issues with the patch itself. I would like to know the cases
where "realpath" command is missing.




Signed-off-by: Yosry Ahmed 
---
  tools/testing/selftests/run_kselftest.sh | 9 -
  1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/run_kselftest.sh 
b/tools/testing/selftests/run_kselftest.sh
index 50e03eefe7ac7..0443beacf3621 100755
--- a/tools/testing/selftests/run_kselftest.sh
+++ b/tools/testing/selftests/run_kselftest.sh
@@ -3,7 +3,14 @@
  #
  # Run installed kselftest tests.
  #
-BASE_DIR=$(realpath $(dirname $0))
+
+# Fallback to readlink if realpath is not available
+if which realpath > /dev/null; then
+BASE_DIR=$(realpath $(dirname $0))
+else
+BASE_DIR=$(readlink -f $(dirname $0))
+fi
+
  cd $BASE_DIR
  TESTS="$BASE_DIR"/kselftest-list.txt
  if [ ! -r "$TESTS" ] ; then
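As a side note, a variant that avoids `which` (itself not always available) in
favor of the POSIX builtin `command -v` could look like the following sketch
(not part of the patch; the helper name is made up):

```shell
#!/bin/sh
# Sketch: resolve the script's directory, preferring realpath and
# falling back to readlink -f, probing with the POSIX builtin command -v.
resolve_dir() {
    if command -v realpath > /dev/null 2>&1; then
        realpath "$(dirname "$1")"
    else
        readlink -f "$(dirname "$1")"
    fi
}

BASE_DIR=$(resolve_dir "$0")
echo "$BASE_DIR"
```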


thanks,
-- Shuah




[PATCH v4 1/3] openrisc: Refactor struct cpuinfo_or1k to reduce duplication

2025-03-28 Thread Sahil Siddiq
The "cpuinfo_or1k" structure currently has identical data members for
different cache components.

Move these fields out of struct cpuinfo_or1k and into their own struct.
This reduces duplication while keeping cpuinfo_or1k extensible so more
cache descriptors can be added in the future.

Also add a new field "sets" to the new structure.

Signed-off-by: Sahil Siddiq 
---
No changes from v3 -> v4.

Changes from v1/v2 -> v3:
- arch/openrisc/kernel/setup.c:
  (print_cpuinfo):
  1. Cascade changes made to struct cpuinfo_or1k.
  2. These lines are ultimately shifted to the new file created in
 patch #3.
  (setup_cpuinfo): Likewise.
  (show_cpuinfo): Likewise.

 arch/openrisc/include/asm/cpuinfo.h | 16 +-
 arch/openrisc/kernel/setup.c| 45 ++---
 2 files changed, 31 insertions(+), 30 deletions(-)

diff --git a/arch/openrisc/include/asm/cpuinfo.h 
b/arch/openrisc/include/asm/cpuinfo.h
index 5e4744153d0e..82f5d4c06314 100644
--- a/arch/openrisc/include/asm/cpuinfo.h
+++ b/arch/openrisc/include/asm/cpuinfo.h
@@ -15,16 +15,18 @@
 #ifndef __ASM_OPENRISC_CPUINFO_H
 #define __ASM_OPENRISC_CPUINFO_H
 
+struct cache_desc {
+   u32 size;
+   u32 sets;
+   u32 block_size;
+   u32 ways;
+};
+
 struct cpuinfo_or1k {
u32 clock_frequency;
 
-   u32 icache_size;
-   u32 icache_block_size;
-   u32 icache_ways;
-
-   u32 dcache_size;
-   u32 dcache_block_size;
-   u32 dcache_ways;
+   struct cache_desc icache;
+   struct cache_desc dcache;
 
u16 coreid;
 };
diff --git a/arch/openrisc/kernel/setup.c b/arch/openrisc/kernel/setup.c
index be56eaafc8b9..66207cd7bb9e 100644
--- a/arch/openrisc/kernel/setup.c
+++ b/arch/openrisc/kernel/setup.c
@@ -115,16 +115,16 @@ static void print_cpuinfo(void)
 
if (upr & SPR_UPR_DCP)
printk(KERN_INFO
-  "-- dcache: %4d bytes total, %2d bytes/line, %d 
way(s)\n",
-  cpuinfo->dcache_size, cpuinfo->dcache_block_size,
-  cpuinfo->dcache_ways);
+  "-- dcache: %4d bytes total, %2d bytes/line, %d set(s), 
%d way(s)\n",
+  cpuinfo->dcache.size, cpuinfo->dcache.block_size,
+  cpuinfo->dcache.sets, cpuinfo->dcache.ways);
else
printk(KERN_INFO "-- dcache disabled\n");
if (upr & SPR_UPR_ICP)
printk(KERN_INFO
-  "-- icache: %4d bytes total, %2d bytes/line, %d 
way(s)\n",
-  cpuinfo->icache_size, cpuinfo->icache_block_size,
-  cpuinfo->icache_ways);
+  "-- icache: %4d bytes total, %2d bytes/line, %d set(s), 
%d way(s)\n",
+  cpuinfo->icache.size, cpuinfo->icache.block_size,
+  cpuinfo->icache.sets, cpuinfo->icache.ways);
else
printk(KERN_INFO "-- icache disabled\n");
 
@@ -156,7 +156,6 @@ void __init setup_cpuinfo(void)
 {
struct device_node *cpu;
unsigned long iccfgr, dccfgr;
-   unsigned long cache_set_size;
int cpu_id = smp_processor_id();
struct cpuinfo_or1k *cpuinfo = &cpuinfo_or1k[cpu_id];
 
@@ -165,18 +164,18 @@ void __init setup_cpuinfo(void)
panic("Couldn't find CPU%d in device tree...\n", cpu_id);
 
iccfgr = mfspr(SPR_ICCFGR);
-   cpuinfo->icache_ways = 1 << (iccfgr & SPR_ICCFGR_NCW);
-   cache_set_size = 1 << ((iccfgr & SPR_ICCFGR_NCS) >> 3);
-   cpuinfo->icache_block_size = 16 << ((iccfgr & SPR_ICCFGR_CBS) >> 7);
-   cpuinfo->icache_size =
-   cache_set_size * cpuinfo->icache_ways * cpuinfo->icache_block_size;
+   cpuinfo->icache.ways = 1 << (iccfgr & SPR_ICCFGR_NCW);
+   cpuinfo->icache.sets = 1 << ((iccfgr & SPR_ICCFGR_NCS) >> 3);
+   cpuinfo->icache.block_size = 16 << ((iccfgr & SPR_ICCFGR_CBS) >> 7);
+   cpuinfo->icache.size =
+   cpuinfo->icache.sets * cpuinfo->icache.ways * 
cpuinfo->icache.block_size;
 
dccfgr = mfspr(SPR_DCCFGR);
-   cpuinfo->dcache_ways = 1 << (dccfgr & SPR_DCCFGR_NCW);
-   cache_set_size = 1 << ((dccfgr & SPR_DCCFGR_NCS) >> 3);
-   cpuinfo->dcache_block_size = 16 << ((dccfgr & SPR_DCCFGR_CBS) >> 7);
-   cpuinfo->dcache_size =
-   cache_set_size * cpuinfo->dcache_ways * cpuinfo->dcache_block_size;
+   cpuinfo->dcache.ways = 1 << (dccfgr & SPR_DCCFGR_NCW);
+   cpuinfo->dcache.sets = 1 << ((dccfgr & SPR_DCCFGR_NCS) >> 3);
+   cpuinfo->dcache.block_size = 16 << ((dccfgr & SPR_DCCFGR_CBS) >> 7);
+   cpuinfo->dcache.size =
+   cpuinfo->dcache.sets * cpuinfo->dcache.ways * cpuinfo->dcache.block_size;
 
if (of_property_read_u32(cpu, "clock-frequency",
 &cpuinfo->clock_frequency)) {
@@ -320,14 +319,14 @@ static int show_cpuinfo(struct seq_file *m, void *v)
seq_printf(m, "revision\t\t: %d\n", vr & SPR_VR_REV);
}
seq_pr

[PATCH v4 2/3] openrisc: Introduce new utility functions to flush and invalidate caches

2025-03-28 Thread Sahil Siddiq
According to the OpenRISC architecture manual, the dcache and icache may
not be present. When these caches are present, the invalidate and flush
registers may be absent. The current implementation does not perform
checks to verify their presence before utilizing cache registers, or
invalidating and flushing cache blocks.

Introduce new functions to detect the presence of cache components and
related special-purpose registers.

There are a few places where a range of addresses has to be flushed or
invalidated and the implementation is duplicated. Introduce new utility
functions and macros that generalize this implementation and reduce
duplication.

Signed-off-by: Sahil Siddiq 
---
Changes from v3 -> v4:
- arch/openrisc/include/asm/cpuinfo.h: Move new definitions to cache.c.
- arch/openrisc/mm/cache.c:
  (cache_loop): Split function.
  (cache_loop_page): New function.
  (cpu_cache_is_present): Move definition here.
  (cb_inv_flush_is_implemented): Move definition here.

Changes from v2 -> v3:
- arch/openrisc/include/asm/cacheflush.h: Declare new functions and macros.
- arch/openrisc/include/asm/cpuinfo.h: Implement new functions.
  (cpu_cache_is_present):
  1. The implementation of this function was strewn all over the place in
 the previous versions.
  2. Fix condition. The condition in the previous version was incorrect.
  (cb_inv_flush_is_implemented): New function.
- arch/openrisc/kernel/dma.c: Use new functions.
- arch/openrisc/mm/cache.c:
  (cache_loop): Extend function.
  (local_*_page_*): Use new cache_loop interface.
  (local_*_range_*): Implement new functions.
- arch/openrisc/mm/init.c: Use new functions.

 arch/openrisc/include/asm/cacheflush.h | 17 +
 arch/openrisc/include/asm/cpuinfo.h| 15 +
 arch/openrisc/kernel/dma.c | 18 ++
 arch/openrisc/mm/cache.c   | 87 +++---
 arch/openrisc/mm/init.c|  5 +-
 5 files changed, 118 insertions(+), 24 deletions(-)

diff --git a/arch/openrisc/include/asm/cacheflush.h b/arch/openrisc/include/asm/cacheflush.h
index 984c331ff5f4..0e60af486ec1 100644
--- a/arch/openrisc/include/asm/cacheflush.h
+++ b/arch/openrisc/include/asm/cacheflush.h
@@ -23,6 +23,9 @@
  */
 extern void local_dcache_page_flush(struct page *page);
 extern void local_icache_page_inv(struct page *page);
+extern void local_dcache_range_flush(unsigned long start, unsigned long end);
+extern void local_dcache_range_inv(unsigned long start, unsigned long end);
+extern void local_icache_range_inv(unsigned long start, unsigned long end);
 
 /*
  * Data cache flushing always happen on the local cpu. Instruction cache
@@ -38,6 +41,20 @@ extern void local_icache_page_inv(struct page *page);
 extern void smp_icache_page_inv(struct page *page);
 #endif /* CONFIG_SMP */
 
+/*
+ * Even if the actual block size is larger than L1_CACHE_BYTES, paddr
+ * can be incremented by L1_CACHE_BYTES. When paddr is written to the
+ * invalidate register, the entire cache line encompassing this address
+ * is invalidated. Each subsequent reference to the same cache line will
+ * not affect the invalidation process.
+ */
+#define local_dcache_block_flush(addr) \
+   local_dcache_range_flush(addr, addr + L1_CACHE_BYTES)
+#define local_dcache_block_inv(addr) \
+   local_dcache_range_inv(addr, addr + L1_CACHE_BYTES)
+#define local_icache_block_inv(addr) \
+   local_icache_range_inv(addr, addr + L1_CACHE_BYTES)
+
 /*
  * Synchronizes caches. Whenever a cpu writes executable code to memory, this
  * should be called to make sure the processor sees the newly written code.
diff --git a/arch/openrisc/include/asm/cpuinfo.h b/arch/openrisc/include/asm/cpuinfo.h
index 82f5d4c06314..e46afbfe9b5a 100644
--- a/arch/openrisc/include/asm/cpuinfo.h
+++ b/arch/openrisc/include/asm/cpuinfo.h
@@ -15,6 +15,9 @@
 #ifndef __ASM_OPENRISC_CPUINFO_H
 #define __ASM_OPENRISC_CPUINFO_H
 
+#include 
+#include 
+
 struct cache_desc {
u32 size;
u32 sets;
@@ -34,4 +37,16 @@ struct cpuinfo_or1k {
 extern struct cpuinfo_or1k cpuinfo_or1k[NR_CPUS];
 extern void setup_cpuinfo(void);
 
+/*
+ * Check if the cache component exists.
+ */
+extern bool cpu_cache_is_present(const unsigned int cache_type);
+
+/*
+ * Check if the cache block flush/invalidate register is implemented for the
+ * cache component.
+ */
+extern bool cb_inv_flush_is_implemented(const unsigned int reg,
+   const unsigned int cache_type);
+
 #endif /* __ASM_OPENRISC_CPUINFO_H */
diff --git a/arch/openrisc/kernel/dma.c b/arch/openrisc/kernel/dma.c
index b3edbb33b621..3a7b5baaa450 100644
--- a/arch/openrisc/kernel/dma.c
+++ b/arch/openrisc/kernel/dma.c
@@ -17,6 +17,7 @@
 #include 
 
 #include 
+#include 
 #include 
 #include 
 
@@ -24,9 +25,6 @@ static int
 page_set_nocache(pte_t *pte, unsigned long addr,
 unsigned long next, struct mm_walk *walk)
 {
-   unsigned long cl;
-   struct cpuinfo_or1k *cpuinfo = &cpuinfo_or1k[smp_proc

Re: [PATCH v4 2/3] openrisc: Introduce new utility functions to flush and invalidate caches

2025-03-28 Thread Stafford Horne
On Sat, Mar 29, 2025 at 01:56:31AM +0530, Sahil Siddiq wrote:
> According to the OpenRISC architecture manual, the dcache and icache may
> not be present. When these caches are present, the invalidate and flush
> registers may be absent. The current implementation does not perform
> checks to verify their presence before utilizing cache registers, or
> invalidating and flushing cache blocks.
> 
> Introduce new functions to detect the presence of cache components and
> related special-purpose registers.
> 
> There are a few places where a range of addresses has to be flushed or
> invalidated and the implementation is duplicated. Introduce new utility
> functions and macros that generalize this implementation and reduce
> duplication.
> 
> Signed-off-by: Sahil Siddiq 
> ---
> Changes from v3 -> v4:
> - arch/openrisc/include/asm/cpuinfo.h: Move new definitions to cache.c.
> - arch/openrisc/mm/cache.c:
>   (cache_loop): Split function.
>   (cache_loop_page): New function.
>   (cpu_cache_is_present): Move definition here.
>   (cb_inv_flush_is_implemented): Move definition here.
> 
> Changes from v2 -> v3:
> - arch/openrisc/include/asm/cacheflush.h: Declare new functions and macros.
> - arch/openrisc/include/asm/cpuinfo.h: Implement new functions.
>   (cpu_cache_is_present):
>   1. The implementation of this function was strewn all over the place in
>  the previous versions.
>   2. Fix condition. The condition in the previous version was incorrect.
>   (cb_inv_flush_is_implemented): New function.
> - arch/openrisc/kernel/dma.c: Use new functions.
> - arch/openrisc/mm/cache.c:
>   (cache_loop): Extend function.
>   (local_*_page_*): Use new cache_loop interface.
>   (local_*_range_*): Implement new functions.
> - arch/openrisc/mm/init.c: Use new functions.
> 
>  arch/openrisc/include/asm/cacheflush.h | 17 +
>  arch/openrisc/include/asm/cpuinfo.h| 15 +
>  arch/openrisc/kernel/dma.c | 18 ++
>  arch/openrisc/mm/cache.c   | 87 +++---
>  arch/openrisc/mm/init.c|  5 +-
>  5 files changed, 118 insertions(+), 24 deletions(-)
> 
> diff --git a/arch/openrisc/include/asm/cacheflush.h b/arch/openrisc/include/asm/cacheflush.h
> index 984c331ff5f4..0e60af486ec1 100644
> --- a/arch/openrisc/include/asm/cacheflush.h
> +++ b/arch/openrisc/include/asm/cacheflush.h
> @@ -23,6 +23,9 @@
>   */
>  extern void local_dcache_page_flush(struct page *page);
>  extern void local_icache_page_inv(struct page *page);
> +extern void local_dcache_range_flush(unsigned long start, unsigned long end);
> +extern void local_dcache_range_inv(unsigned long start, unsigned long end);
> +extern void local_icache_range_inv(unsigned long start, unsigned long end);
>  
>  /*
>   * Data cache flushing always happen on the local cpu. Instruction cache
> @@ -38,6 +41,20 @@ extern void local_icache_page_inv(struct page *page);
>  extern void smp_icache_page_inv(struct page *page);
>  #endif /* CONFIG_SMP */
>  
> +/*
> + * Even if the actual block size is larger than L1_CACHE_BYTES, paddr
> + * can be incremented by L1_CACHE_BYTES. When paddr is written to the
> + * invalidate register, the entire cache line encompassing this address
> + * is invalidated. Each subsequent reference to the same cache line will
> + * not affect the invalidation process.
> + */
> +#define local_dcache_block_flush(addr) \
> + local_dcache_range_flush(addr, addr + L1_CACHE_BYTES)
> +#define local_dcache_block_inv(addr) \
> + local_dcache_range_inv(addr, addr + L1_CACHE_BYTES)
> +#define local_icache_block_inv(addr) \
> + local_icache_range_inv(addr, addr + L1_CACHE_BYTES)
> +
>  /*
>   * Synchronizes caches. Whenever a cpu writes executable code to memory, this
>   * should be called to make sure the processor sees the newly written code.
> diff --git a/arch/openrisc/include/asm/cpuinfo.h b/arch/openrisc/include/asm/cpuinfo.h
> index 82f5d4c06314..e46afbfe9b5a 100644
> --- a/arch/openrisc/include/asm/cpuinfo.h
> +++ b/arch/openrisc/include/asm/cpuinfo.h
> @@ -15,6 +15,9 @@
>  #ifndef __ASM_OPENRISC_CPUINFO_H
>  #define __ASM_OPENRISC_CPUINFO_H
>  
> +#include 
> +#include 
> +
>  struct cache_desc {
>   u32 size;
>   u32 sets;
> @@ -34,4 +37,16 @@ struct cpuinfo_or1k {
>  extern struct cpuinfo_or1k cpuinfo_or1k[NR_CPUS];
>  extern void setup_cpuinfo(void);
>  
> +/*
> + * Check if the cache component exists.
> + */
> +extern bool cpu_cache_is_present(const unsigned int cache_type);

This is used in cacheinfo.  OK.

> +/*
> + * Check if the cache block flush/invalidate register is implemented for the
> + * cache component.
> + */
> +extern bool cb_inv_flush_is_implemented(const unsigned int reg,
> + const unsigned int cache_type);

But this function doesn't seem to be used anywhere but in cache.c. Does it need
to be public?

>  #endif /* __ASM_OPENRISC_CPUINFO_H */
> diff --git a/arch/openrisc/kernel/dma.c b/arch/openrisc/kernel/dma.c
> index b3

[GIT PULL] remoteproc updates for v6.15

2025-03-28 Thread Bjorn Andersson


The following changes since commit a64dcfb451e254085a7daee5fe51bf22959d52d3:

  Linux 6.14-rc2 (2025-02-09 12:45:03 -0800)

are available in the Git repository at:

  https://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux.git tags/rproc-v6.15

for you to fetch changes up to e917b73234b02aa4966325e7380d2559bf127ba9:

  remoteproc: qcom_q6v5_pas: Make single-PD handling more robust (2025-03-22 08:42:39 -0500)


remoteproc updates for v6.15

The i.MX8MP DSP remoteproc driver is transitioned to use the reset
framework for driving the run/stall reset bits.

Support for managing the modem remote processor on the Qualcomm MSM8226,
MSM8926, and SM8750 platforms is added.


Dan Carpenter (1):
  remoteproc: sysmon: Update qcom_add_sysmon_subdev() comment

Daniel Baluta (8):
  dt-bindings: reset: audiomix: Add reset ids for EARC and DSP
  dt-bindings: dsp: fsl,dsp: Add resets property
  reset: imx8mp-audiomix: Add prefix for internal macro
  reset: imx8mp-audiomix: Prepare the code for more reset bits
  reset: imx8mp-audiomix: Introduce active_low configuration option
  reset: imx8mp-audiomix: Add support for DSP run/stall
  imx_dsp_rproc: Use reset controller API to control the DSP
  remoteproc: imx_dsp_rproc: Document run_stall struct member

Jiri Slaby (SUSE) (1):
  irqdomain: remoteproc: Switch to of_fwnode_handle()

Konrad Dybcio (1):
  dt-bindings: remoteproc: Consolidate SC8180X and SM8150 PAS files

Krzysztof Kozlowski (4):
  dt-bindings: remoteproc: qcom,sm6115-pas: Use recommended MBN firmware format in DTS example
  dt-bindings: remoteproc: Add SM8750 CDSP
  dt-bindings: remoteproc: Add SM8750 MPSS
  remoteproc: qcom: pas: Add SM8750 MPSS

Luca Weiss (7):
  dt-bindings: remoteproc: qcom,msm8916-mss-pil: Add MSM8926
  remoteproc: qcom_q6v5_mss: Handle platforms with one power domain
  remoteproc: qcom_q6v5_mss: Add modem support on MSM8226
  remoteproc: qcom_q6v5_mss: Add modem support on MSM8926
  remoteproc: qcom: pas: add minidump_id to SC7280 WPSS
  remoteproc: qcom_q6v5_pas: Use resource with CX PD for MSM8226
  remoteproc: qcom_q6v5_pas: Make single-PD handling more robust

Matti Lehtimäki (4):
  dt-bindings: remoteproc: qcom,msm8916-mss-pil: Support platforms with one power domain
  dt-bindings: remoteproc: qcom,msm8916-mss-pil: Add MSM8226
  dt-bindings: remoteproc: qcom,wcnss-pil: Add support for single power-domain platforms
  remoteproc: qcom_wcnss: Handle platforms with only single power domain

Peng Fan (2):
  remoteproc: omap: Add comment for is_iomem
  remoteproc: core: Clear table_sz when rproc_shutdown

 Documentation/devicetree/bindings/dsp/fsl,dsp.yaml |  24 ++-
 .../bindings/remoteproc/qcom,msm8916-mss-pil.yaml  |  64 ++-
 .../bindings/remoteproc/qcom,sc8180x-pas.yaml  |  96 ---
 .../bindings/remoteproc/qcom,sm6115-pas.yaml   |   2 +-
 .../bindings/remoteproc/qcom,sm8150-pas.yaml   |   7 +
 .../bindings/remoteproc/qcom,sm8550-pas.yaml   |  46 +-
 .../bindings/remoteproc/qcom,wcnss-pil.yaml|  45 -
 drivers/remoteproc/imx_dsp_rproc.c |  26 ++-
 drivers/remoteproc/imx_rproc.h |   2 +
 drivers/remoteproc/omap_remoteproc.c   |   1 +
 drivers/remoteproc/pru_rproc.c |   2 +-
 drivers/remoteproc/qcom_q6v5_mss.c | 184 -
 drivers/remoteproc/qcom_q6v5_pas.c |  38 -
 drivers/remoteproc/qcom_sysmon.c   |   2 +-
 drivers/remoteproc/qcom_wcnss.c|  33 +++-
 drivers/remoteproc/remoteproc_core.c   |   1 +
 drivers/reset/reset-imx8mp-audiomix.c  |  78 ++---
 include/dt-bindings/reset/imx8mp-reset-audiomix.h  |  13 ++
 18 files changed, 499 insertions(+), 165 deletions(-)
 delete mode 100644 Documentation/devicetree/bindings/remoteproc/qcom,sc8180x-pas.yaml
 create mode 100644 include/dt-bindings/reset/imx8mp-reset-audiomix.h



[PATCH v4 0/3] openrisc: Add cacheinfo support and introduce new utility functions

2025-03-28 Thread Sahil Siddiq
Hi,

The main purpose of this series is to expose CPU cache attributes for
OpenRISC in sysfs using the cacheinfo API. The core implementation
to achieve this is in patch #3. Patch #1 and #2 add certain enhancements
to simplify the implementation of cacheinfo support.

Patch #1 removes duplication of cache-related data members in struct
cpuinfo_or1k.

Patch #2 introduces several utility functions. One set of functions is
used to check if the cache components and SPRs exist before attempting
to use them. The other set provides a convenient interface to flush or
invalidate a range of cache blocks.

This version addresses review comments posted in response to v3. In
commit #2, I chose not to make "cache_loop_page()" inline after reading
point 15 in the coding style doc [1]. Let me know if it should be made
inline.

Thanks,
Sahil

[1] https://www.kernel.org/doc/html/latest/process/coding-style.html

Sahil Siddiq (3):
  openrisc: Refactor struct cpuinfo_or1k to reduce duplication
  openrisc: Introduce new utility functions to flush and invalidate
caches
  openrisc: Add cacheinfo support

 arch/openrisc/include/asm/cacheflush.h |  17 
 arch/openrisc/include/asm/cpuinfo.h|  31 ++--
 arch/openrisc/kernel/Makefile  |   2 +-
 arch/openrisc/kernel/cacheinfo.c   | 104 +
 arch/openrisc/kernel/dma.c |  18 +
 arch/openrisc/kernel/setup.c   |  45 +--
 arch/openrisc/mm/cache.c   |  87 +++--
 arch/openrisc/mm/init.c|   5 +-
 8 files changed, 235 insertions(+), 74 deletions(-)
 create mode 100644 arch/openrisc/kernel/cacheinfo.c


base-commit: ea1413e5b53a8dd4fa7675edb23cdf828bbdce1e
-- 
2.48.1




[PATCH net 0/4] mptcp: misc. fixes for 6.15-rc0

2025-03-28 Thread Matthieu Baerts (NGI0)
Here are 4 unrelated patches:

- Patch 1: fix a NULL pointer when two SYN-ACK for the same request are
  handled in parallel. A fix for up to v5.9.

- Patch 2: selftests: fix check for the wrong FD. A fix for up to v5.17.

- Patch 3: selftests: close all FDs in case of error. A fix for up to
  v5.17.

- Patch 4: selftests: ignore a new generated file. A fix for 6.15-rc0.

Signed-off-by: Matthieu Baerts (NGI0) 
---
Cong Liu (1):
  selftests: mptcp: fix incorrect fd checks in main_loop

Gang Yan (1):
  mptcp: fix NULL pointer in can_accept_new_subflow

Geliang Tang (1):
  selftests: mptcp: close fd_in before returning in main_loop

Matthieu Baerts (NGI0) (1):
  selftests: mptcp: ignore mptcp_diag binary

 net/mptcp/subflow.c   | 15 ---
 tools/testing/selftests/net/mptcp/.gitignore  |  1 +
 tools/testing/selftests/net/mptcp/mptcp_connect.c | 11 +++
 3 files changed, 16 insertions(+), 11 deletions(-)
---
base-commit: 2ea396448f26d0d7d66224cb56500a6789c7ed07
change-id: 20250328-net-mptcp-misc-fixes-6-15-98bfbeaa15ac

Best regards,
-- 
Matthieu Baerts (NGI0) 




[PATCH net 1/4] mptcp: fix NULL pointer in can_accept_new_subflow

2025-03-28 Thread Matthieu Baerts (NGI0)
From: Gang Yan 

When testing valkey benchmark tool with MPTCP, the kernel panics in
'mptcp_can_accept_new_subflow' because subflow_req->msk is NULL.

Call trace:

  mptcp_can_accept_new_subflow (./net/mptcp/subflow.c:63 (discriminator 4)) (P)
  subflow_syn_recv_sock (./net/mptcp/subflow.c:854)
  tcp_check_req (./net/ipv4/tcp_minisocks.c:863)
  tcp_v4_rcv (./net/ipv4/tcp_ipv4.c:2268)
  ip_protocol_deliver_rcu (./net/ipv4/ip_input.c:207)
  ip_local_deliver_finish (./net/ipv4/ip_input.c:234)
  ip_local_deliver (./net/ipv4/ip_input.c:254)
  ip_rcv_finish (./net/ipv4/ip_input.c:449)
  ...

According to the debug log, the same req received two SYN-ACK in a very
short time, very likely because the client retransmits the syn ack due
to multiple reasons.

Even if the packets are transmitted with a relevant time interval, they
can be processed by the server on different CPUs concurrently. The
'subflow_req->msk' ownership is transferred to the subflow the first time,
and there will then be a risk of a NULL pointer dereference here.

This patch fixes this issue by moving the 'subflow_req->msk' under the
`own_req == true` conditional.

Note that the !msk check in subflow_hmac_valid() can be dropped, because
the same check already exists under the own_req mpj branch where the
code has been moved to.

Fixes: 9466a1ccebbe ("mptcp: enable JOIN requests even if cookies are in use")
Cc: sta...@vger.kernel.org
Suggested-by: Paolo Abeni 
Signed-off-by: Gang Yan 
Reviewed-by: Matthieu Baerts (NGI0) 
Signed-off-by: Matthieu Baerts (NGI0) 
---
 net/mptcp/subflow.c | 15 ---
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index efe8d86496dbd06a3c4cae6ffc6462e43e42c959..409bd415ef1d190d5599658d01323ad8c8a9be93 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -754,8 +754,6 @@ static bool subflow_hmac_valid(const struct request_sock *req,
 
subflow_req = mptcp_subflow_rsk(req);
msk = subflow_req->msk;
-   if (!msk)
-   return false;
 
subflow_generate_hmac(READ_ONCE(msk->remote_key),
  READ_ONCE(msk->local_key),
@@ -850,12 +848,8 @@ static struct sock *subflow_syn_recv_sock(const struct sock *sk,
 
} else if (subflow_req->mp_join) {
mptcp_get_options(skb, &mp_opt);
-   if (!(mp_opt.suboptions & OPTION_MPTCP_MPJ_ACK) ||
-   !subflow_hmac_valid(req, &mp_opt) ||
-   !mptcp_can_accept_new_subflow(subflow_req->msk)) {
-   SUBFLOW_REQ_INC_STATS(req, MPTCP_MIB_JOINACKMAC);
+   if (!(mp_opt.suboptions & OPTION_MPTCP_MPJ_ACK))
fallback = true;
-   }
}
 
 create_child:
@@ -905,6 +899,13 @@ static struct sock *subflow_syn_recv_sock(const struct sock *sk,
goto dispose_child;
}
 
+   if (!subflow_hmac_valid(req, &mp_opt) ||
+   !mptcp_can_accept_new_subflow(subflow_req->msk)) {
+   SUBFLOW_REQ_INC_STATS(req, MPTCP_MIB_JOINACKMAC);
+   subflow_add_reset_reason(skb, MPTCP_RST_EPROHIBIT);
+   goto dispose_child;
+   }
+
/* move the msk reference ownership to the subflow */
subflow_req->msk = NULL;
ctx->conn = (struct sock *)owner;

-- 
2.48.1




[PATCH net 2/4] selftests: mptcp: fix incorrect fd checks in main_loop

2025-03-28 Thread Matthieu Baerts (NGI0)
From: Cong Liu 

Fix a bug where the code was checking the wrong file descriptors
when opening the input files. The code was checking 'fd' instead
of 'fd_in', which could lead to incorrect error handling.

Fixes: 05be5e273c84 ("selftests: mptcp: add disconnect tests")
Cc: sta...@vger.kernel.org
Fixes: ca7ae8916043 ("selftests: mptcp: mptfo Initiator/Listener")
Co-developed-by: Geliang Tang 
Signed-off-by: Geliang Tang 
Signed-off-by: Cong Liu 
Reviewed-by: Matthieu Baerts (NGI0) 
Signed-off-by: Matthieu Baerts (NGI0) 
---
 tools/testing/selftests/net/mptcp/mptcp_connect.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/net/mptcp/mptcp_connect.c b/tools/testing/selftests/net/mptcp/mptcp_connect.c
index d240d02fa443a1cd802f0e705ab36db5c22063a8..893dc36b12f607bec56a41c9961eff272a7837c7 100644
--- a/tools/testing/selftests/net/mptcp/mptcp_connect.c
+++ b/tools/testing/selftests/net/mptcp/mptcp_connect.c
@@ -1270,7 +1270,7 @@ int main_loop(void)
 
if (cfg_input && cfg_sockopt_types.mptfo) {
fd_in = open(cfg_input, O_RDONLY);
-   if (fd < 0)
+   if (fd_in < 0)
xerror("can't open %s:%d", cfg_input, errno);
}
 
@@ -1293,7 +1293,7 @@ int main_loop(void)
 
if (cfg_input && !cfg_sockopt_types.mptfo) {
fd_in = open(cfg_input, O_RDONLY);
-   if (fd < 0)
+   if (fd_in < 0)
xerror("can't open %s:%d", cfg_input, errno);
}
 

-- 
2.48.1




[PATCH net 3/4] selftests: mptcp: close fd_in before returning in main_loop

2025-03-28 Thread Matthieu Baerts (NGI0)
From: Geliang Tang 

The file descriptor 'fd_in' is opened when cfg_input is configured, but
not closed in main_loop(), this patch fixes it.

Fixes: 05be5e273c84 ("selftests: mptcp: add disconnect tests")
Cc: sta...@vger.kernel.org
Co-developed-by: Cong Liu 
Signed-off-by: Cong Liu 
Signed-off-by: Geliang Tang 
Reviewed-by: Matthieu Baerts (NGI0) 
Signed-off-by: Matthieu Baerts (NGI0) 
---
 tools/testing/selftests/net/mptcp/mptcp_connect.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/net/mptcp/mptcp_connect.c b/tools/testing/selftests/net/mptcp/mptcp_connect.c
index 893dc36b12f607bec56a41c9961eff272a7837c7..c83a8b47bbdfa5fcf1462e2b2949b41fd32c9b14 100644
--- a/tools/testing/selftests/net/mptcp/mptcp_connect.c
+++ b/tools/testing/selftests/net/mptcp/mptcp_connect.c
@@ -1299,7 +1299,7 @@ int main_loop(void)
 
ret = copyfd_io(fd_in, fd, 1, 0, &winfo);
if (ret)
-   return ret;
+   goto out;
 
if (cfg_truncate > 0) {
shutdown(fd, SHUT_WR);
@@ -1320,7 +1320,10 @@ int main_loop(void)
close(fd);
}
 
-   return 0;
+out:
+   if (cfg_input)
+   close(fd_in);
+   return ret;
 }
 
 int parse_proto(const char *proto)

-- 
2.48.1




[GIT PULL] hwspinlock updates for v6.15

2025-03-28 Thread Bjorn Andersson


The following changes since commit a64dcfb451e254085a7daee5fe51bf22959d52d3:

  Linux 6.14-rc2 (2025-02-09 12:45:03 -0800)

are available in the Git repository at:

  https://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux.git tags/hwlock-v6.15

for you to fetch changes up to fec04edb74126f21ac628c7be763c97deb49f69d:

  hwspinlock: Remove unused hwspin_lock_get_id() (2025-03-21 17:12:04 -0500)


hwspinlock updates for v6.15

Drop a few unused functions from the hwspinlock framework.


Dr. David Alan Gilbert (2):
  hwspinlock: Remove unused (devm_)hwspin_lock_request()
  hwspinlock: Remove unused hwspin_lock_get_id()

 Documentation/locking/hwspinlock.rst | 57 +-
 drivers/hwspinlock/hwspinlock_core.c | 94 
 include/linux/hwspinlock.h   | 18 ---
 3 files changed, 1 insertion(+), 168 deletions(-)



[PATCH v2 1/2] x86/sgx: Use sgx_nr_used_pages for EPC page count instead of sgx_nr_free_pages

2025-03-28 Thread Elena Reshetova
sgx_nr_free_pages is an atomic that is used to keep track of
free EPC pages and to detect when page reclaiming should start.
Since successful execution of ENCLS[EUPDATESVN] requires an empty
EPC, and preferably a fast lockless way of checking for this
condition in all code paths where EPC is already used, change the
reclaiming code to track the number of used pages via
sgx_nr_used_pages instead of sgx_nr_free_pages.
For this change to work in the page reclamation code, add a new
variable, sgx_nr_total_pages, that will keep track of total
number of EPC pages.

It would have been possible to implement ENCLS[EUPDATESVN] using
existing sgx_nr_free_pages counter and a new sgx_nr_total_pages
counter, but it won't be possible to avoid taking a lock *every time*
a new EPC page is being allocated. The conversion of sgx_nr_free_pages
into sgx_nr_used_pages allows avoiding the lock in all cases except
when it is the first EPC page being allocated via a quick
atomic_long_inc_not_zero check.

Note: The serialization for sgx_nr_total_pages is not needed because
the variable is only updated during the initialization and there's no
concurrent access.

Signed-off-by: Elena Reshetova 
---
 arch/x86/kernel/cpu/sgx/main.c | 15 ++-
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
index 8ce352fc72ac..b61d3bad0446 100644
--- a/arch/x86/kernel/cpu/sgx/main.c
+++ b/arch/x86/kernel/cpu/sgx/main.c
@@ -32,7 +32,8 @@ static DEFINE_XARRAY(sgx_epc_address_space);
 static LIST_HEAD(sgx_active_page_list);
 static DEFINE_SPINLOCK(sgx_reclaimer_lock);
 
-static atomic_long_t sgx_nr_free_pages = ATOMIC_LONG_INIT(0);
+static atomic_long_t sgx_nr_used_pages = ATOMIC_LONG_INIT(0);
+static unsigned long sgx_nr_total_pages;
 
 /* Nodes with one or more EPC sections. */
 static nodemask_t sgx_numa_mask;
@@ -378,8 +379,8 @@ static void sgx_reclaim_pages(void)
 
 static bool sgx_should_reclaim(unsigned long watermark)
 {
-   return atomic_long_read(&sgx_nr_free_pages) < watermark &&
-  !list_empty(&sgx_active_page_list);
+   return (sgx_nr_total_pages - atomic_long_read(&sgx_nr_used_pages))
+  < watermark && !list_empty(&sgx_active_page_list);
 }
 
 /*
@@ -456,7 +457,7 @@ static struct sgx_epc_page *__sgx_alloc_epc_page_from_node(int nid)
page->flags = 0;
 
spin_unlock(&node->lock);
-   atomic_long_dec(&sgx_nr_free_pages);
+   atomic_long_inc(&sgx_nr_used_pages);
 
return page;
 }
@@ -616,7 +617,7 @@ void sgx_free_epc_page(struct sgx_epc_page *page)
page->flags = SGX_EPC_PAGE_IS_FREE;
 
spin_unlock(&node->lock);
-   atomic_long_inc(&sgx_nr_free_pages);
+   atomic_long_dec(&sgx_nr_used_pages);
 }
 
 static bool __init sgx_setup_epc_section(u64 phys_addr, u64 size,
@@ -648,6 +649,8 @@ static bool __init sgx_setup_epc_section(u64 phys_addr, u64 size,
list_add_tail(&section->pages[i].list, &sgx_dirty_page_list);
}
 
+   sgx_nr_total_pages += nr_pages;
+
return true;
 }
 
@@ -848,6 +851,8 @@ static bool __init sgx_page_cache_init(void)
return false;
}
 
+   atomic_long_set(&sgx_nr_used_pages, sgx_nr_total_pages);
+
for_each_online_node(nid) {
if (!node_isset(nid, sgx_numa_mask) &&
node_state(nid, N_MEMORY) && node_state(nid, N_CPU))
-- 
2.45.2




[PATCH v2 0/2] Enable automatic SVN updates for SGX enclaves

2025-03-28 Thread Elena Reshetova
Changes since v1 following review by Jarkko:

 - first and second patch are squashed together and a better
   explanation of the change is added into the commit message
 - third and fourth patch are also combined for better understanding
   of error code purposes used in 4th patch
 - implementation of sgx_updatesvn adjusted following Jarkko's
   suggestions
 - minor fixes in both commit messages and code from the review
 - dropping co-developed-by tag since the code now differs enough
   from the original submission. However, the reference where the
   original code came from and credits to original author is kept

Background
--

In case an SGX vulnerability is discovered and TCB recovery
for SGX is triggered, Intel specifies a process that must be
followed for a given vulnerability. Steps to mitigate can vary
based on vulnerability type, affected components, etc.
In some cases, a vulnerability can be mitigated via a runtime
recovery flow by shutting down all running SGX enclaves,
clearing enclave page cache (EPC), applying a microcode patch
that does not require a reboot (via late microcode loading) and
restarting all SGX enclaves.


Problem statement
-
Even when the above-described runtime recovery flow to mitigate the
SGX vulnerability is followed, the SGX attestation evidence will
still reflect the security SVN version being equal to the previous
state of security SVN (containing vulnerability) that created
and managed the enclave until the runtime recovery event. This
limitation currently can be only overcome via a platform reboot,
which negates all the benefits from the rebootless late microcode
loading and not required in this case for functional or security
purposes.


Proposed solution
-

SGX architecture introduced a new instruction called EUPDATESVN [1]
to Ice Lake. It allows updating security SVN version, given that EPC
is completely empty. The latter is required for security reasons
in order to reason that enclave security posture is as secure as the
security SVN version of the TCB that created it.

This series enables opportunistic execution of EUPDATESVN upon first
EPC page allocation for a first enclave to be run on the platform.

This series is partly based on the previous work done by Cathy Zhang
[2], which attempted to enable forceful destruction of all SGX
enclaves and execution of EUPDATESVN upon successful application of
any microcode patch. This approach is determined as being too
intrusive for the running SGX enclaves, especially taking into account
that it would be performed upon *every* microcode patch application
regardless if it changes the security SVN version or not (change to the
security SVN version is a rare event).

Testing
---

Tested on EMR machine using kernel-6.14.0_rc7 & sgx selftests.
If Google folks in CC can test on their side, it would be greatly appreciated.


References
--

[1] https://cdrdv2.intel.com/v1/dl/getContent/648682?explicitVersion=true
[2] https://lore.kernel.org/all/20220520103904.1216-1-cathy.zh...@intel.com/

Elena Reshetova (2):
  x86/sgx: Use sgx_nr_used_pages for EPC page count instead of
sgx_nr_free_pages
  x86/sgx: Implement EUPDATESVN and opportunistically call it during
first EPC page alloc

 arch/x86/include/asm/sgx.h  | 41 +++---
 arch/x86/kernel/cpu/sgx/encls.h |  6 +++
 arch/x86/kernel/cpu/sgx/main.c  | 76 ++---
 arch/x86/kernel/cpu/sgx/sgx.h   |  1 +
 4 files changed, 104 insertions(+), 20 deletions(-)

-- 
2.45.2




Re: [PATCH 0/3] Avoid calling WARN_ON() on allocation failure in cfg802154_switch_netns()

2025-03-28 Thread Miquel Raynal
Hello Ivan,

On 28/03/2025 at 04:04:24 +03, Ivan Abramov  wrote:

> This series was inspired by Syzkaller report on warning in
> cfg802154_switch_netns().

Thanks for the series, lgtm.

Reviewed-by: Miquel Raynal 

Miquèl



RE: [PATCH 1/4] x86/sgx: Add total number of EPC pages

2025-03-28 Thread Reshetova, Elena

> On Thu, Mar 27, 2025 at 03:29:53PM +, Reshetova, Elena wrote:
> >
> > > On Mon, Mar 24, 2025 at 12:12:41PM +, Reshetova, Elena wrote:
> > > > > On Fri, Mar 21, 2025 at 02:34:40PM +0200, Elena Reshetova wrote:
> > > > > > In order to successfully execute ENCLS[EUPDATESVN], EPC must be
> > > empty.
> > > > > > SGX already has a variable sgx_nr_free_pages that tracks free
> > > > > > EPC pages. Add a new variable, sgx_nr_total_pages, that will keep
> > > > > > track of total number of EPC pages. It will be used in subsequent
> > > > > > patch to change the sgx_nr_free_pages into sgx_nr_used_pages and
> > > > > > allow an easy check for an empty EPC.
> > > > >
> > > > > First off, remove "in subsequent patch".
> > > >
> > > > Ok
> > > >
> > > > >
> > > > > What does "change sgx_nr_free_pages into sgx_nr_used_pages" mean?
> > > >
> > > > As you can see from patch 2/4, I had to turn around the meaning of the
> > > > existing sgx_nr_free_pages atomic counter not to count the # of free
> > > > pages in EPC, but to count the # of used EPC pages (hence the change of
> > > > name to sgx_nr_used_pages). The reason for doing this is only apparent
> > > > in patch
> > >
> > > Why you *absolutely* need to invert the meaning and cannot make
> > > this work by any means otherwise?
> > >
> > > I highly doubt this could not be done the other way around.
> >
> > I can make it work. The point is that this way is much better and does no
> > damage to the existing logic: the sgx_nr_free_pages counter is used only
> > for page reclaiming and is checked in a single piece of code.
> > To give you an idea the previous iteration of the code looked like below.
> > First, I had to define a new unconditional spinlock to protect the EPC page
> allocation:
> >
> > diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
> > index c8a2542140a1..4f445c28929b 100644
> > --- a/arch/x86/kernel/cpu/sgx/main.c
> > +++ b/arch/x86/kernel/cpu/sgx/main.c
> > @@ -31,6 +31,7 @@ static DEFINE_XARRAY(sgx_epc_address_space);
> >   */
> >  static LIST_HEAD(sgx_active_page_list);
> >  static DEFINE_SPINLOCK(sgx_reclaimer_lock);
> > +static DEFINE_SPINLOCK(sgx_allocate_epc_page_lock);
> 
> 
> 
> >
> >  static atomic_long_t sgx_nr_free_pages = ATOMIC_LONG_INIT(0);
> >  static unsigned long sgx_nr_total_pages;
> > @@ -457,7 +458,10 @@ static struct sgx_epc_page *__sgx_alloc_epc_page_from_node(int nid)
> >   page->flags = 0;
> >
> >   spin_unlock(&node->lock);
> > +
> > + spin_lock(&sgx_allocate_epc_page_lock);
> >   atomic_long_dec(&sgx_nr_free_pages);
> > + spin_unlock(&sgx_allocate_epc_page_lock);
> >
> >   return page;
> >  }
> >
> > And then also take spinlock every time eupdatesvn attempts to run:
> >
> > int sgx_updatesvn(void)
> > +{
> > + int ret;
> > + int retry = 10;
> 
> Reverse xmas tree order.
> 
> > +
> > + spin_lock(&sgx_allocate_epc_page_lock);
> 
> You could use guard for this.
> 
> https://elixir.bootlin.com/linux/v6.13.7/source/include/linux/cleanup.h
> 
> > +
> > + if (atomic_long_read(&sgx_nr_free_pages) != sgx_nr_total_pages) {
> > + spin_unlock(&sgx_allocate_epc_page_lock);
> > + return SGX_EPC_NOT_READY;
> 
> Don't use uarch error codes.

Sure, thanks, I can fix all of the above, this was just to give an idea how
the other version of the code would look like. 

> 
> > + }
> > +
> > + do {
> > + ret = __eupdatesvn();
> > + if (ret != SGX_INSUFFICIENT_ENTROPY)
> > + break;
> > +
> > + } while (--retry);
> > +
> > + spin_unlock(&sgx_allocate_epc_page_lock);
> >
> > Which was called from each enclave create ioctl:
> >
> > @@ -163,6 +163,11 @@ static long sgx_ioc_enclave_create(struct sgx_encl *encl, void __user *arg)
> >   if (copy_from_user(&create_arg, arg, sizeof(create_arg)))
> >   return -EFAULT;
> >
> > + /* Unless running in a VM, execute EUPDATESVN if instruction is available */
> > + if ((cpuid_eax(SGX_CPUID) & SGX_CPUID_EUPDATESVN) &&
> > +!boot_cpu_has(X86_FEATURE_HYPERVISOR))
> > + sgx_updatesvn();
> > +
> >   secs = kmalloc(PAGE_SIZE, GFP_KERNEL);
> >   if (!secs)
> >   return -ENOMEM;
> >
> > Would you agree that this way is much worse code/logic-wise, even
> > without benchmarks?
> 
> Yes, but obviously I cannot promise that I'll accept this as-is
> until I see the final version.

Are you saying you prefer *this version with the spinlock* over the
simpler version that utilizes the fact that sgx_nr_free_pages is changed
to track the number of used pages?

> 
> Also you probably should use mutex given the loop where we cannot
> temporarily exit the lock (like e.g. in keyrings gc we can).

Not sure I understand this; could you please elaborate on why I need an
additional mutex here? Or are you suggesting switching the spinlock to a mutex?

Best Regards,
Elena.



[PATCH v8 4/8] vhost: Introduce vhost_worker_ops in vhost_worker

2025-03-28 Thread Cindy Lu
Abstract vhost worker operations (create/stop/wakeup) into an ops
structure to prepare for kthread mode support.

Signed-off-by: Cindy Lu 
---
 drivers/vhost/vhost.c | 63 ++-
 drivers/vhost/vhost.h | 11 
 2 files changed, 56 insertions(+), 18 deletions(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 20571bd6f7bd..c162ad772f8f 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -243,7 +243,7 @@ static void vhost_worker_queue(struct vhost_worker *worker,
 * test_and_set_bit() implies a memory barrier.
 */
llist_add(&work->node, &worker->work_list);
-   vhost_task_wake(worker->vtsk);
+   worker->ops->wakeup(worker);
}
 }
 
@@ -706,7 +706,7 @@ static void vhost_worker_destroy(struct vhost_dev *dev,
 
WARN_ON(!llist_empty(&worker->work_list));
xa_erase(&dev->worker_xa, worker->id);
-   vhost_task_stop(worker->vtsk);
+   worker->ops->stop(worker);
kfree(worker);
 }
 
@@ -729,42 +729,69 @@ static void vhost_workers_free(struct vhost_dev *dev)
xa_destroy(&dev->worker_xa);
 }
 
+static void vhost_task_wakeup(struct vhost_worker *worker)
+{
+   return vhost_task_wake(worker->vtsk);
+}
+
+static void vhost_task_do_stop(struct vhost_worker *worker)
+{
+   return vhost_task_stop(worker->vtsk);
+}
+
+static int vhost_task_worker_create(struct vhost_worker *worker,
+   struct vhost_dev *dev, const char *name)
+{
+   struct vhost_task *vtsk;
+   u32 id;
+   int ret;
+
+   vtsk = vhost_task_create(vhost_run_work_list, vhost_worker_killed,
+worker, name);
+   if (IS_ERR(vtsk))
+   return PTR_ERR(vtsk);
+
+   worker->vtsk = vtsk;
+   vhost_task_start(vtsk);
+   ret = xa_alloc(&dev->worker_xa, &id, worker, xa_limit_32b, GFP_KERNEL);
+   if (ret < 0) {
+   vhost_task_do_stop(worker);
+   return ret;
+   }
+   worker->id = id;
+   return 0;
+}
+
+static const struct vhost_worker_ops vhost_task_ops = {
+   .create = vhost_task_worker_create,
+   .stop = vhost_task_do_stop,
+   .wakeup = vhost_task_wakeup,
+};
+
 static struct vhost_worker *vhost_worker_create(struct vhost_dev *dev)
 {
struct vhost_worker *worker;
-   struct vhost_task *vtsk;
char name[TASK_COMM_LEN];
int ret;
-   u32 id;
+   const struct vhost_worker_ops *ops = &vhost_task_ops;
 
worker = kzalloc(sizeof(*worker), GFP_KERNEL_ACCOUNT);
if (!worker)
return NULL;
 
worker->dev = dev;
+   worker->ops = ops;
snprintf(name, sizeof(name), "vhost-%d", current->pid);
 
-   vtsk = vhost_task_create(vhost_run_work_list, vhost_worker_killed,
-worker, name);
-   if (IS_ERR(vtsk))
-   goto free_worker;
-
mutex_init(&worker->mutex);
init_llist_head(&worker->work_list);
worker->kcov_handle = kcov_common_handle();
-   worker->vtsk = vtsk;
-
-   vhost_task_start(vtsk);
-
-   ret = xa_alloc(&dev->worker_xa, &id, worker, xa_limit_32b, GFP_KERNEL);
+   ret = ops->create(worker, dev, name);
if (ret < 0)
-   goto stop_worker;
-   worker->id = id;
+   goto free_worker;
 
return worker;
 
-stop_worker:
-   vhost_task_stop(vtsk);
 free_worker:
kfree(worker);
return NULL;
diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
index 19bb94922a0e..98895e299efa 100644
--- a/drivers/vhost/vhost.h
+++ b/drivers/vhost/vhost.h
@@ -26,6 +26,16 @@ struct vhost_work {
unsigned long   flags;
 };
 
+struct vhost_worker;
+struct vhost_dev;
+
+struct vhost_worker_ops {
+   int (*create)(struct vhost_worker *worker, struct vhost_dev *dev,
+ const char *name);
+   void (*stop)(struct vhost_worker *worker);
+   void (*wakeup)(struct vhost_worker *worker);
+};
+
 struct vhost_worker {
struct vhost_task   *vtsk;
struct vhost_dev*dev;
@@ -36,6 +46,7 @@ struct vhost_worker {
u32 id;
int attachment_cnt;
boolkilled;
+   const struct vhost_worker_ops *ops;
 };
 
 /* Poll a file (eventfd or socket) */
-- 
2.45.0




[PATCH] selftests/mm: Fix loss of information warnings

2025-03-28 Thread Siddarth G
Cppcheck reported a style warning:
int result is assigned to long long variable. If the variable is long long
to avoid loss of information, then you have loss of information.

Changing the type of page_size from 'unsigned int' to 'unsigned long long'
was considered, but that might cause new conversion issues in other
parts of the code where calculations involving 'page_size' are assigned
to int variables. So we take the approach of appending ULL suffixes instead.

Reported-by: David Binderman 
Closes: 
https://lore.kernel.org/all/as8pr02mb10217315060bbfdb21f19643e9c...@as8pr02mb10217.eurprd02.prod.outlook.com/
Signed-off-by: Siddarth G 
---
 tools/testing/selftests/mm/pagemap_ioctl.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/tools/testing/selftests/mm/pagemap_ioctl.c 
b/tools/testing/selftests/mm/pagemap_ioctl.c
index 57b4bba2b45f..f3b12402ca89 100644
--- a/tools/testing/selftests/mm/pagemap_ioctl.c
+++ b/tools/testing/selftests/mm/pagemap_ioctl.c
@@ -244,7 +244,7 @@ int sanity_tests_sd(void)
long walk_end;
 
vec_size = num_pages/2;
-   mem_size = num_pages * page_size;
+   mem_size = num_pages * (long long)page_size;
 
vec = malloc(sizeof(struct page_region) * vec_size);
if (!vec)
@@ -432,7 +432,7 @@ int sanity_tests_sd(void)
free(vec2);
 
/* 8. Smaller vec */
-   mem_size = 1050 * page_size;
+   mem_size = 1050ULL * page_size;
vec_size = mem_size/(page_size*2);
 
vec = malloc(sizeof(struct page_region) * vec_size);
@@ -487,7 +487,7 @@ int sanity_tests_sd(void)
total_pages = 0;
 
/* 9. Smaller vec */
-   mem_size = 1 * page_size;
+   mem_size = 1ULL * page_size;
vec_size = 50;
 
vec = malloc(sizeof(struct page_region) * vec_size);
@@ -1058,7 +1058,7 @@ int sanity_tests(void)
char *tmp_buf;
 
/* 1. wrong operation */
-   mem_size = 10 * page_size;
+   mem_size = 10ULL * page_size;
vec_size = mem_size / page_size;
 
vec = malloc(sizeof(struct page_region) * vec_size);
@@ -1507,7 +1507,7 @@ int main(int __attribute__((unused)) argc, char *argv[])
sanity_tests_sd();
 
/* 2. Normal page testing */
-   mem_size = 10 * page_size;
+   mem_size = 10ULL * page_size;
mem = mmap(NULL, mem_size, PROT_READ | PROT_WRITE, MAP_PRIVATE | 
MAP_ANON, -1, 0);
if (mem == MAP_FAILED)
ksft_exit_fail_msg("error nomem\n");
@@ -1520,7 +1520,7 @@ int main(int __attribute__((unused)) argc, char *argv[])
munmap(mem, mem_size);
 
/* 3. Large page testing */
-   mem_size = 512 * 10 * page_size;
+   mem_size = 512ULL * 10 * page_size;
mem = mmap(NULL, mem_size, PROT_READ | PROT_WRITE, MAP_PRIVATE | 
MAP_ANON, -1, 0);
if (mem == MAP_FAILED)
ksft_exit_fail_msg("error nomem\n");
-- 
2.43.0




Re: [PATCH v2] selftests/ptrace/get_syscall_info: fix for MIPS n32

2025-03-28 Thread Shuah Khan

On 1/15/25 16:37, Dmitry V. Levin wrote:

MIPS n32 is one of two ILP32 architectures supported by the kernel
that have 64-bit syscall arguments (the other one is x32).

When this test passed 32-bit arguments to syscall(), they were
sign-extended in libc, PTRACE_GET_SYSCALL_INFO reported these
sign-extended 64-bit values, and the test complained about the mismatch.

Fix this by passing arguments of the appropriate type to syscall(),
which is "unsigned long long" on MIPS n32, and __kernel_ulong_t on other
architectures.

As a side effect, this also extends the test on all 64-bit architectures
by choosing constants that don't fit into 32-bit integers.

Signed-off-by: Dmitry V. Levin 
---

v2: Fixed MIPS #ifdef.

  .../selftests/ptrace/get_syscall_info.c   | 53 +++
  1 file changed, 32 insertions(+), 21 deletions(-)

diff --git a/tools/testing/selftests/ptrace/get_syscall_info.c 
b/tools/testing/selftests/ptrace/get_syscall_info.c
index 5bcd1c7b5be6..2970f72d66d3 100644
--- a/tools/testing/selftests/ptrace/get_syscall_info.c
+++ b/tools/testing/selftests/ptrace/get_syscall_info.c
@@ -11,8 +11,19 @@
  #include 
  #include 
  #include 
+#include 
  #include "linux/ptrace.h"
  
+#if defined(_MIPS_SIM) && _MIPS_SIM == _MIPS_SIM_NABI32

+/*
+ * MIPS N32 is the only architecture where __kernel_ulong_t
+ * does not match the bitness of syscall arguments.
+ */
+typedef unsigned long long kernel_ulong_t;
+#else
+typedef __kernel_ulong_t kernel_ulong_t;
+#endif
+


What's the reason for adding these typedefs? checkpatch should
have warned you about adding new typedefs.

Also this introduces kernel_ulong_t in user-space test code.
Something to avoid.


  static int
  kill_tracee(pid_t pid)
  {
@@ -42,37 +53,37 @@ sys_ptrace(int request, pid_t pid, unsigned long addr, 
unsigned long data)
  
  TEST(get_syscall_info)

  {
-   static const unsigned long args[][7] = {
+   const kernel_ulong_t args[][7] = {
/* a sequence of architecture-agnostic syscalls */
{
__NR_chdir,
-   (unsigned long) "",
-   0xbad1fed1,
-   0xbad2fed2,
-   0xbad3fed3,
-   0xbad4fed4,
-   0xbad5fed5
+   (uintptr_t) "",


You could use ifdef here.


+   (kernel_ulong_t) 0xdad1bef1bad1fed1ULL,
+   (kernel_ulong_t) 0xdad2bef2bad2fed2ULL,
+   (kernel_ulong_t) 0xdad3bef3bad3fed3ULL,
+   (kernel_ulong_t) 0xdad4bef4bad4fed4ULL,
+   (kernel_ulong_t) 0xdad5bef5bad5fed5ULL
},
{
__NR_gettid,
-   0xcaf0bea0,
-   0xcaf1bea1,
-   0xcaf2bea2,
-   0xcaf3bea3,
-   0xcaf4bea4,
-   0xcaf5bea5
+   (kernel_ulong_t) 0xdad0bef0caf0bea0ULL,
+   (kernel_ulong_t) 0xdad1bef1caf1bea1ULL,
+   (kernel_ulong_t) 0xdad2bef2caf2bea2ULL,
+   (kernel_ulong_t) 0xdad3bef3caf3bea3ULL,
+   (kernel_ulong_t) 0xdad4bef4caf4bea4ULL,
+   (kernel_ulong_t) 0xdad5bef5caf5bea5ULL
},
{
__NR_exit_group,
0,
-   0xfac1c0d1,
-   0xfac2c0d2,
-   0xfac3c0d3,
-   0xfac4c0d4,
-   0xfac5c0d5
+   (kernel_ulong_t) 0xdad1bef1fac1c0d1ULL,
+   (kernel_ulong_t) 0xdad2bef2fac2c0d2ULL,
+   (kernel_ulong_t) 0xdad3bef3fac3c0d3ULL,
+   (kernel_ulong_t) 0xdad4bef4fac4c0d4ULL,
+   (kernel_ulong_t) 0xdad5bef5fac5c0d5ULL
}
};
-   const unsigned long *exp_args;
+   const kernel_ulong_t *exp_args;
  
  	pid_t pid = fork();
  
@@ -154,7 +165,7 @@ TEST(get_syscall_info)

}
ASSERT_LT(0, (rc = sys_ptrace(PTRACE_GET_SYSCALL_INFO,
  pid, size,
- (unsigned long) &info))) {
+ (uintptr_t) &info))) {
LOG_KILL_TRACEE("PTRACE_GET_SYSCALL_INFO: %m");
}
ASSERT_EQ(expected_none_size, rc) {
@@ -177,7 +188,7 @@ TEST(get_syscall_info)
case SIGTRAP | 0x80:
ASSERT_LT(0, (rc = sys_ptrace(PTRACE_GET_SYSCALL_INFO,
  pid, size,
- (unsigned long) &info))) {
+ (uintptr_

Re: [PATCH net-next v2] vsock/test: Add test for null ptr deref when transport changes

2025-03-28 Thread Stefano Garzarella

On Wed, Mar 26, 2025 at 05:21:03PM +0100, Stefano Garzarella wrote:

On Wed, Mar 26, 2025 at 04:14:20PM +0100, Luigi Leonardi wrote:

Hi Michal,

On Wed, Mar 19, 2025 at 01:27:35AM +0100, Michal Luczaj wrote:

On 3/14/25 10:27, Luigi Leonardi wrote:

Add a new test to ensure that when the transport changes a null pointer
dereference does not occur[1].

Note that this test does not fail, but it may hang on the client side if
it triggers a kernel oops.

This works by creating a socket, trying to connect to a server, and then
executing a second connect operation on the same socket but to a
different CID (0). This triggers a transport change. If the connect
operation is interrupted by a signal, this could cause a null-ptr-deref.


Just to be clear: that's the splat, right?

Oops: general protection fault, probably for non-canonical address 
0xdc0c:  [#1] PREEMPT SMP KASAN NOPTI
KASAN: null-ptr-deref in range [0x0060-0x0067]
CPU: 2 UID: 0 PID: 463 Comm: kworker/2:3 Not tainted
Workqueue: vsock-loopback vsock_loopback_work
RIP: 0010:vsock_stream_has_data+0x44/0x70
Call Trace:
virtio_transport_do_close+0x68/0x1a0
virtio_transport_recv_pkt+0x1045/0x2ae4
vsock_loopback_work+0x27d/0x3f0
process_one_work+0x846/0x1420
worker_thread+0x5b3/0xf80
kthread+0x35a/0x700
ret_from_fork+0x2d/0x70
ret_from_fork_asm+0x1a/0x30



Yep! I'll add it to the commit message in v3.

...
+static void test_stream_transport_change_client(const struct test_opts *opts)
+{
+   __sighandler_t old_handler;
+   pid_t pid = getpid();
+   pthread_t thread_id;
+   time_t tout;
+
+   old_handler = signal(SIGUSR1, test_transport_change_signal_handler);
+   if (old_handler == SIG_ERR) {
+   perror("signal");
+   exit(EXIT_FAILURE);
+   }
+
+   if (pthread_create(&thread_id, NULL, test_stream_transport_change_thread, 
&pid)) {
+   perror("pthread_create");


Does pthread_create() set errno on failure?

It does not, very good catch!



+   exit(EXIT_FAILURE);
+   }
+
+   tout = current_nsec() + TIMEOUT * NSEC_PER_SEC;


Isn't 10 seconds a bit excessive? I see the oops pretty much immediately.
Yeah, it's probably excessive. I used it because it's the default
timeout value.



+   do {
+   struct sockaddr_vm sa = {
+   .svm_family = AF_VSOCK,
+   .svm_cid = opts->peer_cid,
+   .svm_port = opts->peer_port,
+   };
+   int s;
+
+   s = socket(AF_VSOCK, SOCK_STREAM, 0);
+   if (s < 0) {
+   perror("socket");
+   exit(EXIT_FAILURE);
+   }
+
+   connect(s, (struct sockaddr *)&sa, sizeof(sa));
+
+   /* Set CID to 0 cause a transport change. */
+   sa.svm_cid = 0;
+   connect(s, (struct sockaddr *)&sa, sizeof(sa));
+
+   close(s);
+   } while (current_nsec() < tout);
+
+   if (pthread_cancel(thread_id)) {
+   perror("pthread_cancel");


And errno here.


+   exit(EXIT_FAILURE);
+   }
+
+   /* Wait for the thread to terminate */
+   if (pthread_join(thread_id, NULL)) {
+   perror("pthread_join");


And here.
Aaand I've realized I've made exactly the same mistake elsewhere :)


...
+static void test_stream_transport_change_server(const struct test_opts *opts)
+{
+   time_t tout = current_nsec() + TIMEOUT * NSEC_PER_SEC;
+
+   do {
+   int s = vsock_stream_listen(VMADDR_CID_ANY, opts->peer_port);
+
+   close(s);
+   } while (current_nsec() < tout);
+}


I'm not certain you need to re-create the listener or measure the time
here. What about something like

int s = vsock_stream_listen(VMADDR_CID_ANY, opts->peer_port);
control_expectln("DONE");
close(s);


Just tried and it triggers the oops :)


If this works (as I also initially thought), we should check the 
result of the first connect() in the client code. It can succeed or 
fail with -EINTR, in other cases we should report an error because it 
is not expected.


And we should check also the second connect(), it should always fail, 
right?


For this I think you need another sync point to be sure the server is
listening before trying to connect the first time:


client:
   // pthread_create, etc.

   control_expectln("LISTENING");

   do {
   ...
   } while();

   control_writeln("DONE");

server:
   int s = vsock_stream_listen(VMADDR_CID_ANY, opts->peer_port);
   control_writeln("LISTENING");


We found that this needed to be extended by adding an accept() loop to 
avoid filling up the backlog of the listening socket.
But by doing accept() and close() back to back, we found a problem in 
AF_VSOCK, where connect() in some cases would get stuck until the 
timeout (default: 2 seconds), returning -ETIMEDOUT.


Fix is coming.

Thanks,
Stefano



Re: [PATCH/RFC] kunit/rtc: Add real support for very slow tests

2025-03-28 Thread Geert Uytterhoeven
Hi David,

On Fri, 28 Mar 2025 at 09:07, David Gow  wrote:
> Thanks for sending this out: I think this raises some good questions
> about exactly how to handle long running tests (particularly on
> older/slower hardware).
>
> I've put a few notes below, but, tl;dr: I think these are all good
> changes, even if there's more we can do to better scale to slower
> hardware.
>
> On Fri, 28 Mar 2025 at 00:07, Geert Uytterhoeven  wrote:
> >   2. Increase timeout by ten; ideally this should only be done for very
> >  slow tests, but I couldn't find how to access kunit_case.attr.case
> >  from kunit_try_catch_run(),
>
>
> My feeling for tests generally is:
> - Normal: effectively instant on modern hardware, O(seconds) on
> ancient hardware.
> - Slow: takes O(seconds) to run on modern hardware, O(minutes)..O(10s
> of minutes) on ancient hardware.
> - Very slow: O(minutes) or higher on modern hardware, infeasible on
> ancient hardware.
>
> Obviously the definition of "modern" and "ancient" hardware here is
> pretty arbitrary: I'm using "modern, high-end x86" ~4GHz as my
> "modern" example, and "66MHz 486" as my "ancient" one, but things like
> emulation or embedded systems fit in-between.
>
> Ultimately, I think the timeout probably needs to be configurable on a
> per-machine basis more than a per-test one, but having a 10x
> multiplier (or even a 100x multiplier) for very slow tests would also
> work for me.

Yes, adapting automatically to the speed of the target machine
would be nice, but non-trivial.

> I quickly tried hacking together something to pass through the
> attribute and implement this. Diff (probably mangled by gmail) below:

[...]

Thanks!

> I'll get around to extending this to allow the "base timeout" to be
> configurable as a command-line option, too, if this seems like a good
> way to go.
>
> >   3. Mark rtc_time64_to_tm_test_date_range_1000 slow,
> >   4. Mark rtc_time64_to_tm_test_date_range_16 very slow.
>
> Hmm... these are definitely fast enough on my "modern" machine that
> they probably only warrant "slow", not "very slow". But given they're
> definitely causing problems on older machines, I'm happy to go with
> marking the large ones very slow. (I've been waiting for them for
> about 45 minutes so far on my 486.)
>
> Do the time tests in kernel/time/time_test.c also need to be marked
> very slow, or does that run much faster on your setup?

Hmm, I did run time_test (insmod took 7+ minutes), but I don't
seem to have pass/fail output. Will rerun...

Indeed:

# time64_to_tm_test_date_range.speed: slow

Another test that wanted to be marked as slow was:

# kunit_platform_device_add_twice_fails_test: Test should be
marked slow (runtime: 30.788248702s)

I will rerun all, as it seems I have lost some logs...

> Is this causing you enough strife that you want it in as-is, straight
> away, or would you be happy with it being split up and polished a bit
> first -- particularly around supporting the more configurable timeout,
> and shifting the test changes into separate patches? (I'm happy to do
> that for you if you don't want to dig around in the somewhat messy
> KUnit try-catch stuff any further.)

This is definitely not something urgent for me.

Thanks!

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds



Re: [ANNOUNCE] kmod 34.2

2025-03-28 Thread Lucas De Marchi

kmod 34.2 is out:

https://www.kernel.org/pub/linux/utils/kernel/kmod/kmod-34.2.tar.xz
https://www.kernel.org/pub/linux/utils/kernel/kmod/kmod-34.2.tar.sign

The tarballs generated for kmod-34 and kmod-34.1 were not very
compatible with distros still on autotools. Hint: v35 will not have
autotools and it'd be better to be prepared.

Fix it and also bring a few fixes to weakdep parsing.

Shortlog is below:

Emil Velikov (1):
  NEWS: squash a couple of typos

Jakub Ślepecki (1):
  libkmod: fix buffer-overflow in weakdep_to_char

Lucas De Marchi (3):
  testsuite: Add modprobe -c test for weakdep
  autotools: Fix generated files in tarball
  kmod 34.2

Tobias Stoeckmann (2):
  libkmod: release memory on builtin error path
  libkmod: fix buffer-overflow in weakdep_to_char

thanks
Lucas De Marchi



[GIT PULL] Modules changes for v6.15-rc1

2025-03-28 Thread Petr Pavlu
The following changes since commit 80e54e84911a923c40d7bee33a34c1b4be148d7a:

  Linux 6.14-rc6 (2025-03-09 13:45:25 -1000)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/modules/linux.git/ 
tags/modules-6.15-rc1

for you to fetch changes up to 897c0b4e27135132dc5b348c1a3773d059668489:

  MAINTAINERS: Update the MODULE SUPPORT section (2025-03-28 15:08:20 +0100)


Modules changes for 6.15-rc1

- Use RCU instead of RCU-sched

  The mix of rcu_read_lock(), rcu_read_lock_sched() and preempt_disable()
  in the module code and its users has been replaced with just
  rcu_read_lock().

- The rest of changes are smaller fixes and updates.

The changes have been on linux-next for at least 2 weeks, with the RCU
cleanup present for 2 months. One performance problem was reported with the
RCU change when KASAN + lockdep were enabled, but it was effectively
addressed by the already merged ee57ab5a3212 ("locking/lockdep: Disable
KASAN instrumentation of lockdep.c").


Joel Granados (1):
  tests/module: nix-ify

Petr Pavlu (1):
  MAINTAINERS: Update the MODULE SUPPORT section

Sebastian Andrzej Siewior (27):
  module: Begin to move from RCU-sched to RCU.
  module: Use proper RCU assignment in add_kallsyms().
  module: Use RCU in find_kallsyms_symbol().
  module: Use RCU in module_get_kallsym().
  module: Use RCU in find_module_all().
  module: Use RCU in __find_kallsyms_symbol_value().
  module: Use RCU in module_kallsyms_on_each_symbol().
  module: Remove module_assert_mutex_or_preempt() from 
try_add_tainted_module().
  module: Use RCU in find_symbol().
  module: Use RCU in __is_module_percpu_address().
  module: Allow __module_address() to be called from RCU section.
  module: Use RCU in search_module_extables().
  module: Use RCU in all users of __module_address().
  module: Use RCU in all users of __module_text_address().
  ARM: module: Use RCU in all users of __module_text_address().
  arm64: module: Use RCU in all users of __module_text_address().
  LoongArch/orc: Use RCU in all users of __module_address().
  LoongArch: ftrace: Use RCU in all users of __module_text_address().
  powerpc/ftrace: Use RCU in all users of __module_text_address().
  cfi: Use RCU while invoking __module_address().
  x86: Use RCU in all users of __module_address().
  jump_label: Use RCU in all users of __module_address().
  jump_label: Use RCU in all users of __module_text_address().
  bpf: Use RCU in all users of __module_text_address().
  kprobes: Use RCU in all users of __module_text_address().
  static_call: Use RCU in all users of __module_text_address().
  bug: Use RCU instead RCU-sched to protect module_bug_list.

Thorsten Blum (3):
  params: Annotate struct module_param_attrs with __counted_by()
  module: Replace deprecated strncpy() with strscpy()
  module: Remove unnecessary size argument when calling strscpy()

 MAINTAINERS  |   4 +-
 arch/arm/kernel/module-plts.c|   4 +-
 arch/arm64/kernel/ftrace.c   |   7 +-
 arch/loongarch/kernel/ftrace_dyn.c   |   9 ++-
 arch/loongarch/kernel/unwind_orc.c   |   4 +-
 arch/powerpc/kernel/trace/ftrace.c   |   6 +-
 arch/powerpc/kernel/trace/ftrace_64_pg.c |   6 +-
 arch/x86/kernel/callthunks.c |   3 +-
 arch/x86/kernel/unwind_orc.c |   4 +-
 include/linux/kallsyms.h |   3 +-
 include/linux/module.h   |   2 +-
 kernel/cfi.c |   5 +-
 kernel/jump_label.c  |  31 +
 kernel/kprobes.c |   2 +-
 kernel/livepatch/core.c  |   4 +-
 kernel/module/internal.h |  11 
 kernel/module/kallsyms.c |  73 -
 kernel/module/main.c | 109 +++
 kernel/module/tracking.c |   2 -
 kernel/module/tree_lookup.c  |   8 +--
 kernel/module/version.c  |  14 ++--
 kernel/params.c  |  29 
 kernel/static_call_inline.c  |  13 ++--
 kernel/trace/bpf_trace.c |  24 +++
 kernel/trace/trace_kprobe.c  |   9 +--
 lib/bug.c|  22 +++
 lib/tests/module/gen_test_kallsyms.sh|   2 +-
 27 files changed, 160 insertions(+), 250 deletions(-)



Re: [PATCH V2] remoteproc: core: Clear table_sz when rproc_shutdown

2025-03-28 Thread Mathieu Poirier
On Fri, Mar 28, 2025 at 12:50:12PM +0800, Peng Fan wrote:
> On Thu, Mar 27, 2025 at 11:46:33AM -0600, Mathieu Poirier wrote:
> >Hi,
> >
> >On Wed, Mar 26, 2025 at 10:02:14AM +0800, Peng Fan (OSS) wrote:
> >> From: Peng Fan 
> >> 
> >> There is a case, as shown below, that could trigger a kernel dump:
> >> Use U-Boot to start remote processor(rproc) with resource table
> >> published to a fixed address by rproc. After Kernel boots up,
> >> stop the rproc, load a new firmware which doesn't have a resource
> >> table, and start the rproc.
> >>
> >
> >If a firwmare image doesn't have a resouce table, rproc_elf_load_rsc_table()
> >will return an error [1], rproc_fw_boot() will exit prematurely [2] and the
> >remote processor won't be started.  What am I missing?
> 
> STM32 and i.MX use their own parse_fw implementation which allows no resource
> table:
> https://elixir.bootlin.com/linux/v6.13.7/source/drivers/remoteproc/stm32_rproc.c#L272
> https://elixir.bootlin.com/linux/v6.13.7/source/drivers/remoteproc/imx_rproc.c#L598

Ok, that settles rproc_fw_boot() but there is also rproc_find_loaded_rsc_table()
that will return NULL if a resource table is not found and preventing the
memcpy() in rproc_start() from happening:

https://elixir.bootlin.com/linux/v6.14-rc6/source/drivers/remoteproc/remoteproc_core.c#L1288

> 
> Thanks,
> Peng
> 
> >
> >[1]. 
> >https://elixir.bootlin.com/linux/v6.14-rc6/source/drivers/remoteproc/remoteproc_elf_loader.c#L338
> >[2]. 
> >https://elixir.bootlin.com/linux/v6.14-rc6/source/drivers/remoteproc/remoteproc_core.c#L1411
> > 
> >
> >> When starting the rproc with a firmware that doesn't have a resource table,
> >> `memcpy(loaded_table, rproc->cached_table, rproc->table_sz)` will
> >> trigger dump, because rproc->cached_table is set to NULL during the last
> >> stop operation, but rproc->table_sz is still valid.
> >> 
> >> This issue is found on i.MX8MP and i.MX9.
> >> 
> >> Dump as below:
> >> Unable to handle kernel NULL pointer dereference at virtual address 
> >> 
> >> Mem abort info:
> >>   ESR = 0x9604
> >>   EC = 0x25: DABT (current EL), IL = 32 bits
> >>   SET = 0, FnV = 0
> >>   EA = 0, S1PTW = 0
> >>   FSC = 0x04: level 0 translation fault
> >> Data abort info:
> >>   ISV = 0, ISS = 0x0004, ISS2 = 0x
> >>   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
> >>   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
> >> user pgtable: 4k pages, 48-bit VAs, pgdp=00010af63000
> >> [] pgd=, p4d=
> >> Internal error: Oops: 9604 [#1] PREEMPT SMP
> >> Modules linked in:
> >> CPU: 2 UID: 0 PID: 1060 Comm: sh Not tainted 
> >> 6.14.0-rc7-next-20250317-dirty #38
> >> Hardware name: NXP i.MX8MPlus EVK board (DT)
> >> pstate: a005 (NzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> >> pc : __pi_memcpy_generic+0x110/0x22c
> >> lr : rproc_start+0x88/0x1e0
> >> Call trace:
> >>  __pi_memcpy_generic+0x110/0x22c (P)
> >>  rproc_boot+0x198/0x57c
> >>  state_store+0x40/0x104
> >>  dev_attr_store+0x18/0x2c
> >>  sysfs_kf_write+0x7c/0x94
> >>  kernfs_fop_write_iter+0x120/0x1cc
> >>  vfs_write+0x240/0x378
> >>  ksys_write+0x70/0x108
> >>  __arm64_sys_write+0x1c/0x28
> >>  invoke_syscall+0x48/0x10c
> >>  el0_svc_common.constprop.0+0xc0/0xe0
> >>  do_el0_svc+0x1c/0x28
> >>  el0_svc+0x30/0xcc
> >>  el0t_64_sync_handler+0x10c/0x138
> >>  el0t_64_sync+0x198/0x19c
> >> 
> >> Clear rproc->table_sz to address the issue.
> >> 
> >> While at here, also clear rproc->table_sz when rproc_fw_boot and
> >> rproc_detach.
> >> 
> >> Fixes: 9dc9507f1880 ("remoteproc: Properly deal with the resource table 
> >> when detaching")
> >> Signed-off-by: Peng Fan 
> >> ---
> >> 
> >> V2:
> >>  Clear table_sz when rproc_fw_boot and rproc_detach per Arnaud
> >> 
> >>  drivers/remoteproc/remoteproc_core.c | 3 +++
> >>  1 file changed, 3 insertions(+)
> >> 
> >> diff --git a/drivers/remoteproc/remoteproc_core.c 
> >> b/drivers/remoteproc/remoteproc_core.c
> >> index c2cf0d277729..1efa53d4e0c3 100644
> >> --- a/drivers/remoteproc/remoteproc_core.c
> >> +++ b/drivers/remoteproc/remoteproc_core.c
> >> @@ -1442,6 +1442,7 @@ static int rproc_fw_boot(struct rproc *rproc, const 
> >> struct firmware *fw)
> >>kfree(rproc->cached_table);
> >>rproc->cached_table = NULL;
> >>rproc->table_ptr = NULL;
> >> +  rproc->table_sz = 0;
> >>  unprepare_rproc:
> >>/* release HW resources if needed */
> >>rproc_unprepare_device(rproc);
> >> @@ -2025,6 +2026,7 @@ int rproc_shutdown(struct rproc *rproc)
> >>kfree(rproc->cached_table);
> >>rproc->cached_table = NULL;
> >>rproc->table_ptr = NULL;
> >> +  rproc->table_sz = 0;
> >>  out:
> >>mutex_unlock(&rproc->lock);
> >>return ret;
> >> @@ -2091,6 +2093,7 @@ int rproc_detach(struct rproc *rproc)
> >>kfree(rproc->cached_table);
> >>rproc->cached_table = NULL;
> >>rproc->table_ptr = NULL;
> >> +  rproc->table_sz = 0;
> >>  out:
> >>mutex_unlock(&rproc->lock);
> >>return ret;
> >> -- 
> >> 2.37.1
> >

[PATCH net] vsock: avoid timeout during connect() if the socket is closing

2025-03-28 Thread Stefano Garzarella
From: Stefano Garzarella 

When a peer attempts to establish a connection, vsock_connect() contains
a loop that waits for the state to be TCP_ESTABLISHED. However, the
other peer can be fast enough to accept the connection and close it
immediately, thus moving the state to TCP_CLOSING.

When this happens, the peer in the vsock_connect() is properly woken up,
but since the state is not TCP_ESTABLISHED, it goes back to sleep
until the timeout expires, returning -ETIMEDOUT.

If the socket state is TCP_CLOSING, waiting for the timeout is pointless.
vsock_connect() can return immediately without errors or delay since the
connection actually happened. The socket will be in a closing state,
but this is not an issue, and subsequent calls will fail as expected.

We discovered this issue while developing a test that accepts and
immediately closes connections to stress the transport switch between
two connect() calls, where the first one was interrupted by a signal
(see Closes link).

Reported-by: Luigi Leonardi 
Closes: 
https://lore.kernel.org/virtualization/bq6hxrolno2vmtqwcvb5bljfpb7mvwb3kohrvaed6auz5vxrfv@ijmd2f3grobn/
Fixes: d021c344051a ("VSOCK: Introduce VM Sockets")
Signed-off-by: Stefano Garzarella 
---
 net/vmw_vsock/af_vsock.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index 7e3db87ae433..fc6afbc8d680 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -1551,7 +1551,11 @@ static int vsock_connect(struct socket *sock, struct 
sockaddr *addr,
timeout = vsk->connect_timeout;
prepare_to_wait(sk_sleep(sk), &wait, TASK_INTERRUPTIBLE);
 
-   while (sk->sk_state != TCP_ESTABLISHED && sk->sk_err == 0) {
+   /* If the socket is already closing or it is in an error state, there
+* is no point in waiting.
+*/
+   while (sk->sk_state != TCP_ESTABLISHED &&
+  sk->sk_state != TCP_CLOSING && sk->sk_err == 0) {
if (flags & O_NONBLOCK) {
/* If we're not going to block, we schedule a timeout
 * function to generate a timeout on the connection
-- 
2.49.0




Re: [PATCH v2 2/2] x86/sgx: Implement EUPDATESVN and opportunistically call it during first EPC page alloc

2025-03-28 Thread Jarkko Sakkinen
On Fri, Mar 28, 2025 at 07:50:43PM +0200, Jarkko Sakkinen wrote:
> On Fri, Mar 28, 2025 at 02:57:41PM +0200, Elena Reshetova wrote:
> > The SGX architecture introduced a new instruction, EUPDATESVN,
> > in Ice Lake. It allows updating the security SVN version, provided
> > that the EPC is completely empty. The latter is required for security
> > reasons, in order to reason that an enclave's security posture is as
> > secure as the security SVN version of the TCB that created it.
> > 
> > Additionally, it is important to ensure that no concurrent page
> > creation happens in the EPC while ENCLS[EUPDATESVN] runs, because it
> > might result in a #GP being delivered to the creator. Legacy SW might
> > not be prepared to handle such unexpected #GPs, and therefore this
> > patch introduces a locking mechanism to ensure no concurrent EPC
> > allocations can happen.
> > 
> > It is also ensured that ENCLS[EUPDATESVN] is not called when running
> > in a VM, since it has no meaning in this context (applying microcode
> > updates is limited to the host OS) and would only create unnecessary
> > load.
> > 
> > This patch is based on a previous submission by Cathy Zhang:
> > https://lore.kernel.org/all/20220520103904.1216-1-cathy.zh...@intel.com/
> > 
> > Signed-off-by: Elena Reshetova 
> > ---
> >  arch/x86/include/asm/sgx.h  | 41 +
> >  arch/x86/kernel/cpu/sgx/encls.h |  6 
> >  arch/x86/kernel/cpu/sgx/main.c  | 63 -
> >  arch/x86/kernel/cpu/sgx/sgx.h   |  1 +
> >  4 files changed, 95 insertions(+), 16 deletions(-)
> > 
> > diff --git a/arch/x86/include/asm/sgx.h b/arch/x86/include/asm/sgx.h
> > index 6a0069761508..5caf5c31ebc6 100644
> > --- a/arch/x86/include/asm/sgx.h
> > +++ b/arch/x86/include/asm/sgx.h
> > @@ -26,23 +26,26 @@
> >  #define SGX_CPUID_EPC_SECTION  0x1
> >  /* The bitmask for the EPC section type. */
> >  #define SGX_CPUID_EPC_MASK GENMASK(3, 0)
> > +/* EUPDATESVN presence indication */
> > +#define SGX_CPUID_EUPDATESVN   BIT(10)
> >  
> >  enum sgx_encls_function {
> > -   ECREATE = 0x00,
> > -   EADD= 0x01,
> > -   EINIT   = 0x02,
> > -   EREMOVE = 0x03,
> > -   EDGBRD  = 0x04,
> > -   EDGBWR  = 0x05,
> > -   EEXTEND = 0x06,
> > -   ELDU= 0x08,
> > -   EBLOCK  = 0x09,
> > -   EPA = 0x0A,
> > -   EWB = 0x0B,
> > -   ETRACK  = 0x0C,
> > -   EAUG= 0x0D,
> > -   EMODPR  = 0x0E,
> > -   EMODT   = 0x0F,
> > +   ECREATE = 0x00,
> > +   EADD= 0x01,
> > +   EINIT   = 0x02,
> > +   EREMOVE = 0x03,
> > +   EDGBRD  = 0x04,
> > +   EDGBWR  = 0x05,
> > +   EEXTEND = 0x06,
> > +   ELDU= 0x08,
> > +   EBLOCK  = 0x09,
> > +   EPA = 0x0A,
> > +   EWB = 0x0B,
> > +   ETRACK  = 0x0C,
> > +   EAUG= 0x0D,
> > +   EMODPR  = 0x0E,
> > +   EMODT   = 0x0F,
> > +   EUPDATESVN  = 0x18,
> >  };
> >  
> >  /**
> > @@ -73,6 +76,11 @@ enum sgx_encls_function {
> >   * public key does not match IA32_SGXLEPUBKEYHASH.
> >   * %SGX_PAGE_NOT_MODIFIABLE:   The EPC page cannot be modified because 
> > it
> >   * is in the PENDING or MODIFIED state.
> > + * %SGX_INSUFFICIENT_ENTROPY:  Insufficient entropy in RNG.
> > + * %SGX_EPC_NOT_READY: EPC is not ready for SVN update.
> > + * %SGX_NO_UPDATE: EUPDATESVN was successful, but CPUSVN was not
> > + * updated because current SVN was not newer than
> > + * CPUSVN.
> >   * %SGX_UNMASKED_EVENT:An unmasked event, e.g. INTR, was 
> > received
> >   */
> >  enum sgx_return_code {
> > @@ -81,6 +89,9 @@ enum sgx_return_code {
> > SGX_CHILD_PRESENT   = 13,
> > SGX_INVALID_EINITTOKEN  = 16,
> > SGX_PAGE_NOT_MODIFIABLE = 20,
> > +   SGX_INSUFFICIENT_ENTROPY= 29,
> > +   SGX_EPC_NOT_READY   = 30,
> > +   SGX_NO_UPDATE   = 31,
> > SGX_UNMASKED_EVENT  = 128,
> >  };
> >  
> > diff --git a/arch/x86/kernel/cpu/sgx/encls.h 
> > b/arch/x86/kernel/cpu/sgx/encls.h
> > index 99004b02e2ed..3d83c76dc91f 100644
> > --- a/arch/x86/kernel/cpu/sgx/encls.h
> > +++ b/arch/x86/kernel/cpu/sgx/encls.h
> > @@ -233,4 +233,10 @@ static inline int __eaug(struct sgx_pageinfo *pginfo, 
> > void *addr)
> > return __encls_2(EAUG, pginfo, addr);
> >  }
> >  
> > +/* Update CPUSVN at runtime. */
> > +static inline int __eupdatesvn(void)
> > +{
> > +   return __encls_ret_1(EUPDATESVN, "");
> > +}
> > +
> >  #endif /* _X86_ENCLS_H */
> > diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
> > index b61d3bad0446..24563110811d 100644
> > --- a/arch/x86/kernel/cpu/sgx/main.c
> > +++ b/arch/x86/kernel/cpu/sgx/main.c
> > @@ -32,6 +32,11 @@ static DEFINE_XARRAY(sgx_epc_address_space);
> >  static LIST_HEAD(sgx_active_page_list);
> >  static DEFINE_SPINLOCK(sgx_reclaimer_lock);
> >  
> > +/*

Re: [PATCH v3 1/7] dt-bindings: input: syna,rmi4: document syna,pdt-fallback-desc

2025-03-28 Thread David Heidelberg

On 26/03/2025 11:26, Caleb Connolly wrote:



On 3/26/25 07:57, Krzysztof Kozlowski wrote:

On 25/03/2025 14:23, Caleb Connolly wrote:



On 3/25/25 08:36, Krzysztof Kozlowski wrote:

On 24/03/2025 19:00, David Heidelberg wrote:

On 10/03/2025 10:45, Krzysztof Kozlowski wrote:

On Sat, Mar 08, 2025 at 03:08:37PM +0100, David Heidelberg wrote:

From: Caleb Connolly 

This new property allows devices to specify some register values which
are missing on units with third-party replacement displays. These
displays use unofficial touch ICs which only implement a subset of the
RMI4 specification.


These are different ICs, so they have their own compatibles. Why this
cannot be deduced from the compatible?


Yes, but these identify as the originals.



It does not matter how they identify. You have the compatible for them.
If you cannot add compatible for them, how can you add dedicated
property for them?


Hi Krzysztof,

There are an unknown number of knock-off RMI4 chips which are sold in
cheap replacement display panels from multiple vendors. We suspect
there's more than one implementation.

A new compatible string wouldn't help us, since we use the same DTB on
fully original hardware as on hardware with replacement parts.

The proposed new property describes configuration registers which are
present on original RMI4 chips but missing on the third-party ones; the
contents of the registers are static.



So you want to add redundant information for existing compatible, while
claiming you cannot deduce it from that existing compatible... Well,
no.. you cannot be sure that only chosen boards will have touchscreens
replaced, thus you will have to add this property to every board using
this compatible making it equal to the compatible and we are back at my
original comment. This is deducible from the compatible. If not the new
one, then from old one.


hmm I see, so instead we should add a compatible for the specific 
variant (S3320 or something) of RMI4 in this device and handle this in 
the driver? I think that makes sense.


Agree, preparing it for v4. So far proposing `compatible = 
"syna,rmi4-s3706b-i2c", "syna,rmi4-i2c"` (as S3706B is written in the 
commit and search confirms it for OP6/6T).
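For illustration, the fallback-compatible pairing being proposed might look like the fragment below; the node name, bus, and reg address are placeholders, not taken from an actual board file.

```dts
&i2c1 {
	touchscreen@20 {
		compatible = "syna,rmi4-s3706b-i2c", "syna,rmi4-i2c";
		reg = <0x20>;
		/* existing RMI4 properties unchanged */
	};
};
```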


David>


Best regards,
Krzysztof




--
David Heidelberg




Re: bug report for linux-6.14/tools/testing/selftests/mm/pagemap_ioctl.c

2025-03-28 Thread Shuah Khan

On 3/26/25 13:25, David Binderman wrote:

Hello there,

Static analyser cppcheck says:


linux-6.14/tools/testing/selftests/mm/pagemap_ioctl.c:1061:11: style: int 
result is assigned to long long variable. If the variable is long long to avoid 
loss of information, then you have loss of information. 
[truncLongCastAssignment]
linux-6.14/tools/testing/selftests/mm/pagemap_ioctl.c:1510:11: style: int 
result is assigned to long long variable. If the variable is long long to avoid 
loss of information, then you have loss of information. 
[truncLongCastAssignment]
linux-6.14/tools/testing/selftests/mm/pagemap_ioctl.c:1523:11: style: int 
result is assigned to long long variable. If the variable is long long to avoid 
loss of information, then you have loss of information. 
[truncLongCastAssignment]
linux-6.14/tools/testing/selftests/mm/pagemap_ioctl.c:247:11: style: int result 
is assigned to long long variable. If the variable is long long to avoid loss 
of information, then you have loss of information. [truncLongCastAssignment]
linux-6.14/tools/testing/selftests/mm/pagemap_ioctl.c:435:11: style: int result 
is assigned to long long variable. If the variable is long long to avoid loss 
of information, then you have loss of information. [truncLongCastAssignment]
linux-6.14/tools/testing/selftests/mm/pagemap_ioctl.c:490:11: style: int result 
is assigned to long long variable. If the variable is long long to avoid loss 
of information, then you have loss of information. [truncLongCastAssignment]


The source code of the first one is

 mem_size = 10 * page_size;

Maybe better code:

 mem_size = 10ULL * page_size;

Regards

David Binderman



Can you send a patch for us to review?

thanks,
-- Shuah



Re: [PATCH] selftests/nolibc: drop unnecessary sys/io.h include

2025-03-28 Thread Shuah Khan

On 3/24/25 16:01, Thomas Weißschuh wrote:

The include of sys/io.h is not necessary anymore since
commit 67eb617a8e1e ("selftests/nolibc: simplify call to ioperm").
Its existence is also problematic, as the header does not exist on all
architectures.

Reported-by: Sebastian Andrzej Siewior 
Signed-off-by: Thomas Weißschuh 
---
  tools/testing/selftests/nolibc/nolibc-test.c | 1 -
  1 file changed, 1 deletion(-)

diff --git a/tools/testing/selftests/nolibc/nolibc-test.c 
b/tools/testing/selftests/nolibc/nolibc-test.c
index 
5884a891c491544050fc35b07322c73a1a9dbaf3..7a60b6ac1457e8d862ab1a6a26c9e46abec92111
 100644
--- a/tools/testing/selftests/nolibc/nolibc-test.c
+++ b/tools/testing/selftests/nolibc/nolibc-test.c
@@ -16,7 +16,6 @@
  #ifndef _NOLIBC_STDIO_H
  /* standard libcs need more includes */
  #include 
-#include 
  #include 
  #include 
  #include 

---
base-commit: bceb73904c855c78402dca94c82915f078f259dd
change-id: 20250324-nolibc-ioperm-155646560b95

Best regards,


Acked-by: Shuah Khan 

thanks,
-- Shuah



[PATCH v8 0/8] vhost: Add support of kthread API

2025-03-28 Thread Cindy Lu
In commit 6e890c5d5021 ("vhost: use vhost_tasks for worker threads"),
the vhost now uses vhost_task and operates as a child of the
owner thread. This aligns with containerization principles.
However, this change has caused confusion for some legacy
userspace applications. Therefore, we are reintroducing
support for the kthread API.

In this series, a new UAPI is implemented to allow
userspace applications to configure their thread mode.

Changelog v2:
 1. Change the module_param's name to enforce_inherit_owner, and the default 
value is true.
 2. Change the UAPI's name to VHOST_SET_INHERIT_FROM_OWNER.

Changelog v3:
 1. Change the module_param's name to inherit_owner_default, and the default 
value is true.
 2. Add a structure for task function; the worker will select a different mode 
based on the value inherit_owner.
 3. device will have their own inherit_owner in struct vhost_dev
 4. Address other comments

Changelog v4:
 1. remove the module_param, only keep the UAPI
 2. remove the structure for task function; change to use the function pointer 
in vhost_worker
 3. fix the issue in vhost_worker_create and vhost_dev_ioctl
 4. Address other comments

Changelog v5:
 1. Change wakeup and stop function pointers in struct vhost_worker to void.
 2. merging patches 4, 5, 6 in a single patch
 3. Fix spelling issues and address other comments.

Changelog v6:
 1. move the check of VHOST_NEW_WORKER from vhost_scsi to vhost
 2. Change the ioctl name VHOST_SET_INHERIT_FROM_OWNER to VHOST_FORK_FROM_OWNER
 3. reuse the function __vhost_worker_flush
 4. use an ops struct to support worker-related functions
 5. reset the value of inherit_owner in vhost_dev_reset_owner.

Changelog v7:
 1. add a KConfig knob to disable legacy app support
 2. Split the changes into two patches to separately introduce the ops and add 
kthread support.
 3. Used INT_MAX to avoid modifications in __vhost_worker_flush
 4. Rebased on the latest kernel
 5. Address other comments

Changelog v8:
 1. Rebased on the latest kernel
 2. Address some other comments

Tested with QEMU with kthread mode/task mode/kthread+task mode

Cindy Lu (8):
  vhost: Add a new parameter in vhost_dev to allow user select kthread
  vhost: Reintroduce vhost_worker to support kthread
  vhost: Add the cgroup related function
  vhost: Introduce vhost_worker_ops in vhost_worker
  vhost: Reintroduce kthread mode support in vhost
  vhost: uapi to control task mode (owner vs kthread)
  vhost: Add check for inherit_owner status
  vhost: Add a KConfig knob to enable IOCTL VHOST_FORK_FROM_OWNER

 drivers/vhost/Kconfig  |  15 +++
 drivers/vhost/vhost.c  | 219 +
 drivers/vhost/vhost.h  |  21 
 include/uapi/linux/vhost.h |  16 +++
 4 files changed, 252 insertions(+), 19 deletions(-)

-- 
2.45.0




[PATCH v8 7/8] vhost: Add check for inherit_owner status

2025-03-28 Thread Cindy Lu
The VHOST_NEW_WORKER ioctl requires the inherit_owner setting to be
true, so add a check for this.

Signed-off-by: Cindy Lu 
---
 drivers/vhost/vhost.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index ff930c2e5b78..fb0c7fb43f78 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -1018,6 +1018,13 @@ long vhost_worker_ioctl(struct vhost_dev *dev, unsigned 
int ioctl,
switch (ioctl) {
/* dev worker ioctls */
case VHOST_NEW_WORKER:
+   /*
+* vhost_tasks will account for worker threads under the 
parent's
+* NPROC value but kthreads do not. To avoid userspace 
overflowing
+* the system with worker threads inherit_owner must be true.
+*/
+   if (!dev->inherit_owner)
+   return -EFAULT;
ret = vhost_new_worker(dev, &state);
if (!ret && copy_to_user(argp, &state, sizeof(state)))
ret = -EFAULT;
-- 
2.45.0




[PATCH v8 8/8] vhost: Add a KConfig knob to enable IOCTL VHOST_FORK_FROM_OWNER

2025-03-28 Thread Cindy Lu
Introduce a new config knob `CONFIG_VHOST_ENABLE_FORK_OWNER_IOCTL`,
to control the availability of the `VHOST_FORK_FROM_OWNER` ioctl.
When CONFIG_VHOST_ENABLE_FORK_OWNER_IOCTL is set to n, the ioctl
is disabled, and any attempt to use it will result in failure.

Signed-off-by: Cindy Lu 
---
 drivers/vhost/Kconfig | 15 +++
 drivers/vhost/vhost.c |  3 +++
 2 files changed, 18 insertions(+)

diff --git a/drivers/vhost/Kconfig b/drivers/vhost/Kconfig
index b455d9ab6f3d..e5b9dcbf31b6 100644
--- a/drivers/vhost/Kconfig
+++ b/drivers/vhost/Kconfig
@@ -95,3 +95,18 @@ config VHOST_CROSS_ENDIAN_LEGACY
  If unsure, say "N".
 
 endif
+
+config VHOST_ENABLE_FORK_OWNER_IOCTL
+   bool "Enable IOCTL VHOST_FORK_FROM_OWNER"
+   default n
+   help
+ This option enables the IOCTL VHOST_FORK_FROM_OWNER, which allows
+ userspace applications to modify the thread mode for vhost devices.
+
+  By default, `CONFIG_VHOST_ENABLE_FORK_OWNER_IOCTL` is set to `n`,
+  meaning the ioctl is disabled and any operation using this ioctl
+  will fail.
+  When the configuration is enabled (y), the ioctl becomes
+  available, allowing users to set the mode if needed.
+
+ If unsure, say "N".
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index fb0c7fb43f78..568e43cb54a9 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -2294,6 +2294,8 @@ long vhost_dev_ioctl(struct vhost_dev *d, unsigned int 
ioctl, void __user *argp)
r = vhost_dev_set_owner(d);
goto done;
}
+
+#ifdef CONFIG_VHOST_ENABLE_FORK_OWNER_IOCTL
if (ioctl == VHOST_FORK_FROM_OWNER) {
u8 inherit_owner;
/*inherit_owner can only be modified before owner is set*/
@@ -2313,6 +2315,7 @@ long vhost_dev_ioctl(struct vhost_dev *d, unsigned int 
ioctl, void __user *argp)
r = 0;
goto done;
}
+#endif
/* You must be the owner to do anything else */
r = vhost_dev_check_owner(d);
if (r)
-- 
2.45.0




[PATCH v8 3/8] vhost: Add the cgroup related function

2025-03-28 Thread Cindy Lu
Add back the previously removed cgroup functions to support the kthread
mode. The biggest change for this part is in vhost_attach_cgroups() and
vhost_attach_task_to_cgroups().

The old functions were removed in
commit 6e890c5d5021 ("vhost: use vhost_tasks for worker threads")

Signed-off-by: Cindy Lu 
---
 drivers/vhost/vhost.c | 41 +
 1 file changed, 41 insertions(+)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 9500e85b42ce..20571bd6f7bd 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -22,6 +22,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -620,6 +621,46 @@ long vhost_dev_check_owner(struct vhost_dev *dev)
 }
 EXPORT_SYMBOL_GPL(vhost_dev_check_owner);
 
+struct vhost_attach_cgroups_struct {
+   struct vhost_work work;
+   struct task_struct *owner;
+   int ret;
+};
+
+static void vhost_attach_cgroups_work(struct vhost_work *work)
+{
+   struct vhost_attach_cgroups_struct *s;
+
+   s = container_of(work, struct vhost_attach_cgroups_struct, work);
+   s->ret = cgroup_attach_task_all(s->owner, current);
+}
+
+static int vhost_attach_task_to_cgroups(struct vhost_worker *worker)
+{
+   struct vhost_attach_cgroups_struct attach;
+   int saved_cnt;
+
+   attach.owner = current;
+
+   vhost_work_init(&attach.work, vhost_attach_cgroups_work);
+   vhost_worker_queue(worker, &attach.work);
+
+   mutex_lock(&worker->mutex);
+
+   /*
+* Bypass attachment_cnt check in __vhost_worker_flush:
+* Temporarily change it to INT_MAX to bypass the check
+*/
+   saved_cnt = worker->attachment_cnt;
+   worker->attachment_cnt = INT_MAX;
+   __vhost_worker_flush(worker);
+   worker->attachment_cnt = saved_cnt;
+
+   mutex_unlock(&worker->mutex);
+
+   return attach.ret;
+}
+
 /* Caller should have device mutex */
 bool vhost_dev_has_owner(struct vhost_dev *dev)
 {
-- 
2.45.0




[PATCH v8 5/8] vhost: Reintroduce kthread mode support in vhost

2025-03-28 Thread Cindy Lu
This commit restores the previously removed kthread wake/stop/create
functions and uses the ops structure vhost_worker_ops to manage worker
wakeup, stop, and creation. The function vhost_worker_create initializes
these ops pointers based on the value of inherit_owner.

The old functions were removed in
commit 6e890c5d5021 ("vhost: use vhost_tasks for worker threads")

Signed-off-by: Cindy Lu 
---
 drivers/vhost/vhost.c | 48 ++-
 drivers/vhost/vhost.h |  1 +
 2 files changed, 48 insertions(+), 1 deletion(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index c162ad772f8f..be97028a8baf 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -734,11 +734,21 @@ static void vhost_task_wakeup(struct vhost_worker *worker)
return vhost_task_wake(worker->vtsk);
 }
 
+static void vhost_kthread_wakeup(struct vhost_worker *worker)
+{
+   wake_up_process(worker->kthread_task);
+}
+
 static void vhost_task_do_stop(struct vhost_worker *worker)
 {
return vhost_task_stop(worker->vtsk);
 }
 
+static void vhost_kthread_do_stop(struct vhost_worker *worker)
+{
+   kthread_stop(worker->kthread_task);
+}
+
 static int vhost_task_worker_create(struct vhost_worker *worker,
struct vhost_dev *dev, const char *name)
 {
@@ -762,6 +772,41 @@ static int vhost_task_worker_create(struct vhost_worker 
*worker,
return 0;
 }
 
+static int vhost_kthread_worker_create(struct vhost_worker *worker,
+  struct vhost_dev *dev, const char *name)
+{
+   struct task_struct *task;
+   u32 id;
+   int ret;
+
+   task = kthread_create(vhost_run_work_kthread_list, worker, "%s", name);
+   if (IS_ERR(task))
+   return PTR_ERR(task);
+
+   worker->kthread_task = task;
+   wake_up_process(task);
+   ret = xa_alloc(&dev->worker_xa, &id, worker, xa_limit_32b, GFP_KERNEL);
+   if (ret < 0)
+   goto stop_worker;
+
+   ret = vhost_attach_task_to_cgroups(worker);
+   if (ret)
+   goto stop_worker;
+
+   worker->id = id;
+   return 0;
+
+stop_worker:
+   vhost_kthread_do_stop(worker);
+   return ret;
+}
+
+static const struct vhost_worker_ops kthread_ops = {
+   .create = vhost_kthread_worker_create,
+   .stop = vhost_kthread_do_stop,
+   .wakeup = vhost_kthread_wakeup,
+};
+
 static const struct vhost_worker_ops vhost_task_ops = {
.create = vhost_task_worker_create,
.stop = vhost_task_do_stop,
@@ -773,7 +818,8 @@ static struct vhost_worker *vhost_worker_create(struct 
vhost_dev *dev)
struct vhost_worker *worker;
char name[TASK_COMM_LEN];
int ret;
-   const struct vhost_worker_ops *ops = &vhost_task_ops;
+   const struct vhost_worker_ops *ops =
+   dev->inherit_owner ? &vhost_task_ops : &kthread_ops;
 
worker = kzalloc(sizeof(*worker), GFP_KERNEL_ACCOUNT);
if (!worker)
diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
index 98895e299efa..af4b2f7d3b91 100644
--- a/drivers/vhost/vhost.h
+++ b/drivers/vhost/vhost.h
@@ -37,6 +37,7 @@ struct vhost_worker_ops {
 };
 
 struct vhost_worker {
+   struct task_struct *kthread_task;
struct vhost_task   *vtsk;
struct vhost_dev*dev;
/* Used to serialize device wide flushing with worker swapping. */
-- 
2.45.0




[PATCH v8 6/8] vhost: uapi to control task mode (owner vs kthread)

2025-03-28 Thread Cindy Lu
Add a new UAPI to configure the vhost device to use kthread mode.
The userspace application can use the ioctl VHOST_FORK_FROM_OWNER
to choose between owner and kthread mode if necessary.
This setting must be applied before VHOST_SET_OWNER, as the worker
will be created in the VHOST_SET_OWNER function.

Signed-off-by: Cindy Lu 
---
 drivers/vhost/vhost.c  | 22 --
 include/uapi/linux/vhost.h | 16 
 2 files changed, 36 insertions(+), 2 deletions(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index be97028a8baf..ff930c2e5b78 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -1134,7 +1134,7 @@ void vhost_dev_reset_owner(struct vhost_dev *dev, struct 
vhost_iotlb *umem)
int i;
 
vhost_dev_cleanup(dev);
-
+   dev->inherit_owner = true;
dev->umem = umem;
/* We don't need VQ locks below since vhost_dev_cleanup makes sure
 * VQs aren't running.
@@ -2287,7 +2287,25 @@ long vhost_dev_ioctl(struct vhost_dev *d, unsigned int 
ioctl, void __user *argp)
r = vhost_dev_set_owner(d);
goto done;
}
-
+   if (ioctl == VHOST_FORK_FROM_OWNER) {
+   u8 inherit_owner;
+   /*inherit_owner can only be modified before owner is set*/
+   if (vhost_dev_has_owner(d)) {
+   r = -EBUSY;
+   goto done;
+   }
+   if (copy_from_user(&inherit_owner, argp, sizeof(u8))) {
+   r = -EFAULT;
+   goto done;
+   }
+   if (inherit_owner > 1) {
+   r = -EINVAL;
+   goto done;
+   }
+   d->inherit_owner = (bool)inherit_owner;
+   r = 0;
+   goto done;
+   }
/* You must be the owner to do anything else */
r = vhost_dev_check_owner(d);
if (r)
diff --git a/include/uapi/linux/vhost.h b/include/uapi/linux/vhost.h
index b95dd84eef2d..1ae0917bfeca 100644
--- a/include/uapi/linux/vhost.h
+++ b/include/uapi/linux/vhost.h
@@ -235,4 +235,20 @@
  */
 #define VHOST_VDPA_GET_VRING_SIZE  _IOWR(VHOST_VIRTIO, 0x82,   \
  struct vhost_vring_state)
+
+/**
+ * VHOST_FORK_FROM_OWNER - Set the inherit_owner flag for the vhost device,
+ * This ioctl must be called before VHOST_SET_OWNER.
+ *
+ * @param inherit_owner: An 8-bit value that determines the vhost thread mode
+ *
+ * When inherit_owner is set to 1(default value):
+ *   - Vhost will create tasks similar to processes forked from the owner,
+ * inheriting all of the owner's attributes.
+ *
+ * When inherit_owner is set to 0:
+ *   - Vhost will create tasks as kernel thread.
+ */
+#define VHOST_FORK_FROM_OWNER _IOW(VHOST_VIRTIO, 0x83, __u8)
+
 #endif
-- 
2.45.0




[PATCH v8 2/8] vhost: Reintroduce vhost_worker to support kthread

2025-03-28 Thread Cindy Lu
Add the previously removed function vhost_worker() back to support
kthreads, and rename it to vhost_run_work_kthread_list.

The old function vhost_worker() was changed to support vhost_task in
commit 6e890c5d5021 ("vhost: use vhost_tasks for worker threads")
and changed to use an xarray in
commit 1cdaafa1b8b4 ("vhost: replace single worker pointer with xarray")

Signed-off-by: Cindy Lu 
---
 drivers/vhost/vhost.c | 38 ++
 1 file changed, 38 insertions(+)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 250dc43f1786..9500e85b42ce 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -388,6 +388,44 @@ static void vhost_vq_reset(struct vhost_dev *dev,
__vhost_vq_meta_reset(vq);
 }
 
+static int vhost_run_work_kthread_list(void *data)
+{
+   struct vhost_worker *worker = data;
+   struct vhost_work *work, *work_next;
+   struct vhost_dev *dev = worker->dev;
+   struct llist_node *node;
+
+   kthread_use_mm(dev->mm);
+
+   for (;;) {
+   /* mb paired w/ kthread_stop */
+   set_current_state(TASK_INTERRUPTIBLE);
+
+   if (kthread_should_stop()) {
+   __set_current_state(TASK_RUNNING);
+   break;
+   }
+   node = llist_del_all(&worker->work_list);
+   if (!node)
+   schedule();
+
+   node = llist_reverse_order(node);
+   /* make sure flag is seen after deletion */
+   smp_wmb();
+   llist_for_each_entry_safe(work, work_next, node, node) {
+   clear_bit(VHOST_WORK_QUEUED, &work->flags);
+   __set_current_state(TASK_RUNNING);
+   kcov_remote_start_common(worker->kcov_handle);
+   work->fn(work);
+   kcov_remote_stop();
+   cond_resched();
+   }
+   }
+   kthread_unuse_mm(dev->mm);
+
+   return 0;
+}
+
 static bool vhost_run_work_list(void *data)
 {
struct vhost_worker *worker = data;
-- 
2.45.0




[PATCH v8 1/8] vhost: Add a new parameter in vhost_dev to allow user select kthread

2025-03-28 Thread Cindy Lu
Vhost now uses vhost_task, and workers run as children of the owner
thread. While this aligns with containerization principles, it confuses
some legacy userspace apps. Therefore, we are reintroducing kthread API
support.

Introduce a new parameter to enable users to choose between
kthread and task mode.

Signed-off-by: Cindy Lu 
---
 drivers/vhost/vhost.c | 1 +
 drivers/vhost/vhost.h | 9 +
 2 files changed, 10 insertions(+)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 63612faeab72..250dc43f1786 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -552,6 +552,7 @@ void vhost_dev_init(struct vhost_dev *dev,
dev->byte_weight = byte_weight;
dev->use_worker = use_worker;
dev->msg_handler = msg_handler;
+   dev->inherit_owner = true;
init_waitqueue_head(&dev->wait);
INIT_LIST_HEAD(&dev->read_list);
INIT_LIST_HEAD(&dev->pending_list);
diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
index bb75a292d50c..19bb94922a0e 100644
--- a/drivers/vhost/vhost.h
+++ b/drivers/vhost/vhost.h
@@ -176,6 +176,15 @@ struct vhost_dev {
int byte_weight;
struct xarray worker_xa;
bool use_worker;
+   /*
+* If inherit_owner is true we use vhost_tasks to create
+* the worker so all settings/limits like cgroups, NPROC,
+* scheduler, etc are inherited from the owner. If false,
+* we use kthreads and only attach to the same cgroups
+* as the owner for compat with older kernels.
+* here we use true as default value
+*/
+   bool inherit_owner;
int (*msg_handler)(struct vhost_dev *dev, u32 asid,
   struct vhost_iotlb_msg *msg);
 };
-- 
2.45.0




Re: Symbol too long for allsyms warnings on KSYM_NAME_LEN

2025-03-28 Thread Arnd Bergmann
On Thu, Mar 27, 2025, at 14:58, Peter Zijlstra wrote:
> On Thu, Mar 27, 2025 at 09:38:46AM +0100, Arnd Bergmann wrote:
>> My randconfig builds sometimes (around one in every 700 configs) run
>> into this warning on x86:
>> 
>> Symbol 
>> __pfx_sg1h2i3j4k5l6m7ng1h2i3j4k5l6m7nng1h2i3j4k5l6m7ng1h2i3j4k5l6m7nnng1h2i3j4k5l6m7ng1h2i3j4k5l6m7nng1h2i3j4k5l6m7ng1h2i3j4k5l6m7g1h2i3j4k5l6m7ng1h2i3j4k5l6m7nng1h2i3j4k5l6m7ng1h2i3j4k5l6m7nnng1h2i3j4k5l6m7ng1h2i3j4k5l6m7nng1h2i3j4k5l6m7ng1h2i3j4k5l6m7ng1h2i3j4k5l6m7ng1h2i3j4k5l6m7nng1h2i3j4k5l6m7ng1h2i3j4k5l6m7nnng1h2i3j4k5l6m7ng1h2i3j4k5l6m7nng1h2i3j4k5l6m7ng1h2i3j4k5l6m7g1h2i3j4k5l6m7ng1h2i3j4k5l6m7nng1h2i3j4k5l6m7ng1h2i3j4k5l6m7nnng1h2i3j4k5l6m7ng1h2i3j4k5l6m7nng1h2i3j4k5l6m7ng1h2i3j4k5l6m7n
>>  too long for kallsyms (517 >= 512).
>> Please increase KSYM_NAME_LEN both in kernel and kallsyms.c
>> 
>> The check that gets triggered was added in commit c104c16073b
>> ("Kunit to check the longest symbol length"), see
>> https://lore.kernel.org/all/20241117195923.222145-1-sergio.coll...@gmail.com/
>> 
>> and the overlong identifier seems to be the result of objtool adding
>> the six-byte "__pfx_" string to a symbol in elf_create_prefix_symbol()
>> when CONFIG_FUNCTION_PADDING_CFI is set.
>> 
>> I think the suggestion to "Please increase KSYM_NAME_LEN both in
>> kernel and kallsyms.c" is misleading here and should probably be
>> changed. I don't know if this something that objtool should work
>> around, or something that needs to be adapted in the test.
>
> Probably the test needs to be fixed; objtool can't really do anything
> here, it just takes the existing symname and prefixes it.

I found a workaround that avoids the problem for me now, see
https://lore.kernel.org/linux-kbuild/20250328112156.2614513-1-a...@kernel.org/

  Arnd



Re: [PATCH 1/4] x86/sgx: Add total number of EPC pages

2025-03-28 Thread Jarkko Sakkinen
On Fri, Mar 28, 2025 at 10:42:18AM +0200, Jarkko Sakkinen wrote:
> In your code example you had a loop inside a spinlock, which was based
> on the return code of an opcode, i.e. a potentially infinite loop.
> 
> I'd like to remind you that the hardware I have is a NUC7 from 2018, so
> you really have to nail down how things will work semantically, as I
> can only think about these things at a theoretical level ;-) [1]

That said, I do execute these on the NUC7, but it is getting a bit old.

The cheapest hardware I've heard of is a Xeon E-2334, but even that,
with a case etc., is nearing 2k in price.

BR, Jarkko



Re: [PATCH net-next v24 00/23] Introducing OpenVPN Data Channel Offload

2025-03-28 Thread Antonio Quartulli

Hi Sabrina,

do you plan to drop more comments on the patchset at this point?

I have gone through all requested changes and I'll just get the patches 
ready for submission once net-next is open again.


Thanks a lot!

Cheers,

On 18/03/2025 02:40, Antonio Quartulli wrote:

Notable changes since v23:
* dropped call to netif_tx_start/stop_all_queues()
* dropped NETIF_F_HW_CSUM and NETIF_F_RXCSUM dev flags
* dropped conditional call to skb_checksum_help() due to the point above
* added call to dst_cache_reset() in nl_peer_modify()
* dropped obsolete comment in ovpn_peer_keepalive_work()
* reversed scheduling delay computation in ovpn_peer_keepalive_work()

Please note that some patches were already reviewed/tested by a few
people. These patches have retained the tags as they have hardly been
touched.

The latest code can also be found at:

https://github.com/OpenVPN/ovpn-net-next

Thanks a lot!
Best Regards,

Antonio Quartulli
OpenVPN Inc.

---
Antonio Quartulli (23):
   net: introduce OpenVPN Data Channel Offload (ovpn)
   ovpn: add basic netlink support
   ovpn: add basic interface creation/destruction/management routines
   ovpn: keep carrier always on for MP interfaces
   ovpn: introduce the ovpn_peer object
   ovpn: introduce the ovpn_socket object
   ovpn: implement basic TX path (UDP)
   ovpn: implement basic RX path (UDP)
   ovpn: implement packet processing
   ovpn: store tunnel and transport statistics
   ovpn: implement TCP transport
   skb: implement skb_send_sock_locked_with_flags()
   ovpn: add support for MSG_NOSIGNAL in tcp_sendmsg
   ovpn: implement multi-peer support
   ovpn: implement peer lookup logic
   ovpn: implement keepalive mechanism
   ovpn: add support for updating local or remote UDP endpoint
   ovpn: implement peer add/get/dump/delete via netlink
   ovpn: implement key add/get/del/swap via netlink
   ovpn: kill key and notify userspace in case of IV exhaustion
   ovpn: notify userspace when a peer is deleted
   ovpn: add basic ethtool support
   testing/selftests: add test tool and scripts for ovpn module

  Documentation/netlink/specs/ovpn.yaml  |  367 +++
  Documentation/netlink/specs/rt_link.yaml   |   16 +
  MAINTAINERS|   11 +
  drivers/net/Kconfig|   15 +
  drivers/net/Makefile   |1 +
  drivers/net/ovpn/Makefile  |   22 +
  drivers/net/ovpn/bind.c|   55 +
  drivers/net/ovpn/bind.h|  101 +
  drivers/net/ovpn/crypto.c  |  211 ++
  drivers/net/ovpn/crypto.h  |  145 ++
  drivers/net/ovpn/crypto_aead.c |  409 
  drivers/net/ovpn/crypto_aead.h |   29 +
  drivers/net/ovpn/io.c  |  455 
  drivers/net/ovpn/io.h  |   34 +
  drivers/net/ovpn/main.c|  330 +++
  drivers/net/ovpn/main.h|   14 +
  drivers/net/ovpn/netlink-gen.c |  213 ++
  drivers/net/ovpn/netlink-gen.h |   41 +
  drivers/net/ovpn/netlink.c | 1250 ++
  drivers/net/ovpn/netlink.h |   18 +
  drivers/net/ovpn/ovpnpriv.h|   57 +
  drivers/net/ovpn/peer.c| 1364 +++
  drivers/net/ovpn/peer.h|  163 ++
  drivers/net/ovpn/pktid.c   |  129 ++
  drivers/net/ovpn/pktid.h   |   87 +
  drivers/net/ovpn/proto.h   |  118 +
  drivers/net/ovpn/skb.h |   61 +
  drivers/net/ovpn/socket.c  |  244 ++
  drivers/net/ovpn/socket.h  |   49 +
  drivers/net/ovpn/stats.c   |   21 +
  drivers/net/ovpn/stats.h   |   47 +
  drivers/net/ovpn/tcp.c |  592 +
  drivers/net/ovpn/tcp.h |   36 +
  drivers/net/ovpn/udp.c |  442 
  drivers/net/ovpn/udp.h |   25 +
  include/linux/skbuff.h |2 +
  include/uapi/linux/if_link.h   |   15 +
  include/uapi/linux/ovpn.h  |  109 +
  include/uapi/linux/udp.h   |1 +
  net/core/skbuff.c  |   18 +-
  net/ipv6/af_inet6.c|1 +
  net/ipv6/udp.c |1 +
  tools/testing/selftests/Makefile   |1 +
  tools/testing/selftests/net/ovpn/.gitignore|2 +
  tools/testing/selftests/net/ovpn/Makefile  |   31 +
  t

RE: [PATCH 4/4] x86/sgx: Implement ENCLS[EUPDATESVN] and opportunistically call it during first EPC page alloc

2025-03-28 Thread Reshetova, Elena

> On Thu, Mar 27, 2025 at 03:42:30PM +, Reshetova, Elena wrote:
> > > > > > +   case SGX_NO_UPDATE:
> > > > > > +   pr_debug("EUPDATESVN was successful, but CPUSVN
> was not
> > > > > updated, "
> > > > > > +   "because current SVN was not newer than
> > > > > CPUSVN.\n");
> > > > > > +   break;
> > > > > > +   case SGX_EPC_NOT_READY:
> > > > > > +   pr_debug("EPC is not ready for SVN update.");
> > > > > > +   break;
> > > > > > +   case SGX_INSUFFICIENT_ENTROPY:
> > > > > > +   pr_debug("CPUSVN update is failed due to Insufficient
> > > > > entropy in RNG, "
> > > > > > +   "please try it later.\n");
> > > > > > +   break;
> > > > > > +   case SGX_EPC_PAGE_CONFLICT:
> > > > > > +   pr_debug("CPUSVN update is failed due to
> concurrency
> > > > > violation, please "
> > > > > > +   "stop running any other ENCLS leaf and try it
> > > > > later.\n");
> > > > > > +   break;
> > > > > > +   default:
> > > > > > +   break;
> > > > >
> > > > > Remove pr_debug() statements.
> > > >
> > > > This I am not sure it is good idea. I think it would be useful for 
> > > > system
> > > > admins to have a way to see that update either happened or not.
> > > > It is true that you can find this out by requesting a new SGX 
> > > > attestation
> > > > quote (and see if newer SVN is used), but it is not the faster way.
> > >
> > > Maybe pr_debug() is the wrong level if they are meant for sysadmins?
> > >
> > > I mean these should not happen in normal behavior, like, ever? As
> > > pr_debug() I don't really get this.
> >
> > SGX_NO_UPDATE will absolutely happen normally all the time.
> > Since EUPDATESVN is executed every time EPC is empty, this is the
> > most common code you will get back (because microcode updates are rare).
> > Others yes, that would indicate some error condition.
> > So, what is the pr_level that you would suggest?
> 
> Right, got it. That changes my conclusions:
> 
> So I'd reformulate it like:
> 
>   switch (ret) {
>   case 0:
>   pr_info("EUPDATESVN: success\n");
>   break;
>   case SGX_EPC_NOT_READY:
>   case SGX_INSUFFICIENT_ENTROPY:
>   case SGX_EPC_PAGE_CONFLICT:
>   pr_err("EUPDATESVN: error %d\n", ret);
>   /* TODO: block/teardown driver? */
>   break;
>   case SGX_NO_UPDATE:
>   break;
>   default:
>   pr_err("EUPDATESVN: unknown error %d\n", ret);
>   /* TODO: block/teardown driver? */
>   break;
>   }
> 
> Since when this is executed EPC usage is zero error cases should block
> or teardown SGX driver, presuming that they are because of either
> incorrect driver state or spurious error code.

I agree with the above, but I am not sure at all about blocking/tearing
down the driver. These are all potentially transient conditions, and
SGX_INSUFFICIENT_ENTROPY is even outside of the SGX driver's control and
*does not* indicate any error condition on the driver side itself.
SGX_EPC_NOT_READY and SGX_EPC_PAGE_CONFLICT would mean we have a bug
somewhere, because we thought we could go and do EUPDATESVN on an empty
EPC and prevented anyone from creating pages in the meanwhile, but it
looks like we missed something. That said, I don't know if we want to
fail the whole system in case we have such a code bug; this is very
aggressive (in case it is some rare edge condition that no one knew
about or guessed). So, I would propose to print the pr_err() as you have
above, but avoid destroying the driver.
Would this work?
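To make that concrete, here is a minimal user-space sketch of the error handling being proposed: log errors, but never tear anything down. fprintf() stands in for pr_err()/pr_info(); the return-code values are taken from the patch, everything else is illustrative:

```c
#include <stdio.h>

/* Return codes as defined in the EUPDATESVN patch. */
#define SGX_INSUFFICIENT_ENTROPY 29
#define SGX_EPC_NOT_READY        30
#define SGX_NO_UPDATE            31

/* Report the EUPDATESVN result without ever tearing down the driver:
 * SGX_NO_UPDATE is the common, silent case; anything else is logged
 * and reported to the caller, but the driver stays alive. */
static int sgx_handle_eupdatesvn_result(int ret)
{
	switch (ret) {
	case 0:
		fprintf(stderr, "EUPDATESVN: success\n");
		return 0;
	case SGX_NO_UPDATE:
		/* Current SVN not newer than CPUSVN: nothing to report. */
		return 0;
	default:
		/* Transient or unexpected: log it, keep going. */
		fprintf(stderr, "EUPDATESVN: error %d\n", ret);
		return -1;
	}
}
```

The point is simply that every non-success code takes the pr_err() path, and none of them blocks or destroys the driver.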

Best Regards,
Elena.


> 
> If this happens, we definitely do not want service, right?
> 
> I'm not sure of all error codes how serious they are, or are all of them
> consequence of incorrectly working driver.
> 
> BR, Jarkko


Re: [RFC PATCH v3 5/8] KVM: arm64: Introduce module param to partition the PMU

2025-03-28 Thread James Clark




On 13/02/2025 6:03 pm, Colton Lewis wrote:

For PMUv3, the register MDCR_EL2.HPMN partitions the PMU counters
into two ranges, where counters 0..HPMN-1 are accessible by EL1 and, if
allowed, EL0, while counters HPMN..N are only accessible by EL2.

Introduce a module parameter in KVM to set this register. The name
reserved_host_counters reflects the intent to reserve some counters
for the host so the guest may eventually be allowed direct access to a
subset of PMU functionality for increased performance.

Track HPMN and whether the pmu is partitioned in struct arm_pmu
because both KVM and the PMUv3 driver will need to know that to handle
guests correctly.

Due to the difficulty this feature would create for the driver running
at EL1 on the host, partitioning is only allowed in VHE mode. Making it
work in nVHE mode would require a hypercall for every register access,
because the counters reserved for the host by HPMN are now only
accessible to EL2.

The parameter is only configurable at boot time. Making the parameter
configurable on a running system is dangerous due to the difficulty of
knowing for sure that no counters are in use anywhere, such that it
would be safe to reprogram HPMN.



Hi Colton,

For some high-level feedback on the RFC, it probably makes sense to 
include the other half of the feature at the same time. I think there is 
a risk that it requires something slightly different from what's here, 
and there ends up being some churn.


Other than that I think it looks ok apart from some minor code review nits.

I was also thinking about how BRBE interacts with this. Alex has done 
some analysis showing that it's difficult to use BRBE in guests with 
virtualized counters, due to the fact that BRBE freezes on any counter 
overflow, rather than just guest ones. That leaves the guest with branch 
blackout windows in the delay between a host counter overflowing, the 
interrupt being taken, and BRBE being restarted.


But with HPMN, BRBE does allow freeze on overflow of only one partition 
or the other (or both, but I don't think we'd want that) e.g.:


 RNXCWF: If EL2 is implemented, a BRBE freeze event occurs when all of
 the following are true:

 * BRBCR_EL1.FZP is 1.
 * Generation of Branch records is not paused.
 * PMOVSCLR_EL0[(MDCR_EL2.HPMN-1):0] is nonzero.
 * The PE is in a BRBE Non-prohibited region.
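Translated into code, the quoted rule is just a predicate over those four inputs. The following is a user-space model for illustration only; the function and parameter names are mine, not from the architecture headers:

```c
#include <stdbool.h>
#include <stdint.h>

/* Model of the RNXCWF rule quoted above: a BRBE freeze event occurs
 * when BRBCR_EL1.FZP is 1, record generation is not paused, any
 * overflow bit in PMOVSCLR_EL0[(HPMN-1):0] is set (i.e. a counter in
 * the EL1/EL0-accessible partition overflowed), and the PE is in a
 * BRBE non-prohibited region. */
static bool brbe_freeze_event(bool fzp, bool paused, uint64_t pmovsclr,
			      unsigned int hpmn, bool non_prohibited)
{
	uint64_t lower_mask = (hpmn >= 64) ? ~0ULL : ((1ULL << hpmn) - 1);

	return fzp && !paused && (pmovsclr & lower_mask) && non_prohibited;
}
```

With HPMN = 4, an overflow on counter 5 (a host-partition counter) does not freeze BRBE, while an overflow on counter 2 does, which is exactly the per-partition freeze discussed below.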

Unfortunately that means we could only let guests use BRBE with a 
partitioned PMU, which would massively reduce flexibility if hosts have 
to lose counters just so the guest can use BRBE.


I don't know if this is a stupid idea, but instead of having a fixed 
number for the partition, wouldn't it be nice if we could trap and 
increment HPMN on the first guest use of a counter, then decrement it on 
guest exit depending on what's still in use? The host would always 
assign its counters from the top down, and guests go bottom up if they 
want PMU passthrough. Maybe it's too complicated or won't work for 
various reasons, but because of BRBE the counter partitioning changes go 
from an optimization to almost a necessity.
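As a purely illustrative model of that allocation scheme (nothing here corresponds to real KVM code; the struct and helper names are made up), the bookkeeping might look like:

```c
#include <stdint.h>

/* Toy model: the host claims counters from the top down, the guest
 * from the bottom up; hpmn is the boundary, i.e. one past the highest
 * counter index the guest may touch. */
struct pmu_model {
	unsigned int nr_counters;
	unsigned int hpmn;	/* guest owns counters [0, hpmn) */
	uint64_t host_in_use;	/* bitmask of host-owned counters */
};

/* Guest traps on first use of a new counter: grow its partition,
 * unless the host already owns the next counter up. Returns the
 * claimed counter index, or -1 if none is available. */
static int guest_claim_counter(struct pmu_model *p)
{
	unsigned int idx = p->hpmn;

	if (idx >= p->nr_counters || (p->host_in_use & (1ULL << idx)))
		return -1;
	p->hpmn++;
	return (int)idx;
}

/* On guest exit, shrink the partition back past counters no longer
 * in use (here simplified to releasing the topmost guest counter). */
static void guest_release_counter(struct pmu_model *p)
{
	if (p->hpmn > 0)
		p->hpmn--;
}
```

The interesting (and hard) part, which this sketch glosses over, is deciding on guest exit which counters are genuinely still in use before shrinking HPMN.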



Signed-off-by: Colton Lewis 
---
  arch/arm64/include/asm/kvm_pmu.h |  4 +++
  arch/arm64/kvm/Makefile  |  2 +-
  arch/arm64/kvm/debug.c   |  9 --
  arch/arm64/kvm/pmu-part.c| 47 
  arch/arm64/kvm/pmu.c |  2 ++
  include/linux/perf/arm_pmu.h |  2 ++
  6 files changed, 62 insertions(+), 4 deletions(-)
  create mode 100644 arch/arm64/kvm/pmu-part.c

diff --git a/arch/arm64/include/asm/kvm_pmu.h b/arch/arm64/include/asm/kvm_pmu.h
index 613cddbdbdd8..174b7f376d95 100644
--- a/arch/arm64/include/asm/kvm_pmu.h
+++ b/arch/arm64/include/asm/kvm_pmu.h
@@ -22,6 +22,10 @@ bool kvm_set_pmuserenr(u64 val);
  void kvm_vcpu_pmu_resync_el0(void);
  void kvm_host_pmu_init(struct arm_pmu *pmu);
  
+u8 kvm_pmu_get_reserved_counters(void);

+u8 kvm_pmu_hpmn(u8 nr_counters);
+void kvm_pmu_partition(struct arm_pmu *pmu);
+
  #else
  
  static inline void kvm_set_pmu_events(u64 set, struct perf_event_attr *attr) {}

diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile
index 3cf7adb2b503..065a6b804c84 100644
--- a/arch/arm64/kvm/Makefile
+++ b/arch/arm64/kvm/Makefile
@@ -25,7 +25,7 @@ kvm-y += arm.o mmu.o mmio.o psci.o hypercalls.o pvtime.o \
 vgic/vgic-mmio-v3.o vgic/vgic-kvm-device.o \
 vgic/vgic-its.o vgic/vgic-debug.o
  
-kvm-$(CONFIG_HW_PERF_EVENTS)  += pmu-emul.o pmu.o

+kvm-$(CONFIG_HW_PERF_EVENTS)  += pmu-emul.o pmu-part.o pmu.o
  kvm-$(CONFIG_ARM64_PTR_AUTH)  += pauth.o
  kvm-$(CONFIG_PTDUMP_STAGE2_DEBUGFS) += ptdump.o
  
diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c

index 7fb1d9e7180f..b5ac5a213877 100644
--- a/arch/arm64/kvm/debug.c
+++ b/arch/arm64/kvm/debug.c
@@ -31,15 +31,18 @@
   */
  static void kvm_arm_setup_mdcr_el2(struct kvm_vcpu *vcpu)
  {
+   u8 counters = *host_data_ptr(nr_event_counters);
+   u8 hpmn = kvm_pmu_hpmn(counters);
+
preempt_disable();
  


Would you no

[PATCH net 4/4] selftests: mptcp: ignore mptcp_diag binary

2025-03-28 Thread Matthieu Baerts (NGI0)
A new binary is now generated by the MPTCP selftests: mptcp_diag.

Like the other binaries from this directory, there is no need to track
this in Git, it should then be ignored.

Fixes: 00f5e338cf7e ("selftests: mptcp: Add a tool to get specific msk_info")
Reviewed-by: Mat Martineau 
Signed-off-by: Matthieu Baerts (NGI0) 
---
 tools/testing/selftests/net/mptcp/.gitignore | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/testing/selftests/net/mptcp/.gitignore 
b/tools/testing/selftests/net/mptcp/.gitignore
index 
49daae73c41e6f86c6f0e47aa42426e5ad5c17e6..833279fb34e2dd74a27f16c26e44108029dd45e1
 100644
--- a/tools/testing/selftests/net/mptcp/.gitignore
+++ b/tools/testing/selftests/net/mptcp/.gitignore
@@ -1,5 +1,6 @@
 # SPDX-License-Identifier: GPL-2.0-only
 mptcp_connect
+mptcp_diag
 mptcp_inq
 mptcp_sockopt
 pm_nl_ctl

-- 
2.48.1




[PATCH v2 2/2] x86/sgx: Implement EUPDATESVN and opportunistically call it during first EPC page alloc

2025-03-28 Thread Elena Reshetova
The SGX architecture introduced a new instruction, EUPDATESVN, in
Ice Lake. It allows updating the security SVN version, provided that
the EPC is completely empty. The latter is required for security
reasons, in order to reason that an enclave's security posture is as
secure as the security SVN version of the TCB that created it.
Additionally, it is important to ensure that no concurrent page
creation happens in the EPC while ENCLS[EUPDATESVN] runs, because it
might result in a #GP delivered to the creator. Legacy SW might not be
prepared to handle such unexpected #GPs, and therefore this patch
introduces a locking mechanism to ensure no concurrent EPC allocations
can happen.

It is also ensured that ENCLS[EUPDATESVN] is not called when running
in a VM, since it has no meaning in this context (applying microcode
updates is limited to the host OS) and would only create unnecessary
load.
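The invariant the locking enforces can be sketched in a toy, single-threaded user-space model: EUPDATESVN is only attempted while the EPC is empty, and no page can be created while it runs. In the kernel this is done with sgx_epc_eupdatesvn_lock; here a flag stands in for the lock, and the names are illustrative:

```c
#include <stdbool.h>

/* Toy model of the patch's invariant. */
static long nr_used_pages;
static bool eupdatesvn_in_progress;

static int epc_alloc_page(void)
{
	if (eupdatesvn_in_progress)
		return -1;	/* real code: wait on the spinlock */
	nr_used_pages++;
	return 0;
}

static void epc_free_page(void)
{
	nr_used_pages--;
}

static int try_eupdatesvn(void)
{
	if (nr_used_pages != 0)
		return -1;	/* EPC not empty: skip the attempt */
	eupdatesvn_in_progress = true;
	/* real code: __eupdatesvn() runs here; no pages can appear */
	eupdatesvn_in_progress = false;
	return 0;
}
```

Any allocation path that observes the update in progress must block (here: fail) until it completes, which is exactly what serializing on the new spinlock achieves.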

This patch is based on a previous submission by Cathy Zhang:
https://lore.kernel.org/all/20220520103904.1216-1-cathy.zh...@intel.com/

Signed-off-by: Elena Reshetova 
---
 arch/x86/include/asm/sgx.h  | 41 +
 arch/x86/kernel/cpu/sgx/encls.h |  6 
 arch/x86/kernel/cpu/sgx/main.c  | 63 -
 arch/x86/kernel/cpu/sgx/sgx.h   |  1 +
 4 files changed, 95 insertions(+), 16 deletions(-)

diff --git a/arch/x86/include/asm/sgx.h b/arch/x86/include/asm/sgx.h
index 6a0069761508..5caf5c31ebc6 100644
--- a/arch/x86/include/asm/sgx.h
+++ b/arch/x86/include/asm/sgx.h
@@ -26,23 +26,26 @@
 #define SGX_CPUID_EPC_SECTION  0x1
 /* The bitmask for the EPC section type. */
 #define SGX_CPUID_EPC_MASK GENMASK(3, 0)
+/* EUPDATESVN presence indication */
+#define SGX_CPUID_EUPDATESVN   BIT(10)
 
 enum sgx_encls_function {
-   ECREATE = 0x00,
-   EADD= 0x01,
-   EINIT   = 0x02,
-   EREMOVE = 0x03,
-   EDGBRD  = 0x04,
-   EDGBWR  = 0x05,
-   EEXTEND = 0x06,
-   ELDU= 0x08,
-   EBLOCK  = 0x09,
-   EPA = 0x0A,
-   EWB = 0x0B,
-   ETRACK  = 0x0C,
-   EAUG= 0x0D,
-   EMODPR  = 0x0E,
-   EMODT   = 0x0F,
+   ECREATE = 0x00,
+   EADD= 0x01,
+   EINIT   = 0x02,
+   EREMOVE = 0x03,
+   EDGBRD  = 0x04,
+   EDGBWR  = 0x05,
+   EEXTEND = 0x06,
+   ELDU= 0x08,
+   EBLOCK  = 0x09,
+   EPA = 0x0A,
+   EWB = 0x0B,
+   ETRACK  = 0x0C,
+   EAUG= 0x0D,
+   EMODPR  = 0x0E,
+   EMODT   = 0x0F,
+   EUPDATESVN  = 0x18,
 };
 
 /**
@@ -73,6 +76,11 @@ enum sgx_encls_function {
  * public key does not match IA32_SGXLEPUBKEYHASH.
  * %SGX_PAGE_NOT_MODIFIABLE:   The EPC page cannot be modified because it
  * is in the PENDING or MODIFIED state.
+ * %SGX_INSUFFICIENT_ENTROPY:  Insufficient entropy in RNG.
+ * %SGX_EPC_NOT_READY: EPC is not ready for SVN update.
+ * %SGX_NO_UPDATE: EUPDATESVN was successful, but CPUSVN was not
+ * updated because current SVN was not newer than
+ * CPUSVN.
  * %SGX_UNMASKED_EVENT:An unmasked event, e.g. INTR, was 
received
  */
 enum sgx_return_code {
@@ -81,6 +89,9 @@ enum sgx_return_code {
SGX_CHILD_PRESENT   = 13,
SGX_INVALID_EINITTOKEN  = 16,
SGX_PAGE_NOT_MODIFIABLE = 20,
+   SGX_INSUFFICIENT_ENTROPY= 29,
+   SGX_EPC_NOT_READY   = 30,
+   SGX_NO_UPDATE   = 31,
SGX_UNMASKED_EVENT  = 128,
 };
 
diff --git a/arch/x86/kernel/cpu/sgx/encls.h b/arch/x86/kernel/cpu/sgx/encls.h
index 99004b02e2ed..3d83c76dc91f 100644
--- a/arch/x86/kernel/cpu/sgx/encls.h
+++ b/arch/x86/kernel/cpu/sgx/encls.h
@@ -233,4 +233,10 @@ static inline int __eaug(struct sgx_pageinfo *pginfo, void 
*addr)
return __encls_2(EAUG, pginfo, addr);
 }
 
+/* Update CPUSVN at runtime. */
+static inline int __eupdatesvn(void)
+{
+   return __encls_ret_1(EUPDATESVN, "");
+}
+
 #endif /* _X86_ENCLS_H */
diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
index b61d3bad0446..24563110811d 100644
--- a/arch/x86/kernel/cpu/sgx/main.c
+++ b/arch/x86/kernel/cpu/sgx/main.c
@@ -32,6 +32,11 @@ static DEFINE_XARRAY(sgx_epc_address_space);
 static LIST_HEAD(sgx_active_page_list);
 static DEFINE_SPINLOCK(sgx_reclaimer_lock);
 
+/* This lock is held to prevent new EPC pages from being created
+ * during the execution of ENCLS[EUPDATESVN].
+ */
+static DEFINE_SPINLOCK(sgx_epc_eupdatesvn_lock);
+
 static atomic_long_t sgx_nr_used_pages = ATOMIC_LONG_INIT(0);
 static unsigned long sgx_nr_total_pages;
 
@@ -457,7 +462,17 @@ static struct sgx_epc_page 
*__sgx_alloc_epc_page_from_node(int nid)
page->flags = 0;
 
spin_unlock(&node->lock);
-   atomic_long_inc(&sgx_nr_used_