[ovmf test] 172883: regressions - FAIL

2022-08-30 Thread osstest service owner
flight 172883 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/172883/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-i386-libvirt            6 libvirt-build            fail REGR. vs. 172136
 build-amd64-libvirt           6 libvirt-build            fail REGR. vs. 172136
 test-amd64-amd64-xl-qemuu-ovmf-amd64 18 guest-localmigrate/x10 fail REGR. vs. 172136

version targeted for testing:
 ovmf 383d34159d136f2dc923dfb6a722912b1af451b7
baseline version:
 ovmf 444260d45ec2a84e8f8c192b3539a3cd5591d009

Last test of basis   172136  2022-08-04 06:43:42 Z   26 days
Failing since        172151  2022-08-05 02:40:28 Z   26 days  210 attempts
Testing same since   172883  2022-08-31 01:56:04 Z    0 days    1 attempts


People who touched revisions under test:
  Abdul Lateef Attar 
  Abner Chang 
  Ard Biesheuvel 
  Bob Feng 
  Chasel Chiu 
  Czajkowski, Maciej 
  Dimitrije Pavlov 
  Dun Tan 
  Edward Pickup 
  Foster Nong 
  Gregx Yeh 
  Guo Dong 
  Igor Kulchytskyy 
  James Lu 
  Jose Marinho 
  KasimX Liu 
  Kavya 
  Konstantin Aladyshev 
  Liming Gao 
  Liu, Zhiguang 
  Maciej Czajkowski 
  Michael D Kinney 
  Michael Kubacki 
  Ray Ni 
  Rebecca Cran 
  Sainadh Nagolu 
  Sami Mujawar 
  Shengfengx Xue 
  Zhiguang Liu 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  fail
 build-i386-libvirt   fail
 build-amd64-pvops pass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 fail
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.

(No revision log; it would be 1058 lines long.)



[RFC PATCH 27/30] Code tagging based latency tracking

2022-08-30 Thread Suren Baghdasaryan
From: Kent Overstreet 

This adds the ability to easily instrument code for measuring latency.
To use, add the following calls to your code at the start and end of
the event you wish to measure:

  code_tag_time_stats_start(start_time);
  code_tag_time_stats_finish(start_time);

Statistics will then show up in debugfs under
/sys/kernel/debug/time_stats, listed by file and line number.

Statistics measured include weighted averages of frequency and duration,
max duration, and quantiles.

This patch also instruments all calls to init_wait and finish_wait,
which includes all calls to wait_event. Example debugfs output:

fs/xfs/xfs_trans_ail.c:746 module:xfs func:xfs_ail_push_all_sync
count:  17
rate:   0/sec
frequency:  2 sec
avg duration:   10 us
max duration:   232 us
quantiles (ns): 128 128 128 128 128 128 128 128 128 128 128 128 128 128 128

lib/sbitmap.c:813 module:sbitmap func:sbitmap_finish_wait
count:  3
rate:   0/sec
frequency:  4 sec
avg duration:   4 sec
max duration:   4 sec
quantiles (ns): 0 4288669120 4288669120 5360836048 5360836048 5360836048 
5360836048 5360836048 5360836048 5360836048 5360836048 5360836048 5360836048 
5360836048 5360836048

net/core/datagram.c:122 module:datagram func:__skb_wait_for_more_packets
count:  10
rate:   1/sec
frequency:  859 ms
avg duration:   472 ms
max duration:   30 sec
quantiles (ns): 0 12279 12279 15669 15669 15669 15669 17217 17217 17217 17217 
17217 17217 17217 17217
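The start/finish pair above can be mimicked in userspace roughly as follows. All names here are hypothetical; the kernel version additionally places the per-site struct in a dedicated ELF section and tracks rates and quantiles, while this sketch only keeps count, total, and max duration:

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdint.h>
#include <time.h>

struct time_stats_demo {
	uint64_t count;
	uint64_t total_ns;
	uint64_t max_ns;
};

static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* declares and initializes the start timestamp, like code_tag_time_stats_start() */
#define time_stats_demo_start(_start)	uint64_t _start = now_ns()

/* computes the duration and folds it into the per-site stats */
#define time_stats_demo_finish(_stats, _start)			\
do {								\
	uint64_t _d = now_ns() - (_start);			\
	(_stats)->count++;					\
	(_stats)->total_ns += _d;				\
	if (_d > (_stats)->max_ns)				\
		(_stats)->max_ns = _d;				\
} while (0)

static struct time_stats_demo demo_stats;

static void measured_event(void)
{
	time_stats_demo_start(start);

	/* the event being measured: sleep ~1 ms */
	struct timespec req = { 0, 1000000 };
	nanosleep(&req, NULL);

	time_stats_demo_finish(&demo_stats, start);
}
```

Because the start macro declares the timestamp variable, a mismatched finish call fails at compile time, which is what lets the kernel version WARN when start was never called.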

Signed-off-by: Kent Overstreet 
---
 include/asm-generic/codetag.lds.h  |   3 +-
 include/linux/codetag_time_stats.h |  54 +++
 include/linux/io_uring_types.h |   2 +-
 include/linux/wait.h   |  22 -
 kernel/sched/wait.c|   6 +-
 lib/Kconfig.debug  |   8 ++
 lib/Makefile   |   1 +
 lib/codetag_time_stats.c   | 143 +
 8 files changed, 233 insertions(+), 6 deletions(-)
 create mode 100644 include/linux/codetag_time_stats.h
 create mode 100644 lib/codetag_time_stats.c

diff --git a/include/asm-generic/codetag.lds.h 
b/include/asm-generic/codetag.lds.h
index 16fbf74edc3d..d799f4aced82 100644
--- a/include/asm-generic/codetag.lds.h
+++ b/include/asm-generic/codetag.lds.h
@@ -10,6 +10,7 @@
 
 #define CODETAG_SECTIONS() \
SECTION_WITH_BOUNDARIES(alloc_tags) \
-   SECTION_WITH_BOUNDARIES(dynamic_fault_tags)
+   SECTION_WITH_BOUNDARIES(dynamic_fault_tags) \
+   SECTION_WITH_BOUNDARIES(time_stats_tags)
 
 #endif /* __ASM_GENERIC_CODETAG_LDS_H */
diff --git a/include/linux/codetag_time_stats.h 
b/include/linux/codetag_time_stats.h
new file mode 100644
index ..7e44c7ee9e9b
--- /dev/null
+++ b/include/linux/codetag_time_stats.h
@@ -0,0 +1,54 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_CODETAG_TIMESTATS_H
+#define _LINUX_CODETAG_TIMESTATS_H
+
+/*
+ * Code tagging based latency tracking:
+ * (C) 2022 Kent Overstreet
+ *
+ * This allows you to easily instrument code to track latency, and have the
+ * results show up in debugfs. To use, add the following two calls to your code
+ * at the beginning and end of the event you wish to instrument:
+ *
+ * code_tag_time_stats_start(start_time);
+ * code_tag_time_stats_finish(start_time);
+ *
+ * Statistics will then show up in debugfs under /sys/kernel/debug/time_stats,
+ * listed by file and line number.
+ */
+
+#ifdef CONFIG_CODETAG_TIME_STATS
+
+#include 
+#include 
+#include 
+
+struct codetag_time_stats {
+   struct codetag  tag;
+   struct time_stats   stats;
+};
+
+#define codetag_time_stats_start(_start_time)  u64 _start_time = ktime_get_ns()
+
+#define codetag_time_stats_finish(_start_time) \
+do {   \
+   static struct codetag_time_stats\
+   __used  \
+   __section("time_stats_tags")\
+   __aligned(8) s = {  \
+   .tag= CODE_TAG_INIT,\
+   .stats.lock = __SPIN_LOCK_UNLOCKED(_lock)   \
+   };  \
+   \
+   WARN_ONCE(!(_start_time), "codetag_time_stats_start() not called");\
+   time_stats_update(&s.stats, _start_time);  \
+} while (0)
+
+#else
+
+#define codetag_time_stats_finish(_start_time) do {} while (0)
+#define codetag_time_stats_start(_start_time)  do {} while (0)
+
+#endif /* CONFIG_CODETAG_TIME_STATS */
+
+#endif
diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h
index 677a25d44d7f..3bcef85eacd8 100644
--- a/include/linux/io_uring_types.h
+++ b/include/linux/io_uring_types.h
@@ -488,7 +488,7 @@ struct io_cqe {
 struct io_cmd_data {
struct 

[RFC PATCH 23/30] timekeeping: Add a missing include

2022-08-30 Thread Suren Baghdasaryan
From: Kent Overstreet 

We need ktime.h for ktime_t.

Signed-off-by: Kent Overstreet 
---
 include/linux/timekeeping.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/linux/timekeeping.h b/include/linux/timekeeping.h
index fe1e467ba046..7c43e98cf211 100644
--- a/include/linux/timekeeping.h
+++ b/include/linux/timekeeping.h
@@ -4,6 +4,7 @@
 
 #include 
 #include 
+#include 
 
 /* Included from linux/ktime.h */
 
-- 
2.37.2.672.g94769d06f0-goog




[RFC PATCH 28/30] Improved symbolic error names

2022-08-30 Thread Suren Baghdasaryan
From: Kent Overstreet 

This patch adds per-error-site error codes, with error strings that
include their file and line number.

To use, change code that returns an error, e.g.
return -ENOMEM;
to
return -ERR(ENOMEM);

Then, errname() will return a string that includes the file and line
number of the ERR() call, for example
printk("Got error %s!\n", errname(err));
will result in
Got error ENOMEM at foo.c:1234

To convert back to the original error code (before returning it to
outside code that does not understand dynamic error codes), use
return error_class(err);

To test if an error is of some type, replace
if (err == -ENOMEM)
with
if (error_matches(err, ENOMEM))

Implementation notes:

Error codes are allocated dynamically on module load and deallocated on
module unload. On memory allocation failure (i.e. failure to allocate the
data structures for indexing error strings and error parents), ERR() will
fall back to returning the plain error code that it was passed.

MAX_ERRNO has been raised from 4095 to just under 1 million, which should
be sufficient given the number of lines of code in the kernel codebase and
the fraction that return errors.

This has implications for ERR_PTR(), since the range of the address
space reserved for errors is unavailable for other purposes. Since
ERR_PTR() ptrs are at the top of the address space there should not be
any major difficulties.
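The scheme can be modeled in userspace roughly as below. All names are illustrative only; sign handling (the kernel returns -ERR(...)) and the per-site ELF section machinery are omitted, and a simple table stands in for the IDR/xarray:

```c
#include <assert.h>
#include <errno.h>

/*
 * Error codes at or above DYNAMIC_ERRCODE_START are per-call-site;
 * error_class() maps them back to the ordinary errno they were
 * created from.
 */
#define DYNAMIC_ERRCODE_START	4096
#define MAX_DEMO_CODES		128

static int demo_parents[MAX_DEMO_CODES];	/* dynamic code -> base errno */
static int demo_nr_codes;

/*
 * Allocate a fresh per-site code for base (e.g. ENOMEM).  Like the
 * patch's ERR() on allocation failure, fall back to the plain code
 * when no more dynamic codes are available.
 */
static int demo_err(int base)
{
	if (demo_nr_codes >= MAX_DEMO_CODES)
		return base;
	demo_parents[demo_nr_codes] = base;
	return DYNAMIC_ERRCODE_START + demo_nr_codes++;
}

static int error_class(int err)
{
	if (err >= DYNAMIC_ERRCODE_START &&
	    err <  DYNAMIC_ERRCODE_START + demo_nr_codes)
		return demo_parents[err - DYNAMIC_ERRCODE_START];
	return err;
}

static int error_matches(int err, int class)
{
	return error_class(err) == class;
}
```

Two call sites that both return ENOMEM thus get distinct codes, yet both compare equal to ENOMEM through error_matches(), and error_class() must be applied before handing the code to anything that expects a classic errno.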

Signed-off-by: Kent Overstreet 
---
 include/asm-generic/codetag.lds.h |   3 +-
 include/linux/err.h   |   2 +-
 include/linux/errname.h   |  50 +++
 lib/errname.c | 103 ++
 4 files changed, 156 insertions(+), 2 deletions(-)

diff --git a/include/asm-generic/codetag.lds.h 
b/include/asm-generic/codetag.lds.h
index d799f4aced82..b087cf1874a9 100644
--- a/include/asm-generic/codetag.lds.h
+++ b/include/asm-generic/codetag.lds.h
@@ -11,6 +11,7 @@
 #define CODETAG_SECTIONS() \
SECTION_WITH_BOUNDARIES(alloc_tags) \
SECTION_WITH_BOUNDARIES(dynamic_fault_tags) \
-   SECTION_WITH_BOUNDARIES(time_stats_tags)
+   SECTION_WITH_BOUNDARIES(time_stats_tags)\
+   SECTION_WITH_BOUNDARIES(error_code_tags)
 
 #endif /* __ASM_GENERIC_CODETAG_LDS_H */
diff --git a/include/linux/err.h b/include/linux/err.h
index a139c64aef2a..1d8d6c46ab9c 100644
--- a/include/linux/err.h
+++ b/include/linux/err.h
@@ -15,7 +15,7 @@
  * This should be a per-architecture thing, to allow different
  * error and pointer decisions.
  */
-#define MAX_ERRNO  4095
+#define MAX_ERRNO  ((1 << 20) - 1)
 
 #ifndef __ASSEMBLY__
 
diff --git a/include/linux/errname.h b/include/linux/errname.h
index e8576ad90cb7..dd39fe7120bb 100644
--- a/include/linux/errname.h
+++ b/include/linux/errname.h
@@ -5,12 +5,62 @@
 #include 
 
 #ifdef CONFIG_SYMBOLIC_ERRNAME
+
 const char *errname(int err);
+
+#include 
+
+struct codetag_error_code {
+   const char  *str;
+   int err;
+};
+
+/**
+ * ERR - return an error code that records the error site
+ *
+ * E.g., instead of
+ *   return -ENOMEM;
+ * Use
+ *   return -ERR(ENOMEM);
+ *
+ * Then, when a caller prints out the error with errname(), the error string
+ * will include the file and line number.
+ */
+#define ERR(_err)  \
+({ \
+   static struct codetag_error_code\
+   __used  \
+   __section("error_code_tags")\
+   __aligned(8) e = {  \
+   .str= #_err " at " __FILE__ ":" __stringify(__LINE__),\
+   .err= _err, \
+   };  \
+   \
+   e.err;  \
+})
+
+int error_class(int err);
+bool error_matches(int err, int class);
+
 #else
+
+static inline int error_class(int err)
+{
+   return err;
+}
+
+static inline bool error_matches(int err, int class)
+{
+   return err == class;
+}
+
+#define ERR(_err)  _err
+
 static inline const char *errname(int err)
 {
return NULL;
 }
+
 #endif
 
 #endif /* _LINUX_ERRNAME_H */
diff --git a/lib/errname.c b/lib/errname.c
index 05cbf731545f..2db8f5301ba0 100644
--- a/lib/errname.c
+++ b/lib/errname.c
@@ -1,9 +1,20 @@
 // SPDX-License-Identifier: GPL-2.0
 #include 
+#include 
 #include 
 #include 
+#include 
 #include 
 #include 
+#include 
+#include 
+
+#define DYNAMIC_ERRCODE_START  4096
+
+static DEFINE_IDR(dynamic_error_strings);
+static DEFINE_XARRAY(error_classes);
+
+static struct codetag_type *cttype;
 
 /*
  * Ensure these tables do not accidentally become 

[RFC PATCH 29/30] dyndbg: Convert to code tagging

2022-08-30 Thread Suren Baghdasaryan
From: Kent Overstreet 

This converts dynamic debug to the new code tagging framework, which
provides an interface for iterating over objects in a particular ELF
section.

It also converts the debugfs interface from seq_file to the style used
by other code tagging users, which also makes the code a bit smaller and
simpler.

It doesn't yet convert struct _ddebug to use struct codetag; another
cleanup could convert it to that, and to codetag_query_parse().

Signed-off-by: Kent Overstreet 
Cc: Jason Baron 
Cc: Luis Chamberlain 
---
 include/asm-generic/codetag.lds.h |   5 +-
 include/asm-generic/vmlinux.lds.h |   5 -
 include/linux/dynamic_debug.h |  11 +-
 kernel/module/internal.h  |   2 -
 kernel/module/main.c  |  23 --
 lib/dynamic_debug.c   | 452 ++
 6 files changed, 158 insertions(+), 340 deletions(-)

diff --git a/include/asm-generic/codetag.lds.h 
b/include/asm-generic/codetag.lds.h
index b087cf1874a9..b7e351f80e9e 100644
--- a/include/asm-generic/codetag.lds.h
+++ b/include/asm-generic/codetag.lds.h
@@ -8,10 +8,11 @@
KEEP(*(_name))  \
__stop_##_name = .;
 
-#define CODETAG_SECTIONS() \
+#define CODETAG_SECTIONS() \
SECTION_WITH_BOUNDARIES(alloc_tags) \
SECTION_WITH_BOUNDARIES(dynamic_fault_tags) \
SECTION_WITH_BOUNDARIES(time_stats_tags)\
-   SECTION_WITH_BOUNDARIES(error_code_tags)
+   SECTION_WITH_BOUNDARIES(error_code_tags)\
+   SECTION_WITH_BOUNDARIES(dyndbg)
 
 #endif /* __ASM_GENERIC_CODETAG_LDS_H */
diff --git a/include/asm-generic/vmlinux.lds.h 
b/include/asm-generic/vmlinux.lds.h
index c2dc2a59ab2e..d3fb914d157f 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -345,11 +345,6 @@
__end_once = .; \
STRUCT_ALIGN(); \
*(__tracepoints)\
-   /* implement dynamic printk debug */\
-   . = ALIGN(8);   \
-   __start___dyndbg = .;   \
-   KEEP(*(__dyndbg))   \
-   __stop___dyndbg = .;\
CODETAG_SECTIONS()  \
LIKELY_PROFILE()\
BRANCH_PROFILE()\
diff --git a/include/linux/dynamic_debug.h b/include/linux/dynamic_debug.h
index dce631e678dd..6a57009dd29e 100644
--- a/include/linux/dynamic_debug.h
+++ b/include/linux/dynamic_debug.h
@@ -58,9 +58,6 @@ struct _ddebug {
 /* exported for module authors to exercise >control */
 int dynamic_debug_exec_queries(const char *query, const char *modname);
 
-int ddebug_add_module(struct _ddebug *tab, unsigned int n,
-   const char *modname);
-extern int ddebug_remove_module(const char *mod_name);
 extern __printf(2, 3)
 void __dynamic_pr_debug(struct _ddebug *descriptor, const char *fmt, ...);
 
@@ -89,7 +86,7 @@ void __dynamic_ibdev_dbg(struct _ddebug *descriptor,
 
 #define DEFINE_DYNAMIC_DEBUG_METADATA(name, fmt)   \
static struct _ddebug  __aligned(8) \
-   __section("__dyndbg") name = {  \
+   __section("dyndbg") name = {\
.modname = KBUILD_MODNAME,  \
.function = __func__,   \
.filename = __FILE__,   \
@@ -187,12 +184,6 @@ void __dynamic_ibdev_dbg(struct _ddebug *descriptor,
 #include 
 #include 
 
-static inline int ddebug_add_module(struct _ddebug *tab, unsigned int n,
-   const char *modname)
-{
-   return 0;
-}
-
 static inline int ddebug_remove_module(const char *mod)
 {
return 0;
diff --git a/kernel/module/internal.h b/kernel/module/internal.h
index f1b6c477bd93..f867c57ab74f 100644
--- a/kernel/module/internal.h
+++ b/kernel/module/internal.h
@@ -62,8 +62,6 @@ struct load_info {
Elf_Shdr *sechdrs;
char *secstrings, *strtab;
unsigned long symoffs, stroffs, init_typeoffs, core_typeoffs;
-   struct _ddebug *debug;
-   unsigned int num_debug;
bool sig_ok;
 #ifdef CONFIG_KALLSYMS
unsigned long mod_kallsyms_init_off;
diff --git a/kernel/module/main.c b/kernel/module/main.c
index d253277492fd..28e3b337841b 100644
--- a/kernel/module/main.c
+++ b/kernel/module/main.c
@@ -1163,9 +1163,6 @@ static void free_module(struct module *mod)
mod->state = MODULE_STATE_UNFORMED;
	mutex_unlock(&module_mutex);
 
-   /* Remove dynamic debug info */
-  

[RFC PATCH 24/30] wait: Clean up waitqueue_entry initialization

2022-08-30 Thread Suren Baghdasaryan
From: Kent Overstreet 

Cleanup for code tagging latency tracking:

Add an initializer, WAIT_FUNC_INITIALIZER(), to be used by initializers
for structs that include wait_queue_entries.

Also, change init_wait(), init_wait_entry(), etc. to be wrappers around
the new __init_waitqueue_entry(); more de-duplication prep work.
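The key trick in WAIT_FUNC_INITIALIZER() is that the macro takes the variable's name, so the embedded list head can be made to point at itself at definition time, avoiding a runtime init call. A minimal userspace sketch of the same pattern (hypothetical names throughout; the kernel version sets .private to current rather than NULL):

```c
#include <assert.h>
#include <stddef.h>

struct demo_list_head {
	struct demo_list_head *next, *prev;
};

#define DEMO_LIST_HEAD_INIT(name)	{ &(name), &(name) }

typedef int (*demo_wake_fn)(void);

struct demo_wait_entry {
	void			*private;
	demo_wake_fn		func;
	struct demo_list_head	entry;
};

static int demo_wake(void)
{
	return 0;
}

/* takes the variable's name so the list head can reference itself */
#define DEMO_WAIT_FUNC_INITIALIZER(name, function) {		\
	.private	= NULL,					\
	.func		= (function),				\
	.entry		= DEMO_LIST_HEAD_INIT((name).entry),	\
}

/* both static and on-stack definitions now share one initializer */
#define DEFINE_DEMO_WAIT(name)					\
	struct demo_wait_entry name =				\
		DEMO_WAIT_FUNC_INITIALIZER(name, demo_wake)

static DEFINE_DEMO_WAIT(w);
```

This is why the DEFINE_SBQ_WAIT() hunk in the diff can collapse three hand-written fields into one WAIT_FUNC_INITIALIZER() use.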

Signed-off-by: Kent Overstreet 
Cc: Ingo Molnar 
Cc: Peter Zijlstra 
---
 include/linux/sbitmap.h  |  6 +
 include/linux/wait.h | 52 +++-
 include/linux/wait_bit.h |  7 +-
 kernel/sched/wait.c  |  9 ---
 4 files changed, 27 insertions(+), 47 deletions(-)

diff --git a/include/linux/sbitmap.h b/include/linux/sbitmap.h
index 8f5a86e210b9..f696c29d9ab3 100644
--- a/include/linux/sbitmap.h
+++ b/include/linux/sbitmap.h
@@ -596,11 +596,7 @@ struct sbq_wait {
 #define DEFINE_SBQ_WAIT(name)  
\
struct sbq_wait name = {
\
.sbq = NULL,
\
-   .wait = {   
\
-   .private= current,  
\
-   .func   = autoremove_wake_function, 
\
-   .entry  = LIST_HEAD_INIT((name).wait.entry),
\
-   }   
\
+   .wait = WAIT_FUNC_INITIALIZER((name).wait, 
autoremove_wake_function),\
}
 
 /*
diff --git a/include/linux/wait.h b/include/linux/wait.h
index 58cfbf81447c..91ced6a118bc 100644
--- a/include/linux/wait.h
+++ b/include/linux/wait.h
@@ -79,21 +79,38 @@ extern void __init_waitqueue_head(struct wait_queue_head 
*wq_head, const char *n
 # define DECLARE_WAIT_QUEUE_HEAD_ONSTACK(name) DECLARE_WAIT_QUEUE_HEAD(name)
 #endif
 
-static inline void init_waitqueue_entry(struct wait_queue_entry *wq_entry, 
struct task_struct *p)
-{
-   wq_entry->flags = 0;
-   wq_entry->private   = p;
-   wq_entry->func  = default_wake_function;
+#define WAIT_FUNC_INITIALIZER(name, function) {
\
+   .private= current,  
\
+   .func   = function, 
\
+   .entry  = LIST_HEAD_INIT((name).entry), 
\
 }
 
+#define DEFINE_WAIT_FUNC(name, function)   
\
+   struct wait_queue_entry name = WAIT_FUNC_INITIALIZER(name, function)
+
+#define DEFINE_WAIT(name) DEFINE_WAIT_FUNC(name, autoremove_wake_function)
+
 static inline void
-init_waitqueue_func_entry(struct wait_queue_entry *wq_entry, wait_queue_func_t 
func)
+__init_waitqueue_entry(struct wait_queue_entry *wq_entry, unsigned int flags,
+  void *private, wait_queue_func_t func)
 {
-   wq_entry->flags = 0;
-   wq_entry->private   = NULL;
+   wq_entry->flags = flags;
+   wq_entry->private   = private;
wq_entry->func  = func;
+   INIT_LIST_HEAD(&wq_entry->entry);
 }
 
+#define init_waitqueue_func_entry(_wq_entry, _func)\
+   __init_waitqueue_entry(_wq_entry, 0, NULL, _func)
+
+#define init_waitqueue_entry(_wq_entry, _task) \
+   __init_waitqueue_entry(_wq_entry, 0, _task, default_wake_function)
+
+#define init_wait_entry(_wq_entry, _flags) \
+   __init_waitqueue_entry(_wq_entry, _flags, current, 
autoremove_wake_function)
+
+#define init_wait(wait)init_wait_entry(wait, 0)
+
 /**
  * waitqueue_active -- locklessly test for waiters on the queue
  * @wq_head: the waitqueue to test for waiters
@@ -283,8 +300,6 @@ static inline void wake_up_pollfree(struct wait_queue_head 
*wq_head)
(!__builtin_constant_p(state) ||
\
state == TASK_INTERRUPTIBLE || state == TASK_KILLABLE)  
\
 
-extern void init_wait_entry(struct wait_queue_entry *wq_entry, int flags);
-
 /*
  * The below macro ___wait_event() has an explicit shadow of the __ret
  * variable when used from the wait_event_*() macros.
@@ -1170,23 +1185,6 @@ long wait_woken(struct wait_queue_entry *wq_entry, 
unsigned mode, long timeout);
 int woken_wake_function(struct wait_queue_entry *wq_entry, unsigned mode, int 
sync, void *key);
 int autoremove_wake_function(struct wait_queue_entry *wq_entry, unsigned mode, 
int sync, void *key);
 
-#define DEFINE_WAIT_FUNC(name, function)   
\
-   struct wait_queue_entry name = {
\
-   .private= current,  
\
-   .func   = function, 

[RFC PATCH 19/30] move stack capture functionality into a separate function for reuse

2022-08-30 Thread Suren Baghdasaryan
Make the save_stack() function part of the stackdepot API so it can be
used outside of page_owner. Also rename task_struct's in_page_owner flag
to in_capture_stack, to better convey the wider use of this flag.
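The recursion protection this flag provides can be sketched in userspace as follows (hypothetical names; the kernel version uses a task_struct bitfield rather than a thread-local, and returns real stack depot handles):

```c
#include <assert.h>

/*
 * Capturing a stack may itself allocate memory, and that allocation may
 * be tracked, re-entering the capture path.  The per-task flag makes the
 * nested call return a canned "recursion" handle instead of looping.
 */
enum demo_handle {
	DEMO_FAILURE_HANDLE = 1,
	DEMO_RECURSION_HANDLE,
	DEMO_REAL_HANDLE,
};

static _Thread_local int in_capture_stack;

static int force_reentry;	/* test knob: pretend the save allocated */
static int nested_result = -1;

static int demo_capture_stack(void)
{
	int handle;

	if (in_capture_stack)
		return DEMO_RECURSION_HANDLE;
	in_capture_stack = 1;

	/* stand-in for stack_trace_save() + stack_depot_save() */
	if (force_reentry)
		nested_result = demo_capture_stack();
	handle = DEMO_REAL_HANDLE;

	in_capture_stack = 0;
	return handle;
}
```

The outer call always completes normally; only the re-entrant inner call short-circuits, which is exactly the behavior stack_depot_capture_stack() needs when saving a trace triggers another tracked allocation.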

Signed-off-by: Suren Baghdasaryan 
---
 include/linux/sched.h  |  6 ++--
 include/linux/stackdepot.h |  3 ++
 lib/stackdepot.c   | 68 ++
 mm/page_owner.c| 52 ++---
 4 files changed, 77 insertions(+), 52 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index e7b2f8a5c711..d06cad6c14bd 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -930,9 +930,9 @@ struct task_struct {
/* Stalled due to lack of memory */
unsignedin_memstall:1;
 #endif
-#ifdef CONFIG_PAGE_OWNER
-   /* Used by page_owner=on to detect recursion in page tracking. */
-   unsignedin_page_owner:1;
+#ifdef CONFIG_STACKDEPOT
+   /* Used by stack_depot_capture_stack to detect recursion. */
+   unsignedin_capture_stack:1;
 #endif
 #ifdef CONFIG_EVENTFD
/* Recursion prevention for eventfd_signal() */
diff --git a/include/linux/stackdepot.h b/include/linux/stackdepot.h
index bc2797955de9..8dc9fdb2c4dd 100644
--- a/include/linux/stackdepot.h
+++ b/include/linux/stackdepot.h
@@ -64,4 +64,7 @@ int stack_depot_snprint(depot_stack_handle_t handle, char 
*buf, size_t size,
 
 void stack_depot_print(depot_stack_handle_t stack);
 
+bool stack_depot_capture_init(void);
+depot_stack_handle_t stack_depot_capture_stack(gfp_t flags);
+
 #endif
diff --git a/lib/stackdepot.c b/lib/stackdepot.c
index e73fda23388d..c8615bd6dc25 100644
--- a/lib/stackdepot.c
+++ b/lib/stackdepot.c
@@ -514,3 +514,71 @@ depot_stack_handle_t stack_depot_save(unsigned long 
*entries,
return __stack_depot_save(entries, nr_entries, alloc_flags, true);
 }
 EXPORT_SYMBOL_GPL(stack_depot_save);
+
+static depot_stack_handle_t recursion_handle;
+static depot_stack_handle_t failure_handle;
+
+static __always_inline depot_stack_handle_t create_custom_stack(void)
+{
+   unsigned long entries[4];
+   unsigned int nr_entries;
+
+   nr_entries = stack_trace_save(entries, ARRAY_SIZE(entries), 0);
+   return stack_depot_save(entries, nr_entries, GFP_KERNEL);
+}
+
+static noinline void register_recursion_stack(void)
+{
+   recursion_handle = create_custom_stack();
+}
+
+static noinline void register_failure_stack(void)
+{
+   failure_handle = create_custom_stack();
+}
+
+bool stack_depot_capture_init(void)
+{
+   static DEFINE_MUTEX(stack_depot_capture_init_mutex);
+   static bool utility_stacks_ready;
+
+   mutex_lock(&stack_depot_capture_init_mutex);
+   if (!utility_stacks_ready) {
+   register_recursion_stack();
+   register_failure_stack();
+   utility_stacks_ready = true;
+   }
+   mutex_unlock(&stack_depot_capture_init_mutex);
+
+   return utility_stacks_ready;
+}
+
+/* TODO: teach stack_depot_capture_stack to use off stack temporal storage */
+#define CAPTURE_STACK_DEPTH (16)
+
+depot_stack_handle_t stack_depot_capture_stack(gfp_t flags)
+{
+   unsigned long entries[CAPTURE_STACK_DEPTH];
+   depot_stack_handle_t handle;
+   unsigned int nr_entries;
+
+   /*
+* Avoid recursion.
+*
+* Sometimes page metadata allocation tracking requires more
+* memory to be allocated:
+* - when new stack trace is saved to stack depot
+* - when backtrace itself is calculated (ia64)
+*/
+   if (current->in_capture_stack)
+   return recursion_handle;
+   current->in_capture_stack = 1;
+
+   nr_entries = stack_trace_save(entries, ARRAY_SIZE(entries), 2);
+   handle = stack_depot_save(entries, nr_entries, flags);
+   if (!handle)
+   handle = failure_handle;
+
+   current->in_capture_stack = 0;
+   return handle;
+}
diff --git a/mm/page_owner.c b/mm/page_owner.c
index fd4af1ad34b8..c3173e34a779 100644
--- a/mm/page_owner.c
+++ b/mm/page_owner.c
@@ -15,12 +15,6 @@
 
 #include "internal.h"
 
-/*
- * TODO: teach PAGE_OWNER_STACK_DEPTH (__dump_page_owner and save_stack)
- * to use off stack temporal storage
- */
-#define PAGE_OWNER_STACK_DEPTH (16)
-
 struct page_owner {
unsigned short order;
short last_migrate_reason;
@@ -37,8 +31,6 @@ struct page_owner {
 static bool page_owner_enabled __initdata;
 DEFINE_STATIC_KEY_FALSE(page_owner_inited);
 
-static depot_stack_handle_t dummy_handle;
-static depot_stack_handle_t failure_handle;
 static depot_stack_handle_t early_handle;
 
 static void init_early_allocated_pages(void);
@@ -68,16 +60,6 @@ static __always_inline depot_stack_handle_t 
create_dummy_stack(void)
return stack_depot_save(entries, nr_entries, GFP_KERNEL);
 }
 
-static noinline void register_dummy_stack(void)
-{
-   dummy_handle = create_dummy_stack();
-}
-

[RFC PATCH 26/30] bcache: Convert to lib/time_stats

2022-08-30 Thread Suren Baghdasaryan
From: Kent Overstreet 

This patch converts bcache to the new generic time_stats code
lib/time_stats.c. The new code is from bcachefs, and has some changes
from the version in bcache:

 - we now use ktime_get_ns(), not local_clock(). When the code was
   originally written, multi-processor systems that lacked synchronized
   TSCs were still common, and so local_clock() was much cheaper than
   sched_clock() (though not necessarily fully accurate, due to TSC
   drift). ktime_get_ns() should be cheap enough on all common hardware
   now, and is more standard/correct.

 - time_stats are now exported in a single file in sysfs, which means we
   can improve the statistics we keep track of without changing all
   users. This also means we don't have to manually specify which units
   (ms, us, ns) a given time_stats should be printed in; that's handled
   dynamically.

 - There's a lazily-allocated percpu buffer, which now needs to be freed
   with time_stats_exit().

Signed-off-by: Kent Overstreet 
Cc: Coly Li 
---
 drivers/md/bcache/Kconfig  |  1 +
 drivers/md/bcache/bcache.h |  1 +
 drivers/md/bcache/bset.c   |  8 +++---
 drivers/md/bcache/bset.h   |  1 +
 drivers/md/bcache/btree.c  | 12 
 drivers/md/bcache/super.c  |  3 ++
 drivers/md/bcache/sysfs.c  | 43 
 drivers/md/bcache/util.c   | 30 
 drivers/md/bcache/util.h   | 57 --
 9 files changed, 47 insertions(+), 109 deletions(-)

diff --git a/drivers/md/bcache/Kconfig b/drivers/md/bcache/Kconfig
index 529c9d04e9a4..8d165052e508 100644
--- a/drivers/md/bcache/Kconfig
+++ b/drivers/md/bcache/Kconfig
@@ -4,6 +4,7 @@ config BCACHE
tristate "Block device as cache"
select BLOCK_HOLDER_DEPRECATED if SYSFS
select CRC64
+   select TIME_STATS
help
Allows a block device to be used as cache for other devices; uses
a btree for indexing and the layout is optimized for SSDs.
diff --git a/drivers/md/bcache/bcache.h b/drivers/md/bcache/bcache.h
index 2acda9cea0f9..5100010a3897 100644
--- a/drivers/md/bcache/bcache.h
+++ b/drivers/md/bcache/bcache.h
@@ -185,6 +185,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
diff --git a/drivers/md/bcache/bset.c b/drivers/md/bcache/bset.c
index 94d38e8a59b3..727e9b7aead4 100644
--- a/drivers/md/bcache/bset.c
+++ b/drivers/md/bcache/bset.c
@@ -1251,7 +1251,7 @@ static void __btree_sort(struct btree_keys *b, struct 
btree_iter *iter,
order = state->page_order;
}
 
-   start_time = local_clock();
+   start_time = ktime_get_ns();
 
btree_mergesort(b, out, iter, fixup, false);
b->nsets = start;
@@ -1286,7 +1286,7 @@ static void __btree_sort(struct btree_keys *b, struct 
btree_iter *iter,
bch_bset_build_written_tree(b);
 
if (!start)
-   bch_time_stats_update(&state->time, start_time);
+   time_stats_update(&state->time, start_time);
 }
 
 void bch_btree_sort_partial(struct btree_keys *b, unsigned int start,
@@ -1322,14 +1322,14 @@ void bch_btree_sort_and_fix_extents(struct btree_keys 
*b,
 void bch_btree_sort_into(struct btree_keys *b, struct btree_keys *new,
 struct bset_sort_state *state)
 {
-   uint64_t start_time = local_clock();
+   uint64_t start_time = ktime_get_ns();
struct btree_iter iter;
 
	bch_btree_iter_init(b, &iter, NULL);
 
	btree_mergesort(b, new->set->data, &iter, false, true);
 
-   bch_time_stats_update(&state->time, start_time);
+   time_stats_update(&state->time, start_time);
 
new->set->size = 0; // XXX: why?
 }
diff --git a/drivers/md/bcache/bset.h b/drivers/md/bcache/bset.h
index d795c84246b0..13e524ad7783 100644
--- a/drivers/md/bcache/bset.h
+++ b/drivers/md/bcache/bset.h
@@ -3,6 +3,7 @@
 #define _BCACHE_BSET_H
 
 #include 
+#include 
 #include 
 
 #include "bcache_ondisk.h"
diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c
index 147c493a989a..abf543bc7551 100644
--- a/drivers/md/bcache/btree.c
+++ b/drivers/md/bcache/btree.c
@@ -242,7 +242,7 @@ static void btree_node_read_endio(struct bio *bio)
 
 static void bch_btree_node_read(struct btree *b)
 {
-   uint64_t start_time = local_clock();
+   uint64_t start_time = ktime_get_ns();
struct closure cl;
struct bio *bio;
 
@@ -270,7 +270,7 @@ static void bch_btree_node_read(struct btree *b)
goto err;
 
bch_btree_node_read_done(b);
-   bch_time_stats_update(&b->c->btree_read_time, start_time);
+   time_stats_update(&b->c->btree_read_time, start_time);
 
return;
 err:
@@ -1789,7 +1789,7 @@ static void bch_btree_gc(struct cache_set *c)
struct gc_stat stats;
struct closure writes;
struct btree_op op;
-   uint64_t start_time = local_clock();
+   uint64_t start_time = ktime_get_ns();
 
trace_bcache_gc_start(c);
 
@@ -1815,7 +1815,7 @@ static void bch_btree_gc(struct cache_set *c)

[RFC PATCH 25/30] lib/time_stats: New library for statistics on events

2022-08-30 Thread Suren Baghdasaryan
From: Kent Overstreet 

This adds a small new library for tracking statistics on events that
have a duration, i.e. a start and end time.

 - number of events
 - rate/frequency
 - average duration
 - max duration
 - duration quantiles

This code comes from bcachefs, and originally bcache: the next patch
will convert bcache to use this version, and a subsequent patch will
use code_tagging to instrument all wait_event() calls in the kernel.

Signed-off-by: Kent Overstreet 
---
 include/linux/time_stats.h |  44 +++
 lib/Kconfig|   3 +
 lib/Makefile   |   1 +
 lib/time_stats.c   | 236 +
 4 files changed, 284 insertions(+)
 create mode 100644 include/linux/time_stats.h
 create mode 100644 lib/time_stats.c

diff --git a/include/linux/time_stats.h b/include/linux/time_stats.h
new file mode 100644
index ..7ae929e6f836
--- /dev/null
+++ b/include/linux/time_stats.h
@@ -0,0 +1,44 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef _LINUX_TIMESTATS_H
+#define _LINUX_TIMESTATS_H
+
+#include 
+#include 
+
+#define NR_QUANTILES   15
+
+struct quantiles {
+   struct quantile_entry {
+   u64 m;
+   u64 step;
+   }   entries[NR_QUANTILES];
+};
+
+struct time_stat_buffer {
+   unsigned intnr;
+   struct time_stat_buffer_entry {
+   u64 start;
+   u64 end;
+   }   entries[32];
+};
+
+struct time_stats {
+   spinlock_t  lock;
+   u64 count;
+   /* all fields are in nanoseconds */
+   u64 average_duration;
+   u64 average_frequency;
+   u64 max_duration;
+   u64 last_event;
+   struct quantiles quantiles;
+
+   struct time_stat_buffer __percpu *buffer;
+};
+
+struct seq_buf;
+void time_stats_update(struct time_stats *stats, u64 start);
+void time_stats_to_text(struct seq_buf *out, struct time_stats *stats);
+void time_stats_exit(struct time_stats *stats);
+
+#endif /* _LINUX_TIMESTATS_H */
diff --git a/lib/Kconfig b/lib/Kconfig
index fc6dbc425728..884fd9f2f06d 100644
--- a/lib/Kconfig
+++ b/lib/Kconfig
@@ -744,3 +744,6 @@ config ASN1_ENCODER
 
 config POLYNOMIAL
tristate
+
+config TIME_STATS
+   bool
diff --git a/lib/Makefile b/lib/Makefile
index 489ea000c528..e54392011f5e 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -232,6 +232,7 @@ obj-$(CONFIG_ALLOC_TAGGING) += alloc_tag.o
 obj-$(CONFIG_PAGE_ALLOC_TAGGING) += pgalloc_tag.o
 
 obj-$(CONFIG_CODETAG_FAULT_INJECTION) += dynamic_fault.o
+obj-$(CONFIG_TIME_STATS) += time_stats.o
 
 lib-$(CONFIG_GENERIC_BUG) += bug.o
 
diff --git a/lib/time_stats.c b/lib/time_stats.c
new file mode 100644
index ..30362364fdd2
--- /dev/null
+++ b/lib/time_stats.c
@@ -0,0 +1,236 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+static inline unsigned int eytzinger1_child(unsigned int i, unsigned int child)
+{
+   return (i << 1) + child;
+}
+
+static inline unsigned int eytzinger1_right_child(unsigned int i)
+{
+   return eytzinger1_child(i, 1);
+}
+
+static inline unsigned int eytzinger1_next(unsigned int i, unsigned int size)
+{
+   if (eytzinger1_right_child(i) <= size) {
+   i = eytzinger1_right_child(i);
+
+   i <<= __fls(size + 1) - __fls(i);
+   i >>= i > size;
+   } else {
+   i >>= ffz(i) + 1;
+   }
+
+   return i;
+}
+
+static inline unsigned int eytzinger0_child(unsigned int i, unsigned int child)
+{
+   return (i << 1) + 1 + child;
+}
+
+static inline unsigned int eytzinger0_first(unsigned int size)
+{
+   return rounddown_pow_of_two(size) - 1;
+}
+
+static inline unsigned int eytzinger0_next(unsigned int i, unsigned int size)
+{
+   return eytzinger1_next(i + 1, size) - 1;
+}
+
+#define eytzinger0_for_each(_i, _size) \
+   for ((_i) = eytzinger0_first((_size));  \
+(_i) != -1;\
+(_i) = eytzinger0_next((_i), (_size)))
+
+#define ewma_add(ewma, val, weight)\
+({ \
+   typeof(ewma) _ewma = (ewma);\
+   typeof(weight) _weight = (weight);  \
+   \
+   (((_ewma << _weight) - _ewma) + (val)) >> _weight;  \
+})
+
+static void quantiles_update(struct quantiles *q, u64 v)
+{
+   unsigned int i = 0;
+
+   while (i < ARRAY_SIZE(q->entries)) {
+   struct quantile_entry *e = q->entries + i;
+
+   if (unlikely(!e->step)) {
+   e->m = v;
+   e->step = max_t(unsigned int, v / 2, 

[RFC PATCH 30/30] MAINTAINERS: Add entries for code tagging & related

2022-08-30 Thread Suren Baghdasaryan
From: Kent Overstreet 

The new code and libraries added here are being maintained; mark them as such.

Signed-off-by: Kent Overstreet 
---
 MAINTAINERS | 34 ++
 1 file changed, 34 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 589517372408..902c96744bcb 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -5111,6 +5111,19 @@ S:   Supported
 F: Documentation/process/code-of-conduct-interpretation.rst
 F: Documentation/process/code-of-conduct.rst
 
+CODE TAGGING
+M: Suren Baghdasaryan 
+M: Kent Overstreet 
+S: Maintained
+F: lib/codetag.c
+F: include/linux/codetag.h
+
+CODE TAGGING TIME STATS
+M: Kent Overstreet 
+S: Maintained
+F: lib/codetag_time_stats.c
+F: include/linux/codetag_time_stats.h
+
 COMEDI DRIVERS
 M: Ian Abbott 
 M: H Hartley Sweeten 
@@ -11405,6 +11418,12 @@ M: John Hawley 
 S: Maintained
 F: tools/testing/ktest
 
+LAZY PERCPU COUNTERS
+M: Kent Overstreet 
+S: Maintained
+F: lib/lazy-percpu-counter.c
+F: include/linux/lazy-percpu-counter.h
+
 L3MDEV
 M: David Ahern 
 L: net...@vger.kernel.org
@@ -13124,6 +13143,15 @@ F: include/linux/memblock.h
 F: mm/memblock.c
 F: tools/testing/memblock/
 
+MEMORY ALLOCATION TRACKING
+M: Suren Baghdasaryan 
+M: Kent Overstreet 
+S: Maintained
+F: lib/alloc_tag.c
+F: lib/pgalloc_tag.c
+F: include/linux/alloc_tag.h
+F: include/linux/codetag_ctx.h
+
 MEMORY CONTROLLER DRIVERS
 M: Krzysztof Kozlowski 
 L: linux-ker...@vger.kernel.org
@@ -20421,6 +20449,12 @@ T: git 
git://git.kernel.org/pub/scm/linux/kernel/git/luca/wl12xx.git
 F: drivers/net/wireless/ti/
 F: include/linux/wl12xx.h
 
+TIME STATS
+M: Kent Overstreet 
+S: Maintained
+F: lib/time_stats.c
+F: include/linux/time_stats.h
+
 TIMEKEEPING, CLOCKSOURCE CORE, NTP, ALARMTIMER
 M: John Stultz 
 M: Thomas Gleixner 
-- 
2.37.2.672.g94769d06f0-goog




[RFC PATCH 22/30] Code tagging based fault injection

2022-08-30 Thread Suren Baghdasaryan
From: Kent Overstreet 

This adds a new fault injection capability, based on code tagging.

To use, simply insert somewhere in your code

  dynamic_fault("fault_class_name")

and check whether it returns true; if so, inject the error.
For example:

  if (dynamic_fault("init"))
  return -EINVAL;

There's no need to define faults elsewhere, as with
include/linux/fault-injection.h. Faults show up in debugfs, under
/sys/kernel/debug/dynamic_faults, and can be selected based on
file/module/function/line number/class, and enabled permanently, or in
oneshot mode, or with a specified frequency.

Signed-off-by: Kent Overstreet 
---
 include/asm-generic/codetag.lds.h |   3 +-
 include/linux/dynamic_fault.h |  79 +++
 include/linux/slab.h  |   3 +-
 lib/Kconfig.debug |   6 +
 lib/Makefile  |   2 +
 lib/dynamic_fault.c   | 372 ++
 6 files changed, 463 insertions(+), 2 deletions(-)
 create mode 100644 include/linux/dynamic_fault.h
 create mode 100644 lib/dynamic_fault.c

diff --git a/include/asm-generic/codetag.lds.h 
b/include/asm-generic/codetag.lds.h
index 64f536b80380..16fbf74edc3d 100644
--- a/include/asm-generic/codetag.lds.h
+++ b/include/asm-generic/codetag.lds.h
@@ -9,6 +9,7 @@
__stop_##_name = .;
 
 #define CODETAG_SECTIONS() \
-   SECTION_WITH_BOUNDARIES(alloc_tags)
+   SECTION_WITH_BOUNDARIES(alloc_tags) \
+   SECTION_WITH_BOUNDARIES(dynamic_fault_tags)
 
 #endif /* __ASM_GENERIC_CODETAG_LDS_H */
diff --git a/include/linux/dynamic_fault.h b/include/linux/dynamic_fault.h
new file mode 100644
index ..526a33209e94
--- /dev/null
+++ b/include/linux/dynamic_fault.h
@@ -0,0 +1,79 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef _LINUX_DYNAMIC_FAULT_H
+#define _LINUX_DYNAMIC_FAULT_H
+
+/*
+ * Dynamic/code tagging fault injection:
+ *
+ * Originally based on the dynamic debug trick of putting types in a special 
elf
+ * section, then rewritten using code tagging:
+ *
+ * To use, simply insert a call to dynamic_fault("fault_class"), which will
+ * return true if an error should be injected.
+ *
+ * Fault injection sites may be listed and enabled via debugfs, under
+ * /sys/kernel/debug/dynamic_faults.
+ */
+
+#ifdef CONFIG_CODETAG_FAULT_INJECTION
+
+#include 
+#include 
+
+#define DFAULT_STATES()\
+   x(disabled) \
+   x(enabled)  \
+   x(oneshot)
+
+enum dfault_enabled {
+#define x(n)   DFAULT_##n,
+   DFAULT_STATES()
+#undef x
+};
+
+union dfault_state {
+   struct {
+   unsigned intenabled:2;
+   unsigned intcount:30;
+   };
+
+   struct {
+   unsigned intv;
+   };
+};
+
+struct dfault {
+   struct codetag  tag;
+   const char  *class;
+   unsigned intfrequency;
+   union dfault_state  state;
+   struct static_key_false enabled;
+};
+
+bool __dynamic_fault_enabled(struct dfault *df);
+
+#define dynamic_fault(_class)  \
+({ \
+   static struct dfault\
+   __used  \
+   __section("dynamic_fault_tags") \
+   __aligned(8) df = { \
+   .tag= CODE_TAG_INIT,\
+   .class  = _class,   \
+   .enabled = STATIC_KEY_FALSE_INIT,   \
+   };  \
+   \
+	static_key_false(&df.enabled.key) &&	\
+	__dynamic_fault_enabled(&df);		\
+})
+
+#else
+
+#define dynamic_fault(_class)  false
+
+#endif /* CODETAG_FAULT_INJECTION */
+
+#define memory_fault() dynamic_fault("memory")
+
+#endif /* _LINUX_DYNAMIC_FAULT_H */
diff --git a/include/linux/slab.h b/include/linux/slab.h
index 89273be35743..4be5a93ed15a 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -17,6 +17,7 @@
 #include 
 #include 
 #include 
+#include 
 
 
 /*
@@ -468,7 +469,7 @@ static inline void slab_tag_dec(const void *ptr) {}
 
 #define krealloc_hooks(_p, _do_alloc)  \
 ({ \
-   void *_res = _do_alloc; \
+   void *_res = !memory_fault() ? _do_alloc : NULL;\
slab_tag_add(_p, _res); \
_res;   \
 })
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 2790848464f1..b7d03afbc808 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1982,6 +1982,12 @@ config FAULT_INJECTION_STACKTRACE_FILTER
help
  Provide 

[RFC PATCH 18/30] codetag: add codetag query helper functions

2022-08-30 Thread Suren Baghdasaryan
From: Kent Overstreet 

Provide codetag_query_parse() to parse codetag queries and
codetag_matches_query() to check if the query affects a given codetag.

Signed-off-by: Kent Overstreet 
---
 include/linux/codetag.h |  27 
 lib/codetag.c   | 135 
 2 files changed, 162 insertions(+)

diff --git a/include/linux/codetag.h b/include/linux/codetag.h
index 386733e89b31..0c605417ebbe 100644
--- a/include/linux/codetag.h
+++ b/include/linux/codetag.h
@@ -80,4 +80,31 @@ static inline void codetag_load_module(struct module *mod) {}
 static inline void codetag_unload_module(struct module *mod) {}
 #endif
 
+/* Codetag query parsing */
+
+struct codetag_query {
+   const char  *filename;
+   const char  *module;
+   const char  *function;
+   const char  *class;
+   unsigned intfirst_line, last_line;
+   unsigned intfirst_index, last_index;
+   unsigned intcur_index;
+
+   boolmatch_line:1;
+   boolmatch_index:1;
+
+   unsigned intset_enabled:1;
+   unsigned intenabled:2;
+
+   unsigned intset_frequency:1;
+   unsigned intfrequency;
+};
+
+char *codetag_query_parse(struct codetag_query *q, char *buf);
+bool codetag_matches_query(struct codetag_query *q,
+  const struct codetag *ct,
+  const struct codetag_module *mod,
+  const char *class);
+
 #endif /* _LINUX_CODETAG_H */
diff --git a/lib/codetag.c b/lib/codetag.c
index f0a3174f9b71..288ccfd5cbd0 100644
--- a/lib/codetag.c
+++ b/lib/codetag.c
@@ -246,3 +246,138 @@ void codetag_unload_module(struct module *mod)
}
	mutex_unlock(&codetag_lock);
 }
+
+/* Codetag query parsing */
+
+#define CODETAG_QUERY_TOKENS() \
+   x(func) \
+   x(file) \
+   x(line) \
+   x(module)   \
+   x(class)\
+   x(index)
+
+enum tokens {
+#define x(name)TOK_##name,
+   CODETAG_QUERY_TOKENS()
+#undef x
+};
+
+static const char * const token_strs[] = {
+#define x(name)#name,
+   CODETAG_QUERY_TOKENS()
+#undef x
+   NULL
+};
+
+static int parse_range(char *str, unsigned int *first, unsigned int *last)
+{
+   char *first_str = str;
+   char *last_str = strchr(first_str, '-');
+
+   if (last_str)
+   *last_str++ = '\0';
+
+   if (kstrtouint(first_str, 10, first))
+   return -EINVAL;
+
+   if (!last_str)
+   *last = *first;
+   else if (kstrtouint(last_str, 10, last))
+   return -EINVAL;
+
+   return 0;
+}
+
+char *codetag_query_parse(struct codetag_query *q, char *buf)
+{
+   while (1) {
+   char *p = buf;
+		char *str1 = strsep_no_empty(&p, " \t\r\n");
+		char *str2 = strsep_no_empty(&p, " \t\r\n");
+   int ret, token;
+
+   if (!str1 || !str2)
+   break;
+
+   token = match_string(token_strs, ARRAY_SIZE(token_strs), str1);
+   if (token < 0)
+   break;
+
+   switch (token) {
+   case TOK_func:
+   q->function = str2;
+   break;
+   case TOK_file:
+   q->filename = str2;
+   break;
+   case TOK_line:
+			ret = parse_range(str2, &q->first_line, &q->last_line);
+   if (ret)
+   return ERR_PTR(ret);
+   q->match_line = true;
+   break;
+   case TOK_module:
+   q->module = str2;
+   break;
+   case TOK_class:
+   q->class = str2;
+   break;
+   case TOK_index:
+			ret = parse_range(str2, &q->first_index, &q->last_index);
+   if (ret)
+   return ERR_PTR(ret);
+   q->match_index = true;
+   break;
+   }
+
+   buf = p;
+   }
+
+   return buf;
+}
+
+bool codetag_matches_query(struct codetag_query *q,
+  const struct codetag *ct,
+  const struct codetag_module *mod,
+  const char *class)
+{
+   size_t classlen = q->class ? strlen(q->class) : 0;
+
+   if (q->module &&
+   (!mod->mod ||
+strcmp(q->module, ct->modname)))
+   return false;
+
+   if (q->filename &&
+   strcmp(q->filename, ct->filename) &&
+   strcmp(q->filename, kbasename(ct->filename)))
+   return false;
+
+   if (q->function &&
+   strcmp(q->function, ct->function))
+   return false;
+
+   /* match against the 

[RFC PATCH 21/30] lib: implement context capture support for page and slab allocators

2022-08-30 Thread Suren Baghdasaryan
Implement mechanisms for capturing allocation call context, which consists
of:
- allocation size
- pid, tgid and name of the allocating task
- allocation timestamp
- allocation call stack
The patch creates an alloc_tags.ctx file which can be written to
enable/disable context capture for a specific code tag. Captured context
can be obtained by reading the alloc_tags.ctx file.
Usage example:

echo "file include/asm-generic/pgalloc.h line 63 enable" > \
/sys/kernel/debug/alloc_tags.ctx
cat alloc_tags.ctx
 91.0MiB  212 include/asm-generic/pgalloc.h:63 module:pgtable 
func:__pte_alloc_one
size: 4096
pid: 1551
tgid: 1551
comm: cat
ts: 670109646361
call stack:
 pte_alloc_one+0xfe/0x130
 __pte_alloc+0x22/0x90
 move_page_tables.part.0+0x994/0xa60
 shift_arg_pages+0xa4/0x180
 setup_arg_pages+0x286/0x2d0
 load_elf_binary+0x4e1/0x18d0
 bprm_execve+0x26b/0x660
 do_execveat_common.isra.0+0x19d/0x220
 __x64_sys_execve+0x2e/0x40
 do_syscall_64+0x38/0x90
 entry_SYSCALL_64_after_hwframe+0x63/0xcd

size: 4096
pid: 1551
tgid: 1551
comm: cat
ts: 670109711801
call stack:
 pte_alloc_one+0xfe/0x130
 __do_fault+0x52/0xc0
 __handle_mm_fault+0x7d9/0xdd0
 handle_mm_fault+0xc0/0x2b0
 do_user_addr_fault+0x1c3/0x660
 exc_page_fault+0x62/0x150
 asm_exc_page_fault+0x22/0x30
...

echo "file include/asm-generic/pgalloc.h line 63 disable" > \
/sys/kernel/debug/alloc_tags.ctx

Note that disabling context capture will not clear already-captured
context, but no new context will be captured.

Signed-off-by: Suren Baghdasaryan 
---
 include/linux/alloc_tag.h |  28 -
 include/linux/codetag.h   |   3 +-
 lib/Kconfig.debug |   1 +
 lib/alloc_tag.c   | 239 +-
 lib/codetag.c |  20 ++--
 5 files changed, 273 insertions(+), 18 deletions(-)

diff --git a/include/linux/alloc_tag.h b/include/linux/alloc_tag.h
index b3f589afb1c9..66638cbf349a 100644
--- a/include/linux/alloc_tag.h
+++ b/include/linux/alloc_tag.h
@@ -16,27 +16,41 @@
  * an array of these. Embedded codetag utilizes codetag framework.
  */
 struct alloc_tag {
-   struct codetag  ct;
+   struct codetag_with_ctx ctc;
unsigned long   last_wrap;
struct raw_lazy_percpu_counter  call_count;
struct raw_lazy_percpu_counter  bytes_allocated;
 } __aligned(8);
 
+static inline struct alloc_tag *ctc_to_alloc_tag(struct codetag_with_ctx *ctc)
+{
+   return container_of(ctc, struct alloc_tag, ctc);
+}
+
 static inline struct alloc_tag *ct_to_alloc_tag(struct codetag *ct)
 {
-   return container_of(ct, struct alloc_tag, ct);
+   return container_of(ct_to_ctc(ct), struct alloc_tag, ctc);
 }
 
+struct codetag_ctx *alloc_tag_create_ctx(struct alloc_tag *tag, size_t size);
+void alloc_tag_free_ctx(struct codetag_ctx *ctx, struct alloc_tag **ptag);
+bool alloc_tag_enable_ctx(struct alloc_tag *tag, bool enable);
+
 #define DEFINE_ALLOC_TAG(_alloc_tag)   \
static struct alloc_tag _alloc_tag __used __aligned(8)  \
-   __section("alloc_tags") = { .ct = CODE_TAG_INIT }
+   __section("alloc_tags") = { .ctc.ct = CODE_TAG_INIT }
 
 #define alloc_tag_counter_read(counter)
\
__lazy_percpu_counter_read(counter)
 
 static inline void __alloc_tag_sub(union codetag_ref *ref, size_t bytes)
 {
-   struct alloc_tag *tag = ct_to_alloc_tag(ref->ct);
+   struct alloc_tag *tag;
+
+   if (is_codetag_ctx_ref(ref))
+		alloc_tag_free_ctx(ref->ctx, &tag);
+   else
+   tag = ct_to_alloc_tag(ref->ct);
 
	__lazy_percpu_counter_add(&tag->call_count, &tag->last_wrap, -1);
	__lazy_percpu_counter_add(&tag->bytes_allocated, &tag->last_wrap, -bytes);
@@ -51,7 +65,11 @@ do { 
\
 
 static inline void __alloc_tag_add(struct alloc_tag *tag, union codetag_ref 
*ref, size_t bytes)
 {
-	ref->ct = &tag->ct;
+	if (codetag_ctx_enabled(&tag->ctc))
+		ref->ctx = alloc_tag_create_ctx(tag, bytes);
+	else
+		ref->ct = &tag->ctc.ct;
+
	__lazy_percpu_counter_add(&tag->call_count, &tag->last_wrap, 1);
	__lazy_percpu_counter_add(&tag->bytes_allocated, &tag->last_wrap, bytes);
 }
diff --git a/include/linux/codetag.h b/include/linux/codetag.h
index 57736ec77b45..a10c5fcbdd20 100644
--- a/include/linux/codetag.h
+++ b/include/linux/codetag.h
@@ -104,7 +104,8 @@ struct codetag_with_ctx *ct_to_ctc(struct codetag *ct)
 }
 
 void codetag_lock_module_list(struct codetag_type *cttype, bool lock);
-struct codetag_iterator codetag_get_ct_iter(struct codetag_type *cttype);
+void codetag_init_iter(struct codetag_iterator *iter,
+  struct codetag_type *cttype);
 struct codetag 

[RFC PATCH 17/30] lib/string.c: strsep_no_empty()

2022-08-30 Thread Suren Baghdasaryan
From: Kent Overstreet 

This adds a new helper which is like strsep, except that it skips empty
tokens.

Signed-off-by: Kent Overstreet 
---
 include/linux/string.h |  1 +
 lib/string.c   | 19 +++
 2 files changed, 20 insertions(+)

diff --git a/include/linux/string.h b/include/linux/string.h
index 61ec7e4f6311..b950ac9cfa56 100644
--- a/include/linux/string.h
+++ b/include/linux/string.h
@@ -96,6 +96,7 @@ extern char * strpbrk(const char *,const char *);
 #ifndef __HAVE_ARCH_STRSEP
 extern char * strsep(char **,const char *);
 #endif
+extern char *strsep_no_empty(char **, const char *);
 #ifndef __HAVE_ARCH_STRSPN
 extern __kernel_size_t strspn(const char *,const char *);
 #endif
diff --git a/lib/string.c b/lib/string.c
index 6f334420f687..6939f5b751f2 100644
--- a/lib/string.c
+++ b/lib/string.c
@@ -596,6 +596,25 @@ char *strsep(char **s, const char *ct)
 EXPORT_SYMBOL(strsep);
 #endif
 
+/**
+ * strsep_no_empty - Split a string into tokens, but don't return empty tokens
+ * @s: The string to be searched
+ * @ct: The characters to search for
+ *
+ * strsep_no_empty() updates @s to point after the token, ready for the next call.
+ */
+char *strsep_no_empty(char **s, const char *ct)
+{
+   char *ret;
+
+   do {
+   ret = strsep(s, ct);
+   } while (ret && !*ret);
+
+   return ret;
+}
+EXPORT_SYMBOL_GPL(strsep_no_empty);
+
 #ifndef __HAVE_ARCH_MEMSET
 /**
  * memset - Fill a region of memory with the given value
-- 
2.37.2.672.g94769d06f0-goog




[RFC PATCH 16/30] mm: enable slab allocation tagging for kmalloc and friends

2022-08-30 Thread Suren Baghdasaryan
Redefine kmalloc, krealloc, kzalloc, kcalloc, etc. to record allocations
and deallocations done by these functions.

Signed-off-by: Suren Baghdasaryan 
Co-developed-by: Kent Overstreet 
Signed-off-by: Kent Overstreet 
---
 include/linux/slab.h | 103 +--
 mm/slab.c|   2 +
 mm/slab_common.c |  16 +++
 mm/slob.c|   2 +
 mm/slub.c|   2 +
 5 files changed, 75 insertions(+), 50 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index 5a198aa02a08..89273be35743 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -191,7 +191,10 @@ int kmem_cache_shrink(struct kmem_cache *s);
 /*
  * Common kmalloc functions provided by all allocators
  */
-void * __must_check krealloc(const void *objp, size_t new_size, gfp_t flags) 
__alloc_size(2);
+void * __must_check _krealloc(const void *objp, size_t new_size, gfp_t flags) 
__alloc_size(2);
+#define krealloc(_p, _size, _flags)\
+   krealloc_hooks(_p, _krealloc(_p, _size, _flags))
+
 void kfree(const void *objp);
 void kfree_sensitive(const void *objp);
 size_t __ksize(const void *objp);
@@ -463,6 +466,15 @@ static inline void slab_tag_dec(const void *ptr) {}
 
 #endif
 
+#define krealloc_hooks(_p, _do_alloc)  \
+({ \
+   void *_res = _do_alloc; \
+   slab_tag_add(_p, _res); \
+   _res;   \
+})
+
+#define kmalloc_hooks(_do_alloc)   krealloc_hooks(NULL, _do_alloc)
+
 void *__kmalloc(size_t size, gfp_t flags) __assume_kmalloc_alignment 
__alloc_size(1);
 void *kmem_cache_alloc(struct kmem_cache *s, gfp_t flags) 
__assume_slab_alignment __malloc;
 void *kmem_cache_alloc_lru(struct kmem_cache *s, struct list_lru *lru,
@@ -541,25 +553,31 @@ static __always_inline void 
*kmem_cache_alloc_node_trace(struct kmem_cache *s, g
 }
 #endif /* CONFIG_TRACING */
 
-extern void *kmalloc_order(size_t size, gfp_t flags, unsigned int order) 
__assume_page_alignment
+extern void *_kmalloc_order(size_t size, gfp_t flags, unsigned int order) 
__assume_page_alignment
 
__alloc_size(1);
+#define kmalloc_order(_size, _flags, _order)  \
+   kmalloc_hooks(_kmalloc_order(_size, _flags, _order))
 
 #ifdef CONFIG_TRACING
-extern void *kmalloc_order_trace(size_t size, gfp_t flags, unsigned int order)
+extern void *_kmalloc_order_trace(size_t size, gfp_t flags, unsigned int order)
__assume_page_alignment __alloc_size(1);
 #else
-static __always_inline __alloc_size(1) void *kmalloc_order_trace(size_t size, 
gfp_t flags,
+static __always_inline __alloc_size(1) void *_kmalloc_order_trace(size_t size, 
gfp_t flags,
 unsigned int 
order)
 {
-   return kmalloc_order(size, flags, order);
+   return _kmalloc_order(size, flags, order);
 }
 #endif
+#define kmalloc_order_trace(_size, _flags, _order)  \
+   kmalloc_hooks(_kmalloc_order_trace(_size, _flags, _order))
 
-static __always_inline __alloc_size(1) void *kmalloc_large(size_t size, gfp_t 
flags)
+static __always_inline __alloc_size(1) void *_kmalloc_large(size_t size, gfp_t 
flags)
 {
unsigned int order = get_order(size);
-   return kmalloc_order_trace(size, flags, order);
+   return _kmalloc_order_trace(size, flags, order);
 }
+#define kmalloc_large(_size, _flags)\
+   kmalloc_hooks(_kmalloc_large(_size, _flags))
 
 /**
  * kmalloc - allocate memory
@@ -615,14 +633,14 @@ static __always_inline __alloc_size(1) void 
*kmalloc_large(size_t size, gfp_t fl
  * Try really hard to succeed the allocation but fail
  * eventually.
  */
-static __always_inline __alloc_size(1) void *kmalloc(size_t size, gfp_t flags)
+static __always_inline __alloc_size(1) void *_kmalloc(size_t size, gfp_t flags)
 {
if (__builtin_constant_p(size)) {
 #ifndef CONFIG_SLOB
unsigned int index;
 #endif
if (size > KMALLOC_MAX_CACHE_SIZE)
-   return kmalloc_large(size, flags);
+   return _kmalloc_large(size, flags);
 #ifndef CONFIG_SLOB
index = kmalloc_index(size);
 
@@ -636,8 +654,9 @@ static __always_inline __alloc_size(1) void *kmalloc(size_t 
size, gfp_t flags)
}
return __kmalloc(size, flags);
 }
+#define kmalloc(_size, _flags) kmalloc_hooks(_kmalloc(_size, 
_flags))
 
-static __always_inline __alloc_size(1) void *kmalloc_node(size_t size, gfp_t 
flags, int node)
+static __always_inline __alloc_size(1) void *_kmalloc_node(size_t size, gfp_t 
flags, int node)
 {
 #ifndef CONFIG_SLOB
if (__builtin_constant_p(size) &&
@@ 

[RFC PATCH 15/30] lib: introduce slab allocation tagging

2022-08-30 Thread Suren Baghdasaryan
Introduce CONFIG_SLAB_ALLOC_TAGGING which provides helper functions
to easily instrument slab allocators and adds a codetag_ref field into
slabobj_ext to store a pointer to the allocation tag associated with
the code that allocated the slab object.

Signed-off-by: Suren Baghdasaryan 
Co-developed-by: Kent Overstreet 
Signed-off-by: Kent Overstreet 
---
 include/linux/memcontrol.h |  5 +
 include/linux/slab.h   | 25 +
 include/linux/slab_def.h   |  2 +-
 include/linux/slub_def.h   |  4 ++--
 lib/Kconfig.debug  | 11 +++
 mm/slab_common.c   | 33 +
 6 files changed, 77 insertions(+), 3 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 315399f77173..97c0153f0247 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -232,7 +232,12 @@ struct obj_cgroup {
  * if MEMCG_DATA_OBJEXTS is set.
  */
 struct slabobj_ext {
+#ifdef CONFIG_MEMCG_KMEM
struct obj_cgroup *objcg;
+#endif
+#ifdef CONFIG_SLAB_ALLOC_TAGGING
+   union codetag_ref ref;
+#endif
 } __aligned(8);
 
 /*
diff --git a/include/linux/slab.h b/include/linux/slab.h
index 55ae3ea864a4..5a198aa02a08 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -438,6 +438,31 @@ static __always_inline unsigned int __kmalloc_index(size_t 
size,
 #define kmalloc_index(s) __kmalloc_index(s, true)
 #endif /* !CONFIG_SLOB */
 
+#ifdef CONFIG_SLAB_ALLOC_TAGGING
+
+#include 
+
+union codetag_ref *get_slab_tag_ref(const void *objp);
+
+#define slab_tag_add(_old, _new)   \
+do {   \
+   if (!ZERO_OR_NULL_PTR(_new) && _old != _new)\
+   alloc_tag_add(get_slab_tag_ref(_new), __ksize(_new));   \
+} while (0)
+
+static inline void slab_tag_dec(const void *ptr)
+{
+   if (!ZERO_OR_NULL_PTR(ptr))
+   alloc_tag_sub(get_slab_tag_ref(ptr), __ksize(ptr));
+}
+
+#else
+
+#define slab_tag_add(_old, _new) do {} while (0)
+static inline void slab_tag_dec(const void *ptr) {}
+
+#endif
+
 void *__kmalloc(size_t size, gfp_t flags) __assume_kmalloc_alignment 
__alloc_size(1);
 void *kmem_cache_alloc(struct kmem_cache *s, gfp_t flags) 
__assume_slab_alignment __malloc;
 void *kmem_cache_alloc_lru(struct kmem_cache *s, struct list_lru *lru,
diff --git a/include/linux/slab_def.h b/include/linux/slab_def.h
index e24c9aff6fed..25feb5f7dc32 100644
--- a/include/linux/slab_def.h
+++ b/include/linux/slab_def.h
@@ -106,7 +106,7 @@ static inline void *nearest_obj(struct kmem_cache *cache, 
const struct slab *sla
  *   reciprocal_divide(offset, cache->reciprocal_buffer_size)
  */
 static inline unsigned int obj_to_index(const struct kmem_cache *cache,
-   const struct slab *slab, void *obj)
+   const struct slab *slab, const void 
*obj)
 {
u32 offset = (obj - slab->s_mem);
return reciprocal_divide(offset, cache->reciprocal_buffer_size);
diff --git a/include/linux/slub_def.h b/include/linux/slub_def.h
index f9c68a9dac04..940c146768d4 100644
--- a/include/linux/slub_def.h
+++ b/include/linux/slub_def.h
@@ -170,14 +170,14 @@ static inline void *nearest_obj(struct kmem_cache *cache, 
const struct slab *sla
 
 /* Determine object index from a given position */
 static inline unsigned int __obj_to_index(const struct kmem_cache *cache,
- void *addr, void *obj)
+ void *addr, const void *obj)
 {
return reciprocal_divide(kasan_reset_tag(obj) - addr,
 cache->reciprocal_size);
 }
 
 static inline unsigned int obj_to_index(const struct kmem_cache *cache,
-   const struct slab *slab, void *obj)
+   const struct slab *slab, const void 
*obj)
 {
if (is_kfence_address(obj))
return 0;
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 6686648843b3..08c97a978906 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -989,6 +989,17 @@ config PAGE_ALLOC_TAGGING
  initiated at that code location. The mechanism can be used to track
  memory leaks with a low performance impact.
 
+config SLAB_ALLOC_TAGGING
+   bool "Enable slab allocation tagging"
+   default n
+   select ALLOC_TAGGING
+   select SLAB_OBJ_EXT
+   help
+ Instrument slab allocators to track allocation source code and
+ collect statistics on the number of allocations and their total size
+ initiated at that code location. The mechanism can be used to track
+ memory leaks with a low performance impact.
+
 source "lib/Kconfig.kasan"
 source "lib/Kconfig.kfence"
 
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 17996649cfe3..272eda62ecaa 100644
--- a/mm/slab_common.c
+++ 

[RFC PATCH 14/30] mm: prevent slabobj_ext allocations for slabobj_ext and kmem_cache objects

2022-08-30 Thread Suren Baghdasaryan
Use __GFP_NO_OBJ_EXT to prevent recursions when allocating slabobj_ext
objects. Also prevent slabobj_ext allocations for kmem_cache objects.

Signed-off-by: Suren Baghdasaryan 
---
 mm/memcontrol.c | 2 ++
 mm/slab.h   | 6 ++
 2 files changed, 8 insertions(+)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 3f407ef2f3f1..dabb451dc364 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2809,6 +2809,8 @@ int alloc_slab_obj_exts(struct slab *slab, struct 
kmem_cache *s,
void *vec;
 
gfp &= ~OBJCGS_CLEAR_MASK;
+   /* Prevent recursive extension vector allocation */
+   gfp |= __GFP_NO_OBJ_EXT;
vec = kcalloc_node(objects, sizeof(struct slabobj_ext), gfp,
   slab_nid(slab));
if (!vec)
diff --git a/mm/slab.h b/mm/slab.h
index c767ce3f0fe2..d93b22b8bbe2 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -475,6 +475,12 @@ static inline void prepare_slab_obj_exts_hook(struct 
kmem_cache *s, gfp_t flags,
if (is_kmem_only_obj_ext())
return;
 
+   if (s->flags & SLAB_NO_OBJ_EXT)
+   return;
+
+   if (flags & __GFP_NO_OBJ_EXT)
+   return;
+
slab = virt_to_slab(p);
if (!slab_obj_exts(slab))
WARN(alloc_slab_obj_exts(slab, s, flags, false),
-- 
2.37.2.672.g94769d06f0-goog




[RFC PATCH 11/30] mm: introduce slabobj_ext to support slab object extensions

2022-08-30 Thread Suren Baghdasaryan
Currently slab pages can store only vectors of obj_cgroup pointers in
page->memcg_data. Introduce a slabobj_ext structure to allow more data
to be stored for each slab object, and wrap obj_cgroup in slabobj_ext
to support the current functionality while allowing slabobj_ext to be
extended in the future.

Note: ideally the config dependency should be turned the other way around:
MEMCG should depend on SLAB_OBJ_EXT and {page|slab|folio}.memcg_data would
be renamed to something like {page|slab|folio}.objext_data. However, doing
this in an RFC would introduce considerable churn unrelated to the overall
idea, so it is avoided until v1.

Signed-off-by: Suren Baghdasaryan 
---
 include/linux/memcontrol.h |  18 --
 init/Kconfig   |   5 ++
 mm/kfence/core.c   |   2 +-
 mm/memcontrol.c|  60 ++-
 mm/page_owner.c|   2 +-
 mm/slab.h  | 119 +
 6 files changed, 131 insertions(+), 75 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 6257867fbf95..315399f77173 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -227,6 +227,14 @@ struct obj_cgroup {
};
 };
 
+/*
+ * Extended information for slab objects stored as an array in page->memcg_data
+ * if MEMCG_DATA_OBJEXTS is set.
+ */
+struct slabobj_ext {
+   struct obj_cgroup *objcg;
+} __aligned(8);
+
 /*
  * The memory controller data structure. The memory controller controls both
  * page cache and RSS per cgroup. We would eventually like to provide
@@ -363,7 +371,7 @@ extern struct mem_cgroup *root_mem_cgroup;
 
 enum page_memcg_data_flags {
/* page->memcg_data is a pointer to an objcgs vector */
-   MEMCG_DATA_OBJCGS = (1UL << 0),
+   MEMCG_DATA_OBJEXTS = (1UL << 0),
/* page has been accounted as a non-slab kernel page */
MEMCG_DATA_KMEM = (1UL << 1),
/* the next bit after the last actual flag */
@@ -401,7 +409,7 @@ static inline struct mem_cgroup *__folio_memcg(struct folio 
*folio)
unsigned long memcg_data = folio->memcg_data;
 
VM_BUG_ON_FOLIO(folio_test_slab(folio), folio);
-   VM_BUG_ON_FOLIO(memcg_data & MEMCG_DATA_OBJCGS, folio);
+   VM_BUG_ON_FOLIO(memcg_data & MEMCG_DATA_OBJEXTS, folio);
VM_BUG_ON_FOLIO(memcg_data & MEMCG_DATA_KMEM, folio);
 
return (struct mem_cgroup *)(memcg_data & ~MEMCG_DATA_FLAGS_MASK);
@@ -422,7 +430,7 @@ static inline struct obj_cgroup *__folio_objcg(struct folio 
*folio)
unsigned long memcg_data = folio->memcg_data;
 
VM_BUG_ON_FOLIO(folio_test_slab(folio), folio);
-   VM_BUG_ON_FOLIO(memcg_data & MEMCG_DATA_OBJCGS, folio);
+   VM_BUG_ON_FOLIO(memcg_data & MEMCG_DATA_OBJEXTS, folio);
VM_BUG_ON_FOLIO(!(memcg_data & MEMCG_DATA_KMEM), folio);
 
return (struct obj_cgroup *)(memcg_data & ~MEMCG_DATA_FLAGS_MASK);
@@ -517,7 +525,7 @@ static inline struct mem_cgroup *page_memcg_check(struct 
page *page)
 */
unsigned long memcg_data = READ_ONCE(page->memcg_data);
 
-   if (memcg_data & MEMCG_DATA_OBJCGS)
+   if (memcg_data & MEMCG_DATA_OBJEXTS)
return NULL;
 
if (memcg_data & MEMCG_DATA_KMEM) {
@@ -556,7 +564,7 @@ static inline struct mem_cgroup 
*get_mem_cgroup_from_objcg(struct obj_cgroup *ob
 static inline bool folio_memcg_kmem(struct folio *folio)
 {
VM_BUG_ON_PGFLAGS(PageTail(>page), >page);
-   VM_BUG_ON_FOLIO(folio->memcg_data & MEMCG_DATA_OBJCGS, folio);
+   VM_BUG_ON_FOLIO(folio->memcg_data & MEMCG_DATA_OBJEXTS, folio);
return folio->memcg_data & MEMCG_DATA_KMEM;
 }
 
diff --git a/init/Kconfig b/init/Kconfig
index 532362fcfe31..82396d7a2717 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -958,6 +958,10 @@ config MEMCG
help
  Provides control over the memory footprint of tasks in a cgroup.
 
+config SLAB_OBJ_EXT
+   bool
+   depends on MEMCG
+
 config MEMCG_SWAP
bool
depends on MEMCG && SWAP
@@ -966,6 +970,7 @@ config MEMCG_SWAP
 config MEMCG_KMEM
bool
depends on MEMCG && !SLOB
+   select SLAB_OBJ_EXT
default y
 
 config BLK_CGROUP
diff --git a/mm/kfence/core.c b/mm/kfence/core.c
index c252081b11df..c0958e4a32e2 100644
--- a/mm/kfence/core.c
+++ b/mm/kfence/core.c
@@ -569,7 +569,7 @@ static unsigned long kfence_init_pool(void)
__folio_set_slab(slab_folio(slab));
 #ifdef CONFIG_MEMCG
		slab->memcg_data = (unsigned long)&kfence_metadata[i / 2 - 1].objcg |
-  MEMCG_DATA_OBJCGS;
+  MEMCG_DATA_OBJEXTS;
 #endif
}
 
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index b69979c9ced5..3f407ef2f3f1 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2793,7 +2793,7 @@ static void commit_charge(struct folio *folio, struct 
mem_cgroup *memcg)
folio->memcg_data = (unsigned long)memcg;
 }
 
-#ifdef CONFIG_MEMCG_KMEM

[RFC PATCH 09/30] change alloc_pages name in dma_map_ops to avoid name conflicts

2022-08-30 Thread Suren Baghdasaryan
After redefining alloc_pages, all uses of that name are being replaced.
Change the conflicting names to prevent the preprocessor from replacing
them when that is not intended.

Signed-off-by: Suren Baghdasaryan 
---
 arch/x86/kernel/amd_gart_64.c | 2 +-
 drivers/iommu/dma-iommu.c | 2 +-
 drivers/xen/grant-dma-ops.c   | 2 +-
 drivers/xen/swiotlb-xen.c | 2 +-
 include/linux/dma-map-ops.h   | 2 +-
 kernel/dma/mapping.c  | 4 ++--
 6 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kernel/amd_gart_64.c b/arch/x86/kernel/amd_gart_64.c
index 194d54eed537..5e83a387bfef 100644
--- a/arch/x86/kernel/amd_gart_64.c
+++ b/arch/x86/kernel/amd_gart_64.c
@@ -676,7 +676,7 @@ static const struct dma_map_ops gart_dma_ops = {
.get_sgtable= dma_common_get_sgtable,
.dma_supported  = dma_direct_supported,
.get_required_mask  = dma_direct_get_required_mask,
-   .alloc_pages= dma_direct_alloc_pages,
+   .alloc_pages_op = dma_direct_alloc_pages,
.free_pages = dma_direct_free_pages,
 };
 
diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 17dd683b2fce..58b4878ef930 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -1547,7 +1547,7 @@ static const struct dma_map_ops iommu_dma_ops = {
.flags  = DMA_F_PCI_P2PDMA_SUPPORTED,
.alloc  = iommu_dma_alloc,
.free   = iommu_dma_free,
-   .alloc_pages= dma_common_alloc_pages,
+   .alloc_pages_op = dma_common_alloc_pages,
.free_pages = dma_common_free_pages,
.alloc_noncontiguous= iommu_dma_alloc_noncontiguous,
.free_noncontiguous = iommu_dma_free_noncontiguous,
diff --git a/drivers/xen/grant-dma-ops.c b/drivers/xen/grant-dma-ops.c
index 8973fc1e9ccc..0e26d066036e 100644
--- a/drivers/xen/grant-dma-ops.c
+++ b/drivers/xen/grant-dma-ops.c
@@ -262,7 +262,7 @@ static int xen_grant_dma_supported(struct device *dev, u64 mask)
 static const struct dma_map_ops xen_grant_dma_ops = {
.alloc = xen_grant_dma_alloc,
.free = xen_grant_dma_free,
-   .alloc_pages = xen_grant_dma_alloc_pages,
+   .alloc_pages_op = xen_grant_dma_alloc_pages,
.free_pages = xen_grant_dma_free_pages,
.mmap = dma_common_mmap,
.get_sgtable = dma_common_get_sgtable,
diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index 67aa74d20162..5ab2616153f0 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -403,6 +403,6 @@ const struct dma_map_ops xen_swiotlb_dma_ops = {
.dma_supported = xen_swiotlb_dma_supported,
.mmap = dma_common_mmap,
.get_sgtable = dma_common_get_sgtable,
-   .alloc_pages = dma_common_alloc_pages,
+   .alloc_pages_op = dma_common_alloc_pages,
.free_pages = dma_common_free_pages,
 };
diff --git a/include/linux/dma-map-ops.h b/include/linux/dma-map-ops.h
index d678afeb8a13..e8e2d210ba68 100644
--- a/include/linux/dma-map-ops.h
+++ b/include/linux/dma-map-ops.h
@@ -27,7 +27,7 @@ struct dma_map_ops {
unsigned long attrs);
void (*free)(struct device *dev, size_t size, void *vaddr,
dma_addr_t dma_handle, unsigned long attrs);
-   struct page *(*alloc_pages)(struct device *dev, size_t size,
+   struct page *(*alloc_pages_op)(struct device *dev, size_t size,
dma_addr_t *dma_handle, enum dma_data_direction dir,
gfp_t gfp);
void (*free_pages)(struct device *dev, size_t size, struct page *vaddr,
diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
index 49cbf3e33de7..80a2bfeed8d0 100644
--- a/kernel/dma/mapping.c
+++ b/kernel/dma/mapping.c
@@ -552,9 +552,9 @@ static struct page *__dma_alloc_pages(struct device *dev, size_t size,
size = PAGE_ALIGN(size);
if (dma_alloc_direct(dev, ops))
return dma_direct_alloc_pages(dev, size, dma_handle, dir, gfp);
-   if (!ops->alloc_pages)
+   if (!ops->alloc_pages_op)
return NULL;
-   return ops->alloc_pages(dev, size, dma_handle, dir, gfp);
+   return ops->alloc_pages_op(dev, size, dma_handle, dir, gfp);
 }
 
 struct page *dma_alloc_pages(struct device *dev, size_t size,
-- 
2.37.2.672.g94769d06f0-goog




[RFC PATCH 10/30] mm: enable page allocation tagging for __get_free_pages and alloc_pages

2022-08-30 Thread Suren Baghdasaryan
Redefine alloc_pages and __get_free_pages to record the allocations made
through them, and instrument the deallocation hooks to record object freeing.

Signed-off-by: Suren Baghdasaryan 
---
 include/linux/gfp.h | 10 +++---
 include/linux/page_ext.h|  3 ++-
 include/linux/pgalloc_tag.h | 35 +++
 mm/mempolicy.c  |  4 ++--
 mm/page_alloc.c | 13 ++---
 5 files changed, 56 insertions(+), 9 deletions(-)

diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index f314be58fa77..5cb950a49d40 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -6,6 +6,7 @@
 
 #include 
 #include 
+#include 
 
 struct vm_area_struct;
 
@@ -267,12 +268,12 @@ static inline struct page *alloc_pages_node(int nid, gfp_t gfp_mask,
 }
 
 #ifdef CONFIG_NUMA
-struct page *alloc_pages(gfp_t gfp, unsigned int order);
+struct page *_alloc_pages(gfp_t gfp, unsigned int order);
 struct folio *folio_alloc(gfp_t gfp, unsigned order);
 struct folio *vma_alloc_folio(gfp_t gfp, int order, struct vm_area_struct *vma,
unsigned long addr, bool hugepage);
 #else
-static inline struct page *alloc_pages(gfp_t gfp_mask, unsigned int order)
+static inline struct page *_alloc_pages(gfp_t gfp_mask, unsigned int order)
 {
return alloc_pages_node(numa_node_id(), gfp_mask, order);
 }
@@ -283,6 +284,7 @@ static inline struct folio *folio_alloc(gfp_t gfp, unsigned int order)
 #define vma_alloc_folio(gfp, order, vma, addr, hugepage)   \
folio_alloc(gfp, order)
 #endif
+#define alloc_pages(gfp, order) pgtag_alloc_pages(gfp, order)
 #define alloc_page(gfp_mask) alloc_pages(gfp_mask, 0)
 static inline struct page *alloc_page_vma(gfp_t gfp,
struct vm_area_struct *vma, unsigned long addr)
@@ -292,7 +294,9 @@ static inline struct page *alloc_page_vma(gfp_t gfp,
	return &folio->page;
 }
 
-extern unsigned long __get_free_pages(gfp_t gfp_mask, unsigned int order);
+extern unsigned long _get_free_pages(gfp_t gfp_mask, unsigned int order,
+struct page **ppage);
+#define __get_free_pages(gfp_mask, order) pgtag_get_free_pages(gfp_mask, order)
 extern unsigned long get_zeroed_page(gfp_t gfp_mask);
 
 void *alloc_pages_exact(size_t size, gfp_t gfp_mask) __alloc_size(1);
diff --git a/include/linux/page_ext.h b/include/linux/page_ext.h
index fabb2e1e087f..b26077110fb3 100644
--- a/include/linux/page_ext.h
+++ b/include/linux/page_ext.h
@@ -4,7 +4,6 @@
 
 #include 
 #include 
-#include 
 
 struct pglist_data;
 struct page_ext_operations {
@@ -14,6 +13,8 @@ struct page_ext_operations {
void (*init)(void);
 };
 
+#include 
+
 #ifdef CONFIG_PAGE_EXTENSION
 
 enum page_ext_flags {
diff --git a/include/linux/pgalloc_tag.h b/include/linux/pgalloc_tag.h
index f525abfe51d4..154ea7436fec 100644
--- a/include/linux/pgalloc_tag.h
+++ b/include/linux/pgalloc_tag.h
@@ -5,6 +5,8 @@
 #ifndef _LINUX_PGALLOC_TAG_H
 #define _LINUX_PGALLOC_TAG_H
 
+#ifdef CONFIG_PAGE_ALLOC_TAGGING
+
 #include 
 #include 
 
@@ -25,4 +27,37 @@ static inline void pgalloc_tag_dec(struct page *page, unsigned int order)
alloc_tag_sub(get_page_tag_ref(page), PAGE_SIZE << order);
 }
 
+/*
+ * Redefinitions of the common page allocators/destructors
+ */
+#define pgtag_alloc_pages(gfp, order)  \
+({ \
+   struct page *_page = _alloc_pages((gfp), (order));  \
+   \
+   if (_page)  \
+   alloc_tag_add(get_page_tag_ref(_page), PAGE_SIZE << (order));\
+   _page;  \
+})
+
+#define pgtag_get_free_pages(gfp_mask, order)  \
+({ \
+   struct page *_page; \
+   unsigned long _res = _get_free_pages((gfp_mask), (order), &_page);\
+   \
+   if (_res)   \
+   alloc_tag_add(get_page_tag_ref(_page), PAGE_SIZE << (order));\
+   _res;   \
+})
+
+#else /* CONFIG_PAGE_ALLOC_TAGGING */
+
+#define pgtag_alloc_pages(gfp, order) _alloc_pages(gfp, order)
+
+#define pgtag_get_free_pages(gfp_mask, order) \
+   _get_free_pages((gfp_mask), (order), NULL)
+
+#define pgalloc_tag_dec(__page, __size)do {} while (0)
+
+#endif /* CONFIG_PAGE_ALLOC_TAGGING */
+
 #endif /* _LINUX_PGALLOC_TAG_H */
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index b73d3248d976..f7e6d9564a49 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -2249,7 +2249,7 @@ EXPORT_SYMBOL(vma_alloc_folio);
  * flags are used.
  * 

[RFC PATCH 12/30] mm: introduce __GFP_NO_OBJ_EXT flag to selectively prevent slabobj_ext creation

2022-08-30 Thread Suren Baghdasaryan
Introduce the __GFP_NO_OBJ_EXT flag to prevent recursive allocations when
the slabobj_ext vector is itself being allocated on a slab.

Signed-off-by: Suren Baghdasaryan 
---
 include/linux/gfp_types.h | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/include/linux/gfp_types.h b/include/linux/gfp_types.h
index d88c46ca82e1..a2cba1d20b86 100644
--- a/include/linux/gfp_types.h
+++ b/include/linux/gfp_types.h
@@ -55,8 +55,13 @@ typedef unsigned int __bitwise gfp_t;
 #define ___GFP_SKIP_KASAN_UNPOISON 0
 #define ___GFP_SKIP_KASAN_POISON   0
 #endif
+#ifdef CONFIG_SLAB_OBJ_EXT
+#define ___GFP_NO_OBJ_EXT   0x800u
+#else
+#define ___GFP_NO_OBJ_EXT   0
+#endif
 #ifdef CONFIG_LOCKDEP
-#define ___GFP_NOLOCKDEP   0x800u
+#define ___GFP_NOLOCKDEP   0x1000u
 #else
 #define ___GFP_NOLOCKDEP   0
 #endif
@@ -101,12 +106,15 @@ typedef unsigned int __bitwise gfp_t;
  * node with no fallbacks or placement policy enforcements.
  *
  * %__GFP_ACCOUNT causes the allocation to be accounted to kmemcg.
+ *
+ * %__GFP_NO_OBJ_EXT causes slab allocation to have no object extension.
  */
 #define __GFP_RECLAIMABLE ((__force gfp_t)___GFP_RECLAIMABLE)
 #define __GFP_WRITE((__force gfp_t)___GFP_WRITE)
 #define __GFP_HARDWALL   ((__force gfp_t)___GFP_HARDWALL)
 #define __GFP_THISNODE ((__force gfp_t)___GFP_THISNODE)
 #define __GFP_ACCOUNT  ((__force gfp_t)___GFP_ACCOUNT)
+#define __GFP_NO_OBJ_EXT   ((__force gfp_t)___GFP_NO_OBJ_EXT)
 
 /**
  * DOC: Watermark modifiers
@@ -256,7 +264,7 @@ typedef unsigned int __bitwise gfp_t;
 #define __GFP_NOLOCKDEP ((__force gfp_t)___GFP_NOLOCKDEP)
 
 /* Room for N __GFP_FOO bits */
-#define __GFP_BITS_SHIFT (27 + IS_ENABLED(CONFIG_LOCKDEP))
+#define __GFP_BITS_SHIFT (28 + IS_ENABLED(CONFIG_LOCKDEP))
 #define __GFP_BITS_MASK ((__force gfp_t)((1 << __GFP_BITS_SHIFT) - 1))
 
 /**
-- 
2.37.2.672.g94769d06f0-goog




[RFC PATCH 20/30] lib: introduce support for storing code tag context

2022-08-30 Thread Suren Baghdasaryan
Add support for code tag context capture when registering a new code tag
type. When context capture for a specific code tag is enabled,
codetag_ref will point to a codetag_ctx object which can be attached
to an application-specific object storing code invocation context.
codetag_ctx has a pointer to its codetag_with_ctx object with embedded
codetag object in it. All context objects of the same code tag are placed
into the codetag_with_ctx.ctx_head linked list. codetag.flags is used to
indicate when context capture for the associated code tag is
initialized and enabled.

Signed-off-by: Suren Baghdasaryan 
---
 include/linux/codetag.h |  50 +-
 include/linux/codetag_ctx.h |  48 +
 lib/codetag.c   | 134 
 3 files changed, 231 insertions(+), 1 deletion(-)
 create mode 100644 include/linux/codetag_ctx.h

diff --git a/include/linux/codetag.h b/include/linux/codetag.h
index 0c605417ebbe..57736ec77b45 100644
--- a/include/linux/codetag.h
+++ b/include/linux/codetag.h
@@ -5,8 +5,12 @@
 #ifndef _LINUX_CODETAG_H
 #define _LINUX_CODETAG_H
 
+#include 
+#include 
 #include 
 
+struct kref;
+struct codetag_ctx;
 struct codetag_iterator;
 struct codetag_type;
 struct seq_buf;
@@ -18,15 +22,38 @@ struct module;
  * an array of these.
  */
 struct codetag {
-   unsigned int flags; /* used in later patches */
+	unsigned int flags; /* has to be the first member shared with codetag_ctx */
unsigned int lineno;
const char *modname;
const char *function;
const char *filename;
 } __aligned(8);
 
+/* codetag_with_ctx flags */
+#define CTC_FLAG_CTX_PTR   (1 << 0)
+#define CTC_FLAG_CTX_READY (1 << 1)
+#define CTC_FLAG_CTX_ENABLED   (1 << 2)
+
+/*
+ * Code tag with context capture support. Contains a list to store context for
+ * each tag hit, a lock protecting the list and a flag to indicate whether
+ * context capture is enabled for the tag.
+ */
+struct codetag_with_ctx {
+   struct codetag ct;
+   struct list_head ctx_head;
+   spinlock_t ctx_lock;
+} __aligned(8);
+
+/*
+ * Tag reference can point to codetag directly or indirectly via codetag_ctx.
+ * Direct codetag pointer is used when context capture is disabled or not
+ * supported. When context capture for the tag is used, the reference points
+ * to the codetag_ctx through which the codetag can be reached.
+ */
 union codetag_ref {
struct codetag *ct;
+   struct codetag_ctx *ctx;
 };
 
 struct codetag_range {
@@ -46,6 +73,7 @@ struct codetag_type_desc {
struct codetag_module *cmod);
void (*module_unload)(struct codetag_type *cttype,
  struct codetag_module *cmod);
+   void (*free_ctx)(struct kref *ref);
 };
 
 struct codetag_iterator {
@@ -53,6 +81,7 @@ struct codetag_iterator {
struct codetag_module *cmod;
unsigned long mod_id;
struct codetag *ct;
+   struct codetag_ctx *ctx;
 };
 
 #define CODE_TAG_INIT {\
@@ -63,9 +92,28 @@ struct codetag_iterator {
.flags  = 0,\
 }
 
+static inline bool is_codetag_ctx_ref(union codetag_ref *ref)
+{
+   return !!(ref->ct->flags & CTC_FLAG_CTX_PTR);
+}
+
+static inline
+struct codetag_with_ctx *ct_to_ctc(struct codetag *ct)
+{
+   return container_of(ct, struct codetag_with_ctx, ct);
+}
+
 void codetag_lock_module_list(struct codetag_type *cttype, bool lock);
 struct codetag_iterator codetag_get_ct_iter(struct codetag_type *cttype);
 struct codetag *codetag_next_ct(struct codetag_iterator *iter);
+struct codetag_ctx *codetag_next_ctx(struct codetag_iterator *iter);
+
+bool codetag_enable_ctx(struct codetag_with_ctx *ctc, bool enable);
+static inline bool codetag_ctx_enabled(struct codetag_with_ctx *ctc)
+{
+   return !!(ctc->ct.flags & CTC_FLAG_CTX_ENABLED);
+}
+bool codetag_has_ctx(struct codetag_with_ctx *ctc);
 
 void codetag_to_text(struct seq_buf *out, struct codetag *ct);
 
diff --git a/include/linux/codetag_ctx.h b/include/linux/codetag_ctx.h
new file mode 100644
index ..e741484f0e08
--- /dev/null
+++ b/include/linux/codetag_ctx.h
@@ -0,0 +1,48 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * code tag context
+ */
+#ifndef _LINUX_CODETAG_CTX_H
+#define _LINUX_CODETAG_CTX_H
+
+#include 
+#include 
+
+/* Code tag hit context. */
+struct codetag_ctx {
+   unsigned int flags; /* has to be the first member shared with codetag */
+   struct codetag_with_ctx *ctc;
+   struct list_head node;
+   struct kref refcount;
+} __aligned(8);
+
+static inline struct codetag_ctx *kref_to_ctx(struct kref *refcount)
+{
+   return container_of(refcount, struct codetag_ctx, refcount);
+}
+
+static inline void add_ctx(struct codetag_ctx *ctx,
+  struct codetag_with_ctx *ctc)
+{
+	kref_init(&ctx->refcount);
+	spin_lock(&ctc->ctx_lock);
+	ctx->flags = 

[RFC PATCH 08/30] lib: introduce page allocation tagging

2022-08-30 Thread Suren Baghdasaryan
Introduce CONFIG_PAGE_ALLOC_TAGGING which provides helper functions to
easily instrument page allocators and adds a page_ext field to store a
pointer to the allocation tag associated with the code that allocated
the page.

Signed-off-by: Suren Baghdasaryan 
Co-developed-by: Kent Overstreet 
Signed-off-by: Kent Overstreet 
---
 include/linux/pgalloc_tag.h | 28 
 lib/Kconfig.debug   | 11 +++
 lib/Makefile|  1 +
 lib/pgalloc_tag.c   | 22 ++
 mm/page_ext.c   |  6 ++
 5 files changed, 68 insertions(+)
 create mode 100644 include/linux/pgalloc_tag.h
 create mode 100644 lib/pgalloc_tag.c

diff --git a/include/linux/pgalloc_tag.h b/include/linux/pgalloc_tag.h
new file mode 100644
index ..f525abfe51d4
--- /dev/null
+++ b/include/linux/pgalloc_tag.h
@@ -0,0 +1,28 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * page allocation tagging
+ */
+#ifndef _LINUX_PGALLOC_TAG_H
+#define _LINUX_PGALLOC_TAG_H
+
+#include 
+#include 
+
+extern struct page_ext_operations page_alloc_tagging_ops;
+struct page_ext *lookup_page_ext(const struct page *page);
+
+static inline union codetag_ref *get_page_tag_ref(struct page *page)
+{
+   struct page_ext *page_ext = lookup_page_ext(page);
+
+   return page_ext ? (void *)page_ext + page_alloc_tagging_ops.offset
+   : NULL;
+}
+
+static inline void pgalloc_tag_dec(struct page *page, unsigned int order)
+{
+   if (page)
+   alloc_tag_sub(get_page_tag_ref(page), PAGE_SIZE << order);
+}
+
+#endif /* _LINUX_PGALLOC_TAG_H */
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 795bf6993f8a..6686648843b3 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -978,6 +978,17 @@ config ALLOC_TAGGING
select CODE_TAGGING
select LAZY_PERCPU_COUNTER
 
+config PAGE_ALLOC_TAGGING
+   bool "Enable page allocation tagging"
+   default n
+   select ALLOC_TAGGING
+   select PAGE_EXTENSION
+   help
+ Instrument page allocators to track allocation source code and
+ collect statistics on the number of allocations and their total size
+ initiated at that code location. The mechanism can be used to track
+ memory leaks with a low performance impact.
+
 source "lib/Kconfig.kasan"
 source "lib/Kconfig.kfence"
 
diff --git a/lib/Makefile b/lib/Makefile
index dc00533fc5c8..99f732156673 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -229,6 +229,7 @@ obj-$(CONFIG_FUNCTION_ERROR_INJECTION) += error-inject.o
 
 obj-$(CONFIG_CODE_TAGGING) += codetag.o
 obj-$(CONFIG_ALLOC_TAGGING) += alloc_tag.o
+obj-$(CONFIG_PAGE_ALLOC_TAGGING) += pgalloc_tag.o
 
 lib-$(CONFIG_GENERIC_BUG) += bug.o
 
diff --git a/lib/pgalloc_tag.c b/lib/pgalloc_tag.c
new file mode 100644
index ..7d97372ca0df
--- /dev/null
+++ b/lib/pgalloc_tag.c
@@ -0,0 +1,22 @@
+// SPDX-License-Identifier: GPL-2.0-only
+#include 
+#include 
+#include 
+#include 
+
+static __init bool need_page_alloc_tagging(void)
+{
+   return true;
+}
+
+static __init void init_page_alloc_tagging(void)
+{
+}
+
+struct page_ext_operations page_alloc_tagging_ops = {
+   .size = sizeof(union codetag_ref),
+   .need = need_page_alloc_tagging,
+   .init = init_page_alloc_tagging,
+};
+EXPORT_SYMBOL(page_alloc_tagging_ops);
+
diff --git a/mm/page_ext.c b/mm/page_ext.c
index 3dc715d7ac29..a22f514ff4da 100644
--- a/mm/page_ext.c
+++ b/mm/page_ext.c
@@ -9,6 +9,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /*
  * struct page extension
@@ -76,6 +77,9 @@ static struct page_ext_operations *page_ext_ops[] __initdata = {
 #if defined(CONFIG_PAGE_IDLE_FLAG) && !defined(CONFIG_64BIT)
	&page_idle_ops,
 #endif
+#ifdef CONFIG_PAGE_ALLOC_TAGGING
+	&page_alloc_tagging_ops,
+#endif
 #ifdef CONFIG_PAGE_TABLE_CHECK
	&page_table_check_ops,
 #endif
@@ -152,6 +156,7 @@ struct page_ext *lookup_page_ext(const struct page *page)
MAX_ORDER_NR_PAGES);
return get_entry(base, index);
 }
+EXPORT_SYMBOL(lookup_page_ext);
 
 static int __init alloc_node_page_ext(int nid)
 {
@@ -221,6 +226,7 @@ struct page_ext *lookup_page_ext(const struct page *page)
return NULL;
return get_entry(section->page_ext, pfn);
 }
+EXPORT_SYMBOL(lookup_page_ext);
 
 static void *__meminit alloc_page_ext(size_t size, int nid)
 {
-- 
2.37.2.672.g94769d06f0-goog




[RFC PATCH 04/30] scripts/kallsyms: Always include __start and __stop symbols

2022-08-30 Thread Suren Baghdasaryan
From: Kent Overstreet 

These symbols are used to denote section boundaries: by always including
them we can unify loading sections from modules with loading built-in
sections, which leads to some significant cleanup.

Signed-off-by: Kent Overstreet 
---
 scripts/kallsyms.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c
index f18e6dfc68c5..3d51639a595d 100644
--- a/scripts/kallsyms.c
+++ b/scripts/kallsyms.c
@@ -263,6 +263,11 @@ static int symbol_in_range(const struct sym_entry *s,
return 0;
 }
 
+static bool string_starts_with(const char *s, const char *prefix)
+{
+   return strncmp(s, prefix, strlen(prefix)) == 0;
+}
+
 static int symbol_valid(const struct sym_entry *s)
 {
const char *name = sym_name(s);
@@ -270,6 +275,14 @@ static int symbol_valid(const struct sym_entry *s)
/* if --all-symbols is not specified, then symbols outside the text
 * and inittext sections are discarded */
if (!all_symbols) {
+   /*
+* Symbols starting with __start and __stop are used to denote
+* section boundaries, and should always be included:
+*/
+   if (string_starts_with(name, "__start_") ||
+   string_starts_with(name, "__stop_"))
+   return 1;
+
if (symbol_in_range(s, text_ranges,
ARRAY_SIZE(text_ranges)) == 0)
return 0;
-- 
2.37.2.672.g94769d06f0-goog




[RFC PATCH 06/30] lib: code tagging module support

2022-08-30 Thread Suren Baghdasaryan
Add support for code tagging from dynamically loaded modules.

Signed-off-by: Suren Baghdasaryan 
Co-developed-by: Kent Overstreet 
Signed-off-by: Kent Overstreet 
---
 include/linux/codetag.h | 12 ++
 kernel/module/main.c|  4 
 lib/codetag.c   | 51 -
 3 files changed, 66 insertions(+), 1 deletion(-)

diff --git a/include/linux/codetag.h b/include/linux/codetag.h
index a9d7adecc2a5..386733e89b31 100644
--- a/include/linux/codetag.h
+++ b/include/linux/codetag.h
@@ -42,6 +42,10 @@ struct codetag_module {
 struct codetag_type_desc {
const char *section;
size_t tag_size;
+   void (*module_load)(struct codetag_type *cttype,
+   struct codetag_module *cmod);
+   void (*module_unload)(struct codetag_type *cttype,
+ struct codetag_module *cmod);
 };
 
 struct codetag_iterator {
@@ -68,4 +72,12 @@ void codetag_to_text(struct seq_buf *out, struct codetag *ct);
 struct codetag_type *
 codetag_register_type(const struct codetag_type_desc *desc);
 
+#ifdef CONFIG_CODE_TAGGING
+void codetag_load_module(struct module *mod);
+void codetag_unload_module(struct module *mod);
+#else
+static inline void codetag_load_module(struct module *mod) {}
+static inline void codetag_unload_module(struct module *mod) {}
+#endif
+
 #endif /* _LINUX_CODETAG_H */
diff --git a/kernel/module/main.c b/kernel/module/main.c
index a4e4d84b6f4e..d253277492fd 100644
--- a/kernel/module/main.c
+++ b/kernel/module/main.c
@@ -53,6 +53,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include "internal.h"
 
@@ -1151,6 +1152,7 @@ static void free_module(struct module *mod)
 {
trace_module_free(mod);
 
+   codetag_unload_module(mod);
mod_sysfs_teardown(mod);
 
/*
@@ -2849,6 +2851,8 @@ static int load_module(struct load_info *info, const char __user *uargs,
/* Get rid of temporary copy. */
free_copy(info, flags);
 
+   codetag_load_module(mod);
+
/* Done! */
trace_module_load(mod);
 
diff --git a/lib/codetag.c b/lib/codetag.c
index 7708f8388e55..f0a3174f9b71 100644
--- a/lib/codetag.c
+++ b/lib/codetag.c
@@ -157,8 +157,11 @@ static int codetag_module_init(struct codetag_type *cttype, struct module *mod)
 
	down_write(&cttype->mod_lock);
	err = idr_alloc(&cttype->mod_idr, cmod, 0, 0, GFP_KERNEL);
-	if (err >= 0)
+	if (err >= 0) {
		cttype->count += range_size(cttype, &range);
+		if (cttype->desc.module_load)
+			cttype->desc.module_load(cttype, cmod);
+	}
	up_write(&cttype->mod_lock);
 
if (err < 0) {
@@ -197,3 +200,49 @@ codetag_register_type(const struct codetag_type_desc *desc)
 
return cttype;
 }
+
+void codetag_load_module(struct module *mod)
+{
+   struct codetag_type *cttype;
+
+   if (!mod)
+   return;
+
+	mutex_lock(&codetag_lock);
+	list_for_each_entry(cttype, &codetag_types, link)
+		codetag_module_init(cttype, mod);
+	mutex_unlock(&codetag_lock);
+}
+
+void codetag_unload_module(struct module *mod)
+{
+   struct codetag_type *cttype;
+
+   if (!mod)
+   return;
+
+	mutex_lock(&codetag_lock);
+	list_for_each_entry(cttype, &codetag_types, link) {
+		struct codetag_module *found = NULL;
+		struct codetag_module *cmod;
+		unsigned long mod_id, tmp;
+
+		down_write(&cttype->mod_lock);
+		idr_for_each_entry_ul(&cttype->mod_idr, cmod, tmp, mod_id) {
+			if (cmod->mod && cmod->mod == mod) {
+				found = cmod;
+				break;
+			}
+		}
+		if (found) {
+			if (cttype->desc.module_unload)
+				cttype->desc.module_unload(cttype, cmod);
+
+			cttype->count -= range_size(cttype, &cmod->range);
+			idr_remove(&cttype->mod_idr, mod_id);
+			kfree(cmod);
+		}
+		up_write(&cttype->mod_lock);
+	}
+	mutex_unlock(&codetag_lock);
+}
-- 
2.37.2.672.g94769d06f0-goog




[RFC PATCH 05/30] lib: code tagging framework

2022-08-30 Thread Suren Baghdasaryan
Add basic infrastructure to support code tagging, which stores common tag
information consisting of the module name, function name, file name and line
number. Provide functions to register a new code tag type and navigate
between code tags.

Co-developed-by: Kent Overstreet 
Signed-off-by: Kent Overstreet 
Signed-off-by: Suren Baghdasaryan 
---
 include/linux/codetag.h |  71 ++
 lib/Kconfig.debug   |   4 +
 lib/Makefile|   1 +
 lib/codetag.c   | 199 
 4 files changed, 275 insertions(+)
 create mode 100644 include/linux/codetag.h
 create mode 100644 lib/codetag.c

diff --git a/include/linux/codetag.h b/include/linux/codetag.h
new file mode 100644
index ..a9d7adecc2a5
--- /dev/null
+++ b/include/linux/codetag.h
@@ -0,0 +1,71 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * code tagging framework
+ */
+#ifndef _LINUX_CODETAG_H
+#define _LINUX_CODETAG_H
+
+#include 
+
+struct codetag_iterator;
+struct codetag_type;
+struct seq_buf;
+struct module;
+
+/*
+ * An instance of this structure is created in a special ELF section at every
+ * code location being tagged.  At runtime, the special section is treated as
+ * an array of these.
+ */
+struct codetag {
+   unsigned int flags; /* used in later patches */
+   unsigned int lineno;
+   const char *modname;
+   const char *function;
+   const char *filename;
+} __aligned(8);
+
+union codetag_ref {
+   struct codetag *ct;
+};
+
+struct codetag_range {
+   struct codetag *start;
+   struct codetag *stop;
+};
+
+struct codetag_module {
+   struct module *mod;
+   struct codetag_range range;
+};
+
+struct codetag_type_desc {
+   const char *section;
+   size_t tag_size;
+};
+
+struct codetag_iterator {
+   struct codetag_type *cttype;
+   struct codetag_module *cmod;
+   unsigned long mod_id;
+   struct codetag *ct;
+};
+
+#define CODE_TAG_INIT {\
+   .modname= KBUILD_MODNAME,   \
+   .function   = __func__, \
+   .filename   = __FILE__, \
+   .lineno = __LINE__, \
+   .flags  = 0,\
+}
+
+void codetag_lock_module_list(struct codetag_type *cttype, bool lock);
+struct codetag_iterator codetag_get_ct_iter(struct codetag_type *cttype);
+struct codetag *codetag_next_ct(struct codetag_iterator *iter);
+
+void codetag_to_text(struct seq_buf *out, struct codetag *ct);
+
+struct codetag_type *
+codetag_register_type(const struct codetag_type_desc *desc);
+
+#endif /* _LINUX_CODETAG_H */
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index bcbe60d6c80c..22bc1eff7f8f 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -969,6 +969,10 @@ config DEBUG_STACKOVERFLOW
 
  If in doubt, say "N".
 
+config CODE_TAGGING
+   bool
+   select KALLSYMS
+
 source "lib/Kconfig.kasan"
 source "lib/Kconfig.kfence"
 
diff --git a/lib/Makefile b/lib/Makefile
index cc7762748708..574d7716e640 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -227,6 +227,7 @@ obj-$(CONFIG_OF_RECONFIG_NOTIFIER_ERROR_INJECT) += \
of-reconfig-notifier-error-inject.o
 obj-$(CONFIG_FUNCTION_ERROR_INJECTION) += error-inject.o
 
+obj-$(CONFIG_CODE_TAGGING) += codetag.o
 lib-$(CONFIG_GENERIC_BUG) += bug.o
 
 obj-$(CONFIG_HAVE_ARCH_TRACEHOOK) += syscall.o
diff --git a/lib/codetag.c b/lib/codetag.c
new file mode 100644
index ..7708f8388e55
--- /dev/null
+++ b/lib/codetag.c
@@ -0,0 +1,199 @@
+// SPDX-License-Identifier: GPL-2.0-only
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+struct codetag_type {
+   struct list_head link;
+   unsigned int count;
+   struct idr mod_idr;
+   struct rw_semaphore mod_lock; /* protects mod_idr */
+   struct codetag_type_desc desc;
+};
+
+static DEFINE_MUTEX(codetag_lock);
+static LIST_HEAD(codetag_types);
+
+void codetag_lock_module_list(struct codetag_type *cttype, bool lock)
+{
+   if (lock)
+		down_read(&cttype->mod_lock);
+	else
+		up_read(&cttype->mod_lock);
+}
+
+struct codetag_iterator codetag_get_ct_iter(struct codetag_type *cttype)
+{
+   struct codetag_iterator iter = {
+   .cttype = cttype,
+   .cmod = NULL,
+   .mod_id = 0,
+   .ct = NULL,
+   };
+
+   return iter;
+}
+
+static inline struct codetag *get_first_module_ct(struct codetag_module *cmod)
+{
+   return cmod->range.start < cmod->range.stop ? cmod->range.start : NULL;
+}
+
+static inline
+struct codetag *get_next_module_ct(struct codetag_iterator *iter)
+{
+   struct codetag *res = (struct codetag *)
+   ((char *)iter->ct + iter->cttype->desc.tag_size);
+
+   return res < iter->cmod->range.stop ? res : NULL;
+}
+
+struct codetag *codetag_next_ct(struct codetag_iterator *iter)
+{
+   struct 

[RFC PATCH 02/30] lib/string_helpers: Drop space in string_get_size's output

2022-08-30 Thread Suren Baghdasaryan
From: Kent Overstreet 

Previously, string_get_size() printed a space between the number and
the units, i.e.
  9.88 MiB

This changes it to
  9.88MiB

which allows it to be parsed correctly by the 'sort -h' command.

Signed-off-by: Kent Overstreet 
Cc: Andy Shevchenko 
Cc: Michael Ellerman 
Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Cc: "Michael S. Tsirkin" 
Cc: Jason Wang 
Cc: "Noralf Trønnes" 
Cc: Jens Axboe 
---
 lib/string_helpers.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/lib/string_helpers.c b/lib/string_helpers.c
index 5ed3beb066e6..3032d1b04ca3 100644
--- a/lib/string_helpers.c
+++ b/lib/string_helpers.c
@@ -126,8 +126,7 @@ void string_get_size(u64 size, u64 blk_size, const enum 
string_size_units units,
else
unit = units_str[units][i];
 
-   snprintf(buf, len, "%u%s %s", (u32)size,
-tmp, unit);
+   snprintf(buf, len, "%u%s%s", (u32)size, tmp, unit);
 }
 EXPORT_SYMBOL(string_get_size);
 
-- 
2.37.2.672.g94769d06f0-goog




[RFC PATCH 01/30] kernel/module: move find_kallsyms_symbol_value declaration

2022-08-30 Thread Suren Baghdasaryan
Allow find_kallsyms_symbol_value to be called by code outside of
kernel/module. It will be used for code tagging module support.

Signed-off-by: Suren Baghdasaryan 
---
 include/linux/module.h   | 1 +
 kernel/module/internal.h | 1 -
 2 files changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/module.h b/include/linux/module.h
index 518296ea7f73..563d38ad84ed 100644
--- a/include/linux/module.h
+++ b/include/linux/module.h
@@ -605,6 +605,7 @@ struct module *find_module(const char *name);
 int module_get_kallsym(unsigned int symnum, unsigned long *value, char *type,
char *name, char *module_name, int *exported);
 
+unsigned long find_kallsyms_symbol_value(struct module *mod, const char *name);
 /* Look for this name: can be of form module:name. */
 unsigned long module_kallsyms_lookup_name(const char *name);
 
diff --git a/kernel/module/internal.h b/kernel/module/internal.h
index 680d980a4fb2..f1b6c477bd93 100644
--- a/kernel/module/internal.h
+++ b/kernel/module/internal.h
@@ -246,7 +246,6 @@ static inline void kmemleak_load_module(const struct module 
*mod,
 void init_build_id(struct module *mod, const struct load_info *info);
 void layout_symtab(struct module *mod, struct load_info *info);
 void add_kallsyms(struct module *mod, const struct load_info *info);
-unsigned long find_kallsyms_symbol_value(struct module *mod, const char *name);
 
 static inline bool sect_empty(const Elf_Shdr *sect)
 {
-- 
2.37.2.672.g94769d06f0-goog




[RFC PATCH 00/30] Code tagging framework and applications

2022-08-30 Thread Suren Baghdasaryan
===
Code tagging framework
===
A code tag is a structure, generated at compile time, that identifies a
specific location in the source code and can be embedded in an application-
specific structure. Several applications of code tagging are included in
this RFC, such as memory allocation tracking, dynamic fault injection,
latency tracking and improved error code reporting.
Basically, it takes the old trick of "define a special elf section for
objects of a given type so that we can iterate over them at runtime" and
creates a proper library for it.

===
Memory allocation tracking
===
The goal for using codetags for memory allocation tracking is to minimize
performance and memory overhead. By recording only the call count and
allocation size, the required operations are kept at the minimum while
collecting statistics for every allocation in the codebase. With that
information, if users are interested in more detailed context for a
specific allocation, they can enable more in-depth context tracking,
which includes capturing the pid, tgid, task name, allocation size,
timestamp and call stack for every allocation at the specified code
location.
Memory allocation tracking is implemented in two parts:

part1: instruments the page and slab allocators to record the call count
and total memory allocated at every allocation site in the source code.
Every time an allocation is performed by an instrumented allocator, the
codetag at that location increments its call and size counters. Every time
the memory is freed, these counters are decremented. To decrement the
counters upon free, the allocated object needs a reference to its codetag.
Page allocators use page_ext to record this reference, while slab
allocators use the memcg_data of the slab page.
The data is exposed to the user space via a read-only debugfs file called
alloc_tags.
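The accounting in part1 can be sketched as follows. This is a hypothetical
userspace miniature, not the patchset's code: `struct tracked` stands in for
the page_ext / memcg_data back-reference, and the counters are plain longs
rather than the lazy percpu counters the series actually uses.

```c
#include <stddef.h>

/* Per-callsite statistics: live call count and bytes outstanding. */
struct alloc_tag {
	const char *loc;
	long calls;
	long bytes;
};

/* A tracked object carries a back-reference to its tag so the free path
 * knows which counters to decrement; the kernel stores this reference in
 * page_ext (page allocator) or the slab page's memcg_data instead. */
struct tracked {
	struct alloc_tag *tag;
	size_t size;
};

void tag_alloc(struct alloc_tag *tag, struct tracked *obj, size_t size)
{
	obj->tag = tag;
	obj->size = size;
	tag->calls++;			/* allocation: bump both counters */
	tag->bytes += (long)size;
}

void tag_free(struct tracked *obj)
{
	obj->tag->calls--;		/* free: undo via the back-reference */
	obj->tag->bytes -= (long)obj->size;
	obj->tag = NULL;
}
```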

Usage example:

$ sort -hr /sys/kernel/debug/alloc_tags|head
  153MiB 8599 mm/slub.c:1826 module:slub func:alloc_slab_page
 6.08MiB  49 mm/slab_common.c:950 module:slab_common func:_kmalloc_order
 5.09MiB 6335 mm/memcontrol.c:2814 module:memcontrol func:alloc_slab_obj_exts
 4.54MiB  78 mm/page_alloc.c:5777 module:page_alloc func:alloc_pages_exact
 1.32MiB  338 include/asm-generic/pgalloc.h:63 module:pgtable func:__pte_alloc_one
 1.16MiB  603 fs/xfs/xfs_log_priv.h:700 module:xfs func:xlog_kvmalloc
 1.00MiB  256 mm/swap_cgroup.c:48 module:swap_cgroup func:swap_cgroup_prepare
  734KiB 5380 fs/xfs/kmem.c:20 module:xfs func:kmem_alloc
  640KiB  160 kernel/rcu/tree.c:3184 module:tree func:fill_page_cache_func
  640KiB  160 drivers/char/virtio_console.c:452 module:virtio_console func:alloc_buf

part2: adds support for the user to select a specific code location to capture
allocation context. A new debugfs file called alloc_tags.ctx is used to select
which code location should capture allocation context and to read captured
context information.

Usage example:

$ cd /sys/kernel/debug/
$ echo "file include/asm-generic/pgalloc.h line 63 enable" > alloc_tags.ctx
$ cat alloc_tags.ctx
  920KiB  230 include/asm-generic/pgalloc.h:63 module:pgtable func:__pte_alloc_one
size: 4096
pid: 1474
tgid: 1474
comm: bash
ts: 175332940994
call stack:
 pte_alloc_one+0xfe/0x130
 __pte_alloc+0x22/0xb0
 copy_page_range+0x842/0x1640
 dup_mm+0x42d/0x580
 copy_process+0xfb1/0x1ac0
 kernel_clone+0x92/0x3e0
 __do_sys_clone+0x66/0x90
 do_syscall_64+0x38/0x90
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
...

NOTE: slab allocation tracking is not yet stable and has a leak that
shows up in long-running tests. We are working on fixing it and posting
the RFC early to collect some feedback and to have a reference code in
public before presenting the idea at LPC2022.

===
Dynamic fault injection
===
Dynamic fault injection lets you do fault injection with a single call
to dynamic_fault(), with a debugfs interface similar to dynamic_debug.

Calls to dynamic_fault are listed in debugfs and can be enabled at
runtime (oneshot mode or a defined frequency are also available). This
patch also uses the memory allocation wrapper macros introduced by the
memory allocation tracking patches to add distinct fault injection
points for every memory allocation in the kernel.
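A minimal sketch of the dynamic_fault() idea, under the assumption (from the
description above) that each callsite gets its own statically registered
control record; the field names and the `dfault_site` type are hypothetical
stand-ins, not the patchset's API.

```c
#include <stdbool.h>

/* One instance per fault-injection point; a debugfs-style control plane
 * would flip 'enabled' and 'oneshot' at runtime. */
struct dfault_site {
	const char *loc;
	bool enabled;
	bool oneshot;
};

/* Returns true when the callsite should inject a failure this time. */
bool dynamic_fault(struct dfault_site *site)
{
	if (!site->enabled)
		return false;
	if (site->oneshot)
		site->enabled = false;	/* fire once, then disarm */
	return true;
}
```

A caller would then write e.g. `if (dynamic_fault(&site)) return NULL;` on
its allocation path, which is what hooking the allocation wrappers achieves
wholesale.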

Example fault injection points, after hooking memory allocation paths:

  fs/xfs/libxfs/xfs_iext_tree.c:606 module:xfs func:xfs_iext_realloc_root class:memory disabled
  fs/xfs/libxfs/xfs_inode_fork.c:503 module:xfs func:xfs_idata_realloc class:memory disabled
  fs/xfs/libxfs/xfs_inode_fork.c:399 module:xfs func:xfs_iroot_realloc class:memory disabled
  fs/xfs/xfs_buf.c:373 module:xfs func:xfs_buf_alloc_pages class:memory disabled
  fs/xfs/xfs_iops.c:497 module:xfs func:xfs_vn_get_link class:memory disabled
  

[RFC PATCH 13/30] mm/slab: introduce SLAB_NO_OBJ_EXT to avoid obj_ext creation

2022-08-30 Thread Suren Baghdasaryan
Slab extension objects can't be allocated before slab infrastructure is
initialized. Some caches, like kmem_cache and kmem_cache_node, are created
before slab infrastructure is initialized. Objects from these caches can't
have extension objects. Introduce SLAB_NO_OBJ_EXT slab flag to mark these
caches and avoid creating extensions for objects allocated from these
slabs.
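The gating described above reduces to a single flag test. The sketch below
is a hypothetical userspace miniature (the `wants_obj_ext` helper name is
invented); only the flag value 0x2000 comes from the patch itself.

```c
#include <stdbool.h>

#define SLAB_NO_OBJ_EXT	(1U << 13)	/* 0x2000, matching the patch */

struct kmem_cache {
	const char *name;
	unsigned int flags;
};

/* The allocation path gates extension-vector creation on the flag, so
 * boot caches created before the slab infrastructure is up never get
 * (and never need) obj_ext storage. */
bool wants_obj_ext(const struct kmem_cache *s)
{
	return !(s->flags & SLAB_NO_OBJ_EXT);
}
```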

Signed-off-by: Suren Baghdasaryan 
---
 include/linux/slab.h | 7 +++
 mm/slab.c| 2 +-
 mm/slub.c| 5 +++--
 3 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index 0fefdf528e0d..55ae3ea864a4 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -124,6 +124,13 @@
 #define SLAB_RECLAIM_ACCOUNT   ((slab_flags_t __force)0x0002U)
 #define SLAB_TEMPORARY SLAB_RECLAIM_ACCOUNT/* Objects are 
short-lived */
 
+#ifdef CONFIG_SLAB_OBJ_EXT
+/* Slab created using create_boot_cache */
+#define SLAB_NO_OBJ_EXT ((slab_flags_t __force)0x2000U)
+#else
+#define SLAB_NO_OBJ_EXT 0
+#endif
+
 /*
  * ZERO_SIZE_PTR will be returned for zero sized kmalloc requests.
  *
diff --git a/mm/slab.c b/mm/slab.c
index 10e96137b44f..ba97aeef7ec1 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -1233,7 +1233,7 @@ void __init kmem_cache_init(void)
create_boot_cache(kmem_cache, "kmem_cache",
offsetof(struct kmem_cache, node) +
  nr_node_ids * sizeof(struct kmem_cache_node 
*),
- SLAB_HWCACHE_ALIGN, 0, 0);
+ SLAB_HWCACHE_ALIGN | SLAB_NO_OBJ_EXT, 0, 0);
list_add(&kmem_cache->list, &slab_caches);
slab_state = PARTIAL;
 
diff --git a/mm/slub.c b/mm/slub.c
index 862dbd9af4f5..80199d5ac7c9 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4825,7 +4825,8 @@ void __init kmem_cache_init(void)
node_set(node, slab_nodes);
 
create_boot_cache(kmem_cache_node, "kmem_cache_node",
-   sizeof(struct kmem_cache_node), SLAB_HWCACHE_ALIGN, 0, 0);
+   sizeof(struct kmem_cache_node),
+   SLAB_HWCACHE_ALIGN | SLAB_NO_OBJ_EXT, 0, 0);
 
register_hotmemory_notifier(&slab_memory_callback_nb);
 
@@ -4835,7 +4836,7 @@ void __init kmem_cache_init(void)
create_boot_cache(kmem_cache, "kmem_cache",
offsetof(struct kmem_cache, node) +
nr_node_ids * sizeof(struct kmem_cache_node *),
-  SLAB_HWCACHE_ALIGN, 0, 0);
+   SLAB_HWCACHE_ALIGN | SLAB_NO_OBJ_EXT, 0, 0);
 
kmem_cache = bootstrap(&boot_kmem_cache);
kmem_cache_node = bootstrap(&boot_kmem_cache_node);
-- 
2.37.2.672.g94769d06f0-goog




[RFC PATCH 07/30] lib: add support for allocation tagging

2022-08-30 Thread Suren Baghdasaryan
Introduce CONFIG_ALLOC_TAGGING, which provides definitions to easily
instrument allocators. It also registers an "alloc_tags" codetag type
with a debugfs interface to output allocation tag information.

Signed-off-by: Suren Baghdasaryan 
Co-developed-by: Kent Overstreet 
Signed-off-by: Kent Overstreet 
---
 include/asm-generic/codetag.lds.h |  14 +++
 include/asm-generic/vmlinux.lds.h |   3 +
 include/linux/alloc_tag.h |  66 +
 lib/Kconfig.debug |   5 +
 lib/Makefile  |   2 +
 lib/alloc_tag.c   | 158 ++
 scripts/module.lds.S  |   7 ++
 7 files changed, 255 insertions(+)
 create mode 100644 include/asm-generic/codetag.lds.h
 create mode 100644 include/linux/alloc_tag.h
 create mode 100644 lib/alloc_tag.c

diff --git a/include/asm-generic/codetag.lds.h 
b/include/asm-generic/codetag.lds.h
new file mode 100644
index ..64f536b80380
--- /dev/null
+++ b/include/asm-generic/codetag.lds.h
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+#ifndef __ASM_GENERIC_CODETAG_LDS_H
+#define __ASM_GENERIC_CODETAG_LDS_H
+
+#define SECTION_WITH_BOUNDARIES(_name) \
+   . = ALIGN(8);   \
+   __start_##_name = .;\
+   KEEP(*(_name))  \
+   __stop_##_name = .;
+
+#define CODETAG_SECTIONS() \
+   SECTION_WITH_BOUNDARIES(alloc_tags)
+
+#endif /* __ASM_GENERIC_CODETAG_LDS_H */
diff --git a/include/asm-generic/vmlinux.lds.h 
b/include/asm-generic/vmlinux.lds.h
index 7515a465ec03..c2dc2a59ab2e 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -50,6 +50,8 @@
  *   [__nosave_begin, __nosave_end] for the nosave data
  */
 
+#include 
+
 #ifndef LOAD_OFFSET
 #define LOAD_OFFSET 0
 #endif
@@ -348,6 +350,7 @@
__start___dyndbg = .;   \
KEEP(*(__dyndbg))   \
__stop___dyndbg = .;\
+   CODETAG_SECTIONS()  \
LIKELY_PROFILE()\
BRANCH_PROFILE()\
TRACE_PRINTKS() \
diff --git a/include/linux/alloc_tag.h b/include/linux/alloc_tag.h
new file mode 100644
index ..b3f589afb1c9
--- /dev/null
+++ b/include/linux/alloc_tag.h
@@ -0,0 +1,66 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * allocation tagging
+ */
+#ifndef _LINUX_ALLOC_TAG_H
+#define _LINUX_ALLOC_TAG_H
+
+#include 
+#include 
+#include 
+#include 
+
+/*
+ * An instance of this structure is created in a special ELF section at every
+ * allocation callsite. At runtime, the special section is treated as an
+ * array of these structures. The embedded codetag uses the codetag framework.
+ */
+struct alloc_tag {
+   struct codetag  ct;
+   unsigned long   last_wrap;
+   struct raw_lazy_percpu_counter  call_count;
+   struct raw_lazy_percpu_counter  bytes_allocated;
+} __aligned(8);
+
+static inline struct alloc_tag *ct_to_alloc_tag(struct codetag *ct)
+{
+   return container_of(ct, struct alloc_tag, ct);
+}
+
+#define DEFINE_ALLOC_TAG(_alloc_tag)   \
+   static struct alloc_tag _alloc_tag __used __aligned(8)  \
+   __section("alloc_tags") = { .ct = CODE_TAG_INIT }
+
+#define alloc_tag_counter_read(counter)
\
+   __lazy_percpu_counter_read(counter)
+
+static inline void __alloc_tag_sub(union codetag_ref *ref, size_t bytes)
+{
+   struct alloc_tag *tag = ct_to_alloc_tag(ref->ct);
+
+   __lazy_percpu_counter_add(&tag->call_count, &tag->last_wrap, -1);
+   __lazy_percpu_counter_add(&tag->bytes_allocated, &tag->last_wrap,
+     -bytes);
+   ref->ct = NULL;
+}
+
+#define alloc_tag_sub(_ref, _bytes)\
+do {   \
+   if ((_ref) && (_ref)->ct)   \
+   __alloc_tag_sub(_ref, _bytes);  \
+} while (0)
+
+static inline void __alloc_tag_add(struct alloc_tag *tag, union codetag_ref 
*ref, size_t bytes)
+{
+   ref->ct = &tag->ct;
+   __lazy_percpu_counter_add(&tag->call_count, &tag->last_wrap, 1);
+   __lazy_percpu_counter_add(&tag->bytes_allocated, &tag->last_wrap,
+     bytes);
+}
+
+#define alloc_tag_add(_ref, _bytes)\
+do {   \
+   DEFINE_ALLOC_TAG(_alloc_tag);   \
+   if (_ref && !WARN_ONCE(_ref->ct, "alloc_tag was not cleared"))  \
+   __alloc_tag_add(&_alloc_tag, _ref, _bytes); \
+} while (0)
+
+#endif /* _LINUX_ALLOC_TAG_H */
diff --git 

[RFC PATCH 03/30] Lazy percpu counters

2022-08-30 Thread Suren Baghdasaryan
From: Kent Overstreet 

This patch adds lib/lazy-percpu-counter.c, which implements counters
that start out as atomics, but lazily switch to percpu mode if the
update rate crosses some threshold (arbitrarily set at 256 per second).
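The bit layout described in the header comment below (low bit = percpu-mode
flag, top 8 bits = a secondary counter, 55 bits of value) can be decoded as
in this sketch. The helper names and the sign-extension decode are my
assumptions for illustration; only the field widths come from the patch.

```c
#include <stdint.h>
#include <stdbool.h>

#define COUNTER_MOD_BITS	8
#define COUNTER_MOD_BITS_START	(64 - COUNTER_MOD_BITS)

bool counter_is_pcpu(uint64_t v)
{
	return v & 1;			/* low bit set => percpu pointer */
}

uint64_t counter_mod(uint64_t v)
{
	return v >> COUNTER_MOD_BITS_START;	/* secondary "touch" counter */
}

/* Atomic-mode value: shift out the top 8 mod bits, then arithmetic-shift
 * away mod bits plus the low mode bit, sign-extending the 55-bit value. */
int64_t counter_value(uint64_t v)
{
	return ((int64_t)(v << COUNTER_MOD_BITS)) >> (COUNTER_MOD_BITS + 1);
}
```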

Signed-off-by: Kent Overstreet 
---
 include/linux/lazy-percpu-counter.h |  67 +
 lib/Kconfig |   3 +
 lib/Makefile|   2 +
 lib/lazy-percpu-counter.c   | 141 
 4 files changed, 213 insertions(+)
 create mode 100644 include/linux/lazy-percpu-counter.h
 create mode 100644 lib/lazy-percpu-counter.c

diff --git a/include/linux/lazy-percpu-counter.h 
b/include/linux/lazy-percpu-counter.h
new file mode 100644
index ..a22a2b9a9f32
--- /dev/null
+++ b/include/linux/lazy-percpu-counter.h
@@ -0,0 +1,67 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Lazy percpu counters:
+ * (C) 2022 Kent Overstreet
+ *
+ * Lazy percpu counters start out in atomic mode, then switch to percpu mode if
+ * the update rate crosses some threshold.
+ *
+ * This means we don't have to decide between low memory overhead atomic
+ * counters and higher performance percpu counters - we can have our cake and
+ * eat it, too!
+ *
+ * Internally we use an atomic64_t, where the low bit indicates whether we're 
in
+ * percpu mode, and the high 8 bits are a secondary counter that's incremented
+ * when the counter is modified - meaning 55 bits of precision are available 
for
+ * the counter itself.
+ *
+ * lazy_percpu_counter is 16 bytes (on 64 bit machines), 
raw_lazy_percpu_counter
+ * is 8 bytes but requires a separate unsigned long to record when the counter
+ * wraps - because sometimes multiple counters are used together and can share
+ * the same timestamp.
+ */
+
+#ifndef _LINUX_LAZY_PERCPU_COUNTER_H
+#define _LINUX_LAZY_PERCPU_COUNTER_H
+
+struct raw_lazy_percpu_counter {
+   atomic64_t  v;
+};
+
+void __lazy_percpu_counter_exit(struct raw_lazy_percpu_counter *c);
+void __lazy_percpu_counter_add(struct raw_lazy_percpu_counter *c,
+  unsigned long *last_wrap, s64 i);
+s64 __lazy_percpu_counter_read(struct raw_lazy_percpu_counter *c);
+
+static inline void __lazy_percpu_counter_sub(struct raw_lazy_percpu_counter *c,
+unsigned long *last_wrap, s64 i)
+{
+   __lazy_percpu_counter_add(c, last_wrap, -i);
+}
+
+struct lazy_percpu_counter {
+   struct raw_lazy_percpu_counter  v;
+   unsigned long   last_wrap;
+};
+
+static inline void lazy_percpu_counter_exit(struct lazy_percpu_counter *c)
+{
+   __lazy_percpu_counter_exit(&c->v);
+}
+
+static inline void lazy_percpu_counter_add(struct lazy_percpu_counter *c, s64 
i)
+{
+   __lazy_percpu_counter_add(&c->v, &c->last_wrap, i);
+}
+
+static inline void lazy_percpu_counter_sub(struct lazy_percpu_counter *c, s64 
i)
+{
+   __lazy_percpu_counter_sub(&c->v, &c->last_wrap, i);
+}
+
+static inline s64 lazy_percpu_counter_read(struct lazy_percpu_counter *c)
+{
+   return __lazy_percpu_counter_read(&c->v);
+}
+
+#endif /* _LINUX_LAZY_PERCPU_COUNTER_H */
diff --git a/lib/Kconfig b/lib/Kconfig
index dc1ab2ed1dc6..fc6dbc425728 100644
--- a/lib/Kconfig
+++ b/lib/Kconfig
@@ -498,6 +498,9 @@ config ASSOCIATIVE_ARRAY
 
  for more information.
 
+config LAZY_PERCPU_COUNTER
+   bool
+
 config HAS_IOMEM
bool
depends on !NO_IOMEM
diff --git a/lib/Makefile b/lib/Makefile
index ffabc30a27d4..cc7762748708 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -163,6 +163,8 @@ obj-$(CONFIG_DEBUG_PREEMPT) += smp_processor_id.o
 obj-$(CONFIG_DEBUG_LIST) += list_debug.o
 obj-$(CONFIG_DEBUG_OBJECTS) += debugobjects.o
 
+obj-$(CONFIG_LAZY_PERCPU_COUNTER) += lazy-percpu-counter.o
+
 obj-$(CONFIG_BITREVERSE) += bitrev.o
 obj-$(CONFIG_LINEAR_RANGES) += linear_ranges.o
 obj-$(CONFIG_PACKING)  += packing.o
diff --git a/lib/lazy-percpu-counter.c b/lib/lazy-percpu-counter.c
new file mode 100644
index ..299ef36137ee
--- /dev/null
+++ b/lib/lazy-percpu-counter.c
@@ -0,0 +1,141 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+/*
+ * We use the high bits of the atomic counter for a secondary counter, which is
+ * incremented every time the counter is touched. When the secondary counter
+ * wraps, we check the time the counter last wrapped, and if it was recent
+ * enough that means the update frequency has crossed our threshold and we
+ * switch to percpu mode:
+ */
+#define COUNTER_MOD_BITS   8
+#define COUNTER_MOD_MASK   ~(~0ULL >> COUNTER_MOD_BITS)
+#define COUNTER_MOD_BITS_START (64 - COUNTER_MOD_BITS)
+
+/*
+ * We use the low bit of the counter to indicate whether we're in atomic mode
+ * (low bit clear), or percpu mode (low bit set, counter is a pointer to actual
+ * percpu counters:
+ */
+#define COUNTER_IS_PCPU_BIT1
+
+static inline u64 

[qemu-mainline test] 172877: regressions - FAIL

2022-08-30 Thread osstest service owner
flight 172877 qemu-mainline real [real]
http://logs.test-lab.xenproject.org/osstest/logs/172877/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64-libvirt   6 libvirt-buildfail REGR. vs. 172123
 build-i386-libvirt6 libvirt-buildfail REGR. vs. 172123
 build-arm64-libvirt   6 libvirt-buildfail REGR. vs. 172123
 build-armhf-libvirt   6 libvirt-buildfail REGR. vs. 172123

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-xl-rtds 14 guest-start  fail REGR. vs. 172123

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-libvirt-raw  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-qcow2  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-raw  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-xsm   1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-raw   1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-i386-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-amd64-libvirt-vhd  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 172123
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 172123
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 172123
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 172123
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 172123
 test-amd64-i386-xl-pvshim14 guest-start  fail   never pass
 test-arm64-arm64-xl-seattle  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  14 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 16 saverestore-support-checkfail never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass

version targeted for testing:
 qemuu93fac696d241dccb04ebb9d23da55fc1e9d8ee36
baseline version:
 qemuu2480f3bbd03814b0651a1f74959f5c6631ee5819

Last test of basis   172123  2022-08-03 18:10:07 Z   27 days
Failing since172148  2022-08-04 21:39:38 Z   26 days   60 attempts
Testing same since   172877  2022-08-30 19:10:25 Z0 days1 attempts


[linux-5.4 test] 172875: regressions - FAIL

2022-08-30 Thread osstest service owner
flight 172875 linux-5.4 real [real]
http://logs.test-lab.xenproject.org/osstest/logs/172875/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-i386-libvirt6 libvirt-buildfail REGR. vs. 172128
 build-amd64-libvirt   6 libvirt-buildfail REGR. vs. 172128
 build-arm64-libvirt   6 libvirt-buildfail REGR. vs. 172128
 build-armhf-libvirt   6 libvirt-buildfail REGR. vs. 172128

Tests which are failing intermittently (not blocking):
 test-armhf-armhf-xl-rtds 14 guest-start  fail in 172866 pass in 172875
 test-armhf-armhf-examine  8 reboot fail pass in 172866

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-libvirt-raw  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-amd64-libvirt-vhd  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-i386-libvirt-raw   1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-xsm   1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-qcow2  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-raw  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-credit1 18 guest-start/debian.repeat fail in 172866 
blocked in 172128
 test-armhf-armhf-xl-multivcpu 14 guest-startfail in 172866 like 172128
 test-armhf-armhf-xl-credit1 15 migrate-support-check fail in 172866 never pass
 test-armhf-armhf-xl-credit1 16 saverestore-support-check fail in 172866 never 
pass
 test-armhf-armhf-xl-multivcpu 18 guest-start/debian.repeatfail like 172108
 test-armhf-armhf-xl-credit2  18 guest-start/debian.repeatfail  like 172108
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 172108
 test-armhf-armhf-xl-credit1  14 guest-start  fail  like 172128
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 172128
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 172128
 test-amd64-i386-xl-qemut-win7-amd64 19 guest-stop fail like 172128
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 172128
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 172128
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 172128
 test-amd64-i386-xl-qemut-ws16-amd64 19 guest-stop fail like 172128
 test-armhf-armhf-xl-rtds 18 guest-start/debian.repeatfail  like 172128
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 172128
 test-amd64-i386-xl-pvshim14 guest-start  fail   never pass
 test-arm64-arm64-xl-seattle  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail  never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 16 saverestore-support-checkfail never pass
 

[PATCH v11 6/6] xen: retrieve reserved pages on populate_physmap

2022-08-30 Thread Penny Zheng
When a static domain populates memory through populate_physmap at runtime,
it shall retrieve reserved pages from resv_page_list to make sure that
guest RAM is still restricted to the statically configured memory regions.
This commit also introduces a new helper, acquire_reserved_page, to make
this work.
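The core of the mechanism is just list pop/push discipline: allocation pops
a page off the domain's reserved list, freeing pushes it back, so guest RAM
can only ever come from the statically configured pool. The sketch below is
a hypothetical miniature (simplified singly linked list, no locking or
assign_domstatic_pages step), not Xen's page_list API.

```c
#include <stddef.h>

struct page {
	struct page *next;
	unsigned long mfn;
};

struct domain {
	struct page *resv_page_list;
};

/* Pop a reserved page; NULL means the static pool is exhausted. */
struct page *acquire_reserved_page(struct domain *d)
{
	struct page *pg = d->resv_page_list;

	if (pg)
		d->resv_page_list = pg->next;
	return pg;
}

/* Push the page back *after* it has been freed, so it can be reused. */
void release_reserved_page(struct domain *d, struct page *pg)
{
	pg->next = d->resv_page_list;
	d->resv_page_list = pg;
}
```

The real code additionally protects the list with a spinlock, since
allocation and free can run concurrently (see the v8 changelog entry).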

Signed-off-by: Penny Zheng 
---
v11 change:
- with assignment having failed and the page not exposed to the guest at any
point, there is no need for scrubbing
---
v10 changes:
- add lock on the fail path
---
v9 changes:
- Use ASSERT_ALLOC_CONTEXT() in acquire_reserved_page
- Add free_staticmem_pages to undo prepare_staticmem_pages when
assign_domstatic_pages
- Remove redundant static in error message
---
v8 changes:
- As concurrent free/allocate could modify the resv_page_list, we still
need the lock
---
v7 changes:
- remove the lock, since we add the page to rsv_page_list after it has
been totally freed.
---
v6 changes:
- drop the lock before returning
---
v5 changes:
- extract common codes for assigning pages into a helper assign_domstatic_pages
- refine commit message
- remove stub function acquire_reserved_page
- Alloc/free of memory can happen concurrently. So access to rsv_page_list
needs to be protected with a spinlock
---
v4 changes:
- miss dropping __init in acquire_domstatic_pages
- add the page back to the reserved list in case of error
- remove redundant printk
- refine log message and make it warn level
---
v3 changes:
- move is_domain_using_staticmem to the common header file
- remove #ifdef CONFIG_STATIC_MEMORY-ary
- remove meaningless page_to_mfn(page) in error log
---
v2 changes:
- introduce acquire_reserved_page to retrieve reserved pages from
resv_page_list
- forbid non-zero-order requests in populate_physmap
- let is_domain_static return ((void)(d), false) on x86
---
 xen/common/memory.c | 23 +
 xen/common/page_alloc.c | 76 -
 xen/include/xen/mm.h|  1 +
 3 files changed, 83 insertions(+), 17 deletions(-)

diff --git a/xen/common/memory.c b/xen/common/memory.c
index bc89442ba5..ae8163a738 100644
--- a/xen/common/memory.c
+++ b/xen/common/memory.c
@@ -245,6 +245,29 @@ static void populate_physmap(struct memop_args *a)
 
 mfn = _mfn(gpfn);
 }
+else if ( is_domain_using_staticmem(d) )
+{
+/*
+ * No easy way to guarantee the retrieved pages are contiguous,
+ * so forbid non-zero-order requests here.
+ */
+if ( a->extent_order != 0 )
+{
+gdprintk(XENLOG_WARNING,
+ "Cannot allocate static order-%u pages for %pd\n",
+ a->extent_order, d);
+goto out;
+}
+
+mfn = acquire_reserved_page(d, a->memflags);
+if ( mfn_eq(mfn, INVALID_MFN) )
+{
+gdprintk(XENLOG_WARNING,
+ "%pd: failed to retrieve a reserved page\n",
+ d);
+goto out;
+}
+}
 else
 {
 page = alloc_domheap_pages(d, a->extent_order, a->memflags);
diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index 18d34d1b69..93d504c3c4 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -2755,9 +2755,8 @@ void free_domstatic_page(struct page_info *page)
 put_domain(d);
 }
 
-static bool __init prepare_staticmem_pages(struct page_info *pg,
-   unsigned long nr_mfns,
-   unsigned int memflags)
+static bool prepare_staticmem_pages(struct page_info *pg, unsigned long 
nr_mfns,
+unsigned int memflags)
 {
 bool need_tlbflush = false;
 uint32_t tlbflush_timestamp = 0;
@@ -2838,21 +2837,9 @@ static struct page_info * __init 
acquire_staticmem_pages(mfn_t smfn,
 return pg;
 }
 
-/*
- * Acquire nr_mfns contiguous pages, starting at #smfn, of static memory,
- * then assign them to one specific domain #d.
- */
-int __init acquire_domstatic_pages(struct domain *d, mfn_t smfn,
-   unsigned int nr_mfns, unsigned int memflags)
+static int assign_domstatic_pages(struct domain *d, struct page_info *pg,
+  unsigned int nr_mfns, unsigned int memflags)
 {
-struct page_info *pg;
-
-ASSERT_ALLOC_CONTEXT();
-
-pg = acquire_staticmem_pages(smfn, nr_mfns, memflags);
-if ( !pg )
-return -ENOENT;
-
 if ( !d || (memflags & (MEMF_no_owner | MEMF_no_refcount)) )
 {
 /*
@@ -2871,6 +2858,61 @@ int __init acquire_domstatic_pages(struct domain *d, 
mfn_t smfn,
 
 return 0;
 }
+
+/*
+ * Acquire nr_mfns contiguous pages, starting at #smfn, of static memory,
+ * then assign them to one specific domain #d.
+ 

[PATCH v11 5/6] xen: rename free_staticmem_pages to unprepare_staticmem_pages

2022-08-30 Thread Penny Zheng
The name free_staticmem_pages is inappropriate, considering the function
is the opposite of prepare_staticmem_pages.

Rename free_staticmem_pages to unprepare_staticmem_pages.

Signed-off-by: Penny Zheng 
Acked-by: Jan Beulich 
---
v11 changes:
- moved ahead of "xen: retrieve reserved pages on populate_physmap"
---
v10 changes:
- new commit
---
 xen/arch/arm/setup.c|  3 ++-
 xen/common/page_alloc.c | 13 -
 xen/include/xen/mm.h|  4 ++--
 3 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
index 500307edc0..4662997c7e 100644
--- a/xen/arch/arm/setup.c
+++ b/xen/arch/arm/setup.c
@@ -639,7 +639,8 @@ static void __init init_staticmem_pages(void)
 if ( mfn_x(bank_end) <= mfn_x(bank_start) )
 return;
 
-free_staticmem_pages(mfn_to_page(bank_start), bank_pages, false);
+unprepare_staticmem_pages(mfn_to_page(bank_start),
+  bank_pages, false);
 }
 }
 #endif
diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index adcc16e4f6..18d34d1b69 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -2693,9 +2693,12 @@ struct domain *get_pg_owner(domid_t domid)
 }
 
 #ifdef CONFIG_STATIC_MEMORY
-/* Equivalent of free_heap_pages to free nr_mfns pages of static memory. */
-void free_staticmem_pages(struct page_info *pg, unsigned long nr_mfns,
-  bool need_scrub)
+/*
+ * It is the opposite of prepare_staticmem_pages, and it aims to unprepare
+ * nr_mfns pages of static memory.
+ */
+void unprepare_staticmem_pages(struct page_info *pg, unsigned long nr_mfns,
+   bool need_scrub)
 {
 mfn_t mfn = page_to_mfn(pg);
 unsigned long i;
@@ -2741,7 +2744,7 @@ void free_domstatic_page(struct page_info *page)
 
 drop_dom_ref = !domain_adjust_tot_pages(d, -1);
 
-free_staticmem_pages(page, 1, scrub_debug);
+unprepare_staticmem_pages(page, 1, scrub_debug);
 
 /* Add page on the resv_page_list *after* it has been freed. */
page_list_add_tail(page, &d->resv_page_list);
@@ -2862,7 +2865,7 @@ int __init acquire_domstatic_pages(struct domain *d, 
mfn_t smfn,
 
 if ( assign_pages(pg, nr_mfns, d, memflags) )
 {
-free_staticmem_pages(pg, nr_mfns, memflags & MEMF_no_scrub);
+unprepare_staticmem_pages(pg, nr_mfns, memflags & MEMF_no_scrub);
 return -EINVAL;
 }
 
diff --git a/xen/include/xen/mm.h b/xen/include/xen/mm.h
index deadf4b2a1..93db3c4418 100644
--- a/xen/include/xen/mm.h
+++ b/xen/include/xen/mm.h
@@ -86,8 +86,8 @@ bool scrub_free_pages(void);
 #define FREE_XENHEAP_PAGE(p) FREE_XENHEAP_PAGES(p, 0)
 
 /* These functions are for static memory */
-void free_staticmem_pages(struct page_info *pg, unsigned long nr_mfns,
-  bool need_scrub);
+void unprepare_staticmem_pages(struct page_info *pg, unsigned long nr_mfns,
+   bool need_scrub);
 void free_domstatic_page(struct page_info *page);
 int acquire_domstatic_pages(struct domain *d, mfn_t smfn, unsigned int nr_mfns,
 unsigned int memflags);
-- 
2.25.1




[PATCH v11 4/6] xen: introduce prepare_staticmem_pages

2022-08-30 Thread Penny Zheng
Later, we want to use acquire_domstatic_pages() for populating memory
for a static domain at runtime. However, it does a lot of pointless work
(checking mfn_valid(), scrubbing the free part, cleaning the cache...)
considering we know the page is valid and belongs to the guest.

This commit splits acquire_staticmem_pages() into two parts, and
introduces prepare_staticmem_pages to bypass all the "pointless work".

Signed-off-by: Penny Zheng 
Acked-by: Jan Beulich 
Acked-by: Julien Grall 
---
v11 changes:
- no change
---
v10 changes:
- no change
---
v9 changes:
- no change
---
v8 changes:
- no change
---
v7 changes:
- no change
---
v6 changes:
- adapt to PGC_static
---
v5 changes:
- new commit
---
 xen/common/page_alloc.c | 61 -
 1 file changed, 36 insertions(+), 25 deletions(-)

diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index 26a2fad4e3..adcc16e4f6 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -2752,26 +2752,13 @@ void free_domstatic_page(struct page_info *page)
 put_domain(d);
 }
 
-/*
- * Acquire nr_mfns contiguous reserved pages, starting at #smfn, of
- * static memory.
- * This function needs to be reworked if used outside of boot.
- */
-static struct page_info * __init acquire_staticmem_pages(mfn_t smfn,
- unsigned long nr_mfns,
- unsigned int memflags)
+static bool __init prepare_staticmem_pages(struct page_info *pg,
+   unsigned long nr_mfns,
+   unsigned int memflags)
 {
 bool need_tlbflush = false;
 uint32_t tlbflush_timestamp = 0;
 unsigned long i;
-struct page_info *pg;
-
-ASSERT(nr_mfns);
-for ( i = 0; i < nr_mfns; i++ )
-if ( !mfn_valid(mfn_add(smfn, i)) )
-return NULL;
-
-pg = mfn_to_page(smfn);
 
 spin_lock(&heap_lock);
 
@@ -2782,7 +2769,7 @@ static struct page_info * __init acquire_staticmem_pages(mfn_t smfn,
 {
 printk(XENLOG_ERR
"pg[%lu] Static MFN %"PRI_mfn" c=%#lx t=%#x\n",
-   i, mfn_x(smfn) + i,
+   i, mfn_x(page_to_mfn(pg)) + i,
pg[i].count_info, pg[i].tlbflush_timestamp);
 goto out_err;
 }
@@ -2806,6 +2793,38 @@ static struct page_info * __init acquire_staticmem_pages(mfn_t smfn,
 if ( need_tlbflush )
 filtered_flush_tlb_mask(tlbflush_timestamp);
 
+return true;
+
+ out_err:
+while ( i-- )
+pg[i].count_info = PGC_static | PGC_state_free;
+
+spin_unlock(&heap_lock);
+
+return false;
+}
+
+/*
+ * Acquire nr_mfns contiguous reserved pages, starting at #smfn, of
+ * static memory.
+ * This function needs to be reworked if used outside of boot.
+ */
+static struct page_info * __init acquire_staticmem_pages(mfn_t smfn,
+ unsigned long nr_mfns,
+ unsigned int memflags)
+{
+unsigned long i;
+struct page_info *pg;
+
+ASSERT(nr_mfns);
+for ( i = 0; i < nr_mfns; i++ )
+if ( !mfn_valid(mfn_add(smfn, i)) )
+return NULL;
+
+pg = mfn_to_page(smfn);
+if ( !prepare_staticmem_pages(pg, nr_mfns, memflags) )
+return NULL;
+
 /*
  * Ensure cache and RAM are consistent for platforms where the guest
  * can control its own visibility of/through the cache.
@@ -2814,14 +2833,6 @@ static struct page_info * __init acquire_staticmem_pages(mfn_t smfn,
 flush_page_to_ram(mfn_x(smfn) + i, !(memflags & MEMF_no_icache_flush));
 
 return pg;
-
- out_err:
-while ( i-- )
-pg[i].count_info = PGC_static | PGC_state_free;
-
-spin_unlock(&heap_lock);
-
-return NULL;
 }
 
 /*
-- 
2.25.1
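Setting Xen's allocator specifics aside, the shape of this refactoring — hoisting the lock-protected state transitions into a boolean prepare helper with rollback on partial failure, while the acquiring wrapper keeps validation — can be sketched stand-alone. The toy types below are illustrative, not Xen's real page_info or heap_lock:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for Xen's page states. */
enum { STATE_FREE = 0, STATE_INUSE = 1 };

struct toy_page { int state; };

/* prepare: flip every page from FREE to INUSE, rolling back on failure. */
static bool prepare_pages(struct toy_page *pg, size_t nr)
{
    size_t i;

    for ( i = 0; i < nr; i++ )
    {
        if ( pg[i].state != STATE_FREE )
            goto out_err;
        pg[i].state = STATE_INUSE;
    }
    return true;

 out_err:
    /* Undo the pages already transitioned, as the real out_err path does. */
    while ( i-- )
        pg[i].state = STATE_FREE;
    return false;
}

/* acquire: validate the range first, then delegate to prepare. */
static struct toy_page *acquire_pages(struct toy_page *pg, size_t nr)
{
    if ( nr == 0 )
        return NULL;
    return prepare_pages(pg, nr) ? pg : NULL;
}
```

The point of the split, as in the patch, is that a runtime caller which already trusts its input can call the prepare step directly and skip the wrapper's validation and cache maintenance.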




[PATCH v11 3/6] xen: unpopulate memory when domain is static

2022-08-30 Thread Penny Zheng
Today, when a domain unpopulates memory at runtime, it always hands the
memory back to the heap allocator, which is a problem if the domain is
static.

Pages used as guest RAM for a static domain shall be reserved to this domain
only and not be used for any other purpose, so they shall never go back to
the heap allocator.

This commit puts reserved pages on the new list resv_page_list after they
have been freed.

Signed-off-by: Penny Zheng 
Acked-by: Jan Beulich 
Acked-by: Julien Grall 
---
v11 change:
- commit message tweak
---
v10 change:
- Do not skip the list addition in that one special case
---
v9 change:
- remove macro helper put_static_page, and just expand its code inside
free_domstatic_page
---
v8 changes:
- adapt this patch for newly introduced free_domstatic_page
- order as a parameter is not needed here, as all staticmem operations are
limited to order-0 regions
- move d->page_alloc_lock after operation on d->resv_page_list
---
v7 changes:
- Add page on the rsv_page_list *after* it has been freed
---
v6 changes:
- refine in-code comment
- move PGC_static !CONFIG_STATIC_MEMORY definition to common header
---
v5 changes:
- adapt this patch for PGC_staticmem
---
v4 changes:
- no changes
---
v3 changes:
- have page_list_del() just once out of the if()
- remove resv_pages counter
- make arch_free_heap_page be an expression, not a compound statement.
---
v2 changes:
- put reserved pages on resv_page_list after having taken them off
the "normal" list
---
 xen/common/domain.c | 4 
 xen/common/page_alloc.c | 7 +--
 xen/include/xen/sched.h | 3 +++
 3 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/xen/common/domain.c b/xen/common/domain.c
index 7062393e37..c23f449451 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -604,6 +604,10 @@ struct domain *domain_create(domid_t domid,
 INIT_PAGE_LIST_HEAD(&d->page_list);
 INIT_PAGE_LIST_HEAD(&d->extra_page_list);
 INIT_PAGE_LIST_HEAD(&d->xenpage_list);
+#ifdef CONFIG_STATIC_MEMORY
+INIT_PAGE_LIST_HEAD(&d->resv_page_list);
+#endif
+
 
 spin_lock_init(&d->node_affinity_lock);
 d->node_affinity = NODE_MASK_ALL;
diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index 0c50dee4c5..26a2fad4e3 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -2741,10 +2741,13 @@ void free_domstatic_page(struct page_info *page)
 
 drop_dom_ref = !domain_adjust_tot_pages(d, -1);
 
-spin_unlock_recursive(&d->page_alloc_lock);
-
 free_staticmem_pages(page, 1, scrub_debug);
 
+/* Add page on the resv_page_list *after* it has been freed. */
+page_list_add_tail(page, &d->resv_page_list);
+
+spin_unlock_recursive(&d->page_alloc_lock);
+
 if ( drop_dom_ref )
 put_domain(d);
 }
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 1cf629e7ec..956e0f9dca 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -381,6 +381,9 @@ struct domain
 struct page_list_head page_list;  /* linked list */
 struct page_list_head extra_page_list; /* linked list (size extra_pages) */
 struct page_list_head xenpage_list; /* linked list (size xenheap_pages) */
+#ifdef CONFIG_STATIC_MEMORY
+struct page_list_head resv_page_list; /* linked list */
+#endif
 
 /*
  * This field should only be directly accessed by domain_adjust_tot_pages()
-- 
2.25.1
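The behavioural change above — a freed static page going onto its owner's resv_page_list rather than back to the heap — can be illustrated with a minimal stand-alone sketch. The toy singly-linked lists stand in for Xen's page_list_head; none of these names are Xen's real API:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy page with an intrusive next pointer. */
struct toy_page { bool is_static; struct toy_page *next; };

struct toy_domain { struct toy_page *resv_page_list; };

/* Toy global free list standing in for the heap allocator. */
static struct toy_page *global_free_list;

/* On free, a static page stays reserved to its owning domain:
 * it goes onto the domain's resv_page_list, never back to the heap. */
static void free_toy_page(struct toy_domain *d, struct toy_page *pg)
{
    if ( pg->is_static )
    {
        pg->next = d->resv_page_list;
        d->resv_page_list = pg;
    }
    else
    {
        pg->next = global_free_list;
        global_free_list = pg;
    }
}
```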




[PATCH v11 2/6] xen/arm: introduce CDF_staticmem

2022-08-30 Thread Penny Zheng
In order to have a quick and easy way to find out whether a domain's memory
is statically configured, this commit introduces a new flag CDF_staticmem and
a new helper is_domain_using_staticmem() to tell.

Signed-off-by: Penny Zheng 
Acked-by: Julien Grall 
Acked-by: Jan Beulich 
---
v11 changes:
- no change
---
v10 changes:
- no change
---
v9 changes:
- no change
---
v8 changes:
- #ifdef-ary around is_domain_using_staticmem() is not needed anymore
---
v7 changes:
- IS_ENABLED(CONFIG_STATIC_MEMORY) would not be needed anymore
---
v6 changes:
- move non-zero is_domain_using_staticmem() from ARM header to common
header
---
v5 changes:
- guard "is_domain_using_staticmem" under CONFIG_STATIC_MEMORY
- #define is_domain_using_staticmem zero if undefined
---
v4 changes:
- no changes
---
v3 changes:
- change name from "is_domain_static()" to "is_domain_using_staticmem"
---
v2 changes:
- change name from "is_domain_on_static_allocation" to "is_domain_static()
---
 xen/arch/arm/domain_build.c | 5 -
 xen/include/xen/domain.h| 8 
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index 3fd1186b53..b76a84e8f5 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -3287,9 +3287,12 @@ void __init create_domUs(void)
 if ( !dt_device_is_compatible(node, "xen,domain") )
 continue;
 
+if ( dt_find_property(node, "xen,static-mem", NULL) )
+flags |= CDF_staticmem;
+
 if ( dt_property_read_bool(node, "direct-map") )
 {
-if ( !IS_ENABLED(CONFIG_STATIC_MEMORY) || !dt_find_property(node, "xen,static-mem", NULL) )
+if ( !(flags & CDF_staticmem) )
 panic("direct-map is not valid for domain %s without static allocation.\n",
   dt_node_name(node));
 
diff --git a/xen/include/xen/domain.h b/xen/include/xen/domain.h
index 628b14b086..2c8116afba 100644
--- a/xen/include/xen/domain.h
+++ b/xen/include/xen/domain.h
@@ -35,6 +35,14 @@ void arch_get_domain_info(const struct domain *d,
 /* Should domain memory be directly mapped? */
 #define CDF_directmap(1U << 1)
 #endif
+/* Is domain memory on static allocation? */
+#ifdef CONFIG_STATIC_MEMORY
+#define CDF_staticmem(1U << 2)
+#else
+#define CDF_staticmem0
+#endif
+
+#define is_domain_using_staticmem(d) ((d)->cdf & CDF_staticmem)
 
 /*
  * Arch-specifics.
-- 
2.25.1
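A nice property of the scheme above is that CDF_staticmem compiles to 0 when CONFIG_STATIC_MEMORY is off, so is_domain_using_staticmem() folds to constant false and use sites need no #ifdef. A reduced stand-alone sketch of that pattern follows; the TOY_* names are illustrative, not Xen's:

```c
#include <assert.h>

#define TOY_STATIC_MEMORY 1          /* pretend CONFIG_STATIC_MEMORY=y */

#define TOY_CDF_directmap  (1U << 1)

/* The flag is 0 when the feature is compiled out, so the helper below
 * becomes constant false with no #ifdef at its use sites. */
#ifdef TOY_STATIC_MEMORY
#define TOY_CDF_staticmem  (1U << 2)
#else
#define TOY_CDF_staticmem  0
#endif

struct toy_domain { unsigned int cdf; };

#define toy_is_domain_using_staticmem(d) ((d)->cdf & TOY_CDF_staticmem)
```

With TOY_STATIC_MEMORY undefined, the compiler can drop any branch guarded by the helper as dead code, which is exactly why the earlier #ifdef-ary around the declarations became unnecessary.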




[PATCH v11 0/6] populate/unpopulate memory when domain on static allocation

2022-08-30 Thread Penny Zheng
Today, when a domain unpopulates memory at runtime, it always hands the
memory back to the heap allocator, which is a problem if it is a static
domain.
Pages used as guest RAM for a static domain shall always be reserved to this
domain only, and not be used for any other purpose, so they shall never go
back to the heap allocator.

This patch series intends to fix this issue by adding pages to the new list
resv_page_list after having taken them off the "normal" list when
unpopulating memory, and by retrieving pages from the reserved page list
(resv_page_list) when populating memory.

---
v11 changes:
- printing message ahead of the assertion, which should also be
XENLOG_G_* kind of log level
- commit message tweak
- move "xen: rename free_staticmem_pages to unprepare_staticmem_pages" ahead
of "xen: retrieve reserved pages on populate_physmap"
- with assignment having failed and the page not exposed to the guest at any
point, there is no need for scrubbing
---
v10 changes:
- let Arm keep #define PGC_static 0 private, with the generic fallback
remaining in page_alloc.c
- change ASSERT(d) to ASSERT_UNREACHABLE() to be more robust looking
forward, and also add a printk() to log the problem
- mention the removal of #ifdef CONFIG_STATIC_MEMORY in the commit
message
- commit message typo fix
- Do not skip the list addition in that one special case
- add lock on the fail path
- new commit "xen: rename free_staticmem_pages to unprepare_staticmem_pages"
---
v9 changes:
- move free_domheap_page into else-condition
- considering scrubbing static pages, domain dying case and opt_scrub_domheap
both do not apply to static pages.
- as unowned static pages don't make themselves to free_domstatic_page
at the moment, remove else-condition and add ASSERT(d) at the top of the
function
- remove macro helper put_static_page, and just expand its code inside
free_domstatic_page
- Use ASSERT_ALLOC_CONTEXT() in acquire_reserved_page
- Add free_staticmem_pages to undo prepare_staticmem_pages when
assign_domstatic_pages fails
- Remove redundant static in error message
---
v8 changes:
- introduce new helper free_domstatic_page
- let put_page call free_domstatic_page for static page, when last ref
drops
- #define PGC_static zero when !CONFIG_STATIC_MEMORY, as it is used
outside page_alloc.c
- #ifdef-ary around is_domain_using_staticmem() is not needed anymore
- order as a parameter is not needed here, as all staticmem operations are
limited to order-0 regions
- move d->page_alloc_lock after operation on d->resv_page_list
- As concurrent free/allocate could modify the resv_page_list, we still
need the lock
---
v7 changes:
- protect free_staticmem_pages with heap_lock to match its reverse function
acquire_staticmem_pages
- IS_ENABLED(CONFIG_STATIC_MEMORY) would not be needed anymore
- add page on the rsv_page_list *after* it has been freed
- remove the lock, since we add the page to rsv_page_list after it has
been totally freed.
---
v6 changes:
- rename PGC_staticmem to PGC_static
- remove #ifdef around function declaration
- use domain instead of sub-systems
- move non-zero is_domain_using_staticmem() from ARM header to common
header
- move PGC_static !CONFIG_STATIC_MEMORY definition to common header
- drop the lock before returning
---
v5 changes:
- introduce three new commits
- In order to avoid stub functions, we #define PGC_staticmem to non-zero only
when CONFIG_STATIC_MEMORY
- use "unlikely()" around pg->count_info & PGC_staticmem
- remove pointless "if", since mark_page_free() is going to set count_info
to PGC_state_free and by consequence clear PGC_staticmem
- move #define PGC_staticmem 0 to mm.h
- guard "is_domain_using_staticmem" under CONFIG_STATIC_MEMORY
- #define is_domain_using_staticmem zero if undefined
- extract common codes for assigning pages into a helper assign_domstatic_pages
- refine commit message
- remove stub function acquire_reserved_page
- Alloc/free of memory can happen concurrently. So access to rsv_page_list
needs to be protected with a spinlock
---
v4 changes:
- commit message refinement
- miss dropping __init in acquire_domstatic_pages
- add the page back to the reserved list in case of error
- remove redundant printk
- refine log message and make it warn level
- guard "is_domain_using_staticmem" under CONFIG_STATIC_MEMORY
- #define is_domain_using_staticmem zero if undefined
---
v3 changes:
- fix possible racy issue in free_staticmem_pages()
- introduce a stub free_staticmem_pages() for the !CONFIG_STATIC_MEMORY case
- move the change to free_heap_pages() to cover other potential call sites
- change fixed width type uint32_t to unsigned int
- change "flags" to a more descriptive name "cdf"
- change name from "is_domain_static()" to "is_domain_using_staticmem"
- have page_list_del() just once out of the if()
- remove resv_pages counter
- make arch_free_heap_page be an expression, not a compound statement.
- move #ifndef is_domain_using_staticmem to the common header file
- remove #ifdef 

[PATCH v11 1/6] xen: do not free reserved memory into heap

2022-08-30 Thread Penny Zheng
Pages used as guest RAM for a static domain shall be reserved to this
domain only.
So to prevent reserved pages from being used for any other purpose, they
shall not be freed back to the heap, even when the last ref gets dropped.

This commit introduces a new helper, free_domstatic_page(), to free static
pages at runtime; free_staticmem_pages() will now be called by it at
runtime, so drop the __init flag.

Wrapping #ifdef CONFIG_STATIC_MEMORY around the function declarations
(free_staticmem_pages, free_domstatic_page, etc.) is rather redundant,
so remove it here.

Signed-off-by: Penny Zheng 
Reviewed-by: Jan Beulich 
Reviewed-by: Julien Grall 
---
v11 changes:
- printing message ahead of the assertion, which should also be
XENLOG_G_* kind of log level
---
v10 changes:
- let Arm keep #define PGC_static 0 private, with the generic fallback
remaining in page_alloc.c
- change ASSERT(d) to ASSERT_UNREACHABLE() to be more robust looking
forward, and also add a printk() to log the problem
- mention the removal of #ifdef CONFIG_STATIC_MEMORY in the commit
message
---
v9 changes:
- move free_domheap_page into else-condition
- considering scrubbing static pages, domain dying case and opt_scrub_domheap
both do not apply to static pages.
- as unowned static pages don't make themselves to free_domstatic_page
at the moment, remove else-condition and add ASSERT(d) at the top of the
function
---
v8 changes:
- introduce new helper free_domstatic_page
- let put_page call free_domstatic_page for static page, when last ref
drops
- #define PGC_static zero when !CONFIG_STATIC_MEMORY, as it is used
outside page_alloc.c
---
v7 changes:
- protect free_staticmem_pages with heap_lock to match its reverse function
acquire_staticmem_pages
---
v6 changes:
- adapt to PGC_static
- remove #ifdef around function declaration
---
v5 changes:
- In order to avoid stub functions, we #define PGC_staticmem to non-zero only
when CONFIG_STATIC_MEMORY
- use "unlikely()" around pg->count_info & PGC_staticmem
- remove pointless "if", since mark_page_free() is going to set count_info
to PGC_state_free and by consequence clear PGC_staticmem
- move #define PGC_staticmem 0 to mm.h
---
v4 changes:
- no changes
---
v3 changes:
- fix possible racy issue in free_staticmem_pages()
- introduce a stub free_staticmem_pages() for the !CONFIG_STATIC_MEMORY case
- move the change to free_heap_pages() to cover other potential call sites
- fix the indentation
---
v2 changes:
- new commit
---
 xen/arch/arm/include/asm/mm.h |  6 +-
 xen/arch/arm/mm.c |  5 -
 xen/common/page_alloc.c   | 40 ---
 xen/include/xen/mm.h  |  3 +--
 4 files changed, 47 insertions(+), 7 deletions(-)

diff --git a/xen/arch/arm/include/asm/mm.h b/xen/arch/arm/include/asm/mm.h
index da25251cda..749fbefa0c 100644
--- a/xen/arch/arm/include/asm/mm.h
+++ b/xen/arch/arm/include/asm/mm.h
@@ -121,9 +121,13 @@ struct page_info
   /* Page is Xen heap? */
 #define _PGC_xen_heap PG_shift(2)
 #define PGC_xen_heap  PG_mask(1, 2)
-  /* Page is static memory */
+#ifdef CONFIG_STATIC_MEMORY
+/* Page is static memory */
 #define _PGC_staticPG_shift(3)
 #define PGC_static PG_mask(1, 3)
+#else
+#define PGC_static 0
+#endif
 /* ... */
 /* Page is broken? */
 #define _PGC_broken   PG_shift(7)
diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index b42cddb1b4..fbdab5598c 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -1496,7 +1496,10 @@ void put_page(struct page_info *page)
 
 if ( unlikely((nx & PGC_count_mask) == 0) )
 {
-free_domheap_page(page);
+if ( unlikely(nx & PGC_static) )
+free_domstatic_page(page);
+else
+free_domheap_page(page);
 }
 }
 
diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index bfd4150be7..0c50dee4c5 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -2694,12 +2694,14 @@ struct domain *get_pg_owner(domid_t domid)
 
 #ifdef CONFIG_STATIC_MEMORY
 /* Equivalent of free_heap_pages to free nr_mfns pages of static memory. */
-void __init free_staticmem_pages(struct page_info *pg, unsigned long nr_mfns,
- bool need_scrub)
+void free_staticmem_pages(struct page_info *pg, unsigned long nr_mfns,
+  bool need_scrub)
 {
 mfn_t mfn = page_to_mfn(pg);
 unsigned long i;
 
+spin_lock(&heap_lock);
+
 for ( i = 0; i < nr_mfns; i++ )
 {
 mark_page_free([i], mfn_add(mfn, i));
@@ -2710,9 +2712,41 @@ void __init free_staticmem_pages(struct page_info *pg, unsigned long nr_mfns,
 scrub_one_page(pg);
 }
 
-/* In case initializing page of static memory, mark it PGC_static. */
 pg[i].count_info |= PGC_static;
 }
+
+spin_unlock(&heap_lock);
+}
+
+void free_domstatic_page(struct page_info *page)
+{
+struct domain *d = page_get_owner(page);
+bool drop_dom_ref;
+
+if ( unlikely(!d) )
+{
+printk(XENLOG_G_ERR

RE: [PATCH 2/2] xen/arm: Handle reserved heap pages in boot and heap allocator

2022-08-30 Thread Henry Wang
Hi Stefano,

> -Original Message-
> From: Henry Wang
> I am not sure about that, since we already have
> heap_pages = reserved_heap ? reserved_heap_pages : ram_pages;
> 
> the heap_pages is supposed to contain domheap_pages + xenheap_pages
> based on the reserved heap definition discussed in the RFC.

To add a little bit more about the background, here is the RFC discussion [1].
I should have attached this in my previous reply, sorry.

[1] 
https://lore.kernel.org/xen-devel/316007b7-51ba-4820-8f6f-018bc6d3a...@arm.com/

Kind regards,
Henry




[ovmf test] 172880: regressions - FAIL

2022-08-30 Thread osstest service owner
flight 172880 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/172880/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-i386-libvirt6 libvirt-buildfail REGR. vs. 172136
 build-amd64-libvirt   6 libvirt-buildfail REGR. vs. 172136

version targeted for testing:
 ovmf 227a133a0a4357d9ce7cbf1c81dc4257a37ac616
baseline version:
 ovmf 444260d45ec2a84e8f8c192b3539a3cd5591d009

Last test of basis   172136  2022-08-04 06:43:42 Z   26 days
Failing since172151  2022-08-05 02:40:28 Z   25 days  209 attempts
Testing same since   172876  2022-08-30 17:13:30 Z0 days3 attempts


People who touched revisions under test:
  Abdul Lateef Attar 
  Abner Chang 
  Ard Biesheuvel 
  Bob Feng 
  Chasel Chiu 
  Czajkowski, Maciej 
  Dimitrije Pavlov 
  Dun Tan 
  Edward Pickup 
  Foster Nong 
  Gregx Yeh 
  Guo Dong 
  Igor Kulchytskyy 
  James Lu 
  Jose Marinho 
  KasimX Liu 
  Kavya 
  Konstantin Aladyshev 
  Liming Gao 
  Liu, Zhiguang 
  Maciej Czajkowski 
  Michael D Kinney 
  Ray Ni 
  Rebecca Cran 
  Sainadh Nagolu 
  Sami Mujawar 
  Shengfengx Xue 
  Zhiguang Liu 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  fail
 build-i386-libvirt   fail
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.

(No revision log; it would be 1036 lines long.)



Re: [RFC PATCH 27/30] Code tagging based latency tracking

2022-08-30 Thread Randy Dunlap



On 8/30/22 14:49, Suren Baghdasaryan wrote:
> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> index b7d03afbc808..b0f86643b8f0 100644
> --- a/lib/Kconfig.debug
> +++ b/lib/Kconfig.debug
> @@ -1728,6 +1728,14 @@ config LATENCYTOP
> Enable this option if you want to use the LatencyTOP tool
> to find out which userspace is blocking on what kernel operations.
>  
> +config CODETAG_TIME_STATS
> + bool "Code tagging based latency measuring"
> + depends on DEBUG_FS
> + select TIME_STATS
> + select CODE_TAGGING
> + help
> +   Enabling this option makes latency statistics available in debugfs

Missing period at the end of the sentence.

-- 
~Randy



Re: [RFC PATCH 22/30] Code tagging based fault injection

2022-08-30 Thread Randy Dunlap



On 8/30/22 14:49, Suren Baghdasaryan wrote:
> From: Kent Overstreet 
> 
> This adds a new fault injection capability, based on code tagging.
> 
> To use, simply insert somewhere in your code
> 
>   dynamic_fault("fault_class_name")
> 
> and check whether it returns true - if so, inject the error.
> For example
> 
>   if (dynamic_fault("init"))
>   return -EINVAL;
> 
> There's no need to define faults elsewhere, as with
> include/linux/fault-injection.h. Faults show up in debugfs, under
> /sys/kernel/debug/dynamic_faults, and can be selected based on
> file/module/function/line number/class, and enabled permanently, or in
> oneshot mode, or with a specified frequency.
> 
> Signed-off-by: Kent Overstreet 

Missing Signed-off-by: from Suren.
See Documentation/process/submitting-patches.rst:

When to use Acked-by:, Cc:, and Co-developed-by:


The Signed-off-by: tag indicates that the signer was involved in the
development of the patch, or that he/she was in the patch's delivery path.


-- 
~Randy



RE: [PATCH 2/2] xen/arm: Handle reserved heap pages in boot and heap allocator

2022-08-30 Thread Henry Wang
Hi Stefano,

> -Original Message-
> From: Stefano Stabellini 
> Subject: RE: [PATCH 2/2] xen/arm: Handle reserved heap pages in boot and
> heap allocator
> 
> On Tue, 30 Aug 2022, Henry Wang wrote:
> > > -Original Message-
> > > From: Michal Orzel 
> > > >
> > > > Oh I think I get your point. Let me try to explain myself and thanks
> > > > for your patience :))
> > > >
> > > > The reserved heap region defined in the device tree should be used
> > > > for both Xenheap and domain heap, so if we reserved a too small region
> > > > (<32M), an error should pop because the reserved region is not enough
> > > > for xenheap, and user should reserve more.
> > > > [...]
> > > >
> > > >> But your check is against heap being too small (less than 32M).
> > > >> So basically if the following check fails:
> > > >> "( reserved_heap && reserved_heap_pages < 32<<(20-PAGE_SHIFT) ) )"
> > > >> it means that the heap region defined by a user is too small (not
> > > >> too large), because according to requirements it should be at least
> > > >> 32M.
> > > >
> > > > [...]
> > > > So in that case, printing "Not enough space for xenheap" means the
> > > > reserved region cannot satisfy the minimal requirement of the space
> > > > of xenheap (at least 32M), and this is consistent with the check.
> > >
> > > Ok, it clearly depends on the way someone understands this sentence.
> > > Currently this panic can be triggered if the heap size is too large and
> > > should be read as "heap is too large to fit in because there is not enough
> > > space
> > > within RAM considering modules (e - s < size)". Usually (and also in
> > > this case) space refers to a region to contain another one.
> > >
> > > You are reusing the same message for different meaning, that is "user
> > > defined too
> > > small heap and this space (read as size) is not enough".
> >
> > Yes, thanks for the explanation. I think maybe rewording the message
> > to "Not enough memory for allocating xenheap" would remove the ambiguity
> > to some extent? Because the user-defined heap region should cover both
> > xenheap and domain heap at the same time, the small user-defined heap
> > means "xenheap is too large to fit in the user-defined heap region",
> > which is consistent with your interpretation of the current "xenheap is
> > too large to fit in because there is not enough space within RAM
> > considering modules"
> 
> I think we should have a separate check specific for the device tree
> input parameters to make sure the region is correct, that way we can
> have a specific error message, such as:
> 
> "xen,static-heap address needs to be 32MB aligned and the size a
> multiple of 32MB."

Sure, will follow this.

Kind regards,
Henry




RE: [PATCH 2/2] xen/arm: Handle reserved heap pages in boot and heap allocator

2022-08-30 Thread Henry Wang
Hi Stefano,

> -Original Message-
> From: Stefano Stabellini 
> > > and we can automatically calculate xenheap_pages in a single line.
> >
> > Here I am a little bit confused. Sorry to ask but could you please explain
> > a little bit more about why we can calculate the xenheap_pages in a single
> > line? Below is the code snippet in my mind, is this correct?
> >
> > if (reserved_heap)
> 
> coding style
> 
> > e = reserved_heap_end;
> > else
> > {
> > do
> > {
> > e = consider_modules(ram_start, ram_end,
> >  pfn_to_paddr(xenheap_pages),
> >  32<<20, 0);
> > if ( e )
> > break;
> >
> > xenheap_pages >>= 1;
> > } while ( !opt_xenheap_megabytes && xenheap_pages > 32<<(20-PAGE_SHIFT) );
> > }
> 
> Yes, this is what I meant.

Thank you very much for your detailed explanation below!
[...]

> 
> But also, here the loop is also for adjusting xenheap_pages, and
> xenheap_pages is initialized as follows:
> 
> 
> if ( opt_xenheap_megabytes )
> xenheap_pages = opt_xenheap_megabytes << (20-PAGE_SHIFT);
> else
> {
> xenheap_pages = (heap_pages/32 + 0x1fffUL) & ~0x1fffUL;
> xenheap_pages = max(xenheap_pages, 32UL<<(20-PAGE_SHIFT));
> xenheap_pages = min(xenheap_pages, 1UL<<(30-PAGE_SHIFT));
> }
> 
> 
> In the reserved_heap case, it doesn't make sense to initialize
> xenheap_pages like that, right? It should be something like:

I am not sure about that, since we already have
heap_pages = reserved_heap ? reserved_heap_pages : ram_pages;

the heap_pages is supposed to contain domheap_pages + xenheap_pages
based on the reserved heap definition discussed in the RFC.

from the code in...

> 
> if ( opt_xenheap_megabytes )
> xenheap_pages = opt_xenheap_megabytes << (20-PAGE_SHIFT);
> else if ( reserved_heap )
> xenheap_pages = heap_pages;

...here, setting xenheap_pages to heap_pages makes me a little bit
confused.

> else
> {
> xenheap_pages = (heap_pages/32 + 0x1fffUL) & ~0x1fffUL;
> xenheap_pages = max(xenheap_pages, 32UL<<(20-PAGE_SHIFT));
> xenheap_pages = min(xenheap_pages, 1UL<<(30-PAGE_SHIFT));
> }

If we keep this logic as this patch does, we can have the requirements...

> 
> But also it looks like that on arm32 we have specific requirements for
> Xen heap:
> 
>  *  - must be 32 MiB aligned
>  *  - must not include Xen itself or the boot modules
>  *  - must be at most 1GB or 1/32 the total RAM in the system if less
>  *  - must be at least 32M

...here, with the "1/32 the total RAM" now being "1/32 of the total reserved
heap region", since heap_pages is now reserved_heap_pages.

> 
> I think we should check at least the 32MB alignment and 32MB minimum
> size before using the xen_heap bank.
> 
> 
> In short I think this patch should:
> 
> - add a check for 32MB alignment and size of the xen_heap memory bank
> - if reserved_heap, set xenheap_pages = heap_pages
> - if reserved_heap, skip the consider_modules do/while
> 
> Does it make sense?

I left some of my thoughts above to explain my understanding, but I might
be wrong, thank you for your patience!

Kind regards,
Henry



[linux-linus test] 172873: regressions - FAIL

2022-08-30 Thread osstest service owner
flight 172873 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/172873/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64-libvirt   6 libvirt-buildfail REGR. vs. 172133
 build-arm64-libvirt   6 libvirt-buildfail REGR. vs. 172133
 build-i386-libvirt6 libvirt-buildfail REGR. vs. 172133
 build-armhf-libvirt   6 libvirt-buildfail REGR. vs. 172133

Tests which are failing intermittently (not blocking):
 test-amd64-amd64-xl-qemut-debianhvm-i386-xsm 12 debian-hvm-install fail in 
172865 pass in 172873
 test-armhf-armhf-examine  8 reboot fail pass in 172865

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-qcow2  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-raw  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-qcow2  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-amd64-libvirt-raw  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-raw  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 172133
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 172133
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 172133
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 172133
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 172133
 test-arm64-arm64-xl-seattle  15 migrate-support-check fail   never pass
 test-arm64-arm64-xl-seattle  16 saverestore-support-check fail   never pass
 test-arm64-arm64-xl  15 migrate-support-check fail   never pass
 test-arm64-arm64-xl  16 saverestore-support-check fail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-check fail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-check fail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-check fail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-check fail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-check fail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-check fail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-check fail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-check fail   never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-check fail  never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-check fail  never pass
 test-armhf-armhf-xl  15 migrate-support-check fail   never pass
 test-armhf-armhf-xl  16 saverestore-support-check fail   never pass
 test-arm64-arm64-xl-vhd  14 migrate-support-check fail   never pass
 test-arm64-arm64-xl-vhd  15 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-check fail   never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-check fail   never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-check fail   never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-vhd  14 migrate-support-check fail   never pass
 test-armhf-armhf-xl-vhd  15 saverestore-support-check fail   never pass

version targeted for testing:
 linux                dcf8e5633e2e69ad60b730ab5905608b756a032f
baseline version:
 linux                b44f2fd87919b5ae6e1756d4c7ba2cbba22238e1

Last test of basis   172133  2022-08-04 05:14:48 Z   26 days
Failing since172152  2022-08-05 04:01:26 Z   25 days   59 attempts
Testing same since   172865  2022-08-30 04:54:06 Z0 days2 attempts


1589 people touched revisions under test,
not listing them all

jobs:
 build-amd64-xsm  

[ovmf test] 172878: regressions - FAIL

2022-08-30 Thread osstest service owner
flight 172878 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/172878/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-i386-libvirt            6 libvirt-build            fail REGR. vs. 172136
 build-amd64-libvirt           6 libvirt-build            fail REGR. vs. 172136

version targeted for testing:
 ovmf 227a133a0a4357d9ce7cbf1c81dc4257a37ac616
baseline version:
 ovmf 444260d45ec2a84e8f8c192b3539a3cd5591d009

Last test of basis   172136  2022-08-04 06:43:42 Z   26 days
Failing since172151  2022-08-05 02:40:28 Z   25 days  208 attempts
Testing same since   172876  2022-08-30 17:13:30 Z0 days2 attempts


People who touched revisions under test:
  Abdul Lateef Attar 
  Abner Chang 
  Ard Biesheuvel 
  Bob Feng 
  Chasel Chiu 
  Czajkowski, Maciej 
  Dimitrije Pavlov 
  Dun Tan 
  Edward Pickup 
  Foster Nong 
  Gregx Yeh 
  Guo Dong 
  Igor Kulchytskyy 
  James Lu 
  Jose Marinho 
  KasimX Liu 
  Kavya 
  Konstantin Aladyshev 
  Liming Gao 
  Liu, Zhiguang 
  Maciej Czajkowski 
  Michael D Kinney 
  Ray Ni 
  Rebecca Cran 
  Sainadh Nagolu 
  Sami Mujawar 
  Shengfengx Xue 
  Zhiguang Liu 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  fail
 build-i386-libvirt   fail
 build-amd64-pvops            pass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.

(No revision log; it would be 1036 lines long.)



[PATCH v2] xen/device_tree: Fix MISRA C 2012 Rule 20.7 violations

2022-08-30 Thread Xenia Ragiadakou
Add parentheses around the macro parameters that are used as expressions
to protect against unintended expansions during macro substitution.

Signed-off-by: Xenia Ragiadakou 
---

Changes in v2:
- apply rule 20.7 as is, without deviating from it
- adjust commit message accordingly

Also, in this file, the macro dt_irq(irq) is not defined properly, but
since it is unused, the bug has gone unnoticed so far.
I can either fix it, or remove it together with the macro dt_irq_flags(irq) under
Rule 2.5 "A project should not contain unused macro declarations" (advisory).

 xen/include/xen/device_tree.h | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/xen/include/xen/device_tree.h b/xen/include/xen/device_tree.h
index 430a1ef445..6e253f5763 100644
--- a/xen/include/xen/device_tree.h
+++ b/xen/include/xen/device_tree.h
@@ -37,11 +37,11 @@ struct dt_device_match {
 const void *data;
 };
 
-#define __DT_MATCH_PATH(p)  .path = p
-#define __DT_MATCH_TYPE(typ).type = typ
-#define __DT_MATCH_COMPATIBLE(compat)   .compatible = compat
+#define __DT_MATCH_PATH(p)  .path = (p)
+#define __DT_MATCH_TYPE(typ).type = (typ)
+#define __DT_MATCH_COMPATIBLE(compat)   .compatible = (compat)
 #define __DT_MATCH_NOT_AVAILABLE()  .not_available = 1
-#define __DT_MATCH_PROP(p)  .prop = p
+#define __DT_MATCH_PROP(p)  .prop = (p)
 
 #define DT_MATCH_PATH(p){ __DT_MATCH_PATH(p) }
 #define DT_MATCH_TYPE(typ)  { __DT_MATCH_TYPE(typ) }
@@ -222,13 +222,13 @@ dt_find_interrupt_controller(const struct dt_device_match *matches);
 #define DT_ROOT_NODE_SIZE_CELLS_DEFAULT 1
 
 #define dt_for_each_property_node(dn, pp)   \
-for ( pp = dn->properties; pp != NULL; pp = pp->next )
+for ( (pp) = (dn)->properties; (pp) != NULL; (pp) = (pp)->next )
 
 #define dt_for_each_device_node(dt, dn) \
-for ( dn = dt; dn != NULL; dn = dn->allnext )
+for ( (dn) = (dt); (dn) != NULL; (dn) = (dn)->allnext )
 
 #define dt_for_each_child_node(dt, dn)  \
-for ( dn = dt->child; dn != NULL; dn = dn->sibling )
+for ( (dn) = (dt)->child; (dn) != NULL; (dn) = (dn)->sibling )
 
 /* Helper to read a big number; size is in cells (not bytes) */
 static inline u64 dt_read_number(const __be32 *cell, int size)
-- 
2.34.1




[ovmf test] 172876: regressions - FAIL

2022-08-30 Thread osstest service owner
flight 172876 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/172876/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-i386-libvirt            6 libvirt-build            fail REGR. vs. 172136
 build-amd64-libvirt           6 libvirt-build            fail REGR. vs. 172136

version targeted for testing:
 ovmf 227a133a0a4357d9ce7cbf1c81dc4257a37ac616
baseline version:
 ovmf 444260d45ec2a84e8f8c192b3539a3cd5591d009

Last test of basis   172136  2022-08-04 06:43:42 Z   26 days
Failing since172151  2022-08-05 02:40:28 Z   25 days  207 attempts
Testing same since   172876  2022-08-30 17:13:30 Z0 days1 attempts


People who touched revisions under test:
  Abdul Lateef Attar 
  Abner Chang 
  Ard Biesheuvel 
  Bob Feng 
  Chasel Chiu 
  Czajkowski, Maciej 
  Dimitrije Pavlov 
  Dun Tan 
  Edward Pickup 
  Foster Nong 
  Gregx Yeh 
  Guo Dong 
  Igor Kulchytskyy 
  James Lu 
  Jose Marinho 
  KasimX Liu 
  Kavya 
  Konstantin Aladyshev 
  Liming Gao 
  Liu, Zhiguang 
  Maciej Czajkowski 
  Michael D Kinney 
  Ray Ni 
  Rebecca Cran 
  Sainadh Nagolu 
  Sami Mujawar 
  Shengfengx Xue 
  Zhiguang Liu 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  fail
 build-i386-libvirt   fail
 build-amd64-pvops            pass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.

(No revision log; it would be 1036 lines long.)



[PATCH] acpi: Add TPM2 interface definition and make the TPM version configurable.

2022-08-30 Thread Jennifer Herbert
This patch introduces an optional TPM 2 interface definition to the ACPI table,
which is to be used as part of a vTPM 2 implementation.
To enable the new interface, I have made the TPM interface version
configurable in the acpi_config, with the default being the existing 1.2 (TCPA).
I have also added to hvmloader an option to utilise this new config, which can
be triggered by setting the platform/tpm_version xenstore key.

Signed-off-by: Jennifer Herbert 
---
 tools/firmware/hvmloader/config.h |   1 +
 tools/firmware/hvmloader/util.c   |  15 +++-
 tools/libacpi/Makefile|   2 +-
 tools/libacpi/acpi2_0.h   |  24 +++
 tools/libacpi/build.c | 111 ++
 tools/libacpi/libacpi.h   |   4 +-
 tools/libacpi/ssdt_tpm2.asl   |  36 ++
 7 files changed, 159 insertions(+), 34 deletions(-)
 create mode 100644 tools/libacpi/ssdt_tpm2.asl

diff --git a/tools/firmware/hvmloader/config.h b/tools/firmware/hvmloader/config.h
index c82adf6dc5..4dec7195f0 100644
--- a/tools/firmware/hvmloader/config.h
+++ b/tools/firmware/hvmloader/config.h
@@ -56,6 +56,7 @@ extern uint8_t ioapic_version;
 #define PCI_ISA_IRQ_MASK0x0c20U /* ISA IRQs 5,10,11 are PCI connected */
 
 #define ACPI_TIS_HDR_ADDRESS 0xFED40F00UL
+#define ACPI_CRB_HDR_ADDRESS 0xFED40034UL
 
 extern uint32_t pci_mem_start;
 extern const uint32_t pci_mem_end;
diff --git a/tools/firmware/hvmloader/util.c b/tools/firmware/hvmloader/util.c
index 581b35e5cf..e3af32581b 100644
--- a/tools/firmware/hvmloader/util.c
+++ b/tools/firmware/hvmloader/util.c
@@ -994,13 +994,24 @@ void hvmloader_acpi_build_tables(struct acpi_config *config,
 if ( !strncmp(xenstore_read("platform/acpi_laptop_slate", "0"), "1", 1)  )
 config->table_flags |= ACPI_HAS_SSDT_LAPTOP_SLATE;
 
-config->table_flags |= (ACPI_HAS_TCPA | ACPI_HAS_IOAPIC |
+config->table_flags |= (ACPI_HAS_TPM | ACPI_HAS_IOAPIC |
 ACPI_HAS_WAET | ACPI_HAS_PMTIMER |
 ACPI_HAS_BUTTONS | ACPI_HAS_VGA |
 ACPI_HAS_8042 | ACPI_HAS_CMOS_RTC);
 config->acpi_revision = 4;
 
-config->tis_hdr = (uint16_t *)ACPI_TIS_HDR_ADDRESS;
+if ( !strncmp(xenstore_read("platform/tpm_version", "0"), "2", 1)  ) {
+
+config->tpm_version = 2;
+config->crb_hdr = (uint16_t *)ACPI_CRB_HDR_ADDRESS;
+config->tis_hdr = NULL;
+}
+else
+{
+config->tpm_version = 1;
+config->crb_hdr = NULL;
+config->tis_hdr = (uint16_t *)ACPI_TIS_HDR_ADDRESS;
+}
 
 config->numa.nr_vmemranges = nr_vmemranges;
 config->numa.nr_vnodes = nr_vnodes;
diff --git a/tools/libacpi/Makefile b/tools/libacpi/Makefile
index 60860eaa00..125f29fb54 100644
--- a/tools/libacpi/Makefile
+++ b/tools/libacpi/Makefile
@@ -25,7 +25,7 @@ C_SRC-$(CONFIG_X86) = dsdt_anycpu.c dsdt_15cpu.c dsdt_anycpu_qemu_xen.c dsdt_pvh
 C_SRC-$(CONFIG_ARM_64) = dsdt_anycpu_arm.c
 DSDT_FILES ?= $(C_SRC-y)
 C_SRC = $(addprefix $(ACPI_BUILD_DIR)/, $(DSDT_FILES))
-H_SRC = $(addprefix $(ACPI_BUILD_DIR)/, ssdt_s3.h ssdt_s4.h ssdt_pm.h ssdt_tpm.h ssdt_laptop_slate.h)
+H_SRC = $(addprefix $(ACPI_BUILD_DIR)/, ssdt_s3.h ssdt_s4.h ssdt_pm.h ssdt_tpm.h ssdt_tpm2.h ssdt_laptop_slate.h)
 
 MKDSDT_CFLAGS-$(CONFIG_ARM_64) = -DCONFIG_ARM_64
 MKDSDT_CFLAGS-$(CONFIG_X86) = -DCONFIG_X86
diff --git a/tools/libacpi/acpi2_0.h b/tools/libacpi/acpi2_0.h
index 2619ba32db..5754daa985 100644
--- a/tools/libacpi/acpi2_0.h
+++ b/tools/libacpi/acpi2_0.h
@@ -121,6 +121,28 @@ struct acpi_20_tcpa {
 };
 #define ACPI_2_0_TCPA_LAML_SIZE (64*1024)
 
+/*
+ * TPM2
+ */
+struct Acpi20TPM2 {
+struct acpi_header header;
+uint16_t platform_class;
+uint16_t reserved;
+uint64_t control_area_address;
+uint32_t start_method;
+uint8_t start_method_params[12];
+uint32_t log_area_minimum_length;
+uint64_t log_area_start_address;
+};
+#define TPM2_ACPI_CLASS_CLIENT  0
+#define TPM2_START_METHOD_CRB   7
+
+#define TPM_CRB_ADDR_BASE   0xFED4
+#define TPM_CRB_ADDR_CTRL   (TPM_CRB_ADDR_BASE + 0x40)
+
+#define TPM_LOG_AREA_MINIMUM_SIZE   (64 << 10)
+#define TPM_LOG_SIZE(64 << 10)
+
 /*
  * Fixed ACPI Description Table Structure (FADT) in ACPI 1.0.
  */
@@ -431,6 +453,7 @@ struct acpi_20_slit {
 #define ACPI_2_0_RSDT_SIGNATURE ASCII32('R','S','D','T')
 #define ACPI_2_0_XSDT_SIGNATURE ASCII32('X','S','D','T')
 #define ACPI_2_0_TCPA_SIGNATURE ASCII32('T','C','P','A')
+#define ACPI_2_0_TPM2_SIGNATURE ASCII32('T','P','M','2')
 #define ACPI_2_0_HPET_SIGNATURE ASCII32('H','P','E','T')
 #define ACPI_2_0_WAET_SIGNATURE ASCII32('W','A','E','T')
 #define ACPI_2_0_SRAT_SIGNATURE ASCII32('S','R','A','T')
@@ -444,6 +467,7 @@ struct acpi_20_slit {
 #define ACPI_2_0_RSDT_REVISION 0x01
 #define ACPI_2_0_XSDT_REVISION 0x01
 #define ACPI_2_0_TCPA_REVISION 0x02
+#define ACPI_2_0_TPM2_REVISION 0x04
 #define ACPI_2_0_HPET_REVISION 0x01
 #define 

[qemu-mainline test] 172869: regressions - FAIL

2022-08-30 Thread osstest service owner
flight 172869 qemu-mainline real [real]
http://logs.test-lab.xenproject.org/osstest/logs/172869/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64-libvirt           6 libvirt-build            fail REGR. vs. 172123
 build-i386-libvirt            6 libvirt-build            fail REGR. vs. 172123
 build-arm64-libvirt           6 libvirt-build            fail REGR. vs. 172123
 build-armhf-libvirt           6 libvirt-build            fail REGR. vs. 172123

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-libvirt-raw  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-qcow2  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-raw  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-xsm   1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-raw   1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-i386-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-amd64-libvirt-vhd  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stop fail like 172123
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 172123
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 172123
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 172123
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stop fail like 172123
 test-amd64-i386-xl-pvshim14 guest-start  fail   never pass
 test-arm64-arm64-xl-seattle  15 migrate-support-check fail   never pass
 test-arm64-arm64-xl-seattle  16 saverestore-support-check fail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-check fail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-check fail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-check fail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-check fail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-check fail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-check fail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-check fail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-check fail   never pass
 test-arm64-arm64-xl  15 migrate-support-check fail   never pass
 test-arm64-arm64-xl  16 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-check fail   never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-check fail   never pass
 test-arm64-arm64-xl-vhd  14 migrate-support-check fail   never pass
 test-arm64-arm64-xl-vhd  15 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-check fail   never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-check fail  never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-check fail  never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-check fail   never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-check fail   never pass
 test-armhf-armhf-xl-vhd  14 migrate-support-check fail   never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-vhd  15 saverestore-support-check fail   never pass
 test-armhf-armhf-xl  15 migrate-support-check fail   never pass
 test-armhf-armhf-xl  16 saverestore-support-check fail   never pass

version targeted for testing:
 qemuu                9a99f964b152f8095949bbddca7841744ad418da
baseline version:
 qemuu                2480f3bbd03814b0651a1f74959f5c6631ee5819

Last test of basis   172123  2022-08-03 18:10:07 Z   27 days
Failing since172148  2022-08-04 21:39:38 Z   25 days   59 attempts
Testing same since   172768  2022-08-25 07:03:08 Z5 days   12 attempts


RE: [PATCH 2/2] xen/arm: Handle reserved heap pages in boot and heap allocator

2022-08-30 Thread Stefano Stabellini
On Tue, 30 Aug 2022, Henry Wang wrote:
> Hi Michal,
> 
> > -Original Message-
> > From: Michal Orzel 
> > >>>
> > >> Did you consider putting reserved_heap into bootinfo structure?
> > >
> > > Actually I did, but I saw current bootinfo only contains some structs so
> > > I was not sure if this is the preferred way, but since you are raising 
> > > this
> > > question, I will follow this method in v2.
> > This is what I think would be better but maintainers will have a decisive 
> > vote.
> 
> Then let's wait for more input from maintainers.

I don't have a strong preference and the way the current code is
written, it would actually take less memory as is (the extra bool
xen_heap comes for free.)

I would keep the patch as is for now and for 4.17.

If Julien prefers a refactoring of bootinfo/meminfo I think it could be
done after the release if you are up to it.



RE: [PATCH 2/2] xen/arm: Handle reserved heap pages in boot and heap allocator

2022-08-30 Thread Stefano Stabellini
On Tue, 30 Aug 2022, Henry Wang wrote:
> > -Original Message-
> > From: Michal Orzel 
> > >
> > > Oh, I think I get your point. Let me try to explain myself and thanks for 
> > > your
> > > patience :))
> > >
> > > The reserved heap region defined in the device tree should be used for
> > both
> > > Xenheap and domain heap, so if we reserved a too small region (<32M),
> > > an error should pop because the reserved region is not enough for
> > xenheap,
> > > and user should reserve more.
> > > [...]
> > >
> > >> But your check is against heap being to small (less than 32M).
> > >> So basically if the following check fails:
> > >> "( reserved_heap && reserved_heap_pages < 32<<(20-PAGE_SHIFT) ) )"
> > >> it means that the heap region defined by a user is too small (not too 
> > >> large),
> > >> because according to requirements it should be at least 32M.
> > >
> > > [...]
> > > So in that case, printing "Not enough space for xenheap" means the
> > reserved
> > > region cannot satisfy the minimal requirement of the space of xenheap (at
> > least
> > > 32M), and this is consistent with the check.
> > 
> > Ok, it clearly depends on the way someone understands this sentence.
> > Currently this panic can be triggered if the heap size is too large and
> > should be read as "heap is too large to fit in because there is not enough
> > space
> > within RAM considering modules (e - s < size)". Usually (and also in this 
> > case)
> > space refers to a region to contain another one.
> > 
> > You are reusing the same message for different meaning, that is "user
> > defined too
> > small heap and this space (read as size) is not enough".
> 
> Yes, thanks for the explanation. I think maybe rewording the message
> to "Not enough memory for allocating xenheap" would remove the ambiguity
> to some extent? Because the user-defined heap region should cover both
> xenheap and domain heap at the same time, the small user-defined heap
> means "xenheap is too large to fit in the user-defined heap region", which is
> consistent with your interpretation of the current "xenheap is too large 
> to fit
> in because there is not enough space within RAM considering modules"

I think we should have a separate check specific for the device tree
input parameters to make sure the region is correct, that way we can
have a specific error message, such as:

"xen,static-heap address needs to be 32MB aligned and the size a
multiple of 32MB."



RE: [PATCH 2/2] xen/arm: Handle reserved heap pages in boot and heap allocator

2022-08-30 Thread Stefano Stabellini
On Tue, 30 Aug 2022, Henry Wang wrote:
> > >  /*
> > >   * If the user has not requested otherwise via the command line
> > >   * then locate the xenheap using these constraints:
> > > @@ -743,7 +766,8 @@ static void __init setup_mm(void)
> > >   * We try to allocate the largest xenheap possible within these
> > >   * constraints.
> > >   */
> > > -heap_pages = ram_pages;
> > > +heap_pages = !reserved_heap ? ram_pages : reserved_heap_pages;
> > > +
> > >  if ( opt_xenheap_megabytes )
> > >  xenheap_pages = opt_xenheap_megabytes << (20-PAGE_SHIFT);
> > >  else
> > > @@ -755,17 +779,21 @@ static void __init setup_mm(void)
> > >
> > >  do
> > >  {
> > > -e = consider_modules(ram_start, ram_end,
> > > +e = !reserved_heap ?
> > > +consider_modules(ram_start, ram_end,
> > >   pfn_to_paddr(xenheap_pages),
> > > - 32<<20, 0);
> > > + 32<<20, 0) :
> > > +reserved_heap_end;
> > > +
> > >  if ( e )
> > >  break;
> > >
> > >  xenheap_pages >>= 1;
> > >  } while ( !opt_xenheap_megabytes && xenheap_pages > 32<<(20-
> > PAGE_SHIFT) );
> > >
> > > -if ( ! e )
> > > -panic("Not not enough space for xenheap\n");
> > > +if ( ! e ||
> > > + ( reserved_heap && reserved_heap_pages < 32<<(20-PAGE_SHIFT) ) )
> > > +panic("Not enough space for xenheap\n");
> > 
> > 
> > I would skip the do/while loop completely if reserved_heap. We don't
> > need it anyway
> 
> I agree with this.
> 
> > and we can automatically calculate xenheap_pages in a single line.
> 
> Here I am a little bit confused. Sorry to ask but could you please explain
> a little bit more about why we can calculate the xenheap_pages in a single
> line? Below is the code snippet in my mind, is this correct?
> 
> if (reserved_heap)

coding style

> e = reserved_heap_end;
> else
> {
> do
> {
> e = consider_modules(ram_start, ram_end,
>  pfn_to_paddr(xenheap_pages),
>  32<<20, 0);
> if ( e )
> break;
> 
> xenheap_pages >>= 1;
> } while ( !opt_xenheap_megabytes && xenheap_pages > 32<<(20-PAGE_SHIFT) );
> }

Yes, this is what I meant.

Note also that the loop here adjusts xenheap_pages, and
xenheap_pages is initialized as follows:


if ( opt_xenheap_megabytes )
xenheap_pages = opt_xenheap_megabytes << (20-PAGE_SHIFT);
else
{
xenheap_pages = (heap_pages/32 + 0x1fffUL) & ~0x1fffUL;
xenheap_pages = max(xenheap_pages, 32UL<<(20-PAGE_SHIFT));
xenheap_pages = min(xenheap_pages, 1UL<<(30-PAGE_SHIFT));
}


In the reserved_heap case, it doesn't make sense to initialize
xenheap_pages like that, right? It should be something like:

if ( opt_xenheap_megabytes )
xenheap_pages = opt_xenheap_megabytes << (20-PAGE_SHIFT);
else if ( reserved_heap )
xenheap_pages = heap_pages;
else
{
xenheap_pages = (heap_pages/32 + 0x1fffUL) & ~0x1fffUL;
xenheap_pages = max(xenheap_pages, 32UL<<(20-PAGE_SHIFT));
xenheap_pages = min(xenheap_pages, 1UL<<(30-PAGE_SHIFT));
}

But also it looks like that on arm32 we have specific requirements for
Xen heap:

 *  - must be 32 MiB aligned
 *  - must not include Xen itself or the boot modules
 *  - must be at most 1GB or 1/32 the total RAM in the system if less
 *  - must be at least 32M

I think we should check at least the 32MB alignment and 32MB minimum
size before using the xen_heap bank.


In short I think this patch should:

- add a check for 32MB alignment and size of the xen_heap memory bank
- if reserved_heap, set xenheap_pages = heap_pages
- if reserved_heap, skip the consider_modules do/while

Does it make sense?



[linux-5.4 test] 172866: regressions - FAIL

2022-08-30 Thread osstest service owner
flight 172866 linux-5.4 real [real]
http://logs.test-lab.xenproject.org/osstest/logs/172866/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-i386-libvirt            6 libvirt-build            fail REGR. vs. 172128
 build-amd64-libvirt           6 libvirt-build            fail REGR. vs. 172128
 build-arm64-libvirt           6 libvirt-build            fail REGR. vs. 172128
 build-armhf-libvirt           6 libvirt-build            fail REGR. vs. 172128

Tests which are failing intermittently (not blocking):
 test-amd64-i386-qemuu-rhel6hvm-amd 7 xen-install fail in 172858 pass in 172866
 test-armhf-armhf-xl-rtds 14 guest-start fail pass in 172858

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-libvirt-raw  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-amd64-libvirt-vhd  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-i386-libvirt-raw   1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-xsm   1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-qcow2  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-raw  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-credit1 18 guest-start/debian.repeat fail blocked in 172128
 test-armhf-armhf-xl-multivcpu 18 guest-start/debian.repeat fail in 172858 like 172108
 test-armhf-armhf-xl-rtds 18 guest-start/debian.repeat fail in 172858 like 172128
 test-armhf-armhf-xl-multivcpu 15 migrate-support-check fail in 172858 never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-check fail in 172858 never pass
 test-armhf-armhf-xl-rtds15 migrate-support-check fail in 172858 never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-check fail in 172858 never pass
 test-armhf-armhf-xl-credit2  18 guest-start/debian.repeat fail  like 172108
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stop fail like 172108
 test-armhf-armhf-xl-multivcpu 14 guest-start  fail like 172128
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stop fail like 172128
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stop fail like 172128
 test-amd64-i386-xl-qemut-win7-amd64 19 guest-stop fail like 172128
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 172128
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stop fail like 172128
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 172128
 test-amd64-i386-xl-qemut-ws16-amd64 19 guest-stop fail like 172128
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 172128
 test-amd64-i386-xl-pvshim14 guest-start  fail   never pass
 test-arm64-arm64-xl-seattle  15 migrate-support-check fail   never pass
 test-arm64-arm64-xl-seattle  16 saverestore-support-check fail   never pass
 test-arm64-arm64-xl  15 migrate-support-check fail   never pass
 test-arm64-arm64-xl  16 saverestore-support-check fail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-check fail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-check fail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-check fail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-check fail   never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-check fail   never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-check fail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-check fail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-check fail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-check fail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-check fail   never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-check fail never pass
 

[ovmf test] 172872: regressions - FAIL

2022-08-30 Thread osstest service owner
flight 172872 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/172872/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-i386-libvirt            6 libvirt-build            fail REGR. vs. 172136
 build-amd64-libvirt           6 libvirt-build            fail REGR. vs. 172136
 build-i386-xsm                6 xen-build                fail REGR. vs. 172136

version targeted for testing:
 ovmf ba0e0e4c6a174b71b18ccd6e47319cc45878893c
baseline version:
 ovmf 444260d45ec2a84e8f8c192b3539a3cd5591d009

Last test of basis   172136  2022-08-04 06:43:42 Z   26 days
Failing since172151  2022-08-05 02:40:28 Z   25 days  206 attempts
Testing same since   172829  2022-08-28 09:13:28 Z2 days   18 attempts


People who touched revisions under test:
  Abdul Lateef Attar 
  Abner Chang 
  Ard Biesheuvel 
  Bob Feng 
  Chasel Chiu 
  Czajkowski, Maciej 
  Dimitrije Pavlov 
  Dun Tan 
  Edward Pickup 
  Foster Nong 
  Gregx Yeh 
  Igor Kulchytskyy 
  James Lu 
  Jose Marinho 
  KasimX Liu 
  Kavya 
  Konstantin Aladyshev 
  Liming Gao 
  Liu, Zhiguang 
  Maciej Czajkowski 
  Michael D Kinney 
  Ray Ni 
  Rebecca Cran 
  Sainadh Nagolu 
  Sami Mujawar 
  Shengfengx Xue 
  Zhiguang Liu 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   fail
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  fail
 build-i386-libvirt   fail
 build-amd64-pvops            pass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.

(No revision log; it would be 1015 lines long.)



[linux-linus test] 172865: regressions - FAIL

2022-08-30 Thread osstest service owner
flight 172865 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/172865/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64-libvirt           6 libvirt-build            fail REGR. vs. 172133
 build-arm64-libvirt           6 libvirt-build            fail REGR. vs. 172133
 build-i386-libvirt            6 libvirt-build            fail REGR. vs. 172133
 test-amd64-amd64-xl-qemut-debianhvm-i386-xsm 12 debian-hvm-install fail REGR. vs. 172133
 build-armhf-libvirt           6 libvirt-build            fail REGR. vs. 172133

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-qcow2  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-raw  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-qcow2  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-amd64-libvirt-raw  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-raw  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stop fail like 172133
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 172133
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stop fail like 172133
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stop fail like 172133
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stop fail like 172133
 test-arm64-arm64-xl-seattle  15 migrate-support-check fail   never pass
 test-arm64-arm64-xl-seattle  16 saverestore-support-check fail   never pass
 test-arm64-arm64-xl  15 migrate-support-check fail   never pass
 test-arm64-arm64-xl  16 saverestore-support-check fail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-check fail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-check fail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-check fail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-check fail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-check fail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-check fail   never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-check fail  never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-check fail  never pass
 test-armhf-armhf-xl  15 migrate-support-check fail   never pass
 test-armhf-armhf-xl  16 saverestore-support-check fail   never pass
 test-arm64-arm64-xl-vhd  14 migrate-support-check fail   never pass
 test-arm64-arm64-xl-vhd  15 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-check fail   never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-check fail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-check fail   never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-check fail   never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-vhd  14 migrate-support-check fail   never pass
 test-armhf-armhf-xl-vhd  15 saverestore-support-checkfail   never pass

version targeted for testing:
 linuxdcf8e5633e2e69ad60b730ab5905608b756a032f
baseline version:
 linuxb44f2fd87919b5ae6e1756d4c7ba2cbba22238e1

Last test of basis   172133  2022-08-04 05:14:48 Z   26 days
Failing since172152  2022-08-05 04:01:26 Z   25 days   58 attempts
Testing same since   172865  2022-08-30 04:54:06 Z0 days1 attempts


1589 people touched revisions under test,
not listing them all

jobs:
 build-amd64-xsm  pass
 build-arm64-xsm  pass
 build-i386-xsm 

Xen in ARM environment networking overload issue

2022-08-30 Thread Oleshii Wood
Hello guys,

The same issue might be found at the following link:
https://xen.markmail.org/message/3w4oqeu5z7ryfbsb?q=xen_add_phys_to_mach_entry=1

DOM0/DOMU kernels version 5.15.19

Two XENified guests with identical configurations.
It is enough to have one DomU and Dom0.

Part of the DomU configuration:
type = "pvh"
memory = 1024
vcpus = 1

It is not necessary to reproduce the full custom board configuration with all
the bridges.
It is enough to connect the DomU to some external machine or to Dom0
through the network.

You should have an iperf3 package compiled on both ends.
On Dom0 or the external host, issue the command:
iperf3 -s &

On DomU, issue the command:
iperf3 -c [Dom0 or external host IP address] -k 400K -b 0

After a short time you will see, in your DomU kernel messages or on
your Xen console, something like:
[ 2385.999011] xen_add_phys_to_mach_entry: cannot add
pfn=0x0003e0de -> mfn=0x008031f2: pfn=0x0003e0de ->
mfn=0x008044a0 already exists
[ 2355.968172] xen_add_phys_to_mach_entry: cannot add
pfn=0x0003bfca -> mfn=0x0080319c: pfn=0x0003bfca ->
mfn=0x00803276 already exists
[ 2323.002652] xen_add_phys_to_mach_entry: cannot add
pfn=0x0003b80e -> mfn=0x0080447d: pfn=0x0003b80e ->
mfn=0x008032a7 already exists
[ 2302.036336] xen_add_phys_to_mach_entry: cannot add
pfn=0x0003e0de -> mfn=0x00803105: pfn=0x0003e0de ->
mfn=0x008044a0 already exists
[ 2273.758169] xen_add_phys_to_mach_entry: cannot add
pfn=0x0003b80e -> mfn=0x008033fc: pfn=0x0003b80e ->
mfn=0x008032a7 already exists
[ 2252.254857] xen_add_phys_to_mach_entry: cannot add
pfn=0x0003b80e -> mfn=0x008032f0: pfn=0x0003b80e ->
mfn=0x008032a7 already exists

You will see a lot of these messages.

Involved files:
arch/arm/xen/p2m.c
drivers/net/xen-netback/netback.c
drivers/net/xen-netback/common.h

The problem originates in the p2m.c file, in the xen_add_phys_to_mach_entry
function.
This function adds a new mapping from a Xen page pfn to the DomU gfn.
It does so via a red/black tree.
The Xen netback adapter structure is placed in the common.h file. It contains
the xenvif_queue structure.
These are the involved members of that structure:

struct xenvif_queue {
...
struct page *mmap_pages[MAX_PENDING_REQS];
pending_ring_idx_t pending_cons;
...
u16 pending_ring[MAX_PENDING_REQS];
...
struct page *pages_to_map[MAX_PENDING_REQS];
...
}

All the pages are stored in xenvif_queue->mmap_pages.
They are allocated by index; the indexes are stored
sequentially in xenvif_queue->pending_ring.
Page allocation depends on xenvif_queue->pending_cons;
this value wraps modulo MAX_PENDING_REQS, so pages are
allocated for requests cyclically.
When intensive network traffic is in progress, especially
when the packet flow density is growing, sooner or later
we run into the case where we do not have enough free pages.
MAX_PENDING_REQS in our case is 256.
This case arrives in the netback.c xenvif_tx_action->gnttab_map_refs call.
The main work is in xenvif_tx_build_gops.
xenvif_tx_action is issued in NAPI context,
so we could say it is something like an interrupt bottom half.
The message is produced when the submitted pfn is already present in the
red/black tree.
It is produced unconditionally. Under the above-mentioned conditions this
output degrades the performance drastically.
I can offer a patch which decreases the amount of messages.

Regards,
Oleg


[ovmf test] 172870: regressions - FAIL

2022-08-30 Thread osstest service owner
flight 172870 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/172870/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-i386-libvirt6 libvirt-buildfail REGR. vs. 172136
 build-amd64-libvirt   6 libvirt-buildfail REGR. vs. 172136

version targeted for testing:
 ovmf ba0e0e4c6a174b71b18ccd6e47319cc45878893c
baseline version:
 ovmf 444260d45ec2a84e8f8c192b3539a3cd5591d009

Last test of basis   172136  2022-08-04 06:43:42 Z   26 days
Failing since172151  2022-08-05 02:40:28 Z   25 days  205 attempts
Testing same since   172829  2022-08-28 09:13:28 Z2 days   17 attempts


People who touched revisions under test:
  Abdul Lateef Attar 
  Abner Chang 
  Ard Biesheuvel 
  Bob Feng 
  Chasel Chiu 
  Czajkowski, Maciej 
  Dimitrije Pavlov 
  Dun Tan 
  Edward Pickup 
  Foster Nong 
  Gregx Yeh 
  Igor Kulchytskyy 
  James Lu 
  Jose Marinho 
  KasimX Liu 
  Kavya 
  Konstantin Aladyshev 
  Liming Gao 
  Liu, Zhiguang 
  Maciej Czajkowski 
  Michael D Kinney 
  Ray Ni 
  Rebecca Cran 
  Sainadh Nagolu 
  Sami Mujawar 
  Shengfengx Xue 
  Zhiguang Liu 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  fail
 build-i386-libvirt   fail
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.

(No revision log; it would be 1015 lines long.)



Re: [RFC PATCH v1] xen/docs: Document acquire resource interface

2022-08-30 Thread Paul Durrant

On 29/08/2022 10:03, Matias Ezequiel Vara Larsen wrote:

This commit creates a new doc to document the acquire resource interface. This
is a reference document.

Signed-off-by: Matias Ezequiel Vara Larsen 
---
Changes in v1:
- correct documentation about how mfns are allocated
- correct documentation about how mfns are released
- use the wording tool instead of pv tool
- fix typos
---
  .../acquire_resource_reference.rst| 338 ++
  docs/hypervisor-guide/index.rst   |   2 +
  2 files changed, 340 insertions(+)
  create mode 100644 docs/hypervisor-guide/acquire_resource_reference.rst

diff --git a/docs/hypervisor-guide/acquire_resource_reference.rst 
b/docs/hypervisor-guide/acquire_resource_reference.rst
new file mode 100644
index 00..d1989d2fd4
--- /dev/null
+++ b/docs/hypervisor-guide/acquire_resource_reference.rst
@@ -0,0 +1,338 @@
+.. SPDX-License-Identifier: CC-BY-4.0
+
+Acquire resource reference
+==
+
+Acquire resource allows you to share a resource between Xen and dom0.


That doesn't sound right. The resources 'belong' to Xen, and are 
specific to a particular domain (A). Another domain (B) with enough 
privilege over domain A can then map and hence access those resources.


  Paul




[xen-unstable test] 172861: tolerable FAIL

2022-08-30 Thread osstest service owner
flight 172861 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/172861/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-i386-libvirt-raw   1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-xsm   1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-raw  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-amd64-libvirt-vhd  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-pair  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-qcow2  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-raw  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 172823
 build-amd64-libvirt   6 libvirt-buildfail  like 172841
 build-i386-libvirt6 libvirt-buildfail  like 172841
 build-arm64-libvirt   6 libvirt-buildfail  like 172841
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 172841
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 172841
 test-amd64-i386-xl-qemut-ws16-amd64 19 guest-stop fail like 172841
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 172841
 test-amd64-i386-xl-qemut-win7-amd64 19 guest-stop fail like 172841
 build-armhf-libvirt   6 libvirt-buildfail  like 172841
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 172841
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 172841
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 172841
 test-amd64-i386-xl-pvshim14 guest-start  fail   never pass
 test-arm64-arm64-xl-seattle  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  14 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 16 saverestore-support-checkfail never pass
 test-armhf-armhf-xl-vhd  14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  cbb35e72802f3a285c382a995ef647b59e5caf2f
baseline version:
 xen  

RE: [PATCH 2/2] xen/arm: Handle reserved heap pages in boot and heap allocator

2022-08-30 Thread Henry Wang
Hi Michal,

> -Original Message-
> From: Michal Orzel 
> >
> > Oh I think I get your point. Let me try to explain myself, and thanks for your
> > patience :))
> >
> > The reserved heap region defined in the device tree should be used for
> both
> > Xenheap and domain heap, so if we reserved a too small region (<32M),
> > an error should pop because the reserved region is not enough for
> xenheap,
> > and user should reserve more.
> > [...]
> >
> >> But your check is against heap being too small (less than 32M).
> >> So basically if the following check fails:
> >> "( reserved_heap && reserved_heap_pages < 32<<(20-PAGE_SHIFT) ) )"
> >> it means that the heap region defined by a user is too small (not too 
> >> large),
> >> because according to requirements it should be at least 32M.
> >
> > [...]
> > So in that case, printing "Not enough space for xenheap" means the
> reserved
> > region cannot satisfy the minimal requirement of the space of xenheap (at
> least
> > 32M), and this is consistent with the check.
> 
> Ok, it clearly depends on the way someone understands this sentence.
> Currently this panic can be triggered if the heap size is too large and
> should be read as "heap is too large to fit in because there is not enough
> space
> within RAM considering modules (e - s < size)". Usually (and also in this 
> case)
> space refers to a region to contain another one.
> 
> You are reusing the same message for different meaning, that is "user
> defined too
> small heap and this space (read as size) is not enough".

Yes, thanks for the explanation. I think maybe rewording the message
to "Not enough memory for allocating xenheap" would remove the ambiguity
to some extent? Because the user-defined heap region should cover both
xenheap and domain heap at the same time, a too-small user-defined heap
means "xenheap is too large to fit in the user-defined heap region", which is
consistent with your interpretation of the current "xenheap is too large to
fit in because there is not enough space within RAM considering modules".

> 
> Let's leave it to someone else to decide.

I agree.

Kind regards,
Henry

> 
> ~Michal


Re: [PATCH 2/2] xen/arm: Handle reserved heap pages in boot and heap allocator

2022-08-30 Thread Michal Orzel



On 30/08/2022 11:17, Henry Wang wrote:
> 
> Hi Michal,
> 
>> -Original Message-
>> From: Michal Orzel 
 This is totally fine. What I mean is that the check you introduced does not
 correspond
 to the panic message below. In case of reserved heap, its size is selected
>> by
 the user.
 "Not enough space for xenheap" means that there is not enough space to
>> be
 reserved for heap,
 meaning its size is too large. But your check is about size being too 
 small.
>>>
>>> Actually my understanding of "Not enough space for xenheap" is xenheap
>>> is too large so we need to reserve more space, which is slightly different
>> than
>>> your opinion. But I am not the native speaker so it is highly likely that I 
>>> am
>>> making mistakes...
>> My understanding is exactly the same as yours :),
>> meaning heap is too large.
> 
> Oh I think I get your point. Let me try to explain myself, and thanks for your
> patience :))
> 
> The reserved heap region defined in the device tree should be used for both
> Xenheap and domain heap, so if we reserved a too small region (<32M),
> an error should pop because the reserved region is not enough for xenheap,
> and user should reserve more.
> [...]
> 
>> But your check is against heap being too small (less than 32M).
>> So basically if the following check fails:
>> "( reserved_heap && reserved_heap_pages < 32<<(20-PAGE_SHIFT) ) )"
>> it means that the heap region defined by a user is too small (not too large),
>> because according to requirements it should be at least 32M.
> 
> [...]
> So in that case, printing "Not enough space for xenheap" means the reserved
> region cannot satisfy the minimal requirement of the space of xenheap (at 
> least
> 32M), and this is consistent with the check.

Ok, it clearly depends on the way someone understands this sentence.
Currently this panic can be triggered if the heap size is too large and
should be read as "heap is too large to fit in because there is not enough space
within RAM considering modules (e - s < size)". Usually (and also in this case)
space refers to a region to contain another one.

You are reusing the same message for a different meaning, that is "the user
defined too small a heap and this space (read as size) is not enough".

Let's leave it to someone else to decide.

~Michal



RE: [PATCH 2/2] xen/arm: Handle reserved heap pages in boot and heap allocator

2022-08-30 Thread Henry Wang
Hi Michal,

> -Original Message-
> From: Michal Orzel 
> >> This is totally fine. What I mean is that the check you introduced does not
> >> correspond
> >> to the panic message below. In case of reserved heap, its size is selected
> by
> >> the user.
> >> "Not enough space for xenheap" means that there is not enough space to
> be
> >> reserved for heap,
> >> meaning its size is too large. But your check is about size being too 
> >> small.
> >
> > Actually my understanding of "Not enough space for xenheap" is xenheap
> > is too large so we need to reserve more space, which is slightly different
> than
> > your opinion. But I am not the native speaker so it is highly likely that I 
> > am
> > making mistakes...
> My understanding is exactly the same as yours :),
> meaning heap is too large.

Oh I think I get your point. Let me try to explain myself, and thanks for your
patience :))

The reserved heap region defined in the device tree should be used for both
Xenheap and domain heap, so if we reserved a too small region (<32M),
an error should pop because the reserved region is not enough for xenheap,
and user should reserve more.
[...]

> But your check is against heap being too small (less than 32M).
> So basically if the following check fails:
> "( reserved_heap && reserved_heap_pages < 32<<(20-PAGE_SHIFT) ) )"
> it means that the heap region defined by a user is too small (not too large),
> because according to requirements it should be at least 32M.

[...]
So in that case, printing "Not enough space for xenheap" means the reserved
region cannot satisfy the minimal requirement of the space of xenheap (at least
32M), and this is consistent with the check.

Kind regards,
Henry







Re: [PATCH 2/2] xen/arm: Handle reserved heap pages in boot and heap allocator

2022-08-30 Thread Michal Orzel
Hi Henry,

On 30/08/2022 10:00, Henry Wang wrote:
> 
> Hi Michal,
> 
>> -Original Message-
>> From: Michal Orzel 
>
 Did you consider putting reserved_heap into bootinfo structure?
>>>
>>> Actually I did, but I saw current bootinfo only contains some structs so
>>> I was not sure if this is the preferred way, but since you are raising this
>>> question, I will follow this method in v2.
>> This is what I think would be better but maintainers will have a decisive 
>> vote.
> 
> Then let's wait for more input from maintainers.
> 
>>
>
> -if ( ! e )
> -panic("Not not enough space for xenheap\n");
> +if ( ! e ||
> + ( reserved_heap && reserved_heap_pages < 32<<(20-
>> PAGE_SHIFT) ) )
 I'm not sure about this. You are checking if the size of the reserved heap 
 is
 less than 32MB
 and this has nothing to do with the following panic message.
>>>
>>> Hmmm, I am not sure if I understand your question correctly, so here there
>>> are actually 2 issues:
>>> (1) The double not in the panic message.
>>> (2) The size of xenheap.
>>>
>>> If you check the comment of the xenheap constraints above, one rule of
>> the
>>> xenheap size is it "must be at least 32M". If I am not mistaken, we need to
>>> follow the same rule with the reserved heap setup, so here we need to
>> check
>>> the size and if <32M then panic.
>> This is totally fine. What I mean is that the check you introduced does not
>> correspond
>> to the panic message below. In case of reserved heap, its size is selected by
>> the user.
>> "Not enough space for xenheap" means that there is not enough space to be
>> reserved for heap,
>> meaning its size is too large. But your check is about size being too small.
> 
> Actually my understanding of "Not enough space for xenheap" is xenheap
> is too large so we need to reserve more space, which is slightly different 
> than
> your opinion. But I am not a native speaker so it is highly likely that I am
> making mistakes...
My understanding is exactly the same as yours :), meaning heap is too large.
But your check is against heap being too small (less than 32M).
So basically if the following check fails:
"( reserved_heap && reserved_heap_pages < 32<<(20-PAGE_SHIFT) ) )"
it means that the heap region defined by a user is too small (not too large),
because according to requirements it should be at least 32M.

> 
> How about changing the panic message to "Not enough memory for xenheap"?
> This would remove the ambiguity here IMHO.
> 
>>
> + * If reserved heap regions are properly defined, (only) add these
 regions
 How can you say at this stage whether the reserved heap regions are
>> defined
 properly?
>>>
>>> Because if the reserved heap regions are not properly defined, in the
>> device
>>> tree parsing phase the global variable "reserved_heap" can never be true.
>>>
>>> Did I understand your question correctly? Or maybe we need to change the
>>> wording here in the comment?
>>
>> FWICS, reserved_heap will be set to true even if a user describes an empty
>> region
>> for reserved heap. This cannot be considered a properly defined region for a
>> heap.
> 
> Oh good point, thank you for pointing this out. I will change the comments
> here to "If there are non-empty reserved heap regions". I am not sure if 
> adding
> an empty region check before setting the "reserved_heap" would be a good
> idea, because adding such a check would add another for loop to find a non-empty
> reserved heap bank. What do you think?
> 
> Kind regards,
> Henry
> 
>>
>> ~Michal



[qemu-mainline test] 172859: regressions - FAIL

2022-08-30 Thread osstest service owner
flight 172859 qemu-mainline real [real]
http://logs.test-lab.xenproject.org/osstest/logs/172859/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64-libvirt   6 libvirt-buildfail REGR. vs. 172123
 build-i386-libvirt6 libvirt-buildfail REGR. vs. 172123
 build-arm64-libvirt   6 libvirt-buildfail REGR. vs. 172123
 build-armhf-libvirt   6 libvirt-buildfail REGR. vs. 172123

Tests which are failing intermittently (not blocking):
 test-amd64-i386-xl-qemuu-dmrestrict-amd64-dmrestrict 7 xen-install fail in 
172852 pass in 172859
 test-amd64-amd64-qemuu-nested-intel 16 xen-boot/l1 fail in 172852 pass in 
172859
 test-armhf-armhf-xl-rtds 14 guest-start  fail in 172852 pass in 172859
 test-amd64-coresched-i386-xl 20 guest-localmigrate/x10 fail pass in 172852

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-libvirt-raw  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-qcow2  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-raw  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-xsm   1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-raw   1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-i386-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-amd64-libvirt-vhd  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-amd64-amd64-qemuu-nested-intel 17 capture-logs/l1(17) fail in 172852 
blocked in 172123
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 172123
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 172123
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 172123
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 172123
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 172123
 test-amd64-i386-xl-pvshim14 guest-start  fail   never pass
 test-arm64-arm64-xl-seattle  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  14 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 16 saverestore-support-checkfail never pass
 test-armhf-armhf-xl-vhd  14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-checkfail   never pass

RE: [PATCH 2/2] xen/arm: Handle reserved heap pages in boot and heap allocator

2022-08-30 Thread Henry Wang
Hi Stefano,

> -Original Message-
> From: Stefano Stabellini 
> > +paddr_t reserved_heap_start = ~0, reserved_heap_end = 0,
> 
> INVALID_PADDR or ~0ULL

Ack.

> 
> >  /*
> >   * If the user has not requested otherwise via the command line
> >   * then locate the xenheap using these constraints:
> > @@ -743,7 +766,8 @@ static void __init setup_mm(void)
> >   * We try to allocate the largest xenheap possible within these
> >   * constraints.
> >   */
> > -heap_pages = ram_pages;
> > +heap_pages = !reserved_heap ? ram_pages : reserved_heap_pages;
> > +
> >  if ( opt_xenheap_megabytes )
> >  xenheap_pages = opt_xenheap_megabytes << (20-PAGE_SHIFT);
> >  else
> > @@ -755,17 +779,21 @@ static void __init setup_mm(void)
> >
> >  do
> >  {
> > -e = consider_modules(ram_start, ram_end,
> > +e = !reserved_heap ?
> > +consider_modules(ram_start, ram_end,
> >   pfn_to_paddr(xenheap_pages),
> > - 32<<20, 0);
> > + 32<<20, 0) :
> > +reserved_heap_end;
> > +
> >  if ( e )
> >  break;
> >
> >  xenheap_pages >>= 1;
> >  } while ( !opt_xenheap_megabytes && xenheap_pages > 32<<(20-
> PAGE_SHIFT) );
> >
> > -if ( ! e )
> > -panic("Not not enough space for xenheap\n");
> > +if ( ! e ||
> > + ( reserved_heap && reserved_heap_pages < 32<<(20-PAGE_SHIFT) ) )
> > +panic("Not enough space for xenheap\n");
> 
> 
> I would skip the do/while loop completely if reserved_heap. We don't
> need it anyway

I agree with this.

> and we can automatically calculate xenheap_pages in a single line.

Here I am a little bit confused. Sorry to ask but could you please explain
a little bit more about why we can calculate the xenheap_pages in a single
line? Below is the code snippet I have in mind; is this correct?

if ( reserved_heap )
    e = reserved_heap_end;
else
{
    do
    {
        e = consider_modules(ram_start, ram_end,
                             pfn_to_paddr(xenheap_pages),
                             32<<20, 0);
        if ( e )
            break;

        xenheap_pages >>= 1;
    } while ( !opt_xenheap_megabytes && xenheap_pages > 32<<(20-PAGE_SHIFT) );
}

> 
> >  domheap_pages = heap_pages - xenheap_pages;
> >
> > @@ -810,9 +838,9 @@ static void __init setup_mm(void)
> >  static void __init setup_mm(void)
> >  {
> >  const struct meminfo *banks = &bootinfo.mem;
> > -paddr_t ram_start = ~0;
> > -paddr_t ram_end = 0;
> > -paddr_t ram_size = 0;
> > +paddr_t ram_start = ~0, bank_start = ~0;
> > +paddr_t ram_end = 0, bank_end = 0;
> > +paddr_t ram_size = 0, bank_size = 0;
> >  unsigned int i;
> 
> Please use INVALID_PADDR or ~0ULL

Ack.

Kind regards,
Henry

> 
> 
> >
> >  init_pdx();
> > @@ -821,17 +849,36 @@ static void __init setup_mm(void)
> >   * We need some memory to allocate the page-tables used for the
> xenheap
> >   * mappings. But some regions may contain memory already allocated
> >   * for other uses (e.g. modules, reserved-memory...).
> > - *
> > + * If reserved heap regions are properly defined, (only) add these
> regions
> > + * in the boot allocator.
> > + */
> > +if ( reserved_heap )
> > +{
> > +for ( i = 0 ; i < bootinfo.reserved_mem.nr_banks; i++ )
> > +{
> > +if ( bootinfo.reserved_mem.bank[i].xen_heap )
> > +{
> > +bank_start = bootinfo.reserved_mem.bank[i].start;
> > +bank_size = bootinfo.reserved_mem.bank[i].size;
> > +bank_end = bank_start + bank_size;
> > +
> > +init_boot_pages(bank_start, bank_end);
> > +}
> > +}
> > +}
> > +/*
> > + * No reserved heap regions:
> >   * For simplicity, add all the free regions in the boot allocator.
> >   */
> > -populate_boot_allocator();
> > +else
> > +populate_boot_allocator();
> >
> >  total_pages = 0;
> >
> >  for ( i = 0; i < banks->nr_banks; i++ )
> >  {
> >  const struct membank *bank = &banks->bank[i];
> > -paddr_t bank_end = bank->start + bank->size;
> > +bank_end = bank->start + bank->size;
> >
> >  ram_size = ram_size + bank->size;
> >  ram_start = min(ram_start, bank->start);
> > --
> > 2.17.1
> >



[ovmf test] 172863: regressions - FAIL

2022-08-30 Thread osstest service owner
flight 172863 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/172863/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-i386-libvirt            6 libvirt-build    fail REGR. vs. 172136
 build-amd64-libvirt           6 libvirt-build    fail REGR. vs. 172136
 build-i386-pvops              6 kernel-build     fail REGR. vs. 172136

Tests which did not succeed, but are not blocking:
 test-amd64-i386-xl-qemuu-ovmf-amd64  1 build-check(1)  blocked n/a

version targeted for testing:
 ovmf ba0e0e4c6a174b71b18ccd6e47319cc45878893c
baseline version:
 ovmf 444260d45ec2a84e8f8c192b3539a3cd5591d009

Last test of basis   172136  2022-08-04 06:43:42 Z   26 days
Failing since        172151  2022-08-05 02:40:28 Z   25 days  204 attempts
Testing same since   172829  2022-08-28 09:13:28 Z    1 days   16 attempts


People who touched revisions under test:
  Abdul Lateef Attar 
  Abner Chang 
  Ard Biesheuvel 
  Bob Feng 
  Chasel Chiu 
  Czajkowski, Maciej 
  Dimitrije Pavlov 
  Dun Tan 
  Edward Pickup 
  Foster Nong 
  Gregx Yeh 
  Igor Kulchytskyy 
  James Lu 
  Jose Marinho 
  KasimX Liu 
  Kavya 
  Konstantin Aladyshev 
  Liming Gao 
  Liu, Zhiguang 
  Maciej Czajkowski 
  Michael D Kinney 
  Ray Ni 
  Rebecca Cran 
  Sainadh Nagolu 
  Sami Mujawar 
  Shengfengx Xue 
  Zhiguang Liu 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  fail
 build-i386-libvirt   fail
 build-amd64-pvopspass
 build-i386-pvops fail
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  blocked 



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.

(No revision log; it would be 1015 lines long.)



Re: USB-C 250GB SSD Passthrough fails to DomU Ubuntu

2022-08-30 Thread A Sudheer
Few more logs

From Dom0, I did passthrough of two USB drives (a 32GB stick and a 250GB USB SSD).
In DomU, the 32GB drive got mounted, but the 250GB SSD fails to mount.
In DomU, "lsusb" shows both drives, but "usb-devices" shows only the
32GB drive.

*Dom0 log:*
amd@HOST:~$ sudo xl usb-list vm1
Devid  Type BE  state usb-ver ports
0  devicemodel  0   0 3   15
  Port 1: Bus 003 Device 002
  Port 2: Bus 005 Device 002
  Port 3:
  Port 4:
  Port 5:
  Port 6:
  Port 7:
  Port 8:
  Port 9:
  Port 10:
  Port 11:
  Port 12:
  Port 13:
  Port 14:
  Port 15:
HOST:~$

*DomU Log:*
amd@VM1:~$ lsusb
Bus 003 Device 003: ID 0781:558c SanDisk Corp. Extreme Portable SSD
Bus 003 Device 002: ID 0781:5581 SanDisk Corp. Ultra
Bus 003 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 001 Device 002: ID 0627:0001 Adomax Technology Co., Ltd QEMU USB Tablet
Bus 001 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
amd@VM1:~$

*amd@VM1:~$ usb-devices*

T:  Bus=01 Lev=00 Prnt=00 Port=00 Cnt=00 Dev#=  1 Spd=12  MxCh= 2
D:  Ver= 1.10 Cls=09(hub  ) Sub=00 Prot=00 MxPS=64 #Cfgs=  1
P:  Vendor=1d6b ProdID=0001 Rev=05.18
S:  Manufacturer=Linux 5.18.0-4460-amd+ uhci_hcd
S:  Product=UHCI Host Controller
S:  SerialNumber=:00:01.2
C:  #Ifs= 1 Cfg#= 1 Atr=e0 MxPwr=0mA
I:  If#= 0 Alt= 0 #EPs= 1 Cls=09(hub  ) Sub=00 Prot=00 Driver=hub
E:  Ad=81(I) Atr=03(Int.) MxPS=   2 Ivl=255ms

T:  Bus=01 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#=  2 Spd=12  MxCh= 0
D:  Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS= 8 #Cfgs=  1
P:  Vendor=0627 ProdID=0001 Rev=00.00
S:  Manufacturer=QEMU
S:  Product=QEMU USB Tablet
S:  SerialNumber=42
C:  #Ifs= 1 Cfg#= 1 Atr=a0 MxPwr=100mA
I:  If#= 0 Alt= 0 #EPs= 1 Cls=03(HID  ) Sub=00 Prot=00 Driver=usbhid
E:  Ad=81(I) Atr=03(Int.) MxPS=   8 Ivl=10ms

T:  Bus=02 Lev=00 Prnt=00 Port=00 Cnt=00 Dev#=  1 Spd=480 MxCh=15
D:  Ver= 2.00 Cls=09(hub  ) Sub=00 Prot=01 MxPS=64 #Cfgs=  1
P:  Vendor=1d6b ProdID=0002 Rev=05.18
S:  Manufacturer=Linux 5.18.0-4460-amd+ xhci-hcd
S:  Product=xHCI Host Controller
S:  SerialNumber=:00:04.0
C:  #Ifs= 1 Cfg#= 1 Atr=e0 MxPwr=0mA
I:  If#= 0 Alt= 0 #EPs= 1 Cls=09(hub  ) Sub=00 Prot=00 Driver=hub
E:  Ad=81(I) Atr=03(Int.) MxPS=   4 Ivl=256ms

T:  Bus=03 Lev=00 Prnt=00 Port=00 Cnt=00 Dev#=  1 Spd=5000 MxCh=15
D:  Ver= 3.00 Cls=09(hub  ) Sub=00 Prot=03 MxPS= 9 #Cfgs=  1
P:  Vendor=1d6b ProdID=0003 Rev=05.18
S:  Manufacturer=Linux 5.18.0-4460-amd+ xhci-hcd
S:  Product=xHCI Host Controller
S:  SerialNumber=:00:04.0
C:  #Ifs= 1 Cfg#= 1 Atr=e0 MxPwr=0mA
I:  If#= 0 Alt= 0 #EPs= 1 Cls=09(hub  ) Sub=00 Prot=00 Driver=hub
E:  Ad=81(I) Atr=03(Int.) MxPS=   4 Ivl=256ms

T:  Bus=03 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#=  2 Spd=5000 MxCh= 0
D:  Ver= 3.20 Cls=00(>ifc ) Sub=00 Prot=00 MxPS= 9 #Cfgs=  1
P:  Vendor=0781 ProdID=5581 Rev=01.00
S:  Manufacturer= USB
*S:  Product= SanDisk 3.2Gen1*
S:  SerialNumber=040143504c9a3bd4596082500826a11868845df4396ebc5cb2e33dd3071e3fd5505f3ca6a60b000d7c18815581071b2a7c33
C:  #Ifs= 1 Cfg#= 1 Atr=80 MxPwr=896mA
I:  If#= 0 Alt= 0 #EPs= 2 Cls=08(stor.) Sub=06 Prot=50 Driver=usb-storage
E:  Ad=02(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
E:  Ad=81(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
amd@VM1:~$

*amd@VM1:~$ sudo dmesg *

[  247.071742] usb 3-2: new SuperSpeed USB device number 3 using xhci_hcd
[  247.097304] usb 3-2: New USB device found, idVendor=0781,
idProduct=558c, bcdDevice=10.12
[  247.097314] usb 3-2: New USB device strings: Mfr=2, Product=3,
SerialNumber=1
[  247.097318] usb 3-2: Product: Extreme SSD
[  247.097321] usb 3-2: Manufacturer: SanDisk
[  247.097323] usb 3-2: SerialNumber: 31393430475A343030363932
[  247.101909] usb 3-2: USB controller :00:04.0 does not support
streams, which are required by the UAS driver.
[  247.101915] usb 3-2: Please try an other USB controller if you wish to
use UAS.
[  247.101918] usb-storage 3-2:1.0: USB Mass Storage device detected
[  247.102710] scsi host3: usb-storage 3-2:1.0
[  269.131522] usb 3-2: reset SuperSpeed USB device number 3 using xhci_hcd

*DomU dmesg log while adding USB controller and USB 32GB disk*

[   94.494852] pci :00:04.0: [1033:0194] type 00 class 0x0c0330
[   94.496278] pci :00:04.0: reg 0x10: [mem 0x-0x3fff 64bit]
[   94.502161] pci :00:04.0: BAR 0: assigned [mem 0xf180-0xf1803fff
64bit]
[   94.502981] pci :00:04.0: enabling device ( -> 0002)
[   94.504005] xen: --> pirq=24 -> irq=32 (gsi=32)
[   94.514634] xhci_hcd :00:04.0: xHCI Host Controller
[   94.514650] xhci_hcd :00:04.0: new USB bus registered, assigned bus
number 2
[   94.517840] xhci_hcd :00:04.0: hcc params 0x00080001 hci version
0x100 quirks 0x0014
[   94.523559] usb usb2: New USB device found, idVendor=1d6b,
idProduct=0002, bcdDevice= 5.18
[   94.523571] usb usb2: New USB device strings: Mfr=3, Product=2,
SerialNumber=1
[   94.523575] usb usb2: Product: xHCI Host Controller
[   94.523579] usb usb2: 

RE: [PATCH 2/2] xen/arm: Handle reserved heap pages in boot and heap allocator

2022-08-30 Thread Henry Wang
Hi Michal,

> -Original Message-
> From: Michal Orzel 
> >>>
> >> Did you consider putting reserved_heap into bootinfo structure?
> >
> > Actually I did, but I saw current bootinfo only contains some structs so
> > I was not sure if this is the preferred way, but since you are raising this
> > question, I will follow this method in v2.
> This is what I think would be better but maintainers will have a decisive 
> vote.

Then let's wait for more input from maintainers.

> 
> >>>
> >>> -if ( ! e )
> >>> -panic("Not not enough space for xenheap\n");
> >>> +if ( ! e ||
> >>> + ( reserved_heap && reserved_heap_pages < 32<<(20-
> PAGE_SHIFT) ) )
> >> I'm not sure about this. You are checking if the size of the reserved heap 
> >> is
> >> less than 32MB
> >> and this has nothing to do with the following panic message.
> >
> > Hmmm, I am not sure if I understand your question correctly, so here there
> > are actually 2 issues:
> > (1) The double not in the panic message.
> > (2) The size of xenheap.
> >
> > If you check the comment of the xenheap constraints above, one rule of
> the
> > xenheap size is it "must be at least 32M". If I am not mistaken, we need to
> > follow the same rule with the reserved heap setup, so here we need to
> check
> > the size and if <32M then panic.
> This is totally fine. What I mean is that the check you introduced does not
> correspond
> to the panic message below. In case of reserved heap, its size is selected by
> the user.
> "Not enough space for xenheap" means that there is not enough space to be
> reserved for heap,
> meaning its size is too large. But your check is about size being too small.

Actually, my understanding of "Not enough space for xenheap" is that the
xenheap is too large, so we need to reserve more space, which is slightly
different from your reading. But I am not a native speaker, so it is quite
possible that I am the one making a mistake...

How about changing the panic message to "Not enough memory for xenheap"?
This would remove the ambiguity here IMHO.

> 
> >>> + * If reserved heap regions are properly defined, (only) add these
> >> regions
> >> How can you say at this stage whether the reserved heap regions are
> defined
> >> properly?
> >
> > Because if the reserved heap regions are not properly defined, in the
> device
> > tree parsing phase the global variable "reserved_heap" can never be true.
> >
> > Did I understand your question correctly? Or maybe we need to change the
> > wording here in the comment?
> 
> FWICS, reserved_heap will be set to true even if a user describes an empty
> region
> for reserved heap. This cannot be consider a properly defined region for a
> heap.

Oh, good point, thank you for pointing this out. I will change the comment
here to "If there are non-empty reserved heap regions". I am not sure whether
adding an empty-region check before setting "reserved_heap" would be a good
idea, because such a check would require another for loop to find a non-empty
reserved heap bank. What do you think?
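
For the sake of discussion, the alternative (an explicit non-empty check) might look roughly like the sketch below. The field names (nr_banks, bank[], xen_heap) follow the quoted patch, but the types are simplified stand-ins for illustration, not the real bootinfo definitions:

```c
#include <stdbool.h>

typedef unsigned long long paddr_t;

#define NR_MEM_BANKS 4

/* Simplified stand-ins for Xen's meminfo/membank; illustrative only. */
struct membank {
    paddr_t start;
    paddr_t size;
    bool xen_heap;
};

struct meminfo {
    unsigned int nr_banks;
    struct membank bank[NR_MEM_BANKS];
};

/* True only if at least one xen_heap bank has a non-zero size, i.e. the
 * reserved heap is "properly defined" in the sense discussed above. */
static bool has_nonempty_heap_bank(const struct meminfo *reserved_mem)
{
    unsigned int i;

    for ( i = 0; i < reserved_mem->nr_banks; i++ )
        if ( reserved_mem->bank[i].xen_heap && reserved_mem->bank[i].size )
            return true;

    return false;
}
```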

Kind regards,
Henry

> 
> ~Michal


[libvirt test] 172864: regressions - FAIL

2022-08-30 Thread osstest service owner
flight 172864 libvirt real [real]
http://logs.test-lab.xenproject.org/osstest/logs/172864/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-armhf-libvirt           6 libvirt-build    fail REGR. vs. 151777
 build-amd64-libvirt           6 libvirt-build    fail REGR. vs. 151777
 build-i386-libvirt            6 libvirt-build    fail REGR. vs. 151777
 build-arm64-libvirt           6 libvirt-build    fail REGR. vs. 151777
 build-i386-pvops              6 kernel-build     fail REGR. vs. 151777

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-amd64-libvirt-vhd  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-i386-libvirt-raw   1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-xsm   1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-qcow2  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-raw  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-raw  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-qcow2  1 build-check(1)   blocked  n/a

version targeted for testing:
 libvirt  50ca78ae6d676d8c510b14d56ca0a36ebc3e3ce0
baseline version:
 libvirt  2c846fa6bcc11929c9fb857a22430fb9945654ad

Last test of basis   151777  2020-07-10 04:19:19 Z  781 days
Failing since        151818  2020-07-11 04:18:52 Z  780 days  762 attempts
Testing same since   172864  2022-08-30 04:19:24 Z    0 days    1 attempts


People who touched revisions under test:
Adolfo Jayme Barrientos 
  Aleksandr Alekseev 
  Aleksei Zakharov 
  Amneesh Singh 
  Andika Triwidada 
  Andrea Bolognani 
  Andrew Melnychenko 
  Ani Sinha 
  Balázs Meskó 
  Barrett Schonefeld 
  Bastian Germann 
  Bastien Orivel 
  BiaoXiang Ye 
  Bihong Yu 
  Binfeng Wu 
  Bjoern Walk 
  Boris Fiuczynski 
  Brad Laue 
  Brian Turek 
  Bruno Haible 
  Carlos Bilbao 
  Chris Mayo 
  Christian Borntraeger 
  Christian Ehrhardt 
  Christian Kirbach 
  Christian Schoenebeck 
  Christophe de Dinechin 
  Christophe Fergeau 
  Claudio Fontana 
  Cole Robinson 
  Collin Walling 
  Cornelia Huck 
  Cédric Bosdonnat 
  Côme Borsoi 
  Daniel Henrique Barboza 
  Daniel Letai 
  Daniel P. Berrange 
  Daniel P. Berrangé 
  Dario Faggioli 
  David Michael 
  Didik Supriadi 
  dinglimin 
  Divya Garg 
  Dmitrii Shcherbakov 
  Dmytro Linkin 
  Eiichi Tsukata 
  Emilio Herrera 
  Eric Farman 
  Erik Skultety 
  Eugenio Pérez 
  Fabian Affolter 
  Fabian Freyer 
  Fabiano Fidêncio 
  Fangge Jin 
  Farhan Ali 
  Fedora Weblate Translation 
  Florian Schmidt 
  Franck Ridel 
  Gavi Teitz 
  gongwei 
  Guoyi Tu
  Göran Uddeborg 
  Halil Pasic 
  Han Han 
  Hao Wang 
  Haonan Wang 
  Hela Basa 
  Helmut Grohne 
  Hiroki Narukawa 
  Hyman Huang(黄勇) 
  Ian Wienand 
  Ioanna Alifieraki 
  Ivan Teterevkov 
  Jakob Meng 
  Jamie Strandboge 
  Jamie Strandboge 
  Jan Kuparinen 
  jason lee 
  Jean-Baptiste Holcroft 
  Jia Zhou 
  Jianan Gao 
  Jim Fehlig 
  Jin Yan 
  Jing Qi 
  Jinsheng Zhang 
  Jiri Denemark 
  Joachim Falk 
  John Ferlan 
  John Levon 
  John Levon 
  Jonathan Watt 
  Jonathon Jongsma 
  Julio Faracco 
  Justin Gatzen 
  Ján Tomko 
  Kashyap Chamarthy 
  Kevin Locke 
  Kim InSoo 
  Koichi Murase 
  Kristina Hanicova 
  Laine Stump 
  Laszlo Ersek 
  Lee Yarwood 
  Lei Yang 
  Lena Voytek 
  Liang Yan 
  Liang Yan 
  Liao Pingfang 
  Lin Ma 
  Lin Ma 
  Lin Ma 
  Liu Yiding 
  Lubomir Rintel 
  Ludek Janda 
  Luke Yue 
  Luyao Zhong 
  luzhipeng 
  Marc Hartmayer 
  Marc-André Lureau 
  Marek Marczykowski-Górecki 
  Mark Mielke 
  Markus Schade 
  Martin Kletzander 
  Martin Pitt 
  Masayoshi Mizuma 
  Matej Cepl 
  Matt Coleman 
  Matt Coleman 
  Mauro Matteo Cascella 
  Max Goodhart 
  Maxim Nestratov 
  Meina Li 
  Michal Privoznik 
  Michał Smyk 
  Milo Casagrande 
  minglei.liu 
  Moshe Levi 
  Moteen Shah 
  Moteen Shah 
  Muha Aliss 
  Nathan 
  Neal Gompa 
  Nick Chevsky 
  Nick Shyrokovskiy 
  Nickys Music Group 
  Nico Pache 
  Nicolas Lécureuil 
  Nicolas Lécureuil 
  Nikolay Shirokovskiy 
  Nikolay Shirokovskiy 
  

RE: [PATCH 1/2] docs, xen/arm: Introduce reserved heap memory

2022-08-30 Thread Henry Wang
Hi Michal,

> -Original Message-
> From: Michal Orzel 
> >> +printk("Checking for reserved heap in /chosen\n");
> >> +if ( address_cells < 1 || size_cells < 1 )
> > address_cells and size_cells cannot be negative so you could just check
> if
> > there are 0.
> 
>  In bootfdt.c function device_tree_get_meminfo(), the address and size
> cells
>  are checked using <1 instead of =0. I agree they cannot be negative, but
> I
> >>> am
>  not very sure if there were other reasons to do the "<1" check in
>  device_tree_get_meminfo(). Are you fine with we don't keep the
> >>> consistency
>  here?
> >>>
> >>> I would keep the < 1 check but it doesn't make much difference either
> >>> way
> >>
> >> I also would prefer to keep these two places consistent and I agree Michal
> is
> >> making a good point.
> > I'm ok with that so let's keep the consistency.
> Actually, why do we want to duplicate exactly the same check in
> process_chosen_node that is already
> present in device_tree_get_meminfo? There is no need for that so just
> remove it from process_chosen_node.

Well, yes and no IMHO, because we are using "#xen,static-heap-address-cells"
and "#xen,static-heap-size-cells" instead of the normal "#address-cells" and
"#size-cells". These properties depend on the user's input, so I would say
that adding a check and a proper printk to inform the user would be a good
idea. I also think catching incorrect "#xen,static-heap-address-cells" and
"#xen,static-heap-size-cells" values and returning early would be worthwhile.

Kind regards,
Henry



Re: [PATCH 2/2] xen/arm: Handle reserved heap pages in boot and heap allocator

2022-08-30 Thread Michal Orzel
Hi Henry,

On 30/08/2022 08:11, Henry Wang wrote:
> 
> Hi Michal,
> 
> Sorry about the late reply - I had a couple of days off. Thank you very
> much for the review! I will add my reply and answer some of your
> questions below.
> 
>> -Original Message-
>> From: Michal Orzel 
>> Subject: Re: [PATCH 2/2] xen/arm: Handle reserved heap pages in boot and
>> heap allocator
>>
>>> This commit firstly adds a global variable `reserved_heap`.
>>> This newly introduced global variable is set at the device tree
>>> parsing time if the reserved heap ranges are defined in the device
>>> tree chosen node.
>>>
>> Did you consider putting reserved_heap into bootinfo structure?
> 
> Actually I did, but I saw current bootinfo only contains some structs so
> I was not sure if this is the preferred way, but since you are raising this
> question, I will follow this method in v2.
This is what I think would be better but maintainers will have a decisive vote.

> 
>> It would help to avoid introducing new global variables that are only used
>> in places making use of the bootinfo anyway.
> 
> Ack.
> 
>>
>>> +for ( i = 0 ; i < bootinfo.reserved_mem.nr_banks; i++ )
>>> +{
>>> +if ( bootinfo.reserved_mem.bank[i].xen_heap )
>>> +{
>>> +bank_start = bootinfo.reserved_mem.bank[i].start;
>>> +bank_size = bootinfo.reserved_mem.bank[i].size;
>>> +bank_end = bank_start + bank_size;
>>> +
>>> +reserved_heap_size += bank_size;
>>> +reserved_heap_start = min(reserved_heap_start, bank_start);
>> You do not need reserved_heap_start as you do not use it at any place later
>> on.
>> In your current implementation you just need reserved_heap_size and
>> reserved_heap_end.
> 
> Good point, thank you and I will remove in v2.
> 
>>
>>>  /*
>>>   * If the user has not requested otherwise via the command line
>>>   * then locate the xenheap using these constraints:
>>> @@ -743,7 +766,8 @@ static void __init setup_mm(void)
>>>   * We try to allocate the largest xenheap possible within these
>>>   * constraints.
>>>   */
>>> -heap_pages = ram_pages;
>>> +heap_pages = !reserved_heap ? ram_pages : reserved_heap_pages;
>> I must say that the reverted logic is harder to read. This is a matter of 
>> taste
>> but
>> please consider the following:
>> heap_pages = reserved_heap ? reserved_heap_pages : ram_pages;
>> The same applies to ...
> 
> Sure, I will use the way you suggested.
> 
>>
>>> +
>>>  if ( opt_xenheap_megabytes )
>>>  xenheap_pages = opt_xenheap_megabytes << (20-PAGE_SHIFT);
>>>  else
>>> @@ -755,17 +779,21 @@ static void __init setup_mm(void)
>>>
>>>  do
>>>  {
>>> -e = consider_modules(ram_start, ram_end,
>>> +e = !reserved_heap ?
>> ... here.
> 
> And here :))
> 
>>
>>> +consider_modules(ram_start, ram_end,
>>>   pfn_to_paddr(xenheap_pages),
>>> - 32<<20, 0);
>>> + 32<<20, 0) :
>>> +reserved_heap_end;
>>> +
>>>  if ( e )
>>>  break;
>>>
>>>  xenheap_pages >>= 1;
>>>  } while ( !opt_xenheap_megabytes && xenheap_pages > 32<<(20-
>> PAGE_SHIFT) );
>>>
>>> -if ( ! e )
>>> -panic("Not not enough space for xenheap\n");
>>> +if ( ! e ||
>>> + ( reserved_heap && reserved_heap_pages < 32<<(20-PAGE_SHIFT) ) )
>> I'm not sure about this. You are checking if the size of the reserved heap is
>> less than 32MB
>> and this has nothing to do with the following panic message.
> 
> Hmmm, I am not sure if I understand your question correctly, so here there
> are actually 2 issues:
> (1) The double not in the panic message.
> (2) The size of xenheap.
> 
> If you check the comment of the xenheap constraints above, one rule of the
> xenheap size is it "must be at least 32M". If I am not mistaken, we need to
> follow the same rule with the reserved heap setup, so here we need to check
> the size and if <32M then panic.
This is totally fine. What I mean is that the check you introduced does not
correspond to the panic message below. In case of reserved heap, its size is
selected by the user. "Not enough space for xenheap" means that there is not
enough space to be reserved for heap, meaning its size is too large. But your
check is about the size being too small.

> 
>>
>>> +panic("Not enough space for xenheap\n");
>>>
>>>  domheap_pages = heap_pages - xenheap_pages;
>>>
>>> @@ -810,9 +838,9 @@ static void __init setup_mm(void)
>>>  static void __init setup_mm(void)
>>>  {
>>>  const struct meminfo *banks = &bootinfo.mem;
>>> -paddr_t ram_start = ~0;
>>> -paddr_t ram_end = 0;
>>> -paddr_t ram_size = 0;
>>> +paddr_t ram_start = ~0, bank_start = ~0;
>>> +paddr_t ram_end = 0, bank_end = 0;
>>> +paddr_t ram_size = 0, bank_size = 0;
>>>  unsigned int i;
>>>
>>>  init_pdx();
>>> @@ 

Re: [PATCH 1/2] docs, xen/arm: Introduce reserved heap memory

2022-08-30 Thread Michal Orzel
On 30/08/2022 08:29, Michal Orzel wrote:
> Hi Henry,
> 
> On 30/08/2022 02:58, Henry Wang wrote:
>>
>> Hi Stefano and Michal,
>>
>>> -Original Message-
>>> From: Stefano Stabellini 
>>> Sent: Tuesday, August 30, 2022 8:47 AM
>>> To: Henry Wang 
>>> Cc: Michal Orzel ; xen-devel@lists.xenproject.org;
>>> Stefano Stabellini ; Julien Grall ;
>>> Bertrand Marquis ; Wei Chen
>>> ; Volodymyr Babchuk
>>> ; Penny Zheng 
>>> Subject: RE: [PATCH 1/2] docs, xen/arm: Introduce reserved heap memory
>>>
>>> On Thu, 25 Aug 2022, Henry Wang wrote:
>> const char *name,
>> u32 address_cells, u32 
>> size_cells)
>>  {
>> @@ -301,16 +303,40 @@ static void __init process_chosen_node(const
> void *fdt, int node,
>>  paddr_t start, end;
>>  int len;
>>
>> +if ( fdt_get_property(fdt, node, "xen,static-mem", NULL) )
>> +{
>> +u32 address_cells = device_tree_get_u32(fdt, node,
>> +
>> "#xen,static-mem-address-cells",
>> +0);
>> +u32 size_cells = device_tree_get_u32(fdt, node,
>> + 
>> "#xen,static-mem-size-cells", 0);
>> +int rc;
>> +
>> +printk("Checking for reserved heap in /chosen\n");
>> +if ( address_cells < 1 || size_cells < 1 )
> address_cells and size_cells cannot be negative so you could just check if
> there are 0.

 In bootfdt.c function device_tree_get_meminfo(), the address and size cells
 are checked using <1 instead of =0. I agree they cannot be negative, but I
>>> am
 not very sure if there were other reasons to do the "<1" check in
 device_tree_get_meminfo(). Are you fine with we don't keep the
>>> consistency
 here?
>>>
>>> I would keep the < 1 check but it doesn't make much difference either
>>> way
>>
>> I also would prefer to keep these two places consistent and I agree Michal is
>> making a good point.
> I'm ok with that so let's keep the consistency.
Actually, why do we want to duplicate exactly the same check in 
process_chosen_node that is already
present in device_tree_get_meminfo? There is no need for that so just remove it 
from process_chosen_node.

> 
>>
>> Kind regards,
>> Henry
>>
> 
> ~Michal



Re: [PATCH 1/2] docs, xen/arm: Introduce reserved heap memory

2022-08-30 Thread Michal Orzel
Hi Henry,

On 30/08/2022 02:58, Henry Wang wrote:
> 
> Hi Stefano and Michal,
> 
>> -Original Message-
>> From: Stefano Stabellini 
>> Sent: Tuesday, August 30, 2022 8:47 AM
>> To: Henry Wang 
>> Cc: Michal Orzel ; xen-devel@lists.xenproject.org;
>> Stefano Stabellini ; Julien Grall ;
>> Bertrand Marquis ; Wei Chen
>> ; Volodymyr Babchuk
>> ; Penny Zheng 
>> Subject: RE: [PATCH 1/2] docs, xen/arm: Introduce reserved heap memory
>>
>> On Thu, 25 Aug 2022, Henry Wang wrote:
> const char *name,
> u32 address_cells, u32 size_cells)
>  {
> @@ -301,16 +303,40 @@ static void __init process_chosen_node(const
 void *fdt, int node,
>  paddr_t start, end;
>  int len;
>
> +if ( fdt_get_property(fdt, node, "xen,static-mem", NULL) )
> +{
> +u32 address_cells = device_tree_get_u32(fdt, node,
> +
> "#xen,static-mem-address-cells",
> +0);
> +u32 size_cells = device_tree_get_u32(fdt, node,
> + 
> "#xen,static-mem-size-cells", 0);
> +int rc;
> +
> +printk("Checking for reserved heap in /chosen\n");
> +if ( address_cells < 1 || size_cells < 1 )
 address_cells and size_cells cannot be negative so you could just check if
 there are 0.
>>>
>>> In bootfdt.c function device_tree_get_meminfo(), the address and size cells
>>> are checked using <1 instead of =0. I agree they cannot be negative, but I
>> am
>>> not very sure if there were other reasons to do the "<1" check in
>>> device_tree_get_meminfo(). Are you fine with we don't keep the
>> consistency
>>> here?
>>
>> I would keep the < 1 check but it doesn't make much difference either
>> way
> 
> I also would prefer to keep these two places consistent and I agree Michal is
> making a good point.
I'm ok with that so let's keep the consistency.

> 
> Kind regards,
> Henry
> 

~Michal



RE: [PATCH 2/2] xen/arm: Handle reserved heap pages in boot and heap allocator

2022-08-30 Thread Henry Wang
Hi Michal,

Sorry about the late reply - I had a couple of days off. Thank you very
much for the review! I will add my reply and answer some of your
questions below.

> -Original Message-
> From: Michal Orzel 
> Subject: Re: [PATCH 2/2] xen/arm: Handle reserved heap pages in boot and
> heap allocator
> 
> > This commit firstly adds a global variable `reserved_heap`.
> > This newly introduced global variable is set at the device tree
> > parsing time if the reserved heap ranges are defined in the device
> > tree chosen node.
> >
> Did you consider putting reserved_heap into bootinfo structure?

Actually I did, but I saw current bootinfo only contains some structs so
I was not sure if this is the preferred way, but since you are raising this
question, I will follow this method in v2.

> It would help to avoid introducing new global variables that are only used
> in places making use of the bootinfo anyway.

Ack.

> 
> > +for ( i = 0 ; i < bootinfo.reserved_mem.nr_banks; i++ )
> > +{
> > +if ( bootinfo.reserved_mem.bank[i].xen_heap )
> > +{
> > +bank_start = bootinfo.reserved_mem.bank[i].start;
> > +bank_size = bootinfo.reserved_mem.bank[i].size;
> > +bank_end = bank_start + bank_size;
> > +
> > +reserved_heap_size += bank_size;
> > +reserved_heap_start = min(reserved_heap_start, bank_start);
> You do not need reserved_heap_start as you do not use it at any place later
> on.
> In your current implementation you just need reserved_heap_size and
> reserved_heap_end.

Good point, thank you and I will remove in v2.

> 
> >  /*
> >   * If the user has not requested otherwise via the command line
> >   * then locate the xenheap using these constraints:
> > @@ -743,7 +766,8 @@ static void __init setup_mm(void)
> >   * We try to allocate the largest xenheap possible within these
> >   * constraints.
> >   */
> > -heap_pages = ram_pages;
> > +heap_pages = !reserved_heap ? ram_pages : reserved_heap_pages;
> I must say that the reverted logic is harder to read. This is a matter of 
> taste
> but
> please consider the following:
> heap_pages = reserved_heap ? reserved_heap_pages : ram_pages;
> The same applies to ...

Sure, I will use the way you suggested.

> 
> > +
> >  if ( opt_xenheap_megabytes )
> >  xenheap_pages = opt_xenheap_megabytes << (20-PAGE_SHIFT);
> >  else
> > @@ -755,17 +779,21 @@ static void __init setup_mm(void)
> >
> >  do
> >  {
> > -e = consider_modules(ram_start, ram_end,
> > +e = !reserved_heap ?
> ... here.

And here :))

> 
> > +consider_modules(ram_start, ram_end,
> >   pfn_to_paddr(xenheap_pages),
> > - 32<<20, 0);
> > + 32<<20, 0) :
> > +reserved_heap_end;
> > +
> >  if ( e )
> >  break;
> >
> >  xenheap_pages >>= 1;
> >  } while ( !opt_xenheap_megabytes && xenheap_pages > 32<<(20-
> PAGE_SHIFT) );
> >
> > -if ( ! e )
> > -panic("Not not enough space for xenheap\n");
> > +if ( ! e ||
> > + ( reserved_heap && reserved_heap_pages < 32<<(20-PAGE_SHIFT) ) )
> I'm not sure about this. You are checking if the size of the reserved heap is
> less than 32MB
> and this has nothing to do with the following panic message.

Hmmm, I am not sure if I understand your question correctly, so here there
are actually 2 issues:
(1) The double not in the panic message.
(2) The size of xenheap.

If you check the comment of the xenheap constraints above, one rule of the
xenheap size is it "must be at least 32M". If I am not mistaken, we need to
follow the same rule with the reserved heap setup, so here we need to check
the size and if <32M then panic.

> 
> > +panic("Not enough space for xenheap\n");
> >
> >  domheap_pages = heap_pages - xenheap_pages;
> >
> > @@ -810,9 +838,9 @@ static void __init setup_mm(void)
> >  static void __init setup_mm(void)
> >  {
> >  const struct meminfo *banks = &bootinfo.mem;
> > -paddr_t ram_start = ~0;
> > -paddr_t ram_end = 0;
> > -paddr_t ram_size = 0;
> > +paddr_t ram_start = ~0, bank_start = ~0;
> > +paddr_t ram_end = 0, bank_end = 0;
> > +paddr_t ram_size = 0, bank_size = 0;
> >  unsigned int i;
> >
> >  init_pdx();
> > @@ -821,17 +849,36 @@ static void __init setup_mm(void)
> >   * We need some memory to allocate the page-tables used for the
> xenheap
> >   * mappings. But some regions may contain memory already allocated
> >   * for other uses (e.g. modules, reserved-memory...).
> > - *
> > + * If reserved heap regions are properly defined, (only) add these
> regions
> How can you say at this stage whether the reserved heap regions are defined
> properly?

Because if the reserved heap regions are not properly defined, in the device
tree parsing phase the global