[PATCH 4/4] Documentation/x86: Add ratelimit in buslock.rst

2021-04-19 Thread Fenghua Yu
ratelimit is a new option in bus lock handling. Need to add it in
buslock.rst.

Signed-off-by: Fenghua Yu 
Reviewed-by: Tony Luck 
---
 Documentation/x86/buslock.rst | 23 +++
 1 file changed, 23 insertions(+)

diff --git a/Documentation/x86/buslock.rst b/Documentation/x86/buslock.rst
index 4deaf8b82338..87ee5925cb5c 100644
--- a/Documentation/x86/buslock.rst
+++ b/Documentation/x86/buslock.rst
@@ -61,6 +61,11 @@ The kernel #AC and #DB handlers handle bus lock based on kernel parameter
 |                  |When both features are      |                       |
 |                  |supported, fatal in #AC     |                       |
 +------------------+----------------------------+-----------------------+
+|ratelimit:N       |Do nothing                  |Limit bus lock rate to |
+|(0 < N <= 1000)   |                            |N bus locks per second |
+|                  |                            |system wide and warn on|
+|                  |                            |bus locks.             |
++------------------+----------------------------+-----------------------+
 
 Usages
 ======
@@ -108,3 +113,21 @@ fatal
 In this case, the bus lock is not tolerated and the process is killed.
 
 It is useful in hard real time system.
+
+ratelimit
+---------
+
+A system wide bus lock rate limit N is specified where 0 < N <= 1000.
+Fewer bus locks can be generated when N is smaller.
+
+This may find usage in throttling malicious processes in the cloud. For
+example, a few malicious users may generate a lot of bus locks to launch
+a Denial of Service (DoS) attack. By setting ratelimit, system wide
+bus locks are rate limited to N bus locks per second and the DoS attack
+is mitigated. The bus locks are logged as warnings so that the system
+administrator can find the malicious users and processes.
+
+Selecting a rate limit of 1000 would allow the bus to be locked for
+up to about seven million cycles each second (assuming 7000 cycles for
+each bus lock). On a 2 GHz processor that would be about 0.35% system
+impact.
-- 
2.31.1
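
The impact estimate at the end of the ratelimit section can be checked with a small userspace C sketch. The function name is hypothetical, and the 7000-cycle cost per bus lock is the assumption stated in the documentation text rather than a measured value.

```c
#include <assert.h>

/*
 * Fraction of each second during which the bus is locked, given:
 *   rate_per_sec    - bus locks allowed per second (the N in ratelimit:N)
 *   cycles_per_lock - assumed cost of one bus lock, in cycles
 *   cpu_hz          - processor frequency in Hz
 * Hypothetical helper for illustration only; not kernel code.
 */
double bus_lock_impact(unsigned int rate_per_sec,
		       unsigned int cycles_per_lock,
		       double cpu_hz)
{
	double locked_cycles;

	assert(cpu_hz > 0.0);

	locked_cycles = (double)rate_per_sec * (double)cycles_per_lock;
	return locked_cycles / cpu_hz;
}
```

With the values from the text, bus_lock_impact(1000, 7000, 2e9) is 7,000,000 / 2,000,000,000 = 0.0035, i.e. about 0.35%.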



[PATCH 3/4] Documentation/admin-guide: Change doc for bus lock ratelimit

2021-04-19 Thread Fenghua Yu
Since bus lock rate limit changes the split_lock_detect parameter,
update the documentation for the change.

Signed-off-by: Fenghua Yu 
Reviewed-by: Tony Luck 
---
 Documentation/admin-guide/kernel-parameters.txt | 8 
 1 file changed, 8 insertions(+)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index f5892896bedc..c13bbfd8c5aa 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -5154,6 +5154,14 @@
  exception. Default behavior is by #AC if
  both features are enabled in hardware.
 
+   ratelimit:N -
+ Set system wide rate limit to N bus locks
+ per second for bus lock detection.
+ 0 < N <= 1000.
+
+ N/A for split lock detection.
+
+
If an #AC exception is hit in the kernel or in
firmware (i.e. not while executing in user mode)
the kernel will oops in either "warn" or "fatal"
-- 
2.31.1
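
The range rule documented above (0 < N <= 1000) can be illustrated with a userspace C sketch. The helper name parse_ratelimit is hypothetical and is not the kernel's parser. Rejecting trailing characters after the number is an extra check added here for illustration; it is not described in the patch.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Parse a "ratelimit:N" option string.
 * Returns N when 0 < N <= 1000, or -1 when the string is not a valid
 * ratelimit option. Hypothetical userspace helper for illustration.
 */
int parse_ratelimit(const char *arg)
{
	int n;
	char trailing;

	assert(arg != NULL);

	/*
	 * sscanf() returns 1 only when the number converted and nothing
	 * followed it; a trailing character makes it return 2.
	 */
	if (sscanf(arg, "ratelimit:%d%c", &n, &trailing) != 1)
		return -1;

	/* Min ratelimit is 1 bus lock/sec, max is 1000 bus locks/sec. */
	if (n <= 0 || n > 1000)
		return -1;

	return n;
}
```

Under this sketch "ratelimit:1000" is accepted, while "ratelimit:0", "ratelimit:1001", and "warn" are all rejected.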



[PATCH 2/4] x86/bus_lock: Set rate limit for bus lock

2021-04-19 Thread Fenghua Yu
A bus lock can be thousands of cycles slower than atomic operation within
one cache line. It also disrupts performance on other cores. Malicious
users may generate multiple bus locks to degrade the whole system
performance.

To mitigate the issue, the kernel can set a system wide rate limit for
bus locks via a kernel parameter:
split_lock_detect=ratelimit:N

When the system detects bus locks at a rate higher than N/sec (where
N can be set by the kernel boot argument in the range [1..1000]) any
task triggering a bus lock will be forced to sleep for at least 20ms
until the overall system rate of bus locks drops below the threshold.

Signed-off-by: Fenghua Yu 
Reviewed-by: Tony Luck 
---
 arch/x86/kernel/cpu/intel.c | 43 +++--
 1 file changed, 41 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index fe0bec14d7ec..149c4d33e8c4 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -10,6 +10,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -41,6 +42,7 @@ enum split_lock_detect_state {
sld_off = 0,
sld_warn,
sld_fatal,
+   sld_ratelimit,
 };
 
 /*
@@ -997,13 +999,31 @@ static const struct {
{ "off",sld_off   },
{ "warn",   sld_warn  },
{ "fatal",  sld_fatal },
+   { "ratelimit:", sld_ratelimit },
 };
 
+static struct ratelimit_state bld_ratelimit;
+
 static inline bool match_option(const char *arg, int arglen, const char *opt)
 {
-   int len = strlen(opt);
+   int len = strlen(opt), ratelimit;
+
+   if (strncmp(arg, opt, len))
+   return false;
+
+   /*
+* Min ratelimit is 1 bus lock/sec.
+* Max ratelimit is 1000 bus locks/sec.
+*/
+   if (sscanf(arg, "ratelimit:%d", &ratelimit) == 1 &&
+   ratelimit > 0 && ratelimit <= 1000) {
+   ratelimit_state_init(&bld_ratelimit, HZ, ratelimit);
+   ratelimit_set_flags(&bld_ratelimit, RATELIMIT_MSG_ON_RELEASE);
 
-   return len == arglen && !strncmp(arg, opt, len);
+   return true;
+   }
+
+   return len == arglen;
 }
 
 static bool split_lock_verify_msr(bool on)
@@ -1082,6 +1102,15 @@ static void sld_update_msr(bool on)
 
 static void split_lock_init(void)
 {
+   /*
+* #DB for bus lock handles ratelimit and #AC for split lock is
+* disabled.
+*/
+   if (sld_state == sld_ratelimit) {
+   split_lock_verify_msr(false);
+   return;
+   }
+
if (cpu_model_supports_sld)
split_lock_verify_msr(sld_state != sld_off);
 }
@@ -1154,6 +1183,12 @@ void handle_bus_lock(struct pt_regs *regs)
switch (sld_state) {
case sld_off:
break;
+   case sld_ratelimit:
+   /* Enforce no more than bld_ratelimit bus locks/sec. */
+   while (!__ratelimit(&bld_ratelimit))
+   msleep(20);
+   /* Warn on the bus lock. */
+   fallthrough;
case sld_warn:
pr_warn_ratelimited("#DB: %s/%d took a bus_lock trap at address: 0x%lx\n",
current->comm, current->pid, regs->ip);
@@ -1259,6 +1294,10 @@ static void sld_state_show(void)
" from non-WB" : "");
}
break;
+   case sld_ratelimit:
+   if (boot_cpu_has(X86_FEATURE_BUS_LOCK_DETECT))
+   pr_info("#DB: setting system wide bus lock rate limit to %u/sec\n", bld_ratelimit.burst);
+   break;
}
 }
 
-- 
2.31.1
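
The throttling behaviour described in the commit message (allow up to N bus locks per second, otherwise sleep in 20 ms steps until the rate drops) can be modelled in userspace. The sketch below is a simplified fixed-window limiter with a simulated clock. The struct and function names are hypothetical, and it is not the kernel's __ratelimit() implementation, which works on jiffies and takes a lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified userspace model of a fixed-window rate limiter. */
struct window_limit {
	long begin_ms;    /* start of the current window */
	long interval_ms; /* window length, e.g. 1000 for one second */
	int burst;        /* events allowed per window (N in ratelimit:N) */
	int used;         /* events already allowed in this window */
};

/* Return true if an event at time now_ms fits in the current budget. */
bool window_limit_allow(struct window_limit *wl, long now_ms)
{
	assert(wl != NULL && wl->interval_ms > 0);

	/* Start a fresh window once the old one has expired. */
	if (now_ms - wl->begin_ms >= wl->interval_ms) {
		wl->begin_ms = now_ms;
		wl->used = 0;
	}

	if (wl->used < wl->burst) {
		wl->used++;
		return true;
	}
	return false;
}

/*
 * Model of the throttling loop: while the budget is exhausted, "sleep"
 * for 20 ms by advancing the simulated clock. Returns the time at which
 * the event was finally allowed.
 */
long throttle_event(struct window_limit *wl, long now_ms)
{
	while (!window_limit_allow(wl, now_ms))
		now_ms += 20;
	return now_ms;
}
```

With a budget of 2 events per 1000 ms, events at 0 ms and 10 ms pass immediately, and a third event arriving at 20 ms is held until the next window opens at 1000 ms.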



[PATCH 1/4] Documentation/x86: Add buslock.rst

2021-04-19 Thread Fenghua Yu
Add buslock.rst to explain bus lock problem and how to detect and
handle it.

Signed-off-by: Fenghua Yu 
Reviewed-by: Tony Luck 
---
 Documentation/x86/buslock.rst | 110 ++
 1 file changed, 110 insertions(+)
 create mode 100644 Documentation/x86/buslock.rst

diff --git a/Documentation/x86/buslock.rst b/Documentation/x86/buslock.rst
new file mode 100644
index 000000000000..4deaf8b82338
--- /dev/null
+++ b/Documentation/x86/buslock.rst
@@ -0,0 +1,110 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===============================
+Bus lock detection and handling
+===============================
+
+:Copyright: |copy| 2021 Intel Corporation
+:Authors: - Fenghua Yu 
+  - Tony Luck 
+
+Problem
+=======
+
+A split lock is any atomic operation whose operand crosses two cache lines.
+Since the operand spans two cache lines and the operation must be atomic,
+the system locks the bus while the CPU accesses the two cache lines.
+
+A bus lock is acquired through either split locked access to writeback (WB)
+memory or any locked access to non-WB memory. This is typically thousands of
+cycles slower than an atomic operation within a cache line. It also disrupts
+performance on other cores and brings the whole system to its knees.
+
+Detection
+=========
+
+Intel processors may support either or both of the following hardware
+mechanisms to detect split locks and bus locks.
+
+#AC exception for split lock detection
+--------------------------------------
+
+Beginning with the Tremont Atom CPU, split lock operations may raise an
+Alignment Check (#AC) exception when a split lock operation is attempted.
+
+#DB exception for bus lock detection
+------------------------------------
+
+Some CPUs have the ability to notify the kernel by a #DB trap after a user
+instruction acquires a bus lock and is executed. This allows the kernel
+to enforce user application throttling or mitigation.
+
+Software handling
+=================
+
+The kernel #AC and #DB handlers handle bus lock based on kernel parameter
+"split_lock_detect". Here is a summary of different options:
+
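++------------------+----------------------------+-----------------------+
+|split_lock_detect=|#AC for split lock          |#DB for bus lock       |
++------------------+----------------------------+-----------------------+
+|off               |Do nothing                  |Do nothing             |
++------------------+----------------------------+-----------------------+
+|warn              |Kernel OOPs                 |Warn once per task and |
+|(default)         |Warn once per task and      |continue to run.       |
+|                  |disable future checking     |                       |
+|                  |When both features are      |                       |
+|                  |supported, warn in #AC      |                       |
++------------------+----------------------------+-----------------------+
+|fatal             |Kernel OOPs                 |Send SIGBUS to user.   |
+|                  |Send SIGBUS to user         |                       |
+|                  |When both features are      |                       |
+|                  |supported, fatal in #AC     |                       |
++------------------+----------------------------+-----------------------+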
+
+Usages
+======
+
+Detecting and handling bus locks may find usages in various areas:
+
+It is critical for real time system designers who build consolidated real
+time systems. These systems run hard real time code on some cores and
+run "untrusted" user processes on some other cores. The hard real time
+code cannot afford to have any bus lock from the untrusted processes hurt
+real time performance. To date the designers have been unable to deploy
+these solutions as they have no way to prevent the "untrusted" user code
+from generating split locks and bus locks that block the hard real time
+code from accessing memory during bus locking.
+
+It may also find usage in the cloud. A user process with bus locks running
+in one guest can block other cores from accessing shared memory.
+
+Bus locks may open a security hole where malicious user code may slow
+down the overall system by executing instructions with bus locks.
+
+
+Guidance
+========
+off
+---
+
+Disable checking for split lock and bus lock. This option may be
+useful if there are legacy applications that trigger these events
+at a low rate so that mitigation is not needed.
+
+warn
+----
+
+A warning is issued for the bus lock so that it can be found and fixed.
+This is the default behavior.
+
+It may be useful to find and fix bus locks. The warning information has
+the process id and faulting instruction address to help pinpoint the bus
+lock and fix it.
+
+fatal
+-----
+
+In this case, the bus lock is not tolerated and the process is killed.
+
+It is useful in hard real time system.
-- 
2.31.1



[PATCH 0/4] x86/bus_lock: Set rate limit for bus lock

2021-04-19 Thread Fenghua Yu
Bus lock warn and fatal handling is in tip. This series sets system
wide bus lock rate limit to throttle malicious code.

This series is applied on top of tip master branch.

Change Log:
-Set system wide rate limit instead of per-user rate limit (Thomas).
-Thomas suggested to split the previous bus lock into warn and fatal
patch set and this rate limit patch set:
https://lore.kernel.org/lkml/871rca6dbp@nanos.tec.linutronix.de/

Fenghua Yu (4):
  Documentation/x86: Add buslock.rst
  x86/bus_lock: Set rate limit for bus lock
  Documentation/admin-guide: Change doc for bus lock ratelimit
  Documentation/x86: Add ratelimit in buslock.rst

 .../admin-guide/kernel-parameters.txt |   8 ++
 Documentation/x86/buslock.rst | 133 ++
 arch/x86/kernel/cpu/intel.c   |  43 +-
 3 files changed, 182 insertions(+), 2 deletions(-)
 create mode 100644 Documentation/x86/buslock.rst

-- 
2.31.1



Re: [PATCH v5 2/3] x86/bus_lock: Handle #DB for bus lock

2021-04-13 Thread Fenghua Yu
Hi, Thomas,

On Mon, Apr 12, 2021 at 09:15:08AM +0200, Thomas Gleixner wrote:
> On Sat, Apr 03 2021 at 01:04, Fenghua Yu wrote:
> > On Sat, Mar 20, 2021 at 01:42:52PM +0100, Thomas Gleixner wrote:
> >> On Fri, Mar 19 2021 at 22:19, Fenghua Yu wrote:
> >> And even with throttling the injection rate further down to 25k per
> >> second the impact on the workload is still significant in the 10% range.
> > So I can change the ratelimit to system wide and call usleep_range()
> > to sleep: 
> >while (!__ratelimit(_bld_ratelimit))
> >usleep_range(100 / bld_ratelimit,
> > 100 / bld_ratelimit);
> >
> > The max bld_ratelimit is 1000,000/s because the max sleeping time is 1
> > usec.
> 
> Maximum sleep time is 1usec?
> 
> > The min bld_ratelimit is 1/s.
> 
> Again. This does not make sense at all. 1Mio bus lock events per second
> are way beyond the point where the machine does anything else than being
> stuck in buslocks.
> 
> Aside of that why are you trying to make this throttling in any way
> accurate? It does not matter at all, really. Limit reached, put it to
> sleep for some time and be done with it. No point in trying to be clever
> for no value.

Is it OK to restrict bld_ratelimit to between 1 and 1,000 bus locks/sec?

Can I do the throttling like this?

   /* Enforce no more than bld_ratelimit bus locks/sec. */
   while (!__ratelimit(_bld_ratelimit))
   msleep(10);

On one machine, if bld_ratelimit=1,000, that's about 5msec for a busy
bus lock loop, i.e. bus is locked for about 5msec and then the process
sleeps for 10msec and thus won't generate any bus lock.
"dd" command running on other cores doesn't have noticeable degradation
with bld_ratelimit=1,000.

Thanks.

-Fenghua


Re: [PATCH v2] selftests/resctrl: Change a few printed messages

2021-04-07 Thread Fenghua Yu
Hi, Shuah,

On Wed, Apr 07, 2021 at 04:46:38PM -0600, Shuah Khan wrote:
> On 4/7/21 1:57 PM, Fenghua Yu wrote:
> > Change a few printed messages to report test progress more clearly.
> Thank you. Applied to linux-kseftest next branch for 5.13-rc1

Great! I pulled the next branch and tested the patch. It works fine.

BTW, as said in the cover letter of the series, there will be two
new patch sets to fix a few other resctrl selftest issues on top of
this series. I will send them out in the coming weeks. Hopefully
you will push them to 5.14:)

Thank you very much for your help!

-Fenghua


[PATCH v2] selftests/resctrl: Change a few printed messages

2021-04-07 Thread Fenghua Yu
Change a few printed messages to report test progress more clearly.

Add a missing "\n" at the end of one printed message.

Suggested-by: Shuah Khan 
Signed-off-by: Fenghua Yu 
---
Change log:
v2:
- Add "Pass:" and "Fail:" sub-strings back (Shuah).

This is a follow-up patch of recent resctrl selftest patches and can be
applied cleanly to:
git git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest.git
branch next.

 tools/testing/selftests/resctrl/cache.c | 2 +-
 tools/testing/selftests/resctrl/mba_test.c  | 6 +++---
 tools/testing/selftests/resctrl/mbm_test.c  | 2 +-
 tools/testing/selftests/resctrl/resctrlfs.c | 4 ++--
 4 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/tools/testing/selftests/resctrl/cache.c b/tools/testing/selftests/resctrl/cache.c
index 362e3a418caa..68ff856d36f0 100644
--- a/tools/testing/selftests/resctrl/cache.c
+++ b/tools/testing/selftests/resctrl/cache.c
@@ -301,7 +301,7 @@ int show_cache_info(unsigned long sum_llc_val, int no_of_bits,
ret = platform && abs((int)diff_percent) > max_diff_percent &&
  (cmt ? (abs(avg_diff) > max_diff) : true);
 
-   ksft_print_msg("%s cache miss rate within %d%%\n",
+   ksft_print_msg("%s Check cache miss rate within %d%%\n",
   ret ? "Fail:" : "Pass:", max_diff_percent);
 
ksft_print_msg("Percent diff=%d\n", abs((int)diff_percent));
diff --git a/tools/testing/selftests/resctrl/mba_test.c b/tools/testing/selftests/resctrl/mba_test.c
index 26f12ad4c663..1a1bdb6180cf 100644
--- a/tools/testing/selftests/resctrl/mba_test.c
+++ b/tools/testing/selftests/resctrl/mba_test.c
@@ -80,7 +80,7 @@ static void show_mba_info(unsigned long *bw_imc, unsigned long *bw_resc)
avg_diff = (float)labs(avg_bw_resc - avg_bw_imc) / avg_bw_imc;
avg_diff_per = (int)(avg_diff * 100);
 
-   ksft_print_msg("%s MBA: diff within %d%% for schemata %u\n",
+   ksft_print_msg("%s Check MBA diff within %d%% for schemata %u\n",
   avg_diff_per > MAX_DIFF_PERCENT ?
   "Fail:" : "Pass:",
   MAX_DIFF_PERCENT,
@@ -93,10 +93,10 @@ static void show_mba_info(unsigned long *bw_imc, unsigned long *bw_resc)
failed = true;
}
 
-   ksft_print_msg("%s schemata change using MBA\n",
+   ksft_print_msg("%s Check schemata change using MBA\n",
   failed ? "Fail:" : "Pass:");
if (failed)
-   ksft_print_msg("At least one test failed");
+   ksft_print_msg("At least one test failed\n");
 }
 
 static int check_results(void)
diff --git a/tools/testing/selftests/resctrl/mbm_test.c b/tools/testing/selftests/resctrl/mbm_test.c
index 02b1ed03f1e5..8392e5c55ed0 100644
--- a/tools/testing/selftests/resctrl/mbm_test.c
+++ b/tools/testing/selftests/resctrl/mbm_test.c
@@ -37,7 +37,7 @@ show_bw_info(unsigned long *bw_imc, unsigned long *bw_resc, int span)
avg_diff_per = (int)(avg_diff * 100);
 
ret = avg_diff_per > MAX_DIFF_PERCENT;
-   ksft_print_msg("%s MBM: diff within %d%%\n",
+   ksft_print_msg("%s Check MBM diff within %d%%\n",
   ret ? "Fail:" : "Pass:", MAX_DIFF_PERCENT);
ksft_print_msg("avg_diff_per: %d%%\n", avg_diff_per);
ksft_print_msg("Span (MB): %d\n", span);
diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c
index ade5f2b8b843..5f5a166ade60 100644
--- a/tools/testing/selftests/resctrl/resctrlfs.c
+++ b/tools/testing/selftests/resctrl/resctrlfs.c
@@ -570,14 +570,14 @@ bool check_resctrlfs_support(void)
 
fclose(inf);
 
-   ksft_print_msg("%s kernel supports resctrl filesystem\n",
+   ksft_print_msg("%s Check kernel supports resctrl filesystem\n",
   ret ? "Pass:" : "Fail:");
 
if (!ret)
return ret;
 
dp = opendir(RESCTRL_PATH);
-   ksft_print_msg("%s resctrl mountpoint \"%s\" exists\n",
+   ksft_print_msg("%s Check resctrl mountpoint \"%s\" exists\n",
   dp ? "Pass:" : "Fail:", RESCTRL_PATH);
if (dp)
closedir(dp);
-- 
2.31.1
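
The message shape this patch settles on, a "Pass:" or "Fail:" prefix followed by "Check ...", can be sketched with a small userspace helper. The name format_check_msg is hypothetical; the selftests themselves call ksft_print_msg() directly, as the diff shows.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Build a "<Pass:|Fail:> Check <what>" line in buf and return the
 * length snprintf() reports. Hypothetical helper for illustration.
 */
int format_check_msg(char *buf, size_t len, int failed, const char *what)
{
	assert(buf != NULL && what != NULL);

	return snprintf(buf, len, "%s Check %s\n",
			failed ? "Fail:" : "Pass:", what);
}
```

Keeping the prefix first means the verdict is visible at the start of each line, which is the format Shuah asked for in review.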



Re: [PATCH] selftests/resctrl: Change a few printed messages

2021-04-07 Thread Fenghua Yu
Hi, Shuah,

On Wed, Apr 07, 2021 at 08:33:23AM -0600, Shuah Khan wrote:
> On 4/5/21 6:52 PM, Fenghua Yu wrote:
> > -   ksft_print_msg("%s cache miss rate within %d%%\n",
> > -  ret ? "Fail:" : "Pass:", max_diff_percent);
> > +   ksft_print_msg("Check cache miss rate within %d%%\n", max_diff_percent);
> 
> You need %s and pass in the ret ? "Fail:" : "Pass:" result for the
> message to read correctly.

Should I keep the ":" after "Pass"/"Fail"?

> 
> I am seeing:
> 
> # Check kernel support for resctrl filesystem
> 
> It should say the following:
> 
> # Fail Check kernel support for resctrl filesystem

i.e. should the printed messages be like the following?
# Fail: Check kernel support for resctrl filesystem
or
# Pass: Check kernel support for resctrl filesystem

Thanks.

-Fenghua


[PATCH] selftests/resctrl: Change a few printed messages

2021-04-05 Thread Fenghua Yu
A few printed messages contain pass/fail strings which should be shown
in test results. Remove the pass/fail strings in the messages to avoid
confusion.

Add "\n" at the end of one printed message.

Suggested-by: Shuah Khan 
Signed-off-by: Fenghua Yu 
---
This is a follow-up patch of recent resctrl selftest patches and can be
applied cleanly to:
git git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest.git
branch next.

 tools/testing/selftests/resctrl/cache.c | 3 +--
 tools/testing/selftests/resctrl/mba_test.c  | 9 +++--
 tools/testing/selftests/resctrl/mbm_test.c  | 3 +--
 tools/testing/selftests/resctrl/resctrlfs.c | 7 ++-
 4 files changed, 7 insertions(+), 15 deletions(-)

diff --git a/tools/testing/selftests/resctrl/cache.c b/tools/testing/selftests/resctrl/cache.c
index 362e3a418caa..310bbc997c60 100644
--- a/tools/testing/selftests/resctrl/cache.c
+++ b/tools/testing/selftests/resctrl/cache.c
@@ -301,8 +301,7 @@ int show_cache_info(unsigned long sum_llc_val, int no_of_bits,
ret = platform && abs((int)diff_percent) > max_diff_percent &&
  (cmt ? (abs(avg_diff) > max_diff) : true);
 
-   ksft_print_msg("%s cache miss rate within %d%%\n",
-  ret ? "Fail:" : "Pass:", max_diff_percent);
+   ksft_print_msg("Check cache miss rate within %d%%\n", max_diff_percent);
 
ksft_print_msg("Percent diff=%d\n", abs((int)diff_percent));
ksft_print_msg("Number of bits: %d\n", no_of_bits);
diff --git a/tools/testing/selftests/resctrl/mba_test.c b/tools/testing/selftests/resctrl/mba_test.c
index 26f12ad4c663..a909a745754f 100644
--- a/tools/testing/selftests/resctrl/mba_test.c
+++ b/tools/testing/selftests/resctrl/mba_test.c
@@ -80,9 +80,7 @@ static void show_mba_info(unsigned long *bw_imc, unsigned long *bw_resc)
avg_diff = (float)labs(avg_bw_resc - avg_bw_imc) / avg_bw_imc;
avg_diff_per = (int)(avg_diff * 100);
 
-   ksft_print_msg("%s MBA: diff within %d%% for schemata %u\n",
-  avg_diff_per > MAX_DIFF_PERCENT ?
-  "Fail:" : "Pass:",
+   ksft_print_msg("Check MBA diff within %d%% for schemata %u\n",
   MAX_DIFF_PERCENT,
   ALLOCATION_MAX - ALLOCATION_STEP * allocation);
 
@@ -93,10 +91,9 @@ static void show_mba_info(unsigned long *bw_imc, unsigned long *bw_resc)
failed = true;
}
 
-   ksft_print_msg("%s schemata change using MBA\n",
-  failed ? "Fail:" : "Pass:");
+   ksft_print_msg("Check schemata change using MBA\n");
if (failed)
-   ksft_print_msg("At least one test failed");
+   ksft_print_msg("At least one test failed\n");
 }
 
 static int check_results(void)
diff --git a/tools/testing/selftests/resctrl/mbm_test.c b/tools/testing/selftests/resctrl/mbm_test.c
index 02b1ed03f1e5..e2e7ee4ec630 100644
--- a/tools/testing/selftests/resctrl/mbm_test.c
+++ b/tools/testing/selftests/resctrl/mbm_test.c
@@ -37,8 +37,7 @@ show_bw_info(unsigned long *bw_imc, unsigned long *bw_resc, int span)
avg_diff_per = (int)(avg_diff * 100);
 
ret = avg_diff_per > MAX_DIFF_PERCENT;
-   ksft_print_msg("%s MBM: diff within %d%%\n",
-  ret ? "Fail:" : "Pass:", MAX_DIFF_PERCENT);
+   ksft_print_msg("Check MBM diff within %d%%\n", MAX_DIFF_PERCENT);
ksft_print_msg("avg_diff_per: %d%%\n", avg_diff_per);
ksft_print_msg("Span (MB): %d\n", span);
ksft_print_msg("avg_bw_imc: %lu\n", avg_bw_imc);
diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c
index ade5f2b8b843..91cb3c48a7da 100644
--- a/tools/testing/selftests/resctrl/resctrlfs.c
+++ b/tools/testing/selftests/resctrl/resctrlfs.c
@@ -570,15 +570,12 @@ bool check_resctrlfs_support(void)
 
fclose(inf);
 
-   ksft_print_msg("%s kernel supports resctrl filesystem\n",
-  ret ? "Pass:" : "Fail:");
-
+   ksft_print_msg("Check kernel support for resctrl filesystem\n");
if (!ret)
return ret;
 
dp = opendir(RESCTRL_PATH);
-   ksft_print_msg("%s resctrl mountpoint \"%s\" exists\n",
-  dp ? "Pass:" : "Fail:", RESCTRL_PATH);
+   ksft_print_msg("Check resctrl mountpoint \"%s\"\n", RESCTRL_PATH);
if (dp)
closedir(dp);
 
-- 
2.31.1



Re: [PATCH v5 2/3] x86/bus_lock: Handle #DB for bus lock

2021-04-02 Thread Fenghua Yu
Hi, Thomas,

On Sat, Mar 20, 2021 at 01:42:52PM +0100, Thomas Gleixner wrote:
> On Fri, Mar 19 2021 at 22:19, Fenghua Yu wrote:
> > On Fri, Mar 19, 2021 at 10:30:50PM +0100, Thomas Gleixner wrote:
> >> > +if (sscanf(arg, "ratelimit:%d", ) == 1 && ratelimit > 
> >> > 0) {
> >> > +bld_ratelimit = ratelimit;
> >> 
> >> So any rate up to INTMAX/s is valid here, right?
> >
> > Yes. I don't see smaller limitation than INTMX/s. Is that right?
> 
> That's a given, but what's the point of limits in that range?
> 
> A buslock access locks up the system for X cycles. So the total amount
> of allowable damage in cycles per second is:
> 
>limit * stall_cycles_per_bus_lock
> 
> ergo the time (in seconds) which the system is locked up is:
> 
>limit * stall_cycles_per_bus_lock / cpufreq
> 
> Which means for ~INTMAX/2 on a 2 GHz CPU:
> 
>2 * 10^9 * $CYCLES  / 2 * 10^9  = $CYCLES seconds
> 
> Assumed the inflicted damage is only 1 cycle then #LOCK is pretty much
> permanently on if there are enough threads. Sure #DB will slow them
> down, but it still does not make any sense at all especially as the
> damage is certainly greater than a single cycle.
> 
> And because the changelogs and the docs are void of numbers I just got
> real numbers myself.
> 
> With a single thread doing a 'lock inc *mem' accross a cache line
> boundary the workload which I measured with perf stat goes from:
> 
>  5,940,985,091  instructions  #0.88  insn per cycle   
>   
>2.780950806 seconds time elapsed
>0.99848 seconds user
>4.202137000 seconds sys
> to
> 
>  7,467,979,504  instructions  #0.10  insn per cycle   
>   
>5.110795917 seconds time elapsed
>7.123499000 seconds user
>   37.266852000 seconds sys
> 
> The buslock injection rate is ~250k per second.
> 
> Even if I ratelimit the locked inc by a delay loop of ~5000 cycles
> which is probably more than what the #DB will cost then this single task
> still impacts the workload significantly:
> 
>  6,496,994,537  instructions  #0.39  insn per cycle   
>   
>3.043275473 seconds time elapsed
>1.899852000 seconds user
>8.957088000 seconds sys
> 
> The buslock injection rate is down to ~150k per second in this case.
> 
> And even with throttling the injection rate further down to 25k per
> second the impact on the workload is still significant in the 10% range.

Thank you for your insight!

So I can change the ratelimit to system wide and call usleep_range()
to sleep: 
   while (!__ratelimit(_bld_ratelimit))
   usleep_range(1000000 / bld_ratelimit,
                1000000 / bld_ratelimit);

The max bld_ratelimit is 1,000,000/s because the max sleeping time is 1 usec.
The min bld_ratelimit is 1/s.

> 
> And of course the documentation of the ratelimit parameter explains all
> of this in great detail so the administrator has a trivial job to tune
> that, right?

I will explain how to tune the parameter in buslock.rst doc.

> 
> >> > +case sld_ratelimit:
> >> > +/* Enforce no more than bld_ratelimit bus locks/sec. */
> >> > +while (!__ratelimit(_current_user()->bld_ratelimit))
> >> > +msleep(1000 / bld_ratelimit);
> 
> For any ratelimit > 1000 this will loop up to 1000 times with
> CONFIG_HZ=1000.
> 
> Assume that the buslock producer has tons of threads which all end up
> here pretty soon then you launch a mass wakeup in the worst case every
> jiffy. Are you sure that the cure is better than the disease?

If I use usleep_range() to sleep, the threads will not sleep and wake up,
right? Maybe I can use msleep() for msec (bigger bld_ratelimit) and
usleep_range() for usec (smaller bld_ratelimit)?

Even if there is mass wakeup, throttling the threads can avoid the system
wide performance degradation (e.g. 7x slower dd command in another user).
Is that a good justification for throttling the threads?

> 
> > If I split this whole patch set into two patch sets:
> > 1. Three patches in the first patch set: the enumeration patch, the warn
> >and fatal patch, and the documentation patch.
> > 2. Two patches in the second patch set: the ratelimit patch and the
> >documentation patch.
> >
> > Then I will send the two patch sets separately, you will accept them one
> > by one. Is that OK?
> 
> That's obviously the right thing to do because #1 should be ready and we
> can sort out #2 seperately. See the conversation with Tony.

Thank you for picking up the first patch set!

-Fenghua


Re: [PATCH v5 2/3] x86/bus_lock: Handle #DB for bus lock

2021-04-02 Thread Fenghua Yu
Hi, Thomas,

On Sat, Mar 20, 2021 at 02:57:52PM +0100, Thomas Gleixner wrote:
> On Sat, Mar 20 2021 at 02:01, Thomas Gleixner wrote:
> 
> > On Fri, Mar 19 2021 at 21:50, Tony Luck wrote:
> >>>  What is the justifucation for making this rate limit per UID and not
> >>>  per task, per process or systemwide?
> >>
> >> The concern is that a malicious user is running a workload that loops
> >> obtaining the buslock. This brings the whole system to its knees.
> >>
> >> Limiting per task doesn't help. The user can just fork(2) a whole bunch
> >> of tasks for a distributed buslock attack..
> >
> > Fair enough.
> >
> >> Systemwide might be an interesting alternative. Downside would be 
> >> accidental
> >> rate limit of non-malicious tasks that happen to grab a bus lock 
> >> periodically
> >> but in the same window with other buslocks from other users.
> >>
> >> Do you think that a risk worth taking to make the code simpler?
> >
> > I'd consider it low risk, but I just looked for the usage of the
> > existing ratelimit in struct user and the related commit. Nw it's dawns
> > on me where you are coming from.
> 
> So after getting real numbers myself, I have more thoughts on
> this. Setting a reasonable per user limit might be hard when you want to
> protect e.g. against an orchestrated effort by several users
> (containers...). If each of them stays under the limit which is easy
> enough to figure out then you still end up with significant accumulated
> damage.
> 
> So systemwide might not be the worst option after all.

Indeed.

> 
> The question is how wide spread are bus locks in existing applications?
> I haven't found any on a dozen machines with random variants of
> workloads so far according to perf ... -e sq_misc.split_lock.

We have been running various tests widely inside Intel (and also outside)
after enabling split lock and captured a few split lock issues in firmware,
kernel, drivers, and apps. As you know, we have submitted a few patches to
fix the split lock issues in the kernel and drivers (e.g. split lock
in bit ops) and fixed a few split lock issues in firmware.

But so far I'm not aware of any split lock issues in user space yet.
I guess compilers do a good job of cache line alignment to avoid this
issue. But inline asm in user apps can easily hit this issue (on purpose).

> 
> What's the actual scenario in the real world where a buslock access
> might be legitimate?

I did a simple experiment: looping on a split locked instruction on
one core in one user can slow down "dd" command running on another core
in another user by 7 times. A malicious user can do similar things to
slow down the whole system performance, right?

> 
> And what's the advice, recommendation for a system administrator how to
> analyze the situation and what kind of parameter to set?
> 
> I tried to get answers from Documentation/x86/buslock.rst, but 

Can I change the sleep code in the handle_bus_lock() to the following?

   while (!__ratelimit(_bld_ratelimit))
   usleep_range(1000000 / bld_ratelimit,
                1000000 / bld_ratelimit);

Maybe the system wide bus lock ratelimit can be set to a default value of
1,000,000/s, which is also the max ratelimit value.

The max sleep in the kernel is 1 us which means max bld_ratelimit
can be up to 1,000,000.

If the system administrator thinks bus locks are less tolerable and wants
to throttle bus locks further, bld_ratelimit can be set to a smaller number.
The smallest bld_ratelimit is 1.

When I gradually decrease the bld_ratelimit value, I can see that fewer bus
locks can be issued per second systemwide and the "dd" command or other
memory benchmarks are less impacted by the bus locks.

If this works, I will have the buslock.rst doc to explain the situation
and how to set the parameter.

Thanks.

-Fenghua


Re: [PATCH v6 00/21] Miscellaneous fixes for resctrl selftests

2021-04-02 Thread Fenghua Yu
On Fri, Apr 02, 2021 at 02:04:16PM -0600, Shuah Khan wrote:
> On 4/2/21 12:18 PM, Fenghua Yu wrote:
> > Hi, Shuah,
> > 
> > On Fri, Apr 02, 2021 at 12:17:17PM -0600, Shuah Khan wrote:
> > > On 3/26/21 1:45 PM, Fenghua Yu wrote:
> > > > Hi, Shuah,
> > > > 
> > > > On Wed, Mar 17, 2021 at 02:22:34AM +0000, Fenghua Yu wrote:
> > > > > This patch set has several miscellaneous fixes to resctrl selftest 
> > > > > tool
> > > > > that are easily visible to user. V1 had fixes to CAT test and CMT test
> > > > > but they were dropped in V2 because having them here made the patchset
> > > > > humongous. So, changes to CAT test and CMT test will be posted in 
> > > > > another
> > > > > patchset.
> > > > > 
> > > > > Change Log:
> > > > > v6:
> > > > > - Add Tested-by: Babu Moger .
> > > > > - Replace "cat" by CAT_STR etc (Babu).
> > > > > - Capitalize the first letter of printed message (Babu).
> > > > 
> > > > Any comment on this series? Will you push it into linux-kselftest.git?
> > > > 
> > > Yes. Will apply for 5.13-rc1
> > 
> > Great! Thank you very much for your help!
> > 
> 
> Done. Now applied to linux-selftest next.
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest.git/
> next
> 
> Ran sanity test and suggested a change in message for 12/21.
> 
> Please take a look other such messages and improve them as well
> and send follow-on patches.

Sure. Will send a follow-on patch to change the messages.

Thanks.

-Fenghua


Re: [PATCH v6 00/21] Miscellaneous fixes for resctrl selftests

2021-04-02 Thread Fenghua Yu
Hi, Shuah,

On Fri, Apr 02, 2021 at 12:17:17PM -0600, Shuah Khan wrote:
> On 3/26/21 1:45 PM, Fenghua Yu wrote:
> > Hi, Shuah,
> > 
> > On Wed, Mar 17, 2021 at 02:22:34AM +, Fenghua Yu wrote:
> > > This patch set has several miscellaneous fixes to resctrl selftest tool
> > > that are easily visible to user. V1 had fixes to CAT test and CMT test
> > > but they were dropped in V2 because having them here made the patchset
> > > humongous. So, changes to CAT test and CMT test will be posted in another
> > > patchset.
> > > 
> > > Change Log:
> > > v6:
> > > - Add Tested-by: Babu Moger .
> > > - Replace "cat" by CAT_STR etc (Babu).
> > > - Capitalize the first letter of printed message (Babu).
> > 
> > Any comment on this series? Will you push it into linux-kselftest.git?
> > 
> Yes. Will apply for 5.13-rc1

Great! Thank you very much for your help!

-Fenghua


[tip: x86/splitlock] Documentation/admin-guide: Change doc for split_lock_detect parameter

2021-03-28 Thread tip-bot2 for Fenghua Yu
The following commit has been merged into the x86/splitlock branch of tip:

Commit-ID: ebca17707e38f2050b188d837bd4646b29a1b0c2
Gitweb:
https://git.kernel.org/tip/ebca17707e38f2050b188d837bd4646b29a1b0c2
Author:Fenghua Yu 
AuthorDate:Mon, 22 Mar 2021 13:53:25 
Committer: Thomas Gleixner 
CommitterDate: Sun, 28 Mar 2021 22:52:16 +02:00

Documentation/admin-guide: Change doc for split_lock_detect parameter

Since #DB for bus lock detect changes the split_lock_detect parameter,
update the documentation for the changes.

Signed-off-by: Fenghua Yu 
Signed-off-by: Thomas Gleixner 
Reviewed-by: Tony Luck 
Acked-by: Randy Dunlap 
Link: https://lore.kernel.org/r/20210322135325.682257-4-fenghua...@intel.com

---
 Documentation/admin-guide/kernel-parameters.txt | 22 +++-
 1 file changed, 16 insertions(+), 6 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 0454572..aef927c 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -5100,27 +5100,37 @@
spia_peddr=
 
split_lock_detect=
-   [X86] Enable split lock detection
+   [X86] Enable split lock detection or bus lock detection
 
When enabled (and if hardware support is present), atomic
instructions that access data across cache line
-   boundaries will result in an alignment check exception.
+   boundaries will result in an alignment check exception
+   for split lock detection or a debug exception for
+   bus lock detection.
 
off - not enabled
 
-   warn- the kernel will emit rate limited warnings
+   warn- the kernel will emit rate-limited warnings
  about applications triggering the #AC
- exception. This mode is the default on CPUs
- that supports split lock detection.
+ exception or the #DB exception. This mode is
+ the default on CPUs that support split lock
+ detection or bus lock detection. Default
+ behavior is by #AC if both features are
+ enabled in hardware.
 
fatal   - the kernel will send SIGBUS to applications
- that trigger the #AC exception.
+ that trigger the #AC exception or the #DB
+ exception. Default behavior is by #AC if
+ both features are enabled in hardware.
 
If an #AC exception is hit in the kernel or in
firmware (i.e. not while executing in user mode)
the kernel will oops in either "warn" or "fatal"
mode.
 
+   #DB exception for bus lock is triggered only when
+   CPL > 0.
+
srbds=  [X86,INTEL]
Control the Special Register Buffer Data Sampling
(SRBDS) mitigation.


[tip: x86/splitlock] x86/traps: Handle #DB for bus lock

2021-03-28 Thread tip-bot2 for Fenghua Yu
The following commit has been merged into the x86/splitlock branch of tip:

Commit-ID: ebb1064e7c2e90b56e4d40ab154ef9796060a1c3
Gitweb:
https://git.kernel.org/tip/ebb1064e7c2e90b56e4d40ab154ef9796060a1c3
Author:Fenghua Yu 
AuthorDate:Mon, 22 Mar 2021 13:53:24 
Committer: Thomas Gleixner 
CommitterDate: Sun, 28 Mar 2021 22:52:15 +02:00

x86/traps: Handle #DB for bus lock

Bus locks degrade performance for the whole system, not just for the CPU
that requested the bus lock. Two CPU features "#AC for split lock" and
"#DB for bus lock" provide hooks so that the operating system may choose
one of several mitigation strategies.

#AC for split lock is already implemented. Add code to use the #DB for
bus lock feature to cover additional situations with new options to
mitigate.

split_lock_detect=
                #AC for split lock              #DB for bus lock

off             Do nothing                      Do nothing

warn            Kernel OOPs                     Warn once per task and
                Warn once per task and          continue to run.
                disable future checking
                When both features are
                supported, warn in #AC

fatal           Kernel OOPs                     Send SIGBUS to user.
                Send SIGBUS to user
                When both features are
                supported, fatal in #AC

ratelimit:N     Do nothing                      Limit bus lock rate to
                                                N per second in the
                                                current non-root user.

Default option is "warn".

Hardware only generates #DB for bus lock detect when CPL>0 to avoid
nested #DB from multiple bus locks while the first #DB is being handled.
So no need to handle #DB for bus lock detected in the kernel.

#DB for bus lock is enabled by bus lock detection bit 2 in DEBUGCTL MSR
while #AC for split lock is enabled by split lock detection bit 29 in
TEST_CTRL MSR.
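
The two enable bits named above can be illustrated with a small user-space
sketch. It is not kernel code: the macro and helper names below are
hypothetical, and the helpers operate on plain integers rather than on real
MSRs. Only the bit positions (2 and 29) come from the text.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as stated in the commit message. */
#define DEBUGCTL_BUS_LOCK_DETECT_BIT	2	/* in DEBUGCTL MSR */
#define TEST_CTRL_SPLIT_LOCK_DETECT_BIT	29	/* in TEST_CTRL MSR */

/* Hypothetical helper: return the DEBUGCTL value with bus lock detect set. */
uint64_t enable_bus_lock_detect(uint64_t debugctl)
{
	return debugctl | (UINT64_C(1) << DEBUGCTL_BUS_LOCK_DETECT_BIT);
}

/* Hypothetical helper: is #DB for bus lock enabled in this DEBUGCTL value? */
bool bus_lock_detect_enabled(uint64_t debugctl)
{
	return (debugctl >> DEBUGCTL_BUS_LOCK_DETECT_BIT) & 1;
}

/* Hypothetical helper: is #AC for split lock enabled in this TEST_CTRL value? */
bool split_lock_detect_enabled(uint64_t test_ctrl)
{
	return (test_ctrl >> TEST_CTRL_SPLIT_LOCK_DETECT_BIT) & 1;
}
```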

Both breakpoint and bus lock in the same instruction can trigger one #DB.
The bus lock is handled before the breakpoint in the #DB handler.

Delivery of #DB for bus lock in userspace clears DR6[11], which is set by
the #DB handler right after reading DR6.

Signed-off-by: Fenghua Yu 
Signed-off-by: Thomas Gleixner 
Reviewed-by: Tony Luck 
Link: https://lore.kernel.org/r/20210322135325.682257-3-fenghua...@intel.com

---
 arch/x86/include/asm/cpu.h   |   7 +-
 arch/x86/include/asm/msr-index.h |   1 +-
 arch/x86/include/uapi/asm/debugreg.h |   1 +-
 arch/x86/kernel/cpu/common.c |   2 +-
 arch/x86/kernel/cpu/intel.c  | 111 +-
 arch/x86/kernel/traps.c  |   4 +-
 6 files changed, 104 insertions(+), 22 deletions(-)

diff --git a/arch/x86/include/asm/cpu.h b/arch/x86/include/asm/cpu.h
index da78ccb..0d7fc0e 100644
--- a/arch/x86/include/asm/cpu.h
+++ b/arch/x86/include/asm/cpu.h
@@ -41,12 +41,13 @@ unsigned int x86_family(unsigned int sig);
 unsigned int x86_model(unsigned int sig);
 unsigned int x86_stepping(unsigned int sig);
 #ifdef CONFIG_CPU_SUP_INTEL
-extern void __init cpu_set_core_cap_bits(struct cpuinfo_x86 *c);
+extern void __init sld_setup(struct cpuinfo_x86 *c);
 extern void switch_to_sld(unsigned long tifn);
 extern bool handle_user_split_lock(struct pt_regs *regs, long error_code);
 extern bool handle_guest_split_lock(unsigned long ip);
+extern void handle_bus_lock(struct pt_regs *regs);
 #else
-static inline void __init cpu_set_core_cap_bits(struct cpuinfo_x86 *c) {}
+static inline void __init sld_setup(struct cpuinfo_x86 *c) {}
 static inline void switch_to_sld(unsigned long tifn) {}
 static inline bool handle_user_split_lock(struct pt_regs *regs, long error_code)
 {
@@ -57,6 +58,8 @@ static inline bool handle_guest_split_lock(unsigned long ip)
 {
return false;
 }
+
+static inline void handle_bus_lock(struct pt_regs *regs) {}
 #endif
 #ifdef CONFIG_IA32_FEAT_CTL
 void init_ia32_feat_ctl(struct cpuinfo_x86 *c);
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 546d6ec..32c496f 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -265,6 +265,7 @@
 #define DEBUGCTLMSR_LBR             (1UL <<  0) /* last branch recording */
 #define DEBUGCTLMSR_BTF_SHIFT       1
 #define DEBUGCTLMSR_BTF             (1UL <<  1) /* single-step on branches */
+#define DEBUGCTLMSR_BUS_LOCK_DETECT (1UL <<  2)
 #define DEBUGCTLMSR_TR              (1UL <<  6)
 #define DEBUGCTLMSR_BTS             (1UL <<  7)
 #define DEBUGCTLMSR_BTINT           (1UL <<  8)
diff --git a/arch/x86/include/uapi/asm/debugreg.h b/arch/x86/include/uapi/asm/debugreg.h
index d95d080..0007ba0 100644
--- a/arch/x86/include/uapi/asm/debugreg.h
+++ b/arch/x86/include/uapi/asm/debugreg.h
@@ -24,6 +24,7 @@
 #define DR_TRAP3   (0x8)   

[tip: x86/splitlock] x86/cpufeatures: Enumerate #DB for bus lock detection

2021-03-28 Thread tip-bot2 for Fenghua Yu
The following commit has been merged into the x86/splitlock branch of tip:

Commit-ID: f21d4d3b97a8603567e5d4250bd75e8ebbd520af
Gitweb:
https://git.kernel.org/tip/f21d4d3b97a8603567e5d4250bd75e8ebbd520af
Author:Fenghua Yu 
AuthorDate:Mon, 22 Mar 2021 13:53:23 
Committer: Thomas Gleixner 
CommitterDate: Sun, 28 Mar 2021 22:52:14 +02:00

x86/cpufeatures: Enumerate #DB for bus lock detection

A bus lock is acquired through either a split locked access to writeback
(WB) memory or any locked access to non-WB memory. This is typically >1000
cycles slower than an atomic operation within a cache line. It also
disrupts performance on other cores.

Some CPUs have the ability to notify the kernel by a #DB trap after a user
instruction acquires a bus lock and is executed. This allows the kernel to
enforce user application throttling or mitigation. Both breakpoint and bus
lock can trigger the #DB trap in the same instruction and the ordering of
handling them is the kernel #DB handler's choice.

The CPU feature flag to be shown in /proc/cpuinfo will be "bus_lock_detect".

Signed-off-by: Fenghua Yu 
Signed-off-by: Thomas Gleixner 
Reviewed-by: Tony Luck 
Link: https://lore.kernel.org/r/20210322135325.682257-2-fenghua...@intel.com

---
 arch/x86/include/asm/cpufeatures.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index cc96e26..faec3d9 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -354,6 +354,7 @@
 #define X86_FEATURE_AVX512_VPOPCNTDQ (16*32+14) /* POPCNT for vectors of DW/QW */
 #define X86_FEATURE_LA57             (16*32+16) /* 5-level page tables */
 #define X86_FEATURE_RDPID            (16*32+22) /* RDPID instruction */
+#define X86_FEATURE_BUS_LOCK_DETECT  (16*32+24) /* Bus Lock detect */
 #define X86_FEATURE_CLDEMOTE         (16*32+25) /* CLDEMOTE instruction */
 #define X86_FEATURE_MOVDIRI          (16*32+27) /* MOVDIRI instruction */
 #define X86_FEATURE_MOVDIR64B        (16*32+28) /* MOVDIR64B instruction */
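
The feature numbers in the hunk above are written as word*32+bit. A minimal
sketch of that decomposition follows; the macro and function names are
illustrative only and are not kernel APIs. My understanding is that word 16
in cpufeatures.h corresponds to CPUID.(EAX=7,ECX=0):ECX, but treat that
mapping as an assumption to check against the header itself.

```c
/* Same arithmetic as the definition in the hunk: word 16, bit 24. */
#define FEATURE_BUS_LOCK_DETECT	(16 * 32 + 24)

/* Index of the 32-bit capability word that holds the feature bit. */
unsigned int feature_word(unsigned int feature)
{
	return feature / 32;
}

/* Bit position of the feature inside its capability word. */
unsigned int feature_bit(unsigned int feature)
{
	return feature % 32;
}
```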


Re: [PATCH v6 00/21] Miscellaneous fixes for resctrl selftests

2021-03-26 Thread Fenghua Yu
Hi, Shuah,

On Wed, Mar 17, 2021 at 02:22:34AM +, Fenghua Yu wrote:
> This patch set has several miscellaneous fixes to resctrl selftest tool
> that are easily visible to user. V1 had fixes to CAT test and CMT test
> but they were dropped in V2 because having them here made the patchset
> humongous. So, changes to CAT test and CMT test will be posted in another
> patchset.
> 
> Change Log:
> v6:
> - Add Tested-by: Babu Moger .
> - Replace "cat" by CAT_STR etc (Babu).
> - Capitalize the first letter of printed message (Babu).

Any comment on this series? Will you push it into linux-kselftest.git?

Thanks.

-Fenghua


[PATCH v6 0/3] x86/bus_lock: Enable bus lock detection

2021-03-22 Thread Fenghua Yu
A bus lock [1] is acquired through either a split locked access to
writeback (WB) memory or any locked access to non-WB memory. This is
typically >1000 cycles slower than an atomic operation within
a cache line. It also disrupts performance on other cores.
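
As an illustration of what "split" means here, the sketch below checks
whether an access touches more than one cache line. It assumes 64-byte
cache lines, which the text does not state; the function name is
hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines. */
#define CACHE_LINE_SIZE	64u

/*
 * Return true if an access of `size` bytes starting at `addr` touches
 * more than one cache line. A locked access like this is a split lock.
 */
bool spans_cache_lines(uintptr_t addr, size_t size)
{
	uintptr_t first_line, last_line;

	if (size == 0)
		return false;

	first_line = addr / CACHE_LINE_SIZE;
	last_line = (addr + size - 1) / CACHE_LINE_SIZE;
	return first_line != last_line;
}
```

For example, an 8-byte locked access at offset 60 covers bytes 60..67 and
therefore spans two lines, while the same access at offset 56 stays within
one.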

Although split lock can be detected by #AC trap, the trap is triggered
before the instruction acquires bus lock. This makes it difficult to
mitigate bus lock (e.g. throttle the user application).

Some CPUs have the ability to notify the kernel by a #DB trap after a user
instruction acquires a bus lock and is executed. This allows the kernel
to enforce user application throttling or mitigations.

#DB for bus lock detect fixes issues in #AC for split lock detect:
1) It's architectural ... just need to look at one CPUID bit to know it
   exists
2) The IA32_DEBUGCTL MSR, which reports bus lock in #DB, is per-thread.
   So each process or guest can have different behavior.
3) It has support for VMM/guests (new VMEXIT codes, etc).
4) It detects not only split locks but also bus locks from non-WB.

Hardware only generates #DB for bus lock detect when CPL>0 to avoid
nested #DB from multiple bus locks while the first #DB is being handled.

Use the existing kernel command line parameter "split_lock_detect=" to
handle #DB for bus lock with an additional option "ratelimit:N" to set
bus lock rate limit for a user.

[1] Intel Instruction Set Extension Chapter 9:
https://software.intel.com/content/dam/develop/public/us/en/documents/architecture-instruction-set-extensions-programming-reference.pdf

Change Log:
v6:
- Split the v5 patch set into two sets. The first set (this patch set)
  only handles warn and fatal cases. The second set will handle ratelimit
  case and will be released later (Thomas).
v5 is here: https://lore.kernel.org/patchwork/cover/1394659/
 
v5:
Address all comments from Thomas:
- In the cover letter, update the latest ISE link to include the #DB
  for bus lock spec.
- In patch 1, add commit message for breakpoint and bus lock on the same
  instruction.
- In patch 2, change warn to #AC if both #AC and #DB are supported, remove
  sld and bld variables, remove bus lock checking in handle_bus_lock() etc.
- In patch 3 and 4, remove bld_ratelimit < HZ/2 check and define
  bld_ratelimit only for Intel CPUs.
- Merge patch 2 and 3 into one patch for handling warn, fatal, and
  ratelimit.
v4 is here: 
https://lore.kernel.org/lkml/20201124205245.4164633-2-fenghua...@intel.com/

v4:
- Fix a ratelimit wording issue in the doc (Randy).
- Patch 4 is acked by Randy (Randy).

v3:
- Enable Bus Lock Detection when fatal to handle bus lock from non-WB
  (PeterZ).
- Add Acked-by: PeterZ in patch 2.

v2:
- Send SIGBUS in fatal case for bus lock #DB (PeterZ).

v1:
- Check bus lock bit by its positive polarity (Xiaoyao).
- Fix a few wording issues in the documentation (Randy).
[RFC v3 can be found at: https://lore.kernel.org/patchwork/cover/1329943/]

RFC v3:
- Remove DR6_RESERVED change (PeterZ).
- Simplify the documentation (Randy).

RFC v2:
- Architecture changed based on feedback from Thomas and PeterZ. #DB is
  no longer generated for bus lock in ring0.
- Split the one single patch into four patches.
[RFC v1 can be found at: 
https://lore.kernel.org/lkml/1595021700-68460-1-git-send-email-fenghua...@intel.com/]

Fenghua Yu (3):
  x86/cpufeatures: Enumerate #DB for bus lock detection
  x86/bus_lock: Handle #DB for bus lock
  Documentation/admin-guide: Change doc for split_lock_detect parameter

 .../admin-guide/kernel-parameters.txt |  22 +++-
 arch/x86/include/asm/cpu.h|   9 +-
 arch/x86/include/asm/cpufeatures.h|   1 +
 arch/x86/include/asm/msr-index.h  |   1 +
 arch/x86/include/uapi/asm/debugreg.h  |   1 +
 arch/x86/kernel/cpu/common.c  |   2 +-
 arch/x86/kernel/cpu/intel.c   | 111 +++---
 arch/x86/kernel/traps.c   |   7 ++
 8 files changed, 126 insertions(+), 28 deletions(-)

-- 
2.31.0



[PATCH v6 3/3] Documentation/admin-guide: Change doc for split_lock_detect parameter

2021-03-22 Thread Fenghua Yu
Since #DB for bus lock detect changes the split_lock_detect parameter,
update the documentation for the changes.

Signed-off-by: Fenghua Yu 
Reviewed-by: Tony Luck 
Acked-by: Randy Dunlap 
---
Change Log:
v6:
- Remove the ratelimit info which will be released later in a ratelimit
  specific patch set (Thomas).

v5:
- Remove N < HZ/2 check info in the doc (Thomas).

v4:
- Fix a ratelimit wording issue in the doc (Randy).
- Patch 4 is acked by Randy (Randy).

v3:
- Enable Bus Lock Detection when fatal to handle bus lock from non-WB
  (PeterZ).

v1:
- Fix a few wording issues (Randy).

RFC v2:
- Simplify the documentation (Randy).

 .../admin-guide/kernel-parameters.txt | 22 ++-
 1 file changed, 16 insertions(+), 6 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 04545725f187..aef927cec602 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -5100,27 +5100,37 @@
spia_peddr=
 
split_lock_detect=
-   [X86] Enable split lock detection
+   [X86] Enable split lock detection or bus lock detection
 
When enabled (and if hardware support is present), atomic
instructions that access data across cache line
-   boundaries will result in an alignment check exception.
+   boundaries will result in an alignment check exception
+   for split lock detection or a debug exception for
+   bus lock detection.
 
off - not enabled
 
-   warn- the kernel will emit rate limited warnings
+   warn- the kernel will emit rate-limited warnings
  about applications triggering the #AC
- exception. This mode is the default on CPUs
- that supports split lock detection.
+ exception or the #DB exception. This mode is
+ the default on CPUs that support split lock
+ detection or bus lock detection. Default
+ behavior is by #AC if both features are
+ enabled in hardware.
 
fatal   - the kernel will send SIGBUS to applications
- that trigger the #AC exception.
+ that trigger the #AC exception or the #DB
+ exception. Default behavior is by #AC if
+ both features are enabled in hardware.
 
If an #AC exception is hit in the kernel or in
firmware (i.e. not while executing in user mode)
the kernel will oops in either "warn" or "fatal"
mode.
 
+   #DB exception for bus lock is triggered only when
+   CPL > 0.
+
srbds=  [X86,INTEL]
Control the Special Register Buffer Data Sampling
(SRBDS) mitigation.
-- 
2.31.0



[PATCH v6 1/3] x86/cpufeatures: Enumerate #DB for bus lock detection

2021-03-22 Thread Fenghua Yu
A bus lock is acquired through either a split locked access to
writeback (WB) memory or any locked access to non-WB memory. This is
typically >1000 cycles slower than an atomic operation within a cache
line. It also disrupts performance on other cores.

Some CPUs have the ability to notify the kernel by an #DB trap after a user
instruction acquires a bus lock and is executed. This allows the kernel
to enforce user application throttling or mitigation. Both breakpoint
and bus lock can trigger the #DB trap in the same instruction and the
ordering of handling them is the kernel #DB handler's choice.

The CPU feature flag to be shown in /proc/cpuinfo will be "bus_lock_detect".
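
A user-space check for this flag could look like the sketch below. It only
does whole-token matching on a line that the caller has already read (for
example a "flags" line from /proc/cpuinfo); the function name is
hypothetical and file handling is left to the caller.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Return true if `flag` appears as a whole whitespace-separated token in
 * `line`, so that a longer flag containing the same substring does not
 * match by accident.
 */
bool cpuinfo_line_has_flag(const char *line, const char *flag)
{
	size_t len = strlen(flag);
	const char *p = line;

	if (len == 0)
		return false;

	while ((p = strstr(p, flag)) != NULL) {
		bool starts = (p == line) || p[-1] == ' ' || p[-1] == '\t';
		char end = p[len];
		bool ends = end == '\0' || end == ' ' || end == '\n';

		if (starts && ends)
			return true;
		p++;
	}
	return false;
}
```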

Signed-off-by: Fenghua Yu 
Reviewed-by: Tony Luck 
---
Change Log:
v6:
- Fix wording issues in the commit message (Thomas).

v5:
- Add "Both breakpoint and bus lock can trigger an #DB trap..." in the
  commit message (Thomas).

 arch/x86/include/asm/cpufeatures.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index cc96e26d69f7..faec3d92d09b 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -354,6 +354,7 @@
 #define X86_FEATURE_AVX512_VPOPCNTDQ (16*32+14) /* POPCNT for vectors of DW/QW */
 #define X86_FEATURE_LA57             (16*32+16) /* 5-level page tables */
 #define X86_FEATURE_RDPID            (16*32+22) /* RDPID instruction */
+#define X86_FEATURE_BUS_LOCK_DETECT  (16*32+24) /* Bus Lock detect */
 #define X86_FEATURE_CLDEMOTE         (16*32+25) /* CLDEMOTE instruction */
 #define X86_FEATURE_MOVDIRI          (16*32+27) /* MOVDIRI instruction */
 #define X86_FEATURE_MOVDIR64B        (16*32+28) /* MOVDIR64B instruction */
-- 
2.31.0



[PATCH v6 2/3] x86/bus_lock: Handle #DB for bus lock

2021-03-22 Thread Fenghua Yu
Bus locks degrade performance for the whole system, not just for the CPU
that requested the bus lock. Two CPU features "#AC for split lock" and
"#DB for bus lock" provide hooks so that the operating system may choose
one of several mitigation strategies.

#AC for split lock is already implemented. Add code to use the #DB for
bus lock feature to cover additional situations with new options to
mitigate.

split_lock_detect=
                #AC for split lock              #DB for bus lock

off             Do nothing                      Do nothing

warn            Kernel OOPs                     Warn once per task and
                Warn once per task and          continue to run.
                disable future checking
                When both features are
                supported, warn in #AC

fatal           Kernel OOPs                     Send SIGBUS to user.
                Send SIGBUS to user
                When both features are
                supported, fatal in #AC

ratelimit:N     Do nothing                      Limit bus lock rate to
                                                N per second in the
                                                current non-root user.

Default option is "warn".

Hardware only generates #DB for bus lock detect when CPL>0 to avoid
nested #DB from multiple bus locks while the first #DB is being handled.
So no need to handle #DB for bus lock detected in the kernel.

#DB for bus lock is enabled by bus lock detection bit 2 in DEBUGCTL MSR
while #AC for split lock is enabled by split lock detection bit 29 in
TEST_CTRL MSR.

Both breakpoint and bus lock in the same instruction can trigger one #DB.
The bus lock is handled before the breakpoint in the #DB handler.

Delivery of #DB for bus lock in userspace clears DR6[11]. To avoid
confusion in identifying #DB, #DB handler sets the bit to 1 before
returning to the interrupted task.

Signed-off-by: Fenghua Yu 
Reviewed-by: Tony Luck 
---
Change Log:
v6:
- Split the v5 patch set into two patch sets: the one for warn and fatal
  and the one for ratelimit. The warn and fatal patch set is released
  here first and the ratelimit patch set will be released later (Thomas).

v5:
Address all comments from Thomas:
- Merge patch 2 and patch 3 into one patch so all "split_lock_detect="
  options are processed in one patch.
- Change warn to #AC if both #AC and #DB are supported.
- Remove sld and bld variables and use boot_cpu_has() to check bus lock
  and split lock support.
- Remove bus lock checking in handle_bus_lock().
- Remove bld_ratelimit < HZ/2 check.
- Add rate limit handling comment in bus lock #DB.
- Define bld_ratelimit only for Intel CPUs.

v3:
- Enable Bus Lock Detection when fatal to handle bus lock from non-WB
  (PeterZ).

v2:
- Send SIGBUS in fatal case for bus lock #DB (PeterZ).

v1:
- Check bus lock bit by its positive polarity (Xiaoyao).

RFC v3:
- Remove DR6_RESERVED change (PeterZ).

 arch/x86/include/asm/cpu.h   |   9 ++-
 arch/x86/include/asm/msr-index.h |   1 +
 arch/x86/include/uapi/asm/debugreg.h |   1 +
 arch/x86/kernel/cpu/common.c |   2 +-
 arch/x86/kernel/cpu/intel.c  | 111 ++-
 arch/x86/kernel/traps.c  |   7 ++
 6 files changed, 109 insertions(+), 22 deletions(-)

diff --git a/arch/x86/include/asm/cpu.h b/arch/x86/include/asm/cpu.h
index da78ccbd493b..c16b9bab502f 100644
--- a/arch/x86/include/asm/cpu.h
+++ b/arch/x86/include/asm/cpu.h
@@ -41,12 +41,13 @@ unsigned int x86_family(unsigned int sig);
 unsigned int x86_model(unsigned int sig);
 unsigned int x86_stepping(unsigned int sig);
 #ifdef CONFIG_CPU_SUP_INTEL
-extern void __init cpu_set_core_cap_bits(struct cpuinfo_x86 *c);
+extern void __init sld_setup(struct cpuinfo_x86 *c);
 extern void switch_to_sld(unsigned long tifn);
 extern bool handle_user_split_lock(struct pt_regs *regs, long error_code);
 extern bool handle_guest_split_lock(unsigned long ip);
+extern void handle_bus_lock(struct pt_regs *regs);
 #else
-static inline void __init cpu_set_core_cap_bits(struct cpuinfo_x86 *c) {}
+static inline void __init sld_setup(struct cpuinfo_x86 *c) {}
 static inline void switch_to_sld(unsigned long tifn) {}
 static inline bool handle_user_split_lock(struct pt_regs *regs, long error_code)
 {
@@ -57,6 +58,10 @@ static inline bool handle_guest_split_lock(unsigned long ip)
 {
return false;
 }
+
+static inline void handle_bus_lock(struct pt_regs *regs)
+{
+}
 #endif
 #ifdef CONFIG_IA32_FEAT_CTL
 void init_ia32_feat_ctl(struct cpuinfo_x86 *c);
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 546d6ecf0a35..558485965f21 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -265,6 +265,7 @@
 #define DEBUGCTLMSR_LBR             (1UL <<  0) /* last branch recording */
 #define DEBUGCTLMSR_BTF_SHIFT       1
 #define DEBUGCTLMSR_BTF   

Re: [PATCH v5 2/3] x86/bus_lock: Handle #DB for bus lock

2021-03-19 Thread Fenghua Yu
Hi, Thomas,

On Fri, Mar 19, 2021 at 10:30:50PM +0100, Thomas Gleixner wrote:
> On Sat, Mar 13 2021 at 05:49, Fenghua Yu wrote:
> > Change Log:
> > v5:
> > Address all comments from Thomas:
> > - Merge patch 2 and patch 3 into one patch so all "split_lock_detect="
> >   options are processed in one patch.
> 
> What? I certainly did not request that. I said:
> 
>  "Why is this separate and an all in one thing? patch 2/4 changes the
>   parameter first and 3/4 adds a new option"
> 
> which means we want documentation for patch 2 and documentation for
> patch 3? 
> 
> The ratelimit thing is clearly an extra functionality on top of that
> buslock muck.
> 
> Next time I write it out..

Sorry for misunderstanding your comments. I will split the document patch
into two: one for patch 2 (warn and fatal) and one for patch 3 (ratelimit).

> 
> > +   if (sscanf(arg, "ratelimit:%d", &ratelimit) == 1 && ratelimit > 0) {
> > +   bld_ratelimit = ratelimit;
> 
> So any rate up to INTMAX/s is valid here, right?

Yes. I don't see a smaller limit than INTMAX/s. Is that right?

> 
> > +   case sld_ratelimit:
> > +   /* Enforce no more than bld_ratelimit bus locks/sec. */
> > +   while (!__ratelimit(&get_current_user()->bld_ratelimit))
> > +   msleep(1000 / bld_ratelimit);
> 
> which is cute because msleep() will always sleep until the next jiffie
> increment happens.
> 
> What's not so cute here is the fact that get_current_user() takes a
> reference on current's UID on every invocation, but nothing ever calls
> free_uid(). I missed that last time over the way more obvious HZ division.

I will call free_uid().
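
Separately from the reference leak, the sleep length requested by
msleep(1000 / bld_ratelimit) is computed with integer division. A small
sketch of that arithmetic, with a hypothetical helper name, shows what
happens as the rate grows; how the kernel rounds the actual sleep is not
modelled here.

```c
/*
 * Mirror the integer arithmetic of msleep(1000 / bld_ratelimit) from the
 * quoted hunk. Returns the requested sleep in milliseconds, or -1 for a
 * non-positive rate, which the option parser quoted above rejects.
 */
int ratelimit_sleep_ms(int bld_ratelimit)
{
	if (bld_ratelimit <= 0)
		return -1;
	return 1000 / bld_ratelimit;
}
```

With this arithmetic, a rate of 1 requests 1000 ms, a rate of 1000 requests
1 ms, and any rate above 1000 requests 0 ms.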

> 
> > +++ b/kernel/user.c
> > @@ -103,6 +103,9 @@ struct user_struct root_user = {
> > .locked_shm = 0,
> > .uid= GLOBAL_ROOT_UID,
> > .ratelimit  = RATELIMIT_STATE_INIT(root_user.ratelimit, 0, 0),
> > +#ifdef CONFIG_CPU_SUP_INTEL
> > +   .bld_ratelimit  = RATELIMIT_STATE_INIT(root_user.bld_ratelimit, 0, 0),
> > +#endif
> >  };
> >  
> >  /*
> > @@ -172,6 +175,11 @@ void free_uid(struct user_struct *up)
> > free_user(up, flags);
> >  }
> >  
> > +#ifdef CONFIG_CPU_SUP_INTEL
> > +/* Some Intel CPUs may set this for rate-limited bus locks. */
> > +int bld_ratelimit;
> > +#endif
> 
> Of course this variable is still required to be in the core kernel code
> because?
> 
> While you decided to munge this all together, you obviously ignored the
> following review comment:
> 
>   "It also lacks the information that the ratelimiting is per UID
>and not per task and why this was chosen to be per UID..."
> 
> There is still no reasoning neither in the changelog nor in the cover
> letter nor in a reply to my review.
> 
> So let me repeat my question and make it more explicit:
> 
>   What is the justification for making this rate limit per UID and not
>   per task, per process or systemwide?

Tony just now answered with the justification. If that's OK, I will add the
answer in the commit message.

> 
> >  struct user_struct *alloc_uid(kuid_t uid)
> >  {
> > struct hlist_head *hashent = uidhashentry(uid);
> > @@ -190,6 +198,11 @@ struct user_struct *alloc_uid(kuid_t uid)
> > refcount_set(&new->__count, 1);
> > ratelimit_state_init(&new->ratelimit, HZ, 100);
> > ratelimit_set_flags(&new->ratelimit, RATELIMIT_MSG_ON_RELEASE);
> > +#ifdef CONFIG_CPU_SUP_INTEL
> > +   ratelimit_state_init(&new->bld_ratelimit, HZ, bld_ratelimit);
> > +   ratelimit_set_flags(&new->bld_ratelimit,
> > +   RATELIMIT_MSG_ON_RELEASE);
> > +#endif
> 
> If this has a proper justification for being per user and having to add
> 40 bytes per UID for something which is mostly unused then there are
> definitely better ways to do that than slapping #ifdefs into
> architecture agnostic core code.
> 
> So if you instead of munging the code patches had split the
> documentation, then I could apply the first 3 patches and we would only
> have to sort out the ratelimiting muck.

If I split this whole patch set into two patch sets:
1. Three patches in the first patch set: the enumeration patch, the warn
   and fatal patch, and the documentation patch.
2. Two patches in the second patch set: the ratelimit patch and the
   documentation patch.

Then I will send the two patch sets separately, and you can accept them one
by one. Is that OK?

Or should I still send the 5 patches in one patch set so you will pick up
the first 3 patches and then the next 2 patches separately?

Thanks.

-Fenghua



Re: [PATCH v5 1/3] x86/cpufeatures: Enumerate #DB for bus lock detection

2021-03-19 Thread Fenghua Yu
On Fri, Mar 19, 2021 at 09:35:39PM +0100, Thomas Gleixner wrote:
> On Sat, Mar 13 2021 at 05:49, Fenghua Yu wrote:
> > A bus lock is acquired though either split locked access to
> 
> s/though/through/
> either a 
> > Some CPUs have ability to notify the kernel by an #DB trap after a user
> 
> the ability

Thank you for your review, Thomas! Will fix the issues.

-Fenghua


[PATCH v6 21/21] selftests/resctrl: Create .gitignore to include resctrl_tests

2021-03-16 Thread Fenghua Yu
Create .gitignore to hold the test file resctrl_tests generated after
compiling.

Suggested-by: Shuah Khan 
Tested-by: Babu Moger 
Signed-off-by: Fenghua Yu 
---
Change Log:
v5:
- Add this patch (Shuah)

 tools/testing/selftests/resctrl/.gitignore | 2 ++
 1 file changed, 2 insertions(+)
 create mode 100644 tools/testing/selftests/resctrl/.gitignore

diff --git a/tools/testing/selftests/resctrl/.gitignore b/tools/testing/selftests/resctrl/.gitignore
new file mode 100644
index ..ab68442b6bc8
--- /dev/null
+++ b/tools/testing/selftests/resctrl/.gitignore
@@ -0,0 +1,2 @@
+# SPDX-License-Identifier: GPL-2.0-only
+resctrl_tests
-- 
2.31.0



[PATCH v6 16/21] selftests/resctrl: Modularize resctrl test suite main() function

2021-03-16 Thread Fenghua Yu
The resctrl test suite main() function does the following things:
1. Parses command line arguments passed by user
2. Some setup checks
3. Logic that calls into each unit test
4. Print result and clean up after running each unit test

Introduce wrapper functions for steps 3 and 4 to modularize the main()
function. Adding these wrapper functions makes it easier to add any logic
to each individual test.

Please note that this is a preparatory patch for the next one and no
functional changes are intended.

Suggested-by: Reinette Chatre 
Tested-by: Babu Moger 
Signed-off-by: Fenghua Yu 
---
 .../testing/selftests/resctrl/resctrl_tests.c | 88 ---
 1 file changed, 57 insertions(+), 31 deletions(-)

diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c
index 2ace464b96d1..e63e0d8764ef 100644
--- a/tools/testing/selftests/resctrl/resctrl_tests.c
+++ b/tools/testing/selftests/resctrl/resctrl_tests.c
@@ -54,10 +54,58 @@ void tests_cleanup(void)
cat_test_cleanup();
 }
 
+static void run_mbm_test(bool has_ben, char **benchmark_cmd, int span,
+int cpu_no, char *bw_report)
+{
+   int res;
+
+   ksft_print_msg("Starting MBM BW change ...\n");
+   if (!has_ben)
+   sprintf(benchmark_cmd[5], "%s", MBA_STR);
+   res = mbm_bw_change(span, cpu_no, bw_report, benchmark_cmd);
+   ksft_test_result(!res, "MBM: bw change\n");
+   mbm_test_cleanup();
+}
+
+static void run_mba_test(bool has_ben, char **benchmark_cmd, int span,
+int cpu_no, char *bw_report)
+{
+   int res;
+
+   ksft_print_msg("Starting MBA Schemata change ...\n");
+   if (!has_ben)
+   sprintf(benchmark_cmd[1], "%d", span);
+   res = mba_schemata_change(cpu_no, bw_report, benchmark_cmd);
+   ksft_test_result(!res, "MBA: schemata change\n");
+   mba_test_cleanup();
+}
+
+static void run_cmt_test(bool has_ben, char **benchmark_cmd, int cpu_no)
+{
+   int res;
+
+   ksft_print_msg("Starting CMT test ...\n");
+   if (!has_ben)
+   sprintf(benchmark_cmd[5], "%s", CMT_STR);
+   res = cmt_resctrl_val(cpu_no, 5, benchmark_cmd);
+   ksft_test_result(!res, "CMT: test\n");
+   cmt_test_cleanup();
+}
+
+static void run_cat_test(int cpu_no, int no_of_bits)
+{
+   int res;
+
+   ksft_print_msg("Starting CAT test ...\n");
+   res = cat_perf_miss_val(cpu_no, no_of_bits, "L3");
+   ksft_test_result(!res, "CAT: test\n");
+   cat_test_cleanup();
+}
+
 int main(int argc, char **argv)
 {
bool has_ben = false, mbm_test = true, mba_test = true, cmt_test = true;
-   int res, c, cpu_no = 1, span = 250, argc_new = argc, i, no_of_bits = 0;
+   int c, cpu_no = 1, span = 250, argc_new = argc, i, no_of_bits = 0;
char *benchmark_cmd[BENCHMARK_ARGS], bw_report[64], bm_type[64];
char benchmark_cmd_area[BENCHMARK_ARGS][BENCHMARK_ARG_SIZE];
int ben_ind, ben_count, tests = 0;
@@ -170,39 +218,17 @@ int main(int argc, char **argv)
 
ksft_set_plan(tests ? : 4);
 
-   if (!is_amd && mbm_test) {
-   ksft_print_msg("Starting MBM BW change ...\n");
-   if (!has_ben)
-   sprintf(benchmark_cmd[5], "%s", MBA_STR);
-   res = mbm_bw_change(span, cpu_no, bw_report, benchmark_cmd);
-   ksft_test_result(!res, "MBM: bw change\n");
-   mbm_test_cleanup();
-   }
+   if (!is_amd && mbm_test)
+   run_mbm_test(has_ben, benchmark_cmd, span, cpu_no, bw_report);
 
-   if (!is_amd && mba_test) {
-   ksft_print_msg("Starting MBA Schemata change ...\n");
-   if (!has_ben)
-   sprintf(benchmark_cmd[1], "%d", span);
-   res = mba_schemata_change(cpu_no, bw_report, benchmark_cmd);
-   ksft_test_result(!res, "MBA: schemata change\n");
-   mba_test_cleanup();
-   }
+   if (!is_amd && mba_test)
+   run_mba_test(has_ben, benchmark_cmd, span, cpu_no, bw_report);
 
-   if (cmt_test) {
-   ksft_print_msg("Starting CMT test ...\n");
-   if (!has_ben)
-   sprintf(benchmark_cmd[5], "%s", CMT_STR);
-   res = cmt_resctrl_val(cpu_no, 5, benchmark_cmd);
-   ksft_test_result(!res, "CMT: test\n");
-   cmt_test_cleanup();
-   }
+   if (cmt_test)
+   run_cmt_test(has_ben, benchmark_cmd, cpu_no);
 
-   if (cat_test) {
-   ksft_print_msg("Starting CAT test ...\n");
-   res = cat_perf_miss_val(cpu_no, no_of_bits, "L3");
-   ksft_test_result(!res, "CAT: test\n");
-   cat_test_cleanup();
-   }
+   if (cat_test)
+   run_cat_test(cpu_no, no_of_bits);
 
return ksft_exit_pass();
 }
-- 
2.31.0



[PATCH v6 20/21] selftests/resctrl: Fix checking for < 0 for unsigned values

2021-03-16 Thread Fenghua Yu
Dan reported the following static checker warnings:

tools/testing/selftests/resctrl/resctrl_val.c:545 measure_vals()
warn: 'bw_imc' unsigned <= 0

tools/testing/selftests/resctrl/resctrl_val.c:549 measure_vals()
warn: 'bw_resc_end' unsigned <= 0

These warnings are reported because
1. measure_vals() declares 'bw_imc' and 'bw_resc_end' as unsigned long
   variables
2. The return values of get_mem_bw_imc() and get_mem_bw_resctrl() are
   assigned to 'bw_imc' and 'bw_resc_end' respectively
3. The returned values are checked for <= 0 to see if the calls failed

An unsigned value can never be < 0, so a negative error return is
converted to a large positive value and the check misses the failure.

Fix this issue by changing get_mem_bw_imc() and get_mem_bw_resctrl() so
that they accept a pointer to a variable, store the result in it and
return 0 on success, or return < 0 on error.
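The status-plus-out-parameter pattern described above can be sketched as
follows. parse_bytes() is a hypothetical helper, not part of the selftest;
it only illustrates keeping the error status (int) separate from the
unsigned value.

```c
#include <stdio.h>

/*
 * Hypothetical helper: parse an unsigned counter from text.
 * Return: = 0 on success (result stored in *value), < 0 on failure.
 */
static int parse_bytes(const char *text, unsigned long *value)
{
	if (!text || !value)
		return -1;

	/* sscanf() returns the number of converted items, or EOF */
	if (sscanf(text, "%lu", value) != 1)
		return -1;

	return 0;
}
```

A caller checks the int return value for < 0 and only then uses the
unsigned result, so the error path no longer depends on comparing an
unsigned variable against 0.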

Reported-by: Dan Carpenter 
Tested-by: Babu Moger 
Signed-off-by: Fenghua Yu 
---
 tools/testing/selftests/resctrl/resctrl_val.c | 41 +++
 1 file changed, 23 insertions(+), 18 deletions(-)

diff --git a/tools/testing/selftests/resctrl/resctrl_val.c b/tools/testing/selftests/resctrl/resctrl_val.c
index 20d457c47ded..95224345c78e 100644
--- a/tools/testing/selftests/resctrl/resctrl_val.c
+++ b/tools/testing/selftests/resctrl/resctrl_val.c
@@ -300,9 +300,9 @@ static int initialize_mem_bw_imc(void)
  * Memory B/W utilized by a process on a socket can be calculated using
  * iMC counters. Perf events are used to read these counters.
  *
- * Return: >= 0 on success. < 0 on failure.
+ * Return: = 0 on success. < 0 on failure.
  */
-static float get_mem_bw_imc(int cpu_no, char *bw_report)
+static int get_mem_bw_imc(int cpu_no, char *bw_report, float *bw_imc)
 {
float reads, writes, of_mul_read, of_mul_write;
int imc, j, ret;
@@ -373,13 +373,18 @@ static float get_mem_bw_imc(int cpu_no, char *bw_report)
close(imc_counters_config[imc][WRITE].fd);
}
 
-   if (strcmp(bw_report, "reads") == 0)
-   return reads;
+   if (strcmp(bw_report, "reads") == 0) {
+   *bw_imc = reads;
+   return 0;
+   }
 
-   if (strcmp(bw_report, "writes") == 0)
-   return writes;
+   if (strcmp(bw_report, "writes") == 0) {
+   *bw_imc = writes;
+   return 0;
+   }
 
-   return (reads + writes);
+   *bw_imc = reads + writes;
+   return 0;
 }
 
 void set_mbm_path(const char *ctrlgrp, const char *mongrp, int resource_id)
@@ -438,9 +443,8 @@ static void initialize_mem_bw_resctrl(const char *ctrlgrp, const char *mongrp,
  * 1. If con_mon grp is given, then read from it
  * 2. If con_mon grp is not given, then read from root con_mon grp
  */
-static unsigned long get_mem_bw_resctrl(void)
+static int get_mem_bw_resctrl(unsigned long *mbm_total)
 {
-   unsigned long mbm_total = 0;
FILE *fp;
 
fp = fopen(mbm_total_path, "r");
@@ -449,7 +453,7 @@ static unsigned long get_mem_bw_resctrl(void)
 
return -1;
}
-   if (fscanf(fp, "%lu", &mbm_total) <= 0) {
+   if (fscanf(fp, "%lu", mbm_total) <= 0) {
perror("Could not get mbm local bytes");
fclose(fp);
 
@@ -457,7 +461,7 @@ static unsigned long get_mem_bw_resctrl(void)
}
fclose(fp);
 
-   return mbm_total;
+   return 0;
 }
 
 pid_t bm_pid, ppid;
@@ -549,7 +553,8 @@ static void initialize_llc_occu_resctrl(const char *ctrlgrp, const char *mongrp,
 static int
 measure_vals(struct resctrl_val_param *param, unsigned long *bw_resc_start)
 {
-   unsigned long bw_imc, bw_resc, bw_resc_end;
+   unsigned long bw_resc, bw_resc_end;
+   float bw_imc;
int ret;
 
/*
@@ -559,13 +564,13 @@ measure_vals(struct resctrl_val_param *param, unsigned long *bw_resc_start)
 * Compare the two values to validate resctrl value.
 * It takes 1sec to measure the data.
 */
-   bw_imc = get_mem_bw_imc(param->cpu_no, param->bw_report);
-   if (bw_imc <= 0)
-   return bw_imc;
+   ret = get_mem_bw_imc(param->cpu_no, param->bw_report, &bw_imc);
+   if (ret < 0)
+   return ret;
 
-   bw_resc_end = get_mem_bw_resctrl();
-   if (bw_resc_end <= 0)
-   return bw_resc_end;
+   ret = get_mem_bw_resctrl(&bw_resc_end);
+   if (ret < 0)
+   return ret;
 
bw_resc = (bw_resc_end - *bw_resc_start) / MB;
ret = print_results_bw(param->filename, bm_pid, bw_imc, bw_resc);
-- 
2.31.0



[PATCH v6 15/21] selftests/resctrl: Don't hard code value of "no_of_bits" variable

2021-03-16 Thread Fenghua Yu
Cache related tests (like CAT and CMT) depend on a variable called
no_of_bits to run. no_of_bits defines the number of contiguous bits
that should be set in the CBM mask and a user can pass a value for
no_of_bits using -n command line argument. If a user hasn't passed any
value, it defaults to 5 (randomly chosen value).

Hard coding no_of_bits to 5 will make the cache tests fail to run on
systems that support maximum cbm mask that is less than or equal to 5 bits.
Hence, don't hard code no_of_bits value.

If a user passes a value for "no_of_bits" using -n option, use it.
Otherwise, no_of_bits is equal to half of the maximum number of bits in
the cbm mask.
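The default selection described above can be sketched as below.
count_set_bits() and pick_no_of_bits() are illustrative names, not the
selftest's functions, and the extra "n < 1" guard is an assumption added
here to cover a 1-bit mask.

```c
/* Count the set bits in a capacity bitmask (stand-in for the test's helper). */
static int count_set_bits(unsigned long mask)
{
	int count = 0;

	while (mask) {
		count += mask & 1;
		mask >>= 1;
	}
	return count;
}

/*
 * n == 0 means "not given on the command line": default to half of the
 * bits in the mask. Return the number of bits to use, or -1 if the
 * value is out of the range 1 .. count_of_bits - 1.
 */
static int pick_no_of_bits(unsigned long long_mask, int n)
{
	int count_of_bits = count_set_bits(long_mask);

	if (!n)
		n = count_of_bits / 2;

	if (n < 1 || n > count_of_bits - 1)
		return -1;

	return n;
}
```

With an 11-bit mask (0x7ff) and no -n option this yields 5, while a 4-bit
mask (0xf) yields 2 instead of failing on the old hard coded 5.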

Please note that CMT test is still hard coded to 5 bits. It will change in
subsequent patches that change CMT test.

Tested-by: Babu Moger 
Signed-off-by: Fenghua Yu 
---
 tools/testing/selftests/resctrl/cat_test.c  | 5 -
 tools/testing/selftests/resctrl/resctrl_tests.c | 8 ++--
 2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
index 090d3afc7a78..04d706b4f10e 100644
--- a/tools/testing/selftests/resctrl/cat_test.c
+++ b/tools/testing/selftests/resctrl/cat_test.c
@@ -130,7 +130,10 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
/* Get max number of bits from default-cabm mask */
count_of_bits = count_bits(long_mask);
 
-   if (n < 1 || n > count_of_bits - 1) {
+   if (!n)
+   n = count_of_bits / 2;
+
+   if (n > count_of_bits - 1) {
ksft_print_msg("Invalid input value for no_of_bits n!\n");
ksft_print_msg("Please enter value in range 1 to %d\n",
   count_of_bits - 1);
diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c
index 355bd28b996a..2ace464b96d1 100644
--- a/tools/testing/selftests/resctrl/resctrl_tests.c
+++ b/tools/testing/selftests/resctrl/resctrl_tests.c
@@ -57,7 +57,7 @@ void tests_cleanup(void)
 int main(int argc, char **argv)
 {
bool has_ben = false, mbm_test = true, mba_test = true, cmt_test = true;
-   int res, c, cpu_no = 1, span = 250, argc_new = argc, i, no_of_bits = 5;
+   int res, c, cpu_no = 1, span = 250, argc_new = argc, i, no_of_bits = 0;
char *benchmark_cmd[BENCHMARK_ARGS], bw_report[64], bm_type[64];
char benchmark_cmd_area[BENCHMARK_ARGS][BENCHMARK_ARG_SIZE];
int ben_ind, ben_count, tests = 0;
@@ -110,6 +110,10 @@ int main(int argc, char **argv)
break;
case 'n':
no_of_bits = atoi(optarg);
+   if (no_of_bits <= 0) {
+   printf("Bail out! invalid argument for no_of_bits\n");
+   return -1;
+   }
break;
case 'h':
cmd_help();
@@ -188,7 +192,7 @@ int main(int argc, char **argv)
ksft_print_msg("Starting CMT test ...\n");
if (!has_ben)
sprintf(benchmark_cmd[5], "%s", CMT_STR);
-   res = cmt_resctrl_val(cpu_no, no_of_bits, benchmark_cmd);
+   res = cmt_resctrl_val(cpu_no, 5, benchmark_cmd);
ksft_test_result(!res, "CMT: test\n");
cmt_test_cleanup();
}
-- 
2.31.0



[PATCH v6 14/21] selftests/resctrl: Fix MBA/MBM results reporting format

2021-03-16 Thread Fenghua Yu
MBM unit test starts fill_buf (default built-in benchmark) in a new con_mon
group (c1, m1) and records resctrl reported mbm values and iMC (Integrated
Memory Controller) values every second. It does this for five seconds
(randomly chosen value) in total. It then calculates average of resctrl_mbm
values and imc_mbm values and if the difference is greater than 300 MB/sec
(randomly chosen value), the test treats it as a failure. MBA unit test is
similar to MBM but after every run it changes schemata.

Checking for a difference of 300 MB/sec doesn't look very meaningful when
the mbm values are changing over a wide range. For example, below are the
values running MBA test on SKL with different allocations

1. With 10% as schemata both iMC and resctrl mbm_values are around 2000
   MB/sec
2. With 100% as schemata both iMC and resctrl mbm_values are around 1
   MB/sec

A 300 MB/sec difference between resctrl_mbm and imc_mbm values is
acceptable at 100% schemata but it isn't acceptable at 10% schemata because
that's a huge difference.

So, fix this by checking for percentage difference instead of absolute
difference i.e. check if the difference between resctrl_mbm value and
imc_mbm value is within 5% (randomly chosen value) of imc_mbm value. If the
difference is greater than 5% of imc_mbm value, treat it as a failure.
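A rough sketch of the percentage check follows. diff_percent() is a
hypothetical helper; unlike the patch, which computes the ratio in float,
it uses integer arithmetic so the truncated result is easy to reason
about. The 10000 MB/sec input in the usage note is an illustrative number
only.

```c
#define MAX_DIFF_PERCENT	5

/*
 * Return the absolute difference between the two averages as a
 * percentage of avg_bw_imc (truncated), or -1 if avg_bw_imc is 0.
 */
static int diff_percent(unsigned long avg_bw_resc, unsigned long avg_bw_imc)
{
	unsigned long diff;

	if (!avg_bw_imc)
		return -1;

	/* Subtract the smaller from the larger to avoid unsigned wrap */
	diff = avg_bw_resc > avg_bw_imc ? avg_bw_resc - avg_bw_imc :
					  avg_bw_imc - avg_bw_resc;

	return (int)(diff * 100 / avg_bw_imc);
}
```

A 300 MB/sec gap is 15% of 2000 MB/sec, which exceeds MAX_DIFF_PERCENT,
but only 3% of 10000 MB/sec, which passes.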

Tested-by: Babu Moger 
Signed-off-by: Fenghua Yu 
---
 tools/testing/selftests/resctrl/mba_test.c | 22 +-
 tools/testing/selftests/resctrl/mbm_test.c | 15 ---
 2 files changed, 21 insertions(+), 16 deletions(-)

diff --git a/tools/testing/selftests/resctrl/mba_test.c b/tools/testing/selftests/resctrl/mba_test.c
index f42d4ba70363..8842d379e886 100644
--- a/tools/testing/selftests/resctrl/mba_test.c
+++ b/tools/testing/selftests/resctrl/mba_test.c
@@ -12,7 +12,7 @@
 
 #define RESULT_FILE_NAME   "result_mba"
 #define NUM_OF_RUNS   5
-#define MAX_DIFF   300
+#define MAX_DIFF_PERCENT   5
 #define ALLOCATION_MAX 100
 #define ALLOCATION_MIN 10
 #define ALLOCATION_STEP   10
@@ -62,7 +62,8 @@ static void show_mba_info(unsigned long *bw_imc, unsigned long *bw_resc)
 allocation++) {
unsigned long avg_bw_imc, avg_bw_resc;
unsigned long sum_bw_imc = 0, sum_bw_resc = 0;
-   unsigned long avg_diff;
+   int avg_diff_per;
+   float avg_diff;
 
/*
 * The first run is discarded due to inaccurate value from
@@ -76,16 +77,19 @@ static void show_mba_info(unsigned long *bw_imc, unsigned long *bw_resc)
 
avg_bw_imc = sum_bw_imc / (NUM_OF_RUNS - 1);
avg_bw_resc = sum_bw_resc / (NUM_OF_RUNS - 1);
-   avg_diff = labs((long)(avg_bw_resc - avg_bw_imc));
+   avg_diff = (float)labs(avg_bw_resc - avg_bw_imc) / avg_bw_imc;
+   avg_diff_per = (int)(avg_diff * 100);
 
-   ksft_print_msg("%s MBA schemata percentage %u smaller than %d %%\n",
-  avg_diff > MAX_DIFF ? "Fail:" : "Pass:",
-  ALLOCATION_MAX - ALLOCATION_STEP * allocation,
-  MAX_DIFF);
-   ksft_print_msg("avg_diff: %lu\n", avg_diff);
+   ksft_print_msg("%s MBA: diff within %d%% for schemata %u\n",
+  avg_diff_per > MAX_DIFF_PERCENT ?
+  "Fail:" : "Pass:",
+  MAX_DIFF_PERCENT,
+  ALLOCATION_MAX - ALLOCATION_STEP * allocation);
+
+   ksft_print_msg("avg_diff_per: %d%%\n", avg_diff_per);
ksft_print_msg("avg_bw_imc: %lu\n", avg_bw_imc);
ksft_print_msg("avg_bw_resc: %lu\n", avg_bw_resc);
-   if (avg_diff > MAX_DIFF)
+   if (avg_diff_per > MAX_DIFF_PERCENT)
failed = true;
}
 
diff --git a/tools/testing/selftests/resctrl/mbm_test.c b/tools/testing/selftests/resctrl/mbm_test.c
index 0d65ba4b62b4..651d4ac15986 100644
--- a/tools/testing/selftests/resctrl/mbm_test.c
+++ b/tools/testing/selftests/resctrl/mbm_test.c
@@ -11,7 +11,7 @@
 #include "resctrl.h"
 
 #define RESULT_FILE_NAME   "result_mbm"
-#define MAX_DIFF   300
+#define MAX_DIFF_PERCENT   5
 #define NUM_OF_RUNS   5
 
 static int
@@ -19,8 +19,8 @@ show_bw_info(unsigned long *bw_imc, unsigned long *bw_resc, int span)
 {
unsigned long avg_bw_imc = 0, avg_bw_resc = 0;
unsigned long sum_bw_imc = 0, sum_bw_resc = 0;
-   long avg_diff = 0;
-   int runs, ret;
+   int runs, ret, avg_diff_per;
+   float avg_diff = 0;
 
/*
 * Discard the first value which is inaccurate due to monitoring setup
@@ -33,12 +33,13 @@ show_

[PATCH v6 17/21] selftests/resctrl: Skip the test if requested resctrl feature is not supported

2021-03-16 Thread Fenghua Yu
There could be two reasons why a resctrl feature might not be enabled on
the platform
1. H/W might not support the feature
2. Even if the H/W supports it, the user might have disabled the feature
   through kernel command line arguments

Hence, any resctrl unit test (like cmt, cat, mbm and mba) before starting
the test will first check if the feature is enabled on the platform or not.
If the feature isn't enabled, then the test returns with an error status.
For example, if MBA isn't supported on a platform and if the user tries to
run MBA, the output will look like this

ok mounting resctrl to "/sys/fs/resctrl"
not ok MBA: schemata change

But, not supporting a feature isn't a test failure. So, instead of treating
it as an error, use the SKIP directive of the TAP protocol. With the
change, the output will look as below

ok MBA # SKIP Hardware does not support MBA or MBA is disabled
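The decision described above could be sketched as below. The enum and
both functions are hypothetical and are not the kselftest API; they only
show that "feature not enabled" maps to SKIP rather than to a failure.

```c
#include <stdbool.h>

enum outcome { OUTCOME_PASS, OUTCOME_FAIL, OUTCOME_SKIP };

/*
 * feature_enabled: whether the hardware supports the feature and it is
 *                  not disabled on the kernel command line.
 * test_ret:        0 if the test passed, non-zero otherwise (ignored
 *                  when the feature is not enabled).
 */
static enum outcome classify_result(bool feature_enabled, int test_ret)
{
	if (!feature_enabled)
		return OUTCOME_SKIP;

	return test_ret ? OUTCOME_FAIL : OUTCOME_PASS;
}

/* TAP reports a skipped test as "ok" followed by a "# SKIP" directive. */
static const char *tap_status(enum outcome o)
{
	return o == OUTCOME_FAIL ? "not ok" : "ok";
}
```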

Suggested-by: Reinette Chatre 
Tested-by: Babu Moger 
Signed-off-by: Fenghua Yu 
---
Change Log:
v6:
- Replace "cat" by CAT_STR and so on (Babu).

 tools/testing/selftests/resctrl/cat_test.c|  3 ---
 tools/testing/selftests/resctrl/mba_test.c|  3 ---
 tools/testing/selftests/resctrl/mbm_test.c|  3 ---
 .../testing/selftests/resctrl/resctrl_tests.c | 23 +++
 4 files changed, 23 insertions(+), 9 deletions(-)

diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
index 04d706b4f10e..cd4f68388e0f 100644
--- a/tools/testing/selftests/resctrl/cat_test.c
+++ b/tools/testing/selftests/resctrl/cat_test.c
@@ -111,9 +111,6 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
if (ret)
return ret;
 
-   if (!validate_resctrl_feature_request("cat"))
-   return -1;
-
/* Get default cbm mask for L3/L2 cache */
ret = get_cbm_mask(cache_type, cbm_mask);
if (ret)
diff --git a/tools/testing/selftests/resctrl/mba_test.c b/tools/testing/selftests/resctrl/mba_test.c
index 8842d379e886..26f12ad4c663 100644
--- a/tools/testing/selftests/resctrl/mba_test.c
+++ b/tools/testing/selftests/resctrl/mba_test.c
@@ -158,9 +158,6 @@ int mba_schemata_change(int cpu_no, char *bw_report, char **benchmark_cmd)
 
remove(RESULT_FILE_NAME);
 
-   if (!validate_resctrl_feature_request("mba"))
-   return -1;
-
ret = resctrl_val(benchmark_cmd, &param);
if (ret)
return ret;
diff --git a/tools/testing/selftests/resctrl/mbm_test.c b/tools/testing/selftests/resctrl/mbm_test.c
index 651d4ac15986..02b1ed03f1e5 100644
--- a/tools/testing/selftests/resctrl/mbm_test.c
+++ b/tools/testing/selftests/resctrl/mbm_test.c
@@ -131,9 +131,6 @@ int mbm_bw_change(int span, int cpu_no, char *bw_report, char **benchmark_cmd)
 
remove(RESULT_FILE_NAME);
 
-   if (!validate_resctrl_feature_request("mbm"))
-   return -1;
-
ret = resctrl_val(benchmark_cmd, &param);
if (ret)
return ret;
diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c
index e63e0d8764ef..fb246bc41f47 100644
--- a/tools/testing/selftests/resctrl/resctrl_tests.c
+++ b/tools/testing/selftests/resctrl/resctrl_tests.c
@@ -60,6 +60,12 @@ static void run_mbm_test(bool has_ben, char **benchmark_cmd, int span,
int res;
 
ksft_print_msg("Starting MBM BW change ...\n");
+
+   if (!validate_resctrl_feature_request(MBM_STR)) {
+   ksft_test_result_skip("Hardware does not support MBM or MBM is disabled\n");
+   return;
+   }
+
if (!has_ben)
sprintf(benchmark_cmd[5], "%s", MBA_STR);
res = mbm_bw_change(span, cpu_no, bw_report, benchmark_cmd);
@@ -73,6 +79,12 @@ static void run_mba_test(bool has_ben, char **benchmark_cmd, int span,
int res;
 
ksft_print_msg("Starting MBA Schemata change ...\n");
+
+   if (!validate_resctrl_feature_request(MBA_STR)) {
+   ksft_test_result_skip("Hardware does not support MBA or MBA is disabled\n");
+   return;
+   }
+
if (!has_ben)
sprintf(benchmark_cmd[1], "%d", span);
res = mba_schemata_change(cpu_no, bw_report, benchmark_cmd);
@@ -85,6 +97,11 @@ static void run_cmt_test(bool has_ben, char **benchmark_cmd, int cpu_no)
int res;
 
ksft_print_msg("Starting CMT test ...\n");
+   if (!validate_resctrl_feature_request(CMT_STR)) {
+   ksft_test_result_skip("Hardware does not support CMT or CMT is disabled\n");
+   return;
+   }
+
if (!has_ben)
sprintf(benchmark_cmd[5], "%s", CMT_STR);
res = cmt_resctrl_val(cpu_no, 5, benchmark_cmd);
@@ -97,6 +114,12 @@ static void run_cat_test(int cpu_no, i

[PATCH v6 03/21] selftests/resctrl: Fix compilation issues for other global variables

2021-03-16 Thread Fenghua Yu
Reinette reported the following compilation issue on Fedora 32 with gcc
version 10.1.1:

/usr/bin/ld: resctrl_tests.o:/resctrl.h:65: multiple definition
of `bm_pid'; cache.o:/resctrl.h:65: first defined here

The other affected variables are ppid, tests_run, llc_occup_path and
is_amd. The linker complains because these variables are defined in
resctrl.h without extern, so every .c file that includes the header
gets its own definition.

To fix the issue, declare these global variables as extern in the header.
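As a single-file sketch of the split (the comments mark where each part
would live; shared.h and one.c are hypothetical file names): the header
carries only extern declarations, and exactly one .c file provides the
definition.

```c
/* --- would live in the shared header, e.g. shared.h (hypothetical) --- */
extern int tests_run;		/* declaration only, no storage allocated */

/* --- would live in exactly one .c file, e.g. one.c (hypothetical) --- */
int tests_run;			/* the single definition, zero-initialized */

/* Any file that includes the header can use the variable. */
static void count_test(void)
{
	tests_run++;
}
```

Every other .c file that includes the header then refers to the same
object instead of emitting its own copy.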

Change Log:
- Split this patch from v4's patch 1 (Shuah).

Reported-by: Reinette Chatre 
Tested-by: Babu Moger 
Signed-off-by: Fenghua Yu 
---
 tools/testing/selftests/resctrl/resctrl.h | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h
index 959c71e39bdc..12b77182cb44 100644
--- a/tools/testing/selftests/resctrl/resctrl.h
+++ b/tools/testing/selftests/resctrl/resctrl.h
@@ -62,11 +62,11 @@ struct resctrl_val_param {
int (*setup)(int num, ...);
 };
 
-pid_t bm_pid, ppid;
-int tests_run;
+extern pid_t bm_pid, ppid;
+extern int tests_run;
 
-char llc_occup_path[1024];
-bool is_amd;
+extern char llc_occup_path[1024];
+extern bool is_amd;
 
 bool check_resctrlfs_support(void);
 int filter_dmesg(void);
-- 
2.31.0



[PATCH v6 19/21] selftests/resctrl: Fix incorrect parsing of iMC counters

2021-03-16 Thread Fenghua Yu
iMC (Integrated Memory Controller) counters are usually at
"/sys/bus/event_source/devices/" and are named as "uncore_imc_<n>".
The num_of_imcs() function counts such iMC counters so that it can
initialize the required number of perf_attr structures used to read
them.

The num_of_imcs() function assumes that all the directories under this
path that start with "uncore_imc" are iMC counters. But, on some systems
there could be directories named as "uncore_imc_free_running" which aren't
iMC counters. Trying to read from such directories will result in "file
not found" errors and the MBM/MBA tests will fail.

Hence, fix the logic in num_of_imcs() such that it looks at the first
character after "uncore_imc_" to check if it's a numerical digit or not. If
it's a digit then the directory represents an iMC counter, else, skip the
directory.
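The digit check can be sketched as below. is_imc_counter_dir() is a
hypothetical helper, not the patch's code: it tests the underscore
explicitly instead of skipping it via sizeof(), so a name that is exactly
"uncore_imc" is never read past its terminating null character.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

#define UNCORE_IMC	"uncore_imc"

/* Return true only for names of the form "uncore_imc_<digit>...". */
static bool is_imc_counter_dir(const char *name)
{
	const char *p = strstr(name, UNCORE_IMC);

	if (!p)
		return false;

	/* sizeof() counts the null character, so subtract 1 to skip the prefix */
	p += sizeof(UNCORE_IMC) - 1;
	if (*p != '_')
		return false;

	/* First character after the underscore must be a numerical digit */
	return isdigit((unsigned char)p[1]) != 0;
}
```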

Reported-by: Reinette Chatre 
Tested-by: Babu Moger 
Signed-off-by: Fenghua Yu 
---
 tools/testing/selftests/resctrl/resctrl_val.c | 22 +--
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/resctrl/resctrl_val.c b/tools/testing/selftests/resctrl/resctrl_val.c
index 5dfae51133bc..20d457c47ded 100644
--- a/tools/testing/selftests/resctrl/resctrl_val.c
+++ b/tools/testing/selftests/resctrl/resctrl_val.c
@@ -221,8 +221,8 @@ static int read_from_imc_dir(char *imc_dir, int count)
  */
 static int num_of_imcs(void)
 {
+   char imc_dir[512], *temp;
unsigned int count = 0;
-   char imc_dir[512];
struct dirent *ep;
int ret;
DIR *dp;
@@ -230,7 +230,25 @@ static int num_of_imcs(void)
dp = opendir(DYN_PMU_PATH);
if (dp) {
while ((ep = readdir(dp))) {
-   if (strstr(ep->d_name, UNCORE_IMC)) {
+   temp = strstr(ep->d_name, UNCORE_IMC);
+   if (!temp)
+   continue;
+
+   /*
+* imc counters are named as "uncore_imc_<n>", hence
+* increment the pointer to point to <n>. Note that
+* sizeof(UNCORE_IMC) would count for null character as
+* well and hence the last underscore character in
+* uncore_imc'_' need not be counted.
+*/
+   temp = temp + sizeof(UNCORE_IMC);
+
+   /*
+* Some directories under "DYN_PMU_PATH" could have
+* names like "uncore_imc_free_running", hence, check if
+* first character is a numerical digit or not.
+*/
+   if (temp[0] >= '0' && temp[0] <= '9') {
sprintf(imc_dir, "%s/%s/", DYN_PMU_PATH,
ep->d_name);
ret = read_from_imc_dir(imc_dir, count);
-- 
2.31.0



[PATCH v6 18/21] selftests/resctrl: Fix unmount resctrl FS

2021-03-16 Thread Fenghua Yu
umount_resctrlfs() directly attempts to unmount resctrl file system without
checking if resctrl FS is already mounted or not. It returns 0 on success
and on failure it prints an error message and returns an error status.
Calling umount_resctrlfs() when resctrl FS isn't mounted will return an
error status.

There could be situations where-in the caller might not know if resctrl
FS is already mounted or not and the caller might still want to unmount
resctrl FS if it's already mounted (For example during teardown).

To support above use cases, change umount_resctrlfs() such that it now
first checks if resctrl FS is already mounted or not and unmounts resctrl
FS only if it's already mounted.

In addition, the test suite may fail to unmount resctrl FS upon exit, for
example when running only the MBA test on a Broadwell (BDW) machine (MBA
isn't supported on BDW CPUs).

This happens because validate_resctrl_feature_request() would mount resctrl
FS to check if mba is enabled on the platform or not and finds that the H/W
doesn't support mba and hence will return false to run_mba_test(). This in
turn makes the main() function return without unmounting resctrl FS.

Tested-by: Babu Moger 
Signed-off-by: Fenghua Yu 
---
 tools/testing/selftests/resctrl/resctrl_tests.c | 2 ++
 tools/testing/selftests/resctrl/resctrlfs.c | 3 +++
 2 files changed, 5 insertions(+)

diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c
index fb246bc41f47..f51b5fc066a3 100644
--- a/tools/testing/selftests/resctrl/resctrl_tests.c
+++ b/tools/testing/selftests/resctrl/resctrl_tests.c
@@ -253,5 +253,7 @@ int main(int argc, char **argv)
if (cat_test)
run_cat_test(cpu_no, no_of_bits);
 
+   umount_resctrlfs();
+
return ksft_exit_pass();
 }
diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c
index 26563175acf6..ade5f2b8b843 100644
--- a/tools/testing/selftests/resctrl/resctrlfs.c
+++ b/tools/testing/selftests/resctrl/resctrlfs.c
@@ -82,6 +82,9 @@ int remount_resctrlfs(bool mum_resctrlfs)
 
 int umount_resctrlfs(void)
 {
+   if (find_resctrl_mount(NULL))
+   return 0;
+
if (umount(RESCTRL_PATH)) {
perror("# Unable to umount resctrl");
 
-- 
2.31.0



[PATCH v6 02/21] selftests/resctrl: Fix compilation issues for global variables

2021-03-16 Thread Fenghua Yu
Reinette reported the following compilation issue on Fedora 32 with gcc
version 10.1.1:

/usr/bin/ld: cqm_test.o:/cqm_test.c:22: multiple definition of
`cache_size'; cat_test.o:/cat_test.c:23: first defined here

The same issue is reported for the long_mask, cbm_mask, count_of_bits etc
variables as well. The linker complains because these variables are
defined globally in two .c files, namely cqm_test.c and cat_test.c, so
each symbol ends up with more than one definition (multiple definition
error).

Taking a closer look at the usage of these variables reveals that these
variables are used only locally in functions such as cqm_resctrl_val()
(defined in cqm_test.c) and cat_perf_miss_val() (defined in cat_test.c).
These variables are not shared between those functions. So, there is no
need for these variables to be global. Hence, fix this issue by making
them static variables.

Reported-by: Reinette Chatre 
Tested-by: Babu Moger 
Signed-off-by: Fenghua Yu 
---
Change Log:
v5:
- Define long_mask, cbm_mask, count_of_bits etc as static variables
  (Shuah).
- Split this patch into patch 2 and 3 (Shuah).

 tools/testing/selftests/resctrl/cat_test.c  | 10 +-
 tools/testing/selftests/resctrl/cqm_test.c  | 10 +-
 tools/testing/selftests/resctrl/resctrl.h   |  2 +-
 tools/testing/selftests/resctrl/resctrlfs.c | 10 +-
 4 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
index 5da43767b973..bdeeb5772592 100644
--- a/tools/testing/selftests/resctrl/cat_test.c
+++ b/tools/testing/selftests/resctrl/cat_test.c
@@ -17,10 +17,10 @@
 #define MAX_DIFF_PERCENT   4
 #define MAX_DIFF   100
 
-int count_of_bits;
-char cbm_mask[256];
-unsigned long long_mask;
-unsigned long cache_size;
+static int count_of_bits;
+static char cbm_mask[256];
+static unsigned long long_mask;
+static unsigned long cache_size;
 
 /*
  * Change schemata. Write schemata to specified
@@ -136,7 +136,7 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
return -1;
 
/* Get default cbm mask for L3/L2 cache */
-   ret = get_cbm_mask(cache_type);
+   ret = get_cbm_mask(cache_type, cbm_mask);
if (ret)
return ret;
 
diff --git a/tools/testing/selftests/resctrl/cqm_test.c b/tools/testing/selftests/resctrl/cqm_test.c
index 5e7308ac63be..de33d1c0466e 100644
--- a/tools/testing/selftests/resctrl/cqm_test.c
+++ b/tools/testing/selftests/resctrl/cqm_test.c
@@ -16,10 +16,10 @@
 #define MAX_DIFF   200
 #define MAX_DIFF_PERCENT   15
 
-int count_of_bits;
-char cbm_mask[256];
-unsigned long long_mask;
-unsigned long cache_size;
+static int count_of_bits;
+static char cbm_mask[256];
+static unsigned long long_mask;
+static unsigned long cache_size;
 
 static int cqm_setup(int num, ...)
 {
@@ -125,7 +125,7 @@ int cqm_resctrl_val(int cpu_no, int n, char **benchmark_cmd)
if (!validate_resctrl_feature_request("cqm"))
return -1;
 
-   ret = get_cbm_mask("L3");
+   ret = get_cbm_mask("L3", cbm_mask);
if (ret)
return ret;
 
diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h
index 39bf59c6b9c5..959c71e39bdc 100644
--- a/tools/testing/selftests/resctrl/resctrl.h
+++ b/tools/testing/selftests/resctrl/resctrl.h
@@ -92,7 +92,7 @@ void tests_cleanup(void);
 void mbm_test_cleanup(void);
 int mba_schemata_change(int cpu_no, char *bw_report, char **benchmark_cmd);
 void mba_test_cleanup(void);
-int get_cbm_mask(char *cache_type);
+int get_cbm_mask(char *cache_type, char *cbm_mask);
 int get_cache_size(int cpu_no, char *cache_type, unsigned long *cache_size);
 void ctrlc_handler(int signum, siginfo_t *info, void *ptr);
 int cat_val(struct resctrl_val_param *param);
diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c
index 19c0ec4045a4..2a16100c9c3f 100644
--- a/tools/testing/selftests/resctrl/resctrlfs.c
+++ b/tools/testing/selftests/resctrl/resctrlfs.c
@@ -49,8 +49,6 @@ static int find_resctrl_mount(char *buffer)
return -ENOENT;
 }
 
-char cbm_mask[256];
-
 /*
  * remount_resctrlfs - Remount resctrl FS at /sys/fs/resctrl
  * @mum_resctrlfs: Should the resctrl FS be remounted?
@@ -205,16 +203,18 @@ int get_cache_size(int cpu_no, char *cache_type, unsigned long *cache_size)
 /*
  * get_cbm_mask - Get cbm mask for given cache
  * @cache_type:  Cache level L2/L3
- *
- * Mask is stored in cbm_mask which is global variable.
+ * @cbm_mask:  cbm_mask returned as a string
  *
  * Return: = 0 on success, < 0 on failure.
  */
-int get_cbm_mask(char *cache_type)
+int get_cbm_mask(char *cache_type, char *cbm_mask)
 {
char cbm_mask_path[1024];
FILE *fp;
 
+   if (!cbm_mask)
+   return -1;
+
sprintf(cbm_mask_path

[PATCH v6 05/21] selftests/resctrl: Ensure sibling CPU is not same as original CPU

2021-03-16 Thread Fenghua Yu
From: Reinette Chatre 

The resctrl tests can accept a CPU on which the tests are run and use a
default of CPU #1 if none is provided. In the CAT test, a "sibling CPU"
from the same package is determined, on which another thread will be
run.

The current algorithm with which a "sibling CPU" is determined does not
take the provided/default CPU into account and when that CPU is the
first CPU in a package then the "sibling CPU" will be selected to be the
same CPU since it starts by picking the first CPU from core_siblings_list.

Fix the "sibling CPU" selection by taking the provided/default CPU into
account and ensuring a sibling that is a different CPU is selected.
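A sketch of the corrected selection loop, assuming the sibling list has
already been read into a writable buffer. pick_sibling() is a
hypothetical stand-in for get_core_sibling(), and returning -1 when no
suitable sibling exists is an assumption of this sketch.

```c
#include <stdlib.h>
#include <string.h>

/*
 * siblings: writable copy of core_siblings_list contents, e.g. "0-3"
 *           (strtok() modifies the buffer).
 * cpu_no:   CPU the test itself runs on.
 * Return the first listed CPU that is neither 0 nor cpu_no, else -1.
 */
static int pick_sibling(char *siblings, int cpu_no)
{
	char *token = strtok(siblings, "-,");

	while (token) {
		int sibling_cpu_no = atoi(token);

		/* Skip core 0 and the CPU the test already runs on */
		if (sibling_cpu_no != 0 && sibling_cpu_no != cpu_no)
			return sibling_cpu_no;

		token = strtok(NULL, "-,");
	}

	return -1;
}
```

Like the original loop, this only looks at the numbers that appear
literally in the list, so for "0-3" it considers 0 and 3 but not 1 or 2.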

Tested-by: Babu Moger 
Signed-off-by: Reinette Chatre 
Signed-off-by: Fenghua Yu 
---
Change Log:
v5:
- Move from v4's patch 8 to this patch as the fix patch should be first
  (Shuah).

 tools/testing/selftests/resctrl/resctrlfs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c
index 4174e48e06d1..bc52076bee7f 100644
--- a/tools/testing/selftests/resctrl/resctrlfs.c
+++ b/tools/testing/selftests/resctrl/resctrlfs.c
@@ -268,7 +268,7 @@ int get_core_sibling(int cpu_no)
while (token) {
sibling_cpu_no = atoi(token);
/* Skipping core 0 as we don't want to run test on core 0 */
-   if (sibling_cpu_no != 0)
+   if (sibling_cpu_no != 0 && sibling_cpu_no != cpu_no)
break;
token = strtok(NULL, "-,");
}
-- 
2.31.0



[PATCH v6 06/21] selftests/resctrl: Fix missing options "-n" and "-p"

2021-03-16 Thread Fenghua Yu
The resctrl test suite accepts command line arguments (like -b, -t, -n and -p)
as documented in the help. But passing -n and -p throws an invalid option
error. This happens because -n and -p are missing in the list of
characters that getopt() recognizes as valid arguments. Hence, they are
treated as invalid options.

Fix this by adding them to the list of characters that getopt() recognizes
as valid arguments. Please note that the main() function already has the
logic to deal with the values passed as part of these arguments and hence
no changes are needed there.

Tested-by: Babu Moger 
Signed-off-by: Fenghua Yu 
---
Change Log:
v5:
- Move from v4's patch 9 to this patch as the fix patch should be first
  (Shuah).

 tools/testing/selftests/resctrl/resctrl_tests.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c
index 4b109a59f72d..ac2269610aa9 100644
--- a/tools/testing/selftests/resctrl/resctrl_tests.c
+++ b/tools/testing/selftests/resctrl/resctrl_tests.c
@@ -73,7 +73,7 @@ int main(int argc, char **argv)
}
}
 
-   while ((c = getopt(argc_new, argv, "ht:b:")) != -1) {
+   while ((c = getopt(argc_new, argv, "ht:b:n:p:")) != -1) {
char *token;
 
switch (c) {
-- 
2.31.0



[PATCH v6 04/21] selftests/resctrl: Clean up resctrl features check

2021-03-16 Thread Fenghua Yu
The resctrl feature checks call strcmp() to compare feature strings
(e.g. "mba", "cat" etc). These checks are error prone and do not follow
good coding style. Define the feature strings as macros and call
strncmp() instead to avoid the potential issues.
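
A minimal sketch of the comparison pattern, assuming the macro values match
the string literals they replace in the diff. The helper is_feature() is
hypothetical and only wraps the strncmp() call for illustration; the patch
itself calls strncmp() directly.

```c
#include <stdbool.h>
#include <string.h>

/* Feature names as macros, mirroring the CAT_STR/CQM_STR names in the diff. */
#define CAT_STR	"cat"
#define CQM_STR	"cqm"

/*
 * sizeof(CAT_STR) counts the terminating NUL, so the comparison also covers
 * the terminator: "cat" matches, "catx" and "ca" do not.
 */
static bool is_feature(const char *resctrl_val, const char *name, size_t size)
{
	return !strncmp(resctrl_val, name, size);
}
```

Using sizeof() on the macro rather than a hand-written length keeps the bound
in step with the string if the macro ever changes.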

Suggested-by: Shuah Khan 
Tested-by: Babu Moger 
Signed-off-by: Fenghua Yu 
---
Change Log:
v5:
- Remove is_cat() etc functions and directly call strncmp() to check
  the features (Shuah).

 tools/testing/selftests/resctrl/cache.c   |  8 +++
 tools/testing/selftests/resctrl/cat_test.c|  2 +-
 tools/testing/selftests/resctrl/cqm_test.c|  2 +-
 tools/testing/selftests/resctrl/fill_buf.c|  4 ++--
 tools/testing/selftests/resctrl/mba_test.c|  2 +-
 tools/testing/selftests/resctrl/mbm_test.c|  2 +-
 tools/testing/selftests/resctrl/resctrl.h |  5 +
 .../testing/selftests/resctrl/resctrl_tests.c | 12 +-
 tools/testing/selftests/resctrl/resctrl_val.c | 22 +--
 tools/testing/selftests/resctrl/resctrlfs.c   | 17 +++---
 10 files changed, 41 insertions(+), 35 deletions(-)

diff --git a/tools/testing/selftests/resctrl/cache.c 
b/tools/testing/selftests/resctrl/cache.c
index 38dbf4962e33..5922cc1b0386 100644
--- a/tools/testing/selftests/resctrl/cache.c
+++ b/tools/testing/selftests/resctrl/cache.c
@@ -182,7 +182,7 @@ int measure_cache_vals(struct resctrl_val_param *param, int 
bm_pid)
/*
 * Measure cache miss from perf.
 */
-   if (!strcmp(param->resctrl_val, "cat")) {
+   if (!strncmp(param->resctrl_val, CAT_STR, sizeof(CAT_STR))) {
ret = get_llc_perf(&llc_perf_miss);
if (ret < 0)
return ret;
@@ -192,7 +192,7 @@ int measure_cache_vals(struct resctrl_val_param *param, int 
bm_pid)
/*
 * Measure llc occupancy from resctrl.
 */
-   if (!strcmp(param->resctrl_val, "cqm")) {
+   if (!strncmp(param->resctrl_val, CQM_STR, sizeof(CQM_STR))) {
ret = get_llc_occu_resctrl(&llc_occu_resc);
if (ret < 0)
return ret;
@@ -234,7 +234,7 @@ int cat_val(struct resctrl_val_param *param)
if (ret)
return ret;
 
-   if ((strcmp(resctrl_val, "cat") == 0)) {
+   if (!strncmp(resctrl_val, CAT_STR, sizeof(CAT_STR))) {
ret = initialize_llc_perf();
if (ret)
return ret;
@@ -242,7 +242,7 @@ int cat_val(struct resctrl_val_param *param)
 
/* Test runs until the callback setup() tells the test to stop. */
while (1) {
-   if (strcmp(resctrl_val, "cat") == 0) {
+   if (!strncmp(resctrl_val, CAT_STR, sizeof(CAT_STR))) {
ret = param->setup(1, param);
if (ret) {
ret = 0;
diff --git a/tools/testing/selftests/resctrl/cat_test.c 
b/tools/testing/selftests/resctrl/cat_test.c
index bdeeb5772592..20823725daca 100644
--- a/tools/testing/selftests/resctrl/cat_test.c
+++ b/tools/testing/selftests/resctrl/cat_test.c
@@ -164,7 +164,7 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
return -1;
 
struct resctrl_val_param param = {
-   .resctrl_val= "cat",
+   .resctrl_val= CAT_STR,
.cpu_no = cpu_no,
.mum_resctrlfs  = 0,
.setup  = cat_setup,
diff --git a/tools/testing/selftests/resctrl/cqm_test.c 
b/tools/testing/selftests/resctrl/cqm_test.c
index de33d1c0466e..271752e9ef5b 100644
--- a/tools/testing/selftests/resctrl/cqm_test.c
+++ b/tools/testing/selftests/resctrl/cqm_test.c
@@ -145,7 +145,7 @@ int cqm_resctrl_val(int cpu_no, int n, char **benchmark_cmd)
}
 
struct resctrl_val_param param = {
-   .resctrl_val= "cqm",
+   .resctrl_val= CQM_STR,
.ctrlgrp= "c1",
.mongrp = "m1",
.cpu_no = cpu_no,
diff --git a/tools/testing/selftests/resctrl/fill_buf.c 
b/tools/testing/selftests/resctrl/fill_buf.c
index 79c611c99a3d..51e5cf22632f 100644
--- a/tools/testing/selftests/resctrl/fill_buf.c
+++ b/tools/testing/selftests/resctrl/fill_buf.c
@@ -115,7 +115,7 @@ static int fill_cache_read(unsigned char *start_ptr, 
unsigned char *end_ptr,
 
while (1) {
ret = fill_one_span_read(start_ptr, end_ptr);
-   if (!strcmp(resctrl_val, "cat"))
+   if (!strncmp(resctrl_val, CAT_STR, sizeof(CAT_STR)))
break;
}
 
@@ -134,7 +134,7 @@ static int fill_cache_write(unsigned char *start_ptr, 
unsigned char *end_ptr,
 {
while (1) {
fill_one_span_write(start_ptr, end_ptr);
-   if (!strcmp(resctrl_val, "cat"))
+  

[PATCH v6 11/21] selftests/resctrl: Add config dependencies

2021-03-16 Thread Fenghua Yu
Add the config file for test dependencies.

Suggested-by: Shuah Khan 
Tested-by: Babu Moger 
Signed-off-by: Fenghua Yu 
---
Change Log:
v5:
- Add this patch (Shuah)

 tools/testing/selftests/resctrl/config | 2 ++
 1 file changed, 2 insertions(+)
 create mode 100644 tools/testing/selftests/resctrl/config

diff --git a/tools/testing/selftests/resctrl/config 
b/tools/testing/selftests/resctrl/config
new file mode 100644
index ..8d9f2deb56ed
--- /dev/null
+++ b/tools/testing/selftests/resctrl/config
@@ -0,0 +1,2 @@
+CONFIG_X86_CPU_RESCTRL=y
+CONFIG_PROC_CPU_RESCTRL=y
-- 
2.31.0



[PATCH v6 07/21] selftests/resctrl: Rename CQM test as CMT test

2021-03-16 Thread Fenghua Yu
CMT (Cache Monitoring Technology) [1] is a H/W feature that reports cache
occupancy of a process. resctrl selftest suite has a unit test to test CMT
for LLC but the test is named as CQM (Cache Quality Monitoring).
Furthermore, the unit test source file is named as cqm_test.c and several
functions, variables, comments, preprocessors and statements widely use
"cqm" as either suffix or prefix. This rampant misusage of CQM for CMT
might confuse someone who is newly looking at resctrl selftests because
this feature is named CMT in the Intel Software Developer's Manual.

Hence, rename all the occurrences (unit test source file name, functions,
variables, comments and preprocessors) of cqm with cmt.

[1] Please see Intel SDM, Volume 3, chapter 17 and section 18 for more
information on CMT: 
https://software.intel.com/content/www/us/en/develop/articles/intel-sdm.html

Suggested-by: Reinette Chatre 
Tested-by: Babu Moger 
Signed-off-by: Fenghua Yu 
---
 tools/testing/selftests/resctrl/README|  4 +--
 tools/testing/selftests/resctrl/cache.c   |  4 +--
 .../resctrl/{cqm_test.c => cmt_test.c}| 20 +++---
 tools/testing/selftests/resctrl/resctrl.h |  6 ++---
 .../testing/selftests/resctrl/resctrl_tests.c | 26 +--
 tools/testing/selftests/resctrl/resctrl_val.c | 12 -
 tools/testing/selftests/resctrl/resctrlfs.c   | 10 +++
 7 files changed, 41 insertions(+), 41 deletions(-)
 rename tools/testing/selftests/resctrl/{cqm_test.c => cmt_test.c} (89%)

diff --git a/tools/testing/selftests/resctrl/README 
b/tools/testing/selftests/resctrl/README
index 6e5a0ffa18e8..4b36b25b6ac0 100644
--- a/tools/testing/selftests/resctrl/README
+++ b/tools/testing/selftests/resctrl/README
@@ -46,8 +46,8 @@ ARGUMENTS
 Parameter '-h' shows usage information.
 
 usage: resctrl_tests [-h] [-b "benchmark_cmd [options]"] [-t test list] [-n 
no_of_bits]
--b benchmark_cmd [options]: run specified benchmark for MBM, MBA and 
CQM default benchmark is builtin fill_buf
--t test list: run tests specified in the test list, e.g. -t mbm, mba, 
cqm, cat
+-b benchmark_cmd [options]: run specified benchmark for MBM, MBA and 
CMT default benchmark is builtin fill_buf
+-t test list: run tests specified in the test list, e.g. -t mbm, mba, 
cmt, cat
 -n no_of_bits: run cache tests using specified no of bits in cache bit 
mask
 -p cpu_no: specify CPU number to run the test. 1 is default
 -h: help
diff --git a/tools/testing/selftests/resctrl/cache.c 
b/tools/testing/selftests/resctrl/cache.c
index 5922cc1b0386..2aa1b5c7d9e1 100644
--- a/tools/testing/selftests/resctrl/cache.c
+++ b/tools/testing/selftests/resctrl/cache.c
@@ -111,7 +111,7 @@ static int get_llc_perf(unsigned long *llc_perf_miss)
 
 /*
  * Get LLC Occupancy as reported by RESCTRL FS
- * For CQM,
+ * For CMT,
  * 1. If con_mon grp and mon grp given, then read from mon grp in
  * con_mon grp
  * 2. If only con_mon grp given, then read from con_mon grp
@@ -192,7 +192,7 @@ int measure_cache_vals(struct resctrl_val_param *param, int 
bm_pid)
/*
 * Measure llc occupancy from resctrl.
 */
-   if (!strncmp(param->resctrl_val, CQM_STR, sizeof(CQM_STR))) {
+   if (!strncmp(param->resctrl_val, CMT_STR, sizeof(CMT_STR))) {
ret = get_llc_occu_resctrl(&llc_occu_resc);
if (ret < 0)
return ret;
diff --git a/tools/testing/selftests/resctrl/cqm_test.c 
b/tools/testing/selftests/resctrl/cmt_test.c
similarity index 89%
rename from tools/testing/selftests/resctrl/cqm_test.c
rename to tools/testing/selftests/resctrl/cmt_test.c
index 271752e9ef5b..4b63838dda32 100644
--- a/tools/testing/selftests/resctrl/cqm_test.c
+++ b/tools/testing/selftests/resctrl/cmt_test.c
@@ -1,6 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0
 /*
- * Cache Monitoring Technology (CQM) test
+ * Cache Monitoring Technology (CMT) test
  *
  * Copyright (C) 2018 Intel Corporation
  *
@@ -11,7 +11,7 @@
 #include "resctrl.h"
 #include 
 
-#define RESULT_FILE_NAME   "result_cqm"
+#define RESULT_FILE_NAME   "result_cmt"
 #define NUM_OF_RUNS5
 #define MAX_DIFF   200
 #define MAX_DIFF_PERCENT   15
@@ -21,7 +21,7 @@ static char cbm_mask[256];
 static unsigned long long_mask;
 static unsigned long cache_size;
 
-static int cqm_setup(int num, ...)
+static int cmt_setup(int num, ...)
 {
struct resctrl_val_param *p;
va_list param;
@@ -58,7 +58,7 @@ static void show_cache_info(unsigned long sum_llc_occu_resc, 
int no_of_bits,
else
res = false;
 
-   printf("%sok CQM: diff within %d, %d\%%\n", res ? "" : "not",
+   printf("%sok CMT: diff within %d, %d\%%\n", res ? "" : "not",
   MAX_DIFF, (int)MAX_DIFF_PERCENT);
 
printf("# diff: %l

[PATCH v6 09/21] selftests/resctrl: Share show_cache_info() by CAT and CMT tests

2021-03-16 Thread Fenghua Yu
show_cache_info() is defined separately in the CAT and CMT tests, but
the two functions do the same thing, so there is no need to define
them separately. Share one function between the tests.

Suggested-by: Shuah Khan 
Tested-by: Babu Moger 
Signed-off-by: Fenghua Yu 
---
Change Log:
v5:
- Add this patch (Shuah)

 tools/testing/selftests/resctrl/cache.c| 42 ++
 tools/testing/selftests/resctrl/cat_test.c | 28 ++-
 tools/testing/selftests/resctrl/cmt_test.c | 33 ++---
 tools/testing/selftests/resctrl/resctrl.h  |  4 +++
 4 files changed, 52 insertions(+), 55 deletions(-)

diff --git a/tools/testing/selftests/resctrl/cache.c 
b/tools/testing/selftests/resctrl/cache.c
index 2aa1b5c7d9e1..362e3a418caa 100644
--- a/tools/testing/selftests/resctrl/cache.c
+++ b/tools/testing/selftests/resctrl/cache.c
@@ -270,3 +270,45 @@ int cat_val(struct resctrl_val_param *param)
 
return ret;
 }
+
+/*
+ * show_cache_info:show cache test result information
+ * @sum_llc_val:   sum of LLC cache result data
+ * @no_of_bits:number of bits
+ * @cache_span:cache span in bytes for CMT or in lines for CAT
+ * @max_diff:  max difference
+ * @max_diff_percent:  max difference percentage
+ * @num_of_runs:   number of runs
+ * @platform:  show test information on this platform
+ * @cmt:   CMT test or CAT test
+ *
+ * Return: 0 on success. non-zero on failure.
+ */
+int show_cache_info(unsigned long sum_llc_val, int no_of_bits,
+   unsigned long cache_span, unsigned long max_diff,
+   unsigned long max_diff_percent, unsigned long num_of_runs,
+   bool platform, bool cmt)
+{
+   unsigned long avg_llc_val = 0;
+   float diff_percent;
+   long avg_diff = 0;
+   int ret;
+
+   avg_llc_val = sum_llc_val / (num_of_runs - 1);
+   avg_diff = (long)abs(cache_span - avg_llc_val);
+   diff_percent = ((float)cache_span - avg_llc_val) / cache_span * 100;
+
+   ret = platform && abs((int)diff_percent) > max_diff_percent &&
+ (cmt ? (abs(avg_diff) > max_diff) : true);
+
+   ksft_print_msg("%s cache miss rate within %d%%\n",
+  ret ? "Fail:" : "Pass:", max_diff_percent);
+
+   ksft_print_msg("Percent diff=%d\n", abs((int)diff_percent));
+   ksft_print_msg("Number of bits: %d\n", no_of_bits);
+   ksft_print_msg("Average LLC val: %lu\n", avg_llc_val);
+   ksft_print_msg("Cache span (%s): %lu\n", cmt ? "bytes" : "lines",
+  cache_span);
+
+   return ret;
+}
diff --git a/tools/testing/selftests/resctrl/cat_test.c 
b/tools/testing/selftests/resctrl/cat_test.c
index 1daf911076c7..090d3afc7a78 100644
--- a/tools/testing/selftests/resctrl/cat_test.c
+++ b/tools/testing/selftests/resctrl/cat_test.c
@@ -52,30 +52,6 @@ static int cat_setup(int num, ...)
return ret;
 }
 
-static int show_cache_info(unsigned long sum_llc_perf_miss, int no_of_bits,
-  unsigned long span)
-{
-   unsigned long allocated_cache_lines = span / 64;
-   unsigned long avg_llc_perf_miss = 0;
-   float diff_percent;
-   int ret;
-
-   avg_llc_perf_miss = sum_llc_perf_miss / (NUM_OF_RUNS - 1);
-   diff_percent = ((float)allocated_cache_lines - avg_llc_perf_miss) /
-   allocated_cache_lines * 100;
-
-   ret = !is_amd && abs((int)diff_percent) > MAX_DIFF_PERCENT;
-   ksft_print_msg("Cache miss rate %swithin %d%%\n",
-  ret ? "not " : "", MAX_DIFF_PERCENT);
-
-   ksft_print_msg("Percent diff=%d\n", abs((int)diff_percent));
-   ksft_print_msg("Number of bits: %d\n", no_of_bits);
-   ksft_print_msg("Avg_llc_perf_miss: %lu\n", avg_llc_perf_miss);
-   ksft_print_msg("Allocated cache lines: %lu\n", allocated_cache_lines);
-
-   return ret;
-}
-
 static int check_results(struct resctrl_val_param *param)
 {
char *token_array[8], temp[512];
@@ -111,7 +87,9 @@ static int check_results(struct resctrl_val_param *param)
fclose(fp);
no_of_bits = count_bits(param->mask);
 
-   return show_cache_info(sum_llc_perf_miss, no_of_bits, param->span);
+   return show_cache_info(sum_llc_perf_miss, no_of_bits, param->span / 64,
+  MAX_DIFF, MAX_DIFF_PERCENT, NUM_OF_RUNS,
+  !is_amd, false);
 }
 
 void cat_test_cleanup(void)
diff --git a/tools/testing/selftests/resctrl/cmt_test.c 
b/tools/testing/selftests/resctrl/cmt_test.c
index b1ab1bd1f74d..8968e36db99d 100644
--- a/tools/testing/selftests/resctrl/cmt_test.c
+++ b/tools/testing/selftests/resctrl/cmt_test.c
@@ -39,35 +39,6 @@ static int cmt_setup(int num, ...)
  

[PATCH v6 12/21] selftests/resctrl: Check for resctrl mount point only if resctrl FS is supported

2021-03-16 Thread Fenghua Yu
check_resctrlfs_support() does the following:
1. Checks if the platform supports resctrl file system or not by looking
   for resctrl in /proc/filesystems
2. Calls opendir() on default resctrl file system path
   (i.e. /sys/fs/resctrl)
3. Checks if resctrl file system is mounted or not by looking at
   /proc/mounts

Steps 2 and 3 will fail if the platform does not support resctrl file
system. So, there is no need to check for them if step 1 fails.

Fix this by returning immediately if the platform does not support
resctrl file system.
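
The control flow can be sketched as follows. This is a hypothetical
restructuring for illustration only: stream_mentions() and resctrl_usable()
are assumed names, the two FILE pointers stand in for /proc/filesystems and
/proc/mounts, and the opendir() step is left out.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Return true if any line of fp contains the substring name. */
static bool stream_mentions(FILE *fp, const char *name)
{
	char line[256];

	while (fgets(line, sizeof(line), fp)) {
		if (strstr(line, name))
			return true;
	}
	return false;
}

/*
 * fs_list stands in for /proc/filesystems and mounts for /proc/mounts.
 * The mount check is only reached when the kernel lists the file system.
 */
static bool resctrl_usable(FILE *fs_list, FILE *mounts)
{
	if (!stream_mentions(fs_list, "resctrl"))
		return false;	/* step 1 failed: skip the remaining steps */

	return stream_mentions(mounts, "resctrl");
}
```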

Tested-by: Babu Moger 
Signed-off-by: Fenghua Yu 
---
 tools/testing/selftests/resctrl/resctrlfs.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tools/testing/selftests/resctrl/resctrlfs.c 
b/tools/testing/selftests/resctrl/resctrlfs.c
index 6b22a186790a..87195eb78356 100644
--- a/tools/testing/selftests/resctrl/resctrlfs.c
+++ b/tools/testing/selftests/resctrl/resctrlfs.c
@@ -570,6 +570,9 @@ bool check_resctrlfs_support(void)
ksft_print_msg("%s kernel supports resctrl filesystem\n",
   ret ? "Pass:" : "Fail:");
 
+   if (!ret)
+   return ret;
+
dp = opendir(RESCTRL_PATH);
ksft_print_msg("%s resctrl mountpoint \"%s\" exists\n",
   dp ? "Pass:" : "Fail:", RESCTRL_PATH);
-- 
2.31.0



[PATCH v6 08/21] selftests/resctrl: Call kselftest APIs to log test results

2021-03-16 Thread Fenghua Yu
Call kselftest APIs instead of using printf() to log test results
for cleaner code and better future extension.

Suggested-by: Shuah Khan 
Tested-by: Babu Moger 
Signed-off-by: Fenghua Yu 
---
Change Log:
v6:
- Capitalize the first letter in printed msg (Babu).

v5:
- Add this patch (Shuah)

 tools/testing/selftests/resctrl/cat_test.c| 37 +++
 tools/testing/selftests/resctrl/cmt_test.c| 42 -
 tools/testing/selftests/resctrl/mba_test.c| 24 +-
 tools/testing/selftests/resctrl/mbm_test.c| 28 ++--
 tools/testing/selftests/resctrl/resctrl.h |  2 +-
 .../testing/selftests/resctrl/resctrl_tests.c | 40 +
 tools/testing/selftests/resctrl/resctrl_val.c |  4 +-
 tools/testing/selftests/resctrl/resctrlfs.c   | 45 +++
 8 files changed, 105 insertions(+), 117 deletions(-)

diff --git a/tools/testing/selftests/resctrl/cat_test.c 
b/tools/testing/selftests/resctrl/cat_test.c
index 20823725daca..1daf911076c7 100644
--- a/tools/testing/selftests/resctrl/cat_test.c
+++ b/tools/testing/selftests/resctrl/cat_test.c
@@ -52,25 +52,28 @@ static int cat_setup(int num, ...)
return ret;
 }
 
-static void show_cache_info(unsigned long sum_llc_perf_miss, int no_of_bits,
-   unsigned long span)
+static int show_cache_info(unsigned long sum_llc_perf_miss, int no_of_bits,
+  unsigned long span)
 {
unsigned long allocated_cache_lines = span / 64;
unsigned long avg_llc_perf_miss = 0;
float diff_percent;
+   int ret;
 
avg_llc_perf_miss = sum_llc_perf_miss / (NUM_OF_RUNS - 1);
diff_percent = ((float)allocated_cache_lines - avg_llc_perf_miss) /
allocated_cache_lines * 100;
 
-   printf("%sok CAT: cache miss rate within %d%%\n",
-  !is_amd && abs((int)diff_percent) > MAX_DIFF_PERCENT ?
-  "not " : "", MAX_DIFF_PERCENT);
-   tests_run++;
-   printf("# Percent diff=%d\n", abs((int)diff_percent));
-   printf("# Number of bits: %d\n", no_of_bits);
-   printf("# Avg_llc_perf_miss: %lu\n", avg_llc_perf_miss);
-   printf("# Allocated cache lines: %lu\n", allocated_cache_lines);
+   ret = !is_amd && abs((int)diff_percent) > MAX_DIFF_PERCENT;
+   ksft_print_msg("Cache miss rate %swithin %d%%\n",
+  ret ? "not " : "", MAX_DIFF_PERCENT);
+
+   ksft_print_msg("Percent diff=%d\n", abs((int)diff_percent));
+   ksft_print_msg("Number of bits: %d\n", no_of_bits);
+   ksft_print_msg("Avg_llc_perf_miss: %lu\n", avg_llc_perf_miss);
+   ksft_print_msg("Allocated cache lines: %lu\n", allocated_cache_lines);
+
+   return ret;
 }
 
 static int check_results(struct resctrl_val_param *param)
@@ -80,7 +83,7 @@ static int check_results(struct resctrl_val_param *param)
int runs = 0, no_of_bits = 0;
FILE *fp;
 
-   printf("# Checking for pass/fail\n");
+   ksft_print_msg("Checking for pass/fail\n");
fp = fopen(param->filename, "r");
if (!fp) {
perror("# Cannot open file");
@@ -108,9 +111,7 @@ static int check_results(struct resctrl_val_param *param)
fclose(fp);
no_of_bits = count_bits(param->mask);
 
-   show_cache_info(sum_llc_perf_miss, no_of_bits, param->span);
-
-   return 0;
+   return show_cache_info(sum_llc_perf_miss, no_of_bits, param->span);
 }
 
 void cat_test_cleanup(void)
@@ -146,15 +147,15 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
ret = get_cache_size(cpu_no, cache_type, &cache_size);
if (ret)
return ret;
-   printf("cache size :%lu\n", cache_size);
+   ksft_print_msg("Cache size :%lu\n", cache_size);
 
/* Get max number of bits from default-cabm mask */
count_of_bits = count_bits(long_mask);
 
if (n < 1 || n > count_of_bits - 1) {
-   printf("Invalid input value for no_of_bits n!\n");
-   printf("Please Enter value in range 1 to %d\n",
-  count_of_bits - 1);
+   ksft_print_msg("Invalid input value for no_of_bits n!\n");
+   ksft_print_msg("Please enter value in range 1 to %d\n",
+  count_of_bits - 1);
return -1;
}
 
diff --git a/tools/testing/selftests/resctrl/cmt_test.c 
b/tools/testing/selftests/resctrl/cmt_test.c
index 4b63838dda32..b1ab1bd1f74d 100644
--- a/tools/testing/selftests/resctrl/cmt_test.c
+++ b/tools/testing/selftests/resctrl/cmt_test.c
@@ -39,36 +39,33 @@ static int cmt_setup(int num, ...)
return 0;
 }
 
-static void show_cache_info(unsigned long sum_llc_occu_re

[PATCH v6 01/21] selftests/resctrl: Enable gcc checks to detect buffer overflows

2021-03-16 Thread Fenghua Yu
David reported a buffer overflow error in the check_results() function of
the cmt unit test and he suggested enabling _FORTIFY_SOURCE gcc compiler
option to automatically detect any such errors.

The Feature Test Macros man page describes _FORTIFY_SOURCE as below:

"Defining this macro causes some lightweight checks to be performed to
detect some buffer overflow errors when employing various string and memory
manipulation functions (for example, memcpy, memset, stpcpy, strcpy,
strncpy, strcat, strncat, sprintf, snprintf, vsprintf, vsnprintf, gets, and
wide character variants thereof). For some functions, argument consistency
is checked; for example, a check is made that open has been supplied with a
mode argument when the specified flags include O_CREAT. Not all problems
are detected, just some common cases.

If _FORTIFY_SOURCE is set to 1, with compiler optimization level 1 (gcc
-O1) and above, checks that shouldn't change the behavior of conforming
programs are performed.

With _FORTIFY_SOURCE set to 2, some more checking is added, but some
conforming programs might fail.

Some of the checks can be performed at compile time (via macros logic
implemented in header files), and result in compiler warnings; other checks
take place at run time, and result in a run-time error if the check fails.

Use of this macro requires compiler support, available with gcc since
version 4.0."

Fix the buffer overflow error in the check_results() function of the cmt
unit test and enable _FORTIFY_SOURCE gcc check to catch any future buffer
overflow errors.
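
A minimal sketch of the pattern behind the fix, assuming a 512-byte local
buffer (the size is an assumption for illustration, and first_line_len() is
a hypothetical helper, not a function from the test suite):

```c
#include <stdio.h>
#include <string.h>

/*
 * Read the first line of fp into a fixed-size local buffer and return its
 * length, or -1 if nothing could be read. Passing sizeof(temp) ties the
 * limit to the real buffer size; a literal such as 1024 would let fgets()
 * write past the end of a 512-byte buffer on long lines.
 */
static int first_line_len(FILE *fp)
{
	char temp[512];

	if (!fgets(temp, sizeof(temp), fp))
		return -1;

	return (int)strlen(temp);
}
```

fgets() stores at most sizeof(temp) - 1 characters plus the terminating NUL,
so an over-long line is truncated instead of overflowing the buffer.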

Reported-by: David Binderman 
Suggested-by: David Binderman 
Tested-by: Babu Moger 
Signed-off-by: Fenghua Yu 
---
Change Log:
v5:
- Move from v4's patch 11 to patch 1 so the fix patch should be first
  (Shuah).

 tools/testing/selftests/resctrl/Makefile   | 2 +-
 tools/testing/selftests/resctrl/cqm_test.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/resctrl/Makefile 
b/tools/testing/selftests/resctrl/Makefile
index d585cc1948cc..6bcee2ec91a9 100644
--- a/tools/testing/selftests/resctrl/Makefile
+++ b/tools/testing/selftests/resctrl/Makefile
@@ -1,5 +1,5 @@
 CC = $(CROSS_COMPILE)gcc
-CFLAGS = -g -Wall
+CFLAGS = -g -Wall -O2 -D_FORTIFY_SOURCE=2
 SRCS=$(wildcard *.c)
 OBJS=$(SRCS:.c=.o)
 
diff --git a/tools/testing/selftests/resctrl/cqm_test.c 
b/tools/testing/selftests/resctrl/cqm_test.c
index c8756152bd61..5e7308ac63be 100644
--- a/tools/testing/selftests/resctrl/cqm_test.c
+++ b/tools/testing/selftests/resctrl/cqm_test.c
@@ -86,7 +86,7 @@ static int check_results(struct resctrl_val_param *param, int 
no_of_bits)
return errno;
}
 
-   while (fgets(temp, 1024, fp)) {
+   while (fgets(temp, sizeof(temp), fp)) {
char *token = strtok(temp, ":\t");
int fields = 0;
 
-- 
2.31.0



[PATCH v6 10/21] selftests/resctrl: Fix a printed message

2021-03-16 Thread Fenghua Yu
From: Reinette Chatre 

Add a missing newline to the printed help text to improve readability.

Tested-by: Babu Moger 
Signed-off-by: Reinette Chatre 
Signed-off-by: Fenghua Yu 
---
Change Log:
v5:
- Remove the "notok" fix part because the API change fixes it already.

 tools/testing/selftests/resctrl/resctrl_tests.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c 
b/tools/testing/selftests/resctrl/resctrl_tests.c
index ccc1d6987cc6..355bd28b996a 100644
--- a/tools/testing/selftests/resctrl/resctrl_tests.c
+++ b/tools/testing/selftests/resctrl/resctrl_tests.c
@@ -37,8 +37,8 @@ void detect_amd(void)
 static void cmd_help(void)
 {
printf("usage: resctrl_tests [-h] [-b \"benchmark_cmd [options]\"] [-t 
test list] [-n no_of_bits]\n");
-   printf("\t-b benchmark_cmd [options]: run specified benchmark for MBM, 
MBA and CMT");
-   printf("\t default benchmark is builtin fill_buf\n");
+   printf("\t-b benchmark_cmd [options]: run specified benchmark for MBM, 
MBA and CMT\n");
+   printf("\t   default benchmark is builtin fill_buf\n");
printf("\t-t test list: run tests specified in the test list, ");
printf("e.g. -t mbm, mba, cmt, cat\n");
printf("\t-n no_of_bits: run cache tests using specified no of bits in 
cache bit mask\n");
-- 
2.31.0



[PATCH v6 00/21] Miscellaneous fixes for resctrl selftests

2021-03-16 Thread Fenghua Yu
This patch set has several miscellaneous fixes to the resctrl selftest tool
that are easily visible to the user. V1 had fixes to the CAT test and CMT
test, but they were dropped in V2 because having them here made the patchset
humongous. So, changes to the CAT test and CMT test will be posted in another
patchset.

Change Log:
v6:
- Add Tested-by: Babu Moger .
- Replace "cat" by CAT_STR etc (Babu).
- Capitalize the first letter of printed message (Babu).

v5:
- Address various comments from Shuah Khan:
  1. Move a few fixing patches before cleaning patches.
  2. Call kselftest APIs to log test results instead of printf().
  3. Add .gitignore to ignore resctrl_tests.
  4. Share show_cache_info() in CAT and CMT tests.
  5. Define long_mask, cbm_mask, count_of_bits etc as static variables.

v4:
- Address various comments from Shuah Khan:
  1. Combine a few patches e.g. a couple of fixing typos patches into one
 and a couple of unmounting patches into one etc.
  2. Add config file.
  3. Remove "Fixes" tags.
  4. Change strcmp() to strncmp().
  5. Move the global variable fixing patch to the patch 1 so that the
 compilation issue is fixed first.

Please note:
- I didn't move the patch of renaming CQM to CMT to the end of the series
  because code and commit messages in a few other patches depend on the
  new term "CMT". If the renaming patch were moved to the end, the earlier
  patches would use the old "CQM" term and code, which would then be
  changed again at the end of the series, requiring more code and
  explanations.
[v3: https://lkml.org/lkml/2020/10/28/137]

v3:
Address various comments (commit messages, return value on test failure,
print failure info on test failure etc) from Reinette and Tony.
[v2: 
https://lore.kernel.org/linux-kselftest/cover.1589835155.git.sai.praneeth.prak...@intel.com/]

v2:
1. Dropped changes to CAT test and CMT test as they will be posted in a later
   series.
2. Added several other fixes
[v1: 
https://lore.kernel.org/linux-kselftest/cover.1583657204.git.sai.praneeth.prak...@intel.com/]

Fenghua Yu (19):
  selftests/resctrl: Enable gcc checks to detect buffer overflows
  selftests/resctrl: Fix compilation issues for global variables
  selftests/resctrl: Fix compilation issues for other global variables
  selftests/resctrl: Clean up resctrl features check
  selftests/resctrl: Fix missing options "-n" and "-p"
  selftests/resctrl: Rename CQM test as CMT test
  selftests/resctrl: Call kselftest APIs to log test results
  selftests/resctrl: Share show_cache_info() by CAT and CMT tests
  selftests/resctrl: Add config dependencies
  selftests/resctrl: Check for resctrl mount point only if resctrl FS is
supported
  selftests/resctrl: Use resctrl/info for feature detection
  selftests/resctrl: Fix MBA/MBM results reporting format
  selftests/resctrl: Don't hard code value of "no_of_bits" variable
  selftests/resctrl: Modularize resctrl test suite main() function
  selftests/resctrl: Skip the test if requested resctrl feature is not
supported
  selftests/resctrl: Fix unmount resctrl FS
  selftests/resctrl: Fix incorrect parsing of iMC counters
  selftests/resctrl: Fix checking for < 0 for unsigned values
  selftests/resctrl: Create .gitignore to include resctrl_tests

Reinette Chatre (2):
  selftests/resctrl: Ensure sibling CPU is not same as original CPU
  selftests/resctrl: Fix a printed message

 tools/testing/selftests/resctrl/.gitignore|   2 +
 tools/testing/selftests/resctrl/Makefile  |   2 +-
 tools/testing/selftests/resctrl/README|   4 +-
 tools/testing/selftests/resctrl/cache.c   |  52 +-
 tools/testing/selftests/resctrl/cat_test.c|  57 ++
 .../resctrl/{cqm_test.c => cmt_test.c}|  75 +++-
 tools/testing/selftests/resctrl/config|   2 +
 tools/testing/selftests/resctrl/fill_buf.c|   4 +-
 tools/testing/selftests/resctrl/mba_test.c|  43 ++---
 tools/testing/selftests/resctrl/mbm_test.c|  42 ++---
 tools/testing/selftests/resctrl/resctrl.h |  29 +++-
 .../testing/selftests/resctrl/resctrl_tests.c | 163 --
 tools/testing/selftests/resctrl/resctrl_val.c |  95 ++
 tools/testing/selftests/resctrl/resctrlfs.c   | 134 --
 14 files changed, 408 insertions(+), 296 deletions(-)
 create mode 100644 tools/testing/selftests/resctrl/.gitignore
 rename tools/testing/selftests/resctrl/{cqm_test.c => cmt_test.c} (56%)
 create mode 100644 tools/testing/selftests/resctrl/config

-- 
2.31.0



[PATCH v6 13/21] selftests/resctrl: Use resctrl/info for feature detection

2021-03-16 Thread Fenghua Yu
Before running any unit test (like cmt, cat, mbm and mba), the resctrl test
suite should first check whether the feature is enabled on the platform (by
the kernel, and not just supported by H/W).
validate_resctrl_feature_request() is supposed to do that. This function
intends to grep for relevant flags in /proc/cpuinfo, but there are several
issues here:

1. validate_resctrl_feature_request() calls fgrep() to get flags from
   /proc/cpuinfo. But, fgrep() can only return a string with maximum of 255
   characters and hence the complete cpu flags are never returned.
2. The substring search logic is also busted. If strstr() finds requested
   resctrl feature in the cpu flags, it returns pointer to the first
   occurrence. But, the logic negates the return value of strstr() and
   hence validate_resctrl_feature_request() returns false if the feature is
   present in the cpu flags and returns true if the feature is not present.
3. validate_resctrl_feature_request() checks if a resctrl feature is
   reported in /proc/cpuinfo flags or not. Having a cpu flag means that the
   H/W supports the feature, but it doesn't mean that the kernel enabled
   it. A user could selectively enable only a subset of resctrl features
   using kernel command line arguments. Hence, /proc/cpuinfo isn't a
   reliable source to check if a feature is enabled or not.
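
The second issue can be illustrated with a small sketch. The helper
flags_have_feature() is a hypothetical name, and the flag strings used with
it are only examples; the point is that strstr() returns a non-NULL pointer
on a match, so its result must not be negated.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Return true if feature appears in the text after the ':' of a
 * "flags : ..." line. A line without ':' never matches.
 */
static bool flags_have_feature(const char *flags_line, const char *feature)
{
	const char *s = strchr(flags_line, ':');

	return s && strstr(s, feature);
}
```

The original logic used the equivalent of `s && !strstr(s, feature)`, which
inverts the result.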

The 3rd issue is the major one, and fixing it requires changing the way
validate_resctrl_feature_request() works. Since /proc/cpuinfo isn't the
right place to check if a resctrl feature is enabled or not, a more
appropriate place is the /sys/fs/resctrl/info directory. Change
validate_resctrl_feature_request() such that:

1. For cat, check if /sys/fs/resctrl/info/L3 directory is present or not
2. For mba, check if /sys/fs/resctrl/info/MB directory is present or not
3. For cmt, check if /sys/fs/resctrl/info/L3_MON directory is present and
   check if /sys/fs/resctrl/info/L3_MON/mon_features has llc_occupancy
4. For mbm, check if /sys/fs/resctrl/info/L3_MON directory is present and
   check if /sys/fs/resctrl/info/L3_MON/mon_features has
   mbm_<total/local>_bytes

Please note that only L3_CAT, L3_CMT, MBA and MBM are supported. CDP and L2
variants can be added later.
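
A minimal sketch of the CMT check described above, under stated assumptions:
dir_exists(), file_lists() and cmt_enabled() are hypothetical helper names,
and the info directory is passed in as a parameter so the logic can be
exercised outside a mounted resctrl file system. On a real system info_dir
would be "/sys/fs/resctrl/info".

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Return true if path exists and is a directory. */
static bool dir_exists(const char *path)
{
	struct stat statbuf;

	return !stat(path, &statbuf) && S_ISDIR(statbuf.st_mode);
}

/* Return true if any line of the file at path contains name. */
static bool file_lists(const char *path, const char *name)
{
	char line[256];
	bool found = false;
	FILE *fp = fopen(path, "r");

	if (!fp)
		return false;

	while (!found && fgets(line, sizeof(line), fp))
		found = strstr(line, name) != NULL;

	fclose(fp);
	return found;
}

/* CMT: L3_MON must exist and mon_features must list llc_occupancy. */
static bool cmt_enabled(const char *info_dir)
{
	char path[512];

	snprintf(path, sizeof(path), "%s/L3_MON", info_dir);
	if (!dir_exists(path))
		return false;

	snprintf(path, sizeof(path), "%s/L3_MON/mon_features", info_dir);
	return file_lists(path, "llc_occupancy");
}
```

The cat and mba checks would reduce to a single dir_exists() call on the L3
and MB directories respectively.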

Reported-by: Reinette Chatre 
Tested-by: Babu Moger 
Signed-off-by: Fenghua Yu 
---
 tools/testing/selftests/resctrl/resctrl.h   |  6 ++-
 tools/testing/selftests/resctrl/resctrlfs.c | 52 -
 2 files changed, 46 insertions(+), 12 deletions(-)

diff --git a/tools/testing/selftests/resctrl/resctrl.h 
b/tools/testing/selftests/resctrl/resctrl.h
index 81f322245ef7..1ad10c47e31d 100644
--- a/tools/testing/selftests/resctrl/resctrl.h
+++ b/tools/testing/selftests/resctrl/resctrl.h
@@ -29,6 +29,10 @@
 #define RESCTRL_PATH   "/sys/fs/resctrl"
 #define PHYS_ID_PATH   "/sys/devices/system/cpu/cpu"
 #define CBM_MASK_PATH  "/sys/fs/resctrl/info"
+#define L3_PATH"/sys/fs/resctrl/info/L3"
+#define MB_PATH"/sys/fs/resctrl/info/MB"
+#define L3_MON_PATH"/sys/fs/resctrl/info/L3_MON"
+#define L3_MON_FEATURES_PATH   "/sys/fs/resctrl/info/L3_MON/mon_features"
 
 #define PARENT_EXIT(err_msg)   \
do {\
@@ -79,7 +83,7 @@ int remount_resctrlfs(bool mum_resctrlfs);
 int get_resource_id(int cpu_no, int *resource_id);
 int umount_resctrlfs(void);
 int validate_bw_report_request(char *bw_report);
-bool validate_resctrl_feature_request(char *resctrl_val);
+bool validate_resctrl_feature_request(const char *resctrl_val);
 char *fgrep(FILE *inf, const char *str);
 int taskset_benchmark(pid_t bm_pid, int cpu_no);
 void run_benchmark(int signum, siginfo_t *info, void *ucontext);
diff --git a/tools/testing/selftests/resctrl/resctrlfs.c 
b/tools/testing/selftests/resctrl/resctrlfs.c
index 87195eb78356..26563175acf6 100644
--- a/tools/testing/selftests/resctrl/resctrlfs.c
+++ b/tools/testing/selftests/resctrl/resctrlfs.c
@@ -606,26 +606,56 @@ char *fgrep(FILE *inf, const char *str)
  * validate_resctrl_feature_request - Check if requested feature is valid.
  * @resctrl_val:   Requested feature
  *
- * Return: 0 on success, non-zero on failure
+ * Return: True if the feature is supported, else false
  */
-bool validate_resctrl_feature_request(char *resctrl_val)
+bool validate_resctrl_feature_request(const char *resctrl_val)
 {
-   FILE *inf = fopen("/proc/cpuinfo", "r");
+   struct stat statbuf;
bool found = false;
char *res;
+   FILE *inf;
 
-   if (!inf)
+   if (!resctrl_val)
return false;
 
-   res = fgrep(inf, "flags");
-
-   if (res) {
-   char *s = strchr(res, ':');
+   if (remount_resctrlfs(false))
+   return false;
 
-   found = s && !strstr(s, resctrl_val);
-  

[PATCH v5 0/3] x86/bus_lock: Enable bus lock detection

2021-03-12 Thread Fenghua Yu
A bus lock [1] is acquired through either split locked access to
writeback (WB) memory or any locked access to non-WB memory. This is
typically >1000 cycles slower than an atomic operation within
a cache line. It also disrupts performance on other cores.
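
As a rough illustration of what "split" means here, the following sketch tests whether an access straddles a cache line boundary. It assumes 64-byte cache lines, which the cover letter does not state, and the helper name crosses_cache_line() is made up for this example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines; not stated in the cover letter. */
#define CACHE_LINE_SIZE 64u

/* True if the access [addr, addr + size) touches more than one cache line. */
static bool crosses_cache_line(uintptr_t addr, size_t size)
{
	if (size == 0)
		return false;

	return (addr / CACHE_LINE_SIZE) !=
	       ((addr + size - 1) / CACHE_LINE_SIZE);
}
```

A locked 8-byte access at offset 60 of a line would be a split lock under this assumption, while the same access at offset 56 would not.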

Although split lock can be detected by #AC trap, the trap is triggered
before the instruction acquires bus lock. This makes it difficult to
mitigate bus lock (e.g. throttle the user application).

Some CPUs have the ability to notify the kernel by a #DB trap after a user
instruction acquires a bus lock and is executed. This allows the kernel
to enforce user application throttling or mitigations.

#DB for bus lock detect fixes issues in #AC for split lock detect:
1) It's architectural ... just need to look at one CPUID bit to know it
   exists
2) The IA32_DEBUGCTL MSR, which reports bus lock in #DB, is per-thread.
   So each process or guest can have different behavior.
3) It has support for VMM/guests (new VMEXIT codes, etc).
4) It detects not only split locks but also bus locks from non-WB.

Hardware only generates #DB for bus lock detect when CPL>0 to avoid
nested #DB from multiple bus locks while the first #DB is being handled.

Use the existing kernel command line parameter "split_lock_detect=" to
handle #DB for bus lock with an additional option "ratelimit=N" to set
bus lock rate limit for a user.
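
To make "N bus locks per second" concrete, here is a minimal user-space sketch of a fixed-window limiter. It is not the kernel implementation from this series; the struct and function names are hypothetical, and the caller supplies the timestamps.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical fixed-window limiter: at most `limit` events per 1000 ms. */
struct bus_lock_limiter {
	uint64_t window_start_ms;
	unsigned int limit;
	unsigned int count;
};

static void limiter_init(struct bus_lock_limiter *rl, unsigned int limit)
{
	rl->window_start_ms = 0;
	rl->limit = limit;
	rl->count = 0;
}

/*
 * Returns true if an event at now_ms fits in the current one-second
 * budget, false if the caller should be throttled.
 */
static bool limiter_allow(struct bus_lock_limiter *rl, uint64_t now_ms)
{
	if (now_ms - rl->window_start_ms >= 1000) {
		/* A new one-second window starts; reset the budget. */
		rl->window_start_ms = now_ms;
		rl->count = 0;
	}

	if (rl->count >= rl->limit)
		return false;

	rl->count++;
	return true;
}
```

A handler built on this idea would delay the offending task whenever limiter_allow() returns false.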

[1] Intel Instruction Set Extension Chapter 9:
https://software.intel.com/content/dam/develop/public/us/en/documents/architecture-instruction-set-extensions-programming-reference.pdf

Change Log:
v5:
Address all comments from Thomas:
- In the cover letter, update the latest ISE link to include the #DB
  for bus lock spec.
- In patch 1, add commit message for breakpoint and bus lock on the same
  instruction.
- In patch 2, change warn to #AC if both #AC and #DB are supported, remove
  sld and bld variables, remove bus lock checking in handle_bus_lock() etc.
- In patch 3 and 4, remove bld_ratelimit < HZ/2 check and define
  bld_ratelimit only for Intel CPUs.
- Merge patch 2 and 3 into one patch for handling warn, fatal, and
  ratelimit.
v4 is here: 
https://lore.kernel.org/lkml/20201124205245.4164633-2-fenghua...@intel.com/

v4:
- Fix a ratelimit wording issue in the doc (Randy).
- Patch 4 is acked by Randy (Randy).

v3:
- Enable Bus Lock Detection when fatal to handle bus lock from non-WB
  (PeterZ).
- Add Acked-by: PeterZ in patch 2.

v2:
- Send SIGBUS in fatal case for bus lock #DB (PeterZ).

v1:
- Check bus lock bit by its positive polarity (Xiaoyao).
- Fix a few wording issues in the documentation (Randy).
[RFC v3 can be found at: https://lore.kernel.org/patchwork/cover/1329943/]

RFC v3:
- Remove DR6_RESERVED change (PeterZ).
- Simplify the documentation (Randy).

RFC v2:
- Architecture changed based on feedback from Thomas and PeterZ. #DB is
  no longer generated for bus lock in ring0.
- Split the one single patch into four patches.
[RFC v1 can be found at: 
https://lore.kernel.org/lkml/1595021700-68460-1-git-send-email-fenghua...@intel.com/]

Fenghua Yu (3):
  x86/cpufeatures: Enumerate #DB for bus lock detection
  x86/bus_lock: Handle #DB for bus lock
  Documentation/admin-guide: Change doc for split_lock_detect parameter

 .../admin-guide/kernel-parameters.txt |  30 +++-
 arch/x86/include/asm/cpu.h            |  10 +-
 arch/x86/include/asm/cpufeatures.h    |   1 +
 arch/x86/include/asm/msr-index.h      |   1 +
 arch/x86/include/uapi/asm/debugreg.h  |   1 +
 arch/x86/kernel/cpu/common.c          |   2 +-
 arch/x86/kernel/cpu/intel.c           | 148 +++---
 arch/x86/kernel/traps.c               |   7 +
 include/linux/sched/user.h            |   5 +-
 kernel/user.c                         |  13 ++
 10 files changed, 187 insertions(+), 31 deletions(-)

-- 
2.30.2



[PATCH v5 2/3] x86/bus_lock: Handle #DB for bus lock

2021-03-12 Thread Fenghua Yu
Bus locks degrade performance for the whole system, not just for the CPU
that requested the bus lock. Two CPU features "#AC for split lock" and
"#DB for bus lock" provide hooks so that the operating system may choose
one of several mitigation strategies.

#AC for split lock is already implemented. Add code to use the #DB for
bus lock feature to cover additional situations with new options to
mitigate.

split_lock_detect=
                #AC for split lock              #DB for bus lock

off             Do nothing                      Do nothing

warn            Kernel OOPs                     Warn once per task and
                Warn once per task and          continue to run.
                disable future checking
                When both features are
                supported, warn in #AC

fatal           Kernel OOPs                     Send SIGBUS to user.
                Send SIGBUS to user
                When both features are
                supported, fatal in #AC

ratelimit:N     Do nothing                      Limit bus lock rate to
                                                N per second for the
                                                current non-root user.

Default option is "warn".
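
The option values in the table could be parsed along the following lines. This is a stand-alone sketch rather than the kernel's parser: the enum and function names are hypothetical, and the only range check applied is N > 0, matching the documentation in this series.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names; they mirror the option strings, not kernel symbols. */
enum sld_mode { SLD_OFF, SLD_WARN, SLD_FATAL, SLD_RATELIMIT };

/*
 * Parse the value of "split_lock_detect=". On success store the mode and,
 * for "ratelimit:N", the limit N (N > 0). Returns false on invalid input.
 */
static bool parse_sld_option(const char *arg, enum sld_mode *mode, long *limit)
{
	static const char prefix[] = "ratelimit:";
	char *end;
	long n;

	*limit = 0;

	if (!strcmp(arg, "off")) {
		*mode = SLD_OFF;
		return true;
	}
	if (!strcmp(arg, "warn")) {
		*mode = SLD_WARN;
		return true;
	}
	if (!strcmp(arg, "fatal")) {
		*mode = SLD_FATAL;
		return true;
	}

	if (strncmp(arg, prefix, sizeof(prefix) - 1))
		return false;

	arg += sizeof(prefix) - 1;
	n = strtol(arg, &end, 10);
	if (end == arg || *end != '\0' || n <= 0)
		return false;

	*mode = SLD_RATELIMIT;
	*limit = n;
	return true;
}
```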

Hardware only generates #DB for bus lock detect when CPL>0 to avoid
nested #DB from multiple bus locks while the first #DB is being handled.
So no need to handle #DB for bus lock detected in the kernel.

#DB for bus lock is enabled by bus lock detection bit 2 in DEBUGCTL MSR
while #AC for split lock is enabled by split lock detection bit 29 in
TEST_CTRL MSR.
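
For illustration only, the two enable bits can be expressed as masks on a 64-bit MSR value. The bit positions come from the paragraph above; the macro and helper names are invented for this sketch, and no MSR is actually read or written.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as given in the commit message; names are illustrative. */
#define DEBUGCTL_BUS_LOCK_DETECT_BIT	2
#define TEST_CTRL_SPLIT_LOCK_DETECT_BIT	29

/* Return val with the given bit set (on) or cleared (!on). */
static uint64_t msr_update_bit(uint64_t val, unsigned int bit, bool on)
{
	const uint64_t mask = UINT64_C(1) << bit;

	return on ? (val | mask) : (val & ~mask);
}

/* True if the given bit is set in val. */
static bool msr_test_bit(uint64_t val, unsigned int bit)
{
	return (val >> bit) & 1;
}
```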

Both breakpoint and bus lock in the same instruction can trigger one #DB.
The bus lock is handled before the breakpoint in the #DB handler.

Delivery of #DB for bus lock in userspace clears DR6[11]. To avoid
confusion in identifying #DB, #DB handler sets the bit to 1 before
returning to the interrupted task.
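
The inverted sense of DR6[11] is easy to get wrong, so here is a small sketch of it, assuming the raw hardware value is being inspected. The helper names are hypothetical and this is not the handler code from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define DR6_BUS_LOCK_BIT	11	/* bit number taken from the text above */
#define DR6_BUS_LOCK_MASK	(UINT64_C(1) << DR6_BUS_LOCK_BIT)

/* In the raw DR6 value a cleared bit 11 means a bus lock was detected. */
static bool dr6_raw_has_bus_lock(uint64_t raw_dr6)
{
	return !(raw_dr6 & DR6_BUS_LOCK_MASK);
}

/* Set bit 11 back to 1 so a later #DB is not misread as a bus lock. */
static uint64_t dr6_rearm_bus_lock(uint64_t raw_dr6)
{
	return raw_dr6 | DR6_BUS_LOCK_MASK;
}
```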

Signed-off-by: Fenghua Yu 
Reviewed-by: Tony Luck 
---
Change Log:
v5:
Address all comments from Thomas:
- Merge patch 2 and patch 3 into one patch so all "split_lock_detect="
  options are processed in one patch.
- Change warn to #AC if both #AC and #DB are supported.
- Remove sld and bld variables and use boot_cpu_has() to check bus lock
  and split lock support.
- Remove bus lock checking in handle_bus_lock().
- Remove bld_ratelimit < HZ/2 check.
- Add rate limit handling comment in bus lock #DB.
- Define bld_ratelimit only for Intel CPUs.

v3:
- Enable Bus Lock Detection when fatal to handle bus lock from non-WB
  (PeterZ).

v2:
- Send SIGBUS in fatal case for bus lock #DB (PeterZ).

v1:
- Check bus lock bit by its positive polarity (Xiaoyao).

RFC v3:
- Remove DR6_RESERVED change (PeterZ).

 arch/x86/include/asm/cpu.h           |  10 +-
 arch/x86/include/asm/msr-index.h     |   1 +
 arch/x86/include/uapi/asm/debugreg.h |   1 +
 arch/x86/kernel/cpu/common.c         |   2 +-
 arch/x86/kernel/cpu/intel.c          | 148 +++
 arch/x86/kernel/traps.c              |   7 ++
 include/linux/sched/user.h           |   5 +-
 kernel/user.c                        |  13 +++
 8 files changed, 162 insertions(+), 25 deletions(-)

diff --git a/arch/x86/include/asm/cpu.h b/arch/x86/include/asm/cpu.h
index da78ccbd493b..991de5f2a09c 100644
--- a/arch/x86/include/asm/cpu.h
+++ b/arch/x86/include/asm/cpu.h
@@ -41,12 +41,14 @@ unsigned int x86_family(unsigned int sig);
 unsigned int x86_model(unsigned int sig);
 unsigned int x86_stepping(unsigned int sig);
 #ifdef CONFIG_CPU_SUP_INTEL
-extern void __init cpu_set_core_cap_bits(struct cpuinfo_x86 *c);
+extern void __init sld_setup(struct cpuinfo_x86 *c);
 extern void switch_to_sld(unsigned long tifn);
 extern bool handle_user_split_lock(struct pt_regs *regs, long error_code);
 extern bool handle_guest_split_lock(unsigned long ip);
+extern void handle_bus_lock(struct pt_regs *regs);
+extern int bld_ratelimit;
 #else
-static inline void __init cpu_set_core_cap_bits(struct cpuinfo_x86 *c) {}
+static inline void __init sld_setup(struct cpuinfo_x86 *c) {}
 static inline void switch_to_sld(unsigned long tifn) {}
 static inline bool handle_user_split_lock(struct pt_regs *regs, long error_code)
 {
@@ -57,6 +59,10 @@ static inline bool handle_guest_split_lock(unsigned long ip)
 {
return false;
 }
+
+static inline void handle_bus_lock(struct pt_regs *regs)
+{
+}
 #endif
 #ifdef CONFIG_IA32_FEAT_CTL
 void init_ia32_feat_ctl(struct cpuinfo_x86 *c);
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 546d6ecf0a35..558485965f21 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -265,6 +265,7 @@
 #define DEBUGCTLMSR_LBR                (1UL <<  0) /* last branch recording */
 #define DEBUGCTLMSR_BTF_SHIFT          1
 #define DEBUGCTLMSR_BTF                (1UL <<  1) /* single-step on branches */
+#define DEBUGCTLMSR_BUS_LOCK_DETECT   

[PATCH v5 3/3] Documentation/admin-guide: Change doc for split_lock_detect parameter

2021-03-12 Thread Fenghua Yu
Since #DB for bus lock detect changes the split_lock_detect parameter,
update the documentation for the changes.

Signed-off-by: Fenghua Yu 
Reviewed-by: Tony Luck 
Acked-by: Randy Dunlap 
---
Change Log:
v5:
- Remove N < HZ/2 check info in the doc (Thomas).

v4:
- Fix a ratelimit wording issue in the doc (Randy).
- Patch 4 is acked by Randy (Randy).

v3:
- Enable Bus Lock Detection when fatal to handle bus lock from non-WB
  (PeterZ).

v1:
- Fix a few wording issues (Randy).

RFC v2:
- Simplify the documentation (Randy).

 .../admin-guide/kernel-parameters.txt | 30 +++
 1 file changed, 24 insertions(+), 6 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 04545725f187..16b2e1c45d04 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -5100,27 +5100,45 @@
spia_peddr=
 
split_lock_detect=
-   [X86] Enable split lock detection
+   [X86] Enable split lock detection or bus lock detection
 
When enabled (and if hardware support is present), atomic
instructions that access data across cache line
-   boundaries will result in an alignment check exception.
+   boundaries will result in an alignment check exception
+   for split lock detection or a debug exception for
+   bus lock detection.
 
off - not enabled
 
-   warn- the kernel will emit rate limited warnings
+   warn- the kernel will emit rate-limited warnings
  about applications triggering the #AC
- exception. This mode is the default on CPUs
- that supports split lock detection.
+ exception or the #DB exception. This mode is
+ the default on CPUs that support split lock
+ detection or bus lock detection. Default
+ behavior is by #DB if both features are
+ enabled in hardware.
 
fatal   - the kernel will send SIGBUS to applications
- that trigger the #AC exception.
+ that trigger the #AC exception or the #DB
+ exception. If both features are enabled in
+ hardware, split lock triggers #AC and bus
+ lock from non-WB triggers #DB.
+
+   ratelimit:N -
+ Set rate limit to N bus locks per second
+ for bus lock detection. 0 < N.
+ Only applied to non-root users.
+
+ N/A for split lock detection.
 
If an #AC exception is hit in the kernel or in
firmware (i.e. not while executing in user mode)
the kernel will oops in either "warn" or "fatal"
mode.
 
+   #DB exception for bus lock is triggered only when
+   CPL > 0.
+
srbds=  [X86,INTEL]
Control the Special Register Buffer Data Sampling
(SRBDS) mitigation.
-- 
2.30.2



[PATCH v5 1/3] x86/cpufeatures: Enumerate #DB for bus lock detection

2021-03-12 Thread Fenghua Yu
A bus lock is acquired through either split locked access to
writeback (WB) memory or any locked access to non-WB memory. This is
typically >1000 cycles slower than an atomic operation within a cache
line. It also disrupts performance on other cores.

Some CPUs have the ability to notify the kernel by a #DB trap after a user
instruction acquires a bus lock and is executed. This allows the kernel
to enforce user application throttling or mitigations. Both breakpoint
and bus lock can trigger the #DB trap in the same instruction and the
ordering of handling them is the kernel #DB handler's choice.

The CPU feature flag to be shown in /proc/cpuinfo will be "bus_lock_detect".
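
A user-space check for the flag could look like the sketch below, which searches one "flags" line for a whole-word match. The helper name flags_line_has() is hypothetical; reading /proc/cpuinfo itself is left to the caller.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if flag appears in line as a whole whitespace-delimited token. */
static bool flags_line_has(const char *line, const char *flag)
{
	size_t len = strlen(flag);
	const char *p = line;

	if (len == 0)
		return false;

	while ((p = strstr(p, flag)) != NULL) {
		bool starts = (p == line) || p[-1] == ' ' || p[-1] == '\t';
		bool ends = p[len] == '\0' || p[len] == ' ' ||
			    p[len] == '\t' || p[len] == '\n';

		if (starts && ends)
			return true;
		p++;	/* keep searching after a partial match */
	}
	return false;
}
```

Matching whole tokens avoids false positives from longer flag names that merely contain the string being searched for.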

Signed-off-by: Fenghua Yu 
Reviewed-by: Tony Luck 
---
Change Log:
v5:
- Add "Both breakpoint and bus lock can trigger an #DB trap..." in the
  commit message (Thomas).

 arch/x86/include/asm/cpufeatures.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index cc96e26d69f7..faec3d92d09b 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -354,6 +354,7 @@
 #define X86_FEATURE_AVX512_VPOPCNTDQ   (16*32+14) /* POPCNT for vectors of DW/QW */
 #define X86_FEATURE_LA57               (16*32+16) /* 5-level page tables */
 #define X86_FEATURE_RDPID              (16*32+22) /* RDPID instruction */
+#define X86_FEATURE_BUS_LOCK_DETECT    (16*32+24) /* Bus Lock detect */
 #define X86_FEATURE_CLDEMOTE           (16*32+25) /* CLDEMOTE instruction */
 #define X86_FEATURE_MOVDIRI            (16*32+27) /* MOVDIRI instruction */
 #define X86_FEATURE_MOVDIR64B          (16*32+28) /* MOVDIR64B instruction */
-- 
2.30.2



Re: [PATCH v5 00/21] Miscellaneous fixes for resctrl selftests

2021-03-12 Thread Fenghua Yu
Hi, Babu,

On Fri, Mar 12, 2021 at 01:08:11PM -0600, Babu Moger wrote:
> Hi Fenghua, Thanks for the patches.
> Sanity tested them on AMD systems. Appears to work fine.
> Few minor comments in few patches.
> Tested-by: Babu Moger 

I will add Tested-by: Babu Moger in the series and address your
comments.

Thank you for your review!

-Fenghua


Re: [PATCH v5 08/21] selftests/resctrl: Call kselftest APIs to log test results

2021-03-12 Thread Fenghua Yu
Hi, Babu,

On Fri, Mar 12, 2021 at 01:12:35PM -0600, Babu Moger wrote:
> > -   printf("# dmesg: %s", line);
> > +   ksft_print_msg("dmesg: %s", line);
> > if (strstr(line, "resctrl:"))
> > -   printf("# dmesg: %s", line);
> > +   ksft_print_msg("dmesg: %s", line);
> 
> In general, this patch has some minor nits. When displaying the messages,
>  normally the first character should be capitalized.
> ksft_print_msg("checking for pass/fail\n");
> should be
>  ksft_print_msg("Checking for pass/fail\n");
> 
> And
> ksft_print_msg("Please Enter value in range 1 to %d\n",count_of_bits);
> Should be
> 
> ksft_print_msg("Please enter value in range 1 to %d\n", count_of_bits);
> 
> I am not too concerned about this. You can improve it if you like it.
> 
Ok. Will fix them.

Thanks.

-Fenghua


Re: [PATCH v5 04/21] selftests/resctrl: Clean up resctrl features check

2021-03-12 Thread Fenghua Yu
Hi, Babu,

On Fri, Mar 12, 2021 at 01:09:50PM -0600, Babu Moger wrote:
> > -   if (strcmp(resctrl_val, "mba") == 0)
> > +   if (!strncmp(resctrl_val, MBA_STR, sizeof(MBA_STR)))
> > sprintf(schema, "%s%d%c%s", "MB:", resource_id, '=',
> > schemata);
> I see there are few other references as well.  Like this.
> 
> 1 cat_test.c  cat_perf_miss_val  135 if
> (!validate_resctrl_feature_request("cat"))
> 
> 2 cqm_test.c  cqm_resctrl_val  125 if
> (!validate_resctrl_feature_request("cqm"))
> 
> 3 mba_test.c  mba_schemata_change157 if
> (!validate_resctrl_feature_request("mba"))
> 
> 4 mbm_test.c  mbm_bw_change 131 if
> (!validate_resctrl_feature_request("mbm"))
> 
> Should you use CAT_STR and CQM_STR etc.. in here as well?

Sure. I will fix this.

Thanks.

-Fenghua


Re: [PATCH v5 02/21] selftests/resctrl: Fix compilation issues for global variables

2021-03-12 Thread Fenghua Yu
Hi, Babu,

On Fri, Mar 12, 2021 at 01:08:31PM -0600, Babu Moger wrote:
> > From: Fenghua Yu 
> > Taking a closer look at the usage of these variables reveals that these
> > variables are used only locally to functions such as cqm_resctrl_val()
> 
> %s/ locally to functions/locally in two functions

OK. Will change it.

> > -int get_cbm_mask(char *cache_type)
> > +int get_cbm_mask(char *cache_type, char *cbm_mask)
> >  {
> > char cbm_mask_path[1024];
> > FILE *fp;
> > 
> > +   if (!cbm_mask)
> > +   return -1;
> 
> Can cbm_mask be NULL? I see it is statically allocated.
> Or should this be if (!(*cbm_mask))? Or did I miss something.

This is a sanity check. Although current callers pass a statically
allocated cbm_mask, a future caller may incorrectly pass an unallocated
cbm_mask and cause a segmentation fault. The check makes that kind of
issue much easier to debug.

So I would keep this sanity check.

Thanks.

-Fenghua


[PATCH v5 11/21] selftests/resctrl: Add config dependencies

2021-03-07 Thread Fenghua Yu
Add the config file for test dependencies.

Suggested-by: Shuah Khan 
Signed-off-by: Fenghua Yu 
---
Change Log:
v5:
- Add this patch (Shuah)

 tools/testing/selftests/resctrl/config | 2 ++
 1 file changed, 2 insertions(+)
 create mode 100644 tools/testing/selftests/resctrl/config

diff --git a/tools/testing/selftests/resctrl/config b/tools/testing/selftests/resctrl/config
new file mode 100644
index ..8d9f2deb56ed
--- /dev/null
+++ b/tools/testing/selftests/resctrl/config
@@ -0,0 +1,2 @@
+CONFIG_X86_CPU_RESCTRL=y
+CONFIG_PROC_CPU_RESCTRL=y
-- 
2.30.1



[PATCH v5 18/21] selftests/resctrl: Fix unmount resctrl FS

2021-03-07 Thread Fenghua Yu
umount_resctrlfs() directly attempts to unmount resctrl file system without
checking if resctrl FS is already mounted or not. It returns 0 on success
and on failure it prints an error message and returns an error status.
Calling umount_resctrlfs() when resctrl FS isn't mounted will return an
error status.

There could be situations wherein the caller might not know if resctrl
FS is already mounted or not and the caller might still want to unmount
resctrl FS if it's already mounted (For example during teardown).

To support above use cases, change umount_resctrlfs() such that it now
first checks if resctrl FS is already mounted or not and unmounts resctrl
FS only if it's already mounted.
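
The "check before unmount" step can be sketched as a scan of a mounts table. This is not the selftest's find_resctrl_mount(); the helper below is hypothetical, takes the stream as a parameter, and returns true when a mount is found (the selftest helper uses the opposite convention, returning 0 on success).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Scan a stream in /proc/mounts format ("device mountpoint fstype ...")
 * for a file system of type fstype. If one is found and mountpoint is
 * non-NULL, copy its mount point into mountpoint (at most len bytes).
 */
static bool find_mount_by_type(FILE *mounts, const char *fstype,
			       char *mountpoint, size_t len)
{
	char line[512], dev[128], dir[256], type[64];

	while (fgets(line, sizeof(line), mounts)) {
		if (sscanf(line, "%127s %255s %63s", dev, dir, type) != 3)
			continue;
		if (strcmp(type, fstype))
			continue;
		if (mountpoint && len)
			snprintf(mountpoint, len, "%s", dir);
		return true;
	}
	return false;
}
```

A caller would open /proc/mounts, call find_mount_by_type(fp, "resctrl", NULL, 0), and only attempt umount() when it returns true.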

Also, unmount resctrl FS upon exit. Otherwise the test suite can leave it
mounted, for example when running only the mba test on a Broadwell (BDW)
machine (MBA isn't supported on BDW CPU).

This happens because validate_resctrl_feature_request() would mount resctrl
FS to check if mba is enabled on the platform or not and finds that the H/W
doesn't support mba and hence will return false to run_mba_test(). This in
turn makes the main() function return without unmounting resctrl FS.

Signed-off-by: Fenghua Yu 
---
 tools/testing/selftests/resctrl/resctrl_tests.c | 2 ++
 tools/testing/selftests/resctrl/resctrlfs.c | 3 +++
 2 files changed, 5 insertions(+)

diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c
index fa4c5f5075dd..6204ede25ad1 100644
--- a/tools/testing/selftests/resctrl/resctrl_tests.c
+++ b/tools/testing/selftests/resctrl/resctrl_tests.c
@@ -253,5 +253,7 @@ int main(int argc, char **argv)
if (cat_test)
run_cat_test(cpu_no, no_of_bits);
 
+   umount_resctrlfs();
+
return ksft_exit_pass();
 }
diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c
index 0c23514760dd..cb52d0ad4be2 100644
--- a/tools/testing/selftests/resctrl/resctrlfs.c
+++ b/tools/testing/selftests/resctrl/resctrlfs.c
@@ -82,6 +82,9 @@ int remount_resctrlfs(bool mum_resctrlfs)
 
 int umount_resctrlfs(void)
 {
+   if (find_resctrl_mount(NULL))
+   return 0;
+
if (umount(RESCTRL_PATH)) {
perror("# Unable to umount resctrl");
 
-- 
2.30.1



[PATCH v5 16/21] selftests/resctrl: Modularize resctrl test suite main() function

2021-03-07 Thread Fenghua Yu
Resctrl test suite main() function does the following things
1. Parses command line arguments passed by user
2. Some setup checks
3. Logic that calls into each unit test
4. Print result and clean up after running each unit test

Introduce wrapper functions for steps 3 and 4 to modularize the main()
function. Adding these wrapper functions makes it easier to add any logic
to each individual test.

Please note that this is a preparatory patch for the next one and no
functional changes are intended.

Suggested-by: Reinette Chatre 
Signed-off-by: Fenghua Yu 
---
 .../testing/selftests/resctrl/resctrl_tests.c | 88 ---
 1 file changed, 57 insertions(+), 31 deletions(-)

diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c
index 56900738edd6..f9d00ecbeedb 100644
--- a/tools/testing/selftests/resctrl/resctrl_tests.c
+++ b/tools/testing/selftests/resctrl/resctrl_tests.c
@@ -54,10 +54,58 @@ void tests_cleanup(void)
cat_test_cleanup();
 }
 
+static void run_mbm_test(bool has_ben, char **benchmark_cmd, int span,
+int cpu_no, char *bw_report)
+{
+   int res;
+
+   ksft_print_msg("Starting MBM BW change ...\n");
+   if (!has_ben)
+   sprintf(benchmark_cmd[5], "%s", MBA_STR);
+   res = mbm_bw_change(span, cpu_no, bw_report, benchmark_cmd);
+   ksft_test_result(!res, "MBM: bw change\n");
+   mbm_test_cleanup();
+}
+
+static void run_mba_test(bool has_ben, char **benchmark_cmd, int span,
+int cpu_no, char *bw_report)
+{
+   int res;
+
+   ksft_print_msg("Starting MBA Schemata change ...\n");
+   if (!has_ben)
+   sprintf(benchmark_cmd[1], "%d", span);
+   res = mba_schemata_change(cpu_no, bw_report, benchmark_cmd);
+   ksft_test_result(!res, "MBA: schemata change\n");
+   mba_test_cleanup();
+}
+
+static void run_cmt_test(bool has_ben, char **benchmark_cmd, int cpu_no)
+{
+   int res;
+
+   ksft_print_msg("Starting CMT test ...\n");
+   if (!has_ben)
+   sprintf(benchmark_cmd[5], "%s", "cmt");
+   res = cmt_resctrl_val(cpu_no, 5, benchmark_cmd);
+   ksft_test_result(!res, "CMT: test\n");
+   cmt_test_cleanup();
+}
+
+static void run_cat_test(int cpu_no, int no_of_bits)
+{
+   int res;
+
+   ksft_print_msg("Starting CAT test ...\n");
+   res = cat_perf_miss_val(cpu_no, no_of_bits, "L3");
+   ksft_test_result(!res, "CAT: test\n");
+   cat_test_cleanup();
+}
+
 int main(int argc, char **argv)
 {
bool has_ben = false, mbm_test = true, mba_test = true, cmt_test = true;
-   int res, c, cpu_no = 1, span = 250, argc_new = argc, i, no_of_bits = 0;
+   int c, cpu_no = 1, span = 250, argc_new = argc, i, no_of_bits = 0;
char *benchmark_cmd[BENCHMARK_ARGS], bw_report[64], bm_type[64];
char benchmark_cmd_area[BENCHMARK_ARGS][BENCHMARK_ARG_SIZE];
int ben_ind, ben_count, tests = 0;
@@ -170,39 +218,17 @@ int main(int argc, char **argv)
 
ksft_set_plan(tests ? : 4);
 
-   if (!is_amd && mbm_test) {
-   ksft_print_msg("Starting MBM BW change ...\n");
-   if (!has_ben)
-   sprintf(benchmark_cmd[5], "%s", MBA_STR);
-   res = mbm_bw_change(span, cpu_no, bw_report, benchmark_cmd);
-   ksft_test_result(!res, "MBM: bw change\n");
-   mbm_test_cleanup();
-   }
+   if (!is_amd && mbm_test)
+   run_mbm_test(has_ben, benchmark_cmd, span, cpu_no, bw_report);
 
-   if (!is_amd && mba_test) {
-   ksft_print_msg("Starting MBA Schemata change ...\n");
-   if (!has_ben)
-   sprintf(benchmark_cmd[1], "%d", span);
-   res = mba_schemata_change(cpu_no, bw_report, benchmark_cmd);
-   ksft_test_result(!res, "MBA: schemata change\n");
-   mba_test_cleanup();
-   }
+   if (!is_amd && mba_test)
+   run_mba_test(has_ben, benchmark_cmd, span, cpu_no, bw_report);
 
-   if (cmt_test) {
-   ksft_print_msg("Starting CMT test ...\n");
-   if (!has_ben)
-   sprintf(benchmark_cmd[5], "%s", "cmt");
-   res = cmt_resctrl_val(cpu_no, 5, benchmark_cmd);
-   ksft_test_result(!res, "CMT: test\n");
-   cmt_test_cleanup();
-   }
+   if (cmt_test)
+   run_cmt_test(has_ben, benchmark_cmd, cpu_no);
 
-   if (cat_test) {
-   ksft_print_msg("Starting CAT test ...\n");
-   res = cat_perf_miss_val(cpu_no, no_of_bits, "L3");
-   ksft_test_result(!res, "CAT: test\n");
-   cat_test_cleanup();
-   }
+   if (cat_test)
+   run_cat_test(cpu_no, no_of_bits);
 
return ksft_exit_pass();
 }
-- 
2.30.1



[PATCH v5 14/21] selftests/resctrl: Fix MBA/MBM results reporting format

2021-03-07 Thread Fenghua Yu
MBM unit test starts fill_buf (default built-in benchmark) in a new con_mon
group (c1, m1) and records resctrl reported mbm values and iMC (Integrated
Memory Controller) values every second. It does this for five seconds
(randomly chosen value) in total. It then calculates average of resctrl_mbm
values and imc_mbm values and if the difference is greater than 300 MB/sec
(randomly chosen value), the test treats it as a failure. MBA unit test is
similar to MBM but after every run it changes schemata.

Checking for a difference of 300 MB/sec doesn't look very meaningful when
the mbm values are changing over a wide range. For example, below are the
values running MBA test on SKL with different allocations

1. With 10% as schemata both iMC and resctrl mbm_values are around 2000
   MB/sec
2. With 100% as schemata both iMC and resctrl mbm_values are around 1
   MB/sec

A 300 MB/sec difference between resctrl_mbm and imc_mbm values is
acceptable at 100% schemata but it isn't acceptable at 10% schemata because
that's a huge difference.

So, fix this by checking for percentage difference instead of absolute
difference i.e. check if the difference between resctrl_mbm value and
imc_mbm value is within 5% (randomly chosen value) of imc_mbm value. If the
difference is greater than 5% of imc_mbm value, treat it as a failure.
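
The relative check can be sketched with integer arithmetic as below. This differs from the patch, which computes the ratio in float; the function names are hypothetical and the sample bandwidths in the note that follows are illustrative, not measurements.

```c
#include <stdbool.h>

#define MAX_DIFF_PERCENT 5

/*
 * Difference between the two averages as an integer percentage of
 * avg_imc (rounded down), or -1 if avg_imc is 0.
 */
static long diff_percent(unsigned long avg_resc, unsigned long avg_imc)
{
	unsigned long diff;

	if (!avg_imc)
		return -1;

	diff = avg_resc > avg_imc ? avg_resc - avg_imc : avg_imc - avg_resc;
	return (long)(diff * 100 / avg_imc);
}

/* True if the resctrl average is within MAX_DIFF_PERCENT of the iMC average. */
static bool bw_within_tolerance(unsigned long avg_resc, unsigned long avg_imc)
{
	long pct = diff_percent(avg_resc, avg_imc);

	return pct >= 0 && pct <= MAX_DIFF_PERCENT;
}
```

With these helpers a 300 MB/sec gap is 15% of a 2000 MB/sec baseline and fails, while the same gap against a hypothetical 10000 MB/sec baseline is 3% and passes.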

Signed-off-by: Fenghua Yu 
---
 tools/testing/selftests/resctrl/mba_test.c | 22 +-
 tools/testing/selftests/resctrl/mbm_test.c | 15 ---
 2 files changed, 21 insertions(+), 16 deletions(-)

diff --git a/tools/testing/selftests/resctrl/mba_test.c b/tools/testing/selftests/resctrl/mba_test.c
index 3a226effe80c..fd66a831062c 100644
--- a/tools/testing/selftests/resctrl/mba_test.c
+++ b/tools/testing/selftests/resctrl/mba_test.c
@@ -12,7 +12,7 @@
 
 #define RESULT_FILE_NAME       "result_mba"
 #define NUM_OF_RUNS            5
-#define MAX_DIFF               300
+#define MAX_DIFF_PERCENT       5
 #define ALLOCATION_MAX         100
 #define ALLOCATION_MIN         10
 #define ALLOCATION_STEP        10
@@ -62,7 +62,8 @@ static void show_mba_info(unsigned long *bw_imc, unsigned long *bw_resc)
 allocation++) {
unsigned long avg_bw_imc, avg_bw_resc;
unsigned long sum_bw_imc = 0, sum_bw_resc = 0;
-   unsigned long avg_diff;
+   int avg_diff_per;
+   float avg_diff;
 
/*
 * The first run is discarded due to inaccurate value from
@@ -76,16 +77,19 @@ static void show_mba_info(unsigned long *bw_imc, unsigned long *bw_resc)
 
avg_bw_imc = sum_bw_imc / (NUM_OF_RUNS - 1);
avg_bw_resc = sum_bw_resc / (NUM_OF_RUNS - 1);
-   avg_diff = labs((long)(avg_bw_resc - avg_bw_imc));
+   avg_diff = (float)labs(avg_bw_resc - avg_bw_imc) / avg_bw_imc;
+   avg_diff_per = (int)(avg_diff * 100);
 
-   ksft_print_msg("%s MBA schemata percentage %u smaller than %d %%\n",
-  avg_diff > MAX_DIFF ? "fail:" : "pass:",
-  ALLOCATION_MAX - ALLOCATION_STEP * allocation,
-  MAX_DIFF);
-   ksft_print_msg("avg_diff: %lu\n", avg_diff);
+   ksft_print_msg("%s MBA: diff within %d%% for schemata %u\n",
+  avg_diff_per > MAX_DIFF_PERCENT ?
+  "fail:" : "pass:",
+  MAX_DIFF_PERCENT,
+  ALLOCATION_MAX - ALLOCATION_STEP * allocation);
+
+   ksft_print_msg("avg_diff_per: %d%%\n", avg_diff_per);
ksft_print_msg("avg_bw_imc: %lu\n", avg_bw_imc);
ksft_print_msg("avg_bw_resc: %lu\n", avg_bw_resc);
-   if (avg_diff > MAX_DIFF)
+   if (avg_diff_per > MAX_DIFF_PERCENT)
failed = true;
}
 
diff --git a/tools/testing/selftests/resctrl/mbm_test.c b/tools/testing/selftests/resctrl/mbm_test.c
index 2b4f26013d84..44a89e0267eb 100644
--- a/tools/testing/selftests/resctrl/mbm_test.c
+++ b/tools/testing/selftests/resctrl/mbm_test.c
@@ -11,7 +11,7 @@
 #include "resctrl.h"
 
 #define RESULT_FILE_NAME       "result_mbm"
-#define MAX_DIFF               300
+#define MAX_DIFF_PERCENT       5
 #define NUM_OF_RUNS            5
 
 static int
@@ -19,8 +19,8 @@ show_bw_info(unsigned long *bw_imc, unsigned long *bw_resc, int span)
 {
unsigned long avg_bw_imc = 0, avg_bw_resc = 0;
unsigned long sum_bw_imc = 0, sum_bw_resc = 0;
-   long avg_diff = 0;
-   int runs, ret;
+   int runs, ret, avg_diff_per;
+   float avg_diff = 0;
 
/*
 * Discard the first value which is inaccurate due to monitoring setup
@@ -33,12 +33,13 @@ show_bw_info(unsigned l

[PATCH v5 15/21] selftests/resctrl: Don't hard code value of "no_of_bits" variable

2021-03-07 Thread Fenghua Yu
Cache related tests (like CAT and CMT) depend on a variable called
no_of_bits to run. no_of_bits defines the number of contiguous bits
that should be set in the CBM mask and a user can pass a value for
no_of_bits using -n command line argument. If a user hasn't passed any
value, it defaults to 5 (randomly chosen value).

Hard coding no_of_bits to 5 will make the cache tests fail to run on
systems that support maximum cbm mask that is less than or equal to 5 bits.
Hence, don't hard code no_of_bits value.

If a user passes a value for "no_of_bits" using -n option, use it.
Otherwise, no_of_bits is equal to half of the maximum number of bits in
the cbm mask.
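
The defaulting rule can be sketched as follows. The function pick_no_of_bits() is a hypothetical helper written for this example, and it also folds in the range check of 1 to (number of mask bits - 1) that the CAT test applies.

```c
/* Count the bits set in a capacity bit mask. */
static int count_bits(unsigned long mask)
{
	int n = 0;

	while (mask) {
		mask &= mask - 1;	/* clear the lowest set bit */
		n++;
	}
	return n;
}

/*
 * Pick the number of bits to use: the user's value if one was given
 * (requested != 0), otherwise half of the bits in the CBM mask.
 * Returns -1 if the result is outside 1..(bits in mask - 1).
 */
static int pick_no_of_bits(unsigned long cbm_mask, int requested)
{
	int total = count_bits(cbm_mask);
	int n = requested ? requested : total / 2;

	if (n < 1 || n > total - 1)
		return -1;
	return n;
}
```

For an 11-bit mask such as 0x7ff the default is 5, and for a 4-bit mask it is 2, so the test can still run on systems with small masks.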

Please note that CMT test is still hard coded to 5 bits. It will change in
subsequent patches that change CMT test.

Signed-off-by: Fenghua Yu 
---
 tools/testing/selftests/resctrl/cat_test.c  | 5 -
 tools/testing/selftests/resctrl/resctrl_tests.c | 8 ++--
 2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
index 109363e9a7d7..58f075aa0423 100644
--- a/tools/testing/selftests/resctrl/cat_test.c
+++ b/tools/testing/selftests/resctrl/cat_test.c
@@ -130,7 +130,10 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
/* Get max number of bits from default-cabm mask */
count_of_bits = count_bits(long_mask);
 
-   if (n < 1 || n > count_of_bits - 1) {
+   if (!n)
+   n = count_of_bits / 2;
+
+   if (n > count_of_bits - 1) {
ksft_print_msg("Invalid input value for no_of_bits n!\n");
ksft_print_msg("Please Enter value in range 1 to %d\n",
   count_of_bits - 1);
diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c
index f1b08afbc3d0..56900738edd6 100644
--- a/tools/testing/selftests/resctrl/resctrl_tests.c
+++ b/tools/testing/selftests/resctrl/resctrl_tests.c
@@ -57,7 +57,7 @@ void tests_cleanup(void)
 int main(int argc, char **argv)
 {
bool has_ben = false, mbm_test = true, mba_test = true, cmt_test = true;
-   int res, c, cpu_no = 1, span = 250, argc_new = argc, i, no_of_bits = 5;
+   int res, c, cpu_no = 1, span = 250, argc_new = argc, i, no_of_bits = 0;
char *benchmark_cmd[BENCHMARK_ARGS], bw_report[64], bm_type[64];
char benchmark_cmd_area[BENCHMARK_ARGS][BENCHMARK_ARG_SIZE];
int ben_ind, ben_count, tests = 0;
@@ -110,6 +110,10 @@ int main(int argc, char **argv)
break;
case 'n':
no_of_bits = atoi(optarg);
+   if (no_of_bits <= 0) {
+   printf("Bail out! invalid argument for no_of_bits\n");
+   return -1;
+   }
break;
case 'h':
cmd_help();
@@ -188,7 +192,7 @@ int main(int argc, char **argv)
ksft_print_msg("Starting CMT test ...\n");
if (!has_ben)
sprintf(benchmark_cmd[5], "%s", "cmt");
-   res = cmt_resctrl_val(cpu_no, no_of_bits, benchmark_cmd);
+   res = cmt_resctrl_val(cpu_no, 5, benchmark_cmd);
ksft_test_result(!res, "CMT: test\n");
cmt_test_cleanup();
}
-- 
2.30.1



[PATCH v5 17/21] selftests/resctrl: Skip the test if requested resctrl feature is not supported

2021-03-07 Thread Fenghua Yu
There could be two reasons why a resctrl feature might not be enabled on
the platform
1. H/W might not support the feature
2. Even if the H/W supports it, the user might have disabled the feature
   through kernel command line arguments

Hence, any resctrl unit test (like cmt, cat, mbm and mba) before starting
the test will first check if the feature is enabled on the platform or not.
If the feature isn't enabled, then the test returns with an error status.
For example, if MBA isn't supported on a platform and if the user tries to
run MBA, the output will look like this

ok mounting resctrl to "/sys/fs/resctrl"
not ok MBA: schemata change

But, not supporting a feature isn't a test failure. So, instead of treating
it as an error, use the SKIP directive of the TAP protocol. With the
change, the output will look as below

ok MBA # SKIP Hardware does not support MBA or MBA is disabled
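
The pass/fail/skip distinction described above can be sketched in a few lines of C. This is only an illustration: tap_line() and run_feature_test() are hypothetical helpers, not part of kselftest, and the real patch uses ksft_test_result_skip() instead.

```c
#include <stdio.h>
#include <string.h>

enum tap_status { TAP_PASS, TAP_FAIL, TAP_SKIP };

/* Hypothetical helper: format one TAP result line into buf. */
static int tap_line(char *buf, size_t len, enum tap_status st,
		    const char *name, const char *reason)
{
	switch (st) {
	case TAP_PASS:
		return snprintf(buf, len, "ok %s", name);
	case TAP_FAIL:
		return snprintf(buf, len, "not ok %s", name);
	case TAP_SKIP:
		/* A skipped test is reported as "ok" with a SKIP directive. */
		return snprintf(buf, len, "ok %s # SKIP %s", name, reason);
	}
	return -1;
}

/*
 * Hypothetical helper: decide the status of a feature test. The test
 * callback is only invoked when the feature is enabled; it returns 0
 * on success and non-zero on failure.
 */
static enum tap_status run_feature_test(int feature_enabled, int (*test)(void))
{
	if (!feature_enabled)
		return TAP_SKIP;
	return test() ? TAP_FAIL : TAP_PASS;
}
```

The point of the design is that an unsupported feature never reaches the test body, so it cannot be miscounted as a failure.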

Suggested-by: Reinette Chatre 
Signed-off-by: Fenghua Yu 
---
 tools/testing/selftests/resctrl/cat_test.c|  3 ---
 tools/testing/selftests/resctrl/mba_test.c|  3 ---
 tools/testing/selftests/resctrl/mbm_test.c|  3 ---
 .../testing/selftests/resctrl/resctrl_tests.c | 23 +++
 4 files changed, 23 insertions(+), 9 deletions(-)

diff --git a/tools/testing/selftests/resctrl/cat_test.c 
b/tools/testing/selftests/resctrl/cat_test.c
index 58f075aa0423..2e531ef0e7e4 100644
--- a/tools/testing/selftests/resctrl/cat_test.c
+++ b/tools/testing/selftests/resctrl/cat_test.c
@@ -111,9 +111,6 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
if (ret)
return ret;
 
-   if (!validate_resctrl_feature_request("cat"))
-   return -1;
-
/* Get default cbm mask for L3/L2 cache */
ret = get_cbm_mask(cache_type, cbm_mask);
if (ret)
diff --git a/tools/testing/selftests/resctrl/mba_test.c 
b/tools/testing/selftests/resctrl/mba_test.c
index fd66a831062c..205f09235b06 100644
--- a/tools/testing/selftests/resctrl/mba_test.c
+++ b/tools/testing/selftests/resctrl/mba_test.c
@@ -158,9 +158,6 @@ int mba_schemata_change(int cpu_no, char *bw_report, char 
**benchmark_cmd)
 
remove(RESULT_FILE_NAME);
 
-   if (!validate_resctrl_feature_request("mba"))
-   return -1;
-
ret = resctrl_val(benchmark_cmd, &param);
if (ret)
return ret;
diff --git a/tools/testing/selftests/resctrl/mbm_test.c 
b/tools/testing/selftests/resctrl/mbm_test.c
index 44a89e0267eb..01f72d4e68a6 100644
--- a/tools/testing/selftests/resctrl/mbm_test.c
+++ b/tools/testing/selftests/resctrl/mbm_test.c
@@ -131,9 +131,6 @@ int mbm_bw_change(int span, int cpu_no, char *bw_report, 
char **benchmark_cmd)
 
remove(RESULT_FILE_NAME);
 
-   if (!validate_resctrl_feature_request("mbm"))
-   return -1;
-
ret = resctrl_val(benchmark_cmd, &param);
if (ret)
return ret;
diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c 
b/tools/testing/selftests/resctrl/resctrl_tests.c
index f9d00ecbeedb..fa4c5f5075dd 100644
--- a/tools/testing/selftests/resctrl/resctrl_tests.c
+++ b/tools/testing/selftests/resctrl/resctrl_tests.c
@@ -60,6 +60,12 @@ static void run_mbm_test(bool has_ben, char **benchmark_cmd, 
int span,
int res;
 
ksft_print_msg("Starting MBM BW change ...\n");
+
+   if (!validate_resctrl_feature_request("mbm")) {
+   ksft_test_result_skip("Hardware does not support MBM or MBM is disabled\n");
+   return;
+   }
+
if (!has_ben)
sprintf(benchmark_cmd[5], "%s", MBA_STR);
res = mbm_bw_change(span, cpu_no, bw_report, benchmark_cmd);
@@ -73,6 +79,12 @@ static void run_mba_test(bool has_ben, char **benchmark_cmd, 
int span,
int res;
 
ksft_print_msg("Starting MBA Schemata change ...\n");
+
+   if (!validate_resctrl_feature_request("mba")) {
+   ksft_test_result_skip("Hardware does not support MBA or MBA is disabled\n");
+   return;
+   }
+
if (!has_ben)
sprintf(benchmark_cmd[1], "%d", span);
res = mba_schemata_change(cpu_no, bw_report, benchmark_cmd);
@@ -85,6 +97,11 @@ static void run_cmt_test(bool has_ben, char **benchmark_cmd, 
int cpu_no)
int res;
 
ksft_print_msg("Starting CMT test ...\n");
+   if (!validate_resctrl_feature_request("cmt")) {
+   ksft_test_result_skip("Hardware does not support CMT or CMT is disabled\n");
+   return;
+   }
+
if (!has_ben)
sprintf(benchmark_cmd[5], "%s", "cmt");
res = cmt_resctrl_val(cpu_no, 5, benchmark_cmd);
@@ -97,6 +114,12 @@ static void run_cat_test(int cpu_no, int no_of_bits)
int res;
 
ksft_print_msg("Starting CAT test ...\n"

[PATCH v5 12/21] selftests/resctrl: Check for resctrl mount point only if resctrl FS is supported

2021-03-07 Thread Fenghua Yu
check_resctrlfs_support() does the following
1. Checks if the platform supports resctrl file system or not by looking
   for resctrl in /proc/filesystems
2. Calls opendir() on default resctrl file system path
   (i.e. /sys/fs/resctrl)
3. Checks if resctrl file system is mounted or not by looking at
   /proc/mounts

Steps 2 and 3 will fail if the platform does not support resctrl file
system. So, there is no need to check for them if step 1 fails.

Fix this by returning immediately if the platform does not support
resctrl file system.

Signed-off-by: Fenghua Yu 
---
 tools/testing/selftests/resctrl/resctrlfs.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tools/testing/selftests/resctrl/resctrlfs.c 
b/tools/testing/selftests/resctrl/resctrlfs.c
index e3d18e113313..10b9292f33e5 100644
--- a/tools/testing/selftests/resctrl/resctrlfs.c
+++ b/tools/testing/selftests/resctrl/resctrlfs.c
@@ -570,6 +570,9 @@ bool check_resctrlfs_support(void)
ksft_print_msg("%s kernel supports resctrl filesystem\n",
   ret ? "pass:" : "fail:");
 
+   if (!ret)
+   return ret;
+
dp = opendir(RESCTRL_PATH);
ksft_print_msg("%s resctrl mountpoint \"%s\" exists\n",
   dp ? "pass:" : "fail:", RESCTRL_PATH);
-- 
2.30.1



[PATCH v5 13/21] selftests/resctrl: Use resctrl/info for feature detection

2021-03-07 Thread Fenghua Yu
Resctrl test suite before running any unit test (like cmt, cat, mbm and
mba) should first check if the feature is enabled (by kernel and not just
supported by H/W) on the platform or not.
validate_resctrl_feature_request() is supposed to do that. This function
intends to grep for relevant flags in /proc/cpuinfo but there are several
issues here

1. validate_resctrl_feature_request() calls fgrep() to get flags from
   /proc/cpuinfo. But, fgrep() can only return a string with maximum of 255
   characters and hence the complete cpu flags are never returned.
2. The substring search logic is also busted. If strstr() finds requested
   resctrl feature in the cpu flags, it returns pointer to the first
   occurrence. But, the logic negates the return value of strstr() and
   hence validate_resctrl_feature_request() returns false if the feature is
   present in the cpu flags and returns true if the feature is not present.
3. validate_resctrl_feature_request() checks if a resctrl feature is
   reported in /proc/cpuinfo flags or not. Having a cpu flag means that the
   H/W supports the feature, but it doesn't mean that the kernel enabled
   it. A user could selectively enable only a subset of resctrl features
   using kernel command line arguments. Hence, /proc/cpuinfo isn't a
   reliable source to check if a feature is enabled or not.

The third issue is the major one, and fixing it requires changing the way
validate_resctrl_feature_request() works. Since /proc/cpuinfo isn't the
right place to check whether a resctrl feature is enabled, a more
appropriate place is the /sys/fs/resctrl/info directory. Change
validate_resctrl_feature_request() such that:

1. For cat, check if /sys/fs/resctrl/info/L3 directory is present or not
2. For mba, check if /sys/fs/resctrl/info/MB directory is present or not
3. For cmt, check if /sys/fs/resctrl/info/L3_MON directory is present and
   check if /sys/fs/resctrl/info/L3_MON/mon_features has llc_occupancy
4. For mbm, check if /sys/fs/resctrl/info/L3_MON directory is present and
   check if /sys/fs/resctrl/info/L3_MON/mon_features has
   mbm_<total/local>_bytes

Please note that only L3_CAT, L3_CMT, MBA and MBM are supported. CDP and L2
variants can be added later.
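
A minimal sketch of the directory-based detection listed above, assuming the info root is passed in as a parameter so it can be exercised outside a resctrl mount. dir_exists(), file_has() and feature_enabled() are hypothetical names, and accepting either mbm_total_bytes or mbm_local_bytes for "mbm" is an assumption of this sketch, not necessarily what the patch does.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Return true if path exists and is a directory. */
static bool dir_exists(const char *path)
{
	struct stat st;

	return stat(path, &st) == 0 && S_ISDIR(st.st_mode);
}

/* Return true if any line of the file at path contains needle. */
static bool file_has(const char *path, const char *needle)
{
	char line[256];
	bool found = false;
	FILE *fp = fopen(path, "r");

	if (!fp)
		return false;
	while (fgets(line, sizeof(line), fp)) {
		if (strstr(line, needle)) {
			found = true;
			break;
		}
	}
	fclose(fp);
	return found;
}

/* info_root would normally be "/sys/fs/resctrl/info". */
static bool feature_enabled(const char *info_root, const char *feature)
{
	char path[512];

	if (!info_root || !feature)
		return false;

	if (!strcmp(feature, "cat")) {
		snprintf(path, sizeof(path), "%s/L3", info_root);
		return dir_exists(path);
	}
	if (!strcmp(feature, "mba")) {
		snprintf(path, sizeof(path), "%s/MB", info_root);
		return dir_exists(path);
	}
	if (!strcmp(feature, "cmt") || !strcmp(feature, "mbm")) {
		snprintf(path, sizeof(path), "%s/L3_MON", info_root);
		if (!dir_exists(path))
			return false;
		snprintf(path, sizeof(path), "%s/L3_MON/mon_features", info_root);
		if (!strcmp(feature, "cmt"))
			return file_has(path, "llc_occupancy");
		/* Assumption: either MBM event counts as "mbm enabled". */
		return file_has(path, "mbm_total_bytes") ||
		       file_has(path, "mbm_local_bytes");
	}
	return false;
}
```

Unlike a /proc/cpuinfo flag, a missing directory here covers both cases: hardware without the feature and a feature disabled on the kernel command line.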

Reported-by: Reinette Chatre 
Signed-off-by: Fenghua Yu 
---
 tools/testing/selftests/resctrl/resctrl.h   |  6 ++-
 tools/testing/selftests/resctrl/resctrlfs.c | 52 -
 2 files changed, 46 insertions(+), 12 deletions(-)

diff --git a/tools/testing/selftests/resctrl/resctrl.h 
b/tools/testing/selftests/resctrl/resctrl.h
index 81f322245ef7..1ad10c47e31d 100644
--- a/tools/testing/selftests/resctrl/resctrl.h
+++ b/tools/testing/selftests/resctrl/resctrl.h
@@ -29,6 +29,10 @@
 #define RESCTRL_PATH   "/sys/fs/resctrl"
 #define PHYS_ID_PATH   "/sys/devices/system/cpu/cpu"
 #define CBM_MASK_PATH  "/sys/fs/resctrl/info"
+#define L3_PATH"/sys/fs/resctrl/info/L3"
+#define MB_PATH"/sys/fs/resctrl/info/MB"
+#define L3_MON_PATH"/sys/fs/resctrl/info/L3_MON"
+#define L3_MON_FEATURES_PATH   "/sys/fs/resctrl/info/L3_MON/mon_features"
 
 #define PARENT_EXIT(err_msg)   \
do {\
@@ -79,7 +83,7 @@ int remount_resctrlfs(bool mum_resctrlfs);
 int get_resource_id(int cpu_no, int *resource_id);
 int umount_resctrlfs(void);
 int validate_bw_report_request(char *bw_report);
-bool validate_resctrl_feature_request(char *resctrl_val);
+bool validate_resctrl_feature_request(const char *resctrl_val);
 char *fgrep(FILE *inf, const char *str);
 int taskset_benchmark(pid_t bm_pid, int cpu_no);
 void run_benchmark(int signum, siginfo_t *info, void *ucontext);
diff --git a/tools/testing/selftests/resctrl/resctrlfs.c 
b/tools/testing/selftests/resctrl/resctrlfs.c
index 10b9292f33e5..0c23514760dd 100644
--- a/tools/testing/selftests/resctrl/resctrlfs.c
+++ b/tools/testing/selftests/resctrl/resctrlfs.c
@@ -606,26 +606,56 @@ char *fgrep(FILE *inf, const char *str)
  * validate_resctrl_feature_request - Check if requested feature is valid.
  * @resctrl_val:   Requested feature
  *
- * Return: 0 on success, non-zero on failure
+ * Return: True if the feature is supported, else false
  */
-bool validate_resctrl_feature_request(char *resctrl_val)
+bool validate_resctrl_feature_request(const char *resctrl_val)
 {
-   FILE *inf = fopen("/proc/cpuinfo", "r");
+   struct stat statbuf;
bool found = false;
char *res;
+   FILE *inf;
 
-   if (!inf)
+   if (!resctrl_val)
return false;
 
-   res = fgrep(inf, "flags");
-
-   if (res) {
-   char *s = strchr(res, ':');
+   if (remount_resctrlfs(false))
+   return false;
 
-   found = s && !strstr(s, resctrl_val);
-   free(res);
+   if (!strncmp(resctrl

[PATCH v5 21/21] selftests/resctrl: Create .gitignore to include resctrl_tests

2021-03-07 Thread Fenghua Yu
Create .gitignore to hold the test file resctrl_tests generated after
compiling.

Suggested-by: Shuah Khan 
Signed-off-by: Fenghua Yu 
---
Change Log:
v5:
- Add this patch (Shuah)

 tools/testing/selftests/resctrl/.gitignore | 2 ++
 1 file changed, 2 insertions(+)
 create mode 100644 tools/testing/selftests/resctrl/.gitignore

diff --git a/tools/testing/selftests/resctrl/.gitignore 
b/tools/testing/selftests/resctrl/.gitignore
new file mode 100644
index ..ab68442b6bc8
--- /dev/null
+++ b/tools/testing/selftests/resctrl/.gitignore
@@ -0,0 +1,2 @@
+# SPDX-License-Identifier: GPL-2.0-only
+resctrl_tests
-- 
2.30.1



[PATCH v5 03/21] selftests/resctrl: Fix compilation issues for other global variables

2021-03-07 Thread Fenghua Yu
Reinette reported following compilation issue on Fedora 32, gcc version
10.1.1

/usr/bin/ld: resctrl_tests.o:/resctrl.h:65: multiple definition
of `bm_pid'; cache.o:/resctrl.h:65: first defined here

Other variables are ppid, tests_run, llc_occup_path, is_amd. Compiler
isn't happy because these variables are defined globally in two .c files
but are not declared as extern.

To fix issues for the global variables, declare them as extern.
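
As a compact illustration of the fix (a single-file sketch, whereas in the selftest the declarations live in resctrl.h and the definitions in one .c file): the header carries only extern declarations, and exactly one translation unit defines the objects. bump_tests_run() is a hypothetical helper added only so the variable is used.

```c
#include <sys/types.h>

/* Header side: declarations only, safe to include from many .c files. */
extern pid_t bm_pid, ppid;
extern int tests_run;

/* Exactly one .c file provides the definitions. */
pid_t bm_pid, ppid;
int tests_run;

/* Hypothetical helper: every includer sees the same single object. */
static int bump_tests_run(void)
{
	return ++tests_run;
}
```

GCC 10 defaults to -fno-common, which is why a definition in a header that used to link silently now produces the "multiple definition" error quoted above.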

Change Log:
- Split this patch from v4's patch 1 (Shuah).

Reported-by: Reinette Chatre 
Signed-off-by: Fenghua Yu 
---
 tools/testing/selftests/resctrl/resctrl.h | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/tools/testing/selftests/resctrl/resctrl.h 
b/tools/testing/selftests/resctrl/resctrl.h
index 959c71e39bdc..12b77182cb44 100644
--- a/tools/testing/selftests/resctrl/resctrl.h
+++ b/tools/testing/selftests/resctrl/resctrl.h
@@ -62,11 +62,11 @@ struct resctrl_val_param {
int (*setup)(int num, ...);
 };
 
-pid_t bm_pid, ppid;
-int tests_run;
+extern pid_t bm_pid, ppid;
+extern int tests_run;
 
-char llc_occup_path[1024];
-bool is_amd;
+extern char llc_occup_path[1024];
+extern bool is_amd;
 
 bool check_resctrlfs_support(void);
 int filter_dmesg(void);
-- 
2.30.1



[PATCH v5 20/21] selftests/resctrl: Fix checking for < 0 for unsigned values

2021-03-07 Thread Fenghua Yu
Dan reported following static checker warnings

tools/testing/selftests/resctrl/resctrl_val.c:545 measure_vals()
warn: 'bw_imc' unsigned <= 0

tools/testing/selftests/resctrl/resctrl_val.c:549 measure_vals()
warn: 'bw_resc_end' unsigned <= 0

These warnings are reported because
1. measure_vals() declares 'bw_imc' and 'bw_resc_end' as unsigned long
   variables
2. Return value of get_mem_bw_imc() and get_mem_bw_resctrl() are assigned
   to 'bw_imc' and 'bw_resc_end' respectively
3. The returned values are checked for <= 0 to see if the calls failed

Checking for < 0 for an unsigned value doesn't make any sense.

Fix this issue by changing the implementation of get_mem_bw_imc() and
get_mem_bw_resctrl() such that they now accept reference to a variable
and set the variable appropriately upon success and return 0, else return
< 0 on error.
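
The difference between the two calling conventions can be sketched as follows. read_counter_old() and read_counter() are hypothetical stand-ins for the old and new shape of get_mem_bw_resctrl(); they parse from a string rather than a file to keep the example self-contained.

```c
#include <stdio.h>

/*
 * Old shape: the error code shares the return value with the data.
 * Returning -1 from a function of unsigned type yields ULONG_MAX, so
 * a caller testing "<= 0" can only ever catch the value 0.
 */
static unsigned long read_counter_old(const char *text)
{
	unsigned long val;

	if (sscanf(text, "%lu", &val) != 1)
		return -1;
	return val;
}

/*
 * New shape: the status is a signed int and the value travels through
 * an out-parameter, so "< 0" is a meaningful failure check.
 */
static int read_counter(const char *text, unsigned long *val)
{
	if (!val || sscanf(text, "%lu", val) != 1)
		return -1;
	return 0;
}
```

With the second form, a legitimate reading of 0 is no longer confused with a failure, and a failure is no longer mistaken for a very large reading.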

Reported-by: Dan Carpenter 
Signed-off-by: Fenghua Yu 
---
 tools/testing/selftests/resctrl/resctrl_val.c | 41 +++
 1 file changed, 23 insertions(+), 18 deletions(-)

diff --git a/tools/testing/selftests/resctrl/resctrl_val.c 
b/tools/testing/selftests/resctrl/resctrl_val.c
index de99d398ebfb..5a66e94cd4b4 100644
--- a/tools/testing/selftests/resctrl/resctrl_val.c
+++ b/tools/testing/selftests/resctrl/resctrl_val.c
@@ -300,9 +300,9 @@ static int initialize_mem_bw_imc(void)
  * Memory B/W utilized by a process on a socket can be calculated using
  * iMC counters. Perf events are used to read these counters.
  *
- * Return: >= 0 on success. < 0 on failure.
+ * Return: = 0 on success. < 0 on failure.
  */
-static float get_mem_bw_imc(int cpu_no, char *bw_report)
+static int get_mem_bw_imc(int cpu_no, char *bw_report, float *bw_imc)
 {
float reads, writes, of_mul_read, of_mul_write;
int imc, j, ret;
@@ -373,13 +373,18 @@ static float get_mem_bw_imc(int cpu_no, char *bw_report)
close(imc_counters_config[imc][WRITE].fd);
}
 
-   if (strcmp(bw_report, "reads") == 0)
-   return reads;
+   if (strcmp(bw_report, "reads") == 0) {
+   *bw_imc = reads;
+   return 0;
+   }
 
-   if (strcmp(bw_report, "writes") == 0)
-   return writes;
+   if (strcmp(bw_report, "writes") == 0) {
+   *bw_imc = writes;
+   return 0;
+   }
 
-   return (reads + writes);
+   *bw_imc = reads + writes;
+   return 0;
 }
 
 void set_mbm_path(const char *ctrlgrp, const char *mongrp, int resource_id)
@@ -438,9 +443,8 @@ static void initialize_mem_bw_resctrl(const char *ctrlgrp, 
const char *mongrp,
  * 1. If con_mon grp is given, then read from it
  * 2. If con_mon grp is not given, then read from root con_mon grp
  */
-static unsigned long get_mem_bw_resctrl(void)
+static int get_mem_bw_resctrl(unsigned long *mbm_total)
 {
-   unsigned long mbm_total = 0;
FILE *fp;
 
fp = fopen(mbm_total_path, "r");
@@ -449,7 +453,7 @@ static unsigned long get_mem_bw_resctrl(void)
 
return -1;
}
-   if (fscanf(fp, "%lu", &mbm_total) <= 0) {
+   if (fscanf(fp, "%lu", mbm_total) <= 0) {
perror("Could not get mbm local bytes");
fclose(fp);
 
@@ -457,7 +461,7 @@ static unsigned long get_mem_bw_resctrl(void)
}
fclose(fp);
 
-   return mbm_total;
+   return 0;
 }
 
 pid_t bm_pid, ppid;
@@ -549,7 +553,8 @@ static void initialize_llc_occu_resctrl(const char 
*ctrlgrp, const char *mongrp,
 static int
 measure_vals(struct resctrl_val_param *param, unsigned long *bw_resc_start)
 {
-   unsigned long bw_imc, bw_resc, bw_resc_end;
+   unsigned long bw_resc, bw_resc_end;
+   float bw_imc;
int ret;
 
/*
@@ -559,13 +564,13 @@ measure_vals(struct resctrl_val_param *param, unsigned 
long *bw_resc_start)
 * Compare the two values to validate resctrl value.
 * It takes 1sec to measure the data.
 */
-   bw_imc = get_mem_bw_imc(param->cpu_no, param->bw_report);
-   if (bw_imc <= 0)
-   return bw_imc;
+   ret = get_mem_bw_imc(param->cpu_no, param->bw_report, &bw_imc);
+   if (ret < 0)
+   return ret;
 
-   bw_resc_end = get_mem_bw_resctrl();
-   if (bw_resc_end <= 0)
-   return bw_resc_end;
+   ret = get_mem_bw_resctrl(&bw_resc_end);
+   if (ret < 0)
+   return ret;
 
bw_resc = (bw_resc_end - *bw_resc_start) / MB;
ret = print_results_bw(param->filename, bm_pid, bw_imc, bw_resc);
-- 
2.30.1



[PATCH v5 19/21] selftests/resctrl: Fix incorrect parsing of iMC counters

2021-03-07 Thread Fenghua Yu
iMC (Integrated Memory Controller) counters are usually at
"/sys/bus/event_source/devices/" and are named as "uncore_imc_<n>".
num_of_imcs() function tries to count number of such iMC counters so that
it could appropriately initialize required number of perf_attr structures
that could be used to read these iMC counters.

num_of_imcs() function assumes that all the directories under this path
that start with "uncore_imc" are iMC counters. But, on some systems there
could be directories named as "uncore_imc_free_running" which aren't iMC
counters. Trying to read from such directories will result in "not found
file" errors and MBM/MBA tests will fail.

Hence, fix the logic in num_of_imcs() such that it looks at the first
character after "uncore_imc_" to check if it's a numerical digit or not. If
it's a digit then the directory represents an iMC counter, else, skip the
directory.
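
The name check can be sketched as a standalone predicate. is_imc_counter_dir() is a hypothetical name, UNCORE_IMC is assumed to be defined as "uncore_imc", and the strlen() guard is an addition of this sketch so that a name ending right after the prefix is never read past its terminator.

```c
#include <stdbool.h>
#include <string.h>

#define UNCORE_IMC "uncore_imc"	/* assumed value of the selftest macro */

/* Return true if a directory name looks like "uncore_imc_<n>". */
static bool is_imc_counter_dir(const char *name)
{
	const char *p = strstr(name, UNCORE_IMC);

	if (!p)
		return false;

	/* Guard (not in the patch): need at least prefix + one more char. */
	if (strlen(p) < sizeof(UNCORE_IMC))
		return false;

	/*
	 * sizeof() counts the terminating NUL, so adding it to p steps
	 * over both the prefix and the '_' that follows it.
	 */
	p += sizeof(UNCORE_IMC);

	return *p >= '0' && *p <= '9';
}
```

With this predicate, "uncore_imc_0" is accepted while "uncore_imc_free_running" and "uncore_imc_free_running_0" are rejected, because the character after the underscore is 'f'.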

Reported-by: Reinette Chatre 
Signed-off-by: Fenghua Yu 
---
 tools/testing/selftests/resctrl/resctrl_val.c | 22 +--
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/resctrl/resctrl_val.c 
b/tools/testing/selftests/resctrl/resctrl_val.c
index 48bcd5fd7d79..de99d398ebfb 100644
--- a/tools/testing/selftests/resctrl/resctrl_val.c
+++ b/tools/testing/selftests/resctrl/resctrl_val.c
@@ -221,8 +221,8 @@ static int read_from_imc_dir(char *imc_dir, int count)
  */
 static int num_of_imcs(void)
 {
+   char imc_dir[512], *temp;
unsigned int count = 0;
-   char imc_dir[512];
struct dirent *ep;
int ret;
DIR *dp;
@@ -230,7 +230,25 @@ static int num_of_imcs(void)
dp = opendir(DYN_PMU_PATH);
if (dp) {
while ((ep = readdir(dp))) {
-   if (strstr(ep->d_name, UNCORE_IMC)) {
+   temp = strstr(ep->d_name, UNCORE_IMC);
+   if (!temp)
+   continue;
+
+   /*
+* imc counters are named as "uncore_imc_<n>", hence
+* increment the pointer to point to <n>. Note that
+* sizeof(UNCORE_IMC) would count for null character as
+* well and hence the last underscore character in
+* uncore_imc'_' need not be counted.
+*/
+   temp = temp + sizeof(UNCORE_IMC);
+
+   /*
+* Some directories under "DYN_PMU_PATH" could have
+* names like "uncore_imc_free_running", hence, check if
+* first character is a numerical digit or not.
+*/
+   if (temp[0] >= '0' && temp[0] <= '9') {
sprintf(imc_dir, "%s/%s/", DYN_PMU_PATH,
ep->d_name);
ret = read_from_imc_dir(imc_dir, count);
-- 
2.30.1



[PATCH v5 02/21] selftests/resctrl: Fix compilation issues for global variables

2021-03-07 Thread Fenghua Yu
Reinette reported following compilation issue on Fedora 32, gcc version
10.1.1

/usr/bin/ld: cqm_test.o:/cqm_test.c:22: multiple definition of
`cache_size'; cat_test.o:/cat_test.c:23: first defined here

The same issue is reported for long_mask, cbm_mask, count_of_bits etc
variables as well. Compiler isn't happy because these variables are
defined globally in two .c files namely cqm_test.c and cat_test.c and
the compiler during compilation finds that the variable is already
defined (multiple definition error).

Taking a closer look at the usage of these variables reveals that these
variables are used only locally to functions such as cqm_resctrl_val()
(defined in cqm_test.c) and cat_perf_miss_val() (defined in cat_test.c).
These variables are not shared between those functions. So, there is no
need for these variables to be global. Hence, fix this issue by making
them static variables.

Reported-by: Reinette Chatre 
Signed-off-by: Fenghua Yu 
---
Change Log:
v5:
- Define long_mask, cbm_mask, count_of_bits etc as static variables
  (Shuah).
- Split this patch into patch 2 and 3 (Shuah).

 tools/testing/selftests/resctrl/cat_test.c  | 10 +-
 tools/testing/selftests/resctrl/cqm_test.c  | 10 +-
 tools/testing/selftests/resctrl/resctrl.h   |  2 +-
 tools/testing/selftests/resctrl/resctrlfs.c | 10 +-
 4 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/tools/testing/selftests/resctrl/cat_test.c 
b/tools/testing/selftests/resctrl/cat_test.c
index 5da43767b973..bdeeb5772592 100644
--- a/tools/testing/selftests/resctrl/cat_test.c
+++ b/tools/testing/selftests/resctrl/cat_test.c
@@ -17,10 +17,10 @@
 #define MAX_DIFF_PERCENT   4
 #define MAX_DIFF   100
 
-int count_of_bits;
-char cbm_mask[256];
-unsigned long long_mask;
-unsigned long cache_size;
+static int count_of_bits;
+static char cbm_mask[256];
+static unsigned long long_mask;
+static unsigned long cache_size;
 
 /*
  * Change schemata. Write schemata to specified
@@ -136,7 +136,7 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
return -1;
 
/* Get default cbm mask for L3/L2 cache */
-   ret = get_cbm_mask(cache_type);
+   ret = get_cbm_mask(cache_type, cbm_mask);
if (ret)
return ret;
 
diff --git a/tools/testing/selftests/resctrl/cqm_test.c 
b/tools/testing/selftests/resctrl/cqm_test.c
index 5e7308ac63be..de33d1c0466e 100644
--- a/tools/testing/selftests/resctrl/cqm_test.c
+++ b/tools/testing/selftests/resctrl/cqm_test.c
@@ -16,10 +16,10 @@
 #define MAX_DIFF   200
 #define MAX_DIFF_PERCENT   15
 
-int count_of_bits;
-char cbm_mask[256];
-unsigned long long_mask;
-unsigned long cache_size;
+static int count_of_bits;
+static char cbm_mask[256];
+static unsigned long long_mask;
+static unsigned long cache_size;
 
 static int cqm_setup(int num, ...)
 {
@@ -125,7 +125,7 @@ int cqm_resctrl_val(int cpu_no, int n, char **benchmark_cmd)
if (!validate_resctrl_feature_request("cqm"))
return -1;
 
-   ret = get_cbm_mask("L3");
+   ret = get_cbm_mask("L3", cbm_mask);
if (ret)
return ret;
 
diff --git a/tools/testing/selftests/resctrl/resctrl.h 
b/tools/testing/selftests/resctrl/resctrl.h
index 39bf59c6b9c5..959c71e39bdc 100644
--- a/tools/testing/selftests/resctrl/resctrl.h
+++ b/tools/testing/selftests/resctrl/resctrl.h
@@ -92,7 +92,7 @@ void tests_cleanup(void);
 void mbm_test_cleanup(void);
 int mba_schemata_change(int cpu_no, char *bw_report, char **benchmark_cmd);
 void mba_test_cleanup(void);
-int get_cbm_mask(char *cache_type);
+int get_cbm_mask(char *cache_type, char *cbm_mask);
 int get_cache_size(int cpu_no, char *cache_type, unsigned long *cache_size);
 void ctrlc_handler(int signum, siginfo_t *info, void *ptr);
 int cat_val(struct resctrl_val_param *param);
diff --git a/tools/testing/selftests/resctrl/resctrlfs.c 
b/tools/testing/selftests/resctrl/resctrlfs.c
index 19c0ec4045a4..2a16100c9c3f 100644
--- a/tools/testing/selftests/resctrl/resctrlfs.c
+++ b/tools/testing/selftests/resctrl/resctrlfs.c
@@ -49,8 +49,6 @@ static int find_resctrl_mount(char *buffer)
return -ENOENT;
 }
 
-char cbm_mask[256];
-
 /*
  * remount_resctrlfs - Remount resctrl FS at /sys/fs/resctrl
  * @mum_resctrlfs: Should the resctrl FS be remounted?
@@ -205,16 +203,18 @@ int get_cache_size(int cpu_no, char *cache_type, unsigned 
long *cache_size)
 /*
  * get_cbm_mask - Get cbm mask for given cache
  * @cache_type:Cache level L2/L3
- *
- * Mask is stored in cbm_mask which is global variable.
+ * @cbm_mask:  cbm_mask returned as a string
  *
  * Return: = 0 on success, < 0 on failure.
  */
-int get_cbm_mask(char *cache_type)
+int get_cbm_mask(char *cache_type, char *cbm_mask)
 {
char cbm_mask_path[1024];
FILE *fp;
 
+   if (!cbm_mask)
+   return -1;
+
sprintf(cbm_mask_path, "%s/%s/cbm_mask"

[PATCH v5 04/21] selftests/resctrl: Clean up resctrl features check

2021-03-07 Thread Fenghua Yu
The resctrl feature checks call strcmp() to compare feature strings
(e.g. "mba", "cat" etc). These checks are error prone and do not follow
good coding style. Define the constant strings in macros and call
strncmp() to solve the potential issues.
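
A small sketch of the comparison idiom, under the assumption that CAT_STR expands to "cat". FEATURE_IS() is a hypothetical convenience macro used only to keep the example short; the patch calls strncmp() directly.

```c
#include <string.h>

#define CAT_STR "cat"	/* assumed value of the macro added in resctrl.h */

/*
 * sizeof(str) on a string literal includes the terminating NUL, so the
 * comparison is bounded and still only succeeds on an exact match.
 */
#define FEATURE_IS(val, str) (!strncmp((val), (str), sizeof(str)))
```

Because the NUL is part of the compared range, "catx" and "ca" do not match CAT_STR, whereas using strlen(CAT_STR) as the bound would accept any string that merely starts with "cat".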

Suggested-by: Shuah Khan 
Signed-off-by: Fenghua Yu 
---
Change Log:
v5:
- Remove is_cat() etc functions and directly call strncmp() to check
  the features (Shuah).

 tools/testing/selftests/resctrl/cache.c   |  8 +++
 tools/testing/selftests/resctrl/cat_test.c|  2 +-
 tools/testing/selftests/resctrl/cqm_test.c|  2 +-
 tools/testing/selftests/resctrl/fill_buf.c|  4 ++--
 tools/testing/selftests/resctrl/mba_test.c|  2 +-
 tools/testing/selftests/resctrl/mbm_test.c|  2 +-
 tools/testing/selftests/resctrl/resctrl.h |  5 +
 .../testing/selftests/resctrl/resctrl_tests.c | 12 +-
 tools/testing/selftests/resctrl/resctrl_val.c | 22 +--
 tools/testing/selftests/resctrl/resctrlfs.c   | 17 +++---
 10 files changed, 41 insertions(+), 35 deletions(-)

diff --git a/tools/testing/selftests/resctrl/cache.c 
b/tools/testing/selftests/resctrl/cache.c
index 38dbf4962e33..5922cc1b0386 100644
--- a/tools/testing/selftests/resctrl/cache.c
+++ b/tools/testing/selftests/resctrl/cache.c
@@ -182,7 +182,7 @@ int measure_cache_vals(struct resctrl_val_param *param, int 
bm_pid)
/*
 * Measure cache miss from perf.
 */
-   if (!strcmp(param->resctrl_val, "cat")) {
+   if (!strncmp(param->resctrl_val, CAT_STR, sizeof(CAT_STR))) {
ret = get_llc_perf(&llc_perf_miss);
if (ret < 0)
return ret;
@@ -192,7 +192,7 @@ int measure_cache_vals(struct resctrl_val_param *param, int 
bm_pid)
/*
 * Measure llc occupancy from resctrl.
 */
-   if (!strcmp(param->resctrl_val, "cqm")) {
+   if (!strncmp(param->resctrl_val, CQM_STR, sizeof(CQM_STR))) {
ret = get_llc_occu_resctrl(&llc_occu_resc);
if (ret < 0)
return ret;
@@ -234,7 +234,7 @@ int cat_val(struct resctrl_val_param *param)
if (ret)
return ret;
 
-   if ((strcmp(resctrl_val, "cat") == 0)) {
+   if (!strncmp(resctrl_val, CAT_STR, sizeof(CAT_STR))) {
ret = initialize_llc_perf();
if (ret)
return ret;
@@ -242,7 +242,7 @@ int cat_val(struct resctrl_val_param *param)
 
/* Test runs until the callback setup() tells the test to stop. */
while (1) {
-   if (strcmp(resctrl_val, "cat") == 0) {
+   if (!strncmp(resctrl_val, CAT_STR, sizeof(CAT_STR))) {
ret = param->setup(1, param);
if (ret) {
ret = 0;
diff --git a/tools/testing/selftests/resctrl/cat_test.c 
b/tools/testing/selftests/resctrl/cat_test.c
index bdeeb5772592..20823725daca 100644
--- a/tools/testing/selftests/resctrl/cat_test.c
+++ b/tools/testing/selftests/resctrl/cat_test.c
@@ -164,7 +164,7 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
return -1;
 
struct resctrl_val_param param = {
-   .resctrl_val= "cat",
+   .resctrl_val= CAT_STR,
.cpu_no = cpu_no,
.mum_resctrlfs  = 0,
.setup  = cat_setup,
diff --git a/tools/testing/selftests/resctrl/cqm_test.c 
b/tools/testing/selftests/resctrl/cqm_test.c
index de33d1c0466e..271752e9ef5b 100644
--- a/tools/testing/selftests/resctrl/cqm_test.c
+++ b/tools/testing/selftests/resctrl/cqm_test.c
@@ -145,7 +145,7 @@ int cqm_resctrl_val(int cpu_no, int n, char **benchmark_cmd)
}
 
struct resctrl_val_param param = {
-   .resctrl_val= "cqm",
+   .resctrl_val= CQM_STR,
.ctrlgrp= "c1",
.mongrp = "m1",
.cpu_no = cpu_no,
diff --git a/tools/testing/selftests/resctrl/fill_buf.c 
b/tools/testing/selftests/resctrl/fill_buf.c
index 79c611c99a3d..51e5cf22632f 100644
--- a/tools/testing/selftests/resctrl/fill_buf.c
+++ b/tools/testing/selftests/resctrl/fill_buf.c
@@ -115,7 +115,7 @@ static int fill_cache_read(unsigned char *start_ptr, 
unsigned char *end_ptr,
 
while (1) {
ret = fill_one_span_read(start_ptr, end_ptr);
-   if (!strcmp(resctrl_val, "cat"))
+   if (!strncmp(resctrl_val, CAT_STR, sizeof(CAT_STR)))
break;
}
 
@@ -134,7 +134,7 @@ static int fill_cache_write(unsigned char *start_ptr, 
unsigned char *end_ptr,
 {
while (1) {
fill_one_span_write(start_ptr, end_ptr);
-   if (!strcmp(resctrl_val, "cat"))
+   if (!strncmp(resctr

[PATCH v5 05/21] selftests/resctrl: Ensure sibling CPU is not same as original CPU

2021-03-07 Thread Fenghua Yu
From: Reinette Chatre 

The resctrl tests can accept a CPU on which the tests are run and use
default of CPU #1 if it is not provided. In the CAT test a "sibling CPU"
is determined that is from the same package where another thread will be
run.

The current algorithm with which a "sibling CPU" is determined does not
take the provided/default CPU into account and when that CPU is the
first CPU in a package then the "sibling CPU" will be selected to be the
same CPU since it starts by picking the first CPU from core_siblings_list.

Fix the "sibling CPU" selection by taking the provided/default CPU into
account and ensuring a sibling that is a different CPU is selected.
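
The corrected selection can be sketched as a pure function over the contents of core_siblings_list. pick_sibling() is a hypothetical name; like the selftest, it tokenizes on "-" and ",", so for a range such as "0-3" only the endpoints are considered.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Return the first listed CPU that is neither CPU 0 nor cpu_no,
 * or -1 if the list offers no such CPU.
 */
static int pick_sibling(const char *siblings_list, int cpu_no)
{
	char buf[256];
	char *token;
	int sibling;

	/* strtok() modifies its input, so work on a private copy. */
	strncpy(buf, siblings_list, sizeof(buf) - 1);
	buf[sizeof(buf) - 1] = '\0';

	for (token = strtok(buf, "-,"); token; token = strtok(NULL, "-,")) {
		sibling = atoi(token);
		/* Skip core 0 and the CPU the test itself runs on. */
		if (sibling != 0 && sibling != cpu_no)
			return sibling;
	}
	return -1;
}
```

For a list of "1-7" and cpu_no 1, the old condition (sibling != 0 only) would have returned CPU 1 itself; with the extra comparison the function moves on to CPU 7.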

Signed-off-by: Reinette Chatre 
Signed-off-by: Fenghua Yu 
---
Change Log:
v5:
- Move from v4's patch 8 to this patch as the fix patch should be first
  (Shuah).

 tools/testing/selftests/resctrl/resctrlfs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/resctrl/resctrlfs.c 
b/tools/testing/selftests/resctrl/resctrlfs.c
index 4174e48e06d1..bc52076bee7f 100644
--- a/tools/testing/selftests/resctrl/resctrlfs.c
+++ b/tools/testing/selftests/resctrl/resctrlfs.c
@@ -268,7 +268,7 @@ int get_core_sibling(int cpu_no)
while (token) {
sibling_cpu_no = atoi(token);
/* Skipping core 0 as we don't want to run test on core 0 */
-   if (sibling_cpu_no != 0)
+   if (sibling_cpu_no != 0 && sibling_cpu_no != cpu_no)
break;
token = strtok(NULL, "-,");
}
-- 
2.30.1



[PATCH v5 09/21] selftests/resctrl: Share show_cache_info() by CAT and CMT tests

2021-03-07 Thread Fenghua Yu
show_cache_info() is defined separately in the CAT and CMT tests, but
the function is the same for both tests, so there is no need to define
it twice. Share a single definition between the tests.

Suggested-by: Shuah Khan 
Signed-off-by: Fenghua Yu 
---
Change Log:
v5:
- Add this patch (Shuah)

 tools/testing/selftests/resctrl/cache.c| 42 ++
 tools/testing/selftests/resctrl/cat_test.c | 28 ++-
 tools/testing/selftests/resctrl/cmt_test.c | 33 ++---
 tools/testing/selftests/resctrl/resctrl.h  |  4 +++
 4 files changed, 52 insertions(+), 55 deletions(-)

diff --git a/tools/testing/selftests/resctrl/cache.c 
b/tools/testing/selftests/resctrl/cache.c
index 2aa1b5c7d9e1..eaf5116b112c 100644
--- a/tools/testing/selftests/resctrl/cache.c
+++ b/tools/testing/selftests/resctrl/cache.c
@@ -270,3 +270,45 @@ int cat_val(struct resctrl_val_param *param)
 
return ret;
 }
+
+/*
+ * show_cache_info:show cache test result information
+ * @sum_llc_val:   sum of LLC cache result data
+ * @no_of_bits:number of bits
+ * @cache_span:cache span in bytes for CMT or in lines for CAT
+ * @max_diff:  max difference
+ * @max_diff_percent:  max difference percentage
+ * @num_of_runs:   number of runs
+ * @platform:  show test information on this platform
+ * @cmt:   CMT test or CAT test
+ *
+ * Return: 0 on success. non-zero on failure.
+ */
+int show_cache_info(unsigned long sum_llc_val, int no_of_bits,
+   unsigned long cache_span, unsigned long max_diff,
+   unsigned long max_diff_percent, unsigned long num_of_runs,
+   bool platform, bool cmt)
+{
+   unsigned long avg_llc_val = 0;
+   float diff_percent;
+   long avg_diff = 0;
+   int ret;
+
+   avg_llc_val = sum_llc_val / (num_of_runs - 1);
+   avg_diff = (long)abs(cache_span - avg_llc_val);
+   diff_percent = ((float)cache_span - avg_llc_val) / cache_span * 100;
+
+   ret = platform && abs((int)diff_percent) > max_diff_percent &&
+ (cmt ? (abs(avg_diff) > max_diff) : true);
+
+   ksft_print_msg("%scache miss rate within %d%%\n",
+  ret ? "fail: " : "", max_diff_percent);
+
+   ksft_print_msg("Percent diff=%d\n", abs((int)diff_percent));
+   ksft_print_msg("Number of bits: %d\n", no_of_bits);
+   ksft_print_msg("Average LLC val: %lu\n", avg_llc_val);
+   ksft_print_msg("Cache span (%s): %lu\n", cmt ? "bytes" : "lines",
+  cache_span);
+
+   return ret;
+}
diff --git a/tools/testing/selftests/resctrl/cat_test.c 
b/tools/testing/selftests/resctrl/cat_test.c
index 0deb38ed971b..109363e9a7d7 100644
--- a/tools/testing/selftests/resctrl/cat_test.c
+++ b/tools/testing/selftests/resctrl/cat_test.c
@@ -52,30 +52,6 @@ static int cat_setup(int num, ...)
return ret;
 }
 
-static int show_cache_info(unsigned long sum_llc_perf_miss, int no_of_bits,
-  unsigned long span)
-{
-   unsigned long allocated_cache_lines = span / 64;
-   unsigned long avg_llc_perf_miss = 0;
-   float diff_percent;
-   int ret;
-
-   avg_llc_perf_miss = sum_llc_perf_miss / (NUM_OF_RUNS - 1);
-   diff_percent = ((float)allocated_cache_lines - avg_llc_perf_miss) /
-   allocated_cache_lines * 100;
-
-   ret = !is_amd && abs((int)diff_percent) > MAX_DIFF_PERCENT;
-   ksft_print_msg("cache miss rate %swithin %d%%\n",
-  ret ? "not " : "", MAX_DIFF_PERCENT);
-
-   ksft_print_msg("Percent diff=%d\n", abs((int)diff_percent));
-   ksft_print_msg("Number of bits: %d\n", no_of_bits);
-   ksft_print_msg("Avg_llc_perf_miss: %lu\n", avg_llc_perf_miss);
-   ksft_print_msg("Allocated cache lines: %lu\n", allocated_cache_lines);
-
-   return ret;
-}
-
 static int check_results(struct resctrl_val_param *param)
 {
char *token_array[8], temp[512];
@@ -111,7 +87,9 @@ static int check_results(struct resctrl_val_param *param)
fclose(fp);
no_of_bits = count_bits(param->mask);
 
-   return show_cache_info(sum_llc_perf_miss, no_of_bits, param->span);
+   return show_cache_info(sum_llc_perf_miss, no_of_bits, param->span / 64,
+  MAX_DIFF, MAX_DIFF_PERCENT, NUM_OF_RUNS,
+  !is_amd, false);
 }
 
 void cat_test_cleanup(void)
diff --git a/tools/testing/selftests/resctrl/cmt_test.c 
b/tools/testing/selftests/resctrl/cmt_test.c
index e5af19335115..4adb92cb6ca1 100644
--- a/tools/testing/selftests/resctrl/cmt_test.c
+++ b/tools/testing/selftests/resctrl/cmt_test.c
@@ -39,35 +39,6 @@ static int cmt_setup(int num, ...)
return 0;
 }
 
-static in

[PATCH v5 07/21] selftests/resctrl: Rename CQM test as CMT test

2021-03-07 Thread Fenghua Yu
CMT (Cache Monitoring Technology) [1] is a hardware feature that reports
the cache occupancy of a process. The resctrl selftest suite has a unit
test for CMT on the LLC, but the test is named CQM (Cache Quality
Monitoring). Furthermore, the unit test source file is named cqm_test.c,
and several functions, variables, comments, preprocessor definitions and
statements use "cqm" as either a suffix or a prefix. This widespread
misuse of CQM for CMT might confuse someone new to the resctrl selftests,
because the feature is named CMT in the Intel Software Developer's Manual.

Hence, rename all the occurrences (unit test source file name, functions,
variables, comments and preprocessors) of cqm with cmt.

[1] Please see Intel SDM, Volume 3, chapter 17 and section 18 for more
information on CMT: 
https://software.intel.com/content/www/us/en/develop/articles/intel-sdm.html

Suggested-by: Reinette Chatre 
Signed-off-by: Fenghua Yu 
---
 tools/testing/selftests/resctrl/README|  4 +--
 tools/testing/selftests/resctrl/cache.c   |  4 +--
 .../resctrl/{cqm_test.c => cmt_test.c}| 20 +++---
 tools/testing/selftests/resctrl/resctrl.h |  6 ++---
 .../testing/selftests/resctrl/resctrl_tests.c | 26 +--
 tools/testing/selftests/resctrl/resctrl_val.c | 12 -
 tools/testing/selftests/resctrl/resctrlfs.c   | 10 +++
 7 files changed, 41 insertions(+), 41 deletions(-)
 rename tools/testing/selftests/resctrl/{cqm_test.c => cmt_test.c} (89%)

diff --git a/tools/testing/selftests/resctrl/README 
b/tools/testing/selftests/resctrl/README
index 6e5a0ffa18e8..4b36b25b6ac0 100644
--- a/tools/testing/selftests/resctrl/README
+++ b/tools/testing/selftests/resctrl/README
@@ -46,8 +46,8 @@ ARGUMENTS
 Parameter '-h' shows usage information.
 
 usage: resctrl_tests [-h] [-b "benchmark_cmd [options]"] [-t test list] [-n 
no_of_bits]
--b benchmark_cmd [options]: run specified benchmark for MBM, MBA and 
CQM default benchmark is builtin fill_buf
--t test list: run tests specified in the test list, e.g. -t mbm, mba, 
cqm, cat
+-b benchmark_cmd [options]: run specified benchmark for MBM, MBA and 
CMT default benchmark is builtin fill_buf
+-t test list: run tests specified in the test list, e.g. -t mbm, mba, 
cmt, cat
 -n no_of_bits: run cache tests using specified no of bits in cache bit 
mask
 -p cpu_no: specify CPU number to run the test. 1 is default
 -h: help
diff --git a/tools/testing/selftests/resctrl/cache.c 
b/tools/testing/selftests/resctrl/cache.c
index 5922cc1b0386..2aa1b5c7d9e1 100644
--- a/tools/testing/selftests/resctrl/cache.c
+++ b/tools/testing/selftests/resctrl/cache.c
@@ -111,7 +111,7 @@ static int get_llc_perf(unsigned long *llc_perf_miss)
 
 /*
  * Get LLC Occupancy as reported by RESCTRL FS
- * For CQM,
+ * For CMT,
  * 1. If con_mon grp and mon grp given, then read from mon grp in
  * con_mon grp
  * 2. If only con_mon grp given, then read from con_mon grp
@@ -192,7 +192,7 @@ int measure_cache_vals(struct resctrl_val_param *param, int 
bm_pid)
/*
 * Measure llc occupancy from resctrl.
 */
-   if (!strncmp(param->resctrl_val, CQM_STR, sizeof(CQM_STR))) {
+   if (!strncmp(param->resctrl_val, CMT_STR, sizeof(CMT_STR))) {
ret = get_llc_occu_resctrl(&llc_occu_resc);
if (ret < 0)
return ret;
diff --git a/tools/testing/selftests/resctrl/cqm_test.c 
b/tools/testing/selftests/resctrl/cmt_test.c
similarity index 89%
rename from tools/testing/selftests/resctrl/cqm_test.c
rename to tools/testing/selftests/resctrl/cmt_test.c
index 271752e9ef5b..ca82db37c1f7 100644
--- a/tools/testing/selftests/resctrl/cqm_test.c
+++ b/tools/testing/selftests/resctrl/cmt_test.c
@@ -1,6 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0
 /*
- * Cache Monitoring Technology (CQM) test
+ * Cache Monitoring Technology (CMT) test
  *
  * Copyright (C) 2018 Intel Corporation
  *
@@ -11,7 +11,7 @@
 #include "resctrl.h"
 #include 
 
-#define RESULT_FILE_NAME   "result_cqm"
+#define RESULT_FILE_NAME   "result_cmt"
 #define NUM_OF_RUNS        5
 #define MAX_DIFF   200
 #define MAX_DIFF_PERCENT   15
@@ -21,7 +21,7 @@ static char cbm_mask[256];
 static unsigned long long_mask;
 static unsigned long cache_size;
 
-static int cqm_setup(int num, ...)
+static int cmt_setup(int num, ...)
 {
struct resctrl_val_param *p;
va_list param;
@@ -58,7 +58,7 @@ static void show_cache_info(unsigned long sum_llc_occu_resc, 
int no_of_bits,
else
res = false;
 
-   printf("%sok CQM: diff within %d, %d\%%\n", res ? "" : "not",
+   printf("%sok CMT: diff within %d, %d\%%\n", res ? "" : "not",
   MAX_DIFF, (int)MAX_DIFF_PERCENT);
 
printf("# diff: %ld\n", avg_diff);
@@ -106

[PATCH v5 06/21] selftests/resctrl: Fix missing options "-n" and "-p"

2021-03-07 Thread Fenghua Yu
resctrl test suite accepts command line arguments (like -b, -t, -n and -p)
as documented in the help. But passing -n and -p throws an invalid option
error. This happens because -n and -p are missing in the list of
characters that getopt() recognizes as valid arguments. Hence, they are
treated as invalid options.

Fix this by adding them to the list of characters that getopt() recognizes
as valid arguments. Please note that the main() function already has the
logic to deal with the values passed as part of these arguments and hence
no changes are needed there.

Signed-off-by: Fenghua Yu 
---
Change Log:
v5:
- Move from v4's patch 9 to this patch as the fix patch should be first
  (Shuah).

 tools/testing/selftests/resctrl/resctrl_tests.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c 
b/tools/testing/selftests/resctrl/resctrl_tests.c
index 4b109a59f72d..ac2269610aa9 100644
--- a/tools/testing/selftests/resctrl/resctrl_tests.c
+++ b/tools/testing/selftests/resctrl/resctrl_tests.c
@@ -73,7 +73,7 @@ int main(int argc, char **argv)
}
}
 
-   while ((c = getopt(argc_new, argv, "ht:b:")) != -1) {
+   while ((c = getopt(argc_new, argv, "ht:b:n:p:")) != -1) {
char *token;
 
switch (c) {
-- 
2.30.1



[PATCH v5 00/21] Miscellaneous fixes for resctrl selftests

2021-03-07 Thread Fenghua Yu
This patch set has several miscellaneous fixes to the resctrl selftest
tool that are easily visible to the user. V1 had fixes to the CAT test and
the CMT test, but they were dropped in V2 because having them here made
the patchset humongous. So, changes to the CAT test and the CMT test will
be posted in another patchset.

Change Log:
v5:
- Address various comments from Shuah Khan:
  1. Move a few fixing patches before cleaning patches.
  2. Call kselftest APIs to log test results instead of printf().
  3. Add .gitignore to ignore resctrl_tests.
  4. Share show_cache_info() in CAT and CMT tests.
  5. Define long_mask, cbm_mask, count_of_bits etc as static variables.
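
Item 4 refers to a single pass/fail check shared by the CAT and CMT tests. The arithmetic behind that check can be sketched roughly as below. This is a simplified user-space model under stated assumptions, not the patch itself: `cache_check_fails()` is a hypothetical name, the parameters mirror those of the shared show_cache_info(), and the sketch uses explicit signed arithmetic. It assumes num_of_runs > 1 and cache_span > 0.

```c
#include <stdbool.h>
#include <stdlib.h>

/*
 * Return 1 when the averaged measurement is too far from the expected
 * cache span, 0 otherwise. The sum is divided by num_of_runs - 1, as in
 * the patch. Hypothetical helper for illustration only.
 */
static int cache_check_fails(unsigned long sum_llc_val,
			     unsigned long num_of_runs,
			     unsigned long cache_span, long max_diff,
			     long max_diff_percent, bool platform, bool cmt)
{
	long avg_llc_val = (long)(sum_llc_val / (num_of_runs - 1));
	long avg_diff = labs((long)cache_span - avg_llc_val);
	long diff_percent = labs((long)(((double)cache_span - avg_llc_val) /
					cache_span * 100));

	/* CAT checks the percentage only; CMT also checks the absolute diff */
	return platform && diff_percent > max_diff_percent &&
	       (cmt ? avg_diff > max_diff : true);
}
```

In this model, a CAT-style call (cmt false) fails as soon as the percentage limit is exceeded, while a CMT-style call must exceed both limits.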

v4:
- Address various comments from Shuah Khan:
  1. Combine a few patches e.g. a couple of fixing typos patches into one
 and a couple of unmounting patches into one etc.
  2. Add config file.
  3. Remove "Fixes" tags.
  4. Change strcmp() to strncmp().
  5. Move the global variable fixing patch to the patch 1 so that the
 compilation issue is fixed first.

Please note:
- I didn't move the patch of renaming CQM to CMT to the end of the series
  because code and commit messages in a few other patches depend on the
  new term "CMT". If I moved the renaming patch to the end, the earlier
  patches would use the old "CQM" term and code, which would then be
  changed again at the end of the series and require more code and
  explanations.
[v3: https://lkml.org/lkml/2020/10/28/137]

v3:
Address various comments (commit messages, return value on test failure,
print failure info on test failure etc) from Reinette and Tony.
[v2: 
https://lore.kernel.org/linux-kselftest/cover.1589835155.git.sai.praneeth.prak...@intel.com/]

v2:
1. Dropped changes to CAT test and CMT test as they will be posted in a later
   series.
2. Added several other fixes
[v1: 
https://lore.kernel.org/linux-kselftest/cover.1583657204.git.sai.praneeth.prak...@intel.com/]

Fenghua Yu (19):
  selftests/resctrl: Enable gcc checks to detect buffer overflows
  selftests/resctrl: Fix compilation issues for global variables
  selftests/resctrl: Fix compilation issues for other global variables
  selftests/resctrl: Clean up resctrl features check
  selftests/resctrl: Fix missing options "-n" and "-p"
  selftests/resctrl: Rename CQM test as CMT test
  selftests/resctrl: Call kselftest APIs to log test results
  selftests/resctrl: Share show_cache_info() by CAT and CMT tests
  selftests/resctrl: Add config dependencies
  selftests/resctrl: Check for resctrl mount point only if resctrl FS is
supported
  selftests/resctrl: Use resctrl/info for feature detection
  selftests/resctrl: Fix MBA/MBM results reporting format
  selftests/resctrl: Don't hard code value of "no_of_bits" variable
  selftests/resctrl: Modularize resctrl test suite main() function
  selftests/resctrl: Skip the test if requested resctrl feature is not
supported
  selftests/resctrl: Fix unmount resctrl FS
  selftests/resctrl: Fix incorrect parsing of iMC counters
  selftests/resctrl: Fix checking for < 0 for unsigned values
  selftests/resctrl: Create .gitignore to include resctrl_tests

Reinette Chatre (2):
  selftests/resctrl: Ensure sibling CPU is not same as original CPU
  selftests/resctrl: Fix a printed message

 tools/testing/selftests/resctrl/.gitignore|   2 +
 tools/testing/selftests/resctrl/Makefile  |   2 +-
 tools/testing/selftests/resctrl/README|   4 +-
 tools/testing/selftests/resctrl/cache.c   |  52 +-
 tools/testing/selftests/resctrl/cat_test.c|  57 ++
 .../resctrl/{cqm_test.c => cmt_test.c}|  75 +++-
 tools/testing/selftests/resctrl/config|   2 +
 tools/testing/selftests/resctrl/fill_buf.c|   4 +-
 tools/testing/selftests/resctrl/mba_test.c|  43 ++---
 tools/testing/selftests/resctrl/mbm_test.c|  42 ++---
 tools/testing/selftests/resctrl/resctrl.h |  29 +++-
 .../testing/selftests/resctrl/resctrl_tests.c | 163 --
 tools/testing/selftests/resctrl/resctrl_val.c |  95 ++
 tools/testing/selftests/resctrl/resctrlfs.c   | 134 --
 14 files changed, 408 insertions(+), 296 deletions(-)
 create mode 100644 tools/testing/selftests/resctrl/.gitignore
 rename tools/testing/selftests/resctrl/{cqm_test.c => cmt_test.c} (56%)
 create mode 100644 tools/testing/selftests/resctrl/config

-- 
2.30.1



[PATCH v5 10/21] selftests/resctrl: Fix a printed message

2021-03-07 Thread Fenghua Yu
From: Reinette Chatre 

Add a missing newline to the printed help text to improve readability.

Signed-off-by: Reinette Chatre 
Signed-off-by: Fenghua Yu 
---
Change Log:
v5:
- Remove the "notok" fix part because the API change fixes it already.

 tools/testing/selftests/resctrl/resctrl_tests.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c 
b/tools/testing/selftests/resctrl/resctrl_tests.c
index ebc24992cc2c..f1b08afbc3d0 100644
--- a/tools/testing/selftests/resctrl/resctrl_tests.c
+++ b/tools/testing/selftests/resctrl/resctrl_tests.c
@@ -37,8 +37,8 @@ void detect_amd(void)
 static void cmd_help(void)
 {
printf("usage: resctrl_tests [-h] [-b \"benchmark_cmd [options]\"] [-t 
test list] [-n no_of_bits]\n");
-   printf("\t-b benchmark_cmd [options]: run specified benchmark for MBM, 
MBA and CMT");
-   printf("\t default benchmark is builtin fill_buf\n");
+   printf("\t-b benchmark_cmd [options]: run specified benchmark for MBM, 
MBA and CMT\n");
+   printf("\t   default benchmark is builtin fill_buf\n");
printf("\t-t test list: run tests specified in the test list, ");
printf("e.g. -t mbm, mba, cmt, cat\n");
printf("\t-n no_of_bits: run cache tests using specified no of bits in 
cache bit mask\n");
-- 
2.30.1



[PATCH v5 01/21] selftests/resctrl: Enable gcc checks to detect buffer overflows

2021-03-07 Thread Fenghua Yu
David reported a buffer overflow error in the check_results() function of
the cmt unit test and he suggested enabling _FORTIFY_SOURCE gcc compiler
option to automatically detect any such errors.

The Feature Test Macros man page describes _FORTIFY_SOURCE as below:

"Defining this macro causes some lightweight checks to be performed to
detect some buffer overflow errors when employing various string and memory
manipulation functions (for example, memcpy, memset, stpcpy, strcpy,
strncpy, strcat, strncat, sprintf, snprintf, vsprintf, vsnprintf, gets, and
wide character variants thereof). For some functions, argument consistency
is checked; for example, a check is made that open has been supplied with a
mode argument when the specified flags include O_CREAT. Not all problems
are detected, just some common cases.

If _FORTIFY_SOURCE is set to 1, with compiler optimization level 1 (gcc
-O1) and above, checks that shouldn't change the behavior of conforming
programs are performed.

With _FORTIFY_SOURCE set to 2, some more checking is added, but some
conforming programs might fail.

Some of the checks can be performed at compile time (via macros logic
implemented in header files), and result in compiler warnings; other checks
take place at run time, and result in a run-time error if the check fails.

Use of this macro requires compiler support, available with gcc since
version 4.0."

Fix the buffer overflow error in the check_results() function of the cmt
unit test and enable _FORTIFY_SOURCE gcc check to catch any future buffer
overflow errors.
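
The pattern behind the fix can be shown in a small stand-alone sketch. It is an illustration only: `count_lines()` and the 64-byte buffer are assumptions made for the example and are not taken from the test. The point is that passing sizeof(temp) keeps the bound tied to the array, whereas a hard-coded length can exceed it.

```c
#include <stdio.h>
#include <string.h>

/*
 * Count newline-terminated lines in an open stream. Lines longer than
 * the buffer arrive in several chunks; only the chunk that holds the
 * newline is counted. Hypothetical helper for illustration only.
 */
static int count_lines(FILE *fp)
{
	char temp[64];
	int lines = 0;

	/* sizeof(temp) follows the array size, so fgets() cannot overrun it */
	while (fgets(temp, sizeof(temp), fp)) {
		if (strchr(temp, '\n'))
			lines++;
	}

	return lines;
}
```

If the array is later resized, the call needs no change, which a literal bound would not guarantee.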

Reported-by: David Binderman 
Suggested-by: David Binderman 
Signed-off-by: Fenghua Yu 
---
Change Log:
v5:
- Move from v4's patch 11 to patch 1 so the fix patch should be first
  (Shuah).

 tools/testing/selftests/resctrl/Makefile   | 2 +-
 tools/testing/selftests/resctrl/cqm_test.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/resctrl/Makefile 
b/tools/testing/selftests/resctrl/Makefile
index d585cc1948cc..6bcee2ec91a9 100644
--- a/tools/testing/selftests/resctrl/Makefile
+++ b/tools/testing/selftests/resctrl/Makefile
@@ -1,5 +1,5 @@
 CC = $(CROSS_COMPILE)gcc
-CFLAGS = -g -Wall
+CFLAGS = -g -Wall -O2 -D_FORTIFY_SOURCE=2
 SRCS=$(wildcard *.c)
 OBJS=$(SRCS:.c=.o)
 
diff --git a/tools/testing/selftests/resctrl/cqm_test.c 
b/tools/testing/selftests/resctrl/cqm_test.c
index c8756152bd61..5e7308ac63be 100644
--- a/tools/testing/selftests/resctrl/cqm_test.c
+++ b/tools/testing/selftests/resctrl/cqm_test.c
@@ -86,7 +86,7 @@ static int check_results(struct resctrl_val_param *param, int 
no_of_bits)
return errno;
}
 
-   while (fgets(temp, 1024, fp)) {
+   while (fgets(temp, sizeof(temp), fp)) {
char *token = strtok(temp, ":\t");
int fields = 0;
 
-- 
2.30.1



[PATCH v5 08/21] selftests/resctrl: Call kselftest APIs to log test results

2021-03-07 Thread Fenghua Yu
Call kselftest APIs instead of using printf() to log test results
for cleaner code and better future extension.

Suggested-by: Shuah Khan 
Signed-off-by: Fenghua Yu 
---
Change Log:
v5:
- Add this patch (Shuah)

 tools/testing/selftests/resctrl/cat_test.c| 37 +++
 tools/testing/selftests/resctrl/cmt_test.c| 42 -
 tools/testing/selftests/resctrl/mba_test.c| 24 +-
 tools/testing/selftests/resctrl/mbm_test.c| 28 ++--
 tools/testing/selftests/resctrl/resctrl.h |  2 +-
 .../testing/selftests/resctrl/resctrl_tests.c | 40 +
 tools/testing/selftests/resctrl/resctrl_val.c |  4 +-
 tools/testing/selftests/resctrl/resctrlfs.c   | 45 +++
 8 files changed, 105 insertions(+), 117 deletions(-)

diff --git a/tools/testing/selftests/resctrl/cat_test.c 
b/tools/testing/selftests/resctrl/cat_test.c
index 20823725daca..0deb38ed971b 100644
--- a/tools/testing/selftests/resctrl/cat_test.c
+++ b/tools/testing/selftests/resctrl/cat_test.c
@@ -52,25 +52,28 @@ static int cat_setup(int num, ...)
return ret;
 }
 
-static void show_cache_info(unsigned long sum_llc_perf_miss, int no_of_bits,
-   unsigned long span)
+static int show_cache_info(unsigned long sum_llc_perf_miss, int no_of_bits,
+  unsigned long span)
 {
unsigned long allocated_cache_lines = span / 64;
unsigned long avg_llc_perf_miss = 0;
float diff_percent;
+   int ret;
 
avg_llc_perf_miss = sum_llc_perf_miss / (NUM_OF_RUNS - 1);
diff_percent = ((float)allocated_cache_lines - avg_llc_perf_miss) /
allocated_cache_lines * 100;
 
-   printf("%sok CAT: cache miss rate within %d%%\n",
-  !is_amd && abs((int)diff_percent) > MAX_DIFF_PERCENT ?
-  "not " : "", MAX_DIFF_PERCENT);
-   tests_run++;
-   printf("# Percent diff=%d\n", abs((int)diff_percent));
-   printf("# Number of bits: %d\n", no_of_bits);
-   printf("# Avg_llc_perf_miss: %lu\n", avg_llc_perf_miss);
-   printf("# Allocated cache lines: %lu\n", allocated_cache_lines);
+   ret = !is_amd && abs((int)diff_percent) > MAX_DIFF_PERCENT;
+   ksft_print_msg("cache miss rate %swithin %d%%\n",
+  ret ? "not " : "", MAX_DIFF_PERCENT);
+
+   ksft_print_msg("Percent diff=%d\n", abs((int)diff_percent));
+   ksft_print_msg("Number of bits: %d\n", no_of_bits);
+   ksft_print_msg("Avg_llc_perf_miss: %lu\n", avg_llc_perf_miss);
+   ksft_print_msg("Allocated cache lines: %lu\n", allocated_cache_lines);
+
+   return ret;
 }
 
 static int check_results(struct resctrl_val_param *param)
@@ -80,7 +83,7 @@ static int check_results(struct resctrl_val_param *param)
int runs = 0, no_of_bits = 0;
FILE *fp;
 
-   printf("# Checking for pass/fail\n");
+   ksft_print_msg("Checking for pass/fail\n");
fp = fopen(param->filename, "r");
if (!fp) {
perror("# Cannot open file");
@@ -108,9 +111,7 @@ static int check_results(struct resctrl_val_param *param)
fclose(fp);
no_of_bits = count_bits(param->mask);
 
-   show_cache_info(sum_llc_perf_miss, no_of_bits, param->span);
-
-   return 0;
+   return show_cache_info(sum_llc_perf_miss, no_of_bits, param->span);
 }
 
 void cat_test_cleanup(void)
@@ -146,15 +147,15 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
ret = get_cache_size(cpu_no, cache_type, &cache_size);
if (ret)
return ret;
-   printf("cache size :%lu\n", cache_size);
+   ksft_print_msg("cache size :%lu\n", cache_size);
 
/* Get max number of bits from default-cabm mask */
count_of_bits = count_bits(long_mask);
 
if (n < 1 || n > count_of_bits - 1) {
-   printf("Invalid input value for no_of_bits n!\n");
-   printf("Please Enter value in range 1 to %d\n",
-  count_of_bits - 1);
+   ksft_print_msg("Invalid input value for no_of_bits n!\n");
+   ksft_print_msg("Please Enter value in range 1 to %d\n",
+  count_of_bits - 1);
return -1;
}
 
diff --git a/tools/testing/selftests/resctrl/cmt_test.c 
b/tools/testing/selftests/resctrl/cmt_test.c
index ca82db37c1f7..e5af19335115 100644
--- a/tools/testing/selftests/resctrl/cmt_test.c
+++ b/tools/testing/selftests/resctrl/cmt_test.c
@@ -39,36 +39,33 @@ static int cmt_setup(int num, ...)
return 0;
 }
 
-static void show_cache_info(unsigned long sum_llc_occu_resc, int no_of_bits,
-   unsigned long span)
+static i

Re: [PATCH v6 08/12] fork: Clear PASID for new mm

2021-02-25 Thread Fenghua Yu
Hi, Jean,

On Wed, Feb 24, 2021 at 11:19:27AM +0100, Jean-Philippe Brucker wrote:
> Hi Fenghua,
> 
> [Trimmed the Cc list]
> 
> On Mon, Jul 13, 2020 at 04:48:03PM -0700, Fenghua Yu wrote:
> > When a new mm is created, its PASID should be cleared, i.e. the PASID is
> > initialized to its init state 0 on both ARM and X86.
> 
> I just noticed this patch was dropped in v7, and am wondering whether we
> could still upstream it. Does x86 need a child with a new address space
> (!CLONE_VM) to inherit the PASID of the parent?  That doesn't make much
> sense with regard to IOMMU structures - same PASID indexing multiple PGDs?

You are right: x86 should clear mm->pasid when a new mm is created.
This patch somehow got lost :(

> 
> Currently iommu_sva_alloc_pasid() assumes mm->pasid is always initialized
> to 0 and fails on forked tasks. I'm trying to figure out how to fix this.
> Could we clear the pasid on fork or does it break the x86 model?

x86 calls ioasid_alloc() instead of iommu_sva_alloc_pasid(). So
functionality is not a problem without this patch on x86. But I think
we do need to have this patch in the kernel because PASID is per addr
space and two addr spaces shouldn't have the same PASID.

Who will accept this patch?

Thanks.

-Fenghua


[tip: x86/urgent] x86/split_lock: Enable the split lock feature on another Alder Lake CPU

2021-02-01 Thread tip-bot2 for Fenghua Yu
The following commit has been merged into the x86/urgent branch of tip:

Commit-ID:     8acf417805a5f5c69e9ff66f14cab022c2755161
Gitweb:        https://git.kernel.org/tip/8acf417805a5f5c69e9ff66f14cab022c2755161
Author:        Fenghua Yu
AuthorDate:    Mon, 01 Feb 2021 19:00:07
Committer:     Borislav Petkov
CommitterDate: Mon, 01 Feb 2021 21:34:51 +01:00

x86/split_lock: Enable the split lock feature on another Alder Lake CPU

Add Alder Lake mobile processor to CPU list to enumerate and enable the
split lock feature.

Signed-off-by: Fenghua Yu 
Signed-off-by: Borislav Petkov 
Reviewed-by: Tony Luck 
Link: https://lkml.kernel.org/r/20210201190007.4031869-1-fenghua...@intel.com
---
 arch/x86/kernel/cpu/intel.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index 59a1e3c..816fdbe 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -1159,6 +1159,7 @@ static const struct x86_cpu_id split_lock_cpu_ids[] 
__initconst = {
X86_MATCH_INTEL_FAM6_MODEL(TIGERLAKE,   1),
X86_MATCH_INTEL_FAM6_MODEL(SAPPHIRERAPIDS_X,1),
X86_MATCH_INTEL_FAM6_MODEL(ALDERLAKE,   1),
+   X86_MATCH_INTEL_FAM6_MODEL(ALDERLAKE_L, 1),
{}
 };
 


[PATCH] x86/split_lock: Enable the split lock feature on another Alder Lake CPU

2021-02-01 Thread Fenghua Yu
Add Alder Lake mobile processor to CPU list to enumerate and enable the
split lock feature.

Signed-off-by: Fenghua Yu 
Reviewed-by: Tony Luck 
---
 arch/x86/kernel/cpu/intel.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index 59a1e3ce3f14..816fdbec795a 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -1159,6 +1159,7 @@ static const struct x86_cpu_id split_lock_cpu_ids[] 
__initconst = {
X86_MATCH_INTEL_FAM6_MODEL(TIGERLAKE,   1),
X86_MATCH_INTEL_FAM6_MODEL(SAPPHIRERAPIDS_X,1),
X86_MATCH_INTEL_FAM6_MODEL(ALDERLAKE,   1),
+   X86_MATCH_INTEL_FAM6_MODEL(ALDERLAKE_L, 1),
{}
 };
 
-- 
2.30.0



Re: [PATCH v4 0/4] x86/bus_lock: Enable bus lock detection

2021-01-26 Thread Fenghua Yu
Hi, Dear X86 Maintainers,

>On Mon, Jan 04, 2021 at 07:42:28PM +0000, Fenghua Yu wrote:
> 
> On Tue, Nov 24, 2020 at 08:52:41PM +, Fenghua Yu wrote:
> > A bus lock [1] is acquired through either split locked access to
> > writeback (WB) memory or any locked access to non-WB memory. This is
> > typically >1000 cycles slower than an atomic operation within
> > a cache line. It also disrupts performance on other cores.
 
Just a friendly reminder. Any comment on this series? There hasn't been
any comment since it was posted.

This series can be applied cleanly to v5.11-rc5 and pass all of our
tests. If you want me to repost this series, I can do that too.

Thanks!

-Fenghua


Re: [PATCH v4 00/17] Miscellaneous fixes for resctrl selftests

2021-01-25 Thread Fenghua Yu
On Mon, Jan 25, 2021 at 02:52:09PM -0700, Shuah Khan wrote:
> On 1/25/21 1:47 PM, Fenghua Yu wrote:
> > On Mon, Nov 30, 2020 at 08:19:53PM +, Fenghua Yu wrote:
> > > This patch set has several miscellaneous fixes to resctrl selftest tool
> > > that are easily visible to user. V1 had fixes to CAT test and CMT test
> > > but they were dropped in V2 because having them here made the patchset
> > Just a friendly reminder. Will you push this series to the upstream?
> > Maybe I miss something but I don't see this series in the linux-kselftest
> > tree yet.
> > 
> 
> Sorry I am a bit behind on reviews. I will pull these fixes in this
> week for 5.12-rc1 and will let you know if I would like changes.

Really appreciate your help, Shuah!

-Fenghua


Re: [PATCH v4 00/17] Miscellaneous fixes for resctrl selftests

2021-01-25 Thread Fenghua Yu
Hi, Shuah,

On Mon, Nov 30, 2020 at 08:19:53PM +, Fenghua Yu wrote:
> This patch set has several miscellaneous fixes to resctrl selftest tool
> that are easily visible to user. V1 had fixes to CAT test and CMT test
> but they were dropped in V2 because having them here made the patchset
> humongous. So, changes to CAT test and CMT test will be posted in another
> patchset.
> 
> Change Log:
> v4:
> - Address various comments from Shuah Khan:
>   1. Combine a few patches e.g. a couple of fixing typos patches into one
>  and a couple of unmounting patches into one etc.

Just a friendly reminder. Will you push this series upstream?
Maybe I missed something, but I don't see this series in the
linux-kselftest tree yet.

Thank you very much!

-Fenghua


Re: [PATCH V4] arch: kernel: cpu: x86/resctrl: Takes a letter away and append a colon to match below struct member

2021-01-13 Thread Fenghua Yu
On Wed, Jan 13, 2021 at 07:33:33AM +0530, Bhaskar Chowdhury wrote:
> s/kernlfs/kernfs/
> s/@mon_data_kn/@mon_data_kn:/

You may want to change the message to describe the problems, for example:

Fix typo "kernlfs" and add missing ":" to match with other comments.

> 
> Signed-off-by: Bhaskar Chowdhury 
> ---
> Changes from V3: Fix the subject line typo stuc to struct and mention cpu 
> architecture
> 

Please read Documentation/process/submitting-patches.rst.
It talks about the subject, description of problem, changelog, etc
for submitting a patch.

Thanks.

-Fenghua


[tip: x86/urgent] x86/resctrl: Use an IPI instead of task_work_add() to update PQR_ASSOC MSR

2021-01-08 Thread tip-bot2 for Fenghua Yu
The following commit has been merged into the x86/urgent branch of tip:

Commit-ID:     ae28d1aae48a1258bd09a6f707ebb4231d79a761
Gitweb:        https://git.kernel.org/tip/ae28d1aae48a1258bd09a6f707ebb4231d79a761
Author:        Fenghua Yu
AuthorDate:    Thu, 17 Dec 2020 14:31:18 -08:00
Committer:     Borislav Petkov
CommitterDate: Fri, 08 Jan 2021 09:03:36 +01:00

x86/resctrl: Use an IPI instead of task_work_add() to update PQR_ASSOC MSR

Currently, when moving a task to a resource group the PQR_ASSOC MSR is
updated with the new closid and rmid in an added task callback. If the
task is running, the work is run as soon as possible. If the task is not
running, the work is executed later in the kernel exit path when the
kernel returns to the task again.

Updating the PQR_ASSOC MSR as soon as possible on the CPU a moved task
is running on is the right thing to do. Queueing work for a task that is
not running is unnecessary (the PQR_ASSOC MSR is already updated when
the task is scheduled in) and wastes system resources because of the way
in which it is implemented: Work to update the PQR_ASSOC register is
queued every time the user writes a task id to the "tasks" file, even if
the task already belongs to the resource group.

This could result in multiple pending work items associated with a
single task even if they are all identical and even though only a single
update with most recent values is needed. Specifically, even if a task
is moved between different resource groups while it is sleeping then it
is only the last move that is relevant but yet a work item is queued
during each move.

This unnecessary queueing of work items could result in significant
system resource waste, especially on tasks sleeping for a long time.
For example, as demonstrated by Shakeel Butt in [1] writing the same
task id to the "tasks" file can quickly consume significant memory. The
same problem (wasted system resources) occurs when moving a task between
different resource groups.

As pointed out by Valentin Schneider in [2] there is an additional issue
with the way in which the queueing of work is done in that the task_struct
update is currently done after the work is queued, resulting in a race with
the register update possibly done before the data needed by the update is
available.

To solve these issues, update the PQR_ASSOC MSR in a synchronous way
right after the new closid and rmid are ready during the task movement,
only if the task is running. If a moved task is not running nothing
is done since the PQR_ASSOC MSR will be updated next time the task is
scheduled. This is the same way used to update the register when tasks
are moved as part of resource group removal.

[1] 
https://lore.kernel.org/lkml/calvzod7e9zzhwenzf7objzgksdbmvwtgej0npgs0lufu3sn...@mail.gmail.com/
[2] 
https://lore.kernel.org/lkml/20201123022433.17905-1-valentin.schnei...@arm.com

 [ bp: Massage commit message and drop the two update_task_closid_rmid()
   variants. ]

Fixes: e02737d5b826 ("x86/intel_rdt: Add tasks files")
Reported-by: Shakeel Butt 
Reported-by: Valentin Schneider 
Signed-off-by: Fenghua Yu 
Signed-off-by: Reinette Chatre 
Signed-off-by: Borislav Petkov 
Reviewed-by: Tony Luck 
Reviewed-by: James Morse 
Reviewed-by: Valentin Schneider 
Cc: sta...@vger.kernel.org
Link: 
https://lkml.kernel.org/r/17aa2fb38fc12ce7bb710106b3e7c7b45acb9e94.1608243147.git.reinette.cha...@intel.com
---
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 112 +---
 1 file changed, 43 insertions(+), 69 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c 
b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 29ffb95..1c6f8a6 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -525,89 +525,63 @@ static void rdtgroup_remove(struct rdtgroup *rdtgrp)
kfree(rdtgrp);
 }
 
-struct task_move_callback {
-   struct callback_head    work;
-   struct rdtgroup *rdtgrp;
-};
-
-static void move_myself(struct callback_head *head)
+static void _update_task_closid_rmid(void *task)
 {
-   struct task_move_callback *callback;
-   struct rdtgroup *rdtgrp;
-
-   callback = container_of(head, struct task_move_callback, work);
-   rdtgrp = callback->rdtgrp;
-
/*
-* If resource group was deleted before this task work callback
-* was invoked, then assign the task to root group and free the
-* resource group.
+* If the task is still current on this CPU, update PQR_ASSOC MSR.
+* Otherwise, the MSR is updated when the task is scheduled in.
 */
-   if (atomic_dec_and_test(&rdtgrp->waitcount) &&
-   (rdtgrp->flags & RDT_DELETED)) {
-   current->closid = 0;
-   current->rmid = 0;
-   rdtgroup_remove(rdtgrp);
-   }
-
-   if (unlikely(current->flags & PF_EXITING))
-   goto out;
-
-   preempt_disable();
-

[tip: x86/urgent] x86/resctrl: Don't move a task to the same resource group

2021-01-08 Thread tip-bot2 for Fenghua Yu
The following commit has been merged into the x86/urgent branch of tip:

Commit-ID:     a0195f314a25582b38993bf30db11c300f4f4611
Gitweb:        https://git.kernel.org/tip/a0195f314a25582b38993bf30db11c300f4f4611
Author:        Fenghua Yu
AuthorDate:    Thu, 17 Dec 2020 14:31:19 -08:00
Committer:     Borislav Petkov
CommitterDate: Fri, 08 Jan 2021 09:08:03 +01:00

x86/resctrl: Don't move a task to the same resource group

Shakeel Butt reported in [1] that a user can request a task to be moved
to a resource group even if the task is already in the group. The move
then just wastes time, and it can be costly because it may send an IPI
to a different CPU.

Add a sanity check to ensure that the move operation only happens when
the task is not already in the resource group.
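
The check described above can be sketched outside the kernel as follows. This
is only an illustration: struct group, struct task and task_already_in_group()
are hypothetical, simplified stand-ins and not the kernel's rdtgroup or
task_struct definitions.

```c
#include <stdbool.h>

/* Hypothetical stand-ins; only the fields used by the check are modelled. */
enum group_type { CTRL_GROUP, MON_GROUP };

struct group {
	enum group_type type;
	unsigned int closid;        /* meaningful for control groups */
	unsigned int rmid;
	const struct group *parent; /* meaningful for monitor groups */
};

struct task {
	unsigned int closid;
	unsigned int rmid;
};

/* Return true if moving @t into @g would change neither closid nor rmid. */
static bool task_already_in_group(const struct task *t, const struct group *g)
{
	if (g->type == CTRL_GROUP)
		return t->closid == g->closid && t->rmid == g->rmid;

	/* A monitor group takes its closid from the parent control group. */
	return t->rmid == g->rmid && g->parent &&
	       t->closid == g->parent->closid;
}
```

A caller would return early when the function reports true and only then fall
through to the comparatively expensive update path.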

[1] https://lore.kernel.org/lkml/calvzod7e9zzhwenzf7objzgksdbmvwtgej0npgs0lufu3sn...@mail.gmail.com/

Fixes: e02737d5b826 ("x86/intel_rdt: Add tasks files")
Reported-by: Shakeel Butt 
Signed-off-by: Fenghua Yu 
Signed-off-by: Reinette Chatre 
Signed-off-by: Borislav Petkov 
Reviewed-by: Tony Luck 
Cc: sta...@vger.kernel.org
Link: https://lkml.kernel.org/r/962ede65d8e95be793cb61102cca37f7bb018e66.1608243147.git.reinette.cha...@intel.com
---
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 1c6f8a6..460f3e0 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -546,6 +546,13 @@ static void update_task_closid_rmid(struct task_struct *t)
 static int __rdtgroup_move_task(struct task_struct *tsk,
struct rdtgroup *rdtgrp)
 {
+   /* If the task is already in rdtgrp, no need to move the task. */
+   if ((rdtgrp->type == RDTCTRL_GROUP && tsk->closid == rdtgrp->closid &&
+tsk->rmid == rdtgrp->mon.rmid) ||
+   (rdtgrp->type == RDTMON_GROUP && tsk->rmid == rdtgrp->mon.rmid &&
+tsk->closid == rdtgrp->mon.parent->closid))
+   return 0;
+
/*
 * Set the task's closid/rmid before the PQR_ASSOC MSR can be
 * updated by them.


Re: [PATCH v4 0/4] x86/bus_lock: Enable bus lock detection

2021-01-04 Thread Fenghua Yu
Hi, Dear X86 Maintainers,

On Tue, Nov 24, 2020 at 08:52:41PM +, Fenghua Yu wrote:
> A bus lock [1] is acquired through either split locked access to
> writeback (WB) memory or any locked access to non-WB memory. This is
> typically >1000 cycles slower than an atomic operation within
> a cache line. It also disrupts performance on other cores.

This is a friendly reminder. Any comment on this series?

Thanks.

-Fenghua


[PATCH v4 14/17] selftests/resctrl: Skip the test if requested resctrl feature is not supported

2020-11-30 Thread Fenghua Yu
There could be two reasons why a resctrl feature might not be enabled on
the platform
1. H/W might not support the feature
2. Even if the H/W supports it, the user might have disabled the feature
   through kernel command line arguments

Hence, any resctrl unit test (like cmt, cat, mbm and mba) before starting
the test will first check if the feature is enabled on the platform or not.
If the feature isn't enabled, then the test returns with an error status.
For example, if MBA isn't supported on a platform and if the user tries to
run MBA, the output will look like this

ok mounting resctrl to "/sys/fs/resctrl"
not ok MBA: schemata change

But, not supporting a feature isn't a test failure. So, instead of treating
it as an error, use the SKIP directive of the TAP protocol. With the
change, the output will look as below

ok MBA # SKIP Hardware does not support MBA or MBA is disabled
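
A minimal sketch of how a runner could choose between a normal result line and
a SKIP line. The helper name tap_line() and its parameters are assumptions
made for this illustration, and the pass/fail line is simplified compared to
the messages the test suite prints.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Format one TAP result line for test @name into @buf.
 * @supported: false if the feature is missing or disabled (emit SKIP).
 * @res:       test result, 0 on success (ignored when skipping).
 * Returns the value returned by snprintf().
 */
static int tap_line(char *buf, size_t len, const char *name,
		    bool supported, int res)
{
	if (!supported)
		return snprintf(buf, len,
				"ok %s # SKIP Hardware does not support %s or %s is disabled\n",
				name, name, name);

	return snprintf(buf, len, "%sok %s\n", res ? "not " : "", name);
}
```

Reporting "ok ... # SKIP" keeps a missing feature from being counted as a
failure by TAP consumers.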

Suggested-by: Reinette Chatre 
Signed-off-by: Fenghua Yu 
---
 tools/testing/selftests/resctrl/cat_test.c|  3 ---
 tools/testing/selftests/resctrl/mba_test.c|  3 ---
 tools/testing/selftests/resctrl/mbm_test.c|  3 ---
 .../testing/selftests/resctrl/resctrl_tests.c | 24 +++
 4 files changed, 24 insertions(+), 9 deletions(-)

diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
index 6e935a6bb3e6..9b38e6296d3d 100644
--- a/tools/testing/selftests/resctrl/cat_test.c
+++ b/tools/testing/selftests/resctrl/cat_test.c
@@ -128,9 +128,6 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
if (ret)
return ret;
 
-   if (!validate_resctrl_feature_request("cat"))
-   return -1;
-
/* Get default cbm mask for L3/L2 cache */
ret = get_cbm_mask(cache_type, cbm_mask);
if (ret)
diff --git a/tools/testing/selftests/resctrl/mba_test.c b/tools/testing/selftests/resctrl/mba_test.c
index b4c81d2ee53b..d10e030b1a55 100644
--- a/tools/testing/selftests/resctrl/mba_test.c
+++ b/tools/testing/selftests/resctrl/mba_test.c
@@ -156,9 +156,6 @@ int mba_schemata_change(int cpu_no, char *bw_report, char **benchmark_cmd)
 
remove(RESULT_FILE_NAME);
 
-   if (!validate_resctrl_feature_request("mba"))
-   return -1;
-
ret = resctrl_val(benchmark_cmd, &param);
if (ret)
return ret;
diff --git a/tools/testing/selftests/resctrl/mbm_test.c b/tools/testing/selftests/resctrl/mbm_test.c
index 672d3ddd6e85..614614ecd58b 100644
--- a/tools/testing/selftests/resctrl/mbm_test.c
+++ b/tools/testing/selftests/resctrl/mbm_test.c
@@ -129,9 +129,6 @@ int mbm_bw_change(int span, int cpu_no, char *bw_report, char **benchmark_cmd)
 
remove(RESULT_FILE_NAME);
 
-   if (!validate_resctrl_feature_request("mbm"))
-   return -1;
-
ret = resctrl_val(benchmark_cmd, &param);
if (ret)
return ret;
diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c
index 9b7017299ca2..63400a51cbd8 100644
--- a/tools/testing/selftests/resctrl/resctrl_tests.c
+++ b/tools/testing/selftests/resctrl/resctrl_tests.c
@@ -60,6 +60,12 @@ static void run_mbm_test(bool has_ben, char **benchmark_cmd, int span,
int res;
 
printf("# Starting MBM BW change ...\n");
+
+   if (!validate_resctrl_feature_request("mbm")) {
+   printf("ok MBM # SKIP Hardware does not support MBM or MBM is disabled\n");
+   return;
+   }
+
if (!has_ben)
sprintf(benchmark_cmd[5], "%s", "mba");
res = mbm_bw_change(span, cpu_no, bw_report, benchmark_cmd);
@@ -74,6 +80,12 @@ static void run_mba_test(bool has_ben, char **benchmark_cmd, int span,
int res;
 
printf("# Starting MBA Schemata change ...\n");
+
+   if (!validate_resctrl_feature_request("mba")) {
+   printf("ok MBA # SKIP Hardware does not support MBA or MBA is disabled\n");
+   return;
+   }
+
if (!has_ben)
sprintf(benchmark_cmd[1], "%d", span);
res = mba_schemata_change(cpu_no, bw_report, benchmark_cmd);
@@ -87,6 +99,12 @@ static void run_cmt_test(bool has_ben, char **benchmark_cmd, int cpu_no)
int res;
 
printf("# Starting CMT test ...\n");
+
+   if (!validate_resctrl_feature_request("cmt")) {
+   printf("ok CMT # SKIP Hardware does not support CMT or CMT is disabled\n");
+   return;
+   }
+
if (!has_ben)
sprintf(benchmark_cmd[5], "%s", "cmt");
res = cmt_resctrl_val(cpu_no, 5, benchmark_cmd);
@@ -100,6 +118,12 @@ static void run_cat_test(int cpu_no, int no_of_bits)
int res;
 
printf("# Starting CAT test ...\n");
+
+ 

[PATCH v4 13/17] selftests/resctrl: Modularize resctrl test suite main() function

2020-11-30 Thread Fenghua Yu
Resctrl test suite main() function does the following things
1. Parses command line arguments passed by user
2. Some setup checks
3. Logic that calls into each unit test
4. Print result and clean up after running each unit test

Introduce wrapper functions for steps 3 and 4 to modularize the main()
function. Adding these wrapper functions makes it easier to add any logic
to each individual test.

Please note that this is a preparatory patch for the next one and no
functional changes are intended.

Suggested-by: Reinette Chatre 
Signed-off-by: Fenghua Yu 
---
 .../testing/selftests/resctrl/resctrl_tests.c | 96 ---
 1 file changed, 61 insertions(+), 35 deletions(-)

diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c
index 244f1beb75da..9b7017299ca2 100644
--- a/tools/testing/selftests/resctrl/resctrl_tests.c
+++ b/tools/testing/selftests/resctrl/resctrl_tests.c
@@ -54,10 +54,62 @@ void tests_cleanup(void)
cat_test_cleanup();
 }
 
+static void run_mbm_test(bool has_ben, char **benchmark_cmd, int span,
+int cpu_no, char *bw_report)
+{
+   int res;
+
+   printf("# Starting MBM BW change ...\n");
+   if (!has_ben)
+   sprintf(benchmark_cmd[5], "%s", "mba");
+   res = mbm_bw_change(span, cpu_no, bw_report, benchmark_cmd);
+   printf("%sok MBM: bw change\n", res ? "not " : "");
+   mbm_test_cleanup();
+   tests_run++;
+}
+
+static void run_mba_test(bool has_ben, char **benchmark_cmd, int span,
+int cpu_no, char *bw_report)
+{
+   int res;
+
+   printf("# Starting MBA Schemata change ...\n");
+   if (!has_ben)
+   sprintf(benchmark_cmd[1], "%d", span);
+   res = mba_schemata_change(cpu_no, bw_report, benchmark_cmd);
+   printf("%sok MBA: schemata change\n", res ? "not " : "");
+   mba_test_cleanup();
+   tests_run++;
+}
+
+static void run_cmt_test(bool has_ben, char **benchmark_cmd, int cpu_no)
+{
+   int res;
+
+   printf("# Starting CMT test ...\n");
+   if (!has_ben)
+   sprintf(benchmark_cmd[5], "%s", "cmt");
+   res = cmt_resctrl_val(cpu_no, 5, benchmark_cmd);
+   printf("%sok CMT: test\n", res ? "not " : "");
+   cmt_test_cleanup();
+   tests_run++;
+}
+
+static void run_cat_test(int cpu_no, int no_of_bits)
+{
+   int res;
+
+   printf("# Starting CAT test ...\n");
+   res = cat_perf_miss_val(cpu_no, no_of_bits, "L3");
+   printf("%sok CAT: test\n", res ? "not " : "");
+   tests_run++;
+   cat_test_cleanup();
+}
+
 int main(int argc, char **argv)
 {
bool has_ben = false, mbm_test = true, mba_test = true, cmt_test = true;
-   int res, c, cpu_no = 1, span = 250, argc_new = argc, i, no_of_bits = 0;
+   int c, cpu_no = 1, span = 250, argc_new = argc, i, no_of_bits = 0;
char *benchmark_cmd[BENCHMARK_ARGS], bw_report[64], bm_type[64];
char benchmark_cmd_area[BENCHMARK_ARGS][BENCHMARK_ARG_SIZE];
int ben_ind, ben_count;
@@ -168,43 +220,17 @@ int main(int argc, char **argv)
 
filter_dmesg();
 
-   if (!is_amd && mbm_test) {
-   printf("# Starting MBM BW change ...\n");
-   if (!has_ben)
-   sprintf(benchmark_cmd[5], "%s", MBA_STR);
-   res = mbm_bw_change(span, cpu_no, bw_report, benchmark_cmd);
-   printf("%sok MBM: bw change\n", res ? "not " : "");
-   mbm_test_cleanup();
-   tests_run++;
-   }
+   if (!is_amd && mbm_test)
+   run_mbm_test(has_ben, benchmark_cmd, span, cpu_no, bw_report);
 
-   if (!is_amd && mba_test) {
-   printf("# Starting MBA Schemata change ...\n");
-   if (!has_ben)
-   sprintf(benchmark_cmd[1], "%d", span);
-   res = mba_schemata_change(cpu_no, bw_report, benchmark_cmd);
-   printf("%sok MBA: schemata change\n", res ? "not " : "");
-   mba_test_cleanup();
-   tests_run++;
-   }
+   if (!is_amd && mba_test)
+   run_mba_test(has_ben, benchmark_cmd, span, cpu_no, bw_report);
 
-   if (cmt_test) {
-   printf("# Starting CMT test ...\n");
-   if (!has_ben)
-   sprintf(benchmark_cmd[5], "%s", "cmt");
-   res = cmt_resctrl_val(cpu_no, 5, benchmark_cmd);
-   printf("%sok CMT: test\n", res ? "not " : "");
-   cmt_test_cleanup();
-  

[PATCH v4 15/17] selftests/resctrl: Fix unmount resctrl FS

2020-11-30 Thread Fenghua Yu
umount_resctrlfs() directly attempts to unmount resctrl file system without
checking if resctrl FS is already mounted or not. It returns 0 on success
and on failure it prints an error message and returns an error status.
Calling umount_resctrlfs() when resctrl FS isn't mounted will return an
error status.

There could be situations wherein the caller does not know whether resctrl
FS is mounted but still wants to unmount it if it is (for example, during
teardown).

To support above use cases, change umount_resctrlfs() such that it now
first checks if resctrl FS is already mounted or not and unmounts resctrl
FS only if it's already mounted.

In some cases the test suite also fails to unmount resctrl FS upon exit,
for example when running only the mba test on a Broadwell (BDW) machine
(MBA isn't supported on BDW CPUs).

This happens because validate_resctrl_feature_request() would mount resctrl
FS to check if mba is enabled on the platform or not and finds that the H/W
doesn't support mba and hence will return false to run_mba_test(). This in
turn makes the main() function return without unmounting resctrl FS.
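
The "is it mounted?" test that gates the unmount can be sketched as below. This
is not the test suite's find_resctrl_mount(); fs_is_mounted() is a
hypothetical helper that scans a file in /proc/mounts format, and the file
path is a parameter so the sketch can be tried on any sample file.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Return true if @mounts_file (in /proc/mounts format) lists a file system
 * of type @fstype. Each line reads: device mountpoint fstype options ...
 */
static bool fs_is_mounted(const char *mounts_file, const char *fstype)
{
	char line[512], dev[128], dir[256], type[64];
	bool found = false;
	FILE *fp = fopen(mounts_file, "r");

	if (!fp)
		return false;

	while (fgets(line, sizeof(line), fp)) {
		/* The third whitespace-separated field is the fs type. */
		if (sscanf(line, "%127s %255s %63s", dev, dir, type) == 3 &&
		    !strcmp(type, fstype)) {
			found = true;
			break;
		}
	}

	fclose(fp);
	return found;
}
```

A teardown path would call umount() only when
fs_is_mounted("/proc/mounts", "resctrl") is true and report success otherwise,
so it is safe to call whether or not a test mounted the file system.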

Signed-off-by: Fenghua Yu 
---
 tools/testing/selftests/resctrl/resctrl_tests.c | 1 +
 tools/testing/selftests/resctrl/resctrlfs.c | 3 +++
 2 files changed, 4 insertions(+)

diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c
index 63400a51cbd8..7a63d5fcf4e0 100644
--- a/tools/testing/selftests/resctrl/resctrl_tests.c
+++ b/tools/testing/selftests/resctrl/resctrl_tests.c
@@ -257,6 +257,7 @@ int main(int argc, char **argv)
run_cat_test(cpu_no, no_of_bits);
 
 out:
+   umount_resctrlfs();
printf("1..%d\n", tests_run);
 
return 0;
diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c
index 3f43bcf0b8d5..868f6f186e98 100644
--- a/tools/testing/selftests/resctrl/resctrlfs.c
+++ b/tools/testing/selftests/resctrl/resctrlfs.c
@@ -90,6 +90,9 @@ int remount_resctrlfs(bool mum_resctrlfs)
 
 int umount_resctrlfs(void)
 {
+   if (find_resctrl_mount(NULL))
+   return 0;
+
if (umount(RESCTRL_PATH)) {
perror("# Unable to umount resctrl");
 
-- 
2.29.2



[PATCH v4 12/17] selftests/resctrl: Don't hard code value of "no_of_bits" variable

2020-11-30 Thread Fenghua Yu
Cache related tests (like CAT and CMT) depend on a variable called
no_of_bits to run. no_of_bits defines the number of contiguous bits
that should be set in the CBM mask and a user can pass a value for
no_of_bits using -n command line argument. If a user hasn't passed any
value, it defaults to 5 (randomly chosen value).

Hard coding no_of_bits to 5 will make the cache tests fail to run on
systems that support maximum cbm mask that is less than or equal to 5 bits.
Hence, don't hard code no_of_bits value.

If a user passes a value for "no_of_bits" using -n option, use it.
Otherwise, no_of_bits is equal to half of the maximum number of bits in
the cbm mask.

Please note that CMT test is still hard coded to 5 bits. It will change in
subsequent patches that change CMT test.
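
The defaulting rule can be sketched as follows. pick_no_of_bits() is a
hypothetical helper written for this illustration, and unlike the patch it
also keeps the lower bound check so that a one-bit mask is rejected.

```c
/* Count the bits set in a capacity bitmask. */
static int count_bits(unsigned long mask)
{
	int count = 0;

	while (mask) {
		count += (int)(mask & 1UL);
		mask >>= 1;
	}
	return count;
}

/*
 * Pick the number of bits to test. @n == 0 means "not given by the user",
 * in which case half of the bits in @cbm_mask are used.
 * Returns the chosen value, or -1 if it is outside 1..count_of_bits - 1.
 */
static int pick_no_of_bits(unsigned long cbm_mask, int n)
{
	int count_of_bits = count_bits(cbm_mask);

	if (!n)
		n = count_of_bits / 2;

	if (n < 1 || n > count_of_bits - 1)
		return -1;

	return n;
}
```

With an 11-bit mask and no -n option this picks 5 bits; with a 4-bit mask it
picks 2, which a hard-coded value of 5 could not satisfy.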

Signed-off-by: Fenghua Yu 
---
 tools/testing/selftests/resctrl/cat_test.c  | 5 -
 tools/testing/selftests/resctrl/resctrl_tests.c | 8 ++--
 2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
index 6d9a41f3939a..6e935a6bb3e6 100644
--- a/tools/testing/selftests/resctrl/cat_test.c
+++ b/tools/testing/selftests/resctrl/cat_test.c
@@ -147,7 +147,10 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
/* Get max number of bits from default-cabm mask */
count_of_bits = count_bits(long_mask);
 
-   if (n < 1 || n > count_of_bits - 1) {
+   if (!n)
+   n = count_of_bits / 2;
+
+   if (n > count_of_bits - 1) {
printf("Invalid input value for no_of_bits n!\n");
printf("Please Enter value in range 1 to %d\n",
   count_of_bits - 1);
diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c
index ef09e0ef2366..244f1beb75da 100644
--- a/tools/testing/selftests/resctrl/resctrl_tests.c
+++ b/tools/testing/selftests/resctrl/resctrl_tests.c
@@ -57,7 +57,7 @@ void tests_cleanup(void)
 int main(int argc, char **argv)
 {
bool has_ben = false, mbm_test = true, mba_test = true, cmt_test = true;
-   int res, c, cpu_no = 1, span = 250, argc_new = argc, i, no_of_bits = 5;
+   int res, c, cpu_no = 1, span = 250, argc_new = argc, i, no_of_bits = 0;
char *benchmark_cmd[BENCHMARK_ARGS], bw_report[64], bm_type[64];
char benchmark_cmd_area[BENCHMARK_ARGS][BENCHMARK_ARG_SIZE];
int ben_ind, ben_count;
@@ -106,6 +106,10 @@ int main(int argc, char **argv)
break;
case 'n':
no_of_bits = atoi(optarg);
+   if (no_of_bits <= 0) {
+   printf("Bail out! invalid argument for no_of_bits\n");
+   return -1;
+   }
break;
case 'h':
cmd_help();
@@ -188,7 +192,7 @@ int main(int argc, char **argv)
printf("# Starting CMT test ...\n");
if (!has_ben)
sprintf(benchmark_cmd[5], "%s", "cmt");
-   res = cmt_resctrl_val(cpu_no, no_of_bits, benchmark_cmd);
+   res = cmt_resctrl_val(cpu_no, 5, benchmark_cmd);
printf("%sok CMT: test\n", res ? "not " : "");
cmt_test_cleanup();
tests_run++;
-- 
2.29.2



[PATCH v4 11/17] selftests/resctrl: Enable gcc checks to detect buffer overflows

2020-11-30 Thread Fenghua Yu
David reported a buffer overflow error in the check_results() function of
the cmt unit test and he suggested enabling _FORTIFY_SOURCE gcc compiler
option to automatically detect any such errors.

The Feature Test Macros man page describes _FORTIFY_SOURCE as below

"Defining this macro causes some lightweight checks to be performed to
detect some buffer overflow errors when employing various string and memory
manipulation functions (for example, memcpy, memset, stpcpy, strcpy,
strncpy, strcat, strncat, sprintf, snprintf, vsprintf, vsnprintf, gets, and
wide character variants thereof). For some functions, argument consistency
is checked; for example, a check is made that open has been supplied with a
mode argument when the specified flags include O_CREAT. Not all problems
are detected, just some common cases.

If _FORTIFY_SOURCE is set to 1, with compiler optimization level 1 (gcc
-O1) and above, checks that shouldn't change the behavior of conforming
programs are performed.

With _FORTIFY_SOURCE set to 2, some more checking is added, but some
conforming programs might fail.

Some of the checks can be performed at compile time (via macros logic
implemented in header files), and result in compiler warnings; other checks
take place at run time, and result in a run-time error if the check fails.

Use of this macro requires compiler support, available with gcc since
version 4.0."

Fix the buffer overflow error in the check_results() function of the cmt
unit test and enable _FORTIFY_SOURCE gcc check to catch any future buffer
overflow errors.
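
The pattern behind the fix is to derive the read bound from the buffer itself
instead of repeating a literal size. A minimal sketch, where count_fields()
and its 64-byte buffer are assumptions for illustration rather than the code
in cmt_test.c:

```c
#include <stdio.h>
#include <string.h>

/*
 * Count the ':' or tab separated fields on the next line of @fp.
 * Returns -1 if no line could be read.
 */
static int count_fields(FILE *fp)
{
	char temp[64];
	char *token;
	int fields = 0;

	/* sizeof(temp) keeps the bound tied to the real size of the array. */
	if (!fgets(temp, sizeof(temp), fp))
		return -1;

	for (token = strtok(temp, ":\t"); token; token = strtok(NULL, ":\t"))
		fields++;

	return fields;
}
```

If the array is later resized, the fgets() bound follows automatically. Note
that sizeof only yields the array size when temp is a true array in scope, not
a pointer parameter.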

Reported-by: David Binderman 
Suggested-by: David Binderman 
Signed-off-by: Fenghua Yu 
---
 tools/testing/selftests/resctrl/Makefile   | 2 +-
 tools/testing/selftests/resctrl/cmt_test.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/resctrl/Makefile b/tools/testing/selftests/resctrl/Makefile
index d585cc1948cc..6bcee2ec91a9 100644
--- a/tools/testing/selftests/resctrl/Makefile
+++ b/tools/testing/selftests/resctrl/Makefile
@@ -1,5 +1,5 @@
 CC = $(CROSS_COMPILE)gcc
-CFLAGS = -g -Wall
+CFLAGS = -g -Wall -O2 -D_FORTIFY_SOURCE=2
 SRCS=$(wildcard *.c)
 OBJS=$(SRCS:.c=.o)
 
diff --git a/tools/testing/selftests/resctrl/cmt_test.c b/tools/testing/selftests/resctrl/cmt_test.c
index 188b73b5a2cc..ac1a33d9ce12 100644
--- a/tools/testing/selftests/resctrl/cmt_test.c
+++ b/tools/testing/selftests/resctrl/cmt_test.c
@@ -81,7 +81,7 @@ static int check_results(struct resctrl_val_param *param, int no_of_bits)
return errno;
}
 
-   while (fgets(temp, 1024, fp)) {
+   while (fgets(temp, sizeof(temp), fp)) {
char *token = strtok(temp, ":\t");
int fields = 0;
 
-- 
2.29.2



[PATCH v4 10/17] selftests/resctrl: Fix MBA/MBM results reporting format

2020-11-30 Thread Fenghua Yu
MBM unit test starts fill_buf (default built-in benchmark) in a new con_mon
group (c1, m1) and records resctrl reported mbm values and iMC (Integrated
Memory Controller) values every second. It does this for five seconds
(randomly chosen value) in total. It then calculates average of resctrl_mbm
values and imc_mbm values and if the difference is greater than 300 MB/sec
(randomly chosen value), the test treats it as a failure. MBA unit test is
similar to MBM but after every run it changes schemata.

Checking for a difference of 300 MB/sec doesn't look very meaningful when
the mbm values are changing over a wide range. For example, below are the
values running MBA test on SKL with different allocations

1. With 10% as schemata both iMC and resctrl mbm_values are around 2000
   MB/sec
2. With 100% as schemata both iMC and resctrl mbm_values are around 1
   MB/sec

A 300 MB/sec difference between resctrl_mbm and imc_mbm values is
acceptable at 100% schemata but it isn't acceptable at 10% schemata because
that's a huge difference.

So, fix this by checking for percentage difference instead of absolute
difference, i.e. check if the difference between the resctrl_mbm value and
the imc_mbm value is within 5% (randomly chosen value) of the imc_mbm value.
If the difference is greater than 5% of the imc_mbm value, treat it as a
failure.
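
A small sketch of the relative check. The patch computes the ratio in float;
this illustration uses integer arithmetic instead, and diff_percent() and
bw_within_limit() are hypothetical helper names.

```c
#include <stdbool.h>

#define MAX_DIFF_PERCENT 5

/*
 * Percentage by which @bw_resc deviates from @bw_imc, truncated towards
 * zero. Returns -1 if @bw_imc is 0, since the ratio is then undefined.
 */
static int diff_percent(unsigned long bw_imc, unsigned long bw_resc)
{
	unsigned long diff;

	if (!bw_imc)
		return -1;

	diff = bw_resc > bw_imc ? bw_resc - bw_imc : bw_imc - bw_resc;
	return (int)(diff * 100 / bw_imc);
}

/* True if the resctrl value is within MAX_DIFF_PERCENT of the iMC value. */
static bool bw_within_limit(unsigned long bw_imc, unsigned long bw_resc)
{
	int per = diff_percent(bw_imc, bw_resc);

	return per >= 0 && per <= MAX_DIFF_PERCENT;
}
```

At the roughly 2000 MB/sec seen with a 10% schemata, a 300 MB/sec gap works
out to 15% and is rejected, whereas the old absolute limit accepted it.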

Signed-off-by: Fenghua Yu 
---
 tools/testing/selftests/resctrl/mba_test.c | 20 +++-
 tools/testing/selftests/resctrl/mbm_test.c | 13 +++--
 2 files changed, 18 insertions(+), 15 deletions(-)

diff --git a/tools/testing/selftests/resctrl/mba_test.c b/tools/testing/selftests/resctrl/mba_test.c
index 6449fbd96096..b4c81d2ee53b 100644
--- a/tools/testing/selftests/resctrl/mba_test.c
+++ b/tools/testing/selftests/resctrl/mba_test.c
@@ -12,7 +12,7 @@
 
 #define RESULT_FILE_NAME   "result_mba"
 #define NUM_OF_RUNS        5
-#define MAX_DIFF           300
+#define MAX_DIFF_PERCENT   5
 #define ALLOCATION_MAX     100
 #define ALLOCATION_MIN     10
 #define ALLOCATION_STEP    10
@@ -62,7 +62,8 @@ static void show_mba_info(unsigned long *bw_imc, unsigned long *bw_resc)
 allocation++) {
unsigned long avg_bw_imc, avg_bw_resc;
unsigned long sum_bw_imc = 0, sum_bw_resc = 0;
-   unsigned long avg_diff;
+   int avg_diff_per;
+   float avg_diff;
 
/*
 * The first run is discarded due to inaccurate value from
@@ -76,17 +77,18 @@ static void show_mba_info(unsigned long *bw_imc, unsigned long *bw_resc)
 
avg_bw_imc = sum_bw_imc / (NUM_OF_RUNS - 1);
avg_bw_resc = sum_bw_resc / (NUM_OF_RUNS - 1);
-   avg_diff = labs((long)(avg_bw_resc - avg_bw_imc));
+   avg_diff = (float)labs(avg_bw_resc - avg_bw_imc) / avg_bw_imc;
+   avg_diff_per = (int)(avg_diff * 100);
 
-   printf("%sok MBA schemata percentage %u smaller than %d %%\n",
-  avg_diff > MAX_DIFF ? "not " : "",
-  ALLOCATION_MAX - ALLOCATION_STEP * allocation,
-  MAX_DIFF);
+   printf("%sok MBA: diff within %d%% for schemata %u\n",
+  avg_diff_per > MAX_DIFF_PERCENT ? "not " : "",
+  MAX_DIFF_PERCENT,
+  ALLOCATION_MAX - ALLOCATION_STEP * allocation);
tests_run++;
-   printf("# avg_diff: %lu\n", avg_diff);
+   printf("# avg_diff_per: %d%%\n", avg_diff_per);
printf("# avg_bw_imc: %lu\n", avg_bw_imc);
printf("# avg_bw_resc: %lu\n", avg_bw_resc);
-   if (avg_diff > MAX_DIFF)
+   if (avg_diff_per > MAX_DIFF_PERCENT)
failed = true;
}
 
diff --git a/tools/testing/selftests/resctrl/mbm_test.c b/tools/testing/selftests/resctrl/mbm_test.c
index ec6cfe01c9c2..672d3ddd6e85 100644
--- a/tools/testing/selftests/resctrl/mbm_test.c
+++ b/tools/testing/selftests/resctrl/mbm_test.c
@@ -11,7 +11,7 @@
 #include "resctrl.h"
 
 #define RESULT_FILE_NAME   "result_mbm"
-#define MAX_DIFF           300
+#define MAX_DIFF_PERCENT   5
 #define NUM_OF_RUNS        5
 
 static void
@@ -19,8 +19,8 @@ show_bw_info(unsigned long *bw_imc, unsigned long *bw_resc, int span)
 {
unsigned long avg_bw_imc = 0, avg_bw_resc = 0;
unsigned long sum_bw_imc = 0, sum_bw_resc = 0;
-   long avg_diff = 0;
-   int runs;
+   int runs, avg_diff_per;
+   float avg_diff = 0;
 
/*
 * Discard the first value which is inaccurate due to monitoring setup
@@ -33,12 +33,13 @@ show_bw_info(unsigned long *bw_imc, unsigned long *bw_resc, int span)
 
avg_bw_imc = sum_bw_imc / 4;
avg_bw_resc = sum_bw_resc

[PATCH v4 07/17] selftests/resctrl: Use resctrl/info for feature detection

2020-11-30 Thread Fenghua Yu
Resctrl test suite before running any unit test (like cmt, cat, mbm and
mba) should first check if the feature is enabled (by kernel and not just
supported by H/W) on the platform or not.
validate_resctrl_feature_request() is supposed to do that. This function
intends to grep for relevant flags in /proc/cpuinfo but there are several
issues here

1. validate_resctrl_feature_request() calls fgrep() to get flags from
   /proc/cpuinfo. But, fgrep() can only return a string with maximum of 255
   characters and hence the complete cpu flags are never returned.
2. The substring search logic is also busted. If strstr() finds requested
   resctrl feature in the cpu flags, it returns pointer to the first
   occurrence. But, the logic negates the return value of strstr() and
   hence validate_resctrl_feature_request() returns false if the feature is
   present in the cpu flags and returns true if the feature is not present.
3. validate_resctrl_feature_request() checks if a resctrl feature is
   reported in /proc/cpuinfo flags or not. Having a cpu flag means that the
   H/W supports the feature, but it doesn't mean that the kernel enabled
   it. A user could selectively enable only a subset of resctrl features
   using kernel command line arguments. Hence, /proc/cpuinfo isn't a
   reliable source to check if a feature is enabled or not.
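
The inverted substring test from issue 2 is easy to see in isolation. A
minimal sketch with a hypothetical helper, flags_have_feature(), showing the
non-negated form:

```c
#include <stdbool.h>
#include <string.h>

/* Return true if @feature appears after the ':' of a cpuinfo "flags" line. */
static bool flags_have_feature(const char *flags_line, const char *feature)
{
	const char *s = strchr(flags_line, ':');

	/*
	 * strstr() returns a non-NULL pointer on a match, so the result
	 * must not be negated (the buggy form was: s && !strstr(...)).
	 */
	return s && strstr(s, feature) != NULL;
}
```

Even in this corrected form a plain substring match can report false
positives (searching for "cat" also matches "cat_l3"), and it still only says
what the hardware supports, which is issue 3.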

The third issue is the major one, and fixing it requires changing the way
validate_resctrl_feature_request() works. Since /proc/cpuinfo isn't the
right place to check if a resctrl feature is enabled or not, a more
appropriate place is the /sys/fs/resctrl/info directory. Change
validate_resctrl_feature_request() such that:

1. For cat, check if /sys/fs/resctrl/info/L3 directory is present or not
2. For mba, check if /sys/fs/resctrl/info/MB directory is present or not
3. For cmt, check if /sys/fs/resctrl/info/L3_MON directory is present and
   check if /sys/fs/resctrl/info/L3_MON/mon_features has llc_occupancy
4. For mbm, check if /sys/fs/resctrl/info/L3_MON directory is present and
   check if /sys/fs/resctrl/info/L3_MON/mon_features has
   mbm_<total/local>_bytes

Please note that only L3_CAT, L3_CMT, MBA and MBM are supported. CDP and L2
variants can be added later.
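
The directory based detection can be sketched as below. This is not the
patch's implementation: feature_enabled(), dir_exists() and file_has() are
hypothetical helpers, the info directory is passed in so the sketch can be
tried against a scratch directory, and the mbm check assumes the event names
mbm_total_bytes and mbm_local_bytes that resctrl lists in mon_features.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Return true if @path exists and is a directory. */
static bool dir_exists(const char *path)
{
	struct stat st;

	return stat(path, &st) == 0 && S_ISDIR(st.st_mode);
}

/* Return true if any line of the file at @path contains @needle. */
static bool file_has(const char *path, const char *needle)
{
	char line[128];
	bool found = false;
	FILE *fp = fopen(path, "r");

	if (!fp)
		return false;
	while (!found && fgets(line, sizeof(line), fp))
		found = strstr(line, needle) != NULL;
	fclose(fp);
	return found;
}

/*
 * @info:    resctrl info directory, e.g. "/sys/fs/resctrl/info"
 * @feature: one of "cat", "mba", "cmt" or "mbm"
 */
static bool feature_enabled(const char *info, const char *feature)
{
	char path[512];

	if (!strcmp(feature, "cat")) {
		snprintf(path, sizeof(path), "%s/L3", info);
		return dir_exists(path);
	}
	if (!strcmp(feature, "mba")) {
		snprintf(path, sizeof(path), "%s/MB", info);
		return dir_exists(path);
	}

	/* Both monitoring features need the L3_MON directory. */
	snprintf(path, sizeof(path), "%s/L3_MON", info);
	if (!dir_exists(path))
		return false;

	snprintf(path, sizeof(path), "%s/L3_MON/mon_features", info);
	if (!strcmp(feature, "cmt"))
		return file_has(path, "llc_occupancy");
	if (!strcmp(feature, "mbm"))
		return file_has(path, "mbm_total_bytes") ||
		       file_has(path, "mbm_local_bytes");

	return false;
}
```

The sketch assumes resctrl is already mounted; the patch handles that by
remounting before it looks at the info directory.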

Reported-by: Reinette Chatre 
Signed-off-by: Fenghua Yu 
---
 tools/testing/selftests/resctrl/resctrl.h   |  6 ++-
 tools/testing/selftests/resctrl/resctrlfs.c | 51 -
 2 files changed, 45 insertions(+), 12 deletions(-)

diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h
index e99e62fddc61..5ff71870e61c 100644
--- a/tools/testing/selftests/resctrl/resctrl.h
+++ b/tools/testing/selftests/resctrl/resctrl.h
@@ -28,6 +28,10 @@
 #define RESCTRL_PATH           "/sys/fs/resctrl"
 #define PHYS_ID_PATH           "/sys/devices/system/cpu/cpu"
 #define CBM_MASK_PATH          "/sys/fs/resctrl/info"
+#define L3_PATH                "/sys/fs/resctrl/info/L3"
+#define MB_PATH                "/sys/fs/resctrl/info/MB"
+#define L3_MON_PATH            "/sys/fs/resctrl/info/L3_MON"
+#define L3_MON_FEATURES_PATH   "/sys/fs/resctrl/info/L3_MON/mon_features"
 
 #define PARENT_EXIT(err_msg)   \
do {\
@@ -99,7 +103,7 @@ int remount_resctrlfs(bool mum_resctrlfs);
 int get_resource_id(int cpu_no, int *resource_id);
 int umount_resctrlfs(void);
 int validate_bw_report_request(char *bw_report);
-bool validate_resctrl_feature_request(char *resctrl_val);
+bool validate_resctrl_feature_request(const char *resctrl_val);
 char *fgrep(FILE *inf, const char *str);
 int taskset_benchmark(pid_t bm_pid, int cpu_no);
 void run_benchmark(int signum, siginfo_t *info, void *ucontext);
diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c
index c676557b376d..d2cae4927b62 100644
--- a/tools/testing/selftests/resctrl/resctrlfs.c
+++ b/tools/testing/selftests/resctrl/resctrlfs.c
@@ -616,26 +616,55 @@ char *fgrep(FILE *inf, const char *str)
  * validate_resctrl_feature_request - Check if requested feature is valid.
  * @resctrl_val:   Requested feature
  *
- * Return: 0 on success, non-zero on failure
+ * Return: True if the feature is supported, else false
  */
-bool validate_resctrl_feature_request(char *resctrl_val)
+bool validate_resctrl_feature_request(const char *resctrl_val)
 {
-   FILE *inf = fopen("/proc/cpuinfo", "r");
+   struct stat statbuf;
bool found = false;
char *res;
+   FILE *inf;
 
-   if (!inf)
+   if (!resctrl_val)
return false;
 
-   res = fgrep(inf, "flags");
-
-   if (res) {
-   char *s = strchr(res, ':');
+   if (remount_resctrlfs(false))
+   return false;
 
-   found = s && !strstr(s, resctrl_val);
-   free(res);
+   if (is_cat(resctrl_

[PATCH v4 06/17] selftests/resctrl: Check for resctrl mount point only if resctrl FS is supported

2020-11-30 Thread Fenghua Yu
check_resctrlfs_support() does the following
1. Checks if the platform supports resctrl file system or not by looking
   for resctrl in /proc/filesystems
2. Calls opendir() on default resctrl file system path
   (i.e. /sys/fs/resctrl)
3. Checks if resctrl file system is mounted or not by looking at
   /proc/mounts

Steps 2 and 3 will fail if the platform does not support resctrl file
system. So, there is no need to check for them if step 1 fails.

Fix this by returning immediately if the platform does not support
resctrl file system.

Signed-off-by: Fenghua Yu 
---
 tools/testing/selftests/resctrl/resctrlfs.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c
index 2c574d143ff0..c676557b376d 100644
--- a/tools/testing/selftests/resctrl/resctrlfs.c
+++ b/tools/testing/selftests/resctrl/resctrlfs.c
@@ -579,6 +579,9 @@ bool check_resctrlfs_support(void)
printf("%sok kernel supports resctrl filesystem\n", ret ? "" : "not ");
tests_run++;
 
+   if (!ret)
+   return ret;
+
dp = opendir(RESCTRL_PATH);
printf("%sok resctrl mountpoint \"%s\" exists\n",
   dp ? "" : "not ", RESCTRL_PATH);
-- 
2.29.2



[PATCH v4 05/17] selftests/resctrl: Add a few dependencies

2020-11-30 Thread Fenghua Yu
Add a couple of sanity checks and the config file for test dependencies.

Running any resctrl unit test involves writing to resctrl file system
and only a root user has permission to write to resctrl FS. Resctrl
test suite before running any test checks for the privilege of the
user and if it's not a root user, the test suite prints a warning
and continues attempting to run tests.

Attempting to run any test without root privileges will fail as below

TAP version 13
ok kernel supports resctrl filesystem
.
not ok mounting resctrl to "/sys/fs/resctrl"
not ok MBA: schemata change

Hence, don't attempt to run any test if the user is not a root user and
change the warning message to a bail out message to comply with TAP 13
standards.

Regarding the second check, check_resctrlfs_support() checks if the
platform supports resctrl file system or not by looking for resctrl
in /proc/filesystems and returns a boolean value. The main function
of resctrl test suite calls check_resctrlfs_support() but forgets to
check its return value. This means that resctrl test suite will
attempt to run resctrl tests (like CMT, CAT, MBM and MBA) even if the
platform doesn't support resctrl file system.

Fix this by checking for the return value of check_resctrlfs_support() in
the main function. If resctrl file system isn't supported on the platform
then exit the test suite gracefully without attempting to run any of
resctrl unit tests.

Signed-off-by: Fenghua Yu 
---
 tools/testing/selftests/resctrl/config  |  2 ++
 tools/testing/selftests/resctrl/resctrl_tests.c | 13 ++---
 2 files changed, 12 insertions(+), 3 deletions(-)
 create mode 100644 tools/testing/selftests/resctrl/config

diff --git a/tools/testing/selftests/resctrl/config b/tools/testing/selftests/resctrl/config
new file mode 100644
index ..8d9f2deb56ed
--- /dev/null
+++ b/tools/testing/selftests/resctrl/config
@@ -0,0 +1,2 @@
+CONFIG_X86_CPU_RESCTRL=y
+CONFIG_PROC_CPU_RESCTRL=y
diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c
index d92b0b32a349..0e036dbf5d17 100644
--- a/tools/testing/selftests/resctrl/resctrl_tests.c
+++ b/tools/testing/selftests/resctrl/resctrl_tests.c
@@ -125,8 +125,10 @@ int main(int argc, char **argv)
 * 1. We write to resctrl FS
 * 2. We execute perf commands
 */
-   if (geteuid() != 0)
-   printf("# WARNING: not running as root, tests may fail.\n");
+   if (geteuid() != 0) {
+   printf("Bail out! not running as root, abort testing\n");
+   goto out;
+   }
 
/* Detect AMD vendor */
detect_amd();
@@ -155,7 +157,11 @@ int main(int argc, char **argv)
sprintf(bw_report, "reads");
sprintf(bm_type, "fill_buf");
 
-   check_resctrlfs_support();
+   if (!check_resctrlfs_support()) {
+   printf("Bail out! resctrl FS does not exist\n");
+   goto out;
+   }
+
filter_dmesg();
 
if (!is_amd && mbm_test) {
@@ -196,6 +202,7 @@ int main(int argc, char **argv)
cat_test_cleanup();
}
 
+out:
printf("1..%d\n", tests_run);
 
return 0;
-- 
2.29.2



[PATCH v4 17/17] selftests/resctrl: Fix checking for < 0 for unsigned values

2020-11-30 Thread Fenghua Yu
Dan reported the following static checker warnings:

tools/testing/selftests/resctrl/resctrl_val.c:545 measure_vals()
warn: 'bw_imc' unsigned <= 0

tools/testing/selftests/resctrl/resctrl_val.c:549 measure_vals()
warn: 'bw_resc_end' unsigned <= 0

These warnings are reported because
1. measure_vals() declares 'bw_imc' and 'bw_resc_end' as unsigned long
   variables
2. Return values of get_mem_bw_imc() and get_mem_bw_resctrl() are assigned
   to 'bw_imc' and 'bw_resc_end' respectively
3. The returned values are checked for <= 0 to see if the calls failed

Checking for < 0 for an unsigned value doesn't make any sense: an unsigned
variable can never hold a negative value, so a negative error return cannot
be detected by this check.

Fix this issue by changing get_mem_bw_imc() and get_mem_bw_resctrl() so
that they accept a pointer to a variable, set that variable upon success
and return 0, and return < 0 on error.
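
A minimal sketch of this "status code plus out-parameter" pattern is shown
below. It is not the selftest code: read_counter() is a hypothetical
function that parses a string rather than reading a resctrl file, and is
only meant to show why the signed return value can be checked for < 0
while the measured value stays unsigned.

```c
#include <stdio.h>

/*
 * Hypothetical example: parse an unsigned counter from 'text' into
 * '*value'.
 *
 * Return: 0 on success, -1 on failure ('*value' is left untouched).
 */
static int read_counter(const char *text, unsigned long *value)
{
	unsigned long parsed;

	if (!text || !value)
		return -1;

	/* sscanf() returns the number of converted items, 1 on success */
	if (sscanf(text, "%lu", &parsed) != 1)
		return -1;

	*value = parsed;
	return 0;
}
```

The caller tests the int status for < 0 and only reads the unsigned value
when the call succeeded, so the error path no longer depends on comparing
an unsigned variable against zero.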

Reported-by: Dan Carpenter 
Signed-off-by: Fenghua Yu 
---
 tools/testing/selftests/resctrl/resctrl_val.c | 41 +++
 1 file changed, 23 insertions(+), 18 deletions(-)

diff --git a/tools/testing/selftests/resctrl/resctrl_val.c 
b/tools/testing/selftests/resctrl/resctrl_val.c
index d6f0688182e8..ce8f0ec15f7b 100644
--- a/tools/testing/selftests/resctrl/resctrl_val.c
+++ b/tools/testing/selftests/resctrl/resctrl_val.c
@@ -300,9 +300,9 @@ static int initialize_mem_bw_imc(void)
  * Memory B/W utilized by a process on a socket can be calculated using
  * iMC counters. Perf events are used to read these counters.
  *
- * Return: >= 0 on success. < 0 on failure.
+ * Return: = 0 on success. < 0 on failure.
  */
-static float get_mem_bw_imc(int cpu_no, char *bw_report)
+static int get_mem_bw_imc(int cpu_no, char *bw_report, float *bw_imc)
 {
float reads, writes, of_mul_read, of_mul_write;
int imc, j, ret;
@@ -373,13 +373,18 @@ static float get_mem_bw_imc(int cpu_no, char *bw_report)
close(imc_counters_config[imc][WRITE].fd);
}
 
-   if (strcmp(bw_report, "reads") == 0)
-   return reads;
+   if (strcmp(bw_report, "reads") == 0) {
+   *bw_imc = reads;
+   return 0;
+   }
 
-   if (strcmp(bw_report, "writes") == 0)
-   return writes;
+   if (strcmp(bw_report, "writes") == 0) {
+   *bw_imc = writes;
+   return 0;
+   }
 
-   return (reads + writes);
+   *bw_imc = reads + writes;
+   return 0;
 }
 
 void set_mbm_path(const char *ctrlgrp, const char *mongrp, int resource_id)
@@ -438,9 +443,8 @@ static void initialize_mem_bw_resctrl(const char *ctrlgrp, 
const char *mongrp,
  * 1. If con_mon grp is given, then read from it
  * 2. If con_mon grp is not given, then read from root con_mon grp
  */
-static unsigned long get_mem_bw_resctrl(void)
+static int get_mem_bw_resctrl(unsigned long *mbm_total)
 {
-   unsigned long mbm_total = 0;
FILE *fp;
 
fp = fopen(mbm_total_path, "r");
@@ -449,7 +453,7 @@ static unsigned long get_mem_bw_resctrl(void)
 
return -1;
}
-   if (fscanf(fp, "%lu", &mbm_total) <= 0) {
+   if (fscanf(fp, "%lu", mbm_total) <= 0) {
perror("Could not get mbm local bytes");
fclose(fp);
 
@@ -457,7 +461,7 @@ static unsigned long get_mem_bw_resctrl(void)
}
fclose(fp);
 
-   return mbm_total;
+   return 0;
 }
 
 pid_t bm_pid, ppid;
@@ -549,7 +553,8 @@ static void initialize_llc_occu_resctrl(const char 
*ctrlgrp, const char *mongrp,
 static int
 measure_vals(struct resctrl_val_param *param, unsigned long *bw_resc_start)
 {
-   unsigned long bw_imc, bw_resc, bw_resc_end;
+   unsigned long bw_resc, bw_resc_end;
+   float bw_imc;
int ret;
 
/*
@@ -559,13 +564,13 @@ measure_vals(struct resctrl_val_param *param, unsigned 
long *bw_resc_start)
 * Compare the two values to validate resctrl value.
 * It takes 1sec to measure the data.
 */
-   bw_imc = get_mem_bw_imc(param->cpu_no, param->bw_report);
-   if (bw_imc <= 0)
-   return bw_imc;
+   ret = get_mem_bw_imc(param->cpu_no, param->bw_report, &bw_imc);
+   if (ret < 0)
+   return ret;
 
-   bw_resc_end = get_mem_bw_resctrl();
-   if (bw_resc_end <= 0)
-   return bw_resc_end;
+   ret = get_mem_bw_resctrl(&bw_resc_end);
+   if (ret < 0)
+   return ret;
 
bw_resc = (bw_resc_end - *bw_resc_start) / MB;
ret = print_results_bw(param->filename, bm_pid, bw_imc, bw_resc);
-- 
2.29.2



[PATCH v4 02/17] selftests/resctrl: Clean up resctrl features check

2020-11-30 Thread Fenghua Yu
The resctrl feature checks call strcmp() to compare feature strings
(e.g. "mba", "cat", etc.). These checks are error prone and poorly
styled. Define the constant strings as macros and call strncmp() to
address these issues.
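
As a rough sketch of what such macros and helpers could look like: the
names CAT_STR, CQM_STR, is_cat() and is_cqm() appear in the diff below, but
their definitions are not shown in this excerpt, so the bodies here are
assumptions rather than the actual contents of resctrl.h.

```c
#include <stdbool.h>
#include <string.h>

/* Assumed definitions; only the names are taken from the diff. */
#define CAT_STR "cat"
#define CQM_STR "cqm"

/*
 * sizeof() of a string literal includes the terminating NUL, so the
 * bounded comparison below still requires an exact match.
 */
static bool is_cat(const char *resctrl_val)
{
	return resctrl_val &&
	       !strncmp(resctrl_val, CAT_STR, sizeof(CAT_STR));
}

static bool is_cqm(const char *resctrl_val)
{
	return resctrl_val &&
	       !strncmp(resctrl_val, CQM_STR, sizeof(CQM_STR));
}
```

Keeping each feature string in one macro means a later rename only has to
touch a single definition instead of every call site.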

Suggested-by: Shuah Khan 
Signed-off-by: Fenghua Yu 
---
 tools/testing/selftests/resctrl/cache.c   |  8 +++---
 tools/testing/selftests/resctrl/cat_test.c|  2 +-
 tools/testing/selftests/resctrl/cqm_test.c|  2 +-
 tools/testing/selftests/resctrl/fill_buf.c|  4 +--
 tools/testing/selftests/resctrl/mba_test.c|  2 +-
 tools/testing/selftests/resctrl/mbm_test.c|  2 +-
 tools/testing/selftests/resctrl/resctrl.h | 25 +++
 .../testing/selftests/resctrl/resctrl_tests.c | 12 -
 tools/testing/selftests/resctrl/resctrl_val.c | 19 ++
 tools/testing/selftests/resctrl/resctrlfs.c   | 14 +--
 10 files changed, 55 insertions(+), 35 deletions(-)

diff --git a/tools/testing/selftests/resctrl/cache.c 
b/tools/testing/selftests/resctrl/cache.c
index 38dbf4962e33..248bf000c978 100644
--- a/tools/testing/selftests/resctrl/cache.c
+++ b/tools/testing/selftests/resctrl/cache.c
@@ -182,7 +182,7 @@ int measure_cache_vals(struct resctrl_val_param *param, int 
bm_pid)
/*
 * Measure cache miss from perf.
 */
-   if (!strcmp(param->resctrl_val, "cat")) {
+   if (is_cat(param->resctrl_val)) {
ret = get_llc_perf(&llc_perf_miss);
if (ret < 0)
return ret;
@@ -192,7 +192,7 @@ int measure_cache_vals(struct resctrl_val_param *param, int 
bm_pid)
/*
 * Measure llc occupancy from resctrl.
 */
-   if (!strcmp(param->resctrl_val, "cqm")) {
+   if (is_cqm(param->resctrl_val)) {
ret = get_llc_occu_resctrl(&llc_occu_resc);
if (ret < 0)
return ret;
@@ -234,7 +234,7 @@ int cat_val(struct resctrl_val_param *param)
if (ret)
return ret;
 
-   if ((strcmp(resctrl_val, "cat") == 0)) {
+   if (is_cat(resctrl_val)) {
ret = initialize_llc_perf();
if (ret)
return ret;
@@ -242,7 +242,7 @@ int cat_val(struct resctrl_val_param *param)
 
/* Test runs until the callback setup() tells the test to stop. */
while (1) {
-   if (strcmp(resctrl_val, "cat") == 0) {
+   if (is_cat(resctrl_val)) {
ret = param->setup(1, param);
if (ret) {
ret = 0;
diff --git a/tools/testing/selftests/resctrl/cat_test.c 
b/tools/testing/selftests/resctrl/cat_test.c
index 7f723bd8f328..6d9a41f3939a 100644
--- a/tools/testing/selftests/resctrl/cat_test.c
+++ b/tools/testing/selftests/resctrl/cat_test.c
@@ -160,7 +160,7 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
return -1;
 
struct resctrl_val_param param = {
-   .resctrl_val= "cat",
+   .resctrl_val= CAT_STR,
.cpu_no = cpu_no,
.mum_resctrlfs  = 0,
.setup  = cat_setup,
diff --git a/tools/testing/selftests/resctrl/cqm_test.c 
b/tools/testing/selftests/resctrl/cqm_test.c
index b6af940ccfc2..6635b24a74cc 100644
--- a/tools/testing/selftests/resctrl/cqm_test.c
+++ b/tools/testing/selftests/resctrl/cqm_test.c
@@ -142,7 +142,7 @@ int cqm_resctrl_val(int cpu_no, int n, char **benchmark_cmd)
}
 
struct resctrl_val_param param = {
-   .resctrl_val= "cqm",
+   .resctrl_val= CQM_STR,
.ctrlgrp= "c1",
.mongrp = "m1",
.cpu_no = cpu_no,
diff --git a/tools/testing/selftests/resctrl/fill_buf.c 
b/tools/testing/selftests/resctrl/fill_buf.c
index 79c611c99a3d..bece8bb4b575 100644
--- a/tools/testing/selftests/resctrl/fill_buf.c
+++ b/tools/testing/selftests/resctrl/fill_buf.c
@@ -115,7 +115,7 @@ static int fill_cache_read(unsigned char *start_ptr, 
unsigned char *end_ptr,
 
while (1) {
ret = fill_one_span_read(start_ptr, end_ptr);
-   if (!strcmp(resctrl_val, "cat"))
+   if (is_cat(resctrl_val))
break;
}
 
@@ -134,7 +134,7 @@ static int fill_cache_write(unsigned char *start_ptr, 
unsigned char *end_ptr,
 {
while (1) {
fill_one_span_write(start_ptr, end_ptr);
-   if (!strcmp(resctrl_val, "cat"))
+   if (is_cat(resctrl_val))
break;
}
 
diff --git a/tools/testing/selftests/resctrl/mba_test.c 
b/tools/testing/selftests/resctrl/mba_test.c
index 7bf8eaa6204b..6449fbd96096 100644
--- a/tools/testing/selftests/resctrl/mba_test.c
+++ b/to

[PATCH v4 00/17] Miscellaneous fixes for resctrl selftests

2020-11-30 Thread Fenghua Yu
This patch set has several miscellaneous fixes to the resctrl selftest tool
that are easily visible to the user. V1 had fixes to the CAT test and CMT
test but they were dropped in V2 because having them here made the patchset
humongous. So, changes to the CAT test and CMT test will be posted in
another patchset.

Change Log:
v4:
- Address various comments from Shuah Khan:
  1. Combine a few patches e.g. a couple of fixing typos patches into one
 and a couple of unmounting patches into one etc.
  2. Add config file.
  3. Remove "Fixes" tags.
  4. Change strcmp() to strncmp().
  5. Move the global variable fixing patch to the patch 1 so that the
 compilation issue is fixed first.

Please note:
- I didn't move the patch of renaming CQM to CMT to the end of the series
  because code and commit messages in a few other patches depend on the
  new term of "CMT". If I moved the renaming patch to the end, the earlier
  patches would use the old "CQM" term and code, which would then be changed
  again at the end of the series and would need more code and explanation.
[v3: https://lkml.org/lkml/2020/10/28/137]

v3:
Address various comments (commit messages, return value on test failure,
print failure info on test failure etc) from Reinette and Tony.
[v2: 
https://lore.kernel.org/linux-kselftest/cover.1589835155.git.sai.praneeth.prak...@intel.com/]

v2:
1. Dropped changes to CAT test and CMT test as they will be posted in a later
   series.
2. Added several other fixes
[v1: 
https://lore.kernel.org/linux-kselftest/cover.1583657204.git.sai.praneeth.prak...@intel.com/]

Fenghua Yu (15):
  selftests/resctrl: Fix compilation issues for global variables
  selftests/resctrl: Clean up resctrl features check
  selftests/resctrl: Rename CQM test as CMT test
  selftests/resctrl: Add a few dependencies
  selftests/resctrl: Check for resctrl mount point only if resctrl FS is
supported
  selftests/resctrl: Use resctrl/info for feature detection
  selftests/resctrl: Fix missing options "-n" and "-p"
  selftests/resctrl: Fix MBA/MBM results reporting format
  selftests/resctrl: Enable gcc checks to detect buffer overflows
  selftests/resctrl: Don't hard code value of "no_of_bits" variable
  selftests/resctrl: Modularize resctrl test suite main() function
  selftests/resctrl: Skip the test if requested resctrl feature is not
supported
  selftests/resctrl: Fix unmount resctrl FS
  selftests/resctrl: Fix incorrect parsing of iMC counters
  selftests/resctrl: Fix checking for < 0 for unsigned values

Reinette Chatre (2):
  selftests/resctrl: Fix printed messages
  selftests/resctrl: Ensure sibling CPU is not same as original CPU

 tools/testing/selftests/resctrl/Makefile  |   2 +-
 tools/testing/selftests/resctrl/README|   4 +-
 tools/testing/selftests/resctrl/cache.c   |  10 +-
 tools/testing/selftests/resctrl/cat_test.c|  22 +--
 .../resctrl/{cqm_test.c => cmt_test.c}|  33 ++--
 tools/testing/selftests/resctrl/config|   2 +
 tools/testing/selftests/resctrl/fill_buf.c|   4 +-
 tools/testing/selftests/resctrl/mba_test.c|  25 ++-
 tools/testing/selftests/resctrl/mbm_test.c|  18 +-
 tools/testing/selftests/resctrl/resctrl.h |  45 -
 .../testing/selftests/resctrl/resctrl_tests.c | 162 --
 tools/testing/selftests/resctrl/resctrl_val.c |  88 ++
 tools/testing/selftests/resctrl/resctrlfs.c   |  85 ++---
 13 files changed, 318 insertions(+), 182 deletions(-)
 rename tools/testing/selftests/resctrl/{cqm_test.c => cmt_test.c} (85%)
 create mode 100644 tools/testing/selftests/resctrl/config

-- 
2.29.2



[PATCH v4 16/17] selftests/resctrl: Fix incorrect parsing of iMC counters

2020-11-30 Thread Fenghua Yu
iMC (Integrated Memory Controller) counters are usually at
"/sys/bus/event_source/devices/" and are named as "uncore_imc_<n>".
num_of_imcs() function tries to count number of such iMC counters so that
it could appropriately initialize required number of perf_attr structures
that could be used to read these iMC counters.

num_of_imcs() function assumes that all the directories under this path
that start with "uncore_imc" are iMC counters. But, on some systems there
could be directories named as "uncore_imc_free_running" which aren't iMC
counters. Trying to read from such directories will result in "not found
file" errors and MBM/MBA tests will fail.

Hence, fix the logic in num_of_imcs() such that it looks at the first
character after "uncore_imc_" to check if it's a numerical digit or not. If
it's a digit then the directory represents an iMC counter, else, skip the
directory.
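
The name check can be sketched as below. Note that this is a simplified
variant and not the code in the diff: the patch locates the name with
strstr() and skips it with sizeof(UNCORE_IMC), whereas this sketch assumes
a hypothetical IMC_PREFIX macro and matches it only at the start of the
name, which also avoids reading past the end of a short name.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical macro: the counter prefix including the underscore. */
#define IMC_PREFIX "uncore_imc_"

/*
 * Return true if 'd_name' looks like "uncore_imc_<n>", i.e. the first
 * character after the prefix is a decimal digit.
 */
static bool is_imc_counter_dir(const char *d_name)
{
	size_t prefix_len = strlen(IMC_PREFIX);

	if (strncmp(d_name, IMC_PREFIX, prefix_len) != 0)
		return false;

	/* "uncore_imc_free_running" fails here because 'f' is no digit */
	return isdigit((unsigned char)d_name[prefix_len]) != 0;
}
```

With this check, a directory such as "uncore_imc_0" would be counted while
"uncore_imc_free_running" would be skipped.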

Reported-by: Reinette Chatre 
Signed-off-by: Fenghua Yu 
---
 tools/testing/selftests/resctrl/resctrl_val.c | 22 +--
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/resctrl/resctrl_val.c 
b/tools/testing/selftests/resctrl/resctrl_val.c
index 5ff336f62f8f..d6f0688182e8 100644
--- a/tools/testing/selftests/resctrl/resctrl_val.c
+++ b/tools/testing/selftests/resctrl/resctrl_val.c
@@ -221,8 +221,8 @@ static int read_from_imc_dir(char *imc_dir, int count)
  */
 static int num_of_imcs(void)
 {
+   char imc_dir[512], *temp;
unsigned int count = 0;
-   char imc_dir[512];
struct dirent *ep;
int ret;
DIR *dp;
@@ -230,7 +230,25 @@ static int num_of_imcs(void)
dp = opendir(DYN_PMU_PATH);
if (dp) {
while ((ep = readdir(dp))) {
-   if (strstr(ep->d_name, UNCORE_IMC)) {
+   temp = strstr(ep->d_name, UNCORE_IMC);
+   if (!temp)
+   continue;
+
+   /*
+* imc counters are named as "uncore_imc_<n>", hence
+* increment the pointer to point to <n>. Note that
+* sizeof(UNCORE_IMC) would count for null character as
+* well and hence the last underscore character in
+* uncore_imc'_' need not be counted.
+*/
+   temp = temp + sizeof(UNCORE_IMC);
+
+   /*
+* Some directories under "DYN_PMU_PATH" could have
+* names like "uncore_imc_free_running", hence, check if
+* first character is a numerical digit or not.
+*/
+   if (temp[0] >= '0' && temp[0] <= '9') {
sprintf(imc_dir, "%s/%s/", DYN_PMU_PATH,
ep->d_name);
ret = read_from_imc_dir(imc_dir, count);
-- 
2.29.2



[PATCH v4 01/17] selftests/resctrl: Fix compilation issues for global variables

2020-11-30 Thread Fenghua Yu
Reinette reported the following compilation issue on Fedora 32, gcc version
10.1.1:

/usr/bin/ld: cqm_test.o:/cqm_test.c:22: multiple definition of
`cache_size'; cat_test.o:/cat_test.c:23: first defined here

The same issue is reported for long_mask, cbm_mask, count_of_bits etc
variables as well. The linker rejects the build because these variables
are defined globally in two .c files, namely cqm_test.c and cat_test.c,
so each symbol ends up with more than one definition (multiple definition
error).

Taking a closer look at the usage of these variables reveals that these
variables are used only locally to functions such as cqm_resctrl_val()
(defined in cqm_test.c) and cat_perf_miss_val() (defined in cat_test.c).
These variables are not shared between those functions. So, there is no
need for these variables to be global. Hence, fix this issue by making
them local variables to the functions where they are used.

To fix the same issue for other global variables (e.g. bm_pid, ppid,
llc_occup_path and is_amd) that are used across .c files, declare them as
extern in resctrl.h.
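
The declaration/definition split can be sketched in a single file as
follows. The comments mark where each part would normally live; which .c
file holds the one definition is not visible in this excerpt and is an
assumption, as is the record_test() helper. For background, GCC 10 made
-fno-common the default, which is why uninitialized globals defined in a
shared header started to collide at link time.

```c
/* --- would live in the shared header (e.g. resctrl.h) --- */
extern int tests_run;	/* declaration only, no storage is allocated */

/* --- would live in exactly one .c file (assumed location) --- */
int tests_run;		/* the single definition, zero-initialized */

/* --- any file that includes the header can then use the variable --- */
static void record_test(void)
{
	tests_run++;
}
```

Every translation unit sees the same declaration, but only one object file
provides the symbol, so the linker no longer finds multiple definitions.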

Reported-by: Reinette Chatre 
Signed-off-by: Fenghua Yu 
---
 tools/testing/selftests/resctrl/cat_test.c  | 12 
 tools/testing/selftests/resctrl/cqm_test.c  | 11 ---
 tools/testing/selftests/resctrl/resctrl.h   | 10 +-
 tools/testing/selftests/resctrl/resctrlfs.c | 10 +-
 4 files changed, 18 insertions(+), 25 deletions(-)

diff --git a/tools/testing/selftests/resctrl/cat_test.c 
b/tools/testing/selftests/resctrl/cat_test.c
index 5da43767b973..7f723bd8f328 100644
--- a/tools/testing/selftests/resctrl/cat_test.c
+++ b/tools/testing/selftests/resctrl/cat_test.c
@@ -17,11 +17,6 @@
 #define MAX_DIFF_PERCENT   4
 #define MAX_DIFF   100
 
-int count_of_bits;
-char cbm_mask[256];
-unsigned long long_mask;
-unsigned long cache_size;
-
 /*
  * Change schemata. Write schemata to specified
  * con_mon grp, mon_grp in resctrl FS.
@@ -121,8 +116,9 @@ void cat_test_cleanup(void)
 
 int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
 {
-   unsigned long l_mask, l_mask_1;
-   int ret, pipefd[2], sibling_cpu_no;
+   unsigned long l_mask, l_mask_1, long_mask, cache_size;
+   int ret, pipefd[2], sibling_cpu_no, count_of_bits;
+   char cbm_mask[256];
char pipe_message;
pid_t bm_pid;
 
@@ -136,7 +132,7 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
return -1;
 
/* Get default cbm mask for L3/L2 cache */
-   ret = get_cbm_mask(cache_type);
+   ret = get_cbm_mask(cache_type, cbm_mask);
if (ret)
return ret;
 
diff --git a/tools/testing/selftests/resctrl/cqm_test.c 
b/tools/testing/selftests/resctrl/cqm_test.c
index c8756152bd61..b6af940ccfc2 100644
--- a/tools/testing/selftests/resctrl/cqm_test.c
+++ b/tools/testing/selftests/resctrl/cqm_test.c
@@ -16,11 +16,6 @@
 #define MAX_DIFF   200
 #define MAX_DIFF_PERCENT   15
 
-int count_of_bits;
-char cbm_mask[256];
-unsigned long long_mask;
-unsigned long cache_size;
-
 static int cqm_setup(int num, ...)
 {
struct resctrl_val_param *p;
@@ -113,7 +108,9 @@ void cqm_test_cleanup(void)
 
 int cqm_resctrl_val(int cpu_no, int n, char **benchmark_cmd)
 {
-   int ret, mum_resctrlfs;
+   int ret, mum_resctrlfs, count_of_bits;
+   unsigned long long_mask, cache_size;
+   char cbm_mask[256];
 
cache_size = 0;
mum_resctrlfs = 1;
@@ -125,7 +122,7 @@ int cqm_resctrl_val(int cpu_no, int n, char **benchmark_cmd)
if (!validate_resctrl_feature_request("cqm"))
return -1;
 
-   ret = get_cbm_mask("L3");
+   ret = get_cbm_mask("L3", cbm_mask);
if (ret)
return ret;
 
diff --git a/tools/testing/selftests/resctrl/resctrl.h 
b/tools/testing/selftests/resctrl/resctrl.h
index 39bf59c6b9c5..12b77182cb44 100644
--- a/tools/testing/selftests/resctrl/resctrl.h
+++ b/tools/testing/selftests/resctrl/resctrl.h
@@ -62,11 +62,11 @@ struct resctrl_val_param {
int (*setup)(int num, ...);
 };
 
-pid_t bm_pid, ppid;
-int tests_run;
+extern pid_t bm_pid, ppid;
+extern int tests_run;
 
-char llc_occup_path[1024];
-bool is_amd;
+extern char llc_occup_path[1024];
+extern bool is_amd;
 
 bool check_resctrlfs_support(void);
 int filter_dmesg(void);
@@ -92,7 +92,7 @@ void tests_cleanup(void);
 void mbm_test_cleanup(void);
 int mba_schemata_change(int cpu_no, char *bw_report, char **benchmark_cmd);
 void mba_test_cleanup(void);
-int get_cbm_mask(char *cache_type);
+int get_cbm_mask(char *cache_type, char *cbm_mask);
 int get_cache_size(int cpu_no, char *cache_type, unsigned long *cache_size);
 void ctrlc_handler(int signum, siginfo_t *info, void *ptr);
 int cat_val(struct resctrl_val_param *param);
diff --git a/tools/testing/selftests/resctrl/resctrlfs.c 
b/tools/testing/selftests/resctrl/resctrlfs.c
index 19c0ec4045a4..2a16100c9c3f 100644
