Re: [PATCH v3] KVM: selftests: Fix target thread to be migrated in rseq_test

2022-07-18 Thread Gavin Shan

Hi Sean,

On 7/19/22 9:46 AM, Sean Christopherson wrote:

On Tue, Jul 19, 2022, Gavin Shan wrote:

---
v3: Improved changelog (Oliver Upton)


Sorry I didn't catch v3, I saw that you waited but just didn't get to this 
earlier :-/



Not a problem at all :)


---
  tools/testing/selftests/kvm/rseq_test.c | 5 ++++-
  1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/kvm/rseq_test.c 
b/tools/testing/selftests/kvm/rseq_test.c
index 4158da0da2bb..c83ac7b467f8 100644
--- a/tools/testing/selftests/kvm/rseq_test.c
+++ b/tools/testing/selftests/kvm/rseq_test.c
@@ -38,6 +38,7 @@ static __thread volatile struct rseq __rseq = {
   */
  #define NR_TASK_MIGRATIONS 10
  
+static pid_t rseq_tid;

  static pthread_t migration_thread;
  static cpu_set_t possible_mask;
  static int min_cpu, max_cpu;
@@ -106,7 +107,8 @@ static void *migration_worker(void *ign)


Pass the target TID to the worker, then there's no need to use a global and no
chance of consuming rseq_tid "uninitialized".  The casting to convert gettid()
to a "void *" is annoying, but not the end of the world.



I was thinking of the same scheme, but passing the address of a local variable
for the thread ID. Your suggestion also makes sense to me.
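
For reference, a minimal sketch of the alternative mentioned above (passing the
address of a stack variable to the worker instead of casting the TID to a
"void *"). It is illustrative only, not the posted patch; gettid() here assumes
glibc >= 2.30 (the selftest carries its own wrapper):

  #define _GNU_SOURCE
  #include <pthread.h>
  #include <unistd.h>

  static void *migration_worker(void *arg)
  {
          pid_t rseq_tid = *(pid_t *)arg;  /* TID of the vCPU thread */

          /*
           * The real test would loop here, migrating rseq_tid via
           * sched_setaffinity(rseq_tid, ...).
           */
          (void)rseq_tid;
          return NULL;
  }

  int main(void)
  {
          pid_t vcpu_tid = gettid();
          pthread_t thread;

          /* vcpu_tid must stay in scope until the worker has read it. */
          pthread_create(&thread, NULL, migration_worker, &vcpu_tid);
          pthread_join(thread, NULL);
          return 0;
  }

The trade-off is lifetime: the pointed-to variable must outlive the worker's
read of it, which the cast-the-TID approach shown below avoids entirely.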


 * stable, i.e. while changing affinity is in-progress.
 */
smp_wmb();
-   r = sched_setaffinity(0, sizeof(allowed_mask), &allowed_mask);
+   r = sched_setaffinity(rseq_tid, sizeof(allowed_mask),
+                         &allowed_mask);


Eh, let this poke out, don't think it's worth wrapping here.



Ok, I was trying to follow the 80-characters-per-line rule, even though it's
not strictly required nowadays. It's also fine not to follow it :)

I just picked your code and posted v4:

https://lore.kernel.org/kvmarm/20220719020830.3479482-1-gs...@redhat.com/T/#u

Thanks,
Gavin


E.g.

---
  tools/testing/selftests/kvm/rseq_test.c | 8 +++++---
  1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/kvm/rseq_test.c 
b/tools/testing/selftests/kvm/rseq_test.c
index aba7be178dab..a54d4d05a058 100644
--- a/tools/testing/selftests/kvm/rseq_test.c
+++ b/tools/testing/selftests/kvm/rseq_test.c
@@ -80,8 +80,9 @@ static int next_cpu(int cpu)
return cpu;
  }

-static void *migration_worker(void *ign)
+static void *migration_worker(void *__rseq_tid)
  {
+   pid_t rseq_tid = (pid_t)(unsigned long)__rseq_tid;
cpu_set_t allowed_mask;
int r, i, cpu;

@@ -104,7 +105,7 @@ static void *migration_worker(void *ign)
 * stable, i.e. while changing affinity is in-progress.
 */
smp_wmb();
-   r = sched_setaffinity(0, sizeof(allowed_mask), &allowed_mask);
+   r = sched_setaffinity(rseq_tid, sizeof(allowed_mask), &allowed_mask);
TEST_ASSERT(!r, "sched_setaffinity failed, errno = %d (%s)",
errno, strerror(errno));
smp_wmb();
@@ -227,7 +228,8 @@ int main(int argc, char *argv[])
vm = vm_create_with_one_vcpu(&vcpu, guest_code);
ucall_init(vm, NULL);

-   pthread_create(&migration_thread, NULL, migration_worker, 0);
+   pthread_create(&migration_thread, NULL, migration_worker,
+                  (void *)(unsigned long)gettid());

for (i = 0; !done; i++) {
vcpu_run(vcpu);

base-commit: ad6cb756bb497997032df2bda7cbdff076e4a66a
--





[PATCH v4] KVM: selftests: Fix target thread to be migrated in rseq_test

2022-07-18 Thread Gavin Shan
In rseq_test, there are two threads: the vCPU thread and the migration
worker. Unfortunately, the test passes the wrong PID to sched_setaffinity()
in the migration worker. It forces migration onto the migration worker
itself, because a PID of zero refers to the calling thread, which is the
migration worker. This means migration is never forced on the vCPU thread,
which can migrate at any time, eventually leading to failure as the
following logs show.

  host# uname -r
  5.19.0-rc6-gavin+
  host# # cat /proc/cpuinfo | grep processor | tail -n 1
  processor: 223
  host# pwd
  /home/gavin/sandbox/linux.main/tools/testing/selftests/kvm
  host# for i in `seq 1 100`; do \
echo "> $i"; ./rseq_test; done
  > 1
  > 2
  > 3
  > 4
  > 5
  > 6
   ==== Test Assertion Failure ====
rseq_test.c:265: rseq_cpu == cpu
pid=3925 tid=3925 errno=4 - Interrupted system call
   1  0x00401963: main at rseq_test.c:265 (discriminator 2)
   2  0xb044affb: ?? ??:0
   3  0xb044b0c7: ?? ??:0
   4  0x00401a6f: _start at ??:?
rseq CPU = 4, sched CPU = 27

Fix the issue by passing the correct parameter, the TID of the vCPU thread,
to sched_setaffinity() in the migration worker.

Fixes: 61e52f1630f5 ("KVM: selftests: Add a test for KVM_RUN+rseq to detect 
task migration bugs")
Suggested-by: Sean Christopherson 
Signed-off-by: Gavin Shan 
Reviewed-by: Oliver Upton 
---
v4: Pick the code change as Sean suggested.
---
 tools/testing/selftests/kvm/rseq_test.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/kvm/rseq_test.c 
b/tools/testing/selftests/kvm/rseq_test.c
index 4158da0da2bb..2237d1aac801 100644
--- a/tools/testing/selftests/kvm/rseq_test.c
+++ b/tools/testing/selftests/kvm/rseq_test.c
@@ -82,8 +82,9 @@ static int next_cpu(int cpu)
return cpu;
 }
 
-static void *migration_worker(void *ign)
+static void *migration_worker(void *__rseq_tid)
 {
+   pid_t rseq_tid = (pid_t)(unsigned long)__rseq_tid;
cpu_set_t allowed_mask;
int r, i, cpu;
 
@@ -106,7 +107,7 @@ static void *migration_worker(void *ign)
 * stable, i.e. while changing affinity is in-progress.
 */
smp_wmb();
-   r = sched_setaffinity(0, sizeof(allowed_mask), &allowed_mask);
+   r = sched_setaffinity(rseq_tid, sizeof(allowed_mask), &allowed_mask);
TEST_ASSERT(!r, "sched_setaffinity failed, errno = %d (%s)",
errno, strerror(errno));
smp_wmb();
@@ -231,7 +232,8 @@ int main(int argc, char *argv[])
vm = vm_create_default(VCPU_ID, 0, guest_code);
ucall_init(vm, NULL);
 
-   pthread_create(&migration_thread, NULL, migration_worker, 0);
+   pthread_create(&migration_thread, NULL, migration_worker,
+                  (void *)(unsigned long)gettid());
 
for (i = 0; !done; i++) {
vcpu_run(vm, VCPU_ID);
-- 
2.23.0



Re: [PATCH v2] KVM: selftests: Fix target thread to be migrated in rseq_test

2022-07-18 Thread Gavin Shan

On 7/17/22 1:11 PM, Gavin Shan wrote:

On 7/17/22 7:48 AM, oliver.up...@linux.dev wrote:

July 16, 2022 7:45 AM, "Gavin Shan"  wrote:

In rseq_test, there are two threads, which are thread group leader
and migration worker. The migration worker relies on sched_setaffinity()
to force migration on the thread group leader.


It may be clearer to describe it as a vCPU thread and a migration worker
thread. The meat of this test is to catch a regression in KVM.


Unfortunately, we have


s/we have/the test has the/


wrong parameter (0) passed to sched_getaffinity().


wrong PID



Yep, it's much clearer to describe it as vCPU thread and migration worker.


It's actually
forcing migration on the migration worker instead of the thread group
leader.


What's missing is _why_ the migration worker is getting moved around by
the call. Perhaps instead it is better to state what a PID of 0 implies,
for those of us who haven't read their manpages in a while ;-)



Yes, it's a good idea. I will have something like the below in the next revision :)

     In rseq_test, there are two threads: the vCPU thread and the migration
     worker. Unfortunately, the test passes the wrong PID to sched_setaffinity()
     in the migration worker. It forces migration onto the migration worker
     itself, because a PID of zero refers to the calling thread, which is the
     migration worker. This means migration is never forced on the vCPU thread,
     which can migrate at any time, eventually leading to failure as the
     following logs show.
     :
     :
     Fix the issue by passing the correct parameter, the TID of the vCPU
     thread, to sched_setaffinity() in the migration worker.
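
As an aside, the sched_setaffinity(2) semantics alluded to above: a pid of 0
names the calling thread, so from the worker the two calls in this hypothetical
snippet act on the same (worker) thread. The snippet is illustrative only, and
gettid() assumes glibc >= 2.30:

  #define _GNU_SOURCE
  #include <sched.h>
  #include <stdio.h>
  #include <unistd.h>

  int main(void)
  {
          cpu_set_t mask;

          CPU_ZERO(&mask);
          CPU_SET(0, &mask);

          /* pid == 0: acts on the calling thread... */
          sched_setaffinity(0, sizeof(mask), &mask);
          /* ...which is exactly what naming our own TID does. */
          sched_setaffinity(gettid(), sizeof(mask), &mask);

          printf("pinned TID %d to CPU 0\n", (int)gettid());
          return 0;
  }

Hence the fix: the worker has to name the vCPU thread's TID explicitly.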



It also means migration can happen on the thread group leader
at any time, which eventually leads to failure as the following logs
show.

host# uname -r
5.19.0-rc6-gavin+
host# # cat /proc/cpuinfo | grep processor | tail -n 1
processor : 223
host# pwd
/home/gavin/sandbox/linux.main/tools/testing/selftests/kvm
host# for i in `seq 1 100`; \
do echo "> $i"; ./rseq_test; done
> 1
> 2
> 3
> 4
> 5
> 6
 ==== Test Assertion Failure ====
rseq_test.c:265: rseq_cpu == cpu
pid=3925 tid=3925 errno=4 - Interrupted system call
1 0x00401963: main at rseq_test.c:265 (discriminator 2)
2 0xb044affb: ?? ??:0
3 0xb044b0c7: ?? ??:0
4 0x00401a6f: _start at ??:?
rseq CPU = 4, sched CPU = 27

This fixes the issue by passing correct parameter, tid of the group
thread leader, to sched_setaffinity().


Kernel commit messages should have an imperative tone:

Fix the issue by ...



Ok. I've had my own style for a long time. Actually, the style was
shared by someone when I worked for IBM a long time ago. I will bear
it in mind to use the imperative form from now on :)

All your comments will be fixed in the next revision, but I will delay
posting a bit to see whether Sean or Paolo have more comments. In that
case, I can fix all of them at once.



v3 was just posted.

https://lore.kernel.org/kvmarm/20220719013540.3477946-1-gs...@redhat.com/T/#u


Fixes: 61e52f1630f5 ("KVM: selftests: Add a test for KVM_RUN+rseq to detect task 
migration bugs")
Signed-off-by: Gavin Shan 


With the comments on the commit message addressed:

Reviewed-by: Oliver Upton 



Thanks,
Gavin



[PATCH v3] KVM: selftests: Fix target thread to be migrated in rseq_test

2022-07-18 Thread Gavin Shan
In rseq_test, there are two threads: the vCPU thread and the migration
worker. Unfortunately, the test passes the wrong PID to sched_setaffinity()
in the migration worker. It forces migration onto the migration worker
itself, because a PID of zero refers to the calling thread, which is the
migration worker. This means migration is never forced on the vCPU thread,
which can migrate at any time, eventually leading to failure as the
following logs show.

  host# uname -r
  5.19.0-rc6-gavin+
  host# # cat /proc/cpuinfo | grep processor | tail -n 1
  processor: 223
  host# pwd
  /home/gavin/sandbox/linux.main/tools/testing/selftests/kvm
  host# for i in `seq 1 100`; do \
echo "> $i"; ./rseq_test; done
  > 1
  > 2
  > 3
  > 4
  > 5
  > 6
   ==== Test Assertion Failure ====
rseq_test.c:265: rseq_cpu == cpu
pid=3925 tid=3925 errno=4 - Interrupted system call
   1  0x00401963: main at rseq_test.c:265 (discriminator 2)
   2  0xb044affb: ?? ??:0
   3  0xb044b0c7: ?? ??:0
   4  0x00401a6f: _start at ??:?
rseq CPU = 4, sched CPU = 27

Fix the issue by passing the correct parameter, the TID of the vCPU thread,
to sched_setaffinity() in the migration worker.

Fixes: 61e52f1630f5 ("KVM: selftests: Add a test for KVM_RUN+rseq to detect 
task migration bugs")
Signed-off-by: Gavin Shan 
Reviewed-by: Oliver Upton 
---
v3: Improved changelog (Oliver Upton)
---
 tools/testing/selftests/kvm/rseq_test.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/kvm/rseq_test.c 
b/tools/testing/selftests/kvm/rseq_test.c
index 4158da0da2bb..c83ac7b467f8 100644
--- a/tools/testing/selftests/kvm/rseq_test.c
+++ b/tools/testing/selftests/kvm/rseq_test.c
@@ -38,6 +38,7 @@ static __thread volatile struct rseq __rseq = {
  */
 #define NR_TASK_MIGRATIONS 10
 
+static pid_t rseq_tid;
 static pthread_t migration_thread;
 static cpu_set_t possible_mask;
 static int min_cpu, max_cpu;
@@ -106,7 +107,8 @@ static void *migration_worker(void *ign)
 * stable, i.e. while changing affinity is in-progress.
 */
smp_wmb();
-   r = sched_setaffinity(0, sizeof(allowed_mask), &allowed_mask);
+   r = sched_setaffinity(rseq_tid, sizeof(allowed_mask),
+                         &allowed_mask);
TEST_ASSERT(!r, "sched_setaffinity failed, errno = %d (%s)",
errno, strerror(errno));
smp_wmb();
@@ -231,6 +233,7 @@ int main(int argc, char *argv[])
vm = vm_create_default(VCPU_ID, 0, guest_code);
ucall_init(vm, NULL);
 
+   rseq_tid = gettid();
pthread_create(&migration_thread, NULL, migration_worker, 0);
 
for (i = 0; !done; i++) {
-- 
2.23.0



Re: [PATCH v4 11/18] KVM: arm64: Stub implementation of non-protected nVHE HYP stack unwinder

2022-07-18 Thread Marc Zyngier

On 2022-07-18 17:51, Kalesh Singh wrote:

On Mon, Jul 18, 2022 at 12:31 AM Marc Zyngier  wrote:


On Fri, 15 Jul 2022 07:10:20 +0100,
Kalesh Singh  wrote:
>
> Add stub implementations of non-protected nVHE stack unwinder, for
> building. These are implemented later in this series.
>
> Signed-off-by: Kalesh Singh 
> ---
>  arch/arm64/include/asm/stacktrace/nvhe.h | 22 ++
>  1 file changed, 22 insertions(+)
>
> diff --git a/arch/arm64/include/asm/stacktrace/nvhe.h 
b/arch/arm64/include/asm/stacktrace/nvhe.h
> index 1eac4e57f2ae..36cf7858ddd8 100644
> --- a/arch/arm64/include/asm/stacktrace/nvhe.h
> +++ b/arch/arm64/include/asm/stacktrace/nvhe.h
> @@ -8,6 +8,12 @@
>   *  the HYP memory. The stack is unwinded in EL2 and dumped to a shared
>   *  buffer where the host can read and print the stacktrace.
>   *
> + *   2) Non-protected nVHE mode - the host can directly access the
> + *  HYP stack pages and unwind the HYP stack in EL1. This saves having
> + *  to allocate shared buffers for the host to read the unwinded
> + *  stacktrace.
> + *
> + *
>   * Copyright (C) 2022 Google LLC
>   */
>  #ifndef __ASM_STACKTRACE_NVHE_H
> @@ -53,5 +59,21 @@ static int notrace unwind_next(struct unwind_state *state)
>  NOKPROBE_SYMBOL(unwind_next);
>  #endif   /* CONFIG_PROTECTED_NVHE_STACKTRACE */
>
> +/*
> + * Non-protected nVHE HYP stack unwinder
> + */
> +#else/* !__KVM_NVHE_HYPERVISOR__ */

I don't get this path. This either represents the VHE hypervisor or
the kernel proper. Which one is it?


Hi Marc. This is run from kernel proper context. And it's the
unwinding for conventional nVHE (non-protected). The unwinding is done
from the host kernel in EL1.


This really deserves a comment here, as the one you currently
have is a bit misleading (and you probably want to move it
below the #else).
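
One way to address this might be to keep the existing wording but place it on
the branch it actually describes, e.g. (illustrative rearrangement only, not
the posted code):

  #else	/* !__KVM_NVHE_HYPERVISOR__ */
  /*
   * Non-protected nVHE mode, unwound by the host kernel in EL1: the host
   * can access the HYP stack pages directly, so no shared buffer is
   * needed to read back the unwound stacktrace.
   */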

Thanks,

M.
--
Jazz is not dead. It just smells funny...


Re: [kvm-unit-tests PATCH 0/3] arm: pmu: Fixes for bare metal

2022-07-18 Thread Alexandru Elisei
Hi,

I believe you're missing the updated email for the arm maintainer. Added it.

Thanks,
Alex

On Mon, Jul 18, 2022 at 08:49:07AM -0700, Ricardo Koller wrote:
> There are some tests that fail when running on bare metal (including a
> passthrough prototype).  There are three issues with the tests.  The
> first one is that there are some missing isb()'s between enabling event
> counting and the actual counting. This wasn't an issue on KVM as
> trapping on registers served as context synchronization events. The
> second issue is that some tests assume that registers reset to 0.  And
> finally, the third issue is that overflowing the low counter of a
> chained event sets the overflow flag in PMOVS and some tests fail by
> checking for it not being set.
> 
> I believe the third fix also requires a KVM change, but would like to
> double check with others first.  The only reference I could find in the
> ARM ARM is the AArch64.IncrementEventCounter() pseudocode (DDI 0487H.a,
> J1.1.1 "aarch64/debug") that unconditionally sets the PMOVS bit on
> overflow.
> 
> Ricardo Koller (3):
>   arm: pmu: Add missing isb()'s after sys register writing
>   arm: pmu: Reset the pmu registers before starting some tests
>   arm: pmu: Remove checks for !overflow in chained counters tests
> 
>  arm/pmu.c | 34 +++---
>  1 file changed, 23 insertions(+), 11 deletions(-)
> 
> -- 
> 2.37.0.170.g444d1eabd0-goog
> 


Re: [kvm-unit-tests PATCH 1/3] arm: pmu: Add missing isb()'s after sys register writing

2022-07-18 Thread Alexandru Elisei
Hi,

On Mon, Jul 18, 2022 at 08:49:08AM -0700, Ricardo Koller wrote:
> There are various pmu tests that require an isb() between enabling
> counting and the actual counting. A missing isb() can lead to count
> registers reporting fewer events than expected, because the enabling only
> takes effect after some events have already happened.  For example, some
> missing isb()'s in the pmu-sw-incr test lead to the following errors on
> bare-metal:
> 
>   INFO: pmu: pmu-sw-incr: SW_INCR counter #0 has value 4294967280
> PASS: pmu: pmu-sw-incr: PWSYNC does not increment if PMCR.E is unset
> FAIL: pmu: pmu-sw-incr: counter #1 after + 100 SW_INCR
> FAIL: pmu: pmu-sw-incr: counter #0 after + 100 SW_INCR
> INFO: pmu: pmu-sw-incr: counter values after 100 SW_INCR #0=82 #1=98
> PASS: pmu: pmu-sw-incr: overflow on counter #0 after 100 SW_INCR
> SUMMARY: 4 tests, 2 unexpected failures
> 
> Add the missing isb()'s on all failing tests, plus some others that are
> not currently required but might in the future (like an isb() after
> clearing the overflow signal in the IRQ handler).

That's rather cryptic. What might require those hypothetical ISBs and why? Why
should a test add code for some hypothetical requirement that might, or might
not, be implemented?

This is pure speculation on my part, were you seeing spurious interrupts that
went away after adding the ISB in irq_handler()?

A couple of comments below.

> 
> Signed-off-by: Ricardo Koller 
> ---
>  arm/pmu.c | 11 +++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/arm/pmu.c b/arm/pmu.c
> index 15c542a2..fd838392 100644
> --- a/arm/pmu.c
> +++ b/arm/pmu.c
> @@ -307,6 +307,7 @@ static void irq_handler(struct pt_regs *regs)
>   }
>   }
>   write_sysreg(ALL_SET, pmovsclr_el0);
> + isb();
>   } else {
>   pmu_stats.unexpected = true;
>   }
> @@ -534,6 +535,7 @@ static void test_sw_incr(void)
>   write_sysreg_s(0x3, PMCNTENSET_EL0);
>  
>   write_regn_el0(pmevcntr, 0, PRE_OVERFLOW);
> + isb();
>  
>   for (i = 0; i < 100; i++)
>   write_sysreg(0x1, pmswinc_el0);

Since your patch adds needed synchronization, from ARM DDI 0487H.a, page
D13-5237:

"Where a direct write to one register causes a bit or field in a different
register [..] to be updated as a side-effect of that direct write [..], the
change to the different register [..] is defined to be an indirect write. In
this case, the indirect write is only guaranteed to be visible to subsequent
direct or indirect reads or writes if synchronization is performed after the
direct write and before the subsequent direct or indirect reads or writes."

I think that says that you need an ISB after the direct writes to PMSWINC_EL0
for software to read the correct value for PMEVCNTR0_EL0.
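
As a sketch of that ordering, using helpers already present in arm/pmu.c
(write_regn_el0(), write_sysreg(), isb(), read_regn_el0()); this assumes
counter #0 has already been programmed to count SW_INCR, as test_sw_incr()
does, and is only meant to show where the barriers sit:

  write_regn_el0(pmevcntr, 0, PRE_OVERFLOW);  /* direct write to the counter  */
  write_sysreg_s(0x1, PMCNTENSET_EL0);        /* enable counter #0            */
  set_pmcr(pmu.pmcr_ro | PMU_PMCR_E);         /* enable the PMU               */
  isb();                                      /* direct writes must complete  */

  write_sysreg(0x1, pmswinc_el0);             /* indirect write to PMEVCNTR0  */
  isb();                                      /* synchronize before reading   */

  report(read_regn_el0(pmevcntr, 0) == PRE_OVERFLOW + 1, "counter incremented");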

> @@ -547,6 +549,7 @@ static void test_sw_incr(void)
>   write_regn_el0(pmevcntr, 0, PRE_OVERFLOW);
>   write_sysreg_s(0x3, PMCNTENSET_EL0);
>   set_pmcr(pmu.pmcr_ro | PMU_PMCR_E);
> + isb();
>  
>   for (i = 0; i < 100; i++)
>   write_sysreg(0x3, pmswinc_el0);

Same as above, might be worth checking in other places.

Will come back with more review comments.

Thanks,
Alex

> @@ -618,6 +621,8 @@ static void test_chained_sw_incr(void)
>  
>   write_regn_el0(pmevcntr, 0, PRE_OVERFLOW);
>   set_pmcr(pmu.pmcr_ro | PMU_PMCR_E);
> + isb();
> +
>   for (i = 0; i < 100; i++)
>   write_sysreg(0x1, pmswinc_el0);
>  
> @@ -634,6 +639,8 @@ static void test_chained_sw_incr(void)
>   write_regn_el0(pmevcntr, 1, ALL_SET);
>   write_sysreg_s(0x3, PMCNTENSET_EL0);
>   set_pmcr(pmu.pmcr_ro | PMU_PMCR_E);
> + isb();
> +
>   for (i = 0; i < 100; i++)
>   write_sysreg(0x1, pmswinc_el0);
>  
> @@ -821,6 +828,8 @@ static void test_overflow_interrupt(void)
>   report(expect_interrupts(0), "no overflow interrupt after preset");
>  
>   set_pmcr(pmu.pmcr_ro | PMU_PMCR_E);
> + isb();
> +
>   for (i = 0; i < 100; i++)
>   write_sysreg(0x2, pmswinc_el0);
>  
> @@ -879,6 +888,7 @@ static bool check_cycles_increase(void)
>   set_pmccfiltr(0); /* count cycles in EL0, EL1, but not EL2 */
>  
>   set_pmcr(get_pmcr() | PMU_PMCR_LC | PMU_PMCR_C | PMU_PMCR_E);
> + isb();
>  
>   for (int i = 0; i < NR_SAMPLES; i++) {
>   uint64_t a, b;
> @@ -894,6 +904,7 @@ static bool check_cycles_increase(void)
>   }
>  
>   set_pmcr(get_pmcr() & ~PMU_PMCR_E);
> + isb();
>  
>   return success;
>  }
> -- 
> 2.37.0.170.g444d1eabd0-goog
> 


Re: [PATCH v4 01/18] arm64: stacktrace: Add shared header for common stack unwinding code

2022-07-18 Thread Russell King (Oracle)
On Mon, Jul 18, 2022 at 08:26:14AM -0700, Kalesh Singh wrote:
> On Mon, Jul 18, 2022 at 5:52 AM Russell King (Oracle)
>  wrote:
> >
> > Hi,
> >
> > Can you please explain why you are targetting my @oracle.com email
> > address with this patch set?
> >
> > This causes me problems as I use Outlook's Web interface for that
> > which doesn't appear to cope with the threading, and most certainly
> > is only capable of top-reply only which is against Linux kernel email
> > standards.
> 
> Hi Russell,
> 
> Sorry I wasn't aware of it (I got your oracle email from
> get_maintainer script). Going forward I'll use the one you responded
> from instead.

Oh, this is the very annoying behaviour of get_maintainer.pl default
mode to think that if someone touches a file, they're interested in
future changes to it. In this case, it's because we both touched
arch/arm64/include/asm/memory.h back in November 2021, and this
silly script thinks I'll still be interested.

b89ddf4cca43 arm64/bpf: Remove 128MB limit for BPF JIT programs

(The patch was originally developed for Oracle's UEK kernels, hence
why it's got my @oracle.com address, but was later merged upstream.
Interestingly, no one spotted that Alan Maguire's s-o-b should've
been on it, as he was involved in the submission path to mainline.)

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!


Re: [PATCH v4 01/18] arm64: stacktrace: Add shared header for common stack unwinding code

2022-07-18 Thread Russell King (Oracle)
Hi,

Can you please explain why you are targetting my @oracle.com email
address with this patch set?

This causes me problems as I use Outlook's Web interface for that
which doesn't appear to cope with the threading, and most certainly
is only capable of top-reply only which is against Linux kernel email
standards.

Thanks.

On Thu, Jul 14, 2022 at 11:10:10PM -0700, Kalesh Singh wrote:
> In order to reuse the arm64 stack unwinding logic for the nVHE
> hypervisor stack, move the common code to a shared header
> (arch/arm64/include/asm/stacktrace/common.h).
> 
> The nVHE hypervisor cannot safely link against kernel code, so we
> make use of the shared header to avoid duplicated logic later in
> this series.
> 
> Signed-off-by: Kalesh Singh 
> ---
>  arch/arm64/include/asm/stacktrace.h|  35 +--
>  arch/arm64/include/asm/stacktrace/common.h | 105 +
>  arch/arm64/kernel/stacktrace.c |  57 ---
>  3 files changed, 106 insertions(+), 91 deletions(-)
>  create mode 100644 arch/arm64/include/asm/stacktrace/common.h
> 
> diff --git a/arch/arm64/include/asm/stacktrace.h 
> b/arch/arm64/include/asm/stacktrace.h
> index aec9315bf156..79f455b37c84 100644
> --- a/arch/arm64/include/asm/stacktrace.h
> +++ b/arch/arm64/include/asm/stacktrace.h
> @@ -8,52 +8,19 @@
>  #include 
>  #include 
>  #include 
> -#include 
>  #include 
>  
>  #include 
>  #include 
>  #include 
>  
> -enum stack_type {
> - STACK_TYPE_UNKNOWN,
> - STACK_TYPE_TASK,
> - STACK_TYPE_IRQ,
> - STACK_TYPE_OVERFLOW,
> - STACK_TYPE_SDEI_NORMAL,
> - STACK_TYPE_SDEI_CRITICAL,
> - __NR_STACK_TYPES
> -};
> -
> -struct stack_info {
> - unsigned long low;
> - unsigned long high;
> - enum stack_type type;
> -};
> +#include 
>  
>  extern void dump_backtrace(struct pt_regs *regs, struct task_struct *tsk,
>  const char *loglvl);
>  
>  DECLARE_PER_CPU(unsigned long *, irq_stack_ptr);
>  
> -static inline bool on_stack(unsigned long sp, unsigned long size,
> - unsigned long low, unsigned long high,
> - enum stack_type type, struct stack_info *info)
> -{
> - if (!low)
> - return false;
> -
> - if (sp < low || sp + size < sp || sp + size > high)
> - return false;
> -
> - if (info) {
> - info->low = low;
> - info->high = high;
> - info->type = type;
> - }
> - return true;
> -}
> -
>  static inline bool on_irq_stack(unsigned long sp, unsigned long size,
>   struct stack_info *info)
>  {
> diff --git a/arch/arm64/include/asm/stacktrace/common.h 
> b/arch/arm64/include/asm/stacktrace/common.h
> new file mode 100644
> index ..64ae4f6b06fe
> --- /dev/null
> +++ b/arch/arm64/include/asm/stacktrace/common.h
> @@ -0,0 +1,105 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Common arm64 stack unwinder code.
> + *
> + * Copyright (C) 2012 ARM Ltd.
> + */
> +#ifndef __ASM_STACKTRACE_COMMON_H
> +#define __ASM_STACKTRACE_COMMON_H
> +
> +#include 
> +#include 
> +#include 
> +
> +enum stack_type {
> + STACK_TYPE_UNKNOWN,
> + STACK_TYPE_TASK,
> + STACK_TYPE_IRQ,
> + STACK_TYPE_OVERFLOW,
> + STACK_TYPE_SDEI_NORMAL,
> + STACK_TYPE_SDEI_CRITICAL,
> + __NR_STACK_TYPES
> +};
> +
> +struct stack_info {
> + unsigned long low;
> + unsigned long high;
> + enum stack_type type;
> +};
> +
> +/*
> + * A snapshot of a frame record or fp/lr register values, along with some
> + * accounting information necessary for robust unwinding.
> + *
> + * @fp:  The fp value in the frame record (or the real fp)
> + * @pc:  The lr value in the frame record (or the real lr)
> + *
> + * @stacks_done: Stacks which have been entirely unwound, for which it is no
> + *   longer valid to unwind to.
> + *
> + * @prev_fp: The fp that pointed to this frame record, or a synthetic 
> value
> + *   of 0. This is used to ensure that within a stack, each
> + *   subsequent frame record is at an increasing address.
> + * @prev_type:   The type of stack this frame record was on, or a synthetic
> + *   value of STACK_TYPE_UNKNOWN. This is used to detect a
> + *   transition from one stack to another.
> + *
> + * @kr_cur:  When KRETPROBES is selected, holds the kretprobe instance
> + *   associated with the most recently encountered replacement lr
> + *   value.
> + *
> + * @task:The task being unwound.
> + */
> +struct unwind_state {
> + unsigned long fp;
> + unsigned long pc;
> + DECLARE_BITMAP(stacks_done, __NR_STACK_TYPES);
> + unsigned long prev_fp;
> + enum stack_type prev_type;
> +#ifdef CONFIG_KRETPROBES
> + struct llist_node *kr_cur;
> +#endif
> + struct task_struct *task;
> +};
> +
> +static inline bool on_stack(unsigned long sp, unsigned long size,

Re: [PATCH v4 12/18] KVM: arm64: Save protected-nVHE (pKVM) hyp stacktrace

2022-07-18 Thread Marc Zyngier
On Fri, 15 Jul 2022 07:10:21 +0100,
Kalesh Singh  wrote:
> 
> In protected nVHE mode, the host cannot access privately owned hypervisor
> memory. Also, the hypervisor aims to remain simple to reduce the attack
> surface and does not provide any printk support.
> 
> For the above reasons, the approach taken to provide hypervisor stacktraces
> in protected mode is:
>1) Unwind and save the hyp stack addresses in EL2 to a shared buffer
>   with the host (done in this patch).
>2) Delegate the dumping and symbolization of the addresses to the
>   host in EL1 (later patch in the series).
> 
> Signed-off-by: Kalesh Singh 
> ---
>  arch/arm64/include/asm/stacktrace/nvhe.h | 18 ++
>  arch/arm64/kvm/hyp/nvhe/stacktrace.c | 70 
>  2 files changed, 88 insertions(+)
> 
> diff --git a/arch/arm64/include/asm/stacktrace/nvhe.h 
> b/arch/arm64/include/asm/stacktrace/nvhe.h
> index 36cf7858ddd8..456a6ae08433 100644
> --- a/arch/arm64/include/asm/stacktrace/nvhe.h
> +++ b/arch/arm64/include/asm/stacktrace/nvhe.h
> @@ -21,6 +21,22 @@
>  
>  #include 
>  
> +/**
> + * kvm_nvhe_unwind_init - Start an unwind from the given nVHE HYP fp and pc
> + *
> + * @fp : frame pointer at which to start the unwinding.
> + * @pc : program counter at which to start the unwinding.
> + */
> +static __always_inline void kvm_nvhe_unwind_init(struct unwind_state *state,
> +  unsigned long fp,
> +  unsigned long pc)
> +{
> + unwind_init_common(state, NULL);

Huh. Be careful here. This function is only 'inline', which means it
may not be really inlined. We've had tons of similar issues like this
in the past, and although this will not break at runtime anymore, it
will definitely stop the kernel from linking.

Thanks,

M.

-- 
Without deviation from the norm, progress is not possible.


Re: [PATCH v4 11/18] KVM: arm64: Stub implementation of non-protected nVHE HYP stack unwinder

2022-07-18 Thread Marc Zyngier
On Fri, 15 Jul 2022 07:10:20 +0100,
Kalesh Singh  wrote:
> 
> Add stub implementations of non-protected nVHE stack unwinder, for
> building. These are implemented later in this series.
> 
> Signed-off-by: Kalesh Singh 
> ---
>  arch/arm64/include/asm/stacktrace/nvhe.h | 22 ++
>  1 file changed, 22 insertions(+)
> 
> diff --git a/arch/arm64/include/asm/stacktrace/nvhe.h 
> b/arch/arm64/include/asm/stacktrace/nvhe.h
> index 1eac4e57f2ae..36cf7858ddd8 100644
> --- a/arch/arm64/include/asm/stacktrace/nvhe.h
> +++ b/arch/arm64/include/asm/stacktrace/nvhe.h
> @@ -8,6 +8,12 @@
>   *  the HYP memory. The stack is unwinded in EL2 and dumped to a shared
>   *  buffer where the host can read and print the stacktrace.
>   *
> + *   2) Non-protected nVHE mode - the host can directly access the
> + *  HYP stack pages and unwind the HYP stack in EL1. This saves having
> + *  to allocate shared buffers for the host to read the unwinded
> + *  stacktrace.
> + *
> + *
>   * Copyright (C) 2022 Google LLC
>   */
>  #ifndef __ASM_STACKTRACE_NVHE_H
> @@ -53,5 +59,21 @@ static int notrace unwind_next(struct unwind_state *state)
>  NOKPROBE_SYMBOL(unwind_next);
>  #endif   /* CONFIG_PROTECTED_NVHE_STACKTRACE */
>  
> +/*
> + * Non-protected nVHE HYP stack unwinder
> + */
> +#else/* !__KVM_NVHE_HYPERVISOR__ */

I don't get this path. This either represents the VHE hypervisor or
the kernel proper. Which one is it?

> +static inline bool on_overflow_stack(unsigned long sp, unsigned long size,
> +  struct stack_info *info)
> +{
> + return false;
> +}
> +
> +static int notrace unwind_next(struct unwind_state *state)
> +{
> + return 0;
> +}
> +NOKPROBE_SYMBOL(unwind_next);
> +
>  #endif   /* __KVM_NVHE_HYPERVISOR__ */
>  #endif   /* __ASM_STACKTRACE_NVHE_H */

Thanks,

M.

-- 
Without deviation from the norm, progress is not possible.


Re: [PATCH v4 10/18] KVM: arm64: Stub implementation of pKVM HYP stack unwinder

2022-07-18 Thread Marc Zyngier
On Fri, 15 Jul 2022 07:10:19 +0100,
Kalesh Singh  wrote:
> 
> Add some stub implementations of protected nVHE stack unwinder, for
> building. These are implemented later in this series.
> 
> Signed-off-by: Kalesh Singh 
> ---
>  arch/arm64/include/asm/stacktrace/nvhe.h | 57 
>  arch/arm64/kvm/hyp/nvhe/stacktrace.c |  3 +-
>  2 files changed, 58 insertions(+), 2 deletions(-)
>  create mode 100644 arch/arm64/include/asm/stacktrace/nvhe.h
> 
> diff --git a/arch/arm64/include/asm/stacktrace/nvhe.h 
> b/arch/arm64/include/asm/stacktrace/nvhe.h
> new file mode 100644
> index ..1eac4e57f2ae
> --- /dev/null
> +++ b/arch/arm64/include/asm/stacktrace/nvhe.h
> @@ -0,0 +1,57 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * KVM nVHE hypervisor stack tracing support.
> + *
> + * The unwinder implementation depends on the nVHE mode:
> + *
> + *   1) pKVM (protected nVHE) mode - the host cannot directly access
> + *  the HYP memory. The stack is unwinded in EL2 and dumped to a shared
> + *  buffer where the host can read and print the stacktrace.
> + *
> + * Copyright (C) 2022 Google LLC
> + */
> +#ifndef __ASM_STACKTRACE_NVHE_H
> +#define __ASM_STACKTRACE_NVHE_H
> +
> +#include 
> +
> +static inline bool on_accessible_stack(const struct task_struct *tsk,
> +unsigned long sp, unsigned long size,
> +struct stack_info *info)
> +{
> + return false;
> +}
> +
> +/*
> + * Protected nVHE HYP stack unwinder
> + */
> +#ifdef __KVM_NVHE_HYPERVISOR__
> +
> +#ifdef CONFIG_PROTECTED_NVHE_STACKTRACE
> +static inline bool on_overflow_stack(unsigned long sp, unsigned long size,
> +  struct stack_info *info)
> +{
> + return false;
> +}
> +
> +static int notrace unwind_next(struct unwind_state *state)
> +{
> + return 0;
> +}
> +NOKPROBE_SYMBOL(unwind_next);

I find this rather dodgy. It means that every compilation unit that
(indirectly) drags this include file in may end up with an 'unwind_next'
function. At best this will be eliminated at compilation time, but it
may also generate a warning.

Why can't this be made an 'inline' function? At the very least, it
should have a __maybe_unused attribute.
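
For illustration, either of these shapes (alternatives, not both at once)
would avoid emitting an out-of-line unwind_next() in every unit that pulls
in the header; this is only a sketch of the suggestion, not the posted code:

  /* (a) make the stub a static inline, like the other stubs */
  static inline int unwind_next(struct unwind_state *state)
  {
          return 0;
  }

  /* (b) keep it out of line, but silence unused-copy warnings */
  static int notrace __maybe_unused unwind_next(struct unwind_state *state)
  {
          return 0;
  }
  NOKPROBE_SYMBOL(unwind_next);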

> +#else/* !CONFIG_PROTECTED_NVHE_STACKTRACE */
> +static inline bool on_overflow_stack(unsigned long sp, unsigned long size,
> +  struct stack_info *info)
> +{
> + return false;
> +}
> +
> +static int notrace unwind_next(struct unwind_state *state)
> +{
> + return 0;
> +}
> +NOKPROBE_SYMBOL(unwind_next);

Same thing here.

Thanks,

M.

-- 
Without deviation from the norm, progress is not possible.


Re: [PATCH v4 09/18] KVM: arm64: Allocate shared pKVM hyp stacktrace buffers

2022-07-18 Thread Marc Zyngier
On Fri, 15 Jul 2022 07:10:18 +0100,
Kalesh Singh  wrote:
> 
> In protected nVHE mode the host cannot directly access
> hypervisor memory, so we will dump the hypervisor stacktrace
> to a shared buffer with the host.
> 
> The minimum size do the buffer required, assuming the min frame

s/do/for/ ?

> size of [x29, x30] (2 * sizeof(long)), is half the combined size of
> the hypervisor and overflow stacks plus an additional entry to
> delimit the end of the stacktrace.

Let me see if I understand this: the maximum stack size is the
combination of the HYP and overflow stacks, and the smallest possible
stack frame is 128bit (only FP+LR). The buffer thus needs to provide
one 64bit entry per stack frame that fits in the combined stack, plus
one entry as an end marker.

So the resulting size is half of the combined stack size, plus a
single 64bit word. Is this correct?
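
Plugging in the defaults as a sanity check of that reading (assuming 4K pages
and the one-page hyp stack; purely arithmetic, mirroring the macro in the
patch):

  /* OVERFLOW_STACK_SIZE = 4096, PAGE_SIZE = 4096 (one-page hyp stack) */
  /* combined stack    = 4096 + 4096        = 8192 bytes               */
  /* smallest frame    = 2 * sizeof(long)   = 16 bytes (x29 + x30)     */
  /* max frames        = 8192 / 16          = 512                      */
  /* one PC saved each = 512 * sizeof(long) = 4096 bytes (half of 8K)  */
  /* plus end marker   = 4096 + 8           = 4104 bytes               */
  #define NVHE_STACKTRACE_SIZE ((OVERFLOW_STACK_SIZE + PAGE_SIZE) / 2 + sizeof(long))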

> 
> The stacktrace buffers are used later in the series to dump the
> nVHE hypervisor stacktrace when using protected mode.
>
> Signed-off-by: Kalesh Singh 
> ---
>  arch/arm64/include/asm/memory.h  | 7 +++
>  arch/arm64/kvm/hyp/nvhe/stacktrace.c | 4 
>  2 files changed, 11 insertions(+)
> 
> diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
> index 0af70d9abede..28a4893d4b84 100644
> --- a/arch/arm64/include/asm/memory.h
> +++ b/arch/arm64/include/asm/memory.h
> @@ -113,6 +113,13 @@
>  
>  #define OVERFLOW_STACK_SIZE  SZ_4K
>  
> +/*
> + * With the minimum frame size of [x29, x30], exactly half the combined
> + * sizes of the hyp and overflow stacks is needed to save the unwinded
> + * stacktrace; plus an additional entry to delimit the end.
> + */
> +#define NVHE_STACKTRACE_SIZE ((OVERFLOW_STACK_SIZE + PAGE_SIZE) / 2 + 
> sizeof(long))
> +
>  /*
>   * Alignment of kernel segments (e.g. .text, .data).
>   *
> diff --git a/arch/arm64/kvm/hyp/nvhe/stacktrace.c 
> b/arch/arm64/kvm/hyp/nvhe/stacktrace.c
> index a3d5b34e1249..69e65b457f1c 100644
> --- a/arch/arm64/kvm/hyp/nvhe/stacktrace.c
> +++ b/arch/arm64/kvm/hyp/nvhe/stacktrace.c
> @@ -9,3 +9,7 @@
>  
>  DEFINE_PER_CPU(unsigned long [OVERFLOW_STACK_SIZE/sizeof(long)], 
> overflow_stack)
>   __aligned(16);
> +
> +#ifdef CONFIG_PROTECTED_NVHE_STACKTRACE
> +DEFINE_PER_CPU(unsigned long [NVHE_STACKTRACE_SIZE/sizeof(long)], 
> pkvm_stacktrace);
> +#endif /* CONFIG_PROTECTED_NVHE_STACKTRACE */

OK, so the allocation exists even if KVM is not running in protected
mode. I guess this is OK for now, but definitely reinforces my request
that this is only there when compiled for debug mode.

Thanks,

M.

-- 
Without deviation from the norm, progress is not possible.


Re: [PATCH v4 08/18] KVM: arm64: Add PROTECTED_NVHE_STACKTRACE Kconfig

2022-07-18 Thread Marc Zyngier
[- Drew and android-mm, as both addresses bounce]

On Fri, 15 Jul 2022 07:10:17 +0100,
Kalesh Singh  wrote:
> 
> This can be used to disable stacktrace for the protected KVM
> nVHE hypervisor, in order to save on the associated memory usage.
> 
> This option is disabled by default, since protected KVM is not widely
> used on platforms other than Android currently.
> 
> Signed-off-by: Kalesh Singh 
> ---
>  arch/arm64/kvm/Kconfig | 15 +++
>  1 file changed, 15 insertions(+)
> 
> diff --git a/arch/arm64/kvm/Kconfig b/arch/arm64/kvm/Kconfig
> index 8a5fbbf084df..1edab6f8a3b8 100644
> --- a/arch/arm64/kvm/Kconfig
> +++ b/arch/arm64/kvm/Kconfig
> @@ -46,6 +46,21 @@ menuconfig KVM
>  
> If unsure, say N.
>  
> +config PROTECTED_NVHE_STACKTRACE
> + bool "Protected KVM hypervisor stacktraces"
> + depends on KVM
> + default n
> + help
> +   Say Y here to enable pKVM hypervisor stacktraces on hyp_panic()
> +
> +   If you are not using protected nVHE (pKVM), say N.
> +
> +   If using protected nVHE mode, but cannot afford the associated
> +   memory cost (less than 0.75 page per CPU) of pKVM stacktraces,
> +   say N.
> +
> +   If unsure, say N.
> +

Can we make this depend on NVHE_EL2_DEBUG instead? I'd like to keep
the disclosing of EL2 information in protected mode a strict debug
feature.

>  config NVHE_EL2_DEBUG
>   bool "Debug mode for non-VHE EL2 object"
>   depends on KVM

Thanks,

M.

-- 
Without deviation from the norm, progress is not possible.