Re: [kvm-unit-tests PATCH v3 3/8] migration: use a more robust way to wait for background job
On 09/02/2024 08.01, Nicholas Piggin wrote: Starting a pipeline of jobs in the background does not seem to have a simple way to reliably find the pid of a particular process in the pipeline (because not all processes are started when the shell continues to execute). The way PID of QEMU is derived can result in a failure waiting on a PID that is not running. This is easier to hit with subsequent multiple-migration support. Changing this to use $! by swapping the pipeline for a fifo is more robust. Signed-off-by: Nicholas Piggin --- scripts/arch-run.bash | 15 +++ 1 file changed, 11 insertions(+), 4 deletions(-) diff --git a/scripts/arch-run.bash b/scripts/arch-run.bash index 1e903e83..3689d7c2 100644 --- a/scripts/arch-run.bash +++ b/scripts/arch-run.bash @@ -130,19 +130,22 @@ run_migration () fi trap 'trap - TERM ; kill 0 ; exit 2' INT TERM - trap 'rm -f ${migout1} ${migsock} ${qmp1} ${qmp2} ${fifo}' RETURN EXIT + trap 'rm -f ${migout1} ${migout_fifo1} ${migsock} ${qmp1} ${qmp2} ${fifo}' RETURN EXIT migsock=$(mktemp -u -t mig-helper-socket.XX) migout1=$(mktemp -t mig-helper-stdout1.XX) + migout_fifo1=$(mktemp -u -t mig-helper-fifo-stdout1.XX) qmp1=$(mktemp -u -t mig-helper-qmp1.XX) qmp2=$(mktemp -u -t mig-helper-qmp2.XX) fifo=$(mktemp -u -t mig-helper-fifo.XX) qmpout1=/dev/null qmpout2=/dev/null + mkfifo ${migout_fifo1} eval "$@" -chardev socket,id=mon1,path=${qmp1},server=on,wait=off \ - -mon chardev=mon1,mode=control | tee ${migout1} & - live_pid=`jobs -l %+ | grep "eval" | awk '{print$2}'` + -mon chardev=mon1,mode=control > ${migout_fifo1} & + live_pid=$! + cat ${migout_fifo1} | tee ${migout1} & # We have to use cat to open the named FIFO, because named FIFO's, unlike # pipes, will block on open() until the other end is also opened, and that @@ -150,7 +153,7 @@ run_migration () mkfifo ${fifo} eval "$@" -chardev socket,id=mon2,path=${qmp2},server=on,wait=off \ -mon chardev=mon2,mode=control -incoming unix:${migsock} < <(cat ${fifo}) & - incoming_pid=`jobs -l %+ | awk '{print$2}'` + incoming_pid=$! # The test must prompt the user to migrate, so wait for the "migrate" keyword while ! grep -q -i "Now migrate the VM" < ${migout1} ; do @@ -164,6 +167,10 @@ run_migration () sleep 1 done + # Wait until the destination has created the incoming and qmp sockets + while ! [ -S ${migsock} ] ; do sleep 0.1 ; done + while ! [ -S ${qmp2} ] ; do sleep 0.1 ; done + qmp ${qmp1} '"migrate", "arguments": { "uri": "unix:'${migsock}'" }' > ${qmpout1} # Wait for the migration to complete Reviewed-by: Thomas Huth
Re: [kvm-unit-tests PATCH v3 2/8] arch-run: Clean up initrd cleanup
On 09/02/2024 08.01, Nicholas Piggin wrote: Rather than put a big script into the trap handler, have it call a function. Signed-off-by: Nicholas Piggin --- scripts/arch-run.bash | 13 - 1 file changed, 12 insertions(+), 1 deletion(-) diff --git a/scripts/arch-run.bash b/scripts/arch-run.bash index 11d47a85..1e903e83 100644 --- a/scripts/arch-run.bash +++ b/scripts/arch-run.bash @@ -269,10 +269,21 @@ search_qemu_binary () export PATH=$save_path } +initrd_cleanup () +{ + rm -f $KVM_UNIT_TESTS_ENV + if [ "$KVM_UNIT_TESTS_ENV_OLD" ]; then + export KVM_UNIT_TESTS_ENV="$KVM_UNIT_TESTS_ENV_OLD" + else + unset KVM_UNIT_TESTS_ENV + unset KVM_UNIT_TESTS_ENV_OLD + fi +} Looking at the original code below, shouldn't this rather unset KVM_UNIT_TESTS_ENV_OLD after the "fi" statement? Thomas initrd_create () { if [ "$ENVIRON_DEFAULT" = "yes" ]; then - trap_exit_push 'rm -f $KVM_UNIT_TESTS_ENV; [ "$KVM_UNIT_TESTS_ENV_OLD" ] && export KVM_UNIT_TESTS_ENV="$KVM_UNIT_TESTS_ENV_OLD" || unset KVM_UNIT_TESTS_ENV; unset KVM_UNIT_TESTS_ENV_OLD' + trap_exit_push 'initrd_cleanup' [ -f "$KVM_UNIT_TESTS_ENV" ] && export KVM_UNIT_TESTS_ENV_OLD="$KVM_UNIT_TESTS_ENV" export KVM_UNIT_TESTS_ENV=$(mktemp) env_params
Re: [kvm-unit-tests PATCH v3 1/8] arch-run: Fix TRAP handler recursion to remove temporary files properly
On 09/02/2024 08.01, Nicholas Piggin wrote: Migration files were not being removed when the QEMU process is interrupted (e.g., with ^C). This is becaus the SIGINT propagates to the bash TRAP handler, which recursively TRAPs due to the 'kill 0' in the handler. This eventually crashes bash. This can be observed by interrupting a long-running test program that is run with MIGRATION=yes, /tmp/mig-helper-* files remain afterwards. Removing TRAP recursion solves this problem and allows the EXIT handler to run and clean up the files. This also moves the trap handler before temp file creation, and expands the name variables at trap-time rather than install-time, which closes the small race between creation trap handler install. Reviewed-by: Thomas Huth Signed-off-by: Nicholas Piggin --- scripts/arch-run.bash | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/scripts/arch-run.bash b/scripts/arch-run.bash index d0864360..11d47a85 100644 --- a/scripts/arch-run.bash +++ b/scripts/arch-run.bash @@ -129,6 +129,9 @@ run_migration () return 77 fi + trap 'trap - TERM ; kill 0 ; exit 2' INT TERM + trap 'rm -f ${migout1} ${migsock} ${qmp1} ${qmp2} ${fifo}' RETURN EXIT + migsock=$(mktemp -u -t mig-helper-socket.XX) migout1=$(mktemp -t mig-helper-stdout1.XX) qmp1=$(mktemp -u -t mig-helper-qmp1.XX) @@ -137,9 +140,6 @@ run_migration () qmpout1=/dev/null qmpout2=/dev/null - trap 'kill 0; exit 2' INT TERM - trap 'rm -f ${migout1} ${migsock} ${qmp1} ${qmp2} ${fifo}' RETURN EXIT - eval "$@" -chardev socket,id=mon1,path=${qmp1},server=on,wait=off \ -mon chardev=mon1,mode=control | tee ${migout1} & live_pid=`jobs -l %+ | grep "eval" | awk '{print$2}'` @@ -209,11 +209,11 @@ run_panic () return 77 fi - qmp=$(mktemp -u -t panic-qmp.XX) - - trap 'kill 0; exit 2' INT TERM + trap 'trap - TERM ; kill 0 ; exit 2' INT TERM trap 'rm -f ${qmp}' RETURN EXIT + qmp=$(mktemp -u -t panic-qmp.XX) + # start VM stopped so we don't miss any events eval "$@" -chardev socket,id=mon1,path=${qmp},server=on,wait=off \ -mon chardev=mon1,mode=control -S & Reviewed-by: Thomas Huth
[kvm-unit-tests PATCH v3 8/8] migration: add a migration selftest
Add a selftest for migration support in guest library and test harness code. It performs migrations a tight loop to irritate races and bugs in the test harness code. Include the test in arm, s390, powerpc. Acked-by: Claudio Imbrenda (s390x) Signed-off-by: Nicholas Piggin --- This has flushed out several bugs in developing the multi migration test harness code already. Thanks, Nick arm/Makefile.common | 1 + arm/selftest-migration.c | 1 + arm/unittests.cfg| 6 ++ common/selftest-migration.c | 34 ++ powerpc/Makefile.common | 1 + powerpc/selftest-migration.c | 1 + powerpc/unittests.cfg| 4 s390x/Makefile | 1 + s390x/selftest-migration.c | 1 + s390x/unittests.cfg | 4 10 files changed, 54 insertions(+) create mode 12 arm/selftest-migration.c create mode 100644 common/selftest-migration.c create mode 12 powerpc/selftest-migration.c create mode 12 s390x/selftest-migration.c diff --git a/arm/Makefile.common b/arm/Makefile.common index f828dbe0..f107c478 100644 --- a/arm/Makefile.common +++ b/arm/Makefile.common @@ -5,6 +5,7 @@ # tests-common = $(TEST_DIR)/selftest.$(exe) +tests-common += $(TEST_DIR)/selftest-migration.$(exe) tests-common += $(TEST_DIR)/spinlock-test.$(exe) tests-common += $(TEST_DIR)/pci-test.$(exe) tests-common += $(TEST_DIR)/pmu.$(exe) diff --git a/arm/selftest-migration.c b/arm/selftest-migration.c new file mode 12 index ..bd1eb266 --- /dev/null +++ b/arm/selftest-migration.c @@ -0,0 +1 @@ +../common/selftest-migration.c \ No newline at end of file diff --git a/arm/unittests.cfg b/arm/unittests.cfg index fe601cbb..1ffd9a82 100644 --- a/arm/unittests.cfg +++ b/arm/unittests.cfg @@ -55,6 +55,12 @@ smp = $MAX_SMP extra_params = -append 'smp' groups = selftest +# Test migration +[selftest-migration] +file = selftest-migration.flat +groups = selftest migration + +arch = arm64 # Test PCI emulation [pci-test] file = pci-test.flat diff --git a/common/selftest-migration.c b/common/selftest-migration.c new file mode 100644 index ..f70c505f --- /dev/null +++ b/common/selftest-migration.c @@ -0,0 +1,34 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Machine independent migration tests + * + * This is just a very simple test that is intended to stress the migration + * support in the test harness. This could be expanded to test more guest + * library code, but architecture-specific tests should be used to test + * migration of tricky machine state. + */ +#include +#include + +#if defined(__arm__) || defined(__aarch64__) +/* arm can only call getchar 15 times */ +#define NR_MIGRATIONS 15 +#else +#define NR_MIGRATIONS 100 +#endif + +int main(int argc, char **argv) +{ + int i = 0; + + report_prefix_push("migration"); + + for (i = 0; i < NR_MIGRATIONS; i++) + migrate_quiet(); + + report(true, "simple harness stress test"); + + report_prefix_pop(); + + return report_summary(); +} diff --git a/powerpc/Makefile.common b/powerpc/Makefile.common index eb88398d..da4a7bbb 100644 --- a/powerpc/Makefile.common +++ b/powerpc/Makefile.common @@ -6,6 +6,7 @@ tests-common = \ $(TEST_DIR)/selftest.elf \ + $(TEST_DIR)/selftest-migration.elf \ $(TEST_DIR)/spapr_hcall.elf \ $(TEST_DIR)/rtas.elf \ $(TEST_DIR)/emulator.elf \ diff --git a/powerpc/selftest-migration.c b/powerpc/selftest-migration.c new file mode 12 index ..bd1eb266 --- /dev/null +++ b/powerpc/selftest-migration.c @@ -0,0 +1 @@ +../common/selftest-migration.c \ No newline at end of file diff --git a/powerpc/unittests.cfg b/powerpc/unittests.cfg index e71140aa..7ce57de0 100644 --- a/powerpc/unittests.cfg +++ b/powerpc/unittests.cfg @@ -36,6 +36,10 @@ smp = 2 extra_params = -m 256 -append 'setup smp=2 mem=256' groups = selftest +[selftest-migration] +file = selftest-migration.elf +groups = selftest migration + [spapr_hcall] file = spapr_hcall.elf diff --git a/s390x/Makefile b/s390x/Makefile index b72f7578..344d46d6 100644 --- a/s390x/Makefile +++ b/s390x/Makefile @@ -1,4 +1,5 @@ tests = $(TEST_DIR)/selftest.elf +tests += $(TEST_DIR)/selftest-migration.elf tests += $(TEST_DIR)/intercept.elf tests += $(TEST_DIR)/emulator.elf tests += $(TEST_DIR)/sieve.elf diff --git a/s390x/selftest-migration.c b/s390x/selftest-migration.c new file mode 12 index ..bd1eb266 --- /dev/null +++ b/s390x/selftest-migration.c @@ -0,0 +1 @@ +../common/selftest-migration.c \ No newline at end of file diff --git a/s390x/unittests.cfg b/s390x/unittests.cfg index f5024b6e..a7ad522c 100644 --- a/s390x/unittests.cfg +++ b/s390x/unittests.cfg @@ -24,6 +24,10 @@ groups = selftest # please keep the kernel cmdline in sync with $(TEST_DIR)/selftest.parmfile extra_params = -append 'test 123' +[selftest-migration] +file = selftest-migration.elf +groups = selftest migration + [intercept]
[kvm-unit-tests PATCH v3 7/8] Add common/ directory for architecture-independent tests
x86/sieve.c is used by s390x, arm, and riscv via symbolic link. Make a new directory common/ for architecture-independent tests and move sieve.c here. Signed-off-by: Nicholas Piggin --- arm/sieve.c| 2 +- common/sieve.c | 51 + riscv/sieve.c | 2 +- s390x/sieve.c | 2 +- x86/sieve.c| 52 +- 5 files changed, 55 insertions(+), 54 deletions(-) create mode 100644 common/sieve.c mode change 100644 => 12 x86/sieve.c diff --git a/arm/sieve.c b/arm/sieve.c index 8f14a5c3..fe299f30 12 --- a/arm/sieve.c +++ b/arm/sieve.c @@ -1 +1 @@ -../x86/sieve.c \ No newline at end of file +../common/sieve.c \ No newline at end of file diff --git a/common/sieve.c b/common/sieve.c new file mode 100644 index ..8150f2d9 --- /dev/null +++ b/common/sieve.c @@ -0,0 +1,51 @@ +#include "alloc.h" +#include "libcflat.h" + +static int sieve(char* data, int size) +{ +int i, j, r = 0; + +for (i = 0; i < size; ++i) + data[i] = 1; + +data[0] = data[1] = 0; + +for (i = 2; i < size; ++i) + if (data[i]) { + ++r; + for (j = i*2; j < size; j += i) + data[j] = 0; + } +return r; +} + +static void test_sieve(const char *msg, char *data, int size) +{ +int r; + +printf("%s:", msg); +r = sieve(data, size); +printf("%d out of %d\n", r, size); +} + +#define STATIC_SIZE 100 +#define VSIZE 1 +char static_data[STATIC_SIZE]; + +int main(void) +{ +void *v; +int i; + +printf("starting sieve\n"); +test_sieve("static", static_data, STATIC_SIZE); +setup_vm(); +test_sieve("mapped", static_data, STATIC_SIZE); +for (i = 0; i < 3; ++i) { + v = malloc(VSIZE); + test_sieve("virtual", v, VSIZE); + free(v); +} + +return 0; +} diff --git a/riscv/sieve.c b/riscv/sieve.c index 8f14a5c3..fe299f30 12 --- a/riscv/sieve.c +++ b/riscv/sieve.c @@ -1 +1 @@ -../x86/sieve.c \ No newline at end of file +../common/sieve.c \ No newline at end of file diff --git a/s390x/sieve.c b/s390x/sieve.c index 8f14a5c3..fe299f30 12 --- a/s390x/sieve.c +++ b/s390x/sieve.c @@ -1 +1 @@ -../x86/sieve.c \ No newline at end of file +../common/sieve.c \ No newline at end of file diff --git a/x86/sieve.c b/x86/sieve.c deleted file mode 100644 index 8150f2d9.. --- a/x86/sieve.c +++ /dev/null @@ -1,51 +0,0 @@ -#include "alloc.h" -#include "libcflat.h" - -static int sieve(char* data, int size) -{ -int i, j, r = 0; - -for (i = 0; i < size; ++i) - data[i] = 1; - -data[0] = data[1] = 0; - -for (i = 2; i < size; ++i) - if (data[i]) { - ++r; - for (j = i*2; j < size; j += i) - data[j] = 0; - } -return r; -} - -static void test_sieve(const char *msg, char *data, int size) -{ -int r; - -printf("%s:", msg); -r = sieve(data, size); -printf("%d out of %d\n", r, size); -} - -#define STATIC_SIZE 100 -#define VSIZE 1 -char static_data[STATIC_SIZE]; - -int main(void) -{ -void *v; -int i; - -printf("starting sieve\n"); -test_sieve("static", static_data, STATIC_SIZE); -setup_vm(); -test_sieve("mapped", static_data, STATIC_SIZE); -for (i = 0; i < 3; ++i) { - v = malloc(VSIZE); - test_sieve("virtual", v, VSIZE); - free(v); -} - -return 0; -} diff --git a/x86/sieve.c b/x86/sieve.c new file mode 12 index ..fe299f30 --- /dev/null +++ b/x86/sieve.c @@ -0,0 +1 @@ +../common/sieve.c \ No newline at end of file -- 2.42.0
[kvm-unit-tests PATCH v3 6/8] migration: Add quiet migration support
Console output required to support migration becomes quite noisy when doing lots of migrations. Provide a migrate_quiet() call that suppresses console output and doesn't log a message. Signed-off-by: Nicholas Piggin --- lib/migrate.c | 12 lib/migrate.h | 1 + scripts/arch-run.bash | 4 ++-- 3 files changed, 15 insertions(+), 2 deletions(-) diff --git a/lib/migrate.c b/lib/migrate.c index b7721659..4e0ab516 100644 --- a/lib/migrate.c +++ b/lib/migrate.c @@ -18,6 +18,18 @@ void migrate(void) report_info("Migration complete"); } +/* + * Like migrate() but supporess output and logs, useful for intensive + * migration stress testing without polluting logs. Test cases should + * provide relevant information about migration in failure reports. + */ +void migrate_quiet(void) +{ + puts("Now migrate the VM (quiet)\n"); + (void)getchar(); +} + + /* * Initiate migration and wait for it to complete. * If this function is called more than once, it is a no-op. diff --git a/lib/migrate.h b/lib/migrate.h index 2af06a72..95b9102b 100644 --- a/lib/migrate.h +++ b/lib/migrate.h @@ -7,4 +7,5 @@ */ void migrate(void); +void migrate_quiet(void); void migrate_once(void); diff --git a/scripts/arch-run.bash b/scripts/arch-run.bash index 0b45eb61..29cf9b0c 100644 --- a/scripts/arch-run.bash +++ b/scripts/arch-run.bash @@ -152,7 +152,7 @@ run_migration () -chardev socket,id=mon,path=${src_qmp},server=on,wait=off \ -mon chardev=mon,mode=control > ${src_outfifo} & live_pid=$! - cat ${src_outfifo} | tee ${src_out} & + cat ${src_outfifo} | tee ${src_out} | grep -v "Now migrate the VM (quiet)" & # The test must prompt the user to migrate, so wait for the "migrate" # keyword @@ -200,7 +200,7 @@ do_migration () -mon chardev=mon,mode=control -incoming unix:${dst_incoming} \ < <(cat ${dst_infifo}) > ${dst_outfifo} & incoming_pid=$! - cat ${dst_outfifo} | tee ${dst_out} & + cat ${dst_outfifo} | tee ${dst_out} | grep -v "Now migrate the VM (quiet)" & # The test must prompt the user to migrate, so wait for the "migrate" keyword while ! grep -q -i "Now migrate the VM" < ${src_out} ; do -- 2.42.0
[kvm-unit-tests PATCH v3 5/8] arch-run: rename migration variables
Using 1 and 2 for source and destination is confusing, particularly now with multiple migrations that flip between them. Do a rename pass to tidy things up. Signed-off-by: Nicholas Piggin --- scripts/arch-run.bash | 115 +- 1 file changed, 58 insertions(+), 57 deletions(-) diff --git a/scripts/arch-run.bash b/scripts/arch-run.bash index a914ba17..0b45eb61 100644 --- a/scripts/arch-run.bash +++ b/scripts/arch-run.bash @@ -132,34 +132,34 @@ run_migration () migcmdline=$@ trap 'trap - TERM ; kill 0 ; exit 2' INT TERM - trap 'rm -f ${migout1} ${migout2} ${migout_fifo1} ${migout_fifo2} ${migsock} ${qmp1} ${qmp2} ${fifo}' RETURN EXIT - - migsock=$(mktemp -u -t mig-helper-socket.XX) - migout1=$(mktemp -t mig-helper-stdout1.XX) - migout_fifo1=$(mktemp -u -t mig-helper-fifo-stdout1.XX) - migout2=$(mktemp -t mig-helper-stdout2.XX) - migout_fifo2=$(mktemp -u -t mig-helper-fifo-stdout2.XX) - qmp1=$(mktemp -u -t mig-helper-qmp1.XX) - qmp2=$(mktemp -u -t mig-helper-qmp2.XX) - fifo=$(mktemp -u -t mig-helper-fifo.XX) - qmpout1=/dev/null - qmpout2=/dev/null - - mkfifo ${migout_fifo1} - mkfifo ${migout_fifo2} + trap 'rm -f ${src_out} ${dst_out} ${src_outfifo} ${dst_outfifo} ${dst_incoming} ${src_qmp} ${dst_qmp} ${dst_infifo}' RETURN EXIT + + dst_incoming=$(mktemp -u -t mig-helper-socket-incoming.XX) + src_out=$(mktemp -t mig-helper-stdout1.XX) + src_outfifo=$(mktemp -u -t mig-helper-fifo-stdout1.XX) + dst_out=$(mktemp -t mig-helper-stdout2.XX) + dst_outfifo=$(mktemp -u -t mig-helper-fifo-stdout2.XX) + src_qmp=$(mktemp -u -t mig-helper-qmp1.XX) + dst_qmp=$(mktemp -u -t mig-helper-qmp2.XX) + dst_infifo=$(mktemp -u -t mig-helper-fifo-stdin.XX) + src_qmpout=/dev/null + dst_qmpout=/dev/null + + mkfifo ${src_outfifo} + mkfifo ${dst_outfifo} eval "$migcmdline" \ - -chardev socket,id=mon1,path=${qmp1},server=on,wait=off \ - -mon chardev=mon1,mode=control > ${migout_fifo1} & + -chardev socket,id=mon,path=${src_qmp},server=on,wait=off \ + -mon chardev=mon,mode=control > ${src_outfifo} & live_pid=$! - cat ${migout_fifo1} | tee ${migout1} & + cat ${src_outfifo} | tee ${src_out} & # The test must prompt the user to migrate, so wait for the "migrate" # keyword - while ! grep -q -i "Now migrate the VM" < ${migout1} ; do + while ! grep -q -i "Now migrate the VM" < ${src_out} ; do if ! ps -p ${live_pid} > /dev/null ; then echo "ERROR: Test exit before migration point." >&2 - qmp ${qmp1} '"quit"'> ${qmpout1} 2>/dev/null + qmp ${src_qmp} '"quit"'> ${src_qmpout} 2>/dev/null return 3 fi sleep 0.1 @@ -172,7 +172,7 @@ run_migration () while ps -p ${live_pid} > /dev/null ; do # Wait for EXIT or further migrations - if ! grep -q -i "Now migrate the VM" < ${migout1} ; then + if ! grep -q -i "Now migrate the VM" < ${src_out} ; then sleep 0.1 else do_migration || return $? @@ -194,79 +194,80 @@ do_migration () # We have to use cat to open the named FIFO, because named FIFO's, # unlike pipes, will block on open() until the other end is also # opened, and that totally breaks QEMU... - mkfifo ${fifo} + mkfifo ${dst_infifo} eval "$migcmdline" \ - -chardev socket,id=mon2,path=${qmp2},server=on,wait=off \ - -mon chardev=mon2,mode=control -incoming unix:${migsock} \ - < <(cat ${fifo}) > ${migout_fifo2} & + -chardev socket,id=mon,path=${dst_qmp},server=on,wait=off \ + -mon chardev=mon,mode=control -incoming unix:${dst_incoming} \ + < <(cat ${dst_infifo}) > ${dst_outfifo} & incoming_pid=$! - cat ${migout_fifo2} | tee ${migout2} & + cat ${dst_outfifo} | tee ${dst_out} & # The test must prompt the user to migrate, so wait for the "migrate" keyword - while ! grep -q -i "Now migrate the VM" < ${migout1} ; do + while ! grep -q -i "Now migrate the VM" < ${src_out} ; do if ! ps -p ${live_pid} > /dev/null ; then echo "ERROR: Test exit before migration point." >&2 - echo > ${fifo} - qmp ${qmp1} '"quit"'> ${qmpout1} 2>/dev/null - qmp ${qmp2} '"quit"'> ${qmpout2} 2>/dev/null + echo > ${dst_infifo} + qmp ${src_qmp} '"quit"'> ${src_qmpout} 2>/dev/nul
[kvm-unit-tests PATCH v3 4/8] migration: Support multiple migrations
Support multiple migrations by flipping dest file/socket variables to source after the migration is complete, ready to start again. A new destination is created if the test outputs the migrate line again. Test cases may now switch to calling migrate() one or more times. Signed-off-by: Nicholas Piggin --- lib/migrate.c | 8 ++-- lib/migrate.h | 1 + scripts/arch-run.bash | 93 +-- 3 files changed, 85 insertions(+), 17 deletions(-) diff --git a/lib/migrate.c b/lib/migrate.c index 527e63ae..b7721659 100644 --- a/lib/migrate.c +++ b/lib/migrate.c @@ -8,8 +8,10 @@ #include #include "migrate.h" -/* static for now since we only support migrating exactly once per test. */ -static void migrate(void) +/* + * Initiate migration and wait for it to complete. + */ +void migrate(void) { puts("Now migrate the VM, then press a key to continue...\n"); (void)getchar(); @@ -19,8 +21,6 @@ static void migrate(void) /* * Initiate migration and wait for it to complete. * If this function is called more than once, it is a no-op. - * Since migrate_cmd can only migrate exactly once this function can - * simplify the control flow, especially when skipping tests. */ void migrate_once(void) { diff --git a/lib/migrate.h b/lib/migrate.h index 3c94e6af..2af06a72 100644 --- a/lib/migrate.h +++ b/lib/migrate.h @@ -6,4 +6,5 @@ * Author: Nico Boehr */ +void migrate(void); void migrate_once(void); diff --git a/scripts/arch-run.bash b/scripts/arch-run.bash index 3689d7c2..a914ba17 100644 --- a/scripts/arch-run.bash +++ b/scripts/arch-run.bash @@ -129,12 +129,16 @@ run_migration () return 77 fi + migcmdline=$@ + trap 'trap - TERM ; kill 0 ; exit 2' INT TERM - trap 'rm -f ${migout1} ${migout_fifo1} ${migsock} ${qmp1} ${qmp2} ${fifo}' RETURN EXIT + trap 'rm -f ${migout1} ${migout2} ${migout_fifo1} ${migout_fifo2} ${migsock} ${qmp1} ${qmp2} ${fifo}' RETURN EXIT migsock=$(mktemp -u -t mig-helper-socket.XX) migout1=$(mktemp -t mig-helper-stdout1.XX) migout_fifo1=$(mktemp -u -t mig-helper-fifo-stdout1.XX) + migout2=$(mktemp -t mig-helper-stdout2.XX) + migout_fifo2=$(mktemp -u -t mig-helper-fifo-stdout2.XX) qmp1=$(mktemp -u -t mig-helper-qmp1.XX) qmp2=$(mktemp -u -t mig-helper-qmp2.XX) fifo=$(mktemp -u -t mig-helper-fifo.XX) @@ -142,18 +146,61 @@ run_migration () qmpout2=/dev/null mkfifo ${migout_fifo1} - eval "$@" -chardev socket,id=mon1,path=${qmp1},server=on,wait=off \ + mkfifo ${migout_fifo2} + + eval "$migcmdline" \ + -chardev socket,id=mon1,path=${qmp1},server=on,wait=off \ -mon chardev=mon1,mode=control > ${migout_fifo1} & live_pid=$! cat ${migout_fifo1} | tee ${migout1} & - # We have to use cat to open the named FIFO, because named FIFO's, unlike - # pipes, will block on open() until the other end is also opened, and that - # totally breaks QEMU... + # The test must prompt the user to migrate, so wait for the "migrate" + # keyword + while ! grep -q -i "Now migrate the VM" < ${migout1} ; do + if ! ps -p ${live_pid} > /dev/null ; then + echo "ERROR: Test exit before migration point." >&2 + qmp ${qmp1} '"quit"'> ${qmpout1} 2>/dev/null + return 3 + fi + sleep 0.1 + done + + # This starts the first source QEMU in advance of the test reaching the + # migration point, since we expect at least one migration. Subsequent + # sources are started as the test hits migrate keywords. + do_migration || return $? + + while ps -p ${live_pid} > /dev/null ; do + # Wait for EXIT or further migrations + if ! grep -q -i "Now migrate the VM" < ${migout1} ; then + sleep 0.1 + else + do_migration || return $? + fi + done + + wait ${live_pid} + ret=$? + + while (( $(jobs -r | wc -l) > 0 )); do + sleep 0.1 + done + + return $ret +} + +do_migration () +{ + # We have to use cat to open the named FIFO, because named FIFO's, + # unlike pipes, will block on open() until the other end is also + # opened, and that totally breaks QEMU... mkfifo ${fifo} - eval "$@" -chardev socket,id=mon2,path=${qmp2},server=on,wait=off \ - -mon chardev=mon2,mode=control -incoming unix:${migsock} < <(cat ${fifo}) & + eval "$migcmdline" \ + -chardev socket,id=mon2,path=${qmp2},server=on,wait=off \ + -mon chardev=mon2,mode=control -incoming unix:${migsock} \ + < <(cat ${fifo}) > ${migout_fifo2} & incoming_pid=$! +
[kvm-unit-tests PATCH v3 3/8] migration: use a more robust way to wait for background job
Starting a pipeline of jobs in the background does not seem to have a simple way to reliably find the pid of a particular process in the pipeline (because not all processes are started when the shell continues to execute). The way PID of QEMU is derived can result in a failure waiting on a PID that is not running. This is easier to hit with subsequent multiple-migration support. Changing this to use $! by swapping the pipeline for a fifo is more robust. Signed-off-by: Nicholas Piggin --- scripts/arch-run.bash | 15 +++ 1 file changed, 11 insertions(+), 4 deletions(-) diff --git a/scripts/arch-run.bash b/scripts/arch-run.bash index 1e903e83..3689d7c2 100644 --- a/scripts/arch-run.bash +++ b/scripts/arch-run.bash @@ -130,19 +130,22 @@ run_migration () fi trap 'trap - TERM ; kill 0 ; exit 2' INT TERM - trap 'rm -f ${migout1} ${migsock} ${qmp1} ${qmp2} ${fifo}' RETURN EXIT + trap 'rm -f ${migout1} ${migout_fifo1} ${migsock} ${qmp1} ${qmp2} ${fifo}' RETURN EXIT migsock=$(mktemp -u -t mig-helper-socket.XX) migout1=$(mktemp -t mig-helper-stdout1.XX) + migout_fifo1=$(mktemp -u -t mig-helper-fifo-stdout1.XX) qmp1=$(mktemp -u -t mig-helper-qmp1.XX) qmp2=$(mktemp -u -t mig-helper-qmp2.XX) fifo=$(mktemp -u -t mig-helper-fifo.XX) qmpout1=/dev/null qmpout2=/dev/null + mkfifo ${migout_fifo1} eval "$@" -chardev socket,id=mon1,path=${qmp1},server=on,wait=off \ - -mon chardev=mon1,mode=control | tee ${migout1} & - live_pid=`jobs -l %+ | grep "eval" | awk '{print$2}'` + -mon chardev=mon1,mode=control > ${migout_fifo1} & + live_pid=$! + cat ${migout_fifo1} | tee ${migout1} & # We have to use cat to open the named FIFO, because named FIFO's, unlike # pipes, will block on open() until the other end is also opened, and that @@ -150,7 +153,7 @@ run_migration () mkfifo ${fifo} eval "$@" -chardev socket,id=mon2,path=${qmp2},server=on,wait=off \ -mon chardev=mon2,mode=control -incoming unix:${migsock} < <(cat ${fifo}) & - incoming_pid=`jobs -l %+ | awk '{print$2}'` + incoming_pid=$! # The test must prompt the user to migrate, so wait for the "migrate" keyword while ! grep -q -i "Now migrate the VM" < ${migout1} ; do @@ -164,6 +167,10 @@ run_migration () sleep 1 done + # Wait until the destination has created the incoming and qmp sockets + while ! [ -S ${migsock} ] ; do sleep 0.1 ; done + while ! [ -S ${qmp2} ] ; do sleep 0.1 ; done + qmp ${qmp1} '"migrate", "arguments": { "uri": "unix:'${migsock}'" }' > ${qmpout1} # Wait for the migration to complete -- 2.42.0
[kvm-unit-tests PATCH v3 2/8] arch-run: Clean up initrd cleanup
Rather than put a big script into the trap handler, have it call a function. Signed-off-by: Nicholas Piggin --- scripts/arch-run.bash | 13 - 1 file changed, 12 insertions(+), 1 deletion(-) diff --git a/scripts/arch-run.bash b/scripts/arch-run.bash index 11d47a85..1e903e83 100644 --- a/scripts/arch-run.bash +++ b/scripts/arch-run.bash @@ -269,10 +269,21 @@ search_qemu_binary () export PATH=$save_path } +initrd_cleanup () +{ + rm -f $KVM_UNIT_TESTS_ENV + if [ "$KVM_UNIT_TESTS_ENV_OLD" ]; then + export KVM_UNIT_TESTS_ENV="$KVM_UNIT_TESTS_ENV_OLD" + else + unset KVM_UNIT_TESTS_ENV + unset KVM_UNIT_TESTS_ENV_OLD + fi +} + initrd_create () { if [ "$ENVIRON_DEFAULT" = "yes" ]; then - trap_exit_push 'rm -f $KVM_UNIT_TESTS_ENV; [ "$KVM_UNIT_TESTS_ENV_OLD" ] && export KVM_UNIT_TESTS_ENV="$KVM_UNIT_TESTS_ENV_OLD" || unset KVM_UNIT_TESTS_ENV; unset KVM_UNIT_TESTS_ENV_OLD' + trap_exit_push 'initrd_cleanup' [ -f "$KVM_UNIT_TESTS_ENV" ] && export KVM_UNIT_TESTS_ENV_OLD="$KVM_UNIT_TESTS_ENV" export KVM_UNIT_TESTS_ENV=$(mktemp) env_params -- 2.42.0
[kvm-unit-tests PATCH v3 1/8] arch-run: Fix TRAP handler recursion to remove temporary files properly
Migration files were not being removed when the QEMU process is interrupted (e.g., with ^C). This is becaus the SIGINT propagates to the bash TRAP handler, which recursively TRAPs due to the 'kill 0' in the handler. This eventually crashes bash. This can be observed by interrupting a long-running test program that is run with MIGRATION=yes, /tmp/mig-helper-* files remain afterwards. Removing TRAP recursion solves this problem and allows the EXIT handler to run and clean up the files. This also moves the trap handler before temp file creation, and expands the name variables at trap-time rather than install-time, which closes the small race between creation trap handler install. Reviewed-by: Thomas Huth Signed-off-by: Nicholas Piggin --- scripts/arch-run.bash | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/scripts/arch-run.bash b/scripts/arch-run.bash index d0864360..11d47a85 100644 --- a/scripts/arch-run.bash +++ b/scripts/arch-run.bash @@ -129,6 +129,9 @@ run_migration () return 77 fi + trap 'trap - TERM ; kill 0 ; exit 2' INT TERM + trap 'rm -f ${migout1} ${migsock} ${qmp1} ${qmp2} ${fifo}' RETURN EXIT + migsock=$(mktemp -u -t mig-helper-socket.XX) migout1=$(mktemp -t mig-helper-stdout1.XX) qmp1=$(mktemp -u -t mig-helper-qmp1.XX) @@ -137,9 +140,6 @@ run_migration () qmpout1=/dev/null qmpout2=/dev/null - trap 'kill 0; exit 2' INT TERM - trap 'rm -f ${migout1} ${migsock} ${qmp1} ${qmp2} ${fifo}' RETURN EXIT - eval "$@" -chardev socket,id=mon1,path=${qmp1},server=on,wait=off \ -mon chardev=mon1,mode=control | tee ${migout1} & live_pid=`jobs -l %+ | grep "eval" | awk '{print$2}'` @@ -209,11 +209,11 @@ run_panic () return 77 fi - qmp=$(mktemp -u -t panic-qmp.XX) - - trap 'kill 0; exit 2' INT TERM + trap 'trap - TERM ; kill 0 ; exit 2' INT TERM trap 'rm -f ${qmp}' RETURN EXIT + qmp=$(mktemp -u -t panic-qmp.XX) + # start VM stopped so we don't miss any events eval "$@" -chardev socket,id=mon1,path=${qmp},server=on,wait=off \ -mon chardev=mon1,mode=control -S & -- 2.42.0
[kvm-unit-tests PATCH v3 0/8] Multi-migration support
Since v2: - Rebase on riscv port and auxvinfo fix was merged. - Clean up initrd cleanup moves more commands into the new cleanup function from the trap handler comands (suggested by Thomas). - "arch-run: Clean up temporary files properly" patch is now renamed to "arch-run: Fix TRAP handler..." - Fix TRAP handler patch has redone changelog to be more precise about the problem and including recipe to recreate it. - Fix TRAP handler patch reworked slightly to remove the theoretical race rather than just adding a comment about it. - Patch 3 was missing a couple of fixes that leaked into patch 4, those are moved into patch 3. I did look into doing a better job at handling timeouts in places where the migration script can hang (the timeout command only kills the qemu process, but there are other places the bash script itself can still timeout). There are ways it might be possible (along the lines of starting ( sleep N ; kill ) subshell in the backround), but it's very tricky to handle all the details. Existing script has timeout issues already, so this series doesn't add a fundamentally new type of problem here. Thanks, Nick Nicholas Piggin (8): arch-run: Fix TRAP handler recursion to remove temporary files properly arch-run: Clean up initrd cleanup migration: use a more robust way to wait for background job migration: Support multiple migrations arch-run: rename migration variables migration: Add quiet migration support Add common/ directory for architecture-independent tests migration: add a migration selftest arm/Makefile.common | 1 + arm/selftest-migration.c | 1 + arm/sieve.c | 2 +- arm/unittests.cfg| 6 ++ common/selftest-migration.c | 34 +++ common/sieve.c | 51 ++ lib/migrate.c| 20 +++- lib/migrate.h| 2 + powerpc/Makefile.common | 1 + powerpc/selftest-migration.c | 1 + powerpc/unittests.cfg| 4 + riscv/sieve.c| 2 +- s390x/Makefile | 1 + s390x/selftest-migration.c | 1 + s390x/sieve.c| 2 +- s390x/unittests.cfg | 4 + scripts/arch-run.bash| 182 ++- x86/sieve.c | 52 +- 18 files changed, 261 insertions(+), 106 deletions(-) create mode 12 arm/selftest-migration.c create mode 100644 common/selftest-migration.c create mode 100644 common/sieve.c create mode 12 powerpc/selftest-migration.c create mode 12 s390x/selftest-migration.c mode change 100644 => 12 x86/sieve.c -- 2.42.0
[PATCH] mm/hugetlb: Move page order check inside hugetlb_cma_reserve()
All platforms could benefit from page order check against MAX_PAGE_ORDER before allocating a CMA area for gigantic hugetlb pages. Let's move this check from individual platforms to generic hugetlb. Cc: Catalin Marinas Cc: Will Deacon Cc: Michael Ellerman Cc: Nicholas Piggin Cc: linux-arm-ker...@lists.infradead.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux...@kvack.org Cc: linux-ker...@vger.kernel.org Signed-off-by: Anshuman Khandual --- This applies on v6.8-rc3 arch/arm64/mm/hugetlbpage.c | 7 --- arch/powerpc/mm/hugetlbpage.c | 4 +--- mm/hugetlb.c | 7 +++ 3 files changed, 8 insertions(+), 10 deletions(-) diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c index 8116ac599f80..6720ec8d50e7 100644 --- a/arch/arm64/mm/hugetlbpage.c +++ b/arch/arm64/mm/hugetlbpage.c @@ -45,13 +45,6 @@ void __init arm64_hugetlb_cma_reserve(void) else order = CONT_PMD_SHIFT - PAGE_SHIFT; - /* -* HugeTLB CMA reservation is required for gigantic -* huge pages which could not be allocated via the -* page allocator. Just warn if there is any change -* breaking this assumption. -*/ - WARN_ON(order <= MAX_PAGE_ORDER); hugetlb_cma_reserve(order); } #endif /* CONFIG_CMA */ diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c index 0a540b37aab6..16557d008eef 100644 --- a/arch/powerpc/mm/hugetlbpage.c +++ b/arch/powerpc/mm/hugetlbpage.c @@ -614,8 +614,6 @@ void __init gigantic_hugetlb_cma_reserve(void) */ order = mmu_psize_to_shift(MMU_PAGE_16G) - PAGE_SHIFT; - if (order) { - VM_WARN_ON(order <= MAX_PAGE_ORDER); + if (order) hugetlb_cma_reserve(order); - } } diff --git a/mm/hugetlb.c b/mm/hugetlb.c index cf9c9b2906ea..345b3524df35 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -7699,6 +7699,13 @@ void __init hugetlb_cma_reserve(int order) bool node_specific_cma_alloc = false; int nid; + /* +* HugeTLB CMA reservation is required for gigantic +* huge pages which could not be allocated via the +* page allocator. Just warn if there is any change +* breaking this assumption. +*/ + VM_WARN_ON(order <= MAX_PAGE_ORDER); cma_reserve_called = true; if (!hugetlb_cma_size) -- 2.25.1
Re: [PATCH v2] drivers/ps3: select VIDEO to provide cmdline functions
Thomas Zimmermann writes: > Am 07.02.24 um 17:13 schrieb Randy Dunlap: >> When VIDEO is not set, there is a build error. Fix that by selecting >> VIDEO for PS3_PS3AV. >> >> ERROR: modpost: ".video_get_options" [drivers/ps3/ps3av_mod.ko] undefined! >> >> Fixes: dae7fbf43fd0 ("driver/ps3: Include for mode >> parsing") >> Fixes: a3b6792e990d ("video/cmdline: Introduce CONFIG_VIDEO for video= >> parameter") >> Cc: Michael Ellerman >> Cc: Nicholas Piggin >> Cc: Christophe Leroy >> Cc: Aneesh Kumar K.V >> Cc: Naveen N. Rao >> Cc: linuxppc-dev@lists.ozlabs.org >> Cc: Thomas Zimmermann >> Cc: Geoff Levand >> Acked-by: Geoff Levand >> Cc: linux-fb...@vger.kernel.org >> Cc: dri-de...@lists.freedesktop.org >> Signed-off-by: Randy Dunlap > > Reviewed-by: Thomas Zimmermann Can you take it via whatever tree the CONFIG_VIDEO patch is in? Acked-by: Michael Ellerman cheers
Re: [kvm-unit-tests PATCH v2 2/9] arch-run: Clean up temporary files properly
On Wed Feb 7, 2024 at 5:58 PM AEST, Thomas Huth wrote: > On 02/02/2024 07.57, Nicholas Piggin wrote: > > Migration files weren't being removed when tests were interrupted. > > This improves the situation. > > > > Signed-off-by: Nicholas Piggin > > --- > > scripts/arch-run.bash | 12 +++- > > 1 file changed, 7 insertions(+), 5 deletions(-) > > > > diff --git a/scripts/arch-run.bash b/scripts/arch-run.bash > > index d0864360..f22ead6f 100644 > > --- a/scripts/arch-run.bash > > +++ b/scripts/arch-run.bash > > @@ -134,12 +134,14 @@ run_migration () > > qmp1=$(mktemp -u -t mig-helper-qmp1.XX) > > qmp2=$(mktemp -u -t mig-helper-qmp2.XX) > > fifo=$(mktemp -u -t mig-helper-fifo.XX) > > + > > + # race here between file creation and trap > > + trap "trap - TERM ; kill 0 ; exit 2" INT TERM > > + trap "rm -f ${migout1} ${migsock} ${qmp1} ${qmp2} ${fifo}" RETURN EXIT > > + > > qmpout1=/dev/null > > qmpout2=/dev/null > > > > - trap 'kill 0; exit 2' INT TERM > > - trap 'rm -f ${migout1} ${migsock} ${qmp1} ${qmp2} ${fifo}' RETURN EXIT > > - > > eval "$@" -chardev socket,id=mon1,path=${qmp1},server=on,wait=off \ > > -mon chardev=mon1,mode=control | tee ${migout1} & > > live_pid=`jobs -l %+ | grep "eval" | awk '{print$2}'` > > @@ -211,8 +213,8 @@ run_panic () > > > > qmp=$(mktemp -u -t panic-qmp.XX) > > > > - trap 'kill 0; exit 2' INT TERM > > - trap 'rm -f ${qmp}' RETURN EXIT > > + trap "trap - TERM ; kill 0 ; exit 2" INT TERM > > + trap "rm -f ${qmp}" RETURN EXIT > > > > # start VM stopped so we don't miss any events > > eval "$@" -chardev socket,id=mon1,path=${qmp},server=on,wait=off \ > > So the point is that the "EXIT" trap wasn't executed without the "trap - > TERM" in the other trap? ... ok, then your patch certainly makes sense. Iff you don't remove the TERM handler then the kill will recursively invoke it until some crash. This did solve some cases of dangling temp files for me, although now I test with a simple script: #!/bin/bash trap 'echo "INT" ; kill 0 ; exit 2' INT trap 'trap - TERM ; echo "TERM" ; kill 0 ; exit 2' TERM trap 'echo "RETURN"' RETURN trap 'echo "EXIT"' EXIT sleep 10 echo "done" If you ^C it then it still doesn't get to the EXIT or RETURN handlers. It looks like 'kill -INT $$' might be the way to do it instad of kill 0. Not sure if that means my observation was incorrect, or if the real script is behaving differently. In any case, I will dig into it and try to explain more precisely in the changelog what it is fixing. And possibly do another patch for the 'kill -INT $$' if that is needed. Thanks, Nick
[PATCH v3 2/2] powerpc/perf: Power11 Performance Monitoring support
Base enablement patch to register performance monitoring hardware support for Power11. Most of fields are copied from power10_pmu struct for power11_pmu struct. Signed-off-by: Madhavan Srinivasan --- Changelog V2: - No change Changelog v1: - Copied power10 struct for power11 with name change arch/powerpc/perf/core-book3s.c | 2 ++ arch/powerpc/perf/internal.h| 1 + arch/powerpc/perf/power10-pmu.c | 27 +++ 3 files changed, 30 insertions(+) diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c index b7ff680cde96..6f0d46c53027 100644 --- a/arch/powerpc/perf/core-book3s.c +++ b/arch/powerpc/perf/core-book3s.c @@ -2593,6 +2593,8 @@ static int __init init_ppc64_pmu(void) return 0; else if (!init_power10_pmu()) return 0; + else if (!init_power11_pmu()) + return 0; else if (!init_ppc970_pmu()) return 0; else diff --git a/arch/powerpc/perf/internal.h b/arch/powerpc/perf/internal.h index 4c18b5504326..a70ac471a5a5 100644 --- a/arch/powerpc/perf/internal.h +++ b/arch/powerpc/perf/internal.h @@ -10,4 +10,5 @@ int __init init_power7_pmu(void); int __init init_power8_pmu(void); int __init init_power9_pmu(void); int __init init_power10_pmu(void); +int __init init_power11_pmu(void); int __init init_generic_compat_pmu(void); diff --git a/arch/powerpc/perf/power10-pmu.c b/arch/powerpc/perf/power10-pmu.c index 9b5133e361a7..62a68b6b2d4b 100644 --- a/arch/powerpc/perf/power10-pmu.c +++ b/arch/powerpc/perf/power10-pmu.c @@ -634,3 +634,30 @@ int __init init_power10_pmu(void) return 0; } + +static struct power_pmu power11_pmu; + +int __init init_power11_pmu(void) +{ + unsigned int pvr; + int rc; + + pvr = mfspr(SPRN_PVR); + if (PVR_VER(pvr) != PVR_POWER11) + return -ENODEV; + + /* Set the PERF_REG_EXTENDED_MASK here */ + PERF_REG_EXTENDED_MASK = PERF_REG_PMU_MASK_31; + + power11_pmu = power10_pmu; + power11_pmu.name = "Power11"; + + rc = register_power_pmu(&power11_pmu); + if (rc) + return rc; + + /* Tell userspace that EBB is supported */ + cur_cpu_spec->cpu_user_features2 |= PPC_FEATURE2_EBB; + + return 0; +} -- 2.43.0
[PATCH v3 1/2] powerpc: Add Power11 architected and raw mode
reg.h is updated with Power11 pvr. pvr_mask value of 0x0F07 means we are arch v3.1 compliant. This is used by phyp and kvm when booting as a pseries guest to detect and enable the appropriate hwcap, facility bits and PMU related fields. Copied most of fields from Power10 table entry. Signed-off-by: Madhavan Srinivasan --- Changelog v2: - added macro to address logical PVR to handle compat mode PCR setting - removed power11 functions which were just re-direction to power10. Instead used power10 functions as suggested. Changelog v1 - No change arch/powerpc/include/asm/cputable.h | 3 ++ arch/powerpc/include/asm/mmu.h| 1 + arch/powerpc/include/asm/reg.h| 2 ++ arch/powerpc/kernel/cpu_specs_book3s_64.h | 34 +++ arch/powerpc/kernel/dt_cpu_ftrs.c | 10 +++ arch/powerpc/kernel/prom_init.c | 10 ++- arch/powerpc/kvm/book3s_hv.c | 1 + 7 files changed, 60 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/include/asm/cputable.h b/arch/powerpc/include/asm/cputable.h index 8765d5158324..3bd6e6e0224c 100644 --- a/arch/powerpc/include/asm/cputable.h +++ b/arch/powerpc/include/asm/cputable.h @@ -454,6 +454,9 @@ static inline void cpu_feature_keys_init(void) { } CPU_FTR_ARCH_300 | CPU_FTR_ARCH_31 | \ CPU_FTR_DAWR | CPU_FTR_DAWR1 | \ CPU_FTR_DEXCR_NPHIE) + +#define CPU_FTRS_POWER11 CPU_FTRS_POWER10 + #define CPU_FTRS_CELL (CPU_FTR_LWSYNC | \ CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_CTRL | \ CPU_FTR_ALTIVEC_COMP | CPU_FTR_MMCRA | CPU_FTR_SMT | \ diff --git a/arch/powerpc/include/asm/mmu.h b/arch/powerpc/include/asm/mmu.h index d8b7e246a32f..61ebe5eff2c9 100644 --- a/arch/powerpc/include/asm/mmu.h +++ b/arch/powerpc/include/asm/mmu.h @@ -133,6 +133,7 @@ #define MMU_FTRS_POWER8MMU_FTRS_POWER6 #define MMU_FTRS_POWER9MMU_FTRS_POWER6 #define MMU_FTRS_POWER10 MMU_FTRS_POWER6 +#define MMU_FTRS_POWER11 MMU_FTRS_POWER6 #define MMU_FTRS_CELL MMU_FTRS_DEFAULT_HPTE_ARCH_V2 | \ MMU_FTR_CI_LARGE_PAGE #define MMU_FTRS_PA6T MMU_FTRS_DEFAULT_HPTE_ARCH_V2 | \ diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h index 7fd09f25452d..a3d844dae417 100644 --- a/arch/powerpc/include/asm/reg.h +++ b/arch/powerpc/include/asm/reg.h @@ -1364,6 +1364,7 @@ #define PVR_HX_C2000 0x0066 #define PVR_POWER9 0x004E #define PVR_POWER100x0080 +#define PVR_POWER110x0082 #define PVR_BE 0x0070 #define PVR_PA6T 0x0090 @@ -1375,6 +1376,7 @@ #define PVR_ARCH_207 0x0f04 #define PVR_ARCH_300 0x0f05 #define PVR_ARCH_310x0f06 +#define PVR_ARCH_31N 0x0f07 /* Macros for setting and retrieving special purpose registers */ #ifndef __ASSEMBLY__ diff --git a/arch/powerpc/kernel/cpu_specs_book3s_64.h b/arch/powerpc/kernel/cpu_specs_book3s_64.h index 3ff9757df4c0..98d4274a1b6b 100644 --- a/arch/powerpc/kernel/cpu_specs_book3s_64.h +++ b/arch/powerpc/kernel/cpu_specs_book3s_64.h @@ -60,6 +60,9 @@ PPC_FEATURE2_ISEL | PPC_FEATURE2_TAR | \ PPC_FEATURE2_VEC_CRYPTO) +#define COMMON_USER_POWER11COMMON_USER_POWER10 +#define COMMON_USER2_POWER11 COMMON_USER2_POWER10 + static struct cpu_spec cpu_specs[] __initdata = { { /* PPC970 */ .pvr_mask = 0x, @@ -281,6 +284,20 @@ static struct cpu_spec cpu_specs[] __initdata = { .cpu_restore= __restore_cpu_power10, .platform = "power10", }, + { /* 3.1-compliant processor, i.e. Power11 "architected" mode */ + .pvr_mask = 0x, + .pvr_value = 0x0f07, + .cpu_name = "Power11 (architected)", + .cpu_features = CPU_FTRS_POWER11, + .cpu_user_features = COMMON_USER_POWER11, + .cpu_user_features2 = COMMON_USER2_POWER11, + .mmu_features = MMU_FTRS_POWER11, + .icache_bsize = 128, + .dcache_bsize = 128, + .cpu_setup = __setup_cpu_power10, + .cpu_restore= __restore_cpu_power10, + .platform = "power11", + }, { /* Power7 */ .pvr_mask = 0x, .pvr_value = 0x003f, @@ -451,6 +468,23 @@ static struct cpu_spec cpu_specs[] __initdata = { .machine_check_early= __machine_check_early_realmode_p10, .platform = "power10", }, + { /* Power11 */ + .pvr_mask = 0x, + .pvr_value = 0x0082, +
Re: [PATCH v2 5/5] sched/vtime: do not include header
On Fri Feb 9, 2024 at 6:15 AM AEST, Alexander Gordeev wrote: > There is no architecture-specific code or data left > that generic needs to know about. > Thus, avoid the inclusion of header. Nice cleanup! Acked-by: Nicholas Piggin > > Reviewed-by: Frederic Weisbecker > Signed-off-by: Alexander Gordeev > --- > arch/powerpc/include/asm/Kbuild | 1 - > include/asm-generic/vtime.h | 1 - > include/linux/vtime.h | 4 > 3 files changed, 6 deletions(-) > delete mode 100644 include/asm-generic/vtime.h > > diff --git a/arch/powerpc/include/asm/Kbuild b/arch/powerpc/include/asm/Kbuild > index 61a8dcd7..e5fdc336c9b2 100644 > --- a/arch/powerpc/include/asm/Kbuild > +++ b/arch/powerpc/include/asm/Kbuild > @@ -6,5 +6,4 @@ generic-y += agp.h > generic-y += kvm_types.h > generic-y += mcs_spinlock.h > generic-y += qrwlock.h > -generic-y += vtime.h > generic-y += early_ioremap.h > diff --git a/include/asm-generic/vtime.h b/include/asm-generic/vtime.h > deleted file mode 100644 > index b1a49677fe25.. > --- a/include/asm-generic/vtime.h > +++ /dev/null > @@ -1 +0,0 @@ > -/* no content, but patch(1) dislikes empty files */ > diff --git a/include/linux/vtime.h b/include/linux/vtime.h > index 593466ceebed..29dd5b91dd7d 100644 > --- a/include/linux/vtime.h > +++ b/include/linux/vtime.h > @@ -5,10 +5,6 @@ > #include > #include > > -#ifdef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE > -#include > -#endif > - > /* > * Common vtime APIs > */
Re: [PATCH v2 3/5] s390/vtime: remove unused __ARCH_HAS_VTIME_TASK_SWITCH leftover
On Fri Feb 9, 2024 at 6:15 AM AEST, Alexander Gordeev wrote: > __ARCH_HAS_VTIME_TASK_SWITCH macro is not used anymore. ... but for benefit of patchwork if you decide to keep them apart Acked-by: Nicholas Piggin > > Reviewed-by: Frederic Weisbecker > Acked-by: Heiko Carstens > Signed-off-by: Alexander Gordeev > --- > arch/s390/include/asm/vtime.h | 2 -- > 1 file changed, 2 deletions(-) > > diff --git a/arch/s390/include/asm/vtime.h b/arch/s390/include/asm/vtime.h > index fe17e448c0c5..561c91c1a87c 100644 > --- a/arch/s390/include/asm/vtime.h > +++ b/arch/s390/include/asm/vtime.h > @@ -2,8 +2,6 @@ > #ifndef _S390_VTIME_H > #define _S390_VTIME_H > > -#define __ARCH_HAS_VTIME_TASK_SWITCH > - > static inline void update_timer_sys(void) > { > S390_lowcore.system_timer += S390_lowcore.last_update_timer - > S390_lowcore.exit_timer;
Re: [PATCH v2 2/5] sched/vtime: get rid of generic vtime_task_switch() implementation
On Fri Feb 9, 2024 at 6:15 AM AEST, Alexander Gordeev wrote: > The generic vtime_task_switch() implementation gets built only > if __ARCH_HAS_VTIME_TASK_SWITCH is not defined, but requires an > architecture to implement arch_vtime_task_switch() callback at > the same time, which is confusing. > > Further, arch_vtime_task_switch() is implemented for 32-bit PowerPC > architecture only and vtime_task_switch() generic variant is rather > superfluous. > > Simplify the whole vtime_task_switch() wiring by moving the existing > generic implementation to PowerPC. I think this could be squashed with patch 3. Reviewed-by: Nicholas Piggin > > Reviewed-by: Frederic Weisbecker > Signed-off-by: Alexander Gordeev > --- > arch/powerpc/include/asm/cputime.h | 13 - > arch/powerpc/kernel/time.c | 22 ++ > kernel/sched/cputime.c | 13 - > 3 files changed, 22 insertions(+), 26 deletions(-) > > diff --git a/arch/powerpc/include/asm/cputime.h > b/arch/powerpc/include/asm/cputime.h > index 4961fb38e438..aff858ca99c0 100644 > --- a/arch/powerpc/include/asm/cputime.h > +++ b/arch/powerpc/include/asm/cputime.h > @@ -32,23 +32,10 @@ > #ifdef CONFIG_PPC64 > #define get_accounting(tsk) (&get_paca()->accounting) > #define raw_get_accounting(tsk) (&local_paca->accounting) > -static inline void arch_vtime_task_switch(struct task_struct *tsk) { } > > #else > #define get_accounting(tsk) (&task_thread_info(tsk)->accounting) > #define raw_get_accounting(tsk) get_accounting(tsk) > -/* > - * Called from the context switch with interrupts disabled, to charge all > - * accumulated times to the current process, and to prepare accounting on > - * the next process. > - */ > -static inline void arch_vtime_task_switch(struct task_struct *prev) > -{ > - struct cpu_accounting_data *acct = get_accounting(current); > - struct cpu_accounting_data *acct0 = get_accounting(prev); > - > - acct->starttime = acct0->starttime; > -} > #endif > > /* > diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c > index df20cf201f74..c0fdc6d94fee 100644 > --- a/arch/powerpc/kernel/time.c > +++ b/arch/powerpc/kernel/time.c > @@ -354,6 +354,28 @@ void vtime_flush(struct task_struct *tsk) > acct->hardirq_time = 0; > acct->softirq_time = 0; > } > + > +/* > + * Called from the context switch with interrupts disabled, to charge all > + * accumulated times to the current process, and to prepare accounting on > + * the next process. > + */ > +void vtime_task_switch(struct task_struct *prev) > +{ > + if (is_idle_task(prev)) > + vtime_account_idle(prev); > + else > + vtime_account_kernel(prev); > + > + vtime_flush(prev); > + > + if (!IS_ENABLED(CONFIG_PPC64)) { > + struct cpu_accounting_data *acct = get_accounting(current); > + struct cpu_accounting_data *acct0 = get_accounting(prev); > + > + acct->starttime = acct0->starttime; > + } > +} > #endif /* CONFIG_VIRT_CPU_ACCOUNTING_NATIVE */ > > void __no_kcsan __delay(unsigned long loops) > diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c > index af7952f12e6c..aa48b2ec879d 100644 > --- a/kernel/sched/cputime.c > +++ b/kernel/sched/cputime.c > @@ -424,19 +424,6 @@ static inline void irqtime_account_process_tick(struct > task_struct *p, int user_ > */ > #ifdef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE > > -# ifndef __ARCH_HAS_VTIME_TASK_SWITCH > -void vtime_task_switch(struct task_struct *prev) > -{ > - if (is_idle_task(prev)) > - vtime_account_idle(prev); > - else > - vtime_account_kernel(prev); > - > - vtime_flush(prev); > - arch_vtime_task_switch(prev); > -} > -# endif > - > void vtime_account_irq(struct task_struct *tsk, unsigned int offset) > { > unsigned int pc = irq_count() - offset;
Re: [PATCH v2 1/5] sched/vtime: remove confusing arch_vtime_task_switch() declaration
On Fri Feb 9, 2024 at 6:15 AM AEST, Alexander Gordeev wrote: > Callback arch_vtime_task_switch() is only defined when > CONFIG_VIRT_CPU_ACCOUNTING_NATIVE is selected. Yet, the > function prototype forward declaration is present for > CONFIG_VIRT_CPU_ACCOUNTING_GEN variant. Remove it. > And powerpc arch_vtime_task_switch is static inline too, so this just confuses things. Reviewed-by: Nicholas Piggin > Reviewed-by: Frederic Weisbecker > Signed-off-by: Alexander Gordeev > --- > include/linux/vtime.h | 1 - > 1 file changed, 1 deletion(-) > > diff --git a/include/linux/vtime.h b/include/linux/vtime.h > index 3684487d01e1..593466ceebed 100644 > --- a/include/linux/vtime.h > +++ b/include/linux/vtime.h > @@ -18,7 +18,6 @@ extern void vtime_account_idle(struct task_struct *tsk); > #endif /* !CONFIG_VIRT_CPU_ACCOUNTING */ > > #ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN > -extern void arch_vtime_task_switch(struct task_struct *tsk); > extern void vtime_user_enter(struct task_struct *tsk); > extern void vtime_user_exit(struct task_struct *tsk); > extern void vtime_guest_enter(struct task_struct *tsk);
Re: [PATCH v4 5/5] sched: rename SD_SHARE_PKG_RESOURCES to SD_SHARE_LLC
On Wed, Feb 07, 2024 at 11:58:40AM +0800, al...@kernel.org wrote: > From: Alex Shi > > SD_CLUSTER shares the CPU resources like llc tags or l2 cache, that's > easy confuse with SD_SHARE_PKG_RESOURCES. So let's specifical point > what the latter shares: LLC. That would reduce some confusing. > > Suggested-by: Valentin Schneider > Signed-off-by: Alex Shi FWIW, Reviewed-by: Ricardo Neri
Re: [PATCH v2 2/4] eventfd: simplify eventfd_signal()
On Wed, Feb 07, 2024 at 03:34:59PM +0100, Paolo Bonzini wrote: > On Wed, Nov 22, 2023 at 1:49 PM Christian Brauner wrote: > > > > Ever since the evenfd type was introduced back in 2007 in commit > > e1ad7468c77d ("signal/timer/event: eventfd core") the eventfd_signal() > > function only ever passed 1 as a value for @n. There's no point in > > keeping that additional argument. > > > > Signed-off-by: Christian Brauner > > --- > > arch/x86/kvm/hyperv.c | 2 +- > > arch/x86/kvm/xen.c| 2 +- > > virt/kvm/eventfd.c| 4 ++-- > > 30 files changed, 60 insertions(+), 63 deletions(-) > > For KVM: > > Acked-by: Paolo Bonzini I really appreciate all of the ACKs but just fyi that this was merged for v6.8-rc1. Just so that there's no confusion!
Re: [revert commit 83b3836bf83f09beea5f592b126cfdd1bc921e48] [mainline] [6.8.0-rc3]kernel BUG at arch/powerpc/platforms/pseries/iommu.c:100!
Adding more people On 08/02/24 8:26 am, Venkat Rao Bagalkote wrote: Resending as earlier one bounced. Greetings!!! [revert commit 83b3836bf83f09beea5f592b126cfdd1bc921e48] [mainline] [6.8.0-rc3]kernel BUG at arch/powerpc/platforms/pseries/iommu.c:100! By Reverting below commit id, issue is not seen. 83b3836bf83f09beea5f592b126cfdd1bc921e48 iommu: Allow ops->default_domain to work when !CONFIG_IOMMU_DMA --- Traces --- KernelBug: Kernel bug in state 'None': kernel BUG at arch/powerpc/platforms/pseries/iommu.c:100! [ 7843.148980] Oops: Exception in kernel mode, sig: 5 [#1] [ 7843.148984] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=8192 NUMA pSeries [ 7843.148988] Modules linked in: nft_compat nf_tables libcrc32c nfnetlink bridge stp llc tls sit tunnel4 ip_tunnel nvram rpadlpar_io rpaphp xsk_diag rfkill binfmt_misc mlx5_ib ib_uverbs ib_core pseries_rng drm drm_panel_orientation_quirks ext4 mbcache jbd2 dm_service_time sd_mod t10_pi crc64_rocksoft_generic crc64_rocksoft crc64 sg ibmvfc scsi_transport_fc ibmveth mlx5_core mlxfw vmx_crypto psample dm_multipath dm_mirror dm_region_hash dm_log dm_mod fuse [last unloaded: tls] [ 7843.149041] CPU: 17 PID: 218444 Comm: drmgr Kdump: loaded Tainted: G I 6.8.0-rc3-autotest-g99bd3cb0d12e #1 [ 7843.149047] Hardware name: IBM,9080-HEX POWER10 (raw) 0x800200 0xf06 of:IBM,FW1060.00 (NH1060_016) hv:phyp pSeries [ 7843.149051] NIP: c00ff4d4 LR: c00ff4cc CTR: [ 7843.149055] REGS: c013aed5f840 TRAP: 0700 Tainted: G I (6.8.0-rc3-autotest-g99bd3cb0d12e) [ 7843.149060] MSR: 80029033 CR: 44002402 XER: 2004 [ 7843.149069] CFAR: c0a0d170 IRQMASK: 0 [ 7843.149069] GPR00: c00ff4cc c013aed5fae0 c1512700 c013aa362138 [ 7843.149069] GPR04: 000119c8afd0 [ 7843.149069] GPR08: c01284442b00 0001 1003 [ 7843.149069] GPR12: 0003 c0182f00 [ 7843.149069] GPR16: [ 7843.149069] GPR20: [ 7843.149069] GPR24: c013aed5fc40 0002 c2757d90 [ 7843.149069] GPR28: c00ff440 c2757cb8 c0183799c1a0 c013aa362b00 [ 7843.149112] NIP [c00ff4d4] iommu_reconfig_notifier+0x94/0x200 [ 7843.149123] LR [c00ff4cc] iommu_reconfig_notifier+0x8c/0x200 [ 7843.149129] Call Trace: [ 7843.149131] [c013aed5fae0] [c00ff4cc] iommu_reconfig_notifier+0x8c/0x200 (unreliable) [ 7843.149138] [c013aed5fb10] [c01a27b0] notifier_call_chain+0xb8/0x19c [ 7843.149146] [c013aed5fb70] [c01a2a78] blocking_notifier_call_chain+0x64/0x98 [ 7843.149152] [c013aed5fbb0] [c0c4a898] of_reconfig_notify+0x44/0xdc [ 7843.149160] [c013aed5fc20] [c0c4add4] of_detach_node+0x78/0xb0 [ 7843.149165] [c013aed5fc70] [c00f96a8] ofdt_write.part.0+0x86c/0xbb8 [ 7843.149171] [c013aed5fce0] [c069b4bc] proc_reg_write+0xf4/0x150 [ 7843.149178] [c013aed5fd10] [c05bfeb4] vfs_write+0xf8/0x488 [ 7843.149184] [c013aed5fdc0] [c05c0570] ksys_write+0x84/0x140 [ 7843.149189] [c013aed5fe10] [c0033358] system_call_exception+0x138/0x330 [ 7843.149196] [c013aed5fe50] [c000d05c] system_call_vectored_common+0x15c/0x2ec [ 7843.149203] --- interrupt: 3000 at 0x2433acb4 [ 7843.149208] NIP: 2433acb4 LR: CTR: [ 7843.149212] REGS: c013aed5fe80 TRAP: 3000 Tainted: G I (6.8.0-rc3-autotest-g99bd3cb0d12e) [ 7843.149217] MSR: 8280f033 CR: 42002408 XER: [ 7843.149227] IRQMASK: 0 [ 7843.149227] GPR00: 0004 7fffcde2d2c0 00013a707f00 0006 [ 7843.149227] GPR04: 7fffcde2d368 0020 0001 [ 7843.149227] GPR08: 0020 [ 7843.149227] GPR12: 2403b1e0 00013a6df668 0006 [ 7843.149227] GPR16: 0002 0004 0005 [ 7843.149227] GPR20: 00013a700018 00013a6df360 00013a6dcb20 00013a6dd918 [ 7843.149227] GPR24: 00013a6db178 0001722028c8 7fffcde2d7f8 0001722028c8 [ 7843.149227] GPR28: 00013a6dc340 7fffcde2d368 0006 0020 [ 7843.149268] NIP [2433acb4] 0x2433acb4 [ 7843.149271] LR [] 0x0 [ 7843.149274] --- interrupt: 3000 [ 7843.149277] Code: 4082016c 2c3f 41820060 ebff0028 2c3f 41820054 e87f0018 2c23 41820014 4890dc75 6000 e93f0018 <0b09> e87f0020 2c23 4182000c [ 7843.149292] ---[ end trace ]---
[PATCH v2 1/5] sched/vtime: remove confusing arch_vtime_task_switch() declaration
Callback arch_vtime_task_switch() is only defined when CONFIG_VIRT_CPU_ACCOUNTING_NATIVE is selected. Yet, the function prototype forward declaration is present for CONFIG_VIRT_CPU_ACCOUNTING_GEN variant. Remove it. Reviewed-by: Frederic Weisbecker Signed-off-by: Alexander Gordeev --- include/linux/vtime.h | 1 - 1 file changed, 1 deletion(-) diff --git a/include/linux/vtime.h b/include/linux/vtime.h index 3684487d01e1..593466ceebed 100644 --- a/include/linux/vtime.h +++ b/include/linux/vtime.h @@ -18,7 +18,6 @@ extern void vtime_account_idle(struct task_struct *tsk); #endif /* !CONFIG_VIRT_CPU_ACCOUNTING */ #ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN -extern void arch_vtime_task_switch(struct task_struct *tsk); extern void vtime_user_enter(struct task_struct *tsk); extern void vtime_user_exit(struct task_struct *tsk); extern void vtime_guest_enter(struct task_struct *tsk); -- 2.40.1
[PATCH v2 5/5] sched/vtime: do not include header
There is no architecture-specific code or data left that generic needs to know about. Thus, avoid the inclusion of header. Reviewed-by: Frederic Weisbecker Signed-off-by: Alexander Gordeev --- arch/powerpc/include/asm/Kbuild | 1 - include/asm-generic/vtime.h | 1 - include/linux/vtime.h | 4 3 files changed, 6 deletions(-) delete mode 100644 include/asm-generic/vtime.h diff --git a/arch/powerpc/include/asm/Kbuild b/arch/powerpc/include/asm/Kbuild index 61a8dcd7..e5fdc336c9b2 100644 --- a/arch/powerpc/include/asm/Kbuild +++ b/arch/powerpc/include/asm/Kbuild @@ -6,5 +6,4 @@ generic-y += agp.h generic-y += kvm_types.h generic-y += mcs_spinlock.h generic-y += qrwlock.h -generic-y += vtime.h generic-y += early_ioremap.h diff --git a/include/asm-generic/vtime.h b/include/asm-generic/vtime.h deleted file mode 100644 index b1a49677fe25.. --- a/include/asm-generic/vtime.h +++ /dev/null @@ -1 +0,0 @@ -/* no content, but patch(1) dislikes empty files */ diff --git a/include/linux/vtime.h b/include/linux/vtime.h index 593466ceebed..29dd5b91dd7d 100644 --- a/include/linux/vtime.h +++ b/include/linux/vtime.h @@ -5,10 +5,6 @@ #include #include -#ifdef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE -#include -#endif - /* * Common vtime APIs */ -- 2.40.1
[PATCH v2 3/5] s390/vtime: remove unused __ARCH_HAS_VTIME_TASK_SWITCH leftover
__ARCH_HAS_VTIME_TASK_SWITCH macro is not used anymore. Reviewed-by: Frederic Weisbecker Acked-by: Heiko Carstens Signed-off-by: Alexander Gordeev --- arch/s390/include/asm/vtime.h | 2 -- 1 file changed, 2 deletions(-) diff --git a/arch/s390/include/asm/vtime.h b/arch/s390/include/asm/vtime.h index fe17e448c0c5..561c91c1a87c 100644 --- a/arch/s390/include/asm/vtime.h +++ b/arch/s390/include/asm/vtime.h @@ -2,8 +2,6 @@ #ifndef _S390_VTIME_H #define _S390_VTIME_H -#define __ARCH_HAS_VTIME_TASK_SWITCH - static inline void update_timer_sys(void) { S390_lowcore.system_timer += S390_lowcore.last_update_timer - S390_lowcore.exit_timer; -- 2.40.1
[PATCH v2 0/5] sched/vtime: vtime.h headers cleanup
Hi All, I kept all tags on reveiwed patches. v2: - patch 4: commit message reworded (Heiko) - patch 5: vtime.h is removed from Kbuild scripts (PowerPC only) (Heiko) v1: Please find a small cleanup to vtime_task_switch() wiring. I split it into smaller patches to allow separate PowerPC vs s390 reviews. Otherwise patches 2+3 and 4+5 could have been merged. I tested it on s390 and compile-tested it on 32- and 64-bit PowerPC and few other major architectures only, but it is only of concern for CONFIG_VIRT_CPU_ACCOUNTING_NATIVE-capable ones (AFAICT). Thanks! Alexander Gordeev (5): sched/vtime: remove confusing arch_vtime_task_switch() declaration sched/vtime: get rid of generic vtime_task_switch() implementation s390/vtime: remove unused __ARCH_HAS_VTIME_TASK_SWITCH leftover s390/irq,nmi: include header directly sched/vtime: do not include header arch/powerpc/include/asm/Kbuild| 1 - arch/powerpc/include/asm/cputime.h | 13 - arch/powerpc/kernel/time.c | 22 ++ arch/s390/include/asm/vtime.h | 2 -- arch/s390/kernel/irq.c | 1 + arch/s390/kernel/nmi.c | 1 + include/asm-generic/vtime.h| 1 - include/linux/vtime.h | 5 - kernel/sched/cputime.c | 13 - 9 files changed, 24 insertions(+), 35 deletions(-) delete mode 100644 include/asm-generic/vtime.h -- 2.40.1
[PATCH v2 4/5] s390/irq,nmi: include header directly
update_timer_sys() and update_timer_mcck() are inlines used for CPU time accounting from the interrupt and machine-check handlers. These routines are specific to s390 architecture, but included via header implicitly. Avoid the extra loop and include header directly. Acked-by: Heiko Carstens Signed-off-by: Alexander Gordeev --- arch/s390/kernel/irq.c | 1 + arch/s390/kernel/nmi.c | 1 + 2 files changed, 2 insertions(+) diff --git a/arch/s390/kernel/irq.c b/arch/s390/kernel/irq.c index 6f71b0ce1068..259496fe0ef9 100644 --- a/arch/s390/kernel/irq.c +++ b/arch/s390/kernel/irq.c @@ -29,6 +29,7 @@ #include #include #include +#include #include "entry.h" DEFINE_PER_CPU_SHARED_ALIGNED(struct irq_stat, irq_stat); diff --git a/arch/s390/kernel/nmi.c b/arch/s390/kernel/nmi.c index 9ad44c26d1a2..4422a27faace 100644 --- a/arch/s390/kernel/nmi.c +++ b/arch/s390/kernel/nmi.c @@ -32,6 +32,7 @@ #include #include #include +#include #include struct mcck_struct { -- 2.40.1
[PATCH v2 2/5] sched/vtime: get rid of generic vtime_task_switch() implementation
The generic vtime_task_switch() implementation gets built only if __ARCH_HAS_VTIME_TASK_SWITCH is not defined, but requires an architecture to implement arch_vtime_task_switch() callback at the same time, which is confusing. Further, arch_vtime_task_switch() is implemented for 32-bit PowerPC architecture only and vtime_task_switch() generic variant is rather superfluous. Simplify the whole vtime_task_switch() wiring by moving the existing generic implementation to PowerPC. Reviewed-by: Frederic Weisbecker Signed-off-by: Alexander Gordeev --- arch/powerpc/include/asm/cputime.h | 13 - arch/powerpc/kernel/time.c | 22 ++ kernel/sched/cputime.c | 13 - 3 files changed, 22 insertions(+), 26 deletions(-) diff --git a/arch/powerpc/include/asm/cputime.h b/arch/powerpc/include/asm/cputime.h index 4961fb38e438..aff858ca99c0 100644 --- a/arch/powerpc/include/asm/cputime.h +++ b/arch/powerpc/include/asm/cputime.h @@ -32,23 +32,10 @@ #ifdef CONFIG_PPC64 #define get_accounting(tsk)(&get_paca()->accounting) #define raw_get_accounting(tsk)(&local_paca->accounting) -static inline void arch_vtime_task_switch(struct task_struct *tsk) { } #else #define get_accounting(tsk)(&task_thread_info(tsk)->accounting) #define raw_get_accounting(tsk)get_accounting(tsk) -/* - * Called from the context switch with interrupts disabled, to charge all - * accumulated times to the current process, and to prepare accounting on - * the next process. - */ -static inline void arch_vtime_task_switch(struct task_struct *prev) -{ - struct cpu_accounting_data *acct = get_accounting(current); - struct cpu_accounting_data *acct0 = get_accounting(prev); - - acct->starttime = acct0->starttime; -} #endif /* diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c index df20cf201f74..c0fdc6d94fee 100644 --- a/arch/powerpc/kernel/time.c +++ b/arch/powerpc/kernel/time.c @@ -354,6 +354,28 @@ void vtime_flush(struct task_struct *tsk) acct->hardirq_time = 0; acct->softirq_time = 0; } + +/* + * Called from the context switch with interrupts disabled, to charge all + * accumulated times to the current process, and to prepare accounting on + * the next process. + */ +void vtime_task_switch(struct task_struct *prev) +{ + if (is_idle_task(prev)) + vtime_account_idle(prev); + else + vtime_account_kernel(prev); + + vtime_flush(prev); + + if (!IS_ENABLED(CONFIG_PPC64)) { + struct cpu_accounting_data *acct = get_accounting(current); + struct cpu_accounting_data *acct0 = get_accounting(prev); + + acct->starttime = acct0->starttime; + } +} #endif /* CONFIG_VIRT_CPU_ACCOUNTING_NATIVE */ void __no_kcsan __delay(unsigned long loops) diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c index af7952f12e6c..aa48b2ec879d 100644 --- a/kernel/sched/cputime.c +++ b/kernel/sched/cputime.c @@ -424,19 +424,6 @@ static inline void irqtime_account_process_tick(struct task_struct *p, int user_ */ #ifdef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE -# ifndef __ARCH_HAS_VTIME_TASK_SWITCH -void vtime_task_switch(struct task_struct *prev) -{ - if (is_idle_task(prev)) - vtime_account_idle(prev); - else - vtime_account_kernel(prev); - - vtime_flush(prev); - arch_vtime_task_switch(prev); -} -# endif - void vtime_account_irq(struct task_struct *tsk, unsigned int offset) { unsigned int pc = irq_count() - offset; -- 2.40.1
Re: [PATCH v5 00/25] Transparent Contiguous PTEs for User Mappings
On Fri, Feb 02, 2024 at 08:07:31AM +, Ryan Roberts wrote: > Hi All, Hi Ryan, I assume this is the same as your 'features/granule_perf/contpte-lkml_v' branch on https://gitlab.arm.com/linux-arm/linux-rr/ I've taken a quick look, and I have a few initial/superficial comments before digging into the detail on the important changes. > Patch Layout > > > In this version, I've split the patches to better show each optimization: > > - 1-2:mm prep: misc code and docs cleanups I'm not confident enough to comment on patch 2, but these look reasonable to me. > - 3-8:mm,arm,arm64,powerpc,x86 prep: Replace pte_next_pfn() with more > general pte_advance_pfn() These look fine to me. > - 9-18: arm64 prep: Refactor ptep helpers into new layer The result of patches 9-17 looks good to me, but the intermediate stages where some functions are converted is a bit odd, and it's a bit painful for review since you need to skip ahead a few patches to see the end result to tell that the conversions are consistent and complete. IMO it'd be easier for review if that were three patches: 1) Convert READ_ONCE() -> ptep_get() 2) Convert set_pte_at() -> set_ptes() 3) All the "New layer" renames and addition of the trivial wrappers Patch 18 looks fine to me. > - 19: functional contpte implementation > - 20-25: various optimizations on top of the contpte implementation I'll try to dig into these over the next few days. Mark.
Re: [PATCH v2 1/4] PCI/AER: Store more information in aer_err_info
On Tue, Feb 06, 2024 at 11:23:35AM -0600, Bjorn Helgaas wrote: > On Wed, Feb 07, 2024 at 12:41:41AM +0800, Wang, Qingshun wrote: > > On Mon, Feb 05, 2024 at 05:12:31PM -0600, Bjorn Helgaas wrote: > > > On Thu, Jan 25, 2024 at 02:27:59PM +0800, Wang, Qingshun wrote: > > > > When Advisory Non-Fatal errors are raised, both correctable and > > > > uncorrectable error statuses will be set. The current kernel code cannot > > > > store both statuses at the same time, thus failing to handle ANFE > > > > properly. > > > > In addition, to avoid clearing UEs that are not ANFE by accident, UE > > > > severity and Device Status also need to be recorded: any fatal UE cannot > > > > be ANFE, and if Fatal/Non-Fatal Error Detected is set in Device Status, > > > > do > > > > not take any assumption and let UE handler to clear UE status. > > > > > > > > Store status and mask of both correctable and uncorrectable errors in > > > > aer_err_info. The severity of UEs and the values of the Device Status > > > > register are also recorded, which will be used to determine UEs that > > > > should > > > > be handled by the ANFE handler. Refactor the rest of the code to use > > > > cor/uncor_status and cor/uncor_mask fields instead of status and mask > > > > fields. > > > > > > There's a lot going on in this patch. Could it possibly be split up a > > > bit, e.g., first tease apart aer_err_info.status/.mask into > > > .cor_status/mask and .uncor_status/mask, then add .uncor_severity, > > > then add the device_status bit separately? If it could be split up, I > > > think the ANFE case would be easier to see. > > > > Thanks for the feedback! Will split it up into two pacthes in the next > > version. > > Or even three: > > 1) tease apart aer_err_info.status/.mask into .cor_status/mask and > .uncor_status/mask > > 2) add .uncor_severity > > 3) add device_status > > Looking at this again, I'm a little confused about 2) and 3). I see > the new read of PCI_ERR_UNCOR_SEVER into .uncor_severity, but there's > no actual *use* of it. > > Same for 3), I see the new read of PCI_EXP_DEVSTA, but AFAICS there's > no use of that value. > Both 2) and 3) are used in PATCH 2 and traced in PATCH 4. I can separate the logic for reading these values from PATCH 1 and merge it with PATCH 2. -- Best regards, Wang, Qingshun
Re: [PATCH v4 5/5] sched: rename SD_SHARE_PKG_RESOURCES to SD_SHARE_LLC
On 07/02/24 11:58, al...@kernel.org wrote: > From: Alex Shi > > SD_CLUSTER shares the CPU resources like llc tags or l2 cache, that's > easy confuse with SD_SHARE_PKG_RESOURCES. So let's specifical point > what the latter shares: LLC. That would reduce some confusing. > > Suggested-by: Valentin Schneider > Signed-off-by: Alex Shi AFAICT it's just missing the below replacement (I've stretched the comments to go up to 80 chars while at it), otherwise LGTM. Reviewed-by: Valentin Schneider --- diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c index e877730219d38..99ea5986038ce 100644 --- a/kernel/sched/topology.c +++ b/kernel/sched/topology.c @@ -657,13 +657,13 @@ static void destroy_sched_domains(struct sched_domain *sd) } /* - * Keep a special pointer to the highest sched_domain that has - * SD_SHARE_PKG_RESOURCE set (Last Level Cache Domain) for this - * allows us to avoid some pointer chasing select_idle_sibling(). + * Keep a special pointer to the highest sched_domain that has SD_SHARE_LLC set + * (Last Level Cache Domain) for this allows us to avoid some pointer chasing + * select_idle_sibling(). * - * Also keep a unique ID per domain (we use the first CPU number in - * the cpumask of the domain), this allows us to quickly tell if - * two CPUs are in the same cache domain, see cpus_share_cache(). + * Also keep a unique ID per domain (we use the first CPU number in the cpumask + * of the domain), this allows us to quickly tell if two CPUs are in the same + * cache domain, see cpus_share_cache(). */ DEFINE_PER_CPU(struct sched_domain __rcu *, sd_llc); DEFINE_PER_CPU(int, sd_llc_size);
Re: [PATCH] MAINTAINERS: adjust file entries after crypto vmx file movement
On Thu, Feb 1, 2024 at 6:51 AM Herbert Xu wrote: > > On Mon, Jan 29, 2024 at 02:17:29PM +0100, Lukas Bulwahn wrote: > > > > diff --git a/MAINTAINERS b/MAINTAINERS > > index 2fb944964be5..15bc79e80e28 100644 > > --- a/MAINTAINERS > > +++ b/MAINTAINERS > > @@ -10307,12 +10307,12 @@ M: Nayna Jain > > M: Paulo Flabiano Smorigo > > L: linux-cry...@vger.kernel.org > > S: Supported > > -F: drivers/crypto/vmx/Kconfig > > -F: drivers/crypto/vmx/Makefile > > -F: drivers/crypto/vmx/aes* > > -F: drivers/crypto/vmx/ghash* > > -F: drivers/crypto/vmx/ppc-xlate.pl > > -F: drivers/crypto/vmx/vmx.c > > +F: arch/powerpc/crypto/Kconfig > > +F: arch/powerpc/crypto/Makefile > > +F: arch/powerpc/crypto/aes* > > Are you sure about this? There are non-vmx aes* files in that > directory. Perhaps something more specific is needed here? > sorry for the late reply. I revisited this patch and now keep the match exact in my patch v2: https://lore.kernel.org/lkml/20240208093327.23926-1-lukas.bulw...@gmail.com/ Herbert, I hope you are fine to pick this patch v2. Lukas
[PATCH v2] MAINTAINERS: adjust file entries after crypto vmx file movement
Commit 109303336a0c ("crypto: vmx - Move to arch/powerpc/crypto") moves the crypto vmx files to arch/powerpc, but misses to adjust the file entries for IBM Power VMX Cryptographic instructions and LINUX FOR POWERPC. Hence, ./scripts/get_maintainer.pl --self-test=patterns complains about broken references. Adjust these file entries accordingly. To keep the matched files exact after the movement, spell out each file name in the new directory. Signed-off-by: Lukas Bulwahn --- v1: https://lore.kernel.org/lkml/20240129131729.4311-1-lukas.bulw...@gmail.com/ v1 -> v2: - address Herbert Xu's feedback: keep the matched files exactly those which were in the vmx directory Danny, please ack. Herbert, please pick this minor clean-up patch on your -next tree. MAINTAINERS | 18 +++--- 1 file changed, 11 insertions(+), 7 deletions(-) diff --git a/MAINTAINERS b/MAINTAINERS index 58845a852ab1..1820f661bfe1 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -10329,12 +10329,17 @@ M:Nayna Jain M: Paulo Flabiano Smorigo L: linux-cry...@vger.kernel.org S: Supported -F: drivers/crypto/vmx/Kconfig -F: drivers/crypto/vmx/Makefile -F: drivers/crypto/vmx/aes* -F: drivers/crypto/vmx/ghash* -F: drivers/crypto/vmx/ppc-xlate.pl -F: drivers/crypto/vmx/vmx.c +F: arch/powerpc/crypto/Kconfig +F: arch/powerpc/crypto/Makefile +F: arch/powerpc/crypto/aes.c +F: arch/powerpc/crypto/aes_cbc.c +F: arch/powerpc/crypto/aes_ctr.c +F: arch/powerpc/crypto/aes_xts.c +F: arch/powerpc/crypto/aesp8-ppc.* +F: arch/powerpc/crypto/ghash.c +F: arch/powerpc/crypto/ghashp8-ppc.pl +F: arch/powerpc/crypto/ppc-xlate.pl +F: arch/powerpc/crypto/vmx.c IBM ServeRAID RAID DRIVER S: Orphan @@ -12428,7 +12433,6 @@ F: drivers/*/*/*pasemi* F: drivers/*/*pasemi* F: drivers/char/tpm/tpm_ibmvtpm* F: drivers/crypto/nx/ -F: drivers/crypto/vmx/ F: drivers/i2c/busses/i2c-opal.c F: drivers/net/ethernet/ibm/ibmveth.* F: drivers/net/ethernet/ibm/ibmvnic.* -- 2.17.1