[PATCH v8 3/4] KVM: selftests: Add ucall test support for LoongArch
Add ucall test support for LoongArch, ucall method on LoongArch uses undefined mmio area. It will cause causes vcpu exits to hypervisor so that hypervisor can communicate with vcpu. Signed-off-by: Bibo Mao --- .../selftests/kvm/include/loongarch/ucall.h | 20 ++ .../selftests/kvm/lib/loongarch/ucall.c | 38 +++ 2 files changed, 58 insertions(+) create mode 100644 tools/testing/selftests/kvm/include/loongarch/ucall.h create mode 100644 tools/testing/selftests/kvm/lib/loongarch/ucall.c diff --git a/tools/testing/selftests/kvm/include/loongarch/ucall.h b/tools/testing/selftests/kvm/include/loongarch/ucall.h new file mode 100644 index ..4ec801f37f00 --- /dev/null +++ b/tools/testing/selftests/kvm/include/loongarch/ucall.h @@ -0,0 +1,20 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +#ifndef SELFTEST_KVM_UCALL_H +#define SELFTEST_KVM_UCALL_H + +#include "kvm_util.h" + +#define UCALL_EXIT_REASON KVM_EXIT_MMIO + +/* + * ucall_exit_mmio_addr holds per-VM values (global data is duplicated by each + * VM), it must not be accessed from host code. + */ +extern vm_vaddr_t *ucall_exit_mmio_addr; + +static inline void ucall_arch_do_ucall(vm_vaddr_t uc) +{ + WRITE_ONCE(*ucall_exit_mmio_addr, uc); +} + +#endif diff --git a/tools/testing/selftests/kvm/lib/loongarch/ucall.c b/tools/testing/selftests/kvm/lib/loongarch/ucall.c new file mode 100644 index ..fc6cbb50573f --- /dev/null +++ b/tools/testing/selftests/kvm/lib/loongarch/ucall.c @@ -0,0 +1,38 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * ucall support. A ucall is a "hypercall to userspace". + * + */ +#include "kvm_util.h" + +/* + * ucall_exit_mmio_addr holds per-VM values (global data is duplicated by each + * VM), it must not be accessed from host code. + */ +vm_vaddr_t *ucall_exit_mmio_addr; + +void ucall_arch_init(struct kvm_vm *vm, vm_paddr_t mmio_gpa) +{ + vm_vaddr_t mmio_gva = vm_vaddr_unused_gap(vm, vm->page_size, KVM_UTIL_MIN_VADDR); + + virt_map(vm, mmio_gva, mmio_gpa, 1); + + vm->ucall_mmio_addr = mmio_gpa; + + write_guest_global(vm, ucall_exit_mmio_addr, (vm_vaddr_t *)mmio_gva); +} + +void *ucall_arch_get_ucall(struct kvm_vcpu *vcpu) +{ + struct kvm_run *run = vcpu->run; + + if (run->exit_reason == KVM_EXIT_MMIO && + run->mmio.phys_addr == vcpu->vm->ucall_mmio_addr) { + TEST_ASSERT(run->mmio.is_write && run->mmio.len == sizeof(uint64_t), + "Unexpected ucall exit mmio address access"); + + return (void *)(*((uint64_t *)run->mmio.data)); + } + + return NULL; +} -- 2.39.3
[PATCH v8 0/4] KVM: selftests: Add LoongArch support
This patchset adds KVM selftests for LoongArch system, currently only some common test cases are supported and pass to run. These testcase are listed as following: coalesced_io_test demand_paging_test dirty_log_perf_test dirty_log_test guest_print_test hardware_disable_test kvm_binary_stats_test kvm_create_max_vcpus kvm_page_table_test memslot_modification_stress_test memslot_perf_test set_memory_region_test This patchset originally is posted from zhaotianrui, I continue to work on his efforts. --- Changes in v8: 1. Porting patch based on the latest version. 2. For macro PC_OFFSET_EXREGS, offsetof() method is used for C header file, still hardcoded definition for assemble language. Changes in v7: 1. Refine code to add LoongArch support in test case set_memory_region_test. Changes in v6: 1. Refresh the patch based on latest kernel 6.8-rc1, add LoongArch support about testcase set_memory_region_test. 2. Add hardware_disable_test test case. 3. Drop modification about macro DEFAULT_GUEST_TEST_MEM, it is problem of LoongArch binutils, this issue is raised to LoongArch binutils owners. Changes in v5: 1. In LoongArch kvm self tests, the DEFAULT_GUEST_TEST_MEM could be 0x13000, it is different from the default value in memstress.h. So we Move the definition of DEFAULT_GUEST_TEST_MEM into LoongArch ucall.h, and add 'ifndef' condition for DEFAULT_GUEST_TEST_MEM in memstress.h. Changes in v4: 1. Remove the based-on flag, as the LoongArch KVM patch series have been accepted by Linux kernel, so this can be applied directly in kernel. Changes in v3: 1. Improve implementation of LoongArch VM page walk. 2. Add exception handler for LoongArch. 3. Add dirty_log_test, dirty_log_perf_test, guest_print_test test cases for LoongArch. 4. Add __ASSEMBLER__ macro to distinguish asm file and c file. 5. Move ucall_arch_do_ucall to the header file and make it as static inline to avoid function calls. 6. Change the DEFAULT_GUEST_TEST_MEM base addr for LoongArch. Changes in v2: 1. We should use ".balign 4096" to align the assemble code with 4K in exception.S instead of "align 12". 2. LoongArch only supports 3 or 4 levels page tables, so we remove the hanlders for 2-levels page table. 3. Remove the DEFAULT_LOONGARCH_GUEST_STACK_VADDR_MIN and use the common DEFAULT_GUEST_STACK_VADDR_MIN to allocate stack memory in guest. 4. Reorganize the test cases supported by LoongArch. 5. Fix some code comments. 6. Add kvm_binary_stats_test test case into LoongArch KVM selftests. --- Bibo Mao (4): KVM: selftests: Add KVM selftests header files for LoongArch KVM: selftests: Add core KVM selftests support for LoongArch KVM: selftests: Add ucall test support for LoongArch KVM: selftests: Add test cases for LoongArch tools/testing/selftests/kvm/Makefile | 2 +- tools/testing/selftests/kvm/Makefile.kvm | 18 + .../testing/selftests/kvm/include/kvm_util.h | 5 + .../kvm/include/loongarch/kvm_util_arch.h | 7 + .../kvm/include/loongarch/processor.h | 138 +++ .../selftests/kvm/include/loongarch/ucall.h | 20 + .../selftests/kvm/lib/loongarch/exception.S | 59 +++ .../selftests/kvm/lib/loongarch/processor.c | 349 ++ .../selftests/kvm/lib/loongarch/ucall.c | 38 ++ .../selftests/kvm/set_memory_region_test.c| 2 +- 10 files changed, 636 insertions(+), 2 deletions(-) create mode 100644 tools/testing/selftests/kvm/include/loongarch/kvm_util_arch.h create mode 100644 tools/testing/selftests/kvm/include/loongarch/processor.h create mode 100644 tools/testing/selftests/kvm/include/loongarch/ucall.h create mode 100644 tools/testing/selftests/kvm/lib/loongarch/exception.S create mode 100644 tools/testing/selftests/kvm/lib/loongarch/processor.c create mode 100644 tools/testing/selftests/kvm/lib/loongarch/ucall.c base-commit: 0af2f6be1b4281385b618cb86ad946eded089ac8 -- 2.39.3
[PATCH v8 4/4] KVM: selftests: Add test cases for LoongArch
Some common KVM testcases are supported on LoongArch now as following: coalesced_io_test demand_paging_test dirty_log_perf_test dirty_log_test guest_print_test hardware_disable_test kvm_binary_stats_test kvm_create_max_vcpus kvm_page_table_test memslot_modification_stress_test memslot_perf_test set_memory_region_test And other test cases are not supported by LoongArch such as rseq_test, since it is not supported on LoongArch physical machine neither. Signed-off-by: Bibo Mao --- tools/testing/selftests/kvm/Makefile | 2 +- tools/testing/selftests/kvm/Makefile.kvm | 18 ++ .../selftests/kvm/set_memory_region_test.c | 2 +- 3 files changed, 20 insertions(+), 2 deletions(-) diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile index 20af35a91d6f..d9fffe06d3ea 100644 --- a/tools/testing/selftests/kvm/Makefile +++ b/tools/testing/selftests/kvm/Makefile @@ -3,7 +3,7 @@ top_srcdir = ../../../.. include $(top_srcdir)/scripts/subarch.include ARCH?= $(SUBARCH) -ifeq ($(ARCH),$(filter $(ARCH),arm64 s390 riscv x86 x86_64)) +ifeq ($(ARCH),$(filter $(ARCH),arm64 s390 riscv x86 x86_64 loongarch)) # Top-level selftests allows ARCH=x86_64 :-( ifeq ($(ARCH),x86_64) ARCH := x86 diff --git a/tools/testing/selftests/kvm/Makefile.kvm b/tools/testing/selftests/kvm/Makefile.kvm index f773f8f99249..4704ab16f19f 100644 --- a/tools/testing/selftests/kvm/Makefile.kvm +++ b/tools/testing/selftests/kvm/Makefile.kvm @@ -47,6 +47,10 @@ LIBKVM_riscv += lib/riscv/handlers.S LIBKVM_riscv += lib/riscv/processor.c LIBKVM_riscv += lib/riscv/ucall.c +LIBKVM_loongarch += lib/loongarch/processor.c +LIBKVM_loongarch += lib/loongarch/ucall.c +LIBKVM_loongarch += lib/loongarch/exception.S + # Non-compiled test targets TEST_PROGS_x86 += x86/nx_huge_pages_test.sh @@ -205,6 +209,20 @@ TEST_GEN_PROGS_riscv += kvm_page_table_test TEST_GEN_PROGS_riscv += set_memory_region_test TEST_GEN_PROGS_riscv += steal_time +TEST_GEN_PROGS_loongarch += coalesced_io_test +TEST_GEN_PROGS_loongarch += demand_paging_test +TEST_GEN_PROGS_loongarch += dirty_log_perf_test +TEST_GEN_PROGS_loongarch += dirty_log_test +TEST_GEN_PROGS_loongarch += demand_paging_test +TEST_GEN_PROGS_loongarch += guest_print_test +TEST_GEN_PROGS_loongarch += hardware_disable_test +TEST_GEN_PROGS_loongarch += kvm_binary_stats_test +TEST_GEN_PROGS_loongarch += kvm_create_max_vcpus +TEST_GEN_PROGS_loongarch += kvm_page_table_test +TEST_GEN_PROGS_loongarch += memslot_modification_stress_test +TEST_GEN_PROGS_loongarch += memslot_perf_test +TEST_GEN_PROGS_loongarch += set_memory_region_test + SPLIT_TESTS += arch_timer SPLIT_TESTS += get-reg-list diff --git a/tools/testing/selftests/kvm/set_memory_region_test.c b/tools/testing/selftests/kvm/set_memory_region_test.c index bc440d5aba57..ce3ac0fd6dfb 100644 --- a/tools/testing/selftests/kvm/set_memory_region_test.c +++ b/tools/testing/selftests/kvm/set_memory_region_test.c @@ -350,7 +350,7 @@ static void test_invalid_memory_region_flags(void) struct kvm_vm *vm; int r, i; -#if defined __aarch64__ || defined __riscv || defined __x86_64__ +#if defined __aarch64__ || defined __riscv || defined __x86_64__ || defined __loongarch__ supported_flags |= KVM_MEM_READONLY; #endif -- 2.39.3
Re: [PATCH v2] scripts/spdxcheck: Limit the scope of git.Repo
On Wed, 9 Apr 2025 at 22:40, Gon Solo wrote: > > Am Wed, Apr 09, 2025 at 08:04:19PM + schrieb Ricardo Ribalda: > > Changes in v2: > > - Make repo a local variable > > - Link to v1: > > https://lore.kernel.org/r/20250225-spx-v1-1-e935b27eb...@chromium.org > > This is not necessary as it was Python's fault and is fixed by 3.13.3 > which came out yesterday. I just checked. It will take some time before this reaches all distributions. This patch is relatively simple. I might be biased, but I think the benefits outweigh the cons. > -- Ricardo Ribalda
Re: [PATCH bpf-next v1 0/4] bpf, sockmap: Fix data loss and panic issues
On 2025-04-10 03:10:37, patchwork-bot+netdev...@kernel.org wrote: > Hello: > > This series was applied to bpf/bpf-next.git (master) > by Alexei Starovoitov : > > On Mon, 7 Apr 2025 22:21:19 +0800 you wrote: > > I was writing a benchmark based on sockmap + TCP and discovered several > > issues: > > > > 1. When EAGAIN occurs, the direction of skb is incorrect, causing data > >loss when retry. > > 2. When sending partial data, the offset is not recorded, leading to > >duplicate data being sent when retry. > > 3. An unexpected BUG_ON() judgment in skb_linearize is triggered. > > 4. The memory of psock->ingress_skb is not limited by the socket buffer > >and memcg. > > > > [...] LGTM thanks for the fixes Jiayuan. Good to see someone working through all the cases. already merged but ACK for me. > > Here is the summary with links: > - [bpf-next,v1,1/4] bpf, sockmap: Fix data lost during EAGAIN retries > https://git.kernel.org/bpf/bpf-next/c/7683167196bd > - [bpf-next,v1,2/4] bpf, sockmap: fix duplicated data transmission > https://git.kernel.org/bpf/bpf-next/c/3b4f14b79428 > - [bpf-next,v1,3/4] bpf, sockmap: Fix panic when calling skb_linearize > https://git.kernel.org/bpf/bpf-next/c/5ca2e29f6834 > - [bpf-next,v1,4/4] selftest/bpf/benchs: Add benchmark for sockmap usage > https://git.kernel.org/bpf/bpf-next/c/7b2fa44de5e7 > > You are awesome, thank you! > -- > Deet-doot-dot, I am a bot. > https://korg.docs.kernel.org/patchwork/pwbot.html > >
Re: [PATCH bpf-next v1 0/4] bpf, sockmap: Fix data loss and panic issues
Hello: This series was applied to bpf/bpf-next.git (master) by Alexei Starovoitov : On Mon, 7 Apr 2025 22:21:19 +0800 you wrote: > I was writing a benchmark based on sockmap + TCP and discovered several > issues: > > 1. When EAGAIN occurs, the direction of skb is incorrect, causing data >loss when retry. > 2. When sending partial data, the offset is not recorded, leading to >duplicate data being sent when retry. > 3. An unexpected BUG_ON() judgment in skb_linearize is triggered. > 4. The memory of psock->ingress_skb is not limited by the socket buffer >and memcg. > > [...] Here is the summary with links: - [bpf-next,v1,1/4] bpf, sockmap: Fix data lost during EAGAIN retries https://git.kernel.org/bpf/bpf-next/c/7683167196bd - [bpf-next,v1,2/4] bpf, sockmap: fix duplicated data transmission https://git.kernel.org/bpf/bpf-next/c/3b4f14b79428 - [bpf-next,v1,3/4] bpf, sockmap: Fix panic when calling skb_linearize https://git.kernel.org/bpf/bpf-next/c/5ca2e29f6834 - [bpf-next,v1,4/4] selftest/bpf/benchs: Add benchmark for sockmap usage https://git.kernel.org/bpf/bpf-next/c/7b2fa44de5e7 You are awesome, thank you! -- Deet-doot-dot, I am a bot. https://korg.docs.kernel.org/patchwork/pwbot.html
Re: [PATCH] selftests/bpf: Fix bpf_nf selftest failure
Hello: This patch was applied to bpf/bpf-next.git (master) by Andrii Nakryiko : On Wed, 9 Apr 2025 15:26:33 +0530 you wrote: > For systems with missing iptables-legacy tool, this selftest fails. > > Add check to find if iptables-legacy tool is available and skip the > test if the tool is missing. > > Fixes: de9c8d848d90 ("selftests/bpf: S/iptables/iptables-legacy/ in the > bpf_nf and xdp_synproxy test") > Signed-off-by: Saket Kumar Bhaskar > > [...] Here is the summary with links: - selftests/bpf: Fix bpf_nf selftest failure https://git.kernel.org/bpf/bpf-next/c/967e8def1100 You are awesome, thank you! -- Deet-doot-dot, I am a bot. https://korg.docs.kernel.org/patchwork/pwbot.html
Re: [PATCH 5/5] KVM: VMX: Always honor guest PAT on CPUs that support self-snoop
Hi, AFAIK, the commit c9c1e20b4c7d ("KVM: x86: Introduce Intel specific quirk KVM_X86_QUIRK_IGNORE_GUEST_PAT") which re-allows honoring guest PAT on Intel's platforms has been in kvm/queue now. However, as the quirk is enabled by default, userspace(like QEMU) needs to turn it off by code like "kvm_vm_enable_cap(kvm_state, KVM_CAP_DISABLE_QUIRKS2, 0, KVM_X86_QUIRK_IGNORE_GUEST_PAT)" to honor guest PAT, according to the doc: KVM_X86_QUIRK_IGNORE_GUEST_PAT ... Userspace can disable the quirk to honor guest PAT if it knows that there is no such guest software, for example if it does not expose a bochs graphics device (which is known to have had a buggy driver). Thanks Yan On Thu, Apr 10, 2025 at 01:13:18AM +, Myrsky Lintu wrote: > Hello, > > I am completely new to and uninformed about kernel development. I was > pointed here from Mesa documentation for Venus (Vulkan encapsulation for > KVM/QEMU): https://docs.mesa3d.org/drivers/venus.html > > Based on my limited understanding of what has happened here, this patch > series was partially reverted due to an issue with the Bochs DRM driver. > A fix for that issue has been merged months ago according to the link > provided in an earlier message. Since then work on this detail of KVM > seems to have stalled. > > Is it reasonable to ask here for this patch series to be evaluated and > incorporated again? > > My layperson's attempt at applying the series against 6.14.1 source code > failed. In addition to the parts that appear to have already been > incorporated there are some parts of the patch series that are rejected. > I lack the knowledge to correct that. > > Distro kernels currently ship without it which limits the usability of > Venus on AMD and NVIDIA GPUs paired with Intel CPUs. Convincing > individual distro maintainers of the necessity of this patch series > without the specialized knowledge required for understanding what it > does and performing that evaluation is quite hard. If upstream (kernel) > would apply it now the distros would ship a kernel including the > required changes to users, including me, without that multiplicated effort. > > Thank you for your time. If this request is out of place here please > forgive me for engaging this mailing list without a proper understanding > of the list's scope. > > On 2024-10-07 14:04:24, Linux regression tracking (Thorsten Leemhuis) wrote: > > On 07.10.24 15:38, Vitaly Kuznetsov wrote: > >> "Linux regression tracking (Thorsten Leemhuis)" > >> writes: > >> > >>> On 30.08.24 11:35, Vitaly Kuznetsov wrote: > Sean Christopherson writes: > > > Unconditionally honor guest PAT on CPUs that support self-snoop, as > > Intel has confirmed that CPUs that support self-snoop always snoop > > caches > > and store buffers. I.e. CPUs with self-snoop maintain cache coherency > > even in the presence of aliased memtypes, thus there is no need to trust > > the guest behaves and only honor PAT as a last resort, as KVM does > > today. > > > > Honoring guest PAT is desirable for use cases where the guest has access > > to non-coherent DMA _without_ bouncing through VFIO, e.g. when a virtual > > (mediated, for all intents and purposes) GPU is exposed to the guest, > > along > > with buffers that are consumed directly by the physical GPU, i.e. which > > can't be proxied by the host to ensure writes from the guest are > > performed > > with the correct memory type for the GPU. > > Necroposting! > > Turns out that this change broke "bochs-display" driver in QEMU even > when the guest is modern (don't ask me 'who the hell uses bochs for > modern guests', it was basically a configuration error :-). E.g: > [...] > >>> > >>> This regression made it to the list of tracked regressions. It seems > >>> this thread stalled a while ago. Was this ever fixed? Does not look like > >>> it, but I might have missed something. Or is this a regression I should > >>> just ignore for one reason or another? > >>> > >> > >> The regression was addressed in by reverting 377b2f359d1f in 6.11 > >> > >> commit 9d70f3fec14421e793ffbc0ec2f739b24e534900 > >> Author: Paolo Bonzini > >> Date: Sun Sep 15 02:49:33 2024 -0400 > >> > >> Revert "KVM: VMX: Always honor guest PAT on CPUs that support > >> self-snoop" > > > > Thx. Sorry, missed that, thx for pointing me towards it. I had looked > > for things like that, but seems I messed up my lore query. Apologies for > > the noise! > > > >> Also, there's a (pending) DRM patch fixing it from the guest's side: > >> https://gitlab.freedesktop.org/drm/misc/kernel/-/commit/9388ccf69925223223c87355a417ba39b13a5e8e > > > > Great! > > > > Ciao, Thorsten > > > > P.S.: > > > > #regzbot fix: 9d70f3fec14421
[linus:master] [fs/dax] bde708f1a6: WARNING:at_mm/truncate.c:#truncate_folio_batch_exceptionals
Hello, kernel test robot noticed "WARNING:at_mm/truncate.c:#truncate_folio_batch_exceptionals" on: commit: bde708f1a65d025c45575bfe1e7bf7bdf7e71e87 ("fs/dax: always remove DAX page-cache entries when breaking layouts") https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master in testcase: xfstests version: xfstests-x86_64-8467552f-1_20241215 with following parameters: bp1_memmap: 4G!8G bp2_memmap: 4G!10G bp3_memmap: 4G!16G bp4_memmap: 4G!22G nr_pmem: 4 fs: ext2 test: generic-dax config: x86_64-rhel-9.4-func compiler: gcc-12 test machine: 8 threads Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz (Skylake) with 28G memory (please refer to attached dmesg/kmsg for entire log/backtrace) If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot | Closes: https://lore.kernel.org/oe-lkp/202504101036.390f29a5-...@intel.com [ 46.394237][ T4025] [ cut here ] [ 46.399593][ T4025] WARNING: CPU: 7 PID: 4025 at mm/truncate.c:89 truncate_folio_batch_exceptionals (mm/truncate.c:89 (discriminator 1)) [ 46.409748][ T4025] Modules linked in: ext2 snd_hda_codec_hdmi snd_ctl_led snd_hda_codec_realtek snd_hda_codec_generic snd_hda_scodec_component intel_rapl_msr btrfs intel_rapl_common blake2b_generic xor ipmi_devintf zstd_compress ipmi_msghandler x86_pkg_temp_thermal snd_soc_avs intel_powerclamp raid6_pq snd_soc_hda_codec snd_hda_ext_core coretemp snd_soc_core snd_compress kvm_intel i915 sd_mod snd_hda_intel kvm snd_intel_dspcfg sg snd_intel_sdw_acpi intel_gtt snd_hda_codec cec dell_pc platform_profile drm_buddy ghash_clmulni_intel snd_hda_core sha512_ssse3 dell_wmi ttm sha256_ssse3 snd_hwdep nd_pmem sha1_ssse3 dell_smbios drm_display_helper nd_btt snd_pcm dax_pmem mei_wdt rapl drm_kms_helper ahci mei_me snd_timer libahci rfkill nd_e820 intel_cstate sparse_keymap video wmi_bmof dcdbas dell_wmi_descriptor libnvdimm libata pcspkr intel_uncore i2c_i801 snd mei i2c_smbus soundcore intel_pch_thermal intel_pmc_core intel_vsec wmi pmt_telemetry acpi_pad pmt_class binfmt_misc fuse loop drm dm_mod ip_tables [ 46.498759][ T4025] CPU: 7 UID: 0 PID: 4025 Comm: umount Not tainted 6.14.0-rc6-00297-gbde708f1a65d #1 [ 46.508156][ T4025] Hardware name: Dell Inc. OptiPlex 7040/0Y7WYT, BIOS 1.2.8 01/26/2016 [ 46.516324][ T4025] RIP: 0010:truncate_folio_batch_exceptionals (mm/truncate.c:89 (discriminator 1)) [ 46.523347][ T4025] Code: 84 70 ff ff ff 4d 63 fd 49 83 ff 1e 0f 87 d4 01 00 00 4c 89 f0 48 c1 e8 03 42 80 3c 00 00 0f 85 9b 01 00 00 41 f6 06 01 74 c6 <0f> 0b 48 89 c8 48 c1 e8 03 42 80 3c 00 00 0f 85 d6 01 00 00 48 8b All code 0: 84 70 fftest %dh,-0x1(%rax) 3: ff (bad) 4: ff 4d 63decl 0x63(%rbp) 7: fd std 8: 49 83 ff 1e cmp$0x1e,%r15 c: 0f 87 d4 01 00 00 ja 0x1e6 12: 4c 89 f0mov%r14,%rax 15: 48 c1 e8 03 shr$0x3,%rax 19: 42 80 3c 00 00 cmpb $0x0,(%rax,%r8,1) 1e: 0f 85 9b 01 00 00 jne0x1bf 24: 41 f6 06 01 testb $0x1,(%r14) 28: 74 c6 je 0xfff0 2a:* 0f 0b ud2 <-- trapping instruction 2c: 48 89 c8mov%rcx,%rax 2f: 48 c1 e8 03 shr$0x3,%rax 33: 42 80 3c 00 00 cmpb $0x0,(%rax,%r8,1) 38: 0f 85 d6 01 00 00 jne0x214 3e: 48 rex.W 3f: 8b .byte 0x8b Code starting with the faulting instruction === 0: 0f 0b ud2 2: 48 89 c8mov%rcx,%rax 5: 48 c1 e8 03 shr$0x3,%rax 9: 42 80 3c 00 00 cmpb $0x0,(%rax,%r8,1) e: 0f 85 d6 01 00 00 jne0x1ea 14: 48 rex.W 15: 8b .byte 0x8b [ 46.542938][ T4025] RSP: 0018:c9000d74f370 EFLAGS: 00010202 [ 46.548900][ T4025] RAX: 192001ae9ec8 RBX: 8881e2d959d8 RCX: c9000d74f4f8 [ 46.556787][ T4025] RDX: 0001 RSI: c9000d74f638 RDI: 8881e2d95874 [ 46.564676][ T4025] RBP: 192001ae9e74 R08: dc00 R09: f52001ae9eec [ 46.572557][ T4025] R10: 0003 R11: 1110d4bf8d9c R12: c9000d74f638 [ 46.580439][ T4025] R13: R14: c9000d74f640 R15: [ 46.588319][ T4025] FS: 7fe696f3a840() GS:8886a5f8() knlGS: [ 46.597163][ T4025] CS: 0010 DS: ES: CR0: 80050033 [ 46.603691][ T4025] CR2: 7fff266c5ec0 CR3: 0001ead7e002 CR4: 003726f0 [ 46.611586][ T4025] DR0: DR1: DR2:
[PATCH v2] scripts/spdxcheck: Limit the scope of git.Repo
If the git.Repo object's scope extends to the Python interpreter's shutdown phase, its destructor may fail due to the interpreter's state. Exception ignored in: Traceback (most recent call last): File "/usr/lib/python3/dist-packages/git/cmd.py", line 565, in __del__ File "/usr/lib/python3/dist-packages/git/cmd.py", line 546, in _terminate File "/usr/lib/python3.13/subprocess.py", line 2227, in terminate ImportError: sys.meta_path is None, Python is likely shutting down Make repo a variable of the function read_spdxdata() and scan_git_tree() to limit the scope of git.Repo and ensure proper resource management. Signed-off-by: Ricardo Ribalda --- Changes in v2: - Make repo a local variable - Link to v1: https://lore.kernel.org/r/20250225-spx-v1-1-e935b27eb...@chromium.org --- scripts/spdxcheck.py | 30 +++--- 1 file changed, 15 insertions(+), 15 deletions(-) diff --git a/scripts/spdxcheck.py b/scripts/spdxcheck.py index 8d608f61bf371647e7ca0129f583e94e535b6193..d5c4c37f1b068486af28110261f74c67301618a9 100755 --- a/scripts/spdxcheck.py +++ b/scripts/spdxcheck.py @@ -45,7 +45,9 @@ class dirinfo(object): self.files.append(fname) # Read the spdx data from the LICENSES directory -def read_spdxdata(repo): +def read_spdxdata(): +repo = git.Repo(os.getcwd()) +assert not repo.bare # The subdirectories of LICENSES in the kernel source # Note: exceptions needs to be parsed as last directory. @@ -295,7 +297,15 @@ def exclude_file(fpath): return True return False -def scan_git_tree(tree, basedir, dirdepth): +def scan_git_tree(basedir, dirdepth): +repo = git.Repo(os.getcwd()) +tree = repo.head.commit.tree + +basedir = basedir.strip('/') +if basedir != '.': +for p in basedir.split('/'): +tree = tree[p] + parser.set_dirinfo(basedir, dirdepth) for el in tree.traverse(): if not os.path.isfile(el.path): @@ -306,11 +316,6 @@ def scan_git_tree(tree, basedir, dirdepth): with open(el.path, 'rb') as fd: parser.parse_lines(fd, args.maxlines, el.path) -def scan_git_subtree(tree, path, dirdepth): -for p in path.strip('/').split('/'): -tree = tree[p] -scan_git_tree(tree, path.strip('/'), dirdepth) - def read_exclude_file(fname): rules = [] if not fname: @@ -348,12 +353,8 @@ if __name__ == '__main__': sys.exit(1) try: -# Use git to get the valid license expressions -repo = git.Repo(os.getcwd()) -assert not repo.bare - # Initialize SPDX data -spdx = read_spdxdata(repo) +spdx = read_spdxdata() # Initialize the parser parser = id_parser(spdx) @@ -389,14 +390,13 @@ if __name__ == '__main__': if os.path.isfile(p): parser.parse_lines(open(p, 'rb'), args.maxlines, p) elif os.path.isdir(p): -scan_git_subtree(repo.head.reference.commit.tree, p, - args.depth) +scan_git_tree(p, args.depth) else: sys.stderr.write('path %s does not exist\n' %p) sys.exit(1) else: # Full git tree scan -scan_git_tree(repo.head.commit.tree, '.', args.depth) +scan_git_tree('.', args.depth) ndirs = len(parser.spdx_dirs) dirsok = 0 --- base-commit: d082ecbc71e9e0bf49883ee4afd435a77a5101b6 change-id: 20250225-spx-382cf543370e Best regards, -- Ricardo Ribalda
Re: [PATCH 3/4] sysctl: call sysctl tests with a for loop
On Fri, Mar 21, 2025 at 01:47:26PM +0100, Joel Granados wrote: > As we add more test functions in lib/tests_sysctl the main test function > (test_sysctl_init) grows. Condense the logic to make it easier to > add/remove tests. > > Signed-off-by: Joel Granados Nice cleanup! Reviewed-by: Kees Cook -- Kees Cook
Re: [PATCH RFC v7 3/8] security: Export security_inode_init_security_anon for KVM guest_memfd
On Tue, Apr 8, 2025 at 7:25 AM Shivank Garg wrote: > > KVM guest_memfd is implementing its own inodes to store metadata for > backing memory using a custom filesystem. This requires the ability to > initialize anonymous inode using security_inode_init_security_anon(). > > As guest_memfd currently resides in the KVM module, we need to export this > symbol for use outside the core kernel. In the future, guest_memfd might be > moved to core-mm, at which point the symbols no longer would have to be > exported. When/if that happens is still unclear. Can you help me understand the timing just a bit more ... do you expect the move to the core MM code to happen during the lifetime of this patchset, or is it just some hand-wavy "future date"? No worries either way, just trying to understand things a bit better. > Signed-off-by: Shivank Garg > --- > security/security.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/security/security.c b/security/security.c > index fb57e8fddd91..097283bb06a5 100644 > --- a/security/security.c > +++ b/security/security.c > @@ -1877,6 +1877,7 @@ int security_inode_init_security_anon(struct inode > *inode, > return call_int_hook(inode_init_security_anon, inode, name, > context_inode); > } > +EXPORT_SYMBOL(security_inode_init_security_anon); > > #ifdef CONFIG_SECURITY_PATH > /** > -- > 2.34.1 -- paul-moore.com
RE: spdxcheck: python git module considered harmful (was RE: [PATCH] scripts/spdxcheck: Limit the scope of git.Repo)
Tim! On Wed, Apr 09 2025 at 17:44, Tim Bird wrote: >> From: Thomas Gleixner >> On Tue, Apr 08 2025 at 17:34, Tim Bird wrote: >> And yes, it ignores not yet tracked files, but if you want to check >> them, then it's easy enough to commit them temporarily or provide a >> dedicated file target to the tools, which ignores git. > > OK. Yes. That's an easy workaround. Actually spdxcheck supports that already: scripts/spdxcheck.py path/to/file >> Good luck for coming up with a clever and clean solution for that! > > I thought about various solutions for this, but each one I came up > with had other drawbacks. If it was just a matter of separating > *.[chS] files from ELF object files, that would be easy to deal with. > But we put SPDX headers on all kinds of files, and there are lots > of other types of files generated during a build that are not just > ELF objects. And build rules change over time. So even if I made > a comprehensive system today to catch build-generated outliers, > the solution would probably need constant updating and tweaking, which > IMHO makes it a no-go. I'm glad that I'm not the only one who came to this conclusion :) >> Just for the record: I rather wish that people would contribute to >> eliminate the remaining 17% (15397 files) which do not have SPDX >> identifiers than complaining about the trivial to solve short-comings of >> the tool, which was written to help this effort and to make sure that it >> does not degrade. > > I agree with this. Analyzing where the headers are missing is interesting. > But it's more important to just fix the missing ones. > I'll spend more of my time working on missing headers, > rather than on tools to analyze and report them. Very appreciated. Thanks, tglx
[PATCH v4] remoteproc: imx_dsp_rproc: Add support for DSP-specific features
From: Iuliana Prodan Some DSP firmware requires a FW_READY signal before proceeding, while others do not. Therefore, add support to handle i.MX DSP-specific features. Implement handle_rsc callback to handle resource table parsing and to process DSP-specific resource, to determine if waiting is needed. Update imx_dsp_rproc_start() to handle this condition accordingly. Signed-off-by: Iuliana Prodan --- Changes in v4: - Reviews from Mathieu Poirier: - Adjusted len to include the size of struct fw_rsc_imx_dsp. - Updated len validation checks. - Review from Frank Li: - In imx_dsp_rproc_handle_rsc(), removed the goto ignored statement. - In probe(), set flags to WAIT_FW_READY to ensure the host waits for fw_ready when no vendor-specific resource is defined. - Link to v3: https://lore.kernel.org/all/20250403100124.637889-1-iuliana.pro...@oss.nxp.com/ Changes in v3: - Reviews from Mathieu Poirier: - Added version and magic number to vendor-specific resource table entry. - Updated defines to maintain backward compatibility with a resource table that doesn't have a vendor-specific resource. - By default, wait for `fw_ready`, unless specified otherwise. - Link to v2: https://lore.kernel.org/all/20250318215007.2109726-1-iuliana.pro...@oss.nxp.com Changes in v2: - Reviews from Mathieu Poirier: - Use vendor-specific resource table entry. - Implement resource handler specific to the i.MX DSP. - Revise commit message to include recent updates. - Link to v1: https://lore.kernel.org/all/20250305123923.514386-1-iuliana.pro...@oss.nxp.com/ drivers/remoteproc/imx_dsp_rproc.c | 98 +- 1 file changed, 96 insertions(+), 2 deletions(-) diff --git a/drivers/remoteproc/imx_dsp_rproc.c b/drivers/remoteproc/imx_dsp_rproc.c index b9bb15970966..e4212e624a91 100644 --- a/drivers/remoteproc/imx_dsp_rproc.c +++ b/drivers/remoteproc/imx_dsp_rproc.c @@ -35,9 +35,18 @@ module_param_named(no_mailboxes, no_mailboxes, int, 0644); MODULE_PARM_DESC(no_mailboxes, "There is no mailbox between cores, so ignore remote proc reply after start, default is 0 (off)."); +/* Flag indicating that the remote is up and running */ #define REMOTE_IS_READYBIT(0) +/* Flag indicating that the host should wait for a firmware-ready response */ +#define WAIT_FW_READY BIT(1) #define REMOTE_READY_WAIT_MAX_RETRIES 500 +/* + * This flag is set in the DSP resource table's features field to indicate + * that the firmware requires the host NOT to wait for a FW_READY response. + */ +#define FEATURE_DONT_WAIT_FW_READY BIT(0) + /* att flags */ /* DSP own area */ #define ATT_OWNBIT(31) @@ -72,6 +81,10 @@ MODULE_PARM_DESC(no_mailboxes, #define IMX8ULP_SIP_HIFI_XRDC 0xc20e +#define FW_RSC_NXP_S_MAGIC ((uint32_t)'n' << 24 | \ +(uint32_t)'x' << 16 | \ +(uint32_t)'p' << 8 | \ +(uint32_t)'s') /* * enum - Predefined Mailbox Messages * @@ -136,6 +149,24 @@ struct imx_dsp_rproc_dcfg { int (*reset)(struct imx_dsp_rproc *priv); }; +/** + * struct fw_rsc_imx_dsp - i.MX DSP specific info + * + * @len: length of the resource entry + * @magic_num: 32-bit magic number + * @version: version of data structure + * @features: feature flags supported by the i.MX DSP firmware + * + * This represents a DSP-specific resource in the firmware's + * resource table, providing information on supported features. + */ +struct fw_rsc_imx_dsp { + uint32_t len; + uint32_t magic_num; + uint32_t version; + uint32_t features; +} __packed; + static const struct imx_rproc_att imx_dsp_rproc_att_imx8qm[] = { /* dev addr , sys addr , size , flags */ { 0x596e8000, 0x556e8000, 0x8000, ATT_OWN }, @@ -300,6 +331,66 @@ static int imx_dsp_rproc_ready(struct rproc *rproc) return -ETIMEDOUT; } +/** + * imx_dsp_rproc_handle_rsc() - Handle DSP-specific resource table entries + * @rproc: remote processor instance + * @rsc_type: resource type identifier + * @rsc: pointer to the resource entry + * @offset: offset of the resource entry + * @avail: available space in the resource table + * + * Parse the DSP-specific resource entry and update flags accordingly. + * If the WAIT_FW_READY feature is set, the host must wait for the firmware + * to signal readiness before proceeding with execution. + * + * Return: RSC_HANDLED if processed successfully, RSC_IGNORED otherwise. + */ +static int imx_dsp_rproc_handle_rsc(struct rproc *rproc, u32 rsc_type, + void *rsc, int offset, int avail) +{ + struct imx_dsp_rproc *priv = rproc->priv; + struct fw_rsc_imx_dsp *imx_dsp_rsc = rsc; + struct device *dev = rproc->dev.parent; + +
Re: [PATCH v2] DAX: warn when kmem regions are truncated for memory block alignment.
On Tue, Apr 01, 2025 at 09:59:20PM -0400, Gregory Price wrote: > Device capacity intended for use as system ram should be aligned to the > archite-defined memory block size or that capacity will be silently > truncated and capacity stranded. > > As hotplug dax memory becomes more prevelant, the memory block size > alignment becomes more important for platform and device vendors to > pay attention to - so this truncation should not be silent. > > This issue is particularly relevant for CXL Dynamic Capacity devices, > whose capacity may arrive in spec-aligned but block-misaligned chunks. > > Suggested-by: David Hildenbrand > Suggested-by: Dan Williams > Signed-off-by: Gregory Price Existing unit test 'daxctl-devices.sh' hits this case: [ 52.547521] kmem dax3.0: DAX region truncated by 62.0 MiB due to alignment Tested-by: Alison Schofield snip >
[PATCH] selftests/sgx: Fix an enclave built with extended instructions
Creating an enclave with xfrm == 3 disables extended CPU states and instruction sets, like AVX2 and AVX512 inside the enclave. Thus the enclave code has to be built with a compiler which does not produce instructions from the extended instruction sets. Nevertheless certain Linux distributions confgure a compiler so it produces extended instructions by default ("--with-arch_64=x86-64-v3" for gcc). Thus an enclave code from test_encl.c is built with extended instructions and an enclave execution hits the #UD exception (note exception_vector == 6): # ./test_sgx ... # RUN enclave.unclobbered_vdso_oversubscribed_remove ... # main.c:481:unclobbered_vdso_oversubscribed_remove:Expected self->run.exception_vector (6) == 0 (0) # main.c:485:unclobbered_vdso_oversubscribed_remove:Expected self->run.function (3) == EEXIT (4) # unclobbered_vdso_oversubscribed_remove: Test terminated by assertion # FAIL enclave.unclobbered_vdso_oversubscribed_remove not ok 3 enclave.unclobbered_vdso_oversubscribed_remove Fix this by adding "-mno-avx" to ENCL_CFLAGS in Makefile. Add some comments about this to code locations where enclave's xfrm field is set. Suggested-by: Dave Hansen Signed-off-by: Vladis Dronov --- an out-of-commit-message note: I would greatly appreciate if someone reviews and possibly fixes my wording of the commit message and the code comments. tools/testing/selftests/sgx/Makefile| 2 +- tools/testing/selftests/sgx/load.c | 8 +++- tools/testing/selftests/sgx/sigstruct.c | 6 ++ 3 files changed, 14 insertions(+), 2 deletions(-) diff --git a/tools/testing/selftests/sgx/Makefile b/tools/testing/selftests/sgx/Makefile index 03b5e13b872b..ab2561b4456d 100644 --- a/tools/testing/selftests/sgx/Makefile +++ b/tools/testing/selftests/sgx/Makefile @@ -15,7 +15,7 @@ INCLUDES := -I$(top_srcdir)/tools/include HOST_CFLAGS := -Wall -Werror -g $(INCLUDES) -fPIC $(CFLAGS) HOST_LDFLAGS := -z noexecstack -lcrypto ENCL_CFLAGS += -Wall -Werror -static-pie -nostdlib -ffreestanding -fPIE \ - -fno-stack-protector -mrdrnd $(INCLUDES) + -fno-stack-protector -mrdrnd -mno-avx $(INCLUDES) ENCL_LDFLAGS := -Wl,-T,test_encl.lds,--build-id=none ifeq ($(CAN_BUILD_X86_64), 1) diff --git a/tools/testing/selftests/sgx/load.c b/tools/testing/selftests/sgx/load.c index c9f658e44de6..79946ca8f1a5 100644 --- a/tools/testing/selftests/sgx/load.c +++ b/tools/testing/selftests/sgx/load.c @@ -88,10 +88,16 @@ static bool encl_ioc_create(struct encl *encl) memset(secs, 0, sizeof(*secs)); secs->ssa_frame_size = 1; secs->attributes = SGX_ATTR_MODE64BIT; - secs->xfrm = 3; secs->base = encl->encl_base; secs->size = encl->encl_size; + /* +* Setting xfrm to 3 disables extended CPU states and instruction sets +* like AVX2 inside the enclave. Thus the enclave code has to be built +* without instructions from extended instruction sets (-mno-avx). +*/ + secs->xfrm = 3; + ioc.src = (unsigned long)secs; rc = ioctl(encl->fd, SGX_IOC_ENCLAVE_CREATE, &ioc); if (rc) { diff --git a/tools/testing/selftests/sgx/sigstruct.c b/tools/testing/selftests/sgx/sigstruct.c index d73b29becf5b..f548392a2fee 100644 --- a/tools/testing/selftests/sgx/sigstruct.c +++ b/tools/testing/selftests/sgx/sigstruct.c @@ -331,6 +331,12 @@ bool encl_measure(struct encl *encl) sigstruct->header.header2[1] = header2[1]; sigstruct->exponent = 3; sigstruct->body.attributes = SGX_ATTR_MODE64BIT; + + /* +* Setting xfrm to 3 disables extended CPU states and instruction sets +* like AVX2 inside the enclave. Thus the enclave code has to be built +* without instructions from extended instruction sets (-mno-avx). +*/ sigstruct->body.xfrm = 3; /* sanity check */ -- 2.49.0
Re: [PATCH v2] scripts/spdxcheck: Limit the scope of git.Repo
Am Wed, Apr 09, 2025 at 08:04:19PM + schrieb Ricardo Ribalda: > Changes in v2: > - Make repo a local variable > - Link to v1: > https://lore.kernel.org/r/20250225-spx-v1-1-e935b27eb...@chromium.org This is not necessary as it was Python's fault and is fixed by 3.13.3 which came out yesterday. I just checked.
Re: [PATCH 1/4] sysctl: move u8 register test to lib/test_sysctl.c
On Fri, Mar 21, 2025 at 01:47:24PM +0100, Joel Granados wrote: > If the test added in commit b5ffbd139688 ("sysctl: move the extra1/2 > boundary check of u8 to sysctl_check_table_array") is run as a module, a > lingering reference to the module is left behind, and a 'sysctl -a' > leads to a panic. > > To reproduce > CONFIG_KUNIT=y > CONFIG_SYSCTL_KUNIT_TEST=m > > Then run these commands: > modprobe sysctl-test > rmmod sysctl-test > sysctl -a > > The panic varies but generally looks something like this: > > BUG: unable to handle page fault for address: a4571c0c7db4 > #PF: supervisor read access in kernel mode > #PF: error_code(0x) - not-present page > PGD 10067 P4D 10067 PUD 100351067 PMD 114f5e067 PTE 0 > Oops: Oops: [#1] SMP NOPTI > ... ... ... > RIP: 0010:proc_sys_readdir+0x166/0x2c0 > ... ... ... > Call Trace: > > iterate_dir+0x6e/0x140 > __se_sys_getdents+0x6e/0x100 > do_syscall_64+0x70/0x150 > entry_SYSCALL_64_after_hwframe+0x76/0x7e > > Move the test to lib/test_sysctl.c where the registration reference is > handled on module exit > > 'Fixes: b5ffbd139688 ("sysctl: move the extra1/2 boundary check of u8 to Typoe: drop leading ' > sysctl_check_table_array")' And avoid wrapping this line for the field. > > Signed-off-by: Joel Granados Otherwise looks good to me. Reviewed-by: Kees Cook (And I should note that there is a push to move kunit tests into a "/tests/" subdir, but that's separate from this series.) -- Kees Cook
Re: [PATCH] compiler.h: Avoid the usage of __typeof_unqual__() when __GENKSYMS__ is defined
On Wed, Apr 9, 2025 at 5:38 PM Borislav Petkov wrote: > > On Wed, Apr 09, 2025 at 05:32:39PM +0200, Uros Bizjak wrote: > > The workaround is posted to the list. It should be committed to the > > mainline until genksyms is fixed. > > I'll take it through tip. Thanks! Best regards, Uros.
Re: [PATCH v2] DAX: warn when kmem regions are truncated for memory block alignment.
On Wed, Apr 09, 2025 at 03:21:53PM -0700, Alison Schofield wrote: > Existing unit test 'daxctl-devices.sh' hits this case: > [ 52.547521] kmem dax3.0: DAX region truncated by 62.0 MiB due to alignment > > Tested-by: Alison Schofield > Thanks for testing, good to know there's an existing test for this will respin w/ Jonathan's recommendations and tags. ~Gregory
[GIT PULL] Kselftest fixes update for Linux 6.15-rc2
Hi Linus, Please pull the following kselftest fixes update for Linux 6.15-rc2 Fixes tpm2, futex, and mincore tests. Creates a dedicated .gitignore for tpm2 Details: selftests: tpm2: test_smoke: use POSIX-conformant expression operator selftests/futex: futex_waitv wouldblock test should fail selftests: tpm2: create a dedicated .gitignore selftests/mincore: Allow read-ahead pages to reach the end of the file diff is attached. thanks, -- Shuah The following changes since commit 0af2f6be1b4281385b618cb86ad946eded089ac8: Linux 6.15-rc1 (2025-04-06 13:11:33 -0700) are available in the Git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest tags/linux_kselftest-fixes-6.15-rc2 for you to fetch changes up to 197c1eaa7ba633a482ed7588eea6fd4aa57e08d4: selftests/mincore: Allow read-ahead pages to reach the end of the file (2025-04-08 17:08:50 -0600) linux_kselftest-fixes-6.15-rc2 Fixes tpm2, futex, and mincore tests. Creates a dedicated .gitignore for tpm2 Details: selftests: tpm2: test_smoke: use POSIX-conformant expression operator selftests/futex: futex_waitv wouldblock test should fail selftests: tpm2: create a dedicated .gitignore selftests/mincore: Allow read-ahead pages to reach the end of the file Ahmed Salem (1): selftests: tpm2: test_smoke: use POSIX-conformant expression operator Edward Liaw (1): selftests/futex: futex_waitv wouldblock test should fail Khaled Elnaggar (1): selftests: tpm2: create a dedicated .gitignore Qiuxu Zhuo (1): selftests/mincore: Allow read-ahead pages to reach the end of the file tools/testing/selftests/.gitignore | 1 - tools/testing/selftests/futex/functional/futex_wait_wouldblock.c | 2 +- tools/testing/selftests/mincore/mincore_selftest.c | 3 --- tools/testing/selftests/tpm2/.gitignore | 3 +++ tools/testing/selftests/tpm2/test_smoke.sh | 2 +- 5 files changed, 5 insertions(+), 6 deletions(-) create mode 100644 tools/testing/selftests/tpm2/.gitignore diff --git a/tools/testing/selftests/.gitignore b/tools/testing/selftests/.gitignore index cb24124ac5b9..674aaa02e396 100644 --- a/tools/testing/selftests/.gitignore +++ b/tools/testing/selftests/.gitignore @@ -4,7 +4,6 @@ gpiogpio-hammer gpioinclude/ gpiolsgpio kselftest_install/ -tpm2/SpaceTest.log # Python bytecode and cache __pycache__/ diff --git a/tools/testing/selftests/futex/functional/futex_wait_wouldblock.c b/tools/testing/selftests/futex/functional/futex_wait_wouldblock.c index 7d7a6a06cdb7..2d8230da9064 100644 --- a/tools/testing/selftests/futex/functional/futex_wait_wouldblock.c +++ b/tools/testing/selftests/futex/functional/futex_wait_wouldblock.c @@ -98,7 +98,7 @@ int main(int argc, char *argv[]) info("Calling futex_waitv on f1: %u @ %p with val=%u\n", f1, &f1, f1+1); res = futex_waitv(&waitv, 1, 0, &to, CLOCK_MONOTONIC); if (!res || errno != EWOULDBLOCK) { - ksft_test_result_pass("futex_waitv returned: %d %s\n", + ksft_test_result_fail("futex_waitv returned: %d %s\n", res ? errno : res, res ? strerror(errno) : ""); ret = RET_FAIL; diff --git a/tools/testing/selftests/mincore/mincore_selftest.c b/tools/testing/selftests/mincore/mincore_selftest.c index e949a43a6145..efabfcbe0b49 100644 --- a/tools/testing/selftests/mincore/mincore_selftest.c +++ b/tools/testing/selftests/mincore/mincore_selftest.c @@ -261,9 +261,6 @@ TEST(check_file_mmap) TH_LOG("No read-ahead pages found in memory"); } - EXPECT_LT(i, vec_size) { - TH_LOG("Read-ahead pages reached the end of the file"); - } /* * End of the readahead window. The rest of the pages shouldn't * be in memory. diff --git a/tools/testing/selftests/tpm2/.gitignore b/tools/testing/selftests/tpm2/.gitignore new file mode 100644 index ..6d6165c5e35d --- /dev/null +++ b/tools/testing/selftests/tpm2/.gitignore @@ -0,0 +1,3 @@ +# SPDX-License-Identifier: GPL-2.0-only +AsyncTest.log +SpaceTest.log diff --git a/tools/testing/selftests/tpm2/test_smoke.sh b/tools/testing/selftests/tpm2/test_smoke.sh index 168f4b166234..3a60e6c6f5c9 100755 --- a/tools/testing/selftests/tpm2/test_smoke.sh +++ b/tools/testing/selftests/tpm2/test_smoke.sh @@ -6,6 +6,6 @@ ksft_skip=4 [ -e /dev/tpm0 ] || exit $ksft_skip read tpm_version < /sys/class/tpm/tpm0/tpm_version_major -[ "$tpm_version" == 2 ] || exit $ksft_skip +[ "$tpm_version" = 2 ] || exit $ksft_skip python3 -m unittest -v tpm2_tests.SmokeTest 2>&1
Re: [PATCH RESEND bpf-next v2] selftests/bpf: close the file descriptor to avoid resource leaks
On Wed, Apr 9, 2025 at 4:28 PM Andrii Nakryiko wrote: > > On Tue, Apr 8, 2025 at 11:33 AM Malaya Kumar Rout > wrote: > > > > Static Analyis for bench_htab_mem.c with cppcheck:error > > typo: analysis (lower case and typo) > > you can also make into a bit more human-readable sentence: > > "Static analysis found an issue in bench_htab_mem.c: > > > > tools/testing/selftests/bpf/benchs/bench_htab_mem.c:284:3: > > error: Resource leak: fd [resourceLeak] > > tools/testing/selftests/bpf/prog_tests/sk_assign.c:41:3: > > error: Resource leak: tc [resourceLeak] > > > > fix the issue by closing the file descriptor (fd & tc) when > > read & fgets operation fails. > > "Fix the issue by closing" (capitalization, single space between words > > > > > Signed-off-by: Malaya Kumar Rout > > --- > > tools/testing/selftests/bpf/benchs/bench_htab_mem.c | 3 +-- > > tools/testing/selftests/bpf/prog_tests/sk_assign.c | 4 +++- > > 2 files changed, 4 insertions(+), 3 deletions(-) > > > > For some reason this patch didn't make it into our Patchworks system, > so we can't easily apply it. Please fix the above commit issues and > resubmit, hopefully this time it goes through just fine. I thought it was a maintenance issue with patchwork, but it's probably something else. I'm guessing your cc list confuses pw bot. Pls drop linux-kselftest@vger and lkml@vger and keep bpf@vger only.
Re: [PATCH RESEND bpf-next v2] selftests/bpf: close the file descriptor to avoid resource leaks
On Tue, Apr 8, 2025 at 11:33 AM Malaya Kumar Rout wrote: > > Static Analyis for bench_htab_mem.c with cppcheck:error typo: analysis (lower case and typo) you can also make into a bit more human-readable sentence: "Static analysis found an issue in bench_htab_mem.c: > tools/testing/selftests/bpf/benchs/bench_htab_mem.c:284:3: > error: Resource leak: fd [resourceLeak] > tools/testing/selftests/bpf/prog_tests/sk_assign.c:41:3: > error: Resource leak: tc [resourceLeak] > > fix the issue by closing the file descriptor (fd & tc) when > read & fgets operation fails. "Fix the issue by closing" (capitalization, single space between words > > Signed-off-by: Malaya Kumar Rout > --- > tools/testing/selftests/bpf/benchs/bench_htab_mem.c | 3 +-- > tools/testing/selftests/bpf/prog_tests/sk_assign.c | 4 +++- > 2 files changed, 4 insertions(+), 3 deletions(-) > For some reason this patch didn't make it into our Patchworks system, so we can't easily apply it. Please fix the above commit issues and resubmit, hopefully this time it goes through just fine. > diff --git a/tools/testing/selftests/bpf/benchs/bench_htab_mem.c > b/tools/testing/selftests/bpf/benchs/bench_htab_mem.c > index 926ee822143e..297e32390cd1 100644 > --- a/tools/testing/selftests/bpf/benchs/bench_htab_mem.c > +++ b/tools/testing/selftests/bpf/benchs/bench_htab_mem.c > @@ -279,6 +279,7 @@ static void htab_mem_read_mem_cgrp_file(const char *name, > unsigned long *value) > } > > got = read(fd, buf, sizeof(buf) - 1); > + close(fd); > if (got <= 0) { > *value = 0; > return; > @@ -286,8 +287,6 @@ static void htab_mem_read_mem_cgrp_file(const char *name, > unsigned long *value) > buf[got] = 0; > > *value = strtoull(buf, NULL, 0); > - > - close(fd); > } > > static void htab_mem_measure(struct bench_res *res) > diff --git a/tools/testing/selftests/bpf/prog_tests/sk_assign.c > b/tools/testing/selftests/bpf/prog_tests/sk_assign.c > index 0b9bd1d6f7cc..10a0ab954b8a 100644 > --- a/tools/testing/selftests/bpf/prog_tests/sk_assign.c > +++ b/tools/testing/selftests/bpf/prog_tests/sk_assign.c > @@ -37,8 +37,10 @@ configure_stack(void) > tc = popen("tc -V", "r"); > if (CHECK_FAIL(!tc)) > return false; > - if (CHECK_FAIL(!fgets(tc_version, sizeof(tc_version), tc))) > + if (CHECK_FAIL(!fgets(tc_version, sizeof(tc_version), tc))) { > + pclose(tc); > return false; > + } > if (strstr(tc_version, ", libbpf ")) > prog = "test_sk_assign_libbpf.bpf.o"; > else > -- > 2.43.0 >
Re: [GIT PULL] Kselftest fixes update for Linux 6.15-rc2
The pull request you sent on Wed, 9 Apr 2025 16:13:39 -0600: > git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest > tags/linux_kselftest-fixes-6.15-rc2 has been merged into torvalds/linux.git: https://git.kernel.org/torvalds/c/3b07108ada81a8ebcebf1fe61367b4e436c895bd Thank you! -- Deet-doot-dot, I am a bot. https://korg.docs.kernel.org/prtracker.html
Re: [PATCH v2] DAX: warn when kmem regions are truncated for memory block alignment.
Gregory Price wrote: > Device capacity intended for use as system ram should be aligned to the > archite-defined memory block size or that capacity will be silently > truncated and capacity stranded. > > As hotplug dax memory becomes more prevelant, the memory block size > alignment becomes more important for platform and device vendors to > pay attention to - so this truncation should not be silent. > > This issue is particularly relevant for CXL Dynamic Capacity devices, > whose capacity may arrive in spec-aligned but block-misaligned chunks. > > Suggested-by: David Hildenbrand > Suggested-by: Dan Williams > Signed-off-by: Gregory Price The typo in the changelog was already noted, and this addresses all my feedback so feel free to add: Reviewed-by: Dan Williams
Re: [PATCH 5/5] KVM: VMX: Always honor guest PAT on CPUs that support self-snoop
Hello, I am completely new to and uninformed about kernel development. I was pointed here from Mesa documentation for Venus (Vulkan encapsulation for KVM/QEMU): https://docs.mesa3d.org/drivers/venus.html Based on my limited understanding of what has happened here, this patch series was partially reverted due to an issue with the Bochs DRM driver. A fix for that issue has been merged months ago according to the link provided in an earlier message. Since then work on this detail of KVM seems to have stalled. Is it reasonable to ask here for this patch series to be evaluated and incorporated again? My layperson's attempt at applying the series against 6.14.1 source code failed. In addition to the parts that appear to have already been incorporated there are some parts of the patch series that are rejected. I lack the knowledge to correct that. Distro kernels currently ship without it which limits the usability of Venus on AMD and NVIDIA GPUs paired with Intel CPUs. Convincing individual distro maintainers of the necessity of this patch series without the specialized knowledge required for understanding what it does and performing that evaluation is quite hard. If upstream (kernel) would apply it now the distros would ship a kernel including the required changes to users, including me, without that multiplicated effort. Thank you for your time. If this request is out of place here please forgive me for engaging this mailing list without a proper understanding of the list's scope. On 2024-10-07 14:04:24, Linux regression tracking (Thorsten Leemhuis) wrote: > On 07.10.24 15:38, Vitaly Kuznetsov wrote: >> "Linux regression tracking (Thorsten Leemhuis)" >> writes: >> >>> On 30.08.24 11:35, Vitaly Kuznetsov wrote: Sean Christopherson writes: > Unconditionally honor guest PAT on CPUs that support self-snoop, as > Intel has confirmed that CPUs that support self-snoop always snoop caches > and store buffers. I.e. CPUs with self-snoop maintain cache coherency > even in the presence of aliased memtypes, thus there is no need to trust > the guest behaves and only honor PAT as a last resort, as KVM does today. > > Honoring guest PAT is desirable for use cases where the guest has access > to non-coherent DMA _without_ bouncing through VFIO, e.g. when a virtual > (mediated, for all intents and purposes) GPU is exposed to the guest, > along > with buffers that are consumed directly by the physical GPU, i.e. which > can't be proxied by the host to ensure writes from the guest are performed > with the correct memory type for the GPU. Necroposting! Turns out that this change broke "bochs-display" driver in QEMU even when the guest is modern (don't ask me 'who the hell uses bochs for modern guests', it was basically a configuration error :-). E.g: [...] >>> >>> This regression made it to the list of tracked regressions. It seems >>> this thread stalled a while ago. Was this ever fixed? Does not look like >>> it, but I might have missed something. Or is this a regression I should >>> just ignore for one reason or another? >>> >> >> The regression was addressed in by reverting 377b2f359d1f in 6.11 >> >> commit 9d70f3fec14421e793ffbc0ec2f739b24e534900 >> Author: Paolo Bonzini >> Date: Sun Sep 15 02:49:33 2024 -0400 >> >> Revert "KVM: VMX: Always honor guest PAT on CPUs that support >> self-snoop" > > Thx. Sorry, missed that, thx for pointing me towards it. I had looked > for things like that, but seems I messed up my lore query. Apologies for > the noise! > >> Also, there's a (pending) DRM patch fixing it from the guest's side: >> https://gitlab.freedesktop.org/drm/misc/kernel/-/commit/9388ccf69925223223c87355a417ba39b13a5e8e > > Great! > > Ciao, Thorsten > > P.S.: > > #regzbot fix: 9d70f3fec14421e793ffbc0ec2f739b24e534900 > > >
Re: [PATCH] compiler.h: Avoid the usage of __typeof_unqual__() when __GENKSYMS__ is defined
On Wed, Apr 9, 2025 at 5:28 PM Borislav Petkov wrote: > > On Sun, Apr 06, 2025 at 05:36:13PM +0200, Uros Bizjak wrote: > > On Fri, Apr 4, 2025 at 9:14 PM Masahiro Yamada wrote: > > > > > > On Fri, Apr 4, 2025 at 11:37 PM Uros Bizjak wrote: > > > > > > > > On Fri, Apr 4, 2025 at 4:06 PM Masahiro Yamada > > > > wrote: > > > > > > > > > > > > Current version of genksyms doesn't know anything about > > > > > > > > __typeof_unqual__() > > > > > > > > operator. Avoid the usage of __typeof_unqual__() with genksyms > > > > > > > > to prevent > > > > > > > > errors when symbols are versioned. > > > > > > > > > > > > > > > > There were no problems with gendwarfksyms. > > > > > > > > > > > > > > > > Signed-off-by: Uros Bizjak > > > > > > > > Fixes: ac053946f5c40 ("compiler.h: introduce TYPEOF_UNQUAL() > > > > > > > > macro") > > > > > > > > Reported-by: Paul Menzel > > > > > > > > Closes: > > > > > > > > https://lore.kernel.org/lkml/81a25a60-de78-43fb-b56a-131151e1c...@molgen.mpg.de/ > > > > > > > > Cc: Sami Tolvanen > > > > > > > > Cc: Andrew Morton > > > > > > > > --- > > > > > > > > > > > > > > > > > > > > > Why don't you add it to the genksyms keyword table? > > > > > > > > > > > > It doesn't work, even if I patch it with an even more elaborate > > > > > > patch > > > > > > (attached). > > > > > > > > > > > > I guess some more surgery will be needed, but for now a fallback > > > > > > works > > > > > > as expected. > > > > > > > > > > > > Uros. > > > > > > > > > > The attached patch looks good to me. > > > > > > > > FAOD - do you refer to the submitted one for compiler.h or to the one > > > > for scripts/genksyms/keywords.c? (The latter doesn't fix the warning, > > > > though). > > > > > > > > > > > > You are still seeing the warnings because __typeof_unqual__ > > > is not only the issue. > > > > > > Hint: > > > > > > $ make -s KCFLAGS=-D__GENKSYMS__ arch/x86/kernel/setup_percpu.i > > > $ grep 'this_cpu_off;' arch/x86/kernel/setup_percpu.i > > > > I see. > > > > With my workaround, this_cpu_off is declared as: > > > > extern __attribute__((section(".data..percpu" "..hot.." > > "this_cpu_off"))) __typeof__(unsigned long) this_cpu_off; > > > > while without workaround, the same variable is declared as: > > > > extern __seg_gs __attribute__((section(".data..percpu" "..hot.." > > "this_cpu_off"))) __typeof__(unsigned long) this_cpu_off; > > > > It looks that genksyms should be extended to handle (or ignore) > > __seg_gs/__seg_fs named address prefix. Somewhat surprising, because > > genksyms can process: > > > > extern __attribute__((section(".data..percpu" "..hot.." > > "const_current_task"))) __typeof__(struct task_struct * const > > __seg_gs) const_current_task > > > > without problems. > > > > I'm sorry, but I'm not able to extend genksyms with a new keyword by > > myself... > > Well, we need a fix here because this fires a lot by now - triggers on my > machines now too. > > So either take a fix or we'll need to revert until it is fixed properly. The workaround is posted to the list. It should be committed to the mainline until genksyms is fixed. Uros.
Re: [PATCH] compiler.h: Avoid the usage of __typeof_unqual__() when __GENKSYMS__ is defined
On Wed, Apr 09, 2025 at 05:32:39PM +0200, Uros Bizjak wrote: > The workaround is posted to the list. It should be committed to the > mainline until genksyms is fixed. I'll take it through tip. Thx. -- Regards/Gruss, Boris. https://people.kernel.org/tglx/notes-about-netiquette
Re: [PATCH net-next] configs/debug: run and debug PREEMPT
On Wed, 9 Apr 2025 13:58:13 +0200 Matthieu Baerts wrote: > On 08/04/2025 21:03, Jakub Kicinski wrote: > > On Tue, 8 Apr 2025 20:18:26 +0200 Matthieu Baerts wrote: > >> On 02/04/2025 19:23, Stanislav Fomichev wrote: > >>> Recent change [0] resulted in a "BUG: using __this_cpu_read() in > >>> preemptible" splat [1]. PREEMPT kernels have additional requirements > >>> on what can and can not run with/without preemption enabled. > >>> Expose those constrains in the debug kernels. > >> > >> Good idea to suggest this to find more bugs! > >> > >> I did some quick tests on my side with our CI, and the MPTCP selftests > >> seem to take a bit more time, but without impacting the results. > >> Hopefully, there will be no impact in slower/busy environments :) > > > > What kind of slow down do you see? I think we get up to 50% more time > > spent in the longer tests. > > That's difficult to measure in our CI because we have a majority of > tests either creating test envs with random parameters (latency, losses, > etc.), or waiting for a transfer at a limited speed to finish. Plus, we > don't control the host running our tests. But if we omit that, our > packetdrill tests take ~20% longer on the CI, and our 2 mini KUnit tests > took ~10% longer (275ms -> 305ms). Globally, our test suite took maybe > ~10-20% longer, and that's acceptable. > > So not 50%. Is this difference acceptable for NIPA? Even when some tests > are restarted automatically in case of instabilities? We also see 10%+ on most cases, the 50% was the worst one I glanced. The worst offenders in terms of runtime only increased by 10% so still within the guidelines. > One last thing, Stanislav's patch has been shared during Linus' merge > window: perhaps something else could also impact the time? > > > Not sure how bad is too bad.. > > Did you observe more instabilities? Maybe the individual results should > be omitted, and only debug specific issues (calltraces, kmemleak, etc.) > should be looked at? A couple but unclear at this stage whether that was just the merge window or enabling preempt debug. Now patchwork is super unstable so again, hard to judge the source of the problems :( > > I'm leaning > > towards applying this to net-next and we can see if people running > > on linux-next complain? > > Good idea! But I do wonder how run **and monitor** the selftests in > linux-next with a debug kernel :) One way to find out :)
Re: [PATCH v1] s390/virtio_ccw: don't allocate/assign airqs for non-existing queues
On Wed, Apr 09, 2025 at 02:24:32PM +0200, David Hildenbrand wrote: > On 09.04.25 14:07, Michael S. Tsirkin wrote: > > On Wed, Apr 09, 2025 at 01:12:19PM +0200, David Hildenbrand wrote: > > > On 09.04.25 12:56, Michael S. Tsirkin wrote: > > > > On Wed, Apr 09, 2025 at 12:46:41PM +0200, David Hildenbrand wrote: > > > > > On 07.04.25 23:20, Michael S. Tsirkin wrote: > > > > > > On Mon, Apr 07, 2025 at 08:47:05PM +0200, David Hildenbrand wrote: > > > > > > > > In my opinion, it makes the most sense to keep the spec as it > > > > > > > > is and > > > > > > > > change QEMU and the kernel to match, but obviously that's not > > > > > > > > trivial > > > > > > > > to do in a way that doesn't break existing devices and drivers. > > > > > > > > > > > > > > If only it would be limited to QEMU and Linux ... :) > > > > > > > > > > > > > > Out of curiosity, assuming we'd make the spec match the current > > > > > > > QEMU/Linux > > > > > > > implementation at least for the 3 involved features only, would > > > > > > > there be a > > > > > > > way to adjust crossvm without any disruption? > > > > > > > > > > > > > > I still have the feeling that it will be rather hard to get that > > > > > > > all > > > > > > > implementations match the spec ... For new features+queues it > > > > > > > will be easy > > > > > > > to force the usage of fixed virtqueue numbers, but for > > > > > > > free-page-hinting and > > > > > > > reporting, it's a mess :( > > > > > > > > > > > > > > > > > > Still thinking about a way to fix drivers... We can discuss this > > > > > > theoretically, maybe? > > > > > > > > > > Yes, absolutely. I took the time to do some more digging; regarding > > > > > drivers > > > > > only Linux seems to be problematic. > > > > > > > > > > virtio-win, FreeBSD, NetBSD and OpenBSD and don't seem to support > > > > > problematic features (free page hinting, free page reporting) in their > > > > > virtio-balloon implementations. > > > > > > > > > > So from the known drivers, only Linux is applicable. > > > > > > > > > > reporting_vq is either at idx 4/3/2 > > > > > free_page_vq is either at idx 3/2 > > > > > statsq is at idx2 (only relevant if the feature is offered) > > > > > > > > > > So if we could test for the existence of a virtqueue at an idx > > > > > easily, we > > > > > could test from highest-to-smallest idx. > > > > > > > > > > But I recall that testing for the existance of a virtqueue on s390x > > > > > resulted > > > > > in the problem/deadlock in the first place ... > > > > > > > > > > -- > > > > > Cheers, > > > > > > > > > > David / dhildenb > > > > > > > > So let's talk about a new feature bit? > > > > > > Are you thinking about a new feature that switches between "fixed queue > > > indices" and "compressed queue indices", whereby the latter would be the > > > legacy default and we would expect all devices to switch to the new > > > fixed-queue-indices layout? > > > > > > We could make all new features require "fixed-queue-indices". > > > > I see two ways: > > 1. we make driver behave correctly with in spec and out of spec devices > > and we make qemu behave correctly with in spec and out of spec devices > > 2. a new feature bit > > > > I prefer 1, and when we add a new feature we can also > > document that it should be in spec if negotiated. > > > > My question is if 1 is practical. > > AFAIKT, 1) implies: > > virtio-balloon: > > a) Driver > > As mentioned above, we'd need a reliable way to test for the existence of a > virtqueue, so we can e.g., test for reporting_vq idx 4 -> 3 -> 2 > > With that we'd be able to support compressed+fixed at the same time. > > Q: Is it possible/feasible? > > b) Device: virtio-balloon: we can fake existence of STAT and > FREE_PAGE_HINTING easily, such that the compressed layout corresponds to the > fixed layout easily. > > Q: alternatives? We could try creating multiple queues for the same feature, > but it's going to be a mess I'm afraid ... > > > virtio-fs: > > a) Driver > > Linux does not even implement VIRTIO_FS_F_NOTIFICATION or respect > VIRTIO_FS_F_NOTIFICATION when calculating queue indices, ... > > b) Device > > Same applies to virtiofsd ... > > Q: Did anybody actually implement VIRTIO_FS_F_NOTIFICATION ever? If not, can > we just remove it from the spec completely and resolve the issue that way? Donnu. Vivek? Or we can check for queue number 1+num_request_queues maybe? If that exists then it is spec compliant? > -- > Cheers, > > David / dhildenb
Re: [PATCH v1 3/3] selftests: pid_namespace: add missing sys/mount.h include in pid_max.c
On Wed, Jan 15, 2025 at 2:53 AM Peter Seiderer wrote: > > Fix compile on openSUSE Tumbleweed (gcc-14.2.1, glibc-2.40): > - add missing sys/mount.h include > > Fixes: > > pid_max.c: In function ‘pid_max_cb’: > pid_max.c:42:15: error: implicit declaration of function ‘mount’ > [-Wimplicit-function-declaration] > 42 | ret = mount("", "/", NULL, MS_PRIVATE | MS_REC, 0); > | ^ > > Signed-off-by: Peter Seiderer Reviewed-by: T.J. Mercier
Re: [PATCH net-next] configs/debug: run and debug PREEMPT
Hi Jakub, On 08/04/2025 21:03, Jakub Kicinski wrote: > On Tue, 8 Apr 2025 20:18:26 +0200 Matthieu Baerts wrote: >> On 02/04/2025 19:23, Stanislav Fomichev wrote: >>> Recent change [0] resulted in a "BUG: using __this_cpu_read() in >>> preemptible" splat [1]. PREEMPT kernels have additional requirements >>> on what can and can not run with/without preemption enabled. >>> Expose those constrains in the debug kernels. >> >> Good idea to suggest this to find more bugs! >> >> I did some quick tests on my side with our CI, and the MPTCP selftests >> seem to take a bit more time, but without impacting the results. >> Hopefully, there will be no impact in slower/busy environments :) > > What kind of slow down do you see? I think we get up to 50% more time > spent in the longer tests. That's difficult to measure in our CI because we have a majority of tests either creating test envs with random parameters (latency, losses, etc.), or waiting for a transfer at a limited speed to finish. Plus, we don't control the host running our tests. But if we omit that, our packetdrill tests take ~20% longer on the CI, and our 2 mini KUnit tests took ~10% longer (275ms -> 305ms). Globally, our test suite took maybe ~10-20% longer, and that's acceptable. So not 50%. Is this difference acceptable for NIPA? Even when some tests are restarted automatically in case of instabilities? One last thing, Stanislav's patch has been shared during Linus' merge window: perhaps something else could also impact the time? > Not sure how bad is too bad.. Did you observe more instabilities? Maybe the individual results should be omitted, and only debug specific issues (calltraces, kmemleak, etc.) should be looked at? > I'm leaning > towards applying this to net-next and we can see if people running > on linux-next complain? Good idea! But I do wonder how run **and monitor** the selftests in linux-next with a debug kernel :) Cheers, Matt -- Sponsored by the NGI0 Core fund.
Re: [PATCH RFC v3 3/8] slab: add sheaf support for batching kfree_rcu() operations
On 4/9/25 3:50 AM, Harry Yoo wrote: > On Mon, Mar 17, 2025 at 03:33:04PM +0100, Vlastimil Babka wrote: > > Hmm this hunk in v3 is fine, but on your slub-percpu-shaves-v4r0 branch > it's calling local_unlock() twice. Probably a rebase error? Yeah, thanks a lot for catching that! I've just pushed https://git.kernel.org/pub/scm/linux/kernel/git/vbabka/linux.git/log/?h=slub-percpu-sheaves-v4r1 with this fixed, and fixups for 2/8 to the points you made, plus a proper strict_numa handling in 2/8, and an extra patch for better NUMA locality > Otherwise looks good to me. > > When you address this, please feel free to add: > > Reviewed-by: Harry Yoo > > Thanks! >
[PATCH net 0/2] mptcp: only inc MPJoinAckHMacFailure for HMAC failures
Recently, during a debugging session using local MPTCP connections, I noticed MPJoinAckHMacFailure was strangely not zero on the server side. The first patch fixes this issue -- present since v5.9 -- and the second one validates it in the selftests. Signed-off-by: Matthieu Baerts (NGI0) --- Matthieu Baerts (NGI0) (2): mptcp: only inc MPJoinAckHMacFailure for HMAC failures selftests: mptcp: validate MPJoin HMacFailure counters net/mptcp/subflow.c | 8 ++-- tools/testing/selftests/net/mptcp/mptcp_join.sh | 18 ++ 2 files changed, 24 insertions(+), 2 deletions(-) --- base-commit: 61f96e684edd28ca40555ec49ea1555df31ba619 change-id: 20250407-net-mptcp-hmac-failure-mib-66f599305ff3 Best regards, -- Matthieu Baerts (NGI0)
Re: [PATCH bpf-next v4 2/3] bpf, sockmap: avoid using sk_socket after free when reading
On Tue, Apr 8, 2025 at 12:31 AM Jiayuan Chen wrote: > > There are potential concurrency issues, as shown below. > ''' > CPU0 CPU1 > sk_psock_verdict_data_ready: > socket *sock = sk->sk_socket > if (!sock) return >close(fd): > ... > ops->release() > if (!sock->ops) return > sock->ops = NULL > rcu_call(sock) > free(sock) > READ_ONCE(sock->ops) > ^ > use 'sock' after free > ''' > > RCU is not applicable to Unix sockets read path, because the Unix socket > implementation itself assumes it's always in process context and heavily > uses mutex_lock, so, we can't call read_skb within rcu lock. > > Incrementing the psock reference count would not help either, since > sock_map_close() does not wait for data_ready() to complete its execution. > > While we don't utilize sk_socket here, implementing read_skb at the sock > layer instead of socket layer might be architecturally preferable ? > However, deferring this optimization as current fix adequately addresses > the immediate issue. > > Fixes: c63829182c37 ("af_unix: Implement ->psock_update_sk_prot()") > Reported-by: syzbot+dd90a702f518e0eac...@syzkaller.appspotmail.com > Closes: > https://lore.kernel.org/bpf/6734c033.050a0220.2a2fcc.0015@google.com/ > Signed-off-by: Jiayuan Chen > --- > net/core/skmsg.c | 13 ++--- > 1 file changed, 10 insertions(+), 3 deletions(-) > > diff --git a/net/core/skmsg.c b/net/core/skmsg.c > index 6101c1bb279a..5e913b62929e 100644 > --- a/net/core/skmsg.c > +++ b/net/core/skmsg.c > @@ -1231,17 +1231,24 @@ static int sk_psock_verdict_recv(struct sock *sk, > struct sk_buff *skb) > > static void sk_psock_verdict_data_ready(struct sock *sk) > { > - struct socket *sock = sk->sk_socket; > + struct socket *sock; > const struct proto_ops *ops; > int copied; > > trace_sk_data_ready(sk); > > - if (unlikely(!sock)) > + rcu_read_lock(); > + sock = sk->sk_socket; > + if (unlikely(!sock)) { > + rcu_read_unlock(); > return; > + } > ops = READ_ONCE(sock->ops); > - if (!ops || !ops->read_skb) > + if (!ops || !ops->read_skb) { > + rcu_read_unlock(); > return; > + } > + rcu_read_unlock(); This makes no sense to me. RCU doesn't work this way. pw-bot: cr
[PATCH v8 2/4] KVM: selftests: Add core KVM selftests support for LoongArch
Add core KVM selftests support for LoongArch, it includes exception handler, mmu page table setup and vcpu startup entry supporting etc. Signed-off-by: Bibo Mao --- .../selftests/kvm/lib/loongarch/exception.S | 59 +++ .../selftests/kvm/lib/loongarch/processor.c | 349 ++ 2 files changed, 408 insertions(+) create mode 100644 tools/testing/selftests/kvm/lib/loongarch/exception.S create mode 100644 tools/testing/selftests/kvm/lib/loongarch/processor.c diff --git a/tools/testing/selftests/kvm/lib/loongarch/exception.S b/tools/testing/selftests/kvm/lib/loongarch/exception.S new file mode 100644 index ..88bfa505c6f5 --- /dev/null +++ b/tools/testing/selftests/kvm/lib/loongarch/exception.S @@ -0,0 +1,59 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +#include "processor.h" + +/* address of refill exception should be 4K aligned */ +.balign4096 +.global handle_tlb_refill +handle_tlb_refill: + csrwr t0, LOONGARCH_CSR_TLBRSAVE + csrrd t0, LOONGARCH_CSR_PGD + lddir t0, t0, 3 + lddir t0, t0, 1 + ldpte t0, 0 + ldpte t0, 1 + tlbfill + csrrd t0, LOONGARCH_CSR_TLBRSAVE + ertn + + /* +* save and restore all gprs except base register, +* and default value of base register is sp ($r3). +*/ +.macro save_gprs base + .irp n,1,2,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31 + st.d$r\n, \base, 8 * \n + .endr +.endm + +.macro restore_gprs base + .irp n,1,2,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31 + ld.d$r\n, \base, 8 * \n + .endr +.endm + +/* address of general exception should be 4K aligned */ +.balign4096 +.global handle_exception +handle_exception: + csrwr sp, LOONGARCH_CSR_KS0 + csrrd sp, LOONGARCH_CSR_KS1 + addi.d sp, sp, -EXREGS_SIZE + + save_gprs sp + /* save sp register to stack */ + csrrd t0, LOONGARCH_CSR_KS0 + st.d t0, sp, 3 * 8 + + csrrd t0, LOONGARCH_CSR_ERA + st.d t0, sp, PC_OFFSET_EXREGS + csrrd t0, LOONGARCH_CSR_ESTAT + st.d t0, sp, ESTAT_OFFSET_EXREGS + csrrd t0, LOONGARCH_CSR_BADV + st.d t0, sp, BADV_OFFSET_EXREGS + + or a0, sp, zero + bl route_exception + restore_gprs sp + csrrd sp, LOONGARCH_CSR_KS0 + ertn diff --git a/tools/testing/selftests/kvm/lib/loongarch/processor.c b/tools/testing/selftests/kvm/lib/loongarch/processor.c new file mode 100644 index ..25b7f896a107 --- /dev/null +++ b/tools/testing/selftests/kvm/lib/loongarch/processor.c @@ -0,0 +1,349 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include +#include + +#include "kvm_util.h" +#include "processor.h" +#include "ucall_common.h" + +#define LOONGARCH_GUEST_STACK_VADDR_MIN0x20 +#define LOONARCH_PAGE_TABLE_PHYS_MIN 0x20 + +static vm_paddr_t invalid_pgtable[4]; +static uint64_t virt_pte_index(struct kvm_vm *vm, vm_vaddr_t gva, int level) +{ + unsigned int shift; + uint64_t mask; + + shift = level * (vm->page_shift - 3) + vm->page_shift; + mask = (1UL << (vm->page_shift - 3)) - 1; + return (gva >> shift) & mask; +} + +static uint64_t pte_addr(struct kvm_vm *vm, uint64_t entry) +{ + return entry & ~((0x1UL << vm->page_shift) - 1); +} + +static uint64_t ptrs_per_pte(struct kvm_vm *vm) +{ + return 1 << (vm->page_shift - 3); +} + +static void virt_set_pgtable(struct kvm_vm *vm, vm_paddr_t table, vm_paddr_t child) +{ + uint64_t *ptep; + int i, ptrs_per_pte; + + ptep = addr_gpa2hva(vm, table); + ptrs_per_pte = 1 << (vm->page_shift - 3); + for (i = 0; i < ptrs_per_pte; i++) + WRITE_ONCE(*(ptep + i), child); +} + +void virt_arch_pgd_alloc(struct kvm_vm *vm) +{ + int i; + vm_paddr_t child, table; + + if (vm->pgd_created) + return; + child = table = 0; + for (i = 0; i < vm->pgtable_levels; i++) { + invalid_pgtable[i] = child; + table = vm_phy_page_alloc(vm, LOONARCH_PAGE_TABLE_PHYS_MIN, + vm->memslots[MEM_REGION_PT]); + TEST_ASSERT(table, "Fail to allocate page tale at level %d\n", i); + virt_set_pgtable(vm, table, child); + child = table; + } + vm->pgd = table; + vm->pgd_created = true; +} + +static int virt_pte_none(uint64_t *ptep, int level) +{ + return *ptep == invalid_pgtable[level]; +} + +static uint64_t *virt_populate_pte(struct kvm_vm *vm, vm_vaddr_t gva, int alloc) +{ + uint64_t *ptep; + vm_paddr_t child; + int level; + + if (!vm->pgd_created) + goto unmapped_gva; + + level = vm->pgtable_levels - 1; + child = vm->pgd; + while (level > 0) { + ptep = addr_gpa2hva(vm, child) + virt_pte_index(vm, gva, l
[PATCH v8 1/4] KVM: selftests: Add KVM selftests header files for LoongArch
Add KVM selftests header files for LoongArch, including processor.h and kvm_util_base.h. It mainly contains LoongArch CSR register definition and page table entry definition. Signed-off-by: Bibo Mao --- .../testing/selftests/kvm/include/kvm_util.h | 5 + .../kvm/include/loongarch/kvm_util_arch.h | 7 + .../kvm/include/loongarch/processor.h | 138 ++ 3 files changed, 150 insertions(+) create mode 100644 tools/testing/selftests/kvm/include/loongarch/kvm_util_arch.h create mode 100644 tools/testing/selftests/kvm/include/loongarch/processor.h diff --git a/tools/testing/selftests/kvm/include/kvm_util.h b/tools/testing/selftests/kvm/include/kvm_util.h index 373912464fb4..d9cb62207c57 100644 --- a/tools/testing/selftests/kvm/include/kvm_util.h +++ b/tools/testing/selftests/kvm/include/kvm_util.h @@ -232,6 +232,11 @@ extern enum vm_guest_mode vm_mode_default; #define MIN_PAGE_SHIFT 12U #define ptes_per_page(page_size) ((page_size) / 8) +#elif defined(__loongarch__) +#define VM_MODE_DEFAULTVM_MODE_P36V47_16K +#define MIN_PAGE_SHIFT 12U +#define ptes_per_page(page_size) ((page_size) / 8) + #endif #define VM_SHAPE_DEFAULT VM_SHAPE(VM_MODE_DEFAULT) diff --git a/tools/testing/selftests/kvm/include/loongarch/kvm_util_arch.h b/tools/testing/selftests/kvm/include/loongarch/kvm_util_arch.h new file mode 100644 index ..e43a57d99b56 --- /dev/null +++ b/tools/testing/selftests/kvm/include/loongarch/kvm_util_arch.h @@ -0,0 +1,7 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +#ifndef SELFTEST_KVM_UTIL_ARCH_H +#define SELFTEST_KVM_UTIL_ARCH_H + +struct kvm_vm_arch {}; + +#endif // SELFTEST_KVM_UTIL_ARCH_H diff --git a/tools/testing/selftests/kvm/include/loongarch/processor.h b/tools/testing/selftests/kvm/include/loongarch/processor.h new file mode 100644 index ..e95dd2059605 --- /dev/null +++ b/tools/testing/selftests/kvm/include/loongarch/processor.h @@ -0,0 +1,138 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ + +#ifndef SELFTEST_KVM_PROCESSOR_H +#define SELFTEST_KVM_PROCESSOR_H + +#ifndef __ASSEMBLER__ +#include "ucall_common.h" + +#else +/* general registers */ +#define zero $r0 +#define ra $r1 +#define tp $r2 +#define sp $r3 +#define a0 $r4 +#define a1 $r5 +#define a2 $r6 +#define a3 $r7 +#define a4 $r8 +#define a5 $r9 +#define a6 $r10 +#define a7 $r11 +#define t0 $r12 +#define t1 $r13 +#define t2 $r14 +#define t3 $r15 +#define t4 $r16 +#define t5 $r17 +#define t6 $r18 +#define t7 $r19 +#define t8 $r20 +#define u0 $r21 +#define fp $r22 +#define s0 $r23 +#define s1 $r24 +#define s2 $r25 +#define s3 $r26 +#define s4 $r27 +#define s5 $r28 +#define s6 $r29 +#define s7 $r30 +#define s8 $r31 +#endif + +/* LoongArch page table entry definition */ +#define _PAGE_VALID_SHIFT 0 +#define _PAGE_DIRTY_SHIFT 1 +#define _PAGE_PLV_SHIFT2 /* 2~3, two bits */ +#define PLV_KERN 0 +#define PLV_USER 3 +#define PLV_MASK 0x3 +#define _CACHE_SHIFT 4 /* 4~5, two bits */ +#define _PAGE_PRESENT_SHIFT7 +#define _PAGE_WRITE_SHIFT 8 + +#define _PAGE_VALIDBIT_ULL(_PAGE_VALID_SHIFT) +#define _PAGE_PRESENT BIT_ULL(_PAGE_PRESENT_SHIFT) +#define _PAGE_WRITEBIT_ULL(_PAGE_WRITE_SHIFT) +#define _PAGE_DIRTYBIT_ULL(_PAGE_DIRTY_SHIFT) +#define _PAGE_USER (PLV_USER << _PAGE_PLV_SHIFT) +#define __READABLE (_PAGE_VALID) +#define __WRITEABLE (_PAGE_DIRTY | _PAGE_WRITE) +/* Coherent Cached */ +#define _CACHE_CC BIT_ULL(_CACHE_SHIFT) +#define PS_4K 0x000c +#define PS_8K 0x000d +#define PS_16K 0x000e +#define PS_DEFAULT_SIZEPS_16K + +/* LoongArch Basic CSR registers */ +#define LOONGARCH_CSR_CRMD 0x0 /* Current mo
Re: [PATCH v1] s390/virtio_ccw: don't allocate/assign airqs for non-existing queues
On 09.04.25 14:07, Michael S. Tsirkin wrote: On Wed, Apr 09, 2025 at 01:12:19PM +0200, David Hildenbrand wrote: On 09.04.25 12:56, Michael S. Tsirkin wrote: On Wed, Apr 09, 2025 at 12:46:41PM +0200, David Hildenbrand wrote: On 07.04.25 23:20, Michael S. Tsirkin wrote: On Mon, Apr 07, 2025 at 08:47:05PM +0200, David Hildenbrand wrote: In my opinion, it makes the most sense to keep the spec as it is and change QEMU and the kernel to match, but obviously that's not trivial to do in a way that doesn't break existing devices and drivers. If only it would be limited to QEMU and Linux ... :) Out of curiosity, assuming we'd make the spec match the current QEMU/Linux implementation at least for the 3 involved features only, would there be a way to adjust crossvm without any disruption? I still have the feeling that it will be rather hard to get that all implementations match the spec ... For new features+queues it will be easy to force the usage of fixed virtqueue numbers, but for free-page-hinting and reporting, it's a mess :( Still thinking about a way to fix drivers... We can discuss this theoretically, maybe? Yes, absolutely. I took the time to do some more digging; regarding drivers only Linux seems to be problematic. virtio-win, FreeBSD, NetBSD and OpenBSD and don't seem to support problematic features (free page hinting, free page reporting) in their virtio-balloon implementations. So from the known drivers, only Linux is applicable. reporting_vq is either at idx 4/3/2 free_page_vq is either at idx 3/2 statsq is at idx2 (only relevant if the feature is offered) So if we could test for the existence of a virtqueue at an idx easily, we could test from highest-to-smallest idx. But I recall that testing for the existance of a virtqueue on s390x resulted in the problem/deadlock in the first place ... -- Cheers, David / dhildenb So let's talk about a new feature bit? Are you thinking about a new feature that switches between "fixed queue indices" and "compressed queue indices", whereby the latter would be the legacy default and we would expect all devices to switch to the new fixed-queue-indices layout? We could make all new features require "fixed-queue-indices". I see two ways: 1. we make driver behave correctly with in spec and out of spec devices and we make qemu behave correctly with in spec and out of spec devices 2. a new feature bit I prefer 1, and when we add a new feature we can also document that it should be in spec if negotiated. My question is if 1 is practical. AFAIKT, 1) implies: virtio-balloon: a) Driver As mentioned above, we'd need a reliable way to test for the existence of a virtqueue, so we can e.g., test for reporting_vq idx 4 -> 3 -> 2 With that we'd be able to support compressed+fixed at the same time. Q: Is it possible/feasible? b) Device: virtio-balloon: we can fake existence of STAT and FREE_PAGE_HINTING easily, such that the compressed layout corresponds to the fixed layout easily. Q: alternatives? We could try creating multiple queues for the same feature, but it's going to be a mess I'm afraid ... virtio-fs: a) Driver Linux does not even implement VIRTIO_FS_F_NOTIFICATION or respect VIRTIO_FS_F_NOTIFICATION when calculating queue indices, ... b) Device Same applies to virtiofsd ... Q: Did anybody actually implement VIRTIO_FS_F_NOTIFICATION ever? If not, can we just remove it from the spec completely and resolve the issue that way? -- Cheers, David / dhildenb
Re: [PATCH] compiler.h: Avoid the usage of __typeof_unqual__() when __GENKSYMS__ is defined
On Sun, Apr 06, 2025 at 05:36:13PM +0200, Uros Bizjak wrote: > On Fri, Apr 4, 2025 at 9:14 PM Masahiro Yamada wrote: > > > > On Fri, Apr 4, 2025 at 11:37 PM Uros Bizjak wrote: > > > > > > On Fri, Apr 4, 2025 at 4:06 PM Masahiro Yamada > > > wrote: > > > > > > > > > > Current version of genksyms doesn't know anything about > > > > > > > __typeof_unqual__() > > > > > > > operator. Avoid the usage of __typeof_unqual__() with genksyms > > > > > > > to prevent > > > > > > > errors when symbols are versioned. > > > > > > > > > > > > > > There were no problems with gendwarfksyms. > > > > > > > > > > > > > > Signed-off-by: Uros Bizjak > > > > > > > Fixes: ac053946f5c40 ("compiler.h: introduce TYPEOF_UNQUAL() > > > > > > > macro") > > > > > > > Reported-by: Paul Menzel > > > > > > > Closes: > > > > > > > https://lore.kernel.org/lkml/81a25a60-de78-43fb-b56a-131151e1c...@molgen.mpg.de/ > > > > > > > Cc: Sami Tolvanen > > > > > > > Cc: Andrew Morton > > > > > > > --- > > > > > > > > > > > > > > > > > > Why don't you add it to the genksyms keyword table? > > > > > > > > > > It doesn't work, even if I patch it with an even more elaborate patch > > > > > (attached). > > > > > > > > > > I guess some more surgery will be needed, but for now a fallback works > > > > > as expected. > > > > > > > > > > Uros. > > > > > > > > The attached patch looks good to me. > > > > > > FAOD - do you refer to the submitted one for compiler.h or to the one > > > for scripts/genksyms/keywords.c? (The latter doesn't fix the warning, > > > though). > > > > > > > > You are still seeing the warnings because __typeof_unqual__ > > is not only the issue. > > > > Hint: > > > > $ make -s KCFLAGS=-D__GENKSYMS__ arch/x86/kernel/setup_percpu.i > > $ grep 'this_cpu_off;' arch/x86/kernel/setup_percpu.i > > I see. > > With my workaround, this_cpu_off is declared as: > > extern __attribute__((section(".data..percpu" "..hot.." > "this_cpu_off"))) __typeof__(unsigned long) this_cpu_off; > > while without workaround, the same variable is declared as: > > extern __seg_gs __attribute__((section(".data..percpu" "..hot.." > "this_cpu_off"))) __typeof__(unsigned long) this_cpu_off; > > It looks that genksyms should be extended to handle (or ignore) > __seg_gs/__seg_fs named address prefix. Somewhat surprising, because > genksyms can process: > > extern __attribute__((section(".data..percpu" "..hot.." > "const_current_task"))) __typeof__(struct task_struct * const > __seg_gs) const_current_task > > without problems. > > I'm sorry, but I'm not able to extend genksyms with a new keyword by myself... Well, we need a fix here because this fires a lot by now - triggers on my machines now too. So either take a fix or we'll need to revert until it is fixed properly. Thx. -- Regards/Gruss, Boris. https://people.kernel.org/tglx/notes-about-netiquette
[PATCH v5 1/2] mm/vmscan: Skip memcg with !usage in shrink_node_memcgs()
The test_memcontrol selftest consistently fails its test_memcg_low sub-test due to the fact that two of its test child cgroups which have a memmory.low of 0 or an effective memory.low of 0 still have low events generated for them since mem_cgroup_below_low() use the ">=" operator when comparing to elow. The two failed use cases are as follows: 1) memory.low is set to 0, but low events can still be triggered and so the cgroup may have a non-zero low event count. I doubt users are looking for that as they didn't set memory.low at all. 2) memory.low is set to a non-zero value but the cgroup has no task in it so that it has an effective low value of 0. Again it may have a non-zero low event count if memory reclaim happens. This is probably not a result expected by the users and it is really doubtful that users will check an empty cgroup with no task in it and expecting some non-zero event counts. In the first case, even though memory.low isn't set, it may still have some low protection if memory.low is set in the parent. So low event may still be recorded. The test_memcontrol.c test has to be modified to account for that. For the second case, it really doesn't make sense to have non-zero low event if the cgroup has 0 usage. So we need to skip this corner case in shrink_node_memcgs() using mem_cgroup_usage(). The mem_cgroup_usage() function declaration is moved from mm/memcontrol-v1.h to mm/internal.h with the !CONFIG_MEMCG case defined as always true. With this patch applied, the test_memcg_low sub-test finishes successfully without failure in most cases. Though both test_memcg_low and test_memcg_min sub-tests may still fail occasionally if the memory.current values fall outside of the expected ranges. Suggested-by: Johannes Weiner Signed-off-by: Waiman Long --- mm/internal.h| 9 + mm/memcontrol-v1.h | 2 -- mm/vmscan.c | 4 tools/testing/selftests/cgroup/test_memcontrol.c | 7 ++- 4 files changed, 19 insertions(+), 3 deletions(-) diff --git a/mm/internal.h b/mm/internal.h index 50c2f590b2d0..c06fb0e8d75c 100644 --- a/mm/internal.h +++ b/mm/internal.h @@ -1535,6 +1535,15 @@ void __meminit __init_page_from_nid(unsigned long pfn, int nid); unsigned long shrink_slab(gfp_t gfp_mask, int nid, struct mem_cgroup *memcg, int priority); +#ifdef CONFIG_MEMCG +unsigned long mem_cgroup_usage(struct mem_cgroup *memcg, bool swap); +#else +static inline unsigned long mem_cgroup_usage(struct mem_cgroup *memcg, bool swap) +{ + return 1UL; +} +#endif + #ifdef CONFIG_SHRINKER_DEBUG static inline __printf(2, 0) int shrinker_debugfs_name_alloc( struct shrinker *shrinker, const char *fmt, va_list ap) diff --git a/mm/memcontrol-v1.h b/mm/memcontrol-v1.h index 6358464bb416..e92b21af92b1 100644 --- a/mm/memcontrol-v1.h +++ b/mm/memcontrol-v1.h @@ -22,8 +22,6 @@ iter != NULL; \ iter = mem_cgroup_iter(NULL, iter, NULL)) -unsigned long mem_cgroup_usage(struct mem_cgroup *memcg, bool swap); - void drain_all_stock(struct mem_cgroup *root_memcg); unsigned long memcg_events(struct mem_cgroup *memcg, int event); diff --git a/mm/vmscan.c b/mm/vmscan.c index b620d74b0f66..a771a0145a12 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -5963,6 +5963,10 @@ static void shrink_node_memcgs(pg_data_t *pgdat, struct scan_control *sc) mem_cgroup_calculate_protection(target_memcg, memcg); + /* Skip memcg with no usage */ + if (!mem_cgroup_usage(memcg, false)) + continue; + if (mem_cgroup_below_min(target_memcg, memcg)) { /* * Hard protection. diff --git a/tools/testing/selftests/cgroup/test_memcontrol.c b/tools/testing/selftests/cgroup/test_memcontrol.c index 16f5d74ae762..bab826b6b7b0 100644 --- a/tools/testing/selftests/cgroup/test_memcontrol.c +++ b/tools/testing/selftests/cgroup/test_memcontrol.c @@ -525,8 +525,13 @@ static int test_memcg_protection(const char *root, bool min) goto cleanup; } + /* +* Child 2 has memory.low=0, but some low protection is still being +* distributed down from its parent with memory.low=50M. So the low +* event count will be non-zero. +*/ for (i = 0; i < ARRAY_SIZE(children); i++) { - int no_low_events_index = 1; + int no_low_events_index = 2; long low, oom; oom = cg_read_key_long(children[i], "memory.events", "oom "); -- 2.48.1
Re: [PATCH] ASoC: cs-amp-lib-test: Don't select SND_SOC_CS_AMP_LIB
On 09/04/2025 3:24 pm, Mark Brown wrote: On Wed, Apr 09, 2025 at 11:45:44AM +0100, Richard Fitzgerald wrote: Depend on SND_SOC_CS_AMP_LIB instead of selecting it. KUNIT_ALL_TESTS should only build tests for components that are already being built, it should not cause other stuff to be added to the build. config SND_SOC_CS_AMP_LIB_TEST - tristate "KUnit test for Cirrus Logic cs-amp-lib" - depends on KUNIT + tristate "KUnit test for Cirrus Logic cs-amp-lib" if !KUNIT_ALL_TESTS + depends on SND_SOC_CS_AMP_LIB && KUNIT default KUNIT_ALL_TESTS - select SND_SOC_CS_AMP_LIB help This builds KUnit tests for the Cirrus Logic common amplifier library. This by itself results in the Cirrus tests being removed from a kunit --alltests run which is a regression in coverage. I'd expect to see some corresponding updates in the KUnit all_tests.config to keep them enabled. That's the defined behaviour of KUNIT_ALL_TESTS. It shouldn't have been running as part of an alltests if nothing had selected it. That seems to make people angry. Probably the same people who would complain if there was a bug in the code that they didn't want to test.
Re: [PATCH] selftests/sgx: Fix an enclave built with extended instructions
On 4/9/25 09:55, Vladis Dronov wrote: ... > Fix this by adding "-mno-avx" to ENCL_CFLAGS in Makefile. Add some comments > about this to code locations where enclave's xfrm field is set. > > Suggested-by: Dave Hansen > Signed-off-by: Vladis Dronov First of all, this looks fine to me: Acked-by: Dave Hansen The code comments are fine. I'm much less picky about selftests. I'm also open to other solutions here. We could, for instance, set xfrm=7 to allow AVX2 instructions (which are generated by "--with-arch_64=x86-64-v3") or use some compiler flags other than "-mno-avx". But "-mno-avx" does seem good to me.
Re: [PATCH] selftests/sgx: Fix an enclave built with extended instructions
On Wed, Apr 9, 2025 at 7:07 PM Dave Hansen wrote: > > On 4/9/25 09:55, Vladis Dronov wrote: > ... > > Fix this by adding "-mno-avx" to ENCL_CFLAGS in Makefile. Add some comments > > about this to code locations where enclave's xfrm field is set. > > > > Suggested-by: Dave Hansen > > Signed-off-by: Vladis Dronov > > First of all, this looks fine to me: > > Acked-by: Dave Hansen > > The code comments are fine. I'm much less picky about selftests. > > I'm also open to other solutions here. We could, for instance, set > xfrm=7 to allow AVX2 instructions (which are generated by > "--with-arch_64=x86-64-v3") or use some compiler flags other than > "-mno-avx". > > But "-mno-avx" does seem good to me. > Thank you for the ACK, Dave. I've tested an enclave with xfrm=7 and it errors out at the AVX512 instruction (namely, vmovdqa64) in the same way. So if there is a compiler built with "--with-arch_64=x86-64-v4" in some future, we would get into the exact same situation. So I believe a solution when we disable extended instruction sets in an enclave as one covering all future cases.
Re: [PATCH] selftests/mm: Fix compiler -Wmaybe-uninitialized warning
On 4/9/25 15:51, David Hildenbrand wrote: > On 09.04.25 12:09, Anshuman Khandual wrote: >> >> >> On 4/9/25 15:27, David Hildenbrand wrote: >>> On 09.04.25 11:50, Anshuman Khandual wrote: Following build warning comes up for cow test as 'transferred' variable has not been initialized. Fix the warning via zero init for the variable. CC cow cow.c: In function ‘do_test_vmsplice_in_parent’: cow.c:365:61: warning: ‘transferred’ may be used uninitialized [-Wmaybe-uninitialized] 365 | cur = read(fds[0], new + total, transferred - total); | ^~~ cow.c:296:29: note: ‘transferred’ was declared here 296 | ssize_t cur, total, transferred; | ^~~ CC compaction_test CC gup_longterm Cc: Andrew Morton Cc: Shuah Khan Cc: linux...@kvack.org Cc: linux-kselft...@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Anshuman Khandual --- tools/testing/selftests/mm/cow.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/testing/selftests/mm/cow.c b/tools/testing/selftests/mm/cow.c index f0cb14ea8608..b6cfe0a4b7df 100644 --- a/tools/testing/selftests/mm/cow.c +++ b/tools/testing/selftests/mm/cow.c @@ -293,7 +293,7 @@ static void do_test_vmsplice_in_parent(char *mem, size_t size, .iov_base = mem, .iov_len = size, }; - ssize_t cur, total, transferred; + ssize_t cur, total, transferred = 0; struct comm_pipes comm_pipes; char *old, *new; int ret, fds[2]; >>> >>> >>> if (before_fork) { >>> transferred = vmsplice(fds[1], &iov, 1, 0); >>> ... >>> >>> if (!before_fork) { >>> transferred = vmsplice(fds[1], &iov, 1, 0); >>> ... >>> >>> for (total = 0; total < transferred; total += cur) { >>> ... >>> >>> >>> And I don't see any jump label that could jump to code that would ve using >>> transferred. >>> >>> What am I missing? >> >> Probably because both those conditional statements are not mutually >> exclusive above with an if-else construct. Hence compiler flags it >> rather as a false positive ? Initializing with 0 just works around >> that false positive. > > This is something the compiler should clearly be able to verify. before_fork > is never changed in that function. > > We should not work around wrong compilers. > > Which compiler are you using such that you run into this issue? gcc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0
[PATCH v4] selftests/x86/lam: fix resource leak in do_uring() and allocate_dsa_pasid()
This patch resolves resource leaks reported by cppcheck in lam.c. Specifically, the 'file_fd' and 'fd' were not closed in do_uring() and allocate_dsa_pasid() functions, respectively. cppcheck output before this patch: tools/testing/selftests/x86/lam.c:685:3: error: Resource leak: file_fd [resourceLeak] tools/testing/selftests/x86/lam.c:693:3: error: Resource leak: file_fd [resourceLeak] tools/testing/selftests/x86/lam.c:1195:2: error: Resource leak: fd [resourceLeak] cppcheck output after this patch: No resource leaks found Signed-off-by: Malaya Kumar Rout --- tools/testing/selftests/x86/lam.c | 9 ++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/tools/testing/selftests/x86/lam.c b/tools/testing/selftests/x86/lam.c index 18d736640ece..0873b0e5f48b 100644 --- a/tools/testing/selftests/x86/lam.c +++ b/tools/testing/selftests/x86/lam.c @@ -682,7 +682,7 @@ int do_uring(unsigned long lam) return 1; if (fstat(file_fd, &st) < 0) - return 1; + goto cleanup; off_t file_sz = st.st_size; @@ -690,7 +690,7 @@ int do_uring(unsigned long lam) fi = malloc(sizeof(*fi) + sizeof(struct iovec) * blocks); if (!fi) - return 1; + goto cleanup; fi->file_sz = file_sz; fi->file_fd = file_fd; @@ -698,7 +698,7 @@ int do_uring(unsigned long lam) ring = malloc(sizeof(*ring)); if (!ring) { free(fi); - return 1; + goto cleanup; } memset(ring, 0, sizeof(struct io_ring)); @@ -729,6 +729,8 @@ int do_uring(unsigned long lam) } free(fi); +cleanup: + close(file_fd); return ret; } @@ -1189,6 +1191,7 @@ void *allocate_dsa_pasid(void) wq = mmap(NULL, 0x1000, PROT_WRITE, MAP_SHARED | MAP_POPULATE, fd, 0); + close(fd); if (wq == MAP_FAILED) perror("mmap"); -- 2.43.0
Re: [PATCH v2] selftests: pid_namespace: Add missing sys/mount.h
On Wed, Apr 9, 2025 at 1:53 AM Peter Seiderer wrote: > > Hello T.J., > > On Tue, 8 Apr 2025 23:02:02 +, "T.J. Mercier" > wrote: > > > pid_max.c: In function ‘pid_max_cb’: > > pid_max.c:42:15: error: implicit declaration of function ‘mount’ > >[-Wimplicit-function-declaration] > >42 | ret = mount("", "/", NULL, MS_PRIVATE | MS_REC, 0); > > | ^ > > pid_max.c:42:36: error: ‘MS_PRIVATE’ undeclared (first use in this > > function); did you mean ‘MAP_PRIVATE’? > >42 | ret = mount("", "/", NULL, MS_PRIVATE | MS_REC, 0); > > |^~ > > |MAP_PRIVATE > > pid_max.c:42:49: error: ‘MS_REC’ undeclared (first use in this function) > >42 | ret = mount("", "/", NULL, MS_PRIVATE | MS_REC, 0); > > | ^~ > > pid_max.c:48:9: error: implicit declaration of function ‘umount2’; did > >you mean ‘SYS_umount2’? [-Wimplicit-function-declaration] > >48 | umount2("/proc", MNT_DETACH); > > | ^~~ > > | SYS_umount2 > > pid_max.c:48:26: error: ‘MNT_DETACH’ undeclared (first use in this > >function) > >48 | umount2("/proc", MNT_DETACH); > > > > Fixes: 615ab43b838b ("tests/pid_namespace: add pid_max tests") > > Signed-off-by: T.J. Mercier > > --- > > tools/testing/selftests/pid_namespace/pid_max.c | 1 + > > 1 file changed, 1 insertion(+) > > > > diff --git a/tools/testing/selftests/pid_namespace/pid_max.c > > b/tools/testing/selftests/pid_namespace/pid_max.c > > index 51c414faabb0..96f274f0582b 100644 > > --- a/tools/testing/selftests/pid_namespace/pid_max.c > > +++ b/tools/testing/selftests/pid_namespace/pid_max.c > > @@ -10,6 +10,7 @@ > > #include > > #include > > #include > > +#include > > #include > > > > #include "../kselftest_harness.h" > > Predated patch already available, see > > > https://lore.kernel.org/linux-kselftest/20250115105211.390370-3-ps.rep...@gmx.net/ > > Regards, > Peter Thanks Peter, looks like that was never taken into anybody's tree? You can have my: Reviewed-by: T.J. Mercier
Re: [PATCH 2/4] sysctl: Add 0012 to test the u8 range check
On Fri, Mar 21, 2025 at 01:47:25PM +0100, Joel Granados wrote: > Add a sysctl test that uses the new u8 test ctl files in a created by > the sysctl test module. Check that the u8 proc file that is valid is > created and that there are two messages in dmesg for the files that were > out of range. > > Signed-off-by: Joel Granados Reviewed-by: Kees Cook -- Kees Cook
Re: [PATCH v2] selftests: pid_namespace: Add missing sys/mount.h
Hello T.J., On Tue, 8 Apr 2025 23:02:02 +, "T.J. Mercier" wrote: > pid_max.c: In function ‘pid_max_cb’: > pid_max.c:42:15: error: implicit declaration of function ‘mount’ >[-Wimplicit-function-declaration] >42 | ret = mount("", "/", NULL, MS_PRIVATE | MS_REC, 0); > | ^ > pid_max.c:42:36: error: ‘MS_PRIVATE’ undeclared (first use in this > function); did you mean ‘MAP_PRIVATE’? >42 | ret = mount("", "/", NULL, MS_PRIVATE | MS_REC, 0); > |^~ > |MAP_PRIVATE > pid_max.c:42:49: error: ‘MS_REC’ undeclared (first use in this function) >42 | ret = mount("", "/", NULL, MS_PRIVATE | MS_REC, 0); > | ^~ > pid_max.c:48:9: error: implicit declaration of function ‘umount2’; did >you mean ‘SYS_umount2’? [-Wimplicit-function-declaration] >48 | umount2("/proc", MNT_DETACH); > | ^~~ > | SYS_umount2 > pid_max.c:48:26: error: ‘MNT_DETACH’ undeclared (first use in this >function) >48 | umount2("/proc", MNT_DETACH); > > Fixes: 615ab43b838b ("tests/pid_namespace: add pid_max tests") > Signed-off-by: T.J. Mercier > --- > tools/testing/selftests/pid_namespace/pid_max.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/tools/testing/selftests/pid_namespace/pid_max.c > b/tools/testing/selftests/pid_namespace/pid_max.c > index 51c414faabb0..96f274f0582b 100644 > --- a/tools/testing/selftests/pid_namespace/pid_max.c > +++ b/tools/testing/selftests/pid_namespace/pid_max.c > @@ -10,6 +10,7 @@ > #include > #include > #include > +#include > #include > > #include "../kselftest_harness.h" Predated patch already available, see https://lore.kernel.org/linux-kselftest/20250115105211.390370-3-ps.rep...@gmx.net/ Regards, Peter
RE: spdxcheck: python git module considered harmful (was RE: [PATCH] scripts/spdxcheck: Limit the scope of git.Repo)
> -Original Message- > From: Thomas Gleixner > On Tue, Apr 08 2025 at 17:34, Tim Bird wrote: > >> -Original Message- > > For what it's worth, I've always been a bit skeptical of the use of the > > python git module > > in spdxcheck.py. Its use makes it impossible to use spdxcheck on a kernel > > source tree > > from a tarball (ie, on source not inside a git repo). Also, from what I > > can see in spdxcheck.py, > > the way it's used is just to get the top directories for either the > > LICENSES dir, > > the top dir of the kernel source tree, or the directory to scan passed on > > the > > spdxcheck.py command line, and then to use the repo.traverse() function on > > said directory. > > > > This ends up excluding any files in the source directory tree that are not > > checked > > into git yet, silently skipping them (which I've run into before when > > using the tool). > > The exactly same problem exists the other way round. Run an > unconstrained version of spdxcheck on a dirty source tree with lots of > leftovers, then it scans nonsense all the way instead of skipping some > not yet git tracked files. Yeah. I thought about this overnight, and came to the same conclusion. I forgot that most people build the kernel in a way that the build results end up in the source tree. (Crazy, right?) I almost always use KBUILD_OUTPUT, and I always use it when I'm doing embedded and spdx-related work, so I don't often run into build contamination of the source tree. > > The easiest way for me to achieve that was using git to exclude all of > the irrelevant noise, which I still consider to be a reasonable design > decision. Agreed. Given common build practices, this is a reasonable design decision. I thought there might be a good reason for this design choice, and was hoping you would respond, Thomas. Thanks for the quick feedback. > > And yes, it ignores not yet tracked files, but if you want to check > them, then it's easy enough to commit them temporarily or provide a > dedicated file target to the tools, which ignores git. OK. Yes. That's an easy workaround. > > > I think the code could be relatively easily refactored to eliminate the use > > of the git > > module, to overcome these issues. I'm not sure if removing the module would > > eliminate the yield operation (used inside repo.traverse()), which seems to > > be causing the > > problem found here. IMHO, in my experience when using python it is helpful > > to use as few non-core modules as possible, because they tend to break like > > this > > occasionally. > > > > Let me know if anyone objects to me working up a refactoring of spdxcheck.py > > eliminating the use of the python 'git' module, and submitting it for > > review. > > I have no objections at all as long as it gives the same result of not > trying to scan random artifacts which might sit around in a source tree. > > But not for the price that I have to create a tarball or a pristine > checked out tree first to run it. That'd be a usability regression to > begin with. Agreed. > > Good luck for coming up with a clever and clean solution for that! I thought about various solutions for this, but each one I came up with had other drawbacks. If it was just a matter of separating *.[chS] files from ELF object files, that would be easy to deal with. But we put SPDX headers on all kinds of files, and there are lots of other types of files generated during a build that are not just ELF objects. And build rules change over time. So even if I made a comprehensive system today to catch build-generated outliers, the solution would probably need constant updating and tweaking, which IMHO makes it a no-go. > > Just for the record: I rather wish that people would contribute to > eliminate the remaining 17% (15397 files) which do not have SPDX > identifiers than complaining about the trivial to solve short-comings of > the tool, which was written to help this effort and to make sure that it > does not degrade. I agree with this. Analyzing where the headers are missing is interesting. But it's more important to just fix the missing ones. I'll spend more of my time working on missing headers, rather than on tools to analyze and report them. Thanks and regards, -- Tim
[PATCH] selftests/bpf: Fix bpf_nf selftest failure
For systems with missing iptables-legacy tool, this selftest fails. Add check to find if iptables-legacy tool is available and skip the test if the tool is missing. Fixes: de9c8d848d90 ("selftests/bpf: S/iptables/iptables-legacy/ in the bpf_nf and xdp_synproxy test") Signed-off-by: Saket Kumar Bhaskar --- tools/testing/selftests/bpf/prog_tests/bpf_nf.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_nf.c b/tools/testing/selftests/bpf/prog_tests/bpf_nf.c index dbd13f8e42a7..dd6512fa652b 100644 --- a/tools/testing/selftests/bpf/prog_tests/bpf_nf.c +++ b/tools/testing/selftests/bpf/prog_tests/bpf_nf.c @@ -63,6 +63,12 @@ static void test_bpf_nf_ct(int mode) .repeat = 1, ); + if (SYS_NOFAIL("iptables-legacy --version")) { + fprintf(stdout, "Missing required iptables-legacy tool\n"); + test__skip(); + return; + } + skel = test_bpf_nf__open_and_load(); if (!ASSERT_OK_PTR(skel, "test_bpf_nf__open_and_load")) return; -- 2.43.5
Re: [PATCH] selftests/mm: Fix compiler -Wmaybe-uninitialized warning
On 09.04.25 11:50, Anshuman Khandual wrote: Following build warning comes up for cow test as 'transferred' variable has not been initialized. Fix the warning via zero init for the variable. CC cow cow.c: In function ‘do_test_vmsplice_in_parent’: cow.c:365:61: warning: ‘transferred’ may be used uninitialized [-Wmaybe-uninitialized] 365 | cur = read(fds[0], new + total, transferred - total); | ^~~ cow.c:296:29: note: ‘transferred’ was declared here 296 | ssize_t cur, total, transferred; | ^~~ CC compaction_test CC gup_longterm Cc: Andrew Morton Cc: Shuah Khan Cc: linux...@kvack.org Cc: linux-kselft...@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Anshuman Khandual --- tools/testing/selftests/mm/cow.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/testing/selftests/mm/cow.c b/tools/testing/selftests/mm/cow.c index f0cb14ea8608..b6cfe0a4b7df 100644 --- a/tools/testing/selftests/mm/cow.c +++ b/tools/testing/selftests/mm/cow.c @@ -293,7 +293,7 @@ static void do_test_vmsplice_in_parent(char *mem, size_t size, .iov_base = mem, .iov_len = size, }; - ssize_t cur, total, transferred; + ssize_t cur, total, transferred = 0; struct comm_pipes comm_pipes; char *old, *new; int ret, fds[2]; if (before_fork) { transferred = vmsplice(fds[1], &iov, 1, 0); ... if (!before_fork) { transferred = vmsplice(fds[1], &iov, 1, 0); ... for (total = 0; total < transferred; total += cur) { ... And I don't see any jump label that could jump to code that would ve using transferred. What am I missing? -- Cheers, David / dhildenb
Re: [PATCH v8 8/8] vhost: Add a KConfig knob to enable IOCTL VHOST_FORK_FROM_OWNER
On Tue, Apr 8, 2025 at 7:56 PM Michael S. Tsirkin wrote: > > On Fri, Mar 28, 2025 at 06:02:52PM +0800, Cindy Lu wrote: > > Introduce a new config knob `CONFIG_VHOST_ENABLE_FORK_OWNER_IOCTL`, > > to control the availability of the `VHOST_FORK_FROM_OWNER` ioctl. > > When CONFIG_VHOST_ENABLE_FORK_OWNER_IOCTL is set to n, the ioctl > > is disabled, and any attempt to use it will result in failure. > > > > Signed-off-by: Cindy Lu > > --- > > drivers/vhost/Kconfig | 15 +++ > > drivers/vhost/vhost.c | 3 +++ > > 2 files changed, 18 insertions(+) > > > > diff --git a/drivers/vhost/Kconfig b/drivers/vhost/Kconfig > > index b455d9ab6f3d..e5b9dcbf31b6 100644 > > --- a/drivers/vhost/Kconfig > > +++ b/drivers/vhost/Kconfig > > @@ -95,3 +95,18 @@ config VHOST_CROSS_ENDIAN_LEGACY > > If unsure, say "N". > > > > endif > > + > > +config VHOST_ENABLE_FORK_OWNER_IOCTL > > + bool "Enable IOCTL VHOST_FORK_FROM_OWNER" > > + default n > > + help > > + This option enables the IOCTL VHOST_FORK_FROM_OWNER, which allows > > + userspace applications to modify the thread mode for vhost devices. > > ok > > > + By default, `CONFIG_VHOST_ENABLE_FORK_OWNER_IOCTL` is set to `n`, > > + meaning the ioctl is disabled and any operation using this ioctl > > + will fail. > > + When the configuration is enabled (y), the ioctl becomes > > + available, allowing users to set the mode if needed. > > no need to be so verbose - the disabled beavious belongs in commit log > not here. > > Also either ioctl or IOCTL but not both. > sure, will change this Thanks cindy > > + > > + If unsure, say "N". > > diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c > > index fb0c7fb43f78..568e43cb54a9 100644 > > --- a/drivers/vhost/vhost.c > > +++ b/drivers/vhost/vhost.c > > @@ -2294,6 +2294,8 @@ long vhost_dev_ioctl(struct vhost_dev *d, unsigned > > int ioctl, void __user *argp) > > r = vhost_dev_set_owner(d); > > goto done; > > } > > + > > +#ifdef CONFIG_VHOST_ENABLE_FORK_OWNER_IOCTL > > if (ioctl == VHOST_FORK_FROM_OWNER) { > > u8 inherit_owner; > > /*inherit_owner can only be modified before owner is set*/ > > @@ -2313,6 +2315,7 @@ long vhost_dev_ioctl(struct vhost_dev *d, unsigned > > int ioctl, void __user *argp) > > r = 0; > > goto done; > > } > > +#endif > > /* You must be the owner to do anything else */ > > r = vhost_dev_check_owner(d); > > if (r) > > -- > > 2.45.0 >
Re: [PATCH] selftests/mm: Fix compiler -Wmaybe-uninitialized warning
On 4/9/25 15:27, David Hildenbrand wrote: > On 09.04.25 11:50, Anshuman Khandual wrote: >> Following build warning comes up for cow test as 'transferred' variable has >> not been initialized. Fix the warning via zero init for the variable. >> >> CC cow >> cow.c: In function ‘do_test_vmsplice_in_parent’: >> cow.c:365:61: warning: ‘transferred’ may be used uninitialized >> [-Wmaybe-uninitialized] >> 365 | cur = read(fds[0], new + total, transferred - >> total); >> | ^~~ >> cow.c:296:29: note: ‘transferred’ was declared here >> 296 | ssize_t cur, total, transferred; >> | ^~~ >> CC compaction_test >> CC gup_longterm >> >> Cc: Andrew Morton >> Cc: Shuah Khan >> Cc: linux...@kvack.org >> Cc: linux-kselft...@vger.kernel.org >> Cc: linux-kernel@vger.kernel.org >> Signed-off-by: Anshuman Khandual >> --- >> tools/testing/selftests/mm/cow.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/tools/testing/selftests/mm/cow.c >> b/tools/testing/selftests/mm/cow.c >> index f0cb14ea8608..b6cfe0a4b7df 100644 >> --- a/tools/testing/selftests/mm/cow.c >> +++ b/tools/testing/selftests/mm/cow.c >> @@ -293,7 +293,7 @@ static void do_test_vmsplice_in_parent(char *mem, size_t >> size, >> .iov_base = mem, >> .iov_len = size, >> }; >> - ssize_t cur, total, transferred; >> + ssize_t cur, total, transferred = 0; >> struct comm_pipes comm_pipes; >> char *old, *new; >> int ret, fds[2]; > > > if (before_fork) { > transferred = vmsplice(fds[1], &iov, 1, 0); > ... > > if (!before_fork) { > transferred = vmsplice(fds[1], &iov, 1, 0); > ... > > for (total = 0; total < transferred; total += cur) { > ... > > > And I don't see any jump label that could jump to code that would ve using > transferred. > > What am I missing? Probably because both those conditional statements are not mutually exclusive above with an if-else construct. Hence compiler flags it rather as a false positive ? Initializing with 0 just works around that false positive.
[PATCH] selftests/mm: Fix compiler -Wmaybe-uninitialized warning
Following build warning comes up for cow test as 'transferred' variable has not been initialized. Fix the warning via zero init for the variable. CC cow cow.c: In function ‘do_test_vmsplice_in_parent’: cow.c:365:61: warning: ‘transferred’ may be used uninitialized [-Wmaybe-uninitialized] 365 | cur = read(fds[0], new + total, transferred - total); | ^~~ cow.c:296:29: note: ‘transferred’ was declared here 296 | ssize_t cur, total, transferred; | ^~~ CC compaction_test CC gup_longterm Cc: Andrew Morton Cc: Shuah Khan Cc: linux...@kvack.org Cc: linux-kselft...@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Anshuman Khandual --- tools/testing/selftests/mm/cow.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/testing/selftests/mm/cow.c b/tools/testing/selftests/mm/cow.c index f0cb14ea8608..b6cfe0a4b7df 100644 --- a/tools/testing/selftests/mm/cow.c +++ b/tools/testing/selftests/mm/cow.c @@ -293,7 +293,7 @@ static void do_test_vmsplice_in_parent(char *mem, size_t size, .iov_base = mem, .iov_len = size, }; - ssize_t cur, total, transferred; + ssize_t cur, total, transferred = 0; struct comm_pipes comm_pipes; char *old, *new; int ret, fds[2]; -- 2.43.0
Re: [PATCH] selftests/mm: Fix compiler -Wmaybe-uninitialized warning
On 09.04.25 12:09, Anshuman Khandual wrote: On 4/9/25 15:27, David Hildenbrand wrote: On 09.04.25 11:50, Anshuman Khandual wrote: Following build warning comes up for cow test as 'transferred' variable has not been initialized. Fix the warning via zero init for the variable. CC cow cow.c: In function ‘do_test_vmsplice_in_parent’: cow.c:365:61: warning: ‘transferred’ may be used uninitialized [-Wmaybe-uninitialized] 365 | cur = read(fds[0], new + total, transferred - total); | ^~~ cow.c:296:29: note: ‘transferred’ was declared here 296 | ssize_t cur, total, transferred; | ^~~ CC compaction_test CC gup_longterm Cc: Andrew Morton Cc: Shuah Khan Cc: linux...@kvack.org Cc: linux-kselft...@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Anshuman Khandual --- tools/testing/selftests/mm/cow.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/testing/selftests/mm/cow.c b/tools/testing/selftests/mm/cow.c index f0cb14ea8608..b6cfe0a4b7df 100644 --- a/tools/testing/selftests/mm/cow.c +++ b/tools/testing/selftests/mm/cow.c @@ -293,7 +293,7 @@ static void do_test_vmsplice_in_parent(char *mem, size_t size, .iov_base = mem, .iov_len = size, }; - ssize_t cur, total, transferred; + ssize_t cur, total, transferred = 0; struct comm_pipes comm_pipes; char *old, *new; int ret, fds[2]; if (before_fork) { transferred = vmsplice(fds[1], &iov, 1, 0); ... if (!before_fork) { transferred = vmsplice(fds[1], &iov, 1, 0); ... for (total = 0; total < transferred; total += cur) { ... And I don't see any jump label that could jump to code that would ve using transferred. What am I missing? Probably because both those conditional statements are not mutually exclusive above with an if-else construct. Hence compiler flags it rather as a false positive ? Initializing with 0 just works around that false positive. This is something the compiler should clearly be able to verify. before_fork is never changed in that function. We should not work around wrong compilers. Which compiler are you using such that you run into this issue? -- Cheers, David / dhildenb
Re: [PATCH v1] s390/virtio_ccw: don't allocate/assign airqs for non-existing queues
On 07.04.25 23:20, Michael S. Tsirkin wrote: On Mon, Apr 07, 2025 at 08:47:05PM +0200, David Hildenbrand wrote: In my opinion, it makes the most sense to keep the spec as it is and change QEMU and the kernel to match, but obviously that's not trivial to do in a way that doesn't break existing devices and drivers. If only it would be limited to QEMU and Linux ... :) Out of curiosity, assuming we'd make the spec match the current QEMU/Linux implementation at least for the 3 involved features only, would there be a way to adjust crossvm without any disruption? I still have the feeling that it will be rather hard to get that all implementations match the spec ... For new features+queues it will be easy to force the usage of fixed virtqueue numbers, but for free-page-hinting and reporting, it's a mess :( Still thinking about a way to fix drivers... We can discuss this theoretically, maybe? Yes, absolutely. I took the time to do some more digging; regarding drivers only Linux seems to be problematic. virtio-win, FreeBSD, NetBSD and OpenBSD and don't seem to support problematic features (free page hinting, free page reporting) in their virtio-balloon implementations. So from the known drivers, only Linux is applicable. reporting_vq is either at idx 4/3/2 free_page_vq is either at idx 3/2 statsq is at idx2 (only relevant if the feature is offered) So if we could test for the existence of a virtqueue at an idx easily, we could test from highest-to-smallest idx. But I recall that testing for the existance of a virtqueue on s390x resulted in the problem/deadlock in the first place ... -- Cheers, David / dhildenb
Re: [PATCH] selftests/mm: Fix compiler -Wmaybe-uninitialized warning
On 09.04.25 12:25, Anshuman Khandual wrote: On 4/9/25 15:51, David Hildenbrand wrote: On 09.04.25 12:09, Anshuman Khandual wrote: On 4/9/25 15:27, David Hildenbrand wrote: On 09.04.25 11:50, Anshuman Khandual wrote: Following build warning comes up for cow test as 'transferred' variable has not been initialized. Fix the warning via zero init for the variable. CC cow cow.c: In function ‘do_test_vmsplice_in_parent’: cow.c:365:61: warning: ‘transferred’ may be used uninitialized [-Wmaybe-uninitialized] 365 | cur = read(fds[0], new + total, transferred - total); | ^~~ cow.c:296:29: note: ‘transferred’ was declared here 296 | ssize_t cur, total, transferred; | ^~~ CC compaction_test CC gup_longterm Cc: Andrew Morton Cc: Shuah Khan Cc: linux...@kvack.org Cc: linux-kselft...@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Anshuman Khandual --- tools/testing/selftests/mm/cow.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/testing/selftests/mm/cow.c b/tools/testing/selftests/mm/cow.c index f0cb14ea8608..b6cfe0a4b7df 100644 --- a/tools/testing/selftests/mm/cow.c +++ b/tools/testing/selftests/mm/cow.c @@ -293,7 +293,7 @@ static void do_test_vmsplice_in_parent(char *mem, size_t size, .iov_base = mem, .iov_len = size, }; - ssize_t cur, total, transferred; + ssize_t cur, total, transferred = 0; struct comm_pipes comm_pipes; char *old, *new; int ret, fds[2]; if (before_fork) { transferred = vmsplice(fds[1], &iov, 1, 0); ... if (!before_fork) { transferred = vmsplice(fds[1], &iov, 1, 0); ... for (total = 0; total < transferred; total += cur) { ... And I don't see any jump label that could jump to code that would ve using transferred. What am I missing? Probably because both those conditional statements are not mutually exclusive above with an if-else construct. Hence compiler flags it rather as a false positive ? Initializing with 0 just works around that false positive. This is something the compiler should clearly be able to verify. before_fork is never changed in that function. We should not work around wrong compilers. Which compiler are you using such that you run into this issue? gcc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0 gcc (GCC) 14.2.1 20250110 (Red Hat 14.2.1-7) Seems to be fine, just like all other compilers people used with this over the years. Maybe something about that compiler is shaky that was fixed in the meantime? -- Cheers, David / dhildenb
Re: [PATCH v1] s390/virtio_ccw: don't allocate/assign airqs for non-existing queues
On Wed, Apr 09, 2025 at 12:46:41PM +0200, David Hildenbrand wrote: > On 07.04.25 23:20, Michael S. Tsirkin wrote: > > On Mon, Apr 07, 2025 at 08:47:05PM +0200, David Hildenbrand wrote: > > > > In my opinion, it makes the most sense to keep the spec as it is and > > > > change QEMU and the kernel to match, but obviously that's not trivial > > > > to do in a way that doesn't break existing devices and drivers. > > > > > > If only it would be limited to QEMU and Linux ... :) > > > > > > Out of curiosity, assuming we'd make the spec match the current QEMU/Linux > > > implementation at least for the 3 involved features only, would there be a > > > way to adjust crossvm without any disruption? > > > > > > I still have the feeling that it will be rather hard to get that all > > > implementations match the spec ... For new features+queues it will be easy > > > to force the usage of fixed virtqueue numbers, but for free-page-hinting > > > and > > > reporting, it's a mess :( > > > > > > Still thinking about a way to fix drivers... We can discuss this > > theoretically, maybe? > > Yes, absolutely. I took the time to do some more digging; regarding drivers > only Linux seems to be problematic. > > virtio-win, FreeBSD, NetBSD and OpenBSD and don't seem to support > problematic features (free page hinting, free page reporting) in their > virtio-balloon implementations. > > So from the known drivers, only Linux is applicable. > > reporting_vq is either at idx 4/3/2 > free_page_vq is either at idx 3/2 > statsq is at idx2 (only relevant if the feature is offered) > > So if we could test for the existence of a virtqueue at an idx easily, we > could test from highest-to-smallest idx. > > But I recall that testing for the existance of a virtqueue on s390x resulted > in the problem/deadlock in the first place ... > > -- > Cheers, > > David / dhildenb So let's talk about a new feature bit? Since vqs are probed after feature negotiation, it looks like we could have a feature bit trigger sane behaviour, right? I kind of dislike it that we have a feature bit for bugs though. What would be a minimal new feature to add so it does not feel wrong? Maybe it's in the field of psychology though ... -- MST
Re: [PATCH v1] s390/virtio_ccw: don't allocate/assign airqs for non-existing queues
On 07.04.25 23:09, Daniel Verkamp wrote: On Mon, Apr 7, 2025 at 11:47 AM David Hildenbrand wrote: Heh, but that one said: +\item[ VIRTIO_BALLOON_F_WS_REPORTING(6) ] The device has support for Working Set Which does not seem to reflect reality ... Please feel free to disregard these features and reuse their bits and queue indexes; as far as I know, they are not actually enabled anywhere currently and the corresponding guest patches were only applied to some (no-longer-used) ChromeOS kernel trees, so the compatibility impact should be minimal. I will also try to clean up the leftover bits on the crosvm side just to clear things up. Thanks for your reply, and thanks for clarifying+cleaning it up. [...] IIRC, in that commit they switched to the "spec" behavior. That's when they started hard-coding the queue indexes. CCing Daniel. All Linux versions should be incompatible with cross-vmm regarding free page reporting. How is that handled? In practice, it only works because nobody calls crosvm with --balloon-page-reporting (it's off by default), so the balloon device does not advertise the VIRTIO_BALLOON_F_PAGE_REPORTING feature. (I just went searching now, and it does seem like there is actually one user in Android that does try to enable page reporting[1], which I'll have to look into...) In my opinion, it makes the most sense to keep the spec as it is and change QEMU and the kernel to match, but obviously that's not trivial to do in a way that doesn't break existing devices and drivers. If only it would be limited to QEMU and Linux ... :) Out of curiosity, assuming we'd make the spec match the current QEMU/Linux implementation at least for the 3 involved features only, would there be a way to adjust crossvm without any disruption? I still have the feeling that it will be rather hard to get that all implementations match the spec ... For new features+queues it will be easy to force the usage of fixed virtqueue numbers, but for free-page-hinting and reporting, it's a mess :( If the spec is changed, we can certainly update crosvm to match it; I think this only really affects a few devices (balloon and technically filesystem, but see below), only affects features that are generally not turned on, and in many cases, the guest kernel is updated simultaneously with the crosvm binary. I'm not opposed to changing the spec to match reality, although that feels like a bad move from a spec-integrity perspective. Right. We didn't pay attention that the spec would reflect reality, and the reality was a bad decision :) Regardless of the chosen path, I think the spec should be clarified - the meaning of "queue only exists if is set" leaves the reader with too many questions: Right, that's what we've been discussing. - What does "if is set" mean? If it's advertised by the device? If it's acked by the driver? (To me, "set" definitely hints at the latter, but it should be explicit.) Currently it's "feature is offered by the device". - What does it mean for a virtqueue to "exist"? Does that queue index disappear from the numbering if it does not exist, sliding all of the later queues down? Currently it's like that, yes. If so, the spec should really not have the static queue numbers listed for the later queues, since they are only correct if all previous feature-dependent queues were also "set", whatever that means. Yes, that's also what we've been discussing. And that started restarted the whole "can we fix device/drivers instead" :) The way crosvm interpreted this was: - "if is set" means "if the device advertised *and* driver acknowledged " - "queue only exists" means "if was not acked, the corresponding virtqueue cannot be enabled by the driver" (attempting to set queue_enable = 1 has no effect). - Any later virtqueues are unaffected and still have the same queue indexes. Yes, that matches my understanding. The way QEMU interpeted this (I think, just skimming the code and working from memory here): - "if is set" means "if the device advertised " (it is checking host_features, not guest_features) Right, "offered features". - "queue only exists" means "if was not offered by the device, all later virtqueues are shifted down one index" Exactly. --- The spec for the filesystem device has a similar issue to the balloon device: - Queue 0 (hiprio) is always available regardless of features. - Queue 1 (notification queue) has a note that "The notification queue only exists if VIRTIO_FS_F_NOTIFICATION is set." - Queue 2..n are supposed to be the request queues per the numbering in the spec. This is how it has been specified since virtio 1.2 when the fs device was originally added. However, the Linux driver (and probably all existing device implementations - at least virtiofsd and crosvm's fs device) don't support VIRTIO_FS_F_NOTIFICATION and use queue 1 as a request queue, which matches the QEMU/Linux interpretation but means the spec doesn't match reality aga
Re: [PATCH v1] s390/virtio_ccw: don't allocate/assign airqs for non-existing queues
On 09.04.25 12:56, Michael S. Tsirkin wrote: On Wed, Apr 09, 2025 at 12:46:41PM +0200, David Hildenbrand wrote: On 07.04.25 23:20, Michael S. Tsirkin wrote: On Mon, Apr 07, 2025 at 08:47:05PM +0200, David Hildenbrand wrote: In my opinion, it makes the most sense to keep the spec as it is and change QEMU and the kernel to match, but obviously that's not trivial to do in a way that doesn't break existing devices and drivers. If only it would be limited to QEMU and Linux ... :) Out of curiosity, assuming we'd make the spec match the current QEMU/Linux implementation at least for the 3 involved features only, would there be a way to adjust crossvm without any disruption? I still have the feeling that it will be rather hard to get that all implementations match the spec ... For new features+queues it will be easy to force the usage of fixed virtqueue numbers, but for free-page-hinting and reporting, it's a mess :( Still thinking about a way to fix drivers... We can discuss this theoretically, maybe? Yes, absolutely. I took the time to do some more digging; regarding drivers only Linux seems to be problematic. virtio-win, FreeBSD, NetBSD and OpenBSD and don't seem to support problematic features (free page hinting, free page reporting) in their virtio-balloon implementations. So from the known drivers, only Linux is applicable. reporting_vq is either at idx 4/3/2 free_page_vq is either at idx 3/2 statsq is at idx2 (only relevant if the feature is offered) So if we could test for the existence of a virtqueue at an idx easily, we could test from highest-to-smallest idx. But I recall that testing for the existance of a virtqueue on s390x resulted in the problem/deadlock in the first place ... -- Cheers, David / dhildenb So let's talk about a new feature bit? Are you thinking about a new feature that switches between "fixed queue indices" and "compressed queue indices", whereby the latter would be the legacy default and we would expect all devices to switch to the new fixed-queue-indices layout? We could make all new features require "fixed-queue-indices". Since vqs are probed after feature negotiation, it looks like we could have a feature bit trigger sane behaviour, right? In the Linux driver, yes. In QEMU (devices), we add the queues when realizing, so we'd need some mechanism to adjust the queue indices based on feature negotiation I guess? For virtio-balloon it might be doable to simply always create+indicate free-page hinting to resolve the issue easily. For virtio-fs it might not be that easy. I kind of dislike it that we have a feature bit for bugs though. What would be a minimal new feature to add so it does not feel wrong? Probably as above: fixed vs. compressed virtqueue indices? -- Cheers, David / dhildenb
[RFC bpf-next 08/13] selftests: bpf: Avoid attaching to bpf_check()
bpf_check(), as it currently exists, will soon be going away to make way for loadable BPF verifier support. Fixup selftests so they fentry attach to a more reliable location. Signed-off-by: Daniel Xu --- tools/testing/selftests/bpf/progs/exceptions_assert.c | 2 +- tools/testing/selftests/bpf/progs/exceptions_fail.c | 4 ++-- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/tools/testing/selftests/bpf/progs/exceptions_assert.c b/tools/testing/selftests/bpf/progs/exceptions_assert.c index 5e0a1ca96d4e..50bc52cbb2e7 100644 --- a/tools/testing/selftests/bpf/progs/exceptions_assert.c +++ b/tools/testing/selftests/bpf/progs/exceptions_assert.c @@ -124,7 +124,7 @@ int check_assert_generic(struct __sk_buff *ctx) return data[128]; } -SEC("?fentry/bpf_check") +SEC("?fentry/bpf_fentry_test1") __failure __msg("At program exit the register R1 has smin=64 smax=64") int check_assert_with_return(void *ctx) { diff --git a/tools/testing/selftests/bpf/progs/exceptions_fail.c b/tools/testing/selftests/bpf/progs/exceptions_fail.c index 8a0fdff89927..b44cb0a6c9d9 100644 --- a/tools/testing/selftests/bpf/progs/exceptions_fail.c +++ b/tools/testing/selftests/bpf/progs/exceptions_fail.c @@ -299,7 +299,7 @@ __noinline int exception_cb_bad_ret(u64 c) return c; } -SEC("?fentry/bpf_check") +SEC("?fentry/bpf_fentry_test1") __exception_cb(exception_cb_bad_ret) __failure __msg("At program exit the register R0 has unknown scalar value should") int reject_set_exception_cb_bad_ret1(void *ctx) @@ -307,7 +307,7 @@ int reject_set_exception_cb_bad_ret1(void *ctx) return 0; } -SEC("?fentry/bpf_check") +SEC("?fentry/bpf_fentry_test1") __failure __msg("At program exit the register R1 has smin=64 smax=64 should") int reject_set_exception_cb_bad_ret2(void *ctx) { -- 2.47.1
Re: [PATCH v1] s390/virtio_ccw: don't allocate/assign airqs for non-existing queues
On Wed, Apr 09, 2025 at 01:12:19PM +0200, David Hildenbrand wrote: > On 09.04.25 12:56, Michael S. Tsirkin wrote: > > On Wed, Apr 09, 2025 at 12:46:41PM +0200, David Hildenbrand wrote: > > > On 07.04.25 23:20, Michael S. Tsirkin wrote: > > > > On Mon, Apr 07, 2025 at 08:47:05PM +0200, David Hildenbrand wrote: > > > > > > In my opinion, it makes the most sense to keep the spec as it is and > > > > > > change QEMU and the kernel to match, but obviously that's not > > > > > > trivial > > > > > > to do in a way that doesn't break existing devices and drivers. > > > > > > > > > > If only it would be limited to QEMU and Linux ... :) > > > > > > > > > > Out of curiosity, assuming we'd make the spec match the current > > > > > QEMU/Linux > > > > > implementation at least for the 3 involved features only, would there > > > > > be a > > > > > way to adjust crossvm without any disruption? > > > > > > > > > > I still have the feeling that it will be rather hard to get that all > > > > > implementations match the spec ... For new features+queues it will be > > > > > easy > > > > > to force the usage of fixed virtqueue numbers, but for > > > > > free-page-hinting and > > > > > reporting, it's a mess :( > > > > > > > > > > > > Still thinking about a way to fix drivers... We can discuss this > > > > theoretically, maybe? > > > > > > Yes, absolutely. I took the time to do some more digging; regarding > > > drivers > > > only Linux seems to be problematic. > > > > > > virtio-win, FreeBSD, NetBSD and OpenBSD and don't seem to support > > > problematic features (free page hinting, free page reporting) in their > > > virtio-balloon implementations. > > > > > > So from the known drivers, only Linux is applicable. > > > > > > reporting_vq is either at idx 4/3/2 > > > free_page_vq is either at idx 3/2 > > > statsq is at idx2 (only relevant if the feature is offered) > > > > > > So if we could test for the existence of a virtqueue at an idx easily, we > > > could test from highest-to-smallest idx. > > > > > > But I recall that testing for the existance of a virtqueue on s390x > > > resulted > > > in the problem/deadlock in the first place ... > > > > > > -- > > > Cheers, > > > > > > David / dhildenb > > > > So let's talk about a new feature bit? > > Are you thinking about a new feature that switches between "fixed queue > indices" and "compressed queue indices", whereby the latter would be the > legacy default and we would expect all devices to switch to the new > fixed-queue-indices layout? > > We could make all new features require "fixed-queue-indices". I see two ways: 1. we make driver behave correctly with in spec and out of spec devices and we make qemu behave correctly with in spec and out of spec devices 2. a new feature bit I prefer 1, and when we add a new feature we can also document that it should be in spec if negotiated. My question is if 1 is practical. > > > > Since vqs are probed after feature negotiation, it looks like > > we could have a feature bit trigger sane behaviour, right? > > In the Linux driver, yes. In QEMU (devices), we add the queues when > realizing, so we'd need some mechanism to adjust the queue indices based on > feature negotiation I guess? Well we can add queues later, nothing prevents that. > For virtio-balloon it might be doable to simply always create+indicate > free-page hinting to resolve the issue easily. OK, so - for devices, we suggest that basically VIRTIO_BALLOON_F_REPORTING only created with VIRTIO_BALLOON_F_FREE_PAGE_HINT and VIRTIO_BALLOON_F_FREE_PAGE_HINT only created with VIRTIO_BALLOON_F_STATS_VQ I got that. Now, for drivers. If the dependency is satisfied as above, no difference. What should drivers do if not? I think the thing to do would be to first probe spec compliant vq numbers? If not there, try with the non compliant version? However, you wrote: > > > But I recall that testing for the existance of a virtqueue on s390x > > > resulted > > > in the problem/deadlock in the first place ... I think the deadlock was if trying to *use* a non-existent virtqueue? This is qemu code: case CCW_CMD_READ_VQ_CONF: if (check_len) { if (ccw.count != sizeof(vq_config)) { ret = -EINVAL; break; } } else if (ccw.count < sizeof(vq_config)) { /* Can't execute command. */ ret = -EINVAL; break; } if (!ccw.cda) { ret = -EFAULT; } else { ret = ccw_dstream_read(&sch->cds, vq_config.index); if (ret) { break; } vq_config.index = be16_to_cpu(vq_config.index); if (vq_config.index >= VIRTIO_QUEUE_MAX) { ret = -EINVAL; break; } vq_config.num_max = virtio_queue_get_num(vdev, vq_config.index); vq_config.num_