RE: [PATCH AUTOSEL 6.1 3/7] x86/hyperv: Use slow_virt_to_phys() in page transition hypervisor callback
From: Pavel Machek Sent: Tuesday, March 12, 2024 1:35 PM
>
> > In preparation for temporarily marking pages not present during a
> > transition between encrypted and decrypted, use slow_virt_to_phys()
> > in the hypervisor callback. As long as the PFN is correct,
>
> This seems to be preparation for something we don't plan to do in
> -stable. Please drop.
>

As the author of the patch, I agree.

Michael
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
RE: [PATCH v5 2/8] arm64: hyperv: Add hypercall and register access functions
From: Marc Zyngier Sent: Thursday, November 7, 2019 1:11 AM
>
> >> On 2019-10-03 20:12, Michael Kelley wrote:
> >> > Add ARM64-specific code to make Hyper-V hypercalls and to
> >> > access virtual processor synthetic registers via hypercalls.
> >> > Hypercalls use a Hyper-V specific calling sequence with a non-zero
> >> > immediate value per Section 2.9 of the SMC Calling Convention
> >> > spec.
> >>
> >> I find this "following the spec by actively sidestepping it" counter
> >> productive. You (or rather the Hyper-V people) are reinventing the
> >> wheel (of the slightly square variety) instead of using the standard
> >> that the whole of the ARM ecosystem seems happy to take advantage
> >> of.
> >>
> >> I wonder what is the rational for this. If something doesn't quite
> >> work for Hyper-V, I think we'd all like to know.
> >>
> >
> > I'll go another round internally with the Hyper-V people on this
> > topic and impress upon them the desire of the Linux community to
> > have Hyper-V adopt the true spirit of the spec. But I know they are
> > fairly set in their approach at this point, regardless of the technical
> > merits or lack thereof. Hyper-V is shipping and in use as a
> > commercial product on ARM64 hardware, which makes it harder to
> > change. I hope we can find a way to avoid a complete impasse
>
> Hyper-V shipping with their own calling convention is fine by me. Linux
> having to implement multiple calling conventions because the Hyper-V
> folks refuse (for undisclosed reason) to adopt the standard isn't fine at
> all.

The "undisclosed reason" is performance. Hyper-V implements 100+
different hypercalls, though many are used only by code in the parent
partition (dom0 in Xen terminology). These hypercalls often take
moderately complex data structures as inputs and outputs. While the
data structures can be passed by reference using the guest physical
address (GPA), Hyper-V also offers a "fast" option where both input
and output data structures are passed entirely in registers, avoiding
two virt_to_phys() calls to get GPAs. The Hyper-V calling sequence
allows X0-X16 to be used for input and output data for these "fast"
hypercalls, allowing more hypercalls to fit in registers vs. the SMCCC
that is limited to X1-X6. The "fast" hypercall approach originated
with Hyper-V on x86/x64, where it also uses most of the available
registers.

These initial Linux patches for ARM64 make only a limited number of
hypercalls, all of which fit in X0-X6, even in "fast" mode. So the
"why" of the Hyper-V calling sequence is certainly not evident. But
future Hyper-V enlightenments in Linux on ARM64 will use more
hypercalls, some of which would not be able to use "fast" mode with
the SMCCC register limits.

>
> HV can perfectly retain its interface for Windows or other things, but
> please *at least* implement the standard interface on which all
> existing operating systems rely.

The Hyper-V team has agreed to look at the implications of adding a
shim to accept hypercalls through HVC #0 that follow the SMCCC. I'll
follow up once we have a better sense of what Hyper-V can do, and the
perf implications.

Michael
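The register-budget argument in the message above can be made concrete with a small user-space sketch. It is not code from the patch set: the helper names and the two budget macros are made up for illustration, the register counts are simply read off the ranges quoted in the message (X1-X6 and X0-X16), and it assumes that input and output each need their own 64-bit registers.

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative register budgets, read off the ranges quoted above. */
#define SMCCC_DATA_REGS 6	/* X1-X6 */
#define HV_DATA_REGS    17	/* X0-X16 */

/* Number of 64-bit registers needed to carry `bytes` of payload. */
static size_t regs_needed(size_t bytes)
{
	return (bytes + 7) / 8;
}

/*
 * Assumption for this sketch: input and output do not share
 * registers, so the two counts are summed against the budget.
 */
static bool fits_in_fast_regs(size_t in_bytes, size_t out_bytes,
			      size_t budget)
{
	return regs_needed(in_bytes) + regs_needed(out_bytes) <= budget;
}
```

Under these assumptions a hypercall with 64 bytes of input and 16 bytes of output needs ten registers, which fits the larger budget but not the six-register one.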
RE: [PATCH v5 3/8] arm64: hyperv: Add memory alloc/free functions for Hyper-V size pages
From: Marc Zyngier Sent: Thursday, November 7, 2019 6:20 AM
>
> > +/*
> > + * Functions for allocating and freeing memory with size and
> > + * alignment HV_HYP_PAGE_SIZE. These functions are needed because
> > + * the guest page size may not be the same as the Hyper-V page
> > + * size. And while kalloc() could allocate the memory, it does not
> > + * guarantee the required alignment. So a separate small memory
> > + * allocator is needed. The free function is rarely used, so it
> > + * does not try to combine freed pages into larger chunks.
>
> Is this still needed now that kmalloc has alignment guarantees
> (see 59bb47985c1d)?
>

The new kmalloc alignment guarantee is good news, and at least for
now would allow these implementations to collapse to just
kmalloc/kzalloc/kfree calls. My inclination is to keep the function
calls as wrappers, since ISA neutral Hyper-V drivers are starting to
use them, and future work on memory encryption in virtual environments
may require special handling of pages like these that are shared
between the host and guest. But they probably can be moved into the
ISA neutral Hyper-V drivers instead of per arch.

Michael
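As a rough illustration of the "keep the functions as thin wrappers" idea in the reply above, here is a user-space analogue. It is only a sketch: C11 aligned_alloc() stands in for an allocator that guarantees natural alignment, the page size is hard-coded to 4096, and the free function takes a pointer rather than the unsigned long used in the patch.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define HV_HYP_PAGE_SIZE 4096UL

/* User-space stand-in for a wrapper around an aligned allocator. */
static void *hv_alloc_hyperv_page(void)
{
	return aligned_alloc(HV_HYP_PAGE_SIZE, HV_HYP_PAGE_SIZE);
}

/* Same as above, but the page is cleared before it is returned. */
static void *hv_alloc_hyperv_zeroed_page(void)
{
	void *p = hv_alloc_hyperv_page();

	if (p)
		memset(p, 0, HV_HYP_PAGE_SIZE);
	return p;
}

/* Sketch only: takes a pointer, unlike the patch's unsigned long. */
static void hv_free_hyperv_page(void *p)
{
	free(p);
}
```

Callers only see the three wrapper names, so any later special handling of host-shared pages could be added inside them without touching the drivers.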
RE: [PATCH v5 2/8] arm64: hyperv: Add hypercall and register access functions
From: Marc Zyngier Sent: Wednesday, November 6, 2019 2:20 AM
>
> On 2019-10-03 20:12, Michael Kelley wrote:
> > Add ARM64-specific code to make Hyper-V hypercalls and to
> > access virtual processor synthetic registers via hypercalls.
> > Hypercalls use a Hyper-V specific calling sequence with a non-zero
> > immediate value per Section 2.9 of the SMC Calling Convention
> > spec.
>
> I find this "following the spec by actively sidestepping it" counter
> productive. You (or rather the Hyper-V people) are reinventing the
> wheel (of the slightly square variety) instead of using the standard
> that the whole of the ARM ecosystem seems happy to take advantage
> of.
>
> I wonder what is the rational for this. If something doesn't quite
> work for Hyper-V, I think we'd all like to know.
>

I'll go another round internally with the Hyper-V people on this
topic and impress upon them the desire of the Linux community to
have Hyper-V adopt the true spirit of the spec. But I know they are
fairly set in their approach at this point, regardless of the technical
merits or lack thereof. Hyper-V is shipping and in use as a
commercial product on ARM64 hardware, which makes it harder to
change. I hope we can find a way to avoid a complete impasse

Michael
RE: [PATCH v5 2/8] arm64: hyperv: Add hypercall and register access functions
From: Boqun Feng Sent: Sunday, November 3, 2019 8:37 PM
>
> > diff --git a/arch/arm64/Kbuild b/arch/arm64/Kbuild
> > index d646582..2469421 100644
> > --- a/arch/arm64/Kbuild
> > +++ b/arch/arm64/Kbuild
> > @@ -3,4 +3,5 @@ obj-y += kernel/ mm/
> >  obj-$(CONFIG_NET) += net/
> >  obj-$(CONFIG_KVM) += kvm/
> >  obj-$(CONFIG_XEN) += xen/
> > +obj-$(CONFIG_HYPERV) += hyperv/
>
> I did a kernel built with CONFIG_HYPERV=m today, and found out this line
> should be (similar to x86):
>
> +obj-$(subst m,y,$(CONFIG_HYPERV)) += hyperv/
>
> , otherwise, when CONFIG_HYPERV=m, files in arch/arm64/hyperv/ will be
> compiled as obj-m, and symbols defined in those files cannot be
> used by kernel builtin, e.g. hyperv_timer (since CONFIG_HYPERV_TIMER=y
> in this case).

Agreed. I'll fix that in the next version.

>
> A compile/link error I hit today is:
>
> | /home/boqun/linux-arm64/drivers/clocksource/hyperv_timer.c:98: undefined reference to `hv_set_vpreg'
> | aarch64-linux-gnu-ld: /home/boqun/linux-arm64/drivers/clocksource/hyperv_timer.c:98: undefined reference to `hv_set_vpreg'

I'm not seeing this error. I'm building natively on an ARM64 system,
though the environment and tools are perhaps a couple of years old.
Are you still able to reproduce the above error? And is it only
complaining about 'hv_set_vpreg', or also about similar functions like
'hv_get_vpreg' that are very parallel?

>
> [...]
>
> Besides, another problem I hit when compiled with CONFIG_HYPERV=m is:
>
> | ERROR: "screen_info" [drivers/hv/hv_vmbus.ko] undefined!
>
> , which can be fixed by the following change.
>
> Regards,
> Boqun
>
> >8
> diff --git a/arch/arm64/kernel/efi.c b/arch/arm64/kernel/efi.c
> index d0cf596db82c..8ff557ae5cc6 100644
> --- a/arch/arm64/kernel/efi.c
> +++ b/arch/arm64/kernel/efi.c
> @@ -55,6 +55,7 @@ static __init pteval_t create_mapping_protection(efi_memory_desc_t *md)
>
>  /* we will fill this structure from the stub, so don't put it in .bss */
>  struct screen_info screen_info __section(.data);
> +EXPORT_SYMBOL(screen_info);
>
>  int __init efi_create_mapping(struct mm_struct *mm, efi_memory_desc_t *md)
>  {

Agreed. I can reproduce the same problem, and will fix it as you
suggest.

Michael
[PATCH v5 6/8] arm64: hyperv: Initialize hypervisor on boot
Add ARM64-specific code to initialize the Hyper-V hypervisor when booting as a guest VM. Provide functions and data structures indicating hypervisor status that are needed by VMbus driver. This code is built only when CONFIG_HYPERV is enabled. Signed-off-by: Michael Kelley --- arch/arm64/hyperv/hv_init.c | 153 1 file changed, 153 insertions(+) diff --git a/arch/arm64/hyperv/hv_init.c b/arch/arm64/hyperv/hv_init.c index 67350ec..86e4621 100644 --- a/arch/arm64/hyperv/hv_init.c +++ b/arch/arm64/hyperv/hv_init.c @@ -13,15 +13,48 @@ #include #include #include +#include #include +#include +#include #include #include #include #include +#include +#include +#include +#include #include #include #include +#include +#include +static boolhyperv_initialized; + +struct ms_hyperv_info ms_hyperv __ro_after_init; +EXPORT_SYMBOL_GPL(ms_hyperv); + +u32*hv_vp_index; +EXPORT_SYMBOL_GPL(hv_vp_index); + +u32hv_max_vp_index; +EXPORT_SYMBOL_GPL(hv_max_vp_index); + +static int hv_cpu_init(unsigned int cpu) +{ + u64 msr_vp_index; + + hv_get_vp_index(msr_vp_index); + + hv_vp_index[smp_processor_id()] = msr_vp_index; + + if (msr_vp_index > hv_max_vp_index) + hv_max_vp_index = msr_vp_index; + + return 0; +} /* * Functions for allocating and freeing memory with size and @@ -88,6 +121,120 @@ void hv_free_hyperv_page(unsigned long addr) /* + * This function is invoked via the ACPI clocksource probe mechanism. We + * don't actually use any values from the ACPI GTDT table, but we set up + * the Hyper-V synthetic clocksource and do other initialization for + * interacting with Hyper-V the first time. Using early_initcall to invoke + * this function is too late because interrupts are already enabled at that + * point, and hv_init_clocksource() must run before interrupts are enabled. + * + * 1. Setup the guest ID. + * 2. Get features and hints info from Hyper-V + * 3. Setup per-cpu VP indices. + * 4. Initialize the Hyper-V clocksource. 
+ */ + +static int __init hyperv_init(struct acpi_table_header *table) +{ + struct hv_get_vp_register_output result; + u32 a, b, c, d; + u64 guest_id; + int i; + + /* +* If we're in a VM on Hyper-V, the ACPI hypervisor_id field will +* have the string "MsHyperV". +*/ + if (strncmp((char *)&acpi_gbl_FADT.hypervisor_id, "MsHyperV", 8)) + return -EINVAL; + + /* Setup the guest ID */ + guest_id = generate_guest_id(0, LINUX_VERSION_CODE, 0); + hv_set_vpreg(HV_REGISTER_GUEST_OSID, guest_id); + + /* Get the features and hints from Hyper-V */ + hv_get_vpreg_128(HV_REGISTER_PRIVILEGES_AND_FEATURES, &result); + ms_hyperv.features = lower_32_bits(result.registervaluelow); + ms_hyperv.misc_features = upper_32_bits(result.registervaluehigh); + + hv_get_vpreg_128(HV_REGISTER_FEATURES, &result); + ms_hyperv.hints = lower_32_bits(result.registervaluelow); + + pr_info("Hyper-V: Features 0x%x, hints 0x%x\n", + ms_hyperv.features, ms_hyperv.hints); + + /* +* Direct mode is the only option for STIMERs provided Hyper-V +* on ARM64, so Hyper-V doesn't actually set the flag. But add +* the flag so the architecture independent code in +* drivers/clocksource/hyperv_timer.c will correctly use that mode. +*/ + ms_hyperv.misc_features |= HV_STIMER_DIRECT_MODE_AVAILABLE; + + /* +* Hyper-V on ARM64 doesn't support AutoEOI. Add the hint +* that tells architecture independent code not to use this +* feature. 
+*/ + ms_hyperv.hints |= HV_DEPRECATING_AEOI_RECOMMENDED; + + /* Get information about the Hyper-V host version */ + hv_get_vpreg_128(HV_REGISTER_HYPERVISOR_VERSION, &result); + a = lower_32_bits(result.registervaluelow); + b = upper_32_bits(result.registervaluelow); + c = lower_32_bits(result.registervaluehigh); + d = upper_32_bits(result.registervaluehigh); + pr_info("Hyper-V: Host Build %d.%d.%d.%d-%d-%d\n", + b >> 16, b & 0x, a, d & 0xFF, c, d >> 24); + + /* Allocate and initialize percpu VP index array */ + hv_vp_index = kmalloc_array(num_possible_cpus(), sizeof(*hv_vp_index), + GFP_KERNEL); + if (!hv_vp_index) + return -ENOMEM; + + for (i = 0; i < num_possible_cpus(); i++) + hv_vp_index[i] = VP_INVAL; + + if (cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "arm64/hyperv_init:online", + hv_cpu_init, NULL) < 0) + goto free_vp_index; + + hv_init_clocksource(); + + hyperv_initialized = true;
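The way hyperv_init() pulls 32-bit fields out of a 128-bit register result can be sketched in isolation. This is a user-space illustration only: struct vp_reg128 and split_words() are made-up names, and lower_32_bits()/upper_32_bits() are re-implemented here as plain functions rather than taken from the kernel headers.

```c
#include <stdint.h>

/* Two 64-bit halves, mirroring registervaluelow/registervaluehigh. */
struct vp_reg128 {
	uint64_t low;
	uint64_t high;
};

static uint32_t lower_32_bits(uint64_t v)
{
	return (uint32_t)v;
}

static uint32_t upper_32_bits(uint64_t v)
{
	return (uint32_t)(v >> 32);
}

/* Split a 128-bit value into four words, lowest word first. */
static void split_words(const struct vp_reg128 *r, uint32_t w[4])
{
	w[0] = lower_32_bits(r->low);	/* "a" in the patch */
	w[1] = upper_32_bits(r->low);	/* "b" */
	w[2] = lower_32_bits(r->high);	/* "c" */
	w[3] = upper_32_bits(r->high);	/* "d" */
}
```

The four words correspond to the a, b, c and d variables that the patch feeds into its host build message.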
[PATCH v5 8/8] Drivers: hv: Enable Hyper-V code to be built on ARM64
Update drivers/hv/Kconfig so CONFIG_HYPERV can be selected on
ARM64, causing the Hyper-V specific code to be built.

Signed-off-by: Michael Kelley
---
 drivers/hv/Kconfig | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/hv/Kconfig b/drivers/hv/Kconfig
index 79e5356..1113e49 100644
--- a/drivers/hv/Kconfig
+++ b/drivers/hv/Kconfig
@@ -4,7 +4,8 @@ menu "Microsoft Hyper-V guest support"

 config HYPERV
 	tristate "Microsoft Hyper-V client drivers"
-	depends on X86 && ACPI && X86_LOCAL_APIC && HYPERVISOR_GUEST
+	depends on ACPI && \
+		((X86 && X86_LOCAL_APIC && HYPERVISOR_GUEST) || ARM64)
 	select PARAVIRT
 	select X86_HV_CALLBACK_VECTOR
 	help
--
1.8.3.1
[PATCH v5 7/8] Drivers: hv: vmbus: Add hooks for per-CPU IRQ
Add hooks to enable/disable a per-CPU IRQ for VMbus. These hooks
are in the architecture independent setup and shutdown paths for
Hyper-V, and are needed by Linux guests on Hyper-V on ARM64. The
x86/x64 implementation is null because VMbus interrupts on x86/x64
don't use an IRQ.

Signed-off-by: Michael Kelley
---
 arch/x86/include/asm/mshyperv.h | 4
 drivers/hv/hv.c                 | 3 +++
 2 files changed, 7 insertions(+)

diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
index f4138ae..583e1ce 100644
--- a/arch/x86/include/asm/mshyperv.h
+++ b/arch/x86/include/asm/mshyperv.h
@@ -56,6 +56,10 @@ typedef int (*hyperv_fill_flush_list_func)(
 #endif
 void hyperv_vector_handler(struct pt_regs *regs);

+/* On x86/x64, there isn't a real IRQ to be enabled/disable */
+static inline void hv_enable_vmbus_irq(void) {}
+static inline void hv_disable_vmbus_irq(void) {}
+
 /*
  * Routines for stimer0 Direct Mode handling.
  * On x86/x64, there are no percpu actions to take.
diff --git a/drivers/hv/hv.c b/drivers/hv/hv.c
index fcc5279..51d8f8a 100644
--- a/drivers/hv/hv.c
+++ b/drivers/hv/hv.c
@@ -180,6 +180,7 @@ void hv_synic_enable_regs(unsigned int cpu)
 	hv_set_siefp(siefp.as_uint64);

 	/* Setup the shared SINT. */
+	hv_enable_vmbus_irq();
 	hv_get_synint_state(VMBUS_MESSAGE_SINT, shared_sint.as_uint64);

 	shared_sint.vector = HYPERVISOR_CALLBACK_VECTOR;
@@ -241,6 +242,8 @@ void hv_synic_disable_regs(unsigned int cpu)
 	hv_get_synic_state(sctrl.as_uint64);
 	sctrl.enable = 0;
 	hv_set_synic_state(sctrl.as_uint64);
+
+	hv_disable_vmbus_irq();
 }

 int hv_synic_cleanup(unsigned int cpu)
--
1.8.3.1
[PATCH v5 1/8] arm64: hyperv: Add core Hyper-V include files
hyperv-tlfs.h defines Hyper-V interfaces from the Hyper-V Top Level Functional Spec (TLFS). The TLFS is distinctly oriented to x86/x64, and Hyper-V has not separated out the architecture-dependent parts into x86/x64 vs. ARM64. So hyperv-tlfs.h includes information for ARM64 that is not yet formally published. The TLFS is available here: docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/reference/tlfs mshyperv.h defines Linux-specific structures and routines for interacting with Hyper-V on ARM64, and #includes the architecture- independent part of mshyperv.h in include/asm-generic. Signed-off-by: Michael Kelley --- MAINTAINERS | 2 + arch/arm64/include/asm/hyperv-tlfs.h | 408 +++ arch/arm64/include/asm/mshyperv.h| 105 + 3 files changed, 515 insertions(+) create mode 100644 arch/arm64/include/asm/hyperv-tlfs.h create mode 100644 arch/arm64/include/asm/mshyperv.h diff --git a/MAINTAINERS b/MAINTAINERS index f04f081..d464067 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -7564,6 +7564,8 @@ F:arch/x86/include/asm/trace/hyperv.h F: arch/x86/include/asm/hyperv-tlfs.h F: arch/x86/kernel/cpu/mshyperv.c F: arch/x86/hyperv +F: arch/arm64/include/asm/hyperv-tlfs.h +F: arch/arm64/include/asm/mshyperv.h F: drivers/clocksource/hyperv_timer.c F: drivers/hid/hid-hyperv.c F: drivers/hv/ diff --git a/arch/arm64/include/asm/hyperv-tlfs.h b/arch/arm64/include/asm/hyperv-tlfs.h new file mode 100644 index 000..fe167c4 --- /dev/null +++ b/arch/arm64/include/asm/hyperv-tlfs.h @@ -0,0 +1,408 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +/* + * This file contains definitions from the Hyper-V Hypervisor Top-Level + * Functional Specification (TLFS): + * https://docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/reference/tlfs + * + * Copyright (C) 2019, Microsoft, Inc. 
+ * + * Author : Michael Kelley + */ + +#ifndef _ASM_HYPERV_TLFS_H +#define _ASM_HYPERV_TLFS_H + +#include + +/* + * All data structures defined in the TLFS that are shared between Hyper-V + * and a guest VM use Little Endian byte ordering. This matches the default + * byte ordering of Linux running on ARM64, so no special handling is required. + */ + + +/* + * While not explicitly listed in the TLFS, Hyper-V always runs with a page + * size of 4096. These definitions are used when communicating with Hyper-V + * using guest physical pages and guest physical page addresses, since the + * guest page size may not be 4096 on ARM64. + */ +#define HV_HYP_PAGE_SHIFT 12 +#define HV_HYP_PAGE_SIZE (1 << HV_HYP_PAGE_SHIFT) +#define HV_HYP_PAGE_MASK (~(HV_HYP_PAGE_SIZE - 1)) + +/* + * These Hyper-V registers provide information equivalent to the CPUID + * instruction on x86/x64. + */ +#define HV_REGISTER_HYPERVISOR_VERSION 0x0100 /*CPUID 0x4002 */ +#defineHV_REGISTER_PRIVILEGES_AND_FEATURES 0x0200 /*CPUID 0x4003 */ +#defineHV_REGISTER_FEATURES0x0201 /*CPUID 0x4004 */ +#defineHV_REGISTER_IMPLEMENTATION_LIMITS 0x0202 /*CPUID 0x4005 */ +#define HV_ARM64_REGISTER_INTERFACE_VERSION0x00090006 /*CPUID 0x4001 */ + +/* + * Feature identification. HvRegisterPrivilegesAndFeaturesInfo returns a + * 128-bit value with flags indicating which features are available to the + * partition based upon the current partition privileges. The 128-bit + * value is broken up with different portions stored in different 32-bit + * fields in the ms_hyperv structure. 
+ */ + +/* Partition Reference Counter available*/ +#define HV_MSR_TIME_REF_COUNT_AVAILABLEBIT(1) + +/* + * Synthetic Timers available + */ +#define HV_MSR_SYNTIMER_AVAILABLE BIT(3) + +/* Frequency MSRs available */ +#define HV_FEATURE_FREQUENCY_MSRS_AVAILABLEBIT(8) + +/* Reference TSC available */ +#define HV_MSR_REFERENCE_TSC_AVAILABLE BIT(9) + +/* Crash MSR available */ +#define HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE BIT(10) + + +/* + * This group of flags is in the high order 64-bits of the returned + * 128-bit value. + */ + +/* STIMER direct mode is available */ +#define HV_STIMER_DIRECT_MODE_AVAILABLEBIT(19) + +/* + * Implementation recommendations in register + * HvRegisterFeaturesInfo. Indicates which behaviors the hypervisor + * recommends the OS implement for optimal performance. + */ + +/* + * Recommend not using Auto EOI + */ +#define HV_DEPRECATING_AEOI_RECOMMENDEDBIT(9) + +/* + * Synthetic register definitions equivalent to MSRs on x86/x64 + */ +#define HV_REGISTER_CRASH_P0 0x0210 +#define HV_REGISTER_CRASH_P1 0x0211 +#define HV_REGISTER_CRASH_P2 0x0212 +#define HV_REGISTER_CRASH_P3 0x0213 +#define HV_REGISTER_CRASH_P4 0x0214 +#define HV_REGISTER_CRASH_CTL 0x0215 + +#define HV_REGISTER_GUES
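How the feature and hint flags defined above are meant to be consumed can be shown with a short user-space sketch. The struct hv_info type and the hv_can_use_stimer_direct() helper are hypothetical stand-ins, not part of the patch; only the two bit positions are taken from the header, and BIT() is re-defined locally.

```c
#include <stdbool.h>
#include <stdint.h>

#define BIT(n) (1U << (n))

/* Bit positions as listed in the header above. */
#define HV_MSR_SYNTIMER_AVAILABLE        BIT(3)
#define HV_STIMER_DIRECT_MODE_AVAILABLE  BIT(19)

/* Hypothetical, trimmed-down stand-in for struct ms_hyperv_info. */
struct hv_info {
	uint32_t features;	/* from the low 64 bits */
	uint32_t misc_features;	/* from the high 64 bits */
};

/* True only if synthetic timers exist and direct mode is flagged. */
static bool hv_can_use_stimer_direct(const struct hv_info *info)
{
	return (info->features & HV_MSR_SYNTIMER_AVAILABLE) &&
	       (info->misc_features & HV_STIMER_DIRECT_MODE_AVAILABLE);
}
```

Because the 128-bit value is split across separate 32-bit fields, each flag has to be tested against the field that holds its portion of the value.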
[PATCH v5 3/8] arm64: hyperv: Add memory alloc/free functions for Hyper-V size pages
Add ARM64-specific code to allocate memory with HV_HYP_PAGE_SIZE
size and alignment. These are for use when pages need to be shared
with Hyper-V. Separate functions are needed as the page size used by
Hyper-V may not be the same as the guest page size. Free operations
are rarely done, so no attempt is made to combine freed pages into
larger chunks.

This code is built only when CONFIG_HYPERV is enabled.

Signed-off-by: Michael Kelley
---
 arch/arm64/hyperv/hv_init.c    | 68 ++
 include/asm-generic/mshyperv.h | 5
 2 files changed, 73 insertions(+)

diff --git a/arch/arm64/hyperv/hv_init.c b/arch/arm64/hyperv/hv_init.c
index 6808bc8..9c294f6 100644
--- a/arch/arm64/hyperv/hv_init.c
+++ b/arch/arm64/hyperv/hv_init.c
@@ -15,10 +15,78 @@
 #include
 #include
 #include
+#include
+#include
+#include
 #include
 #include
 #include

+
+/*
+ * Functions for allocating and freeing memory with size and
+ * alignment HV_HYP_PAGE_SIZE. These functions are needed because
+ * the guest page size may not be the same as the Hyper-V page
+ * size. And while kalloc() could allocate the memory, it does not
+ * guarantee the required alignment. So a separate small memory
+ * allocator is needed. The free function is rarely used, so it
+ * does not try to combine freed pages into larger chunks.
+ *
+ * These functions are used by arm64 specific code as well as
+ * arch independent Hyper-V drivers.
+ */
+
+static DEFINE_SPINLOCK(free_list_lock);
+static struct list_head free_list = LIST_HEAD_INIT(free_list);
+
+void *hv_alloc_hyperv_page(void)
+{
+	int i;
+	struct list_head *hv_page;
+	unsigned long addr;
+
+	BUILD_BUG_ON(HV_HYP_PAGE_SIZE > PAGE_SIZE);
+
+	spin_lock(&free_list_lock);
+	if (list_empty(&free_list)) {
+		spin_unlock(&free_list_lock);
+		addr = __get_free_page(GFP_KERNEL);
+		spin_lock(&free_list_lock);
+		for (i = 0; i < PAGE_SIZE; i += HV_HYP_PAGE_SIZE)
+			list_add_tail((struct list_head *)(addr + i),
+					&free_list);
+	}
+	hv_page = free_list.next;
+	list_del(hv_page);
+	spin_unlock(&free_list_lock);
+
+	return hv_page;
+}
+EXPORT_SYMBOL_GPL(hv_alloc_hyperv_page);
+
+void *hv_alloc_hyperv_zeroed_page(void)
+{
+	void *memp;
+
+	memp = hv_alloc_hyperv_page();
+	memset(memp, 0, HV_HYP_PAGE_SIZE);
+
+	return memp;
+}
+EXPORT_SYMBOL_GPL(hv_alloc_hyperv_zeroed_page);
+
+
+void hv_free_hyperv_page(unsigned long addr)
+{
+	if (!addr)
+		return;
+	spin_lock(&free_list_lock);
+	list_add((struct list_head *)addr, &free_list);
+	spin_unlock(&free_list_lock);
+}
+EXPORT_SYMBOL_GPL(hv_free_hyperv_page);
+
+
 /*
  * hv_do_hypercall- Invoke the specified hypercall
  */
diff --git a/include/asm-generic/mshyperv.h b/include/asm-generic/mshyperv.h
index 18d8e2d..f9f3b66 100644
--- a/include/asm-generic/mshyperv.h
+++ b/include/asm-generic/mshyperv.h
@@ -99,6 +99,11 @@ static inline void vmbus_signal_eom(struct hv_message *msg, u32 old_msg_type)
 void hv_setup_crash_handler(void (*handler)(struct pt_regs *regs));
 void hv_remove_crash_handler(void);

+void *hv_alloc_hyperv_page(void);
+void *hv_alloc_hyperv_zeroed_page(void);
+void hv_free_hyperv_page(unsigned long addr);
+
+
 #if IS_ENABLED(CONFIG_HYPERV)
 /*
  * Hypervisor's notion of virtual processor ID is different from
--
1.8.3.1
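The free-list scheme in this patch (carve one guest page into Hyper-V sized chunks, hand them out one at a time, never merge on free) can be tried out in user space with the sketch below. It is a single-threaded approximation under stated assumptions: a 64K guest page, aligned_alloc() in place of __get_free_page(), a hand-rolled singly linked list in place of struct list_head, no locking, and made-up function names. Unlike the patch as posted, it checks the result of the backing allocation.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define GUEST_PAGE_SIZE  65536UL	/* assumed 64K guest page */
#define HV_HYP_PAGE_SIZE 4096UL

/* Each free chunk stores its list link in its own first bytes. */
struct chunk {
	struct chunk *next;
};

static struct chunk *free_list;

static void *hv_page_alloc(void)
{
	struct chunk *c;

	if (!free_list) {
		/* Refill: carve one guest page into Hyper-V sized chunks. */
		char *page = aligned_alloc(GUEST_PAGE_SIZE, GUEST_PAGE_SIZE);
		size_t off;

		if (!page)
			return NULL;
		for (off = 0; off < GUEST_PAGE_SIZE; off += HV_HYP_PAGE_SIZE) {
			c = (struct chunk *)(page + off);
			c->next = free_list;
			free_list = c;
		}
	}
	/* Pop the chunk at the head of the list. */
	c = free_list;
	free_list = c->next;
	return c;
}

static void hv_page_free(void *p)
{
	struct chunk *c = p;

	if (!c)
		return;
	/* Push back without trying to merge chunks into a guest page. */
	c->next = free_list;
	free_list = c;
}
```

With these sizes one refill yields sixteen chunks, and a freed chunk is the next one handed out again.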
[PATCH v5 0/8] Enable Linux guests on Hyper-V on ARM64
This series enables Linux guests running on Hyper-V on ARM64
hardware. New ARM64-specific code in arch/arm64/hyperv initializes
Hyper-V, including its interrupts and hypercall mechanism.
Existing architecture independent drivers for Hyper-V's VMbus and
synthetic devices just work when built for ARM64. Hyper-V code is
built and included in the image and modules only if CONFIG_HYPERV
is enabled.

The eight patches are organized as follows:

1) Add include files that define the Hyper-V interface as
   described in the Hyper-V Top Level Functional Spec (TLFS), plus
   additional definitions specific to Linux running on Hyper-V.

2 thru 6) Add core Hyper-V support on ARM64, including hypercalls,
   interrupt handlers, kexec & panic handlers, and core hypervisor
   initialization.

7) Update the existing VMbus driver to generalize interrupt
   management across x86/x64 and ARM64.

8) Make CONFIG_HYPERV selectable on ARM64 in addition to x86/x64.

Some areas of Linux guests on Hyper-V on ARM64 are a
work-in-progress:

* Hyper-V on ARM64 currently runs with a 4 Kbyte page size, but
  allows guests with 16K/64K page size. However, the Linux drivers
  for Hyper-V synthetic devices assume the guest page size is 4K.
  This patch set lays the groundwork for larger guest page sizes,
  but the main page size changes are in a different patch stream
  that is underway to update these drivers.

* The Hyper-V vPCI driver at drivers/pci/host/pci-hyperv.c has
  x86/x64-specific code and is not being built for ARM64. Fixing
  this driver to enable vPCI devices on ARM64 will be done later.

In a few cases, terminology from the x86/x64 world has been carried
over into the ARM64 code ("MSR", "TSC"). Hyper-V still uses the
x86/x64 terminology and has not replaced it with something more
generic, so the code uses the Hyper-V terminology. This will be
fixed when Hyper-V updates the usage in the TLFS.

This patch set is based on the 5.4-rc1-next-20191001 tree.

Changes in v5:
* Minor fixups to rebase to 5.4-rc1 linux-next

Changes in v4:
* Moved clock-related code into an architecture independent
  Hyper-V clocksource driver that is already upstream. Clock
  related code is removed from this patch set except for the
  ARM64 specific interrupt handler. [Marc Zyngier]
* Separately upstreamed the split of mshyperv.h into arch
  independent and arch dependent portions. The arch independent
  portion has been removed from this patch set.
* Divided patch #2 of the series into multiple smaller patches
  [Marc Zyngier]
* Changed a dozen or so smaller things based on feedback
  [Marc Zyngier, Will Deacon]
* Added functions to alloc/free Hyper-V size pages for use by
  drivers for Hyper-V synthetic devices when updated to not
  assume guest page size and Hyper-V page size are the same

Changes in v3:
* Added initialization of hv_vp_index array like was recently
  added on x86 branch [KY Srinivasan]
* Changed Hyper-V ARM64 register symbols to be all uppercase
  instead of mixed case [KY Srinivasan]
* Separated mshyperv.h into two files, one architecture
  independent and one architecture dependent. After this code
  is upstream, will make changes to the x86 code to use the
  architecture independent file and remove duplication. And
  once we have a multi-architecture Hyper-V TLFS, will do a
  separate patch to split hyperv-tlfs.h in the same way.
  [KY Srinivasan]
* Minor tweaks to rebase to latest linux-next code

Changes in v2:
* Removed patch to implement slow_virt_to_phys() on ARM64.
  Use of slow_virt_to_phys() in arch independent Hyper-V
  drivers has been eliminated by commit 6ba34171bcbd
  ("Drivers: hv: vmbus: Remove use of slow_virt_to_phys()")
* Minor tweaks to rebase to latest linux-next code

Michael Kelley (8):
  arm64: hyperv: Add core Hyper-V include files
  arm64: hyperv: Add hypercall and register access functions
  arm64: hyperv: Add memory alloc/free functions for Hyper-V size pages
  arm64: hyperv: Add interrupt handlers for VMbus and stimer
  arm64: hyperv: Add kexec and panic handlers
  arm64: hyperv: Initialize hypervisor on boot
  Drivers: hv: vmbus: Add hooks for per-CPU IRQ
  Drivers: hv: Enable Hyper-V code to be built on ARM64

 MAINTAINERS                          |   3 +
 arch/arm64/Kbuild                    |   1 +
 arch/arm64/hyperv/Makefile           |   2 +
 arch/arm64/hyperv/hv_hvc.S           |  44
 arch/arm64/hyperv/hv_init.c          | 415 +++
 arch/arm64/hyperv/mshyperv.c         | 165 ++
 arch/arm64/include/asm/hyperv-tlfs.h | 408 ++
 arch/arm64/include/asm/mshyperv.h    | 105 +
 arch/x86/include/asm/mshyperv.h      |   4 +
 drivers/hv/Kconfig                   |   3 +-
 drivers/hv/hv.c                      |   3 +
 include/asm-generic/mshyperv.h       |   5 +
 12 files changed, 1157 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm64/hyperv/Makefile
 create mode 10
[PATCH v5 2/8] arm64: hyperv: Add hypercall and register access functions
Add ARM64-specific code to make Hyper-V hypercalls and to access virtual processor synthetic registers via hypercalls. Hypercalls use a Hyper-V specific calling sequence with a non-zero immediate value per Section 2.9 of the SMC Calling Convention spec. This code is architecture dependent and is mostly driven by architecture independent code in the VMbus driver and the Hyper-V timer clocksource driver. This code is built only when CONFIG_HYPERV is enabled. Signed-off-by: Michael Kelley --- MAINTAINERS | 1 + arch/arm64/Kbuild | 1 + arch/arm64/hyperv/Makefile | 2 + arch/arm64/hyperv/hv_hvc.S | 44 +++ arch/arm64/hyperv/hv_init.c | 133 5 files changed, 181 insertions(+) create mode 100644 arch/arm64/hyperv/Makefile create mode 100644 arch/arm64/hyperv/hv_hvc.S create mode 100644 arch/arm64/hyperv/hv_init.c diff --git a/MAINTAINERS b/MAINTAINERS index d464067..84f76f9 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -7566,6 +7566,7 @@ F:arch/x86/kernel/cpu/mshyperv.c F: arch/x86/hyperv F: arch/arm64/include/asm/hyperv-tlfs.h F: arch/arm64/include/asm/mshyperv.h +F: arch/arm64/hyperv F: drivers/clocksource/hyperv_timer.c F: drivers/hid/hid-hyperv.c F: drivers/hv/ diff --git a/arch/arm64/Kbuild b/arch/arm64/Kbuild index d646582..2469421 100644 --- a/arch/arm64/Kbuild +++ b/arch/arm64/Kbuild @@ -3,4 +3,5 @@ obj-y += kernel/ mm/ obj-$(CONFIG_NET) += net/ obj-$(CONFIG_KVM) += kvm/ obj-$(CONFIG_XEN) += xen/ +obj-$(CONFIG_HYPERV) += hyperv/ obj-$(CONFIG_CRYPTO) += crypto/ diff --git a/arch/arm64/hyperv/Makefile b/arch/arm64/hyperv/Makefile new file mode 100644 index 000..6bd8439 --- /dev/null +++ b/arch/arm64/hyperv/Makefile @@ -0,0 +1,2 @@ +# SPDX-License-Identifier: GPL-2.0 +obj-y := hv_init.o hv_hvc.o diff --git a/arch/arm64/hyperv/hv_hvc.S b/arch/arm64/hyperv/hv_hvc.S new file mode 100644 index 000..09324ac --- /dev/null +++ b/arch/arm64/hyperv/hv_hvc.S @@ -0,0 +1,44 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +/* + * Microsoft Hyper-V hypervisor invocation routines + * + * 
Copyright (C) 2018, Microsoft, Inc. + * + * Author : Michael Kelley + */ + +#include + + .text +/* + * Do the HVC instruction. For Hyper-V the argument is always 1. + * x0 contains the hypercall control value, while additional registers + * vary depending on the hypercall, and whether the hypercall arguments + * are in memory or in registers (a "fast" hypercall per the Hyper-V + * TLFS). When the arguments are in memory x1 is the guest physical + * address of the input arguments, and x2 is the guest physical + * address of the output arguments. When the arguments are in + * registers, the register values depends on the hypercall. Note + * that this version cannot return any values in registers. + */ +ENTRY(hv_do_hvc) + hvc #1 + ret +ENDPROC(hv_do_hvc) + +/* + * This variant of HVC invocation is for hv_get_vpreg and + * hv_get_vpreg_128. The input parameters are passed in registers + * along with a pointer in x4 to where the output result should + * be stored. The output is returned in x15 and x16. x18 is used as + * scratch space to avoid buildng a stack frame, as Hyper-V does + * not preserve registers x0-x17. + */ +ENTRY(hv_do_hvc_fast_get) + mov x18, x4 + hvc #1 + str x15,[x18] + str x16,[x18,#8] + ret +ENDPROC(hv_do_hvc_fast_get) diff --git a/arch/arm64/hyperv/hv_init.c b/arch/arm64/hyperv/hv_init.c new file mode 100644 index 000..6808bc8 --- /dev/null +++ b/arch/arm64/hyperv/hv_init.c @@ -0,0 +1,133 @@ +// SPDX-License-Identifier: GPL-2.0 + +/* + * Initialization of the interface with Microsoft's Hyper-V hypervisor, + * and various low level utility routines for interacting with Hyper-V. + * + * Copyright (C) 2019, Microsoft, Inc. + * + * Author : Michael Kelley + */ + + +#include +#include +#include +#include +#include +#include +#include +#include + +/* + * hv_do_hypercall- Invoke the specified hypercall + */ +u64 hv_do_hypercall(u64 control, void *input, void *output) +{ + u64 input_address; + u64 output_address; + + input_address = input ? 
virt_to_phys(input) : 0; + output_address = output ? virt_to_phys(output) : 0; + return hv_do_hvc(control, input_address, output_address); +} +EXPORT_SYMBOL_GPL(hv_do_hypercall); + +/* + * hv_do_fast_hypercall8 -- Invoke the specified hypercall + * with arguments in registers instead of physical memory. + * Avoids the overhead of virt_to_phys for simple hypercalls. + */ + +u64 hv_do_fast_hypercall8(u16 code, u64 input) +{ + u64 control; + + control = (u64)code | HV_HYPERCALL_FAST_BIT; + return hv_do_hvc(control, input); +} +EXPORT_SYMBOL_GPL(hv_do_fast_hypercall8); + + +/* + * Set a single VP register to a 64-bit value. + */ +void hv_set_vpreg(u32 msr, u64 value) +
[PATCH v5 5/8] arm64: hyperv: Add kexec and panic handlers
Add functions to set up and remove kexec and panic handlers, and to inform Hyper-V about a guest panic. These functions are called from architecture independent code in the VMbus driver. This code is built only when CONFIG_HYPERV is enabled. Signed-off-by: Michael Kelley --- arch/arm64/hyperv/hv_init.c | 61 arch/arm64/hyperv/mshyperv.c | 26 +++ 2 files changed, 87 insertions(+) diff --git a/arch/arm64/hyperv/hv_init.c b/arch/arm64/hyperv/hv_init.c index 9c294f6..67350ec 100644 --- a/arch/arm64/hyperv/hv_init.c +++ b/arch/arm64/hyperv/hv_init.c @@ -199,3 +199,64 @@ void hv_get_vpreg_128(u32 msr, struct hv_get_vp_register_output *result) } EXPORT_SYMBOL_GPL(hv_get_vpreg_128); + +void hyperv_report_panic(struct pt_regs *regs, long err) +{ + static bool panic_reported; + u64 guest_id; + + /* +* We prefer to report panic on 'die' chain as we have proper +* registers to report, but if we miss it (e.g. on BUG()) we need +* to report it on 'panic'. +*/ + if (panic_reported) + return; + panic_reported = true; + + guest_id = hv_get_vpreg(HV_REGISTER_GUEST_OSID); + + /* +* Hyper-V provides the ability to store only 5 values. +* Pick the passed in error value, the guest_id, and the PC. +* The first two general registers are added arbitrarily. 
+*/ + hv_set_vpreg(HV_REGISTER_CRASH_P0, err); + hv_set_vpreg(HV_REGISTER_CRASH_P1, guest_id); + hv_set_vpreg(HV_REGISTER_CRASH_P2, regs->pc); + hv_set_vpreg(HV_REGISTER_CRASH_P3, regs->regs[0]); + hv_set_vpreg(HV_REGISTER_CRASH_P4, regs->regs[1]); + + /* +* Let Hyper-V know there is crash data available +*/ + hv_set_vpreg(HV_REGISTER_CRASH_CTL, HV_CRASH_CTL_CRASH_NOTIFY); +} +EXPORT_SYMBOL_GPL(hyperv_report_panic); + +/* + * hyperv_report_panic_msg - report panic message to Hyper-V + * @pa: physical address of the panic page containing the message + * @size: size of the message in the page + */ +void hyperv_report_panic_msg(phys_addr_t pa, size_t size) +{ + /* +* P3 to contain the physical address of the panic page & P4 to +* contain the size of the panic data in that page. Rest of the +* registers are no-op when the NOTIFY_MSG flag is set. +*/ + hv_set_vpreg(HV_REGISTER_CRASH_P0, 0); + hv_set_vpreg(HV_REGISTER_CRASH_P1, 0); + hv_set_vpreg(HV_REGISTER_CRASH_P2, 0); + hv_set_vpreg(HV_REGISTER_CRASH_P3, pa); + hv_set_vpreg(HV_REGISTER_CRASH_P4, size); + + /* +* Let Hyper-V know there is crash data available along with +* the panic message. 
+*/ + hv_set_vpreg(HV_REGISTER_CRASH_CTL, + (HV_CRASH_CTL_CRASH_NOTIFY | HV_CRASH_CTL_CRASH_NOTIFY_MSG)); +} +EXPORT_SYMBOL_GPL(hyperv_report_panic_msg); diff --git a/arch/arm64/hyperv/mshyperv.c b/arch/arm64/hyperv/mshyperv.c index ae6ece6..c58940d 100644 --- a/arch/arm64/hyperv/mshyperv.c +++ b/arch/arm64/hyperv/mshyperv.c @@ -23,6 +23,8 @@ static void (*vmbus_handler)(void); static void (*hv_stimer0_handler)(void); +static void (*hv_kexec_handler)(void); +static void (*hv_crash_handler)(struct pt_regs *regs); static int vmbus_irq; static long __percpu *vmbus_evt; @@ -137,3 +139,27 @@ void hv_remove_stimer0_irq(int irq) } } EXPORT_SYMBOL_GPL(hv_remove_stimer0_irq); + +void hv_setup_kexec_handler(void (*handler)(void)) +{ + hv_kexec_handler = handler; +} +EXPORT_SYMBOL_GPL(hv_setup_kexec_handler); + +void hv_remove_kexec_handler(void) +{ + hv_kexec_handler = NULL; +} +EXPORT_SYMBOL_GPL(hv_remove_kexec_handler); + +void hv_setup_crash_handler(void (*handler)(struct pt_regs *regs)) +{ + hv_crash_handler = handler; +} +EXPORT_SYMBOL_GPL(hv_setup_crash_handler); + +void hv_remove_crash_handler(void) +{ + hv_crash_handler = NULL; +} +EXPORT_SYMBOL_GPL(hv_remove_crash_handler); -- 1.8.3.1
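The report-once logic in hyperv_report_panic() can be tried out in user space with a mock register file. The following is only a sketch under assumptions: mock_set_vpreg(), mock_regs and MOCK_CRASH_NOTIFY are hypothetical stand-ins for hv_set_vpreg(), the Hyper-V crash registers and HV_CRASH_CTL_CRASH_NOTIFY, and the notify bit value is assumed rather than taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mock of the crash registers P0..P4 and CTL (illustrative only). */
enum { CRASH_P0, CRASH_P1, CRASH_P2, CRASH_P3, CRASH_P4, CRASH_CTL,
       CRASH_NREGS };

/* Assumed value; the real flag is defined in hyperv-tlfs.h. */
#define MOCK_CRASH_NOTIFY (1ULL << 63)

static uint64_t mock_regs[CRASH_NREGS];
static unsigned int mock_notify_count;

/* Stand-in for hv_set_vpreg(): record the write instead of a hypercall. */
static void mock_set_vpreg(int reg, uint64_t value)
{
	mock_regs[reg] = value;
	if (reg == CRASH_CTL)
		mock_notify_count++;
}

/* Same shape as hyperv_report_panic(): only the first call reports. */
static void model_report_panic(uint64_t err, uint64_t guest_id, uint64_t pc,
			       uint64_t x0, uint64_t x1)
{
	static bool panic_reported;

	if (panic_reported)
		return;
	panic_reported = true;

	/* Five data values, then the control write that signals Hyper-V. */
	mock_set_vpreg(CRASH_P0, err);
	mock_set_vpreg(CRASH_P1, guest_id);
	mock_set_vpreg(CRASH_P2, pc);
	mock_set_vpreg(CRASH_P3, x0);
	mock_set_vpreg(CRASH_P4, x1);
	mock_set_vpreg(CRASH_CTL, MOCK_CRASH_NOTIFY);
}
```

Calling model_report_panic() twice leaves the values from the first call in place, which is the behaviour the 'die' versus 'panic' comment in the patch relies on.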
[PATCH v5 4/8] arm64: hyperv: Add interrupt handlers for VMbus and stimer
Add ARM64-specific code to set up and handle the interrupts generated by Hyper-V for VMbus messages and for stimer expiration. This code is architecture dependent and is mostly driven by architecture independent code in the VMbus driver and the Hyper-V timer clocksource driver. This code is built only when CONFIG_HYPERV is enabled. Signed-off-by: Michael Kelley --- arch/arm64/hyperv/Makefile | 2 +- arch/arm64/hyperv/mshyperv.c | 139 +++ 2 files changed, 140 insertions(+), 1 deletion(-) create mode 100644 arch/arm64/hyperv/mshyperv.c diff --git a/arch/arm64/hyperv/Makefile b/arch/arm64/hyperv/Makefile index 6bd8439..988eda5 100644 --- a/arch/arm64/hyperv/Makefile +++ b/arch/arm64/hyperv/Makefile @@ -1,2 +1,2 @@ # SPDX-License-Identifier: GPL-2.0 -obj-y := hv_init.o hv_hvc.o +obj-y := hv_init.o hv_hvc.o mshyperv.o diff --git a/arch/arm64/hyperv/mshyperv.c b/arch/arm64/hyperv/mshyperv.c new file mode 100644 index 000..ae6ece6 --- /dev/null +++ b/arch/arm64/hyperv/mshyperv.c @@ -0,0 +1,139 @@ +// SPDX-License-Identifier: GPL-2.0 + +/* + * Core routines for interacting with Microsoft's Hyper-V hypervisor, + * including setting up VMbus and STIMER interrupts, and handling + * crashes and kexecs. These interactions are through a set of + * static "handler" variables set by the architecture independent + * VMbus and STIMER drivers. + * + * Copyright (C) 2019, Microsoft, Inc. 
+ * + * Author : Michael Kelley + */ + +#include +#include +#include +#include +#include +#include +#include +#include + +static void (*vmbus_handler)(void); +static void (*hv_stimer0_handler)(void); + +static int vmbus_irq; +static long __percpu *vmbus_evt; +static long __percpu *stimer0_evt; + +irqreturn_t hyperv_vector_handler(int irq, void *dev_id) +{ + vmbus_handler(); + return IRQ_HANDLED; +} + +/* Must be done just once */ +void hv_setup_vmbus_irq(void (*handler)(void)) +{ + int result; + + vmbus_handler = handler; + vmbus_irq = acpi_register_gsi(NULL, HYPERVISOR_CALLBACK_VECTOR, +ACPI_LEVEL_SENSITIVE, ACPI_ACTIVE_HIGH); + if (vmbus_irq <= 0) { + pr_err("Can't register Hyper-V VMBus GSI. Error %d", + vmbus_irq); + vmbus_irq = 0; + return; + } + vmbus_evt = alloc_percpu(long); + result = request_percpu_irq(vmbus_irq, hyperv_vector_handler, + "Hyper-V VMbus", vmbus_evt); + if (result) { + pr_err("Can't request Hyper-V VMBus IRQ %d. Error %d", + vmbus_irq, result); + free_percpu(vmbus_evt); + acpi_unregister_gsi(vmbus_irq); + vmbus_irq = 0; + } +} +EXPORT_SYMBOL_GPL(hv_setup_vmbus_irq); + +/* Must be done just once */ +void hv_remove_vmbus_irq(void) +{ + if (vmbus_irq) { + free_percpu_irq(vmbus_irq, vmbus_evt); + free_percpu(vmbus_evt); + acpi_unregister_gsi(vmbus_irq); + } +} +EXPORT_SYMBOL_GPL(hv_remove_vmbus_irq); + +/* Must be done by each CPU */ +void hv_enable_vmbus_irq(void) +{ + enable_percpu_irq(vmbus_irq, 0); +} +EXPORT_SYMBOL_GPL(hv_enable_vmbus_irq); + +/* Must be done by each CPU */ +void hv_disable_vmbus_irq(void) +{ + disable_percpu_irq(vmbus_irq); +} +EXPORT_SYMBOL_GPL(hv_disable_vmbus_irq); + +/* Routines to do per-architecture handling of STIMER0 when in Direct Mode */ + +static irqreturn_t hv_stimer0_vector_handler(int irq, void *dev_id) +{ + if (hv_stimer0_handler) + hv_stimer0_handler(); + return IRQ_HANDLED; +} + +int hv_setup_stimer0_irq(int *irq, int *vector, void (*handler)(void)) +{ + int localirq; + int result; + + localirq = 
acpi_register_gsi(NULL, HV_STIMER0_IRQNR, + ACPI_LEVEL_SENSITIVE, ACPI_ACTIVE_HIGH); + if (localirq <= 0) { + pr_err("Can't register Hyper-V stimer0 GSI. Error %d", + localirq); + *irq = 0; + return -1; + } + stimer0_evt = alloc_percpu(long); + result = request_percpu_irq(localirq, hv_stimer0_vector_handler, +"Hyper-V stimer0", stimer0_evt); + if (result) { + pr_err("Can't request Hyper-V stimer0 IRQ %d. Error %d", + localirq, result); + free_percpu(stimer0_evt); + acpi_unregister_gsi(localirq); + *irq = 0; + return -1; + } + + hv_stimer0_handler = handler; + *vector = HV_STIMER0_IRQNR; + *irq = localirq; + return 0; +} +EXPORT_SYMBOL_GPL(hv_setup_stimer0_irq); + +void hv_remove_stimer0_irq(int irq) +{ + hv_stimer0_handler = NULL; +
RE: [PATCH v4 0/8] Enable Linux guests on Hyper-V on ARM64
From: Michael Kelley Sent: Tuesday, August 6, 2019 1:31 PM
>
> This series enables Linux guests running on Hyper-V on ARM64
> hardware. New ARM64-specific code in arch/arm64/hyperv initializes
> Hyper-V, including its interrupts and hypercall mechanism.
> Existing architecture independent drivers for Hyper-V's VMbus and
> synthetic devices just work when built for ARM64. Hyper-V code is
> built and included in the image and modules only if CONFIG_HYPERV
> is enabled.
>
> The eight patches are organized as follows:
> 1) Add include files that define the Hyper-V interface as
>    described in the Hyper-V Top Level Functional Spec (TLFS), plus
>    additional definitions specific to Linux running on Hyper-V.
>
> 2 thru 6) Add core Hyper-V support on ARM64, including hypercalls,
>    interrupt handlers, kexec & panic handlers, and core hypervisor
>    initialization.
>
> 7) Update the existing VMbus driver to generalize interrupt
>    management across x86/x64 and ARM64.
>
> 8) Make CONFIG_HYPERV selectable on ARM64 in addition to x86/x64.
>

I'm hoping to get some feedback from the ARM64 maintainers
on this series. Previous feedback has been incorporated, so
it should be close to being able to go in.

Michael
[PATCH v4 7/8] Drivers: hv: vmbus: Add hooks for per-CPU IRQ
Add hooks to enable/disable a per-CPU IRQ for VMbus. These hooks
are in the architecture independent setup and shutdown paths for
Hyper-V, and are needed by Linux guests on Hyper-V on ARM64. The
x86/x64 implementation is null because VMbus interrupts on x86/x64
don't use an IRQ.

Signed-off-by: Michael Kelley
---
 arch/x86/include/asm/mshyperv.h | 4 ++++
 drivers/hv/hv.c                 | 2 ++
 2 files changed, 6 insertions(+)

diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
index f4138ae..583e1ce 100644
--- a/arch/x86/include/asm/mshyperv.h
+++ b/arch/x86/include/asm/mshyperv.h
@@ -56,6 +56,10 @@ typedef int (*hyperv_fill_flush_list_func)(
 #endif
 void hyperv_vector_handler(struct pt_regs *regs);
 
+/* On x86/x64, there isn't a real IRQ to be enabled/disabled */
+static inline void hv_enable_vmbus_irq(void) {}
+static inline void hv_disable_vmbus_irq(void) {}
+
 /*
  * Routines for stimer0 Direct Mode handling.
  * On x86/x64, there are no percpu actions to take.
diff --git a/drivers/hv/hv.c b/drivers/hv/hv.c
index 6188fb7..86f5435 100644
--- a/drivers/hv/hv.c
+++ b/drivers/hv/hv.c
@@ -180,6 +180,7 @@ int hv_synic_init(unsigned int cpu)
 	hv_set_siefp(siefp.as_uint64);
 
 	/* Setup the shared SINT. */
+	hv_enable_vmbus_irq();
 	hv_get_synint_state(VMBUS_MESSAGE_SINT, shared_sint.as_uint64);
 
 	shared_sint.vector = HYPERVISOR_CALLBACK_VECTOR;
@@ -272,6 +273,7 @@ int hv_synic_cleanup(unsigned int cpu)
 	/* Disable the global synic bit */
 	sctrl.enable = 0;
 	hv_set_synic_state(sctrl.as_uint64);
+	hv_disable_vmbus_irq();
 
 	return 0;
 }
-- 
1.8.3.1
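The hook pattern used by this patch (empty inline stubs on one architecture, real functions on another, one unconditional call site) can be illustrated outside the kernel. This is a sketch only: the model_* names and the MODEL_ARCH_HAS_VMBUS_IRQ switch are hypothetical and do not appear in the series.

```c
/*
 * Define MODEL_ARCH_HAS_VMBUS_IRQ to model an architecture with a
 * per-CPU VMbus IRQ (ARM64 in this series); leave it undefined to
 * model x86/x64, where the hooks are empty.
 */
#ifdef MODEL_ARCH_HAS_VMBUS_IRQ
static int model_irq_enabled;
static inline void model_enable_vmbus_irq(void)  { model_irq_enabled = 1; }
static inline void model_disable_vmbus_irq(void) { model_irq_enabled = 0; }
#else
/* No per-CPU IRQ here: the hooks compile to nothing. */
static inline void model_enable_vmbus_irq(void)  {}
static inline void model_disable_vmbus_irq(void) {}
#endif

/* Arch-independent setup path: calls the hook unconditionally. */
static int model_synic_init(void)
{
	model_enable_vmbus_irq();
	/* ... program the shared SINT here ... */
	return 0;
}

/* Arch-independent shutdown path: mirror image of the setup path. */
static int model_synic_cleanup(void)
{
	/* ... disable the SynIC here ... */
	model_disable_vmbus_irq();
	return 0;
}
```

The point of the design is that the shared code needs no #ifdef of its own; the architecture header decides whether the call does anything.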
[PATCH v4 0/8] Enable Linux guests on Hyper-V on ARM64
This series enables Linux guests running on Hyper-V on ARM64
hardware. New ARM64-specific code in arch/arm64/hyperv initializes
Hyper-V, including its interrupts and hypercall mechanism.
Existing architecture independent drivers for Hyper-V's VMbus and
synthetic devices just work when built for ARM64. Hyper-V code is
built and included in the image and modules only if CONFIG_HYPERV
is enabled.

The eight patches are organized as follows:
1) Add include files that define the Hyper-V interface as
   described in the Hyper-V Top Level Functional Spec (TLFS), plus
   additional definitions specific to Linux running on Hyper-V.

2 thru 6) Add core Hyper-V support on ARM64, including hypercalls,
   interrupt handlers, kexec & panic handlers, and core hypervisor
   initialization.

7) Update the existing VMbus driver to generalize interrupt
   management across x86/x64 and ARM64.

8) Make CONFIG_HYPERV selectable on ARM64 in addition to x86/x64.

Some areas of Linux guests on Hyper-V on ARM64 are a
work-in-progress:

* Hyper-V on ARM64 currently runs with a 4 Kbyte page size, but
  allows guests with 16K/64K page size. However, the Linux drivers
  for Hyper-V synthetic devices assume the guest page size is 4K.
  This patch set lays the groundwork for larger guest page sizes,
  but the main changes are in a different patch stream that is
  underway to update these drivers.

* The Hyper-V vPCI driver at drivers/pci/host/pci-hyperv.c has
  x86/x64-specific code and is not being built for ARM64. Fixing
  this driver to enable vPCI devices on ARM64 will be done later.

In a few cases, terminology from the x86/x64 world has been carried
over into the ARM64 code ("MSR", "TSC"). Hyper-V still uses the
x86/x64 terminology and has not replaced it with something more
generic, so the code uses the Hyper-V terminology. This will be
fixed when Hyper-V updates the usage in the TLFS.

This patch set is built against a 5.3.0-rc2-next-20190731 tree.

Changes in v4:
* Moved clock-related code into an architecture independent
  Hyper-V clocksource driver that is already upstream. Clock
  related code is removed from this patch set except for the
  ARM64 specific interrupt handler. [Marc Zyngier]
* Separately upstreamed the split of mshyperv.h into arch
  independent and arch dependent portions. The arch independent
  portion has been removed from this patch set.
* Divided patch #2 of the series into multiple smaller patches
  [Marc Zyngier]
* Changed a dozen or so smaller things based on feedback
  [Marc Zyngier, Will Deacon]
* Added functions to alloc/free Hyper-V size pages. These are for
  use by drivers for Hyper-V synthetic devices when they are
  updated to handle guest page size != Hyper-V page size

Changes in v3:
* Added initialization of hv_vp_index array like was recently
  added on x86 branch [KY Srinivasan]
* Changed Hyper-V ARM64 register symbols to be all uppercase
  instead of mixed case [KY Srinivasan]
* Separated mshyperv.h into two files, one architecture
  independent and one architecture dependent. After this code is
  upstream, will make changes to the x86 code to use the
  architecture independent file and remove duplication. And once
  we have a multi-architecture Hyper-V TLFS, will do a separate
  patch to split hyperv-tlfs.h in the same way. [KY Srinivasan]
* Minor tweaks to rebase to latest linux-next code

Changes in v2:
* Removed patch to implement slow_virt_to_phys() on ARM64.
  Use of slow_virt_to_phys() in arch independent Hyper-V
  drivers has been eliminated by commit 6ba34171bcbd
  ("Drivers: hv: vmbus: Remove use of slow_virt_to_phys()")
* Minor tweaks to rebase to latest linux-next code

Michael Kelley (8):
  arm64: hyperv: Add core Hyper-V include files
  arm64: hyperv: Add hypercall and register access functions
  arm64: hyperv: Add memory alloc/free functions for Hyper-V size pages
  arm64: hyperv: Add interrupt handlers for VMbus and stimer
  arm64: hyperv: Add kexec and panic handlers
  arm64: hyperv: Initialize hypervisor on boot
  Drivers: hv: vmbus: Add hooks for per-CPU IRQ
  Drivers: hv: Enable Hyper-V code to be built on ARM64

 MAINTAINERS                          |   3 +
 arch/arm64/Makefile                  |   1 +
 arch/arm64/hyperv/Makefile           |   2 +
 arch/arm64/hyperv/hv_hvc.S           |  44
 arch/arm64/hyperv/hv_init.c          | 404 ++
 arch/arm64/hyperv/mshyperv.c         | 165 ++
 arch/arm64/include/asm/hyperv-tlfs.h | 408 +++
 arch/arm64/include/asm/mshyperv.h    | 105 +
 arch/x86/include/asm/mshyperv.h      |   4 +
 drivers/hv/Kconfig                   |   5 +-
 drivers/hv/hv.c                      |   2 +
 include/asm-generic/mshyperv.h       |   5 +
 12 files changed, 1146 insertions(+), 2 deletions(-)
 create mode 100644 arch/arm64/hyperv/Makefile
 create mode 100644 arch/arm64/hyperv/hv_hvc.S
 create mode 100644 arch/arm64/hy
[PATCH v4 5/8] arm64: hyperv: Add kexec and panic handlers
Add functions to set up and remove kexec and panic handlers, and to inform Hyper-V about a guest panic. These functions are called from architecture independent code in the VMbus driver. This code is built only when CONFIG_HYPERV is enabled. Signed-off-by: Michael Kelley --- arch/arm64/hyperv/hv_init.c | 61 arch/arm64/hyperv/mshyperv.c | 26 +++ 2 files changed, 87 insertions(+) diff --git a/arch/arm64/hyperv/hv_init.c b/arch/arm64/hyperv/hv_init.c index 9c294f6..67350ec 100644 --- a/arch/arm64/hyperv/hv_init.c +++ b/arch/arm64/hyperv/hv_init.c @@ -199,3 +199,64 @@ void hv_get_vpreg_128(u32 msr, struct hv_get_vp_register_output *result) } EXPORT_SYMBOL_GPL(hv_get_vpreg_128); + +void hyperv_report_panic(struct pt_regs *regs, long err) +{ + static bool panic_reported; + u64 guest_id; + + /* +* We prefer to report panic on 'die' chain as we have proper +* registers to report, but if we miss it (e.g. on BUG()) we need +* to report it on 'panic'. +*/ + if (panic_reported) + return; + panic_reported = true; + + guest_id = hv_get_vpreg(HV_REGISTER_GUEST_OSID); + + /* +* Hyper-V provides the ability to store only 5 values. +* Pick the passed in error value, the guest_id, and the PC. +* The first two general registers are added arbitrarily. 
+*/ + hv_set_vpreg(HV_REGISTER_CRASH_P0, err); + hv_set_vpreg(HV_REGISTER_CRASH_P1, guest_id); + hv_set_vpreg(HV_REGISTER_CRASH_P2, regs->pc); + hv_set_vpreg(HV_REGISTER_CRASH_P3, regs->regs[0]); + hv_set_vpreg(HV_REGISTER_CRASH_P4, regs->regs[1]); + + /* +* Let Hyper-V know there is crash data available +*/ + hv_set_vpreg(HV_REGISTER_CRASH_CTL, HV_CRASH_CTL_CRASH_NOTIFY); +} +EXPORT_SYMBOL_GPL(hyperv_report_panic); + +/* + * hyperv_report_panic_msg - report panic message to Hyper-V + * @pa: physical address of the panic page containing the message + * @size: size of the message in the page + */ +void hyperv_report_panic_msg(phys_addr_t pa, size_t size) +{ + /* +* P3 to contain the physical address of the panic page & P4 to +* contain the size of the panic data in that page. Rest of the +* registers are no-op when the NOTIFY_MSG flag is set. +*/ + hv_set_vpreg(HV_REGISTER_CRASH_P0, 0); + hv_set_vpreg(HV_REGISTER_CRASH_P1, 0); + hv_set_vpreg(HV_REGISTER_CRASH_P2, 0); + hv_set_vpreg(HV_REGISTER_CRASH_P3, pa); + hv_set_vpreg(HV_REGISTER_CRASH_P4, size); + + /* +* Let Hyper-V know there is crash data available along with +* the panic message. 
+*/ + hv_set_vpreg(HV_REGISTER_CRASH_CTL, + (HV_CRASH_CTL_CRASH_NOTIFY | HV_CRASH_CTL_CRASH_NOTIFY_MSG)); +} +EXPORT_SYMBOL_GPL(hyperv_report_panic_msg); diff --git a/arch/arm64/hyperv/mshyperv.c b/arch/arm64/hyperv/mshyperv.c index ae6ece6..c58940d 100644 --- a/arch/arm64/hyperv/mshyperv.c +++ b/arch/arm64/hyperv/mshyperv.c @@ -23,6 +23,8 @@ static void (*vmbus_handler)(void); static void (*hv_stimer0_handler)(void); +static void (*hv_kexec_handler)(void); +static void (*hv_crash_handler)(struct pt_regs *regs); static int vmbus_irq; static long __percpu *vmbus_evt; @@ -137,3 +139,27 @@ void hv_remove_stimer0_irq(int irq) } } EXPORT_SYMBOL_GPL(hv_remove_stimer0_irq); + +void hv_setup_kexec_handler(void (*handler)(void)) +{ + hv_kexec_handler = handler; +} +EXPORT_SYMBOL_GPL(hv_setup_kexec_handler); + +void hv_remove_kexec_handler(void) +{ + hv_kexec_handler = NULL; +} +EXPORT_SYMBOL_GPL(hv_remove_kexec_handler); + +void hv_setup_crash_handler(void (*handler)(struct pt_regs *regs)) +{ + hv_crash_handler = handler; +} +EXPORT_SYMBOL_GPL(hv_setup_crash_handler); + +void hv_remove_crash_handler(void) +{ + hv_crash_handler = NULL; +} +EXPORT_SYMBOL_GPL(hv_remove_crash_handler); -- 1.8.3.1
[PATCH v4 1/8] arm64: hyperv: Add core Hyper-V include files
hyperv-tlfs.h defines Hyper-V interfaces from the Hyper-V Top Level Functional Spec (TLFS). The TLFS is distinctly oriented to x86/x64, and Hyper-V has not separated out the architecture-dependent parts into x86/x64 vs. ARM64. So hyperv-tlfs.h includes information for ARM64 that is not yet formally published. The TLFS is available here: docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/reference/tlfs mshyperv.h defines Linux-specific structures and routines for interacting with Hyper-V on ARM64, and #includes the architecture- independent part of mshyperv.h in include/asm-generic. Signed-off-by: Michael Kelley --- MAINTAINERS | 2 + arch/arm64/include/asm/hyperv-tlfs.h | 408 +++ arch/arm64/include/asm/mshyperv.h| 105 + 3 files changed, 515 insertions(+) create mode 100644 arch/arm64/include/asm/hyperv-tlfs.h create mode 100644 arch/arm64/include/asm/mshyperv.h diff --git a/MAINTAINERS b/MAINTAINERS index cf2225b..fa98b21 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -7486,6 +7486,8 @@ F:arch/x86/include/asm/trace/hyperv.h F: arch/x86/include/asm/hyperv-tlfs.h F: arch/x86/kernel/cpu/mshyperv.c F: arch/x86/hyperv +F: arch/arm64/include/asm/hyperv-tlfs.h +F: arch/arm64/include/asm/mshyperv.h F: drivers/clocksource/hyperv_timer.c F: drivers/hid/hid-hyperv.c F: drivers/hv/ diff --git a/arch/arm64/include/asm/hyperv-tlfs.h b/arch/arm64/include/asm/hyperv-tlfs.h new file mode 100644 index 000..fe167c4 --- /dev/null +++ b/arch/arm64/include/asm/hyperv-tlfs.h @@ -0,0 +1,408 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +/* + * This file contains definitions from the Hyper-V Hypervisor Top-Level + * Functional Specification (TLFS): + * https://docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/reference/tlfs + * + * Copyright (C) 2019, Microsoft, Inc. 
+ * + * Author : Michael Kelley + */ + +#ifndef _ASM_HYPERV_TLFS_H +#define _ASM_HYPERV_TLFS_H + +#include + +/* + * All data structures defined in the TLFS that are shared between Hyper-V + * and a guest VM use Little Endian byte ordering. This matches the default + * byte ordering of Linux running on ARM64, so no special handling is required. + */ + + +/* + * While not explicitly listed in the TLFS, Hyper-V always runs with a page + * size of 4096. These definitions are used when communicating with Hyper-V + * using guest physical pages and guest physical page addresses, since the + * guest page size may not be 4096 on ARM64. + */ +#define HV_HYP_PAGE_SHIFT 12 +#define HV_HYP_PAGE_SIZE (1 << HV_HYP_PAGE_SHIFT) +#define HV_HYP_PAGE_MASK (~(HV_HYP_PAGE_SIZE - 1)) + +/* + * These Hyper-V registers provide information equivalent to the CPUID + * instruction on x86/x64. + */ +#define HV_REGISTER_HYPERVISOR_VERSION 0x0100 /*CPUID 0x4002 */ +#define HV_REGISTER_PRIVILEGES_AND_FEATURES 0x0200 /*CPUID 0x4003 */ +#define HV_REGISTER_FEATURES 0x0201 /*CPUID 0x4004 */ +#define HV_REGISTER_IMPLEMENTATION_LIMITS 0x0202 /*CPUID 0x4005 */ +#define HV_ARM64_REGISTER_INTERFACE_VERSION 0x00090006 /*CPUID 0x4001 */ + +/* + * Feature identification. HvRegisterPrivilegesAndFeaturesInfo returns a + * 128-bit value with flags indicating which features are available to the + * partition based upon the current partition privileges. The 128-bit + * value is broken up with different portions stored in different 32-bit + * fields in the ms_hyperv structure. 
+ */ + +/* Partition Reference Counter available*/ +#define HV_MSR_TIME_REF_COUNT_AVAILABLE BIT(1) + +/* + * Synthetic Timers available + */ +#define HV_MSR_SYNTIMER_AVAILABLE BIT(3) + +/* Frequency MSRs available */ +#define HV_FEATURE_FREQUENCY_MSRS_AVAILABLE BIT(8) + +/* Reference TSC available */ +#define HV_MSR_REFERENCE_TSC_AVAILABLE BIT(9) + +/* Crash MSR available */ +#define HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE BIT(10) + + +/* + * This group of flags is in the high order 64-bits of the returned + * 128-bit value. + */ + +/* STIMER direct mode is available */ +#define HV_STIMER_DIRECT_MODE_AVAILABLE BIT(19) + +/* + * Implementation recommendations in register + * HvRegisterFeaturesInfo. Indicates which behaviors the hypervisor + * recommends the OS implement for optimal performance. + */ + +/* + * Recommend not using Auto EOI + */ +#define HV_DEPRECATING_AEOI_RECOMMENDED BIT(9) + +/* + * Synthetic register definitions equivalent to MSRs on x86/x64 + */ +#define HV_REGISTER_CRASH_P0 0x0210 +#define HV_REGISTER_CRASH_P1 0x0211 +#define HV_REGISTER_CRASH_P2 0x0212 +#define HV_REGISTER_CRASH_P3 0x0213 +#define HV_REGISTER_CRASH_P4 0x0214 +#define HV_REGISTER_CRASH_CTL 0x0215 + +#define HV_REGISTER_GUES
[PATCH v4 2/8] arm64: hyperv: Add hypercall and register access functions
Add ARM64-specific code to make Hyper-V hypercalls and to access virtual processor synthetic registers via hypercalls. Hypercalls use a Hyper-V specific calling sequence with a non-zero immediate value per Section 2.9 of the SMC Calling Convention spec. This code is architecture dependent and is mostly driven by architecture independent code in the VMbus driver and the Hyper-V timer clocksource driver. This code is built only when CONFIG_HYPERV is enabled. Signed-off-by: Michael Kelley --- MAINTAINERS | 1 + arch/arm64/Makefile | 1 + arch/arm64/hyperv/Makefile | 2 + arch/arm64/hyperv/hv_hvc.S | 44 +++ arch/arm64/hyperv/hv_init.c | 133 5 files changed, 181 insertions(+) create mode 100644 arch/arm64/hyperv/Makefile create mode 100644 arch/arm64/hyperv/hv_hvc.S create mode 100644 arch/arm64/hyperv/hv_init.c diff --git a/MAINTAINERS b/MAINTAINERS index fa98b21..71a8276 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -7488,6 +7488,7 @@ F:arch/x86/kernel/cpu/mshyperv.c F: arch/x86/hyperv F: arch/arm64/include/asm/hyperv-tlfs.h F: arch/arm64/include/asm/mshyperv.h +F: arch/arm64/hyperv F: drivers/clocksource/hyperv_timer.c F: drivers/hid/hid-hyperv.c F: drivers/hv/ diff --git a/arch/arm64/Makefile b/arch/arm64/Makefile index bb1f1db..1f014e6 100644 --- a/arch/arm64/Makefile +++ b/arch/arm64/Makefile @@ -140,6 +140,7 @@ core-y += arch/arm64/kernel/ arch/arm64/mm/ core-$(CONFIG_NET) += arch/arm64/net/ core-$(CONFIG_KVM) += arch/arm64/kvm/ core-$(CONFIG_XEN) += arch/arm64/xen/ +core-$(CONFIG_HYPERV) += arch/arm64/hyperv/ core-$(CONFIG_CRYPTO) += arch/arm64/crypto/ libs-y := arch/arm64/lib/ $(libs-y) core-$(CONFIG_EFI_STUB) += $(objtree)/drivers/firmware/efi/libstub/lib.a diff --git a/arch/arm64/hyperv/Makefile b/arch/arm64/hyperv/Makefile new file mode 100644 index 000..6bd8439 --- /dev/null +++ b/arch/arm64/hyperv/Makefile @@ -0,0 +1,2 @@ +# SPDX-License-Identifier: GPL-2.0 +obj-y := hv_init.o hv_hvc.o diff --git a/arch/arm64/hyperv/hv_hvc.S b/arch/arm64/hyperv/hv_hvc.S new 
file mode 100644 index 000..09324ac --- /dev/null +++ b/arch/arm64/hyperv/hv_hvc.S @@ -0,0 +1,44 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +/* + * Microsoft Hyper-V hypervisor invocation routines + * + * Copyright (C) 2018, Microsoft, Inc. + * + * Author : Michael Kelley + */ + +#include + + .text +/* + * Do the HVC instruction. For Hyper-V the argument is always 1. + * x0 contains the hypercall control value, while additional registers + * vary depending on the hypercall, and whether the hypercall arguments + * are in memory or in registers (a "fast" hypercall per the Hyper-V + * TLFS). When the arguments are in memory, x1 is the guest physical + * address of the input arguments, and x2 is the guest physical + * address of the output arguments. When the arguments are in + * registers, the register values depend on the hypercall. Note + * that this version cannot return any values in registers. + */ +ENTRY(hv_do_hvc) + hvc #1 + ret +ENDPROC(hv_do_hvc) + +/* + * This variant of HVC invocation is for hv_get_vpreg and + * hv_get_vpreg_128. The input parameters are passed in registers + * along with a pointer in x4 to where the output result should + * be stored. The output is returned in x15 and x16. x18 is used as + * scratch space to avoid building a stack frame, as Hyper-V does + * not preserve registers x0-x17. + */ +ENTRY(hv_do_hvc_fast_get) + mov x18, x4 + hvc #1 + str x15,[x18] + str x16,[x18,#8] + ret +ENDPROC(hv_do_hvc_fast_get) diff --git a/arch/arm64/hyperv/hv_init.c b/arch/arm64/hyperv/hv_init.c new file mode 100644 index 000..6808bc8 --- /dev/null +++ b/arch/arm64/hyperv/hv_init.c @@ -0,0 +1,133 @@ +// SPDX-License-Identifier: GPL-2.0 + +/* + * Initialization of the interface with Microsoft's Hyper-V hypervisor, + * and various low level utility routines for interacting with Hyper-V. + * + * Copyright (C) 2019, Microsoft, Inc. 
+ * + * Author : Michael Kelley + */ + + +#include +#include +#include +#include +#include +#include +#include +#include + +/* + * hv_do_hypercall- Invoke the specified hypercall + */ +u64 hv_do_hypercall(u64 control, void *input, void *output) +{ + u64 input_address; + u64 output_address; + + input_address = input ? virt_to_phys(input) : 0; + output_address = output ? virt_to_phys(output) : 0; + return hv_do_hvc(control, input_address, output_address); +} +EXPORT_SYMBOL_GPL(hv_do_hypercall); + +/* + * hv_do_fast_hypercall8 -- Invoke the specified hypercall + * with arguments in registers instead of physical memory. + * Avoids the overhead of virt_to_phys for simple hypercalls. + */ + +u64 hv_do_fast_hypercall8(u16 code, u64 input) +{ + u64 control; + + control = (u64)code | HV_HYPERCALL_FAST_BIT;
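The control word that hv_do_fast_hypercall8() builds can be looked at in isolation. The sketch below is a user-space illustration, not code from the patch: MODEL_HYPERCALL_FAST_BIT is assumed to match the x86 definition of HV_HYPERCALL_FAST_BIT (bit 16), the ARM64 header in patch 1/8 is the authority, and the model_* helpers are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed to match the x86 definition, BIT(16); check hyperv-tlfs.h. */
#define MODEL_HYPERCALL_FAST_BIT (1ULL << 16)

/*
 * Build the control word for a "fast" hypercall: the 16-bit call code
 * in the low bits, plus the flag telling Hyper-V that the arguments
 * are in registers rather than in guest physical memory.
 */
static uint64_t model_fast_control(uint16_t code)
{
	return (uint64_t)code | MODEL_HYPERCALL_FAST_BIT;
}

/* Report whether a control word requests a fast hypercall. */
static bool model_is_fast(uint64_t control)
{
	return (control & MODEL_HYPERCALL_FAST_BIT) != 0;
}
```

Because the code is limited to 16 bits by its type, the OR can never disturb the fast flag, which is why the patch can build the control word without masking.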
[PATCH v4 8/8] Drivers: hv: Enable Hyper-V code to be built on ARM64
Update drivers/hv/Kconfig so CONFIG_HYPERV and
CONFIG_HYPERV_TSCPAGE can be selected on ARM64, causing the
Hyper-V specific code to be built.

Signed-off-by: Michael Kelley
---
 drivers/hv/Kconfig | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/hv/Kconfig b/drivers/hv/Kconfig
index 9a59957..6f8808f 100644
--- a/drivers/hv/Kconfig
+++ b/drivers/hv/Kconfig
@@ -4,7 +4,8 @@ menu "Microsoft Hyper-V guest support"
 
 config HYPERV
 	tristate "Microsoft Hyper-V client drivers"
-	depends on X86 && ACPI && X86_LOCAL_APIC && HYPERVISOR_GUEST
+	depends on ACPI && \
+			((X86 && X86_LOCAL_APIC && HYPERVISOR_GUEST) || ARM64)
 	select PARAVIRT
 	select X86_HV_CALLBACK_VECTOR
 	help
@@ -15,7 +16,7 @@ config HYPERV_TIMER
 	def_bool HYPERV
 
 config HYPERV_TSCPAGE
-	def_bool HYPERV && X86_64
+	def_bool HYPERV && (X86_64 || ARM64)
 
 config HYPERV_UTILS
 	tristate "Microsoft Hyper-V Utilities driver"
-- 
1.8.3.1
[PATCH v4 3/8] arm64: hyperv: Add memory alloc/free functions for Hyper-V size pages
Add ARM64-specific code to allocate memory with HV_HYP_PAGE_SIZE size and alignment. These are for use when pages need to be shared with Hyper-V. Separate functions are needed as the page size used by Hyper-V may not be the same as the guest page size. Free operations are rarely done, so no attempt is made to combine freed pages into larger chunks. This code is built only when CONFIG_HYPERV is enabled. Signed-off-by: Michael Kelley --- arch/arm64/hyperv/hv_init.c| 68 ++ include/asm-generic/mshyperv.h | 5 2 files changed, 73 insertions(+) diff --git a/arch/arm64/hyperv/hv_init.c b/arch/arm64/hyperv/hv_init.c index 6808bc8..9c294f6 100644 --- a/arch/arm64/hyperv/hv_init.c +++ b/arch/arm64/hyperv/hv_init.c @@ -15,10 +15,78 @@ #include #include #include +#include +#include +#include #include #include #include + +/* + * Functions for allocating and freeing memory with size and + * alignment HV_HYP_PAGE_SIZE. These functions are needed because + * the guest page size may not be the same as the Hyper-V page + * size. And while kalloc() could allocate the memory, it does not + * guarantee the required alignment. So a separate small memory + * allocator is needed. The free function is rarely used, so it + * does not try to combine freed pages into larger chunks. + * + * These functions are used by arm64 specific code as well as + * arch independent Hyper-V drivers. 
+ */ + +static DEFINE_SPINLOCK(free_list_lock); +static struct list_head free_list = LIST_HEAD_INIT(free_list); + +void *hv_alloc_hyperv_page(void) +{ + int i; + struct list_head *hv_page; + unsigned long addr; + + BUILD_BUG_ON(HV_HYP_PAGE_SIZE > PAGE_SIZE); + + spin_lock(&free_list_lock); + if (list_empty(&free_list)) { + spin_unlock(&free_list_lock); + addr = __get_free_page(GFP_KERNEL); + spin_lock(&free_list_lock); + for (i = 0; i < PAGE_SIZE; i += HV_HYP_PAGE_SIZE) + list_add_tail((struct list_head *)(addr + i), + &free_list); + } + hv_page = free_list.next; + list_del(hv_page); + spin_unlock(&free_list_lock); + + return hv_page; +} +EXPORT_SYMBOL_GPL(hv_alloc_hyperv_page); + +void *hv_alloc_hyperv_zeroed_page(void) +{ + void *memp; + + memp = hv_alloc_hyperv_page(); + memset(memp, 0, HV_HYP_PAGE_SIZE); + + return memp; +} +EXPORT_SYMBOL_GPL(hv_alloc_hyperv_zeroed_page); + + +void hv_free_hyperv_page(unsigned long addr) +{ + if (!addr) + return; + spin_lock(&free_list_lock); + list_add((struct list_head *)addr, &free_list); + spin_unlock(&free_list_lock); +} +EXPORT_SYMBOL_GPL(hv_free_hyperv_page); + + /* * hv_do_hypercall- Invoke the specified hypercall */ diff --git a/include/asm-generic/mshyperv.h b/include/asm-generic/mshyperv.h index 0becb7d..30a9f3e 100644 --- a/include/asm-generic/mshyperv.h +++ b/include/asm-generic/mshyperv.h @@ -99,6 +99,11 @@ static inline void vmbus_signal_eom(struct hv_message *msg, u32 old_msg_type) void hv_setup_crash_handler(void (*handler)(struct pt_regs *regs)); void hv_remove_crash_handler(void); +void *hv_alloc_hyperv_page(void); +void *hv_alloc_hyperv_zeroed_page(void); +void hv_free_hyperv_page(unsigned long addr); + + #if IS_ENABLED(CONFIG_HYPERV) /* * Hypervisor's notion of virtual processor ID is different from -- 1.8.3.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
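The allocator in this patch carves one guest page into Hyper-V sized chunks and keeps the spare chunks on a free list. Below is a minimal user-space sketch of that idea, not the kernel code: the names hv_page_alloc(), hv_page_free() and refill_free_list(), the 64 KiB guest page size, and the use of malloc() in place of __get_free_page() are all assumptions made for illustration. Locking is left out, and the free list is a simple singly linked stack rather than the kernel's struct list_head. The sketch also returns NULL when the backing allocation fails, a check that the posted hv_alloc_hyperv_page() does not make.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed sizes for illustration: 64 KiB guest page, 4 KiB Hyper-V page. */
#define GUEST_PAGE_SIZE 65536u
#define HV_PAGE_SIZE    4096u

/* Each free chunk stores its list link in its own first bytes. */
struct free_chunk {
	struct free_chunk *next;
};

static struct free_chunk *free_list;

/*
 * Obtain one "guest page" and queue it as HV_PAGE_SIZE aligned chunks.
 * malloc() stands in for __get_free_page(); the extra HV_PAGE_SIZE bytes
 * let us round the start address up to the required alignment. The
 * backing memory is never returned, mirroring the patch's choice not to
 * merge freed chunks back into larger pages.
 */
static int refill_free_list(void)
{
	unsigned char *raw = malloc(GUEST_PAGE_SIZE + HV_PAGE_SIZE);
	uintptr_t base;
	size_t off;

	if (!raw)
		return -1;

	base = ((uintptr_t)raw + HV_PAGE_SIZE - 1) &
	       ~(uintptr_t)(HV_PAGE_SIZE - 1);

	for (off = 0; off < GUEST_PAGE_SIZE; off += HV_PAGE_SIZE) {
		struct free_chunk *c = (struct free_chunk *)(base + off);

		c->next = free_list;
		free_list = c;
	}
	return 0;
}

/* Hand out one HV_PAGE_SIZE chunk, refilling the list when it is empty. */
void *hv_page_alloc(void)
{
	struct free_chunk *c;

	if (!free_list && refill_free_list() != 0)
		return NULL;

	c = free_list;
	free_list = c->next;
	return c;
}

/* Put a chunk back at the head of the free list; NULL is ignored. */
void hv_page_free(void *p)
{
	struct free_chunk *c = p;

	if (!c)
		return;
	c->next = free_list;
	free_list = c;
}
```

Because a freed chunk goes back to the head of the list, the next allocation returns the same address, which is the same reuse pattern that list_add() in hv_free_hyperv_page() gives.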
[PATCH v4 4/8] arm64: hyperv: Add interrupt handlers for VMbus and stimer
Add ARM64-specific code to set up and handle the interrupts generated by Hyper-V for VMbus messages and for stimer expiration. This code is architecture dependent and is mostly driven by architecture independent code in the VMbus driver and the Hyper-V timer clocksource driver. This code is built only when CONFIG_HYPERV is enabled. Signed-off-by: Michael Kelley --- arch/arm64/hyperv/Makefile | 2 +- arch/arm64/hyperv/mshyperv.c | 139 +++ 2 files changed, 140 insertions(+), 1 deletion(-) create mode 100644 arch/arm64/hyperv/mshyperv.c diff --git a/arch/arm64/hyperv/Makefile b/arch/arm64/hyperv/Makefile index 6bd8439..988eda5 100644 --- a/arch/arm64/hyperv/Makefile +++ b/arch/arm64/hyperv/Makefile @@ -1,2 +1,2 @@ # SPDX-License-Identifier: GPL-2.0 -obj-y := hv_init.o hv_hvc.o +obj-y := hv_init.o hv_hvc.o mshyperv.o diff --git a/arch/arm64/hyperv/mshyperv.c b/arch/arm64/hyperv/mshyperv.c new file mode 100644 index 000..ae6ece6 --- /dev/null +++ b/arch/arm64/hyperv/mshyperv.c @@ -0,0 +1,139 @@ +// SPDX-License-Identifier: GPL-2.0 + +/* + * Core routines for interacting with Microsoft's Hyper-V hypervisor, + * including setting up VMbus and STIMER interrupts, and handling + * crashes and kexecs. These interactions are through a set of + * static "handler" variables set by the architecture independent + * VMbus and STIMER drivers. + * + * Copyright (C) 2019, Microsoft, Inc. 
+ * + * Author : Michael Kelley + */ + +#include +#include +#include +#include +#include +#include +#include +#include + +static void (*vmbus_handler)(void); +static void (*hv_stimer0_handler)(void); + +static int vmbus_irq; +static long __percpu *vmbus_evt; +static long __percpu *stimer0_evt; + +irqreturn_t hyperv_vector_handler(int irq, void *dev_id) +{ + vmbus_handler(); + return IRQ_HANDLED; +} + +/* Must be done just once */ +void hv_setup_vmbus_irq(void (*handler)(void)) +{ + int result; + + vmbus_handler = handler; + vmbus_irq = acpi_register_gsi(NULL, HYPERVISOR_CALLBACK_VECTOR, +ACPI_LEVEL_SENSITIVE, ACPI_ACTIVE_HIGH); + if (vmbus_irq <= 0) { + pr_err("Can't register Hyper-V VMBus GSI. Error %d", + vmbus_irq); + vmbus_irq = 0; + return; + } + vmbus_evt = alloc_percpu(long); + result = request_percpu_irq(vmbus_irq, hyperv_vector_handler, + "Hyper-V VMbus", vmbus_evt); + if (result) { + pr_err("Can't request Hyper-V VMBus IRQ %d. Error %d", + vmbus_irq, result); + free_percpu(vmbus_evt); + acpi_unregister_gsi(vmbus_irq); + vmbus_irq = 0; + } +} +EXPORT_SYMBOL_GPL(hv_setup_vmbus_irq); + +/* Must be done just once */ +void hv_remove_vmbus_irq(void) +{ + if (vmbus_irq) { + free_percpu_irq(vmbus_irq, vmbus_evt); + free_percpu(vmbus_evt); + acpi_unregister_gsi(vmbus_irq); + } +} +EXPORT_SYMBOL_GPL(hv_remove_vmbus_irq); + +/* Must be done by each CPU */ +void hv_enable_vmbus_irq(void) +{ + enable_percpu_irq(vmbus_irq, 0); +} +EXPORT_SYMBOL_GPL(hv_enable_vmbus_irq); + +/* Must be done by each CPU */ +void hv_disable_vmbus_irq(void) +{ + disable_percpu_irq(vmbus_irq); +} +EXPORT_SYMBOL_GPL(hv_disable_vmbus_irq); + +/* Routines to do per-architecture handling of STIMER0 when in Direct Mode */ + +static irqreturn_t hv_stimer0_vector_handler(int irq, void *dev_id) +{ + if (hv_stimer0_handler) + hv_stimer0_handler(); + return IRQ_HANDLED; +} + +int hv_setup_stimer0_irq(int *irq, int *vector, void (*handler)(void)) +{ + int localirq; + int result; + + localirq = 
acpi_register_gsi(NULL, HV_STIMER0_IRQNR, + ACPI_LEVEL_SENSITIVE, ACPI_ACTIVE_HIGH); + if (localirq <= 0) { + pr_err("Can't register Hyper-V stimer0 GSI. Error %d", + localirq); + *irq = 0; + return -1; + } + stimer0_evt = alloc_percpu(long); + result = request_percpu_irq(localirq, hv_stimer0_vector_handler, +"Hyper-V stimer0", stimer0_evt); + if (result) { + pr_err("Can't request Hyper-V stimer0 IRQ %d. Error %d", + localirq, result); + free_percpu(stimer0_evt); + acpi_unregister_gsi(localirq); + *irq = 0; + return -1; + } + + hv_stimer0_handler = handler; + *vector = HV_STIMER0_IRQNR; + *irq = localirq; + return 0; +} +EXPORT_SYMBOL_GPL(hv_setup_stimer0_irq); + +void hv_remove_stimer0_irq(int irq) +{ + hv_stimer0_handler = NULL; +
[PATCH v4 6/8] arm64: hyperv: Initialize hypervisor on boot
Add ARM64-specific code to initialize the Hyper-V hypervisor when booting as a guest VM. Provide functions and data structures indicating hypervisor status that are needed by VMbus driver. This code is built only when CONFIG_HYPERV is enabled. Signed-off-by: Michael Kelley --- arch/arm64/hyperv/hv_init.c | 142 1 file changed, 142 insertions(+) diff --git a/arch/arm64/hyperv/hv_init.c b/arch/arm64/hyperv/hv_init.c index 67350ec..7179e12 100644 --- a/arch/arm64/hyperv/hv_init.c +++ b/arch/arm64/hyperv/hv_init.c @@ -13,15 +13,47 @@ #include #include #include +#include #include +#include +#include #include #include #include #include +#include +#include +#include #include #include #include +#include +#include +static boolhyperv_initialized; + +struct ms_hyperv_info ms_hyperv __ro_after_init; +EXPORT_SYMBOL_GPL(ms_hyperv); + +u32*hv_vp_index; +EXPORT_SYMBOL_GPL(hv_vp_index); + +u32hv_max_vp_index; +EXPORT_SYMBOL_GPL(hv_max_vp_index); + +static int hv_cpu_init(unsigned int cpu) +{ + u64 msr_vp_index; + + hv_get_vp_index(msr_vp_index); + + hv_vp_index[smp_processor_id()] = msr_vp_index; + + if (msr_vp_index > hv_max_vp_index) + hv_max_vp_index = msr_vp_index; + + return 0; +} /* * Functions for allocating and freeing memory with size and @@ -88,6 +120,110 @@ void hv_free_hyperv_page(unsigned long addr) /* + * This function is invoked via the ACPI clocksource probe mechanism. We + * don't actually use any values from the ACPI GTDT table, but we set up + * the Hyper-V synthetic clocksource and do other initialization for + * interacting with Hyper-V the first time. Using early_initcall to invoke + * this function is too late because interrupts are already enabled at that + * point, and hv_init_clocksource() must run before interrupts are enabled. + * + * 1. Setup the guest ID. + * 2. Get features and hints info from Hyper-V + * 3. Setup per-cpu VP indices. + * 4. Initialize the Hyper-V clocksource. 
+ */ + +static int __init hyperv_init(struct acpi_table_header *table) +{ + struct hv_get_vp_register_output result; + u32 a, b, c, d; + u64 guest_id; + int i; + + /* +* If we're in a VM on Hyper-V, the ACPI hypervisor_id field will +* have the string "MsHyperV". +*/ + if (strncmp((char *)&acpi_gbl_FADT.hypervisor_id, "MsHyperV", 8)) + return -EINVAL; + + /* Setup the guest ID */ + guest_id = generate_guest_id(0, LINUX_VERSION_CODE, 0); + hv_set_vpreg(HV_REGISTER_GUEST_OSID, guest_id); + + /* Get the features and hints from Hyper-V */ + hv_get_vpreg_128(HV_REGISTER_PRIVILEGES_AND_FEATURES, &result); + ms_hyperv.features = lower_32_bits(result.registervaluelow); + ms_hyperv.misc_features = upper_32_bits(result.registervaluehigh); + + hv_get_vpreg_128(HV_REGISTER_FEATURES, &result); + ms_hyperv.hints = lower_32_bits(result.registervaluelow); + + pr_info("Hyper-V: Features 0x%x, hints 0x%x\n", + ms_hyperv.features, ms_hyperv.hints); + + /* +* Direct mode is the only option for STIMERs provided Hyper-V +* on ARM64, so Hyper-V doesn't actually set the flag. But add +* the flag so the architecture independent code in +* drivers/clocksource/hyperv_timer.c will correctly use that mode. +*/ + ms_hyperv.misc_features |= HV_STIMER_DIRECT_MODE_AVAILABLE; + + /* +* Hyper-V on ARM64 doesn't support AutoEOI. Add the hint +* that tells architecture independent code not to use this +* feature. 
+*/ + ms_hyperv.hints |= HV_DEPRECATING_AEOI_RECOMMENDED; + + /* Get information about the Hyper-V host version */ + hv_get_vpreg_128(HV_REGISTER_HYPERVISOR_VERSION, &result); + a = lower_32_bits(result.registervaluelow); + b = upper_32_bits(result.registervaluelow); + c = lower_32_bits(result.registervaluehigh); + d = upper_32_bits(result.registervaluehigh); + pr_info("Hyper-V: Host Build %d.%d.%d.%d-%d-%d\n", + b >> 16, b & 0x, a, d & 0xFF, c, d >> 24); + + /* Allocate and initialize percpu VP index array */ + hv_vp_index = kmalloc_array(num_possible_cpus(), sizeof(*hv_vp_index), + GFP_KERNEL); + if (!hv_vp_index) + return -ENOMEM; + + for (i = 0; i < num_possible_cpus(); i++) + hv_vp_index[i] = VP_INVAL; + + if (cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "arm64/hyperv_init:online", + hv_cpu_init, NULL) < 0) + goto free_vp_index; + + hv_init_clocksource(); + + hyperv_initialized = true;
RE: [PATCH v2] x86/hyper-v: Zero out the VP ASSIST PAGE to fix CPU offlining
From: Dexuan Cui Sent: Thursday, July 18, 2019 8:23 PM
>
> The VP ASSIST PAGE is an "overlay" page (see Hyper-V TLFS's Section
> 5.2.1 "GPA Overlay Pages" for the details) and here is an excerpt:
>
> "
> The hypervisor defines several special pages that "overlay" the guest's
> Guest Physical Addresses (GPA) space. Overlays are addressed GPA but are
> not included in the normal GPA map maintained internally by the hypervisor.
> Conceptually, they exist in a separate map that overlays the GPA map.
>
> If a page within the GPA space is overlaid, any SPA page mapped to the
> GPA page is effectively "obscured" and generally unreachable by the
> virtual processor through processor memory accesses.
>
> If an overlay page is disabled, the underlying GPA page is "uncovered",
> and an existing mapping becomes accessible to the guest.
> "
>
> SPA = System Physical Address = the final real physical address.
>
> When a CPU (e.g. CPU1) is being onlined, in hv_cpu_init(), we allocate the
> VP ASSIST PAGE and enable the EOI optimization for this CPU by writing the
> MSR HV_X64_MSR_VP_ASSIST_PAGE. From now on, hvp->apic_assist belongs to the
> special SPA page, and this CPU *always* uses hvp->apic_assist (which is
> shared with the hypervisor) to decide if it needs to write the EOI MSR.
>
> When a CPU (e.g. CPU1) is being offlined, on this CPU, we do:
> 1. in hv_cpu_die(), we disable the EOI optimization for this CPU, and from
>    now on hvp->apic_assist belongs to the original "normal" SPA page;
> 2. we finish the remaining work of stopping this CPU;
> 3. this CPU is completely stopped.
>
> Between 1 and 3, this CPU can still receive interrupts (e.g. reschedule
> IPIs from CPU0, and Local APIC timer interrupts), and this CPU *must* write
> the EOI MSR for every interrupt received, otherwise the hypervisor may not
> deliver further interrupts, which may be needed to completely stop the CPU.
>
> So, after we disable the EOI optimization in hv_cpu_die(), we need to make
> sure hvp->apic_assist's bit0 is zero. The easiest way is we just zero out
> the page when it's allocated in hv_cpu_init().
>
> Note 1: after the "normal" SPA page is allocated and zeroed out, neither the
> hypervisor nor the guest writes into the page, so the page remains with
> zeros.
>
> Note 2: see Section 10.3.5 "EOI Assist" for the details of the EOI
> optimization. When the optimization is enabled, the guest can still write
> the EOI MSR register irrespective of the "No EOI required" value, though
> by doing so we can't benefit from the optimization.
>
> Fixes: ba696429d290 ("x86/hyper-v: Implement EOI assist")
> Signed-off-by: Dexuan Cui
> ---
>
> v2: there is no code change. I just improved the comment and the changelog
> according to the discussion with tglx:
>
> https://lkml.org/lkml/2019/7/17/781
> https://lkml.org/lkml/2019/7/18/91
>
>  arch/x86/hyperv/hv_init.c | 10 +-
>  1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
> index 0e033ef11a9f..d26832cb38bb 100644
> --- a/arch/x86/hyperv/hv_init.c
> +++ b/arch/x86/hyperv/hv_init.c
> @@ -60,8 +60,16 @@ static int hv_cpu_init(unsigned int cpu)
>  	if (!hv_vp_assist_page)
>  		return 0;
>
> +	/*
> +	 * The VP ASSIST PAGE is an "overlay" page (see Hyper-V TLFS's Section
> +	 * 5.2.1 "GPA Overlay Pages"). Here it must be zeroed out to make sure
> +	 * we always write the EOI MSR in hv_apic_eoi_write() *after* the
> +	 * EOI optimization is disabled in hv_cpu_die(), otherwise a CPU may
> +	 * not be stopped in the case of CPU offlining and the VM will hang.
> +	 */
>  	if (!*hvp)
> -		*hvp = __vmalloc(PAGE_SIZE, GFP_KERNEL, PAGE_KERNEL);
> +		*hvp = __vmalloc(PAGE_SIZE, GFP_KERNEL | __GFP_ZERO,
> +				 PAGE_KERNEL);
>
>  	if (*hvp) {
>  		u64 val;
> --
> 2.19.1

Reviewed-by: Michael Kelley
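The changelog's argument is that, once the overlay is disabled, the guest reads bit 0 of apic_assist from its own backing page, so stale data there can suppress an EOI write. Here is a small user-space model of that decision; struct vp_assist_page, eoi_write() and the eoi_msr_writes counter are hypothetical stand-ins for illustration, and the real hv_apic_eoi_write() is not quoted in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the page holding the apic_assist field. */
struct vp_assist_page {
	uint32_t apic_assist;
};

/* Counts simulated writes to the EOI MSR. */
static unsigned int eoi_msr_writes;

static void write_eoi_msr(void)
{
	eoi_msr_writes++;
}

/*
 * Model of the EOI decision: if bit 0 ("No EOI required") is set, clear
 * the field and skip the MSR write; otherwise write the MSR.
 * Returns true when the MSR was written.
 */
bool eoi_write(struct vp_assist_page *hvp)
{
	if (hvp) {
		uint32_t old = hvp->apic_assist;

		hvp->apic_assist = 0;
		if (old & 1)
			return false;
	}
	write_eoi_msr();
	return true;
}
```

With a page that was never zeroed, bit 0 may happen to be 1 after the overlay is torn down, and the first call skips the EOI that the hypervisor is waiting for. With a zeroed page every call writes the MSR, which is the property the patch relies on.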
RE: [PATCH] x86/hyper-v: Zero out the VP assist page to fix CPU offlining
From: Dexuan Cui Sent: Wednesday, July 3, 2019 6:46 PM > > When a CPU is being offlined, the CPU usually still receives a few > interrupts (e.g. reschedule IPIs), after hv_cpu_die() disables the > HV_X64_MSR_VP_ASSIST_PAGE, so hv_apic_eoi_write() may not write the EOI > MSR, if the apic_assist field's bit0 happens to be 1; as a result, Hyper-V > may not be able to deliver all the interrupts to the CPU, and the CPU may > not be stopped, and the kernel will hang soon. > > The VP ASSIST PAGE is an "overlay" page (see Hyper-V TLFS's Section > 5.2.1 "GPA Overlay Pages"), so with this fix we're sure the apic_assist > field is still zero, after the VP ASSIST PAGE is disabled. > > Fixes: ba696429d290 ("x86/hyper-v: Implement EOI assist") > Signed-off-by: Dexuan Cui > --- > arch/x86/hyperv/hv_init.c | 8 +++- > 1 file changed, 7 insertions(+), 1 deletion(-) > > diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c > index 0e033ef11a9f..db51a301f759 100644 > --- a/arch/x86/hyperv/hv_init.c > +++ b/arch/x86/hyperv/hv_init.c > @@ -60,8 +60,14 @@ static int hv_cpu_init(unsigned int cpu) > if (!hv_vp_assist_page) > return 0; > > + /* > + * The ZERO flag is necessary, because in the case of CPU offlining > + * the page can still be used by hv_apic_eoi_write() for a while, > + * after the VP ASSIST PAGE is disabled in hv_cpu_die(). > + */ > if (!*hvp) > - *hvp = __vmalloc(PAGE_SIZE, GFP_KERNEL, PAGE_KERNEL); > + *hvp = __vmalloc(PAGE_SIZE, GFP_KERNEL | __GFP_ZERO, > + PAGE_KERNEL); > > if (*hvp) { > u64 val; > -- > 2.19.1 Reviewed-by: Michael Kelley
RE: [PATCH v2] PCI: hv: Fix a use-after-free bug in hv_eject_device_work()
From: Dexuan Cui Sent: Friday, June 21, 2019 4:45 PM > > The commit 05f151a73ec2 itself is correct, but it exposes this > use-after-free bug, which is caught by some memory debug options. > > Add a Fixes tag to indicate the dependency. > > Fixes: 05f151a73ec2 ("PCI: hv: Fix a memory leak in hv_eject_device_work()") > Signed-off-by: Dexuan Cui > Cc: sta...@vger.kernel.org > --- > > In v2: > Replaced "hpdev->hbus" with "hbus", since we have the new "hbus" variable. > [Michael > Kelley] > > drivers/pci/controller/pci-hyperv.c | 15 +-- > 1 file changed, 9 insertions(+), 6 deletions(-) > > diff --git a/drivers/pci/controller/pci-hyperv.c > b/drivers/pci/controller/pci-hyperv.c > index 808a182830e5..5dadc964ad3b 100644 > --- a/drivers/pci/controller/pci-hyperv.c > +++ b/drivers/pci/controller/pci-hyperv.c > @@ -1880,6 +1880,7 @@ static void hv_pci_devices_present(struct > hv_pcibus_device > *hbus, > static void hv_eject_device_work(struct work_struct *work) > { > struct pci_eject_response *ejct_pkt; > + struct hv_pcibus_device *hbus; > struct hv_pci_dev *hpdev; > struct pci_dev *pdev; > unsigned long flags; > @@ -1890,6 +1891,7 @@ static void hv_eject_device_work(struct work_struct > *work) > } ctxt; > > hpdev = container_of(work, struct hv_pci_dev, wrk); > + hbus = hpdev->hbus; > > WARN_ON(hpdev->state != hv_pcichild_ejecting); > > @@ -1900,8 +1902,7 @@ static void hv_eject_device_work(struct work_struct > *work) >* because hbus->pci_bus may not exist yet. 
>*/ > wslot = wslot_to_devfn(hpdev->desc.win_slot.slot); > - pdev = pci_get_domain_bus_and_slot(hpdev->hbus->sysdata.domain, 0, > -wslot); > + pdev = pci_get_domain_bus_and_slot(hbus->sysdata.domain, 0, wslot); > if (pdev) { > pci_lock_rescan_remove(); > pci_stop_and_remove_bus_device(pdev); > @@ -1909,9 +1910,9 @@ static void hv_eject_device_work(struct work_struct > *work) > pci_unlock_rescan_remove(); > } > > - spin_lock_irqsave(&hpdev->hbus->device_list_lock, flags); > + spin_lock_irqsave(&hbus->device_list_lock, flags); > list_del(&hpdev->list_entry); > - spin_unlock_irqrestore(&hpdev->hbus->device_list_lock, flags); > + spin_unlock_irqrestore(&hbus->device_list_lock, flags); > > if (hpdev->pci_slot) > pci_destroy_slot(hpdev->pci_slot); > @@ -1920,7 +1921,7 @@ static void hv_eject_device_work(struct work_struct > *work) > ejct_pkt = (struct pci_eject_response *)&ctxt.pkt.message; > ejct_pkt->message_type.type = PCI_EJECTION_COMPLETE; > ejct_pkt->wslot.slot = hpdev->desc.win_slot.slot; > - vmbus_sendpacket(hpdev->hbus->hdev->channel, ejct_pkt, > + vmbus_sendpacket(hbus->hdev->channel, ejct_pkt, >sizeof(*ejct_pkt), (unsigned long)&ctxt.pkt, >VM_PKT_DATA_INBAND, 0); > > @@ -1929,7 +1930,9 @@ static void hv_eject_device_work(struct work_struct > *work) > /* For the two refs got in new_pcichild_device() */ > put_pcichild(hpdev); > put_pcichild(hpdev); > - put_hvpcibus(hpdev->hbus); > + /* hpdev has been freed. Do not use it any more. */ > + > + put_hvpcibus(hbus); > } > > /** > -- > 2.17.1 Reviewed-by: Michael Kelley
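The fix follows a common pattern: copy any pointer you still need out of an object before dropping the reference that may free it. A simplified user-space sketch follows; struct bus, struct child, put_child(), put_bus() and eject_child() are hypothetical names for illustration, and the reference counts are plain integers rather than the kernel's refcount_t.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for hv_pcibus_device and hv_pci_dev. */
struct bus {
	int refs;
};

struct child {
	int refs;
	struct bus *bus;
};

/* Drop one bus reference (the bus is never freed in this sketch). */
static void put_bus(struct bus *b)
{
	b->refs--;
}

/* Drop one child reference and free the child when the count hits zero. */
static void put_child(struct child *c)
{
	if (--c->refs == 0)
		free(c);
}

/*
 * Drop the child's two references, then the bus reference.
 * The bus pointer is cached first, because the second put_child()
 * frees the child and c->bus must not be read after that point.
 * Returns the remaining bus reference count.
 */
int eject_child(struct child *c)
{
	struct bus *b = c->bus;

	put_child(c);
	put_child(c);	/* c is freed here when it entered with 2 refs */

	put_bus(b);	/* safe: uses the cached pointer, not c->bus */
	return b->refs;
}
```

Writing put_bus(c->bus) as the last step instead would read freed memory, which is the class of bug that the memory debug options mentioned in the changelog catch.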
RE: [PATCH] PCI: hv: Fix a use-after-free bug in hv_eject_device_work()
From: Dexuan Cui Sent: Friday, June 21, 2019 12:02 PM > > The commit 05f151a73ec2 itself is correct, but it exposes this > use-after-free bug, which is caught by some memory debug options. > > Add the Fixes tag to indicate the dependency. > > Fixes: 05f151a73ec2 ("PCI: hv: Fix a memory leak in hv_eject_device_work()") > Signed-off-by: Dexuan Cui > Cc: sta...@vger.kernel.org > --- > Sorry for not spotting the bug when sending 05f151a73ec2. > > Now I have enabled the mm debug options to help catch such mistakes in future. > > drivers/pci/controller/pci-hyperv.c | 6 +- > 1 file changed, 5 insertions(+), 1 deletion(-) > > diff --git a/drivers/pci/controller/pci-hyperv.c > b/drivers/pci/controller/pci-hyperv.c > index 808a182830e5..42ace1a690f9 100644 > --- a/drivers/pci/controller/pci-hyperv.c > +++ b/drivers/pci/controller/pci-hyperv.c > @@ -1880,6 +1880,7 @@ static void hv_pci_devices_present(struct > hv_pcibus_device > *hbus, > static void hv_eject_device_work(struct work_struct *work) > { > struct pci_eject_response *ejct_pkt; > + struct hv_pcibus_device *hbus; > struct hv_pci_dev *hpdev; > struct pci_dev *pdev; > unsigned long flags; > @@ -1890,6 +1891,7 @@ static void hv_eject_device_work(struct work_struct > *work) > } ctxt; > > hpdev = container_of(work, struct hv_pci_dev, wrk); > + hbus = hpdev->hbus; In the lines of code following this new assignment, there are four uses of hpdev->hbus besides the one at the bottom of the function that causes the use-after-free error. With 'hbus' now available as a local variable, it looks rather strange to have those other places still using hpdev->hbus. I'm thinking they should be shortened to just 'hbus' for consistency, even though such changes aren't directly related to fixing the bug. 
Michael > > WARN_ON(hpdev->state != hv_pcichild_ejecting); > > @@ -1929,7 +1931,9 @@ static void hv_eject_device_work(struct work_struct > *work) > /* For the two refs got in new_pcichild_device() */ > put_pcichild(hpdev); > put_pcichild(hpdev); > - put_hvpcibus(hpdev->hbus); > + /* hpdev has been freed. Do not use it any more. */ > + > + put_hvpcibus(hbus); > } > > /** > -- > 2.17.1
RE: [PATCH] Drivers: hv: vmbus: Fix virt_to_hvpfn() for X86_PAE
From: Dexuan Cui Sent: Tuesday, May 7, 2019 12:47 AM
>
> In the case of X86_PAE, unsigned long is u32, but the physical address type
> should be u64. Due to the bug here, the netvsc driver can not load
> successfully, and sometimes the VM can panic due to memory corruption (the
> hypervisor writes data to the wrong location).
>
> Fixes: 6ba34171bcbd ("Drivers: hv: vmbus: Remove use of slow_virt_to_phys()")
> Cc: sta...@vger.kernel.org
> Cc: Michael Kelley
> Reported-and-tested-by: Juliana Rodrigueiro
> Signed-off-by: Dexuan Cui

Reviewed-by: Michael Kelley
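The patch body is not quoted in this message, so the following sketch only models the narrowing the changelog describes: holding a physical address in a 32-bit type before computing the page frame number. The function names pfn_truncated() and pfn_full() are hypothetical; the shift of 12 assumes the 4 KiB Hyper-V page size.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed 4 KiB Hyper-V page size. */
#define HV_HYP_PAGE_SHIFT 12

/*
 * Models the bug: the physical address passes through a 32-bit variable,
 * as "unsigned long" would be on a 32-bit PAE kernel, so any address
 * bits at or above 4 GiB are lost before the shift.
 */
uint64_t pfn_truncated(uint64_t paddr)
{
	uint32_t narrow = (uint32_t)paddr;

	return narrow >> HV_HYP_PAGE_SHIFT;
}

/* Models the fix: keep the full 64-bit physical address. */
uint64_t pfn_full(uint64_t paddr)
{
	return paddr >> HV_HYP_PAGE_SHIFT;
}
```

For a physical address of 0x123456000, just above 4 GiB, pfn_full() yields 0x123456 while pfn_truncated() yields 0x23456. That wrong frame number is how the hypervisor can end up writing to the wrong location. Below 4 GiB the two functions agree, which would explain why the problem only shows up on PAE systems with memory above that boundary.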
RE: [PATCH] vmbus: Remove the undesired put_cpu_ptr() in hv_synic_cleanup()
From: Dexuan Cui Sent: Friday, April 12, 2019 4:35 PM
>
> With CONFIG_DEBUG_PREEMPT=y, the put_cpu_ptr() triggers an underflow
> warning in preempt_count_sub().
>
> Fixes: 37cdd991fac8 ("vmbus: put related per-cpu variable together")
> Cc: sta...@vger.kernel.org
> Cc: Stephen Hemminger
> Signed-off-by: Dexuan Cui

Reviewed-by: Michael Kelley
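The underflow comes from an unpaired "put": put_cpu_ptr() re-enables preemption, so it must match an earlier get_cpu_ptr() that disabled it. The toy model below shows that pairing rule in user space. Every name in it is hypothetical, and it assumes (the patch body is not quoted here) that the affected code obtained its per-CPU pointer through an accessor that does not touch the preempt count.

```c
#include <assert.h>

/* Toy model of the preempt counter and the debug underflow check. */
static int preempt_count;
static int underflow_warnings;
static int percpu_slot;

static void preempt_disable_model(void)
{
	preempt_count++;
}

static void preempt_enable_model(void)
{
	if (preempt_count == 0) {
		underflow_warnings++;	/* what DEBUG_PREEMPT would report */
		return;
	}
	preempt_count--;
}

/* Models get_cpu_ptr(): disables preemption, then returns the pointer. */
int *get_ptr_model(void)
{
	preempt_disable_model();
	return &percpu_slot;
}

/* Models put_cpu_ptr(): re-enables preemption. */
void put_ptr_model(void)
{
	preempt_enable_model();
}

/* Models a plain per-CPU accessor that leaves the preempt count alone. */
int *plain_ptr_model(void)
{
	return &percpu_slot;
}

/* Balanced: get paired with put. Returns the warning count so far. */
int cleanup_balanced(void)
{
	int *p = get_ptr_model();

	*p = 0;
	put_ptr_model();
	return underflow_warnings;
}

/* Unbalanced: a put with no matching get. Returns the warning count. */
int cleanup_unbalanced(void)
{
	int *p = plain_ptr_model();

	*p = 0;
	put_ptr_model();
	return underflow_warnings;
}
```

Removing the stray put, as the patch title describes, is the simpler of the two possible repairs when the surrounding code already runs in a context where the plain accessor is safe.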
RE: [PATCH 1/3] PCI: hv: Fix a memory leak in hv_eject_device_work()
From: Lorenzo Pieralisi Sent: Tuesday, March 26, 2019 10:09 AM > On Thu, Mar 21, 2019 at 12:12:03AM +, Dexuan Cui wrote: > > > From: Michael Kelley > > > Sent: Wednesday, March 20, 2019 2:38 PM > > > > > > From: Dexuan Cui > > > > > > > > After a device is just created in new_pcichild_device(), hpdev->refs is > > > > set > > > > to 2 (i.e. the initial value of 1 plus the get_pcichild()). > > > > > > > > When we hot remove the device from the host, in Linux VM we first call > > > > hv_pci_eject_device(), which increases hpdev->refs by get_pcichild() and > > > > then schedules a work of hv_eject_device_work(), so hpdev->refs becomes > > > > 3 > > > > (let's ignore the paired get/put_pcichild() in other places). But in > > > > hv_eject_device_work(), currently we only call put_pcichild() twice, > > > > meaning the 'hpdev' struct can't be freed in put_pcichild(). This patch > > > > adds one put_pcichild() to fix the memory leak. > > > > > > > > BTW, the device can also be removed when we run "rmmod pci-hyperv". On > > > this > > > > path (hv_pci_remove() -> hv_pci_bus_exit() -> hv_pci_devices_present()), > > > > hpdev->refs is 2, and we do correctly call put_pcichild() twice in > > > > pci_devices_present_work(). > > > > > > Exiting new_pcichild_device() with hpdev->refs set to 2 seems OK to me. > > > There is the reference in the hbus->children list, and there is the > > > reference that > > > is returned to the caller. > > So IMO the "normal" reference count should be 2. :-) IMO only when a > > hv_pci_dev > > device is about to be destroyed, its reference count can drop to less than > > 2, > > i.e. first temporarily drop to 1 (meaning the hv_pci_dev device is removed > > from > > hbus->children), and then drop to zero (meaning kfree(hpdev) is called). > > > > > But what is strange is that pci_devices_present_work() > > > overwrites the reference returned in local variable hpdev without doing a > > > put_pcichild(). 
> > I suppose you mean: > > > > /* First, mark all existing children as reported missing. */ > > spin_lock_irqsave(&hbus->device_list_lock, flags); > > list_for_each_entry(hpdev, &hbus->children, list_entry) { > > hpdev->reported_missing = true; > > } > > spin_unlock_irqrestore(&hbus->device_list_lock, flags) > > > > This is not strange to me, because, in pci_devices_present_work(), at first > > we > > don't know which devices are about to disappear, so we pre-mark all devices > > to > > be potentially missing like that; if a device is still on the bus, we'll > > mark its > > hpdev->reported_missing to false later; only after we know exactly which > > devices are missing, we should call put_pcichild() against them. All these > > seem natural to me. > > > > > It seems like the "normal" reference count should be 1 when the > > > child device is not being manipulated, not 2. > > What does "not being manipulated" mean? > > > > > The fix would be to add a call to > > > put_pcichild() when the return value from new_pcichild_device() is > > > overwritten. > > In pci_devices_present_work(), we NEVER "overwrite" the "hpdev" returned > > from new_pcichild_device(): the "reported_missing" field of the new hpdev > > is implicitly initialized to false in new_pcichild_device(). > > > > > Then remove the call to put_pcichild() in pci_device_present_work() when > > > missing > > > children are moved to the local list. The children have been moved from > > > one > > > list > > > to another, so there's no need to decrement the reference count. Then > > > when > > > everything in the local list is deleted, the reference is correctly > > > decremented, > > > presumably freeing the memory. > > > > > > With this approach, the code in hv_eject_device_work() is correct. 
> > > There's one call to put_pcichild() to reflect removing the child device
> > > from the hbus->children list, and one call to put_pcichild() to pair
> > > with the get_pcichild() in hv_pci_eject_device().
> > Please refer to my replies above. IMO we should fix
> > hv_eject_device_work() rather than pci_devices_present_work().
>
> Have we reached a conclusion on this? I would like to merge this series
> given that it is fixing bugs and it has hung in the balance for quite
> a while but it looks like Michael is not too happy about these patches
> and I need a maintainer ACK to merge them.
>
> Thanks,
> Lorenzo

Dexuan and I have discussed the topic extensively offline. The patch works
in its current form, and I'll agree to it.

Reviewed-by: Michael Kelley
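The reference counts argued over in this thread can be replayed with plain arithmetic. The sketch below follows the sequence given in Dexuan's changelog (initial value 1, one get in new_pcichild_device(), one get in hv_pci_eject_device()) and takes the number of put_pcichild() calls in the work item as a parameter. refs_after_eject() is a hypothetical helper written for this note.

```c
#include <assert.h>

/*
 * Replays the hpdev->refs sequence described in the changelog and
 * returns the count left after hv_eject_device_work() has run with
 * the given number of put_pcichild() calls.
 */
int refs_after_eject(int puts_in_work_item)
{
	int refs = 1;	/* initial value set at creation */

	refs++;		/* get_pcichild() in new_pcichild_device(): 2 */
	refs++;		/* get_pcichild() in hv_pci_eject_device(): 3 */

	while (puts_in_work_item-- > 0)
		refs--;	/* each put_pcichild() in the work item */

	return refs;
}
```

With two puts the count ends at 1, so the structure is never freed, which is the leak. With three puts it reaches 0. Michael's alternative would instead lower the resting count to 1 so that two puts suffice; the arithmetic is the same with one fewer increment.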
RE: [PATCH 3/3] PCI: hv: Add pci_destroy_slot() in pci_devices_present_work(), if necessary
From: Dexuan Cui Sent: Wednesday, March 20, 2019 5:36 PM > > > From: Michael Kelley > > > ... > > > diff --git a/drivers/pci/controller/pci-hyperv.c > > > @@ -1776,6 +1776,10 @@ static void pci_devices_present_work(struct > > work_struct *work) > > > hpdev = list_first_entry(&removed, struct hv_pci_dev, > > >list_entry); > > > list_del(&hpdev->list_entry); > > > + > > > + if (hpdev->pci_slot) > > > + pci_destroy_slot(hpdev->pci_slot); > > > > The code is inconsistent in whether hpdev->pci_slot is set to NULL after > > calling > > pci_destory_slot(). > Here, in pci_devices_present_work(), it's unnecessary to set it to NULL, > Because: > 1) the "hpdev" is removed from hbus->children and it can not be seen > elsewhere; > 2) the "hpdev" struct is freed in the below put_pcichild(): > > while (!list_empty(&removed)) { > hpdev = list_first_entry(&removed, struct hv_pci_dev, > list_entry); > list_del(&hpdev->list_entry); > > if (hpdev->pci_slot) > pci_destroy_slot(hpdev->pci_slot); > > put_pcichild(hpdev); > } > > > Patch 2 in this series does set it to NULL, but this code does not. > In Patch2, i.e. in the code path hv_pci_remove() -> hv_pci_remove_slots(), > we must set hpdev->pci_slot to NULL, otherwise, later, due to > hv_pci_remove() -> hv_pci_bus_exit() -> > hv_pci_devices_present() with the zero "relations", we'll double-free the > "hpdev" struct in pci_devices_present_work() -- see the above. > > > And the code in hv_eject_device_work() does not set it to NULL. > It's unnecessary to set hpdev->pci_slot to NULL in hv_eject_device_work(), > Because in hv_eject_device_work(): > 1) the "hpdev" is removed from hbus->children and it can not be seen > elsewhere; > 2) the "hpdev" struct is freed at the end of hv_eject_device_work() with my > first patch: [PATCH 1/3] PCI: hv: Fix a memory leak in hv_eject_device_work(). 
>
> > It looks like all the places that test the value of hpdev->pci_slot or call
> > pci_destroy_slot() are serialized, so it looks like it really doesn't
> > matter. But when the code is inconsistent about setting to NULL, it always
> > makes me wonder if there is a reason.
> >
> > Michael
>

Reviewed-by: Michael Kelley
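Dexuan's distinction is about object lifetime: the pointer only needs clearing on a path where the owning structure stays alive and a later path will test the pointer again. A small sketch of that rule follows; struct slot, struct dev and the two remove functions are hypothetical stand-ins for illustration, not the pci-hyperv code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; "destroyed" counts destroy calls on a slot. */
struct slot {
	int destroyed;
};

struct dev {
	struct slot *slot;
};

static void destroy_slot(struct slot *s)
{
	s->destroyed++;
}

/*
 * Early path: the dev stays alive afterwards and a later path will
 * look at d->slot again, so the pointer is cleared after the destroy.
 */
void remove_slot_keep_dev(struct dev *d)
{
	if (d->slot) {
		destroy_slot(d->slot);
		d->slot = NULL;
	}
}

/*
 * Final path: the dev is about to be freed, so clearing the pointer
 * would have no observable effect and is skipped.
 */
void remove_slot_final(struct dev *d)
{
	if (d->slot)
		destroy_slot(d->slot);
}
```

Running the early path and then the final path destroys the slot exactly once. Without the NULL assignment in the early path, the final path would destroy it a second time.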
RE: [PATCH 2/3] PCI: hv: Add hv_pci_remove_slots() when we unload the driver
From: Dexuan Cui Sent: Monday, March 4, 2019 1:35 PM
>
> When we unload pci-hyperv, the host doesn't send us a PCI_EJECT message.
> In this case we also need to make sure the sysfs pci slot directory
> is removed, otherwise "cat /sys/bus/pci/slots/2/address" will trigger
> "BUG: unable to handle kernel paging request" (I noticed the issue when
> systemd-dev crashed for me when I unloaded the driver). And, if we
> unload/reload the driver several times, we'll have multiple pci slot
> directories in /sys/bus/pci/slots/ like this:
>
> root@localhost:~# ls -rtl /sys/bus/pci/slots/
> total 0
> drwxr-xr-x 2 root root 0 Feb 7 10:49 2
> drwxr-xr-x 2 root root 0 Feb 7 10:49 2-1
> drwxr-xr-x 2 root root 0 Feb 7 10:51 2-2
>
> The patch adds the missing code.
>
> Fixes: a15f2c08c708 ("PCI: hv: support reporting serial number as slot
> information")
> Signed-off-by: Dexuan Cui
> Acked-by: Stephen Hemminger
> Cc: sta...@vger.kernel.org

Reviewed-by: Michael Kelley
RE: [PATCH 1/3] PCI: hv: Fix a memory leak in hv_eject_device_work()
From: Dexuan Cui > > After a device is just created in new_pcichild_device(), hpdev->refs is set > to 2 (i.e. the initial value of 1 plus the get_pcichild()). > > When we hot remove the device from the host, in Linux VM we first call > hv_pci_eject_device(), which increases hpdev->refs by get_pcichild() and > then schedules a work of hv_eject_device_work(), so hpdev->refs becomes 3 > (let's ignore the paired get/put_pcichild() in other places). But in > hv_eject_device_work(), currently we only call put_pcichild() twice, > meaning the 'hpdev' struct can't be freed in put_pcichild(). This patch > adds one put_pcichild() to fix the memory leak. > > BTW, the device can also be removed when we run "rmmod pci-hyperv". On this > path (hv_pci_remove() -> hv_pci_bus_exit() -> hv_pci_devices_present()), > hpdev->refs is 2, and we do correctly call put_pcichild() twice in > pci_devices_present_work(). > > Fixes: 4daace0d8ce8 ("PCI: hv: Add paravirtual PCI front-end for Microsoft > Hyper-V VMs") > Signed-off-by: Dexuan Cui > Cc: Exiting new_pcichild_device() with hpdev->refs set to 2 seems OK to me. There is the reference in the hbus->children list, and there is the reference that is returned to the caller. But what is strange is that pci_devices_present_work() overwrites the reference returned in local variable hpdev without doing a put_pcichild(). It seems like the "normal" reference count should be 1 when the child device is not being manipulated, not 2. The fix would be to add a call to put_pcichild() when the return value from new_pcichild_device() is overwritten. Then remove the call to put_pcichild() in pci_device_present_work() when missing children are moved to the local list. The children have been moved from one list to another, so there's no need to decrement the reference count. Then when everything in the local list is deleted, the reference is correctly decremented, presumably freeing the memory. 
With this approach, the code in hv_eject_device_work() is correct. There's one call to put_pcichild() to reflect removing the child device from the hbus-> children list, and one call to put_pcichild() to pair with the get_pcichild() in hv_pci_eject_device(). Your patch works, but to me it leaves the ref count in an unnatural state most of the time. Michael ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
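The accounting argued for above can be checked with a small user-space model. This is only a sketch under stated assumptions: struct child, child_get(), child_put(), child_create() and child_eject() are hypothetical stand-ins for struct hv_pci_dev, get_pcichild(), put_pcichild(), new_pcichild_device() and the eject path, and the "freed" flag replaces the real kfree().

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct hv_pci_dev: only the refcount matters. */
struct child {
	int refs;
	bool freed;
};

static void child_get(struct child *c)
{
	c->refs++;
}

static void child_put(struct child *c)
{
	if (--c->refs == 0)
		c->freed = true;	/* the driver would free the struct here */
}

/*
 * Creation returns with refs == 2: one reference for the children list
 * and one reference handed back to the caller.
 */
static void child_create(struct child *c)
{
	c->refs = 1;		/* reference held by the children list */
	c->freed = false;
	child_get(c);		/* reference returned to the caller */
}

/* Eject path: one get when the work is scheduled, two puts in the work. */
static void child_eject(struct child *c)
{
	child_get(c);		/* taken when the eject work is scheduled */
	child_put(c);		/* pairs with the get above */
	child_put(c);		/* child removed from the children list */
}
```

In this model, if the caller of child_create() drops its own reference, the count rests at 1 and child_eject() frees the object with two puts. If the caller never drops it, the count rests at 2 and the same eject sequence leaves one reference behind, which is the leak the original patch papers over with a third put.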
RE: [PATCH 3/3] PCI: hv: Add pci_destroy_slot() in pci_devices_present_work(), if necessary
From: Dexuan Cui Sent: Monday, March 4, 2019 1:35 PM
>
> diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
> index b489412e3502..82acd6155adf 100644
> --- a/drivers/pci/controller/pci-hyperv.c
> +++ b/drivers/pci/controller/pci-hyperv.c
> @@ -1776,6 +1776,10 @@ static void pci_devices_present_work(struct work_struct *work)
>  		hpdev = list_first_entry(&removed, struct hv_pci_dev,
>  					 list_entry);
>  		list_del(&hpdev->list_entry);
> +
> +		if (hpdev->pci_slot)
> +			pci_destroy_slot(hpdev->pci_slot);

The code is inconsistent in whether hpdev->pci_slot is set to NULL after
calling pci_destroy_slot(). Patch 2 in this series does set it to NULL,
but this code does not. And the code in hv_eject_device_work() does not
set it to NULL.

It looks like all the places that test the value of hpdev->pci_slot or
call pci_destroy_slot() are serialized, so it looks like it really doesn't
matter. But when the code is inconsistent about setting to NULL, it always
makes me wonder if there is a reason.

Michael

> +
>  		put_pcichild(hpdev);
>  	}
>
> --
> 2.19.1
RE: [PATCH V6 1/3] x86/Hyper-V: Set x2apic destination mode to physical when x2apic is available
From: Tianyu Lan Sent: Wednesday, February 27, 2019 6:54 AM
>
> Hyper-V doesn't provide irq remapping for IO-APIC. To enable x2apic,
> set x2apic destination mode to physical mode when x2apic is available
> and Hyper-V IOMMU driver makes sure cpus assigned with IO-APIC irqs have
> 8-bit APIC id.
>
> Reviewed-by: Thomas Gleixner
> Reviewed-by: Michael Kelley
> Signed-off-by: Lan Tianyu
> ---
> Change since v5:
>    - Fix compile error due to x2apic_phys
>
> Change since v2:
>    - Fix compile error due to x2apic_phys
>    - Fix comment indent
> Change since v1:
>    - Remove redundant extern for x2apic_phys
> ---
>  arch/x86/kernel/cpu/mshyperv.c | 12
>  1 file changed, 12 insertions(+)

Reconfirming my Reviewed-by after the change to fix the compile error
detected by the kbuild test robot.

Reviewed-by: Michael Kelley
RE: [PATCH V5 0/3] x86/Hyper-V/IOMMU: Add Hyper-V IOMMU driver to support x2apic mode
From: Tianyu Lan Sent: Friday, February 22, 2019 4:12 AM > > On the bare metal, enabling X2APIC mode requires interrupt remapping > function which helps to deliver irq to cpu with 32-bit APIC ID. > Hyper-V doesn't provide interrupt remapping function so far and Hyper-V > MSI protocol already supports to deliver interrupt to the CPU whose > virtual processor index is more than 255. IO-APIC interrupt still has > 8-bit APIC ID limitation. > > This patchset is to add Hyper-V stub IOMMU driver in order to enable > X2APIC mode successfully in Hyper-V Linux guest. The driver returns X2APIC > interrupt remapping capability when X2APIC mode is available. X2APIC > destination mode is set to physical by PATCH 1 when X2APIC is available. > Hyper-V IOMMU driver will scan cpu 0~255 and set cpu into IO-APIC MAX cpu > affinity cpumask if its APIC ID is 8-bit. Driver creates a Hyper-V irq domain > to limit IO-APIC interrupts' affinity and make sure cpus assigned with IO-APIC > interrupt are in the scope of IO-APIC MAX cpu affinity. > > Lan Tianyu (3): > x86/Hyper-V: Set x2apic destination mode to physical when x2apic is > available > HYPERV/IOMMU: Add Hyper-V stub IOMMU driver > MAINTAINERS: Add Hyper-V IOMMU driver into Hyper-V CORE AND DRIVERS > scope > Joerg -- What's your take on this patch set now that it has settled down? If you are good with it, from the Microsoft standpoint we're hoping that it can get into linux-next this week (given the extra week due to 5.0-rc8). Thanks, Michael Kelley ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
RE: [PATCH V4 1/3] x86/Hyper-V: Set x2apic destination mode to physical when x2apic is available
From: lantianyu1...@gmail.com Sent: Monday, February 11, 2019 6:20 AM
>
> Hyper-V doesn't provide irq remapping for IO-APIC. To enable x2apic,
> set x2apic destination mode to physical mode when x2apic is available
> and Hyper-V IOMMU driver makes sure cpus assigned with IO-APIC irqs have
> 8-bit APIC id.
>
> Reviewed-by: Thomas Gleixner
> Signed-off-by: Lan Tianyu
> ---
> Change since v2:
>    - Fix compile error due to x2apic_phys
>    - Fix comment indent
> Change since v1:
>    - Remove redundant extern for x2apic_phys
>

Reviewed-by: Michael Kelley
RE: [PATCH v2 2/2] Drivers: hv: vmbus: Return -EINVAL if monitor_allocated not set
From: Kimberly Brown Sent: Monday, February 18, 2019 9:38 PM > > There are two methods for signaling the host: the monitor page mechanism > and hypercalls. The monitor page mechanism is used by performance > critical channels (storage, networking, etc.) because it provides > improved throughput. However, latency is increased. Monitor pages are > allocated to these channels. > > Monitor pages are not allocated to channels that do not use the monitor > page mechanism. Therefore, these channels do not have a valid monitor id > or valid monitor page data. In these cases, some of the "_show" > functions return incorrect data. They return an invalid monitor id and > data that is beyond the bounds of the hv_monitor_page array fields. > > The "channel->offermsg.monitor_allocated" value can be used to determine > whether monitor pages have been allocated to a channel. In the affected > "_show" functions, verify that "channel->offermsg.monitor_allocated" is > set before accessing the monitor id or the monitor page data. If > "channel->offermsg.monitor_allocated" is not set, return -EINVAL. > > Signed-off-by: Kimberly Brown > Reviewed-by: Michael Kelley ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
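The guard described in the commit message above can be illustrated with a minimal user-space sketch. The names here are assumptions for illustration only: struct chan and monitor_id_show() are hypothetical, simplified stand-ins for the vmbus channel and one of the affected "_show" functions, and snprintf() stands in for the sysfs formatting helper.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical, simplified stand-in for the channel state discussed above. */
struct chan {
	bool monitor_allocated;	/* models channel->offermsg.monitor_allocated */
	unsigned int monitor_id;
};

/*
 * Model of a sysfs "_show" function: format the monitor id into buf and
 * return the number of characters written, or -EINVAL when no monitor
 * page was allocated and the id would be meaningless.
 */
static int monitor_id_show(const struct chan *c, char *buf, size_t len)
{
	if (!c->monitor_allocated)
		return -EINVAL;

	return snprintf(buf, len, "%u\n", c->monitor_id);
}
```

The point of returning an error rather than a placeholder value is that a reader of the sysfs file cannot mistake stale or out-of-bounds data for a valid monitor id.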
RE: [PATCH v2 1/2] Drivers: hv: vmbus: Change server monitor_pages index to 0
From: Kimberly Brown Sent: Monday, February 18, 2019 9:38 PM > > Change the monitor_pages index in server_monitor_pending_show() to '0'. > '0' is the correct monitor_pages index for the server. A comment for the > monitor_pages field in the vmbus_connection struct definition indicates > that the 1st page is for parent->child notifications. In addition, the > server_monitor_latency_show() and server_monitor_conn_id_show() > functions use monitor_pages index '0'. > > Signed-off-by: Kimberly Brown > Acked-by: Stephen Hemminger > --- > drivers/hv/vmbus_drv.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > Reviewed-by: Michael Kelley ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
RE: [PATCH v2 1/2] PCI: hv: Replace hv_vp_set with hv_vpset
From: Lorenzo Pieralisi Sent: Friday, February 15, 2019 2:27 AM
>
> I will add Michael's tag to v3 (unless Michael is not happy with that),
> it is missing there.
>

Yes, please add. Thanks.

Michael
RE: [PATCH v2 1/2] PCI: hv: Replace hv_vp_set with hv_vpset
From: Lorenzo Pieralisi Sent: Tuesday, February 12, 2019 8:35 AM > > On Mon, Jan 28, 2019 at 09:49:32PM -0800, Maya Nakamura wrote: > > On Sun, Jan 27, 2019 at 05:11:48AM +0000, Michael Kelley wrote: > > > From: Maya Nakamura Sent: Saturday, January > > > 26, > 2019 12:52 AM > > > > > > > > Remove a duplicate definition of VP set (hv_vp_set) and use the common > > > > definition (hv_vpset) that is used in other places. > > > > > > > > Change the order of the members in struct hv_pcibus_device so that the > > > > declaration of retarget_msi_interrupt_params is the last member. Struct > > > > hv_vpset, which contains a flexible array, is nested two levels deep in > > > > struct hv_pcibus_device via retarget_msi_interrupt_params. > > > > > > > > Add a comment that retarget_msi_interrupt_params should be the last > > > > member > > > > of struct hv_pcibus_device. > > > > > > > > Signed-off-by: Maya Nakamura > > > > --- > > > > Change in v2: > > > > - None > > > > > > > > > > Right -- there was no code change. But it's customary to note that > > > you updated the commit message. > > > > > Thank you for your feedback. I will edit the change log in v3. > > > > > Reviewed-by: Michael Kelley > > Are you really sure there is no behavioural change ? What piece of > code allocates hv_vpset.bank_contents[] memory with this patch applied ? > > I suspect the current code does not use hv_vpset for this specific > reason, ie allocate struct hv_vp_set.masks array memory statically. > There is indeed no behavior change. A full page of memory is allocated in hv_pci_probe() so that we can be sure that the Hyper-V hypercall arguments don't cross a page boundary. This page allows more than enough space for the hv_vpset.bank_contents[] to grow as needed (with one bit allocated in the masks for up to the limit of 8192 CPUs allowed by Linux). A flexible array is used because the hv_vpset structure is also used in some MMU hypercalls that have two variable size arrays. 
Michael ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
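The sizing argument in the reply above can be checked with a little arithmetic. The sketch below is an assumption-laden model, not the driver code: struct vpset_sketch mirrors only the general shape of a VP set (a fixed header followed by a flexible array of 64-bit bank masks), the 4096-byte page size is assumed, and any extra fields that the real hypercall argument places in front of the VP set are ignored.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE	4096u	/* assumed page size */
#define SKETCH_MAX_CPUS		8192u	/* CPU limit cited in the reply */
#define SKETCH_BITS_PER_BANK	64u	/* one bit per VP in a 64-bit mask */

/* Simplified model of a VP set: fixed header plus a flexible array. */
struct vpset_sketch {
	uint64_t format;
	uint64_t valid_bank_mask;
	uint64_t bank_contents[];	/* one mask per bank of 64 VPs */
};

/* Bytes needed to describe nr_cpus virtual processors. */
static size_t vpset_bytes(unsigned int nr_cpus)
{
	size_t banks = (nr_cpus + SKETCH_BITS_PER_BANK - 1) /
		       SKETCH_BITS_PER_BANK;

	return offsetof(struct vpset_sketch, bank_contents) +
	       banks * sizeof(uint64_t);
}
```

Under these assumptions, 8192 CPUs need 128 banks, so the VP set occupies 16 + 128 * 8 = 1040 bytes, well inside a single page.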
RE: [PATCH v4] Drivers: hv: vmbus: Expose counters for interrupts and full conditions
From: Kimberly Brown Sent: Sunday, February 3, 2019 11:13 PM > > Counter values for per-channel interrupts and ring buffer full > conditions are useful for investigating performance. > > Expose counters in sysfs for 2 types of guest to host interrupts: > 1) Interrupts caused by the channel's outbound ring buffer transitioning > from empty to not empty > 2) Interrupts caused by the channel's inbound ring buffer transitioning > from full to not full while a packet is waiting for enough buffer space to > become available > > Expose 2 counters in sysfs for the number of times that write operations > encountered a full outbound ring buffer: > 1) The total number of write operations that encountered a full > condition > 2) The number of write operations that were the first to encounter a > full condition > > Increment the outbound full condition counters in the > hv_ringbuffer_write() function because, for most drivers, a full > outbound ring buffer is detected in that function. Also increment the > outbound full condition counters in the set_channel_pending_send_size() > function. In the hv_sock driver, a full outbound ring buffer is detected > and set_channel_pending_send_size() is called before > hv_ringbuffer_write() is called. > > I tested this patch by confirming that the sysfs files were created and > observing the counter values. The values seemed to increase by a > reasonable amount when the Hyper-v related drivers were in use. > > Signed-off-by: Kimberly Brown > --- > Reviewed-by: Michael Kelley ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
RE: [PATCH V2 1/3] x86/Hyper-V: Set x2apic destination mode to physical when x2apic is available
From: lantianyu1...@gmail.com Sent: Saturday, February 2, 2019 5:15 AM
>
> Hyper-V doesn't provide irq remapping for IO-APIC. To enable x2apic,
> set x2apic destination mode to physical mode when x2apic is available
> and Hyper-V IOMMU driver makes sure cpus assigned with IO-APIC irqs have
> 8-bit APIC id.
>
> Signed-off-by: Lan Tianyu
> ---
> Change since v1:
>    - Remove redundant extern for x2apic_phys
> ---
>  arch/x86/kernel/cpu/mshyperv.c | 10 ++
>  1 file changed, 10 insertions(+)
>
> diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
> index e81a2db..4bd6d90 100644
> --- a/arch/x86/kernel/cpu/mshyperv.c
> +++ b/arch/x86/kernel/cpu/mshyperv.c
> @@ -328,6 +328,16 @@ static void __init ms_hyperv_init_platform(void)
>  # ifdef CONFIG_SMP
>  	smp_ops.smp_prepare_boot_cpu = hv_smp_prepare_boot_cpu;
>  # endif
> +
> +/*
> + * Hyper-V doesn't provide irq remapping for IO-APIC. To enable x2apic,
> + * set x2apic destination mode to physical mode when x2apic is available
> + * and Hyper-V IOMMU driver makes sure cpus assigned with IO-APIC irqs
> + * have 8-bit APIC id.
> + */

Per comment from Dan Carpenter on v1 of this patch, the above comment
block should be indented one tab to line up with the "if" statement below.

Michael

> +	if (IS_ENABLED(CONFIG_HYPERV_IOMMU) && x2apic_supported())
> +		x2apic_phys = 1;
> +
>  #endif
>  }
>
> --
> 2.7.4
RE: [PATCH] Drivers: hv: vmbus: Add mutex lock to channel show functions
From: Sasha Levin Sent: Thursday, January 31, 2019 7:20 AM
>
> I've queued this one for hyper-fixes, thanks all!
>

Actually, please hold off on queuing this one. In a conversation I had
yesterday with Kim, they had identified a deadlock. Kim was going to be
looking at some revisions to avoid the deadlock.

Kim -- please confirm.

Michael
RE: [PATCH v3 2/2] PCI: hv: Refactor hv_irq_unmask() to use cpumask_to_vpset()
From: Maya Nakamura Sent: Monday, January 28, 2019 11:21 PM > > Remove the duplicate implementation of cpumask_to_vpset() and use the > shared implementation. Export hv_max_vp_index, which is required by > cpumask_to_vpset(). > > Apply changes to hv_irq_unmask() based on feedback. > > Signed-off-by: Maya Nakamura > --- > Changes in v3: > - Modify to catch all failures from cpumask_to_vpset(). > - Correct the v2 change log about the commit message. > > Changes in v2: > - Remove unnecessary nr_bank initialization. > - Delete two unnecessary dev_err()'s. > - Unlock before returning. > - Update the commit message. > Reviewed-by: Michael Kelley ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
RE: [PATCH v3 1/2] PCI: hv: Replace hv_vp_set with hv_vpset
From: Maya Nakamura Sent: Monday, January 28, 2019 11:18 PM > > Remove a duplicate definition of VP set (hv_vp_set) and use the common > definition (hv_vpset) that is used in other places. > > Change the order of the members in struct hv_pcibus_device so that the > declaration of retarget_msi_interrupt_params is the last member. Struct > hv_vpset, which contains a flexible array, is nested two levels deep in > struct hv_pcibus_device via retarget_msi_interrupt_params. > > Add a comment that retarget_msi_interrupt_params should be the last member > of struct hv_pcibus_device. > > Signed-off-by: Maya Nakamura > --- > Change in v3: > - Correct the v2 change log. > > Change in v2: > - Update the commit message. > Reviewed-by: Michael Kelley ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
RE: [PATCH v2 2/2] PCI: hv: Refactor hv_irq_unmask() to use cpumask_to_vpset()
From: Maya Nakamura Sent: Saturday, January 26, 2019 12:55 AM
>
> @@ -953,29 +951,27 @@ static void hv_irq_unmask(struct irq_data *data)
>  	 */
>  	params->int_target.flags |=
>  		HV_DEVICE_INTERRUPT_TARGET_PROCESSOR_SET;
> -	params->int_target.vp_set.valid_bank_mask =
> -		(1ull << HV_VP_SET_BANK_COUNT_MAX) - 1;
> +
> +	if (!alloc_cpumask_var(&tmp, GFP_KERNEL)) {
> +		res = 1;
> +		goto exit_unlock;
> +	}
> +
> +	cpumask_and(tmp, dest, cpu_online_mask);
> +	nr_bank = cpumask_to_vpset(&params->int_target.vp_set, tmp);
> +	free_cpumask_var(tmp);
> +
> +	if (!nr_bank) {

There are two failure cases in cpumask_to_vpset(). One case returns 0,
and the other case returns -1. The above test only catches the 0 failure
case. The test needs to be modified to catch both cases.

Michael

> +		res = 1;
> +		goto exit_unlock;
> +	}
>
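A caller-side check that covers both failure returns might look like the sketch below. Everything here is hypothetical and for illustration: vps_to_banks() is a user-space stand-in for cpumask_to_vpset() with made-up failure conditions (0 for an empty set, -1 for a VP index beyond the available banks), and retarget() only mirrors the shape of the error handling in hv_irq_unmask().

```c
#include <stdint.h>

#define SKETCH_MAX_BANKS 64	/* assumed number of 64-bit banks */

/*
 * Hypothetical stand-in for cpumask_to_vpset(): returns the number of
 * banks in use, 0 if no VP was given, or -1 if a VP index does not fit.
 */
static int vps_to_banks(const unsigned int *vp, int n, uint64_t *banks)
{
	int i, nr_bank = 0;

	for (i = 0; i < SKETCH_MAX_BANKS; i++)
		banks[i] = 0;

	for (i = 0; i < n; i++) {
		int bank = (int)(vp[i] / 64);

		if (bank >= SKETCH_MAX_BANKS)
			return -1;

		banks[bank] |= 1ULL << (vp[i] % 64);
		if (bank + 1 > nr_bank)
			nr_bank = bank + 1;
	}

	return nr_bank;
}

/* Treat both 0 and negative returns as failure, as the review asks. */
static int retarget(const unsigned int *vp, int n)
{
	uint64_t banks[SKETCH_MAX_BANKS];
	int nr_bank = vps_to_banks(vp, n, banks);

	if (nr_bank <= 0)
		return 1;	/* mirrors "res = 1; goto exit_unlock;" */

	return 0;
}
```

With a "!nr_bank" test, the -1 return would be treated as a valid bank count; comparing with "<= 0" closes that gap without needing two separate branches.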
RE: [PATCH v2 1/2] PCI: hv: Replace hv_vp_set with hv_vpset
From: Maya Nakamura Sent: Saturday, January 26, 2019 12:52 AM > > Remove a duplicate definition of VP set (hv_vp_set) and use the common > definition (hv_vpset) that is used in other places. > > Change the order of the members in struct hv_pcibus_device so that the > declaration of retarget_msi_interrupt_params is the last member. Struct > hv_vpset, which contains a flexible array, is nested two levels deep in > struct hv_pcibus_device via retarget_msi_interrupt_params. > > Add a comment that retarget_msi_interrupt_params should be the last member > of struct hv_pcibus_device. > > Signed-off-by: Maya Nakamura > --- > Change in v2: > - None > Right -- there was no code change. But it's customary to note that you updated the commit message. Reviewed-by: Michael Kelley ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
RE: [PATCH] nfit: add Hyper-V NVDIMM DSM command set to white list
From: Dexuan Cui Sent: Wednesday, January 23, 2019 12:51 PM > > Add the Hyper-V _DSM command set to the white list of NVDIMM command > sets. > > This command set is documented at > http://www.uefi.org/RFIC_LIST (see the link to "Virtual NVDIMM 0x1901" on the > page). > > Signed-off-by: Dexuan Cui > --- > > I'm going to change the user-space utility "ndctl" to support Hyper-V Virtual > NVDIMM. > This kernel patch is required first. > > drivers/acpi/nfit/core.c | 5 - > drivers/acpi/nfit/nfit.h | 6 +- > include/uapi/linux/ndctl.h | 1 + > 3 files changed, 10 insertions(+), 2 deletions(-) > Reviewed-by: Michael Kelley ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
RE: [PATCH 2/4] arm64: hyperv: Add support for Hyper-V as a hypervisor
From: Michael Kelley Sent: Friday, January 4, 2019 12:05 PM > > > >> As Will said, this isn't a viable option. Please follow SMCCC 1.1. > > > > > > I'll have to start a conversation with the Hyper-V team about this. > > > I don't know why they chose to use HVC #1 or this register scheme > > > for output values. It may be tough to change at this point because > > > there are Windows guests on Hyper-V for ARM64 that are already > > > using this approach. > > > > I appreciate you already have stuff in the wild, but there is definitely > > a case to be made for supporting architecturally specified mechanisms in > > a hypervisor, and SMCCC is definitely part of it (I'm certainly curious > > of how you support the Spectre mitigation otherwise). > > > > The Hyper-V guys I need to discuss this with are not back from the > holidays until January 7th. I'll follow up on this thread once I've > had that conversation. > Feedback from the Hyper-V guys is that they believe the Hyper-V specific hypercall sequence *is* compliant with SMCCC 1.1, in the sense of being outside the requirements per Section 2.9 since the Hyper-V hypercall sequence uses HVC #1. Hyper-V wanted to use a simpler calling sequence that doesn't have the full register save/restore requirements, for better performance. The details of the Hyper-V hypercall sequence are documented internally, but we do still need to get it published externally as part of a Hyper-V TLFS version that includes ARM64. Hyper-V uses the full SMC Calling Conventions in other places, such as for PSCI calls, for SMCs to EL3, and for the Spectre mitigation related calls. Michael ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
RE: [PATCH hyperv-fixes, 3/3] Fix hash key value reset after other ops
From: Haiyang Zhang Sent: Monday, January 14, 2019 4:52 PM
>
> Changing mtu, channels, or buffer sizes ops call to netvsc_attach(),
> rndis_set_subchannel(), which always reset the hash key to default
> value. That will override hash key changed previously. This patch
> fixes the problem by saving the hash key, then restoring it when we
> re-add the netvsc device.
>
> Fixes: ff4a44199012 ("netvsc: allow get/set of RSS indirection table")
> Signed-off-by: Haiyang Zhang
> ---
>  drivers/net/hyperv/hyperv_net.h   | 10 +++---
>  drivers/net/hyperv/netvsc.c       |  2 +-
>  drivers/net/hyperv/netvsc_drv.c   |  5 -
>  drivers/net/hyperv/rndis_filter.c |  9 +++--
>  4 files changed, 19 insertions(+), 7 deletions(-)
>

Reviewed-by: Michael Kelley
RE: [PATCH hyperv-fixes,2/3] Refactor assignments of struct netvsc_device_info
From: Haiyang Zhang Sent: Monday, January 14, 2019 4:52 PM
>
> These assignments occur in multiple places. The patch refactors them
> to a function for simplicity. It also puts the struct to heap area
> for future expansion.
>
> Signed-off-by: Haiyang Zhang
> ---
>  drivers/net/hyperv/netvsc_drv.c | 134
>  1 file changed, 85 insertions(+), 49 deletions(-)
>

Reviewed-by: Michael Kelley
RE: [PATCH hyperv-fixes,1/3] Fix ethtool change hash key error
From: Haiyang Zhang Sent: Monday, January 14, 2019 4:52 PM > > Hyper-V hosts require us to disable RSS before changing RSS key, > otherwise the changing request will fail. This patch fixes the > coding error. > > Fixes: ff4a44199012 ("netvsc: allow get/set of RSS indirection table") > Reported-by: Wei Hu > Signed-off-by: Haiyang Zhang > --- > drivers/net/hyperv/rndis_filter.c | 25 +++-- > 1 file changed, 19 insertions(+), 6 deletions(-) > Reviewed-by: Michael Kelley ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
RE: [PATCH v3] Drivers: hv: vmbus: Expose counters for interrupts and full conditions
From: Kimberly Brown Sent: Wednesday, January 16, 2019 8:38 PM
>
> Counter values for per-channel interrupts and ring buffer full
> conditions are useful for investigating performance.
>
> Expose counters in sysfs for 2 types of guest to host interrupts:
> 1) Interrupts caused by the channel's outbound ring buffer transitioning
> from empty to not empty
> 2) Interrupts caused by the channel's inbound ring buffer transitioning
> from full to not full while a packet is waiting for enough buffer space to
> become available
>
> Expose 2 counters in sysfs for the number of times that write operations
> encountered a full outbound ring buffer:
> 1) The total number of write operations that encountered a full
> condition
> 2) The number of write operations that were the first to encounter a
> full condition
>
> I tested this patch by confirming that the sysfs files were created and
> observing the counter values. The values seemed to increase by a
> reasonable amount when the Hyper-v related drivers were in use.
>
> Signed-off-by: Kimberly Brown
> ---
> Changes in v3:
>  - Used the outbound ring buffer spinlock to protect the full
>    condition counters in set_channel_pending_send_size()
>  - Corrected the KernelVersion values for the new entries in
>    Documentation/ABI/stable/sysfs-bus-vmbus
>
> Changes in v2:
>  - Added mailing lists to the cc list
>  - Removed the host to guest interrupt counters proposed in v1 because
>    they were not accurate
>  - Added full condition counters for the channel's outbound ring buffer
>
>  Documentation/ABI/stable/sysfs-bus-vmbus | 33
>  drivers/hv/ring_buffer.c                 | 14 -
>  drivers/hv/vmbus_drv.c                   | 32
>  include/linux/hyperv.h                   | 38
>  4 files changed, 116 insertions(+), 1 deletion(-)
>

Reviewed-by: Michael Kelley
RE: [PATCH v2] vmbus: Switch to use new generic UUID API
From: Andy Shevchenko Sent: Thursday, January 10, 2019 6:26 AM > > There are new types and helpers that are supposed to be used in new code. > > As a preparation to get rid of legacy types and API functions do > the conversion here. > > Cc: "K. Y. Srinivasan" > Cc: Haiyang Zhang > Cc: Stephen Hemminger > Cc: de...@linuxdriverproject.org > Signed-off-by: Andy Shevchenko > --- > > v2: > - leave uapi untouched (Christoph, Haiyang) > - rebase on top of latest linux-next > > drivers/hv/channel.c | 4 +- > drivers/hv/channel_mgmt.c | 18 +++ > drivers/hv/hyperv_vmbus.h | 4 +- > drivers/hv/vmbus_drv.c| 48 +++ > include/linux/hyperv.h| 98 +++---- > 5 files changed, 79 insertions(+), 93 deletions(-) Reviewed-by: Michael Kelley ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
RE: [PATCH v2] Drivers: hv: vmbus: Expose counters for interrupts and full conditions
From: Kimberly Brown Sent: Friday, January 4, 2019 8:35 PM

>  static inline void set_channel_pending_send_size(struct vmbus_channel *c,
>  						   u32 size)
>  {
> +	if (size) {
> +		++c->out_full_total;
> +
> +		if (!c->out_full_flag) {
> +			++c->out_full_first;
> +			c->out_full_flag = true;
> +		}
> +	} else {
> +		c->out_full_flag = false;
> +	}
> +
>  	c->outbound.ring_buffer->pending_send_sz = size;
>  }
>

I think there may be an atomicity problem with the above code. I looked
in the hv_sock code, and didn't see any locks being held when
set_channel_pending_send_size() is called. The original code doesn't need
a lock because it is just storing a single value into pending_send_sz.

In the similar code in hv_ringbuffer_write(), the ring buffer spin lock
is held while the counts are incremented and the out_full_flag is
maintained, so all is good there. But some locking may be needed here.
Dexuan knows the hv_sock code best and can comment on whether there is
any higher level synchronization that prevents multiple threads from
running the above code on the same channel. Even if there is such higher
level synchronization, this code probably shouldn't depend on it for
correctness.

Michael
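One way to make the counter update safe without relying on the caller is to take a lock inside the function itself. The following user-space sketch shows the idea under stated assumptions: struct chan_sketch is a hypothetical reduction of the channel, and a C11 atomic_flag spin loop stands in for the kernel's ring buffer spinlock. It is not the code that was eventually merged.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical reduction of the channel to the fields under discussion. */
struct chan_sketch {
	atomic_flag lock;	/* stands in for the ring buffer spinlock */
	uint64_t out_full_total;
	uint64_t out_full_first;
	bool out_full_flag;
	uint32_t pending_send_sz;
};

static void sketch_lock(atomic_flag *l)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
		;
}

static void sketch_unlock(atomic_flag *l)
{
	atomic_flag_clear_explicit(l, memory_order_release);
}

/*
 * The counters and the flag are updated as one unit under the lock, so
 * two threads cannot both see out_full_flag == false and both count a
 * "first" full condition.
 */
static void set_pending_send_size(struct chan_sketch *c, uint32_t size)
{
	sketch_lock(&c->lock);

	if (size) {
		++c->out_full_total;

		if (!c->out_full_flag) {
			++c->out_full_first;
			c->out_full_flag = true;
		}
	} else {
		c->out_full_flag = false;
	}

	c->pending_send_sz = size;

	sketch_unlock(&c->lock);
}
```

Keeping the lock inside the helper matches the closing remark in the review: correctness then does not depend on whatever higher level synchronization the hv_sock caller happens to provide.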
RE: [PATCH 2/4] arm64: hyperv: Add support for Hyper-V as a hypervisor
From: Marc Zyngier Sent: Thursday, December 13, 2018 3:23 AM > >> As Will said, this isn't a viable option. Please follow SMCCC 1.1. > > > > I'll have to start a conversation with the Hyper-V team about this. > > I don't know why they chose to use HVC #1 or this register scheme > > for output values. It may be tough to change at this point because > > there are Windows guests on Hyper-V for ARM64 that are already > > using this approach. > > I appreciate you already have stuff in the wild, but there is definitely > a case to be made for supporting architecturally specified mechanisms in > a hypervisor, and SMCCC is definitely part of it (I'm certainly curious > of how you support the Spectre mitigation otherwise). > The Hyper-V guys I need to discuss this with are not back from the holidays until January 7th. I'll follow up on this thread once I've had that conversation. > >>> +static int hv_cpu_init(unsigned int cpu) > >>> +{ > >>> + u64 msr_vp_index; > >>> + > >>> + hv_get_vp_index(msr_vp_index); > >>> + > >>> + hv_vp_index[smp_processor_id()] = msr_vp_index; > >>> + > >>> + if (msr_vp_index > hv_max_vp_index) > >>> + hv_max_vp_index = msr_vp_index; > >>> + > >>> + return 0; > >>> +} > >> > >> Is that some new way to describe a CPU topology? If so, why isn't that > >> exposed via the ACPI tables that the kernel already parses? > > > > Hyper-V's hypercall interface uses vCPU identifiers that are not > > guaranteed to be consecutive integers or to match what ACPI shows. > > No topology information is implied -- it's just unique identifiers. The > > hv_vp_index array provides easy mapping from Linux's consecutive > > integer IDs for CPUs when needed to construct hypercall arguments. > > That's extremely odd. The hypervisor obviously knows which vCPU is doing > a hypercall, and if referencing another vCPU, the virtualized MPIDR_EL1 > value should be used. I don't think deviating from the architecture is a > good idea (but I appreciate this is none of your doing). 
Following the > architecture would allow this code to directly use the cpu_logical_map > infrastructure we already have. I see what you are getting at. However, some Hyper-V hypercalls allow specifying arbitrary sets of vCPUs. These hypercalls are used to define target processors in the virtual PCI code (which I have not yet brought over to ARM64) and in enlightenments for IPIs and TLB flushes (used by Windows guests and Linux guests on x86, but not yet brought over to Linux ARM64, if they ever will be). These hypercalls take bitmaps as arguments, similar to a Linux cpumask, as defined in Sections 7.8.7.3 thru 7.8.7.5 in the Hyper-V TLFS. So Hyper-V defines its own VP index that is akin to the index into the cpu_logical_map, though it may not be the same mapping. My earlier comments may have been misleading -- the Hyper-V VP index is an integer ranging from 0 thru (# vCPUs - 1). With these requirements, Hyper-V defining its own VP index seems like a reasonable thing to do. And since Hyper-V provides the same hypercall interfaces for both x86 and ARM64 implementations, and for Windows guests, there's not much choice but to use the Hyper-V VP index as specified. Michael ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
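The role of the VP index table described above can be sketched in user space. The names and sizes are assumptions for illustration: vp_index[] and max_vp_index model hv_vp_index[] and hv_max_vp_index, cpu_init() models the per-CPU init callback with the VP index passed in rather than read from a synthetic register, and the single 64-bit mask assumes fewer than 64 VPs.

```c
#include <stdint.h>

#define SKETCH_NR_CPUS 4	/* assumed small guest for the example */

static uint32_t vp_index[SKETCH_NR_CPUS];	/* models hv_vp_index[] */
static uint32_t max_vp_index;			/* models hv_max_vp_index */

/* Record the hypervisor's VP index for a Linux CPU number. */
static void cpu_init(unsigned int cpu, uint32_t reported_vp_index)
{
	vp_index[cpu] = reported_vp_index;

	if (reported_vp_index > max_vp_index)
		max_vp_index = reported_vp_index;
}

/*
 * Translate a bitmap of Linux CPU numbers into a bitmap of VP indexes,
 * which is the form a hypercall taking a set of vCPUs would expect.
 */
static uint64_t cpus_to_vp_mask(uint32_t cpu_mask)
{
	uint64_t vp_mask = 0;
	unsigned int cpu;

	for (cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
		if (cpu_mask & (1u << cpu))
			vp_mask |= 1ULL << vp_index[cpu];

	return vp_mask;
}
```

The translation step is why the table exists: the two numbering schemes cover the same range but need not assign the same number to the same processor.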
[PATCH 1/1] x86/hyper-v: Fix 'set but not used' warnings
In these two cases, a value returned by rdmsr() or rdmsrl()
is ignored. Indicate that ignoring the value is intentional, so
that with the W=1 compilation option no warning is generated.

Signed-off-by: Michael Kelley
---
 arch/x86/hyperv/hv_apic.c     | 2 +-
 arch/x86/hyperv/hv_spinlock.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/hyperv/hv_apic.c b/arch/x86/hyperv/hv_apic.c
index 8eb6fbee..66a0f53 100644
--- a/arch/x86/hyperv/hv_apic.c
+++ b/arch/x86/hyperv/hv_apic.c
@@ -55,7 +55,7 @@ static void hv_apic_icr_write(u32 low, u32 id)
 
 static u32 hv_apic_read(u32 reg)
 {
-	u32 reg_val, hi;
+	u32 reg_val, __maybe_unused hi;
 
 	switch (reg) {
 	case APIC_EOI:
diff --git a/arch/x86/hyperv/hv_spinlock.c b/arch/x86/hyperv/hv_spinlock.c
index a861b04..e18c63d5 100644
--- a/arch/x86/hyperv/hv_spinlock.c
+++ b/arch/x86/hyperv/hv_spinlock.c
@@ -25,7 +25,7 @@ static void hv_qlock_kick(int cpu)
 
 static void hv_qlock_wait(u8 *byte, u8 val)
 {
-	unsigned long msr_val;
+	unsigned long __maybe_unused msr_val;
 	unsigned long flags;
 
 	if (in_nmi())
-- 
1.8.3.1
[PATCH 1/1] scsi: storvsc: Always use blk-mq
With high IOPS storage being increasingly prevalent for guests
on Hyper-V and in Azure, make blk-mq the default so that the
full performance of the storage can be realized without having
to tweak other configuration settings.

Signed-off-by: Michael Kelley
---
 drivers/scsi/storvsc_drv.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c
index f03dc03..3dbbd14 100644
--- a/drivers/scsi/storvsc_drv.c
+++ b/drivers/scsi/storvsc_drv.c
@@ -1703,6 +1703,7 @@ static int storvsc_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *scmnd)
 	.dma_boundary =		PAGE_SIZE-1,
 	.no_write_same =	1,
 	.track_queue_depth =	1,
+	.force_blk_mq =		1,
 };
 
 enum {
-- 
1.8.3.1
RE: [PATCH 2/4] arm64: hyperv: Add support for Hyper-V as a hypervisor
From: Marc Zyngier Sent: Friday, December 7, 2018 6:43 AM
> > Add ARM64-specific code to enable Hyper-V. This code includes:
> > * Detecting Hyper-V and initializing the guest/Hyper-V interface
> > * Setting up Hyper-V's synthetic clocks
> > * Making hypercalls using the HVC instruction
> > * Setting up VMbus and stimer0 interrupts
> > * Setting up kexec and crash handlers
>
> This commit message is a clear indication that this should be split in
> at least 5 different patches.

OK, I'll work on separating into multiple layered patches in the next
version.

> > This code is architecture dependent code and is mostly driven by
> > architecture independent code in the VMbus driver in drivers/hv/hv.c
> > and drivers/hv/vmbus_drv.c.
> >
> > This code is built only when CONFIG_HYPERV is enabled.
> >
> > Signed-off-by: Michael Kelley
> > Signed-off-by: K. Y. Srinivasan
> > ---
> >  MAINTAINERS                  |   1 +
> >  arch/arm64/Makefile          |   1 +
> >  arch/arm64/hyperv/Makefile   |   2 +
> >  arch/arm64/hyperv/hv_hvc.S   |  54 +
> >  arch/arm64/hyperv/hv_init.c  | 441 +++
> >  arch/arm64/hyperv/mshyperv.c | 178 ++
> >  6 files changed, 677 insertions(+)
> >  create mode 100644 arch/arm64/hyperv/Makefile
> >  create mode 100644 arch/arm64/hyperv/hv_hvc.S
> >  create mode 100644 arch/arm64/hyperv/hv_init.c
> >  create mode 100644 arch/arm64/hyperv/mshyperv.c
> >
> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index 72f19cef4c48..326eeb32a0cd 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -6837,6 +6837,7 @@ F:	arch/x86/kernel/cpu/mshyperv.c
> >  F:	arch/x86/hyperv
> >  F:	arch/arm64/include/asm/hyperv-tlfs.h
> >  F:	arch/arm64/include/asm/mshyperv.h
> > +F:	arch/arm64/hyperv
> >  F:	drivers/hid/hid-hyperv.c
> >  F:	drivers/hv/
> >  F:	drivers/input/serio/hyperv-keyboard.c
> > diff --git a/arch/arm64/Makefile b/arch/arm64/Makefile
> > index 6cb9fc7e9382..ad9ec0579553 100644
> > --- a/arch/arm64/Makefile
> > +++ b/arch/arm64/Makefile
> > @@ -106,6 +106,7 @@ core-y += arch/arm64/kernel/ arch/arm64/mm/
> >  core-$(CONFIG_NET) += arch/arm64/net/
> >  core-$(CONFIG_KVM) += arch/arm64/kvm/
> >  core-$(CONFIG_XEN) += arch/arm64/xen/
> > +core-$(CONFIG_HYPERV) += arch/arm64/hyperv/
> >  core-$(CONFIG_CRYPTO) += arch/arm64/crypto/
> >  libs-y := arch/arm64/lib/ $(libs-y)
> >  core-$(CONFIG_EFI_STUB) += $(objtree)/drivers/firmware/efi/libstub/lib.a
> > diff --git a/arch/arm64/hyperv/Makefile b/arch/arm64/hyperv/Makefile
> > new file mode 100644
> > index ..988eda55330c
> > --- /dev/null
> > +++ b/arch/arm64/hyperv/Makefile
> > @@ -0,0 +1,2 @@
> > +# SPDX-License-Identifier: GPL-2.0
> > +obj-y := hv_init.o hv_hvc.o mshyperv.o
> > diff --git a/arch/arm64/hyperv/hv_hvc.S b/arch/arm64/hyperv/hv_hvc.S
> > new file mode 100644
> > index ..82636969b4f2
> > --- /dev/null
> > +++ b/arch/arm64/hyperv/hv_hvc.S
> > @@ -0,0 +1,54 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +
> > +/*
> > + * Microsoft Hyper-V hypervisor invocation routines
> > + *
> > + * Copyright (C) 2018, Microsoft, Inc.
> > + *
> > + * Author : Michael Kelley
> > + *
> > + * This program is free software; you can redistribute it and/or modify it
> > + * under the terms of the GNU General Public License version 2 as published
> > + * by the Free Software Foundation.
> > + *
> > + * This program is distributed in the hope that it will be useful, but
> > + * WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or
> > + * NON INFRINGEMENT. See the GNU General Public License for more
> > + * details.
> > + */
> > +
> > +#include
> > +
> > +	.text
> > +/*
> > + * Do the HVC instruction. For Hyper-V the argument is always 1.
> > + * x0 contains the hypercall control value, while additional registers
> > + * vary depending on the hypercall, and whether the hypercall arguments
> > + * are in memory or in registers (a "fast" hypercall per the Hyper-V
> > + * TLFS). When the arguments are in memory x1 is the guest physical
> > + * address of the input arguments, and x2 is the guest physical
> > + * address of the output arguments. When the arguments are in
> > + * registers, the register values depends on the hypercall. Note
> > + * that
RE: [PATCH 1/4] arm64: hyperv: Add core Hyper-V include files
From: Will Deacon Sent: Friday, December 7, 2018 5:43 AM
> > hyperv-tlfs.h defines Hyper-V interfaces from the Hyper-V Top Level
> > Functional Spec (TLFS). The TLFS is distinctly oriented to x86/x64,
> > and Hyper-V has not separated out the architecture-dependent parts into
> > x86/x64 vs. ARM64. So hyperv-tlfs.h includes information for ARM64
> > that is not yet formally published. The TLFS is available here:
>
> When do you plan to publish the spec? It's pretty hard to review this stuff
> without knowing what it's supposed to look like.

I don't have a commitment from the Hyper-V team on when an updated TLFS
that covers ARM64 will be published. I'm on the Linux side, and Hyper-V
is a separate group, but I'll raise the topic again with them.

> > docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/reference/tlfs
> >
> > mshyperv.h defines Linux-specific structures and routines for
> > interacting with Hyper-V. It is split into an ARM64 specific file
> > and an architecture independent file in include/asm-generic.
> >
> > Signed-off-by: Michael Kelley
> > Signed-off-by: K. Y. Srinivasan
> > ---
> >  MAINTAINERS                          |   3 +
> >  arch/arm64/include/asm/hyperv-tlfs.h | 338 +++
> >  arch/arm64/include/asm/mshyperv.h    | 116 +
> >  include/asm-generic/mshyperv.h       | 240 +++
> >  4 files changed, 697 insertions(+)
> >  create mode 100644 arch/arm64/include/asm/hyperv-tlfs.h
> >  create mode 100644 arch/arm64/include/asm/mshyperv.h
> >  create mode 100644 include/asm-generic/mshyperv.h
> >
> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index f4855974f325..72f19cef4c48 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -6835,6 +6835,8 @@ F:	arch/x86/include/asm/trace/hyperv.h
> >  F:	arch/x86/include/asm/hyperv-tlfs.h
> >  F:	arch/x86/kernel/cpu/mshyperv.c
> >  F:	arch/x86/hyperv
> > +F:	arch/arm64/include/asm/hyperv-tlfs.h
> > +F:	arch/arm64/include/asm/mshyperv.h
> >  F:	drivers/hid/hid-hyperv.c
> >  F:	drivers/hv/
> >  F:	drivers/input/serio/hyperv-keyboard.c
> > @@ -6846,6 +6848,7 @@ F:	drivers/video/fbdev/hyperv_fb.c
> >  F:	net/vmw_vsock/hyperv_transport.c
> >  F:	include/linux/hyperv.h
> >  F:	include/uapi/linux/hyperv.h
> > +F:	include/asm-generic/mshyperv.h
> >  F:	tools/hv/
> >  F:	Documentation/ABI/stable/sysfs-bus-vmbus
> >
> > diff --git a/arch/arm64/include/asm/hyperv-tlfs.h b/arch/arm64/include/asm/hyperv-tlfs.h
> > new file mode 100644
> > index ..924e37600e92
> > --- /dev/null
> > +++ b/arch/arm64/include/asm/hyperv-tlfs.h
> > @@ -0,0 +1,338 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +
> > +/*
> > + * This file contains definitions from the Hyper-V Hypervisor Top-Level
> > + * Functional Specification (TLFS):
> > + * https://docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/reference/tlfs
>
> As mentioned elsewhere, please use a better link here and drop the license
> boilerplate below.

Agreed. Will do so here and in other files in the next version of the
patch.

> > + *
> > + * Copyright (C) 2018, Microsoft, Inc.
> > + *
> > + * Author : Michael Kelley
> > + *
> > + * This program is free software; you can redistribute it and/or modify it
> > + * under the terms of the GNU General Public License version 2 as published
> > + * by the Free Software Foundation.
> > + *
> > + * This program is distributed in the hope that it will be useful, but
> > + * WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or
> > + * NON INFRINGEMENT. See the GNU General Public License for more
> > + * details.
> > + */
> > +
> > +#ifndef _ASM_ARM64_HYPERV_H
> > +#define _ASM_ARM64_HYPERV_H
>
> __ASM_HYPER_V_H please

OK.

> > +
> > +#include
> > +
> > +/*
> > + * These Hyper-V registers provide information equivalent to the CPUID
> > + * instruction on x86/x64.
> > + */
> > +#define HV_REGISTER_HYPERVISOR_VERSION 0x0100 /*CPUID 0x4002 */
> > +#define HV_REGISTER_PRIVILEGES_AND_FEATURES
RE: [PATCH 3/4] Drivers: hv: vmbus: Add hooks for per-CPU IRQ
From: Will Deacon Sent: Tuesday, November 27, 2018 2:19 AM
> > > The general approach is for patches 1 and 2 of the series to provide
> > > all the new code under arch/arm64 to enable Hyper-V. But the code
> > > won't get called (or even built) with just these two patches because
> > > CONFIG_HYPERV can't be selected. Patch 3 is separate because it
> > > applies to architecture independent code and arch/x86 code -- I thought
> > > there might be value in keeping the ARM64 and x86 patches distinct.
> > > Patch 4 applies to architecture independent code, and enables the
> > > ARM64 code in patches 1 and 2 to be compiled and run when
> > > CONFIG_HYPERV is selected.
> > >
> > > If combining some of the patches in the series is a better approach, I'm
> > > good with that.
> >
> > Ok, that makes more sense, if it is easier to get the ARM people to
> > review this, that's fine. Doesn't seem like anyone did that yet :(
>
> It's on the list, but thanks for having a look as well!
>
> Will

Will -- I'll hold off on sending a new version, pending comments from
the ARM64 maintainers. Let me know if you prefer that I do otherwise.

Michael
RE: [PATCH 3/4] Drivers: hv: vmbus: Add hooks for per-CPU IRQ
From: Greg KH Monday, November 26, 2018 11:57 AM
> > > You created "null" hooks that do nothing, for no one in this patch
> > > series, why?
> > >
> >
> > hv_enable_vmbus_irq() and hv_disable_vmbus_irq() have non-null
> > implementations in the ARM64 code in patch 2 of this series. The
> > implementations are in the new file arch/arm64/hyperv/mshyperv.c.
> > Or am I misunderstanding your point?
>
> So you use a hook in an earlier patch and then add it in a later one?
>
> Shouldn't you do it the other way around? As it is, the earlier patch
> should not work properly, right?

The earlier patch implements the hook on the ARM64 side but it is
unused -- it's not called. The later patch then calls it. Wouldn't
the other way around be backwards?

The general approach is for patches 1 and 2 of the series to provide
all the new code under arch/arm64 to enable Hyper-V. But the code
won't get called (or even built) with just these two patches because
CONFIG_HYPERV can't be selected. Patch 3 is separate because it
applies to architecture independent code and arch/x86 code -- I thought
there might be value in keeping the ARM64 and x86 patches distinct.
Patch 4 applies to architecture independent code, and enables the
ARM64 code in patches 1 and 2 to be compiled and run when
CONFIG_HYPERV is selected.

If combining some of the patches in the series is a better approach, I'm
good with that.

Michael
RE: [PATCH 3/4] Drivers: hv: vmbus: Add hooks for per-CPU IRQ
From: Greg KH Monday, November 26, 2018 11:21 AM
> > diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
> > index 0d6271cce198..8d97bd3a13a6 100644
> > --- a/arch/x86/include/asm/mshyperv.h
> > +++ b/arch/x86/include/asm/mshyperv.h
> > @@ -109,6 +109,10 @@ void hyperv_vector_handler(struct pt_regs *regs);
> >  void hv_setup_vmbus_irq(void (*handler)(void));
> >  void hv_remove_vmbus_irq(void);
> >
> > +/* On x86/x64, there isn't a real IRQ to be enabled/disable */
> > +static inline void hv_enable_vmbus_irq(void) {}
> > +static inline void hv_disable_vmbus_irq(void) {}
> > +
> >  void hv_setup_kexec_handler(void (*handler)(void));
> >  void hv_remove_kexec_handler(void);
> >  void hv_setup_crash_handler(void (*handler)(struct pt_regs *regs));
> > diff --git a/drivers/hv/hv.c b/drivers/hv/hv.c
> > index 166c2501de17..d0bb09a4bd73 100644
> > --- a/drivers/hv/hv.c
> > +++ b/drivers/hv/hv.c
> > @@ -307,6 +307,7 @@ int hv_synic_init(unsigned int cpu)
> >  	hv_set_siefp(siefp.as_uint64);
> >
> >  	/* Setup the shared SINT. */
> > +	hv_enable_vmbus_irq();
> >  	hv_get_synint_state(VMBUS_MESSAGE_SINT, shared_sint.as_uint64);
> >
> >  	shared_sint.vector = HYPERVISOR_CALLBACK_VECTOR;
> > @@ -434,6 +435,7 @@ int hv_synic_cleanup(unsigned int cpu)
> >  	/* Disable the global synic bit */
> >  	sctrl.enable = 0;
> >  	hv_set_synic_state(sctrl.as_uint64);
> > +	hv_disable_vmbus_irq();
> >
> >  	return 0;
> >  }
> > --
> > 2.19.1
>
> You created "null" hooks that do nothing, for no one in this patch
> series, why?
>

hv_enable_vmbus_irq() and hv_disable_vmbus_irq() have non-null
implementations in the ARM64 code in patch 2 of this series. The
implementations are in the new file arch/arm64/hyperv/mshyperv.c.
Or am I misunderstanding your point?

Michael
[tip:x86/urgent] x86/hyper-v: Enable PIT shutdown quirk
Commit-ID:  1de72c706488b7be664a601cf3843bd01e327e58
Gitweb:     https://git.kernel.org/tip/1de72c706488b7be664a601cf3843bd01e327e58
Author:     Michael Kelley
AuthorDate: Sun, 4 Nov 2018 03:48:57 +
Committer:  Thomas Gleixner
CommitDate: Sun, 4 Nov 2018 11:04:46 +0100

x86/hyper-v: Enable PIT shutdown quirk

Hyper-V emulation of the PIT has a quirk such that the normal PIT shutdown
path doesn't work, because clearing the counter register restarts the
timer.

Disable the counter clearing on PIT shutdown.

Signed-off-by: Michael Kelley
Signed-off-by: Thomas Gleixner
Cc: "gre...@linuxfoundation.org"
Cc: "de...@linuxdriverproject.org"
Cc: "daniel.lezc...@linaro.org"
Cc: "virtualizat...@lists.linux-foundation.org"
Cc: "jgr...@suse.com"
Cc: "akata...@vmware.com"
Cc: "o...@aepfle.de"
Cc: "a...@canonical.com"
Cc: vkuznets
Cc: "jasow...@redhat.com"
Cc: "marcelo.ce...@canonical.com"
Cc: KY Srinivasan
Cc: sta...@vger.kernel.org
Link: https://lkml.kernel.org/r/1541303219-11142-3-git-send-email-mikel...@microsoft.com
---
 arch/x86/kernel/cpu/mshyperv.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
index 1c72f3819eb1..e81a2db42df7 100644
--- a/arch/x86/kernel/cpu/mshyperv.c
+++ b/arch/x86/kernel/cpu/mshyperv.c
@@ -20,6 +20,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -295,6 +296,16 @@ static void __init ms_hyperv_init_platform(void)
 	if (efi_enabled(EFI_BOOT))
 		x86_platform.get_nmi_reason = hv_get_nmi_reason;

+	/*
+	 * Hyper-V VMs have a PIT emulation quirk such that zeroing the
+	 * counter register during PIT shutdown restarts the PIT. So it
+	 * continues to interrupt @18.2 HZ. Setting i8253_clear_counter
+	 * to false tells pit_shutdown() not to zero the counter so that
+	 * the PIT really is shutdown. Generation 2 VMs don't have a PIT,
+	 * and setting this value has no effect.
+	 */
+	i8253_clear_counter_on_shutdown = false;
+
 #if IS_ENABLED(CONFIG_HYPERV)
 	/*
 	 * Setup the hook to get control post apic initialization.
[tip:x86/urgent] clockevents/drivers/i8253: Add support for PIT shutdown quirk
Commit-ID:  35b69a420bfb56b7b74cb635ea903db05e357bec
Gitweb:     https://git.kernel.org/tip/35b69a420bfb56b7b74cb635ea903db05e357bec
Author:     Michael Kelley
AuthorDate: Sun, 4 Nov 2018 03:48:54 +
Committer:  Thomas Gleixner
CommitDate: Sun, 4 Nov 2018 11:04:46 +0100

clockevents/drivers/i8253: Add support for PIT shutdown quirk

Add support for platforms where pit_shutdown() doesn't work because of a
quirk in the PIT emulation. On these platforms setting the counter register
to zero causes the PIT to start running again, negating the shutdown.

Provide a global variable that controls whether the counter register is
zero'ed, which platform specific code can override.

Signed-off-by: Michael Kelley
Signed-off-by: Thomas Gleixner
Cc: "gre...@linuxfoundation.org"
Cc: "de...@linuxdriverproject.org"
Cc: "daniel.lezc...@linaro.org"
Cc: "virtualizat...@lists.linux-foundation.org"
Cc: "jgr...@suse.com"
Cc: "akata...@vmware.com"
Cc: "o...@aepfle.de"
Cc: "a...@canonical.com"
Cc: vkuznets
Cc: "jasow...@redhat.com"
Cc: "marcelo.ce...@canonical.com"
Cc: KY Srinivasan
Cc: sta...@vger.kernel.org
Link: https://lkml.kernel.org/r/1541303219-11142-2-git-send-email-mikel...@microsoft.com
---
 drivers/clocksource/i8253.c | 14 --
 include/linux/i8253.h       |  1 +
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/drivers/clocksource/i8253.c b/drivers/clocksource/i8253.c
index 9c38895542f4..d4350bb10b83 100644
--- a/drivers/clocksource/i8253.c
+++ b/drivers/clocksource/i8253.c
@@ -20,6 +20,13 @@
 DEFINE_RAW_SPINLOCK(i8253_lock);
 EXPORT_SYMBOL(i8253_lock);

+/*
+ * Handle PIT quirk in pit_shutdown() where zeroing the counter register
+ * restarts the PIT, negating the shutdown. On platforms with the quirk,
+ * platform specific code can set this to false.
+ */
+bool i8253_clear_counter_on_shutdown __ro_after_init = true;
+
 #ifdef CONFIG_CLKSRC_I8253
 /*
  * Since the PIT overflows every tick, its not very useful
@@ -109,8 +116,11 @@ static int pit_shutdown(struct clock_event_device *evt)
 	raw_spin_lock(&i8253_lock);

 	outb_p(0x30, PIT_MODE);
-	outb_p(0, PIT_CH0);
-	outb_p(0, PIT_CH0);
+
+	if (i8253_clear_counter_on_shutdown) {
+		outb_p(0, PIT_CH0);
+		outb_p(0, PIT_CH0);
+	}

 	raw_spin_unlock(&i8253_lock);
 	return 0;
diff --git a/include/linux/i8253.h b/include/linux/i8253.h
index e6bb36a97519..8336b2f6f834 100644
--- a/include/linux/i8253.h
+++ b/include/linux/i8253.h
@@ -21,6 +21,7 @@
 #define PIT_LATCH	((PIT_TICK_RATE + HZ/2) / HZ)

 extern raw_spinlock_t i8253_lock;
+extern bool i8253_clear_counter_on_shutdown;
 extern struct clock_event_device i8253_clockevent;
 extern void clockevent_i8253_init(bool oneshot);
[PATCH v2 1/2] i8253: Add support for PIT shutdown quirk
Add support for platforms where pit_shutdown() doesn't work because
of a quirk in the PIT emulation. On these platforms setting the
counter register to zero causes the PIT to start running again,
negating the shutdown.

Provide a global variable that controls whether the counter register
is zero'ed, which platform specific code can override.

Signed-off-by: Michael Kelley
---
 drivers/clocksource/i8253.c | 14 --
 include/linux/i8253.h       |  1 +
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/drivers/clocksource/i8253.c b/drivers/clocksource/i8253.c
index 9c38895..2a4202a 100644
--- a/drivers/clocksource/i8253.c
+++ b/drivers/clocksource/i8253.c
@@ -20,6 +20,13 @@
 DEFINE_RAW_SPINLOCK(i8253_lock);
 EXPORT_SYMBOL(i8253_lock);

+/*
+ * Handle PIT quirk in pit_shutdown() where zeroing the counter register
+ * restarts the PIT, negating the shutdown. On platforms with the quirk,
+ * platform specific code can set this to false.
+ */
+bool i8253_clear_counter __ro_after_init = true;
+
 #ifdef CONFIG_CLKSRC_I8253
 /*
  * Since the PIT overflows every tick, its not very useful
@@ -109,8 +116,11 @@ static int pit_shutdown(struct clock_event_device *evt)
 	raw_spin_lock(&i8253_lock);

 	outb_p(0x30, PIT_MODE);
-	outb_p(0, PIT_CH0);
-	outb_p(0, PIT_CH0);
+
+	if (i8253_clear_counter) {
+		outb_p(0, PIT_CH0);
+		outb_p(0, PIT_CH0);
+	}

 	raw_spin_unlock(&i8253_lock);
 	return 0;
diff --git a/include/linux/i8253.h b/include/linux/i8253.h
index e6bb36a..31c4be5 100644
--- a/include/linux/i8253.h
+++ b/include/linux/i8253.h
@@ -21,6 +21,7 @@
 #define PIT_LATCH	((PIT_TICK_RATE + HZ/2) / HZ)

 extern raw_spinlock_t i8253_lock;
+extern bool i8253_clear_counter;
 extern struct clock_event_device i8253_clockevent;
 extern void clockevent_i8253_init(bool oneshot);

--
1.8.3.1
[PATCH v2 0/2] i8253: Fix PIT shutdown quirk on Hyper-V
pit_shutdown() doesn't work on Hyper-V because of a quirk in the
PIT emulation. This problem exists in all versions of Hyper-V and
had not been noticed previously. When the counter register is set
to zero, the emulated PIT continues to interrupt at 18.2 Hz.

Account for this quirk by adding a global variable in the i8253
code that controls whether the counter register is zero'ed. Then in
Hyper-V initialization code, override the default setting so the
counter register is not zero'ed.

Changes in v2:
* Instead of a function call to check if running on Hyper-V, use
  a global variable to control whether the counter register is
  zero'ed. [Juergen Gross & Thomas Gleixner]

Michael Kelley (2):
  i8253: Add support for PIT shutdown quirk
  x86/hyper-v: Enable PIT shutdown quirk

 arch/x86/kernel/cpu/mshyperv.c | 11 +++
 drivers/clocksource/i8253.c    | 14 --
 include/linux/i8253.h          |  1 +
 3 files changed, 24 insertions(+), 2 deletions(-)

--
1.8.3.1
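The control flow described above can be tried out in user space with a
small model. This is only a sketch under stated assumptions: mock_outb_p(),
pit_shutdown_sketch(), demo_pit_shutdown() and the two counters are
hypothetical, port writes are counted rather than issued, and the
locking in the real pit_shutdown() is left out. The flag name follows
the v2 series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PIT_MODE 0x43
#define PIT_CH0  0x40

/* Same default as in the series; platform code may set it to false. */
static bool i8253_clear_counter = true;

/* Hypothetical mock state: counts port writes instead of touching hardware. */
static unsigned int mode_writes, ch0_writes;

static void mock_outb_p(uint8_t value, uint16_t port)
{
	(void)value;
	if (port == PIT_MODE)
		mode_writes++;
	else if (port == PIT_CH0)
		ch0_writes++;
}

/* Same branch structure as the patched pit_shutdown(), minus the lock. */
static int pit_shutdown_sketch(void)
{
	mock_outb_p(0x30, PIT_MODE);

	if (i8253_clear_counter) {
		mock_outb_p(0, PIT_CH0);
		mock_outb_p(0, PIT_CH0);
	}
	return 0;
}

void demo_pit_shutdown(void)
{
	/* Default: one mode write plus two zero writes to channel 0. */
	pit_shutdown_sketch();
	assert(mode_writes == 1 && ch0_writes == 2);

	/* Hyper-V style override: the counter register is left alone. */
	i8253_clear_counter = false;
	pit_shutdown_sketch();
	assert(mode_writes == 2 && ch0_writes == 2);
}
```

A plain global keeps the i8253 code free of any knowledge about which
hypervisor it runs on, which is the point of the v2 change relative to
a "running on Hyper-V?" function call.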
[PATCH v2 2/2] x86/hyper-v: Enable PIT shutdown quirk
Hyper-V emulation of the PIT has a quirk such that the normal
PIT shutdown path doesn't work. Enable the PIT code that handles
this quirk.

Signed-off-by: Michael Kelley
---
 arch/x86/kernel/cpu/mshyperv.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
index 1c72f38..65b2f88 100644
--- a/arch/x86/kernel/cpu/mshyperv.c
+++ b/arch/x86/kernel/cpu/mshyperv.c
@@ -20,6 +20,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -295,6 +296,16 @@ static void __init ms_hyperv_init_platform(void)
 	if (efi_enabled(EFI_BOOT))
 		x86_platform.get_nmi_reason = hv_get_nmi_reason;

+	/*
+	 * Hyper-V VMs have a PIT emulation quirk such that zeroing the
+	 * counter register during PIT shutdown restarts the PIT. So it
+	 * continues to interrupt @18.2 HZ. Setting i8253_clear_counter
+	 * to false tells pit_shutdown() not to zero the counter so that
+	 * the PIT really is shutdown. Generation 2 VMs don't have a PIT,
+	 * and setting this value has no effect.
+	 */
+	i8253_clear_counter = false;
+
 #if IS_ENABLED(CONFIG_HYPERV)
 	/*
 	 * Setup the hook to get control post apic initialization.
--
1.8.3.1
[PATCH v3 4/4] Drivers: hv: Enable CONFIG_HYPERV on ARM64
Update drivers/hv/Kconfig so CONFIG_HYPERV can be selected on
ARM64, causing the Hyper-V specific code to be built.

Signed-off-by: Michael Kelley
---
 drivers/hv/Kconfig | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/hv/Kconfig b/drivers/hv/Kconfig
index 97954f5..c3e11a2 100644
--- a/drivers/hv/Kconfig
+++ b/drivers/hv/Kconfig
@@ -4,7 +4,8 @@ menu "Microsoft Hyper-V guest support"

 config HYPERV
 	tristate "Microsoft Hyper-V client drivers"
-	depends on X86 && ACPI && PCI && X86_LOCAL_APIC && HYPERVISOR_GUEST
+	depends on ACPI && PCI && \
+		((X86 && X86_LOCAL_APIC && HYPERVISOR_GUEST) || ARM64)
 	select PARAVIRT
 	help
 	  Select this option to run Linux as a Hyper-V client operating
--
1.8.3.1
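To make the effect of the new dependency expression easier to check,
here is a small C model of it. This is an illustration only: struct
kconfig_model, hyperv_selectable() and demo_hyperv_selectable() are
hypothetical, and Kconfig's tristate logic (y/m/n) is simplified to
plain booleans.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the Kconfig symbols named in the patch. */
struct kconfig_model {
	bool acpi, pci, x86, x86_local_apic, hypervisor_guest, arm64;
};

/* Evaluates the new "depends on" expression, with booleans only. */
static bool hyperv_selectable(const struct kconfig_model *c)
{
	return c->acpi && c->pci &&
	       ((c->x86 && c->x86_local_apic && c->hypervisor_guest) ||
		c->arm64);
}

void demo_hyperv_selectable(void)
{
	struct kconfig_model arm64_cfg = {
		.acpi = true, .pci = true, .arm64 = true,
	};
	struct kconfig_model x86_no_guest = {
		.acpi = true, .pci = true, .x86 = true,
		.x86_local_apic = true,
	};
	struct kconfig_model arm64_no_acpi = {
		.pci = true, .arm64 = true,
	};

	/* ARM64 qualifies without the x86-only symbols. */
	assert(hyperv_selectable(&arm64_cfg));
	/* x86 still needs HYPERVISOR_GUEST, as before the patch. */
	assert(!hyperv_selectable(&x86_no_guest));
	/* ACPI and PCI remain required on both architectures. */
	assert(!hyperv_selectable(&arm64_no_acpi));
}
```

The x86 requirements are unchanged; the patch only factors out the
common ACPI and PCI terms and adds ARM64 as an alternative.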
[PATCH v3 3/4] Drivers: hv: vmbus: Add hooks for per-CPU IRQ
Add hooks to enable/disable a per-CPU IRQ for VMbus. These hooks
are in the architecture independent setup and shutdown paths for
Hyper-V, and are needed by Linux guests on Hyper-V on ARM64. The
x86/x64 implementation is null because VMbus interrupts on x86/x64
don't use an IRQ.

Signed-off-by: Michael Kelley
---
 arch/x86/include/asm/mshyperv.h | 4
 drivers/hv/hv.c                 | 2 ++
 2 files changed, 6 insertions(+)

diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
index 0d6271c..8d97bd3 100644
--- a/arch/x86/include/asm/mshyperv.h
+++ b/arch/x86/include/asm/mshyperv.h
@@ -109,6 +109,10 @@ static inline void vmbus_signal_eom(struct hv_message *msg, u32 old_msg_type)
 void hv_setup_vmbus_irq(void (*handler)(void));
 void hv_remove_vmbus_irq(void);

+/* On x86/x64, there isn't a real IRQ to be enabled/disable */
+static inline void hv_enable_vmbus_irq(void) {}
+static inline void hv_disable_vmbus_irq(void) {}
+
 void hv_setup_kexec_handler(void (*handler)(void));
 void hv_remove_kexec_handler(void);
 void hv_setup_crash_handler(void (*handler)(struct pt_regs *regs));
diff --git a/drivers/hv/hv.c b/drivers/hv/hv.c
index 332d7c3..5857208 100644
--- a/drivers/hv/hv.c
+++ b/drivers/hv/hv.c
@@ -309,6 +309,7 @@ int hv_synic_init(unsigned int cpu)
 	hv_set_siefp(siefp.as_uint64);

 	/* Setup the shared SINT. */
+	hv_enable_vmbus_irq();
 	hv_get_synint_state(VMBUS_MESSAGE_SINT, shared_sint.as_uint64);

 	shared_sint.vector = HYPERVISOR_CALLBACK_VECTOR;
@@ -438,6 +439,7 @@ int hv_synic_cleanup(unsigned int cpu)
 	hv_get_synic_state(sctrl.as_uint64);
 	sctrl.enable = 0;
 	hv_set_synic_state(sctrl.as_uint64);
+	hv_disable_vmbus_irq();

 	return 0;
 }
--
1.8.3.1
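The hook arrangement can be sketched in user space as follows. This is
a model under assumptions, not the driver code: SKETCH_ARM64,
vmbus_irq_enabled, synic_active, synic_init_sketch(),
synic_cleanup_sketch() and demo_vmbus_hooks() are hypothetical, and
the ARM64 branch only flips a flag where a real implementation would
enable or disable a per-CPU IRQ.

```c
#include <assert.h>

/* Hypothetical flag standing in for the state of a per-CPU IRQ. */
static int vmbus_irq_enabled;

#ifdef SKETCH_ARM64
/* ARM64 flavour (assumed shape): there is a real IRQ to switch. */
static inline void hv_enable_vmbus_irq(void)  { vmbus_irq_enabled = 1; }
static inline void hv_disable_vmbus_irq(void) { vmbus_irq_enabled = 0; }
#else
/* x86/x64 flavour from the patch: null hooks that compile away. */
static inline void hv_enable_vmbus_irq(void) {}
static inline void hv_disable_vmbus_irq(void) {}
#endif

static int synic_active;

/* Architecture independent callers, shaped like hv_synic_init/cleanup. */
static int synic_init_sketch(void)
{
	hv_enable_vmbus_irq();
	synic_active = 1;
	return 0;
}

static int synic_cleanup_sketch(void)
{
	synic_active = 0;
	hv_disable_vmbus_irq();
	return 0;
}

void demo_vmbus_hooks(void)
{
	synic_init_sketch();
#ifdef SKETCH_ARM64
	assert(vmbus_irq_enabled == 1);
#else
	assert(vmbus_irq_enabled == 0);	/* null hooks change nothing */
#endif
	synic_cleanup_sketch();
	assert(vmbus_irq_enabled == 0 && synic_active == 0);
}
```

Building with -DSKETCH_ARM64 exercises the non-null variant; the
common callers are identical in both builds, so no #ifdef is needed in
the architecture independent code.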
[PATCH v3 0/4] Enable Linux guests on Hyper-V on ARM64
This series enables Linux guests running on Hyper-V on ARM64
hardware. New ARM64-specific code in arch/arm64/hyperv initializes
Hyper-V, including its synthetic clocks and hypercall mechanism.
Existing architecture independent drivers for Hyper-V's VMbus and
synthetic devices just work when built for ARM64. Hyper-V code is
built and included in the image and modules only if CONFIG_HYPERV
is enabled.

The four patches are organized as follows:
1) Add include files that define the Hyper-V interface as
   described in the Hyper-V Top Level Functional Spec (TLFS), plus
   additional definitions specific to Linux running on Hyper-V.
2) Add core Hyper-V support on ARM64, including hypercalls,
   synthetic clock initialization, and interrupt handlers.
3) Update the existing VMbus driver to generalize interrupt
   management across x86/x64 and ARM64.
4) Make CONFIG_HYPERV selectable on ARM64 in addition to x86/x64.

Some areas of Linux guests on Hyper-V on ARM64 are a
work-in-progress, primarily due to work still being done in Hyper-V:

* Hyper-V on ARM64 currently runs with a 4 Kbyte page size, and only
  supports guests with a 4 Kbyte page size. Because Hyper-V uses
  shared pages to communicate between the guest and the hypervisor,
  there are open design decisions on the page size to use when
  the guest is using 16K/64K pages. Once those issues are
  resolved and Hyper-V fully supports 16K/64K guest pages, changes
  may be needed in the Linux drivers for Hyper-V synthetic devices.

* Hyper-V on ARM64 does not currently support mapping PCI devices
  into the guest address space. The Hyper-V PCI driver at
  drivers/pci/host/pci-hyperv.c has x86/x64-specific code and is
  not being built for ARM64.

In a few cases, terminology from the x86/x64 world has been carried
over into the ARM64 code ("MSR", "TSC"). Hyper-V still uses the
x86/x64 terminology and has not replaced it with something more
generic, so the code uses the Hyper-V terminology. This will be
fixed when Hyper-V updates the usage in the TLFS.

Changes in v3:
* Added initialization of hv_vp_index array like was recently
  added on x86 branch [KY Srinivasan]
* Changed Hyper-V ARM64 register symbols to be all uppercase
  instead of mixed case [KY Srinivasan]
* Separated mshyperv.h into two files, one architecture
  independent and one architecture dependent. After this code
  is upstream, will make changes to the x86 code to use the
  architecture independent file and remove duplication. And
  once we have a multi-architecture Hyper-V TLFS, will do a
  separate patch to split hyperv-tlfs.h in the same way.
  [KY Srinivasan]
* Minor tweaks to rebase to latest linux-next code

Changes in v2:
* Removed patch to implement slow_virt_to_phys() on ARM64.
  Use of slow_virt_to_phys() in arch independent Hyper-V
  drivers has been eliminated by commit 6ba34171bcbd
  ("Drivers: hv: vmbus: Remove use of slow_virt_to_phys()")
* Minor tweaks to rebase to latest linux-next code

Michael Kelley (4):
  arm64: hyperv: Add core Hyper-V include files
  arm64: hyperv: Add support for Hyper-V as a hypervisor
  Drivers: hv: vmbus: Add hooks for per-CPU IRQ
  Drivers: hv: Enable CONFIG_HYPERV on ARM64

 MAINTAINERS                          |   4 +
 arch/arm64/Makefile                  |   1 +
 arch/arm64/hyperv/Makefile           |   2 +
 arch/arm64/hyperv/hv_hvc.S           |  54 +
 arch/arm64/hyperv/hv_init.c          | 441 +++
 arch/arm64/hyperv/mshyperv.c         | 178 ++
 arch/arm64/include/asm/hyperv-tlfs.h | 338 +++
 arch/arm64/include/asm/mshyperv.h    | 116 +
 arch/x86/include/asm/mshyperv.h      |   4 +
 drivers/hv/Kconfig                   |   3 +-
 drivers/hv/hv.c                      |   2 +
 include/asm-generic/mshyperv.h       | 240 +++
 12 files changed, 1382 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm64/hyperv/Makefile
 create mode 100644 arch/arm64/hyperv/hv_hvc.S
 create mode 100644 arch/arm64/hyperv/hv_init.c
 create mode 100644 arch/arm64/hyperv/mshyperv.c
 create mode 100644 arch/arm64/include/asm/hyperv-tlfs.h
 create mode 100644 arch/arm64/include/asm/mshyperv.h
 create mode 100644 include/asm-generic/mshyperv.h

--
1.8.3.1
[PATCH v3 2/4] arm64: hyperv: Add support for Hyper-V as a hypervisor
Add ARM64-specific code to enable Hyper-V. This code includes:
* Detecting Hyper-V and initializing the guest/Hyper-V interface
* Setting up Hyper-V's synthetic clocks
* Making hypercalls using the HVC instruction
* Setting up VMbus and stimer0 interrupts
* Setting up kexec and crash handlers

This code is architecture dependent code and is mostly driven by
architecture independent code in the VMbus driver in drivers/hv/hv.c
and drivers/hv/vmbus_drv.c.

This code is built only when CONFIG_HYPERV is enabled.

Signed-off-by: Michael Kelley
---
 MAINTAINERS                  |   1 +
 arch/arm64/Makefile          |   1 +
 arch/arm64/hyperv/Makefile   |   2 +
 arch/arm64/hyperv/hv_hvc.S   |  54 ++
 arch/arm64/hyperv/hv_init.c  | 441 +++
 arch/arm64/hyperv/mshyperv.c | 178 +
 6 files changed, 677 insertions(+)
 create mode 100644 arch/arm64/hyperv/Makefile
 create mode 100644 arch/arm64/hyperv/hv_hvc.S
 create mode 100644 arch/arm64/hyperv/hv_init.c
 create mode 100644 arch/arm64/hyperv/mshyperv.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 00c7bad..6e55f55 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -6800,6 +6800,7 @@ F:	arch/x86/kernel/cpu/mshyperv.c
 F:	arch/x86/hyperv
 F:	arch/arm64/include/asm/hyperv-tlfs.h
 F:	arch/arm64/include/asm/mshyperv.h
+F:	arch/arm64/hyperv
 F:	drivers/hid/hid-hyperv.c
 F:	drivers/hv/
 F:	drivers/input/serio/hyperv-keyboard.c
diff --git a/arch/arm64/Makefile b/arch/arm64/Makefile
index b4e994c..114fdfa 100644
--- a/arch/arm64/Makefile
+++ b/arch/arm64/Makefile
@@ -106,6 +106,7 @@ core-y += arch/arm64/kernel/ arch/arm64/mm/
 core-$(CONFIG_NET) += arch/arm64/net/
 core-$(CONFIG_KVM) += arch/arm64/kvm/
 core-$(CONFIG_XEN) += arch/arm64/xen/
+core-$(CONFIG_HYPERV) += arch/arm64/hyperv/
 core-$(CONFIG_CRYPTO) += arch/arm64/crypto/
 libs-y := arch/arm64/lib/ $(libs-y)
 core-$(CONFIG_EFI_STUB) += $(objtree)/drivers/firmware/efi/libstub/lib.a
diff --git a/arch/arm64/hyperv/Makefile b/arch/arm64/hyperv/Makefile
new file mode 100644
index 000..988eda5
--- /dev/null
+++ b/arch/arm64/hyperv/Makefile
@@ -0,0 +1,2 @@
+# SPDX-License-Identifier: GPL-2.0
+obj-y := hv_init.o hv_hvc.o mshyperv.o
diff --git a/arch/arm64/hyperv/hv_hvc.S b/arch/arm64/hyperv/hv_hvc.S
new file mode 100644
index 000..8263696
--- /dev/null
+++ b/arch/arm64/hyperv/hv_hvc.S
@@ -0,0 +1,54 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/*
+ * Microsoft Hyper-V hypervisor invocation routines
+ *
+ * Copyright (C) 2018, Microsoft, Inc.
+ *
+ * Author : Michael Kelley
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published
+ * by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or
+ * NON INFRINGEMENT. See the GNU General Public License for more
+ * details.
+ */
+
+#include
+
+	.text
+/*
+ * Do the HVC instruction. For Hyper-V the argument is always 1.
+ * x0 contains the hypercall control value, while additional registers
+ * vary depending on the hypercall, and whether the hypercall arguments
+ * are in memory or in registers (a "fast" hypercall per the Hyper-V
+ * TLFS). When the arguments are in memory x1 is the guest physical
+ * address of the input arguments, and x2 is the guest physical
+ * address of the output arguments. When the arguments are in
+ * registers, the register values depend on the hypercall. Note
+ * that this version cannot return any values in registers.
+ */
+ENTRY(hv_do_hvc)
+	hvc #1
+	ret
+ENDPROC(hv_do_hvc)
+
+/*
+ * This variant of HVC invocation is for hv_get_vpreg and
+ * hv_get_vpreg_128. The input parameters are passed in registers
+ * along with a pointer in x4 to where the output result should
+ * be stored. The output is returned in x15 and x16. x18 is used as
+ * scratch space to avoid building a stack frame, as Hyper-V does
+ * not preserve registers x0-x17.
+ */
+ENTRY(hv_do_hvc_fast_get)
+	mov x18, x4
+	hvc #1
+	str x15,[x18]
+	str x16,[x18,#8]
+	ret
+ENDPROC(hv_do_hvc_fast_get)
diff --git a/arch/arm64/hyperv/hv_init.c b/arch/arm64/hyperv/hv_init.c
new file mode 100644
index 000..aa1a8c0
--- /dev/null
+++ b/arch/arm64/hyperv/hv_init.c
@@ -0,0 +1,441 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * Initialization of the interface with Microsoft's Hyper-V hypervisor,
+ * and various low level utility routines for interacting with Hyper-V.
+ *
+ * Copyright (C) 2018, Microsoft, Inc.
+ *
+ * Author : Michael Kelley
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published
+ * by the Free
[PATCH v3 1/4] arm64: hyperv: Add core Hyper-V include files
hyperv-tlfs.h defines Hyper-V interfaces from the Hyper-V Top Level Functional Spec (TLFS). The TLFS is distinctly oriented to x86/x64, and Hyper-V has not separated out the architecture-dependent parts into x86/x64 vs. ARM64. So hyperv-tlfs.h includes information for ARM64 that is not yet formally published. The TLFS is available here: docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/reference/tlfs mshyperv.h defines Linux-specific structures and routines for interacting with Hyper-V. It is split into an ARM64 specific file and an architecture independent file in include/asm-generic. Signed-off-by: Michael Kelley --- MAINTAINERS | 3 + arch/arm64/include/asm/hyperv-tlfs.h | 338 +++ arch/arm64/include/asm/mshyperv.h| 116 include/asm-generic/mshyperv.h | 240 + 4 files changed, 697 insertions(+) create mode 100644 arch/arm64/include/asm/hyperv-tlfs.h create mode 100644 arch/arm64/include/asm/mshyperv.h create mode 100644 include/asm-generic/mshyperv.h diff --git a/MAINTAINERS b/MAINTAINERS index 25c090e..00c7bad 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -6798,6 +6798,8 @@ F:arch/x86/include/asm/trace/hyperv.h F: arch/x86/include/asm/hyperv-tlfs.h F: arch/x86/kernel/cpu/mshyperv.c F: arch/x86/hyperv +F: arch/arm64/include/asm/hyperv-tlfs.h +F: arch/arm64/include/asm/mshyperv.h F: drivers/hid/hid-hyperv.c F: drivers/hv/ F: drivers/input/serio/hyperv-keyboard.c @@ -6809,6 +6811,7 @@ F:drivers/video/fbdev/hyperv_fb.c F: net/vmw_vsock/hyperv_transport.c F: include/linux/hyperv.h F: include/uapi/linux/hyperv.h +F: include/asm-generic/mshyperv.h F: tools/hv/ F: Documentation/ABI/stable/sysfs-bus-vmbus diff --git a/arch/arm64/include/asm/hyperv-tlfs.h b/arch/arm64/include/asm/hyperv-tlfs.h new file mode 100644 index 000..1430798 --- /dev/null +++ b/arch/arm64/include/asm/hyperv-tlfs.h @@ -0,0 +1,338 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +/* + * This file contains definitions from the Hyper-V Hypervisor Top-Level + * Functional Specification (TLFS): + * 
https://docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/reference/tlfs + * + * Copyright (C) 2018, Microsoft, Inc. + * + * Author : Michael Kelley + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published + * by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, but + * WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or + * NON INFRINGEMENT. See the GNU General Public License for more + * details. + */ + +#ifndef _ASM_ARM64_HYPERV_H +#define _ASM_ARM64_HYPERV_H + +#include + +/* + * These Hyper-V registers provide information equivalent to the CPUID + * instruction on x86/x64. + */ +#define HV_REGISTER_HYPERVISOR_VERSION 0x0100 /*CPUID 0x4002 */ +#defineHV_REGISTER_PRIVILEGES_AND_FEATURES 0x0200 /*CPUID 0x4003 */ +#defineHV_REGISTER_FEATURES0x0201 /*CPUID 0x4004 */ +#defineHV_REGISTER_IMPLEMENTATION_LIMITS 0x0202 /*CPUID 0x4005 */ +#define HV_ARM64_REGISTER_INTERFACE_VERSION0x00090006 /*CPUID 0x4001 */ + +/* + * Feature identification. HvRegisterPrivilegesAndFeaturesInfo returns a + * 128-bit value with flags indicating which features are available to the + * partition based upon the current partition privileges. The 128-bit + * value is broken up with different portions stored in different 32-bit + * fields in the ms_hyperv structure. 
+ */ + +/* Partition Reference Counter available*/ +#define HV_MSR_TIME_REF_COUNT_AVAILABLE(1 << 1) + +/* + * Synthetic Timers available + */ +#define HV_MSR_SYNTIMER_AVAILABLE (1 << 3) + +/* Frequency MSRs available */ +#define HV_FEATURE_FREQUENCY_MSRS_AVAILABLE(1 << 8) + +/* Reference TSC available */ +#define HV_MSR_REFERENCE_TSC_AVAILABLE (1 << 9) + +/* Crash MSR available */ +#define HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE (1 << 10) + + +/* + * This group of flags is in the high order 64-bits of the returned + * 128-bit value. + */ + +/* STIMER direct mode is available */ +#define HV_STIMER_DIRECT_MODE_AVAILABLE(1 << 19) + +/* + * Implementation recommendations in register + * HvRegisterFeaturesInfo. Indicates which behaviors the hypervisor + * recommends the OS implement for optimal performance. + */ + +/* + * Recommend not using Auto EOI + */ +#define HV_DEPRECATING_AEOI_RECOMMENDED(1 << 9) + +/* + * Synthetic register definitions equivalent to MSRs on x86/x64 + */ +#define HV_REGISTER_CRASH_P0 0x0210 +#define HV_REGIS
RE: [PATCH V2 3/5] Drivers: hv: kvp: Fix the recent regression caused by incorrect clean-up
From: k...@linuxonhyperv.com Sent: Wednesday, October 17, 2018 10:10 PM
> From: Dexuan Cui
>
> In kvp_send_key(), we do need call process_ib_ipinfo() if
> message->kvp_hdr.operation is KVP_OP_GET_IP_INFO, because it turns out
> the userland hv_kvp_daemon needs the info of operation, adapter_id and
> addr_family. With the incorrect fc62c3b1977d, the host can't get the
> VM's IP via KVP.
>
> And, fc62c3b1977d added a "break;", but actually forgot to initialize
> the key_size/value in the case of KVP_OP_SET, so the default key_size of
> 0 is passed to the kvp daemon, and the pool files
> /var/lib/hyperv/.kvp_pool_* can't be updated.
>
> This patch effectively rolls back the previous fc62c3b1977d, and
> correctly fixes the "this statement may fall through" warnings.
>
> This patch is tested on WS 2012 R2 and 2016.
>
> Fixes: fc62c3b1977d ("Drivers: hv: kvp: Fix two "this statement may fall
> through" warnings")
> Signed-off-by: Dexuan Cui
> Cc: K. Y. Srinivasan
> Cc: Haiyang Zhang
> Cc: Stephen Hemminger
> Cc:
> Signed-off-by: K. Y. Srinivasan
> ---
> drivers/hv/hv_kvp.c | 26 ++
> 1 file changed, 22 insertions(+), 4 deletions(-)
>

Reviewed-by: Michael Kelley
RE: [PATCH V2 1/5] Drivers: hv: vmbus: Get rid of unnecessary state in hv_context
From: k...@linuxonhyperv.com Sent: Wednesday, October 17, 2018 10:09 PM
>
> Currently we are replicating state in struct hv_context that is unnecessary -
> this state can be retrieved from the hypervisor. Furthermore, this is a per-cpu
> state that is being maintained as a global state in struct hv_context.
> Get rid of this state in struct hv_context.
>
> Signed-off-by: K. Y. Srinivasan
> ---
> drivers/hv/hv.c | 10 +++---
> drivers/hv/hyperv_vmbus.h | 2 --
> 2 files changed, 3 insertions(+), 9 deletions(-)
>

Reviewed-by: Michael Kelley
RE: [PATCH V2 3/4] vmbus: add per-channel sysfs info
From Olaf Hering Sent: Thursday, October 18, 2018 8:20 AM
>
> > This extends existing vmbus related sysfs structure to provide per-channel
> > state information. This is useful when diagnosing issues with multiple
> > queues in networking and storage.
>
> > +++ b/drivers/hv/vmbus_drv.c
> > +static ssize_t write_avail_show(const struct vmbus_channel *channel, char *buf)
> > +{
> > +	const struct hv_ring_buffer_info *rbi = &channel->outbound;
> > +
> > +	return sprintf(buf, "%u\n", hv_get_bytes_to_write(rbi));
> > +}
> > +VMBUS_CHAN_ATTR_RO(write_avail);
>
> This is upstream since a year.
>
> But I wonder how this can work if vmbus_device_register is called,
> and then something reads the populated sysfs files before vmbus_open returns.
> Nothing protects rbi->ring_buffer in this case, which remains NULL
> until vmbus_open populates it.
>
> A simple reproduce, with a modular kernel, is to boot with init=/bin/bash
> head /sys/bus/vmbus/devices/*/channels/*/*
>

There are multiple race conditions with this and other VMbus sysfs information. There's a race on the close path as well. I've got an action on my list to get it cleaned up.

Michael
RE: [PATCH V3 10/13] x86/hyper-v: Add HvFlushGuestAddressList hypercall support
From: Tianyu Lan Sent: Wednesday, September 26, 2018 8:50 PM
>
> Hyper-V provides HvFlushGuestAddressList() hypercall to flush EPT tlb
> with specified ranges. This patch is to add the hypercall support.
>
> Signed-off-by: Lan Tianyu
>

Looks good!

Reviewed-by: Michael Kelley
RE: [PATCH -next] x86/hyper-v: Remove unused including
From: YueHaibing Sent: Sunday, September 23, 2018 1:20 AM
> Remove including that don't need it.
>
> Signed-off-by: YueHaibing
> ---
> arch/x86/hyperv/hv_apic.c | 1 -
> 1 file changed, 1 deletion(-)
>

Reviewed-by: Michael Kelley
RE: [PATCH V2 4/13] KVM/MMU: Flush tlb directly in the kvm_handle_hva_range()
From: Tianyu Lan Sent: Thursday, September 20, 2018 7:30 AM
> On 9/20/2018 12:08 AM, Michael Kelley (EOSG) wrote:
> > From: Tianyu Lan Sent: Monday, September 17, 2018 8:19 PM
> >> +
> >> +	if (ret && kvm_available_flush_tlb_with_range()) {
> >> +		kvm_flush_remote_tlbs_with_address(kvm,
> >> +				gfn_start,
> >> +				gfn_end - gfn_start);
> >
> > Does the above need to be gfn_end - gfn_start + 1?
>
> The flush range depends on the input parameter frame start and frame end
> of for_each_slot_rmap_range().
>
> for_each_slot_rmap_range(memslot, PT_PAGE_TABLE_LEVEL,
>                          PT_MAX_HUGEPAGE_LEVEL,
>                          gfn_start, gfn_end - 1,
>                          &iterator)
>         ret |= handler(kvm, iterator.rmap, memslot,
>                        iterator.gfn, iterator.level, data);
>
>
> The start is "gfn_start" and the end is "gfn_end - 1". The flush size is
> (gfn_end - 1) - gfn_start + 1 = gfn_end - gfn_start.
>

Got it. I agree.

Michael
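The range arithmetic settled in this exchange can be checked with a small standalone sketch. The helper names below are hypothetical and are not part of the KVM code; they only model a half-open frame range [gfn_start, gfn_end) whose last visited frame is gfn_end - 1.

```c
#include <assert.h>
#include <stdint.h>

/* Number of frames in an inclusive range [first, last]. */
static uint64_t frames_inclusive(uint64_t first, uint64_t last)
{
	assert(first <= last);
	return last - first + 1;
}

/*
 * Number of frames in a half-open range [gfn_start, gfn_end).
 * The last frame visited is gfn_end - 1, so the count is
 * (gfn_end - 1) - gfn_start + 1, which equals gfn_end - gfn_start.
 */
static uint64_t frames_half_open(uint64_t gfn_start, uint64_t gfn_end)
{
	assert(gfn_start < gfn_end);
	return frames_inclusive(gfn_start, gfn_end - 1);
}
```

Under that assumption, passing gfn_end - gfn_start as the page count covers exactly the frames the iterator visits, and adding 1 would flush one frame too many.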
RE: [PATCH V2 10/13] x86/hyper-v: Add HvFlushGuestAddressList hypercall support
From: Tianyu Lan Sent: Monday, September 17, 2018 8:19 PM
>
> #include
> #include
> #include
> #include
> +#include

Hopefully asm/kvm_host.h does not need to be #included, given the new code structure.

>
> #include
>
> +/*
> + * MAX_FLUSH_PAGES = "additional_pages" + 1. It's limited
> + * by the bitwidth of "additional_pages" in union hv_gpa_page_range.
> + */
> +#define MAX_FLUSH_PAGES (2048)
> +
> +/*
> + * All input flush parameters are in single page. The max flush count
> + * is equal with how many entries of union hv_gpa_page_range can be
> + * populated in the input parameter page. MAX_FLUSH_REP_COUNT
> + * = (4096 - 16) / 8. (“Page Size” - "Address Space" - "Flags") /
> + * "GPA Range".
> + */
> +#define MAX_FLUSH_REP_COUNT (510)
> +

I would recommend putting the above two definitions in hyperv-tlfs.h. They are directly tied to the data structures defined by Hyper-V in the TLFS. Put MAX_FLUSH_PAGES immediately after the definition for hv_gpa_page_range so that the dependency is obvious.

For MAX_FLUSH_REP_COUNT, can you do the calculation in the #define rather than just in the comment? Alternatively, define the gpa_list[] array to be of MAX_FLUSH_REP_COUNT size, and then add a compile time assert that the size of struct hv_guest_mapping_flush_list is exactly one page in size. It's just a good way to use the compiler to help check for mistakes.

Also prefix them both with HV_ since they will be more globally visible as part of hyperv-tlfs.h.
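For illustration, the compile-time check suggested above might look like the following userspace sketch. All names ending in _SKETCH or _sketch are hypothetical stand-ins, the GPA range entry is modeled as a plain 64-bit value rather than the real union, and the 11-bit width is only inferred from the quoted MAX_FLUSH_PAGES value of 2048; none of this is taken from hyperv-tlfs.h.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>

#define HV_PAGE_SIZE_SKETCH 4096

/* Stand-in for union hv_gpa_page_range: one 64-bit entry (assumption). */
typedef uint64_t gpa_range_sketch;

/* Assumed width of "additional_pages", inferred from 2048 = 2^11. */
#define ADDITIONAL_PAGES_BITS_SKETCH 11
#define HV_MAX_FLUSH_PAGES_SKETCH (1u << ADDITIONAL_PAGES_BITS_SKETCH)

/* ("Page Size" - "Address Space" - "Flags") / "GPA Range", computed, not typed in. */
#define HV_MAX_FLUSH_REP_COUNT_SKETCH \
	((HV_PAGE_SIZE_SKETCH - 2 * sizeof(uint64_t)) / sizeof(gpa_range_sketch))

struct flush_list_sketch {
	uint64_t address_space;
	uint64_t flags;
	gpa_range_sketch gpa_list[HV_MAX_FLUSH_REP_COUNT_SKETCH];
};

/* The compiler rejects the build if the structure drifts from one page. */
static_assert(sizeof(struct flush_list_sketch) == HV_PAGE_SIZE_SKETCH,
	      "flush list must occupy exactly one page");
```

With the count derived from the sizes, a later change to either header field or to the entry type either keeps the structure at one page or fails to compile, which is the kind of help from the compiler the review asks for.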
> int hyperv_flush_guest_mapping(u64 as)
> {
> struct hv_guest_mapping_flush **flush_pcpu;
> @@ -54,3 +71,89 @@ int hyperv_flush_guest_mapping(u64 as)
> return ret;
> }
> EXPORT_SYMBOL_GPL(hyperv_flush_guest_mapping);
> +
> +static int fill_flush_list(union hv_gpa_page_range gpa_list[],
> +		int offset, u64 start_gfn, u64 pages)
> +{
> +	int gpa_n = offset;
> +	u64 cur = start_gfn;
> +	u64 additional_pages;
> +
> +	do {
> +		if (gpa_n >= MAX_FLUSH_REP_COUNT) {
> +			pr_warn("Request exceeds HvFlushGuestList max flush count.");
> +			return -ENOSPC;

I wonder if the warning is really needed. When the error is returned up through the higher levels of code, won't the higher levels just fall back to the non-enlightened flush code? So nothing actually goes wrong; it's just that a slower code path gets taken. A comment about such expectations might be helpful.

> +		}
> +
> +		if (pages > MAX_FLUSH_PAGES) {
> +			additional_pages = MAX_FLUSH_PAGES - 1;
> +			pages -= MAX_FLUSH_PAGES;
> +		} else {
> +			additional_pages = pages - 1;
> +			pages = 0;
> +		}

The above code is really doing:

	additional_pages = min(pages, MAX_FLUSH_PAGES) - 1;
	pages -= additional_pages + 1;

And you might want to move the decrement of 'pages' down to the bottom of the loop where you update the other loop variables.
> +
> +		gpa_list[gpa_n].page.additional_pages = additional_pages;
> +		gpa_list[gpa_n].page.largepage = false;
> +		gpa_list[gpa_n].page.basepfn = cur;
> +
> +		cur += additional_pages + 1;
> +		gpa_n++;
> +	} while (pages > 0);
> +
> +	return gpa_n;
> +}
> +
> +int hyperv_flush_guest_mapping_range(u64 as, struct hyperv_tlb_range *range)
> +{
> +	struct hv_guest_mapping_flush_list **flush_pcpu;
> +	struct hv_guest_mapping_flush_list *flush;
> +	u64 status = 0;
> +	unsigned long flags;
> +	int ret = -ENOTSUPP;
> +	int gpa_n = 0;
> +
> +	if (!hv_hypercall_pg)
> +		goto fault;
> +
> +	local_irq_save(flags);
> +
> +	flush_pcpu = (struct hv_guest_mapping_flush_list **)
> +		this_cpu_ptr(hyperv_pcpu_input_arg);
> +
> +	flush = *flush_pcpu;
> +	if (unlikely(!flush)) {
> +		local_irq_restore(flags);
> +		goto fault;
> +	}
> +
> +	flush->address_space = as;
> +	flush->flags = 0;
> +
> +	if (!range->flush_list)
> +		gpa_n = fill_flush_list(flush->gpa_list, gpa_n,
> +				range->start_gfn, range->pages);
> +	else if (range->parse_flush_list_func)
> +		gpa_n = range->parse_flush_list_func(flush->gpa_list, gpa_n,
> +				range->flush_list, fill_flush_list);
> +	else
> +		gpa_n = -1;
> +
> +	if (gpa_n < 0) {
> +		local_irq_restore(flags);
> +		goto fault;
> +	}
> +
> +	status = hv_do_rep_hypercall(HVCALL_FLUSH_GUEST_PHYSICAL_ADDRESS_LIST,
> +			gpa_n, 0, flush, NULL);
> +
> +	local_irq_restore(flags);
> +
> +	if (!(status & HV_HYPERCALL_RESULT_MASK))
> +		ret = 0;
> +	else
> +		ret = status;
> +fault:
> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(hyperv_flush_guest_mapping_r
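The min()-based simplification suggested earlier in this review can be tried out in a standalone userspace sketch. This is not the kernel function: the entry type, the helper min_u64(), and the -1 return (standing in for -ENOSPC) are hypothetical, and the two limits simply reuse the values quoted in the patch.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_FLUSH_PAGES 2048u     /* value quoted in the patch */
#define MAX_FLUSH_REP_COUNT 510   /* value quoted in the patch */

/* Simplified stand-in for union hv_gpa_page_range (assumption). */
struct gpa_range_sketch {
	uint64_t basepfn;
	uint64_t additional_pages;
};

static uint64_t min_u64(uint64_t a, uint64_t b)
{
	return a < b ? a : b;
}

/*
 * Split [start_gfn, start_gfn + pages) into entries of at most
 * MAX_FLUSH_PAGES frames each. Returns the next free index, or -1
 * when the list is full.
 */
static int fill_flush_list_sketch(struct gpa_range_sketch list[], int offset,
				  uint64_t start_gfn, uint64_t pages)
{
	int n = offset;
	uint64_t cur = start_gfn;

	assert(pages > 0);
	do {
		uint64_t additional_pages;

		if (n >= MAX_FLUSH_REP_COUNT)
			return -1;

		additional_pages = min_u64(pages, MAX_FLUSH_PAGES) - 1;
		list[n].basepfn = cur;
		list[n].additional_pages = additional_pages;

		/* All loop variables are advanced together at the bottom. */
		cur += additional_pages + 1;
		pages -= additional_pages + 1;
		n++;
	} while (pages > 0);

	return n;
}
```

Keeping the decrement of pages next to the updates of cur and n makes it easier to see that each pass consumes exactly additional_pages + 1 frames.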
RE: [PATCH V2 4/13] KVM/MMU: Flush tlb directly in the kvm_handle_hva_range()
From: Tianyu Lan Sent: Monday, September 17, 2018 8:19 PM
> +
> +	if (ret && kvm_available_flush_tlb_with_range()) {
> +		kvm_flush_remote_tlbs_with_address(kvm,
> +				gfn_start,
> +				gfn_end - gfn_start);

Does the above need to be gfn_end - gfn_start + 1?

> +		ret = 0;
> +	}

Michael
RE: [PATCH V2 2/13] KVM/MMU: Add tlb flush with range helper function
From: Tianyu Lan Sent: Monday, September 17, 2018 8:18 PM
>
> +static void kvm_flush_remote_tlbs_with_range(struct kvm *kvm,
> +		struct kvm_tlb_range *range)
> +{
> +	int ret = -ENOTSUPP;
> +
> +	if (range && kvm_x86_ops->tlb_remote_flush_with_range) {
> +		/*
> +		 * Read tlbs_dirty before setting KVM_REQ_TLB_FLUSH in
> +		 * kvm_make_all_cpus_request.
> +		 */
> +		long dirty_count = smp_load_acquire(&kvm->tlbs_dirty);
> +
> +		ret = kvm_x86_ops->tlb_remote_flush_with_range(kvm, range);
> +		cmpxchg(&kvm->tlbs_dirty, dirty_count, 0);
> +	}

The comment and the code that manipulates kvm->tlbs_dirty appear to have been copied from kvm_flush_remote_tlbs(). But the above code doesn't call kvm_make_all_cpus_request(). I haven't traced all the details, but it seems like either the comment should be updated or the code isn't needed.

Michael
RE: [PATCH] x86/hyperv: suppress "PCI: Fatal: No config space access function found"
From: Dexuan Cui Sent: Tuesday, September 18, 2018 3:30 PM
>
> A Generation-2 Linux VM on Hyper-V doesn't have the legacy PCI bus, and
> users always see the scary warning, which is actually harmless. The patch
> is made to suppress the warning.
>
> Signed-off-by: Dexuan Cui
> Cc: K. Y. Srinivasan
> Cc: Haiyang Zhang
> Cc: Stephen Hemminger
> ---
> arch/x86/hyperv/hv_init.c | 19 +++
> 1 file changed, 19 insertions(+)
>

Reviewed-by: Michael Kelley
RE: [PATCH] Drivers: hv: vmbus: include header for get_irq_regs()
From Sebastian Andrzej Siewior Sent: Thursday, August 30, 2018 12:55 AM
>
> On !RT the header file get_irq_regs() gets pulled in via other header files. On
> RT it does not and the build fails:
>
> drivers/hv/vmbus_drv.c:975 implicit declaration of function ‘get_irq_regs’ [-Werror=implicit-function-declaration]
> drivers/hv/hv.c:115 implicit declaration of function ‘get_irq_regs’ [-Werror=implicit-function-declaration]
>
> Add the header file for get_irq_regs() in a common header so it used by
> vmbus_drv.c by hv.c for their get_irq_regs() usage.
>

get_irq_regs() is not used explicitly in either vmbus_drv.c or in hv.c. And I couldn't make the line numbers in the errors above line up with anything in the source code that might be implicitly using get_irq_regs(). Is it the calls to add_interrupt_randomness()? Did you figure out exactly what line of code is causing the compile error?

I'm wondering whether adding the #include of irq.h into hyperv_vmbus.h is really the right solution. More correct might be to have the file where get_irq_regs() is actually used to #include irq_regs.h.

Michael
RE: [PATCH 10/13] x86/hyper-v: Add HvFlushGuestAddressList hypercall support
From: Tianyu Lan Sent: Monday, September 10, 2018 1:39 AM
> +
> +int hyperv_flush_guest_mapping_range(u64 as, struct kvm_tlb_range *range)

I'm really concerned about defining the Hyper-V function to flush guest mappings in terms of a KVM struct definition. Your patch puts this function in arch/x86/hyperv/nested.c. I haven't investigated all the details, but on its face this approach seems like it would cause trouble in the long run, and it doesn't support the case of a hypervisor other than KVM running at L1.

I know that KVM code has taken a dependency on Hyper-V types and code, but that's because KVM is emulating a lot of Hyper-V functionality and it's taking advantage of Hyper-V enlightenments. Is there a top level reason I haven't thought of for Hyper-V code to take a dependency on KVM definitions? I would think we want Hyper-V code to be generic, using Hyper-V data structure definitions. Then in keeping with what's already been done, KVM code would use those definitions where it needs to make calls to Hyper-V code.

> +{
> +	struct kvm_mmu_page *sp;
> +	struct hv_guest_mapping_flush_list **flush_pcpu;
> +	struct hv_guest_mapping_flush_list *flush;
> +	u64 status = 0;
> +	unsigned long flags;
> +	int ret = -ENOTSUPP;
> +	int gpa_n = 0;
> +
> +	if (!hv_hypercall_pg)
> +		goto fault;
> +
> +	local_irq_save(flags);
> +
> +	flush_pcpu = (struct hv_guest_mapping_flush_list **)
> +		this_cpu_ptr(hyperv_pcpu_input_arg);
> +
> +	flush = *flush_pcpu;
> +	if (unlikely(!flush)) {
> +		local_irq_restore(flags);
> +		goto fault;
> +	}
> +
> +	flush->address_space = as;
> +	flush->flags = 0;
> +
> +	if (!range->flush_list) {
> +		gpa_n = fill_flush_list(flush->gpa_list, gpa_n,
> +				range->start_gfn, range->end_gfn);
> +	} else {
> +		list_for_each_entry(sp, range->flush_list,
> +				    flush_link) {
> +			u64 end_gfn = sp->gfn +
> +				KVM_PAGES_PER_HPAGE(sp->role.level) - 1;
> +			gpa_n = fill_flush_list(flush->gpa_list, gpa_n,
> +					sp->gfn, end_gfn);
> +		}

Per the previous comment, if this loop really needs to walk a KVM data structure, look for a different way to organize things so that the handling of KVM-specific data structures is in code that’s part of KVM, rather than in Hyper-V code.

Michael
RE: [PATCH v2 2/4] arm64: hyperv: Add support for Hyper-V as a hypervisor
From: KY Srinivasan Sent: Thursday, August 30, 2018 11:51 AM
> > +	/* Allocate percpu VP index */
> > +	hv_vp_index = kmalloc_array(num_possible_cpus(), sizeof(*hv_vp_index),
> > +				    GFP_KERNEL);
> > +	if (!hv_vp_index)
> > +		return 1;
> > +
> We should perhaps set the array so the contents are invalid so we can correctly
> handle enlightenments for TLB shootdown and IPI.

Agreed. Will add the initialization in v3 of the patch.

Michael
RE: [PATCH v2 1/4] arm64: hyperv: Add core Hyper-V include files
From: KY Srinivasan Sent: Thursday, August 30, 2018 11:23 AM > > +/* > > + * This file contains definitions from the Hyper-V Hypervisor Top-Level > > + * Functional Specification (TLFS): > > + * > > https://docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/reference/tlfs > > + > A lot of TLFS definitions are ISA independent and we are duplicating these > definitions both for X86_64 and ARM_64. Perhaps we should look at splitting > this file into a common and ISA specific header file. I agree that we want to end up with x86_64 and ARM64 ISA dependent files that include an ISA independent file. My thinking was to not make that separation now, for a couple of reasons: 1) We don't have a Hyper-V TLFS that is explicit about what should be considered ISA independent and ISA dependent. I can make some reasonable guesses, but it will be subject to change as the Hyper-V team firms up the interface and decides what they want to commit to. 2) Some of the things defined in the TLFS have names that are x86-specific (TSC, MSR, etc.). For the ISA independent parts, those names should be more generic, which is another dependency on the Hyper-V team defining the ISA independent parts of the TLFS. My judgment was that we'll end up with less perturbation overall to go with this cloned version of hyperv-tlfs.h for now, and then come back and do the separation once we have a definitive TLFS to base it on. But it's a judgment call, and if the sense is that we should do the separation now, I can give it a try. > > +#define HvRegisterHypervisorVersion0x0100 /*CPUID > > 0x4002 */ > > +#defineHvRegisterPrivilegesAndFeaturesInfo0x0200 /*CPUID > > 0x4003 */ > > +#defineHvRegisterFeaturesInfo0x0201 > > /*CPUID 0x4004 */ > > +#defineHvRegisterImplementationLimitsInfo0x0202 /*CPUID > > 0x4005 */ > > +#define HvARM64RegisterInterfaceVersion0x00090006 /*CPUID > > 0x4001 */ > > Can we avoid the mixed case names. Agreed. 
I'll fix this throughout to use all uppercase, with underscore as the word separator. > > + * Linux-specific definitions for managing interactions with Microsoft's > > + * Hyper-V hypervisor. Definitions that are specified in the Hyper-V > > + * Top Level Functional Spec (TLFS) should not go in this file, but > > + * should instead go in hyperv-tlfs.h. > > Would it make sense to breakup this header file into ISA independent and > dependent files? Yes, as above I agree the separation makes sense. And since this file is tied to Linux and not to the Hyper-V TLFS, the separation isn't affected by the TLFS issues mentioned above. I'll give it a try and see if any issues arise. > > +/* > > + * Define the IRQ numbers/vectors used by Hyper-V VMbus interrupts > > + * and by STIMER0 Direct Mode interrupts. Hyper-V should be supplying > > + * these values through ACPI, but there are no other interrupting > > + * devices in a Hyper-V VM on ARM64, so it's OK to hard code for now. > > + * The "CALLBACK_VECTOR" terminology is a left-over from the x86/x64 > > + * world that is used in architecture independent Hyper-V code. > > + */ > When we have direct device assignment for ARM-64 guests, can we still > hardcode. Yes, we can still hardcode. These values are in the Private Peripheral Interrupt (PPI) range of 16 to 31. Any IRQ numbers assigned to a Discrete Device Assignment (DDA) device will be in the Shared Peripheral Interrupt (SPI) range of 32-1019 or the Locality-specific Peripheral Interrupt (LPI) range of 8192 and above. The handling of DDA interrupts is still under discussion with the Hyper-V team, but there won't be any conflicts with the PPI values that are hardcoded here. > > +/* > > + * The guest OS needs to register the guest ID with the hypervisor.
> > + * The guest ID is a 64 bit entity and the structure of this ID is > > + * specified in the Hyper-V specification: > > + * > > + * msdn.microsoft.com/en- > > us/library/windows/hardware/ff542653%28v=vs.85%29.aspx > > + * > > + * While the current guideline does not specify how Linux guest ID(s) > > + * need to be generated, our plan is to publish the guidelines for > > + * Linux and other guest operating systems that currently are hosted > > + * on Hyper-V. The implementation here conforms to this yet > > + * unpublished guidelines. > > + * > > + * > > + * Bit(s) > > + * 63 - Indicates if the OS is Open Source or not; 1 is Open Source > > + * 62:56 - Os Type; Linux is 0x100 > > + * 55:48 - Distro specific identification > > + * 47:16 - Linux kernel version number > > + * 15:0 - Distro specific identification > > + * > > + * Generate the guest ID based on the guideline described above. > > + */ > > No need to repeat the above block comment (already included in the TLFS > header). Agreed. Will make the change in v3 of the patch. > > +/* Free the message slot and signal end-of-message if required */ > > +static inline void vmbus_signal_eom(struct hv_message *msg, u32 > > old_msg_type) > > +{ > > +/* > > +
RE: [PATCH char-misc 1/1] Drivers: hv: vmbus: Make synic_initialized flag per-cpu
From: Vitaly Kuznetsov Sent: Wednesday, August 1, 2018 2:26 AM
> > I was trying to decide if there are any arguments in favor of one
> > approach vs. the other: a per-cpu flag in memory or checking
> > the synic_control "enable" bit. Seems like a wash to me, in which
> > case I have a slight preference for the per-cpu flag in memory vs.
> > creating another function to return sctrl.enable. But I'm completely
> > open to reasons why checking sctrl.enable is better.
>
> Just a few thoughts: reading MSR is definitely slower but we avoid
> 'shadowing' the state, the reading is always correct. In case there's a
> chance the SynIC will get disabled from host side we can only find this
> out by doing MSR read. This is a purely theoretical possibility, I
> believe, we can go ahead with this patch.

Vitaly -- just to confirm: you are OK with the patch as is? (I'll check, but I may need to rebase on the latest code.)

Michael
RE: [PATCH 2/5] vmbus: add driver_override support
From: Stephen Hemminger Sent: Tuesday, August 14, 2018 9:35 AM
> On Mon, 13 Aug 2018 19:30:50 +
> "Michael Kelley (EOSG)" wrote:
>
> > > +/*
> > > + * Return a matching hv_vmbus_device_id pointer.
> > > + * If there is no match, return NULL.
> > > + */
> > > +static const struct hv_vmbus_device_id *hv_vmbus_get_id(struct hv_driver *drv,
> > > +					struct hv_device *dev)
> > > +{
> > > +	const uuid_le *guid = &dev->dev_type;
> > > +	const struct hv_vmbus_device_id *id;
> > >
> > > -	return NULL;
> > > +	/* When driver_override is set, only bind to the matching driver */
> > > +	if (dev->driver_override && strcmp(dev->driver_override, drv->name))
> > > +		return NULL;
> >
> > This function needs to be covered by the device lock, so that
> > dev->driver_override can't be set to NULL and the memory freed
> > during the above 'if' statement. When called from vmbus_probe(),
> > the device lock is held, so it's good. But when called from
> > vmbus_match(), the device lock may not be held: consider the path
> > __driver_attach() -> driver_match_device() -> vmbus_match().
>
> The function hv_vmbus_get_id is called from that path.
> i.e. __device_attach -> driver-match_device -> vmbus_match.
> and __device_attach always does:
> 	device_lock(dev);

Agreed. The __device_attach() path holds the device lock and all is good. But the __driver_attach() path does not hold the device lock when the match function is called, leaving the code open to a potential race. The same problem could happen in the pci subsystem, so the issue is more generic and probably should be evaluated and dealt with separately.

Michael

>
> The code in driver _override_store uses the same device_lock
> when storing the new value.
>
> This is same locking as is done in pci-sysfs.c
RE: [PATCH 2/5] vmbus: add driver_override support
From: k...@linuxonhyperv.com Sent: Friday, August 10, 2018 4:06 PM
> From: Stephen Hemminger
>
> Add support for overriding the default driver for a VMBus device
> in the same way that it can be done for PCI devices. This patch
> adds the /sys/bus/vmbus/devices/.../driver_override file
> and the logic for matching.
>
> This is used by driverctl tool to do driver override.
> https://gitlab.com/driverctl/driverctl
>
> Signed-off-by: Stephen Hemminger
> Signed-off-by: K. Y. Srinivasan
> ---
> diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
> index b1b548a21f91..e6d8fdac6d8b 100644
> --- a/drivers/hv/vmbus_drv.c
> +++ b/drivers/hv/vmbus_drv.c
> @@ -498,6 +498,54 @@ static ssize_t device_show(struct device *dev,
> }
> static DEVICE_ATTR_RO(device);
>
> +static ssize_t driver_override_store(struct device *dev,
> +				     struct device_attribute *attr,
> +				     const char *buf, size_t count)
> +{
> +	struct hv_device *hv_dev = device_to_hv_device(dev);
> +	char *driver_override, *old, *cp;
> +
> +	/* We need to keep extra room for a newline */
> +	if (count >= (PAGE_SIZE - 1))
> +		return -EINVAL;

Does 'count' actually have a relationship to PAGE_SIZE, or is PAGE_SIZE just used as an arbitrary size limit? I'm wondering what happens on ARM64 with a 64K page size, for example. If it's just arbitrary, coding such a constant would be better.

> +/*
> + * Return a matching hv_vmbus_device_id pointer.
> + * If there is no match, return NULL.
> + */
> +static const struct hv_vmbus_device_id *hv_vmbus_get_id(struct hv_driver *drv,
> +					struct hv_device *dev)
> +{
> +	const uuid_le *guid = &dev->dev_type;
> +	const struct hv_vmbus_device_id *id;
>
> -	return NULL;
> +	/* When driver_override is set, only bind to the matching driver */
> +	if (dev->driver_override && strcmp(dev->driver_override, drv->name))
> +		return NULL;

This function needs to be covered by the device lock, so that dev->driver_override can't be set to NULL and the memory freed during the above 'if' statement. When called from vmbus_probe(), the device lock is held, so it's good. But when called from vmbus_match(), the device lock may not be held: consider the path __driver_attach() -> driver_match_device() -> vmbus_match().

Michael
RE: [PATCH 1/5] Tools: hv: Fix a bug in the key delete code
From: k...@linuxonhyperv.com Sent: Friday, August 10, 2018 4:06 PM
>
> Fix a bug in the key delete code - the num_records range
> from 0 to num_records-1.
>
> Signed-off-by: K. Y. Srinivasan
> Reported-by: David Binderman
> Cc:
> ---

Reviewed-by: Michael Kelley
RE: [PATCH char-misc 1/1] Drivers: hv: vmbus: Make synic_initialized flag per-cpu
From: Vitaly Kuznetsov Sent: Tuesday, July 31, 2018 4:20 AM
>
> Reviewed-by: Vitaly Kuznetsov

Thanks for the review

>
> Alternatively, we can get rid of synic_initialized flag altogether:
> hv_synic_init() never fails in the first place but we can always
> implement something like:
>
> int hv_synic_is_initialized(void) {
>         union hv_synic_scontrol sctrl;
>
>         hv_get_synic_state(sctrl.as_uint64);
>
>         return sctrl.enable;
> }
>
> as it doesn't seem that we need to check synic state on _other_ CPUs.
>
> --
> Vitaly

I was trying to decide if there are any arguments in favor of one
approach vs. the other: a per-cpu flag in memory or checking
the synic_control "enable" bit. Seems like a wash to me, in which
case I have a slight preference for the per-cpu flag in memory vs.
creating another function to return sctrl.enable. But I'm completely
open to reasons why checking sctrl.enable is better.

Michael
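For readers comparing the two options discussed above, a small user-space sketch of both follows. Nothing here is kernel code: the array fake_scontrol stands in for the per-CPU control register, NR_FAKE_CPUS and the function names are hypothetical, and the enable bit is assumed to be bit 0. A mask is used instead of the bitfield union so the sketch does not depend on compiler bitfield layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_FAKE_CPUS 4
#define SCONTROL_ENABLE (1ULL << 0)	/* assumed: enable is bit 0 */

/* Simulated per-CPU control register values (stand-in for the real MSR). */
static uint64_t fake_scontrol[NR_FAKE_CPUS];

/* Approach 1 state: a per-CPU flag kept in memory. */
static bool synic_initialized[NR_FAKE_CPUS];

/* Simulated init: enable the SynIC and record that in the flag. */
static void fake_synic_init(int cpu)
{
	assert(cpu >= 0 && cpu < NR_FAKE_CPUS);
	fake_scontrol[cpu] |= SCONTROL_ENABLE;
	synic_initialized[cpu] = true;
}

/* Approach 1: consult the in-memory flag. */
static bool is_initialized_by_flag(int cpu)
{
	assert(cpu >= 0 && cpu < NR_FAKE_CPUS);
	return synic_initialized[cpu];
}

/* Approach 2: no extra state; read the register and test the enable bit. */
static bool is_initialized_by_register(int cpu)
{
	assert(cpu >= 0 && cpu < NR_FAKE_CPUS);
	return (fake_scontrol[cpu] & SCONTROL_ENABLE) != 0;
}
```

In this simulation both checks always agree, which matches the "seems like a wash" assessment. One difference the sketch glosses over is that the real register read only describes the CPU executing it, which is the point about not needing state for _other_ CPUs.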
RE: [PATCH] Drivers: hv: vmbus: Reset the channel callback in vmbus_onoffer_rescind()
From: Dexuan Cui Sent: Thursday, July 12, 2018 10:53 PM
>
> Before setting channel->rescind in vmbus_rescind_cleanup(), we should make
> sure the channel callback won't run any more, otherwise a high-level
> driver like pci_hyperv, which may be infinitely waiting for the host VSP's
> response and notices the channel has been rescinded, can't safely give
> up: e.g., in hv_pci_protocol_negotiation() -> wait_for_response(), it's
> unsafe to exit from wait_for_response() and proceed with the on-stack
> variable "comp_pkt" popped. The issue was originally spotted by
> Michael Kelley.
>
> In vmbus_close_internal(), the patch also minimizes the range protected by
> disabling/enabling channel->callback_event: we don't really need that for
> the whole function.
>
> Signed-off-by: Dexuan Cui
> Cc: sta...@vger.kernel.org
> Cc: K. Y. Srinivasan
> Cc: Stephen Hemminger
> Cc: Michael Kelley

Reviewed-by: Michael Kelley
RE: [PATCH 1/1] Drivers: HV: Send one page worth of kmsg dump over Hyper-V during panic
From: Sunil Muthuswamy Sent: Wednesday, July 11, 2018 9:59 AM
> Thanks, Michael. In which branch should I fix these now that the changes have
> been merged with the char-misc-next branch?

If the original code is already in char-misc-next, you should probably
submit a completely new patch for char-misc-next that just makes the
fixes to the original code.

Michael
RE: [PATCH 1/1] Drivers: HV: Send one page worth of kmsg dump over Hyper-V during panic
From: k...@linuxonhyperv.com Sent: Saturday, July 7, 2018 7:57 PM
>
> From: Sunil Muthuswamy
>
> In the VM mode on Hyper-V, currently, when the kernel panics, an error
> code and few register values are populated in an MSR and the Hypervisor
> notified. This information is collected on the host. The amount of
> information currently collected is found to be limited and not very
> actionable. To gather more actionable data, such as stack trace, the
> proposal is to write one page worth of kmsg data on an allocated page
> and the Hypervisor notified of the page address through the MSR.
>
> - Sysctl option to control the behavior, with ON by default.
>
> Cc: K. Y. Srinivasan
> Cc: Stephen Hemminger
> Signed-off-by: Sunil Muthuswamy
> Signed-off-by: K. Y. Srinivasan
> ---
> +        /*
> +         * Write dump contents to the page. No need to synchronize; panic should
> +         * be single-threaded.
> +         */
> +        if (!kmsg_dump_get_buffer(dumper, true, hv_panic_page,
> +                                  PAGE_SIZE, &bytes_written)) {
> +                pr_err("Hyper-V: Unable to get kmsg data for panic\n");
> +                return;

From what I can see, the return value from kmsg_dump_get_buffer()
is not an indication of success or failure -- it's an indication of whether
there is more data available. There's no reason to output an error
message.

> @@ -1065,6 +1136,32 @@ static int vmbus_bus_init(void)
>           * Only register if the crash MSRs are available
>           */
>          if (ms_hyperv.misc_features & HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE) {
> +                u64 hyperv_crash_ctl;
> +                /*
> +                 * Sysctl registration is not fatal, since by default
> +                 * reporting is enabled.
> +                 */
> +                hv_ctl_table_hdr = register_sysctl_table(hv_root_table);
> +                if (!hv_ctl_table_hdr)
> +                        pr_err("Hyper-V: sysctl table register error");
> +
> +                /*
> +                 * Register for panic kmsg callback only if the right
> +                 * capability is supported by the hypervisor.
> +                 */
> +                rdmsrl(HV_X64_MSR_CRASH_CTL, hyperv_crash_ctl);
> +                if (hyperv_crash_ctl & HV_CRASH_CTL_CRASH_NOTIFY_MSG) {

vmbus_drv.c is architecture independent code, and should not be referencing
x86/x64 MSRs. Reading the MSR (and maybe the test as well?) should go
in a separate function in an x86-specific source file.

And just to confirm, is this the right way to test for the feature? Usually,
feature determination is based on one of the feature registers. The
NOTIFY_MSG flag seems to have a dual meaning -- on read it indicates
the feature is present. On write in hyperv_report_panic_msg(), it evidently
means that the guest is sending a full page of data to Hyper-V.

> @@ -1081,6 +1178,11 @@ static int vmbus_bus_init(void)
>          bus_unregister(&hv_bus);
> +        free_page((unsigned long)hv_panic_page);
> +        if (!hv_ctl_table_hdr) {

The above test is backwards. Remove the bang.

> @@ -1785,10 +1887,18 @@ static void __exit vmbus_exit(void)
> +        free_page((unsigned long)hv_panic_page);
> +        if (!hv_ctl_table_hdr) {

Same here. Test is backwards.

Michael
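As an illustration of the "test is backwards" comments, here is a user-space sketch of the corrected cleanup pattern. The quoted hunks do not show the body of the 'if', so the sketch assumes it unregisters the table and clears the pointer. struct fake_table_header, fake_unregister() and cleanup() are hypothetical names, not kernel APIs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the sysctl table header returned at registration. */
struct fake_table_header {
	int registered;
};

/* Counts unregister calls so the behavior can be observed. */
static int unregister_calls;

/* Stand-in for the unregister call; a NULL header here would be a bug. */
static void fake_unregister(struct fake_table_header *hdr)
{
	assert(hdr != NULL);
	hdr->registered = 0;
	unregister_calls++;
}

/*
 * Corrected cleanup: no '!' in the test, so the header is unregistered
 * only when registration actually succeeded (pointer is non-NULL).
 */
static void cleanup(struct fake_table_header **hdrp)
{
	if (*hdrp) {
		fake_unregister(*hdrp);
		*hdrp = NULL;
	}
}
```

With the original '!' test the body would run only when the pointer is NULL, so a successfully registered table would never be unregistered.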
RE: [PATCH V2 1/5] X86/Hyper-V: Add flush HvFlushGuestPhysicalAddressSpace hypercall support
From: Tianyu Lan Monday, July 9, 2018 2:03 AM
> Hyper-V supports a pv hypercall HvFlushGuestPhysicalAddressSpace to
> flush nested VM address space mapping in l1 hypervisor and it's to
> reduce overhead of flushing ept tlb among vcpus. This patch is to
> implement it.
>
> Signed-off-by: Lan Tianyu
> ---
>  arch/x86/hyperv/Makefile           |  2 +-
>  arch/x86/hyperv/nested.c           | 64 ++
>  arch/x86/include/asm/hyperv-tlfs.h |  8 +
>  arch/x86/include/asm/mshyperv.h    |  2 ++
>  4 files changed, 75 insertions(+), 1 deletion(-)
>  create mode 100644 arch/x86/hyperv/nested.c

> +#include
> +#include
> +#include
> +#include
> +
> +int hyperv_flush_guest_mapping(u64 as)
> +{
> +        struct hv_guest_mapping_flush **flush_pcpu;
> +        struct hv_guest_mapping_flush *flush;
> +        u64 status;
> +        unsigned long flags;
> +        int ret = -EFAULT;
> +
> +        if (!hv_hypercall_pg)
> +                goto fault;
> +
> +        local_irq_save(flags);
> +
> +        flush_pcpu = (struct hv_guest_mapping_flush **)
> +                this_cpu_ptr(hyperv_pcpu_input_arg);
> +
> +        flush = *flush_pcpu;
> +
> +        if (unlikely(!flush)) {
> +                local_irq_restore(flags);
> +                goto fault;
> +        }
> +
> +        flush->address_space = as;
> +        flush->flags = 0;
> +
> +        status = hv_do_hypercall(HVCALL_FLUSH_GUEST_PHYSICAL_ADDRESS_SPACE,
> +                                 flush, NULL);

Did you consider using a "fast" hypercall? Unless there's some reason I'm
not aware of, a "fast" hypercall would be perfect here as there are 16 bytes
of input and no output. Vitaly recently added hv_do_fast_hypercall16()
in the linux-next tree. See __send_ipi_mask() in hv_apic.c in linux-next
for an example of usage. With a fast hypercall, you don't need the code for
getting the per-cpu input arg or the code for local irq save/restore, so the
code that is left is a lot faster and simpler.

Michael

> +        local_irq_restore(flags);
> +
> +        if (!(status & HV_HYPERCALL_RESULT_MASK))
> +                ret = 0;
> +
> +fault:
> +        return ret;
> +}
> +EXPORT_SYMBOL_GPL(hyperv_flush_guest_mapping);
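A rough user-space sketch of what the "fast" variant suggested above could look like is shown here. It is not the real interface: fake_fast_hypercall16() is a hypothetical stub standing in for a call that takes its 16 bytes of input as two 64-bit register values, and FAKE_RESULT_MASK assumes the status is carried in the low 16 bits of the return value.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define FAKE_RESULT_MASK 0xffffULL	/* assumed: low 16 bits hold the status */

/* What the stub "hypervisor" last received, so the call can be observed. */
static uint64_t last_address_space;
static uint64_t last_flags;

/*
 * Hypothetical stub for a fast hypercall with 16 bytes of input: the two
 * 64-bit inputs travel in registers, so no per-CPU input page is needed.
 */
static uint64_t fake_fast_hypercall16(uint64_t input1, uint64_t input2)
{
	last_address_space = input1;
	last_flags = input2;
	return 0;	/* status 0 means success in this sketch */
}

/*
 * Flush using the fast path. Compared with the memory-based version there
 * is no per-CPU buffer lookup and no interrupt save/restore around it.
 */
static int flush_guest_mapping_fast(uint64_t as)
{
	uint64_t status = fake_fast_hypercall16(as, 0);

	return (status & FAKE_RESULT_MASK) ? -EFAULT : 0;
}
```

The whole function reduces to one call plus a status check, which is the simplification described in the review comment.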
RE: [PATCH 2/2] x86/hyper-v: check for VP_INVAL in hyperv_flush_tlb_others()
From: Vitaly Kuznetsov Monday, July 9, 2018 10:40 AM
> Commit 1268ed0c474a ("x86/hyper-v: Fix the circular dependency in IPI
> enlightenment") pre-filled hv_vp_index with VP_INVAL so it is now
> (theoretically) possible to observe hv_cpu_number_to_vp_number()
> returning VP_INVAL. We need to check for that in hyperv_flush_tlb_others().
>
> Not checking for VP_INVAL on the first call site where we do
>
>         if (hv_cpu_number_to_vp_number(cpumask_last(cpus)) >= 64)
>                 goto do_ex_hypercall;
>
> is OK, in case we're eligible for non-ex hypercall we'll catch the
> issue later in for_each_cpu() cycle and in case we'll be doing ex-
> hypercall cpumask_to_vpset() will fail.
>
> It would be nice to change hv_cpu_number_to_vp_number() return
> value's type to 'u32' but this will likely be a bigger change as
> all call sites need to be checked first.
>
> Fixes: 1268ed0c474a ("x86/hyper-v: Fix the circular dependency in IPI
> enlightenment")
> Signed-off-by: Vitaly Kuznetsov
> ---
>  arch/x86/hyperv/mmu.c | 5 +
>  1 file changed, 5 insertions(+)

Reviewed-by: Michael Kelley
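The kind of check described in the commit message can be sketched in user space as follows. This is a simplified illustration, not the patch: fake_vp_index is a hypothetical lookup table, FAKE_VP_INVAL assumes the invalid marker is the all-ones 32-bit value, and the loop folds the "VP number >= 64" decision and the VP_INVAL check into one pass.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FAKE_VP_INVAL UINT32_MAX	/* assumed marker for "VP index unknown" */
#define FAKE_NR_CPUS 4

/* Hypothetical CPU -> VP index table; CPU 3 has not been filled in yet. */
static const uint32_t fake_vp_index[FAKE_NR_CPUS] = { 0, 1, 70, FAKE_VP_INVAL };

enum mask_result {
	MASK_FAIL = -1,		/* unknown VP index: caller must fall back */
	MASK_OK = 0,		/* 64-bit mask is valid */
	MASK_NEED_EX = 1	/* a VP index >= 64: extended call needed */
};

/* Build a 64-bit VP mask from a list of CPU numbers. */
static enum mask_result build_vp_mask(const int *cpus, size_t n, uint64_t *mask)
{
	*mask = 0;
	for (size_t i = 0; i < n; i++) {
		uint32_t vp;

		assert(cpus[i] >= 0 && cpus[i] < FAKE_NR_CPUS);
		vp = fake_vp_index[cpus[i]];

		/* Must come first: the invalid marker is also >= 64. */
		if (vp == FAKE_VP_INVAL)
			return MASK_FAIL;
		if (vp >= 64)
			return MASK_NEED_EX;
		*mask |= 1ULL << vp;
	}
	return MASK_OK;
}
```

The ordering matters in this sketch: without the explicit VP_INVAL test, an unknown index would be treated as "needs the extended call" instead of being reported as a failure.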
RE: [PATCH 1/2] x86/hyper-v: check cpumask_to_vpset() return value in hyperv_flush_tlb_others_ex()
From: Vitaly Kuznetsov Monday, July 9, 2018 10:40 AM
> Commit 1268ed0c474a ("x86/hyper-v: Fix the circular dependency in IPI
> enlightenment") made cpumask_to_vpset() return '-1' when there is a CPU
> with unknown VP index in the supplied set. This needs to be checked before
> we pass 'nr_bank' to hypercall.
>
> Fixes: 1268ed0c474a ("x86/hyper-v: Fix the circular dependency in IPI
> enlightenment")
> Signed-off-by: Vitaly Kuznetsov
> ---
>  arch/x86/hyperv/mmu.c | 2 ++
>  1 file changed, 2 insertions(+)

Reviewed-by: Michael Kelley