Re: [Android-virt] [PATCH v5 11/13] ARM: KVM: Support SMP hosts
On Mon, Dec 19, 2011 at 1:15 AM, Antonios Motakis a.mota...@virtualopensystems.com wrote: On 12/11/2011 11:25 AM, Christoffer Dall wrote: WARNING: This code is in development and guests do not fully boot on SMP hosts yet. Hello, What would still be needed to fully booted SMP? For example, are there identified critical sections and structures that need to be worked on, or there are parts that still need to be reviewed to find those? Or is it only a matter of fixing up any existing locking/syncing introduced in this patch? You should simply start booting a UP guest on an SMP host, see where it crashes and start tracking it down. Same procedure for guest SMP. I'd like to throw some cycles on this, so I'll start by looking in this patch again more carefully (and guest SMP as well). that sounds good, just go for it. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Android-virt] [PATCH v5 11/13] ARM: KVM: Support SMP hosts
On 19/12/11 14:57, Christoffer Dall wrote: On Mon, Dec 19, 2011 at 1:15 AM, Antonios Motakis a.mota...@virtualopensystems.com wrote: On 12/11/2011 11:25 AM, Christoffer Dall wrote: WARNING: This code is in development and guests do not fully boot on SMP hosts yet. Hello, What would still be needed to fully booted SMP? For example, are there identified critical sections and structures that need to be worked on, or there are parts that still need to be reviewed to find those? Or is it only a matter of fixing up any existing locking/syncing introduced in this patch? You should simply start booting a UP guest on an SMP host, see where it crashes and start tracking it down. For the time being, I've yet to see UP guest crashing on SMP host. On the model, that is... Same procedure for guest SMP. That's a very different kettle of fish. I see both CPUs starting to run, and end up with both in WFI after a while, without any interrupt pending... I'll investigate that as soon as I can. M. -- Jazz is not dead. It just smells funny... -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Android-virt] [PATCH v5 11/13] ARM: KVM: Support SMP hosts
On 12/19/2011 04:19 PM, Marc Zyngier wrote: On 19/12/11 14:57, Christoffer Dall wrote: You should simply start booting a UP guest on an SMP host, see where it crashes and start tracking it down. For the time being, I've yet to see UP guest crashing on SMP host. On the model, that is... Last time I tried to run a guest in a 2 core model, it hanged and crashed after a while. Anyway, I will investigate. So I gather critical sections have been dealt with and it's just a matter of ironing bugs right now? -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Android-virt] [PATCH v5 11/13] ARM: KVM: Support SMP hosts
On 19/12/11 15:30, Antonios Motakis wrote: On 12/19/2011 04:19 PM, Marc Zyngier wrote: On 19/12/11 14:57, Christoffer Dall wrote: You should simply start booting a UP guest on an SMP host, see where it crashes and start tracking it down. For the time being, I've yet to see UP guest crashing on SMP host. On the model, that is... Last time I tried to run a guest in a 2 core model, it hanged and crashed after a while. Anyway, I will investigate. So I gather critical sections have been dealt with and it's just a matter of ironing bugs right now? Depends when you tested. V4 was definitely unusable on SMP. With my patches merged in Christoffer's v5 release, the basics should be covered (correct SMP setup in the boot-wrapper and KVM, MPIDR virtualization). If you find any problem (and let's face it, I'm sure you will... ;-), I'll be happy to investigate. Cheers, M. -- Jazz is not dead. It just smells funny... -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Android-virt] [PATCH v5 11/13] ARM: KVM: Support SMP hosts
On Mon, Dec 19, 2011 at 10:37 AM, Marc Zyngier marc.zyng...@arm.com wrote: On 19/12/11 15:30, Antonios Motakis wrote: On 12/19/2011 04:19 PM, Marc Zyngier wrote: On 19/12/11 14:57, Christoffer Dall wrote: You should simply start booting a UP guest on an SMP host, see where it crashes and start tracking it down. For the time being, I've yet to see UP guest crashing on SMP host. On the model, that is... Last time I tried to run a guest in a 2 core model, it hanged and crashed after a while. Anyway, I will investigate. So I gather critical sections have been dealt with and it's just a matter of ironing bugs right now? Depends when you tested. V4 was definitely unusable on SMP. With my patches merged in Christoffer's v5 release, the basics should be covered (correct SMP setup in the boot-wrapper and KVM, MPIDR virtualization). If you find any problem (and let's face it, I'm sure you will... ;-), I'll be happy to investigate. I had the guest boot process crash on a page fault on the v5 series. Consistently. I just didn't have time to investigate yet. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Android-virt] [PATCH v5 11/13] ARM: KVM: Support SMP hosts
On 12/19/2011 04:40 PM, Christoffer Dall wrote: On Mon, Dec 19, 2011 at 10:37 AM, Marc Zyngiermarc.zyng...@arm.com wrote: On 19/12/11 15:30, Antonios Motakis wrote: On 12/19/2011 04:19 PM, Marc Zyngier wrote: On 19/12/11 14:57, Christoffer Dall wrote: You should simply start booting a UP guest on an SMP host, see where it crashes and start tracking it down. For the time being, I've yet to see UP guest crashing on SMP host. On the model, that is... Last time I tried to run a guest in a 2 core model, it hanged and crashed after a while. Anyway, I will investigate. So I gather critical sections have been dealt with and it's just a matter of ironing bugs right now? Depends when you tested. V4 was definitely unusable on SMP. With my patches merged in Christoffer's v5 release, the basics should be covered (correct SMP setup in the boot-wrapper and KVM, MPIDR virtualization). If you find any problem (and let's face it, I'm sure you will... ;-), I'll be happy to investigate. I had the guest boot process crash on a page fault on the v5 series. Consistently. I just didn't have time to investigate yet. I was also running v5, however I didn't record the exact crash behavior, since I assumed I was having the same results as everyone. I will look into it now. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Android-virt] [PATCH v5 11/13] ARM: KVM: Support SMP hosts
On 19/12/11 15:42, Antonios Motakis wrote: On 12/19/2011 04:40 PM, Christoffer Dall wrote: On Mon, Dec 19, 2011 at 10:37 AM, Marc Zyngiermarc.zyng...@arm.com wrote: On 19/12/11 15:30, Antonios Motakis wrote: On 12/19/2011 04:19 PM, Marc Zyngier wrote: On 19/12/11 14:57, Christoffer Dall wrote: You should simply start booting a UP guest on an SMP host, see where it crashes and start tracking it down. For the time being, I've yet to see UP guest crashing on SMP host. On the model, that is... Last time I tried to run a guest in a 2 core model, it hanged and crashed after a while. Anyway, I will investigate. So I gather critical sections have been dealt with and it's just a matter of ironing bugs right now? Depends when you tested. V4 was definitely unusable on SMP. With my patches merged in Christoffer's v5 release, the basics should be covered (correct SMP setup in the boot-wrapper and KVM, MPIDR virtualization). If you find any problem (and let's face it, I'm sure you will... ;-), I'll be happy to investigate. I had the guest boot process crash on a page fault on the v5 series. Consistently. I just didn't have time to investigate yet. I was also running v5, however I didn't record the exact crash behavior, since I assumed I was having the same results as everyone. I will look into it now. If you manage to find a way to reliably reproduce it, I really want help having it fixed ASAP. Kernel config and co much appreciated. M. -- Jazz is not dead. It just smells funny... -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Android-virt] [PATCH v5 11/13] ARM: KVM: Support SMP hosts
2011/12/19 Christoffer Dall cd...@cs.columbia.edu: ok, just reproduced. Using the AEM v7 model, 2 cores, attached guest and host config. Host running on the v5 of the KVM/ARM branch, found here: https://github.com/virtualopensystems/linux-kvm-arm Guest kernel revision: f6b252b6b92671d2633008408c06d35c26e55ecf Guest file system is a Linaro min-debug cramfs. You should probably also say which revision of qemu you were running... -- PMM -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Android-virt] [PATCH v5 11/13] ARM: KVM: Support SMP hosts
On Dec 19, 2011, at 12:19 PM, Peter Maydell peter.mayd...@linaro.org wrote: 2011/12/19 Christoffer Dall cd...@cs.columbia.edu: ok, just reproduced. Using the AEM v7 model, 2 cores, attached guest and host config. Host running on the v5 of the KVM/ARM branch, found here: https://github.com/virtualopensystems/linux-kvm-arm Guest kernel revision: f6b252b6b92671d2633008408c06d35c26e55ecf Guest file system is a Linaro min-debug cramfs. You should probably also say which revision of qemu you were running... I don't think it would make a difference, but it's the upstream linaro branch that you maintain. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Android-virt] [PATCH v5 11/13] ARM: KVM: Support SMP hosts
On 19 December 2011 17:24, Christoffer Dall christofferd...@christofferdall.dk wrote: On Dec 19, 2011, at 12:19 PM, Peter Maydell peter.mayd...@linaro.org wrote: 2011/12/19 Christoffer Dall cd...@cs.columbia.edu: ok, just reproduced. Using the AEM v7 model, 2 cores, attached guest and host config. Host running on the v5 of the KVM/ARM branch, found here: https://github.com/virtualopensystems/linux-kvm-arm Guest kernel revision: f6b252b6b92671d2633008408c06d35c26e55ecf Guest file system is a Linaro min-debug cramfs. You should probably also say which revision of qemu you were running... I don't think it would make a difference, but it's the upstream linaro branch that you maintain. Well, in particular, anything predating qemu-linaro commit bd40be95d1 won't do the right thing for interrupt delivery to SMP guests. -- PMM -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Android-virt] [PATCH v5 11/13] ARM: KVM: Support SMP hosts
the problem so far is SMP hosts, but I used this revision: c3ea1e4ac8b2ff0f13d86289562a477f0ae3fd6b On Mon, Dec 19, 2011 at 12:36 PM, Peter Maydell peter.mayd...@linaro.org wrote: On 19 December 2011 17:24, Christoffer Dall christofferd...@christofferdall.dk wrote: On Dec 19, 2011, at 12:19 PM, Peter Maydell peter.mayd...@linaro.org wrote: 2011/12/19 Christoffer Dall cd...@cs.columbia.edu: ok, just reproduced. Using the AEM v7 model, 2 cores, attached guest and host config. Host running on the v5 of the KVM/ARM branch, found here: https://github.com/virtualopensystems/linux-kvm-arm Guest kernel revision: f6b252b6b92671d2633008408c06d35c26e55ecf Guest file system is a Linaro min-debug cramfs. You should probably also say which revision of qemu you were running... I don't think it would make a difference, but it's the upstream linaro branch that you maintain. Well, in particular, anything predating qemu-linaro commit bd40be95d1 won't do the right thing for interrupt delivery to SMP guests. -- PMM -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v5 11/13] ARM: KVM: Support SMP hosts
On 12/11/2011 11:25 AM, Christoffer Dall wrote: WARNING: This code is in development and guests do not fully boot on SMP hosts yet. Hello, What would still be needed to fully booted SMP? For example, are there identified critical sections and structures that need to be worked on, or there are parts that still need to be reviewed to find those? Or is it only a matter of fixing up any existing locking/syncing introduced in this patch? I'd like to throw some cycles on this, so I'll start by looking in this patch again more carefully (and guest SMP as well). Antonios -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Android-virt] [PATCH v5 11/13] ARM: KVM: Support SMP hosts
On Tue, Dec 13, 2011 at 4:37 AM, Avi Kivity a...@redhat.com wrote: On 12/12/2011 09:36 PM, Christoffer Dall wrote: something like this: if (irq_level-level) { set_bit(vcpu-arch.irq_lines, bit_nr); smp_mb(); wake_up_interruptible(vcpu-wq); or, smp_send_reschedule(). See kvm_vcpu_kick(). An optimization: do a cmpxchg() and don't wakeup if the operation raised IRQ file FIQ was set (assuming that FIQ has a higher priority than IRQ). forgive my lack of experience here: how would you use cmpxchg exactly? cmpxchg is used as a fallback for general test-and-change which doesn't fit any other atomic operation. In this case: do { virt_intr = ACCESS_ONCE(vcpu-virt_intr) new_virt_intr = virt_intr | mask if (new_virt_intr == virt_intr) return; /* no change */ } while (!cmpxchg(vcpu-virt_intr, virt_intr, new_virt_intr) /* at this point, we know the old and new values, and we know the update was atomic */ heh, nice. thanks. if (new_virt_intr == IRQ | FIQ virt_intr == FIQ) { /* IRQ raised, FIQ already set */ return; } hmm, so what you want to avoid here is sending an IPI to the other CPU in case we already have FIQ raised? But I think we have to do this anyhow. If a guest is servicing a raised FIQ and have FIQs masked, but the GIC hasn't lowered the FIQ line yet, and now comes an IRQ, if the IRQ is unmasked we want to change the hypervisor virtual IRQ register right away as to signal an IRQ immediately and if the guest masks IRQ we still want to change the hypervisor virtual register so that the moment the guest unmasks the IRQ, an exception is raised. The only way to set the hypervisor register for another CPU would be to have it take a world-switch round. So, I think if we simply change either of these lines, we need to signal the other CPU. Marc, can you confirm? yes, FIQ takes priority over IRQ, but we need to raise the IRQ flag anyway. I can see that we don't have to call the wake_up function unless the old value was 0 (no interrupts pending == raising the first one). Is this what you mean? If so, I can't seem to wrap my head around how to accomplish this elegantly with cmpxchg...? Then, another issue. I am going to need to two fields and then a memcpy, since I don't want to be reading an atomic value from my world-switch code. (I would need this copy even with a spinlock). Agree? Don't follow. Why a memcpy? why don't you want to read an atomic value during world switch? if it was declared as an atomic_t, since that's theoretically opaque, could change to 64-bit and so on, I thought it would be ugly to read that in the world-switch assembler code. But using atomic bitops on a standard long field, we should be fine. But, if I am doing atomic bitops on an unsigned long field, how do I read that (or test two bits at once) atomically? Reads (and writes) are always atomic. Its read-modify-write that needs special treatment. embarrassing. I actually thought read/writes of a word could be partial to the byte between SMP nodes, but it turns out, and as you say, they cannot. Got it. Thanks again! -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Android-virt] [PATCH v5 11/13] ARM: KVM: Support SMP hosts
On 12/13/2011 03:36 PM, Christoffer Dall wrote: if (new_virt_intr == IRQ | FIQ virt_intr == FIQ) { /* IRQ raised, FIQ already set */ return; } hmm, so what you want to avoid here is sending an IPI to the other CPU in case we already have FIQ raised? But I think we have to do this anyhow. If a guest is servicing a raised FIQ and have FIQs masked, but the GIC hasn't lowered the FIQ line yet, and now comes an IRQ, if the IRQ is unmasked we want to change the hypervisor virtual IRQ register right away as to signal an IRQ immediately and if the guest masks IRQ we still want to change the hypervisor virtual register so that the moment the guest unmasks the IRQ, an exception is raised. The only way to set the hypervisor register for another CPU would be to have it take a world-switch round. Ah, if the register is virtualized, indeed you need to signal immediately. But, if I am doing atomic bitops on an unsigned long field, how do I read that (or test two bits at once) atomically? Reads (and writes) are always atomic. Its read-modify-write that needs special treatment. embarrassing. I actually thought read/writes of a word could be partial to the byte between SMP nodes, but it turns out, and as you say, they cannot. Got it. Read Documentation/atomic_ops.txt and Documentation/memory-barriers.txt, it's a tricky subject. -- error compiling committee.c: too many arguments to function -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Android-virt] [PATCH v5 11/13] ARM: KVM: Support SMP hosts
On 13/12/11 13:36, Christoffer Dall wrote: On Tue, Dec 13, 2011 at 4:37 AM, Avi Kivity a...@redhat.com wrote: if (new_virt_intr == IRQ | FIQ virt_intr == FIQ) { /* IRQ raised, FIQ already set */ return; } hmm, so what you want to avoid here is sending an IPI to the other CPU in case we already have FIQ raised? But I think we have to do this anyhow. If a guest is servicing a raised FIQ and have FIQs masked, but the GIC hasn't lowered the FIQ line yet, and now comes an IRQ, if the IRQ is unmasked we want to change the hypervisor virtual IRQ register right away as to signal an IRQ immediately and if the guest masks IRQ we still want to change the hypervisor virtual register so that the moment the guest unmasks the IRQ, an exception is raised. The only way to set the hypervisor register for another CPU would be to have it take a world-switch round. So, I think if we simply change either of these lines, we need to signal the other CPU. Marc, can you confirm? We definitely need to signal the CPU an IRQ is pending. The VGIC may help us here, as we can access another CPU's VGIC List Registers (via the processor-specific base address of the Virtual interface control). That would save sending an IPI to the target CPU. M. -- Jazz is not dead. It just smells funny... -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Android-virt] [PATCH v5 11/13] ARM: KVM: Support SMP hosts
On Dec 13, 2011, at 9:17 AM, Avi Kivity a...@redhat.com wrote: On 12/13/2011 03:36 PM, Christoffer Dall wrote: if (new_virt_intr == IRQ | FIQ virt_intr == FIQ) { /* IRQ raised, FIQ already set */ return; } hmm, so what you want to avoid here is sending an IPI to the other CPU in case we already have FIQ raised? But I think we have to do this anyhow. If a guest is servicing a raised FIQ and have FIQs masked, but the GIC hasn't lowered the FIQ line yet, and now comes an IRQ, if the IRQ is unmasked we want to change the hypervisor virtual IRQ register right away as to signal an IRQ immediately and if the guest masks IRQ we still want to change the hypervisor virtual register so that the moment the guest unmasks the IRQ, an exception is raised. The only way to set the hypervisor register for another CPU would be to have it take a world-switch round. Ah, if the register is virtualized, indeed you need to signal immediately. But, if I am doing atomic bitops on an unsigned long field, how do I read that (or test two bits at once) atomically? Reads (and writes) are always atomic. Its read-modify-write that needs special treatment. embarrassing. I actually thought read/writes of a word could be partial to the byte between SMP nodes, but it turns out, and as you say, they cannot. Got it. Read Documentation/atomic_ops.txt and Documentation/memory-barriers.txt, it's a tricky subject. Actually I did (yes the memory barriers hurt my brain), but I kind of missed the most basic level of memory atomicity - there was a nice section about this in the ARM arm though. Thanks for helping out on all this! -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Android-virt] [PATCH v5 11/13] ARM: KVM: Support SMP hosts
On Mon, Dec 12, 2011 at 12:56 PM, Avi Kivity a...@redhat.com wrote: On 12/12/2011 07:37 PM, Christoffer Dall wrote: This looks overly complicated with two levels of locks. x86 gets by with no locks, and a much more complicated interrupt architecture. My recommendation is: wait_for_interrupts is managed solely by the vcpu thread KVM_IRQ_LINE does a set_bit(, virt_irq) for the appropriate irq type, then IPI/wakeups the vcpu to make it examine both wait_for_interrupts and virt_irq. this sounds pretty good to me. something like this: if (irq_level-level) { set_bit(vcpu-arch.irq_lines, bit_nr); smp_mb(); wake_up_interruptible(vcpu-wq); or, smp_send_reschedule(). See kvm_vcpu_kick(). An optimization: do a cmpxchg() and don't wakeup if the operation raised IRQ file FIQ was set (assuming that FIQ has a higher priority than IRQ). } else clear_bit(vcpu-arch.irq_lines, bit_nr); and the vcpu thread would clear the wait_for_interrupts flag if it ever sees the mask field be non-zero? Yes. This is what x86 does, except it's a lot more complicated. @@ -522,47 +573,42 @@ static int init_hyp_mode(void) return -ENOMEM; /* - * Allocate stack page for Hypervisor-mode + * Allocate stack pages for Hypervisor-mode */ - kvm_arm_hyp_stack_page = (void *)__get_free_page(GFP_KERNEL); - if (!kvm_arm_hyp_stack_page) { - err = -ENOMEM; - goto out_free_pgd; - } + for_each_possible_cpu(cpu) { + void *stack_page; - hyp_stack_ptr = (unsigned long)kvm_arm_hyp_stack_page + PAGE_SIZE; + stack_page = (void *)__get_free_page(GFP_KERNEL); Best to allocate this (and other per-cpu state) on the cpu's node. why, for performance reasons? Yes, I'm assuming that all multi-socket A15s will be numa? I have no idea. Marc, Peter, Catalin? The code get slightly more complicated, since we have to pass the return value through the argument so we have to pass an opaque pointer to the struct or something like that. Don't see why, just use alloc_pages_node(). got it, I thought you wanted to issue the actual allocation from each cpu as to parallelize the work. Now I understand. + for_each_online_cpu(cpu) { + smp_call_function_single(cpu, cpu_init_hyp_mode, + (void *)(long)init_phys_addr, 1); + } Need similar code for cpu hotplug. See kvm_cpu_hotplug() and kvm_arch_hardware_enable() which do all this for you. so just to be sure, this will only be called for cpus that are hotplugged right? we still call the cpu_init_hyp_mode for each cpu that's online at this point. The infrastructure will call kvm_arch_hardware_enable() for all currently online cpus and any future hotplugged cpu. Just follow kvm_init() and fill in the arch callbacks. You do have a call to kvm_init() somewhere, yes? Do we need some locking to make sure the two don't overlap (like should I grab the kvm_lock here)? Let kvm_init() do the driving and relax. This sounds good (actually looking into this was on my todo list, so now is a good time I guess). -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v5 11/13] ARM: KVM: Support SMP hosts
On 12/11/2011 12:25 PM, Christoffer Dall wrote: In order to support KVM on a SMP host, it is necessary to initialize the hypervisor on all CPUs, mostly by making sure each CPU gets its own hypervisor stack and runs the HYP init code. We also take care of some missing locking of modifications to the hypervisor page tables and ensure synchronized consistency between virtual IRQ masks and wait_for_interrupt flags on the VPUs. Note that this code doesn't handle CPU hotplug yet. Note that this code doesn't support SMP guests. WARNING: This code is in development and guests do not fully boot on SMP hosts yet. Damn, I just reviewed all that breakage. /* Misc. fields */ + spinlock_t irq_lock; + u32 virt_irq; /* HCR exception mask */ u32 wait_for_interrupts; Better to use atomics, IMO. @@ -464,13 +466,27 @@ static int kvm_arch_vm_ioctl_irq_line(struct kvm *kvm, trace_kvm_irq_line(irq_level-irq % 2, irq_level-level, vcpu_idx); + spin_lock(vcpu-arch.irq_lock); if (irq_level-level) { vcpu-arch.virt_irq |= mask; + + /* + * Note that we grab the wq.lock before clearing the wfi flag + * since this ensures that a concurrent call to kvm_vcpu_block + * will either sleep before we grab the lock, in which case we + * wake it up, or will never sleep due to + * kvm_arch_vcpu_runnable being true (iow. this avoids having + * to grab the irq_lock in kvm_arch_vcpu_runnable). + */ + spin_lock(vcpu-wq.lock); vcpu-arch.wait_for_interrupts = 0; + if (waitqueue_active(vcpu-wq)) - wake_up_interruptible(vcpu-wq); + __wake_up_locked(vcpu-wq, TASK_INTERRUPTIBLE); + spin_unlock(vcpu-wq.lock); } else vcpu-arch.virt_irq = ~mask; + spin_unlock(vcpu-arch.irq_lock); This looks overly complicated with two levels of locks. x86 gets by with no locks, and a much more complicated interrupt architecture. My recommendation is: wait_for_interrupts is managed solely by the vcpu thread KVM_IRQ_LINE does a set_bit(, virt_irq) for the appropriate irq type, then IPI/wakeups the vcpu to make it examine both wait_for_interrupts and virt_irq. + +static void cpu_init_hyp_mode(void *vector) +{ + unsigned long hyp_stack_ptr; + void *stack_page; + + stack_page = __get_cpu_var(kvm_arm_hyp_stack_page); + hyp_stack_ptr = (unsigned long)stack_page + PAGE_SIZE; + + cpu_set_vector(vector); + + /* + * Call initialization code + */ + asm volatile ( + movr0, %[pgd_ptr]\n\t + movr1, %[stack_ptr]\n\t + hvc#0\n\t : : + [pgd_ptr] r (virt_to_phys(kvm_hyp_pgd)), + [stack_ptr] r (hyp_stack_ptr) : + r0, r1); +} (slightly nicer is to allocate hyp_stack_ptr and pgd_ptr to register asm(r0) and register asm(r1) to avoid the extra mov instruction) @@ -522,47 +573,42 @@ static int init_hyp_mode(void) return -ENOMEM; /* - * Allocate stack page for Hypervisor-mode + * Allocate stack pages for Hypervisor-mode */ - kvm_arm_hyp_stack_page = (void *)__get_free_page(GFP_KERNEL); - if (!kvm_arm_hyp_stack_page) { - err = -ENOMEM; - goto out_free_pgd; - } + for_each_possible_cpu(cpu) { + void *stack_page; - hyp_stack_ptr = (unsigned long)kvm_arm_hyp_stack_page + PAGE_SIZE; + stack_page = (void *)__get_free_page(GFP_KERNEL); Best to allocate this (and other per-cpu state) on the cpu's node. + if (!stack_page) { + err = -ENOMEM; + goto out_free_pgd; + } + + per_cpu(kvm_arm_hyp_stack_page, cpu) = stack_page; + } init_phys_addr = virt_to_phys(__kvm_hyp_init); init_end_phys_addr = virt_to_phys(__kvm_hyp_init_end); + BUG_ON(init_phys_addr 0x1f); /* - * Create identity mapping + * Create identity mapping for the init code. */ hyp_identity_mapping_add(kvm_hyp_pgd, (unsigned long)init_phys_addr, (unsigned long)init_end_phys_addr); + for_each_online_cpu(cpu) { + smp_call_function_single(cpu, cpu_init_hyp_mode, + (void *)(long)init_phys_addr, 1); + } Need similar code for cpu hotplug. See kvm_cpu_hotplug() and kvm_arch_hardware_enable() which do all this for you. -- error compiling committee.c: too many arguments to function -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v5 11/13] ARM: KVM: Support SMP hosts
On Mon, Dec 12, 2011 at 9:30 AM, Avi Kivity a...@redhat.com wrote: On 12/11/2011 12:25 PM, Christoffer Dall wrote: In order to support KVM on a SMP host, it is necessary to initialize the hypervisor on all CPUs, mostly by making sure each CPU gets its own hypervisor stack and runs the HYP init code. We also take care of some missing locking of modifications to the hypervisor page tables and ensure synchronized consistency between virtual IRQ masks and wait_for_interrupt flags on the VPUs. Note that this code doesn't handle CPU hotplug yet. Note that this code doesn't support SMP guests. WARNING: This code is in development and guests do not fully boot on SMP hosts yet. Damn, I just reviewed all that breakage. so sorry..., /* Misc. fields */ + spinlock_t irq_lock; + u32 virt_irq; /* HCR exception mask */ u32 wait_for_interrupts; Better to use atomics, IMO. hmm, yeah, I guess the way to do it would be to have two fields - one atomic field used for interrupt injection, which is read atomically in the C-code into a plain u32 variable, which can then be copied directly onto the hardware during the world-switch... @@ -464,13 +466,27 @@ static int kvm_arch_vm_ioctl_irq_line(struct kvm *kvm, trace_kvm_irq_line(irq_level-irq % 2, irq_level-level, vcpu_idx); + spin_lock(vcpu-arch.irq_lock); if (irq_level-level) { vcpu-arch.virt_irq |= mask; + + /* + * Note that we grab the wq.lock before clearing the wfi flag + * since this ensures that a concurrent call to kvm_vcpu_block + * will either sleep before we grab the lock, in which case we + * wake it up, or will never sleep due to + * kvm_arch_vcpu_runnable being true (iow. this avoids having + * to grab the irq_lock in kvm_arch_vcpu_runnable). + */ + spin_lock(vcpu-wq.lock); vcpu-arch.wait_for_interrupts = 0; + if (waitqueue_active(vcpu-wq)) - wake_up_interruptible(vcpu-wq); + __wake_up_locked(vcpu-wq, TASK_INTERRUPTIBLE); + spin_unlock(vcpu-wq.lock); } else vcpu-arch.virt_irq = ~mask; + spin_unlock(vcpu-arch.irq_lock); This looks overly complicated with two levels of locks. x86 gets by with no locks, and a much more complicated interrupt architecture. My recommendation is: wait_for_interrupts is managed solely by the vcpu thread KVM_IRQ_LINE does a set_bit(, virt_irq) for the appropriate irq type, then IPI/wakeups the vcpu to make it examine both wait_for_interrupts and virt_irq. this sounds pretty good to me. something like this: if (irq_level-level) { set_bit(vcpu-arch.irq_lines, bit_nr); smp_mb(); wake_up_interruptible(vcpu-wq); } else clear_bit(vcpu-arch.irq_lines, bit_nr); and the vcpu thread would clear the wait_for_interrupts flag if it ever sees the mask field be non-zero? + +static void cpu_init_hyp_mode(void *vector) +{ + unsigned long hyp_stack_ptr; + void *stack_page; + + stack_page = __get_cpu_var(kvm_arm_hyp_stack_page); + hyp_stack_ptr = (unsigned long)stack_page + PAGE_SIZE; + + cpu_set_vector(vector); + + /* + * Call initialization code + */ + asm volatile ( + mov r0, %[pgd_ptr]\n\t + mov r1, %[stack_ptr]\n\t + hvc #0\n\t : : + [pgd_ptr] r (virt_to_phys(kvm_hyp_pgd)), + [stack_ptr] r (hyp_stack_ptr) : + r0, r1); +} (slightly nicer is to allocate hyp_stack_ptr and pgd_ptr to register asm(r0) and register asm(r1) to avoid the extra mov instruction) I agree @@ -522,47 +573,42 @@ static int init_hyp_mode(void) return -ENOMEM; /* - * Allocate stack page for Hypervisor-mode + * Allocate stack pages for Hypervisor-mode */ - kvm_arm_hyp_stack_page = (void *)__get_free_page(GFP_KERNEL); - if (!kvm_arm_hyp_stack_page) { - err = -ENOMEM; - goto out_free_pgd; - } + for_each_possible_cpu(cpu) { + void *stack_page; - hyp_stack_ptr = (unsigned long)kvm_arm_hyp_stack_page + PAGE_SIZE; + stack_page = (void *)__get_free_page(GFP_KERNEL); Best to allocate this (and other per-cpu state) on the cpu's node. why, for performance reasons? The code get slightly more complicated, since we have to pass the return value through the argument so we have to pass an opaque pointer to the struct or something like that. + if (!stack_page) { + err = -ENOMEM; + goto out_free_pgd; + } + + per_cpu(kvm_arm_hyp_stack_page, cpu) = stack_page; + } init_phys_addr = virt_to_phys(__kvm_hyp_init); init_end_phys_addr =
Re: [PATCH v5 11/13] ARM: KVM: Support SMP hosts
On 12/12/2011 07:37 PM, Christoffer Dall wrote: This looks overly complicated with two levels of locks. x86 gets by with no locks, and a much more complicated interrupt architecture. My recommendation is: wait_for_interrupts is managed solely by the vcpu thread KVM_IRQ_LINE does a set_bit(, virt_irq) for the appropriate irq type, then IPI/wakeups the vcpu to make it examine both wait_for_interrupts and virt_irq. this sounds pretty good to me. something like this: if (irq_level-level) { set_bit(vcpu-arch.irq_lines, bit_nr); smp_mb(); wake_up_interruptible(vcpu-wq); or, smp_send_reschedule(). See kvm_vcpu_kick(). An optimization: do a cmpxchg() and don't wakeup if the operation raised IRQ file FIQ was set (assuming that FIQ has a higher priority than IRQ). } else clear_bit(vcpu-arch.irq_lines, bit_nr); and the vcpu thread would clear the wait_for_interrupts flag if it ever sees the mask field be non-zero? Yes. This is what x86 does, except it's a lot more complicated. @@ -522,47 +573,42 @@ static int init_hyp_mode(void) return -ENOMEM; /* - * Allocate stack page for Hypervisor-mode + * Allocate stack pages for Hypervisor-mode */ - kvm_arm_hyp_stack_page = (void *)__get_free_page(GFP_KERNEL); - if (!kvm_arm_hyp_stack_page) { - err = -ENOMEM; - goto out_free_pgd; - } + for_each_possible_cpu(cpu) { + void *stack_page; - hyp_stack_ptr = (unsigned long)kvm_arm_hyp_stack_page + PAGE_SIZE; + stack_page = (void *)__get_free_page(GFP_KERNEL); Best to allocate this (and other per-cpu state) on the cpu's node. why, for performance reasons? Yes, I'm assuming that all multi-socket A15s will be numa? The code get slightly more complicated, since we have to pass the return value through the argument so we have to pass an opaque pointer to the struct or something like that. Don't see why, just use alloc_pages_node(). + for_each_online_cpu(cpu) { + smp_call_function_single(cpu, cpu_init_hyp_mode, + (void *)(long)init_phys_addr, 1); + } Need similar code for cpu hotplug. See kvm_cpu_hotplug() and kvm_arch_hardware_enable() which do all this for you. so just to be sure, this will only be called for cpus that are hotplugged right? we still call the cpu_init_hyp_mode for each cpu that's online at this point. The infrastructure will call kvm_arch_hardware_enable() for all currently online cpus and any future hotplugged cpu. Just follow kvm_init() and fill in the arch callbacks. You do have a call to kvm_init() somewhere, yes? Do we need some locking to make sure the two don't overlap (like should I grab the kvm_lock here)? Let kvm_init() do the driving and relax. -- error compiling committee.c: too many arguments to function -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v5 11/13] ARM: KVM: Support SMP hosts
In order to support KVM on a SMP host, it is necessary to initialize the hypervisor on all CPUs, mostly by making sure each CPU gets its own hypervisor stack and runs the HYP init code. We also take care of some missing locking of modifications to the hypervisor page tables and ensure synchronized consistency between virtual IRQ masks and wait_for_interrupt flags on the VPUs. Note that this code doesn't handle CPU hotplug yet. Note that this code doesn't support SMP guests. WARNING: This code is in development and guests do not fully boot on SMP hosts yet. Signed-off-by: Marc Zyngier marc.zyng...@arm.com Signed-off-by: Christoffer Dall c.d...@virtualopensystems.com --- arch/arm/include/asm/kvm_host.h |4 - arch/arm/include/asm/kvm_mmu.h |1 arch/arm/kvm/arm.c | 175 +++ arch/arm/kvm/emulate.c |2 arch/arm/kvm/mmu.c |9 ++ 5 files changed, 114 insertions(+), 77 deletions(-) diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h index 86f6cf1..a0ffbe8 100644 --- a/arch/arm/include/asm/kvm_host.h +++ b/arch/arm/include/asm/kvm_host.h @@ -78,8 +78,6 @@ struct kvm_vcpu_arch { u32 c13_TID_PRIV; /* Thread ID, Priveleged */ } cp15; - u32 virt_irq; /* HCR exception mask */ - /* Exception Information */ u32 hsr;/* Hyp Syndrom Register */ u32 hdfar; /* Hyp Data Fault Address Register */ @@ -92,6 +90,8 @@ struct kvm_vcpu_arch { u32 mmio_rd; /* Misc. fields */ + spinlock_t irq_lock; + u32 virt_irq; /* HCR exception mask */ u32 wait_for_interrupts; }; diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h index e82eae9..917edd7 100644 --- a/arch/arm/include/asm/kvm_mmu.h +++ b/arch/arm/include/asm/kvm_mmu.h @@ -28,6 +28,7 @@ #define PGD2_ORDER get_order(PTRS_PER_PGD2 * sizeof(pgd_t)) extern pgd_t *kvm_hyp_pgd; +extern struct mutex kvm_hyp_pgd_mutex; int create_hyp_mappings(pgd_t *hyp_pgd, void *from, void *to); void free_hyp_pmds(pgd_t *hyp_pgd); diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c index 00215a1..6e384e2 100644 --- a/arch/arm/kvm/arm.c +++ b/arch/arm/kvm/arm.c @@ -61,7 +61,7 @@ void __kvm_print_msg(char *fmt, ...) spin_unlock(__tmp_log_lock); } -static void *kvm_arm_hyp_stack_page; +static DEFINE_PER_CPU(void *, kvm_arm_hyp_stack_page); /* The VMID used in the VTTBR */ #define VMID_SIZE (18) @@ -257,6 +257,8 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu) unsigned long cpsr; unsigned long sctlr; + spin_lock_init(vcpu-arch.irq_lock); + /* Init execution CPSR */ asm volatile (mrs %[cpsr], cpsr : [cpsr] =r (cpsr)); @@ -464,13 +466,27 @@ static int kvm_arch_vm_ioctl_irq_line(struct kvm *kvm, trace_kvm_irq_line(irq_level-irq % 2, irq_level-level, vcpu_idx); + spin_lock(vcpu-arch.irq_lock); if (irq_level-level) { vcpu-arch.virt_irq |= mask; + + /* +* Note that we grab the wq.lock before clearing the wfi flag +* since this ensures that a concurrent call to kvm_vcpu_block +* will either sleep before we grab the lock, in which case we +* wake it up, or will never sleep due to +* kvm_arch_vcpu_runnable being true (iow. this avoids having +* to grab the irq_lock in kvm_arch_vcpu_runnable). +*/ + spin_lock(vcpu-wq.lock); vcpu-arch.wait_for_interrupts = 0; + if (waitqueue_active(vcpu-wq)) - wake_up_interruptible(vcpu-wq); + __wake_up_locked(vcpu-wq, TASK_INTERRUPTIBLE); + spin_unlock(vcpu-wq.lock); } else vcpu-arch.virt_irq = ~mask; + spin_unlock(vcpu-arch.irq_lock); return 0; } @@ -505,14 +521,49 @@ long kvm_arch_vm_ioctl(struct file *filp, } } +static void cpu_set_vector(void *vector) +{ + /* +* Set the HVBAR +*/ + asm volatile ( + movr0, %[vector_ptr]\n\t + ldrr7, =SMCHYP_HVBAR_W\n\t + smc#0\n\t : : + [vector_ptr] r (vector) : + r0, r7); +} + +static void cpu_init_hyp_mode(void *vector) +{ + unsigned long hyp_stack_ptr; + void *stack_page; + + stack_page = __get_cpu_var(kvm_arm_hyp_stack_page); + hyp_stack_ptr = (unsigned long)stack_page + PAGE_SIZE; + + cpu_set_vector(vector); + + /* +* Call initialization code +*/ + asm volatile ( + movr0, %[pgd_ptr]\n\t + movr1, %[stack_ptr]\n\t + hvc#0\n\t : : + [pgd_ptr] r (virt_to_phys(kvm_hyp_pgd)), + [stack_ptr] r