Re: Fix lapic time counter read for periodic mode
Hi, thanks for your reply. On Mon, Nov 12, 2012 at 07:32:37PM -0200, Marcelo Tosatti wrote: > > there is a bug in the emulation of the lapic time counter. In particular > > what we are seeing is that the time counter of a periodic lapic timer > > in the guest reads as zero 99% of the time. The patch below fixes that. > > > > The emulation of the lapic timer is done with the help of a hires > > timer that expires with the same frequency as the lapic counter. > > New expiration times for a periodic timer are calculated incrementally > > based on the last scheduled expiration time. This ensures long term > > accuracy of the emulated timer close to that of the underlying clock. > > > > The actual value of the lapic time counter is calculated from the > > real time difference between current time and scheduled expiration time > > of the hires timer. If this difference is negative, the hires timer > > expired. For oneshot mode this is correctly translated into a zero value > > for the time counter. However, in periodic mode we must use the negative > > difference unmodified. > > > > regards Christian > > > > Fix lapic time counter read for periodic mode. > > In periodic mode the hrtimer is rearmed once expired, see > apic_timer_fn. So _get_remaining should return proper value > even if the guest is not able to process timer interrupts. > > Can you describe your specific scenario in more detail? In my specific case, the host is admittedly somewhat special as it already is a rehosted version of linux, i.e. not running directly on native hardware. It is still unclear if the host has sufficiently accurate timer interrupts. This is most likely part of the problems we are seeing. However, AFAICS apic_timer_fn is only called once per jiffy (at least in some configurations). In particular, it is not called by hrtimer_get_remaining. Thus depending on the frequency of the LAPIC timer in the guest there might _several_ iterations that are missed. This can probably be mitigated by a hires timer interrupts. However, I think the problem is still there even in that case. Additionally, the behaviour that I want to establish matches that of the PIT timer (in a not completely obvious way, though). Having said that the proposed patch in my first mail is incomplete, as the mod_64 does not work correctly for negative values. A fixed version is below. regards Christian Signed-off-by: Christian Ehrhardt diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c index 43e9fad..ec7242c 100644 --- a/arch/x86/kvm/lapic.c +++ b/arch/x86/kvm/lapic.c @@ -810,11 +810,22 @@ static u32 apic_get_tmcct(struct kvm_lapic *apic) if (kvm_apic_get_reg(apic, APIC_TMICT) == 0) return 0; + /* +* hrtimer_get_remaining returns the signed difference between +* timer expiration time and current time. Keep negative return +* values iff the the timer is periodic. +*/ remaining = hrtimer_get_remaining(&apic->lapic_timer.timer); - if (ktime_to_ns(remaining) < 0) - remaining = ktime_set(0, 0); + ns = ktime_to_ns(remaining); + if (unlikely(ns < 0)) { + if (apic_lvtt_period(apic)) + ns = apic->lapic_timer.period - + mod_64(-ns, apic->lapic_timer.period); + else + ns = 0; + } - ns = mod_64(ktime_to_ns(remaining), apic->lapic_timer.period); + ns = mod_64(ns, apic->lapic_timer.period); tmcct = div64_u64(ns, (APIC_BUS_CYCLE_NS * apic->divide_count)); -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Fix lapic time counter read for periodic mode
Hi, there is a bug in the emulation of the lapic time counter. In particular what we are seeing is that the time counter of a periodic lapic timer in the guest reads as zero 99% of the time. The patch below fixes that. The emulation of the lapic timer is done with the help of a hires timer that expires with the same frequency as the lapic counter. New expiration times for a periodic timer are calculated incrementally based on the last scheduled expiration time. This ensures long term accuracy of the emulated timer close to that of the underlying clock. The actual value of the lapic time counter is calculated from the real time difference between current time and scheduled expiration time of the hires timer. If this difference is negative, the hires timer expired. For oneshot mode this is correctly translated into a zero value for the time counter. However, in periodic mode we must use the negative difference unmodified. regards Christian Fix lapic time counter read for periodic mode. Signed-off-by: Christian Ehrhardt diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c index 43e9fad..eff902d 100644 --- a/arch/x86/kvm/lapic.c +++ b/arch/x86/kvm/lapic.c @@ -810,8 +810,13 @@ static u32 apic_get_tmcct(struct kvm_lapic *apic) if (kvm_apic_get_reg(apic, APIC_TMICT) == 0) return 0; + /* +* hrtimer_get_remaining returns the signed difference between +* timer expiration time and current time. Keep negative return +* value iff the the timer is periodic. +*/ remaining = hrtimer_get_remaining(&apic->lapic_timer.timer); - if (ktime_to_ns(remaining) < 0) + if (ktime_to_ns(remaining) < 0 && !apic_lvtt_period(apic)) remaining = ktime_set(0, 0); ns = mod_64(ktime_to_ns(remaining), apic->lapic_timer.period); -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/3] kvm-s390: streamline memslot handling - rebased
Avi Kivity wrote: Marcelo Tosatti wrote: (continued below) Anyway, yeah, the set request / wait mechanism you implement here is quite similar to the idea mentioned earlier that could be used for x86. Just get rid of this explicit KVM_REQ_MMU_RELOAD knowledge in arch-independent code please (if you want to see this merged). I agree to lift the wait part to other archs later if needed, but as mentioned above I could move this to arch code to the cost of one arch hook more. But as also mentioned it doesn't really hurt. I agree that it does not need to be KVM_REQ_MMU_RELOAD specific, we could just walk/clear/wake all bits on that vcpu->requests variable. Would that be generic enough in your opinion ? Don't know. Avi? I think I lost the thread here, but I'll try. Isn't the wake part make_all_vcpus_request() in kvm_main.c? The wait part could be moved to a similar generic function. I'll try to summarize my current thoughts a bit: The rebased patch series brings several fixes and the wait/wakeup mechanism which is in discussion here. As explained before this keeps the new wait implementation in s390 arch code which allows us to experiment with it. Later if we are happy with it we might (or not) continue the merge and bring this mechanism to make_all_vcpus_request (as on x86 you don't have the issues I try to fix here we don't need to hurry bringing that into generic code). Well now to the wait/wakeup which is here in discussion in detail: The s390 arch code can kick a guest, but we don't know implicitly (as x86 does) that the kick succeeded, it might happen somewhen sooner or later. Therefore the code uses wait_on_bit to wait until the vcpu->request bit is consumed. To ensure cleanup of these waiting threads in some special cases the clear&wake up is also needed at other places than the real bit consumption. One of them is the vcpu release code where we should clear&wakeup all waiters (Marcelo correctly pointed out that we should not be bit specific there, so I just just wake up all in the updated code). That was the discussion here: "if it would be ok to clear & wake up all". As wake_up_bit doesn't hurt if there is no waiter it looks like the best solution to to do that in the generic part of vcpu_release. If ever someone else waits for this or another bit in vcpu->requests, the code ensures all of them are awaken on vcpu release. I send an updated version of the rebased series in a few minutes, containing updates related to what marcelo pointed out. P.S. in case you think we need much more discussions we might try to catch up on irc to save this thread a few cycles :-) -- Grüsse / regards, Christian Ehrhardt IBM Linux Technology Center, Open Virtualization -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/3] kvm-s390: streamline memslot handling - rebased
Avi Kivity wrote: Christian Ehrhardt wrote: Really need that smp_mb__after_clear_bit ? AFAIK test_and_clear_bit implies a barrier? Well I agree that practically test_and_clear_bit has a barrier on s390, but as far as I read Documentation/atomic_ops.txt at line 339-360 I think the interface does not imply it so I wanted to add it explicitly. I would be happy if someone really knows the in depth details here and corrects me :-) IIUC rmw bitops are full memory barriers. The non-rmw (from the caller's perspective), clear_bit() and set_bit(), are not. Ok, as the real implementation has one + memory-barriers.txt describing it with barrier and finally include/asm-generic/bitops/atomic.h descirbes it that way too I think I can drop the explicit smb_wb from my patch in the next update (I wait a bit to give the discussion about the wati/bits a bit more time). Hmm ... would that be worth a clarifying patch to atomic_ops.txt that confused me in the first place ? -- Grüsse / regards, Christian Ehrhardt IBM Linux Technology Center, Open Virtualization -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/3] kvm-s390: streamline memslot handling - rebased
Marcelo Tosatti wrote: On Tue, Jun 02, 2009 at 04:26:11PM +0200, ehrha...@linux.vnet.ibm.com wrote: From: Christian Ehrhardt [...] @@ -706,13 +713,19 @@ int kvm_arch_set_memory_region(struct kv /* request update of sie control block for all available vcpus */ for (i = 0; i < KVM_MAX_VCPUS; ++i) { - if (kvm->vcpus[i]) { - if (test_and_set_bit(KVM_REQ_MMU_RELOAD, - &kvm->vcpus[i]->requests)) - continue; - kvm_s390_inject_sigp_stop(kvm->vcpus[i], - ACTION_VCPUREQUEST_ON_STOP); - } + vcpu = kvm->vcpus[i]; + if (!vcpu) + continue; + + if (!test_and_set_bit(KVM_REQ_MMU_RELOAD, &vcpu->requests)) + continue; + + if (vcpu->cpu == -1) + continue; What happens if the check for cpu == -1 races with kvm_arch_vcpu_put? This context will wait until the vcpu_put context is scheduled back in to clear the bit? Is that OK? It either comes back to clear the bit or it is consumed on deletion of the vcpu. Both ways are ok. The question we have to answer is if it might stall the mem update ioctl for too long. Because eventually the check for vcpu->cpu == -1 is just an optimization if we would completely ignore remove it we would have the same problem -> could it stall the set mem operation too much. That means the "race" is not an issue it might just be sub-optimal, but the chance for a long stall could become an issue. Unfortunately I have no better approach to that (yet), until then this I like this implementation more than what we would have without all the corner case fixes in that patch series. + + kvm_s390_inject_sigp_stop(vcpu, ACTION_VCPUREQUEST_ON_STOP); + wait_on_bit(&vcpu->requests, KVM_REQ_MMU_RELOAD, + wait_bit_schedule, TASK_UNINTERRUPTIBLE); } void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu) { + vcpu->cpu = -1; save_fp_regs(&vcpu->arch.guest_fpregs); [...] +++ kvm/arch/s390/kvm/kvm-s390.h @@ -92,6 +92,13 @@ static inline unsigned long kvm_s390_han if (!vcpu->requests) return 0; + /* requests that can be handled at all levels */ + if (test_and_clear_bit(KVM_REQ_MMU_RELOAD, &vcpu->requests)) { + smp_mb__after_clear_bit(); Really need that smp_mb__after_clear_bit ? AFAIK test_and_clear_bit implies a barrier? Well I agree that practically test_and_clear_bit has a barrier on s390, but as far as I read Documentation/atomic_ops.txt at line 339-360 I think the interface does not imply it so I wanted to add it explicitly. I would be happy if someone really knows the in depth details here and corrects me :-) + wake_up_bit(&vcpu->requests, KVM_REQ_MMU_RELOAD); + kvm_s390_vcpu_set_mem(vcpu); + } + return vcpu->requests; } Index: kvm/virt/kvm/kvm_main.c === --- kvm.orig/virt/kvm/kvm_main.c +++ kvm/virt/kvm/kvm_main.c @@ -1682,6 +1682,10 @@ static int kvm_vcpu_release(struct inode { struct kvm_vcpu *vcpu = filp->private_data; + clear_bit(KVM_REQ_MMU_RELOAD, &vcpu->requests); + smp_mb__after_clear_bit(); + wake_up_bit(&vcpu->requests, KVM_REQ_MMU_RELOAD); + And this should be generic? Say if other architectures want to make use of a similar wait infrastructure. Talk is cheap. Clear bit and wake up on release doesn't hurt any architecture, but it is at a good place fine for those using the mechanism to ensure cleaning up outstanding things when closing a vcpu fd. I thought its not worth to add kvm_ARCH_vcpu_release for it while I could do so if we want it separated. (continued below) Anyway, yeah, the set request / wait mechanism you implement here is quite similar to the idea mentioned earlier that could be used for x86. Just get rid of this explicit KVM_REQ_MMU_RELOAD knowledge in arch-independent code please (if you want to see this merged). I agree to lift the wait part to other archs later if needed, but as mentioned above I could move this to arch code to the cost of one arch hook more. But as also mentioned it doesn't really hurt. I agree that it does not need to be KVM_REQ_MMU_RELOAD specific, we could just walk/clear/wake all bits on that vcpu->requests variable. Would that be generic enough in your opinion ? Later it can all be lifted off to arch independent code. True for the wait part which can evolve in our arch code until it is ripe to get cross arch merged. -- Grüsse / regards, Christian Ehrhardt IBM Linux Technology Center, O
Re: [PATCH 4/4] kvm-s390: streamline memslot handling - v6
Marcelo Tosatti wrote: On Sun, May 31, 2009 at 11:22:58AM +0300, Avi Kivity wrote: ehrha...@linux.vnet.ibm.com wrote: From: Christian Ehrhardt *updates in v6* - ensure the wait_on_bit waiter is notified - move the reset of requests to kvm_vcpu_release to drop them early *updates in v5* - ensure dropping vcpu all requests while freeing a vcpu *updates in v4* - kickout only scheduled vcpus (its superfluous and wait might hang forever on not running vcpus) v3 is already in (and pushed so I can't unapply), so please rebase on top of current git. Christian, Seems a good step toward further unification. Can you please rebase as requested? -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html The missing updates are mostly fixes or code merges due to our discussions and should be on the list in a few minutes. -- Grüsse / regards, Christian Ehrhardt IBM Linux Technology Center, Open Virtualization -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/3] kvm-s390: infrastructure to kick vcpus out of guest state
Avi Kivity wrote: Christian Ehrhardt wrote: So you _need_ a mechanism to kick all vcpus out of guest mode? I have a mechanism to kick a vcpu, and I use it. Due to the fact that smp_call_* don't work as kick for us the kick is an arch specific function. I hop ethat clarified this part :-) You could still use make_all_vcpus_request(), just change smp_call_function_many() to your own kicker. Yes and I like this idea for further unification, but I don't want it mixed too much into the patches in discussion atm. Because on one hand I have some problems giving my arch specific kick a behaviour like "return when the guest WAS kicked" and on the other hand I would e.g. also need to streamline the check in make_all_vcpus_request which cpu is running etc because vcpu->cpu stays -1 all the time on s390 (never used). Therefore I would unify things step by step and this way allow single task to went off my task pile here :-) -- Grüsse / regards, Christian Ehrhardt IBM Linux Technology Center, Open Virtualization -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/3] kvm-s390: infrastructure to kick vcpus out of guest state
Marcelo Tosatti wrote: On Tue, May 26, 2009 at 10:02:59AM +0200, Christian Ehrhardt wrote: Marcelo Tosatti wrote: On Mon, May 25, 2009 at 01:40:49PM +0200, ehrha...@linux.vnet.ibm.com wrote: From: Christian Ehrhardt To ensure vcpu's come out of guest context in certain cases this patch adds a s390 specific way to kick them out of guest context. Currently it kicks them out to rerun the vcpu_run path in the s390 code, but the mechanism itself is expandable and with a new flag we could also add e.g. kicks to userspace etc. Signed-off-by: Christian Ehrhardt "For now I added the optimization to skip kicking vcpus out of guest that had the request bit already set to the s390 specific loop (sent as v2 in a few minutes). We might one day consider standardizing some generic kickout levels e.g. kick to "inner loop", "arch vcpu run", "generic vcpu run", "userspace", ... whatever levels fit *all* our use cases. And then let that kicks be implemented in an kvm_arch_* backend as it might be very different how they behave on different architectures." That would be ideal, yes. Two things make_all_requests handles: 1) It disables preemption with get_cpu(), so it can reliably check for cpu id. Somehow you don't need that for s390 when kicking multiple vcpus? I don't even need the cpuid as make_all_requests does, I just insert a special bit in the vcpu arch part and the vcpu will "come out to me (host)". Fortunateley the kick is rare and fast so I can just insert it unconditionally (it's even ok to insert it if the vcpu is not in guest state). That prevents us from needing vcpu lock or detailed checks which would end up where we started (no guarantee that vcpu's come out of guest context while trying to aquire all vcpu locks) Let me see if I get this right: you kick the vcpus out of guest mode by using a special bit in the vcpu arch part. OK. What I don't understand is this: "would end up where we started (no guarantee that vcpu's come out of guest context while trying to aquire all vcpu locks)" initially the mechanism looped over vcpu's and just aquired the vcpu lock and then updated the vcpu.arch infor directly. Avi mentioned that we have no guarantee if/when the vcpu will come out of guest context to free a lock currently held and suggested the mechanism x86 uses via setting vcpu->request and kicking the vcpu. Thats the eason behind "end up where we (the discussion) started", if we would need the vcpu lock again we would be at the beginnign of the discussion. So you _need_ a mechanism to kick all vcpus out of guest mode? I have a mechanism to kick a vcpu, and I use it. Due to the fact that smp_call_* don't work as kick for us the kick is an arch specific function. I hop ethat clarified this part :-) 2) It uses smp_call_function_many(wait=1), which guarantees that by the time make_all_requests returns no vcpus will be using stale data (the remote vcpus will have executed ack_flush). yes this is really a part my s390 implementation doesn't fulfill yet. Currently on return vcpus might still use the old memslot information. As mentioned before letting all interrupts come "too far" out of the hot loop would be a performance issue, therefore I think I will need some request&confirm mechanism. I'm not sure yet but maybe it could be as easy as this pseudo code example: # in make_all_requests # remember we have slots_lock write here and the reentry that updates the vcpu specific data aquires slots_lock for read. loop vcpus set_bit in vcpu requests kick vcpu #arch function endloop loop vcpus until the requested bit is disappeared #as the reentry path uses test_and_clear it will disappear endloop That would be a implicit synchronization and should work, as I wrote before setting memslots while the guest is running is rare if ever existant for s390. On x86 smp_call_many could then work without the wait flag being set. I see, yes. But I assume that this synchronization approach is slower as it serializes all vcpus on reentry (they wait for the slots_lock to get dropped), therefore I wanted to ask how often setting memslots on runtime will occur on x86 ? Would this approach be acceptable ? For x86 we need slots_lock for two things: 1) to protect the memslot structures from changing (very rare), ie: kvm_set_memory. 2) to protect updates to the dirty bitmap (operations on behalf of guest) which take slots_lock for read versus updates to that dirty bitmap (an ioctl that retrieves what pages have been dirtied in the memslots, and clears the dirtyness info). All you need for S390 is 1), AFAICS. correct For 1), we can drop the slots_lock usage, but instead create an explicit synchronization point, where all vcpus are forced to (
Re: [PATCH 3/3] kvm-s390: streamline memslot handling
Avi Kivity wrote: Christian Bornträger wrote: Am Dienstag 26 Mai 2009 09:57:58 schrieb Avi Kivity: [...] In our low-level interrupt handler we do check for signal_pending, machine_check_pending and need_resched to leave the sie instruction. For anything else a the host sees a cpu bound guest always in the SIE instruction. Okay, now I understand (and agree with) you multi-level kick thing. Maybe we could do it like so: Interrupt handler (on s390 only) checks vcpu->requests, handles the ones it cans. If bits are still set, it exits to arch loop, which handles the bits it cans. If bits are still set, it exits to the generic code loop, which can finally exit to userspace. Does this fit with s390 hardware? I like this idea instead of explicitly kicking to an (upper) level to use the lowest kick and exit if not able to handle. I think it should work (no guarantee) and I try to come up with something in the next few days - either a updated patch series or additional discussion input :-). -- Grüsse / regards, Christian Ehrhardt IBM Linux Technology Center, Open Virtualization -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/3] kvm-s390: infrastructure to kick vcpus out of guest state
Marcelo Tosatti wrote: On Mon, May 25, 2009 at 01:40:49PM +0200, ehrha...@linux.vnet.ibm.com wrote: From: Christian Ehrhardt To ensure vcpu's come out of guest context in certain cases this patch adds a s390 specific way to kick them out of guest context. Currently it kicks them out to rerun the vcpu_run path in the s390 code, but the mechanism itself is expandable and with a new flag we could also add e.g. kicks to userspace etc. Signed-off-by: Christian Ehrhardt "For now I added the optimization to skip kicking vcpus out of guest that had the request bit already set to the s390 specific loop (sent as v2 in a few minutes). We might one day consider standardizing some generic kickout levels e.g. kick to "inner loop", "arch vcpu run", "generic vcpu run", "userspace", ... whatever levels fit *all* our use cases. And then let that kicks be implemented in an kvm_arch_* backend as it might be very different how they behave on different architectures." That would be ideal, yes. Two things make_all_requests handles: 1) It disables preemption with get_cpu(), so it can reliably check for cpu id. Somehow you don't need that for s390 when kicking multiple vcpus? I don't even need the cpuid as make_all_requests does, I just insert a special bit in the vcpu arch part and the vcpu will "come out to me (host)". Fortunateley the kick is rare and fast so I can just insert it unconditionally (it's even ok to insert it if the vcpu is not in guest state). That prevents us from needing vcpu lock or detailed checks which would end up where we started (no guarantee that vcpu's come out of guest context while trying to aquire all vcpu locks) 2) It uses smp_call_function_many(wait=1), which guarantees that by the time make_all_requests returns no vcpus will be using stale data (the remote vcpus will have executed ack_flush). yes this is really a part my s390 implementation doesn't fulfill yet. Currently on return vcpus might still use the old memslot information. As mentioned before letting all interrupts come "too far" out of the hot loop would be a performance issue, therefore I think I will need some request&confirm mechanism. I'm not sure yet but maybe it could be as easy as this pseudo code example: # in make_all_requests # remember we have slots_lock write here and the reentry that updates the vcpu specific data aquires slots_lock for read. loop vcpus set_bit in vcpu requests kick vcpu #arch function endloop loop vcpus until the requested bit is disappeared #as the reentry path uses test_and_clear it will disappear endloop That would be a implicit synchronization and should work, as I wrote before setting memslots while the guest is running is rare if ever existant for s390. On x86 smp_call_many could then work without the wait flag being set. But I assume that this synchronization approach is slower as it serializes all vcpus on reentry (they wait for the slots_lock to get dropped), therefore I wanted to ask how often setting memslots on runtime will occur on x86 ? Would this approach be acceptable ? If it is too adventurous for now I can implement it that way in the s390 code and we split the long term discussion (synchronization + generic kickout levels + who knows what comes up). If smp_call_function_many is hidden behind kvm_arch_kick_vcpus, can you make use of make_all_requests for S390 (without the smp_call_function performance impact you mentioned) ? In combination with the request&confirm mechanism desribed above it should work if smp_call function and all the cpuid gathering which belongs to it is hidden behind kvm_arch_kick_vcpus. For x86 we can further optimize make_all_requests by checking REQ_KICK, and kvm_arch_kick_vcpus would be a good place for that. And the kickout levels idea you mentioned can come later, as an optimization? yes I agree splitting that to a later optimization is a good idea. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- Grüsse / regards, Christian Ehrhardt IBM Linux Technology Center, Open Virtualization -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/3] kvm-s390: streamline memslot handling
Christian Ehrhardt wrote: Avi Kivity wrote: ehrha...@linux.vnet.ibm.com wrote: From: Christian Ehrhardt [...] -/* update sie control blocks, and unlock all vcpus */ +/* request update of sie control block for all available vcpus */ for (i = 0; i < KVM_MAX_VCPUS; ++i) { if (kvm->vcpus[i]) { -kvm->vcpus[i]->arch.sie_block->gmsor = -kvm->arch.guest_origin; -kvm->vcpus[i]->arch.sie_block->gmslm = -kvm->arch.guest_memsize + -kvm->arch.guest_origin + -VIRTIODESCSPACE - 1ul; -mutex_unlock(&kvm->vcpus[i]->mutex); +set_bit(KVM_REQ_MMU_RELOAD, &kvm->vcpus[i]->requests); +kvm_s390_inject_sigp_stop(kvm->vcpus[i], + ACTION_RELOADVCPU_ON_STOP); } } There already exists a loop which does this, see make_all_cpus_request(). It uses an IPI (Marcelo, can't it use the reschedule interrupt?). It has a couple of optimizations -- if the request is already set, it skips the IPI, and it avoids the IPI for vcpus out of guest mode. Maybe it could fit s390 too. I assume that the IPI on x86 is a implicit consequence of the smp_call_function_many function, but I think this doesn't work that way for us. The kick implied by that call would be recieved, but not reach the code the checke vcpu->request. I could add that behaviour, but that could make our normal interrupt handling much slower. Therefore I don't want to call that function, but on the other hand I like the "skip if the request is already set" functionality and think about adding that in my loop. For now I added the optimization to skip kicking vcpus out of guest that had the request bit already set to the s390 specific loop (sent as v2 in a few minutes). We might one day consider standardizing some generic kickout levels e.g. kick to "inner loop", "arch vcpu run", "generic vcpu run", "userspace", ... whatever levels fit *all* our use cases. And then let that kicks be implemented in an kvm_arch_* backend as it might be very different how they behave on different architectures. In case an architecture cannot achive reaching the specified kickout level it has to kick to the next available upper level which eventually will reach the desired step on the way to re-run the vcpu. Alltogether this should lead to a much more reliable and transparent interface that finally should be used all across the generic code. -- Grüsse / regards, Christian Ehrhardt IBM Linux Technology Center, Open Virtualization -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/3] kvm-s390: streamline memslot handling
Avi Kivity wrote: ehrha...@linux.vnet.ibm.com wrote: From: Christian Ehrhardt This patch relocates the variables kvm-s390 uses to track guest mem addr/size. As discussed dropping the variables at struct kvm_arch level allows to use the common vcpu->request based mechanism to reload guest memory if e.g. changes via set_memory_region. The kick mechanism introduced in this series is used to ensure running vcpus leave guest state to catch the update. rerun_vcpu: +if (vcpu->requests) +if (test_and_clear_bit(KVM_REQ_MMU_RELOAD, &vcpu->requests)) +kvm_s390_vcpu_set_mem(vcpu); + /* verify, that memory has been registered */ -if (!vcpu->kvm->arch.guest_memsize) { +if (!vcpu->arch.sie_block->gmslm) { vcpu_put(vcpu); +VCPU_EVENT(vcpu, 3, "%s", "no memory registered to run vcpu"); return -EINVAL; } x86 uses a double check: first we check vcpu->requests outside atomic context, then we enter the critical section and check again for signals and vcpu->requests. This allows us (a) to do naughty things in vcpu->requests handlers, (b) keep the critical section short. Does this apply here? The patch already keeps the critical inner loop clear of extra code. The check for vcpu->requests I added is only reached by either a heavyweight (userspace) exit/reentry or the explicit kickout of a vcpu to this label. Therefore weit fulfills a+b as you mentioned them above. Additionally the s390 reload is very rare as well as fast, therefore it would not even be an issue. -/* update sie control blocks, and unlock all vcpus */ +/* request update of sie control block for all available vcpus */ for (i = 0; i < KVM_MAX_VCPUS; ++i) { if (kvm->vcpus[i]) { -kvm->vcpus[i]->arch.sie_block->gmsor = -kvm->arch.guest_origin; -kvm->vcpus[i]->arch.sie_block->gmslm = -kvm->arch.guest_memsize + -kvm->arch.guest_origin + -VIRTIODESCSPACE - 1ul; -mutex_unlock(&kvm->vcpus[i]->mutex); +set_bit(KVM_REQ_MMU_RELOAD, &kvm->vcpus[i]->requests); +kvm_s390_inject_sigp_stop(kvm->vcpus[i], + ACTION_RELOADVCPU_ON_STOP); } } There already exists a loop which does this, see make_all_cpus_request(). It uses an IPI (Marcelo, can't it use the reschedule interrupt?). It has a couple of optimizations -- if the request is already set, it skips the IPI, and it avoids the IPI for vcpus out of guest mode. Maybe it could fit s390 too. I assume that the IPI on x86 is a implicit consequence of the smp_call_function_many function, but I think this doesn't work that way for us. The kick implied by that call would be recieved, but not reach the code the checke vcpu->request. I could add that behaviour, but that could make our normal interrupt handling much slower. Therefore I don't want to call that function, but on the other hand I like the "skip if the request is already set" functionality and think about adding that in my loop. -- Grüsse / regards, Christian Ehrhardt IBM Linux Technology Center, Open Virtualization -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/6] kvm-s390: Fix memory slot versus run
Avi Kivity wrote: Christian Ehrhardt wrote: The bad thing on vcpu->request in that case is that I don't want the async behaviour of vcpu->requests in that case, I want the memory slot updated in all vcpu's when the ioctl is returning. You mean, the hardware can access the vcpu control block even when the vcpu is not running? No, hardware only uses it with a running vcpu, but I realised my own fault while changing the code to vcpu->request style. For s390 I need to update the KVM->arch and *all* vcpu->arch->sie_block... data synchronously. Out of interest, can you explain why? Sure I'll try to give an example. a) The whole guest has "one" memory slot representing all it's memory. Therefore some important values like guest_origin and guest_memsize (one slot so it's just addr+size) are kept at VM level in kvm->arch. It should really be kept in kvm->memslots[0]->{userspace_addr, npages}. This is common to all architectures. As I said wanted to do that, but due to the need to relocate my work environment to a new laptop I was a bit stalled the last few days. A patch series implementing it in a streamlined (storing in memslots only, slots_lock, vcpu->request, ...) way will soon appear on the list. [...] c) we have other code e.g. all our copy_from/to_guest stuff that uses the kvm->arch values You want to drop these and use kvm_read_guest() / kvm_write_guest(). I put it on my "low-prio-but-very-useful todo list" to take a look at that too as one of the next opportunities to streamline our code. [...] -- Grüsse / regards, Christian Ehrhardt IBM Linux Technology Center, Open Virtualization -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/6] kvm-s390: Fix memory slot versus run
Avi Kivity wrote: Christian Ehrhardt wrote: Avi Kivity wrote: Christian Ehrhardt wrote: The bad thing on vcpu->request in that case is that I don't want the async behaviour of vcpu->requests in that case, I want the memory slot updated in all vcpu's when the ioctl is returning. You mean, the hardware can access the vcpu control block even when the vcpu is not running? No, hardware only uses it with a running vcpu, but I realised my own fault while changing the code to vcpu->request style. For s390 I need to update the KVM->arch and *all* vcpu->arch->sie_block... data synchronously. Out of interest, can you explain why? Sure I'll try to give an example. a) The whole guest has "one" memory slot representing all it's memory. Therefore some important values like guest_origin and guest_memsize (one slot so it's just addr+size) are kept at VM level in kvm->arch. b) We fortunately have cool hardware support for "nearly everything"(tm) :-) In this case for example we set in vcpu->arch.sie_block the values for origin and size translated into a "limit" to get memory management virtualization support. c) we have other code e.g. all our copy_from/to_guest stuff that uses the kvm->arch values If we would allow e.g. updates of a memslot (or as the patch supposes to harden the set_memory_region code against inconsiderate code changes in other sections) it might happen that we set the kvm->arch information but the vcpu->arch->sie_block stuff not until next reentry. Now concurrently the running vcpu could cause some kind of fault that involves a copy_from/to_guest. That way we could end up with potentially invalid handling of that fault (fault handling and running guest would use different userspace adresses until it is synced on next vcpu reentry) - it's theoretical I know, but it might cause some issues that would be hard to find. On the other hand for the long term I wanted to note that all our copy_from/to_guest functions is per vcpu, so when we some day implement updateable memslots, multiple memslots or even just fill "free time"(tm) and streamline our code we could redesign that origin/size storage. This could be done multiple ways, either just store it per vcpu or with a lock for the kvm->arch level variables - both ways and maybe more could then use the vcpu->request based approach, but unfortunately it's neither part of that patch nor of the current effort to do that. The really good thing is, because of our discussion about that I now have a really detailed idea how I can improve that code aside from this bugfix patch (lets hope not too far in the future). That makes the "per vcpu resync on next entry" approach not feasible. On the other hand I realized at the same moment that the livelock should be no issue for us, because as I mentioned: a) only one memslot b) a vcpu can't run without memslot So I don't even need to kick out vcpu's, they just should not be running. Until we ever support multiple slots, or updates of the existing single slot this should be ok, so is the bugfix patch this should be. To avoid a theoretical deadlock in case other code is changing (badly) it should be fair to aquire the lock with mutex_trylock and return -EINVAL if we did not get all locks. OK. -- Grüsse / regards, Christian Ehrhardt IBM Linux Technology Center, Open Virtualization -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/6] kvm-s390: Fix memory slot versus run
Avi Kivity wrote: Christian Ehrhardt wrote: The bad thing on vcpu->request in that case is that I don't want the async behaviour of vcpu->requests in that case, I want the memory slot updated in all vcpu's when the ioctl is returning. You mean, the hardware can access the vcpu control block even when the vcpu is not running? No, hardware only uses it with a running vcpu, but I realised my own fault while changing the code to vcpu->request style. For s390 I need to update the KVM->arch and *all* vcpu->arch->sie_block... data synchronously. That makes the "per vcpu resync on next entry" approach not feasible. On the other hand I realized at the same moment that the livelock should be no issue for us, because as I mentioned: a) only one memslot b) a vcpu can't run without memslot So I don't even need to kick out vcpu's, they just should not be running. Until we ever support multiple slots, or updates of the existing single slot this should be ok, so is the bugfix patch this should be. To avoid a theoretical deadlock in case other code is changing (badly) it should be fair to aquire the lock with mutex_trylock and return -EINVAL if we did not get all locks. [...] If I can change it that way it will definitely require some testing. ... to be continued :-) I definitely recommend it -- would bring s390 more in line with the other ports (I know it's a backward step for you :) Note our plan is to change slots_lock to RCU, so it's even better if you use memslots. As long as we have the special conditions mentioned above I think its ok to implement it the way I do it now. I agree that if we ever support multiple memslots we should strive for a common solution. p.s. the second patch in the series ensures that a vcpu really never runs without a memslot being set as this was another bug we had. -- Grüsse / regards, Christian Ehrhardt IBM Linux Technology Center, Open Virtualization -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/6] kvm-s390: Fix memory slot versus run
Avi Kivity wrote: Christian Ehrhardt wrote: I thought about implementing it with slots_lock, vcpu->request, etc but it really looks like overkill for s390. We could make (some of) it common code, so it won't look so bad. There's value in having all kvm ports do things similarly; though of course we shouldn't force the solution when it isn't really needed. vcpu->requests is useful whenever we modify global VM state that needs to be seen by all vcpus in host mode; see kvm_reload_remote_mmus(). yeah I read that code after your first hint in that thread, and I agree that merging some of this into common code might be good. But in my opinion not now for this bugfix patch (the intention is just to prevent a user being able to crash the host via vcpu create,set mem& and vcpu run in that order). It might be a good point to further streamline this once we use the same userspace code, but I think it doesn't make sense yet. Sure, don't mix bugfixes with infrastructure changes, when possible. At least today we can assume that we only have one memslot. Therefore a set_memslot with already created vcpu's will still not interfere with running vcpus (they can't run without memslot and since we have only one they won't run). Anyway I the code is prepared to "meet" running vcpus, because it might be different in future. To prevent the livelock issue I changed the code using mutex_trylock and in case I can't get the lock I explicitly let the vcpu exit from guest. Why not do it unconditionally? hmm I might have written that misleading - eventually it's a loop until it got the lock while !trylock kick vcpu out of guest schedule There is no reason to kick out guests where I got the lock cleanly as far as I see. Especially as I expect the vcpus not running in the common case as i explained above (can't run without memslot + we only have one => no vcpu will run). Still livelockable, unless you stop the vcpu from entering the guest immediately. That's why vcpu->requests is so powerful. Not only you kick the vcpu out of guest mode, you force it to synchronize when it tries to enter again. The bad thing on vcpu->request in that case is that I don't want the async behaviour of vcpu->requests in that case, I want the memory slot updated in all vcpu's when the ioctl is returning. Looking at vcpu->request based solution I don't find the synchronization I need. The changes to vcpu->arch.guest_origin/guest_memsize and the changes to vcpu->arch.sie_block->gmsor/gmslm need to happen without the vcpu running. Therefor i want the vcpu lock _before_ I update the both structs, otherwise it could be racy (at least on s390). On the other hand while it is very++ unlikely to happen you are still right that it could theoretically livelock there. I might use vcpu->request in to not enter vcpu run again after such a "kick" out of guest state. It would be checked on vcpu_run enter and could then drop the lock, call schedule, relock and check the flag again until it is cleared. I'm not yet happy with this solution as I expect it to end up in something like a reference count which then would not fit into the existing vcpu->request flags :-/ As I mentioned above the changes to vcpu->arch and vcpu->arch->sie_block have to be exclusive with the vcpu not running. If I would find something as "transport" for the information I have on set_memory_slot (origin/size) until the next vcpu_run entry I could do both changes there synchronously. In that case I could really use your suggested solution with vcpu->request, kick out unconditionally and set values on next (re-)entry. Hmmm .. Maybe I can find all I need on reentry in vcpu->kvm->memslots[]. If I can change it that way it will definitely require some testing. ... to be continued :-) -- Grüsse / regards, Christian Ehrhardt IBM Linux Technology Center, Open Virtualization -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/6] kvm-s390: Fix memory slot versus run
Avi Kivity wrote: Christian Ehrhardt wrote: On x86, we use slots_lock to protect memory slots. When we change the global memory configuration, we set a bit in vcpu->requests, and send an IPI to all cpus that are currently in guest mode for our guest. This forces the cpu back to host mode. On the next entry, vcpu_run notices vcpu->requests has the bit set and reloads the mmu configuration. Of course, all this may be overkill for s390. I thought about implementing it with slots_lock, vcpu->request, etc but it really looks like overkill for s390. We could make (some of) it common code, so it won't look so bad. There's value in having all kvm ports do things similarly; though of course we shouldn't force the solution when it isn't really needed. vcpu->requests is useful whenever we modify global VM state that needs to be seen by all vcpus in host mode; see kvm_reload_remote_mmus(). yeah I read that code after your first hint in that thread, and I agree that merging some of this into common code might be good. But in my opinion not now for this bugfix patch (the intention is just to prevent a user being able to crash the host via vcpu create,set mem& and vcpu run in that order). It might be a good point to further streamline this once we use the same userspace code, but I think it doesn't make sense yet. At least today we can assume that we only have one memslot. Therefore a set_memslot with already created vcpu's will still not interfere with running vcpus (they can't run without memslot and since we have only one they won't run). Anyway I the code is prepared to "meet" running vcpus, because it might be different in future. To prevent the livelock issue I changed the code using mutex_trylock and in case I can't get the lock I explicitly let the vcpu exit from guest. Why not do it unconditionally? hmm I might have written that misleading - eventually it's a loop until it got the lock while !trylock kick vcpu out of guest schedule There is no reason to kick out guests where I got the lock cleanly as far as I see. Especially as I expect the vcpus not running in the common case as i explained above (can't run without memslot + we only have one => no vcpu will run). -- Grüsse / regards, Christian Ehrhardt IBM Linux Technology Center, Open Virtualization -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/6] kvm-s390: Fix memory slot versus run
Avi Kivity wrote: ehrha...@linux.vnet.ibm.com wrote: From: Carsten Otte This patch fixes an incorrectness in the kvm backend for s390. In case virtual cpus are being created before the corresponding memory slot is being registered, we need to update the sie control blocks for the virtual cpus. In order to do that, we use the vcpu->mutex to lock out kvm_run and friends. This way we can ensure a consistent update of the memory for the entire smp configuration. @@ -657,6 +657,8 @@ int kvm_arch_set_memory_region(struct kv struct kvm_memory_slot old, int user_alloc) { +int i; + /* A few sanity checks. We can have exactly one memory slot which has to start at guest virtual zero and which has to be located at a page boundary in userland and which has to end at a page boundary. @@ -676,13 +678,27 @@ int kvm_arch_set_memory_region(struct kv if (mem->memory_size & (PAGE_SIZE - 1)) return -EINVAL; +/* lock all vcpus */ +for (i = 0; i < KVM_MAX_VCPUS; ++i) { +if (kvm->vcpus[i]) +mutex_lock(&kvm->vcpus[i]->mutex); +} + Can't that livelock? Nothing requires a vcpu to ever exit, and if the cpu on which it's running on has no other load and no interrupts, it could remain in guest mode indefinitely, and then the ioctl will hang, waiting for something to happen. Yes it could wait indefinitely - good spot. On x86, we use slots_lock to protect memory slots. When we change the global memory configuration, we set a bit in vcpu->requests, and send an IPI to all cpus that are currently in guest mode for our guest. This forces the cpu back to host mode. On the next entry, vcpu_run notices vcpu->requests has the bit set and reloads the mmu configuration. Of course, all this may be overkill for s390. I thought about implementing it with slots_lock, vcpu->request, etc but it really looks like overkill for s390. At least today we can assume that we only have one memslot. Therefore a set_memslot with already created vcpu's will still not interfere with running vcpus (they can't run without memslot and since we have only one they won't run). Anyway I the code is prepared to "meet" running vcpus, because it might be different in future. To prevent the livelock issue I changed the code using mutex_trylock and in case I can't get the lock I explicitly let the vcpu exit from guest. -- Grüsse / regards, Christian Ehrhardt IBM Linux Technology Center, Open Virtualization -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 4/6] kvm-s390: Unlink vcpu on destroy
Avi Kivity wrote: ehrha...@linux.vnet.ibm.com wrote: From: Carsten Otte This patch makes sure we do unlink a vcpu's sie control block from the system control area in kvm_arch_vcpu_destroy. This prevents illegal accesses to the sie control block from other virtual cpus after free. Reported-by: Mijo Safradin Signed-off-by: Carsten Otte --- arch/s390/kvm/kvm-s390.c |9 +++-- 1 file changed, 7 insertions(+), 2 deletions(-) Index: kvm/arch/s390/kvm/kvm-s390.c === --- kvm.orig/arch/s390/kvm/kvm-s390.c +++ kvm/arch/s390/kvm/kvm-s390.c @@ -195,6 +195,9 @@ out_nokvm: void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu) { VCPU_EVENT(vcpu, 3, "%s", "free cpu"); +if (vcpu->kvm->arch.sca->cpu[vcpu->vcpu_id].sda == +(__u64) vcpu->arch.sie_block) +vcpu->kvm->arch.sca->cpu[vcpu->vcpu_id].sda = 0; free_page((unsigned long)(vcpu->arch.sie_block)); If this is accessed by hardware on a different cpu, don't you need a memory barrier here? Right, will be in v2 -- Grüsse / regards, Christian Ehrhardt IBM Linux Technology Center, Open Virtualization -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] qemu: fix configuring kvm probe when using --kerneldir
Andre Przywara wrote: ehrha...@linux.vnet.ibm.com wrote: From: Christian Ehrhardt There is already a variable kvm_cflags which gets the path of the kernel includes when using --kerneldir. But eventually with newer kernels we all will need arch/$arch/include too (my case was a incldue of asm/kvm.h which was not found anymore). Headers in a full kernel source are not flattened to one arch like they are if e.g. installed kernel headers are used. I also stumbled over this recently (in kvm-userspace.git), but I had problems with the qemu part not including KVM support because in qemu/configure the KVM build test failed due to the missing asm/kvm.h. I saw that --kerneldir gets not propagated to qemu, but libkvm_kerneldir instead, which is hardcoded to point to `pwd`/kernel. Shouldn't that be fixed, too? I use kvm-userspace.git and a not-installed kernel from kvm.git for compiling, so I say "./configure --kerneldir=/src/kvm.git --with-patched-kernel". I eventually hacked KVM's configure to propagate --kerneldir to qemu and added arch/x86/include to the include path in qemu/configure. This is of course a hack (that's why I don't append it here), but it worked ;-) If someone proposes a clean and easy way to solve this, I'd be happy to write a patch. I know this issue and reported it ~a month ago. I also had issues compiling against a --kerneldir kernel because the libkvm_kerneldir was propagated. Eventually in the discussion it came up that we don't need to fix configure "technically", but maybe we should find a way to better inform users/developüers about this (I guess up to 99% that this works for you in kvm-userspace too): (in a clean kvm-userspace) cd kernel make sync LINUX=path/to/your/kerneldir cd .. ./configure opt=whateveryouwant This way your kerneldir is synced and flattened into kvm-userspace and propagating libkvm_kerneldir is fine since that are your kerneldir headers now. Maybe a "fix" would be that if --kerneldir is provided to configure it has to ensure that THIS kerneldir is synced in before continuing. You should be aware that the fix I sent on Friday was for plain qemu which doesn't have that kernel subdir indirection and therefore works a bit different. To fix that, the includes added to cflags depending on --kerneldir should also contian the arch includes. The patch adds a special check for x86 because its source layout recently changed, all others directly use arch/$cpu/include if existent. This is one problem I also noticed. $cpu is not the same as the Linux' arch name, is there a suitable variable or do we have to do a large switch/case? I looked around and there was no real 1:1 matching variable. But fortunately $cpu is similar enough to simplify that swicth/case a lot like I did in my patch here. Regards, Andre. Signed-off-by: Christian Ehrhardt --- [diffstat] configure |6 ++ 1 file changed, 6 insertions(+) [diff] diff --git a/configure b/configure --- a/configure +++ b/configure @@ -963,6 +963,12 @@ EOF EOF if test "$kerneldir" != "" ; then kvm_cflags=-I"$kerneldir"/include + if test \( "$cpu" = "i386" -o "$cpu" = "x86_64" \) \ + -a -d "$kerneldir/arch/x86/include" ; then +kvm_cflags="$kvm_cflags -I$kerneldir/arch/x86/include" +elif test -d "$kerneldir/arch/$cpu/include" ; then +kvm_cflags="$kvm_cflags -I$kerneldir/arch/$cpu/include" + fi else kvm_cflags="" fi -- Grüsse / regards, Christian Ehrhardt IBM Linux Technology Center, Open Virtualization -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] qemu: report issues causing the kvm probe to fail v3
Anthony Liguori wrote: ehrha...@linux.vnet.ibm.com wrote: From: Christian Ehrhardt The patch applies to upstream qemu as well as kvm-userspace, but since it is the qemu configure script I think it should go to upstream qemu (Anthony) first and with the next merge to kvm-userspace. On the other hand it is the kvm probe so an ack from Avi in case v3 is ok would be reasonable. *updates* v2 - it also reports other errors than just #error preprocessor statements (requested by Avi) v3 - In case awk or grep is not installed it now gracfully (silently) fails still disabling kvm (requested by Anthony) This patch is about reporting more details of the issue if configuring kvm fails. Therefore this patch keeps the qemu style configure output which is a list of "$Feature $Status", but extend the "no" result like "KVM Support no" with some more information. There might be a lot of things going wrong with that probe and I don't want to handle all of them, but if it is one of the known checks e.g. for KVM_API_VERSION then we could grep/awk that out and report it. The patch reports in case of a known case in the style "KVM support no - (Missing KVM capability KVM_CAP_DESTROY_MEMORY_REGION_WORKS)" In case more than one #error is triggered it creates a comma separated list in those brackets and in case it is something else than an #error it just reports plain old "no". Signed-off-by: Christian Ehrhardt --- configure | 27 --- 1 file changed, 20 insertions(+), 7 deletions(-) diff --git a/qemu/configure b/qemu/configure --- a/qemu/configure +++ b/qemu/configure Please send against upstream QEMU. Regards, Anthony Liguori This applies to qemu upstream already when not specifying -p or -p 2 (as I already tested yesterday before submission). Well I removed the leading qemu dir manually and updated the -51 lines offset (attached), should I change more than that (add the leading trunk maybe, atm I'm just not sure what changes you want) ? -- Grüsse / regards, Christian Ehrhardt IBM Linux Technology Center, Open Virtualization Subject: [PATCH] qemu: report issues causing the kvm probe to fail v3 From: Christian Ehrhardt The patch applies to upstream qemu as well as kvm-userspace, but since it is the qemu configure script I think it should go to upstream qemu (Anthony) first and with the next merge to kvm-userspace. On the other hand it is the kvm probe so an ack from Avi in case v3 is ok would be reasonable. *updates* v2 - it also reports other errors than just #error preprocessor statements (requested by Avi) v3 - In case awk or grep is not installed it now gracfully (silently) fails still disabling kvm (requested by Anthony) This patch is about reporting more details of the issue if configuring kvm fails. Therefore this patch keeps the qemu style configure output which is a list of "$Feature $Status", but extend the "no" result like "KVM Support no" with some more information. There might be a lot of things going wrong with that probe and I don't want to handle all of them, but if it is one of the known checks e.g. for KVM_API_VERSION then we could grep/awk that out and report it. The patch reports in case of a known case in the style "KVM support no - (Missing KVM capability KVM_CAP_DESTROY_MEMORY_REGION_WORKS)" In case more than one #error is triggered it creates a comma separated list in those brackets and in case it is something else than an #error it just reports plain old "no". Signed-off-by: Christian Ehrhardt --- configure | 27 --- 1 file changed, 20 insertions(+), 7 deletions(-) diff --git a/configure b/configure --- a/configure +++ b/configure @@ -951,13 +951,17 @@ if test "$kvm" = "yes" ; then if test "$kvm" = "yes" ; then cat > $TMPC < -#if !defined(KVM_API_VERSION) || \ -KVM_API_VERSION < 12 || \ -KVM_API_VERSION > 12 || \ -!defined(KVM_CAP_USER_MEMORY) || \ -!defined(KVM_CAP_SET_TSS_ADDR) || \ -!defined(KVM_CAP_DESTROY_MEMORY_REGION_WORKS) +#if !defined(KVM_API_VERSION) || KVM_API_VERSION < 12 || KVM_API_VERSION > 12 #error Invalid KVM version +#endif +#if !defined(KVM_CAP_USER_MEMORY) +#error Missing KVM capability KVM_CAP_USER_MEMORY +#endif +#if !defined(KVM_CAP_SET_TSS_ADDR) +#error Missing KVM capability KVM_CAP_SET_TSS_ADDR +#endif +#if !defined(KVM_CAP_DESTROY_MEMORY_REGION_WORKS) +#error Missing KVM capability KVM_CAP_DESTROY_MEMORY_REGION_WORKS #endif int main(void) { return 0; } EOF @@ -970,7 +974,16 @@ EOF > /dev/null 2>/dev/null ; then : else -kvm="no" +kvm="no"; +if [ -x "`which awk 2>/dev/null`" ] && \ + [ -x "`which grep 2>/dev/null`" ]; then + kvmerr=`$cc $ARCH_CFLAGS -o $TMPE ${OS_CFLAGS} $kvm_cflags $TMPC 2>&1 \ + | grep "error: " \ + | awk -F "error: " '{if (NR>1) printf(", "); printf("%s",$2);}'` + if test "$kvmerr" != "" ; then +kvm="no - (${kvmerr})" + fi +fi fi fi
Re: [PATCH] kvm: work around inability of older kvm modules to destroy memory regions
Avi Kivity wrote: Christian Ehrhardt wrote: Hi, this patch breaks all non x86 architectures as libkvm/libkvm-x86.c has the only implementation of the alias functionality. Until now only qemu-kvm-x86 has called that functions, but since this patch the generic qemu-kvm.c calls them which leads to unresolved symbols for powerpc, s390 and surely ia64 too. Well we could insert stubs for these call, but when looking on the kernel side x86 is also the only implementer of the KVM_SET_MEMORY_ALIAS ioctl. Until more arch support that there is no reason to create these functions for non-x86 in libkvm. Also the assumptions which addresses must be aliased base on hardware specific assumptions e.g. vga card -> arch specific too. For now I hold a no-op stub in my private queue to test powerpc, but eventually this mechanism should be arch dependent and this implementation x86 only. Avi could you modify your patch to work for the other arch's too ? Your band aid should be fine. Yes, it's ugly, but this will be in flux as we merge with upstream qemu. Please resend it with a signoff. still ugly, but in a proper patch style now :-) -- Grüsse / regards, Christian Ehrhardt IBM Linux Technology Center, Open Virtualization Subject: [PATCH] kvm-userspace: guard destroy memory regions to x86 only From: Christian Ehrhardt Some parts of the changes made were arch specific e..g vga mem. This is a quick, but ugly workaround until a real arch split is found. Signed-off-by: Christian Ehrhardt --- [diffstat] qemu-kvm.c | 26 ++ 1 file changed, 22 insertions(+), 4 deletions(-) [diff] diff --git a/qemu/qemu-kvm.c b/qemu/qemu-kvm.c --- a/qemu/qemu-kvm.c +++ b/qemu/qemu-kvm.c @@ -766,7 +766,9 @@ int kvm_qemu_init() return 0; } +#ifdef TARGET_I386 static int destroy_region_works = 0; +#endif int kvm_qemu_create_context(void) { @@ -784,7 +786,9 @@ int kvm_qemu_create_context(void) r = kvm_arch_qemu_create_context(); if(r <0) kvm_qemu_destroy(); +#ifdef TARGET_I386 destroy_region_works = kvm_destroy_memory_region_works(kvm_context); +#endif return 0; } @@ -793,6 +797,7 @@ void kvm_qemu_destroy(void) kvm_finalize(kvm_context); } +#ifdef TARGET_I386 static int must_use_aliases_source(target_phys_addr_t addr) { if (destroy_region_works) @@ -849,6 +854,7 @@ static void drop_mapping(target_phys_add if (p) *p = mappings[--nr_mappings]; } +#endif void kvm_cpu_register_physical_memory(target_phys_addr_t start_addr, unsigned long size, @@ -856,17 +862,21 @@ void kvm_cpu_register_physical_memory(ta { int r = 0; unsigned long area_flags = phys_offset & ~TARGET_PAGE_MASK; +#ifdef TARGET_I386 struct mapping *p; +#endif phys_offset &= ~IO_MEM_ROM; if (area_flags == IO_MEM_UNASSIGNED) { +#ifdef TARGET_I386 if (must_use_aliases_source(start_addr)) { kvm_destroy_memory_alias(kvm_context, start_addr); return; } if (must_use_aliases_target(start_addr)) return; +#endif kvm_unregister_memory_area(kvm_context, start_addr, size); return; } @@ -878,6 +888,7 @@ void kvm_cpu_register_physical_memory(ta if (area_flags >= TLB_MMIO) return; +#ifdef TARGET_I386 if (must_use_aliases_source(start_addr)) { p = find_ram_mapping(phys_offset); if (p) { @@ -886,6 +897,7 @@ void kvm_cpu_register_physical_memory(ta } return; } +#endif r = kvm_register_phys_mem(kvm_context, start_addr, phys_ram_base + phys_offset, @@ -895,11 +907,13 @@ void kvm_cpu_register_physical_memory(ta exit(1); } +#ifdef TARGET_I386 drop_mapping(start_addr); p = &mappings[nr_mappings++]; p->phys = start_addr; p->ram = phys_offset; p->len = size; +#endif return; } @@ -1068,8 +1082,10 @@ void kvm_qemu_log_memory(target_phys_add if (log) kvm_dirty_pages_log_enable_slot(kvm_context, start, size); else { +#ifdef TARGET_I386 if (must_use_aliases_target(start)) return; +#endif kvm_dirty_pages_log_disable_slot(kvm_context, start, size); } } @@ -1157,8 +1173,10 @@ void kvm_physical_sync_dirty_bitmap(targ { void *buf; +#ifdef TARGET_I386 if (must_use_aliases_source(start_addr)) return; +#endif buf = qemu_malloc((end_addr - start_addr) / 8 + 2); kvm_get_dirty_pages_range(kvm_context, start_addr, end_addr - start_addr, @@ -1168,21 +1186,21 @@ void kvm_physical_sync_dirty_bitmap(targ int kvm_log_start(target_phys_addr_t phys_addr, target_phys_addr_t len) { -#ifndef TARGET_IA64 +#ifdef TARGET_I386 if (must_use_aliases_source(phys_addr)) return 0; +#endif kvm_qemu_log_memory(phys_addr, len, 1); -#endif return 0; } int kvm_log_stop(target_phys_addr_t phys_a
Re: [PATCH] [PATCH] qemu: report issues causing the kvm probe to fail
Avi Kivity wrote: Christian Ehrhardt wrote: I ran into the issue of a failign KVM Probe of the qemu configure script three times this week always needing "set -x", inserting an exit, masking the cleanup trap and compiling the c file by hand until I knew what the reason is. I think we could make easier for developers and end users. Therefore this patch keeps the qemu style configure output which is a list of "$Feature $Status", but extend the "no" result like "KVM Support no" with some more information. There might be a lot of things going wrong with that probe and I don't want to handle all of them, but if it is one of the known checks e.g. for KVM_API_VERSION then we could grep/awk that out and report it. The patch reports in case of a known case in the style "KVM support no - (Missing KVM capability KVM_CAP_DESTROY_MEMORY_REGION_WORKS)" In case more than one #error is triggered it creates a comma separated list in those brackets and in case it is something else than an #error it just reports plain old "no". diff --git a/qemu/configure b/qemu/configure --- a/qemu/configure +++ b/qemu/configure @@ -1037,12 +1037,14 @@ if test "$kvm" = "yes" ; then if test "$kvm" = "yes" ; then cat > $TMPC < -#if !defined(KVM_API_VERSION) || \ -KVM_API_VERSION < 12 || \ -KVM_API_VERSION > 12 || \ -!defined(KVM_CAP_USER_MEMORY) || \ -!defined(KVM_CAP_SET_TSS_ADDR) +#if !defined(KVM_API_VERSION) || KVM_API_VERSION < 12 || KVM_API_VERSION > 12 #error Invalid KVM version You might refine this a bit: if KVM_API_VERSION is not defined, most likely linux/kvm.h could not be found, so you might as well report that. +#endif Updated v2 should appear on the list soon reporting all gcc "error:" messages. -- Grüsse / regards, Christian Ehrhardt IBM Linux Technology Center, Open Virtualization -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] [PATCH] qemu: report issues causing the kvm probe to fail v2
# HG changeset patch # User Christian Ehrhardt # Date 1229347014 -3600 # Node ID c754b8806d756a19c57fc3b3e317bbe3c147d5ec # Parent f7dc67cd9b74c5d7ad322686e58325f879d93468 [PATCH] qemu: report issues causing the kvm probe to fail v2 From: Christian Ehrhardt *update to v2* It now reports all "error:" statements from gcc behind the KVM no status. As the status line now is a big larger I also found that "kvm support" is reported twice, so I removed one duplicate. Finally there was also one check in configure_kvm that did not use "" guards when runnign test on $kvm which could fail now that it might contain more than just yes/no - I added apostrophs to prevent that. I ran into the issue of a failign KVM Probe of the qemu configure script three times this week always needing "set -x", inserting an exit, masking the cleanup trap and compiling the c file by hand until I knew what the reason is. I think we could make easier for developers and end users. Therefore this patch keeps the qemu style configure output which is a list of "$Feature $Status", but extend the "no" result like "KVM Support no" with some more information. There might be a lot of things going wrong with that probe and I don't want to handle all of them, but if it is one of the known checks e.g. for KVM_API_VERSION then we could grep/awk that out and report it. The patch reports in case of a known case in the style "KVM support no - (Missing KVM capability KVM_CAP_DESTROY_MEMORY_REGION_WORKS)" In case more than one #error is triggered it creates a comma separated list in those brackets and in case it is something else than an #error it just reports plain old "no". I sent a similar patch matching qemu upstream version of this file to qemu-de...@nongnu.org to keep both in sync as much as possible. Signed-off-by: Christian Ehrhardt --- [diffstat] configure | 22 ++ 1 file changed, 14 insertions(+), 8 deletions(-) [diff] diff --git a/qemu/configure b/qemu/configure --- a/qemu/configure +++ b/qemu/configure @@ -1037,12 +1037,14 @@ if test "$kvm" = "yes" ; then if test "$kvm" = "yes" ; then cat > $TMPC < -#if !defined(KVM_API_VERSION) || \ -KVM_API_VERSION < 12 || \ -KVM_API_VERSION > 12 || \ -!defined(KVM_CAP_USER_MEMORY) || \ -!defined(KVM_CAP_SET_TSS_ADDR) +#if !defined(KVM_API_VERSION) || KVM_API_VERSION < 12 || KVM_API_VERSION > 12 #error Invalid KVM version +#endif +#if !defined(KVM_CAP_USER_MEMORY) +#error Missing KVM capability KVM_CAP_USER_MEMORY +#endif +#if !defined(KVM_CAP_SET_TSS_ADDR) +#error Missing KVM capability KVM_CAP_SET_TSS_ADDR #endif int main(void) { return 0; } EOF @@ -1055,7 +1057,12 @@ EOF 2>/dev/null ; then : else -kvm="no" +kvmprobeerr=`$cc $ARCH_CFLAGS -o $TMPE ${OS_CFLAGS} $kvm_cflags $TMPC 2>&1 | grep "error: " | awk --field-separator "error: " '{if (NR>1) printf(", "); printf("%s",$2);}'` +if test "$kvmprobeerr" != "" ; then + kvm="no - (${kvmprobeerr})" +else + kvm="no" +fi fi fi @@ -1201,7 +1208,6 @@ if test -n "$sparc_cpu"; then echo "Target Sparc Arch $sparc_cpu" fi echo "kqemu support $kqemu" -echo "kvm support $kvm" echo "CPU emulation $cpu_emulation" if test $cpu = "powerpc"; then echo "libfdt support$device_tree_support" @@ -1638,7 +1644,7 @@ disable_cpu_emulation() { } configure_kvm() { - if test $kvm = "yes" -a "$target_softmmu" = "yes" -a \ + if test "$kvm" = "yes" -a "$target_softmmu" = "yes" -a \ \( "$cpu" = "i386" -o "$cpu" = "x86_64" -o "$cpu" = "ia64" -o "$cpu" = "powerpc" \); then echo "#define USE_KVM 1" >> $config_h echo "USE_KVM=1" >> $config_mak -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] [PATCH] kvm-userspace: ppc: fix compatfd build decision v2
# HG changeset patch # User Christian Ehrhardt # Date 1229345133 -3600 # Node ID b48b9d560f80037ab4e12eae128022622c7ccb34 # Parent 4b0ad05490115e4c6f31d2419c0e5b628040f90b [PATCH] kvm-userspace: ppc: fix compatfd build decision v2 From: Christian Ehrhardt qemu-kvm.c uses qemu_eventfd/qemu_signalfd. The code of compatfd takes care if CONFIG_eventfd/CONFIG_signalfd is really enabled. But currently compatfd is not build if --disable-aio is set which leads to undefined references when linking. Since compatfd.c takes care of configs being set we can savely build it in all cases dropping the ifdef completely. This allows powerpc but also all other archs to build in case --disable-aio is set. Signed-off-by: Christian Ehrhardt --- [diffstat] Makefile|5 + Makefile.target |6 +- 2 files changed, 2 insertions(+), 9 deletions(-) [diff] diff --git a/qemu/Makefile b/qemu/Makefile --- a/qemu/Makefile +++ b/qemu/Makefile @@ -51,7 +51,7 @@ BLOCK_OBJS+=block-cow.o block-qcow.o aes BLOCK_OBJS+=block-cow.o block-qcow.o aes.o block-vmdk.o block-cloop.o BLOCK_OBJS+=block-dmg.o block-bochs.o block-vpc.o block-vvfat.o BLOCK_OBJS+=block-qcow2.o block-parallels.o block-nbd.o -BLOCK_OBJS+=nbd.o block.o aio.o +BLOCK_OBJS+=nbd.o block.o aio.o compatfd.o ifdef CONFIG_WIN32 BLOCK_OBJS += block-raw-win32.o @@ -59,9 +59,6 @@ BLOCK_OBJS += block-raw-posix.o BLOCK_OBJS += block-raw-posix.o endif -ifdef CONFIG_AIO -BLOCK_OBJS += compatfd.o -endif ## # libqemu_common.a: Target independent part of system emulation. The diff --git a/qemu/Makefile.target b/qemu/Makefile.target --- a/qemu/Makefile.target +++ b/qemu/Makefile.target @@ -646,6 +646,7 @@ OBJS=vl.o osdep.o monitor.o pci.o loader OBJS=vl.o osdep.o monitor.o pci.o loader.o isa_mmio.o machine.o OBJS+=fw_cfg.o OBJS+=net.o +OBJS+=compatfd.o ifdef CONFIG_KVM OBJS+=kvm.o kvm-all.o endif @@ -654,11 +655,6 @@ else else OBJS+=block-raw-posix.o endif - -ifdef CONFIG_AIO -OBJS+=compatfd.o -endif - LIBS+=-lz ifdef CONFIG_ALSA LIBS += -lasound -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1 of 6] [PATCH] kvm-userspace: ppc: fix compatfd build decision
Avi Kivity wrote: qemu-kvm.c uses qemu_eventfd/qemu_signalfd. The code of compatfd takes care if CONFIG_eventfd/CONFIG_signalfd is really enabled. But currently compatfd is not build if --disable-aio is set. This patch lets compatfd.c build if USE_KVM is set to allow qemu-kvm to be linked in all cases (with/without --disable-aio) This breaks x86, so I dropped it. On the other Hand x86 it is broken atm too. If you compile current upstream for x86 with --disable-aio you'll get this too: ibqemu.a(qemu-kvm.o): In function `kvm_main_loop': /home/paelzer/Desktop/KVM/ppc_port/kvm-userspace-ppc.hg-testbuild/qemu/qemu-kvm.c:565: undefined reference to `qemu_eventfd' /home/paelzer/Desktop/KVM/ppc_port/kvm-userspace-ppc.hg-testbuild/qemu/qemu-kvm.c:580: undefined reference to `qemu_signalfd' collect2: ld returned 1 exit status Which was exactly what I had with power :-/ I checked for the error you reported Avi, and the problem seems to be that USE_KVM was not set even if KVM support is enabled (weird?). However looking at this more in detail I realized that I don't have to care about USE_KVM in this csae. As I mentioned before compatfd.c takes care if CONFIG_signalfd/CONFIG_eventfd are set. Therefore we can savely remove the makefile guard completely and just always build compatfd.c. This updated patch works for x86&powerpc with/without --disable-aio in my tests. It should appear on the list shortly. -- Grüsse / regards, Christian Ehrhardt IBM Linux Technology Center, Open Virtualization -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] [PATCH] kvm-userspace: gdb: fix new gdb function types
# HG changeset patch # User Christian Ehrhardt # Date 1229085659 -3600 # Node ID 37967a80a2757505488685aac135681945e6da91 # Parent f0ed33f14658fe91a14ec02501cb42d26e32f01f [PATCH] kvm-userspace: gdb: fix new gdb function types From: Christian Ehrhardt The types changed in the header but not in the powerpc and ia64 implementation. This patch fix that build error by changing the stubs to the right prototype. Signed-off-by: Christian Ehrhardt --- [diffstat] qemu-kvm-ia64.c|6 -- qemu-kvm-powerpc.c |6 -- 2 files changed, 8 insertions(+), 4 deletions(-) [diff] diff --git a/qemu/qemu-kvm-ia64.c b/qemu/qemu-kvm-ia64.c --- a/qemu/qemu-kvm-ia64.c +++ b/qemu/qemu-kvm-ia64.c @@ -65,12 +65,14 @@ void kvm_arch_update_regs_for_sipi(CPUSt { } -int kvm_arch_insert_sw_breakpoint(struct kvm_sw_breakpoint *bp) +int kvm_arch_insert_sw_breakpoint(CPUState *current_env, + struct kvm_sw_breakpoint *bp) { return -EINVAL; } -int kvm_arch_remove_sw_breakpoint(struct kvm_sw_breakpoint *bp) +int kvm_arch_remove_sw_breakpoint(CPUState *current_env, + struct kvm_sw_breakpoint *bp) { return -EINVAL; } diff --git a/qemu/qemu-kvm-powerpc.c b/qemu/qemu-kvm-powerpc.c --- a/qemu/qemu-kvm-powerpc.c +++ b/qemu/qemu-kvm-powerpc.c @@ -223,12 +223,14 @@ void kvm_arch_cpu_reset(CPUState *env) { } -int kvm_arch_insert_sw_breakpoint(struct kvm_sw_breakpoint *bp) +int kvm_arch_insert_sw_breakpoint(CPUState *current_env, + struct kvm_sw_breakpoint *bp) { return -EINVAL; } -int kvm_arch_remove_sw_breakpoint(struct kvm_sw_breakpoint *bp) +int kvm_arch_remove_sw_breakpoint(CPUState *current_env, + struct kvm_sw_breakpoint *bp) { return -EINVAL; } -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] [PATCH] qemu: ppc: kvm-userspace: KVM PowerPC support for qemu gdbstub
Hollis Blanchard wrote: On Thu, 2008-12-11 at 17:05 +0100, Jan Kiszka wrote: Hollis Blanchard wrote: On Thu, 2008-12-11 at 13:53 +0100, Christian Ehrhardt wrote: This is v2 as version one had a type in it occured when splitting patches. Mercurial somehow lost my changes to the patch description explaining that, but the patch is right this way. Christian Ehrhardt wrote: # HG changeset patch # User Christian Ehrhardt # Date 1228999833 -3600 # Node ID dc1466c9077ab162f4637fffee1869f26be02299 # Parent 4c07fe2a56c7653a9113e05bb08c2de9aec210ce [PATCH] qemu: ppc: kvm-userspace: KVM PowerPC support for qemu gdbstub From: Hollis Blanchard Add basic KVM PowerPC support to qemu's gdbstub introducing a kvm ppc style mmu implementation that uses the kvm_translate ioctl. This also requires to save the kvm registers prior to the 'm' gdb operations. Signed-off-by: Hollis Blanchard Signed-off-by: Christian Ehrhardt Let's *not* apply this to kvm-userspace. We will submit this to qemu, and once we work out the right solution there it will be merged naturally. I don't oversee yet what you want to push upstream, but in case it's the gdbstub support for kvm (including ppc bits): please note that I plan to push the new interface once it is merged into kvm-userspace, avoiding to spread the current, limited one as far as possible. BTW, would be great if you could have a look / provide patches for ppc to support the new interface already. I am open for feedback, specifically regarding its suitability beyond x86. I've been meaning to do this for a while, sorry. We'll take a look soon. Hi Jan, I saw that you already had that env->s->g_cpu fix, so if you change all that anyway it might really be better to test/extend your patches for powerpc now. If it is ok for you I would submit my patches that apply on top of yours to you and cc the kvm list. But as Hollis mentioned I would prefer go for qemu upstream first and then assist Avi in merging it into kvm-userspace because this is the natural direction patches flow atm (and if you need to change it multiple times until you get qemu acceptance you would have to extensivly patch both projects to match again). As my code in that case depend on your patches it would be nice if you could put them into your series once you are happy with it. -- Grüsse / regards, Christian Ehrhardt IBM Linux Technology Center, Open Virtualization -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] [PATCH] qemu: report issues causing the kvm probe to fail
# HG changeset patch # User Christian Ehrhardt <[EMAIL PROTECTED]> # Date 1229005383 -3600 # Node ID d788f32f8f60f3a0d86ab218459089e5186632ca # Parent f7dc67cd9b74c5d7ad322686e58325f879d93468 [PATCH] qemu: report issues causing the kvm probe to fail From: Christian Ehrhardt <[EMAIL PROTECTED]> I ran into the issue of a failign KVM Probe of the qemu configure script three times this week always needing "set -x", inserting an exit, masking the cleanup trap and compiling the c file by hand until I knew what the reason is. I think we could make easier for developers and end users. Therefore this patch keeps the qemu style configure output which is a list of "$Feature $Status", but extend the "no" result like "KVM Support no" with some more information. There might be a lot of things going wrong with that probe and I don't want to handle all of them, but if it is one of the known checks e.g. for KVM_API_VERSION then we could grep/awk that out and report it. The patch reports in case of a known case in the style "KVM support no - (Missing KVM capability KVM_CAP_DESTROY_MEMORY_REGION_WORKS)" In case more than one #error is triggered it creates a comma separated list in those brackets and in case it is something else than an #error it just reports plain old "no". Signed-off-by: Christian Ehrhardt <[EMAIL PROTECTED]> --- [diffstat] configure | 24 +--- 1 file changed, 17 insertions(+), 7 deletions(-) [diff] diff --git a/qemu/configure b/qemu/configure --- a/qemu/configure +++ b/qemu/configure @@ -1037,12 +1037,14 @@ if test "$kvm" = "yes" ; then if test "$kvm" = "yes" ; then cat > $TMPC < -#if !defined(KVM_API_VERSION) || \ -KVM_API_VERSION < 12 || \ -KVM_API_VERSION > 12 || \ -!defined(KVM_CAP_USER_MEMORY) || \ -!defined(KVM_CAP_SET_TSS_ADDR) +#if !defined(KVM_API_VERSION) || KVM_API_VERSION < 12 || KVM_API_VERSION > 12 #error Invalid KVM version +#endif +#if !defined(KVM_CAP_USER_MEMORY) +#error Missing KVM capability KVM_CAP_USER_MEMORY +#endif +#if !defined(KVM_CAP_SET_TSS_ADDR) +#error Missing KVM capability KVM_CAP_SET_TSS_ADDR #endif int main(void) { return 0; } EOF @@ -1055,7 +1057,12 @@ EOF 2>/dev/null ; then : else -kvm="no" +kvmprobeerr=`$cc $ARCH_CFLAGS -o $TMPE ${OS_CFLAGS} $kvm_cflags $TMPC 2>&1 | grep "error: #error " | awk --field-separator "error: #error " '{if (NR>1) printf(", "); printf("%s",$2);}'` +if test "$kvmprobeerr" != "" ; then + kvm="no - (${kvmprobeerr})" +else + kvm="no" +fi fi fi -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] [PATCH] qemu: ppc: kvm-userspace: KVM PowerPC support for qemu gdbstub
# HG changeset patch # User Christian Ehrhardt <[EMAIL PROTECTED]> # Date 1228999833 -3600 # Node ID dc1466c9077ab162f4637fffee1869f26be02299 # Parent 4c07fe2a56c7653a9113e05bb08c2de9aec210ce [PATCH] qemu: ppc: kvm-userspace: KVM PowerPC support for qemu gdbstub From: Hollis Blanchard <[EMAIL PROTECTED]> Add basic KVM PowerPC support to qemu's gdbstub introducing a kvm ppc style mmu implementation that uses the kvm_translate ioctl. This also requires to save the kvm registers prior to the 'm' gdb operations. Signed-off-by: Hollis Blanchard <[EMAIL PROTECTED]> Signed-off-by: Christian Ehrhardt <[EMAIL PROTECTED]> --- [diffstat] gdbstub.c |2 ++ hw/ppc440_bamboo.c |1 + qemu-kvm-powerpc.c | 28 target-ppc/cpu.h|2 ++ target-ppc/helper.c |4 target-ppc/translate_init.c |5 + 6 files changed, 42 insertions(+) [diff] diff --git a/qemu/gdbstub.c b/qemu/gdbstub.c --- a/qemu/gdbstub.c +++ b/qemu/gdbstub.c @@ -1374,6 +1374,7 @@ static int gdb_handle_packet(GDBState *s if (*p == ',') p++; len = strtoull(p, NULL, 16); +kvm_save_registers(s->g_cpu); if (cpu_memory_rw_debug(s->g_cpu, addr, mem_buf, len, 0) != 0) { put_packet (s, "E14"); } else { @@ -1389,6 +1390,7 @@ static int gdb_handle_packet(GDBState *s if (*p == ':') p++; hextomem(mem_buf, p, len); +kvm_save_registers(s->g_cpu); if (cpu_memory_rw_debug(s->g_cpu, addr, mem_buf, len, 1) != 0) put_packet(s, "E14"); else diff --git a/qemu/hw/ppc440_bamboo.c b/qemu/hw/ppc440_bamboo.c --- a/qemu/hw/ppc440_bamboo.c +++ b/qemu/hw/ppc440_bamboo.c @@ -99,6 +99,7 @@ void bamboo_init(ram_addr_t ram_size, in fprintf(stderr, "Unable to initialize CPU!\n"); exit(1); } + env->mmu_model = POWERPC_MMU_KVM; /* call init */ printf("Calling function ppc440_init\n"); diff --git a/qemu/qemu-kvm-powerpc.c b/qemu/qemu-kvm-powerpc.c --- a/qemu/qemu-kvm-powerpc.c +++ b/qemu/qemu-kvm-powerpc.c @@ -102,6 +102,7 @@ void kvm_arch_save_regs(CPUState *env) env->spr[SPR_SRR0] = regs.srr0; env->spr[SPR_SRR1] = regs.srr1; +env->spr[SPR_BOOKE_PID] = regs.pid; env->spr[SPR_SPRG0] = regs.sprg0; env->spr[SPR_SPRG1] = regs.sprg1; @@ -219,6 +220,33 @@ int handle_powerpc_dcr_write(int vcpu, u return 0; /* XXX ignore failed DCR ops */ } +int mmukvm_get_physical_address(CPUState *env, mmu_ctx_t *ctx, +target_ulong eaddr, int rw, int access_type) +{ +struct kvm_translation tr; +uint64_t pid; +uint64_t as; +int r; + +pid = env->spr[SPR_BOOKE_PID]; + +if (access_type == ACCESS_CODE) +as = env->msr & msr_ir; +else +as = env->msr & msr_dr; + +tr.linear_address = as << 40 | pid << 32 | eaddr; +r = kvm_translate(kvm_context, env->cpu_index, &tr); +if (r == -1) +return r; + +if (!tr.valid) +return -EFAULT; + +ctx->raddr = tr.physical_address; +return 0; +} + void kvm_arch_cpu_reset(CPUState *env) { } diff --git a/qemu/target-ppc/cpu.h b/qemu/target-ppc/cpu.h --- a/qemu/target-ppc/cpu.h +++ b/qemu/target-ppc/cpu.h @@ -98,6 +98,8 @@ enum powerpc_mmu_t { POWERPC_MMU_BOOKE_FSL = 0x0009, /* PowerPC 601 MMU model (specific BATs format)*/ POWERPC_MMU_601= 0x000A, +/* KVM managing the MMU state */ +POWERPC_MMU_KVM= 0x000B, #if defined(TARGET_PPC64) #define POWERPC_MMU_64 0x0001 /* 64 bits PowerPC MMU */ diff --git a/qemu/target-ppc/helper.c b/qemu/target-ppc/helper.c --- a/qemu/target-ppc/helper.c +++ b/qemu/target-ppc/helper.c @@ -1429,6 +1429,10 @@ int get_physical_address (CPUState *env, fprintf(logfile, "%s\n", __func__); } #endif + +if (env->mmu_model == POWERPC_MMU_KVM) +return mmukvm_get_physical_address(env, ctx, eaddr, rw, access_type); + if ((access_type == ACCESS_CODE && msr_ir == 0) || (access_type != ACCESS_CODE && msr_dr == 0)) { /* No address translation */ diff --git a/qemu/target-ppc/translate_init.c b/qemu/target-ppc/translate_init.c --- a/qemu/target-ppc/translate_init.c +++ b/qemu/target-ppc/translate_init.c @@ -9273,6 +9273,11 @@ int cpu_ppc_register_internal (CPUPPCSta case POWERPC_MMU_601: mmu_model = "PowerPC 601"; break; +#ifdef KVM +case POWERPC_MMU_KVM: +mmu_model = "PowerPC KVM"; +break; +#endif #if defined (TARGET_PPC64) case POWERPC
Re: [PATCH] [PATCH] qemu: ppc: kvm-userspace: KVM PowerPC support for qemu gdbstub
This is v2 as version one had a type in it occured when splitting patches. Mercurial somehow lost my changes to the patch description explaining that, but the patch is right this way. Christian Ehrhardt wrote: # HG changeset patch # User Christian Ehrhardt <[EMAIL PROTECTED]> # Date 1228999833 -3600 # Node ID dc1466c9077ab162f4637fffee1869f26be02299 # Parent 4c07fe2a56c7653a9113e05bb08c2de9aec210ce [PATCH] qemu: ppc: kvm-userspace: KVM PowerPC support for qemu gdbstub From: Hollis Blanchard <[EMAIL PROTECTED]> Add basic KVM PowerPC support to qemu's gdbstub introducing a kvm ppc style mmu implementation that uses the kvm_translate ioctl. This also requires to save the kvm registers prior to the 'm' gdb operations. Signed-off-by: Hollis Blanchard <[EMAIL PROTECTED]> Signed-off-by: Christian Ehrhardt <[EMAIL PROTECTED]> --- [diffstat] gdbstub.c |2 ++ hw/ppc440_bamboo.c |1 + qemu-kvm-powerpc.c | 28 target-ppc/cpu.h|2 ++ target-ppc/helper.c |4 target-ppc/translate_init.c |5 + 6 files changed, 42 insertions(+) [diff] diff --git a/qemu/gdbstub.c b/qemu/gdbstub.c --- a/qemu/gdbstub.c +++ b/qemu/gdbstub.c @@ -1374,6 +1374,7 @@ static int gdb_handle_packet(GDBState *s if (*p == ',') p++; len = strtoull(p, NULL, 16); +kvm_save_registers(s->g_cpu); if (cpu_memory_rw_debug(s->g_cpu, addr, mem_buf, len, 0) != 0) { put_packet (s, "E14"); } else { @@ -1389,6 +1390,7 @@ static int gdb_handle_packet(GDBState *s if (*p == ':') p++; hextomem(mem_buf, p, len); +kvm_save_registers(s->g_cpu); if (cpu_memory_rw_debug(s->g_cpu, addr, mem_buf, len, 1) != 0) put_packet(s, "E14"); else diff --git a/qemu/hw/ppc440_bamboo.c b/qemu/hw/ppc440_bamboo.c --- a/qemu/hw/ppc440_bamboo.c +++ b/qemu/hw/ppc440_bamboo.c @@ -99,6 +99,7 @@ void bamboo_init(ram_addr_t ram_size, in fprintf(stderr, "Unable to initialize CPU!\n"); exit(1); } + env->mmu_model = POWERPC_MMU_KVM; /* call init */ printf("Calling function ppc440_init\n"); diff --git a/qemu/qemu-kvm-powerpc.c b/qemu/qemu-kvm-powerpc.c --- a/qemu/qemu-kvm-powerpc.c +++ b/qemu/qemu-kvm-powerpc.c @@ -102,6 +102,7 @@ void kvm_arch_save_regs(CPUState *env) env->spr[SPR_SRR0] = regs.srr0; env->spr[SPR_SRR1] = regs.srr1; +env->spr[SPR_BOOKE_PID] = regs.pid; env->spr[SPR_SPRG0] = regs.sprg0; env->spr[SPR_SPRG1] = regs.sprg1; @@ -219,6 +220,33 @@ int handle_powerpc_dcr_write(int vcpu, u return 0; /* XXX ignore failed DCR ops */ } +int mmukvm_get_physical_address(CPUState *env, mmu_ctx_t *ctx, +target_ulong eaddr, int rw, int access_type) +{ +struct kvm_translation tr; +uint64_t pid; +uint64_t as; +int r; + +pid = env->spr[SPR_BOOKE_PID]; + +if (access_type == ACCESS_CODE) +as = env->msr & msr_ir; +else +as = env->msr & msr_dr; + +tr.linear_address = as << 40 | pid << 32 | eaddr; +r = kvm_translate(kvm_context, env->cpu_index, &tr); +if (r == -1) +return r; + +if (!tr.valid) +return -EFAULT; + +ctx->raddr = tr.physical_address; +return 0; +} + void kvm_arch_cpu_reset(CPUState *env) { } diff --git a/qemu/target-ppc/cpu.h b/qemu/target-ppc/cpu.h --- a/qemu/target-ppc/cpu.h +++ b/qemu/target-ppc/cpu.h @@ -98,6 +98,8 @@ enum powerpc_mmu_t { POWERPC_MMU_BOOKE_FSL = 0x0009, /* PowerPC 601 MMU model (specific BATs format)*/ POWERPC_MMU_601= 0x000A, +/* KVM managing the MMU state */ +POWERPC_MMU_KVM= 0x000B, #if defined(TARGET_PPC64) #define POWERPC_MMU_64 0x0001 /* 64 bits PowerPC MMU */ diff --git a/qemu/target-ppc/helper.c b/qemu/target-ppc/helper.c --- a/qemu/target-ppc/helper.c +++ b/qemu/target-ppc/helper.c @@ -1429,6 +1429,10 @@ int get_physical_address (CPUState *env, fprintf(logfile, "%s\n", __func__); } #endif + +if (env->mmu_model == POWERPC_MMU_KVM) +return mmukvm_get_physical_address(env, ctx, eaddr, rw, access_type); + if ((access_type == ACCESS_CODE && msr_ir == 0) || (access_type != ACCESS_CODE && msr_dr == 0)) { /* No address translation */ diff --git a/qemu/target-ppc/translate_init.c b/qemu/target-ppc/translate_init.c --- a/qemu/target-ppc/translate_init.c +++ b/qemu/target-ppc/translate_init.c @@ -9273,6 +9273,11 @@ int cpu_ppc_register_internal (CPUPPCSta case POWERPC_MMU_601: mmu_model = "
[PATCH 3 of 3] [PATCH] kvm-userspace: fix gdbstub kvm integration
# HG changeset patch # User Christian Ehrhardt <[EMAIL PROTECTED]> # Date 1228989958 -3600 # Node ID f80fb35de91fe69dae889c70948c9a53212ee444 # Parent 6f228c807ad0b239b7342d2974debfc66418d784 [PATCH] kvm-userspace: fix gdbstub kvm integration From: Christian Ehrhardt <[EMAIL PROTECTED]> Some recent qemu upstream merges brought in a new concept to not use "env" as current cpu in gdb_handle_packet anymore. But the kvm calls still do, this leads to SIGDEV's as env is not initialized when calling the functions like kvm_save_registers. Insted there is now a gdbstate structure holding current cpu for step/continue and "other" ops splitted. This patch changes the kvm_save_registers calls to use the right CPUState variable for the kvm calls in gdb_handle_packet. Signed-off-by: Christian Ehrhardt <[EMAIL PROTECTED]> --- [diffstat] gdbstub.c |8 1 file changed, 4 insertions(+), 4 deletions(-) [diff] diff --git a/qemu/gdbstub.c b/qemu/gdbstub.c --- a/qemu/gdbstub.c +++ b/qemu/gdbstub.c @@ -1348,7 +1348,7 @@ static int gdb_handle_packet(GDBState *s } break; case 'g': -kvm_save_registers(env); +kvm_save_registers(s->g_cpu); len = 0; for (addr = 0; addr < num_g_regs; addr++) { reg_size = gdb_read_register(s->g_cpu, mem_buf + len, addr); @@ -1366,7 +1366,7 @@ static int gdb_handle_packet(GDBState *s len -= reg_size; registers += reg_size; } -kvm_load_registers(env); +kvm_load_registers(s->g_cpu); put_packet(s, "OK"); break; case 'm': -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1 of 3] [PATCH] kvm-userspace: ppc: Add kvm_translate wrapper
# HG changeset patch # User Christian Ehrhardt <[EMAIL PROTECTED]> # Date 1228924564 -3600 # Node ID 38846cef16e56c681da1ddc179e248972c8b2ff9 # Parent 705d874ff7a24484eaa15ed75a748c4e1a70c2ef [PATCH] kvm-userspace: ppc: Add kvm_translate wrapper From: Hollis Blanchard <[EMAIL PROTECTED]> Add kvm_translate() wrapper used to get mmu translations from userspace. Signed-off-by: Hollis Blanchard <[EMAIL PROTECTED]> Signed-off-by: Christian Ehrhardt <[EMAIL PROTECTED]> --- [diffstat] libkvm.c |5 + libkvm.h |2 ++ 2 files changed, 7 insertions(+) [diff] diff --git a/libkvm/libkvm.c b/libkvm/libkvm.c --- a/libkvm/libkvm.c +++ b/libkvm/libkvm.c @@ -987,6 +987,11 @@ int kvm_guest_debug(kvm_context_t kvm, i return ioctl(kvm->vcpu_fd[vcpu], KVM_DEBUG_GUEST, dbg); } +int kvm_translate(kvm_context_t kvm, int vcpu, struct kvm_translation *tr) +{ + return ioctl(kvm->vcpu_fd[vcpu], KVM_TRANSLATE, tr); +} + int kvm_set_signal_mask(kvm_context_t kvm, int vcpu, const sigset_t *sigset) { struct kvm_signal_mask *sigmask; diff --git a/libkvm/libkvm.h b/libkvm/libkvm.h --- a/libkvm/libkvm.h +++ b/libkvm/libkvm.h @@ -639,6 +639,8 @@ int kvm_set_pit(kvm_context_t kvm, struc int kvm_set_pit(kvm_context_t kvm, struct kvm_pit_state *s); #endif +int kvm_translate(kvm_context_t kvm, int vcpu, struct kvm_translation *tr); + #endif #ifdef KVM_CAP_VAPIC -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2 of 3] [PATCH] qemu: ppc: kvm-userspace: KVM PowerPC support for qemu gdbstub
# HG changeset patch # User Christian Ehrhardt <[EMAIL PROTECTED]> # Date 1228989956 -3600 # Node ID 6f228c807ad0b239b7342d2974debfc66418d784 # Parent 38846cef16e56c681da1ddc179e248972c8b2ff9 [PATCH] qemu: ppc: kvm-userspace: KVM PowerPC support for qemu gdbstub From: Hollis Blanchard <[EMAIL PROTECTED]> Add basic KVM PowerPC support to qemu's gdbstub introducing a kvm ppc style mmu implementation that uses the kvm_translate ioctl. This also requires to save the kvm registers prior to the 'm' gdb operations. Signed-off-by: Hollis Blanchard <[EMAIL PROTECTED]> Signed-off-by: Christian Ehrhardt <[EMAIL PROTECTED]> --- [diffstat] gdbstub.c |2 ++ hw/ppc440_bamboo.c |1 + qemu-kvm-powerpc.c | 28 target-ppc/cpu.h|2 ++ target-ppc/helper.c |4 target-ppc/translate_init.c |5 + 6 files changed, 42 insertions(+) [diff] diff --git a/qemu/gdbstub.c b/qemu/gdbstub.c --- a/qemu/gdbstub.c +++ b/qemu/gdbstub.c @@ -1374,6 +1374,7 @@ static int gdb_handle_packet(GDBState *s if (*p == ',') p++; len = strtoull(p, NULL, 16); +kvm_save_registers(s->g_cpu); if (cpu_memory_rw_debug(s->g_cpu, addr, mem_buf, len, 0) != 0) { put_packet (s, "E14"); } else { @@ -1389,6 +1390,7 @@ static int gdb_handle_packet(GDBState *s if (*p == ':') p++; hextomem(mem_buf, p, len); +kvm_save_registers(s->gcpu); if (cpu_memory_rw_debug(s->g_cpu, addr, mem_buf, len, 1) != 0) put_packet(s, "E14"); else diff --git a/qemu/hw/ppc440_bamboo.c b/qemu/hw/ppc440_bamboo.c --- a/qemu/hw/ppc440_bamboo.c +++ b/qemu/hw/ppc440_bamboo.c @@ -99,6 +99,7 @@ void bamboo_init(ram_addr_t ram_size, in fprintf(stderr, "Unable to initialize CPU!\n"); exit(1); } + env->mmu_model = POWERPC_MMU_KVM; /* call init */ printf("Calling function ppc440_init\n"); diff --git a/qemu/qemu-kvm-powerpc.c b/qemu/qemu-kvm-powerpc.c --- a/qemu/qemu-kvm-powerpc.c +++ b/qemu/qemu-kvm-powerpc.c @@ -102,6 +102,7 @@ void kvm_arch_save_regs(CPUState *env) env->spr[SPR_SRR0] = regs.srr0; env->spr[SPR_SRR1] = regs.srr1; +env->spr[SPR_BOOKE_PID] = regs.pid; env->spr[SPR_SPRG0] = regs.sprg0; env->spr[SPR_SPRG1] = regs.sprg1; @@ -219,6 +220,33 @@ int handle_powerpc_dcr_write(int vcpu, u return 0; /* XXX ignore failed DCR ops */ } +int mmukvm_get_physical_address(CPUState *env, mmu_ctx_t *ctx, +target_ulong eaddr, int rw, int access_type) +{ +struct kvm_translation tr; +uint64_t pid; +uint64_t as; +int r; + +pid = env->spr[SPR_BOOKE_PID]; + +if (access_type == ACCESS_CODE) +as = env->msr & msr_ir; +else +as = env->msr & msr_dr; + +tr.linear_address = as << 40 | pid << 32 | eaddr; +r = kvm_translate(kvm_context, env->cpu_index, &tr); +if (r == -1) +return r; + +if (!tr.valid) +return -EFAULT; + +ctx->raddr = tr.physical_address; +return 0; +} + void kvm_arch_cpu_reset(CPUState *env) { } diff --git a/qemu/target-ppc/cpu.h b/qemu/target-ppc/cpu.h --- a/qemu/target-ppc/cpu.h +++ b/qemu/target-ppc/cpu.h @@ -98,6 +98,8 @@ enum powerpc_mmu_t { POWERPC_MMU_BOOKE_FSL = 0x0009, /* PowerPC 601 MMU model (specific BATs format)*/ POWERPC_MMU_601= 0x000A, +/* KVM managing the MMU state */ +POWERPC_MMU_KVM= 0x000B, #if defined(TARGET_PPC64) #define POWERPC_MMU_64 0x0001 /* 64 bits PowerPC MMU */ diff --git a/qemu/target-ppc/helper.c b/qemu/target-ppc/helper.c --- a/qemu/target-ppc/helper.c +++ b/qemu/target-ppc/helper.c @@ -1429,6 +1429,10 @@ int get_physical_address (CPUState *env, fprintf(logfile, "%s\n", __func__); } #endif + +if (env->mmu_model == POWERPC_MMU_KVM) +return mmukvm_get_physical_address(env, ctx, eaddr, rw, access_type); + if ((access_type == ACCESS_CODE && msr_ir == 0) || (access_type != ACCESS_CODE && msr_dr == 0)) { /* No address translation */ diff --git a/qemu/target-ppc/translate_init.c b/qemu/target-ppc/translate_init.c --- a/qemu/target-ppc/translate_init.c +++ b/qemu/target-ppc/translate_init.c @@ -9273,6 +9273,11 @@ int cpu_ppc_register_internal (CPUPPCSta case POWERPC_MMU_601: mmu_model = "PowerPC 601"; break; +#ifdef KVM +case POWERPC_MMU_KVM: +mmu_model = "PowerPC KVM"; +break; +#endif #if defined (TARGET_PPC64) case POWERPC
[PATCH 0 of 3] update gdbstub support
This patch series updates the gdbstub support for kvm. Patch 1&2 introduce basic powerpc support while patch 3 fixes gdbstub generic code that was broken in a qemu merge. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 6 of 6] [PATCH] kvm-userspace: ppc: align with upstream qemu - pcibus
# HG changeset patch # User Christian Ehrhardt <[EMAIL PROTECTED]> # Date 1228922789 -3600 # Node ID 9a7208ca1afab83913ee14c629bf27632ee6326b # Parent 5adc6fbbd4a3b82e1bc8acbcb233c60e89715b61 [PATCH] kvm-userspace: ppc: align with upstream qemu - pcibus From: Christian Ehrhardt <[EMAIL PROTECTED]> ppc initializer now properly use the opaque PCIBus type. This is already changed in all upstream qemu files. This patch changes kvm ppc440 files to use the new type too. Signed-off-by: Christian Ehrhardt <[EMAIL PROTECTED]> --- [diffstat] ppc440.c| 11 ++- ppc440.h|3 ++- ppc440_bamboo.c | 10 +- 3 files changed, 13 insertions(+), 11 deletions(-) [diff] diff --git a/qemu/hw/ppc440.c b/qemu/hw/ppc440.c --- a/qemu/hw/ppc440.c +++ b/qemu/hw/ppc440.c @@ -13,6 +13,7 @@ #include "hw.h" #include "hw/isa.h" #include "ppc440.h" +#include "pci.h" #define PPC440EP_PCI_CONFIG 0xeec0 #define PPC440EP_PCI_INTACK 0xeed0 @@ -29,12 +30,12 @@ void ppc440ep_init(CPUState *env, target_phys_addr_t ram_sizes[PPC440_MAX_RAM_SLOTS], int nbanks, qemu_irq **picp, - ppc4xx_pci_t **pcip, + PCIBus **pcibusp, int do_init) { ppc4xx_mmio_t *mmio; qemu_irq *pic, *irqs; - ppc4xx_pci_t *pci; + PCIBus *pcibus; int i; ppc_dcr_init(env, NULL, NULL); @@ -59,14 +60,14 @@ void ppc440ep_init(CPUState *env, ppc405_sdram_init(env, pic[14], nbanks, ram_bases, ram_sizes, do_init); /* PCI */ - pci = ppc4xx_pci_init(env, pic, + pcibus = ppc4xx_pci_init(env, pic, PPC440EP_PCI_CONFIG, PPC440EP_PCI_INTACK, PPC440EP_PCI_SPECIAL, PPC440EP_PCI_REGS); - if (!pci) + if (!pcibus) printf("couldn't create PCI controller!\n"); - *pcip = pci; + *pcibusp = pcibus; isa_mmio_init(PPC440EP_PCI_IO, PPC440EP_PCI_IOLEN); diff --git a/qemu/hw/ppc440.h b/qemu/hw/ppc440.h --- a/qemu/hw/ppc440.h +++ b/qemu/hw/ppc440.h @@ -20,6 +20,7 @@ #include "sysemu.h" #include "exec-all.h" #include "boards.h" +#include "pci.h" #define PPC440_MAX_RAM_SLOTS 4 @@ -28,7 +29,7 @@ void ppc440ep_init(CPUState *env, target_phys_addr_t ram_sizes[PPC440_MAX_RAM_SLOTS], int nbanks, qemu_irq **picp, - ppc4xx_pci_t **pcip, + PCIBus **pcip, int do_init); #endif diff --git a/qemu/hw/ppc440_bamboo.c b/qemu/hw/ppc440_bamboo.c --- a/qemu/hw/ppc440_bamboo.c +++ b/qemu/hw/ppc440_bamboo.c @@ -37,7 +37,7 @@ void bamboo_init(ram_addr_t ram_size, in target_phys_addr_t ram_sizes[PPC440_MAX_RAM_SLOTS]; NICInfo *nd; qemu_irq *pic; - ppc4xx_pci_t *pci; + PCIBus *pcibus; CPUState *env; uint64_t elf_entry; uint64_t elf_lowaddr; @@ -102,7 +102,7 @@ void bamboo_init(ram_addr_t ram_size, in /* call init */ printf("Calling function ppc440_init\n"); - ppc440ep_init(env, ram_bases, ram_sizes, nbanks, &pic, &pci, 1); + ppc440ep_init(env, ram_bases, ram_sizes, nbanks, &pic, &pcibus, 1); printf("Done calling ppc440_init\n"); /* load kernel with uboot loader */ @@ -197,12 +197,12 @@ void bamboo_init(ram_addr_t ram_size, in env->nip = entry; } - if (pci) { + if (pcibus) { int unit_id = 0; /* Add virtio block devices. */ while ((i = drive_get_index(IF_VIRTIO, 0, unit_id)) != -1) { - virtio_blk_init(pci->bus, 0x1AF4, 0x1001, + virtio_blk_init(pcibus, 0x1AF4, 0x1001, drives_table[i].bdrv); unit_id++; } @@ -212,7 +212,7 @@ void bamboo_init(ram_addr_t ram_size, in nd = &nd_table[i]; if (!nd->model) nd->model = "virtio"; - pci_nic_init(pci->bus, nd, -1); + pci_nic_init(pcibus, nd, -1); } } -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2 of 6] [PATCH] kvm-userspace: ppc: fix configure enabling kvm for ppc
# HG changeset patch # User Christian Ehrhardt <[EMAIL PROTECTED]> # Date 1228922788 -3600 # Node ID f100b1bfa5f3d084d68bd2c66244271db1f5d084 # Parent b41f0d6129f51fb86bf799a5fe7b14a9270eeca4 [PATCH] kvm-userspace: ppc: fix configure enabling kvm for ppc From: Christian Ehrhardt <[EMAIL PROTECTED]> The configure script is missing the target/host cpu sync for powerpc kvm. Signed-off-by: Christian Ehrhardt <[EMAIL PROTECTED]> --- [diffstat] configure |1 + 1 file changed, 1 insertion(+) [diff] diff --git a/qemu/configure b/qemu/configure --- a/qemu/configure +++ b/qemu/configure @@ -1659,6 +1659,7 @@ if [ use_upstream_kvm = yes ]; then # Make sure the target and host cpus are compatible if test "$kvm" = "yes" -a ! \( "$target_cpu" = "$cpu" -o \ + \( "$target_cpu" = "ppcemb" -a "$cpu" = "powerpc" \) -o \ \( "$target_cpu" = "x86_64" -a "$cpu" = "i386" \) -o \ \( "$target_cpu" = "i386" -a "$cpu" = "x86_64" \) \) ; then kvm="no" -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 3 of 6] [PATCH] kvm-userspace: ppc: align with upstream qemu - breakpoint reset
# HG changeset patch # User Christian Ehrhardt <[EMAIL PROTECTED]> # Date 1228922788 -3600 # Node ID c032d8555c9494f9812e4d4e0b5b511ae597 # Parent f100b1bfa5f3d084d68bd2c66244271db1f5d084 [PATCH] kvm-userspace: ppc: align with upstream qemu - breakpoint reset From: Christian Ehrhardt <[EMAIL PROTECTED]> The way resettign that changed upstream, adopt new style in kvmppc code. Signed-off-by: Christian Ehrhardt <[EMAIL PROTECTED]> --- [diffstat] qemu-kvm-powerpc.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) [diff] diff --git a/qemu/qemu-kvm-powerpc.c b/qemu/qemu-kvm-powerpc.c --- a/qemu/qemu-kvm-powerpc.c +++ b/qemu/qemu-kvm-powerpc.c @@ -31,7 +31,7 @@ extern kvm_context_t kvm_context; void cpu_reset(CPUState *env) { - memset(env->breakpoints, 0, sizeof(env->breakpoints)); + memset(env, 0, offsetof(CPUPPCState, breakpoints)); cpu_ppc_reset(env); } -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 4 of 6] [PATCH] kvm-userpace: ppc: align with upstream qemu - 4xxdevs
# HG changeset patch # User Christian Ehrhardt <[EMAIL PROTECTED]> # Date 1228922788 -3600 # Node ID 214485869c303ab81c9da30b6784d641f58585f4 # Parent c032d8555c9494f9812e4d4e0b5b511ae597 [PATCH] kvm-userpace: ppc: align with upstream qemu - 4xxdevs From: Christian Ehrhardt <[EMAIL PROTECTED]> This was mostly moved to qemu/hw/ppc4xx_pci.c and is no more needed in hw/ppc4xx_devs.c Signed-off-by: Christian Ehrhardt <[EMAIL PROTECTED]> --- [diffstat] ppc4xx_devs.c | 371 -- 1 file changed, 371 deletions(-) [diff] diff --git a/qemu/hw/ppc4xx_devs.c b/qemu/hw/ppc4xx_devs.c --- a/qemu/hw/ppc4xx_devs.c +++ b/qemu/hw/ppc4xx_devs.c @@ -2,9 +2,6 @@ * QEMU PowerPC 4xx embedded processors shared devices emulation * * Copyright (c) 2007 Jocelyn Mayer - * - * Copyright 2008 IBM Corp. - * Authors: Hollis Blanchard <[EMAIL PROTECTED]> * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal @@ -29,8 +26,6 @@ #include "ppc4xx.h" #include "sysemu.h" #include "qemu-log.h" -#include "pci.h" -#include "bswap.h" //#define DEBUG_MMIO //#define DEBUG_UNASSIGNED @@ -537,369 +532,3 @@ qemu_irq *ppcuic_init (CPUState *env, qe return qemu_allocate_irqs(&ppcuic_set_irq, uic, UIC_MAX_IRQ); } - - - - -#define PCIC0_CFGADDR 0x0 -#define PCIC0_CFGDATA 0x4 - -#define PCIL0_PMM0LA0x0 -#define PCIL0_PMM0MA0x4 -#define PCIL0_PMM0PCILA 0x8 -#define PCIL0_PMM0PCIHA 0xc -#define PCIL0_PMM1LA0x10 -#define PCIL0_PMM1MA0x14 -#define PCIL0_PMM1PCILA 0x18 -#define PCIL0_PMM1PCIHA 0x1c -#define PCIL0_PMM2LA0x20 -#define PCIL0_PMM2MA0x24 -#define PCIL0_PMM2PCILA 0x28 -#define PCIL0_PMM2PCIHA 0x2c -#define PCIL0_PTM1MS0x30 -#define PCIL0_PTM1LA0x34 -#define PCIL0_PTM2MS0x38 -#define PCIL0_PTM2LA0x3c -#define PCI_REG_SIZE0x40 - -#define PPC44x_PCI_MA_MASK 0xf000 -#define PPC44x_PCI_MA_ENABLE 0x1 - - -static uint32_t pci4xx_cfgaddr_read4(void *opaque, target_phys_addr_t addr) -{ -ppc4xx_pci_t *ppc4xx_pci = opaque; -return cpu_to_le32(ppc4xx_pci->pcic0_cfgaddr); -} - -static CPUReadMemoryFunc *pci4xx_cfgaddr_read[] = { -&pci4xx_cfgaddr_read4, -&pci4xx_cfgaddr_read4, -&pci4xx_cfgaddr_read4, -}; - -static void pci4xx_cfgaddr_write4(void *opaque, target_phys_addr_t addr, - uint32_t value) -{ -ppc4xx_pci_t *ppc4xx_pci = opaque; - -value = le32_to_cpu(value); - -ppc4xx_pci->pcic0_cfgaddr = value & ~0x3; -} - -static CPUWriteMemoryFunc *pci4xx_cfgaddr_write[] = { -&pci4xx_cfgaddr_write4, -&pci4xx_cfgaddr_write4, -&pci4xx_cfgaddr_write4, -}; - -static uint32_t pci4xx_cfgdata_read1(void *opaque, target_phys_addr_t addr) -{ -ppc4xx_pci_t *ppc4xx_pci = opaque; -int offset = addr & 0x3; -uint32_t cfgaddr = ppc4xx_pci->pcic0_cfgaddr; -uint32_t value; - -if (!(cfgaddr & (1<<31))) -return 0x; - -value = pci_data_read(ppc4xx_pci->bus, cfgaddr | offset, 1); - -return value; -} - -static uint32_t pci4xx_cfgdata_read2(void *opaque, target_phys_addr_t addr) -{ -ppc4xx_pci_t *ppc4xx_pci = opaque; -int offset = addr & 0x3; -uint32_t cfgaddr = ppc4xx_pci->pcic0_cfgaddr; -uint32_t value; - -if (!(cfgaddr & (1<<31))) -return 0x; - -value = pci_data_read(ppc4xx_pci->bus, cfgaddr | offset, 2); - -return cpu_to_le16(value); -} - -static uint32_t pci4xx_cfgdata_read4(void *opaque, target_phys_addr_t addr) -{ -ppc4xx_pci_t *ppc4xx_pci = opaque; -int offset = addr & 0x3; -uint32_t cfgaddr = ppc4xx_pci->pcic0_cfgaddr; -uint32_t value; - -if (!(cfgaddr & (1<<31))) -return 0x; - -value = pci_data_read(ppc4xx_pci->bus, cfgaddr | offset, 4); - -return cpu_to_le32(value); -} - -static CPUReadMemoryFunc *pci4xx_cfgdata_read[] = { -&pci4xx_cfgdata_read1, -&pci4xx_cfgdata_read2, -&pci4xx_cfgdata_read4, -}; - -static void pci4xx_cfgdata_write1(void *opaque, target_phys_addr_t addr, - uint32_t value) -{ -ppc4xx_pci_t *ppc4xx_pci = opaque; -int offset = addr & 0x3; - -pci_data_write(ppc4xx_pci->bus, ppc4xx_pci->pcic0_cfgaddr | offset, - value, 1); -} - -static void pci4xx_cfgdata_write2(void *opaque, target_phys_addr_t addr, - uint32_t value) -{ -ppc4xx_pci_t *ppc4xx_pci = opaque; -int offset = addr & 0x3; - -value = le16_to_cpu(value); - -pci_data_write(ppc4xx_pci->bus, ppc4xx_pci->pcic0_cfgaddr | offset, -
[PATCH 5 of 6] [PATCH] kvm-userspace: ppc: use virtio-blk header
# HG changeset patch # User Christian Ehrhardt <[EMAIL PROTECTED]> # Date 1228922789 -3600 # Node ID 5adc6fbbd4a3b82e1bc8acbcb233c60e89715b61 # Parent 214485869c303ab81c9da30b6784d641f58585f4 [PATCH] kvm-userspace: ppc: use virtio-blk header From: Christian Ehrhardt <[EMAIL PROTECTED]> virtio_blk_init now is in a separate header Signed-off-by: Christian Ehrhardt <[EMAIL PROTECTED]> --- [diffstat] ppc440_bamboo.c |1 + 1 file changed, 1 insertion(+) [diff] diff --git a/qemu/hw/ppc440_bamboo.c b/qemu/hw/ppc440_bamboo.c --- a/qemu/hw/ppc440_bamboo.c +++ b/qemu/hw/ppc440_bamboo.c @@ -19,6 +19,7 @@ #include "ppc440.h" #include "qemu-kvm.h" #include "device_tree.h" +#include "virtio-blk.h" #define BINARY_DEVICE_TREE_FILE "bamboo.dtb" -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1 of 6] [PATCH] kvm-userspace: ppc: fix compatfd build decision
# HG changeset patch # User Christian Ehrhardt <[EMAIL PROTECTED]> # Date 1228922788 -3600 # Node ID b41f0d6129f51fb86bf799a5fe7b14a9270eeca4 # Parent 3af3fa5e009e143e1167979e55d547c453661059 [PATCH] kvm-userspace: ppc: fix compatfd build decision From: Christian Ehrhardt <[EMAIL PROTECTED]> qemu-kvm.c uses qemu_eventfd/qemu_signalfd. The code of compatfd takes care if CONFIG_eventfd/CONFIG_signalfd is really enabled. But currently compatfd is not build if --disable-aio is set. This patch lets compatfd.c build if USE_KVM is set to allow qemu-kvm to be linked in all cases (with/without --disable-aio) Signed-off-by: Christian Ehrhardt <[EMAIL PROTECTED]> --- [diffstat] Makefile|2 +- Makefile.target |2 +- 2 files changed, 2 insertions(+), 2 deletions(-) [diff] diff --git a/qemu/Makefile b/qemu/Makefile --- a/qemu/Makefile +++ b/qemu/Makefile @@ -59,7 +59,7 @@ BLOCK_OBJS += block-raw-posix.o BLOCK_OBJS += block-raw-posix.o endif -ifdef CONFIG_AIO +ifeq ($(USE_KVM), 1) BLOCK_OBJS += compatfd.o endif diff --git a/qemu/Makefile.target b/qemu/Makefile.target --- a/qemu/Makefile.target +++ b/qemu/Makefile.target @@ -655,7 +655,7 @@ OBJS+=block-raw-posix.o OBJS+=block-raw-posix.o endif -ifdef CONFIG_AIO +ifeq ($(USE_KVM), 1) OBJS+=compatfd.o endif -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] kvm: work around inability of older kvm modules to destroy memory regions
+drop_mapping(start_addr); +p = &mappings[nr_mappings++]; +p->phys = start_addr; +p->ram = phys_offset; +p->len = size; + return; } @@ -984,8 +1067,11 @@ void kvm_qemu_log_memory(target_phys_addr_t start, target_phys_addr_t size, { if (log) kvm_dirty_pages_log_enable_slot(kvm_context, start, size); -else +else { +if (must_use_aliases_target(start)) +return; kvm_dirty_pages_log_disable_slot(kvm_context, start, size); +} } int kvm_get_phys_ram_page_bitmap(unsigned char *bitmap) @@ -1071,6 +1157,9 @@ void kvm_physical_sync_dirty_bitmap(target_phys_addr_t start_addr, target_phys_a { void *buf; +if (must_use_aliases_source(start_addr)) +return; + buf = qemu_malloc((end_addr - start_addr) / 8 + 2); kvm_get_dirty_pages_range(kvm_context, start_addr, end_addr - start_addr, buf, NULL, kvm_get_dirty_bitmap_cb); @@ -1080,6 +1169,8 @@ void kvm_physical_sync_dirty_bitmap(target_phys_addr_t start_addr, target_phys_a int kvm_log_start(target_phys_addr_t phys_addr, target_phys_addr_t len) { #ifndef TARGET_IA64 +if (must_use_aliases_source(phys_addr)) +return 0; kvm_qemu_log_memory(phys_addr, len, 1); #endif return 0; @@ -1088,6 +1179,8 @@ int kvm_log_start(target_phys_addr_t phys_addr, target_phys_addr_t len) int kvm_log_stop(target_phys_addr_t phys_addr, target_phys_addr_t len) { #ifndef TARGET_IA64 +if (must_use_aliases_source(phys_addr)) +return 0; kvm_qemu_log_memory(phys_addr, len, 0); #endif return 0; -- To unsubscribe from this list: send the line "unsubscribe kvm-commits" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html -- Grüsse / regards, Christian Ehrhardt IBM Linux Technology Center, Open Virtualization diff --git a/qemu/qemu-kvm.c b/qemu/qemu-kvm.c --- a/qemu/qemu-kvm.c +++ b/qemu/qemu-kvm.c @@ -766,7 +766,9 @@ int kvm_qemu_init() return 0; } +#ifdef TARGET_I386 static int destroy_region_works = 0; +#endif int kvm_qemu_create_context(void) { @@ -784,7 +786,9 @@ int kvm_qemu_create_context(void) r = kvm_arch_qemu_create_context(); if(r <0) kvm_qemu_destroy(); +#ifdef TARGET_I386 destroy_region_works = kvm_destroy_memory_region_works(kvm_context); +#endif return 0; } @@ -793,6 +797,7 @@ void kvm_qemu_destroy(void) kvm_finalize(kvm_context); } +#ifdef TARGET_I386 static int must_use_aliases_source(target_phys_addr_t addr) { if (destroy_region_works) @@ -849,6 +854,7 @@ static void drop_mapping(target_phys_add if (p) *p = mappings[--nr_mappings]; } +#endif void kvm_cpu_register_physical_memory(target_phys_addr_t start_addr, unsigned long size, @@ -856,17 +862,21 @@ void kvm_cpu_register_physical_memory(ta { int r = 0; unsigned long area_flags = phys_offset & ~TARGET_PAGE_MASK; +#ifdef TARGET_I386 struct mapping *p; +#endif phys_offset &= ~IO_MEM_ROM; if (area_flags == IO_MEM_UNASSIGNED) { +#ifdef TARGET_I386 if (must_use_aliases_source(start_addr)) { kvm_destroy_memory_alias(kvm_context, start_addr); return; } if (must_use_aliases_target(start_addr)) return; +#endif kvm_unregister_memory_area(kvm_context, start_addr, size); return; } @@ -878,6 +888,7 @@ void kvm_cpu_register_physical_memory(ta if (area_flags >= TLB_MMIO) return; +#ifdef TARGET_I386 if (must_use_aliases_source(start_addr)) { p = find_ram_mapping(phys_offset); if (p) { @@ -886,6 +897,7 @@ void kvm_cpu_register_physical_memory(ta } return; } +#endif r = kvm_register_phys_mem(kvm_context, start_addr, phys_ram_base + phys_offset, @@ -895,11 +907,13 @@ void kvm_cpu_register_physical_memory(ta exit(1); } +#ifdef TARGET_I386 drop_mapping(start_addr); p = &mappings[nr_mappings++]; p->phys = start_addr; p->ram = phys_offset; p->len = size; +#endif return; } @@ -1068,8 +1082,10 @@ void kvm_qemu_log_memory(target_phys_add if (log) kvm_dirty_pages_log_enable_slot(kvm_context, start, size); else { +#ifdef TARGET_I386 if (must_use_aliases_target(start)) return; +#endif kvm_dirty_pages_log_disable_slot(kvm_context, start, size); } } @@ -1157,8 +1173,10 @@ void kvm_physical_sync_dirty_bitmap(targ { void *buf; +#ifdef TARGET_I386 if (must_use_aliases_source(start_addr)) return; +#endif buf = qemu_malloc((end_addr - start_addr) / 8 + 2); kvm_get_dirty_pages_range(kvm_context, start_addr, end_addr - start_addr, @@ -1168,21 +1186,21 @@ void kvm_physical_sync_dirty_bitmap(targ int kvm_log
kvm-userspace requires kvm capable kernel headers in default search path of the compiler
Hi everyone, while running a test when updating kvm-userspace for powerpc I found that the current kvm userspace requires kvm kernel headers in the default search path of the used compilers. I used to update and build in the same kvm-userspace directory for a while and this one had an old stale kernel/include directory which fulfilled all the requirements. While testing my current patch series on a new git clone of kvm-userspace I wondered why this doesn't work anymore (it worked in my old directory which also had the updated current source). I switched to x86 to verify that issue there and I found that eventually the issue is in the kvm detection "KVM Probe" part of the qemu configuration in kvm-userspace. It failed not finding kvm headers. gcc -m32 -o /tmp/qemu-conf--21885- -I/home/paelzer/Desktop/kvm-userspace/kernel/include /tmp/qemu-conf--21885-.c /tmp/qemu-conf--21885-.c:1:23: error: linux/kvm.h: No such file or directory /tmp/qemu-conf--21885-.c:3:2: error: #error Invalid KVM version Looking at the include paths it is worth to note that a current git clone of kvm-userspace has no kernel/include directory anymore. A few questions later to some other kvm developers I found that there headers can be found, but in the default search path of the compiler e.g. /usr/include. In my environment with an older gcc cross compiler for powerpc and no up to date linux headers installed for x86 both architectures failed. The Solution to that can be done in several ways: a) we decide that kvm-userspace needs up-to-date kernel headers installed. And modify the KVM Probe at least to tell the user about this possible reason when failing instead of silently switching to "KVM support no". b) if the user already provide a --kerneldir option to specify where the right includes can be found we should give those configure checks a chance to really reach that. Atm it adds the kernel/include path of the kvm-userspace tree which doesn't exist anymore (except as stale old dir in a lot of source trees out there). c) I overlooked something and there is an even better or trivial approach :-) Since this could be solved several very different ways I think its worth a discussion. As I prefer b) I attached a simple patch example how this could look like :-). -- Grüsse / regards, Christian Ehrhardt IBM Linux Technology Center, Open Virtualization diff --git a/configure b/configure index 63f956c..f772bae 100755 --- a/configure +++ b/configure @@ -3,6 +3,7 @@ prefix=/usr/local kernelsourcedir= kerneldir=/lib/modules/$(uname -r)/build +qemu_kerneldir= cc=gcc ld=ld objcopy=objcopy @@ -56,6 +57,7 @@ while [[ "$1" = -* ]]; do ;; --kerneldir) kerneldir="$arg" + qemu_kerneldir="$arg" ;; --with-patched-kernel) want_module= @@ -84,9 +86,12 @@ while [[ "$1" = -* ]]; do esac done - -#set kenel directory +#set libkvm kernel directory libkvm_kerneldir=$(readlink -f kernel) +# use libkvm_kerneldir for qemu if no kerneldir option was set +if test "$qemu_kerneldir" = "" ; then +qemu_kerneldir=$libkvm_kerneldir +fi case $arch in i?86*|x86_64*) @@ -123,7 +128,7 @@ fi --disable-gcc-check \ --extra-cflags="-I $PWD/../libkvm $qemu_cflags" \ --extra-ldflags="-L $PWD/../libkvm $qemu_ldflags" \ ---kerneldir="$libkvm_kerneldir" \ +--kerneldir="$qemu_kerneldir" \ --prefix="$prefix" \ ${cross_prefix:+"--cross-prefix=$cross_prefix"} \ ${cross_prefix:+"--cpu=$arch"} "[EMAIL PROTECTED]"
Re: [PATCH] [mq]: fix-kvm-init.diff
sorry - this should actually have some description ... I'll resend it with one as soon as I find the issue why it is missing Ehrhardt Christian wrote: diff --git a/qemu/target-ppc/helper.c b/qemu/target-ppc/helper.c --- a/qemu/target-ppc/helper.c +++ b/qemu/target-ppc/helper.c @@ -2959,10 +2959,8 @@ env->cpu_model_str = cpu_model; cpu_ppc_register_internal(env, def); cpu_ppc_reset(env); -#ifdef USE_KVM if (kvm_enabled()) - kvm_init_new_ap(env->cpu_index, env); -#endif + kvm_init_vcpu(env); return env; } -- Grüsse / regards, Christian Ehrhardt IBM Linux Technology Center, Open Virtualization -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0 of 7] kvm-userspace: support multiple processors in the same architecture
Hollis Blanchard wrote: These patches allow the kvmctl bits (including testcases and libcflat) to be built for multiple processor types within the same architecture (e.g. 440 and e500). This is important because PowerPC supervisor mode can contain significant differences between processors (it's user mode that's more or less identical). For example, the data in a TLB entry and how to manipulate the TLB are a major difference between 440 and e500, which is critical here because libcflat must create its own mappings and so must know which method to use. Some of the complexity comes from user/Makefile *not* using the top-level config.mak, so we have to add some of the same logic to both configure scripts to generate both config.mak files. Too much makefile logic depends on ARCH containing only the architecture name, so it was simpler to create and export a separate PROCESSOR variable. -Hollis -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Unfortunately there is too often no "really nice" way to do Makefile magic :-) I know you started with the arch-platform-os-compiler after our discussion, but I like the $PROCESSOR solution for our *powerpc* Makefiles too. And a good catch with that AR usage in patch 7. (full series) Acked-by: Christian Ehrhardt <[EMAIL PROTECTED]> -- Grüsse / regards, Christian Ehrhardt IBM Linux Technology Center, Open Virtualization -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 03 of 10] [PATCH] user: ppc: better error reporting in load_file
Avi Kivity wrote: Ehrhardt Christian wrote: From: Hollis Blanchard <[EMAIL PROTECTED]> Fancy description. Ahem. Sorry that is my patch template description :-/ A proper description header should be: Subject: [PATCH] user: ppc: better error reporting in load_file From: Hollis Blanchard <[EMAIL PROTECTED]> This patch adds a better error reporting for powerpc testcases. It prints the bytes read in load_file so far until an error occured. Signed-off-by: Hollis Blanchard <[EMAIL PROTECTED]> Signed-off-by: Christian Ehrhardt <[EMAIL PROTECTED]> --- [diffstat] main-ppc.c |3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) [diff] -- Grüsse / regards, Christian Ehrhardt IBM Linux Technology Center, Open Virtualization -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/3] kvm-userspace: ppc: userspace fixes for powerpc
Ok I should have send these two series with some minutes in between to not intermix them :-/ Additionally I have the wrong header in this one it should be [0/5] :-/++ Overall it is a patch series of three patches for powerpc kvm-userspace, a five patch series for kvm-suerspace/user/ and a single patch I just submitted while cleaning our userspace repo to get the missing things upstream. Avi, let me know if I confused you and I'll send them once again with some time in between to order them easier. [EMAIL PROTECTED] wrote: From: Christian Ehrhardt <[EMAIL PROTECTED]> This is a set of fixes for the powerpc tests kvm-userspace/user. Patch 1&2 fix main-ppc.c while patch 3 introduces libcflat for powerpc. Further on patch 4 provides a timebase accessor for the ppc testcases (not used yet) and patch 5 finally adds a stub nmi handler to main-ppc.c. [patches in series] [PATCH 1/5] user: ppc: fix threading bugs in main-ppc.c [PATCH 2/5] user: ppc: better error reporting in load_file [PATCH 3/5] user: ppc: implement PowerPC 44x libcflat [PATCH 4/5] libcflat: ppc: add timebase accessor [PATCH 5/5] user: ppc: add stub nmi handler --- [diffstat] b/user/config-powerpc-44x.mak | 14 + b/user/config-powerpc.mak | 46 - b/user/main-ppc.c | 32 +++- b/user/test/lib/powerpc/44x/map.c | 51 + b/user/test/lib/powerpc/44x/timebase.S | 28 ++ b/user/test/lib/powerpc/44x/timebase.h | 25 b/user/test/lib/powerpc/44x/tlbwe.S| 29 ++ b/user/test/lib/powerpc/io.c | 35 ++ b/user/test/powerpc/cstart.S | 38 b/user/test/powerpc/exit.c | 23 ++ user/config-powerpc-44x.mak|3 + user/main-ppc.c|9 + 12 files changed, 296 insertions(+), 37 deletions(-) -- To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html -- Grüsse / regards, Christian Ehrhardt IBM Linux Technology Center, Open Virtualization -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Qemu-devel] [PATCH 2/3] kvm-userspace: kvmppc: fix hostlonbits detection when cross compiling v2
malc wrote: On Tue, 30 Sep 2008, [EMAIL PROTECTED] wrote: From: Christian Ehrhardt <[EMAIL PROTECTED]> *update* further debugging according to some requests revealed that ARCH_CFLAGS does not contain all CFLAGS that might be needed, especially those supplied via extra-cflags. Therefore people supplying things via extra-cflags instead of an environment variable might have had issues. This part i don't get, there are few more checks before/after hostlongbits where no CFLAGS are added to the $cc argument list. What makes hostlongbits selection "special"? Do people specify -m32/-m64 via --extra-cflags? it was there to ensure availability of the needed include paths to reach wordsize.h. But Hollis approach is much simpler, better and more reliable so never mind :-) A recent kvm merge with qemu brought code for 64bit power that broke cross compilation. The issue is caused by configure trying to execute target architecture binaries where configure is executed. Yes, i never thought about cross-compilation, my bad. np, now it's fixed - thanks for quickly applying it. I tried to change that detection so that it works with&without cross compilation with only a small change and especially without an addtional configure command line switch. Including the bits/wordsize.h header a platform usually can check its wordsize and by doing that configure can check the hostlongbits without executing the binary. Instead it now stops after preprocessing stage which resolved the __WORDSIZE constant and retrieves that value. I don't like my new check style, but it is at least less broken than before. Another approach that was suggested was that qemu might end up needing something like asm-offsets in the kernel to manage architecture sizes etc. Comments and other approaches welcome. I think Hollis Blanchard's method is sound, Thank you for bringing this up. -- Grüsse / regards, Christian Ehrhardt IBM Linux Technology Center, Open Virtualization -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0 of 5] PowerPC patches for 2.6.27
On Sunday 27 July 2008 10:50:38 Avi Kivity wrote: > Hollis Blanchard wrote: > > Hi Avi, can these patches go upstream for 2.6.27? There's a bug fix, > > the addition of hardware breakpoint functionality, and three very > > significant performance improvements. > > > > By the way, I will be on vacation for a few weeks starting Monday, but > > Christian Ehrhardt should be able to take care of any technical issues. > > Applied all; thanks. I prefer to only merge bug fixes at this time for > 2.6.27. As far as I can tell, patch 2 is independent of the rest so > I'll queue that. Let me know if that works. Yes these patches are independent - for the type question you can consider 1/5 - feature 2/5 - bug fix 3,4,5/5 - yeah what is it .. neither a pure bug fix nor a feature let's call it "performance fix" We tested most of those performance fixes at least 1.5 Months now, having them applied while developing new things on top and I feel very good with 3&4. It would be nice if at least the performance fixes 3&4 could go in now under the bug fix label too, but it's up to you. -- Grüsse / regards, Christian Ehrhardt IBM Linux Technology Center, Open Virtualization -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 5 of 5] kvm: powerpc: Map guest userspace with TID=0 mappings
On Monday 28 July 2008 12:33:41 Liu Yu wrote: > I have a question that I could not think through. > While multiple qemu/kvm processes are running at the same time, how to > prevent one guest from using others' TLB? For all the guests have the > same TID=0 for userspace and TID=1 for kernel. [...] Hi Yu Liu, thats a good question. Afaik thats solved by the fact that the shadow tlb which is used when entering guest context is per vcpu. Therefor a guest has always it's own shadow tlb active and no mappings to the content of other guests. This patch just allows us that a single guest userspace process accessing the kernel 20 times (and changing privilege level 20 times by doing so) can run without tlb flushes. Guest-userspace context switch (pid is changing) -> tlb flush; and guest switches (guest A -> guest B) -> other shadow tlb active; should still be working fine. > > > > The net is that we don't need to flush the TLB on privilege > > switches, but we do on guest context switches (which are far > > more infrequent). Guest boot time performance improvement: about 30%. > > -- Grüsse / regards, Christian Ehrhardt IBM Linux Technology Center, Open Virtualization -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2/RFC] libkvm-s390
Christian Borntraeger wrote: This is an update patch for libkvm to build and work on s390 [...] Index: kvm-userspace/libkvm/libkvm-s390.c === --- /dev/null +++ kvm-userspace/libkvm/libkvm-s390.c @@ -0,0 +1,137 @@ +/* + * This file contains the s390 specific implementation for the + * architecture dependent functions defined in kvm-common.h and + * libkvm.h + * + * Copyright (C) 2006 Qumranet + * Copyright IBM Corp. 2008 + * + * Authors: + * Carsten Otte <[EMAIL PROTECTED]> + * Christian Borntraeger <[EMAIL PROTECTED]> whitespace error - well just in the comment, but if you resubmit it anyway this space could be removed ;-) [...] + +int handle_dcr(struct kvm_run *run, kvm_context_t kvm, int vcpu) +{ + fprintf(stderr, "%s: Operation not supported\n", __FUNCTION__); + return -1; +} + I think Carsten started with the powerpc file here. This function is powerpc only and even in our code only called later on in this libkvm-$arch.c file. Maybe we should patch our file and make it static to clarify that. Since you don't have that call its superfluous, you should be able to drop it without consequence. [...] Index: kvm-userspace/libkvm/libkvm.c === --- kvm-userspace.orig/libkvm/libkvm.c +++ kvm-userspace/libkvm/libkvm.c @@ -48,6 +48,10 @@ #include "kvm-powerpc.h" #endif +#if defined(__s390__) +#include "kvm-s390.h" +#endif + int kvm_abi = EXPECTED_KVM_API_VERSION; int kvm_page_size; @@ -88,7 +92,11 @@ int get_free_slot(kvm_context_t kvm) if (tss_ext > 0) i = 0; else +#if !defined(__s390__) i = 1; +#else + i = 0; +#endif This looks a bit complicated, but when thinking of the result this just means that i is always 0 on s390. Therefore I think it would look less confusing when the ifdef would not be in the else part of that tss_ext if. Since i will always be zero in s390 case you could do things like - initialize i with 0 and let the whole block until the for loop run in "if !defined(__s390__)" That way you would additionally stop modifying the tss_ext variable which is not used in the s390 case anyway. [...] -- Grüsse / regards, Christian Ehrhardt IBM Linux Technology Center, Open Virtualization -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: kvmtrace?
On Monday 07 July 2008 17:34:13 Jun Koi wrote: > On Mon, Jul 7, 2008 at 11:07 PM, Christian Ehrhardt > > <[EMAIL PROTECTED]> wrote: > > Jun Koi wrote: > >> Hi, > >> > >> Could anybody explain a bit about kvmtrace? I cannot find any doc on > >> it anywhere. > >> > >> - Is there any short usage instruction for kvmtrace? > > > > I don't know if there is an official doc yet, but you can just start with > > the simple version like: > > ./kvmtrace -o outfile -w 5 > > which traces for 5 seconds to outfile.* > > This really surprises me. I expected that we can collect trace for a > specific VM. But it seems that we always trace all the VMs currently > running?? usually you want to trace all to ensure that you can interpret sideeffects. Otherwise you might often urn into wrong interpretation or a good assumption and you would "just" need the data for the other guests to ensure your theory. Anyway if you really think we need it you might code it in the probe function near the head of kvm_add_trace, there is a check for KVM_TRACE_STATE_RUNNING and there you could easily implement some filter mechanism (you'll want to extend the ioctls too to get your mask). But anyway, as I mentioned you usually want all data to "be sure" what you see in your trace. The only scenario where I think we would really need a mask is in very big many guest scenarios, but today I think no one except the s390 guys have enough guests to have that need practically. > With trace feature always ON, I guess we suffer some overhead, right? > Perhaps the overhead is little, but still we have. If so, is there > anyway to turn the trace feature OFF? Well, you can turn it off permanent by disabling CONFIG_KVM_TRACE. But be aware that kvm_trace uses the markers infrastructure (see Documentation/markers.txt) which means that the overhead is really really low as long as you don't actively use kvmtrace (invoked via the userspace tool). And if you use it, well then it should do something ;-) > Many thanks, > Jun > -- > To unsubscribe from this list: send the line "unsubscribe kvm" in > the body of a message to [EMAIL PROTECTED] > More majordomo info at http://vger.kernel.org/majordomo-info.html -- Grüsse / regards, Christian Ehrhardt IBM Linux Technology Center, Open Virtualization -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/5] kvmtrace: make cycle calculation architecture aware
Hollis Blanchard wrote: On Mon, 2008-07-07 at 15:56 +0200, [EMAIL PROTECTED] wrote: From: Christian Ehrhardt <[EMAIL PROTECTED]> The current implementation of kvmtrace uses always a 64 bit cycle variable, but get_cycles() which is used to fill it is "unsigned long" which might be 32 bit. This reduces the accuracy e.g. on embedded powerpc since we would have a 64bit value but get_cycle() only returns the low 32 bit. To solve that this patch introduces kvm_arch_trace_cycles() which allows us to make this calculation architecture aware. That way every architecture can insert whatever fits best for their "kvmtrace cycle counter". Signed-off-by: Christian Ehrhardt <[EMAIL PROTECTED]> "cycles" is a very poor name, because that's not really what we're talking about at all. (Also, that function name made me wonder what a "trace cycle" is. :) I would strongly prefer using "timestamp" instead. It would be nice if we could rename the data structure too, but I'd settle for only properly naming the new architecture function hook. In fact, if we want to be rigorous about it, it should really be something like "nanoseconds" instead, so that userspace wouldn't need to perform awkward conversions of "cycles" or "timebase ticks" to real time. It looks like getnstimeofday() would do the trick, and that way we wouldn't need an arch-specific hook at all I agree that the naming is poor. I also wondered about it and just continued it to the kvm_arch function. I'll change the naming - timestamp would be fine, but only if it is really time and not a cycle counter. I experimented with the tv_nsec portion and the classic current_time and accuracy there was just far to low compared to the time base register data (5 to 20 instructions until tv.nssec portion changed). I checked the getnstimeofday function you mentioned and compared accuracy again and yes that is exactly what I would like to use: here a ittle example 263527418457 (+ 36696)ns=0x1c43286a 263527740677 (+ 30) ns=0x1c4a880c (+ 75FA2 = 483234) 263527779757 (+ 39080)ns=0x1c4b6cae (+ E4A2 = 58530) While the unit is not the same (explains the difference in the same line e.g. 39080 vs. 58530) the accuracy is the same. this can be seen due to the fact that it changes by the same ratio all over the file. For the example I had above cycle = 30/39080 = 8.25 nsec = 483234/58530 = 8.25 That leads me to the point where I completely agree with Hollis that the naming should be timestamp as well as that the function should rely on getnstimeofday() instead of get_cycles() and that way would remove the need kvm_arch_ function. So the question that is left before changing that is, if the original author had something special in mind chosing cycles here. I added Eric on CC for that. I wait with my resubmission of the patch series until all architectures agree *hope* on using getnstimeofday() - after an ack from all sides I would revise my patch series and submit that changes alltogether. -- Grüsse / regards, Christian Ehrhardt IBM Linux Technology Center, Open Virtualization -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/5] kvmtrace: make cycle calculation architecture aware
Hollis Blanchard wrote: On Mon, 2008-07-07 at 15:56 +0200, [EMAIL PROTECTED] wrote: From: Christian Ehrhardt <[EMAIL PROTECTED]> The current implementation of kvmtrace uses always a 64 bit cycle variable, but get_cycles() which is used to fill it is "unsigned long" which might be 32 bit. This reduces the accuracy e.g. on embedded powerpc since we would have a 64bit value but get_cycle() only returns the low 32 bit. To solve that this patch introduces kvm_arch_trace_cycles() which allows us to make this calculation architecture aware. That way every architecture can insert whatever fits best for their "kvmtrace cycle counter". Signed-off-by: Christian Ehrhardt <[EMAIL PROTECTED]> Just one comment below... BTW, because this breaks the ia64 and s390 builds, it might be nice to CC the appropriate list/maintainer directly. you're right - I'll add dummy stubs for ia64&s390 using the classic get_cycle() call and resubmit the series. --- [diffstat] arch/powerpc/kvm/powerpc.c | 25 + arch/x86/kvm/x86.c |5 + include/linux/kvm_host.h |2 ++ virt/kvm/kvm_trace.c |2 +- 4 files changed, 33 insertions(+), 1 deletion(-) [diff] diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c --- a/arch/powerpc/kvm/powerpc.c +++ b/arch/powerpc/kvm/powerpc.c @@ -521,6 +521,31 @@ return r; } +/* + * We need a 64 bit value here and the default get_cycles has only 32bit (tbl) + * on 32bit ppc. Since we have a 64bit counter for that we provide it here in + * full resolution for the trace records. +*/ +__u64 kvm_arch_trace_cycles() +{ + unsigned long ruval; + unsigned long ruval2; + unsigned long rlval; + + /* get a consistant pair of upper/lower timebase (no wrap occured) */ + asm volatile( + "loop:\n" + "mftbu %0\n" + "mftbl %1\n" + "mftbu %2\n" + "cmpw %0, %2\n" + "bne loop" + : "=r" (ruval), "=r" (rlval), "=r" (ruval2) + ); + + return (((__u64)ruval) << 32) | rlval; +} You should use get_tb() here (see asm-powerpc/time.h). yep does the same, I didn't see it and coded the asm from the processor manual hint at the time base register. Anyway - I'll change the code to get_tb() - thanks for the hint. -- Grüsse / regards, Christian Ehrhardt IBM Linux Technology Center, Open Virtualization -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: kvmtrace?
Jun Koi wrote: Hi, Could anybody explain a bit about kvmtrace? I cannot find any doc on it anywhere. - Is there any short usage instruction for kvmtrace? I don't know if there is an official doc yet, but you can just start with the simple version like: ./kvmtrace -o outfile -w 5 which traces for 5 seconds to outfile.* After that you can cat that file and pipe it to kvmtrace_format which converts it to a readable file: cat outfile.kvmtrace.0 | ./kvmtrace_format formats formats is a simple file defining how to present the trace records. - Which kind of data kvmtrace can collect? Due to the structure of the "formats" file (lockated in kvm-userspace/user/formats) this is also a list of the event types currently tracked - it is readable ascii and lists event+additional data reported. Both kvmtrace (c) & kvmtrace_formats (python) are not very big (yet) so if you want to go further just look into the code. While I'm sure that it will grow in complexity in the future it's clearly arranged and easy to read atm. Many thanks, Jun -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html -- Grüsse / regards, Christian Ehrhardt IBM Linux Technology Center, Open Virtualization -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] kvmtrace: fix "Remove use of bit fields in kvm trace structure" patch
[PATCH] kvmppc: kvmtrace: fix "Remove use of bit fields in kvm trace structure" From: Christian Ehrhardt <[EMAIL PROTECTED]> The current patch kvm_trace_nobitfields.diff in the kvmppc.hg-dev repo doesn't compile and has some minor flaws. This patch is known to the kvm list as "[PATCH] [v2] Remove use of bit fields in kvm trace structure" from Jerone. Problem: code: rec.rec_val |= TRACE_REC_EVENT_ID(va_arg(*args, u32)); makro: #define TRACE_REC_EVENT_ID (val) \ (0x0fff & (val)) is extended to: rec.rec_val |= (val) (0x0fff & (val))(__builtin_va_arg(*args,u32)); that is failing with: kvm_trace.c: In function 'kvm_add_trace': kvm_trace.c:66: error: 'val' undeclared (first use in this function) kvm_trace.c:66: error: (Each undeclared identifier is reported only once kvm_trace.c:66: error: for each function it appears in.) Obviously caused by the blank between macro definition and (val) which makes it a non-parameter macro. Additionally the two macro definitions below don't put () around the parameter which is one of the golden macro rules to keep operator precedence independent to what someone passes as argument to your macro. Further I found that part of the path added a superfluous assignment in kvm_trace.c (initialize with =0 and directly or something in with |=). This patch fixes all three issues. As Hollis pointed out yesterday patch #1 of the initial series of 3 patches should already be applied according to your (Avi) response while it is not seen on kvm-commits yet. He offered to collect those patches in his patch queue and send you a batch soon. That said you can consider this patch FYI (to collect comments now) and wait until our consolidated batch arrives including all fixes at once. Signed-off-by: Christian Ehrhardt <[EMAIL PROTECTED]> --- [diffstat] include/linux/kvm.h | 10 +- virt/kvm/kvm_trace.c |4 +--- 2 files changed, 6 insertions(+), 8 deletions(-) [diff] diff --git a/include/linux/kvm.h b/include/linux/kvm.h --- a/include/linux/kvm.h +++ b/include/linux/kvm.h @@ -331,12 +331,12 @@ } u; } __attribute__((packed)); -#define TRACE_REC_EVENT_ID (val) \ +#define TRACE_REC_EVENT_ID(val) \ (0x0fff & (val)) -#define TRACE_REC_NUM_DATA_ARGS (val) \ -(0x7000 & (val << 28)) -#define TRACE_REC_TCS (val) \ -(0x8000 & (val << 31)) +#define TRACE_REC_NUM_DATA_ARGS(val) \ +(0x7000 & ((val) << 28)) +#define TRACE_REC_TCS(val) \ +(0x8000 & ((val) << 31)) #define KVMIO 0xAE diff --git a/virt/kvm/kvm_trace.c b/virt/kvm/kvm_trace.c --- a/virt/kvm/kvm_trace.c +++ b/virt/kvm/kvm_trace.c @@ -60,10 +60,8 @@ if (unlikely(kt->trace_state != KVM_TRACE_STATE_RUNNING)) return; -rec.rec_val = 0; - /* set event id */ -rec.rec_val|= TRACE_REC_EVENT_ID(va_arg(*args, u32)); +rec.rec_val= TRACE_REC_EVENT_ID(va_arg(*args, u32)); vcpu = va_arg(*args, struct kvm_vcpu *); rec.pid= current->tgid; --- Grüsse / regards, Christian Ehrhardt IBM Linux Technology Center, Open Virtualization -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html