Re: Host latency peaks due to kvm-intel
On 07/24/2009 12:41 PM, Jan Kiszka wrote:
> I vaguely recall that someone promised to add a feature reporting
> facility for all those nice things, modern VM-extensions may or may not
> support (something like or even an extension of /proc/cpuinfo). What is
> the state of this plan? Would be specifically interesting for Intel CPUs
> as there seem to be many of them out there with restrictions for special
> use cases - like real-time.

Newer kernels do report some vmx features (like flexpriority) in
/proc/cpuinfo but not all.

--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Host latency peaks due to kvm-intel
Avi Kivity wrote:
> On 07/24/2009 12:41 PM, Jan Kiszka wrote:
>> I vaguely recall that someone promised to add a feature reporting
>> facility for all those nice things, modern VM-extensions may or may not
>> support (something like or even an extension of /proc/cpuinfo). What is
>> the state of this plan? Would be specifically interesting for Intel CPUs
>> as there seem to be many of them out there with restrictions for special
>> use cases - like real-time.
>
> Newer kernels do report some vmx features (like flexpriority) in
> /proc/cpuinfo but not all.

Ah, nice. Then we just need this?

From: Jan Kiszka <jan.kis...@siemens.com>
Subject: [PATCH] x86: Report VMX feature vwbinvd

Not all VMX-capable CPUs support guest exits on wbinvd execution. If this
is not supported, the instruction will run natively on behalf of the
guest. This can cause multi-millisecond latencies to the host which is
very problematic in real-time scenarios.

Report the wbinvd trapping feature along with other VMX feature flags,
calling it 'vwbinvd' ('virtual wbinvd').

Signed-off-by: Jan Kiszka <jan.kis...@siemens.com>
---
 arch/x86/include/asm/cpufeature.h |    1 +
 arch/x86/kernel/cpu/intel.c       |    4 ++++
 2 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index 4a28d22..8647524 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -165,6 +165,7 @@
 #define X86_FEATURE_FLEXPRIORITY (8*32+ 2) /* Intel FlexPriority */
 #define X86_FEATURE_EPT         (8*32+ 3) /* Intel Extended Page Table */
 #define X86_FEATURE_VPID        (8*32+ 4) /* Intel Virtual Processor ID */
+#define X86_FEATURE_VWBINVD     (8*32+ 5) /* Guest Exiting on WBINVD */

 #if defined(__KERNEL__) && !defined(__ASSEMBLY__)

diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index 3260ab0..2d921b0 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -297,6 +297,7 @@ static void __cpuinit detect_vmx_virtcap(struct cpuinfo_x86 *c)
 #define X86_VMX_FEATURE_PROC_CTLS2_VIRT_APIC	0x0001
 #define X86_VMX_FEATURE_PROC_CTLS2_EPT		0x0002
 #define X86_VMX_FEATURE_PROC_CTLS2_VPID		0x0020
+#define X86_VMX_FEATURE_PROC_CTLS2_VWBINVD	0x0040

 	u32 vmx_msr_low, vmx_msr_high, msr_ctl, msr_ctl2;

@@ -305,6 +306,7 @@ static void __cpuinit detect_vmx_virtcap(struct cpuinfo_x86 *c)
 	clear_cpu_cap(c, X86_FEATURE_FLEXPRIORITY);
 	clear_cpu_cap(c, X86_FEATURE_EPT);
 	clear_cpu_cap(c, X86_FEATURE_VPID);
+	clear_cpu_cap(c, X86_FEATURE_VWBINVD);

 	rdmsr(MSR_IA32_VMX_PROCBASED_CTLS, vmx_msr_low, vmx_msr_high);
 	msr_ctl = vmx_msr_high | vmx_msr_low;
@@ -323,6 +325,8 @@ static void __cpuinit detect_vmx_virtcap(struct cpuinfo_x86 *c)
 			set_cpu_cap(c, X86_FEATURE_EPT);
 		if (msr_ctl2 & X86_VMX_FEATURE_PROC_CTLS2_VPID)
 			set_cpu_cap(c, X86_FEATURE_VPID);
+		if (msr_ctl2 & X86_VMX_FEATURE_PROC_CTLS2_VWBINVD)
+			set_cpu_cap(c, X86_FEATURE_VWBINVD);
 	}
 }
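For illustration only: if a patch like the one above were applied, userspace could look for the new flag in the "flags" line of /proc/cpuinfo. The sketch below is an assumption-laden example, not part of the patch. The helper name cpu_flags_contain is made up, and the flag name 'vwbinvd' is the one proposed in this thread, so it may never appear on a real system. The helper only does the string matching on a line the caller has already read.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: return 1 if `flag` appears as a whole,
 * whitespace-separated word in a /proc/cpuinfo "flags" line, else 0.
 * A plain strstr() is not enough, because the name could also occur
 * inside a longer flag name.
 */
int cpu_flags_contain(const char *flags_line, const char *flag)
{
    size_t len;
    const char *p;

    assert(flags_line != NULL && flag != NULL);

    len = strlen(flag);
    if (len == 0)
        return 0;

    p = flags_line;
    while ((p = strstr(p, flag)) != NULL) {
        /* The match must start at the line start or after whitespace. */
        int starts_word = (p == flags_line) || p[-1] == ' ' || p[-1] == '\t';
        /* The match must end at the line end or before whitespace. */
        char end = p[len];
        int ends_word = (end == '\0' || end == ' ' || end == '\t' || end == '\n');

        if (starts_word && ends_word)
            return 1;
        p++;
    }
    return 0;
}
```

A caller would read /proc/cpuinfo line by line, pick a line starting with "flags", and pass it to the helper together with "vwbinvd".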
Re: Host latency peaks due to kvm-intel
On 07/25/2009 12:55 PM, Jan Kiszka wrote:
> Avi Kivity wrote:
>> On 07/24/2009 12:41 PM, Jan Kiszka wrote:
>>> I vaguely recall that someone promised to add a feature reporting
>>> facility for all those nice things, modern VM-extensions may or may not
>>> support (something like or even an extension of /proc/cpuinfo). What is
>>> the state of this plan? Would be specifically interesting for Intel CPUs
>>> as there seem to be many of them out there with restrictions for special
>>> use cases - like real-time.
>>
>> Newer kernels do report some vmx features (like flexpriority) in
>> /proc/cpuinfo but not all.
>
> Ah, nice. Then we just need this?
>
> From: Jan Kiszka <jan.kis...@siemens.com>
> Subject: [PATCH] x86: Report VMX feature vwbinvd
>
> Not all VMX-capable CPUs support guest exits on wbinvd execution. If this
> is not supported, the instruction will run natively on behalf of the
> guest. This can cause multi-millisecond latencies to the host which is
> very problematic in real-time scenarios.
>
> Report the wbinvd trapping feature along with other VMX feature flags,
> calling it 'vwbinvd' ('virtual wbinvd').

What about AMD CPUs that can always trap wbinvd? Do we set the bit, or do
we trust the user to know that it isn't needed on AMD (I suppose the
latter)?

This should go in via tip.git, it isn't really kvm related (except that
kvm should start reading these caps one day instead of querying the
hardware directly).

--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
Re: slow guest performance with build load, looking for ideas
On 07/16/2009 03:20 PM, Jes Sorensen wrote:
> It's really just a standard Intel or Supermicro motherboard in a box
> that has been painted purple (or blue/green now I guess), so it really
> shouldn't have extra numa factors compared to other Nehalem systems.

Have you been transferred to marketing?

--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
Re: Host latency peaks due to kvm-intel
On 07/24/2009 12:41 PM, Jan Kiszka wrote:
> Jan (who is now patching his guest to avoid wbinvd where possible)

Is there ever a case where it is required? What about under a hypervisor
(i.e. check the hypervisor enabled bit).

--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
Re: [PATCH v2 6/6] remove kvm_mmio_read and kvm_mmio_write
On Tue, Jul 21, 2009 at 06:13:12PM -0400, Glauber Costa wrote:
> all they did was to call a qemu function. Call this function instead.
>
> Signed-off-by: Glauber Costa <glom...@redhat.com>
> ---
>  qemu-kvm-x86.c |    7 +--
>  qemu-kvm.c     |   34 --
>  2 files changed, 9 insertions(+), 32 deletions(-)
>
> diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c
> index 350f272..741ae0a 100644
> --- a/qemu-kvm-x86.c
> +++ b/qemu-kvm-x86.c
> @@ -344,7 +344,6 @@ void kvm_show_code(kvm_vcpu_context_t vcpu)
>      unsigned char code;
>      char code_str[SHOW_CODE_LEN * 3 + 1];
>      unsigned long rip;
> -    kvm_context_t kvm = vcpu->kvm;
>
>      r = ioctl(fd, KVM_GET_SREGS, &sregs);
>      if (r == -1) {
> @@ -364,11 +363,7 @@ void kvm_show_code(kvm_vcpu_context_t vcpu)
>      for (n = -back_offset; n < SHOW_CODE_LEN-back_offset; ++n) {
>          if (n == 0)
>              strcat(code_str, " -->");
> -        r = kvm_mmio_read(kvm->opaque, rip + n, &code, 1);
> -        if (r < 0) {
> -            strcat(code_str, " xx");
> -            continue;
> -        }
> +        cpu_physical_memory_rw(rip + n, &code, 1, 0);
>          sprintf(code_str + strlen(code_str), " %02x", code);
>      }
>      fprintf(stderr, "code:%s\n", code_str);
> diff --git a/qemu-kvm.c b/qemu-kvm.c
> index 0724c28..9b1c506 100644
> --- a/qemu-kvm.c
> +++ b/qemu-kvm.c
> @@ -95,18 +95,6 @@ static int kvm_debug(void *opaque, void *data,
>  }
>  #endif
>
> -int kvm_mmio_read(void *opaque, uint64_t addr, uint8_t *data, int len)
> -{
> -    cpu_physical_memory_rw(addr, data, len, 0);
> -    return 0;
> -}
> -
> -int kvm_mmio_write(void *opaque, uint64_t addr, uint8_t *data, int len)
> -{
> -    cpu_physical_memory_rw(addr, data, len, 1);
> -    return 0;
> -}
> -
>  static int handle_unhandled(kvm_context_t kvm, kvm_vcpu_context_t vcpu,
>                              uint64_t reason)
>  {
> @@ -888,23 +876,17 @@ int kvm_set_mpstate(kvm_vcpu_context_t vcpu, struct kvm_mp_state *mp_state)
>  }
>  #endif
>
> -static int handle_mmio(kvm_vcpu_context_t vcpu)
> +static void handle_mmio(kvm_vcpu_context_t vcpu)
>  {
>      unsigned long addr = vcpu->run->mmio.phys_addr;
> -    kvm_context_t kvm = vcpu->kvm;
>      struct kvm_run *kvm_run = vcpu->run;
>      void *data = kvm_run->mmio.data;
>
>      /* hack: Red Hat 7.1 generates these weird accesses. */
>      if ((addr > 0xa0000-4 && addr <= 0xa0000) && kvm_run->mmio.len == 3)
> -        return 0;
> +        return;
>
> -    if (kvm_run->mmio.is_write)
> -        return kvm_mmio_write(kvm->opaque, addr, data,
> -                              kvm_run->mmio.len);
> -    else
> -        return kvm_mmio_read(kvm->opaque, addr, data,
> -                             kvm_run->mmio.len);
> +cpu_physical_memory_rw(addr, data, kvm_run->mmio.len, kvm_run->mmio.is_write);

Glauber,

The indentation now looks horrible.

Applied the kvm_init order patches.
Very high memory usage with KVM
Hi all!

I have an installation of Ubuntu Hardy Heron server amd64 with KVM-62 from
the Ubuntu repositories, running on an HP Proliant DL380 G5 with two Xeon
E5405 quad-core processors and 16 GiB of RAM. It hosts six VMs with the
following memory configuration:

Hostname  | RAM
==========+=======
Ganimedes |  2 GiB
Os        |  1 GiB
Aprender  |  2 GiB
Aps0      |  2 GiB
Aps2      |  4 GiB
Ratatoskr |  4 GiB
==========+=======
TOTAL     | 15 GiB

Initially the host was set up with a 1 GiB swap partition (in addition to
the 1 GiB of RAM left free for the host's own use), but over time this
proved too small and I had to add a 7 GiB LV as extra swap. The host now
has 8 GiB of swap in total, of which only 9% is free at the moment.

Is this memory usage 'normal'?

r...@ss02:~# ps -e --sort -rss -Ho user,start_time,pid,pcpu,pmem,rss,size,vsz,args
USER     START   PID %CPU %MEM     RSS      SZ     VSZ COMMAND
[...]
root     Jul06 27471 52.3 24.4 4023232 4292200 4350296 kvm ratatoskr
root     Jul24  9955  137 23.8 3923620 4308592 4350308 kvm aps2
root     Jul06  8751  5.8  8.3 1368228 2171808 2229888 kvm aps0
root     Jul07  8565  2.7  5.2  862844 2204704 2246416 kvm aprender
root     Apr22  7842  0.6  3.6  600072 2172056 2230136 kvm ganimedes
root     Jul01  7944  0.6  2.0  334860 1119916 1177996 kvm os

r...@ss02:~# free
             total       used       free     shared    buffers     cached
Mem:      16463388   16377844      85544          0     894216      66328
-/+ buffers/cache:   15417300    1046088
Swap:      8319948    7621916     698032

Could updating to KVM-84 or later improve this situation?

Thanks in advance.

Regards,
Daniel
--
Fingerprint: BFB3 08D6 B4D1 31B2 72B9 29CE 6696 BF1B 14E6 1D37
Powered by Debian GNU/Linux Squeeze - Linux user #188.598
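To put the ps listing in this message into one number, the RSS column of the six kvm processes can simply be added up. The snippet below is only a small arithmetic sketch using the values copied from that listing (in KiB); the function and array names are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* RSS column of the six kvm processes from the ps listing, in KiB. */
static const unsigned long kvm_rss_kib[] = {
    4023232, /* ratatoskr */
    3923620, /* aps2 */
    1368228, /* aps0 */
    862844,  /* aprender */
    600072,  /* ganimedes */
    334860,  /* os */
};

/* Sum `n` sizes given in KiB. */
unsigned long sum_kib(const unsigned long *values, size_t n)
{
    unsigned long total = 0;
    size_t i;

    assert(values != NULL || n == 0);

    for (i = 0; i < n; i++)
        total += values[i];
    return total;
}
```

Summing the six entries gives 11112856 KiB, i.e. about 10.6 GiB resident, compared with the 15 GiB configured for the guests and the 7621916 KiB of swap reported as used by free.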
serial port again
Finally, I'm trying to use a serial port in a kvm guest. And as in my
previous attempt, it does not quite work.

qemu-kvm-0.10.5, 2.6.27.latest guest, 2.6.30.latest host.

Running pppd over real serial/modem lines for now. Dialing into the kvm
virtual machine from another location; pppd in the kvm guest acts as the
dial-in server. Authentication succeeds and the 2 pppds start exchanging
options. But at least one option never reaches the dialing-in machine
from the ppp server.

Here are some straces. I'm omitting client-to-server communications,
leaving only server-to-client.

server:
23:18:43 send(3, "<31>Jul 25 23:18:43 pppd[3199]: sent [PAP AuthAck id=0x1 \"\"]"..., 60, MSG_NOSIGNAL) = 60
23:18:43 write(7, "\300#\2\1\0\5\0"..., 7) = 7
client:
23:18:44 read(7, "\300#\2\1\0\5\0"..., 1502) = 7
23:18:44 send(3, "<31>Jul 25 23:18:44 pppd[31168]: rcvd [PAP AuthAck id=0x1 \"\"]"..., 61, MSG_NOSIGNAL) = 61

As you see, the 7 bytes \300#\2\1\0\5\0 were received successfully.

server:
23:18:44 send(3, "<31>Jul 25 23:18:44 pppd[3199]: sent [IPCP ConfReq id=0x1 <addr 192.168.2.17>]"...
23:18:44 write(8, "\200!\1\1\0\n\3\6\300\250\2\21"..., 12) = 12

And this one is never ever received by the client.

server:
23:18:44 send(3, "<31>Jul 25 23:18:44 pppd[3199]: sent [CCP ConfReq id=0x1]"...
23:18:44 write(8, "\200\375\1\1\0\4"..., 6) = 6
client:
23:18:44 read(8, "\200\375\1\1\0\4"..., 1502) = 6
23:18:44 send(3, "<31>Jul 25 23:18:44 pppd[31168]: rcvd [CCP ConfReq id=0x1]"...

And again, this 6-char string gets received ok.

server:
23:18:44 send(3, "<31>Jul 25 23:18:44 pppd[3199]: sent [CCP ConfRej id=0x1 <deflate 15"...
23:18:44 write(8, "\200\375\4\1\0\17\32\4x\0\30\4x\0\25\3/"..., 17) = 17
client:
23:18:44 read(8, "\200\375\4\1\0\17\32\4x\0\30\4x\0\25\3/"..., 1502) = 17
23:18:44 send(3, "<31>Jul 25 23:18:44 pppd[31168]: rcvd [CCP ConfRej id=0x1 <deflate 15"...

Got this 17-char string ok.

The server, after some more exchanges which are all ok, repeats the IP
request:

23:18:47 send(3, "<31>Jul 25 23:18:47 pppd[3199]: sent [IPCP ConfReq id=0x1 <addr 192.168.2.17>]"...
23:18:47 write(8, "\200!\1\1\0\n\3\6\300\250\2\21"..., 12) = 12

And again, this request never reaches the other end, and finally the
client disconnects, timing out while waiting for the addresses to be
agreed upon.

Another thing that never reaches the client:

23:18:44 send(3, "<31>Jul 25 23:18:44 pppd[3199]: sent [IPCP ConfRej id=0x1 <compress VJ 0f 01>]"...
23:18:44 write(8, "\200!\4\1\0\n\2\6\0-\17\1"..., 12) = 12

So, either we have an issue with 12-byte strings sent from kvm to the
host (no 12-byte string actually gets received, but those mentioned are
the only ones sent), or an issue with a certain character (sequence).
But I see all the individual chars in other places, maybe with the only
exception of \6 (ACK).

The exact same config, on the exact same machine but on the host (not in
a guest), has worked for some 10 years already, and still works today.
The client immediately sees the IPCP ConfReq message and the connection
progresses.

That was a long and tiring exercise. Now something simpler: stty outputs
on both host and guest for the same (physical) serial port while pppd is
running in the guest.
GUEST:
speed 57600 baud; rows 0; columns 0; line = 3;
intr = ^C; quit = ^\; erase = ^?; kill = ^U; eof = ^D; eol = <undef>;
eol2 = <undef>; swtch = <undef>; start = ^Q; stop = ^S; susp = ^Z;
rprnt = ^R; werase = ^W; lnext = ^V; flush = ^O; min = 1; time = 0;
-parenb -parodd cs8 hupcl -cstopb cread -clocal crtscts
ignbrk -brkint ignpar -parmrk -inpck -istrip -inlcr -igncr -icrnl -ixon
-ixoff -iuclc -ixany -imaxbel -iutf8
-opost -olcuc -ocrnl -onlcr -onocr -onlret -ofill -ofdel nl0 cr0 tab0 bs0
vt0 ff0
-isig -icanon -iexten -echo -echoe -echok -echonl -noflsh -xcase -tostop
-echoprt -echoctl -echoke

HOST:
speed 57600 baud; rows 0; columns 0; line = 0;
intr = ^C; quit = ^\; erase = ^?; kill = ^U; eof = ^D; eol = <undef>;
eol2 = <undef>; swtch = <undef>; start = ^Q; stop = ^S; susp = ^Z;
rprnt = ^R; werase = ^W; lnext = ^V; flush = ^O; min = 1; time = 0;
-parenb -parodd cs8 hupcl -cstopb cread clocal -crtscts
-ignbrk -brkint -ignpar -parmrk -inpck -istrip -inlcr -igncr -icrnl -ixon
-ixoff -iuclc -ixany -imaxbel -iutf8
opost -olcuc -ocrnl onlcr -onocr -onlret -ofill -ofdel nl0 cr0 tab0 bs0
vt0 ff0
-isig -icanon -iexten -echo echoe echok -echonl -noflsh -xcase -tostop
-echoprt echoctl echoke

wtf. The line discipline is wrong, it looks like. Especially important is
opost (together with onlcr) on the host: it's output processing which
gets enabled while it should not be. The guest correctly turned it off.
Also of interest is echoctl, but it's not that important. For the others,
like clocal/-clocal and crtscts/-crtscts, I can imagine the difference,
since kvm should pass the info to the guest instead of being blocked
itself.

Any comments on all this?

Thanks!

/mjt
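The flags that differ in the two stty listings are ordinary termios bits. As a rough sketch of what "output processing off" means at the API level, the helper below clears OPOST and the basic echo bits in a struct termios. This is only an illustration under stated assumptions: the function name is made up, and it says nothing about how qemu-kvm actually configures the host port. A caller would obtain the settings with tcgetattr(), apply the helper, and write them back with tcsetattr().

```c
#include <assert.h>
#include <stddef.h>
#include <termios.h>

/*
 * Hypothetical helper: switch off output post-processing and echo in
 * an already filled-in termios structure. With OPOST cleared, output
 * translations such as onlcr (NL -> CR NL) no longer apply, so binary
 * data like PPP frames is not rewritten on the way out.
 */
void tty_disable_output_processing(struct termios *t)
{
    assert(t != NULL);

    t->c_oflag &= ~(tcflag_t)OPOST;
    t->c_lflag &= ~(tcflag_t)(ECHO | ECHOE | ECHOK);
}
```

The helper deliberately leaves every other flag alone, so it can be combined with whatever input and control settings the port already has.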
Re: Timeout of network interface with OpenBSD 4.5 VM
Hi Chris.

On Friday, 24 July 2009 08:03:29 -0400, Chris Dukes wrote:
> http://zeniv.linux.org.uk/~pakrat/obsd45
> 3 configs, 3 kernels.
> All work on a Core2 Duo T7300 running 2.6.27-7-generic (From ubuntu
> intrepid) crammed onto a laptop that's mostly running hardy.
> The OpenBSD virtual machines have 256M allocated, e1000 NIC, and vde
> backing it.

Firstly, thank you very much for taking the trouble to test and to put
the files on a site.

I tested the compiled kernels that you provided in the OpenBSD 4.5 VM on
my host, which runs KVM-88 on Ubuntu Hardy Heron server amd64 on an
Athlon 64 X2 3800+ with kernel 2.6.24-19-server. The boot parameters were
the following:

# OpenBSD
$KVM -hda /dev/vm/fugu-disk -m 512 -boot c -net \
nic,vlan=0,macaddr=00:16:3e:00:00:35,model=ne2k_pci -net tap \
-daemonize -vnc :4 -k es -localtime -serial \
telnet:localhost:4001,server,nowait

bsd.kvm3 and bsd.kvm4 boot without problems and the VM has connectivity
with the rest of my LAN. bsd.kvm3l also boots without problems, but the
network interface does not take the static IP assigned to it (none of
the three cases uses DHCP).
With kernel bsd.kvm3l:

fugu:~# ifconfig
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 33204
        priority: 0
        groups: lo
        inet 127.0.0.1 netmask 0xff00
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x2
enc0: flags=0<> mtu 1536
        priority: 0

With kernel bsd.kvm[34]:

fugu:~# ifconfig
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 33204
        priority: 0
        groups: lo
        inet 127.0.0.1 netmask 0xff00
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x3
ne3: flags=8863<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        lladdr 00:16:3e:00:00:35
        priority: 0
        groups: egress
        media: Ethernet 10baseT full-duplex
        inet 10.1.0.12 netmask 0xff00 broadcast 10.1.0.255
        inet6 fe80::216:3eff:fe00:35%ne3 prefixlen 64 scopeid 0x1
enc0: flags=0<> mtu 1536
        priority: 0

I also tried compiling the kernels myself from the configuration files
that you provided. To do so I booted the operating system with a GENERIC
kernel and kvm's --no-acpi option so that the boot completes successfully
(although this way there was no connectivity). Under these circumstances
some of the locally compiled kernels did not work:

KVM4 boots without problems, but gives a 'ne3: device timeout' and the VM
loses connectivity with the LAN even though the network interface is
configured with its static IP.

KVM3 boots OK and doesn't show networking problems.

KVM3L boots OK, but the interface is not configured with its static IP,
as in the previous case with the kernel compiled by you.

I also tried booting the VM with the KVM option model=e1000.
In this case, the kernels compiled locally from KVM3 and KVM3L showed no
problems at boot or with the network configuration. The kernel compiled
locally from KVM4L, however, rebooted after the mtrr line, similar to
what happened with the GENERIC kernel:

mtrr: Pentium Pro MTRR support
uk
OpenBSD/i386 BOOT 3.02
boot
booting hd0a:/bsd: 6039964+1059784 [52+336688+318896]=0x7657ec
entry point at 0x200120

and occasionally the boot finishes with:

mtrr: Pentium Pro MTRR support
uvm_fault(0xd080d9e0, 0x0, 0, 1) -> e
kernel: page fault trap, code=0
kernel: page fault trap, code=0
Skernel: page fault trap, code=0
Skernel: page fault trap, code=0
Skernel: page fault trap, code=0
Skernel: page fault trap, code=0
Sstray interrupt 12
kkernel: page fault trap, code=0
Stopped at kernel: page fault trap, code=0
Stopped at kernel: page fault trap, code=0
Stopped at kernel: page fault trap, code=0
Stopped at kernel: page fault trap, code=0
Stopped at kernel: page fault trap, code=0
Stopped at kernel: page fault trap, code=0
Stopped at kernel: page fault trap, code=0
Stopped at kernel: page fault trap, code=0
Stopped at kernel: page fault trap, code=0
Stopped at kernel: page fault trap, code=0
Stopped at kernel: page fault trap, code=0
Stopped at kernel: page fault trap, code=0
Stopped at kernel: page fault trap, code=0
Stopped at kernel: page fault trap, code=0
Stopped at kernel: page fault trap, code=0
Stopped at kernel: page fault trap, code=0
--db_more--

With the three compiled kernels that you provided, I had no problems of
any kind during boot or with networking using model=e1000, although the
differences described above caught my attention. Could this difference
be related to the hardware that I am using?

CPU: Athlon 64 X2 3800+
Motherboard: M2N32 SLI
Network interface: 2 x nVidia MCP55

I don't have much experience compiling OpenBSD kernels, but the procedure
that I used was the following:

* Download
NMI Injection to Guest
Hi list,

I'm trying to extend OProfile to support guest profiling. One step of my
work is to push an NMI to the guest(s) when a performance counter
overflows. Please correct me if the following is not correct:

counter overflow --> NMI to host --> VM exit --> int $2 to handle NMI on
host --> ... --> VM entry --> NMI to guest

On the path between VM exit and VM entry, I want to push an NMI to the
guest. I tried to put the following code on that path, but never
succeeded. Various weird things happened, such as KVM hangs, guest kernel
oopses, and host hangs. I tried the code with both Linux 2.6.30 and
version 88.

if (vmx_nmi_allowed()) {
	vmx_inject_nmi();
}

Any suggestions? Where is the right place to push an NMI and what are the
necessary checks?

Thanks,
Jiaqing
[PATCH 1/5] Fix kvmppc build error
like this:
/home/liuyu/git/qemu.git/target-ppc/kvm_ppc.c: In function 'kvmppc_read_host_property':
/home/liuyu/git/qemu.git/target-ppc/kvm_ppc.c:55: error: label 'out' defined but not used

Signed-off-by: Liu Yu <yu@freescale.com>
---
 target-ppc/kvm_ppc.c |    1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/target-ppc/kvm_ppc.c b/target-ppc/kvm_ppc.c
index 10cfdb3..be47469 100644
--- a/target-ppc/kvm_ppc.c
+++ b/target-ppc/kvm_ppc.c
@@ -52,7 +52,6 @@ close:
     fclose(f);
 free:
     free(path);
-out:
     return ret;
 }

--
1.5.4

--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 0/5]
The whole patchset includes:
patch 1: fix kvmppc build error
patch 2: fix kvmppc init error
patch 3~5: add kvmppc guest debug support

The guest debug support still has some problems I haven't solved.

1. gdb 'next' command uses software breakpoints

Software breakpoints are implemented by modifying the guest's code. In
most cases this works well, but when used by 'next' it easily causes
trouble on powerpc booke.

For example, booke has a code template for jumping to and returning from
interrupt handlers:

	bl transfer
	.long handler_addr
	.long ret_addr

The call to transfer never returns; instead, the transfer assembly code
reads handler_addr and ultimately calls the handler. gdb doesn't know
that and treats it as a normal function call, so it puts a software
breakpoint instruction at the handler_addr word in order to trap there
when transfer returns. The guest then reads the software breakpoint
instruction as handler_addr and jumps there.

I'm not sure whether x86 suffers from this kind of issue. Is there any
way to avoid this?

2. gdb 'watch' command

Jan told me gdb 6.8 can issue hardware watchpoint requests via the
'watch' command. My gdb is 6.8.50.20080821-cvs and our toolchain provider
confirms that it supports hardware watchpoints. However, when I use
'watch', I can only see single steps from the gdbstub side. Did I miss
anything?
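The first problem can be reproduced in miniature without any hardware. The sketch below is only a toy model under stated assumptions: guest memory is a small array of 32-bit words, TRAP_INSN is a placeholder opcode (not the real KVM_INST_GUESTGDB value), and all function names are made up. It shows how planting a breakpoint on the word after the bl replaces the handler address that transfer later reads as data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder trap opcode; an assumption for this sketch only. */
#define TRAP_INSN 0x7fe00008u

struct sw_breakpoint {
    size_t index;   /* word index that was patched */
    uint32_t saved; /* original word, restored on removal */
};

/* Plant a software breakpoint by overwriting the word at mem[index]. */
void insert_sw_breakpoint(uint32_t *mem, struct sw_breakpoint *bp, size_t index)
{
    assert(mem != NULL && bp != NULL);
    bp->index = index;
    bp->saved = mem[index];
    mem[index] = TRAP_INSN;
}

/* Remove the breakpoint again by restoring the saved word. */
void remove_sw_breakpoint(uint32_t *mem, const struct sw_breakpoint *bp)
{
    assert(mem != NULL && bp != NULL);
    mem[bp->index] = bp->saved;
}

/*
 * Stand-in for `transfer`: it treats the word following the `bl` as
 * data (the handler address) instead of returning to it.
 */
uint32_t transfer_reads_handler(const uint32_t *mem, size_t bl_index)
{
    assert(mem != NULL);
    return mem[bl_index + 1];
}
```

With a three-word template (bl, handler address, return address), a debugger that assumes the bl returns will patch word 1. transfer_reads_handler() then yields the trap opcode instead of the handler address, which is the wild jump described in point 1.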
[PATCH 4/5] Add eaddr translator for fsl_booke mmu
Signed-off-by: Liu Yu <yu@freescale.com>
---
 target-ppc/helper.c |   17 +++--
 1 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/target-ppc/helper.c b/target-ppc/helper.c
index b7162df..f4af124 100644
--- a/target-ppc/helper.c
+++ b/target-ppc/helper.c
@@ -22,6 +22,7 @@
 #include <string.h>
 #include <inttypes.h>
 #include <signal.h>
+#include <linux/kvm.h>

 #include "cpu.h"
 #include "exec-all.h"
@@ -1325,8 +1326,20 @@ static always_inline int check_physical (CPUState *env, mmu_ctx_t *ctx,
         cpu_abort(env, "MPC8xx MMU model is not implemented\n");
         break;
     case POWERPC_MMU_BOOKE_FSL:
-        /* XXX: TODO */
-        cpu_abort(env, "BookE FSL MMU model not implemented\n");
+        if (kvm_enabled()) {
+            struct kvm_translation tr;
+
+            /* For now we only debug guest kernel */
+            tr.linear_address = eaddr;
+            ret = kvm_vcpu_ioctl(env, KVM_TRANSLATE, &tr);
+            if (ret < 0)
+                return ret;
+
+            ctx->raddr = tr.physical_address;
+        } else {
+            /* XXX: TODO */
+            cpu_abort(env, "BookE FSL MMU model not implemented\n");
+        }
         break;
     default:
         cpu_abort(env, "Unknown or invalid MMU model\n");
--
1.5.4
[PATCH 5/5] guest debug init for 440 and e500 core
e500 only supports 2 hardware breakpoints, 440(BOOKE) supports 4.

Signed-off-by: Liu Yu <yu@freescale.com>
---
 hw/ppc440_bamboo.c     |    1 +
 hw/ppce500_mpc8544ds.c |    1 +
 target-ppc/kvm_ppc.h   |    1 +
 3 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/hw/ppc440_bamboo.c b/hw/ppc440_bamboo.c
index f1ba130..8c9c3b6 100644
--- a/hw/ppc440_bamboo.c
+++ b/hw/ppc440_bamboo.c
@@ -185,6 +185,7 @@ static void bamboo_init(ram_addr_t ram_size,
     if (kvm_enabled()) {
         kvm_arch_put_registers(env);
         kvmppc_init();
+        kvmppc_debug_init(4, 2);
     }
 }

diff --git a/hw/ppce500_mpc8544ds.c b/hw/ppce500_mpc8544ds.c
index f1b3c1a..6c2aa61 100644
--- a/hw/ppce500_mpc8544ds.c
+++ b/hw/ppce500_mpc8544ds.c
@@ -279,6 +279,7 @@ static void mpc8544ds_init(ram_addr_t ram_size,
     if (kvm_enabled()) {
         kvm_arch_put_registers(env);
         kvmppc_init();
+        kvmppc_debug_init(2, 2); /* E500v2 doesn't support IAC3,IAC4 */
     }

     return;
diff --git a/target-ppc/kvm_ppc.h b/target-ppc/kvm_ppc.h
index 3792ef7..8b4edca 100644
--- a/target-ppc/kvm_ppc.h
+++ b/target-ppc/kvm_ppc.h
@@ -13,5 +13,6 @@ void kvmppc_init(void);
 void kvmppc_fdt_update(void *fdt);
 int kvmppc_read_host_property(const char *node_path, const char *prop,
                               void *val, size_t len);
+void kvmppc_debug_init(int max_hw_bp, int max_hw_wp);

 #endif /* __KVM_PPC_H__ */
--
1.5.4
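How per-core limits such as "4 breakpoints, 2 watchpoints" might be enforced can be sketched independently of QEMU. The code below is an assumption-based illustration, not the body of kvmppc_debug_init() from this series: the struct, the function names, and the choice of -ENOBUFS as the error value are all made up for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical bookkeeping for hardware debug resources of one core. */
struct hw_debug_state {
    int max_bp; /* hardware breakpoints the core offers */
    int max_wp; /* hardware watchpoints the core offers */
    int nb_bp;  /* breakpoints currently in use */
    int nb_wp;  /* watchpoints currently in use */
};

/* Record the per-core limits, e.g. (4, 2) for 440 or (2, 2) for e500. */
void hw_debug_init(struct hw_debug_state *s, int max_bp, int max_wp)
{
    assert(s != NULL && max_bp >= 0 && max_wp >= 0);
    s->max_bp = max_bp;
    s->max_wp = max_wp;
    s->nb_bp = 0;
    s->nb_wp = 0;
}

/* Claim one hardware breakpoint; fails once the core's limit is reached. */
int hw_debug_reserve_breakpoint(struct hw_debug_state *s)
{
    assert(s != NULL);
    if (s->nb_bp >= s->max_bp)
        return -ENOBUFS;
    s->nb_bp++;
    return 0;
}

/* Claim one hardware watchpoint; fails once the core's limit is reached. */
int hw_debug_reserve_watchpoint(struct hw_debug_state *s)
{
    assert(s != NULL);
    if (s->nb_wp >= s->max_wp)
        return -ENOBUFS;
    s->nb_wp++;
    return 0;
}
```

With the e500 limits from the patch, a third breakpoint request would be refused while the two watchpoint slots remain available.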
[PATCH] Single step hack for guest debug
As booke doesn't have hardware support for virtualization, the hardware
cannot tell guest from host. So when hardware single step is enabled for
the guest, it cannot be disabled in time if the guest faults on a certain
instruction and then exits. Thus, we'll see a single step interrupt
happen at the very beginning of the guest exit path.

We then need to recognize this kind of single step interrupt and fix the
exit_nr to the correct value.

Signed-off-by: Liu Yu <yu@freescale.com>
---
 arch/powerpc/kvm/booke.c            |   82 +++
 arch/powerpc/kvm/booke_interrupts.S |    9 ++--
 2 files changed, 87 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 7f47003..b042265 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -24,6 +24,7 @@
 #include <linux/module.h>
 #include <linux/vmalloc.h>
 #include <linux/fs.h>
+#include <linux/highmem.h>

 #include <asm/cputable.h>
 #include <asm/uaccess.h>
@@ -34,6 +35,8 @@
 #include "booke.h"

 unsigned long kvmppc_booke_handlers;
+unsigned long kvmppc_booke_handler_addr[16];
+#define handler_vector_num (sizeof(kvmppc_booke_handler_addr)/sizeof(kvmppc_booke_handler_addr[0]))

 #define VM_STAT(x) offsetof(struct kvm, stat.x), KVM_STAT_VM
 #define VCPU_STAT(x) offsetof(struct kvm_vcpu, stat.x), KVM_STAT_VCPU
@@ -176,6 +179,80 @@ void kvmppc_core_deliver_interrupts(struct kvm_vcpu *vcpu)
 	}
 }

+int kvmppc_read_guest(struct kvm_vcpu *vcpu, unsigned long geaddr,
+		      void *data, int len)
+{
+	int gtlb_index;
+	gpa_t gpa;
+	gfn_t gfn;
+	struct page *page;
+	void *headdr, *from;
+
+	/* Check the guest TLB. */
+	gtlb_index = kvmppc_mmu_itlb_index(vcpu, geaddr);
+	if (gtlb_index < 0)
+		return -EFAULT;
+
+	gpa = kvmppc_mmu_xlate(vcpu, gtlb_index, geaddr);
+	gfn = gpa >> PAGE_SHIFT;
+
+	page = gfn_to_page(vcpu->kvm, gfn);
+	if (page == bad_page)
+		return -EFAULT;
+
+	headdr = kmap_atomic(page, KM_USER0);
+	if (!headdr)
+		return -EFAULT;
+	from = headdr + (geaddr & (PAGE_SIZE - 1));
+	memcpy(data, from, len);
+	kunmap_atomic(headdr, KM_USER0);
+
+	return 0;
+}
+
+static unsigned int kvmppc_guest_debug_exit_fixup(struct kvm_vcpu *vcpu,
+						  unsigned int exit_nr)
+{
+	unsigned int ret = exit_nr;
+
+	u32 csrr0 = mfspr(SPRN_CSRR0);
+	u32 dbsr = mfspr(SPRN_DBSR);
+
+	if ((dbsr | DBSR_IC) &&
+	    csrr0 >= kvmppc_booke_handlers &&
+	    csrr0 < kvmppc_booke_handlers + (PAGE_SIZE << VCPU_SIZE_ORDER)) {
+		int i = 0;
+
+		for (i = 0; i < handler_vector_num; i++) {
+			if (kvmppc_booke_handler_addr[i] &&
+			    csrr0 == kvmppc_booke_handler_addr[i] + 4) {
+				mtspr(SPRN_DBSR, ~0);
+				ret = i;
+				break;
+			}
+		}
+
+	}
+
+	switch (ret) {
+	case BOOKE_INTERRUPT_DEBUG:
+	case BOOKE_INTERRUPT_ITLB_MISS:
+	case BOOKE_INTERRUPT_EXTERNAL:
+	case BOOKE_INTERRUPT_DECREMENTER:
+		break;
+
+	case BOOKE_INTERRUPT_PROGRAM:
+	case BOOKE_INTERRUPT_DTLB_MISS:
+		/* Need to save the last instruction */
+		kvmppc_read_guest(vcpu, vcpu->arch.pc, &vcpu->arch.last_inst, 4);
+		break;
+	default:
+		printk("Unhandled debug after interrupt:%d\n", ret);
+	}
+
+	return ret;
+}
+
 /**
  * kvmppc_handle_exit
  *
@@ -195,6 +272,9 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
 	run->exit_reason = KVM_EXIT_UNKNOWN;
 	run->ready_for_interrupt_injection = 1;

+	if (exit_nr == BOOKE_INTERRUPT_DEBUG)
+		exit_nr = kvmppc_guest_debug_exit_fixup(vcpu, exit_nr);
+
 	switch (exit_nr) {
 	case BOOKE_INTERRUPT_MACHINE_CHECK:
 		printk("MACHINE CHECK: %lx\n", mfspr(SPRN_MCSR));
@@ -693,6 +773,8 @@ int kvmppc_booke_init(void)
 		memcpy((void *)kvmppc_booke_handlers + ivor[i],
 		       kvmppc_handlers_start + i * kvmppc_handler_len,
 		       kvmppc_handler_len);
+		kvmppc_booke_handler_addr[i] =
+			(void *)kvmppc_booke_handlers + ivor[i];
 	}
 	flush_icache_range(kvmppc_booke_handlers,
 	                   kvmppc_booke_handlers + max_ivor + kvmppc_handler_len);
diff --git a/arch/powerpc/kvm/booke_interrupts.S b/arch/powerpc/kvm/booke_interrupts.S
index d0c6f84..45ff93f 100644
--- a/arch/powerpc/kvm/booke_interrupts.S
+++ b/arch/powerpc/kvm/booke_interrupts.S
@@ -42,16 +42,17 @@
 #define HOST_STACK_LR (HOST_STACK_SIZE + 4) /* In caller stack frame. */

 #define NEED_INST_MASK ((1<<BOOKE_INTERRUPT_PROGRAM) | \
-
Re: [PATCH 2/5] Fix booke registers init
Liu Yu wrote:
> Commit 8d2ba1fb9c8e7006e10d71fa51a020977f14c8b0 introduces a new reset
> order, so we have to synchronize registers explicitly.
>
> Signed-off-by: Liu Yu <yu.liu-kzfg59tc24xl57midrc...@public.gmane.org>
> ---
>  hw/ppc440_bamboo.c     |    4 +++-
>  hw/ppce500_mpc8544ds.c |    4 +++-
>  2 files changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/hw/ppc440_bamboo.c b/hw/ppc440_bamboo.c
> index d9ef3ec..f1ba130 100644
> --- a/hw/ppc440_bamboo.c
> +++ b/hw/ppc440_bamboo.c
> @@ -182,8 +182,10 @@ static void bamboo_init(ram_addr_t ram_size,
>          /* XXX we currently depend on KVM to create some initial TLB entries. */
>      }
>
> -    if (kvm_enabled())
> +    if (kvm_enabled()) {
> +        kvm_arch_put_registers(env);
>          kvmppc_init();
> +    }
>  }
>
>  static QEMUMachine bamboo_machine = {
> diff --git a/hw/ppce500_mpc8544ds.c b/hw/ppce500_mpc8544ds.c
> index c0e367d..f1b3c1a 100644
> --- a/hw/ppce500_mpc8544ds.c
> +++ b/hw/ppce500_mpc8544ds.c
> @@ -276,8 +276,10 @@ static void mpc8544ds_init(ram_addr_t ram_size,
>          /* XXX we currently depend on KVM to create some initial TLB entries. */
>      }
>
> -    if (kvm_enabled())
> +    if (kvm_enabled()) {
> +        kvm_arch_put_registers(env);
>          kvmppc_init();
> +    }
>
>      return;
>  }

These are required when loading a device tree and, thus, changing some
registers after cpu_init, right? Then please add
cpu_synchronize_state(env, 1) to the corresponding code blocks instead of
this explicit, kvm-specific loading.

Jan
Re: [PATCH 3/5] Add guest debug support for kvmppc
Liu Yu wrote:
> Signed-off-by: Liu Yu <yu.liu-kzfg59tc24xl57midrc...@public.gmane.org>
> ---
>  target-ppc/kvm.c |  197 ++
>  1 files changed, 197 insertions(+), 0 deletions(-)
>
> diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
> index b53d6e9..d8dbdb4 100644
> --- a/target-ppc/kvm.c
> +++ b/target-ppc/kvm.c
> @@ -8,6 +8,9 @@
>   *  Christian Ehrhardt <ehrhardt-23vcf4htsmix0ybbhkvfkdbpr1lh4...@public.gmane.org>
>   *  Hollis Blanchard <hollisb-r/jw6+rmf7hqt0dzr+a...@public.gmane.org>
>   *
> + * Copyright (C) 2009 Freescale Semiconductor, Inc. All rights reserved.
> + *  Yu Liu <yu.liu-kzfg59tc24xl57midrc...@public.gmane.org>
> + *
>   * This work is licensed under the terms of the GNU GPL, version 2 or later.
>   * See the COPYING file in the top-level directory.
>   *
> @@ -18,6 +21,7 @@
>  #include <sys/mman.h>
>
>  #include <linux/kvm.h>
> +#include <asm/kvm_asm.h>
>
>  #include "qemu-common.h"
>  #include "qemu-timer.h"
> @@ -26,6 +30,7 @@
>  #include "kvm_ppc.h"
>  #include "cpu.h"
>  #include "device_tree.h"
> +#include "gdbstub.h"
>
>  //#define DEBUG_KVM
>
> @@ -216,3 +221,195 @@ int kvm_arch_handle_exit(CPUState *env, struct kvm_run *run)
>
>      return ret;
>  }
> +#ifdef KVM_CAP_SET_GUEST_DEBUG
> +int kvm_arch_insert_sw_breakpoint(CPUState *env, struct kvm_sw_breakpoint *bp)
> +{
> +    uint32_t sc = tswap32(KVM_INST_GUESTGDB);
> +    uint32_t tmp;
> +
> +    if (cpu_memory_rw_debug(env, bp->pc, (uint8_t *)&bp->saved_insn, 4, 0) ||
> +        cpu_memory_rw_debug(env, bp->pc, (uint8_t *)&sc, 4, 1))
> +        return -EINVAL;
> +    cpu_memory_rw_debug(env, bp->pc, (uint8_t *)&tmp, 4, 0);
> +    return 0;
> +}
> +
> +int kvm_arch_remove_sw_breakpoint(CPUState *env, struct kvm_sw_breakpoint *bp)
> +{
> +    uint32_t sc;
> +
> +    if (cpu_memory_rw_debug(env, bp->pc, (uint8_t *)&sc, 4, 0) ||
> +        sc != tswap32(KVM_INST_GUESTGDB) ||
> +        cpu_memory_rw_debug(env, bp->pc, (uint8_t *)&bp->saved_insn, 4, 1))
> +        return -EINVAL;
> +    return 0;
> +}
> +
> +static struct {
> +    target_ulong addr;
> +    int type;
> +} hw_breakpoint[6];
> +
> +static int nb_hw_breakpoint;
> +static int nb_hw_watchpoint;
> +static int max_hw_breakpoint;
> +static int max_hw_watchpoint;
> +
> +void kvmppc_debug_init(int max_hw_bp, int max_hw_wp)
> +{
> +    max_hw_breakpoint = max_hw_bp > 4? 4 : max_hw_bp;
> +    max_hw_watchpoint = max_hw_wp > 2? 2 : max_hw_wp;
> +}
> +
> +static int find_hw_breakpoint(target_ulong addr, int type)
> +{
> +    int n;
> +
> +    for (n = 0; n < nb_hw_breakpoint + nb_hw_watchpoint; n++)
> +        if (hw_breakpoint[n].addr == addr && hw_breakpoint[n].type == type)
> +            return n;
> +    return -1;
> +}
> +
> +int kvm_arch_insert_hw_breakpoint(target_ulong addr,
> +                                  target_ulong len, int type)
> +{
> +    hw_breakpoint[nb_hw_breakpoint + nb_hw_watchpoint].addr = addr;
> +    hw_breakpoint[nb_hw_breakpoint + nb_hw_watchpoint].type = type;
> +
> +    switch (type) {
> +    case GDB_BREAKPOINT_HW:
> +        if (nb_hw_breakpoint >= max_hw_breakpoint)
> +            return -ENOBUFS;
> +
> +        if (find_hw_breakpoint(addr, type) >= 0)
> +            return -EEXIST;
> +
> +        nb_hw_breakpoint++;
> +        break;
> +
> +    case GDB_WATCHPOINT_WRITE:
> +    case GDB_WATCHPOINT_ACCESS:
> +        if (nb_hw_watchpoint >= max_hw_watchpoint)
> +            return -ENOBUFS;
> +
> +        if (find_hw_breakpoint(addr, type) >= 0)
> +            return -EEXIST;
> +
> +        nb_hw_watchpoint++;
> +        break;
> +
> +    default:
> +        return -ENOSYS;
> +    }
> +
> +    return 0;
> +}
> +
> +int kvm_arch_remove_hw_breakpoint(target_ulong addr,
> +                                  target_ulong len, int type)
> +{
> +    int n;
> +
> +    n = find_hw_breakpoint(addr, type);
> +    if (n < 0)
> +        return -ENOENT;
> +
> +    switch (type) {
> +    case GDB_BREAKPOINT_HW:
> +        nb_hw_breakpoint--;
> +        break;
> +
> +    case GDB_WATCHPOINT_WRITE:
> +    case GDB_WATCHPOINT_ACCESS:
> +        nb_hw_watchpoint--;
> +        break;
> +
> +    default:
> +        return -ENOSYS;
> +    }
> +    hw_breakpoint[n] = hw_breakpoint[nb_hw_breakpoint + nb_hw_watchpoint];
> +
> +    return 0;
> +}
> +
> +void kvm_arch_remove_all_hw_breakpoints(void)
> +{
> +    nb_hw_breakpoint = nb_hw_watchpoint = 0;
> +}
> +
> +static CPUWatchpoint hw_watchpoint;
> +
> +int kvm_arch_debug(struct kvm_debug_exit_arch *arch_info)
> +{
> +    int handle = 0;
> +    int n;
> +
> +    if (cpu_single_env->singlestep_enabled) {
> +        handle = 1;
> +
> +    } else if (arch_info->status) {
> +        if (arch_info->status == KVMPPC_DEBUG_BREAKPOINT) {
> +            n = find_hw_breakpoint(arch_info->pc, GDB_BREAKPOINT_HW);
> +            if (n >= 0)
> +                handle = 1;
> +
> +        } else if (arch_info->status == KVMPPC_DEBUG_WATCH_ACCESS) {
> +            n = find_hw_breakpoint(arch_info->pc, GDB_WATCHPOINT_ACCESS);
> +            if (n >= 0) {
> +                handle = 1;
> +                cpu_single_env->watchpoint_hit = &hw_watchpoint;
> +                hw_watchpoint.vaddr =
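
The hardware breakpoint bookkeeping in the quoted patch (a dense six-slot table, a lookup by address and type, and removal by moving the last used slot into the hole) can be looked at in isolation. The following is only a standalone sketch under stated assumptions, not the QEMU code: `struct bp_table`, the `bp_*` functions, and the `BP_HW`/`WP_*` constants are hypothetical stand-ins for the patch's globals and the `GDB_*` type codes. One deliberate difference: the slot is written only after the limit and duplicate checks pass, since the patch as posted stores the new entry first, which with four breakpoints and two watchpoints already set appears to write at index 6 of a six-element array.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for the GDB_* type codes used by the patch. */
enum bp_type { BP_HW = 1, WP_WRITE = 2, WP_ACCESS = 4 };

#define MAX_SLOTS 6 /* 4 breakpoints + 2 watchpoints, as in the patch */

struct bp_slot {
    unsigned long addr;
    int type;
};

struct bp_table {
    struct bp_slot slot[MAX_SLOTS];
    int nb_bp, nb_wp;   /* slots currently in use, by kind */
    int max_bp, max_wp; /* per-core limits */
};

static int clamp_max(int v, int hi) { return v > hi ? hi : v; }

/* Mirrors kvmppc_debug_init(): clamp the limits to 4 and 2. */
void bp_table_init(struct bp_table *t, int max_bp, int max_wp)
{
    t->nb_bp = t->nb_wp = 0;
    t->max_bp = clamp_max(max_bp, 4);
    t->max_wp = clamp_max(max_wp, 2);
}

/* Linear search over the used slots; returns the index or -1. */
int bp_find(const struct bp_table *t, unsigned long addr, int type)
{
    for (int n = 0; n < t->nb_bp + t->nb_wp; n++)
        if (t->slot[n].addr == addr && t->slot[n].type == type)
            return n;
    return -1;
}

int bp_insert(struct bp_table *t, unsigned long addr, int type)
{
    int *count, limit;

    switch (type) {
    case BP_HW:
        count = &t->nb_bp;
        limit = t->max_bp;
        break;
    case WP_WRITE:
    case WP_ACCESS:
        count = &t->nb_wp;
        limit = t->max_wp;
        break;
    default:
        return -ENOSYS;
    }
    if (*count >= limit)
        return -ENOBUFS;
    if (bp_find(t, addr, type) >= 0)
        return -EEXIST;

    /* Only now touch the table: the free index is always < MAX_SLOTS. */
    int n = t->nb_bp + t->nb_wp;
    assert(n < MAX_SLOTS);
    t->slot[n].addr = addr;
    t->slot[n].type = type;
    (*count)++;
    return 0;
}

int bp_remove(struct bp_table *t, unsigned long addr, int type)
{
    int n = bp_find(t, addr, type);

    if (n < 0)
        return -ENOENT;
    if (type == BP_HW)
        t->nb_bp--;
    else
        t->nb_wp--;
    /* Fill the hole with the last used slot to keep the table dense. */
    t->slot[n] = t->slot[t->nb_bp + t->nb_wp];
    return 0;
}
```

Because removal moves the last entry into the freed slot, slot indices are not stable across removals; callers should always look entries up by address and type, as the patch does.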
Re: [PATCH 5/5] guest debug init for 440 and e500 core
Liu Yu wrote:
> e500 only supports 2 hardware breakpoints, 440 (BOOKE) supports 4.
>
> Signed-off-by: Liu Yu <yu.liu-kzfg59tc24xl57midrc...@public.gmane.org>
> ---
>  hw/ppc440_bamboo.c     |    1 +
>  hw/ppce500_mpc8544ds.c |    1 +
>  target-ppc/kvm_ppc.h   |    1 +
>  3 files changed, 3 insertions(+), 0 deletions(-)
>
> diff --git a/hw/ppc440_bamboo.c b/hw/ppc440_bamboo.c
> index f1ba130..8c9c3b6 100644
> --- a/hw/ppc440_bamboo.c
> +++ b/hw/ppc440_bamboo.c
> @@ -185,6 +185,7 @@ static void bamboo_init(ram_addr_t ram_size,
>      if (kvm_enabled()) {
>          kvm_arch_put_registers(env);
>          kvmppc_init();
> +        kvmppc_debug_init(4, 2);
>      }
>  }
>
> diff --git a/hw/ppce500_mpc8544ds.c b/hw/ppce500_mpc8544ds.c
> index f1b3c1a..6c2aa61 100644
> --- a/hw/ppce500_mpc8544ds.c
> +++ b/hw/ppce500_mpc8544ds.c
> @@ -279,6 +279,7 @@ static void mpc8544ds_init(ram_addr_t ram_size,
>      if (kvm_enabled()) {
>          kvm_arch_put_registers(env);
>          kvmppc_init();
> +        kvmppc_debug_init(2, 2); /* E500v2 doesn't support IAC3,IAC4 */

I think those two are better moved to kvm_arch_init_vcpu.

>      }
>
>      return;
> diff --git a/target-ppc/kvm_ppc.h b/target-ppc/kvm_ppc.h
> index 3792ef7..8b4edca 100644
> --- a/target-ppc/kvm_ppc.h
> +++ b/target-ppc/kvm_ppc.h
> @@ -13,5 +13,6 @@ void kvmppc_init(void);
>  void kvmppc_fdt_update(void *fdt);
>  int kvmppc_read_host_property(const char *node_path, const char *prop,
>                                void *val, size_t len);
> +void kvmppc_debug_init(int max_hw_bp, int max_hw_wp);
>
>  #endif /* __KVM_PPC_H__ */

Jan
Re: [PATCH 0/5]
Liu Yu wrote:
> The whole patchset includes:
> patch 1: fix kvmppc build error
> patch 2: fix kvmppc init error
> patch 3~5: add kvmppc guest debug support
>
> Guest debug still has some problems I haven't solved.
>
> 1. gdb 'next' command uses software breakpoints
>
> A software breakpoint is implemented by modifying the guest's code.
> In most cases this works well, but when used by 'next' it easily
> causes trouble on powerpc booke.
>
> For example, booke has a code template for jumping to and returning
> from interrupt handlers:
>
>     bl transfer
>     .long handler_addr
>     .long ret_addr
>
> The call to transfer never returns; instead, the transfer assembly
> code reads handler_addr and ultimately calls the handler.
> Gdb doesn't know that and treats it as a normal function call, so it
> puts a software breakpoint instruction at the location of handler_addr
> in order to trap there on return from transfer.
> The guest then reads the software breakpoint instruction as
> handler_addr and jumps there.
>
> I'm not sure if x86 suffers from this kind of issue.

It would if it had such a pattern.

> Is there any way to avoid this?

Unless there is a mechanism via the debug info of a binary to tell gdb about this, I think one can only avoid it by not using next here.

> 2. gdb 'watch' command
>
> Jan told me gdb 6.8 can issue hardware watchpoint requests via the
> 'watch' command. My gdb is 6.8.50.20080821-cvs, and our toolchain
> provider confirms that it supports hardware watchpoints.
> However, when I use 'watch', I only see single steps on the gdbstub
> side. Did I miss anything?

Did you install a watchpoint on a symbol? If yes, try whether placing one on an absolute address changes the picture.

Frankly, I haven't understood gdb's logic for selecting soft or hard watchpoints so far. Soft watchpoints are what you saw: single-step the program, checking after each step whether the watched variable has changed. In theory it should be clear when to use which. But in practice the choice appears to be non-deterministic, at least with the versions we recently tried on x86.
Jan
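
To make the 'next' problem from point 1 concrete, here is a tiny standalone model of it. It is only an illustration under stated assumptions: guest memory is a plain array of 32-bit words, `FAKE_BL_INSN` and `FAKE_BREAK_INSN` are placeholder values (the real breakpoint opcode in the patch is `KVM_INST_GUESTGDB`), and the function names are hypothetical. It shows that planting a breakpoint at the word following the call overwrites the inline `handler_addr` data that `transfer` later reads.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FAKE_BL_INSN    0x48000001u /* placeholder for "bl transfer" */
#define FAKE_BREAK_INSN 0xdeadbeefu /* placeholder for the breakpoint opcode */

/* What a debugger remembers about one planted software breakpoint. */
struct sw_breakpoint {
    size_t   index;      /* word index that was patched */
    uint32_t saved_word; /* original contents of that word */
};

/*
 * What 'next' does on a call instruction: assume the callee returns to
 * the following word and plant a breakpoint there.
 */
struct sw_breakpoint step_over_call(uint32_t *mem, size_t call_index)
{
    struct sw_breakpoint bp = { call_index + 1, mem[call_index + 1] };

    mem[bp.index] = FAKE_BREAK_INSN;
    return bp;
}

/* Restore the original word once the breakpoint is no longer needed. */
void remove_breakpoint(uint32_t *mem, struct sw_breakpoint bp)
{
    mem[bp.index] = bp.saved_word;
}

/*
 * What 'transfer' does: it never returns to the word after the call,
 * but reads that word as data, namely the handler address to jump to.
 */
uint32_t transfer_fetch_handler(const uint32_t *mem, size_t call_index)
{
    return mem[call_index + 1];
}
```

With memory laid out as `{ FAKE_BL_INSN, handler_addr, ret_addr }`, calling `transfer_fetch_handler()` after `step_over_call()` yields the breakpoint opcode instead of the handler address, which is the bogus jump target described above; after `remove_breakpoint()` the original address is back.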