Re: [Qemu-devel] DIP and SMD Xtal supplier / GMK
Dear Sir or Madam, We, Great Marking (HK) Industry Co., Limited (i.e. GMK), supply quartz crystals, oscillators, ceramic resonators and filters with good quality and very competitive price. Hope to be a partner of your company! E-catalog will be provided if needed. FREE SAMPLES can be sent on request. Email me or just call me directly. Thank you! Best regards, Frank Sales representative Tel: 86-755-27667101 Fax: 86-755-27667590 E-mail: sale...@greatmarking.com Website: www.greatmarking.com
Re: [Qemu-devel] Improve QEMU performance with LLVM codegen and other techniques
Hi Peter, 1. cpu_unlink_tb (exec.c) This function is broken even for pure TCG -- we know it has a race condition. As I said on IRC, I think that the right thing to do is to start by overhauling the current TCG code so that it is: (a) properly multithreaded (b) race condition free (c) well documented (d) clean code Then you have a firm foundation you can use as a basis for the LLVM integration (and in the course of doing this overhaul you'll have figured out enough of how the current code works to be clear about where hooks for invalidating your traces need to go). I must say I totally agree with you on overhauling the current TCG code. But my boss might have no such patient on this. ;) If there is a plan out there, I'll be very happy to join in. I read the thread talking about the broken tb_unlink [1], and I'm surprised that tb_unlink is broken even under single-threaded mode and system mode. You mentioned (b) could be the IO thread in [1]. I think we don't enable IO thread in system mode right now. My concern is if I spot _all_ place/situation that I need to break the link between block and trace. The big problem is debugging. Yes. In this sort of hotspot based design it's very easy to end up with bugs that are intermittent or painful to reproduce and where you have very little clue about which version of the code for which address ended up misgenerated (since timing issues mean that what code is recompiled and when it is inserted will vary from run to run). Being able to conveniently get rid of some of this nondeterminism is vital for tracking down what actually goes wrong. Misgenerated code might not be an issue now since we have tested our framework in LLVM-only mode. I think the problem still is about the link/unlink stuff. The first problem I have while lowering the threshold is the broken one generate a few traces (2, actually) that a work one doesn't. When boot the linux image downloaded from the QEMU website, the system hangs on the booting process (see attach if you're interested). Simply put, the system hangs after printing ..TIMER: vector=0x31 apic1=0 pin1=2 apic2=-1 pin2=-1 which turns out should be function check_timer (arch/i386/kernel/io_apic.c). I am not a Linux kernel expert and have no idea how to solve this. The culprit traces beginning with 0xc0b8 and 0xc0d7. Here is their corresponding guest binary. IN: 0xc0b8: add0xc04fa798,%eax 0xc0be: mov(%eax),%eax 0xc0c0: ret IN: 0xc0d7: mov$0x108,%eax 0xc0dc: call 0xc0b8 I compile the linux kernel with debug info and without inline function, then objdump vmlinux to see what the source code might be. I guess because linux-0.2.img has other stuff besides vmlinux (kernel image), the addresses above can only be used as an approximation or even useless. I only find one spot having the same code sequence (I believe) as 0xc0b8 but can't find the other one so far. See below, static inline unsigned int readl(const volatile void __iomem *addr) { return *(volatile unsigned int __force *) addr; c0214a90: 03 05 44 56 4f c0 add0xc04f5644,%eax c0214a96: 8b 00 mov(%eax),%eax #define FSEC_TO_USEC (10UL) int hpet_readl(unsigned long a) { return readl(hpet_virt_address + a); } c0214a98: c3 ret This is the whole story so far. :-) Any comment are welcome! [1] http://lists.gnu.org/archive/html/qemu-devel/2011-11/msg02447.html Regards, chenwj -- Wei-Ren Chen (陳韋任) Computer Systems Lab, Institute of Information Science, Academia Sinica, Taiwan (R.O.C.) Tel:886-2-2788-3799 #1667 Homepage: http://people.cs.nctu.edu.tw/~chenwj
Re: [Qemu-devel] [PATCH 1.0] 9pfs: improve portability to older systems
On 11/30/2011 10:27 PM, Erik Rull wrote: Erik, if you can test on your Debian 4.0 installation, that would be nice. Thanks! I will do so. Can you provide me a snapshot (.tgz) of the git? I have heard that there are possibilities to create the tgz over some web interfaces of git, but I didn't find them. I have only very restricted web access. Please help me for this time and / or provide me a short description how to do it. I will then try to give you feedback asap. You can simply test 1.0 when it comes out. Testing that it compiles would be 99% of the work. :) Paolo
Re: [Qemu-devel] [PATCH v2 0/2] net: fix some problems with dump backend
On Wed, Nov 30, 2011 at 09:35:36PM +0100, Hervé Poussineau wrote: This small patchset fixes some problems I've found while using the dump backend. At least first patch may be considered for 1.0. Changes v1-v2: - Update patches descriptions Reviewed-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
Re: [Qemu-devel] Improve QEMU performance with LLVM codegen and other techniques
On Thu, Dec 01, 2011 at 11:50:24AM +0800, 陳韋任 wrote: I don't see any better approach to debugging this than the one you're already taking. Try to run as many workloads as you can and see if they break :). Oh and always make the optimization optional, so that you can narrow it down to it and know you didn't hit a generic QEMU bug. You mean make the trace optimization optional? We have tested our framework in LLVM-only mode. which means we replace TCG with LLVM entirely. It's _very_ slow but works. It would be interesting to use an optimized interpreter instead of TCG, then go to LLVM for hot traces. This is more HotSpot-like with the idea being that the interpreter runs through initialization and rarely executed code without a translation overhead. For the hot paths LLVM kicks in and high-quality translated code is executed. Stefan
Re: [Qemu-devel] [PATCH 01/18] qom: add new dynamic property infrastructure based on Visitors
On Wed, Nov 30, 2011 at 03:03:31PM -0600, Anthony Liguori wrote: +/** + * @DevicePropertyEtter - called when trying to get/set a property An established term for this is an accessor. I've never heard etter before and it looks like a typo on first sight ;). +/** + * @qdev_property_set - writes a property to a device + * + * @dev - the device + * + * @v - the visitor that will used to write the property value. This should be s/will used/will be used/
Re: [Qemu-devel] [PATCH 02/18] qom: register legacy properties as new style properties
On Wed, Nov 30, 2011 at 03:03:32PM -0600, Anthony Liguori wrote: +static void qdev_set_legacy_property(DeviceState *dev, Visitor *v, void *opaque, + const char *name, Error **errp) +{ +Property *prop = opaque; + +if (dev-state != DEV_STATE_CREATED) { +error_set(errp, QERR_PERMISSION_DENIED); +return; +} + +if (prop-info-parse) { +Error *local_err = NULL; +char *ptr = NULL; + +visit_type_str(v, ptr, name, local_err); +if (!local_err) { +int ret; +ret = prop-info-parse(dev, prop, ptr); +if (ret != 0) { +error_set(errp, QERR_INVALID_PARAMETER_VALUE, + name, prop-info-name); +} +g_free(ptr); +} else { +error_propagate(errp, local_err); +} I noticed something subtle about Error** here. Your code is correct but I (incorrectly) wanted to eliminate local_err and use errp directly: if (prop-info-parse) { char *ptr = NULL; visit_type_str(v, ptr, name, errp); if (!error_is_set(errp)) { int ret; ret = prop-info-parse(dev, prop, ptr); if (ret != 0) { error_set(errp, QERR_INVALID_PARAMETER_VALUE, name, prop-info-name); } g_free(ptr); } } else { ... Turns out you cannot do this because error_is_set(errp) will be false if the caller passed in a NULL errp. That means we wouldn't detect the error from visit_type_str()! So it's not okay to depend on the caller's errp. We always need to juggle around a local_err with propagation :(. Just wanted to share.
Re: [Qemu-devel] Improve QEMU performance with LLVM codegen and other techniques
Misgenerated code might not be an issue now since we have tested our framework in LLVM-only mode. I think the problem still is about the link/unlink stuff. The first problem I have while lowering the threshold is the broken one generate a few traces (2, actually) that a work one doesn't. When boot the linux image downloaded from the QEMU website, the system hangs on the booting process (see attach if you're interested). Simply put, the system hangs after printing There's no attachment in this mail. I can try to help you resolving it if you provide more information. ..TIMER: vector=0x31 apic1=0 pin1=2 apic2=-1 pin2=-1 which turns out should be function check_timer (arch/i386/kernel/io_apic.c). I If it hangs inside QEMU itself then you may try to backport commit 4f61927a41a098d06e642ffdea5fc285dc3a0e6b that fixes infinite loop caused by hpet interrupt probing. -- Thanks. -- Max
Re: [Qemu-devel] Improve QEMU performance with LLVM codegen and other techniques
On 1 December 2011 07:46, Stefan Hajnoczi stefa...@gmail.com wrote: It would be interesting to use an optimized interpreter instead of TCG, then go to LLVM for hot traces. This is more HotSpot-like with the idea being that the interpreter runs through initialization and rarely executed code without a translation overhead. For the hot paths LLVM kicks in and high-quality translated code is executed. Might the recently-added TCI be a suitable starting point for this? Alex
Re: [Qemu-devel] [PATCH] exec.c: Fix subpage memory access to RAM MemoryRegion
On 11/30/2011 06:22 PM, Andreas Färber wrote: The problem is not with the 0xfff00 region, that one works fine with 12 and 16 bit pages so far. What I'm seeing is a crash of the very first TB at PC 0x02010, which is in a 4K RAM region from 0x0 on, with 16 bit pages. Also happens with v2. Executing out of subpage RAM is probably unique enough to have its own set of weird and wonderful bugs. Since it's the very first TB, you could simply add tons of instrumentation and just follow the guest code and see what it does. It's similar to my original problem of being unable to read the reset vector, where I couldn't use the regular memory read functions but had to use rom_ptr(), as Peter pointed out in target-arm/helper.c:cpu_reset(). Apparently I'll need to add some check for handling ROM, but on what condition and where? Don't really understand why loading ELF code into a RAM area creates these weird ROM areas... Is this a legacy thing, as it's not shown in info mtree? Everything should be seen in 'info mtree', at least for everything that's been converted. You could try the memory/master branch where this is true. Did you add roms yourself? (qemu) info roms addr= size=0x02 mem=rom name=phdr #0: /home/andreas/MCU/RL78/test addr=0004 size=0x7c mem=rom name=phdr #1: /home/andreas/MCU/RL78/test addr=00c0 size=0x04 mem=rom name=phdr #2: /home/andreas/MCU/RL78/test addr=00d8 size=0x04 mem=rom name=phdr #3: /home/andreas/MCU/RL78/test addr=00dc size=0x00026a mem=rom name=phdr #4: /home/andreas/MCU/RL78/test addr=2000 size=0x0004e4 mem=rom name=phdr #5: /home/andreas/MCU/RL78/test (qemu) cont Bad ram pointer 0x2010 Program received signal SIGABRT, Aborted. [Switching to Thread 0x75504700 (LWP 6484)] 0x7617ad95 in raise () from /lib64/libc.so.6 (gdb) bt #0 0x7617ad95 in raise () from /lib64/libc.so.6 #1 0x7617c2ab in abort () from /lib64/libc.so.6 #2 0x77f76928 in qemu_ram_addr_from_host_nofail (ptr=0x2010) at /home/andreas/QEMU/qemu-rl78/exec.c:3248 #3 qemu_ram_addr_from_host_nofail (ptr=0x2010) at /home/andreas/QEMU/qemu-rl78/exec.c:3242 #4 0x77f7036d in get_page_addr_code (addr=8208, env1=0x78b04010) This function tries to translate a virtual guest address of a page into a ram address, but that doesn't work for subpage. if (pd IO_MEM_ROM !(pd IO_MEM_ROMD)) { #if defined(TARGET_ALPHA) || defined(TARGET_MIPS) || defined(TARGET_SPARC) cpu_unassigned_access(env1, addr, 0, 1, 0, 4); #else cpu_abort(env1, Trying to execute code outside RAM or ROM at 0x TARGET_FMT_lx \n, addr); #endif You're clearly in RAM or ROM, but the subpage code confuses qemu. It's not going to be easy to unconfuse it. -- error compiling committee.c: too many arguments to function
Re: [Qemu-devel] [PATCH v2] exec.c: Fix subpage memory access to RAM MemoryRegion
On 11/30/2011 05:26 PM, Andreas Färber wrote: Commit 95c318f5e1f88d7e5bcc6deac17330fd4806a2d3 (Fix segfault in mmio subpage handling code.) prevented a segfault by making all subpage registrations over an existing memory page perform an unassigned access. Symptoms were writes not taking effect and reads returning zero. Very small page sizes are not currently supported either, so subpage memory areas cannot fully be avoided. Therefore change the previous fix to use a new IO_MEM_SUBPAGE_RAM instead of IO_MEM_UNASSIGNED. Suggested by Avi. Looks reasonable. Should go into 1.1. Should we backport it to 1.0.blah? From 95c318f's description, it doesn't happen in normal circumstances. -- error compiling committee.c: too many arguments to function
Re: [Qemu-devel] Improve QEMU performance with LLVM codegen and other techniques
Hi, Stefan It would be interesting to use an optimized interpreter instead of TCG, then go to LLVM for hot traces. This is more HotSpot-like with the idea being that the interpreter runs through initialization and rarely executed code without a translation overhead. For the hot paths LLVM kicks in and high-quality translated code is executed. Not sure if it's doable. I can only tell you we rely on QEMU frontend to disassemble guest binary into TCG IR, then translate TCG IR into LLVM IR. And talk about the translation overhead, the time QEMU frontend spend is negligible. Regards, chenwj -- Wei-Ren Chen (陳韋任) Computer Systems Lab, Institute of Information Science, Academia Sinica, Taiwan (R.O.C.) Tel:886-2-2788-3799 #1667 Homepage: http://people.cs.nctu.edu.tw/~chenwj
Re: [Qemu-devel] KVM call minutes for November 29
On 11/29/2011 09:10 PM, Markus Armbruster wrote: Avi Kivity a...@redhat.com writes: On 11/29/2011 05:51 PM, Juan Quintela wrote: How to do high level stuff? - python? One of the disadvantages of the various scripting languages is the lack of static type checking, which makes it harder to do full sweeps of the source for API changes, relying on the compiler to catch type (or other) errors. On the other hand, the statically typed languages usually have more boilerplate. Since one of the goals is to simplify things, this indicates the need for a language with type inference. On the third hand, languages with type inferences are still immature (golang?), so we probably need to keep this discussion going until an obvious choice presents itself. I wouldn't call ML immature. But I wouldn't call it a scripting language, either. It was just off the radar for me. We should consider it, by all means. -- error compiling committee.c: too many arguments to function
Re: [Qemu-devel] [PATCH v2] exec.c: Fix subpage memory access to RAM MemoryRegion
On Thu, Dec 01, 2011 at 11:29:48AM +0200, Avi Kivity wrote: On 11/30/2011 05:26 PM, Andreas Färber wrote: Commit 95c318f5e1f88d7e5bcc6deac17330fd4806a2d3 (Fix segfault in mmio subpage handling code.) prevented a segfault by making all subpage registrations over an existing memory page perform an unassigned access. Symptoms were writes not taking effect and reads returning zero. Very small page sizes are not currently supported either, so subpage memory areas cannot fully be avoided. Therefore change the previous fix to use a new IO_MEM_SUBPAGE_RAM instead of IO_MEM_UNASSIGNED. Suggested by Avi. Looks reasonable. Should go into 1.1. Should we backport it to 1.0.blah? From 95c318f's description, it doesn't happen in normal circumstances. To reproduce that I mappped subpage PCI bar over RAM IIRC. For KVM the memory in a subpage will not be accessible even with this fix since memory slots have page granularity and KVM can't execute code from MMIO (yet?). Andreas do you have real scenario where this fix is needed? -- Gleb.
Re: [Qemu-devel] Improve QEMU performance with LLVM codegen and other techniques
There's no attachment in this mail. I can try to help you resolving it if you provide more information. Sorry about that, see the attachment please. What kind of information you want to know? If your code is available online I can try it myself, the question is where is it hosted then. If not, then link to kernel binary and qemu exec trace would help me to start. ?..TIMER: vector=0x31 apic1=0 pin1=2 apic2=-1 pin2=-1 which turns out should be function check_timer (arch/i386/kernel/io_apic.c). I If it hangs inside QEMU itself then you may try to backport commit 4f61927a41a098d06e642ffdea5fc285dc3a0e6b that fixes infinite loop caused by hpet interrupt probing. I don't understand. What it hangs inside QEMU itself supposed to mean? QEMU doesn't execute guest code doing something for itself vs. QEMU executes guest code in loop checking for something that doesn't happen. I'm talking about the first case. They may be distinguished from e.g. guest debugger connected to QEMU's gdbstub -- in the former case it cannot break guest execution by ^C. -- Thanks. -- Max
Re: [Qemu-devel] [PATCH v2] exec.c: Fix subpage memory access to RAM MemoryRegion
On 12/01/2011 11:37 AM, Gleb Natapov wrote: Looks reasonable. Should go into 1.1. Should we backport it to 1.0.blah? From 95c318f's description, it doesn't happen in normal circumstances. To reproduce that I mappped subpage PCI bar over RAM IIRC. In qemu 1.0, you can no longer do that (the pci bridge will not let the BAR override the RAM). -- error compiling committee.c: too many arguments to function
Re: [Qemu-devel] Boot from disk problem
On Fri, Nov 25, 2011 at 5:01 PM, Andreas Färber afaer...@suse.de wrote: Hi, Am 25.11.2011 16:23, schrieb Ignacio Geli: Im trying to run a solaris 2.5.1 vm acording the instructions of i found in this blog: http://tyom.blogspot.com/2009/12/solaris-under-qemu-how-to.html#uds-search-results compile my qemu with: git clone git://git.savannah.nongnu.org/qemu.git cd qemu ./configure --target-list=sparc-softmmu make (version 0.13.50) That 2009 blog is outdated: QEMU now lives in git://git.qemu.org/qemu.git and should be at 0.15.93 (1.0-rc3). Compare http://qemu.org for details. CC'ed sparc maintainer and Artyom. Thanks Andreas, I'll update the entry. For some reason gmail filtered out your email, and I've just founded it just by chance. Sorry for delay. Andreas Ive downloaded the ss5.bin bios (sparc station 5)) I have the solaris disk and i can do the installation but can't get the disk to boot. I can boot from the cd, mount and check the /etc/system file it has the set scsi_options=0x58 option i set to make it work, still not working: Here is the output: /ok boot disk0:d Boot device: /iommu/sbus/espdma@5,840/esp@5,880/sd@0,0:d File and args: SunOS Release 5.5.1 Version Generic [UNIX(R) System V Release 4.0] Copyright (c) 1983-1996, Sun Microsystems, Inc. WARNING: /iommu@0,1000/sbus@0,10001000/espdma@5,840/esp@5,880/sd@1,0 (sd1): corrupt label - bad geometry Label says 1310720 blocks, Drive says 883120 blocks Cannot mount root on /iommu@0,1000/sbus@0,10001000/espdma@5,840/esp@5,880/sd@0,0:d fstype ufs panic: vfs_mountroot: cannot mount root rebooting... Resetting ... / maybe the disk is wrong ive try several disk one following these instructions: 1) qemu-img create -f qcow2 file.qcow2 12G 2) Add -drive file=file.qcow2,unit=3 to your qemu startup 3) Boot single user 4) echo 'disk_type=QEMU12.0G:ctlr=SCSI:ncyl=49152:acyl=0:pcyl=49152:nhead=16:nsect=32:rpm=7200' /etc/format.dat 5) echo 'partition=QEMU12.0G:disk=QEMU12.0G:ctlr=SCSI:2=0,25165824:0=0,20971520:6=40960,4194304' /etc/format.dat 6) format -d c0t3d0 -t 'QEMU12.0G' then label, y, and quit 7) newfs /dev/dsk/c0t3d0s0 followed by y another downloaded from: http://gunkies.org/wiki/Installing_Solaris_2.4_on_Qemu_SPARC any help will be welcome -- ITECNIS SRL http:/www.itecnis.com Simplicity is the ultimate sophistication Ignacio Geli Implementación y Soporte -- SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg -- Regards, Artyom Tarasenko solaris/sparc under qemu blog: http://tyom.blogspot.com/
Re: [Qemu-devel] [PATCH v8 0/2] PC system flash support
On 11/29/2011 10:51 PM, Jordan Justen wrote: One way to verify is to to 'info qdev' and 'info mtree' with qemu-1.0 and qemu-1.1 -M pc-1.0, and see that you get the same results. I think this means that I cannot add a pc-sysfw qdev for pc-1.0. (Even if the rom memory usage matches pc-1.0.) Is that right? Yes. If so, the plan I mentioned above would not work. In the plan above, a property of the pc-sysfw qdev would have determined if the old rom method would be used, or the newer flash method would be used. Okay. It needs to be a machine/global property then. -- error compiling committee.c: too many arguments to function
Re: [Qemu-devel] [PATCH v2] exec.c: Fix subpage memory access to RAM MemoryRegion
On Thu, Dec 01, 2011 at 11:41:52AM +0200, Avi Kivity wrote: On 12/01/2011 11:37 AM, Gleb Natapov wrote: Looks reasonable. Should go into 1.1. Should we backport it to 1.0.blah? From 95c318f's description, it doesn't happen in normal circumstances. To reproduce that I mappped subpage PCI bar over RAM IIRC. In qemu 1.0, you can no longer do that (the pci bridge will not let the BAR override the RAM). Hmm, if this is how real HW work then problem solved :) (different HW can behave differently, but it is reasonable to assume that on a PC memory access below TOM will be redirected to memory controller no matter what) So what is the motivation for Andreas patch than? -- Gleb.
Re: [Qemu-devel] [PATCH v2] exec.c: Fix subpage memory access to RAM MemoryRegion
On 12/01/2011 11:47 AM, Gleb Natapov wrote: On Thu, Dec 01, 2011 at 11:41:52AM +0200, Avi Kivity wrote: On 12/01/2011 11:37 AM, Gleb Natapov wrote: Looks reasonable. Should go into 1.1. Should we backport it to 1.0.blah? From 95c318f's description, it doesn't happen in normal circumstances. To reproduce that I mappped subpage PCI bar over RAM IIRC. In qemu 1.0, you can no longer do that (the pci bridge will not let the BAR override the RAM). Hmm, if this is how real HW work then problem solved :) (different HW can behave differently, but it is reasonable to assume that on a PC memory access below TOM will be redirected to memory controller no matter what) So what is the motivation for Andreas patch than? He's not emulating pc hardware. -- error compiling committee.c: too many arguments to function
Re: [Qemu-devel] [PATCH v2] exec.c: Fix subpage memory access to RAM MemoryRegion
On Thu, Dec 01, 2011 at 11:54:33AM +0200, Avi Kivity wrote: On 12/01/2011 11:47 AM, Gleb Natapov wrote: On Thu, Dec 01, 2011 at 11:41:52AM +0200, Avi Kivity wrote: On 12/01/2011 11:37 AM, Gleb Natapov wrote: Looks reasonable. Should go into 1.1. Should we backport it to 1.0.blah? From 95c318f's description, it doesn't happen in normal circumstances. To reproduce that I mappped subpage PCI bar over RAM IIRC. In qemu 1.0, you can no longer do that (the pci bridge will not let the BAR override the RAM). Hmm, if this is how real HW work then problem solved :) (different HW can behave differently, but it is reasonable to assume that on a PC memory access below TOM will be redirected to memory controller no matter what) So what is the motivation for Andreas patch than? He's not emulating pc hardware. That's not a crime in itself :) What HW he encountered this problem on? What scenario? How likely is this scenario on that HW (my comment for 95c318f which you are referring to above was for PC)? And if KVM is supported on that HW my comment about KVM still applies. -- Gleb.
[Qemu-devel] Unsubscription Confirmation
Thank you for subscribing. You have now unsubscribed and no more messages will be sent.
Re: [Qemu-devel] Improve QEMU performance with LLVM codegen and other techniques
On 1 December 2011 09:03, 陳韋任 che...@iis.sinica.edu.tw wrote: I read the thread talking about the broken tb_unlink [1], and I'm surprised that tb_unlink is broken even under single-threaded mode and system mode. You mentioned (b) could be the IO thread in [1]. I think we don't enable IO thread in system mode right now. My concern is if I spot _all_ place/situation that I need to break the link between block and trace. The IO thread is always enabled in QEMU these days. -- PMM
Re: [Qemu-devel] [BUG] [Seabios] PCI 64bit BARs on Win2008 - unable to start the device. (ACPI lacks the _DSM method)
On Thu, Dec 01, 2011 at 06:49:54PM +1300, Alexey Korolev wrote: Isaku san, I've just added you to discussion. There are some issues with PCI 64bit support in Windows. Windows fails to assign the resource if it doesn't fit in first 4GB window. I really don't know why it happens. One of the possibilities is related to lack of _DSM method in ACPI. Another guesse could be related to the fact that 440FX only supports 32bit PCI bus interface and windows may limit PCI address range to first 4GB for PCI devices under this bridge. I remember you were working on Q35 chipset simulation, I wonder if it is working and would it be possible to try? Thanks, Alexey Maybe the range above 4G needs to be declared in the _CRS resource?
Re: [Qemu-devel] Improve QEMU performance with LLVM codegen and other techniques
On 01.12.2011, at 04:50, 陳韋任 wrote: Hi Alex, Very cool! I was thinking about this for a while myself now. It's especially appealing these days since you can do the hotspot optimization in a separate thread :). Especially in system mode, you also need to flush when tb_flush() is called though. And you have to make sure to match hflags and segment descriptors for the links - otherwise you might end up connecting TBs from different processes :). I'll check the tb_flush again. IIRC, we make the code cache big enough so that there is no need to flush the code cache. But I think we still need to deal with it in the end. It is never big enough :). In fact, even a normal system mode guest boot triggers tb_flush usually because the cache is full. And target code can also trigger it manually. The block linking is done by QEMU and we leave it alone. But I don't know QEMU ever does hflags and segment descriptors check before doing block linking. Could you point it out? Anyway, here is how we form trace from a set of basic blocks. Sure. Just check for every piece of code that executes cpu_get_tb_cpu_state() :). 1. We insert instrumented code at the beginning of each TCG block to collect how many times this block being executed. 2. When a block's execution time, say block A, reaches a pre-defined threshold, we follow the run time execution path to collect block B followed A and so on to form a trace. This approach is called NET (Next-Executing Tail) [1]. 3. Then a trace composed of TCG blocks is sent to a LLVM translator. The translator generates the host binary for the trace into a LLVM code cache, and patch the I don't fully understand this part. Do you disassemble the x86 blob that TCG emitted? beginning of block A (in QEMU code cache) so that anyone executing block A will jump to the corresponding trace and execute. Above is block to trace link. I think there is no need to do hflags and segment descriptors check, right? Although I set the trace length to one basic block at If you only take the choices that QEMU has already patched into the TB for you then no, you don't need to check it yourself, because QEMU already checked it :) the moment (make the situation simpler), I think we still don't have to check the blocks' hflags and segment descriptors in the trace to see if they match. Yeah. You only need to be sync'ed with the invalidation then. And make sure you patch the TB atomically, so you don't have a separate thread accidentally run half your code and half the old code. successfully, then login and run some benchmark on it. As a very first step, we make a very high threshold on trace building. In other words, a basic block must be executed *many* time to trigger the trace building process. Then we lower the threshold a bit at a time to see how things work. When something goes wrong, we might get kernel panic or the system hangs at some point on the booting process. I have no idea on how to solve this kind of problem. So I'd like to seek for help/experience/suggestion on the mailing list. I just hope I make the whole situation clear to you. I don't see any better approach to debugging this than the one you're already taking. Try to run as many workloads as you can and see if they break :). Oh and always make the optimization optional, so that you can narrow it down to it and know you didn't hit a generic QEMU bug. You mean make the trace optimization optional? We have tested our framework in LLVM-only mode. which means we replace TCG with LLVM entirely. It's _very_ slow I was more thinking of making the trace optimization optional as in not optimize but do only TCG like it's done today :). but works. What the generic QEMU bug is? We use QEMU 0.13 and just rely on its emulation part right now. Does recent version fix major bugs in the emulation engine? I don't know - there are always bug fixes in areas all over the code base. But I guess the parts you've been touching have been pretty stable. Either way, I was really more trying to point out that there could always be bugs in any layer, so having the ability to turn off a layer is in general a good idea :). Alex
Re: [Qemu-devel] [PATCH 05/18] qdev: provide a path resolution
On Wed, Nov 30, 2011 at 03:03:35PM -0600, Anthony Liguori wrote: +DeviceState *qdev_resolve_path(const char *path, bool *ambiguous) +{ +bool partial_path = true; +DeviceState *dev; +gchar **parts; + +parts = g_strsplit(path, /, 0); +if (parts == NULL || parts[0] == NULL) { We must g_strfreev(parts) in the parts[0] == NULL case. +/** + * @qdev_resolve_path - resolves a path returning a device + * + * There are two types of supported paths--absolute paths and partial paths. + * + * Absolute paths are derived from the root device and can follow child or + * link properties. Since they can follow link properties, they can be + * arbitrarily long. Absolute paths look like absolute filenames and are prefix s/prefix/prefixed/ + * with a leading slash. + * + * Partial paths are look like relative filenames. They do not begin with a s/are// + * prefix. The matching rules for partial paths are subtle but designed to make + * specifying devices easy. At each level of the composition tree, the partial + * path is matched as an absolute path. The first match is not returned. At + * least two matches are searched for. A successful result is only returned if + * only one match is founded. If more than one match is found, a flag is return s/return/returned/ + * to indicate that the match was ambiguous. + * + * @path - the path to resolve + * + * @ambiguous - returns true if the path resolution failed because of an + * ambiguous match The implementation seems to depend on ambiguous being initialized to false by the caller. That would be worth documenting or changing so it does not depend on the initial value. + * + * Returns: + * The matched device. The matched device or NULL on path lookup failure.
Re: [Qemu-devel] Boot from disk problem
On Sat, Nov 26, 2011 at 9:01 AM, Blue Swirl blauwir...@gmail.com wrote: On Fri, Nov 25, 2011 at 18:24, Ignacio Geli ig...@itecnis.com wrote: El 25/11/11 13:01, Andreas Färber escribió: Hi, Am 25.11.2011 16:23, schrieb Ignacio Geli: Im trying to run a solaris 2.5.1 vm acording the instructions of i found in this blog: http://tyom.blogspot.com/2009/12/solaris-under-qemu-how-to.html#uds-search-results I'm not sure if Artyom has submitted all his patches to QEMU yet. This stuff is in. Definitely worked at 0.13 - 0.14 times, but I never had time to check 0.15+. compile my qemu with: git clone git://git.savannah.nongnu.org/qemu.git cd qemu ./configure --target-list=sparc-softmmu make (version 0.13.50) That 2009 blog is outdated: QEMU now lives in git://git.qemu.org/qemu.git and should be at 0.15.93 (1.0-rc3). Compare http://qemu.org for details. CC'ed sparc maintainer and Artyom. Andreas Andesas tks for the info. upgrading to 0.15.1 did not solve the issue. Adding some info: QEMU emulator version 0.15.1, Copyright (c) 2003-2008 Fabrice Bellard Linux ws2132 2.6.32-5-openvz-amd64 #1 SMP Mon Oct 3 05:12:50 UTC 2011 x86_64 GNU/Linux Debian squeeze Regards! Ive downloaded the ss5.bin bios (sparc station 5)) I have the solaris disk and i can do the installation but can't get the disk to boot. I can boot from the cd, mount and check the /etc/system file it has the set scsi_options=0x58 option i set to make it work, still not working: Here is the output: /ok boot disk0:d Boot device: /iommu/sbus/espdma@5,840/esp@5,880/sd@0,0:d File and args: SunOS Release 5.5.1 Version Generic [UNIX(R) System V Release 4.0] Copyright (c) 1983-1996, Sun Microsystems, Inc. WARNING: /iommu@0,1000/sbus@0,10001000/espdma@5,840/esp@5,880/sd@1,0 (sd1): corrupt label - bad geometry Label says 1310720 blocks, Drive says 883120 blocks Cannot mount root on /iommu@0,1000/sbus@0,10001000/espdma@5,840/esp@5,880/sd@0,0:d fstype ufs panic: vfs_mountroot: cannot mount root rebooting... Resetting ... / maybe the disk is wrong ive try several disk one following these instructions: 1) qemu-img create -f qcow2 file.qcow2 12G 2) Add -drive file=file.qcow2,unit=3 to your qemu startup 3) Boot single user 4) echo 'disk_type=QEMU12.0G:ctlr=SCSI:ncyl=49152:acyl=0:pcyl=49152:nhead=16:nsect=32:rpm=7200' /etc/format.dat I'd guess these geometry parameters are not correct for your 12G disk. It's strange though that the installer kernel sees different geometry from the normal boot kernel. 5) echo 'partition=QEMU12.0G:disk=QEMU12.0G:ctlr=SCSI:2=0,25165824:0=0,20971520:6=40960,4194304' /etc/format.dat These could also be wrong, but I have no idea what the numbers mean. In order not to mess with all this numbers I used a pre-defined 1.3G disk in the Solaris format utility and created it with dd: dd if=/dev/zero of=$1 bs=10k count=133756 Otherwise, Ignacio, are you sure that disk0 is your root FS disk and not something else? Can you post your qemu command line? -- Regards, Artyom Tarasenko solaris/sparc under qemu blog: http://tyom.blogspot.com/
Re: [Qemu-devel] Improve QEMU performance with LLVM codegen and other techniques
The IO thread is always enabled in QEMU these days. We use QEMU 0.13. I think IO thread is not enabled by default. Regards, chenwj -- Wei-Ren Chen (陳韋任) Computer Systems Lab, Institute of Information Science, Academia Sinica, Taiwan (R.O.C.) Tel:886-2-2788-3799 #1667 Homepage: http://people.cs.nctu.edu.tw/~chenwj
Re: [Qemu-devel] [PATCH 07/18] qom: add link properties
On Wed, Nov 30, 2011 at 03:03:37PM -0600, Anthony Liguori wrote: +/** + * @qdev_property_add_link - Add a link property to a device + * + * Links establish relationships between devices. Links are unidirection s/unidirection/unidirectional/
Re: [Qemu-devel] Improve QEMU performance with LLVM codegen and other techniques
On 1 December 2011 03:50, 陳韋任 che...@iis.sinica.edu.tw wrote: We use QEMU 0.13 Oops, I missed this. 0.13 is over a year old now. There is zero point in doing any kind of engineering work of this scale on such an old codebase. You need to be tracking the head of git master, generally, if you want (a) any hope of getting your changes back into qemu or (b) any kind of useful support for problems you encounter during development. -- PMM
Re: [Qemu-devel] [PATCH 10/18] qom: qom_{get,set} monitor commands
On Wed, Nov 30, 2011 at 03:03:40PM -0600, Anthony Liguori wrote: +## +# @qom-get: +# +# This command will get a property from a device model path and return the +# value. +# +# @path: The path within the device model. There are two forms of supported +#paths--absolute and partial paths. +# +#Absolute paths are derived from the root device and can follow child +#or link properties. Since they can follow link properties, they +#can be arbitrarily long. Absolute paths look like absolute filenames +#and are prefixed with a leading slash. +# +#Partial paths are look like relative filenames. They do not begin s/are// +#with a prefix. The matching rules for partial paths are subtle but +#designed to make specifying devices easy. At each level of the +#composition tree, the partial path is matched as an absolute path. +#The first match is not returned. At least two matches are searched +#for. A successful result is only returned if only one match is +#founded. If more than one match is found, a flag is return to s/founded/found/
Re: [Qemu-devel] [PATCH 13/18] rtc: make piix3 set the rtc as a child
On Wed, Nov 30, 2011 at 03:03:43PM -0600, Anthony Liguori wrote: +/* FIXME there's some major spaghetti here. Somehow we create the devices + * on the PIIX before we actually create it. We create the PIIX3 deep in + * the recess of the i440fx creation too and then loose the DeviceState. s/loose/lose/
Re: [Qemu-devel] [PATCH 07/18] qom: add link properties
On 11/30/2011 11:03 PM, Anthony Liguori wrote: Links represent an ephemeral relationship between devices. They are meant to replace the qdev concept of busses by allowing more informal relationships between devices. So, links are equivalent to pointers? Links are fairly limited in their usefulness without implementing QOM-style subclassing and interfaces. +static void qdev_get_link_property(DeviceState *dev, Visitor *v, void *opaque, + const char *name, Error **errp) +{ +DeviceState **child = opaque; +gchar *path; + +if (*child) { +path = qdev_get_canonical_path(*child); +visit_type_str(v, path, name, errp); +g_free(path); +} else { +path = (gchar *); If gchar != char, this is wrong. Also, you're converting a const pointer into a non-const pointer, discarding type safety. +visit_type_str(v, path, name, errp); +} Shouldn't this be visit_type_link()? -- error compiling committee.c: too many arguments to function
Re: [Qemu-devel] [PATCH 15/18] qom: optimize qdev_get_canonical_path using a parent link
On Wed, Nov 30, 2011 at 03:03:45PM -0600, Anthony Liguori wrote: @@ -1210,6 +1210,9 @@ void qdev_property_add_child(DeviceState *dev, const char *name, qdev_property_add(dev, name, type, qdev_get_child_property, NULL, NULL, child, errp); +g_assert(child-parent == NULL); +child-parent = dev; The implications are: 1. A DeviceState must be a child or the root. It is not okay to create a DeviceState and inquire its canonical path before making it a child in the graph. 2. A DeviceState can only be the child of one parent. Since user-created devices are added to /peripheral or /peripheral-anon this means that the /i440fx only has links to them, never a parent-child relationship. Is this right? +/* Do not, under any circumstance, use this parent link below anywhere + * outside of qdev.c. You have been warned. */ +DeviceState *parent; It would be nice to explain why parent is private to qdev.c.
Re: [Qemu-devel] [BUG] [Seabios] PCI 64bit BARs on Win2008 - unable to start the device. (ACPI lacks the _DSM method)
Hi, Ok Here are logs the faulty device is: 00:05.0. In the first case the BAR1 size is 512MB and it fails in Windows (Linux is fine, the kernel enumerates the bus and assigns the resource above 4GB), In the second case the BAR1 size is 256M and works fine. === PCI new allocation pass #1 === PCI: check devices bus 0 Hmm, with 512 MB seabios should have paniced here with: PCI: out of address space PCI: map device bus 0, bfd 0x28 bar 0, addr febe, size 1 [mem] bar 1, addr 0, size 2000 [mem] Somehow seabios didn't recognise the bar correctly it seems (both 512 and 256 MB cases look the same). For the 256 MB case seabios should have mapped the bar @ 0xe000. cheers, Gerd
Re: [Qemu-devel] [PATCH 17/18] Add test tools
On Wed, Nov 30, 2011 at 03:03:47PM -0600, Anthony Liguori wrote: diff --git a/QMP/qom-get b/QMP/qom-get new file mode 100755 index 000..b086bc5 --- /dev/null +++ b/QMP/qom-get @@ -0,0 +1,26 @@ +#!/usr/bin/python +## +# Virtio Support QEMU Object Model property getter utility +# +# Copyright IBM, Corp. 2011 +# +# Authors: +# Anthony Liguori aligu...@us.ibm.com +# +# This work is licensed under the terms of the GNU GPL, version 2 or later. See +# the COPYING file in the top-level directory. +## + +import sys +from qmp import QEMUMonitorProtocol + +srv = QEMUMonitorProtocol('/tmp/server.sock') I think it would be worth doing it like QMP/qmp-shell.py and using a command-line argument. We might also like to support a QEMU_MONITOR environment variable. Hardcoding to /tmp/server.sock makes this script less useful. diff --git a/QMP/qom-list b/QMP/qom-list new file mode 100755 index 000..5dbb07b --- /dev/null +++ b/QMP/qom-list @@ -0,0 +1,30 @@ +#!/usr/bin/python +## +# Virtio Support QEMU Object Model property listing utility diff --git a/QMP/qom-set b/QMP/qom-set new file mode 100755 index 000..83a4394 --- /dev/null +++ b/QMP/qom-set @@ -0,0 +1,21 @@ +#!/usr/bin/python +## +# Virtio Support QEMU Object Model property setter utility
Re: [Qemu-devel] [PATCH] Add minimal Vexpress Cortex A15 support
On 1 December 2011 01:37, bill4car...@gmail.com wrote: Qemu based on: git://git.linaro.org/qemu/qemu-linaro.git Please don't post patches to upstream which are based on qemu-linaro. This work should be entirely doable against upstream qemu. You don't say whether you've tested this with TCG -- done correctly it should be entirely possible to run an A15 kernel in QEMU system mode without using KVM. From: Bill Carson bill4car...@gmail.com This patch adds minimal codes to support A15 which enables ARM KVM could run Guest OS build with Versatile Express Cortex-A15x4 tile. Signed-off-by: Bill Carson bill4car...@gmail.com Patch format etiquette: * for second and later rounds of a patch please put [PATCH v2] (v3,v4...) in the subject lines of patches and cover letters rather than just [PATCH] * include (in the cover letter for patch series, below the --- line for a single patch) a brief summary of changes since previous versions of the patch This patch should be split up into a series, something like: 1/3 add the A15 private peripherals 2/3 add the A15 cpu support itself 3/3 add the vexpress A15 board support --- Makefile.target | 2 +- hw/a15mpcore.c | 118 +++ hw/vexpress.c | 192 +++ target-arm/cpu.h | 1 + target-arm/helper.c | 31 5 files changed, 343 insertions(+), 1 deletions(-) create mode 100644 hw/a15mpcore.c diff --git a/Makefile.target b/Makefile.target index cce3554..69f0c4f 100644 --- a/Makefile.target +++ b/Makefile.target @@ -343,7 +343,7 @@ endif obj-arm-y = integratorcp.o versatilepb.o arm_pic.o arm_timer.o obj-arm-y += arm_boot.o pl011.o pl031.o pl050.o pl080.o pl110.o pl181.o pl190.o obj-arm-y += versatile_pci.o -obj-arm-y += realview_gic.o realview.o arm_sysctl.o arm11mpcore.o a9mpcore.o +obj-arm-y += realview_gic.o realview.o arm_sysctl.o arm11mpcore.o a9mpcore.o a15mpcore.o obj-arm-y += armv7m.o armv7m_nvic.o stellaris.o pl022.o stellaris_enet.o obj-arm-y += pl061.o obj-arm-y += arm-semi.o diff --git a/hw/a15mpcore.c b/hw/a15mpcore.c new file mode 100644 index 000..2518c17 --- /dev/null +++ b/hw/a15mpcore.c @@ -0,0 +1,118 @@ +/* + * ARM A15MPCore internal peripheral emulation (common code). + * + * Based on mpcore.c written by Paul Brook + * This code is licensed under the GPL. Missing copyright notice. Say which version(s) of the GPL. + */ + +#include sysbus.h +#include qemu-timer.h + +/* 64 external IRQ lines */ +#define GIC_NIRQ 96 64 or 96? + +#define NCPU 4 + +static inline int +gic_get_current_cpu(void) +{ + return cpu_single_env-cpu_index; +} + +#include arm_gic.c + +/* MPCore private memory region. */ +typedef struct a15mpcore_priv_state { + gic_state gic; + uint32_t scu_control; + int iomemtype; + uint32_t num_cpu; + MemoryRegion iomem; + MemoryRegion container; +} a15mpcore_priv_state; We don't seem to use scu_control. +/* TODO:Per-CPU Timers. */ A15 generic timers aren't memory mapped so they won't be in this file. + + +/* Per-CPU private memory mapped IO. */ +static uint64_t a15mpcore_priv_read(void *opaque, target_phys_addr_t offset, + unsigned size) +{ + a15mpcore_priv_state *s = (a15mpcore_priv_state *)opaque; + int id; + + offset = 0xfff; + /* Interrupt controller. */ + if (offset 0x200) { + id = gic_get_current_cpu(); + } else { + id = (offset - 0x200) 8; + if (id = s-num_cpu) { + return 0; + } + } + return gic_cpu_read(s-gic, id, offset 0xff); +} This is not what the A15 GIC looks like. Please implement against the TRM; don't just copy old code from the A9/11MPCore QEMU code. The cores are not the same. + +static void a15mpcore_priv_map_setup(a15mpcore_priv_state *s) +{ + memory_region_init(s-container, mpcode-priv-container, 0x3000); + memory_region_init_io(s-iomem, mpcore_priv_ops, s, mpcode-priv, + 0x1000); + memory_region_add_subregion(s-container, 0x2000, s-iomem); + memory_region_add_subregion(s-container, 0x1000, s-gic.iomem); This is wrong too. The GIC CPU interface region is 0x2000..0x3fff, not 0x2000..0x2fff. Since the A15 doesn't squash multiple different things into one page it might be simpler to have the GIC just expose more memory regions to be mapped rather than having an indirect through the mpcore_priv_ops here. + +static SysBusDeviceInfo a15mpcore_priv_info = { + .init = a15mpcore_priv_init, + .qdev.name = a15mpcore_priv, + .qdev.size = sizeof(a15mpcore_priv_state), + .qdev.props = (Property[]) { + DEFINE_PROP_UINT32(num-cpu, a15mpcore_priv_state, num_cpu, 1), + DEFINE_PROP_END_OF_LIST(), + } +}; + +static void a15mpcore_register_devices(void) +{ +
Re: [Qemu-devel] [PATCH] Add minimal Vexpress Cortex A15 support
On 1 December 2011 11:29, Peter Maydell peter.mayd...@linaro.org wrote: On 1 December 2011 01:37, bill4car...@gmail.com wrote: Qemu based on: git://git.linaro.org/qemu/qemu-linaro.git Please don't post patches to upstream which are based on qemu-linaro. This work should be entirely doable against upstream qemu. To clarify that a little, having noticed the cc list here: we can do kvm-specific work against qemu-linaro, but where possible we should do the non-kvm-specific bits against qemu upstream and bring them in that way. A15 board support is definitely not kvm specific and we should be aiming to upstream it first. I'm not sure what the best thing to do with work-in-progress or intended-for-kvm-qemu-linaro patches is; I think I'd suggest that those be posted to android-virt and linaro-toolchain but not qemu-devel. thanks -- PMM
Re: [Qemu-devel] [PATCH 07/18] qom: add link properties
On Thu, Dec 1, 2011 at 11:21 AM, Avi Kivity a...@redhat.com wrote: On 11/30/2011 11:03 PM, Anthony Liguori wrote: Links represent an ephemeral relationship between devices. They are meant to replace the qdev concept of busses by allowing more informal relationships between devices. So, links are equivalent to pointers? Links are fairly limited in their usefulness without implementing QOM-style subclassing and interfaces. +static void qdev_get_link_property(DeviceState *dev, Visitor *v, void *opaque, + const char *name, Error **errp) +{ + DeviceState **child = opaque; + gchar *path; + + if (*child) { + path = qdev_get_canonical_path(*child); + visit_type_str(v, path, name, errp); + g_free(path); + } else { + path = (gchar *); If gchar != char, this is wrong. Also, you're converting a const pointer into a non-const pointer, discarding type safety. This looked weird to me too but the cast has to do with the fact that the visitor function works both for input and output visitors. The output visitor needs to write to gchar** while the input visitor does not. When this function is called with the correct visitor type we are guaranteed that path will not be modified. Stefan
Re: [Qemu-devel] [BUG] [Seabios] PCI 64bit BARs on Win2008 - unable to start the device. (ACPI lacks the _DSM method)
Hi, PCI: map device bus 0, bfd 0x28 bar 0, addr febe, size 1 [mem] bar 1, addr 0, size 2000 [mem] Somehow seabios didn't recognise the bar correctly it seems (both 512 and 256 MB cases look the same). For the 256 MB case seabios should have mapped the bar @ 0xe000. ... and it should also have figured it is prefetchable memory. Was pci config space messed up somehow? What does 'lspci -v' say once you've booted the machine with linux? What qemu version you are running? What kind of device is this? Emulated? Code somewhere? Or a real device passed through to the guest? cheers, Gerd
Re: [Qemu-devel] [PATCH 07/18] qom: add link properties
On 12/01/2011 01:35 PM, Stefan Hajnoczi wrote: +static void qdev_get_link_property(DeviceState *dev, Visitor *v, void *opaque, + const char *name, Error **errp) +{ +DeviceState **child = opaque; +gchar *path; + +if (*child) { +path = qdev_get_canonical_path(*child); +visit_type_str(v, path, name, errp); +g_free(path); +} else { +path = (gchar *); If gchar != char, this is wrong. Also, you're converting a const pointer into a non-const pointer, discarding type safety. This looked weird to me too but the cast has to do with the fact that the visitor function works both for input and output visitors. The output visitor needs to write to gchar** while the input visitor does not. So you need to pass a non-const pointer to an array of const char, or const gchar **. You don't modify the string in place, you allocate a new string and free the old one. When this function is called with the correct visitor type we are guaranteed that path will not be modified. What if it's called with the output visitor? (warning: confusing convention). -- error compiling committee.c: too many arguments to function
Re: [Qemu-devel] [PATCH 01/18] qom: add new dynamic property infrastructure based on Visitors
On 12/01/2011 02:19 AM, Stefan Hajnoczi wrote: On Wed, Nov 30, 2011 at 03:03:31PM -0600, Anthony Liguori wrote: +/** + * @DevicePropertyEtter - called when trying to get/set a property An established term for this is an accessor. I've never heard etter before and it looks like a typo on first sight ;). +/** + * @qdev_property_set - writes a property to a device + * + * @dev - the device + * + * @v - the visitor that will used to write the property value. This should be s/will used/will be used/ Ack. Regards, Anthony Liguori
Re: [Qemu-devel] [PATCH 02/18] qom: register legacy properties as new style properties
On 12/01/2011 02:33 AM, Stefan Hajnoczi wrote: On Wed, Nov 30, 2011 at 03:03:32PM -0600, Anthony Liguori wrote: +static void qdev_set_legacy_property(DeviceState *dev, Visitor *v, void *opaque, + const char *name, Error **errp) +{ +Property *prop = opaque; + +if (dev-state != DEV_STATE_CREATED) { +error_set(errp, QERR_PERMISSION_DENIED); +return; +} + +if (prop-info-parse) { +Error *local_err = NULL; +char *ptr = NULL; + +visit_type_str(v,ptr, name,local_err); +if (!local_err) { +int ret; +ret = prop-info-parse(dev, prop, ptr); +if (ret != 0) { +error_set(errp, QERR_INVALID_PARAMETER_VALUE, + name, prop-info-name); +} +g_free(ptr); +} else { +error_propagate(errp, local_err); +} I noticed something subtle about Error** here. Your code is correct but I (incorrectly) wanted to eliminate local_err and use errp directly: if (prop-info-parse) { char *ptr = NULL; visit_type_str(v,ptr, name, errp); if (!error_is_set(errp)) { int ret; ret = prop-info-parse(dev, prop, ptr); if (ret != 0) { error_set(errp, QERR_INVALID_PARAMETER_VALUE, name, prop-info-name); } g_free(ptr); } } else { ... Turns out you cannot do this because error_is_set(errp) will be false if the caller passed in a NULL errp. That means we wouldn't detect the error from visit_type_str()! So it's not okay to depend on the caller's errp. We always need to juggle around a local_err with propagation :(. Just wanted to share. Yes, you are correct. You see this idiom a lot in QAPI an also in glib code that uses GError. Regards, Anthony Liguori
Re: [Qemu-devel] [PATCH 05/18] qdev: provide a path resolution
On 12/01/2011 04:24 AM, Stefan Hajnoczi wrote: On Wed, Nov 30, 2011 at 03:03:35PM -0600, Anthony Liguori wrote: +DeviceState *qdev_resolve_path(const char *path, bool *ambiguous) +{ +bool partial_path = true; +DeviceState *dev; +gchar **parts; + +parts = g_strsplit(path, /, 0); +if (parts == NULL || parts[0] == NULL) { We must g_strfreev(parts) in the parts[0] == NULL case. Good catch! +/** + * @qdev_resolve_path - resolves a path returning a device + * + * There are two types of supported paths--absolute paths and partial paths. + * + * Absolute paths are derived from the root device and can follow child or + * link properties. Since they can follow link properties, they can be + * arbitrarily long. Absolute paths look like absolute filenames and are prefix s/prefix/prefixed/ + * with a leading slash. + * + * Partial paths are look like relative filenames. They do not begin with a s/are// + * prefix. The matching rules for partial paths are subtle but designed to make + * specifying devices easy. At each level of the composition tree, the partial + * path is matched as an absolute path. The first match is not returned. At + * least two matches are searched for. A successful result is only returned if + * only one match is founded. If more than one match is found, a flag is return s/return/returned/ + * to indicate that the match was ambiguous. + * + * @path - the path to resolve + * + * @ambiguous - returns true if the path resolution failed because of an + * ambiguous match The implementation seems to depend on ambiguous being initialized to false by the caller. That would be worth documenting or changing so it does not depend on the initial value. That's actually a bug. I'll fix it. Ack on the other comments too. Regards, Anthony Liguori + * + * Returns: + * The matched device. The matched device or NULL on path lookup failure.
Re: [Qemu-devel] [PATCH 10/18] qom: qom_{get,set} monitor commands
On 12/01/2011 05:04 AM, Stefan Hajnoczi wrote: On Wed, Nov 30, 2011 at 03:03:40PM -0600, Anthony Liguori wrote: +## +# @qom-get: +# +# This command will get a property from a device model path and return the +# value. +# +# @path: The path within the device model. There are two forms of supported +#paths--absolute and partial paths. +# +#Absolute paths are derived from the root device and can follow child +#or link properties. Since they can follow link properties, they +#can be arbitrarily long. Absolute paths look like absolute filenames +#and are prefixed with a leading slash. +# +#Partial paths are look like relative filenames. They do not begin s/are// +#with a prefix. The matching rules for partial paths are subtle but +#designed to make specifying devices easy. At each level of the +#composition tree, the partial path is matched as an absolute path. +#The first match is not returned. At least two matches are searched +#for. A successful result is only returned if only one match is +#founded. If more than one match is found, a flag is return to s/founded/found/ Ack. Regards, Anthony Liguori
Re: [Qemu-devel] [PATCH 13/18] rtc: make piix3 set the rtc as a child
On 12/01/2011 05:07 AM, Stefan Hajnoczi wrote: On Wed, Nov 30, 2011 at 03:03:43PM -0600, Anthony Liguori wrote: +/* FIXME there's some major spaghetti here. Somehow we create the devices + * on the PIIX before we actually create it. We create the PIIX3 deep in + * the recess of the i440fx creation too and then loose the DeviceState. s/loose/lose/ Ack. Regards, Anthony Liguori
Re: [Qemu-devel] [PATCH 15/18] qom: optimize qdev_get_canonical_path using a parent link
On 12/01/2011 05:21 AM, Stefan Hajnoczi wrote: On Wed, Nov 30, 2011 at 03:03:45PM -0600, Anthony Liguori wrote: @@ -1210,6 +1210,9 @@ void qdev_property_add_child(DeviceState *dev, const char *name, qdev_property_add(dev, name, type, qdev_get_child_property, NULL, NULL, child, errp); +g_assert(child-parent == NULL); +child-parent = dev; The implications are: 1. A DeviceState must be a child or the root. It is not okay to create a DeviceState and inquire its canonical path before making it a child in the graph. Correct. 2. A DeviceState can only be the child of one parent. Since user-created devices are added to /peripheral or /peripheral-anon this means that the /i440fx only has links to them, never a parent-child relationship. Correct. /peripheral[-anon] are where user created devices live. Machine created devices live directly off of / Each directory ends up being a unique namespace. This ends up being a nice way to support compatibly since we have so many namespace today. I imagine, for instance, that we will end up with a /block directory too and that is where blockdev objects will live. Is this right? +/* Do not, under any circumstance, use this parent link below anywhere + * outside of qdev.c. You have been warned. */ +DeviceState *parent; It would be nice to explain why parent is private to qdev.c. Composition is a unidirectional relationship. The parent pointer is there simply as an optimization. I'll make the comment a big scarier :-) Regards, Anthony Liguori
Re: [Qemu-devel] [PATCH 17/18] Add test tools
On 12/01/2011 05:26 AM, Stefan Hajnoczi wrote: On Wed, Nov 30, 2011 at 03:03:47PM -0600, Anthony Liguori wrote: diff --git a/QMP/qom-get b/QMP/qom-get new file mode 100755 index 000..b086bc5 --- /dev/null +++ b/QMP/qom-get @@ -0,0 +1,26 @@ +#!/usr/bin/python +## +# Virtio Support QEMU Object Model property getter utility +# +# Copyright IBM, Corp. 2011 +# +# Authors: +# Anthony Liguorialigu...@us.ibm.com +# +# This work is licensed under the terms of the GNU GPL, version 2 or later. See +# the COPYING file in the top-level directory. +## + +import sys +from qmp import QEMUMonitorProtocol + +srv = QEMUMonitorProtocol('/tmp/server.sock') I think it would be worth doing it like QMP/qmp-shell.py and using a command-line argument. We might also like to support a QEMU_MONITOR environment variable. Hardcoding to /tmp/server.sock makes this script less useful. diff --git a/QMP/qom-list b/QMP/qom-list new file mode 100755 index 000..5dbb07b --- /dev/null +++ b/QMP/qom-list @@ -0,0 +1,30 @@ +#!/usr/bin/python +## +# Virtio Support QEMU Object Model property listing utility diff --git a/QMP/qom-set b/QMP/qom-set new file mode 100755 index 000..83a4394 --- /dev/null +++ b/QMP/qom-set @@ -0,0 +1,21 @@ +#!/usr/bin/python +## +# Virtio Support QEMU Object Model property setter utility Ack. I made these changes actually but something got lost when I sent the patches out. I actually used git-publish for the first time and apparently screwed something up when setting up the branch :-/ Regards, Anthony Liguori
Re: [Qemu-devel] [PATCH 07/18] qom: add link properties
On 12/01/2011 04:55 AM, Stefan Hajnoczi wrote: On Wed, Nov 30, 2011 at 03:03:37PM -0600, Anthony Liguori wrote: +/** + * @qdev_property_add_link - Add a link property to a device + * + * Links establish relationships between devices. Links are unidirection s/unidirection/unidirectional/ Ack. Regards, Anthony Liguori
[Qemu-devel] [PATCH 2/2] common.config: Allow use of arbitrary qemu* paths
Since we might want to test arbitrary qemu, qemu-img and qemu-io paths, allow users to specify environment variable values for QEMU_PROG, QEMU_IMG_PROG and QEMU_IO_PROG so the testsuite will use those values rather than find them on PATH. Obviously, if such env variables are not set prior to script execution, normal detection mechanism takes place. Signed-off-by: Lucas Meneghel Rodrigues l...@redhat.com --- common.config | 12 +--- 1 files changed, 9 insertions(+), 3 deletions(-) diff --git a/common.config b/common.config index d5a72af..d07f435 100644 --- a/common.config +++ b/common.config @@ -87,13 +87,19 @@ export BC_PROG=`set_prog_path bc` export PS_ALL_FLAGS=-ef -export QEMU_PROG=`set_prog_path qemu` +if [ -z $QEMU_PROG ]; then +export QEMU_PROG=`set_prog_path qemu` +fi [ $QEMU_PROG = ] _fatal qemu not found -export QEMU_IMG_PROG=`set_prog_path qemu-img` +if [ -z $QEMU_IMG_PROG ]; then +export QEMU_IMG_PROG=`set_prog_path qemu-img` +fi [ $QEMU_IMG_PROG = ] _fatal qemu-img not found -export QEMU_IO_PROG=`set_prog_path qemu-io` +if [ -z $QEMU_IO_PROG ]; then +export QEMU_IO_PROG=`set_prog_path qemu-io` +fi [ $QEMU_IO_PROG = ] _fatal qemu-io not found export QEMU=$QEMU_PROG -- 1.7.7.3
[Qemu-devel] [PATCH 1/2] check: print relevant path information
Print the paths of the programs under test (qemu, qemu-img and qemu-io). Signed-off-by: Lucas Meneghel Rodrigues l...@redhat.com --- check |3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) diff --git a/check b/check index 84ef3e5..8499a04 100755 --- a/check +++ b/check @@ -158,6 +158,9 @@ FULL_HOST_DETAILS=`_full_platform_details` #FULL_MOUNT_OPTIONS=`_scratch_mount_options` cat EOF +QEMU -- $QEMU +QEMU_IMG -- $QEMU_IMG +QEMU_IO -- $QEMU_IO IMGFMT-- $FULL_IMGFMT_DETAILS IMGPROTO -- $FULL_IMGPROTO_DETAILS PLATFORM -- $FULL_HOST_DETAILS -- 1.7.7.3
[Qemu-devel] [PATCH 0/2] qemu-io tests: More fine grained control of qemu paths
In automated test environments, we often build and test qemu from arbitrary paths, rather than installing them on standard PATH directories. Of course, appending directories to PATH might produce the desired result, but making it possible to specify arbitrary qemu paths through environment variables is very convenient and minimally intrusive. So, make it possible to set qemu paths through env variables, and also, print the paths of the qemu versions being tested, for clarity sake. Lucas Meneghel Rodrigues (2): check: print relevant path information common.config: Allow use of arbitrary qemu* paths check |3 +++ common.config | 12 +--- 2 files changed, 12 insertions(+), 3 deletions(-) -- 1.7.7.3
Re: [Qemu-devel] [PATCH 07/18] qom: add link properties
On 12/01/2011 05:21 AM, Avi Kivity wrote: On 11/30/2011 11:03 PM, Anthony Liguori wrote: Links represent an ephemeral relationship between devices. They are meant to replace the qdev concept of busses by allowing more informal relationships between devices. So, links are equivalent to pointers? Yup. Once we have qom inheritance (next stage), we can have a linkPCIDevice property and you'll be able to set it to an E1000State with the appropriate casting and error checking taking place. Links are fairly limited in their usefulness without implementing QOM-style subclassing and interfaces. +static void qdev_get_link_property(DeviceState *dev, Visitor *v, void *opaque, + const char *name, Error **errp) +{ +DeviceState **child = opaque; +gchar *path; + +if (*child) { +path = qdev_get_canonical_path(*child); +visit_type_str(v,path, name, errp); +g_free(path); +} else { +path = (gchar *); If gchar != char, this is wrong. Also, you're converting a const pointer into a non-const pointer, discarding type safety. I'll address this later in the thread. +visit_type_str(v,path, name, errp); +} Shouldn't this be visit_type_link()? link isn't a primitive type for visitors. visit_type_link() would just call visit_type_str() so I don't think there's a ton of value in introducing the extra layer. The accessors are the only places that should be marshaling links. Regards, Anthony Liguori
Re: [Qemu-devel] [PATCH 07/18] qom: add link properties
On 12/01/2011 06:34 AM, Avi Kivity wrote: On 12/01/2011 01:35 PM, Stefan Hajnoczi wrote: +static void qdev_get_link_property(DeviceState *dev, Visitor *v, void *opaque, + const char *name, Error **errp) +{ +DeviceState **child = opaque; +gchar *path; + +if (*child) { +path = qdev_get_canonical_path(*child); +visit_type_str(v,path, name, errp); +g_free(path); +} else { +path = (gchar *); If gchar != char, this is wrong. Also, you're converting a const pointer into a non-const pointer, discarding type safety. This looked weird to me too but the cast has to do with the fact that the visitor function works both for input and output visitors. The output visitor needs to write to gchar** while the input visitor does not. So you need to pass a non-const pointer to an array of const char, or const gchar **. You don't modify the string in place, you allocate a new string and free the old one. When this function is called with the correct visitor type we are guaranteed that path will not be modified. What if it's called with the output visitor? (warning: confusing convention). The reason there's a single Visitor type that works for both input and output visitors is to make it so you can write a single visit function that works for input and output. This works very well for most cases (in fact, QAPI makes heavy use of it). That said, I'm starting to feel like there should be a separate input and output visitor interface. It would make a number of things (like visiting lists) significantly simpler. Regards, Anthony Liguori
Re: [Qemu-devel] [PATCH 07/18] qom: add link properties
On 12/01/2011 03:47 PM, Anthony Liguori wrote: On 12/01/2011 06:34 AM, Avi Kivity wrote: On 12/01/2011 01:35 PM, Stefan Hajnoczi wrote: +static void qdev_get_link_property(DeviceState *dev, Visitor *v, void *opaque, + const char *name, Error **errp) +{ +DeviceState **child = opaque; +gchar *path; + +if (*child) { +path = qdev_get_canonical_path(*child); +visit_type_str(v,path, name, errp); +g_free(path); +} else { +path = (gchar *); If gchar != char, this is wrong. Also, you're converting a const pointer into a non-const pointer, discarding type safety. This looked weird to me too but the cast has to do with the fact that the visitor function works both for input and output visitors. The output visitor needs to write to gchar** while the input visitor does not. So you need to pass a non-const pointer to an array of const char, or const gchar **. You don't modify the string in place, you allocate a new string and free the old one. When this function is called with the correct visitor type we are guaranteed that path will not be modified. What if it's called with the output visitor? (warning: confusing convention). The reason there's a single Visitor type that works for both input and output visitors is to make it so you can write a single visit function that works for input and output. This works very well for most cases (in fact, QAPI makes heavy use of it). That said, I'm starting to feel like there should be a separate input and output visitor interface. It would make a number of things (like visiting lists) significantly simpler. Perhaps. But that's independent of the issue here. No matter how many visitors you have, you never change a gchar in a string in place, you always allocate a new string (or you can't change the string length). So the type needs to be const gchar **, not gchar **. -- error compiling committee.c: too many arguments to function
Re: [Qemu-devel] [PATCH 07/18] qom: add link properties
On 12/01/2011 03:44 PM, Anthony Liguori wrote: So, links are equivalent to pointers? Yup. Once we have qom inheritance (next stage), we can have a linkPCIDevice property and you'll be able to set it to an E1000State with the appropriate casting and error checking taking place. I really like this goal but can't help feeling that we're stretching C beyond its limits here, so that the client code ends up boilerplate-heavy. Kind of like the issue with local_err elsewhere in this thread, where you juggle things instead of a throw Exception(...). What does the client code looks like for linkPCIDevice? -- error compiling committee.c: too many arguments to function
Re: [Qemu-devel] [PATCH 17/18] Add test tools
On Thu, Dec 1, 2011 at 1:39 PM, Anthony Liguori anth...@codemonkey.ws wrote: On 12/01/2011 05:26 AM, Stefan Hajnoczi wrote: On Wed, Nov 30, 2011 at 03:03:47PM -0600, Anthony Liguori wrote: Ack. I made these changes actually but something got lost when I sent the patches out. I actually used git-publish for the first time and apparently screwed something up when setting up the branch :-/ Let me know if git-publish acts in a way you don't expect. The latest version is 0.2 at https://github.com/stefanha/git-publish. Stefan
[Qemu-devel] Offres emplois pour vous
Bonjour, des nouveaux offres emplois pour vous http://www.universfreeads.com/emplois.php ..
Re: [Qemu-devel] [PATCH 00/18] qom: dynamic properties and composition tree
On 11/30/2011 11:03 PM, Anthony Liguori wrote: This is a follow up to my previous series to get us started in the QOM direction. A few things are different this time around. Most notably: 1) Devices no longer have names. Instead, path names are always used to identify devices. 2) In order to support (1), dynamic properties are now supported. 3) The concept of a root device has been introduced. The root device is roughly modelling the motherboard of a machine. This type is a container type and it's best to think of it as something like a PCB board I guess. To try it out, here's an example session: Launch: anthony@titi:~/build/qemu$ x86_64-softmmu/qemu-system-x86_64 -hda ~/images/linux.img -snapshot -device virtio-balloon-pci,id=foo -qmp unix:/tmp/server.sock,server,nowait Explore the object model: anthony@titi:~/git/qemu/QMP$ ./qom-list / peripheral/ i440fx/ anthony@titi:~/git/qemu/QMP$ ./qom-list /i440fx/ piix3/ anthony@titi:~/git/qemu/QMP$ ./qom-list /i440fx/piix3 rtc/ anthony@titi:~/git/qemu/QMP$ ./qom-list /i440fx/piix3/rtc date base_year anthony@titi:~/git/qemu/QMP$ ./qom-get /i440fx/piix3/rtc.date tm_sec: 33 tm_hour: 21 tm_mday: 30 tm_year: 111 tm_mon: 10 tm_min: 2 So all of these become ABIs, right? We need good tools to allow easy review of the ABI bits hiding in patches, and to maintain ABI compatibility. Something like qemu-print-abi that dumps all properties for all devices. Patches could show the ABI changes by including a diff of the output of this program from before and after a change, and we could add similar tests for backwards compatibility. -- error compiling committee.c: too many arguments to function
Re: [Qemu-devel] [PATCH] Guest stop notification
On 2011-11-29 22:36, Eric B Munson wrote: Often when a guest is stopped from the qemu console, it will report spurious soft lockup warnings on resume. There are kernel patches being discussed that will give the host the ability to tell the guest that it is being stopped and should ignore the soft lockup warning that generates. Signed-off-by: Eric B Munson emun...@mgebm.net Cc: ry...@linux.vnet.ibm.com Cc: aligu...@us.ibm.com Cc: mtosa...@redhat.com Cc: a...@redhat.com Cc: k...@vger.kernel.org Cc: linux-ker...@vger.kernel.org --- target-i386/kvm.c |6 ++ 1 files changed, 6 insertions(+), 0 deletions(-) diff --git a/target-i386/kvm.c b/target-i386/kvm.c index 5bfc21f..defd364 100644 --- a/target-i386/kvm.c +++ b/target-i386/kvm.c @@ -336,12 +336,18 @@ static int kvm_inject_mce_oldstyle(CPUState *env) return 0; } +static void kvm_put_guest_paused(CPUState *penv) +{ +kvm_vcpu_ioctl(penv, KVM_GUEST_PAUSED, 0); +} I see no need in encapsulating this in a separate function. + static void cpu_update_state(void *opaque, int running, RunState state) { CPUState *env = opaque; if (running) { env-tsc_valid = false; + kvm_put_guest_paused(env); checkpatch.pl would have asked you to remove this tab. More general: Why is this x86-only? If the kernel interface is x86-only, what prevents making it generic right from the beginning? Why do we need a new IOCTL for this? Was there no space left in the kvm_run structure e.g. to pass this flag down on next vcpu execution? No big deal, just wondering. Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux
Re: [Qemu-devel] [PATCH 00/18] qom: dynamic properties and composition tree
On 12/01/2011 08:20 AM, Avi Kivity wrote: On 11/30/2011 11:03 PM, Anthony Liguori wrote: This is a follow up to my previous series to get us started in the QOM direction. A few things are different this time around. Most notably: 1) Devices no longer have names. Instead, path names are always used to identify devices. 2) In order to support (1), dynamic properties are now supported. 3) The concept of a root device has been introduced. The root device is roughly modelling the motherboard of a machine. This type is a container type and it's best to think of it as something like a PCB board I guess. To try it out, here's an example session: Launch: anthony@titi:~/build/qemu$ x86_64-softmmu/qemu-system-x86_64 -hda ~/images/linux.img -snapshot -device virtio-balloon-pci,id=foo -qmp unix:/tmp/server.sock,server,nowait Explore the object model: anthony@titi:~/git/qemu/QMP$ ./qom-list / peripheral/ i440fx/ anthony@titi:~/git/qemu/QMP$ ./qom-list /i440fx/ piix3/ anthony@titi:~/git/qemu/QMP$ ./qom-list /i440fx/piix3 rtc/ anthony@titi:~/git/qemu/QMP$ ./qom-list /i440fx/piix3/rtc date base_year anthony@titi:~/git/qemu/QMP$ ./qom-get /i440fx/piix3/rtc.date tm_sec: 33 tm_hour: 21 tm_mday: 30 tm_year: 111 tm_mon: 10 tm_min: 2 So all of these become ABIs, right? At first, no. I marked all of these QMP functions as experimental specifically to avoid introducing a new ABI. I want to give us some time to work out things out a bit more. But yes, this will eventually become an ABI. We need good tools to allow easy review of the ABI bits hiding in patches, and to maintain ABI compatibility. Something like qemu-print-abi that dumps all properties for all devices. Patches could show the ABI changes by including a diff of the output of this program from before and after a change, and we could add similar tests for backwards compatibility. I'm not sure that we want this interface to be backwards compatible. I actually think we should provide a higher level interface that's explicitly there for compatibility. Probably in the form of a C library that can be documented and reasoned with better. That turns the problem into something that requires less cleverness. Regards, Anthony Liguori
Re: [Qemu-devel] [PATCH 00/18] qom: dynamic properties and composition tree
On 12/01/2011 04:42 PM, Anthony Liguori wrote: We need good tools to allow easy review of the ABI bits hiding in patches, and to maintain ABI compatibility. Something like qemu-print-abi that dumps all properties for all devices. Patches could show the ABI changes by including a diff of the output of this program from before and after a change, and we could add similar tests for backwards compatibility. I'm not sure that we want this interface to be backwards compatible. I actually think we should provide a higher level interface that's explicitly there for compatibility. Probably in the form of a C library that can be documented and reasoned with better. Does this force anyone who wants a stable ABI to use this library? I don't have a good picture of this library. If FooState has a bar propery, do you generate qemu_foo_get_bar() and qemu_foo_set_bar() accessors? -- error compiling committee.c: too many arguments to function
Re: [Qemu-devel] [PATCH 1/2] [PATCH RFC v2 1/2] hyper-v: introduce Hyper-V support infrastructure.
On 2011-10-23 17:39, Vadim Rozenfeld wrote: --- Makefile.target |2 + target-i386/cpuid.c | 14 ++ target-i386/hyperv.c | 65 ++ target-i386/hyperv.h | 37 4 files changed, 118 insertions(+), 0 deletions(-) create mode 100644 target-i386/hyperv.c create mode 100644 target-i386/hyperv.h diff --git a/Makefile.target b/Makefile.target index 325f26e..446fabc 100644 --- a/Makefile.target +++ b/Makefile.target @@ -202,6 +202,8 @@ obj-$(CONFIG_NO_KVM) += kvm-stub.o obj-y += memory.o LIBS+=-lz +obj-i386-y +=hyperv.o + QEMU_CFLAGS += $(VNC_TLS_CFLAGS) QEMU_CFLAGS += $(VNC_SASL_CFLAGS) QEMU_CFLAGS += $(VNC_JPEG_CFLAGS) diff --git a/target-i386/cpuid.c b/target-i386/cpuid.c index 1e8bcff..261c168 100644 --- a/target-i386/cpuid.c +++ b/target-i386/cpuid.c @@ -27,6 +27,8 @@ #include qemu-option.h #include qemu-config.h +#include hyperv.h + /* feature flags taken from Intel Processor Identification and the CPUID * Instruction and AMD's CPUID Specification. In cases of disagreement * between feature naming conventions, aliases may be added. @@ -716,6 +718,14 @@ static int cpu_x86_find_by_name(x86_def_t *x86_cpu_def, const char *cpu_model) goto error; } x86_cpu_def-tsc_khz = tsc_freq / 1000; +} else if (!strcmp(featurestr, hv_spinlocks)) { + char *err; + numvalue = strtoul(val, err, 0); + if (!*val || *err) { Indention is off here. +fprintf(stderr, bad numerical value %s\n, val); +goto error; +} +hyperv_set_spinlock_retries(numvalue); } else { fprintf(stderr, unrecognized feature %s\n, featurestr); goto error; @@ -724,6 +734,10 @@ static int cpu_x86_find_by_name(x86_def_t *x86_cpu_def, const char *cpu_model) check_cpuid = 1; } else if (!strcmp(featurestr, enforce)) { check_cpuid = enforce_cpuid = 1; +} else if (!strcmp(featurestr, hv_relaxed)) { +hyperv_enable_relaxed_timing(true); +} else if (!strcmp(featurestr, hv_vapic)) { +hyperv_enable_vapic_recommended(true); } else { fprintf(stderr, feature string `%s' not in format (+feature|-feature|feature=xyz)\n, featurestr); goto error; diff --git a/target-i386/hyperv.c b/target-i386/hyperv.c new file mode 100644 index 000..b2e57ad --- /dev/null +++ b/target-i386/hyperv.c @@ -0,0 +1,65 @@ +/* + * QEMU Hyper-V support + * + * Copyright Red Hat, Inc. 2011 + * + * Author: Vadim Rozenfeld vroze...@redhat.com + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + * + */ + +#include hyperv.h + +static bool hyperv_vapic; +static bool hyperv_relaxed_timing; +static int hyperv_spinlock_attempts = HYPERV_SPINLOCK_NEVER_RETRY; + +void hyperv_enable_vapic_recommended(bool val) +{ +hyperv_vapic = val; +} + +void hyperv_enable_relaxed_timing(bool val) +{ +hyperv_relaxed_timing = val; +} + +void hyperv_set_spinlock_retries(int val) +{ +hyperv_spinlock_attempts = val; +if (hyperv_spinlock_attempts 0xFFF) { +hyperv_spinlock_attempts = 0xFFF; +} +} + +bool hyperv_enabled(void) +{ +return hyperv_hypercall_available() || hyperv_relaxed_timing_enabled(); +} + +bool hyperv_hypercall_available(void) +{ +if (hyperv_vapic || +(hyperv_spinlock_attempts != HYPERV_SPINLOCK_NEVER_RETRY)) { [ minor: braces arount the second condition aren't needed. ] + return true; +} +return false; +} + +bool hyperv_vapic_recommended(void) +{ +return hyperv_vapic; +} + +bool hyperv_relaxed_timing_enabled(void) +{ +return hyperv_relaxed_timing; +} + +int hyperv_get_spinlock_retries(void) +{ +return hyperv_spinlock_attempts; +} + diff --git a/target-i386/hyperv.h b/target-i386/hyperv.h new file mode 100644 index 000..0d742f8 --- /dev/null +++ b/target-i386/hyperv.h @@ -0,0 +1,37 @@ +/* + * QEMU Hyper-V support + * + * Copyright Red Hat, Inc. 2011 + * + * Author: Vadim Rozenfeld vroze...@redhat.com + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + * + */ + +#ifndef QEMU_HW_HYPERV_H +#define QEMU_HW_HYPERV_H 1 + +#include qemu-common.h +#include asm/hyperv.h + +#ifndef HYPERV_SPINLOCK_NEVER_RETRY +#define HYPERV_SPINLOCK_NEVER_RETRY 0x +#endif + +#ifndef KVM_CPUID_SIGNATURE_NEXT +#define KVM_CPUID_SIGNATURE_NEXT0x4100 +#endif + +void hyperv_enable_vapic_recommended(bool val); +void
Re: [Qemu-devel] [PATCH 2/2] [PATCH RFC v2 2/2] hyper-v: initialize Hyper-V CPUID leaves.
On 2011-10-23 17:39, Vadim Rozenfeld wrote: --- target-i386/kvm.c | 73 +++- 1 files changed, 71 insertions(+), 2 deletions(-) diff --git a/target-i386/kvm.c b/target-i386/kvm.c index 82fec8c..c061e3b 100644 --- a/target-i386/kvm.c +++ b/target-i386/kvm.c @@ -29,6 +29,7 @@ #include hw/pc.h #include hw/apic.h #include ioport.h +#include hyperv.h //#define DEBUG_KVM @@ -380,11 +381,16 @@ int kvm_arch_init_vcpu(CPUState *env) cpuid_i = 0; /* Paravirtualization CPUIDs */ -memcpy(signature, KVMKVMKVM\0\0\0, 12); c = cpuid_data.entries[cpuid_i++]; memset(c, 0, sizeof(*c)); c-function = KVM_CPUID_SIGNATURE; -c-eax = 0; +if (!hyperv_enabled()) { +memcpy(signature, KVMKVMKVM\0\0\0, 12); +c-eax = 0; +} else { +memcpy(signature, Microsoft Hv, 12); +c-eax = HYPERV_CPUID_MIN; +} c-ebx = signature[0]; c-ecx = signature[1]; c-edx = signature[2]; @@ -395,6 +401,54 @@ int kvm_arch_init_vcpu(CPUState *env) c-eax = env-cpuid_kvm_features kvm_arch_get_supported_cpuid(s, KVM_CPUID_FEATURES, 0, R_EAX); +if (hyperv_enabled()) { +memcpy(signature, Hv#1\0\0\0\0\0\0\0\0, 12); +c-eax = signature[0]; + +c = cpuid_data.entries[cpuid_i++]; +memset(c, 0, sizeof(*c)); +c-function = HYPERV_CPUID_VERSION; +c-eax = 0x1bbc; +c-ebx = 0x00060001; + +c = cpuid_data.entries[cpuid_i++]; +memset(c, 0, sizeof(*c)); +c-function = HYPERV_CPUID_FEATURES; +if (hyperv_relaxed_timing_enabled()) { +c-eax |= HV_X64_MSR_HYPERCALL_AVAILABLE; +} +if (hyperv_vapic_recommended()) { +c-eax |= HV_X64_MSR_HYPERCALL_AVAILABLE; +c-eax |= HV_X64_MSR_APIC_ACCESS_AVAILABLE; +} + +c = cpuid_data.entries[cpuid_i++]; +memset(c, 0, sizeof(*c)); +c-function = HYPERV_CPUID_ENLIGHTMENT_INFO; +if (hyperv_relaxed_timing_enabled()) { +c-eax |= HV_X64_RELAXED_TIMING_RECOMMENDED; +} +if (hyperv_vapic_recommended()) { +c-eax |= HV_X64_APIC_ACCESS_RECOMMENDED; +} +c-ebx = hyperv_get_spinlock_retries(); + +c = cpuid_data.entries[cpuid_i++]; +memset(c, 0, sizeof(*c)); +c-function = HYPERV_CPUID_IMPLEMENT_LIMITS; +c-eax = 0x40; +c-ebx = 0x40; + +c = cpuid_data.entries[cpuid_i++]; +memset(c, 0, sizeof(*c)); +c-function = KVM_CPUID_SIGNATURE_NEXT; +memcpy(signature, KVMKVMKVM\0\0\0, 12); +c-eax = 0; +c-ebx = signature[0]; +c-ecx = signature[1]; +c-edx = signature[2]; +} + has_msr_async_pf_en = c-eax (1 KVM_FEATURE_ASYNC_PF); cpu_x86_cpuid(env, 0, 0, limit, unused, unused, unused); @@ -953,6 +1007,13 @@ static int kvm_put_msrs(CPUState *env, int level) kvm_msr_entry_set(msrs[n++], MSR_KVM_ASYNC_PF_EN, env-async_pf_en_msr); } +if (hyperv_hypercall_available()) { +kvm_msr_entry_set(msrs[n++], HV_X64_MSR_GUEST_OS_ID, 0); +kvm_msr_entry_set(msrs[n++], HV_X64_MSR_HYPERCALL, 0); +} +if (hyperv_vapic_recommended()) { +kvm_msr_entry_set(msrs[n++], HV_X64_MSR_APIC_ASSIST_PAGE, 0); +} } if (env-mcg_cap) { int i; @@ -1190,6 +1251,14 @@ static int kvm_get_msrs(CPUState *env) msrs[n++].index = MSR_KVM_ASYNC_PF_EN; } +if (hyperv_hypercall_available()) { +msrs[n++].index = HV_X64_MSR_GUEST_OS_ID; +msrs[n++].index = HV_X64_MSR_HYPERCALL; +} +if (hyperv_vapic_recommended()) { +msrs[n++].index = HV_X64_MSR_APIC_ASSIST_PAGE; +} + if (env-mcg_cap) { msrs[n++].index = MSR_MCG_STATUS; msrs[n++].index = MSR_MCG_CTL; Not sure if we discussed this already: Is there no need to save/restore the HV MSR values in the vcpu structure? Specifcially kvm_get_msrs looks fishy as you request to read the state from the kernel but then simply ignore it. Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux
Re: [Qemu-devel] [PATCH 07/18] qom: add link properties
On 12/01/2011 08:03 AM, Avi Kivity wrote: On 12/01/2011 03:44 PM, Anthony Liguori wrote: So, links are equivalent to pointers? Yup. Once we have qom inheritance (next stage), we can have a linkPCIDevice property and you'll be able to set it to an E1000State with the appropriate casting and error checking taking place. I really like this goal but can't help feeling that we're stretching C beyond its limits here, so that the client code ends up boilerplate-heavy. Kind of like the issue with local_err elsewhere in this thread, where you juggle things instead of a throw Exception(...). I understand and have been down every possible road here. It's tempting to look at C++ or another language and view it as a simplifying assumption that makes the whole effort tremendously easier. But that's not been my experience. We just have to stretch C++ in different ways and you end up with that same icky feeling at the end of the day. What does the client code looks like for linkPCIDevice? I'm not sure what you mean by client code, but consider a device called UsbController that looks like: struct UsbController { DeviceState parent; UsbDevice *slave; // link property }; To add this as a link, somewhere in the init function you would do: static void usb_controller_initfn(UsbController *dev) { ... qdev_property_add_link(DEVICE(dev), slave, UsbDevice, (DeviceState **)dev-slave, NULL); } If you want to set the property explicitly, you would just do: dev-slave = some_other_device You don't need to use any special function to manipulate the link. Stylistically, I'd prefer that all devices exposed accessor functions and that you did these things through accessors so that we had clear rules about what's public and private. In terms of QMP client code, you just do: qom-set /path/to/usb-controller.slave /some/other/device Regards, Anthony Liguori
Re: [Qemu-devel] [PATCH 07/18] qom: add link properties
On 12/01/2011 07:50 AM, Avi Kivity wrote: On 12/01/2011 03:47 PM, Anthony Liguori wrote: What if it's called with the output visitor? (warning: confusing convention). The reason there's a single Visitor type that works for both input and output visitors is to make it so you can write a single visit function that works for input and output. This works very well for most cases (in fact, QAPI makes heavy use of it). That said, I'm starting to feel like there should be a separate input and output visitor interface. It would make a number of things (like visiting lists) significantly simpler. Perhaps. But that's independent of the issue here. No matter how many visitors you have, you never change a gchar in a string in place, you always allocate a new string (or you can't change the string length). So the type needs to be const gchar **, not gchar **. While you are correct, I tend to use 'gchar *' and 'const gchar *' to indicate ownership. If you have a function that takes a 'const gchar **', the expectation is that you wouldn't have to free the return value since g_free() takes a non-const pointer. You would, in fact, have to cast away the const in order to even call free. IOW, if you call visit_type_str() and it returns a newly allocated string to you, it's your responsibility to free it. Regards, Anthony Liguori
Re: [Qemu-devel] [PATCH 07/18] qom: add link properties
On 12/01/2011 04:53 PM, Anthony Liguori wrote: What does the client code looks like for linkPCIDevice? I'm not sure what you mean by client code, This: but consider a device called UsbController that looks like: struct UsbController { DeviceState parent; UsbDevice *slave; // link property }; To add this as a link, somewhere in the init function you would do: static void usb_controller_initfn(UsbController *dev) { ... qdev_property_add_link(DEVICE(dev), slave, UsbDevice, (DeviceState **)dev-slave, NULL); } Issues: - this is an object property, not a class property, so to get a list of properties we need to instantiate an object. - UsbDevice as the type is not type safe at compile time - ditto for the cast If you want to set the property explicitly, you would just do: dev-slave = some_other_device You don't need to use any special function to manipulate the link. Stylistically, I'd prefer that all devices exposed accessor functions and that you did these things through accessors so that we had clear rules about what's public and private. Also, so we can have observers that trigger on changes. -- error compiling committee.c: too many arguments to function
Re: [Qemu-devel] [PATCH 00/18] qom: dynamic properties and composition tree
On 12/01/2011 08:48 AM, Avi Kivity wrote: On 12/01/2011 04:42 PM, Anthony Liguori wrote: We need good tools to allow easy review of the ABI bits hiding in patches, and to maintain ABI compatibility. Something like qemu-print-abi that dumps all properties for all devices. Patches could show the ABI changes by including a diff of the output of this program from before and after a change, and we could add similar tests for backwards compatibility. I'm not sure that we want this interface to be backwards compatible. I actually think we should provide a higher level interface that's explicitly there for compatibility. Probably in the form of a C library that can be documented and reasoned with better. Does this force anyone who wants a stable ABI to use this library? I don't have a good picture of this library. If FooState has a bar propery, do you generate qemu_foo_get_bar() and qemu_foo_set_bar() accessors? This is all extremely low level stuff. My view is that if you are a user that cares about backwards compatibility, you probably don't want all of this low level stuff in the first place. I think we need to provide two classes of interfaces. A low-level interface that only something like libvirt would consume that is not 100% backwards compatible and a higher-level interface that less sophisticated tools would consume. If we provide a not 100% backwards compatible interface, then we would also need to provide a good introspection/capabilities mechanism so that something like libvirt could find out whether a feature was available or whether a device has changed significantly. To be very clear, I think -drive if=virtio should be absolutely stable regardless of what we do to the virtio device model. However, I want the flexibility to remove -device virtio-blk-pci and replace it with -device virtio-pci,id=foo -device virtio-blk,bus=foo. We would provide a means to enumerate supported devices so that things like libvirt could see that virtio-blk-pci is not valid in this new version but now there is a virtio-pci and virtio-blk device. Regards, Anthony Liguori
Re: [Qemu-devel] [PATCH 07/18] qom: add link properties
Hi, In terms of QMP client code, you just do: qom-set /path/to/usb-controller.slave /some/other/device Lacks notification. usb-controller doesn't notice that you have plugged in some usb device and thus can't raise an IRQ to notify the guest ... cheers, Gerd
[Qemu-devel] [PATCH] hw/9pfs: Use the correct file descriptor in Fsdriver Callback
From: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com Fsdriver callback that operate on file descriptor need to differentiate between directory fd and file fd. Based on the original patch from Sassan Panahinejad sas...@sassan.me.uk Signed-off-by: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com --- fsdev/file-op-9p.h |4 ++-- hw/9pfs/cofile.c |4 ++-- hw/9pfs/virtio-9p-handle.c | 28 ++-- hw/9pfs/virtio-9p-local.c | 36 ++-- hw/9pfs/virtio-9p-synth.c |5 +++-- 5 files changed, 55 insertions(+), 22 deletions(-) diff --git a/fsdev/file-op-9p.h b/fsdev/file-op-9p.h index 1928da2..a85ecd3 100644 --- a/fsdev/file-op-9p.h +++ b/fsdev/file-op-9p.h @@ -112,10 +112,10 @@ typedef struct FileOperations ssize_t (*pwritev)(FsContext *, V9fsFidOpenState *, const struct iovec *, int, off_t); int (*mkdir)(FsContext *, V9fsPath *, const char *, FsCred *); -int (*fstat)(FsContext *, V9fsFidOpenState *, struct stat *); +int (*fstat)(FsContext *, int, V9fsFidOpenState *, struct stat *); int (*rename)(FsContext *, const char *, const char *); int (*truncate)(FsContext *, V9fsPath *, off_t); -int (*fsync)(FsContext *, V9fsFidOpenState *, int); +int (*fsync)(FsContext *, int, V9fsFidOpenState *, int); int (*statfs)(FsContext *s, V9fsPath *path, struct statfs *stbuf); ssize_t (*lgetxattr)(FsContext *, V9fsPath *, const char *, void *, size_t); diff --git a/hw/9pfs/cofile.c b/hw/9pfs/cofile.c index 586b038..b15838c 100644 --- a/hw/9pfs/cofile.c +++ b/hw/9pfs/cofile.c @@ -71,7 +71,7 @@ int v9fs_co_fstat(V9fsPDU *pdu, V9fsFidState *fidp, struct stat *stbuf) } v9fs_co_run_in_worker( { -err = s-ops-fstat(s-ctx, fidp-fs, stbuf); +err = s-ops-fstat(s-ctx, fidp-fid_type, fidp-fs, stbuf); if (err 0) { err = -errno; } @@ -192,7 +192,7 @@ int v9fs_co_fsync(V9fsPDU *pdu, V9fsFidState *fidp, int datasync) } v9fs_co_run_in_worker( { -err = s-ops-fsync(s-ctx, fidp-fs, datasync); +err = s-ops-fsync(s-ctx, fidp-fid_type, fidp-fs, datasync); if (err 0) { err = -errno; } diff --git a/hw/9pfs/virtio-9p-handle.c b/hw/9pfs/virtio-9p-handle.c index 93552a1..ab7ea20 100644 --- a/hw/9pfs/virtio-9p-handle.c +++ b/hw/9pfs/virtio-9p-handle.c @@ -288,10 +288,17 @@ static int handle_mkdir(FsContext *fs_ctx, V9fsPath *dir_path, return ret; } -static int handle_fstat(FsContext *fs_ctx, V9fsFidOpenState *fs, -struct stat *stbuf) +static int handle_fstat(FsContext *fs_ctx, int fid_type, +V9fsFidOpenState *fs, struct stat *stbuf) { -return fstat(fs-fd, stbuf); +int fd; + +if (fid_type == P9_FID_DIR) { +fd = dirfd(fs-dir); +} else { +fd = fs-fd; +} +return fstat(fd, stbuf); } static int handle_open2(FsContext *fs_ctx, V9fsPath *dir_path, const char *name, @@ -428,12 +435,21 @@ static int handle_remove(FsContext *ctx, const char *path) return -1; } -static int handle_fsync(FsContext *ctx, V9fsFidOpenState *fs, int datasync) +static int handle_fsync(FsContext *ctx, int fid_type, +V9fsFidOpenState *fs, int datasync) { +int fd; + +if (fid_type == P9_FID_DIR) { +fd = dirfd(fs-dir); +} else { +fd = fs-fd; +} + if (datasync) { -return qemu_fdatasync(fs-fd); +return qemu_fdatasync(fd); } else { -return fsync(fs-fd); +return fsync(fd); } } diff --git a/hw/9pfs/virtio-9p-local.c b/hw/9pfs/virtio-9p-local.c index 99ef0cd..371a94d 100644 --- a/hw/9pfs/virtio-9p-local.c +++ b/hw/9pfs/virtio-9p-local.c @@ -366,11 +366,18 @@ out: return err; } -static int local_fstat(FsContext *fs_ctx, +static int local_fstat(FsContext *fs_ctx, int fid_type, V9fsFidOpenState *fs, struct stat *stbuf) { -int err; -err = fstat(fs-fd, stbuf); +int err, fd; + +if (fid_type == P9_FID_DIR) { +fd = dirfd(fs-dir); +} else { +fd = fs-fd; +} + +err = fstat(fd, stbuf); if (err) { return err; } @@ -381,19 +388,19 @@ static int local_fstat(FsContext *fs_ctx, mode_t tmp_mode; dev_t tmp_dev; -if (fgetxattr(fs-fd, user.virtfs.uid, +if (fgetxattr(fd, user.virtfs.uid, tmp_uid, sizeof(uid_t)) 0) { stbuf-st_uid = tmp_uid; } -if (fgetxattr(fs-fd, user.virtfs.gid, +if (fgetxattr(fd, user.virtfs.gid, tmp_gid, sizeof(gid_t)) 0) { stbuf-st_gid = tmp_gid; } -if (fgetxattr(fs-fd, user.virtfs.mode, +if (fgetxattr(fd, user.virtfs.mode, tmp_mode, sizeof(mode_t)) 0) {
Re: [Qemu-devel] [PATCH 07/18] qom: add link properties
On 12/01/2011 09:00 AM, Avi Kivity wrote: On 12/01/2011 04:53 PM, Anthony Liguori wrote: What does the client code looks like for linkPCIDevice? I'm not sure what you mean by client code, This: but consider a device called UsbController that looks like: struct UsbController { DeviceState parent; UsbDevice *slave; // link property }; To add this as a link, somewhere in the init function you would do: static void usb_controller_initfn(UsbController *dev) { ... qdev_property_add_link(DEVICE(dev), slave, UsbDevice, (DeviceState **)dev-slave, NULL); } Issues: I have thought about all of these and have solutions but we have to take it one step at a time. - this is an object property, not a class property, so to get a list of properties we need to instantiate an object. Yes, but it will be safe to instantiate an object as object instantiation is side-effect free. Recall, QOM does not have constructor properties and construction is delayed to realize. In addition, we'll be moving all structures into header files and I imagine we'll have a doc syntax so that we can place documentation about the properties in the headers and exact it appropriately. - UsbDevice as the type is not type safe at compile time It will become TYPE_USB_DEVICE once we get the next stage in. See some of my earlier QOM posts for examples. - ditto for the cast The cast is unfortunate. I can't really figure a good solution to this. I could implement a macro to eliminate the cast but since this may be a NULL pointer when this function is called, I can't find a way to make this safer. If you want to set the property explicitly, you would just do: dev-slave = some_other_device You don't need to use any special function to manipulate the link. Stylistically, I'd prefer that all devices exposed accessor functions and that you did these things through accessors so that we had clear rules about what's public and private. Also, so we can have observers that trigger on changes. Exactly. I don't think it's a hard requirement right now for any type of property. I think avoiding boiler plate is more important at the moment. Regards, Anthony Liguori
Re: [Qemu-devel] [PATCH 07/18] qom: add link properties
On 12/01/2011 05:03 PM, Gerd Hoffmann wrote: Hi, In terms of QMP client code, you just do: qom-set /path/to/usb-controller.slave /some/other/device Lacks notification. usb-controller doesn't notice that you have plugged in some usb device and thus can't raise an IRQ to notify the guest ... I was looking for an example of why we need an observer, thanks. -- error compiling committee.c: too many arguments to function
Re: [Qemu-devel] [PATCH 07/18] qom: add link properties
On 12/01/2011 09:03 AM, Gerd Hoffmann wrote: Hi, In terms of QMP client code, you just do: qom-set /path/to/usb-controller.slave /some/other/device Lacks notification. usb-controller doesn't notice that you have plugged in some usb device and thus can't raise an IRQ to notify the guest ... Properties will have flags. One of those flags makes the property mutable only while the device isn't realized. This is a case where the property is mutable during realize. You wouldn't use a normal link here. You would use a hotpluggable link which would have a notify hook. This works because all properties are get/set through an accessor interface and there's a rich error interface. The original QOM series I posted had this logic in it. Regards, Anthony Liguori cheers, Gerd
Re: [Qemu-devel] [PATCH 0/4] KVM: Dirty logging optimization using rmap
On 11/30/2011 07:15 AM, Takuya Yoshikawa wrote: (2011/11/30 14:02), Takuya Yoshikawa wrote: IIUC, even though O(1) is O(1) at the timing of GET DIRTY LOG, it needs O(N) write protections with respect to the total number of dirty pages: distributed, but actually each page fault, which should be logged, does some write protection? Sorry, was not precise. It depends on the level, and not completely distributed. But I think it is O(N), and the total number of costs will not change so much, I guess. That's true. But some applications do require low latency, and the current code can impose a lot of time with the mmu spinlock held. The total amount of work actually increases slightly, from O(N) to O(N log N), but since the tree is so wide, the overhead is small. -- error compiling committee.c: too many arguments to function
Re: [Qemu-devel] [Qemu-trivial] [PATCH 2/2] Makefile: use full path for qapi-generated directory
On Wed, 30 Nov 2011, Stefan Weil wrote: It's common to use either out-of-tree builds or in-tree builds, but not to mix both variants with a common root directory. I think QEMU should explicitly forbid that mixed scenario (like other projects do). Even with your fix there can remain problems with generated header files. Really? Can you provide more specific details? The mixed scenario creates unnecessary complexity. Without the mixed scenario, your patch is not needed. I agree that supporting the mixed scenario shouldn't be a priority. However without this last patch a make clean on the main tree is not enough to allow out of tree builds. Try the following scenario: - cd qemu; ./configure; make - make clean - cd ../temp; ./configure --source-path=../qemu; make this has to work, right? It does not without the patch to fix the generation of config-devices.mak.
Re: [Qemu-devel] [PATCH 14/18] rtc: add a dynamic property for retrieving the date
Hi, +static void rtc_get_date(DeviceState *dev, Visitor *v, void *opaque, + const char *name, Error **errp) +{ +ISADevice *isa = DO_UPCAST(ISADevice, qdev, dev); +RTCState *s = DO_UPCAST(RTCState, dev, isa); + +visit_start_struct(v, NULL, struct tm, name, 0, errp); +visit_type_int32(v, s-current_tm.tm_year, tm_year, errp); +visit_type_int32(v, s-current_tm.tm_mon, tm_mon, errp); +visit_type_int32(v, s-current_tm.tm_mday, tm_mday, errp); +visit_type_int32(v, s-current_tm.tm_hour, tm_hour, errp); +visit_type_int32(v, s-current_tm.tm_min, tm_min, errp); +visit_type_int32(v, s-current_tm.tm_sec, tm_sec, errp); +visit_end_struct(v, errp); +} Ok, what is the long term plan here? I don't think we want open-code everything here, do we? Especially once visitors become more widespread used. And I can see that they are useful for a bunch of stuff beside device properties and relationships. vmstate for example. Or to list device state (say a register dump) for debugging purposes. Today we have code to generate structs and visitors from scratch. I think it would be useful to also generate visitors for existing structs, with some kind of annotation, like this ... struct SomeDev { DeviceState dev; Chardev *chr; uint32_treg1 __v(vmstate); uint32_treg2 __v(vmstate); [ ... ] ); ... instead of the vmstate structs we create manually today. Likewise for properties. And probably we can even generate different visitors for different views at the same struct. cheers, Gerd
Re: [Qemu-devel] [PATCH 01/18] qom: add new dynamic property infrastructure based on Visitors
Am 30.11.2011 22:03, schrieb Anthony Liguori: qdev properties are settable only during construction and static to classes. This isn't flexible enough for QOM. This patch introduces a property interface for qdev that provides dynamic properties that are tied to objects, instead of classes. These properties are Visitor based instead of string based too. Signed-off-by: Anthony Liguori aligu...@us.ibm.com --- hw/qdev.c | 99 +++ hw/qdev.h | 118 + qerror.c |4 ++ qerror.h |3 ++ 4 files changed, 224 insertions(+), 0 deletions(-) diff --git a/hw/qdev.c b/hw/qdev.c index 106407f..ad2d44f 100644 --- a/hw/qdev.c +++ b/hw/qdev.c @@ -390,12 +390,33 @@ void qdev_init_nofail(DeviceState *dev) } } +static void qdev_property_del_all(DeviceState *dev) +{ +while (dev-properties) { +GSList *i = dev-properties; +DeviceProperty *prop = i-data; + +dev-properties = i-next; + +if (prop-release) { +prop-release(dev, prop-name, prop-opaque); +} + +g_free(prop-name); +g_free(prop-type); +g_free(prop); +g_free(i); +} +} + /* Unlink device from bus and free the structure. */ void qdev_free(DeviceState *dev) { BusState *bus; Property *prop; +qdev_property_del_all(dev); + if (dev-state == DEV_STATE_INITIALIZED) { while (dev-num_child_bus) { bus = QLIST_FIRST(dev-child_bus); @@ -962,3 +983,81 @@ char* qdev_get_fw_dev_path(DeviceState *dev) return strdup(path); } + +void qdev_property_add(DeviceState *dev, const char *name, const char *type, + DevicePropertyEtter *get, DevicePropertyEtter *set, + DevicePropertyRelease *release, void *opaque, + Error **errp) How about letting the caller pass in a DeviceProperty for improved readability and usability? Instead of memorizing the order of currently eight parameters (could probably become more in the future) you can use proper C99 initializers then. @@ -45,6 +82,7 @@ struct DeviceState { QTAILQ_ENTRY(DeviceState) sibling; int instance_id_alias; int alias_required_for_version; +GSList *properties; }; Why GSList instead of qemu-queue.h macros that would provide type safety? I don't think a property can belong to multiple devices, can it? qdev_property_add only refers to a single device, and nothing else adds elements to the list. Kevin
Re: [Qemu-devel] [PATCH 02/18] qom: register legacy properties as new style properties
Hi, +for (prop = dev-info-props; prop prop-name; prop++) { +qdev_property_add_legacy(dev, prop, NULL); +} bus properties? +static void qdev_get_legacy_property(DeviceState *dev, Visitor *v, void *opaque, + const char *name, Error **errp) +{ +Property *prop = opaque; + +if (prop-info-print) { +char buffer[1024]; +char *ptr = buffer; + +prop-info-print(dev, prop, buffer, sizeof(buffer)); +visit_type_str(v, ptr, name, errp); I think you can look at prop-info-type here and do something more clever at least for the bool + integer properties. cheers, Gerd
Re: [Qemu-devel] [Qemu-trivial] [PATCH 2/2] Makefile: use full path for qapi-generated directory
On 12/01/2011 09:19 AM, Stefano Stabellini wrote: On Wed, 30 Nov 2011, Stefan Weil wrote: It's common to use either out-of-tree builds or in-tree builds, but not to mix both variants with a common root directory. I think QEMU should explicitly forbid that mixed scenario (like other projects do). Even with your fix there can remain problems with generated header files. Really? Can you provide more specific details? The mixed scenario creates unnecessary complexity. Without the mixed scenario, your patch is not needed. I agree that supporting the mixed scenario shouldn't be a priority. However without this last patch a make clean on the main tree is not enough to allow out of tree builds. Try the following scenario: - cd qemu; ./configure; make - make clean - cd ../temp; ./configure --source-path=../qemu; make this has to work, right? It does not without the patch to fix the generation of config-devices.mak. To clarify, the first patch I sent fix a bug that's not currently triggered due to qapi-dir always ending up being qapi-generated/, so the guardname generated works, albeit with some needlessly verbose names. The second isn't intended to fix your issue with builds, but avoid potential problems that may arise if at some point $(BUILD_DIR) != $(CURDIR). So they don't actually fix anything, just potential bugs that may arise in the future. The issue with doing out of tree builds with a dirty source directory is really just an unsupported scenario. To really clean things up, you need to run `make distclean` using the same configuration you used to dirty the directory in the first place, otherwise even `make distclean` will leave cruft laying in target directories not present in your current config. Alexandre Raymond sent a patch to make it so that `make distclean` always left the source directory pristine, but there was some contention on what exactly a `make distclean` entails and it wasn't applied: http://lists.gnu.org/archive/html/qemu-trivial/2011-07/msg00037.html That's really the only complete fix for these build issues. I'm starting to think it'd be worth it to add a `make clean_for_reals` target that does what Alexandre's patch does and tell people to run it after a pull or switching to out-of-tree builds.
Re: [Qemu-devel] [PATCH 02/18] qom: register legacy properties as new style properties
Am 30.11.2011 22:03, schrieb Anthony Liguori: Expose all legacy properties through the new QOM property mechanism. The qdev property types are exposed through the 'legacy' namespace. They are always visited as strings since they do their own string parsing. Signed-off-by: Anthony Liguori aligu...@us.ibm.com --- hw/qdev.c | 82 + hw/qdev.h |7 + 2 files changed, 89 insertions(+), 0 deletions(-) Not directly related to this patch, but looking for the first time at visitor code, visitors are clearly underdocumented. +static void qdev_get_legacy_property(DeviceState *dev, Visitor *v, void *opaque, + const char *name, Error **errp) +{ +Property *prop = opaque; + +if (prop-info-print) { +char buffer[1024]; +char *ptr = buffer; + +prop-info-print(dev, prop, buffer, sizeof(buffer)); +visit_type_str(v, ptr, name, errp); So a string output visitor (this is what it's called, right?) takes a buffer on the stack that the visitor must not free... +} else { +error_set(errp, QERR_PERMISSION_DENIED); +} +} + +static void qdev_set_legacy_property(DeviceState *dev, Visitor *v, void *opaque, + const char *name, Error **errp) +{ +Property *prop = opaque; + +if (dev-state != DEV_STATE_CREATED) { +error_set(errp, QERR_PERMISSION_DENIED); +return; +} + +if (prop-info-parse) { +Error *local_err = NULL; +char *ptr = NULL; + +visit_type_str(v, ptr, name, local_err); +if (!local_err) { +int ret; +ret = prop-info-parse(dev, prop, ptr); +if (ret != 0) { +error_set(errp, QERR_INVALID_PARAMETER_VALUE, + name, prop-info-name); +} +g_free(ptr); ...but a string input visitor creates a copy that must be freed. Kind of surprising behaviour, and qapi-visit-core.h doesn't document it. Kevin
Re: [Qemu-devel] [PATCH] pc: map pc ram from user-specified file
Hi Hi Juan, Sorry for taking so long to reply -- my email filters apparently aren't setup correctly! Do you have any performance number for this? And examples on how your are using it? The performance should depend only on the VMA backing the file, in addition to any indirect overhead caused by MMU synchronization. If the file is a disk file that gets flushed from the buffer cache frequently, then performance will be abysmal. However, if the file is guaranteed to be in-core (e.g., mounted on a ramfs), then KVM will hit the same kernel code paths as a file backed by an anonymous VMA that isn't swapped out. Our principal use case is implementing VM migration techniques. We're particularly interested in memory migration. Right now, QEMU implements VM migration, but QEMU's migration mechanism is inflexible with respect to memory. That is, the entire contents of the VM's RAM are copied from the migration host to the migration destination before the destination VM can run. With the current VM migration implementation, it's impossible, for instance, to allow the destination VM to start immediately and lazily fetch its RAM. With the -pcram-option, we could specify a file for the RAM that's backed by a fileystem that fetches pages on demand over the network. +#ifdef __linux__ + new_block-host = mem_file_ram_alloc(new_block, size); + if (new_block-host) { + assert(!host); + } else +#endif if (host) { This test is (at least suspicious). Shouldn't we check first if host is not NULL? (Not that I fully understand that part) Here's what I'm really testing: (host == NULL) or (there's no ram file for this RAMBlock) Here's my rationale: If there's a ram file for this RAMBlock, then the user of QEMU expects the RAMBlock to be backed by some file. Presumably the VM wouldn't run correctly otherwise (e.g., in the case of migration). However, if qemu passed host!=NULL into qemu_ram_alloc_from_ptr, then it expects the RAMBlock to be backed by something else; if the RAMBlock were backed by something other than the passed in host memory, then the VM presumably wouldn't work properly in this case either. Hence it's an error for host to be non-NULL and there to be a ram file for this RAMBlock, which is indicated by mem_file_ram_alloc returning non-NULL. It's up to the caller of add_memory_file to know if the RAMBlock named by idstr is normally allocated by qemu_ram_alloc_from_ptr. Hence why the exposed command-line option is --pcram-file file instead of --memory-for-arbitrary-ram-block idstr=x,file. I hope this clears some things up! Peter
Re: [Qemu-devel] [PATCH V3 1/4] Implement the FreeScale i.MX UART. This uart is used in a variety of SoCs, including some by Motorola, as well as in the FreeScale i.MX series.
On 30 November 2011 03:36, Peter Chubb peter.ch...@nicta.com.au wrote: Commit messages should be formatted with a short summary line, then a blank line, then a more detailed description. You've put everything into one extremely long summary line here, which looks odd in git log. Try: ===begin=== hw/imx_serial: Implement the FreeScale i.MX UART Implement the FreeScale i.MX UART. This uart is used in a variety of SoCs, including some by Motorola, as well as in the FreeScale i.MX series. Signed-off-by: c. ===endit=== Similarly for the other patches in this series. Signed-off-by: Hans Jang hsj...@ok-labs.com Signed-off-by: Adam Clench ad...@ok-labs.com Signed-off-by: Peter Chubb peter.ch...@nicta.com.au --- Makefile.target | 1 hw/imx_serial.c | 320 2 files changed, 321 insertions(+) create mode 100644 hw/imx_serial.c Index: qemu-working/hw/imx_serial.c === --- /dev/null 1970-01-01 00:00:00.0 + +++ qemu-working/hw/imx_serial.c 2011-11-30 13:38:24.434778115 +1100 @@ -0,0 +1,320 @@ +/* + * IMX31 UARTS + * + * Copyright (c) 2008 OKL + * Originally Written by Hans Jiang + * Copyright (c) 2011 NICTA Pty Ltd. + * + * This work is licensed under the terms of the GNU GPL, version 2. See + * the COPYING file in the top-level directory. Still GPL v2 only. + * + * This is a `bare-bones' implementation of the IMX series serial ports. + * TODO: + * -- implement FIFOs. The real hardware has 32 word transmit + * and receive FIFOs; we currently use a 1-char buffer + * -- implement DMA + * -- implement BAUD-rate and modem lines, for when the backend + * is a real serial device. + */ + +#include hw.h +#include sysbus.h +#include qemu-char.h + +//#define DEBUG_SERIAL 1 + +#ifdef DEBUG_SERIAL +#define DPRINTF(fmt, args...) \ +do { printf(imx_serial: fmt , ##args); } while (0) +#else +#define DPRINTF(fmt, args...) do {} while (0) +#endif + +/* + * Define to 1 for messages about attempts to + * access unimplemented registers or similar. + */ +#define DEBUG_IMPLEMENTATION 1 +#if DEBUG_IMPLEMENTATION +# define IPRINTF(fmt, args...) \ + do { fprintf(stderr, imx_serial: fmt, ##args); } while (0) +#else +# define IPRINTF(fmt, args...) do {} while (0) +#endif + +typedef struct { + SysBusDevice busdev; + MemoryRegion iomem; + int32_t readbuff; + + uint32_t usr1; + uint32_t usr2; + uint32_t ucr1; + uint32_t uts1; + + uint32_t ubrm; The MCIMX31RM calls this UBMR, not UBRM... + uint32_t ubrc; + + qemu_irq irq; + CharDriverState *chr; +} imx_state; + +static const VMStateDescription vmstate_imx_serial = { + .name = imx-serial, + .version_id = 1, + .minimum_version_id = 1, + .minimum_version_id_old = 1, + .fields = (VMStateField[]) { + VMSTATE_INT32(readbuff, imx_state), + VMSTATE_UINT32(usr1, imx_state), + VMSTATE_UINT32(usr2, imx_state), + VMSTATE_UINT32(ucr1, imx_state), + VMSTATE_UINT32(uts1, imx_state), + VMSTATE_UINT32(ubrm, imx_state), + VMSTATE_UINT32(ubrc, imx_state), + VMSTATE_END_OF_LIST() + }, +}; + + +#define URXD_CHARRDY (115) /* character read is valid */ + +#define USR1_TRDY (113) /* Xmitter ready */ +#define USR1_RRDY (19) /* receiver ready */ + +#define USR2_TXFE (114) /* Transmit FIFO empty */ +#define USR2_TXDC (13) /* Transmission complete */ +#define USR2_RDR (10) /* Receive data ready */ + +#define UCR1_TRDYEN (113) +#define UCR1_RRDYEN (19) +#define UCR1_TXMPTYEN (16) +#define UCR1_UARTEN (10) + +#define UTS1_TXEMPTY (16) +#define UTS1_RXEMPTY (15) +#define UTS1_TXFULL (14) +#define UTS1_RXFULL (13) + +static void imx_update(imx_state *s) +{ + uint32_t flags; + + flags = ((s-usr1 s-ucr1)) (USR1_TRDY|USR1_RRDY); + if (!(s-ucr1 UCR1_TXMPTYEN)) { + flags = ~USR1_TRDY; + } + + if (flags) { + DPRINTF(imx_serial: raising interrupt\n); + } + + qemu_set_irq(s-irq, !!flags); +} + +static void imx_serial_reset(DeviceState *dev) +{ + imx_state *s = container_of(dev, imx_state, busdev.qdev); + + s-usr1 = USR1_TRDY; My copy of the MCIMX31RM says we also set RXDS on reset. + s-usr2 = USR2_TXFE | USR2_TXDC; Presumably we don't set DCDIN here because we haven't implemented the modem control signals yet? + s-uts1 = UTS1_RXEMPTY | UTS1_TXEMPTY; + s-ubrm = 0; + s-ubrc = 0; RM says the reset value for UBRC is 0x4. You also need to reset s-ucr1. (If you want to retain that hack in the init function then you still need to reset the other bits even if you don't allow reset to clear UARTEN.) + s-readbuff = 0; + + imx_update(s); This will call qemu_set_irq()
Re: [Qemu-devel] [PATCH V3 1/4] Implement the FreeScale i.MX UART. This uart is used in a variety of SoCs, including some by Motorola, as well as in the FreeScale i.MX series.
On 1 December 2011 16:55, Peter Maydell peter.mayd...@linaro.org wrote: On 30 November 2011 03:36, Peter Chubb peter.ch...@nicta.com.au wrote: Signed-off-by: Hans Jang hsj...@ok-labs.com Is this email address correct? Trying to send this email got me: 550 550 hsj...@ok-labs.com... User not known (state 17). Signed-off-by: Adam Clench ad...@ok-labs.com Signed-off-by: Peter Chubb peter.ch...@nicta.com.au -- PMM
[Qemu-devel] [PATCH 2/2] ppc_prep: convert ioport to memory API
Signed-off-by: Benoît Canet benoit.ca...@gmail.com --- hw/ppc_prep.c | 80 - 1 files changed, 62 insertions(+), 18 deletions(-) diff --git a/hw/ppc_prep.c b/hw/ppc_prep.c index f22d5b9..299305b 100644 --- a/hw/ppc_prep.c +++ b/hw/ppc_prep.c @@ -94,7 +94,8 @@ static int speaker_data_on; static int dummy_refresh_clock; #endif -static void speaker_ioport_write (void *opaque, uint32_t addr, uint32_t val) +static void speaker_ioport_write(void *opaque, target_phys_addr_t addr, + uint64_t val, unsigned size) { #if 0 speaker_data_on = (val 1) 1; @@ -102,7 +103,8 @@ static void speaker_ioport_write (void *opaque, uint32_t addr, uint32_t val) #endif } -static uint32_t speaker_ioport_read (void *opaque, uint32_t addr) +static uint64_t speaker_ioport_read(void *opaque, target_phys_addr_t addr, +unsigned size) { #if 0 int out; @@ -114,6 +116,16 @@ static uint32_t speaker_ioport_read (void *opaque, uint32_t addr) return 0; } +static const MemoryRegionOps speaker_ioport_ops = { +.read = speaker_ioport_read, +.write = speaker_ioport_write, +.endianness = DEVICE_LITTLE_ENDIAN, +.valid = { +.min_access_size = 1, +.max_access_size = 1, +}, +}; + /* PCI intack register */ /* Read-only register (?) */ static void PPC_intack_write (void *opaque, target_phys_addr_t addr, @@ -251,7 +263,8 @@ enum { static sysctrl_t *sysctrl; -static void PREP_io_write (void *opaque, uint32_t addr, uint32_t val) +static void PREP_io_write(void *opaque, target_phys_addr_t addr, + uint64_t val, unsigned size) { sysctrl_t *sysctrl = opaque; @@ -260,7 +273,8 @@ static void PREP_io_write (void *opaque, uint32_t addr, uint32_t val) sysctrl-fake_io[addr - 0x0398] = val; } -static uint32_t PREP_io_read (void *opaque, uint32_t addr) +static uint64_t PREP_io_read(void *opaque, target_phys_addr_t addr, + unsigned size) { sysctrl_t *sysctrl = opaque; @@ -269,7 +283,18 @@ static uint32_t PREP_io_read (void *opaque, uint32_t addr) return sysctrl-fake_io[addr - 0x0398]; } -static void PREP_io_800_writeb (void *opaque, uint32_t addr, uint32_t val) +static const MemoryRegionOps PREP_io_ops = { +.read = PREP_io_read, +.write = PREP_io_write, +.endianness = DEVICE_LITTLE_ENDIAN, +.valid = { +.min_access_size = 1, +.max_access_size = 1, +}, +}; + +static void PREP_io_800_writeb(void *opaque, target_phys_addr_t addr, + uint64_t val, unsigned size) { sysctrl_t *sysctrl = opaque; @@ -330,13 +355,14 @@ static void PREP_io_800_writeb (void *opaque, uint32_t addr, uint32_t val) sysctrl-contiguous_map = val 0x01; break; default: -printf(ERROR: unaffected IO port write: %04 PRIx32 -= %02 PRIx32\n, addr, val); +printf(ERROR: unaffected IO port write: %04 PRIx64 += %02 PRIx64 \n, addr, val); break; } } -static uint32_t PREP_io_800_readb (void *opaque, uint32_t addr) +static uint64_t PREP_io_800_readb(void *opaque, target_phys_addr_t addr, + unsigned size) { sysctrl_t *sysctrl = opaque; uint32_t retval = 0xFF; @@ -393,15 +419,25 @@ static uint32_t PREP_io_800_readb (void *opaque, uint32_t addr) retval = sysctrl-contiguous_map; break; default: -printf(ERROR: unaffected IO port: %04 PRIx32 read\n, addr); +printf(ERROR: unaffected IO port: %04 PRIx64 read\n, addr); break; } -PPC_IO_DPRINTF(0x%08 PRIx32 = 0x%02 PRIx32 \n, +PPC_IO_DPRINTF(0x%08 PRIx64 = 0x%02 PRIx64 \n, addr - PPC_IO_BASE, retval); return retval; } +static const MemoryRegionOps PREP_io_800_ops = { +.read = PREP_io_800_readb, +.write = PREP_io_800_writeb, +.endianness = DEVICE_LITTLE_ENDIAN, +.valid = { +.min_access_size = 1, +.max_access_size = 1, +}, +}; + static inline target_phys_addr_t prep_IO_address(sysctrl_t *sysctrl, target_phys_addr_t addr) { @@ -514,6 +550,10 @@ static void ppc_prep_init (ram_addr_t ram_size, M48t59State *m48t59; MemoryRegion *PPC_io_memory = g_new(MemoryRegion, 1); MemoryRegion *intack = g_new(MemoryRegion, 1); +MemoryRegion *io_speaker = g_new(MemoryRegion, 1); +MemoryRegion *io_fake = g_new(MemoryRegion, 1); +MemoryRegion *io_control_092 = g_new(MemoryRegion, 1); +MemoryRegion *io_control_800 = g_new(MemoryRegion, 1); #if 0 MemoryRegion *xcsr = g_new(MemoryRegion, 1); #endif @@ -679,17 +719,21 @@ static void ppc_prep_init (ram_addr_t ram_size, fdctrl_init_isa(fd); /* Register speaker port */ -register_ioport_read(0x61, 1, 1, speaker_ioport_read, NULL); -
[Qemu-devel] [PATCH 1/2] applesmc: convert portio to memory API
Signed-off-by: Benoît Canet benoit.ca...@gmail.com --- hw/applesmc.c | 20 1 files changed, 12 insertions(+), 8 deletions(-) diff --git a/hw/applesmc.c b/hw/applesmc.c index c47b592..fb11bdf 100644 --- a/hw/applesmc.c +++ b/hw/applesmc.c @@ -165,6 +165,16 @@ static uint32_t applesmc_io_cmd_readb(void *opaque, uint32_t addr1) return s-status; } +static const MemoryRegionPortio applesmc_portio_list[] = { +{ APPLESMC_DATA_PORT, 4, 1, + .read = applesmc_io_data_readb, + .write = applesmc_io_data_writeb }, +{ APPLESMC_CMD_PORT, 4, 1, + .read = applesmc_io_cmd_readb, + .write = applesmc_io_cmd_writeb }, +PORTIO_END_OF_LIST(), +}; + static void applesmc_add_key(struct AppleSMCStatus *s, const char *key, int len, const char *data) { @@ -200,14 +210,8 @@ static int applesmc_isa_init(ISADevice *dev) { struct AppleSMCStatus *s = DO_UPCAST(struct AppleSMCStatus, dev, dev); -register_ioport_read(s-iobase + APPLESMC_DATA_PORT, 4, 1, - applesmc_io_data_readb, s); -register_ioport_read(s-iobase + APPLESMC_CMD_PORT, 4, 1, - applesmc_io_cmd_readb, s); -register_ioport_write(s-iobase + APPLESMC_DATA_PORT, 4, 1, - applesmc_io_data_writeb, s); -register_ioport_write(s-iobase + APPLESMC_CMD_PORT, 4, 1, - applesmc_io_cmd_writeb, s); +isa_register_portio_list(dev, s-iobase, applesmc_portio_list, s, + applesmc); if (!s-osk || (strlen(s-osk) != 64)) { fprintf(stderr, WARNING: Using AppleSMC with invalid key\n); -- 1.7.7.3
[Qemu-devel] [PATCH 0/2] ioport conversions to memory API
These patches convert two files ioports to memory API. Benoît Canet (2): applesmc: convert portio to memory API ppc_prep: convert ioport to memory API hw/applesmc.c | 20 - hw/ppc_prep.c | 80 - 2 files changed, 74 insertions(+), 26 deletions(-) -- 1.7.7.3
Re: [Qemu-devel] [PATCH V3 2/4] Implement the timers on the FreeScale i.MX31 SoC. This is not a complete implementation, but gives enough for Linux to boot and run.
On 30 November 2011 03:36, Peter Chubb peter.ch...@nicta.com.au wrote: Signed-off-by: Hans Jang hsj...@ok-labs.com Signed-off-by: Adam Clench ad...@ok-labs.com Signed-off-by: Peter Chubb peter.ch...@nicta.com.au --- Makefile.target | 2 hw/imx_timer.c | 460 2 files changed, 461 insertions(+), 1 deletion(-) create mode 100644 hw/imx_timer.c Index: qemu-working/hw/imx_timer.c === --- /dev/null 1970-01-01 00:00:00.0 + +++ qemu-working/hw/imx_timer.c 2011-11-30 13:38:25.818785258 +1100 @@ -0,0 +1,460 @@ +/* + * IMX31 Timer + * + * Copyright (c) 2008 OKL + * Copyright (c) 2011 NICTA Pty Ltd + * Originally Written by Hans Jiang + * Updated by Peter Chubb + * + * This code is licenced under GPL version 2 or later. See + * the COPYING file in the top-level directory. + */ + +#include hw.h +#include qemu-timer.h +#include sysbus.h + +//#define DEBUG_TIMER 1 + +#ifdef DEBUG_TIMER +# define DPRINTF(fmt, args...) \ + do { printf(imx_timer: fmt , ##args); } while (0) +#else +# define DPRINTF(fmt, args...) do {} while (0) +#endif + +/* + * Define to 1 for messages about attempts to + * access unimplemented registers or similar. + */ +#define DEBUG_IMPLEMENTATION 1 +#if DEBUG_IMPLEMENTATION +# define IPRINTF(fmt, args...) \ + do { fprintf(stderr, imx_timer: fmt, ##args); } while (0) +#else +# define IPRINTF(fmt, args...) do {} while (0) +#endif + +/* + * GPT : General purpose timer + */ + +#define TIMER_MAX 0xUL +#define GPT_FREQ 5000 /* Hz == 50 MHz */ + +/* Control register. Not all of these bits have any effect (yet) */ +#define GPT_CR_EN (1 0) /* GPT Enable */ +#define GPT_CR_ENMODE (1 1) /* GPT Enable Mode */ +#define GPT_CR_DBGEN (1 2) /* GPT Debug mode enable */ +#define GPT_CR_WAITEN (1 3) /* GPT Wait Mode Enable */ +#define GPT_CR_DOZEN (1 4) /* GPT Doze mode enable */ +#define GPT_CR_STOPEN (1 5) /* GPT Stop Mode Enable */ +#define GPT_CR_CLKSRC (7 6) /* Clock source select (3 bits) */ +#define GPT_CR_FRR (1 9) /* Freerun or Restart */ +#define GPT_CR_SWR (1 15) +#define GPT_CR_IM1 (3 16) /* Input capture channel 1 mode (2 bits) */ +#define GPT_CR_IM2 (3 18) /* Input capture channel 2 mode (2 bits) */ +#define GPT_CR_OM1 (7 20) /* Output Compare Channel 1 Mode (3 bits) */ +#define GPT_CR_OM2 (7 23) /* Output Compare Channel 2 Mode (3 bits) */ +#define GPT_CR_OM3 (7 26) /* Output Compare Channel 3 Mode (3 bits) */ +#define GPT_CR_FO1 (1 29) /* Force Output Compare Channel 1 */ +#define GPT_CR_FO2 (1 30) /* Force Output Compare Channel 2 */ +#define GPT_CR_FO3 (1 31) /* Force Output Compare Channel 3 */ + +#define GPT_SR_OF1 (1 0) +#define GPT_SR_ROV (1 5) + +#define GPT_IR_OF1IE (1 0) +#define GPT_IR_ROVIE (1 5) + +typedef struct { + SysBusDevice busdev; + QEMUTimer *timer; + MemoryRegion iomem; + uint32_t cr; + uint32_t sr; + uint32_t pr; + uint32_t ir; + uint32_t ocr1; + uint32_t cnt; + + int waiting_rov; + qemu_irq irq; +} imxg_timer_state; + +static const VMStateDescription vmstate_imxg_timer = { + .name = imxg-timer, + .version_id = 1, + .minimum_version_id = 1, + .minimum_version_id_old = 1, + .fields = (VMStateField[]) { + VMSTATE_UINT32(cr, imxg_timer_state), + VMSTATE_UINT32(sr, imxg_timer_state), + VMSTATE_UINT32(ir, imxg_timer_state), + VMSTATE_UINT32(cnt, imxg_timer_state), + VMSTATE_UINT32(ocr1, imxg_timer_state), + VMSTATE_TIMER(timer, imxg_timer_state), Missing 'pr' and 'waiting_rov' ? + VMSTATE_END_OF_LIST() + } +}; + + +/* Check all active timers, and schedule the next timer interrupt. */ +static void imxg_timer_update(imxg_timer_state *s) +{ + uint32_t flags = s-sr s-ir (GPT_SR_OF1 | GPT_SR_ROV); + + if ((s-cr GPT_CR_EN) flags) { + qemu_irq_raise(s-irq); + } else { + qemu_irq_lower(s-irq); + } +} + +static uint64_t imxg_timer_update_count(imxg_timer_state *s) +{ + uint64_t clk = qemu_get_clock_ns(vm_clock); + + s-cnt = ((uint32_t)muldiv64(clk, GPT_FREQ/100, + 1000)) % TIMER_MAX; + return clk; +} + +static void imxg_timer_run(imxg_timer_state *s, uint32_t timeout) +{ + uint64_t clk = imxg_timer_update_count(s); + uint32_t diff_cnt; + if (s-cnt timeout) { + diff_cnt = (timeout - s-cnt); + s-waiting_rov = 0; + } else { + diff_cnt = (TIMER_MAX - s-cnt); + s-waiting_rov = 1; + } + qemu_mod_timer(s-timer, clk + diff_cnt * 1000 / (GPT_FREQ/100)); You haven't explained why this uses raw qemu timers but the enhanced timer later in the file uses ptimers, why doing a % 0x
[Qemu-devel] [qemu-kvm PATCH 1/1] don't try to hotplug a cpu
When you do cpu_set n online where n current number of cpus, qemu-kvm will end up crashing when qdev finds hotplug is not enabled. Let's instead gracefully refuse. See https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/878422 for the related bug report. Signed-off-by: Serge Hallyn serge.hal...@canonical.com --- hw/acpi_piix4.c |5 + 1 files changed, 5 insertions(+), 0 deletions(-) diff --git a/hw/acpi_piix4.c b/hw/acpi_piix4.c index 1b35707..5e95569 100644 --- a/hw/acpi_piix4.c +++ b/hw/acpi_piix4.c @@ -589,12 +589,17 @@ void qemu_system_cpu_hot_add(int cpu, int state) PIIX4PMState *s = global_piix4_pm_state; if (state !qemu_get_cpu(cpu)) { +#if 1 +fprintf(stderr, cpu hotplug not supported\n, cpu); +return; +#else env = pc_new_cpu(global_cpu_model); if (!env) { fprintf(stderr, cpu %d creation failed\n, cpu); return; } env-cpuid_apic_id = cpu; +#endif } if (state) -- 1.7.5.4
[Qemu-devel] [PATCH] qed: add qed-tool.py image manipulation utility
The qed-tool.py utility can inspect and manipulate QED image files. It can be used for testing to see the state of image metadata and also to inject corruptions into the image file. It also has a scrubbing feature to copy just the metadata out of an image file, allowing users to share broken image files without revealing data in bug reports. This has lived in my local repo for a long time but could be useful to others. Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com --- scripts/qed-tool.py | 250 +++ 1 files changed, 250 insertions(+), 0 deletions(-) create mode 100755 scripts/qed-tool.py diff --git a/scripts/qed-tool.py b/scripts/qed-tool.py new file mode 100755 index 000..90cc375 --- /dev/null +++ b/scripts/qed-tool.py @@ -0,0 +1,250 @@ +#!/usr/bin/env python +# +# Tool to manipulate QED image files +# +# Copyright (C) 2010 IBM, Corp. +# +# Authors: +# Stefan Hajnoczi stefa...@linux.vnet.ibm.com +# +# This work is licensed under the terms of the GNU GPL, version 2. See +# the COPYING file in the top-level directory. + +import sys +import struct +import random +import optparse + +QED_F_NEED_CHECK = 0x02 + +header_fmt = 'QII' +header_size = struct.calcsize(header_fmt) +field_names = ['magic', 'cluster_size', 'table_size', + 'header_size', 'features', 'compat_features', + 'autoclear_features', 'l1_table_offset', 'image_size', + 'backing_filename_offset', 'backing_filename_size'] +table_elem_fmt = 'Q' +table_elem_size = struct.calcsize(table_elem_fmt) + +def err(msg): +sys.stderr.write(msg + '\n') +sys.exit(1) + +def unpack_header(s): +fields = struct.unpack(header_fmt, s) +return dict((field_names[idx], val) for idx, val in enumerate(fields)) + +def pack_header(header): +fields = tuple(header[x] for x in field_names) +return struct.pack(header_fmt, *fields) + +def unpack_table_elem(s): +return struct.unpack(table_elem_fmt, s)[0] + +def pack_table_elem(elem): +return struct.pack(table_elem_fmt, elem) + +class QED(object): +def __init__(self, f): +self.f = f + +self.f.seek(0, 2) +self.filesize = f.tell() + +self.load_header() +self.load_l1_table() + +def raw_pread(self, offset, size): +self.f.seek(offset) +return self.f.read(size) + +def raw_pwrite(self, offset, data): +self.f.seek(offset) +return self.f.write(data) + +def load_header(self): +self.header = unpack_header(self.raw_pread(0, header_size)) + +def store_header(self): +self.raw_pwrite(0, pack_header(self.header)) + +def read_table(self, offset): +size = self.header['table_size'] * self.header['cluster_size'] +s = self.raw_pread(offset, size) +table = [unpack_table_elem(s[i:i + table_elem_size]) for i in xrange(0, size, table_elem_size)] +return table + +def load_l1_table(self): +self.l1_table = self.read_table(self.header['l1_table_offset']) +self.table_nelems = self.header['table_size'] * self.header['cluster_size'] / table_elem_size + +def write_table(self, offset, table): +s = ''.join(pack_table_elem(x) for x in table) +self.raw_pwrite(offset, s) + +def random_table_item(table): +return random.choice([(index, offset) for index, offset in enumerate(table) if offset != 0]) + +def corrupt_table_duplicate(table): +'''Corrupt a table by introducing a duplicate offset''' +_, dup_victim = random_table_item(table) + +for i in xrange(len(table)): +dup_target = random.randint(0, len(table) - 1) +if table[dup_target] != dup_victim: +table[dup_target] = dup_victim +return +raise Exception('no duplication corruption possible in table') + +def corrupt_table_invalidate(qed, table): +'''Corrupt a table by introducing an invalid offset''' +index, _ = random_table_item(table) +table[index] = qed.filesize + random.randint(0, 100 * 1024 * 1024 * 1024 * 1024) + +def cmd_show(qed, *args): +'''show - Show header and l1 table''' +if not args or args[0] == 'header': +print qed.header +elif args[0] == 'l1': +print qed.l1_table +elif len(args) == 2 and args[0] == 'l2': +offset = int(args[1]) +print qed.read_table(offset) +else: +err('unrecognized sub-command') + +def cmd_duplicate(qed, table_level): +'''duplicate l1|l2 - Duplicate a table element''' +if table_level == 'l1': +offset = qed.header['l1_table_offset'] +table = qed.l1_table +elif table_level == 'l2': +_, offset = random_table_item(qed.l1_table) +table = qed.read_table(l2_offset) +else: +err('unrecognized sub-command') +corrupt_table_duplicate(table) +qed.write_table(offset, table) + +def cmd_invalidate(qed, table_level): +'''invalidate l1|l2 - Plant an invalid table element''' +
Re: [Qemu-devel] [PATCH v2] exec.c: Fix subpage memory access to RAM MemoryRegion
Am 01.12.2011 11:06, schrieb Gleb Natapov: On Thu, Dec 01, 2011 at 11:54:33AM +0200, Avi Kivity wrote: On 12/01/2011 11:47 AM, Gleb Natapov wrote: On Thu, Dec 01, 2011 at 11:41:52AM +0200, Avi Kivity wrote: On 12/01/2011 11:37 AM, Gleb Natapov wrote: Looks reasonable. Should go into 1.1. Should we backport it to 1.0.blah? From 95c318f's description, it doesn't happen in normal circumstances. To reproduce that I mappped subpage PCI bar over RAM IIRC. In qemu 1.0, you can no longer do that (the pci bridge will not let the BAR override the RAM). Hmm, if this is how real HW work then problem solved :) (different HW can behave differently, but it is reasonable to assume that on a PC memory access below TOM will be redirected to memory controller no matter what) Ah, glad to know that x86_64 is no longer affected. What about 0.15.1? So what is the motivation for Andreas patch than? He's not emulating pc hardware. That's not a crime in itself :) What HW he encountered this problem on? What scenario? How likely is this scenario on that HW (my comment for 95c318f which you are referring to above was for PC)? I encountered this on a nommu architecture that's not yet upstream (78k0 family / rl78). The exact scenario was a 256-byte long RAM area for Special Function Registers (fixable by 8-bit pages) and a 32-byte long RAM subarea for memory-mapped banked GPRs (not fixable by lowering page size to 5, doesn't build). I'm aware that the former I could convert to mmio and the latter I might drop but that's besides the point, it's not prohibited by MemoryRegion API and silently fails unless DEBUG_UNASSIGNED enabled. Seems worth a fix. Upstream potential no-mmu architectures and their target page sizes are: lm32 (12) m68k (10) microblaze (12) mips (12) xtensa (12) And if KVM is supported on that HW my comment about KVM still applies. I don't think KVM is supported on any of the above. Andreas -- SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
Re: [Qemu-devel] [PATCH] Guest stop notification
On Thu, 01 Dec 2011, Jan Kiszka wrote: On 2011-11-29 22:36, Eric B Munson wrote: Often when a guest is stopped from the qemu console, it will report spurious soft lockup warnings on resume. There are kernel patches being discussed that will give the host the ability to tell the guest that it is being stopped and should ignore the soft lockup warning that generates. Signed-off-by: Eric B Munson emun...@mgebm.net Cc: ry...@linux.vnet.ibm.com Cc: aligu...@us.ibm.com Cc: mtosa...@redhat.com Cc: a...@redhat.com Cc: k...@vger.kernel.org Cc: linux-ker...@vger.kernel.org --- target-i386/kvm.c |6 ++ 1 files changed, 6 insertions(+), 0 deletions(-) diff --git a/target-i386/kvm.c b/target-i386/kvm.c index 5bfc21f..defd364 100644 --- a/target-i386/kvm.c +++ b/target-i386/kvm.c @@ -336,12 +336,18 @@ static int kvm_inject_mce_oldstyle(CPUState *env) return 0; } +static void kvm_put_guest_paused(CPUState *penv) +{ +kvm_vcpu_ioctl(penv, KVM_GUEST_PAUSED, 0); +} I see no need in encapsulating this in a separate function. The encapsulated function was from a previous idea, I will remove it for V2. + static void cpu_update_state(void *opaque, int running, RunState state) { CPUState *env = opaque; if (running) { env-tsc_valid = false; + kvm_put_guest_paused(env); checkpatch.pl would have asked you to remove this tab. Will change to spaces for V2. More general: Why is this x86-only? If the kernel interface is x86-only, what prevents making it generic right from the beginning? Why do we need a new IOCTL for this? Was there no space left in the kvm_run structure e.g. to pass this flag down on next vcpu execution? No big deal, just wondering. Thanks for your review/feedback. When I started looking into this problem, the ioctl was the first suggestion I got for how to communicate from qemu to guest kernel. I don't see a technical reason that this could not be added to the kvm_run structure in one of the bytes currently used as padding. I would prefer to keep the ioctl because I have the corresponding kernel patches out to work with this, however, if there is a strong preference for using kvm_run, I can rework both sets. Eric signature.asc Description: Digital signature
Re: [Qemu-devel] [PATCH] Guest stop notification
On Thu, 01 Dec 2011, Jan Kiszka wrote: On 2011-11-29 22:36, Eric B Munson wrote: Often when a guest is stopped from the qemu console, it will report spurious soft lockup warnings on resume. There are kernel patches being discussed that will give the host the ability to tell the guest that it is being stopped and should ignore the soft lockup warning that generates. Signed-off-by: Eric B Munson emun...@mgebm.net Cc: ry...@linux.vnet.ibm.com Cc: aligu...@us.ibm.com Cc: mtosa...@redhat.com Cc: a...@redhat.com Cc: k...@vger.kernel.org Cc: linux-ker...@vger.kernel.org --- target-i386/kvm.c |6 ++ 1 files changed, 6 insertions(+), 0 deletions(-) diff --git a/target-i386/kvm.c b/target-i386/kvm.c index 5bfc21f..defd364 100644 --- a/target-i386/kvm.c +++ b/target-i386/kvm.c @@ -336,12 +336,18 @@ static int kvm_inject_mce_oldstyle(CPUState *env) return 0; } +static void kvm_put_guest_paused(CPUState *penv) +{ +kvm_vcpu_ioctl(penv, KVM_GUEST_PAUSED, 0); +} I see no need in encapsulating this in a separate function. + static void cpu_update_state(void *opaque, int running, RunState state) { CPUState *env = opaque; if (running) { env-tsc_valid = false; + kvm_put_guest_paused(env); checkpatch.pl would have asked you to remove this tab. More general: Why is this x86-only? If the kernel interface is x86-only, what prevents making it generic right from the beginning? Sorry, missed this question on the first pass, this is x86 only because the flag used lives in the pvclock structure. AFAICT, there aren't any other architectures out there that implement paravirtualized clocks yet. Why do we need a new IOCTL for this? Was there no space left in the kvm_run structure e.g. to pass this flag down on next vcpu execution? No big deal, just wondering. Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux signature.asc Description: Digital signature
Re: [Qemu-devel] [PATCH v2] exec.c: Fix subpage memory access to RAM MemoryRegion
On 12/01/2011 07:18 PM, Andreas Färber wrote: Am 01.12.2011 11:06, schrieb Gleb Natapov: On Thu, Dec 01, 2011 at 11:54:33AM +0200, Avi Kivity wrote: On 12/01/2011 11:47 AM, Gleb Natapov wrote: On Thu, Dec 01, 2011 at 11:41:52AM +0200, Avi Kivity wrote: On 12/01/2011 11:37 AM, Gleb Natapov wrote: Looks reasonable. Should go into 1.1. Should we backport it to 1.0.blah? From 95c318f's description, it doesn't happen in normal circumstances. To reproduce that I mappped subpage PCI bar over RAM IIRC. In qemu 1.0, you can no longer do that (the pci bridge will not let the BAR override the RAM). Hmm, if this is how real HW work then problem solved :) (different HW can behave differently, but it is reasonable to assume that on a PC memory access below TOM will be redirected to memory controller no matter what) Ah, glad to know that x86_64 is no longer affected. What about 0.15.1? In 0.15 this can still happen, 1.0 was fixed by the memory API. Note the fix is actually a side effect of a different change, and the issue can still happen for other reasons (as in your case). -- error compiling committee.c: too many arguments to function
Re: [Qemu-devel] [RFC PATCH] Exporting Guest RAM information for NUMA binding
On Wed, Nov 30, 2011 at 06:41:13PM +0100, Andrea Arcangeli wrote: On Wed, Nov 30, 2011 at 09:52:37PM +0530, Dipankar Sarma wrote: create the guest topology correctly and optimize for NUMA. This would work for us. Even on the case of 1 guest that fits in one node, you're not going to max out the full bandwidth of all memory channels with this. qemu all can do with ms_mbind/tbind is to create a vtopology that matches the hardware topology. It has these limits: 1) requires all userland applications to be modified to scan either the physical topology if run on host, or the vtopology if run on guest to get the full benefit. Not sure why you would need that. qemu can reflect the topology based on -numa specifications and the corresponding ms_tbind/mbind in FDT (in the case of Power, I guess ACPI tables for x86) and guest kernel would detect this virtualized topology. So there is no need for two types of topologies afaics. It will all be reflected in /sys/devices/system/node in the guest. 2) breaks across live migration if host physical topology changes That is indeed an issue. Either VM placement software needs to be really smart to migrate VMs that fit well or, more likely, we will have to find a way to make guest kernels aware of topology changes. But the latter has impact on userspace as well for applications that might have optimized for NUMA. 3) 1 small guest on a idle numa system that fits in one numa node will tell not enough information to the host kernel 4) if used outside of qemu and one threads allocates more memory than what fits in one node it won't tell enough info to the host kernel. About 3): if you've just one guest that fits in one node, each vcpu should be spread across all the nodes probably, and behave like MADV_INTERLEAVE if the guest CPU scheduler migrate guests processes in reverse, the global memory bandwidth will still be used full even if they will both access remote memory. I've just seen benchmarks where no pinning runs more than _twice_ as fast than pinning with just 1 guest and only 10 vcpu threads, probably because of that. I agree. Specifying NUMA topology for guest can result in sub-optimal performance in some cases, it is a tradeoff. In short it's an incremental step that moves some logic to the kernel but I don't see it solving all situations optimally and it shares a lot of the limits of the hard bindings. Agreed. Thanks Dipankar
Re: [Qemu-devel] [PATCH V3 3/4] Implement the FreeSCALE i.MX31 advanced vectored interrupt controller, at least to the extent it is used by Linux 3.0.x
On 30 November 2011 03:36, Peter Chubb peter.ch...@nicta.com.au wrote: Signed-off-by: Hans Jang hsj...@ok-labs.com Signed-off-by: Adam Clench ad...@ok-labs.com Signed-off-by: Peter Chubb peter.ch...@nicta.com.au --- Makefile.target | 2 hw/imx_avic.c | 378 2 files changed, 379 insertions(+), 1 deletion(-) create mode 100644 hw/imx_avic.c Index: qemu-working/hw/imx_avic.c === --- /dev/null 1970-01-01 00:00:00.0 + +++ qemu-working/hw/imx_avic.c 2011-11-30 13:38:27.070791665 +1100 @@ -0,0 +1,378 @@ +/* + * IMX31 Vectored Interrupt Controller + * + * Note this is NOT the PL192 provided by ARM, but + * a custom implementation by FreeScale. + * + * Copyright (c) 2008 OKL + * Copyright (c) 2011 NICTA Pty Ltd + * Originally Written by Hans Jiang + * + * This code is licenced under the GPL version 2 or later. See + * the COPYING file in the top-level directory. + * + * TODO: implement vectors and priorities. Vectors are harder (they require support from the target-arm core implementation which isn't there) but I think you should implement priorities. That should be purely internal to this device, and it would be good not to have yet another interrupt controller in the tree which doesn't get interrupt priorities right (the NVIC being our other). Mostly looks ok otherwise though I haven't prodded it too closely. + */ + +#include hw.h +#include sysbus.h +#include host-utils.h + +#define DEBUG_INT 1 +#undef DEBUG_INT /* comment out for debugging */ + +#ifdef DEBUG_INT +#define DPRINTF(fmt, args...) \ +do { printf(imx_avic: fmt , ##args); } while (0) +#else +#define DPRINTF(fmt, args...) do {} while (0) +#endif + +/* + * Define to 1 for messages about attempts to + * access unimplemented registers or similar. + */ +#define DEBUG_IMPLEMENTATION 1 +#if DEBUG_IMPLEMENTATION +# define IPRINTF(fmt, args...) \ + do { fprintf(stderr, imx_avic: fmt, ##args); } while (0) +#else +# define IPRINTF(fmt, args...) do {} while (0) +#endif + +#define IMX_INT_NUM_IRQS 64 + +/* Interrupt Control Bits */ +#define ABFLAG (125) +#define ABFEN (124) +#define NIDIS (122) /* Normal Interrupt disable */ +#define FIDIS (121) /* Fast interrupt disable */ +#define NIAD (120) /* Normal Interrupt Arbiter Rise ARM level */ +#define FIAD (119) /* Fast Interrupt Arbiter Rise ARM level */ +#define NM (118) /* Normal interrupt mode */ + + +#define PRIO_PER_WORD (sizeof(uint32_t) * 8 / 4) +#define PRIO_WORDS (IMX_INT_NUM_IRQS/PRIO_PER_WORD) + +typedef struct { + SysBusDevice busdev; + MemoryRegion iomem; + uint64_t pending; + uint64_t enabled; + uint64_t is_fiq; + uint32_t intcntl; + uint32_t intmask; + qemu_irq irq; + qemu_irq fiq; + uint32_t prio[PRIO_WORDS]; /* Priorities are 4-bits each */ +} imx_int_state; + +static const VMStateDescription vmstate_imx_avic = { + .name = imx-avic, + .version_id = 1, + .minimum_version_id = 1, + .minimum_version_id_old = 1, + .fields = (VMStateField[]) { + VMSTATE_UINT64(pending, imx_int_state), + VMSTATE_UINT64(enabled, imx_int_state), + VMSTATE_UINT64(is_fiq, imx_int_state), + VMSTATE_UINT32(intcntl, imx_int_state), + VMSTATE_UINT32(intmask, imx_int_state), + VMSTATE_UINT32_ARRAY(prio, imx_int_state, PRIO_WORDS), + VMSTATE_END_OF_LIST() + }, +}; + + + +static inline int imx_int_prio(imx_int_state *s, int irq) +{ + uint32_t word = irq / PRIO_PER_WORD; + uint32_t part = 4 * (irq % PRIO_PER_WORD); + return 0xf (s-prio[word] part); +} + +static inline void imx_int_set_prio(imx_int_state *s, int irq, int prio) +{ + uint32_t word = irq / PRIO_PER_WORD; + uint32_t part = 4 * (irq % PRIO_PER_WORD); + uint32_t mask = ~(0xf part); + s-prio[word] = mask; + s-prio[word] |= prio part; +} + +/* Update interrupts. */ +static void imx_int_update(imx_int_state *s) +{ + int i; + uint64_t new = s-pending; + uint64_t flags; + + flags = new s-enabled s-is_fiq; + qemu_set_irq(s-fiq, !!flags); + + flags = new s-enabled ~s-is_fiq; + if (!flags || (s-intmask == 0x1f)) { + qemu_set_irq(s-irq, !!flags); + return; + } + + /* + * Take interrupt if prio lower than the value of intmask + * (should really take highest priority interrupt here, + * but that would involve processing interrupt sources + * in priority order) + */ + for (i = 0; i IMX_INT_NUM_IRQS; i++) { + if (flags (1UL i)) { + if (imx_int_prio(s, i) s-intmask) { + qemu_set_irq(s-irq, 1); + return; + } + } + } + qemu_set_irq(s-irq, 0); +} + +static void imx_int_set_irq(void *opaque, int irq, int level) +{ +
[Qemu-devel] [PATCH V2] Guest stop notification
Often when a guest is stopped from the qemu console, it will report spurious soft lockup warnings on resume. There are kernel patches being discussed that will give the host the ability to tell the guest that it is being stopped and should ignore the soft lockup warning that generates. This patch uses the qemu Notifier system to tell the guest it is about to be stopped. Signed-off-by: Eric B Munson emun...@mgebm.net Cc: Avi Kivity a...@redhat.com Cc: Marcelo Tosatti mtosa...@redhat.com Cc: Jan Kiszka jan.kis...@siemens.com Cc: ry...@linux.vnet.ibm.com Cc: aligu...@us.ibm.com Cc: linux-ker...@vger.kernel.org Cc: k...@vger.kernel.org --- target-i386/kvm.c |1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/target-i386/kvm.c b/target-i386/kvm.c index 5bfc21f..12ac91a 100644 --- a/target-i386/kvm.c +++ b/target-i386/kvm.c @@ -342,6 +342,7 @@ static void cpu_update_state(void *opaque, int running, RunState state) if (running) { env-tsc_valid = false; +kvm_vcpu_ioctl(env, KVM_GUEST_PAUSED, 0); } } -- 1.7.5.4
Re: [Qemu-devel] [PATCH] Guest stop notification
On 2011-12-01 18:19, Eric B Munson wrote: On Thu, 01 Dec 2011, Jan Kiszka wrote: On 2011-11-29 22:36, Eric B Munson wrote: Often when a guest is stopped from the qemu console, it will report spurious soft lockup warnings on resume. There are kernel patches being discussed that will give the host the ability to tell the guest that it is being stopped and should ignore the soft lockup warning that generates. Signed-off-by: Eric B Munson emun...@mgebm.net Cc: ry...@linux.vnet.ibm.com Cc: aligu...@us.ibm.com Cc: mtosa...@redhat.com Cc: a...@redhat.com Cc: k...@vger.kernel.org Cc: linux-ker...@vger.kernel.org --- target-i386/kvm.c |6 ++ 1 files changed, 6 insertions(+), 0 deletions(-) diff --git a/target-i386/kvm.c b/target-i386/kvm.c index 5bfc21f..defd364 100644 --- a/target-i386/kvm.c +++ b/target-i386/kvm.c @@ -336,12 +336,18 @@ static int kvm_inject_mce_oldstyle(CPUState *env) return 0; } +static void kvm_put_guest_paused(CPUState *penv) +{ + kvm_vcpu_ioctl(penv, KVM_GUEST_PAUSED, 0); +} I see no need in encapsulating this in a separate function. The encapsulated function was from a previous idea, I will remove it for V2. + static void cpu_update_state(void *opaque, int running, RunState state) { CPUState *env = opaque; if (running) { env-tsc_valid = false; + kvm_put_guest_paused(env); checkpatch.pl would have asked you to remove this tab. Will change to spaces for V2. More general: Why is this x86-only? If the kernel interface is x86-only, what prevents making it generic right from the beginning? Why do we need a new IOCTL for this? Was there no space left in the kvm_run structure e.g. to pass this flag down on next vcpu execution? No big deal, just wondering. Thanks for your review/feedback. When I started looking into this problem, the ioctl was the first suggestion I got for how to communicate from qemu to guest kernel. I don't see a technical reason that this could not be added to the kvm_run structure in one of the bytes currently used as padding. I would prefer to keep the ioctl because I have the corresponding kernel patches out to work with this, however, if there is a strong preference for using kvm_run, I can rework both sets. My feeling is that a run field would be more elegant, but I might be wrong on this as well. In any case: You need KVM_CAP in your kernel interface to announce the new feature and you have to sync in the new kernel header into QEMU to make it build. Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux
Re: [Qemu-devel] [PATCH] Guest stop notification
On 2011-12-01 18:31, Arend van Spriel wrote: On 12/01/2011 06:19 PM, Eric B Munson wrote: On Thu, 01 Dec 2011, Jan Kiszka wrote: On 2011-11-29 22:36, Eric B Munson wrote: + static void cpu_update_state(void *opaque, int running, RunState state) { CPUState *env = opaque; if (running) { env-tsc_valid = false; + kvm_put_guest_paused(env); checkpatch.pl would have asked you to remove this tab. Will change to spaces for V2. Huh. Remove what tab? Indent are always 8-space TAB according CodingStyle doc: Chapter 1: Indentation Tabs are 8 characters, and thus indentations are also 8 characters. There are heretic movements that try to make indentations 4 (or even 2!) characters deep, and that is akin to trying to define the value of PI to be 3. Don't worry, this is not kernel code. :) QEMU uses a different coding style. Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux
Re: [Qemu-devel] [PATCH] Guest stop notification
On 2011-12-01 18:22, Eric B Munson wrote: On Thu, 01 Dec 2011, Jan Kiszka wrote: On 2011-11-29 22:36, Eric B Munson wrote: Often when a guest is stopped from the qemu console, it will report spurious soft lockup warnings on resume. There are kernel patches being discussed that will give the host the ability to tell the guest that it is being stopped and should ignore the soft lockup warning that generates. Signed-off-by: Eric B Munson emun...@mgebm.net Cc: ry...@linux.vnet.ibm.com Cc: aligu...@us.ibm.com Cc: mtosa...@redhat.com Cc: a...@redhat.com Cc: k...@vger.kernel.org Cc: linux-ker...@vger.kernel.org --- target-i386/kvm.c |6 ++ 1 files changed, 6 insertions(+), 0 deletions(-) diff --git a/target-i386/kvm.c b/target-i386/kvm.c index 5bfc21f..defd364 100644 --- a/target-i386/kvm.c +++ b/target-i386/kvm.c @@ -336,12 +336,18 @@ static int kvm_inject_mce_oldstyle(CPUState *env) return 0; } +static void kvm_put_guest_paused(CPUState *penv) +{ +kvm_vcpu_ioctl(penv, KVM_GUEST_PAUSED, 0); +} I see no need in encapsulating this in a separate function. + static void cpu_update_state(void *opaque, int running, RunState state) { CPUState *env = opaque; if (running) { env-tsc_valid = false; + kvm_put_guest_paused(env); checkpatch.pl would have asked you to remove this tab. More general: Why is this x86-only? If the kernel interface is x86-only, what prevents making it generic right from the beginning? Sorry, missed this question on the first pass, this is x86 only because the flag used lives in the pvclock structure. AFAICT, there aren't any other architectures out there that implement paravirtualized clocks yet. That's an implementation detail of the kernel. The interface (IOCTL or kvm_run field) is generic, no? I would just fire this notification from generic code, evaluate the error (that was lacking so far), and only report it if it's something else than not supported. Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux
Re: [Qemu-devel] [RFC PATCH] Exporting Guest RAM information for NUMA binding
On Thu, Dec 01, 2011 at 10:55:20PM +0530, Dipankar Sarma wrote: On Wed, Nov 30, 2011 at 06:41:13PM +0100, Andrea Arcangeli wrote: On Wed, Nov 30, 2011 at 09:52:37PM +0530, Dipankar Sarma wrote: create the guest topology correctly and optimize for NUMA. This would work for us. Even on the case of 1 guest that fits in one node, you're not going to max out the full bandwidth of all memory channels with this. qemu all can do with ms_mbind/tbind is to create a vtopology that matches the hardware topology. It has these limits: 1) requires all userland applications to be modified to scan either the physical topology if run on host, or the vtopology if run on guest to get the full benefit. Not sure why you would need that. qemu can reflect the topology based on -numa specifications and the corresponding ms_tbind/mbind in FDT (in the case of Power, I guess ACPI tables for x86) and guest kernel would detect this virtualized topology. So there is no need for two types of topologies afaics. It will all be reflected in /sys/devices/system/node in the guest. The point is: what a vtopology gives you? If you don't modify all apps running in the guest to use it? vtopology on guest, helps exactly like the topology on host - very little unless you modify qemu on host to use ms_tbind/mbind. 2) breaks across live migration if host physical topology changes That is indeed an issue. Either VM placement software needs to be really smart to migrate VMs that fit well or, more likely, we will have to find a way to make guest kernels aware of topology changes. But the latter has impact on userspace as well for applications that might have optimized for NUMA. Making guest kernel aware about memory topology changes is going to be a whole mess. Or at least harder than memory hotplug. I agree. Specifying NUMA topology for guest can result in sub-optimal performance in some cases, it is a tradeoff. I see it more like a limit of this solution, which is a common limit to the hard bindings than a tradeoff. Agreed. Yep I just wanted to make clear the limits remains with this solution. I'll try to teach knumad to detect thread-memory affinity too with some logic, we'll see how well that can work.