[Qemu-devel] Re: [PATCH v2 13/24] kvm: Set up signal mask also for !CONFIG_IOTHREAD
On 02/28/2011 06:54 PM, Avi Kivity wrote: On 02/28/2011 06:49 PM, Jan Kiszka wrote: That's what I tried, and it didn't work?! Maybe I forgot to compile or something. Well, it maybe failed to build as qemu_kvm_init_cpu_signals became unused and the compiler should have bailed out? Probably it's better to disable it directly in the function. That's what I did, with #ifdefs, but brokenly (#ifndef). Well it fails even with the correct #ifdef. Maybe some later patch adds to the breakage. This is really strange - the same test (migrate.tcp) works for Fedora and Windows XP x86. Installation and setup of Windows XP x64 work fine. It is only migration.tcp (when using netcat to connect to the guest) that fails. -- error compiling committee.c: too many arguments to function
[Qemu-devel] Re: [PATCH v2 13/24] kvm: Set up signal mask also for !CONFIG_IOTHREAD
On 2011-03-01 09:39, Avi Kivity wrote: On 02/28/2011 06:54 PM, Avi Kivity wrote: On 02/28/2011 06:49 PM, Jan Kiszka wrote: That's what I tried, and it didn't work?! Maybe I forgot to compile or something. Well, it maybe failed to build as qemu_kvm_init_cpu_signals became unused and the compiler should have bailed out? Probably it's better to disable it directly in the function. That's what I did, with #ifdefs, but brokenly (#ifndef). Well it fails even with the correct #ifdef. Maybe some later patch adds to the breakage. But when ifdef'ed out, this patch should be a nop for qemu-kvm. Indeed strange. This is really strange - the same test (migrate.tcp) works for Fedora and Windows XP x86. Installation and setup of Windows XP x64 work fine. It is only migration.tcp (when using netcat to connect to the guest) that fails. Guess this has to be classically debugged. :-/ Let me know if I can help (though not today). Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux
Re: [Qemu-devel] Re: [patch 2/3] Add support for live block copy
On 02/28/2011 08:12 PM, Anthony Liguori wrote: On Feb 28, 2011 11:47 AM, Avi Kivity a...@redhat.com mailto:a...@redhat.com wrote: On 02/28/2011 07:33 PM, Anthony Liguori wrote: You're just ignoring what I've written. No, you're just impervious to my subtle attempt to refocus the discussion on solving a practical problem. There's a lot of good, reasonably straight forward changes we can make that have a high return on investment. Is making qemu the authoritative source of configuration information a straightforward change? Is the return on it high? Is the investment low? I think this is where we fundamentally disagree. My position is that QEMU is already the authoritative source. Having a state file doesn't change anything. Do a hot unplug of a network device with upstream libvirt with acpiphp unloaded, consult libvirt and then consult the monitor to see who has the right view of the guests config. To me, that's the definition of authoritative. No to all three (ignoring for the moment whether it is good or not, which we were debating). The only suggestion I'm making beyond Marcelo's original patch is that we use a structured format and that we make it possible to use the same file to solve this problem in multiple places. No, you're suggesting a lot more than that. That's exactly what I'm suggesting from a technical perspective. I don't think this creates a fundamental break in how management tools interact with QEMU. I don't think introducing RAID support in the block layer is a reasonable alternative. Why not? Because its a lot of complexity and code that can go wrong while only solving the race for one specific case. Not to mention that we double the iop rate. 
Something that avoids the whole state thing altogether:
- instead of atomically switching when live copy is done, keep on issuing writes to both the origin and the live copy
- issue a notification to management
- management receives the notification, and issues an atomic blockdev switch command

this is really the RAID-1 solution but without the state file (credit Dor). An advantage is that there is no additional latency when trying to catch up to the dirty bitmap.

It still suffers from the two generals problem. You cannot solve this without making one node reliable and that takes us back to it being either QEMU (posted event and state file) or the management tool (sync event).

It is safe w/o a state file by changing the basic live copy algorithm:

1. Live copy in progress stage
   Once the live copy command is issued, a dirty bitmap is created for tracking. There is a single pass over the entire image where we copy blocks from the src to the dst. Write commands for blocks that were already copied will be done twice, for the src and the dst. Once the full single copy pass ends, we trigger a QMP event that this stage can end. The live copy stage keeps running until management issues a switch command. When that happens, the switch is immediate and there is no need to copy additional blocks (but pending IOs are flushed).
2. Management sends a switch command.
   Qemu stops doubling the IO and switches to the destination. End.

Now let's see the error cases:
- qemu failure during stage #1
  No matter what happens, the management will start qemu with the source image. The destination will be erased, no matter how much we copied.
- management failure during stage #1
  The new mgmt daemon needs to query qemu's status. Management can continue as before.
- qemu+mgmt failure at stage #1
  The management should just run qemu with the source image.
- mgmt failure after sending the stage #2 command
  The mgmt DB states that we switched; it just needs to connect to qemu.
- qemu failure before/after getting the stage #2 event
  Management just needs to execute a new qemu with the dst image.
- failure of both qemu and mgmt in stage #2
  The same as above.

Pros:
- Fast switchover time, minimal latency
- No external storage/config needed
- No need to wait for mgmt

Thanks, Dor

Regards, Anthony Liguori

-- error compiling committee.c: too many arguments to function
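The error cases listed in this message reduce to a small decision table, which may be easier to check in code. The following is a minimal C sketch under stated assumptions, not QEMU code: `image_after_failure`, `write_needs_mirroring`, and the enum names are hypothetical, and the only input to the recovery decision is whether management recorded in its own database that it sent the stage #2 switch command.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names for illustration; none of these exist in QEMU. */
enum image { IMAGE_SRC, IMAGE_DST };

/*
 * Which image should management start qemu with after a crash?
 * switch_sent: management's database says the stage #2 switch
 * command was sent (whether or not qemu acknowledged it).
 */
enum image image_after_failure(bool switch_sent)
{
    return switch_sent ? IMAGE_DST : IMAGE_SRC;
}

/*
 * Stage #1 write path: a guest write to a block that the copy pass
 * has already transferred must be issued to both src and dst.
 * copied[i] is true once block i has been copied.
 */
bool write_needs_mirroring(const bool *copied, size_t nblocks, size_t block)
{
    assert(copied != NULL);
    assert(block < nblocks);
    return copied[block];
}
```

Under these assumptions every stage #1 failure maps to `IMAGE_SRC` and every failure after the switch command was sent maps to `IMAGE_DST`, which is why no state file on the qemu side is needed in this variant.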
[Qemu-devel] Re: [PATCH v2 13/24] kvm: Set up signal mask also for !CONFIG_IOTHREAD
On 03/01/2011 10:58 AM, Jan Kiszka wrote: On 2011-03-01 09:39, Avi Kivity wrote: On 02/28/2011 06:54 PM, Avi Kivity wrote: On 02/28/2011 06:49 PM, Jan Kiszka wrote: That's what I tried, and it didn't work?! Maybe I forgot to compile or something. Well, it maybe failed to build as qemu_kvm_init_cpu_signals became unused and the compiler should have bailed out? Probably it's better to disable it directly in the function. That's what I did, with #ifdefs, but brokenly (#ifndef). Well it fails even with the correct #ifdef. Maybe some later patch adds to the breakage. But when ifdef'ed out, this patch should be a nop for qemu-kvm. Indeed strange. Well, there are two functions in cpus.c named qemu_kvm_init_cpu_signals() (an intriguing coincidence). I #ifdefed the wrong one. With the right #ifdef it works correctly. This is really strange - the same test (migrate.tcp) works for Fedora and Windows XP x86. Installation and setup of Windows XP x64 work fine. It is only migration.tcp (when using netcat to connect to the guest) that fails. Guess this has to be classically debugged. :-/ Let me know if I can help (though not today). Still has to be debugged, but at least the tree is alive now. -- error compiling committee.c: too many arguments to function
Re: [Qemu-devel] [Bug 726619] [NEW] loadvm does not load (offline) snapshot anymore
On 28.02.2011 17:04, Ralf Haferkamp wrote:

Public bug reported: qemu Version: 0.14.0. The problem is present in the current code from git master as well. Loading a snapshot that was created while qemu was not running (using qemu-img) does not seem to work anymore.

Is there even a use case for this? While an OS is running, just silently replacing the disk contents isn't going to work. And if you're rebooting anyway, you can as well load the snapshot offline.

Using loadvm snapshot-id in the qemu monitor does not have the desired effect. Not even an error message is displayed. I was able to track down the problem (using git bisect) to this commit: http://git.qemu.org/qemu.git/commit/?id=f0aa7a8b2d518c54430e4382309281b93e51981a After reverting that commit in my local git checkout everything is working as expected again.

In fact, that commit only lets loadvm fail earlier. Before, loadvm would have loaded all disk snapshots and then returned -EINVAL because there was no VM state, now it fails immediately (the VM is stopped in both cases because of the failure). I think the correct fix is to add an error message.

Kevin
[Qemu-devel] Re: [PATCH -V2 1/6] hw/9pfs: Add V9fsfidmap in preparation for adding fd reclaim
On Tue, 1 Mar 2011 12:08:49 +0530, Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com wrote:

diff --git a/hw/9pfs/virtio-9p.h b/hw/9pfs/virtio-9p.h
index 2ae4ce7..2f49641 100644
--- a/hw/9pfs/virtio-9p.h
+++ b/hw/9pfs/virtio-9p.h
@@ -101,7 +101,10 @@ enum p9_proto_version {
 #define P9_NOTAG    (u16)(~0)
 #define P9_NOFID    (u32)(~0)
 #define P9_MAXWELEM 16
+#define P9_FD_RECLAIM_THRES 100

before merge we should change that to 1000. I have updated my local tree. I kept it at 100 to make sure code path get tested when running most of the test case.

+#define FID_REFERENCED 0x1
+#define FID_NON_RECLAIMABLE 0x2
 /*
  * ample room for Twrite/Rread header
  * size[4] Tread/Twrite tag[2] fid[4] offset[8] count[4]
@@ -185,17 +188,22 @@ typedef struct V9fsXattr
     int flags;
 } V9fsXattr;

+typedef struct V9fsfidmap {
+    union {
+        int fd;
+        DIR *dir;
+        V9fsXattr xattr;
+    } fs;
+    int fid_type;
+    V9fsString path;
+    int flags;
+} V9fsFidMap;
+
 struct V9fsFidState {
-    int fid_type;
     int32_t fid;
-    V9fsString path;
-    union {
-        int fd;
-        DIR *dir;
-        V9fsXattr xattr;
-    } fs;
     uid_t uid;
+    V9fsFidMap fsmap;
     V9fsFidState *next;
 };

-aneesh
Re: [Qemu-devel] Re: KVM call agenda for Jan 25
On Mon, Feb 28, 2011 at 8:41 PM, Dushyant Bansal cs5070...@cse.iitd.ac.in wrote:
On Sunday 27 February 2011 04:19 PM, Stefan Hajnoczi wrote:
On Sat, Feb 26, 2011 at 9:50 PM, Dushyant Bansal cs5070...@cse.iitd.ac.in wrote:

virtual size: 10G (10737418240 bytes)
disk size: 569M
convert - original: time 0m52.522s
convert - modified (resultant disk size: 5.3G): time 2m12.744s
cp: time 0m51.724s (same disk size)
---
virtual size: 10G (10737418240 bytes)
disk size: 3.6G
convert - original: time 1m52.249s
convert - modified (resultant disk size: 7.1G): time 3m2.891s
cp: time 1m55.320s (same disk size)
---
In these results, we can see that resultant disk size has increased.

If I'm reading this correctly then qemu-img convert is within a few seconds of cp(1) for you?

I have collected and attached some more time results for different operations on a 2.2G image. qemu-img convert is close to cp.
qemu(modified): IO_BUF_SIZE = (2 * 1024); if any sector is non-null, write all sectors

Nice that qemu-img convert isn't that far out by default on raw :).

About Google Summer of Code, I have posted my take on applying and want to share that with you and qemu-devel: http://blog.vmsplice.net/2011/03/advice-for-students-applying-to-google.html

Stefan
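The growth in resultant disk size reported for the modified convert is consistent with the "if any sector is non-null, write all sectors" policy. The sketch below contrasts that policy with a per-sector check. It is an illustration only: the function names and `SECTOR_SIZE` are hypothetical, and the assumption that the unmodified qemu-img skips zero sectors individually is mine, not something stated in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SECTOR_SIZE 512  /* assumed sector size */

/* Return true if the sector contains at least one non-zero byte. */
static bool sector_has_data(const uint8_t *sector)
{
    for (size_t i = 0; i < SECTOR_SIZE; i++) {
        if (sector[i] != 0) {
            return true;
        }
    }
    return false;
}

/* Per-sector policy: only sectors that hold data are written. */
size_t sectors_written_per_sector(const uint8_t *buf, size_t nsectors)
{
    size_t written = 0;

    assert(buf != NULL);
    for (size_t s = 0; s < nsectors; s++) {
        if (sector_has_data(buf + s * SECTOR_SIZE)) {
            written++;
        }
    }
    return written;
}

/* Whole-buffer policy: one non-zero sector makes every sector get written. */
size_t sectors_written_whole_buffer(const uint8_t *buf, size_t nsectors)
{
    return sectors_written_per_sector(buf, nsectors) > 0 ? nsectors : 0;
}
```

With a four-sector buffer that has a single non-zero byte, the per-sector policy writes one sector and the whole-buffer policy writes all four, so sparse regions inside a buffer are no longer preserved in the output image.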
Re: [Qemu-devel] Re: [patch 2/3] Add support for live block copy
On 02/28/2011 08:12 PM, Anthony Liguori wrote: On Feb 28, 2011 11:47 AM, Avi Kivity a...@redhat.com mailto:a...@redhat.com wrote: On 02/28/2011 07:33 PM, Anthony Liguori wrote: You're just ignoring what I've written. No, you're just impervious to my subtle attempt to refocus the discussion on solving a practical problem. There's a lot of good, reasonably straight forward changes we can make that have a high return on investment. Is making qemu the authoritative source of configuration information a straightforward change? Is the return on it high? Is the investment low? I think this is where we fundamentally disagree. My position is that QEMU is already the authoritative source. Having a state file doesn't change anything. Do a hot unplug of a network device with upstream libvirt with acpiphp unloaded, consult libvirt and then consult the monitor to see who has the right view of the guests config. libvirt is right and the monitor is wrong. On real hardware, calling _EJ0 doesn't affect the configuration one little bit (if I understand it correctly). It just turns off power to the slot. If you power-cycle, the card will be there. In the real world, the authoritative source of configuration is a human with a screwdriver. The virtualized equivalent is the management tool. To me, that's the definition of authoritative. No to all three (ignoring for the moment whether it is good or not, which we were debating). The only suggestion I'm making beyond Marcelo's original patch is that we use a structured format and that we make it possible to use the same file to solve this problem in multiple places. No, you're suggesting a lot more than that. That's exactly what I'm suggesting from a technical perspective. Unless I'm hallucinating, you're suggesting quite a bit more. A revolution in how qemu is to be managed. I don't think this creates a fundamental break in how management tools interact with QEMU. 
I don't think introducing RAID support in the block layer is a reasonable alternative.

Why not?

Because it's a lot of complexity and code that can go wrong while only solving the race for one specific case. Not to mention that we double the iop rate.

IMO it's of similar complexity. The number of I/Os doesn't change (reads stay the same, and any write that has already been mirrored needs to be re-mirrored in both cases). We do gain lower latency switchover and we package the code as a block format driver instead of core block code. We decouple the dependencies from live migration.

Something that avoids the whole state thing altogether:
- instead of atomically switching when live copy is done, keep on issuing writes to both the origin and the live copy
- issue a notification to management
- management receives the notification, and issues an atomic blockdev switch command

this is really the RAID-1 solution but without the state file (credit Dor). An advantage is that there is no additional latency when trying to catch up to the dirty bitmap.

It still suffers from the two generals problem. You cannot solve this without making one node reliable and that takes us back to it being either QEMU (posted event and state file) or the management tool (sync event).

It works without either. If qemu fails, you simply re-mirror everything. If the management tool fails, it re-subscribes to the mirror-complete event, queries whether it already happened in its absence, and if it did, requests the switchover.

-- error compiling committee.c: too many arguments to function
Re: [Qemu-devel] Re: [patch 2/3] Add support for live block copy
On 02/28/2011 08:56 PM, Marcelo Tosatti wrote:

Something that avoids the whole state thing altogether:
- instead of atomically switching when live copy is done, keep on issuing writes to both the origin and the live copy
- issue a notification to management
- management receives the notification, and issues an atomic blockdev switch command

How do you know if QEMU performed the switch or not? That is, how can the switch command be atomic?

If you fail while qemu processes the command, you query it to see what its current state is. You can simply re-issue the command; it's idempotent. If qemu fails before you issue the switch you discard the new copy. If it fails after the switch (whether it acked or not) you discard the original.

this is really the RAID-1 solution but without the state file (credit Dor). An advantage is that there is no additional latency when trying to catch up to the dirty bitmap.

So you're implementing a virtual block driver not visible to the user as an image format. The image format would allow storage of persistent copy progress, etc. so you lose all that. All of that to avoid introducing a commit file which is not part of global qemu state.

I think it has other advantages as well - code isolation, live migration compatibility, reduced switchover times. Really the bad thing about the commit file is its un-documented-ness and ad-hoc nature which means the management tool is unlikely to get things right.

Qemu maintaining state is fine. It already does - in block devices, and should do more. We just have to take the right approach to do it (and be careful not to make it manage configuration).

-- error compiling committee.c: too many arguments to function
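The claim that the switch command can simply be re-issued rests on it being idempotent. A minimal C sketch of what that property could look like follows. `struct mirror`, `mirror_switch`, and the enum are hypothetical names chosen for illustration and are not part of any QEMU interface; the rule that a switch is refused before the copy pass completes is also an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a mirrored block device; not QEMU code. */
enum active_image { ACTIVE_SRC, ACTIVE_DST };

struct mirror {
    bool copy_complete;        /* the copy pass has finished */
    enum active_image active;  /* image the guest currently runs on */
};

/*
 * Switch the guest to the destination image.
 * Returns 0 on success and -1 if the copy has not completed yet.
 * Calling it again after a successful switch changes nothing and
 * still returns 0, so a caller that lost the reply can retry safely.
 */
int mirror_switch(struct mirror *m)
{
    assert(m != NULL);
    if (m->active == ACTIVE_DST) {
        return 0;              /* already switched: no-op */
    }
    if (!m->copy_complete) {
        return -1;             /* too early, nothing changed */
    }
    m->active = ACTIVE_DST;
    return 0;
}
```

A management tool that crashes between sending the command and reading the reply can, under this model, reconnect and call the switch again without needing to know whether the first attempt was processed.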
Re: [Qemu-devel] [PATCH] `qdev_free` when unplug a pci device
On Tue, Mar 01, 2011 at 03:32:08PM +0800, Wen Congyang wrote:

The issue is, Qemu injected sci interrupt into guest, but before the guest finishes handling it, users/qemu can start to inject the next sci event triggered by hot plug/unplug. Thus qemu loses sci interrupt and up/down status.

Yes, you are right. But I still do not understand why introducing pending bits can not solve the issue?

Maybe your patch works. The difference is to introduce new variables or new state machine. Anyway I suppose that the eventual solution is to switch to express native hotplug which is much more reliable and doesn't rely on ACPI. (note: I'm very biased here.)

thanks

Thanks
Wen Congyang

thanks

- on power-on
  (0, 0)
- on hot plug
  (0, 0) -> (1, 0)
  if other combination -> error
- on hot unplug
  (1, 0) or (0, 0) -> (0, 1)
  (0, 0) is for cold plugged devices. (1, 0) is for hot plugged devices.
  if other combination -> error
- on pciej_write (write on PCI_EJ_BASE)
  (0, 1) -> (0, 0) and qdev_free()
  if other combination -> ignore

When enabling sci, those bits are checked and raise sci if necessary. Any comments on this?

Anyway the current pci hotplug-related commands are somewhat broken, and needs clean up with multifunction hotplug support which is on my todo list.

thanks

On Mon, Feb 28, 2011 at 10:52:40AM +0800, Wen Congyang wrote:

Hi Markus Armbruster

At 02/23/2011 04:30 PM, Markus Armbruster Write:
Isaku Yamahata yamah...@valinux.co.jp writes:
<snip>

I don't think this patch is correct. Let me explain. Device hot unplug is *not* guaranteed to succeed. For some buses, such as USB, it always succeeds immediately, i.e. when the device_del monitor command finishes, the device is gone. Life is good. But for PCI, device_del merely initiates the ACPI unplug rain dance. It doesn't wait for the dance to complete. Why? The dance can take an unpredictable amount of time, including forever.
Problem: Subsequent device_add can fail if it reuses the qdev ID or PCI slot, and the unplug has not yet completed (race condition), or it failed. Yes, Virginia, PCI hotplug *can* fail.

When unplug succeeds, the qdev is automatically destroyed. pciej_write() does that for PIIX4. Looks like pcie_cap_slot_event() does it for PCIE.

I got a similar problem. When I unplug a pci device by hand, it works as expected, and I can hotplug it again. But when I use a script to do the same thing, sometimes it failed. I think I may find another bug.

Steps to reproduce this bug:

1. cat ./test-e1000.sh

#! /bin/bash
# RHEL6RC is domain name
while true; do
    virsh attach-interface RHEL6RC network default --mac 52:54:00:1f:db:c7 --model e1000
    if [[ $? -ne 0 ]]; then
        break
    fi
    virsh detach-interface RHEL6RC network --mac 52:54:00:1f:db:c7
    if [[ $? -ne 0 ]]; then
        break
    fi
    sleep 5
done

2. ./test-e1000.sh

Interface attached successfully
Interface detached successfully
Interface attached successfully
Interface detached successfully
error: Failed to attach interface
error: operation failed: adding e1000,netdev=hostnet1,id=net1,mac=52:54:00:1f:db:c7,bus=pci.0,addr=0x8 device failed: Duplicate ID 'net1' for device

I use ftrace to trace whether sci interrupt is passed to guest OS, here is log:

# cat trace | grep 'irq=9'
<idle>-0 [000] 118342.634772: irq_handler_entry: irq=9 name=acpi
<idle>-0 [000] 118342.634831: irq_handler_exit: irq=9 ret=handled
<idle>-0 [000] 118342.693216: irq_handler_entry: irq=9 name=acpi
<idle>-0 [000] 118342.693254: irq_handler_exit: irq=9 ret=handled
<idle>-0 [000] 118347.737689: irq_handler_entry: irq=9 name=acpi
<idle>-0 [000] 118347.737738: irq_handler_exit: irq=9 ret=handled

According to the log, we only receive 3 sci interrupts, and one is lost.
I enable piix4's debug and 1 line printf into piix4_device_hotplug:

printf("slot: %d, up: %d, down: %d, state: %d\n", slot, s->pci0_status.up, s->pci0_status.down, state);

here is the log:

PM: mapping to 0xb000
PM readw port=0x0004 val=0x
...
gpe write afe2 == 0
gpe write afe0 == 255
gpe write afe3 == 0
gpe write afe1 == 255
PM readw port=0x val=0x0001
PM readw port=0x0002 val=0x
gpe read afe0 == 0
gpe read afe2 == 0
gpe read afe1 == 0
gpe read afe3 == 0
PM writew port=0x val=0x0020
PM readw port=0x0002 val=0x
PM writew port=0x0002 val=0x0020
PM readw port=0x0002 val=0x0020
gpe write afe2 == 255
gpe write afe3 == 255
...
slot: 6, up: 0, down: 0, state: 1

we attach interface the
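The pending-bit transitions proposed earlier in this message (power-on, hot plug, hot unplug, pciej_write) can be written down as a small state machine. The sketch below is one reading of that list, not the actual PIIX4 code: the struct and function names are hypothetical, and it assumes the pair is (up, down) as the debug printf above suggests.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-slot pending bits, read as the pair (up, down). */
struct slot_state {
    bool up;
    bool down;
};

enum slot_result { SLOT_OK, SLOT_ERROR, SLOT_IGNORED };

/* Hot plug: (0, 0) -> (1, 0); any other combination is an error. */
enum slot_result slot_hot_plug(struct slot_state *s)
{
    assert(s != NULL);
    if (s->up || s->down) {
        return SLOT_ERROR;
    }
    s->up = true;
    return SLOT_OK;
}

/* Hot unplug: (1, 0) or (0, 0) -> (0, 1); anything else is an error. */
enum slot_result slot_hot_unplug(struct slot_state *s)
{
    assert(s != NULL);
    if (s->down) {             /* (0, 1) or (1, 1): unplug already pending */
        return SLOT_ERROR;
    }
    s->up = false;
    s->down = true;
    return SLOT_OK;
}

/*
 * Guest eject write: (0, 1) -> (0, 0) and the device is freed;
 * any other combination is ignored. *free_device tells the caller
 * whether it should now destroy the device.
 */
enum slot_result slot_eject(struct slot_state *s, bool *free_device)
{
    assert(s != NULL && free_device != NULL);
    *free_device = false;
    if (s->up || !s->down) {
        return SLOT_IGNORED;
    }
    s->down = false;
    *free_device = true;
    return SLOT_OK;
}
```

In this reading a second hot plug on a slot that is still (1, 0) is rejected, and an eject write that arrives with no unplug pending has no effect, which matches the "error" and "ignore" branches in the list.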
[Qemu-devel] [PATCH] Add error message for loading snapshot without VM state
It already fails, but it didn't tell the user why.

Signed-off-by: Kevin Wolf kw...@redhat.com
---
 savevm.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/savevm.c b/savevm.c
index a50fd31..6e026a8 100644
--- a/savevm.c
+++ b/savevm.c
@@ -1996,6 +1996,8 @@ int load_vmstate(const char *name)
     if (ret < 0) {
         return ret;
     } else if (sn.vm_state_size == 0) {
+        error_report("This is a disk-only snapshot. Revert to it offline "
+            "using qemu-img.");
         return -EINVAL;
     }

--
1.7.2.3
Re: [Qemu-devel] [RFC][PATCH 0/2] Allow cache settings for block devices to be changed at runtime.
On Mon, Feb 28, 2011 at 3:48 PM, Kevin Wolf kw...@redhat.com wrote:
On 28.02.2011 16:35, Stefan Hajnoczi wrote:
On Mon, Feb 28, 2011 at 3:12 PM, Kevin Wolf kw...@redhat.com wrote:
On 28.02.2011 12:49, Prerna Saxena wrote:

The following patchset introduces monitor commands:
1. set_cache DEVICE CACHE-SETTING
   Change cache settings for block device, DEVICE, through the monitor. (Available options: 'none', 'writeback', 'writethrough')
   Eg, (qemu) set_cache ide0-hd0 none
   - Changes cache setting for ide0-hd0 to 'none'

Not sure if adding this interface is a good idea. I see that you only add it for HMP, and we may consider that, but it's definitely not suitable for QMP. One reason is that none/writethrough/writeback/unsafe isn't really what we want to use long term. We want to separate advertising a write cache (which is guest visible) from things like whether to use O_DIRECT or not. In the past, Christoph mentioned that he had patches to make these separate and even let the guest change the write cache enabled flag, which would probably solve most of the use cases of this patch.

Toggling host page cache at runtime is useful too because it saves having to restart VMs.

Not sure why I wanted to change that during runtime, but agreed, allowing to change parameters using the monitor is generally a good thing. However, I'm not sure if a command for changing the cache mode is the right solution, or if it should be something like a command to change block device options. (For example, what about toggling read-only or snapshot mode?)

Yes, I think you're right. We should aim for a general interface rather than having to add many more specific interfaces in the future.

CQ: Do you see a relation to the update interface you added to adjust drive options at runtime for FVD?

Stefan
[Qemu-devel] Re: [Bug 638955] Re: emulated netcards don't work with recent sunos kernel
On Mon, Feb 28, 2011 at 7:06 PM, geppz no_carr...@plasmacore.com wrote:

Going with tcpdump -e from within the guest, I have identified that the problem is when a big enough packet is output. I tried a few times with dmesg, and as soon as the tcp packet reaches the following length:

18:38:28.340097 52:54:69:b5:89:11 (oui Unknown) > 00:19:b9:81:2c:52 (oui Unknown), ethertype IPv4 (0x0800), length 1514: 192.168.7.38.ssh > 192.168.7.52.59008: Flags [.], ack 2824, win 64436, options [nop,nop,TS val 27488132 ecr 6063255], length 1448

it cannot get through. Then the IP stack tries and retries to send the same identical packet, but there will never be any reply from the other side. Finally the socket is torn down.

I have bridged networking for the VM. My bridge is a normal linux bridge br0 with MTU 1500. Has MTU anything to do with all this? Is it a linux-bridge bug or a qemu-kvm bug?

Excellent, thanks for posting these details. The bug is probably in the NIC hardware emulation and I think we can track this one down fairly easily. Can you please post your qemu-kvm command-line including the NIC model that you are using?

Stefan

--
You received this bug notification because you are a member of qemu-devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/638955

Title: emulated netcards don't work with recent sunos kernel
Status in QEMU: New
Bug description:

hi there, i'm using qemu-kvm backend in version:

# qemu-kvm -version
QEMU PC emulator version 0.12.5 (qemu-kvm-0.12.5), Copyright (c) 2003-2008 Fabrice Bellard

and there are just *not working any of model=$type with combinations of recent sunos (solaris, openindiana, opensolaris, ..) ..

you can download for testing purposes iso from here: http://dlc-origin.openindiana.org/isos/147/ or from here: http://genunix.org/distributions/indiana/

osol and oi are also ubuntu-like *live cds, so no need to bother with installing

behaviour is as follows:
e1000 - receiving doesn't work, transmitting works .. dladm (tool for handling ethers) shows that is all ok, correct mode is loaded up, it just seems like this driver works at 100% but ..
rtl8169|pcnet - works in 10Mbit mode with several other issues like high cpu utilization and so .. dladm is unable to recognize options for this kind of -nic
others - just don't work ..

i experienced this issue several times in past .. workaround was, that rtl8169 worked so-so .. with recent sunos kernel it doesn't.

it's easy to reproduce, this is why i'm not putting here more than launching script for my virtual machine:

# cat openindiana.sh
qemu-kvm -hda /home/kvm/openindiana/openindiana.img -m 2048 -localtime -cdrom /home/kvm/+images/oi-dev-147-x86.iso -boot d \
  -vga std -vnc :9 -k en-us -monitor unix:/home/kvm/openindiana/instance,server,nowait \
  -net nic,model=e1000,vlan=1 -net tap,ifname=oi0,script=no,vlan=1
sleep 2; ip l set oi0 up; ip a a 192.168.99.9/24 dev oi0;

regards by daniel
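On the MTU question raised in this report: the tcpdump line shows a frame length of 1514 and a TCP payload of 1448, and a little arithmetic shows that this is exactly a full-size frame for an MTU of 1500. The C sketch below spells out the sum. The function and constant names are illustrative, and it assumes an IPv4 header without options and a TCP header carrying only the nop,nop,timestamp options shown in the capture.

```c
#include <assert.h>

#define ETH_HEADER_LEN    14u /* dst MAC + src MAC + ethertype, no FCS */
#define IPV4_HEADER_LEN   20u /* assumes no IP options */
#define TCP_HEADER_LEN    20u /* fixed part of the TCP header */
#define TCP_TS_OPTION_LEN 12u /* nop + nop + 10-byte timestamp option */
#define IPV4_MAX_TOTAL    65535u

/* Length of the IP packet (the quantity the MTU limits). */
unsigned ip_packet_len(unsigned tcp_payload_len)
{
    assert(tcp_payload_len <= IPV4_MAX_TOTAL - IPV4_HEADER_LEN
                              - TCP_HEADER_LEN - TCP_TS_OPTION_LEN);
    return IPV4_HEADER_LEN + TCP_HEADER_LEN + TCP_TS_OPTION_LEN
           + tcp_payload_len;
}

/* Length of the Ethernet frame as reported by tcpdump -e. */
unsigned frame_len(unsigned tcp_payload_len)
{
    return ETH_HEADER_LEN + ip_packet_len(tcp_payload_len);
}
```

For a payload of 1448 bytes this gives an IP packet of 1500 bytes and a frame of 1514 bytes, so the first packet that fails is the first one that fills the MTU completely. That fits a problem with maximum-size frames in the emulated NIC path rather than a packet exceeding the bridge MTU, though the capture alone does not prove where the frame is dropped.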
Re: [Qemu-devel] [RFC][PATCH 0/2] Allow cache settings for block devices to be changed at runtime.
On 01.03.2011 10:55, Stefan Hajnoczi wrote:
On Mon, Feb 28, 2011 at 3:48 PM, Kevin Wolf kw...@redhat.com wrote:
On 28.02.2011 16:35, Stefan Hajnoczi wrote:
On Mon, Feb 28, 2011 at 3:12 PM, Kevin Wolf kw...@redhat.com wrote:
On 28.02.2011 12:49, Prerna Saxena wrote:

The following patchset introduces monitor commands:
1. set_cache DEVICE CACHE-SETTING
   Change cache settings for block device, DEVICE, through the monitor. (Available options: 'none', 'writeback', 'writethrough')
   Eg, (qemu) set_cache ide0-hd0 none
   - Changes cache setting for ide0-hd0 to 'none'

Not sure if adding this interface is a good idea. I see that you only add it for HMP, and we may consider that, but it's definitely not suitable for QMP. One reason is that none/writethrough/writeback/unsafe isn't really what we want to use long term. We want to separate advertising a write cache (which is guest visible) from things like whether to use O_DIRECT or not. In the past, Christoph mentioned that he had patches to make these separate and even let the guest change the write cache enabled flag, which would probably solve most of the use cases of this patch.

Toggling host page cache at runtime is useful too because it saves having to restart VMs.

Not sure why I wanted to change that during runtime, but agreed, allowing to change parameters using the monitor is generally a good thing. However, I'm not sure if a command for changing the cache mode is the right solution, or if it should be something like a command to change block device options. (For example, what about toggling read-only or snapshot mode?)

Yes, I think you're right. We should aim for a general interface rather than having to add many more specific interfaces in the future.

CQ: Do you see a relation to the update interface you added to adjust drive options at runtime for FVD?

On one hand it's a different set of options today. IIUC, qemu-img update refers to persistent per-image options like backing file, encryption, etc.
This monitor command refers to temporary command line options like cache, snapshot mode etc. On the other hand, we've had discussions before about storing a copy-on-read flag in images which makes sense as a command line option as well. The same may apply to things like the read-only flags. So maybe these two sets of flags aren't that distinct from each other. Kevin
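The point made in this thread about separating the guest-visible write cache from host-side choices such as O_DIRECT can be made concrete by splitting a cache mode string into independent flags. The following is a sketch only: `struct cache_flags` and `parse_cache_mode` are hypothetical names, and the mapping of each mode to flags is an assumption based on how later QEMU versions treat these modes, so it may not match the behaviour of the 0.14 code under discussion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical decomposition of a cache= mode into independent flags. */
struct cache_flags {
    bool writeback; /* guest-visible: a volatile write cache is advertised */
    bool direct;    /* host-side: bypass the host page cache (O_DIRECT) */
    bool no_flush;  /* host-side: flush requests are ignored */
};

/* Returns 0 on success, -1 if the mode name is not recognised. */
int parse_cache_mode(const char *mode, struct cache_flags *out)
{
    assert(mode != NULL && out != NULL);
    out->writeback = false;
    out->direct = false;
    out->no_flush = false;

    if (strcmp(mode, "none") == 0) {
        out->writeback = true;
        out->direct = true;
    } else if (strcmp(mode, "writeback") == 0) {
        out->writeback = true;
    } else if (strcmp(mode, "writethrough") == 0) {
        /* all flags stay false */
    } else if (strcmp(mode, "unsafe") == 0) {
        out->writeback = true;
        out->no_flush = true;
    } else {
        return -1;
    }
    return 0;
}
```

Seen this way, a general "change block device options" command could toggle `direct` at runtime without touching `writeback`, which is the separation the thread argues for.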
[Qemu-devel] Re: [PATCH] Add error message for loading snapshot without VM state
Kevin Wolf kw...@redhat.com wrote:

It already fails, but it didn't tell the user why.

Signed-off-by: Kevin Wolf kw...@redhat.com
---
 savevm.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/savevm.c b/savevm.c
index a50fd31..6e026a8 100644
--- a/savevm.c
+++ b/savevm.c
@@ -1996,6 +1996,8 @@ int load_vmstate(const char *name)
     if (ret < 0) {
         return ret;
     } else if (sn.vm_state_size == 0) {
+        error_report("This is a disk-only snapshot. Revert to it offline "
+            "using qemu-img.");
         return -EINVAL;
     }

Reviewed-by: Juan Quintela quint...@redhat.com
Re: [Qemu-devel] [PATCH -V2 4/6] hw/9pfs: Implement syncfs
On Tue, Mar 1, 2011 at 6:38 AM, Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com wrote:

Signed-off-by: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com
---
 hw/9pfs/virtio-9p.c |   31 +++
 hw/9pfs/virtio-9p.h |    2 ++
 2 files changed, 33 insertions(+), 0 deletions(-)

diff --git a/hw/9pfs/virtio-9p.c b/hw/9pfs/virtio-9p.c
index c4b0198..882f4f3 100644
--- a/hw/9pfs/virtio-9p.c
+++ b/hw/9pfs/virtio-9p.c
@@ -1978,6 +1978,36 @@ static void v9fs_fsync(V9fsState *s, V9fsPDU *pdu)
     v9fs_post_do_fsync(s, pdu, err);
 }

+static void v9fs_post_do_syncfs(V9fsState *s, V9fsPDU *pdu, int err)
+{
+    if (err == -1) {
+        err = -errno;
+    }
+    complete_pdu(s, pdu, err);
+}
+
+static void v9fs_syncfs(V9fsState *s, V9fsPDU *pdu)
+{
+    int err;
+    int32_t fid;
+    size_t offset = 7;
+    V9fsFidState *fidp;
+
+    pdu_unmarshal(pdu, offset, "d", &fid);
+    fidp = lookup_fid(s, fid);
+    if (fidp == NULL) {
+        err = -ENOENT;
+        v9fs_post_do_syncfs(s, pdu, err);
+        return;
+    }
+    /*
+     * We don't have per file system syncfs
+     * So just return success
+     */
+    err = 0;
+    v9fs_post_do_syncfs(s, pdu, err);
+}

Please explain the semantics of P9_TSYNCFS. Won't returning success without doing anything lead to data integrity issues?

It seems unnecessary to split v9fs_post_do_syncfs() into its own function since there is no blocking point here and a callback will not be needed.

Stefan
Re: [Qemu-devel] [PATCH (resend, rebase) 3/3] virtio-serial: Enable ioeventfd
On Tue, Mar 1, 2011 at 6:41 AM, Amit Shah amit.s...@redhat.com wrote: On (Mon) 28 Feb 2011 [15:28:49], Stefan Hajnoczi wrote: On Mon, Feb 28, 2011 at 11:12 AM, Amit Shah amit.s...@redhat.com wrote: Enable ioeventfd for virtio-serial devices by default. Commit 25db9ebe15125deb32958c6df74996f745edf1f9 lists the benefits of using ioeventfd. Copying a file from guest to host over a virtio-serial channel didn't show much difference in time or io_exit rate. The cost of enabling ioeventfd is one eventfd file descriptor and KVM in-kernel device slot per virtqueue. The current maximum number per VM is 200, this is a kernel limit in include/linux/kvm_host.h:NR_IOBUS_DEVS. Do you really want to use ioeventfd for virtio-serial? Perhaps this is more useful for high-frequency device interfaces. I guess virtio-serial is being used heavily -- by almost all guest agents nowadays. The primary use-case, though, is not for high-bandwidth communication. This setting could be default off, it didn't show any difference in my test run, but depends on what people who use it see and think. I don't have strong opinions about this but wanted to make you aware that there is a limited number of ioeventfds to go around. Stefan
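The resource concern in this exchange is simple bookkeeping: each virtqueue that uses ioeventfd consumes one slot out of a per-VM limit of 200. A minimal C sketch of that check follows. The function name is hypothetical, `NR_IOBUS_DEVS` is redefined locally with the value quoted in the thread, and the sketch ignores any other in-kernel devices that may share the same slots.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_IOBUS_DEVS 200u /* per-VM limit quoted in the thread */

/*
 * Return true if 'wanted' additional virtqueues can each be given an
 * ioeventfd when 'in_use' slots are already taken.
 */
bool ioeventfd_slots_available(unsigned in_use, unsigned wanted)
{
    assert(in_use <= NR_IOBUS_DEVS);
    return wanted <= NR_IOBUS_DEVS - in_use;
}
```

A device model could use a check of this kind to fall back to ordinary notification for low-bandwidth devices once the budget is nearly spent, which is one way to act on the "limited number of ioeventfds to go around" remark.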
[Qemu-devel] QEMU development
Hi, I have done some development on the QEMU live code and I want to submit my work. Should I submit the patch directly to the mailing list or do something else? I have not fixed a bug but added an extra feature to QEMU. Regards
Re: [Qemu-devel] QEMU development
On 1 March 2011 10:48, maheen butt maheen_but...@yahoo.com wrote: I have done some development in QEMU live code and I want to submit my work. should I directly submit the patch to mailing list or do something else?? I have not fixed the bug but added extra feature to QEMU. How to submit a patch is documented here: http://wiki.qemu.org/Contribute/SubmitAPatch -- PMM
Re: [Qemu-devel] 68k and BeBox (was SymbianOS, MeeGO, WebOS and QEMU)
On Tuesday, 1 March 2011 at 01:23 +0100, François Revol wrote: On 1 March 2011 at 01:18, Natalia Portillo wrote: Well, most of those emulators do not support the required mmu, except ARAnyM (and their mmu patch was backported to UAE I think). That's the main problem, but first of all in QEMU there is the need for complete pre-Coldfire 68ks, as well as the modular support for FPUs and MMU (Sun and Apple's Lisa) Yeah old Sun stuff used their own mmu due to missing support in pre-020 for some insn restart. Currently the fastest ones would be BeBox, Mac68k and NeXT machines, because almost all devices are already emulated, but the assembly itself, firmware and CPU/FPU/MMU in case of 68k. IIRC the Mac68k hardware is quite obscure and model-dependent... but EMILE and BasiliskII should say enough. They will not help you: - EMILE uses Mac ROM to access hardware - BasiliskII patches the ROM to call its internal drivers instead of accessing hardware. The best source for hardware definition is linux... If it can help I think I have all hardware reference manuals for m68k macintosh. Just posted my BeBox patch btw. A/UX would be fun to run :-) There used to be UNIX for Atari TT also IIRC, though not sure it was ever published. There is a binary dump somewhere, I may have it. So should I, just can't recall where I left it. François. -- - laur...@vivier.eu -- "All that is impossible remains to be accomplished" Jules Verne "Things are only impossible until they're not" Jean-Luc Picard
Re: [Qemu-devel] 68k and BeBox (was SymbianOS, MeeGO, WebOS and QEMU)
On 1 March 2011 at 13:02, Laurent Vivier wrote: Currently the fastest ones would be BeBox, Mac68k and NeXT machines, because almost all devices are already emulated, but the assembly itself, firmware and CPU/FPU/MMU in case of 68k. IIRC the Mac68k hardware is quite obscure and model-dependent... but EMILE and BasiliskII should say enough. They will not help you: - EMILE uses Mac ROM to access hardware - BasiliskII patches the ROM to call its internal drivers instead of accessing hardware. Oh right, I was sure there was a trick somewhere :) The best source for hardware definition is linux... Or NetBSD, if you don't have enough aspirin for Linux. http://www.netbsd.org/ports/mac68k/ And I have an LCIII around for testing. As well for the BeBox: http://www.netbsd.org/ports/bebox/hardware.html If it can help I think I have all hardware reference manuals for m68k macintosh. Actually I think they used to be online until recently, but Apple revamped their archive not too long ago IIRC. François.
Re: [Qemu-devel] QEMU: Discussion of separating core functionality vs supportive features
On 02/28/2011 07:44 PM, Anthony Liguori wrote:
On Feb 28, 2011 10:44 AM, Jes Sorensen jes.soren...@redhat.com wrote:

Hi, On last week's call we discussed the issue of splitting non-core features of QEMU into their own process to reduce the security risks etc. I wrote up a summary of my thoughts on this to try to cover the various issues. Feedback welcome and hopefully we can continue the discussion on a future call - maybe next week? I would like to be part of the discussion, but it's a public holiday here March 1st, so I won't be on that call. Cheers, Jes

Separating host-side virtagent and other tasks from core QEMU
=============================================================

To improve auditing of the core QEMU code, it would be ideal to be able to separate the core QEMU functionality from utility functionality by moving the utility functionality into its own process. This process will be referred to as the QEMU client below.

Separating as in moving code outside of qemu.git, making code optionally built in, making code optionally built in or loadable as a separate executable that is automatically launched, or making code always built outside the main executable? I'm very nervous about having a large number of daemons necessary to run QEMU. I think a reasonable approach would be a single front-end daemon.

s/daemon/son processes/ Qemu is the one that should spawn them and they should be transparent to the management. This way running qemu stays the same and qemu just needs to add the logic to get a SIGCHLD and potentially re-execute a dead son process.

Once QAPI is merged, there is a very incremental approach we can take for this sort of work. Take your favorite subsystem (like gdbstub or SDL) and make it only use QMP apis. Once we're only using QMP internally in a subsystem, then building it out of the main QEMU and using libqmp should be fairly easy. Regards, Anthony Liguori

Components which are candidates for moving out of QEMU include:
- virtagent
- vnc server (and other graphical displays such as SDL, spice and curses)
- human monitor

These are all much harder than they may seem. There's a ton of QMP functions that would be needed before we could even try to do this.

The idea is to have QEMU launch as a daemon, and then allow for one or more client processes to connect to it. These will then offer the various services. The main issue to discuss is how to handle various state information, reconnects, and migration.

Security
========

The primary reason for this discussion is security, however there are other benefits from this split, as will be mentioned below. During a demo of virtagent I hit a case where a bug in the agent command handling code caused a crash of the host QEMU process. While it is probably a simple bug, it shows how adding more complexity to the QEMU process increases the risk of adding security problems that could potentially be exploited by a hostile guest. By splitting non-core functionality into a QEMU client process, the host process will be isolated from a large number of potential security problems, i.e. in case a client process is killed or crashes, it should not affect the main QEMU process. In addition it makes it easier to audit the core QEMU functionality.

virtagent
=========

In short virtagent provides a set of simple commands, most of which do not have state associated with them. These include shutdown, ping, fsfreeze/fsthaw, etc. Other commands might be multi-stage commands which could fail if the client is disconnected from the daemon while the command is in progress. These include copy-paste and file copy.

vnc server
==========

The vnc server simply needs a connection to the video memory of the QEMU process, video mode state, as well as channels for sending keyboard and mouse events. It is stateless by nature and supports reconnects. This applies to the other graphical display engines (SDL, spice, and curses) as well.

Human monitor
=============

The human monitor is effectively stateless. It issues commands and prints the result. There is no state in the monitor and it can be built directly on top of QMP. An additional benefit here is that it would allow for multiple monitors.

Disconnects
===========

It must be possible for a client process to get killed or disconnected from the QEMU process, in which case it should be possible to launch a new client that connects to the QEMU process. In this case, commands need to be provided allowing the client process to query the QEMU process and virtagent for current state information. In-progress commands may fail, and will need to be re-run, such as copy-paste and file copy. However neither of these are vital commands and a re-run of such commands is acceptable behavior.
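The "re-execute a dead child process" idea raised in this thread can be sketched with plain POSIX calls. This is only an illustration under stated assumptions: `spawn_helper` and `supervise_helper` are hypothetical names, the child simulates a crash by exiting with a non-zero status instead of running a real client, and the parent blocks in `waitpid()` where a real event loop would more likely react to SIGCHLD.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a stand-in helper that exits at once with exit_status.
 * A real helper would exec a client program here instead. */
static pid_t spawn_helper(int exit_status)
{
    pid_t pid = fork();

    if (pid == 0) {
        _exit(exit_status);
    }
    return pid; /* parent: child pid, or -1 if fork failed */
}

/* Restart the helper each time it dies abnormally, at most max_restarts
 * times. Returns the number of restarts performed, or -1 on error. */
static int supervise_helper(int max_restarts)
{
    int restarts = 0;
    pid_t pid;

    assert(max_restarts >= 0);
    pid = spawn_helper(1); /* first run "crashes" */
    while (pid > 0) {
        int status;
        int clean;

        if (waitpid(pid, &status, 0) != pid) {
            return -1;
        }
        clean = WIFEXITED(status) && WEXITSTATUS(status) == 0;
        if (clean || restarts == max_restarts) {
            return restarts;
        }
        restarts++;
        /* The last permitted restart exits cleanly so the loop ends. */
        pid = spawn_helper(restarts == max_restarts ? 0 : 1);
    }
    return -1;
}
```

The cap on restarts is a design choice of the sketch: without one, a helper that crashes on startup would be respawned forever.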
[Qemu-devel] [PATCH 0/2] ARM: Add Versatile Express board model
This patchset adds support for the ARM Versatile Express board with Cortex-A9 daughterboard. It's based on some vexpress modelling work done by Bahadir Balban and Amit Mahajan at B Labs, overhauled and cleaned up by me (thanks to them for making that work available).

The patchset depends on the MMC cleanup work I posted last week:
http://www.mail-archive.com/qemu-devel@nongnu.org/msg56148.html
as it wants to wire up the MMC status lines.

Peter Maydell (2):
  hw/arm_sysctl.c: Add the Versatile Express system registers
  hw/vexpress.c: Add model of ARM Versatile Express board

 Makefile.target |    1 +
 hw/arm_sysctl.c |   61 ++
 hw/vexpress.c   |  238 +++
 3 files changed, 300 insertions(+), 0 deletions(-)
 create mode 100644 hw/vexpress.c
[Qemu-devel] [PATCH 2/2] hw/vexpress.c: Add model of ARM Versatile Express board
Add a model of the ARM Versatile Express board (with A9MPx4 daughterboard).

Signed-off-by: Peter Maydell peter.mayd...@linaro.org
---
 Makefile.target |    1 +
 hw/vexpress.c   |  238 +++
 2 files changed, 239 insertions(+), 0 deletions(-)
 create mode 100644 hw/vexpress.c

diff --git a/Makefile.target b/Makefile.target
index 220589e..949bd4e 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -315,6 +315,7 @@ obj-arm-y += framebuffer.o
 obj-arm-y += syborg.o syborg_fb.o syborg_interrupt.o syborg_keyboard.o
 obj-arm-y += syborg_serial.o syborg_timer.o syborg_pointer.o syborg_rtc.o
 obj-arm-y += syborg_virtio.o
+obj-arm-y += vexpress.o
 
 obj-sh4-y = shix.o r2d.o sh7750.o sh7750_regnames.o tc58128.o
 obj-sh4-y += sh_timer.o sh_serial.o sh_intc.o sh_pci.o sm501.o
diff --git a/hw/vexpress.c b/hw/vexpress.c
new file mode 100644
index 0000000..1ed3578
--- /dev/null
+++ b/hw/vexpress.c
@@ -0,0 +1,238 @@
+/*
+ * ARM Versatile Express emulation.
+ *
+ * Copyright (c) 2010 - 2011 B Labs Ltd.
+ * Copyright (c) 2011 Linaro Limited
+ * Written by Bahadir Balban, Amit Mahajan, Peter Maydell
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "sysbus.h"
+#include "arm-misc.h"
+#include "primecell.h"
+#include "devices.h"
+#include "net.h"
+#include "sysemu.h"
+#include "boards.h"
+
+#define SMP_BOOT_ADDR 0xe0000000
+
+#define VEXPRESS_BOARD_ID 0x8e0
+
+static struct arm_boot_info vexpress_binfo = {
+    .smp_loader_start = SMP_BOOT_ADDR,
+};
+
+static void secondary_cpu_reset(void *opaque)
+{
+    CPUState *env = opaque;
+
+    cpu_reset(env);
+    /* Set entry point for secondary CPUs. This assumes we're using
+       the init code from arm_boot.c. Real hardware resets all CPUs
+       the same. */
+    env->regs[15] = SMP_BOOT_ADDR;
+}
+
+static void vexpress_a9_init(ram_addr_t ram_size,
+                             const char *boot_device,
+                             const char *kernel_filename, const char *kernel_cmdline,
+                             const char *initrd_filename, const char *cpu_model)
+{
+    CPUState *env = NULL;
+    ram_addr_t ram_offset, vram_offset, sram_offset;
+    DeviceState *dev, *sysctl;
+    SysBusDevice *busdev;
+    qemu_irq *irqp;
+    qemu_irq pic[64];
+    int n;
+    qemu_irq cpu_irq[4];
+    uint32_t proc_id;
+    uint32_t sys_id;
+    ram_addr_t low_ram_size, vram_size, sram_size;
+
+    if (!cpu_model) {
+        cpu_model = "cortex-a9";
+    }
+
+    for (n = 0; n < smp_cpus; n++) {
+        env = cpu_init(cpu_model);
+        if (!env) {
+            fprintf(stderr, "Unable to find CPU definition\n");
+            exit(1);
+        }
+        irqp = arm_pic_init_cpu(env);
+        cpu_irq[n] = irqp[ARM_PIC_CPU_IRQ];
+        if (n > 0) {
+            qemu_register_reset(secondary_cpu_reset, env);
+        }
+    }
+
+    if (ram_size > 0x40000000) {
+        /* 1GB is the maximum the address space permits */
+        fprintf(stderr, "vexpress: cannot model more than 1GB RAM\n");
+        exit(1);
+    }
+
+    ram_offset = qemu_ram_alloc(NULL, "vexpress.highmem", ram_size);
+    low_ram_size = ram_size;
+    if (low_ram_size > 0x4000000) {
+        low_ram_size = 0x4000000;
+    }
+    /* RAM is from 0x60000000 upwards. The bottom 64MB of the
+     * address space should in theory be remappable to various
+     * things including ROM or RAM; we always map the RAM there.
+     */
+    cpu_register_physical_memory(0x0, low_ram_size, ram_offset | IO_MEM_RAM);
+    cpu_register_physical_memory(0x60000000, ram_size,
+                                 ram_offset | IO_MEM_RAM);
+
+    /* 0x1e000000 A9MPCore (SCU) private memory region */
+    dev = qdev_create(NULL, "a9mpcore_priv");
+    qdev_prop_set_uint32(dev, "num-cpu", smp_cpus);
+    qdev_init_nofail(dev);
+    busdev = sysbus_from_qdev(dev);
+    vexpress_binfo.smp_priv_base = 0x1e000000;
+    sysbus_mmio_map(busdev, 0, vexpress_binfo.smp_priv_base);
+    for (n = 0; n < smp_cpus; n++) {
+        sysbus_connect_irq(busdev, n, cpu_irq[n]);
+    }
+    /* Interrupts [42:0] are from the motherboard;
+     * [47:43] are reserved; [63:48] are daughterboard
+     * peripherals. Note that some documentation numbers
+     * external interrupts starting from 32 (because the
+     * A9MP has internal interrupts 0..31).
+     */
+    for (n = 0; n < 64; n++) {
+        pic[n] = qdev_get_gpio_in(dev, n);
+    }
+
+    /* Motherboard peripherals CS7 : 0x10000000 .. 0x10020000 */
+
[Qemu-devel] [PATCH 1/2] hw/arm_sysctl.c: Add the Versatile Express system registers
Add support for the Versatile Express SYS_CFG registers, which provide a generic means of reading or writing configuration information from various parts of the board. We only implement shutdown and reset.

Also make the RESETCTL register RAZ/WI on Versatile Express rather than reset the board. Other system registers are generally the same as Versatile and Realview.

Signed-off-by: Peter Maydell peter.mayd...@linaro.org
---
 hw/arm_sysctl.c |   61 +++
 1 files changed, 61 insertions(+), 0 deletions(-)

diff --git a/hw/arm_sysctl.c b/hw/arm_sysctl.c
index 799b007..564b512 100644
--- a/hw/arm_sysctl.c
+++ b/hw/arm_sysctl.c
@@ -27,6 +27,9 @@ typedef struct {
     uint32_t resetlevel;
     uint32_t proc_id;
     uint32_t sys_mci;
+    uint32_t sys_cfgdata;
+    uint32_t sys_cfgctrl;
+    uint32_t sys_cfgstat;
 } arm_sysctl_state;
 
 static const VMStateDescription vmstate_arm_sysctl = {
@@ -41,6 +44,9 @@ static const VMStateDescription vmstate_arm_sysctl = {
         VMSTATE_UINT32(flags, arm_sysctl_state),
         VMSTATE_UINT32(nvflags, arm_sysctl_state),
         VMSTATE_UINT32(resetlevel, arm_sysctl_state),
+        VMSTATE_UINT32(sys_cfgdata, arm_sysctl_state),
+        VMSTATE_UINT32(sys_cfgctrl, arm_sysctl_state),
+        VMSTATE_UINT32(sys_cfgstat, arm_sysctl_state),
         VMSTATE_END_OF_LIST()
     }
 };
@@ -53,6 +59,7 @@ static const VMStateDescription vmstate_arm_sysctl = {
 #define BOARD_ID_EB 0x140
 #define BOARD_ID_PBA8 0x178
 #define BOARD_ID_PBX 0x182
+#define BOARD_ID_VEXPRESS 0x190
 
 static int board_id(arm_sysctl_state *s)
 {
@@ -104,6 +111,10 @@ static uint32_t arm_sysctl_read(void *opaque, target_phys_addr_t offset)
     case 0x38: /* NVFLAGS */
         return s->nvflags;
     case 0x40: /* RESETCTL */
+        if (board_id(s) == BOARD_ID_VEXPRESS) {
+            /* reserved: RAZ/WI */
+            return 0;
+        }
         return s->resetlevel;
     case 0x44: /* PCICTL */
         return 1;
@@ -142,7 +153,23 @@ static uint32_t arm_sysctl_read(void *opaque, target_phys_addr_t offset)
     case 0xcc: /* SYS_TEST_OSC3 */
     case 0xd0: /* SYS_TEST_OSC4 */
         return 0;
+    case 0xa0: /* SYS_CFGDATA */
+        if (board_id(s) != BOARD_ID_VEXPRESS) {
+            goto bad_reg;
+        }
+        return s->sys_cfgdata;
+    case 0xa4: /* SYS_CFGCTRL */
+        if (board_id(s) != BOARD_ID_VEXPRESS) {
+            goto bad_reg;
+        }
+        return s->sys_cfgctrl;
+    case 0xa8: /* SYS_CFGSTAT */
+        if (board_id(s) != BOARD_ID_VEXPRESS) {
+            goto bad_reg;
+        }
+        return s->sys_cfgstat;
     default:
+    bad_reg:
         printf ("arm_sysctl_read: Bad register offset 0x%x\n", (int)offset);
         return 0;
     }
@@ -190,6 +217,10 @@ static void arm_sysctl_write(void *opaque, target_phys_addr_t offset,
         s->nvflags &= ~val;
         break;
     case 0x40: /* RESETCTL */
+        if (board_id(s) == BOARD_ID_VEXPRESS) {
+            /* reserved: RAZ/WI */
+            break;
+        }
         if (s->lockval == LOCK_VALUE) {
             s->resetlevel = val;
             if (val & 0x100)
@@ -216,7 +247,37 @@ static void arm_sysctl_write(void *opaque, target_phys_addr_t offset,
     case 0x98: /* OSCRESET3 */
     case 0x9c: /* OSCRESET4 */
         break;
+    case 0xa0: /* SYS_CFGDATA */
+        if (board_id(s) != BOARD_ID_VEXPRESS) {
+            goto bad_reg;
+        }
+        s->sys_cfgdata = val;
+        return;
+    case 0xa4: /* SYS_CFGCTRL */
+        if (board_id(s) != BOARD_ID_VEXPRESS) {
+            goto bad_reg;
+        }
+        s->sys_cfgctrl = val & ~(3 << 18);
+        s->sys_cfgstat = 1; /* complete */
+        switch (s->sys_cfgctrl) {
+        case 0xc0800000: /* SYS_CFG_SHUTDOWN to motherboard */
+            qemu_system_shutdown_request();
+            break;
+        case 0xc0900000: /* SYS_CFG_REBOOT to motherboard */
+            qemu_system_reset_request();
+            break;
+        default:
+            s->sys_cfgstat |= 2; /* error */
+        }
+        return;
+    case 0xa8: /* SYS_CFGSTAT */
+        if (board_id(s) != BOARD_ID_VEXPRESS) {
+            goto bad_reg;
+        }
+        s->sys_cfgstat = val & 3;
+        return;
     default:
+    bad_reg:
         printf ("arm_sysctl_write: Bad register offset 0x%x\n", (int)offset);
         return;
     }
-- 
1.7.1
[Qemu-devel] [Bug 727134] [NEW] pci-stub.o: In function `do_pci_info':0.14.0 compile problem
Public bug reported: Please see this build log. I didn't compile the qemu-kvm on Mandriva Cooker and haven't any idea. I'm the qemu maintainer on Mandriva. ** Affects: qemu Importance: Undecided Status: New -- You received this bug notification because you are a member of qemu-devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/727134 Title: pci-stub.o: In function `do_pci_info':0.14.0 compile problem Status in QEMU: New Bug description: Please see this build log. I didn't compile the qemu-kvm on Mandriva Cooker and haven't any idea. I'm the qemu maintainer on Mandriva.
[Qemu-devel] Re: [PATCH v3 uq/master 00/22] Win32 iothread support
On 02/28/2011 04:05 PM, Paolo Bonzini wrote: On 02/28/2011 01:13 PM, Avi Kivity wrote: If there's a git tree of this I'll be happy to do an autotest run. Sure, it's branch iothread-win32 of git://github.com/bonzini/qemu.git Fails on Fedora 9 i386 install, hangs right after Performing post install configuration The guest is processing interrupts but the mouse won't move, and it doesn't make progress. Configured with --enable-io-thread. Perhaps the problem exists even before the patchset. -- error compiling committee.c: too many arguments to function
[Qemu-devel] [Bug 727134] Re: pci-stub.o: In function `do_pci_info':0.14.0 compile problem
** Attachment added: build.0.20110228211607.log.bz2 https://bugs.launchpad.net/bugs/727134/+attachment/1878575/+files/build.0.20110228211607.log.bz2 -- You received this bug notification because you are a member of qemu-devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/727134 Title: pci-stub.o: In function `do_pci_info':0.14.0 compile problem Status in QEMU: New Bug description: Please see this build log. I didn't compile the qemu-kvm on Mandriva Cooker and haven't any idea. I'm the qemu maintainer on Mandriva.
Re: [Qemu-devel] QEMU: Discussion of separating core functionality vs supportive features
On Mar 1, 2011 7:07 AM, Dor Laor dl...@redhat.com wrote:
On 02/28/2011 07:44 PM, Anthony Liguori wrote:
On Feb 28, 2011 10:44 AM, Jes Sorensen jes.soren...@redhat.com wrote:

Hi, On last week's call we discussed the issue of splitting non-core features of QEMU into their own process to reduce the security risks etc. I wrote up a summary of my thoughts on this to try to cover the various issues. Feedback welcome and hopefully we can continue the discussion on a future call - maybe next week? I would like to be part of the discussion, but it's a public holiday here March 1st, so I won't be on that call. Cheers, Jes

Separating host-side virtagent and other tasks from core QEMU
=============================================================

To improve auditing of the core QEMU code, it would be ideal to be able to separate the core QEMU functionality from utility functionality by moving the utility functionality into its own process. This process will be referred to as the QEMU client below.

Separating as in moving code outside of qemu.git, making code optionally built in, making code optionally built in or loadable as a separate executable that is automatically launched, or making code always built outside the main executable? I'm very nervous about having a large number of daemons necessary to run QEMU. I think a reasonable approach would be a single front-end daemon.

s/daemon/son processes/ Qemu is the one that should spawn them and they should be transparent to the management. This way running qemu stays the same and qemu just needs to add the logic to get a SIGCHLD and potentially re-execute a dead son process.

Spice is the logical place to start, no? It's the largest single dependency we have and it does some scary things with qemu_mutex. I would use spice as a way to prove the concept. Regards, Anthony Liguori

Once QAPI is merged, there is a very incremental approach we can take for this sort of work. Take your favorite subsystem (like gdbstub or SDL) and make it only use QMP apis.
Re: [Qemu-devel] [RFC][PATCH 0/2] Allow cache settings for block devices to be changed at runtime.
The only way to change the cache settings is from the guest. Without that we're guaranteed to lose data when going from WCE=0 to WCE=1. I have patches to do that, and to allow changing O_DIRECT via a monitor command, but to toggle O_SYNC via fcntl I first need to get a kernel patch in as that's currently not allowed to be changed at runtime.
Re: [Qemu-devel] [RFC][PATCH 0/2] Allow cache settings for block devices to be changed at runtime.
On 01.03.2011 13:42, Christoph Hellwig wrote: The only way to change the cache settings is from the guest. Without that we're guaranteed to lose data when going from WCE=0 to WCE=1. I have patches to do that, and to allow changing O_DIRECT via a monitor command, but to toggle O_SYNC via fcntl I first need to get a kernel patch in as that's currently not allowed to be changed at runtime. We can re-open the file for now, and we need an implementation for older kernels/other host OSes anyway, so I don't think we have to wait for the kernel patch to be accepted. Kevin
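Kevin's "re-open the file" fallback can be sketched with POSIX calls only. `reopen_with_flags` is a hypothetical helper, not a QEMU function, and the sketch leaves out what a block layer would also have to do before swapping descriptors, such as draining and flushing in-flight requests.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Open path again with the wanted flags (for example O_RDWR | O_SYNC).
 * On success the old descriptor is closed and the new one returned.
 * On failure -1 is returned and the old descriptor stays open, so the
 * caller can keep running with the previous cache mode. */
static int reopen_with_flags(const char *path, int old_fd, int flags)
{
    int new_fd;

    assert(path != NULL);
    new_fd = open(path, flags);
    if (new_fd < 0) {
        return -1;
    }
    if (old_fd >= 0) {
        close(old_fd);
    }
    return new_fd;
}
```

Opening the new descriptor before closing the old one means a failed switch leaves the image usable, at the cost of briefly holding two descriptors for the same file.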
Re: [Qemu-devel] [RFC][PATCH 0/2] Allow cache settings for block devices to be changed at runtime.
On Tue, Mar 01, 2011 at 12:48:34PM +, Stefan Hajnoczi wrote: On Tue, Mar 01, 2011 at 01:42:54PM +0100, Christoph Hellwig wrote: I have patches to do that, and to allow changing O_DIRECT via a monitor command, but to toggle O_SYNC via fcntl I first need to get a kernel patch in as that's currently not allowed to be changed at runtime. Great it sounds like you have already implemented the two cases (guest wce and host O_DIRECT) that we're talking about. At least in theory. And for Linux I can add setting/clearing of O_SYNC via fcntl easily, but what do we do for other hosts? I'm not sure closing/reopening the backing file is easily feasible.
Re: [Qemu-devel] Re: kvm crashes with spice while loading qxl
On Sun, Feb 27, 2011 at 08:11:26PM +0100, Jan Kiszka wrote: On 2011-02-27 20:03, Alon Levy wrote: On Sat, Feb 26, 2011 at 01:29:01PM +0100, Jan Kiszka wrote: On 2011-02-26 12:43, xming wrote: When trying to start X (and it loads qxl driver) the kvm process just crashes. This is fixed by Gerd's attached patch (taken from rhel repository, don't know why it wasn't pushed to qemu-kvm upstream). I'll send it to kvm list as well (separate email). Patch looks OK on first glance, but the changelog is misleading: This was broken for _both_ trees, but upstream didn't detect the bug. So I didn't test with qemu not having this patch, but according to the discussion in the launchpad bug the problem only happens with qemu-kvm. This doesn't rule out it being a bug, perhaps it is just triggered much less frequently I guess. My concerns regarding other side effects of juggling with global mutex in spice code remain. Jan
Re: [Qemu-devel] [RFC][PATCH 0/2] Allow cache settings for block devices to be changed at runtime.
On Tue, Mar 01, 2011 at 01:42:54PM +0100, Christoph Hellwig wrote: I have patches to do that, and to allow changing O_DIRECT via a monitor command, but to toggle O_SYNC via fcntl I first need to get a kernel patch in as that's currently not allowed to be changed at runtime. Great it sounds like you have already implemented the two cases (guest wce and host O_DIRECT) that we're talking about. Stefan
[Qemu-devel] [Bug 723871] Re: qemu-kvm-0.14.0 Aborts with -vga qxl
This is fixed by the following patch on the list (confirmed by xming on list): http://patchwork.ozlabs.org/patch/84704/ Hopefully that patch will be merged soon. Alon -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/723871 Title: qemu-kvm-0.14.0 Aborts with -vga qxl Status in QEMU: Confirmed Bug description: Host CPU is Core i7 Q820. KVM is from 2.6.35-gentoo-r5 kernel (x86_64). Host has spice-0.7.2 and spice-protocol-0.7.0. Guest is Windows XP SP3 with qxl driver 0.6.1, virtio-serial 1.1.6 and vdagent 0.6.3. qemu-kvm is started like so: qemu-system-x86_64 -cpu host -enable-kvm -pidfile /home/rick/qemu/hds/wxp.pid -drive file=/home/rick/qemu/hds/wxp.raw,if=virtio,media=disk,aio=native,snapshot=on -m 768 -name WinXP -net nic,model=virtio -net user -localtime -usb -vga qxl -device virtio-serial -chardev spicevmc,name=vdagent,id=vdagent -device virtserialport,chardev=vdagent,name=com.redhat.spice.0 -spice port=1234,disable-ticketing -monitor stdio and crashes with: qemu-system-x86_64: /home/rick/qemu/src/qemu-kvm-0.14.0/qemu-kvm.c:1724: kvm_mutex_unlock: Assertion `!cpu_single_env' failed. Aborted If I use -no-kvm, it works fine. If I use -vga std, it works fine. -enable-kvm and -vga qxl crashes.
Re: [Qemu-devel] [RFC][PATCH 0/2] Allow cache settings for block devices to be changed at runtime.
On Feb 28, 2011 10:48 AM, Kevin Wolf kw...@redhat.com wrote: On 28.02.2011 16:35, Stefan Hajnoczi wrote: On Mon, Feb 28, 2011 at 3:12 PM, Kevin Wolf kw...@redhat.com wrote: On 28.02.2011 12:49, Prerna Saxena wrote: The following patchset introduces monitor commands: 1. set_cache DEVICE CACHE-SETTING Change cache settings for block device, DEVICE, through the monitor. (Available options : 'none', 'writeback', 'writethrough') Eg, (qemu)set_cache ide0-hd0 none - Changes cache setting for ide0-hd0 to 'none' Not sure if adding this interface is a good idea. I see that you only add it for HMP, and we may consider that, but it's definitely not suitable for QMP. One reason is that none/writethrough/writeback/unsafe isn't really what we want to use long term. We want to separate advertising a write cache (which is guest visible) from things like whether to use O_DIRECT or not. In the past, Christoph mentioned that he had patches to make these separate and even let the guest change the write cache enabled flag, which would probably solve most of the use cases of this patch. Toggling host page cache at runtime is useful too because it saves having to restart VMs. Not sure why I wanted to change that during runtime, but agreed, allowing to change parameters using the monitor is generally a good thing. However, I'm not sure if a command for changing the cache mode is the right solution, or if it should be something like a command to change block device options. (For example, what about toggling read-only or snapshot mode?) Certainly good questions, but let me suggest not taking an HMP command and not a QMP command because of interface concerns. My goal for 0.15 is to convert HMP to be implemented in terms of QMP. To do that, a bunch of new QMP commands are needed. They all won't be perfect but I'd rather support a bad QMP command forever than to continue to have people rely on HMP.
Regards, Anthony Liguori I agree that the guest should control the emulated drive cache at runtime and we probably don't want to allow toggling that from the host - it could be dangerous :). Good point. That's a NACK for this patch as long as we haven't separated WCE from the host cache setting. Kevin
Re: [Qemu-devel] [RFC][PATCH 0/2] Allow cache settings for block devices to be changed at runtime.
On 01.03.2011 14:03, Anthony Liguori wrote: On Feb 28, 2011 10:48 AM, Kevin Wolf kw...@redhat.com wrote: On 28.02.2011 16:35, Stefan Hajnoczi wrote: On Mon, Feb 28, 2011 at 3:12 PM, Kevin Wolf kw...@redhat.com wrote: On 28.02.2011 12:49, Prerna Saxena wrote: The following patchset introduces monitor commands: 1. set_cache DEVICE CACHE-SETTING Change cache settings for block device, DEVICE, through the monitor. (Available options : 'none', 'writeback', 'writethrough') Eg, (qemu)set_cache ide0-hd0 none - Changes cache setting for ide0-hd0 to 'none' Not sure if adding this interface is a good idea. I see that you only add it for HMP, and we may consider that, but it's definitely not suitable for QMP. One reason is that none/writethrough/writeback/unsafe isn't really what we want to use long term. We want to separate advertising a write cache (which is guest visible) from things like whether to use O_DIRECT or not. In the past, Christoph mentioned that he had patches to make these separate and even let the guest change the write cache enabled flag, which would probably solve most of the use cases of this patch. Toggling host page cache at runtime is useful too because it saves having to restart VMs. Not sure why I wanted to change that during runtime, but agreed, allowing to change parameters using the monitor is generally a good thing. However, I'm not sure if a command for changing the cache mode is the right solution, or if it should be something like a command to change block device options. (For example, what about toggling read-only or snapshot mode?) Certainly good questions, but let me suggest not taking an HMP command and not a QMP command because of interface concerns. My goal for 0.15 is to convert HMP to be implemented in terms of QMP. To do that, a bunch of new QMP commands are needed. They all won't be perfect but I'd rather support a bad QMP command forever than to continue to have people rely on HMP. Okay, makes sense. So we should reject patches that add new HMP commands without adding a QMP counterpart. I agree that the guest should control the emulated drive cache at runtime and we probably don't want to allow toggling that from the host - it could be dangerous :). Good point. That's a NACK for this patch as long as we haven't separated WCE from the host cache setting. Doesn't make a difference for this one, though, because it's NACKed anyway. Kevin PS: Anthony, is there a specific reason why you started sending HTML emails?
[Qemu-devel] Re: [PATCH v2] fix vnc regression
On Tue, Mar 1, 2011 at 1:48 AM, Wen Congyang we...@cn.fujitsu.com wrote: This patch fixes the following two regressions: 1. We should use bitmap_set() and bitmap_clear() to replace vnc_set_bits(). 2. The unit of bitmap_intersects()'s third parameter is bits, not words, but we pass the number of words to bitmap_intersects(). Changes from v1 to v2: 1. fix the third argument of bitmap_clear() Signed-off-by: Wen Congyang we...@cn.fujitsu.com Seems ok, tested with valgrind this time. But could you re-send this one and the other one at http://patchwork.ozlabs.org/patch/84887/ : - with appropriate Signed-off-by: and changelog for the other patch - using a const size_t width = ds_get_width(vs->ds) / 16; in both patches to make the call more explicit Thanks, -- Corentin Chary http://xf.iksaif.net
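The second regression (a word count passed where a bit count is expected) can be illustrated with a small standalone sketch. The demo_* helpers below are hypothetical stand-ins written for this example; they are not QEMU's bitmap API, and only the bits-versus-words distinction is taken from the patch description.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define WORDS_FOR_BITS(n) (((n) + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* Hypothetical helper: set nbits bits starting at bit index start. */
static void demo_bitmap_set(unsigned long *map, size_t start, size_t nbits)
{
    assert(map != NULL);
    for (size_t i = start; i < start + nbits; i++) {
        map[i / BITS_PER_WORD] |= 1UL << (i % BITS_PER_WORD);
    }
}

/* Hypothetical helper: return 1 if a and b share a set bit among the
 * first nbits bits. The length is counted in bits, not in words. */
static int demo_bitmap_intersects(const unsigned long *a,
                                  const unsigned long *b, size_t nbits)
{
    for (size_t i = 0; i < nbits; i++) {
        unsigned long mask = 1UL << (i % BITS_PER_WORD);
        if (a[i / BITS_PER_WORD] & b[i / BITS_PER_WORD] & mask) {
            return 1;
        }
    }
    return 0;
}

/* Returns 1 when passing the word count instead of the bit count makes
 * the check miss an overlap that really exists. */
static int demo_word_count_misses_overlap(void)
{
    enum { NBITS = 160 };
    unsigned long a[WORDS_FOR_BITS(NBITS)] = {0};
    unsigned long b[WORDS_FOR_BITS(NBITS)] = {0};

    demo_bitmap_set(a, 100, 1);
    demo_bitmap_set(b, 100, 1);

    /* Correct call: the third argument is the number of bits. */
    int with_bits = demo_bitmap_intersects(a, b, NBITS);
    /* Buggy call: the number of words only covers the first few bits. */
    int with_words = demo_bitmap_intersects(a, b, WORDS_FOR_BITS(NBITS));

    return with_bits == 1 && with_words == 0;
}
```

With the word count, only the first handful of bits are compared, so an overlap at bit 100 goes unnoticed.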
[Qemu-devel] [PATCH v5] PING: Fix ATA SMART and CHECK POWER MODE
This patch fixes two things:

1) CHECK POWER MODE

The error return value wasn't always zero, so it would show up as offline. Error is now explicitly set to zero.

2) SMART

The smart values that were returned were invalid and tools like skdump would not recognize that the smart data was actually valid and would dump weird output. The data has been fixed up and raw value support was added. Tools like skdump and palimpsest work as expected.

v5 changes: rebase
v4 changes: incorporate changes from Ryan Harper
v3 changes: don't reformat code I didn't change
v2 changes: use single structure instead of one for thresholds and one for data.

Signed-off-by: bdwhe...@indiana.edu

diff --git a/hw/ide/core.c b/hw/ide/core.c
index 9c91a49..1ffca56 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -34,13 +34,26 @@
 #include <hw/ide/internal.h>

-static const int smart_attributes[][5] = {
-    /* id,  flags, val, wrst, thrsh */
-    { 0x01, 0x03, 0x64, 0x64, 0x06}, /* raw read */
-    { 0x03, 0x03, 0x64, 0x64, 0x46}, /* spin up */
-    { 0x04, 0x02, 0x64, 0x64, 0x14}, /* start stop count */
-    { 0x05, 0x03, 0x64, 0x64, 0x36}, /* remapped sectors */
-    { 0x00, 0x00, 0x00, 0x00, 0x00}
+/* These values were based on a Seagate ST3500418AS but have been modified
+   to make more sense in QEMU */
+static const int smart_attributes[][12] = {
+    /* id, flags, hflags, val, wrst, raw (6 bytes), threshold */
+    /* raw read error rate*/
+    { 0x01, 0x03, 0x00, 0x64, 0x64, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x06},
+    /* spin up */
+    { 0x03, 0x03, 0x00, 0x64, 0x64, 0x10, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00},
+    /* start stop count */
+    { 0x04, 0x02, 0x00, 0x64, 0x64, 0x64, 0x00, 0x00, 0x00, 0x00, 0x00, 0x14},
+    /* remapped sectors */
+    { 0x05, 0x03, 0x00, 0x64, 0x64, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x24},
+    /* power on hours */
+    { 0x09, 0x03, 0x00, 0x64, 0x64, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00},
+    /* power cycle count */
+    { 0x0c, 0x03, 0x00, 0x64, 0x64, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00},
+    /* airflow-temperature-celsius */
+    { 190, 0x03, 0x00, 0x45, 0x45, 0x1f, 0x00, 0x1f, 0x1f, 0x00, 0x00, 0x32},
+    /* end of list */
+    { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00}
 };

 /* XXX: DVDs that could fit on a CD will be reported as a CD */
@@ -1843,6 +1856,7 @@ void ide_exec_cmd(IDEBus *bus, uint32_t val)
         break;
     case WIN_CHECKPOWERMODE1:
     case WIN_CHECKPOWERMODE2:
+        s->error = 0;
         s->nsector = 0xff; /* device active or idle */
         s->status = READY_STAT | SEEK_STAT;
         ide_set_irq(s->bus);
@@ -2097,7 +2111,7 @@ void ide_exec_cmd(IDEBus *bus, uint32_t val)
                 if (smart_attributes[n][0] == 0)
                     break;
                 s->io_buffer[2+0+(n*12)] = smart_attributes[n][0];
-                s->io_buffer[2+1+(n*12)] = smart_attributes[n][4];
+                s->io_buffer[2+1+(n*12)] = smart_attributes[n][11];
             }
             for (n=0; n<511; n++) /* checksum */
                 s->io_buffer[511] += s->io_buffer[n];
@@ -2110,12 +2124,13 @@ void ide_exec_cmd(IDEBus *bus, uint32_t val)
             memset(s->io_buffer, 0, 0x200);
             s->io_buffer[0] = 0x01; /* smart struct version */
             for (n=0; n<30; n++) {
-                if (smart_attributes[n][0] == 0)
+                if (smart_attributes[n][0] == 0) {
                     break;
-                s->io_buffer[2+0+(n*12)] = smart_attributes[n][0];
-                s->io_buffer[2+1+(n*12)] = smart_attributes[n][1];
-                s->io_buffer[2+3+(n*12)] = smart_attributes[n][2];
-                s->io_buffer[2+4+(n*12)] = smart_attributes[n][3];
+                }
+                int i;
+                for(i = 0; i < 11; i++) {
+                    s->io_buffer[2+i+(n*12)] = smart_attributes[n][i];
+                }
             }
             s->io_buffer[362] = 0x02 | (s->smart_autosave?0x80:0x00);
             if (s->smart_selftest_count == 0) {
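As a rough illustration of the layout this patch relies on, the sketch below fills a 512-byte SMART data page from 12-byte attribute entries and then computes the page checksum. It is a standalone toy, not the QEMU code: the demo_* names and the two sample attributes are made up for the example, and the final negation step (not visible in the quoted hunk) is assumed from the ATA convention that all 512 bytes of the page sum to zero modulo 256.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum {
    DEMO_SECTOR_SIZE = 512,
    DEMO_ENTRY_SIZE  = 12,   /* id, flags, hflags, val, wrst, raw[6], threshold */
    DEMO_MAX_ENTRIES = 30
};

/* Illustrative attribute table in the patch's 12-column layout. */
static const uint8_t demo_attrs[][DEMO_ENTRY_SIZE] = {
    { 0x01, 0x03, 0x00, 0x64, 0x64, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x06 },
    { 0x09, 0x03, 0x00, 0x64, 0x64, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 },
    { 0x00 } /* end of list: id 0 */
};

/* Fill buf (512 bytes) with the attribute data page and its checksum. */
static void demo_fill_smart_data(uint8_t *buf)
{
    assert(buf != NULL);
    memset(buf, 0, DEMO_SECTOR_SIZE);
    buf[0] = 0x01; /* structure version, as in the patch */

    for (int n = 0; n < DEMO_MAX_ENTRIES; n++) {
        if (demo_attrs[n][0] == 0) {
            break;
        }
        /* Copy the first 11 columns; column 11 (the threshold) is only
         * reported through the separate thresholds page. */
        memcpy(&buf[2 + n * DEMO_ENTRY_SIZE], demo_attrs[n], 11);
    }

    /* Checksum: make the byte sum of the whole page zero modulo 256. */
    uint8_t sum = 0;
    for (int n = 0; n < DEMO_SECTOR_SIZE - 1; n++) {
        sum += buf[n];
    }
    buf[DEMO_SECTOR_SIZE - 1] = (uint8_t)(0x100 - sum);
}

/* Byte sum of the whole page modulo 256; 0 means the checksum is valid. */
static uint8_t demo_page_sum(const uint8_t *buf)
{
    uint8_t sum = 0;
    for (int n = 0; n < DEMO_SECTOR_SIZE; n++) {
        sum += buf[n];
    }
    return sum;
}
```

A tool reading the page can validate it by summing all 512 bytes and checking for zero, which is what demo_page_sum() does.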
[Qemu-devel] Re: [PATCH 1/2] hw/arm_sysctl.c: Add the Versatile Express system registers
Peter Maydell peter.mayd...@linaro.org wrote: Hi @@ -41,6 +44,9 @@ static const VMStateDescription vmstate_arm_sysctl = { VMSTATE_UINT32(flags, arm_sysctl_state), VMSTATE_UINT32(nvflags, arm_sysctl_state), VMSTATE_UINT32(resetlevel, arm_sysctl_state), +VMSTATE_UINT32(sys_cfgdata, arm_sysctl_state), +VMSTATE_UINT32(sys_cfgctrl, arm_sysctl_state), +VMSTATE_UINT32(sys_cfgstat, arm_sysctl_state), VMSTATE_END_OF_LIST() } }; Three options (about migration): - leave things as they are and become incompatible without changing versions - if you don't care about backward compatibility, just add +1 to all the version fields and you are done. - add these fields only for the new version. IMHO the 1st one is the worst option. I would go with the middle one (as far as I know, nobody on arm uses interversion migration; as far as I know, nobody uses migration at all). If you (or anybody else) do, a pointer to one setup that is known to work is welcome. Later, Juan.
Re: [Qemu-devel] jitter in Audio
On Monday, 28 February 2011 12:55:22, asim khan wrote: Hi, I am using qemu 0.13.0. Whenever I play a file using ffplay, it sometimes happens that the audio stops and then, after some time, starts playing again, but I don't see this problem with aplay. So what is going wrong? Please update me as soon as possible. Hi AK, first you should check whether you can reproduce the problem with the most recent version of Qemu (0.14.0 at minimum, svn would be best). Then you'll have to give a detailed description of how to reproduce the problem. Cheers, Jan
Re: [Qemu-devel] jitter in Audio
On Tuesday, 01 March 2011 14:45:18, Jan Marten Simons wrote: Hi AK, first you should check whether you can reproduce the problem with the most recent version of Qemu (0.14.0 at minimum, svn would be best). Then you'll have to give a detailed description of how to reproduce the problem. Erm, make that git instead of svn. See http://wiki.qemu.org/Download
[Qemu-devel] Re: [PATCH 1/2] hw/arm_sysctl.c: Add the Versatile Express system registers
2011/3/1 Juan Quintela quint...@redhat.com: Peter Maydell peter.mayd...@linaro.org wrote: Hi @@ -41,6 +44,9 @@ static const VMStateDescription vmstate_arm_sysctl = { VMSTATE_UINT32(flags, arm_sysctl_state), VMSTATE_UINT32(nvflags, arm_sysctl_state), VMSTATE_UINT32(resetlevel, arm_sysctl_state), + VMSTATE_UINT32(sys_cfgdata, arm_sysctl_state), + VMSTATE_UINT32(sys_cfgctrl, arm_sysctl_state), + VMSTATE_UINT32(sys_cfgstat, arm_sysctl_state), VMSTATE_END_OF_LIST() } }; Three options (about migration): - leave things as they are and become incompatible without changing versions - if you don't care about backward compatibility, just add +1 to all the version fields and you are done. - add these fields only for the new version. IMHO the 1st one is the worst option. I would go with the middle one (as far as I know, nobody on arm uses interversion migration; as far as I know, nobody uses migration at all). OK, so in: static const VMStateDescription vmstate_arm_sysctl = { .name = "realview_sysctl", .version_id = 1, .minimum_version_id = 1, I just bump the '1' in both cases to '2'? I've only ever used the save/restore for debugging purposes. We know for certain that nobody can have been doing migration with versatile platforms before this year, because we only added save/load support to arm_sysctl.c in December 2010 :-) (What's the equivalent version bump that needs to be done when entries are added to target-arm/machine.c's save and restore code?) thanks -- PMM
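What bumping both numbers would mean for old saved state can be sketched with a toy model. The struct and function below are hypothetical stand-ins for the two version fields quoted above, not QEMU's VMStateDescription or its loader; the sketch assumes that an incoming stream is accepted only when its version lies between minimum_version_id and version_id, which is not spelled out in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy stand-in for the version fields of a device state description. */
typedef struct DemoVMSD {
    const char *name;
    int version_id;          /* version written by this build */
    int minimum_version_id;  /* oldest incoming version still accepted */
} DemoVMSD;

/* Returns 0 if a stream saved with incoming_version may be loaded,
 * -EINVAL otherwise (assumed acceptance rule, see the lead-in). */
static int demo_check_version(const DemoVMSD *d, int incoming_version)
{
    assert(d != NULL);
    if (incoming_version > d->version_id) {
        return -EINVAL;  /* stream is newer than this build understands */
    }
    if (incoming_version < d->minimum_version_id) {
        return -EINVAL;  /* stream is too old to be supported */
    }
    return 0;
}
```

Under that assumption, setting both fields to 2 (the middle option) rejects version 1 streams outright, while keeping minimum_version_id at 1 would correspond to the third option, where the new fields are only expected from version 2 streams.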
Re: [Qemu-devel] QEMU: Discussion of separating core functionality vs supportive features
On 03/01/2011 02:40 PM, Anthony Liguori wrote: On Mar 1, 2011 7:07 AM, Dor Laor dl...@redhat.com wrote: On 02/28/2011 07:44 PM, Anthony Liguori wrote: On Feb 28, 2011 10:44 AM, Jes Sorensen jes.soren...@redhat.com wrote: Hi, On last week's call we discussed the issue of splitting non core features of QEMU into its own process to reduce the security risks etc. I wrote up a summary of my thoughts on this to try to cover the various issues. Feedback welcome and hopefully we can continue the discussion on a future call - maybe next week? I would like to be part of the discussion, but it's a public holiday here March 1st, so I won't be on that call. Cheers, Jes Separating host-side virtagent and other tasks from core QEMU = To improve auditing of the core QEMU code, it would be ideal to be able to separate the core QEMU functionality from utility functionality by moving the utility functionality into its own process. This process will be referred to as the QEMU client below. Separating as in moving code outside of qemu.git, making code optionally built in, making code optionally built in or loadable as a separate executable that is automatically launched, or making code always built outside the main executable? I'm very nervous about having a large number of daemons necessary to run QEMU. I think a reasonable approach would be a single front-end daemon. s/daemon/son processes/ Qemu is the one that should spawn them and they should be transparent to the management. This way running qemu stays the same and qemu just needs to add the logic to get a SIGCHLD and potentially re-execute a dead son process. Spice is the logical place to start, no? It's the largest single dependency we have and it does some scary things with qemu_mutex. I would use spice as a way to prove the concept. 
I agree it is desirable to do this for spice but it is a lot more complex than virtagent isolation. Spice is performance sensitive and contains much more state. It needs to access the guest memory for reading the surfaces. It can be solved but needs some major changes. Adding spice-devel to the discussion. Will virtagent be kept out of tree till we merge spice? Regards, Anthony Liguori Once QAPI is merged, there is a very incremental approach we can take for this sort of work. Take your favorite subsystem (like gdbstub or SDL) and make it only use QMP apis. Once we're only using QMP internally in a subsystem, then building it out of the main QEMU and using libqmp should be fairly easy. Regards, Anthony Liguori Components which are candidates for moving out of QEMU include: - virtagent - vnc server (and other graphical displays such as SDL, spice and curses) - human monitor These are all much harder than they may seem. There's a ton of QMP functions that would be needed before we could even try to do this. The idea is to have QEMU launch as a daemon, and then allow for one or more client processes to connect to it. These will then offer the various services. The main issue to discuss is how to handle various state information, reconnects, and migration. Security The primary reason for this discussion is security, however there are other benefits from this split, as will be mentioned below. During a demo of virtagent I hit a case where a bug in the agent command handling code caused a crash of the host QEMU process. While it is probably a simple bug, it shows how adding more complexity to the QEMU process increases the risk of adding security problems that could potentially be exploited by a hostile guest. By splitting non core functionality into a QEMU client process, the host process will be isolated from a large number of potential security problems, i.e. in case a client process is killed or crashes, it should not affect the main QEMU process. 
In addition it makes it easier to audit the core QEMU functionality. virtagent = In short virtagent provides a set of simple commands, most of which do not have state associated with them. These include shutdown, ping, fsfreeze/fsthaw, etc. Other commands might be multi-stage commands which could fail if the client is disconnected from the daemon while the command is in progress. These include copy-paste and file copy. vnc server == The vnc server simply needs a connection to the video memory of the QEMU process, video mode state, as well as channels for sending keyboard and mouse events. It is stateless by nature and supports reconnects. This applies to the other graphical display engines (SDL,
Re: [Qemu-devel] QEMU: Discussion of separating core functionality vs supportive features
On 03/01/2011 09:25 AM, Dor Laor wrote: On 03/01/2011 02:40 PM, Anthony Liguori wrote: On Mar 1, 2011 7:07 AM, Dor Laor dl...@redhat.com wrote: On 02/28/2011 07:44 PM, Anthony Liguori wrote: On Feb 28, 2011 10:44 AM, Jes Sorensen jes.soren...@redhat.com wrote: Hi, On last week's call we discussed the issue of splitting non core features of QEMU into its own process to reduce the security risks etc. I wrote up a summary of my thoughts on this to try to cover the various issues. Feedback welcome and hopefully we can continue the discussion on a future call - maybe next week? I would like to be part of the discussion, but it's a public holiday here March 1st, so I won't be on that call. Cheers, Jes Separating host-side virtagent and other tasks from core QEMU = To improve auditing of the core QEMU code, it would be ideal to be able to separate the core QEMU functionality from utility functionality by moving the utility functionality into its own process. This process will be referred to as the QEMU client below. Separating as in moving code outside of qemu.git, making code optionally built in, making code optionally built in or loadable as a separate executable that is automatically launched, or making code always built outside the main executable? I'm very nervous about having a large number of daemons necessary to run QEMU. I think a reasonable approach would be a single front-end daemon. s/daemon/son processes/ Qemu is the one that should spawn them and they should be transparent to the management. This way running qemu stays the same and qemu just needs to add the logic to get a SIGCHLD and potentially re-execute a dead son process. Spice is the logical place to start, no? It's the largest single dependency we have and it does some scary things with qemu_mutex. I would use spice as a way to prove the concept. 
I agree it is desirable to do this for spice but it is a lot more complex than virtagent isolation. Spice is performance sensitive and contains much more state. It needs to access the guest memory for reading the surfaces. It can be solved but needs some major changes. Adding spice-devel to the discussion. Yeah, but the viability of this mechanism is dependent on whether it can support something that's complex, just like Spice. Considering that the smaller pieces like VNC or virt-agent are at most a couple thousand lines of code, whereas Spice is close to a hundred thousand lines, the benefit from moving Spice to a separate address space is significantly higher. Will virtagent be kept out of tree till we merge spice? Nothing should be held up from being merged because of an effort to put things in a separate address space. But that's not to say that virt-agent is something that's going to get merged tomorrow. I don't know that there is even an agreed upon architecture at the moment. Regards, Anthony Liguori
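The respawn idea quoted earlier in this message (the parent spawns the helper itself and re-executes it when it dies) could look roughly like the sketch below. It is only an illustration with hypothetical demo_* names: it blocks in waitpid() instead of reacting to SIGCHLD from a main loop, and the helpers simply exit instead of running a real service.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper bodies: one always fails, one exits cleanly. */
static void demo_crashing_helper(void) { _exit(1); }
static void demo_clean_helper(void)    { _exit(0); }

/* Spawn the helper and restart it whenever it dies abnormally, at most
 * max_spawns times in total. Returns the number of spawns performed,
 * or -1 if fork() or waitpid() fails. */
static int demo_supervise(void (*helper)(void), int max_spawns)
{
    int spawns = 0;

    assert(helper != NULL);
    while (spawns < max_spawns) {
        pid_t pid = fork();
        if (pid < 0) {
            return -1;
        }
        if (pid == 0) {
            helper();   /* child: run the helper, never return to caller */
            _exit(0);
        }
        spawns++;

        int status = 0;
        if (waitpid(pid, &status, 0) < 0) {
            return -1;
        }
        if (WIFEXITED(status) && WEXITSTATUS(status) == 0) {
            break;      /* clean exit: nothing to restart */
        }
        /* abnormal exit or signal: loop around and spawn again */
    }
    return spawns;
}
```

A real implementation would also need a restart limit or back-off policy so that a helper that crashes immediately does not get respawned in a tight loop; max_spawns plays that role here.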
[Qemu-devel] Re: [PATCH 0/2] ARM: Add Versatile Express board model
Thanks Peter for the efforts. On Tue, 2011-03-01 at 12:32 +, Peter Maydell wrote: This patchset adds support for the ARM Versatile Express board with Cortex-A9 daughterboard. It's based on some vexpress modelling work done by Bahadir Balban and Amit Mahajan at B Labs, overhauled and cleaned up by me (thanks to them for making that work available). The patchset depends on the MMC cleanup work I posted last week: http://www.mail-archive.com/qemu-devel@nongnu.org/msg56148.html as it wants to wire up the MMC status lines. Peter Maydell (2): hw/arm_sysctl.c: Add the Versatile Express system registers hw/vexpress.c: Add model of ARM Versatile Express board Makefile.target |1 + hw/arm_sysctl.c | 61 ++ hw/vexpress.c | 238 +++ 3 files changed, 300 insertions(+), 0 deletions(-) create mode 100644 hw/vexpress.c -- Thanks Amit Mahajan
Re: [Qemu-devel] 68k and BeBox (was SymbianOS, MeeGO, WebOS and QEMU)
Hi, On 01/03/2011, at 12:06, François Revol wrote: On 1 March 2011 at 13:02, Laurent Vivier wrote: Currently the fastest ones would be BeBox, Mac68k and NeXT machines, because almost all devices are already emulated; what is missing is the machine assembly itself, the firmware, and the CPU/FPU/MMU in the case of 68k. IIRC the Mac68k hardware is quite obscure and model-dependent... but EMILE and BasiliskII should say enough. They will not help you: - EMILE uses Mac ROM to access hardware - BasiliskII patches the ROM to call its internal drivers instead of accessing hardware. That's what MacOS itself does. Indeed there is a document on how Mac OS X Server boots somewhere deep in Apple's support pages that describes the various patching methods they use to boot on m68k (for A/UX), OldWorld and NewWorld machines. Remember that the Mac ROM contains the majority of the Mac OS APIs, which may be used as-is or patched in RAM depending on the Mac OS version running. vMac however is emulating the hardware more exactly. If it can help I think I have all hardware reference manuals for m68k macintosh. Actually I think they used to be online until recently, but Apple revamped their archives not too long ago IIRC. For up to Mac II they are in the Inside Macintosh books, from then up to PowerPC you'll need to guess it, and for the cloneable systems, there is information in Apple Developer CDs. Regards, Natalia Portillo
[Qemu-devel] Re: KVM call agenda for Mar 1
Juan Quintela quint...@redhat.com wrote: Please send in any agenda items you are interested in covering. As there are 0 items on the agenda, call got cancelled this week. Enjoy, Juan. Thanks, Juan.
Re: [Qemu-devel] [PATCH -V2 4/6] hw/9pfs: Implement syncfs
On Tue, 1 Mar 2011 10:22:07 +, Stefan Hajnoczi stefa...@gmail.com wrote: On Tue, Mar 1, 2011 at 6:38 AM, Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com wrote: Signed-off-by: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com --- hw/9pfs/virtio-9p.c | 31 +++ hw/9pfs/virtio-9p.h | 2 ++ 2 files changed, 33 insertions(+), 0 deletions(-) diff --git a/hw/9pfs/virtio-9p.c b/hw/9pfs/virtio-9p.c index c4b0198..882f4f3 100644 --- a/hw/9pfs/virtio-9p.c +++ b/hw/9pfs/virtio-9p.c @@ -1978,6 +1978,36 @@ static void v9fs_fsync(V9fsState *s, V9fsPDU *pdu) v9fs_post_do_fsync(s, pdu, err); } +static void v9fs_post_do_syncfs(V9fsState *s, V9fsPDU *pdu, int err) +{ + if (err == -1) { + err = -errno; + } + complete_pdu(s, pdu, err); +} + +static void v9fs_syncfs(V9fsState *s, V9fsPDU *pdu) +{ + int err; + int32_t fid; + size_t offset = 7; + V9fsFidState *fidp; + + pdu_unmarshal(pdu, offset, "d", &fid); + fidp = lookup_fid(s, fid); + if (fidp == NULL) { + err = -ENOENT; + v9fs_post_do_syncfs(s, pdu, err); + return; + } + /* + * We don't have per file system syncfs + * So just return success + */ + err = 0; + v9fs_post_do_syncfs(s, pdu, err); +} Please explain the semantics of P9_TSYNCFS. Won't returning success without doing anything lead to data integrity issues? I should actually put the 9P operation format in the commit message. Will add in the next update. Whether returning success here would cause a data integrity issue depends on what sort of guarantee we want to provide. Calling sync on the guest will cause all the dirty pages in the guest to be flushed to the host. Now all those changes are in the host page cache and it would be nice to flush them as a part of sync, but since we don't have a per file system sync, the above would imply we flush all dirty pages on the host, which can result in a large performance impact. It seems unnecessary to split v9fs_post_do_syncfs() into its own function since there is no blocking point here and a callback will not be needed. 
That is done as a placeholder to add the per file system sync call once we get the support: http://thread.gmane.org/gmane.linux.file-systems/44628 -aneesh
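For illustration, a per file system flush on the host could look like the sketch below once such a call is available. It assumes the Linux syncfs(2) interface exposed by glibc under _GNU_SOURCE, which is not what the patch above uses (the patch simply returns success); demo_sync_fs_of_path is a hypothetical name, not part of the 9pfs code.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Flush only the file system that contains path, instead of every
 * dirty page on the host. Returns 0 on success or -errno on failure. */
static int demo_sync_fs_of_path(const char *path)
{
    int err = 0;
    int fd;

    assert(path != NULL);
    fd = open(path, O_RDONLY);
    if (fd < 0) {
        return -errno;
    }
    if (syncfs(fd) < 0) {   /* assumed available: Linux-specific call */
        err = -errno;
    }
    close(fd);
    return err;
}
```

The negative errno return mirrors the err = -errno convention in v9fs_post_do_syncfs() above, so a handler could pass the result straight to complete_pdu().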
Re: [Qemu-devel] [RFC][PATCH 0/2] Allow cache settings for block devices to be changed at runtime.
On 01.03.2011 10:55, Stefan Hajnoczi wrote: On Mon, Feb 28, 2011 at 3:48 PM, Kevin Wolf kw...@redhat.com wrote: On 28.02.2011 16:35, Stefan Hajnoczi wrote: On Mon, Feb 28, 2011 at 3:12 PM, Kevin Wolf kw...@redhat.com wrote: On 28.02.2011 12:49, Prerna Saxena wrote: The following patchset introduces monitor commands: 1. set_cache DEVICE CACHE-SETTING Change cache settings for block device, DEVICE, through the monitor. (Available options: 'none', 'writeback', 'writethrough') E.g., (qemu) set_cache ide0-hd0 none - Changes cache setting for ide0-hd0 to 'none' Not sure if adding this interface is a good idea. I see that you only add it for HMP, and we may consider that, but it's definitely not suitable for QMP. One reason is that none/writethrough/writeback/unsafe isn't really what we want to use long term. We want to separate advertising a write cache (which is guest visible) from things like whether to use O_DIRECT or not. In the past, Christoph mentioned that he had patches to make these separate and even let the guest change the write cache enabled flag, which would probably solve most of the use cases of this patch. Toggling host page cache at runtime is useful too because it saves having to restart VMs. Not sure why I wanted to change that during runtime, but agreed, allowing to change parameters using the monitor is generally a good thing. However, I'm not sure if a command for changing the cache mode is the right solution, or if it should be something like a command to change block device options. (For example, what about toggling read-only or snapshot mode?) Yes, I think you're right. We should aim for a general interface rather than having to add many more specific interfaces in the future. CQ: Do you see a relation to the update interface you added to adjust drive options at runtime for FVD? On one hand it's a different set of options today. IIUC, qemu-img update refers to persistent per-image options like backing file, encryption, etc. 
This monitor command refers to temporary command line options like cache, snapshot mode etc. On the other hand, we've had discussions before about storing a copy-on-read flag in images which makes sense as a command line option as well. The same may apply to things like the read-only flags. So maybe these two sets of flags aren't that distinct from each other. Kevin Kevin is right. The 'qemu-img update' patch I posted recently is a general interface for manipulating persistent per-image options. Its implementation certainly differs from a one-time runtime monitor command. I can imagine two sets of flags: 1) those that should only be updated in the persistent image (e.g., backing file) and whose effect must be reflected at the time of opening an image, and 2) those flags that can be updated both in the persistent image and at runtime. For 2), the flag value persistently stored in the image is taken as the default value when an image is opened, and the flag value can possibly be further modified by a runtime monitor command. It seems that copy_on_read and prefetching fall into category 2), with 'qemu-img update' handling persistent changes in the image and Stefan H.'s monitor commands in QED handling runtime changes. If we want a general monitor interface for runtime flag changes, the interface may be similar to 'qemu-img update' but the implementation would be very different. Specifically regarding the set_cache monitor command, it allows flexibility, but will probably be rarely used. It also poses additional requirements on the image format drivers, e.g., to flush the metadata cache on a change of cache mode and to re-initialize internal data structures properly. Regards, ChunQiang (CQ) Tang Homepage: http://www.research.ibm.com/people/c/ctang
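The two categories of flags described above can be modelled with a small sketch. Everything here is hypothetical and only mirrors the description in this message: a flag has a value stored in the image, that value becomes the default when the image is opened, and only category 2 flags accept a temporary runtime override.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of one image flag (names are made up for this example). */
typedef struct DemoImageFlag {
    bool stored_value;        /* value kept persistently in the image */
    bool runtime_changeable;  /* true: category 2, false: category 1 */
    bool effective_value;     /* value in force while the image is open */
} DemoImageFlag;

/* Opening the image takes the stored value as the default. */
static void demo_flag_open(DemoImageFlag *f)
{
    assert(f != NULL);
    f->effective_value = f->stored_value;
}

/* Runtime change, as a monitor command might request it. Category 1
 * flags refuse it; category 2 flags change only the effective value,
 * so the stored default survives a reopen. */
static int demo_flag_set_runtime(DemoImageFlag *f, bool value)
{
    assert(f != NULL);
    if (!f->runtime_changeable) {
        return -EPERM;
    }
    f->effective_value = value;
    return 0;
}
```

In this model a persistent update (the role 'qemu-img update' plays in the message) would write stored_value, while a monitor command would go through demo_flag_set_runtime().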
[Qemu-devel] [Bug 498035] Re: qemu hangs on shutdown or reboot (XP guest)
I've seen Windows XP hang on reboot/shutdown like this so many times that I wouldn't bother with this at all. At least, does a clean install of WinXP show the same behaviour? -- You received this bug notification because you are a member of qemu-devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/498035 Title: qemu hangs on shutdown or reboot (XP guest) Status in QEMU: Triaged Bug description: When I shut down or reboot my Windows XP guest, about half the time, it hangs at the point where it says "Windows is shutting down". At that point qemu is using 100% of one host CPU, about 85% user, 15% system. (Core 2 Quad 2.66GHz) This is the command line I use to start qemu: qemu-system-x86_64 -hda winxp.img -k en-us -m 2048 -smp 2 -vnc :3100 -usbdevice tablet -boot c -enable-kvm
[Qemu-devel] [Bug 638955] Re: emulated netcards don't work with recent sunos kernel
Emulated NIC is e1000. I found out that if one reduces the MTU on the client, e.g. with ifconfig eth0 mtu 300, ssh seems to hang much more rarely (but it still hangs, even at 300). Reducing it on the virtualization host bridge is not enough though (unless you are initiating ssh from the virtualization host itself). To trigger the hang, do: while true ; do dmesg ; done The higher the allowed MTU, the quicker the hang, e.g. MTU 500 hangs within one minute, 1500 hangs instantly. Command line is the following. Excuse the length... it's from libvirt: LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/bin:/usr/sbin:/sbin:/bin QEMU_AUDIO_DRV=none /usr/local/qemu-kvm-0.14.0/bin/qemu-system-x86_64 -S -M pc-0.14 -enable-kvm -m 2048 -smp 2,sockets=2,cores=1,threads=1 -name openindiana1 -uuid ed0b8483-d186-1f39-39ef-97194a1f02bf -nodefconfig -nodefaults -chardev socket,id=monitor,path=/var/lib/libvirt/qemu/openindiana1.monitor,server,nowait -mon chardev=monitor,mode=readline -rtc base=utc -no-acpi -boot c -drive if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -drive file=/dev/mapper/datavg1-openindiana1,if=none,id=drive-ide0-0-0,boot=on,format=raw,cache=none -device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 -netdev tap,fd=54,id=hostnet0 -device e1000,netdev=hostnet0,id=net0,mac=52:54:69:b5:89:11,bus=pci.0,addr=0x3 -usb -vnc 127.0.0.1:0 -vga cirrus -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4 I'm available to try patches for a while if somebody can spot the problem... the host is still not in production. Thanks for your work -- You received this bug notification because you are a member of qemu-devel-ml, which is subscribed to QEMU. 
https://bugs.launchpad.net/bugs/638955 Title: emulated netcards don't work with recent sunos kernel Status in QEMU: New Bug description: hi there, i'm using the qemu-kvm backend in version: # qemu-kvm -version QEMU PC emulator version 0.12.5 (qemu-kvm-0.12.5), Copyright (c) 2003-2008 Fabrice Bellard and none of the model=$type NICs work in combination with recent sunos (solaris, openindiana, opensolaris, ..) .. for testing purposes you can download an iso from here: http://dlc-origin.openindiana.org/isos/147/ or from here: http://genunix.org/distributions/indiana/ osol and oi are also ubuntu-like live cds, so no need to bother with installing. behaviour is as follows: e1000 - receiving doesn't work, transmitting works .. dladm (tool for handling ethers) shows that all is ok, correct mode is loaded up, it just seems like this driver works at 100% but .. rtl8169|pcnet - works in 10Mbit mode with several other issues like high cpu utilization and so .. dladm is unable to recognize options for this kind of -nic others - just don't work .. i experienced this issue several times in the past .. the workaround was that rtl8169 worked so-so .. with recent sunos kernel it doesn't. it's easy to reproduce, this is why i'm not putting here more than the launch script for my virtual machine: # cat openindiana.sh qemu-kvm -hda /home/kvm/openindiana/openindiana.img -m 2048 -localtime -cdrom /home/kvm/+images/oi-dev-147-x86.iso -boot d \ -vga std -vnc :9 -k en-us -monitor unix:/home/kvm/openindiana/instance,server,nowait \ -net nic,model=e1000,vlan=1 -net tap,ifname=oi0,script=no,vlan=1 sleep 2; ip l set oi0 up; ip a a 192.168.99.9/24 dev oi0; regards by daniel
[Qemu-devel] Tracing memory access (tcg_gen_qemu_st|ld)
Hi,

I am trying to hook guest VM memory access (i386-softmmu) by compiling custom hooking functions into tcg_gen_qemu_{st|ld}*. There are two main problems: the first is that the output seems weird (see below), the second is that I am running into a BSOD with my Windows XP guest after some calls (do I modify any values here?).

Does anyone of you see problems? Will that code catch all memory accesses or is there anything I will miss? Is there a better method than using a dummy TCGv for the flx_memtrace_read return value (something like hooks of return type void)?

static inline void tcg_gen_qemu_ld8u(TCGv ret, TCGv addr, int mem_index)
{
    tcg_gen_qemu_ldst_op(INDEX_op_qemu_ld8u, ret, addr, mem_index);
    if(memtrace_enabled){
        int sizemask = 0;
        sizemask |= tcg_gen_sizemask(0, 0, 0);
        sizemask |= tcg_gen_sizemask(1, 0, 0);
        sizemask |= tcg_gen_sizemask(2, 0, 0);
        TCGv dummy = ret;
        tcg_gen_helper4(flx_memtrace_read, sizemask, dummy, ret, addr,
                        tcg_const_i32(mem_index), tcg_const_i32(8));
    }
}

static inline void tcg_gen_helper4(void *func, int sizemask, TCGv_i32 ret,
                                   TCGv_i32 a, TCGv_i32 b, TCGv_i32 c, TCGv_i32 d)
{
    TCGv_ptr fn;
    TCGArg args[4];
    fn = tcg_const_ptr((tcg_target_long)func);
    args[0] = GET_TCGV_I32(a);
    args[1] = GET_TCGV_I32(b);
    args[2] = GET_TCGV_I32(c);
    args[3] = GET_TCGV_I32(d);
    tcg_gen_callN(tcg_ctx, fn, TCG_CALL_CONST | TCG_CALL_PURE, sizemask,
                  GET_TCGV_I32(ret), 4, args);
    tcg_temp_free_ptr(fn);
}

static inline int32_t flx_memtrace_read(int32_t value, int32_t address,
                                        int32_t offset, int32_t size){
    if(instrumentation_active){
        if(!memtrace_enabled)
            printf("memtrace_read called but memtrace disabled! check invalidation!!!\n");
        flx_memtrace_event(value, address, size, 0);
    }
    return value;
}

Output:

1. Addresses look weird
2. Read values look like addresses and if they are, EIP reads seem to be included

Read: 0x21664, Addr: 0x3d4
Read: 0x21666, Addr: 0xe
Read: 0x2165c, Addr: 0x0
Read: 0x2165e, Addr: 0xe
Read: 0x21660, Addr: 0x1674
Read: 0x21662, Addr: 0x42f0
Read: 0x2166a, Addr: 0x0
Read: 0x21668, Addr: 0x3d4
Write: 0x21662, Addr: 0x4305
Read: 0x21664, Addr: 0x3d5
Read: 0x21666, Addr: 0x0
Read: 0x2165c, Addr: 0x0
Read: 0x2165e, Addr: 0x3d5
Read: 0x21660, Addr: 0x1674
Read: 0x21662, Addr: 0x4305
Read: 0x21668, Addr: 0x3d4
Write: 0x21662, Addr: 0x4312
Read: 0x21664, Addr: 0x3d4
Read: 0x21666, Addr: 0xf
Read: 0x2165c, Addr: 0x0
Read: 0x2165e, Addr: 0xf
Read: 0x21660, Addr: 0x1674
Read: 0x21662, Addr: 0x4312
Read: 0x2166a, Addr: 0x0
Read: 0x21668, Addr: 0x3d4
Write: 0x21662, Addr: 0x4323
Read: 0x21664, Addr: 0x3d5
Read: 0x21666, Addr: 0x0
Read: 0x2165c, Addr: 0x0
Read: 0x2165e, Addr: 0x3d5
Read: 0x21660, Addr: 0x1674
Read: 0x21662, Addr: 0x4323
Read: 0x21674, Addr: 0x168a
Read: 0x21676, Addr: 0x4507
Read: 0x2168a, Addr: 0x16a

Best regards,
Felix
Re: [Qemu-devel] [RFC][PATCH 0/2] Allow cache settings for block devices to be changed at runtime.
On 03/01/2011 08:22 AM, Kevin Wolf wrote:

Certainly good questions, but let me suggest not taking an HMP command and not a QMP command because of interface concerns. My goal for 0.15 is to convert HMP to be implemented in terms of QMP. To do that, a bunch of new QMP commands are needed. They won't all be perfect, but I'd rather support a bad QMP command forever than continue to have people rely on HMP.

Okay, makes sense. So we should reject patches that add new HMP commands without adding a QMP counterpart.

Definitely. We essentially are supporting HMP today just as much as QMP, but HMP is much harder to support (no standard way to interpret input/output/errors).

I agree that the guest should control the emulated drive cache at runtime and we probably don't want to allow toggling that from the host - it could be dangerous :).

Good point. That's a NACK for this patch as long as we haven't separated WCE from the host cache setting. Doesn't make a difference for this one, though, because it's NACKed anyway.

Kevin

PS: Anthony, is there a specific reason why you started sending HTML emails?

Because I was stuck using my phone because of bad hotel wireless :-/

Regards,
Anthony Liguori
Re: [Qemu-devel] Re: [patch 2/3] Add support for live block copy
On 03/01/2011 04:39 AM, Avi Kivity wrote:
On 02/28/2011 08:12 PM, Anthony Liguori wrote:
On Feb 28, 2011 11:47 AM, Avi Kivity a...@redhat.com wrote:
On 02/28/2011 07:33 PM, Anthony Liguori wrote:

You're just ignoring what I've written.

No, you're just impervious to my subtle attempt to refocus the discussion on solving a practical problem. There's a lot of good, reasonably straightforward changes we can make that have a high return on investment.

Is making qemu the authoritative source of configuration information a straightforward change? Is the return on it high? Is the investment low?

I think this is where we fundamentally disagree. My position is that QEMU is already the authoritative source. Having a state file doesn't change anything. Do a hot unplug of a network device with upstream libvirt with acpiphp unloaded, consult libvirt and then consult the monitor to see who has the right view of the guest's config.

libvirt is right and the monitor is wrong. On real hardware, calling _EJ0 doesn't affect the configuration one little bit (if I understand it correctly). It just turns off power to the slot. If you power-cycle, the card will be there.

It's up to the hardware vendor. Since it's ACPI, it can result in any number of operations. Usually, there's some logic to flip on an LED or something. There's nothing that prevents a vendor from ejecting the card. My point is that there aren't cleanly separated lines in the real world.

To me, that's the definition of authoritative.

No to all three (ignoring for the moment whether it is good or not, which we were debating).

The only suggestion I'm making beyond Marcelo's original patch is that we use a structured format and that we make it possible to use the same file to solve this problem in multiple places.

No, you're suggesting a lot more than that.

That's exactly what I'm suggesting from a technical perspective.

Unless I'm hallucinating, you're suggesting quite a bit more.
A revolution in how qemu is to be managed.

Let me take another route to see if I can't persuade you.

First, let's clarify your proposal. You want to introduce a new block format that references two block devices. It may also store a dirty bitmap to keep track of which blocks are out of sync. Hopefully, it goes without saying that the dirty bitmap is strictly optional (it's a performance optimization), so let's ignore it.

Your format, as a text file, looks like:

[raid1]
primary=diska.img
secondary=diskb.img
active=primary

To use it, here's the sequence:

0) qemu uses disk A for a block device
1) create a raid1 block device pointing to disk A and disk B.
2) management tool asks qemu to use the new raid1 block device.
3) qemu acks (2)
4) at some point, the mirror completes, writes are going to both disks
5) qemu sends out an event indicating that the disks are in sync
6) management tool then sends a command to fail over to disk B
7) qemu acks (6)

We're making the management tool the authoritative source of how to launch QEMU. That means that the management tool ultimately determines which command line to relaunch QEMU with.

Here are the races:

A) If QEMU crashes between (2) and (3), it may have issued a write to the new raid1 block device before the management tool sees (3). If this happens, when the management tool restarts QEMU with disk A, we're left with a dangling raid1 block device. Not a critical failure, but not ideal.

B) If QEMU crashes between (6) and (7), QEMU may have started writing to disk B before the management tool sees (7). This means that the management tool will create the guest with the raid1 block device, which is no longer the correct disk. This could fail in subtly bad ways. Depending on how read is implemented (if you try to do striping, for instance), bad data could be returned. You could try to implement a policy of always reading from B if the block has been copied, but this gets hairy really quickly. It's definitely not RAID1 anymore.
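[Editorial sketch, not part of the original mail.] The two races share one shape: qemu changes which disk it writes to before the management tool has seen the ack, so a crash in that window leaves the two views out of step. A toy model in C, assuming hypothetical names (enum disk, switch_disk, views_diverge) that do not exist in QEMU or libvirt, might look like this:

```c
#include <stdbool.h>

/* Hypothetical model of the three disk configurations in the sequence. */
enum disk { DISK_A, DISK_RAID1, DISK_B };

/* What qemu is really writing to. */
struct qemu_state { enum disk active; };

/* What the management tool would relaunch qemu with. */
struct mgmt_view { enum disk launch_disk; };

/*
 * qemu switches its active disk first and only then sends the ack.
 * If it crashes before the ack (crash_before_ack), the management
 * tool never learns about the switch and keeps its old view.
 */
void switch_disk(struct qemu_state *q, struct mgmt_view *m,
                 enum disk target, bool crash_before_ack)
{
    q->active = target;
    if (!crash_before_ack) {
        m->launch_disk = target;
    }
}

/* True when a relaunch would not use the disk qemu was last writing to. */
bool views_diverge(const struct qemu_state *q, const struct mgmt_view *m)
{
    return q->active != m->launch_disk;
}
```

In this model, a crash during step (6) leaves qemu on DISK_B while the tool would relaunch with DISK_RAID1, which corresponds to race B; a crash during step (2) gives the milder divergence of race A.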
You may observe that the problem is not the RAID1 mechanism, but the change from using a normal device to using the RAID1 mechanism. It would then be wise to say, let's always use this image format. Since that eliminates the race, we don't really need the copy bitmap anymore.

Now we're left with a simple format that just refers to two filenames. However, a block device is more than just a filename. It needs a format, cache settings, etc. So let's put this all in the RAID1 block format. We also need a way to indicate which block device is selected. Let's make it a text file for purposes of discussion. It will look something like:

[primary]
filename=diska.img
cache=none
format=raw

[secondary]
filename=diskb.img
cache=writethrough
format=qcow2

[global]
active=primary

Since we might want to mirror multiple drives at once, we should probably support
Re: [Qemu-devel] [PATCH -V2 4/6] hw/9pfs: Implement syncfs
On Tue, Mar 1, 2011 at 3:02 PM, Aneesh Kumar K. V aneesh.ku...@linux.vnet.ibm.com wrote:
On Tue, 1 Mar 2011 10:22:07 +, Stefan Hajnoczi stefa...@gmail.com wrote:
On Tue, Mar 1, 2011 at 6:38 AM, Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com wrote:

Signed-off-by: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com
---
 hw/9pfs/virtio-9p.c | 31 +++
 hw/9pfs/virtio-9p.h | 2 ++
 2 files changed, 33 insertions(+), 0 deletions(-)

diff --git a/hw/9pfs/virtio-9p.c b/hw/9pfs/virtio-9p.c
index c4b0198..882f4f3 100644
--- a/hw/9pfs/virtio-9p.c
+++ b/hw/9pfs/virtio-9p.c
@@ -1978,6 +1978,36 @@ static void v9fs_fsync(V9fsState *s, V9fsPDU *pdu)
     v9fs_post_do_fsync(s, pdu, err);
 }
+static void v9fs_post_do_syncfs(V9fsState *s, V9fsPDU *pdu, int err)
+{
+    if (err == -1) {
+        err = -errno;
+    }
+    complete_pdu(s, pdu, err);
+}
+
+static void v9fs_syncfs(V9fsState *s, V9fsPDU *pdu)
+{
+    int err;
+    int32_t fid;
+    size_t offset = 7;
+    V9fsFidState *fidp;
+
+    pdu_unmarshal(pdu, offset, "d", &fid);
+    fidp = lookup_fid(s, fid);
+    if (fidp == NULL) {
+        err = -ENOENT;
+        v9fs_post_do_syncfs(s, pdu, err);
+        return;
+    }
+    /*
+     * We don't have per file system syncfs
+     * So just return success
+     */
+    err = 0;
+    v9fs_post_do_syncfs(s, pdu, err);
+}

Please explain the semantics of P9_TSYNCFS. Won't returning success without doing anything lead to data integrity issues?

I should actually do the 9P Operation format as commit message. Will add in the next update. Whether returning here would cause a data integrity issue depends on what sort of guarantee we want to provide. Calling sync on the guest will cause all the dirty pages in the guest to be flushed to the host. Now all those changes are in the host page cache and it would be nice to flush them as a part of sync, but since we don't have a per file system sync, that would imply we flush all dirty pages on the host, which can result in a large performance impact.

You get to define the semantics of P9_TSYNCFS?
I thought this is part of a well-defined protocol :). If this is a .L extension then it's probably a bad design and shouldn't be added to the protocol if we can't implement it. Is this operation supposed to flush the disk write cache too? I think virtio-9p has a file descriptor cache. Would it be possible to fsync() those file descriptors? Stefan
[Qemu-devel] [Bug 723871] Re: qemu-kvm-0.14.0 Aborts with -vga qxl
** Also affects: qemu-kvm (Ubuntu)
   Importance: Undecided
       Status: New

** Changed in: qemu-kvm (Ubuntu)
   Importance: Undecided => Medium

** Changed in: qemu-kvm (Ubuntu)
     Assignee: (unassigned) => Serge Hallyn (serge-hallyn)

--
You received this bug notification because you are a member of qemu-devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/723871

Title:
  qemu-kvm-0.14.0 Aborts with -vga qxl

Status in QEMU:
  Confirmed
Status in “qemu-kvm” package in Ubuntu:
  New

Bug description:
  Host CPU is Core i7 Q820. KVM is from 2.6.35-gentoo-r5 kernel (x86_64). Host has spice-0.7.2 and spice-protocol-0.7.0. Guest is Windows XP SP3 with qxl driver 0.6.1, virtio-serial 1.1.6 and vdagent 0.6.3.

  qemu-kvm is started like so:

  qemu-system-x86_64 -cpu host -enable-kvm -pidfile /home/rick/qemu/hds/wxp.pid -drive file=/home/rick/qemu/hds/wxp.raw,if=virtio,media=disk,aio=native,snapshot=on -m 768 -name WinXP -net nic,model=virtio -net user -localtime -usb -vga qxl -device virtio-serial -chardev spicevmc,name=vdagent,id=vdagent -device virtserialport,chardev=vdagent,name=com.redhat.spice.0 -spice port=1234,disable-ticketing -monitor stdio

  and crashes with:

  qemu-system-x86_64: /home/rick/qemu/src/qemu-kvm-0.14.0/qemu-kvm.c:1724: kvm_mutex_unlock: Assertion `!cpu_single_env' failed.
  Aborted

  If I use -no-kvm, it works fine. If I use -vga std, it works fine. -enable-kvm and -vga qxl crashes.
[Qemu-devel] [Bug 723871] Re: qemu-kvm-0.14.0 Aborts with -vga qxl
** Branch linked: lp:~serge-hallyn/ubuntu/natty/qemu-kvm/qxl-lock
[Qemu-devel] [PATCH] target-arm: Handle VMOV between two core and VFP single regs
Fix two bugs in the translation of the instructions VMOV sa,sb,rx,ry and VMOV rx,ry,sa,sb (which copy between a pair of ARM core registers and a pair of VFP single precision registers):

* An incorrect condition meant these instruction patterns were being treated as load/store multiple, which resulted in the generation of bad code and a runtime segfault
* The order of the core register pair was reversed so the values would go to the wrong registers

Signed-off-by: Peter Maydell peter.mayd...@linaro.org
---
 target-arm/translate.c |   10 +++++-----
 1 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/target-arm/translate.c b/target-arm/translate.c
index dbd958b..0111a61 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -3232,7 +3232,7 @@ static int disas_vfp_insn(CPUState * env, DisasContext *s, uint32_t insn)
             break;
         case 0xc:
         case 0xd:
-            if (dp && (insn & 0x03e00000) == 0x00400000) {
+            if ((insn & 0x03e00000) == 0x00400000) {
                 /* two-register transfer */
                 rn = (insn >> 16) & 0xf;
                 rd = (insn >> 12) & 0xf;
@@ -3254,10 +3254,10 @@ static int disas_vfp_insn(CPUState * env, DisasContext *s, uint32_t insn)
                     } else {
                         gen_mov_F0_vreg(0, rm);
                         tmp = gen_vfp_mrs();
-                        store_reg(s, rn, tmp);
+                        store_reg(s, rd, tmp);
                         gen_mov_F0_vreg(0, rm + 1);
                         tmp = gen_vfp_mrs();
-                        store_reg(s, rd, tmp);
+                        store_reg(s, rn, tmp);
                     }
                 } else {
                     /* arm-vfp */
@@ -3269,10 +3269,10 @@ static int disas_vfp_insn(CPUState * env, DisasContext *s, uint32_t insn)
                         gen_vfp_msr(tmp);
                         gen_mov_vreg_F0(0, rm * 2 + 1);
                     } else {
-                        tmp = load_reg(s, rn);
+                        tmp = load_reg(s, rd);
                         gen_vfp_msr(tmp);
                         gen_mov_vreg_F0(0, rm);
-                        tmp = load_reg(s, rd);
+                        tmp = load_reg(s, rn);
                         gen_vfp_msr(tmp);
                         gen_mov_vreg_F0(0, rm + 1);
                     }
--
1.7.1
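[Editorial sketch, not part of the original patch.] The register-order half of the fix can be illustrated outside the translator. The field extraction below mirrors the rn/rd lines in the patch context; the struct, the function names and the toy register arrays are hypothetical, and the reading of rd as the first core register named in the assembly (rx) and rn as the second (ry) is an assumption based on the ARM encoding of this instruction.

```c
#include <stdint.h>

/* Core-register fields of the two-register VMOV encoding. */
struct vmov_pair {
    unsigned rd;  /* bits [15:12]: assumed to be rx, paired with s[rm]     */
    unsigned rn;  /* bits [19:16]: assumed to be ry, paired with s[rm + 1] */
};

/* Same shifts and masks as the rn/rd extraction shown in the patch context. */
struct vmov_pair vmov_decode_core_regs(uint32_t insn)
{
    struct vmov_pair p;
    p.rn = (insn >> 16) & 0xf;
    p.rd = (insn >> 12) & 0xf;
    return p;
}

/*
 * Toy model of VMOV rx,ry,sa,sb with the corrected ordering:
 * s[rm] goes to rd and s[rm + 1] goes to rn. Before the fix the two
 * destinations were swapped.
 */
void vmov_to_core(uint32_t core[16], const uint32_t vfp[32],
                  uint32_t insn, unsigned rm)
{
    struct vmov_pair p = vmov_decode_core_regs(insn);
    core[p.rd] = vfp[rm];
    core[p.rn] = vfp[rm + 1];
}
```

For an encoding with 2 in bits [15:12] and 3 in bits [19:16], this copies s[rm] into r2 and s[rm + 1] into r3, which is the order the patched store_reg() calls produce.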
[Qemu-devel] [Bug 723871] Re: qemu-kvm-0.14.0 Aborts with -vga qxl
** Attachment added: debs with proposed fix for amd64 natty
   https://bugs.launchpad.net/ubuntu/+source/qemu-kvm/+bug/723871/+attachment/1879081/+files/nattydebs.tgz
[Qemu-devel] [Bug 723871] Re: qemu-kvm-0.14.0 Aborts with -vga qxl
@Rick, could you tell me if the debs in comment #5 fix the issue? If so I'll go ahead and do a merge request.
Re: [Qemu-devel] [PATCH -V2 4/6] hw/9pfs: Implement syncfs
On Tue, 1 Mar 2011 15:59:19 +, Stefan Hajnoczi stefa...@gmail.com wrote: On Tue, Mar 1, 2011 at 3:02 PM, Aneesh Kumar K. V aneesh.ku...@linux.vnet.ibm.com wrote: On Tue, 1 Mar 2011 10:22:07 +, Stefan Hajnoczi stefa...@gmail.com wrote: On Tue, Mar 1, 2011 at 6:38 AM, Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com wrote: Signed-off-by: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com --- hw/9pfs/virtio-9p.c | 31 +++ hw/9pfs/virtio-9p.h | 2 ++ 2 files changed, 33 insertions(+), 0 deletions(-) diff --git a/hw/9pfs/virtio-9p.c b/hw/9pfs/virtio-9p.c index c4b0198..882f4f3 100644 --- a/hw/9pfs/virtio-9p.c +++ b/hw/9pfs/virtio-9p.c @@ -1978,6 +1978,36 @@ static void v9fs_fsync(V9fsState *s, V9fsPDU *pdu) v9fs_post_do_fsync(s, pdu, err); } +static void v9fs_post_do_syncfs(V9fsState *s, V9fsPDU *pdu, int err) +{ + if (err == -1) { + err = -errno; + } + complete_pdu(s, pdu, err); +} + +static void v9fs_syncfs(V9fsState *s, V9fsPDU *pdu) +{ + int err; + int32_t fid; + size_t offset = 7; + V9fsFidState *fidp; + + pdu_unmarshal(pdu, offset, d, fid); + fidp = lookup_fid(s, fid); + if (fidp == NULL) { + err = -ENOENT; + v9fs_post_do_syncfs(s, pdu, err); + return; + } + /* + * We don't have per file system syncfs + * So just return success + */ + err = 0; + v9fs_post_do_syncfs(s, pdu, err); +} Please explain the semantics of P9_TSYNCFS. Won't returning success without doing anything lead to data integrity issues? I should actually do the 9P Operation format as commit message. Will add in the next update. Whether returning here would cause a data integrity issue, it depends what sort of guarantee we want to provide. So calling sync on the guest will cause all the dirty pages in the guest to be flushed to host. Now all those changes are in the host page cache and it would be nice to flush them as a part of sync but then since we don't have a per file system sync, the above would imply we flush all dirty pages on the host which can result in large performance impact. 
You get to define the semantics of P9_TSYNCFS? I thought this is part of a well-defined protocol :). If this is a .L extension then it's probably a bad design and shouldn't be added to the protocol if we can't implement it.

It is a part of the .L extension and we can definitely implement it. There is a patch out there which is yet to be merged: http://thread.gmane.org/gmane.linux.file-systems/44628

Is this operation supposed to flush the disk write cache too?

I am not sure we need to specify that as a part of the 9p operation. I guess we can only say maximum possible data integrity. Whether a sync will cause a disk write cache flush depends on the file system. For ext* that can be controlled by the mount option barrier.

I think virtio-9p has a file descriptor cache. Would it be possible to fsync() those file descriptors?

Ideally we should. But that would involve a large number of fsync calls.

-aneesh
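[Editorial sketch, not part of the original mail.] The suggestion of calling fsync() on every cached descriptor could look roughly like the loop below. The flat array of descriptors and the name sync_fd_cache are assumptions for illustration; virtio-9p's real fid table is a different structure and is not shown in this thread. The loop also makes the cost concern visible: one fsync() call per open descriptor.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: call fsync() on every descriptor in a flat cache.
 * Negative entries are treated as unused slots. All descriptors are
 * attempted; the return value is 0 on success or the negated errno of
 * the first failure.
 */
int sync_fd_cache(const int *fds, size_t nfds)
{
    int ret = 0;

    for (size_t i = 0; i < nfds; i++) {
        if (fds[i] < 0) {
            continue;  /* unused slot */
        }
        if (fsync(fds[i]) == -1 && ret == 0) {
            ret = -errno;  /* remember the first error, keep going */
        }
    }
    return ret;
}
```

Continuing after a failure is a design choice made here so that one bad descriptor does not leave the remaining files unflushed; a real implementation might prefer to stop early.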
[Qemu-devel] [Bug 723871] Re: qemu-kvm-0.14.0 Aborts with -vga qxl
Serge, I run qemu-kvm from source. My distro is Gentoo, so I can't check your debs. I applied the patch from comment #4 last night and found that I have not encountered this bug since. Let me know if I can provide any additional info...
[Qemu-devel] Re: [PATCH v2 00/15] [uq/master] Patch queue, part IV (MCE edition)
On Fri, Feb 18, 2011 at 11:11:11AM +0100, Jan Kiszka wrote:

Round 2 of this part, primarily addressing review comments:
- Reworked CPU_INTERRUPT_MCE -> exception translation (now done in kvm_arch_process_async_events, indeed much cleaner)
- Add missing cpu_synchronize_state on pending MCE events for !kvm_irqchip_in_kernel
- Split up KVM MCE code switch from old to new style into two patches and dropped some unneeded variable renamings
- Fixed Windows build (qemu_ram_remap is POSIX-only)

Thanks for the feedback so far.

CC: Hidetoshi Seto seto.hideto...@jp.fujitsu.com
CC: Huang Ying ying.hu...@intel.com
CC: Jin Dongming jin.dongm...@np.css.fujitsu.com

Huang Ying (2):
  Add qemu_ram_remap
  KVM, MCE, unpoison memory address across reboot

Jan Kiszka (13):
  x86: Account for MCE in cpu_has_work
  x86: Perform implicit mcg_status reset
  x86: Small cleanups of MCE helpers
  x86: Refine error reporting of MCE injection services
  x86: Optionally avoid injecting AO MCEs while others are pending
  Synchronize VCPU states before reset
  kvm: x86: Move MCE functions together
  kvm: Rename kvm_arch_process_irqchip_events to async_events
  kvm: x86: Inject pending MCE events on state writeback
  x86: Run qemu_inject_x86_mce on target VCPU
  kvm: x86: Consolidate TCG and KVM MCE injection code
  kvm: x86: Clean up kvm_setup_mce
  kvm: x86: Fail kvm_arch_init_vcpu if MCE initialization fails

Please rebase.
[Qemu-devel] [Bug 723871] Re: qemu-kvm-0.14.0 Aborts with -vga qxl
Ok, thanks Rick. Actually, I guess as this isn't an SRU I can go ahead and verify it myself and upload.

** Changed in: qemu-kvm (Ubuntu)
       Status: New => In Progress
[Qemu-devel] [Bug 723871] Re: qemu-kvm-0.14.0 Aborts with -vga qxl
@Rick, would you expect a fedora guest to reproduce this? Would it have the qxl driver? Or must it be Windows?
[Qemu-devel] [PATCH V11 00/15] Xen device model support
From: Anthony PERARD anthony.per...@citrix.com

Hi all,

Here are the few changes since V10:
- Add braces for blocks with a single statement in the clean-up patch;
- the patch that builds Xen only for x86 has been removed; instead, xen_domainbuild is built with libhw and the other Xen files are built for the i386 target only;
- the redirection structure with function pointers has been removed; instead, a few #defines and static inline functions are used for compatibility;
- the platform device uses trace instead of dprintf for guest log;
- introduce i440fx_xen_init and i440fx_common_init to avoid xen_enabled() in piix_pci.

This series depends on the series "Introduce machine QemuOpts".

You can find a git tree here:
git://xenbits.xen.org/people/aperard/qemu-dm.git qemu-dm-v11

Anthony PERARD (12):
  xen: Replace some tab-indents with spaces (clean-up).
  xen: Make Xen build once.
  xen: Support new libxc calls from xen unstable.
  xen: Add initialisation of Xen
  xen: Add xenfv machine
  piix_pci: Introduces Xen specific call for irq.
  xen: Introduce Xen Interrupt Controller
  configure: Always use 64bits target physical addresses with xen enabled.
  Introduce qemu_put_ram_ptr
  vl.c: Introduce getter for shutdown_requested and reset_requested.
  xen: Set running state in xenstore.
  xen: Add Xen hypercall for sleep state in the cmos_s3 callback.

Arun Sharma (1):
  xen: Initialize event channels and io rings

Jun Nakajima (1):
  xen: Introduce the Xen mapcache

Steven Smith (1):
  xen: Add the Xen platform pci device

 Makefile.objs        |    3 +
 Makefile.target      |   16 ++-
 configure            |   71 ++-
 cpu-common.h         |    1 +
 exec.c               |   50 -
 hw/hw.h              |    3 +
 hw/pc.c              |   19 ++-
 hw/pc.h              |    1 +
 hw/pc_piix.c         |   41 -
 hw/pci_ids.h         |    2 +
 hw/piix_pci.c        |   47 -
 hw/xen.h             |   41
 hw/xen_backend.c     |  422 -
 hw/xen_backend.h     |    6 +-
 hw/xen_common.h      |   75 ++--
 hw/xen_disk.c        |  496 +++
 hw/xen_domainbuild.c |   13 +-
 hw/xen_domainbuild.h |    5 +-
 hw/xen_machine_pv.c  |    2 +-
 hw/xen_nic.c         |  265 +---
 hw/xen_platform.c    |  349 ++
 sysemu.h             |    2 +
 trace-events         |    3 +
 vl.c                 |   12 +
 xen-all.c            |  573 ++
 xen-mapcache-stub.c  |   40
 xen-mapcache.c       |  344 ++
 xen-mapcache.h       |   22 ++
 xen-stub.c           |   45
 29 files changed, 2386 insertions(+), 583 deletions(-)
 create mode 100644 hw/xen_platform.c
 create mode 100644 xen-all.c
 create mode 100644 xen-mapcache-stub.c
 create mode 100644 xen-mapcache.c
 create mode 100644 xen-mapcache.h
 create mode 100644 xen-stub.c

--
Anthony PERARD
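[Editorial sketch, not part of the original cover letter.] The cover letter says the function-pointer redirection structure was replaced by #define and static inline compatibility wrappers. The general pattern, shown here with entirely hypothetical names (DEMO_XEN_CTRL_VERSION, demo_old_open, demo_new_open, demo_compat_open) and not with real libxenctrl calls, is to pick the wrapper at compile time from a version macro so that callers are written once against the newer signature:

```c
#include <stddef.h>

/* Hypothetical version macro; a real build would get it from configure. */
#ifndef DEMO_XEN_CTRL_VERSION
#define DEMO_XEN_CTRL_VERSION 400
#endif

/* Stand-ins for two generations of a library entry point (not real libxc). */
int demo_old_open(void)
{
    return 3;
}

int demo_new_open(void *logger, unsigned flags)
{
    (void)logger;
    (void)flags;
    return 3;
}

#if DEMO_XEN_CTRL_VERSION < 410
/* Old library: accept the new-style arguments and drop them. */
static inline int demo_compat_open(void *logger, unsigned flags)
{
    (void)logger;
    (void)flags;
    return demo_old_open();
}
#else
/* New library: the wrapper is a plain alias. */
#define demo_compat_open(logger, flags) demo_new_open((logger), (flags))
#endif

/* Caller code is identical for both library versions. */
int demo_backend_init(void)
{
    return demo_compat_open(NULL, 0) >= 0 ? 0 : -1;
}
```

Compared with a table of function pointers filled in at run time, this keeps the choice in the preprocessor, so there is no indirection on the call path and no structure to initialise.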
[Qemu-devel] [PATCH V11 02/15] xen: Make Xen build once.
From: Anthony PERARD anthony.per...@citrix.com xen_domainbuild is now build in libhw. And xen_machine_pv is build only for i386 targets. Signed-off-by: Anthony PERARD anthony.per...@citrix.com --- Makefile.objs|3 +++ Makefile.target |2 +- hw/xen_domainbuild.c | 10 +- hw/xen_domainbuild.h |5 +++-- hw/xen_machine_pv.c |2 +- 5 files changed, 13 insertions(+), 9 deletions(-) diff --git a/Makefile.objs b/Makefile.objs index 9e98a66..8034115 100644 --- a/Makefile.objs +++ b/Makefile.objs @@ -269,6 +269,9 @@ hw-obj-$(CONFIG_DP8393X) += dp8393x.o hw-obj-$(CONFIG_DS1225Y) += ds1225y.o hw-obj-$(CONFIG_MIPSNET) += mipsnet.o +# Xen +hw-obj-$(CONFIG_XEN) += xen_domainbuild.o + # Sound sound-obj-y = sound-obj-$(CONFIG_SB16) += sb16.o diff --git a/Makefile.target b/Makefile.target index 220589e..ab0a570 100644 --- a/Makefile.target +++ b/Makefile.target @@ -206,7 +206,7 @@ QEMU_CFLAGS += $(VNC_JPEG_CFLAGS) QEMU_CFLAGS += $(VNC_PNG_CFLAGS) # xen backend driver support -obj-$(CONFIG_XEN) += xen_machine_pv.o xen_domainbuild.o +obj-i386-$(CONFIG_XEN) += xen_machine_pv.o # Inter-VM PCI shared memory obj-$(CONFIG_KVM) += ivshmem.o diff --git a/hw/xen_domainbuild.c b/hw/xen_domainbuild.c index 7f1fd66..b73d47f 100644 --- a/hw/xen_domainbuild.c +++ b/hw/xen_domainbuild.c @@ -1,9 +1,9 @@ #include signal.h -#include xen_backend.h -#include xen_domainbuild.h #include sysemu.h #include qemu-timer.h #include qemu-log.h +#include xen_backend.h +#include xen_domainbuild.h #include xenguest.h @@ -49,7 +49,7 @@ static int xenstore_domain_mkdir(char *path) } int xenstore_domain_init1(const char *kernel, const char *ramdisk, - const char *cmdline) + const char *cmdline, ram_addr_t ram_size) { char *dom, uuid_string[42], vm[256], path[256]; int i; @@ -225,7 +225,7 @@ static void xen_domain_cleanup(void) } int xen_domain_build_pv(const char *kernel, const char *ramdisk, -const char *cmdline) +const char *cmdline, ram_addr_t ram_size) { uint32_t ssidref = 0; uint32_t flags = 0; @@ -246,7 +246,7 @@ int 
xen_domain_build_pv(const char *kernel, const char *ramdisk, goto err; } -xenstore_domain_init1(kernel, ramdisk, cmdline); +xenstore_domain_init1(kernel, ramdisk, cmdline, ram_size); rc = xc_domain_max_vcpus(xen_xc, xen_domid, smp_cpus); if (rc 0) { diff --git a/hw/xen_domainbuild.h b/hw/xen_domainbuild.h index dea0121..49683f8 100644 --- a/hw/xen_domainbuild.h +++ b/hw/xen_domainbuild.h @@ -1,13 +1,14 @@ #ifndef QEMU_HW_XEN_DOMAINBUILD_H #define QEMU_HW_XEN_DOMAINBUILD_H 1 +#include cpu-common.h #include xen_common.h int xenstore_domain_init1(const char *kernel, const char *ramdisk, - const char *cmdline); + const char *cmdline, ram_addr_t ram_size); int xenstore_domain_init2(int xenstore_port, int xenstore_mfn, int console_port, int console_mfn); int xen_domain_build_pv(const char *kernel, const char *ramdisk, -const char *cmdline); +const char *cmdline, ram_addr_t ram_size); #endif /* QEMU_HW_XEN_DOMAINBUILD_H */ diff --git a/hw/xen_machine_pv.c b/hw/xen_machine_pv.c index 77a34bf..d0e6e8f 100644 --- a/hw/xen_machine_pv.c +++ b/hw/xen_machine_pv.c @@ -64,7 +64,7 @@ static void xen_init_pv(ram_addr_t ram_size, break; case XEN_CREATE: if (xen_domain_build_pv(kernel_filename, initrd_filename, -kernel_cmdline) 0) { +kernel_cmdline, ram_size) 0) { fprintf(stderr, xen pv domain creation failed\n); exit(1); } -- 1.7.2.3
[Qemu-devel] [PATCH V11 03/15] xen: Support new libxc calls from xen unstable.
From: Anthony PERARD anthony.per...@citrix.com This patch updates the libxenctrl calls in Qemu to use the new interface, otherwise Qemu wouldn't be able to build against new versions of the library. We check libxenctrl version in configure, from Xen 3.3.0 to Xen unstable. Signed-off-by: Anthony PERARD anthony.per...@citrix.com Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com Acked-by: Alexander Graf ag...@suse.de --- configure| 67 - hw/xen_backend.c | 21 --- hw/xen_backend.h |6 ++-- hw/xen_common.h | 64 +-- hw/xen_disk.c|4 +- hw/xen_domainbuild.c |3 +- 6 files changed, 133 insertions(+), 32 deletions(-) diff --git a/configure b/configure index 3036faf..a84d974 100755 --- a/configure +++ b/configure @@ -126,6 +126,7 @@ vnc_jpeg= vnc_png= vnc_thread=no xen= +xen_ctrl_version= linux_aio= attr= vhost_net= @@ -1147,20 +1148,81 @@ fi if test $xen != no ; then xen_libs=-lxenstore -lxenctrl -lxenguest + + # Xen unstable cat $TMPC EOF #include xenctrl.h #include xs.h -int main(void) { xs_daemon_open(); xc_interface_open(); return 0; } +#include stdint.h +#include xen/hvm/hvm_info_table.h +#if !defined(HVM_MAX_VCPUS) +# error HVM_MAX_VCPUS not defined +#endif +int main(void) { + xc_interface *xc; + xs_daemon_open(); + xc = xc_interface_open(0, 0, 0); + xc_hvm_set_mem_type(0, 0, HVMMEM_ram_ro, 0, 0); + xc_gnttab_open(NULL, 0); + return 0; +} EOF if compile_prog $xen_libs ; then +xen_ctrl_version=410 xen=yes -libs_softmmu=$xen_libs $libs_softmmu + + # Xen 4.0.0 + elif ( + cat $TMPC EOF +#include xenctrl.h +#include xs.h +#include stdint.h +#include xen/hvm/hvm_info_table.h +#if !defined(HVM_MAX_VCPUS) +# error HVM_MAX_VCPUS not defined +#endif +int main(void) { + xs_daemon_open(); + xc_interface_open(); + xc_gnttab_open(); + xc_hvm_set_mem_type(0, 0, HVMMEM_ram_ro, 0, 0); + return 0; +} +EOF + compile_prog $xen_libs +) ; then +xen_ctrl_version=400 +xen=yes + + # Xen 3.3.0, 3.4.0 + elif ( + cat $TMPC EOF +#include xenctrl.h +#include xs.h +int main(void) { + 
xs_daemon_open(); + xc_interface_open(); + xc_gnttab_open(); + xc_hvm_set_mem_type(0, 0, HVMMEM_ram_ro, 0, 0); + return 0; +} +EOF + compile_prog $xen_libs +) ; then +xen_ctrl_version=330 +xen=yes + + # Xen not found or unsupported else if test $xen = yes ; then feature_not_found xen fi xen=no fi + + if test $xen = yes; then +libs_softmmu=$xen_libs $libs_softmmu + fi fi ## @@ -2755,6 +2817,7 @@ if test $bluez = yes ; then fi if test $xen = yes ; then echo CONFIG_XEN=y $config_host_mak + echo CONFIG_XEN_CTRL_INTERFACE_VERSION=$xen_ctrl_version $config_host_mak fi if test $io_thread = yes ; then echo CONFIG_IOTHREAD=y $config_host_mak diff --git a/hw/xen_backend.c b/hw/xen_backend.c index 9f4ec4b..3907b83 100644 --- a/hw/xen_backend.c +++ b/hw/xen_backend.c @@ -43,7 +43,8 @@ /* - */ /* public */ -int xen_xc; +XenXC xen_xc = XC_HANDLER_INITIAL_VALUE; +XenGnttab xen_xcg = XC_HANDLER_INITIAL_VALUE; struct xs_handle *xenstore = NULL; const char *xen_protocol; @@ -214,8 +215,8 @@ static struct XenDevice *xen_be_get_xendev(const char *type, int dom, int dev, xendev-debug = debug; xendev-local_port = -1; -xendev-evtchndev = xc_evtchn_open(); -if (xendev-evtchndev 0) { +xendev-evtchndev = xc_evtchn_open(NULL, 0); +if (xendev-evtchndev == XC_HANDLER_INITIAL_VALUE) { xen_be_printf(NULL, 0, can't open evtchn device\n); qemu_free(xendev); return NULL; @@ -223,15 +224,15 @@ static struct XenDevice *xen_be_get_xendev(const char *type, int dom, int dev, fcntl(xc_evtchn_fd(xendev-evtchndev), F_SETFD, FD_CLOEXEC); if (ops-flags DEVOPS_FLAG_NEED_GNTDEV) { -xendev-gnttabdev = xc_gnttab_open(); -if (xendev-gnttabdev 0) { +xendev-gnttabdev = xc_gnttab_open(NULL, 0); +if (xendev-gnttabdev == XC_HANDLER_INITIAL_VALUE) { xen_be_printf(NULL, 0, can't open gnttab device\n); xc_evtchn_close(xendev-evtchndev); qemu_free(xendev); return NULL; } } else { -xendev-gnttabdev = -1; +xendev-gnttabdev = XC_HANDLER_INITIAL_VALUE; } QTAILQ_INSERT_TAIL(xendevs, xendev, next); @@ -277,10 +278,10 @@ static 
struct XenDevice *xen_be_del_xendev(int dom, int dev) qemu_free(xendev-fe); } -if (xendev-evtchndev = 0) { +if (xendev-evtchndev != XC_HANDLER_INITIAL_VALUE) { xc_evtchn_close(xendev-evtchndev); } -if (xendev-gnttabdev = 0) { +if (xendev-gnttabdev != XC_HANDLER_INITIAL_VALUE) {
[Qemu-devel] [PATCH V11 07/15] piix_pci: Introduces Xen specific call for irq.
From: Anthony PERARD anthony.per...@citrix.com

This patch introduces Xen-specific calls in piix_pci. The Xen-specific
parts are in write_config, set_irq and get_pirq.

Signed-off-by: Anthony PERARD anthony.per...@citrix.com
Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com
Acked-by: Alexander Graf ag...@suse.de
---
 hw/pc.h       |    1 +
 hw/pc_piix.c  |    6 +++++-
 hw/piix_pci.c |   47 ++++++++++++++++++++++++++++++++++++++++++++---
 hw/xen.h      |    6 ++++++
 xen-all.c     |   31 +++++++++++++++++++++++++++++++
 xen-stub.c    |   13 +++++++++++++
 6 files changed, 100 insertions(+), 4 deletions(-)

diff --git a/hw/pc.h b/hw/pc.h
index feb8a7a..85662c3 100644
--- a/hw/pc.h
+++ b/hw/pc.h
@@ -176,6 +176,7 @@ struct PCII440FXState;
 typedef struct PCII440FXState PCII440FXState;
 
 PCIBus *i440fx_init(PCII440FXState **pi440fx_state, int *piix_devfn, qemu_irq *pic, ram_addr_t ram_size);
+PCIBus *i440fx_xen_init(PCII440FXState **pi440fx_state, int *piix3_devfn, qemu_irq *pic, ram_addr_t ram_size);
 void i440fx_init_memory_mappings(PCII440FXState *d);
 
 /* piix4.c */
diff --git a/hw/pc_piix.c b/hw/pc_piix.c
index 417c456..7457bdb 100644
--- a/hw/pc_piix.c
+++ b/hw/pc_piix.c
@@ -109,7 +109,11 @@ static void pc_init1(ram_addr_t ram_size,
     isa_irq = qemu_allocate_irqs(isa_irq_handler, isa_irq_state, 24);
 
     if (pci_enabled) {
-        pci_bus = i440fx_init(&i440fx_state, &piix3_devfn, isa_irq, ram_size);
+        if (!xen_enabled()) {
+            pci_bus = i440fx_init(&i440fx_state, &piix3_devfn, isa_irq, ram_size);
+        } else {
+            pci_bus = i440fx_xen_init(&i440fx_state, &piix3_devfn, isa_irq, ram_size);
+        }
     } else {
         pci_bus = NULL;
         i440fx_state = NULL;
diff --git a/hw/piix_pci.c b/hw/piix_pci.c
index 358da58..c11a7f6 100644
--- a/hw/piix_pci.c
+++ b/hw/piix_pci.c
@@ -29,6 +29,7 @@
 #include "isa.h"
 #include "sysbus.h"
 #include "range.h"
+#include "xen.h"
 
 /*
  * I440FX chipset data sheet.
@@ -151,6 +152,13 @@ static void i440fx_write_config(PCIDevice *dev,
     }
 }
 
+static void i440fx_write_config_xen(PCIDevice *dev,
+                                    uint32_t address, uint32_t val, int len)
+{
+    xen_piix_pci_write_config_client(address, val, len);
+    i440fx_write_config(dev, address, val, len);
+}
+
 static int i440fx_load_old(QEMUFile* f, void *opaque, int version_id)
 {
     PCII440FXState *d = opaque;
@@ -216,7 +224,10 @@ static int i440fx_initfn(PCIDevice *dev)
     return 0;
 }
 
-PCIBus *i440fx_init(PCII440FXState **pi440fx_state, int *piix3_devfn, qemu_irq *pic, ram_addr_t ram_size)
+static PCIBus *i440fx_common_init(const char *device_name,
+                                  PCII440FXState **pi440fx_state,
+                                  int *piix3_devfn,
+                                  qemu_irq *pic, ram_addr_t ram_size)
 {
     DeviceState *dev;
     PCIBus *b;
@@ -230,13 +241,13 @@ PCIBus *i440fx_init(PCII440FXState **pi440fx_state, int *piix3_devfn, qemu_irq *
     s->bus = b;
     qdev_init_nofail(dev);
 
-    d = pci_create_simple(b, 0, "i440FX");
+    d = pci_create_simple(b, 0, device_name);
     *pi440fx_state = DO_UPCAST(PCII440FXState, dev, d);
 
     piix3 = DO_UPCAST(PIIX3State, dev,
                       pci_create_simple_multifunction(b, -1, true, "PIIX3"));
     piix3->pic = pic;
-    pci_bus_irqs(b, piix3_set_irq, pci_slot_get_pirq, piix3, 4);
+
     (*pi440fx_state)->piix3 = piix3;
 
     *piix3_devfn = piix3->dev.devfn;
@@ -249,6 +260,28 @@ PCIBus *i440fx_init(PCII440FXState **pi440fx_state, int *piix3_devfn, qemu_irq *
     return b;
 }
 
+PCIBus *i440fx_init(PCII440FXState **pi440fx_state, int *piix3_devfn,
+                    qemu_irq *pic, ram_addr_t ram_size)
+{
+    PCIBus *b;
+
+    b = i440fx_common_init("i440FX", pi440fx_state, piix3_devfn, pic, ram_size);
+    pci_bus_irqs(b, piix3_set_irq, pci_slot_get_pirq, (*pi440fx_state)->piix3, 4);
+
+    return b;
+}
+
+PCIBus *i440fx_xen_init(PCII440FXState **pi440fx_state, int *piix3_devfn,
+                        qemu_irq *pic, ram_addr_t ram_size)
+{
+    PCIBus *b;
+
+    b = i440fx_common_init("i440FX-xen", pi440fx_state, piix3_devfn, pic, ram_size);
+    pci_bus_irqs(b, xen_piix3_set_irq, xen_pci_slot_get_pirq, (*pi440fx_state)->piix3, 4);
+
+    return b;
+}
+
 /* PIIX3 PCI to ISA bridge */
 
 static void piix3_set_irq(void *opaque, int irq_num, int level)
@@ -352,6 +385,14 @@ static PCIDeviceInfo i440fx_info[] = {
         .init         = i440fx_initfn,
         .config_write = i440fx_write_config,
     },{
+        .qdev.name    = "i440FX-xen",
+        .qdev.desc    = "Host bridge",
+        .qdev.size    = sizeof(PCII440FXState),
+        .qdev.vmsd    = &vmstate_i440fx,
+        .qdev.no_user = 1,
+        .init         = i440fx_initfn,
+        .config_write = i440fx_write_config_xen,
+    },{
         .qdev.name    = "PIIX3",
         .qdev.desc    = "ISA bridge",
         .qdev.size    = sizeof(PIIX3State),
diff --git a/hw/xen.h b/hw/xen.h
[Qemu-devel] [PATCH V11 04/15] xen: Add initialisation of Xen
From: Anthony PERARD anthony.per...@citrix.com

Signed-off-by: Anthony PERARD anthony.per...@citrix.com
Acked-by: Alexander Graf ag...@suse.de
---
 Makefile.target |    9 +++++++++
 hw/xen.h        |   13 +++++++++++++
 vl.c            |    2 ++
 xen-all.c       |   23 +++++++++++++++++++++++
 xen-stub.c      |   15 +++++++++++++++
 5 files changed, 62 insertions(+), 0 deletions(-)
 create mode 100644 xen-all.c
 create mode 100644 xen-stub.c

diff --git a/Makefile.target b/Makefile.target
index ab0a570..b08c7f7 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -208,6 +208,15 @@ QEMU_CFLAGS += $(VNC_PNG_CFLAGS)
 # xen backend driver support
 obj-i386-$(CONFIG_XEN) += xen_machine_pv.o
 
+ifeq ($(TARGET_ARCH), i386)
+  CONFIG_NO_XEN = $(if $(subst n,,$(CONFIG_XEN)),n,y)
+else
+  CONFIG_NO_XEN = y
+endif
+# xen support
+obj-i386-$(CONFIG_XEN) += xen-all.o
+obj-$(CONFIG_NO_XEN) += xen-stub.o
+
 # Inter-VM PCI shared memory
 obj-$(CONFIG_KVM) += ivshmem.o
 
diff --git a/hw/xen.h b/hw/xen.h
index 780dcf7..1fefe3a 100644
--- a/hw/xen.h
+++ b/hw/xen.h
@@ -18,4 +18,17 @@ enum xen_mode {
 extern uint32_t xen_domid;
 extern enum xen_mode xen_mode;
 
+extern int xen_allowed;
+
+static inline int xen_enabled(void)
+{
+#ifdef CONFIG_XEN
+    return xen_allowed;
+#else
+    return 0;
+#endif
+}
+
+int xen_init(void);
+
 #endif /* QEMU_HW_XEN_H */
diff --git a/vl.c b/vl.c
index 0030741..910b000 100644
--- a/vl.c
+++ b/vl.c
@@ -259,6 +259,7 @@ static NotifierList machine_init_done_notifiers =
 
 static int tcg_allowed = 1;
 int kvm_allowed = 0;
+int xen_allowed = 0;
 uint32_t xen_domid;
 enum xen_mode xen_mode = XEN_EMULATE;
 
@@ -1866,6 +1867,7 @@ static struct {
     int *allowed;
 } accel_list[] = {
     { "tcg", "tcg", tcg_available, tcg_init, &tcg_allowed },
+    { "xen", "Xen", xen_available, xen_init, &xen_allowed },
     { "kvm", "KVM", kvm_available, kvm_init, &kvm_allowed },
 };
 
diff --git a/xen-all.c b/xen-all.c
new file mode 100644
index 0000000..888fb1d
--- /dev/null
+++ b/xen-all.c
@@ -0,0 +1,23 @@
+/*
+ * Copyright (C) 2010 Citrix Ltd.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2. See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#include "hw/xen_common.h"
+#include "hw/xen_backend.h"
+
+/* Initialise Xen */
+
+int xen_init(void)
+{
+    xen_xc = xc_interface_open(0, 0, 0);
+    if (xen_xc == XC_HANDLER_INITIAL_VALUE) {
+        xen_be_printf(NULL, 0, "can't open xen interface\n");
+        return -1;
+    }
+
+    return 0;
+}
diff --git a/xen-stub.c b/xen-stub.c
new file mode 100644
index 0000000..beb982f
--- /dev/null
+++ b/xen-stub.c
@@ -0,0 +1,15 @@
+/*
+ * Copyright (C) 2010 Citrix Ltd.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2. See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu-common.h"
+#include "hw/xen.h"
+
+int xen_init(void)
+{
+    return -ENOSYS;
+}
-- 
1.7.2.3
[Qemu-devel] [PATCH V11 06/15] xen: Add the Xen platform pci device
From: Steven Smith ssm...@xensource.com Introduce a new emulated PCI device, specific to fully virtualized Xen guests. The device is necessary for PV on HVM drivers to work. Signed-off-by: Steven Smith ssm...@xensource.com Signed-off-by: Anthony PERARD anthony.per...@citrix.com Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com --- Makefile.target |2 + hw/hw.h |3 + hw/pc_piix.c |4 + hw/pci_ids.h |2 + hw/xen.h |2 + hw/xen_platform.c | 349 + trace-events |3 + xen-stub.c|4 + 8 files changed, 369 insertions(+), 0 deletions(-) create mode 100644 hw/xen_platform.c diff --git a/Makefile.target b/Makefile.target index b08c7f7..c539b1e 100644 --- a/Makefile.target +++ b/Makefile.target @@ -217,6 +217,8 @@ endif obj-i386-$(CONFIG_XEN) += xen-all.o obj-$(CONFIG_NO_XEN) += xen-stub.o +obj-i386-$(CONFIG_XEN) += xen_platform.o + # Inter-VM PCI shared memory obj-$(CONFIG_KVM) += ivshmem.o diff --git a/hw/hw.h b/hw/hw.h index 5e24329..c285b2e 100644 --- a/hw/hw.h +++ b/hw/hw.h @@ -682,6 +682,9 @@ extern const VMStateDescription vmstate_usb_device; #define VMSTATE_INT32_LE(_f, _s) \ VMSTATE_SINGLE(_f, _s, 0, vmstate_info_int32_le, int32_t) +#define VMSTATE_UINT8_TEST(_f, _s, _t) \ +VMSTATE_SINGLE_TEST(_f, _s, _t, 0, vmstate_info_uint8, uint8_t) + #define VMSTATE_UINT16_TEST(_f, _s, _t) \ VMSTATE_SINGLE_TEST(_f, _s, _t, 0, vmstate_info_uint16, uint16_t) diff --git a/hw/pc_piix.c b/hw/pc_piix.c index 6eff06e..417c456 100644 --- a/hw/pc_piix.c +++ b/hw/pc_piix.c @@ -121,6 +121,10 @@ static void pc_init1(ram_addr_t ram_size, pc_vga_init(pci_enabled? 
pci_bus: NULL); +if (xen_enabled()) { +pci_xen_platform_init(pci_bus); +} + /* init basic PC hardware */ pc_basic_device_init(isa_irq, rtc_state); diff --git a/hw/pci_ids.h b/hw/pci_ids.h index ea3418c..6e9eabc 100644 --- a/hw/pci_ids.h +++ b/hw/pci_ids.h @@ -108,3 +108,5 @@ #define PCI_DEVICE_ID_INTEL_82371AB 0x7111 #define PCI_DEVICE_ID_INTEL_82371AB_20x7112 #define PCI_DEVICE_ID_INTEL_82371AB_30x7113 + +#define PCI_VENDOR_ID_XENSOURCE 0x5853 diff --git a/hw/xen.h b/hw/xen.h index 726360a..46c4a1c 100644 --- a/hw/xen.h +++ b/hw/xen.h @@ -29,6 +29,8 @@ static inline int xen_enabled(void) #endif } +void pci_xen_platform_init(PCIBus *bus); + int xen_init(void); #if defined(CONFIG_XEN) CONFIG_XEN_CTRL_INTERFACE_VERSION 400 diff --git a/hw/xen_platform.c b/hw/xen_platform.c new file mode 100644 index 000..a03a117 --- /dev/null +++ b/hw/xen_platform.c @@ -0,0 +1,349 @@ +/* + * XEN platform pci device, formerly known as the event channel device + * + * Copyright (c) 2003-2004 Intel Corp. + * Copyright (c) 2006 XenSource + * + * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this software and associated documentation files (the Software), to deal + * in the Software without restriction, including without limitation the rights + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + * copies of the Software, and to permit persons to whom the Software is + * furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN + * THE SOFTWARE. + */ + +#include assert.h + +#include hw.h +#include pc.h +#include pci.h +#include irq.h +#include xen_common.h +#include net.h +#include xen_backend.h +#include rwhandler.h +#include trace.h + +#include xenguest.h + +//#define DEBUG_PLATFORM + +#ifdef DEBUG_PLATFORM +#define DPRINTF(fmt, ...) do { \ +fprintf(stderr, xen_platform: fmt, ## __VA_ARGS__); \ +} while (0) +#else +#define DPRINTF(fmt, ...) do { } while (0) +#endif + +#define PFFLAG_ROM_LOCK 1 /* Sets whether ROM memory area is RW or RO */ + +typedef struct PCIXenPlatformState { +PCIDevice pci_dev; +uint8_t flags; /* used only for version_id == 2 */ +int drivers_blacklisted; +uint16_t driver_product_version; + +/* Log from guest drivers */ +char log_buffer[4096]; +int log_buffer_off; +} PCIXenPlatformState; + +#define XEN_PLATFORM_IOPORT 0x10 + +/* Send bytes to syslog */ +static void log_writeb(PCIXenPlatformState *s, char val) +{ +if (val == '\n' ||
[Qemu-devel] [PATCH V11 11/15] Introduce qemu_put_ram_ptr
From: Anthony PERARD anthony.per...@citrix.com

This function unlocks a ram_ptr given by qemu_get_ram_ptr. After a call
to qemu_put_ram_ptr, the pointer may be unmapped from QEMU when used
with Xen.

Signed-off-by: Anthony PERARD anthony.per...@citrix.com
Acked-by: Alexander Graf ag...@suse.de
---
 cpu-common.h   |    1 +
 exec.c         |   10 ++++++++++
 xen-mapcache.c |   34 ++++++++++++++++++++++++++++++++++
 3 files changed, 45 insertions(+), 0 deletions(-)

diff --git a/cpu-common.h b/cpu-common.h
index 54d21d4..1f1c886 100644
--- a/cpu-common.h
+++ b/cpu-common.h
@@ -55,6 +55,7 @@ void *qemu_get_ram_ptr(ram_addr_t addr);
 /* Same but slower, to use for migration, where the order of
  * RAMBlocks must not change. */
 void *qemu_safe_ram_ptr(ram_addr_t addr);
+void qemu_put_ram_ptr(void *addr);
 /* This should not be used by devices. */
 int qemu_ram_addr_from_host(void *ptr, ram_addr_t *ram_addr);
 ram_addr_t qemu_ram_addr_from_host_nofail(void *ptr);
diff --git a/exec.c b/exec.c
index 558595a..3f3caaa 100644
--- a/exec.c
+++ b/exec.c
@@ -3007,6 +3007,13 @@ void *qemu_safe_ram_ptr(ram_addr_t addr)
     return NULL;
 }
 
+void qemu_put_ram_ptr(void *addr)
+{
+    if (xen_mapcache_enabled()) {
+        qemu_map_cache_unlock(addr);
+    }
+}
+
 int qemu_ram_addr_from_host(void *ptr, ram_addr_t *ram_addr)
 {
     RAMBlock *block;
@@ -3722,6 +3729,7 @@ void cpu_physical_memory_rw(target_phys_addr_t addr, uint8_t *buf,
                     cpu_physical_memory_set_dirty_flags(
                         addr1, (0xff & ~CODE_DIRTY_FLAG));
                 }
+                qemu_put_ram_ptr(ptr);
             }
         } else {
             if ((pd & ~TARGET_PAGE_MASK) > IO_MEM_ROM &&
@@ -3752,6 +3760,7 @@ void cpu_physical_memory_rw(target_phys_addr_t addr, uint8_t *buf,
                 ptr = qemu_get_ram_ptr(pd & TARGET_PAGE_MASK) +
                     (addr & ~TARGET_PAGE_MASK);
                 memcpy(buf, ptr, l);
+                qemu_put_ram_ptr(ptr);
             }
         }
         len -= l;
@@ -3792,6 +3801,7 @@ void cpu_physical_memory_write_rom(target_phys_addr_t addr,
             /* ROM/RAM case */
             ptr = qemu_get_ram_ptr(addr1);
             memcpy(ptr, buf, l);
+            qemu_put_ram_ptr(ptr);
         }
         len -= l;
         buf += l;
diff --git a/xen-mapcache.c b/xen-mapcache.c
index d7f44a7..6398456 100644
--- a/xen-mapcache.c
+++ b/xen-mapcache.c
@@ -195,6 +195,40 @@ uint8_t *qemu_map_cache(target_phys_addr_t phys_addr, target_phys_addr_t size, u
     return mapcache->last_address_vaddr + address_offset;
 }
 
+void qemu_map_cache_unlock(void *buffer)
+{
+    MapCacheEntry *entry = NULL, *pentry = NULL;
+    MapCacheRev *reventry;
+    target_phys_addr_t paddr_index;
+    int found = 0;
+
+    QTAILQ_FOREACH(reventry, &mapcache->locked_entries, next) {
+        if (reventry->vaddr_req == buffer) {
+            paddr_index = reventry->paddr_index;
+            found = 1;
+            break;
+        }
+    }
+    if (!found) {
+        return;
+    }
+    QTAILQ_REMOVE(&mapcache->locked_entries, reventry, next);
+    qemu_free(reventry);
+
+    entry = &mapcache->entry[paddr_index % mapcache->nr_buckets];
+    while (entry && entry->paddr_index != paddr_index) {
+        pentry = entry;
+        entry = entry->next;
+    }
+    if (!entry) {
+        return;
+    }
+    entry->lock--;
+    if (entry->lock > 0) {
+        entry->lock--;
+    }
+}
+
 ram_addr_t qemu_ram_addr_from_mapcache(void *ptr)
 {
     MapCacheRev *reventry;
-- 
1.7.2.3
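The get/put pairing described in the commit message can be illustrated with a minimal standalone sketch in C. It assumes one lock counter per mapping and exactly one decrement per put. Mapping, mapping_get, mapping_put, mapping_can_unmap and demo_get_put are hypothetical names for illustration only; this is not the mapcache implementation from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one cached mapping; not QEMU's MapCacheEntry. */
typedef struct Mapping {
    void *vaddr;       /* host address of the mapped guest memory */
    unsigned int lock; /* number of outstanding users of vaddr */
} Mapping;

/* Hand out the pointer and pin the mapping. */
void *mapping_get(Mapping *m)
{
    m->lock++;
    return m->vaddr;
}

/* Drop one pin; an unbalanced put is ignored rather than underflowing. */
void mapping_put(Mapping *m)
{
    if (m->lock > 0) {
        m->lock--;
    }
}

/* The mapping may only be torn down once nobody holds a pointer to it. */
int mapping_can_unmap(const Mapping *m)
{
    return m->lock == 0;
}

/* Small self-check showing the intended pairing of get and put. */
void demo_get_put(void)
{
    static char backing[16];
    Mapping m = { backing, 0 };
    void *p = mapping_get(&m);

    assert(p == backing);
    assert(!mapping_can_unmap(&m)); /* pinned while the caller uses p */
    mapping_put(&m);
    assert(mapping_can_unmap(&m));  /* free to unmap after the put */
}
```

In this sketch a caller that forgets the put keeps the mapping pinned indefinitely, which is presumably why the patch adds a qemu_put_ram_ptr call after each memcpy in exec.c.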
[Qemu-devel] [PATCH V11 05/15] xen: Add xenfv machine
From: Anthony PERARD anthony.per...@citrix.com

Introduce the Xen FV (fully virtualized) machine to QEMU. More
Xen-specific calls will be added in later patches.

Signed-off-by: Anthony PERARD anthony.per...@citrix.com
---
 hw/pc.c      |   19 +++++++++++++++++--
 hw/pc_piix.c |   17 +++++++++++++++++
 hw/xen.h     |    4 ++++
 3 files changed, 38 insertions(+), 2 deletions(-)

diff --git a/hw/pc.c b/hw/pc.c
index 5966bf1..6a73407 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -41,6 +41,7 @@
 #include "sysemu.h"
 #include "blockdev.h"
 #include "ui/qemu-spice.h"
+#include "xen.h"
 
 /* output Bochs bios info messages */
 //#define DEBUG_BIOS
@@ -918,7 +919,11 @@ static void pc_cpu_reset(void *opaque)
     CPUState *env = opaque;
 
     cpu_reset(env);
-    env->halted = !cpu_is_bsp(env);
+    if (!xen_enabled()) {
+        env->halted = !cpu_is_bsp(env);
+    } else {
+        env->halted = 1;
+    }
 }
 
 static CPUState *pc_new_cpu(const char *cpu_model)
@@ -952,7 +957,12 @@ void pc_cpus_init(const char *cpu_model)
 #endif
     }
 
-    for(i = 0; i < smp_cpus; i++) {
+    if (!xen_enabled()) {
+        for(i = 0; i < smp_cpus; i++) {
+            pc_new_cpu(cpu_model);
+        }
+    } else {
+        /* Xen require only one Qemu VCPU */
         pc_new_cpu(cpu_model);
     }
 }
@@ -980,6 +990,11 @@ void pc_memory_init(ram_addr_t ram_size,
     *above_4g_mem_size_p = above_4g_mem_size;
     *below_4g_mem_size_p = below_4g_mem_size;
 
+    if (xen_enabled()) {
+        /* Nothing to do for Xen */
+        return;
+    }
+
     linux_boot = (kernel_filename != NULL);
 
     /* allocate RAM */
diff --git a/hw/pc_piix.c b/hw/pc_piix.c
index b3ede89..6eff06e 100644
--- a/hw/pc_piix.c
+++ b/hw/pc_piix.c
@@ -37,6 +37,10 @@
 #include "sysbus.h"
 #include "arch_init.h"
 #include "blockdev.h"
+#include "xen.h"
+#ifdef CONFIG_XEN
+#  include <xen/hvm/hvm_info_table.h>
+#endif
 
 #define MAX_IDE_BUS 2
 
@@ -391,6 +395,16 @@ static QEMUMachine isapc_machine = {
     .max_cpus = 1,
 };
 
+#ifdef CONFIG_XEN
+static QEMUMachine xenfv_machine = {
+    .name = "xenfv",
+    .desc = "Xen Fully-virtualized PC",
+    .init = pc_init_pci,
+    .max_cpus = HVM_MAX_VCPUS,
+    .default_machine_opts = "accel=xen",
+};
+#endif
+
 static void pc_machine_init(void)
 {
     qemu_register_machine(&pc_machine);
@@ -399,6 +413,9 @@ static void pc_machine_init(void)
     qemu_register_machine(&pc_machine_v0_11);
     qemu_register_machine(&pc_machine_v0_10);
     qemu_register_machine(&isapc_machine);
+#ifdef CONFIG_XEN
+    qemu_register_machine(&xenfv_machine);
+#endif
 }
 
 machine_init(pc_machine_init);
diff --git a/hw/xen.h b/hw/xen.h
index 1fefe3a..726360a 100644
--- a/hw/xen.h
+++ b/hw/xen.h
@@ -31,4 +31,8 @@ static inline int xen_enabled(void)
 
 int xen_init(void);
 
+#if defined(CONFIG_XEN) && CONFIG_XEN_CTRL_INTERFACE_VERSION < 400
+#  define HVM_MAX_VCPUS 32
+#endif
+
 #endif /* QEMU_HW_XEN_H */
-- 
1.7.2.3
[Qemu-devel] [PATCH V11 12/15] vl.c: Introduce getter for shutdown_requested and reset_requested.
From: Anthony PERARD anthony.per...@citrix.com

Introduce two functions, qemu_shutdown_requested_get and
qemu_reset_requested_get, that return the value of
shutdown_requested/reset_requested without resetting it.

Signed-off-by: Anthony PERARD anthony.per...@citrix.com
Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com
Acked-by: Alexander Graf ag...@suse.de
---
 sysemu.h |    2 ++
 vl.c     |   10 ++++++++++
 2 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/sysemu.h b/sysemu.h
index 0a83ab9..37992b6 100644
--- a/sysemu.h
+++ b/sysemu.h
@@ -63,6 +63,8 @@ void qemu_system_shutdown_request(void);
 void qemu_system_powerdown_request(void);
 void qemu_system_debug_request(void);
 void qemu_system_vmstop_request(int reason);
+int qemu_shutdown_requested_get(void);
+int qemu_reset_requested_get(void);
 int qemu_shutdown_requested(void);
 int qemu_reset_requested(void);
 int qemu_powerdown_requested(void);
diff --git a/vl.c b/vl.c
index 910b000..f7d317e 100644
--- a/vl.c
+++ b/vl.c
@@ -1222,6 +1222,16 @@ static int powerdown_requested;
 static int debug_requested;
 static int vmstop_requested;
 
+int qemu_shutdown_requested_get(void)
+{
+    return shutdown_requested;
+}
+
+int qemu_reset_requested_get(void)
+{
+    return reset_requested;
+}
+
 int qemu_shutdown_requested(void)
 {
     int r = shutdown_requested;
-- 
1.7.2.3
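As a rough illustration of how the new getters differ from the existing qemu_shutdown_requested(), here is a minimal standalone sketch in C. It assumes the existing function reads the flag and then clears it, which is what the commit message implies. The names request_shutdown, shutdown_requested_get, shutdown_requested_consume and demo_peek_then_consume are hypothetical stand-ins, not QEMU symbols.

```c
#include <assert.h>

/* Hypothetical stand-in for the file-local request flag in vl.c. */
static int shutdown_requested;

/* Producer side: record that a shutdown was requested. */
void request_shutdown(void)
{
    shutdown_requested = 1;
}

/* Non-destructive getter, in the style of the patch: report the flag only. */
int shutdown_requested_get(void)
{
    return shutdown_requested;
}

/* Consuming reader (assumed semantics): report the flag, then clear it. */
int shutdown_requested_consume(void)
{
    int r = shutdown_requested;

    shutdown_requested = 0;
    return r;
}

/* Peeking any number of times leaves the request pending for the consumer. */
void demo_peek_then_consume(void)
{
    request_shutdown();
    assert(shutdown_requested_get() == 1);
    assert(shutdown_requested_get() == 1);     /* still pending */
    assert(shutdown_requested_consume() == 1); /* consumer sees it once */
    assert(shutdown_requested_get() == 0);     /* and it is now cleared */
}
```

The getter lets a second observer check for a pending request without taking it away from the main loop, which would otherwise never see it.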
[Qemu-devel] [PATCH V11 15/15] xen: Add Xen hypercall for sleep state in the cmos_s3 callback.
From: Anthony PERARD anthony.per...@citrix.com

Signed-off-by: Anthony PERARD anthony.per...@citrix.com
---
 hw/pc_piix.c |    6 +++++-
 hw/xen.h     |    1 +
 xen-all.c    |    9 +++++++++
 xen-stub.c   |    4 ++++
 4 files changed, 19 insertions(+), 1 deletions(-)

diff --git a/hw/pc_piix.c b/hw/pc_piix.c
index 1d55bc9..5ec7d7f 100644
--- a/hw/pc_piix.c
+++ b/hw/pc_piix.c
@@ -181,7 +181,11 @@ static void pc_init1(ram_addr_t ram_size,
         uint8_t *eeprom_buf = qemu_mallocz(8 * 256); /* XXX: make this persistent */
         i2c_bus *smbus;
 
-        cmos_s3 = qemu_allocate_irqs(pc_cmos_set_s3_resume, rtc_state, 1);
+        if (!xen_enabled()) {
+            cmos_s3 = qemu_allocate_irqs(pc_cmos_set_s3_resume, rtc_state, 1);
+        } else {
+            cmos_s3 = qemu_allocate_irqs(xen_cmos_set_s3_resume, rtc_state, 1);
+        }
         smi_irq = qemu_allocate_irqs(pc_acpi_smi_interrupt, first_cpu, 1);
         /* TODO: Populate SPD eeprom data. */
         smbus = piix4_pm_init(pci_bus, piix3_devfn + 3, 0xb100,
diff --git a/hw/xen.h b/hw/xen.h
index e26d061..d8ee1e4 100644
--- a/hw/xen.h
+++ b/hw/xen.h
@@ -43,6 +43,7 @@ static inline int xen_mapcache_enabled(void)
 int xen_pci_slot_get_pirq(PCIDevice *pci_dev, int irq_num);
 void xen_piix3_set_irq(void *opaque, int irq_num, int level);
 void xen_piix_pci_write_config_client(uint32_t address, uint32_t val, int len);
+void xen_cmos_set_s3_resume(void *opaque, int irq, int level);
 
 qemu_irq *xen_interrupt_controller_init(void);
 
diff --git a/xen-all.c b/xen-all.c
index 279efd0..09e3792 100644
--- a/xen-all.c
+++ b/xen-all.c
@@ -9,6 +9,7 @@
 #include <sys/mman.h>
 
 #include "hw/pci.h"
+#include "hw/pc.h"
 #include "hw/xen_common.h"
 #include "hw/xen_backend.h"
 
@@ -90,6 +91,14 @@ void xen_piix_pci_write_config_client(uint32_t address, uint32_t val, int len)
     }
 }
 
+void xen_cmos_set_s3_resume(void *opaque, int irq, int level)
+{
+    pc_cmos_set_s3_resume(opaque, irq, level);
+    if (level) {
+        xc_set_hvm_param(xen_xc, xen_domid, HVM_PARAM_ACPI_S_STATE, 3);
+    }
+}
+
 /* Xen Interrupt Controller */
 
 static void xen_set_irq(void *opaque, int irq, int level)
diff --git a/xen-stub.c b/xen-stub.c
index eebc223..718eb05 100644
--- a/xen-stub.c
+++ b/xen-stub.c
@@ -22,6 +22,10 @@ void xen_piix_pci_write_config_client(uint32_t address, uint32_t val, int len)
 {
 }
 
+void xen_cmos_set_s3_resume(void *opaque, int irq, int level)
+{
+}
+
 void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t size)
 {
 }
-- 
1.7.2.3
[Qemu-devel] [PATCH V11 08/15] xen: Introduce Xen Interrupt Controller
From: Anthony PERARD anthony.per...@citrix.com

Every set_irq call makes a Xen hypercall.

Signed-off-by: Anthony PERARD anthony.per...@citrix.com
Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com
---
 hw/pc_piix.c |    8 ++++++--
 hw/xen.h     |    2 ++
 xen-all.c    |   12 ++++++++++++
 xen-stub.c   |    5 +++++
 4 files changed, 25 insertions(+), 2 deletions(-)

diff --git a/hw/pc_piix.c b/hw/pc_piix.c
index 7457bdb..1d55bc9 100644
--- a/hw/pc_piix.c
+++ b/hw/pc_piix.c
@@ -99,8 +99,12 @@ static void pc_init1(ram_addr_t ram_size,
     pc_memory_init(ram_size, kernel_filename, kernel_cmdline, initrd_filename,
                    &below_4g_mem_size, &above_4g_mem_size);
 
-    cpu_irq = pc_allocate_cpu_irq();
-    i8259 = i8259_init(cpu_irq[0]);
+    if (!xen_enabled()) {
+        cpu_irq = pc_allocate_cpu_irq();
+        i8259 = i8259_init(cpu_irq[0]);
+    } else {
+        i8259 = xen_interrupt_controller_init();
+    }
     isa_irq_state = qemu_mallocz(sizeof(*isa_irq_state));
     isa_irq_state->i8259 = i8259;
     if (pci_enabled) {
diff --git a/hw/xen.h b/hw/xen.h
index 95653df..12d4e5f 100644
--- a/hw/xen.h
+++ b/hw/xen.h
@@ -35,6 +35,8 @@ int xen_pci_slot_get_pirq(PCIDevice *pci_dev, int irq_num);
 void xen_piix3_set_irq(void *opaque, int irq_num, int level);
 void xen_piix_pci_write_config_client(uint32_t address, uint32_t val, int len);
 
+qemu_irq *xen_interrupt_controller_init(void);
+
 void pci_xen_platform_init(PCIBus *bus);
 
 int xen_init(void);
diff --git a/xen-all.c b/xen-all.c
index 7b94d61..761f2a0 100644
--- a/xen-all.c
+++ b/xen-all.c
@@ -40,6 +40,18 @@ void xen_piix_pci_write_config_client(uint32_t address, uint32_t val, int len)
     }
 }
 
+/* Xen Interrupt Controller */
+
+static void xen_set_irq(void *opaque, int irq, int level)
+{
+    xc_hvm_set_isa_irq_level(xen_xc, xen_domid, irq, level);
+}
+
+qemu_irq *xen_interrupt_controller_init(void)
+{
+    return qemu_allocate_irqs(xen_set_irq, NULL, 16);
+}
+
 /* Initialise Xen */
 
 int xen_init(void)
diff --git a/xen-stub.c b/xen-stub.c
index 216cae9..e335b67 100644
--- a/xen-stub.c
+++ b/xen-stub.c
@@ -22,6 +22,11 @@ void xen_piix_pci_write_config_client(uint32_t address, uint32_t val, int len)
 {
 }
 
+qemu_irq *xen_interrupt_controller_init(void)
+{
+    return NULL;
+}
+
 void pci_xen_platform_init(PCIBus *bus)
 {
 }
-- 
1.7.2.3
[Qemu-devel] [PATCH V11 14/15] xen: Set running state in xenstore.
From: Anthony PERARD anthony.per...@citrix.com

This tells the Xen management tool that the machine can begin to run.

Signed-off-by: Anthony PERARD anthony.per...@citrix.com
Acked-by: Alexander Graf ag...@suse.de
---
 xen-all.c |   23 +++++++++++++++++++++++
 1 files changed, 23 insertions(+), 0 deletions(-)

diff --git a/xen-all.c b/xen-all.c
index f96fd7d..279efd0 100644
--- a/xen-all.c
+++ b/xen-all.c
@@ -55,6 +55,8 @@ typedef struct XenIOState {
     /* which vcpu we are serving */
     int send_vcpu;
 
+    struct xs_handle *xenstore;
+
     Notifier exit;
 } XenIOState;
 
@@ -423,6 +425,17 @@ static void cpu_handle_ioreq(void *opaque)
     }
 }
 
+static void xenstore_record_dm_state(XenIOState *s, const char *state)
+{
+    char path[50];
+
+    snprintf(path, sizeof (path), "/local/domain/0/device-model/%u/state", xen_domid);
+    if (!xs_write(s->xenstore, XBT_NULL, path, state, strlen(state))) {
+        fprintf(stderr, "error recording dm state\n");
+        exit(1);
+    }
+}
+
 static void xen_main_loop_prepare(XenIOState *state)
 {
     int evtchn_fd = -1;
@@ -438,6 +451,9 @@ static void xen_main_loop_prepare(XenIOState *state)
     if (evtchn_fd != -1) {
         qemu_set_fd_handler(evtchn_fd, cpu_handle_ioreq, NULL, state);
     }
+
+    /* record state running */
+    xenstore_record_dm_state(state, "running");
 }
 
 
@@ -456,6 +472,7 @@ static void xen_exit_notifier(Notifier *n)
     XenIOState *state = container_of(n, XenIOState, exit);
 
     xc_evtchn_close(state->xce_handle);
+    xs_daemon_close(state->xenstore);
 }
 
 int xen_init(void)
@@ -478,6 +495,12 @@ int xen_init(void)
         return -errno;
     }
 
+    state->xenstore = xs_daemon_open();
+    if (state->xenstore == NULL) {
+        perror("xen: xenstore open");
+        return -errno;
+    }
+
     state->exit.notify = xen_exit_notifier;
     qemu_add_exit_notifier(&state->exit);
 
-- 
1.7.2.3
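The 50-byte path buffer in xenstore_record_dm_state looks sufficient for any 32-bit domain id, and that can be checked with a small standalone sketch in C. It uses only snprintf and makes no xenstore call. dm_state_path and demo_dm_state_path are hypothetical helpers for illustration, and the sketch assumes unsigned int is 32 bits wide.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format the device-model state path for a domain id into buf.
 * Returns 0 on success, -1 if the result would not fit in len bytes.
 */
int dm_state_path(char *buf, size_t len, unsigned int domid)
{
    int n = snprintf(buf, len, "/local/domain/0/device-model/%u/state", domid);

    if (n < 0 || (size_t)n >= len) {
        return -1; /* encoding error or truncated output */
    }
    return 0;
}

void demo_dm_state_path(void)
{
    char path[50]; /* same size as the buffer in the patch */

    assert(dm_state_path(path, sizeof(path), 1) == 0);
    assert(strcmp(path, "/local/domain/0/device-model/1/state") == 0);

    /* Largest 32-bit id: 45 characters plus the terminating NUL. */
    assert(dm_state_path(path, sizeof(path), 4294967295u) == 0);
    assert(strlen(path) == 45);
}
```

Checking the snprintf return value, as the helper does, would turn a silently truncated path into an explicit error if the format string ever grew.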
[Qemu-devel] [PATCH V11 09/15] xen: Introduce the Xen mapcache
From: Jun Nakajima jun.nakaj...@intel.com On IA32 host or IA32 PAE host, at present, generally, we can't create an HVM guest with more than 2G memory, because generally it's almost impossible for Qemu to find a large enough and consecutive virtual address space to map an HVM guest's whole physical address space. The attached patch fixes this issue using dynamic mapping based on little blocks of memory. Each call to qemu_get_ram_ptr makes a call to qemu_map_cache with the lock option, so mapcache will not unmap these ram_ptr. Signed-off-by: Jun Nakajima jun.nakaj...@intel.com Signed-off-by: Anthony PERARD anthony.per...@citrix.com Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com --- Makefile.target |3 + configure |3 + exec.c | 40 ++- hw/xen.h| 13 ++ hw/xen_common.h |9 ++ xen-all.c | 64 +++ xen-mapcache-stub.c | 40 +++ xen-mapcache.c | 310 +++ xen-mapcache.h | 22 xen-stub.c |4 + 10 files changed, 504 insertions(+), 4 deletions(-) create mode 100644 xen-mapcache-stub.c create mode 100644 xen-mapcache.c create mode 100644 xen-mapcache.h diff --git a/Makefile.target b/Makefile.target index c539b1e..dcdd51d 100644 --- a/Makefile.target +++ b/Makefile.target @@ -214,8 +214,11 @@ else CONFIG_NO_XEN = y endif # xen support +CONFIG_NO_XEN_MAPCACHE = $(if $(subst n,,$(CONFIG_XEN_MAPCACHE)),n,y) obj-i386-$(CONFIG_XEN) += xen-all.o obj-$(CONFIG_NO_XEN) += xen-stub.o +obj-i386-$(CONFIG_XEN_MAPCACHE) += xen-mapcache.o +obj-$(CONFIG_NO_XEN_MAPCACHE) += xen-mapcache-stub.o obj-i386-$(CONFIG_XEN) += xen_platform.o diff --git a/configure b/configure index a84d974..4fd8696 100755 --- a/configure +++ b/configure @@ -3172,6 +3172,9 @@ case $target_arch2 in i386|x86_64) if test $xen = yes -a $target_softmmu = yes ; then echo CONFIG_XEN=y $config_target_mak + if test $cpu = i386 -o $cpu = x86_64; then + echo CONFIG_XEN_MAPCACHE=y $config_target_mak + fi fi esac case $target_arch2 in diff --git a/exec.c b/exec.c index d611100..558595a 100644 --- a/exec.c +++ b/exec.c @@ 
-32,6 +32,7 @@ #include hw/qdev.h #include osdep.h #include kvm.h +#include hw/xen.h #include qemu-timer.h #if defined(CONFIG_USER_ONLY) #include qemu.h @@ -51,6 +52,8 @@ #include libutil.h #endif #endif +#else /* !CONFIG_USER_ONLY */ +#include xen-mapcache.h #endif //#define DEBUG_TB_INVALIDATE @@ -2865,6 +2868,7 @@ ram_addr_t qemu_ram_alloc_from_ptr(DeviceState *dev, const char *name, } } +new_block-offset = find_ram_offset(size); if (host) { new_block-host = host; } else { @@ -2886,13 +2890,15 @@ ram_addr_t qemu_ram_alloc_from_ptr(DeviceState *dev, const char *name, PROT_EXEC|PROT_READ|PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS, -1, 0); #else -new_block-host = qemu_vmalloc(size); +if (xen_mapcache_enabled()) { +xen_ram_alloc(new_block-offset, size); +} else { +new_block-host = qemu_vmalloc(size); +} #endif qemu_madvise(new_block-host, size, QEMU_MADV_MERGEABLE); } } - -new_block-offset = find_ram_offset(size); new_block-length = size; QLIST_INSERT_HEAD(ram_list.blocks, new_block, next); @@ -2933,7 +2939,11 @@ void qemu_ram_free(ram_addr_t addr) #if defined(TARGET_S390X) defined(CONFIG_KVM) munmap(block-host, block-length); #else -qemu_vfree(block-host); +if (xen_mapcache_enabled()) { +qemu_invalidate_entry(block-host); +} else { +qemu_vfree(block-host); +} #endif } qemu_free(block); @@ -2959,6 +2969,15 @@ void *qemu_get_ram_ptr(ram_addr_t addr) if (addr - block-offset block-length) { QLIST_REMOVE(block, next); QLIST_INSERT_HEAD(ram_list.blocks, block, next); +if (xen_mapcache_enabled()) { +/* We need to check if the requested address is in the RAM + * because we don't want to map the entire memory in QEMU. 
+ */ +if (block-offset == 0) { +return qemu_map_cache(addr, 0, 1); +} +block-host = qemu_map_cache(block-offset, block-length, 1); +} return block-host + (addr - block-offset); } } @@ -2994,11 +3013,21 @@ int qemu_ram_addr_from_host(void *ptr, ram_addr_t *ram_addr) uint8_t *host = ptr; QLIST_FOREACH(block, ram_list.blocks, next) { +/* This case append when the block is not mapped. */ +if (block-host == NULL) { +continue; +} if (host - block-host block-length)
Re: [Qemu-devel] [Bug 723871] Re: qemu-kvm-0.14.0 Aborts with -vga qxl
On Tuesday 01 March 2011 12:29:14 Serge Hallyn wrote:
> @Rick, would you expect a fedora guest to reproduce this? Would it
> have the qxl driver? Or must it be Windows?

I don't have a fedora guest to test on, and I don't know the
implementation details well enough to postulate.

-Rick
[Qemu-devel] [PATCH V11 10/15] configure: Always use 64bits target physical addresses with xen enabled.
From: Anthony PERARD anthony.per...@citrix.com With MapCache, we can handle a 64b target, even with a 32b host/qemu. So, we need to have target_phys_addr_t to 64bits. Signed-off-by: Anthony PERARD anthony.per...@citrix.com Acked-by: Alexander Graf ag...@suse.de --- configure |1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/configure b/configure index 4fd8696..ecab08c 100755 --- a/configure +++ b/configure @@ -3171,6 +3171,7 @@ echo TARGET_ABI_DIR=$TARGET_ABI_DIR $config_target_mak case $target_arch2 in i386|x86_64) if test $xen = yes -a $target_softmmu = yes ; then + target_phys_bits=64 echo CONFIG_XEN=y $config_target_mak if test $cpu = i386 -o $cpu = x86_64; then echo CONFIG_XEN_MAPCACHE=y $config_target_mak -- 1.7.2.3
[Qemu-devel] Re: [PATCH v2 00/15] [uq/master] Patch queue, part IV (MCE edition)
On 2011-03-01 18:48, Marcelo Tosatti wrote:
On Fri, Feb 18, 2011 at 11:11:11AM +0100, Jan Kiszka wrote:
Round 2 of this part, primarily addressing review comments:
- Reworked CPU_INTERRUPT_MCE -> exception translation (now done in kvm_arch_process_async_events, indeed much cleaner)
- Add missing cpu_synchronize_state on pending MCE events for !kvm_irqchip_in_kernel
- Split up KVM MCE code switch from old to new style into two patches and dropped some unneeded variable renamings
- Fixed Windows build (qemu_ram_remap is POSIX-only)

Thanks for the feedback so far.

CC: Hidetoshi Seto seto.hideto...@jp.fujitsu.com
CC: Huang Ying ying.hu...@intel.com
CC: Jin Dongming jin.dongm...@np.css.fujitsu.com

Huang Ying (2):
  Add qemu_ram_remap
  KVM, MCE, unpoison memory address across reboot

Jan Kiszka (13):
  x86: Account for MCE in cpu_has_work
  x86: Perform implicit mcg_status reset
  x86: Small cleanups of MCE helpers
  x86: Refine error reporting of MCE injection services
  x86: Optionally avoid injecting AO MCEs while others are pending
  Synchronize VCPU states before reset
  kvm: x86: Move MCE functions together
  kvm: Rename kvm_arch_process_irqchip_events to async_events
  kvm: x86: Inject pending MCE events on state writeback
  x86: Run qemu_inject_x86_mce on target VCPU
  kvm: x86: Consolidate TCG and KVM MCE injection code
  kvm: x86: Clean up kvm_setup_mce
  kvm: x86: Fail kvm_arch_init_vcpu if MCE initialization fails

Please rebase.

Can do - if you push your updated uq/master. :)

Jan
[Qemu-devel] [PATCH V11 13/15] xen: Initialize event channels and io rings
From: Arun Sharma arun.sha...@intel.com Open and bind event channels; map ioreq and buffered ioreq rings. Signed-off-by: Arun Sharma arun.sha...@intel.com Signed-off-by: Anthony PERARD anthony.per...@citrix.com Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com Acked-by: Alexander Graf ag...@suse.de --- hw/xen_common.h |2 + xen-all.c | 411 +++ 2 files changed, 413 insertions(+), 0 deletions(-) diff --git a/hw/xen_common.h b/hw/xen_common.h index 5a36642..a5fc74b 100644 --- a/hw/xen_common.h +++ b/hw/xen_common.h @@ -76,4 +76,6 @@ static inline int xc_fd(xc_interface *xen_xc) } #endif +void destroy_hvm_domain(void); + #endif /* QEMU_HW_XEN_COMMON_H */ diff --git a/xen-all.c b/xen-all.c index 03d1e90..f96fd7d 100644 --- a/xen-all.c +++ b/xen-all.c @@ -6,12 +6,58 @@ * */ +#include sys/mman.h + #include hw/pci.h #include hw/xen_common.h #include hw/xen_backend.h #include xen-mapcache.h +#include xen/hvm/ioreq.h +#include xen/hvm/params.h + +//#define DEBUG_XEN + +#ifdef DEBUG_XEN +#define DPRINTF(fmt, ...) \ +do { fprintf(stderr, xen: fmt, ## __VA_ARGS__); } while (0) +#else +#define DPRINTF(fmt, ...) 
\ +do { } while (0) +#endif + +/* Compatibility with older version */ +#if __XEN_LATEST_INTERFACE_VERSION__ 0x0003020a +# define xen_vcpu_eport(shared_page, i) \ +(shared_page-vcpu_iodata[i].vp_eport) +# define xen_vcpu_ioreq(shared_page, vcpu) \ +(shared_page-vcpu_iodata[vcpu].vp_ioreq) +# define FMT_ioreq_size PRIx64 +#else +# define xen_vcpu_eport(shared_page, i) \ +(shared_page-vcpu_ioreq[i].vp_eport) +# define xen_vcpu_ioreq(shared_page, vcpu) \ +(shared_page-vcpu_ioreq[vcpu]) +# define FMT_ioreq_size u +#endif + +#define BUFFER_IO_MAX_DELAY 100 + +typedef struct XenIOState { +shared_iopage_t *shared_page; +buffered_iopage_t *buffered_io_page; +QEMUTimer *buffered_io_timer; +/* the evtchn port for polling the notification, */ +evtchn_port_t *ioreq_local_port; +/* the evtchn fd for polling */ +XenEvtchn xce_handle; +/* which vcpu we are serving */ +int send_vcpu; + +Notifier exit; +} XenIOState; + /* Xen specific function for piix pci */ int xen_pci_slot_get_pirq(PCIDevice *pci_dev, int irq_num) @@ -112,19 +158,384 @@ void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t size) } +/* VCPU Operations, MMIO, IO ring ... 
*/ + +/* get the ioreq packets from share mem */ +static ioreq_t *cpu_get_ioreq_from_shared_memory(XenIOState *state, int vcpu) +{ +ioreq_t *req = xen_vcpu_ioreq(state-shared_page, vcpu); + +if (req-state != STATE_IOREQ_READY) { +DPRINTF(I/O request not ready: +%x, ptr: %x, port: %PRIx64, +data: %PRIx64, count: % FMT_ioreq_size , size: % FMT_ioreq_size \n, +req-state, req-data_is_ptr, req-addr, +req-data, req-count, req-size); +return NULL; +} + +xen_rmb(); /* see IOREQ_READY /then/ read contents of ioreq */ + +req-state = STATE_IOREQ_INPROCESS; +return req; +} + +/* use poll to get the port notification */ +/* ioreq_vec--out,the */ +/* retval--the number of ioreq packet */ +static ioreq_t *cpu_get_ioreq(XenIOState *state) +{ +int i; +evtchn_port_t port; + +port = xc_evtchn_pending(state-xce_handle); +if (port != -1) { +for (i = 0; i smp_cpus; i++) { +if (state-ioreq_local_port[i] == port) { +break; +} +} + +if (i == smp_cpus) { +hw_error(Fatal error while trying to get io event!\n); +} + +/* unmask the wanted port again */ +xc_evtchn_unmask(state-xce_handle, port); + +/* get the io packet from shared memory */ +state-send_vcpu = i; +return cpu_get_ioreq_from_shared_memory(state, i); +} + +/* read error or read nothing */ +return NULL; +} + +static uint32_t do_inp(pio_addr_t addr, unsigned long size) +{ +switch (size) { +case 1: +return cpu_inb(addr); +case 2: +return cpu_inw(addr); +case 4: +return cpu_inl(addr); +default: +hw_error(inp: bad size: %04FMT_pioaddr %lx, addr, size); +} +} + +static void do_outp(pio_addr_t addr, +unsigned long size, uint32_t val) +{ +switch (size) { +case 1: +return cpu_outb(addr, val); +case 2: +return cpu_outw(addr, val); +case 4: +return cpu_outl(addr, val); +default: +hw_error(outp: bad size: %04FMT_pioaddr %lx, addr, size); +} +} + +static void cpu_ioreq_pio(ioreq_t *req) +{ +int i, sign; + +sign = req-df ? 
-1 : 1; + +if (req-dir == IOREQ_READ) { +if (!req-data_is_ptr) { +req-data = do_inp(req-addr, req-size); +} else { +uint32_t tmp; + +for (i = 0; i req-count; i++) { +
Re: [Qemu-devel] [PATCH] hw/pcnet.c: Fix EPROM contents to suit AMD netware drivers
Hello, Any feedback to the patch, ready to commit? Thnx. Ciao, Gerhard -- http://www.wiesinger.com/ On Wed, 23 Feb 2011, Gerhard Wiesinger wrote: bugfix under DOS for AMD netware driver: AMD PCNTNW Ethernet MLID v3.10 (960115), network card not found bugfix works well under DOS with: 1.) AMD NDIS driver v2.0.1 2.) AMD PCNTNW Ethernet MLID v3.10 (960115) 3.) Knoppix 6.2 --- hw/pcnet.c | 16 1 files changed, 16 insertions(+), 0 deletions(-) diff --git a/hw/pcnet.c b/hw/pcnet.c index db52dc5..6dfdcc4 100644 --- a/hw/pcnet.c +++ b/hw/pcnet.c @@ -1562,8 +1562,24 @@ void pcnet_h_reset(void *opaque) /* Initialize the PROM */ +/* + Datasheet: http://pdfdata.datasheetsite.com/web/24528/AM79C970A.pdf + page 95 +*/ memcpy(s-prom, s-conf.macaddr.a, 6); +/* Reserved Location: must be 00h */ +s-prom[6] = s-prom[7] = 0x00; +/* Reserved Location: must be 00h */ +s-prom[8] = 0x00; +/* Hardware ID: must be 11h if compatibility to AMD drivers is desired */ +s-prom[9] = 0x11; +/* User programmable space, init with 0 */ +s-prom[10] = s-prom[11] = 0x00; +/* LSByte of two-byte checksum, which is the sum of bytes 00h-0Bh + and bytes 0Eh and 0Fh, must therefore be initialized with 0! */ s-prom[12] = s-prom[13] = 0x00; +/* Must be ASCII W (57h) if compatibility to AMD + driver software is desired */ s-prom[14] = s-prom[15] = 0x57; for (i = 0,checksum = 0; i 16; i++) -- 1.7.3.4
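The PROM layout described in the patch comments can be sketched as a stand-alone function. This is only an illustration under assumptions: `sketch_fill_prom` is a hypothetical name, not a function in hw/pcnet.c, and storing the 16-bit sum little-endian in bytes 0Ch/0Dh is inferred from the "LSByte of two-byte checksum" comment, since the part of the patch that stores the checksum is not shown in the mail.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not part of hw/pcnet.c): fill a 16-byte PROM image
 * following the layout described in the patch comments and return the
 * checksum that was stored.
 */
static uint16_t sketch_fill_prom(uint8_t prom[16], const uint8_t mac[6])
{
    uint16_t checksum = 0;
    int i;

    memcpy(prom, mac, 6);           /* bytes 00h-05h: MAC address */
    prom[6] = prom[7] = 0x00;       /* reserved, must be 00h */
    prom[8] = 0x00;                 /* reserved, must be 00h */
    prom[9] = 0x11;                 /* hardware ID expected by AMD drivers */
    prom[10] = prom[11] = 0x00;     /* user programmable space */
    prom[12] = prom[13] = 0x00;     /* checksum bytes start out as zero */
    prom[14] = prom[15] = 0x57;     /* ASCII 'W' expected by AMD drivers */

    /* Sum all 16 bytes; bytes 0Ch/0Dh are still zero at this point. */
    for (i = 0; i < 16; i++) {
        checksum = (uint16_t)(checksum + prom[i]);
    }

    /* Assumption: least significant byte first, as the comment suggests. */
    prom[12] = (uint8_t)(checksum & 0xff);
    prom[13] = (uint8_t)(checksum >> 8);
    return checksum;
}
```

Because bytes 0Ch and 0Dh are zeroed before the loop, summing all 16 bytes gives the same result as summing only bytes 00h-0Bh, 0Eh and 0Fh.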
Re: [Qemu-devel] [RFC][PATCH 0/2] Allow cache settings for block devices to be changed at runtime.
On 03/01/2011 07:50 AM, Christoph Hellwig wrote: On Tue, Mar 01, 2011 at 12:48:34PM +, Stefan Hajnoczi wrote: On Tue, Mar 01, 2011 at 01:42:54PM +0100, Christoph Hellwig wrote: I have patches to do that, and to allow changing O_DIRECT via a monitor command, but to toggle O_SYNC via fcntl I first need to get a kernel patch in as that's currently not allowed to be changed at runtime. Great it sounds like you have already implemented the two cases (guest wce and host O_DIRECT) that we're talking about. At least in theory. And for Linux I can add setting/clearing of O_SYNC via fcntl easily, but what do we do for other hosts? To start with, we can just fail the command at the QMP level. I'm not sure closing/reopening the backing file is easily feasible. Regards, Anthony Liguori
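For readers unfamiliar with the mechanism discussed above, a minimal sketch of toggling a file status flag on an already-open descriptor with fcntl() follows. `sketch_set_status_flag` is a hypothetical helper, not QEMU code. On Linux the same call could be used with O_DIRECT (which needs _GNU_SOURCE with glibc), while, as the thread notes, O_SYNC cannot currently be changed this way.

```c
#include <fcntl.h>

/*
 * Hypothetical helper (not QEMU code): turn one file status flag on or off
 * for an open descriptor. Returns 0 on success, -1 with errno set on error.
 */
static int sketch_set_status_flag(int fd, int flag, int enable)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags < 0) {
        return -1;
    }
    if (enable) {
        flags |= flag;
    } else {
        flags &= ~flag;
    }
    /* The kernel decides which flags may actually change at runtime. */
    return fcntl(fd, F_SETFL, flags);
}
```

A caller that needs to know whether the change took effect should read the flags back with F_GETFL, since a host may accept the call without applying a flag it does not allow to change.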
Re: [Qemu-devel] Re: [PATCH] For AIO return -ENOSPC on short write
On Tue, Feb 22, 2011 at 05:59:01PM +0100, Paolo Bonzini wrote: On 02/22/2011 04:16 PM, Stefan Hajnoczi wrote: Yes it is. It doesn't explain it though. The code involved here is linux-aio.c and will be qcow2's bs-file. That ought to be a host_device and AFAIK that is not growable. So I wanted to figure out why we're even getting this far. I expected the request to get rejected in block.c when checking the range against the host_device. Possibly a COW logical volume can give short writes on disk full? It shouldn't. If it does that needs fixing.
Re: [Qemu-devel] [PATCH -V2 4/6] hw/9pfs: Implement syncfs
On Tue, Mar 1, 2011 at 6:02 PM, Aneesh Kumar K. V aneesh.ku...@linux.vnet.ibm.com wrote: On Tue, 1 Mar 2011 15:59:19 +, Stefan Hajnoczi stefa...@gmail.com wrote: Please explain the semantics of P9_TSYNCFS. Won't returning success without doing anything lead to data integrity issues? I should actually do the 9P Operation format as commit message. Will add in the next update. Whether returning here would cause a data integrity issue, it depends what sort of guarantee we want to provide. So calling sync on the guest will cause all the dirty pages in the guest to be flushed to host. Now all those changes are in the host page cache and it would be nice to flush them as a part of sync but then since we don't have a per file system sync, the above would imply we flush all dirty pages on the host which can result in large performance impact. You get the define the semantics of P9_TSYNCFS? I thought this is part of a well-defined protocol :). If this is a .L extension then it's probably a bad design and shouldn't be added to the protocol if we can't implement it. It is a part of .L extension and we can definitely implement it. There is patch out there which is yet to be merged http://thread.gmane.org/gmane.linux.file-systems/44628 A future Linux-only ioctl :/. Is this operation supposed to flush the disk write cache too? I am not sure we need to specify that as a part of 9p operation. I guess we can only say maximum possible data integrity. Whether a sync will cause disk write cache flush depends on the file system. For ext* that can be controlled by mount option barrier. So on a host with a safe configuration this operation should put data on stable storage? I think virtio-9p has a file descriptor cache. Would it be possible to fsync() those file descriptors? Ideally we should. But that would involve a large number of fsync calls. Yep, that's why this is a weird operation to support, especially since it's a .L add-on and not original 9P. 
What's the use-case since today's Linux userland cannot directly make use of this operation? I guess it has been added in order to pass-through a Linux internal vfs super block sync function? Stefan
Re: [Qemu-devel] [PATCH 2/2] microblaze: Allow targeting little-endian mb
On Fri, Feb 25, 2011 at 05:15:57PM +0200, Blue Swirl wrote: On Mon, Feb 21, 2011 at 3:44 PM, Edgar E. Iglesias edgar.igles...@petalogix.com wrote: Signed-off-by: Edgar E. Iglesias edgar.igles...@petalogix.com --- configure | 7 +-- default-configs/microblazeel-linux-user.mak | 1 + default-configs/microblazeel-softmmu.mak | 4 3 files changed, 10 insertions(+), 2 deletions(-) create mode 100644 default-configs/microblazeel-linux-user.mak create mode 100644 default-configs/microblazeel-softmmu.mak diff --git a/configure b/configure index 791b71d..3036faf 100755 --- a/configure +++ b/configure @@ -984,6 +984,7 @@ arm-softmmu \ cris-softmmu \ m68k-softmmu \ microblaze-softmmu \ +microblazeel-softmmu \ mips-softmmu \ mipsel-softmmu \ mips64-softmmu \ @@ -1008,6 +1009,7 @@ armeb-linux-user \ cris-linux-user \ m68k-linux-user \ microblaze-linux-user \ +microblazeel-linux-user \ mips-linux-user \ mipsel-linux-user \ ppc-linux-user \ @@ -3005,7 +3007,8 @@ case $target_arch2 in target_long_alignment=2 target_llong_alignment=2 ;; - microblaze) + microblaze|microblazeel) + TARGET_ARCH=microblaze bflt=yes target_nptl=yes target_phys_bits=32 @@ -3231,7 +3234,7 @@ for i in $ARCH $TARGET_BASE_ARCH ; do echo CONFIG_M68K_DIS=y $config_target_mak echo CONFIG_M68K_DIS=y $libdis_config_mak ;; - microblaze) + microblaze*) echo CONFIG_MICROBLAZE_DIS=y $config_target_mak echo CONFIG_MICROBLAZE_DIS=y $libdis_config_mak ;; diff --git a/default-configs/microblazeel-linux-user.mak b/default-configs/microblazeel-linux-user.mak new file mode 100644 index 000..566fdc0 --- /dev/null +++ b/default-configs/microblazeel-linux-user.mak @@ -0,0 +1 @@ +# Default configuration for microblaze-linux-user microblazeel-linux-user? diff --git a/default-configs/microblazeel-softmmu.mak b/default-configs/microblazeel-softmmu.mak new file mode 100644 index 000..4399b8b --- /dev/null +++ b/default-configs/microblazeel-softmmu.mak @@ -0,0 +1,4 @@ +# Default configuration for microblaze-softmmu microblazeel-softmmu? 
Thanks, I've corrected the copy+paste:o.. Cheers
Re: [Qemu-devel] [PATCH 00/17 v3] LatticeMico32 target
On Fri, Feb 25, 2011 at 12:03:37AM +0100, Michael Walle wrote: On Thursday 17 February 2011, 23:45:01, Michael Walle wrote: This patchset adds support for the LatticeMico32 softcore processor by Lattice Semiconductor. Changes since v2: - lots of CODING_STYLE fixes - reworked pic and juart model, CPUState is not passed anymore - use qdev reset field instead of qemu_register_reset() - add missing include guards - merged lm32_pic_cpu.c into boards file - removed buggy qemu_irq_lower() in reset functions - converted hw_error to error_report() Changes since v1: - removed variables which are no longer in use - replaced some tcg ops with specialized ones - kill VM in case of an unknown opcode - fixed tracepoints format strings to match existing ones Any comments/reviews on this patchset? I've changed the opcode decoding to use a lookup table instead of the for-loop. If you don't mind, I would submit a patch after the above is merged. Or, alternatively, if there is another patchset version, I'll integrate it into that ;) Hi, let's do v3 first. Do you have a public tree to pull from? Blue, did v3 address all the comments you had? Cheers
Re: [Qemu-devel] [Bug 723871] Re: qemu-kvm-0.14.0 Aborts with -vga qxl
Hi all, I had the same crash using a Fedora 14 guest. I described it here: http://www.mail-archive.com/virt@lists.fedoraproject.org/msg00768.html Since then I have built qemu myself, using git tag spice.kvm.v28 from the spice git repository: it seemed to solve the bug. - Original message - On Tuesday 01 March 2011 12:29:14 Serge Hallyn wrote: @Rick, would you expect a fedora guest to reproduce this? Would it have the qxl driver? Or must it be Windows? I don't have a fedora guest to test on, and I don't know the implementation details well enough to postulate. -Rick
Re: [Qemu-devel] Re: [patch 2/3] Add support for live block copy
On 03/01/2011 05:51 PM, Anthony Liguori wrote: On 03/01/2011 04:39 AM, Avi Kivity wrote: On 02/28/2011 08:12 PM, Anthony Liguori wrote: On Feb 28, 2011 11:47 AM, Avi Kivity a...@redhat.com mailto:a...@redhat.com wrote: On 02/28/2011 07:33 PM, Anthony Liguori wrote: You're just ignoring what I've written. No, you're just impervious to my subtle attempt to refocus the discussion on solving a practical problem. There's a lot of good, reasonably straight forward changes we can make that have a high return on investment. Is making qemu the authoritative source of configuration information a straightforward change? Is the return on it high? Is the investment low? I think this is where we fundamentally disagree. My position is that QEMU is already the authoritative source. Having a state file doesn't change anything. Do a hot unplug of a network device with upstream libvirt with acpiphp unloaded, consult libvirt and then consult the monitor to see who has the right view of the guests config. libvirt is right and the monitor is wrong. On real hardware, calling _EJ0 doesn't affect the configuration one little bit (if I understand it correctly). It just turns off power to the slot. If you power-cycle, the card will be there. It's up to the hardware vendor. Since it's ACPI, it can result in any number of operations. Usually, there's some logic to flip on an LED or something. There's nothing that prevents a vendor from ejecting the card. My point is that there aren't cleanly separated lines in the real world. To me, that's the definition of authoritative. No to all three (ignoring for the moment whether it is good or not, which we were debating). The only suggestion I'm making beyond Marcelo's original patch is that we use a structured format and that we make it possible to use the same file to solve this problem in multiple places. No, you're suggesting a lot more than that. That's exactly what I'm suggesting from a technical perspective. 
Unless I'm hallucinating, you're suggesting quite a bit more. A revolution in how qemu is to be managed.

Let me take another route to see if I can't persuade you. First, let's clarify your proposal. You want to introduce a new block format

No. That was Avi's initial proposal, after we talked we realized that it is not needed and we can use plain files w/o any new configuration. Pretty much similar to what you're proposing below, just w/o the configuration files.

that references to block devices. It may also store a dirty bitmap to keep track of which blocks are out of sync. Hopefully, it goes without saying that the dirty bitmap is strictly optional (it's a performance optimization) so let's ignore it. Your format, as a text file, looks like:

[raid1]
primary=diska.img
secondary=diskb.img
active=primary

To use it, here's the sequence:
0) qemu uses disk A for a block device
1) create a raid1 block device pointing to disk A and disk B.
2) management tool asks qemu to use the new raid1 block device.
3) qemu acks (2)
4) at some point, the mirror completes, writes are going to both disks
5) qemu sends out an event indicating that the disks are in sync
6) management tool then sends a command to fail over to disk B
7) qemu acks (6)

7) is not a must when there is no raid.

We're making the management tool the authoritative source of how to launch QEMU. That means that the management tool ultimately determines which command line to relaunch QEMU with.

This is what we have today regardless of live copy. How else would you track many hot plug/unplug operations and live migration afterwards? For enterprise usage, that's the best case. It's also true for a single host w/ libvirt and virt-manager.

Here are the races:

A) If QEMU crashes between (2) and (3), it may have issued a write to the new raid1 block device before the management tool sees (3). If this happens, when the management tool restarts QEMU with disk A, we're left with a dangling raid1 block device. Not a critical failure, but not ideal.

Once there is no raid there is no race.

B) If QEMU crashes between (6) and (7), QEMU may have started writing to disk B before the management tool sees (7). This means that the management tool will create the guest with the raid1 block device which no longer is the correct disk. This could fail in subtly bad ways. Depending on how read is implemented (if you try to do striping for instance), bad data could be returned. You could try to implement a policy of always reading from B if the block has been copied but this gets hairy really quickly. It's definitely not RAID1 anymore.

Exactly! Drop the raid and always read from B post #6. This is what I was suggesting before.

You may observe that the problem is not the RAID1 mechanism, but changing between using a normal device and the RAID1 mechanism. It would then be wise to say, let's always use this image format. Since that eliminates the race, we don't really need the
Re: [Qemu-devel] [V6 PATCH 6/9] virtio-9p: Create support in chroot environment
On 2/28/2011 3:22 AM, M. Mohan Kumar wrote: Add both chroot deamon qemu side interfaces to create regular files in chroot environment Signed-off-by: M. Mohan Kumar mo...@in.ibm.com --- hw/9pfs/virtio-9p-chroot-dm.c | 39 +++ hw/9pfs/virtio-9p-local.c | 21 +++-- 2 files changed, 58 insertions(+), 2 deletions(-) diff --git a/hw/9pfs/virtio-9p-chroot-dm.c b/hw/9pfs/virtio-9p-chroot-dm.c index c1d8c6e..985d42b 100644 --- a/hw/9pfs/virtio-9p-chroot-dm.c +++ b/hw/9pfs/virtio-9p-chroot-dm.c @@ -83,6 +83,42 @@ static void chroot_do_open(V9fsFileObjectRequest *request, FdInfo *fd_info) } } +/* + * Helper routine to create a file and return the file descriptor and + * error status in FdInfo structure. + */ +static void chroot_do_create(V9fsFileObjectRequest *request, FdInfo *fd_info) +{ +uid_t cur_uid; +gid_t cur_gid; + +cur_uid = geteuid(); +cur_gid = getegid(); + +fd_info-fi_fd = -1; + +if (setfsuid(request-data.uid) 0) { +fd_info-fi_fd = -errno; +fd_info-fi_flags = FI_FD_INVALID; +return; +} +if (setfsgid(request-data.gid) 0) { +fd_info-fi_fd = -errno; +fd_info-fi_flags = FI_FD_INVALID; +goto unset_uid; +} + +fd_info-fi_fd = open(request-path.path, request-data.flags, +request-data.mode); +if (fd_info-fi_fd 0) { +fd_info-fi_fd = -errno; +fd_info-fi_flags = FI_FD_INVALID; +} +setfsgid(cur_gid); +unset_uid: +setfsuid(cur_uid); +} + static int chroot_daemonize(int chroot_sock) { sigset_t sigset; @@ -177,6 +213,9 @@ int v9fs_chroot(FsContext *fs_ctx) case T_OPEN: chroot_do_open(request, fd_info); break; +case T_CREATE: +chroot_do_create(request, fd_info); +break; default: fd_info.fi_flags = FI_FD_SOCKERR; break; diff --git a/hw/9pfs/virtio-9p-local.c b/hw/9pfs/virtio-9p-local.c index 0c55d35..3fed16c 100644 --- a/hw/9pfs/virtio-9p-local.c +++ b/hw/9pfs/virtio-9p-local.c @@ -58,6 +58,22 @@ static int passthrough_open(FsContext *fs_ctx, const char *path, int flags) return fd; } +static int passthrough_create(FsContext *fs_ctx, const char *path, int flags, +FsCred *credp) +{ 
+V9fsFileObjectRequest request; +int fd; + +fd = fill_fileobjectrequest(request, path, credp); +if (fd 0) { +return fd; Here fd is -errno; Need to assign it back to errno before returning. You took care of it in passthrough_open() but not here. Please check other places where this logic applies. - JV +} +request.data.flags = flags; +request.data.type = T_CREATE; +fd = v9fs_request(fs_ctx, request); +return fd; +} + static int local_lstat(FsContext *fs_ctx, const char *path, struct stat *stbuf) { int err; @@ -382,8 +398,7 @@ static int local_open2(FsContext *fs_ctx, const char *path, int flags, serrno = errno; goto err_end; } -} else if ((fs_ctx-fs_sm == SM_PASSTHROUGH) || - (fs_ctx-fs_sm == SM_NONE)) { +} else if (fs_ctx-fs_sm == SM_NONE) { fd = open(rpath(fs_ctx, path), flags, credp-fc_mode); if (fd == -1) { return fd; @@ -393,6 +408,8 @@ static int local_open2(FsContext *fs_ctx, const char *path, int flags, serrno = errno; goto err_end; } +} else if (fs_ctx-fs_sm == SM_PASSTHROUGH) { +fd = passthrough_create(fs_ctx, path, flags, credp); } return fd;
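The review comment above concerns two different error conventions: helpers that return -errno directly, and callers that expect -1 with errno set. A minimal sketch of the conversion the reviewer asks for might look like this; `sketch_to_errno_convention` is a hypothetical name and does not appear in the patch.

```c
#include <errno.h>

/*
 * Hypothetical helper (not in the patch): convert a "-errno on failure"
 * return value into the classic "-1 with errno set" convention.
 * Non-negative values (for example a valid fd) are passed through.
 */
static int sketch_to_errno_convention(int ret)
{
    if (ret < 0) {
        errno = -ret;   /* restore the error code for the caller */
        return -1;
    }
    return ret;
}
```

Applying such a conversion at a single boundary keeps the two conventions from mixing inside one call chain, which is the kind of inconsistency the reviewer points out between passthrough_open() and passthrough_create().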
Re: [Qemu-devel] [V6 PATCH 7/9] virtio-9p: Support for creating special files
On 2/28/2011 3:22 AM, M. Mohan Kumar wrote: Add both chroot deamon and qemu side interfaces to create special files (directory, device nodes, links and symbolic links) Signed-off-by: M. Mohan Kumar mo...@in.ibm.com --- hw/9pfs/virtio-9p-chroot-dm.c | 57 hw/9pfs/virtio-9p-chroot-qemu.c | 19 hw/9pfs/virtio-9p-chroot.h |1 + hw/9pfs/virtio-9p-local.c | 93 ++- 4 files changed, 139 insertions(+), 31 deletions(-) diff --git a/hw/9pfs/virtio-9p-chroot-dm.c b/hw/9pfs/virtio-9p-chroot-dm.c index 985d42b..0ead017 100644 --- a/hw/9pfs/virtio-9p-chroot-dm.c +++ b/hw/9pfs/virtio-9p-chroot-dm.c @@ -119,6 +119,57 @@ unset_uid: setfsuid(cur_uid); } +/* + * Create directory, symbolic link, link, device node and regular files + * Similar to create, but it does not return the fd of created object + * Returns 0 as file descriptor on success and -errno on failure in FdInfo + * structure + */ +static void chroot_do_create_special(V9fsFileObjectRequest *request, +FdInfo *fd_info) +{ +int cur_uid, cur_gid; + +cur_uid = geteuid(); +cur_gid = getegid(); + +fd_info-fi_fd = -1; +/* fd is not valid for create operations */ +fd_info-fi_flags = FI_FD_INVALID; + +if (setfsuid(request-data.uid) 0) { +fd_info-fi_fd = -errno; +return; +} +if (setfsgid(request-data.gid) 0) { +fd_info-fi_fd = -errno; +goto unset_uid; +} + +switch (request-data.type) { +case T_MKDIR: +fd_info-fi_fd = mkdir(request-path.path, request-data.mode); +break; +case T_SYMLINK: +fd_info-fi_fd = symlink(request-path.old_path, request-path.path); +break; +case T_LINK: +fd_info-fi_fd = link(request-path.old_path, request-path.path); +break; +default: +fd_info-fi_fd = mknod(request-path.path, request-data.mode, +request-data.dev); +break; +} + +if (fd_info-fi_fd 0) { +fd_info-fi_fd = -errno; +} +setfsgid(cur_gid); +unset_uid: +setfsuid(cur_uid); +} + static int chroot_daemonize(int chroot_sock) { sigset_t sigset; @@ -216,6 +267,12 @@ int v9fs_chroot(FsContext *fs_ctx) case T_CREATE: chroot_do_create(request, fd_info); break; +case 
T_MKDIR: +case T_SYMLINK: +case T_LINK: +case T_MKNOD: +chroot_do_create_special(request, fd_info); +break; default: fd_info.fi_flags = FI_FD_SOCKERR; break; diff --git a/hw/9pfs/virtio-9p-chroot-qemu.c b/hw/9pfs/virtio-9p-chroot-qemu.c index 41f9db2..1a42dc2 100644 --- a/hw/9pfs/virtio-9p-chroot-qemu.c +++ b/hw/9pfs/virtio-9p-chroot-qemu.c @@ -103,3 +103,22 @@ unlock: qemu_mutex_unlock(fs_ctx-chroot_mutex); return fd; } + +/* Return 0 on success or -errno on error */ +int v9fs_create_special(FsContext *fs_ctx, V9fsFileObjectRequest *request) +{ +int fd, sock_error; Since this is not fd; may be you can use some other variable like err or something? +qemu_mutex_lock(fs_ctx-chroot_mutex); +if (fs_ctx-chroot_ioerror) { +fd = -EIO; +goto unlock; +} +v9fs_write_request(fs_ctx-chroot_socket, request); +fd = v9fs_receivefd(fs_ctx-chroot_socket, sock_error); +if (fd 0 sock_error) { +fs_ctx-chroot_ioerror = 1; +} Format?? - JV +unlock: +qemu_mutex_unlock(fs_ctx-chroot_mutex); +return fd; +} diff --git a/hw/9pfs/virtio-9p-chroot.h b/hw/9pfs/virtio-9p-chroot.h index 4592807..f113ff1 100644 --- a/hw/9pfs/virtio-9p-chroot.h +++ b/hw/9pfs/virtio-9p-chroot.h @@ -54,5 +54,6 @@ typedef struct V9fsFileObjectRequest int v9fs_chroot(FsContext *fs_ctx); int v9fs_request(FsContext *fs_ctx, V9fsFileObjectRequest *or); +int v9fs_create_special(FsContext *fs_ctx, V9fsFileObjectRequest *request); #endif /* _QEMU_VIRTIO_9P_CHROOT_H */ diff --git a/hw/9pfs/virtio-9p-local.c b/hw/9pfs/virtio-9p-local.c index 3fed16c..c92c5dd 100644 --- a/hw/9pfs/virtio-9p-local.c +++ b/hw/9pfs/virtio-9p-local.c @@ -74,6 +74,28 @@ static int passthrough_create(FsContext *fs_ctx, const char *path, int flags, return fd; } +static int passthrough_create_special(FsContext *fs_ctx, const char *oldpath, +const char *path, FsCred *credp, int type) +{ +V9fsFileObjectRequest request; +int retval; + +retval = fill_fileobjectrequest(request, path, credp); +if (retval 0) { +return retval; +} +request.data.type = type; +if 
(oldpath) { +request.data.oldpath_len = strlen(oldpath); +if (strlen(oldpath) PATH_MAX) { +return -ENAMETOOLONG; +} +strcpy(request.path.old_path, oldpath); +}
[Qemu-devel] [Bug 723871] Re: qemu-kvm-0.14.0 Aborts with -vga qxl
Thanks - I was able to reproduce the lockup with a RHEL boot cd, and confirm that the proposed fix works. https://bugs.launchpad.net/bugs/723871 Title: qemu-kvm-0.14.0 Aborts with -vga qxl Status in QEMU: Confirmed Status in “qemu-kvm” package in Ubuntu: In Progress Bug description: Host CPU is Core i7 Q820. KVM is from 2.6.35-gentoo-r5 kernel (x86_64). Host has spice-0.7.2 and spice-protocol-0.7.0. Guest is Windows XP SP3 with qxl driver 0.6.1, virtio-serial 1.1.6 and vdagent 0.6.3. qemu-kvm is started like so: qemu-system-x86_64 -cpu host -enable-kvm -pidfile /home/rick/qemu/hds/wxp.pid -drive file=/home/rick/qemu/hds/wxp.raw,if=virtio,media=disk,aio=native,snapshot=on -m 768 -name WinXP -net nic,model=virtio -net user -localtime -usb -vga qxl -device virtio-serial -chardev spicevmc,name=vdagent,id=vdagent -device virtserialport,chardev=vdagent,name=com.redhat.spice.0 -spice port=1234,disable-ticketing -monitor stdio and crashes with: qemu-system-x86_64: /home/rick/qemu/src/qemu-kvm-0.14.0/qemu-kvm.c:1724: kvm_mutex_unlock: Assertion `!cpu_single_env' failed. Aborted If I use -no-kvm, it works fine. If I use -vga std, it works fine. -enable-kvm and -vga qxl crashes.
[Qemu-devel] [PATCH RESEND v2 1/2] fix vnc regression
This patch fixes the following two regressions:
1. We should use bitmap_set() and bitmap_clear() to replace vnc_set_bits().
2. The unit of the third parameter of bitmap_intersects() is bits, not words, but we pass the number of words to bitmap_intersects().

Changes from v1 to v2:
1. Fix the third argument of bitmap_clear()

Signed-off-by: Wen Congyang we...@cn.fujitsu.com
---
 ui/vnc.c |   11 ++++++++---
 1 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/ui/vnc.c b/ui/vnc.c
index 610f884..e3761b0 100644
--- a/ui/vnc.c
+++ b/ui/vnc.c
@@ -1645,17 +1645,21 @@ static void framebuffer_update_request(VncState *vs, int incremental,
                                        int x_position, int y_position,
                                        int w, int h)
 {
+    int i;
+    const size_t width = ds_get_width(vs->ds) / 16;
+
     if (y_position > ds_get_height(vs->ds))
         y_position = ds_get_height(vs->ds);
     if (y_position + h >= ds_get_height(vs->ds))
         h = ds_get_height(vs->ds) - y_position;
 
-    int i;
     vs->need_update = 1;
     if (!incremental) {
         vs->force_update = 1;
         for (i = 0; i < h; i++) {
-            bitmap_set(vs->dirty[y_position + i], x_position / 16, w / 16);
+            bitmap_set(vs->dirty[y_position + i], 0, width);
+            bitmap_clear(vs->dirty[y_position + i], width,
+                         VNC_DIRTY_WORDS * BITS_PER_LONG - width);
         }
     }
 }
@@ -2406,7 +2410,8 @@ static int vnc_refresh_server_surface(VncDisplay *vd)
     guest_row  = vd->guest.ds->data;
     server_row = vd->server->data;
     for (y = 0; y < vd->guest.ds->height; y++) {
-        if (bitmap_intersects(vd->guest.dirty[y], width_mask, VNC_DIRTY_WORDS)) {
+        if (bitmap_intersects(vd->guest.dirty[y], width_mask,
+                              VNC_DIRTY_WORDS * BITS_PER_LONG)) {
             int x;
             uint8_t *guest_ptr;
             uint8_t *server_ptr;
-- 
1.7.1
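The bits-versus-words mix-up in the bitmap_intersects() call can be reproduced with a small stand-alone sketch. `sketch_bitmap_intersects` below is a simplified stand-in written for illustration, assuming a length argument counted in bits; it is not QEMU's implementation.

```c
#include <limits.h>
#include <stddef.h>

#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Simplified stand-in (not QEMU's implementation): return 1 if any of the
 * first nbits bits is set in both a and b, otherwise 0.
 */
static int sketch_bitmap_intersects(const unsigned long *a,
                                    const unsigned long *b, size_t nbits)
{
    size_t full_words = nbits / SKETCH_BITS_PER_LONG;
    size_t tail_bits = nbits % SKETCH_BITS_PER_LONG;
    size_t i;

    /* Compare every word that is covered completely. */
    for (i = 0; i < full_words; i++) {
        if (a[i] & b[i]) {
            return 1;
        }
    }
    /* Compare only the low tail_bits bits of the last, partial word. */
    if (tail_bits) {
        unsigned long tail_mask = (1UL << tail_bits) - 1;
        if (a[full_words] & b[full_words] & tail_mask) {
            return 1;
        }
    }
    return 0;
}
```

With two-word bitmaps that overlap only in the second word, passing the word count (2) as the length inspects just the lowest two bits and misses the overlap, while passing 2 * SKETCH_BITS_PER_LONG finds it.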
[Qemu-devel] [PATCH RESEND 2/2] vnc: Fix heap corruption
This bug was reported by Stefan Weil:

Commit bc2429b9174ac2d3c56b7fd35884b0d89ec7fb02 introduced a severe bug (heap corruption). bitmap_clear was called with a wrong argument which caused out-of-bound writes to width_mask. This bug was detected with QEMU running on windows. It also occurs with wine:

*** stack smashing detected ***: terminated
wine: Unhandled illegal instruction at address 0x6115c7 (thread 0009), starting debugger...

The bug is not windows specific!

The third argument of bitmap_clear() is the number of bits to be cleared, but we pass the end position of the range to be cleared to bitmap_clear().

Signed-off-by: Wen Congyang we...@cn.fujitsu.com
Reported-by: Stefan Weil w...@mail.berlios.de
---
 ui/vnc.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/ui/vnc.c b/ui/vnc.c
index e3761b0..e7d0b5b 100644
--- a/ui/vnc.c
+++ b/ui/vnc.c
@@ -2390,6 +2390,7 @@ static int vnc_refresh_server_surface(VncDisplay *vd)
     unsigned long width_mask[VNC_DIRTY_WORDS];
     VncState *vs;
     int has_dirty = 0;
+    const size_t width = ds_get_width(vd->ds) / 16;
 
     struct timeval tv = { 0, 0 };
 
@@ -2403,9 +2404,8 @@ static int vnc_refresh_server_surface(VncDisplay *vd)
      * Check and copy modified bits from guest to server surface.
      * Update server dirty map.
      */
-    bitmap_set(width_mask, 0, (ds_get_width(vd->ds) / 16));
-    bitmap_clear(width_mask, (ds_get_width(vd->ds) / 16),
-                 VNC_DIRTY_WORDS * BITS_PER_LONG);
+    bitmap_set(width_mask, 0, width);
+    bitmap_clear(width_mask, width, VNC_DIRTY_WORDS * BITS_PER_LONG - width);
     cmp_bytes = 16 * ds_get_bytes_per_pixel(vd->ds);
     guest_row  = vd->guest.ds->data;
     server_row = vd->server->data;
-- 
1.7.1
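The count-versus-end-position distinction behind this fix can be shown with a minimal stand-alone sketch. The helpers below are simplified stand-ins with hypothetical names, assuming the (start, number of bits) argument convention the commit message describes; DEMO_DIRTY_WORDS is an arbitrary size chosen for illustration.

```c
#include <limits.h>
#include <stddef.h>

#define DEMO_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define DEMO_DIRTY_WORDS 2   /* hypothetical size, for illustration only */

/* Set nbits bits, starting at bit index start. */
static void demo_bitmap_set(unsigned long *map, size_t start, size_t nbits)
{
    size_t i;

    for (i = start; i < start + nbits; i++) {
        map[i / DEMO_BITS_PER_LONG] |= 1UL << (i % DEMO_BITS_PER_LONG);
    }
}

/* Clear nbits bits, starting at bit index start (a count, not an end). */
static void demo_bitmap_clear(unsigned long *map, size_t start, size_t nbits)
{
    size_t i;

    for (i = start; i < start + nbits; i++) {
        map[i / DEMO_BITS_PER_LONG] &= ~(1UL << (i % DEMO_BITS_PER_LONG));
    }
}

/* Return 1 if the given bit is set, otherwise 0. */
static int demo_test_bit(const unsigned long *map, size_t bit)
{
    return (int)((map[bit / DEMO_BITS_PER_LONG] >>
                  (bit % DEMO_BITS_PER_LONG)) & 1UL);
}

/* Build a mask whose first width bits are set and all others cleared. */
static void demo_build_width_mask(unsigned long *mask, size_t width)
{
    size_t total = DEMO_DIRTY_WORDS * DEMO_BITS_PER_LONG;

    demo_bitmap_set(mask, 0, width);
    /* Count is total - width; passing total would run past the array. */
    demo_bitmap_clear(mask, width, total - width);
}
```

Passing the total bit count as the third argument would make the clear loop cover `width + total` bits and write beyond the end of the array, which matches the out-of-bounds writes to width_mask described in the report.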
Re: [Qemu-devel] [PATCH -V2 4/6] hw/9pfs: Implement syncfs
On Tue, 1 Mar 2011 20:27:19 +, Stefan Hajnoczi stefa...@gmail.com wrote:
On Tue, Mar 1, 2011 at 6:02 PM, Aneesh Kumar K. V aneesh.ku...@linux.vnet.ibm.com wrote:
On Tue, 1 Mar 2011 15:59:19 +, Stefan Hajnoczi stefa...@gmail.com wrote:

Please explain the semantics of P9_TSYNCFS. Won't returning success without doing anything lead to data integrity issues?

I should actually do the 9P Operation format as commit message. Will add in the next update. Whether returning here would cause a data integrity issue depends on what sort of guarantee we want to provide. Calling sync on the guest will cause all the dirty pages in the guest to be flushed to the host. Now all those changes are in the host page cache and it would be nice to flush them as a part of sync, but since we don't have a per file system sync, that would imply we flush all dirty pages on the host, which can result in a large performance impact.

You get to define the semantics of P9_TSYNCFS? I thought this is part of a well-defined protocol :). If this is a .L extension then it's probably a bad design and shouldn't be added to the protocol if we can't implement it.

It is a part of the .L extension and we can definitely implement it. There is a patch out there which is yet to be merged: http://thread.gmane.org/gmane.linux.file-systems/44628

A future Linux-only ioctl :/.

Is this operation supposed to flush the disk write cache too?

I am not sure we need to specify that as a part of the 9p operation. I guess we can only say maximum possible data integrity. Whether a sync will cause a disk write cache flush depends on the file system. For ext* that can be controlled by the mount option barrier.

So on a host with a safe configuration this operation should put data on stable storage?

I think virtio-9p has a file descriptor cache. Would it be possible to fsync() those file descriptors?

Ideally we should. But that would involve a large number of fsync calls.
Yep, that's why this is a weird operation to support, especially since it's a .L add-on and not original 9P. What's the use-case since today's Linux userland cannot directly make use of this operation? I guess it has been added in order to pass-through a Linux internal vfs super block sync function?

IMHO it would be nice to have a syncfs 9p operation because that enables the client to say: if possible, flush the dirty data on the server. I guess we should consider this as something the server can choose to ignore. In a cloud setup even doing a per file system sync can imply a performance impact because a VirtFS export may not map 1:1 to a mount point on the host.

There is also a plan to add a new option to the VirtFS export point which enables writes to exported files to be either O_SYNC or O_DIRECT, similar to the way it is done for image files. That would imply Tsyncfs doesn't have much to do because we don't have dirty data in the host pagecache anymore.

So from the 9p .L protocol point of view, it is a valid operation which enables the client to request a flush of the server cache if possible. And the qemu 9p server chooses to ignore it because of the performance impact. If you are not comfortable with not doing anything specific in the Tsyncfs operation, we can add a sync(2) call as part of this 9p operation and later switch to FS_IOC_SYNCFS when it becomes available.

-aneesh
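For reference, the fsync()-per-descriptor alternative raised earlier in this thread could look roughly like the sketch below. It is an illustration only: the function name and the plain int array standing in for the server's descriptor cache are assumptions made for this note and do not reflect the virtio-9p data structures.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: fsync() every descriptor in fds[], skipping
 * negative (unused) slots. All descriptors are attempted; the return
 * value is 0 on success or the negated errno of the first failure.
 */
static int sketch_flush_fd_cache(const int *fds, size_t nfds)
{
    int ret = 0;

    for (size_t i = 0; i < nfds; i++) {
        if (fds[i] < 0)
            continue;              /* empty cache slot */
        if (fsync(fds[i]) < 0 && ret == 0)
            ret = -errno;          /* remember the first error only */
    }
    return ret;
}
```

The cost concern from the thread is visible here: the loop issues one fsync() per open file, so a large cache means a large number of system calls, whereas sync(2) is a single call that flushes every dirty page on the host.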
Re: [Qemu-devel] [RFC][PATCH 0/2] Allow cache settings for block devices to be changed at runtime.
On 01.03.2011 20:13, Anthony Liguori wrote:
On 03/01/2011 07:50 AM, Christoph Hellwig wrote:
On Tue, Mar 01, 2011 at 12:48:34PM +, Stefan Hajnoczi wrote:
On Tue, Mar 01, 2011 at 01:42:54PM +0100, Christoph Hellwig wrote:

I have patches to do that, and to allow changing O_DIRECT via a monitor command, but to toggle O_SYNC via fcntl I first need to get a kernel patch in, as that flag currently cannot be changed at runtime.

Great, it sounds like you have already implemented the two cases (guest wce and host O_DIRECT) that we're talking about.

At least in theory. And for Linux I can add setting/clearing of O_SYNC via fcntl easily, but what do we do for other hosts?

To start with, we can just fail the command at the QMP level.

O_SYNC isn't toggled from QMP but from the guest.

Kevin
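As background for the fcntl discussion, this is roughly what toggling a file status flag on an open descriptor looks like. The helper name is made up for this note and the code is a sketch, not the patches mentioned above.

```c
#include <assert.h>
#include <fcntl.h>

/*
 * Hypothetical helper: turn one file status flag on or off for an
 * already open descriptor. Returns 0 on success, -1 with errno set.
 */
static int sketch_toggle_status_flag(int fd, int flag, int enable)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags < 0)
        return -1;
    flags = enable ? (flags | flag) : (flags & ~flag);
    return fcntl(fd, F_SETFL, flags);
}
```

On Linux, F_SETFL honours only a small set of flags (O_APPEND, O_ASYNC, O_DIRECT, O_NOATIME and O_NONBLOCK). A request to change O_SYNC is silently ignored, which is why a kernel change is needed before O_SYNC can be handled this way. Setting O_DIRECT can also fail with EINVAL on file systems that do not support it, so a caller would need to check the return value.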
[Qemu-devel] [PATCH v3 08/17] Synchronize VCPU states before reset
This is required to support keeping VCPU states across a system reset. If
we do not read the current state before the reset,
cpu_synchronize_all_post_reset may write back incorrect state
information.

The first user of this will be MCE MSR synchronization which currently
works around the missing cpu_synchronize_all_states.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 vl.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/vl.c b/vl.c
index b436952..7751843 100644
--- a/vl.c
+++ b/vl.c
@@ -1452,6 +1452,7 @@ static void main_loop(void)
         }
         if (qemu_reset_requested()) {
             pause_all_vcpus();
+            cpu_synchronize_all_states();
             qemu_system_reset();
             resume_all_vcpus();
         }
-- 
1.7.1
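The ordering problem can be illustrated with a toy model. Everything below is hypothetical and deliberately simplified: a single register, a userspace cache and three stand-in functions. It is not QEMU code and only assumes the behaviour described in the commit message.

```c
#include <assert.h>

/* Toy model: one register that is meant to survive a reset. */
struct toy_vcpu {
    long kernel_reg;   /* authoritative value while the guest runs */
    long cached_reg;   /* userspace copy, possibly stale */
    int  cache_valid;  /* nonzero once cached_reg mirrors kernel_reg */
};

/* Stand-in for cpu_synchronize_all_states(): pull state into the cache. */
static void toy_synchronize_state(struct toy_vcpu *cpu)
{
    if (!cpu->cache_valid) {
        cpu->cached_reg = cpu->kernel_reg;
        cpu->cache_valid = 1;
    }
}

/* Stand-in for the reset handlers: this register is deliberately kept. */
static void toy_system_reset(struct toy_vcpu *cpu)
{
    (void)cpu; /* registers that are reset would be rewritten here */
}

/* Stand-in for cpu_synchronize_all_post_reset(): push the cache back. */
static void toy_post_reset(struct toy_vcpu *cpu)
{
    cpu->kernel_reg = cpu->cached_reg;
    cpu->cache_valid = 0;
}

/* Run one reset cycle and return the value the guest sees afterwards. */
static long toy_reset_cycle(struct toy_vcpu *cpu, int sync_first)
{
    if (sync_first)
        toy_synchronize_state(cpu);
    toy_system_reset(cpu);
    toy_post_reset(cpu);
    return cpu->kernel_reg;
}
```

In this model a register holding 42 keeps its value when the state is synchronized first, and is overwritten with the stale cached value when it is not.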
[Qemu-devel] [PATCH v3 01/17] kvm: ppc: Fix breakage of kvm_arch_pre_run/process_irqchip_events
Commit 7a39fe5882 failed to convert the right arch function.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 target-ppc/kvm.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
index bd4012a..3924f4b 100644
--- a/target-ppc/kvm.c
+++ b/target-ppc/kvm.c
@@ -222,7 +222,7 @@ int kvmppc_set_interrupt(CPUState *env, int irq, int level)
 #define PPC_INPUT_INT PPC6xx_INPUT_INT
 #endif
 
-int kvm_arch_pre_run(CPUState *env, struct kvm_run *run)
+void kvm_arch_pre_run(CPUState *env, struct kvm_run *run)
 {
     int r;
     unsigned irq;
@@ -253,15 +253,15 @@ int kvm_arch_pre_run(CPUState *env, struct kvm_run *run)
     /* We don't know if there are more interrupts pending after this. However,
      * the guest will return to userspace in the course of handling this one
      * anyways, so we will get a chance to deliver the rest. */
-    return 0;
 }
 
 void kvm_arch_post_run(CPUState *env, struct kvm_run *run)
 {
 }
 
-void kvm_arch_process_irqchip_events(CPUState *env)
+int kvm_arch_process_irqchip_events(CPUState *env)
 {
+    return 0;
 }
 
 static int kvmppc_handle_halt(CPUState *env)
-- 
1.7.1