Re: [Qemu-devel] [PATCH 1/5] introduce powerdown_notifiers
On Wed, 29 Aug 2012 19:06:45 +0200
Andreas Färber afaer...@suse.de wrote:

> On 29.08.2012 19:02, Igor Mammedov wrote:
>> The notifier will be used for signaling a powerdown request to the guest
>> in a more general way. It is intended to replace the very specific
>> qemu_irq_raise(qemu_system_powerdown) and will allow removing the global
>> variable qemu_system_powerdown.
>>
>> v2:
>>   do not make qemu_system_powerdown static,
>>   spotted-by: Paolo Bonzini pbonz...@redhat.com
>
> Is the "not" a typo or did you send the wrong patch? It seems to be
> static below.
>
> Andreas
>
>> Signed-off-by: Igor Mammedov imamm...@redhat.com
>> ---
>>  sysemu.h |  1 +
>>  vl.c     | 15 ++-
>>  2 files changed, 15 insertions(+), 1 deletion(-)
>>
>> diff --git a/sysemu.h b/sysemu.h
>> index f765821..eb9a750 100644
>> --- a/sysemu.h
>> +++ b/sysemu.h
>> @@ -53,6 +53,7 @@ void qemu_system_wakeup_enable(WakeupReason reason, bool enabled);
>>  void qemu_register_wakeup_notifier(Notifier *notifier);
>>  void qemu_system_shutdown_request(void);
>>  void qemu_system_powerdown_request(void);
>> +void qemu_register_powerdown_notifier(Notifier *notifier);
>>  void qemu_system_debug_request(void);
>>  void qemu_system_vmstop_request(RunState reason);
>>  int qemu_shutdown_requested_get(void);
>> diff --git a/vl.c b/vl.c
>> index 7c577fa..8dc4b4f 100644
>> --- a/vl.c
>> +++ b/vl.c
>> @@ -1355,6 +1355,8 @@ static int powerdown_requested;
>>  static int debug_requested;
>>  static int suspend_requested;
>>  static int wakeup_requested;
>> +static NotifierList powerdown_notifiers =
>> +    NOTIFIER_LIST_INITIALIZER(powerdown_notifiers);
>>  static NotifierList suspend_notifiers =
>>      NOTIFIER_LIST_INITIALIZER(suspend_notifiers);
>>  static NotifierList wakeup_notifiers =
>> @@ -1563,12 +1565,23 @@ void qemu_system_shutdown_request(void)
>>      qemu_notify_event();
>>  }
>>
>> +static void qemu_system_powerdown(void)

This is a bad name: it conflicts with the global variable
qemu_system_powerdown, so bisectability of the series is still broken.
Perhaps qemu_do_system_powerdown() would be a better function name,
although it doesn't match the common naming pattern used for this kind
of function.

>> +{
>> +    monitor_protocol_event(QEVENT_POWERDOWN, NULL);
>> +    notifier_list_notify(&powerdown_notifiers, NULL);
>> +}
>> +
>>  void qemu_system_powerdown_request(void)
>>  {
>>      powerdown_requested = 1;
>>      qemu_notify_event();
>>  }
>>
>> +void qemu_register_powerdown_notifier(Notifier *notifier)
>> +{
>> +    notifier_list_add(&powerdown_notifiers, notifier);
>> +}
>> +
>>  void qemu_system_debug_request(void)
>>  {
>>      debug_requested = 1;
>> @@ -1619,7 +1632,7 @@ static bool main_loop_should_exit(void)
>>          monitor_protocol_event(QEVENT_WAKEUP, NULL);
>>      }
>>      if (qemu_powerdown_requested()) {
>> -        monitor_protocol_event(QEVENT_POWERDOWN, NULL);
>> +        qemu_system_powerdown();
>>          qemu_irq_raise(qemu_system_powerdown);
>>      }
>>      if (qemu_vmstop_requested(&r)) {
>
> --
> SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
> GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

--
Regards,
Igor
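
For readers unfamiliar with the pattern, the register/notify flow that the patch adds can be sketched in standalone C. This is only an illustration: the demo_ types and functions below are hypothetical stand-ins, not QEMU's actual Notifier/NotifierList implementation, and the DemoDevice subscriber is invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a notifier list; NOT the QEMU API. */
typedef struct DemoNotifier DemoNotifier;
struct DemoNotifier {
    void (*notify)(DemoNotifier *self, void *data);
    DemoNotifier *next;
};

typedef struct {
    DemoNotifier *head;
} DemoNotifierList;

/* Register a subscriber by pushing it onto the front of the list. */
void demo_notifier_list_add(DemoNotifierList *list, DemoNotifier *n)
{
    assert(list != NULL && n != NULL && n->notify != NULL);
    n->next = list->head;
    list->head = n;
}

/* Call every registered subscriber; returns how many were called. */
int demo_notifier_list_notify(DemoNotifierList *list, void *data)
{
    int called = 0;

    assert(list != NULL);
    for (DemoNotifier *n = list->head; n != NULL; n = n->next) {
        n->notify(n, data);
        called++;
    }
    return called;
}

/* Invented subscriber: a device that counts powerdown requests. */
typedef struct {
    DemoNotifier notifier;   /* first member, so the cast below is valid */
    int powerdown_seen;
} DemoDevice;

void demo_powerdown(DemoNotifier *self, void *data)
{
    DemoDevice *dev = (DemoDevice *)self;

    (void)data;
    dev->powerdown_seen++;
}
```

The point of the design is that the code raising the powerdown request no longer needs a global IRQ handle; each interested device registers its own callback.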
Re: [Qemu-devel] [PATCH 10/10] qdev: fix create in place obj's life cycle problem
On 2012-08-30 07:54, liu ping fan wrote:
> On Thu, Aug 30, 2012 at 1:40 AM, Avi Kivity a...@redhat.com wrote:
>> On 08/29/2012 10:30 AM, Jan Kiszka wrote:
>>> On 2012-08-29 19:23, Avi Kivity wrote:
>>>> On 08/28/2012 02:42 AM, Jan Kiszka wrote:
>>>>>
>>>>> Let's not talk about devices or MMIO dispatching. I think the problem
>>>>> is way more generic, and we will face it multiple times in QEMU.
>>>>
>>>> The problem exists outside qemu as well. It is one of the reasons for
>>>> the popularity of garbage collection IMO, and the use of reference
>>>> counting when GC is not available.
>>>>
>>>> This pattern is even documented in
>>>> Documentation/DocBook/kernel-locking.tmpl:
>>>>
>>>> @@ -104,12 +114,11 @@
>>>>  struct object *cache_find(int id)
>>>>  {
>>>>          struct object *obj;
>>>> -        unsigned long flags;
>>>>
>>>> -        spin_lock_irqsave(&cache_lock, flags);
>>>> +        rcu_read_lock();
>>>>          obj = __cache_find(id);
>>>>          if (obj)
>>>>                  object_get(obj);
>>>> -        spin_unlock_irqrestore(&cache_lock, flags);
>>>> +        rcu_read_unlock();
>>>>
>>>> Of course that doesn't mean we should use it, but at least it indicates
>>>> that it is a standard pattern. With MemoryRegion the pattern is
>>>> changed, since MemoryRegion is a thunk, not the object we're really
>>>> dispatching to.
>>>
>>> We are dispatching according to the memory region (parameters, op
>>> handlers, opaques). If we end up in device object is not decided at this
>>> level. A memory region describes a dispatchable area - not to confuse
>>> with a device that may only partially be able to receive such requests.
>>
> But I think the meaning of the memory region is for dispatching. If no
> dispatching associated with mr, why need it exist in the system?

Where did I say that memory regions should no longer be used for
dispatching? The point is to keep the clean layer separation between
memory regions and device objects instead of merging them together.

Given

          Object
          ^    ^
          |    |
    Region 1  Region 2

you protect the object during dispatch, implicitly (and that is bad)
requiring that no region must change in that period. I say what rather
needs protection are the regions so that Region 2 can pass away and
maybe reappear independent of Region 1. And: I won't need to know the
type of that object the regions are referring to in this model. That's
the difference.

> And could you elaborate that who will be the ref holder of mr?

The memory subsystem while running a memory region handler. If that will
be a reference counter or a per-region lock like Avi suggested, we still
need to find out.

Jan

--
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux
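
The idea of the memory subsystem holding a reference on the region (rather than on the device object) while a handler runs can be sketched as follows. This is a single-threaded illustration with hypothetical demo_ names; it uses a plain int where real code would need atomic operations or a lock, and it is not the design that was eventually chosen in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical region type; not QEMU's MemoryRegion. */
typedef struct DemoRegion DemoRegion;
struct DemoRegion {
    int refcount;                    /* plain int: single-threaded sketch only */
    int released;                    /* set when the last reference is dropped */
    void (*handler)(DemoRegion *r);  /* access handler run on dispatch */
};

void demo_region_ref(DemoRegion *r)
{
    assert(r != NULL && r->refcount > 0);
    r->refcount++;
}

void demo_region_unref(DemoRegion *r)
{
    assert(r != NULL && r->refcount > 0);
    if (--r->refcount == 0) {
        r->released = 1;             /* real code would free the region here */
    }
}

/* The dispatcher pins the region for the duration of the handler call. */
void demo_dispatch(DemoRegion *r)
{
    demo_region_ref(r);
    r->handler(r);
    demo_region_unref(r);
}

/* A handler that drops the owner's reference, as if the region were
 * removed while its own access was still being dispatched. */
void demo_handler_unmaps_self(DemoRegion *r)
{
    demo_region_unref(r);
    assert(!r->released);            /* still pinned by the dispatcher */
}
```

Note that nothing here needs to know what kind of object sits behind the region, which is the property Jan argues for.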
Re: [Qemu-devel] [PATCH 1/5] introduce powerdown_notifiers
On 30/08/2012 08:49, Igor Mammedov wrote:
>> +static void qemu_system_powerdown(void)
> this is a bad naming that conflicts with global var
> qemu_system_powerdown, so bisectability of series is still broken.
> perhaps qemu_do_system_powerdown() would be better for function name,
> although it doesn't match common name pattern used for this kind of
> functions.

Just inline the function in this patch, and uninline it later.

Paolo
Re: [Qemu-devel] [PATCH v3 2/5] [RFC] libqblock, user example
On 30/08/2012 03:59, Wenchao Xia wrote:
>> Busy waiting is not acceptable, and this is the reason why I had
>> suggested to keep AIO out of the design for now. You need to provide an
>> implementation of AIO for either glib or something else, but this is
>> best done within QEMU first (and only later moved to libqblock).
>
> It is similar to qemu's select type of AIO. The
> while (true == qb_aio_check(broker)) loop is not necessary; it is only
> an example here to ensure write i/o is executed first. Do you mean
> qemu's aio should be improved to another type of AIO API instead of the
> select type? Which kind of AIO API is preferred?

Using GSource to integrate with the QEMU main loop would be an idea.
qemu_aio_wait would remain.

However, this is not relevant to libqblock. My point is that APIs are
hard to get right, and even harder if you try to do too many things in
the first iteration.

Paolo
Re: [Qemu-devel] [PATCH 10/10] qdev: fix create in place obj's life cycle problem
On Thu, Aug 30, 2012 at 3:08 PM, Jan Kiszka jan.kis...@siemens.com wrote:
> On 2012-08-30 07:54, liu ping fan wrote:
>> On Thu, Aug 30, 2012 at 1:40 AM, Avi Kivity a...@redhat.com wrote:
>>> On 08/29/2012 10:30 AM, Jan Kiszka wrote:
>>>> On 2012-08-29 19:23, Avi Kivity wrote:
>>>>> On 08/28/2012 02:42 AM, Jan Kiszka wrote:
>>>>>>
>>>>>> Let's not talk about devices or MMIO dispatching. I think the
>>>>>> problem is way more generic, and we will face it multiple times in
>>>>>> QEMU.
>>>>>
>>>>> The problem exists outside qemu as well. It is one of the reasons for
>>>>> the popularity of garbage collection IMO, and the use of reference
>>>>> counting when GC is not available.
>>>>>
>>>>> This pattern is even documented in
>>>>> Documentation/DocBook/kernel-locking.tmpl:
>>>>>
>>>>> @@ -104,12 +114,11 @@
>>>>>  struct object *cache_find(int id)
>>>>>  {
>>>>>          struct object *obj;
>>>>> -        unsigned long flags;
>>>>>
>>>>> -        spin_lock_irqsave(&cache_lock, flags);
>>>>> +        rcu_read_lock();
>>>>>          obj = __cache_find(id);
>>>>>          if (obj)
>>>>>                  object_get(obj);
>>>>> -        spin_unlock_irqrestore(&cache_lock, flags);
>>>>> +        rcu_read_unlock();
>>>>>
>>>>> Of course that doesn't mean we should use it, but at least it
>>>>> indicates that it is a standard pattern. With MemoryRegion the
>>>>> pattern is changed, since MemoryRegion is a thunk, not the object
>>>>> we're really dispatching to.
>>>>
>>>> We are dispatching according to the memory region (parameters, op
>>>> handlers, opaques). If we end up in device object is not decided at
>>>> this level. A memory region describes a dispatchable area - not to
>>>> confuse with a device that may only partially be able to receive such
>>>> requests.
>>>
>> But I think the meaning of the memory region is for dispatching. If no
>> dispatching associated with mr, why need it exist in the system?
>
> Where did I say that memory regions should no longer be used for
> dispatching? The point is to keep the clean layer separation between
> memory regions and device objects instead of merging them together.
>
> Given
>
>           Object
>           ^    ^
>           |    |
>     Region 1  Region 2
>
> you protect the object during dispatch, implicitly (and that is bad)
> requiring that no region must change in that period. I say what rather
> needs protection are the regions so that Region 2 can pass away and
> maybe reappear independent of Region 1. And: I won't need to know the
> type of that object the regions are referring to in this model. That's
> the difference.

OK, I see, this is a strong reason.

Regards,
pingfan

>> And could you elaborate that who will be the ref holder of mr?
>
> The memory subsystem while running a memory region handler. If that will
> be a reference counter or a per-region lock like Avi suggested, we still
> need to find out.
>
> Jan
>
> --
> Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
> Corporate Competence Center Embedded Linux
Re: [Qemu-devel] IPv6 support for -net user?
On Wed, Aug 29, 2012 at 04:43:18PM +0700, Ivan Shmakov wrote:
> I'm writing an iPXE mini-HOWTO (in Russian), using QEMU and -net user
> in examples (so that they're runnable by unprivileged users.)
>
> However, the QEMU documentation [1] seems to suggest that only IPv4 is
> implemented for -net user, which made me curious on whether the IPv6
> support is planned to be added anytime soon?
>
> Personally, I'm interested mostly in QEMU sending router (prefix)
> advertisements to the "guest", and forwarding TCP and UDP traffic,
> although support for recursive DNS discovery and DHCPv6 may also be
> nice to have.

Jan Kiszka is the -net user maintainer, I have CCed him.

I'm not aware of work to add IPv6 support to slirp. Someone would have
to step up and submit patches :).

You can still do unprivileged IPv6 networking with external DHCPv6, etc
software:

  $ qemu -netdev socket,id=socket0,listen=127.0.0.1:1234 \
         -device virtio-net-pci,netdev=socket0

The socket netdev tunnels traffic over a TCP or UDP socket. For TCP it
prefixes each packet with the big-endian uint32_t length. For UDP no
length header is necessary because packet boundaries are preserved.

You could write your own code or find something that can speak with
QEMU's -netdev socket.

Stefan
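
As a rough illustration of the TCP framing described above (a big-endian uint32_t length followed by the packet bytes), an external peer could build and parse frames along these lines. The function names are hypothetical and the socket I/O is left out; only the framing rule itself comes from the message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Write a 4-byte big-endian length followed by the packet into 'out'.
 * Returns the number of bytes written, or 0 if 'out' is too small.
 * Hypothetical helper, not part of QEMU. */
size_t demo_frame_packet(const uint8_t *pkt, uint32_t len,
                         uint8_t *out, size_t out_cap)
{
    assert(pkt != NULL && out != NULL);
    if (out_cap < 4 || out_cap - 4 < len) {
        return 0;
    }
    out[0] = (uint8_t)(len >> 24);
    out[1] = (uint8_t)(len >> 16);
    out[2] = (uint8_t)(len >> 8);
    out[3] = (uint8_t)len;
    memcpy(out + 4, pkt, len);
    return (size_t)len + 4;
}

/* Decode the big-endian length header of a received frame. */
uint32_t demo_read_length(const uint8_t hdr[4])
{
    assert(hdr != NULL);
    return ((uint32_t)hdr[0] << 24) | ((uint32_t)hdr[1] << 16) |
           ((uint32_t)hdr[2] << 8) | (uint32_t)hdr[3];
}
```

A receiver on a TCP stream would read exactly 4 bytes, decode the length, then keep reading until that many payload bytes have arrived, since TCP does not preserve packet boundaries.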
Re: [Qemu-devel] [PATCH v2] Fix buffer run out in eepro100.
On Thu, Aug 30, 2012 at 07:47:38AM +0800, Bo Yang wrote:
> On 08/29/2012 11:19 PM, Stefan Hajnoczi wrote:
>> On Wed, Aug 29, 2012 at 07:26:11PM +0800, Bo Yang wrote:
>>> This is reported by QA. When installing os with pxe, after the initial
>>> kernel and initrd are loaded, the procedure tries to copy files from
>>> install server to local harddisk, the network becomes stall because of
>>> running out of receive descriptor.
>>>
>>> Signed-off-by: Bo Yang boy...@suse.com
>>> ---
>>>  hw/eepro100.c |    5 -
>>>  1 files changed, 4 insertions(+), 1 deletions(-)
>>
>> Thanks, applied to the net patches tree:
>> https://github.com/stefanha/qemu/commits/net
>>
>> I have reproduced the bug and tested that your patch fixes it.
>>
>> Your patch also has tabs instead of spaces, I fixed that up when
>> merging.
>
> Thanks for verifying and merging the patch. Sorry for my mistake in
> coding style. I'll be careful next time. I use vim, it looks like
> set tabstop=4 is not enough to get it work as expected.

set tabstop=4      " display tabs as 4 spaces
set expandtab      " use spaces instead of tabs
set shiftwidth=4   " indent by 4 spaces
Re: [Qemu-devel] [PATCH v2] Fix buffer run out in eepro100.
On Wed, Aug 29, 2012 at 09:17:43PM +0200, Stefan Weil wrote:
> On 29.08.2012 13:26, Bo Yang wrote:
>> This is reported by QA. When installing os with pxe, after the initial
>> kernel and initrd are loaded, the procedure tries to copy files from
>> install server to local harddisk, the network becomes stall because of
>> running out of receive descriptor.
>>
>> Signed-off-by: Bo Yang boy...@suse.com
>> ---
>>  hw/eepro100.c |    5 -
>>  1 files changed, 4 insertions(+), 1 deletions(-)
>>
>> diff --git a/hw/eepro100.c b/hw/eepro100.c
>> index 50d117e..52a18ad 100644
>> --- a/hw/eepro100.c
>> +++ b/hw/eepro100.c
>> @@ -1036,6 +1036,8 @@ static void eepro100_ru_command(EEPRO100State * s, uint8_t val)
>>          }
>>          set_ru_state(s, ru_ready);
>>          s->ru_offset = e100_read_reg4(s, SCBPointer);
>> +        qemu_flush_queued_packets(&s->nic->nc);
>> +        qemu_notify_event();
>
> What would happen if the above changes were omitted?

In the worst case the guest code would be unable to make progress since
packet reception is disabled.

The QEMU net subsystem needs to be kicked when rx buffers become
available again so that any queued packets can be delivered and we can
restart the event loop.

The event loop needs to be restarted because net clients (like tap) use
qemu_set_fd_handler2() with a read_poll() handler that returns false
when the NIC is unable to receive. Imagine this scenario:

1. NIC runs out of rx buffers.
2. Event loop iteration starts and calls tap's read_poll() handler,
   which sees the NIC cannot receive and therefore does not add the tap
   file descriptor to select(2).
3. NIC gets new rx buffers but does not kick net subsystem/event loop.
4. Event loop still sitting in select(2) without the tap file
   descriptor. Therefore incoming packets are not picked up by QEMU!

In practice the event loop tends to iterate due to timers, etc. But in
the worst case we can go completely starved here.

> Would the network show less performance? How much would the test
> scenario (Linux installation) take longer?

Yes, the lack of kicks causes reduced network performance. This is
especially true with -netdev tap and a guest driver that runs out of rx
buffers. If you're lucky you might not hit this depending on the -netdev
and availability of rx buffers.

> What about the other nic emulations in QEMU? I observe hanging network
> rather often with the ARM versatilepb emulation.

virtio-net has been correct for some time. e1000, xen, usb, and
eepro100 are now fixed in the net tree:
http://github.com/stefanha/qemu/commits/net

Other NICs may or may not be okay. Really all of them need to be
audited.

>>          TRACE(OTHER, logout("val=0x%02x (rx start)\n", val));
>>          break;
>>      case RX_RESUME:
>> @@ -1770,7 +1772,8 @@ static ssize_t nic_receive(NetClientState *nc, const uint8_t * buf, size_t size)
>>      if (rfd_command & COMMAND_EL) {
>>          /* EL bit is set, so this was the last frame. */
>>          logout("receive: Running out of frames\n");
>> -        set_ru_state(s, ru_suspended);
>> +        set_ru_state(s, ru_no_resources);
>> +        eepro100_rnr_interrupt(s);
>
> Adding the interrupt here is correct (I have similar code in
> http://repo.or.cz/w/qemu/ar7.git/blob/HEAD:/hw/eepro100.c
> which is an improved version of hw/eepro100.c).
>
> Setting ru_no_resources looks also good, but I am not sure
> whether removing ru_suspended is ok. Maybe it should be
> ru_no_resources | ru_suspended.

I think the datasheet talks about setting the RU to no resources and the
CU to suspended. So there are two state machines and we only track one
here.

Stefan
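
The four-step scenario above can be mimicked with a small toy model. Everything in it is hypothetical (none of the demo_ names are QEMU functions, and a real event loop is far more involved); the 'kick' flag stands in for the qemu_flush_queued_packets() plus qemu_notify_event() pair added by the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a NIC and its backend fd; not QEMU code. */
typedef struct {
    int rx_buffers;   /* free receive descriptors in the NIC model */
    int queued;       /* packets the net layer is holding back */
    int delivered;    /* packets handed to the guest */
    bool fd_polled;   /* is the backend fd in the current poll set? */
} DemoNic;

/* Start of an event loop iteration: the backend fd is only polled
 * if the NIC reports that it can receive. */
void demo_loop_iteration_start(DemoNic *nic)
{
    assert(nic != NULL);
    nic->fd_polled = nic->rx_buffers > 0;
}

/* The guest posts new rx buffers. With kick == true the queue is
 * flushed and the loop is restarted so the poll set is rebuilt. */
void demo_guest_adds_rx_buffers(DemoNic *nic, int n, bool kick)
{
    assert(nic != NULL && n > 0);
    nic->rx_buffers += n;
    if (!kick) {
        return;
    }
    while (nic->queued > 0 && nic->rx_buffers > 0) {
        nic->queued--;
        nic->rx_buffers--;
        nic->delivered++;
    }
    demo_loop_iteration_start(nic);
}

/* A packet arrives on the backend; returns true if it was picked up. */
bool demo_backend_packet(DemoNic *nic)
{
    assert(nic != NULL);
    if (!nic->fd_polled || nic->rx_buffers == 0) {
        return false;
    }
    nic->rx_buffers--;
    nic->delivered++;
    return true;
}
```

Without the kick, the model stays stuck with buffers available but the fd missing from the poll set, which is the starvation case described in step 4.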
Re: [Qemu-devel] [PATCH 1/5] introduce powerdown_notifiers
On Thu, 30 Aug 2012 09:41:55 +0200
Paolo Bonzini pbonz...@redhat.com wrote:

> On 30/08/2012 08:49, Igor Mammedov wrote:
>>> +static void qemu_system_powerdown(void)
>> this is a bad naming that conflicts with global var
>> qemu_system_powerdown, so bisectability of series is still broken.
>> perhaps qemu_do_system_powerdown() would be better for function name,
>> although it doesn't match common name pattern used for this kind of
>> functions.
>
> Just inline the function in this patch, and uninline it later.

Thanks, fixed in https://github.com/imammedo/qemu/tree/cpu_as_device.WIP

> Paolo

--
Regards,
Igor
Re: [Qemu-devel] Slow inbound traffic on macvtap interfaces
On Thu, Aug 30, 2012 at 09:20:57AM +0100, Richard Davies wrote:
> Chris Webb wrote:
>> I found that on my laptop, the single change of host kernel config
>>
>> -CONFIG_INTEL_IDLE=y
>> +# CONFIG_INTEL_IDLE is not set
>>
>> is sufficient to turn transfers into guests from slow to full wire
>> speed
>
> I am not deep enough in this code to write a patch, but I wonder if
> macvtap_forward in macvtap.c is missing a call to kill_fasync, which I
> understand is used to signal to interested processes when data arrives?

No, only if TUN_FASYNC is set. qemu does not seem to set it.

> Here is the end of macvtap_forward:
>
>         skb_queue_tail(&q->sk.sk_receive_queue, skb);
>         wake_up_interruptible_poll(sk_sleep(&q->sk),
>                                    POLLIN | POLLRDNORM | POLLRDBAND);
>         return NET_RX_SUCCESS;
>
> Compared to this end of tun_net_xmit in tun.c:
>
>         /* Enqueue packet */
>         skb_queue_tail(&tun->socket.sk->sk_receive_queue, skb);
>
>         /* Notify and wake up reader process */
>         if (tun->flags & TUN_FASYNC)
>                 kill_fasync(&tun->fasync, SIGIO, POLL_IN);
>         wake_up_interruptible_poll(&tun->wq.wait,
>                                    POLLIN | POLLRDNORM | POLLRDBAND);
>         return NETDEV_TX_OK;
>
> Richard.
Re: [Qemu-devel] [PATCH v2] Fix buffer run out in eepro100.
On 08/30/2012 04:04 PM, Stefan Hajnoczi wrote:
> On Wed, Aug 29, 2012 at 09:17:43PM +0200, Stefan Weil wrote:
>> On 29.08.2012 13:26, Bo Yang wrote:
>>> This is reported by QA. When installing os with pxe, after the initial
>>> kernel and initrd are loaded, the procedure tries to copy files from
>>> install server to local harddisk, the network becomes stall because of
>>> running out of receive descriptor.
>>>
>>> Signed-off-by: Bo Yang boy...@suse.com
>>> ---
>>>  hw/eepro100.c |    5 -
>>>  1 files changed, 4 insertions(+), 1 deletions(-)
>>>
>>> diff --git a/hw/eepro100.c b/hw/eepro100.c
>>> index 50d117e..52a18ad 100644
>>> --- a/hw/eepro100.c
>>> +++ b/hw/eepro100.c
>>> @@ -1036,6 +1036,8 @@ static void eepro100_ru_command(EEPRO100State * s, uint8_t val)
>>>          }
>>>          set_ru_state(s, ru_ready);
>>>          s->ru_offset = e100_read_reg4(s, SCBPointer);
>>> +        qemu_flush_queued_packets(&s->nic->nc);
>>> +        qemu_notify_event();
>>
>> What would happen if the above changes were omitted?
>
> In the worst case the guest code would be unable to make progress since
> packet reception is disabled.
>
> The QEMU net subsystem needs to be kicked when rx buffers become
> available again so that any queued packets can be delivered and we can
> restart the event loop.
>
> The event loop needs to be restarted because net clients (like tap) use
> qemu_set_fd_handler2() with a read_poll() handler that returns false
> when the NIC is unable to receive. Imagine this scenario:
>
> 1. NIC runs out of rx buffers.
> 2. Event loop iteration starts and calls tap's read_poll() handler,
>    which sees the NIC cannot receive and therefore does not add the tap
>    file descriptor to select(2).
> 3. NIC gets new rx buffers but does not kick net subsystem/event loop.
> 4. Event loop still sitting in select(2) without the tap file
>    descriptor. Therefore incoming packets are not picked up by QEMU!
>
> In practice the event loop tends to iterate due to timers, etc. But in
> the worst case we can go completely starved here.

Yes. The fd will be added to the read set in the next iteration. The
delay depends on the select timeout, so it is possible to starve here.

>> Would the network show less performance? How much would the test
>> scenario (Linux installation) take longer?
>
> Yes, the lack of kicks causes reduced network performance. This is
> especially true with -netdev tap and a guest driver that runs out of rx
> buffers. If you're lucky you might not hit this depending on the -netdev
> and availability of rx buffers.
>
>> What about the other nic emulations in QEMU? I observe hanging network
>> rather often with the ARM versatilepb emulation.
>
> virtio-net has been correct for some time. e1000, xen, usb, and
> eepro100 are now fixed in the net tree:
> http://github.com/stefanha/qemu/commits/net
>
> Other NICs may or may not be okay. Really all of them need to be
> audited.
>
>>>          TRACE(OTHER, logout("val=0x%02x (rx start)\n", val));
>>>          break;
>>>      case RX_RESUME:
>>> @@ -1770,7 +1772,8 @@ static ssize_t nic_receive(NetClientState *nc, const uint8_t * buf, size_t size)
>>>      if (rfd_command & COMMAND_EL) {
>>>          /* EL bit is set, so this was the last frame. */
>>>          logout("receive: Running out of frames\n");
>>> -        set_ru_state(s, ru_suspended);
>>> +        set_ru_state(s, ru_no_resources);
>>> +        eepro100_rnr_interrupt(s);
>>
>> Adding the interrupt here is correct (I have similar code in
>> http://repo.or.cz/w/qemu/ar7.git/blob/HEAD:/hw/eepro100.c
>> which is an improved version of hw/eepro100.c).
>>
>> Setting ru_no_resources looks also good, but I am not sure
>> whether removing ru_suspended is ok. Maybe it should be
>> ru_no_resources | ru_suspended.
>
> I think the datasheet talks about setting the RU to no resources and the
> CU to suspended. So there are two state machines and we only track one
> here.

I don't think I understand this. If we run out of rx descriptors, why do
we suspend the tx unit too? Maybe there are reasons I am unaware of. I
don't know.

> Stefan
Re: [Qemu-devel] QEMU emulation per CPU
Hi,

Can you please explain to me why qemu user mode doesn't get along nicely
with POSIX threads?

Thanks and regards
-Naresh Bhat

On Tue, Aug 28, 2012 at 1:51 PM, Mulyadi Santosa
mulyadi.sant...@gmail.com wrote:
> Hi..
>
> On Tue, Aug 28, 2012 at 3:04 PM, Naresh Bhat nareshgb...@gmail.com wrote:
>> Hi All,
>>
>> I have the following questions related to QEMU
>>
>> a. Does the userland emulation mode of QEMU support running multiple
>> processes on separate processors? (i.e. if we were running ARM7
>> emulation on a x86 machine with 8 CPU cores, can we launch one ARM7
>> binary per CPU?).
>
> yes, qemu user mode is running just like plain normal process. Maybe
> you just need to add cpu affinity here to lock them to certain
> processor...
>
>> b. Same question as (a), but for threads. That is, for a single ARM7
>> multi-threaded process, can we run different threads on different
>> underlying CPUs?
>
> IIRC, qemu user mode doesn't get along nicely with POSIX threads
>
> --
> regards,
>
> Mulyadi Santosa
> Freelance Linux trainer and consultant
>
> blog: the-hydra.blogspot.com
> training: mulyaditraining.blogspot.com

--
For things to change, we must change
-Naresh Bhat
[Qemu-devel] CPU hotplug
Hello list, what is the status of CPU hotplug support? I tried latest 1.2rc1 kvm-qemu with vanilla kernel v3.5.2 but the VM just crashes when sending cpu_set X online through qm monitor. Greets, Stefan
Re: [Qemu-devel] Adding support for Stateless Static NAT for TAP devices
On Thu, Aug 30, 2012 at 09:12:19AM +0300, John Basila wrote:
> When running multiple instances of QEMU from the same image file
> (using -snapshot) and connecting each instance to a dedicated TAP
> device, the Guest OS will most likely not be able to communicate
> with the outside world as all packets leave the Guest OS from the
> same IP and thus the Host OS will have difficulty returning the
> packets to the correct TAP device/Guest OS. Stateless Static
> Network Address Translation or SSNAT allows the QEMU to map the
> network of the Guest OS to the network of the TAP device allowing
> a unique IP address for each Guest OS that ease such case.
> The only mandatory argument to the SSNAT is the Guest OS network
> IP, the rest will be figured out from the underlying TAP device.
>
> Signed-off-by: John Basila jbas...@checkpoint.com
> ---
>  net/tap.c        |  369 +-
>  qapi-schema.json |    5 +-
>  qemu-options.hx  |   10 ++-
>  3 files changed, 381 insertions(+), 3 deletions(-)

This does not work with vhost=on because the host-guest packet
processing happens in vhost_net.ko instead of in QEMU.

Use iptables on the host to NAT the tap interface.

Stefan
Re: [Qemu-devel] QEMU emulation per CPU
Hi...

On Thu, Aug 30, 2012 at 3:58 PM, Naresh Bhat nareshgb...@gmail.com wrote:
> Hi,
>
> Can you please explain me why qemu user mode doesn't get along nicely
> with POSIX threads. ??

There is another thread on this qemu-devel list that explains this. All
I can conclude from that thread is that it has something to do with
timers and address mapping...

--
regards,

Mulyadi Santosa
Freelance Linux trainer and consultant

blog: the-hydra.blogspot.com
training: mulyaditraining.blogspot.com
Re: [Qemu-devel] CPU hotplug
On Thu, 30 Aug 2012 11:06:21 +0200
Stefan Priebe s.pri...@profihost.ag wrote:

> Hello list,
>
> what is the status of CPU hotplug support?
>
> I tried latest 1.2rc1 kvm-qemu with vanilla kernel v3.5.2 but the VM
> just crashes when sending cpu_set X online through qm monitor.

It's work in progress and not implemented upstream yet. You could try
RHEL 6.3 if you'd like to play with the tech-preview feature.

> Greets,
> Stefan

I'll create a CPU hotplug page on the qemu wiki to track status and the
todo list if no one objects to it.

--
Regards,
Igor
Re: [Qemu-devel] IPv6 support for -net user?
On 2012-08-30 09:23, Stefan Hajnoczi wrote:
> On Wed, Aug 29, 2012 at 04:43:18PM +0700, Ivan Shmakov wrote:
>> I'm writing an iPXE mini-HOWTO (in Russian), using QEMU and -net user
>> in examples (so that they're runnable by unprivileged users.)
>>
>> However, the QEMU documentation [1] seems to suggest that only IPv4 is
>> implemented for -net user, which made me curious on whether the IPv6
>> support is planned to be added anytime soon?
>>
>> Personally, I'm interested mostly in QEMU sending router (prefix)
>> advertisements to the "guest", and forwarding TCP and UDP traffic,
>> although support for recursive DNS discovery and DHCPv6 may also be
>> nice to have.
>
> Jan Kiszka is the -net user maintainer, I have CCed him.
>
> I'm not aware of work to add IPv6 support to slirp. Someone would have
> to step up and submit patches :).

Yep, I'm also not aware of plans or even activities in this direction.
Some refactoring will likely be required to make the IPv4-oriented stack
ready for this.

Jan

--
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux
Re: [Qemu-devel] [PATCH] [RFC] PPC: dump DCRs from monitor
On 29/08/2012 19:55, Alexander Graf wrote:
>>> Are they accessible through the monitor's p command? Would be good to
>>> implement there too if not.
>>
>> I don't think so, which syntax would you use anyway? $dcr[n] ?
>
> Sure, why not? Is that possible with the register parsing code? I don't
> know that one too well, but it's probably the best fit for you, right?

Except I don't know the parsing code well enough to not waste time
digging it...

For now the full dump is enough to me, I suppose if someone wants more
he can also send a patch ;-)

We could also add logging to the read/write calls to see the ordering.
For now I just added some printf in my code just like in ppc405*.

François.
Re: [Qemu-devel] [PATCH v10] kvm: notify host when the guest is panicked
On 08/30/2012 04:03 AM, Wen Congyang wrote:
> At 08/29/2012 07:56 PM, Sasha Levin Wrote:
>> On 08/29/2012 07:18 AM, Wen Congyang wrote:
>>> diff --git a/Documentation/virtual/kvm/pv_event.txt b/Documentation/virtual/kvm/pv_event.txt
>>> new file mode 100644
>>> index 000..bb04de0
>>> --- /dev/null
>>> +++ b/Documentation/virtual/kvm/pv_event.txt
>>> @@ -0,0 +1,32 @@
>>> +The KVM paravirtual event interface
>>> +===================================
>>> +
>>> +Initializing the paravirtual event interface
>>> +============================================
>>> +kvm_pv_event_init()
>>> +Arguments:
>>> +	None
>>> +
>>> +Return Value:
>>> +	0: The guest kernel can use paravirtual event interface.
>>> +	1: The guest kernel can't use paravirtual event interface.
>>> +
>>> +Querying whether the event can be ejected
>>> +=========================================
>>> +kvm_pv_has_feature()
>>> +Arguments:
>>> +	feature: The bit value of this paravirtual event to query
>>> +
>>> +Return Value:
>>> +	0 : The guest kernel can't eject this paravirtual event.
>>> +	-1: The guest kernel can eject this paravirtual event.
>>> +
>>> +
>>> +Ejecting paravirtual event
>>> +==========================
>>> +kvm_pv_eject_event()
>>> +Arguments:
>>> +	event: The event to be ejected.
>>> +
>>> +Return Value:
>>> +	None
>>
>> What's the protocol for communicating with the hypervisor? What is it
>> supposed to do on reads/writes to that ioport?
>
> Not only ioport, the other arch can use some other ways. We can use
> these APIs to eject event to hypervisor. The caller does not care how
> to communicate with the hypervisor.

Right, it can work in several ways, but the protocol (whatever it may
be) between the hypervisor and the guest kernel should be documented
here as well.

>>> diff --git a/arch/x86/include/asm/kvm_para.h b/arch/x86/include/asm/kvm_para.h
>>> index 2f7712e..7d297f0 100644
>>> --- a/arch/x86/include/asm/kvm_para.h
>>> +++ b/arch/x86/include/asm/kvm_para.h
>>> @@ -96,8 +96,11 @@ struct kvm_vcpu_pv_apf_data {
>>>  #define KVM_PV_EOI_ENABLED KVM_PV_EOI_MASK
>>>  #define KVM_PV_EOI_DISABLED 0x0
>>>
>>> +#define KVM_PV_EVENT_PORT	(0x505UL)
>>> +
>>>  #ifdef __KERNEL__
>>>  #include <asm/processor.h>
>>> +#include <linux/ioport.h>
>>>
>>>  extern void kvmclock_init(void);
>>>  extern int kvm_register_clock(char *txt);
>>> @@ -228,6 +231,30 @@ static inline void kvm_disable_steal_time(void)
>>>  }
>>>  #endif
>>>
>>> +static inline int kvm_arch_pv_event_init(void)
>>> +{
>>> +	if (!request_region(KVM_PV_EVENT_PORT, 1, "KVM_PV_EVENT"))
>>
>> Only one byte is requested here, but the rest of the code is
>> reading/writing longs?
>>
>> The struct resource * returned from request_region is simply being
>> leaked here?
>>
>> What happens if we go ahead with adding another event (let's say OOM
>> event)? request_region() is going to fail for anything but the first
>> call.
>
> For x86, we use ioport to communicate with hypervisor. We can read a
> 32bit value from the hypervisor. If bit 0 is set, it means the
> hypervisor supports the panicked event. If you want to add another
> event, you can use another unused bit. I think 32 events are enough
> now.
>
> You can write a value to the ioport to eject the event. Only one event
> can be ejected at a time.

I was trying to point out that kvm_pv_event_init() would fail on
anything but the first call, while the API suggests it should be called
to verify we can write events.

>>> +static inline unsigned int kvm_arch_pv_features(void)
>>> +{
>>> +	unsigned int features = inl(KVM_PV_EVENT_PORT);
>>> +
>>> +	/* Reading from an invalid I/O port will return -1 */
>>
>> Just wondering, where is that documented? For lkvm for example the
>> return value from an ioport without a device on the other side is
>> undefined, so it's possible we're doing something wrong there.
>
> Hmm, how to use lkvm? Can you give me an example, so I can test this
> patch on lkvm.

You can grab lkvm from https://github.com/penberg/linux-kvm , it lives
under tools/kvm/ in the kernel tree.

> For qemu, it returns -1. I don't know which is right now. I will
> investigate it.

Thing is, unless x86 arch suggests it should return something specific
in this case, we can't assume a return value either from lkvm or qemu.

Thanks,
Sasha
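
To make the feature-bit scheme under discussion concrete, here is a minimal sketch of how a guest might interpret the 32-bit value read from the port. The names are hypothetical, the raw value is passed in rather than read with inl(), and treating all-ones as "no device" is exactly the assumption Sasha questions above, so it is flagged as such in the code.

```c
#include <assert.h>
#include <stdint.h>

/* Bit 0 is the panicked event per the thread; the macro name is invented. */
#define DEMO_PV_EVENT_PANICKED_BIT 0u

/* Returns 1 if 'bit' is advertised in the raw feature word, else 0.
 * ASSUMPTION: an all-ones read means nothing is behind the port. The
 * thread notes this is what qemu returns but that it may not be
 * guaranteed by other hypervisors such as lkvm. */
int demo_pv_has_feature(uint32_t raw, unsigned int bit)
{
    assert(bit < 32);
    if (raw == UINT32_MAX) {
        return 0;
    }
    return (int)((raw >> bit) & 1u);
}
```

One consequence of this encoding is that a hypervisor can never legitimately advertise all 32 events at once, since that value is indistinguishable from the assumed "no device" read.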
[Qemu-devel] [Bug 1042654] Re: Floppy disks and network not working on NT 3.1 on Qemu 1.2 rc1
It does not happen on NT 3.5, 3.51 or 4.0, only on 3.1.

--
You received this bug notification because you are a member of
qemu-devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1042654

Title:
  Floppy disks and network not working on NT 3.1 on Qemu 1.2 rc1

Status in QEMU:
  New

Bug description:
  When I try to put Floppy IMG/IMA/VFD images on NT 3.1 when it is
  running on Qemu 1.2 rc, they are not recognized and the network is not
  working even though I set it correctly (especially the AMD PCnet
  adapter)

  Here's some screenshot of the floppy error:
  http://i49.tinypic.com/j77wcw.png

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1042654/+subscriptions
[Qemu-devel] [Bug 1037606] Re: vmwgfx does not work with kvm vmware vga
** Tags added: precise running-unity ** Description changed: vmwgfx driver fails to initialize inside kvm. tried: kvm -m 2048 -vga vmware -cdrom RebeccaBlackLinux.iso (Ubuntu based, any Ubuntu live CD would do) Apport data collected with qantal alpha live CD (somewhat older kernel). The error is shjown in CurrentDmesg.txt https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1037606/+attachment/3265235/+files/CurrentDmesg.txt --- ApportVersion: 2.4-0ubuntu8 Architecture: amd64 AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1: CRDA: Error: command ['iw', 'reg', 'get'] failed with exit code 1: nl80211 not found. CasperVersion: 1.320 DistroRelease: Ubuntu 12.10 IwConfig: eth0 no wireless extensions. lono wireless extensions. LiveMediaBuild: Ubuntu 12.10 Quantal Quetzal - Alpha amd64 (20120724.2) Lsusb: Error: command ['lsusb'] failed with exit code 1: unable to initialize libusb: -99 MachineType: Bochs Bochs Package: linux (not installed) ProcEnviron: TERM=linux PATH=(custom, no user) LANG=en_US.UTF-8 SHELL=/bin/bash ProcFB: ProcKernelCmdLine: file=/cdrom/preseed/hostname.seed boot=casper initrd=/casper/initrd.lz quiet splash -- maybe-ubiquity ProcVersionSignature: Ubuntu 3.5.0-6.6-generic 3.5.0 PulseList: Error: command ['pacmd', 'list'] failed with exit code 1: No PulseAudio daemon running, or not running as session daemon. 
RelatedPackageVersions: linux-restricted-modules-3.5.0-6-generic N/A linux-backports-modules-3.5.0-6-generic N/A linux-firmware 1.85 RfKill: Tags: quantal Uname: Linux 3.5.0-6-generic x86_64 UpgradeStatus: No upgrade log present (probably fresh install) UserGroups: dmi.bios.date: 01/01/2007 dmi.bios.vendor: Bochs dmi.bios.version: Bochs dmi.chassis.type: 1 dmi.chassis.vendor: Bochs dmi.modalias: dmi:bvnBochs:bvrBochs:bd01/01/2007:svnBochs:pnBochs:pvr:cvnBochs:ct1:cvr: dmi.product.name: Bochs dmi.sys.vendor: Bochs + --- + ApportVersion: 2.0.1-0ubuntu12 + Architecture: i386 + DistroRelease: Ubuntu 12.04 + InstallationMedia: Ubuntu 10.10 Maverick Meerkat - Release i386 (20101007) + Package: linux (not installed) + ProcEnviron: + TERM=xterm + PATH=(custom, no user) + LANG=en_US.UTF-8 + SHELL=/bin/bash + Tags: precise running-unity + Uname: Linux 3.6.0-030600rc3-generic i686 + UnreportableReason: The running kernel is not an Ubuntu kernel + UpgradeStatus: Upgraded to precise on 2012-08-30 (0 days ago) + UserGroups: adm admin cdrom dialout lpadmin plugdev sambashare ** Changed in: linux (Ubuntu) Status: Incomplete = New ** Tags removed: needs-upstream-testing ** Changed in: linux (Ubuntu) Status: New = Confirmed ** Tags added: kernel-bug-exists-upstream -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1037606 Title: vmwgfx does not work with kvm vmware vga Status in QEMU: New Status in “linux” package in Ubuntu: Confirmed Bug description: vmwgfx driver fails to initialize inside kvm. tried: kvm -m 2048 -vga vmware -cdrom RebeccaBlackLinux.iso (Ubuntu based, any Ubuntu live CD would do) Apport data collected with qantal alpha live CD (somewhat older kernel). 
The error is shjown in CurrentDmesg.txt https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1037606/+attachment/3265235/+files/CurrentDmesg.txt --- ApportVersion: 2.4-0ubuntu8 Architecture: amd64 AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1: CRDA: Error: command ['iw', 'reg', 'get'] failed with exit code 1: nl80211 not found. CasperVersion: 1.320 DistroRelease: Ubuntu 12.10 IwConfig: eth0 no wireless extensions. lono wireless extensions. LiveMediaBuild: Ubuntu 12.10 Quantal Quetzal - Alpha amd64 (20120724.2) Lsusb: Error: command ['lsusb'] failed with exit code 1: unable to initialize libusb: -99 MachineType: Bochs Bochs Package: linux (not installed) ProcEnviron: TERM=linux PATH=(custom, no user) LANG=en_US.UTF-8 SHELL=/bin/bash ProcFB: ProcKernelCmdLine: file=/cdrom/preseed/hostname.seed boot=casper initrd=/casper/initrd.lz quiet splash -- maybe-ubiquity ProcVersionSignature: Ubuntu 3.5.0-6.6-generic 3.5.0 PulseList: Error: command ['pacmd', 'list'] failed with exit code 1: No PulseAudio daemon running, or not running as session daemon. RelatedPackageVersions: linux-restricted-modules-3.5.0-6-generic N/A linux-backports-modules-3.5.0-6-generic N/A linux-firmware 1.85 RfKill: Tags: quantal Uname: Linux 3.5.0-6-generic x86_64 UpgradeStatus: No upgrade log present (probably fresh install) UserGroups: dmi.bios.date: 01/01/2007 dmi.bios.vendor: Bochs dmi.bios.version: Bochs
Re: [Qemu-devel] Adding support for Stateless Static NAT for TAP devices
On Thu, Aug 30, 2012 at 10:27 AM, John Basila jbas...@checkpoint.com wrote:
> I have tried NAT and this is why I came up with this feature.

QEMU's net/tap.c is the wrong place to add NAT code. The point of tap is to use the host network stack. If you want userspace networking, use -netdev user or -netdev socket.

Please look into iptables more. I have CCed the netfilter mailing list. The question is:

The host has several tap interfaces (tap0, tap1, ...) and the machine on the other end of each tap interface uses IP address 10.0.0.2. So we have:

tap0 - virtual machine #0 (10.0.0.2)
tap1 - virtual machine #1 (10.0.0.2)
tap2 - virtual machine #2 (10.0.0.2)

Because the virtual machines all use the same static IP address, they cannot communicate with each other or the outside world (they fight over ARP).

We'd like to NAT the tap interfaces:

tap0 - virtual machine #0 (10.0.0.2 NAT to 192.168.0.2)
tap1 - virtual machine #1 (10.0.0.2 NAT to 192.168.0.3)
tap2 - virtual machine #2 (10.0.0.2 NAT to 192.168.0.4)

This would allow the virtual machines to communicate even though each believes it is 10.0.0.2.

How can this be done using iptables and friends?

Thanks,
Stefan
[Qemu-devel] boot device order has no effect for virtio-scsi devices
My host is Gentoo x64, kernel 3.5.2, qemu-kvm 1.1.1-r1, libvirt 0.9.13, seabios 1.7.0.

I am trying to set the boot order to SCSI CD-ROM first, then SCSI hard disk, but the virtual machine always boots from the first SCSI device only (unit='0', the SCSI hard disk).

Is this a known problem?

My libvirt config:

<domain type='kvm'>
  <name>Linux</name>
  <uuid>xxx</uuid>
  <memory unit='KiB'>1048576</memory>
  <currentMemory unit='KiB'>1048576</currentMemory>
  <vcpu placement='static'>2</vcpu>
  <os>
    <type arch='x86_64' machine='pc-1.1'>hvm</type>
    <boot dev='cdrom'/>
    <boot dev='hd'/>
    <bootmenu enable='no'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <hap/>
  </features>
  <cpu mode='host-model'>
    <model fallback='allow'/>
  </cpu>
  <clock offset='utc'/>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <devices>
    <emulator>/usr/bin/qemu-kvm</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='unsafe' io='native'/>
      <source file='/Linux.raw_image'/>
      <target dev='sda' bus='scsi'/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>
    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw' cache='unsafe' io='native'/>
      <source file='/xubuntu-12.04-desktop-amd64.iso'/>
      <target dev='sdb' bus='scsi'/>
      <readonly/>
      <address type='drive' controller='0' bus='0' target='0' unit='1'/>
    </disk>
    <controller type='usb' index='0'>
      <address type='pci' domain='0x' bus='0x00' slot='0x01' function='0x2'/>
    </controller>
    <controller type='virtio-serial' index='0'>
      <address type='pci' domain='0x' bus='0x00' slot='0x04' function='0x0'/>
    </controller>
    <controller type='scsi' index='0' model='virtio-scsi'>
      <address type='pci' domain='0x' bus='0x00' slot='0x05' function='0x0'/>
    </controller>
    <interface type='direct'>
      <mac address='xx'/>
      <source dev='eth0' mode='bridge'/>
      <model type='virtio'/>
      <address type='pci' domain='0x' bus='0x00' slot='0x08' function='0x0'/>
    </interface>
    <serial type='pty'>
      <target port='0'/>
    </serial>
    <console type='pty'>
      <target type='serial' port='0'/>
    </console>
    <channel type='spicevmc'>
      <target type='virtio' name='com.redhat.spice.0'/>
      <address type='virtio-serial' controller='0' bus='0' port='1'/>
    </channel>
    <input type='mouse' bus='ps2'/>
    <graphics type='spice' autoport='yes'>
      <image compression='off'/>
      <jpeg compression='never'/>
      <zlib compression='never'/>
      <playback compression='off'/>
      <streaming mode='off'/>
    </graphics>
    <sound model='ich6'>
      <codec type='micro'/>
      <address type='pci' domain='0x' bus='0x00' slot='0x03' function='0x0'/>
    </sound>
    <video>
      <model type='qxl' vram='65536' heads='1'/>
      <address type='pci' domain='0x' bus='0x00' slot='0x02' function='0x0'/>
    </video>
    <redirdev bus='usb' type='spicevmc'>
    </redirdev>
    <memballoon model='virtio'>
      <address type='pci' domain='0x' bus='0x00' slot='0x06' function='0x0'/>
    </memballoon>
  </devices>
</domain>
Re: [Qemu-devel] [PATCH 6/9] omap_lcdc: omap_ppm_save(): add error handling
On Wed, 29 Aug 2012 22:28:38 +0100 Peter Maydell peter.mayd...@linaro.org wrote:

> On 29 August 2012 20:53, Luiz Capitulino lcapitul...@redhat.com wrote:
>> Signed-off-by: Luiz Capitulino lcapitul...@redhat.com
>> ---
>>  hw/omap_lcdc.c | 59 --
>>  1 file changed, 45 insertions(+), 14 deletions(-)
>>
>> diff --git a/hw/omap_lcdc.c b/hw/omap_lcdc.c
>> index 3d6328f..e2ba108 100644
>> --- a/hw/omap_lcdc.c
>> +++ b/hw/omap_lcdc.c
>> @@ -224,18 +224,24 @@ static void omap_update_display(void *opaque)
>>      omap_lcd->invalidate = 0;
>>  }
>>
>> -static int omap_ppm_save(const char *filename, uint8_t *data,
>> -                         int w, int h, int linesize)
>> +static void omap_ppm_save(const char *filename, uint8_t *data,
>> +                          int w, int h, int linesize, Error **errp)
>>  {
>>      FILE *f;
>>      uint8_t *d, *d1;
>>      unsigned int v;
>> -    int y, x, bpp;
>> +    int ret, y, x, bpp;
>>
>>      f = fopen(filename, "wb");
>> -    if (!f)
>> -        return -1;
>> -    fprintf(f, "P6\n%d %d\n%d\n", w, h, 255);
>> +    if (!f) {
>> +        error_setg(errp, "failed to open file '%s': %s", filename,
>> +                   strerror(errno));
>> +        return;
>> +    }
>> +    ret = fprintf(f, "P6\n%d %d\n%d\n", w, h, 255);
>> +    if (ret < 0) {
>> +        goto write_err;
>> +    }
>
> We don't use 'ret' in write_err, so why not just
>     if (fprintf(f) < 0) {
>         goto write_err;
>     }
> here and similarly below and drop the variable altogether?

For clarity. This is probably a matter of taste, but I much prefer separate statements (vs. saving 4 bytes during the function call).
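For readers following the style question, here is a minimal standalone sketch of the "separate statement plus one error label" pattern under discussion. It is not the omap_lcdc code: ppm_save_sketch() and its msg/msglen parameters are hypothetical, and the error text goes into a caller-supplied buffer instead of QEMU's Error object.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not QEMU code: writes a binary PPM header plus
 * `len` bytes of pixel data. Returns 0 on success, -1 on failure, in
 * which case `msg` holds a description of what went wrong. */
static int ppm_save_sketch(const char *filename, const unsigned char *data,
                           size_t len, int w, int h,
                           char *msg, size_t msglen)
{
    FILE *f;
    int ret;

    assert(msg != NULL && msglen > 0);

    f = fopen(filename, "wb");
    if (!f) {
        snprintf(msg, msglen, "failed to open file '%s': %s",
                 filename, strerror(errno));
        return -1;
    }

    /* Separate statement for the call, then the check. */
    ret = fprintf(f, "P6\n%d %d\n%d\n", w, h, 255);
    if (ret < 0) {
        goto write_err;
    }

    if (fwrite(data, 1, len, f) != len) {
        goto write_err;
    }

    if (fclose(f) == EOF) {
        f = NULL;          /* the stream is gone even when fclose() fails */
        goto write_err;
    }
    return 0;

write_err:
    /* Format the message first, so a later fclose() cannot change errno. */
    snprintf(msg, msglen, "failed to write to file '%s': %s",
             filename, strerror(errno));
    if (f) {
        fclose(f);
    }
    return -1;
}
```

Both spellings of the check behave the same; the single write_err label is what keeps the cleanup in one place.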
Re: [Qemu-devel] [PATCH v2] Fix buffer run out in eepro100.
On Thu, Aug 30, 2012 at 9:38 AM, Bo Yang boy...@suse.com wrote:
> On 08/30/2012 04:04 PM, Stefan Hajnoczi wrote:
>> On Wed, Aug 29, 2012 at 09:17:43PM +0200, Stefan Weil wrote:
>>> On 29.08.2012 13:26, Bo Yang wrote:
>>>> This is reported by QA. When installing os with pxe, after the initial kernel and initrd are loaded, the procedure tries to copy files from install server to local harddisk, the network becomes stall because of running out of receive descriptor.
>>>>
>>>> Signed-off-by: Bo Yang boy...@suse.com
>>>> ---
>>>>  hw/eepro100.c | 5 -
>>>>  1 files changed, 4 insertions(+), 1 deletions(-)
>>>>
>>>> diff --git a/hw/eepro100.c b/hw/eepro100.c
>>>> index 50d117e..52a18ad 100644
>>>> --- a/hw/eepro100.c
>>>> +++ b/hw/eepro100.c
>>>> @@ -1036,6 +1036,8 @@ static void eepro100_ru_command(EEPRO100State * s, uint8_t val)
>>>>          }
>>>>          set_ru_state(s, ru_ready);
>>>>          s->ru_offset = e100_read_reg4(s, SCBPointer);
>>>> +        qemu_flush_queued_packets(&s->nic->nc);
>>>> +        qemu_notify_event();
>>>
>>> What would happen if the above changes were omitted?
>>
>> In the worst case the guest code would be unable to make progress since packet reception is disabled.
>>
>> The QEMU net subsystem needs to be kicked when rx buffers become available again so that any queued packets can be delivered and we can restart the event loop.
>>
>> The event loop needs to be restarted because net clients (like tap) use qemu_set_fd_handler2() with a read_poll() handler that returns false when the NIC is unable to receive. Imagine this scenario:
>>
>> 1. NIC runs out of rx buffers.
>> 2. Event loop iteration starts and calls tap's read_poll() handler, which sees the NIC cannot receive and therefore does not add the tap file descriptor to select(2).
>> 3. NIC gets new rx buffers but does not kick net subsystem/event loop.
>> 4. Event loop still sitting in select(2) without the tap file descriptor. Therefore incoming packets are not picked up by QEMU!
>>
>> In practice the event loop tends to iterate due to timers, etc. But in the worst case we can go completely starved here.
>
> Yes. The fd will be added to read set in the next iteration. The delay depends on the select timeout. it is possible to go starved here.
>
>>> Would the network show less performance? How much would the test scenario (Linux installation) take longer?
>>
>> Yes, the lack of kicks causes reduced network performance. This is especially true with -netdev tap and a guest driver that runs out of rx buffers. If you're lucky you might not hit this depending on the -netdev and availability of rx buffers.
>>
>>> What about the other nic emulations in QEMU? I observe hanging network rather often with the ARM versatilepb emulation.
>>
>> virtio-net has been correct for some time. e1000, xen, usb, and eepro100 are now fixed in the net tree:
>>
>> http://github.com/stefanha/qemu/commits/net
>>
>> Other NICs may or may not be okay. Really all of them need to be audited.
>>
>>>>          TRACE(OTHER, logout("val=0x%02x (rx start)\n", val));
>>>>          break;
>>>>      case RX_RESUME:
>>>> @@ -1770,7 +1772,8 @@ static ssize_t nic_receive(NetClientState *nc, const uint8_t * buf, size_t size)
>>>>      if (rfd_command & COMMAND_EL) {
>>>>          /* EL bit is set, so this was the last frame. */
>>>>          logout("receive: Running out of frames\n");
>>>> -        set_ru_state(s, ru_suspended);
>>>> +        set_ru_state(s, ru_no_resources);
>>>> +        eepro100_rnr_interrupt(s);
>>>
>>> Adding the interrupt here is correct (I have similar code in http://repo.or.cz/w/qemu/ar7.git/blob/HEAD:/hw/eepro100.c which is an improved version of hw/eepro100.c).
>>>
>>> Setting ru_no_resources looks also good, but I am not sure whether removing ru_suspended is ok. Maybe it should be ru_no_resources | ru_suspended.
>>
>> I think the datasheet talks about setting the RU to no resources and the CU to suspended. So there are two state machines and we only track one here.
>
> I don't think I understand this. If we run out of rx descriptor, why do we suspend tx unit too? maybe there are reasons I am unaware of.. I don't know.

I was wrong. The datasheet Table 55. "CU Activities Performed at the End of Execution" shows that the EL and S bit cause the CU to enter the Idle State. In terms of hw/eepro100.c I don't think we care about the CU state.

RU state No Resources is correct.

Stefan
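The starvation scenario above can be illustrated with a toy model. This is only a sketch under stated assumptions: ToyNic and the toy_* functions are hypothetical and stand in for the NIC emulation and the net layer's packet queue; nothing here is QEMU API. The `kick` flag plays the role of the flush that the patch adds when the guest supplies new rx buffers.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model, not QEMU code: every name here is hypothetical. */
typedef struct ToyNic {
    unsigned rx_free;    /* receive descriptors the guest has supplied */
    unsigned queued;     /* packets the backend is holding back */
    unsigned delivered;  /* packets handed to the guest */
} ToyNic;

/* The backend offers one packet: returns 1 if it was delivered,
 * 0 if the NIC had no descriptor and the packet had to be queued. */
static int toy_nic_receive(ToyNic *nic)
{
    assert(nic != NULL);
    if (nic->rx_free == 0) {
        nic->queued++;
        return 0;
    }
    nic->rx_free--;
    nic->delivered++;
    return 1;
}

/* Deliver as many queued packets as the free descriptors allow. */
static void toy_flush_queued(ToyNic *nic)
{
    while (nic->queued > 0 && nic->rx_free > 0) {
        nic->queued--;
        nic->rx_free--;
        nic->delivered++;
    }
}

/* The guest refills descriptors. Without the kick, queued packets stay
 * queued until something else happens to wake the backend. */
static void toy_nic_refill(ToyNic *nic, unsigned n, int kick)
{
    nic->rx_free += n;
    if (kick) {
        toy_flush_queued(nic);
    }
}
```

With one descriptor and two incoming packets, a refill without the kick leaves the second packet queued even though descriptors are free again; a refill with the kick delivers it immediately.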
Re: [Qemu-devel] Adding support for Stateless Static NAT for TAP devices
John Basila jbas...@checkpoint.com writes: […] The problem here is related to the fact that QEMU is executed with multiple instances and all instances start from the same snapshot, Isn't it possible to resolve such an issue using, e. g., DHCPv6 or DHCP? All the QEMU instances will (AIUI) have random MAC addresses by default, but a static “instance to MAC” mapping is also possible, as is the respective “MAC to IP” mapping. […] -- FSF associate member #7257 http://sfd.am-1.org/
[Qemu-devel] Any KVM passthrough performance report?
Hi, I wonder if there is any detailed performance report of KVM passthrough. I was just told it can achieve near native hardware performance and could not get any experimental results. Best, Yi
Re: [Qemu-devel] macvlan/macvtap: guest/host cannot communicate when network cable is unplugged
Can you try the same test with two macvlan interfaces on the host (no macvtap)? You may need to use the ping -I interface-address argument to force the ping source address to a specific macvlan interface. If you see the same problem, it may just be the macvlan design - it is stacked on top of eth0 and might not work when eth0 is down. CCing macvlan/macvtap folks. Stefan tested as below $ifconfig eth0 Link encap:Ethernet HWaddr f4:6d:xx:xx:xx:xx inet6 addr: fe80::xx:xx:xx:xx/64 Scope:Link UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:86507 errors:0 dropped:0 overruns:0 frame:0 TX packets:55940 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:126005746 (120.1 MiB) TX bytes:4394225 (4.1 MiB) macvtap0 Link encap:Ethernet HWaddr 52:54:xx:xx:xx:xx inet6 addr: fe80::xx:xx:xx:xx/64 Scope:Link UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:70 errors:0 dropped:0 overruns:0 frame:0 TX packets:84 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:500 RX bytes:9036 (8.8 KiB) TX bytes:14734 (14.3 KiB) znet0 Link encap:Ethernet HWaddr 00:60:xx:xx:xx:92 inet addr:192.168.1.2 Bcast:192.168.1.255 Mask:255.255.255.0 inet6 addr: 2002:xx:xx:xx:xx/64 Scope:Global inet6 addr: fe80:xx:xx:xx:xx/64 Scope:Link UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:4463190 errors:0 dropped:0 overruns:0 frame:0 TX packets:12527522 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:3959213697 (3.6 GiB) TX bytes:18590336476 (17.3 GiB) znet1 Link encap:Ethernet HWaddr 00:60:xx:xx:xx:99 inet addr:192.168.1.177 Bcast:192.168.1.255 Mask:255.255.255.0 inet6 addr: 2002:xx:xx:xx:xx64 Scope:Global inet6 addr: fe80:xx:xx:xx:xx/64 Scope:Link UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:8 errors:0 dropped:0 overruns:0 frame:0 TX packets:9 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:1399 (1.3 KiB) TX bytes:1522 (1.4 KiB) $ ip -d link show 10: znet0@eth0: 
BROADCAST,MULTICAST,UP,LOWER_UP mtu 1500 qdisc noqueue state UP mode DEFAULT link/ether 00:60:xx:xx:xx:92 brd ff:ff:ff:ff:ff:ff macvlan mode bridge 15: znet1@eth0: BROADCAST,MULTICAST,UP,LOWER_UP mtu 1500 qdisc noqueue state UNKNOWN mode DEFAULT link/ether 00:60:xx:xx:xx:99 brd ff:ff:ff:ff:ff:ff macvlan mode bridge the macvlan interface cannot ping each other no matter network cable is plugged or not $ ping -I 192.168.1.2 192.168.1.177 PING 192.168.1.177 (192.168.1.177) from 192.168.1.2 : 56(84) bytes of data. --- 192.168.1.177 ping statistics --- 6 packets transmitted, 0 received, 100% packet loss, time 4999ms I also perform an additional test: the guests (macvtap bridge mode) CAN communicate each other no matter network cable is plugged or not.
Re: [Qemu-devel] [PATCH 01/18] qerror: introduce QERR_GENERIC_ERROR
On Wed, 15 Aug 2012 09:41:42 +0200 Pavel Hrdina phrd...@redhat.com wrote:

> Signed-off-by: Pavel Hrdina phrd...@redhat.com
> ---
>  qerror.h | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/qerror.h b/qerror.h
> index d0a76a4..7e0bae7 100644
> --- a/qerror.h
> +++ b/qerror.h
> @@ -120,6 +120,9 @@ void assert_no_error(Error *err);
>  #define QERR_FEATURE_DISABLED \
>      ERROR_CLASS_GENERIC_ERROR, "The feature '%s' is not enabled"
>
> +#define QERR_GENERIC_ERROR \
> +    ERROR_CLASS_GENERIC_ERROR, "An (Errno %d) error has occurred"
> +

You should use error_setg() instead:

http://lists.gnu.org/archive/html/qemu-devel/2012-08/msg04980.html

There are usage examples in the series introducing it. It would be better to wait for it to be merged before you use it though, as it's always possible for people to ask for changes.

>  #define QERR_INVALID_BLOCK_FORMAT \
>      ERROR_CLASS_GENERIC_ERROR, "Invalid block format '%s'"
Re: [Qemu-devel] QEMU emulation per CPU
Hi Santosa, Can you please forward a link of that discussion thread ?? Thanks and Regards -Naresh Bhat On Thu, Aug 30, 2012 at 2:39 PM, Mulyadi Santosa mulyadi.sant...@gmail.com wrote: Hi... On Thu, Aug 30, 2012 at 3:58 PM, Naresh Bhat nareshgb...@gmail.com wrote: Hi, Can you please explain me why qemu user mode doesn't get along nicely with POSIX threads. ?? there is another thread in this qemu-devel list that explains this. All I can conclude from that thread is that it has something to do with timers and address mapping... -- regards, Mulyadi Santosa Freelance Linux trainer and consultant blog: the-hydra.blogspot.com training: mulyaditraining.blogspot.com -- For things to change, we must change -Naresh Bhat
[Qemu-devel] Posix timer syscalls ; dealing with the timer_t type
Hi all,

I'm working on implementing POSIX timers in linux-user. I'm having trouble figuring out how to handle the timer_t type.

Consider the following code, with, say, 32-bit ARM being emulated on 64-bit x86-64:

    timer_t timerid;
    err = timer_create(clockid, &sev, &timerid);
    err = timer_gettime(timerid, &curr);

The issue is that memory for the timer_t value in the 32-bit target is allocated on the stack (where timer_t is 4 bytes), but the value is provided by the 64-bit host, where timer_t is 8 bytes.

Any suggestions on dealing with this?

Erik
-- 
Erik de Castro Lopo http://www.mega-nerd.com/
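One possible approach, sketched here as an assumption and not as what linux-user does: keep the host timer_t values in a host-side table and hand the guest only a small integer index, which fits in 4 bytes regardless of the host's timer_t size. MAX_GUEST_TIMERS and the guest_timer_* names are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <time.h>

#define MAX_GUEST_TIMERS 32   /* arbitrary limit chosen for this sketch */

/* Host-side storage: the guest never sees these values directly. */
static timer_t host_timers[MAX_GUEST_TIMERS];
static int slot_used[MAX_GUEST_TIMERS];

/* Reserve a slot and return the guest-visible id, or -1 if the table
 * is full. The id is what would be written to the guest's timer_t. */
static int32_t guest_timer_alloc(void)
{
    int32_t i;

    for (i = 0; i < MAX_GUEST_TIMERS; i++) {
        if (!slot_used[i]) {
            slot_used[i] = 1;
            return i;
        }
    }
    return -1;
}

/* Translate a guest id back to the host timer_t storage, or return
 * NULL when the id is out of range or not currently allocated. */
static timer_t *guest_timer_lookup(int32_t id)
{
    if (id < 0 || id >= MAX_GUEST_TIMERS || !slot_used[id]) {
        return NULL;
    }
    return &host_timers[id];
}

/* Release a slot so that its id can be reused. */
static void guest_timer_free(int32_t id)
{
    assert(id >= 0 && id < MAX_GUEST_TIMERS);
    slot_used[id] = 0;
}
```

A timer_create wrapper would then pass the pointer from guest_timer_lookup() to the host's timer_create() and store only the int32_t id in guest memory; timer_gettime and friends would translate the id back before calling the host.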
Re: [Qemu-devel] [PATCH V6 0/2] Add JSON output to qemu-img info
On Monday 27 Aug 2012 at 11:52:59 (-0600), Eric Blake wrote:
> On 08/27/2012 01:15 AM, Benoît Canet wrote:
>> This patchset adds a JSON output mode to the qemu-img info command. It's a rewrite from scratch of the original patchset by Wenchao Xia following Anthony Liguori's advice on JSON formatting.
>>
>> the --output=(json|human) option is now mandatory on the command line.
>
> This statement is not true, but doesn't affect the series itself.
>
>> in v6:
>>
>> Blue Swirl:
>> -Add missing const in getopt structure declaration.
>>
>> Eric Blake:
>> -Remove spurious undef.
>> -Use an enum instead of two boolean.
>
> You have now addressed my complaints about the interface, but I would feel comfortable if someone more familiar with qemu-img itself gives final review and/or ack.

gentle ping

Benoît

> -- 
> Eric Blake ebl...@redhat.com +1-919-301-3266
> Libvirt virtualization library http://libvirt.org
[Qemu-devel] [PATCH] xhci: allow 1 and 2 bytes accesses to capability registers
Some xHC drivers (most notably on Windows and BSD systems) read the first capability registers using 1 and 2 bytes accesses, since this is how they are defined in section 5.3 of the xHCI specs. Enabling these kind of read accesses allows Windows and FreeBSD guests to properly recognize the host controller. As this is an exception to the general 4-byte aligned accesses rule, we special-case the code path for capability reading and implement checks to guard against wrong size/alignment combinations. Signed-off-by: Alejandro Martinez Ruiz a...@securiforest.com --- hw/usb/hcd-xhci.c | 75 --- 1 file changed, 55 insertions(+), 20 deletions(-) diff --git a/hw/usb/hcd-xhci.c b/hw/usb/hcd-xhci.c index 6c2ff02..6cca161 100644 --- a/hw/usb/hcd-xhci.c +++ b/hw/usb/hcd-xhci.c @@ -2320,13 +2320,30 @@ static void xhci_reset(DeviceState *dev) xhci-ev_buffer_get = 0; } -static uint32_t xhci_cap_read(XHCIState *xhci, uint32_t reg) +static uint32_t xhci_cap_read(XHCIState *xhci, uint32_t reg, unsigned size) { -uint32_t ret; +uint32_t ret = 0; + +/* + * Section 5.3 of the xHCI specification defines the first capability + * registers as being only 1 and 2 bytes in size. In fact, these are + * often accessed as 1 or 2 bytes reads. + * + * Some drivers read the first 4 bytes in one go, while others -most + * notably the original NEC Renesas driver for Windows and the *BSDs- + * read one register at a time. This is the only known exception to + * the 4 byte accesses rule, so we'll special-case the code. 
+ */ switch (reg) { -case 0x00: /* HCIVERSION, CAPLENGTH */ -ret = 0x0100 | LEN_CAP; +case 0x00: /* CAPLENGTH [, HCIVERSION] */ +ret = LEN_CAP; +if (size 4) { +break; +} +/* fall-through if asking for all 4 bytes */ +case 0x02: /* HCIVERSION */ +ret |= 0x0100 (4 - size) * CHAR_BIT; break; case 0x04: /* HCSPARAMS 1 */ ret = (MAXPORTS24) | (MAXINTRS8) | MAXSLOTS; @@ -2685,26 +2702,43 @@ static void xhci_doorbell_write(XHCIState *xhci, uint32_t reg, uint32_t val) static uint64_t xhci_mem_read(void *ptr, target_phys_addr_t addr, unsigned size) { +uint64_t ret = 0; XHCIState *xhci = ptr; -/* Only aligned reads are allowed on xHCI */ -if (addr 3) { -fprintf(stderr, xhci_mem_read: Mis-aligned read\n); -return 0; -} - +/* Allow 1, 2 and 4-byte aligned reads on capabilities, and only + * 4-byte reads elsewhere. + */ if (addr LEN_CAP) { -return xhci_cap_read(xhci, addr); -} else if (addr = OFF_OPER addr (OFF_OPER + LEN_OPER)) { -return xhci_oper_read(xhci, addr - OFF_OPER); -} else if (addr = OFF_RUNTIME addr (OFF_RUNTIME + LEN_RUNTIME)) { -return xhci_runtime_read(xhci, addr - OFF_RUNTIME); -} else if (addr = OFF_DOORBELL addr (OFF_DOORBELL + LEN_DOORBELL)) { -return xhci_doorbell_read(xhci, addr - OFF_DOORBELL); +/* deny accesses to odd addresses, specially since we accept 1-byte reads */ +if (addr 1) { +fprintf(stderr, xhci_mem_read: invalid %ud-byte capability read at address %x\n, size, (unsigned int) addr); +goto out; +} + +/* We deal with read size down in xhci_cap_read, since access + * is variable for some addresses. 
+ */ +ret = xhci_cap_read(xhci, addr, size); } else { -fprintf(stderr, xhci_mem_read: Bad offset %x\n, (int)addr); -return 0; +/* non capability read */ +if (size 4) { +fprintf(stderr, xhci_mem_read: mis-aligned %ud-byte read on address %x\n, size, (unsigned int) addr); +goto out; +} + +if (addr = OFF_OPER addr (OFF_OPER + LEN_OPER)) { +ret = xhci_oper_read(xhci, addr - OFF_OPER); +} else if (addr = OFF_RUNTIME addr (OFF_RUNTIME + LEN_RUNTIME)) { +ret = xhci_runtime_read(xhci, addr - OFF_RUNTIME); +} else if (addr = OFF_DOORBELL addr (OFF_DOORBELL + LEN_DOORBELL)) { +ret = xhci_doorbell_read(xhci, addr - OFF_DOORBELL); +} else { +fprintf(stderr, xhci_mem_read: tried to read %ud bytes from bad offset %x\n, size, (unsigned int) addr); +} } + +out: +return ret; } static void xhci_mem_write(void *ptr, target_phys_addr_t addr, @@ -2732,8 +2766,9 @@ static void xhci_mem_write(void *ptr, target_phys_addr_t addr, static const MemoryRegionOps xhci_mem_ops = { .read = xhci_mem_read, .write = xhci_mem_write, -.valid.min_access_size = 4, +.valid.min_access_size = 1, .valid.max_access_size = 4, +.valid.unaligned = false, .endianness = DEVICE_LITTLE_ENDIAN, }; -- 1.7.12.rc2.18.g61b472e
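As a side note on the technique, sub-word reads of a little-endian register block can also be derived from the full 32-bit word by shifting and masking. The sketch below is an illustration only and not the code from this patch: cap_access_ok() and cap_read_sub() are hypothetical names, the word value used in the example is assumed, and the alignment rule is natural alignment, whereas the patch rejects every odd address.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if a read of `size` bytes at `addr` is one this sketch
 * accepts: size 1, 2 or 4, naturally aligned. */
static int cap_access_ok(uint32_t addr, unsigned size)
{
    if (size != 1 && size != 2 && size != 4) {
        return 0;
    }
    return (addr & (size - 1)) == 0;
}

/* Extract `size` bytes at byte offset (addr & 3) from a little-endian
 * 32-bit register word. */
static uint32_t cap_read_sub(uint32_t word, uint32_t addr, unsigned size)
{
    unsigned shift;
    uint32_t mask;

    assert(cap_access_ok(addr, size));
    shift = (addr & 3u) * 8u;
    mask = (size == 4) ? 0xffffffffu : ((1u << (size * 8u)) - 1u);
    return (word >> shift) & mask;
}
```

For an assumed first capability word of 0x01000040 (CAPLENGTH in byte 0, HCIVERSION in bytes 2 and 3), a 1-byte read at offset 0 yields 0x40, a 2-byte read at offset 2 yields 0x0100, and a 4-byte read at offset 0 yields the whole word.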
Re: [Qemu-devel] macvlan/macvtap: guest/host cannot communicate when network cable is unplugged
On Thu, Aug 30, 2012 at 1:13 PM, ching lschin...@gmail.com wrote: Can you try the same test with two macvlan interfaces on the host (no macvtap)? You may need to use the ping -I interface-address argument to force the ping source address to a specific macvlan interface. If you see the same problem, it may just be the macvlan design - it is stacked on top of eth0 and might not work when eth0 is down. CCing macvlan/macvtap folks. Stefan tested as below $ifconfig eth0 Link encap:Ethernet HWaddr f4:6d:xx:xx:xx:xx inet6 addr: fe80::xx:xx:xx:xx/64 Scope:Link UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:86507 errors:0 dropped:0 overruns:0 frame:0 TX packets:55940 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:126005746 (120.1 MiB) TX bytes:4394225 (4.1 MiB) macvtap0 Link encap:Ethernet HWaddr 52:54:xx:xx:xx:xx inet6 addr: fe80::xx:xx:xx:xx/64 Scope:Link UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:70 errors:0 dropped:0 overruns:0 frame:0 TX packets:84 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:500 RX bytes:9036 (8.8 KiB) TX bytes:14734 (14.3 KiB) znet0 Link encap:Ethernet HWaddr 00:60:xx:xx:xx:92 inet addr:192.168.1.2 Bcast:192.168.1.255 Mask:255.255.255.0 inet6 addr: 2002:xx:xx:xx:xx/64 Scope:Global inet6 addr: fe80:xx:xx:xx:xx/64 Scope:Link UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:4463190 errors:0 dropped:0 overruns:0 frame:0 TX packets:12527522 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:3959213697 (3.6 GiB) TX bytes:18590336476 (17.3 GiB) znet1 Link encap:Ethernet HWaddr 00:60:xx:xx:xx:99 inet addr:192.168.1.177 Bcast:192.168.1.255 Mask:255.255.255.0 inet6 addr: 2002:xx:xx:xx:xx64 Scope:Global inet6 addr: fe80:xx:xx:xx:xx/64 Scope:Link UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:8 errors:0 dropped:0 overruns:0 frame:0 TX packets:9 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:1399 (1.3 KiB) TX 
bytes:1522 (1.4 KiB) $ ip -d link show 10: znet0@eth0: BROADCAST,MULTICAST,UP,LOWER_UP mtu 1500 qdisc noqueue state UP mode DEFAULT link/ether 00:60:xx:xx:xx:92 brd ff:ff:ff:ff:ff:ff macvlan mode bridge 15: znet1@eth0: BROADCAST,MULTICAST,UP,LOWER_UP mtu 1500 qdisc noqueue state UNKNOWN mode DEFAULT link/ether 00:60:xx:xx:xx:99 brd ff:ff:ff:ff:ff:ff macvlan mode bridge the macvlan interface cannot ping each other no matter network cable is plugged or not $ ping -I 192.168.1.2 192.168.1.177 PING 192.168.1.177 (192.168.1.177) from 192.168.1.2 : 56(84) bytes of data. --- 192.168.1.177 ping statistics --- 6 packets transmitted, 0 received, 100% packet loss, time 4999ms In bridge mode I expected them to be able to communicate. I also perform an additional test: the guests (macvtap bridge mode) CAN communicate each other no matter network cable is plugged or not. Strange. I thought the original problem was that the macvtap guests cannot communicate with each other when the network cable is unplugged? Hopefully someone else can help you, I'm not familiar enough with macvlan/macvtap. Stefan
Re: [Qemu-devel] [PATCH for-1.2] hw/arm_gic.c: Define .class_size in arm_gic_info TypeInfo
On 29.08.2012 20:57, Stefan Weil wrote:
> PS. Are there perhaps more bugs of this sort? A quick test looking for .class_init without .class_size shows a lot of files.

That alone is not wrong. A problem only arises when a new struct ...Class is cast to but the object is not sized appropriately through .class_size.

Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
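To make the sizing problem concrete, here is a toy model. It is a sketch and not QOM code: ToyTypeInfo, BaseClass, DerivedClass and the toy_* functions are hypothetical, and the rule that a class_size of 0 inherits the parent's size is modelled here as an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for a parent class struct and a derived one. */
typedef struct BaseClass {
    int base_field;
} BaseClass;

typedef struct DerivedClass {
    BaseClass parent_class;
    int extra_field;       /* only exists in the derived class struct */
} DerivedClass;

/* Toy type description: class_size == 0 means "same as the parent". */
typedef struct ToyTypeInfo {
    const char *name;
    size_t parent_class_size;
    size_t class_size;
} ToyTypeInfo;

/* Number of bytes that would be allocated for this type's class. */
static size_t toy_effective_class_size(const ToyTypeInfo *ti)
{
    assert(ti != NULL);
    return ti->class_size ? ti->class_size : ti->parent_class_size;
}

/* Returns 1 when the class storage is large enough to be treated as a
 * struct of `wanted` bytes, 0 when such a cast would reach past the
 * end of the allocation. */
static int toy_class_cast_is_safe(const ToyTypeInfo *ti, size_t wanted)
{
    return toy_effective_class_size(ti) >= wanted;
}
```

A type that sets class_size to sizeof(DerivedClass) can safely be cast to DerivedClass. A type that leaves it at 0 only gets sizeof(BaseClass) bytes, so using it as DerivedClass would touch extra_field outside the allocation; using it only as BaseClass remains fine, which matches the point that a missing .class_size is not wrong in itself.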
Re: [Qemu-devel] [PATCH for-1.2] msix: make [un]use vectors on reset/load optional
On 29.08.2012 20:13, Michael S. Tsirkin wrote:
> On Wed, Aug 29, 2012 at 06:54:35PM +0200, Andreas Färber wrote:
>> $subject: [un]used vectors? -- could be fixed by committer.
>
> Sorry I don't understand. it's not 'unused': it's use and unuse.
> What is wrong with the subject?

The grammar with two verbs "make [un]use" sounded wrong to my ears. Given your explanation of [un]use above, did you mean "make clearing [un]use vectors optional on reset/load" or something like that?

/-F

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
Re: [Qemu-devel] Is is possible to virtualise or share the TPM?
Dear Stefan,

What does it mean that the patches with the VTPM functionality exist but they are "behind" the regular ones? Does it mean that they are not currently updated? That they have less priority?

Best regards,
Jordi.

On 08/29/2012 02:57 PM, Stefan Berger wrote:
> On 08/23/2012 04:05 PM, Corey Bryant wrote:
>> On 08/21/2012 06:31 AM, Jordi Cucurull Juan wrote:
>>> Dear all,
>>>
>>> After applying the TPM patches to QEMU, I was wondering if it is possible to simultaneously use the TPM in more than one virtual machine, i.e. virtualisation of the TPM.
>>>
>>> According to the paper "Stefan Berger, Ramón Cáceres, Kenneth A. Goldman, Ronald Perez, Reiner Sailer, Leendert van Doorn. vTPM: Virtualizing the Trusted Platform Module" this seems to be possible in Xen. Is it not possible in QEMU?
>>>
>>> Thanks!
>>> Jordi.
>>
>> I don't think the pass-through driver supports use by multiple VMs. Stefan Berger should be able to answer better so I'm adding him to the thread.
>
> The pass-through driver cannot provide access for multiple VMs to the single hardware TPM on the host. The usage model and the statefulness of the TPM (SRK password, owner password, keys) basically prevent/complicate this.
>
> The implementation for Xen was indep. of the Qemu code base today and there we used a software implementation of the TPM that provided a private TPM instance to each VM. I have patches for this for Qemu but due to an IRC chat in Sept. 2011 they are 'behind' the pass-through driver patches.
>
> Stefan

-- 
Jordi Cucurull Juan
Researcher
Scytl Secure Electronic Voting
Plaça Gal·la Placidia, 1-3, 1st floor · 08006 Barcelona
Phone: + 34 934 230 324
Fax: + 34 933 251 028
jordi.cucur...@scytl.com
http://www.scytl.com
[Qemu-devel] [Bug 1037606] Re: vmwgfx does not work with kvm vmware vga
This issue appears to be an upstream bug, since you tested the latest upstream kernel. Would it be possible for you to open an upstream bug report[0]? That will allow the upstream developers to examine the issue, and may provide a quicker resolution to the bug.

If you are comfortable with opening a bug upstream, it would be great if you can report back the upstream bug number in this bug report. That will allow us to link this bug to the upstream report.

[0] https://wiki.ubuntu.com/Bugs/Upstream/kernel

** Changed in: linux (Ubuntu)
       Status: Confirmed => Triaged

--
You received this bug notification because you are a member of qemu-devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1037606

Title:
  vmwgfx does not work with kvm vmware vga

Status in QEMU:
  New
Status in “linux” package in Ubuntu:
  Triaged

Bug description:
  vmwgfx driver fails to initialize inside kvm.

  tried: kvm -m 2048 -vga vmware -cdrom RebeccaBlackLinux.iso (Ubuntu based, any Ubuntu live CD would do)

  Apport data collected with quantal alpha live CD (somewhat older kernel). The error is shown in CurrentDmesg.txt
  https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1037606/+attachment/3265235/+files/CurrentDmesg.txt
  ---
  ApportVersion: 2.4-0ubuntu8
  Architecture: amd64
  AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
  CRDA: Error: command ['iw', 'reg', 'get'] failed with exit code 1: nl80211 not found.
  CasperVersion: 1.320
  DistroRelease: Ubuntu 12.10
  IwConfig:
   eth0      no wireless extensions.
   lo        no wireless extensions.
  LiveMediaBuild: Ubuntu 12.10 Quantal Quetzal - Alpha amd64 (20120724.2)
  Lsusb: Error: command ['lsusb'] failed with exit code 1: unable to initialize libusb: -99
  MachineType: Bochs Bochs
  Package: linux (not installed)
  ProcEnviron:
   TERM=linux
   PATH=(custom, no user)
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  ProcFB:
  ProcKernelCmdLine: file=/cdrom/preseed/hostname.seed boot=casper initrd=/casper/initrd.lz quiet splash -- maybe-ubiquity
  ProcVersionSignature: Ubuntu 3.5.0-6.6-generic 3.5.0
  PulseList: Error: command ['pacmd', 'list'] failed with exit code 1: No PulseAudio daemon running, or not running as session daemon.
  RelatedPackageVersions:
   linux-restricted-modules-3.5.0-6-generic N/A
   linux-backports-modules-3.5.0-6-generic  N/A
   linux-firmware                           1.85
  RfKill:
  Tags: quantal
  Uname: Linux 3.5.0-6-generic x86_64
  UpgradeStatus: No upgrade log present (probably fresh install)
  UserGroups:
  dmi.bios.date: 01/01/2007
  dmi.bios.vendor: Bochs
  dmi.bios.version: Bochs
  dmi.chassis.type: 1
  dmi.chassis.vendor: Bochs
  dmi.modalias: dmi:bvnBochs:bvrBochs:bd01/01/2007:svnBochs:pnBochs:pvr:cvnBochs:ct1:cvr:
  dmi.product.name: Bochs
  dmi.sys.vendor: Bochs
  ---
  ApportVersion: 2.0.1-0ubuntu12
  Architecture: i386
  DistroRelease: Ubuntu 12.04
  InstallationMedia: Ubuntu 10.10 Maverick Meerkat - Release i386 (20101007)
  Package: linux (not installed)
  ProcEnviron:
   TERM=xterm
   PATH=(custom, no user)
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  Tags: precise running-unity
  Uname: Linux 3.6.0-030600rc3-generic i686
  UnreportableReason: The running kernel is not an Ubuntu kernel
  UpgradeStatus: Upgraded to precise on 2012-08-30 (0 days ago)
  UserGroups: adm admin cdrom dialout lpadmin plugdev sambashare

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1037606/+subscriptions
Re: [Qemu-devel] [PATCH for-1.2] msix: make [un]use vectors on reset/load optional
On Thu, Aug 30, 2012 at 03:34:42PM +0200, Andreas Färber wrote:
> On 29.08.2012 20:13, Michael S. Tsirkin wrote:
>> On Wed, Aug 29, 2012 at 06:54:35PM +0200, Andreas Färber wrote:
>>> $subject: "[un]used vectors"? -- could be fixed by committer.
>>
>> Sorry I don't understand. It's not 'unused': it's use and unuse. What is wrong with the subject?
>
> The grammar with two verbs "make [un]use" sounded wrong to my ears. Given your explanation of [un]use above, did you mean "make clearing [un]use vectors optional on reset/load" or something like that?
>
> /-F

No, sorry. What is meant is simply functions msix_vector_use/msix_vector_unuse: calling these on reset/load was required but is now optional.

> --
> SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
> GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
Re: [Qemu-devel] [PATCH 02/18] block: add error parameter to bdrv_snapshot_create() and related functions
On Wed, 15 Aug 2012 09:41:43 +0200
Pavel Hrdina phrd...@redhat.com wrote:

> Signed-off-by: Pavel Hrdina phrd...@redhat.com
> ---
>  block.c                | 25 +
>  block.h                |  3 ++-
>  block/qcow2-snapshot.c |  9 -
>  block/qcow2.h          |  4 +++-
>  block/rbd.c            | 20 ++--
>  block/sheepdog.c       | 17 +
>  block_int.h            |  3 ++-
>  qemu-img.c             |  2 +-
>  savevm.c               |  2 +-
>  9 files changed, 57 insertions(+), 28 deletions(-)
>
> diff --git a/block.c b/block.c
> index 016858b..8bc49b7 100644
> --- a/block.c
> +++ b/block.c
> @@ -2661,16 +2661,25 @@ BlockDriverState *bdrv_snapshots(void)
>  }
>
>  int bdrv_snapshot_create(BlockDriverState *bs,
> -                         QEMUSnapshotInfo *sn_info)
> +                         QEMUSnapshotInfo *sn_info,
> +                         Error **errp)
>  {
>      BlockDriver *drv = bs->drv;
> -    if (!drv)
> -        return -ENOMEDIUM;
> -    if (drv->bdrv_snapshot_create)
> -        return drv->bdrv_snapshot_create(bs, sn_info);
> -    if (bs->file)
> -        return bdrv_snapshot_create(bs->file, sn_info);
> -    return -ENOTSUP;
> +    int ret;
> +
> +    if (!drv) {
> +        error_set(errp, QERR_DEVICE_HAS_NO_MEDIUM, bdrv_get_device_name(bs));

We should only use QERR_ macros for the errors listed in the ErrorClass enum (except GenericError), all other errors should generally use error_setg(), like this:

    error_setg(errp, "device '%s' has no medium");

> +        ret = -ENOMEDIUM;

And, usually, we should get rid of errno propagation. There are two cases here:

 1. errno is propagated up so that upper layers can print a decent error message to the user. In this case, it's safe to eliminate errno. error_setg() will store a decent message already and the Error object can be propagated up.

 2. errno is propagated up so that upper layers can distinguish among error causes and take different actions accordingly.

Doesn't seem to be the case of bdrv_snapshot_create() (ie. errno is only used to communicate the error to the user). However, I'm pretty sure that such usage exists in qemu and the error API will break it, as most of our errors are generic.

I see two solutions to this problem:

 A. Add specific errors to ErrorClass. I don't like this very much, as it's possible that such errors are going to be useful only internally.

 B. Add two new functions:

      void error_sete(Error **err, ErrorClass err_class, int errno, const char *fmt, ...);
      int error_get_errno(const Error **err);

    So that we can maintain errno when it's used to communicate error cause among functions.

> +    } else if (drv->bdrv_snapshot_create) {
> +        ret = drv->bdrv_snapshot_create(bs, sn_info, errp);
> +    } else if (bs->file) {
> +        ret = bdrv_snapshot_create(bs->file, sn_info, errp);
> +    } else {
> +        error_set(errp, QERR_NOT_SUPPORTED);
> +        ret = -ENOTSUP;
> +    }
> +
> +    return ret;
>  }
>
>  int bdrv_snapshot_goto(BlockDriverState *bs,
> diff --git a/block.h b/block.h
> index 2e2be11..92e782b 100644
> --- a/block.h
> +++ b/block.h
> @@ -296,7 +296,8 @@ int bdrv_can_snapshot(BlockDriverState *bs);
>  int bdrv_is_snapshot(BlockDriverState *bs);
>  BlockDriverState *bdrv_snapshots(void);
>  int bdrv_snapshot_create(BlockDriverState *bs,
> -                         QEMUSnapshotInfo *sn_info);
> +                         QEMUSnapshotInfo *sn_info,
> +                         Error **errp);
>  int bdrv_snapshot_goto(BlockDriverState *bs,
>                         const char *snapshot_id);
>  int bdrv_snapshot_delete(BlockDriverState *bs, const char *snapshot_id);
> diff --git a/block/qcow2-snapshot.c b/block/qcow2-snapshot.c
> index 4e7c93b..cf86dae 100644
> --- a/block/qcow2-snapshot.c
> +++ b/block/qcow2-snapshot.c
> @@ -25,6 +25,7 @@
>  #include "qemu-common.h"
>  #include "block_int.h"
>  #include "block/qcow2.h"
> +#include "qerror.h"
>
>  typedef struct QEMU_PACKED QCowSnapshotHeader {
>      /* header is 8 byte aligned */
> @@ -312,7 +313,9 @@ static int find_snapshot_by_id_or_name(BlockDriverState *bs, const char *name)
>  }
>
>  /* if no id is provided, a new one is constructed */
> -int qcow2_snapshot_create(BlockDriverState *bs, QEMUSnapshotInfo *sn_info)
> +int qcow2_snapshot_create(BlockDriverState *bs,
> +                          QEMUSnapshotInfo *sn_info,
> +                          Error **errp)
>  {
>      BDRVQcowState *s = bs->opaque;
>      QCowSnapshot *new_snapshot_list = NULL;
> @@ -331,6 +334,8 @@ int qcow2_snapshot_create(BlockDriverState *bs, QEMUSnapshotInfo *sn_info)
>
>      /* Check that the ID is unique */
>      if (find_snapshot_by_id(bs, sn_info->id_str) >= 0) {
> +        error_set(errp, QERR_INVALID_PARAMETER_VALUE,
> +                  "name", "non-existing id identifier");
>          return -EEXIST;
>      }
>
> @@ -415,6 +420,8 @@ fail:
>      g_free(sn->name);
>      g_free(l1_table);
>
> +
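A minimal sketch of what option B from the review could look like, offered only as an illustration. Every name below (DemoError, demo_error_sete, demo_error_get_errno, demo_snapshot_create) is hypothetical and none of it is QEMU's real Error API. Two deliberate deviations from the proposed prototypes: the errno parameter is called os_errno, because errno is a macro in <errno.h> and cannot be used as a parameter name, and the getter takes a plain const pointer rather than a pointer to a pointer.

```c
#include <assert.h>
#include <errno.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-ins, not QEMU's ErrorClass or Error definitions. */
typedef enum { DEMO_ERROR_CLASS_GENERIC } DemoErrorClass;

typedef struct DemoError {
    DemoErrorClass err_class;
    int os_errno;              /* positive errno value, 0 when none applies */
    char msg[128];             /* human-readable message for the user */
} DemoError;

/* Store class, errno and a formatted message in *errp.
 * A NULL errp means the caller is not interested, so nothing is stored. */
void demo_error_sete(DemoError **errp, DemoErrorClass err_class,
                     int os_errno, const char *fmt, ...)
{
    va_list ap;
    DemoError *err;

    assert(fmt != NULL);
    if (errp == NULL) {
        return;
    }
    err = calloc(1, sizeof(*err));
    if (err == NULL) {
        abort();
    }
    err->err_class = err_class;
    err->os_errno = os_errno;
    va_start(ap, fmt);
    vsnprintf(err->msg, sizeof(err->msg), fmt, ap);
    va_end(ap);
    *errp = err;
}

/* Let an upper layer branch on the cause without a second return channel. */
int demo_error_get_errno(const DemoError *err)
{
    return err != NULL ? err->os_errno : 0;
}

/* Toy callee: returns only 0 or -1; the cause travels in the error object. */
int demo_snapshot_create(int has_driver_support, DemoError **errp)
{
    if (!has_driver_support) {
        demo_error_sete(errp, DEMO_ERROR_CLASS_GENERIC, ENOTSUP,
                        "snapshots are not supported by device '%s'",
                        "demo0");
        return -1;
    }
    return 0;
}
```

With this shape a caller that only wants to tell the user what happened reads msg, while a caller that needs to react differently to, say, ENOTSUP asks demo_error_get_errno() instead of relying on a negative errno return value.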
[Qemu-devel] [Bug 1037606] Re: vmwgfx does not work with kvm vmware vga
** Bug watch added: Linux Kernel Bug Tracker #46711
   http://bugzilla.kernel.org/show_bug.cgi?id=46711

** Also affects: linux via
   http://bugzilla.kernel.org/show_bug.cgi?id=46711
   Importance: Unknown
       Status: Unknown

--
You received this bug notification because you are a member of qemu-devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1037606

Title:
  vmwgfx does not work with kvm vmware vga

Status in The Linux Kernel:
  Unknown
Status in QEMU:
  New
Status in “linux” package in Ubuntu:
  Triaged
Re: [Qemu-devel] Is is possible to virtualise or share the TPM?
On 08/30/2012 10:21 AM, Jordi Cucurull Juan wrote:
> Dear Stefan,
>
> What does it mean that the patches with the VTPM functionality exist but they are behind the regular ones? Does it mean that they are not currently updated? That they have less priority?

It means that in my patch queue they are 'behind' the ones I posted over the last few months.

   Stefan

> Best regards,
> Jordi.
>
> On 08/29/2012 02:57 PM, Stefan Berger wrote:
>> On 08/23/2012 04:05 PM, Corey Bryant wrote:
>>> On 08/21/2012 06:31 AM, Jordi Cucurull Juan wrote:
>>>> Dear all,
>>>>
>>>> After applying the TPM patches to QEMU, I was wondering if it is possible to simultaneously use the TPM in more than one virtual machine, i.e. virtualisation of the TPM. According to the paper "Stefan Berger, Ramón Cáceres, Kenneth A. Goldman, Ronald Perez, Reiner Sailer, Leendert van Doorn. vTPM: Virtualizing the Trusted Platform Module" this seems to be possible in Xen. Is it not possible in QEMU?
>>>>
>>>> Thanks!
>>>> Jordi.
>>>
>>> I don't think the pass-through driver supports use by multiple VMs. Stefan Berger should be able to answer better so I'm adding him to the thread.
>>
>> The pass-through driver cannot provide access for multiple VMs to the single hardware TPM on the host. The usage model and the statefulness of the TPM (SRK password, owner password, keys) basically prevent/complicate this. The implementation for Xen was independent of the Qemu code base today and there we used a software implementation of the TPM that provided a private TPM instance to each VM. I have patches for this for Qemu but due to an IRC chat in Sept. 2011 they are 'behind' the pass-through driver patches.
>>
>> Stefan
Re: [Qemu-devel] [PATCH 03/18] block: add error parameter to bdrv_snapshot_goto() and related functions
On Wed, 15 Aug 2012 09:41:44 +0200
Pavel Hrdina phrd...@redhat.com wrote:

> Signed-off-by: Pavel Hrdina phrd...@redhat.com
> ---
>  block.c                | 26 +++---
>  block.h                |  3 ++-
>  block/qcow2-snapshot.c | 11 ---
>  block/qcow2.h          |  4 +++-
>  block/rbd.c            |  6 +-
>  block/sheepdog.c       | 16 +---
>  block_int.h            |  3 ++-
>  qemu-img.c             |  2 +-
>  savevm.c               |  2 +-
>  9 files changed, 46 insertions(+), 27 deletions(-)
>
> diff --git a/block.c b/block.c
> index 8bc49b7..ad25184 100644
> --- a/block.c
> +++ b/block.c
> @@ -2683,29 +2683,33 @@ int bdrv_snapshot_create(BlockDriverState *bs,
>  }
>
>  int bdrv_snapshot_goto(BlockDriverState *bs,
> -                       const char *snapshot_id)
> +                       const char *snapshot_id,
> +                       Error **errp)
>  {
>      BlockDriver *drv = bs->drv;
>      int ret, open_ret;
>
> -    if (!drv)
> -        return -ENOMEDIUM;
> -    if (drv->bdrv_snapshot_goto)
> -        return drv->bdrv_snapshot_goto(bs, snapshot_id);
> -
> -    if (bs->file) {
> +    if (!drv) {
> +        error_set(errp, QERR_DEVICE_HAS_NO_MEDIUM, bdrv_get_device_name(bs));
> +        ret = -ENOMEDIUM;

Most of the comments I made in 02/18 apply for this patch (and probably the next ones too), so to summarize:

 1. The QERR_ macros are deprecated, we should use error_setg() instead

 2. As a general rule, only the Error object should be propagated up (ie. we shouldn't propagate the Error object _and_ errno). It's debatable if we can get rid of errno, if we can't then we'll have to extend the error API to embed errno. We have to discuss this with block layer guys, preferably before doing large series.

One more comment below.

> +    } else if (drv->bdrv_snapshot_goto) {
> +        ret = drv->bdrv_snapshot_goto(bs, snapshot_id, errp);
> +    } else if (bs->file) {
>          drv->bdrv_close(bs);
> -        ret = bdrv_snapshot_goto(bs->file, snapshot_id);
> +        ret = bdrv_snapshot_goto(bs->file, snapshot_id, errp);
>          open_ret = drv->bdrv_open(bs, bs->open_flags);
>          if (open_ret < 0) {
>              bdrv_delete(bs->file);
>              bs->drv = NULL;
> -            return open_ret;
> +            error_set(errp, QERR_OPEN_FILE_FAILED, bdrv_get_device_name(bs));
> +            ret = open_ret;
>          }
> -        return ret;
> +    } else {
> +        error_set(errp, QERR_NOT_SUPPORTED);
> +        ret = -ENOTSUP;
>      }
>
> -    return -ENOTSUP;
> +    return ret;
>  }
>
>  int bdrv_snapshot_delete(BlockDriverState *bs, const char *snapshot_id)
> diff --git a/block.h b/block.h
> index 92e782b..11edcd3 100644
> --- a/block.h
> +++ b/block.h
> @@ -299,7 +299,8 @@ int bdrv_snapshot_create(BlockDriverState *bs,
>                           QEMUSnapshotInfo *sn_info,
>                           Error **errp);
>  int bdrv_snapshot_goto(BlockDriverState *bs,
> -                       const char *snapshot_id);
> +                       const char *snapshot_id,
> +                       Error **errp);
>  int bdrv_snapshot_delete(BlockDriverState *bs, const char *snapshot_id);
>  int bdrv_snapshot_list(BlockDriverState *bs,
>                         QEMUSnapshotInfo **psn_info);
> diff --git a/block/qcow2-snapshot.c b/block/qcow2-snapshot.c
> index cf86dae..8a87b0c 100644
> --- a/block/qcow2-snapshot.c
> +++ b/block/qcow2-snapshot.c
> @@ -426,7 +426,9 @@ fail:
>  }
>
>  /* copy the snapshot 'snapshot_name' into the current disk image */
> -int qcow2_snapshot_goto(BlockDriverState *bs, const char *snapshot_id)
> +int qcow2_snapshot_goto(BlockDriverState *bs,
> +                        const char *snapshot_id,
> +                        Error **errp)
>  {
>      BDRVQcowState *s = bs->opaque;
>      QCowSnapshot *sn;
> @@ -438,13 +440,13 @@ int qcow2_snapshot_goto(BlockDriverState *bs, const char *snapshot_id)
>      /* Search the snapshot */
>      snapshot_index = find_snapshot_by_id_or_name(bs, snapshot_id);
>      if (snapshot_index < 0) {
> +        error_set(errp, QERR_OPEN_FILE_FAILED, snapshot_id);
>          return -ENOENT;
>      }
>      sn = &s->snapshots[snapshot_index];
>
>      if (sn->disk_size != bs->total_sectors * BDRV_SECTOR_SIZE) {
> -        error_report("qcow2: Loading snapshots with different disk "
> -            "size is not implemented");
> +        error_set(errp, QERR_NOT_SUPPORTED);
>          ret = -ENOTSUP;
>          goto fail;
>      }
> @@ -536,6 +538,9 @@ int qcow2_snapshot_goto(BlockDriverState *bs, const char *snapshot_id)
>
>  fail:
>      g_free(sn_l1_table);
> +    if (!error_is_set(errp)) {
> +        error_set(errp, QERR_GENERIC_ERROR, ret);
> +    }
>      return ret;
>  }
>
> diff --git a/block/qcow2.h b/block/qcow2.h
> index 854bd12..6babb56 100644
> --- a/block/qcow2.h
> +++ b/block/qcow2.h
> @@ -311,7 +311,9 @@ int qcow2_zero_clusters(BlockDriverState *bs, uint64_t offset, int nb_sectors);
>
>  int qcow2_snapshot_create(BlockDriverState *bs,
>                            QEMUSnapshotInfo
Re: [Qemu-devel] Is is possible to virtualise or share the TPM?
Do you refer to the patches that add TPM support to the SeaBIOS? If this is the case, this is just a completely virtual TPM without any link with the TPM of the physical machine, right?

Jordi.

On 08/30/2012 04:50 PM, Stefan Berger wrote:
> On 08/30/2012 10:21 AM, Jordi Cucurull Juan wrote:
>> Dear Stefan,
>>
>> What does it mean that the patches with the VTPM functionality exist but they are behind the regular ones? Does it mean that they are not currently updated? That they have less priority?
>
> It means that in my patch queue they are 'behind' the ones I posted over the last few months.
>
>    Stefan
>
>> Best regards,
>> Jordi.
>>
>> On 08/29/2012 02:57 PM, Stefan Berger wrote:
>>> On 08/23/2012 04:05 PM, Corey Bryant wrote:
>>>> On 08/21/2012 06:31 AM, Jordi Cucurull Juan wrote:
>>>>> Dear all,
>>>>>
>>>>> After applying the TPM patches to QEMU, I was wondering if it is possible to simultaneously use the TPM in more than one virtual machine, i.e. virtualisation of the TPM. According to the paper "Stefan Berger, Ramón Cáceres, Kenneth A. Goldman, Ronald Perez, Reiner Sailer, Leendert van Doorn. vTPM: Virtualizing the Trusted Platform Module" this seems to be possible in Xen. Is it not possible in QEMU?
>>>>>
>>>>> Thanks!
>>>>> Jordi.
>>>>
>>>> I don't think the pass-through driver supports use by multiple VMs. Stefan Berger should be able to answer better so I'm adding him to the thread.
>>>
>>> The pass-through driver cannot provide access for multiple VMs to the single hardware TPM on the host. The usage model and the statefulness of the TPM (SRK password, owner password, keys) basically prevent/complicate this. The implementation for Xen was independent of the Qemu code base today and there we used a software implementation of the TPM that provided a private TPM instance to each VM. I have patches for this for Qemu but due to an IRC chat in Sept. 2011 they are 'behind' the pass-through driver patches.
>>>
>>> Stefan
Re: [Qemu-devel] CPU hotplug
Hello,

On 30.08.2012 11:06, Stefan Priebe wrote:
> I tried latest 1.2rc1 kvm-qemu with vanilla kernel v3.5.2 but the VM just crashes when sending cpu_set X online through qm monitor.

For SLES we're carrying a patch by Kamalesh Babulal that prevents this (BNC#747339). Don't know why that patch is not in upstream - Bo Yang?

Regards,
Andreas

--
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
Re: [Qemu-devel] [PATCH] sheepdog: fix savevm and loadvm
On 29.08.2012 20:39, MORITA Kazutaka wrote:
> This patch sets data to be sent to Sheepdog correctly and fixes savevm and loadvm operations on a Sheepdog image.
>
> Signed-off-by: MORITA Kazutaka morita.kazut...@lab.ntt.co.jp

Thanks, applied to the block branch.

Kevin
Re: [Qemu-devel] [Spice-devel] [PATCH] Add new client_present and client capabilities fields to QXLRom
Gerd Hoffmann kra...@redhat.com writes:

>> The scheme I had in mind was this:
>>
>> - When a new non-a8-capable client appears, don't send it any of the a8 surfaces
>>
>> - If the client doesn't understand a8 surfaces,
>>
>>     - keep all a8 surfaces rendered on the server side
>>
>>     - if the guest sends a command using an a8 surface as a destination, simply render the command on the server side
>>
>>     - if the client sends a command using an a8 surface as a source, rewrite the image object to be a real image referring to the server side bits (which are also sent or possibly cached) rather than a surface
>
> Hmm, when the server is able to translate a8 ops into non-a8 ops using server-side rendering, then there is no need to notify the guest about the client capabilities.

To be clear, this ability doesn't exist at the moment, and it would be a significant chunk of work to add it.

>> But it's much simpler to just say that the guest should stop referring to a8 surfaces if the client can't handle them.
>
> Not sure about that, this move might just shift the complexity from spice-server to the guest qxl driver.

The ability to handle this is already pretty much present in at least the X driver (and I'm pretty sure the Windows driver has it as well) because any time something can't be expressed in the SPICE protocol, it has to fall back to software rendering. I.e., it has to read all the involved surfaces back from video memory, do software rendering, then upload the result as an image.

Dealing with a disappearing ability to handle a8 surfaces would simply be a matter of reading back the a8 surfaces to guest RAM and then not attempting to accelerate any operations involving them any more.

It looks much more involved to do it in spice-server because it would probably involve adding a new concept of "emulated surface" that needs to be handled specially in a bunch of cases.

Søren
Re: [Qemu-devel] CPU hotplug
30.08.2012 19:41, Andreas Färber wrote:
> Hello,
>
> On 30.08.2012 11:06, Stefan Priebe wrote:
>> I tried latest 1.2rc1 kvm-qemu with vanilla kernel v3.5.2 but the VM just crashes when sending cpu_set X online through qm monitor.
>
> For SLES we're carrying a patch by Kamalesh Babulal that prevents this (BNC#747339).

The same is for debian/ubuntu: http://bugs.debian.org/680551

/mjt
Re: [Qemu-devel] Is is possible to virtualise or share the TPM?
On 08/30/2012 11:40 AM, Jordi Cucurull Juan wrote:
> Do you refer to the patches that add TPM support to the SeaBIOS?

Sorry for the confusion. What I meant is that the patches adding support for a private vTPM for each QEMU VM are 'behind' those adding support for the passthrough device model. There are SeaBIOS patches as well adding support for TPM, but those are different.

> If this is the case, this is just a completely virtual TPM without any link with the TPM of the physical machine, right?

The SeaBIOS patches don't do that. They just add TPM BIOS support for TPM initialization, ACPI tables etc. To add a completely virtual TPM to QEMU a completely different device model is necessary than the one I have recently posted.

   Stefan

> Jordi.
>
> On 08/30/2012 04:50 PM, Stefan Berger wrote:
>> On 08/30/2012 10:21 AM, Jordi Cucurull Juan wrote:
>>> Dear Stefan,
>>>
>>> What does it mean that the patches with the VTPM functionality exist but they are behind the regular ones? Does it mean that they are not currently updated? That they have less priority?
>>
>> It means that in my patch queue they are 'behind' the ones I posted over the last few months.
>>
>>    Stefan
>>
>>> Best regards,
>>> Jordi.
>>>
>>> On 08/29/2012 02:57 PM, Stefan Berger wrote:
>>>> On 08/23/2012 04:05 PM, Corey Bryant wrote:
>>>>> On 08/21/2012 06:31 AM, Jordi Cucurull Juan wrote:
>>>>>> Dear all,
>>>>>>
>>>>>> After applying the TPM patches to QEMU, I was wondering if it is possible to simultaneously use the TPM in more than one virtual machine, i.e. virtualisation of the TPM. According to the paper "Stefan Berger, Ramón Cáceres, Kenneth A. Goldman, Ronald Perez, Reiner Sailer, Leendert van Doorn. vTPM: Virtualizing the Trusted Platform Module" this seems to be possible in Xen. Is it not possible in QEMU?
>>>>>>
>>>>>> Thanks!
>>>>>> Jordi.
>>>>>
>>>>> I don't think the pass-through driver supports use by multiple VMs. Stefan Berger should be able to answer better so I'm adding him to the thread.
>>>>
>>>> The pass-through driver cannot provide access for multiple VMs to the single hardware TPM on the host. The usage model and the statefulness of the TPM (SRK password, owner password, keys) basically prevent/complicate this. The implementation for Xen was independent of the Qemu code base today and there we used a software implementation of the TPM that provided a private TPM instance to each VM. I have patches for this for Qemu but due to an IRC chat in Sept. 2011 they are 'behind' the pass-through driver patches.
>>>>
>>>> Stefan
Re: [Qemu-devel] CPU hotplug
On 30.08.2012 at 17:41, Andreas Färber afaer...@suse.de wrote:

> Hello,
>
> On 30.08.2012 11:06, Stefan Priebe wrote:
>> I tried latest 1.2rc1 kvm-qemu with vanilla kernel v3.5.2 but the VM just crashes when sending cpu_set X online through qm monitor.
>
> For SLES we're carrying a patch by Kamalesh Babulal that prevents this (BNC#747339). Don't know why that patch is not in upstream - Bo Yang?

But this just disables CPU hotplug and does not fix it?

Stefan
Re: [Qemu-devel] CPU hotplug
On 30.08.2012 18:35, Stefan Priebe wrote:
> On 30.08.2012 at 17:41, Andreas Färber afaer...@suse.de wrote:
>> On 30.08.2012 11:06, Stefan Priebe wrote:
>>> I tried latest 1.2rc1 kvm-qemu with vanilla kernel v3.5.2 but the VM just crashes when sending cpu_set X online through qm monitor.
>>
>> For SLES we're carrying a patch by Kamalesh Babulal that prevents this (BNC#747339).
>
> But this just disables CPU [hotplug] and does not fix it?

It fixes the crash. Hotplug needs to be implemented first, and this has been taking several months already (for x86, to be specific).

Andreas

--
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
Re: [Qemu-devel] Posix timer syscalls ; dealing with the timer_t type
Hi,

On 30.08.2012 14:30, Erik de Castro Lopo wrote:
> I'm working on implementing Posix timers in linux-user. I'm having trouble figuring out how to handle the timer_t type. Consider the following code with say 32 bit ARM being emulated on 64 bit x86-64:
>
>     timer_t timerid;
>
>     err = timer_create(clockid, &sev, &timerid);
>     err = timer_gettime(timerid, &curr);
>
> The issue is that memory for the timer_t value in the 32 bit target is allocated on the stack (where the timer_t is 4 bytes) but the value is provided by the 64 bit host, where the timer_t is 8 bytes.
>
> Any suggestions on dealing with this?

    typedef target_ulong target_timer_t;

or abi_ulong, or without the u if signed. Depending on where/how you use this, you may need to convert back and forth between host and target values.

Regards,
Andreas

--
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
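One possible way to do the "convert back and forth" step, sketched under assumptions: keep the host timer_t values in a host-side table and hand the guest only a small index that fits in its 4 bytes. The names target_timer_t (here fixed to uint32_t), guest_timer_register, guest_timer_lookup and guest_timer_release are made up for this illustration and are not taken from QEMU's linux-user code; the table size is arbitrary.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

/* Hypothetical guest-visible type: what a 32-bit target stores on its stack. */
typedef uint32_t target_timer_t;

#define MAX_GUEST_TIMERS 32   /* arbitrary limit for the sketch */

static timer_t host_timers[MAX_GUEST_TIMERS];      /* host handles, any width */
static unsigned char slot_used[MAX_GUEST_TIMERS];  /* 1 while a slot is live */

/* Remember a host timer and return a small id for the guest.
 * Returns 0 on success, -1 when the table is full. */
int guest_timer_register(timer_t host_id, target_timer_t *guest_id)
{
    assert(guest_id != NULL);
    for (uint32_t i = 0; i < MAX_GUEST_TIMERS; i++) {
        if (!slot_used[i]) {
            slot_used[i] = 1;
            host_timers[i] = host_id;
            *guest_id = i;
            return 0;
        }
    }
    return -1;
}

/* Translate a guest id back into the host handle.
 * Returns -1 for ids the guest never received or already released. */
int guest_timer_lookup(target_timer_t guest_id, timer_t *host_id)
{
    assert(host_id != NULL);
    if (guest_id >= MAX_GUEST_TIMERS || !slot_used[guest_id]) {
        return -1;
    }
    *host_id = host_timers[guest_id];
    return 0;
}

/* Free the slot once the guest has deleted the timer. */
int guest_timer_release(target_timer_t guest_id)
{
    if (guest_id >= MAX_GUEST_TIMERS || !slot_used[guest_id]) {
        return -1;
    }
    slot_used[guest_id] = 0;
    return 0;
}
```

In an emulated timer_create() the host call would fill a real timer_t, guest_timer_register() would produce the value written to guest memory, and later syscalls such as timer_gettime() would start with guest_timer_lookup(). The sketch ignores locking and byte-swapping of the id, both of which a real implementation would need to consider.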
[Qemu-devel] [PATCH] qxl: dont update invalid area
From: Dunrong Huang riegama...@gmail.com

This patch fixes the following error:

$ ~/usr/bin/qemu-system-x86_64 -enable-kvm -m 1024 -spice port=5900,disable-ticketing -vga qxl -cdrom ~/Images/linuxmint-13-mate-dvd-32bit.iso
(/home/mathslinux/usr/bin/qemu-system-x86_64:10068): SpiceWorker-CRITICAL **: red_worker.c:4599:red_update_area: condition `area->left >= 0 && area->top >= 0 && area->left < area->right && area->top < area->bottom' failed
Aborted

The spice server terminates the QEMU process if we pass an invalid area to it, so don't update those invalid areas.

Signed-off-by: Dunrong Huang riegama...@gmail.com
---
 hw/qxl.c |    7 +++
 1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/hw/qxl.c b/hw/qxl.c
index c2dd3b4..10e6bb3 100644
--- a/hw/qxl.c
+++ b/hw/qxl.c
@@ -1385,6 +1385,13 @@ async_common:
         QXLCookie *cookie = NULL;
         QXLRect update = d->ram->update_area;

+        if (update.left < 0 || update.top < 0 || update.left >= update.right ||
+            update.top >= update.bottom) {
+            qxl_set_guest_bug(d, "QXL_IO_UPDATE_AREA: "
+                              "invalid area(%d,%d,%d,%d)\n", update.left,
+                              update.right, update.top, update.bottom);
+            break;
+        }
         if (async == QXL_ASYNC) {
             cookie = qxl_cookie_new(QXL_COOKIE_TYPE_IO,
                                     QXL_IO_UPDATE_AREA_ASYNC);
--
1.7.8.6
Re: [Qemu-devel] [PATCH 12/18] savevm: add error parameter to qemu_loadvm_state()
On Wed, 15 Aug 2012 09:41:53 +0200
Pavel Hrdina phrd...@redhat.com wrote:

> Signed-off-by: Pavel Hrdina phrd...@redhat.com
> ---
>  migration.c |  2 +-
>  savevm.c    | 44
>  sysemu.h    |  3 ++-
>  3 files changed, 31 insertions(+), 18 deletions(-)
>
> diff --git a/migration.c b/migration.c
> index ec2f267..f048faf 100644
> --- a/migration.c
> +++ b/migration.c
> @@ -88,7 +88,7 @@ int qemu_start_incoming_migration(const char *uri, Error **errp)
>
>  void process_incoming_migration(QEMUFile *f)
>  {
> -    if (qemu_loadvm_state(f) < 0) {
> +    if (qemu_loadvm_state(f, NULL) < 0) {
>          fprintf(stderr, "load of migration failed\n");
>          exit(0);
>      }
> diff --git a/savevm.c b/savevm.c
> index 0d54115..500eb72 100644
> --- a/savevm.c
> +++ b/savevm.c
> @@ -1916,7 +1916,8 @@ typedef struct LoadStateEntry {
>      int version_id;
>  } LoadStateEntry;
>
> -int qemu_loadvm_state(QEMUFile *f)
> +int qemu_loadvm_state(QEMUFile *f,
> +                      Error **errp)
>  {
>      QLIST_HEAD(, LoadStateEntry) loadvm_handlers =
>          QLIST_HEAD_INITIALIZER(loadvm_handlers);
> @@ -1925,21 +1926,26 @@ int qemu_loadvm_state(QEMUFile *f)
>      unsigned int v;
>      int ret;
>
> -    if (qemu_savevm_state_blocked(NULL)) {
> -        return -EINVAL;
> +    if (qemu_savevm_state_blocked(errp)) {
> +        return -ENOTSUP;
>      }
>
>      v = qemu_get_be32(f);
> -    if (v != QEMU_VM_FILE_MAGIC)
> +    if (v != QEMU_VM_FILE_MAGIC) {
> +        error_set(errp, ERROR_CLASS_GENERIC_ERROR,
> +                  "Unknown vm-state file magic");
>          return -EINVAL;
> +    }
>
>      v = qemu_get_be32(f);
>      if (v == QEMU_VM_FILE_VERSION_COMPAT) {
> -        fprintf(stderr, "SaveVM v2 format is obsolete and don't work anymore\n");
> +        error_set(errp, QERR_NOT_SUPPORTED);
>          return -ENOTSUP;
>      }
> -    if (v != QEMU_VM_FILE_VERSION)
> +    if (v != QEMU_VM_FILE_VERSION) {
> +        error_set(errp, QERR_NOT_SUPPORTED);
>          return -ENOTSUP;
> +    }
>
>      while ((section_type = qemu_get_byte(f)) != QEMU_VM_EOF) {
>          uint32_t instance_id, version_id, section_id;
> @@ -1961,15 +1967,18 @@ int qemu_loadvm_state(QEMUFile *f)
>              /* Find savevm section */
>              se = find_se(idstr, instance_id);
>              if (se == NULL) {
> -                fprintf(stderr, "Unknown savevm section or instance '%s' %d\n", idstr, instance_id);
> +                error_set(errp, ERROR_CLASS_GENERIC_ERROR,
> +                          "Unknown savevm section or instance '%s' %d",
> +                          idstr, instance_id);
>                  ret = -EINVAL;
>                  goto out;
>              }
>
>              /* Validate version */
>              if (version_id > se->version_id) {
> -                fprintf(stderr, "savevm: unsupported version %d for '%s' v%d\n",
> -                        version_id, idstr, se->version_id);
> +                error_set(errp, ERROR_CLASS_GENERIC_ERROR,
> +                          "savevm: unsupported version %d for '%s' v%d",
> +                          version_id, idstr, se->version_id);
>                  ret = -EINVAL;
>                  goto out;
>              }
> @@ -1984,8 +1993,7 @@ int qemu_loadvm_state(QEMUFile *f)
>
>              ret = vmstate_load(f, le->se, le->version_id);
>              if (ret < 0) {
> -                fprintf(stderr, "qemu: warning: error while loading state for instance 0x%x of device '%s'\n",
> -                        instance_id, idstr);
> +                error_set(errp, QERR_GENERIC_ERROR, ret);
>                  goto out;
>              }
>              break;
> @@ -1999,20 +2007,21 @@ int qemu_loadvm_state(QEMUFile *f)
>                  }
>              }
>              if (le == NULL) {
> -                fprintf(stderr, "Unknown savevm section %d\n", section_id);
> +                error_set(errp, ERROR_CLASS_GENERIC_ERROR,
> +                          "Unknown savevm section %d", section_id);

You sure that this error message will be printed to the terminal? This has to be done by the caller.

>                  ret = -EINVAL;
>                  goto out;
>              }
>
>              ret = vmstate_load(f, le->se, le->version_id);
>              if (ret < 0) {
> -                fprintf(stderr, "qemu: warning: error while loading state section id %d\n",
> -                        section_id);
> +                error_set(errp, QERR_GENERIC_ERROR, ret);
>                  goto out;
>              }
>              break;
>          default:
> -            fprintf(stderr, "Unknown savevm section type %d\n", section_type);
> +            error_set(errp, ERROR_CLASS_GENERIC_ERROR,
> +                      "Unknown savevm section type %d", section_type);
>              ret = -EINVAL;
>              goto out;
>          }
> @@ -2030,6 +2039,9 @@ out:
>
>      if (ret == 0) {
>          ret = qemu_file_get_error(f);
> +        if (ret < 0) {
> +            error_set(errp, QERR_GENERIC_ERROR, ret);
> +        }
>      }
>
>      return ret;
> @@ -2297,7
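The reviewer's point, that storing a message in an error object prints nothing until some caller reports it, can be illustrated with a toy model. All names here (DemoLoadError, demo_set_error, demo_load_section, demo_load_and_report) are hypothetical and only mimic the calling convention; they are not QEMU functions.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical minimal error object; QEMU's real Error type is richer. */
typedef struct DemoLoadError {
    char msg[96];
} DemoLoadError;

/* Record a message. With errp == NULL the message is silently dropped,
 * which is what happens when a caller passes NULL for the error argument. */
static void demo_set_error(DemoLoadError **errp, const char *what, int value)
{
    DemoLoadError *err;

    if (errp == NULL) {
        return;
    }
    err = calloc(1, sizeof(*err));
    if (err == NULL) {
        abort();
    }
    snprintf(err->msg, sizeof(err->msg), "%s %d", what, value);
    *errp = err;
}

/* Callee: reports failure through errp and prints nothing itself.
 * Only section id 1 is "known" in this toy model. */
int demo_load_section(int section_id, DemoLoadError **errp)
{
    if (section_id != 1) {
        demo_set_error(errp, "Unknown savevm section", section_id);
        return -1;
    }
    return 0;
}

/* Caller: owns the error object, so it prints and frees it.
 * Returns 1 if a message was written to out, 0 otherwise. */
int demo_load_and_report(int section_id, FILE *out)
{
    DemoLoadError *err = NULL;
    int printed = 0;

    assert(out != NULL);
    if (demo_load_section(section_id, &err) < 0 && err != NULL) {
        fprintf(out, "%s\n", err->msg);
        printed = 1;
    }
    free(err);
    return printed;
}
```

In this model a caller that passes NULL, as process_incoming_migration() does in the quoted hunk, still sees the failing return value but loses the specific message.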
Re: [Qemu-devel] [PATCH 00/18] qapi: Convert savevm, loadvm, delvm and info snapshots
On Wed, 15 Aug 2012 09:41:41 +0200 Pavel Hrdina phrd...@redhat.com wrote: This patch series converts these commands into qapi and introduces QMP commands vm-snapshot-save, vm-snapshot-load, vm-snapshot-delete and query-vm-snapshots. It also rewrites error reporting for the functions used by these commands. Unfortunately, most of the error conversions are wrong. I've commented on them, but the most important thing here is to decide how we should propagate the Error object in the block layer and what to do with its errno usage. It's better to discuss this first before doing large changes. It might also be worth it to split this series and work on error propagation first. I've CC'ed the block layer guys in one of my reviews to this series.
Re: [Qemu-devel] CPU hotplug
Am 30.08.2012 18:43, schrieb Andreas Färber: Am 30.08.2012 18:35, schrieb Stefan Priebe: Am 30.08.2012 um 17:41 schrieb Andreas Färber afaer...@suse.de: Am 30.08.2012 11:06, schrieb Stefan Priebe: I tried latest 1.2rc1 kvm-qemu with vanilla kernel v3.5.2 but the VM just crashes when sending cpu_set X online through qm monitor. For SLES we're carrying a patch by Kamalesh Babulal that prevents this (BNC#747339). But this just disables CPU [hotplug] and does Not fix it? It fixes the crash. Hotplug needs to be implemented first, and this has been taking several months already (for x86, to be specific). Mhm RHEL 6.3 claims to support this? Dynamic virtual CPU allocation = https://access.redhat.com/knowledge/docs/en-US/Red_Hat_Enterprise_Linux/6/html/6.3_Release_Notes/virtualization.html Greets, Stefan
Re: [Qemu-devel] [PATCH v7 5/6] add the QKeyCode enum and the key_defs table
On Mon, 20 Aug 2012 12:39:28 +0800 Amos Kong ak...@redhat.com wrote: key_defs[] in monitor.c is a mapping table of keys and keycodes, this patch added a QKeyCode enum and a new key_defs table, key's index in the enum is same as keycode's index in new key_defs[]. And added two helper functions to convert key/code to index of mapping table, those functions will return Q_KEY_CODE_MAX if the code/key is invalid. 'key_defs' was dropped from the monitor, monitor functions were changed to access key_defs directly. Signed-off-by: Amos Kong ak...@redhat.com This patch (and probably the next one too) doesn't apply on master, could you rebase please? As you'll have to respin, please change versions to 1.3.0. I also have comments below (minor, but as you'll respin anyway). --- console.h|6 ++ input.c | 186 ++ monitor.c| 183 +++- qapi-schema.json | 26 4 files changed, 229 insertions(+), 172 deletions(-) diff --git a/console.h b/console.h index 4334db5..7934b11 100644 --- a/console.h +++ b/console.h @@ -6,6 +6,7 @@ #include notify.h #include monitor.h #include trace.h +#include qapi-types.h /* keyboard/mouse support */ @@ -397,4 +398,9 @@ static inline int vnc_display_pw_expire(DisplayState *ds, time_t expires) /* curses.c */ void curses_display_init(DisplayState *ds, int full_screen); +/* input.c */ +extern const int key_defs[]; Why are you exporting key_defs[]? It should be static. Also, it would be better to move the addition of QKeyCode out of this patch (ie. you first add it and then move the table). 
+int index_from_key(const char *key); +int index_from_keycode(int code); + #endif diff --git a/input.c b/input.c index 6968b31..5630cb1 100644 --- a/input.c +++ b/input.c @@ -37,6 +37,192 @@ static QTAILQ_HEAD(, QEMUPutMouseEntry) mouse_handlers = static NotifierList mouse_mode_notifiers = NOTIFIER_LIST_INITIALIZER(mouse_mode_notifiers); +const int key_defs[] = { +[Q_KEY_CODE_SHIFT] = 0x2a, +[Q_KEY_CODE_SHIFT_R] = 0x36, + +[Q_KEY_CODE_ALT] = 0x38, +[Q_KEY_CODE_ALT_R] = 0xb8, +[Q_KEY_CODE_ALTGR] = 0x64, +[Q_KEY_CODE_ALTGR_R] = 0xe4, +[Q_KEY_CODE_CTRL] = 0x1d, +[Q_KEY_CODE_CTRL_R] = 0x9d, + +[Q_KEY_CODE_MENU] = 0xdd, + +[Q_KEY_CODE_ESC] = 0x01, + +[Q_KEY_CODE_1] = 0x02, +[Q_KEY_CODE_2] = 0x03, +[Q_KEY_CODE_3] = 0x04, +[Q_KEY_CODE_4] = 0x05, +[Q_KEY_CODE_5] = 0x06, +[Q_KEY_CODE_6] = 0x07, +[Q_KEY_CODE_7] = 0x08, +[Q_KEY_CODE_8] = 0x09, +[Q_KEY_CODE_9] = 0x0a, +[Q_KEY_CODE_0] = 0x0b, +[Q_KEY_CODE_MINUS] = 0x0c, +[Q_KEY_CODE_EQUAL] = 0x0d, +[Q_KEY_CODE_BACKSPACE] = 0x0e, + +[Q_KEY_CODE_TAB] = 0x0f, +[Q_KEY_CODE_Q] = 0x10, +[Q_KEY_CODE_W] = 0x11, +[Q_KEY_CODE_E] = 0x12, +[Q_KEY_CODE_R] = 0x13, +[Q_KEY_CODE_T] = 0x14, +[Q_KEY_CODE_Y] = 0x15, +[Q_KEY_CODE_U] = 0x16, +[Q_KEY_CODE_I] = 0x17, +[Q_KEY_CODE_O] = 0x18, +[Q_KEY_CODE_P] = 0x19, +[Q_KEY_CODE_BRACKET_LEFT] = 0x1a, +[Q_KEY_CODE_BRACKET_RIGHT] = 0x1b, +[Q_KEY_CODE_RET] = 0x1c, + +[Q_KEY_CODE_A] = 0x1e, +[Q_KEY_CODE_S] = 0x1f, +[Q_KEY_CODE_D] = 0x20, +[Q_KEY_CODE_F] = 0x21, +[Q_KEY_CODE_G] = 0x22, +[Q_KEY_CODE_H] = 0x23, +[Q_KEY_CODE_J] = 0x24, +[Q_KEY_CODE_K] = 0x25, +[Q_KEY_CODE_L] = 0x26, +[Q_KEY_CODE_SEMICOLON] = 0x27, +[Q_KEY_CODE_APOSTROPHE] = 0x28, +[Q_KEY_CODE_GRAVE_ACCENT] = 0x29, + +[Q_KEY_CODE_BACKSLASH] = 0x2b, +[Q_KEY_CODE_Z] = 0x2c, +[Q_KEY_CODE_X] = 0x2d, +[Q_KEY_CODE_C] = 0x2e, +[Q_KEY_CODE_V] = 0x2f, +[Q_KEY_CODE_B] = 0x30, +[Q_KEY_CODE_N] = 0x31, +[Q_KEY_CODE_M] = 0x32, +[Q_KEY_CODE_COMMA] = 0x33, +[Q_KEY_CODE_DOT] = 0x34, +[Q_KEY_CODE_SLASH] = 0x35, + +[Q_KEY_CODE_ASTERISK] = 0x37, + 
+[Q_KEY_CODE_SPC] = 0x39, +[Q_KEY_CODE_CAPS_LOCK] = 0x3a, +[Q_KEY_CODE_F1] = 0x3b, +[Q_KEY_CODE_F2] = 0x3c, +[Q_KEY_CODE_F3] = 0x3d, +[Q_KEY_CODE_F4] = 0x3e, +[Q_KEY_CODE_F5] = 0x3f, +[Q_KEY_CODE_F6] = 0x40, +[Q_KEY_CODE_F7] = 0x41, +[Q_KEY_CODE_F8] = 0x42, +[Q_KEY_CODE_F9] = 0x43, +[Q_KEY_CODE_F10] = 0x44, +[Q_KEY_CODE_NUM_LOCK] = 0x45, +[Q_KEY_CODE_SCROLL_LOCK] = 0x46, + +[Q_KEY_CODE_KP_DIVIDE] = 0xb5, +[Q_KEY_CODE_KP_MULTIPLY] = 0x37, +[Q_KEY_CODE_KP_SUBTRACT] = 0x4a, +[Q_KEY_CODE_KP_ADD] = 0x4e, +[Q_KEY_CODE_KP_ENTER] = 0x9c, +[Q_KEY_CODE_KP_DECIMAL] = 0x53, +[Q_KEY_CODE_SYSRQ] = 0x54, + +[Q_KEY_CODE_KP_0] = 0x52, +[Q_KEY_CODE_KP_1] = 0x4f, +[Q_KEY_CODE_KP_2] = 0x50, +[Q_KEY_CODE_KP_3] = 0x51, +[Q_KEY_CODE_KP_4] = 0x4b, +[Q_KEY_CODE_KP_5] = 0x4c, +[Q_KEY_CODE_KP_6] = 0x4d, +[Q_KEY_CODE_KP_7]
[Qemu-devel] [PATCH 0/2] pcie migration fixes
Hi, A couple of pcie related migration fixes that I found while testing q35 migration. Thanks, -Jason Jason Baron (2): pcie: drop version_id field for live migration pcie_aer: clear cmask for Advanced Error Interrupt Message Number hw/pci.c |2 +- hw/pcie.h |1 - hw/pcie_aer.c |6 ++ 3 files changed, 7 insertions(+), 2 deletions(-)
[Qemu-devel] [PATCH 2/2] pcie_aer: clear cmask for Advanced Error Interrupt Message Number
The Advanced Error Interrupt Message Number (bits 31:27 of the Root Error Status Register) is updated when the number of MSI messages assigned to a device changes. Migration of Windows 7 on the q35 chipset failed because the check in get_pci_config_device() fails due to wmask being set on these bits. It's valid to update these bits and we must restore this state across migration.

Signed-off-by: Jason Baron jba...@redhat.com
---
 hw/pcie_aer.c |    6 ++++++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/hw/pcie_aer.c b/hw/pcie_aer.c
index 3b6981c..6edcd79 100644
--- a/hw/pcie_aer.c
+++ b/hw/pcie_aer.c
@@ -738,6 +738,12 @@ void pcie_aer_root_init(PCIDevice *dev)
                  PCI_ERR_ROOT_CMD_EN_MASK);
     pci_set_long(dev->w1cmask + pos + PCI_ERR_ROOT_STATUS,
                  PCI_ERR_ROOT_STATUS_REPORT_MASK);
+    /* Bits 31:27 - Advanced Error Interrupt Message Number
+     * These bits are updated when the number of MSI messages changes.
+     * By clearing the cmask, pcie devices can be migrated.
+     */
+    pci_set_long(dev->cmask + pos + PCI_ERR_ROOT_STATUS,
+                 (1 << PCI_ERR_ROOT_IRQ_SHIFT) - 1);
 }
 
 void pcie_aer_root_reset(PCIDevice *dev)
-- 
1.7.1
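As a rough illustration of what the mask in this patch selects, the sketch below computes it and applies a simplified compatibility check. It is a model under assumptions, not QEMU code: TOY_ERR_ROOT_IRQ_SHIFT is assumed to be 27 (inferred from "bits 31:27" in the commit message), the toy_* names are hypothetical, and the real check in get_pci_config_device() may take further masks into account.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed value: the commit message places the field at bits 31:27. */
#define TOY_ERR_ROOT_IRQ_SHIFT 27

/* Mask with bits 26:0 set, so bits 31:27 are excluded from comparison. */
static uint32_t toy_root_status_cmask(void)
{
    return (UINT32_C(1) << TOY_ERR_ROOT_IRQ_SHIFT) - 1;
}

/* Simplified check: source and destination values must agree on every
 * bit that is selected by cmask. */
static bool toy_config_compatible(uint32_t src, uint32_t dst, uint32_t cmask)
{
    return ((src ^ dst) & cmask) == 0;
}
```

With this mask, a difference confined to bits 31:27 no longer makes the two values incompatible, while a difference in any lower bit still does.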
[Qemu-devel] [PATCH 1/2] pcie: drop version_id field for live migration
While testing q35 live migration, I found that the migration would abort with the following error:

    Unknown savevm section type 76

The error is due to this check failing in 'vmstate_load_state()':

    while(field->name) {
        if ((field->field_exists &&
             field->field_exists(opaque, version_id)) ||
            (!field->field_exists &&
             field->version_id <= version_id)) {

The VMSTATE_PCIE_DEVICE() currently has a 'version_id' set to 2. However, 'version_id' in the above check is 1. And thus we fail to load the pcie device field. Further the code returns to 'qemu_loadvm_state()' which produces the error that I saw.

I'm proposing to fix this by simply dropping the 'version_id' field from VMSTATE_PCIE_DEVICE(). VMSTATE_PCI_DEVICE() defines no such field and further the vmstate_pcie_device that VMSTATE_PCI_DEVICE() refers to is already versioned. Thus, any versioning issues could be detected at the vmsd level.

Taking a step back, I think that the 'field->version_id' should be compared against a saved version number for the field, not the 'version_id'. Furthermore, once vmstate_load_state() is called recursively on another vmsd, the check of:

    if (version_id > vmsd->version_id) {
        return -EINVAL;
    }

will never fail since version_id is always equal to vmsd->version_id. So I'm wondering why we aren't storing the vmsd version id of the source in the migration stream?

This patch also renames the 'name' field of vmstate_pcie_device from: PCIDevice -> PCIEDevice to differentiate it from vmstate_pci_device.
Signed-off-by: Jason Baron jba...@redhat.com
---
 hw/pci.c  |    2 +-
 hw/pcie.h |    1 -
 2 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/hw/pci.c b/hw/pci.c
index 3727afa..5386a4f 100644
--- a/hw/pci.c
+++ b/hw/pci.c
@@ -439,7 +439,7 @@ const VMStateDescription vmstate_pci_device = {
 };
 
 const VMStateDescription vmstate_pcie_device = {
-    .name = "PCIDevice",
+    .name = "PCIEDevice",
     .version_id = 2,
     .minimum_version_id = 1,
     .minimum_version_id_old = 1,
diff --git a/hw/pcie.h b/hw/pcie.h
index b8ab0c7..4889194 100644
--- a/hw/pcie.h
+++ b/hw/pcie.h
@@ -133,7 +133,6 @@ extern const VMStateDescription vmstate_pcie_device;
 
 #define VMSTATE_PCIE_DEVICE(_field, _state) {                        \
     .name = (stringify(_field)),                                     \
-    .version_id = 2,                                                 \
     .size = sizeof(PCIDevice),                                       \
     .vmsd = &vmstate_pcie_device,                                    \
     .flags = VMS_STRUCT,                                             \
-- 
1.7.1
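The field check described in this patch's commit message can be modelled in a few lines. This is only a sketch of the condition as quoted there; ToyField and toy_field_is_loaded() are hypothetical names, not QEMU's VMStateField or vmstate_load_state().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a versioned field description. */
typedef struct ToyField {
    const char *name;
    int version_id; /* first stream version that contains the field */
    bool (*field_exists)(void *opaque, int version_id);
} ToyField;

/* Returns true if the field would be read from a stream of the given
 * version, following the condition quoted in the commit message. */
static bool toy_field_is_loaded(const ToyField *field, void *opaque,
                                int version_id)
{
    assert(field != NULL);

    if (field->field_exists) {
        return field->field_exists(opaque, version_id);
    }
    return field->version_id <= version_id;
}
```

In this model a field marked with version 2 is skipped when the incoming version_id is 1, which matches the failure described in the commit message; a field without an explicit version (0) is always loaded.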
Re: [Qemu-devel] CPU hotplug
On Thu, 30 Aug 2012 19:23:14 +0200 Stefan Priebe s.pri...@profihost.ag wrote: Am 30.08.2012 18:43, schrieb Andreas Färber: Am 30.08.2012 18:35, schrieb Stefan Priebe: Am 30.08.2012 um 17:41 schrieb Andreas Färber afaer...@suse.de: Am 30.08.2012 11:06, schrieb Stefan Priebe: I tried latest 1.2rc1 kvm-qemu with vanilla kernel v3.5.2 but the VM just crashes when sending cpu_set X online through qm monitor. For SLES we're carrying a patch by Kamalesh Babulal that prevents this (BNC#747339). But this just disables CPU [hotplug] and does Not fix it? It fixes the crash. Hotplug needs to be implemented first, and this has been taking several months already (for x86, to be specific). Mhm RHEL 6.3 claims to support this? it's not officially supported, it's just tech-preview. That allows to play with hotplug and uncover possible guest issues early. Dynamic virtual CPU allocation = https://access.redhat.com/knowledge/docs/en-US/Red_Hat_Enterprise_Linux/6/html/6.3_Release_Notes/virtualization.html Greets, Stefan -- Regards, Igor
Re: [Qemu-devel] [PATCH 1/6] qemu-char: Convert MemCharDriver to circular buffer
On Thu, 23 Aug 2012 13:14:21 +0800 Lei Li li...@linux.vnet.ibm.com wrote: Signed-off-by: Lei Li li...@linux.vnet.ibm.com --- qemu-char.c | 96 +++--- qemu-char.h |2 +- 2 files changed, 78 insertions(+), 20 deletions(-) diff --git a/qemu-char.c b/qemu-char.c index 398baf1..b21b93a 100644 --- a/qemu-char.c +++ b/qemu-char.c @@ -2528,38 +2528,96 @@ static CharDriverState *qemu_chr_open_socket(QemuOpts *opts) /***/ /* Memory chardev */ typedef struct { -size_t outbuf_size; -size_t outbuf_capacity; -uint8_t *outbuf; +size_t cbuf_capacity; +size_t cbuf_in; +size_t cbuf_out; +size_t cbuf_count; +uint8_t *cbuf; } MemoryDriver; +static int mem_chr_is_empty(CharDriverState *chr) +{ +MemoryDriver *d = chr-opaque; + +return d-cbuf_count == 0; +} + +static int mem_chr_is_full(CharDriverState *chr) +{ +MemoryDriver *d = chr-opaque; + +return d-cbuf_count == d-cbuf_capacity; +} Please, make them return a bool and chr can be const. + static int mem_chr_write(CharDriverState *chr, const uint8_t *buf, int len) { MemoryDriver *d = chr-opaque; +int left; -/* TODO: the QString implementation has the same code, we should - * introduce a generic way to do this in cutils.c */ -if (d-outbuf_capacity d-outbuf_size + len) { -/* grow outbuf */ -d-outbuf_capacity += len; -d-outbuf_capacity *= 2; -d-outbuf = g_realloc(d-outbuf, d-outbuf_capacity); +if (d-cbuf_capacity len) { +return -1; } This is the first time I look at a circular buffer implementation, but I'd expect this to work: ie. you just write the last bytes that fit on the buffer. 
-memcpy(d-outbuf + d-outbuf_size, buf, len); -d-outbuf_size += len; +left = d-cbuf_capacity - d-cbuf_count % d-cbuf_capacity; + +/* Some of cbuf need to be overwrited */ +if (left len) { +memcpy(d-cbuf + d-cbuf_in, buf, left); +memcpy(d-cbuf + d-cbuf_out, buf + left, len - left); +d-cbuf_out = (d-cbuf_out + len - left) % d-cbuf_capacity; +d-cbuf_count = d-cbuf_count + left; +} else { +/* Completely overwrite */ +if (mem_chr_is_full(chr)) { +d-cbuf_out = (d-cbuf_out + len) % d-cbuf_capacity; +} else { +/* Enough cbuf to write */ +d-cbuf_count += len; +} +memcpy(d-cbuf + d-cbuf_in, buf, len); +} Couldn't this be made simpler by having a pointer to d-cbuf that points to where we are, then we just made that pointer points to the beginning of the buffer every time we cross its end? Just an idea. + +d-cbuf_in = (d-cbuf_in + len) % d-cbuf_capacity; return len; } -void qemu_chr_init_mem(CharDriverState *chr) +static void mem_chr_read(CharDriverState *chr, uint8_t *buf, int len) +{ +MemoryDriver *d = chr-opaque; +int left; + +if (mem_chr_is_empty(chr)) { +return; +} + +left = d-cbuf_capacity - d-cbuf_count % d-cbuf_capacity; + +if (d-cbuf_capacity len) { +len = d-cbuf_capacity; +} + +if (left len) { +memcpy(buf, d-cbuf + d-cbuf_out, left); +memcpy(buf + left, d-cbuf + d-cbuf_out + left, len - left); +} else { +memcpy(buf, d-cbuf + d-cbuf_out, len); +} + +d-cbuf_out = (d-cbuf_out + len) % d-cbuf_capacity; +d-cbuf_count -= len; +} + +void qemu_chr_init_mem(CharDriverState *chr, size_t size) Won't this break bisect? 
{ MemoryDriver *d; d = g_malloc(sizeof(*d)); -d-outbuf_size = 0; -d-outbuf_capacity = 4096; -d-outbuf = g_malloc0(d-outbuf_capacity); +d-cbuf_capacity = size; +d-cbuf_in = 0; +d-cbuf_out = 0; +d-cbuf_count = 0; +d-cbuf = g_malloc0(d-cbuf_capacity); memset(chr, 0, sizeof(*chr)); chr-opaque = d; @@ -2569,7 +2627,7 @@ void qemu_chr_init_mem(CharDriverState *chr) QString *qemu_chr_mem_to_qs(CharDriverState *chr) { MemoryDriver *d = chr-opaque; -return qstring_from_substr((char *) d-outbuf, 0, d-outbuf_size - 1); +return qstring_from_substr((char *) d-cbuf, 0, d-cbuf_count - 1); } /* NOTE: this driver can not be closed with qemu_chr_delete()! */ @@ -2577,7 +2635,7 @@ void qemu_chr_close_mem(CharDriverState *chr) { MemoryDriver *d = chr-opaque; -g_free(d-outbuf); +g_free(d-cbuf); g_free(chr-opaque); chr-opaque = NULL; chr-chr_write = NULL; @@ -2586,7 +2644,7 @@ void qemu_chr_close_mem(CharDriverState *chr) size_t qemu_chr_mem_osize(const CharDriverState *chr) { const MemoryDriver *d = chr-opaque; -return d-outbuf_size; +return d-cbuf_count; } QemuOpts *qemu_chr_parse_compat(const char *label, const char *filename) diff --git a/qemu-char.h
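The reviewer's expectation (a write larger than the buffer keeps only the last bytes that fit) and the suggestion to wrap an index at the end of the storage can be sketched as follows. This is a minimal illustration under assumptions, not the patch's code: ToyRing and the toy_ring_* functions are hypothetical names, and the overwrite-oldest policy is one possible reading of the review comments.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy fixed-size ring buffer that overwrites the oldest data when full. */
typedef struct ToyRing {
    uint8_t *buf;
    size_t capacity;
    size_t head;  /* index of the oldest stored byte */
    size_t count; /* number of stored bytes */
} ToyRing;

static void toy_ring_init(ToyRing *r, uint8_t *storage, size_t capacity)
{
    assert(capacity > 0);
    r->buf = storage;
    r->capacity = capacity;
    r->head = 0;
    r->count = 0;
}

/* Store len bytes; returns the number of bytes that ended up stored. */
static size_t toy_ring_write(ToyRing *r, const uint8_t *data, size_t len)
{
    size_t i;

    /* Only the last `capacity` bytes can survive, so skip the rest. */
    if (len > r->capacity) {
        data += len - r->capacity;
        len = r->capacity;
    }
    for (i = 0; i < len; i++) {
        size_t tail = (r->head + r->count) % r->capacity;

        r->buf[tail] = data[i];
        if (r->count == r->capacity) {
            /* Buffer was full: the oldest byte was just overwritten. */
            r->head = (r->head + 1) % r->capacity;
        } else {
            r->count++;
        }
    }
    return len;
}

/* Read up to len bytes in FIFO order; returns the number of bytes read. */
static size_t toy_ring_read(ToyRing *r, uint8_t *out, size_t len)
{
    size_t i;

    if (len > r->count) {
        len = r->count;
    }
    for (i = 0; i < len; i++) {
        out[i] = r->buf[r->head];
        r->head = (r->head + 1) % r->capacity;
    }
    r->count -= len;
    return len;
}
```

The byte-at-a-time loop trades speed for simplicity; splitting the copy into at most two memcpy() calls, as the patch attempts, would be the faster variant of the same idea.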
Re: [Qemu-devel] CPU hotplug
Am 30.08.2012 20:40, schrieb Igor Mammedov: Am 30.08.2012 um 17:41 schrieb Andreas Färber afaer...@suse.de: Am 30.08.2012 11:06, schrieb Stefan Priebe: I tried latest 1.2rc1 kvm-qemu with vanilla kernel v3.5.2 but the VM just crashes when sending cpu_set X online through qm monitor. For SLES we're carrying a patch by Kamalesh Babulal that prevents this (BNC#747339). But this just disables CPU [hotplug] and does Not fix it? It fixes the crash. Hotplug needs to be implemented first, and this has been taking several months already (for x86, to be specific). Mhm RHEL 6.3 claims to support this? it's not officially supported, it's just tech-preview. That allows to play with hotplug and uncover possible guest issues early. Yes, but does this mean that is doesn't work at RHEL too? i wasn't able to get it working any guest or host at all. Greets, Stefan
[Qemu-devel] [PATCH 0/7] block: bdrv_reopen() patches
These patches are strongly based off Supriya Kannery's original bdrv_reopen() patches as part of the hostcache series, including the _prepare(), _commit(), and _abort() structure.

Some additions / changes:

* Added support for multiple image reopen transactionally
* Reopen changes are staged into temporary stashes in prepare(), and copied over in commit() (discarded in abort()).
* Driver-level reopen file changes are mainly contained in the raw-* files.

TODO: The raw-win32 driver still needs to be finished
TODO: The vmdk driver still needs to be finished

Jeff Cody (7):
  block: correctly set the keep_read_only flag
  block: Framework for reopening files safely
  block: raw-posix image file reopen
  block: raw image file reopen
  block: qed image file reopen
  block: qcow2 image file reopen
  block: qcow image file reopen

 block.c           | 242 --
 block.h           |  16
 block/qcow.c      |  23 ++
 block/qcow2.c     |  22 +
 block/qed.c       |  20 +
 block/raw-posix.c | 153 ++
 block/raw.c       |  22 +
 block_int.h       |  13 +++
 qemu-common.h     |   1 +
 9 files changed, 489 insertions(+), 23 deletions(-)

-- 
1.7.11.2
[Qemu-devel] [RFC v2 PATCH 6/6] QAPI: add command for live block commit, 'block-commit'
The command for live block commit is added, which has the following arguments:

device: the block device to perform the commit on (mandatory)
base:   the base image to commit into; optional (if not specified, it is the underlying original image)
top:    the top image of the commit - all data from inside top down to base will be committed into base. optional (if not specified, it is the active image) - see note below
speed:  maximum speed, in bytes/sec

note: eventually this will support merging down the active layer, but that code is not yet complete. If the active layer is passed in currently as top, or top is left to the default, then the error QERR_TOP_NOT_FOUND will be returned.

This is done as a block job, so upon completion a BLOCK_JOB_COMPLETED will be emitted.

Signed-off-by: Jeff Cody jc...@redhat.com --- blockdev.c | 83 qapi-schema.json | 30 qmp-commands.hx | 6 3 files changed, 119 insertions(+) diff --git a/blockdev.c b/blockdev.c index 68d65fb..e0d6ca0 100644 --- a/blockdev.c +++ b/blockdev.c @@ -827,6 +827,89 @@ exit: return; } +void qmp_block_commit(const char *device, + bool has_base, const char *base, + bool has_top, const char *top, + bool has_speed, int64_t speed, + Error **errp) +{ +BlockDriverState *bs; +BlockDriverState *base_bs, *top_bs, *child_bs; +Error *local_err = NULL; +int orig_base_flags, orig_top_flags; +BlockReopenQueue *reopen_queue = NULL; +/* This will be part of the QMP command, if/when the + * BlockdevOnError change for blkmirror makes it in + */ +BlockErrorAction on_error = BLOCK_ERR_REPORT; + +/* drain all i/o before commits */ +bdrv_drain_all(); + +bs = bdrv_find(device); +if (!bs) { +error_set(errp, QERR_DEVICE_NOT_FOUND, device); +return; +} +if (base && has_base) { +base_bs = bdrv_find_backing_image(bs, base); +} else { +base_bs = bdrv_find_base(bs); +} + +if (base_bs == NULL) { +error_set(errp, QERR_BASE_NOT_FOUND, NULL); +return; +} + +if (top && has_top) { +/* if we want to allow the active layer, + * use 'bdrv_find_image()' here */ 
+top_bs = bdrv_find_backing_image(bs, top); +if (top_bs == NULL) { +error_set(errp, QERR_TOP_NOT_FOUND, top); +return; +} +} else { +/* we will eventually default to the top layer,i.e. top_bs = bs */ +error_set(errp, QERR_TOP_NOT_FOUND, top); +return; +} + +child_bs = bdrv_find_child(bs, top_bs); + +orig_base_flags = bdrv_get_flags(base_bs); /* what we are writing into */ +orig_top_flags = bdrv_get_flags(child_bs); /* to change the backing file */ + +/* convert base_bs to r/w, if necessary */ +if (!(orig_base_flags BDRV_O_RDWR)) { +reopen_queue = bdrv_reopen_queue(reopen_queue, base_bs, + orig_base_flags | BDRV_O_RDWR); +} +if (!(orig_top_flags BDRV_O_RDWR)) { +reopen_queue = bdrv_reopen_queue(reopen_queue, base_bs, + orig_base_flags | BDRV_O_RDWR); +} +if (reopen_queue) { +bdrv_reopen_multiple(reopen_queue, local_err); +if (local_err != NULL) { +error_propagate(errp, local_err); +return; +} +} + +commit_start(bs, base_bs, top_bs, speed, on_error, + block_job_cb, bs, orig_base_flags, orig_top_flags, + local_err); +if (local_err != NULL) { +error_propagate(errp, local_err); +return; +} +/* Grab a reference so hotplug does not delete the BlockDriverState from + * underneath us. + */ +drive_get_ref(drive_get_by_blockdev(bs)); +} static void eject_device(BlockDriverState *bs, int force, Error **errp) { diff --git a/qapi-schema.json b/qapi-schema.json index bd8ad74..45feda6 100644 --- a/qapi-schema.json +++ b/qapi-schema.json @@ -1401,6 +1401,36 @@ 'returns': 'str' } ## +# @block-commit +# +# Live commit of data from child image nodes into parent nodes - i.e., +# writes data between 'top' and 'base' into 'base'. +# +# @device: the name of the device +# +# @base: #optional The parent image of the device to write data into. +#If not specified, this is the original parent image. +# +# @top:#optional The child image, above which data will not be committed +#down. If not specified, this is the active layer. 
+# +# @speed: #optional the maximum speed, in bytes per second +# +# Returns: Nothing on success +# If commit or stream is already active on this device, DeviceInUse +# If @device does not exist, DeviceNotFound +# If image commit is not supported by
Re: [Qemu-devel] QEMU emulation per CPU
On Thu, Aug 30, 2012 at 7:27 PM, Naresh Bhat nareshgb...@gmail.com wrote: Hi Santosa, Can you please forward a link of that discussion thread ?? try: http://lists.nongnu.org/archive/html/qemu-devel/2012-08/msg05037.html -- regards, Mulyadi Santosa Freelance Linux trainer and consultant blog: the-hydra.blogspot.com training: mulyaditraining.blogspot.com
Re: [Qemu-devel] [PATCH 2/6] monitor: Adjust qmp_human_monitor_command to new MemCharDriver
On Thu, 23 Aug 2012 13:14:22 +0800 Lei Li li...@linux.vnet.ibm.com wrote:

Signed-off-by: Lei Li li...@linux.vnet.ibm.com
---
 monitor.c |    8 +++++++-
 1 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/monitor.c b/monitor.c
index 480f583..ab4650b 100644
--- a/monitor.c
+++ b/monitor.c
@@ -642,7 +642,13 @@ char *qmp_human_monitor_command(const char *command_line, bool has_cpu_index,
     CharDriverState mchar;
 
     memset(&hmp, 0, sizeof(hmp));
-    qemu_chr_init_mem(&mchar);
+
+    /* Since the backend of MemCharDriver convert to a circular
+     * buffer with fixed size, so should indicate the init memory
+     * size.
+     *
+     * XXX: is 4096 as init memory enough for this? */
+    qemu_chr_init_mem(&mchar, 4096);

I'm not sure I like this. The end result will be that hmp commands writing more than 4096 bytes will simply fail or return garbage (if the circular buffer is changed to allow writing more than it supports), whereas today they just work. Although it's always possible to increase the buffer size, we would only realize this is needed when the bug is triggered, which means it has a high chance of happening in production. IOW, this would be a regression.

The only solution I can think of is to make the circular buffer and the current MemoryDriver live in parallel. Actually, you really seem to be adding something else.

     hmp.chr = &mchar;
 
     old_mon = cur_mon;
[Qemu-devel] [RFC v2 PATCH 4/6] qerror: new error for live block commit, QERR_TOP_NOT_FOUND
Signed-off-by: Jeff Cody jc...@redhat.com
---
 qerror.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/qerror.h b/qerror.h
index d0a76a4..7396184 100644
--- a/qerror.h
+++ b/qerror.h
@@ -219,6 +219,9 @@ void assert_no_error(Error *err);
 #define QERR_TOO_MANY_FILES \
     ERROR_CLASS_GENERIC_ERROR, "Too many open files"
 
+#define QERR_TOP_NOT_FOUND \
+    ERROR_CLASS_GENERIC_ERROR, "Top image file %s not found"
+
 #define QERR_UNDEFINED_ERROR \
     ERROR_CLASS_GENERIC_ERROR, "An undefined error has occurred"
 
-- 
1.7.11.2
Re: [Qemu-devel] CPU hotplug
On Thu, 30 Aug 2012 20:45:10 +0200 Stefan Priebe s.pri...@profihost.ag wrote: Am 30.08.2012 20:40, schrieb Igor Mammedov: Am 30.08.2012 um 17:41 schrieb Andreas Färber afaer...@suse.de: Am 30.08.2012 11:06, schrieb Stefan Priebe: I tried latest 1.2rc1 kvm-qemu with vanilla kernel v3.5.2 but the VM just crashes when sending cpu_set X online through qm monitor. For SLES we're carrying a patch by Kamalesh Babulal that prevents this (BNC#747339). But this just disables CPU [hotplug] and does Not fix it? It fixes the crash. Hotplug needs to be implemented first, and this has been taking several months already (for x86, to be specific). Mhm RHEL 6.3 claims to support this? it's not officially supported, it's just tech-preview. That allows to play with hotplug and uncover possible guest issues early. Yes, but does this mean that is doesn't work at RHEL too? i wasn't able to get it working any guest or host at all. It works with RHEL 6.3 host/guest combo. Extra testing is greatly appreciated. Greets, Stefan -- Regards, Igor
Re: [Qemu-devel] CPU hotplug
Am 30.08.2012 20:56, schrieb Igor Mammedov: On Thu, 30 Aug 2012 20:45:10 +0200 Stefan Priebe s.pri...@profihost.ag wrote: Am 30.08.2012 20:40, schrieb Igor Mammedov: Am 30.08.2012 um 17:41 schrieb Andreas Färber afaer...@suse.de: Am 30.08.2012 11:06, schrieb Stefan Priebe: I tried latest 1.2rc1 kvm-qemu with vanilla kernel v3.5.2 but the VM just crashes when sending cpu_set X online through qm monitor. For SLES we're carrying a patch by Kamalesh Babulal that prevents this (BNC#747339). But this just disables CPU [hotplug] and does Not fix it? It fixes the crash. Hotplug needs to be implemented first, and this has been taking several months already (for x86, to be specific). Mhm RHEL 6.3 claims to support this? it's not officially supported, it's just tech-preview. That allows to play with hotplug and uncover possible guest issues early. Yes, but does this mean that is doesn't work at RHEL too? i wasn't able to get it working any guest or host at all. It works with RHEL 6.3 host/guest combo. Extra testing is greatly appreciated. mhm OK. I'm using Debian Squeeze with qemu-kvm 1.2rc1 and vanilla kernel 3.5.2 as guest AND host. This just results in the known crash. Stefan
[Qemu-devel] [PATCH 4/7] block: raw image file reopen
These are the stubs for the file reopen drivers for the raw format. There is currently nothing that needs to be done by the raw driver in reopen.

Signed-off-by: Jeff Cody jc...@redhat.com
---
 block/raw.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/block/raw.c b/block/raw.c
index ff34ea4..fa47ff1 100644
--- a/block/raw.c
+++ b/block/raw.c
@@ -9,6 +9,24 @@ static int raw_open(BlockDriverState *bs, int flags)
     return 0;
 }
 
+/* We have nothing to do for raw reopen, stubs just return
+ * success */
+static int raw_reopen_prepare(BDRVReopenState *state, Error **errp)
+{
+    return 0;
+}
+
+static void raw_reopen_commit(BDRVReopenState *state)
+{
+    return;
+}
+
+static void raw_reopen_abort(BDRVReopenState *state)
+{
+    return;
+}
+
+
 static int coroutine_fn raw_co_readv(BlockDriverState *bs, int64_t sector_num,
                                      int nb_sectors, QEMUIOVector *qiov)
 {
@@ -115,6 +133,10 @@ static BlockDriver bdrv_raw = {
     .bdrv_open = raw_open,
     .bdrv_close = raw_close,
 
+    .bdrv_reopen_prepare = raw_reopen_prepare,
+    .bdrv_reopen_commit = raw_reopen_commit,
+    .bdrv_reopen_abort = raw_reopen_abort,
+
     .bdrv_co_readv = raw_co_readv,
     .bdrv_co_writev = raw_co_writev,
     .bdrv_co_is_allocated = raw_co_is_allocated,
-- 
1.7.11.2
[Qemu-devel] [PATCH 3/7] block: raw-posix image file reopen
This is derived from the Supriya Kannery's reopen patches. This contains the raw-posix driver changes for the bdrv_reopen_* functions. All changes are staged into a temporary scratch buffer during the prepare() stage, and copied over to the live structure during commit(). Upon abort(), all changes are abandoned, and the live structures are unmodified. The _prepare() will create an extra fd - either by means of a dup, if possible, or opening a new fd if not (for instance, access control changes). Upon _commit(), the original fd is closed and the new fd is used. Upon _abort(), the duplicate/new fd is closed. Signed-off-by: Jeff Cody jc...@redhat.com --- block/raw-posix.c | 153 +- 1 file changed, 139 insertions(+), 14 deletions(-) diff --git a/block/raw-posix.c b/block/raw-posix.c index 6be20b1..48086d7 100644 --- a/block/raw-posix.c +++ b/block/raw-posix.c @@ -140,6 +140,15 @@ typedef struct BDRVRawState { #endif } BDRVRawState; +typedef struct BDRVRawReopenState { +BDRVReopenState reopen_state; +int fd; +int open_flags; +uint8_t *aligned_buf; +unsigned aligned_buf_size; +BDRVRawState *stash_s; +} BDRVRawReopenState; + static int fd_open(BlockDriverState *bs); static int64_t raw_getlength(BlockDriverState *bs); @@ -185,6 +194,28 @@ static int raw_normalize_devicepath(const char **filename) } #endif +static void raw_parse_flags(int bdrv_flags, int *open_flags) +{ +assert(open_flags != NULL); + +*open_flags |= O_BINARY; +*open_flags = ~O_ACCMODE; +if (bdrv_flags BDRV_O_RDWR) { +*open_flags |= O_RDWR; +} else { +*open_flags |= O_RDONLY; +} + +/* Use O_DSYNC for write-through caching, no flags for write-back caching, + * and O_DIRECT for no caching. 
*/ +if ((bdrv_flags BDRV_O_NOCACHE)) { +*open_flags |= O_DIRECT; +} +if (!(bdrv_flags BDRV_O_CACHE_WB)) { +*open_flags |= O_DSYNC; +} +} + static int raw_open_common(BlockDriverState *bs, const char *filename, int bdrv_flags, int open_flags) { @@ -196,20 +227,8 @@ static int raw_open_common(BlockDriverState *bs, const char *filename, return ret; } -s-open_flags = open_flags | O_BINARY; -s-open_flags = ~O_ACCMODE; -if (bdrv_flags BDRV_O_RDWR) { -s-open_flags |= O_RDWR; -} else { -s-open_flags |= O_RDONLY; -} - -/* Use O_DSYNC for write-through caching, no flags for write-back caching, - * and O_DIRECT for no caching. */ -if ((bdrv_flags BDRV_O_NOCACHE)) -s-open_flags |= O_DIRECT; -if (!(bdrv_flags BDRV_O_CACHE_WB)) -s-open_flags |= O_DSYNC; +s-open_flags = open_flags; +raw_parse_flags(bdrv_flags, s-open_flags); s-fd = -1; fd = qemu_open(filename, s-open_flags, 0644); @@ -283,6 +302,109 @@ static int raw_open(BlockDriverState *bs, const char *filename, int flags) return raw_open_common(bs, filename, flags, 0); } +static int raw_reopen_prepare(BDRVReopenState *state, Error **errp) +{ +BDRVRawState *s; +BDRVRawReopenState *raw_s; +int ret = 0; + +assert(state != NULL); +assert(state-bs != NULL); + +s = state-bs-opaque; + +state-opaque = g_malloc0(sizeof(BDRVRawReopenState)); +raw_s = state-opaque; + +raw_parse_flags(state-flags, raw_s-open_flags); + +/* + * If we didn't have BDRV_O_NOCACHE set before, we may not have allocated + * aligned_buf + */ +if ((state-flags BDRV_O_NOCACHE)) { +/* + * Allocate a buffer for read/modify/write cycles. Choose the size + * pessimistically as we don't know the block size yet. 
+ */ +raw_s-aligned_buf_size = 32 * MAX_BLOCKSIZE; +raw_s-aligned_buf = qemu_memalign(MAX_BLOCKSIZE, + raw_s-aligned_buf_size); + +if (raw_s-aligned_buf == NULL) { +ret = -1; +goto error; +} +} + +int fcntl_flags = O_APPEND | O_ASYNC | O_NONBLOCK; +#ifdef O_NOATIME +fcntl_flags |= O_NOATIME; +#endif +if ((raw_s-open_flags ~fcntl_flags) == (s-open_flags ~fcntl_flags)) { +/* dup the original fd */ +/* TODO: use qemu fcntl wrapper */ +raw_s-fd = fcntl(s-fd, F_DUPFD_CLOEXEC, 0); +if (raw_s-fd == -1) { +ret = -1; +goto error; +} +ret = fcntl_setfl(raw_s-fd, raw_s-open_flags); +} else { +raw_s-fd = qemu_open(state-bs-filename, raw_s-open_flags, 0644); +if (raw_s-fd == -1) { +ret = -1; +} +} +error: +return ret; +} + + +static void raw_reopen_commit(BDRVReopenState *state) +{ +BDRVRawReopenState *raw_s = state-opaque; +BDRVRawState *s = state-bs-opaque; + +if (raw_s-aligned_buf != NULL) { +if (s-aligned_buf) { +qemu_vfree(s-aligned_buf); +} +s-aligned_buf = raw_s-aligned_buf; +
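The prepare/commit/abort flow described in this message (stage changes in scratch state, copy them over on commit, discard them on abort) can be sketched generically. The code below is a toy model under assumptions, not the block layer's implementation: ToyImage, its fail_prepare knob and the toy_* functions are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy image with live state and a scratch area used during reopen. */
typedef struct ToyImage {
    int flags;         /* live state, only changed by commit */
    int staged_flags;  /* scratch state, filled in by prepare */
    bool prepared;
    bool fail_prepare; /* test knob: force prepare to fail */
} ToyImage;

/* Stage the new flags without touching the live state. */
static int toy_prepare(ToyImage *img, int new_flags)
{
    if (img->fail_prepare) {
        return -1;
    }
    img->staged_flags = new_flags;
    img->prepared = true;
    return 0;
}

/* Copy the staged state over the live state. */
static void toy_commit(ToyImage *img)
{
    img->flags = img->staged_flags;
    img->prepared = false;
}

/* Discard the staged state; the live state stays unmodified. */
static void toy_abort(ToyImage *img)
{
    img->prepared = false;
}

/* All-or-nothing reopen of n images with the same new flags. */
static int toy_reopen_multiple(ToyImage *imgs, size_t n, int new_flags)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (toy_prepare(&imgs[i], new_flags) < 0) {
            /* Roll back every image that was prepared so far. */
            while (i-- > 0) {
                toy_abort(&imgs[i]);
            }
            return -1;
        }
    }
    for (i = 0; i < n; i++) {
        toy_commit(&imgs[i]);
    }
    return 0;
}
```

Keeping commit and abort free of failure paths is what makes the scheme transactional here: everything that can fail happens in prepare, before any live state is touched.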
[Qemu-devel] [PATCH 2/7] block: Framework for reopening files safely
This is based heavily on Supriya Kannery's bdrv_reopen() patch series. This provides a transactional method to reopen multiple images files safely. Image files are queue for reopen via bdrv_reopen_queue(), and the reopen occurs when bdrv_reopen_multiple() is called. Changes are staged in bdrv_reopen_prepare() and in the equivalent driver level functions. If any of the staged images fails a prepare, then all of the images left untouched, and the staged changes for each image abandoned. Signed-off-by: Jeff Cody jc...@redhat.com --- block.c | 226 ++ block.h | 15 block_int.h | 13 qemu-common.h | 1 + 4 files changed, 255 insertions(+) diff --git a/block.c b/block.c index e31b76f..9470319 100644 --- a/block.c +++ b/block.c @@ -857,6 +857,232 @@ unlink_and_fail: return ret; } +/* + * Adds a BlockDriverState to a simple queue for an atomic, transactional + * reopen of multiple devices. + * + * bs_queue can either be an existing BlockReopenQueue that has had QSIMPLE_INIT + * already performed, or alternatively may be NULL a new BlockReopenQueue will + * be created and initialized. This newly created BlockReopenQueue should be + * passed back in for subsequent calls that are intended to be of the same + * atomic 'set'. + * + * bs is the BlockDriverState to add to the reopen queue. + * + * flags contains the open flags for the associated bs + * + * returns a pointer to bs_queue, which is either the newly allocated + * bs_queue, or the existing bs_queue being used. 
+ * + */ +BlockReopenQueue *bdrv_reopen_queue(BlockReopenQueue *bs_queue, +BlockDriverState *bs, int flags) +{ +assert(bs != NULL); + +BlockReopenQueueEntry *bs_entry; +if (bs_queue == NULL) { +bs_queue = g_new0(BlockReopenQueue, 1); +QSIMPLEQ_INIT(bs_queue); +} + +if (bs-file) { +bdrv_reopen_queue(bs_queue, bs-file, flags); +} + +bs_entry = g_new0(BlockReopenQueueEntry, 1); +QSIMPLEQ_INSERT_TAIL(bs_queue, bs_entry, entry); + +bs_entry-state = g_new0(BDRVReopenState, 1); +bs_entry-state-bs = bs; +bs_entry-state-flags = flags; + +return bs_queue; +} + +/* + * Reopen multiple BlockDriverStates atomically transactionally. + * + * The queue passed in (bs_queue) must have been built up previous + * via bdrv_reopen_queue(). + * + * Reopens all BDS specified in the queue, with the appropriate + * flags. All devices are prepared for reopen, and failure of any + * device will cause all device changes to be abandonded, and intermediate + * data cleaned up. + * + * If all devices prepare successfully, then the changes are committed + * to all devices. + * + */ +int bdrv_reopen_multiple(BlockReopenQueue *bs_queue, Error **errp) +{ +int ret = -1; +BlockReopenQueueEntry *bs_entry; +Error *local_err = NULL; + +assert(bs_queue != NULL); + +bdrv_drain_all(); + +QSIMPLEQ_FOREACH(bs_entry, bs_queue, entry) { +if (bdrv_reopen_prepare(bs_entry-state, local_err)) { +error_propagate(errp, local_err); +goto cleanup; +} +bs_entry-prepared = true; +} + +/* If we reach this point, we have success and just need to apply the + * changes + */ +QSIMPLEQ_FOREACH(bs_entry, bs_queue, entry) { +bdrv_reopen_commit(bs_entry-state); +} + +ret = 0; + +cleanup: +QSIMPLEQ_FOREACH(bs_entry, bs_queue, entry) { +if (ret bs_entry-prepared) { +bdrv_reopen_abort(bs_entry-state); +} +g_free(bs_entry-state); +g_free(bs_entry); +} +g_free(bs_queue); +return ret; +} + + +/* Reopen a single BlockDriverState with the specified flags. 
*/ +int bdrv_reopen(BlockDriverState *bs, int bdrv_flags, Error **errp) +{ +int ret = -1; +Error *local_err = NULL; +BlockReopenQueue *queue = bdrv_reopen_queue(NULL, bs, bdrv_flags); + +ret = bdrv_reopen_multiple(queue, local_err); +if (local_err != NULL) { +error_propagate(errp, local_err); +} +return ret; +} + + +/* + * Prepares a BlockDriverState for reopen. All changes are staged in the + * 'reopen_state' field of the BlockDriverState, which must be NULL when + * entering (all previous reopens must have completed for the BDS). + * + * bs is the BlockDriverState to reopen + * flags are the new open flags + * + * Returns 0 on success, non-zero on error. On error errp will be set + * as well. + * + * On failure, bdrv_reopen_abort() will be called to clean up any data. + * It is the responsibility of the caller to then call the abort() or + * commit() for any other BDS that have been left in a prepare() state + * + */ +int bdrv_reopen_prepare(BDRVReopenState *reopen_state, Error **errp) +{ +int ret = -1; +Error *local_err = NULL; +BlockDriver *drv; + +assert(reopen_state != NULL); +assert(reopen_state-bs-drv != NULL); +drv =
Re: [Qemu-devel] Posix timer syscalls ; dealing with the timer_t type
Andreas Färber wrote:

Hi,

On 30.08.2012 14:30, Erik de Castro Lopo wrote:

I'm working on implementing POSIX timers in linux-user. I'm having trouble figuring out how to handle the timer_t type. Consider the following code, with say 32-bit ARM being emulated on 64-bit x86-64:

    timer_t timerid;
    err = timer_create(clockid, &sev, &timerid);
    err = timer_gettime(timerid, &curr);

The issue is that memory for the timer_t value in the 32-bit target is allocated on the stack (where timer_t is 4 bytes), but the value is provided by the 64-bit host, where timer_t is 8 bytes. Any suggestions on dealing with this?

typedef target_ulong target_timer_t; or abi_ulong, or without the u if signed.

The timer_t type is actually an alias for void*.

Depending on where/how you use this, you may need to convert back and forth between host and target values.

The complication is that each call to the host's timer_create() function generates 64 bits of data, but the 32-bit target has only 32 bits in which to store that data. The only obvious solution is to store the 64-bit pointers from the host in a table and return the index into that table to the target as its version of the timer_t. Does that make sense?

Cheers,
Erik
-- 
Erik de Castro Lopo http://www.mega-nerd.com/
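For what it's worth, the table idea from the end of this mail could look roughly like the sketch below. This is only an illustration under stated assumptions: every name in it (MAX_GUEST_TIMERS, host_timer_t, timer_table_alloc() and friends) is made up for the example and is not existing linux-user code, and host_timer_t is just a void* stand-in for the host's timer_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; none of this is existing QEMU API. */
#define MAX_GUEST_TIMERS 32

typedef void *host_timer_t;       /* stand-in for the host's 8-byte timer_t */
typedef uint32_t target_timer_t;  /* what the 32-bit guest actually stores */

static host_timer_t timer_table[MAX_GUEST_TIMERS];
static unsigned char timer_used[MAX_GUEST_TIMERS];

/* Remember a host handle; return its slot index, or -1 if the table is full. */
int timer_table_alloc(host_timer_t host)
{
    for (int i = 0; i < MAX_GUEST_TIMERS; i++) {
        if (!timer_used[i]) {
            timer_used[i] = 1;
            timer_table[i] = host;
            return i;
        }
    }
    return -1;
}

/* Translate a guest-visible id back to the host handle; -1 on a bad id. */
int timer_table_lookup(target_timer_t id, host_timer_t *host)
{
    assert(host != NULL);
    if (id >= MAX_GUEST_TIMERS || !timer_used[id]) {
        return -1;
    }
    *host = timer_table[id];
    return 0;
}

/* Release a slot so a later timer_create() can reuse it; -1 on a bad id. */
int timer_table_free(target_timer_t id)
{
    if (id >= MAX_GUEST_TIMERS || !timer_used[id]) {
        return -1;
    }
    timer_used[id] = 0;
    timer_table[id] = NULL;
    return 0;
}
```

The guest would then only ever see the small index, which fits in its 4-byte timer_t, and each timer syscall wrapper would do a lookup before calling into the host. A fixed-size table is the simplest choice here; whether a fixed limit is acceptable is an open question the sketch does not answer.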
Re: [Qemu-devel] [PATCH 3/6] QAPI: Introduce memchar_write QMP command
On Thu, 23 Aug 2012 13:14:23 +0800 Lei Li li...@linux.vnet.ibm.com wrote: Signed-off-by: Lei Li li...@linux.vnet.ibm.com --- hmp-commands.hx | 16 hmp.c| 15 +++ hmp.h|1 + qapi-schema.json | 28 qemu-char.c | 36 qmp-commands.hx | 33 + 6 files changed, 129 insertions(+), 0 deletions(-) diff --git a/hmp-commands.hx b/hmp-commands.hx index f6104b0..829aea1 100644 --- a/hmp-commands.hx +++ b/hmp-commands.hx @@ -797,6 +797,22 @@ Inject an NMI on the given CPU (x86 only). ETEXI { +.name = memchar-write, +.args_type = chardev:s,size:i,data:s,format:s?, +.params = chardev size data, +.help = Provide writing interface for memchardev. Write + 'data' to memchr char device with size 'size', +.mhandler.cmd = hmp_memchar_write, +}, + +STEXI +@item memchar-write @var{chardev} @var{size} @var{data} @var{format} +@findex memchar-write +Provide writing interface for memchardev. Write @var{data} +to memchr char device with size @var{size}. +ETEXI + +{ .name = migrate, .args_type = detach:-d,blk:-b,inc:-i,uri:s, .params = [-d] [-b] [-i] uri, diff --git a/hmp.c b/hmp.c index 81c8acb..c1164eb 100644 --- a/hmp.c +++ b/hmp.c @@ -668,6 +668,21 @@ void hmp_pmemsave(Monitor *mon, const QDict *qdict) hmp_handle_error(mon, errp); } +void hmp_memchar_write(Monitor *mon, const QDict *qdict) +{ +uint32_t size = qdict_get_int(qdict, size); +const char *chardev = qdict_get_str(qdict, chardev); +const char *data = qdict_get_str(qdict, data); +int con = qdict_get_try_bool(qdict, utf8, 0); +enum DataFormat format; +Error *errp = NULL; + +format = con ? 
DATA_FORMAT_UTF8 : DATA_FORMAT_BASE64; +qmp_memchar_write(chardev, size, data, true, format, errp); + +hmp_handle_error(mon, errp); +} + static void hmp_cont_cb(void *opaque, int err) { if (!err) { diff --git a/hmp.h b/hmp.h index 7dd93bf..15e1311 100644 --- a/hmp.h +++ b/hmp.h @@ -43,6 +43,7 @@ void hmp_system_powerdown(Monitor *mon, const QDict *qdict); void hmp_cpu(Monitor *mon, const QDict *qdict); void hmp_memsave(Monitor *mon, const QDict *qdict); void hmp_pmemsave(Monitor *mon, const QDict *qdict); +void hmp_memchar_write(Monitor *mon, const QDict *qdict); void hmp_cont(Monitor *mon, const QDict *qdict); void hmp_system_wakeup(Monitor *mon, const QDict *qdict); void hmp_inject_nmi(Monitor *mon, const QDict *qdict); diff --git a/qapi-schema.json b/qapi-schema.json index bd8ad74..2fc1a27 100644 --- a/qapi-schema.json +++ b/qapi-schema.json @@ -235,6 +235,34 @@ ## { 'command': 'query-chardev', 'returns': ['ChardevInfo'] } +{ 'enum': 'DataFormat' + 'data': [ 'utf8', 'base64' ] } + +## +# @memchar-write: +# +# Provide writing interface for memchardev. Write data to memchar +# char device. +# +# @chardev: the name of the memchar char device. +# +# @size: the size to write in bytes. +# +# @data: the source data write to memchar. +# +# @format: #optional the format of the data write to memchardev, by +# default is 'utf8'. 
+# +# Returns: Nothing on success +# If @chardev is not a valid memchr device, DeviceNotFound +# If an I/O error occurs while writing, IOError +# +# Since: 1.3 +## +{ 'command': 'memchar-write', + 'data': {'chardev': 'str', 'size': 'int', 'data': 'str', + '*format': 'DataFormat'} } + ## # @CommandInfo: # diff --git a/qemu-char.c b/qemu-char.c index b21b93a..d8f3238 100644 --- a/qemu-char.c +++ b/qemu-char.c @@ -2647,6 +2647,42 @@ size_t qemu_chr_mem_osize(const CharDriverState *chr) return d-cbuf_count; } +void qmp_memchar_write(const char *chardev, int64_t size, + const char *data, bool has_format, + enum DataFormat format, + + Error **errp) +{ +CharDriverState *chr; +guchar *write_data; +int ret; +gsize write_count; + +chr = qemu_chr_find(chardev); + +if(!chr) { +error_set(errp, QERR_DEVICE_NOT_FOUND, chardev); +return; +} + +write_count = (gsize)size; + +if (has_format) { +if (format == DATA_FORMAT_BASE64) { +write_data = g_base64_decode(data, write_count); +} +} else { +write_data = (uint8_t *)data; +} + +ret = mem_chr_write(chr, write_data, size); +if (ret = 0) { +error_set(errp, QERR_IO_ERROR); We're only using the QERR_ macros for the errors we have to maintain compatibility (they're listed in ErrorClass in the schema) all other errors can use the error_setg() function:
Re: [Qemu-devel] [PATCH 5/6] Fix enumeration typo error
On Thu, 23 Aug 2012 13:14:25 +0800 Lei Li li...@linux.vnet.ibm.com wrote: Signed-off-by: Lei Li li...@linux.vnet.ibm.com Reviewed-by: Luiz Capitulino lcapitul...@redhat.com PS: CC'ing qemu-trivial as per Eric's suggestion. --- qapi-schema-guest.json |2 +- qapi-schema.json |4 ++-- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/qapi-schema-guest.json b/qapi-schema-guest.json index d955cf1..ed0eb69 100644 --- a/qapi-schema-guest.json +++ b/qapi-schema-guest.json @@ -293,7 +293,7 @@ ## # @GuestFsFreezeStatus # -# An enumation of filesystem freeze states +# An enumeration of filesystem freeze states # # @thawed: filesystems thawed/unfrozen # diff --git a/qapi-schema.json b/qapi-schema.json index 1346bcc..8e13a05 100644 --- a/qapi-schema.json +++ b/qapi-schema.json @@ -118,7 +118,7 @@ ## # @RunState # -# An enumation of VM run states. +# An enumeration of VM run states. # # @debug: QEMU is running on a debugger # @@ -836,7 +836,7 @@ ## # @SpiceQueryMouseMode # -# An enumation of Spice mouse states. +# An enumeration of Spice mouse states. # # @client: Mouse cursor position is determined by the client. #
[Qemu-devel] [PATCH 5/7] block: qed image file reopen
These are the stubs for the file reopen drivers for the qed format. There is currently nothing that needs to be done by the qed driver in reopen.

Signed-off-by: Jeff Cody jc...@redhat.com
---
 block/qed.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/block/qed.c b/block/qed.c
index a02dbfd..231e86a 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -505,6 +505,23 @@ out:
     return ret;
 }
 
+/* We have nothing to do for QED reopen, stubs just return
+ * success */
+static int bdrv_qed_reopen_prepare(BDRVReopenState *state, Error **errp)
+{
+    return 0;
+}
+
+static void bdrv_qed_reopen_commit(BDRVReopenState *state)
+{
+    return;
+}
+
+static void bdrv_qed_reopen_abort(BDRVReopenState *state)
+{
+    return;
+}
+
 static void bdrv_qed_close(BlockDriverState *bs)
 {
     BDRVQEDState *s = bs->opaque;
@@ -1553,6 +1570,9 @@ static BlockDriver bdrv_qed = {
     .bdrv_rebind              = bdrv_qed_rebind,
     .bdrv_open                = bdrv_qed_open,
     .bdrv_close               = bdrv_qed_close,
+    .bdrv_reopen_prepare      = bdrv_qed_reopen_prepare,
+    .bdrv_reopen_commit       = bdrv_qed_reopen_commit,
+    .bdrv_reopen_abort        = bdrv_qed_reopen_abort,
     .bdrv_create              = bdrv_qed_create,
     .bdrv_co_is_allocated     = bdrv_qed_co_is_allocated,
     .bdrv_make_empty          = bdrv_qed_make_empty,
-- 
1.7.11.2
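For readers who have not looked at patch 2/7, the prepare/commit/abort contract that these stubs plug into can be sketched as below. This is a minimal model only, assuming a plain array instead of the real reopen queue; ReopenItem, item_prepare() and reopen_all() are hypothetical names invented for the example, and fail_prepare exists purely so the failure path can be exercised.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one queued image; not the QEMU types. */
typedef struct ReopenItem {
    int current_flags;  /* live state, only touched by commit */
    int staged_flags;   /* filled in by prepare */
    bool fail_prepare;  /* test hook: force prepare to fail */
    bool prepared;
} ReopenItem;

/* Stage the change without touching live state. */
static int item_prepare(ReopenItem *it, int new_flags)
{
    if (it->fail_prepare) {
        return -1;
    }
    it->staged_flags = new_flags;
    it->prepared = true;
    return 0;
}

/* Make the staged change live; must not fail. */
static void item_commit(ReopenItem *it)
{
    it->current_flags = it->staged_flags;
    it->prepared = false;
}

/* Throw the staged change away; live state is untouched. */
static void item_abort(ReopenItem *it)
{
    it->staged_flags = 0;
    it->prepared = false;
}

/* All-or-nothing reopen: commit only if every item prepared successfully. */
int reopen_all(ReopenItem *items, size_t n, int new_flags)
{
    assert(items != NULL);
    for (size_t i = 0; i < n; i++) {
        if (item_prepare(&items[i], new_flags) != 0) {
            /* roll back everything that was already staged */
            for (size_t j = 0; j < i; j++) {
                item_abort(&items[j]);
            }
            return -1;
        }
    }
    for (size_t i = 0; i < n; i++) {
        item_commit(&items[i]);
    }
    return 0;
}
```

A format driver with no per-image state to stage, as the qed stubs above claim for themselves, simply returns success from prepare and does nothing in commit and abort.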
[Qemu-devel] [RFC v2 PATCH 5/6] block: helper function, to find the base image of a chain
This is a simple helper function, that will return the base image of a given image chain.

Signed-off-by: Jeff Cody jc...@redhat.com
---
 block.c | 16 ++++++++++++++++
 block.h |  1 +
 2 files changed, 17 insertions(+)

diff --git a/block.c b/block.c
index 11e275c..5f58600 100644
--- a/block.c
+++ b/block.c
@@ -3137,6 +3137,22 @@ int bdrv_get_backing_file_depth(BlockDriverState *bs)
     return 1 + bdrv_get_backing_file_depth(bs->backing_hd);
 }
 
+BlockDriverState *bdrv_find_base(BlockDriverState *bs)
+{
+    BlockDriverState *curr_bs = NULL;
+
+    if (!bs) {
+        return NULL;
+    }
+
+    curr_bs = bs;
+
+    while (curr_bs->backing_hd) {
+        curr_bs = curr_bs->backing_hd;
+    }
+    return curr_bs;
+}
+
 #define NB_SUFFIXES 4
 
 char *get_human_readable_size(char *buf, int buf_size, int64_t size)
diff --git a/block.h b/block.h
index ee76869..376cc50 100644
--- a/block.h
+++ b/block.h
@@ -201,6 +201,7 @@ int bdrv_commit_all(void);
 int bdrv_change_backing_file(BlockDriverState *bs,
     const char *backing_file, const char *backing_fmt);
 void bdrv_register(BlockDriver *bdrv);
+BlockDriverState *bdrv_find_base(BlockDriverState *bs);
 int bdrv_delete_intermediate(BlockDriverState *active, BlockDriverState *top,
                              BlockDriverState *base);
 BlockDriverState *bdrv_find_child(BlockDriverState *active,
-- 
1.7.11.2
[Qemu-devel] [PATCH 7/7] block: qcow image file reopen
These are the stubs for the file reopen drivers for the qcow format. There is currently nothing that needs to be done by the qcow driver in reopen.

Signed-off-by: Jeff Cody jc...@redhat.com
---
 block/qcow.c | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/block/qcow.c b/block/qcow.c
index 7b5ab87..f201575 100644
--- a/block/qcow.c
+++ b/block/qcow.c
@@ -197,6 +197,26 @@ static int qcow_open(BlockDriverState *bs, int flags)
     return ret;
 }
 
+
+/* We have nothing to do for QCOW reopen, stubs just return
+ * success */
+static int qcow_reopen_prepare(BDRVReopenState *state, Error **errp)
+{
+    return 0;
+}
+
+static void qcow_reopen_commit(BDRVReopenState *state)
+{
+    return;
+}
+
+static void qcow_reopen_abort(BDRVReopenState *state)
+{
+    return;
+}
+
+
+
 static int qcow_set_key(BlockDriverState *bs, const char *key)
 {
     BDRVQcowState *s = bs->opaque;
@@ -868,6 +888,9 @@ static BlockDriver bdrv_qcow = {
     .bdrv_probe             = qcow_probe,
     .bdrv_open              = qcow_open,
     .bdrv_close             = qcow_close,
+    .bdrv_reopen_prepare    = qcow_reopen_prepare,
+    .bdrv_reopen_commit     = qcow_reopen_commit,
+    .bdrv_reopen_abort      = qcow_reopen_abort,
     .bdrv_create            = qcow_create,
 
     .bdrv_co_readv          = qcow_co_readv,
-- 
1.7.11.2
[Qemu-devel] [PATCH 6/7] block: qcow2 image file reopen
These are the stubs for the file reopen drivers for the qcow2 format. There is currently nothing that needs to be done by the qcow2 driver in reopen.

Signed-off-by: Jeff Cody jc...@redhat.com
---
 block/qcow2.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/block/qcow2.c b/block/qcow2.c
index 8f183f1..d462093 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -52,6 +52,7 @@ typedef struct {
     uint32_t magic;
     uint32_t len;
 } QCowExtension;
+
 #define QCOW2_EXT_MAGIC_END 0
 #define QCOW2_EXT_MAGIC_BACKING_FORMAT 0xE2792ACA
 #define QCOW2_EXT_MAGIC_FEATURE_TABLE 0x6803f857
@@ -558,6 +559,24 @@ static int qcow2_set_key(BlockDriverState *bs, const char *key)
     return 0;
 }
 
+/* We have nothing to do for QCOW2 reopen, stubs just return
+ * success */
+static int qcow2_reopen_prepare(BDRVReopenState *state, Error **errp)
+{
+    return 0;
+}
+
+static void qcow2_reopen_commit(BDRVReopenState *state)
+{
+    return;
+}
+
+static void qcow2_reopen_abort(BDRVReopenState *state)
+{
+    return;
+}
+
+
 static int coroutine_fn qcow2_co_is_allocated(BlockDriverState *bs,
         int64_t sector_num, int nb_sectors, int *pnum)
 {
@@ -1679,6 +1698,9 @@ static BlockDriver bdrv_qcow2 = {
     .bdrv_probe             = qcow2_probe,
     .bdrv_open              = qcow2_open,
     .bdrv_close             = qcow2_close,
+    .bdrv_reopen_prepare    = qcow2_reopen_prepare,
+    .bdrv_reopen_commit     = qcow2_reopen_commit,
+    .bdrv_reopen_abort      = qcow2_reopen_abort,
     .bdrv_create            = qcow2_create,
     .bdrv_co_is_allocated   = qcow2_co_is_allocated,
     .bdrv_set_key           = qcow2_set_key,
-- 
1.7.11.2
[Qemu-devel] [RFC v2 PATCH 0/6] Live block commit
Live block commit. I originally had intended for this RFC series to include the more complicated case of a live commit of the active layer, but removed it for this commit in the hopes of making it into the soft feature freeze for 1.2, so this series is the simpler case.

This series adds the basic case, of a live commit between two images below the active layer, e.g.:

[base] --- [snp-1] --- [snp-2] --- [snp-3] --- [active]

can be collapsed down via commit, into:

[base] --- [active]

or,

[base] --- [snp-1] --- [active],
[base] --- [snp-3] --- [active], etc..

TODO:
* qemu-io tests (in progress)
* 'stage-2' of live commit functionality, to be able to push down the active layer. This will be structured something like mirroring, to allow for convergence.

Changes from the RFC v1 series:

* This patch series is not on top of Paolo's blk mirror series yet, to make it easier to apply independently if desired. This means some of what was in the previous RFC series is not in this one (BlockdevOnError, for instance), but that can be easily added in once Paolo's series are in.
* This patch series is dependent on the reopen() series with transactional reopen.
* The target release for this series is 1.3
* Found some mistakes in the reopen calls
* Dropped the BlockdevOnError argument (for now), will add in if rebasing on top of Paolo's series.
* Used the new qerror system

Jeff Cody (6):
  1/6 block: add support functions for live commit, to find and delete images.
  2/6 block: add live block commit functionality
  3/6 blockdev: rename block_stream_cb to a generic block_job_cb
  4/6 qerror: new error for live block commit, QERR_TOP_NOT_FOUND
  5/6 block: helper function, to find the base image of a chain
  6/6 QAPI: add command for live block commit, 'block-commit'

 block.c             | 158
 block.h             |   6 +-
 block/Makefile.objs |   1 +
 block/commit.c      | 202
 block_int.h         |  19 +
 blockdev.c          |  91 ++-
 qapi-schema.json    |  30
 qerror.h            |   3 +
 qmp-commands.hx     |   6 ++
 trace-events        |   4 +-
 10 files changed, 515 insertions(+), 5 deletions(-)
 create mode 100644 block/commit.c

-- 
1.7.11.2
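To make the chain diagrams in this cover letter concrete, here is a rough model of just the relinking step, assuming a bare singly linked chain. Image, find_overlay() and collapse_chain() are hypothetical names for illustration and are not the functions added by this series; the sketch also skips the actual copying of data down into base and the on-disk backing-file update, and it rejects top == active because this series only handles images below the active layer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of an image chain; 'backing' points toward the base. */
typedef struct Image {
    const char *name;
    struct Image *backing;
} Image;

/* Find the image directly above 'target', searching down from 'active'.
 * Returns 'active' itself if target == active, NULL if target is not found. */
Image *find_overlay(Image *active, Image *target)
{
    if (active == target) {
        return active;
    }
    for (Image *cur = active; cur != NULL && cur->backing != NULL;
         cur = cur->backing) {
        if (cur->backing == target) {
            return cur;
        }
    }
    return NULL;
}

/* Unlink every image above 'base' up to and including 'top', and point the
 * image above 'top' at 'base'. Returns the number of images dropped, or -1
 * if the arguments do not describe a valid range below the active layer. */
int collapse_chain(Image *active, Image *top, Image *base)
{
    Image *overlay, *cur;
    int dropped = 0;

    if (!active || !top || !base || top == active || top == base) {
        return -1;
    }
    overlay = find_overlay(active, top);
    if (overlay == NULL) {
        return -1;
    }
    /* first pass: check that base really is below top, changing nothing */
    for (cur = top; cur != NULL && cur != base; cur = cur->backing) {
        dropped++;
    }
    if (cur != base) {
        return -1;
    }
    /* second pass: detach the intermediates, then stitch the chain */
    cur = top;
    while (cur != base) {
        Image *next = cur->backing;
        cur->backing = NULL;
        cur = next;
    }
    overlay->backing = base;
    return dropped;
}
```

Validating the whole range before modifying any link keeps a bad request from leaving the chain half rewritten, which is the same concern the transactional reopen series addresses for open flags.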
[Qemu-devel] [RFC v2 PATCH 2/6] block: add live block commit functionality
This adds the live commit coroutine. This iteration focuses on the commit only below the active layer, and not the active layer itself. The behaviour is similar to block streaming; the sectors are walked through, and anything that exists above 'base' is committed back down into base. At the end, intermediate images are deleted, and the chain stitched together. Images are restored to their original open flags upon completion. Signed-off-by: Jeff Cody jc...@redhat.com --- block/Makefile.objs | 1 + block/commit.c | 202 block_int.h | 19 + trace-events| 2 + 4 files changed, 224 insertions(+) create mode 100644 block/commit.c diff --git a/block/Makefile.objs b/block/Makefile.objs index b5754d3..4a136b8 100644 --- a/block/Makefile.objs +++ b/block/Makefile.objs @@ -4,6 +4,7 @@ block-obj-y += qed.o qed-gencb.o qed-l2-cache.o qed-table.o qed-cluster.o block-obj-y += qed-check.o block-obj-y += parallels.o nbd.o blkdebug.o sheepdog.o blkverify.o block-obj-y += stream.o +block-obj-y += commit.o block-obj-$(CONFIG_WIN32) += raw-win32.o block-obj-$(CONFIG_POSIX) += raw-posix.o block-obj-$(CONFIG_LIBISCSI) += iscsi.o diff --git a/block/commit.c b/block/commit.c new file mode 100644 index 000..bd3d882 --- /dev/null +++ b/block/commit.c @@ -0,0 +1,202 @@ +/* + * Live block commit + * + * Copyright Red Hat, Inc. 2012 + * + * Authors: + * Jeff Cody jc...@redhat.com + * Based on stream.c by Stefan Hajnoczi + * + * This work is licensed under the terms of the GNU LGPL, version 2 or later. + * See the COPYING.LIB file in the top-level directory. + * + */ + +#include trace.h +#include block_int.h +#include qemu/ratelimit.h + +enum { +/* + * Size of data buffer for populating the image file. This should be large + * enough to process multiple clusters in a single call, so that populating + * contiguous regions of the image is efficient. 
+ */ +COMMIT_BUFFER_SIZE = 512 * 1024, /* in bytes */ +}; + +#define SLICE_TIME 1ULL /* ns */ + +typedef struct CommitBlockJob { +BlockJob common; +RateLimit limit; +BlockDriverState *active; +BlockDriverState *top; +BlockDriverState *base; +BlockErrorAction on_error; +int base_flags; +int top_flags; +} CommitBlockJob; + +static int coroutine_fn commit_populate(BlockDriverState *bs, +BlockDriverState *base, +int64_t sector_num, int nb_sectors, +void *buf) +{ +if (bdrv_read(bs, sector_num, buf, nb_sectors)) { +return -EIO; +} +if (bdrv_write(base, sector_num, buf, nb_sectors)) { +return -EIO; +} +return 0; +} + +static void coroutine_fn commit_run(void *opaque) +{ +CommitBlockJob *s = opaque; +BlockDriverState *active = s-active; +BlockDriverState *top = s-top; +BlockDriverState *base = s-base; +BlockDriverState *top_child = NULL; +int64_t sector_num, end; +int error = 0; +int ret = 0; +int n = 0; +void *buf; +int bytes_written = 0; + +s-common.len = bdrv_getlength(top); +if (s-common.len 0) { +block_job_complete(s-common, s-common.len); +return; +} + +top_child = bdrv_find_child(active, top); + +end = s-common.len BDRV_SECTOR_BITS; +buf = qemu_blockalign(top, COMMIT_BUFFER_SIZE); + +for (sector_num = 0; sector_num end; sector_num += n) { +uint64_t delay_ns = 0; +bool copy; + +wait: +/* Note that even when no rate limit is applied we need to yield + * with no pending I/O here so that qemu_aio_flush() returns. 
+ */ +block_job_sleep_ns(s-common, rt_clock, delay_ns); +if (block_job_is_cancelled(s-common)) { +break; +} +/* Copy if allocated above the base */ +ret = bdrv_co_is_allocated_above(top, base, sector_num, + COMMIT_BUFFER_SIZE / BDRV_SECTOR_SIZE, + n); +copy = (ret == 1); +trace_commit_one_iteration(s, sector_num, n, ret); +if (ret = 0 copy) { +if (s-common.speed) { +delay_ns = ratelimit_calculate_delay(s-limit, n); +if (delay_ns 0) { +goto wait; +} +} +ret = commit_populate(top, base, sector_num, n, buf); +bytes_written += n * BDRV_SECTOR_SIZE; +} +if (ret 0) { +if (s-on_error == BLOCK_ERR_STOP_ANY || +s-on_error == BLOCK_ERR_STOP_ENOSPC) { +n = 0; +continue; +} +if (error == 0) { +error = ret; +} +if (s-on_error == BLOCK_ERR_REPORT) { +break; +} +} +ret = 0; + +/* Publish progress
[Qemu-devel] [RFC v2 PATCH 1/6] block: add support functions for live commit, to find and delete images.
Add bdrv_find_child(), and bdrv_delete_intermediate(). bdrv_find_child(): given 'bs' and the active (topmost) BDS of an image chain, find the image that is the immediate top of 'bs' bdrv_delete_intermediate(): Given 3 BDS (active, top, base), delete images above base up to and including top, and set base to be the parent of top's child node. E.g., this converts: bottom - base - intermediate - top - active to bottom - base - active where top == active is permitted, although active will not be deleted. Signed-off-by: Jeff Cody jc...@redhat.com --- block.c | 142 block.h | 5 ++- 2 files changed, 146 insertions(+), 1 deletion(-) diff --git a/block.c b/block.c index 9470319..11e275c 100644 --- a/block.c +++ b/block.c @@ -1752,6 +1752,148 @@ int bdrv_change_backing_file(BlockDriverState *bs, return ret; } +/* Finds the image layer immediately to the 'top' of bs. + * + * active is the current topmost image. + */ +BlockDriverState *bdrv_find_child(BlockDriverState *active, + BlockDriverState *bs) +{ +BlockDriverState *child = NULL; +BlockDriverState *intermediate; + +/* if the active bs layer is the same as the new top, then there + * is no image above the top, so it will be returned as the child + */ +if (active == bs) { +child = active; +} else { +intermediate = active; +while (intermediate-backing_hd) { +if (intermediate-backing_hd == bs) { +child = intermediate; +break; +} +intermediate = intermediate-backing_hd; +} +} + +return child; +} + +typedef struct BlkIntermediateStates { +BlockDriverState *bs; +QSIMPLEQ_ENTRY(BlkIntermediateStates) entry; +} BlkIntermediateStates; + + +/* deletes images above 'base' up to and including 'top', and sets the image + * above 'top' to have base as its backing file. 
+ * + * E.g., this will convert the following chain: + * bottom - base - intermediate - top - active + * + * to + * + * bottom - base - active + * + * It is allowed for bottom==base, in which case it converts: + * + * base - intermediate - top - active + * + * to + * + * base - active + * + * It is also allowed for top==active, except in that case active is not + * deleted: + * + * base - intermediate - top + * + * becomes + * + * base - top + */ +int bdrv_delete_intermediate(BlockDriverState *active, BlockDriverState *top, + BlockDriverState *base) +{ +BlockDriverState *intermediate; +BlockDriverState *base_bs = NULL; +BlockDriverState *new_top_bs = NULL; +BlkIntermediateStates *intermediate_state, *next; +int ret = -1; + +QSIMPLEQ_HEAD(states_to_delete, BlkIntermediateStates) states_to_delete; +QSIMPLEQ_INIT(states_to_delete); + +if (!top-drv || !base-drv) { +goto exit; +} + +new_top_bs = bdrv_find_child(active, top); + +/* special case of new_top_bs-backing_hd already pointing to base - nothing + * to do, no intermediate images + */ +if (new_top_bs-backing_hd == base) { +ret = 0; +goto exit; +} + +if (new_top_bs == NULL) { +/* we could not find the image above 'top', this is an error */ +goto exit; +} + +/* if the active and top image passed in are the same, then we + * can't delete the active, so we start one below + */ +intermediate = (active == top) ? active-backing_hd : top; + +/* now we will go down through the list, and add each BDS we find + * into our deletion queue, until we hit the 'base' + */ +while (intermediate) { +intermediate_state = g_malloc0(sizeof(BlkIntermediateStates)); +intermediate_state-bs = intermediate; +QSIMPLEQ_INSERT_TAIL(states_to_delete, intermediate_state, entry); + +if (intermediate-backing_hd == base) { +base_bs = intermediate-backing_hd; +break; +} +intermediate = intermediate-backing_hd; +} +if (base_bs == NULL) { +/* something went wrong, we did not end at the base. 
safely + * unravel everything, and exit with error */ +goto exit; +} + +/* success - we can delete the intermediate states, and link top-base */ +ret = bdrv_change_backing_file(new_top_bs, base_bs-filename, + base_bs-drv ? base_bs-drv-format_name : ); +if (ret) { +goto exit; +} +new_top_bs-backing_hd = base_bs; + + +QSIMPLEQ_FOREACH_SAFE(intermediate_state, states_to_delete, entry, next) { +/* so that bdrv_close() does not recursively close the chain */ +intermediate_state-bs-backing_hd = NULL; +
[Qemu-devel] [RFC v2 PATCH 3/6] blockdev: rename block_stream_cb to a generic block_job_cb
Signed-off-by: Jeff Cody jc...@redhat.com
---
 blockdev.c   | 8 +++++---
 trace-events | 2 +-
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 7c83baa..68d65fb 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -53,6 +53,8 @@ static const int if_max_devs[IF_COUNT] = {
     [IF_SCSI] = 7,
 };
 
+static void block_job_cb(void *opaque, int ret);
+
 /*
  * We automatically delete the drive when a device using it gets
  * unplugged. Questionable feature, but we can't just drop it.
@@ -1063,12 +1065,12 @@ static QObject *qobject_from_block_job(BlockJob *job)
                               job->speed);
 }
 
-static void block_stream_cb(void *opaque, int ret)
+static void block_job_cb(void *opaque, int ret)
 {
     BlockDriverState *bs = opaque;
     QObject *obj;
 
-    trace_block_stream_cb(bs, bs->job, ret);
+    trace_block_job_cb(bs, bs->job, ret);
 
     assert(bs->job);
     obj = qobject_from_block_job(bs->job);
@@ -1110,7 +1112,7 @@ void qmp_block_stream(const char *device, bool has_base,
     }
 
     stream_start(bs, base_bs, base, has_speed ? speed : 0,
-                 block_stream_cb, bs, &local_err);
+                 block_job_cb, bs, &local_err);
     if (error_is_set(&local_err)) {
         error_propagate(errp, local_err);
         return;
diff --git a/trace-events b/trace-events
index 9eb8f10..8d7a8d3 100644
--- a/trace-events
+++ b/trace-events
@@ -79,7 +79,7 @@ commit_start(void *bs, void *base, void *top, void *s, void *co, void *opaque, i
 
 # blockdev.c
 qmp_block_job_cancel(void *job) "job %p"
-block_stream_cb(void *bs, void *job, int ret) "bs %p job %p ret %d"
+block_job_cb(void *bs, void *job, int ret) "bs %p job %p ret %d"
 qmp_block_stream(void *bs, void *job) "bs %p job %p"
 
 # hw/virtio-blk.c
-- 
1.7.11.2
[Qemu-devel] [ANNOUNCE] QEMU 1.2.0-rc2 release
Hi,

On behalf of the QEMU Team, I'd like to announce the availability of the first release candidate for the QEMU 1.2 release. This release is meant for testing purposes and should not be used in a production environment.

http://wiki.qemu.org/download/qemu-1.2.0-rc2.tar.bz2

You can help improve the quality of the QEMU 1.2 release by testing this release and reporting bugs on Launchpad:

https://bugs.launchpad.net/qemu/

The release plan for the 1.2 release is available at:

http://wiki.qemu.org/Planning/1.2

And a detailed change log is available at:

http://wiki.qemu.org/ChangeLog/Next

Known Issues:
 - References are not completely released during device removal which may lead to internal memory leaks.

Changelog since -rc1:
 - scsi-disk: Fix typo (uint32 -> uint32_t) (Stefan Weil)
 - msix: make [un]use vectors on reset/load optional (Michael S. Tsirkin)
 - kvm: get/set PV EOI MSR (Michael S. Tsirkin)
 - linux-headers: update to 3.6-rc3 (Michael S. Tsirkin)
 - target-i386: disable pv eoi to fix migration across QEMU versions (Anthony Liguori)
 - reset PMBA and PMREGMISC PIIX4 registers. (Gleb Natapov)
 - qemu-ga: Fix null pointer passed to unlink in failure branch (Stefan Weil)
 - memory: Fix copypaste mistake in memory_region_iorange_write (Jan Kiszka)
 - ivshmem: remove redundant ioeventfd configuration (Cam Macdonell)
 - hw/arm_gic.c: Define .class_size in arm_gic_info TypeInfo (Peter Maydell)
 - tcg/mips: fix broken CONFIG_TCG_PASS_AREG0 code (Aurelien Jarno)
 - Update OpenBIOS PPC image (Aurelien Jarno)
 - target-ppc: fix altivec instructions (Aurelien Jarno)
 - audio/winwave: previous audio buffer should be flushed (munkyu.im)
 - iscsi: Set number of blocks to 0 for blank CDROM devices (Ronnie Sahlberg)
 - scsi: more fixes to properties for passthrough devices (Paolo Bonzini)
 - esp: support 24-bit DMA (Paolo Bonzini)
 - megasas: Add 'hba_serial' property (Hannes Reinecke)
 - target-mips: allow microMIPS SWP and SDP to have RD equal to BASE (Eric Johnson)
 - target-mips: add privilege level check to several Cop0 instructions (Eric Johnson)
 - Revert "fix some debug printf format strings" (malc)
 - Revert "vl: fix -hdachs/-hda argument order parsing issues" (malc)
 - Revert "qemu-options.hx: mention retrace= VGA option" (malc)
 - Revert "vga: add some optional CGA compatibility hacks" (malc)
 - Revert "i8259: add -no-spurious-interrupt-hack option" (malc)
 - mips-linux-user: Always support rdhwr. (Richard Henderson)
 - target-mips: Streamline indexed cp1 memory addressing. (Richard Henderson)
 - Fix order of CVT.PS.S operands (Richard Sandiford)
 - Fix operands of RECIP2.S and RECIP2.PS (Richard Sandiford)
 - linux-user: Clarify Unable to reserve guest address space error (Peter Maydell)
 - linux-user: fix emulation of getdents (Dmitry V. Levin)
 - linux-user: arg_table need not have global scope (Jim Meyering)
 - tcg/ia64: fix and optimize ld/st slow path (Aurelien Jarno)
 - tcg/ia64: fix prologue/epilogue (Aurelien Jarno)
 - tcg/arm: Fix broken CONFIG_TCG_PASS_AREG0 code (Peter Maydell)
 - i8259: add -no-spurious-interrupt-hack option (Matthew Ogilvie)
 - vga: add some optional CGA compatibility hacks (Matthew Ogilvie)
 - qemu-options.hx: mention retrace= VGA option (Matthew Ogilvie)
 - vl: fix -hdachs/-hda argument order parsing issues (Matthew Ogilvie)
 - target-i386/translate.c: mov to/from crN/drN: ignore mod bits (Matthew Ogilvie)
 - fix some debug printf format strings (Matthew Ogilvie)
 - ivshmem: fix memory_region_del_eventfd assertion failure (Paolo Bonzini)
 - qom: object_delete should unparent the object first (Paolo Bonzini)
 - monitor: don't try to initialize json parser when monitor is HMP (Anthony Liguori)
 - target-mips: Fix some helper functions (VR54xx multiplication) (Stefan Weil)
 - target-mips: Enable access to required RDHWR hardware registers (Meador Inge)
 - monitor: move json init from OPEN event to init (Anthony Liguori)
 - boards: add a 'none' machine type to all platforms (Anthony Liguori)

Regards,

Anthony Liguori
[Qemu-devel] [PATCH 1/7] block: correctly set the keep_read_only flag
I believe the bs->keep_read_only flag is supposed to reflect
the initial open state of the device. If the device is initially
opened R/O, then commit operations, or reopen operations changing
to R/W, are prohibited.

Currently, the keep_read_only flag is only accurate for the active
layer, and its backing file. Subsequent images end up always having
the keep_read_only flag set.

For instance, what happens now:

[ base ]    kro = 1, ro = 1
    |
    v
[ snap-1 ]  kro = 1, ro = 1
    |
    v
[ snap-2 ]  kro = 0, ro = 1
    |
    v
[ active ]  kro = 0, ro = 0

What we want:

[ base ]    kro = 0, ro = 1
    |
    v
[ snap-1 ]  kro = 0, ro = 1
    |
    v
[ snap-2 ]  kro = 0, ro = 1
    |
    v
[ active ]  kro = 0, ro = 0

Signed-off-by: Jeff Cody jc...@redhat.com
---
 block.c | 16 +++-
 block.h |  1 +
 2 files changed, 8 insertions(+), 9 deletions(-)

diff --git a/block.c b/block.c
index 470bdcc..e31b76f 100644
--- a/block.c
+++ b/block.c
@@ -655,7 +655,7 @@ static int bdrv_open_common(BlockDriverState *bs, const char *filename,
      * Clear flags that are internal to the block layer before opening the
      * image.
      */
-    open_flags &= ~(BDRV_O_SNAPSHOT | BDRV_O_NO_BACKING);
+    open_flags &= ~(BDRV_O_SNAPSHOT | BDRV_O_NO_BACKING | BDRV_O_ALLOW_RDWR);
 
     /*
      * Snapshots should be writable.
@@ -664,8 +664,6 @@ static int bdrv_open_common(BlockDriverState *bs, const char *filename,
         open_flags |= BDRV_O_RDWR;
     }
 
-    bs->keep_read_only = bs->read_only = !(open_flags & BDRV_O_RDWR);
-
     /* Open the image, either directly or using a protocol */
     if (drv->bdrv_file_open) {
         ret = drv->bdrv_file_open(bs, filename, open_flags);
@@ -804,6 +802,12 @@ int bdrv_open(BlockDriverState *bs, const char *filename, int flags,
         goto unlink_and_fail;
     }
 
+    if (flags & BDRV_O_RDWR) {
+        flags |= BDRV_O_ALLOW_RDWR;
+    }
+
+    bs->keep_read_only = !(flags & BDRV_O_ALLOW_RDWR);
+
     /* Open the image */
     ret = bdrv_open_common(bs, filename, flags, drv);
     if (ret < 0) {
@@ -833,12 +837,6 @@ int bdrv_open(BlockDriverState *bs, const char *filename, int flags,
             bdrv_close(bs);
             return ret;
         }
-        if (bs->is_temporary) {
-            bs->backing_hd->keep_read_only = !(flags & BDRV_O_RDWR);
-        } else {
-            /* base image inherits from parent */
-            bs->backing_hd->keep_read_only = bs->keep_read_only;
-        }
     }
 
     if (!bdrv_key_required(bs)) {
diff --git a/block.h b/block.h
index 2e2be11..4d919c2 100644
--- a/block.h
+++ b/block.h
@@ -80,6 +80,7 @@ typedef struct BlockDevOps {
 #define BDRV_O_COPY_ON_READ 0x0400 /* copy read backing sectors into image */
 #define BDRV_O_INCOMING     0x0800 /* consistency hint for incoming migration */
 #define BDRV_O_CHECK        0x1000 /* open solely for consistency check */
+#define BDRV_O_ALLOW_RDWR   0x2000 /* allow reopen to change from r/o to r/w */
 
 #define BDRV_O_CACHE_MASK   (BDRV_O_NOCACHE | BDRV_O_CACHE_WB | BDRV_O_NO_FLUSH)
 
-- 
1.7.11.2
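The flag logic that this patch moves into bdrv_open() can be looked at in isolation. The following is only a minimal sketch, not code from the series: the SKETCH_ names and both helper functions are hypothetical, the 0x2000 value is taken from the block.h hunk, and 0x0002 for the read-write flag is an assumption.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the open flags. 0x2000 mirrors the value the
 * patch adds to block.h; 0x0002 for read-write is an assumption. */
#define SKETCH_O_RDWR        0x0002
#define SKETCH_O_ALLOW_RDWR  0x2000

/* Opening read-write implies that a later reopen as read-write is allowed. */
static int sketch_effective_flags(int flags)
{
    if (flags & SKETCH_O_RDWR) {
        flags |= SKETCH_O_ALLOW_RDWR;
    }
    return flags;
}

/* keep_read_only is set only when read-write access was never permitted. */
static bool sketch_keep_read_only(int flags)
{
    return !(sketch_effective_flags(flags) & SKETCH_O_ALLOW_RDWR);
}
```

Under these assumptions an image opened read-only but carrying the allow flag (as a backing file of a writable image presumably would) ends up with keep_read_only clear, which matches the "What we want" diagram.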
[Qemu-devel] [PATCH] ahci: add migration support
Add support for ahci migration. This patch builds upon the patches posted previously by Andreas Faerber: http://lists.gnu.org/archive/html/qemu-devel/2012-08/msg01538.html (I hope I am giving Andreas proper credit for his work.) I've tested these patches by migrating Windows 7 and Fedora 16 guests on both piix with ahci attached and on q35 (which has a built-in ahci controller). Signed-off-by: Jason Baron jba...@redhat.com --- hw/ide/ahci.c | 64 - hw/ide/ahci.h | 10 + hw/ide/ich.c | 11 +++-- 3 files changed, 81 insertions(+), 4 deletions(-) diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c index b53c757..e94509b 100644 --- a/hw/ide/ahci.c +++ b/hw/ide/ahci.c @@ -1204,6 +1204,65 @@ void ahci_reset(AHCIState *s) } } +static const VMStateDescription vmstate_ahci_device = { +.name = ahci port, +.version_id = 1, +.fields = (VMStateField []) { +VMSTATE_IDE_BUS(port, AHCIDevice), +VMSTATE_UINT32(port_state, AHCIDevice), +VMSTATE_UINT32(finished, AHCIDevice), +VMSTATE_UINT32(port_regs.lst_addr, AHCIDevice), +VMSTATE_UINT32(port_regs.lst_addr_hi, AHCIDevice), +VMSTATE_UINT32(port_regs.fis_addr, AHCIDevice), +VMSTATE_UINT32(port_regs.fis_addr_hi, AHCIDevice), +VMSTATE_UINT32(port_regs.irq_stat, AHCIDevice), +VMSTATE_UINT32(port_regs.irq_mask, AHCIDevice), +VMSTATE_UINT32(port_regs.cmd, AHCIDevice), +VMSTATE_UINT32(port_regs.tfdata, AHCIDevice), +VMSTATE_UINT32(port_regs.sig, AHCIDevice), +VMSTATE_UINT32(port_regs.scr_stat, AHCIDevice), +VMSTATE_UINT32(port_regs.scr_ctl, AHCIDevice), +VMSTATE_UINT32(port_regs.scr_err, AHCIDevice), +VMSTATE_UINT32(port_regs.scr_act, AHCIDevice), +VMSTATE_UINT32(port_regs.cmd_issue, AHCIDevice), +VMSTATE_END_OF_LIST() +}, +}; + +static int ahci_state_post_load(void *opaque, int version_id) +{ +int i; +AHCIState *s = opaque; + +for (i = 0; i s-ports; i++) { +AHCIPortRegs *pr = s-dev[i].port_regs; + +map_page(s-dev[i].lst, + ((uint64_t)pr-lst_addr_hi 32) | pr-lst_addr, 1024); +map_page(s-dev[i].res_fis, + ((uint64_t)pr-fis_addr_hi 32) | pr-fis_addr, 
256); +} + +return 0; +} + +const VMStateDescription vmstate_ahci = { +.name = ahci, +.version_id = 1, +.post_load = ahci_state_post_load, +.fields = (VMStateField []) { +VMSTATE_STRUCT_VARRAY_POINTER_INT32(dev, AHCIState, ports, + vmstate_ahci_device, AHCIDevice), +VMSTATE_UINT32(control_regs.cap, AHCIState), +VMSTATE_UINT32(control_regs.ghc, AHCIState), +VMSTATE_UINT32(control_regs.irqstatus, AHCIState), +VMSTATE_UINT32(control_regs.impl, AHCIState), +VMSTATE_UINT32(control_regs.version, AHCIState), +VMSTATE_UINT32(idp_index, AHCIState), +VMSTATE_END_OF_LIST() +}, +}; + typedef struct SysbusAHCIState { SysBusDevice busdev; AHCIState ahci; @@ -1212,7 +1271,10 @@ typedef struct SysbusAHCIState { static const VMStateDescription vmstate_sysbus_ahci = { .name = sysbus-ahci, -.unmigratable = 1, +.fields = (VMStateField []) { +VMSTATE_AHCI(ahci, AHCIPCIState), +VMSTATE_END_OF_LIST() +}, }; static void sysbus_ahci_reset(DeviceState *dev) diff --git a/hw/ide/ahci.h b/hw/ide/ahci.h index 1200a56..7719dbf 100644 --- a/hw/ide/ahci.h +++ b/hw/ide/ahci.h @@ -307,6 +307,16 @@ typedef struct AHCIPCIState { AHCIState ahci; } AHCIPCIState; +extern const VMStateDescription vmstate_ahci; + +#define VMSTATE_AHCI(_field, _state) { \ +.name = (stringify(_field)), \ +.size = sizeof(AHCIState), \ +.vmsd = vmstate_ahci, \ +.flags = VMS_STRUCT,\ +.offset = vmstate_offset_value(_state, _field, AHCIState), \ +} + typedef struct NCQFrame { uint8_t fis_type; uint8_t c; diff --git a/hw/ide/ich.c b/hw/ide/ich.c index 272b773..ae6f56f 100644 --- a/hw/ide/ich.c +++ b/hw/ide/ich.c @@ -79,9 +79,14 @@ #define ICH9_IDP_INDEX 0x10 #define ICH9_IDP_INDEX_LOG2 0x04 -static const VMStateDescription vmstate_ahci = { +static const VMStateDescription vmstate_ich9_ahci = { .name = ahci, -.unmigratable = 1, +.version_id = 1, +.fields = (VMStateField []) { +VMSTATE_PCI_DEVICE(card, AHCIPCIState), +VMSTATE_AHCI(ahci, AHCIPCIState), +VMSTATE_END_OF_LIST() +}, }; static void pci_ich9_reset(DeviceState *dev) @@ 
-152,7 +157,7 @@ static void ich_ahci_class_init(ObjectClass *klass, void *data) k-device_id = PCI_DEVICE_ID_INTEL_82801IR; k-revision = 0x02; k-class_id = PCI_CLASS_STORAGE_SATA; -
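The post_load hook in this patch re-maps each port's command list and received-FIS area from registers that are stored as separate high and low 32-bit halves. The archive has dropped the shift operator from that expression; the sketch below assumes it was a left shift by 32. The function name is hypothetical and not part of the patch.

```c
#include <stdint.h>

/* Combine the high and low 32-bit halves of a guest-physical address,
 * as the post_load hook appears to do for lst_addr_hi/lst_addr and
 * fis_addr_hi/fis_addr. */
static uint64_t sketch_join_addr(uint32_t hi, uint32_t lo)
{
    /* Widen before shifting: shifting a 32-bit value by 32 is undefined. */
    return ((uint64_t)hi << 32) | lo;
}
```

The cast has to come before the shift, which is why the patch writes `(uint64_t)pr->lst_addr_hi` rather than casting the shifted result.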
Re: [Qemu-devel] [PATCH] ahci: properly reset PxCMD on HBA reset
On Fri, Aug 24, 2012 at 06:39:02AM +0200, Alexander Graf wrote:
>> While testing q35, I found that windows 7 (specifically, windows 7
>> ultimate with sp1 x64), wouldn't install because it can't find the cdrom
>> or disk drive. The failure message is: 'A required cd/dvd device driver
>> is missing. If you have a driver floppy disk, CD, DVD, or USB flash
>> drive, please insert it now.'
>>
>> This can also be reproduced on piix by adding an ahci controller, and
>> observing that windows 7 does not see any devices behind it.
>>
>> The problem is that when windows issues a HBA reset, qemu does not reset
>> the individual ports' PxCMD register. Windows 7 then reads back the
>> PxCMD register and presumably assumes that the ahci controller has
>> already been initialized. Windows then never sets up the PxIE register
>> to enable interrupts, and thus it never gets irqs back when it sends ata
>> device inquiry commands.
>>
>> I believe this change brings qemu into ahci 1.3 specification
>> compliance.
>>
>> Section 10.4.3 HBA Reset:
>>
>> When GHC.HR is set to '1', GHC.AE, GHC.IE, the IS register, and all
>> port register fields (except PxFB/PxFBU/PxCLB/PxCLBU) that are not
>> HwInit in the HBA's register memory space are reset.
>>
>> I've also re-tested Fedora 16 and 17 to verify that they continue to
>> work with this change.
>
> What a nasty little bug. If it makes it work for you, the change is all
> fine from my POV (and should go into 1.2).

Ok, I don't see it in 1.2.0-rc2. If others are ok with this for 1.2, whose tree should this go through?

Thanks,

-Jason
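The reset rule quoted from section 10.4.3 can be sketched as follows. This is an illustration only, not the patch under discussion: the struct and function are hypothetical, the field names are borrowed from the port_regs fields seen in the ahci migration patch, and clearing PxCMD to zero is a simplification that ignores HwInit bits (a real reset value may keep some bits set).

```c
#include <stdint.h>

/* Hypothetical, reduced view of one AHCI port's registers. */
typedef struct SketchPortRegs {
    uint32_t lst_addr;     /* PxCLB:  preserved across HBA reset */
    uint32_t lst_addr_hi;  /* PxCLBU: preserved */
    uint32_t fis_addr;     /* PxFB:   preserved */
    uint32_t fis_addr_hi;  /* PxFBU:  preserved */
    uint32_t irq_stat;     /* PxIS:   reset */
    uint32_t irq_mask;     /* PxIE:   reset */
    uint32_t cmd;          /* PxCMD:  reset */
} SketchPortRegs;

/* Reset every field except the command-list and FIS base addresses. */
static void sketch_hba_reset_port(SketchPortRegs *pr)
{
    pr->irq_stat = 0;
    pr->irq_mask = 0;
    pr->cmd = 0;  /* simplification: HwInit bits are not modelled */
}
```

With PxCMD cleared, a guest reading it back after the HBA reset no longer sees the port as already started, which is the behaviour the report says Windows 7 relies on.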
[Qemu-devel] [PATCH for 1.2] w32: Fix broken build
Commit ef8621b1a3b199c348606c0a11a77d8e8bf135f1 added an include file
which is not available for MinGW compilations.

Signed-off-by: Stefan Weil s...@weilnetz.de
---
 target-i386/cpu.c |    2 ++
 1 file changed, 2 insertions(+)

diff --git a/target-i386/cpu.c b/target-i386/cpu.c
index f3cac49..423e009 100644
--- a/target-i386/cpu.c
+++ b/target-i386/cpu.c
@@ -33,7 +33,9 @@
 #include "hyperv.h"
 
 #include "hw/hw.h"
+#if defined(CONFIG_KVM)
 #include <linux/kvm_para.h>
+#endif
 
 /* feature flags taken from "Intel Processor Identification and the CPUID
  * Instruction" and AMD's "CPUID Specification". In cases of disagreement
-- 
1.7.10
Re: [Qemu-devel] [PATCH] ahci: properly reset PxCMD on HBA reset
On 30.08.2012, at 12:59, Jason Baron jba...@redhat.com wrote:
> On Fri, Aug 24, 2012 at 06:39:02AM +0200, Alexander Graf wrote:
>>> While testing q35, I found that windows 7 (specifically, windows 7
>>> ultimate with sp1 x64), wouldn't install because it can't find the cdrom
>>> or disk drive. The failure message is: 'A required cd/dvd device driver
>>> is missing. If you have a driver floppy disk, CD, DVD, or USB flash
>>> drive, please insert it now.'
>>>
>>> This can also be reproduced on piix by adding an ahci controller, and
>>> observing that windows 7 does not see any devices behind it.
>>>
>>> The problem is that when windows issues a HBA reset, qemu does not reset
>>> the individual ports' PxCMD register. Windows 7 then reads back the
>>> PxCMD register and presumably assumes that the ahci controller has
>>> already been initialized. Windows then never sets up the PxIE register
>>> to enable interrupts, and thus it never gets irqs back when it sends ata
>>> device inquiry commands.
>>>
>>> I believe this change brings qemu into ahci 1.3 specification
>>> compliance.
>>>
>>> Section 10.4.3 HBA Reset:
>>>
>>> When GHC.HR is set to '1', GHC.AE, GHC.IE, the IS register, and all
>>> port register fields (except PxFB/PxFBU/PxCLB/PxCLBU) that are not
>>> HwInit in the HBA's register memory space are reset.
>>>
>>> I've also re-tested Fedora 16 and 17 to verify that they continue to
>>> work with this change.
>>
>> What a nasty little bug. If it makes it work for you, the change is all
>> fine from my POV (and should go into 1.2).
>
> Ok, I don't see it in 1.2.0-rc2. If others are ok with this for 1.2, whose
> tree should this go through?

Kevin :)

Alex

> Thanks,
>
> -Jason
Re: [Qemu-devel] [PATCH for 1.2] console: Fix warning from clang (and potential crash)
On 17.08.2012 16:10, Jan Kiszka wrote:
> On 2012-08-17 15:50, Stefan Weil wrote:
>> ccc-analyzer reports this warning:
>>
>> console.c:1090:29: warning: Dereference of null pointer
>>         if (active_console->cursor_timer) {
>>             ^
>>
>> Function console_select allows active_console to be NULL,
>> but would crash when accessing cursor_timer. Fix this.
>>
>> Signed-off-by: Stefan Weil s...@weilnetz.de
>> ---
>>
>> Please note that I don't have a test case which triggers the crash.
>>
>> Regards,
>> Stefan Weil
>>
>>  console.c |    2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/console.c b/console.c
>> index 4525cc7..f5e8814 100644
>> --- a/console.c
>> +++ b/console.c
>> @@ -1087,7 +1087,7 @@ void console_select(unsigned int index)
>>      if (s) {
>>          DisplayState *ds = s->ds;
>>
>> -        if (active_console->cursor_timer) {
>> +        if (active_console && active_console->cursor_timer) {
>>              qemu_del_timer(active_console->cursor_timer);
>>          }
>>          active_console = s;
>
> The only path that could trigger this is console_select() in the absence
> of any console. Not sure if that is possible, but the above is surely
> consistent with existing code.
>
> Reviewed-by: Jan Kiszka jan.kis...@siemens.com
>
> Jan

Ping? It's still missing in QEMU 1.2.
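The guard added by this patch relies on short-circuit evaluation: the member is only read once the pointer itself is known to be non-NULL. A reduced sketch, with a hypothetical struct and helper standing in for the console code:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced stand-in for the console state. */
typedef struct SketchConsole {
    void *cursor_timer;
} SketchConsole;

/* Returns true only when there is an active console that owns a timer.
 * The right-hand operand is never evaluated when active is NULL. */
static bool sketch_has_cursor_timer(const SketchConsole *active)
{
    return active != NULL && active->cursor_timer != NULL;
}
```

Without the first operand, a call made before any console exists would dereference NULL, which is the case the analyzer flagged.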
[Qemu-devel] VHDX support
Is anyone currently working on VHDX (as opposed to VHD) support, as used by the most recent version of Hyper-V? If not, would you be interested in patches? File format at: http://www.microsoft.com/en-us/download/details.aspx?id=29681 (Word format, sadly) -- Alex Bligh
[Qemu-devel] Adding support for Stateless Static NAT for TAP devices
When running multiple instances of QEMU from the same image file (using -snapshot) and connecting each instance to a dedicated TAP device, the Guest OS will most likely not be able to communicate with the outside world as all packets leave the Guest OS from the same IP and thus the Host OS will have difficulty returning the packets to the correct TAP device/Guest OS. Stateless Static Network Address Translation or SSNAT allows the QEMU to map the network of the Guest OS to the network of the TAP device allowing a unique IP address for each Guest OS that ease such case. The only mandatory argument to the SSNAT is the Guest OS network IP, the rest will be figured out from the underlying TAP device. Signed-off-by: John Basila jbas...@checkpoint.com --- net/tap.c| 369 +- qapi-schema.json |5 +- qemu-options.hx | 10 ++- 3 files changed, 381 insertions(+), 3 deletions(-) diff --git a/net/tap.c b/net/tap.c index 1971525..2408a49 100644 --- a/net/tap.c +++ b/net/tap.c @@ -39,16 +39,88 @@ #include qemu-char.h #include qemu-common.h #include qemu-error.h +#include qemu_socket.h #include net/tap-linux.h #include hw/vhost_net.h +#include checksum.h + +#define ETH_P_ARP 0x0806 /* Address Resolution packet */ +#define ETH_P_IP 0x0800 /* Internet Protocol packet */ +#define ETH_P_IPV6 0x86DD /* IPv6 over blueblook */ + +#define ETH_ALEN 6 + +struct ethhdr { +unsigned char h_dest[ETH_ALEN]; /* destination eth addr */ +unsigned char h_source[ETH_ALEN]; /* source ether addr*/ +unsigned short h_proto;/* packet type ID field */ +}; + +#define IP_PROTO_TCP 6 +#define IP_PROTO_UDP 17 +#define IPV4_ADRESS_LENGTH 4 + +struct arphdr { +unsigned short ar_hrd; /* format of hardware address */ +unsigned short ar_pro; /* format of protocol address */ +unsigned char ar_hln; /* length of hardware address */ +unsigned char ar_pln; /* length of protocol address */ +unsigned short ar_op; /* ARP opcode (command) */ + +/* + * Ethernet looks like this : This bit is variable sized however... 
+ */ +unsigned char ar_sha[ETH_ALEN];/* sender hardware address */ +unsigned char ar_sip[IPV4_ADRESS_LENGTH]; /* sender IP address */ +unsigned char ar_tha[ETH_ALEN];/* target hardware address */ +unsigned char ar_tip[IPV4_ADRESS_LENGTH]; /* target IP address */ +}; + +#define IP_HEADER_LENGTH(ip) (((ip-ip_hlv)0xf) 2) + +/** An IPv4 packet header */ +struct iphdr { + uint8_t ip_hlv; /** Header length and version of the header */ + uint8_t ip_tos; /** Type of Service */ + uint16_t ip_len; /** Length in octets, inlc. this header and data */ + uint16_t ip_id; /** ID is used to aid in assembling framents */ + uint16_t ip_off; /** Info about fragmentation (control, offset) */ + uint8_t ip_ttl; /** Time to Live */ + uint8_t ip_p;/** Next level protocol type */ + uint16_t ip_sum; /** Header checksum */ + uint32_t ip_src; /** Source IP address */ + uint32_t ip_dst; /** Destination IP address */ +}; + +/** UDP packet header */ +typedef struct udphdr { +uint16_t uh_sport; /* source port */ +uint16_t uh_dport; /* destination port */ +uint16_t uh_ulen; /* udp length */ +uint16_t uh_chksum;/* udp checksum */ +} udp_header; + + /* Maximum GSO packet size (64k) plus plenty of room for * the ethernet and virtio_net headers */ #define TAP_BUFSIZE (4096 + 65536) +typedef struct SSNATInfo { + unsigned int ssnat_active : 1; + + struct in_addr ssnat_ifaddr; + struct in_addr ssnat_ifmask; + uint8_t ssnat_hwaddr[ETH_ALEN]; + + struct in_addr ssnat_guest; + struct in_addr ssnat_host; + struct in_addr ssnat_mask; +} SSNATInfo; + typedef struct TAPState { NetClientState nc; int fd; @@ -59,6 +131,9 @@ typedef struct TAPState { unsigned int write_poll : 1; unsigned int using_vnet_hdr : 1; unsigned int has_ufo: 1; + + SSNATInfo ssnat_info; + VHostNetState *vhost_net; unsigned host_vnet_hdr_len; } TAPState; @@ -154,11 +229,154 @@ static ssize_t tap_receive_raw(NetClientState *nc, const uint8_t *buf, size_t si return tap_write_packet(s, iov, iovcnt); } +#define SSNAT_MAP_IP(_orig, _to, _mask) ( 
(_orig.s_addr ~_mask.s_addr) | (_to.s_addr _mask.s_addr) ) +#define SSNAT_IS_MATCH(_orig, _from, _mask)( (_orig.s_addr _mask.s_addr) == (_from.s_addr _mask.s_addr) ) + +static void tap_ssnat_translate_arp(uint8_t* buf, size_t size, const struct in_addr from, const struct in_addr to, const
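The two SSNAT macros above have lost their bitwise operators in the archive. Assuming the missing operator is a bitwise AND in each place, they keep the host bits of an address and swap the network bits. A minimal sketch on plain 32-bit values follows; the function names are hypothetical, and the operations give the same result in any byte order as long as all three operands use the same one.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when addr lies in the network from/mask. */
static bool sketch_ssnat_is_match(uint32_t addr, uint32_t from, uint32_t mask)
{
    return (addr & mask) == (from & mask);
}

/* Keep the host part of addr and replace its network part with that of to. */
static uint32_t sketch_ssnat_map_ip(uint32_t addr, uint32_t to, uint32_t mask)
{
    return (addr & ~mask) | (to & mask);
}
```

For example, with a /24 mask, guest address 10.0.0.2 mapped onto network 192.168.5.0 keeps its host part and becomes 192.168.5.2. Because the mapping depends only on the packet and the configured networks, no per-connection state is needed, which is what makes the translation stateless.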
[Qemu-devel] [PATCH] target-i386: Allow changing of Hypervisor CPUIDs.
This is primarily done so that the guest will think it is running under vmware when hypervisor=vmware is specified as a property of a cpu. Also allow this to work in accel=tcg mode. The new cpu properties hyper_level, hyper_extra, hyper_extra_a, and hyper_extra_b can be used to further adjust what the guest sees. Signed-off-by: Don Slutz d...@cloudswitch.com --- target-i386/cpu.c | 178 + target-i386/cpu.h |9 +++ target-i386/kvm.c | 33 -- 3 files changed, 214 insertions(+), 6 deletions(-) diff --git a/target-i386/cpu.c b/target-i386/cpu.c index f3cac49..9e82b76 100644 --- a/target-i386/cpu.c +++ b/target-i386/cpu.c @@ -26,6 +26,7 @@ #include qemu-option.h #include qemu-config.h +#include qemu-timer.h #include qapi/qapi-visit-core.h #include arch_init.h @@ -244,6 +245,15 @@ typedef struct x86_def_t { uint32_t xlevel2; /* The feature bits on CPUID[EAX=7,ECX=0].EBX */ uint32_t cpuid_7_0_ebx_features; +/* Hypervisor CPUIDs */ +uint32_t cpuid_hv_level; +uint32_t cpuid_hv_vendor1; +uint32_t cpuid_hv_vendor2; +uint32_t cpuid_hv_vendor3; +/* VMware extra data */ +uint32_t cpuid_hv_extra; +uint32_t cpuid_hv_extra_a; +uint32_t cpuid_hv_extra_b; } x86_def_t; #define I486_FEATURES (CPUID_FP87 | CPUID_VME | CPUID_PSE) @@ -860,6 +870,18 @@ static void x86_cpuid_set_tsc_freq(Object *obj, Visitor *v, void *opaque, cpu-env.tsc_khz = value / 1000; } +static void x86_cpuid_set_hv(x86_def_t *x86_cpu_def, uint32_t level, + const char *who) +{ +uint32_t signature[3]; + +memcpy(signature, who, 12); +x86_cpu_def-cpuid_hv_level = level; +x86_cpu_def-cpuid_hv_vendor1 = signature[0]; +x86_cpu_def-cpuid_hv_vendor2 = signature[1]; +x86_cpu_def-cpuid_hv_vendor3 = signature[2]; +} + static int cpu_x86_find_by_name(x86_def_t *x86_cpu_def, const char *cpu_model) { unsigned int i; @@ -867,6 +889,10 @@ static int cpu_x86_find_by_name(x86_def_t *x86_cpu_def, const char *cpu_model) char *s = g_strdup(cpu_model); char *featurestr, *name = strtok(s, ,); +bool hyperv_enabled = false; +bool hv_enabled = 
false; +long hyper_level = -1; +long hyper_extra = -1; /* Features to be added*/ uint32_t plus_features = 0, plus_ext_features = 0; uint32_t plus_ext2_features = 0, plus_ext3_features = 0; @@ -993,12 +1019,84 @@ static int cpu_x86_find_by_name(x86_def_t *x86_cpu_def, const char *cpu_model) x86_cpu_def-tsc_khz = tsc_freq / 1000; } else if (!strcmp(featurestr, hv_spinlocks)) { char *err; + +if (hv_enabled) { +fprintf(stderr, +Only one of hypervisor= or hv_* can be used at one time.\n); +goto error; +} numvalue = strtoul(val, err, 0); if (!*val || *err) { fprintf(stderr, bad numerical value %s\n, val); goto error; } +hyperv_enabled = true; hyperv_set_spinlock_retries(numvalue); +} else if (!strcmp(featurestr, hyper_level)) { +char *err; +long longvalue = strtol(val, err, 0); + +if (!*val || *err) { +fprintf(stderr, bad numerical value for hyper_level=%s\n, +val); +goto error; +} +hyper_level = longvalue; +} else if (!strcmp(featurestr, hyper_extra)) { +char *err; +long longvalue = strtol(val, err, 0); + +if (!*val || *err) { +fprintf(stderr, bad numerical value for hyper_extra=%s\n, +val); +goto error; +} +hyper_extra = longvalue; +} else if (!strcmp(featurestr, hyper_extra_a)) { +char *err; + +numvalue = strtoul(val, err, 0); +if (!*val || *err) { +fprintf(stderr, +bad numerical value for hyper_extra_a=%s\n, +val); +goto error; +} +x86_cpu_def-cpuid_hv_extra_a = (uint32_t)numvalue; +} else if (!strcmp(featurestr, hyper_extra_b)) { +char *err; + +numvalue = strtoul(val, err, 0); +if (!*val || *err) { +fprintf(stderr, +bad numerical value for hyper_extra_b=%s\n, +val); +goto error; +} +x86_cpu_def-cpuid_hv_extra_b = (uint32_t)numvalue; +} else if (!strcmp(featurestr, hv) || + !strcmp(featurestr,
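The x86_cpuid_set_hv() helper in this patch copies a 12-byte vendor string into three 32-bit words, one per CPUID output register. The sketch below reproduces only that split; the struct and function names are hypothetical, and the caller is assumed to pass at least 12 readable bytes, as the memcpy in the patch also requires.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical container for the three signature words. */
typedef struct SketchHvSignature {
    uint32_t vendor1;
    uint32_t vendor2;
    uint32_t vendor3;
} SketchHvSignature;

/* Split a 12-byte hypervisor signature into three 32-bit words.
 * who must point to at least 12 readable bytes. */
static SketchHvSignature sketch_split_signature(const char *who)
{
    uint32_t words[3];
    SketchHvSignature sig;

    /* memcpy avoids alignment and aliasing problems of a pointer cast. */
    memcpy(words, who, sizeof(words));
    sig.vendor1 = words[0];
    sig.vendor2 = words[1];
    sig.vendor3 = words[2];
    return sig;
}
```

The numeric value of each word depends on host byte order, but copying the three words back out in sequence always yields the original 12 bytes. A string shorter than 11 characters plus its terminator would make the copy read past the end of the buffer.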
Re: [Qemu-devel] Adding support for Stateless Static NAT for TAP devices
Please allow me to add a few comments:

The problem here is related to the fact that QEMU is executed with multiple instances and all instances start from the same snapshot. If they all send a UDP DNS query, each of them creates the same packet, for example 10.0.0.2:2345 - DNSERVER:53, with the same source port.

The first packet that reaches the ipfilter goes through the iptables rules and gets NATed properly. When the second QEMU instance sends the same UDP packet, it does not get to run through the iptables rules, because the ipfilter has already seen this packet, even though it should be RELATED to a different connection. As a result, the response packets of machine B are received via machine A, since the NAT rule de-NATs the return packet to the connection that belongs to machine A.

John

-----Original Message-----
From: Stefan Hajnoczi [mailto:stefa...@gmail.com]
Sent: Thursday, August 30, 2012 1:44 PM
To: John Basila
Cc: qemu-devel@nongnu.org; Anthony Liguori; Rusty Russell; netfil...@vger.kernel.org
Subject: Re: Adding support for Stateless Static NAT for TAP devices

On Thu, Aug 30, 2012 at 10:27 AM, John Basila jbas...@checkpoint.com wrote:
> I have tried NAT and this is why I came up with this feature.

QEMU's net/tap.c is the wrong place to add NAT code. The point of tap is to use the host network stack. If you want userspace networking, use -netdev user or -netdev socket.

Please look into iptables more. I have CCed the netfilter mailing list.

The question is:

The host has several tap interfaces (tap0, tap1, ...) and the machine on the other end of each tap interface uses IP address 10.0.0.2. So we have:

tap0 - virtual machine #0 (10.0.0.2)
tap1 - virtual machine #1 (10.0.0.2)
tap2 - virtual machine #2 (10.0.0.2)

Because the virtual machines all use the same static IP address, they cannot communicate with each other or the outside world (they fight over ARP).

We'd like to NAT the tap interfaces:

tap0 - virtual machine #0 (10.0.0.2 NAT to 192.168.0.2)
tap1 - virtual machine #1 (10.0.0.2 NAT to 192.168.0.3)
tap2 - virtual machine #2 (10.0.0.2 NAT to 192.168.0.4)

This would allow the virtual machines to communicate even though each believes it is 10.0.0.2.

How can this be done using iptables and friends?

Thanks,
Stefan
Re: [Qemu-devel] Adding support for Stateless Static NAT for TAP devices
I have tried NAT and this is why I came up with this feature.

When starting multiple QEMU instances from the same snapshot image, the Guest OS in all instances starts from the same state, and if they open a connection to the DNS server, for example, they will all use the same source port. iptables will NAT the first packet it sees, but when the second QEMU instance sends the same packet, iptables will match the already NATed connection and thus cause problems for the returning packets.

SSNAT solves this problem by letting iptables observe a unique connection for each instance.

Regarding vhost=on, I can disallow the use of both together, which I think is fair.

John

-----Original Message-----
From: Stefan Hajnoczi [mailto:stefa...@gmail.com]
Sent: Thursday, August 30, 2012 12:14 PM
To: John Basila
Cc: qemu-devel@nongnu.org; Anthony Liguori
Subject: Re: Adding support for Stateless Static NAT for TAP devices

On Thu, Aug 30, 2012 at 09:12:19AM +0300, John Basila wrote:
> When running multiple instances of QEMU from the same image file
> (using -snapshot) and connecting each instance to a dedicated TAP
> device, the Guest OS will most likely not be able to communicate
> with the outside world as all packets leave the Guest OS from the
> same IP and thus the Host OS will have difficulty returning the
> packets to the correct TAP device/Guest OS. Stateless Static
> Network Address Translation or SSNAT allows the QEMU to map the
> network of the Guest OS to the network of the TAP device allowing
> a unique IP address for each Guest OS that ease such case.
> The only mandatory argument to the SSNAT is the Guest OS network
> IP, the rest will be figured out from the underlying TAP device.
>
> Signed-off-by: John Basila jbas...@checkpoint.com
> ---
>  net/tap.c        | 369 +-
>  qapi-schema.json |   5 +-
>  qemu-options.hx  |  10 ++-
>  3 files changed, 381 insertions(+), 3 deletions(-)

This does not work with vhost=on because the host-guest packet processing happens in vhost_net.ko instead of in QEMU.

Use iptables on the host to NAT the tap interface.

Stefan
Re: [Qemu-devel] Slow inbound traffic on macvtap interfaces
Chris Webb wrote:
> I found that on my laptop, the single change of host kernel config
>
> -CONFIG_INTEL_IDLE=y
> +# CONFIG_INTEL_IDLE is not set
>
> is sufficient to turn transfers into guests from slow to full wire speed

I am not deep enough in this code to write a patch, but I wonder if macvtap_forward in macvtap.c is missing a call to kill_fasync, which I understand is used to signal to interested processes when data arrives?

Here is the end of macvtap_forward:

    skb_queue_tail(&q->sk.sk_receive_queue, skb);
    wake_up_interruptible_poll(sk_sleep(&q->sk), POLLIN | POLLRDNORM | POLLRDBAND);
    return NET_RX_SUCCESS;

Compared to this end of tun_net_xmit in tun.c:

    /* Enqueue packet */
    skb_queue_tail(&tun->socket.sk->sk_receive_queue, skb);

    /* Notify and wake up reader process */
    if (tun->flags & TUN_FASYNC)
        kill_fasync(&tun->fasync, SIGIO, POLL_IN);
    wake_up_interruptible_poll(&tun->wq.wait, POLLIN | POLLRDNORM | POLLRDBAND);
    return NETDEV_TX_OK;

Richard.
Re: [Qemu-devel] [RFC v2 PATCH 0/6] Live block commit
On 08/30/2012 02:47 PM, Jeff Cody wrote:
> Live block commit.
>
> I originally had intended for this RFC series to include the more
> complicated case of a live commit of the active layer, but removed
> it for this commit in the hopes of making it into the soft feature
> freeze for 1.2, so this series is the simpler case.
>
> This series adds the basic case, of a live commit between two
> images below the active layer, e.g.:
>
> [base] --- [snp-1] --- [snp-2] --- [snp-3] --- [active]
>
> can be collapsed down via commit, into:
>
> [base] --- [active]
>
> or,
>
> [base] --- [snp-1] --- [active],
> [base] --- [snp-3] --- [active],
>
> etc..
>
> TODO:
> * qemu-io tests (in progress)
> * 'stage-2' of live commit functionality, to be able to push down the
>   active layer. This is structured something like mirroring, to allow
>   for convergence.
>
> Changes from the RFC v1 series:
>
> * This patch series is not on top of Paolo's blk mirror series yet,
>   to make it easier to apply independently if desired. This means some
>   of what was in the previous RFC series is not in this one
>   (BlockdevOnError, for instance), but that can be easily added in
>   once Paolo's series are in.
> * This patch series is dependent on the reopen() series with
>   transactional reopen.
> * The target release for this series is 1.3
> * Found some mistakes in the reopen calls
> * Dropped the BlockdevOnError argument (for now), will add in if
>   rebasing on top of Paolo's series.
> * Used the new qerror system

I meant to add this to my cover letter, but forgot; if anyone wants to play around with this, you can find it on github:

git://github.com/codyprime/qemu-kvm-jtc.git (branch jtc-live-commit-1.3)

> Jeff Cody (6):
>   1/6 block: add support functions for live commit, to find and delete images.
>   2/6 block: add live block commit functionality
>   3/6 blockdev: rename block_stream_cb to a generic block_job_cb
>   4/6 qerror: new error for live block commit, QERR_TOP_NOT_FOUND
>   5/6 block: helper function, to find the base image of a chain
>   6/6 QAPI: add command for live block commit, 'block-commit'
>
>  block.c             | 158
>  block.h             |   6 +-
>  block/Makefile.objs |   1 +
>  block/commit.c      | 202
>  block_int.h         |  19 +
>  blockdev.c          |  91 ++-
>  qapi-schema.json    |  30
>  qerror.h            |   3 +
>  qmp-commands.hx     |   6 ++
>  trace-events        |   4 +-
>  10 files changed, 515 insertions(+), 5 deletions(-)
>  create mode 100644 block/commit.c
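The chain handling that patches 1/6 and 5/6 describe can be pictured as operations on a singly linked list of images, each pointing at its backing file. The sketch below is not taken from the series: the struct and both functions are hypothetical, and the data copy that a real commit performs before images are dropped is left out entirely.

```c
#include <stddef.h>

/* Hypothetical model of an image chain: each image points at the image
 * it is backed by, and the base image has no backing file. */
typedef struct SketchImage {
    const char *name;
    struct SketchImage *backing;  /* NULL for the base image */
} SketchImage;

/* Walk the backing links down to the base image of the chain. */
static SketchImage *sketch_find_base(SketchImage *img)
{
    while (img != NULL && img->backing != NULL) {
        img = img->backing;
    }
    return img;
}

/* Re-point overlay at new_backing, dropping the images in between from
 * the chain. Assumes their data has already been committed downwards. */
static void sketch_rebase_overlay(SketchImage *overlay, SketchImage *new_backing)
{
    overlay->backing = new_backing;
}
```

Applied to the chain from the cover letter, re-pointing [active] at [base] after the intermediate snapshots have been committed gives the collapsed [base] --- [active] layout.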