[Qemu-devel] [PATCH 2/4] vnc: really call zlib if we want zlib
Signed-off-by: Corentin Chary --- vnc.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/vnc.c b/vnc.c index a91c3a3..d0c0d00 100644 --- a/vnc.c +++ b/vnc.c @@ -655,7 +655,7 @@ static void send_framebuffer_update(VncState *vs, int x, int y, int w, int h) { switch(vs->vnc_encoding) { case VNC_ENCODING_ZLIB: -vnc_hextile_send_framebuffer_update(vs, x, y, w, h); +vnc_zlib_send_framebuffer_update(vs, x, y, w, h); break; case VNC_ENCODING_HEXTILE: vnc_framebuffer_update(vs, x, y, w, h, VNC_ENCODING_HEXTILE); -- 1.7.0.2
[Qemu-devel] [PATCH 3/4] vnc: only use a single zlib stream
According to http://tigervnc.org/cgi-bin/rfbproto#zlib-encoding Signed-off-by: Corentin Chary --- vnc-encoding-zlib.c | 12 +--- vnc.h |2 +- 2 files changed, 6 insertions(+), 8 deletions(-) diff --git a/vnc-encoding-zlib.c b/vnc-encoding-zlib.c index 4a495ad..6a16a79 100644 --- a/vnc-encoding-zlib.c +++ b/vnc-encoding-zlib.c @@ -54,9 +54,9 @@ static void vnc_zlib_start(VncState *vs) vs->output = vs->zlib; } -static int vnc_zlib_stop(VncState *vs, int stream_id) +static int vnc_zlib_stop(VncState *vs) { -z_streamp zstream = &vs->zlib_stream[stream_id]; +z_streamp zstream = &vs->zlib_stream; int previous_out; // switch back to normal output/zlib buffers @@ -70,7 +70,7 @@ static int vnc_zlib_stop(VncState *vs, int stream_id) if (zstream->opaque != vs) { int err; -VNC_DEBUG("VNC: initializing zlib stream %d\n", stream_id); +VNC_DEBUG("VNC: initializing zlib stream\n"); VNC_DEBUG("VNC: opaque = %p | vs = %p\n", zstream->opaque, vs); zstream->zalloc = zalloc; zstream->zfree = zfree; @@ -122,7 +122,7 @@ void vnc_zlib_send_framebuffer_update(VncState *vs, int x, int y, int w, int h) // compress the stream vnc_zlib_start(vs); vnc_raw_send_framebuffer_update(vs, x, y, w, h); -bytes_written = vnc_zlib_stop(vs, 0); +bytes_written = vnc_zlib_stop(vs); if (bytes_written == -1) return; @@ -136,7 +136,5 @@ void vnc_zlib_send_framebuffer_update(VncState *vs, int x, int y, int w, int h) void vnc_zlib_init(VncState *vs) { -int i; -for (i=0; i<(sizeof(vs->zlib_stream) / sizeof(z_stream)); i++) -vs->zlib_stream[i].opaque = NULL; +vs->zlib_stream.opaque = NULL; } diff --git a/vnc.h b/vnc.h index 1aa71b0..dfdb240 100644 --- a/vnc.h +++ b/vnc.h @@ -173,7 +173,7 @@ struct VncState /* Zlib */ Buffer zlib; Buffer zlib_tmp; -z_stream zlib_stream[4]; +z_stream zlib_stream; Notifier mouse_mode_notifier; -- 1.7.0.2
[Qemu-devel] Re: [patch uq/master 0/9] enable smp > 1 and related fixes
On 05/05/2010 09:24 PM, Anthony Liguori wrote: On 05/04/2010 07:45 AM, Marcelo Tosatti wrote: -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html How does this work without an in-kernel apic (or does uq/master already have an in-kernel apic)? An in-kernel apic isn't needed (qemu-kvm supports smp with userspace apic). -- error compiling committee.c: too many arguments to function
[Qemu-devel] Re: [patch uq/master 0/9] enable smp > 1 and related fixes
On 05/04/2010 03:45 PM, Marcelo Tosatti wrote: Applied all, thanks. -- error compiling committee.c: too many arguments to function
[Qemu-devel] [PATCH] block: Fix protocol detection for Windows devices
We can't assume the file protocol for Windows devices, they need the same detection as other files for which an explicit protocol is not specified. Signed-off-by: Kevin Wolf --- block.c | 10 ++ 1 files changed, 6 insertions(+), 4 deletions(-) diff --git a/block.c b/block.c index 10a2b31..adb3e5d 100644 --- a/block.c +++ b/block.c @@ -287,16 +287,18 @@ static BlockDriver *find_protocol(const char *filename) char protocol[128]; int len; const char *p; +int is_drive; /* TODO Drivers without bdrv_file_open must be specified explicitly */ #ifdef _WIN32 -if (is_windows_drive(filename) || -is_windows_drive_prefix(filename)) -return bdrv_find_format("file"); +is_drive = is_windows_drive(filename) || +is_windows_drive_prefix(filename); +#else +is_drive = 0; #endif p = strchr(filename, ':'); -if (!p) { +if (!p || is_drive) { drv1 = find_hdev_driver(filename); if (!drv1) { drv1 = bdrv_find_format("file"); -- 1.6.6.1
Re: [Qemu-devel] Call for 0.12.4
Am 08.04.2010 20:37, schrieb Aurelien Jarno: > Hi all, > > A number of fixes have been accumulated in the stable-0.12 branch, and > I think it's time to release a new stable version. I would like to see > that happening for the end of next week (around the 18th of April). > > If you want to see some patches included, please send a mail to the > mailing list with the [STABLE] tag. I would clearly prefer patches that > are already in HEAD (if the patch can simply be cherry-picked, there is > no need to send a patch, just the commit number), though other patches > might be considered too. Not sure if it's already too late for 0.12.4, but this one should be applied to the stable-0.12 branch: Commit: d6771bfa52744eb4f959198b4b0e59451463eebf qemu-img: use the heap instead of the huge stack array for win32 On Windows, qemu-img convert is broken without this fix. Kevin
[Qemu-devel] [PATCH] pckbd: support for commands 0xf0-0xff: Pulse output bit
I use a legacy guest OS which sends the command 0xfd to the keyboard controller during initialization. To get rid of the message "qemu: unsupported keyboard cmd=0x%02x\n" I added support for the pulse output bit commands. Signed-off-by: Bernhard Kohl --- hw/pckbd.c | 23 --- 1 files changed, 20 insertions(+), 3 deletions(-) diff --git a/hw/pckbd.c b/hw/pckbd.c index 7998aa6..554fbe4 100644 --- a/hw/pckbd.c +++ b/hw/pckbd.c @@ -50,7 +50,9 @@ #define KBD_CCMD_WRITE_MOUSE0xD4/* Write the following byte to the mouse */ #define KBD_CCMD_DISABLE_A200xDD/* HP vectra only ? */ #define KBD_CCMD_ENABLE_A20 0xDF/* HP vectra only ? */ -#define KBD_CCMD_RESET0xFE +#define KBD_CCMD_PULSE_BITS_3_0 0xF0/* Pulse bits 3-0 of the output port P2. */ +#define KBD_CCMD_RESET 0xFE/* Pulse bit 0 of the output port P2 = CPU reset. */ +#define KBD_CCMD_NO_OP 0xFF/* Pulse no bits of the output port P2. */ /* Keyboard Commands */ #define KBD_CMD_SET_LEDS0xED/* Set keyboard leds */ @@ -203,6 +205,21 @@ static void kbd_write_command(void *opaque, uint32_t addr, uint32_t val) #ifdef DEBUG_KBD printf("kbd: write cmd=0x%02x\n", val); #endif + +/* Bits 3-0 of the output port P2 of the keyboard controller may be pulsed + * low for approximately 6 micro seconds. Bits 3-0 of the KBD_CCMD_PULSE + * command specify the output port bits to be pulsed. + * 0: Bit should be pulsed. 1: Bit should not be modified. + * The only useful version of this command is pulsing bit 0, + * which does a CPU reset. + */ +if((val & KBD_CCMD_PULSE_BITS_3_0) == KBD_CCMD_PULSE_BITS_3_0) { +if(!(val & 1)) +val = KBD_CCMD_RESET; +else +val = KBD_CCMD_NO_OP; +} + switch(val) { case KBD_CCMD_READ_MODE: kbd_queue(s, s->mode, 0); @@ -265,8 +282,8 @@ static void kbd_write_command(void *opaque, uint32_t addr, uint32_t val) case KBD_CCMD_RESET: qemu_system_reset_request(); break; -case 0xff: -/* ignore that - I don't know what is its use */ +case KBD_CCMD_NO_OP: +/* ignore that */ break; default: fprintf(stderr, "qemu: unsupported keyboard cmd=0x%02x\n", val); -- 1.6.6.1
[Qemu-devel] [PATCH] exec: optimize lduw_phys and stw_phys
Implementation of the optimized code for these two functions. This is necessary for virtio which reads and writes VirtQueue index fields using these functions. The assumption is that this are atomic operations, which is not the case, if the memcpy() function which is used in the non optimized code does single byte copying. This happens for example with an older WindRiver glibc. Signed-off-by: Bernhard Kohl --- exec.c | 67 +-- 1 files changed, 60 insertions(+), 7 deletions(-) diff --git a/exec.c b/exec.c index 14d1fd7..fb40398 100644 --- a/exec.c +++ b/exec.c @@ -3739,12 +3739,36 @@ uint32_t ldub_phys(target_phys_addr_t addr) return val; } -/* XXX: optimize */ +/* warning: addr must be aligned */ uint32_t lduw_phys(target_phys_addr_t addr) { -uint16_t val; -cpu_physical_memory_read(addr, (uint8_t *)&val, 2); -return tswap16(val); +int io_index; +uint8_t *ptr; +uint32_t val; +unsigned long pd; +PhysPageDesc *p; + +p = phys_page_find(addr >> TARGET_PAGE_BITS); +if (!p) { +pd = IO_MEM_UNASSIGNED; +} else { +pd = p->phys_offset; +} + +if ((pd & ~TARGET_PAGE_MASK) > IO_MEM_ROM && +!(pd & IO_MEM_ROMD)) { +/* I/O case */ +io_index = (pd >> IO_MEM_SHIFT) & (IO_MEM_NB_ENTRIES - 1); +if (p) +addr = (addr & ~TARGET_PAGE_MASK) + p->region_offset; +val = io_mem_read[io_index][1](io_mem_opaque[io_index], addr); +} else { +/* RAM case */ +ptr = qemu_get_ram_ptr(pd & TARGET_PAGE_MASK) + +(addr & ~TARGET_PAGE_MASK); +val = lduw_p(ptr); +} +return val; } /* warning: addr must be aligned. The ram page is not masked as dirty @@ -3861,11 +3885,40 @@ void stb_phys(target_phys_addr_t addr, uint32_t val) cpu_physical_memory_write(addr, &v, 1); } -/* XXX: optimize */ +/* warning: addr must be aligned */ void stw_phys(target_phys_addr_t addr, uint32_t val) { -uint16_t v = tswap16(val); -cpu_physical_memory_write(addr, (const uint8_t *)&v, 2); +int io_index; +uint8_t *ptr; +unsigned long pd; +PhysPageDesc *p; + +p = phys_page_find(addr >> TARGET_PAGE_BITS); +if (!p) { +pd = IO_MEM_UNASSIGNED; +} else { +pd = p->phys_offset; +} + +if ((pd & ~TARGET_PAGE_MASK) != IO_MEM_RAM) { +io_index = (pd >> IO_MEM_SHIFT) & (IO_MEM_NB_ENTRIES - 1); +if (p) +addr = (addr & ~TARGET_PAGE_MASK) + p->region_offset; +io_mem_write[io_index][1](io_mem_opaque[io_index], addr, val); +} else { +unsigned long addr1; +addr1 = (pd & TARGET_PAGE_MASK) + (addr & ~TARGET_PAGE_MASK); +/* RAM case */ +ptr = qemu_get_ram_ptr(addr1); +stw_p(ptr, val); +if (!cpu_physical_memory_is_dirty(addr1)) { +/* invalidate code */ +tb_invalidate_phys_page_range(addr1, addr1 + 2, 0); +/* set dirty bit */ +phys_ram_dirty[addr1 >> TARGET_PAGE_BITS] |= +(0xff & ~CODE_DIRTY_FLAG); +} +} } /* XXX: optimize */ -- 1.6.6.1
[Qemu-devel] [PATCH] pckbd: support for commands 0xf0-0xff: Pulse output bit
I use a legacy guest OS which sends the command 0xfd to the keyboard controller during initialization. To get rid of the message "qemu: unsupported keyboard cmd=0x%02x\n" I added support for the pulse output bit commands. Signed-off-by: Bernhard Kohl --- hw/pckbd.c | 23 --- 1 files changed, 20 insertions(+), 3 deletions(-) diff --git a/hw/pckbd.c b/hw/pckbd.c index 7998aa6..554fbe4 100644 --- a/hw/pckbd.c +++ b/hw/pckbd.c @@ -50,7 +50,9 @@ #define KBD_CCMD_WRITE_MOUSE0xD4/* Write the following byte to the mouse */ #define KBD_CCMD_DISABLE_A200xDD/* HP vectra only ? */ #define KBD_CCMD_ENABLE_A20 0xDF/* HP vectra only ? */ -#define KBD_CCMD_RESET0xFE +#define KBD_CCMD_PULSE_BITS_3_0 0xF0/* Pulse bits 3-0 of the output port P2. */ +#define KBD_CCMD_RESET 0xFE/* Pulse bit 0 of the output port P2 = CPU reset. */ +#define KBD_CCMD_NO_OP 0xFF/* Pulse no bits of the output port P2. */ /* Keyboard Commands */ #define KBD_CMD_SET_LEDS0xED/* Set keyboard leds */ @@ -203,6 +205,21 @@ static void kbd_write_command(void *opaque, uint32_t addr, uint32_t val) #ifdef DEBUG_KBD printf("kbd: write cmd=0x%02x\n", val); #endif + +/* Bits 3-0 of the output port P2 of the keyboard controller may be pulsed + * low for approximately 6 micro seconds. Bits 3-0 of the KBD_CCMD_PULSE + * command specify the output port bits to be pulsed. + * 0: Bit should be pulsed. 1: Bit should not be modified. + * The only useful version of this command is pulsing bit 0, + * which does a CPU reset. + */ +if((val & KBD_CCMD_PULSE_BITS_3_0) == KBD_CCMD_PULSE_BITS_3_0) { +if(!(val & 1)) +val = KBD_CCMD_RESET; +else +val = KBD_CCMD_NO_OP; +} + switch(val) { case KBD_CCMD_READ_MODE: kbd_queue(s, s->mode, 0); @@ -265,8 +282,8 @@ static void kbd_write_command(void *opaque, uint32_t addr, uint32_t val) case KBD_CCMD_RESET: qemu_system_reset_request(); break; -case 0xff: -/* ignore that - I don't know what is its use */ +case KBD_CCMD_NO_OP: +/* ignore that */ break; default: fprintf(stderr, "qemu: unsupported keyboard cmd=0x%02x\n", val); -- 1.6.6.1
[Qemu-devel] Re: How to map PCI memory into the VM without trapping
Hi there, this is just an update if you are interested in the outcome. I turns out that my initial assumption that there would be page faults/trapping on the memory pages was false. The reason the throughput is so low is because the memory was mapped non-cached. The VGA driver and the ivshmem driver use pci_ioremap_bar which will *always* map the PCI bar non-cached (even the resourceX_wc). Changing the driver(s) to use ioremap_cache or ioremap_wc speeds up things quite a bit. I don't know if VGA framebuffer was always mapped this way -- it appears on a real system that usually graphics memory is mapped WC. Mapping it UC would cause quite a performance degradation. This could be the reason for the reported VGA performance drop in another email thread. IMHO, since QEMU emulates VGA, this could be mapped WB. Thanks, Frank. On Mon, May 3, 2010 at 2:07 PM, Frank Berreth wrote: > Hello, > > I am using a modified patch for the ivshmem device that was posted here > recently and discovered some performance issues with it. Essentially I > measure write performance into the PCI memory that is mapped into the VM as > really slow (about 60mb/sec). I assume that is because there is trapping on > each memory access (there are no handlers installed for handling read/write > access to that memory though). The device shows up in the guest as a UIO > device but I also tried accessing other PCI memory like the vga memory (16mb > from vga-pci.c) and it has similar performance (through > /sys/device/.../resource...). > > My question is how I can map that memory into the VM so it doesn't get > trapped on in the normal use case. The mappings (registering PCI bar etc) is > done as in the patch submitted recently). > > Thanks, > Frank. > >
[Qemu-devel] Re: question on virtio
On Wed, 5 May 2010 08:39:47 pm Michael S. Tsirkin wrote: > Hi! > I see this in virtio_ring.c: > > /* Put entry in available array (but don't update avail->idx * > until they do sync). */ > > Why is it done this way? > It seems that updating the index straight away would be simpler, while > this might allow the host to specilatively look up the buffer and handle > it, without waiting for the kick. I agree. From my TODO: what if we actually expose in ->add_buf? I don't *think* anyone adds buffers without being ready for them to be used, so changing this should be safe. Want to give it a try and report back? Thanks! Rusty.
[Qemu-devel] [PATCH] A bit optimization for tlb_set_page() (resend)
This patch avoids handling write watchpoints on read-only memory access. It also breaks the searching loop for watchpoint once the setup for handling watchpoint later is done. Signed-off-by: Jun Koi diff --git a/exec.c b/exec.c index 14d1fd7..6fd859f 100644 --- a/exec.c +++ b/exec.c @@ -2236,10 +2236,12 @@ void tlb_set_page(CPUState *env, target_ulong vaddr, watchpoint trap routines. */ QTAILQ_FOREACH(wp, &env->watchpoints, entry) { if (vaddr == (wp->vaddr & TARGET_PAGE_MASK)) { -iotlb = io_mem_watch + paddr; -/* TODO: The memory case can be optimized by not trapping - reads of pages with a write breakpoint. */ -address |= TLB_MMIO; +/* Avoid trapping reads of pages with a write breakpoint. */ +if ((prot & PAGE_WRITE) || (wp->flags & BP_MEM_READ)) { +iotlb = io_mem_watch + paddr; +address |= TLB_MMIO; +break; +} } }
Re: [Qemu-devel] Qemu - samba share help
You need to export your share on host OS. AFAIR in the guest OS the address to access SAMBA shares was 10.0.2.4. Regards, Sergei. --- On Wed, 5/5/10, Arpit Patel wrote: From: Arpit Patel Subject: [Qemu-devel] Qemu - samba share help To: qemu-devel@nongnu.org Date: Wednesday, May 5, 2010, 11:02 AM Hi All, I am trying to share files between host OS and guest OS, both are Ubuntu. Here is the command line that I am using to start guest OSqemu -kernel kernelimage -initrd initrd.img /dev/zero -append "cmdline" -smb /tmp After which I tried to mount /tmp on guest OS, using IP address of host OS, something like mount //10.80.8.100/tmp /tmp, but fails saying "No such device". Samba service is also running on Host OS. Any ideas or any other way to share files between Host OS and Guest OS. Thanks for Help. Y
Re: [Qemu-devel] [PATCH 6/7] Fix zero-length write(2).
On Mon, Mar 29, 2010 at 10:54:42AM -0700, Richard Henderson wrote: > Signed-off-by: Richard Henderson Thanks, applied. > --- > exec.c |3 +++ > 1 files changed, 3 insertions(+), 0 deletions(-) > > diff --git a/exec.c b/exec.c > index 33854e1..d69194c 100644 > --- a/exec.c > +++ b/exec.c > @@ -2461,6 +2461,9 @@ int page_check_range(target_ulong start, target_ulong > len, int flags) > assert(start < ((abi_ulong)1 << L1_MAP_ADDR_SPACE_BITS)); > #endif > > +if (len == 0) { > +return 0; > +} > if (start + len - 1 < start) { > /* We've wrapped around. */ > return -1; > -- > 1.6.6.1 > > > > -- Aurelien Jarno GPG: 1024D/F1BCDB73 aurel...@aurel32.net http://www.aurel32.net
[Qemu-devel] Re: vgabios + qemu: issues and plans.
Hi, First, I'd like to be able to load the vgabios via PCI ROM bar on all pci vga cards (stdvga, vmware, soon qxl). The PCI ID in the bios has to match the PCI ID of the card, so we'll need a bunch of vga bios binaries, all identical except for the PCI ID. Or we need some kind of binary patching. Otherwise seabios will not load them from the PCI ROM bar. vgabios could be build multiple times with PCIBIOS set and VENDOR_ID and DEVICE_ID supplied from the Makefile like it's already done with VGABIOS_DATE. Yes, something like this I have in mind. Works for stdvga and qxl. vmware fails as it has the memory at pci region 1 (region 0 has ioports) and it seems vgabios isn't prepared to handle that ... Anyone knows what the best place to send patches to is? vgabios-developers list @ savannah looks pretty much dead. It has one (!) message in 2009 which is an unanswered question. Or is upstream dead anyway and we should just go with http://git.qemu.org/vgabios.git/ ? cheers, Gerd
[Qemu-devel] Re: [PATCH RFC] virtio: put last seen used index into ring itself
On Thu, May 06, 2010 at 12:01:34PM +0930, Rusty Russell wrote: > On Thu, 6 May 2010 06:28:14 am Michael S. Tsirkin wrote: > > Rusty, > > this is a simplified form of a patch you posted in the past. > > I have a vhost patch that, using this feature, shows external > > to host bandwidth grow from 5 to 7 GB/s, by avoiding > > an interrupt in the window after previous interrupt > > was sent and before interrupts were disabled for the vq. > > With vhost under some external to host loads I see > > this window being hit about 30% sometimes. > > Fascinating. So you use this to guess if the guest is still processing? Exactly. > I haven't thought about it hard, but is that racy? I thought about this really hard and I don't think it's necessarily racy, as long as (pseudo code): guest: start: disable interrupts read(used) write(last used) enable interrupts mb() if read(used) goto start host: write used mb() if (reached(read(last used), used)) interrupt IOW, guest does read then write then read, host does write then read. Now, the only way to miss an interrupt is if we read last used value before it is written so we think guest is still processing. But if that happens, this means that host has written used before guest updated last used, and for this reason peek will see used value and restart polling. IOW, to miss an interrupt host must read a stale value. For this host must read before guest write, then host write already happened, so second read in guest will see an updated value host has written. Now, I also added an mb() in guest between read and write so that last used index write can not get ahead of used index read. It does feel good to have it there, but I can not say why it's helpful. Works fine without it, but then these subtle races might be hard to trigger. What do you think? > Obviously happy to apply this when you finalize it. > > Thanks! > Rusty.
Re: [Qemu-devel] Re: [PATCH] virtio-spec: document block CMD and FLUSH
On Wed, 5 May 2010 03:33:43 pm Neil Brown wrote: > On Wed, 5 May 2010 14:28:41 +0930 > Rusty Russell wrote: > > > On Wed, 5 May 2010 05:47:05 am Jamie Lokier wrote: > > > Jens Axboe wrote: > > > > On Tue, May 04 2010, Rusty Russell wrote: > > > > > ISTR someone mentioning a desire for such an API years ago, so CC'ing > > > > > the > > > > > usual I/O suspects... > > > > > > > > It would be nice to have a more fuller API for this, but the reality is > > > > that only the flush approach is really workable. Even just strict > > > > ordering of requests could only be supported on SCSI, and even there the > > > > kernel still lacks proper guarantees on error handling to prevent > > > > reordering there. > > > > > > There's a few I/O scheduling differences that might be useful: > > > > > > 1. The I/O scheduler could freely move WRITEs before a FLUSH but not > > >before a BARRIER. That might be useful for time-critical WRITEs, > > >and those issued by high I/O priority. > > > > This is only because noone actually wants flushes or barriers, though > > I/O people seem to only offer that. We really want " must > > occur before ". That offers maximum choice to the I/O subsystem > > and potentially to smart (virtual?) disks. > > > > > 2. The I/O scheduler could move WRITEs after a FLUSH if the FLUSH is > > >only for data belonging to a particular file (e.g. fdatasync with > > >no file size change, even on btrfs if O_DIRECT was used for the > > >writes being committed). That would entail tagging FLUSHes and > > >WRITEs with a fs-specific identifier (such as inode number), opaque > > >to the scheduler which only checks equality. > > > > This is closer. In userspace I'd be happy with a "all prior writes to this > > struct file before all future writes". Even if the original guarantees were > > stronger (ie. inode basis). We currently implement transactions using 4 > > fsync > > /msync pairs. > > > > write_recovery_data(fd); > > fsync(fd); > > msync(mmap); > > write_recovery_header(fd); > > fsync(fd); > > msync(mmap); > > overwrite_with_new_data(fd); > > fsync(fd); > > msync(mmap); > > remove_recovery_header(fd); > > fsync(fd); > > msync(mmap); > > Seems over-zealous. > If the recovery_header held a strong checksum of the recovery_data you would > not need the first fsync, and as long as you have two places to write recovery > data, you don't need the 3rd and 4th syncs. > Just: > write_internally_checksummed_recovery_data_and_header_to_unused_log_space() > fsync / msync > overwrite_with_new_data() > > To recovery you choose the most recent log_space and replay the content. > That may be a redundant operation, but that is no loss. I think you missed a checksum for the new data? Otherwise we can't tell if the new data is completely written. But yes, I will steal this scheme for TDB2, thanks! In practice, it's the first sync which is glacial, the rest are pretty cheap. > Also cannot see the point of msync if you have already performed an fsync, > and if there is a point, I would expect you to call msync before > fsync... Maybe there is some subtlety there that I am not aware of. I assume it's this from the msync man page: msync() flushes changes made to the in-core copy of a file that was mapped into memory using mmap(2) back to disk. Without use of this call there is no guarantee that changes are written back before mun‐ map(2) is called. > > It's an implementation detail; barrier has less flexibility because it has > > less information about what is required. I'm saying I want to give you as > > much information as I can, even if you don't use it yet. > > Only we know that approach doesn't work. > People will learn that they don't need to give the extra information to still > achieve the same result - just like they did with ext3 and fsync. > Then when we improve the implementation to only provide the guarantees that > you asked for, people will complain that they are getting empty files that > they didn't expect. I think that's an oversimplification: IIUC that occurred to people *not* using fsync(). They weren't using it because it was too slow. Providing a primitive which is as fast or faster and more specific doesn't have the same magnitude of social issues. And we can't write userspace interfaces for idiots only. > The abstraction I would like to see is a simple 'barrier' that contains no > data and has a filesystem-wide effect. I think you lack ambition ;) Thinking about the single-file use case (eg. kvm guest or tdb), isn't that suboptimal for md? Since you have to hand your barrier to every device whereas a file-wide primitive may theoretically only go to some. Cheers, Rusty.
[Qemu-devel] Re: [PATCH v7 4/6] char: Add qemu_chr_write_nb() for nonblocking writes
I'm all for killing the blocking interface. Problem is that converting over all users isn't exactly trivial and we have plenty of them. IMHO it isn't realistic to do the switch with a single patch series. If we're agreed we ought to kill the blocking interface, let's define a new proper interface, rename all previous users to something ugly and deprecated, and approach it that way. __attribute__ ((deprecated)) ? cheers, Gerd
Re: [Qemu-devel] [PATCH] vnc: set the right prefered encoding
On Tue, May 4, 2010 at 2:01 PM, Corentin Chary wrote: > From RFB specs: "The order of the encoding types given in this > message is a hint by the client as to its preference (the first > encoding specified being most preferred)" > > Signed-off-by: Corentin Chary > --- > vnc.c | 14 ++ > 1 files changed, 10 insertions(+), 4 deletions(-) > > diff --git a/vnc.c b/vnc.c > index 5241a6a..2d05d8f 100644 > --- a/vnc.c > +++ b/vnc.c > @@ -1594,7 +1594,7 @@ static void set_encodings(VncState *vs, int32_t > *encodings, size_t n_encodings) > > vnc_zlib_init(vs); > vs->features = 0; > - vs->vnc_encoding = 0; > + vs->vnc_encoding = -1; > vs->tight_compression = 9; > vs->tight_quality = 9; > vs->absolute = -1; > @@ -1603,18 +1603,24 @@ static void set_encodings(VncState *vs, int32_t > *encodings, size_t n_encodings) > enc = encodings[i]; > switch (enc) { > case VNC_ENCODING_RAW: > - vs->vnc_encoding = enc; > + if (vs->vnc_encoding != -1) { > + vs->vnc_encoding = enc; > + } hum patch is broken, sending a fixed patch today, sorry should be if (vs->vnc_encoding == -1) -- Corentin Chary http://xf.iksaif.net
Re: [Qemu-devel] vgabios + qemu: issues and plans.
First, I'd like to be able to load the vgabios via PCI ROM bar on all pci vga cards (stdvga, vmware, soon qxl). The PCI ID in the bios has to match the PCI ID of the card, so we'll need a bunch of vga bios binaries, all identical except for the PCI ID. Or we need some kind of binary patching. Otherwise seabios will not load them from the PCI ROM bar. Make vga bios aware of all ids? Would work for the code doing the lfb address lookup. But I don't think the PCI option rom header allows multiple entries. I also want to eliminate the magic number. Boch vga bios is paravirtualized to get VBE address dynamically via a new pci device with special pci device of device id 0x1234. 0x1234 happens to be the vendor id the qemu stdvga has. Probably the same is true for bochs. So this code actually looks up the emulated vga card ;) Are you meaning to follow bochs way? Yes. cheers, Gerd
[Qemu-devel] [PATCH] block: Fix bdrv_commit
When reopening the image, don't guess the driver, but use the same driver as was used before. This is important if the format=... option was used for that image. Signed-off-by: Kevin Wolf --- block.c |6 +++--- 1 files changed, 3 insertions(+), 3 deletions(-) diff --git a/block.c b/block.c index adb3e5d..ffe82f8 100644 --- a/block.c +++ b/block.c @@ -701,12 +701,12 @@ int bdrv_commit(BlockDriverState *bs) bdrv_delete(bs->backing_hd); bs->backing_hd = NULL; bs_rw = bdrv_new(""); -rw_ret = bdrv_open(bs_rw, filename, open_flags | BDRV_O_RDWR, NULL); +rw_ret = bdrv_open(bs_rw, filename, open_flags | BDRV_O_RDWR, drv); if (rw_ret < 0) { bdrv_delete(bs_rw); /* try to re-open read-only */ bs_ro = bdrv_new(""); -ret = bdrv_open(bs_ro, filename, open_flags & ~BDRV_O_RDWR, NULL); +ret = bdrv_open(bs_ro, filename, open_flags & ~BDRV_O_RDWR, drv); if (ret < 0) { bdrv_delete(bs_ro); /* drive not functional anymore */ @@ -758,7 +758,7 @@ ro_cleanup: bdrv_delete(bs->backing_hd); bs->backing_hd = NULL; bs_ro = bdrv_new(""); -ret = bdrv_open(bs_ro, filename, open_flags & ~BDRV_O_RDWR, NULL); +ret = bdrv_open(bs_ro, filename, open_flags & ~BDRV_O_RDWR, drv); if (ret < 0) { bdrv_delete(bs_ro); /* drive not functional anymore */ -- 1.6.6.1
[Qemu-devel] [PATCH RFC] qemu/virtio: use last used index published by guest
Reduces irq_window in guest by only injecting an interrupt if guest has handled all buffers we used so far. Signed-off-by: Michael S. Tsirkin --- This is the qemu part of the story. hw/vhost_net.c |6 ++ hw/virtio.c| 15 +++ hw/virtio.h|4 3 files changed, 25 insertions(+), 0 deletions(-) diff --git a/hw/vhost_net.c b/hw/vhost_net.c index 2e292ee..d77c9b2 100644 --- a/hw/vhost_net.c +++ b/hw/vhost_net.c @@ -51,6 +51,9 @@ unsigned vhost_net_get_features(struct vhost_net *net, unsigned features) if (!(net->dev.features & (1 << VIRTIO_RING_F_INDIRECT_DESC))) { features &= ~(1 << VIRTIO_RING_F_INDIRECT_DESC); } +if (!(net->dev.features & (1 << VIRTIO_RING_F_PUBLISH_USED))) { +features &= ~(1 << VIRTIO_RING_F_PUBLISH_USED); +} features &= ~(1 << VIRTIO_NET_F_MRG_RXBUF); return features; } @@ -64,6 +67,9 @@ void vhost_net_ack_features(struct vhost_net *net, unsigned features) if (features & (1 << VIRTIO_RING_F_INDIRECT_DESC)) { net->dev.acked_features |= (1 << VIRTIO_RING_F_INDIRECT_DESC); } +if (features & (1 << VIRTIO_RING_F_PUBLISH_USED)) { +net->dev.acked_features |= (1 << VIRTIO_RING_F_PUBLISH_USED); +} } static int vhost_net_get_fd(VLANClientState *backend) diff --git a/hw/virtio.c b/hw/virtio.c index e7657ae..5d686f0 100644 --- a/hw/virtio.c +++ b/hw/virtio.c @@ -71,6 +71,7 @@ struct VirtQueue target_phys_addr_t pa; uint16_t last_avail_idx; int inuse; +int num_notify; uint16_t vector; void (*handle_output)(VirtIODevice *vdev, VirtQueue *vq); VirtIODevice *vdev; @@ -139,6 +140,11 @@ static inline uint16_t vring_avail_ring(VirtQueue *vq, int i) return lduw_phys(pa); } +static inline uint16_t vring_last_used_idx(VirtQueue *vq) +{ +return vring_avail_ring(vq, vq->vring.num); +} + static inline void vring_used_ring_id(VirtQueue *vq, int i, uint32_t val) { target_phys_addr_t pa; @@ -234,6 +240,7 @@ void virtqueue_flush(VirtQueue *vq, unsigned int count) wmb(); vring_used_idx_increment(vq, count); vq->inuse -= count; +vq->num_notify += count; } void virtqueue_push(VirtQueue *vq, const VirtQueueElement *elem, @@ -603,6 +610,14 @@ void virtio_irq(VirtQueue *vq) void virtio_notify(VirtIODevice *vdev, VirtQueue *vq) { +/* Do not notify if guest did not yet see the last update. */ +if ((vdev->guest_features & (1 << VIRTIO_RING_F_PUBLISH_USED)) && + vring_last_used_idx(vq) != + (uint16_t)(vring_used_idx(vq) + vq->num_notify)) + return; + +vq->num_notify = 0; + /* Always notify when queue is empty (when feature acknowledge) */ if ((vring_avail_flags(vq) & VRING_AVAIL_F_NO_INTERRUPT) && (!(vdev->guest_features & (1 << VIRTIO_F_NOTIFY_ON_EMPTY)) || diff --git a/hw/virtio.h b/hw/virtio.h index f885f1b..4bdfded 100644 --- a/hw/virtio.h +++ b/hw/virtio.h @@ -43,6 +43,8 @@ #define VIRTIO_F_NOTIFY_ON_EMPTY24 /* We support indirect buffer descriptors */ #define VIRTIO_RING_F_INDIRECT_DESC 28 +/* The Guest publishes last-seen used index at the end of the avail ring. */ +#define VIRTIO_RING_F_PUBLISH_USED 29 /* A guest should never accept this. It implies negotiation is broken. */ #define VIRTIO_F_BAD_FEATURE 30 @@ -189,6 +191,8 @@ VirtIODevice *virtio_balloon_init(DeviceState *dev); void virtio_net_exit(VirtIODevice *vdev); #define DEFINE_VIRTIO_COMMON_FEATURES(_state, _field) \ + DEFINE_PROP_BIT("publish_used", _state, _field, \ + VIRTIO_RING_F_PUBLISH_USED, true), \ DEFINE_PROP_BIT("indirect_desc", _state, _field, \ VIRTIO_RING_F_INDIRECT_DESC, true) -- 1.7.1.12.g42b7f
Re: [Qemu-devel] Call for 0.12.4
On 05/06/2010 08:09 AM, Kevin Wolf wrote: Am 08.04.2010 20:37, schrieb Aurelien Jarno: Hi all, A number of fixes have been accumulated in the stable-0.12 branch, and I think it's time to release a new stable version. I would like to see that happening for the end of next week (around the 18th of April). If you want to see some patches included, please send a mail to the mailing list with the [STABLE] tag. I would clearly prefer patches that are already in HEAD (if the patch can simply be cherry-picked, there is no need to send a patch, just the commit number), though other patches might be considered too. Not sure if it's already too late for 0.12.4, but this one should be applied to the stable-0.12 branch: I've already tagged 0.12.4. Regards, Anthony Liguori Commit: d6771bfa52744eb4f959198b4b0e59451463eebf qemu-img: use the heap instead of the huge stack array for win32 On Windows, qemu-img convert is broken without this fix. Kevin
[Qemu-devel] [PATCH 1/4] vnc: set the right prefered encoding
>From RFB specs: "The order of the encoding types given in this message is a hint by the client as to its preference (the first encoding specified being most preferred)" Signed-off-by: Corentin Chary --- vnc.c | 15 +++ 1 files changed, 11 insertions(+), 4 deletions(-) diff --git a/vnc.c b/vnc.c index b1a3fdb..a91c3a3 100644 --- a/vnc.c +++ b/vnc.c @@ -1587,6 +1587,13 @@ static void send_ext_audio_ack(VncState *vs) vnc_flush(vs); } +static void set_encoding(VncState *vs, int encoding) +{ +if (vs->vnc_encoding == -1) { +vs->vnc_encoding = encoding; +} +} + static void set_encodings(VncState *vs, int32_t *encodings, size_t n_encodings) { int i; @@ -1594,7 +1601,7 @@ static void set_encodings(VncState *vs, int32_t *encodings, size_t n_encodings) vnc_zlib_init(vs); vs->features = 0; -vs->vnc_encoding = 0; +vs->vnc_encoding = -1; vs->tight_compression = 9; vs->tight_quality = 9; vs->absolute = -1; @@ -1603,18 +1610,18 @@ static void set_encodings(VncState *vs, int32_t *encodings, size_t n_encodings) enc = encodings[i]; switch (enc) { case VNC_ENCODING_RAW: -vs->vnc_encoding = enc; +set_encoding(vs, enc); break; case VNC_ENCODING_COPYRECT: vs->features |= VNC_FEATURE_COPYRECT_MASK; break; case VNC_ENCODING_HEXTILE: vs->features |= VNC_FEATURE_HEXTILE_MASK; -vs->vnc_encoding = enc; +set_encoding(vs, enc); break; case VNC_ENCODING_ZLIB: vs->features |= VNC_FEATURE_ZLIB_MASK; -vs->vnc_encoding = enc; +set_encoding(vs, enc); break; case VNC_ENCODING_DESKTOPRESIZE: vs->features |= VNC_FEATURE_RESIZE_MASK; -- 1.7.0.2
Re: [Qemu-devel] [PATCH] qemu: address todo comment in exec.c
On Tue, Apr 06, 2010 at 02:18:19PM +0300, Michael S. Tsirkin wrote: > exec.c has a comment 'XXX: optimize' for lduw_phys/stw_phys, > so let's do it, along the lines of stl_phys. > > The reason to address 16 bit accesses specifically is that virtio relies > on these accesses to be done atomically, using memset as we do now > breaks this assumption, which is reported to cause qemu with kvm > to read wrong index values under stress. > > https://bugzilla.redhat.com/show_bug.cgi?id=525323 Thanks, applied. > Signed-off-by: Michael S. Tsirkin > --- > exec.c | 67 +-- > 1 files changed, 60 insertions(+), 7 deletions(-) > > diff --git a/exec.c b/exec.c > index 33854e1..262c255 100644 > --- a/exec.c > +++ b/exec.c > @@ -3788,12 +3788,36 @@ uint32_t ldub_phys(target_phys_addr_t addr) > return val; > } > > -/* XXX: optimize */ > +/* warning: addr must be aligned */ > uint32_t lduw_phys(target_phys_addr_t addr) > { > -uint16_t val; > -cpu_physical_memory_read(addr, (uint8_t *)&val, 2); > -return tswap16(val); > +int io_index; > +uint8_t *ptr; > +uint64_t val; > +unsigned long pd; > +PhysPageDesc *p; > + > +p = phys_page_find(addr >> TARGET_PAGE_BITS); > +if (!p) { > +pd = IO_MEM_UNASSIGNED; > +} else { > +pd = p->phys_offset; > +} > + > +if ((pd & ~TARGET_PAGE_MASK) > IO_MEM_ROM && > +!(pd & IO_MEM_ROMD)) { > +/* I/O case */ > +io_index = (pd >> IO_MEM_SHIFT) & (IO_MEM_NB_ENTRIES - 1); > +if (p) > +addr = (addr & ~TARGET_PAGE_MASK) + p->region_offset; > +val = io_mem_read[io_index][1](io_mem_opaque[io_index], addr); > +} else { > +/* RAM case */ > +ptr = phys_ram_base + (pd & TARGET_PAGE_MASK) + > +(addr & ~TARGET_PAGE_MASK); > +val = lduw_p(ptr); > +} > +return val; > } > > /* warning: addr must be aligned. The ram page is not masked as dirty > @@ -3910,11 +3934,40 @@ void stb_phys(target_phys_addr_t addr, uint32_t val) > cpu_physical_memory_write(addr, &v, 1); > } > > -/* XXX: optimize */ > +/* warning: addr must be aligned */ > void stw_phys(target_phys_addr_t addr, uint32_t val) > { > -uint16_t v = tswap16(val); > -cpu_physical_memory_write(addr, (const uint8_t *)&v, 2); > +int io_index; > +uint8_t *ptr; > +unsigned long pd; > +PhysPageDesc *p; > + > +p = phys_page_find(addr >> TARGET_PAGE_BITS); > +if (!p) { > +pd = IO_MEM_UNASSIGNED; > +} else { > +pd = p->phys_offset; > +} > + > +if ((pd & ~TARGET_PAGE_MASK) != IO_MEM_RAM) { > +io_index = (pd >> IO_MEM_SHIFT) & (IO_MEM_NB_ENTRIES - 1); > +if (p) > +addr = (addr & ~TARGET_PAGE_MASK) + p->region_offset; > +io_mem_write[io_index][1](io_mem_opaque[io_index], addr, val); > +} else { > +unsigned long addr1; > +addr1 = (pd & TARGET_PAGE_MASK) + (addr & ~TARGET_PAGE_MASK); > +/* RAM case */ > +ptr = phys_ram_base + addr1; > +stw_p(ptr, val); > +if (!cpu_physical_memory_is_dirty(addr1)) { > +/* invalidate code */ > +tb_invalidate_phys_page_range(addr1, addr1 + 2, 0); > +/* set dirty bit */ > +phys_ram_dirty[addr1 >> TARGET_PAGE_BITS] |= > +(0xff & ~CODE_DIRTY_FLAG); > +} > +} > } > > /* XXX: optimize */ > -- > 1.7.0.2.280.gc6f05 > > > -- Aurelien Jarno GPG: 1024D/F1BCDB73 aurel...@aurel32.net http://www.aurel32.net
Re: [Qemu-devel] Re: [PATCH] virtio-spec: document block CMD and FLUSH
Rusty Russell wrote: > > Seems over-zealous. > > If the recovery_header held a strong checksum of the recovery_data you would > > not need the first fsync, and as long as you have two places to write > > recovery > > data, you don't need the 3rd and 4th syncs. > > Just: > > > > write_internally_checksummed_recovery_data_and_header_to_unused_log_space() > > fsync / msync > > overwrite_with_new_data() > > > > To recovery you choose the most recent log_space and replay the content. > > That may be a redundant operation, but that is no loss. > > I think you missed a checksum for the new data? Otherwise we can't tell if > the new data is completely written. The data checksum can go in the recovery-data block. If there's enough slack in the log, by the time that recovery-data block is overwritten, you can be sure that an fsync has been done for that data (by a later commit). > But yes, I will steal this scheme for TDB2, thanks! Take a look at the filesystems. I think ext4 did some optimisations in this area, and that checksums had to be added anyway due to a subtle replay-corruption problem that happens when the log is partially corrupted, and followed by non-corrupt blocks. Also, you can remove even more fsyncs by adding a bit of slack to the data space and writing into unused/fresh areas some of the time - i.e. a bit like btrfs/zfs or anything log-structured, but you don't have to go all the way with that. > In practice, it's the first sync which is glacial, the rest are pretty cheap. The 3rd and 4th fsyncs imply a disk seek each, just because the preceding writes are to different areas of the disk. Seeks are quite slow - but not as slow as ext3 fsyncs :-) What do you mean by cheap? That it's only a couple of seeks, or that you don't see even that? > > > Also cannot see the point of msync if you have already performed an fsync, > > and if there is a point, I would expect you to call msync before > > fsync... Maybe there is some subtlety there that I am not aware of. > > I assume it's this from the msync man page: > >msync() flushes changes made to the in-core copy of a file that was >mapped into memory using mmap(2) back to disk. Without use of this >call there is no guarantee that changes are written back before mun‐ >map(2) is called. Historically, that means msync() ensures dirty mapping data is written to the file as if with write(), and that mapping pages are removed or refreshed to get the effect of read() (possibly a lazy one). It's more obvious in the early mmap implementations where mappings don't share pages with the filesystem cache, so msync() has explicit behaviour. Like with write(), after calling msync() you would then call fsync() to ensure the data is flushed to disk. If you've been calling fsync then msync, I guess that's another fine example of how these function are so hard to test, that they aren't. Historically on Linux, msync has been iffy on some architectures, and I'm still not sure it has the same semantics as other unixes. fsync as we know has also been iffy, and even now that fsync is tidier it does not always issue a hardware-level cache commit. But then historically writable mmap has been iffy on a boatload of unixes. > > > It's an implementation detail; barrier has less flexibility because it has > > > less information about what is required. I'm saying I want to give you as > > > much information as I can, even if you don't use it yet. > > > > Only we know that approach doesn't work. > > People will learn that they don't need to give the extra information to > > still > > achieve the same result - just like they did with ext3 and fsync. > > Then when we improve the implementation to only provide the guarantees that > > you asked for, people will complain that they are getting empty files that > > they didn't expect. > > I think that's an oversimplification: IIUC that occurred to people *not* > using fsync(). They weren't using it because it was too slow. Providing > a primitive which is as fast or faster and more specific doesn't have the > same magnitude of social issues. I agree with Rusty. Let's make it perform well so there is no reason to deliberately avoid using it, and let's make say what apps actually want to request without being way too strong. And please, if anyone has ideas on how we could make correct use of these functions *testable* by app authors, I'm all ears. Right now it is quite difficult - pulling power on hard disks mid-transaction is not a convenient method :) > > The abstraction I would like to see is a simple 'barrier' that contains no > > data and has a filesystem-wide effect. > > I think you lack ambition ;) > > Thinking about the single-file use case (eg. kvm guest or tdb), isn't that > suboptimal for md? Since you have to hand your barrier to every device > whereas a file-wide primitive may theoretically only go to some. Yes. Note that database-like programs s
[Qemu-devel] Re: [patch uq/master 5/9] kvm: synchronize state from cpu context
On 05/04/2010 03:45 PM, Marcelo Tosatti wrote: From: Jan Kiszka It is not safe to retrieve the KVM internal state of a given cpu while its potentially modifying it. Queue the request to run on cpu context, similarly to qemu-kvm. Even better is to query the state in the cpu thread; but that can come later. -- error compiling committee.c: too many arguments to function
Re: [Qemu-devel] Re: [PATCH] virtio-spec: document block CMD and FLUSH
Rusty Russell wrote: > On Wed, 5 May 2010 05:47:05 am Jamie Lokier wrote: > > Jens Axboe wrote: > > > On Tue, May 04 2010, Rusty Russell wrote: > > > > ISTR someone mentioning a desire for such an API years ago, so CC'ing > > > > the > > > > usual I/O suspects... > > > > > > It would be nice to have a more fuller API for this, but the reality is > > > that only the flush approach is really workable. Even just strict > > > ordering of requests could only be supported on SCSI, and even there the > > > kernel still lacks proper guarantees on error handling to prevent > > > reordering there. > > > > There's a few I/O scheduling differences that might be useful: > > > > 1. The I/O scheduler could freely move WRITEs before a FLUSH but not > >before a BARRIER. That might be useful for time-critical WRITEs, > >and those issued by high I/O priority. > > This is only because noone actually wants flushes or barriers, though > I/O people seem to only offer that. We really want " must > occur before ". That offers maximum choice to the I/O subsystem > and potentially to smart (virtual?) disks. We do want flushes for the "D" in ACID - such things as after receiving a mail, or blog update into a database file (could be TDB), and confirming that to the sender, to have high confidence that the update won't disappear on system crash or power failure. Less obviously, it's also needed for the "C" in ACID when more than one file is involved. "C" is about differently updated things staying consistent with each other. For example, imagine you have a TDB file mapping Samba usernames to passwords, and another mapping Samba usernames to local usernames. (I don't know if you do this; it's just an illustration). To rename a Samba user involves updating both. Let's ignore transient transactional issues :-) and just think about what happens with per-file barriers and no sync, when a crash happens long after the updates, and before the system has written out all data and issued low level cache flushes. After restarting, due to lack of sync, the Samba username could be present in one file and not the other. > > 2. The I/O scheduler could move WRITEs after a FLUSH if the FLUSH is > >only for data belonging to a particular file (e.g. fdatasync with > >no file size change, even on btrfs if O_DIRECT was used for the > >writes being committed). That would entail tagging FLUSHes and > >WRITEs with a fs-specific identifier (such as inode number), opaque > >to the scheduler which only checks equality. > > This is closer. In userspace I'd be happy with a "all prior writes to this > struct file before all future writes". Even if the original guarantees were > stronger (ie. inode basis). We currently implement transactions using 4 fsync > /msync pairs. > > write_recovery_data(fd); > fsync(fd); > msync(mmap); > write_recovery_header(fd); > fsync(fd); > msync(mmap); > overwrite_with_new_data(fd); > fsync(fd); > msync(mmap); > remove_recovery_header(fd); > fsync(fd); > msync(mmap); > > Yet we really only need ordering, not guarantees about it actually hitting > disk before returning. > > > In other words, FLUSH can be more relaxed than BARRIER inside the > > kernel. It's ironic that we think of fsync as stronger than > > fbarrier outside the kernel :-) > > It's an implementation detail; barrier has less flexibility because it has > less information about what is required. I'm saying I want to give you as > much information as I can, even if you don't use it yet. I agree, and I've started a few threads about it over the last couple of years. An fsync_range() system call would be very easy to use and (most importantly) easy to understand. With optional flags to weaken it (into fdatasync, barrier without sync, sync without barrier, one-sided barrier, no lowlevel cache-flush, don't rush, etc.), it would be very versatile, and still easy to understand. With an AIO version, and another flag meaning don't rush, just return when satisfied, and I suspect it would be useful for the most demanding I/O apps. -- Jamie
[Qemu-devel] [PATCH 4/4] vnc: adjust compression zstream level
Signed-off-by: Corentin Chary --- vnc-encoding-zlib.c |9 - vnc.h |1 + 2 files changed, 9 insertions(+), 1 deletions(-) diff --git a/vnc-encoding-zlib.c b/vnc-encoding-zlib.c index 6a16a79..29dd1b7 100644 --- a/vnc-encoding-zlib.c +++ b/vnc-encoding-zlib.c @@ -83,10 +83,17 @@ static int vnc_zlib_stop(VncState *vs) return -1; } +vs->zlib_level = vs->tight_compression; zstream->opaque = vs; } -// XXX what to do if tight_compression changed in between? +if (vs->tight_compression != vs->zlib_level) { +if (deflateParams(zstream, vs->tight_compression, + Z_DEFAULT_STRATEGY) != Z_OK) { +return -1; +} +vs->zlib_level = vs->tight_compression; +} // reserve memory in output buffer buffer_reserve(&vs->output, vs->zlib.offset + 64); diff --git a/vnc.h b/vnc.h index dfdb240..96f3fe7 100644 --- a/vnc.h +++ b/vnc.h @@ -174,6 +174,7 @@ struct VncState Buffer zlib; Buffer zlib_tmp; z_stream zlib_stream; +int zlib_level; Notifier mouse_mode_notifier; -- 1.7.0.2
[Qemu-devel] Bug in net/socket.c: info_str is overwritten
Hi there, I'm working on the conversion of the 'info network' command do QMP, and I think I've found some problems. Running qemu like this: qemu -net socket,listen=: In net/socket.c, the function net_socket_listen_init() [1] is called and a listening socket is created. There is nothing registering this, so when using 'info network' as is, no information about this listening socket is shown. When a connection is accepted, the handler net_socket_accept() calls net_socket_fd_init(). If the socket is UDP/multicast net_socket_fd_init() calls net_socket_fd_init_dgram, if the socket is TCP it calls net_socket_fd_init_dgram_stream. [2] In both cases the info_str string is written inside net_socket_fd_init_(stream|dgram) [3], and after that, it is overwritten on a subsequent snprintf() in net_socket_accept() [4]. net_socket_fd_init_(stream|dgram) always puts into the info_str the fd number, and this information is overwritten latter. So, is the fd number relevant to be transmitted over QMP? Right now this information is being lost, IMHO. Same thing happens when qemu is run as a client: net_socket_connect_init() calls net_socket_fd_init_(dgram|stream) and latter overwrites info_str. [1] http://git.qemu.org/qemu.git/tree/net/socket.c#n375 [2] http://git.qemu.org/qemu.git/tree/net/socket.c#n336 [3] http://git.qemu.org/qemu.git/tree/net/socket.c#n310 [4] http://git.qemu.org/qemu.git/tree/net/socket.c#n369
[Qemu-devel] [PATCH][RESEND] exec: optimize lduw_phys and stw_phys
Implementation of the optimized code for these two functions. This is necessary for virtio which reads and writes VirtQueue index fields using these functions. The assumption is that this are atomic operations, which is not the case, if the memcpy() function which is used in the non optimized code does single byte copying. This happens for example with an older WindRiver glibc. Signed-off-by: Bernhard Kohl --- RESEND: This message did not reach the gmane archive and maybe others. --- exec.c | 67 +-- 1 files changed, 60 insertions(+), 7 deletions(-) diff --git a/exec.c b/exec.c index 14d1fd7..fb40398 100644 --- a/exec.c +++ b/exec.c @@ -3739,12 +3739,36 @@ uint32_t ldub_phys(target_phys_addr_t addr) return val; } -/* XXX: optimize */ +/* warning: addr must be aligned */ uint32_t lduw_phys(target_phys_addr_t addr) { -uint16_t val; -cpu_physical_memory_read(addr, (uint8_t *)&val, 2); -return tswap16(val); +int io_index; +uint8_t *ptr; +uint32_t val; +unsigned long pd; +PhysPageDesc *p; + +p = phys_page_find(addr >> TARGET_PAGE_BITS); +if (!p) { +pd = IO_MEM_UNASSIGNED; +} else { +pd = p->phys_offset; +} + +if ((pd & ~TARGET_PAGE_MASK) > IO_MEM_ROM && +!(pd & IO_MEM_ROMD)) { +/* I/O case */ +io_index = (pd >> IO_MEM_SHIFT) & (IO_MEM_NB_ENTRIES - 1); +if (p) +addr = (addr & ~TARGET_PAGE_MASK) + p->region_offset; +val = io_mem_read[io_index][1](io_mem_opaque[io_index], addr); +} else { +/* RAM case */ +ptr = qemu_get_ram_ptr(pd & TARGET_PAGE_MASK) + +(addr & ~TARGET_PAGE_MASK); +val = lduw_p(ptr); +} +return val; } /* warning: addr must be aligned. The ram page is not masked as dirty @@ -3861,11 +3885,40 @@ void stb_phys(target_phys_addr_t addr, uint32_t val) cpu_physical_memory_write(addr, &v, 1); } -/* XXX: optimize */ +/* warning: addr must be aligned */ void stw_phys(target_phys_addr_t addr, uint32_t val) { -uint16_t v = tswap16(val); -cpu_physical_memory_write(addr, (const uint8_t *)&v, 2); +int io_index; +uint8_t *ptr; +unsigned long pd; +PhysPageDesc *p; + +p = phys_page_find(addr >> TARGET_PAGE_BITS); +if (!p) { +pd = IO_MEM_UNASSIGNED; +} else { +pd = p->phys_offset; +} + +if ((pd & ~TARGET_PAGE_MASK) != IO_MEM_RAM) { +io_index = (pd >> IO_MEM_SHIFT) & (IO_MEM_NB_ENTRIES - 1); +if (p) +addr = (addr & ~TARGET_PAGE_MASK) + p->region_offset; +io_mem_write[io_index][1](io_mem_opaque[io_index], addr, val); +} else { +unsigned long addr1; +addr1 = (pd & TARGET_PAGE_MASK) + (addr & ~TARGET_PAGE_MASK); +/* RAM case */ +ptr = qemu_get_ram_ptr(addr1); +stw_p(ptr, val); +if (!cpu_physical_memory_is_dirty(addr1)) { +/* invalidate code */ +tb_invalidate_phys_page_range(addr1, addr1 + 2, 0); +/* set dirty bit */ +phys_ram_dirty[addr1 >> TARGET_PAGE_BITS] |= +(0xff & ~CODE_DIRTY_FLAG); +} +} } /* XXX: optimize */ -- 1.6.6.1
Re: [Qemu-devel] Fate of the read-only block drivers?
Am 05.05.2010 20:27, schrieb Christoph Hellwig: > Currently we have four very simple, read-only block drivers in the > tree: > > - cloop: > This one is known buggy for non-trivial use and didn't get > any chance but the usual API changes and cleanup sweeps > since it was commited in 2004. > > - dmg: > This one is really grotty and by auditing the code and comparing > it to dmg2img completely buggy for non-compressed chunks. It's > also missing lots of the features in modern dmg image. Also > there's no creation tool for Linux. No non-trivial commits > since the initial import in 2004. > > - bochs: > There is an bximage tools to actually create images for it, > but not to actually add any content to them. It had one > non-trivial commit in 2007 to add support for growable formats > since it's initial commit in 2004. > > - parallels: > I can't find any way to actually create the image except with > the parallels image tool which seems to require a commercial > parallels license. > No non-trivial commits since the initial import in 2007. > > At this point I'm not sure if it's worth bothering to fix these up, > as they're bitrotted and except for cloop there's no reasy way to > actually create test images for them. We can create images at least for cloop and bochs, which is 50%, even though the latter not automated. (The only way I found was using bochs, and the biggest challenge there is finding a guest OS that boots up in finite time. A FreeDOS floppy did the trick for my manual test.) Likewise, not all of them are known broken, it's only cloop and one path in dmg. I'm not sure about the good reasons to keep them, but I guess the code wouldn't exist in the first place if nobody had cared about the functionality. It's probably easy to remove code that can't possibly work today, but that's true only for part of this code. > Any good reason to keep them, and if yes any good idea for a test plan > to keep them working? Basically I see two options for testing them: a) Make the read-write, so you can test consistency of qemu's implementation. This is what we're doing for formats like VMDK. Probably not worth it. b) Put some binary images in the qemu-iotests repository. For each format we would need someone who owns the respective software and can convert our test image to that format. For cloop and bochs we can probably do it ourselves, for dmg and parallels we would need help. Kevin
[Qemu-devel] Re: [PATCH] block: Fix bdrv_commit
Kevin Wolf wrote: > When reopening the image, don't guess the driver, but use the same driver as > was used before. This is important if the format=... option was used for that > image. > > Signed-off-by: Kevin Wolf > --- > block.c |6 +++--- > 1 files changed, 3 insertions(+), 3 deletions(-) > > diff --git a/block.c b/block.c > index adb3e5d..ffe82f8 100644 > --- a/block.c > +++ b/block.c > @@ -701,12 +701,12 @@ int bdrv_commit(BlockDriverState *bs) > bdrv_delete(bs->backing_hd); > bs->backing_hd = NULL; > bs_rw = bdrv_new(""); > -rw_ret = bdrv_open(bs_rw, filename, open_flags | BDRV_O_RDWR, NULL); > +rw_ret = bdrv_open(bs_rw, filename, open_flags | BDRV_O_RDWR, drv); > if (rw_ret < 0) { > bdrv_delete(bs_rw); > /* try to re-open read-only */ > bs_ro = bdrv_new(""); > -ret = bdrv_open(bs_ro, filename, open_flags & ~BDRV_O_RDWR, > NULL); > +ret = bdrv_open(bs_ro, filename, open_flags & ~BDRV_O_RDWR, drv); > if (ret < 0) { > bdrv_delete(bs_ro); > /* drive not functional anymore */ > @@ -758,7 +758,7 @@ ro_cleanup: > bdrv_delete(bs->backing_hd); > bs->backing_hd = NULL; > bs_ro = bdrv_new(""); > -ret = bdrv_open(bs_ro, filename, open_flags & ~BDRV_O_RDWR, NULL); > +ret = bdrv_open(bs_ro, filename, open_flags & ~BDRV_O_RDWR, drv); > if (ret < 0) { > bdrv_delete(bs_ro); > /* drive not functional anymore */ Acked-by: Naphtali Sprei
Re: [Qemu-devel] [PATCH RFC] virtio: put last seen used index into ring itself
On 05/05/2010 11:58 PM, Michael S. Tsirkin wrote: + /* We publish the last-seen used index at the end of the available ring. +* It is at the end for backwards compatibility. */ + vr->last_used_idx =&(vr)->avail->ring[num]; + /* Verify that last used index does not spill over the used ring. */ + BUG_ON((void *)vr->last_used_idx + + sizeof *vr->last_used_idx> (void *)vr->used); } Shouldn't this be on its own cache line? One way to achieve this is to allocate it at the end. -- error compiling committee.c: too many arguments to function
Re: [Qemu-devel] vgabios + qemu: issues and plans.
On Wed, May 05, 2010 at 03:38:21PM +0200, Gerd Hoffmann wrote: > Hi, > > Today we have two vgabios versions in qemu: The standard one > (vgabios.bin) and the cirrus one (vgabios-cirrus.bin). > > The cirrus vgabios is a PCI ROM. We can (and do) load it into the ROM > PCI bar. The vgabios checks the pci config space to figure where the > linear framebuffer (for vesa graphics) is mapped to. It knows how to > program the cirrus. > > The standard bios isn't a PCI ROM. We have to load it using the seabios > firmware interface. It expects to find the linear framebuffer at the > magic address 0xe000. It uses the bochs extentions to implement > vesa graphics support. > > So, what is wrong with this? > > First, I'd like to be able to load the vgabios via PCI ROM bar on all > pci vga cards (stdvga, vmware, soon qxl). The PCI ID in the bios has to > match the PCI ID of the card, so we'll need a bunch of vga bios > binaries, all identical except for the PCI ID. Or we need some kind of > binary patching. Otherwise seabios will not load them from the PCI ROM > bar. Make vga bios aware of all ids? Then single vga bios image can handle all vga devices. If new device is introduced (qxl), new id will be added to id table I thing that adding a new vga device is rare. Or do you want to more generic/dynamic solution? > Second, I want to get rid of the magic address 0xe000 (except for > isa-vga). This is basically just a matter of updating to vgabios > version 0.6c. And this needs one vga bios binary per vga card too as > the PCI ID is used to lookup the card (and then the framebuffer address) > in PCI config space. I also want to eliminate the magic number. Boch vga bios is paravirtualized to get VBE address dynamically via a new pci device with special pci device of device id 0x1234. Are you meaning to follow bochs way? Or how about to use qemu fw interface to get the address dynamically? > > Comments? Especially on the binary patching? Worth it? Or just build > a bunch of binaries? They are not *that* big after all ... > > cheers, > Gerd > > -- yamahata
Re: [Qemu-devel] question on virtio
Michael S. Tsirkin wrote: > Hi! > I see this in virtio_ring.c: > > /* Put entry in available array (but don't update avail->idx * > until they do sync). */ > > Why is it done this way? > It seems that updating the index straight away would be simpler, while > this might allow the host to specilatively look up the buffer and handle > it, without waiting for the kick. Even better, if the host updates a location containing which index it has seen recently, you can avoid the kick entirely during sustained flows - just like your recent patch to avoid sending irqs to the guest. -- Jamie
[Qemu-devel] [PATCH] vdi: Fix image creation
The number of blocks needs to be rounded up to cover all of the virtual hard disk. Without this fix, we can't even open our own images if their size is not a multiple of the block size. Signed-off-by: Kevin Wolf --- block/vdi.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/block/vdi.c b/block/vdi.c index 2b4d2c2..b990bbc 100644 --- a/block/vdi.c +++ b/block/vdi.c @@ -827,7 +827,7 @@ static int vdi_create(const char *filename, QEMUOptionParameter *options) return -errno; } -blocks = bytes / block_size; +blocks = (bytes + block_size - 1) / block_size; bmap_size = blocks * sizeof(uint32_t); bmap_size = ((bmap_size + SECTOR_SIZE - 1) & ~(SECTOR_SIZE -1)); -- 1.6.6.1
Re: [Qemu-devel] [PATCH] QMP: Spec: Private Extensions support
Anthony, no reply from you; did it fall through the cracks? If you're fine with my draft, I'll turn it into a proper patch. Markus Armbruster writes: > Anthony asked me to take a stab at rewriting his draft to something more > along the lines of what I'm thinking. Here goes. I put some remarks > [in brackets]. > > FYI, I'll be out of town until Wednesday. > > > 6. Downstream extension of QMP > -- > > We recommend that downstream consumers of QEMU do *not* modify QMP. > Management tools should be able to support both upstream and downstream > versions of QMP without special logic, and downstream extensions are > inherently at odds with that. > > However, we recognize that it is sometimes impossible for downstreams to > avoid modifying QMP. Both upstream and downstream need to take care to > preserve long-term compatibility and interoperability. > > To help with that, QMP reserves JSON object member names beginning with > '__' (double underscore) for downstream use ("downstream names"). This > means upstream will never use any downstream names for its commands, > arguments, errors, asynchronous events, and so forth. > > Any new names downstream wishes to add must begin with '__'. To ensure > compatibility with other downstreams, it is strongly recommended that > you prefix the commands with '__RFQDN_' where RFQDN is a valid, reverse > fully qualified domain name which you control. For example, a qemu-kvm > specific monitor command would be: > > (qemu) __org.linux-kvm_enable_irqchip > > Downstream must not change the server greeting (section 2.2) other than > to offer additional capabilities. But see below for why even that is > discouraged. > > Section '5 Compatibility Considerations' applies to downstream as well > as to upstream, obviously. [That section needs work!] It follows that > downstream must behave exactly like upstream for any input not > containing members with downstream names ("downstream members"), except > it may add members with downstream names to its output. > > Thus, a client should not be able to distinguish downstream from > upstream as long as it doesn't send input with downstream members, and > properly ignores any downstream members in the output it receives. > > [I fully support everything up to this point. I have some reservations > on the rest, and I doubt it'll be all that useful, but I don't really > mind having it, at least not in this form.] > > Advice on downstream modifications: > [I made a honest effort at capturing Anthony's intentions here, > my apologies if I screwed it up.] > > 1. Introducing new commands is okay. If you want to extend an existing >command, consider introducing a new one with the new behaviour >instead. [FIXME Could use a rationale: why is extending bad? Make >sure to cover errors, because that's needed for 3.] > > 2. Introducing new asynchronous messages is okay. If you want to extend >an existing message, consider adding a new one instead. [FIXME Could >use a rationale: why is extending bad?] > > 3. Introducing new errors for use in new commands is okay. Adding new >errors to existing commands counts as extension, so 1. applies. > > 4. New capabilities are strongly discouraged. Capabilities are for >evolving the basic protocol, and multiple diverging basic protocol >dialects are most undesirable.
[Qemu-devel] Adding AUDIO CD support in Qemu : Buffer I/O Error on device sr0
Dear Developers, I am working on a Xen HVM platform and trying to add AUDIO CD support to Qemu. I modified ide_atapi_cmd function to handle audio media. This includes changing cdrom_read_toc to set ADR/Control bits to 0x10 if media is CDS_AUDIO. Changed GPCMD_MODE_SENSE_10 case to include cdda bit. I also modified READ_CD to do a ide_atapi_cmd_read for case 0xf0 and have added GPCMD_READ_SUBCHANNEL command too. Beacuse of the above changes, now when i load audio cd and do 'xm block-configure ...' Windows Media Player now gets launched but it is still unable to play the track. Dom0 Dmesg shows the following error: [ 1296.401294] sr 1:0:0:0: [sr0] Result: hostbyte=0x00 driverbyte=0x08 [ 1296.401483] sr 1:0:0:0: [sr0] Sense Key : 0x5 [current] [ 1296.401772] sr 1:0:0:0: [sr0] ASC=0x64 ASCQ=0x0 [ 1296.402028] end_request: I/O error, dev sr0, sector 0 [ 1296.402129] Buffer I/O error on device sr0, logical block 0 [ 1296.402232] Buffer I/O error on device sr0, logical block 1 [ 1296.402345] Buffer I/O error on device sr0, logical block 2 [ 1296.402450] Buffer I/O error on device sr0, logical block 3 [ 1296.405520] sr 1:0:0:0: [sr0] Result: hostbyte=0x00 driverbyte=0x08 [ 1296.405712] sr 1:0:0:0: [sr0] Sense Key : 0x5 [current] [ 1296.406023] sr 1:0:0:0: [sr0] ASC=0x64 ASCQ=0x0 [ 1296.406253] end_request: I/O error, dev sr0, sector 0 [ 1296.406353] Buffer I/O error on device sr0, logical block 0 Tracing qemu i get the following call chain from READ CD handler: [ide_atapi_cmd](lba=0x200 nb_sec=10 trnsfr=f0) ATAPI Packet= be 04 00 00 02 00 00 00 10 f0 00 00 [ide_atapi_cmd_read_dma]read dma: LBA=512 nb_sectors=16 [[ide_atapi_cmd_read_dma_cb]]aio_read_cd: lba=512 n=16 [bdrv_aio_read] [bdrv_check_request] ret=0 [bdrv_aio_read] [ide_atapi_cmd_read_dma_cb] ret<0 goto eot [dma_bdrv_cb](dbs->sg_cur_index == dbs->sg->nsg || ret < 0) Frankly, from this point i do not understand what is happening. Can any one of you please help me out understanding what is the reason that READ CD command is failing and what is the reason of the Buffer I/O error in dmesg. It will be great if someone can give me some pointers to understand the root cause or suggest someways to further debug this. Thanks & Regards, Shubham
Re: [Qemu-devel] Re: How to map PCI memory into the VM without trapping
On Wed, May 5, 2010 at 5:20 PM, Frank Berreth wrote: > Hi there, > this is just an update if you are interested in the outcome. I turns out > that my initial assumption that there would be page faults/trapping on the > memory pages was false. The reason the throughput is so low is because the > memory was mapped non-cached. The VGA driver and the ivshmem driver use > pci_ioremap_bar which will *always* map the PCI bar non-cached (even the > resourceX_wc). > Changing the driver(s) to use ioremap_cache or ioremap_wc speeds up things > quite a bit. I don't know if VGA framebuffer was always mapped this way -- > it appears on a real system that usually graphics memory is mapped WC. > Mapping it UC would cause quite a performance degradation. This could be the > reason for the reported VGA performance drop in another email thread. IMHO, > since QEMU emulates VGA, this could be mapped WB. > Thanks, > Frank. > Hi Frank, Thanks for the note. I'll make the change to cached and see if that helps with some of the tests I'm running. Cheers, Cam
[Qemu-devel] [PATCH 0/5] tcg: Initialize prologue after guest_base fixed
By doing this we can make any number of decisions about code generation during the prologue. For instance, we can decide whether or not to reserve a register to hold the value of GUEST_BASE. This fixes a latent bug in the two ppc ports in that GUEST_BASE was being loaded as an immediate before the final value of GUEST_BASE is computed. This also fixes some register reservation problems I noticed in the ia64 port, while trying to decide which register to use for holding GUEST_BASE across the TB. r~ Richard Henderson (5): tcg: Initialize the prologue after GUEST_BASE is fixed. tcg-hppa: Load GUEST_BASE as an immediate. tcg-ia64: Fix some register usage issues. tcg-ia64: Load GUEST_BASE into a register. tcg-ppc: Conditionally reserve TCG_GUEST_BASE_REG. bsd-user/main.c|9 ++- exec.c |5 + linux-user/main.c |9 ++- tcg/hppa/tcg-target.c | 12 +-- tcg/ia64/tcg-target.c | 199 +++- tcg/ppc/tcg-target.c |8 +- tcg/ppc64/tcg-target.c |9 +- tcg/tcg.c |3 + tcg/tcg.h |1 + 9 files changed, 167 insertions(+), 88 deletions(-)
[Qemu-devel] [PATCH 5/5] tcg-ppc: Conditionally reserve TCG_GUEST_BASE_REG.
We need not reserve the register unless we're going to use it. Signed-off-by: Richard Henderson --- tcg/ppc/tcg-target.c |8 tcg/ppc64/tcg-target.c |9 - 2 files changed, 8 insertions(+), 9 deletions(-) diff --git a/tcg/ppc/tcg-target.c b/tcg/ppc/tcg-target.c index ce078e4..654d244 100644 --- a/tcg/ppc/tcg-target.c +++ b/tcg/ppc/tcg-target.c @@ -933,7 +933,10 @@ void tcg_target_qemu_prologue (TCGContext *s) tcg_out32 (s, STW | RS (0) | RA (1) | (frame_size + LR_OFFSET)); #ifdef CONFIG_USE_GUEST_BASE -tcg_out_movi (s, TCG_TYPE_I32, TCG_GUEST_BASE_REG, GUEST_BASE); +if (GUEST_BASE) { +tcg_out_movi (s, TCG_TYPE_I32, TCG_GUEST_BASE_REG, GUEST_BASE); +tcg_regset_set_reg(s->reserved_regs, TCG_GUEST_BASE_REG); +} #endif tcg_out32 (s, MTSPR | RS (3) | CTR); @@ -1914,9 +1917,6 @@ void tcg_target_init(TCGContext *s) #ifdef _CALL_SYSV tcg_regset_set_reg(s->reserved_regs, TCG_REG_R13); #endif -#ifdef CONFIG_USE_GUEST_BASE -tcg_regset_set_reg(s->reserved_regs, TCG_GUEST_BASE_REG); -#endif tcg_add_target_add_op_defs(ppc_op_defs); } diff --git a/tcg/ppc64/tcg-target.c b/tcg/ppc64/tcg-target.c index 2d436a5..a1e643c 100644 --- a/tcg/ppc64/tcg-target.c +++ b/tcg/ppc64/tcg-target.c @@ -899,7 +899,10 @@ void tcg_target_qemu_prologue (TCGContext *s) tcg_out32 (s, STD | RS (0) | RA (1) | (frame_size + 16)); #ifdef CONFIG_USE_GUEST_BASE -tcg_out_movi (s, TCG_TYPE_I64, TCG_GUEST_BASE_REG, GUEST_BASE); +if (GUEST_BASE) { +tcg_out_movi (s, TCG_TYPE_I64, TCG_GUEST_BASE_REG, GUEST_BASE); +tcg_regset_set_reg(s->reserved_regs, TCG_GUEST_BASE_REG); +} #endif tcg_out32 (s, MTSPR | RS (3) | CTR); @@ -1692,9 +1695,5 @@ void tcg_target_init (TCGContext *s) #endif tcg_regset_set_reg (s->reserved_regs, TCG_REG_R13); -#ifdef CONFIG_USE_GUEST_BASE -tcg_regset_set_reg(s->reserved_regs, TCG_GUEST_BASE_REG); -#endif - tcg_add_target_add_op_defs (ppc_op_defs); } -- 1.7.0.1
[Qemu-devel] Re: [PATCH v5 4/5] Inter-VM shared memory PCI device
On 04/21/2010 12:53 PM, Cam Macdonell wrote: Support an inter-vm shared memory device that maps a shared-memory object as a PCI device in the guest. This patch also supports interrupts between guest by communicating over a unix domain socket. This patch applies to the qemu-kvm repository. -device ivshmem,size=[,shm=] Interrupts are supported between multiple VMs by using a shared memory server by using a chardev socket. -device ivshmem,size=[,shm=] [,chardev=][,msi=on][,irqfd=on][,vectors=n] -chardev socket,path=,id= (shared memory server is qemu.git/contrib/ivshmem-server) Sample programs and init scripts are in a git repo here: www.gitorious.org/nahanni --- Makefile.target |3 + hw/ivshmem.c| 727 +++ qemu-char.c |6 + qemu-char.h |3 + qemu-doc.texi | 25 ++ 5 files changed, 764 insertions(+), 0 deletions(-) create mode 100644 hw/ivshmem.c diff --git a/Makefile.target b/Makefile.target index 1ffd802..bc9a681 100644 --- a/Makefile.target +++ b/Makefile.target @@ -199,6 +199,9 @@ obj-$(CONFIG_USB_OHCI) += usb-ohci.o obj-y += rtl8139.o obj-y += e1000.o +# Inter-VM PCI shared memory +obj-y += ivshmem.o + # Hardware support obj-i386-y = pckbd.o dma.o obj-i386-y += vga.o diff --git a/hw/ivshmem.c b/hw/ivshmem.c new file mode 100644 index 000..f8d8fdb --- /dev/null +++ b/hw/ivshmem.c @@ -0,0 +1,727 @@ +/* + * Inter-VM Shared Memory PCI device. + * + * Author: + * Cam Macdonell + * + * Based On: cirrus_vga.c and rtl8139.c + * + * This code is licensed under the GNU GPL v2. + */ +#include +#include +#include +#include +#include +#include This will break the Windows along with any non-Linux unix or any Linux old enough to not have eventfd support. If it's based on cirrus_vga.c and rtl8139.c, then it ought to carry the respective copyrights, no? Regards, Anthony Liguori +#include "hw.h" +#include "console.h" +#include "pc.h" +#include "pci.h" +#include "sysemu.h" + +#include "msix.h" +#include "qemu-kvm.h" +#include "libkvm.h" + +#include +#include +#include +#include + +#define IVSHMEM_IRQFD 0 +#define IVSHMEM_MSI 1 + +#define DEBUG_IVSHMEM +#ifdef DEBUG_IVSHMEM +#define IVSHMEM_DPRINTF(fmt, args...)\ +do {printf("IVSHMEM: " fmt, ##args); } while (0) +#else +#define IVSHMEM_DPRINTF(fmt, args...) +#endif + +typedef struct EventfdEntry { +PCIDevice *pdev; +int vector; +} EventfdEntry; + +typedef struct IVShmemState { +PCIDevice dev; +uint32_t intrmask; +uint32_t intrstatus; +uint32_t doorbell; + +CharDriverState * chr; +CharDriverState ** eventfd_chr; +int ivshmem_mmio_io_addr; + +pcibus_t mmio_addr; +unsigned long ivshmem_offset; +uint64_t ivshmem_size; /* size of shared memory region */ +int shm_fd; /* shared memory file descriptor */ + +int nr_allocated_vms; +/* array of eventfds for each guest */ +int ** eventfds; +/* keep track of # of eventfds for each guest*/ +int * eventfds_posn_count; + +int nr_alloc_guests; +int vm_id; +int num_eventfds; +uint32_t vectors; +uint32_t features; +EventfdEntry *eventfd_table; + +char * shmobj; +char * sizearg; +} IVShmemState; + +/* registers for the Inter-VM shared memory device */ +enum ivshmem_registers { +IntrMask = 0, +IntrStatus = 4, +IVPosition = 8, +Doorbell = 12, +}; + +static inline uint32_t ivshmem_has_feature(IVShmemState *ivs, int feature) { +return (ivs->features& (1<< feature)); +} + +static inline int is_power_of_two(int x) { +return (x& (x-1)) == 0; +} + +static void ivshmem_map(PCIDevice *pci_dev, int region_num, +pcibus_t addr, pcibus_t size, int type) +{ +IVShmemState *s = DO_UPCAST(IVShmemState, dev, pci_dev); + +IVSHMEM_DPRINTF("addr = %u size = %u\n", (uint32_t)addr, (uint32_t)size); +cpu_register_physical_memory(addr, s->ivshmem_size, s->ivshmem_offset); + +} + +/* accessing registers - based on rtl8139 */ +static void ivshmem_update_irq(IVShmemState *s, int val) +{ +int isr; +isr = (s->intrstatus& s->intrmask)& 0x; + +/* don't print ISR resets */ +if (isr) { +IVSHMEM_DPRINTF("Set IRQ to %d (%04x %04x)\n", + isr ? 1 : 0, s->intrstatus, s->intrmask); +} + +qemu_set_irq(s->dev.irq[0], (isr != 0)); +} + +static void ivshmem_IntrMask_write(IVShmemState *s, uint32_t val) +{ +IVSHMEM_DPRINTF("IntrMask write(w) val = 0x%04x\n", val); + +s->intrmask = val; + +ivshmem_update_irq(s, val); +} + +static uint32_t ivshmem_IntrMask_read(IVShmemState *s) +{ +uint32_t ret = s->intrmask; + +IVSHMEM_DPRINTF("intrmask read(w) val = 0x%04x\n", ret); + +return ret; +} + +static void ivshmem_IntrStatus_write(IVShmemState *s, uint32_t val) +{ +IVSHMEM_DPRINTF("IntrStatus write(w) val = 0x%04x\n", val); + +s->intrstatus = val; + +ivshmem_u
[Qemu-devel] [PATCH 3/5] tcg-ia64: Fix some register usage issues.
(1) The output registers were not marked call-clobbered, even though they can be modified by called functions. (2) The thread pointer was not marked reserved. (3) R4-R6 are call-saved, but not saved by the prologue. Rather than save them, mark them reserved so that we don't use them. Signed-off-by: Richard Henderson --- tcg/ia64/tcg-target.c | 67 ++-- 1 files changed, 42 insertions(+), 25 deletions(-) diff --git a/tcg/ia64/tcg-target.c b/tcg/ia64/tcg-target.c index 401dfec..acd4ce8 100644 --- a/tcg/ia64/tcg-target.c +++ b/tcg/ia64/tcg-target.c @@ -2283,39 +2283,56 @@ void tcg_target_init(TCGContext *s) 0xull); tcg_regset_set(tcg_target_available_regs[TCG_TYPE_I64], 0xull); -tcg_regset_set(tcg_target_call_clobber_regs, - (1 << TCG_REG_R8) | - (1 << TCG_REG_R9) | - (1 << TCG_REG_R10) | - (1 << TCG_REG_R11) | - (1 << TCG_REG_R13) | - (1 << TCG_REG_R14) | - (1 << TCG_REG_R15) | - (1 << TCG_REG_R16) | - (1 << TCG_REG_R17) | - (1 << TCG_REG_R18) | - (1 << TCG_REG_R19) | - (1 << TCG_REG_R20) | - (1 << TCG_REG_R21) | - (1 << TCG_REG_R22) | - (1 << TCG_REG_R23) | - (1 << TCG_REG_R24) | - (1 << TCG_REG_R25) | - (1 << TCG_REG_R26) | - (1 << TCG_REG_R27) | - (1 << TCG_REG_R28) | - (1 << TCG_REG_R29) | - (1 << TCG_REG_R30) | - (1 << TCG_REG_R31)); -tcg_regset_clear(s->reserved_regs); +tcg_regset_clear(tcg_target_call_clobber_regs); +tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_R8); +tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_R9); +tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_R10); +tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_R11); +tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_R14); +tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_R15); +tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_R16); +tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_R17); +tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_R18); +tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_R19); +tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_R20); +tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_R21); +tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_R22); +tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_R23); +tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_R24); +tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_R25); +tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_R26); +tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_R27); +tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_R28); +tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_R29); +tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_R30); +tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_R31); +tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_R56); +tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_R57); +tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_R58); +tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_R59); +tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_R60); +tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_R61); +tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_R62); +tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_R63); + +tcg_regset_clear(s->reserved_regs); tcg_regset_set_reg(s->reserved_regs, TCG_REG_R0); /* zero register */ tcg_regset_set_reg(s->reserved_regs, TCG_REG_R1); /* global pointer */ tcg_regset_set_reg(s->reserved_regs, TCG_REG_R2); /* internal use */ tcg_regset_set_reg(s->reserved_regs, TCG_REG_R3); /* internal use */ tcg_regset_set_reg(s->reserved_regs, TCG_REG_R12); /* stack pointer */ +tcg_regset_set_reg(s->reserved_regs, TCG_REG_R13); /* thread pointer */ tcg_regset_set_reg(s->reserved_regs, TCG_REG_R32); /* return address */ tcg_regset_set_reg(s->reserved_regs, TCG_REG_R33); /* PFS */ +/* The following 3 are not in use, are call-saved, but *not* saved + by the prologue. Therefore we cannot use them without modifying + the prologue. There doesn't seem to be any good reason to use + these as opposed to the windowed registers. */ +tcg_regset_set_reg(s->reserved_regs, TCG_REG_R4); +tcg_regset_set_reg(s->reserved_regs, TCG_REG_R5); +tcg_regset_set
[Qemu-devel] [PATCH 2/5] tcg-hppa: Load GUEST_BASE as an immediate.
Now that the prologue is generated after GUEST_BASE is fixed, we can load it as an immediate, and also avoid reserving the register if it isn't necessary. Signed-off-by: Richard Henderson --- tcg/hppa/tcg-target.c | 12 1 files changed, 4 insertions(+), 8 deletions(-) diff --git a/tcg/hppa/tcg-target.c b/tcg/hppa/tcg-target.c index 012e486..a5f2162 100644 --- a/tcg/hppa/tcg-target.c +++ b/tcg/hppa/tcg-target.c @@ -1629,11 +1629,10 @@ void tcg_target_qemu_prologue(TCGContext *s) } #ifdef CONFIG_USE_GUEST_BASE -/* Note that GUEST_BASE can change after the prologue is generated. - To combat that, load the value from the variable instead of - embedding a constant here. */ -tcg_out_ld(s, TCG_TYPE_PTR, TCG_GUEST_BASE_REG, - TCG_REG_R0, (tcg_target_long)&guest_base); +if (GUEST_BASE != 0) { +tcg_out_movi(s, TCG_TYPE_PTR, TCG_GUEST_BASE_REG, GUEST_BASE); +tcg_regset_set_reg(s->reserved_regs, TCG_GUEST_BASE_REG); +} #endif /* Jump to TB, and adjust R18 to be the return address. */ @@ -1679,9 +1678,6 @@ void tcg_target_init(TCGContext *s) tcg_regset_set_reg(s->reserved_regs, TCG_REG_DP); /* data pointer */ tcg_regset_set_reg(s->reserved_regs, TCG_REG_SP); /* stack pointer */ tcg_regset_set_reg(s->reserved_regs, TCG_REG_R31); /* ble link reg */ -#ifdef CONFIG_USE_GUEST_BASE -tcg_regset_set_reg(s->reserved_regs, TCG_GUEST_BASE_REG); -#endif tcg_add_target_add_op_defs(hppa_op_defs); } -- 1.7.0.1
Re: [Qemu-devel] [PATCH] QMP: Spec: Private Extensions support
On 05/06/2010 10:52 AM, Markus Armbruster wrote: Anthony, no reply from you; did it fall through the cracks? If you're fine with my draft, I'll turn it into a proper patch. Yes, sorry, I thought I had already responded as such. Regards, Anthony Liguori Markus Armbruster writes: Anthony asked me to take a stab at rewriting his draft to something more along the lines of what I'm thinking. Here goes. I put some remarks [in brackets]. FYI, I'll be out of town until Wednesday. 6. Downstream extension of QMP -- We recommend that downstream consumers of QEMU do *not* modify QMP. Management tools should be able to support both upstream and downstream versions of QMP without special logic, and downstream extensions are inherently at odds with that. However, we recognize that it is sometimes impossible for downstreams to avoid modifying QMP. Both upstream and downstream need to take care to preserve long-term compatibility and interoperability. To help with that, QMP reserves JSON object member names beginning with '__' (double underscore) for downstream use ("downstream names"). This means upstream will never use any downstream names for its commands, arguments, errors, asynchronous events, and so forth. Any new names downstream wishes to add must begin with '__'. To ensure compatibility with other downstreams, it is strongly recommended that you prefix the commands with '__RFQDN_' where RFQDN is a valid, reverse fully qualified domain name which you control. For example, a qemu-kvm specific monitor command would be: (qemu) __org.linux-kvm_enable_irqchip Downstream must not change the server greeting (section 2.2) other than to offer additional capabilities. But see below for why even that is discouraged. Section '5 Compatibility Considerations' applies to downstream as well as to upstream, obviously. [That section needs work!] It follows that downstream must behave exactly like upstream for any input not containing members with downstream names ("downstream members"), except it may add members with downstream names to its output. Thus, a client should not be able to distinguish downstream from upstream as long as it doesn't send input with downstream members, and properly ignores any downstream members in the output it receives. [I fully support everything up to this point. I have some reservations on the rest, and I doubt it'll be all that useful, but I don't really mind having it, at least not in this form.] Advice on downstream modifications: [I made a honest effort at capturing Anthony's intentions here, my apologies if I screwed it up.] 1. Introducing new commands is okay. If you want to extend an existing command, consider introducing a new one with the new behaviour instead. [FIXME Could use a rationale: why is extending bad? Make sure to cover errors, because that's needed for 3.] 2. Introducing new asynchronous messages is okay. If you want to extend an existing message, consider adding a new one instead. [FIXME Could use a rationale: why is extending bad?] 3. Introducing new errors for use in new commands is okay. Adding new errors to existing commands counts as extension, so 1. applies. 4. New capabilities are strongly discouraged. Capabilities are for evolving the basic protocol, and multiple diverging basic protocol dialects are most undesirable.
[Qemu-devel] Re: [Commit 20d97356] Fix OpenBSD build
On 5/5/10, Kevin Wolf wrote: > Hi Blue Swirl, > > did I miss the patch posted for this commit? I can't find it on the > mailing list. I think the BlockDriver declarations should really stay at > the end of the source file of each driver. > > What about using a different way to get rid of the static forward > declaration, namely using bdrv_find_format() instead of referencing it > directly? If you agree I'll post patches to revert your change and use > this solution instead. Somehow I can't form in my mind the change from your description, but I'd be fine with that too if it also compiles in OpenBSD.
[Qemu-devel] Re: [PATCH v5 4/5] Inter-VM shared memory PCI device
On Thu, May 6, 2010 at 11:32 AM, Anthony Liguori wrote: > On 04/21/2010 12:53 PM, Cam Macdonell wrote: >> >> Support an inter-vm shared memory device that maps a shared-memory object >> as a >> PCI device in the guest. This patch also supports interrupts between >> guest by >> communicating over a unix domain socket. This patch applies to the >> qemu-kvm >> repository. >> >> -device ivshmem,size=[,shm=] >> >> Interrupts are supported between multiple VMs by using a shared memory >> server >> by using a chardev socket. >> >> -device ivshmem,size=[,shm=] >> [,chardev=][,msi=on][,irqfd=on][,vectors=n] >> -chardev socket,path=,id= >> >> (shared memory server is qemu.git/contrib/ivshmem-server) >> >> Sample programs and init scripts are in a git repo here: >> >> www.gitorious.org/nahanni >> --- >> Makefile.target | 3 + >> hw/ivshmem.c | 727 >> +++ >> qemu-char.c | 6 + >> qemu-char.h | 3 + >> qemu-doc.texi | 25 ++ >> 5 files changed, 764 insertions(+), 0 deletions(-) >> create mode 100644 hw/ivshmem.c >> >> diff --git a/Makefile.target b/Makefile.target >> index 1ffd802..bc9a681 100644 >> --- a/Makefile.target >> +++ b/Makefile.target >> @@ -199,6 +199,9 @@ obj-$(CONFIG_USB_OHCI) += usb-ohci.o >> obj-y += rtl8139.o >> obj-y += e1000.o >> >> +# Inter-VM PCI shared memory >> +obj-y += ivshmem.o >> + >> # Hardware support >> obj-i386-y = pckbd.o dma.o >> obj-i386-y += vga.o >> diff --git a/hw/ivshmem.c b/hw/ivshmem.c >> new file mode 100644 >> index 000..f8d8fdb >> --- /dev/null >> +++ b/hw/ivshmem.c >> @@ -0,0 +1,727 @@ >> +/* >> + * Inter-VM Shared Memory PCI device. >> + * >> + * Author: >> + * Cam Macdonell >> + * >> + * Based On: cirrus_vga.c and rtl8139.c >> + * >> + * This code is licensed under the GNU GPL v2. >> + */ >> +#include >> +#include >> +#include >> +#include >> +#include >> +#include >> > > This will break the Windows along with any non-Linux unix or any Linux old > enough to not have eventfd support. I'll wrap it with #ifdef CONFIG_EVENTFD > If it's based on cirrus_vga.c and rtl8139.c, then it ought to carry the > respective copyrights, no? Sure, I can add those Cam > > Regards, > > Anthony Liguori > >> +#include "hw.h" >> +#include "console.h" >> +#include "pc.h" >> +#include "pci.h" >> +#include "sysemu.h" >> + >> +#include "msix.h" >> +#include "qemu-kvm.h" >> +#include "libkvm.h" >> + >> +#include >> +#include >> +#include >> +#include >> + >> +#define IVSHMEM_IRQFD 0 >> +#define IVSHMEM_MSI 1 >> + >> +#define DEBUG_IVSHMEM >> +#ifdef DEBUG_IVSHMEM >> +#define IVSHMEM_DPRINTF(fmt, args...) \ >> + do {printf("IVSHMEM: " fmt, ##args); } while (0) >> +#else >> +#define IVSHMEM_DPRINTF(fmt, args...) >> +#endif >> + >> +typedef struct EventfdEntry { >> + PCIDevice *pdev; >> + int vector; >> +} EventfdEntry; >> + >> +typedef struct IVShmemState { >> + PCIDevice dev; >> + uint32_t intrmask; >> + uint32_t intrstatus; >> + uint32_t doorbell; >> + >> + CharDriverState * chr; >> + CharDriverState ** eventfd_chr; >> + int ivshmem_mmio_io_addr; >> + >> + pcibus_t mmio_addr; >> + unsigned long ivshmem_offset; >> + uint64_t ivshmem_size; /* size of shared memory region */ >> + int shm_fd; /* shared memory file descriptor */ >> + >> + int nr_allocated_vms; >> + /* array of eventfds for each guest */ >> + int ** eventfds; >> + /* keep track of # of eventfds for each guest*/ >> + int * eventfds_posn_count; >> + >> + int nr_alloc_guests; >> + int vm_id; >> + int num_eventfds; >> + uint32_t vectors; >> + uint32_t features; >> + EventfdEntry *eventfd_table; >> + >> + char * shmobj; >> + char * sizearg; >> +} IVShmemState; >> + >> +/* registers for the Inter-VM shared memory device */ >> +enum ivshmem_registers { >> + IntrMask = 0, >> + IntrStatus = 4, >> + IVPosition = 8, >> + Doorbell = 12, >> +}; >> + >> +static inline uint32_t ivshmem_has_feature(IVShmemState *ivs, int >> feature) { >> + return (ivs->features& (1<< feature)); >> +} >> + >> +static inline int is_power_of_two(int x) { >> + return (x& (x-1)) == 0; >> +} >> + >> +static void ivshmem_map(PCIDevice *pci_dev, int region_num, >> + pcibus_t addr, pcibus_t size, int type) >> +{ >> + IVShmemState *s = DO_UPCAST(IVShmemState, dev, pci_dev); >> + >> + IVSHMEM_DPRINTF("addr = %u size = %u\n", (uint32_t)addr, >> (uint32_t)size); >> + cpu_register_physical_memory(addr, s->ivshmem_size, >> s->ivshmem_offset); >> + >> +} >> + >> +/* accessing registers - based on rtl8139 */ >> +static void ivshmem_update_irq(IVShmemState *s, int val) >> +{ >> + int isr; >> + isr = (s->intrstatus& s->intrmask)& 0x; >> + >> + /* don't print ISR resets */ >> + if (isr) { >> + IVSHMEM_DPRINTF("Set IRQ to %d (%04x %04x)\n", >> + isr ? 1 : 0, s->intrstatus, s->int
[Qemu-devel] Re: [PATCH] vdi: Fix image creation
Am 06.05.2010 14:55, schrieb Kevin Wolf: The number of blocks needs to be rounded up to cover all of the virtual hard disk. Without this fix, we can't even open our own images if their size is not a multiple of the block size. Signed-off-by: Kevin Wolf --- block/vdi.c | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/block/vdi.c b/block/vdi.c index 2b4d2c2..b990bbc 100644 --- a/block/vdi.c +++ b/block/vdi.c @@ -827,7 +827,7 @@ static int vdi_create(const char *filename, QEMUOptionParameter *options) return -errno; } - blocks = bytes / block_size; + blocks = (bytes + block_size - 1) / block_size; bmap_size = blocks * sizeof(uint32_t); bmap_size = ((bmap_size + SECTOR_SIZE - 1) & ~(SECTOR_SIZE -1)); 'bytes' (for header.disk_size) must be fixed, too, and so does vdi_open. I'll send a patch which hopefully addresses all these points. Thanks + kind regards, Stefan
[Qemu-devel] Re: vgabios + qemu: issues and plans.
Gerd Hoffmann wrote: Hi, First, I'd like to be able to load the vgabios via PCI ROM bar on all pci vga cards (stdvga, vmware, soon qxl). The PCI ID in the bios has to match the PCI ID of the card, so we'll need a bunch of vga bios binaries, all identical except for the PCI ID. Or we need some kind of binary patching. Otherwise seabios will not load them from the PCI ROM bar. vgabios could be build multiple times with PCIBIOS set and VENDOR_ID and DEVICE_ID supplied from the Makefile like it's already done with VGABIOS_DATE. Yes, something like this I have in mind. Works for stdvga and qxl. vmware fails as it has the memory at pci region 1 (region 0 has ioports) and it seems vgabios isn't prepared to handle that ... Do you mean vgabios currently doesn't work with qemu and "-vga vmware" ? Anyone knows what the best place to send patches to is? vgabios-developers list @ savannah looks pretty much dead. It has one (!) message in 2009 which is an unanswered question. Or is upstream dead anyway and we should just go with http://git.qemu.org/vgabios.git/ ? From the "amount" of messages on the mailing list i conclude vgabios just works good enough for most people. I would not say the developement is dead, but event driven and there is no itch to strach at the moment. Try sending patches to the mailing list and/or the vgabios home at http://savannah.nongnu.org/projects/vgabios . Sebastian cheers, Gerd
Re: [Qemu-devel] [PATCH 1/3] cursor: add cursor functions.
On 5/5/10, Gerd Hoffmann wrote: > Add a new cursor type to console.h and a bunch of functions to > deal with cursors the (new) cursor.c file. > > Signed-off-by: Gerd Hoffmann > --- > Makefile.objs |3 +- > console.h | 24 ++- > cursor.c | 208 > + > 3 files changed, 232 insertions(+), 3 deletions(-) > create mode 100644 cursor.c > > diff --git a/Makefile.objs b/Makefile.objs > index ecdd53e..1ee6e9d 100644 > --- a/Makefile.objs > +++ b/Makefile.objs > @@ -48,7 +48,8 @@ common-obj-y = $(block-obj-y) > common-obj-y += $(net-obj-y) > common-obj-y += $(qobject-obj-y) > common-obj-$(CONFIG_LINUX) += $(fsdev-obj-$(CONFIG_LINUX)) > -common-obj-y += readline.o console.o async.o qemu-error.o > +common-obj-y += readline.o console.o cursor.o async.o qemu-error.o > + > common-obj-y += tcg-runtime.o host-utils.o > common-obj-y += irq.o ioport.o input.o > common-obj-$(CONFIG_PTIMER) += ptimer.o > diff --git a/console.h b/console.h > index 6def115..88861cb 100644 > --- a/console.h > +++ b/console.h > @@ -126,6 +126,27 @@ struct DisplaySurface { > struct PixelFormat pf; > }; > > +/* cursor data format is 32bit RGBA */ > +typedef struct QEMUCursor { > +int width, height; > +int hot_x, hot_y; > +int refcount; > +uint32_tdata[]; > +} QEMUCursor; > + > +QEMUCursor *cursor_alloc(int width, int height); > +void cursor_get(QEMUCursor *c); > +void cursor_put(QEMUCursor *c); > +QEMUCursor *cursor_builtin_hidden(void); > +QEMUCursor *cursor_builtin_left_ptr(void); > +void cursor_print_ascii_art(QEMUCursor *c, const char *prefix); > +int cursor_get_mono_bpl(QEMUCursor *c); > +void cursor_set_mono(QEMUCursor *c, > + uint32_t foreground, uint32_t background, uint8_t > *image, > + int transparent, uint8_t *mask); > +void cursor_get_mono_image(QEMUCursor *c, int foreground, uint8_t *mask); > +void cursor_get_mono_mask(QEMUCursor *c, int transparent, uint8_t *mask); > + > struct DisplayChangeListener { > int idle; > uint64_t gui_timer_interval; > @@ -158,8 +179,7 @@ struct DisplayState { > struct DisplayChangeListener* listeners; > > void (*mouse_set)(int x, int y, int on); > -void (*cursor_define)(int width, int height, int bpp, int hot_x, int > hot_y, > - uint8_t *image, uint8_t *mask); > +void (*cursor_define)(QEMUCursor *cursor); > > struct DisplayState *next; > }; > diff --git a/cursor.c b/cursor.c > new file mode 100644 > index 000..3995a31 > --- /dev/null > +++ b/cursor.c > @@ -0,0 +1,208 @@ > +#include "qemu-common.h" > +#include "console.h" > + > +static const char cursor_hidden_32[32*32]; > +static const char cursor_left_ptr_32[32*32] = { > +"" > +" X " > +" XX " > +" X.X" > +" X..X " > +" X...X " > +" XX " > +" X.X" > +" X..X " > +" X...X " > +" XX " > +" X.X" > +" X..X..X" > +" X.X X..X " > +" XX X..X " > +" XX..X " > +" X..X " > +" X..X " > +" X..X " > +"XX " > +"" > +}; Is this format standard? How about using X bitmap format instead: $ cat /usr/include/X11/bitmaps/left_ptr #define left_ptr_width 16 #define left_ptr_height 16 #define left_ptr_x_hot 3 #define left_ptr_y_hot 1 static char left_ptr_bits[] = { 0x00, 0x00, 0x08, 0x00, 0x18, 0x00, 0x38, 0x00, 0x78, 0x00, 0xf8, 0x00, 0xf8, 0x01, 0xf8, 0x03, 0xf8, 0x07, 0xf8, 0x00, 0xd8, 0x00, 0x88, 0x01, 0x80, 0x01, 0x00, 0x03, 0x00, 0x03, 0x00, 0x00}; Then there would be no need of parsing. > + > +/* for creating built-in cursors */ > +static void cursor_parse_ascii_art(QEMUCursor *c, const char *ptr) > +{ > +int i, pixels; > + > +pixels = c->width * c->height; > +for (i = 0; i < pixels; i++) { > +switch (ptr[i]) { > +case 'X': /* black */ > +c->data[i] = 0xff00; > +break; > +case '.': /* white */ > +c->data[i] = 0x; > +break; > +case ' ': /* transparent */ > +default: > +c->data[i] = 0x; > +break; > +} > +} > +} > + > +/* nice for debugging */ > +void cursor_print_ascii_art(QEMUCursor *c, const cha
[Qemu-devel] [PATCH 1/5] tcg: Initialize the prologue after GUEST_BASE is fixed.
This will allow backends to make intelligent choices about how to implement GUEST_BASE. Signed-off-by: Richard Henderson --- bsd-user/main.c |9 - exec.c|5 + linux-user/main.c |9 - tcg/tcg.c |3 +++ tcg/tcg.h |1 + 5 files changed, 25 insertions(+), 2 deletions(-) diff --git a/bsd-user/main.c b/bsd-user/main.c index b1c438d..05cc3d9 100644 --- a/bsd-user/main.c +++ b/bsd-user/main.c @@ -30,7 +30,7 @@ #include "qemu-common.h" /* For tb_lock */ #include "exec-all.h" - +#include "tcg.h" #include "qemu-timer.h" #include "envlist.h" @@ -970,6 +970,13 @@ int main(int argc, char **argv) syscall_init(); signal_init(); +#if defined(CONFIG_USE_GUEST_BASE) +/* Now that we've loaded the binary, GUEST_BASE is fixed. Delay + generating the prologue until now so that the prologue can take + the real value of GUEST_BASE into account. */ +tcg_prologue_init(&tcg_ctx); +#endif + /* build Task State */ memset(ts, 0, sizeof(TaskState)); init_task_state(ts); diff --git a/exec.c b/exec.c index 95b92f7..f417bb6 100644 --- a/exec.c +++ b/exec.c @@ -574,6 +574,11 @@ void cpu_exec_init_all(unsigned long tb_size) #if !defined(CONFIG_USER_ONLY) io_mem_init(); #endif +#if !defined(CONFIG_USER_ONLY) || !defined(CONFIG_USE_GUEST_BASE) +/* There's no guest base to take into account, so go ahead and + initialize the prologue now. */ +tcg_prologue_init(&tcg_ctx); +#endif } #if defined(CPU_SAVE_VERSION) && !defined(CONFIG_USER_ONLY) diff --git a/linux-user/main.c b/linux-user/main.c index eafcfc1..331cc30 100644 --- a/linux-user/main.c +++ b/linux-user/main.c @@ -31,7 +31,7 @@ #include "cache-utils.h" /* For tb_lock */ #include "exec-all.h" - +#include "tcg.h" #include "qemu-timer.h" #include "envlist.h" @@ -2984,6 +2984,13 @@ int main(int argc, char **argv, char **envp) syscall_init(); signal_init(); +#if defined(CONFIG_USE_GUEST_BASE) +/* Now that we've loaded the binary, GUEST_BASE is fixed. Delay + generating the prologue until now so that the prologue can take + the real value of GUEST_BASE into account. */ +tcg_prologue_init(&tcg_ctx); +#endif + #if defined(TARGET_I386) cpu_x86_set_cpl(env, 3); diff --git a/tcg/tcg.c b/tcg/tcg.c index 53da2c4..9105266 100644 --- a/tcg/tcg.c +++ b/tcg/tcg.c @@ -240,7 +240,10 @@ void tcg_context_init(TCGContext *s) } tcg_target_init(s); +} +void tcg_prologue_init(TCGContext *s) +{ /* init global prologue and epilogue */ s->code_buf = code_gen_prologue; s->code_ptr = s->code_buf; diff --git a/tcg/tcg.h b/tcg/tcg.h index 166c889..bab13cd 100644 --- a/tcg/tcg.h +++ b/tcg/tcg.h @@ -346,6 +346,7 @@ static inline void *tcg_malloc(int size) } void tcg_context_init(TCGContext *s); +void tcg_prologue_init(TCGContext *s); void tcg_func_start(TCGContext *s); int tcg_gen_code(TCGContext *s, uint8_t *gen_code_buf); -- 1.7.0.1
[Qemu-devel] Re: vgabios + qemu: issues and plans.
Gerd Hoffmann wrote: First, I'd like to be able to load the vgabios via PCI ROM bar on all pci vga cards (stdvga, vmware, soon qxl). The PCI ID in the bios has to match the PCI ID of the card, so we'll need a bunch of vga bios binaries, all identical except for the PCI ID. Or we need some kind of binary patching. Otherwise seabios will not load them from the PCI ROM bar. Make vga bios aware of all ids? Would work for the code doing the lfb address lookup. But I don't think the PCI option rom header allows multiple entries. AFAIK it does not, but you can put multiple PCI option roms (concatenated) on the BAR (see "6.3.1. PCI Expansion ROM Contents"). I also want to eliminate the magic number. Boch vga bios is paravirtualized to get VBE address dynamically via a new pci device with special pci device of device id 0x1234. 0x1234 happens to be the vendor id the qemu stdvga has. Probably the same is true for bochs. So this code actually looks up the emulated vga card ;) Bochs and qemu use 0x1234:0x. Sebastian Are you meaning to follow bochs way? Yes. cheers, Gerd
[Qemu-devel] [PATCH] vdi: Fix image opening and creation for odd disk sizes
This patch fixes a regression introduced by commit 95a2f9bc588c3f83375d87b0a9394f89a1bcfada. The fix is based on a patch from Kevin Wolf. Here his comment: "The number of blocks needs to be rounded up to cover all of the virtual hard disk. Without this fix, we can't even open our own images if their size is not a multiple of the block size." While Kevin's patch addressed vdi_create, my modification also fixes vdi_open which now accepts any image which is large enough to hold the blocks. I also decided to keep the original code in vdi_create which rounds down. Rounding works in both directions, and there are good arguments for both, so I just left the original simple code. It is very important to use the rounded value for the new disk size, too - otherwise VirtualBox cannot open our disk image. Cc: Kevin Wolf Cc: François Revol Signed-off-by: Stefan Weil --- block/vdi.c |7 +-- 1 files changed, 5 insertions(+), 2 deletions(-) diff --git a/block/vdi.c b/block/vdi.c index 1d257b4..2213819 100644 --- a/block/vdi.c +++ b/block/vdi.c @@ -414,8 +414,8 @@ static int vdi_open(BlockDriverState *bs, int flags) } else if (header.block_size != 1 * MiB) { logout("unsupported block size %u B\n", header.block_size); goto fail; -} else if ((header.disk_size + header.block_size - 1) / header.block_size != - (uint64_t)header.blocks_in_image) { +} else if (header.disk_size < + (uint64_t)header.blocks_in_image * header.block_size) { logout("unexpected block number %u B\n", header.blocks_in_image); goto fail; } else if (!uuid_is_null(header.uuid_link)) { @@ -827,7 +827,10 @@ static int vdi_create(const char *filename, QEMUOptionParameter *options) return -errno; } +/* VirtualBox wants a disk size which is a multiple of the block size. */ blocks = bytes / block_size; +bytes = blocks * block_size; + bmap_size = blocks * sizeof(uint32_t); bmap_size = ((bmap_size + SECTOR_SIZE - 1) & ~(SECTOR_SIZE -1)); -- 1.7.0
Re: [Qemu-devel] eepro100: missing gpxe-eepro100-80862449.rom?
Am 05.05.2010 22:03, schrieb Sebastian Herbszt: eepro100_register_devices() references three option roms: - gpxe-eepro100-80861209.rom - gpxe-eepro100-80861229.rom - gpxe-eepro100-80862449.rom The last one seems to be missing because it's not in the pc-bios directory. Sebastian Yes, it is missing. I did not add it for several reasons: * The new variant i82801 (which needs the missing ROM) is currently not available via model=i82801 (it can be selected by experts using other methods). * Experts can get the missing ROM from ROM-o-matic. * All three ROMs differ only in one or two bytes (device id). That's a waste of binary space (56832 bytes!). Therefore I plan to remove all but the first ROM file. The missing ones can be created by make or patched on the fly by QEMU during load. Maybe it is also possible to create ROM files which support more than one vendor/device entry - that would be the best solution. Regards Stefan
[Qemu-devel] [PATCH 4/5] tcg-ia64: Load GUEST_BASE into a register.
Saves one bundle per memory operation. Signed-off-by: Richard Henderson --- tcg/ia64/tcg-target.c | 132 1 files changed, 88 insertions(+), 44 deletions(-) diff --git a/tcg/ia64/tcg-target.c b/tcg/ia64/tcg-target.c index acd4ce8..0d275e9 100644 --- a/tcg/ia64/tcg-target.c +++ b/tcg/ia64/tcg-target.c @@ -40,6 +40,12 @@ static const char * const tcg_target_reg_names[TCG_TARGET_NB_REGS] = { }; #endif +#ifdef CONFIG_USE_GUEST_BASE +#define TCG_GUEST_BASE_REG TCG_REG_R55 +#else +#define TCG_GUEST_BASE_REG TCG_REG_R0 +#endif + /* Branch registers */ enum { TCG_REG_B0 = 0, @@ -1641,9 +1647,13 @@ static inline void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, int opc) static inline void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, int opc) { +static uint64_t const opc_ld_m1[4] = { +OPC_LD1_M1, OPC_LD2_M1, OPC_LD4_M1, OPC_LD8_M1 +}; +static uint64_t const opc_sxt_i29[4] = { +OPC_SXT1_I29, OPC_SXT2_I29, OPC_SXT4_I29, 0 +}; int addr_reg, data_reg, mem_index, s_bits, bswap; -uint64_t opc_ld_m1[4] = { OPC_LD1_M1, OPC_LD2_M1, OPC_LD4_M1, OPC_LD8_M1 }; -uint64_t opc_sxt_i29[8] = { OPC_SXT1_I29, OPC_SXT2_I29, OPC_SXT4_I29, 0 }; data_reg = *args++; addr_reg = *args++; @@ -1656,19 +1666,21 @@ static inline void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, int opc) bswap = 0; #endif -tcg_out_bundle(s, mLX, - tcg_opc_m48(TCG_REG_P0, OPC_NOP_M48, 0), - tcg_opc_l2 ((tcg_target_long) GUEST_BASE), - tcg_opc_x2 (TCG_REG_P0, OPC_MOVL_X2, TCG_REG_R2, - GUEST_BASE)); - #if TARGET_LONG_BITS == 32 -tcg_out_bundle(s, mII, - tcg_opc_m48(TCG_REG_P0, OPC_NOP_M48, 0), - tcg_opc_i29(TCG_REG_P0, OPC_ZXT4_I29, - TCG_REG_R3, addr_reg), - tcg_opc_a1 (TCG_REG_P0, OPC_ADD_A1, TCG_REG_R2, - TCG_REG_R2, TCG_REG_R3)); +if (GUEST_BASE != 0) { +tcg_out_bundle(s, mII, + tcg_opc_m48(TCG_REG_P0, OPC_NOP_M48, 0), + tcg_opc_i29(TCG_REG_P0, OPC_ZXT4_I29, + TCG_REG_R3, addr_reg), + tcg_opc_a1 (TCG_REG_P0, OPC_ADD_A1, TCG_REG_R2, + TCG_GUEST_BASE_REG, TCG_REG_R3)); +} else { +tcg_out_bundle(s, miI, + tcg_opc_m48(TCG_REG_P0, OPC_NOP_M48, 0), + tcg_opc_i29(TCG_REG_P0, OPC_ZXT4_I29, + TCG_REG_R2, addr_reg), + tcg_opc_i18(TCG_REG_P0, OPC_NOP_I18, 0)); +} if (!bswap || s_bits == 0) { if (s_bits == opc) { @@ -1724,12 +1736,20 @@ static inline void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, int opc) } } #else -tcg_out_bundle(s, MmI, - tcg_opc_a1 (TCG_REG_P0, OPC_ADD_A1, TCG_REG_R2, - TCG_REG_R2, addr_reg), - tcg_opc_m1 (TCG_REG_P0, opc_ld_m1[s_bits], - data_reg, TCG_REG_R2), - tcg_opc_i18(TCG_REG_P0, OPC_NOP_I18, 0)); +if (GUEST_BASE != 0) { +tcg_out_bundle(s, MmI, + tcg_opc_a1 (TCG_REG_P0, OPC_ADD_A1, TCG_REG_R2, + TCG_GUEST_BASE_REG, addr_reg), + tcg_opc_m1 (TCG_REG_P0, opc_ld_m1[s_bits], + data_reg, TCG_REG_R2), + tcg_opc_i18(TCG_REG_P0, OPC_NOP_I18, 0)); +} else { +tcg_out_bundle(s, mmI, + tcg_opc_m48(TCG_REG_P0, OPC_NOP_M48, 0), + tcg_opc_m1 (TCG_REG_P0, opc_ld_m1[s_bits], + data_reg, addr_reg), + tcg_opc_i18(TCG_REG_P0, OPC_NOP_I18, 0)); +} if (bswap && s_bits == 1) { tcg_out_bundle(s, mII, @@ -1764,8 +1784,13 @@ static inline void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, int opc) static inline void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, int opc) { +static uint64_t const opc_st_m4[4] = { +OPC_ST1_M4, OPC_ST2_M4, OPC_ST4_M4, OPC_ST8_M4 +}; int addr_reg, data_reg, bswap; -uint64_t opc_st_m4[4] = { OPC_ST1_M4, OPC_ST2_M4, OPC_ST4_M4, OPC_ST8_M4 }; +#if TARGET_LONG_BITS == 64 +uint64_t add_guest_base; +#endif data_reg = *args++; addr_reg = *args++; @@ -1776,19 +1801,22 @@ static inline void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, int opc) bswap = 0; #endif -tcg_out_bundle(s, mLX, - tcg_opc_m48(TCG_REG_P0, OPC_NOP_M48, 0), - tcg_opc_l2 ((tcg_target_long) GUEST_BASE), - tcg_opc_x2 (TCG_REG_P0, OPC_MOVL_X2, TCG_REG_R2, - GUEST_BASE)); - #if TARGET_LONG_BI
[Qemu-devel] [PATCH] block/vdi: Allow disk images of size 0
Even it is not very useful, users may create images of size 0. Without the special option CONFIG_ZERO_MALLOC, qemu_mallocz aborts execution when it is told to allocate 0 bytes, so avoid this kind of call. Cc: Kevin Wolf Signed-off-by: Stefan Weil --- block/vdi.c |9 +++-- 1 files changed, 7 insertions(+), 2 deletions(-) diff --git a/block/vdi.c b/block/vdi.c index 2213819..02b9fea 100644 --- a/block/vdi.c +++ b/block/vdi.c @@ -435,7 +435,9 @@ static int vdi_open(BlockDriverState *bs, int flags) bmap_size = header.blocks_in_image * sizeof(uint32_t); bmap_size = (bmap_size + SECTOR_SIZE - 1) / SECTOR_SIZE; -s->bmap = qemu_malloc(bmap_size * SECTOR_SIZE); +if (bmap_size > 0) { +s->bmap = qemu_malloc(bmap_size * SECTOR_SIZE); +} if (bdrv_read(bs->file, s->bmap_sector, (uint8_t *)s->bmap, bmap_size) < 0) { goto fail_free_bmap; } @@ -860,7 +862,10 @@ static int vdi_create(const char *filename, QEMUOptionParameter *options) result = -errno; } -bmap = (uint32_t *)qemu_mallocz(bmap_size); +bmap = NULL; +if (bmap_size > 0) { +bmap = (uint32_t *)qemu_mallocz(bmap_size); +} for (i = 0; i < blocks; i++) { if (image_type == VDI_TYPE_STATIC) { bmap[i] = i; -- 1.7.0
[Qemu-devel] Re: sparc64 lazy conditional codes evaluation
On 5/5/10, Igor Kovalenko wrote: > On Wed, May 5, 2010 at 12:21 AM, Blue Swirl wrote: > > On 5/3/10, Igor Kovalenko wrote: > >> On Mon, May 3, 2010 at 11:54 PM, Blue Swirl wrote: > >> > On 5/3/10, Igor Kovalenko wrote: > >> >> On Mon, May 3, 2010 at 11:24 PM, Blue Swirl > wrote: > >> >> > On 5/3/10, Igor Kovalenko wrote: > >> >> >> Hi! > >> >> >> > >> >> >> There is an issue with lazy conditional codes evaluation where > >> >> >> we return from trap handler with mismatching conditionals. > >> >> >> > >> >> >> I seldom reproduce it here when dragging qemu window while > >> >> >> machine is working through silo initialization. I use gentoo > minimal cd > >> >> >> install-sparc64-minimal-20100322.iso but I think anything with > silo boot > >> >> >> would experience the same. Once in a while it would report crc > error, > >> >> >> unable to open cd partition or it would fail to decompress image. > >> >> > > >> >> > I think I've also seen this. > >> >> > > >> >> >> Pattern that fails appears to require a sequence of compare insn > >> >> >> possibly followed by a few instructions which do not touch > conditionals, > >> >> >> then conditional branch insn. If it happens that we trap while > processing > >> >> >> conditional branch insn so it is restarted after return from > trap then > >> >> >> seldom conditional codes are calculated incorrectly. > >> >> >> > >> >> >> I cannot point to exact cause but it appears that after trap > return > >> >> >> we may have CC_OP and CC_SRC* mismatch somewhere, > >> >> >> since adding more cond evaluation flushes over the code helps. > >> >> >> > >> >> >> We already tried doing flush more frequently and it is still not > >> >> >> complete, so the question is how to finally do this once and > right :) > >> >> >> > >> >> >> Obviously I do not get the design of lazy evaluation right, but > >> >> >> the following list appears to be good start. Plan is to prepare > >> >> >> a change to qemu and find a way to test it. > >> >> >> > >> >> >> 1. Since SPARC* is a RISC CPU it seems to be not profitable to > >> >> >>use DisasContext->cc_op to predict if flags should be not > evaluated > >> >> >>due to overriding insn. Instead we can drop cc_op from > disassembler > >> >> >>context and simplify code to only use cc_op from env. > >> >> > > >> >> > Not currently, but in the future we may use that to do even lazier > >> >> > flags computation. For example the sequence 'cmp x, y; bne target' > >> >> > could be much more optimal by changing the branch to do the > >> >> > comparison. Here's an old unfinished patch to do some of this. > > > I wonder if it buys anything. Sparc RISC architecture means optimizing > compiler > would prevent any extra flags computation, right? So it is basically 1-to-1 > conditional computation and use. Or even worse, if we delay computation > until there are two or more consumers, correct? For x86 target, that is the other part of savings from using lazy condition computation. It's true that it will not benefit RISC targets much. But I think you are missing the other part, where the actual flags are not calculated but instead the original comparison can be used. For example, consider this Sparc64 code: IN: 0x01fff000be58: cmp %g2, 0x51 OUT: [size=82] 0x40df16b0: mov0x10(%r14),%rbp 0x40df16b4: mov%rbp,%rbx 0x40df16b7: sub$0x51,%rbx 0x40df16bb: mov%rbx,%r12 0x40df16be: mov%r12,0x10a60(%r14) 0x40df16c5: mov%rbp,0x60(%r14) 0x40df16c9: mov$0x51,%ebp 0x40df16ce: mov%rbp,0x68(%r14) 0x40df16d2: mov%rbx,0x70(%r14) 0x40df16d6: mov$0x7,%ebp 0x40df16db: mov%ebp,0x78(%r14) 0x40df16df: mov$0x1fff000be5c,%rbp 0x40df16e9: mov%rbp,0x48(%r14) 0x40df16ed: mov$0x1fff000be60,%rbp 0x40df16f7: mov%rbp,0x50(%r14) 0x40df16fb: xor%eax,%eax 0x40df16fd: jmpq 0xbfface -- IN: 0x01fff000be5c: bne 0x1fff000c268 OUT: [size=95] 0x40df1710: callq 0x518260 0x40df1715: mov0x98(%r14),%ebp 0x40df171c: mov%ebp,%ebx 0x40df171e: shr$0x16,%rbx 0x40df1722: and$0x1,%ebx 0x40df1725: xor$0x1,%rbx 0x40df1729: mov%rbx,0x90(%r14) 0x40df1730: mov$0x1fff000be60,%rbp 0x40df173a: mov%rbp,0x48(%r14) 0x40df173e: test %rbx,%rbx 0x40df1741: je 0x40df175a 0x40df1747: mov$0x1fff000c268,%rbp 0x40df1751: mov%rbp,0x50(%r14) 0x40df1755: jmpq 0x40df1768 0x40df175a: mov$0x1fff000be64,%rbp 0x40df1764: mov%rbp,0x50(%r14) 0x40df1768: xor%eax,%eax 0x40df176a: jmpq 0xbfface Instead of calculating any flags, we should generate for combined 'cmp/btest + branch' sequences something like: mov0x10(%r14),%rbp mov%rbp,%rbx cmp$0x51,%rbx je 0x40ac275a mov$0x1fff000c268,%rbp mov%rbp,0x50(%r14) jmpq 0x40ac2768 mov$0x1fff000be64,%rbp mov%rbp,0x50
[Qemu-devel] Re: Re: [RFC] [PATCH] add ahci support into qemu
Stuart Brady wrote: On Tue, May 04, 2010 at 10:51:37PM +0200, Sebastian Herbszt wrote: >diff --git a/hw/pci_ids.h b/hw/pci_ids.h >index fe7a121..4d4de93 100644 >--- a/hw/pci_ids.h >+++ b/hw/pci_ids.h >@@ -97,3 +97,4 @@ >#define PCI_DEVICE_ID_INTEL_82371AB 0x7111 >#define PCI_DEVICE_ID_INTEL_82371AB_20x7112 >#define PCI_DEVICE_ID_INTEL_82371AB_30x7113 >+#define PCI_DEVICE_ID_INTEL_ICH6R_AHCI0x2652 The list is sorted by vendor and device id. This entry should go after "PCI_DEVICE_ID_INTEL_ESB_9". The naming scheme seems to be VENDOR_DEVICE_FUNCTION, so i suggest something like PCI_DEVICE_ID_INTEL_ICH6R_2 or PCI_DEVICE_ID_INTEL_82801FR_2. Linux seems to have called this PCI_DEVICE_ID_INTEL_ICH6_4 at one point. So this should be function 4, no? Well, i am not sure if the last part is the function number. pci_ids.h from the linux kernel got this: #define PCI_DEVICE_ID_INTEL_ICH6_0 0x2640 #define PCI_DEVICE_ID_INTEL_ICH6_1 0x2641 #define PCI_DEVICE_ID_INTEL_ICH6_2 0x2642 #define PCI_DEVICE_ID_INTEL_ICH6_16 0x266a #define PCI_DEVICE_ID_INTEL_ICH6_17 0x266d #define PCI_DEVICE_ID_INTEL_ICH6_18 0x266e #define PCI_DEVICE_ID_INTEL_ICH6_19 0x266f According to the Intel spec the SATA controller is D31:F2. So we got device 31 and function 2. The IDE controller is D31:F1, but above it's ICH6_19. Sebastian Cheers, -- Stuart Brady
[Qemu-devel] Re: Qemu - samba share help
Arpit Patel wrote: > Hi All, > > I am trying to share files between host OS and guest OS, both are Ubuntu. > > Here is the command line that I am using to start guest OS > *qemu -kernel kernelimage -initrd initrd.img /dev/zero -append "cmdline" > -smb /tmp* > * > * > After which I tried to *mount* /tmp on guest OS, using IP address of host > OS, > something like *mount //10.80.8.100/tmp /tmp*, but fails saying "*No such > device"*. Try //10.0.2.4/qemu (a virtual server inside the virtual net that QEMU creates with its user-mode IP stack, 'qemu' is the default share name). > > Samba service is also running on Host OS. Actually not needed for -smb, it will start a separate smbd instance. But recent samba versions can cause troubles here (hard-coded paths prevent unprivileged smbd instances). In that case you need to fall back to the main host smb server. That would be reachable under 10.0.2.2 (default IP config), share names as you assigned. > > Any ideas or any other way to share files between Host OS and Guest OS. > > Thanks for Help. > Jan signature.asc Description: OpenPGP digital signature
[Qemu-devel] Re: vgabios + qemu: issues and plans.
Hi, Works for stdvga and qxl. vmware fails as it has the memory at pci region 1 (region 0 has ioports) and it seems vgabios isn't prepared to handle that ... Do you mean vgabios currently doesn't work with qemu and "-vga vmware" ? It works, but uses the bochs lfb at the magic address 0xe000. Killing the 0xe00 alias mapping and switching over to use the pci region instead requires some changes in pci_get_lfb_addr to handle the situation. Try sending patches to the mailing list and/or the vgabios home at http://savannah.nongnu.org/projects/vgabios . Ok, I'll try to send my patch series down that route. thanks, Gerd
Re: [Qemu-devel] [PATCH 1/3] cursor: add cursor functions.
On 5/6/10, Gerd Hoffmann wrote: > > > > > +static const char cursor_left_ptr_32[32*32] = { > > > +"" > > > +" X " > > > +" XX " > > > +" X.X" > > > +" X..X " > > > +" X...X " > > > +" XX " > > > +" X.X" > > > +" X..X " > > > +" X...X " > > > +" XX " > > > +" X.X" > > > +" X..X..X" > > > +" X.X X..X " > > > +" XX X..X " > > > +" XX..X " > > > +" X..X " > > > +" X..X " > > > +" X..X " > > > +"XX " > > > +"" > > > +}; > > > > > > > Is this format standard? > > > > Inspiried by xpm but simplified a bit as full xpm support just for a > built-in fallback cursor would have been overkill. The nice thing about xpm > is that you can use any text editor for editing icons. But for X bitmaps, you can use any real drawing program. The same would also apply to full xpm. But read on. > > How about using X bitmap format instead: > > $ cat /usr/include/X11/bitmaps/left_ptr > > > > > > Then there would be no need of parsing. > > > > Well. You still would have to convert it as qemu internal cursor format is > defined as 32bit depth, rgb with alpha channel. So it doesn't buy you that > much. There's still more wasted memory compared to binary format. I think the best would be to store only the 32 bit image and perform any conversions offline. This was the approach with for example keyboard tables, we don't convert them at startup but the original tables are retained as comments.
Re: [Qemu-devel] [PATCH 1/3] cursor: add cursor functions.
+static const char cursor_left_ptr_32[32*32] = { +"" +" X " +" XX " +" X.X" +" X..X " +" X...X " +" XX " +" X.X" +" X..X " +" X...X " +" XX " +" X.X" +" X..X..X" +" X.X X..X " +" XX X..X " +" XX..X " +" X..X " +" X..X " +" X..X " +"XX " +"" +}; Is this format standard? Inspiried by xpm but simplified a bit as full xpm support just for a built-in fallback cursor would have been overkill. The nice thing about xpm is that you can use any text editor for editing icons. How about using X bitmap format instead: $ cat /usr/include/X11/bitmaps/left_ptr Then there would be no need of parsing. Well. You still would have to convert it as qemu internal cursor format is defined as 32bit depth, rgb with alpha channel. So it doesn't buy you that much. cheers, Gerd
Re: [Qemu-devel] [PATCH] block: Fix protocol detection for Windows devices
On Thu, May 06, 2010 at 02:56:38PM +0200, Kevin Wolf wrote: > We can't assume the file protocol for Windows devices, they need the same > detection as other files for which an explicit protocol is not specified. Looks good, Reviewed-by: Christoph Hellwig
Re: [Qemu-devel] [PATCH] block: Fix bdrv_commit
On Thu, May 06, 2010 at 04:44:34PM +0200, Kevin Wolf wrote: > When reopening the image, don't guess the driver, but use the same driver as > was used before. This is important if the format=... option was used for that > image. > > Signed-off-by: Kevin Wolf Looks good, Reviewed-by: Christoph Hellwig
[Qemu-devel] [PATCH 1/2] parallels: use pread
Use pread instead of lseek + read in preparation of using the qemu block API. Signed-off-by: Christoph Hellwig Index: qemu-kevin/block/parallels.c === --- qemu-kevin.orig/block/parallels.c 2010-05-06 21:59:25.809255629 +0200 +++ qemu-kevin/block/parallels.c2010-05-06 22:01:11.343284544 +0200 @@ -83,7 +83,7 @@ static int parallels_open(BlockDriverSta s->fd = fd; -if (read(fd, &ph, sizeof(ph)) != sizeof(ph)) +if (pread(fd, &ph, sizeof(ph), 0) != sizeof(ph)) goto fail; if (memcmp(ph.magic, HEADER_MAGIC, 16) || @@ -93,14 +93,11 @@ static int parallels_open(BlockDriverSta bs->total_sectors = le32_to_cpu(ph.nb_sectors); -if (lseek(s->fd, 64, SEEK_SET) != 64) - goto fail; - s->tracks = le32_to_cpu(ph.tracks); s->catalog_size = le32_to_cpu(ph.catalog_entries); s->catalog_bitmap = qemu_malloc(s->catalog_size * 4); -if (read(s->fd, s->catalog_bitmap, s->catalog_size * 4) != +if (pread(s->fd, s->catalog_bitmap, s->catalog_size * 4, 64) != s->catalog_size * 4) goto fail; for (i = 0; i < s->catalog_size; i++) @@ -114,28 +111,18 @@ fail: return -1; } -static inline int seek_to_sector(BlockDriverState *bs, int64_t sector_num) +static int64_t seek_to_sector(BlockDriverState *bs, int64_t sector_num) { BDRVParallelsState *s = bs->opaque; uint32_t index, offset; -uint64_t position; index = sector_num / s->tracks; offset = sector_num % s->tracks; -// not allocated +/* not allocated */ if ((index > s->catalog_size) || (s->catalog_bitmap[index] == 0)) return -1; - -position = (uint64_t)(s->catalog_bitmap[index] + offset) * 512; - -//fprintf(stderr, "sector: %llx index=%x offset=%x pointer=%x position=%x\n", -// sector_num, index, offset, s->catalog_bitmap[index], position); - -if (lseek(s->fd, position, SEEK_SET) != position) - return -1; - -return 0; +return (uint64_t)(s->catalog_bitmap[index] + offset) * 512; } static int parallels_read(BlockDriverState *bs, int64_t sector_num, @@ -144,11 +131,13 @@ static int parallels_read(BlockDriverSta BDRVParallelsState *s = bs->opaque; while (nb_sectors > 0) { - if (!seek_to_sector(bs, sector_num)) { - if (read(s->fd, buf, 512) != 512) - return -1; - } else +int64_t position = seek_to_sector(bs, sector_num); +if (position >= 0) { +if (pread(s->fd, buf, 512, position) != 512) +return -1; +} else { memset(buf, 0, 512); +} nb_sectors--; sector_num++; buf += 512;
[Qemu-devel] [PATCH 2/2] parallels: use qemu block API
Use bdrv_pwrite to access the backing device instead of pread, and convert the driver to implementing the bdrv_open method which gives it an already opened BlockDriverState for the underlying device. Signed-off-by: Christoph Hellwig Index: qemu-kevin/block/parallels.c === --- qemu-kevin.orig/block/parallels.c 2010-05-03 13:04:37.494261748 +0200 +++ qemu-kevin/block/parallels.c2010-05-03 13:05:55.781255810 +0200 @@ -46,7 +46,6 @@ struct parallels_header { } __attribute__((packed)); typedef struct BDRVParallelsState { -int fd; uint32_t *catalog_bitmap; int catalog_size; @@ -68,22 +67,15 @@ static int parallels_probe(const uint8_t return 0; } -static int parallels_open(BlockDriverState *bs, const char *filename, int flags) +static int parallels_open(BlockDriverState *bs, int flags) { BDRVParallelsState *s = bs->opaque; -int fd, i; +int i; struct parallels_header ph; -fd = open(filename, O_RDONLY | O_BINARY | O_LARGEFILE); -if (fd < 0) { -return -1; -} - bs->read_only = 1; // no write support yet -s->fd = fd; - -if (pread(fd, &ph, sizeof(ph), 0) != sizeof(ph)) +if (bdrv_pread(bs->file, 0, &ph, sizeof(ph)) != sizeof(ph)) goto fail; if (memcmp(ph.magic, HEADER_MAGIC, 16) || @@ -97,7 +89,7 @@ static int parallels_open(BlockDriverSta s->catalog_size = le32_to_cpu(ph.catalog_entries); s->catalog_bitmap = qemu_malloc(s->catalog_size * 4); -if (pread(s->fd, s->catalog_bitmap, s->catalog_size * 4, 64) != +if (bdrv_pread(bs->file, 64, s->catalog_bitmap, s->catalog_size * 4) != s->catalog_size * 4) goto fail; for (i = 0; i < s->catalog_size; i++) @@ -107,7 +99,6 @@ static int parallels_open(BlockDriverSta fail: if (s->catalog_bitmap) qemu_free(s->catalog_bitmap); -close(fd); return -1; } @@ -128,12 +119,10 @@ static int64_t seek_to_sector(BlockDrive static int parallels_read(BlockDriverState *bs, int64_t sector_num, uint8_t *buf, int nb_sectors) { -BDRVParallelsState *s = bs->opaque; - while (nb_sectors > 0) { int64_t position = seek_to_sector(bs, sector_num); if (position >= 0) { -if (pread(s->fd, buf, 512, position) != 512) +if (bdrv_pread(bs->file, position, buf, 512) != 512) return -1; } else { memset(buf, 0, 512); @@ -149,14 +138,13 @@ static void parallels_close(BlockDriverS { BDRVParallelsState *s = bs->opaque; qemu_free(s->catalog_bitmap); -close(s->fd); } static BlockDriver bdrv_parallels = { .format_name = "parallels", .instance_size = sizeof(BDRVParallelsState), .bdrv_probe= parallels_probe, -.bdrv_file_open= parallels_open, +.bdrv_open = parallels_open, .bdrv_read = parallels_read, .bdrv_close= parallels_close, };
[Qemu-devel] [PATCH] Add missing 'static' attribute
Function usage() is only used locally. Signed-off-by: Stefan Weil --- tests/qruncom.c |2 +- tests/runcom.c |2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/tests/qruncom.c b/tests/qruncom.c index a8d0ef6..079f7a2 100644 --- a/tests/qruncom.c +++ b/tests/qruncom.c @@ -89,7 +89,7 @@ int errno; #define COM_BASE_ADDR0x10100 -void usage(void) +static void usage(void) { printf("qruncom version 0.1 (c) 2003 Fabrice Bellard\n" "usage: qruncom file.com\n" diff --git a/tests/runcom.c b/tests/runcom.c index cbbaf31..6380566 100644 --- a/tests/runcom.c +++ b/tests/runcom.c @@ -25,7 +25,7 @@ _syscall2(int, vm86, int, func, struct vm86plus_struct *, v86) #define COM_BASE_ADDR0x10100 -void usage(void) +static void usage(void) { printf("runcom version 0.1 (c) 2003 Fabrice Bellard\n" "usage: runcom file.com\n" -- 1.7.0
[Qemu-devel] [PATCH] mips-dis: Add missing static attributes
mips_abi_choices and mips_arch_choices are only used locally. Signed-off-by: Stefan Weil --- mips-dis.c |4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/mips-dis.c b/mips-dis.c index 56bffe5..4623a1c 100644 --- a/mips-dis.c +++ b/mips-dis.c @@ -3035,7 +3035,7 @@ struct mips_abi_choice const char * const *fpr_names; }; -struct mips_abi_choice mips_abi_choices[] = +static struct mips_abi_choice mips_abi_choices[] = { { "numeric", mips_gpr_names_numeric, mips_fpr_names_numeric }, { "32", mips_gpr_names_oldabi, mips_fpr_names_32 }, @@ -3086,7 +3086,7 @@ struct mips_arch_choice //~ #define ARRAY_SIZE(a) (sizeof(a) / sizeof(a[0])) -const struct mips_arch_choice mips_arch_choices[] = +static const struct mips_arch_choice mips_arch_choices[] = { { "numeric", 0, 0, 0, 0, mips_cp0_names_numeric, NULL, 0, mips_hwr_names_numeric }, -- 1.7.0
[Qemu-devel] [PATCH] darwin-user: Add missing static attribute
Function usage is only used locally, so add "static". Signed-off-by: Stefan Weil --- darwin-user/main.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/darwin-user/main.c b/darwin-user/main.c index ade7d48..175e12f 100644 --- a/darwin-user/main.c +++ b/darwin-user/main.c @@ -704,7 +704,7 @@ void cpu_loop(CPUX86State *env) } #endif -void usage(void) +static void usage(void) { printf("qemu-" TARGET_ARCH " version " QEMU_VERSION ", Copyright (c) 2003-2004 Fabrice Bellard\n" "usage: qemu-" TARGET_ARCH " [-h] [-d opts] [-L path] [-s size] program [arguments...]\n" -- 1.7.0
[Qemu-devel] Re: vgabios + qemu: issues and plans.
Gerd Hoffmann wrote: Hi, Works for stdvga and qxl. vmware fails as it has the memory at pci region 1 (region 0 has ioports) and it seems vgabios isn't prepared to handle that ... Do you mean vgabios currently doesn't work with qemu and "-vga vmware" ? It works, but uses the bochs lfb at the magic address 0xe000. Killing the 0xe00 alias mapping and switching over to use the pci region instead requires some changes in pci_get_lfb_addr to handle the situation. Does this only happen with VBE? Sebastian
Re: [Qemu-devel] [PATCH] sparc64: implement global translation table entries v1
Thanks, applied all. On 5/4/10, Igor V. Kovalenko wrote: > From: Igor V. Kovalenko > > - match global tte against any context > - show global tte in MMU dump > > v0->v1: added default case to switch statement in demap_tlb > - should fix gcc warning about uninitialized context variable > > Signed-off-by: Igor V. Kovalenko > --- > target-sparc/cpu.h | 18 > target-sparc/helper.c| 33 - > target-sparc/op_helper.c | 53 > ++ > 3 files changed, 76 insertions(+), 28 deletions(-) > > diff --git a/target-sparc/cpu.h b/target-sparc/cpu.h > index 0e7f390..b705728 100644 > --- a/target-sparc/cpu.h > +++ b/target-sparc/cpu.h > @@ -513,6 +513,24 @@ static inline void cpu_set_cwp(CPUSPARCState *env1, int > new_cwp) > /* sun4m.c, sun4u.c */ > void cpu_check_irqs(CPUSPARCState *env); > > +#if defined (TARGET_SPARC64) > + > +static inline int compare_masked(uint64_t x, uint64_t y, uint64_t mask) > +{ > +return (x & mask) == (y & mask); > +} > + > +#define MMU_CONTEXT_BITS 13 > +#define MMU_CONTEXT_MASK ((1 << MMU_CONTEXT_BITS) - 1) > + > +static inline int tlb_compare_context(const SparcTLBEntry *tlb, > + uint64_t context) > +{ > +return compare_masked(context, tlb->tag, MMU_CONTEXT_MASK); > +} > + > +#endif > + > static inline void PUT_PSR(CPUSPARCState *env1, target_ulong val) > { > env1->psr = val & PSR_ICC; > diff --git a/target-sparc/helper.c b/target-sparc/helper.c > index 1f0f7d4..4ece01b 100644 > --- a/target-sparc/helper.c > +++ b/target-sparc/helper.c > @@ -381,17 +381,11 @@ static inline target_phys_addr_t > ultrasparc_truncate_physical(uint64_t x) > * UltraSparc IIi I/DMMUs > */ > > -static inline int compare_masked(uint64_t x, uint64_t y, uint64_t mask) > -{ > -return (x & mask) == (y & mask); > -} > - > // Returns true if TTE tag is valid and matches virtual address value in > context > // requires virtual address mask value calculated from TTE entry size > static inline int ultrasparc_tag_match(SparcTLBEntry *tlb, > uint64_t address, uint64_t context, > - target_phys_addr_t *physical, > - int is_nucleus) > + target_phys_addr_t *physical) > { > uint64_t mask; > > @@ -413,8 +407,7 @@ static inline int ultrasparc_tag_match(SparcTLBEntry > *tlb, > > // valid, context match, virtual address match? > if (TTE_IS_VALID(tlb->tte) && > -((is_nucleus && compare_masked(0, tlb->tag, 0x1fff)) > - || TTE_IS_GLOBAL(tlb->tte) || compare_masked(context, tlb->tag, > 0x1fff)) > +(TTE_IS_GLOBAL(tlb->tte) || tlb_compare_context(tlb, context)) > && compare_masked(address, tlb->tag, mask)) > { > // decode physical address > @@ -431,7 +424,6 @@ static int get_physical_address_data(CPUState *env, > { > unsigned int i; > uint64_t context; > -int is_nucleus; > > if ((env->lsu & DMMU_E) == 0) { /* DMMU disabled */ > *physical = ultrasparc_truncate_physical(address); > @@ -439,14 +431,16 @@ static int get_physical_address_data(CPUState *env, > return 0; > } > > -context = env->dmmu.mmu_primary_context & 0x1fff; > -is_nucleus = env->tl > 0; > +if (env->tl == 0) { > +context = env->dmmu.mmu_primary_context & 0x1fff; > +} else { > +context = 0; > +} > > for (i = 0; i < 64; i++) { > // ctx match, vaddr match, valid? > if (ultrasparc_tag_match(&env->dtlb[i], > - address, context, physical, > - is_nucleus)) { > + address, context, physical)) { > // access ok? > if (((env->dtlb[i].tte & 0x4) && is_user) || > (!(env->dtlb[i].tte & 0x2) && (rw == 1))) { > @@ -492,7 +486,6 @@ static int get_physical_address_code(CPUState *env, > { > unsigned int i; > uint64_t context; > -int is_nucleus; > > if ((env->lsu & IMMU_E) == 0 || (env->pstate & PS_RED) != 0) { > /* IMMU disabled */ > @@ -501,14 +494,16 @@ static int get_physical_address_code(CPUState *env, > return 0; > } > > -context = env->dmmu.mmu_primary_context & 0x1fff; > -is_nucleus = env->tl > 0; > +if (env->tl == 0) { > +context = env->dmmu.mmu_primary_context & 0x1fff; > +} else { > +context = 0; > +} > > for (i = 0; i < 64; i++) { > // ctx match, vaddr match, valid? > if (ultrasparc_tag_match(&env->itlb[i], > - address, context, physical, > - is_nucleus)) { > + address, context, physical))
[Qemu-devel] Re: vgabios + qemu: issues and plans.
On 05/06/10 22:19, Sebastian Herbszt wrote: Gerd Hoffmann wrote: Hi, Works for stdvga and qxl. vmware fails as it has the memory at pci region 1 (region 0 has ioports) and it seems vgabios isn't prepared to handle that ... Do you mean vgabios currently doesn't work with qemu and "-vga vmware" ? It works, but uses the bochs lfb at the magic address 0xe000. Killing the 0xe00 alias mapping and switching over to use the pci region instead requires some changes in pci_get_lfb_addr to handle the situation. Does this only happen with VBE? Yes. cheers, Gerd
Re: [Qemu-devel] Re: [PATCH] Remove IO_MEM_SUBWIDTH.
2010/4/28 Artyom Tarasenko : > 2010/4/27 Richard Henderson : >> On 04/26/2010 02:54 PM, Artyom Tarasenko wrote: >>> This patch introduces a regression. qemu crashes on lance test: >> >> I'm not sure how to get to this, since the sparc-test images don't >> include ifconfig, and I havn't been able to find a sparc install >> image that works (doesn't support sparc32 or sparc64 fails to load). >> >> That said, try this and see if it works. >> >> >> r~ >> >> --- >> diff --git a/exec.c b/exec.c >> index 14d1fd7..572d3fd 100644 >> --- a/exec.c >> +++ b/exec.c >> @@ -3286,6 +3286,8 @@ static int cpu_register_io_memory_fixed(int io_index, >> CPUWriteMemoryFunc * const >> *mem_write, >> void *opaque) >> { >> + int i; >> + >> if (io_index <= 0) { >> io_index = get_free_io_mem_idx(); >> if (io_index == -1) >> @@ -3296,8 +3298,14 @@ static int cpu_register_io_memory_fixed(int io_index, >> return -1; >> } >> >> - memcpy(io_mem_read[io_index], mem_read, 3 * sizeof(CPUReadMemoryFunc*)); >> - memcpy(io_mem_write[io_index], mem_write, 3 * >> sizeof(CPUWriteMemoryFunc*)); >> + for (i = 0; i < 3; ++i) { >> + io_mem_read[io_index][i] >> + = (mem_read[i] ? mem_read[i] : unassigned_mem_read[i]); >> + } >> + for (i = 0; i < 3; ++i) { >> + io_mem_write[io_index][i] >> + = (mem_write[i] ? mem_write[i] : unassigned_mem_write[i]); >> + } >> io_mem_opaque[io_index] = opaque; >> >> return (io_index << IO_MEM_SHIFT); >> Why the fix didn't make it into the git? Does it introduce other problems? > Looks good, thanks. > > Acked-by: Artyom Tarasenko > -- Regards, Artyom Tarasenko solaris/sparc under qemu blog: http://tyom.blogspot.com/
[Qemu-devel] Re: Endless loop in qcow2_alloc_cluster_offset
On Thu, Nov 19, 2009 at 01:19:55PM +0100, Jan Kiszka wrote: > Hi, > > I just managed to push a qemu-kvm process (git rev. b496fe3431) into an > endless loop in qcow2_alloc_cluster_offset, namely over > QLIST_FOREACH(old_alloc, &s->cluster_allocs, next_in_flight): > > (gdb) bt > #0 0x0048614b in qcow2_alloc_cluster_offset (bs=0xc4e1d0, > offset=7417184256, n_start=0, n_end=16, num=0xcb351c, m=0xcb3568) at > /data/qemu-kvm/block/qcow2-cluster.c:750 > #1 0x004828d0 in qcow_aio_write_cb (opaque=0xcb34d0, ret=0) at > /data/qemu-kvm/block/qcow2.c:587 > #2 0x00482a44 in qcow_aio_writev (bs=, > sector_num=, qiov=, > nb_sectors=, cb=, opaque= optimized out>) at /data/qemu-kvm/block/qcow2.c:645 > #3 0x00470e89 in bdrv_aio_writev (bs=0xc4e1d0, sector_num=2, > qiov=0x7f48a9010ed0, nb_sectors=16, cb=0x470d20 , > opaque=0x7f48a9010f0c) at /data/qemu-kvm/block.c:1362 > #4 0x00472991 in bdrv_write_em (bs=0xc4e1d0, sector_num=14486688, > buf=0xd67200 "H\a", nb_sectors=16) at /data/qemu-kvm/block.c:1736 > #5 0x00435581 in ide_sector_write (s=0xc92650) at > /data/qemu-kvm/hw/ide/core.c:622 > #6 0x00425fc2 in kvm_handle_io (env=) at > /data/qemu-kvm/kvm-all.c:553 > #7 kvm_run (env=) at /data/qemu-kvm/qemu-kvm.c:964 > #8 0x00426049 in kvm_cpu_exec (env=0x1000) at > /data/qemu-kvm/qemu-kvm.c:1651 > #9 0x0042627d in kvm_main_loop_cpu (_env=) at > /data/qemu-kvm/qemu-kvm.c:1893 > #10 ap_main_loop (_env=) at > /data/qemu-kvm/qemu-kvm.c:1943 > #11 0x7f48ae89d070 in start_thread () from /lib64/libpthread.so.0 > #12 0x7f48abf0711d in clone () from /lib64/libc.so.6 > #13 0x in ?? () > (gdb) print ((BDRVQcowState *)bs->opaque)->cluster_allocs.lh_first > $5 = (struct QCowL2Meta *) 0xcb3568 > (gdb) print *((BDRVQcowState *)bs->opaque)->cluster_allocs.lh_first > $6 = {offset = 7417176064, n_start = 0, nb_available = 16, nb_clusters = 0, > depends_on = 0xcb3568, dependent_requests = {lh_first = 0x0}, next_in_flight > = {le_next = 0xcb3568, le_prev = 0xc4ebd8}} > > So next == first. > Seen the exact same bug twice in a row while installing FC12 with IDE disk, current qemu-kvm.git. qemu-system-x86_64 -drive file=/root/images/fc12-ide.img,cache=writeback \ -m 1000 -vnc :1 \ -net nic,model=virtio \ -net tap,script=/root/ifup.sh -serial stdio \ -cdrom /root/iso/linux/Fedora-12-x86_64-DVD.iso -monitor telnet::4445,server,nowait -usbdevice tablet Can't reproduce though.
Re: [Qemu-devel] [PATCH RFC] virtio: put last seen used index into ring itself
On Thu, 6 May 2010 07:30:00 pm Avi Kivity wrote: > On 05/05/2010 11:58 PM, Michael S. Tsirkin wrote: > > + /* We publish the last-seen used index at the end of the available ring. > > +* It is at the end for backwards compatibility. */ > > + vr->last_used_idx =&(vr)->avail->ring[num]; > > + /* Verify that last used index does not spill over the used ring. */ > > + BUG_ON((void *)vr->last_used_idx + > > + sizeof *vr->last_used_idx> (void *)vr->used); > > } > > > > Shouldn't this be on its own cache line? It's next to the available ring; because that's where the guest publishes its data. That whole page is guest-write, host-read. Putting it on a cacheline by itself would be a slight pessimization; the host cpu would have to get the last_used_idx cacheline and the avail descriptor cacheline every time. This way, they are sometimes the same cacheline. Hope that clarifies, Rusty.
[Qemu-devel] Re: [PATCH RFC] virtio: put last seen used index into ring itself
On Thu, 6 May 2010 03:49:46 pm Michael S. Tsirkin wrote: > Now, I also added an mb() in guest between read and write so > that last used index write can not get ahead of used index read. > It does feel good to have it there, but I can not say why > it's helpful. Works fine without it, but then these > subtle races might be hard to trigger. What do you think? I couldn't see that in the patch? I don't think it's necessary though, since the write of depends last_used depends on the read of used (and no platform we care about would reorder such a thing). I'm reasonably happy, but we should write some convenient test for missing interrupts. I'm thinking of a sender which does a loop: blasts 1MB of UDP packets, then prints the time and sleep(1). The receiver would print the time every 1MB of received data. The two times should almost exactly correspond. Assuming that the network doesn't overflow and lose stuff, this should identify any missing wakeup/interrupts (depending on direction used). Cheers, Rusty.