Re: [Qemu-devel] [PATCH] qemu_timedate_diff() shouldn't modify its argument.
On Mon, Nov 07, 2011 at 10:16:29PM +0200, Ronen Hod wrote: > On 11/06/2011 06:00 PM, Gleb Natapov wrote: > >The caller of qemu_timedate_diff() does not expect that tm it passes to > >the function will be modified, but mktime() is destructive and modifies > >its argument. Pass a copy of tm to it and set tm_isdst so that mktime() > >will not rely on it since its value may be outdated. > > I believe that the original issue was not related to outdated data > at the moment of the daylight saving time transition. Don't know what do you mean here. The problem is definitely related to incorrect data about daylight saving time. > using tmp.tm_isdst = -1 sounds good, but why use a copy of tm? The Hmm, what is not clear about "qemu_timedate_diff() shouldn't modify its argument"? :) > only significant field that will change in the tm is the tm_isdst > itself that will be set to 0/1 (correctly). This is incorrect (read man), but event if it would breed a new kind of especially handsome ponies, if this is what caller wants it should call mktime() by itself and not be a side effect of some random function call. > > Acked-by: Ronen Hod > > >Signed-off-by: Gleb Natapov > >diff --git a/vl.c b/vl.c > >index 624da0f..641629b 100644 > >--- a/vl.c > >+++ b/vl.c > >@@ -460,8 +460,11 @@ int qemu_timedate_diff(struct tm *tm) > > if (rtc_date_offset == -1) > > if (rtc_utc) > > seconds = mktimegm(tm); > >-else > >-seconds = mktime(tm); > >+else { > >+struct tm tmp = *tm; > >+tmp.tm_isdst = -1; /* use timezone to figure it out */ > >+seconds = mktime(&tmp); > >+} > > else > > seconds = mktimegm(tm) + rtc_date_offset; > > > >-- > > Gleb. > > -- Gleb.
Re: [Qemu-devel] how to delete a savevm in image?
You can try like this: (qemu) info snapshots IDTAG VM SIZEDATE VM CLOCK 1 vm-20111025134936 202M 2011-10-25 13:49:36 00:02:19.524 (qemu) savevm save_name_1 (qemu) info snapshots IDTAG VM SIZEDATE VM CLOCK 1 vm-20111025134936 202M 2011-10-25 13:49:36 00:02:19.524 2 save_name_1202M 2011-10-25 13:51:53 00:04:28.395 (qemu) loadvm vm-20111025134936 (qemu) delvm save_name_1 (qemu) info snapshots IDTAG VM SIZEDATE VM CLOCK 1 vm-20111025134936 202M 2011-10-25 13:49:36 00:02:19.524 (qemu) So if you delete, you can use delvm. 2011/11/8 Jun Koi : > hi, > > during the run of my VM, i used "savevm" to save the VM status down. > now how can i delete that? > > i checked qemu-img, and it doesnt seem to have any option to delete > the savevm in the VM image. > > thanks, > Jun > > -- Regards Robert Wang
[Qemu-devel] how to delete a savevm in image?
hi, during the run of my VM, i used "savevm" to save the VM status down. now how can i delete that? i checked qemu-img, and it doesnt seem to have any option to delete the savevm in the VM image. thanks, Jun
Re: [Qemu-devel] [PATCH v4 0/4] -net bridge: rootless bridge support for qemu
Why do you not develop one helper to set up bridge env for qemu guests when the host have no bridge interface? On Wed, Nov 2, 2011 at 1:13 AM, Corey Bryant wrote: > With qemu it is possible to run a guest from an unprivileged user but if > we wanted to communicate with the outside world we had to switch > to root. > > We address this problem by introducing a new network backend and a new > network option for -net tap. This is less flexible when compared to > existing -net tap options because it relies on a helper with elevated > privileges to do the heavy lifting of allocating and attaching a tap > device to a bridge. We use a special purpose helper because we don't > want to elevate the privileges of more generic tools like brctl. > > Qemu can be run with the default network helper as follows (in these cases > attaching the tap device to the default br0 bridge): > > qemu -hda linux.img -net bridge -net nic > or: > qemu -hda linux.img -net tap,helper=/usr/local/libexec/qemu-bridge-helper > -net nic > > The default helper uses it's own ACL mechanism for access control, but > future network helpers could be developed, for example, to support PolicyKit > for access control. > > More details are included in individual patches. The helper is broken into > a series of patches to improve reviewabilty. > > v2: > - Updated signed-off-by's > - Updated author's email > - Set default bridge to br0 > - Added -net bridge > - Updated ACL example > - Moved from libcap to libcap-ng > - Fail helper when libcap-ng not configured > > v3: > - Use simple queue to store ACLs > - Added goto cleanup to helper's main > - Allow helper execution if libcap-ng not configured > - Completed static analysis and memory analysis on helper > > v4: > - Update has_vnet_hdr() to return bool > - Update helper's main() to prevent errno clobbering > - Let Kernel cleanup helper's file descriptors > > Corey Bryant (4): > Add basic version of bridge helper > Add access control support to qemu bridge helper > Add cap reduction support to enable use as SUID > Add support for net bridge > > Makefile | 12 ++- > configure | 37 + > net.c | 29 - > net.h | 3 + > net/tap.c | 190 ++- > net/tap.h | 3 + > qemu-bridge-helper.c | 407 > ++ > qemu-options.hx | 73 -- > 8 files changed, 731 insertions(+), 23 deletions(-) > create mode 100644 qemu-bridge-helper.c > > -- > 1.7.3.4 > > > -- Regards, Zhi Yong Wu
Re: [Qemu-devel] [RFC 4/6] block: request overlap detection
On Mon, Nov 7, 2011 at 10:37 PM, Stefan Hajnoczi wrote: > On Mon, Nov 7, 2011 at 11:49 AM, Zhi Yong Wu wrote: >> On Mon, Oct 17, 2011 at 11:47 PM, Stefan Hajnoczi >> wrote: >>> Detect overlapping requests and remember to align to cluster boundaries >>> if the image format uses them. This assumes that allocating I/O is >>> performed in cluster granularity - which is true for qcow2, qed, etc. >>> >>> Signed-off-by: Stefan Hajnoczi >>> --- >>> block.c | 39 +-- >>> 1 files changed, 37 insertions(+), 2 deletions(-) >>> >>> diff --git a/block.c b/block.c >>> index cc3202c..0c22741 100644 >>> --- a/block.c >>> +++ b/block.c >>> @@ -1052,21 +1052,56 @@ static BdrvTrackedRequest >>> *tracked_request_add(BlockDriverState *bs, >>> return req; >>> } >>> >>> +/** >>> + * Round a region to cluster boundaries >>> + */ >>> +static void round_to_clusters(BlockDriverState *bs, >>> + int64_t sector_num, int nb_sectors, >>> + int64_t *cluster_sector_num, >>> + int *cluster_nb_sectors) >>> +{ >>> + BlockDriverInfo bdi; >>> + >>> + if (bdrv_get_info(bs, &bdi) < 0 || bdi.cluster_size == 0) { >>> + *cluster_sector_num = sector_num; >>> + *cluster_nb_sectors = nb_sectors; >>> + } else { >>> + int64_t c = bdi.cluster_size / BDRV_SECTOR_SIZE; >>> + *cluster_sector_num = (sector_num / c) * c; >> I can understand the above formula, but the one below is >> very magic. :) and can not be understood by me. >>> + *cluster_nb_sectors = ((sector_num % c) + nb_sectors + c - 1) / c >>> * c; > > I agree this is ugly. Here is what is going on: > > c = number of sectors per cluster > cluster_sector_num = sector number rounded *down* to cluster boundary > cluster_nb_sectors = number of sectors from cluster_sector_num to > rounded up sector_num+nb_sectors > > So the magic expression is takes the original sector_num to > sector_num+nb_sectors region: > > |---XXX|XXX---| > > Where |-| is a cluster and is the region from sector_num to > sector_num+nb_sectors, then the output should be: > > |RR|RR| > > Everything has been rounded to clusters. So here is the expression broken > down: > > *cluster_nb_sectors = ((sector_num % c) + nb_sectors + c - 1) / c * c; > AA XX BB > > |AAAXXX|XXXBBB| > > A is actually equivalent to sector_num - cluster_sector_num. > > X is the original unrounded region. > > B is the rounding up to the next cluster bounary. > > Another way of writing this: > > *cluster_nb_sectors = ROUND_UP((sector_num - cluster_sector_num) + > nb_sectors, c); Above expression seems to not be correct; It should be *cluster_nb_sectors = ROUND_UP((sector_num - cluster_sector_num) + nb_sectors, c) * c; *cluster_nb_sectors = ((sector_num % c) + nb_sectors + c - 1) / c * c; #define ROUND_UP(x,y) (((x)+(y)-1)/(y)) I seems to understand your magic expression. thanks. > > I'll try to improve the code in the next revision. > > Stefan > -- Regards, Zhi Yong Wu
Re: [Qemu-devel] [PATCH 4/5] integratorcp: convert integratorcm to VMState
On 11/08/2011 04:07 AM, Peter Maydell wrote: > 2011/10/26 Peter Maydell : > > On 25 October 2011 12:09, Benoît Canet wrote: > >> +static const VMStateDescription vmstate_integratorcm = { > >> +.name = "integratorcm", > >> +.version_id = 1, > >> +.minimum_version_id = 1, > >> +.minimum_version_id_old = 1, > >> +.fields = (VMStateField[]) { > >> +VMSTATE_UINT32(memsz, integratorcm_state), > >> +VMSTATE_BOOL(flash_mapped, integratorcm_state), > > > > This raises a question. flash_mapped here is a flag which just > > tracks whether the associated MemoryRegion is currently mapped > > or unmapped. Do we need to do anything special to ensure that > > the MemoryRegion itself is reset to the correct mapped/unmapped > > state on restore? > > > > I recall discussing this kind of thing with Avi on IRC but I > > can't remember what the conclusion was. > > Avi, ping? I'm still interested in the answer to this question. Sorry, missed this. Yes, you need to ensure the memory region mapping reflects flash_mapped. btw, is flash_mapped a real device state (corresponds to a real memory bit on the device) or just an artefact? It's usually a bad idea to cast implementation artefacts in vmstate concrete. -- I have a truly marvellous patch that fixes the bug which this signature is too narrow to contain.
Re: [Qemu-devel] Secure KVM
On Mon, 07 Nov 2011 12:03:38 -0600, Anthony Liguori wrote: > So the sandbox loop would look like: > > void main() { >setup_devices(); > >read_from_event_channel(main_channel); >for i in vrings: > check_vring_notification(i); > } lguest uses a model where you attach an eventfd to a given virtqueue. (If you don't have an eventfd registered for a vq, the main process returns from the read() of /dev/lguest with the info). At the moment we use a process per virtqueue, but you could attach the same eventfd to multiple vqs. Since you can't select() inside seccomp, the main process could write to the eventfd to wake up the thread to respond to IPC. Here's the net output code: /* * The Network * * Handling output for network is also simple: we get all the output buffers * and write them to /dev/net/tun. */ struct net_info { int tunfd; }; static void net_output(struct virtqueue *vq) { struct net_info *net_info = vq->dev->priv; unsigned int head, out, in; struct iovec iov[vq->vring.num]; /* We usually wait in here for the Guest to give us a packet. */ head = wait_for_vq_desc(vq, iov, &out, &in); if (in) errx(1, "Input buffers in net output queue?"); /* * Send the whole thing through to /dev/net/tun. It expects the exact * same format: what a coincidence! */ if (writev(net_info->tunfd, iov, out) < 0) warnx("Write to tun failed (%d)?", errno); /* * Done with that one; wait_for_vq_desc() will send the interrupt if * all packets are processed. */ add_used(vq, head, 0); } Here's the input thread: /* * Handling network input is a bit trickier, because I've tried to optimize it. * * First we have a helper routine which tells is if from this file descriptor * (ie. the /dev/net/tun device) will block: */ static bool will_block(int fd) { fd_set fdset; struct timeval zero = { 0, 0 }; FD_ZERO(&fdset); FD_SET(fd, &fdset); return select(fd+1, &fdset, NULL, NULL, &zero) != 1; } /* * This handles packets coming in from the tun device to our Guest. Like all * service routines, it gets called again as soon as it returns, so you don't * see a while(1) loop here. */ static void net_input(struct virtqueue *vq) { int len; unsigned int head, out, in; struct iovec iov[vq->vring.num]; struct net_info *net_info = vq->dev->priv; /* * Get a descriptor to write an incoming packet into. This will also * send an interrupt if they're out of descriptors. */ head = wait_for_vq_desc(vq, iov, &out, &in); if (out) errx(1, "Output buffers in net input queue?"); /* * If it looks like we'll block reading from the tun device, send them * an interrupt. */ if (vq->pending_used && will_block(net_info->tunfd)) trigger_irq(vq); /* * Read in the packet. This is where we normally wait (when there's no * incoming network traffic). */ len = readv(net_info->tunfd, iov, in); if (len <= 0) warn("Failed to read from tun (%d).", errno); /* * Mark that packet buffer as used, but don't interrupt here. We want * to wait until we've done as much work as we can. */ add_used(vq, head, len); }
[Qemu-devel] [PATCH v3] block: Use bdrv functions to replace file operation in cow.c
Since common file operation functions lack of error detection, so change them to bdrv series functions. v3: correct some errors v2: Only contains the function modified. v1: Fix coding style and convert file operation functions to bdrv functions. Signed-off-by: Li Zhi Hui --- block/cow.c | 34 -- 1 files changed, 16 insertions(+), 18 deletions(-) diff --git a/block/cow.c b/block/cow.c index 707c0aa..71d7b11 100644 --- a/block/cow.c +++ b/block/cow.c @@ -243,12 +243,12 @@ static void cow_close(BlockDriverState *bs) static int cow_create(const char *filename, QEMUOptionParameter *options) { -int fd, cow_fd; struct cow_header_v2 cow_header; struct stat st; int64_t image_sectors = 0; const char *image_filename = NULL; int ret; +BlockDriverState *cow_bs; /* Read out options */ while (options && options->name) { @@ -260,10 +260,16 @@ static int cow_create(const char *filename, QEMUOptionParameter *options) options++; } -cow_fd = open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY, - 0644); -if (cow_fd < 0) -return -errno; +ret = bdrv_create_file(filename, options); +if (ret < 0) { +return ret; +} + +ret = bdrv_file_open(&cow_bs, filename, BDRV_O_RDWR); +if (ret < 0) { +return ret; +} + memset(&cow_header, 0, sizeof(cow_header)); cow_header.magic = cpu_to_be32(COW_MAGIC); cow_header.version = cpu_to_be32(COW_VERSION); @@ -271,16 +277,9 @@ static int cow_create(const char *filename, QEMUOptionParameter *options) /* Note: if no file, we put a dummy mtime */ cow_header.mtime = cpu_to_be32(0); -fd = open(image_filename, O_RDONLY | O_BINARY); -if (fd < 0) { -close(cow_fd); -goto mtime_fail; -} -if (fstat(fd, &st) != 0) { -close(fd); +if (stat(image_filename, &st) != 0) { goto mtime_fail; } -close(fd); cow_header.mtime = cpu_to_be32(st.st_mtime); mtime_fail: pstrcpy(cow_header.backing_file, sizeof(cow_header.backing_file), @@ -288,21 +287,20 @@ static int cow_create(const char *filename, QEMUOptionParameter *options) } cow_header.sectorsize = cpu_to_be32(512); cow_header.size = cpu_to_be64(image_sectors * 512); -ret = qemu_write_full(cow_fd, &cow_header, sizeof(cow_header)); +ret = bdrv_pwrite(cow_bs, 0, &cow_header, sizeof(cow_header)); if (ret != sizeof(cow_header)) { -ret = -errno; goto exit; } /* resize to include at least all the bitmap */ -ret = ftruncate(cow_fd, sizeof(cow_header) + ((image_sectors + 7) >> 3)); +ret = bdrv_truncate(cow_bs, +sizeof(cow_header) + ((image_sectors + 7) >> 3)); if (ret) { -ret = -errno; goto exit; } exit: -close(cow_fd); +bdrv_delete(cow_bs); return ret; } -- 1.7.4.1
Re: [Qemu-devel] [RFC 1/6] block: add request tracking
On Mon, Nov 7, 2011 at 7:41 PM, Stefan Hajnoczi wrote: > On Mon, Nov 7, 2011 at 11:00 AM, Zhi Yong Wu wrote: >> On Mon, Oct 17, 2011 at 11:47 PM, Stefan Hajnoczi >> wrote: >>> +/** >>> + * Enable tracking of incoming requests >>> + * >>> + * Request tracking can be safely used by multiple users at the same time, >>> + * there is an internal reference count to match start and stop calls. >>> + */ >>> +void bdrv_start_request_tracking(BlockDriverState *bs) >>> +{ >>> + bs->request_tracking++; >>> +} >>> + >>> +/** >>> + * Disable tracking of incoming requests >>> + * >>> + * Note that in-flight requests are still tracked, this function only stops >>> + * tracking incoming requests. >>> + */ >>> +void bdrv_stop_request_tracking(BlockDriverState *bs) >>> +{ >>> + bs->request_tracking--; >>> +} >> I don't understand what the real intention for the above functions is. >> IMHO, why can we not drop them? > > I have dropped them after removing the g_malloc() as Kevin suggested. > The idea was to avoid the overhead of request tracking when no feature > is using request tracking. Great > > Stefan > -- Regards, Zhi Yong Wu
[Qemu-devel] [PATCH] 9pfs: Stat response from server offset by 2 bytes
From: Timothy Rule The 9P spec states that for the stat message the "stat[n]" structure shall be encoded at offset 7 in the 9P message (see §13.9 message Rstat). The existing code is encoding a 2 byte value (hard coded 0 value) at offset 7 of the 9P message, and then follows with the "stat[n]" structure at offset 9 of the 9P message. This patch removes the encoding of the 2 byte value which has the effect of moving the "stat[n]" structure from offset 9 to offset 7 in the 9P message Rstat. Signed-off-by: Timothy Rule Signed-off-by: David Gibson --- hw/9pfs/virtio-9p.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/hw/9pfs/virtio-9p.c b/hw/9pfs/virtio-9p.c index 01cf337..35d8851 100644 --- a/hw/9pfs/virtio-9p.c +++ b/hw/9pfs/virtio-9p.c @@ -1279,7 +1279,7 @@ static void v9fs_stat(void *opaque) if (err < 0) { goto out; } -offset += pdu_marshal(pdu, offset, "wS", 0, &v9stat); +offset += pdu_marshal(pdu, offset, "S", &v9stat); err = offset; trace_v9fs_stat_return(pdu->tag, pdu->id, v9stat.mode, v9stat.atime, v9stat.mtime, v9stat.length); -- 1.7.7.1
[Qemu-devel] [PATCH] pseries: Correct RAM size check for SLOF
The SLOF firmware used on the pseries machine needs a reasonable amount of (guest) RAM in order to run, so we have a check in the machine init function to check that this is available. However, SLOF runs in real mode (MMU off) which means it can only actually access the RMA (Real Mode Area), not all of RAM. In many cases the RMA is the same as all RAM, but when running with Book3S HV KVM on PowerPC 970, the RMA must be especially allocated to be (host) physically contiguous. In this case, the RMA size is determined by what the host admin allocated at boot time, and will usually be less than the whole guest RAM size. This patch corrects the test to see if SLOF has enough memory for this case. In addition, more recent versions of SLOF that were committed earlier don't need quite as much memory as earlier versions. Therefore, this patch also reduces the amount of RAM we require to run SLOF. Signed-off-by: David Gibson --- hw/spapr.c |6 +++--- 1 files changed, 3 insertions(+), 3 deletions(-) diff --git a/hw/spapr.c b/hw/spapr.c index 40cfc9b..e826e0b 100644 --- a/hw/spapr.c +++ b/hw/spapr.c @@ -57,7 +57,7 @@ #define FW_MAX_SIZE 0x40 #define FW_FILE_NAME"slof.bin" -#define MIN_RAM_SLOF 512UL +#define MIN_RMA_SLOF 128UL #define TIMEBASE_FREQ 51200ULL @@ -562,9 +562,9 @@ static void ppc_spapr_init(ram_addr_t ram_size, spapr->entry_point = KERNEL_LOAD_ADDR; } else { -if (ram_size < (MIN_RAM_SLOF << 20)) { +if (rma_size < (MIN_RMA_SLOF << 20)) { fprintf(stderr, "qemu: pSeries SLOF firmware requires >= " -"%ldM guest RAM\n", MIN_RAM_SLOF); +"%ldM guest RAM\n", MIN_RMA_SLOF); exit(1); } filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, FW_FILE_NAME); -- 1.7.7.1
Re: [Qemu-devel] [PATCH] KVM: Add wrapper script around QEMU to test kernels
On Mon, 7 Nov 2011, Ingo Molnar wrote: > I think we needed to do only one revert along the way in the past two > years, to fix an unintended ABI breakage in PowerTop. Considering the > total complexity of the perf ABI our compatibility track record is > *very* good. There have been more breakages, as you know. It's just they weren't caught in time so they were declared to be grandfathered in rather than fixed. > Pekka, Vince has meanwhile become the resident perf critic on lkml, > always in it when it comes to some perf-bashing: For what it's worth you'll find commits from me in the qemu tree, and I also oppose the merge of kvm-tool into the Linux tree. > ... and you have argued against perf from the very first day on, when > you were one of the perfmon developers - and IMO in hindsight you've > been repeatedly wrong about most of your design arguments. I can't find an exact e-mail, but I seem to recall my arguments were that Pentium 4 support would be hard (it was), that in-kernel generalized events were a bad idea (I still think that, try talking to the ARM guys sometime about that) and that making access to raw events hard (by not using a naming library) was silly. I'm sure I probably said other things that were eventually addressed. > The PAPI project has the (fundamental) problem that you are still > doing it in the old-style sw design fashion, with many months long > delays in testing, and then you are blaming the problems you > inevitably meet with that model on *us*. The fundamental problem with the PAPI project is that we only have 3 full-time developers, and we have to make sure PAPI runs on about 10 different platforms, of which perf_events/Linux is only one. Time I waste tracking down perf_event ABI regressions and DoS bugs takes away from actual useful userspace PAPI development. > There was one PAPI incident i remember where it took you several > *months* to report a regression in a regular PAPI test-case (no > actual app affected as far as i know). No other tester ever ran the > PAPI testcases so nobody else reported it. We have a huge userbase. They run on some pretty amazing machines and do some tests that strain perf libraries to the limit. They also tend to use distro kernels, assuming they even have moved to 2.6.31+ kernels yet. When these power users report problems, they aren't going to be against the -tip tree. > Nobody but you tests PAPI so you need to become *part* of the > upstream development process, which releases a new upstream kernel > every 3 months. PAPI is a free software project, with the devel tree available from CVS. It takes maybe 15 minutes to run the full PAPI regression suite. I encourage you or any perf developer to try it and report any issues. I can only be so comprehensive. I didn't find the current NMI-watchdog regression right away because my git tree builds didn't have it enabled. It wasn't until there started being 3.0 distro kernels that people started reporting the problem to us. > Also, as i mentioned it several times before, you are free to add an > arbitrary number of ABI test-cases to 'perf test' and we can promise > that we run that. Right now it consists of a few tests: as mentioned before I have my own perf_event test suite with 20+ tests. http://web.eecs.utk.edu/~vweaver1/projects/perf-events/validation.html I do run it often. It tends to be reactionary though, as I can only add a test for a bug once I know about it. I also have more up-to date perf documentation than the kernel does: http://web.eecs.utk.edu/~vweaver1/projects/perf-events/programming.html and a cpu compatability matrix: http://web.eecs.utk.edu/~vweaver1/projects/perf-events/support.html I didn't really want to turn this into yet another perf flamewar. I just didn't want the implication that perf being in kernel is all rainbows and unicorns to go unchallenged. Vince
Re: [Qemu-devel] Accessing a linux guest's data structures
> > Hi > > I am a beginner qemu developer. > > I am running a linux guest inside qemu and I need to determine what process, > > thread is currently running in the guest. > > How should I do this? Any suggestions? Or can anyone point me to the > > relevant areas in qemu's source. ^^^ I guess he want to know how to know which process is running in the guest OS from QEMU's perspective. Regards, chenwj -- Wei-Ren Chen (陳韋任) Computer Systems Lab, Institute of Information Science, Academia Sinica, Taiwan (R.O.C.) Tel:886-2-2788-3799 #1667
[Qemu-devel] [PATCH v12 4/5] hmp/qmp: add block_set_io_throttle
Signed-off-by: Zhi Yong Wu Signed-off-by: Stefan Hajnoczi --- block.c | 15 + blockdev.c | 59 ++ blockdev.h |2 + hmp-commands.hx | 15 + hmp.c| 10 + qapi-schema.json | 16 +- qerror.c |4 +++ qerror.h |3 ++ qmp-commands.hx | 53 +++- 9 files changed, 175 insertions(+), 2 deletions(-) diff --git a/block.c b/block.c index 3d0ec23..fffc7dc 100644 --- a/block.c +++ b/block.c @@ -1971,6 +1971,21 @@ BlockInfoList *qmp_query_block(Error **errp) info->value->inserted->has_backing_file = true; info->value->inserted->backing_file = g_strdup(bs->backing_file); } + +if (bs->io_limits_enabled) { +info->value->inserted->bps = + bs->io_limits.bps[BLOCK_IO_LIMIT_TOTAL]; +info->value->inserted->bps_rd = + bs->io_limits.bps[BLOCK_IO_LIMIT_READ]; +info->value->inserted->bps_wr = + bs->io_limits.bps[BLOCK_IO_LIMIT_WRITE]; +info->value->inserted->iops = + bs->io_limits.iops[BLOCK_IO_LIMIT_TOTAL]; +info->value->inserted->iops_rd = + bs->io_limits.iops[BLOCK_IO_LIMIT_READ]; +info->value->inserted->iops_wr = + bs->io_limits.iops[BLOCK_IO_LIMIT_WRITE]; +} } /* XXX: waiting for the qapi to support GSList */ diff --git a/blockdev.c b/blockdev.c index 651828c..95d1faa 100644 --- a/blockdev.c +++ b/blockdev.c @@ -757,6 +757,65 @@ int do_change_block(Monitor *mon, const char *device, return monitor_read_bdrv_key_start(mon, bs, NULL, NULL); } +/* throttling disk I/O limits */ +int do_block_set_io_throttle(Monitor *mon, + const QDict *qdict, QObject **ret_data) +{ +BlockIOLimit io_limits; +const char *devname = qdict_get_str(qdict, "device"); +BlockDriverState *bs; + +io_limits.bps[BLOCK_IO_LIMIT_TOTAL] += qdict_get_try_int(qdict, "bps", -1); +io_limits.bps[BLOCK_IO_LIMIT_READ] += qdict_get_try_int(qdict, "bps_rd", -1); +io_limits.bps[BLOCK_IO_LIMIT_WRITE] += qdict_get_try_int(qdict, "bps_wr", -1); +io_limits.iops[BLOCK_IO_LIMIT_TOTAL] += qdict_get_try_int(qdict, "iops", -1); +io_limits.iops[BLOCK_IO_LIMIT_READ] += qdict_get_try_int(qdict, "iops_rd", -1); +io_limits.iops[BLOCK_IO_LIMIT_WRITE] += qdict_get_try_int(qdict, "iops_wr", -1); + +bs = bdrv_find(devname); +if (!bs) { +qerror_report(QERR_DEVICE_NOT_FOUND, devname); +return -1; +} + +if ((io_limits.bps[BLOCK_IO_LIMIT_TOTAL] == -1) +|| (io_limits.bps[BLOCK_IO_LIMIT_READ] == -1) +|| (io_limits.bps[BLOCK_IO_LIMIT_WRITE] == -1) +|| (io_limits.iops[BLOCK_IO_LIMIT_TOTAL] == -1) +|| (io_limits.iops[BLOCK_IO_LIMIT_READ] == -1) +|| (io_limits.iops[BLOCK_IO_LIMIT_WRITE] == -1)) { +qerror_report(QERR_MISSING_PARAMETER, + "bps/bps_rd/bps_wr/iops/iops_rd/iops_wr"); +return -1; +} + +if (!do_check_io_limits(&io_limits)) { +qerror_report(QERR_INVALID_PARAMETER_COMBINATION); +return -1; +} + +bs->io_limits = io_limits; +bs->slice_time = BLOCK_IO_SLICE_TIME; + +if (!bs->io_limits_enabled && bdrv_io_limits_enabled(bs)) { +bdrv_io_limits_enable(bs); +} else if (bs->io_limits_enabled && !bdrv_io_limits_enabled(bs)) { +bdrv_io_limits_disable(bs); +} else { +if (bs->block_timer) { +qemu_mod_timer(bs->block_timer, qemu_get_clock_ns(vm_clock)); +} +} + +return 0; +} + int do_drive_del(Monitor *mon, const QDict *qdict, QObject **ret_data) { const char *id = qdict_get_str(qdict, "id"); diff --git a/blockdev.h b/blockdev.h index 3587786..1b48a75 100644 --- a/blockdev.h +++ b/blockdev.h @@ -63,6 +63,8 @@ int do_block_set_passwd(Monitor *mon, const QDict *qdict, QObject **ret_data); int do_change_block(Monitor *mon, const char *device, const char *filename, const char *fmt); int do_drive_del(Monitor *mon, const QDict *qdict, QObject **ret_data); +int do_block_set_io_throttle(Monitor *mon, + const QDict *qdict, QObject **ret_data); int do_snapshot_blkdev(Monitor *mon, const QDict *qdict, QObject **ret_data); int do_block_resize(Monitor *mon, const QDict *qdict, QObject **ret_data); diff --git a/hmp-commands.hx b/hmp-commands.hx index 089c1ac..f8d855e 100644 --- a/hmp-commands.hx +++ b/hmp-commands.hx @@ -1207,6 +1207,21 @@ ETEXI }, STEXI +@item block_
[Qemu-devel] [PATCH v12 3/5] block: add I/O throttling algorithm
Signed-off-by: Zhi Yong Wu Signed-off-by: Stefan Hajnoczi --- block.c | 234 +++ block.h |1 + block_int.h |1 + 3 files changed, 236 insertions(+), 0 deletions(-) diff --git a/block.c b/block.c index 79e7f09..3d0ec23 100644 --- a/block.c +++ b/block.c @@ -74,6 +74,13 @@ static BlockDriverAIOCB *bdrv_co_aio_rw_vector(BlockDriverState *bs, bool is_write); static void coroutine_fn bdrv_co_do_rw(void *opaque); +static bool bdrv_exceed_bps_limits(BlockDriverState *bs, int nb_sectors, +bool is_write, double elapsed_time, uint64_t *wait); +static bool bdrv_exceed_iops_limits(BlockDriverState *bs, bool is_write, +double elapsed_time, uint64_t *wait); +static bool bdrv_exceed_io_limits(BlockDriverState *bs, int nb_sectors, +bool is_write, int64_t *wait); + static QTAILQ_HEAD(, BlockDriverState) bdrv_states = QTAILQ_HEAD_INITIALIZER(bdrv_states); @@ -107,6 +114,24 @@ int is_windows_drive(const char *filename) #endif /* throttling disk I/O limits */ +void bdrv_io_limits_disable(BlockDriverState *bs) +{ +bs->io_limits_enabled = false; + +while (qemu_co_queue_next(&bs->throttled_reqs)); + +if (bs->block_timer) { +qemu_del_timer(bs->block_timer); +qemu_free_timer(bs->block_timer); +bs->block_timer = NULL; +} + +bs->slice_start = 0; +bs->slice_end = 0; +bs->slice_time = 0; +memset(&bs->io_base, 0, sizeof(bs->io_base)); +} + static void bdrv_block_timer(void *opaque) { BlockDriverState *bs = opaque; @@ -136,6 +161,31 @@ bool bdrv_io_limits_enabled(BlockDriverState *bs) || io_limits->iops[BLOCK_IO_LIMIT_TOTAL]; } +static void bdrv_io_limits_intercept(BlockDriverState *bs, + bool is_write, int nb_sectors) +{ +int64_t wait_time = -1; + +if (!qemu_co_queue_empty(&bs->throttled_reqs)) { +qemu_co_queue_wait(&bs->throttled_reqs); +} + +/* In fact, we hope to keep each request's timing, in FIFO mode. The next + * throttled requests will not be dequeued until the current request is + * allowed to be serviced. So if the current request still exceeds the + * limits, it will be inserted to the head. All requests followed it will + * be still in throttled_reqs queue. + */ + +while (bdrv_exceed_io_limits(bs, nb_sectors, is_write, &wait_time)) { +qemu_mod_timer(bs->block_timer, + wait_time + qemu_get_clock_ns(vm_clock)); +qemu_co_queue_wait_insert_head(&bs->throttled_reqs); +} + +qemu_co_queue_next(&bs->throttled_reqs); +} + /* check if the path starts with ":" */ static int path_has_protocol(const char *path) { @@ -718,6 +768,11 @@ int bdrv_open(BlockDriverState *bs, const char *filename, int flags, bdrv_dev_change_media_cb(bs, true); } +/* throttling disk I/O limits */ +if (bs->io_limits_enabled) { +bdrv_io_limits_enable(bs); +} + return 0; unlink_and_fail: @@ -753,6 +808,11 @@ void bdrv_close(BlockDriverState *bs) bdrv_dev_change_media_cb(bs, false); } + +/*throttling disk I/O limits*/ +if (bs->io_limits_enabled) { +bdrv_io_limits_disable(bs); +} } void bdrv_close_all(void) @@ -1291,6 +1351,11 @@ static int coroutine_fn bdrv_co_do_readv(BlockDriverState *bs, return -EIO; } +/* throttling disk read I/O */ +if (bs->io_limits_enabled) { +bdrv_io_limits_intercept(bs, false, nb_sectors); +} + return drv->bdrv_co_readv(bs, sector_num, nb_sectors, qiov); } @@ -1321,6 +1386,11 @@ static int coroutine_fn bdrv_co_do_writev(BlockDriverState *bs, return -EIO; } +/* throttling disk write I/O */ +if (bs->io_limits_enabled) { +bdrv_io_limits_intercept(bs, true, nb_sectors); +} + ret = drv->bdrv_co_writev(bs, sector_num, nb_sectors, qiov); if (bs->dirty_bitmap) { @@ -2512,6 +2582,170 @@ void bdrv_aio_cancel(BlockDriverAIOCB *acb) acb->pool->cancel(acb); } +/* block I/O throttling */ +static bool bdrv_exceed_bps_limits(BlockDriverState *bs, int nb_sectors, + bool is_write, double elapsed_time, uint64_t *wait) +{ +uint64_t bps_limit = 0; +double bytes_limit, bytes_base, bytes_res; +double slice_time, wait_time; + +if (bs->io_limits.bps[BLOCK_IO_LIMIT_TOTAL]) { +bps_limit = bs->io_limits.bps[BLOCK_IO_LIMIT_TOTAL]; +} else if (bs->io_limits.bps[is_write]) { +bps_limit = bs->io_limits.bps[is_write]; +} else { +if (wait) { +*wait = 0; +} + +return false; +} + +slice_time = bs->slice_end - bs->slice_start; +slice_time /= (NANOSECONDS_PER_SECOND); +bytes_limit = bps_limit * slice_time; +bytes_base = bs->nr_bytes[is_write] - bs->io_base.bytes[is_write]; +if (bs->io_limits.bps[BLOCK_IO_
Re: [Qemu-devel] [PATCH v12 3/5] block: add I/O throttling algorithm
On Mon, Nov 7, 2011 at 11:18 PM, Kevin Wolf wrote: > Am 03.11.2011 09:57, schrieb Zhi Yong Wu: >> Signed-off-by: Zhi Yong Wu >> Signed-off-by: Stefan Hajnoczi >> --- >> block.c | 220 >> +++ >> block.h | 1 + >> block_int.h | 1 + >> 3 files changed, 222 insertions(+), 0 deletions(-) >> >> diff --git a/block.c b/block.c >> index 79e7f09..b2af48f 100644 >> --- a/block.c >> +++ b/block.c >> @@ -74,6 +74,13 @@ static BlockDriverAIOCB >> *bdrv_co_aio_rw_vector(BlockDriverState *bs, >> bool is_write); >> static void coroutine_fn bdrv_co_do_rw(void *opaque); >> >> +static bool bdrv_exceed_bps_limits(BlockDriverState *bs, int nb_sectors, >> + bool is_write, double elapsed_time, uint64_t *wait); >> +static bool bdrv_exceed_iops_limits(BlockDriverState *bs, bool is_write, >> + double elapsed_time, uint64_t *wait); >> +static bool bdrv_exceed_io_limits(BlockDriverState *bs, int nb_sectors, >> + bool is_write, int64_t *wait); >> + >> static QTAILQ_HEAD(, BlockDriverState) bdrv_states = >> QTAILQ_HEAD_INITIALIZER(bdrv_states); >> >> @@ -107,6 +114,24 @@ int is_windows_drive(const char *filename) >> #endif >> >> /* throttling disk I/O limits */ >> +void bdrv_io_limits_disable(BlockDriverState *bs) >> +{ >> + bs->io_limits_enabled = false; >> + >> + while (qemu_co_queue_next(&bs->throttled_reqs)); >> + >> + if (bs->block_timer) { >> + qemu_del_timer(bs->block_timer); >> + qemu_free_timer(bs->block_timer); >> + bs->block_timer = NULL; >> + } >> + >> + bs->slice_start = 0; >> + bs->slice_end = 0; >> + bs->slice_time = 0; >> + memset(&bs->io_base, 0, sizeof(bs->io_base)); >> +} >> + >> static void bdrv_block_timer(void *opaque) >> { >> BlockDriverState *bs = opaque; >> @@ -136,6 +161,31 @@ bool bdrv_io_limits_enabled(BlockDriverState *bs) >> || io_limits->iops[BLOCK_IO_LIMIT_TOTAL]; >> } >> >> +static void bdrv_io_limits_intercept(BlockDriverState *bs, >> + bool is_write, int nb_sectors) >> +{ >> + int64_t wait_time = -1; >> + >> + if (!qemu_co_queue_empty(&bs->throttled_reqs)) { >> + qemu_co_queue_wait(&bs->throttled_reqs); >> + } >> + >> + /* In fact, we hope to keep each request's timing, in FIFO mode. The >> next >> + * throttled requests will not be dequeued until the current request is >> + * allowed to be serviced. So if the current request still exceeds the >> + * limits, it will be inserted to the head. All requests followed it >> will >> + * be still in throttled_reqs queue. >> + */ >> + >> + while (bdrv_exceed_io_limits(bs, nb_sectors, is_write, &wait_time)) { >> + qemu_mod_timer(bs->block_timer, >> + wait_time + qemu_get_clock_ns(vm_clock)); >> + qemu_co_queue_wait_insert_head(&bs->throttled_reqs); >> + } >> + >> + qemu_co_queue_next(&bs->throttled_reqs); >> +} >> + >> /* check if the path starts with ":" */ >> static int path_has_protocol(const char *path) >> { >> @@ -718,6 +768,11 @@ int bdrv_open(BlockDriverState *bs, const char >> *filename, int flags, >> bdrv_dev_change_media_cb(bs, true); >> } >> >> + /* throttling disk I/O limits */ >> + if (bs->io_limits_enabled) { >> + bdrv_io_limits_enable(bs); >> + } >> + >> return 0; >> >> unlink_and_fail: >> @@ -753,6 +808,11 @@ void bdrv_close(BlockDriverState *bs) >> >> bdrv_dev_change_media_cb(bs, false); >> } >> + >> + /*throttling disk I/O limits*/ >> + if (bs->io_limits_enabled) { >> + bdrv_io_limits_disable(bs); >> + } >> } >> >> void bdrv_close_all(void) >> @@ -1291,6 +1351,11 @@ static int coroutine_fn >> bdrv_co_do_readv(BlockDriverState *bs, >> return -EIO; >> } >> >> + /* throttling disk read I/O */ >> + if (bs->io_limits_enabled) { >> + bdrv_io_limits_intercept(bs, false, nb_sectors); >> + } >> + >> return drv->bdrv_co_readv(bs, sector_num, nb_sectors, qiov); >> } >> >> @@ -1321,6 +1386,11 @@ static int coroutine_fn >> bdrv_co_do_writev(BlockDriverState *bs, >> return -EIO; >> } >> >> + /* throttling disk write I/O */ >> + if (bs->io_limits_enabled) { >> + bdrv_io_limits_intercept(bs, true, nb_sectors); >> + } >> + >> ret = drv->bdrv_co_writev(bs, sector_num, nb_sectors, qiov); >> >> if (bs->dirty_bitmap) { >> @@ -2512,6 +2582,156 @@ void bdrv_aio_cancel(BlockDriverAIOCB *acb) >> acb->pool->cancel(acb); >> } >> >> +/* block I/O throttling */ >> +static bool bdrv_exceed_bps_limits(BlockDriverState *bs, int nb_sectors, >> + bool is_write, double elapsed_time, uint64_t *wait) { >> + uint64_t bps_limit = 0; >> + double bytes_limit, bytes_base, bytes_res; >> + double slice_time, wait_time; >> + >> + if (bs->io_limits.bps[BLOCK_IO_LIMIT_
Re: [Qemu-devel] How to build QEMU from git
2011/11/8 Michael Eager : > Hi -- > > I ran into a problem running the qemu-0.15 and I was told > that the problem was fixed in the 1.0 release candidate. > I don't find a package with the RC on http://wiki.qemu.org/Download, > so I cloned the git repo and tried to build qemu. > > I didn't find any instructions on how to build qemu on the wiki, > so I winged it with the following configure: > > configure \ > --enable-sdl \ > --enable-curses \ > --enable-curl \ > --enable-kvm \ > --enable-nptl \ > --enable-uuid \ > --enable-linux-aio \ > --enable-io-thread \ > --enable-debug > > Is this correct, or is there a better set of options? > If you want faster, you can use --target-list option. Then other targets will not be compiled. > When I run make, there are a number of compiles which fail because > of variables being set but not used. I've modified the code to > eliminate some of them to allow the build to continue. Is there > a simpler way around this? > > After getting past these issues, libhw32/serial.o is not being built: > make[1]: *** No rule to make target `../libhw32/serial.o', needed by > `qemu'. Stop. > make[1]: Leaving directory `/local/eager/source/qemu/build/i386-softmmu' > make: *** [subdir-i386-softmmu] Error > > Is there a fix for this problem? > > -- > Michael Eager ea...@eagercon.com > 1960 Park Blvd., Palo Alto, CA 94306 650-325-8077 > > -- Regards Robert Wang
Re: [Qemu-devel] How to build QEMU from git
> I ran into a problem running the qemu-0.15 and I was told > that the problem was fixed in the 1.0 release candidate. > I don't find a package with the RC on http://wiki.qemu.org/Download, There is a qemu-1.0-rc1 on the page now. :) Regards, chenwj -- Wei-Ren Chen (陳韋任) Computer Systems Lab, Institute of Information Science, Academia Sinica, Taiwan (R.O.C.) Tel:886-2-2788-3799 #1667
Re: [Qemu-devel] [PATCH v12 4/5] hmp/qmp: add block_set_io_throttle
On Mon, Nov 7, 2011 at 11:26 PM, Kevin Wolf wrote: > Am 03.11.2011 09:57, schrieb Zhi Yong Wu: >> Signed-off-by: Zhi Yong Wu >> Signed-off-by: Stefan Hajnoczi >> --- >> block.c | 15 + >> blockdev.c | 59 >> ++ >> blockdev.h | 2 + >> hmp-commands.hx | 15 + >> hmp.c | 10 + >> qapi-schema.json | 16 +- >> qerror.c | 4 +++ >> qerror.h | 3 ++ >> qmp-commands.hx | 53 +++- >> 9 files changed, 175 insertions(+), 2 deletions(-) >> >> diff --git a/block.c b/block.c >> index b2af48f..ed6fe20 100644 >> --- a/block.c >> +++ b/block.c >> @@ -1971,6 +1971,21 @@ BlockInfoList *qmp_query_block(Error **errp) >> info->value->inserted->has_backing_file = true; >> info->value->inserted->backing_file = >> g_strdup(bs->backing_file); >> } >> + >> + if (bs->io_limits_enabled) { >> + info->value->inserted->bps = >> + bs->io_limits.bps[BLOCK_IO_LIMIT_TOTAL]; >> + info->value->inserted->bps_rd = >> + bs->io_limits.bps[BLOCK_IO_LIMIT_READ]; >> + info->value->inserted->bps_wr = >> + bs->io_limits.bps[BLOCK_IO_LIMIT_WRITE]; >> + info->value->inserted->iops = >> + bs->io_limits.iops[BLOCK_IO_LIMIT_TOTAL]; >> + info->value->inserted->iops_rd = >> + bs->io_limits.iops[BLOCK_IO_LIMIT_READ]; >> + info->value->inserted->iops_wr = >> + bs->io_limits.iops[BLOCK_IO_LIMIT_WRITE]; >> + } >> } >> >> /* XXX: waiting for the qapi to support GSList */ >> diff --git a/blockdev.c b/blockdev.c >> index 651828c..95d1faa 100644 >> --- a/blockdev.c >> +++ b/blockdev.c >> @@ -757,6 +757,65 @@ int do_change_block(Monitor *mon, const char *device, >> return monitor_read_bdrv_key_start(mon, bs, NULL, NULL); >> } >> >> +/* throttling disk I/O limits */ >> +int do_block_set_io_throttle(Monitor *mon, >> + const QDict *qdict, QObject **ret_data) >> +{ >> + BlockIOLimit io_limits; >> + const char *devname = qdict_get_str(qdict, "device"); >> + BlockDriverState *bs; >> + >> + io_limits.bps[BLOCK_IO_LIMIT_TOTAL] >> + = qdict_get_try_int(qdict, "bps", -1); >> + io_limits.bps[BLOCK_IO_LIMIT_READ] >> + = qdict_get_try_int(qdict, "bps_rd", -1); >> + io_limits.bps[BLOCK_IO_LIMIT_WRITE] >> + = qdict_get_try_int(qdict, "bps_wr", -1); >> + io_limits.iops[BLOCK_IO_LIMIT_TOTAL] >> + = qdict_get_try_int(qdict, "iops", -1); >> + io_limits.iops[BLOCK_IO_LIMIT_READ] >> + = qdict_get_try_int(qdict, "iops_rd", -1); >> + io_limits.iops[BLOCK_IO_LIMIT_WRITE] >> + = qdict_get_try_int(qdict, "iops_wr", -1); >> + >> + bs = bdrv_find(devname); >> + if (!bs) { >> + qerror_report(QERR_DEVICE_NOT_FOUND, devname); >> + return -1; >> + } >> + >> + if ((io_limits.bps[BLOCK_IO_LIMIT_TOTAL] == -1) >> + || (io_limits.bps[BLOCK_IO_LIMIT_READ] == -1) >> + || (io_limits.bps[BLOCK_IO_LIMIT_WRITE] == -1) >> + || (io_limits.iops[BLOCK_IO_LIMIT_TOTAL] == -1) >> + || (io_limits.iops[BLOCK_IO_LIMIT_READ] == -1) >> + || (io_limits.iops[BLOCK_IO_LIMIT_WRITE] == -1)) { >> + qerror_report(QERR_MISSING_PARAMETER, >> + "bps/bps_rd/bps_wr/iops/iops_rd/iops_wr"); >> + return -1; >> + } > > Here you require that all parameters are set... This is what i want. > >> + >> + if (!do_check_io_limits(&io_limits)) { >> + qerror_report(QERR_INVALID_PARAMETER_COMBINATION); >> + return -1; >> + } >> + >> + bs->io_limits = io_limits; >> + bs->slice_time = BLOCK_IO_SLICE_TIME; >> + >> + if (!bs->io_limits_enabled && bdrv_io_limits_enabled(bs)) { >> + bdrv_io_limits_enable(bs); >> + } else if (bs->io_limits_enabled && !bdrv_io_limits_enabled(bs)) { >> + bdrv_io_limits_disable(bs); >> + } else { >> + if (bs->block_timer) { >> + qemu_mod_timer(bs->block_timer, qemu_get_clock_ns(vm_clock)); >> + } >> + } >> + >> + return 0; >> +} >> + >> int do_drive_del(Monitor *mon, const QDict *qdict, QObject **ret_data) >> { >> const char *id = qdict_get_str(qdict, "id"); >> diff --git a/blockdev.h b/blockdev.h >> index 3587786..1b48a75 100644 >> --- a/blockdev.h >> +++ b/blockdev.h >> @@ -63,6 +63,8 @@ int do_block_set_passwd(Monitor *mon, const QDict *qdict, >> QObject **ret_data); >> int do_change_block(Monitor *mon, const char *device, >> const char *filename, const char *fmt)
Re: [Qemu-devel] [PATCH v12 5/5] block: perf testing report based on block I/O throttling
On Mon, Nov 7, 2011 at 11:27 PM, Kevin Wolf wrote: > Am 03.11.2011 09:57, schrieb Zhi Yong Wu: >> The file 1mbps.dat is based on bps=1024*1024 I/O throttling; and the file >> 10mbps.dat is based on bps=10*1024*1024 I/O throttling. >> >> Signed-off-by: Zhi Yong Wu >> --- >> 10mbps.dat | 310 ++ >> 1mbps.dat | 339 >> >> 2 files changed, 649 insertions(+), 0 deletions(-) >> create mode 100644 10mbps.dat >> create mode 100644 1mbps.dat > > This is just for information and not supposed to be merged, right? Yeah, it is only to supply some proof to prove I/O block throttling is effective. > > Kevin > -- Regards, Zhi Yong Wu
Re: [Qemu-devel] [PATCH 4/5] integratorcp: convert integratorcm to VMState
2011/10/26 Peter Maydell : > On 25 October 2011 12:09, Benoît Canet wrote: >> +static const VMStateDescription vmstate_integratorcm = { >> + .name = "integratorcm", >> + .version_id = 1, >> + .minimum_version_id = 1, >> + .minimum_version_id_old = 1, >> + .fields = (VMStateField[]) { >> + VMSTATE_UINT32(memsz, integratorcm_state), >> + VMSTATE_BOOL(flash_mapped, integratorcm_state), > > This raises a question. flash_mapped here is a flag which just > tracks whether the associated MemoryRegion is currently mapped > or unmapped. Do we need to do anything special to ensure that > the MemoryRegion itself is reset to the correct mapped/unmapped > state on restore? > > I recall discussing this kind of thing with Avi on IRC but I > can't remember what the conclusion was. Avi, ping? I'm still interested in the answer to this question. thanks -- PMM
Re: [Qemu-devel] Patch: fix typo: runnning -> running
Better cc to qemu-triv...@nongnu.org On Mon, Nov 07, 2011 at 03:58:23PM -0800, Vagrant Cascadian wrote: > One n too many for running, need we say more. > > Signed-Off-By: Vagrant Cascadian > > Index: qemu/target-i386/kvm.c > === > --- qemu.orig/target-i386/kvm.c 2011-07-02 21:25:10.0 -0700 > +++ qemu/target-i386/kvm.c2011-07-02 21:25:20.0 -0700 > @@ -1537,7 +1537,7 @@ > code); > if (host_supports_vmx() && code == VMX_INVALID_GUEST_STATE) { > fprintf(stderr, > -"\nIf you're runnning a guest on an Intel machine > without " > +"\nIf you're running a guest on an Intel machine without > " > "unrestricted mode\n" > "support, the failure can be most likely due to the > guest " > "entering an invalid\n" > > > live well, > vagrant -- Wei-Ren Chen (陳韋任) Computer Systems Lab, Institute of Information Science, Academia Sinica, Taiwan (R.O.C.) Tel:886-2-2788-3799 #1667
Re: [Qemu-devel] [PATCH V3] Introduce a new bus "ICC" to connect APIC
On Sun, Nov 06, 2011 at 07:06:29PM +0100, Jan Kiszka wrote: > On 2011-11-03 10:30, pingf...@linux.vnet.ibm.com wrote: > > From: Liu Ping Fan > > > > Introduce a new structure CPUS as the controller of ICC (INTERRUPT > > CONTROLLER COMMUNICATIONS), and new bus "ICC" to hold APIC,instead > > of sysbus. So we can support APIC hot-plug feature. > > > > Signed-off-by: liu ping fan > > --- > > Makefile.target |1 + > > hw/apic.c | 24 + > > hw/apic.h |1 + > > hw/icc_bus.c| 92 > > +++ > > hw/icc_bus.h| 61 + > > hw/pc.c |9 +++-- > > hw/pc_piix.c| 14 +++- > > target-i386/cpu.h |1 + > > target-i386/cpuid.c | 16 + > > 9 files changed, 207 insertions(+), 12 deletions(-) > > create mode 100644 hw/icc_bus.c > > create mode 100644 hw/icc_bus.h > > > > diff --git a/Makefile.target b/Makefile.target > > index 9011f28..5607c6d 100644 > > --- a/Makefile.target > > +++ b/Makefile.target > > @@ -241,6 +241,7 @@ obj-i386-$(CONFIG_KVM) += kvmclock.o > > obj-i386-$(CONFIG_SPICE) += qxl.o qxl-logger.o qxl-render.o > > obj-i386-y += testdev.o > > obj-i386-y += acpi.o acpi_piix4.o > > +obj-i386-y += icc_bus.o > > > > obj-i386-y += pcspk.o i8254.o > > obj-i386-$(CONFIG_KVM_PIT) += i8254-kvm.o > > diff --git a/hw/apic.c b/hw/apic.c > > index 69d6ac5..34fa1dd 100644 > > --- a/hw/apic.c > > +++ b/hw/apic.c > > @@ -21,9 +21,10 @@ > > #include "ioapic.h" > > #include "qemu-timer.h" > > #include "host-utils.h" > > -#include "sysbus.h" > > +#include "icc_bus.h" > > #include "trace.h" > > #include "kvm.h" > > +#include "exec-memory.h" > > > > /* APIC Local Vector Table */ > > #define APIC_LVT_TIMER 0 > > @@ -80,7 +81,7 @@ > > typedef struct APICState APICState; > > > > struct APICState { > > -SysBusDevice busdev; > > +ICCBusDevice busdev; > > MemoryRegion io_memory; > > void *cpu_env; > > uint32_t apicbase; > > @@ -1104,9 +1105,19 @@ static const MemoryRegionOps apic_io_ops = { > > .endianness = DEVICE_NATIVE_ENDIAN, > > }; > > > > -static int apic_init1(SysBusDevice *dev) > > +int apic_mmio_map(DeviceState *dev, target_phys_addr_t base) > > { > > -APICState *s = FROM_SYSBUS(APICState, dev); > > +APICState *s = DO_UPCAST(APICState, busdev.qdev, dev); > > + > > +memory_region_add_subregion(get_system_memory(), > > +base, > > +&s->io_memory); > > +return 0; > > +} > > + > > +static int apic_init1(ICCBusDevice *dev) > > +{ > > +APICState *s = DO_UPCAST(APICState, busdev, dev); > > static int last_apic_idx; > > > > if (last_apic_idx >= MAX_APICS) { > > @@ -1114,7 +1125,6 @@ static int apic_init1(SysBusDevice *dev) > > } > > memory_region_init_io(&s->io_memory, &apic_io_ops, s, "apic", > >MSI_ADDR_SIZE); > > -sysbus_init_mmio_region(dev, &s->io_memory); > > > > s->timer = qemu_new_timer_ns(vm_clock, apic_timer, s); > > s->idx = last_apic_idx++; > > @@ -1122,7 +1132,7 @@ static int apic_init1(SysBusDevice *dev) > > return 0; > > } > > > > -static SysBusDeviceInfo apic_info = { > > +static ICCBusDeviceInfo apic_info = { > > .init = apic_init1, > > .qdev.name = "apic", > > .qdev.size = sizeof(APICState), > > @@ -1138,7 +1148,7 @@ static SysBusDeviceInfo apic_info = { > > > > static void apic_register_devices(void) > > { > > -sysbus_register_withprop(&apic_info); > > +iccbus_register_devinfo(&apic_info); > > } > > > > device_init(apic_register_devices) > > diff --git a/hw/apic.h b/hw/apic.h > > index c857d52..e2c0af5 100644 > > --- a/hw/apic.h > > +++ b/hw/apic.h > > @@ -20,6 +20,7 @@ void cpu_set_apic_tpr(DeviceState *s, uint8_t val); > > uint8_t cpu_get_apic_tpr(DeviceState *s); > > void apic_init_reset(DeviceState *s); > > void apic_sipi(DeviceState *s); > > +int apic_mmio_map(DeviceState *dev, target_phys_addr_t base); > > > > /* pc.c */ > > int cpu_is_bsp(CPUState *env); > > diff --git a/hw/icc_bus.c b/hw/icc_bus.c > > new file mode 100644 > > index 000..ac88f2e > > --- /dev/null > > +++ b/hw/icc_bus.c > > @@ -0,0 +1,92 @@ > > +/* icc_bus.c > > + * emulate x86 ICC(INTERRUPT CONTROLLER COMMUNICATIONS) bus > > + * > > + * Copyright IBM, Corp. 2011 > > + * > > + * This library is free software; you can redistribute it and/or > > + * modify it under the terms of the GNU Lesser General Public > > + * License as published by the Free Software Foundation; either > > + * version 2 of the License, or (at your option) any later version. > > + * > > + * This library is distributed in the hope that it will be useful, > > + * but WITHOUT ANY WARRANTY; without even the implied warranty of > > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > > + * Lesser General Public License for mor
[Qemu-devel] Patch: fix typo: runnning -> running
One n too many for running, need we say more. Signed-Off-By: Vagrant Cascadian Index: qemu/target-i386/kvm.c === --- qemu.orig/target-i386/kvm.c 2011-07-02 21:25:10.0 -0700 +++ qemu/target-i386/kvm.c 2011-07-02 21:25:20.0 -0700 @@ -1537,7 +1537,7 @@ code); if (host_supports_vmx() && code == VMX_INVALID_GUEST_STATE) { fprintf(stderr, -"\nIf you're runnning a guest on an Intel machine without " +"\nIf you're running a guest on an Intel machine without " "unrestricted mode\n" "support, the failure can be most likely due to the guest " "entering an invalid\n" live well, vagrant
Re: [Qemu-devel] [PATCH] KVM: Add wrapper script around QEMU to test kernels
On Nov 7, 2011, at 5:19 PM, Anthony Liguori wrote: > > The kernel ecosystem does not have to be limited to linux.git. There could > be a process to be a "kernel.org project" for projects that fit a certain set > of criteria. These projects could all share the Linux kernel release cadence > and have a kernel maintainer as a sponsor or something like that. > > That is something that could potentially benefit things like e2fs-tools and > all of the other tools that are tied closely to the kernel. We have that already. Packages such as e2fsprogs, xfsprogs, xfstests, sparse, git, etc., have git trees under git.kernel.org. And I agree that's the perfect place for kvm-tool and perf. :-) -- Ted
Re: [Qemu-devel] clk_setup decl missing in virtex_init
Works for me. On 11/07/2011 03:05 PM, Edgar E. Iglesias wrote: On Mon, Nov 07, 2011 at 02:46:47PM -0800, Michael Eager wrote: The declaration of clk_setup is missing in qemu/hw/virtex_ml507.c: static void virtex_init(ram_addr_t ram_size, const char *boot_device, const char *kernel_filename, const char *kernel_cmdline, const char *initrd_filename, const char *cpu_model) { ... ... memset(clk_setup, 0, sizeof(clk_setup)); This SEGVs because clk_setup is an inline function defined in ppc.h. (I presume that the linker generates an out-of-line copy.) It isn't clear what the declaration should be. In ppc405_uc.c, there is a decl: clk_setup_t clk_setup[PPC405EP_CLK_NB]; Hi the following patch seems to work on my side. Fabien, could you please see if this was your intention? It seems to be commit ddd1055b07fdfe488a22c2275adaca75f4206d30 that introduced the segfault. Cheers commit 8e95771e4afb6e91c30a53943118afef4631b919 Author: Edgar E. Iglesias Date: Tue Nov 8 00:00:55 2011 +0100 virtex: Remove memset of clk_setup clk_setup is now a function. Signed-off-by: Edgar E. Iglesias diff --git a/hw/virtex_ml507.c b/hw/virtex_ml507.c index d31a204..5ea0e60 100644 --- a/hw/virtex_ml507.c +++ b/hw/virtex_ml507.c @@ -202,7 +202,6 @@ static void virtex_init(ram_addr_t ram_size, cpu_model = "440-Xilinx"; } -memset(clk_setup, 0, sizeof(clk_setup)); env = ppc440_init_xilinx(&ram_size, 1, cpu_model, 4); qemu_register_reset(main_cpu_reset, env); -- Michael Eagerea...@eagercon.com 1960 Park Blvd., Palo Alto, CA 94306 650-325-8077
Re: [Qemu-devel] [PATCH v2 4/7] block: take lock around bdrv_read implementations
On 11/08/2011 12:26 AM, Avi Kivity wrote: > On 11/06/2011 07:25 PM, Paolo Bonzini wrote: > > > > Perhaps the failure is only reproduced 80-90% of the time and this > > screws up the bisection. > > Correct, bad bisect. It actually reproduces reliably, when the bisecter > is reliable. > > The new candidate (disclaimer: unreliable bisecter) is: > > commit a5c57d64aa61b700db444c4864a1da11f1165db6 > Author: Paolo Bonzini > Date: Mon Sep 12 14:40:36 2011 +0200 > > qemu-timer: do not refer to runstate_is_running() > > Signed-off-by: Paolo Bonzini > > Fixed by 47113ab6b8c5659a, as Anthony pointed out on IRC. -- I have a truly marvellous patch that fixes the bug which this signature is too narrow to contain.
Re: [Qemu-devel] clk_setup decl missing in virtex_init
On Mon, Nov 07, 2011 at 02:46:47PM -0800, Michael Eager wrote: > The declaration of clk_setup is missing in > qemu/hw/virtex_ml507.c: > > static void virtex_init(ram_addr_t ram_size, > const char *boot_device, > const char *kernel_filename, > const char *kernel_cmdline, > const char *initrd_filename, const char *cpu_model) > { > ... > > ... > memset(clk_setup, 0, sizeof(clk_setup)); > > This SEGVs because clk_setup is an inline function defined in ppc.h. > (I presume that the linker generates an out-of-line copy.) > > It isn't clear what the declaration should be. In ppc405_uc.c, there > is a decl: >clk_setup_t clk_setup[PPC405EP_CLK_NB]; Hi the following patch seems to work on my side. Fabien, could you please see if this was your intention? It seems to be commit ddd1055b07fdfe488a22c2275adaca75f4206d30 that introduced the segfault. Cheers commit 8e95771e4afb6e91c30a53943118afef4631b919 Author: Edgar E. Iglesias Date: Tue Nov 8 00:00:55 2011 +0100 virtex: Remove memset of clk_setup clk_setup is now a function. Signed-off-by: Edgar E. Iglesias diff --git a/hw/virtex_ml507.c b/hw/virtex_ml507.c index d31a204..5ea0e60 100644 --- a/hw/virtex_ml507.c +++ b/hw/virtex_ml507.c @@ -202,7 +202,6 @@ static void virtex_init(ram_addr_t ram_size, cpu_model = "440-Xilinx"; } -memset(clk_setup, 0, sizeof(clk_setup)); env = ppc440_init_xilinx(&ram_size, 1, cpu_model, 4); qemu_register_reset(main_cpu_reset, env);
Re: [Qemu-devel] [PATCH v7 2/2] pc: Support system flash memory with pflash
On Wed, Nov 2, 2011 at 21:19, Jordan Justen wrote: > If a pflash image is found, then it is used for the system > firmware image. > > If a pflash image is not initially found, then a read-only > pflash device is created using the -bios filename. > > KVM cannot execute from a pflash region currently. > Therefore, when KVM is enabled, a (read-only) ram memory > region is created and filled with the contents of the > pflash drive. > > Signed-off-by: Jordan Justen > Cc: Anthony Liguori > --- > Makefile.target | 1 + > default-configs/i386-softmmu.mak | 1 + > default-configs/x86_64-softmmu.mak | 1 + > hw/boards.h | 1 + > hw/pc.c | 55 +- > hw/pc.h | 3 + > hw/pc_sysfw.c | 196 > > vl.c | 2 +- > 8 files changed, 209 insertions(+), 51 deletions(-) > create mode 100644 hw/pc_sysfw.c > > diff --git a/Makefile.target b/Makefile.target > index 0c86bc5..8adda6c 100644 > --- a/Makefile.target > +++ b/Makefile.target > @@ -232,6 +232,7 @@ obj-i386-y += vmport.o > obj-i386-y += device-hotplug.o pci-hotplug.o smbios.o wdt_ib700.o > obj-i386-y += debugcon.o multiboot.o > obj-i386-y += pc_piix.o > +obj-i386-y += pc_sysfw.o > obj-i386-$(CONFIG_KVM) += kvmclock.o > obj-i386-$(CONFIG_SPICE) += qxl.o qxl-logger.o qxl-render.o > > diff --git a/default-configs/i386-softmmu.mak > b/default-configs/i386-softmmu.mak > index e67ebb3..cd407a9 100644 > --- a/default-configs/i386-softmmu.mak > +++ b/default-configs/i386-softmmu.mak > @@ -22,3 +22,4 @@ CONFIG_SOUND=y > CONFIG_HPET=y > CONFIG_APPLESMC=y > CONFIG_I8259=y > +CONFIG_PFLASH_CFI01=y > diff --git a/default-configs/x86_64-softmmu.mak > b/default-configs/x86_64-softmmu.mak > index b75757e..47734ea 100644 > --- a/default-configs/x86_64-softmmu.mak > +++ b/default-configs/x86_64-softmmu.mak > @@ -22,3 +22,4 @@ CONFIG_SOUND=y > CONFIG_HPET=y > CONFIG_APPLESMC=y > CONFIG_I8259=y > +CONFIG_PFLASH_CFI01=y > diff --git a/hw/boards.h b/hw/boards.h > index 716fd7b..45a31a1 100644 > --- a/hw/boards.h > +++ b/hw/boards.h > @@ -33,6 +33,7 @@ typedef struct QEMUMachine { > } QEMUMachine; > > int qemu_register_machine(QEMUMachine *m); > +QEMUMachine *find_default_machine(void); > > extern QEMUMachine *current_machine; > > diff --git a/hw/pc.c b/hw/pc.c > index eb4c2d8..ce97eb7 100644 > --- a/hw/pc.c > +++ b/hw/pc.c > @@ -57,10 +57,6 @@ > #define DPRINTF(fmt, ...) > #endif > > -#define BIOS_FILENAME "bios.bin" > - > -#define PC_MAX_BIOS_SIZE (4 * 1024 * 1024) > - > /* Leave a chunk of memory at the top of RAM for the BIOS ACPI tables. */ > #define ACPI_DATA_SIZE 0x1 > #define BIOS_CFG_IOPORT 0x510 > @@ -976,11 +972,9 @@ void pc_memory_init(MemoryRegion *system_memory, > MemoryRegion *rom_memory, > MemoryRegion **ram_memory) > { > - char *filename; > - int ret, linux_boot, i; > - MemoryRegion *ram, *bios, *isa_bios, *option_rom_mr; > + int linux_boot, i; > + MemoryRegion *ram, *option_rom_mr; > MemoryRegion *ram_below_4g, *ram_above_4g; > - int bios_size, isa_bios_size; > void *fw_cfg; > > linux_boot = (kernel_filename != NULL); > @@ -1005,43 +999,9 @@ void pc_memory_init(MemoryRegion *system_memory, > ram_above_4g); > } > > - /* BIOS load */ > - if (bios_name == NULL) > - bios_name = BIOS_FILENAME; > - filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, bios_name); > - if (filename) { > - bios_size = get_image_size(filename); > - } else { > - bios_size = -1; > - } > - if (bios_size <= 0 || > - (bios_size % 65536) != 0) { > - goto bios_error; > - } > - bios = g_malloc(sizeof(*bios)); > - memory_region_init_ram(bios, NULL, "pc.bios", bios_size); > - memory_region_set_readonly(bios, true); > - ret = rom_add_file_fixed(bios_name, (uint32_t)(-bios_size), -1); > - if (ret != 0) { > - bios_error: > - fprintf(stderr, "qemu: could not load PC BIOS '%s'\n", bios_name); > - exit(1); > - } > - if (filename) { > - g_free(filename); > - } > - /* map the last 128KB of the BIOS in ISA space */ > - isa_bios_size = bios_size; > - if (isa_bios_size > (128 * 1024)) > - isa_bios_size = 128 * 1024; > - isa_bios = g_malloc(sizeof(*isa_bios)); > - memory_region_init_alias(isa_bios, "isa-bios", bios, > - bios_size - isa_bios_size, isa_bios_size); > - memory_region_add_subregion_overlap(rom_memory, > - 0x10 - isa_bios_size, > - isa_bios, > - 1); > - memory_region_set_readonly(isa_bios, true); > + > + /* Initialize ROM or flash ranges for PC firmware */ > + pc_system_firmware_init
[Qemu-devel] clk_setup decl missing in virtex_init
The declaration of clk_setup is missing in qemu/hw/virtex_ml507.c: static void virtex_init(ram_addr_t ram_size, const char *boot_device, const char *kernel_filename, const char *kernel_cmdline, const char *initrd_filename, const char *cpu_model) { ... ... memset(clk_setup, 0, sizeof(clk_setup)); This SEGVs because clk_setup is an inline function defined in ppc.h. (I presume that the linker generates an out-of-line copy.) It isn't clear what the declaration should be. In ppc405_uc.c, there is a decl: clk_setup_t clk_setup[PPC405EP_CLK_NB]; -- Michael Eagerea...@eagercon.com 1960 Park Blvd., Palo Alto, CA 94306 650-325-8077
Re: [Qemu-devel] [PATCH v2 4/7] block: take lock around bdrv_read implementations
On 11/06/2011 07:25 PM, Paolo Bonzini wrote: > > Perhaps the failure is only reproduced 80-90% of the time and this > screws up the bisection. Correct, bad bisect. It actually reproduces reliably, when the bisecter is reliable. The new candidate (disclaimer: unreliable bisecter) is: commit a5c57d64aa61b700db444c4864a1da11f1165db6 Author: Paolo Bonzini Date: Mon Sep 12 14:40:36 2011 +0200 qemu-timer: do not refer to runstate_is_running() Signed-off-by: Paolo Bonzini -- I have a truly marvellous patch that fixes the bug which this signature is too narrow to contain.
Re: [Qemu-devel] [PATCH] KVM: Add wrapper script around QEMU to test kernels
On 11/07/2011 03:36 PM, Pekka Enberg wrote: Hi Ted, On Mon, Nov 7, 2011 at 10:32 PM, Ted Ts'o wrote: Personally, I consider code that runs in userspace as a pretty bright line, as being "not kernel code", and while perhaps things like initramfs and the crazy ideas people have had in the past of moving stuff out of kernel/init.c into userspace might have qualified as stuff really close to the kernel, something like kvm-tool that runs way after boot, doesn't even come close. Wine is another example of another package that has lots of close kernel ties, but was also not bundled into the kernel. It's not as clear line as you make it out to be. KVM tool also has mini-BIOS code that runs in guest space. It has a code that runs in userspace but is effectively a simple bootloader. So it definitely doesn't fit the simple definition of "running way after boot" (we're _booting_ the kernel too). Linsched fits your definition but is clearly worth integrating to the kernel tree. While you are suggesting that maybe we should move Perf out of the tree now that it's mature, I'm pretty sure you'd agree that it probably would not have happened if the userspace parts were developed out of tree. There's also spectacular failures in the kernel history where the userspace split was enforced. For example, userspace suspend didn't turn out the way people envisioned it at the time. We don't know how it would have worked out if the userspace components would have been in the tree but it certainly would have solved many if the early ABI issues. I guess I'm trying to argue here that there's a middle ground. I'm willing to bet projects like klibc and unified initramfs will eventually make it to the kernel tree because they simply make so much sense. I'm also willing to be that the costs of moving Perf out of the tree are simply too high to make it worthwhile. Does that mean KVM tool should get a free pass in merging? Absolutely not. But I do think your position is too extreme and ignores the benefits of developing userspace tools in the kernel ecosystem which was summed up by Anthony rather well in this thread: https://lkml.org/lkml/2011/11/7/169 The kernel ecosystem does not have to be limited to linux.git. There could be a process to be a "kernel.org project" for projects that fit a certain set of criteria. These projects could all share the Linux kernel release cadence and have a kernel maintainer as a sponsor or something like that. That is something that could potentially benefit things like e2fs-tools and all of the other tools that are tied closely to the kernel. In fact, having a single place where users could find all of the various kernel related tools and helpers would probably be extremely useful. There's no reason this needs to be linux.git though, this could just be a web page on kernel.org. Regards, Anthony Liguori Pekka
[Qemu-devel] [ANNOUNCE] QEMU 1.0-rc1
Hi, On behalf of the QEMU Team, I'd like to announce the availability of QEMU 1.0, release candidate 1. This is the first release candidate for the 1.0 release. This is not intended for production use but rather for testing. To participate in the testing effort, please read the Testing Wiki[1] and sign up to test a subsystem. Any problems should be reported on Launchpad[2] or qemu-devel. If you've contributed to the 1.0 release, please take a moment and update the Changelog[3] so we can have a high quality change log for the release. The full release schedule[4] is also available on the wiki. http://wiki.qemu.org/download/qemu-1.0-rc1.tar.gz Known Issues: 1) There is an issue resetting Linux guests that is still being investigated. Shutting down QEMU completely and restarting works around the issue. Changelog since v1.0-rc0 - console: Fix rendering of VGA underline (Markus Armbruster) - qemu_timedate_diff() shouldn't modify its argument. (Gleb Natapov) - reenable vm_clock when resuming all vcpus (Wen Congyang) - qxl: fix vga port initialization. (Gerd Hoffmann) - ac97: don't override the pci subsystem id (Gerd Hoffmann) - pc: add 1.0 machine type (Gerd Hoffmann) - disable automatic loading of sgabios when -nographic (Paolo Bonzini) - add sgabios blob and submodule (Paolo Bonzini) - xen-platform: Fix IO port read/write functions (Anthony PERARD) - readline: Fix buffer overrun on re-add to history (Markus Armbruster) - cmd: Fix potential memory leak (Pavel Borzenkov) - cmd: Fix potential NULL pointer dereference (Pavel Borzenkov) - cmd: Fix coding style in cmd.c (Pavel Borzenkov) - arm_gic: handle banked enable bits for per-cpu interrupts (Rabin Vincent) - vvfat: reorganize computation of disk geometry (Paolo Bonzini) - vvfat: do not hardcode sector counts in error message (Paolo Bonzini) - vvfat: unify and correct computation of sector count (Paolo Bonzini) - vvfat: need to use first_sectors_number to distinguish fdd/hdd (Paolo Bonzini) - vvfat: do not fail if the disk has spare sectors (Paolo Bonzini) - vvfat: fix out of bounds array_get usage (Paolo Bonzini) - block/cloop: Use g_free instead of free (Dong Xu Wang) - block/cloop: Fix coding style (Dong Xu Wang) - dma: Avoid reentrancy in DMA transfer handlers (Kevin Wolf) - qemu-io: Fix multiwrite_f error handling (Kevin Wolf) - qemu-io: Handle create_iovec errors (Kevin Wolf) - Fix X86 CPU topology in KVM mode (Bharata B Rao) - intel-hda: fix stream search (Gerd Hoffmann) - virtio-blk: pass full status to the guest (Paolo Bonzini) - hw/9pfs: use g_vasprintf() instead of rolling our own (Stefan Hajnoczi) - xtensa_lx60: fix build date code and change memory region names (Max Filippov) - xtensa_lx60: pass kernel arguments from -append (Max Filippov) - xtensa_lx60: add FLASH support (Max Filippov) - target-xtensa: raise an exception for invalid and reserved opcodes (Max Filippov) - target-xtensa: handle cache options in the overlay tool (Max Filippov) - target-xtensa: mask out undefined bits of WINDOWSTART SR (Max Filippov) - tcg: Add tcg interpreter to configure / make (Stefan Weil) - tcg: Add tci disassembler (Stefan Weil) - tcg: Add interpreter for bytecode (Stefan Weil) - tcg: Add bytecode generator for tcg interpreter (Stefan Weil) - tcg: Make ARRAY_SIZE(tcg_op_defs) globally available (Stefan Weil) - tcg: TCG targets may define tcg_qemu_tb_exec (Stefan Weil) - memory: use 128-bit integers for sizes and intermediates (Avi Kivity) - Add support for 128-bit arithmetic (Avi Kivity) [1] http://wiki.qemu.org/Planning/1.0/Testing [2] http://wiki.qemu.org/ReportABug [3] http://wiki.qemu.org/ChangeLog/Next [4] http://wiki.qemu.org/Planning/1.0 Regards, Anthony Liguori
Re: [Qemu-devel] [PATCH] KVM: Add wrapper script around QEMU to test kernels
Hi Ted, On Mon, Nov 7, 2011 at 10:32 PM, Ted Ts'o wrote: > Personally, I consider code that runs in userspace as a pretty bright > line, as being "not kernel code", and while perhaps things like > initramfs and the crazy ideas people have had in the past of moving > stuff out of kernel/init.c into userspace might have qualified as > stuff really close to the kernel, something like kvm-tool that runs > way after boot, doesn't even come close. Wine is another example of > another package that has lots of close kernel ties, but was also not > bundled into the kernel. It's not as clear line as you make it out to be. KVM tool also has mini-BIOS code that runs in guest space. It has a code that runs in userspace but is effectively a simple bootloader. So it definitely doesn't fit the simple definition of "running way after boot" (we're _booting_ the kernel too). Linsched fits your definition but is clearly worth integrating to the kernel tree. While you are suggesting that maybe we should move Perf out of the tree now that it's mature, I'm pretty sure you'd agree that it probably would not have happened if the userspace parts were developed out of tree. There's also spectacular failures in the kernel history where the userspace split was enforced. For example, userspace suspend didn't turn out the way people envisioned it at the time. We don't know how it would have worked out if the userspace components would have been in the tree but it certainly would have solved many if the early ABI issues. I guess I'm trying to argue here that there's a middle ground. I'm willing to bet projects like klibc and unified initramfs will eventually make it to the kernel tree because they simply make so much sense. I'm also willing to be that the costs of moving Perf out of the tree are simply too high to make it worthwhile. Does that mean KVM tool should get a free pass in merging? Absolutely not. But I do think your position is too extreme and ignores the benefits of developing userspace tools in the kernel ecosystem which was summed up by Anthony rather well in this thread: https://lkml.org/lkml/2011/11/7/169 Pekka
Re: [Qemu-devel] [PATCH] KVM: Add wrapper script around QEMU to test kernels
On Mon, Nov 07, 2011 at 10:09:34PM +0200, Pekka Enberg wrote: > > I guess for perf ABI, "perf test" is the closest thing to a > specification so if your application is using something that's not > covered by it, you might be in trouble. I don't believe there's ever been any guarantee that "perf test" from version N of the kernel will always work on a version N+M of the kernel. Perhaps I am wrong, though. If that is a guarantee that the perf developers are willing to stand behind, or have already made, I would love to be corrected and would be delighted to hear that in fact there is a stable, backwards compatible perf ABI. Regards, - Ted
Re: [Qemu-devel] Autotest | Job ID: 2072 "[unittest emulator + kernel git] Upstream qemu-kvm.git sanity 11-07-2011 13:16:33" | Status: 1 Completed | Success Rate: 94.74 %
On 11/07/2011 04:56 PM, kvm-autot...@redhat.com wrote: Job ID: 2072 Job name: [unittest emulator + kernel git] Upstream qemu-kvm.git sanity 11-07-2011 13:16:33 Summary: Host: virtlab201.virt.bos.redhat.com Status: Completed Status: 1 Completed Results interface URL: http://autotest.virt.bos.redhat.com/afe/#tab_id=view_job&object_id=2072 Execution time (HH:MM:SS): 01:14:19 User tests executed: 19 User tests passed: 18 User tests failed: 1 User tests success rate: 94.74 % Failures: Test Name Status Reason kvm.qemu-kvm-git.unittests FAIL Unit tests failed: emulator Marcelo, As requested, we've run the kvm unit tests with a kernel based on git commit f2ae29e3492d41f082bf1e66532c2fe4da65f486. Relevant log entries: 11/07 14:08:55 INFO |virt_utils:0476| Fetching git [REP 'git://github.com/avikivity/kvm.git' BRANCH 'master' COMMIT 'f2ae29e3492d41f082bf1e66532c2fe4da65f486'] -> /tmp/kernel_src ... 11/07 14:15:02 DEBUG|base_utils:0074| Running 'git describe' 11/07 14:15:06 INFO |virt_utils:0501| Commit hash for git://github.com/avikivity/kvm.git is f2ae29e3492d41f082bf1e66532c2fe4da65f486 (tag v3.1-rc4-4933-gf2ae29e) qemu-kvm information: 11/07 14:32:02 DEBUG|virt_utils:2587| Git repo qemu_kvm uri: git://github.com/avikivity/qemu.git 11/07 14:32:02 DEBUG|virt_utils:2590| Git repo qemu_kvm branch: master 11/07 14:32:02 DEBUG|virt_utils:2595| Git repo qemu_kvm lbranch: master 11/07 14:32:02 DEBUG|virt_utils:2599| Git repo qemu_kvm commit is not set ... 11/07 14:32:37 DEBUG|base_utils:0074| Running 'git describe' 11/07 14:32:37 INFO |virt_utils:2531| Commit hash for qemu_kvm is 7879db7e9c09b92d9af1c143fbe2cc212ec89e4b (no tag found) But the result is exactly the same: 11/07 14:35:06 INFO | unittest:0052| Running emulator 11/07 14:35:06 DEBUG|virt_env_p:0062| Preprocessing VM 'None' 11/07 14:35:06 INFO |kvm_vm:0790| Running qemu command: /usr/local/autotest/tests/kvm/qemu -name 'None' -nodefaults -vga std -monitor unix:'/tmp/monitor-humanmonitor1-2007-143402-z5b2',server,nowait -qmp unix:'/tmp/monitor-qmpmonitor1-2007-143402-z5b2',server,nowait -serial unix:'/tmp/serial-2007-143402-z5b2',server,nowait -m 512 -smp 2 -kernel '/usr/local/autotest/tests/kvm/unittests/emulator.flat' -vnc :0 -chardev file,id=testlog,path=/tmp/testlog-2007-143402-z5b2 -device testdev,chardev=testlog -S 11/07 14:35:07 DEBUG|kvm_monito:0624| (monitor qmpmonitor1) Sending command 'qmp_capabilities' 11/07 14:35:07 DEBUG|kvm_vm:0851| VM appears to be alive with PID 18680 11/07 14:35:07 DEBUG|kvm_monito:0254| (monitor humanmonitor1) Sending command 'cont' 11/07 14:35:07 INFO | unittest:0096| Waiting for unittest emulator to complete, timeout 600, output in /tmp/testlog-2007-143402-z5b2 11/07 14:35:09 INFO | aexpect:0783| [qemu output] (Process terminated with status 1) 11/07 14:35:09 ERROR| unittest:0104| Unit test emulator failed 11/07 14:35:09 INFO | unittest:0113| Unit test log collected and available under /usr/local/autotest/results/default/kvm.qemu-kvm-git.unittests/debug/emulator.log Please let me know if I can do anything else to help you.
Re: [Qemu-devel] [PATCH] KVM: Add wrapper script around QEMU to test kernels
On Mon, Nov 07, 2011 at 09:53:28PM +0200, Pekka Enberg wrote: > > I'm sure perf developers break the ABI sometimes - that happens > elsewhere in the kernel as well. However, Ted claimed that perf > developers use tools/perf as an excuse to break the ABI _on purpose_ > which is something I have hard time believing. I remember an assertion, probably a year or two ago, probably at the previous year's kernel summit, that one of the reasons for having the perf code inline in the kernel was so that synchronized changes could be made to both the kernel and userspace tool together. So it's not a matter of breaking the ABI _on_ _purpose_, it's an assertion that there is no ABI at all. Since the perf tool and the kernel tool have to be built together, so long as a user does that, no harm, no foul. Recall that Linus has said that he doesn't care about whether or not something is an ABI; he only care if users code don't perceive breakage. If they didn't perceive breakage, then it doesn't matter if an interface is changed. So the real question is whether or not this was an excuse to break the ABI, but whether or not the perf developers acknowledge there is an ABI at all, and whether it's OK for other developers to depend on the syscall interface or not. Actually, though, it shouldn't matter, because intentions don't matter. Recall the powertop/ftrace case. If you expose an interface, and people start using that interface, then you can't break them, period. So as far as Vince is concerned, if you have a userspace library which depends on the perf interface, then you should try out the kernel after each merge window, and if your library breaks, you should complain to Ingo and Linus directly, and request that the commit which broke your tool to be reverted --- because that's the rule; no breakage is allowed. As far as kvm-tool being in the kernel, I still don't see particularly valid arguments for why it should be in the kernel. It can't be the perf argument of "we can make simultaneous changes in the userspace and kernel code", because if those changes break qemu-kvm, then a complaint to Linus will cause the problem code to be reverted. As far as the code using the same coding conventions and naming conventions as the kernel, that to me isn't a particular strong argument either. E2fsprogs uses the Signed-off-by lines, and the same coding conventions of the kernel, and it even has a slightly modified version of two kernel source file in e2fsprogs (e2fsck/recovery.c and e2fsck/revoke.c), plus a header file with data structures that have to be kept in sync with the kernel header file. But that doesn't make it "part of the kernel", and it's not a justification for it to be bundled with the kernel. Personally, I consider code that runs in userspace as a pretty bright line, as being "not kernel code", and while perhaps things like initramfs and the crazy ideas people have had in the past of moving stuff out of kernel/init.c into userspace might have qualified as stuff really close to the kernel, something like kvm-tool that runs way after boot, doesn't even come close. Wine is another example of another package that has lots of close kernel ties, but was also not bundled into the kernel. The precedent has all mainly been on the "keep the kernel separate" side of things, and the arguments for bundling it with the kernel are much weaker, especially since the interface is well-developed, and there are external users of the interface which means you can't make changes to the interface willy-nilly. Indeed, when the perf interface was changing all the time, maybe there was some convenience to have it be bundled with the kernel, so there was no need to negotiate interface version numbers, et. al. But given how it has to link in so many user space libraries, I personally think it's fair to ask the question whether now that it has matured, whether it's time to move it out of the kernel source tree. Regards, - Ted
Re: [Qemu-devel] [RFC] vmstate: Add copyrights for all cpus
On Mon, Nov 07, 2011 at 01:29:35PM -0600, Anthony Liguori wrote: > On 11/07/2011 12:39 PM, Edgar E. Iglesias wrote: > >On Mon, Nov 07, 2011 at 06:38:49PM +0100, Juan Quintela wrote: > >> static bool feature_vfp_needed(void *opaque) > >>diff --git a/target-cris/vmstate-cpu.c b/target-cris/vmstate-cpu.c > >>index 0f732d3..bbccb8b 100644 > >>--- a/target-cris/vmstate-cpu.c > >>+++ b/target-cris/vmstate-cpu.c > >>@@ -1,3 +1,18 @@ > >>+/* > >>+ * Migration support for cris cpus > >>+ * > >>+ * Copyright (C) 2011 Red Hat, Inc. > >>+ * > >>+ * Author(s): > >>+ * Juan Quintela > >>+ * > >>+ * Based on qemu-file support done by: > >>+ * Edgar E. Iglesias > >>+ * > >>+ * This work is licensed under the terms of the GNU GPL, version 2. See > >>+ * the COPYING file in the top-level directory. > >>+ */ > >>+ > >> #include "hw/hw.h" > >> > >> static const VMStateDescription vmstate_tlbset = { > > > > > >Hi Juan, > > > >This is OK with me > > Please give an explicit Acked-by for the commit. I'd like there to > be a clear history in git of these types of changes. OK: Acked-by: Edgar E. Iglesias Cheers
Re: [Qemu-devel] [PATCH] qemu_timedate_diff() shouldn't modify its argument.
On 11/06/2011 06:00 PM, Gleb Natapov wrote: The caller of qemu_timedate_diff() does not expect that tm it passes to the function will be modified, but mktime() is destructive and modifies its argument. Pass a copy of tm to it and set tm_isdst so that mktime() will not rely on it since its value may be outdated. I believe that the original issue was not related to outdated data at the moment of the daylight saving time transition. using tmp.tm_isdst = -1 sounds good, but why use a copy of tm? The only significant field that will change in the tm is the tm_isdst itself that will be set to 0/1 (correctly). Acked-by: Ronen Hod Signed-off-by: Gleb Natapov diff --git a/vl.c b/vl.c index 624da0f..641629b 100644 --- a/vl.c +++ b/vl.c @@ -460,8 +460,11 @@ int qemu_timedate_diff(struct tm *tm) if (rtc_date_offset == -1) if (rtc_utc) seconds = mktimegm(tm); -else -seconds = mktime(tm); +else { +struct tm tmp = *tm; +tmp.tm_isdst = -1; /* use timezone to figure it out */ +seconds = mktime(&tmp); + } else seconds = mktimegm(tm) + rtc_date_offset; -- Gleb.
Re: [Qemu-devel] [PATCH] KVM: Add wrapper script around QEMU to test kernels
On Mon, 7 Nov 2011, Frank Ch. Eigler wrote: The ABI design allows for that kind of flexible extensibility, and it's one of its major advantages. What we *cannot* protect against is you relying on obscure details of the ABI [...] Is there some documentation that clearly spells out which parts of the perf syscall userspace ABI are "obscure" and thus presumably changeable? That's actually something the KVM and virtio folks have done a great job with IMHO. Both ABIs are documented pretty extensively and the specs are kept up to date. I guess for perf ABI, "perf test" is the closest thing to a specification so if your application is using something that's not covered by it, you might be in trouble. Pekka
Re: [Qemu-devel] [PATCH] KVM: Add wrapper script around QEMU to test kernels
Ingo Molnar writes: > [...] >> It's problem enough that there's no way to know what version of the >> perf_event abi you are running against and we have to guess based >> on kernel version. This gets "fun" because all of the vendors have >> backported seemingly random chunks of perf_event code to their >> older kernels. > > The ABI design allows for that kind of flexible extensibility, and > it's one of its major advantages. > > What we *cannot* protect against is you relying on obscure details of > the ABI [...] Is there some documentation that clearly spells out which parts of the perf syscall userspace ABI are "obscure" and thus presumably changeable? > [...] The usual ABI rules also apply: we'll revert everything that > breaks the ABI - but for that you need to report it *in time* [...] If the ABI is so great in its flexible extensibility, how come it can't be flexibly extended without having to passing the burden of compatibility testing & reversion-yawping to someone else? - FChE
[Qemu-devel] [PATCH] docs: Add writing-qmp-commands.txt
Explains how to write QMP commands using the QAPI. Signed-off-by: Luiz Capitulino --- PS: This addresses all comments against the RFC version, plus some additional changes and the "Returning Lists" chapter PPS: Anthony, I think this could be merged for 1.0 as it's only documentation docs/writing-qmp-commands.txt | 642 + 1 files changed, 642 insertions(+), 0 deletions(-) create mode 100644 docs/writing-qmp-commands.txt diff --git a/docs/writing-qmp-commands.txt b/docs/writing-qmp-commands.txt new file mode 100644 index 000..0472fc3 --- /dev/null +++ b/docs/writing-qmp-commands.txt @@ -0,0 +1,642 @@ += How to write QMP commands using the QAPI framework = + +This document is a step-by-step guide on how to write new QMP commands using +the QAPI framework. It also shows how to implement new style HMP commands. + +This document doesn't discuss QMP protocol level details, nor does it dive +into the QAPI framework implementation. + +For an in-depth introduction to the QAPI framework, please refer to +docs/qapi-code-gen.txt. For documentation about the QMP protocol, please +check the files in QMP/. + +== Overview == + +Generally speaking, the following steps should be taken in order to write a +new QMP command. + +1. Write the command's and type(s) specification in the QAPI schema file + (qapi-schema.json in the root source directory) + +2. Write the QMP command itself, which is a regular C function. Preferably, + the command should be exported by some QEMU subsystem. But it can also be + added to the qmp.c file + +3. At this point the command can be tested under the QMP protocol + +4. Write the HMP command equivalent. This is not required and should only be + done if it does make sense to have the functionality in HMP. The HMP command + is implemented in terms of the QMP command + +The following sections will demonstrate each of the steps above. We will start +very simple and get more complex as we progress. + +=== Testing === + +For all the examples in the next sections, the test setup is the same and is +shown here. + +First, QEMU should be started as: + +# /path/to/your/source/qemu [...] \ +-chardev socket,id=qmp,port=,host=localhost,server \ +-mon chardev=qmp,mode=control,pretty=on + +Then, in a different terminal: + +$ telnet localhost +Trying 127.0.0.1... +Connected to localhost. +Escape character is '^]'. +{ +"QMP": { +"version": { +"qemu": { +"micro": 50, +"minor": 15, +"major": 0 +}, +"package": "" +}, +"capabilities": [ +] +} +} + +The above output is the QMP server saying you're connected. The server is +actually in capabilities negotiation mode. To enter in command mode type: + +{ "execute": "qmp_capabilities" } + +Then the server should respond: + +{ +"return": { +} +} + +Which is QMP's way of saying "the latest command executed OK and didn't return +any data". Now you're ready to enter the QMP example commands as explained in +the following sections. + +== Writing a command that doesn't return data == + +That's the most simple QMP command that can be written. Usually, this kind of +command carries some meaningful action in QEMU but here it will just print +"Hello, world" to the standard output. + +Our command will be called "hello-world". It takes no arguments, nor does it +return any data. + +The first step is to add the following line to the bottom of the +qapi-schema.json file: + +{ 'command': 'hello-world' } + +The "command" keyword defines a new QMP command. It's an JSON object. All +schema entries are JSON objects. The line above will instruct the QAPI to +generate any prototypes and the necessary code to marshal and unmarshal +protocol data. + +The next step is to write the "hello-world" implementation. As explained +earlier, it's preferable for commands to live in QEMU subsystems. But +"hello-world" doesn't pertain to any, so we put its implementation in qmp.c: + +void qmp_hello_world(Error **errp) +{ +printf("Hello, world!\n"); +} + +There are a few things to be noticed: + +1. QMP command implementation functions must be prefixed with "qmp_" +2. qmp_hello_world() returns void, this is in accordance with the fact that the + command doesn't return any data +3. It takes an "Error **" argument. This is required. Later we will see how to + return errors and take additional arguments. The Error argument should not + be touched if the command doesn't return errors +4. We won't add the function's prototype. That's automatically done by the QAPI +5. Printing to the terminal is discouraged for QMP commands, we do it here + because it's the easiest way to demonstrate a QMP command + +Now a little hack is needed. As we're still using the old QMP server we need +to add the new command to its internal dispatch table. This step won't be +required in the near future. Open the qmp
Re: [Qemu-devel] [PATCH] KVM: Add wrapper script around QEMU to test kernels
On Mon, 7 Nov 2011, Pekka Enberg wrote: >> I've never heard ABI incompatibility used as an argument for perf. Ingo? On Mon, Nov 7, 2011 at 7:03 PM, Vince Weaver wrote: > Never overtly. They're too clever for that. If you want me to take you seriously, spare me from the conspiracy theories, OK? I'm sure perf developers break the ABI sometimes - that happens elsewhere in the kernel as well. However, Ted claimed that perf developers use tools/perf as an excuse to break the ABI _on purpose_ which is something I have hard time believing. Your snarky remarks doesn't really help this discussion either. It's apparent from the LKML discussions that you're more interested in arguing with the perf developers rather than helping them. Pekka
Re: [Qemu-devel] [RFC] vmstate: Add copyrights for all cpus
Am Montag 07 November 2011, 18:38:49 schrieb Juan Quintela: > diff --git a/target-lm32/vmstate-cpu.c b/target-lm32/vmstate-cpu.c > index 60b4b29..99ce957 100644 > --- a/target-lm32/vmstate-cpu.c > +++ b/target-lm32/vmstate-cpu.c > @@ -1,3 +1,18 @@ > +/* > + * Migration support for lm32 cpus > + * > + * Copyright (C) 2011 Red Hat, Inc. > + * > + * Author(s): > + * Juan Quintela > + * > + * Based on qemu-file support done by: > + * Michael Walle > + * > + * This work is licensed under the terms of the GNU GPL, version 2. See > + * the COPYING file in the top-level directory. > + */ > + > #include "hw/hw.h" > > const VMStateDescription vmstate_cpu = { Acked-By: Michael Walle -- Michael
Re: [Qemu-devel] [RFC] vmstate: Add copyrights for all cpus
On 11/07/2011 09:38 AM, Juan Quintela wrote: > alpha: > CC: Richard Henderson Acked-by: Richard Henderson r~
Re: [Qemu-devel] [RFC] vmstate: Add copyrights for all cpus
Paolo Bonzini wrote: > On 11/07/2011 06:38 PM, Juan Quintela wrote: >> + * This work is licensed under the terms of the GNU GPL, version 2. See >> + * the COPYING file in the top-level directory. > > ... or later please. I thougth I had forgot something to ask O:-) Later, Juan.
Re: [Qemu-devel] [RFC] vmstate: Add copyrights for all cpus
On 11/07/2011 12:39 PM, Edgar E. Iglesias wrote: On Mon, Nov 07, 2011 at 06:38:49PM +0100, Juan Quintela wrote: static bool feature_vfp_needed(void *opaque) diff --git a/target-cris/vmstate-cpu.c b/target-cris/vmstate-cpu.c index 0f732d3..bbccb8b 100644 --- a/target-cris/vmstate-cpu.c +++ b/target-cris/vmstate-cpu.c @@ -1,3 +1,18 @@ +/* + * Migration support for cris cpus + * + * Copyright (C) 2011 Red Hat, Inc. + * + * Author(s): + * Juan Quintela + * + * Based on qemu-file support done by: + * Edgar E. Iglesias + * + * This work is licensed under the terms of the GNU GPL, version 2. See + * the COPYING file in the top-level directory. + */ + #include "hw/hw.h" static const VMStateDescription vmstate_tlbset = { Hi Juan, This is OK with me Please give an explicit Acked-by for the commit. I'd like there to be a clear history in git of these types of changes. Regards, Anthony Liguori Thanks, Edgar
Re: [Qemu-devel] [PATCH] qemu_timedate_diff() shouldn't modify its argument.
On 11/06/2011 11:00 AM, Gleb Natapov wrote: The caller of qemu_timedate_diff() does not expect that tm it passes to the function will be modified, but mktime() is destructive and modifies its argument. Pass a copy of tm to it and set tm_isdst so that mktime() will not rely on it since its value may be outdated. Ohhh, nice catch. Signed-off-by: Gleb Natapov Acked-by: Rik van Riel -- All rights reversed
Re: [Qemu-devel] [RFC] vmstate: Add copyrights for all cpus
On Mon, Nov 07, 2011 at 06:38:49PM +0100, Juan Quintela wrote: > Hi > > This patch adds copyrights to all the machine description files for > all architectures supported. (this is done on top of my vmstate-cpus > series patches) The problem? > > - What should we put as "copyirght" owners. > > Althought I modified almost every line of the files, mostly of the > changes are a conversion, so claiming myself as the only "copyright" > owner sounds at least pretentious, and more than probably false. > > I tried to "dig" into the git logs and tried to came with "whoever" > commit the initial cpu_save/load foar each architecture. I have put > them as: > > * Based on qemu-file support done by: > * Richard Henderson > > But I would preffer that the persons involved state what copyright > notice they want, name, address, year(s), etc. (Some architectures > already have a propper copyright notice, I didn't touch them), and > others had an empty file (I put mine there on the previosu series). > > Several of the logs are from the svn days, and then I don't know if > the person was the committer, or the author. If anyone contributed > to the functionality and want to add its copyright, please told me. > > To make things more complicated, when machine.c files were split from > vl.c, they didn't carry any copyright notice at all, should we copy > back everything from vl.c? > > To make things more complicated, it looks like Thiemo Seufer did the > original mips support, and he passed away. So he can't obviously > comment. > > Anthony asked me to send a patch to the list, asking form comments. > > alpha: > CC: Richard Henderson > > arm: > CC: Andrzej Zaborowski > > (it appears as balrog, but on irc channel peter told me that balrog > has him) > > cris: > CC: Edgar E. Iglesias ... > static bool feature_vfp_needed(void *opaque) > diff --git a/target-cris/vmstate-cpu.c b/target-cris/vmstate-cpu.c > index 0f732d3..bbccb8b 100644 > --- a/target-cris/vmstate-cpu.c > +++ b/target-cris/vmstate-cpu.c > @@ -1,3 +1,18 @@ > +/* > + * Migration support for cris cpus > + * > + * Copyright (C) 2011 Red Hat, Inc. > + * > + * Author(s): > + * Juan Quintela > + * > + * Based on qemu-file support done by: > + * Edgar E. Iglesias > + * > + * This work is licensed under the terms of the GNU GPL, version 2. See > + * the COPYING file in the top-level directory. > + */ > + > #include "hw/hw.h" > > static const VMStateDescription vmstate_tlbset = { Hi Juan, This is OK with me Thanks, Edgar
[Qemu-devel] Summary of CD, DVD, BD passthrough tests with -drive if=virtio : Full success
Hi, here are test results from the USB attached BD recorder. It is a SATA drive 'Optiarc' 'BD RW BD-5300S' rev '1.04' at a 'JMicron' 'JM20336 SATA, USB Combo' controller. -- qemu start command is the same as with the internal SATA drive and my previous if=virtio tests x86_64-softmmu/qemu-system-x86_64 \ -L .../pc-bios \ -enable-kvm \ -nographic \ -m 512 \ -net nic,model=ne2k_pci \ -net user,hostfwd=tcp::5557-:22 \ -hda /dvdbuffer/i386-install.qemu \ -drive file=/dev/sg3,if=virtio \ -cdrom /dvdbuffer/pseudo_drive -- BD-RE The drive gets listed by xorriso option -devices (for libburn internal reasons it needs a softlink pointing from /dev/sr1 to /dev/vda). Media inquiry works fine. (Disc is a VERBATIM/IM0/0.) Reading of ISO 9660 filesytem works fine (via READ(10), not via read()). MD5s of content (from host tests) match. Writing with WRITE(10) and Defect Management works fine. Writing with WRITE(12) and Streaming bit (i.e. no Defect Management) works fine and at full nominal medium speed. Listing of available format descriptors works fine. Re-formatting works fine (with progress report by REQUEST SENSE). BD-R Media inquiry of blank medium works fine. (Disc is a VERBATIM/IMw/0, an LTH medium, not suitable for older BD drives.) Formatting works fine. (Drive reports no progress with TEST UNIT READY or REQUEST SENSE. Both commands indicate that the drive is busy. So this is a shortcomming of the drive and not of qemu.) Media inquiry of formatted medium works fine. Writing with WRITE(10) and Defect Management works fine. Writing with WRITE(12) and Streaming bit (i.e. no Defect Management) works fine and at full nominal medium speed. Reading of ISO 9660 filesytem works fine. MD5s match. (I do not test with unformatted BD-R, because there is no difference in the SCSI commands used for writing.) Closing after writing a final session works fine. Now for the most mediocre member of the DVD family: DVD-RAM (I really did find one that still works) Reading of old content works fine. Writing an ISO 9960 session works with WRITE(12) and Streaming bit as good as DVD-RAM can do. One file MD5 sum did not match. (So even this read test was challenged and tested.) Writing an ISO 9960 session works with WRITE(10) and Defect Management. Re-formatting works fine (with progress report by REQUEST SENSE). --- I repeat a consolidated summary of my successful tests with other media types with above qemu start command. DVD recorder was 'TSSTcorp' 'CDDVDW SH-S223B' rev SB02'. No test really failed. Some aspects of qemu virtio drives surprised my libburn and will have to be addressed in the next realeases of libburn and GNU xorriso. --- All media types Tray loading and ejecting. CD-RW (and most probably CD-R) Writing a session with a single track with write type TAO. As first session and as add-on session. Burning a CD SAO session to blank medium. Blanking in mode fast. Closing CD-RW after writing a TAO session. DVD in general Telling the drive the desired DVD write speed by SET STREAMING. DVD+RW Writing to thoroughly formatted DVD+RW. Writing to partly formatted DVD+RW and new unformatted DVD+RW. (With background formatting.) DVD-RW (and most probably DVD-R) Blanking. Writing with write type DAO to blank sequential DVD-RW. Writing with write type Incremental (aka Packet), multiple sessions. Closing sequential DVD-RW after an Incremental session. Formatting DVD-RW from sequential profile 0x14 to overwritable profile 0x13. Writing to partly formatted and to thoroughly formatted DVD-RW. DVD+R Writing sessions without using RESERVE TRACK. Writing sessions with RESERVE TRACK. Closing of medium. Tomorrow i plan to test Paolo Bonzini's proposal -drive file=/dev/sr0,if=none,id=scsicd -device virtio-blk,drive=scsicd,logical_block_size=2048,physical_block_size=2048 with some of the re-usable media types. Have a nice day :) Thomas
Re: [Qemu-devel] Secure KVM
On 11/07/2011 11:52 AM, Sasha Levin wrote: Hi Anthony, Thank you for your comments! On Mon, 2011-11-07 at 11:37 -0600, Anthony Liguori wrote: On 11/06/2011 02:40 PM, Sasha Levin wrote: Hi all, I'm planning on doing a small fork of the KVM tool to turn it into a 'Secure KVM' enabled hypervisor. Now you probably ask yourself, Huh? The idea was discussed briefly couple of months ago, but never got off the ground - which is a shame IMO. It's easy to explain the problem: If an attacker finds a security hole in any of the devices which are exposed to the guest, the attacker would be able to either crash the guest, or possibly run code on the host itself. The solution is also simple to explain: Split the devices into different processes and use seccomp to sandbox each device into the exact set of resources it needs to operate, nothing more and nothing less. Since I'll be basing it on the KVM tool, which doesn't really emulate that many legacy devices, I'll focus first on the virtio family for the sake of simplicity (and covering 90% of the options). This is my basic overview of how I'm planning on implementing the initial POC: 1. First I'll focus on the simple virtio-rng device, it's simple enough to allow us to focus on the aspects which are important for the POC while still covering most bases (i.e. sandbox to single file - /dev/urandom and such). 2. Do it on a one process per device concept, where for each device (notice - not device *type*) requested, a new process which handles it will be spawned. 3. That process will be limited exactly to the resources it needs to operate, for example - if we run a virtio-blk device, it would be able to access only the image file which it should be using. 4. Connection between hypervisor and devices will be based on unix sockets, this should allow for better separation compared to other approaches such as shared memory. 5. While performance is an aspect, complete isolation is more important. Security is primary, performance is secondary. 6. Share as much code as possible with current implementation of virtio devices, make it possible to run virtio devices either like it's being done now, or by spawning them as separate processes - the amount of specific code for the separate process case should be minimal. Thats all I have for now, comments are *very* welcome. I thought about this a bit and have some ideas that may or may not help. 1) If you add device save/load support, then it's something you can potentially use to give yourself quite a bit of flexibility in changing the sandbox. At any point in run time, you can save the device model's state in the sandbox, destroy the sandbox, and then build a new sandbox and restore the device to its former state. This might turn out to be very useful in supporting things like device hotplug and/or memory hot plug. 2) I think it's largely possible to implement all device emulation without doing any dynamic memory allocation. Since memory allocation DoS is something you have to deal with anyway, I suspect most device emulation already uses a fixed amount of memory per device. This can potentially dramatically simplify things. 3) I think virtio can/should be used as a generic "backend to frontend" transport between the device model and the tool. virtio requires server and client to have shared memory, so if we already go with shared memory we can just let the device manage the actual virtio driver directly, no? Let's say you're implementing an IDE device model in the sandbox. You can try to implement the block layer in the sandbox but I think that quickly will become too difficult. You can do as Avi suggested and do all DMA accesses from the IDE device model as RPCs, or you can map guest memory as shared memory and utilize (1) in order to change that mapping as you need to. At some point, you end up with a struct iovec and an offset that you want to read/write to the virtual disk. You need a way to send that to the "frontend" that will then handle that as a raw/qcow2 request. Well, virtio is great at doing exactly that :-) So if you increase your shared memory to have a little bit extra to stick another vring, you can use that for device model -> front end communication without paying an extra memcpy. For notifications, the easiest thing to do is setup an "event channel" bitmap and use a single eventfd to multiplex that event channel bitmap. This is pretty much how Xen works btw. A single interrupt is reserved and a bitmap is used to dispatch the actual events. So the sandbox loop would look like: void main() { setup_devices(); read_from_event_channel(main_channel); for i in vrings: check_vring_notification(i); } Once vring would be used for dispatching PIO/MMIO. The remaining vrings could be used for anything really. Like I mentioned elsewhere, just think of the sandbox as just an extension of the guests firmware. The purpose of the sandbox is to reduce a
Re: [Qemu-devel] [PATCH] KVM: Add wrapper script around QEMU to test kernels
* Vince Weaver wrote: > On Mon, 7 Nov 2011, Pekka Enberg wrote: > > > I've never heard ABI incompatibility used as an argument for > > perf. Ingo? Correct, the ABI has been designed in a way to make it really hard to break the ABI via either directed backports or other mess-ups. The ABI is both backwards *and* forwards ABI compatible, which is very rare amongst Linux ABIs. For frequently used tools, such as perf, there's no ABI compatibility problem in practice: using newer perf on older kernels is pretty common. Using older perf on new kernels is rarer, but that generally works too. In hindsight being in the kernel repo made it *easier* for perf to implement a good, stable ABI while also keeping a very high rate of change of the subsystem: changes are more 'concentrated' and people can stay focused on the ball to extend the ABI in sensible ways instead of struggling with project boundary artifacts. I think we needed to do only one revert along the way in the past two years, to fix an unintended ABI breakage in PowerTop. Considering the total complexity of the perf ABI our compatibility track record is *very* good. > Never overtly. They're too clever for that. Pekka, Vince has meanwhile become the resident perf critic on lkml, always in it when it comes to some perf-bashing: > In any case, as a primary developer of a library (PAPI) that uses > the perf_events ABI I have to say that having perf in the kernel > has been a *major* pain for us. ... and you have argued against perf from the very first day on, when you were one of the perfmon developers - and IMO in hindsight you've been repeatedly wrong about most of your design arguments. > Unlike the perf developers, we *do* have to maintain backwards > compatability. [...] We do too, i use new perf on older distro kernels all the time. If you see a breakage of functionality that tools use and report in a timely fashion then please report it. > [...] And we have a lot of nasty code in PAPI to handle this. > Entirely because the perf_events ABI is not stable. It's mostly > stable, but there are enough regressions to be a pain. You are blaming the wrong guys really. The PAPI project has the (fundamental) problem that you are still doing it in the old-style sw design fashion, with many months long delays in testing, and then you are blaming the problems you inevitably meet with that model on *us*. There was one PAPI incident i remember where it took you several *months* to report a regression in a regular PAPI test-case (no actual app affected as far as i know). No other tester ever ran the PAPI testcases so nobody else reported it. Moving perf out of the kernel would make that particular situation *worse*, by further increasing the latency of fixes and by further increasing the risk of breakages. Sorry, but you are trying to "fix" perf by dragging it down to your bad level of design and we will understandably resist that ... > It's problem enough that there's no way to know what version of the > perf_event abi you are running against and we have to guess based > on kernel version. This gets "fun" because all of the vendors have > backported seemingly random chunks of perf_event code to their > older kernels. The ABI design allows for that kind of flexible extensibility, and it's one of its major advantages. What we *cannot* protect against is you relying on obscure details of the ABI without adding it to 'perf test' and then not testing the upstream kernel in a timely enough fashion either ... Nobody but you tests PAPI so you need to become *part* of the upstream development process, which releases a new upstream kernel every 3 months. > And it often does seem as the perf developers don't care when > something breaks in perf_events if it doesn't affect perf users. I have to reject your slander, both Peter, Arnaldo and me care deeply about fixing regressions and i've personally applied fixes out of order that addressed some sort of PAPI problem - whenever you chose to report them. Vince, you are wrong and you have also become somewhat malicious in your arguments - please stop it. > For example, the new NMI watchdog severely breaks perf_event event > allocation if you are using FORMAT_GROUP. perf doesn't use this > though, so none of the kernel developers seem to care. And unless > I can quickly come up with a patch as an outsider, a few kernel > versions will go by and the kernel devs will declare "well it was > broken so long, now we don't have to fix it". Fun. Face it, the *real* problem is that beyond yourself very few people who use a new kernel use PAPI and your long latency of testing exposes you to breakages in a much more agile subsystem such as perf. Please fix that instead of blaming it on others. Also, as i mentioned it several times before, you are free to add an arbitrary number of ABI test-cases to 'perf test' and we can promise that we run that. Right now
Re: [Qemu-devel] Secure KVM
Hi Anthony, Thank you for your comments! On Mon, 2011-11-07 at 11:37 -0600, Anthony Liguori wrote: > On 11/06/2011 02:40 PM, Sasha Levin wrote: > > Hi all, > > > > I'm planning on doing a small fork of the KVM tool to turn it into a > > 'Secure KVM' enabled hypervisor. Now you probably ask yourself, Huh? > > > > The idea was discussed briefly couple of months ago, but never got off > > the ground - which is a shame IMO. > > > > It's easy to explain the problem: If an attacker finds a security hole > > in any of the devices which are exposed to the guest, the attacker would > > be able to either crash the guest, or possibly run code on the host > > itself. > > > > The solution is also simple to explain: Split the devices into different > > processes and use seccomp to sandbox each device into the exact set of > > resources it needs to operate, nothing more and nothing less. > > > > Since I'll be basing it on the KVM tool, which doesn't really emulate > > that many legacy devices, I'll focus first on the virtio family for the > > sake of simplicity (and covering 90% of the options). > > > > This is my basic overview of how I'm planning on implementing the > > initial POC: > > > > 1. First I'll focus on the simple virtio-rng device, it's simple enough > > to allow us to focus on the aspects which are important for the POC > > while still covering most bases (i.e. sandbox to single file > > - /dev/urandom and such). > > > > 2. Do it on a one process per device concept, where for each device > > (notice - not device *type*) requested, a new process which handles it > > will be spawned. > > > > 3. That process will be limited exactly to the resources it needs to > > operate, for example - if we run a virtio-blk device, it would be able > > to access only the image file which it should be using. > > > > 4. Connection between hypervisor and devices will be based on unix > > sockets, this should allow for better separation compared to other > > approaches such as shared memory. > > > > 5. While performance is an aspect, complete isolation is more important. > > Security is primary, performance is secondary. > > > > 6. Share as much code as possible with current implementation of virtio > > devices, make it possible to run virtio devices either like it's being > > done now, or by spawning them as separate processes - the amount of > > specific code for the separate process case should be minimal. > > > > > > Thats all I have for now, comments are *very* welcome. > > I thought about this a bit and have some ideas that may or may not help. > > 1) If you add device save/load support, then it's something you can > potentially > use to give yourself quite a bit of flexibility in changing the sandbox. At > any > point in run time, you can save the device model's state in the sandbox, > destroy > the sandbox, and then build a new sandbox and restore the device to its > former > state. > > This might turn out to be very useful in supporting things like device > hotplug > and/or memory hot plug. > > 2) I think it's largely possible to implement all device emulation without > doing > any dynamic memory allocation. Since memory allocation DoS is something you > have to deal with anyway, I suspect most device emulation already uses a > fixed > amount of memory per device. This can potentially dramatically simplify > things. > > 3) I think virtio can/should be used as a generic "backend to frontend" > transport between the device model and the tool. virtio requires server and client to have shared memory, so if we already go with shared memory we can just let the device manage the actual virtio driver directly, no? Also, things like interrupts would also require some sort of a different IPC, which would complicate things a bit. > 4) Lack of select() is really challenging. I understand why it's not there > since it can technically be emulated but it seems like a no-risk syscall to > whitelist and it would make programming in a sandbox so much easier. Maybe > Andrea has some comments here? I might be missing something here. There are several of these which would be nice to have, and if we can get seccomp filters we have good flexibility with which APIs we allow for each device. > Regards, > > Anthony Liguori > > > > -- Sasha.
Re: [Qemu-devel] [RFC] vmstate: Add copyrights for all cpus
On 11/07/2011 06:38 PM, Juan Quintela wrote: + * This work is licensed under the terms of the GNU GPL, version 2. See + * the COPYING file in the top-level directory. ... or later please. Paolo
[Qemu-devel] [RFC] vmstate: Add copyrights for all cpus
Hi This patch adds copyrights to all the machine description files for all architectures supported. (this is done on top of my vmstate-cpus series patches) The problem? - What should we put as "copyirght" owners. Althought I modified almost every line of the files, mostly of the changes are a conversion, so claiming myself as the only "copyright" owner sounds at least pretentious, and more than probably false. I tried to "dig" into the git logs and tried to came with "whoever" commit the initial cpu_save/load foar each architecture. I have put them as: * Based on qemu-file support done by: * Richard Henderson But I would preffer that the persons involved state what copyright notice they want, name, address, year(s), etc. (Some architectures already have a propper copyright notice, I didn't touch them), and others had an empty file (I put mine there on the previosu series). Several of the logs are from the svn days, and then I don't know if the person was the committer, or the author. If anyone contributed to the functionality and want to add its copyright, please told me. To make things more complicated, when machine.c files were split from vl.c, they didn't carry any copyright notice at all, should we copy back everything from vl.c? To make things more complicated, it looks like Thiemo Seufer did the original mips support, and he passed away. So he can't obviously comment. Anthony asked me to send a patch to the list, asking form comments. alpha: CC: Richard Henderson arm: CC: Andrzej Zaborowski (it appears as balrog, but on irc channel peter told me that balrog has him) cris: CC: Edgar E. Iglesias i386: Fabrice Bellard? * Copyright (c) 2003-2008 Fabrice Bellard Didn't cc'd him because he left project/didn't have email address on MAINTAINERS/vl.c. If you think that I should cc'd to him, just let me know. lm32: CC: Michael Walle mips: Thiemo Seufer? ppc & sparc: CC: Blue Swirl What do you think, what should we do? Juan. Signed-off-by: Juan Quintela --- target-alpha/vmstate-cpu.c | 15 +++ target-arm/vmstate-cpu.c | 15 +++ target-cris/vmstate-cpu.c | 15 +++ target-i386/vmstate-cpu.c | 15 +++ target-lm32/vmstate-cpu.c | 15 +++ target-mips/vmstate-cpu.c | 15 +++ target-ppc/vmstate-cpu.c | 15 +++ target-sparc/vmstate-cpu.c | 15 +++ 8 files changed, 120 insertions(+), 0 deletions(-) diff --git a/target-alpha/vmstate-cpu.c b/target-alpha/vmstate-cpu.c index 156cb74..5525fab 100644 --- a/target-alpha/vmstate-cpu.c +++ b/target-alpha/vmstate-cpu.c @@ -1,3 +1,18 @@ +/* + * Migration support for alpha cpus + * + * Copyright (C) 2011 Red Hat, Inc. + * + * Author(s): + * Juan Quintela + * + * Based on qemu-file support done by: + * Richard Henderson + * + * This work is licensed under the terms of the GNU GPL, version 2. See + * the COPYING file in the top-level directory. + */ + #include "hw/hw.h" static int get_fpcr(QEMUFile *f, void *opaque, size_t size) diff --git a/target-arm/vmstate-cpu.c b/target-arm/vmstate-cpu.c index 836d9ed..4435540 100644 --- a/target-arm/vmstate-cpu.c +++ b/target-arm/vmstate-cpu.c @@ -1,3 +1,18 @@ +/* + * Migration support for arm cpus + * + * Copyright (C) 2011 Red Hat, Inc. + * + * Author(s): + * Juan Quintela + * + * Based on qemu-file support done by: + * Andrzej Zaborowski + * + * This work is licensed under the terms of the GNU GPL, version 2. See + * the COPYING file in the top-level directory. + */ + #include "hw/hw.h" static bool feature_vfp_needed(void *opaque) diff --git a/target-cris/vmstate-cpu.c b/target-cris/vmstate-cpu.c index 0f732d3..bbccb8b 100644 --- a/target-cris/vmstate-cpu.c +++ b/target-cris/vmstate-cpu.c @@ -1,3 +1,18 @@ +/* + * Migration support for cris cpus + * + * Copyright (C) 2011 Red Hat, Inc. + * + * Author(s): + * Juan Quintela + * + * Based on qemu-file support done by: + * Edgar E. Iglesias + * + * This work is licensed under the terms of the GNU GPL, version 2. See + * the COPYING file in the top-level directory. + */ + #include "hw/hw.h" static const VMStateDescription vmstate_tlbset = { diff --git a/target-i386/vmstate-cpu.c b/target-i386/vmstate-cpu.c index 838983e..8bb7ca8 100644 --- a/target-i386/vmstate-cpu.c +++ b/target-i386/vmstate-cpu.c @@ -1,3 +1,18 @@ +/* + * Migration support for x86 cpus + * + * Copyright (C) 2011 Red Hat, Inc. + * + * Author(s): + * Juan Quintela + * + * Based on qemu-file support done by: + * Fabrice Bellard + * + * This work is licensed under the terms of the GNU GPL, version 2. See + * the COPYING file in the top-level directory. + */ + #include "hw/hw.h" static const VMStateDescription vmstate_segment = { diff --git a/target-lm32/vmstate-cpu.c b/target-lm32/vmstate-cpu.c index 60b4b29..99ce957 100644 --- a/target-lm32/vmstate-cpu.c +++ b/target-lm32/vmstate-cpu.c @@ -1,3 +1,18 @@ +/* + * Migration support for
[Qemu-devel] How to build QEMU from git
Hi -- I ran into a problem running the qemu-0.15 and I was told that the problem was fixed in the 1.0 release candidate. I don't find a package with the RC on http://wiki.qemu.org/Download, so I cloned the git repo and tried to build qemu. I didn't find any instructions on how to build qemu on the wiki, so I winged it with the following configure: configure \ --enable-sdl\ --enable-curses \ --enable-curl \ --enable-kvm\ --enable-nptl \ --enable-uuid \ --enable-linux-aio \ --enable-io-thread \ --enable-debug Is this correct, or is there a better set of options? When I run make, there are a number of compiles which fail because of variables being set but not used. I've modified the code to eliminate some of them to allow the build to continue. Is there a simpler way around this? After getting past these issues, libhw32/serial.o is not being built: make[1]: *** No rule to make target `../libhw32/serial.o', needed by `qemu'. Stop. make[1]: Leaving directory `/local/eager/source/qemu/build/i386-softmmu' make: *** [subdir-i386-softmmu] Error Is there a fix for this problem? -- Michael Eagerea...@eagercon.com 1960 Park Blvd., Palo Alto, CA 94306 650-325-8077
Re: [Qemu-devel] Secure KVM
On 11/06/2011 02:40 PM, Sasha Levin wrote: Hi all, I'm planning on doing a small fork of the KVM tool to turn it into a 'Secure KVM' enabled hypervisor. Now you probably ask yourself, Huh? The idea was discussed briefly couple of months ago, but never got off the ground - which is a shame IMO. It's easy to explain the problem: If an attacker finds a security hole in any of the devices which are exposed to the guest, the attacker would be able to either crash the guest, or possibly run code on the host itself. The solution is also simple to explain: Split the devices into different processes and use seccomp to sandbox each device into the exact set of resources it needs to operate, nothing more and nothing less. Since I'll be basing it on the KVM tool, which doesn't really emulate that many legacy devices, I'll focus first on the virtio family for the sake of simplicity (and covering 90% of the options). This is my basic overview of how I'm planning on implementing the initial POC: 1. First I'll focus on the simple virtio-rng device, it's simple enough to allow us to focus on the aspects which are important for the POC while still covering most bases (i.e. sandbox to single file - /dev/urandom and such). 2. Do it on a one process per device concept, where for each device (notice - not device *type*) requested, a new process which handles it will be spawned. 3. That process will be limited exactly to the resources it needs to operate, for example - if we run a virtio-blk device, it would be able to access only the image file which it should be using. 4. Connection between hypervisor and devices will be based on unix sockets, this should allow for better separation compared to other approaches such as shared memory. 5. While performance is an aspect, complete isolation is more important. Security is primary, performance is secondary. 6. Share as much code as possible with current implementation of virtio devices, make it possible to run virtio devices either like it's being done now, or by spawning them as separate processes - the amount of specific code for the separate process case should be minimal. Thats all I have for now, comments are *very* welcome. I thought about this a bit and have some ideas that may or may not help. 1) If you add device save/load support, then it's something you can potentially use to give yourself quite a bit of flexibility in changing the sandbox. At any point in run time, you can save the device model's state in the sandbox, destroy the sandbox, and then build a new sandbox and restore the device to its former state. This might turn out to be very useful in supporting things like device hotplug and/or memory hot plug. 2) I think it's largely possible to implement all device emulation without doing any dynamic memory allocation. Since memory allocation DoS is something you have to deal with anyway, I suspect most device emulation already uses a fixed amount of memory per device. This can potentially dramatically simplify things. 3) I think virtio can/should be used as a generic "backend to frontend" transport between the device model and the tool. 4) Lack of select() is really challenging. I understand why it's not there since it can technically be emulated but it seems like a no-risk syscall to whitelist and it would make programming in a sandbox so much easier. Maybe Andrea has some comments here? I might be missing something here. Regards, Anthony Liguori
[Qemu-devel] [PATCH 05/11] block/cloop: Use g_free instead of free
From: Dong Xu Wang Fix mismatching allocation and deallocation: g_free should be used to pair with g_malloc. Reviewed-by: Andreas Färber Reviewed_by: Ray Wang Signed-off-by: Dong Xu Wang Signed-off-by: Kevin Wolf --- block/cloop.c |6 +++--- 1 files changed, 3 insertions(+), 3 deletions(-) diff --git a/block/cloop.c b/block/cloop.c index 799b6c2..7570eb8 100644 --- a/block/cloop.c +++ b/block/cloop.c @@ -171,10 +171,10 @@ static void cloop_close(BlockDriverState *bs) { BDRVCloopState *s = bs->opaque; if (s->n_blocks > 0) { -free(s->offsets); +g_free(s->offsets); } -free(s->compressed_block); -free(s->uncompressed_block); +g_free(s->compressed_block); +g_free(s->uncompressed_block); inflateEnd(&s->zstream); } -- 1.7.6.4
Re: [Qemu-devel] [qemu-kvm unittest regression] Re: Autotest | Job ID: 2011 "Upstream qemu-kvm.git sanity 11-01-2011 00:04:02" | Status: 1 Completed | Success Rate: 94.74 %
On 11/07/2011 02:22 PM, Marcelo Tosatti wrote: On Mon, Nov 07, 2011 at 09:00:26AM -0300, Cleber Rosa wrote: On 11/07/2011 07:21 AM, Marcelo Tosatti wrote: On Tue, Nov 01, 2011 at 02:08:54PM -0200, Lucas Meneghel Rodrigues wrote: On 11/01/2011 12:17 PM, kvm-autotest wrote: Job ID: 2011 Job name: Upstream qemu-kvm.git sanity 11-01-2011 00:04:02 Summary: Host: Status: Completed Status: 1 Completed Execution time (HH:MM:SS): 01:17:02 User tests executed: 19 User tests passed: 18 User tests failed: 1 User tests success rate: 94.74 % Failures: Test Name Status Reason kvm.qemu-kvm-git.unittests FAIL Unit tests failed: emulator Hi Marcelo, Avi: We've seen emulator unittest failures during the last couple of jobs of qemu-kvm.git userspace + kvm.git kernel. Relevant hashes for the last failure seen: 11/01 09:33:59 INFO |virt_utils:0501| Commit hash for git://github.com/avikivity/kvm.git is b796a09c5d808f4013f27ad45953db604dac18fd (tag v3.1-rc4-10168-gb796a09) 11/01 09:50:57 DEBUG|virt_utils:2587| Git repo qemu_kvm uri: git://github.com/avikivity/qemu.git 11/01 09:51:52 INFO |virt_utils:2531| Commit hash for qemu_kvm is 7879db7e9c09b92d9af1c143fbe2cc212ec89e4b (no tag found) Cheers, Lucas Is there a log with more details available? Marcelo, The Autotest "landing" page for this job is: http://autotest.virt.bos.redhat.com/afe/#tab_id=view_job&object_id=2011 You can jump straight to the unittest logs by going to: http://autotest.virt.bos.redhat.com/results/2011-autotest/virtlab201.virt.bos.redhat.com/kvm.qemu-kvm-git.unittests/debug/ And this is the autotest log entries we got while running emulator: 11/01 09:54:32 INFO | unittest:0052| Running emulator 11/01 09:54:32 DEBUG|virt_env_p:0052| Preprocessing VM 'None' 11/01 09:54:32 INFO |kvm_vm:0790| Running qemu command: /usr/local/autotest/tests/kvm/qemu -name 'None' -nodefaults -vga std -monitor unix:'/tmp/monitor-humanmonitor1-2001-095328-ESmp',server,nowait -qmp unix:'/tmp/monitor-qmpmonitor1-2001-095328-ESmp',server,nowait -serial unix:'/tmp/serial-2001-095328-ESmp',server,nowait -m 512 -smp 2 -kernel '/usr/local/autotest/tests/kvm/unittests/emulator.flat' -vnc :0 -chardev file,id=testlog,path=/tmp/testlog-2001-095328-ESmp -device testdev,chardev=testlog -S 11/01 09:54:33 DEBUG|kvm_monito:0624| (monitor qmpmonitor1) Sending command 'qmp_capabilities' 11/01 09:54:33 DEBUG|kvm_vm:0851| VM appears to be alive with PID 18682 11/01 09:54:33 DEBUG|kvm_monito:0254| (monitor humanmonitor1) Sending command 'cont' 11/01 09:54:33 INFO | unittest:0096| Waiting for unittest emulator to complete, timeout 600, output in /tmp/testlog-2001-095328-ESmp 11/01 09:54:35 INFO | aexpect:0783| [qemu output] (Process terminated with status 1) 11/01 09:54:35 ERROR| unittest:0104| Unit test emulator failed 11/01 09:54:35 INFO | unittest:0113| Unit test log collected and available under /usr/local/autotest/results/default/kvm.qemu-kvm-git.unittests/debug/emulator.log Unfortunately, the log for the emulator test itself (emulator.log) is empty. If you think it's helpful, we can setup the same environment again, so that you can manually run and debug this failure. Thanks. Cleber, Can you retest with kvm.git commit ID f2ae29e3492d41f082bf1e66532c2fe4da65f486 instead of branch master? Sure! I'll keep you informed.
[Qemu-devel] [PATCH 11/11] vvfat: reorganize computation of disk geometry
From: Paolo Bonzini First determine FAT12/16/32, then compute geometry from that for both FDD and HDD. For 1.44MB floppies, and 2.88MB floppies using FAT16, change to 1 sector/cluster. The default remains 2.88MB with FAT12 and 2 sectors/cluster. Both DOS and mkdosfs by default format a 2.88MB floppy as FAT12. Signed-off-by: Paolo Bonzini Signed-off-by: Kevin Wolf --- block/vvfat.c | 40 1 files changed, 24 insertions(+), 16 deletions(-) diff --git a/block/vvfat.c b/block/vvfat.c index faf2947..8511fe5 100644 --- a/block/vvfat.c +++ b/block/vvfat.c @@ -982,7 +982,6 @@ static int is_consistent(BDRVVVFATState *s); static int vvfat_open(BlockDriverState *bs, const char* dirname, int flags) { BDRVVVFATState *s = bs->opaque; -int floppy = 0; int i; #ifdef DEBUG @@ -996,11 +995,8 @@ DLOG(if (stderr == NULL) { s->bs = bs; -s->fat_type=16; /* LATER TODO: if FAT32, adjust */ s->sectors_per_cluster=0x10; -/* 504MB disk*/ -bs->cyls=1024; bs->heads=16; bs->secs=63; s->current_cluster=0x; @@ -1015,14 +1011,6 @@ DLOG(if (stderr == NULL) { if (!strstart(dirname, "fat:", NULL)) return -1; -if (strstr(dirname, ":floppy:")) { - floppy = 1; - s->fat_type = 12; - s->first_sectors_number = 1; - s->sectors_per_cluster=2; - bs->cyls = 80; bs->heads = 2; bs->secs = 36; -} - if (strstr(dirname, ":32:")) { fprintf(stderr, "Big fat greek warning: FAT32 has not been tested. You are welcome to do so!\n"); s->fat_type = 32; @@ -1030,7 +1018,27 @@ DLOG(if (stderr == NULL) { s->fat_type = 16; } else if (strstr(dirname, ":12:")) { s->fat_type = 12; - bs->secs = 18; +} + +if (strstr(dirname, ":floppy:")) { + /* 1.44MB or 2.88MB floppy. 2.88MB can be FAT12 (default) or FAT16. */ + if (!s->fat_type) { + s->fat_type = 12; + bs->secs = 36; + s->sectors_per_cluster=2; + } else { + bs->secs=(s->fat_type == 12 ? 18 : 36); + s->sectors_per_cluster=1; + } + s->first_sectors_number = 1; + bs->cyls=80; bs->heads=2; +} else { + /* 32MB or 504MB disk*/ + if (!s->fat_type) { + s->fat_type = 16; + } + bs->cyls=(s->fat_type == 12 ? 64 : 1024); + bs->heads=16; bs->secs=63; } s->sector_count=bs->cyls*bs->heads*bs->secs-(s->first_sectors_number-1); @@ -1058,10 +1066,10 @@ DLOG(if (stderr == NULL) { if(s->first_sectors_number==0x40) init_mbr(s); - -/* for some reason or other, MS-DOS does not like to know about CHS... */ -if (floppy) +else { +/* MS-DOS does not like to know about CHS (?). */ bs->heads = bs->cyls = bs->secs = 0; +} //assert(is_consistent(s)); qemu_co_mutex_init(&s->lock); -- 1.7.6.4
[Qemu-devel] [PATCH 09/11] vvfat: unify and correct computation of sector count
From: Paolo Bonzini The sector count is stored in the partition and hence must not include the sectors before its start. At the same time, remove the useless special casing for 1.44 MB floppies. This fixes fsck on VVFAT hard disks, which otherwise tries to seek past the end of the disk. Signed-off-by: Paolo Bonzini Signed-off-by: Kevin Wolf --- block/vvfat.c |6 +++--- 1 files changed, 3 insertions(+), 3 deletions(-) diff --git a/block/vvfat.c b/block/vvfat.c index 1f7cc48..3e7b407 100644 --- a/block/vvfat.c +++ b/block/vvfat.c @@ -1026,8 +1026,6 @@ DLOG(if (stderr == NULL) { bs->cyls = 80; bs->heads = 2; bs->secs = 36; } -s->sector_count=bs->cyls*bs->heads*bs->secs; - if (strstr(dirname, ":32:")) { fprintf(stderr, "Big fat greek warning: FAT32 has not been tested. You are welcome to do so!\n"); s->fat_type = 32; @@ -1035,9 +1033,11 @@ DLOG(if (stderr == NULL) { s->fat_type = 16; } else if (strstr(dirname, ":12:")) { s->fat_type = 12; - s->sector_count=2880; + bs->secs = 18; } +s->sector_count=bs->cyls*bs->heads*bs->secs-(s->first_sectors_number-1); + if (strstr(dirname, ":rw:")) { if (enable_write_target(s)) return -1; -- 1.7.6.4
Re: [Qemu-devel] Summary of CD, DVD passthrough tests with -drive if=scsi
On 11/07/2011 06:02 PM, Thomas Schmitt wrote: But how would i come from there to your proposal ? -device virtio-blk,drive=scsicd,logical_block_size=2048,physical_block_size=2048 First of all, virtio-blk is an alias for virtio-blk-pci. I should not have used it as it's likely confusing, but anyway "qemu-kvm -device virtio-blk-pci,?" tells you all the options that a virtio-blk(-pci) device can have. The idea then is that when you want to express extra options, you cannot use simply "-drive if=virtio", and qdev-device-use.txt then hints about how to convert -drive if=virtio" to "-drive if=none" + "-device virtio-blk". Paolo
Re: [Qemu-devel] [PATCH] qmp: add test tool for QMP
On Mon, Nov 07, 2011 at 02:39:29PM -0200, Luiz Capitulino wrote: > On Mon, 07 Nov 2011 10:35:51 -0600 > Anthony Liguori wrote: > > > On 11/07/2011 10:30 AM, Luiz Capitulino wrote: > > > On Mon, 07 Nov 2011 10:09:55 -0600 > > > Anthony Liguori wrote: > > > > > >> On 11/07/2011 10:08 AM, Luiz Capitulino wrote: > > >>> On Mon, 7 Nov 2011 09:11:15 -0600 > > >>> Anthony Liguori wrote: > > >>> > > I wrote this quickly to aid in testing. It's similar to qmp-shell > > with a few > > important differences: > > > > 1) It is not interactive. That makes it useful for scripting. > > > > 2) qmp-shell: > > > > (QEMU) set_password protocol=vnc password=foo > > > > 3) qmp: > > > > $ qmp set_password --protocol=vnc --password=foo > > > > 4) Extensible, git-style interface. If an invalid command name is > > passed, it > > will try to exec qmp-$1. > > > > 5) It attempts to pretty print the JSON responses in a shell friendly > > format > > such that tools can work with the output. > > > > Hope others will also find it useful. > > > > Signed-off-by: Anthony Liguori > > >>> > > >>> Acked-by: Luiz Capitulino > > >> > > >> BTW, one thing I'd like to try at some point soon is to generate man > > >> pages from > > >> qapi-schema.json. If you notice in the script, it does online help by > > >> invoking man. > > > > > > Yes, I did notice it. I didn't comment on it because I imagined you had > > > plans > > > about it. > > > > > > PS: I don't think this needs to go through my tree. > > > > What do you want to do with qmp.py? Do you feel comfortable installing it > > in > > $PYTHONPATH and treating it as a supported API? > > I probably don't. I coded it as demo in the very beginning of QMP, maybe > we should first define what we expect from a QMP Python class then we > can see whether it fits or not... I feel it needs to be revamped. > It should not blocking, i.e. for event notification. I have a patch that fixes that but breaks tab-completion.
[Qemu-devel] [PATCH 07/11] vvfat: do not fail if the disk has spare sectors
From: Paolo Bonzini If the number of "faked sectors" + the number of sectors that are part of a cluster does not sum up to the total number of sectors, qemu-img convert fails. Read these spare sectors as all zeros. Signed-off-by: Paolo Bonzini Signed-off-by: Kevin Wolf --- block/vvfat.c |4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/block/vvfat.c b/block/vvfat.c index 75d0dc0..9f851b0 100644 --- a/block/vvfat.c +++ b/block/vvfat.c @@ -1245,7 +1245,7 @@ static int vvfat_read(BlockDriverState *bs, int64_t sector_num, int i; for(i=0;i= s->sector_count) + if (sector_num >= bs->total_sectors) return -1; if (s->qcow) { int n; @@ -1271,7 +1271,7 @@ DLOG(fprintf(stderr, "sector %d not allocated\n", (int)sector_num)); uint32_t sector=sector_num-s->faked_sectors, sector_offset_in_cluster=(sector%s->sectors_per_cluster), cluster_num=sector/s->sectors_per_cluster; - if(read_cluster(s, cluster_num) != 0) { + if(cluster_num > s->cluster_count || read_cluster(s, cluster_num) != 0) { /* LATER TODO: strict: return -1; */ memset(buf+i*0x200,0,0x200); continue; -- 1.7.6.4
Re: [Qemu-devel] [qemu-kvm unittest regression] Re: Autotest | Job ID: 2011 "Upstream qemu-kvm.git sanity 11-01-2011 00:04:02" | Status: 1 Completed | Success Rate: 94.74 %
On Mon, Nov 07, 2011 at 09:00:26AM -0300, Cleber Rosa wrote: > On 11/07/2011 07:21 AM, Marcelo Tosatti wrote: > >On Tue, Nov 01, 2011 at 02:08:54PM -0200, Lucas Meneghel Rodrigues wrote: > >>On 11/01/2011 12:17 PM, kvm-autotest wrote: > >>>Job ID: 2011 > >>>Job name: Upstream qemu-kvm.git sanity 11-01-2011 00:04:02 > >>>Summary: Host: Status: Completed > >>>Status: 1 Completed > >>>Execution time (HH:MM:SS): 01:17:02 > >>>User tests executed: 19 > >>>User tests passed: 18 > >>>User tests failed: 1 > >>>User tests success rate: 94.74 % > >>>Failures: > >>>Test Name Status Reason > >>>kvm.qemu-kvm-git.unittests FAIL Unit tests failed: emulator > >>Hi Marcelo, Avi: > >> > >>We've seen emulator unittest failures during the last couple of jobs > >>of qemu-kvm.git userspace + kvm.git kernel. Relevant hashes for the > >>last failure seen: > >> > >>11/01 09:33:59 INFO |virt_utils:0501| Commit hash for > >>git://github.com/avikivity/kvm.git is > >>b796a09c5d808f4013f27ad45953db604dac18fd (tag > >>v3.1-rc4-10168-gb796a09) > >> > >>11/01 09:50:57 DEBUG|virt_utils:2587| Git repo qemu_kvm uri: > >>git://github.com/avikivity/qemu.git > >>11/01 09:51:52 INFO |virt_utils:2531| Commit hash for qemu_kvm is > >>7879db7e9c09b92d9af1c143fbe2cc212ec89e4b (no tag found) > >> > >>Cheers, > >> > >>Lucas > >Is there a log with more details available? > > > > Marcelo, > > The Autotest "landing" page for this job is: > > http://autotest.virt.bos.redhat.com/afe/#tab_id=view_job&object_id=2011 > > You can jump straight to the unittest logs by going to: > > http://autotest.virt.bos.redhat.com/results/2011-autotest/virtlab201.virt.bos.redhat.com/kvm.qemu-kvm-git.unittests/debug/ > > And this is the autotest log entries we got while running emulator: > > 11/01 09:54:32 INFO | unittest:0052| Running emulator > 11/01 09:54:32 DEBUG|virt_env_p:0052| Preprocessing VM 'None' > 11/01 09:54:32 INFO |kvm_vm:0790| Running qemu command: > /usr/local/autotest/tests/kvm/qemu -name 'None' -nodefaults -vga std -monitor > unix:'/tmp/monitor-humanmonitor1-2001-095328-ESmp',server,nowait -qmp > unix:'/tmp/monitor-qmpmonitor1-2001-095328-ESmp',server,nowait -serial > unix:'/tmp/serial-2001-095328-ESmp',server,nowait -m 512 -smp 2 -kernel > '/usr/local/autotest/tests/kvm/unittests/emulator.flat' -vnc :0 -chardev > file,id=testlog,path=/tmp/testlog-2001-095328-ESmp -device > testdev,chardev=testlog -S > 11/01 09:54:33 DEBUG|kvm_monito:0624| (monitor qmpmonitor1) Sending command > 'qmp_capabilities' > 11/01 09:54:33 DEBUG|kvm_vm:0851| VM appears to be alive with PID 18682 > 11/01 09:54:33 DEBUG|kvm_monito:0254| (monitor humanmonitor1) Sending command > 'cont' > 11/01 09:54:33 INFO | unittest:0096| Waiting for unittest emulator to > complete, timeout 600, output in /tmp/testlog-2001-095328-ESmp > 11/01 09:54:35 INFO | aexpect:0783| [qemu output] (Process terminated with > status 1) > 11/01 09:54:35 ERROR| unittest:0104| Unit test emulator failed > 11/01 09:54:35 INFO | unittest:0113| Unit test log collected and available > under > /usr/local/autotest/results/default/kvm.qemu-kvm-git.unittests/debug/emulator.log > > Unfortunately, the log for the emulator test itself (emulator.log) is empty. > If you think it's helpful, we can setup the same environment again, so that > you can manually run and debug this failure. > > Thanks. Cleber, Can you retest with kvm.git commit ID f2ae29e3492d41f082bf1e66532c2fe4da65f486 instead of branch master?
[Qemu-devel] [PATCH 02/11] qemu-io: Fix multiwrite_f error handling
Without this fix, some qiovs can be leaked if an error occurs. Also a semicolon at the end of the command line would make the code walk beyond the end of argv. Signed-off-by: Kevin Wolf --- qemu-io.c |9 ++--- 1 files changed, 6 insertions(+), 3 deletions(-) diff --git a/qemu-io.c b/qemu-io.c index 1c49d44..de26422 100644 --- a/qemu-io.c +++ b/qemu-io.c @@ -960,21 +960,21 @@ static int multiwrite_f(int argc, char **argv) buf = g_malloc0(nr_reqs * sizeof(*buf)); qiovs = g_malloc(nr_reqs * sizeof(*qiovs)); -for (i = 0; i < nr_reqs; i++) { +for (i = 0; i < nr_reqs && optind < argc; i++) { int j; /* Read the offset of the request */ offset = cvtnum(argv[optind]); if (offset < 0) { printf("non-numeric offset argument -- %s\n", argv[optind]); -return 0; +goto out; } optind++; if (offset & 0x1ff) { printf("offset %lld is not sector aligned\n", (long long)offset); -return 0; +goto out; } if (i == 0) { @@ -1005,6 +1005,9 @@ static int multiwrite_f(int argc, char **argv) pattern++; } +/* If there were empty requests at the end, ignore them */ +nr_reqs = i; + gettimeofday(&t1, NULL); cnt = do_aio_multiwrite(reqs, nr_reqs, &total); gettimeofday(&t2, NULL); -- 1.7.6.4
[Qemu-devel] [PATCH 04/11] block/cloop: Fix coding style
From: Dong Xu Wang Fix coding style in block/cloop.c. Reviewed-by: Andreas Färber Reviewed_by: Ray Wang Signed-off-by: Dong Xu Wang Signed-off-by: Kevin Wolf --- block/cloop.c | 115 +++-- 1 files changed, 63 insertions(+), 52 deletions(-) diff --git a/block/cloop.c b/block/cloop.c index 775f8a9..799b6c2 100644 --- a/block/cloop.c +++ b/block/cloop.c @@ -30,7 +30,7 @@ typedef struct BDRVCloopState { CoMutex lock; uint32_t block_size; uint32_t n_blocks; -uint64_t* offsets; +uint64_t *offsets; uint32_t sectors_per_block; uint32_t current_block; uint8_t *compressed_block; @@ -40,21 +40,23 @@ typedef struct BDRVCloopState { static int cloop_probe(const uint8_t *buf, int buf_size, const char *filename) { -const char* magic_version_2_0="#!/bin/sh\n" - "#V2.0 Format\n" - "modprobe cloop file=$0 && mount -r -t iso9660 /dev/cloop $1\n"; -int length=strlen(magic_version_2_0); -if(length>buf_size) - length=buf_size; -if(!memcmp(magic_version_2_0,buf,length)) - return 2; +const char *magic_version_2_0 = "#!/bin/sh\n" +"#V2.0 Format\n" +"modprobe cloop file=$0 && mount -r -t iso9660 /dev/cloop $1\n"; +int length = strlen(magic_version_2_0); +if (length > buf_size) { +length = buf_size; +} +if (!memcmp(magic_version_2_0, buf, length)) { +return 2; +} return 0; } static int cloop_open(BlockDriverState *bs, int flags) { BDRVCloopState *s = bs->opaque; -uint32_t offsets_size,max_compressed_block_size=1,i; +uint32_t offsets_size, max_compressed_block_size = 1, i; bs->read_only = 1; @@ -74,26 +76,28 @@ static int cloop_open(BlockDriverState *bs, int flags) s->offsets = g_malloc(offsets_size); if (bdrv_pread(bs->file, 128 + 4 + 4, s->offsets, offsets_size) < offsets_size) { - goto cloop_close; +goto cloop_close; } for(i=0;in_blocks;i++) { - s->offsets[i]=be64_to_cpu(s->offsets[i]); - if(i>0) { - uint32_t size=s->offsets[i]-s->offsets[i-1]; - if(size>max_compressed_block_size) - max_compressed_block_size=size; - } +s->offsets[i] = be64_to_cpu(s->offsets[i]); +if (i > 0) { +uint32_t size = s->offsets[i] - s->offsets[i - 1]; +if (size > max_compressed_block_size) { +max_compressed_block_size = size; +} +} } /* initialize zlib engine */ -s->compressed_block = g_malloc(max_compressed_block_size+1); +s->compressed_block = g_malloc(max_compressed_block_size + 1); s->uncompressed_block = g_malloc(s->block_size); -if(inflateInit(&s->zstream) != Z_OK) - goto cloop_close; -s->current_block=s->n_blocks; +if (inflateInit(&s->zstream) != Z_OK) { +goto cloop_close; +} +s->current_block = s->n_blocks; s->sectors_per_block = s->block_size/512; -bs->total_sectors = s->n_blocks*s->sectors_per_block; +bs->total_sectors = s->n_blocks * s->sectors_per_block; qemu_co_mutex_init(&s->lock); return 0; @@ -105,27 +109,30 @@ static inline int cloop_read_block(BlockDriverState *bs, int block_num) { BDRVCloopState *s = bs->opaque; -if(s->current_block != block_num) { - int ret; -uint32_t bytes = s->offsets[block_num+1]-s->offsets[block_num]; +if (s->current_block != block_num) { +int ret; +uint32_t bytes = s->offsets[block_num + 1] - s->offsets[block_num]; ret = bdrv_pread(bs->file, s->offsets[block_num], s->compressed_block, bytes); -if (ret != bytes) +if (ret != bytes) { return -1; +} + +s->zstream.next_in = s->compressed_block; +s->zstream.avail_in = bytes; +s->zstream.next_out = s->uncompressed_block; +s->zstream.avail_out = s->block_size; +ret = inflateReset(&s->zstream); +if (ret != Z_OK) { +return -1; +} +ret = inflate(&s->zstream, Z_FINISH); +if (ret != Z_STREAM_END || s->zstream.total_out != s->block_size) { +return -1; +} - s->zstream.next_in = s->compressed_block; - s->zstream.avail_in = bytes; - s->zstream.next_out = s->uncompressed_block; - s->zstream.avail_out = s->block_size; - ret = inflateReset(&s->zstream); - if(ret != Z_OK) - return -1; - ret = inflate(&s->zstream, Z_FINISH); - if(ret != Z_STREAM_END || s->zstream.total_out != s->block_size) - return -1; - - s->current_block = block_num; +s->current_block = block_num; } return 0; } @@ -136,12 +143,15 @@ static int cloop_read(BlockDriverState *bs, int64_t sector_num, BDRVCloopState *s = bs->opaque; int i; -for(i=0;isectors_per_block), - block_num=(sector_num+i)/s-
Re: [Qemu-devel] [PATCH] KVM: Add wrapper script around QEMU to test kernels
On Mon, 7 Nov 2011, Pekka Enberg wrote: > I've never heard ABI incompatibility used as an argument for perf. Ingo? Never overtly. They're too clever for that. In any case, as a primary developer of a library (PAPI) that uses the perf_events ABI I have to say that having perf in the kernel has been a *major* pain for us. Unlike the perf developers, we *do* have to maintain backwards compatability. And we have a lot of nasty code in PAPI to handle this. Entirely because the perf_events ABI is not stable. It's mostly stable, but there are enough regressions to be a pain. It's problem enough that there's no way to know what version of the perf_event abi you are running against and we have to guess based on kernel version. This gets "fun" because all of the vendors have backported seemingly random chunks of perf_event code to their older kernels. And it often does seem as the perf developers don't care when something breaks in perf_events if it doesn't affect perf users. For example, the new NMI watchdog severely breaks perf_event event allocation if you are using FORMAT_GROUP. perf doesn't use this though, so none of the kernel developers seem to care. And unless I can quickly come up with a patch as an outsider, a few kernel versions will go by and the kernel devs will declare "well it was broken so long, now we don't have to fix it". Fun. Vince
Re: [Qemu-devel] Summary of CD, DVD passthrough tests with -drive if=scsi
Hi, Paolo Bonzini wrote: > Yes, docs/qdev-device-use.txt helps. "qemu -device virtio-blk,?" too. I read it and ran '-device ?'. But both left me clueless. The text i understood mainly as refering to "old ways" and "new way" of defining devices. I stalled when i found no flesh for "HOST-OPTS" and "DEV-OPTS". man ./qemu.1 lists options of -devices, but it did not lead me to "virtio-blk" as driver name. Do i get it right that e.g. virtio-blk-pci.scsi=on/off means there is an option -device virtio-blk,scsi=on But how would i come from there to your proposal ? -device virtio-blk,drive=scsicd,logical_block_size=2048,physical_block_size=2048 Not nagging, just pointing out the problems of a noob. To my luck, you are a very friendly tutor. Else i would have failed early. i wrote: > > Can FreeBSD and Solaris use virtio drives ? Paolo Bonzini wrote: > Perhaps, but probably not with SG_IO so that's a "no" for your use case. You mean SG_IO on the host system ? On the guest, libburn would use CAM resp. uscsi. I tried to learn about the overall concept of virtio but mostly find prescriptions how to use it. How much similarity between host and guest is needed ? Actually i came to qemu because on the long term i want to see GNU xorriso burn a DVD on GNU/Hurd. Just for fulfilling my duty as GNU maintainer. (Not that anybody else would find this to be a reasonable idea ... :)) But i understand that GNU/Hurd will not offer the guest support for virtio. So i will have to resort to -cdrom for development and an old PC for burn testing, if ever an RPC for userspace SCSI transactions gets approved. (Inside gnumach, there sit transaction calls for ATAPI and SCSI. But there is no way to use them directly from Hurd.) Have a nice day :) Thomas
[Qemu-devel] [PULL 00/11] Block patches for 1.0
Am 07.11.2011 17:55, schrieb Kevin Wolf: > The following changes since commit 932eacc158c064935c7bab920c88a93a629e1ca4: > > Merge branch 'xtensa' of git://jcmvbkbc.spb.ru/dumb/qemu-xtensa (2011-11-02 > 20:52:23 +) > > are available in the git repository at: > > git://repo.or.cz/qemu/kevin.git for-anthony > > Dong Xu Wang (2): > block/cloop: Fix coding style > block/cloop: Use g_free instead of free > > Kevin Wolf (3): > qemu-io: Handle create_iovec errors > qemu-io: Fix multiwrite_f error handling > dma: Avoid reentrancy in DMA transfer handlers > > Paolo Bonzini (6): > vvfat: fix out of bounds array_get usage > vvfat: do not fail if the disk has spare sectors > vvfat: need to use first_sectors_number to distinguish fdd/hdd > vvfat: unify and correct computation of sector count > vvfat: do not hardcode sector counts in error message > vvfat: reorganize computation of disk geometry > > block/cloop.c | 119 > +++-- > block/vvfat.c | 60 - > hw/dma.c | 10 + > qemu-io.c | 37 ++--- > 4 files changed, 138 insertions(+), 88 deletions(-) Subject should have been PULL instead of PATCH...
[Qemu-devel] [PATCH 06/11] vvfat: fix out of bounds array_get usage
From: Paolo Bonzini When reading the address of the first free entry, you cannot use array_get without first marking all entries as occupied. This is visible if you change the sectors per cluster on a floppy from 2 to 1. Signed-off-by: Paolo Bonzini Signed-off-by: Kevin Wolf --- block/vvfat.c |1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/block/vvfat.c b/block/vvfat.c index e1fcdbc..75d0dc0 100644 --- a/block/vvfat.c +++ b/block/vvfat.c @@ -799,6 +799,7 @@ static int read_directory(BDRVVVFATState* s, int mapping_index) /* root directory */ int cur = s->directory.next; array_ensure_allocated(&(s->directory), ROOT_ENTRIES - 1); + s->directory.next = ROOT_ENTRIES; memset(array_get(&(s->directory), cur), 0, (ROOT_ENTRIES - cur) * sizeof(direntry_t)); } -- 1.7.6.4
[Qemu-devel] [PATCH 10/11] vvfat: do not hardcode sector counts in error message
From: Paolo Bonzini Signed-off-by: Paolo Bonzini Signed-off-by: Kevin Wolf --- block/vvfat.c |7 ++- 1 files changed, 2 insertions(+), 5 deletions(-) diff --git a/block/vvfat.c b/block/vvfat.c index 3e7b407..faf2947 100644 --- a/block/vvfat.c +++ b/block/vvfat.c @@ -916,11 +916,8 @@ static int init_directories(BDRVVVFATState* s, cluster = mapping->end; if(cluster > s->cluster_count) { - fprintf(stderr,"Directory does not fit in FAT%d (capacity %s)\n", - s->fat_type, - s->fat_type == 12 ? s->sector_count == 2880 ? "1.44 MB" - : "2.88 MB" - : "504MB"); + fprintf(stderr,"Directory does not fit in FAT%d (capacity %.2f MB)\n", + s->fat_type, s->sector_count / 2000.0); return -EINVAL; } -- 1.7.6.4
[Qemu-devel] [PATCH 08/11] vvfat: need to use first_sectors_number to distinguish fdd/hdd
From: Paolo Bonzini This is consistent with what "real" floppies have, so file(1) now actually recognizes the VVFAT image as a 1.44 MB floppy. Signed-off-by: Paolo Bonzini Signed-off-by: Kevin Wolf --- block/vvfat.c |4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/block/vvfat.c b/block/vvfat.c index 9f851b0..1f7cc48 100644 --- a/block/vvfat.c +++ b/block/vvfat.c @@ -954,7 +954,7 @@ static int init_directories(BDRVVVFATState* s, bootsector->number_of_fats=0x2; /* number of FATs */ bootsector->root_entries=cpu_to_le16(s->sectors_of_root_directory*0x10); bootsector->total_sectors16=s->sector_count>0x?0:cpu_to_le16(s->sector_count); - bootsector->media_type=(s->fat_type!=12?0xf8:s->sector_count==5760?0xf9:0xf8); /* media descriptor */ +bootsector->media_type=(s->first_sectors_number>1?0xf8:0xf0); /* media descriptor (f8=hd, f0=3.5 fd)*/ s->fat.pointer[0] = bootsector->media_type; bootsector->sectors_per_fat=cpu_to_le16(s->sectors_per_fat); bootsector->sectors_per_track=cpu_to_le16(s->bs->secs); @@ -963,7 +963,7 @@ static int init_directories(BDRVVVFATState* s, bootsector->total_sectors=cpu_to_le32(s->sector_count>0x?s->sector_count:0); /* LATER TODO: if FAT32, this is wrong */ -bootsector->u.fat16.drive_number=s->fat_type==12?0:0x80; /* assume this is hda (TODO) */ +bootsector->u.fat16.drive_number=s->first_sectors_number==1?0:0x80; /* fda=0, hda=0x80 */ bootsector->u.fat16.current_head=0; bootsector->u.fat16.signature=0x29; bootsector->u.fat16.id=cpu_to_le32(0xfabe1afd); -- 1.7.6.4
[Qemu-devel] [PATCH 01/11] qemu-io: Handle create_iovec errors
Callers of create_iovec() didn't check for failure and continued with uninitialised data in error cases. This patch adds checks to each call. Signed-off-by: Kevin Wolf --- qemu-io.c | 28 1 files changed, 24 insertions(+), 4 deletions(-) diff --git a/qemu-io.c b/qemu-io.c index 5af887e..1c49d44 100644 --- a/qemu-io.c +++ b/qemu-io.c @@ -596,6 +596,9 @@ static int readv_f(int argc, char **argv) nr_iov = argc - optind; buf = create_iovec(&qiov, &argv[optind], nr_iov, 0xab); +if (buf == NULL) { +return 0; +} gettimeofday(&t1, NULL); cnt = do_aio_readv(&qiov, offset, &total); @@ -850,6 +853,9 @@ static int writev_f(int argc, char **argv) nr_iov = argc - optind; buf = create_iovec(&qiov, &argv[optind], nr_iov, pattern); +if (buf == NULL) { +return 0; +} gettimeofday(&t1, NULL); cnt = do_aio_writev(&qiov, offset, &total); @@ -950,8 +956,8 @@ static int multiwrite_f(int argc, char **argv) } } -reqs = g_malloc(nr_reqs * sizeof(*reqs)); -buf = g_malloc(nr_reqs * sizeof(*buf)); +reqs = g_malloc0(nr_reqs * sizeof(*reqs)); +buf = g_malloc0(nr_reqs * sizeof(*buf)); qiovs = g_malloc(nr_reqs * sizeof(*qiovs)); for (i = 0; i < nr_reqs; i++) { @@ -985,8 +991,12 @@ static int multiwrite_f(int argc, char **argv) nr_iov = j - optind; /* Build request */ +buf[i] = create_iovec(&qiovs[i], &argv[optind], nr_iov, pattern); +if (buf[i] == NULL) { +goto out; +} + reqs[i].qiov = &qiovs[i]; -buf[i] = create_iovec(reqs[i].qiov, &argv[optind], nr_iov, pattern); reqs[i].sector = offset >> 9; reqs[i].nb_sectors = reqs[i].qiov->size >> 9; @@ -1014,7 +1024,9 @@ static int multiwrite_f(int argc, char **argv) out: for (i = 0; i < nr_reqs; i++) { qemu_io_free(buf[i]); -qemu_iovec_destroy(&qiovs[i]); +if (reqs[i].qiov != NULL) { +qemu_iovec_destroy(&qiovs[i]); +} } g_free(buf); g_free(reqs); @@ -1185,6 +1197,10 @@ static int aio_read_f(int argc, char **argv) nr_iov = argc - optind; ctx->buf = create_iovec(&ctx->qiov, &argv[optind], nr_iov, 0xab); +if (ctx->buf == NULL) { +free(ctx); +return 0; +} gettimeofday(&ctx->t1, NULL); acb = bdrv_aio_readv(bs, ctx->offset >> 9, &ctx->qiov, @@ -1280,6 +1296,10 @@ static int aio_write_f(int argc, char **argv) nr_iov = argc - optind; ctx->buf = create_iovec(&ctx->qiov, &argv[optind], nr_iov, pattern); +if (ctx->buf == NULL) { +free(ctx); +return 0; +} gettimeofday(&ctx->t1, NULL); acb = bdrv_aio_writev(bs, ctx->offset >> 9, &ctx->qiov, -- 1.7.6.4
[Qemu-devel] [PATCH 00/11] Block patches for 1.0
The following changes since commit 932eacc158c064935c7bab920c88a93a629e1ca4: Merge branch 'xtensa' of git://jcmvbkbc.spb.ru/dumb/qemu-xtensa (2011-11-02 20:52:23 +) are available in the git repository at: git://repo.or.cz/qemu/kevin.git for-anthony Dong Xu Wang (2): block/cloop: Fix coding style block/cloop: Use g_free instead of free Kevin Wolf (3): qemu-io: Handle create_iovec errors qemu-io: Fix multiwrite_f error handling dma: Avoid reentrancy in DMA transfer handlers Paolo Bonzini (6): vvfat: fix out of bounds array_get usage vvfat: do not fail if the disk has spare sectors vvfat: need to use first_sectors_number to distinguish fdd/hdd vvfat: unify and correct computation of sector count vvfat: do not hardcode sector counts in error message vvfat: reorganize computation of disk geometry block/cloop.c | 119 +++-- block/vvfat.c | 60 - hw/dma.c | 10 + qemu-io.c | 37 ++--- 4 files changed, 138 insertions(+), 88 deletions(-)
[Qemu-devel] [PATCH 03/11] dma: Avoid reentrancy in DMA transfer handlers
With the conversion of the block layer to coroutines, bdrv_read/write have changed to run a nested event loop that calls qemu_bh_poll. Consequently a scheduled BH can be called while a DMA transfer handler runs and this means that DMA_run becomes reentrant. Devices haven't been designed to cope with that, so instead of running a nested transfer handler just wait for the next invocation of the BH from the main loop. This fixes some problems with the floppy device. Signed-off-by: Kevin Wolf --- hw/dma.c | 10 ++ 1 files changed, 10 insertions(+), 0 deletions(-) diff --git a/hw/dma.c b/hw/dma.c index 8a7302a..0a9322d 100644 --- a/hw/dma.c +++ b/hw/dma.c @@ -358,6 +358,14 @@ static void DMA_run (void) struct dma_cont *d; int icont, ichan; int rearm = 0; +static int running = 0; + +if (running) { +rearm = 1; +goto out; +} else { +running = 1; +} d = dma_controllers; @@ -374,6 +382,8 @@ static void DMA_run (void) } } +running = 0; +out: if (rearm) qemu_bh_schedule_idle(dma_bh); } -- 1.7.6.4
[Qemu-devel] [PATCH 6/8 v2] block: add eject request callback
Recent versions of udev always keep the tray locked so that the kernel can observe "eject request" events (aka tray button presses) even on discs that aren't mounted. Add support for these events in the ATAPI and SCSI cd drive device models. To let management cope with the behavior of udev, an event should also be added for "tray opened/closed". This way, after issuing an "eject" command, management can poll until the guests actually reacts to the command. They can then issue the "change" command after the tray has been opened, or try with "eject -f" after a (configurable?) timeout. However, with this patch and the corresponding support in the device models, at least it is possible to do a manual two-step eject+change sequence. Signed-off-by: Paolo Bonzini --- v1->v2: do not change the behavior of eject -f. block.c|7 +++ block.h|8 blockdev.c | 10 ++ 3 files changed, 21 insertions(+), 4 deletions(-) diff --git a/block.c b/block.c index 0568df2..e847738 100644 --- a/block.c +++ b/block.c @@ -816,6 +816,13 @@ bool bdrv_dev_has_removable_media(BlockDriverState *bs) return !bs->dev || (bs->dev_ops && bs->dev_ops->change_media_cb); } +void bdrv_dev_eject_request(BlockDriverState *bs, bool force) +{ +if (bs->dev_ops && bs->dev_ops->eject_request_cb) { +bs->dev_ops->eject_request_cb(bs->dev_opaque, force); +} +} + bool bdrv_dev_is_tray_open(BlockDriverState *bs) { if (bs->dev_ops && bs->dev_ops->is_tray_open) { diff --git a/block.h b/block.h index 38cd748..38fd0f6 100644 --- a/block.h +++ b/block.h @@ -39,6 +39,15 @@ typedef struct BlockDevOps { */ void (*change_media_cb)(void *opaque, bool load); /* + * Runs when an eject request is issued from the monitor, the tray + * is closed, and the medium is locked. + * Device models that do not implement is_medium_locked will not need + * this callback. Device models that can lock the medium or tray might + * want to implement the callback and unlock the tray when "force" is + * true, even if they do not support eject requests. + */ +void (*eject_request_cb)(void *opaque, bool force); +/* * Is the virtual tray open? * Device models implement this only when the device has a tray. */ @@ -111,6 +118,7 @@ void bdrv_detach_dev(BlockDriverState *bs, void *dev); void *bdrv_get_attached_dev(BlockDriverState *bs); void bdrv_set_dev_ops(BlockDriverState *bs, const BlockDevOps *ops, void *opaque); +void bdrv_dev_eject_request(BlockDriverState *bs, bool force); bool bdrv_dev_has_removable_media(BlockDriverState *bs); bool bdrv_dev_is_tray_open(BlockDriverState *bs); bool bdrv_dev_is_medium_locked(BlockDriverState *bs); diff --git a/blockdev.c b/blockdev.c index 0827bf7..2228186 100644 --- a/blockdev.c +++ b/blockdev.c @@ -635,10 +635,12 @@ static int eject_device(Monitor *mon, BlockDriverState *bs, int force) qerror_report(QERR_DEVICE_NOT_REMOVABLE, bdrv_get_device_name(bs)); return -1; } -if (!force && !bdrv_dev_is_tray_open(bs) -&& bdrv_dev_is_medium_locked(bs)) { -qerror_report(QERR_DEVICE_LOCKED, bdrv_get_device_name(bs)); -return -1; +if (bdrv_dev_is_medium_locked(bs) && !bdrv_dev_is_tray_open(bs)) { +bdrv_dev_eject_request(bs, force); +if (!force) { +qerror_report(QERR_DEVICE_LOCKED, bdrv_get_device_name(bs)); +return -1; +} } bdrv_close(bs); return 0; -- 1.7.7.1
Re: [Qemu-devel] [PATCH 1/3] Support guest reboots when in Xen HVM mode
On 11/07/2011 11:43 AM, Stefano Stabellini wrote: On Fri, 28 Oct 2011, John Baboval wrote: Call xc_domain_shutdown with the reboot flag when the guest requests a reboot. Thanks for the patch! Sorry for the delay in replaying but away for XenSummit Asia. Signed-off-by: John V. Baboval Signed-off-by: Tom Goetz --- xen-all.c | 22 ++ 1 files changed, 22 insertions(+), 0 deletions(-) diff --git a/xen-all.c b/xen-all.c index b5e28ab..cd71b24 100644 --- a/xen-all.c +++ b/xen-all.c @@ -742,6 +742,7 @@ static void cpu_handle_ioreq(void *opaque) } if (qemu_reset_requested_get()) { qemu_system_reset(VMRESET_REPORT); +reboot_hvm_domain(); } } @@ -979,3 +980,24 @@ void destroy_hvm_domain(void) xc_interface_close(xc_handle); } } + +void reboot_hvm_domain(void) +{ +XenXC xc_handle; +int sts; + +xc_handle = xen_xc_interface_open(0, 0, 0); +if (xc_handle == XC_HANDLER_INITIAL_VALUE) { +fprintf(stderr, "Cannot acquire xenctrl handle\n"); +} else { +sts = xc_domain_shutdown(xc_handle, xen_domid, SHUTDOWN_reboot); +if (sts != 0) { +fprintf(stderr, "? xc_domain_shutdown failed to issue reboot, " +"sts %d, %s\n", sts, strerror(errno)); +} else { +fprintf(stderr, "Issued domain %d reboot\n", xen_domid); +} +xc_interface_close(xc_handle); +} +} I think that what you are doing is correct but I couldn't help but notice that reboot_hvm_domain is very similar to destroy_hvm_domain. I would rather unify the two functions and have a single shutdown_domain function with two arguments: a reboot/destroy argument and xc_handle (we don't need to open a new one, we can reuse state->xce_handle). Are you OK with submitting a new version of this patch with these changes? Sounds good to me. I have a huge backlog of stuff to debug and test though, so it'll be a few days before I get to this.
Re: [Qemu-devel] [PATCH 2/3] Xen conditionals
On Fri, 28 Oct 2011, John Baboval wrote: > Don't perform RTC or APIC setup when running in xen mode. Both are actually emulated in Xen so registering the two devices in Qemu is harmless because they are never going to receive any events from the hypervisor. Thus I would rather keep the difference with the non-xen case smaller and avoid introducing more if (xen_enabled()).
Re: [Qemu-devel] [PATCH 1/3] Support guest reboots when in Xen HVM mode
On Fri, 28 Oct 2011, John Baboval wrote: > Call xc_domain_shutdown with the reboot flag when the guest requests a > reboot. Thanks for the patch! Sorry for the delay in replaying but away for XenSummit Asia. > Signed-off-by: John V. Baboval > Signed-off-by: Tom Goetz > --- > xen-all.c | 22 ++ > 1 files changed, 22 insertions(+), 0 deletions(-) > > diff --git a/xen-all.c b/xen-all.c > index b5e28ab..cd71b24 100644 > --- a/xen-all.c > +++ b/xen-all.c > @@ -742,6 +742,7 @@ static void cpu_handle_ioreq(void *opaque) > } > if (qemu_reset_requested_get()) { > qemu_system_reset(VMRESET_REPORT); > +reboot_hvm_domain(); > } > } > @@ -979,3 +980,24 @@ void destroy_hvm_domain(void) > xc_interface_close(xc_handle); > } > } > + > +void reboot_hvm_domain(void) > +{ > +XenXC xc_handle; > +int sts; > + > +xc_handle = xen_xc_interface_open(0, 0, 0); > +if (xc_handle == XC_HANDLER_INITIAL_VALUE) { > +fprintf(stderr, "Cannot acquire xenctrl handle\n"); > +} else { > +sts = xc_domain_shutdown(xc_handle, xen_domid, SHUTDOWN_reboot); > +if (sts != 0) { > +fprintf(stderr, "? xc_domain_shutdown failed to issue reboot, " > +"sts %d, %s\n", sts, strerror(errno)); > +} else { > +fprintf(stderr, "Issued domain %d reboot\n", xen_domid); > +} > +xc_interface_close(xc_handle); > +} > +} I think that what you are doing is correct but I couldn't help but notice that reboot_hvm_domain is very similar to destroy_hvm_domain. I would rather unify the two functions and have a single shutdown_domain function with two arguments: a reboot/destroy argument and xc_handle (we don't need to open a new one, we can reuse state->xce_handle). Are you OK with submitting a new version of this patch with these changes?
Re: [Qemu-devel] [PATCH] qmp: add test tool for QMP
On Mon, 07 Nov 2011 10:35:51 -0600 Anthony Liguori wrote: > On 11/07/2011 10:30 AM, Luiz Capitulino wrote: > > On Mon, 07 Nov 2011 10:09:55 -0600 > > Anthony Liguori wrote: > > > >> On 11/07/2011 10:08 AM, Luiz Capitulino wrote: > >>> On Mon, 7 Nov 2011 09:11:15 -0600 > >>> Anthony Liguori wrote: > >>> > I wrote this quickly to aid in testing. It's similar to qmp-shell with > a few > important differences: > > 1) It is not interactive. That makes it useful for scripting. > > 2) qmp-shell: > > (QEMU) set_password protocol=vnc password=foo > > 3) qmp: > > $ qmp set_password --protocol=vnc --password=foo > > 4) Extensible, git-style interface. If an invalid command name is > passed, it > will try to exec qmp-$1. > > 5) It attempts to pretty print the JSON responses in a shell friendly > format > such that tools can work with the output. > > Hope others will also find it useful. > > Signed-off-by: Anthony Liguori > >>> > >>> Acked-by: Luiz Capitulino > >> > >> BTW, one thing I'd like to try at some point soon is to generate man pages > >> from > >> qapi-schema.json. If you notice in the script, it does online help by > >> invoking man. > > > > Yes, I did notice it. I didn't comment on it because I imagined you had > > plans > > about it. > > > > PS: I don't think this needs to go through my tree. > > What do you want to do with qmp.py? Do you feel comfortable installing it in > $PYTHONPATH and treating it as a supported API? I probably don't. I coded it as demo in the very beginning of QMP, maybe we should first define what we expect from a QMP Python class then we can see whether it fits or not... I feel it needs to be revamped.
Re: [Qemu-devel] [PATCH] qmp: add test tool for QMP
On 11/07/2011 10:30 AM, Luiz Capitulino wrote: On Mon, 07 Nov 2011 10:09:55 -0600 Anthony Liguori wrote: On 11/07/2011 10:08 AM, Luiz Capitulino wrote: On Mon, 7 Nov 2011 09:11:15 -0600 Anthony Liguori wrote: I wrote this quickly to aid in testing. It's similar to qmp-shell with a few important differences: 1) It is not interactive. That makes it useful for scripting. 2) qmp-shell: (QEMU) set_password protocol=vnc password=foo 3) qmp: $ qmp set_password --protocol=vnc --password=foo 4) Extensible, git-style interface. If an invalid command name is passed, it will try to exec qmp-$1. 5) It attempts to pretty print the JSON responses in a shell friendly format such that tools can work with the output. Hope others will also find it useful. Signed-off-by: Anthony Liguori Acked-by: Luiz Capitulino BTW, one thing I'd like to try at some point soon is to generate man pages from qapi-schema.json. If you notice in the script, it does online help by invoking man. Yes, I did notice it. I didn't comment on it because I imagined you had plans about it. PS: I don't think this needs to go through my tree. What do you want to do with qmp.py? Do you feel comfortable installing it in $PYTHONPATH and treating it as a supported API? Regards, Anthony Liguori
Re: [Qemu-devel] [PATCH] qmp: add test tool for QMP
On Mon, 07 Nov 2011 10:09:55 -0600 Anthony Liguori wrote: > On 11/07/2011 10:08 AM, Luiz Capitulino wrote: > > On Mon, 7 Nov 2011 09:11:15 -0600 > > Anthony Liguori wrote: > > > >> I wrote this quickly to aid in testing. It's similar to qmp-shell with a > >> few > >> important differences: > >> > >> 1) It is not interactive. That makes it useful for scripting. > >> > >> 2) qmp-shell: > >> > >> (QEMU) set_password protocol=vnc password=foo > >> > >> 3) qmp: > >> > >> $ qmp set_password --protocol=vnc --password=foo > >> > >> 4) Extensible, git-style interface. If an invalid command name is passed, > >> it > >> will try to exec qmp-$1. > >> > >> 5) It attempts to pretty print the JSON responses in a shell friendly > >> format > >> such that tools can work with the output. > >> > >> Hope others will also find it useful. > >> > >> Signed-off-by: Anthony Liguori > > > > Acked-by: Luiz Capitulino > > BTW, one thing I'd like to try at some point soon is to generate man pages > from > qapi-schema.json. If you notice in the script, it does online help by > invoking man. Yes, I did notice it. I didn't comment on it because I imagined you had plans about it. PS: I don't think this needs to go through my tree. > > Regards, > > Anthony Liguori > > > > >> --- > >> QMP/qmp | 120 > >> +++ > >> 1 files changed, 120 insertions(+), 0 deletions(-) > >> create mode 100755 QMP/qmp > >> > >> diff --git a/QMP/qmp b/QMP/qmp > >> new file mode 100755 > >> index 000..7b2a3c7 > >> --- /dev/null > >> +++ b/QMP/qmp > >> @@ -0,0 +1,120 @@ > >> +#!/usr/bin/python > >> +# > >> +# QMP command line tool > >> +# > >> +# Copyright IBM, Corp. 2011 > >> +# > >> +# Authors: > >> +# Anthony Liguori > >> +# > >> +# This work is licensed under the terms of the GNU GPLv2 or later. > >> +# See the COPYING file in the top-level directory. > >> + > >> +import sys, os > >> +from qmp import QEMUMonitorProtocol > >> + > >> +def print_response(rsp, prefix=[]): > >> +if type(rsp) == list: > >> +i = 0 > >> +for item in rsp: > >> +if prefix == []: > >> +prefix = ['item'] > >> +print_response(item, prefix[:-1] + ['%s[%d]' % (prefix[-1], > >> i)]) > >> +i += 1 > >> +elif type(rsp) == dict: > >> +for key in rsp.keys(): > >> +print_response(rsp[key], prefix + [key]) > >> +else: > >> +if len(prefix): > >> +print '%s: %s' % ('.'.join(prefix), rsp) > >> +else: > >> +print '%s' % (rsp) > >> + > >> +def main(args): > >> +path = None > >> + > >> +# Use QMP_PATH if it's set > >> +if os.environ.has_key('QMP_PATH'): > >> +path = os.environ['QMP_PATH'] > >> + > >> +while len(args): > >> +arg = args[0] > >> + > >> +if arg.startswith('--'): > >> +arg = arg[2:] > >> +if arg.find('=') == -1: > >> +value = True > >> +else: > >> +arg, value = arg.split('=', 1) > >> + > >> +if arg in ['path']: > >> +path = value > >> +elif arg in ['help']: > >> +os.execlp('man', 'man', 'qmp') > >> +else: > >> +print 'Unknown argument "%s"' % arg > >> + > >> +args = args[1:] > >> +else: > >> +break > >> + > >> +if not path: > >> +print "QMP path isn't set, use --path or set QMP_PATH" > >> +return 1 > >> + > >> +command, args = args[0], args[1:] > >> + > >> +if command in ['help']: > >> +os.execlp('man', 'man', 'qmp') > >> + > >> +srv = QEMUMonitorProtocol(path) > >> +srv.connect() > >> + > >> +def do_command(srv, cmd, **kwds): > >> +rsp = srv.cmd(cmd, kwds) > >> +if rsp.has_key('error'): > >> +raise Exception(rsp['error']['desc']) > >> +return rsp['return'] > >> + > >> +commands = map(lambda x: x['name'], do_command(srv, 'query-commands')) > >> + > >> +srv.close() > >> + > >> +if command not in commands: > >> +fullcmd = 'qmp-%s' % command > >> +try: > >> +os.environ['QMP_PATH'] = path > >> +os.execvp(fullcmd, [fullcmd] + args) > >> +except OSError, (errno, msg): > >> +if errno == 2: > >> +print 'Command "%s" not found.' % (fullcmd) > >> +return 1 > >> +raise > >> +return 0 > >> + > >> +srv = QEMUMonitorProtocol(path) > >> +srv.connect() > >> + > >> +arguments = {} > >> +for arg in args: > >> +if not arg.startswith('--'): > >> +print 'Unknown argument "%s"' % arg > >> +return 1 > >> + > >> +arg = arg[2:] > >> +if arg.find('=') == -1: > >> +value = True > >> +else: > >> +arg, value = arg.split('=', 1) > >> + > >> +if arg in ['help']: >
Re: [Qemu-devel] [Qemu-trivial] [PATCH] qemu-tech.texi: Remove libqemu from QEMU generic features
On Mon, Nov 07, 2011 at 05:47:05PM +0800, 陳韋任 wrote: > On Mon, Nov 07, 2011 at 08:18:01AM +, Stefan Hajnoczi wrote: > > On Sat, Nov 05, 2011 at 09:06:54AM +0800, 陳韋任 wrote: > > > According to [1], libqemu is not available anymore. Remove libqemu > > > from QEMU generic features. > > > > > > [1] http://www.mail-archive.com/qemu-devel@nongnu.org/msg49809.html > > > > Then perhaps we should get rid of tests/qruncom.c too? > > I think so. But I don't know if anyone object this. Is it O.K. cc > this to qemu-devel@nongnu.org ? I suggest sending a patch that removes all references to libqemu.a and tests/qruncom.c. Send it to qemu-devel instead of qemu-trivial and see if anyone complains :). Stefan
Re: [Qemu-devel] [PATCH 6/8] block: add eject request callback
On 11/07/2011 04:23 PM, Markus Armbruster wrote: 1. eject without -f is as you mentioned. Would implementing the missing part help with the problem at hand? It would help for non-buggy guests. Buggy means even "echo -1 > /sys/block/sr0/events_poll_msecs". However, a full solution would require a change in management, and adding a TRAY_STATUS_CHANGED event to QEMU. Not sure this is required for 1.0, as it can even be added later to stable. 2. eject with -f should really never be needed, but it does whatever is needed to be able to follow up with a "change" command. It turns out it is really "unlock" and "ask the guest to eject" combined, but that's the implementation, not the model. Physical hardware doesn't work that way (unless I misunderstand it). Always a warning sign. True. The difference from the paperclip model is that it gives a chance for the OS to clean up and eject safely. It wouldn't be hard to convince me otherwise though, especially if it can help getting the patch in 1.0. The "eject -f"+"change" can be replaced by "eject", possibly followed by "eject -f" after a timeout, and then followed again by "change". On bare metal, the pressing the tray button accomplishes that: the drive notifies the OS the button was pressed, the well-behaved OS cleans up and ejects. With a misbehaving OS, you're reduced to the paperclip. Can't we replicate that? 1. Tray button / eject without -f A. when the tray is not locked: tray opens immediately. B. when the tray is locked: OS gets notified. We expect it to clean up, unlock and open the tray at some point in the near future. 2. Paperclip / eject with -f Tray opens immediately. Guest OS may be unhappy about it. I now redid my tests with the paperclip behavior (it's really one line of different code) and everything seems to work. Will submit v2. Paolo
Re: [Qemu-devel] [PATCH] qmp: add test tool for QMP
On 11/07/2011 10:08 AM, Luiz Capitulino wrote: On Mon, 7 Nov 2011 09:11:15 -0600 Anthony Liguori wrote: I wrote this quickly to aid in testing. It's similar to qmp-shell with a few important differences: 1) It is not interactive. That makes it useful for scripting. 2) qmp-shell: (QEMU) set_password protocol=vnc password=foo 3) qmp: $ qmp set_password --protocol=vnc --password=foo 4) Extensible, git-style interface. If an invalid command name is passed, it will try to exec qmp-$1. 5) It attempts to pretty print the JSON responses in a shell friendly format such that tools can work with the output. Hope others will also find it useful. Signed-off-by: Anthony Liguori Acked-by: Luiz Capitulino BTW, one thing I'd like to try at some point soon is to generate man pages from qapi-schema.json. If you notice in the script, it does online help by invoking man. Regards, Anthony Liguori --- QMP/qmp | 120 +++ 1 files changed, 120 insertions(+), 0 deletions(-) create mode 100755 QMP/qmp diff --git a/QMP/qmp b/QMP/qmp new file mode 100755 index 000..7b2a3c7 --- /dev/null +++ b/QMP/qmp @@ -0,0 +1,120 @@ +#!/usr/bin/python +# +# QMP command line tool +# +# Copyright IBM, Corp. 2011 +# +# Authors: +# Anthony Liguori +# +# This work is licensed under the terms of the GNU GPLv2 or later. +# See the COPYING file in the top-level directory. + +import sys, os +from qmp import QEMUMonitorProtocol + +def print_response(rsp, prefix=[]): +if type(rsp) == list: +i = 0 +for item in rsp: +if prefix == []: +prefix = ['item'] +print_response(item, prefix[:-1] + ['%s[%d]' % (prefix[-1], i)]) +i += 1 +elif type(rsp) == dict: +for key in rsp.keys(): +print_response(rsp[key], prefix + [key]) +else: +if len(prefix): +print '%s: %s' % ('.'.join(prefix), rsp) +else: +print '%s' % (rsp) + +def main(args): +path = None + +# Use QMP_PATH if it's set +if os.environ.has_key('QMP_PATH'): +path = os.environ['QMP_PATH'] + +while len(args): +arg = args[0] + +if arg.startswith('--'): +arg = arg[2:] +if arg.find('=') == -1: +value = True +else: +arg, value = arg.split('=', 1) + +if arg in ['path']: +path = value +elif arg in ['help']: +os.execlp('man', 'man', 'qmp') +else: +print 'Unknown argument "%s"' % arg + +args = args[1:] +else: +break + +if not path: +print "QMP path isn't set, use --path or set QMP_PATH" +return 1 + +command, args = args[0], args[1:] + +if command in ['help']: +os.execlp('man', 'man', 'qmp') + +srv = QEMUMonitorProtocol(path) +srv.connect() + +def do_command(srv, cmd, **kwds): +rsp = srv.cmd(cmd, kwds) +if rsp.has_key('error'): +raise Exception(rsp['error']['desc']) +return rsp['return'] + +commands = map(lambda x: x['name'], do_command(srv, 'query-commands')) + +srv.close() + +if command not in commands: +fullcmd = 'qmp-%s' % command +try: +os.environ['QMP_PATH'] = path +os.execvp(fullcmd, [fullcmd] + args) +except OSError, (errno, msg): +if errno == 2: +print 'Command "%s" not found.' % (fullcmd) +return 1 +raise +return 0 + +srv = QEMUMonitorProtocol(path) +srv.connect() + +arguments = {} +for arg in args: +if not arg.startswith('--'): +print 'Unknown argument "%s"' % arg +return 1 + +arg = arg[2:] +if arg.find('=') == -1: +value = True +else: +arg, value = arg.split('=', 1) + +if arg in ['help']: +os.execlp('man', 'man', 'qmp-%s' % command) +return 1 + +arguments[arg] = value + +rsp = do_command(srv, command, **arguments) +print_response(rsp) + +if __name__ == '__main__': +sys.exit(main(sys.argv[1:]))
Re: [Qemu-devel] [PATCH] qmp: add test tool for QMP
On Mon, 7 Nov 2011 09:11:15 -0600 Anthony Liguori wrote: > I wrote this quickly to aid in testing. It's similar to qmp-shell with a few > important differences: > > 1) It is not interactive. That makes it useful for scripting. > > 2) qmp-shell: > > (QEMU) set_password protocol=vnc password=foo > > 3) qmp: > > $ qmp set_password --protocol=vnc --password=foo > > 4) Extensible, git-style interface. If an invalid command name is passed, it >will try to exec qmp-$1. > > 5) It attempts to pretty print the JSON responses in a shell friendly format >such that tools can work with the output. > > Hope others will also find it useful. > > Signed-off-by: Anthony Liguori Acked-by: Luiz Capitulino > --- > QMP/qmp | 120 > +++ > 1 files changed, 120 insertions(+), 0 deletions(-) > create mode 100755 QMP/qmp > > diff --git a/QMP/qmp b/QMP/qmp > new file mode 100755 > index 000..7b2a3c7 > --- /dev/null > +++ b/QMP/qmp > @@ -0,0 +1,120 @@ > +#!/usr/bin/python > +# > +# QMP command line tool > +# > +# Copyright IBM, Corp. 2011 > +# > +# Authors: > +# Anthony Liguori > +# > +# This work is licensed under the terms of the GNU GPLv2 or later. > +# See the COPYING file in the top-level directory. > + > +import sys, os > +from qmp import QEMUMonitorProtocol > + > +def print_response(rsp, prefix=[]): > +if type(rsp) == list: > +i = 0 > +for item in rsp: > +if prefix == []: > +prefix = ['item'] > +print_response(item, prefix[:-1] + ['%s[%d]' % (prefix[-1], i)]) > +i += 1 > +elif type(rsp) == dict: > +for key in rsp.keys(): > +print_response(rsp[key], prefix + [key]) > +else: > +if len(prefix): > +print '%s: %s' % ('.'.join(prefix), rsp) > +else: > +print '%s' % (rsp) > + > +def main(args): > +path = None > + > +# Use QMP_PATH if it's set > +if os.environ.has_key('QMP_PATH'): > +path = os.environ['QMP_PATH'] > + > +while len(args): > +arg = args[0] > + > +if arg.startswith('--'): > +arg = arg[2:] > +if arg.find('=') == -1: > +value = True > +else: > +arg, value = arg.split('=', 1) > + > +if arg in ['path']: > +path = value > +elif arg in ['help']: > +os.execlp('man', 'man', 'qmp') > +else: > +print 'Unknown argument "%s"' % arg > + > +args = args[1:] > +else: > +break > + > +if not path: > +print "QMP path isn't set, use --path or set QMP_PATH" > +return 1 > + > +command, args = args[0], args[1:] > + > +if command in ['help']: > +os.execlp('man', 'man', 'qmp') > + > +srv = QEMUMonitorProtocol(path) > +srv.connect() > + > +def do_command(srv, cmd, **kwds): > +rsp = srv.cmd(cmd, kwds) > +if rsp.has_key('error'): > +raise Exception(rsp['error']['desc']) > +return rsp['return'] > + > +commands = map(lambda x: x['name'], do_command(srv, 'query-commands')) > + > +srv.close() > + > +if command not in commands: > +fullcmd = 'qmp-%s' % command > +try: > +os.environ['QMP_PATH'] = path > +os.execvp(fullcmd, [fullcmd] + args) > +except OSError, (errno, msg): > +if errno == 2: > +print 'Command "%s" not found.' % (fullcmd) > +return 1 > +raise > +return 0 > + > +srv = QEMUMonitorProtocol(path) > +srv.connect() > + > +arguments = {} > +for arg in args: > +if not arg.startswith('--'): > +print 'Unknown argument "%s"' % arg > +return 1 > + > +arg = arg[2:] > +if arg.find('=') == -1: > +value = True > +else: > +arg, value = arg.split('=', 1) > + > +if arg in ['help']: > +os.execlp('man', 'man', 'qmp-%s' % command) > +return 1 > + > +arguments[arg] = value > + > +rsp = do_command(srv, command, **arguments) > +print_response(rsp) > + > +if __name__ == '__main__': > +sys.exit(main(sys.argv[1:]))
[Qemu-devel] [Bug 885213] Re: Latest GIT version fails to compile on Linux cris-softmmu/pci-stub.c
Thanks for the fix. I've done as you suggested, and the latest version now compiles. Please close. ** Changed in: qemu Status: New => Fix Released -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/885213 Title: Latest GIT version fails to compile on Linux cris-softmmu/pci-stub.c Status in QEMU: Fix Released Bug description: The latest GIT version (e072ea2fd8fdceef64159b9596d3c15ce01bea91) fails to compile. Host: debian x86-64. gcc 4.6.2 ... CCcris-softmmu/pci-stub.o ... In file included from /home/njh/src/qemu/hw/pci-stub.c:24:0: ./qmp-commands.h: At top level: ./qmp-commands.h:3:1: error: expected identifier or ‘(’ before ‘{’ token . To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/885213/+subscriptions
Re: [Qemu-devel] [PATCH 2/2] ac97: don't override the pci subsystem id
On 11/07/2011 09:33 AM, Gerd Hoffmann wrote: This patch removes the code lines which set the subsystem id for the emulated ac97 card to 8086:. Due to the device id being zero the subsystem id isn't vaild anyway. With the patch applied the sound card gets the default qemu subsystem id (1af4:1100) instead. I don't like having a property of "use broken". Wouldn't it be better to have the subsystem vendor and device id be configurable, set the default to the qemu subsystem ids, and then set it to 8086: for < 1.0? Regards, Anthony Liguori [ v2: old& broken id is maintained for -M pc-$oldqemuversion ] Cc: Takashi Iwai Signed-off-by: Gerd Hoffmann --- hw/ac97.c| 16 +++- hw/pc_piix.c | 16 2 files changed, 27 insertions(+), 5 deletions(-) diff --git a/hw/ac97.c b/hw/ac97.c index 6800af4..0dbba3b 100644 --- a/hw/ac97.c +++ b/hw/ac97.c @@ -150,6 +150,7 @@ typedef struct AC97BusMasterRegs { typedef struct AC97LinkState { PCIDevice dev; QEMUSoundCard card; +uint32_t use_broken_id; uint32_t glob_cnt; uint32_t glob_sta; uint32_t cas; @@ -1305,11 +1306,12 @@ static int ac97_initfn (PCIDevice *dev) c[PCI_BASE_ADDRESS_0 + 6] = 0x00; c[PCI_BASE_ADDRESS_0 + 7] = 0x00; -c[PCI_SUBSYSTEM_VENDOR_ID] = 0x86; /* svid subsystem vendor id rwo */ -c[PCI_SUBSYSTEM_VENDOR_ID + 1] = 0x80; - -c[PCI_SUBSYSTEM_ID] = 0x00; /* sid subsystem id rwo */ -c[PCI_SUBSYSTEM_ID + 1] = 0x00; +if (s->use_broken_id) { +c[PCI_SUBSYSTEM_VENDOR_ID] = 0x86; +c[PCI_SUBSYSTEM_VENDOR_ID + 1] = 0x80; +c[PCI_SUBSYSTEM_ID] = 0x00; +c[PCI_SUBSYSTEM_ID + 1] = 0x00; +} c[PCI_INTERRUPT_LINE] = 0x00; /* intr_ln interrupt line rw */ c[PCI_INTERRUPT_PIN] = 0x01; /* intr_pn interrupt pin ro */ @@ -1350,6 +1352,10 @@ static PCIDeviceInfo ac97_info = { .device_id= PCI_DEVICE_ID_INTEL_82801AA_5, .revision = 0x01, .class_id = PCI_CLASS_MULTIMEDIA_AUDIO, +.qdev.props = (Property[]) { +DEFINE_PROP_UINT32("use_broken_id", AC97LinkState, use_broken_id, 0), +DEFINE_PROP_END_OF_LIST(), +} }; static void ac97_register (void) diff --git a/hw/pc_piix.c b/hw/pc_piix.c index 93e40d0..27ea570 100644 --- a/hw/pc_piix.c +++ b/hw/pc_piix.c @@ -347,6 +347,10 @@ static QEMUMachine pc_machine_v0_13 = { .driver = "virtio-net-pci", .property = "event_idx", .value= "off", +},{ +.driver = "AC97", +.property = "use_broken_id", +.value= stringify(1), }, { /* end of list */ } }, @@ -390,6 +394,10 @@ static QEMUMachine pc_machine_v0_12 = { .driver = "virtio-net-pci", .property = "event_idx", .value= "off", +},{ +.driver = "AC97", +.property = "use_broken_id", +.value= stringify(1), }, { /* end of list */ } } @@ -441,6 +449,10 @@ static QEMUMachine pc_machine_v0_11 = { .driver = "virtio-net-pci", .property = "event_idx", .value= "off", +},{ +.driver = "AC97", +.property = "use_broken_id", +.value= stringify(1), }, { /* end of list */ } } @@ -504,6 +516,10 @@ static QEMUMachine pc_machine_v0_10 = { .driver = "virtio-net-pci", .property = "event_idx", .value= "off", +},{ +.driver = "AC97", +.property = "use_broken_id", +.value= stringify(1), }, { /* end of list */ } },
Re: [Qemu-devel] Guest hangs on reboot with Windows Server 2003
On 11/07/2011 09:49 AM, Alon Levy wrote: On Mon, Nov 07, 2011 at 09:22:46AM -0600, Anthony Liguori wrote: On 11/07/2011 09:21 AM, Paolo Bonzini wrote: On 11/07/2011 02:52 PM, Stefan Hajnoczi wrote: For QEMU 1.0 Test Day I'm hitting a guest hang with Windows Server 2003 whenever it tries to reboot. The QEMU monitor is available and working. I'm using a qcow2 image over IDE, but this may not be qcow2/block related. This is a regression and does not occur under Debian 0.14.1+dfsg-3. I am bisecting against v0.15.0 now and will report back. Just wanted to let you know so we don't duplicate work. Fixed by https://github.com/aliguori/qemu/commit/3f8aeaa Excellent. Thanks for pointing that out! Can this be pushed to master as well? It will be pushed to master. I'm currently testing that branch. Regards, Anthony Liguori Regards, Anthony Liguori Paolo
Re: [Qemu-devel] Guest hangs on reboot with Windows Server 2003
On Mon, Nov 07, 2011 at 09:22:46AM -0600, Anthony Liguori wrote: > On 11/07/2011 09:21 AM, Paolo Bonzini wrote: > >On 11/07/2011 02:52 PM, Stefan Hajnoczi wrote: > >>For QEMU 1.0 Test Day I'm hitting a guest hang with Windows Server > >>2003 whenever it tries to reboot. The QEMU monitor is available and > >>working. I'm using a qcow2 image over IDE, but this may not be > >>qcow2/block related. > >> > >>This is a regression and does not occur under Debian 0.14.1+dfsg-3. I > >>am bisecting against v0.15.0 now and will report back. Just wanted to > >>let you know so we don't duplicate work. > > > >Fixed by > > > >https://github.com/aliguori/qemu/commit/3f8aeaa > > Excellent. Thanks for pointing that out! Can this be pushed to master as well? > > Regards, > > Anthony Liguori > > > > >Paolo > > > > > >
[Qemu-devel] [PATCH 2/2] ac97: don't override the pci subsystem id
This patch removes the code lines which set the subsystem id for the emulated ac97 card to 8086:. Due to the device id being zero the subsystem id isn't vaild anyway. With the patch applied the sound card gets the default qemu subsystem id (1af4:1100) instead. [ v2: old & broken id is maintained for -M pc-$oldqemuversion ] Cc: Takashi Iwai Signed-off-by: Gerd Hoffmann --- hw/ac97.c| 16 +++- hw/pc_piix.c | 16 2 files changed, 27 insertions(+), 5 deletions(-) diff --git a/hw/ac97.c b/hw/ac97.c index 6800af4..0dbba3b 100644 --- a/hw/ac97.c +++ b/hw/ac97.c @@ -150,6 +150,7 @@ typedef struct AC97BusMasterRegs { typedef struct AC97LinkState { PCIDevice dev; QEMUSoundCard card; +uint32_t use_broken_id; uint32_t glob_cnt; uint32_t glob_sta; uint32_t cas; @@ -1305,11 +1306,12 @@ static int ac97_initfn (PCIDevice *dev) c[PCI_BASE_ADDRESS_0 + 6] = 0x00; c[PCI_BASE_ADDRESS_0 + 7] = 0x00; -c[PCI_SUBSYSTEM_VENDOR_ID] = 0x86; /* svid subsystem vendor id rwo */ -c[PCI_SUBSYSTEM_VENDOR_ID + 1] = 0x80; - -c[PCI_SUBSYSTEM_ID] = 0x00; /* sid subsystem id rwo */ -c[PCI_SUBSYSTEM_ID + 1] = 0x00; +if (s->use_broken_id) { +c[PCI_SUBSYSTEM_VENDOR_ID] = 0x86; +c[PCI_SUBSYSTEM_VENDOR_ID + 1] = 0x80; +c[PCI_SUBSYSTEM_ID] = 0x00; +c[PCI_SUBSYSTEM_ID + 1] = 0x00; +} c[PCI_INTERRUPT_LINE] = 0x00; /* intr_ln interrupt line rw */ c[PCI_INTERRUPT_PIN] = 0x01; /* intr_pn interrupt pin ro */ @@ -1350,6 +1352,10 @@ static PCIDeviceInfo ac97_info = { .device_id= PCI_DEVICE_ID_INTEL_82801AA_5, .revision = 0x01, .class_id = PCI_CLASS_MULTIMEDIA_AUDIO, +.qdev.props = (Property[]) { +DEFINE_PROP_UINT32("use_broken_id", AC97LinkState, use_broken_id, 0), +DEFINE_PROP_END_OF_LIST(), +} }; static void ac97_register (void) diff --git a/hw/pc_piix.c b/hw/pc_piix.c index 93e40d0..27ea570 100644 --- a/hw/pc_piix.c +++ b/hw/pc_piix.c @@ -347,6 +347,10 @@ static QEMUMachine pc_machine_v0_13 = { .driver = "virtio-net-pci", .property = "event_idx", .value= "off", +},{ +.driver = "AC97", +.property = "use_broken_id", +.value= stringify(1), }, { /* end of list */ } }, @@ -390,6 +394,10 @@ static QEMUMachine pc_machine_v0_12 = { .driver = "virtio-net-pci", .property = "event_idx", .value= "off", +},{ +.driver = "AC97", +.property = "use_broken_id", +.value= stringify(1), }, { /* end of list */ } } @@ -441,6 +449,10 @@ static QEMUMachine pc_machine_v0_11 = { .driver = "virtio-net-pci", .property = "event_idx", .value= "off", +},{ +.driver = "AC97", +.property = "use_broken_id", +.value= stringify(1), }, { /* end of list */ } } @@ -504,6 +516,10 @@ static QEMUMachine pc_machine_v0_10 = { .driver = "virtio-net-pci", .property = "event_idx", .value= "off", +},{ +.driver = "AC97", +.property = "use_broken_id", +.value= stringify(1), }, { /* end of list */ } }, -- 1.7.1
[Qemu-devel] [PATCH 1/2] pc: add 1.0 machine type
This patch adds a pc-1.0 machine type. Signed-off-by: Gerd Hoffmann --- hw/pc_piix.c | 14 +++--- 1 files changed, 11 insertions(+), 3 deletions(-) diff --git a/hw/pc_piix.c b/hw/pc_piix.c index 8c7f2b7..93e40d0 100644 --- a/hw/pc_piix.c +++ b/hw/pc_piix.c @@ -297,8 +297,8 @@ static void pc_xen_hvm_init(ram_addr_t ram_size, } #endif -static QEMUMachine pc_machine = { -.name = "pc-0.14", +static QEMUMachine pc_machine_v1_0 = { +.name = "pc-1.0", .alias = "pc", .desc = "Standard PC", .init = pc_init_pci, @@ -306,6 +306,13 @@ static QEMUMachine pc_machine = { .is_default = 1, }; +static QEMUMachine pc_machine_v0_14 = { +.name = "pc-0.14", +.desc = "Standard PC", +.init = pc_init_pci, +.max_cpus = 255, +}; + static QEMUMachine pc_machine_v0_13 = { .name = "pc-0.13", .desc = "Standard PC", @@ -521,7 +528,8 @@ static QEMUMachine xenfv_machine = { static void pc_machine_init(void) { -qemu_register_machine(&pc_machine); +qemu_register_machine(&pc_machine_v1_0); +qemu_register_machine(&pc_machine_v0_14); qemu_register_machine(&pc_machine_v0_13); qemu_register_machine(&pc_machine_v0_12); qemu_register_machine(&pc_machine_v0_11); -- 1.7.1
Re: [Qemu-devel] [PATCH v12 5/5] block: perf testing report based on block I/O throttling
Am 03.11.2011 09:57, schrieb Zhi Yong Wu: > The file 1mbps.dat is based on bps=1024*1024 I/O throttling; and the file > 10mbps.dat is based on bps=10*1024*1024 I/O throttling. > > Signed-off-by: Zhi Yong Wu > --- > 10mbps.dat | 310 ++ > 1mbps.dat | 339 > > 2 files changed, 649 insertions(+), 0 deletions(-) > create mode 100644 10mbps.dat > create mode 100644 1mbps.dat This is just for information and not supposed to be merged, right? Kevin
Re: [Qemu-devel] [PATCH 6/8] block: add eject request callback
Paolo Bonzini writes: > On 11/07/2011 02:21 PM, Markus Armbruster wrote: >> The commit message should explain why we need this callback. The cover >> letter says "support for eject requests is required by udev 173." >> Please elaborate on that. > > Well, first and foremost eject requests are in the spec. :) The fact > that recent guests need it is more a justification for pulling this in > 1.0. I agree we should try to solve the problem for 1.0. >> You implement it for IDE in PATCH 7/8 and SCSI in PATCH 8/8. You don't >> implement it for floppy, despite the comment. That's okay; floppy has >> no use for it. It's the comment that needs fixing. Devices that >> implement is_medium_locked() must implement this one as well. > > Right. > >> 1. eject without -f behaves like the physical tray button. It has >> immediate effect, unless the tray is locked closed. Then, the drive >> just notifies the OS of the button push, so the OS can react to it. The >> latter isn't implemented in QEMU. >> >> 2. eject with -f behaves like whatever physical way there is to pry the >> tray open, locked or not. CD-ROM drives commonly have a little button >> hidden in some hope you can reach with a bent paperclip. >> >> Could you explain your mental model? > > 1. eject without -f is as you mentioned. Would implementing the missing part help with the problem at hand? > 2. eject with -f should really never be needed, but it does whatever > is needed to be able to follow up with a "change" command. It turns > out it is really "unlock" and "ask the guest to eject" combined, but > that's the implementation, not the model. Physical hardware doesn't work that way (unless I misunderstand it). Always a warning sign. > The difference from the paperclip model is that it gives a chance for > the OS to clean up and eject safely. It wouldn't be hard to convince > me otherwise though, especially if it can help getting the patch in > 1.0. The "eject -f"+"change" can be replaced by "eject", possibly > followed by "eject -f" after a timeout, and then followed again by > "change". On bare metal, the pressing the tray button accomplishes that: the drive notifies the OS the button was pressed, the well-behaved OS cleans up and ejects. With a misbehaving OS, you're reduced to the paperclip. Can't we replicate that? 1. Tray button / eject without -f A. when the tray is not locked: tray opens immediately. B. when the tray is locked: OS gets notified. We expect it to clean up, unlock and open the tray at some point in the near future. 2. Paperclip / eject with -f Tray opens immediately. Guest OS may be unhappy about it.
Re: [Qemu-devel] Guest hangs on reboot with Windows Server 2003
On 11/07/2011 09:22 AM, Stefan Hajnoczi wrote: On Mon, Nov 7, 2011 at 3:21 PM, Paolo Bonzini wrote: On 11/07/2011 02:52 PM, Stefan Hajnoczi wrote: For QEMU 1.0 Test Day I'm hitting a guest hang with Windows Server 2003 whenever it tries to reboot. The QEMU monitor is available and working. I'm using a qcow2 image over IDE, but this may not be qcow2/block related. This is a regression and does not occur under Debian 0.14.1+dfsg-3. I am bisecting against v0.15.0 now and will report back. Just wanted to let you know so we don't duplicate work. Fixed by https://github.com/aliguori/qemu/commit/3f8aeaa Thanks! If anyone hits a hang on reboot, make sure to use v1.0-rc1 or later. I'll make sure to highlight this in the release notes. Regards, Anthony Liguori Stefan
Re: [Qemu-devel] [PATCH v12 4/5] hmp/qmp: add block_set_io_throttle
Am 03.11.2011 09:57, schrieb Zhi Yong Wu: > Signed-off-by: Zhi Yong Wu > Signed-off-by: Stefan Hajnoczi > --- > block.c | 15 + > blockdev.c | 59 > ++ > blockdev.h |2 + > hmp-commands.hx | 15 + > hmp.c| 10 + > qapi-schema.json | 16 +- > qerror.c |4 +++ > qerror.h |3 ++ > qmp-commands.hx | 53 +++- > 9 files changed, 175 insertions(+), 2 deletions(-) > > diff --git a/block.c b/block.c > index b2af48f..ed6fe20 100644 > --- a/block.c > +++ b/block.c > @@ -1971,6 +1971,21 @@ BlockInfoList *qmp_query_block(Error **errp) > info->value->inserted->has_backing_file = true; > info->value->inserted->backing_file = > g_strdup(bs->backing_file); > } > + > +if (bs->io_limits_enabled) { > +info->value->inserted->bps = > + bs->io_limits.bps[BLOCK_IO_LIMIT_TOTAL]; > +info->value->inserted->bps_rd = > + bs->io_limits.bps[BLOCK_IO_LIMIT_READ]; > +info->value->inserted->bps_wr = > + bs->io_limits.bps[BLOCK_IO_LIMIT_WRITE]; > +info->value->inserted->iops = > + bs->io_limits.iops[BLOCK_IO_LIMIT_TOTAL]; > +info->value->inserted->iops_rd = > + bs->io_limits.iops[BLOCK_IO_LIMIT_READ]; > +info->value->inserted->iops_wr = > + bs->io_limits.iops[BLOCK_IO_LIMIT_WRITE]; > +} > } > > /* XXX: waiting for the qapi to support GSList */ > diff --git a/blockdev.c b/blockdev.c > index 651828c..95d1faa 100644 > --- a/blockdev.c > +++ b/blockdev.c > @@ -757,6 +757,65 @@ int do_change_block(Monitor *mon, const char *device, > return monitor_read_bdrv_key_start(mon, bs, NULL, NULL); > } > > +/* throttling disk I/O limits */ > +int do_block_set_io_throttle(Monitor *mon, > + const QDict *qdict, QObject **ret_data) > +{ > +BlockIOLimit io_limits; > +const char *devname = qdict_get_str(qdict, "device"); > +BlockDriverState *bs; > + > +io_limits.bps[BLOCK_IO_LIMIT_TOTAL] > += qdict_get_try_int(qdict, "bps", -1); > +io_limits.bps[BLOCK_IO_LIMIT_READ] > += qdict_get_try_int(qdict, "bps_rd", -1); > +io_limits.bps[BLOCK_IO_LIMIT_WRITE] > += qdict_get_try_int(qdict, "bps_wr", -1); > +io_limits.iops[BLOCK_IO_LIMIT_TOTAL] > += qdict_get_try_int(qdict, "iops", -1); > +io_limits.iops[BLOCK_IO_LIMIT_READ] > += qdict_get_try_int(qdict, "iops_rd", -1); > +io_limits.iops[BLOCK_IO_LIMIT_WRITE] > += qdict_get_try_int(qdict, "iops_wr", -1); > + > +bs = bdrv_find(devname); > +if (!bs) { > +qerror_report(QERR_DEVICE_NOT_FOUND, devname); > +return -1; > +} > + > +if ((io_limits.bps[BLOCK_IO_LIMIT_TOTAL] == -1) > +|| (io_limits.bps[BLOCK_IO_LIMIT_READ] == -1) > +|| (io_limits.bps[BLOCK_IO_LIMIT_WRITE] == -1) > +|| (io_limits.iops[BLOCK_IO_LIMIT_TOTAL] == -1) > +|| (io_limits.iops[BLOCK_IO_LIMIT_READ] == -1) > +|| (io_limits.iops[BLOCK_IO_LIMIT_WRITE] == -1)) { > +qerror_report(QERR_MISSING_PARAMETER, > + "bps/bps_rd/bps_wr/iops/iops_rd/iops_wr"); > +return -1; > +} Here you require that all parameters are set... > + > +if (!do_check_io_limits(&io_limits)) { > +qerror_report(QERR_INVALID_PARAMETER_COMBINATION); > +return -1; > +} > + > +bs->io_limits = io_limits; > +bs->slice_time = BLOCK_IO_SLICE_TIME; > + > +if (!bs->io_limits_enabled && bdrv_io_limits_enabled(bs)) { > +bdrv_io_limits_enable(bs); > +} else if (bs->io_limits_enabled && !bdrv_io_limits_enabled(bs)) { > +bdrv_io_limits_disable(bs); > +} else { > +if (bs->block_timer) { > +qemu_mod_timer(bs->block_timer, qemu_get_clock_ns(vm_clock)); > +} > +} > + > +return 0; > +} > + > int do_drive_del(Monitor *mon, const QDict *qdict, QObject **ret_data) > { > const char *id = qdict_get_str(qdict, "id"); > diff --git a/blockdev.h b/blockdev.h > index 3587786..1b48a75 100644 > --- a/blockdev.h > +++ b/blockdev.h > @@ -63,6 +63,8 @@ int do_block_set_passwd(Monitor *mon, const QDict *qdict, > QObject **ret_data); > int do_change_block(Monitor *mon, const char *device, > const char *filename, const char *fmt); > int do_drive_del(Monitor *mon, const QDict *qdict, QObject **ret_data); > +int do_block_set_io_throttle(Monitor *mon, > + const QDict *qdict, QObject **ret_data)
Re: [Qemu-devel] Guest hangs on reboot with Windows Server 2003
On 11/07/2011 09:21 AM, Paolo Bonzini wrote: On 11/07/2011 02:52 PM, Stefan Hajnoczi wrote: For QEMU 1.0 Test Day I'm hitting a guest hang with Windows Server 2003 whenever it tries to reboot. The QEMU monitor is available and working. I'm using a qcow2 image over IDE, but this may not be qcow2/block related. This is a regression and does not occur under Debian 0.14.1+dfsg-3. I am bisecting against v0.15.0 now and will report back. Just wanted to let you know so we don't duplicate work. Fixed by https://github.com/aliguori/qemu/commit/3f8aeaa Excellent. Thanks for pointing that out! Regards, Anthony Liguori Paolo
Re: [Qemu-devel] Guest hangs on reboot with Windows Server 2003
On Mon, Nov 7, 2011 at 3:21 PM, Paolo Bonzini wrote: > On 11/07/2011 02:52 PM, Stefan Hajnoczi wrote: >> >> For QEMU 1.0 Test Day I'm hitting a guest hang with Windows Server >> 2003 whenever it tries to reboot. The QEMU monitor is available and >> working. I'm using a qcow2 image over IDE, but this may not be >> qcow2/block related. >> >> This is a regression and does not occur under Debian 0.14.1+dfsg-3. I >> am bisecting against v0.15.0 now and will report back. Just wanted to >> let you know so we don't duplicate work. > > Fixed by > > https://github.com/aliguori/qemu/commit/3f8aeaa Thanks! If anyone hits a hang on reboot, make sure to use v1.0-rc1 or later. Stefan
Re: [Qemu-devel] Guest hangs on reboot with Windows Server 2003
On 11/07/2011 02:52 PM, Stefan Hajnoczi wrote: For QEMU 1.0 Test Day I'm hitting a guest hang with Windows Server 2003 whenever it tries to reboot. The QEMU monitor is available and working. I'm using a qcow2 image over IDE, but this may not be qcow2/block related. This is a regression and does not occur under Debian 0.14.1+dfsg-3. I am bisecting against v0.15.0 now and will report back. Just wanted to let you know so we don't duplicate work. Fixed by https://github.com/aliguori/qemu/commit/3f8aeaa Paolo