Re: [Qemu-devel] Fw: [Qemu-arm] [PATCH v2 0/6] Runtime pagesize computation

2016-06-29 Thread Vijay Kilari
On Wed, Jun 29, 2016 at 12:24 PM, Kumar, Vijaya  wrote:
>
>
>
> 
> From: Peter Maydell 
> Sent: Tuesday, June 28, 2016 1:46 PM
> To: qemu-arm; QEMU Developers
> Cc: Paolo Bonzini; Kumar, Vijaya; Patch Tracking
> Subject: Re: [Qemu-arm] [PATCH v2 0/6] Runtime pagesize computation
>
> On 21 June 2016 at 18:09, Peter Maydell  wrote:
>> This set of patches is a development based on the ones from Vijaya:
>> the general idea is similar but I have tried to improve the interface
>> for defining the page size a bit.  I've also tweaked patches 2 and 3
>> to address code review comments.
>
>> NB: I have only very lightly tested these and haven't attempted
>> to measure performance at all. There is an assert() in the
>> definition of TARGET_PAGE_BITS which is good for making sure
>> it isn't used before it's valid but not so good for speed.
>
> Vijaya, are you in a position to test this patchset for
> performance? Presumably you have a test case benchmark you're
> looking to improve here?

I have tested the patches; the test case I tried was live migration of
an idle VM on an arm64 platform. The migrated VM has 4 VCPUs and 8GB of
RAM and runs CentOS.

With page bits 10 (1K), the live migration time is 5.8 sec

capabilities: xbzrle: off rdma-pin-all: off auto-converge: off
zero-blocks: off compress: off events: off x-postcopy-ram: off
Migration status: completed
total time: 5857 milliseconds
downtime: 102 milliseconds
setup: 14 milliseconds
transferred ram: 336081 kbytes
throughput: 470.21 mbps
remaining ram: 0 kbytes
total ram: 8519872 kbytes
duplicate: 8271539 pages
skipped: 0 pages
normal: 261340 pages
normal bytes: 261340 kbytes
dirty sync count: 3

With page bits 12 (4K), the live migration time is 2.9 sec

capabilities: xbzrle: off rdma-pin-all: off auto-converge: off
zero-blocks: off compress: off events: off x-postcopy-ram: off
Migration status: completed
total time: 2974 milliseconds
downtime: 76 milliseconds
setup: 5 milliseconds
transferred ram: 301327 kbytes
throughput: 830.30 mbps
remaining ram: 0 kbytes
total ram: 8519872 kbytes
duplicate: 2062398 pages
skipped: 0 pages
normal: 70662 pages
normal bytes: 282648 kbytes
dirty sync count: 3

Regards
Vijay
>
> thanks
> -- PMM



Re: [Qemu-devel] Question about qtest and IOMMU

2016-06-29 Thread Paolo Bonzini


On 29/06/2016 08:36, Jan Kiszka wrote:
> On 2016-06-29 08:32, Peter Xu wrote:
>> Hi, all,
>>
>> I am thinking about whether it's possible to write up a unit test
>> program for emulated IOMMUs (of course, Intel IOMMU would be the first
>> one). This would give us the chance to do quick functional regression
>> tests for the IOMMU just like for other devices, as well as customized
>> test sequences which are hard to do in real guests (e.g., specific
>> cache invalidations, error injections), etc.
>>
>> I see that the current qtest framework cannot support testing IOMMUs
>> well. For DMA remapping, things would be quite smooth. The problem is
>> that we still do not have a complete test framework for interrupts.
>> E.g., currently qtest is still an acceleration type in which we have
>> no vCPUs, nor interrupt chips like APICs (please correct me if I am
>> wrong).

qtest does have VCPUs; they just run a dummy loop.  You do have an APIC
too, but reading it doesn't work because cpu_get_current_apic() returns
NULL.

You can use kvm-unit-tests if qtest is not flexible enough.  It's
probably the simplest thing to do if you also want to test kernel LAPIC
and split irqchip operation.

Paolo

>> It's even further off if we want to test
>> something like kernel irqchips with QEMU. Not sure whether it's
>> possible to do the test based on a more realistic VM (e.g., with KVM
>> enabled, but just keeping the CPUs stalled?).
> 
> Adding David and Valentine as we were discussing this need in the
> context of the AMD IOMMU as well: you cannot test errors with
> workloads (like Linux) that do not trigger them under normal
> conditions.

Paolo



[Qemu-devel] [Bug 1588328] Re: Qemu 2.6 Solaris 9 Sparc Segmentation Fault

2016-06-29 Thread Zhen Ning Lim
Hmm, strange. I made a new disk, went into the setup, then formatted
the disk. After that, I rebooted and restarted the installation. But it
seems there is still no disk detected.

 Media [1]: 1
Reading disc for Solaris Operating Environment...

The system is being initialized, please wait... |
No Disks found. 
Check to make sure disks are cabled and powered up. 

 Press OK to Exit.

   
  
/iommu@0,1000/sbus@0,10001000/espdma@5,840/esp@5,880/sd@0,0
Specify disk (enter its number): 0


AVAILABLE DRIVE TYPES:
0. Auto configure
1. Quantum ProDrive 80S
2. Quantum ProDrive 105S
3. CDC Wren IV 94171-344
4. SUN0104
5. SUN0207
6. SUN0327
7. SUN0340
8. SUN0424
9. SUN0535
10. SUN0669
11. SUN1.0G
12. SUN1.05
13. SUN1.3G
14. SUN2.1G
15. SUN2.9G
16. Zip 100
17. Zip 250
18. other
Specify disk type (enter its number): 18
Enter number of data cylinders: 24620
Enter number of alternate cylinders[2]: 
Enter number of physical cylinders[24622]: 
Enter number of heads: 27
Enter physical number of heads[default]: 107
Enter number of data sectors/track: 107
Enter number of physical sectors/track[default]: 
Enter rpm of drive[3600]: 
Enter format time[default]: 
Enter cylinder skew[default]: 
Enter track skew[default]: 
Enter tracks per zone[default]: 
Enter alternate tracks[default]: 
Enter alternate sectors[default]: 
Enter cache control[default]: 
Enter prefetch threshold[default]: 
Enter minimum prefetch[default]: 
Enter maximum prefetch[default]: 
Enter disk type name (remember quotes): Sparc9
selecting c0t0d0
[disk formatted]


FORMAT MENU:
disk   - select a disk
type   - select (define) a disk type
partition  - select (define) a partition table
current- describe the current disk
format - format and analyze the disk
repair - repair a defective sector
label  - write label to the disk
analyze- surface analysis
defect - defect list management
backup - search for backup labels
verify - read and display labels
save   - save new disk/partition definitions
inquiry- show vendor, product and revision
scsi   - independent SCSI mode selects
cache  - enable, disable or query SCSI disk cache
volname- set 8-character volume name
! - execute , then return
quit
format> label
[0] SMI Label
[1] EFI Label
Specify Label type[0]: 1
Ready to label disk, continue?y

format> q

#reboot
Jun 28 23:37:16 rpcbind: rpcbind terminating on signal.
syncing file systems... done
rebooting...
rebooting ()
Configuration device id QEMU version 1 machine id 32
Probing SBus slot 0 offset 0
Probing SBus slot 1 offset 0
Probing SBus slot 2 offset 0
Probing SBus slot 3 offset 0
Probing SBus slot 4 offset 0
Probing SBus slot 5 offset 0
Invalid FCode start byte
CPUs: 1 x FMI,MB86904
UUID: ----
Welcome to OpenBIOS v1.1 built on Apr 18 2016 08:19
  Type 'help' for detailed information
Trying cdrom:d...
Not a bootable ELF image
Loading a.out image...
Loaded 7680 bytes
entry point is 0x4000
bootpath: 
/iommu@0,1000/sbus@0,10001000/espdma@5,840/esp@5,880/sd@2,0:d

Jumping to entry point 4000 for type 0005...
switching to new context:
SunOS Release 5.9 Version Generic_118558-34 32-bit
Copyright 1983-2003 Sun Microsystems, Inc.  All rights reserved.
Use is subject to license terms.
Configuring /dev and /devices
NOTICE: Couldn't set value (../../sun/io/audio/sada/drv/audiocs/audio_4231.c, 
Line #1759 0x00 0x88)
audio may not work correctly until it is stopped and restarted

Please specify the media from which you will install the Solaris Operating
Environment.

Media:

1. CD/DVD
2. Network File System
3. HTTP (Flash archive only)
4. FTP (Flash archive only)
5. Local Tape (Flash archive only)

   Media [1]: 1
Reading disc for Solaris Operating Environment...

The system is being initialized, please wait... -^[[6|^R
^[[/
No Disks found. 
Check to make sure disks are cabled and powered up. 

 Press OK to Exit.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1588328

Title:
  Qemu 2.6 Solaris 9 Sparc Segmentation Fault

Status in QEMU:
  New

Bug description:
  Hi,
  I tried the following command to boot Solaris 9 sparc:
  qemu-system-sparc -nographic -boot d -hda ./Spark9.disk -m 256 -cdrom 
sol-9-905hw-ga-sparc-dvd.iso -serial telnet:0.0.0.0:3000,server 

  It seems there are a few segmentation faults, one at the start of
  boot and another at the beginning of the command-line installation.

  Trying 127.0.0.1...
  Connected to localhost.
  Escape character is '^]'.
  Configuration device id QEMU version 1 machine id 32
  Probing SBus slot 0 offset 0
 

Re: [Qemu-devel] [PATCH v0] spapr: Restore support for older PowerPC CPU cores

2016-06-29 Thread Thomas Huth
On 28.06.2016 17:05, Bharata B Rao wrote:
> Introduction of core based CPU hotplug for PowerPC sPAPR didn't
> add support for 970 and POWER5+ based core types. Add support for
> the same.
> 
> Signed-off-by: Bharata B Rao 
> ---
> TODO:
> - There are a few other variants of the 970, like the 970FX etc., for which
>   I have not added core types since I am not sure if they fall under the
>   sPAPR category.

At least the 970MP was used in IBM's JS21 blade server, which was also a
sPAPR-based system, so you might at least want to add that CPU, too.

If I get https://en.wikipedia.org/wiki/PowerPC_970#PowerPC_970FX right,
the 970FX has only been used in Apple computers, so it's likely not
needed to add support for these FX CPUs.

 Thomas




Re: [Qemu-devel] Regression: virtio-pci: convert to ioeventfd callbacks

2016-06-29 Thread Cornelia Huck
On Wed, 29 Jun 2016 09:41:50 +0800
Jason Wang  wrote:

> 
> 
On 2016-06-27 17:44, Peter Lieven wrote:
> > Hi, with the above patch applied:
> >
> > commit 9f06e71a567ba5ee8b727e65a2d5347fd331d2aa
> > Author: Cornelia Huck 
> > Date:   Fri Jun 10 11:04:12 2016 +0200
> >
> > virtio-pci: convert to ioeventfd callbacks
> >
> > an Ubuntu 14.04 VM freezes at startup when blk-mq is set up - even if 
> > there is only one queue.
> >
> > Peter
> >
> >
> 
> In fact, I notice vhost-net does not work on master; it looks like we
> are trying to set the host notifier without initialization, which
> seems to be a bug.

Does the patch in <20160628101618.57bdb1e0.cornelia.h...@de.ibm.com> help?




Re: [Qemu-devel] [PATCH 1/3] block: ignore flush requests when storage is clean

2016-06-29 Thread Paolo Bonzini


On 28/06/2016 23:01, Paolo Bonzini wrote:
> 
> 
> On 24/06/2016 17:06, Denis V. Lunev wrote:
>> From: Evgeny Yakovlev 
>>
>> Some guests (win2008 server for example) do a lot of unnecessary
>> flushing when underlying media has not changed. This adds additional
>> overhead on host when calling fsync/fdatasync.
>>
> >> This change introduces a dirty flag in BlockDriverState which is set
> >> in bdrv_set_dirty and is checked in bdrv_co_flush. This allows us to
> >> avoid unnecessary flushing when storage is clean.
>>
>> The problem with excessive flushing was found by a performance test
>> which does parallel directory tree creation (from 2 processes).
>> Results improved from 0.424 loops/sec to 0.432 loops/sec.
>> Each loop creates 10^3 directories with 10 files in each.
>>
>> Signed-off-by: Evgeny Yakovlev 
>> Signed-off-by: Denis V. Lunev 
>> CC: Kevin Wolf 
>> CC: Max Reitz 
>> CC: Stefan Hajnoczi 
>> CC: Fam Zheng 
>> CC: John Snow 
>> ---
>>  block.c   |  1 +
>>  block/dirty-bitmap.c  |  3 +++
>>  block/io.c| 19 +++
>>  include/block/block_int.h |  2 ++
>>  4 files changed, 25 insertions(+)
>>
>> diff --git a/block.c b/block.c
>> index f4648e9..e36f148 100644
>> --- a/block.c
>> +++ b/block.c
>> @@ -2582,6 +2582,7 @@ int bdrv_truncate(BlockDriverState *bs, int64_t offset)
>>  ret = refresh_total_sectors(bs, offset >> BDRV_SECTOR_BITS);
>>  bdrv_dirty_bitmap_truncate(bs);
>>  bdrv_parent_cb_resize(bs);
>> +bs->dirty = true; /* file node sync is needed after truncate */
>>  }
>>  return ret;
>>  }
>> diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
>> index 4902ca5..54e0413 100644
>> --- a/block/dirty-bitmap.c
>> +++ b/block/dirty-bitmap.c
>> @@ -370,6 +370,9 @@ void bdrv_set_dirty(BlockDriverState *bs, int64_t 
>> cur_sector,
>>  }
>>  hbitmap_set(bitmap->bitmap, cur_sector, nr_sectors);
>>  }
>> +
>> +/* Set global block driver dirty flag even if bitmap is disabled */
>> +bs->dirty = true;
>>  }
>>  
>>  /**
>> diff --git a/block/io.c b/block/io.c
>> index 7cf3645..8078af2 100644
>> --- a/block/io.c
>> +++ b/block/io.c
>> @@ -2239,6 +2239,25 @@ int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
>>  goto flush_parent;
>>  }
>>  
>> +/* Check if storage is actually dirty before flushing to disk */
>> +if (!bs->dirty) {
>> +/* Flush requests are appended to tracked request list in order so 
>> that
>> + * most recent request is at the head of the list. Following code 
>> uses
>> + * this ordering to wait for the most recent flush request to 
>> complete
>> + * to ensure that requests return in order */
>> +BdrvTrackedRequest *prev_req;
>> +QLIST_FOREACH(prev_req, &bs->tracked_requests, list) {
>> +if (prev_req == &req || prev_req->type != BDRV_TRACKED_FLUSH) {
>> +continue;
>> +}
>> +
>> +qemu_co_queue_wait(&prev_req->wait_queue);
>> +break;
>> +}
>> +goto flush_parent;
> 
> Can you just have a CoQueue specific to flushes, where a completing
> flush does a restart_all on the CoQueue?

Something like this:

current_gen = bs->write_gen;
if (bs->flush_started_gen >= current_gen) {
    while (bs->flushed_gen < current_gen) {
        qemu_co_queue_wait(&bs->flush_queue);
    }
    return;
}

bs->flush_started_gen = current_gen;
...
if (current_gen > bs->flushed_gen) {
    bs->flushed_gen = current_gen;
    qemu_co_queue_restart_all(&bs->flush_queue);
}

Paolo

> Flushes are never serialising, so there's no reason for them to be in
> tracked_requests (I posted patches a while ago that instead use a simple
> atomic counter, but they will only be in 2.8).



Re: [Qemu-devel] Automated testing of block/gluster.c with upstream Gluster

2016-06-29 Thread Lukáš Doktor

On 28.6.2016 at 17:56, Niels de Vos wrote:

On Tue, Jun 28, 2016 at 05:20:03PM +0200, Lukáš Doktor wrote:

On 28.6.2016 at 16:10, Kevin Wolf wrote:

On 28.06.2016 at 11:02, Niels de Vos wrote:

Hi,

it seems we broke the block/gluster.c functionality with a recent patch
in upstream Gluster. In order to prevent this from happening in the
future, I would like to set up a Jenkins job that installs a plain
CentOS with its version of QEMU, and nightly builds of upstream Gluster.
Getting a notification about breakage the day after a patch got merged
seems like a reasonable approach.

The test should at least boot the generic CentOS cloud image (slightly
modified with libguestfs) and return a success/fail. I am wondering if
there are automated tests like this already, and if I could (re)use some
of the scripts for it. At the moment, I am thinking to do it like this:
 - download the image [1]
 - set kernel parameters to output on the serial console
 - add an auto-login user/script
 - have the script write "bootup complete" or something
 - have the script poweroff the VM
 - script that started the VM checks for the "bootup complete" message
 - return success/fail


Sounds like something that Avocado should be able (or actually is
designed) to do. I can't tell you the details of how to write the test
case for it, but I'm adding a CC to Lukáš who probably can (and I think
it shouldn't be hard anyway).

Kevin



Hello guys,

yes, Avocado is designed to do this and I believe it even contains quite
a few Gluster tests. You can look for them in avocado-vt or ping our QA
folks who might give you some pointers (cc Xu and Hao).

Regarding building the CI, I use the combination of Jenkins, Jenkins Job
Builder and Avocado (avocado-vt) to check power/arm
weekly/per-package-update. Jenkins even supports GitHub and other
triggers if you decide you have enough resources to check each
PR/commit. It all depends on what HW you have available.


That looks promising! It's a bit more complex (or at least 'new' for me)
than what I was hoping for. There is Gluster support in there; I found a
description of it here:
description of it here:
  http://avocado-vt.readthedocs.io/en/latest/GlusterFs.html
  http://avocado-vt.readthedocs.io/en/latest/RunQemuUnittests.html

Browsing through the docs does not really explain how to put a
configuration file together that runs the QEMU tests with a VM image on
Gluster, though. I probably need to read much more, but a pointer or a
very minimal example would be much appreciated.

When I'm able to run avocado-vt, it should be trivial to put that in a
Jenkins job :)

Many thanks,
Niels



Hello Niels,

yep, it should be quite simple, but I don't want to break my setup just
to try it out. Hopefully you'll manage to do it yourself. Anyway, a few
pointers...


Install avocado:


http://avocado-vt.readthedocs.io/en/latest/GetStartedGuide.html#installing-avocado

it should be straightforward, so I expect you're able to run `avocado
boot` already. There are several backends, so you can run the same tests
on top of them. Let's assume you're using `qemu`, which is the default.
The difference is the backend configuration location and some details
regarding importing images and so on. Qemu is the simplest for me as it
does not require anything.


Now regarding GlusterFS. By default avocado uses `only
(image_backend=filesystem)`, hardcoded in
`/usr/share/avocado_vt/backends/qemu/cfg/tests.cfg`. This is because the
vast majority of people don't want to change it, and if they do, they
usually use custom config files. I don't think you'd like to go that
way, so let's just patch that file and change it to `only
(image_backend=gluster)`.


Then you might take a look into
`/usr/share/avocado_vt/shared/cfg/guest-hw.cfg`, where you can find the
`variants image_backend:` section with several profiles, including
`gluster`. There you can specify `gluster_brick` and other options.


When you modify everything to your needs, re-run `avocado vt-bootstrap`
to update the configs and you should be ready to run. Any test should
then use the `gluster` image instead of a file-based image.



As for kvm-unit-tests, the documentation is seriously outdated and a new
version is in progress. You can use pure avocado for running
kvm-unit-tests; see
https://github.com/avocado-framework/avocado/pull/1280 for details. I'll
update the avocado-vt documentation when the script is merged.



In the end, I'd recommend using `--xunit` to produce JUnit results,
which you can import into Jenkins, including per-subtest statuses. Let
me send you my setup in a PM for inspiration...



Regards,
Lukáš





[Qemu-devel] [PATCH 2/2 V3] hmp: show all of snapshot info on every block dev in output of 'info snapshots'

2016-06-29 Thread Lin Ma
Currently, the output of 'info snapshots' shows only fully available
snapshots. It's opaque and hides some snapshot information from users.
It's not convenient if users want to know more about all of the snapshot
information on every block device via the monitor.

Following Kevin's and Max's proposals, the patch makes the output more
detailed:
(qemu) info snapshots
List of snapshots present on all disks:
 ID        TAG                 VM SIZE                DATE       VM CLOCK
 --        checkpoint-1          165M  2016-05-22 16:58:07   00:02:06.813

List of partial (non-loadable) snapshots on 'drive_image1':
 ID        TAG                 VM SIZE                DATE       VM CLOCK
 1         snap1                    0  2016-05-22 16:57:31   00:01:30.567

Signed-off-by: Lin Ma 
---
 migration/savevm.c | 95 ++
 1 file changed, 88 insertions(+), 7 deletions(-)

diff --git a/migration/savevm.c b/migration/savevm.c
index a8f22da..e5a5536 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2200,12 +2200,31 @@ void hmp_delvm(Monitor *mon, const QDict *qdict)
 void hmp_info_snapshots(Monitor *mon, const QDict *qdict)
 {
 BlockDriverState *bs, *bs1;
+BdrvNextIterator it1;
 QEMUSnapshotInfo *sn_tab, *sn;
+bool no_snapshot = true;
 int nb_sns, i;
 int total;
-int *available_snapshots;
+int *global_snapshots;
 AioContext *aio_context;
 
+typedef struct SnapshotEntry {
+QEMUSnapshotInfo sn;
+QTAILQ_ENTRY(SnapshotEntry) next;
+} SnapshotEntry;
+
+typedef struct ImageEntry {
+const char *imagename;
+QTAILQ_ENTRY(ImageEntry) next;
+QTAILQ_HEAD(, SnapshotEntry) snapshots;
+} ImageEntry;
+
+QTAILQ_HEAD(, ImageEntry) image_list =
+QTAILQ_HEAD_INITIALIZER(image_list);
+
+ImageEntry *image_entry;
+SnapshotEntry *snapshot_entry;
+
 bs = bdrv_all_find_vmstate_bs();
 if (!bs) {
 monitor_printf(mon, "No available block device supports snapshots\n");
@@ -,25 +2241,65 @@ void hmp_info_snapshots(Monitor *mon, const QDict 
*qdict)
 return;
 }
 
-if (nb_sns == 0) {
+for (bs1 = bdrv_first(&it1); bs1; bs1 = bdrv_next(&it1)) {
+int bs1_nb_sns = 0;
+ImageEntry *ie;
+SnapshotEntry *se;
+AioContext *ctx = bdrv_get_aio_context(bs1);
+
+aio_context_acquire(ctx);
+if (bdrv_can_snapshot(bs1)) {
+sn = NULL;
+bs1_nb_sns = bdrv_snapshot_list(bs1, &sn);
+if (bs1_nb_sns > 0) {
+no_snapshot = false;
+ie = g_new0(ImageEntry, 1);
+ie->imagename = bdrv_get_device_name(bs1);
+QTAILQ_INIT(&ie->snapshots);
+QTAILQ_INSERT_TAIL(&image_list, ie, next);
+for (i = 0; i < bs1_nb_sns; i++) {
+se = g_new0(SnapshotEntry, 1);
+se->sn = sn[i];
+QTAILQ_INSERT_TAIL(&ie->snapshots, se, next);
+}
+}
+g_free(sn);
+}
+aio_context_release(ctx);
+}
+
+if (no_snapshot) {
 monitor_printf(mon, "There is no snapshot available.\n");
 return;
 }
 
-available_snapshots = g_new0(int, nb_sns);
+global_snapshots = g_new0(int, nb_sns);
 total = 0;
 for (i = 0; i < nb_sns; i++) {
+SnapshotEntry *next_sn;
 if (bdrv_all_find_snapshot(sn_tab[i].name, &bs1) == 0) {
-available_snapshots[total] = i;
+global_snapshots[total] = i;
 total++;
+QTAILQ_FOREACH(image_entry, &image_list, next) {
+QTAILQ_FOREACH_SAFE(snapshot_entry, &image_entry->snapshots,
+next, next_sn) {
+if (!strcmp(sn_tab[i].name, snapshot_entry->sn.name)) {
+QTAILQ_REMOVE(&image_entry->snapshots, snapshot_entry,
+  next);
+g_free(snapshot_entry);
+}
+}
+}
 }
 }
 
+monitor_printf(mon, "List of snapshots present on all disks:\n");
+
 if (total > 0) {
 bdrv_snapshot_dump((fprintf_function)monitor_printf, mon, NULL);
 monitor_printf(mon, "\n");
 for (i = 0; i < total; i++) {
-sn = &sn_tab[available_snapshots[i]];
+sn = &sn_tab[global_snapshots[i]];
 /* The ID is not guaranteed to be the same on all images, so
  * overwrite it.
  */
@@ -2249,11 +2308,33 @@ void hmp_info_snapshots(Monitor *mon, const QDict 
*qdict)
 monitor_printf(mon, "\n");
 }
 } else {
-monitor_printf(mon, "There is no suitable snapshot available\n");
+monitor_printf(mon, "None\n");
 }
 
+QTAILQ_FOREACH(image_entry, &image_list, next) {
+if (QTAILQ_EMPTY(&image_entry->snapshots)) {
+continue;
+}
+  

[Qemu-devel] [PATCH 0/2 V3] Show all of snapshot info on every block dev

2016-06-29 Thread Lin Ma
V3: Fix leaking the entries in image_list and the entries in their
ImageEntry.snapshots lists

V2: Split it to 2 patches.

Lin Ma (2):
  hmp: use snapshot name to determine whether a snapshot is 'fully
available'
  hmp: show all of snapshot info on every block dev in output of 'info
snapshots'

 migration/savevm.c | 101 -
 1 file changed, 93 insertions(+), 8 deletions(-)

-- 
2.8.1




Re: [Qemu-devel] Question about qtest and IOMMU

2016-06-29 Thread Peter Xu
On Wed, Jun 29, 2016 at 09:08:29AM +0200, Paolo Bonzini wrote:
> 
> 
> On 29/06/2016 08:36, Jan Kiszka wrote:
> > On 2016-06-29 08:32, Peter Xu wrote:
> >> Hi, all,
> >>
> >> I am thinking about whether it's possible to write up a unit test
> >> program for emulated IOMMUs (of course, Intel IOMMU would be the first
> >> one). This would give us the chance to do quick functional regression
> >> tests for the IOMMU just like for other devices, as well as customized
> >> test sequences which are hard to do in real guests (e.g., specific
> >> cache invalidations, error injections), etc.
> >>
> >> I see that the current qtest framework cannot support testing IOMMUs
> >> well. For DMA remapping, things would be quite smooth. The problem is
> >> that we still do not have a complete test framework for interrupts.
> >> E.g., currently qtest is still an acceleration type in which we have
> >> no vCPUs, nor interrupt chips like APICs (please correct me if I am
> >> wrong).
> 
> qtest does have VCPUs; they just run a dummy loop.  You do have an APIC
> too, but reading it doesn't work because cpu_get_current_apic() returns
> NULL.

Right, thanks for pointing that out.

> 
> You can use kvm-unit-tests if qtest is not flexible enough.  It's
> probably the simplest thing to do if you also want to test kernel LAPIC
> and split irqchip operation.

Will have a look. Thanks Paolo. :)

-- peterx



[Qemu-devel] [PATCH 1/2 V3] hmp: use snapshot name to determine whether a snapshot is 'fully available'

2016-06-29 Thread Lin Ma
Currently qemu uses the snapshot ID to determine whether a snapshot is
fully available, which causes incorrect output in some scenarios.

For instance:
(qemu) info block
drive_image1 (#block113): /opt/vms/SLES12-SP1-JeOS-x86_64-GM/disk0.qcow2
(qcow2)
Cache mode:   writeback

drive_image2 (#block349): /opt/vms/SLES12-SP1-JeOS-x86_64-GM/disk1.qcow2
(qcow2)
Cache mode:   writeback
(qemu)
(qemu) info snapshots
There is no snapshot available.
(qemu)
(qemu) snapshot_blkdev_internal drive_image1 snap1
(qemu)
(qemu) info snapshots
There is no suitable snapshot available
(qemu)
(qemu) savevm checkpoint-1
(qemu)
(qemu) info snapshots
ID        TAG                 VM SIZE                DATE       VM CLOCK
1         snap1                    0  2016-05-22 16:57:31   00:01:30.567
(qemu)

$ qemu-img snapshot -l disk0.qcow2
Snapshot list:
ID        TAG                 VM SIZE                DATE       VM CLOCK
1         snap1                    0  2016-05-22 16:57:31   00:01:30.567
2         checkpoint-1          165M  2016-05-22 16:58:07   00:02:06.813

$ qemu-img snapshot -l disk1.qcow2
Snapshot list:
ID        TAG                 VM SIZE                DATE       VM CLOCK
1         checkpoint-1             0  2016-05-22 16:58:07   00:02:06.813

The patch uses the snapshot name instead of the snapshot ID to determine
whether a snapshot is fully available, and uses '--' instead of the
snapshot ID in the output because the ID is not guaranteed to be the same
on all images.
For instance:
(qemu) info snapshots
List of snapshots present on all disks:
 ID        TAG                 VM SIZE                DATE       VM CLOCK
 --        checkpoint-1          165M  2016-05-22 16:58:07   00:02:06.813

Signed-off-by: Lin Ma 
---
Reviewed-by: Max Reitz

 migration/savevm.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/migration/savevm.c b/migration/savevm.c
index 38b85ee..a8f22da 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2230,7 +2230,7 @@ void hmp_info_snapshots(Monitor *mon, const QDict *qdict)
 available_snapshots = g_new0(int, nb_sns);
 total = 0;
 for (i = 0; i < nb_sns; i++) {
-if (bdrv_all_find_snapshot(sn_tab[i].id_str, &bs1) == 0) {
+if (bdrv_all_find_snapshot(sn_tab[i].name, &bs1) == 0) {
 available_snapshots[total] = i;
 total++;
 }
@@ -2241,6 +2241,10 @@ void hmp_info_snapshots(Monitor *mon, const QDict *qdict)
 monitor_printf(mon, "\n");
 for (i = 0; i < total; i++) {
 sn = &sn_tab[available_snapshots[i]];
+/* The ID is not guaranteed to be the same on all images, so
+ * overwrite it.
+ */
+pstrcpy(sn->id_str, sizeof(sn->id_str), "--");
 bdrv_snapshot_dump((fprintf_function)monitor_printf, mon, sn);
 monitor_printf(mon, "\n");
 }
-- 
2.8.1




Re: [Qemu-devel] [PATCH 2/2] arm/virt: Mark pcie controller node as dma-coherent

2016-06-29 Thread Bogdan Purcareata
On 28.06.2016 17:16, Peter Maydell wrote:
> On 16 June 2016 at 14:58, Ard Biesheuvel  wrote:
>> On 2 June 2016 at 14:45, Alexander Graf  wrote:
>>> On 02.06.16 14:32, Peter Maydell wrote:
 This patch seems to change the property of the emulated PCIe controller
 based on the host PCIe controller even if we're not doing any PCIe
 passthrough at all. That seems definitely wrong to me.

 (Should the purely-emulated case be marked DMA-coherent anyway?
 I forget the fiddly details...)
>>>
>>> I do too, let's involve a few people who know :). Not exposing it as
>>> coherent is definitely wrong, but whether "dma-coherent" is the right
>>> choice I don't know.
>
>> As far as I understand it, the purely emulated case should be marked
>> DMA coherent, since otherwise, guest drivers may perform cache
>> maintenance that the host is not expecting. This is especially harmful
>> if the guest invalidates the caches after a device to memory transfer,
>> which may result in data being lost if the data was only present in
>> the caches to begin with (which is the case for devices that are
>> emulated by the host)
>
> So the consensus seems to be that:
>  * emulated PCI devices definitely need dma-coherent
>  * passthrough devices where the host controller is dma-coherent
>also need dma-coherent
>  * passthrough devices where the host controller is not dma-coherent
>don't want dma-coherent, but we have to set things per-PCI-controller
>
> Would somebody like to write a patch which just unconditionally
> sets the dma-coherent property on our PCI controller dt node?

I will send this patch later today.

Best regards,
Bogdan P.

> That seems a clear improvement on what we have at the moment.
> We can look at whether we want to support passthrough from a
> non-dma-coherent host pci controller (via a 2nd guest pci controller?)
> later...
>
> thanks
> -- PMM
>
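For reference, the property under discussion is the standard empty `dma-coherent` boolean on the host controller's device-tree node. A minimal sketch of what the guest would see follows; the node name and address are illustrative, not the exact virt-board layout.

```dts
pcie@10000000 {
    compatible = "pci-host-ecam-generic";
    device_type = "pci";
    /* Empty boolean property: bus-master DMA snoops the caches, so
     * guest drivers must not do cache maintenance for this bus. */
    dma-coherent;
};
```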

Re: [Qemu-devel] [PATCH v5 6/9] ast2400: add SMC controllers (FMC and SPI)

2016-06-29 Thread Cédric Le Goater
On 06/28/2016 08:24 PM, Cédric Le Goater wrote:
> The Aspeed AST2400 SoC includes a static memory controller for the BMC
> which supports NOR, NAND and SPI flash memory modules. This controller
> has two modes: the SMC for the legacy interface, which supports only
> one module, and the FMC for the new interface, which supports up to
> five modules. The AST2400 also includes an SPI-only controller used
> for the host firmware, commonly called BIOS on Intel. It can be used
> in three modes: SPI master, SPI slave and SPI pass-through.
> 
> Below is the initial framework for the SMC controller (FMC mode only)
> and the SPI controller: the sysbus object, MMIO for registers
> configuration and controls. Each controller has a SPI bus and a
> configurable number of CS lines for SPI flash slaves.
> 
> The differences between the controllers are small, so they are
> abstracted using indirections on the register numbers.
> 
> Only SPI flash modules are supported.
> 
> Signed-off-by: Cédric Le Goater 
> ---
> 
>  Changes since v3:
> 
>  - Fixed typos on CTRL
>  - Fixed multiple error handling when setting properties in
>ast2400_realize()

Peter,

I missed one ... See below.

>  - Added error messages when max_slaves is exceeded
>  - Added definitions for R_INTR_CTRL register bits
>  - Added definitions for R_DMA_* registers
>  - Constantified a couple of routine arguments
>  - Sorted out what was needed for migration (registers only, a priori)
> 
>  Changes since v2:
> 
>  - Switched to a realize ops to be able to handle errors.
> 
>  hw/arm/ast2400.c|  34 -
>  hw/ssi/Makefile.objs|   1 +
>  hw/ssi/aspeed_smc.c | 326 
> 
>  include/hw/arm/ast2400.h|   3 +
>  include/hw/ssi/aspeed_smc.h |  79 +++
>  5 files changed, 442 insertions(+), 1 deletion(-)
>  create mode 100644 hw/ssi/aspeed_smc.c
>  create mode 100644 include/hw/ssi/aspeed_smc.h
> 
> diff --git a/hw/arm/ast2400.c b/hw/arm/ast2400.c
> index b14a82fcdef1..b16ba2d0c516 100644
> --- a/hw/arm/ast2400.c
> +++ b/hw/arm/ast2400.c
> @@ -23,6 +23,9 @@
>  #define AST2400_UART_5_BASE  0x00184000
>  #define AST2400_IOMEM_SIZE   0x0020
>  #define AST2400_IOMEM_BASE   0x1E60
> +#define AST2400_SMC_BASE AST2400_IOMEM_BASE /* Legacy SMC */
> +#define AST2400_FMC_BASE 0X1E62
> +#define AST2400_SPI_BASE 0X1E63
>  #define AST2400_VIC_BASE 0x1E6C
>  #define AST2400_SCU_BASE 0x1E6E2000
>  #define AST2400_TIMER_BASE   0x1E782000
> @@ -85,13 +88,21 @@ static void ast2400_init(Object *obj)
>"hw-strap1", &error_abort);
>  object_property_add_alias(obj, "hw-strap2", OBJECT(&s->scu),
>"hw-strap2", &error_abort);
> +
> +object_initialize(&s->smc, sizeof(s->smc), "aspeed.smc.fmc");
> +object_property_add_child(obj, "smc", OBJECT(&s->smc), NULL);
> +qdev_set_parent_bus(DEVICE(&s->smc), sysbus_get_default());
> +
> +object_initialize(&s->spi, sizeof(s->spi), "aspeed.smc.spi");
> +object_property_add_child(obj, "spi", OBJECT(&s->spi), NULL);
> +qdev_set_parent_bus(DEVICE(&s->spi), sysbus_get_default());
>  }
>  
>  static void ast2400_realize(DeviceState *dev, Error **errp)
>  {
>  int i;
>  AST2400State *s = AST2400(dev);
> -Error *err = NULL;
> +Error *err = NULL, *local_err = NULL;
>  
>  /* IO space */
>  memory_region_init_io(&s->iomem, NULL, &ast2400_io_ops, NULL,
> @@ -147,6 +158,27 @@ static void ast2400_realize(DeviceState *dev, Error 
> **errp)
>  sysbus_mmio_map(SYS_BUS_DEVICE(&s->i2c), 0, AST2400_I2C_BASE);
>  sysbus_connect_irq(SYS_BUS_DEVICE(&s->i2c), 0,
> qdev_get_gpio_in(DEVICE(&s->vic), 12));
> +
> +/* SMC */
> +object_property_set_int(OBJECT(&s->smc), 1, "num-cs", &err);
> +object_property_set_bool(OBJECT(&s->smc), true, "realized", &err);

It should be '&local_err' above, and it is missing a:

   error_propagate(&err, local_err);

Please tell me if you want a resend.

Thanks,

C. 

> +if (err) {
> +error_propagate(errp, err);
> +return;
> +}
> +sysbus_mmio_map(SYS_BUS_DEVICE(&s->smc), 0, AST2400_FMC_BASE);
> +sysbus_connect_irq(SYS_BUS_DEVICE(&s->smc), 0,
> +   qdev_get_gpio_in(DEVICE(&s->vic), 19));
> +
> +/* SPI */
> +object_property_set_int(OBJECT(&s->spi), 1, "num-cs", &err);
> +object_property_set_bool(OBJECT(&s->spi), true, "realized", &local_err);
> +error_propagate(&err, local_err);
> +if (err) {
> +error_propagate(errp, err);
> +return;
> +}
> +sysbus_mmio_map(SYS_BUS_DEVICE(&s->spi), 0, AST2400_SPI_BASE);
>  }
>  
>  static void ast2400_class_init(ObjectClass *oc, void *data)
> diff --git a/hw/ssi/Makefile.objs b/hw/ssi/Makefile.objs
> index fcbb79ef0185..c79a8dcd86a9 100644
> --

Re: [Qemu-devel] Regression: virtio-pci: convert to ioeventfd callbacks

2016-06-29 Thread Jason Wang



On 2016年06月29日 15:23, Cornelia Huck wrote:

On Wed, 29 Jun 2016 09:41:50 +0800
Jason Wang  wrote:



On 2016年06月27日 17:44, Peter Lieven wrote:

Hi, with the above patch applied:

commit 9f06e71a567ba5ee8b727e65a2d5347fd331d2aa
Author: Cornelia Huck 
Date:   Fri Jun 10 11:04:12 2016 +0200

 virtio-pci: convert to ioeventfd callbacks

a Ubuntu 14.04 VM freezes at startup when blk-mq is set up - even if
there is only one queue.

Peter



In fact, I notice vhost-net does not work on master; it looks like we are
trying to set a host notifier without initialization, which seems to be a bug


Does the patch in <20160628101618.57bdb1e0.cornelia.h...@de.ibm.com> help?




It doesn't help.

Thanks



Re: [Qemu-devel] [PULL 6/8] qemu-img: move common options parsing before commands processing

2016-06-29 Thread Denis V. Lunev

On 06/29/2016 12:27 AM, Stefan Hajnoczi wrote:

From: "Denis V. Lunev" 

This is necessary to enable creation of common qemu-img options which will
be specified before the command.

The patch also enables '-V' as an alias for '--version' (exactly as in the
other block utilities) and documents this change.

Signed-off-by: Denis V. Lunev 
Reviewed-by: Eric Blake 
Reviewed-by: Stefan Hajnoczi 
Message-id: 1466174654-30130-7-git-send-email-...@openvz.org
CC: Paolo Bonzini 
CC: Kevin Wolf 
Signed-off-by: Stefan Hajnoczi 
---
  qemu-img.c| 41 +++--
  qemu-img.texi | 10 +-
  2 files changed, 36 insertions(+), 15 deletions(-)

diff --git a/qemu-img.c b/qemu-img.c
index 14e2661..2194c2d 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -91,9 +91,12 @@ static void QEMU_NORETURN help(void)
  {
  const char *help_msg =
 QEMU_IMG_VERSION
-   "usage: qemu-img command [command options]\n"
+   "usage: qemu-img [standard options] command [command options]\n"
 "QEMU disk image utility\n"
 "\n"
+   "'-h', '--help'   display this help and exit\n"
+   "'-V', '--version'output version information and exit\n"
+   "\n"
 "Command syntax:\n"
  #define DEF(option, callback, arg_string)\
 "  " arg_string "\n"
@@ -3806,7 +3809,7 @@ int main(int argc, char **argv)
  int c;
  static const struct option long_options[] = {
  {"help", no_argument, 0, 'h'},
-{"version", no_argument, 0, 'v'},
+{"version", no_argument, 0, 'V'},
  {0, 0, 0, 0}
  };
  
@@ -3829,28 +3832,38 @@ int main(int argc, char **argv)

  if (argc < 2) {
  error_exit("Not enough arguments");
  }
-cmdname = argv[1];
  
  qemu_add_opts(&qemu_object_opts);

  qemu_add_opts(&qemu_source_opts);
  
+while ((c = getopt_long(argc, argv, "+hV", long_options, NULL)) != -1) {

+switch (c) {
+case 'h':
+help();
+return 0;
+case 'V':
+printf(QEMU_IMG_VERSION);
+return 0;
+}
+}
+
+cmdname = argv[optind];
+
+/* reset getopt_long scanning */
+argc -= optind;
+if (argc < 1) {
+return 0;
+}
+argv += optind;
+optind = 1;

This patch breaks check-block.sh; we should have here:
  'optind = 0'

Den



Re: [Qemu-devel] [PATCH] configure: mark qemu-ga VSS includes as system headers

2016-06-29 Thread Thomas Huth
On 29.06.2016 01:43, Michael Roth wrote:
> As of e4650c81, we do w32 builds with -Werror enabled. Unfortunately
> for cases where we enable VSS support in qemu-ga, we still have
> warnings generated by VSS includes that ship as part of the Microsoft
> VSS SDK.
> 
> We can selectively address a number of these warnings using
> 
>   #pragma GCC diagnostic ignored ...
> 
> but at least one of these:
> 
>   warning: ‘typedef’ was ignored in this declaration
> 
> resulting from declarations of the form:
> 
>   typedef struct Blah { ... };
> 
> does not provide a specific command-line/pragma option to disable
> warnings of the sort.
> 
> To allow VSS builds to succeed, the next-best option is disabling
> these warnings on a per-file basis. pragmas like #pragma GCC
> system_header can be used to declare subsequent includes/declarations
> as being exempt from normal warnings, but this must be done within
> a header file.
> 
> Since we don't control the VSS SDK, we'd need to rely on an
> intermediate header include to accomplish this, and
> since different objects in the VSS link target rely on different
> headers from the VSS SDK, this would become somewhat of a rat's nest
> (though not totally unmanageable).
> 
> The next step up in granularity is just marking the entire VSS
> SDK include path as system headers via -isystem. This is a bit more
> heavy-handed, but since this SDK hasn't changed since 2005, there's
> likely little to be gained from selectively disabling warnings
> anyway, so we implement that approach here.
> 
> This fixes the -Werror failures in both the configure test and the
> qga build due to shared reliance on $vss_win32_include. For the
> same reason, this also enforces a new dependency on -isystem support
> in the C/C++ compiler when building QGA with VSS enabled.

Did we ever support any non-GCC-based compiler for building QGA? I don't
think so, but in the worst case, we could later add a check whether the
compiler supports that parameter, too...

Anyway, I think your patch is a nice and clean way to deal with the
error messages from these headers, so:

Reviewed-by: Thomas Huth 




Re: [Qemu-devel] [PATCH v4 1/3] block: ignore flush requests when storage is clean

2016-06-29 Thread Denis V. Lunev

On 06/29/2016 04:12 AM, Fam Zheng wrote:

On Tue, 06/28 12:10, Denis V. Lunev wrote:

On 06/28/2016 04:27 AM, Fam Zheng wrote:

On Mon, 06/27 17:47, Denis V. Lunev wrote:

From: Evgeny Yakovlev 

Some guests (win2008 server for example) do a lot of unnecessary
flushing when underlying media has not changed. This adds additional
overhead on host when calling fsync/fdatasync.

This change introduces a dirty flag in BlockDriverState which is set
in bdrv_set_dirty and is checked in bdrv_co_flush. This allows us to
avoid unnecessary flushing when storage is clean.

The problem with excessive flushing was found by a performance test
which does parallel directory tree creation (from 2 processes).
Results improved from 0.424 loops/sec to 0.432 loops/sec.
Each loop creates 10^3 directories with 10 files in each.

Signed-off-by: Evgeny Yakovlev 
Signed-off-by: Denis V. Lunev 
CC: Kevin Wolf 
CC: Max Reitz 
CC: Stefan Hajnoczi 
CC: Fam Zheng 
CC: John Snow 
---
   block.c   |  1 +
   block/dirty-bitmap.c  |  3 +++
   block/io.c| 19 +++
   include/block/block_int.h |  1 +
   4 files changed, 24 insertions(+)

diff --git a/block.c b/block.c
index 947df29..68ae3a0 100644
--- a/block.c
+++ b/block.c
@@ -2581,6 +2581,7 @@ int bdrv_truncate(BlockDriverState *bs, int64_t offset)
   ret = refresh_total_sectors(bs, offset >> BDRV_SECTOR_BITS);
   bdrv_dirty_bitmap_truncate(bs);
   bdrv_parent_cb_resize(bs);
+bs->dirty = true; /* file node sync is needed after truncate */
   }
   return ret;
   }
diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index 4902ca5..54e0413 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -370,6 +370,9 @@ void bdrv_set_dirty(BlockDriverState *bs, int64_t 
cur_sector,
   }
   hbitmap_set(bitmap->bitmap, cur_sector, nr_sectors);
   }
+
+/* Set global block driver dirty flag even if bitmap is disabled */
+bs->dirty = true;
   }
   /**
diff --git a/block/io.c b/block/io.c
index b9e53e3..152f5a9 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2247,6 +2247,25 @@ int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
   goto flush_parent;
   }
+/* Check if storage is actually dirty before flushing to disk */
+if (!bs->dirty) {
+/* Flush requests are appended to tracked request list in order so that
+ * most recent request is at the head of the list. Following code uses
+ * this ordering to wait for the most recent flush request to complete
+ * to ensure that requests return in order */
+BdrvTrackedRequest *prev_req;
+QLIST_FOREACH(prev_req, &bs->tracked_requests, list) {
+if (prev_req == &req || prev_req->type != BDRV_TRACKED_FLUSH) {
+continue;
+}
+
+qemu_co_queue_wait(&prev_req->wait_queue);
+break;
+}
+goto flush_parent;

Should we check bs->dirty again after qemu_co_queue_wait()? I think another
write request could sneak in while this coroutine yields.

No, we do not care. Any write subsequent to the FLUSH is not guaranteed
to be flushed. The only guarantee is that all write requests completed
prior to this flush are really flushed.

I'm not worried about subsequent requests.

A prior request can be already in progress or be waiting when we check
bs->dirty, though it would be false there, but it will become true soon --
bdrv_set_dirty is only called when a request is completing.

Fam

I have written specifically about this situation. FLUSH in the
controller does not guarantee that requests which are in
progress at the moment when the flush is initiated will be
flushed.

It guarantees that requests which are completed, i.e. whose
'COMPLETED' status was returned to the guest, will be flushed.

Den



Re: [Qemu-devel] [PATCH 1/3] block: ignore flush requests when storage is clean

2016-06-29 Thread Denis V. Lunev

On 06/29/2016 10:36 AM, Paolo Bonzini wrote:


On 28/06/2016 23:01, Paolo Bonzini wrote:


On 24/06/2016 17:06, Denis V. Lunev wrote:

From: Evgeny Yakovlev 

Some guests (win2008 server for example) do a lot of unnecessary
flushing when underlying media has not changed. This adds additional
overhead on host when calling fsync/fdatasync.

This change introduces a dirty flag in BlockDriverState which is set
in bdrv_set_dirty and is checked in bdrv_co_flush. This allows us to
avoid unnecessary flushing when storage is clean.

The problem with excessive flushing was found by a performance test
which does parallel directory tree creation (from 2 processes).
Results improved from 0.424 loops/sec to 0.432 loops/sec.
Each loop creates 10^3 directories with 10 files in each.

Signed-off-by: Evgeny Yakovlev 
Signed-off-by: Denis V. Lunev 
CC: Kevin Wolf 
CC: Max Reitz 
CC: Stefan Hajnoczi 
CC: Fam Zheng 
CC: John Snow 
---
  block.c   |  1 +
  block/dirty-bitmap.c  |  3 +++
  block/io.c| 19 +++
  include/block/block_int.h |  2 ++
  4 files changed, 25 insertions(+)

diff --git a/block.c b/block.c
index f4648e9..e36f148 100644
--- a/block.c
+++ b/block.c
@@ -2582,6 +2582,7 @@ int bdrv_truncate(BlockDriverState *bs, int64_t offset)
  ret = refresh_total_sectors(bs, offset >> BDRV_SECTOR_BITS);
  bdrv_dirty_bitmap_truncate(bs);
  bdrv_parent_cb_resize(bs);
+bs->dirty = true; /* file node sync is needed after truncate */
  }
  return ret;
  }
diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index 4902ca5..54e0413 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -370,6 +370,9 @@ void bdrv_set_dirty(BlockDriverState *bs, int64_t 
cur_sector,
  }
  hbitmap_set(bitmap->bitmap, cur_sector, nr_sectors);
  }
+
+/* Set global block driver dirty flag even if bitmap is disabled */
+bs->dirty = true;
  }
  
  /**

diff --git a/block/io.c b/block/io.c
index 7cf3645..8078af2 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2239,6 +2239,25 @@ int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
  goto flush_parent;
  }
  
+/* Check if storage is actually dirty before flushing to disk */

+if (!bs->dirty) {
+/* Flush requests are appended to tracked request list in order so that
+ * most recent request is at the head of the list. Following code uses
+ * this ordering to wait for the most recent flush request to complete
+ * to ensure that requests return in order */
+BdrvTrackedRequest *prev_req;
+QLIST_FOREACH(prev_req, &bs->tracked_requests, list) {
+if (prev_req == &req || prev_req->type != BDRV_TRACKED_FLUSH) {
+continue;
+}
+
+qemu_co_queue_wait(&prev_req->wait_queue);
+break;
+}
+goto flush_parent;

Can you just have a CoQueue specific to flushes, where a completing
flush does a restart_all on the CoQueue?

Something like this:

 current_gen = bs->write_gen;
 if (bs->flush_started_gen >= current_gen) {
 while (bs->flushed_gen < current_gen) {
 qemu_co_queue_wait(&bs->flush_queue);
 }
 return;
 }

 bs->flush_started_gen = current_gen;
 ...
 if (current_gen < bs->flushed_gen) {
 bs->flushed_gen = current_gen;
 qemu_co_queue_restart_all(&bs->flush_queue);
 }

Paolo


I had exactly this in mind originally, but the current queue
with tracked requests is also useful. If it is going to be removed
in 2.8, the generation approach would also work.

Thank you,
Den



Re: [Qemu-devel] [PATCH v8 11/12] vfio: register aer resume notification handler for aer resume

2016-06-29 Thread Zhou Jie

Hi Alex,


And yet we have struct pci_dev.broken_intx_masking and we test for
working DisINTx via pci_intx_mask_supported() rather than simply
looking for a PCIe device.  Some devices are broken and some simply
don't follow the spec, so you're going to need to deal with that or
exclude those devices.

For those devices I have no way to disable the INTx.


How does that happen, aren't we notifying the user at the point the
error occurs, while the device is still in the process or being reset?
My question is how does the user know that the host reset is complete
in order to begin their own re-initialization?

I will add a state in "struct vfio_pci_device".
The state is set when the device cannot work, such as when an AER
error occurs, and cleared when the device can work again, such as
when resume is received.
The state is returned when the user gets info via vfio_pci_ioctl.


The interrupt status will be cleared by hardware.
So the hardware is the same as the state when the
vfio device fd is opened.


The PCI-core in Linux will save and restore the device state around
reset, how do we know that vfio-pci itself is not racing that reset and
whether PCI-core will restore the state including our interrupt masking
or a state without it?  Do we need to restore the state to the one we
saved when we originally opened the device?  Shouldn't that mean we
teardown the interrupt setup the user had prior to the error event?

Regarding the above: maybe disabling the interrupt is not a good idea.
Think about what can happen in the interrupt handler: it may
read/write config space and the region BARs.
I will make config space read-only and do nothing for the region
BARs, which are used by the user.


How will the user know when the device is
ready to be reset?  Which of the ioctls that you're blocking can they
poll w/o any unwanted side-effects or awkward interactions?  Should
flag bits in the device info ioctl indicate not only support for this
behavior but also the current status?  Thanks,

I can block the reset ioctl and config writes.
I will not add a flag for the device's current status,
because I don't depend on the user to prevent awkward interactions.


Ok, so that's a reason to block rather than return -EAGAIN.  Still we
need some way to indicate to the user whether the device supports this
new interaction rather than the existing behavior.  Thanks,

Because config space writes may happen in an interrupt handler,
I think blocking is not a good choice.

Sincerely
Zhou Jie





Re: [Qemu-devel] [PATCH v4 1/3] block: ignore flush requests when storage is clean

2016-06-29 Thread Stefan Hajnoczi
On Wed, Jun 29, 2016 at 09:12:41AM +0800, Fam Zheng wrote:
> On Tue, 06/28 12:10, Denis V. Lunev wrote:
> > On 06/28/2016 04:27 AM, Fam Zheng wrote:
> > > On Mon, 06/27 17:47, Denis V. Lunev wrote:
> > > > From: Evgeny Yakovlev 
> > > > 
> > > > Some guests (win2008 server for example) do a lot of unnecessary
> > > > flushing when underlying media has not changed. This adds additional
> > > > overhead on host when calling fsync/fdatasync.
> > > > 
> > > > This change introduces a dirty flag in BlockDriverState which is set
> > > > in bdrv_set_dirty and is checked in bdrv_co_flush. This allows us to
> > > > avoid unnecessary flushing when storage is clean.
> > > > 
> > > > The problem with excessive flushing was found by a performance test
> > > > which does parallel directory tree creation (from 2 processes).
> > > > Results improved from 0.424 loops/sec to 0.432 loops/sec.
> > > > Each loop creates 10^3 directories with 10 files in each.
> > > > 
> > > > Signed-off-by: Evgeny Yakovlev 
> > > > Signed-off-by: Denis V. Lunev 
> > > > CC: Kevin Wolf 
> > > > CC: Max Reitz 
> > > > CC: Stefan Hajnoczi 
> > > > CC: Fam Zheng 
> > > > CC: John Snow 
> > > > ---
> > > >   block.c   |  1 +
> > > >   block/dirty-bitmap.c  |  3 +++
> > > >   block/io.c| 19 +++
> > > >   include/block/block_int.h |  1 +
> > > >   4 files changed, 24 insertions(+)
> > > > 
> > > > diff --git a/block.c b/block.c
> > > > index 947df29..68ae3a0 100644
> > > > --- a/block.c
> > > > +++ b/block.c
> > > > @@ -2581,6 +2581,7 @@ int bdrv_truncate(BlockDriverState *bs, int64_t 
> > > > offset)
> > > >   ret = refresh_total_sectors(bs, offset >> BDRV_SECTOR_BITS);
> > > >   bdrv_dirty_bitmap_truncate(bs);
> > > >   bdrv_parent_cb_resize(bs);
> > > > +bs->dirty = true; /* file node sync is needed after truncate */
> > > >   }
> > > >   return ret;
> > > >   }
> > > > diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
> > > > index 4902ca5..54e0413 100644
> > > > --- a/block/dirty-bitmap.c
> > > > +++ b/block/dirty-bitmap.c
> > > > @@ -370,6 +370,9 @@ void bdrv_set_dirty(BlockDriverState *bs, int64_t 
> > > > cur_sector,
> > > >   }
> > > >   hbitmap_set(bitmap->bitmap, cur_sector, nr_sectors);
> > > >   }
> > > > +
> > > > +/* Set global block driver dirty flag even if bitmap is disabled */
> > > > +bs->dirty = true;
> > > >   }
> > > >   /**
> > > > diff --git a/block/io.c b/block/io.c
> > > > index b9e53e3..152f5a9 100644
> > > > --- a/block/io.c
> > > > +++ b/block/io.c
> > > > @@ -2247,6 +2247,25 @@ int coroutine_fn bdrv_co_flush(BlockDriverState 
> > > > *bs)
> > > >   goto flush_parent;
> > > >   }
> > > > +/* Check if storage is actually dirty before flushing to disk */
> > > > +if (!bs->dirty) {
> > > > +/* Flush requests are appended to tracked request list in 
> > > > order so that
> > > > + * most recent request is at the head of the list. Following 
> > > > code uses
> > > > + * this ordering to wait for the most recent flush request to 
> > > > complete
> > > > + * to ensure that requests return in order */
> > > > +BdrvTrackedRequest *prev_req;
> > > > +QLIST_FOREACH(prev_req, &bs->tracked_requests, list) {
> > > > +if (prev_req == &req || prev_req->type != 
> > > > BDRV_TRACKED_FLUSH) {
> > > > +continue;
> > > > +}
> > > > +
> > > > +qemu_co_queue_wait(&prev_req->wait_queue);
> > > > +break;
> > > > +}
> > > > +goto flush_parent;
> > > Should we check bs->dirty again after qemu_co_queue_wait()? I think 
> > > another
> > > write request could sneak in while this coroutine yields.
> > No, we do not care. Any write subsequent to the FLUSH is not guaranteed
> > to be flushed. The only guarantee is that all write requests completed
> > prior to this flush are really flushed.
> 
> I'm not worried about subsequent requests.
> 
> A prior request can be already in progress or be waiting when we check
> bs->dirty, though it would be false there, but it will become true soon --
> bdrv_set_dirty is only called when a request is completing.

Flush only guarantees that already completed writes are persistent.  It
is not a barrier operation.  It does not wait for in-flight writes and
makes no guarantee regarding them.

Stefan




[Qemu-devel] [Bug 1586229] Re: seabios hell

2016-06-29 Thread T. Huth
Sounds like you're describing problems with SeaBIOS, not with QEMU. May I
suggest reporting these issues to the SeaBIOS project instead? See
http://seabios.org/

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1586229

Title:
  seabios hell

Status in QEMU:
  New

Bug description:
  getting weird annoying seabios hell and not sure how to fix it.

  ok.

  there IS a SEA-BIOS. There IS a way in.

  -I found it by mistake.(and yall need to move the BIOS key...its in
  the wrong place)

  I was trying to boot Yosemite to re-install. I mashed the key too early
  and it wanted to boot the hard drive.

  Apparently the bios loads AFTER the hard drive wants to boot, not
  BEFORE it.And it will ONLY load when booting a hard disk.

  ..Booting hard disk...[mash F8 here but let go and wait]
  eventually will want to load the OS and clear the screen[mash F8 again]

  --Youre in!

  Its tiny, like a mini award bios but youre in! 
  -Change anything HERE, though...and kiss booting a cd goodbye!

  Im trying to diagnose a black screen, seems related to seabios, not
  the vga driver.

  -mayhaps wants to boot hard disk but in fact its not bootable as the
  installer hung(and often unices install bootloader late in process)?

  I cant boot the disc to reinstall to tell. But I have a few dos iso
  lying around...hmmm.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1586229/+subscriptions



Re: [Qemu-devel] Regression: virtio-pci: convert to ioeventfd callbacks

2016-06-29 Thread Cornelia Huck
On Wed, 29 Jun 2016 16:21:07 +0800
Jason Wang  wrote:

> 
> 
> On 2016年06月29日 15:23, Cornelia Huck wrote:
> > On Wed, 29 Jun 2016 09:41:50 +0800
> > Jason Wang  wrote:
> >
> >>
> >> On 2016年06月27日 17:44, Peter Lieven wrote:
> >>> Hi, with the above patch applied:
> >>>
> >>> commit 9f06e71a567ba5ee8b727e65a2d5347fd331d2aa
> >>> Author: Cornelia Huck 
> >>> Date:   Fri Jun 10 11:04:12 2016 +0200
> >>>
> >>>  virtio-pci: convert to ioeventfd callbacks
> >>>
> >>> a Ubuntu 14.04 VM freezes at startup when blk-mq is set up - even if
> >>> there is only one queue.
> >>>
> >>> Peter
> >>>
> >>>
> >> In fact, I notice vhost-net does not work on master; it looks like we are
> >> trying to set a host notifier without initialization, which seems to be a bug
> >>
> > Does the patch in <20160628101618.57bdb1e0.cornelia.h...@de.ibm.com> help?
> >
> >
> 
> It doesn't help.

Thanks for checking.

Debugging. This is broken here (virtio-ccw) as well.




Re: [Qemu-devel] [PATCH V3 2/3] hw/apci: handle 64-bit MMIO regions correctly

2016-06-29 Thread Igor Mammedov
On Tue, 28 Jun 2016 20:08:42 +0300
Marcel Apfelbaum  wrote:

> On 06/28/2016 05:39 PM, Igor Mammedov wrote:
> > On Tue, 28 Jun 2016 12:59:27 +0300
> > Marcel Apfelbaum  wrote:
> >  
> >> In build_crs(), the calculation and merging of the ranges already happens
> >> in 64-bit, but the entry boundaries are silently truncated to 32-bit in the
> >> call to aml_dword_memory(). Fix it by handling the 64-bit MMIO ranges 
> >> separately.
> >> This fixes 64-bit BARs behind PXBs.
> >>
> >> Signed-off-by: Marcel Apfelbaum   
> > patch indeed fixes issue with truncating in aml_dword_memory()
> > as per hunk
> >
> > @@ -3306,12 +3306,12 @@ DefinitionBlock ("", "DSDT", 1, "BOCHS ", 
> > "BXPCDSDT", 0x0001)
> >   0x, // Translation Offset
> >   0x0020, // Length
> >   ,, , AddressRangeMemory, TypeStatic)
> > -DWordMemory (ResourceProducer, PosDecode, MinFixed, 
> > MaxFixed, NonCacheable, ReadWrite,
> > -0x, // Granularity
> > -0x, // Range Minimum
> > -0x, // Range Maximum
> > -0x, // Translation Offset
> > -0x, // Length
> > +QWordMemory (ResourceProducer, PosDecode, MinFixed, 
> > MaxFixed, NonCacheable, ReadWrite,
> > +0x, // Granularity
> > +0x0001, // Range Minimum
> > +0x0001, // Range Maximum
> > +0x, // Translation Offset
> > +0x0001, // Length
> >   ,, , AddressRangeMemory, TypeStatic)
> >   WordBusNumber (ResourceProducer, MinFixed, MaxFixed, 
> > PosDecode,
> >   0x, // Granularity
> >
> > However, a second hunk is present which touches the 32-bit part of _CRS:
> >
> > @@ -3372,9 +3372,9 @@ DefinitionBlock ("", "DSDT", 1, "BOCHS ", "BXPCDSDT", 
> > 0x0001)
> >   DWordMemory (ResourceProducer, PosDecode, MinFixed, MaxFixed, 
> > NonCacheable, ReadWrite,
> >   0x, // Granularity
> >   0xFEA0, // Range Minimum
> > -0x, // Range Maximum
> > +0xFEBF, // Range Maximum
> >   0x, // Translation Offset
> > -0x0160, // Length
> > +0x0020, // Length
> >   ,, , AddressRangeMemory, TypeStatic)
> >   })
> >
> > was it expected? Why?
> >  
> 
> Yes, it is expected. It is the same bug. If you try a pc machine
> without pxb you will have 0xFEBF as the 32-bit upper IOMMU limit.
> 
> However, merging 32-bit ranges with 64-bit ranges results in a wrong
> upper limit.
> 
> So this is a second fix to the same problem.

Reviewed-by: Igor Mammedov 


> 
> 
> Thanks,
> Marcel
> 
> >> ---
> >>   hw/i386/acpi-build.c | 53 
> >> +++-
> >>   1 file changed, 44 insertions(+), 9 deletions(-)
> >>
> >> diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> >> index f306ae3..3808347 100644
> >> --- a/hw/i386/acpi-build.c
> >> +++ b/hw/i386/acpi-build.c
> >> @@ -746,18 +746,22 @@ static void crs_range_free(gpointer data)
> >>   typedef struct CrsRangeSet {
> >>   GPtrArray *io_ranges;
> >>   GPtrArray *mem_ranges;
> >> +GPtrArray *mem_64bit_ranges;
> >>} CrsRangeSet;
> >>
> >>   static void crs_range_set_init(CrsRangeSet *range_set)
> >>   {
> >>   range_set->io_ranges = 
> >> g_ptr_array_new_with_free_func(crs_range_free);
> >>   range_set->mem_ranges = 
> >> g_ptr_array_new_with_free_func(crs_range_free);
> >> +range_set->mem_64bit_ranges =
> >> +g_ptr_array_new_with_free_func(crs_range_free);
> >>   }
> >>
> >>   static void crs_range_set_free(CrsRangeSet *range_set)
> >>   {
> >>   g_ptr_array_free(range_set->io_ranges, true);
> >>   g_ptr_array_free(range_set->mem_ranges, true);
> >> +g_ptr_array_free(range_set->mem_64bit_ranges, true);
> >>   }
> >>
> >>   static gint crs_range_compare(gconstpointer a, gconstpointer b)
> >> @@ -915,8 +919,14 @@ static Aml *build_crs(PCIHostState *host, CrsRangeSet 
> >> *range_set)
> >>* that do not support multiple root buses
> >>*/
> >>   if (range_base && range_base <= range_limit) {
> >> -crs_range_insert(temp_range_set.mem_ranges,
> >> - range_base, range_limit);
> >> +uint64_t length = range_limit - range_base + 1;
> >> +if (range_limit <= UINT32_MAX && length <= UINT32_MAX) {
> >> +crs_range_insert(temp_range_set.mem_ranges,
> >> + 

Re: [Qemu-devel] [PULL 00/32] Misc patches for QEMU soft freeze

2016-06-29 Thread Peter Maydell
On 28 June 2016 at 18:33, Paolo Bonzini  wrote:
> The following changes since commit 7dd929dfdc5c52ce79b21bf557ff506e89acbf63:
>
>   configure: Make AVX2 test robust to non-ELF systems (2016-06-28 15:40:40 
> +0100)
>
> are available in the git repository at:
>
>   git://github.com/bonzini/qemu.git tags/for-upstream
>
> for you to fetch changes up to ea74c50f48100860ef4d27f4a1b2aa3f5cb9a766:
>
>   vl: smp_parse: fix regression (2016-06-28 19:19:29 +0200)
>
> 
> * serial port fixes (Paolo)
> * Q35 modeling improvements (Paolo, Vasily)
> * chardev cleanup improvements (Marc-André)
> * iscsi bugfix (Peter L.)
> * cpu_exec patch from multi-arch patches (Peter C.)
> * pci-assign tweak (Lin Ma)

This triggers a lot of errors from the clang ubsan:

/home/petmay01/linaro/qemu-for-merges/qemu-char.c:4043:5: runtime
error: member access within misaligned address 0x101010101010101 for
type 'CharDriverState' (aka 'struct CharDriverState'), which requires
8 byte alignment
0x101010101010101: note: pointer points here


(There was also a hang trying to run tests on 32-bit
ARM, which might or might not be related. Don't have
more details on that one, sorry.)

thanks
-- PMM



Re: [Qemu-devel] [PATCH 2/3] replay: allow replay stopping and restarting

2016-06-29 Thread Pavel Dovgalyuk
Ping?

Pavel Dovgalyuk

> -Original Message-
> From: Pavel Dovgalyuk [mailto:dovga...@ispras.ru]
> Sent: Monday, June 20, 2016 9:27 AM
> To: 'Paolo Bonzini'; 'Pavel Dovgalyuk'
> Cc: qemu-devel@nongnu.org; jasow...@redhat.com; ag...@suse.de; 
> da...@gibson.dropbear.id.au
> Subject: RE: [PATCH 2/3] replay: allow replay stopping and restarting
> 
> > From: Paolo Bonzini [mailto:pbonz...@redhat.com]
> > > From: "Pavel Dovgalyuk" 
> > > This patch fixes bug with stopping and restarting replay
> > > through monitor.
> > >
> > > Signed-off-by: Pavel Dovgalyuk 
> > > ---
> > >  block/blkreplay.c|   18 +-
> > >  cpus.c   |1 +
> > >  include/sysemu/replay.h  |2 ++
> > >  replay/replay-internal.h |2 --
> > >  vl.c |1 +
> > >  5 files changed, 17 insertions(+), 7 deletions(-)
> > >
> > > diff --git a/block/blkreplay.c b/block/blkreplay.c
> > > index 42f1813..438170c 100644
> > > --- a/block/blkreplay.c
> > > +++ b/block/blkreplay.c
> > > @@ -70,6 +70,14 @@ static void blkreplay_bh_cb(void *opaque)
> > >  g_free(req);
> > >  }
> > >
> > > +static uint64_t blkreplay_next_id(void)
> > > +{
> > > +if (replay_events_enabled()) {
> > > +return request_id++;
> > > +}
> > > +return 0;
> > > +}
> >
> > What happens if 0 is returned?
> 
> It could be any value. When replay events are disabled,
> it means that either replay is disabled or execution is stopped.
> In the first case we won't pass these requests through the replay queue,
> and therefore the id is useless.
> In stopped mode we have to keep request_id unchanged to make
> record/replay deterministic.
> 
> > I think that you want to call
> > replay_disable_events...
> >
> > >  bdrv_drain_all();
> >
> > ... after this bdrv_drain_all.
> 
> Why? We disable replay events to avoid adding new block requests
> to the queue. How is this related to drain all?
> 
> >
> > I was going to suggest using qemu_add_vm_change_state_handler
> > in replay_start (which could have replaced the existing call
> > to replay_enable_events), but that's not possible if you have
> > to do your calls after bdrv_drain_all.
> 
> Pavel Dovgalyuk





Re: [Qemu-devel] [PATCH] block/qdev: Fix NULL access when using BB twice

2016-06-29 Thread Kevin Wolf
Am 23.06.2016 um 09:30 hat Kevin Wolf geschrieben:
> BlockBackend has only a single pointer to its guest device, so it makes
> sure that only a single guest device is attached to it. device-add
> returns an error if you try to attach a second device to a BB. In order
> to make the error message nicer, a -device that manually connects to an
> if=none block device gets a different message than a -drive that implicitly
> creates a guest device. The if=... option is stored in DriveInfo.
> 
> However, since blockdev-add exists, not every BlockBackend has a
> DriveInfo any more. Check that it exists before we dereference it.
> 
> QMP reproducer resulting in a segfault:
> 
> {"execute":"blockdev-add","arguments":{"options":{"id":"disk","driver":"file","filename":"/tmp/test.img"}}}
> {"execute":"device_add","arguments":{"driver":"virtio-blk-pci","drive":"disk"}}
> {"execute":"device_add","arguments":{"driver":"virtio-blk-pci","drive":"disk"}}
> 
> Signed-off-by: Kevin Wolf 

Applied to my block branch.

Kevin



Re: [Qemu-devel] Automated testing of block/gluster.c with upstream Gluster

2016-06-29 Thread Niels de Vos
On Wed, Jun 29, 2016 at 09:39:22AM +0200, Lukáš Doktor wrote:
> On 28.6.2016 at 17:56, Niels de Vos wrote:
> > On Tue, Jun 28, 2016 at 05:20:03PM +0200, Lukáš Doktor wrote:
> > > On 28.6.2016 at 16:10, Kevin Wolf wrote:
> > > > On 28.06.2016 at 11:02, Niels de Vos wrote:
> > > > > Hi,
> > > > > 
> > > > > it seems we broke the block/gluster.c functionality with a recent patch
> > > > > in upstream Gluster. In order to prevent this from happening in the
> > > > > future, I would like to set up a Jenkins job that installs a plain CentOS
> > > > > with its version of QEMU, and nightly builds of upstream Gluster.
> > > > > Getting a notification about breakage the day after a patch got merged
> > > > > seems like a reasonable approach.
> > > > > 
> > > > > The test should at least boot the generic CentOS cloud image (slightly
> > > > > modified with libguestfs) and return a success/fail. I am wondering if
> > > > > there are automated tests like this already, and if I could (re)use some
> > > > > of the scripts for it. At the moment, I am thinking to do it like this:
> > > > >  - download the image [1]
> > > > >  - set kernel parameters to output on the serial console
> > > > >  - add an auto-login user/script
> > > > >  - have the script write "bootup complete" or something
> > > > >  - have the script poweroff the VM
> > > > >  - script that started the VM checks for the "bootup complete" message
> > > > >  - return success/fail
> > > > 
> > > > Sounds like something that Avocado should be able (or actually is
> > > > designed) to do. I can't tell you the details of how to write the test
> > > > case for it, but I'm adding a CC to Lukáš who probably can (and I think
> > > > it shouldn't be hard anyway).
> > > > 
> > > > Kevin
> > > > 
> > > 
> > > Hello guys,
> > > 
> > > yes, Avocado is designed to do this and I believe it even contains quite a
> > > few Gluster tests. You can look for them in avocado-vt or ping our QA folks
> > > who might give you some pointers (cc Xu and Hao).
> > > 
> > > Regarding building the CI, I use the combination of Jenkins, Jenkins job
> > > builder and Avocado (avocado-vt) to check power/arm
> > > weekly/per-package-update. Jenkins even supports github and other triggers
> > > if you decide you have enough resources to check each PR/commit. It all
> > > depends on what HW you have available.
> > 
> > That looks promising! It's a bit more complex (or at least 'new' for me)
> > than I was hoping. There is Gluster support in there; I found a
> > description of it here:
> >   http://avocado-vt.readthedocs.io/en/latest/GlusterFs.html
> >   http://avocado-vt.readthedocs.io/en/latest/RunQemuUnittests.html
> > 
> > Browsing through the docs does not really explain to me how to put a
> > configuration file together that runs the QEMU tests with a VM image on
> > Gluster, though. I probably need to read much more, but a pointer or a very
> > minimal example would be much appreciated.
> > 
> > When I'm able to run avocado-vt, it should be trivial to put that in a
> > Jenkins job :)
> > 
> > Many thanks,
> > Niels
> > 
> 
> Hello Niels,
> 
> yep, it should be quite simple, but I don't want to break my setup just to
> try it out. Hopefully you'll manage to do it yourself. Anyway, a few
> pointers...
> 
> Install avocado:
> 
> 
> http://avocado-vt.readthedocs.io/en/latest/GetStartedGuide.html#installing-avocado
> 
> it should be straightforward, so I expect you're able to run `avocado boot`
> already. There are several backends, so you can run the same tests on top of
> them. Let's assume you're using `qemu`, which is the default. The difference
> is the backend configuration location and some details regarding importing
> images and so on. Qemu is the simplest for me as it does not require
> anything.
> 
> Now regarding the GlusterFS. By default avocado uses `only
> (image_backend=filesystem)` hardcoded in
> `/usr/share/avocado_vt/backends/qemu/cfg/tests.cfg`. This is because the vast
> majority of people don't want to change it, and if they do they usually use
> custom config files. I don't think you'd like to go that way, so let's just
> patch that file and change it to `only (image_backend=gluster)`.
> 
> Then you might take a look into
> `/usr/share/avocado_vt/shared/cfg/guest-hw.cfg` where you can find the
> `variants image_backend:` and several profiles there including `gluster`.
> You can specify the `gluster_brick` and other options.
> 
> When you modify everything to your needs you should re-run `avocado
> vt-bootstrap` to update the configs and you should be ready to run. Any test
> should then use the `gluster` image instead of a file-based image.
> 
> 
> As for the kvm-unit-test, the documentation is seriously outdated and a new
> version is in progress. You can use pure avocado for running
> kvm-unit-test, see https://github.com/avocado-framework/avocado/pull/1280
> for details. I'll update the avocado-vt documentation whe

[Qemu-devel] [PULL 2/6] ipxe: add e1000e rom

2016-06-29 Thread Gerd Hoffmann
Signed-off-by: Gerd Hoffmann 
---
 roms/Makefile | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/roms/Makefile b/roms/Makefile
index 7bd1252..e8133fe 100644
--- a/roms/Makefile
+++ b/roms/Makefile
@@ -1,11 +1,13 @@
 
 vgabios_variants := stdvga cirrus vmware qxl isavga virtio
 vgabios_targets  := $(subst -isavga,,$(patsubst %,vgabios-%.bin,$(vgabios_variants)))
-pxerom_variants  := e1000 eepro100 ne2k_pci pcnet rtl8139 virtio
-pxerom_targets   := 8086100e 80861209 10500940 10222000 10ec8139 1af41000
+pxerom_variants  := e1000 e1000e eepro100 ne2k_pci pcnet rtl8139 virtio
+pxerom_targets   := 8086100e 808610d3 80861209 10500940 10222000 10ec8139 1af41000
 
 pxe-rom-e1000    efi-rom-e1000    : VID := 8086
 pxe-rom-e1000    efi-rom-e1000    : DID := 100e
+pxe-rom-e1000e   efi-rom-e1000e   : VID := 8086
+pxe-rom-e1000e   efi-rom-e1000e   : DID := 10d3
 pxe-rom-eepro100 efi-rom-eepro100 : VID := 8086
 pxe-rom-eepro100 efi-rom-eepro100 : DID := 1209
 pxe-rom-ne2k_pci efi-rom-ne2k_pci : VID := 1050
-- 
1.8.3.1




[Qemu-devel] [PULL 5/6] vmxnet3: add boot rom

2016-06-29 Thread Gerd Hoffmann
Disable for old machine types as this is a guest-visible change.

Signed-off-by: Gerd Hoffmann 
---
 hw/net/vmxnet3.c | 1 +
 include/hw/i386/pc.h | 4 
 2 files changed, 5 insertions(+)

diff --git a/hw/net/vmxnet3.c b/hw/net/vmxnet3.c
index d978976..25cee9f 100644
--- a/hw/net/vmxnet3.c
+++ b/hw/net/vmxnet3.c
@@ -2700,6 +2700,7 @@ static void vmxnet3_class_init(ObjectClass *class, void *data)
 c->vendor_id = PCI_VENDOR_ID_VMWARE;
 c->device_id = PCI_DEVICE_ID_VMWARE_VMXNET3;
 c->revision = PCI_DEVICE_ID_VMWARE_VMXNET3_REVISION;
+c->romfile = "efi-vmxnet3.rom";
 c->class_id = PCI_CLASS_NETWORK_ETHERNET;
 c->subsystem_vendor_id = PCI_VENDOR_ID_VMWARE;
 c->subsystem_id = PCI_DEVICE_ID_VMWARE_VMXNET3;
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 49566c8..a112efb 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -362,6 +362,10 @@ bool e820_get_entry(int, uint32_t, uint64_t *, uint64_t *);
 .driver   = TYPE_X86_CPU,\
 .property = "cpuid-0xb",\
 .value= "off",\
+},{\
+.driver   = "vmxnet3",\
+.property = "romfile",\
+.value= "",\
 },
 
 #define PC_COMPAT_2_5 \
-- 
1.8.3.1




[Qemu-devel] [PULL 3/6] ipxe: add vmxnet3 rom

2016-06-29 Thread Gerd Hoffmann
Signed-off-by: Gerd Hoffmann 
---
 roms/Makefile | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/roms/Makefile b/roms/Makefile
index e8133fe..88b3709 100644
--- a/roms/Makefile
+++ b/roms/Makefile
@@ -1,8 +1,8 @@
 
 vgabios_variants := stdvga cirrus vmware qxl isavga virtio
 vgabios_targets  := $(subst -isavga,,$(patsubst %,vgabios-%.bin,$(vgabios_variants)))
-pxerom_variants  := e1000 e1000e eepro100 ne2k_pci pcnet rtl8139 virtio
-pxerom_targets   := 8086100e 808610d3 80861209 10500940 10222000 10ec8139 1af41000
+pxerom_variants  := e1000 e1000e eepro100 ne2k_pci pcnet rtl8139 virtio vmxnet3
+pxerom_targets   := 8086100e 808610d3 80861209 10500940 10222000 10ec8139 1af41000 15ad07b0
 
 pxe-rom-e1000    efi-rom-e1000    : VID := 8086
 pxe-rom-e1000    efi-rom-e1000    : DID := 100e
@@ -18,6 +18,8 @@ pxe-rom-rtl8139  efi-rom-rtl8139  : VID := 10ec
 pxe-rom-rtl8139  efi-rom-rtl8139  : DID := 8139
 pxe-rom-virtio   efi-rom-virtio   : VID := 1af4
 pxe-rom-virtio   efi-rom-virtio   : DID := 1000
+pxe-rom-vmxnet3  efi-rom-vmxnet3  : VID := 15ad
+pxe-rom-vmxnet3  efi-rom-vmxnet3  : DID := 07b0
 
 #
 # cross compiler auto detection
-- 
1.8.3.1




[Qemu-devel] [PULL 0/6] ipxe: update submodule from 4e03af8ec to 041863191

2016-06-29 Thread Gerd Hoffmann
  Hi,

Here comes the ipxe update for 2.7, rebasing the ipxe submodule to the latest
master and also adding boot roms for e1000e and vmxnet3.

please pull,
  Gerd

The following changes since commit c7288767523f6510cf557707d3eb5e78e519b90d:

  Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.7-20160623' into staging (2016-06-23 11:53:14 +0100)

are available in the git repository at:


  git://git.kraxel.org/qemu tags/pull-ipxe-20160629-1

for you to fetch changes up to c52125ab9280733b8d265195f6ffe9c76772b0a5:

  ipxe: update prebuilt binaries (2016-06-24 14:18:19 +0200)


ipxe: update submodule from 4e03af8ec to 041863191
e1000e+vmxnet3: add boot rom


Gerd Hoffmann (6):
  ipxe: update submodule from 4e03af8ec to 041863191
  ipxe: add e1000e rom
  ipxe: add vmxnet3 rom
  e1000e: add boot rom
  vmxnet3: add boot rom
  ipxe: update prebuilt binaries

 hw/net/e1000e.c  |   1 +
 hw/net/vmxnet3.c |   1 +
 include/hw/i386/pc.h |   4 
 pc-bios/efi-e1000.rom| Bin 196608 -> 209408 bytes
 pc-bios/efi-e1000e.rom   | Bin 0 -> 209408 bytes
 pc-bios/efi-eepro100.rom | Bin 197120 -> 209920 bytes
 pc-bios/efi-ne2k_pci.rom | Bin 195584 -> 208384 bytes
 pc-bios/efi-pcnet.rom| Bin 195584 -> 208384 bytes
 pc-bios/efi-rtl8139.rom  | Bin 199168 -> 211456 bytes
 pc-bios/efi-virtio.rom   | Bin 193024 -> 211456 bytes
 pc-bios/efi-vmxnet3.rom  | Bin 0 -> 205312 bytes
 roms/Makefile|   8 ++--
 roms/ipxe|   2 +-
 13 files changed, 13 insertions(+), 3 deletions(-)
 create mode 100644 pc-bios/efi-e1000e.rom
 create mode 100644 pc-bios/efi-vmxnet3.rom



Re: [Qemu-devel] [PATCH 2/3] ide: ignore retry_unit check for non-retry operations

2016-06-29 Thread Evgeny Yakovlev



On 28.06.2016 23:56, Paolo Bonzini wrote:


On 24/06/2016 17:06, Denis V. Lunev wrote:

When doing a DMA request, ide/core.c will set s->retry_unit to s->unit in
ide_start_dma. When the DMA completes, ide_set_inactive sets retry_unit to -1.
After that ide_flush_cache runs and fails thanks to blkdebug.
ide_flush_cb calls ide_handle_rw_error, which asserts that s->retry_unit
== s->unit. But s->retry_unit is still -1 after the previous DMA completion,
and flush does not use anything related to retry.

Wouldn't the assertion fail for a PIO read/write too?  Perhaps
retry_unit should be set to s->unit in ide_transfer_start too.


If PIO follows DMA and fails, then yes, it looks like it will trigger an 
assert. I am not sure about setting retry_unit in ide_transfer_start. It 
looks like currently only DMA I/O entries touch retry_unit at all. Does 
that mean that PIO, flush, etc. do not support retries by design and we 
need to add more exceptions to the assert check, or is it a real bug in how 
retries are initialized?




Paolo





[Qemu-devel] [PULL 1/6] ipxe: update submodule from 4e03af8ec to 041863191

2016-06-29 Thread Gerd Hoffmann
shortlog


Andrew Widdersheim (1):
  [netdevice] Add "ifname" setting

Carl Henrik Lunde (1):
  [vmxnet3] Avoid completely filling the TX descriptor ring

Christian Hesse (2):
  [golan] Fix build error on some versions of gcc
  [ath9k] Fix buffer overrun for ar9287

Christian Nilsson (2):
  [intel] Add PCI device ID for another I219-V
  [intel] Add PCI device ID for another I219-LM

Hummel Frank (1):
  [intel] Add INTEL_NO_PHY_RST for I218-LM

Kyösti Mälkki (1):
  [intel] Add PCI IDs for i210/i211 flashless operation

Ladi Prosek (6):
  [pci] Add pci_find_next_capability()
  [virtio] Add virtio 1.0 constants and data structures
  [virtio] Add virtio 1.0 PCI support
  [virtio] Add virtio-net 1.0 support
  [virtio] Renumber virtio_pci_region flags
  [virtio] Fix virtio-pci logging

Leendert van Doorn (2):
  [tg3] Fix address truncation bug on 64-bit machines
  [tg3] Add missing memory barrier

Michael Brown (287):
  [settings] Re-add "uristring" setting type
  [dhcp] Do not skip ProxyDHCPREQUEST if next-server is empty
  [efi] Add definitions of GUIDs observed when booting shim.efi and grub.efi
  [efi] Mark EFI debug transcription functions as __attribute__ (( pure ))
  [efi] Remove raw EFI_HANDLE values from debug messages
  [efi] Include installed protocol list in unknown handle names
  [efi] Improve efi_wrap debugging
  [pxe] Construct all fake DHCP packets before starting PXE NBP
  [efi] Add definitions of GUIDs observed when booting wdsmgfw.efi
  [efi] Fix debug directory size
  [efi] Populate debug directory entry FileOffset field
  [build] Search for ldlinux.c32 separately from isolinux.bin
  [tcpip] Allow supported address families to be detected at runtime
  [efi] Allow calls to efi_snp_claim() and efi_snp_release() to be nested
  [efi] Fix order of events on SNP removal path
  [efi] Do not return EFI_NOT_READY from our ReceiveFilters() method
  [pxe] Populate ciaddr in fake PXE Boot Server ACK packet
  [uri] Generalise tftp_uri() to pxe_uri()
  [efi] Implement the EFI_PXE_BASE_CODE_PROTOCOL
  [usb] Expose usb_find_driver()
  [usb] Add function to device's function list before attempting probe
  [efi] Add USB headers and GUID definitions
  [efi] Allow efidev_parent() to traverse multiple device generations
  [efi] Add a USB host controller driver based on EFI_USB_IO_PROTOCOL
  [tcpip] Avoid generating positive zero for transmitted UDP checksums
  [usb] Generalise zero-length packet generation logic
  [ehci] Do not treat zero-length NULL pointers as unreachable
  [ehci] Support arbitrarily large transfers
  [xhci] Support arbitrarily large transfers
  [efi] Provide efi_devpath_len()
  [efi] Include a copy of the device path within struct efi_device
  [usb] Select preferred USB device configuration based on driver score
  [usb] Allow for wildcard USB class IDs
  [efi] Expose unused USB devices via EFI_USB_IO_PROTOCOL
  [ncm] Support setting MAC address
  [build] Remove dependency on libiberty
  [efi] Minimise use of iPXE header files when building host utilities
  [pxe] Invoke INT 1a,564e when PXE stack is activated
  [pxe] Notify BIOS via INT 1a,564e for each new network device
  [efi] Work around broken 32-bit PE executable parsing in ImageHlp.dll
  [efi] Avoid infinite loops when asked to stop non-existent devices
  [efi] Expose an UNDI interface alongside the existing SNP interface
  [malloc] Avoid integer overflow for excessively large memory allocations
  [peerdist] Avoid NULL pointer dereference for plaintext blocks
  [http] Verify server port when reusing a pooled connection
  [efi] Reset root directory when installing EFI_SIMPLE_FILE_SYSTEM_PROTOCOL
  [efi] Update to current EDK2 headers
  [efi] Import EFI_HII_FONT_PROTOCOL definitions
  [fbcon] Allow character height to be selected at runtime
  [fbcon] Move margin calculations to fbcon.c
  [console] Tidy up config/console.h
  [build] Generalise CONSOLE_VESAFB to CONSOLE_FRAMEBUFFER
  [efi] Add support for EFI_GRAPHICS_OUTPUT_PROTOCOL frame buffer consoles
  [dhcp] Reset start time when deferring discovery
  [dhcp] Limit maximum number of DHCP discovery deferrals
  [comboot] Reset console before starting COMBOOT executable
  [intel] Forcibly skip PHY reset on some models
  [intel] Correct definition of receive overrun bit
  [infiniband] Add definitions for FDR and EDR link speeds
  [infiniband] Add qword accessors for ib_guid and ib_gid
  [pci] Add definitions for PCI Express function level reset (FLR)
  [bitops] Fix definitions for big-endian devices
  [smsc95xx] Add driver for SMSC/Microchip LAN95xx USB Ethernet NICs
  [bitops] Provide BIT_QWORD_PTR()
  [efi] Add %.usb target for building EFI-bootable USB (or other

[Qemu-devel] [PULL 6/6] ipxe: update prebuilt binaries

2016-06-29 Thread Gerd Hoffmann
Signed-off-by: Gerd Hoffmann 
---
 pc-bios/efi-e1000.rom| Bin 196608 -> 209408 bytes
 pc-bios/efi-e1000e.rom   | Bin 0 -> 209408 bytes
 pc-bios/efi-eepro100.rom | Bin 197120 -> 209920 bytes
 pc-bios/efi-ne2k_pci.rom | Bin 195584 -> 208384 bytes
 pc-bios/efi-pcnet.rom| Bin 195584 -> 208384 bytes
 pc-bios/efi-rtl8139.rom  | Bin 199168 -> 211456 bytes
 pc-bios/efi-virtio.rom   | Bin 193024 -> 211456 bytes
 pc-bios/efi-vmxnet3.rom  | Bin 0 -> 205312 bytes
 8 files changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 pc-bios/efi-e1000e.rom
 create mode 100644 pc-bios/efi-vmxnet3.rom

diff --git a/pc-bios/efi-e1000.rom b/pc-bios/efi-e1000.rom
index 4bc89a3..4e61f9b 100644
Binary files a/pc-bios/efi-e1000.rom and b/pc-bios/efi-e1000.rom differ
diff --git a/pc-bios/efi-e1000e.rom b/pc-bios/efi-e1000e.rom
new file mode 100644
index 000..192a437
Binary files /dev/null and b/pc-bios/efi-e1000e.rom differ
diff --git a/pc-bios/efi-eepro100.rom b/pc-bios/efi-eepro100.rom
index 85b7f9b..66c5226 100644
Binary files a/pc-bios/efi-eepro100.rom and b/pc-bios/efi-eepro100.rom differ
diff --git a/pc-bios/efi-ne2k_pci.rom b/pc-bios/efi-ne2k_pci.rom
index ebafd84..8c3e5fd 100644
Binary files a/pc-bios/efi-ne2k_pci.rom and b/pc-bios/efi-ne2k_pci.rom differ
diff --git a/pc-bios/efi-pcnet.rom b/pc-bios/efi-pcnet.rom
index 6f19723..802e225 100644
Binary files a/pc-bios/efi-pcnet.rom and b/pc-bios/efi-pcnet.rom differ
diff --git a/pc-bios/efi-rtl8139.rom b/pc-bios/efi-rtl8139.rom
index 086551b..8827181 100644
Binary files a/pc-bios/efi-rtl8139.rom and b/pc-bios/efi-rtl8139.rom differ
diff --git a/pc-bios/efi-virtio.rom b/pc-bios/efi-virtio.rom
index 140c680..2fc0497 100644
Binary files a/pc-bios/efi-virtio.rom and b/pc-bios/efi-virtio.rom differ
diff --git a/pc-bios/efi-vmxnet3.rom b/pc-bios/efi-vmxnet3.rom
new file mode 100644
index 000..3d42635
Binary files /dev/null and b/pc-bios/efi-vmxnet3.rom differ
-- 
1.8.3.1




[Qemu-devel] [PULL 4/6] e1000e: add boot rom

2016-06-29 Thread Gerd Hoffmann
Signed-off-by: Gerd Hoffmann 
---
 hw/net/e1000e.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/net/e1000e.c b/hw/net/e1000e.c
index 692283f..4778744 100644
--- a/hw/net/e1000e.c
+++ b/hw/net/e1000e.c
@@ -693,6 +693,7 @@ static void e1000e_class_init(ObjectClass *class, void *data)
 c->vendor_id = PCI_VENDOR_ID_INTEL;
 c->device_id = E1000_DEV_ID_82574L;
 c->revision = 0;
+c->romfile = "efi-e1000e.rom";
 c->class_id = PCI_CLASS_NETWORK_ETHERNET;
 c->is_express = 1;
 
-- 
1.8.3.1




[Qemu-devel] [PATCH v2 kernel 2/7] virtio-balloon: define new feature bit and page bitmap head

2016-06-29 Thread Liang Li
Add a new feature which supports sending the page information with
a bitmap. The current implementation uses a PFN array, which is not
very efficient. Using a bitmap can improve the performance of
inflating/deflating significantly.

The page bitmap header will be used to tell the host some information
about the page bitmap, e.g. the page size, page bitmap length and
start pfn.

Signed-off-by: Liang Li 
Cc: Michael S. Tsirkin 
Cc: Paolo Bonzini 
Cc: Cornelia Huck 
Cc: Amit Shah 
---
 include/uapi/linux/virtio_balloon.h | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/include/uapi/linux/virtio_balloon.h b/include/uapi/linux/virtio_balloon.h
index 343d7dd..d3b182a 100644
--- a/include/uapi/linux/virtio_balloon.h
+++ b/include/uapi/linux/virtio_balloon.h
@@ -34,6 +34,7 @@
 #define VIRTIO_BALLOON_F_MUST_TELL_HOST 0 /* Tell before reclaiming pages */
 #define VIRTIO_BALLOON_F_STATS_VQ  1 /* Memory Stats virtqueue */
 #define VIRTIO_BALLOON_F_DEFLATE_ON_OOM 2 /* Deflate balloon on OOM */
+#define VIRTIO_BALLOON_F_PAGE_BITMAP   3 /* Send page info with bitmap */
 
 /* Size of a PFN in the balloon interface. */
 #define VIRTIO_BALLOON_PFN_SHIFT 12
@@ -82,4 +83,22 @@ struct virtio_balloon_stat {
__virtio64 val;
 } __attribute__((packed));
 
+/* Page bitmap header structure */
+struct balloon_bmap_hdr {
+   /* Used to distinguish different request */
+   __virtio16 cmd;
+   /* Shift width of page in the bitmap */
+   __virtio16 page_shift;
+   /* flag used to identify different status */
+   __virtio16 flag;
+   /* Reserved */
+   __virtio16 reserved;
+   /* ID of the request */
+   __virtio64 req_id;
+   /* The pfn of 0 bit in the bitmap */
+   __virtio64 start_pfn;
+   /* The length of the bitmap, in bytes */
+   __virtio64 bmap_len;
+};
+
 #endif /* _LINUX_VIRTIO_BALLOON_H */
-- 
1.8.3.1




[Qemu-devel] [PATCH v2 kernel 0/7] Extend virtio-balloon for fast (de)inflating & fast live migration

2016-06-29 Thread Liang Li
This patch set contains two parts of changes to the virtio-balloon.

One is the change for speeding up the inflating & deflating process;
the main idea of this optimization is to use a bitmap to send the page
information to the host instead of the PFNs, to reduce the overhead of
virtio data transmission, address translation and madvise(). This can
help to improve the performance by about 85%.

Another change is for speeding up live migration. By skipping processing
the guest's free pages in the first round of data copy, to reduce needless
data processing, this can help to save quite a lot of CPU cycles and
network bandwidth. We put the guest's free page information in a bitmap and
send it to the host with the virt queue of virtio-balloon. For an idle 8GB
guest, this can help to shorten the total live migration time from 2 sec
to about 500 ms in a 10Gbps network environment.


Changes from v1 to v2:
* Abandon the patch for dropping page cache.
* Put some structures into the uapi header file.
* Use a new way to determine the page bitmap size.
* Use a unified way to send the free page information with the bitmap.
* Address the issues referred to in MST's comments.

Liang Li (7):
  virtio-balloon: rework deflate to add page to a list
  virtio-balloon: define new feature bit and page bitmap head
  mm: add a function to get the max pfn
  virtio-balloon: speed up inflate/deflate process
  virtio-balloon: define feature bit and head for misc virt queue
  mm: add the related functions to get free page info
  virtio-balloon: tell host vm's free page info

 drivers/virtio/virtio_balloon.c | 306 +++-
 include/uapi/linux/virtio_balloon.h |  41 +
 mm/page_alloc.c |  52 ++
 3 files changed, 359 insertions(+), 40 deletions(-)

-- 
1.8.3.1




[Qemu-devel] [PATCH v2 kernel 5/7] virtio-balloon: define feature bit and head for misc virt queue

2016-06-29 Thread Liang Li
Define a new feature bit which supports a new virtual queue. This
new virtual queue is for information exchange between the hypervisor
and the guest. The hypervisor can make use of this virtual queue
to request that the guest perform some operations, e.g. drop page cache,
synchronize the file system, etc. And the hypervisor can get some
of the guest's runtime information through this virtual queue, e.g. the
guest's free page information, which can be used for live migration
optimization.

Signed-off-by: Liang Li 
Cc: Michael S. Tsirkin 
Cc: Paolo Bonzini 
Cc: Cornelia Huck 
Cc: Amit Shah 
---
 include/uapi/linux/virtio_balloon.h | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/include/uapi/linux/virtio_balloon.h b/include/uapi/linux/virtio_balloon.h
index d3b182a..be4880f 100644
--- a/include/uapi/linux/virtio_balloon.h
+++ b/include/uapi/linux/virtio_balloon.h
@@ -35,6 +35,7 @@
 #define VIRTIO_BALLOON_F_STATS_VQ  1 /* Memory Stats virtqueue */
 #define VIRTIO_BALLOON_F_DEFLATE_ON_OOM 2 /* Deflate balloon on OOM */
 #define VIRTIO_BALLOON_F_PAGE_BITMAP   3 /* Send page info with bitmap */
+#define VIRTIO_BALLOON_F_MISC_VQ   4 /* Misc info virtqueue */
 
 /* Size of a PFN in the balloon interface. */
 #define VIRTIO_BALLOON_PFN_SHIFT 12
@@ -101,4 +102,25 @@ struct balloon_bmap_hdr {
__virtio64 bmap_len;
 };
 
+enum balloon_req_id {
+   /* Get free pages information */
+   BALLOON_GET_FREE_PAGES,
+};
+
+enum balloon_flag {
+   /* Have more data for a request */
+   BALLOON_FLAG_CONT,
+   /* No more data for a request */
+   BALLOON_FLAG_DONE,
+};
+
+struct balloon_req_hdr {
+   /* Used to distinguish different request */
+   __virtio16 cmd;
+   /* Reserved */
+   __virtio16 reserved[3];
+   /* Request parameter */
+   __virtio64 param;
+};
+
 #endif /* _LINUX_VIRTIO_BALLOON_H */
-- 
1.8.3.1




[Qemu-devel] [PATCH v2 kernel 4/7] virtio-balloon: speed up inflate/deflate process

2016-06-29 Thread Liang Li
The implementation of the current virtio-balloon is not very
efficient; the time spent on the different stages of inflating
the balloon to 7GB of an 8GB idle guest:

a. allocating pages (6.5%)
b. sending PFNs to host (68.3%)
c. address translation (6.1%)
d. madvise (19%)

It takes about 4126ms for the inflating process to complete.
Debugging shows that the bottlenecks are stages b and d.

If using a bitmap to send the page info instead of the PFNs, we
can reduce the overhead in stage b quite a lot. Furthermore, we
can do the address translation and call madvise() with a bulk of
RAM pages, instead of the current page-by-page way; the overhead
of stages c and d can also be reduced a lot.

This patch is the kernel-side implementation which is intended to
speed up the inflating & deflating process by adding a new feature
to the virtio-balloon device. With this new feature, inflating the
balloon to 7GB of an 8GB idle guest only takes 590ms; the
performance improvement is about 85%.

TODO: optimize stage a by allocating/freeing a chunk of pages
instead of a single page at a time.

Signed-off-by: Liang Li 
Suggested-by: Michael S. Tsirkin 
Cc: Michael S. Tsirkin 
Cc: Paolo Bonzini 
Cc: Cornelia Huck 
Cc: Amit Shah 
---
 drivers/virtio/virtio_balloon.c | 184 +++-
 1 file changed, 162 insertions(+), 22 deletions(-)

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index 8d649a2..2d18ff6 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -41,10 +41,28 @@
 #define OOM_VBALLOON_DEFAULT_PAGES 256
 #define VIRTBALLOON_OOM_NOTIFY_PRIORITY 80
 
+/*
+ * VIRTIO_BALLOON_PFNS_LIMIT is used to limit the size of the page bitmap
+ * and to prevent a very large page bitmap; there are two reasons for this:
+ * 1) to save memory.
+ * 2) allocating a large bitmap may fail.
+ *
+ * The actual limit of pfn is determined by:
+ * pfn_limit = min(max_pfn, VIRTIO_BALLOON_PFNS_LIMIT);
+ *
+ * If the system has more pages than VIRTIO_BALLOON_PFNS_LIMIT, we will scan
+ * the page list and send the PFNs several times. To reduce the
+ * overhead of scanning the page list, VIRTIO_BALLOON_PFNS_LIMIT should
+ * be set to a value which can cover most cases.
+ */
+#define VIRTIO_BALLOON_PFNS_LIMIT ((32 * (1ULL << 30)) >> PAGE_SHIFT) /* 32GB */
+
 static int oom_pages = OOM_VBALLOON_DEFAULT_PAGES;
 module_param(oom_pages, int, S_IRUSR | S_IWUSR);
 MODULE_PARM_DESC(oom_pages, "pages to free on OOM");
 
+extern unsigned long get_max_pfn(void);
+
 struct virtio_balloon {
struct virtio_device *vdev;
struct virtqueue *inflate_vq, *deflate_vq, *stats_vq;
@@ -62,6 +80,15 @@ struct virtio_balloon {
 
/* Number of balloon pages we've told the Host we're not using. */
unsigned int num_pages;
+   /* Pointer of the bitmap header. */
+   void *bmap_hdr;
+   /* Bitmap and length used to tell the host the pages */
+   unsigned long *page_bitmap;
+   unsigned long bmap_len;
+   /* Pfn limit */
+   unsigned long pfn_limit;
+   /* Used to record the processed pfn range */
+   unsigned long min_pfn, max_pfn, start_pfn, end_pfn;
/*
 * The pages we've told the Host we're not using are enqueued
 * at vb_dev_info->pages list.
@@ -105,12 +132,45 @@ static void balloon_ack(struct virtqueue *vq)
wake_up(&vb->acked);
 }
 
+static inline void init_pfn_range(struct virtio_balloon *vb)
+{
+   vb->min_pfn = ULONG_MAX;
+   vb->max_pfn = 0;
+}
+
+static inline void update_pfn_range(struct virtio_balloon *vb,
+struct page *page)
+{
+   unsigned long balloon_pfn = page_to_balloon_pfn(page);
+
+   if (balloon_pfn < vb->min_pfn)
+   vb->min_pfn = balloon_pfn;
+   if (balloon_pfn > vb->max_pfn)
+   vb->max_pfn = balloon_pfn;
+}
+
 static void tell_host(struct virtio_balloon *vb, struct virtqueue *vq)
 {
struct scatterlist sg;
unsigned int len;
 
-   sg_init_one(&sg, vb->pfns, sizeof(vb->pfns[0]) * vb->num_pfns);
+   if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_PAGE_BITMAP)) {
+   struct balloon_bmap_hdr *hdr = vb->bmap_hdr;
+   unsigned long bmap_len;
+
+   /* cmd and req_id are not used here, set them to 0 */
+   hdr->cmd = cpu_to_virtio16(vb->vdev, 0);
+   hdr->page_shift = cpu_to_virtio16(vb->vdev, PAGE_SHIFT);
+   hdr->reserved = cpu_to_virtio16(vb->vdev, 0);
+   hdr->req_id = cpu_to_virtio64(vb->vdev, 0);
+   hdr->start_pfn = cpu_to_virtio64(vb->vdev, vb->start_pfn);
+   bmap_len = min(vb->bmap_len,
+   (vb->end_pfn - vb->start_pfn) / BITS_PER_BYTE);
+   hdr->bmap_len = cpu_to_virtio64(vb->vdev, bmap_len);
+   sg_init_one(&sg, hdr,
+sizeof(struct balloon_bmap_hdr) + bmap_len);
+   } else
+

[Qemu-devel] [PATCH v2 kernel 3/7] mm: add a function to get the max pfn

2016-06-29 Thread Liang Li
Expose the function to get the max pfn, so it can be used in the
virtio-balloon device driver.

Signed-off-by: Liang Li 
Cc: Andrew Morton 
Cc: Mel Gorman 
Cc: Michael S. Tsirkin 
Cc: Paolo Bonzini 
Cc: Cornelia Huck 
Cc: Amit Shah 
---
 mm/page_alloc.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 6903b69..2083b40 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4515,6 +4515,12 @@ void show_free_areas(unsigned int filter)
show_swap_cache_info();
 }
 
+unsigned long get_max_pfn(void)
+{
+   return max_pfn;
+}
+EXPORT_SYMBOL(get_max_pfn);
+
 static void zoneref_set_zone(struct zone *zone, struct zoneref *zoneref)
 {
zoneref->zone = zone;
-- 
1.8.3.1




[Qemu-devel] [PATCH v2 kernel 6/7] mm: add the related functions to get free page info

2016-06-29 Thread Liang Li
Save the free page info into a page bitmap; it will be used in the
virtio-balloon device driver.

Signed-off-by: Liang Li 
Cc: Andrew Morton 
Cc: Mel Gorman 
Cc: Michael S. Tsirkin 
Cc: Paolo Bonzini 
Cc: Cornelia Huck 
Cc: Amit Shah 
---
 mm/page_alloc.c | 46 ++
 1 file changed, 46 insertions(+)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 2083b40..c2a6669 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4521,6 +4521,52 @@ unsigned long get_max_pfn(void)
 }
 EXPORT_SYMBOL(get_max_pfn);
 
+static void mark_free_pages_bitmap(struct zone *zone, unsigned long start_pfn,
+   unsigned long end_pfn, unsigned long *bitmap, unsigned long len)
+{
+   unsigned long pfn, flags, page_num;
+   unsigned int order, t;
+   struct list_head *curr;
+
+   if (zone_is_empty(zone))
+   return;
+   end_pfn = min(start_pfn + len, end_pfn);
+   spin_lock_irqsave(&zone->lock, flags);
+
+   for_each_migratetype_order(order, t) {
+   list_for_each(curr, &zone->free_area[order].free_list[t]) {
+   pfn = page_to_pfn(list_entry(curr, struct page, lru));
+   if (pfn >= start_pfn && pfn <= end_pfn) {
+   page_num = 1UL << order;
+   if (pfn + page_num > end_pfn)
+   page_num = end_pfn - pfn;
+   bitmap_set(bitmap, pfn - start_pfn, page_num);
+   }
+   }
+   }
+
+   spin_unlock_irqrestore(&zone->lock, flags);
+}
+
+int get_free_pages(unsigned long start_pfn, unsigned long end_pfn,
+   unsigned long *bitmap, unsigned long len)
+{
+   struct zone *zone;
+   int ret = 0;
+
+   if (bitmap == NULL || start_pfn > end_pfn || start_pfn >= max_pfn)
+   return 0;
+   if (end_pfn < max_pfn)
+   ret = 1;
+   if (end_pfn >= max_pfn)
+   ret = 0;
+
+   for_each_populated_zone(zone)
+   mark_free_pages_bitmap(zone, start_pfn, end_pfn, bitmap, len);
+   return ret;
+}
+EXPORT_SYMBOL(get_free_pages);
+
 static void zoneref_set_zone(struct zone *zone, struct zoneref *zoneref)
 {
zoneref->zone = zone;
-- 
1.8.3.1




[Qemu-devel] [PATCH v2 kernel 1/7] virtio-balloon: rework deflate to add page to a list

2016-06-29 Thread Liang Li
This will allow faster notifications using a bitmap down the road.
balloon_pfn_to_page() can be removed because it is no longer used.

Signed-off-by: Liang Li 
Signed-off-by: Michael S. Tsirkin 
Cc: Paolo Bonzini 
Cc: Cornelia Huck 
Cc: Amit Shah 
---
 drivers/virtio/virtio_balloon.c | 22 --
 1 file changed, 8 insertions(+), 14 deletions(-)

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index 476c0e3..8d649a2 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -98,12 +98,6 @@ static u32 page_to_balloon_pfn(struct page *page)
return pfn * VIRTIO_BALLOON_PAGES_PER_PAGE;
 }
 
-static struct page *balloon_pfn_to_page(u32 pfn)
-{
-   BUG_ON(pfn % VIRTIO_BALLOON_PAGES_PER_PAGE);
-   return pfn_to_page(pfn / VIRTIO_BALLOON_PAGES_PER_PAGE);
-}
-
 static void balloon_ack(struct virtqueue *vq)
 {
struct virtio_balloon *vb = vq->vdev->priv;
@@ -176,18 +170,16 @@ static unsigned fill_balloon(struct virtio_balloon *vb, 
size_t num)
return num_allocated_pages;
 }
 
-static void release_pages_balloon(struct virtio_balloon *vb)
+static void release_pages_balloon(struct virtio_balloon *vb,
+struct list_head *pages)
 {
-   unsigned int i;
-   struct page *page;
+   struct page *page, *next;
 
-   /* Find pfns pointing at start of each page, get pages and free them. */
-   for (i = 0; i < vb->num_pfns; i += VIRTIO_BALLOON_PAGES_PER_PAGE) {
-   page = balloon_pfn_to_page(virtio32_to_cpu(vb->vdev,
-  vb->pfns[i]));
+   list_for_each_entry_safe(page, next, pages, lru) {
if (!virtio_has_feature(vb->vdev,
VIRTIO_BALLOON_F_DEFLATE_ON_OOM))
adjust_managed_page_count(page, 1);
+   list_del(&page->lru);
put_page(page); /* balloon reference */
}
 }
@@ -197,6 +189,7 @@ static unsigned leak_balloon(struct virtio_balloon *vb, 
size_t num)
unsigned num_freed_pages;
struct page *page;
struct balloon_dev_info *vb_dev_info = &vb->vb_dev_info;
+   LIST_HEAD(pages);
 
/* We can only do one array worth at a time. */
num = min(num, ARRAY_SIZE(vb->pfns));
@@ -208,6 +201,7 @@ static unsigned leak_balloon(struct virtio_balloon *vb, 
size_t num)
if (!page)
break;
set_page_pfns(vb, vb->pfns + vb->num_pfns, page);
+   list_add(&page->lru, &pages);
vb->num_pages -= VIRTIO_BALLOON_PAGES_PER_PAGE;
}
 
@@ -219,7 +213,7 @@ static unsigned leak_balloon(struct virtio_balloon *vb, 
size_t num)
 */
if (vb->num_pfns != 0)
tell_host(vb, vb->deflate_vq);
-   release_pages_balloon(vb);
+   release_pages_balloon(vb, &pages);
mutex_unlock(&vb->balloon_lock);
return num_freed_pages;
 }
-- 
1.8.3.1




Re: [Qemu-devel] [PULL 00/32] Misc patches for QEMU soft freeze

2016-06-29 Thread Paolo Bonzini


On 29/06/2016 11:42, Peter Maydell wrote:
> On 28 June 2016 at 18:33, Paolo Bonzini  wrote:
>> The following changes since commit 7dd929dfdc5c52ce79b21bf557ff506e89acbf63:
>>
>>   configure: Make AVX2 test robust to non-ELF systems (2016-06-28 15:40:40 
>> +0100)
>>
>> are available in the git repository at:
>>
>>   git://github.com/bonzini/qemu.git tags/for-upstream
>>
>> for you to fetch changes up to ea74c50f48100860ef4d27f4a1b2aa3f5cb9a766:
>>
>>   vl: smp_parse: fix regression (2016-06-28 19:19:29 +0200)
>>
>> 
>> * serial port fixes (Paolo)
>> * Q35 modeling improvements (Paolo, Vasily)
>> * chardev cleanup improvements (Marc-André)
>> * iscsi bugfix (Peter L.)
>> * cpu_exec patch from multi-arch patches (Peter C.)
>> * pci-assign tweak (Lin Ma)
> 
> This triggers a lot of errors from the clang ubsan:
> 
> /home/petmay01/linaro/qemu-for-merges/qemu-char.c:4043:5: runtime
> error: member access within misaligned address 0x101010101010101 for
> type 'CharDriverState' (aka 'struct CharDriverState'), which requires
> 8 byte alignment
> 0x101010101010101: note: pointer points here

Real bug, this should fix it:

diff --git a/qemu-char.c b/qemu-char.c
index 4aeafe8..33ddabf 100644
--- a/qemu-char.c
+++ b/qemu-char.c
@@ -4553,9 +4553,9 @@ static void qemu_chr_cleanup(void)
 {
-CharDriverState *chr;
+CharDriverState *chr, *tmp;
 
-QTAILQ_FOREACH(chr, &chardevs, next) {
+QTAILQ_FOREACH_SAFE(chr, &chardevs, next, tmp) {
     qemu_chr_delete(chr);
 }
 }


Paolo

> 
> 
> (There was also a hang trying to run tests on 32-bit
> ARM, which might or might not be related. Don't have
> more details on that one, sorry.)
> 
> thanks
> -- PMM
> 
> 



[Qemu-devel] [PATCH v2 kernel 7/7] virtio-balloon: tell host vm's free page info

2016-06-29 Thread Liang Li
Support requests for the VM's free page information and respond with
a page bitmap. QEMU can use this free page bitmap to speed up the
live migration process by skipping the free pages.

Signed-off-by: Liang Li 
Cc: Michael S. Tsirkin 
Cc: Paolo Bonzini 
Cc: Cornelia Huck 
Cc: Amit Shah 
---
 drivers/virtio/virtio_balloon.c | 104 +---
 1 file changed, 98 insertions(+), 6 deletions(-)

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index 2d18ff6..5ca4ad3 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -62,10 +62,13 @@ module_param(oom_pages, int, S_IRUSR | S_IWUSR);
 MODULE_PARM_DESC(oom_pages, "pages to free on OOM");
 
 extern unsigned long get_max_pfn(void);
+extern int get_free_pages(unsigned long start_pfn, unsigned long end_pfn,
+   unsigned long *bitmap, unsigned long len);
+
 
 struct virtio_balloon {
struct virtio_device *vdev;
-   struct virtqueue *inflate_vq, *deflate_vq, *stats_vq;
+   struct virtqueue *inflate_vq, *deflate_vq, *stats_vq, *misc_vq;
 
/* The balloon servicing is delegated to a freezable workqueue. */
struct work_struct update_balloon_stats_work;
@@ -89,6 +92,8 @@ struct virtio_balloon {
unsigned long pfn_limit;
/* Used to record the processed pfn range */
unsigned long min_pfn, max_pfn, start_pfn, end_pfn;
+   /* Request header */
+   struct balloon_req_hdr req_hdr;
/*
 * The pages we've told the Host we're not using are enqueued
 * at vb_dev_info->pages list.
@@ -373,6 +378,49 @@ static void update_balloon_stats(struct virtio_balloon *vb)
pages_to_bytes(available));
 }
 
+static void update_free_pages_stats(struct virtio_balloon *vb,
+   unsigned long req_id)
+{
+   struct scatterlist sg_in, sg_out;
+   unsigned long pfn = 0, bmap_len, max_pfn;
+   struct virtqueue *vq = vb->misc_vq;
+   struct balloon_bmap_hdr *hdr = vb->bmap_hdr;
+   int ret = 1;
+
+   max_pfn = get_max_pfn();
+   mutex_lock(&vb->balloon_lock);
+   while (pfn < max_pfn) {
+   memset(vb->page_bitmap, 0, vb->bmap_len);
+   ret = get_free_pages(pfn, pfn + vb->pfn_limit,
+   vb->page_bitmap, vb->bmap_len * BITS_PER_BYTE);
+   hdr->cmd = cpu_to_virtio16(vb->vdev, BALLOON_GET_FREE_PAGES);
+   hdr->page_shift = cpu_to_virtio16(vb->vdev, PAGE_SHIFT);
+   hdr->req_id = cpu_to_virtio64(vb->vdev, req_id);
+   hdr->start_pfn = cpu_to_virtio64(vb->vdev, pfn);
+   bmap_len = vb->pfn_limit / BITS_PER_BYTE;
+   if (!ret) {
+   hdr->flag = cpu_to_virtio16(vb->vdev,
+   BALLOON_FLAG_DONE);
+   if (pfn + vb->pfn_limit > max_pfn)
+   bmap_len = (max_pfn - pfn) / BITS_PER_BYTE;
+   } else
+   hdr->flag = cpu_to_virtio16(vb->vdev,
+   BALLOON_FLAG_CONT);
+   hdr->bmap_len = cpu_to_virtio64(vb->vdev, bmap_len);
+   sg_init_one(&sg_out, hdr,
+sizeof(struct balloon_bmap_hdr) + bmap_len);
+
+   virtqueue_add_outbuf(vq, &sg_out, 1, vb, GFP_KERNEL);
+   virtqueue_kick(vq);
+   pfn += vb->pfn_limit;
+   }
+
+   sg_init_one(&sg_in, &vb->req_hdr, sizeof(vb->req_hdr));
+   virtqueue_add_inbuf(vq, &sg_in, 1, &vb->req_hdr, GFP_KERNEL);
+   virtqueue_kick(vq);
+   mutex_unlock(&vb->balloon_lock);
+}
+
 /*
  * While most virtqueues communicate guest-initiated requests to the 
hypervisor,
  * the stats queue operates in reverse.  The driver initializes the virtqueue
@@ -511,18 +559,49 @@ static void update_balloon_size_func(struct work_struct 
*work)
queue_work(system_freezable_wq, work);
 }
 
+static void misc_handle_rq(struct virtio_balloon *vb)
+{
+   struct balloon_req_hdr *ptr_hdr;
+   unsigned int len;
+
+   ptr_hdr = virtqueue_get_buf(vb->misc_vq, &len);
+   if (!ptr_hdr || len != sizeof(vb->req_hdr))
+   return;
+
+   switch (ptr_hdr->cmd) {
+   case BALLOON_GET_FREE_PAGES:
+   update_free_pages_stats(vb, ptr_hdr->param);
+   break;
+   default:
+   break;
+   }
+}
+
+static void misc_request(struct virtqueue *vq)
+{
+   struct virtio_balloon *vb = vq->vdev->priv;
+
+   misc_handle_rq(vb);
+}
+
 static int init_vqs(struct virtio_balloon *vb)
 {
-   struct virtqueue *vqs[3];
-   vq_callback_t *callbacks[] = { balloon_ack, balloon_ack, stats_request 
};
-   static const char * const names[] = { "inflate", "deflate", "stats" };
+   struct virtqueue *vqs[4];
+   vq_callback_t *callbacks[] = { balloon_ack

Re: [Qemu-devel] [PATCH v2 1/2] trace: [linux-user] Commandline arguments to control tracing

2016-06-29 Thread Lluís Vilanova
Stefan Hajnoczi writes:

> On Wed, Jun 22, 2016 at 12:04:35PM +0200, Lluís Vilanova wrote:
>> @@ -4047,6 +4064,12 @@ static const struct qemu_argument arg_table[] = {
>> "",   "log system calls"},
>> {"seed",   "QEMU_RAND_SEED",   true,  handle_arg_randseed,
>> "",   "Seed for pseudo-random number generator"},
>> +{"trace-enable", "QEMU_TRACE_ENABLE",true,  handle_arg_trace_enable,
>> + "name",   "enable tracing of specified event names (pass 'help' to 
>> show a list of events)"},
>> +{"trace-events", "QEMU_TRACE_EVENTS",true,  handle_arg_trace_events,
>> + "eventsfile", "enable tracing of specified event names (one 
>> name/pattern per line)"},
>> +{"trace-file", "QEMU_TRACE_FILE",  true,  handle_arg_trace_file,
>> + "tracefile",  "output trace file"},

> Riku: These command-line options differ from the qemu-system -trace
> option.  Should there be consistency or does *-user do its own thing?

Do you mean it differs in semantics or in syntax? For the latter, *-user option
parsers do not use the more flexible parser used in vl.c (each has its own
much simpler implementation).

Cheers,
  Lluis



Re: [Qemu-devel] [PATCH 2/3] replay: allow replay stopping and restarting

2016-06-29 Thread Paolo Bonzini


On 20/06/2016 08:26, Pavel Dovgalyuk wrote:
>> From: Paolo Bonzini [mailto:pbonz...@redhat.com]
>>> From: "Pavel Dovgalyuk" 
>>> This patch fixes bug with stopping and restarting replay
>>> through monitor.
>>>
>>> Signed-off-by: Pavel Dovgalyuk 
>>> ---
>>>  block/blkreplay.c|   18 +-
>>>  cpus.c   |1 +
>>>  include/sysemu/replay.h  |2 ++
>>>  replay/replay-internal.h |2 --
>>>  vl.c |1 +
>>>  5 files changed, 17 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/block/blkreplay.c b/block/blkreplay.c
>>> index 42f1813..438170c 100644
>>> --- a/block/blkreplay.c
>>> +++ b/block/blkreplay.c
>>> @@ -70,6 +70,14 @@ static void blkreplay_bh_cb(void *opaque)
>>>  g_free(req);
>>>  }
>>>
>>> +static uint64_t blkreplay_next_id(void)
>>> +{
>>> +if (replay_events_enabled()) {
>>> +return request_id++;
>>> +}
>>> +return 0;
>>> +}
>>
>> What happens if 0 is returned?  
> 
> It could be any value. When replay events are disabled,
> it means that either replay is disabled or execution is stopped.
> In the first case we won't pass these requests through the replay queue,
> and therefore the id is unused.
> In stopped mode we have to keep request_id unchanged to make
> record/replay deterministic.
> 
>> I think that you want to call
>> replay_disable_events...
>>
>>>  bdrv_drain_all();
>>
>> ... after this bdrv_drain_all.
> 
> Why? We disable replay events to avoid adding new block requests
> to the queue. How this is related to drain all?

drain all completes the guest's pending requests.  If you disable events
before drain all, doesn't that cause a mismatch between record and replay?

>>
>> I was going to suggest using qemu_add_vm_change_state_handler
>> in replay_start (which could have replaced the existing call
>> to replay_enable_events), but that's not possible if you have
>> to do your calls after bdrv_drain_all.
> 
> Pavel Dovgalyuk
> 
> 
> 



Re: [Qemu-devel] [PATCH 2/3] ide: ignore retry_unit check for non-retry operations

2016-06-29 Thread Paolo Bonzini


On 29/06/2016 10:35, Evgeny Yakovlev wrote:
>>>
>> Wouldn't the assertion fail for a PIO read/write too?  Perhaps
>> retry_unit should be set to s->unit in ide_transfer_start too.
> 
> If PIO follows DMA and fails then yes, it looks like it will trigger an
> assert. I am not sure about setting retry_unit in ide_transfer_start. It
> looks like currently only DMA I/O entries touch retry_unit at all. Does
> that mean that PIO, flush, etc. do not support retries by design, and we
> need to add more exceptions to the assert check, or is it a real bug in how
> retries are initialized?

Both PIO and flush do support retries, so I think it is a real bug.

Paolo



Re: [Qemu-devel] [PATCH 0/2] memory/intel_iommu: Generate error for incompatible usage

2016-06-29 Thread Paolo Bonzini


On 28/06/2016 16:49, Alex Williamson wrote:
> Paolo & Michael,
> 
> Any comments on this series?  I think we need Paolo's ack for the memory
> changes and either of your ack for hw/i386/.  I'm happy to pull this
> through my tree with your approval though.  Thanks,

I think I already acked the callbacks, in any case the patches look good.

Paolo

> Alex
> 
> On Wed, 15 Jun 2016 09:56:03 -0600
> Alex Williamson  wrote:
> 
>> VT-d emulation is currently incompatible with device assignment due
>> to intel_iommu's lack of support for memory_region_notify_iommu().
>> Alexey has proposed a nice addition to the MemoryRegionIOMMUOps
>> structure that adds callbacks when the first iommu notifier is
>> registered and the last is removed.  For POWER this will allow them
>> to switch the view of the iommu depending on whether anyone in
>> userspace is watching.  For VT-d I expect that eventually we'll use
>> these callbacks to enable and disable code paths so that we avoid
>> notifier overhead when there are no registered notifiy-ees.  For now,
>> we don't support calling memory_region_notify_iommu(), so this
>> signals an incompatible hardware configuration.  If we choose to make
>> CM=0 a user selectable option, something like this might continue to
>> be useful if we only support notifies via invalidations rather than
>> full VT-d data structure shadowing.
>>
>> Even though we're currently working on enabling users like vfio-pci
>> with VT-d, I believe this is correct for the current state of things.
>> We might even want to consider this stable for v2.6.x so that
>> downstreams pick it up to avoid incompatible configurations.
>>
>> Alexey, I hope I'm not stepping on your toes by extracting this
>> from your latest patch series.  Please let us know whether you
>> approve.  Thanks,
>>
>> Alex
>>
>> ---
>>
>> Alex Williamson (1):
>>   intel_iommu: Throw hw_error on notify_started
>>
>> Alexey Kardashevskiy (1):
>>   memory: Add MemoryRegionIOMMUOps.notify_started/stopped callbacks
>>
>>
>>  hw/i386/intel_iommu.c |   12 
>>  hw/vfio/common.c  |5 +++--
>>  include/exec/memory.h |8 +++-
>>  memory.c  |   10 +-
>>  4 files changed, 31 insertions(+), 4 deletions(-)
> 



Re: [Qemu-devel] [PATCH v2 1/3] char: clean up remaining chardevs when leaving

2016-06-29 Thread Paolo Bonzini


On 16/06/2016 21:28, marcandre.lur...@redhat.com wrote:
> From: Marc-André Lureau 
> 
> This helps to remove various chardev resources leaks when leaving qemu.
> 
> Signed-off-by: Marc-André Lureau 
> ---
>  qemu-char.c | 11 +++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/qemu-char.c b/qemu-char.c
> index c926e9a..98dcd49 100644
> --- a/qemu-char.c
> +++ b/qemu-char.c
> @@ -4541,6 +4541,15 @@ void qmp_chardev_remove(const char *id, Error **errp)
>  qemu_chr_delete(chr);
>  }
>  
> +static void qemu_chr_cleanup(void)
> +{
> +CharDriverState *chr;
> +
> +QTAILQ_FOREACH(chr, &chardevs, next) {
> +qemu_chr_delete(chr);
> +}
> +}

FYI, this patch is necessary on top:

diff --git a/qemu-char.c b/qemu-char.c
index 016badb..bc04ced 100644
--- a/qemu-char.c
+++ b/qemu-char.c
@@ -4551,9 +4551,9 @@

 static void qemu_chr_cleanup(void)
 {
-CharDriverState *chr;
+CharDriverState *chr, *tmp;

-QTAILQ_FOREACH(chr, &chardevs, next) {
+QTAILQ_FOREACH_SAFE(chr, &chardevs, next, tmp) {
 qemu_chr_delete(chr);
 }
 }


(Reproducer: start QEMU with MALLOC_PERTURB_=42 and type "quit" on the
monitor).

Thanks,

Paolo



[Qemu-devel] [Bug 1588328] Re: Qemu 2.6 Solaris 9 Sparc Segmentation Fault

2016-06-29 Thread Mark Cave-Ayland
Okay. Can you confirm which version (or git revision) you've used to
apply the patch so I can try and reproduce locally?

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1588328

Title:
  Qemu 2.6 Solaris 9 Sparc Segmentation Fault

Status in QEMU:
  New

Bug description:
  Hi,
  I tried the following command to boot Solaris 9 sparc:
  qemu-system-sparc -nographic -boot d -hda ./Spark9.disk -m 256 -cdrom 
sol-9-905hw-ga-sparc-dvd.iso -serial telnet:0.0.0.0:3000,server 

  It seems there are a few Segmentation Faults, one from the starting of
  the boot. Another at the beginning of the commandline installation.

  Trying 127.0.0.1...
  Connected to localhost.
  Escape character is '^]'.
  Configuration device id QEMU version 1 machine id 32
  Probing SBus slot 0 offset 0
  Probing SBus slot 1 offset 0
  Probing SBus slot 2 offset 0
  Probing SBus slot 3 offset 0
  Probing SBus slot 4 offset 0
  Probing SBus slot 5 offset 0
  Invalid FCode start byte
  CPUs: 1 x FMI,MB86904
  UUID: ----
  Welcome to OpenBIOS v1.1 built on Apr 18 2016 08:19
Type 'help' for detailed information
  Trying cdrom:d...
  Not a bootable ELF image
  Loading a.out image...
  Loaded 7680 bytes
  entry point is 0x4000
  bootpath: 
/iommu@0,1000/sbus@0,10001000/espdma@5,840/esp@5,880/sd@2,0:d

  Jumping to entry point 4000 for type 0005...
  switching to new context:
  SunOS Release 5.9 Version Generic_118558-34 32-bit
  Copyright 1983-2003 Sun Microsystems, Inc.  All rights reserved.
  Use is subject to license terms.
  WARNING: 
/iommu@0,1000/sbus@0,10001000/espdma@5,840/esp@5,880/sd@0,0 (sd0):
Corrupt label; wrong magic number

  Segmentation Fault
  Configuring /dev and /devices
  NOTICE: Couldn't set value (../../sun/io/audio/sada/drv/audiocs/audio_4231.c, 
Line #1759 0x00 0x88)
  audio may not work correctly until it is stopped and restarted
  Segmentation Fault
  Using RPC Bootparams for network configuration information.
  Skipping interface le0
  Searching for configuration file(s)...
  Search complete.

  

  What type of terminal are you using?
   1) ANSI Standard CRT
   2) DEC VT52
   3) DEC VT100
   4) Heathkit 19
   5) Lear Siegler ADM31
   6) PC Console
   7) Sun Command Tool
   8) Sun Workstation
   9) Televideo 910
   10) Televideo 925
   11) Wyse Model 50
   12) X Terminal Emulator (xterms)
   13) CDE Terminal Emulator (dtterm)
   14) Other
  Type the number of your choice and press Return: 3
  syslog service starting.
  savecore: no dump device configured
  Running in command line mode
  /sbin/disk0_install[109]: 143 Segmentation Fault
  /sbin/run_install[130]: 155 Segmentation Fault

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1588328/+subscriptions



Re: [Qemu-devel] Regression: virtio-pci: convert to ioeventfd callbacks

2016-06-29 Thread Cornelia Huck
On Wed, 29 Jun 2016 09:41:50 +0800
Jason Wang  wrote:
 
> On 2016年06月27日 17:44, Peter Lieven wrote:
> > Hi, with the above patch applied:
> >
> > commit 9f06e71a567ba5ee8b727e65a2d5347fd331d2aa
> > Author: Cornelia Huck 
> > Date:   Fri Jun 10 11:04:12 2016 +0200
> >
> > virtio-pci: convert to ioeventfd callbacks
> >
> > a Ubuntu 14.04 VM freezes at startup when blk-mq is set up - even if 
> > there is only one queue.
> >
> > Peter
> >
> >
> 
> In fact, I notice vhost-net does not work for master, look like we are 
> trying to set host notifier without initialization which seems a bug

Yes, that's the problem: we switch handlers, but the notifier has not
yet been set up (that's different from the dataplane call sequence). I
think we need to set up the notifier without touching the handler in
that case. I'm working on a patch.




Re: [Qemu-devel] [PULL 30/34] virtio-bus: have callers tolerate new host notifier api

2016-06-29 Thread Marc-André Lureau
Hi

On Fri, Jun 24, 2016 at 7:55 AM, Michael S. Tsirkin  wrote:
> From: Cornelia Huck 
>
> Have vhost and dataplane use the new api for transports that
> have been converted.
>
> Signed-off-by: Cornelia Huck 
> Reviewed-by: Fam Zheng 
> Reviewed-by: Stefan Hajnoczi 
> Reviewed-by: Michael S. Tsirkin 
> Signed-off-by: Michael S. Tsirkin 
> ---

This patch and further break vhost-user-test:

QTEST_QEMU_BINARY=x86_64-softmmu/qemu-system-x86_64  tests/vhost-user-test
/x86_64/vhost-user/read-guest-mem: binding does not support host notifiers
qemu-system-x86_64: unable to start vhost net: 38: falling back on
userspace virtio
**
ERROR:tests/vhost-user-test.c:162:wait_for_fds: assertion failed: (s->fds_num)
Aborted (core dumped)

(I wonder why it wasn't noticed)

>  hw/block/dataplane/virtio-blk.c | 14 +++---
>  hw/scsi/virtio-scsi-dataplane.c | 20 +++-
>  hw/virtio/vhost.c   | 20 
>  3 files changed, 42 insertions(+), 12 deletions(-)
>
> diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
> index 2073f9a..fdf5fd1 100644
> --- a/hw/block/dataplane/virtio-blk.c
> +++ b/hw/block/dataplane/virtio-blk.c
> @@ -79,7 +79,8 @@ void virtio_blk_data_plane_create(VirtIODevice *vdev, 
> VirtIOBlkConf *conf,
>  }
>
>  /* Don't try if transport does not support notifiers. */
> -if (!k->set_guest_notifiers || !k->set_host_notifier) {
> +if (!k->set_guest_notifiers ||
> +(!k->set_host_notifier && !k->ioeventfd_started)) {
>  error_setg(errp,
> "device is incompatible with dataplane "
> "(transport does not support notifiers)");
> @@ -157,7 +158,10 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s)
>  s->guest_notifier = virtio_queue_get_guest_notifier(s->vq);
>
>  /* Set up virtqueue notify */
> -r = k->set_host_notifier(qbus->parent, 0, true);
> +r = virtio_bus_set_host_notifier(VIRTIO_BUS(qbus), 0, true);
> +if (r == -ENOSYS) {
> +r = k->set_host_notifier(qbus->parent, 0, true);
> +}
>  if (r != 0) {
>  fprintf(stderr, "virtio-blk failed to set host notifier (%d)\n", r);
>  goto fail_host_notifier;
> @@ -193,6 +197,7 @@ void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s)
>  BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(s->vdev)));
>  VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
>  VirtIOBlock *vblk = VIRTIO_BLK(s->vdev);
> +int r;
>
>  if (!vblk->dataplane_started || s->stopping) {
>  return;
> @@ -217,7 +222,10 @@ void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s)
>
>  aio_context_release(s->ctx);
>
> -k->set_host_notifier(qbus->parent, 0, false);
> +r = virtio_bus_set_host_notifier(VIRTIO_BUS(qbus), 0, false);
> +if (r == -ENOSYS) {
> +k->set_host_notifier(qbus->parent, 0, false);
> +}
>
>  /* Clean up guest notifier (irq) */
>  k->set_guest_notifiers(qbus->parent, 1, false);
> diff --git a/hw/scsi/virtio-scsi-dataplane.c b/hw/scsi/virtio-scsi-dataplane.c
> index 1a49f1e..b9a5716 100644
> --- a/hw/scsi/virtio-scsi-dataplane.c
> +++ b/hw/scsi/virtio-scsi-dataplane.c
> @@ -31,7 +31,8 @@ void virtio_scsi_set_iothread(VirtIOSCSI *s, IOThread 
> *iothread)
>  s->ctx = iothread_get_aio_context(vs->conf.iothread);
>
>  /* Don't try if transport does not support notifiers. */
> -if (!k->set_guest_notifiers || !k->set_host_notifier) {
> +if (!k->set_guest_notifiers ||
> +(!k->set_host_notifier && !k->ioeventfd_started)) {
>  fprintf(stderr, "virtio-scsi: Failed to set iothread "
> "(transport does not support notifiers)");
>  exit(1);
> @@ -73,7 +74,10 @@ static int virtio_scsi_vring_init(VirtIOSCSI *s, VirtQueue 
> *vq, int n,
>  int rc;
>
>  /* Set up virtqueue notify */
> -rc = k->set_host_notifier(qbus->parent, n, true);
> +rc = virtio_bus_set_host_notifier(VIRTIO_BUS(qbus), n, true);
> +if (rc == -ENOSYS) {
> +rc = k->set_host_notifier(qbus->parent, n, true);
> +}
>  if (rc != 0) {
>  fprintf(stderr, "virtio-scsi: Failed to set host notifier (%d)\n",
>  rc);
> @@ -159,7 +163,10 @@ fail_vrings:
>  virtio_scsi_clear_aio(s);
>  aio_context_release(s->ctx);
>  for (i = 0; i < vs->conf.num_queues + 2; i++) {
> -k->set_host_notifier(qbus->parent, i, false);
> +rc = virtio_bus_set_host_notifier(VIRTIO_BUS(qbus), i, false);
> +if (rc == -ENOSYS) {
> +k->set_host_notifier(qbus->parent, i, false);
> +}
>  }
>  k->set_guest_notifiers(qbus->parent, vs->conf.num_queues + 2, false);
>  fail_guest_notifiers:
> @@ -174,7 +181,7 @@ void virtio_scsi_dataplane_stop(VirtIOSCSI *s)
>  BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(s)));
>  VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
>  VirtIOSCSICommon *vs = VIRTIO_SCSI_COMMON(s);
> -int i;
> +int i, rc

[Qemu-devel] [PATCH v0] spapr: Restore support for 970MP and POWER8NVL CPU cores

2016-06-29 Thread Bharata B Rao
The introduction of core-based CPU hotplug for PowerPC sPAPR didn't
add support for 970MP- and POWER8NVL-based core types. Add support for
them.

While we are here, add support for explicit specification of POWER5+_v2.1
core type.

Signed-off-by: Bharata B Rao 
---
 hw/ppc/spapr_cpu_core.c | 20 ++--
 1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
index 2aa0dc5..e30b159 100644
--- a/hw/ppc/spapr_cpu_core.c
+++ b/hw/ppc/spapr_cpu_core.c
@@ -337,12 +337,15 @@ static void glue(glue(spapr_cpu_core_, _fname), 
_initfn(Object *obj)) \
 core->cpu_class = oc; \
 }
 
+SPAPR_CPU_CORE_INITFN(970mp_v1.0, 970MP_v10);
+SPAPR_CPU_CORE_INITFN(970mp_v1.1, 970MP_v11);
 SPAPR_CPU_CORE_INITFN(970_v2.2, 970);
 SPAPR_CPU_CORE_INITFN(POWER5+_v2.1, POWER5plus);
 SPAPR_CPU_CORE_INITFN(POWER7_v2.3, POWER7);
 SPAPR_CPU_CORE_INITFN(POWER7+_v2.1, POWER7plus);
 SPAPR_CPU_CORE_INITFN(POWER8_v2.0, POWER8);
 SPAPR_CPU_CORE_INITFN(POWER8E_v2.1, POWER8E);
+SPAPR_CPU_CORE_INITFN(POWER8NVL_v1.0, POWER8NVL);
 
 typedef struct SPAPRCoreInfo {
 const char *name;
@@ -350,10 +353,19 @@ typedef struct SPAPRCoreInfo {
 } SPAPRCoreInfo;
 
 static const SPAPRCoreInfo spapr_cores[] = {
-/* 970 */
+/* 970 and aliases */
+{ .name = "970_v2.2", .initfn = spapr_cpu_core_970_initfn },
 { .name = "970", .initfn = spapr_cpu_core_970_initfn },
 
-/* POWER5 */
+/* 970MP variants and aliases */
+{ .name = "970MP_v1.0", .initfn = spapr_cpu_core_970MP_v10_initfn },
+{ .name = "970mp_v1.0", .initfn = spapr_cpu_core_970MP_v10_initfn },
+{ .name = "970MP_v1.1", .initfn = spapr_cpu_core_970MP_v11_initfn },
+{ .name = "970mp_v1.1", .initfn = spapr_cpu_core_970MP_v11_initfn },
+{ .name = "970mp", .initfn = spapr_cpu_core_970MP_v11_initfn },
+
+/* POWER5 and aliases */
+{ .name = "POWER5+_v2.1", .initfn = spapr_cpu_core_POWER5plus_initfn },
 { .name = "POWER5+", .initfn = spapr_cpu_core_POWER5plus_initfn },
 
 /* POWER7 and aliases */
@@ -373,6 +385,10 @@ static const SPAPRCoreInfo spapr_cores[] = {
 { .name = "POWER8E_v2.1", .initfn = spapr_cpu_core_POWER8E_initfn },
 { .name = "POWER8E", .initfn = spapr_cpu_core_POWER8E_initfn },
 
+/* POWER8NVL and aliases */
+{ .name = "POWER8NVL_v1.0", .initfn = spapr_cpu_core_POWER8NVL_initfn },
+{ .name = "POWER8NVL", .initfn = spapr_cpu_core_POWER8NVL_initfn },
+
 { .name = NULL }
 };
 
-- 
2.1.0




Re: [Qemu-devel] [PATCH v6 5/7] util: add QAuthZ object as an authorization base class

2016-06-29 Thread Daniel P. Berrange
On Tue, Jun 28, 2016 at 06:22:10PM +0200, Marc-André Lureau wrote:

> > +
> > +static const TypeInfo authz_info = {
> > +.parent = TYPE_OBJECT,
> > +.name = TYPE_QAUTHZ,
> > +.instance_size = sizeof(QAuthZ),
> > +.class_size = sizeof(QAuthZClass),
> 
> .abstract = true? (perhaps it's not necessary, but that would be more clear)

Yes, makes sense to add that since you can't instantiate this class
and do anything useful with it.


Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [Qemu-devel] [PATCH 2/3] replay: allow replay stopping and restarting

2016-06-29 Thread Pavel Dovgalyuk
> From: Paolo Bonzini [mailto:paolo.bonz...@gmail.com] On Behalf Of Paolo 
> Bonzini
> On 20/06/2016 08:26, Pavel Dovgalyuk wrote:
> >> From: Paolo Bonzini [mailto:pbonz...@redhat.com]
> >>> From: "Pavel Dovgalyuk" 
> >>> This patch fixes bug with stopping and restarting replay
> >>> through monitor.
> >>>
> >>> Signed-off-by: Pavel Dovgalyuk 
> >>> ---
> >
> >> I think that you want to call
> >> replay_disable_events...
> >>
> >>>  bdrv_drain_all();
> >>
> >> ... after this bdrv_drain_all.
> >
> > Why? We disable replay events to avoid adding new block requests
> > to the queue. How this is related to drain all?
> 
> drain all completes the guest's pending requests.  If you disable events
> before drain all, doesn't that cause a mismatch between record and replay?

Looks reasonable, thanks. I'll update the patch.
What about the replay patch for networking?

Pavel Dovgalyuk




Re: [Qemu-devel] [PATCH v2 1/3] char: clean up remaining chardevs when leaving

2016-06-29 Thread Marc-André Lureau
Hi

On Wed, Jun 29, 2016 at 12:53 PM, Paolo Bonzini  wrote:
>
>
> On 16/06/2016 21:28, marcandre.lur...@redhat.com wrote:
>> From: Marc-André Lureau 
>>
>> This helps to remove various chardev resources leaks when leaving qemu.
>>
>> Signed-off-by: Marc-André Lureau 
>> ---
>>  qemu-char.c | 11 +++
>>  1 file changed, 11 insertions(+)
>>
>> diff --git a/qemu-char.c b/qemu-char.c
>> index c926e9a..98dcd49 100644
>> --- a/qemu-char.c
>> +++ b/qemu-char.c
>> @@ -4541,6 +4541,15 @@ void qmp_chardev_remove(const char *id, Error **errp)
>>  qemu_chr_delete(chr);
>>  }
>>
>> +static void qemu_chr_cleanup(void)
>> +{
>> +CharDriverState *chr;
>> +
>> +QTAILQ_FOREACH(chr, &chardevs, next) {
>> +qemu_chr_delete(chr);
>> +}
>> +}
>
> FYI, this patch is necessary on top:
>
> diff --git a/qemu-char.c b/qemu-char.c
> index 016badb..bc04ced 100644
> --- a/qemu-char.c
> +++ b/qemu-char.c
> @@ -4551,9 +4551,9 @@
>
>  static void qemu_chr_cleanup(void)
>  {
> -CharDriverState *chr;
> +CharDriverState *chr, *tmp;
>
> -QTAILQ_FOREACH(chr, &chardevs, next) {
> +QTAILQ_FOREACH_SAFE(chr, &chardevs, next, tmp) {
>  qemu_chr_delete(chr);
>  }
>  }
>

ack, I guess you'll have it in your pull request? squash or seperate, anyhow.

thanks



-- 
Marc-André Lureau



Re: [Qemu-devel] [RFH PATCH] vhost-user-test: fix g_cond_wait_until compat implementation

2016-06-29 Thread Marc-André Lureau
Hi

On Tue, Jun 28, 2016 at 7:22 PM, Paolo Bonzini  wrote:
> This fixes compilation with glib versions up to 2.30, such
> as the one in CentOS 6.
>

What's RFH in title? :) (not sure this applies here:
http://www.urbandictionary.com/define.php?term=rfh)

> Even with this patch the test fails though:
>
> ERROR:/tmp/qemu-test/src/tests/vhost-user-test.c:165:wait_for_fds: assertion 
> failed: (s->fds_num)
>

That's a regression from Cornelia's series "virtio-bus: have callers
tolerate new host notifier api".

> Signed-off-by: Paolo Bonzini 
> ---
>  tests/vhost-user-test.c | 11 +++
>  1 file changed, 7 insertions(+), 4 deletions(-)
>
> diff --git a/tests/vhost-user-test.c b/tests/vhost-user-test.c
> index 8b2164b..4de64df 100644
> --- a/tests/vhost-user-test.c
> +++ b/tests/vhost-user-test.c
> @@ -127,21 +127,24 @@ typedef struct TestServer {
>  int fds_num;
>  int fds[VHOST_MEMORY_MAX_NREGIONS];
>  VhostUserMemory memory;
> -GMutex data_mutex;
> -GCond data_cond;
> +CompatGMutex data_mutex;
> +CompatGCond data_cond;
>  int log_fd;
>  uint64_t rings;
>  } TestServer;
>
>  #if !GLIB_CHECK_VERSION(2, 32, 0)
> -static gboolean g_cond_wait_until(CompatGCond cond, CompatGMutex mutex,
> +static gboolean g_cond_wait_until(CompatGCond *cond, CompatGMutex *mutex,
>gint64 end_time)
>  {
>  gboolean ret = FALSE;
>  end_time -= g_get_monotonic_time();
>  GTimeVal time = { end_time / G_TIME_SPAN_SECOND,
>end_time % G_TIME_SPAN_SECOND };
> -ret = g_cond_timed_wait(cond, mutex, &time);
> +g_assert(mutex->once.status != G_ONCE_STATUS_PROGRESS);
> +g_once(&cond->once, do_g_cond_new, NULL);
> +ret = g_cond_timed_wait((GCond *) cond->once.retval,
> +(GMutex *) mutex->once.retval, &time);
>  return ret;
>  }
>  #endif
> --
> 2.7.4
>
>


Reviewed-by: Marc-André Lureau 



-- 
Marc-André Lureau



Re: [Qemu-devel] [PULL 30/34] virtio-bus: have callers tolerate new host notifier api

2016-06-29 Thread Cornelia Huck
On Wed, 29 Jun 2016 13:37:15 +0200
Marc-André Lureau  wrote:

> Hi
> 
> On Fri, Jun 24, 2016 at 7:55 AM, Michael S. Tsirkin  wrote:
> > From: Cornelia Huck 
> >
> > Have vhost and dataplane use the new api for transports that
> > have been converted.
> >
> > Signed-off-by: Cornelia Huck 
> > Reviewed-by: Fam Zheng 
> > Reviewed-by: Stefan Hajnoczi 
> > Reviewed-by: Michael S. Tsirkin 
> > Signed-off-by: Michael S. Tsirkin 
> > ---
> 
> This patch and further break vhost-user-test:
> 
> QTEST_QEMU_BINARY=x86_64-softmmu/qemu-system-x86_64  tests/vhost-user-test
> /x86_64/vhost-user/read-guest-mem: binding does not support host notifiers
> qemu-system-x86_64: unable to start vhost net: 38: falling back on
> userspace virtio
> **
> ERROR:tests/vhost-user-test.c:162:wait_for_fds: assertion failed: (s->fds_num)
> Aborted (core dumped)

Yes, there's an || that needs to be a && (too late to fix), and the
mechanism does not work for vhost (currently working on a fix; our
fallback is too good since it only prints a message but otherwise
works).





Re: [Qemu-devel] [PATCH 00/17] block: Convert common I/O path to BdrvChild

2016-06-29 Thread Kevin Wolf
On 28.06.2016 at 15:28, Stefan Hajnoczi wrote:
> Do you want to take it through your tree to avoid
> conflicts/dependencies?
> 
> Acked-by: Stefan Hajnoczi 

Yes, thanks. I've applied it to my tree now (with a few fixes addressing
the review comments).

Kevin




Re: [Qemu-devel] [PATCH] scsi: esp: fix migration

2016-06-29 Thread Juan Quintela
Paolo Bonzini  wrote:
> On 27/06/2016 09:20, Amit Shah wrote:
>> On (Mon) 20 Jun 2016 [16:33:26], Paolo Bonzini wrote:
>>> Commit 926cde5 ("scsi: esp: make cmdbuf big enough for maximum CDB size",
>>> 2016-06-16) changed the size of a migrated field.  Split it in two
>>> parts, and only migrate the second part in a new vmstate version.
>> 
>> With this patch, the static checker fails in this way:
>> 
>> Section "esp", Description "esp": expected field "cmdlen", got "cmdbuf"; skipping rest
>> Section "dc390", Description "esp": expected field "cmdlen", got "cmdbuf"; skipping rest
>> Section "am53c974", Description "esp": expected field "cmdlen", got "cmdbuf"; skipping rest
>> 
>> Note it doesn't complain about the version numbers.  That's because:
>> 
>>>  const VMStateDescription vmstate_esp = {
>>>  .name ="esp",
>>> -.version_id = 3,
>>> +.version_id = 4,
>>>  .minimum_version_id = 3,
>> 
>> this suggests older versions can still be accepted for incoming
>> migration, which isn't true.
>
> Sure they can:
>
> -VMSTATE_BUFFER(cmdbuf, ESPState),
> +VMSTATE_PARTIAL_BUFFER(cmdbuf, ESPState, 16),
> +VMSTATE_BUFFER_START_MIDDLE_V(cmdbuf, ESPState, 16, 4),

Amit, would it help the checker if we do something like:

-VMSTATE_BUFFER(cmdbuf, ESPState),
+VMSTATE_PARTIAL_BUFFER_TEST(cmdbuf, ESPState, 16, v_is_3),
+VMSTATE_BUFFER_TEST(cmdbuf, ESPState, from_4),

Yes, VMSTATE_PARTIAL_BUFFER_TEST doesn't exist yet, but it is trivial to
define.
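For context, the v_is_3/from_4 names Juan sketches would just be version predicates; a vmstate field test in QEMU has the shape bool test(void *opaque, int version_id). A sketch of the two hypothetical predicates (illustrative only, not the actual macros):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Field-exists predicate: present only in the old stream version 3,
 * which carried the 16-byte cmdbuf. */
static bool v_is_3(void *opaque, int version_id)
{
    (void)opaque;
    return version_id == 3;
}

/* Field-exists predicate: present from stream version 4 onwards,
 * which carries the enlarged cmdbuf. */
static bool from_4(void *opaque, int version_id)
{
    (void)opaque;
    return version_id >= 4;
}
```

The static checker could then see one field per version range instead of one field whose size changes between versions.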

Later, Juan.


>
> 2.6 is transmitting version 3 and a 16-byte buffer.
>
> 2.7 is transmitting version 4, a first 16-byte buffer, and a second
> 16-byte buffer that is skipped when receiving version 3.
>
> So it seems like a static checker limitation.
>
> Paolo



Re: [Qemu-devel] [RFH PATCH] vhost-user-test: fix g_cond_wait_until compat implementation

2016-06-29 Thread Paolo Bonzini


On 29/06/2016 13:48, Marc-André Lureau wrote:
> Hi
> 
> On Tue, Jun 28, 2016 at 7:22 PM, Paolo Bonzini  wrote:
>> This fixes compilation with glib versions up to 2.30, such
>> as the one in CentOS 6.
>>
> 
> What's RFH in the title? :) (not sure this applies here:
> http://www.urbandictionary.com/define.php?term=rfh)

Request for help with the other bug. :)

Paolo

>> Even with this patch the test fails though:
>>
>> ERROR:/tmp/qemu-test/src/tests/vhost-user-test.c:165:wait_for_fds: assertion failed: (s->fds_num)
>>
> 
> That's a regression from Cornelia's series "virtio-bus: have callers
> tolerate new host notifier api".
> 
>> Signed-off-by: Paolo Bonzini 
>> ---
>>  tests/vhost-user-test.c | 11 +++
>>  1 file changed, 7 insertions(+), 4 deletions(-)
>>
>> diff --git a/tests/vhost-user-test.c b/tests/vhost-user-test.c
>> index 8b2164b..4de64df 100644
>> --- a/tests/vhost-user-test.c
>> +++ b/tests/vhost-user-test.c
>> @@ -127,21 +127,24 @@ typedef struct TestServer {
>>  int fds_num;
>>  int fds[VHOST_MEMORY_MAX_NREGIONS];
>>  VhostUserMemory memory;
>> -GMutex data_mutex;
>> -GCond data_cond;
>> +CompatGMutex data_mutex;
>> +CompatGCond data_cond;
>>  int log_fd;
>>  uint64_t rings;
>>  } TestServer;
>>
>>  #if !GLIB_CHECK_VERSION(2, 32, 0)
>> -static gboolean g_cond_wait_until(CompatGCond cond, CompatGMutex mutex,
>> +static gboolean g_cond_wait_until(CompatGCond *cond, CompatGMutex *mutex,
>>gint64 end_time)
>>  {
>>  gboolean ret = FALSE;
>>  end_time -= g_get_monotonic_time();
>>  GTimeVal time = { end_time / G_TIME_SPAN_SECOND,
>>end_time % G_TIME_SPAN_SECOND };
>> -ret = g_cond_timed_wait(cond, mutex, &time);
>> +g_assert(mutex->once.status != G_ONCE_STATUS_PROGRESS);
>> +g_once(&cond->once, do_g_cond_new, NULL);
>> +ret = g_cond_timed_wait((GCond *) cond->once.retval,
>> +(GMutex *) mutex->once.retval, &time);
>>  return ret;
>>  }
>>  #endif
>> --
>> 2.7.4
>>
>>
> 
> 
> Reviewed-by: Marc-André Lureau 
> 
> 
> 



Re: [Qemu-devel] [Qemu-ppc] [PATCH] target-ppc: gen_pause for instructions: yield, mdoio, mdoom, miso

2016-06-29 Thread Benjamin Herrenschmidt
On Fri, 2016-06-24 at 13:18 -0700, Aaron Larson wrote:
> Call gen_pause for all "or rx,rx,rx" encodings other than nop.  This
> provides a reasonable implementation for yield, and a better
> approximation for mdoio, mdoom, and miso.  The choice to pause for all
> encodings != 0 leverages the PowerISA admonition that the reserved
> encodings might change program priority, providing a slight "future
> proofing".
> 
> Signed-off-by: Aaron Larson 

Acked-by: Benjamin Herrenschmidt 

> ---
>  target-ppc/translate.c | 15 ---
>  1 file changed, 8 insertions(+), 7 deletions(-)
> 
> diff --git a/target-ppc/translate.c b/target-ppc/translate.c
> index 2f1c591..c4559b6 100644
> --- a/target-ppc/translate.c
> +++ b/target-ppc/translate.c
> @@ -1471,7 +1471,7 @@ static void gen_or(DisasContext *ctx)
>  } else if (unlikely(Rc(ctx->opcode) != 0)) {
>  gen_set_Rc0(ctx, cpu_gpr[rs]);
>  #if defined(TARGET_PPC64)
> -} else {
> +} else if (rs != 0) { /* 0 is nop */
>  int prio = 0;
>  
>  switch (rs) {
> @@ -1514,7 +1514,6 @@ static void gen_or(DisasContext *ctx)
>  break;
>  #endif
>  default:
> -/* nop */
>  break;
>  }
>  if (prio) {
> @@ -1524,13 +1523,15 @@ static void gen_or(DisasContext *ctx)
>  tcg_gen_ori_tl(t0, t0, ((uint64_t)prio) << 50);
>  gen_store_spr(SPR_PPR, t0);
>  tcg_temp_free(t0);
> -/* Pause us out of TCG otherwise spin loops with smt_low
> - * eat too much CPU and the kernel hangs
> - */
> +}
>  #if !defined(CONFIG_USER_ONLY)
> -gen_pause(ctx);
> +/* Pause out of TCG otherwise spin loops with smt_low eat too much
> + * CPU and the kernel hangs.  This applies to all encodings other
> + * than no-op, e.g., miso(rs=26), yield(27), mdoio(29), mdoom(30),
> + * and all currently undefined.
> + */
> +gen_pause(ctx);
>  #endif
> -}
>  #endif
>  }
>  }



[Qemu-devel] [PATCH v2 12/17] block: Convert bdrv_write() to BdrvChild

2016-06-29 Thread Kevin Wolf
Signed-off-by: Kevin Wolf 
Acked-by: Stefan Hajnoczi 
---

This patch contains non-trivial fixes, so I think it's worth sending out a v2
for it even though I already applied the series. I added a coroutine entry
wrapper qcow(2)_write that can be used from .bdrv_write_compressed. These
wrappers will soon disappear again when .bdrv_write_compressed is changed into
.bdrv_co_pwritev_compressed (Pavel Butsykin's backup compression series).

 block/io.c |  5 +++--
 block/qcow.c   | 45 -
 block/qcow2-cluster.c  |  2 +-
 block/qcow2-refcount.c |  2 +-
 block/qcow2.c  | 47 ++-
 block/vdi.c|  4 ++--
 block/vvfat.c  |  5 ++---
 include/block/block.h  |  2 +-
 8 files changed, 100 insertions(+), 12 deletions(-)

diff --git a/block/io.c b/block/io.c
index 6dfc0eb..2e04a80 100644
--- a/block/io.c
+++ b/block/io.c
@@ -642,10 +642,11 @@ int bdrv_read(BdrvChild *child, int64_t sector_num,
   -EINVAL  Invalid sector number or nb_sectors
   -EACCES  Trying to write a read-only device
 */
-int bdrv_write(BlockDriverState *bs, int64_t sector_num,
+int bdrv_write(BdrvChild *child, int64_t sector_num,
const uint8_t *buf, int nb_sectors)
 {
-return bdrv_rw_co(bs, sector_num, (uint8_t *)buf, nb_sectors, true, 0);
+return bdrv_rw_co(child->bs, sector_num, (uint8_t *)buf, nb_sectors,
+  true, 0);
 }
 
 int bdrv_pwrite_zeroes(BlockDriverState *bs, int64_t offset,
diff --git a/block/qcow.c b/block/qcow.c
index 0db43f8..674595e 100644
--- a/block/qcow.c
+++ b/block/qcow.c
@@ -913,6 +913,49 @@ static int qcow_make_empty(BlockDriverState *bs)
 return 0;
 }
 
+typedef struct QcowWriteCo {
+BlockDriverState *bs;
+int64_t sector_num;
+const uint8_t *buf;
+int nb_sectors;
+int ret;
+} QcowWriteCo;
+
+static void qcow_write_co_entry(void *opaque)
+{
+QcowWriteCo *co = opaque;
+QEMUIOVector qiov;
+
+struct iovec iov = (struct iovec) {
+.iov_base   = (uint8_t*) co->buf,
+.iov_len= co->nb_sectors * BDRV_SECTOR_SIZE,
+};
+qemu_iovec_init_external(&qiov, &iov, 1);
+
+co->ret = qcow_co_writev(co->bs, co->sector_num, co->nb_sectors, &qiov);
+}
+
+/* Wrapper for non-coroutine contexts */
+static int qcow_write(BlockDriverState *bs, int64_t sector_num,
+  const uint8_t *buf, int nb_sectors)
+{
+Coroutine *co;
+AioContext *aio_context = bdrv_get_aio_context(bs);
+QcowWriteCo data = {
+.bs = bs,
+.sector_num = sector_num,
+.buf= buf,
+.nb_sectors = nb_sectors,
+.ret= -EINPROGRESS,
+};
+co = qemu_coroutine_create(qcow_write_co_entry);
+qemu_coroutine_enter(co, &data);
+while (data.ret == -EINPROGRESS) {
+aio_poll(aio_context, true);
+}
+return data.ret;
+}
+
 /* XXX: put compressed sectors first, then all the cluster aligned
tables to avoid losing bytes in alignment */
 static int qcow_write_compressed(BlockDriverState *bs, int64_t sector_num,
@@ -969,7 +1012,7 @@ static int qcow_write_compressed(BlockDriverState *bs, int64_t sector_num,
 
 if (ret != Z_STREAM_END || out_len >= s->cluster_size) {
 /* could not compress: write normal cluster */
-ret = bdrv_write(bs, sector_num, buf, s->cluster_sectors);
+ret = qcow_write(bs, sector_num, buf, s->cluster_sectors);
 if (ret < 0) {
 goto fail;
 }
diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index c1e9eee..a2490d7 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -1784,7 +1784,7 @@ static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table,
 goto fail;
 }
 
-ret = bdrv_write(bs->file->bs, l2_offset / BDRV_SECTOR_SIZE,
+ret = bdrv_write(bs->file, l2_offset / BDRV_SECTOR_SIZE,
  (void *)l2_table, s->cluster_sectors);
 if (ret < 0) {
 goto fail;
diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index 3bef410..12e7e6b 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -2098,7 +2098,7 @@ write_refblocks:
 on_disk_refblock = (void *)((char *) *refcount_table +
 refblock_index * s->cluster_size);
 
-ret = bdrv_write(bs->file->bs, refblock_offset / BDRV_SECTOR_SIZE,
+ret = bdrv_write(bs->file, refblock_offset / BDRV_SECTOR_SIZE,
  on_disk_refblock, s->cluster_sectors);
 if (ret < 0) {
 fprintf(stderr, "ERROR writing refblock: %s\n", strerror(-ret));
diff --git a/block/qcow2.c b/block/qcow2.c
index 0178931..cd9c27b 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -2533,6 +2533,51 @@ static int qcow2_truncate(BlockDriverState *bs, int64_t offset)
 return 0;
 }
 
+t

[Qemu-devel] [PATCH] virtio: Fix setting up host notifiers for vhost

2016-06-29 Thread Cornelia Huck
When setting up host notifiers, virtio_bus_set_host_notifier()
simply switches the handler. This will only work, however, if
the ioeventfd has already been setup; this is true for dataplane,
but not for vhost.

Fix this by starting the ioeventfd if that has not happened
before.

While we're at it, also fixup the unsetting path of
set_host_notifier_internal().

Fixes: 6798e245a3 ("virtio-bus: common ioeventfd infrastructure")
Reported-by: Jason Wang 
Reported-by: Marc-André Lureau 
Signed-off-by: Cornelia Huck 
---

This fixes the vhost regression for me, while dataplane continues
to work.

Peter, does this help with your iSCSI regression?

---
 hw/virtio/virtio-bus.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/hw/virtio/virtio-bus.c b/hw/virtio/virtio-bus.c
index 1313760..0136242 100644
--- a/hw/virtio/virtio-bus.c
+++ b/hw/virtio/virtio-bus.c
@@ -176,8 +176,8 @@ static int set_host_notifier_internal(DeviceState *proxy, VirtioBusState *bus,
 return r;
 }
 } else {
-virtio_queue_set_host_notifier_fd_handler(vq, false, false);
 k->ioeventfd_assign(proxy, notifier, n, assign);
+virtio_queue_set_host_notifier_fd_handler(vq, false, false);
 event_notifier_cleanup(notifier);
 }
 return r;
@@ -258,6 +258,9 @@ int virtio_bus_set_host_notifier(VirtioBusState *bus, int n, bool assign)
 return -ENOSYS;
 }
 if (assign) {
+if (!k->ioeventfd_started(proxy)) {
+virtio_bus_start_ioeventfd(bus);
+}
 /*
  * Stop using the generic ioeventfd, we are doing eventfd handling
  * ourselves below
-- 
2.6.6




Re: [Qemu-devel] [PATCH v6 4/7] qom: support arbitrary non-scalar properties with -object

2016-06-29 Thread Daniel P. Berrange
On Tue, Jun 28, 2016 at 06:09:08PM +0200, Marc-André Lureau wrote:
> Hi
> 
> On Tue, Jun 14, 2016 at 6:07 PM, Daniel P. Berrange  
> wrote:
> > The current -object command line syntax only allows for
> > creation of objects with scalar properties, or a list
> > with a fixed scalar element type. Objects which have
> > properties that are represented as structs in the QAPI
> > schema cannot be created using -object.
> >
> > This is a design limitation of the way the OptsVisitor
> > is written. It simply iterates over the QemuOpts values
> > as a flat list. The support for lists is enabled by
> > allowing the same key to be repeated in the opts string.
> >
> > It is not practical to extend the OptsVisitor to support
> > more complex data structures while also maintaining
> > the existing list handling behaviour that is relied upon
> > by other areas of QEMU.
> >
> > Fortunately there is no existing object that implements
> > the UserCreatable interface that relies on the list
> > handling behaviour, so it is possible to swap out the
> > OptsVisitor for a different visitor implementation, so
> > -object supports non-scalar properties, thus leaving
> > other users of OptsVisitor unaffected.
> >
> > The previously added qdict_crumple() method is able to
> > take a qdict containing a flat set of properties and
> > turn that into a arbitrarily nested set of dicts and
> > lists. By combining qemu_opts_to_qdict and qdict_crumple()
> > together, we can turn the opt string into a data structure
> > that is practically identical to that passed over QMP
> > when defining an object. The only difference is that all
> > the scalar values are represented as strings, rather than a
> > mix of strings, ints and bools. This is sufficient to let us
> > replace the OptsVisitor with the QMPInputVisitor for
> > use with -object.
> >
> > Thus -object can now support non-scalar properties,
> > for example the QMP object
> >
> >   {
> > "execute": "object-add",
> > "arguments": {
> >   "qom-type": "demo",
> >   "id": "demo0",
> >   "parameters": {
> > "foo": [
> >   { "bar": "one", "wizz": "1" },
> >   { "bar": "two", "wizz": "2" }
> > ]
> >   }
> > }
> >   }
> >
> > Would be creatable via the CLI now using
> >
> > $QEMU \
> >   -object demo,id=demo0,\
> >   foo.0.bar=one,foo.0.wizz=1,\
> >   foo.1.bar=two,foo.1.wizz=2
> >
> > Notice that this syntax is intentionally compatible
> > with that currently used by block drivers.
> >
> > This is also wired up to work for the 'object_add' command
> > in the HMP monitor with the same syntax.
> >
> >   (hmp) object_add demo,id=demo0,\
> >foo.0.bar=one,foo.0.wizz=1,\
> >foo.1.bar=two,foo.1.wizz=2
> >
> > NB indentation should not be used with HMP commands, this
> > is just for convenient formatting in this commit message.
> >
> > Signed-off-by: Daniel P. Berrange 
> 
> The patch breaks parsing of size arguments:
> 
> -object memory-backend-file,id=mem,size=512M,mem-path=/tmp: Parameter
> 'size' expects a number
> 
> 
> Looks like the previous patch needs type_size support

Yep, I've modified the previous patch to implement the type_size callback
on the QMP visitor and added a unit test case for this too.


Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



[Qemu-devel] [PATCH] spec/qcow2: bitmaps: zero bitmap table offset

2016-06-29 Thread Vladimir Sementsov-Ogievskiy
This makes it possible to effectively free in_use bitmap clusters,
including the bitmap table, without loss of meaningful data.

Currently it is only possible to free the end-point clusters and to
zero out (not free) the bitmap table.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---

Hi all!

Here is one small but significant addition to the specification of bitmaps in
qcow2.

Can we apply it just like this, or will I have to introduce a new incompatible
feature flag?

If there is an existing implementation of the format, it may break an image
saved by software using the extended spec. But are there any implementations
other than my unfinished one?


 docs/specs/qcow2.txt | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/docs/specs/qcow2.txt b/docs/specs/qcow2.txt
index 80cdfd0..dd07a82 100644
--- a/docs/specs/qcow2.txt
+++ b/docs/specs/qcow2.txt
@@ -435,6 +435,8 @@ Structure of a bitmap directory entry:
 Offset into the image file at which the bitmap table
 (described below) for the bitmap starts. Must be aligned to
 a cluster boundary.
+A zero value means that the bitmap table is not allocated and
+the bitmap should be considered empty (all bits are zero).
 
  8 - 11:bitmap_table_size
 Number of entries in the bitmap table of the bitmap.
-- 
1.8.3.1




Re: [Qemu-devel] [PATCH] linux-user: convert sockaddr_ll from host to target

2016-06-29 Thread Laurent Vivier

On 27/06/2016 at 00:18, Laurent Vivier wrote:
> As we convert sockaddr for the AF_PACKET family for sendto() (target to
> host), we also need to convert it for getsockname() (host to target).
> 
> arping uses getsockname() to get the interface address and uses
> this address with sendto().
> 
> Tested with:
> 
> /sbin/arping -D -q -c2 -I eno1 192.168.122.88
> 
> ...
> getsockname(3, {sa_family=AF_PACKET, proto=0x806, if2,
> pkttype=PACKET_HOST, addr(6)={1, 10c37b6b9a76}, [18]) = 0
> ...
> sendto(3, "..." 28, 0,
>{sa_family=AF_PACKET, proto=0x806, if2, pkttype=PACKET_HOST,
>addr(6)={1, }, 20) = 28
> ...
> 
> Signed-off-by: Laurent Vivier 
> ---
>  linux-user/syscall.c | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/linux-user/syscall.c b/linux-user/syscall.c
> index 731926d..599b946 100644
> --- a/linux-user/syscall.c
> +++ b/linux-user/syscall.c
> @@ -100,6 +100,7 @@ int __clone2(int (*fn)(void *), void *child_stack_base,
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #ifdef CONFIG_RTNETLINK
>  #include 
> @@ -1379,6 +1380,10 @@ static inline abi_long host_to_target_sockaddr(abi_ulong target_addr,
>  struct sockaddr_nl *target_nl = (struct sockaddr_nl *)target_saddr;
>  target_nl->nl_pid = tswap32(target_nl->nl_pid);
>  target_nl->nl_groups = tswap32(target_nl->nl_groups);
> +} else if (addr->sa_family == AF_PACKET) {
> +struct sockaddr_ll *target_ll = (struct sockaddr_ll *)target_saddr;
> +target_ll->sll_ifindex = tswap32(target_ll->sll_ifindex);
> +target_ll->sll_hatype = tswap16(target_ll->sll_hatype);
>  }
>  unlock_user(target_saddr, target_addr, len);
>  
> 

It would be good to have this patch in 2.7 as this bug breaks dhclient:
dhclient uses arping to check that the IP address is not already in use and
then hangs. I've seen that, at least, in a fedora21 + qemu-ppc64 container.

Thanks,
Laurent
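For readers unfamiliar with the tswap16/tswap32 helpers used in the patch: they byte-swap a value when guest and host endianness differ and are no-ops otherwise. The swap itself is the classic byte shuffle, sketched below (illustrative only; QEMU's real helpers also handle the same-endianness case):

```c
#include <assert.h>
#include <stdint.h>

/* Reverse the two bytes of a 16-bit value, as tswap16() does when
 * guest and host endianness differ. */
static uint16_t bswap16_sketch(uint16_t x)
{
    return (uint16_t)((x >> 8) | (x << 8));
}

/* Reverse the four bytes of a 32-bit value, as tswap32() does when
 * guest and host endianness differ. */
static uint32_t bswap32_sketch(uint32_t x)
{
    return ((x >> 24) & 0x000000ffu) |
           ((x >> 8)  & 0x0000ff00u) |
           ((x << 8)  & 0x00ff0000u) |
           ((x << 24) & 0xff000000u);
}
```

This is why only the multi-byte fields (sll_ifindex, sll_hatype) need conversion; the sll_addr byte array is endianness-neutral.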



Re: [Qemu-devel] [PATCH] virtio: Fix setting up host notifiers for vhost

2016-06-29 Thread Marc-André Lureau
Hi

On Wed, Jun 29, 2016 at 2:17 PM, Cornelia Huck  wrote:
> When setting up host notifiers, virtio_bus_set_host_notifier()
> simply switches the handler. This will only work, however, if
> the ioeventfd has already been setup; this is true for dataplane,
> but not for vhost.
>
> Fix this by starting the ioeventfd if that has not happened
> before.
>
> While we're at it, also fixup the unsetting path of
> set_host_notifier_internal().
>
> Fixes: 6798e245a3 ("virtio-bus: common ioeventfd infrastructure")
> Reported-by: Jason Wang 
> Reported-by: Marc-André Lureau 
> Signed-off-by: Cornelia Huck 
> ---
>
> This fixes the vhost regression for me, while dataplane continues
> to work.
>

That doesn't work here,
QTEST_QEMU_BINARY=x86_64-softmmu/qemu-system-x86_64  tests/vhost-user-test

hangs in /x86_64/vhost-user/migrate

> Peter, does this help with your iSCSI regression?
>
> ---
>  hw/virtio/virtio-bus.c | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/hw/virtio/virtio-bus.c b/hw/virtio/virtio-bus.c
> index 1313760..0136242 100644
> --- a/hw/virtio/virtio-bus.c
> +++ b/hw/virtio/virtio-bus.c
> @@ -176,8 +176,8 @@ static int set_host_notifier_internal(DeviceState *proxy, VirtioBusState *bus,
>  return r;
>  }
>  } else {
> -virtio_queue_set_host_notifier_fd_handler(vq, false, false);
>  k->ioeventfd_assign(proxy, notifier, n, assign);
> +virtio_queue_set_host_notifier_fd_handler(vq, false, false);
>  event_notifier_cleanup(notifier);
>  }
>  return r;
> @@ -258,6 +258,9 @@ int virtio_bus_set_host_notifier(VirtioBusState *bus, int n, bool assign)
>  return -ENOSYS;
>  }
>  if (assign) {
> +if (!k->ioeventfd_started(proxy)) {
> +virtio_bus_start_ioeventfd(bus);
> +}
>  /*
>   * Stop using the generic ioeventfd, we are doing eventfd handling
>   * ourselves below
> --
> 2.6.6
>



-- 
Marc-André Lureau



Re: [Qemu-devel] [PATCH v0] spapr: Restore support for 970MP and POWER8NVL CPU cores

2016-06-29 Thread Thomas Huth
On 29.06.2016 13:37, Bharata B Rao wrote:
> Introduction of core based CPU hotplug for PowerPC sPAPR didn't
> add support for 970MP and POWER8NVL based core types. Add support for
> the same.
> 
> While we are here, add support for explicit specification of POWER5+_v2.1
> core type.
> 
> Signed-off-by: Bharata B Rao 
> ---
>  hw/ppc/spapr_cpu_core.c | 20 ++--
>  1 file changed, 18 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
> index 2aa0dc5..e30b159 100644
> --- a/hw/ppc/spapr_cpu_core.c
> +++ b/hw/ppc/spapr_cpu_core.c
> @@ -337,12 +337,15 @@ static void glue(glue(spapr_cpu_core_, _fname), 
> _initfn(Object *obj)) \
>  core->cpu_class = oc; \
>  }
>  
> +SPAPR_CPU_CORE_INITFN(970mp_v1.0, 970MP_v10);
> +SPAPR_CPU_CORE_INITFN(970mp_v1.1, 970MP_v11);
>  SPAPR_CPU_CORE_INITFN(970_v2.2, 970);
>  SPAPR_CPU_CORE_INITFN(POWER5+_v2.1, POWER5plus);
>  SPAPR_CPU_CORE_INITFN(POWER7_v2.3, POWER7);
>  SPAPR_CPU_CORE_INITFN(POWER7+_v2.1, POWER7plus);
>  SPAPR_CPU_CORE_INITFN(POWER8_v2.0, POWER8);
>  SPAPR_CPU_CORE_INITFN(POWER8E_v2.1, POWER8E);
> +SPAPR_CPU_CORE_INITFN(POWER8NVL_v1.0, POWER8NVL);
>  
>  typedef struct SPAPRCoreInfo {
>  const char *name;
> @@ -350,10 +353,19 @@ typedef struct SPAPRCoreInfo {
>  } SPAPRCoreInfo;
>  
>  static const SPAPRCoreInfo spapr_cores[] = {
> -/* 970 */
> +/* 970 and aliases */
> +{ .name = "970_v2.2", .initfn = spapr_cpu_core_970_initfn },
>  { .name = "970", .initfn = spapr_cpu_core_970_initfn },
>  
> -/* POWER5 */
> +/* 970MP variants and aliases */
> +{ .name = "970MP_v1.0", .initfn = spapr_cpu_core_970MP_v10_initfn },
> +{ .name = "970mp_v1.0", .initfn = spapr_cpu_core_970MP_v10_initfn },
> +{ .name = "970MP_v1.1", .initfn = spapr_cpu_core_970MP_v11_initfn },
> +{ .name = "970mp_v1.1", .initfn = spapr_cpu_core_970MP_v11_initfn },
> +{ .name = "970mp", .initfn = spapr_cpu_core_970MP_v11_initfn },

Are the upper-case "970MP_v1.1" and "970MP_v1.0" lines required here?
According to target-ppc/cpu-models.c, these CPU models are always
spelled with lower-case letters in QEMU, aren't they?

 Thomas




Re: [Qemu-devel] [PATCH] virtio: abort on fatal error instead of just exiting

2016-06-29 Thread Markus Armbruster
Igor Mammedov  writes:

> Replace the mostly useless exit(1) on fatal error paths with
> abort(), so that it is possible to generate a core dump that
> can be used to analyse the cause of the problem.
>
> Signed-off-by: Igor Mammedov 
> ---
>  hw/virtio/virtio.c | 24 
>  1 file changed, 12 insertions(+), 12 deletions(-)
>
> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> index 7ed06ea..9d3ac72 100644
> --- a/hw/virtio/virtio.c
> +++ b/hw/virtio/virtio.c
> @@ -315,7 +315,7 @@ static int virtqueue_num_heads(VirtQueue *vq, unsigned 
> int idx)
>  if (num_heads > vq->vring.num) {
>  error_report("Guest moved used index from %u to %u",
>   idx, vq->shadow_avail_idx);
> -exit(1);
> +abort();

What's wrong with a simple assert(num_heads <= vq->vring.num)?

>  }
>  /* On success, callers read a descriptor at vq->last_avail_idx.
>   * Make sure descriptor read does not bypass avail index read. */
[...]



Re: [Qemu-devel] [PATCH v3 1/1] target-arm: Use Neon for zero checking

2016-06-29 Thread Paolo Bonzini


On 29/06/2016 10:47, vija...@cavium.com wrote:
> From: Vijay 
> 
> Use Neon instructions to perform zero checking of
> buffer. This is helps in reducing total migration time.
> 
> Use case: Idle VM live migration with 4 VCPUS and 8GB ram
> running CentOS 7.
> 
> Without Neon, the Total migration time is 3.5 Sec
> 
> Migration status: completed
> total time: 3560 milliseconds
> downtime: 33 milliseconds
> setup: 5 milliseconds
> transferred ram: 297907 kbytes
> throughput: 685.76 mbps
> remaining ram: 0 kbytes
> total ram: 8519872 kbytes
> duplicate: 2062760 pages
> skipped: 0 pages
> normal: 69808 pages
> normal bytes: 279232 kbytes
> dirty sync count: 3
> 
> With Neon, the total migration time is 2.9 Sec
> 
> Migration status: completed
> total time: 2960 milliseconds
> downtime: 65 milliseconds
> setup: 4 milliseconds
> transferred ram: 299869 kbytes
> throughput: 830.19 mbps
> remaining ram: 0 kbytes
> total ram: 8519872 kbytes
> duplicate: 2064313 pages
> skipped: 0 pages
> normal: 70294 pages
> normal bytes: 281176 kbytes
> dirty sync count: 3
> 
> Signed-off-by: Vijaya Kumar K 
> Signed-off-by: Suresh 
> ---
>  util/cutils.c |7 +++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/util/cutils.c b/util/cutils.c
> index 5830a68..4779403 100644
> --- a/util/cutils.c
> +++ b/util/cutils.c
> @@ -184,6 +184,13 @@ int qemu_fdatasync(int fd)
>  #define SPLAT(p)   _mm_set1_epi8(*(p))
>  #define ALL_EQ(v1, v2) (_mm_movemask_epi8(_mm_cmpeq_epi8(v1, v2)) == 0x)
>  #define VEC_OR(v1, v2) (_mm_or_si128(v1, v2))
> +#elif __aarch64__
> +#include "arm_neon.h"
> +#define VECTYPEuint64x2_t
> +#define ALL_EQ(v1, v2) \
> +((vgetq_lane_u64(v1, 0) == vgetq_lane_u64(v2, 0)) && \
> + (vgetq_lane_u64(v1, 1) == vgetq_lane_u64(v2, 1)))
> +#define VEC_OR(v1, v2) ((v1) | (v2))
>  #else
>  #define VECTYPEunsigned long
>  #define SPLAT(p)   (*(p) * (~0UL / 255))
> 

Acked-by: Paolo Bonzini 
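For reference, the VECTYPE/ALL_EQ/VEC_OR macros being specialized here all serve the same scalar algorithm in util/cutils.c: OR every word of the buffer into an accumulator and compare once at the end. A simplified sketch of that fallback (assumes len is a multiple of sizeof(unsigned long) and the buffer is suitably aligned):

```c
#include <assert.h>
#include <stddef.h>

/* Word-at-a-time zero check: OR all words together, then test the
 * accumulator once.  The SSE/Neon variants do the same with 128-bit
 * vectors in place of unsigned long. */
static int buffer_is_zero_sketch(const void *buf, size_t len)
{
    const unsigned long *p = buf;
    unsigned long acc = 0;
    size_t i;

    for (i = 0; i < len / sizeof(unsigned long); i++) {
        acc |= p[i];
    }
    return acc == 0;
}
```

The Neon ALL_EQ in the patch compares both 64-bit lanes of the accumulator against zero, which is the 128-bit equivalent of the final `acc == 0` test here.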



Re: [Qemu-devel] [PATCH] spec/qcow2: bitmaps: zero bitmap table offset

2016-06-29 Thread Vladimir Sementsov-Ogievskiy

On 29.06.2016 15:22, Vladimir Sementsov-Ogievskiy wrote:

This makes it possible to effectively free in_use bitmap clusters,
including the bitmap table, without loss of meaningful data.

Currently it is only possible to free the end-point clusters and to
zero out (not free) the bitmap table.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---

Hi all!

Here is one small but significant addition to the specification of bitmaps in
qcow2.

Can we apply it just like this, or will I have to introduce a new incompatible
feature flag?

If there is an existing implementation of the format, it may break an image
saved by software using the extended spec. But are there any implementations
other than my unfinished one?


  docs/specs/qcow2.txt | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/docs/specs/qcow2.txt b/docs/specs/qcow2.txt
index 80cdfd0..dd07a82 100644
--- a/docs/specs/qcow2.txt
+++ b/docs/specs/qcow2.txt
@@ -435,6 +435,8 @@ Structure of a bitmap directory entry:
  Offset into the image file at which the bitmap table
  (described below) for the bitmap starts. Must be aligned to
  a cluster boundary.
+A zero value means that the bitmap table is not allocated and
+the bitmap should be considered empty (all bits are zero).
  
   8 - 11:bitmap_table_size

  Number of entries in the bitmap table of the bitmap.


+   bitmap_table_size must be zero if bitmap_table_offset is zero.



--
Best regards,
Vladimir




Re: [Qemu-devel] [PATCH 3/2] MAINTAINERS: Remove Blue Swirl leftovers

2016-06-29 Thread Ed Maste
On 20 June 2016 at 10:19, Markus Armbruster  wrote:
>
> As per Paolo's recommendation, downgrade status of "BSD user" from
> Maintained to Orphan since the FreeBSD guys effectively forked it, and
> "SPARC target" from Maintained to Odd Fixes, since we still have the
> overall TCG maintainer looking after it.

Note that we are still very interested in having the BSD user
refactoring and improvements upstream, and are not interested in
indefinitely carrying around a fork. We do need to figure out how to
effectively restart the effort to upstream the work.

The bsd-user work is stable and usable. We use it to cross-build more
than 20,000 packages of third-party software in the FreeBSD ports
collection.



[Qemu-devel] [PATCH] vhost-user: disable chardev handlers on close

2016-06-29 Thread Paolo Bonzini
This otherwise causes a use-after-free if network backend cleanup
is performed before character device cleanup.

Cc: Marc-André Lureau 
Signed-off-by: Paolo Bonzini 
---
I'm including this in the pull request too.

 net/vhost-user.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/net/vhost-user.c b/net/vhost-user.c
index 636899a..92f4cfd 100644
--- a/net/vhost-user.c
+++ b/net/vhost-user.c
@@ -151,6 +151,11 @@ static void vhost_user_cleanup(NetClientState *nc)
 vhost_net_cleanup(s->vhost_net);
 s->vhost_net = NULL;
 }
+if (s->chr) {
+qemu_chr_add_handlers(s->chr, NULL, NULL, NULL, NULL);
+qemu_chr_fe_release(s->chr);
+s->chr = NULL;
+}
 
 qemu_purge_queued_packets(nc);
 }
-- 
1.8.3.1
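The shape of this fix is generic: callbacks registered with a longer-lived object must be detached before the state they point at is freed. A minimal sketch of the pattern with hypothetical types (not QEMU's actual chardev API):

```c
#include <assert.h>
#include <stddef.h>

/* A longer-lived event source holding a registered handler. */
typedef struct Chardev {
    void (*read_cb)(void *opaque);
    void *opaque;              /* points into backend state */
} Chardev;

static void chr_add_handlers(Chardev *chr, void (*cb)(void *), void *opaque)
{
    chr->read_cb = cb;
    chr->opaque = opaque;
}

static void dummy_cb(void *opaque)
{
    (void)opaque;
}

/* Backend teardown: detach the handler *before* the backend state is
 * freed, mirroring the qemu_chr_add_handlers(s->chr, NULL, NULL, NULL,
 * NULL) call in the patch.  Skipping this step leaves chr->opaque
 * dangling, and a later chardev event would call into freed memory. */
static void backend_cleanup(Chardev *chr)
{
    if (chr) {
        chr_add_handlers(chr, NULL, NULL);
    }
}
```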




Re: [Qemu-devel] [PATCH 3/2] MAINTAINERS: Remove Blue Swirl leftovers

2016-06-29 Thread Paolo Bonzini


On 29/06/2016 15:24, Ed Maste wrote:
> On 20 June 2016 at 10:19, Markus Armbruster  wrote:
>>
>> As per Paolo's recommendation, downgrade status of "BSD user" from
>> Maintained to Orphan since the FreeBSD guys effectively forked it, and
>> "SPARC target" from Maintained to Odd Fixes, since we still have the
>> overall TCG maintainer looking after it.
> 
> Note that we are still very interested in having the BSD user
> refactoring and improvements upstream, and are not interested in
> indefinitely carrying around a fork. We do need to figure out how to
> effectively restart the effort to upstream the work.
> 
> The bsd-user work is stable and usable. We use it to cross-build more
> than 20,000 packages of third-party software in the FreeBSD ports
> collection.

Honestly I'm wondering if a huge code drop could be the right solution
here.  It's not how we usually do things, but rules exist to be broken...

Paolo



Re: [Qemu-devel] [PATCH v0] spapr: Restore support for 970MP and POWER8NVL CPU cores

2016-06-29 Thread Bharata B Rao
On Wed, Jun 29, 2016 at 02:28:08PM +0200, Thomas Huth wrote:
> On 29.06.2016 13:37, Bharata B Rao wrote:
> > Introduction of core based CPU hotplug for PowerPC sPAPR didn't
> > add support for 970MP and POWER8NVL based core types. Add support for
> > the same.
> > 
> > While we are here, add support for explicit specification of POWER5+_v2.1
> > core type.
> > 
> > Signed-off-by: Bharata B Rao 
> > ---
> >  hw/ppc/spapr_cpu_core.c | 20 ++--
> >  1 file changed, 18 insertions(+), 2 deletions(-)
> > 
> > diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
> > index 2aa0dc5..e30b159 100644
> > --- a/hw/ppc/spapr_cpu_core.c
> > +++ b/hw/ppc/spapr_cpu_core.c
> > @@ -337,12 +337,15 @@ static void glue(glue(spapr_cpu_core_, _fname), 
> > _initfn(Object *obj)) \
> >  core->cpu_class = oc; \
> >  }
> >  
> > +SPAPR_CPU_CORE_INITFN(970mp_v1.0, 970MP_v10);
> > +SPAPR_CPU_CORE_INITFN(970mp_v1.1, 970MP_v11);
> >  SPAPR_CPU_CORE_INITFN(970_v2.2, 970);
> >  SPAPR_CPU_CORE_INITFN(POWER5+_v2.1, POWER5plus);
> >  SPAPR_CPU_CORE_INITFN(POWER7_v2.3, POWER7);
> >  SPAPR_CPU_CORE_INITFN(POWER7+_v2.1, POWER7plus);
> >  SPAPR_CPU_CORE_INITFN(POWER8_v2.0, POWER8);
> >  SPAPR_CPU_CORE_INITFN(POWER8E_v2.1, POWER8E);
> > +SPAPR_CPU_CORE_INITFN(POWER8NVL_v1.0, POWER8NVL);
> >  
> >  typedef struct SPAPRCoreInfo {
> >  const char *name;
> > @@ -350,10 +353,19 @@ typedef struct SPAPRCoreInfo {
> >  } SPAPRCoreInfo;
> >  
> >  static const SPAPRCoreInfo spapr_cores[] = {
> > -/* 970 */
> > +/* 970 and aliases */
> > +{ .name = "970_v2.2", .initfn = spapr_cpu_core_970_initfn },
> >  { .name = "970", .initfn = spapr_cpu_core_970_initfn },
> >  
> > -/* POWER5 */
> > +/* 970MP variants and aliases */
> > +{ .name = "970MP_v1.0", .initfn = spapr_cpu_core_970MP_v10_initfn },
> > +{ .name = "970mp_v1.0", .initfn = spapr_cpu_core_970MP_v10_initfn },
> > +{ .name = "970MP_v1.1", .initfn = spapr_cpu_core_970MP_v11_initfn },
> > +{ .name = "970mp_v1.1", .initfn = spapr_cpu_core_970MP_v11_initfn },
> > +{ .name = "970mp", .initfn = spapr_cpu_core_970MP_v11_initfn },
> 
> Are the upper-case "970MP_v1.1" and "970MP_v1.0" lines required here?
> According to target-ppc/cpu-models.c, these CPU models are always
> spelled with lower-case letters in QEMU, aren't they?

The .name here is used to build the type of spapr-cpu-core based on the
CPU model specified. I saw that we support both

-cpu 970MP_v1.1 and
-cpu 970mp_v1.1

Same for v1.0 (though only -cpu 970mp is supported, not -cpu 970MP!)

Hence the above lines are needed to register appropriate spapr-cpu-core
types (970mp_v1.1-spapr-cpu-core or 970MP_v1.1-spapr-cpu-core) based on the
CPU model specified.

Regards,
Bharata.
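The registration described above boils down to a name-to-initfn table consulted with the CPU model string, where several spellings and aliases resolve to the same init function. A minimal sketch of that lookup, with invented stand-in init functions rather than the QEMU symbols:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct SPAPRCoreInfo {
    const char *name;
    int (*initfn)(void);
} SPAPRCoreInfo;

static int init_970mp_v10(void) { return 10; }
static int init_970mp_v11(void) { return 11; }

/* Both spellings register the same init function, mirroring how QEMU
 * accepts -cpu 970MP_v1.1 as well as -cpu 970mp_v1.1; the bare "970mp"
 * alias resolves to the latest revision. */
static const SPAPRCoreInfo cores[] = {
    { "970MP_v1.0", init_970mp_v10 },
    { "970mp_v1.0", init_970mp_v10 },
    { "970MP_v1.1", init_970mp_v11 },
    { "970mp_v1.1", init_970mp_v11 },
    { "970mp",      init_970mp_v11 },
};

static const SPAPRCoreInfo *find_core(const char *model)
{
    for (size_t i = 0; i < sizeof(cores) / sizeof(cores[0]); i++) {
        if (strcmp(cores[i].name, model) == 0) {
            return &cores[i];
        }
    }
    return NULL;  /* unknown model: no spapr-cpu-core type registered */
}
```

Note that, per the discussion, "970MP" with no version suffix would return NULL here, matching the asymmetry Bharata points out.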




[Qemu-devel] Bug or what?

2016-06-29 Thread Mundek

Hi,
I am developing an operating system, and when I run "qemu-system-i386
-kernel kernel.mkern -d in_asm -no-reboot"

I get (at the end):

Servicing hardware INT=0x20

IN:
0x0020:  movsl  %ds:(%esi),%es:(%edi)
0x0021:  incb   (%eax)
0x0023:  lock xchg %ebp,%ecx
0x0026:  add%dh,%al
0x0028:  sub$0xd6,%al
0x002a:  add%dh,%al
0x002c:  sub$0xd6,%al
0x002e:  add%dh,%al
0x0030:  sub$0xd6,%al
0x0032:  add%dh,%al
0x0034:  sub$0xd6,%al
0x0036:  add%dh,%al
0x0038:  push   %edi
0x0039:  out%eax,(%dx)
0x003a:  add%dh,%al
0x003c:  sub$0xd6,%al
0x003e:  add%dh,%al
0x0040:  push   %eax
0x0041:  push   %esi
0x0042:  add%al,%al
0x0044:  dec%ebp
0x0045:  clc
0x0046:  add%dh,%al
0x0048:  inc%ecx
0x0049:  clc
0x004a:  add%dh,%al
0x004c:  (bad)
0x004d:  jecxz  0x4f
Disassembler disagrees with translator over instruction decoding
Please report this to qemu-devel@nongnu.org

So here I am, reporting this. Is this my shitty code, or your emulator?
Thanks,
Olgierd (m00nd3ck)



[Qemu-devel] [PATCH v3 0/1] ARM64: Live migration optimization

2016-06-29 Thread vijayak
From: Vijaya Kumar K 

To optimize live migration time on ARM64 machines,
Neon instructions are used for zero-page checking.

With these changes, total migration time comes down
from 3.5 seconds to 2.9 seconds.

These patches are tested on top of (GICv3 live migration support)
https://lists.gnu.org/archive/html/qemu-devel/2015-10/msg05284.html
However there is no direct dependency on these patches.

v2 -> v3 changes:
  - Dropped the two ThunderX-specific patches from this series. They will
be added once the kernel exposes the MIDR register to userspace.
  - Used generic zero page checking function. Only macros
are updated.

v1 -> v2 changes:

  - Dropped 'target-arm: Update page size for aarch64' patch.
  - Each loop in zero buffer check function is reduced to
16 from 32.
  - Replaced vorrq_u64 with '|' in Neon macros
  - Renamed local variable to reflect 128 bit.
  - Introduced new file cpuinfo.c to parse /proc/cpuinfo
  - Added Thunderx specific patches to add prefetch in
zero buffer check function.

Vijay (1):
  target-arm: Use Neon for zero checking

 util/cutils.c |7 +++
 1 file changed, 7 insertions(+)

-- 
1.7.9.5




[Qemu-devel] [PATCH v3 1/1] target-arm: Use Neon for zero checking

2016-06-29 Thread vijayak
From: Vijay 

Use Neon instructions to perform zero checking of
the buffer. This helps in reducing total migration time.

Use case: Idle VM live migration with 4 VCPUS and 8GB ram
running CentOS 7.

Without Neon, the total migration time is 3.5 sec

Migration status: completed
total time: 3560 milliseconds
downtime: 33 milliseconds
setup: 5 milliseconds
transferred ram: 297907 kbytes
throughput: 685.76 mbps
remaining ram: 0 kbytes
total ram: 8519872 kbytes
duplicate: 2062760 pages
skipped: 0 pages
normal: 69808 pages
normal bytes: 279232 kbytes
dirty sync count: 3

With Neon, the total migration time is 2.9 sec

Migration status: completed
total time: 2960 milliseconds
downtime: 65 milliseconds
setup: 4 milliseconds
transferred ram: 299869 kbytes
throughput: 830.19 mbps
remaining ram: 0 kbytes
total ram: 8519872 kbytes
duplicate: 2064313 pages
skipped: 0 pages
normal: 70294 pages
normal bytes: 281176 kbytes
dirty sync count: 3
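As a sanity check, the reported throughput figures follow (approximately) from the transferred RAM and total time. The helper below is an invented name for illustration; it slightly underestimates QEMU's own numbers, presumably because QEMU excludes some setup time from the divisor:

```c
#include <assert.h>

/* Approximate migration throughput in decimal megabits per second,
 * from "transferred ram" (KiB) and "total time" (ms) as printed above. */
static double migration_mbps(long transferred_kbytes, long total_ms)
{
    double bits = (double)transferred_kbytes * 1024.0 * 8.0;
    return bits / ((double)total_ms / 1000.0) / 1e6;
}
```

For the Neon run, 299869 KiB over 2960 ms gives about 829.9 mbps against the reported 830.19; the non-Neon run's 297907 KiB over 3560 ms gives about 685.5 mbps against the reported 685.76.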

Signed-off-by: Vijaya Kumar K 
Signed-off-by: Suresh 
---
 util/cutils.c |7 +++
 1 file changed, 7 insertions(+)

diff --git a/util/cutils.c b/util/cutils.c
index 5830a68..4779403 100644
--- a/util/cutils.c
+++ b/util/cutils.c
@@ -184,6 +184,13 @@ int qemu_fdatasync(int fd)
 #define SPLAT(p)   _mm_set1_epi8(*(p))
 #define ALL_EQ(v1, v2) (_mm_movemask_epi8(_mm_cmpeq_epi8(v1, v2)) == 0x)
 #define VEC_OR(v1, v2) (_mm_or_si128(v1, v2))
+#elif __aarch64__
+#include "arm_neon.h"
+#define VECTYPEuint64x2_t
+#define ALL_EQ(v1, v2) \
+((vgetq_lane_u64(v1, 0) == vgetq_lane_u64(v2, 0)) && \
+ (vgetq_lane_u64(v1, 1) == vgetq_lane_u64(v2, 1)))
+#define VEC_OR(v1, v2) ((v1) | (v2))
 #else
 #define VECTYPEunsigned long
 #define SPLAT(p)   (*(p) * (~0UL / 255))
-- 
1.7.9.5
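The macros above slot into a generic zero-checking loop: OR chunks of the buffer together (VEC_OR) and test the accumulated value against zero (ALL_EQ). The scalar fallback path (the #else branch) can be sketched portably as below; this is an illustrative re-implementation, not the actual util/cutils.c code, and it sidesteps alignment concerns with memcpy rather than the real code's aligned loads:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

static bool buffer_is_zero_sketch(const void *buf, size_t len)
{
    const unsigned char *p = buf;
    unsigned long acc = 0;
    size_t i = 0;

    /* Bulk loop: OR together word-sized chunks, mirroring VEC_OR. */
    for (; i + sizeof(unsigned long) <= len; i += sizeof(unsigned long)) {
        unsigned long v;
        memcpy(&v, p + i, sizeof(v));   /* avoid alignment assumptions */
        acc |= v;
    }
    /* Tail bytes that don't fill a whole word. */
    for (; i < len; i++) {
        acc |= p[i];
    }
    return acc == 0;                    /* ALL_EQ(acc, zero) equivalent */
}
```

The Neon path in the patch does the same thing two 64-bit lanes at a time: VEC_OR becomes a plain `|` on uint64x2_t, and ALL_EQ extracts both lanes with vgetq_lane_u64 and compares them individually.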




Re: [Qemu-devel] [PATCH v9 12/13] e1000e: remove unnecessary internal msi state flag

2016-06-29 Thread Markus Armbruster
Cao jin  writes:

> Internal flag E1000E_USE_MSI is unnecessary, as are the helper
> functions e1000e_init_msi() and e1000e_cleanup_msi(), so remove them all.
>
> cc: Dmitry Fleytman 
> cc: Jason Wang 
> cc: Markus Armbruster 
> cc: Marcel Apfelbaum 
> cc: Michael S. Tsirkin 
>
> Signed-off-by: Cao jin 

Reviewed-by: Markus Armbruster 



Re: [Qemu-devel] [PATCH v9 13/13] vmw_pvscsi: remove unnecessary internal msi state flag

2016-06-29 Thread Markus Armbruster
Cao jin  writes:

> Internal flag msi_used is unnecessary: msi_uninit() can be called
> directly, and msi_enabled() is enough to check the device's MSI state.
>
> But for migration compatibility, keep the field in structure.
>
> cc: Paolo Bonzini 
> cc: Dmitry Fleytman 
> cc: Markus Armbruster 
> cc: Marcel Apfelbaum 
> cc: Michael S. Tsirkin 
>
> Signed-off-by: Cao jin 

Reviewed-by: Markus Armbruster 



Re: [Qemu-devel] [PATCH 1/3] Mediated device Core driver

2016-06-29 Thread Xiao Guangrong



On 06/21/2016 12:31 AM, Kirti Wankhede wrote:

Design for Mediated Device Driver:
The main purpose of this driver is to provide a common interface for
mediated device management that can be used by different drivers of
different devices.

This module provides a generic interface to create the device, add it to
the mediated bus, add the device to an IOMMU group and then to a VFIO group.

Below is the high-level block diagram, with NVIDIA, Intel and IBM devices
as examples, since these are the devices which are going to actively use
this module as of now.

  +---+
  |   |
  | +---+ |  mdev_register_driver() +--+
  | |   | +<+ __init() |
  | |   | | |  |
  | |  mdev | +>+  |<-> VFIO user
  | |  bus  | | probe()/remove()| vfio_mpci.ko |APIs
  | |  driver   | | |  |
  | |   | | +--+
  | |   | |  mdev_register_driver() +--+
  | |   | +<+ __init() |
  | |   | | |  |
  | |   | +>+  |<-> VFIO user
  | +---+ | probe()/remove()| vfio_mccw.ko |APIs
  |   | |  |
  |  MDEV CORE| +--+
  |   MODULE  |
  |   mdev.ko |
  | +---+ |  mdev_register_device() +--+
  | |   | +<+  |
  | |   | | |  nvidia.ko   |<-> physical
  | |   | +>+  |device
  | |   | |callback +--+
  | | Physical  | |
  | |  device   | |  mdev_register_device() +--+
  | | interface | |<+  |
  | |   | | |  i915.ko |<-> physical
  | |   | +>+  |device
  | |   | |callback +--+
  | |   | |
  | |   | |  mdev_register_device() +--+
  | |   | +<+  |
  | |   | | | ccw_device.ko|<-> physical
  | |   | +>+  |device
  | |   | |callback +--+
  | +---+ |
  +---+

Core driver provides two types of registration interfaces:
1. Registration interface for mediated bus driver:

/**
 * struct mdev_driver - Mediated device's driver
 * @name: driver name
 * @probe: called when new device created
 * @remove: called when device removed
 * @match: called when new device or driver is added for this bus.
 *         Return 1 if given device can be handled by given driver and
 *         zero otherwise.
 * @driver: device driver structure
 **/
struct mdev_driver {
        const char *name;
        int  (*probe)  (struct device *dev);
        void (*remove) (struct device *dev);
        int  (*match)  (struct device *dev);
        struct device_driver driver;
};

int  mdev_register_driver(struct mdev_driver *drv, struct module *owner);
void mdev_unregister_driver(struct mdev_driver *drv);

A mediated device's driver should use this interface to register with the
core driver. With this, the mediated device driver for such devices is
responsible for adding the mediated device to the VFIO group.

2. Physical device driver interface
This interface provides the vendor driver a set of APIs to manage
physical-device-related work in its own driver. The APIs are:
- supported_config: provide supported configuration list by the vendor
driver
- create: to allocate basic resources in vendor driver for a mediated
  device.
- destroy: to free resources in vendor driver when mediated device is
   destroyed.
- start: to initiate mediated device initialization process from vendor
 driver when VM boots and before QEMU starts.
- shutdown: to teardown mediated device resources during VM teardown.
- read : read emulation callback.
- write: write emulation callback.
- set_irqs: send interrupt configuration information that QEMU sets.
- get_region_info: to provide region size and its flags for the mediated
   device.
- validate_map_request: to validate remap pfn request.

This registration interface should be used by vendor drivers to register
each physical device to mdev core driver.

Signed-off-by: Kirti Wankhede 
Signed-off-by: Neo Jia 
Change-Id: I73a5084574270b14541c529461ea2f03c292d510
---
  drivers/vfio/Kconfig |   1 +
  drivers/vfio/Makefile|   1 +
  drivers/vfio/mdev/Kconfig|  11 +
  drivers/vfio/mdev/Makefile   |   5 +
  drivers/vfio/mdev/mdev_core.c  
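The two-sided registration described above (bus drivers register probe/match callbacks; new devices are matched against registered drivers) can be illustrated with a small userspace model. This is purely a sketch of the dispatch idea: the real interface lives in the kernel's driver core, and every name below is invented for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct device {
    const char *name;
    const char *bound_driver;
};

struct mdev_driver {
    const char *name;
    int (*probe)(struct device *dev);
    int (*match)(struct device *dev);   /* 1 = can handle, 0 = cannot */
};

#define MAX_DRIVERS 4
static struct mdev_driver *drivers[MAX_DRIVERS];
static int n_drivers;

static int mdev_register_driver(struct mdev_driver *drv)
{
    if (n_drivers == MAX_DRIVERS) {
        return -1;
    }
    drivers[n_drivers++] = drv;
    return 0;
}

/* Called when a new mediated device appears: find a matching driver,
 * probe it, and record the binding on the device. */
static int mdev_device_add(struct device *dev)
{
    for (int i = 0; i < n_drivers; i++) {
        if (drivers[i]->match(dev) && drivers[i]->probe(dev) == 0) {
            dev->bound_driver = drivers[i]->name;
            return 0;
        }
    }
    return -1;  /* no driver claimed the device */
}

/* A sample bus driver claiming only "pci-*" mediated devices. */
static int pci_match(struct device *dev)
{
    return strncmp(dev->name, "pci-", 4) == 0;
}

static int pci_probe(struct device *dev)
{
    (void)dev;
    return 0;
}

static struct mdev_driver vfio_mpci = { "vfio_mpci", pci_probe, pci_match };
```

A device whose name no registered driver matches is simply left unbound, which is the behaviour the match() callback in the proposal exists to arbitrate.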

Re: [Qemu-devel] [RFC v3 17/19] tcg: enable thread-per-vCPU

2016-06-29 Thread Sergey Fedorov
On 03/06/16 23:40, Alex Bennée wrote:
> There are a number of changes that occur at the same time here:
>
>   - tb_lock is no longer a NOP for SoftMMU
>
>   The tb_lock protects both translation and memory map structures. The
>   debug assert is updated to reflect this.

This could be a separate patch.

If we use tb_lock in system-mode to protect the structures protected by
mmap_lock in user-mode then maybe we can merge those two locks because,
as I remember, tb_lock in user-mode emulation is only held outside of
mmap_lock for patching TB for direct jumps.

>
>   - introduce a single vCPU qemu_tcg_cpu_thread_fn
>
>   One of these is spawned per vCPU with its own Thread and Condition
>   variables. qemu_tcg_single_cpu_thread_fn is the new name for the old
>   single threaded function.

So we have 'tcg_current_rr_cpu' and 'qemu_cpu_kick_rr_cpu() at this
moment, maybe name this function like qemu_tcg_rr_cpu_thread_fn()? ;)

>
>   - the TLS current_cpu variable is now live for the lifetime of MTTCG
> vCPU threads. This is for future work where async jobs need to know
> the vCPU context they are operating in.

This is important change because we set 'current_cpu' to NULL outside of
cpu_exec() before, I wonder why.

>
> The user can now switch on multi-thread behaviour and spawn a thread
> per vCPU. For a simple test like:
>
>   ./arm/run ./arm/locking-test.flat -smp 4 -accel tcg,thread=multi

It would be nice to mention that the simple test is from kvm_unit_tests.

>
> Will now use 4 vCPU threads and have an expected FAIL (instead of the
> unexpected PASS) as the default mode of the test has no protection when
> incrementing a shared variable.
>
> However we still default to a single thread for all vCPUs as individual
> front-end and back-ends need additional fixes to safely support:
>   - atomic behaviour
>   - tb invalidation
>   - memory ordering
>
> The function default_mttcg_enabled can be tweaked as support is added.
>
> Signed-off-by: KONRAD Frederic 
> Signed-off-by: Paolo Bonzini 
> [AJB: Some fixes, conditionally, commit rewording]
> Signed-off-by: Alex Bennée 
>
(snip)
> diff --git a/cpus.c b/cpus.c
> index 35374fd..419caa2 100644
> --- a/cpus.c
> +++ b/cpus.c
(snip)
> @@ -1042,9 +1039,7 @@ static void qemu_tcg_wait_io_event(CPUState *cpu)
>  qemu_cond_wait(cpu->halt_cond, &qemu_global_mutex);
>  }
>  
> -CPU_FOREACH(cpu) {
> -qemu_wait_io_event_common(cpu);
> -}
> +qemu_wait_io_event_common(cpu);

Is it okay for single-threaded CPU loop?

>  }
>  
>  static void qemu_kvm_wait_io_event(CPUState *cpu)
(snip)
> @@ -1331,6 +1324,69 @@ static void *qemu_tcg_cpu_thread_fn(void *arg)
>  return NULL;
>  }
>  
> +/* Multi-threaded TCG
> + *
> + * In the multi-threaded case each vCPU has its own thread. The TLS
> + * variable current_cpu can be used deep in the code to find the
> + * current CPUState for a given thread.
> + */
> +
> +static void *qemu_tcg_cpu_thread_fn(void *arg)
> +{
> +CPUState *cpu = arg;
> +
> +rcu_register_thread();
> +
> +qemu_mutex_lock_iothread();
> +qemu_thread_get_self(cpu->thread);
> +
> +cpu->thread_id = qemu_get_thread_id();
> +cpu->created = true;
> +cpu->can_do_io = 1;
> +current_cpu = cpu;
> +qemu_cond_signal(&qemu_cpu_cond);
> +
> +/* process any pending work */
> +atomic_mb_set(&cpu->exit_request, 1);
> +
> +while (1) {
> +bool sleep = false;
> +
> +if (cpu_can_run(cpu)) {
> +int r = tcg_cpu_exec(cpu);
> +switch (r) {
> +case EXCP_DEBUG:
> +cpu_handle_guest_debug(cpu);
> +break;
> +case EXCP_HALTED:
> +/* during start-up the vCPU is reset and the thread is
> + * kicked several times. If we don't ensure we go back
> + * to sleep in the halted state we won't cleanly
> + * start-up when the vCPU is enabled.
> + */
> +sleep = true;
> +break;
> +default:
> +/* Ignore everything else? */
> +break;
> +}
> +} else {
> +sleep = true;
> +}
> +
> +handle_icount_deadline();
> +
> +if (sleep) {
> +qemu_cond_wait(cpu->halt_cond, &qemu_global_mutex);
> +}
> +
> +atomic_mb_set(&cpu->exit_request, 0);
> +qemu_tcg_wait_io_event(cpu);

Do we really want to wait in qemu_tcg_wait_io_event() while
"all_cpu_threads_idle()"?

> +}
> +
> +return NULL;
> +}
> +
>  static void qemu_cpu_kick_thread(CPUState *cpu)
>  {
>  #ifndef _WIN32
> @@ -1355,7 +1411,7 @@ void qemu_cpu_kick(CPUState *cpu)
>  qemu_cond_broadcast(cpu->halt_cond);
>  if (tcg_enabled()) {
>  cpu_exit(cpu);
> -/* Also ensure current RR cpu is kicked */
> +/* NOP unless doing single-thread RR */
>  qemu_cpu_kick_rr_cpu();
>  } else {
>  qemu_cpu_kick_threa

Re: [Qemu-devel] [PATCH v17 0/4][WIP] block/gluster: add support for multiple gluster servers

2016-06-29 Thread Jeff Cody
On Wed, Jun 15, 2016 at 01:55:43PM +0530, Prasanna Kumar Kalever wrote:
> This version of the patches is rebased on the master branch.
> 
> Prasanna Kumar Kalever (4):
>   block/gluster: rename [server, volname, image] -> [host, volume, path]
>   block/gluster: code cleanup
>   block/gluster: using new qapi schema
>   block/gluster: add support for multiple gluster servers
>

I think the main criticism with this series revolves around the interface,
and the overloading of the server hosts fields when using tcp and unix
sockets, etc.  The idea of using flat unions for the API was floated.

Eric, does this criticism still stand, from libvirt's perspective?  Or are
you comfortable enough with the current interface that I can go ahead and
take this series in through my tree?


> v1:
> multiple host addresses but common port number and transport type
> pattern: URI syntax with query (?) delimitor
> syntax:
> file=gluster[+transport-type]://host1:24007/testvol/a.img\
>  ?server=host2&server=host3
> 
> v2:
> multiple host addresses each have their own port number, but all use
>  common transport type
> pattern: URI syntax  with query (?) delimiter
> syntax:
> file=gluster[+transport-type]://[host[:port]]/testvol/a.img\
>  [?server=host1[:port]\
>   &server=host2[:port]]
> 
> v3:
> multiple host addresses each have their own port number and transport type
> pattern: changed to json
> syntax:
> 'json:{"driver":"qcow2","file":{"driver":"gluster","volume":"testvol",
>"path":"/path/a.qcow2","server":
>  [{"host":"1.2.3.4","port":"24007","transport":"tcp"},
>   {"host":"4.5.6.7","port":"24008","transport":"rdma"}] } }'
> 
> v4, v5:
> address comments from "Eric Blake" 
> renamed:
> 'backup-volfile-servers' -> 'volfile-servers'
> 
> v6:
> address comments from Peter Krempa 
> renamed:
>  'volname'->  'volume'
>  'image-path' ->  'path'
>  'server' ->  'host'
> 
> v7:
> fix for v6 (initialize num_servers to 1 and other typos)
> 
> v8:
> split patch set v7 into series of 3 as per Peter Krempa 
> review comments
> 
> v9:
> reorder the series of patches addressing "Eric Blake" 
> review comments
> 
> v10:
> fix mem-leak as per Peter Krempa  review comments
> 
> v11:
> using qapi-types* defined structures as per "Eric Blake" 
> review comments.
> 
> v12:
> fix crash caused in qapi_free_BlockdevOptionsGluster
> 
> v13:
> address comments from "Jeff Cody" 
> 
> v14:
> address comments from "Eric Blake" 
> split patch 3/3 into two
> rename input option and variable from 'servers' to 'server'
> 
> v15:
> patch 1/4 changed the commit message as per Eric's comment
> patch 2/4 are unchanged
> patch 3/4 addressed Jeff's comments
> patch 4/4 concentrates on unix transport related help info,
> rename 'parse_transport_option()' to 'qapi_enum_parse()',
> address memory leaks and other comments given by Jeff and Eric
> 
> v16:
> In patch 4/4 fixed segfault on glfs_init() error case, as per Jeff's comments
> other patches in this series remain unchanged
> 
> v17:
> rebase of v16 on latest master
> 
>  block/gluster.c  | 484 ++-
>  qapi/block-core.json |  64 ++-
>  2 files changed, 419 insertions(+), 129 deletions(-)
> 
> -- 
> 2.5.5
> 



Re: [Qemu-devel] [PATCH] virtio: Fix setting up host notifiers for vhost

2016-06-29 Thread Cornelia Huck
On Wed, 29 Jun 2016 14:23:42 +0200
Marc-André Lureau  wrote:

> Hi
> 
> On Wed, Jun 29, 2016 at 2:17 PM, Cornelia Huck  
> wrote:
> > When setting up host notifiers, virtio_bus_set_host_notifier()
> > simply switches the handler. This will only work, however, if
> > the ioeventfd has already been setup; this is true for dataplane,
> > but not for vhost.
> >
> > Fix this by starting the ioeventfd if that has not happened
> > before.
> >
> > While we're at it, also fixup the unsetting path of
> > set_host_notifier_internal().
> >
> > Fixes: 6798e245a3 ("virtio-bus: common ioeventfd infrastructure")
> > Reported-by: Jason Wang 
> > Reported-by: Marc-André Lureau 
> > Signed-off-by: Cornelia Huck 
> > ---
> >
> > This fixes the vhost regression for me, while dataplane continues
> > to work.
> >
> 
> That doesn't work here,
> QTEST_QEMU_BINARY=x86_64-softmmu/qemu-system-x86_64  tests/vhost-user-test
> 
> hangs in /x86_64/vhost-user/migrate

I can reproduce it, but I have zero ideas on how to proceed.

I can see that one of the qemus sits on event_notifier_test_and_clear
when vhost tries to shut down. (I am thoroughly confused by all of
that qtest setup, so I have no idea which qemu instance this is...)

Looking at the code path, we really should switch the handler around,
but virtio_queue_set_host_notifier_fd_handler always unsets the handler
unless both assign and set_handler are true. Is that really what we
want?

I fear I have stared at this for so long that I have now lost myself
between all these flags, so I hope one of the folks on cc: has a good
idea...




Re: [Qemu-devel] [PATCH v17 0/4][WIP] block/gluster: add support for multiple gluster servers

2016-06-29 Thread Daniel P. Berrange
On Wed, Jun 29, 2016 at 10:11:31AM -0400, Jeff Cody wrote:
> On Wed, Jun 15, 2016 at 01:55:43PM +0530, Prasanna Kumar Kalever wrote:
> > This version of the patches is rebased on the master branch.
> > 
> > Prasanna Kumar Kalever (4):
> >   block/gluster: rename [server, volname, image] -> [host, volume, path]
> >   block/gluster: code cleanup
> >   block/gluster: using new qapi schema
> >   block/gluster: add support for multiple gluster servers
> >
> 
> I think the main criticism with this series revolves around the interface,
> and the overloading of the server hosts fields when using tcp and unix
> sockets, etc.  The idea of using flat unions for the API was floated.
> 
> Eric, does this criticism still stand, from libvirt's perspective?  Or are
> you comfortable enough with the current interface that I can go ahead and
> take this series in through my tree?

Just from a general QAPI design POV I think this overloading is undesirable.

We cared enough about not doing this overloading in the past that we
created SocketAddress which is a union of InetSocketAddress and
UnixSocketAddress. Given this historical best practice, I don't think
we should be overloading "host" for unix socket path.


Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [Qemu-devel] [RFC v3 19/19] cpu-exec: remove tb_lock from the hot-path

2016-06-29 Thread Sergey Fedorov
On 03/06/16 23:40, Alex Bennée wrote:
> Lock contention in the hot path of moving between existing patched
> TranslationBlocks is the main drag on MTTCG performance. This patch
> pushes the tb_lock() usage down to the two places that really need it:
>
>   - code generation (tb_gen_code)
>   - jump patching (tb_add_jump)
>
> The rest of the code doesn't really need to hold a lock as it is either
> using per-CPU structures or designed to be used in concurrent read
> situations (qht_lookup).
>
> Signed-off-by: Alex Bennée 
>
> ---
> v3
>   - fix merge conflicts with Sergey's patch
> ---
>  cpu-exec.c | 59 ++-
>  1 file changed, 30 insertions(+), 29 deletions(-)
>
> diff --git a/cpu-exec.c b/cpu-exec.c
> index b017643..4af0b52 100644
> --- a/cpu-exec.c
> +++ b/cpu-exec.c
> @@ -298,41 +298,38 @@ static TranslationBlock *tb_find_slow(CPUState *cpu,
>   * Pairs with smp_wmb() in tb_phys_invalidate(). */
>  smp_rmb();
>  tb = tb_find_physical(cpu, pc, cs_base, flags);
> -if (tb) {
> -goto found;
> -}
> +if (!tb) {
>  
> -/* mmap_lock is needed by tb_gen_code, and mmap_lock must be
> - * taken outside tb_lock.  Since we're momentarily dropping
> - * tb_lock, there's a chance that our desired tb has been
> - * translated.
> - */
> -tb_unlock();
> -mmap_lock();
> -tb_lock();
> -tb = tb_find_physical(cpu, pc, cs_base, flags);
> -if (tb) {
> -mmap_unlock();
> -goto found;
> -}
> +/* mmap_lock is needed by tb_gen_code, and mmap_lock must be
> + * taken outside tb_lock.
> + */
> +mmap_lock();
> +tb_lock();
>  
> -/* if no translated code available, then translate it now */
> -tb = tb_gen_code(cpu, pc, cs_base, flags, 0);
> +/* There's a chance that our desired tb has been translated while
> + * taking the locks so we check again inside the lock.
> + */
> +tb = tb_find_physical(cpu, pc, cs_base, flags);
> +if (!tb) {
> +/* if no translated code available, then translate it now */
> +tb = tb_gen_code(cpu, pc, cs_base, flags, 0);
> +}
>  
> -mmap_unlock();
> +tb_unlock();
> +mmap_unlock();
> +}
>  
> -found:
> -/* we add the TB in the virtual pc hash table */
> +/* We add the TB in the virtual pc hash table for the fast lookup */
>  cpu->tb_jmp_cache[tb_jmp_cache_hash_func(pc)] = tb;

Hmm, seems like I forgot to convert this into atomic_set() in the
previous patch...

>  return tb;
>  }
>  
>  static inline TranslationBlock *tb_find_fast(CPUState *cpu,
> - TranslationBlock **last_tb,
> + TranslationBlock **ltbp,

I'm not sure if it is more readable...

>   int tb_exit)
>  {
>  CPUArchState *env = (CPUArchState *)cpu->env_ptr;
> -TranslationBlock *tb;
> +TranslationBlock *tb, *last_tb;
>  target_ulong cs_base, pc;
>  uint32_t flags;
>  
> @@ -340,7 +337,6 @@ static inline TranslationBlock *tb_find_fast(CPUState 
> *cpu,
> always be the same before a given translated block
> is executed. */
>  cpu_get_tb_cpu_state(env, &pc, &cs_base, &flags);
> -tb_lock();
>  tb = atomic_read(&cpu->tb_jmp_cache[tb_jmp_cache_hash_func(pc)]);
>  if (unlikely(!tb || tb->pc != pc || tb->cs_base != cs_base ||
>   tb->flags != flags)) {
> @@ -350,7 +346,7 @@ static inline TranslationBlock *tb_find_fast(CPUState 
> *cpu,
>  /* Ensure that no TB jump will be modified as the
>   * translation buffer has been flushed.
>   */
> -*last_tb = NULL;
> +*ltbp = NULL;
>  cpu->tb_flushed = false;
>  }
>  #ifndef CONFIG_USER_ONLY
> @@ -359,14 +355,19 @@ static inline TranslationBlock *tb_find_fast(CPUState 
> *cpu,
>   * spanning two pages because the mapping for the second page can change.
>   */
>  if (tb->page_addr[1] != -1) {
> -*last_tb = NULL;
> +*ltbp = NULL;
>  }
>  #endif
> +
>  /* See if we can patch the calling TB. */
> -if (*last_tb && !qemu_loglevel_mask(CPU_LOG_TB_NOCHAIN)) {
> -tb_add_jump(*last_tb, tb_exit, tb);
> +last_tb = *ltbp;
> +if (!qemu_loglevel_mask(CPU_LOG_TB_NOCHAIN) &&
> +last_tb &&
> +!last_tb->jmp_list_next[tb_exit]) {

If we're going to check this outside of tb_lock we have to do this with
atomic_{read,set}(). However, I think it is rare case to race on
tb_add_jump() so probably it is okay to do the check under tb_lock.

> +tb_lock();
> +tb_add_jump(last_tb, tb_exit, tb);
> +tb_unlock();
>  }
> -tb_unlock();
>  return tb;
>  }
>  

Kind regards,
Sergey
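The pattern this patch settles on — a lock-free lookup on the fast path, then a re-check under tb_lock before generating — is classic double-checked lookup, and can be sketched in a userspace toy. A byte flag with GCC `__atomic` builtins stands in for tb_lock, a small array for the TB cache, and `generate_entry()` for tb_gen_code(); none of this is QEMU's actual implementation:

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_SIZE 64

static void *cache[CACHE_SIZE];
static char tb_lock_flag;               /* toy spinlock standing in for tb_lock */
static int gen_count;                   /* how many times we "translated" */

static void toy_tb_lock(void)
{
    while (__atomic_test_and_set(&tb_lock_flag, __ATOMIC_ACQUIRE)) {
        /* spin */
    }
}

static void toy_tb_unlock(void)
{
    __atomic_clear(&tb_lock_flag, __ATOMIC_RELEASE);
}

static void *generate_entry(unsigned long pc)
{
    gen_count++;                        /* stands in for tb_gen_code() */
    return (void *)(pc | 1);
}

static void *find_slow(unsigned long pc)
{
    /* Fast path: lock-free lookup; the acquire load pairs with the
     * release store below, echoing the atomic_read/atomic_set point
     * raised in the review. */
    void *e = __atomic_load_n(&cache[pc % CACHE_SIZE], __ATOMIC_ACQUIRE);
    if (!e) {
        toy_tb_lock();
        /* Re-check under the lock: another thread may have generated
         * the entry while we were waiting. */
        e = cache[pc % CACHE_SIZE];
        if (!e) {
            e = generate_entry(pc);
            __atomic_store_n(&cache[pc % CACHE_SIZE], e, __ATOMIC_RELEASE);
        }
        toy_tb_unlock();
    }
    return e;
}
```

The re-check is what makes dropping the lock on the fast path safe: at worst two threads race to the slow path, and the loser finds the entry already published and generates nothing.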



[Qemu-devel] [Bug 1588328] Re: Qemu 2.6 Solaris 9 Sparc Segmentation Fault

2016-06-29 Thread Zhen Ning Lim
May 11 2016. qemu-2.6.0 from http://wiki.qemu.org/Download

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1588328

Title:
  Qemu 2.6 Solaris 9 Sparc Segmentation Fault

Status in QEMU:
  New

Bug description:
  Hi,
  I tried the following command to boot Solaris 9 sparc:
  qemu-system-sparc -nographic -boot d -hda ./Spark9.disk -m 256 -cdrom 
sol-9-905hw-ga-sparc-dvd.iso -serial telnet:0.0.0.0:3000,server 

  It seems there are a few Segmentation Faults, one at the start of
  the boot, another at the beginning of the command-line installation.

  Trying 127.0.0.1...
  Connected to localhost.
  Escape character is '^]'.
  Configuration device id QEMU version 1 machine id 32
  Probing SBus slot 0 offset 0
  Probing SBus slot 1 offset 0
  Probing SBus slot 2 offset 0
  Probing SBus slot 3 offset 0
  Probing SBus slot 4 offset 0
  Probing SBus slot 5 offset 0
  Invalid FCode start byte
  CPUs: 1 x FMI,MB86904
  UUID: ----
  Welcome to OpenBIOS v1.1 built on Apr 18 2016 08:19
Type 'help' for detailed information
  Trying cdrom:d...
  Not a bootable ELF image
  Loading a.out image...
  Loaded 7680 bytes
  entry point is 0x4000
  bootpath: 
/iommu@0,1000/sbus@0,10001000/espdma@5,840/esp@5,880/sd@2,0:d

  Jumping to entry point 4000 for type 0005...
  switching to new context:
  SunOS Release 5.9 Version Generic_118558-34 32-bit
  Copyright 1983-2003 Sun Microsystems, Inc.  All rights reserved.
  Use is subject to license terms.
  WARNING: 
/iommu@0,1000/sbus@0,10001000/espdma@5,840/esp@5,880/sd@0,0 (sd0):
Corrupt label; wrong magic number

  Segmentation Fault
  Configuring /dev and /devices
  NOTICE: Couldn't set value (../../sun/io/audio/sada/drv/audiocs/audio_4231.c, 
Line #1759 0x00 0x88)
  audio may not work correctly until it is stopped and restarted
  Segmentation Fault
  Using RPC Bootparams for network configuration information.
  Skipping interface le0
  Searching for configuration file(s)...
  Search complete.

  

  What type of terminal are you using?
   1) ANSI Standard CRT
   2) DEC VT52
   3) DEC VT100
   4) Heathkit 19
   5) Lear Siegler ADM31
   6) PC Console
   7) Sun Command Tool
   8) Sun Workstation
   9) Televideo 910
   10) Televideo 925
   11) Wyse Model 50
   12) X Terminal Emulator (xterms)
   13) CDE Terminal Emulator (dtterm)
   14) Other
  Type the number of your choice and press Return: 3
  syslog service starting.
  savecore: no dump device configured
  Running in command line mode
  /sbin/disk0_install[109]: 143 Segmentation Fault
  /sbin/run_install[130]: 155 Segmentation Fault

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1588328/+subscriptions



Re: [Qemu-devel] [RFC v3 19/19] cpu-exec: remove tb_lock from the hot-path

2016-06-29 Thread Alex Bennée

Sergey Fedorov  writes:

> On 03/06/16 23:40, Alex Bennée wrote:
>> Lock contention in the hot path of moving between existing patched
>> TranslationBlocks is the main drag on MTTCG performance. This patch
>> pushes the tb_lock() usage down to the two places that really need it:
>>
>>   - code generation (tb_gen_code)
>>   - jump patching (tb_add_jump)
>>
>> The rest of the code doesn't really need to hold a lock as it is either
>> using per-CPU structures or designed to be used in concurrent read
>> situations (qht_lookup).
>>
>> Signed-off-by: Alex Bennée 
>>
>> ---
>> v3
>>   - fix merge conflicts with Sergey's patch
>> ---
>>  cpu-exec.c | 59 ++-
>>  1 file changed, 30 insertions(+), 29 deletions(-)
>>
>> diff --git a/cpu-exec.c b/cpu-exec.c
>> index b017643..4af0b52 100644
>> --- a/cpu-exec.c
>> +++ b/cpu-exec.c
>> @@ -298,41 +298,38 @@ static TranslationBlock *tb_find_slow(CPUState *cpu,
>>   * Pairs with smp_wmb() in tb_phys_invalidate(). */
>>  smp_rmb();
>>  tb = tb_find_physical(cpu, pc, cs_base, flags);
>> -if (tb) {
>> -goto found;
>> -}
>> +if (!tb) {
>>
>> -/* mmap_lock is needed by tb_gen_code, and mmap_lock must be
>> - * taken outside tb_lock.  Since we're momentarily dropping
>> - * tb_lock, there's a chance that our desired tb has been
>> - * translated.
>> - */
>> -tb_unlock();
>> -mmap_lock();
>> -tb_lock();
>> -tb = tb_find_physical(cpu, pc, cs_base, flags);
>> -if (tb) {
>> -mmap_unlock();
>> -goto found;
>> -}
>> +/* mmap_lock is needed by tb_gen_code, and mmap_lock must be
>> + * taken outside tb_lock.
>> + */
>> +mmap_lock();
>> +tb_lock();
>>
>> -/* if no translated code available, then translate it now */
>> -tb = tb_gen_code(cpu, pc, cs_base, flags, 0);
>> +/* There's a chance that our desired tb has been translated while
>> + * taking the locks so we check again inside the lock.
>> + */
>> +tb = tb_find_physical(cpu, pc, cs_base, flags);
>> +if (!tb) {
>> +/* if no translated code available, then translate it now */
>> +tb = tb_gen_code(cpu, pc, cs_base, flags, 0);
>> +}
>>
>> -mmap_unlock();
>> +tb_unlock();
>> +mmap_unlock();
>> +}
>>
>> -found:
>> -/* we add the TB in the virtual pc hash table */
>> +/* We add the TB in the virtual pc hash table for the fast lookup */
>>  cpu->tb_jmp_cache[tb_jmp_cache_hash_func(pc)] = tb;
>
> Hmm, seems like I forgot to convert this into atomic_set() in the
> previous patch...

OK, can you fix that in your quick fixes series?

>
>>  return tb;
>>  }
>>
>>  static inline TranslationBlock *tb_find_fast(CPUState *cpu,
>> - TranslationBlock **last_tb,
>> + TranslationBlock **ltbp,
>
> I'm not sure if it is more readable...

I'll revert. I was trying to keep line lengths short :-/

>
>>   int tb_exit)
>>  {
>>  CPUArchState *env = (CPUArchState *)cpu->env_ptr;
>> -TranslationBlock *tb;
>> +TranslationBlock *tb, *last_tb;
>>  target_ulong cs_base, pc;
>>  uint32_t flags;
>>
>> @@ -340,7 +337,6 @@ static inline TranslationBlock *tb_find_fast(CPUState 
>> *cpu,
>> always be the same before a given translated block
>> is executed. */
>>  cpu_get_tb_cpu_state(env, &pc, &cs_base, &flags);
>> -tb_lock();
>>  tb = atomic_read(&cpu->tb_jmp_cache[tb_jmp_cache_hash_func(pc)]);
>>  if (unlikely(!tb || tb->pc != pc || tb->cs_base != cs_base ||
>>   tb->flags != flags)) {
>> @@ -350,7 +346,7 @@ static inline TranslationBlock *tb_find_fast(CPUState 
>> *cpu,
>>  /* Ensure that no TB jump will be modified as the
>>   * translation buffer has been flushed.
>>   */
>> -*last_tb = NULL;
>> +*ltbp = NULL;
>>  cpu->tb_flushed = false;
>>  }
>>  #ifndef CONFIG_USER_ONLY
>> @@ -359,14 +355,19 @@ static inline TranslationBlock *tb_find_fast(CPUState 
>> *cpu,
>>   * spanning two pages because the mapping for the second page can 
>> change.
>>   */
>>  if (tb->page_addr[1] != -1) {
>> -*last_tb = NULL;
>> +*ltbp = NULL;
>>  }
>>  #endif
>> +
>>  /* See if we can patch the calling TB. */
>> -if (*last_tb && !qemu_loglevel_mask(CPU_LOG_TB_NOCHAIN)) {
>> -tb_add_jump(*last_tb, tb_exit, tb);
>> +last_tb = *ltbp;
>> +if (!qemu_loglevel_mask(CPU_LOG_TB_NOCHAIN) &&
>> +last_tb &&
>> +!last_tb->jmp_list_next[tb_exit]) {
>
> If we're going to check this outside of tb_lock we have to do this with
> atomic_{read,set}(). However, I think it is rare case to race on
> tb_add_jump() so probably it is okay to do the check under tb_lock.

It's checking for NULL, it

Re: [Qemu-devel] [PATCH] virtio: Fix setting up host notifiers for vhost

2016-06-29 Thread Paolo Bonzini


On 29/06/2016 16:15, Cornelia Huck wrote:
> 
> I can see that one of the qemus sits on event_notifier_test_and_clear
> when vhost tries to shut down. (I am thoroughly confused by all of
> that qtest setup, so I have no idea which qemu instance this is...)

Stupid question ahead---if you mean QEMU is sitting in a blocking read,
isn't event_notifier_test_and_clear supposed to be non-blocking?

Paolo



Re: [Qemu-devel] [RFC v3 19/19] cpu-exec: remove tb_lock from the hot-path

2016-06-29 Thread Sergey Fedorov
On 29/06/16 17:47, Alex Bennée wrote:
> Sergey Fedorov  writes:
>
>> On 03/06/16 23:40, Alex Bennée wrote:
>>> Lock contention in the hot path of moving between existing patched
>>> TranslationBlocks is the main drag on MTTCG performance. This patch
>>> pushes the tb_lock() usage down to the two places that really need it:
>>>
>>>   - code generation (tb_gen_code)
>>>   - jump patching (tb_add_jump)
>>>
>>> The rest of the code doesn't really need to hold a lock as it is either
>>> using per-CPU structures or designed to be used in concurrent read
>>> situations (qht_lookup).
>>>
>>> Signed-off-by: Alex Bennée 
>>>
>>> ---
>>> v3
>>>   - fix merge conflicts with Sergey's patch
>>> ---
>>>  cpu-exec.c | 59 ++-
>>>  1 file changed, 30 insertions(+), 29 deletions(-)
>>>
>>> diff --git a/cpu-exec.c b/cpu-exec.c
>>> index b017643..4af0b52 100644
>>> --- a/cpu-exec.c
>>> +++ b/cpu-exec.c
>>> @@ -298,41 +298,38 @@ static TranslationBlock *tb_find_slow(CPUState *cpu,
>>>   * Pairs with smp_wmb() in tb_phys_invalidate(). */
>>>  smp_rmb();
>>>  tb = tb_find_physical(cpu, pc, cs_base, flags);
>>> -if (tb) {
>>> -goto found;
>>> -}
>>> +if (!tb) {
>>>
>>> -/* mmap_lock is needed by tb_gen_code, and mmap_lock must be
>>> - * taken outside tb_lock.  Since we're momentarily dropping
>>> - * tb_lock, there's a chance that our desired tb has been
>>> - * translated.
>>> - */
>>> -tb_unlock();
>>> -mmap_lock();
>>> -tb_lock();
>>> -tb = tb_find_physical(cpu, pc, cs_base, flags);
>>> -if (tb) {
>>> -mmap_unlock();
>>> -goto found;
>>> -}
>>> +/* mmap_lock is needed by tb_gen_code, and mmap_lock must be
>>> + * taken outside tb_lock.
>>> + */
>>> +mmap_lock();
>>> +tb_lock();
>>>
>>> -/* if no translated code available, then translate it now */
>>> -tb = tb_gen_code(cpu, pc, cs_base, flags, 0);
>>> +/* There's a chance that our desired tb has been translated while
>>> + * taking the locks so we check again inside the lock.
>>> + */
>>> +tb = tb_find_physical(cpu, pc, cs_base, flags);
>>> +if (!tb) {
>>> +/* if no translated code available, then translate it now */
>>> +tb = tb_gen_code(cpu, pc, cs_base, flags, 0);
>>> +}
>>>
>>> -mmap_unlock();
>>> +tb_unlock();
>>> +mmap_unlock();
>>> +}
>>>
>>> -found:
>>> -/* we add the TB in the virtual pc hash table */
>>> +/* We add the TB in the virtual pc hash table for the fast lookup */
>>>  cpu->tb_jmp_cache[tb_jmp_cache_hash_func(pc)] = tb;
>> Hmm, seems like I forgot to convert this into atomic_set() in the
>> previous patch...
> OK, can you fix that in your quick fixes series?

Sure. I think that patch and this are both ready-to-go into mainline.

>
>>>  return tb;
>>>  }
>>>
>>>  static inline TranslationBlock *tb_find_fast(CPUState *cpu,
>>> - TranslationBlock **last_tb,
>>> + TranslationBlock **ltbp,
>> I'm not sure if it is more readable...
> I'll revert. I was trying to keep line lengths short :-/
>
>>>   int tb_exit)
>>>  {
>>>  CPUArchState *env = (CPUArchState *)cpu->env_ptr;
>>> -TranslationBlock *tb;
>>> +TranslationBlock *tb, *last_tb;
>>>  target_ulong cs_base, pc;
>>>  uint32_t flags;
>>>
>>> @@ -340,7 +337,6 @@ static inline TranslationBlock *tb_find_fast(CPUState 
>>> *cpu,
>>> always be the same before a given translated block
>>> is executed. */
>>>  cpu_get_tb_cpu_state(env, &pc, &cs_base, &flags);
>>> -tb_lock();
>>>  tb = atomic_read(&cpu->tb_jmp_cache[tb_jmp_cache_hash_func(pc)]);
>>>  if (unlikely(!tb || tb->pc != pc || tb->cs_base != cs_base ||
>>>   tb->flags != flags)) {
>>> @@ -350,7 +346,7 @@ static inline TranslationBlock *tb_find_fast(CPUState 
>>> *cpu,
>>>  /* Ensure that no TB jump will be modified as the
>>>   * translation buffer has been flushed.
>>>   */
>>> -*last_tb = NULL;
>>> +*ltbp = NULL;
>>>  cpu->tb_flushed = false;
>>>  }
>>>  #ifndef CONFIG_USER_ONLY
>>> @@ -359,14 +355,19 @@ static inline TranslationBlock *tb_find_fast(CPUState 
>>> *cpu,
>>>   * spanning two pages because the mapping for the second page can 
>>> change.
>>>   */
>>>  if (tb->page_addr[1] != -1) {
>>> -*last_tb = NULL;
>>> +*ltbp = NULL;
>>>  }
>>>  #endif
>>> +
>>>  /* See if we can patch the calling TB. */
>>> -if (*last_tb && !qemu_loglevel_mask(CPU_LOG_TB_NOCHAIN)) {
>>> -tb_add_jump(*last_tb, tb_exit, tb);
>>> +last_tb = *ltbp;
>>> +if (!qemu_loglevel_mask(CPU_LOG_TB_NOCHAIN) &&
>>> +last_tb &&
>>> +!last_tb->jmp_list_next[tb_exit]) {
>> If

Re: [Qemu-devel] [RFC 4/8] linux-user: Rework exclusive operation mechanism

2016-06-29 Thread Sergey Fedorov
On 27/06/16 12:02, Alex Bennée wrote:
> Sergey Fedorov  writes:
>
>> From: Sergey Fedorov 
>>
(snip)
>> diff --git a/linux-user/main.c b/linux-user/main.c
>> index b9a4e0ea45ac..485336f78b8f 100644
>> --- a/linux-user/main.c
>> +++ b/linux-user/main.c
>> @@ -111,7 +111,8 @@ static pthread_mutex_t cpu_list_mutex = 
>> PTHREAD_MUTEX_INITIALIZER;
>>  static pthread_mutex_t exclusive_lock = PTHREAD_MUTEX_INITIALIZER;
>>  static pthread_cond_t exclusive_cond = PTHREAD_COND_INITIALIZER;
>>  static pthread_cond_t exclusive_resume = PTHREAD_COND_INITIALIZER;
>> -static int pending_cpus;
>> +static bool exclusive_pending;
>> +static int tcg_pending_cpus;
> I'm not sure you need to re-name to tcg_pending_cpus as TCG is implied
> for linux-user. Also they are not really CPUs (although we are using the
> CPU structure for each running thread). I'm not sure if there is a
> neater way to make the distinction clear.

How about 'tcg_pending_threads'? It is going to be used in system-mode
soon, so I'd like to keep "tcg_" prefix.

>
>>  /* Make sure everything is in a consistent state for calling fork().  */
>>  void fork_start(void)
>> @@ -133,7 +134,8 @@ void fork_end(int child)
>>  QTAILQ_REMOVE(&cpus, cpu, node);
>>  }
>>  }
>> -pending_cpus = 0;
>> +tcg_pending_cpus = 0;
>> +exclusive_pending = false;
>>  pthread_mutex_init(&exclusive_lock, NULL);
>>  pthread_mutex_init(&cpu_list_mutex, NULL);
>>  pthread_cond_init(&exclusive_cond, NULL);
>> @@ -150,7 +152,7 @@ void fork_end(int child)
>> must be held.  */
>>  static inline void exclusive_idle(void)
>>  {
>> -while (pending_cpus) {
>> +while (exclusive_pending) {
>>  pthread_cond_wait(&exclusive_resume, &exclusive_lock);
>>  }
>>  }
>> @@ -164,15 +166,14 @@ static inline void start_exclusive(void)
>>  pthread_mutex_lock(&exclusive_lock);
>>  exclusive_idle();
>>
>> -pending_cpus = 1;
>> +exclusive_pending = true;
>>  /* Make all other cpus stop executing.  */
>>  CPU_FOREACH(other_cpu) {
>>  if (other_cpu->running) {
>> -pending_cpus++;
>>  cpu_exit(other_cpu);
>>  }
>>  }
>> -if (pending_cpus > 1) {
>> +while (tcg_pending_cpus) {
>>  pthread_cond_wait(&exclusive_cond, &exclusive_lock);
>>  }
>>  }
>> @@ -180,7 +181,7 @@ static inline void start_exclusive(void)
>>  /* Finish an exclusive operation.  */
>>  static inline void __attribute__((unused)) end_exclusive(void)
>>  {
>> -pending_cpus = 0;
>> +exclusive_pending = false;
>>  pthread_cond_broadcast(&exclusive_resume);
>>  pthread_mutex_unlock(&exclusive_lock);
>>  }
>> @@ -191,6 +192,7 @@ static inline void cpu_exec_start(CPUState *cpu)
>>  pthread_mutex_lock(&exclusive_lock);
>>  exclusive_idle();
>>  cpu->running = true;
>> +tcg_pending_cpus++;
> These aren't TLS variables so shouldn't we be ensuring all access is atomic?

It is protected by 'exclusive_lock'.

>
>>  pthread_mutex_unlock(&exclusive_lock);
>>  }
>>
>> @@ -199,11 +201,9 @@ static inline void cpu_exec_end(CPUState *cpu)
>>  {
>>  pthread_mutex_lock(&exclusive_lock);
>>  cpu->running = false;
>> -if (pending_cpus > 1) {
>> -pending_cpus--;
>> -if (pending_cpus == 1) {
>> -pthread_cond_signal(&exclusive_cond);
>> -}
>> +tcg_pending_cpus--;
>> +if (!tcg_pending_cpus) {
>> +pthread_cond_broadcast(&exclusive_cond);
>>  }
> Couldn't two threads race to -1 here?

See comment above.

Kind regards,
Sergey

>
>>  exclusive_idle();
>>  pthread_mutex_unlock(&exclusive_lock);
>
> --
> Alex Bennée




Re: [Qemu-devel] [RFC 6/8] linux-user: Support CPU work queue

2016-06-29 Thread Sergey Fedorov
On 27/06/16 12:31, Alex Bennée wrote:
> Sergey Fedorov  writes:
>
>> diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
>> index c1f59fa59d2c..23b4b50e0a45 100644
>> --- a/include/exec/exec-all.h
>> +++ b/include/exec/exec-all.h
>> @@ -407,4 +407,8 @@ extern int singlestep;
>>  extern CPUState *tcg_current_cpu;
>>  extern bool exit_request;
>>
>> +void wait_cpu_work(void);
>> +void signal_cpu_work(void);
>> +void flush_queued_work(CPUState *cpu);
>> +
> Now these are public APIs (and have multiple implementations) some doc
> comments would be useful here.


Sure, I'll do this as soon as I'm sure that this is a right approach.

Thanks,
Sergey



Re: [Qemu-devel] [RFC 7/8] cpu-exec-common: Introduce async_safe_run_on_cpu()

2016-06-29 Thread Sergey Fedorov
On 27/06/16 12:36, Alex Bennée wrote:
> Sergey Fedorov  writes:
>
>> From: Sergey Fedorov 
>>
(snip)
>> diff --git a/cpus.c b/cpus.c
>> index 98f60f6f98f5..bb6bd8615cfc 100644
>> --- a/cpus.c
>> +++ b/cpus.c
>> @@ -932,6 +932,18 @@ static void qemu_tcg_destroy_vcpu(CPUState *cpu)
>>  {
>>  }
>>
>> +static void tcg_cpu_exec_start(CPUState *cpu)
>> +{
>> +tcg_pending_cpus++;
>> +}
>> +
>> +static void tcg_cpu_exec_end(CPUState *cpu)
>> +{
>> +if (--tcg_pending_cpus) {
>> +signal_cpu_work();
>> +}
>> +}
> Don't these need to be atomic?

'tcg_pending_cpus' is protected by BQL.

>
>> +
>>  static void qemu_wait_io_event_common(CPUState *cpu)
>>  {
>>  if (cpu->stop) {
>>
(snip)

Thanks,
Sergey



Re: [Qemu-devel] [RFC 8/8] tcg: Make tb_flush() thread safe

2016-06-29 Thread Sergey Fedorov
On 28/06/16 19:18, Alex Bennée wrote:
> Sergey Fedorov  writes:
>
>> From: Sergey Fedorov 
>>
>> Use async_safe_run_on_cpu() to make tb_flush() thread safe.
>>
>> Signed-off-by: Sergey Fedorov 
>> Signed-off-by: Sergey Fedorov 
>> ---
>>  translate-all.c | 12 
>>  1 file changed, 8 insertions(+), 4 deletions(-)
>>
>> diff --git a/translate-all.c b/translate-all.c
>> index 3f402dfe04f5..09b1d0b0efc3 100644
>> --- a/translate-all.c
>> +++ b/translate-all.c
>> @@ -832,7 +832,7 @@ static void page_flush_tb(void)
>>
>>  /* flush all the translation blocks */
>>  /* XXX: tb_flush is currently not thread safe */
>^^^
>
> The comment belies a lack of confidence ;-)

Nice catch!

Thanks,
Sergey

>
>> -void tb_flush(CPUState *cpu)
>> +static void do_tb_flush(CPUState *cpu, void *data)
>>  {
>>  #if defined(DEBUG_FLUSH)
>>  printf("qemu: flush code_size=%ld nb_tbs=%d avg_tb_size=%ld\n",
>> @@ -861,6 +861,11 @@ void tb_flush(CPUState *cpu)
>>  tcg_ctx.tb_ctx.tb_flush_count++;
>>  }
>>
>> +void tb_flush(CPUState *cpu)
>> +{
>> +async_safe_run_on_cpu(cpu, do_tb_flush, NULL);
>> +}
>> +
>>  #ifdef DEBUG_TB_CHECK
>>
>>  static void
>> @@ -1163,9 +1168,8 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
>>   buffer_overflow:
>>  /* flush must be done */
>>  tb_flush(cpu);
>> -/* cannot fail at this point */
>> -tb = tb_alloc(pc);
>> -assert(tb != NULL);
>> +mmap_unlock();
>> +cpu_loop_exit(cpu);
>>  }
>>
>>  gen_code_buf = tcg_ctx.code_gen_ptr;
>
> --
> Alex Bennée




Re: [Qemu-devel] [PULL 0/8] Tracing patches

2016-06-29 Thread Peter Maydell
On 28 June 2016 at 22:27, Stefan Hajnoczi  wrote:
> The following changes since commit d7f30403576f04f1f3a5fb5a1d18cba8dfa7a6d2:
>
>   cputlb: don't cpu_abort() if guest tries to execute outside RAM or RAM 
> (2016-06-28 18:50:53 +0100)
>
> are available in the git repository at:
>
>   git://github.com/stefanha/qemu.git tags/tracing-pull-request
>
> for you to fetch changes up to 9c15e70086f3343bd810c6150d92ebfd6f346fcf:
>
>   trace: [*-user] Add events to trace guest syscalls in syscall emulation 
> mode (2016-06-28 21:14:12 +0100)
>
> 
>
> 
>
> Denis V. Lunev (7):
>   doc: sync help description for --trace with man for qemu.1
>   doc: move text describing --trace to specific .texi file
>   trace: move qemu_trace_opts to trace/control.c
>   trace: enable tracing in qemu-io
>   trace: enable tracing in qemu-nbd
>   qemu-img: move common options parsing before commands processing
>   trace: enable tracing in qemu-img
>
> Lluís Vilanova (1):
>   trace: [*-user] Add events to trace guest syscalls in syscall
> emulation mode
>

Applied, thanks.

-- PMM



Re: [Qemu-devel] [PATCH] virtio: Fix setting up host notifiers for vhost

2016-06-29 Thread Cornelia Huck
On Wed, 29 Jun 2016 16:52:40 +0200
Paolo Bonzini  wrote:

> On 29/06/2016 16:15, Cornelia Huck wrote:
> > 
> > I can see that one of the qemus sits on event_notifier_test_and_clear
> > when vhost tries to shut down. (I am thoroughly confused by all of
> > that qtest setup, so I have no idea which qemu instance this is...)
> 
> Stupid question ahead---if you mean QEMU is sitting in a blocking read,
> isn't event_notifier_test_and_clear supposed to be non-blocking?

Yes, and this is what completely threw me off. I may have been very
confused at that point of time, though; I can recheck tomorrow.




Re: [Qemu-devel] [PULL v2 00/24] linux-user changes for v2.7

2016-06-29 Thread Peter Maydell
On 28 June 2016 at 20:12,   wrote:
> From: Riku Voipio 
>
> The following changes since commit c7288767523f6510cf557707d3eb5e78e519b90d:
>
>   Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.7-20160623' 
> into staging (2016-06-23 11:53:14 +0100)
>
> are available in the git repository at:
>
>   git://git.linaro.org/people/riku.voipio/qemu.git 
> tags/pull-linux-user-20160628
>
> for you to fetch changes up to 4ba92cd736a9ce0dc83c9b16a75d24d385e1cdf3:
>
>   linux-user: Provide safe_syscall for ppc64 (2016-06-26 13:17:22 +0300)
>
> 
> Drop building linux-user targets on HPPA or m68k host systems
> and add safe_syscall support for i386, aarch64, arm, ppc64 and
> s390x.

Applied, thanks.

-- PMM



Re: [Qemu-devel] [PATCH v2] target-ppc: Eliminate redundant and incorrect function booke206_page_size_to_tlb

2016-06-29 Thread alarson
David Gibson  wrote on 06/28/2016 08:42:01 PM:

> On Tue, Jun 28, 2016 at 06:50:05AM -0700, Aaron Larson wrote:
> > 
> > Eliminate redundant and incorrect booke206_page_size_to_tlb function
> > from ppce500_spin.c in preference to previously existing but newly
> > exported definition from e500.c
> > ...
> > Signed-off-by: Aaron Larson 
> 
> Applied to pppc-for-2.7, thanks.

I had previously created a bug for this at: 
https://bugs.launchpad.net/qemu/+bug/1587535

Do you want me to do anything with that bug?


