date:20160505


forget to cc some maintainers

On 05/06/2016 12:20 PM, Cao jin wrote:

All the other devices` .realize function name are xxx_realize, except this one.

cc: Paolo Bonzini 
Reviewed-by: Marcel Apfelbaum 
Signed-off-by: Cao jin 
---
  hw/scsi/mptsas.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/scsi/mptsas.c b/hw/scsi/mptsas.c
index 499c146..1c18c84 100644
--- a/hw/scsi/mptsas.c
+++ b/hw/scsi/mptsas.c
@@ -1270,7 +1270,7 @@ static const struct SCSIBusInfo mptsas_scsi_info = {
  .load_request = mptsas_load_request,
  };

-static void mptsas_scsi_init(PCIDevice *dev, Error **errp)
+static void mptsas_scsi_realize(PCIDevice *dev, Error **errp)
  {
  DeviceState *d = DEVICE(dev);
  MPTSASState *s = MPT_SAS(dev);
@@ -1413,7 +1413,7 @@ static void mptsas1068_class_init(ObjectClass *oc, void 
*data)
  DeviceClass *dc = DEVICE_CLASS(oc);
  PCIDeviceClass *pc = PCI_DEVICE_CLASS(oc);

-pc->realize = mptsas_scsi_init;
+pc->realize = mptsas_scsi_realize;
  pc->exit = mptsas_scsi_uninit;
  pc->romfile = 0;
  pc->vendor_id = PCI_VENDOR_ID_LSI_LOGIC;



--
Yours Sincerely,

Cao jin

Re: [Qemu-devel] [PATCH v5 02/11] change pvscsi_init_msi() type to void


forget to cc some maintainers

On 05/06/2016 12:20 PM, Cao jin wrote:

Nobody use its return value, so change the type to void.

cc: Paolo Bonzini 
Acked-by: Dmitry Fleytman 
Reviewed-by: Marcel Apfelbaum 
Signed-off-by: Cao jin 
---
  hw/scsi/vmw_pvscsi.c | 4 +---
  1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/hw/scsi/vmw_pvscsi.c b/hw/scsi/vmw_pvscsi.c
index 9abc086..4ce3581 100644
--- a/hw/scsi/vmw_pvscsi.c
+++ b/hw/scsi/vmw_pvscsi.c
@@ -1039,7 +1039,7 @@ pvscsi_io_read(void *opaque, hwaddr addr, unsigned size)
  }


-static bool
+static void
  pvscsi_init_msi(PVSCSIState *s)
  {
  int res;
@@ -1053,8 +1053,6 @@ pvscsi_init_msi(PVSCSIState *s)
  } else {
  s->msi_used = true;
  }
-
-return s->msi_used;
  }

  static void



--
Yours Sincerely,

Cao jin

Re: [Qemu-devel] [RFC PATCH V3 1/4] colo-compare: introduce colo compare initlization

2016-05-05 Thread Zhang Chen




On 04/29/2016 10:03 AM, Jason Wang wrote:


On 04/28/2016 05:04 PM, Zhang Chen wrote:


On 04/28/2016 04:17 PM, Jason Wang wrote:

On 04/28/2016 03:55 PM, Zhang Chen wrote:

On 04/28/2016 03:16 PM, Jason Wang wrote:

On 04/28/2016 02:53 PM, Jason Wang wrote:

+static void compare_set_outdev(Object *obj, const char *value,
Error **errp)

+{
+CompareState *s = COLO_COMPARE(obj);
+
+g_free(s->outdev);
+s->outdev = g_strdup(value);
+}
+
+/*
+ * called from the main thread on the primary
+ * to setup colo-compare.
+ */
+static void colo_compare_complete(UserCreatable *uc, Error **errp)
+{
+CompareState *s = COLO_COMPARE(uc);
+
+if (!s->pri_indev || !s->sec_indev || !s->outdev) {
+error_setg(errp, "colo compare needs 'primary_in' ,"
+   "'secondary_in','outdev' property set");
+return;
+} else if (!strcmp(s->pri_indev, s->outdev) ||
+   !strcmp(s->sec_indev, s->outdev) ||
+   !strcmp(s->pri_indev, s->sec_indev)) {
+error_setg(errp, "'indev' and 'outdev' could not be same "
+   "for compare module");
+return;
+}
+
+s->chr_pri_in = qemu_chr_find(s->pri_indev);
+if (s->chr_pri_in == NULL) {
+error_setg(errp, "Primary IN Device '%s' not found",
+   s->pri_indev);
+return;
+}
+
+s->chr_sec_in = qemu_chr_find(s->sec_indev);
+if (s->chr_sec_in == NULL) {
+error_setg(errp, "Secondary IN Device '%s' not found",
+   s->sec_indev);
+return;
+}
+
+s->chr_out = qemu_chr_find(s->outdev);
+if (s->chr_out == NULL) {
+error_setg(errp, "OUT Device '%s' not found", s->outdev);
+return;
+}
+
+qemu_chr_fe_claim_no_fail(s->chr_pri_in);
+qemu_chr_add_handlers(s->chr_pri_in, compare_chr_can_read,
+  compare_pri_chr_in, NULL, s);
+
+qemu_chr_fe_claim_no_fail(s->chr_sec_in);
+qemu_chr_add_handlers(s->chr_sec_in, compare_chr_can_read,
+  compare_sec_chr_in, NULL, s);
+

Btw, what's the reason of handling this in main loop? I thought it
would
be better to do this in colo thread? Otherwise, you need lots of extra
synchronizations?

Do you mean we should start/stop/do checkpoint it by colo-frame?

I mean we probably want to handle pri_in and sec_in in colo compare
thread. Through this way, there's no need for extra synchronization with
main loop.

I get your point, but how to do this.
Now, we use qemu_chr_add_handlers to do this job.

You probably want to start a new main loop in colo comparing thread.



IIUC, do you mean
- remove char device read_handler

 ↓at colo comparing thread↓
while (true) {
- blocking read packet from char device with select(2)/poll(2)...
- compare packet
}

This solution will lead comparing packet and reading packet in serial.
But i don't know if this will have a good performance.



Thanks
zhangchen



.



--
Thanks
zhangchen

Re: [Qemu-devel] [PATCH v5 03/11] megasas: Fix


sorry, forget to cc some maintainers

On 05/06/2016 12:20 PM, Cao jin wrote:

msi_init returns non-zero value on both failure and success.

cc: Hannes Reinecke 
cc: Paolo Bonzini 
Signed-off-by: Cao jin 
---
  hw/scsi/megasas.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/scsi/megasas.c b/hw/scsi/megasas.c
index a63a581..56fb645 100644
--- a/hw/scsi/megasas.c
+++ b/hw/scsi/megasas.c
@@ -2348,7 +2348,7 @@ static void megasas_scsi_realize(PCIDevice *dev, Error 
**errp)
"megasas-queue", 0x4);

  if (megasas_use_msi(s) &&
-msi_init(dev, 0x50, 1, true, false)) {
+msi_init(dev, 0x50, 1, true, false) < 0) {
  s->flags &= ~MEGASAS_MASK_USE_MSI;
  }
  if (megasas_use_msix(s) &&



--
Yours Sincerely,

Cao jin

Re: [Qemu-devel] [PATCH] Allow users to specify the vmdk virtual hardware version.

2016-05-05 Thread Janne Karhunen

Fam,

Any objections to this one?


--
Janne

On Tue, May 3, 2016 at 12:43 PM, Janne Karhunen
 wrote:
> From: Janne Karhunen 
>
> Vmdk images have metadata to indicate the vmware virtual
> hardware version image was created/tested to run with.
> Allow users to specify that version via new 'hwversion'
> option.
>
> Signed-off-by: Janne Karhunen 
> ---
>  block/vmdk.c  | 27 +++
>  include/block/block_int.h |  2 +-
>  qemu-doc.texi |  3 +++
>  3 files changed, 27 insertions(+), 5 deletions(-)
>
> diff --git a/block/vmdk.c b/block/vmdk.c
> index 45f9d3c..955c6b3 100644
> --- a/block/vmdk.c
> +++ b/block/vmdk.c
> @@ -1829,8 +1829,8 @@ static int vmdk_create(const char *filename, QemuOpts 
> *opts, Error **errp)
>  int64_t total_size = 0, filesize;
>  char *adapter_type = NULL;
>  char *backing_file = NULL;
> +char *hw_version = NULL;
>  char *fmt = NULL;
> -int flags = 0;
>  int ret = 0;
>  bool flat, split, compress;
>  GString *ext_desc_lines;
> @@ -1861,7 +1861,7 @@ static int vmdk_create(const char *filename, QemuOpts 
> *opts, Error **errp)
>  "# The Disk Data Base\n"
>  "#DDB\n"
>  "\n"
> -"ddb.virtualHWVersion = \"%d\"\n"
> +"ddb.virtualHWVersion = \"%s\"\n"
>  "ddb.geometry.cylinders = \"%" PRId64 "\"\n"
>  "ddb.geometry.heads = \"%" PRIu32 "\"\n"
>  "ddb.geometry.sectors = \"63\"\n"
> @@ -1878,8 +1878,20 @@ static int vmdk_create(const char *filename, QemuOpts 
> *opts, Error **errp)
>BDRV_SECTOR_SIZE);
>  adapter_type = qemu_opt_get_del(opts, BLOCK_OPT_ADAPTER_TYPE);
>  backing_file = qemu_opt_get_del(opts, BLOCK_OPT_BACKING_FILE);
> +hw_version = qemu_opt_get_del(opts, BLOCK_OPT_HWVERSION);
>  if (qemu_opt_get_bool_del(opts, BLOCK_OPT_COMPAT6, false)) {
> -flags |= BLOCK_FLAG_COMPAT6;
> +if (strcmp(hw_version, "undefined")) {
> +error_setg(errp,
> +   "compat6 cannot be enabled with hwversion set");
> +ret = -EINVAL;
> +goto exit;
> +}
> +g_free(hw_version);
> +hw_version = g_strdup("6");
> +}
> +if (strcmp(hw_version, "undefined") == 0) {
> +g_free(hw_version);
> +hw_version = g_strdup("4");
>  }
>  fmt = qemu_opt_get_del(opts, BLOCK_OPT_SUBFMT);
>  if (qemu_opt_get_bool_del(opts, BLOCK_OPT_ZEROED_GRAIN, false)) {
> @@ -2001,7 +2013,7 @@ static int vmdk_create(const char *filename, QemuOpts 
> *opts, Error **errp)
> fmt,
> parent_desc_line,
> ext_desc_lines->str,
> -   (flags & BLOCK_FLAG_COMPAT6 ? 6 : 4),
> +   hw_version,
> total_size /
> (int64_t)(63 * number_heads * 
> BDRV_SECTOR_SIZE),
> number_heads,
> @@ -2047,6 +2059,7 @@ exit:
>  }
>  g_free(adapter_type);
>  g_free(backing_file);
> +g_free(hw_version);
>  g_free(fmt);
>  g_free(desc);
>  g_free(path);
> @@ -2298,6 +2311,12 @@ static QemuOptsList vmdk_create_opts = {
>  .def_value_str = "off"
>  },
>  {
> +.name = BLOCK_OPT_HWVERSION,
> +.type = QEMU_OPT_STRING,
> +.help = "VMDK hardware version",
> +.def_value_str = "undefined"
> +},
> +{
>  .name = BLOCK_OPT_SUBFMT,
>  .type = QEMU_OPT_STRING,
>  .help =
> diff --git a/include/block/block_int.h b/include/block/block_int.h
> index 10d8759..931a412 100644
> --- a/include/block/block_int.h
> +++ b/include/block/block_int.h
> @@ -38,12 +38,12 @@
>  #include "qemu/throttle.h"
>
>  #define BLOCK_FLAG_ENCRYPT  1
> -#define BLOCK_FLAG_COMPAT6  4
>  #define BLOCK_FLAG_LAZY_REFCOUNTS   8
>
>  #define BLOCK_OPT_SIZE  "size"
>  #define BLOCK_OPT_ENCRYPT   "encryption"
>  #define BLOCK_OPT_COMPAT6   "compat6"
> +#define BLOCK_OPT_HWVERSION "hwversion"
>  #define BLOCK_OPT_BACKING_FILE  "backing_file"
>  #define BLOCK_OPT_BACKING_FMT   "backing_fmt"
>  #define BLOCK_OPT_CLUSTER_SIZE  "cluster_size"
> diff --git a/qemu-doc.texi b/qemu-doc.texi
> index 79141d3..f37fd31 100644
> --- a/qemu-doc.texi
> +++ b/qemu-doc.texi
> @@ -693,6 +693,9 @@ Supported options:
>  File name of a base image (see @option{create} subcommand).
>  @item compat6
>  Create a VMDK version 6 image (instead of version 4)
> +@item hwversion
> +Specify vmdk virtual hardware version. Compat6 flag cannot be enabled
> +if hwversion is specified.
>  @item subformat
>  Specifies which VMDK subformat to use. Valid options are
>  @code{monolithicSparse} (default),
> --
> 1.9.1
>

Re: [Qemu-devel] [PATCH] rbd:change error_setg() to error_setg_errno()

2016-05-05 Thread Vikhyat Umrao

On Fri, May 6, 2016 at 6:14 AM, Fam Zheng  wrote:

> On Thu, 05/05 15:06, Jeff Cody wrote:
> > On Thu, May 05, 2016 at 06:45:15PM +0100, Stefan Hajnoczi wrote:
> > > On Mon, May 02, 2016 at 09:55:17PM +0530, Vikhyat Umrao wrote:
> > > > From 1c63c246f47a1a65d8740d7ce3725fe3820c0a37 Mon Sep 17 00:00:00
> 2001
> > > > From: Vikhyat Umrao 
> > > > Date: Mon, 2 May 2016 21:47:31 +0530
> > > > Subject: [PATCH] rbd:change error_setg() to error_setg_errno()
> > > >
> > > > Ceph RBD block driver does not use error_setg_errno() where
> > > > it is possible to use. This patch replaces error_setg()
> > > > from error_setg_errno().
> > > >
> > > > Signed-off-by: Vikhyat Umrao 
> > > > ---
> > > >  block/rbd.c | 37 ++---
> > > >  1 file changed, 22 insertions(+), 15 deletions(-)
> > > >
> > > > diff --git a/block/rbd.c b/block/rbd.c
> > > > index 5bc5b32..c286b32 100644
> > > > --- a/block/rbd.c
> > > > +++ b/block/rbd.c
> > > > @@ -290,7 +290,7 @@ static int qemu_rbd_set_conf(rados_t cluster,
> const
> > > > char *conf,
> > > >  if (only_read_conf_file) {
> > > >  ret = rados_conf_read_file(cluster, value);
> > > >  if (ret < 0) {
> > > > -error_setg(errp, "error reading conf file %s",
> value);
> > > > +error_setg_errno(errp, -ret, "error reading
> conf file
> > > > %s", value);
> > >
> > > Please use scripts/checkpatch.pl to scan patches for coding style
> > > violations.  This line exceeds the maximum line length.
> >
> >
> > Also, if you can use something like git send-email, it formats the patch
> > emails much nicer -- which makes it easier to review & apply.
>
> Also please use ./scripts/get_maintainer.pl to get a more complete
> recipient
> list (for example it will report qemu-bl...@nongnu.org and other block
> layer maintainers).
>
> Fam
>

Thank you everyone for your inputs. Sure I will fix the current patch and
will check it with given scripts and then will send patch v2.

Re: [Qemu-devel] [PATCH v2 2/2] xen: add pvUSB backend

2016-05-05 Thread Juergen Gross

On 05/05/16 12:13, Anthony PERARD wrote:
> On Wed, May 04, 2016 at 10:25:03AM +0200, Juergen Gross wrote:
>> On 03/05/16 17:06, Anthony PERARD wrote:
>>> On Thu, Mar 10, 2016 at 04:19:30PM +0100, Juergen Gross wrote:
 +static void usbback_bh(void *opaque)
 +{
 +struct usbback_info *usbif;
 +struct usbif_urb_back_ring *urb_ring;
 +struct usbback_req *usbback_req;
 +RING_IDX rc, rp;
 +unsigned int more_to_do;
 +
 +usbif = opaque;
 +if (usbif->ring_error) {
 +return;
 +}
 +
 +urb_ring = >urb_ring;
 +rc = urb_ring->req_cons;
 +rp = urb_ring->sring->req_prod;
>>>
>>> Maybe use atomic_read() here to avoid req_prod been read more than once.
>>
>> Hmm. This isn't done in the other backends.
>>
>> TBH: what would happen if req_prod would be read multiple times? In the
>> worst case we would see a new request from the guest which we would have
>> missed without the atomic_read().
> 
> If the guest is misbehaving, it maybe could provoke QEMU to handle more
> request. I'm not sure.

I don't think this would add any risk to dom0. A misbehaving guest
writing arbitrary values to ->req_prod could influence qemu activity
in just the same way regardless whether atomic_read() is used on qemu
side or not. The only difference would be that with atomic_read() the
additional qemu activity would be delayed until the next invocation
of the function.

> For this use of atomic_read, I'm mostly refering to XSA-155[1] and a
> conversation[2].

The main problem with XSA-155 was modification of the request's contents
by the guest after verification by the backend happened. This is not
related to reading the producer's ring index.

I should use RING_COPY_REQUEST(), however.

Juergen

Re: [Qemu-devel] TCP Segementation Offloading

2016-05-05 Thread Ingo Krabbe

> On Sun, May 01, 2016 at 02:31:57PM +0200, Ingo Krabbe wrote:
>> Good Mayday Qemu Developers,
>> 
>> today I tried to find a reference to a networking problem, that seems to be 
>> of quite general nature: TCP Segmentation Offloading (TSO) in virtual 
>> environments.
>> 
>> When I setup TAP network adapter for a virtual machine and put it into a 
>> host bridge, the known best practice is to manually set "tso off gso off" 
>> with ethtool, for the guest driver if I use a hardware emulation, such as 
>> e1000 and/or "tso off gso off" for the host driver and/or for the bridge 
>> adapter, if I use the virtio driver, as otherwise you experience 
>> (sometimes?) performance problems or even lost packages.
> 
> I can't parse this sentence.  In what cases do you think it's a "known
> best practice" to disable tso and gso?  Maybe a table would be a clearer
> way to communicate this.
> 
> Can you provide a link to the source claiming tso and gso should be
> disabled?

Sorry for that long sentence. The consequence seems to be, that it is most 
stable to turn off tso and gso for host bridges and for adapters in virtual 
machines.

One of the most comprehensive collections of arguments is this article


https://kris.io/2015/10/01/kvm-network-performance-tso-and-gso-turn-it-off/

while I also found a documentation for Centos 6


https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Virtualization_Host_Configuration_and_Guest_Installation_Guide/ch10s04.html

In google groups this one is discussed

https://code.google.com/p/ganeti/wiki/PerformanceTuning

Of course the same is found for Xen Machines

http://cloudnull.io/2012/07/xenserver-network-tuning/

You see there are several Links in the internet and my first question is: Why 
can't I find this discussion in the qemu-wiki space.

I think the bug

https://bugs.launchpad.net/bugs/1202289

is related.

>> I haven't found a complete analysis of the background of these problems, but 
>> there seem to be some effects on MTU based fragmentation and UDP checksums.
>> 
>> There is a tso related bug on launchpad, but the context of this bug is too 
>> narrow, for the generality of the problem.
>> 
>> Also it seems that there is a problem in LXC contexts too (I found such a 
>> reference, without detailed description in a Post about Xen setup).
>> 
>> My question now is: Is there a bug in the driver code and shouldn't this be 
>> documented somewhere in wiki.qemu.org? Where there developments about this 
>> topic in the past or is there any planned/ongoing work todo on the qemu 
>> drivers?
>> 
>> Most problem reports found relate to deprecated Centos6 qemu-kvm packages.
>> 
>> In our company we have similar or even worse problems with Centos7 hosts and 
>> guest machines.
> 
> Have haven't explained what problem you are experiencing.  If you want
> help with your setup please include your QEMU command-line (ps aux |
> grep qemu), the traffic pattern (ideally how to reproduce it with a
> benchmarking tool), and what observation you are making (e.g. netstat
> counters showing dropped packets).

I was quite astonished about the many hints about virtio drivers as we had this 
problem with the e1000 driver in a Centos7 Guest on a Centos6 Host.

e1000 :00:03.0 ens3: Detected Tx Unit Hang#012  Tx Queue
 <0>#012  TDH  <42>#012  TDT  <42>#012  
next_to_use  <2e>#012  next_to_clean
<42>#012buffer_info[next_to_clean]#012  time_stamp   <104aff1b8>#012  
next_to_watch<44>#012  jiffies  <104b00ee9>#012  
next_to_watch.status <0>
Apr 25 21:08:48 db03 kernel: [ cut here ]
Apr 25 21:08:48 db03 kernel: WARNING: at net/sched/sch_generic.c:297 
dev_watchdog+0x270/0x280()
Apr 25 21:08:48 db03 kernel: NETDEV WATCHDOG: ens3 (e1000): transmit 
queue 0 timed out
Apr 25 21:08:48 db03 kernel: Modules linked in: binfmt_misc ipt_REJECT 
nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip6t_REJECT nf_conntrack_ipv6 
nf_defrag_ipv6 xt_conntrack nf_conntrack ip6table_filter ip6_tables btrfs 
zlib_deflate raid6_pq xor ext4 mbcache jbd2 crc32_pclmul ghash_clmulni_intel 
aesni_intel lrw gf128mul glue_helper ablk_helper i2c_piix4 ppdev cryptd pcspkr 
virtio_balloon parport_pc parport sg nfsd auth_rpcgss nfs_acl lockd grace 
sunrpc ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic ata_generic 
pata_acpi virtio_scsi cirrus syscopyarea sysfillrect sysimgblt drm_kms_helper 
ttm drm crct10dif_pclmul crct10dif_common ata_piix crc32c_intel virtio_pci 
e1000 i2c_core virtio_ring libata serio_raw virtio floppy dm_mirror 
dm_region_hash dm_log dm_mod
Apr 25 21:08:48 db03 kernel: CPU: 2 PID: 0 Comm: swapper/2 Not tainted 
3.10.0-327.13.1.el7.x86_64 #1
Apr 25 21:08:48 db03 kernel: Hardware name: Red Hat KVM, BIOS 0.5.1 
01/01/2007
Apr 25 21:08:48 db03 kernel:

Re: [Qemu-devel] [PATCH v3 07/18] qapi: Add json output visitor

2016-05-05 Thread Eric Blake

On 05/04/2016 09:45 AM, Markus Armbruster wrote:
> +void json_output_visitor_reset(JsonOutputVisitor *v);

 Hmm.  Why is "reset" not a Visitor method?

 I think this would let us put the things enforced by your "qmp: Tighten
 output visitor rules" in the Visitor contract.
>>>
>>> I thought about that, and now that you've mentioned it, I'll probably
>>> give it a try (that is, make visit_reset() a top-level construct that
>>> ALL visitors must support, rather than just qmp-output and json-output).
>>
>> Yes, please.
> 
> Same question for "cleanup".  Stupid name for a destructor, by the way.

Interface question - all of the FOO_visitor_new() functions return a
subtype pointer Foo*, rather than Visitor*; along with a Visitor
*FOO_get_visitor(FOO*) for up-casting, so that FOO* can be used in the
per-type cleanup function; the FOO* pointers are also useful for two
additional output functions in the two output visitors.  We're proposing
hiding the per-type cleanup function behind a simpler visit_free(Visitor
*v).  So all that's left are the two output functions.  Can we get rid
of those, and make Visitor* the only public interface, rather than
making every caller have to do upcasts?

It looks like outside of the testsuite, all uses of these visitors are
local to a single function; and improving the testsuite is not the end
of the world.  Particularly if only the testsuite is using reset, it may
be easier to just patch the testsuite to use a new visitor in the places
where it currently does a reset.  How ugly would it be to require that
the caller pass in a pointer to the result as part of creating the
visitor, with the promise that the result is populated at the end of a
successful visit, and left NULL if the visit is reset early?  Or maybe a
visit_complete() function that is a no-op on input visitors, but
populates the parameter passed at creation for output visitors.  If we
do that, we could rewrite things like the existing:

QObject *object_property_get_qobject(Object *obj, const char *name,
 Error **errp)
{
QObject *ret = NULL;
Error *local_err = NULL;
QmpOutputVisitor *qov;

qov = qmp_output_visitor_new();
object_property_get(obj, qmp_output_get_visitor(qov), name, _err);
if (!local_err) {
ret = qmp_output_get_qobject(qov);
}
error_propagate(errp, local_err);
qmp_output_visitor_cleanup(qov);
return ret;
}

to instead be:

QObject *object_property_get_qobject(Object *obj, const char *name,
 Error **errp)
{
QObject *ret = NULL;
Error *local_err = NULL;
Visitor *v;

v = qmp_output_visitor_new();
object_property_get(obj, v, name, _err);
if (!local_err) {
visit_complete(v); /* populates ret */
}
error_propagate(errp, local_err);
visit_free(v);
return ret;
}

Slightly shorter, but populating 'ret' at a distance feels a bit weird.
 Maybe we need to keep the FOO_new() functions returning FOO* rather
than Visitor*, along with the FOO_get_visitor() functions, after all.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org

signature.asc
Description: OpenPGP digital signature

[Qemu-devel] [PATCH v5 04/11] mptsas: change .realize function name

All the other devices` .realize function name are xxx_realize, except this one.

cc: Paolo Bonzini 
Reviewed-by: Marcel Apfelbaum 
Signed-off-by: Cao jin 
---
 hw/scsi/mptsas.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/scsi/mptsas.c b/hw/scsi/mptsas.c
index 499c146..1c18c84 100644
--- a/hw/scsi/mptsas.c
+++ b/hw/scsi/mptsas.c
@@ -1270,7 +1270,7 @@ static const struct SCSIBusInfo mptsas_scsi_info = {
 .load_request = mptsas_load_request,
 };
 
-static void mptsas_scsi_init(PCIDevice *dev, Error **errp)
+static void mptsas_scsi_realize(PCIDevice *dev, Error **errp)
 {
 DeviceState *d = DEVICE(dev);
 MPTSASState *s = MPT_SAS(dev);
@@ -1413,7 +1413,7 @@ static void mptsas1068_class_init(ObjectClass *oc, void 
*data)
 DeviceClass *dc = DEVICE_CLASS(oc);
 PCIDeviceClass *pc = PCI_DEVICE_CLASS(oc);
 
-pc->realize = mptsas_scsi_init;
+pc->realize = mptsas_scsi_realize;
 pc->exit = mptsas_scsi_uninit;
 pc->romfile = 0;
 pc->vendor_id = PCI_VENDOR_ID_LSI_LOGIC;
-- 
2.1.0

[Qemu-devel] [PATCH v5 10/11] pci core: assert ENOSPC when add capability

ENOSPC is programming error, assert it for debugging.

cc: Michael S. Tsirkin 
cc: Marcel Apfelbaum 
cc: Markus Armbruster 
Signed-off-by: Cao jin 
---
 hw/pci/pci.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index f0f41dc..fc8b377 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -2151,10 +2151,8 @@ int pci_add_capability2(PCIDevice *pdev, uint8_t cap_id,
 
 if (!offset) {
 offset = pci_find_space(pdev, size);
-if (!offset) {
-error_setg(errp, "out of PCI config space");
-return -ENOSPC;
-}
+/* out of PCI config space should be programming error */
+assert(offset);
 } else {
 /* Verify that capabilities don't overlap.  Note: device assignment
  * depends on this check to verify that the device is not broken.
-- 
2.1.0

[Qemu-devel] [PATCH v5 06/11] intel-hda: change msi property type

>From uint32 to enum OnOffAuto.

cc: Gerd Hoffmann 
cc: Michael S. Tsirkin 
cc: Markus Armbruster 
cc: Marcel Apfelbaum 

Signed-off-by: Cao jin 
---
 hw/audio/intel-hda.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/hw/audio/intel-hda.c b/hw/audio/intel-hda.c
index d372d4a..61362e5 100644
--- a/hw/audio/intel-hda.c
+++ b/hw/audio/intel-hda.c
@@ -187,7 +187,7 @@ struct IntelHDAState {
 
 /* properties */
 uint32_t debug;
-uint32_t msi;
+OnOffAuto msi;
 bool old_msi_addr;
 };
 
@@ -1142,7 +1142,8 @@ static void intel_hda_realize(PCIDevice *pci, Error 
**errp)
 memory_region_init_io(>mmio, OBJECT(d), _hda_mmio_ops, d,
   "intel-hda", 0x4000);
 pci_register_bar(>pci, 0, 0, >mmio);
-if (d->msi) {
+if (d->msi == ON_OFF_AUTO_AUTO ||
+d->msi == ON_OFF_AUTO_ON) {
 msi_init(>pci, d->old_msi_addr ? 0x50 : 0x60, 1, true, false);
 }
 
@@ -1234,7 +1235,7 @@ static const VMStateDescription vmstate_intel_hda = {
 
 static Property intel_hda_properties[] = {
 DEFINE_PROP_UINT32("debug", IntelHDAState, debug, 0),
-DEFINE_PROP_UINT32("msi", IntelHDAState, msi, 1),
+DEFINE_PROP_ON_OFF_AUTO("msi", IntelHDAState, msi, ON_OFF_AUTO_AUTO),
 DEFINE_PROP_BOOL("old_msi_addr", IntelHDAState, old_msi_addr, false),
 DEFINE_PROP_END_OF_LIST(),
 };
-- 
2.1.0

[Qemu-devel] [PATCH v5 11/11] pci: Convert msi_init() to Error and fix callers to check it

msi_init() reports errors with error_report(), which is wrong
when it's used in realize().

Fix by converting it to Error.

Fix its callers to handle failure instead of ignoring it.

For those callers who don`t handle the failure, it might happen:
when user want msi on, but he doesn`t get what he want because of
msi_init fails silently.

cc: Gerd Hoffmann 
cc: John Snow 
cc: Dmitry Fleytman 
cc: Jason Wang 
cc: Michael S. Tsirkin 
cc: Hannes Reinecke 
cc: Paolo Bonzini 
cc: Alex Williamson 
cc: Markus Armbruster 
cc: Marcel Apfelbaum 

Signed-off-by: Cao jin 
---
the affected device is modified in this way:
1. intel hd audio: move msi_init() above, save the strength to free the
   MemoryRegion when it fails.
2. ich9 ahci: move msi_init() above, save the strenth to free the resource
   allocated when calling achi_realize(). It doesn`t have msi property, so
   msi_init failure leads to fall back to INTx silently. Just free the error
   object
3. vmxnet3: move msi_init() above. Remove the unecessary vmxnet3_init_msi().
   It doesn`t have msi property, so msi_init() failure leads to fall back to
   INTx silently. Just free the error object.
4. ioh3420/xio3130_downstream/xio3130_upstream: they are pcie components, msi
   or msix is forced, catch error and report it right there.
5. pci_bridge_dev: msi_init`s failure is fatal, follow the behaviour.
6. megasas_scsi: move msi_init() above, save the strength to free the
   MemoryRegion when it fails.
7. mptsas: Move msi_init() above, save the strength to free the MemoryRegion
   when it fails.
8. pvscsi: it doesn`t have msi property, msi_init fail leads to fall back to
   INTx silently.
9. usb-xhci: move msi_init() above, save the strength to free the MemoryRegion
   when it fails.
10. vfio-pci: keep the previous behaviour, and just catch & report error.

 hw/audio/intel-hda.c   | 18 +
 hw/ide/ich.c   | 15 +-
 hw/net/vmxnet3.c   | 41 +++---
 hw/pci-bridge/ioh3420.c|  4 +++-
 hw/pci-bridge/pci_bridge_dev.c |  7 ---
 hw/pci-bridge/xio3130_downstream.c |  4 +++-
 hw/pci-bridge/xio3130_upstream.c   |  4 +++-
 hw/pci/msi.c   |  9 +
 hw/scsi/megasas.c  | 18 +
 hw/scsi/mptsas.c   | 20 ++-
 hw/scsi/vmw_pvscsi.c   |  6 +-
 hw/usb/hcd-xhci.c  | 18 +
 hw/vfio/pci.c  |  6 --
 include/hw/pci/msi.h   |  3 ++-
 14 files changed, 112 insertions(+), 61 deletions(-)

diff --git a/hw/audio/intel-hda.c b/hw/audio/intel-hda.c
index 61362e5..0a46358 100644
--- a/hw/audio/intel-hda.c
+++ b/hw/audio/intel-hda.c
@@ -1131,6 +1131,7 @@ static void intel_hda_realize(PCIDevice *pci, Error 
**errp)
 {
 IntelHDAState *d = INTEL_HDA(pci);
 uint8_t *conf = d->pci.config;
+Error *err = NULL;
 
 d->name = object_get_typename(OBJECT(d));
 
@@ -1139,13 +1140,22 @@ static void intel_hda_realize(PCIDevice *pci, Error 
**errp)
 /* HDCTL off 0x40 bit 0 selects signaling mode (1-HDA, 0 - Ac97) 18.1.19 */
 conf[0x40] = 0x01;
 
+if (d->msi != ON_OFF_AUTO_OFF) {
+msi_init(>pci, d->old_msi_addr ? 0x50 : 0x60, 1,
+ true, false, );
+if (err && d->msi == ON_OFF_AUTO_ON) {
+/* If user set msi=on, then device creation fail */
+error_propagate(errp, err);
+return;
+} else if (err && d->msi == ON_OFF_AUTO_AUTO) {
+/* If user doesn`t set it on, switch to non-msi variant silently */
+error_free(err);
+}
+}
+
 memory_region_init_io(>mmio, OBJECT(d), _hda_mmio_ops, d,
   "intel-hda", 0x4000);
 pci_register_bar(>pci, 0, 0, >mmio);
-if (d->msi == ON_OFF_AUTO_AUTO ||
-d->msi == ON_OFF_AUTO_ON) {
-msi_init(>pci, d->old_msi_addr ? 0x50 : 0x60, 1, true, false);
-}
 
 hda_codec_bus_init(DEVICE(pci), >codecs, sizeof(d->codecs),
intel_hda_response, intel_hda_xfer);
diff --git a/hw/ide/ich.c b/hw/ide/ich.c
index 0a13334..aec8262 100644
--- a/hw/ide/ich.c
+++ b/hw/ide/ich.c
@@ -111,6 +111,16 @@ static void pci_ich9_ahci_realize(PCIDevice *dev, Error 
**errp)
 int sata_cap_offset;
 uint8_t *sata_cap;
 d = ICH_AHCI(dev);
+Error *err = NULL;
+
+/* Although the AHCI 1.3 specification states that the first capability
+ * should be PMCAP, the Intel ICH9 data sheet specifies that the ICH9
+ * AHCI device puts the MSI capability first, pointing to 0x80. */
+msi_init(dev, ICH9_MSI_CAP_OFFSET, 1, true, false, );
+if (err) {
+/* Fall back to INTx silently */
+

[Qemu-devel] [PATCH v5 01/11] fix some coding style problems

It has:
1. More newlines make the code block well separated.
2. Add more comments for msi_init.
3. Fix a indentation in vmxnet3.c.
4. ioh3420 & xio3130_downstream: put PCI Express capability init function
   together, make it more readable.

cc: Dmitry Fleytman 
cc: Jason Wang 
cc: Michael S. Tsirkin 
cc: Markus Armbruster 
cc: Marcel Apfelbaum 

Signed-off-by: Cao jin 
---
 hw/net/vmxnet3.c   |  2 +-
 hw/pci-bridge/ioh3420.c|  7 ++-
 hw/pci-bridge/pci_bridge_dev.c |  4 
 hw/pci-bridge/xio3130_downstream.c |  6 +-
 hw/pci-bridge/xio3130_upstream.c   |  3 +++
 hw/pci/msi.c   | 19 +++
 6 files changed, 38 insertions(+), 3 deletions(-)

diff --git a/hw/net/vmxnet3.c b/hw/net/vmxnet3.c
index 093a71e..7a38e47 100644
--- a/hw/net/vmxnet3.c
+++ b/hw/net/vmxnet3.c
@@ -348,7 +348,7 @@ typedef struct {
 /* Interrupt management */
 
 /*
- *This function returns sign whether interrupt line is in asserted state
+ * This function returns sign whether interrupt line is in asserted state
  * This depends on the type of interrupt used. For INTX interrupt line will
  * be asserted until explicit deassertion, for MSI(X) interrupt line will
  * be deasserted automatically due to notification semantics of the MSI(X)
diff --git a/hw/pci-bridge/ioh3420.c b/hw/pci-bridge/ioh3420.c
index 0937fa3..b4a7806 100644
--- a/hw/pci-bridge/ioh3420.c
+++ b/hw/pci-bridge/ioh3420.c
@@ -106,12 +106,14 @@ static int ioh3420_initfn(PCIDevice *d)
 if (rc < 0) {
 goto err_bridge;
 }
+
 rc = msi_init(d, IOH_EP_MSI_OFFSET, IOH_EP_MSI_NR_VECTOR,
   IOH_EP_MSI_SUPPORTED_FLAGS & PCI_MSI_FLAGS_64BIT,
   IOH_EP_MSI_SUPPORTED_FLAGS & PCI_MSI_FLAGS_MASKBIT);
 if (rc < 0) {
 goto err_bridge;
 }
+
 rc = pcie_cap_init(d, IOH_EP_EXP_OFFSET, PCI_EXP_TYPE_ROOT_PORT, p->port);
 if (rc < 0) {
 goto err_msi;
@@ -120,18 +122,21 @@ static int ioh3420_initfn(PCIDevice *d)
 pcie_cap_arifwd_init(d);
 pcie_cap_deverr_init(d);
 pcie_cap_slot_init(d, s->slot);
+pcie_cap_root_init(d);
+
 pcie_chassis_create(s->chassis);
 rc = pcie_chassis_add_slot(s);
 if (rc < 0) {
 goto err_pcie_cap;
 }
-pcie_cap_root_init(d);
+
 rc = pcie_aer_init(d, IOH_EP_AER_OFFSET, PCI_ERR_SIZEOF);
 if (rc < 0) {
 goto err;
 }
 pcie_aer_root_init(d);
 ioh3420_aer_vector_update(d);
+
 return 0;
 
 err:
diff --git a/hw/pci-bridge/pci_bridge_dev.c b/hw/pci-bridge/pci_bridge_dev.c
index 862a236..32f4daa 100644
--- a/hw/pci-bridge/pci_bridge_dev.c
+++ b/hw/pci-bridge/pci_bridge_dev.c
@@ -67,10 +67,12 @@ static int pci_bridge_dev_initfn(PCIDevice *dev)
 /* MSI is not applicable without SHPC */
 bridge_dev->flags &= ~(1 << PCI_BRIDGE_DEV_F_MSI_REQ);
 }
+
 err = slotid_cap_init(dev, 0, bridge_dev->chassis_nr, 0);
 if (err) {
 goto slotid_error;
 }
+
 if ((bridge_dev->flags & (1 << PCI_BRIDGE_DEV_F_MSI_REQ)) &&
 msi_nonbroken) {
 err = msi_init(dev, 0, 1, true, true);
@@ -78,6 +80,7 @@ static int pci_bridge_dev_initfn(PCIDevice *dev)
 goto msi_error;
 }
 }
+
 if (shpc_present(dev)) {
 /* TODO: spec recommends using 64 bit prefetcheable BAR.
  * Check whether that works well. */
@@ -85,6 +88,7 @@ static int pci_bridge_dev_initfn(PCIDevice *dev)
  PCI_BASE_ADDRESS_MEM_TYPE_64, _dev->bar);
 }
 return 0;
+
 msi_error:
 slotid_cap_cleanup(dev);
 slotid_error:
diff --git a/hw/pci-bridge/xio3130_downstream.c 
b/hw/pci-bridge/xio3130_downstream.c
index cf1ee63..e6d653d 100644
--- a/hw/pci-bridge/xio3130_downstream.c
+++ b/hw/pci-bridge/xio3130_downstream.c
@@ -70,11 +70,13 @@ static int xio3130_downstream_initfn(PCIDevice *d)
 if (rc < 0) {
 goto err_bridge;
 }
+
 rc = pci_bridge_ssvid_init(d, XIO3130_SSVID_OFFSET,
XIO3130_SSVID_SVID, XIO3130_SSVID_SSID);
 if (rc < 0) {
 goto err_bridge;
 }
+
 rc = pcie_cap_init(d, XIO3130_EXP_OFFSET, PCI_EXP_TYPE_DOWNSTREAM,
p->port);
 if (rc < 0) {
@@ -83,12 +85,14 @@ static int xio3130_downstream_initfn(PCIDevice *d)
 pcie_cap_flr_init(d);
 pcie_cap_deverr_init(d);
 pcie_cap_slot_init(d, s->slot);
+pcie_cap_arifwd_init(d);
+
 pcie_chassis_create(s->chassis);
 rc = pcie_chassis_add_slot(s);
 if (rc < 0) {
 goto err_pcie_cap;
 }
-pcie_cap_arifwd_init(d);
+
 rc = pcie_aer_init(d, XIO3130_AER_OFFSET, PCI_ERR_SIZEOF);
 if (rc < 0) {
 goto err;
diff --git a/hw/pci-bridge/xio3130_upstream.c b/hw/pci-bridge/xio3130_upstream.c
index 164ef58..d976844 100644
--- a/hw/pci-bridge/xio3130_upstream.c
+++ b/hw/pci-bridge/xio3130_upstream.c
@@

[Qemu-devel] [PATCH v5 08/11] megasas: change msi/msix property type

>From bit to enum OnOffAuto.

cc: Hannes Reinecke 
cc: Paolo Bonzini 
cc: Markus Armbruster 
Signed-off-by: Cao jin 
---
 hw/scsi/megasas.c | 28 +++-
 1 file changed, 11 insertions(+), 17 deletions(-)

diff --git a/hw/scsi/megasas.c b/hw/scsi/megasas.c
index 56fb645..e71a28b 100644
--- a/hw/scsi/megasas.c
+++ b/hw/scsi/megasas.c
@@ -48,11 +48,7 @@
 
 #define MEGASAS_FLAG_USE_JBOD  0
 #define MEGASAS_MASK_USE_JBOD  (1 << MEGASAS_FLAG_USE_JBOD)
-#define MEGASAS_FLAG_USE_MSI   1
-#define MEGASAS_MASK_USE_MSI   (1 << MEGASAS_FLAG_USE_MSI)
-#define MEGASAS_FLAG_USE_MSIX  2
-#define MEGASAS_MASK_USE_MSIX  (1 << MEGASAS_FLAG_USE_MSIX)
-#define MEGASAS_FLAG_USE_QUEUE64   3
+#define MEGASAS_FLAG_USE_QUEUE64   1
 #define MEGASAS_MASK_USE_QUEUE64   (1 << MEGASAS_FLAG_USE_QUEUE64)
 
 static const char *mfi_frame_desc[] = {
@@ -96,6 +92,8 @@ typedef struct MegasasState {
 int busy;
 int diag;
 int adp_reset;
+OnOffAuto msi;
+OnOffAuto msix;
 
 MegasasCmd *event_cmd;
 int event_locale;
@@ -159,12 +157,12 @@ static bool megasas_use_queue64(MegasasState *s)
 
 static bool megasas_use_msi(MegasasState *s)
 {
-return s->flags & MEGASAS_MASK_USE_MSI;
+return s->msi == ON_OFF_AUTO_AUTO || s->msi == ON_OFF_AUTO_ON;
 }
 
 static bool megasas_use_msix(MegasasState *s)
 {
-return s->flags & MEGASAS_MASK_USE_MSIX;
+return s->msix == ON_OFF_AUTO_AUTO || s->msix == ON_OFF_AUTO_ON;
 }
 
 static bool megasas_is_jbod(MegasasState *s)
@@ -2349,12 +2347,12 @@ static void megasas_scsi_realize(PCIDevice *dev, Error 
**errp)
 
 if (megasas_use_msi(s) &&
 msi_init(dev, 0x50, 1, true, false) < 0) {
-s->flags &= ~MEGASAS_MASK_USE_MSI;
+s->msi = ON_OFF_AUTO_OFF;
 }
 if (megasas_use_msix(s) &&
 msix_init(dev, 15, >mmio_io, b->mmio_bar, 0x2000,
   >mmio_io, b->mmio_bar, 0x3800, 0x68)) {
-s->flags &= ~MEGASAS_MASK_USE_MSIX;
+s->msix = ON_OFF_AUTO_OFF;
 }
 if (pci_is_express(dev)) {
 pcie_endpoint_cap_init(dev, 0xa0);
@@ -2422,10 +2420,8 @@ static Property megasas_properties_gen1[] = {
MEGASAS_DEFAULT_FRAMES),
 DEFINE_PROP_STRING("hba_serial", MegasasState, hba_serial),
 DEFINE_PROP_UINT64("sas_address", MegasasState, sas_addr, 0),
-DEFINE_PROP_BIT("use_msi", MegasasState, flags,
-MEGASAS_FLAG_USE_MSI, false),
-DEFINE_PROP_BIT("use_msix", MegasasState, flags,
-MEGASAS_FLAG_USE_MSIX, false),
+DEFINE_PROP_ON_OFF_AUTO("msi", MegasasState, msi, ON_OFF_AUTO_AUTO),
+DEFINE_PROP_ON_OFF_AUTO("msix", MegasasState, msix, ON_OFF_AUTO_AUTO),
 DEFINE_PROP_BIT("use_jbod", MegasasState, flags,
 MEGASAS_FLAG_USE_JBOD, false),
 DEFINE_PROP_END_OF_LIST(),
@@ -2438,10 +2434,8 @@ static Property megasas_properties_gen2[] = {
MEGASAS_GEN2_DEFAULT_FRAMES),
 DEFINE_PROP_STRING("hba_serial", MegasasState, hba_serial),
 DEFINE_PROP_UINT64("sas_address", MegasasState, sas_addr, 0),
-DEFINE_PROP_BIT("use_msi", MegasasState, flags,
-MEGASAS_FLAG_USE_MSI, true),
-DEFINE_PROP_BIT("use_msix", MegasasState, flags,
-MEGASAS_FLAG_USE_MSIX, true),
+DEFINE_PROP_ON_OFF_AUTO("msi", MegasasState, msi, ON_OFF_AUTO_AUTO),
+DEFINE_PROP_ON_OFF_AUTO("msix", MegasasState, msix, ON_OFF_AUTO_AUTO),
 DEFINE_PROP_BIT("use_jbod", MegasasState, flags,
 MEGASAS_FLAG_USE_JBOD, false),
 DEFINE_PROP_END_OF_LIST(),
-- 
2.1.0

[Qemu-devel] [PATCH v5 05/11] usb xhci: change msi/msix property type

>From bit to enum OnOffAuto.

cc: Gerd Hoffmann 
cc: Michael S. Tsirkin 
cc: Markus Armbruster 
cc: Marcel Apfelbaum 

Signed-off-by: Cao jin 
---
 hw/usb/hcd-xhci.c | 16 +---
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/hw/usb/hcd-xhci.c b/hw/usb/hcd-xhci.c
index bcde8a2..6d70289 100644
--- a/hw/usb/hcd-xhci.c
+++ b/hw/usb/hcd-xhci.c
@@ -461,6 +461,8 @@ struct XHCIState {
 uint32_t numslots;
 uint32_t flags;
 uint32_t max_pstreams_mask;
+OnOffAuto msi;
+OnOffAuto msix;
 
 /* Operational Registers */
 uint32_t usbcmd;
@@ -498,9 +500,7 @@ typedef struct XHCIEvRingSeg {
 } XHCIEvRingSeg;
 
 enum xhci_flags {
-XHCI_FLAG_USE_MSI = 1,
-XHCI_FLAG_USE_MSI_X,
-XHCI_FLAG_SS_FIRST,
+XHCI_FLAG_SS_FIRST = 1,
 XHCI_FLAG_FORCE_PCIE_ENDCAP,
 XHCI_FLAG_ENABLE_STREAMS,
 };
@@ -3645,10 +3645,12 @@ static void usb_xhci_realize(struct PCIDevice *dev, 
Error **errp)
 assert(ret >= 0);
 }
 
-if (xhci_get_flag(xhci, XHCI_FLAG_USE_MSI)) {
+if (xhci->msi == ON_OFF_AUTO_ON ||
+xhci->msi == ON_OFF_AUTO_AUTO) {
 msi_init(dev, 0x70, xhci->numintrs, true, false);
 }
-if (xhci_get_flag(xhci, XHCI_FLAG_USE_MSI_X)) {
+if (xhci->msix == ON_OFF_AUTO_ON ||
+xhci->msix == ON_OFF_AUTO_AUTO) {
 msix_init(dev, xhci->numintrs,
   >mem, 0, OFF_MSIX_TABLE,
   >mem, 0, OFF_MSIX_PBA,
@@ -3869,8 +3871,8 @@ static const VMStateDescription vmstate_xhci = {
 };
 
 static Property xhci_properties[] = {
-DEFINE_PROP_BIT("msi",  XHCIState, flags, XHCI_FLAG_USE_MSI, true),
-DEFINE_PROP_BIT("msix", XHCIState, flags, XHCI_FLAG_USE_MSI_X, true),
+DEFINE_PROP_ON_OFF_AUTO("msi", XHCIState, msi, ON_OFF_AUTO_AUTO),
+DEFINE_PROP_ON_OFF_AUTO("msix", XHCIState, msix, ON_OFF_AUTO_AUTO),
 DEFINE_PROP_BIT("superspeed-ports-first",
 XHCIState, flags, XHCI_FLAG_SS_FIRST, true),
 DEFINE_PROP_BIT("force-pcie-endcap", XHCIState, flags,
-- 
2.1.0

[Qemu-devel] [PATCH v5 00/11] Add param Error ** for msi_init()

This patchset is for 2.7. This version has a huge change, so I hope to
get some comments first.

The change mostly is:
1. According suggestions, modify devices` msi/msix property type.
   its type mostly bit or uint, now change it to enum OnOffAuto,
   and default to "auto". So we will know if user want msi/msix or not,
   then the process will be:

   /-> Fail: switch to the non-msi variant
 msi = auto -> msi_init
   \-> Success: we got msi variant

 /-> Fail: device creation fail
 msi = on -> msi_init
 \-> Success: we got msi variant

2. For devices who don`t have msi/msix property, following its previous
   behaviour: if msi_init`s failure is non-fatal, then fall back to INTx
   silently, or else, device creation fail.

3. pci core: Assert ENOSPC error when add capability

4. fix to all the other tiny comments

cc: Gerd Hoffmann 
cc: John Snow 
cc: Dmitry Fleytman 
cc: Jason Wang 
cc: Michael S. Tsirkin 
cc: Hannes Reinecke 
cc: Paolo Bonzini 
cc: Alex Williamson 
cc: Markus Armbruster 
cc: Marcel Apfelbaum 

Cao jin (11):
  fix some coding style problems
  change pvscsi_init_msi() type to void
  megasas: Fix
  mptsas: change .realize function name
  usb xhci: change msi/msix property type
  intel-hda: change msi property type
  mptsas: change msi property type
  megasas: change msi/msix property type
  pci bridge dev: change msi property type
  pci core: assert ENOSPC when add capability
  pci: Convert msi_init() to Error and fix callers to check it

 hw/audio/intel-hda.c   | 21 +-
 hw/ide/ich.c   | 15 -
 hw/net/vmxnet3.c   | 43 +++--
 hw/pci-bridge/ioh3420.c| 11 --
 hw/pci-bridge/pci_bridge_dev.c | 21 --
 hw/pci-bridge/xio3130_downstream.c | 10 +++--
 hw/pci-bridge/xio3130_upstream.c   |  7 +-
 hw/pci/msi.c   | 24 +++--
 hw/pci/pci.c   |  6 ++
 hw/scsi/megasas.c  | 44 +-
 hw/scsi/mptsas.c   | 26 +++---
 hw/scsi/mptsas.h   |  3 ++-
 hw/scsi/vmw_pvscsi.c   | 10 +
 hw/usb/hcd-xhci.c  | 30 ++
 hw/vfio/pci.c  |  6 --
 include/hw/pci/msi.h   |  3 ++-
 16 files changed, 181 insertions(+), 99 deletions(-)

-- 
2.1.0

[Qemu-devel] [PATCH v5 03/11] megasas: Fix

msi_init returns non-zero value on both failure and success.

cc: Hannes Reinecke 
cc: Paolo Bonzini 
Signed-off-by: Cao jin 
---
 hw/scsi/megasas.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/scsi/megasas.c b/hw/scsi/megasas.c
index a63a581..56fb645 100644
--- a/hw/scsi/megasas.c
+++ b/hw/scsi/megasas.c
@@ -2348,7 +2348,7 @@ static void megasas_scsi_realize(PCIDevice *dev, Error 
**errp)
   "megasas-queue", 0x4);
 
 if (megasas_use_msi(s) &&
-msi_init(dev, 0x50, 1, true, false)) {
+msi_init(dev, 0x50, 1, true, false) < 0) {
 s->flags &= ~MEGASAS_MASK_USE_MSI;
 }
 if (megasas_use_msix(s) &&
-- 
2.1.0

[Qemu-devel] [PATCH v5 09/11] pci bridge dev: change msi property type

>From bit to enum OnOffAuto.

cc: Michael S. Tsirkin 
cc: Markus Armbruster 
cc: Marcel Apfelbaum 
Signed-off-by: Cao jin 
---

Actually, I am not quite sure this device need this change, RFC.

 hw/pci-bridge/pci_bridge_dev.c | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/hw/pci-bridge/pci_bridge_dev.c b/hw/pci-bridge/pci_bridge_dev.c
index 32f4daa..9e31f0e 100644
--- a/hw/pci-bridge/pci_bridge_dev.c
+++ b/hw/pci-bridge/pci_bridge_dev.c
@@ -41,9 +41,10 @@ struct PCIBridgeDev {
 
 MemoryRegion bar;
 uint8_t chassis_nr;
-#define PCI_BRIDGE_DEV_F_MSI_REQ 0
-#define PCI_BRIDGE_DEV_F_SHPC_REQ 1
+#define PCI_BRIDGE_DEV_F_SHPC_REQ 0
 uint32_t flags;
+
+OnOffAuto msi;
 };
 typedef struct PCIBridgeDev PCIBridgeDev;
 
@@ -65,7 +66,7 @@ static int pci_bridge_dev_initfn(PCIDevice *dev)
 }
 } else {
 /* MSI is not applicable without SHPC */
-bridge_dev->flags &= ~(1 << PCI_BRIDGE_DEV_F_MSI_REQ);
+bridge_dev->msi = ON_OFF_AUTO_OFF;
 }
 
 err = slotid_cap_init(dev, 0, bridge_dev->chassis_nr, 0);
@@ -73,7 +74,8 @@ static int pci_bridge_dev_initfn(PCIDevice *dev)
 goto slotid_error;
 }
 
-if ((bridge_dev->flags & (1 << PCI_BRIDGE_DEV_F_MSI_REQ)) &&
+if ((bridge_dev->msi == ON_OFF_AUTO_AUTO ||
+bridge_dev->msi == ON_OFF_AUTO_ON) &&
 msi_nonbroken) {
 err = msi_init(dev, 0, 1, true, true);
 if (err < 0) {
@@ -146,8 +148,8 @@ static Property pci_bridge_dev_properties[] = {
 /* Note: 0 is not a legal chassis number. */
 DEFINE_PROP_UINT8(PCI_BRIDGE_DEV_PROP_CHASSIS_NR, PCIBridgeDev, chassis_nr,
   0),
-DEFINE_PROP_BIT(PCI_BRIDGE_DEV_PROP_MSI, PCIBridgeDev, flags,
-PCI_BRIDGE_DEV_F_MSI_REQ, true),
+DEFINE_PROP_ON_OFF_AUTO(PCI_BRIDGE_DEV_PROP_MSI, PCIBridgeDev, msi,
+ON_OFF_AUTO_AUTO),
 DEFINE_PROP_BIT(PCI_BRIDGE_DEV_PROP_SHPC, PCIBridgeDev, flags,
 PCI_BRIDGE_DEV_F_SHPC_REQ, true),
 DEFINE_PROP_END_OF_LIST(),
-- 
2.1.0

[Qemu-devel] [PATCH v5 07/11] mptsas: change msi property type

>From uint32 to enum OnOffAuto, and give it a shorter name.

cc: Paolo Bonzini 
cc: Markus Armbruster 
Signed-off-by: Cao jin 
---
 hw/scsi/mptsas.c | 4 ++--
 hw/scsi/mptsas.h | 3 ++-
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/hw/scsi/mptsas.c b/hw/scsi/mptsas.c
index 1c18c84..afee576 100644
--- a/hw/scsi/mptsas.c
+++ b/hw/scsi/mptsas.c
@@ -1285,7 +1285,7 @@ static void mptsas_scsi_realize(PCIDevice *dev, Error 
**errp)
 memory_region_init_io(>diag_io, OBJECT(s), _diag_ops, s,
   "mptsas-diag", 0x1);
 
-if (s->msi_available &&
+if ((s->msi == ON_OFF_AUTO_AUTO || s->msi == ON_OFF_AUTO_ON) &&
 msi_init(dev, 0, 1, true, false) >= 0) {
 s->msi_in_use = true;
 }
@@ -1404,7 +1404,7 @@ static const VMStateDescription vmstate_mptsas = {
 static Property mptsas_properties[] = {
 DEFINE_PROP_UINT64("sas_address", MPTSASState, sas_addr, 0),
 /* TODO: test MSI support under Windows */
-DEFINE_PROP_BIT("msi", MPTSASState, msi_available, 0, true),
+DEFINE_PROP_ON_OFF_AUTO("msi", MPTSASState, msi, ON_OFF_AUTO_AUTO),
 DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/hw/scsi/mptsas.h b/hw/scsi/mptsas.h
index 595f81f..0436a33 100644
--- a/hw/scsi/mptsas.h
+++ b/hw/scsi/mptsas.h
@@ -27,7 +27,8 @@ struct MPTSASState {
 MemoryRegion diag_io;
 QEMUBH *request_bh;
 
-uint32_t msi_available;
+/* properties */
+OnOffAuto msi;
 uint64_t sas_addr;
 
 bool msi_in_use;
-- 
2.1.0

[Qemu-devel] [PATCH v5 02/11] change pvscsi_init_msi() type to void

Nobody use its return value, so change the type to void.

cc: Paolo Bonzini 
Acked-by: Dmitry Fleytman 
Reviewed-by: Marcel Apfelbaum 
Signed-off-by: Cao jin 
---
 hw/scsi/vmw_pvscsi.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/hw/scsi/vmw_pvscsi.c b/hw/scsi/vmw_pvscsi.c
index 9abc086..4ce3581 100644
--- a/hw/scsi/vmw_pvscsi.c
+++ b/hw/scsi/vmw_pvscsi.c
@@ -1039,7 +1039,7 @@ pvscsi_io_read(void *opaque, hwaddr addr, unsigned size)
 }
 
 
-static bool
+static void
 pvscsi_init_msi(PVSCSIState *s)
 {
 int res;
@@ -1053,8 +1053,6 @@ pvscsi_init_msi(PVSCSIState *s)
 } else {
 s->msi_used = true;
 }
-
-return s->msi_used;
 }
 
 static void
-- 
2.1.0

[Qemu-devel] [PATCH v3 5/5] adb.c: add power key support

Add support for the power key. It has to be handled differently from the other
keys because it is the only 16-bit value key.

Signed-off-by: John Arbuckle 
---
v3 change
Add several suggested comments.
Moved the location of an else statement in the adb_keyboard_event() function.

 hw/input/adb.c | 45 ++---
 1 file changed, 34 insertions(+), 11 deletions(-)

diff --git a/hw/input/adb.c b/hw/input/adb.c
index 37728b3..b5e53c7 100644
--- a/hw/input/adb.c
+++ b/hw/input/adb.c
@@ -343,10 +343,25 @@ static int adb_kbd_poll(ADBDevice *d, uint8_t *obuf)
 s->rptr = 0;
 }
 s->count--;
-obuf[0] = keycode;
-/* NOTE: could put a second keycode if needed */
-obuf[1] = 0xff;
-olen = 2;
+/*
+ * The power key is the only two byte value key, so it is a special case.
+ * Since 0x7f is not a used keycode for ADB we overload it to indicate the
+ * power button when we're storing keycodes in our internal buffer, and
+ * expand it out to two bytes when we send to the guest."
+ */
+if (keycode == 0x7f) {
+obuf[0] = 0x7f;
+obuf[1] = 0x7f;
+olen = 2;
+} else {
+obuf[0] = keycode;
+/* NOTE: the power key key-up is the two byte sequence 0xff 0xff;
+ * otherwise we could in theory send a second keycode in the second
+ * byte, but choose not to bother.
+ */
+obuf[1] = 0xff;
+olen = 2;
+}
 
 return olen;
 }
@@ -424,15 +439,23 @@ static void adb_keyboard_event(DeviceState *dev, 
QemuConsole *src,
 return;
 }
 keycode = qcode_to_adb_keycode[qcode];
-if (keycode == NO_KEY) {  /* We don't want to send this to the guest */
-return;
-}
 
-if (evt->u.key.data->down == false) { /* if key release event */
-keycode = keycode | 0x80;   /* create keyboard break code */
+/* The power button is a special case because it is a 16-bit value */
+if (qcode == Q_KEY_CODE_POWER) {
+printf("Power Key detected\n");
+if (evt->u.key.data->down == true) { /* Power button pushed keycode */
+adb_kbd_put_keycode(s, 0x7f);
+} else {   /* Power button released keycode */
+adb_kbd_put_keycode(s, 0xff);
+}
+} else if (keycode == NO_KEY) { /* NO_KEY shouldn't be sent to guest */
+return;
+} else {   /* For all non-power keys - safe for 8-bit keycodes */
+if (evt->u.key.data->down == false) { /* if key release event */
+keycode = keycode | 0x80;   /* create keyboard break code */
+}
+adb_kbd_put_keycode(s, keycode);
 }
-
-adb_kbd_put_keycode(s, keycode);
 }
 
 static const VMStateDescription vmstate_adb_kbd = {
-- 
2.7.2

[Qemu-devel] [PATCH v3 4/5] adb.c: prevent NO_KEY value from going to guest

The NO_KEY value should not be sent to the guest. This patch drops that value.

Signed-off-by: John Arbuckle 
---
 hw/input/adb.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/hw/input/adb.c b/hw/input/adb.c
index 6d4f4dc..37728b3 100644
--- a/hw/input/adb.c
+++ b/hw/input/adb.c
@@ -424,6 +424,9 @@ static void adb_keyboard_event(DeviceState *dev, 
QemuConsole *src,
 return;
 }
 keycode = qcode_to_adb_keycode[qcode];
+if (keycode == NO_KEY) {  /* We don't want to send this to the guest */
+return;
+}
 
 if (evt->u.key.data->down == false) { /* if key release event */
 keycode = keycode | 0x80;   /* create keyboard break code */
-- 
2.7.2

[Qemu-devel] [PATCH v3 3/5] adb.c: correct several key assignments

The original pc_to_adb_keycode mapping did have several keys that were
incorrectly mapped. This patch fixes these mappings.

Signed-off-by: John Arbuckle 
---
 hw/input/adb.c | 33 -
 1 file changed, 12 insertions(+), 21 deletions(-)

diff --git a/hw/input/adb.c b/hw/input/adb.c
index 715642f..6d4f4dc 100644
--- a/hw/input/adb.c
+++ b/hw/input/adb.c
@@ -62,6 +62,9 @@ do { printf("ADB: " fmt , ## __VA_ARGS__); } while (0)
 /* error codes */
 #define ADB_RET_NOTPRESENT (-2)
 
+/* The adb keyboard doesn't have every key imaginable */
+#define NO_KEY 0xff
+
 static void adb_device_reset(ADBDevice *d)
 {
 qdev_reset_all(DEVICE(d));
@@ -191,19 +194,21 @@ typedef struct ADBKeyboardClass {
 } ADBKeyboardClass;
 
 int qcode_to_adb_keycode[] = {
+ /* Make sure future additions are automatically set to NO_KEY */
+[0 ... 0xff]   = NO_KEY,
 
 [Q_KEY_CODE_SHIFT] = ADB_KEY_LEFT_SHIFT,
 [Q_KEY_CODE_SHIFT_R]   = ADB_KEY_RIGHT_SHIFT,
 [Q_KEY_CODE_ALT]   = ADB_KEY_LEFT_OPTION,
 [Q_KEY_CODE_ALT_R] = ADB_KEY_RIGHT_OPTION,
-[Q_KEY_CODE_ALTGR] = 0,
+[Q_KEY_CODE_ALTGR] = ADB_KEY_RIGHT_OPTION,
 [Q_KEY_CODE_CTRL]  = ADB_KEY_LEFT_CONTROL,
 [Q_KEY_CODE_CTRL_R]= ADB_KEY_RIGHT_CONTROL,
 [Q_KEY_CODE_META_L]= ADB_KEY_LEFT_COMMAND,
 
  /* ADB_KEY_RIGHT_COMMAND works as right super in Linux */
  /* Use ADB_KEY_LEFT_COMMAND for Mac OS compatibility */
-[Q_KEY_CODE_META_R]= 0x7e,
+[Q_KEY_CODE_META_R]= ADB_KEY_LEFT_COMMAND,
 [Q_KEY_CODE_SPC]   = ADB_KEY_SPACEBAR,
 
 [Q_KEY_CODE_ESC]   = ADB_KEY_ESC,
@@ -272,13 +277,13 @@ int qcode_to_adb_keycode[] = {
 [Q_KEY_CODE_F10]   = ADB_KEY_F10,
 [Q_KEY_CODE_F11]   = ADB_KEY_F11,
 [Q_KEY_CODE_F12]   = ADB_KEY_F12,
-[Q_KEY_CODE_PRINT] = 0,
-[Q_KEY_CODE_SYSRQ] = 0,
+[Q_KEY_CODE_PRINT] = ADB_KEY_F13,
+[Q_KEY_CODE_SYSRQ] = ADB_KEY_F13,
 [Q_KEY_CODE_SCROLL_LOCK]   = ADB_KEY_F14,
-[Q_KEY_CODE_PAUSE] = 0,
+[Q_KEY_CODE_PAUSE] = ADB_KEY_F15,
 
 [Q_KEY_CODE_NUM_LOCK]  = ADB_KEY_KP_CLEAR,
-[Q_KEY_CODE_KP_EQUALS] = 0,
+[Q_KEY_CODE_KP_EQUALS] = ADB_KEY_KP_EQUAL,
 [Q_KEY_CODE_KP_DIVIDE] = ADB_KEY_KP_DIVIDE,
 [Q_KEY_CODE_KP_MULTIPLY]   = ADB_KEY_KP_MULTIPLY,
 [Q_KEY_CODE_KP_SUBTRACT]   = ADB_KEY_KP_SUBTRACT,
@@ -301,27 +306,13 @@ int qcode_to_adb_keycode[] = {
 [Q_KEY_CODE_LEFT]  = ADB_KEY_LEFT,
 [Q_KEY_CODE_RIGHT] = ADB_KEY_RIGHT,
 
-[Q_KEY_CODE_HELP]  = 0,
+[Q_KEY_CODE_HELP]  = ADB_KEY_HELP,
 [Q_KEY_CODE_INSERT]= ADB_KEY_HELP,
 [Q_KEY_CODE_DELETE]= ADB_KEY_FORWARD_DELETE,
 [Q_KEY_CODE_HOME]  = ADB_KEY_HOME,
 [Q_KEY_CODE_END]   = ADB_KEY_END,
 [Q_KEY_CODE_PGUP]  = ADB_KEY_PAGE_UP,
 [Q_KEY_CODE_PGDN]  = ADB_KEY_PAGE_DOWN,
-
-[Q_KEY_CODE_LESS]  = 0xa,
-[Q_KEY_CODE_STOP]  = 0,
-[Q_KEY_CODE_AGAIN] = 0,
-[Q_KEY_CODE_PROPS] = 0,
-[Q_KEY_CODE_UNDO]  = 0,
-[Q_KEY_CODE_FRONT] = 0,
-[Q_KEY_CODE_COPY]  = 0,
-[Q_KEY_CODE_OPEN]  = 0,
-[Q_KEY_CODE_PASTE] = 0,
-[Q_KEY_CODE_FIND]  = 0,
-[Q_KEY_CODE_CUT]   = 0,
-[Q_KEY_CODE_LF]= 0,
-[Q_KEY_CODE_COMPOSE]   = 0,
 };
 
 static void adb_kbd_put_keycode(void *opaque, int keycode)
-- 
2.7.2

[Qemu-devel] [PATCH v3 2/5] adb.c: add support for QKeyCode

The old pc scancode translation is replaced with QEMU's QKeyCode.

Signed-off-by: John Arbuckle 
---
*v3 changes
Kept original pc_to_adb_keycode mapping.

*v2 changes
Changed order of this patch.

 hw/input/adb.c | 222 +
 1 file changed, 176 insertions(+), 46 deletions(-)

diff --git a/hw/input/adb.c b/hw/input/adb.c
index f0ad0d4..715642f 100644
--- a/hw/input/adb.c
+++ b/hw/input/adb.c
@@ -25,6 +25,9 @@
 #include "hw/hw.h"
 #include "hw/input/adb.h"
 #include "ui/console.h"
+#include "include/hw/input/adb-keys.h"
+#include "ui/input.h"
+#include "sysemu/sysemu.h"
 
 /* debug ADB */
 //#define DEBUG_ADB
@@ -187,23 +190,138 @@ typedef struct ADBKeyboardClass {
 DeviceRealize parent_realize;
 } ADBKeyboardClass;
 
-static const uint8_t pc_to_adb_keycode[256] = {
-  0, 53, 18, 19, 20, 21, 23, 22, 26, 28, 25, 29, 27, 24, 51, 48,
- 12, 13, 14, 15, 17, 16, 32, 34, 31, 35, 33, 30, 36, 54,  0,  1,
-  2,  3,  5,  4, 38, 40, 37, 41, 39, 50, 56, 42,  6,  7,  8,  9,
- 11, 45, 46, 43, 47, 44,123, 67, 58, 49, 57,122,120, 99,118, 96,
- 97, 98,100,101,109, 71,107, 89, 91, 92, 78, 86, 87, 88, 69, 83,
- 84, 85, 82, 65,  0,  0, 10,103,111,  0,  0,110, 81,  0,  0,  0,
-  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
-  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
-  0,  0,  0, 94,  0, 93,  0,  0,  0,  0,  0,  0,104,102,  0,  0,
-  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0, 76,125,  0,  0,
-  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,105,  0,  0,  0,  0,  0,
-  0,  0,  0,  0,  0, 75,  0,  0,124,  0,  0,  0,  0,  0,  0,  0,
-  0,  0,  0,  0,  0,  0,  0,115, 62,116,  0, 59,  0, 60,  0,119,
- 61,121,114,117,  0,  0,  0,  0,  0,  0,  0, 55,126,  0,127,  0,
-  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
-  0,  0,  0,  0,  0, 95,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
+int qcode_to_adb_keycode[] = {
+
+[Q_KEY_CODE_SHIFT] = ADB_KEY_LEFT_SHIFT,
+[Q_KEY_CODE_SHIFT_R]   = ADB_KEY_RIGHT_SHIFT,
+[Q_KEY_CODE_ALT]   = ADB_KEY_LEFT_OPTION,
+[Q_KEY_CODE_ALT_R] = ADB_KEY_RIGHT_OPTION,
+[Q_KEY_CODE_ALTGR] = 0,
+[Q_KEY_CODE_CTRL]  = ADB_KEY_LEFT_CONTROL,
+[Q_KEY_CODE_CTRL_R]= ADB_KEY_RIGHT_CONTROL,
+[Q_KEY_CODE_META_L]= ADB_KEY_LEFT_COMMAND,
+
+ /* ADB_KEY_RIGHT_COMMAND works as right super in Linux */
+ /* Use ADB_KEY_LEFT_COMMAND for Mac OS compatibility */
+[Q_KEY_CODE_META_R]= 0x7e,
+[Q_KEY_CODE_SPC]   = ADB_KEY_SPACEBAR,
+
+[Q_KEY_CODE_ESC]   = ADB_KEY_ESC,
+[Q_KEY_CODE_1] = ADB_KEY_1,
+[Q_KEY_CODE_2] = ADB_KEY_2,
+[Q_KEY_CODE_3] = ADB_KEY_3,
+[Q_KEY_CODE_4] = ADB_KEY_4,
+[Q_KEY_CODE_5] = ADB_KEY_5,
+[Q_KEY_CODE_6] = ADB_KEY_6,
+[Q_KEY_CODE_7] = ADB_KEY_7,
+[Q_KEY_CODE_8] = ADB_KEY_8,
+[Q_KEY_CODE_9] = ADB_KEY_9,
+[Q_KEY_CODE_0] = ADB_KEY_0,
+[Q_KEY_CODE_MINUS] = ADB_KEY_MINUS,
+[Q_KEY_CODE_EQUAL] = ADB_KEY_EQUAL,
+[Q_KEY_CODE_BACKSPACE] = ADB_KEY_DELETE,
+[Q_KEY_CODE_TAB]   = ADB_KEY_TAB,
+[Q_KEY_CODE_Q] = ADB_KEY_Q,
+[Q_KEY_CODE_W] = ADB_KEY_W,
+[Q_KEY_CODE_E] = ADB_KEY_E,
+[Q_KEY_CODE_R] = ADB_KEY_R,
+[Q_KEY_CODE_T] = ADB_KEY_T,
+[Q_KEY_CODE_Y] = ADB_KEY_Y,
+[Q_KEY_CODE_U] = ADB_KEY_U,
+[Q_KEY_CODE_I] = ADB_KEY_I,
+[Q_KEY_CODE_O] = ADB_KEY_O,
+[Q_KEY_CODE_P] = ADB_KEY_P,
+[Q_KEY_CODE_BRACKET_LEFT]  = ADB_KEY_LEFT_BRACKET,
+[Q_KEY_CODE_BRACKET_RIGHT] = ADB_KEY_RIGHT_BRACKET,
+[Q_KEY_CODE_RET]   = ADB_KEY_RETURN,
+[Q_KEY_CODE_A] = ADB_KEY_A,
+[Q_KEY_CODE_S] = ADB_KEY_S,
+[Q_KEY_CODE_D] = ADB_KEY_D,
+[Q_KEY_CODE_F] = ADB_KEY_F,
+[Q_KEY_CODE_G] = ADB_KEY_G,
+[Q_KEY_CODE_H] = ADB_KEY_H,
+[Q_KEY_CODE_J] = ADB_KEY_J,
+[Q_KEY_CODE_K] = ADB_KEY_K,
+[Q_KEY_CODE_L] = ADB_KEY_L,
+[Q_KEY_CODE_SEMICOLON] = ADB_KEY_SEMICOLON,
+[Q_KEY_CODE_APOSTROPHE]= ADB_KEY_APOSTROPHE,
+[Q_KEY_CODE_GRAVE_ACCENT]  = ADB_KEY_GRAVE_ACCENT,
+[Q_KEY_CODE_BACKSLASH] = ADB_KEY_BACKSLASH,
+[Q_KEY_CODE_Z] = ADB_KEY_Z,
+[Q_KEY_CODE_X] = ADB_KEY_X,
+[Q_KEY_CODE_C] = ADB_KEY_C,
+[Q_KEY_CODE_V] = ADB_KEY_V,
+[Q_KEY_CODE_B] = ADB_KEY_B,
+[Q_KEY_CODE_N] = ADB_KEY_N,
+[Q_KEY_CODE_M] = ADB_KEY_M,
+[Q_KEY_CODE_COMMA] = ADB_KEY_COMMA,
+[Q_KEY_CODE_DOT]   = ADB_KEY_PERIOD,
+[Q_KEY_CODE_SLASH] = ADB_KEY_FORWARD_SLASH,
+

[Qemu-devel] [PATCH v3 1/5] adb-keys.h: initial commit

Add the adb-keys.h file. It maps ADB transition key codes with values.

Signed-off-by: John Arbuckle 
---
*v3 changes:
Removed note.

*v2 changes:
Changed order of this patch.

 include/hw/input/adb-keys.h | 142 
 1 file changed, 142 insertions(+)
 create mode 100644 include/hw/input/adb-keys.h

diff --git a/include/hw/input/adb-keys.h b/include/hw/input/adb-keys.h
new file mode 100644
index 000..eb97e1d
--- /dev/null
+++ b/include/hw/input/adb-keys.h
@@ -0,0 +1,142 @@
+/*
+ * QEMU System Emulator
+ *
+ * Copyright (c) 2016 John Arbuckle
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+/*
+ *  adb-keys.h
+ *
+ *  Provides an enum of all the Macintosh keycodes.
+ *  Additional information: 
http://www.archive.org/stream/apple-guide-macintosh-family-hardware/Apple_Guide_to_the_Macintosh_Family_Hardware_2e#page/n345/mode/2up
+ *  page 308
+ */
+
+#ifndef ADB_KEYS_H
+#define ADB_KEYS_H
+
+enum {
+ADB_KEY_A = 0x00,
+ADB_KEY_B = 0x0b,
+ADB_KEY_C = 0x08,
+ADB_KEY_D = 0x02,
+ADB_KEY_E = 0x0e,
+ADB_KEY_F = 0x03,
+ADB_KEY_G = 0x05,
+ADB_KEY_H = 0x04,
+ADB_KEY_I = 0x22,
+ADB_KEY_J = 0x26,
+ADB_KEY_K = 0x28,
+ADB_KEY_L = 0x25,
+ADB_KEY_M = 0x2e,
+ADB_KEY_N = 0x2d,
+ADB_KEY_O = 0x1f,
+ADB_KEY_P = 0x23,
+ADB_KEY_Q = 0x0c,
+ADB_KEY_R = 0x0f,
+ADB_KEY_S = 0x01,
+ADB_KEY_T = 0x11,
+ADB_KEY_U = 0x20,
+ADB_KEY_V = 0x09,
+ADB_KEY_W = 0x0d,
+ADB_KEY_X = 0x07,
+ADB_KEY_Y = 0x10,
+ADB_KEY_Z = 0x06,
+
+ADB_KEY_0 = 0x1d,
+ADB_KEY_1 = 0x12,
+ADB_KEY_2 = 0x13,
+ADB_KEY_3 = 0x14,
+ADB_KEY_4 = 0x15,
+ADB_KEY_5 = 0x17,
+ADB_KEY_6 = 0x16,
+ADB_KEY_7 = 0x1a,
+ADB_KEY_8 = 0x1c,
+ADB_KEY_9 = 0x19,
+
+ADB_KEY_GRAVE_ACCENT = 0x32,
+ADB_KEY_MINUS = 0x1b,
+ADB_KEY_EQUAL = 0x18,
+ADB_KEY_DELETE = 0x33,
+ADB_KEY_CAPS_LOCK = 0x39,
+ADB_KEY_TAB = 0x30,
+ADB_KEY_RETURN = 0x24,
+ADB_KEY_LEFT_BRACKET = 0x21,
+ADB_KEY_RIGHT_BRACKET = 0x1e,
+ADB_KEY_BACKSLASH = 0x2a,
+ADB_KEY_SEMICOLON = 0x29,
+ADB_KEY_APOSTROPHE = 0x27,
+ADB_KEY_COMMA = 0x2b,
+ADB_KEY_PERIOD = 0x2f,
+ADB_KEY_FORWARD_SLASH = 0x2c,
+ADB_KEY_LEFT_SHIFT = 0x38,
+ADB_KEY_RIGHT_SHIFT = 0x7b,
+ADB_KEY_SPACEBAR = 0x31,
+ADB_KEY_LEFT_CONTROL = 0x36,
+ADB_KEY_RIGHT_CONTROL = 0x7d,
+ADB_KEY_LEFT_OPTION = 0x3a,
+ADB_KEY_RIGHT_OPTION = 0x7c,
+ADB_KEY_LEFT_COMMAND = 0x37,
+ADB_KEY_RIGHT_COMMAND = 0x7e,
+
+ADB_KEY_KP_0 = 0x52,
+ADB_KEY_KP_1 = 0x53,
+ADB_KEY_KP_2 = 0x54,
+ADB_KEY_KP_3 = 0x55,
+ADB_KEY_KP_4 = 0x56,
+ADB_KEY_KP_5 = 0x57,
+ADB_KEY_KP_6 = 0x58,
+ADB_KEY_KP_7 = 0x59,
+ADB_KEY_KP_8 = 0x5b,
+ADB_KEY_KP_9 = 0x5c,
+ADB_KEY_KP_PERIOD = 0x41,
+ADB_KEY_KP_ENTER = 0x4c,
+ADB_KEY_KP_PLUS = 0x45,
+ADB_KEY_KP_SUBTRACT = 0x4e,
+ADB_KEY_KP_MULTIPLY = 0x43,
+ADB_KEY_KP_DIVIDE = 0x4b,
+ADB_KEY_KP_EQUAL = 0x51,
+ADB_KEY_KP_CLEAR = 0x47,
+
+ADB_KEY_UP = 0x3e,
+ADB_KEY_DOWN = 0x3d,
+ADB_KEY_LEFT = 0x3b,
+ADB_KEY_RIGHT = 0x3c,
+
+ADB_KEY_HELP = 0x72,
+ADB_KEY_HOME = 0x73,
+ADB_KEY_PAGE_UP = 0x74,
+ADB_KEY_PAGE_DOWN = 0x79,
+ADB_KEY_END = 0x77,
+ADB_KEY_FORWARD_DELETE = 0x75,
+
+ADB_KEY_ESC = 0x35,
+ADB_KEY_F1 = 0x7a,
+ADB_KEY_F2 = 0x78,
+ADB_KEY_F3 = 0x63,
+ADB_KEY_F4 = 0x76,
+ADB_KEY_F5 = 0x60,
+ADB_KEY_F6 = 0x61,
+ADB_KEY_F7 = 0x62,
+ADB_KEY_F8 = 0x64,
+ADB_KEY_F9 = 0x65,
+ADB_KEY_F10 = 0x6d,
+ADB_KEY_F11 = 0x67,
+ADB_KEY_F12 = 0x6f,
+ADB_KEY_F13 = 0x69,
+ADB_KEY_F14 = 0x6b,
+ADB_KEY_F15 = 0x71,
+
+ADB_KEY_VOLUME_UP = 0x48,
+ADB_KEY_VOLUME_DOWN = 0x49,
+ADB_KEY_VOLUME_MUTE = 0x4a,
+/* ADB_KEY_POWER = 0x7f7f, */
+};
+
+/* Could not find the value for this key. */
+/* #define ADB_KEY_EJECT */
+
+#endif /* ADB_KEYS_H */
-- 
2.7.2

[Qemu-devel] [PATCH v3 0/5] ADB improvements

This patch series makes several improvements to the ADB code. To test this code,
please implement the patches in the order below.

John Arbuckle (5):
  adb-keys.h: initial commit
  adb.c: add support for QKeyCode
  adb.c: correct several key assignments
  adb.c: prevent NO_KEY value from going to guest
  adb.c: add power key support

 hw/input/adb.c  | 239 +++-
 include/hw/input/adb-keys.h | 142 ++
 2 files changed, 335 insertions(+), 46 deletions(-)
 create mode 100644 include/hw/input/adb-keys.h

-- 
*v3 changes:
Removed note from adb-keys.h.
Created new patch that adds QKeyCode support, but doesn't change any mappings.
Made patch that does correct key mappings.
Made new patch the prevents NO_KEY value from going to guest.
Add suggested comments to power key code.
Moved else statement.

*v2 changes:
Swapped order of adb-keys.h and "add support for QKeyCode" patches.

2.7.2

Re: [Qemu-devel] [patch v6 11/12] vfio: register aer resume notification handler for aer resume

2016-05-05 Thread Chen Fan



On 04/26/2016 10:48 PM, Alex Williamson wrote:

On Tue, 26 Apr 2016 11:39:02 +0800
Chen Fan  wrote:


On 04/14/2016 09:02 AM, Chen Fan wrote:

On 04/12/2016 05:38 AM, Alex Williamson wrote:

On Tue, 5 Apr 2016 19:42:02 +0800
Cao jin  wrote:
  

From: Chen Fan

for supporting aer recovery, host and guest would run the same aer
recovery code, that would do the secondary bus reset if the error
is fatal, the aer recovery process:
1. error_detected
2. reset_link (if fatal)
3. slot_reset/mmio_enabled
4. resume

it indicates that host will do secondary bus reset to reset
the physical devices under bus in step 2, that would cause
devices in D3 status in a short time. but in qemu, we register
an error detected handler, that would be invoked as host broadcasts
the error-detected event in step 1, in order to avoid guest do
reset_link when host do reset_link simultaneously. it may cause
fatal error. we introduce a resmue notifier to assure host reset
completely. then do guest aer injection.

Why is it safe to continue running the VM between the error detected
notification and the resume notification?  We're just pushing back the
point at which we inject the AER into the guest, potentially negating
any benefit by allowing the VM to consume bad data.  Shouldn't we
instead be immediately notifying the VM on error detected, but stalling
any access to the device until resume is signaled?  How do we know that
resume will ever be signaled?  We have both the problem that we may be
running on an older kernel that won't support a resume notification and
the problem that seeing a resume notification depends on the host being
able to successfully complete a link reset after fatal error. We can
detect support for resume notification, but we still need a strategy
for never receiving it.  Thanks,

That's make sense, but I haven't came up with a good idea. do you have
any idea, Alex?

I don't know that there are any good solutions here.  We need to
respond to the current error notifier interrupt and not regress from
our support there.  I think that means that if we want to switch from a
simple halt-on-error to a mechanism for the guest to handle recovery,
we need to disable access to the device between being notified that the
error occurred and being notified to resume.  We can do that by
disabling mmaps to the device and preventing access via the slow path
handlers.  I don't know what the best solution is for preventing access,
do we block and pause the VM or do we drop writes and return -1 for
reads, that's something that needs to be determined.  We also need to
inject the AER into the VM at the point we're notified of an error
because the VM needs to know as soon as possible to stop using the
device or trusting any data from it.  The next coordination point would
be something like the resume notifier that you've added and there are
numerous questions around the interaction of that with the guest
handling.  Clearly we can't do a guest directed bus reset until we get
the resume notifier, so do we block that execution path in QEMU until
the resume notification is received?  What happens if we don't get that
notification?  Is there any way that we can rely on the host having
done a bus reset to the point where we don't need to act on the guest
directed reset?  These are all things that need to be figured out.
Thanks,

Maybe we can simply pause the vcpu running and avoid the VM to
access the device. and add two flags in VFIO_DEVICE_GET_INFO to query
whether the vfio pci driver has a resume notifier,
if it does not have resume notifier flags, we can directly fail to boot 
up VM

as with aer enabled. otherwise, we should wait for resume notifier coming to
restart the cpu. about the problem of the reduplicated bus reset by host 
and guest,
I think qemu can according to the error is fatal or non-fatal to decide 
whether need
to do a bus reset on guest, I think it's not critical and could be 
resolved later.


Thanks,
Chen



Alex


.

Re: [Qemu-devel] [PATCH v1 0/6] A migration performance testing framework

2016-05-05 Thread Li, Liang Z

> This series of patches provides a framework for testing migration
> performance characteristics. The motivating factor for this is planning that 
> is
> underway in OpenStack wrt making use of QEMU migration features such as
> compression, auto-converge and post-copy. The primary aim for OpenStack
> is to have Nova autonomously manage migration features & tunables to
> maximise chances that migration will complete. The problem faced is figuring
> out just which QEMU migration features are "best" suited to our needs. This
> means we want data on how well they are able to ensure completion of a
> migration, against the host resources used and the impact on the guest
> workload performance.
> 
> The test framework produced here takes a pathelogical guest workload
> (every CPU just burning 100% of time xor'ing every byte of guest memory
> with random data). This is quite a pessimistic test because most guest
> workloads are not giong to be this heavy on memory writes, and their data
> won't be uniformly random and so will be able to compress better than this
> test does.
> 


Wonderful test report!

> With this worst case guest, I have produced a set of tests using UNIX socket,
> TCP localhost, TCP remote and RDMA remote socket transports, with both a
> 1 GB RAM + 1 CPU guest and a 8 GB RAM + 4 CPU guest.
> 
> The TCP/RDMA remote host tests were run over a 10-GiG-E network
> interface.
> 
> I have put the results online to view here:
> 
>   https://berrange.fedorapeople.org/qemu-mig-test-2016-05-05/
> 
> The charts here are showing two core sets of data:
> 
>  - The guest CPU performance. The left axis is showing the time in
> milliseconds
>required to xor 1 GB of memory. This is shown per-guest CPU and
> combined all
>CPUs.
> 
>  - The host CPU utilization. The right axis is showing the overall QEMU 
> process
>CPU utilization, and the per-VCPU utilization.
> 
> Note that the charts are interactive - you can turn on/off each plot line and
> zoom in by selecting regions on the chart.
> 
> 
> Some interesting things that I have observed with this
> 
>  - At the start of each iteration of migration there is a distinct drop in
>guest CPU performance as shown by a spike in the guest CPU time lines.
>Performance would drop from 200ms/GB to 400ms/GB. Presumably this is
>related to QEMU recalculating the dirty bitmap for the guest RAM. See
>the spikes in the green line in:
> 
> https://berrange.fedorapeople.org/qemu-mig-test-2016-05-05/tcp-
> remote-1gb-1cpu/post-copy-bandwidth/post-copy-bw-1gbs.html
> 
>  - For the larger sized guests, the auto-converge code has to throttle the
>guest to as much as 90% or more before it is able to meet the 500ms max
>downtime value
> 
> https://berrange.fedorapeople.org/qemu-mig-test-2016-05-05/tcp-
> remote-1gb-1cpu/auto-converge-bandwidth/auto-converge-bw-1gbs.html
> 
>Even then I often saw tests aborting as they hit the max number of
>iterations I permitted (30 iters max)
> 
> https://berrange.fedorapeople.org/qemu-mig-test-2016-05-05/tcp-
> remote-8gb-4cpu/auto-converge-bandwidth/auto-converge-bw-10gbs.html
> 
>  - MT compression is actively harmful to chances of successful migration
> when
>the guest RAM is not compression friendly. My work load is worst case
> since
>it is splattering RAM with totally random bytes. The MT compression is
>dramatically increasing the time for each iteration as we bottleneck on CPU
>compression speed, leaving the network largely idle. This causes migration
>which would have completed without compression, to fail. It also burns
> huge
>amounts of host CPU time
> 
>  https://berrange.fedorapeople.org/qemu-mig-test-2016-05-05/tcp-
> remote-1gb-1cpu/compr-mt/compr-mt-threads-4.html
> 
>  - XBZRLE compression did not have as much of a CPU peformance penalty on
> the
>host as MT comprssion, but also did not help migration to actually 
> complete.
>Again this is largely due to the workload being the worst case scenario 
> with
>random bytes. The downside is obviously the potentially significant
> memory
>overhead on the host due to the cache sizing
> 
> https://berrange.fedorapeople.org/qemu-mig-test-2016-05-05/tcp-
> remote-1gb-1cpu/compr-xbzrle/compr-xbzrle-cache-50.html
> 
> 
>  - Post-copy, by its very nature, obviously ensured that the migraton would
>complete. While post-copy was running in pre-copy mode there was a
> somewhat
>chaotic small impact on guest CPU performance, causing performance to
>periodically oscillate between 400ms/GB and 800ms/GB. This is less than
>the impact at the start of each migration iteration which was 1000ms/GB
>in this test. There was also a massive penalty at time of switchover from
>pre to post copy, as to be expected. The migration completed in post-copy
>phase quite quickly though. For this workload, number of iterations in
>pre-copy mode before switching to post-copy did not

Re: [Qemu-devel] [PATCH] rbd:change error_setg() to error_setg_errno()

2016-05-05 Thread Fam Zheng

On Thu, 05/05 15:06, Jeff Cody wrote:
> On Thu, May 05, 2016 at 06:45:15PM +0100, Stefan Hajnoczi wrote:
> > On Mon, May 02, 2016 at 09:55:17PM +0530, Vikhyat Umrao wrote:
> > > From 1c63c246f47a1a65d8740d7ce3725fe3820c0a37 Mon Sep 17 00:00:00 2001
> > > From: Vikhyat Umrao 
> > > Date: Mon, 2 May 2016 21:47:31 +0530
> > > Subject: [PATCH] rbd:change error_setg() to error_setg_errno()
> > > 
> > > Ceph RBD block driver does not use error_setg_errno() where
> > > it is possible to use. This patch replaces error_setg()
> > > from error_setg_errno().
> > > 
> > > Signed-off-by: Vikhyat Umrao 
> > > ---
> > >  block/rbd.c | 37 ++---
> > >  1 file changed, 22 insertions(+), 15 deletions(-)
> > > 
> > > diff --git a/block/rbd.c b/block/rbd.c
> > > index 5bc5b32..c286b32 100644
> > > --- a/block/rbd.c
> > > +++ b/block/rbd.c
> > > @@ -290,7 +290,7 @@ static int qemu_rbd_set_conf(rados_t cluster, const
> > > char *conf,
> > >  if (only_read_conf_file) {
> > >  ret = rados_conf_read_file(cluster, value);
> > >  if (ret < 0) {
> > > -error_setg(errp, "error reading conf file %s", 
> > > value);
> > > +error_setg_errno(errp, -ret, "error reading conf file
> > > %s", value);
> > 
> > Please use scripts/checkpatch.pl to scan patches for coding style
> > violations.  This line exceeds the maximum line length.
> 
> 
> Also, if you can use something like git send-email, it formats the patch
> emails much nicer -- which makes it easier to review & apply.

Also please use ./scripts/get_maintainer.pl to get a more complete recipient
list (for example it will report qemu-bl...@nongnu.org and other block
layer maintainers).

Fam

Re: [Qemu-devel] [PATCH qemu v16 02/19] memory: Call region_del() callbacks on memory listener unregistering

2016-05-05 Thread Alex Williamson

On Wed,  4 May 2016 16:52:14 +1000
Alexey Kardashevskiy  wrote:

> When a new memory listener is registered, listener_add_address_space()
> is called and which in turn calls region_add() callbacks of memory regions.
> However when unregistering the memory listener, it is just removed from
> the listening chain and no region_del() is called.
> 
> This adds listener_del_address_space() and uses it in
> memory_listener_unregister(). listener_add_address_space() was used as
> a template with the following changes:
> s/log_global_start/log_global_stop/
> s/log_start/log_stop/
> s/region_add/region_del/
> 
> This will allow the following patches to add/remove DMA windows
> dynamically from VFIO's PCI address space's region_add()/region_del().

Following patch 1 comments, it would be a bug if the kernel actually
needed this to do cleanup, we must release everything if QEMU gets shot
with a SIGKILL anyway.  So what does this cleanup facilitate in QEMU?
Having QEMU trigger an unmap for each region_del is not going to be as
efficient as just dropping the container and letting the kernel handle
the cleanup all in one go.  Thanks,

Alex

> 
> Signed-off-by: Alexey Kardashevskiy 
> ---
>  memory.c | 48 
>  1 file changed, 48 insertions(+)
> 
> diff --git a/memory.c b/memory.c
> index f76f85d..f762a34 100644
> --- a/memory.c
> +++ b/memory.c
> @@ -2185,6 +2185,49 @@ static void listener_add_address_space(MemoryListener 
> *listener,
>  flatview_unref(view);
>  }
>  
> +static void listener_del_address_space(MemoryListener *listener,
> +   AddressSpace *as)
> +{
> +FlatView *view;
> +FlatRange *fr;
> +
> +if (listener->address_space_filter
> +&& listener->address_space_filter != as) {
> +return;
> +}
> +
> +if (listener->begin) {
> +listener->begin(listener);
> +}
> +if (global_dirty_log) {
> +if (listener->log_global_stop) {
> +listener->log_global_stop(listener);
> +}
> +}
> +
> +view = address_space_get_flatview(as);
> +FOR_EACH_FLAT_RANGE(fr, view) {
> +MemoryRegionSection section = {
> +.mr = fr->mr,
> +.address_space = as,
> +.offset_within_region = fr->offset_in_region,
> +.size = fr->addr.size,
> +.offset_within_address_space = int128_get64(fr->addr.start),
> +.readonly = fr->readonly,
> +};
> +if (fr->dirty_log_mask && listener->log_stop) {
> +listener->log_stop(listener, , 0, fr->dirty_log_mask);
> +}
> +if (listener->region_del) {
> +listener->region_del(listener, );
> +}
> +}
> +if (listener->commit) {
> +listener->commit(listener);
> +}
> +flatview_unref(view);
> +}
> +
>  void memory_listener_register(MemoryListener *listener, AddressSpace *filter)
>  {
>  MemoryListener *other = NULL;
> @@ -2211,6 +2254,11 @@ void memory_listener_register(MemoryListener 
> *listener, AddressSpace *filter)
>  
>  void memory_listener_unregister(MemoryListener *listener)
>  {
> +AddressSpace *as;
> +
> +QTAILQ_FOREACH(as, _spaces, address_spaces_link) {
> +listener_del_address_space(listener, as);
> +}
>  QTAILQ_REMOVE(_listeners, listener, link);
>  }
>

Re: [Qemu-devel] [PATCH qemu v16 01/19] vfio: Delay DMA address space listener release

2016-05-05 Thread Alex Williamson

On Wed,  4 May 2016 16:52:13 +1000
Alexey Kardashevskiy  wrote:

> This postpones VFIO container deinitialization to let region_del()
> callbacks (called via vfio_listener_release) do proper clean up
> while the group is still attached to the container.

Any mappings within the container should clean themselves up when the
container is deprivleged by removing the last group in the kernel.  Is
the issue that that doesn't happen, which would be a spapr vfio kernel
bug, or that our QEMU side structures get all out of whack if we let
that happen?

> 
> Signed-off-by: Alexey Kardashevskiy 
> ---
>  hw/vfio/common.c | 22 +++---
>  1 file changed, 15 insertions(+), 7 deletions(-)
> 
> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> index fe5ec6a..0b40262 100644
> --- a/hw/vfio/common.c
> +++ b/hw/vfio/common.c
> @@ -921,23 +921,31 @@ static void vfio_disconnect_container(VFIOGroup *group)
>  {
>  VFIOContainer *container = group->container;
>  
> -if (ioctl(group->fd, VFIO_GROUP_UNSET_CONTAINER, >fd)) {
> -error_report("vfio: error disconnecting group %d from container",
> - group->groupid);
> -}
> -
>  QLIST_REMOVE(group, container_next);
> +
> +if (QLIST_EMPTY(>group_list)) {
> +VFIOGuestIOMMU *giommu;
> +
> +vfio_listener_release(container);
> +
> +QLIST_FOREACH(giommu, >giommu_list, giommu_next) {
> +memory_region_unregister_iommu_notifier(>n);
> +}
> +}
> +
>  group->container = NULL;
> +if (ioctl(group->fd, VFIO_GROUP_UNSET_CONTAINER, >fd)) {
> +error_report("vfio: error disconnecting group %d from container",
> + group->groupid);
> +}
>  
>  if (QLIST_EMPTY(>group_list)) {
>  VFIOAddressSpace *space = container->space;
>  VFIOGuestIOMMU *giommu, *tmp;
>  
> -vfio_listener_release(container);
>  QLIST_REMOVE(container, next);
>  
>  QLIST_FOREACH_SAFE(giommu, >giommu_list, giommu_next, 
> tmp) {
> -memory_region_unregister_iommu_notifier(>n);
>  QLIST_REMOVE(giommu, giommu_next);
>  g_free(giommu);
>  }

I'm not spotting why this is a 2-pass process vs simply moving the
existing QLIST_EMPTY cleanup above the ioctl.  Thanks,

Alex

Re: [Qemu-devel] [PATCH v4 13/14] tb hash: track translated blocks with qht

2016-05-05 Thread Emilio G. Cota

On Wed, May 04, 2016 at 07:22:16 -1000, Richard Henderson wrote:
> On 05/04/2016 05:36 AM, Emilio G. Cota wrote:
> >BTW in the last couple of days I did some more work beyond v4:
> >
> >- Added a benchmark (not a correctness test) to measure parallel
> >  performance of QHT (recall that test/qht-test is sequential.)
> >
> >- Added support for concurrent insertions as long as they're not to the
> >  same bucket, thus getting rid of the "external lock" requirement.
> >  This is not really needed for MTTCG because all insertions are supposed
> >  to be serialized by tb_lock; however, the feature (1) has no negative
> >  performance impact (just adds an unlikely() branch after lock acquisition
> >  on insertions/removals) and (2) could be useful for future (parallel)
> >  users of qht.
> >
> >Should I send this work as follow-up patches to v4 to ease review, or
> >should I send a v5 with them merged in?
> 
> Let's handle these as follow-on, since we've already got multiple R-b tags 
> for v4.

OK, will submit the modifications next week.

BTW Benchmarking with the new test is giving me some interesting
results.

For instance, I'm measuring a 5% lookup latency reduction
(single-threaded throughput goes from 45.84 to 48.41 M lookups/s)
if I remove support in qht for MRU promotions.

This is tempting me to just kill the feature, since resizing
works very well.  And it would save ~90 lines of code. Is anyone
against this?

Thanks,

Emilio

[Qemu-devel] [Bug 1336794] Re: 9pfs does not honor open file handles on unlinked files

2016-05-05 Thread Server Angels

** Also affects: ubuntu
   Importance: Undecided
   Status: New

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1336794

Title:
  9pfs does not honor open file handles on unlinked files

Status in QEMU:
  New
Status in Ubuntu:
  New

Bug description:
  This was originally filed over here:
  https://bugzilla.redhat.com/show_bug.cgi?id=1114221

  The open-unlink-fstat idiom used in some places to create an anonymous
  private temporary file does not work in a QEMU guest over a virtio-9p
  filesystem.

  Version-Release number of selected component (if applicable):

  qemu-kvm-1.6.2-6.fc20.x86_64
  qemu-system-x86-1.6.2-6.fc20.x86_64
  (those are fedora RPMs)

  How reproducible:

  Always. See this example C program:

  https://bugzilla.redhat.com/attachment.cgi?id=913069

  Steps to Reproduce:
  1. Export a filesystem with virt-manager for the guest.
(type: mount, driver: default, mode: passthrough)
  2. Start guest and mount that filesystem
(mount -t 9p -o trans=virtio,version=9p2000.L  ...)
  3. Run a program that uses open-unlink-fstat
(in my case it was trying to compile Perl 5.20)

  Actual results:

  fstat fails:

  open("/home/tst/filename", O_RDWR|O_CREAT|O_EXCL, 0600) = 3
  unlink("/home/tst/filename")= 0
  fstat(3, 0x23aa1a8) = -1 ENOENT (No such file or 
directory)
  close(3)

  Expected results:

  open("/home/tst/filename", O_RDWR|O_CREAT|O_EXCL, 0600) = 3
  unlink("/home/tst/filename")= 0
  fstat(3, {st_mode=S_IFREG|0600, st_size=0, ...}) = 0
  fcntl(3, F_SETFD, FD_CLOEXEC)   = 0
  close(3) 

  Additional info:

  There was a patch put into the kernel back in '07 to handle this very
  problem for other filesystems; maybe its helpful:

http://lwn.net/Articles/251228/

  There is also a thread on LKML from last December specifically about
  this very problem:

https://lkml.org/lkml/2013/12/31/163

  There was a discussion on the QEMU list back in '11 that doesn't seem
  to have come to a conclusion, but did provide the test program that
  i've attached to this report:

http://marc.info/?l=qemu-devel=130443605720648=2

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1336794/+subscriptions

[Qemu-devel] [Bug 1336794] Re: 9pfs does not honor open file handles on unlinked files

2016-05-05 Thread Server Angels

I would appreciate this patch being committed as I *think* it's
affecting a system i'm building now.

I have a backup host with 2 VMs. For business reasons they need to be
network isolated from each other and the host, so each is passed through
a physical NIC. Each VM does need access to a variable size datastore on
the host so I am using virtfs /9p to expose a mountpoint to each VM.

The VMs each backup servers to  their respective mountpoint to this
virtfs mount using rsync. Just backing up one server with ~4000 files
and 3 large sparse VM images saw the open files on the backup host
increase to over *80* and the rsync progressively get slower.
Shutting down these VMs then takes hours as it can't unlock the files it
has open on the backup host.

I understand rsync does use open-unlink-fstat extensively, hence why I
think this is the issue.

This is a deal breaker for any production use of virtfs. Does anybody
know if this is fixed in other builds of qemu?

tl;dr - to recreate this on 16.04 - create a VM with a virtfs/9p mount
to the host. Do lots of rsyncs to this mount within the VM, watch 'lsof
| wc -l' go higher and higher on the host.

Thanks,

/Sean

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1336794

Title:
  9pfs does not honor open file handles on unlinked files

Status in QEMU:
  New

Bug description:
  This was originally filed over here:
  https://bugzilla.redhat.com/show_bug.cgi?id=1114221

  The open-unlink-fstat idiom used in some places to create an anonymous
  private temporary file does not work in a QEMU guest over a virtio-9p
  filesystem.

  Version-Release number of selected component (if applicable):

  qemu-kvm-1.6.2-6.fc20.x86_64
  qemu-system-x86-1.6.2-6.fc20.x86_64
  (those are fedora RPMs)

  How reproducible:

  Always. See this example C program:

  https://bugzilla.redhat.com/attachment.cgi?id=913069

  Steps to Reproduce:
  1. Export a filesystem with virt-manager for the guest.
(type: mount, driver: default, mode: passthrough)
  2. Start guest and mount that filesystem
(mount -t 9p -o trans=virtio,version=9p2000.L  ...)
  3. Run a program that uses open-unlink-fstat
(in my case it was trying to compile Perl 5.20)

  Actual results:

  fstat fails:

  open("/home/tst/filename", O_RDWR|O_CREAT|O_EXCL, 0600) = 3
  unlink("/home/tst/filename")= 0
  fstat(3, 0x23aa1a8) = -1 ENOENT (No such file or 
directory)
  close(3)

  Expected results:

  open("/home/tst/filename", O_RDWR|O_CREAT|O_EXCL, 0600) = 3
  unlink("/home/tst/filename")= 0
  fstat(3, {st_mode=S_IFREG|0600, st_size=0, ...}) = 0
  fcntl(3, F_SETFD, FD_CLOEXEC)   = 0
  close(3) 

  Additional info:

  There was a patch put into the kernel back in '07 to handle this very
  problem for other filesystems; maybe its helpful:

http://lwn.net/Articles/251228/

  There is also a thread on LKML from last December specifically about
  this very problem:

https://lkml.org/lkml/2013/12/31/163

  There was a discussion on the QEMU list back in '11 that doesn't seem
  to have come to a conclusion, but did provide the test program that
  i've attached to this report:

http://marc.info/?l=qemu-devel=130443605720648=2

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1336794/+subscriptions

Re: [Qemu-devel] [PATCH RFC 0/8] basic vfio-ccw infrastructure

2016-05-05 Thread Neo Jia

On Thu, May 05, 2016 at 01:19:45PM -0600, Alex Williamson wrote:
> [cc +Intel,NVIDIA]
> 
> On Thu, 5 May 2016 18:29:08 +0800
> Dong Jia  wrote:
> 
> > On Wed, 4 May 2016 13:26:53 -0600
> > Alex Williamson  wrote:
> > 
> > > On Wed, 4 May 2016 17:26:29 +0800
> > > Dong Jia  wrote:
> > >   
> > > > On Fri, 29 Apr 2016 11:17:35 -0600
> > > > Alex Williamson  wrote:
> > > > 
> > > > Dear Alex:
> > > > 
> > > > Thanks for the comments.
> > > > 
> > > > [...]
> > > >   
> > > > > > 
> > > > > > The user of vfio-ccw is not limited to Qemu, while Qemu is 
> > > > > > definitely a
> > > > > > good example to get understand how these patches work. Here is a 
> > > > > > little
> > > > > > bit more detail how an I/O request triggered by the Qemu guest will 
> > > > > > be
> > > > > > handled (without error handling).
> > > > > > 
> > > > > > Explanation:
> > > > > > Q1-Q4: Qemu side process.
> > > > > > K1-K6: Kernel side process.
> > > > > > 
> > > > > > Q1. Intercept a ssch instruction.
> > > > > > Q2. Translate the guest ccw program to a user space ccw program
> > > > > > (u_ccwchain).
> > > > > 
> > > > > Is this replacing guest physical address in the program with QEMU
> > > > > virtual addresses?
> > > > Yes.
> > > >   
> > > > > 
> > > > > > Q3. Call VFIO_DEVICE_CCW_CMD_REQUEST (u_ccwchain, orb, irb).
> > > > > > K1. Copy from u_ccwchain to kernel (k_ccwchain).
> > > > > > K2. Translate the user space ccw program to a kernel space ccw
> > > > > > program, which becomes runnable for a real device.
> > > > > 
> > > > > And here we translate and likely pin QEMU virtual address to physical
> > > > > addresses to further modify the program sent into the channel?
> > > > Yes. Exactly.
> > > >   
> > > > > 
> > > > > > K3. With the necessary information contained in the orb passed 
> > > > > > in
> > > > > > by Qemu, issue the k_ccwchain to the device, and wait event 
> > > > > > q
> > > > > > for the I/O result.
> > > > > > K4. Interrupt handler gets the I/O result, and wakes up the 
> > > > > > wait q.
> > > > > > K5. CMD_REQUEST ioctl gets the I/O result, and uses the result 
> > > > > > to
> > > > > > update the user space irb.
> > > > > > K6. Copy irb and scsw back to user space.
> > > > > > Q4. Update the irb for the guest.
> > > > > 
> > > > > If the answers to my questions above are both yes,
> > > > Yes, they are.
> > > >   
> > > > > then this is really a mediated interface, not a direct assignment.
> > > > Right. This is true.
> > > >   
> > > > > We don't need an iommu
> > > > > because we're policing and translating the program for the device
> > > > > before it gets sent to hardware.  I think there are better ways than
> > > > > noiommu to handle such devices perhaps even with better performance
> > > > > than this two-stage translation.  In fact, I think the solution we 
> > > > > plan
> > > > > to implement for vGPU support would work here.
> > > > > 
> > > > > Like your device, a vGPU is mediated, we don't have IOMMU level
> > > > > translation or isolation since a vGPU is largely a software construct,
> > > > > but we do have software policing and translating how the GPU is
> > > > > programmed.  To do this we're creating a type1 compatible vfio iommu
> > > > > backend that uses the existing map and unmap ioctls, but rather than
> > > > > programming them into an IOMMU for a device, it simply stores the
> > > > > translations for use by later requests.  This means that a device
> > > > > programmed in a VM with guest physical addresses can have the
> > > > > vfio kernel convert that address to process virtual address, pin the
> > > > > page and program the hardware with the host physical address in one
> > > > > step.
> > > > I've read through the mail threads those discuss how to add vGPU
> > > > support in VFIO. I'm afraid that proposal could not be simply addressed
> > > > to this case, especially if we want to make the vfio api completely
> > > > compatible with the existing usage.
> > > > 
> > > > AFAIU, a PCI device (or a vGPU device) uses a dedicated, exclusive and
> > > > fixed range of address in the memory space for DMA operations. Any
> > > > address inside this range will not be used for other purpose. Thus we
> > > > can add memory listener on this range, and pin the pages for further
> > > > use (DMA operation). And we can keep the pages pinned during the life
> > > > cycle of the VM (not quite accurate, or I should say 'the target
> > > > device').  
> > > 
> > > That's not entirely accurate.  Ignoring a guest IOMMU, current device
> > > assignment pins all of guest memory, not just a dedicated, exclusive
> > > range of it, in order to map it through the hardware IOMMU.  That gives
> > > the guest the ability to transparently perform DMA with the device
> > > since the

Re: [Qemu-devel] [RFC PATCH v3 2/3] VFIO driver for vGPU device

2016-05-05 Thread Neo Jia

On Thu, May 05, 2016 at 09:24:26AM +, Tian, Kevin wrote:
> > From: Alex Williamson
> > Sent: Thursday, May 05, 2016 1:06 AM
> > > > > +
> > > > > +static int vgpu_dev_mmio_fault(struct vm_area_struct *vma, struct 
> > > > > vm_fault
> > *vmf)
> > > > > +{
> > > > > + int ret = 0;
> > > > > + struct vfio_vgpu_device *vdev = vma->vm_private_data;
> > > > > + struct vgpu_device *vgpu_dev;
> > > > > + struct gpu_device *gpu_dev;
> > > > > + u64 virtaddr = (u64)vmf->virtual_address;
> > > > > + u64 offset, phyaddr;
> > > > > + unsigned long req_size, pgoff;
> > > > > + pgprot_t pg_prot;
> > > > > +
> > > > > + if (!vdev && !vdev->vgpu_dev)
> > > > > + return -EINVAL;
> > > > > +
> > > > > + vgpu_dev = vdev->vgpu_dev;
> > > > > + gpu_dev  = vgpu_dev->gpu_dev;
> > > > > +
> > > > > + offset   = vma->vm_pgoff << PAGE_SHIFT;
> > > > > + phyaddr  = virtaddr - vma->vm_start + offset;
> > > > > + pgoff= phyaddr >> PAGE_SHIFT;
> > > > > + req_size = vma->vm_end - virtaddr;
> > > > > + pg_prot  = vma->vm_page_prot;
> > > > > +
> > > > > + if (gpu_dev->ops->validate_map_request) {
> > > > > + ret = gpu_dev->ops->validate_map_request(vgpu_dev, 
> > > > > virtaddr,
> > ,
> > > > > +  _size, 
> > > > > _prot);
> > > > > + if (ret)
> > > > > + return ret;
> > > > > +
> > > > > + if (!req_size)
> > > > > + return -EINVAL;
> > > > > + }
> > > > > +
> > > > > + ret = remap_pfn_range(vma, virtaddr, pgoff, req_size, pg_prot);
> > > >
> > > > So not supporting validate_map_request() means that the user can
> > > > directly mmap BARs of the host GPU and as shown below, we assume a 1:1
> > > > mapping of vGPU BAR to host GPU BAR.  Is that ever valid in a vGPU
> > > > scenario or should this callback be required?  It's not clear to me how
> > > > the vendor driver determines what this maps to, do they compare it to
> > > > the physical device's own BAR addresses?
> > >
> > > I didn't quite understand too. Based on earlier discussion, do we need
> > > something like this, or could achieve the purpose just by leveraging
> > > recent sparse mmap support?
> > 
> > The reason for faulting in the mmio space, if I recall correctly, is to
> > enable an ordering where the user driver (QEMU) can mmap regions of the
> > device prior to resources being allocated on the host GPU to handle
> > them.  Sparse mmap only partially handles that, it's not dynamic.  With
> > this faulting mechanism, the host GPU doesn't need to commit resources
> > until the mmap is actually accessed.  Thanks,
> > 
> > Alex
> 
> Neo/Kirti, any specific example how above exactly works? I can see
> difference from sparse mmap based on Alex's explanation, but still
> cannot map the 1st sentence to a real scenario clearly. Now our side
> doesn't use such faulting-based method. So I'd like to understand it
> clearly and then see any value to do same thing for Intel GPU.

Hi Kevin,

The short answer is CPU access to GPU resources via MMIO region.

Thanks,
Neo

> 
> Thanks
> Kevin

Re: [Qemu-devel] [PATCH RFC 0/8] basic vfio-ccw infrastructure

2016-05-05 Thread Alex Williamson

[cc +Intel,NVIDIA]

On Thu, 5 May 2016 18:29:08 +0800
Dong Jia  wrote:

> On Wed, 4 May 2016 13:26:53 -0600
> Alex Williamson  wrote:
> 
> > On Wed, 4 May 2016 17:26:29 +0800
> > Dong Jia  wrote:
> >   
> > > On Fri, 29 Apr 2016 11:17:35 -0600
> > > Alex Williamson  wrote:
> > > 
> > > Dear Alex:
> > > 
> > > Thanks for the comments.
> > > 
> > > [...]
> > >   
> > > > > 
> > > > > The user of vfio-ccw is not limited to Qemu, while Qemu is definitely 
> > > > > a
> > > > > good example to get understand how these patches work. Here is a 
> > > > > little
> > > > > bit more detail how an I/O request triggered by the Qemu guest will be
> > > > > handled (without error handling).
> > > > > 
> > > > > Explanation:
> > > > > Q1-Q4: Qemu side process.
> > > > > K1-K6: Kernel side process.
> > > > > 
> > > > > Q1. Intercept a ssch instruction.
> > > > > Q2. Translate the guest ccw program to a user space ccw program
> > > > > (u_ccwchain).
> > > > 
> > > > Is this replacing guest physical address in the program with QEMU
> > > > virtual addresses?
> > > Yes.
> > >   
> > > > 
> > > > > Q3. Call VFIO_DEVICE_CCW_CMD_REQUEST (u_ccwchain, orb, irb).
> > > > > K1. Copy from u_ccwchain to kernel (k_ccwchain).
> > > > > K2. Translate the user space ccw program to a kernel space ccw
> > > > > program, which becomes runnable for a real device.
> > > > 
> > > > And here we translate and likely pin QEMU virtual address to physical
> > > > addresses to further modify the program sent into the channel?
> > > Yes. Exactly.
> > >   
> > > > 
> > > > > K3. With the necessary information contained in the orb passed in
> > > > > by Qemu, issue the k_ccwchain to the device, and wait event q
> > > > > for the I/O result.
> > > > > K4. Interrupt handler gets the I/O result, and wakes up the wait 
> > > > > q.
> > > > > K5. CMD_REQUEST ioctl gets the I/O result, and uses the result to
> > > > > update the user space irb.
> > > > > K6. Copy irb and scsw back to user space.
> > > > > Q4. Update the irb for the guest.
> > > > 
> > > > If the answers to my questions above are both yes,
> > > Yes, they are.
> > >   
> > > > then this is really a mediated interface, not a direct assignment.
> > > Right. This is true.
> > >   
> > > > We don't need an iommu
> > > > because we're policing and translating the program for the device
> > > > before it gets sent to hardware.  I think there are better ways than
> > > > noiommu to handle such devices perhaps even with better performance
> > > > than this two-stage translation.  In fact, I think the solution we plan
> > > > to implement for vGPU support would work here.
> > > > 
> > > > Like your device, a vGPU is mediated, we don't have IOMMU level
> > > > translation or isolation since a vGPU is largely a software construct,
> > > > but we do have software policing and translating how the GPU is
> > > > programmed.  To do this we're creating a type1 compatible vfio iommu
> > > > backend that uses the existing map and unmap ioctls, but rather than
> > > > programming them into an IOMMU for a device, it simply stores the
> > > > translations for use by later requests.  This means that a device
> > > > programmed in a VM with guest physical addresses can have the
> > > > vfio kernel convert that address to process virtual address, pin the
> > > > page and program the hardware with the host physical address in one
> > > > step.
> > > I've read through the mail threads those discuss how to add vGPU
> > > support in VFIO. I'm afraid that proposal could not be simply addressed
> > > to this case, especially if we want to make the vfio api completely
> > > compatible with the existing usage.
> > > 
> > > AFAIU, a PCI device (or a vGPU device) uses a dedicated, exclusive and
> > > fixed range of address in the memory space for DMA operations. Any
> > > address inside this range will not be used for other purpose. Thus we
> > > can add memory listener on this range, and pin the pages for further
> > > use (DMA operation). And we can keep the pages pinned during the life
> > > cycle of the VM (not quite accurate, or I should say 'the target
> > > device').  
> > 
> > That's not entirely accurate.  Ignoring a guest IOMMU, current device
> > assignment pins all of guest memory, not just a dedicated, exclusive
> > range of it, in order to map it through the hardware IOMMU.  That gives
> > the guest the ability to transparently perform DMA with the device
> > since the IOMMU maps the guest physical to host physical translations.  
> Thanks for this explanation.
> 
> I noticed in the Qemu part, when we tried to introduce vfio-pci to the
> s390 architecture, we set the IOMMU width by calling
> memory_region_add_subregion before initializing the address_space of
> the PCI device, which will be registered

Re: [Qemu-devel] [PATCH] rbd:change error_setg() to error_setg_errno()

2016-05-05 Thread Jeff Cody

On Thu, May 05, 2016 at 06:45:15PM +0100, Stefan Hajnoczi wrote:
> On Mon, May 02, 2016 at 09:55:17PM +0530, Vikhyat Umrao wrote:
> > From 1c63c246f47a1a65d8740d7ce3725fe3820c0a37 Mon Sep 17 00:00:00 2001
> > From: Vikhyat Umrao 
> > Date: Mon, 2 May 2016 21:47:31 +0530
> > Subject: [PATCH] rbd:change error_setg() to error_setg_errno()
> > 
> > Ceph RBD block driver does not use error_setg_errno() where
> > it is possible to use. This patch replaces error_setg()
> > from error_setg_errno().
> > 
> > Signed-off-by: Vikhyat Umrao 
> > ---
> >  block/rbd.c | 37 ++---
> >  1 file changed, 22 insertions(+), 15 deletions(-)
> > 
> > diff --git a/block/rbd.c b/block/rbd.c
> > index 5bc5b32..c286b32 100644
> > --- a/block/rbd.c
> > +++ b/block/rbd.c
> > @@ -290,7 +290,7 @@ static int qemu_rbd_set_conf(rados_t cluster, const
> > char *conf,
> >  if (only_read_conf_file) {
> >  ret = rados_conf_read_file(cluster, value);
> >  if (ret < 0) {
> > -error_setg(errp, "error reading conf file %s", value);
> > +error_setg_errno(errp, -ret, "error reading conf file
> > %s", value);
> 
> Please use scripts/checkpatch.pl to scan patches for coding style
> violations.  This line exceeds the maximum line length.


Also, if you can use something like git send-email, it formats the patch
emails much nicer -- which makes it easier to review & apply.

Thanks,
Jeff

Re: [Qemu-devel] virtio-net and vhost-net init, virtio-scsi and vhost-scsi init

On Thu, May 05, 2016 at 11:05:27AM +, Catalin Vasile wrote:
> When the virtio-net and virtio-scsi drivers have done the probe() primitive 
> they set the DRIVER_OK flag.
> 
> If the vhost kernel backend is used, the set_status() primitive in qemu will 
> be triggered with DRIVER_OK status and it will trigger vhost_XXX_start().
> 
> How does the net and scsi solutions ensure that the vhost_XXX_start() 
> primitive has finished before sending jobs to through the virtqueues? Because 
> if vhost_XXX_start() has not finished, jobs might get to the qemu dummy 
> virtqueue handles, instead of getting to the vhost server in the host kernel.

The trick is to explicitly signal the ioeventfd.  See QEMU's
virtio_pci_set_host_notifier_internal():

  r = event_notifier_init(notifier, 1);

The '1' means the notifier is initialized in a readable state so whoever
is monitoring the file descriptor will be able to read right away.

Whether vhost receives the kick from the guest or not, it will
definitely notice that the ioeventfd can be read.  And maybe the
virtqueue has no new available buffers.  That's okay too, then this was
just a spurious wakeup.

Is this what you were asking about?

Stefan

signature.asc
Description: PGP signature

Re: [Qemu-devel] [PATCH] rbd:change error_setg() to error_setg_errno()

On Mon, May 02, 2016 at 09:55:17PM +0530, Vikhyat Umrao wrote:
> From 1c63c246f47a1a65d8740d7ce3725fe3820c0a37 Mon Sep 17 00:00:00 2001
> From: Vikhyat Umrao 
> Date: Mon, 2 May 2016 21:47:31 +0530
> Subject: [PATCH] rbd:change error_setg() to error_setg_errno()
> 
> Ceph RBD block driver does not use error_setg_errno() where
> it is possible to use. This patch replaces error_setg()
> from error_setg_errno().
> 
> Signed-off-by: Vikhyat Umrao 
> ---
>  block/rbd.c | 37 ++---
>  1 file changed, 22 insertions(+), 15 deletions(-)
> 
> diff --git a/block/rbd.c b/block/rbd.c
> index 5bc5b32..c286b32 100644
> --- a/block/rbd.c
> +++ b/block/rbd.c
> @@ -290,7 +290,7 @@ static int qemu_rbd_set_conf(rados_t cluster, const
> char *conf,
>  if (only_read_conf_file) {
>  ret = rados_conf_read_file(cluster, value);
>  if (ret < 0) {
> -error_setg(errp, "error reading conf file %s", value);
> +error_setg_errno(errp, -ret, "error reading conf file
> %s", value);

Please use scripts/checkpatch.pl to scan patches for coding style
violations.  This line exceeds the maximum line length.


signature.asc
Description: PGP signature

Re: [Qemu-devel] TCP Segementation Offloading

On Sun, May 01, 2016 at 02:31:57PM +0200, Ingo Krabbe wrote:
> Good Mayday Qemu Developers,
> 
> today I tried to find a reference to a networking problem, that seems to be 
> of quite general nature: TCP Segmentation Offloading (TSO) in virtual 
> environments.
> 
> When I setup TAP network adapter for a virtual machine and put it into a host 
> bridge, the known best practice is to manually set "tso off gso off" with 
> ethtool, for the guest driver if I use a hardware emulation, such as e1000 
> and/or "tso off gso off" for the host driver and/or for the bridge adapter, 
> if I use the virtio driver, as otherwise you experience (sometimes?) 
> performance problems or even lost packages.

I can't parse this sentence.  In what cases do you think it's a "known
best practice" to disable tso and gso?  Maybe a table would be a clearer
way to communicate this.

Can you provide a link to the source claiming tso and gso should be
disabled?

> I haven't found a complete analysis of the background of these problems, but 
> there seem to be some effects on MTU based fragmentation and UDP checksums.
> 
> There is a tso related bug on launchpad, but the context of this bug is too 
> narrow, for the generality of the problem.
> 
> Also it seems that there is a problem in LXC contexts too (I found such a 
> reference, without detailed description in a Post about Xen setup).
> 
> My question now is: Is there a bug in the driver code and shouldn't this be 
> documented somewhere in wiki.qemu.org? Where there developments about this 
> topic in the past or is there any planned/ongoing work todo on the qemu 
> drivers?
> 
> Most problem reports found relate to deprecated Centos6 qemu-kvm packages.
> 
> In our company we have similar or even worse problems with Centos7 hosts and 
> guest machines.

Have haven't explained what problem you are experiencing.  If you want
help with your setup please include your QEMU command-line (ps aux |
grep qemu), the traffic pattern (ideally how to reproduce it with a
benchmarking tool), and what observation you are making (e.g. netstat
counters showing dropped packets).

> I'm going to analyze these problems next week anyway and I woud be happy to 
> share my observation with you. (Where can I register for the wiki, or whom 
> should I sent my reports about this topic?).

I have CCed Michael Tsirkin and Jason Wang.  They do most of the
virtio-net development.

> 
> Regards,
> 
> Ingo Krabbe
> 
> 


signature.asc
Description: PGP signature

Re: [Qemu-devel] [Qemu-Devel] [PATCH] Changed malloc to g_malloc, free to g_free in linux-user/qemu.h

On Thu, Mar 24, 2016 at 12:02:03AM +0530, Md Haris Iqbal wrote:
> Signed-off-by: Md Haris Iqbal 
> ---
>  linux-user/qemu.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

Reviewed-by: Stefan Hajnoczi 


signature.asc
Description: PGP signature

Re: [Qemu-devel] [RFC PATCH] virtio: set ISR status when delivering queue MSI

On Thu, May 5, 2016 at 6:04 PM, Michael S. Tsirkin  wrote:
> On Thu, May 05, 2016 at 05:27:03PM +0200, Ladi Prosek wrote:
>> On Thu, May 5, 2016 at 5:20 PM, Michael S. Tsirkin  wrote:
>> > On Thu, May 05, 2016 at 05:15:01PM +0200, Ladi Prosek wrote:
>> >> On Thu, May 5, 2016 at 5:02 PM, Michael S. Tsirkin  
>> >> wrote:
>> >> > On Thu, May 05, 2016 at 04:59:13PM +0200, Ladi Prosek wrote:
>> >> >> On Thu, May 5, 2016 at 3:36 PM, Michael S. Tsirkin  
>> >> >> wrote:
>> >> >> > On Thu, May 05, 2016 at 11:13:37AM +0200, Ladi Prosek wrote:
>> >> >> >> There is a discrepancy between dataplane and no-dataplane virtio
>> >> >> >> behavior with respect to the ISR status register and MSI-X
>> >> >> >> capability.
>> >> >> >>
>> >> >> >> Without dataplane the Queue interrupt ISR status bit is set
>> >> >> >> regardless of how the notification is delivered to the guest.
>> >> >> >>
>> >> >> >> With dataplane the Queue interrupt ISR status bit is set only
>> >> >> >> if the notification is delivered as an IRQ.
>> >> >> >>
>> >> >> >> While both conform to the spec, QEMU should be consistent to
>> >> >> >> minimize surprises with broken drivers.
>> >> >> >>
>> >> >> >> This RFC patch shows the relevant code location and contains a
>> >> >> >> possible fix which makes QEMU set the bit on the MSI dataplane
>> >> >> >> code path.
>> >> >> >>
>> >> >> >> Signed-off-by: Ladi Prosek 
>> >> >> >
>> >> >> > How is this supposed to work when injecting interrupts using irqfd?
>> >> >> > If anything, I would rather drop setting the flag on non-dataplane.
>> >> >>
>> >> >> My bad, this is messed up. For some reason it works, or appears to be
>> >> >> working at least.
>> >> >>
>> >> >> The reason why I'd like to add setting the flag in dataplane is that
>> >> >> if we go the other way we'll break drivers. I know for sure that there
>> >> >> are drivers depending on this.
>> >> >
>> >> > Rly? These are way out of spec then. I'd rather have these drivers
>> >> > fixed: virtio said ISR can't be used with MSI since the 1st version
>> >> > where MSI was introduced. virtio 1.0 allows using ISR for config
>> >> > interrupts only.
>> >>
>> >> Should the below paragraph be worded differently then? Or explicitly
>> >> say that the contents of the status word is undefined?
>> >
>> > It says don't access. How would you word a driver requirement otherwise?
>>
>> I would use a stronger mode than "SHOULD NOT". Playing devil's
>> advocate, I could interpret the sentence as a suggested perf
>> optimization - i.e. the interrupt message conveys the required
>> information, no need to consult ISR status.
>
> OK, maybe it should be "MUST NOT".
> Older virtio said it is not valid.
> Want to reports this on virtio comments mailing list?

Sure, will do.

>> >> Specified since
>> >> 1st version vs. actually behaving since 1st version.
>> >
>> > Yea. Yack. People just try whatever seems to work.
>> >
>> >> If you think it's
>> >> safe to remove the |= 0x01 from the regular path I'll fix the drivers
>> >> I know of, that's not a problem.
>> >
>> > Maybe we should keep the  |= 0x01 in QEMU for now, but let's
>> > not mess up dataplace.  Please fix the drivers.
>> >
>> > Maybe remove |= 0x01 if legacy is disabled.
>>
>> Ok, sounds good. Thanks!
>>
>> >> 4.1.4.5.2 Driver Requirements: ISR status capability
>> >>
>> >> If MSI-X capability is enabled, the driver SHOULD NOT access ISR
>> >> status upon detecting a Queue Interrupt.
>> >>
>> >> >> >
>> >> >> >> ---
>> >> >> >>  hw/virtio/virtio.c | 19 ++-
>> >> >> >>  1 file changed, 14 insertions(+), 5 deletions(-)
>> >> >> >>
>> >> >> >> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
>> >> >> >> index 08275a9..4053313 100644
>> >> >> >> --- a/hw/virtio/virtio.c
>> >> >> >> +++ b/hw/virtio/virtio.c
>> >> >> >> @@ -1192,7 +1192,7 @@ void virtio_notify(VirtIODevice *vdev, 
>> >> >> >> VirtQueue *vq)
>> >> >> >>  }
>> >> >> >>
>> >> >> >>  trace_virtio_notify(vdev, vq);
>> >> >> >> -vdev->isr |= 0x01;
>> >> >> >> +vdev->isr |= 0x01; // <= here we set ISR status even if we 
>> >> >> >> don't raise IRQ
>> >> >> >>  virtio_notify_vector(vdev, vq->vector);
>> >> >> >>  }
>> >> >> >>
>> >> >> >> @@ -1757,16 +1757,25 @@ static void 
>> >> >> >> virtio_queue_guest_notifier_read(EventNotifier *n)
>> >> >> >>  }
>> >> >> >>  }
>> >> >> >>
>> >> >> >> +static void virtio_queue_guest_notifier_read_irqfd(EventNotifier 
>> >> >> >> *n)
>> >> >> >> +{
>> >> >> >> +VirtQueue *vq = container_of(n, VirtQueue, guest_notifier);
>> >> >> >> +if (event_notifier_test_and_clear(n)) {
>> >> >> >> +vq->vdev->isr |= 0x01;
>> >> >> >> +}
>> >> >> >> +}
>> >> >> >> +
>> >> >> >>  void virtio_queue_set_guest_notifier_fd_handler(VirtQueue *vq, 
>> >> >> >> bool assign,
>> >> >> >>  bool with_irqfd)
>> >> >> >>  {
>> >> >> >> -if (assign &&

Re: [Qemu-devel] [PATCH] gdbstub: avoid busy loop while waiting for gdb

On 3 May 2016 at 23:58, Peter Wu  wrote:
> While waiting for a gdb response, or while sending an acknowledgement
> there is not much to do, so just mark the socket as non-blocking to
> avoid a busy loop while paused at gdb. This only affects the user-mode
> emulation (qemu-arm -g 1234 ./a.out).
>
> Note that this issue was reported before at
> https://lists.nongnu.org/archive/html/qemu-devel/2013-02/msg02277.html.
>
> While at it, close the gdb client fd on EOF or error while reading.
>
> Signed-off-by: Peter Wu 
> ---
>  gdbstub.c | 12 +++-
>  1 file changed, 7 insertions(+), 5 deletions(-)
>
> diff --git a/gdbstub.c b/gdbstub.c
> index 0e431fd..b126bf5 100644
> --- a/gdbstub.c
> +++ b/gdbstub.c
> @@ -332,7 +332,7 @@ static int get_char(GDBState *s)
>  if (ret < 0) {
>  if (errno == ECONNRESET)
>  s->fd = -1;
> -if (errno != EINTR && errno != EAGAIN)
> +if (errno != EINTR)
>  return -1;
>  } else if (ret == 0) {
>  close(s->fd);
> @@ -393,7 +393,7 @@ static void put_buffer(GDBState *s, const uint8_t *buf, 
> int len)
>  while (len > 0) {
>  ret = send(s->fd, buf, len, 0);
>  if (ret < 0) {
> -if (errno != EINTR && errno != EAGAIN)
> +if (errno != EINTR)
>  return;
>  } else {
>  buf += ret;
> @@ -1542,9 +1542,13 @@ gdb_handlesig(CPUState *cpu, int sig)
>  for (i = 0; i < n; i++) {
>  gdb_read_byte(s, buf[i]);
>  }
> -} else if (n == 0 || errno != EAGAIN) {
> +} else {
>  /* XXX: Connection closed.  Should probably wait for another
> connection before continuing.  */
> +if (n == 0) {
> +close(s->fd);
> +}
> +s->fd = -1;
>  return sig;
>  }
>  }
> @@ -1599,8 +1603,6 @@ static void gdb_accept(void)
>  gdb_has_xml = false;
>
>  gdbserver_state = s;
> -
> -fcntl(fd, F_SETFL, O_NONBLOCK);
>  }
>
>  static int gdbserver_open(int port)

It does look like all the places we use s->fd:
 * are only for user-mode
 * are currently coded as "loop round until we get something",
   so effectively blocking but doing it with a busy loop

I suspect the O_NONBLOCK here may be legacy from before the
system-emulation gdbstub was reworked to use a chr backend
(in system emulation there really are other things we need
to attend to besides gdb stub packets, like the monitor).

Other than the error in the commit message,
Reviewed-by: Peter Maydell 

thanks
-- PMM

Re: [Qemu-devel] [PATCH] disas/microblaze: Add missing 'const' attributes

On Wed, Apr 06, 2016 at 12:01:17PM +0200, Stefan Weil wrote:
> Am 06.04.2016 um 11:16 schrieb Edgar E. Iglesias:
> > On Tue, Mar 22, 2016 at 08:31:33AM +0100, Stefan Weil wrote:
> >> Making the opcode list 'const' saves memory.
> >> Some function arguments and local variables needed 'const', too.
> >>
> >> Add also 'static' to two local functions.
> > Hi Stefan,
> >
> > Sorry for the delays...
> >
> > I gave this a try but it fails for me:
> >
> > /home/edgar/src/c/qemu/qemu/disas/microblaze.c:668:1: error: conflicting 
> > types for ‘get_field_special’
> >  get_field_special(long instr, const struct op_code_struct *op)
> >  ^
> > /home/edgar/src/c/qemu/qemu/disas/microblaze.c:599:8: note: previous 
> > declaration of ‘get_field_special’ was here
> >  char * get_field_special (long instr, struct op_code_struct * op);
> > ^
> > /home/edgar/src/c/qemu/qemu/disas/microblaze.c:733:1: error: conflicting 
> > types for ‘read_insn_microblaze’
> >  read_insn_microblaze (bfd_vma memaddr, 
> >  ^
> > /home/edgar/src/c/qemu/qemu/disas/microblaze.c:600:15: note: previous 
> > declaration of ‘read_insn_microblaze’ was here
> >  unsigned long read_insn_microblaze (bfd_vma memaddr, 
> >^
> > make: *** [disas/microblaze.o] Error 1
> > make: *** Waiting for unfinished jobs
> >
> >
> > It looks like if you may have forgotten to update or remove the function 
> > prototypes...
> >
> > Best regards,
> > Edgar
> >
> 
> Hello Edgar,
> 
> you are right, I forgot to remove the two function prototypes.
> They are not needed, so removing is the best solution which
> I also used in most of my working trees. Obviously I had
> chosen a bad tree for sending the patch, sorry for that.
> 
> Should I send an update, or can you just remove the two
> conflicting prototypes?

Hi Stefan,

Nah, I've added a modified version of your patch into my
mb-next branch.

Thanks!
Edgar

Re: [Qemu-devel] [PATCH] linux-user/signal.c: Use target address instead of host address for microblaze restorer

On Thu, May 05, 2016 at 10:48:57PM +0800, Chen Gang wrote:
> On 5/5/16 00:05, Peter Maydell wrote:
> > On 29 March 2016 at 15:13,   wrote:
> >> From: Chen Gang 
> >>
> >> The return address is in target space, so the restorer address needs to
> >> be target space, too.
> >>
> >> Signed-off-by: Chen Gang 
> >> ---
> >>  linux-user/signal.c | 3 ++-
> >>  1 file changed, 2 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/linux-user/signal.c b/linux-user/signal.c
> >> index 4157154..c0a6f7e 100644
> >> --- a/linux-user/signal.c
> >> +++ b/linux-user/signal.c
> >> @@ -3532,7 +3532,8 @@ static void setup_frame(int sig, struct 
> >> target_sigaction *ka,
> >>
> >>  /* Return from sighandler will jump to the tramp.
> >> Negative 8 offset because return is rtsd r15, 8 */
> >> -env->regs[15] = ((unsigned long)frame->tramp) - 8;
> >> +env->regs[15] = frame_addr + offsetof(struct target_signal_frame, 
> >> tramp)
> >> +   - 8;
> >>  }
> >>
> >>  /* Set up registers for signal handler */
> > 
> > Reviewed-by: Peter Maydell 
> > 
> 
> Thank all of you for the 2 patches reviewing.
> 
> I guess, this month, I may have free time (at least, will not be as busy
> as the previous month), I shall finish tilegx floating point insns (it
> has been delayed too long).

Reviewed-by: Edgar E. Iglesias

[Qemu-devel] [PATCH v4 1/1] target-arm: A64: Create Instruction Syndromes for Data Aborts

From: "Edgar E. Iglesias" 

Add support for generating the ISS (Instruction Specific Syndrome) for
Data Abort exceptions taken from AArch64.
These syndromes are used by hypervisors for example to trap and emulate
memory accesses.

We save the decoded data out-of-band with the TBs at translation time.
When exceptions hit, the extra data attached to the TB is used to
recreate the state needed to encode instruction syndromes.
This avoids the need to emit moves with every load/store.

Based on a suggestion from Peter Maydell.

Suggested-by: Peter Maydell 
Signed-off-by: Edgar E. Iglesias 
---
 target-arm/cpu.h   |  14 -
 target-arm/op_helper.c |  49 ++--
 target-arm/translate-a64.c | 140 ++---
 target-arm/translate.c |   5 +-
 target-arm/translate.h |   2 +
 5 files changed, 180 insertions(+), 30 deletions(-)

diff --git a/target-arm/cpu.h b/target-arm/cpu.h
index 066ff67..dea9bab 100644
--- a/target-arm/cpu.h
+++ b/target-arm/cpu.h
@@ -94,7 +94,19 @@
 struct arm_boot_info;
 
 #define NB_MMU_MODES 7
-#define TARGET_INSN_START_EXTRA_WORDS 1
+/* ARM-specific extra insn start words:
+ * 1: Conditional execution bits
+ * 2: Partial exception syndrome for data aborts
+ */
+#define TARGET_INSN_START_EXTRA_WORDS 2
+
+/* The 2nd extra word holding syndrome info for data aborts does not use
+ * the upper 6 bits nor the lower 14 bits. We mask and shift it down to
+ * help the sleb128 encoder do a better job.
+ * When restoring the CPU state, we shift it back up.
+ */
+#define ARM_INSN_START_WORD2_MASK ((1 << 26) - 1)
+#define ARM_INSN_START_WORD2_SHIFT 14
 
 /* We currently assume float and double are IEEE single and double
precision respectively.
diff --git a/target-arm/op_helper.c b/target-arm/op_helper.c
index c7fba85..26aa4af 100644
--- a/target-arm/op_helper.c
+++ b/target-arm/op_helper.c
@@ -75,6 +75,43 @@ uint32_t HELPER(neon_tbl)(CPUARMState *env, uint32_t ireg, 
uint32_t def,
 
 #if !defined(CONFIG_USER_ONLY)
 
+static inline uint32_t merge_syn_data_abort(uint32_t template_syn,
+unsigned int target_el,
+bool same_el,
+bool s1ptw, int is_write,
+int fsc)
+{
+uint32_t syn;
+
+/* ISV is only set for data aborts routed to EL2 and
+ * never for stage-1 page table walks faulting on stage 2.
+ *
+ * Furthermore, ISV is only set for certain kinds of load/stores.
+ * If the template syndrome does not have ISV set, we should leave
+ * it cleared.
+ *
+ * See ARMv8 specs, D7-1974:
+ * ISS encoding for an exception from a Data Abort, the
+ * ISV field.
+ */
+if (!(template_syn & ARM_EL_ISV) || target_el != 2 || s1ptw) {
+syn = syn_data_abort_no_iss(same_el,
+0, 0, s1ptw, is_write == 1, fsc);
+} else {
+/* Fields: IL, ISV, SAS, SSE, SRT, SF and AR come from the template
+ * syndrome created at translation time.
+ * Now we create the runtime syndrome with the remaining fields.
+ */
+syn = syn_data_abort_with_iss(same_el,
+  0, 0, 0, 0, 0,
+  0, 0, s1ptw, is_write == 1, fsc,
+  false);
+/* Merge the runtime syndrome with the template syndrome.  */
+syn |= template_syn;
+}
+return syn;
+}
+
 /* try to fill the TLB and return an exception if error. If retaddr is
  * NULL, it means that the function was called in C code (i.e. not
  * from generated code or from helper.c)
@@ -115,8 +152,8 @@ void tlb_fill(CPUState *cs, target_ulong addr, int 
is_write, int mmu_idx,
 syn = syn_insn_abort(same_el, 0, fi.s1ptw, syn);
 exc = EXCP_PREFETCH_ABORT;
 } else {
-syn = syn_data_abort_no_iss(same_el,
-0, 0, fi.s1ptw, is_write == 1, syn);
+syn = merge_syn_data_abort(env->exception.syndrome, target_el,
+   same_el, fi.s1ptw, is_write, syn);
 if (is_write == 1 && arm_feature(env, ARM_FEATURE_V6)) {
 fsr |= (1 << 11);
 }
@@ -137,6 +174,7 @@ void arm_cpu_do_unaligned_access(CPUState *cs, vaddr vaddr, 
int is_write,
 CPUARMState *env = >env;
 int target_el;
 bool same_el;
+uint32_t syn;
 
 if (retaddr) {
 /* now we have a real cpu fault */
@@ -161,10 +199,9 @@ void arm_cpu_do_unaligned_access(CPUState *cs, vaddr 
vaddr, int is_write,
 env->exception.fsr |= (1 << 11);
 }
 
-raise_exception(env, EXCP_DATA_ABORT,
-syn_data_abort_no_iss(same_el,
-  0, 0, 0, is_write == 1, 0x21),
-

[Qemu-devel] [PATCH v4 0/1] arm: Steps towards EL2 support round 6

From: "Edgar E. Iglesias" 

Hi,

Another round of patches towards EL2 support. This one adds partial
Instruction Syndrome generation for Data Aborts while running in AArch64.

I don't feel very confident with the way I collect the regsize info used
to fill out the SF field. Feedback on that would be great.

Once we sort out the details on how this should be implemented we can
fill out the parts needed for AArch32. Possibly in a future version of
this same series.

Comments welcome!

Best regards,
Edgar

ChangeLog v3 -> v4
* Assert that we only set the syndrome template once per insn
* Mask out upper six bits from syndrome template
* Remove the _with_isv versions of do_gpr_ld/st
* Mention that we only handle AArch64 in the commit message

ChangeLog:
v2 -> v3:
* Commented on inst start extra words
* Add macro for word2 shift
* Move ISS field collection closer to tcg_gen_qemu_ld/st
* Changed logic to compute regsize for disas_ld_lit
* Introduce syn_data_abort_with_iss/no_iss
* Rename some isv naming to iss
* Drop the patch: "Use isyn.swstep.ex to hold the is_ldex state"

v1 -> v2:
* Reworked the syndrome generation code to reuse syn_data_abort for
  the encoding.
* Reworded a bunch of comments.
* Fixed thumb vs 16bit IL field issue.


Edgar E. Iglesias (1):
  target-arm: A64: Create Instruction Syndromes for Data Aborts

 target-arm/cpu.h   |  14 -
 target-arm/op_helper.c |  49 ++--
 target-arm/translate-a64.c | 140 ++---
 target-arm/translate.c |   5 +-
 target-arm/translate.h |   2 +
 5 files changed, 180 insertions(+), 30 deletions(-)

-- 
2.5.0

Re: [Qemu-devel] [PATCH] gdbstub: avoid busy loop while waiting for gdb

2016-05-05 Thread Peter Wu

On Thu, May 05, 2016 at 04:37:40PM +0100, Peter Maydell wrote:
> On 3 May 2016 at 23:58, Peter Wu  wrote:
> > While waiting for a gdb response, or while sending an acknowledgement
> > there is not much to do, so just mark the socket as non-blocking to
> > avoid a busy loop while paused at gdb. This only affects the user-mode
> > emulation (qemu-arm -g 1234 ./a.out).
> >
> > Note that this issue was reported before at
> > https://lists.nongnu.org/archive/html/qemu-devel/2013-02/msg02277.html.
> >
> > While at it, close the gdb client fd on EOF or error while reading.
> 
> The commit message says "mark the socket as non-blocking"...
> 
> > @@ -1599,8 +1603,6 @@ static void gdb_accept(void)
> >  gdb_has_xml = false;
> >
> >  gdbserver_state = s;
> > -
> > -fcntl(fd, F_SETFL, O_NONBLOCK);
> >  }
> 
> ...but the code change is *removing* a call to mark the
> socket as non-blocking. Which is correct?
> 
> thanks
> -- PMM

The commit message is misleading, it should have been "so do not mark
the socket as non-blocking". If you were to apply this, please fix it up
:-)
-- 
Kind regards,
Peter Wu
https://lekensteyn.nl

Re: [Qemu-devel] [RFC PATCH] virtio: set ISR status when delivering queue MSI

On Thu, May 05, 2016 at 05:27:03PM +0200, Ladi Prosek wrote:
> On Thu, May 5, 2016 at 5:20 PM, Michael S. Tsirkin  wrote:
> > On Thu, May 05, 2016 at 05:15:01PM +0200, Ladi Prosek wrote:
> >> On Thu, May 5, 2016 at 5:02 PM, Michael S. Tsirkin  wrote:
> >> > On Thu, May 05, 2016 at 04:59:13PM +0200, Ladi Prosek wrote:
> >> >> On Thu, May 5, 2016 at 3:36 PM, Michael S. Tsirkin  
> >> >> wrote:
> >> >> > On Thu, May 05, 2016 at 11:13:37AM +0200, Ladi Prosek wrote:
> >> >> >> There is a discrepancy between dataplane and no-dataplane virtio
> >> >> >> behavior with respect to the ISR status register and MSI-X
> >> >> >> capability.
> >> >> >>
> >> >> >> Without dataplane the Queue interrupt ISR status bit is set
> >> >> >> regardless of how the notification is delivered to the guest.
> >> >> >>
> >> >> >> With dataplane the Queue interrupt ISR status bit is set only
> >> >> >> if the notification is delivered as an IRQ.
> >> >> >>
> >> >> >> While both conform to the spec, QEMU should be consistent to
> >> >> >> minimize surprises with broken drivers.
> >> >> >>
> >> >> >> This RFC patch shows the relevant code location and contains a
> >> >> >> possible fix which makes QEMU set the bit on the MSI dataplane
> >> >> >> code path.
> >> >> >>
> >> >> >> Signed-off-by: Ladi Prosek 
> >> >> >
> >> >> > How is this supposed to work when injecting interrupts using irqfd?
> >> >> > If anything, I would rather drop setting the flag on non-dataplane.
> >> >>
> >> >> My bad, this is messed up. For some reason it works, or appears to be
> >> >> working at least.
> >> >>
> >> >> The reason why I'd like to add setting the flag in dataplane is that
> >> >> if we go the other way we'll break drivers. I know for sure that there
> >> >> are drivers depending on this.
> >> >
> >> > Rly? These are way out of spec then. I'd rather have these drivers
> >> > fixed: virtio said ISR can't be used with MSI since the 1st version
> >> > where MSI was introduced. virtio 1.0 allows using ISR for config
> >> > interrupts only.
> >>
> >> Should the below paragraph be worded differently then? Or explicitly
> >> say that the contents of the status word is undefined?
> >
> > It says don't access. How would you word a driver requirement otherwise?
> 
> I would use a stronger mode than "SHOULD NOT". Playing devil's
> advocate, I could interpret the sentence as a suggested perf
> optimization - i.e. the interrupt message conveys the required
> information, no need to consult ISR status.

OK, maybe it should be "MUST NOT".
Older virtio said it is not valid.
Want to reports this on virtio comments mailing list?


> >> Specified since
> >> 1st version vs. actually behaving since 1st version.
> >
> > Yea. Yack. People just try whatever seems to work.
> >
> >> If you think it's
> >> safe to remove the |= 0x01 from the regular path I'll fix the drivers
> >> I know of, that's not a problem.
> >
> > Maybe we should keep the  |= 0x01 in QEMU for now, but let's
> > not mess up dataplace.  Please fix the drivers.
> >
> > Maybe remove |= 0x01 if legacy is disabled.
> 
> Ok, sounds good. Thanks!
> 
> >> 4.1.4.5.2 Driver Requirements: ISR status capability
> >>
> >> If MSI-X capability is enabled, the driver SHOULD NOT access ISR
> >> status upon detecting a Queue Interrupt.
> >>
> >> >> >
> >> >> >> ---
> >> >> >>  hw/virtio/virtio.c | 19 ++-
> >> >> >>  1 file changed, 14 insertions(+), 5 deletions(-)
> >> >> >>
> >> >> >> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> >> >> >> index 08275a9..4053313 100644
> >> >> >> --- a/hw/virtio/virtio.c
> >> >> >> +++ b/hw/virtio/virtio.c
> >> >> >> @@ -1192,7 +1192,7 @@ void virtio_notify(VirtIODevice *vdev, 
> >> >> >> VirtQueue *vq)
> >> >> >>  }
> >> >> >>
> >> >> >>  trace_virtio_notify(vdev, vq);
> >> >> >> -vdev->isr |= 0x01;
> >> >> >> +vdev->isr |= 0x01; // <= here we set ISR status even if we 
> >> >> >> don't raise IRQ
> >> >> >>  virtio_notify_vector(vdev, vq->vector);
> >> >> >>  }
> >> >> >>
> >> >> >> @@ -1757,16 +1757,25 @@ static void 
> >> >> >> virtio_queue_guest_notifier_read(EventNotifier *n)
> >> >> >>  }
> >> >> >>  }
> >> >> >>
> >> >> >> +static void virtio_queue_guest_notifier_read_irqfd(EventNotifier *n)
> >> >> >> +{
> >> >> >> +VirtQueue *vq = container_of(n, VirtQueue, guest_notifier);
> >> >> >> +if (event_notifier_test_and_clear(n)) {
> >> >> >> +vq->vdev->isr |= 0x01;
> >> >> >> +}
> >> >> >> +}
> >> >> >> +
> >> >> >>  void virtio_queue_set_guest_notifier_fd_handler(VirtQueue *vq, bool 
> >> >> >> assign,
> >> >> >>  bool with_irqfd)
> >> >> >>  {
> >> >> >> -if (assign && !with_irqfd) {
> >> >> >> +if (assign) {
> >> >> >>  event_notifier_set_handler(>guest_notifier,
> >> >> >> -   
> >> >> >> virtio_queue_guest_notifier_read);
> >> >> >> +

Re: [Qemu-devel] [PATCH v3 7/7] target-arm: A64: Create Instruction Syndromes for Data Aborts

On Wed, May 04, 2016 at 06:38:49PM +0100, Peter Maydell wrote:
> On 29 April 2016 at 13:08, Edgar E. Iglesias  wrote:
> > From: "Edgar E. Iglesias" 
> >
> > Add support for generating the instruction syndrome for Data Aborts.
> > These syndromes are used by hypervisors for example to trap and emulate
> > memory accesses.
> >
> > We save the decoded data out-of-band with the TBs at translation time.
> > When exceptions hit, the extra data attached to the TB is used to
> > recreate the state needed to encode instruction syndromes.
> > This avoids the need to emit moves with every load/store.
> >
> > Based on a suggestion from Peter Maydell.
> 
> Worth mentioning in the commit message that we don't currently generate
> ISV information for exceptions from AArch32?

Yes, I've changed the first part of the commit message to:

target-arm: A64: Create Instruction Syndromes for Data Aborts

Add support for generating the ISS (Instruction Specific Syndrome) for
Data Abort exceptions taken from AArch64.

...



> 
> > Suggested-by: Peter Maydell 
> > Signed-off-by: Edgar E. Iglesias 
> > ---
> >  target-arm/cpu.h   |  12 -
> >  target-arm/op_helper.c |  49 ++---
> >  target-arm/translate-a64.c | 130 
> > +++--
> >  target-arm/translate.c |   5 +-
> >  target-arm/translate.h |   2 +
> >  5 files changed, 173 insertions(+), 25 deletions(-)
> >
> > diff --git a/target-arm/cpu.h b/target-arm/cpu.h
> > index 066ff67..256fbec 100644
> > --- a/target-arm/cpu.h
> > +++ b/target-arm/cpu.h
> > @@ -94,7 +94,17 @@
> >  struct arm_boot_info;
> >
> >  #define NB_MMU_MODES 7
> > -#define TARGET_INSN_START_EXTRA_WORDS 1
> > +/* ARM-specific extra insn start words:
> > + * 1: Conditional execution bits
> > + * 2: Partial exception syndrome for data aborts
> > + */
> > +#define TARGET_INSN_START_EXTRA_WORDS 2
> > +
> > +/* The 2nd extra word holding syndrome info for data aborts does not use
> > + * the lower 14 bits. We shift it down to help the sleb128 encoder do a
> > + * better job. When restoring the CPU state, we shift it back up.
> > + */
> > +#define ARM_INSN_START_WORD2_SHIFT 14
> 
> We could also not bother putting bits 25..31 (ie the field that always reads
> EC_DATAABORT) in the insn start word.

Yes, good point. I'll mask them out for v4.


> 
> > @@ -720,23 +720,55 @@ static void gen_adc_CC(int sf, TCGv_i64 dest, 
> > TCGv_i64 t0, TCGv_i64 t1)
> >   * Store from GPR register to memory.
> >   */
> >  static void do_gpr_st_memidx(DisasContext *s, TCGv_i64 source,
> > - TCGv_i64 tcg_addr, int size, int memidx)
> > + TCGv_i64 tcg_addr, int size, int memidx,
> > + bool iss_valid,
> > + unsigned int iss_srt,
> > + bool iss_sf, bool iss_ar)
> >  {
> >  g_assert(size <= 3);
> >  tcg_gen_qemu_st_i64(source, tcg_addr, memidx, s->be_data + size);
> > +
> > +if (iss_valid) {
> > +uint32_t isyn32;
> > +
> > +isyn32 = syn_data_abort_with_iss(0,
> > + size,
> > + false,
> > + iss_srt,
> > + iss_sf,
> > + iss_ar,
> > + 0, 0, 0, 0, 0, false);
> > +isyn32 >>= ARM_INSN_START_WORD2_SHIFT;
> > +tcg_set_insn_param(s->insn_start_idx, 2, isyn32);
> 
> Is it worth doing
>   assert(s->insn_start_idx);
>   tcg_set_insn_param(...);
>   s->insn_start_idx = 0;
> 
> as a way to catch accidentally trying to set the syndrome info twice ?

Yes, why not. I've done that in v4.



> 
> > +}
> > +}
> > +
> > +static void do_gpr_st_with_isv(DisasContext *s, TCGv_i64 source,
> > +   TCGv_i64 tcg_addr, int size,
> > +   bool iss_valid,
> > +   unsigned int iss_srt,
> > +   bool iss_sf, bool iss_ar)
> > +{
> > +do_gpr_st_memidx(s, source, tcg_addr, size, get_mem_index(s),
> > + iss_valid, iss_srt, iss_sf, iss_ar);
> >  }
> >
> >  static void do_gpr_st(DisasContext *s, TCGv_i64 source,
> >TCGv_i64 tcg_addr, int size)
> >  {
> > -do_gpr_st_memidx(s, source, tcg_addr, size, get_mem_index(s));
> > +do_gpr_st_with_isv(s, source, tcg_addr, size,
> > +   false, 0, false, false);
> >  }
> 
> There's only two places where we use do_gpr_st() rather than the
> _with_isv() version (both in the load/store pair function), and
> similarly for do_gpr_ld(); so I think it would be better just to
> have do_gpr_ld/st always take the iss arguments and not have a
> separate

[Qemu-devel] [PATCH] libqos: add qvirtqueue_cleanup()

qvirtqueue_setup() allocates the vring and virtqueue state.  So far
there has been no function to free it.  Callers have been using
guest_free() for the vring but forgot to free the QVirtQueue state.

This patch solves the memory leak by introducing qvirtqueue_cleanup().

Signed-off-by: Stefan Hajnoczi 
---
This fix is based on based on "[PATCH 0/5] libqos: add ability to pop buffers
from virtqueue" but probably applies more or less cleanly onto other trees too.
Also available on my vsock tree:
https://github.com/stefanha/qemu/commits/vsock.

 tests/libqos/virtio-mmio.c |  8 
 tests/libqos/virtio-pci.c  | 10 ++
 tests/libqos/virtio.c  |  6 ++
 tests/libqos/virtio.h  |  5 +
 tests/virtio-blk-test.c| 10 +-
 tests/virtio-net-test.c|  2 +-
 tests/virtio-scsi-test.c   |  2 +-
 7 files changed, 36 insertions(+), 7 deletions(-)

diff --git a/tests/libqos/virtio-mmio.c b/tests/libqos/virtio-mmio.c
index 7756b92..c9c73bc 100644
--- a/tests/libqos/virtio-mmio.c
+++ b/tests/libqos/virtio-mmio.c
@@ -155,6 +155,13 @@ static QVirtQueue 
*qvirtio_mmio_virtqueue_setup(QVirtioDevice *d,
 return vq;
 }
 
+static void qvirtio_mmio_virtqueue_cleanup(QVirtQueue *vq,
+   QGuestAllocator *alloc)
+{
+guest_free(alloc, vq->desc);
+g_free(vq);
+}
+
 static void qvirtio_mmio_virtqueue_kick(QVirtioDevice *d, QVirtQueue *vq)
 {
 QVirtioMMIODevice *dev = (QVirtioMMIODevice *)d;
@@ -177,6 +184,7 @@ const QVirtioBus qvirtio_mmio = {
 .get_queue_size = qvirtio_mmio_get_queue_size,
 .set_queue_address = qvirtio_mmio_set_queue_address,
 .virtqueue_setup = qvirtio_mmio_virtqueue_setup,
+.virtqueue_cleanup = qvirtio_mmio_virtqueue_cleanup,
 .virtqueue_kick = qvirtio_mmio_virtqueue_kick,
 };
 
diff --git a/tests/libqos/virtio-pci.c b/tests/libqos/virtio-pci.c
index d30d224..2b678b3 100644
--- a/tests/libqos/virtio-pci.c
+++ b/tests/libqos/virtio-pci.c
@@ -236,6 +236,15 @@ static QVirtQueue 
*qvirtio_pci_virtqueue_setup(QVirtioDevice *d,
 return >vq;
 }
 
+static void qvirtio_pci_virtqueue_cleanup(QVirtQueue *vq,
+  QGuestAllocator *alloc)
+{
+QVirtQueuePCI *vqpci = container_of(vq, QVirtQueuePCI, vq);
+
+guest_free(alloc, vq->desc);
+g_free(vqpci);
+}
+
 static void qvirtio_pci_virtqueue_kick(QVirtioDevice *d, QVirtQueue *vq)
 {
 QVirtioPCIDevice *dev = (QVirtioPCIDevice *)d;
@@ -258,6 +267,7 @@ const QVirtioBus qvirtio_pci = {
 .get_queue_size = qvirtio_pci_get_queue_size,
 .set_queue_address = qvirtio_pci_set_queue_address,
 .virtqueue_setup = qvirtio_pci_virtqueue_setup,
+.virtqueue_cleanup = qvirtio_pci_virtqueue_cleanup,
 .virtqueue_kick = qvirtio_pci_virtqueue_kick,
 };
 
diff --git a/tests/libqos/virtio.c b/tests/libqos/virtio.c
index b1591db..9bdcd39 100644
--- a/tests/libqos/virtio.c
+++ b/tests/libqos/virtio.c
@@ -55,6 +55,12 @@ QVirtQueue *qvirtqueue_setup(const QVirtioBus *bus, 
QVirtioDevice *d,
 return bus->virtqueue_setup(d, alloc, index);
 }
 
+void qvirtqueue_cleanup(const QVirtioBus *bus, QVirtQueue *vq,
+QGuestAllocator *alloc)
+{
+return bus->virtqueue_cleanup(vq, alloc);
+}
+
 void qvirtio_reset(const QVirtioBus *bus, QVirtioDevice *d)
 {
 bus->set_status(d, 0);
diff --git a/tests/libqos/virtio.h b/tests/libqos/virtio.h
index 5d40b23..f56a7e2 100644
--- a/tests/libqos/virtio.h
+++ b/tests/libqos/virtio.h
@@ -83,6 +83,9 @@ typedef struct QVirtioBus {
 QVirtQueue *(*virtqueue_setup)(QVirtioDevice *d, QGuestAllocator *alloc,
 uint16_t 
index);
 
+/* Free virtqueue resources */
+void (*virtqueue_cleanup)(QVirtQueue *vq, QGuestAllocator *alloc);
+
 /* Notify changes in virtqueue */
 void (*virtqueue_kick)(QVirtioDevice *d, QVirtQueue *vq);
 } QVirtioBus;
@@ -125,6 +128,8 @@ unsigned int qvirtio_wait_queue_buf(const QVirtioBus *bus, 
QVirtioDevice *d,
 gint64 timeout_us);
 QVirtQueue *qvirtqueue_setup(const QVirtioBus *bus, QVirtioDevice *d,
 QGuestAllocator *alloc, uint16_t 
index);
+void qvirtqueue_cleanup(const QVirtioBus *bus, QVirtQueue *vq,
+QGuestAllocator *alloc);
 
 void qvring_init(const QGuestAllocator *alloc, QVirtQueue *vq, uint64_t addr);
 QVRingIndirectDesc *qvring_indirect_desc_setup(QVirtioDevice *d,
diff --git a/tests/virtio-blk-test.c b/tests/virtio-blk-test.c
index 0ed5cc3..4483c5a 100644
--- a/tests/virtio-blk-test.c
+++ b/tests/virtio-blk-test.c
@@ -299,7 +299,7 @@ static void pci_basic(void)
 (uint64_t)(uintptr_t)addr);
 
 /* End test */
-guest_free(alloc, vqpci->vq.desc);
+qvirtqueue_cleanup(_pci, >vq, alloc);
 pc_alloc_uninit(alloc);
 qvirtio_pci_device_disable(dev);
 g_free(dev);
@@ -402,7

Re: [Qemu-devel] [PATCH] linux-user/signal.c: Use target address instead of host address for microblaze restorer

2016-05-05 Thread Chen Gang

On 5/5/16 00:05, Peter Maydell wrote:
> On 29 March 2016 at 15:13,   wrote:
>> From: Chen Gang 
>>
>> The return address is in target space, so the restorer address needs to
>> be target space, too.
>>
>> Signed-off-by: Chen Gang 
>> ---
>>  linux-user/signal.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/linux-user/signal.c b/linux-user/signal.c
>> index 4157154..c0a6f7e 100644
>> --- a/linux-user/signal.c
>> +++ b/linux-user/signal.c
>> @@ -3532,7 +3532,8 @@ static void setup_frame(int sig, struct 
>> target_sigaction *ka,
>>
>>  /* Return from sighandler will jump to the tramp.
>> Negative 8 offset because return is rtsd r15, 8 */
>> -env->regs[15] = ((unsigned long)frame->tramp) - 8;
>> +env->regs[15] = frame_addr + offsetof(struct target_signal_frame, 
>> tramp)
>> +   - 8;
>>  }
>>
>>  /* Set up registers for signal handler */
> 
> Reviewed-by: Peter Maydell 
> 

Thank all of you for the 2 patches reviewing.

I guess, this month, I may have free time (at least, will not be as busy
as the previous month), I shall finish tilegx floating point insns (it
has been delayed too long).


Thanks.
-- 
Chen Gang (陈刚)

Managing Natural Environments is the Duty of Human Beings.

Re: [Qemu-devel] [RFC v2 04/11] tcg: comment on which functions have to be called with tb_lock held

2016-05-05 Thread Sergey Fedorov

On 05/05/16 18:25, Sergey Fedorov wrote:
> On 05/05/16 18:03, Alex Bennée wrote:
>> Sergey Fedorov  writes:
>>
>>> On 05/04/16 18:32, Alex Bennée wrote:
 diff --git a/exec.c b/exec.c
 index f46e596..17f390e 100644
 --- a/exec.c
 +++ b/exec.c
 @@ -826,6 +826,7 @@ int cpu_breakpoint_insert(CPUState *cpu, vaddr pc, int 
 flags,
  {
  CPUBreakpoint *bp;

 +/* TODO: locking (RCU?) */
  bp = g_malloc(sizeof(*bp));

  bp->pc = pc;
>>> This comment is a little inconsistent. We should make access to
>>> breakpoint and watchpoint lists to be thread-safe in all the functions
>>> using them. So if we note this, it should be noted in all such places.
>>> Also, it's probably not a good idea to put such comment just above
>>> g_malloc() invocation, it could be a bit confusing. A bit more details
>>> would also be nice.
>> Good point.
>>
>> I could really do with some tests to exercise the debugging interface. I
>> did some when I wrote the arm kvm GDB stuff (see
>> 261f4d6d3e5445f887e070f047968e756c30cf06) but it is a) not plumbed in
>> and b) not really a stress test which is what you want to be sure your
>> handling is thread safe.
> It seems we can only have a race between TCG helper inserting/removing
> break-/watchpoint and gdbstub. So maybe we could just use separate lists
> for CPU and GDB breakpoints?

However, we would still need to synchronize reads in VCPU threads with
insertions/removals in gdbstub... No much point in splitting lists, we
could just use a spinlock to serialize insertions/removals and RCU to
make reads safe.

Kind regards,
Sergey

Re: [Qemu-devel] [PATCH v1 0/6] A migration performance testing framework

2016-05-05 Thread Dr. David Alan Gilbert

* Daniel P. Berrange (berra...@redhat.com) wrote:
> This series of patches provides a framework for testing migration performance
> characteristics. The motivating factor for this is planning that is underway
> in OpenStack wrt making use of QEMU migration features such as compression,
> auto-converge and post-copy. The primary aim for OpenStack is to have Nova
> autonomously manage migration features & tunables to maximise chances that
> migration will complete. The problem faced is figuring out just which QEMU
> migration features are "best" suited to our needs. This means we want data
> on how well they are able to ensure completion of a migration, against the
> host resources used and the impact on the guest workload performance.
> 
> The test framework produced here takes a pathelogical guest workload (every
> CPU just burning 100% of time xor'ing every byte of guest memory with random
> data). This is quite a pessimistic test because most guest workloads are not
> giong to be this heavy on memory writes, and their data won't be uniformly
> random and so will be able to compress better than this test does.
> 
> With this worst case guest, I have produced a set of tests using UNIX socket,
> TCP localhost, TCP remote and RDMA remote socket transports, with both a
> 1 GB RAM + 1 CPU guest and a 8 GB RAM + 4 CPU guest.
> 
> The TCP/RDMA remote host tests were run over a 10-GiG-E network interface.
> 
> I have put the results online to view here:
> 
>   https://berrange.fedorapeople.org/qemu-mig-test-2016-05-05/
> 
> The charts here are showing two core sets of data:
> 
>  - The guest CPU performance. The left axis is showing the time in 
> milliseconds
>required to xor 1 GB of memory. This is shown per-guest CPU and combined 
> all
>CPUs.
> 
>  - The host CPU utilization. The right axis is showing the overall QEMU 
> process
>CPU utilization, and the per-VCPU utilization.
> 
> Note that the charts are interactive - you can turn on/off each plot line and
> zoom in by selecting regions on the chart.
> 
> 
> Some interesting things that I have observed with this
> 
>  - At the start of each iteration of migration there is a distinct drop in
>guest CPU performance as shown by a spike in the guest CPU time lines.
>Performance would drop from 200ms/GB to 400ms/GB. Presumably this is
>related to QEMU recalculating the dirty bitmap for the guest RAM. See
>the spikes in the green line in:
> 
> 
> https://berrange.fedorapeople.org/qemu-mig-test-2016-05-05/tcp-remote-1gb-1cpu/post-copy-bandwidth/post-copy-bw-1gbs.html

Yeh, that doesn't surprise me too much.

>  - For the larger sized guests, the auto-converge code has to throttle the
>guest to as much as 90% or more before it is able to meet the 500ms max
>downtime value
> 
> 
> https://berrange.fedorapeople.org/qemu-mig-test-2016-05-05/tcp-remote-1gb-1cpu/auto-converge-bandwidth/auto-converge-bw-1gbs.html
> 
>Even then I often saw tests aborting as they hit the max number of
>iterations I permitted (30 iters max)
> 
> 
> https://berrange.fedorapeople.org/qemu-mig-test-2016-05-05/tcp-remote-8gb-4cpu/auto-converge-bandwidth/auto-converge-bw-10gbs.html

It doesn't take much CPU to dirty memory, so you need an awful lot of 
throttling,
and the throttling is non-discriminate so it throttles threads that are dirtying
memory a lot as well as those that aren't.

>  - MT compression is actively harmful to chances of successful migration when
>the guest RAM is not compression friendly. My work load is worst case since
>it is splattering RAM with totally random bytes. The MT compression is
>dramatically increasing the time for each iteration as we bottleneck on CPU
>compression speed, leaving the network largely idle. This causes migration
>which would have completed without compression, to fail. It also burns huge
>amounts of host CPU time
> 
>  
> https://berrange.fedorapeople.org/qemu-mig-test-2016-05-05/tcp-remote-1gb-1cpu/compr-mt/compr-mt-threads-4.html

Yes; I think the hope is that this will work well with compression accelerator
hardware.  I look forward to the vendors of that hardware using your scripts
to produce comparisons.

>  - XBZRLE compression did not have as much of a CPU peformance penalty on the
>host as MT comprssion, but also did not help migration to actually 
> complete.
>Again this is largely due to the workload being the worst case scenario 
> with
>random bytes. The downside is obviously the potentially significant memory
>overhead on the host due to the cache sizing
> 
> 
> https://berrange.fedorapeople.org/qemu-mig-test-2016-05-05/tcp-remote-1gb-1cpu/compr-xbzrle/compr-xbzrle-cache-50.html
> 
> 
>  - Post-copy, by its very nature, obviously ensured that the migraton would
>complete. While post-copy was running in pre-copy mode there was a somewhat
>chaotic small impact on guest CPU performance, causing performance to
>periodically

Re: [Qemu-devel] [PATCH] gdbstub: avoid busy loop while waiting for gdb

On 3 May 2016 at 23:58, Peter Wu  wrote:
> While waiting for a gdb response, or while sending an acknowledgement
> there is not much to do, so just mark the socket as non-blocking to
> avoid a busy loop while paused at gdb. This only affects the user-mode
> emulation (qemu-arm -g 1234 ./a.out).
>
> Note that this issue was reported before at
> https://lists.nongnu.org/archive/html/qemu-devel/2013-02/msg02277.html.
>
> While at it, close the gdb client fd on EOF or error while reading.

The commit message says "mark the socket as non-blocking"...

> @@ -1599,8 +1603,6 @@ static void gdb_accept(void)
>  gdb_has_xml = false;
>
>  gdbserver_state = s;
> -
> -fcntl(fd, F_SETFL, O_NONBLOCK);
>  }

...but the code change is *removing* a call to mark the
socket as non-blocking. Which is correct?

thanks
-- PMM

Re: [Qemu-devel] [PATCH v2 1/4] adb-keys.h - initial commit

On 5 May 2016 at 15:58, Programmingkid  wrote:
> Thank you for reviewing my patch. Just to sure, you want me to send
> another patch that has the note removed? If that is the case, I give
> you full permission to delete the note. That would definitely save us
> some time.

Yeah, if this was the only thing I'd just fix it up locally, but in
this case there are enough issues with the rest of the series that
you'll need to redo and resend it anyway.

thanks
-- PMM

Re: [Qemu-devel] [PATCH v2 4/4] adb.c: add power key support

On 24 March 2016 at 14:07, Programmingkid  wrote:
> Add support for the power key. It has to be handled differently from the other
> keys because it is the only 16-bit value key.
>
> Signed-off-by: John Arbuckle 
> ---
>  hw/input/adb.c | 32 +---
>  1 file changed, 25 insertions(+), 7 deletions(-)
>
> diff --git a/hw/input/adb.c b/hw/input/adb.c
> index 8413db9..fab241b 100644
> --- a/hw/input/adb.c
> +++ b/hw/input/adb.c
> @@ -358,10 +358,17 @@ static int adb_kbd_poll(ADBDevice *d, uint8_t *obuf)
>  s->rptr = 0;
>  }
>  s->count--;
> -obuf[0] = keycode;
> -/* NOTE: could put a second keycode if needed */
> -obuf[1] = 0xff;
> -olen = 2;
> +/* The power key is the only two byte value key, so it is a special 
> case. */
> +if (keycode == 0x7f) {
> +obuf[0] = 0x7f;
> +obuf[1] = 0x7f;
> +olen = 2;
> +} else {
> +obuf[0] = keycode;
> +/* NOTE: could put a second keycode if needed */
> +obuf[1] = 0xff;

This codepath is also now handling "power key key-up" (which is 0xff 0xff),
so that should have a comment. I suggest changing the NOTE to something like:
 /* NOTE: the power key key-up is the two byte sequence 0xff 0xff; otherwise
  * we could in theory send a second keycode in the second byte, but choose
  * not to bother.
  */

> +olen = 2;
> +}
>
>  return olen;
>  }
> @@ -440,11 +447,22 @@ static void adb_keyboard_event(DeviceState *dev, 
> QemuConsole *src,
>  }
>  keycode = qcode_to_adb_keycode[qcode];
>
> -if (evt->u.key.data->down == false) { /* if key release event */
> -keycode = keycode | 0x80;   /* create keyboard break code */
> +/* The power button is a special case because it is a 16-bit value */

useful also to say "Since 0x7f is not a used keycode for ADB we overload
it to indicate the power button when we're storing keycodes in our internal
buffer, and expand it out to two bytes when we send to the guest."

> +if (qcode == Q_KEY_CODE_POWER) {
> +if (evt->u.key.data->down == true) { /* Power button pushed keycode 
> */
> +adb_kbd_put_keycode(s, 0x7f);
> +} else {   /* Power button released keycode 
> */
> +adb_kbd_put_keycode(s, 0xff);
> +}
>  }
>
> -adb_kbd_put_keycode(s, keycode);
> +/* For all non-power keys - safe for 8-bit keycodes */
> +else {

Your braces here are wrong: the "else" should be on the same line as
the closing } of the preceding if clause (so "} else {".)

> +if (evt->u.key.data->down == false) { /* if key release event */
> +keycode = keycode | 0x80;   /* create keyboard break code */
> +}
> +adb_kbd_put_keycode(s, keycode);
> +}
>  }
>
>  static const VMStateDescription vmstate_adb_kbd = {

Otherwise looks good.

thanks
-- PMM

Re: [Qemu-devel] [RFC PATCH] virtio: set ISR status when delivering queue MSI

On Thu, May 5, 2016 at 5:20 PM, Michael S. Tsirkin  wrote:
> On Thu, May 05, 2016 at 05:15:01PM +0200, Ladi Prosek wrote:
>> On Thu, May 5, 2016 at 5:02 PM, Michael S. Tsirkin  wrote:
>> > On Thu, May 05, 2016 at 04:59:13PM +0200, Ladi Prosek wrote:
>> >> On Thu, May 5, 2016 at 3:36 PM, Michael S. Tsirkin  
>> >> wrote:
>> >> > On Thu, May 05, 2016 at 11:13:37AM +0200, Ladi Prosek wrote:
>> >> >> There is a discrepancy between dataplane and no-dataplane virtio
>> >> >> behavior with respect to the ISR status register and MSI-X
>> >> >> capability.
>> >> >>
>> >> >> Without dataplane the Queue interrupt ISR status bit is set
>> >> >> regardless of how the notification is delivered to the guest.
>> >> >>
>> >> >> With dataplane the Queue interrupt ISR status bit is set only
>> >> >> if the notification is delivered as an IRQ.
>> >> >>
>> >> >> While both conform to the spec, QEMU should be consistent to
>> >> >> minimize surprises with broken drivers.
>> >> >>
>> >> >> This RFC patch shows the relevant code location and contains a
>> >> >> possible fix which makes QEMU set the bit on the MSI dataplane
>> >> >> code path.
>> >> >>
>> >> >> Signed-off-by: Ladi Prosek 
>> >> >
>> >> > How is this supposed to work when injecting interrupts using irqfd?
>> >> > If anything, I would rather drop setting the flag on non-dataplane.
>> >>
>> >> My bad, this is messed up. For some reason it works, or appears to be
>> >> working at least.
>> >>
>> >> The reason why I'd like to add setting the flag in dataplane is that
>> >> if we go the other way we'll break drivers. I know for sure that there
>> >> are drivers depending on this.
>> >
>> > Rly? These are way out of spec then. I'd rather have these drivers
>> > fixed: virtio said ISR can't be used with MSI since the 1st version
>> > where MSI was introduced. virtio 1.0 allows using ISR for config
>> > interrupts only.
>>
>> Should the below paragraph be worded differently then? Or explicitly
>> say that the contents of the status word is undefined?
>
> It says don't access. How would you word a driver requirement otherwise?

I would use a stronger mode than "SHOULD NOT". Playing devil's
advocate, I could interpret the sentence as a suggested perf
optimization - i.e. the interrupt message conveys the required
information, no need to consult ISR status.

>> Specified since
>> 1st version vs. actually behaving since 1st version.
>
> Yea. Yack. People just try whatever seems to work.
>
>> If you think it's
>> safe to remove the |= 0x01 from the regular path I'll fix the drivers
>> I know of, that's not a problem.
>
> Maybe we should keep the  |= 0x01 in QEMU for now, but let's
> not mess up dataplace.  Please fix the drivers.
>
> Maybe remove |= 0x01 if legacy is disabled.

Ok, sounds good. Thanks!

>> 4.1.4.5.2 Driver Requirements: ISR status capability
>>
>> If MSI-X capability is enabled, the driver SHOULD NOT access ISR
>> status upon detecting a Queue Interrupt.
>>
>> >> >
>> >> >> ---
>> >> >>  hw/virtio/virtio.c | 19 ++-
>> >> >>  1 file changed, 14 insertions(+), 5 deletions(-)
>> >> >>
>> >> >> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
>> >> >> index 08275a9..4053313 100644
>> >> >> --- a/hw/virtio/virtio.c
>> >> >> +++ b/hw/virtio/virtio.c
>> >> >> @@ -1192,7 +1192,7 @@ void virtio_notify(VirtIODevice *vdev, VirtQueue 
>> >> >> *vq)
>> >> >>  }
>> >> >>
>> >> >>  trace_virtio_notify(vdev, vq);
>> >> >> -vdev->isr |= 0x01;
>> >> >> +vdev->isr |= 0x01; // <= here we set ISR status even if we don't 
>> >> >> raise IRQ
>> >> >>  virtio_notify_vector(vdev, vq->vector);
>> >> >>  }
>> >> >>
>> >> >> @@ -1757,16 +1757,25 @@ static void 
>> >> >> virtio_queue_guest_notifier_read(EventNotifier *n)
>> >> >>  }
>> >> >>  }
>> >> >>
>> >> >> +static void virtio_queue_guest_notifier_read_irqfd(EventNotifier *n)
>> >> >> +{
>> >> >> +VirtQueue *vq = container_of(n, VirtQueue, guest_notifier);
>> >> >> +if (event_notifier_test_and_clear(n)) {
>> >> >> +vq->vdev->isr |= 0x01;
>> >> >> +}
>> >> >> +}
>> >> >> +
>> >> >>  void virtio_queue_set_guest_notifier_fd_handler(VirtQueue *vq, bool 
>> >> >> assign,
>> >> >>  bool with_irqfd)
>> >> >>  {
>> >> >> -if (assign && !with_irqfd) {
>> >> >> +if (assign) {
>> >> >>  event_notifier_set_handler(>guest_notifier,
>> >> >> -   virtio_queue_guest_notifier_read);
>> >> >> + with_irqfd ?
>> >> >> +virtio_queue_guest_notifier_read_irqfd :
>> >> >> +virtio_queue_guest_notifier_read);
>> >> >>  } else {
>> >> >>  event_notifier_set_handler(>guest_notifier, NULL);
>> >> >> -}
>> >> >> -if (!assign) {
>> >> >> +
>> >> >>  /* Test and clear notifier before closing it,
>> >> >>   * in case poll callback didn't have time to run. */
>>

Re: [Qemu-devel] [RFC v2 04/11] tcg: comment on which functions have to be called with tb_lock held

2016-05-05 Thread Sergey Fedorov

On 05/05/16 18:03, Alex Bennée wrote:
> Sergey Fedorov  writes:
>
>> On 05/04/16 18:32, Alex Bennée wrote:
>>> diff --git a/exec.c b/exec.c
>>> index f46e596..17f390e 100644
>>> --- a/exec.c
>>> +++ b/exec.c
>>> @@ -826,6 +826,7 @@ int cpu_breakpoint_insert(CPUState *cpu, vaddr pc, int 
>>> flags,
>>>  {
>>>  CPUBreakpoint *bp;
>>>
>>> +/* TODO: locking (RCU?) */
>>>  bp = g_malloc(sizeof(*bp));
>>>
>>>  bp->pc = pc;
>> This comment is a little inconsistent. We should make access to
>> breakpoint and watchpoint lists to be thread-safe in all the functions
>> using them. So if we note this, it should be noted in all such places.
>> Also, it's probably not a good idea to put such comment just above
>> g_malloc() invocation, it could be a bit confusing. A bit more details
>> would also be nice.
> Good point.
>
> I could really do with some tests to exercise the debugging interface. I
> did some when I wrote the arm kvm GDB stuff (see
> 261f4d6d3e5445f887e070f047968e756c30cf06) but it is a) not plumbed in
> and b) not really a stress test which is what you want to be sure your
> handling is thread safe.

It seems we can only have a race between TCG helper inserting/removing
break-/watchpoint and gdbstub. So maybe we could just use separate lists
for CPU and GDB breakpoints?

Kind regards,
Sergey

Re: [Qemu-devel] [PATCH v2 3/4] adb.c: NO_KEY

On 24 March 2016 at 14:06, Programmingkid  wrote:
> Sets keys that are not supported by ADB to an unusable value of 0xff.
>
> Signed-off-by: John Arbuckle 
> ---
>  hw/input/adb.c | 32 +++-
>  1 file changed, 19 insertions(+), 13 deletions(-)
>
> diff --git a/hw/input/adb.c b/hw/input/adb.c
> index 3bfa686..8413db9 100644
> --- a/hw/input/adb.c
> +++ b/hw/input/adb.c
> @@ -62,6 +62,9 @@ do { printf("ADB: " fmt , ## __VA_ARGS__); } while (0)
>  /* error codes */
>  #define ADB_RET_NOTPRESENT (-2)
>
> +/* The adb keyboard doesn't have every key imaginable */
> +#define NO_KEY 0xff
> +
>  static void adb_device_reset(ADBDevice *d)
>  {
>  qdev_reset_all(DEVICE(d));
> @@ -192,6 +195,9 @@ typedef struct ADBKeyboardClass {
>
>  int qcode_to_adb_keycode[] = {
>
> + /* Make sure future additions are automatically set to NO_KEY */
> +[0 ... 0xff]   = NO_KEY,
> +
>  [Q_KEY_CODE_SHIFT] = ADB_KEY_LEFT_SHIFT,
>  [Q_KEY_CODE_SHIFT_R]   = ADB_KEY_RIGHT_SHIFT,
>  [Q_KEY_CODE_ALT]   = ADB_KEY_LEFT_OPTION,
> @@ -309,19 +315,19 @@ int qcode_to_adb_keycode[] = {
>  [Q_KEY_CODE_PGUP]  = ADB_KEY_PAGE_UP,
>  [Q_KEY_CODE_PGDN]  = ADB_KEY_PAGE_DOWN,
>
> -[Q_KEY_CODE_LESS]  = 0,
> -[Q_KEY_CODE_STOP]  = 0,
> -[Q_KEY_CODE_AGAIN] = 0,
> -[Q_KEY_CODE_PROPS] = 0,
> -[Q_KEY_CODE_UNDO]  = 0,
> -[Q_KEY_CODE_FRONT] = 0,
> -[Q_KEY_CODE_COPY]  = 0,
> -[Q_KEY_CODE_OPEN]  = 0,
> -[Q_KEY_CODE_PASTE] = 0,
> -[Q_KEY_CODE_FIND]  = 0,
> -[Q_KEY_CODE_CUT]   = 0,
> -[Q_KEY_CODE_LF]= 0,
> -[Q_KEY_CODE_COMPOSE]   = 0,
> +[Q_KEY_CODE_LESS]  = NO_KEY,
> +[Q_KEY_CODE_STOP]  = NO_KEY,
> +[Q_KEY_CODE_AGAIN] = NO_KEY,
> +[Q_KEY_CODE_PROPS] = NO_KEY,
> +[Q_KEY_CODE_UNDO]  = NO_KEY,
> +[Q_KEY_CODE_FRONT] = NO_KEY,
> +[Q_KEY_CODE_COPY]  = NO_KEY,
> +[Q_KEY_CODE_OPEN]  = NO_KEY,
> +[Q_KEY_CODE_PASTE] = NO_KEY,
> +[Q_KEY_CODE_FIND]  = NO_KEY,
> +[Q_KEY_CODE_CUT]   = NO_KEY,
> +[Q_KEY_CODE_LF]= NO_KEY,
> +[Q_KEY_CODE_COMPOSE]   = NO_KEY,

Since NO_KEY is the default value, there's no need to explicitly
list these keys as generating it.

You need to add a check to adb_keyboard_event() to make it
return early if the array entry is NO_KEY, because otherwise
we'll end up sending the 0xff to the guest.

thanks
-- PMM

Re: [Qemu-devel] [PATCH v2 2/4] adb.c: add support for QKeyCode

On 24 March 2016 at 14:09, Programmingkid  wrote:
> The old pc scancode translation is replaced with QEMU's QKeyCode.
>
> Signed-off-by: John Arbuckle 
> ---
> *v2 changes
> Changed order of this patch.

I wrote a quick test program to check that this patch doesn't change
what key we generate for any particular qcode (you can find the test
program at http://people.linaro.org/~peter.maydell/adb-test.c),
but it shows up a handful of changes:

qcode 5 pckey 0x64 oldadb 0x0 newadb 0x7c
qcode 82 pckey 0x54 oldadb 0x0 newadb 0x69
qcode 93 pckey 0x56 oldadb 0xa newadb 0x0
qcode 96 pckey 0xb7 oldadb 0x0 newadb 0x69
qcode 118 pckey 0x0 oldadb 0x0 newadb 0x72
qcode 120 pckey 0xdc oldadb 0x7e newadb 0x37
qcode 122 pckey 0x0 oldadb 0x0 newadb 0x71
qcode 125 pckey 0x0 oldadb 0x0 newadb 0x51

That's Q_KEY_CODE_ALTGR, Q_KEY_CODE_SYSRQ,
Q_KEY_CODE_LESS, Q_KEY_CODE_PRINT, Q_KEY_CODE_HELP,
Q_KEY_CODE_META_R, Q_KEY_CODE_PAUSE, Q_KEY_CODE_KP_EQUALS.

I suspect these are all bugfixes to the layout, but they should
be done in a separate patch, so that this patch doesn't change
behaviour at all.

If you fix the qcode_to_adb_keycode[] array so that it gives the
same results for those entries as the existing code then this
patch is good.

thanks
-- PMM

Re: [Qemu-devel] [RFC PATCH] virtio: set ISR status when delivering queue MSI

On Thu, May 05, 2016 at 05:15:01PM +0200, Ladi Prosek wrote:
> On Thu, May 5, 2016 at 5:02 PM, Michael S. Tsirkin  wrote:
> > On Thu, May 05, 2016 at 04:59:13PM +0200, Ladi Prosek wrote:
> >> On Thu, May 5, 2016 at 3:36 PM, Michael S. Tsirkin  wrote:
> >> > On Thu, May 05, 2016 at 11:13:37AM +0200, Ladi Prosek wrote:
> >> >> There is a discrepancy between dataplane and no-dataplane virtio
> >> >> behavior with respect to the ISR status register and MSI-X
> >> >> capability.
> >> >>
> >> >> Without dataplane the Queue interrupt ISR status bit is set
> >> >> regardless of how the notification is delivered to the guest.
> >> >>
> >> >> With dataplane the Queue interrupt ISR status bit is set only
> >> >> if the notification is delivered as an IRQ.
> >> >>
> >> >> While both conform to the spec, QEMU should be consistent to
> >> >> minimize surprises with broken drivers.
> >> >>
> >> >> This RFC patch shows the relevant code location and contains a
> >> >> possible fix which makes QEMU set the bit on the MSI dataplane
> >> >> code path.
> >> >>
> >> >> Signed-off-by: Ladi Prosek 
> >> >
> >> > How is this supposed to work when injecting interrupts using irqfd?
> >> > If anything, I would rather drop setting the flag on non-dataplane.
> >>
> >> My bad, this is messed up. For some reason it works, or appears to be
> >> working at least.
> >>
> >> The reason why I'd like to add setting the flag in dataplane is that
> >> if we go the other way we'll break drivers. I know for sure that there
> >> are drivers depending on this.
> >
> > Rly? These are way out of spec then. I'd rather have these drivers
> > fixed: virtio said ISR can't be used with MSI since the 1st version
> > where MSI was introduced. virtio 1.0 allows using ISR for config
> > interrupts only.
> 
> Should the below paragraph be worded differently then? Or explicitly
> say that the contents of the status word is undefined?

It says don't access. How would you word a driver requirement otherwise?

> Specified since
> 1st version vs. actually behaving since 1st version.

Yea. Yack. People just try whatever seems to work.

> If you think it's
> safe to remove the |= 0x01 from the regular path I'll fix the drivers
> I know of, that's not a problem.

Maybe we should keep the  |= 0x01 in QEMU for now, but let's
not mess up dataplace.  Please fix the drivers.

Maybe remove |= 0x01 if legacy is disabled.

> 4.1.4.5.2 Driver Requirements: ISR status capability
> 
> If MSI-X capability is enabled, the driver SHOULD NOT access ISR
> status upon detecting a Queue Interrupt.
> 
> >> >
> >> >> ---
> >> >>  hw/virtio/virtio.c | 19 ++-
> >> >>  1 file changed, 14 insertions(+), 5 deletions(-)
> >> >>
> >> >> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> >> >> index 08275a9..4053313 100644
> >> >> --- a/hw/virtio/virtio.c
> >> >> +++ b/hw/virtio/virtio.c
> >> >> @@ -1192,7 +1192,7 @@ void virtio_notify(VirtIODevice *vdev, VirtQueue 
> >> >> *vq)
> >> >>  }
> >> >>
> >> >>  trace_virtio_notify(vdev, vq);
> >> >> -vdev->isr |= 0x01;
> >> >> +vdev->isr |= 0x01; // <= here we set ISR status even if we don't 
> >> >> raise IRQ
> >> >>  virtio_notify_vector(vdev, vq->vector);
> >> >>  }
> >> >>
> >> >> @@ -1757,16 +1757,25 @@ static void 
> >> >> virtio_queue_guest_notifier_read(EventNotifier *n)
> >> >>  }
> >> >>  }
> >> >>
> >> >> +static void virtio_queue_guest_notifier_read_irqfd(EventNotifier *n)
> >> >> +{
> >> >> +VirtQueue *vq = container_of(n, VirtQueue, guest_notifier);
> >> >> +if (event_notifier_test_and_clear(n)) {
> >> >> +vq->vdev->isr |= 0x01;
> >> >> +}
> >> >> +}
> >> >> +
> >> >>  void virtio_queue_set_guest_notifier_fd_handler(VirtQueue *vq, bool 
> >> >> assign,
> >> >>  bool with_irqfd)
> >> >>  {
> >> >> -if (assign && !with_irqfd) {
> >> >> +if (assign) {
> >> >>  event_notifier_set_handler(>guest_notifier,
> >> >> -   virtio_queue_guest_notifier_read);
> >> >> + with_irqfd ?
> >> >> +virtio_queue_guest_notifier_read_irqfd :
> >> >> +virtio_queue_guest_notifier_read);
> >> >>  } else {
> >> >>  event_notifier_set_handler(>guest_notifier, NULL);
> >> >> -}
> >> >> -if (!assign) {
> >> >> +
> >> >>  /* Test and clear notifier before closing it,
> >> >>   * in case poll callback didn't have time to run. */
> >> >>  virtio_queue_guest_notifier_read(>guest_notifier);
> >> >> --
> >> >> 2.5.5

Re: [Qemu-devel] [RFC PATCH] virtio: set ISR status when delivering queue MSI

On Thu, May 5, 2016 at 5:02 PM, Michael S. Tsirkin  wrote:
> On Thu, May 05, 2016 at 04:59:13PM +0200, Ladi Prosek wrote:
>> On Thu, May 5, 2016 at 3:36 PM, Michael S. Tsirkin  wrote:
>> > On Thu, May 05, 2016 at 11:13:37AM +0200, Ladi Prosek wrote:
>> >> There is a discrepancy between dataplane and no-dataplane virtio
>> >> behavior with respect to the ISR status register and MSI-X
>> >> capability.
>> >>
>> >> Without dataplane the Queue interrupt ISR status bit is set
>> >> regardless of how the notification is delivered to the guest.
>> >>
>> >> With dataplane the Queue interrupt ISR status bit is set only
>> >> if the notification is delivered as an IRQ.
>> >>
>> >> While both conform to the spec, QEMU should be consistent to
>> >> minimize surprises with broken drivers.
>> >>
>> >> This RFC patch shows the relevant code location and contains a
>> >> possible fix which makes QEMU set the bit on the MSI dataplane
>> >> code path.
>> >>
>> >> Signed-off-by: Ladi Prosek 
>> >
>> > How is this supposed to work when injecting interrupts using irqfd?
>> > If anything, I would rather drop setting the flag on non-dataplane.
>>
>> My bad, this is messed up. For some reason it works, or appears to be
>> working at least.
>>
>> The reason why I'd like to add setting the flag in dataplane is that
>> if we go the other way we'll break drivers. I know for sure that there
>> are drivers depending on this.
>
> Rly? These are way out of spec then. I'd rather have these drivers
> fixed: virtio said ISR can't be used with MSI since the 1st version
> where MSI was introduced. virtio 1.0 allows using ISR for config
> interrupts only.

Should the below paragraph be worded differently then? Or explicitly
say that the contents of the status word is undefined? Specified since
1st version vs. actually behaving since 1st version. If you think it's
safe to remove the |= 0x01 from the regular path I'll fix the drivers
I know of, that's not a problem.

4.1.4.5.2 Driver Requirements: ISR status capability

If MSI-X capability is enabled, the driver SHOULD NOT access ISR
status upon detecting a Queue Interrupt.

>> >
>> >> ---
>> >>  hw/virtio/virtio.c | 19 ++-
>> >>  1 file changed, 14 insertions(+), 5 deletions(-)
>> >>
>> >> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
>> >> index 08275a9..4053313 100644
>> >> --- a/hw/virtio/virtio.c
>> >> +++ b/hw/virtio/virtio.c
>> >> @@ -1192,7 +1192,7 @@ void virtio_notify(VirtIODevice *vdev, VirtQueue 
>> >> *vq)
>> >>  }
>> >>
>> >>  trace_virtio_notify(vdev, vq);
>> >> -vdev->isr |= 0x01;
>> >> +vdev->isr |= 0x01; // <= here we set ISR status even if we don't 
>> >> raise IRQ
>> >>  virtio_notify_vector(vdev, vq->vector);
>> >>  }
>> >>
>> >> @@ -1757,16 +1757,25 @@ static void 
>> >> virtio_queue_guest_notifier_read(EventNotifier *n)
>> >>  }
>> >>  }
>> >>
>> >> +static void virtio_queue_guest_notifier_read_irqfd(EventNotifier *n)
>> >> +{
>> >> +VirtQueue *vq = container_of(n, VirtQueue, guest_notifier);
>> >> +if (event_notifier_test_and_clear(n)) {
>> >> +vq->vdev->isr |= 0x01;
>> >> +}
>> >> +}
>> >> +
>> >>  void virtio_queue_set_guest_notifier_fd_handler(VirtQueue *vq, bool 
>> >> assign,
>> >>  bool with_irqfd)
>> >>  {
>> >> -if (assign && !with_irqfd) {
>> >> +if (assign) {
>> >>  event_notifier_set_handler(>guest_notifier,
>> >> -   virtio_queue_guest_notifier_read);
>> >> + with_irqfd ?
>> >> +virtio_queue_guest_notifier_read_irqfd :
>> >> +virtio_queue_guest_notifier_read);
>> >>  } else {
>> >>  event_notifier_set_handler(>guest_notifier, NULL);
>> >> -}
>> >> -if (!assign) {
>> >> +
>> >>  /* Test and clear notifier before closing it,
>> >>   * in case poll callback didn't have time to run. */
>> >>  virtio_queue_guest_notifier_read(>guest_notifier);
>> >> --
>> >> 2.5.5

Re: [Qemu-devel] [RFC v2 04/11] tcg: comment on which functions have to be called with tb_lock held

2016-05-05 Thread Alex Bennée


Sergey Fedorov  writes:

> On 05/04/16 18:32, Alex Bennée wrote:
>> diff --git a/exec.c b/exec.c
>> index f46e596..17f390e 100644
>> --- a/exec.c
>> +++ b/exec.c
>> @@ -826,6 +826,7 @@ int cpu_breakpoint_insert(CPUState *cpu, vaddr pc, int 
>> flags,
>>  {
>>  CPUBreakpoint *bp;
>>
>> +/* TODO: locking (RCU?) */
>>  bp = g_malloc(sizeof(*bp));
>>
>>  bp->pc = pc;
>
> This comment is a little inconsistent. We should make access to
> breakpoint and watchpoint lists to be thread-safe in all the functions
> using them. So if we note this, it should be noted in all such places.
> Also, it's probably not a good idea to put such comment just above
> g_malloc() invocation, it could be a bit confusing. A bit more details
> would also be nice.

Good point.

I could really do with some tests to exercise the debugging interface. I
did some when I wrote the arm kvm GDB stuff (see
261f4d6d3e5445f887e070f047968e756c30cf06) but it is a) not plumbed in
and b) not really a stress test which is what you want to be sure your
handling is thread safe.

>
> Kind regards,
> Sergey


--
Alex Bennée

Re: [Qemu-devel] [RFC PATCH] virtio: set ISR status when delivering queue MSI

On Thu, May 05, 2016 at 04:59:13PM +0200, Ladi Prosek wrote:
> On Thu, May 5, 2016 at 3:36 PM, Michael S. Tsirkin  wrote:
> > On Thu, May 05, 2016 at 11:13:37AM +0200, Ladi Prosek wrote:
> >> There is a discrepancy between dataplane and no-dataplane virtio
> >> behavior with respect to the ISR status register and MSI-X
> >> capability.
> >>
> >> Without dataplane the Queue interrupt ISR status bit is set
> >> regardless of how the notification is delivered to the guest.
> >>
> >> With dataplane the Queue interrupt ISR status bit is set only
> >> if the notification is delivered as an IRQ.
> >>
> >> While both conform to the spec, QEMU should be consistent to
> >> minimize surprises with broken drivers.
> >>
> >> This RFC patch shows the relevant code location and contains a
> >> possible fix which makes QEMU set the bit on the MSI dataplane
> >> code path.
> >>
> >> Signed-off-by: Ladi Prosek 
> >
> > How is this supposed to work when injecting interrupts using irqfd?
> > If anything, I would rather drop setting the flag on non-dataplane.
> 
> My bad, this is messed up. For some reason it works, or appears to be
> working at least.
> 
> The reason why I'd like to add setting the flag in dataplane is that
> if we go the other way we'll break drivers. I know for sure that there
> are drivers depending on this.

Rly? These are way out of spec then. I'd rather have these drivers
fixed: virtio said ISR can't be used with MSI since the 1st version
where MSI was introduced. virtio 1.0 allows using ISR for config
interrupts only.

> >
> >> ---
> >>  hw/virtio/virtio.c | 19 ++-
> >>  1 file changed, 14 insertions(+), 5 deletions(-)
> >>
> >> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> >> index 08275a9..4053313 100644
> >> --- a/hw/virtio/virtio.c
> >> +++ b/hw/virtio/virtio.c
> >> @@ -1192,7 +1192,7 @@ void virtio_notify(VirtIODevice *vdev, VirtQueue *vq)
> >>  }
> >>
> >>  trace_virtio_notify(vdev, vq);
> >> -vdev->isr |= 0x01;
> >> +vdev->isr |= 0x01; // <= here we set ISR status even if we don't 
> >> raise IRQ
> >>  virtio_notify_vector(vdev, vq->vector);
> >>  }
> >>
> >> @@ -1757,16 +1757,25 @@ static void 
> >> virtio_queue_guest_notifier_read(EventNotifier *n)
> >>  }
> >>  }
> >>
> >> +static void virtio_queue_guest_notifier_read_irqfd(EventNotifier *n)
> >> +{
> >> +VirtQueue *vq = container_of(n, VirtQueue, guest_notifier);
> >> +if (event_notifier_test_and_clear(n)) {
> >> +vq->vdev->isr |= 0x01;
> >> +}
> >> +}
> >> +
> >>  void virtio_queue_set_guest_notifier_fd_handler(VirtQueue *vq, bool 
> >> assign,
> >>  bool with_irqfd)
> >>  {
> >> -if (assign && !with_irqfd) {
> >> +if (assign) {
> >>  event_notifier_set_handler(>guest_notifier,
> >> -   virtio_queue_guest_notifier_read);
> >> + with_irqfd ?
> >> +virtio_queue_guest_notifier_read_irqfd :
> >> +virtio_queue_guest_notifier_read);
> >>  } else {
> >>  event_notifier_set_handler(>guest_notifier, NULL);
> >> -}
> >> -if (!assign) {
> >> +
> >>  /* Test and clear notifier before closing it,
> >>   * in case poll callback didn't have time to run. */
> >>  virtio_queue_guest_notifier_read(>guest_notifier);
> >> --
> >> 2.5.5

Re: [Qemu-devel] [RFC PATCH] virtio: set ISR status when delivering queue MSI

On Thu, May 5, 2016 at 3:36 PM, Michael S. Tsirkin  wrote:
> On Thu, May 05, 2016 at 11:13:37AM +0200, Ladi Prosek wrote:
>> There is a discrepancy between dataplane and no-dataplane virtio
>> behavior with respect to the ISR status register and MSI-X
>> capability.
>>
>> Without dataplane the Queue interrupt ISR status bit is set
>> regardless of how the notification is delivered to the guest.
>>
>> With dataplane the Queue interrupt ISR status bit is set only
>> if the notification is delivered as an IRQ.
>>
>> While both conform to the spec, QEMU should be consistent to
>> minimize surprises with broken drivers.
>>
>> This RFC patch shows the relevant code location and contains a
>> possible fix which makes QEMU set the bit on the MSI dataplane
>> code path.
>>
>> Signed-off-by: Ladi Prosek 
>
> How is this supposed to work when injecting interrupts using irqfd?
> If anything, I would rather drop setting the flag on non-dataplane.

My bad, this is messed up. For some reason it works, or appears to be
working at least.

The reason why I'd like to add setting the flag in dataplane is that
if we go the other way we'll break drivers. I know for sure that there
are drivers depending on this.

>
>> ---
>>  hw/virtio/virtio.c | 19 ++-
>>  1 file changed, 14 insertions(+), 5 deletions(-)
>>
>> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
>> index 08275a9..4053313 100644
>> --- a/hw/virtio/virtio.c
>> +++ b/hw/virtio/virtio.c
>> @@ -1192,7 +1192,7 @@ void virtio_notify(VirtIODevice *vdev, VirtQueue *vq)
>>  }
>>
>>  trace_virtio_notify(vdev, vq);
>> -vdev->isr |= 0x01;
>> +vdev->isr |= 0x01; // <= here we set ISR status even if we don't raise 
>> IRQ
>>  virtio_notify_vector(vdev, vq->vector);
>>  }
>>
>> @@ -1757,16 +1757,25 @@ static void 
>> virtio_queue_guest_notifier_read(EventNotifier *n)
>>  }
>>  }
>>
>> +static void virtio_queue_guest_notifier_read_irqfd(EventNotifier *n)
>> +{
>> +VirtQueue *vq = container_of(n, VirtQueue, guest_notifier);
>> +if (event_notifier_test_and_clear(n)) {
>> +vq->vdev->isr |= 0x01;
>> +}
>> +}
>> +
>>  void virtio_queue_set_guest_notifier_fd_handler(VirtQueue *vq, bool assign,
>>  bool with_irqfd)
>>  {
>> -if (assign && !with_irqfd) {
>> +if (assign) {
>>  event_notifier_set_handler(>guest_notifier,
>> -   virtio_queue_guest_notifier_read);
>> + with_irqfd ?
>> +virtio_queue_guest_notifier_read_irqfd :
>> +virtio_queue_guest_notifier_read);
>>  } else {
>>  event_notifier_set_handler(>guest_notifier, NULL);
>> -}
>> -if (!assign) {
>> +
>>  /* Test and clear notifier before closing it,
>>   * in case poll callback didn't have time to run. */
>>  virtio_queue_guest_notifier_read(>guest_notifier);
>> --
>> 2.5.5

Re: [Qemu-devel] [PATCH v2 1/4] adb-keys.h - initial commit


On May 5, 2016, at 10:54 AM, Peter Maydell wrote:

> On 24 March 2016 at 14:03, Programmingkid  wrote:
>> Add the adb-keys.h file. It maps ADB transition key codes with values.
>> 
>> Signed-off-by: John Arbuckle 
>> ---
>> *v2 changes:
>> Changed order of this patch.
>> 
>> include/hw/input/adb-keys.h | 147 
>> 
>> 1 file changed, 147 insertions(+)
>> create mode 100644 include/hw/input/adb-keys.h
>> 
>> diff --git a/include/hw/input/adb-keys.h b/include/hw/input/adb-keys.h
>> new file mode 100644
>> index 000..633cf71
>> --- /dev/null
>> +++ b/include/hw/input/adb-keys.h
>> @@ -0,0 +1,147 @@
>> +/*
>> + * QEMU System Emulator
>> + *
>> + * Copyright (c) 2016 John Arbuckle
>> + *
>> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
>> + * See the COPYING file in the top-level directory.
>> + */
>> +
>> +/*
>> + *  adb-keys.h
>> + *
>> + *  Provides an enum of all the Macintosh keycodes.
>> + *  Note: keys like Power, volume related, and eject are handled at a lower
>> + *level and are not available to QEMU. That doesn't mean we can't
>> + *substitute one key for another. The function keys like F1 make a 
>> good
>> + *substitute for these keys. This can be done in the GTK, SDL, or 
>> Cocoa
>> + *code.
> 
> I don't think this Note makes much sense here, because it's talking about
> the UI layer, but this file is just for the keyboard emulation.
> 
> Otherwise
> Reviewed-by: Peter Maydell 
> 
> thanks
> -- PMM

Thank you for reviewing my patch. Just to sure, you want me to send another 
patch that has the note removed? If that is the case, I give you full 
permission to delete the note. That would definitely save us some time.

Re: [Qemu-devel] [PATCH v2 1/4] adb-keys.h - initial commit

On 24 March 2016 at 14:03, Programmingkid  wrote:
> Add the adb-keys.h file. It maps ADB transition key codes with values.
>
> Signed-off-by: John Arbuckle 
> ---
> *v2 changes:
> Changed order of this patch.
>
>  include/hw/input/adb-keys.h | 147 
> 
>  1 file changed, 147 insertions(+)
>  create mode 100644 include/hw/input/adb-keys.h
>
> diff --git a/include/hw/input/adb-keys.h b/include/hw/input/adb-keys.h
> new file mode 100644
> index 000..633cf71
> --- /dev/null
> +++ b/include/hw/input/adb-keys.h
> @@ -0,0 +1,147 @@
> +/*
> + * QEMU System Emulator
> + *
> + * Copyright (c) 2016 John Arbuckle
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + */
> +
> +/*
> + *  adb-keys.h
> + *
> + *  Provides an enum of all the Macintosh keycodes.
> + *  Note: keys like Power, volume related, and eject are handled at a lower
> + *level and are not available to QEMU. That doesn't mean we can't
> + *substitute one key for another. The function keys like F1 make a 
> good
> + *substitute for these keys. This can be done in the GTK, SDL, or 
> Cocoa
> + *code.

I don't think this Note makes much sense here, because it's talking about
the UI layer, but this file is just for the keyboard emulation.

Otherwise
Reviewed-by: Peter Maydell 

thanks
-- PMM

[Qemu-devel] [PATCH v1 5/6] scripts: ensure monitor socket has SO_REUSEADDR set

If tests use a TCP based monitor socket, the connection will
go into a TIMED_WAIT state when the test exits. This will
randomly prevent the test from being re-run without a certain
time period. Set the SO_REUSEADDR flag on the socket to ensure
we can immediately re-run the tests
---
 scripts/qmp/qmp.py | 1 +
 1 file changed, 1 insertion(+)

diff --git a/scripts/qmp/qmp.py b/scripts/qmp/qmp.py
index 2d0d926..62d3651 100644
--- a/scripts/qmp/qmp.py
+++ b/scripts/qmp/qmp.py
@@ -43,6 +43,7 @@ class QEMUMonitorProtocol:
 self._debug = debug
 self.__sock = self.__get_sock()
 if server:
+self.__sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
 self.__sock.bind(self.__address)
 self.__sock.listen(1)
 
-- 
2.5.5

[Qemu-devel] [PATCH v1 6/6] tests: introduce a framework for testing migration performance

This introduces a moderately general purpose framework for
testing performance of migration.

The initial guest workload is provided by the included 'stress'
program, which is configured to spawn one thread per guest CPU
and run a maximally memory intensive workload. It will loop
over GB of memory, xor'ing each byte with data from a 4k array
of random bytes. This ensures heavy read and write load across
all of guest memory to stress the migration performance. While
running the 'stress' program will record how long it takes to
xor each GB of memory and print this data for later reporting.

The test engine will spawn a pair of QEMU processes, either on
the same host, or with the target on a remote host via ssh,
using the host kernel and a custom initrd built with 'stress'
as the /init binary. Kernel command line args are set to ensure
a fast kernel boot time (< 1 second) between launching QEMU and
the stress program starting execution.

None the less, the test engine will initially wait N seconds for
the guest workload to stablize, before starting the migration
operation. When migration is running, the engine will use pause,
post-copy, autoconverge, xbzrle compression and multithread
compression features, as well as downtime & bandwidth tuning
to encourage completion. If migration completes, the test engine
will wait N seconds again for the guest workooad to stablize on
the target host. If migration does not complete after a preset
number of iterations, it will be aborted.

While the QEMU process is running on the source host, the test
engine will sample the host CPU usage of QEMU as a whole, and
each vCPU thread. While migration is running, it will record
all the stats reported by 'query-migration'. Finally, it will
capture the output of the stress program running in the guest.

All the data produced from a single test execution is recorded
in a structured JSON file. A separate program is then able to
create interactive charts using the "plotly" python + javascript
libraries, showing the characteristics of the migration.

The data output provides visualization of the effect on guest
vCPU workloads from the migration process, the corresponding
vCPU utilization on the host, and the overall CPU hit from
QEMU on the host. This is correlated from statistics from the
migration process, such as downtime, vCPU throttling and iteration
number.

While the tests can be run individually with arbitrary parameters,
there is also a facility for producing batch reports for a number
of pre-defined scenarios / comparisons, in order to be able to
get standardized results across different hardware configurations
(eg TCP vs RDMA, or comparing different VCPU counts / memory
sizes, etc).

To use this, first you must build the initrd image

 $ make tests/migration/initrd-stress.img

To run a a one-shot test with all default parameters

 $ ./tests/migration/guestperf.py > result.json

This has many command line args for varying its behaviour.
For example, to increase the RAM size and CPU count and
bind it to specific host NUMA nodes

 $ ./tests/migration/guestperf.py \
   --mem 4 --cpus 2 \
   --src-mem-bind 0 --src-cpu-bind 0,1 \
   --dst-mem-bind 1 --dst-cpu-bind 2,3 \
   > result.json

Using mem + cpu binding is strongly recommended on NUMA
machines, otherwise the guest performance results will
vary wildly between runs of the test due to lucky/unlucky
NUMA placement, making sensible data analysis impossible.

To make it run across separate hosts:

 $ ./tests/migration/guestperf.py \
   --dst-host somehostname > result.json

To request that post-copy is enabled, with switchover
after 5 iterations

 $ ./tests/migration/guestperf.py \
   --post-copy --post-copy-iters 5 > result.json

Once a result.json file is created, a graph of the data
can be generated, showing guest workload performance per
thread and the migration iteration points:

 $ ./tests/migration/guestperf-plot.py --output result.html \
--migration-iters --split-guest-cpu result.json

To further include host vCPU utilization and overall QEMU
utilization

 $ ./tests/migration/guestperf-plot.py --output result.html \
--migration-iters --split-guest-cpu \
--qemu-cpu --vcpu-cpu result.json

NB, the 'guestperf-plot.py' command requires that you have
the plotly python library installed. eg you must do

 $ pip install --user  plotly

Viewing the result.html file requires that you have the
plotly.min.js file in the same directory as the HTML
output. This js file is installed as part of the plotly
python library, so can be found in

  $HOME/.local/lib/python2.7/site-packages/plotly/offline/plotly.min.js

The guestperf-plot.py program can accept multiple json files
to plot, enabling results from different configurations to
be compared.

Finally, to run the entire standardized set of comparisons

  $ ./tests/migration/guestperf-batch.py \
   --dst-host somehost \
   --mem 4 --cpus 2 \
   --src-mem-bind 0 --src-cpu-bind

[Qemu-devel] [PATCH v1 3/6] scripts: refactor the VM class in iotests for reuse

The iotests module has a python class for controlling QEMU
processes. Pull the generic functionality out of this file
and create a scripts/qemu.py module containing a QEMUMachine
class. Put the QTest integration support into a subclass
QEMUQtestMachine.

Signed-off-by: Daniel P. Berrange 
---
 scripts/qemu.py   | 202 ++
 scripts/qtest.py  |  34 +++
 tests/qemu-iotests/iotests.py | 135 +---
 3 files changed, 240 insertions(+), 131 deletions(-)
 create mode 100644 scripts/qemu.py

diff --git a/scripts/qemu.py b/scripts/qemu.py
new file mode 100644
index 000..9cdad24
--- /dev/null
+++ b/scripts/qemu.py
@@ -0,0 +1,202 @@
+# QEMU library
+#
+# Copyright (C) 2015-2016 Red Hat Inc.
+# Copyright (C) 2012 IBM Corp.
+#
+# Authors:
+#  Fam Zheng 
+#
+# This work is licensed under the terms of the GNU GPL, version 2.  See
+# the COPYING file in the top-level directory.
+#
+# Based on qmp.py.
+#
+
+import errno
+import string
+import os
+import sys
+import subprocess
+import qmp.qmp
+
+
+class QEMUMachine(object):
+'''A QEMU VM'''
+
+def __init__(self, binary, args=[], wrapper=[], name=None, 
test_dir="/var/tmp",
+ monitor_address=None, debug=False):
+if name is None:
+name = "qemu-%d" % os.getpid()
+if monitor_address is None:
+monitor_address = os.path.join(test_dir, name + "-monitor.sock")
+self._monitor_address = monitor_address
+self._qemu_log_path = os.path.join(test_dir, name + ".log")
+self._popen = None
+self._binary = binary
+self._args = args
+self._wrapper = wrapper
+self._events = []
+self._iolog = None
+self._debug = debug
+
+# This can be used to add an unused monitor instance.
+def add_monitor_telnet(self, ip, port):
+args = 'tcp:%s:%d,server,nowait,telnet' % (ip, port)
+self._args.append('-monitor')
+self._args.append(args)
+
+def add_fd(self, fd, fdset, opaque, opts=''):
+'''Pass a file descriptor to the VM'''
+options = ['fd=%d' % fd,
+   'set=%d' % fdset,
+   'opaque=%s' % opaque]
+if opts:
+options.append(opts)
+
+self._args.append('-add-fd')
+self._args.append(','.join(options))
+return self
+
+def send_fd_scm(self, fd_file_path):
+# In iotest.py, the qmp should always use unix socket.
+assert self._qmp.is_scm_available()
+bin = socket_scm_helper
+if os.path.exists(bin) == False:
+print "Scm help program does not present, path '%s'." % bin
+return -1
+fd_param = ["%s" % bin,
+"%d" % self._qmp.get_sock_fd(),
+"%s" % fd_file_path]
+devnull = open('/dev/null', 'rb')
+p = subprocess.Popen(fd_param, stdin=devnull, stdout=sys.stdout,
+ stderr=sys.stderr)
+return p.wait()
+
+@staticmethod
+def _remove_if_exists(path):
+'''Remove file object at path if it exists'''
+try:
+os.remove(path)
+except OSError as exception:
+if exception.errno == errno.ENOENT:
+return
+raise
+
+def get_pid(self):
+if not self._popen:
+return None
+return self._popen.pid
+
+def _load_io_log(self):
+with open(self._qemu_log_path, "r") as fh:
+self._iolog = fh.read()
+
+def _base_args(self):
+if isinstance(self._monitor_address, tuple):
+moncdev = "socket,id=mon,host=%s,port=%s" % (
+self._monitor_address[0],
+self._monitor_address[1])
+else:
+moncdev = 'socket,id=mon,path=%s' % self._monitor_address
+return ['-chardev', moncdev,
+'-mon', 'chardev=mon,mode=control',
+'-display', 'none', '-vga', 'none']
+
+def _pre_launch(self):
+self._qmp = qmp.qmp.QEMUMonitorProtocol(self._monitor_address, 
server=True,
+debug=self._debug)
+
+def _post_launch(self):
+self._qmp.accept()
+
+def _post_shutdown(self):
+if not isinstance(self._monitor_address, tuple):
+self._remove_if_exists(self._monitor_address)
+self._remove_if_exists(self._qemu_log_path)
+
+def launch(self):
+'''Launch the VM and establish a QMP connection'''
+devnull = open('/dev/null', 'rb')
+qemulog = open(self._qemu_log_path, 'wb')
+try:
+self._pre_launch()
+args = self._wrapper + [self._binary] + self._base_args() + 
self._args
+self._popen = subprocess.Popen(args, stdin=devnull, stdout=qemulog,
+   stderr=subprocess.STDOUT, 
shell=False)
+

[Qemu-devel] [PATCH v1 0/6] A migration performance testing framework

This series of patches provides a framework for testing migration performance
characteristics. The motivating factor for this is planning that is underway
in OpenStack wrt making use of QEMU migration features such as compression,
auto-converge and post-copy. The primary aim for OpenStack is to have Nova
autonomously manage migration features & tunables to maximise chances that
migration will complete. The problem faced is figuring out just which QEMU
migration features are "best" suited to our needs. This means we want data
on how well they are able to ensure completion of a migration, against the
host resources used and the impact on the guest workload performance.

The test framework produced here takes a pathelogical guest workload (every
CPU just burning 100% of time xor'ing every byte of guest memory with random
data). This is quite a pessimistic test because most guest workloads are not
giong to be this heavy on memory writes, and their data won't be uniformly
random and so will be able to compress better than this test does.

With this worst case guest, I have produced a set of tests using UNIX socket,
TCP localhost, TCP remote and RDMA remote socket transports, with both a
1 GB RAM + 1 CPU guest and a 8 GB RAM + 4 CPU guest.

The TCP/RDMA remote host tests were run over a 10-GiG-E network interface.

I have put the results online to view here:

https://berrange.fedorapeople.org/qemu-mig-test-2016-05-05/

The charts here are showing two core sets of data:

- The guest CPU performance. The left axis is showing the time in milliseconds
required to xor 1 GB of memory. This is shown per-guest CPU and combined all
CPUs.

- The host CPU utilization. The right axis is showing the overall QEMU process
CPU utilization, and the per-VCPU utilization.

Note that the charts are interactive - you can turn on/off each plot line and
zoom in by selecting regions on the chart.

Some interesting things that I have observed with this

- At the start of each iteration of migration there is a distinct drop in
guest CPU performance as shown by a spike in the guest CPU time lines.
Performance would drop from 200ms/GB to 400ms/GB. Presumably this is
related to QEMU recalculating the dirty bitmap for the guest RAM. See
the spikes in the green line in:

https://berrange.fedorapeople.org/qemu-mig-test-2016-05-05/tcp-remote-1gb-1cpu/post-copy-bandwidth/post-copy-bw-1gbs.html

- For the larger sized guests, the auto-converge code has to throttle the
guest to as much as 90% or more before it is able to meet the 500ms max
downtime value

https://berrange.fedorapeople.org/qemu-mig-test-2016-05-05/tcp-remote-1gb-1cpu/auto-converge-bandwidth/auto-converge-bw-1gbs.html

Even then I often saw tests aborting as they hit the max number of
iterations I permitted (30 iters max)

https://berrange.fedorapeople.org/qemu-mig-test-2016-05-05/tcp-remote-8gb-4cpu/auto-converge-bandwidth/auto-converge-bw-10gbs.html

- MT compression is actively harmful to chances of successful migration when
the guest RAM is not compression friendly. My work load is worst case since
it is splattering RAM with totally random bytes. The MT compression is
dramatically increasing the time for each iteration as we bottleneck on CPU
compression speed, leaving the network largely idle. This causes migration
which would have completed without compression, to fail. It also burns huge
amounts of host CPU time

https://berrange.fedorapeople.org/qemu-mig-test-2016-05-05/tcp-remote-1gb-1cpu/compr-mt/compr-mt-threads-4.html

- XBZRLE compression did not have as much of a CPU peformance penalty on the
host as MT comprssion, but also did not help migration to actually complete.
Again this is largely due to the workload being the worst case scenario with
random bytes. The downside is obviously the potentially significant memory
overhead on the host due to the cache sizing

https://berrange.fedorapeople.org/qemu-mig-test-2016-05-05/tcp-remote-1gb-1cpu/compr-xbzrle/compr-xbzrle-cache-50.html

- Post-copy, by its very nature, obviously ensured that the migraton would
complete. While post-copy was running in pre-copy mode there was a somewhat
chaotic small impact on guest CPU performance, causing performance to
periodically oscillate between 400ms/GB and 800ms/GB. This is less than
the impact at the start of each migration iteration which was 1000ms/GB
in this test. There was also a massive penalty at time of switchover from
pre to post copy, as to be expected. The migration completed in post-copy
phase quite quickly though. For this workload, number of iterations in
pre-copy mode before switching to post-copy did not have much impact. I
expect a less extreme workload would have shown more interesting results
wrt number of iterations of pre-copy:

https://berrange.fedorapeople.org/qemu-mig-test-2016-05-05/tcp-remote-8gb-4cpu/post-copy-iters.html

[Qemu-devel] [PATCH v1 2/6] scripts: add a 'debug' parameter to QEMUMonitorProtocol

Add a 'debug' parameter to the QEMUMonitorProtocol class
which will cause it to print out all JSON strings on
sys.stderr

Signed-off-by: Daniel P. Berrange 
---
 scripts/qmp/qmp.py | 13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/scripts/qmp/qmp.py b/scripts/qmp/qmp.py
index 779332f..70e927e 100644
--- a/scripts/qmp/qmp.py
+++ b/scripts/qmp/qmp.py
@@ -11,6 +11,7 @@
 import json
 import errno
 import socket
+import sys
 
 class QMPError(Exception):
 pass
@@ -25,7 +26,7 @@ class QMPTimeoutError(QMPError):
 pass
 
 class QEMUMonitorProtocol:
-def __init__(self, address, server=False):
+def __init__(self, address, server=False, debug=False):
 """
 Create a QEMUMonitorProtocol class.
 
@@ -39,6 +40,7 @@ class QEMUMonitorProtocol:
 """
 self.__events = []
 self.__address = address
+self._debug = debug
 self.__sock = self.__get_sock()
 if server:
 self.__sock.bind(self.__address)
@@ -68,6 +70,8 @@ class QEMUMonitorProtocol:
 return
 resp = json.loads(data)
 if 'event' in resp:
+if self._debug:
+print >>sys.stderr, "QMP:<<< %s" % resp
 self.__events.append(resp)
 if not only_event:
 continue
@@ -148,13 +152,18 @@ class QEMUMonitorProtocol:
 @return QMP response as a Python dict or None if the connection has
 been closed
 """
+if self._debug:
+print >>sys.stderr, "QMP:>>> %s" % qmp_cmd
 try:
 self.__sock.sendall(json.dumps(qmp_cmd))
 except socket.error as err:
 if err[0] == errno.EPIPE:
 return
 raise socket.error(err)
-return self.__json_read()
+resp = self.__json_read()
+if self._debug:
+print >>sys.stderr, "QMP:<<< %s" % resp
+return resp
 
 def cmd(self, name, args=None, id=None):
 """
-- 
2.5.5

[Qemu-devel] [PATCH v1 4/6] scripts: set timeout when waiting for qemu monitor connection

If QEMU fails to launch for some reason, the QEMUMonitorProtocol
class accept() method will wait forever in a socket accept call.
Set a timeout of 15 seconds so that we fail more gracefully
instead of hanging the test script forever

Signed-off-by: Daniel P. Berrange 
---
 scripts/qmp/qmp.py | 1 +
 1 file changed, 1 insertion(+)

diff --git a/scripts/qmp/qmp.py b/scripts/qmp/qmp.py
index 70e927e..2d0d926 100644
--- a/scripts/qmp/qmp.py
+++ b/scripts/qmp/qmp.py
@@ -140,6 +140,7 @@ class QEMUMonitorProtocol:
 @raise QMPConnectError if the greeting is not received
 @raise QMPCapabilitiesError if fails to negotiate capabilities
 """
+self.__sock.settimeout(15)
 self.__sock, _ = self.__sock.accept()
 self.__sockfile = self.__sock.makefile()
 return self.__negotiate_capabilities()
-- 
2.5.5

[Qemu-devel] [PATCH v1 1/6] scripts: add init.py file to scripts/qmp/

When searching for modules to load, python will ignore any
sub-directory which does not contain __init__.py. This means
that both scripts and scripts/qmp/ have to be explicitly added
to the python path. By adding a __init__.py file to scripts/qmp,
we only need add scripts/ to the python path and can then simply
do 'from qmp import qmp' to load scripts/qmp/qmp.py.

Signed-off-by: Daniel P. Berrange 
---
 scripts/qmp/__init__.py | 0
 1 file changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 scripts/qmp/__init__.py

diff --git a/scripts/qmp/__init__.py b/scripts/qmp/__init__.py
new file mode 100644
index 000..e69de29
-- 
2.5.5

Re: [Qemu-devel] [PATCH v1 1/9] hw/intc: QOM'ify etraxfs_pic.c

On Thu, May 05, 2016 at 06:28:48PM +0800, xiaoqiang zhao wrote:
> Drop the old SysBus init function and use instance_init
> 
> Signed-off-by: xiaoqiang zhao 

Reviewed-by: Edgar E. Iglesias 
Tested-by: Edgar E. Iglesias 


> ---
>  hw/intc/etraxfs_pic.c | 13 ++---
>  1 file changed, 6 insertions(+), 7 deletions(-)
> 
> diff --git a/hw/intc/etraxfs_pic.c b/hw/intc/etraxfs_pic.c
> index 48f9477..64a6f4b 100644
> --- a/hw/intc/etraxfs_pic.c
> +++ b/hw/intc/etraxfs_pic.c
> @@ -146,19 +146,19 @@ static void irq_handler(void *opaque, int irq, int 
> level)
>  pic_update(fs);
>  }
>  
> -static int etraxfs_pic_init(SysBusDevice *sbd)
> +static void etraxfs_pic_init(Object *obj)
>  {
> -DeviceState *dev = DEVICE(sbd);
> -struct etrax_pic *s = ETRAX_FS_PIC(dev);
> +DeviceState *dev = DEVICE(obj);
> +struct etrax_pic *s = ETRAX_FS_PIC(obj);
> +SysBusDevice *sbd = SYS_BUS_DEVICE(obj);
>  
>  qdev_init_gpio_in(dev, irq_handler, 32);
>  sysbus_init_irq(sbd, >parent_irq);
>  sysbus_init_irq(sbd, >parent_nmi);
>  
> -memory_region_init_io(>mmio, OBJECT(s), _ops, s,
> +memory_region_init_io(>mmio, obj, _ops, s,
>"etraxfs-pic", R_MAX * 4);
>  sysbus_init_mmio(sbd, >mmio);
> -return 0;
>  }
>  
>  static Property etraxfs_pic_properties[] = {
> @@ -169,9 +169,7 @@ static Property etraxfs_pic_properties[] = {
>  static void etraxfs_pic_class_init(ObjectClass *klass, void *data)
>  {
>  DeviceClass *dc = DEVICE_CLASS(klass);
> -SysBusDeviceClass *k = SYS_BUS_DEVICE_CLASS(klass);
>  
> -k->init = etraxfs_pic_init;
>  dc->props = etraxfs_pic_properties;
>  /*
>   * Note: pointer property "interrupt_vector" may remain null, thus
> @@ -183,6 +181,7 @@ static const TypeInfo etraxfs_pic_info = {
>  .name  = TYPE_ETRAX_FS_PIC,
>  .parent= TYPE_SYS_BUS_DEVICE,
>  .instance_size = sizeof(struct etrax_pic),
> +.instance_init = etraxfs_pic_init,
>  .class_init= etraxfs_pic_class_init,
>  };
>  
> -- 
> 2.1.4
> 
>

Re: [Qemu-devel] [PATCH v2 2/3] Add ENET/Gbps Ethernet support to FEC device

On 23 April 2016 at 11:58, Jean-Christophe Dubois  wrote:
> The ENET device (present in i.MX6) is "derived" from FEC and backward
> compatible with it.
>
> This patch add the necessary support of the added feature in the ENET
> device to allow Linux to use it (on supported processors).
>
> Signed-off-by: Jean-Christophe Dubois 
> ---
>  hw/arm/fsl-imx25.c   |   3 +
>  hw/net/imx_fec.c | 984 
> +++
>  include/hw/net/imx_fec.h | 243 +---
>  3 files changed, 929 insertions(+), 301 deletions(-)

I'm afraid this patch is too large for me to review, and quickly
scanning through the patch makes it look like it's half "new
feature" and half "rewrite existing stuff". Please can you
split this up somehow so it's easier to review?

thanks
-- PMM

Re: [Qemu-devel] [V9 0/4] AMD IOMMU

2016-05-05 Thread David Kiarie

On Wed, May 4, 2016 at 2:05 PM, Valentine Sinitsyn
 wrote:
> On 04.05.2016 16:02, David Kiarie wrote:
>>
>>
>>
>> On 04/05/16 13:58, Valentine Sinitsyn wrote:
>>>
>>> On 04.05.2016 15:51, David Kiarie wrote:

 On Wed, May 4, 2016 at 10:39 AM, Valentine Sinitsyn
  wrote:
>
> Hi everyone,
>
> On 04.05.2016 12:05, David Kiarie wrote:
>>
>>
>> On Wed, May 4, 2016 at 9:12 AM, Jan Kiszka  wrote:
>>>
>>>
>>> On 2016-04-30 00:42, David Kiarie wrote:


 These series adds AMD IOMMU support to Qemu. It's currently in
 the 9th
 version.

 In this series I have (hopefully) addressed all the comments made
 in the
 previous version.
 I have also tested and successfully passed-through PCI device 'ac97'
 with more devices to be tested.

>>>
>>> I've done some basic testing with a Jailhouse setup and found it
>>> working. The ACPI table is now properly parsed and the DMA
>>> remapping was
>>> not disturbing the system after Jailhouse was activated.
>>>
>>> However, it was also still not intervening after I started to corrupt
>>> the configuration, removed DMA target properties from most of the
>>> RAM or
>>> dropped PCI devices.
>
>
> Please also remember that unlisted devices go without translation.
> To "mute"
> the device, set V, TV, the DomainId, and zero everything else in the
> DTE.
>
>>
>> This means you're invalidating DTEs ?
>>
>>>
>>> You are not dropping invalid remapping requests, are you?
>>> According to
>>> the logs, you are detecting them at least:
>>>
>>> (amd-iommu)amd_iommu_get_dte: Device Table at 0x3b0d4000
>>> (amd-iommu)amd_iommu_get_dte: Pte entry at 0x0 is invalid
>>> (amd-iommu)amd_iommu_translate: devid: 00:02.0 gpa 0x32f39480 hpa
>>> 0x32f39000
>>>
>>> It's a bit hard to test right now if remapping is actually properly
>>> working in all important cases if you do not reject invalid ones.
>
>
> My understanding is that you should generate an IO_PAGE_FAULT event
> and drop
> the request. This doesn't apply to ATS, which is a bit trickier, but we
> don't address ATS in this patch series anyway, do we?


 My next question is what you mean by 'reject' and 'drop'. In I
 encounter an invalid PTE/DTE I don't translate the gpa, it just become
 the hpa which is what is happening above.
>>>
>>> What happens if you just ignore the request? I mean, what if you don't
>>> forward it to anywhere else in QEMU, just log this event and return?
>>
>>
>> Am guessing this should have something to do with pci abort which, last
>> time I tried, wasn't aborting at request. I will look at it again.
>
> My initial answer was also "do target abort". But then I did a quick look
> over the spec, and found no such requirement. Please read relevant parts
> thoroughly yourself, and maybe experiment with "just ignore"/"explicitly
> abort" options.

Qemu doesn't seem to be honouring target aborts. The PCI device
represented by IOMMU seems like just a link to allow communication
with the OS/System Software.

In place of target aborts I am going to instead populate the struct
(IOMMUTLBEntry) like below

  IOMMUTLBEntry ret = {
.target_as = _space_memory,
.iova = addr,
.translated_addr = 0,
.addr_mask = ~(hwaddr)0,
.perm = IOMMU_NONE,
}

Functionally speaking, this doesn't seem very different from a target
abort because with a target abort, the bus is expected to complete the
translation which should come down to the same thing(with the later
being more complicated).

On the other hand, I am yet to confirm, from the spec that the
particular reported case warrants a target abort but cases such as
page faults should be target aborted (which is not currently the case,
so they should be fixed).


>
> Valentine

Re: [Qemu-devel] [RFC v2 04/11] tcg: comment on which functions have to be called with tb_lock held

2016-05-05 Thread Sergey Fedorov

On 05/04/16 18:32, Alex Bennée wrote:
> diff --git a/exec.c b/exec.c
> index f46e596..17f390e 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -826,6 +826,7 @@ int cpu_breakpoint_insert(CPUState *cpu, vaddr pc, int 
> flags,
>  {
>  CPUBreakpoint *bp;
>  
> +/* TODO: locking (RCU?) */
>  bp = g_malloc(sizeof(*bp));
>  
>  bp->pc = pc;

This comment is a little inconsistent. We should make access to
breakpoint and watchpoint lists to be thread-safe in all the functions
using them. So if we note this, it should be noted in all such places.
Also, it's probably not a good idea to put such comment just above
g_malloc() invocation, it could be a bit confusing. A bit more details
would also be nice.

Kind regards,
Sergey

Re: [Qemu-devel] [PATCH v2 1/3] net: improve UDP/TCP checksum computation.

On 23 April 2016 at 11:58, Jean-Christophe Dubois  wrote:
> This patch adds:
>  * based on Eth, UDP, TCP struct present in eth.h instead of hardcoded 
> indexes.
>  * based on various macros present in eth.h.
>  * allow to account for optional VLAN header.

This is doing several things:
 (1) changing style to use structs and macros rather than raw array accesses
(which shouldn't affect functionality)
 (2) adding new functionality

I think they would be better in separate patches.


This function is used by multiple network devices, including the
important ones xen_nic and virtio-net -- is it definitely OK to
have it change behaviour for those devices?


> Signed-off-by: Jean-Christophe Dubois 
> ---
>  net/checksum.c | 83 
> --
>  1 file changed, 57 insertions(+), 26 deletions(-)
>
> diff --git a/net/checksum.c b/net/checksum.c
> index d0fa424..fd25209 100644
> --- a/net/checksum.c
> +++ b/net/checksum.c
> @@ -18,9 +18,7 @@
>  #include "qemu/osdep.h"
>  #include "qemu-common.h"
>  #include "net/checksum.h"
> -
> -#define PROTO_TCP  6
> -#define PROTO_UDP 17
> +#include "net/eth.h"
>
>  uint32_t net_checksum_add_cont(int len, uint8_t *buf, int seq)
>  {
> @@ -57,40 +55,73 @@ uint16_t net_checksum_tcpudp(uint16_t length, uint16_t 
> proto,
>
>  void net_checksum_calculate(uint8_t *data, int length)
>  {
> -int hlen, plen, proto, csum_offset;
> -uint16_t csum;
> +int plen;
> +struct ip_header *ip;
> +
> +/* Ensure we have at least a Eth header */
> +if (length < sizeof(struct eth_header)) {
> +return;
> +}
>
> -/* Ensure data has complete L2 & L3 headers. */
> -if (length < 14 + 20) {
> +/* Now check we have an IP header (with an optonnal VLAN header */

"optional", also missing ")".

> +if (length < eth_get_l2_hdr_length(data) + sizeof(struct ip_header)) {
>  return;
>  }
>
> -if ((data[14] & 0xf0) != 0x40)
> +ip = PKT_GET_IP_HDR(data);

Are we definitely guaranteed that the input data buffer is aligned?
It seems unlikely, and if it isn't then a lot of this code (both
in macros like PKT_GET_IP_HDR, and open coded stuff below) is going
to go wrong if it tries to just access 16 bit struct fields and they're
not aligned and we're on a host that requires strict alignment.

> +
> +if (IP_HEADER_VERSION(ip) != IP_HEADER_VERSION_4) {
> return; /* not IPv4 */
> -hlen  = (data[14] & 0x0f) * 4;
> -plen  = (data[16] << 8 | data[17]) - hlen;
> -proto = data[23];
> +}
> +
> +/* Last, check that we have enough data for the IP frame */
> +if (length < eth_get_l2_hdr_length(data) + be16_to_cpu(ip->ip_len)) {
> +return;
> +}
> +
> +plen  = be16_to_cpu(ip->ip_len) - IP_HDR_GET_LEN(ip);
> +
> +switch (ip->ip_p) {
> +case IP_PROTO_TCP:
> +{
> +uint16_t csum;
> +tcp_header *tcp = (tcp_header *)(ip + 1);
> +
> +if (plen < sizeof(tcp_header)) {
> +return;
> +}

I think this indent style is indenting to much. switch statements
usually look like:


switch (whatever) {
case FOO:
{
code here;
}
case BAR:
{
more code;
}
default:
whatever;
}

>
> -switch (proto) {
> -case PROTO_TCP:
> -   csum_offset = 16;
> +tcp->th_sum = 0;
> +
> +csum = net_checksum_tcpudp(plen, ip->ip_p,
> +   (uint8_t *)>ip_src,
> +   (uint8_t *)tcp);
> +
> +tcp->th_sum = cpu_to_be16(csum);
> +}
> break;
> -case PROTO_UDP:
> -   csum_offset = 6;
> +case IP_PROTO_UDP:
> +{
> +uint16_t csum;
> +udp_header *udp = (udp_header *)(ip + 1);
> +
> +if (plen < sizeof(udp_header)) {
> +return;
> +}
> +
> +udp->uh_sum = 0;
> +
> +csum = net_checksum_tcpudp(plen, ip->ip_p,
> +   (uint8_t *)>ip_src,
> +   (uint8_t *)udp);
> +
> +udp->uh_sum = cpu_to_be16(csum);
> +}
> break;
>  default:
> +/* Can't handle any other protocol */
> return;
>  }
> -
> -if (plen < csum_offset + 2 || 14 + hlen + plen > length) {
> -return;
> -}
> -
> -data[14+hlen+csum_offset]   = 0;
> -data[14+hlen+csum_offset+1] = 0;
> -csum = net_checksum_tcpudp(plen, proto, data+14+12, data+14+hlen);
> -data[14+hlen+csum_offset]   = csum >> 8;
> -data[14+hlen+csum_offset+1] = csum & 0xff;
>  }
>
>  uint32_t
> --
> 2.7.4

thanks
-- PMM

Re: [Qemu-devel] [PATCH v2 0/6] QOM'ify hw/char devices

2016-05-05 Thread xiaoqiang zhao


> 在 2016年5月5日，20:51，Peter Maydell  写道：
> 
>> On 5 May 2016 at 11:38, 赵小强  wrote:
>> At 2016-03-29 15:47:19, "xiaoqiang zhao"  wrote:
>>> This patch set trys to QOM'ify hw/char files, see commit messages
>>> for more details
>>> 
>>> Changes in v2:
>>> * rename TYPE_SCLP_LM_CONSOLE to TYPE_SCLPLM_CONSOLE which is suggested by
>>> Cornelia Huck 
>>> * rebase on the current master
>>> 
>>> xiaoqiang zhao (6):
>>> hw/char: QOM'ify escc.c
>>> hw/char: QOM'ify etraxfs_ser.c
>>> hw/char: QOM'ify lm32_juart.c
>>> hw/char: QOM'ify lm32_uart.c
>>> hw/char: QOM'ify sclpconsole-lm.c
>>> hw/char: QOM'ify sclpconsole.c
>>> 
>>> hw/char/escc.c   | 12 +---
>>> hw/char/etraxfs_ser.c| 11 +--
>>> hw/char/lm32_juart.c |  9 +++--
>>> hw/char/lm32_uart.c  | 12 +---
>>> hw/char/sclpconsole-lm.c | 14 +-
>>> hw/char/sclpconsole.c| 12 
>>> 6 files changed, 35 insertions(+), 35 deletions(-)
>>> 
>>> --
>>> 2.1.4
>> 
>> ping ???
> 
> I think you will have better luck if you rearrange all these
> QOM patches so that you provide them as one series per board
> or per target architecture, not one per type of device.
> There is no single person with responsibility for "all of
> hw/char" so structuring your cleanup patchsets like this will
> tend to result in the people who might care about the devices
> not looking at them. (Also you can concentrate on the devices
> which are actively maintained, like ARM ones, x86 ones, MIPS
> and SPARC ones, rather than the oddballs semi-orphaned ones
> like lm32, CRIS, etc.)
> 
> thanks
> -- PMM

Thanks peter, Sounds reasonable!

Re: [Qemu-devel] [PATCH v7 0/6] Add i.MX6 (Single/Dual/Quad) support

On 2 April 2016 at 15:29, Jean-Christophe Dubois  wrote:
> This patch series adds support for the Freescale i.MX6 processor.
>
> For now we only support the following devices:
> * up to 4 Cortex A9 cores
> * A9 MPCORE (SCU, GIC, TWD)
> * 5 i.MX UARTs
> * 2 EPIT timers
> * 1 GPT timer
> * 7 GPIO controllers
> * 6 SDHC controllers
> * 5 SPI controllers
> * 1 CCM device
> * 1 SRC device
> * 3 I2C devices
> * various ROM/RAM areas.
>
> This also adds the sabrelite board as a an actual platform for i.MX6.
>
> This series was tested by booting a 4.4 linux kernel (using the
> imx_v6_v7_defconfig file as kernel configuration).
>
> Note1: as sabrelite uses uart2 as console, you have to redirect the second
> QEMU serial port to stdout.
> qemu-system-arm -M sabrelite -display none ... -serial null -serial stdio
>
> Note2: You need to disable the GPIO section related to physical push buttons
> in the Linux DTS tree in order to avoid excecive interrupt triggering on
> GPIO devices for non existant buttons.

Applied to target-arm.next, thanks.

-- PMM

Re: [Qemu-devel] [PATCH v2 3/3] Add ENET device to i.MX6 SOC.

On 23 April 2016 at 11:58, Jean-Christophe Dubois  wrote:
> This adds the ENET device to the i.MX6 SOC.
>
> This was tested by booting Linux on an Qemu i.MX6 instance and accessing
> the internet from the linux guest.
>
> Signed-off-by: Jean-Christophe Dubois 

Reviewed-by: Peter Maydell 

thanks
-- PMM

Re: [Qemu-devel] [PATCH 1/2] migration: Move qjson.[ch] to migration/

2016-05-05 Thread Dr. David Alan Gilbert

* Markus Armbruster (arm...@redhat.com) wrote:
> Type QJSON lets you build JSON text.  Its interface mirrors (a subset
> of) abstract JSON syntax.
> 
> QAPI output visitors also produce JSON text.  They assert their
> preconditions and invariants, and therefore abort on incorrect use.
> 
> Contrastingly, QJSON does *not* detect incorrect use.  It happily
> produces invalid JSON then.  This is what migration wants.
> 
> QJSON was designed for migration, and migration is its only user.
> Move it to migration/ for proper coverage by MAINTAINERS, and to deter
> accidental use outside migration.
> 
> Signed-off-by: Markus Armbruster 

Reviewed-by: Dr. David Alan Gilbert 

> ---
>  Makefile.objs   |  1 -
>  include/{ => migration}/qjson.h |  0
>  include/migration/vmstate.h |  2 +-
>  migration/Makefile.objs |  1 +
>  qjson.c => migration/qjson.c| 23 +--
>  migration/vmstate.c |  1 -
>  tests/Makefile  |  2 +-
>  7 files changed, 20 insertions(+), 10 deletions(-)
>  rename include/{ => migration}/qjson.h (100%)
>  rename qjson.c => migration/qjson.c (83%)
> 
> diff --git a/Makefile.objs b/Makefile.objs
> index 8f705f6..da49b71 100644
> --- a/Makefile.objs
> +++ b/Makefile.objs
> @@ -52,7 +52,6 @@ common-obj-$(CONFIG_LINUX) += fsdev/
>  common-obj-y += migration/
>  common-obj-y += qemu-char.o #aio.o
>  common-obj-y += page_cache.o
> -common-obj-y += qjson.o
>  
>  common-obj-$(CONFIG_SPICE) += spice-qemu-char.o
>  
> diff --git a/include/qjson.h b/include/migration/qjson.h
> similarity index 100%
> rename from include/qjson.h
> rename to include/migration/qjson.h
> diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
> index 84ee355..30ecc44 100644
> --- a/include/migration/vmstate.h
> +++ b/include/migration/vmstate.h
> @@ -29,7 +29,7 @@
>  #ifndef CONFIG_USER_ONLY
>  #include 
>  #endif
> -#include 
> +#include "migration/qjson.h"
>  
>  typedef void SaveStateHandler(QEMUFile *f, void *opaque);
>  typedef int LoadStateHandler(QEMUFile *f, void *opaque, int version_id);
> diff --git a/migration/Makefile.objs b/migration/Makefile.objs
> index 0cac6d7..d25ff48 100644
> --- a/migration/Makefile.objs
> +++ b/migration/Makefile.objs
> @@ -2,6 +2,7 @@ common-obj-y += migration.o tcp.o
>  common-obj-y += vmstate.o
>  common-obj-y += qemu-file.o qemu-file-buf.o qemu-file-unix.o 
> qemu-file-stdio.o
>  common-obj-y += xbzrle.o postcopy-ram.o
> +common-obj-y += qjson.o
>  
>  common-obj-$(CONFIG_RDMA) += rdma.o
>  common-obj-$(CONFIG_POSIX) += exec.o unix.o fd.o
> diff --git a/qjson.c b/migration/qjson.c
> similarity index 83%
> rename from qjson.c
> rename to migration/qjson.c
> index b65ca6e..cb479fe 100644
> --- a/qjson.c
> +++ b/migration/qjson.c
> @@ -1,5 +1,5 @@
>  /*
> - * QEMU JSON writer
> + * A simple JSON writer
>   *
>   * Copyright Alexander Graf
>   *
> @@ -11,12 +11,23 @@
>   *
>   */
>  
> +/*
> + * Type QJSON lets you build JSON text.  Its interface mirrors (a
> + * subset of) abstract JSON syntax.
> + *
> + * It does *not* detect incorrect use.  It happily produces invalid
> + * JSON then.  This is what migration wants.
> + *
> + * QAPI output visitors also produce JSON text.  However, they do
> + * assert their preconditions and invariants, and therefore abort on
> + * incorrect use.
> + */
> +
>  #include "qemu/osdep.h"
> -#include 
> -#include 
> -#include 
> -#include 
> -#include 
> +#include "qapi/qmp/qstring.h"
> +#include "migration/qjson.h"
> +#include "qemu/module.h"
> +#include "qom/object.h"
>  
>  struct QJSON {
>  Object obj;
> diff --git a/migration/vmstate.c b/migration/vmstate.c
> index bf3d5db..46dc55e 100644
> --- a/migration/vmstate.c
> +++ b/migration/vmstate.c
> @@ -6,7 +6,6 @@
>  #include "qemu/bitops.h"
>  #include "qemu/error-report.h"
>  #include "trace.h"
> -#include "qjson.h"
>  
>  static void vmstate_subsection_save(QEMUFile *f, const VMStateDescription 
> *vmsd,
>  void *opaque, QJSON *vmdesc);
> diff --git a/tests/Makefile b/tests/Makefile
> index 9194f18..4204d97 100644
> --- a/tests/Makefile
> +++ b/tests/Makefile
> @@ -435,7 +435,7 @@ tests/test-qdev-global-props$(EXESUF): 
> tests/test-qdev-global-props.o \
>   $(test-qapi-obj-y)
>  tests/test-vmstate$(EXESUF): tests/test-vmstate.o \
>   migration/vmstate.o migration/qemu-file.o migration/qemu-file-buf.o \
> -migration/qemu-file-unix.o qjson.o \
> +migration/qemu-file-unix.o migration/qjson.o \
>   $(test-qom-obj-y)
>  tests/test-timed-average$(EXESUF): tests/test-timed-average.o qemu-timer.o \
>   $(test-util-obj-y)
> -- 
> 2.5.5
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

Re: [Qemu-devel] [PATCH 11/18] vhost-user: add shutdown support

On Wed, May 04, 2016 at 08:44:37PM -0700, Yuanhan Liu wrote:
> Hello,
> 
> On Wed, May 04, 2016 at 10:13:49PM +0300, Michael S. Tsirkin wrote:
> > How do you avoid it?
> > 
> > > > Management is required to make this robust, auto-reconnect
> > > > is handy for people bypassing management.
> > > 
> > > tbh, I don't like autoreconnect. My previous series didn't include
> > > this and assumed the feature would be supported only when qemu is
> > > configured to be the server. I added reconnect upon request by users.
> > 
> > I don't have better solutions so OK I guess.
> 
> Yes, it's a request from me :)
> Well, there may be few others also requested that.
> 
> The reason I had this asking is that, so far, we just have only one
> vhost-user frontend, and that is QEMU. But we may have many backends,
> I'm aware of 4 at the time writing, including the vubr from QEMU.
> While we could do vhost-user client and reconnect implementation
> on all backends, it's clear that implementing it in the only backend
> (QEMU) introduces more benefits.
> 
> OTOH, I could implement DPDK vhost-user as client and try reconnect
> there, if that could makes all stuff a bit easier.
> 
>   --yliu

In my opinion, if slave initiates disconnecting then slave should
also initiate connecting.

In a sense, autoretry of connections is not a vhost-user
feature. It's a general chardev feature. It also does not
have to be related to disconnect: retrying on first connect
failure makes exactly as much sense.

-- 
MST

Re: [Qemu-devel] [RFC PATCH] virtio: set ISR status when delivering queue MSI

On Thu, May 05, 2016 at 11:13:37AM +0200, Ladi Prosek wrote:
> There is a discrepancy between dataplane and no-dataplane virtio
> behavior with respect to the ISR status register and MSI-X
> capability.
> 
> Without dataplane the Queue interrupt ISR status bit is set
> regardless of how the notification is delivered to the guest.
> 
> With dataplane the Queue interrupt ISR status bit is set only
> if the notification is delivered as an IRQ.
> 
> While both conform to the spec, QEMU should be consistent to
> minimize surprises with broken drivers.
> 
> This RFC patch shows the relevant code location and contains a
> possible fix which makes QEMU set the bit on the MSI dataplane
> code path.
> 
> Signed-off-by: Ladi Prosek 

How is this supposed to work when injecting interrupts using irqfd?
If anything, I would rather drop setting the flag on non-dataplane.


> ---
>  hw/virtio/virtio.c | 19 ++-
>  1 file changed, 14 insertions(+), 5 deletions(-)
> 
> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> index 08275a9..4053313 100644
> --- a/hw/virtio/virtio.c
> +++ b/hw/virtio/virtio.c
> @@ -1192,7 +1192,7 @@ void virtio_notify(VirtIODevice *vdev, VirtQueue *vq)
>  }
>  
>  trace_virtio_notify(vdev, vq);
> -vdev->isr |= 0x01;
> +vdev->isr |= 0x01; // <= here we set ISR status even if we don't raise 
> IRQ
>  virtio_notify_vector(vdev, vq->vector);
>  }
>  
> @@ -1757,16 +1757,25 @@ static void 
> virtio_queue_guest_notifier_read(EventNotifier *n)
>  }
>  }
>  
> +static void virtio_queue_guest_notifier_read_irqfd(EventNotifier *n)
> +{
> +VirtQueue *vq = container_of(n, VirtQueue, guest_notifier);
> +if (event_notifier_test_and_clear(n)) {
> +vq->vdev->isr |= 0x01;
> +}
> +}
> +
>  void virtio_queue_set_guest_notifier_fd_handler(VirtQueue *vq, bool assign,
>  bool with_irqfd)
>  {
> -if (assign && !with_irqfd) {
> +if (assign) {
>  event_notifier_set_handler(>guest_notifier,
> -   virtio_queue_guest_notifier_read);
> + with_irqfd ?
> +virtio_queue_guest_notifier_read_irqfd :
> +virtio_queue_guest_notifier_read);
>  } else {
>  event_notifier_set_handler(>guest_notifier, NULL);
> -}
> -if (!assign) {
> +
>  /* Test and clear notifier before closing it,
>   * in case poll callback didn't have time to run. */
>  virtio_queue_guest_notifier_read(>guest_notifier);
> -- 
> 2.5.5

Re: [Qemu-devel] [PATCH v8 0/5] ARM: Add NUMA support for machine virt

On 26 April 2016 at 12:21, Andrew Jones  wrote:
> On Tue, Apr 26, 2016 at 06:40:24PM +0800, Shannon Zhao wrote:
>> From: Shannon Zhao 
>>
>> Add NUMA support for machine virt. Tested successfully running a guest
>> Linux kernel with the following patch applied:
>>
>> - [PATCH v16 0/6] arm64, numa: Add numa support for arm64 platforms
>> https://lkml.org/lkml/2016/4/8/571
>> - [PATCH v5 00/14] ACPI NUMA support for ARM64
>> https://lkml.org/lkml/2016/4/19/852
>
> This series looks good to me, but I guess we still need to wait
> for the DT spec[*] to be merged first. Hopefully that'll happen
> around the time that the 2.7 dev window opens.

I'm going to put this series into target-arm.next now, really just
so I can get it off my to-review list. I think the DT spec being
in the for-next branch is a reasonable implication that it's stable,
and in the worst case that we need to make fixups we have plenty
of time before our next QEMU release to do them in.
Please let me know if any spec change does happen before the DT
hits mainline...

thanks
-- PMM

Re: [Qemu-devel] [RFC PATCH v3 1/3] vGPU Core driver

2016-05-05 Thread Kirti Wankhede

On 5/5/2016 5:37 PM, Tian, Kevin wrote:

From: Kirti Wankhede [mailto:kwankh...@nvidia.com]
Sent: Thursday, May 05, 2016 6:45 PM

On 5/5/2016 2:36 PM, Tian, Kevin wrote:

From: Kirti Wankhede
Sent: Wednesday, May 04, 2016 9:32 PM

Thanks Alex.

 >> +config VGPU_VFIO
 >> +tristate
 >> +depends on VGPU
 >> +default n
 >> +
 >
 > This is a little bit convoluted, it seems like everything added in this
 > patch is vfio agnostic, it doesn't necessarily care what the consumer
 > is.  That makes me think we should only be adding CONFIG_VGPU here and
 > it should not depend on CONFIG_VFIO or be enabling CONFIG_VGPU_VFIO.
 > The middle config entry is also redundant to the first, just move the
 > default line up to the first and remove the rest.

CONFIG_VGPU doesn't directly depend on VFIO. CONFIG_VGPU_VFIO is
directly dependent on VFIO. But devices created by VGPU core module need
a driver to manage those devices. CONFIG_VGPU_VFIO is the driver which
will manage vgpu devices. So I think CONFIG_VGPU_VFIO should be enabled
by CONFIG_VGPU.

This would look like:
menuconfig VGPU
 tristate "VGPU driver framework"
 select VGPU_VFIO
 default n
 help
 VGPU provides a framework to virtualize GPU without SR-IOV cap
 See Documentation/vgpu.txt for more details.

 If you don't know what do here, say N.

config VGPU_VFIO
 tristate
 depends on VGPU
 depends on VFIO
 default n

There could be multiple drivers operating VGPU. Why do we restrict
it to VFIO here?

VGPU_VFIO uses VFIO APIs, it depends on VFIO.
I think since there is no other driver than VGPU_VFIO for VGPU devices,
we should keep default selection of VGPU_VFIO on VGPU. May be in future
if other driver is add ti operate vGPU devices, then default selection
can be removed.

What's your plan to support Xen here?

No plans to support Xen.

Thanks,
Kirti

Re: [Qemu-devel] [PATCH v2 0/6] QOM'ify hw/char devices

On 5 May 2016 at 11:38, 赵小强  wrote:
> At 2016-03-29 15:47:19, "xiaoqiang zhao"  wrote:
>>This patch set trys to QOM'ify hw/char files, see commit messages
>>for more details
>>
>>Changes in v2:
>>* rename TYPE_SCLP_LM_CONSOLE to TYPE_SCLPLM_CONSOLE which is suggested by
>>  Cornelia Huck 
>>* rebase on the current master
>>
>>xiaoqiang zhao (6):
>>  hw/char: QOM'ify escc.c
>>  hw/char: QOM'ify etraxfs_ser.c
>>  hw/char: QOM'ify lm32_juart.c
>>  hw/char: QOM'ify lm32_uart.c
>>  hw/char: QOM'ify sclpconsole-lm.c
>>  hw/char: QOM'ify sclpconsole.c
>>
>> hw/char/escc.c   | 12 +---
>> hw/char/etraxfs_ser.c| 11 +--
>> hw/char/lm32_juart.c |  9 +++--
>> hw/char/lm32_uart.c  | 12 +---
>> hw/char/sclpconsole-lm.c | 14 +-
>> hw/char/sclpconsole.c| 12 
>> 6 files changed, 35 insertions(+), 35 deletions(-)
>>
>>--
>>2.1.4
>>
>
> ping ???

I think you will have better luck if you rearrange all these
QOM patches so that you provide them as one series per board
or per target architecture, not one per type of device.
There is no single person with responsibility for "all of
hw/char" so structuring your cleanup patchsets like this will
tend to result in the people who might care about the devices
not looking at them. (Also you can concentrate on the devices
which are actively maintained, like ARM ones, x86 ones, MIPS
and SPARC ones, rather than the oddballs semi-orphaned ones
like lm32, CRIS, etc.)

thanks
-- PMM

Re: [Qemu-devel] [Qemu-arm] [PATCH RESEND 4/5] hw/display: QOM'ify milkymist-vgafb.c

On 5 May 2016 at 04:04, xiaoqiang zhao  wrote:
> * Drop the old SysBus init function and use instance_init
> * Move graphic_console_init into realize stage
>
> Signed-off-by: xiaoqiang zhao 
> ---

Reviewed-by: Peter Maydell 

thanks
-- PMM

Re: [Qemu-devel] [PATCH RESEND 3/5] hw/display: QOM'ify milkymist-tmu2.c

On 5 May 2016 at 04:04, xiaoqiang zhao  wrote:
> * Drop the old SysBus init function and use instance_init
> * Move tmu2_glx_init into realize stage
>
> Signed-off-by: xiaoqiang zhao 

Reviewed-by: Peter Maydell 

> +static void milkymist_tmu2_realize(DeviceState *dev, Error **errp)
> +{
> +MilkymistTMU2State *s = MILKYMIST_TMU2(dev);
> +
> +if (tmu2_glx_init(s)) {
> +error_setg(errp, "tmu2_glx_init failed.");
> +}
>  }

The milkymist maintainer might have a suggestion for a
more informative error message here.

thanks
-- PMM

Re: [Qemu-devel] [Qemu-arm] [PATCH RESEND 2/5] hw/display: QOM'ify jazz_led.c

On 5 May 2016 at 04:04, xiaoqiang zhao  wrote:
> * Drop the old SysBus init function and use instance_init
> * Move graphic_console_init into realize stage
>
> Signed-off-by: xiaoqiang zhao 
> ---
>  hw/display/jazz_led.c | 18 +++---
>  1 file changed, 11 insertions(+), 7 deletions(-)

Reviewed-by: Peter Maydell 

CCing the MIPS maintainers and the listed maintainer for this
board so they can pick this patch up.

thanks
-- PMM

Re: [Qemu-devel] [RFC PATCH v3 1/3] vGPU Core driver

2016-05-05 Thread Tian, Kevin

> From: Kirti Wankhede [mailto:kwankh...@nvidia.com]
> Sent: Thursday, May 05, 2016 6:45 PM
> 
> 
> On 5/5/2016 2:36 PM, Tian, Kevin wrote:
> >> From: Kirti Wankhede
> >> Sent: Wednesday, May 04, 2016 9:32 PM
> >>
> >> Thanks Alex.
> >>
> >>  >> +config VGPU_VFIO
> >>  >> +tristate
> >>  >> +depends on VGPU
> >>  >> +default n
> >>  >> +
> >>  >
> >>  > This is a little bit convoluted, it seems like everything added in this
> >>  > patch is vfio agnostic, it doesn't necessarily care what the consumer
> >>  > is.  That makes me think we should only be adding CONFIG_VGPU here and
> >>  > it should not depend on CONFIG_VFIO or be enabling CONFIG_VGPU_VFIO.
> >>  > The middle config entry is also redundant to the first, just move the
> >>  > default line up to the first and remove the rest.
> >>
> >> CONFIG_VGPU doesn't directly depend on VFIO. CONFIG_VGPU_VFIO is
> >> directly dependent on VFIO. But devices created by VGPU core module need
> >> a driver to manage those devices. CONFIG_VGPU_VFIO is the driver which
> >> will manage vgpu devices. So I think CONFIG_VGPU_VFIO should be enabled
> >> by CONFIG_VGPU.
> >>
> >> This would look like:
> >> menuconfig VGPU
> >>  tristate "VGPU driver framework"
> >>  select VGPU_VFIO
> >>  default n
> >>  help
> >>  VGPU provides a framework to virtualize GPU without SR-IOV cap
> >>  See Documentation/vgpu.txt for more details.
> >>
> >>  If you don't know what do here, say N.
> >>
> >> config VGPU_VFIO
> >>  tristate
> >>  depends on VGPU
> >>  depends on VFIO
> >>  default n
> >>
> >
> > There could be multiple drivers operating VGPU. Why do we restrict
> > it to VFIO here?
> >
> 
> VGPU_VFIO uses VFIO APIs, it depends on VFIO.
> I think since there is no other driver than VGPU_VFIO for VGPU devices,
> we should keep default selection of VGPU_VFIO on VGPU. May be in future
> if other driver is add ti operate vGPU devices, then default selection
> can be removed.

What's your plan to support Xen here?

> 
> >>  >> +create_attr_error:
> >>  >> +  if (gpu_dev->ops->vgpu_destroy) {
> >>  >> +  int ret = 0;
> >>  >> +  ret = gpu_dev->ops->vgpu_destroy(gpu_dev->dev,
> >>  >> +   vgpu_dev->uuid,
> >>  >> +   
> >> vgpu_dev->vgpu_instance);
> >>  >
> >>  > Unnecessary initialization and we don't do anything with the result.
> >>  > Below indicates lack of vgpu_destroy indicates the vendor doesn't
> >>  > support unplug, but doesn't that break our error cleanup path here?
> >>  >
> >>
> >> Comment about vgpu_destroy:
> >> If VM is running and vgpu_destroy is called that
> >> means the vGPU is being hotunpluged. Return
> >> error if VM is running and graphics driver
> >> doesn't support vgpu hotplug.
> >>
> >> Its GPU drivers responsibility to check if VM is running and return
> >> accordingly. This is vgpu creation path. Vgpu device would be hotplug to
> >> VM on vgpu_start.
> >
> > How does GPU driver know whether VM is running? VM is managed
> > by KVM here.
> >
> > Maybe it's clearer to say whether vGPU is busy which means some work
> > has been loaded to vGPU. That's something GPU driver can tell.
> >
> 
> GPU driver can detect based on resources allocated for the VM from
> vgpu_create/vgpu_start.

Yes, in that case GPU driver only knows a vGPU is in-use, not who is
using vGPU (now is VM, in the future it could be a container). Anyway
my point is just not assuming VM to add limitation when it's not necessary. :-)

> 
> >>
> >>  >> + * @vgpu_bar_info:Called to get BAR size and flags of 
> >> vGPU device.
> >>  >> + *@vdev: vgpu device structure
> >>  >> + *@bar_index: BAR index
> >>  >> + *@bar_info: output, returns size and 
> >> flags of
> >>  >> + *requested BAR
> >>  >> + *Returns integer: success (0) or error 
> >> (< 0)
> >>  >
> >>  > This is called bar_info, but the bar_index is actually the vfio region
> >>  > index and things like the config region info is being overloaded
> >>  > through it.  We already have a structure defined for getting a generic
> >>  > region index, why not use it?  Maybe this should just be
> >>  > vgpu_vfio_get_region_info.
> >>  >
> >>
> >> Ok. Will do.
> >
> > As you commented earlier that GPU driver is required to provide config
> > space (which I agree), then what's the point of introducing another
> > bar specific structure? VFIO can use @write to get bar information
> > from vgpu config space, just like how it's done on physical device today.
> >
> 
> It is required not only for size, but also to fetch flags. Region flags
> could be combination of:
> 
> #define VFIO_REGION_INFO_FLAG_READ  (1 << 0) /* Region supports read */
> #define VFIO_REGION_INFO_FLAG_WRITE (1 << 1) /* Region supports write

[Qemu-devel] virtio-net and vhost-net init, virtio-scsi and vhost-scsi init

2016-05-05 Thread Catalin Vasile

When the virtio-net and virtio-scsi drivers have done the probe() primitive 
they set the DRIVER_OK flag.

If the vhost kernel backend is used, the set_status() primitive in qemu will be 
triggered with DRIVER_OK status and it will trigger vhost_XXX_start().

How does the net and scsi solutions ensure that the vhost_XXX_start() primitive 
has finished before sending jobs to through the virtqueues? Because if 
vhost_XXX_start() has not finished, jobs might get to the qemu dummy virtqueue 
handles, instead of getting to the vhost server in the host kernel.

Re: [Qemu-devel] [PATCH v1 0/9] QOM'ify hw/intc files

2016-05-05 Thread xiaoqiang zhao



> 在 2016年5月5日，18:46，Peter Maydell  写道：
> 
>> On 5 May 2016 at 11:41, 赵小强  wrote:
>> At 2016-05-05 18:39:52, "Peter Maydell"  wrote:
 On 5 May 2016 at 11:28, xiaoqiang zhao  wrote:
 This patch set QOM'ify files under hw/intc directory. See each commit
 message for details.
 
 Changes in v1:
  use error_setg instead of error_report in realize function
>>> 
>>> How can you have changes in a v1? Was this supposed to be v2?
> 
>> Yes, should be v2 here. Sorry ! need a resend ?
> 
> No. Is only the last patch changed from the previous set ?
> 
> thanks
> -- PMM

Yes！

Re: [Qemu-devel] [PATCH v1 0/9] QOM'ify hw/intc files