date:20160623

Re: [Qemu-devel] [PATCH v3 19/22] block: Split bdrv_merge_limits() from bdrv_refresh_limits()

2016-06-23 Thread Fam Zheng

On Thu, 06/23 16:37, Eric Blake wrote:
> During bdrv_merge_limits(), we were computing initial limits
> based on another BDS in two places.  At first glance, the two
> computations are not identical (one is doing straight copying,
> the other is doing merging towards or away from zero) - but
> when you realize that the first round is starting with all-0
> memory, all of the merging happens to work.  Factoring out the
> merging makes it easier to track how two BDS limits are merged,
> in case we have future reasons to merge in even more limits.
> 
> Signed-off-by: Eric Blake 
> 
> ---
> v3: Split raw block driver changes to its own patch, make new
> function static
> v2: new patch
> ---
>  block/io.c | 31 +--
>  1 file changed, 13 insertions(+), 18 deletions(-)
> 
> diff --git a/block/io.c b/block/io.c
> index 0f15d05..69dbbd3 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -67,6 +67,17 @@ static void bdrv_parent_drained_end(BlockDriverState *bs)
>  }
>  }
> 
> +static void bdrv_merge_limits(BlockLimits *dst, const BlockLimits *src)
> +{
> +dst->opt_transfer = MAX(dst->opt_transfer, src->opt_transfer);
> +dst->max_transfer = MIN_NON_ZERO(dst->max_transfer, src->max_transfer);
> +dst->opt_mem_alignment = MAX(dst->opt_mem_alignment,
> + src->opt_mem_alignment);
> +dst->min_mem_alignment = MAX(dst->min_mem_alignment,
> + src->min_mem_alignment);
> +dst->max_iov = MIN_NON_ZERO(dst->max_iov, src->max_iov);
> +}
> +
>  void bdrv_refresh_limits(BlockDriverState *bs, Error **errp)
>  {
>  BlockDriver *drv = bs->drv;
> @@ -88,11 +99,7 @@ void bdrv_refresh_limits(BlockDriverState *bs, Error 
> **errp)
>  error_propagate(errp, local_err);
>  return;
>  }
> -bs->bl.opt_transfer = bs->file->bs->bl.opt_transfer;
> -bs->bl.max_transfer = bs->file->bs->bl.max_transfer;
> -bs->bl.min_mem_alignment = bs->file->bs->bl.min_mem_alignment;
> -bs->bl.opt_mem_alignment = bs->file->bs->bl.opt_mem_alignment;
> -bs->bl.max_iov = bs->file->bs->bl.max_iov;
> +bdrv_merge_limits(&bs->bl, &bs->file->bs->bl);
>  } else {
>  bs->bl.min_mem_alignment = 512;
>  bs->bl.opt_mem_alignment = getpagesize();
> @@ -107,19 +114,7 @@ void bdrv_refresh_limits(BlockDriverState *bs, Error 
> **errp)
>  error_propagate(errp, local_err);
>  return;
>  }
> -bs->bl.opt_transfer = MAX(bs->bl.opt_transfer,
> -  bs->backing->bs->bl.opt_transfer);
> -bs->bl.max_transfer = MIN_NON_ZERO(bs->bl.max_transfer,
> -   bs->backing->bs->bl.max_transfer);
> -bs->bl.opt_mem_alignment =
> -MAX(bs->bl.opt_mem_alignment,
> -bs->backing->bs->bl.opt_mem_alignment);
> -bs->bl.min_mem_alignment =
> -MAX(bs->bl.min_mem_alignment,
> -bs->backing->bs->bl.min_mem_alignment);
> -bs->bl.max_iov =
> -MIN(bs->bl.max_iov,
> -bs->backing->bs->bl.max_iov);
> +bdrv_merge_limits(&bs->bl, &bs->backing->bs->bl);
>  }
> 
>  /* Then let the driver override it */
> -- 
> 2.5.5
> 

Reviewed-by: Fam Zheng
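
For readers following along, the "merging from all-0 memory equals copying"
argument in the commit message can be checked with a small standalone sketch.
This is not the QEMU code: SketchLimits is a reduced, hypothetical stand-in for
BlockLimits, and the SKETCH_* macros are local re-definitions written for this
example (the real MIN_NON_ZERO may be spelled differently in the tree).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Local stand-ins so the sketch compiles alone; not the QEMU definitions. */
#define SKETCH_MAX(a, b) ((a) > (b) ? (a) : (b))
#define SKETCH_MIN(a, b) ((a) < (b) ? (a) : (b))
/* Minimum of two values where 0 means "no limit set". */
#define SKETCH_MIN_NON_ZERO(a, b) \
    ((a) == 0 ? (b) : ((b) == 0 ? (a) : SKETCH_MIN(a, b)))

/* Hypothetical reduced BlockLimits: only the five merged fields. */
typedef struct SketchLimits {
    uint32_t opt_transfer;
    uint32_t max_transfer;
    size_t min_mem_alignment;
    size_t opt_mem_alignment;
    int max_iov;
} SketchLimits;

/* Same shape as the bdrv_merge_limits() hunk quoted above. */
static void sketch_merge_limits(SketchLimits *dst, const SketchLimits *src)
{
    dst->opt_transfer = SKETCH_MAX(dst->opt_transfer, src->opt_transfer);
    dst->max_transfer = SKETCH_MIN_NON_ZERO(dst->max_transfer,
                                            src->max_transfer);
    dst->opt_mem_alignment = SKETCH_MAX(dst->opt_mem_alignment,
                                        src->opt_mem_alignment);
    dst->min_mem_alignment = SKETCH_MAX(dst->min_mem_alignment,
                                        src->min_mem_alignment);
    dst->max_iov = SKETCH_MIN_NON_ZERO(dst->max_iov, src->max_iov);
}

/* Returns 1 if merging src into zeroed limits gives a plain copy of src. */
static int sketch_merge_from_zero_equals_copy(const SketchLimits *src)
{
    SketchLimits merged;

    assert(src != NULL);
    memset(&merged, 0, sizeof(merged));
    sketch_merge_limits(&merged, src);
    return merged.opt_transfer == src->opt_transfer &&
           merged.max_transfer == src->max_transfer &&
           merged.min_mem_alignment == src->min_mem_alignment &&
           merged.opt_mem_alignment == src->opt_mem_alignment &&
           merged.max_iov == src->max_iov;
}
```

With non-negative limits, MAX(0, x) is x and MIN_NON_ZERO(0, x) is x, which is
why the first round behaves like the old field-by-field copy.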

Re: [Qemu-devel] [RFC 0/9] Introduce light weight PC platform pc-lite

2016-06-23 Thread Claudio Fontana

Hi Paolo,

On 23.06.2016 20:44, Paolo Bonzini wrote:
> 
> 
> On 23/06/2016 10:32, Chao Peng wrote:
>> The original usage model is to replace kvm-tool with QEMU for Clear
>> Containers (https://clearlinux.org/features/clear-containers). It's not
>> going to present the guest a real PC platform, but instead, a totally
>> virtual platform.
> 
> It is not completely virtual; it has PCI for example.  Hyper-V is an
> example of a completely virtual platform, even the LAPIC is customized
> with paravirtual features.
> 
> qboot does basically four things: 1) relocate from ROM to 0xf; 2)
> initialize PCI; 2) provide the ACPI and e820 tables; 3) boot.
> 
> If Linux can boot without initializing PCI bridges and without INTX, we
> can remove that code from qboot.  The PCI scan is the most expensive
> part, I think.  (2) and (3) are the same no matter if you run them in
> QEMU or the guest.
> 
> That leaves out only relocation (PAM).
> 
>> Every little bit boot time saving is important because
>> we are trying to achieve comparable result with that for Linux native
>> container.
>>
>> With this usage model, I doubt introducing a firmware layer is a good
>> idea:
>>
>> On one side, even with optimized and compact qboot it still takes us
>> ~15ms.
> 
> Have you profiled it?  If it is code in QEMU that we can optimize (e.g.
> memory.c), that would benefit all guests.
> 
>> This is not a small value because current Linux kernel takes
>> only ~50ms (and we are still on the way to optimize it). And when
>> you look at the SeaBIOS or qboot, almost all the code are useless for
>> this usage model. They are doing things that is important for
>> traditional PC booting but cost 15ms doing useless things for us (It
>> is really not easy to save 15ms in other place, for example, in
>> Linux. Personally I tend to change the architecture for this new
>> usage model, e.g. eliminate firmware).
>>
>> On the other side, even boot the new pc-lite platform with firmware,
>> it does not mean it can support non-Linux system like Windows. So
>> generally I don't see the benefit of introducing a firmware layer.
> 
> The main benefit is maintainability, by reducing the amount of code
> specific to pc-lite.
> 
>> Besides, I'm also not quite sure if build around Q35 is the best
>> solution:
>>
>> The problem with Q35 is some features like SMM/SMRAM/PAM slow down
>> the booting even we actually never use them. While removing these
>> features can cause guest see different feature set for a same device
>> and it also prevents us to do further optimizations on that in guest.
> 
> Of these, qboot only uses PAM, and even that could be removed (PAM is
> only necessary because of how qboot probes parallel flash).  SMRAM
> should not slow down booting if you don't use them.  Do they?
> 
> Paolo

I use qboot for similar goals. You mention that PAM is necessary because of
how qboot probes parallel flash; however, in my custom platform I removed PAM
completely from QEMU, and everything seems to work without any problems.

Ciao

Claudio

Re: [Qemu-devel] [PATCH v3 18/22] block: Drop raw_refresh_limits()

2016-06-23 Thread Fam Zheng

On Thu, 06/23 16:37, Eric Blake wrote:
> The raw block driver was blindly copying all limits from bs->file,
> even though: 1. the main bdrv_refresh_limits() already does this
> for many of the limits, and 2. blindly copying from the children
> can weaken any stricter limits that were already inherited from
> the backing chain during the main bdrv_refresh_limits().  Also,
> a future patch is about to move .request_alignment into
> BlockLimits, and that is a limit that should NOT be copied from
> other layers in the BDS chain.
> 
> Thus, we can completely drop raw_refresh_limits(), and rely on
> the block layer setting up the proper limits.
> 
> Signed-off-by: Eric Blake 
> 
> ---
> v3: new patch, split out from 'block: Split bdrv_merge_limits()...'
> ---
>  block/raw_bsd.c | 8 +---
>  1 file changed, 1 insertion(+), 7 deletions(-)
> 
> diff --git a/block/raw_bsd.c b/block/raw_bsd.c
> index 7f63791..5855e84 100644
> --- a/block/raw_bsd.c
> +++ b/block/raw_bsd.c
> @@ -1,6 +1,6 @@
>  /* BlockDriver implementation for "raw"
>   *
> - * Copyright (C) 2010, 2013, Red Hat, Inc.
> + * Copyright (C) 2010-2016 Red Hat, Inc.
>   * Copyright (C) 2010, Blue Swirl 
>   * Copyright (C) 2009, Anthony Liguori 
>   *
> @@ -150,11 +150,6 @@ static int raw_get_info(BlockDriverState *bs, 
> BlockDriverInfo *bdi)
>  return bdrv_get_info(bs->file->bs, bdi);
>  }
> 
> -static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
> -{
> -bs->bl = bs->file->bs->bl;
> -}
> -
>  static int raw_truncate(BlockDriverState *bs, int64_t offset)
>  {
>  return bdrv_truncate(bs->file->bs, offset);
> @@ -252,7 +247,6 @@ BlockDriver bdrv_raw = {
>  .bdrv_getlength   = &raw_getlength,
>  .has_variable_length  = true,
>  .bdrv_get_info= &raw_get_info,
> -.bdrv_refresh_limits  = &raw_refresh_limits,
>  .bdrv_probe_blocksizes = &raw_probe_blocksizes,
>  .bdrv_probe_geometry  = &raw_probe_geometry,
>  .bdrv_media_changed   = &raw_media_changed,
> -- 
> 2.5.5
> 

Reviewed-by: Fam Zheng
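
A minimal sketch of the weakening described in point 2 of the commit message,
under stated assumptions: SketchLimits is a hypothetical two-field stand-in for
BlockLimits, sketch_refresh() only imitates the order "merge file, merge
backing, then let the driver override", and the driver_copies_file flag stands
in for the removed raw_refresh_limits(). None of these names exist in QEMU.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-ins so the sketch compiles alone; not the QEMU definitions. */
#define SKETCH_MAX(a, b) ((a) > (b) ? (a) : (b))
#define SKETCH_MIN_NON_ZERO(a, b) \
    ((a) == 0 ? (b) : ((b) == 0 ? (a) : ((a) < (b) ? (a) : (b))))

/* Hypothetical reduced BlockLimits. */
typedef struct SketchLimits {
    size_t min_mem_alignment;   /* larger is stricter */
    int max_iov;                /* smaller is stricter, 0 means unset */
} SketchLimits;

/* Merge src into dst, always keeping the stricter value. */
static void sketch_merge(SketchLimits *dst, const SketchLimits *src)
{
    dst->min_mem_alignment = SKETCH_MAX(dst->min_mem_alignment,
                                        src->min_mem_alignment);
    dst->max_iov = SKETCH_MIN_NON_ZERO(dst->max_iov, src->max_iov);
}

/*
 * Imitates the refresh order: inherit from file, inherit from backing,
 * then optionally let a driver callback blindly copy the file's limits.
 */
static SketchLimits sketch_refresh(const SketchLimits *file,
                                   const SketchLimits *backing,
                                   int driver_copies_file)
{
    SketchLimits bl = { 0, 0 };

    assert(file != NULL && backing != NULL);
    sketch_merge(&bl, file);
    sketch_merge(&bl, backing);
    if (driver_copies_file) {
        bl = *file;             /* what "bs->bl = bs->file->bs->bl" did */
    }
    return bl;
}
```

With a file layer of { 512, 1024 } and a backing layer of { 4096, 64 }, the
merged result keeps the stricter 4096/64 pair, while the blind copy falls back
to 512/1024.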

Re: [Qemu-devel] [PATCH v3 17/22] block: Switch discard length bounds to byte-based

2016-06-23 Thread Fam Zheng

On Thu, 06/23 16:37, Eric Blake wrote:
> Sector-based limits are awkward to think about; in our on-going
> quest to move to byte-based interfaces, convert max_discard and
> discard_alignment.  Rename them, using 'pdiscard' as an aid to
> track which remaining discard interfaces need conversion, and so
> that the compiler will help us catch the change in semantics
> across any rebased code.  The BlockLimits type is now completely
> byte-based; and in iscsi.c, sector_limits_lun2qemu() is no
> longer needed.
> 
> pdiscard_alignment is made unsigned (we use power-of-2 alignments
> as bitmasks, where unsigned is easier to think about) while
> leaving max_pdiscard signed (since we still have an 'int'
> interface); this is comparable to what commit cf081fc did for
> write zeroes limits.  We may later want to make everything an
> unsigned 64-bit limit - but that requires a bigger code audit.
> 
> Signed-off-by: Eric Blake 
> 
> ---
> v3: split out write_zeroes wording tweaks, improve commit message
> v2: rebase nbd and iscsi limits across earlier improvements
> ---
>  include/block/block_int.h | 14 ++
>  block/io.c| 16 +---
>  block/iscsi.c | 19 ++-
>  block/nbd.c   |  2 +-
>  qemu-img.c|  3 ++-
>  5 files changed, 28 insertions(+), 26 deletions(-)
> 
> diff --git a/include/block/block_int.h b/include/block/block_int.h
> index 7a4a00f..388ef80 100644
> --- a/include/block/block_int.h
> +++ b/include/block/block_int.h
> @@ -324,11 +324,17 @@ struct BlockDriver {
>  };
> 
>  typedef struct BlockLimits {
> -/* maximum number of sectors that can be discarded at once */
> -int max_discard;
> +/* maximum number of bytes that can be discarded at once (since it
> + * is signed, it must be < 2G, if set), should be multiple of
> + * pdiscard_alignment, but need not be power of 2. May be 0 if no
> + * inherent 32-bit limit */
> +int32_t max_pdiscard;
> 
> -/* optimal alignment for discard requests in sectors */
> -int64_t discard_alignment;
> +/* optimal alignment for discard requests in bytes, must be power
> + * of 2, less than max_discard if that is set, and multiple of

s/max_discard/max_pdiscard/

> + * bs->request_alignment. May be 0 if bs->request_alignment is
> + * good enough */
> +uint32_t pdiscard_alignment;
> 
>  /* maximum number of bytes that can zeroized at once (since it is
>   * signed, it must be < 2G, if set), should be multiple of
> diff --git a/block/io.c b/block/io.c
> index 8ca9d43..0f15d05 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -2368,19 +2368,21 @@ int coroutine_fn bdrv_co_discard(BlockDriverState 
> *bs, int64_t sector_num,
>  goto out;
>  }
> 
> -max_discard = MIN_NON_ZERO(bs->bl.max_discard, BDRV_REQUEST_MAX_SECTORS);
> +max_discard = MIN_NON_ZERO(bs->bl.max_pdiscard >> BDRV_SECTOR_BITS,
> +   BDRV_REQUEST_MAX_SECTORS);
>  while (nb_sectors > 0) {
>  int ret;
>  int num = nb_sectors;
> +int discard_alignment = bs->bl.pdiscard_alignment >> 
> BDRV_SECTOR_BITS;
> 
>  /* align request */
> -if (bs->bl.discard_alignment &&
> -num >= bs->bl.discard_alignment &&
> -sector_num % bs->bl.discard_alignment) {
> -if (num > bs->bl.discard_alignment) {
> -num = bs->bl.discard_alignment;
> +if (discard_alignment &&
> +num >= discard_alignment &&
> +sector_num % discard_alignment) {
> +if (num > discard_alignment) {
> +num = discard_alignment;
>  }
> -num -= sector_num % bs->bl.discard_alignment;
> +num -= sector_num % discard_alignment;

Or just

   num = discard_alignment - sector_num % discard_alignment;

without the if.

Otherwise looks good,

Reviewed-by: Fam Zheng 
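
The simplification suggested above can be checked in isolation. The sketch
below is only an illustration: both helper names are made up for this example,
and the preconditions are taken from the enclosing "if" in the quoted hunk
(non-zero alignment, num >= alignment, sector_num not aligned), plus the
assumption that sector_num is non-negative.

```c
#include <assert.h>
#include <stdint.h>

/* Head length computed the way the quoted hunk does it (with the inner if). */
static int sketch_head_with_if(int64_t sector_num, int num,
                               int discard_alignment)
{
    /* Preconditions of the outer "if" in bdrv_co_discard(). */
    assert(discard_alignment > 0);
    assert(num >= discard_alignment);
    assert(sector_num >= 0 && sector_num % discard_alignment != 0);

    if (num > discard_alignment) {
        num = discard_alignment;
    }
    num -= sector_num % discard_alignment;
    return num;
}

/* Head length computed directly, as suggested in the review comment. */
static int sketch_head_without_if(int64_t sector_num, int discard_alignment)
{
    assert(discard_alignment > 0 && sector_num >= 0);
    return (int)(discard_alignment - sector_num % discard_alignment);
}
```

Since num >= discard_alignment holds on entry, the inner "if" always leaves num
equal to discard_alignment, so both forms yield the distance to the next
aligned sector.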

>  }
> 
>  /* limit request size */
> diff --git a/block/iscsi.c b/block/iscsi.c
> index 368687d..0d16c31 100644
> --- a/block/iscsi.c
> +++ b/block/iscsi.c
> @@ -1696,13 +1696,6 @@ static void iscsi_close(BlockDriverState *bs)
>  memset(iscsilun, 0, sizeof(IscsiLun));
>  }
> 
> -static int sector_limits_lun2qemu(int64_t sector, IscsiLun *iscsilun)
> -{
> -int limit = MIN(sector_lun2qemu(sector, iscsilun), INT_MAX / 2 + 1);
> -
> -return limit < BDRV_REQUEST_MAX_SECTORS ? limit : 0;
> -}
> -
>  static void iscsi_refresh_limits(BlockDriverState *bs, Error **errp)
>  {
>  /* We don't actually refresh here, but just return data queried in
> @@ -1722,14 +1715,14 @@ static void iscsi_refresh_limits(BlockDriverState 
> *bs, Error **errp)
>  }
> 
>  if (iscsilun->lbp.lbpu) {
> -if (iscsilun->bl.max_unmap < 0x) {
> -bs->bl.max_discard =
> -sector_limits_lun2qemu(iscsilun->bl.max_unmap, iscsilun);
> +if (iscsilun->bl.max_unmap < 0x / iscsilu

Re: [Qemu-devel] [PULL 5/8] target-sparc: Use global registers for the register window

2016-06-23 Thread Paolo Bonzini

On 24/06/2016 05:57, Richard Henderson wrote:
> 
> Whatever happens, it happens after 10GB of logs, which is simply too
> much to sift through.  I've tried to narrow it down, but the lack of a
> hardware tlb refill means that we get hundreds of thousands of Data
> Access Faults that are simply TLB misses and not the actual Segmentation
> Fault in question.
> 
> It doesn't seem to affect other OSes, so I can't imagine what quirk is
> being exercised in this case.
> 
> As loath as I am to suggest it, we may have to revert the sparc indirect
> register patch for the release.

We have more than a month.  If it's reproducible, it can be fixed. :)

> I do now ping the rest of my sparc improvements patchset.  It's
> completely independent of the use of indirect registers.

Mark, perhaps you can try to use migration to reduce the amount of
logging?  (Start QEMU with -snapshot, try to stop the vm before it
fails.  If you succeed, do a "migrate exec:cat>foo.sav" followed by
"commit"; if you fail, try again).

It would be nice to have a mechanism to stop the VM after executing N
basic blocks.  Binary search on this value then can help with coming up
with a more easily debuggable snapshot, possibly to a point where the
difference between pre-patch and post-patch becomes deterministic.

Paolo

Re: [Qemu-devel] [RFC 0/9] Introduce light weight PC platform pc-lite

2016-06-23 Thread Claudio Fontana

On 24.06.2016 14:39, Claudio Fontana wrote:
> Hi Paolo,
> 
> On 23.06.2016 20:44, Paolo Bonzini wrote:
>>
>>
>> On 23/06/2016 10:32, Chao Peng wrote:
>>> The original usage model is to replace kvm-tool with QEMU for Clear
>>> Containers (https://clearlinux.org/features/clear-containers). It's not
>>> going to present the guest a real PC platform, but instead, a totally
>>> virtual platform.
>>
>> It is not completely virtual; it has PCI for example.  Hyper-V is an
>> example of a completely virtual platform, even the LAPIC is customized
>> with paravirtual features.
>>
>> qboot does basically four things: 1) relocate from ROM to 0xf; 2)
>> initialize PCI; 2) provide the ACPI and e820 tables; 3) boot.
>>
>> If Linux can boot without initializing PCI bridges and without INTX, we
>> can remove that code from qboot.  The PCI scan is the most expensive
>> part, I think.  (2) and (3) are the same no matter if you run them in
>> QEMU or the guest.
>>
>> That leaves out only relocation (PAM).
>>
>>> Every little bit boot time saving is important because
>>> we are trying to achieve comparable result with that for Linux native
>>> container.
>>>
>>> With this usage model, I doubt introducing a firmware layer is a good
>>> idea:
>>>
>>> On one side, even with optimized and compact qboot it still takes us
>>> ~15ms.
>>
>> Have you profiled it?  If it is code in QEMU that we can optimize (e.g.
>> memory.c), that would benefit all guests.
>>
>>> This is not a small value because current Linux kernel takes
>>> only ~50ms (and we are still on the way to optimize it). And when
>>> you look at the SeaBIOS or qboot, almost all the code are useless for
>>> this usage model. They are doing things that is important for
>>> traditional PC booting but cost 15ms doing useless things for us (It
>>> is really not easy to save 15ms in other place, for example, in
>>> Linux. Personally I tend to change the architecture for this new
>>> usage model, e.g. eliminate firmware).
>>>
>>> On the other side, even boot the new pc-lite platform with firmware,
>>> it does not mean it can support non-Linux system like Windows. So
>>> generally I don't see the benefit of introducing a firmware layer.
>>
>> The main benefit is maintainability, by reducing the amount of code
>> specific to pc-lite.
>>
>>> Besides, I'm also not quite sure if build around Q35 is the best
>>> solution:
>>>
>>> The problem with Q35 is some features like SMM/SMRAM/PAM slow down
>>> the booting even we actually never use them. While removing these
>>> features can cause guest see different feature set for a same device
>>> and it also prevents us to do further optimizations on that in guest.
>>
>> Of these, qboot only uses PAM, and even that could be removed (PAM is
>> only necessary because of how qboot probes parallel flash).  SMRAM
>> should not slow down booting if you don't use them.  Do they?
>>
>> Paolo
> 
> I use qboot for similar goals, you mention that PAM is necessary because of 
> how qboot probes parallel flash,
> however in my custom platform I removed PAM completely from QEMU, and 
> everything seems to work without any problems..
>

Btw before you ask: yes I am booting with pflash.

Ciao

C.

Re: [Qemu-devel] [PATCH v2 1/2] qapi: Report support for -device cpu hotplug in query-machines

2016-06-23 Thread Igor Mammedov

On Thu, 23 Jun 2016 23:23:33 +0200
Peter Krempa  wrote:

> For management apps it's very useful to know whether the selected
> machine type supports cpu hotplug via the new -device approach. Using
> the presence of 'query-hotpluggable-cpus' alone is not enough as a
> witness.
> 
> Add a property to 'MachineInfo' called 'hotpluggable-cpus' that will
> report the presence of this feature.
> 
> Example of output:
> {
> "hotpluggable-cpus": false,
> "name": "mac99",
> "cpu-max": 1
> },
> {
> "hotpluggable-cpus": true,
> "name": "pseries-2.7",
> "is-default": true,
> "cpu-max": 255,
> "alias": "pseries"
> },
> 
> Signed-off-by: Peter Krempa 
> Reviewed-by: Eric Blake 

Reviewed-by: Igor Mammedov 

> ---
>  qapi-schema.json | 5 -
>  vl.c | 1 +
>  2 files changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/qapi-schema.json b/qapi-schema.json
> index 0964eec..24ede28 100644
> --- a/qapi-schema.json
> +++ b/qapi-schema.json
> @@ -2986,11 +2986,14 @@
>  # @cpu-max: maximum number of CPUs supported by the machine type
>  #   (since 1.5.0)
>  #
> +# @hotpluggable-cpus: cpu hotplug via -device is supported (since 2.7.0)
> +#
>  # Since: 1.2.0
>  ##
>  { 'struct': 'MachineInfo',
>'data': { 'name': 'str', '*alias': 'str',
> -'*is-default': 'bool', 'cpu-max': 'int' } }
> +'*is-default': 'bool', 'cpu-max': 'int',
> +'hotpluggable-cpus': 'bool'} }
> 
>  ##
>  # @query-machines:
> diff --git a/vl.c b/vl.c
> index c85833a..4c1f9ae 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -1524,6 +1524,7 @@ MachineInfoList *qmp_query_machines(Error
> **errp)
> 
>  info->name = g_strdup(mc->name);
>  info->cpu_max = !mc->max_cpus ? 1 : mc->max_cpus;
> +info->hotpluggable_cpus = !!mc->query_hotpluggable_cpus;
> 
>  entry = g_malloc0(sizeof(*entry));
>  entry->value = info;
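
As a side note on the vl.c hunk: the new field is derived purely from whether
the machine class provides a callback. A standalone sketch of that idiom
follows; SketchMachineClass, SketchMachineInfo and sketch_dummy_query() are
hypothetical reductions written for this example, not the QEMU types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reduced machine class: only what the hunk touches. */
typedef struct SketchMachineClass {
    const char *name;
    int max_cpus;
    void *(*query_hotpluggable_cpus)(void);  /* NULL if unsupported */
} SketchMachineClass;

/* Hypothetical reduced MachineInfo. */
typedef struct SketchMachineInfo {
    const char *name;
    int64_t cpu_max;
    bool hotpluggable_cpus;
} SketchMachineInfo;

/* Placeholder callback; a real machine would return its CPU list here. */
static void *sketch_dummy_query(void)
{
    return NULL;
}

/* Mirrors the three assignments in qmp_query_machines(). */
static SketchMachineInfo sketch_machine_info(const SketchMachineClass *mc)
{
    SketchMachineInfo info;

    assert(mc != NULL);
    info.name = mc->name;
    info.cpu_max = !mc->max_cpus ? 1 : mc->max_cpus;
    /* "!!" collapses "callback present" into a strict true/false. */
    info.hotpluggable_cpus = !!mc->query_hotpluggable_cpus;
    return info;
}
```

A machine class that leaves the callback NULL reports false, which matches the
mac99 entry in the example output, while one that sets it reports true.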

Re: [Qemu-devel] [PATCH v3 15/22] block: Switch transfer length bounds to byte-based

2016-06-23 Thread Fam Zheng

On Thu, 06/23 16:37, Eric Blake wrote:
> Sector-based limits are awkward to think about; in our on-going
> quest to move to byte-based interfaces, convert max_transfer_length
> and opt_transfer_length.  Rename them (dropping the _length suffix)
> so that the compiler will help us catch the change in semantics
> across any rebased code, and improve the documentation.  Use unsigned
> values, so that we don't have to worry about negative values and
> so that bit-twiddling is easier; however, we are still constrained
> by 2^31 of signed int in most APIs.
> 
> When a value comes from an external source (iscsi and raw-posix),
> sanitize the results to ensure that opt_transfer is a power of 2.
> 
> Signed-off-by: Eric Blake 

Reviewed-by: Fam Zheng
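
To illustrate the "sanitize to a power of 2" step from the commit message,
here is a minimal sketch. The helper names are invented for this example, and
rounding down (rather than up) and clamping to INT32_MAX before rounding are
assumptions on my side, not something the quoted message states.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_SECTOR_BITS 9    /* 512-byte sectors, assumed for the sketch */

/* Largest power of two that is <= value; 0 stays 0 (meaning "unset"). */
static uint32_t sketch_pow2floor(uint32_t value)
{
    uint32_t p = 1;

    if (value == 0) {
        return 0;
    }
    while (p <= value / 2) {
        p *= 2;
    }
    return p;
}

/*
 * Convert a sector-based hint from an external source into a byte-based
 * opt_transfer: clamp to the signed 32-bit range, then round down to a
 * power of two.
 */
static uint32_t sketch_opt_transfer_bytes(uint64_t device_sectors)
{
    uint64_t bytes;

    if (device_sectors > ((uint64_t)INT32_MAX >> SKETCH_SECTOR_BITS)) {
        bytes = INT32_MAX;      /* still constrained by signed int APIs */
    } else {
        bytes = device_sectors << SKETCH_SECTOR_BITS;
    }
    assert(bytes <= INT32_MAX);
    return sketch_pow2floor((uint32_t)bytes);
}
```

For instance, a device reporting 24 sectors (12288 bytes) would end up with an
8192-byte opt_transfer under these assumptions.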

Re: [Qemu-devel] [PATCH 2/4] virtio: Always use aio path to set host handler

2016-06-23 Thread Paolo Bonzini

On 24/06/2016 07:12, Fam Zheng wrote:
> Apart from the interface difference, the aio version works the same as
> the non-aio one. The event notifier versus aio fd handler makes no
> difference, except the former led to an ugly patch in commit
> ab27c3b5e7, which won't be necessary any more.
> 
> As the first step to unify them, all callers are switched to this
> renamed aio interface, and function comment is added.

I think this is too aggressive, and I'm not sure it's correct.
bdrv_drain wants to look at block devices only, while this would also
process NICs, for example.  I don't know the effect.

Can we do this only for virtio-blk and virtio-scsi?

Paolo

Re: [Qemu-devel] [PATCH v1 02/11] ppc/xics: Move SPAPR specific code to a separate file

2016-06-23 Thread David Gibson

On Fri, Jun 24, 2016 at 11:27:58AM +0530, Nikunj A Dadhania wrote:
> David Gibson  writes:
> 
> > [ Unknown signature status ]
> > On Thu, Jun 23, 2016 at 11:17:21PM +0530, Nikunj A Dadhania wrote:
> >> From: Benjamin Herrenschmidt 
> >> 
> >> Leave the core ICP/ICS logic in xics.c and move the top level
> >> class wrapper, hypercall and RTAS handlers to xics_spapr.c
> >> 
> >> Signed-off-by: Benjamin Herrenschmidt 
> >> [add cpu.h in xics_spapr.c, move set_nr_irqs and set_nr_servers to
> >>  xics_spapr.c]
> >> Signed-off-by: Nikunj A Dadhania 
> >> ---
> >>  default-configs/ppc64-softmmu.mak |   1 +
> >>  hw/intc/Makefile.objs |   1 +
> >>  hw/intc/xics.c| 418 
> >> +---
> >>  hw/intc/xics_spapr.c  | 432 
> >> ++
> >>  include/hw/ppc/xics.h |  21 ++
> >>  5 files changed, 464 insertions(+), 409 deletions(-)
> >>  create mode 100644 hw/intc/xics_spapr.c
> >> 
> >> diff --git a/default-configs/ppc64-softmmu.mak 
> >> b/default-configs/ppc64-softmmu.mak
> >> index bb71b23..c4be59f 100644
> >> --- a/default-configs/ppc64-softmmu.mak
> >> +++ b/default-configs/ppc64-softmmu.mak
> >> @@ -49,6 +49,7 @@ CONFIG_ETSEC=y
> >>  CONFIG_LIBDECNUMBER=y
> >>  # For pSeries
> >>  CONFIG_XICS=$(CONFIG_PSERIES)
> >> +CONFIG_XICS_SPAPR=$(CONFIG_PSERIES)
> >>  CONFIG_XICS_KVM=$(and $(CONFIG_PSERIES),$(CONFIG_KVM))
> >>  # For PReP
> >>  CONFIG_MC146818RTC=y
> >> diff --git a/hw/intc/Makefile.objs b/hw/intc/Makefile.objs
> >> index c7bbf88..530df2e 100644
> >> --- a/hw/intc/Makefile.objs
> >> +++ b/hw/intc/Makefile.objs
> >> @@ -30,6 +30,7 @@ obj-$(CONFIG_OPENPIC_KVM) += openpic_kvm.o
> >>  obj-$(CONFIG_RASPI) += bcm2835_ic.o bcm2836_control.o
> >>  obj-$(CONFIG_SH4) += sh_intc.o
> >>  obj-$(CONFIG_XICS) += xics.o
> >> +obj-$(CONFIG_XICS_SPAPR) += xics_spapr.o
> >>  obj-$(CONFIG_XICS_KVM) += xics_kvm.o
> >>  obj-$(CONFIG_ALLWINNER_A10_PIC) += allwinner-a10-pic.o
> >>  obj-$(CONFIG_S390_FLIC) += s390_flic.o
> >> diff --git a/hw/intc/xics.c b/hw/intc/xics.c
> >> index a715532..6ca391f 100644
> >> --- a/hw/intc/xics.c
> >> +++ b/hw/intc/xics.c
> >> @@ -32,12 +32,11 @@
> >>  #include "hw/hw.h"
> >>  #include "trace.h"
> >>  #include "qemu/timer.h"
> >> -#include "hw/ppc/spapr.h"
> >>  #include "hw/ppc/xics.h"
> >>  #include "qemu/error-report.h"
> >>  #include "qapi/visitor.h"
> >>  
> >> -static int get_cpu_index_by_dt_id(int cpu_dt_id)
> >> +int get_cpu_index_by_dt_id(int cpu_dt_id)
> >
> > If this is made public it needs an xics_*() name; the current one is too
> > generic for a global symbol.
> 
> Sure. Should we also make  icp_set_*  as xics_icp_set_* then ?

I'm happy enough to treat icp_() as sufficient namespacing on its own.

> 
> >
> >>  {
> >>  PowerPCCPU *cpu = ppc_get_vcpu_by_dt_id(cpu_dt_id);
> >>  
> >> @@ -242,7 +241,7 @@ static void icp_resend(XICSState *icp, int server)
> >>  ics_resend(icp->ics);
> >>  }
> >>  
> >> -static void icp_set_cppr(XICSState *icp, int server, uint8_t cppr)
> >> +void icp_set_cppr(XICSState *icp, int server, uint8_t cppr)
> >>  {
> >>  ICPState *ss = icp->ss + server;
> >>  uint8_t old_cppr;
> >> @@ -266,7 +265,7 @@ static void icp_set_cppr(XICSState *icp, int server, 
> >> uint8_t cppr)
> >>  }
> >>  }
> >>  
> >> -static void icp_set_mfrr(XICSState *icp, int server, uint8_t mfrr)
> >> +void icp_set_mfrr(XICSState *icp, int server, uint8_t mfrr)
> >>  {
> >>  ICPState *ss = icp->ss + server;
> >>  
> >> @@ -276,7 +275,7 @@ static void icp_set_mfrr(XICSState *icp, int server, 
> >> uint8_t mfrr)
> >>  }
> >>  }
> >>  
> >> -static uint32_t icp_accept(ICPState *ss)
> >> +uint32_t icp_accept(ICPState *ss)
> >>  {
> >>  uint32_t xirr = ss->xirr;
> >>  
> >> @@ -289,7 +288,7 @@ static uint32_t icp_accept(ICPState *ss)
> >>  return xirr;
> >>  }
> >>  
> >> -static void icp_eoi(XICSState *icp, int server, uint32_t xirr)
> >> +void icp_eoi(XICSState *icp, int server, uint32_t xirr)
> >>  {
> >>  ICPState *ss = icp->ss + server;
> >>  
> 
> Regards
> Nikunj
> 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] [PATCH v1 02/11] ppc/xics: Move SPAPR specific code to a separate file

2016-06-23 Thread Nikunj A Dadhania

David Gibson  writes:

> [ Unknown signature status ]
> On Thu, Jun 23, 2016 at 11:17:21PM +0530, Nikunj A Dadhania wrote:
>> From: Benjamin Herrenschmidt 
>> 
>> Leave the core ICP/ICS logic in xics.c and move the top level
>> class wrapper, hypercall and RTAS handlers to xics_spapr.c
>> 
>> Signed-off-by: Benjamin Herrenschmidt 
>> [add cpu.h in xics_spapr.c, move set_nr_irqs and set_nr_servers to
>>  xics_spapr.c]
>> Signed-off-by: Nikunj A Dadhania 
>> ---
>>  default-configs/ppc64-softmmu.mak |   1 +
>>  hw/intc/Makefile.objs |   1 +
>>  hw/intc/xics.c| 418 +---
>>  hw/intc/xics_spapr.c  | 432 
>> ++
>>  include/hw/ppc/xics.h |  21 ++
>>  5 files changed, 464 insertions(+), 409 deletions(-)
>>  create mode 100644 hw/intc/xics_spapr.c
>> 
>> diff --git a/default-configs/ppc64-softmmu.mak 
>> b/default-configs/ppc64-softmmu.mak
>> index bb71b23..c4be59f 100644
>> --- a/default-configs/ppc64-softmmu.mak
>> +++ b/default-configs/ppc64-softmmu.mak
>> @@ -49,6 +49,7 @@ CONFIG_ETSEC=y
>>  CONFIG_LIBDECNUMBER=y
>>  # For pSeries
>>  CONFIG_XICS=$(CONFIG_PSERIES)
>> +CONFIG_XICS_SPAPR=$(CONFIG_PSERIES)
>>  CONFIG_XICS_KVM=$(and $(CONFIG_PSERIES),$(CONFIG_KVM))
>>  # For PReP
>>  CONFIG_MC146818RTC=y
>> diff --git a/hw/intc/Makefile.objs b/hw/intc/Makefile.objs
>> index c7bbf88..530df2e 100644
>> --- a/hw/intc/Makefile.objs
>> +++ b/hw/intc/Makefile.objs
>> @@ -30,6 +30,7 @@ obj-$(CONFIG_OPENPIC_KVM) += openpic_kvm.o
>>  obj-$(CONFIG_RASPI) += bcm2835_ic.o bcm2836_control.o
>>  obj-$(CONFIG_SH4) += sh_intc.o
>>  obj-$(CONFIG_XICS) += xics.o
>> +obj-$(CONFIG_XICS_SPAPR) += xics_spapr.o
>>  obj-$(CONFIG_XICS_KVM) += xics_kvm.o
>>  obj-$(CONFIG_ALLWINNER_A10_PIC) += allwinner-a10-pic.o
>>  obj-$(CONFIG_S390_FLIC) += s390_flic.o
>> diff --git a/hw/intc/xics.c b/hw/intc/xics.c
>> index a715532..6ca391f 100644
>> --- a/hw/intc/xics.c
>> +++ b/hw/intc/xics.c
>> @@ -32,12 +32,11 @@
>>  #include "hw/hw.h"
>>  #include "trace.h"
>>  #include "qemu/timer.h"
>> -#include "hw/ppc/spapr.h"
>>  #include "hw/ppc/xics.h"
>>  #include "qemu/error-report.h"
>>  #include "qapi/visitor.h"
>>  
>> -static int get_cpu_index_by_dt_id(int cpu_dt_id)
>> +int get_cpu_index_by_dt_id(int cpu_dt_id)
>
> If this is made public it needs an xics_*() name; the current one is too
> generic for a global symbol.

Sure. Should we also make  icp_set_*  as xics_icp_set_* then ?

>
>>  {
>>  PowerPCCPU *cpu = ppc_get_vcpu_by_dt_id(cpu_dt_id);
>>  
>> @@ -242,7 +241,7 @@ static void icp_resend(XICSState *icp, int server)
>>  ics_resend(icp->ics);
>>  }
>>  
>> -static void icp_set_cppr(XICSState *icp, int server, uint8_t cppr)
>> +void icp_set_cppr(XICSState *icp, int server, uint8_t cppr)
>>  {
>>  ICPState *ss = icp->ss + server;
>>  uint8_t old_cppr;
>> @@ -266,7 +265,7 @@ static void icp_set_cppr(XICSState *icp, int server, 
>> uint8_t cppr)
>>  }
>>  }
>>  
>> -static void icp_set_mfrr(XICSState *icp, int server, uint8_t mfrr)
>> +void icp_set_mfrr(XICSState *icp, int server, uint8_t mfrr)
>>  {
>>  ICPState *ss = icp->ss + server;
>>  
>> @@ -276,7 +275,7 @@ static void icp_set_mfrr(XICSState *icp, int server, 
>> uint8_t mfrr)
>>  }
>>  }
>>  
>> -static uint32_t icp_accept(ICPState *ss)
>> +uint32_t icp_accept(ICPState *ss)
>>  {
>>  uint32_t xirr = ss->xirr;
>>  
>> @@ -289,7 +288,7 @@ static uint32_t icp_accept(ICPState *ss)
>>  return xirr;
>>  }
>>  
>> -static void icp_eoi(XICSState *icp, int server, uint32_t xirr)
>> +void icp_eoi(XICSState *icp, int server, uint32_t xirr)
>>  {
>>  ICPState *ss = icp->ss + server;
>>  

Regards
Nikunj

[Qemu-devel] [PATCH v2 11/10] pc: acpi: update expected DSDT blobs with new CPU hotplug AML

2016-06-23 Thread Igor Mammedov

Signed-off-by: Igor Mammedov 
---
 tests/acpi-test-data/pc/DSDT | Bin 5503 -> 6008 bytes
 tests/acpi-test-data/pc/DSDT.bridge  | Bin 7362 -> 7867 bytes
 tests/acpi-test-data/q35/DSDT| Bin 8265 -> 8770 bytes
 tests/acpi-test-data/q35/DSDT.bridge | Bin 8282 -> 8787 bytes
 4 files changed, 0 insertions(+), 0 deletions(-)

diff --git a/tests/acpi-test-data/pc/DSDT b/tests/acpi-test-data/pc/DSDT
index 
8b4f1a09b87f8361fb572022f69d304ddeeace99..8053d711058c0f9541d6d97690819f9de697751c
 100644
GIT binary patch
delta 907
zcmb7D&2G~`5Z-NEx>>tLsVpJEfk-_dA(Sp$dNyM>F^y9z>ms3zWFM2B_DE0ekHib)
z1ya;|gm@7iqz}L?%otSBO)m9mJv;h-X1-m${oRwXj*G7^7~}UpJ7S@oG++%K1@K!
zW*r+~K)FJBvk0*UGEty|6bdSkhh&#}%t!yKtSWNcSn4?tOg+cdf)4&9?wWVOC`h&Q
zk%UNsc2+gkwt^LEk0H2I&OWNc3b%uMvm`~9B?K45iL
M-6|{k(W(pc2VMhaIRF3v

diff --git a/tests/acpi-test-data/pc/DSDT.bridge 
b/tests/acpi-test-data/pc/DSDT.bridge
index 
0d09b5cc61114b68fee0f14729732786854b19fe..850e71a973e52cc5e546fdd2757f0e089fed7192
 100644
GIT binary patch
delta 907
zcmb7D&2G~`5Z-NEx>>tLp*~RwBK5!pwF?reIB+mylbFWA%DPBsBiYBKr@i*%{z$w)
zULZj4QPpE_eFEO7%($qcn_Tc|Jv;h-X1=|BfAc}cIxcQrFvj0q$&-yvdQj?*r8x&V
z-fz)yb|m&{!yz9WGEu@vcQ&Q$akgL!9-J_9nvZnBeYTK+IoqY5h;<=Ph8)tN<}k{>
za!5Wa&OCrD7Ut|3HMKw|gD!T)QPB;9G99MEGAf$$!RzrJQA2*DMcGf|dNYDNRqBl*
z327WezvX(^f#TD*wYi*4*mqD$O~d{ZK>D`X9q)StkVC5SQKCuw-JPAdWn{Cgnm?bp
z7bM@xr^)vEk6-V)wxu`hg
zdCyz2)GPxB=CNve$
z*}wCroh`{l&vv@PnU!m9m0FM8A`W4tJadtF7_G5l&4)R~e2Gg+8747{ijFS@6Zm@Y
sKJ-h4e>+5WZt2h?Blr~m)}

delta 383
zcmYLFK~KUk82!3|ECqu|V~FuU_yJ@wdhlQ(mSHjjHg*>?&AKfl^VW+8;DCxz&Q9+`
zcPGZ9H%|U6yJ=z%eSPis@?PK9;pzZ)dDmI(0HB8tJSt7ChR2URvQJsRqsiHvva>-1
zLe|RI00Z*n%j>z1HIR`49i&iDfYc}3lyL|BtCFHfadn}mKrm&Nt+_4yP3$#x&M8Q>
zdqW9<1o74@zKSR1#Rq?C0e*rf0+(PMpkPr^*Tb<;ZfbrKAPAGzbL@vQ%o7Soa2%q6
z)4LF8VdRMQ8{yuc7TWPksU2Tc>(eDHQF8?Dt#Woy8J4)|-c@pvWjWe#XM2>H=I<}b
zy4UYGkY-H)kc!8hPR6(c>dCTAq=;ohL~5CzrV>Q(nCY*LthG%YA+mMZ_D-1PS>`Ta
PbynRfEBe_ezt?{Ns(5C)

diff --git a/tests/acpi-test-data/q35/DSDT b/tests/acpi-test-data/q35/DSDT
index 
67445428d935bd6ea5957526089ba7e719a1783a..58fbb3d2e2dc8e8256984744bfb9411feb2e35fe
 100644
GIT binary patch
delta 907
zcmb7D%}(1u5Z)zDW!8o&l`p_0f&-Tdw!Ltwgc+N}5T{nwAXS^nJ|>*@+LH;Dc!9h?
zsCa>jH{iyNW8Ma43_@j-OMP0;j=rCnZyyftBN3@6KiX!DfBP#Y8?EG^P)CKn0x{fg
z&~f^6bFb1Ivfe3k3mB-@nh+g6_vb$Me=&WT4OGs(xi5S@*`c41wIPf99Fo4`Fi6ii
z#3zz72Oy1HJ^4gU4N!5fjh#`Lw*$XO`iW~874=o`^Kg%-Av;rU(M_UiErcwQ@{1%1
zN#uu*q8YRxKmM^Y6Vnm91}Wt@=zajCU)4LqeTM)tNMtsM6^Va(G9$Z;Y=)Jy&8dAs
z@{Rad>fQRLdJlD525FN#{T~yzSrFls;5~lOT?Yky_svIT8R6xh(3_SV#T#a$;*e(@
zXTegv3>=uJiGUvXje!=iN4n(ubEwwup)3G=l8

delta 398
zcmYLFK}y3w6#bL58m1{)8N>>O&;yhpx^JDP#Yoek$wne%l2Ax@vvFa&sI{P7xU#r&
zkxDKg-oTaKpvQ3%1!wW!yqS6bz4@Qr?>6qbUGKsGKu_;@QJ-5!fqNd1Gs*@XMJ!j8
z4V?tIs8z5A8WbxOlU0Zn5Qzj81eZ{PEF{O2xgPpgrDdMsy^Wdy!IW9H;y3XZv3JG!W7Rq-i2v|qKz(($5Srdo68^k&r`*
z*D-Ru-W5k1tw5~aaR2_Y)LOsRTXQXNy$-Hu0Uf!uD#m@Z)wen_Zo$8kK|EC1>e%oX
zdO`AN?R|fFN||Z?ewtK+LC1qE9}ED|^L{yVJ&;$HZ6Zl58xm5>?G+g)frnf_(I{&S
dodXpia(LYe`pgO}^91Bpj#^sL52Mj6{{Z}6X!QU9

diff --git a/tests/acpi-test-data/q35/DSDT.bridge 
b/tests/acpi-test-data/q35/DSDT.bridge
index 
e85f5b1af9fcd36c9522b05e0085af84d6c010cb..c392802a95cb5690c6719b7909c9f8fa2213e503
 100644
GIT binary patch
delta 907
zcmb7D%}(1u5Z)yYGHU}$q5OL^gW2t
zX@idQ+k=xzbIAHv%q?K3TDwAY`O;td+`nV`V>VPd_m;l!_3V&-Le_yS9&ktoio-Dd
zz#)DoIdcHg$knqK)YJeK_dD1bg?T6Ni)4_vhEY-9_m4*>L=D+{8hhI68F*AcI6^!&s5{)yjA_JZFu?l%31Ufs1

delta 398
zcmYLF&r8EF82!?2TCxtE1aZQk>@1`BMs1w;b)Td(}WpQM-P{HZyFDV}n?4U-5(Yh50XCtSM8_tOYLoHn0VVJzYzA%_&N
zV&wR}OODoBfmpre!NWzVwR)|$W*YPS4z6g7j=WkG<38Hx8yy*U5ZuckKB{bVYy@+?
zAbGU*Zm>9|u4(>$npD}K<3pAY27u^!znpnK$Scb>ktCK43904wiVT#%W3Io^sM{Di
cdn!U?|GLHcuEi|#801!tT3XQ$qj6IH0lN=q*#H0l

-- 
1.8.3.1

Re: [Qemu-devel] [RFC PATCH 3/3] filter-rewriter: rewrite tcp packet to keep secondary connection

2016-06-23 Thread Jason Wang




On 2016-06-23 18:48, Zhang Chen wrote:



On 06/22/2016 02:34 PM, Jason Wang wrote:



On 2016-06-22 11:12, Zhang Chen wrote:



On 06/20/2016 08:14 PM, Dr. David Alan Gilbert wrote:

* Jason Wang (jasow...@redhat.com) wrote:


On 2016-06-14 19:15, Zhang Chen wrote:

We will rewrite the TCP packets that the secondary receives and sends.

More verbose please, e.g. which fields are rewritten and why.


OK.


Signed-off-by: Zhang Chen 
Signed-off-by: Li Zhijian 
Signed-off-by: Wen Congyang 
---
   net/filter-rewriter.c | 94 +--
   trace-events          |  3 ++
   2 files changed, 95 insertions(+), 2 deletions(-)

diff --git a/net/filter-rewriter.c b/net/filter-rewriter.c
index 12f88c5..86a2f53 100644
--- a/net/filter-rewriter.c
+++ b/net/filter-rewriter.c
@@ -21,6 +21,7 @@
   #include "qemu/main-loop.h"
   #include "qemu/iov.h"
   #include "net/checksum.h"
+#include "trace.h"
   #define FILTER_COLO_REWRITER(obj) \
   OBJECT_CHECK(RewriterState, (obj), TYPE_FILTER_REWRITER)
@@ -64,6 +65,75 @@ static int is_tcp_packet(Packet *pkt)
   }
   }
+static int handle_primary_tcp_pkt(NetFilterState *nf,
+  Connection *conn,
+  Packet *pkt)
+{
+struct tcphdr *tcp_pkt;
+
+tcp_pkt = (struct tcphdr *)pkt->transport_layer;
+
+if (trace_event_get_state(TRACE_COLO_FILTER_REWRITER_DEBUG)) {

Why not use tracepoints directly?

Because trace can't cope with you having to do an allocation/free.


+ char *sdebug, *ddebug;
+sdebug = strdup(inet_ntoa(pkt->ip->ip_src));
+ddebug = strdup(inet_ntoa(pkt->ip->ip_dst));
+fprintf(stderr, "%s: src/dst: %s/%s p: seq/ack=%u/%u"
+"  flags=%x\n", __func__, sdebug, ddebug,
+ntohl(tcp_pkt->th_seq), ntohl(tcp_pkt->th_ack),
+tcp_pkt->th_flags);
However, this should use the trace_ call to write the result even if it's
using trace_event_get_state to switch the whole block on/off.


I will fix it in next version.




+ g_free(sdebug);
+g_free(ddebug);
+}
+
+if (((tcp_pkt->th_flags & (TH_ACK | TH_SYN)) == TH_ACK)) {
+/* save primary colo tcp packet seq */
+conn->primary_seq = ntohl(tcp_pkt->th_ack) - 1;
Looks like primary_seq will only be updated during handshake, I wonder how
this works.


OK.
We assume that the COLO guest is a TCP server.

First, the client starts a TCP handshake. The packet has seq=client_seq,
ack=0, flag=SYN. The COLO primary guest gets this packet and mirrors it
(filter-mirror) to the secondary guest; the secondary receives it via
filter-redirector.
Then the primary guest responds with
pkt(seq=primary_seq,ack=client_seq+1,flag=ACK|SYN), and the secondary guest
responds with pkt(seq=secondary_seq,ack=client_seq+1,flag=ACK|SYN).
Here, filter-rewriter saves secondary_seq in its TCP connection.
To finish the handshake, the client sends
pkt(seq=client_seq+1,ack=primary_seq+1,flag=ACK).
Here, filter-rewriter can get primary_seq, rewrite the ack from primary_seq+1
to secondary_seq+1, and recalculate the checksum, so the secondary TCP
connection stays intact.

When packets are sent and received, the client sends
pkt(seq=client_seq+1+data_len,ack=primary_seq+1,flag=ACK|PSH).
filter-rewriter rewrites the ack and sends the packet to the secondary guest.


If I read your code correctly, secondary_seq will only be updated
during handshake. So the ack seq will always be the same for each packet
received by the secondary?


Yes. I don't know why the kernel does this. But I dumped the packet in hex
and found that in the ack packet, flag=ACK means only the ACK bit is set,
and the seq affects the TCP checksum, which makes the connection fail.



Not sure I get your meaning, but basically the code here should not make
any assumptions about guest behavior.






The primary guest responds with
pkt(seq=primary_seq+1,ack=client_seq+1+data_len,flag=ACK)
and the secondary guest responds with
pkt(seq=secondary_seq+1,ack=client_seq+1+data_len,flag=ACK).


Is ACK a must here?


Yes.



It looks like it is not, e.g. what happens if the guest does not use piggybacked ACKs?

[Qemu-devel] [PULL 33/34] virtio-mmio: convert to ioeventfd callbacks

2016-06-23 Thread Michael S. Tsirkin

From: Cornelia Huck 

Convert to the new interface.

Signed-off-by: Cornelia Huck 
Reviewed-by: Fam Zheng 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/virtio/virtio-mmio.c | 128 
 1 file changed, 41 insertions(+), 87 deletions(-)

diff --git a/hw/virtio/virtio-mmio.c b/hw/virtio/virtio-mmio.c
index d4cd91f..eb84b74 100644
--- a/hw/virtio/virtio-mmio.c
+++ b/hw/virtio/virtio-mmio.c
@@ -93,90 +93,59 @@ typedef struct {
 bool ioeventfd_started;
 } VirtIOMMIOProxy;
 
-static int virtio_mmio_set_host_notifier_internal(VirtIOMMIOProxy *proxy,
-  int n, bool assign,
-  bool set_handler)
+static bool virtio_mmio_ioeventfd_started(DeviceState *d)
 {
-VirtIODevice *vdev = virtio_bus_get_device(&proxy->bus);
-VirtQueue *vq = virtio_get_queue(vdev, n);
-EventNotifier *notifier = virtio_queue_get_host_notifier(vq);
-int r = 0;
+VirtIOMMIOProxy *proxy = VIRTIO_MMIO(d);
 
-if (assign) {
-r = event_notifier_init(notifier, 1);
-if (r < 0) {
-error_report("%s: unable to init event notifier: %d",
- __func__, r);
-return r;
-}
-virtio_queue_set_host_notifier_fd_handler(vq, true, set_handler);
-memory_region_add_eventfd(&proxy->iomem, VIRTIO_MMIO_QUEUENOTIFY, 4,
-  true, n, notifier);
-} else {
-memory_region_del_eventfd(&proxy->iomem, VIRTIO_MMIO_QUEUENOTIFY, 4,
-  true, n, notifier);
-virtio_queue_set_host_notifier_fd_handler(vq, false, false);
-event_notifier_cleanup(notifier);
-}
-return r;
+return proxy->ioeventfd_started;
 }
 
-static void virtio_mmio_start_ioeventfd(VirtIOMMIOProxy *proxy)
+static void virtio_mmio_ioeventfd_set_started(DeviceState *d, bool started,
+  bool err)
 {
-VirtIODevice *vdev = virtio_bus_get_device(&proxy->bus);
-int n, r;
+VirtIOMMIOProxy *proxy = VIRTIO_MMIO(d);
 
-if (!kvm_eventfds_enabled() ||
-proxy->ioeventfd_disabled ||
-proxy->ioeventfd_started) {
-return;
-}
+proxy->ioeventfd_started = started;
+}
 
-for (n = 0; n < VIRTIO_QUEUE_MAX; n++) {
-if (!virtio_queue_get_num(vdev, n)) {
-continue;
-}
+static bool virtio_mmio_ioeventfd_disabled(DeviceState *d)
+{
+VirtIOMMIOProxy *proxy = VIRTIO_MMIO(d);
 
-r = virtio_mmio_set_host_notifier_internal(proxy, n, true, true);
-if (r < 0) {
-goto assign_error;
-}
-}
-proxy->ioeventfd_started = true;
-return;
+return !kvm_eventfds_enabled() || proxy->ioeventfd_disabled;
+}
 
-assign_error:
-while (--n >= 0) {
-if (!virtio_queue_get_num(vdev, n)) {
-continue;
-}
+static void virtio_mmio_ioeventfd_set_disabled(DeviceState *d, bool disabled)
+{
+VirtIOMMIOProxy *proxy = VIRTIO_MMIO(d);
 
-r = virtio_mmio_set_host_notifier_internal(proxy, n, false, false);
-assert(r >= 0);
-}
-proxy->ioeventfd_started = false;
-error_report("%s: failed. Fallback to a userspace (slower).", __func__);
+proxy->ioeventfd_disabled = disabled;
 }
 
-static void virtio_mmio_stop_ioeventfd(VirtIOMMIOProxy *proxy)
+static int virtio_mmio_ioeventfd_assign(DeviceState *d,
+EventNotifier *notifier,
+int n, bool assign)
 {
-int r;
-int n;
-VirtIODevice *vdev = virtio_bus_get_device(&proxy->bus);
+VirtIOMMIOProxy *proxy = VIRTIO_MMIO(d);
 
-if (!proxy->ioeventfd_started) {
-return;
+if (assign) {
+memory_region_add_eventfd(&proxy->iomem, VIRTIO_MMIO_QUEUENOTIFY, 4,
+  true, n, notifier);
+} else {
+memory_region_del_eventfd(&proxy->iomem, VIRTIO_MMIO_QUEUENOTIFY, 4,
+  true, n, notifier);
 }
+return 0;
+}
 
-for (n = 0; n < VIRTIO_QUEUE_MAX; n++) {
-if (!virtio_queue_get_num(vdev, n)) {
-continue;
-}
+static void virtio_mmio_start_ioeventfd(VirtIOMMIOProxy *proxy)
+{
+virtio_bus_start_ioeventfd(&proxy->bus);
+}
 
-r = virtio_mmio_set_host_notifier_internal(proxy, n, false, false);
-assert(r >= 0);
-}
-proxy->ioeventfd_started = false;
+static void virtio_mmio_stop_ioeventfd(VirtIOMMIOProxy *proxy)
+{
+virtio_bus_stop_ioeventfd(&proxy->bus);
 }
 
 static uint64_t virtio_mmio_read(void *opaque, hwaddr offset, unsigned size)
@@ -498,25 +467,6 @@ assign_error:
 return r;
 }
 
-static int virtio_mmio_set_host_notifier(DeviceState *opaque, int n,
- bool assign)
-{
-VirtIOMMIOProxy *proxy = VIRTIO_MMIO(opaque);
-
-/*

[Qemu-devel] [PULL 30/34] virtio-bus: have callers tolerate new host notifier api

2016-06-23 Thread Michael S. Tsirkin

From: Cornelia Huck 

Have vhost and dataplane use the new api for transports that
have been converted.

Signed-off-by: Cornelia Huck 
Reviewed-by: Fam Zheng 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/block/dataplane/virtio-blk.c | 14 +++---
 hw/scsi/virtio-scsi-dataplane.c | 20 +++-
 hw/virtio/vhost.c   | 20 
 3 files changed, 42 insertions(+), 12 deletions(-)

diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index 2073f9a..fdf5fd1 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -79,7 +79,8 @@ void virtio_blk_data_plane_create(VirtIODevice *vdev, 
VirtIOBlkConf *conf,
 }
 
 /* Don't try if transport does not support notifiers. */
-if (!k->set_guest_notifiers || !k->set_host_notifier) {
+if (!k->set_guest_notifiers ||
+(!k->set_host_notifier && !k->ioeventfd_started)) {
 error_setg(errp,
"device is incompatible with dataplane "
"(transport does not support notifiers)");
@@ -157,7 +158,10 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s)
 s->guest_notifier = virtio_queue_get_guest_notifier(s->vq);
 
 /* Set up virtqueue notify */
-r = k->set_host_notifier(qbus->parent, 0, true);
+r = virtio_bus_set_host_notifier(VIRTIO_BUS(qbus), 0, true);
+if (r == -ENOSYS) {
+r = k->set_host_notifier(qbus->parent, 0, true);
+}
 if (r != 0) {
 fprintf(stderr, "virtio-blk failed to set host notifier (%d)\n", r);
 goto fail_host_notifier;
@@ -193,6 +197,7 @@ void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s)
 BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(s->vdev)));
 VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
 VirtIOBlock *vblk = VIRTIO_BLK(s->vdev);
+int r;
 
 if (!vblk->dataplane_started || s->stopping) {
 return;
@@ -217,7 +222,10 @@ void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s)
 
 aio_context_release(s->ctx);
 
-k->set_host_notifier(qbus->parent, 0, false);
+r = virtio_bus_set_host_notifier(VIRTIO_BUS(qbus), 0, false);
+if (r == -ENOSYS) {
+k->set_host_notifier(qbus->parent, 0, false);
+}
 
 /* Clean up guest notifier (irq) */
 k->set_guest_notifiers(qbus->parent, 1, false);
diff --git a/hw/scsi/virtio-scsi-dataplane.c b/hw/scsi/virtio-scsi-dataplane.c
index 1a49f1e..b9a5716 100644
--- a/hw/scsi/virtio-scsi-dataplane.c
+++ b/hw/scsi/virtio-scsi-dataplane.c
@@ -31,7 +31,8 @@ void virtio_scsi_set_iothread(VirtIOSCSI *s, IOThread 
*iothread)
 s->ctx = iothread_get_aio_context(vs->conf.iothread);
 
 /* Don't try if transport does not support notifiers. */
-if (!k->set_guest_notifiers || !k->set_host_notifier) {
+if (!k->set_guest_notifiers ||
+(!k->set_host_notifier && !k->ioeventfd_started)) {
 fprintf(stderr, "virtio-scsi: Failed to set iothread "
"(transport does not support notifiers)");
 exit(1);
@@ -73,7 +74,10 @@ static int virtio_scsi_vring_init(VirtIOSCSI *s, VirtQueue 
*vq, int n,
 int rc;
 
 /* Set up virtqueue notify */
-rc = k->set_host_notifier(qbus->parent, n, true);
+rc = virtio_bus_set_host_notifier(VIRTIO_BUS(qbus), n, true);
+if (rc == -ENOSYS) {
+rc = k->set_host_notifier(qbus->parent, n, true);
+}
 if (rc != 0) {
 fprintf(stderr, "virtio-scsi: Failed to set host notifier (%d)\n",
 rc);
@@ -159,7 +163,10 @@ fail_vrings:
 virtio_scsi_clear_aio(s);
 aio_context_release(s->ctx);
 for (i = 0; i < vs->conf.num_queues + 2; i++) {
-k->set_host_notifier(qbus->parent, i, false);
+rc = virtio_bus_set_host_notifier(VIRTIO_BUS(qbus), i, false);
+if (rc == -ENOSYS) {
+k->set_host_notifier(qbus->parent, i, false);
+}
 }
 k->set_guest_notifiers(qbus->parent, vs->conf.num_queues + 2, false);
 fail_guest_notifiers:
@@ -174,7 +181,7 @@ void virtio_scsi_dataplane_stop(VirtIOSCSI *s)
 BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(s)));
 VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
 VirtIOSCSICommon *vs = VIRTIO_SCSI_COMMON(s);
-int i;
+int i, rc;
 
 if (!s->dataplane_started || s->dataplane_stopping) {
 return;
@@ -198,7 +205,10 @@ void virtio_scsi_dataplane_stop(VirtIOSCSI *s)
 aio_context_release(s->ctx);
 
 for (i = 0; i < vs->conf.num_queues + 2; i++) {
-k->set_host_notifier(qbus->parent, i, false);
+rc = virtio_bus_set_host_notifier(VIRTIO_BUS(qbus), i, false);
+if (rc == -ENOSYS) {
+k->set_host_notifier(qbus->parent, i, false);
+}
 }
 
 /* Clean up guest notifier (irq) */
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 81cc5b0..bce1b6e 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -1110,14 +1110,18 @@ int vhost_dev_enab
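The caller-side pattern in this patch is "try the new API, and fall back to the old callback only when the new one reports -ENOSYS". A self-contained sketch of that pattern follows. The `bus_class` struct, the hook names and the demo functions are hypothetical, and the rule used here to decide when to return -ENOSYS (a missing new-style hook) is an assumption, since the real function body is not part of this excerpt.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bus class: either hook may be NULL. */
struct bus_class {
    int (*ioeventfd_assign)(int n, bool assign);    /* new-style hook */
    int (*set_host_notifier)(int n, bool assign);   /* legacy hook */
};

/* New entry point: reports -ENOSYS when the transport is not converted. */
static int bus_set_host_notifier(const struct bus_class *k, int n, bool assign)
{
    if (!k->ioeventfd_assign) {
        return -ENOSYS;
    }
    return k->ioeventfd_assign(n, assign);
}

/* Caller-side pattern: try the new API, fall back only on -ENOSYS. */
static int set_notifier_compat(const struct bus_class *k, int n, bool assign)
{
    int r = bus_set_host_notifier(k, n, assign);

    if (r == -ENOSYS && k->set_host_notifier) {
        r = k->set_host_notifier(n, assign);
    }
    return r;
}

/* Demo hooks that only count how often the legacy path is taken. */
static int legacy_calls;

static int demo_new_hook(int n, bool assign)
{
    (void)n;
    (void)assign;
    return 0;
}

static int demo_legacy_hook(int n, bool assign)
{
    (void)n;
    (void)assign;
    legacy_calls++;
    return 0;
}
```

Falling back only on -ENOSYS matters: any other error from the new path is a real failure and should be reported rather than retried through the old callback.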

[Qemu-devel] [PATCH v2 12/10] pc: acpi: add expected DSDT/MADT blobs for CPU hotplug testscase

2016-06-23 Thread Igor Mammedov

Signed-off-by: Igor Mammedov 
---
 tests/acpi-test-data/pc/APIC.cphp  | Bin 0 -> 160 bytes
 tests/acpi-test-data/pc/DSDT.cphp  | Bin 0 -> 6435 bytes
 tests/acpi-test-data/q35/APIC.cphp | Bin 0 -> 160 bytes
 tests/acpi-test-data/q35/DSDT.cphp | Bin 0 -> 9197 bytes
 4 files changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 tests/acpi-test-data/pc/APIC.cphp
 create mode 100644 tests/acpi-test-data/pc/DSDT.cphp
 create mode 100644 tests/acpi-test-data/q35/APIC.cphp
 create mode 100644 tests/acpi-test-data/q35/DSDT.cphp

diff --git a/tests/acpi-test-data/pc/APIC.cphp 
b/tests/acpi-test-data/pc/APIC.cphp
new file mode 100644
index 
..1bf8a0a63bc1c9b716d937b96eb34b05016b9366
GIT binary patch
literal 160
zcmXwx!3}^Q3`JW6f}%UP3UJYrBwpObWgNv(oJ8%n@zB24pP!~WmxG9S&r6xsF>kdb
z$yhQtNOavFgY<9)W~DJWDKu7Tozi)bd+hVZHk}Lv=1?18ZTnj%1d=%RQTrg}%kWi^)!*oq
zjaB;huKSJaKKjC?9gl22SDtQmyw9Jwn*>3RH$BEsP!=7l;@G_fQ>*7?r&ia~Yfm7rjAu?cO=_ZlUD+nidHKSIk0569_w2ZYIWHnpC&SPJn}nMch(e6PU}u
z-M9YqF0x=xLTcB^WW%gBDS4lWS{VgVtH3`+yL4UT10$Q=yVh!JKpIS03MLEvoo8oO
zsYg7b2#bWS(jBrxgo#~Z_ugBp=pkGb)ucZwVW56Tm$-yNuPw3#{}%;_*Y3S-tZ#%J
zr)Q%bWtLbZ3IfaWimru=I63rafz7Yd@5S#$BCXON#UEj!7H^WPlFwaOX_#fc*eiN{
zCZ`aVVCyVT*-Iv{H{oxFEwE$u5&MBnGg)?4^lJ7jQ!x$4KLRLr@AnO}9r`K}bv{^n
zoKkl%0n2?vo=IWM3d^k0PsC3|Szg@t{i#aYx>4YhnxH`javEHaIGR`DE0M^HichnG
zG{p!F6G9$X(O4egl>j_4@F+ETltlZcX0>UGykIhdidJcE7&)6(8rm9B-!!%AEy2Ew+VQd1MWeS%w+VK)-^S)6
zqBLO(RUBoFVcM%YbIewocr(Jj>ygg$O7dxk?R%egm_RnYy`9b`VIsLdQ2O@)l!R^5
zXs+pGYjCB1pANG94wJ%Wi)=m1gjyLu+5UYdge{d}ix{?OWXt<(catduHZFOxMToc8
zf$^SfQQ~bqaXaL3=g74Wu3Q({Wi?%Ai2l(yRhk#
zM=Yf-*KcdBBmi3Z>=a9VIYE+svh9+uu#F|)yFN%g?Ly35l#j64?lmSMOi1QnL#CmC
zV0n^ZuB_}FoBeW%B*g?|DTBWh{OuBTI@p8g1iGhY9ldUm&roLje#*Q3M30r9h=FO3af@`o=)hA+hoU$T4a5=3uBhnIrkc?#hv0!z-z
zZc3f-7h6pQbBwM+6RxhJw}JytWA{cy-)vRGA=reUTp7*W$kiS`@;-X}=iJVRA3uD&
zbN|DSiA^=Lu{JEf8OByAc}ZTP?GkE#nS_f{>>~
z(lkSdQZs`fQM0Oz93b^_JEx|ddb2Kj1RL$%>gqkeN`Wtdf0?po*7Ny79z6)o^Mq!hrR=;+zkR~eSUYl6BZ0H=%LbxRDquL3U#(4Pme!Qx
z!l3T+a;opbaSmlRN(!qpSd~r$@rtXCY#5`;@pmCPZ5i`XJf}Q*f$x_UCZwLq@{>gb!plq?UYy
z2?hyll-t=9lZlM?Hn64~+#Q${M4fUVs1!yW7S863@!?+{?(F8eCn~5q>mSU6WZ%%6Ex0toy}+i1e`6|7
z7%&7G*y
zPshQBHxJcgxa43*HU-W$0&xb!S|GmFsPfjUAP!sSjPl(f_B@C+&uCR@*a?LO5`oaD
zVFwf%NV0>?C}3Yyd^7eQs86vC?K`MbzcK4K(nnznN)5C%2Kruio(K&^J`{sn{aAykHT@#^8bUn>Jxk*!3~m_Aa;2IL4tKms2M^53B>U@=3=!bH
zjdO}$@L+tEewC&&w9_EfegyNYbf{s5rw*VKu~5!B$)S^R&?U|azgPsU
zax9dyCOI@22fe{r8Hci1C})M;5IB&EgD!K{G>4{Rp`10tp_w@7dz^KOL#JY)oE6^8
zfv?kX(Dym3!l6nml(S|zG#dx~fV0kU=u9k>v(9qpY#g-ASt0X-u+&&6XFbEAXX2n0
z&N|1TbFom)dX_`a#z8gCI?ti=u~5!>jziDIL3Pf0o>
z2?|uCz_O1DS~7dx6udtUEhsBPO+YQQNuWV-7}{{G8=(ycgDpO^;b_aD4Tpn`I<(<<
z@1bpauM5=`jLLf~~{t@J0fCWLHp!O~CCrUmw|Tq7LcI?fbqy
zvilK3VsbkiCWn?bX2+-@#X>vAt&iC;a!8iYokdb
z$yhQtNOavFgY<9)W~DJWDKu7Tozi)bd+hVZHk}Lv=1?18ZTnj%1k`3ZtwG6@7{+!>rO20e&Ky~_fGu&N4>j(KyTfRU6&f(Q{7gn+qvHeT5T&Q
zzI5~4E?K$!my{O$rJIeQR&u=UJVOsFdBg>$Tdjrp;@7U@bOYH+JKbW~6i)Y6Ewr50
ztwuvQLAzNOemL3PZUy#(*F_M%xH{OF=y=lZwHmu`Ok;=STmzxu~-AH43`
z0IcEL!S{Mh0p+2_I;DD-KHSUnIq*L1?^*BR$SR{(2aBKf6;5`0bTB3`^*_wZUMjJA
z^tu;0Qcu~bHp*?K$ASusA7{7PXh$M1#Mj^DgxxvtD4u_zycMoAnqhavztL^Aiz23;
zUQAtg{?v25-XQ-;zbE>=-0|^|7)*cCza#!~Colf>zs!+1a%XV1nyuMcclv`#Tu3Ar
zwh-?K@8-laG#om$ox>noYZbeEIx&D{45m?Q?xfrvU7kAbhm?EYO?3{=Q(FYvQ86tn
ze3kH3Z?wY{qsl4wkWdRil|@i2Z&^VJAN2-4yqo8qO{Gh=rHBBCLwFFZM+$`;O=w{&cexj^OFEKgs7~B$0_d(GwO}uZUOheI*5@ox>-i
z?OP+_%zTpQxS1=$BjEGUG6LGdUy^5>#@`!cah8w7Lwi)vbEhiS+v&H{j&tQc7b@F0
zC#y>xoH&3ZXv)!26eDnTX&c@v%>RX#-A=?((8)7a`{cZ|DMFnXDRWUbZ
z=Z}xEG)UYqA{Kzt@)+{~RUt8vpRp-s0y~U|sh}yrOhB25keC<^W7Eu3BcS__vobU-
znSiR0n5qiydx_;dHZv8}mP|Exgu2d*p)XV%b}Wu}5=O`QmJofC%6
z2}9?EsS}~D(=l{9hEB)SiBQ*x6+%1HlZMVoQzt@Qr)%hR4V|v36QQm%XXwlsI&-E@
zgt|_wRND2-8#?o*PK3J7DMRO!p>xXAiBQ)$ZRngfbWWQ(5$ZZ;44pHE&KXlDLS3h4
z==2Poo~aX|u5;GVIcw;gHFYA?bW;n!NFqjt%<^_|9P-k8=m=_J^MU#n8XI?Uxmkj15lZjAg
zUN)GQ4d!K&iBM-g!A#{cPcT!h@lQymTDTDsQ#r}9QFGa-xop-%C~G<}(4uCbl~!$J
zplGA;&_EHOtPIi!R8bhH#IYq=*zYhiRY*)F4F)Q)%0M+J8K{IZlMECgMxzW=
zVuumcR9;I4Dxo6-MTns@76vM@!-$ybE$G3{-=Xfl4Sd$v_b*
zoiI>|WhM+%gOY(tC^N}G5h|T9P>E$G3{-=Xfl4Sd$v_b*oiI>|WhM+%gOY(tC^N}G
z5h|T9P>E$G3{-=Xfl4Sd$v_b*oiI>|WhM+%gOY(tC^N}G5h|T9P>E$G3{-=Xfl4Sd
z$v_b*oiI>|WhM+%gOY(tC^N}G5h|T9P>E$G3{-=Xfl4Sd$v_b*oiI>|WhM+%gOY(t
zC^N}G5h|T9P>E$G3{-=Xfl4Sd$v_b*oiI>|WhM+%gOY(tC^N}G5h|T9P>E$G3{-=X
zfl4Sd$v_b*oiI>|WhM+%gOY(tC^N}G5h|T9P>E$G3{-=Xfl4Sd$v_b*oiI>|WhM+%
zgOY(tC^N}G5h|T9P>E$G3{-=Xfl4Sd$v_b*oiI>|WhM+%gOY(tC^N}G5h|T9P>E$G
z3{-=Xfl4Sd$v_b*oiI>|WhM+%gOY(tC^N}G5h|T9P(+%6BGL^Mp>CiEH3QX{Fi?$2
z2C6a1Ks6=|RAa(GH6|IT#v}vPm@rU{2?N!bWS|<83{+#nKs6=|RAZ8XYD_XvjR^xq
zq^>Ru6cMf%pG-1Pgt!bUB&IsIFi=G5+`>Q+sdGyPicp>pK|4wC931G!uPIT(yEuZdI{sxtC#%KtCu>5
z5or$+)!o!%ln=D>0hbRF<%2WI2gCAVvOFG_eQ))lRzBqNVWNC^M)`19zML#?KSueo
zR=&*T%Zc*kGs>5TP_#X(>yV_hfmscwk#WkJv2?QU)O2I>#y8Vpt1^~O
z_MMt;Y#uC_>9BtpODFqIO*eL5FPiDFsToTr`%XAbKh-w*`u
z?UQHajb^!}?nD<85dJe2G;Xa_-?$h5{;l7w?7#Zv8*lCZ=G8Yv#|j$t&EXw6<+>H?
zoBTI3->`~LA=IY)c*WYzh)LxVAG_}`d+otw)+0Ib=xjny{#cQMDWasMbXKnQ^fzA)<
zK0c0jlie6-EP8_r{p0~s9

Re: [Qemu-devel] [PATCH v2 09/10] tests: acpi: add CPU hotplug testcase

2016-06-23 Thread Igor Mammedov

On Fri, 24 Jun 2016 08:53:25 +0300
"Michael S. Tsirkin"  wrote:

> On Thu, Jun 23, 2016 at 03:47:36PM +0200, Igor Mammedov wrote:
> > On Thu, 23 Jun 2016 16:08:38 +0300
> > Marcel Apfelbaum  wrote:
> > 
> > > On 06/16/2016 07:55 PM, Igor Mammedov wrote:
> > > > Test with:
> > > >
> > > >  -smp 2,cores=3,sockets=2,maxcpus=6
> > > >
> > > > to capture sparse APIC ID values that default
> > > > AMD CPU has in above configuration.
> > > >
> > > > Signed-off-by: Igor Mammedov 
> > > > ---
> > > >   tests/bios-tables-test.c | 28 
> > > >   1 file changed, 28 insertions(+)
> > > >
> > > > diff --git a/tests/bios-tables-test.c b/tests/bios-tables-test.c
> > > > index 16d11aa..a7abe91 100644
> > > > --- a/tests/bios-tables-test.c
> > > > +++ b/tests/bios-tables-test.c
> > > > @@ -788,6 +788,32 @@ static void test_acpi_q35_tcg_bridge(void)
> > > >   free_test_data(&data);
> > > >   }
> > > >
> > > > +static void test_acpi_piix4_tcg_cphp(void)
> > > > +{
> > > > +test_data data;
> > > > +
> > > > +memset(&data, 0, sizeof(data));
> > > > +data.machine = MACHINE_PC;
> > > > +data.variant = ".cphp";
> > > > +test_acpi_one("-machine accel=tcg"
> > > > +  " -smp 2,cores=3,sockets=2,maxcpus=6",
> > > > +  &data);
> > > > +free_test_data(&data);
> > > > +}
> > > > +
> > > > +static void test_acpi_q35_tcg_cphp(void)
> > > > +{
> > > > +test_data data;
> > > > +
> > > > +memset(&data, 0, sizeof(data));
> > > > +data.machine = MACHINE_Q35;
> > > > +data.variant = ".cphp";
> > > > +test_acpi_one("-machine q35,accel=tcg"
> > > > +  " -smp 2,cores=3,sockets=2,maxcpus=6",
> > > > +  &data);
> > > > +free_test_data(&data);
> > > > +}
> > > > +
> > > >   int main(int argc, char *argv[])
> > > >   {
> > > >   const char *arch = qtest_get_arch();
> > > > @@ -804,6 +830,8 @@ int main(int argc, char *argv[])
> > > >   qtest_add_func("acpi/piix4/tcg/bridge",
> > > > test_acpi_piix4_tcg_bridge); qtest_add_func("acpi/q35/tcg",
> > > > test_acpi_q35_tcg); qtest_add_func("acpi/q35/tcg/bridge",
> > > > test_acpi_q35_tcg_bridge);
> > > > +qtest_add_func("acpi/piix4/tcg/cpuhp",
> > > > test_acpi_piix4_tcg_cphp);
> > > > +qtest_add_func("acpi/q35/tcg/cpuhp",
> > > > test_acpi_q35_tcg_cphp); }
> > > >   ret = g_test_run();
> > > >   boot_sector_cleanup(disk);
> > > >
> > > 
> > > It looks good, but did you miss the .cphp variant expected files
> > > on purpose?
> > yes, it was in separate commit and I've dropped it before publishing
> > tree, per Michael's suggestion not to post ACPI tables blobs since
> > he updates them himself.
> > I can regenerate blob and post it any time as commit on top of this
> > if needed.
> 
> you need to patch the script that updates the blob.
> I can run it myself but you should mention it in commit log.
I guess I've misunderstood.
I'll post an extra patch here with the blob update.

> 
> > > 
> > > 
> > > Reviewed-by: Marcel Apfelbaum 
> > > Thanks,
> > > Marcel
> > Thanks!
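Why "-smp 2,cores=3,sockets=2,maxcpus=6" gives sparse APIC IDs can be shown with a small calculation. The sketch assumes the common x86 scheme in which each topology level occupies ceil(log2(count)) bits of the ID, and one thread per core since no threads= value is given; `bits_for` and `apic_id` are hypothetical helper names, not QEMU functions.

```c
#include <stdint.h>

/* Number of bits needed to encode values 0..count-1 (0 when count <= 1). */
static unsigned bits_for(unsigned count)
{
    unsigned bits = 0;

    while ((1u << bits) < count) {
        bits++;
    }
    return bits;
}

/* Pack socket/core/thread indices into an APIC-ID-like value. */
static uint32_t apic_id(unsigned cores, unsigned threads,
                        unsigned socket, unsigned core, unsigned thread)
{
    unsigned thread_bits = bits_for(threads);
    unsigned core_bits = bits_for(cores);

    return (socket << (core_bits + thread_bits)) |
           (core << thread_bits) | thread;
}
```

With 3 cores per socket the core field needs 2 bits, so socket 0 gets IDs 0, 1, 2 and socket 1 gets 4, 5, 6 under these assumptions; ID 3 is never used, which is the gap the test case is meant to capture.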

[Qemu-devel] [PULL 27/34] acpi-test-data: update expected

2016-06-23 Thread Michael S. Tsirkin

switched to new cpu hotplug interface, aml changed.

Signed-off-by: Michael S. Tsirkin 
---
 tests/acpi-test-data/pc/DSDT | Bin 5503 -> 6008 bytes
 tests/acpi-test-data/pc/DSDT.bridge  | Bin 7362 -> 7867 bytes
 tests/acpi-test-data/pc/DSDT.ipmikcs | Bin 5575 -> 6080 bytes
 tests/acpi-test-data/q35/DSDT| Bin 8265 -> 8770 bytes
 tests/acpi-test-data/q35/DSDT.bridge | Bin 8282 -> 8787 bytes
 tests/acpi-test-data/q35/DSDT.ipmibt | Bin 8340 -> 8845 bytes
 6 files changed, 0 insertions(+), 0 deletions(-)

diff --git a/tests/acpi-test-data/pc/DSDT b/tests/acpi-test-data/pc/DSDT
index 
8b4f1a09b87f8361fb572022f69d304ddeeace99..8053d711058c0f9541d6d97690819f9de697751c
 100644
GIT binary patch
delta 907
zcmb7D&2G~`5Z-NEx>>tLsVpJEfk-_dA(Sp$dNyM>F^y9z>ms3zWFM2B_DE0ekHib)
z1ya;|gm@7iqz}L?%otSBO)m9mJv;h-X1-m${oRwXj*G7^7~}UpJ7S@oG++%K1@K!
zW*r+~K)FJBvk0*UGEty|6bdSkhh&#}%t!yKtSWNcSn4?tOg+cdf)4&9?wWVOC`h&Q
zk%UNsc2+gkwt^LEk0H2I&OWNc3b%uMvm`~9B?K45iL
M-6|{k(W(pc2VMhaIRF3v

diff --git a/tests/acpi-test-data/pc/DSDT.bridge 
b/tests/acpi-test-data/pc/DSDT.bridge
index 
0d09b5cc61114b68fee0f14729732786854b19fe..850e71a973e52cc5e546fdd2757f0e089fed7192
 100644
GIT binary patch
delta 907
zcmb7D&2G~`5Z-NEx>>tLp*~RwBK5!pwF?reIB+mylbFWA%DPBsBiYBKr@i*%{z$w)
zULZj4QPpE_eFEO7%($qcn_Tc|Jv;h-X1=|BfAc}cIxcQrFvj0q$&-yvdQj?*r8x&V
z-fz)yb|m&{!yz9WGEu@vcQ&Q$akgL!9-J_9nvZnBeYTK+IoqY5h;<=Ph8)tN<}k{>
za!5Wa&OCrD7Ut|3HMKw|gD!T)QPB;9G99MEGAf$$!RzrJQA2*DMcGf|dNYDNRqBl*
z327WezvX(^f#TD*wYi*4*mqD$O~d{ZK>D`X9q)StkVC5SQKCuw-JPAdWn{Cgnm?bp
z7bM@xr^)vEk6-V)wxu`hg
zdCyz2)GPxB=CNve$
z*}wCroh`{l&vv@PnU!m9m0FM8A`W4tJadtF7_G5l&4)R~e2Gg+8747{ijFS@6Zm@Y
sKJ-h4e>+5WZt2h?Blr~m)}

delta 383
zcmYLFK~KUk82!3|ECqu|V~FuU_yJ@wdhlQ(mSHjjHg*>?&AKfl^VW+8;DCxz&Q9+`
zcPGZ9H%|U6yJ=z%eSPis@?PK9;pzZ)dDmI(0HB8tJSt7ChR2URvQJsRqsiHvva>-1
zLe|RI00Z*n%j>z1HIR`49i&iDfYc}3lyL|BtCFHfadn}mKrm&Nt+_4yP3$#x&M8Q>
zdqW9<1o74@zKSR1#Rq?C0e*rf0+(PMpkPr^*Tb<;ZfbrKAPAGzbL@vQ%o7Soa2%q6
z)4LF8VdRMQ8{yuc7TWPksU2Tc>(eDHQF8?Dt#Woy8J4)|-c@pvWjWe#XM2>H=I<}b
zy4UYGkY-H)kc!8hPR6(c>dCTAq=;ohL~5CzrV>Q(nCY*LthG%YA+mMZ_D-1PS>`Ta
PbynRfEBe_ezt?{Ns(5C)

diff --git a/tests/acpi-test-data/pc/DSDT.ipmikcs 
b/tests/acpi-test-data/pc/DSDT.ipmikcs
index 
f10cd9e296c942b66d6f2404f1a95fe5cc4a6796..8ac48afb6a672a6d70e13539a17614b465d648d6
 100644
GIT binary patch
delta 907
zcmb7DKX21O6u)a+xSZXhRF;rniPV7s)Er%y&9g~NTR+@de7n$~P+S8C12GEOQ*oGsas2Pe#&=3`xOpDkrz&UWc1VqM6S5r=f7IgGQ<
z9FmWUGY=q(g*n@yrWWXA*u~B`D!O4%rlV9?Mn!Wzd^6c2YRErnQ4Z3$-ijbkm3pg4
zLK+9rZ@C$Ep!o1*VRoywPS==-y(lraTITvi;6>D
z^t}~J%`$LcZadDO+ZAcpZFzNl;4KCSRXgKscDf!*wpPKg9l_RH&DwCNi{Rd>ah@6N6hDx1OfGAbl(8FaT?Jh&aE4s>UW_fC2*
z;vbl&`6K)>rd2S9yu9RlyqC8=+8pAZ+w(TN0BHS=KWlS)pSkCZoLY3yRm5Ux(W_w$
zTr_m7fd=IY#AF#@1w<@C1;Hg$APdNWMO_d5tI{gZaDAm_K(Iw^NAcVEi#S{K+OfctaOs^S|=IRn{&t^N3A>X@1?bebjWebe=tf?ggM*s7bg}q&EId5
zZ8qq7kQG$`5QDj1w_vUZ^2v5gB#G@nOd5ruA|u4`ROlyKQCm|*i0s{T*afwjZ650&
N&$3);Wqoj+8$Y*jXKVlf

diff --git a/tests/acpi-test-data/q35/DSDT b/tests/acpi-test-data/q35/DSDT
index 
67445428d935bd6ea5957526089ba7e719a1783a..58fbb3d2e2dc8e8256984744bfb9411feb2e35fe
 100644
GIT binary patch
delta 907
zcmb7D%}(1u5Z)zDW!8o&l`p_0f&-Tdw!Ltwgc+N}5T{nwAXS^nJ|>*@+LH;Dc!9h?
zsCa>jH{iyNW8Ma43_@j-OMP0;j=rCnZyyftBN3@6KiX!DfBP#Y8?EG^P)CKn0x{fg
z&~f^6bFb1Ivfe3k3mB-@nh+g6_vb$Me=&WT4OGs(xi5S@*`c41wIPf99Fo4`Fi6ii
z#3zz72Oy1HJ^4gU4N!5fjh#`Lw*$XO`iW~874=o`^Kg%-Av;rU(M_UiErcwQ@{1%1
zN#uu*q8YRxKmM^Y6Vnm91}Wt@=zajCU)4LqeTM)tNMtsM6^Va(G9$Z;Y=)Jy&8dAs
z@{Rad>fQRLdJlD525FN#{T~yzSrFls;5~lOT?Yky_svIT8R6xh(3_SV#T#a$;*e(@
zXTegv3>=uJiGUvXje!=iN4n(ubEwwup)3G=l8

delta 398
zcmYLFK}y3w6#bL58m1{)8N>>O&;yhpx^JDP#Yoek$wne%l2Ax@vvFa&sI{P7xU#r&
zkxDKg-oTaKpvQ3%1!wW!yqS6bz4@Qr?>6qbUGKsGKu_;@QJ-5!fqNd1Gs*@XMJ!j8
z4V?tIs8z5A8WbxOlU0Zn5Qzj81eZ{PEF{O2xgPpgrDdMsy^Wdy!IW9H;y3XZv3JG!W7Rq-i2v|qKz(($5Srdo68^k&r`*
z*D-Ru-W5k1tw5~aaR2_Y)LOsRTXQXNy$-Hu0Uf!uD#m@Z)wen_Zo$8kK|EC1>e%oX
zdO`AN?R|fFN||Z?ewtK+LC1qE9}ED|^L{yVJ&;$HZ6Zl58xm5>?G+g)frnf_(I{&S
dodXpia(LYe`pgO}^91Bpj#^sL52Mj6{{Z}6X!QU9

diff --git a/tests/acpi-test-data/q35/DSDT.bridge 
b/tests/acpi-test-data/q35/DSDT.bridge
index 
e85f5b1af9fcd36c9522b05e0085af84d6c010cb..c392802a95cb5690c6719b7909c9f8fa2213e503
 100644
GIT binary patch
delta 907
zcmb7D%}(1u5Z)yYGHU}$q5OL^gW2t
zX@idQ+k=xzbIAHv%q?K3TDwAY`O;td+`nV`V>VPd_m;l!_3V&-Le_yS9&ktoio-Dd
zz#)DoIdcHg$knqK)YJeK_dD1bg?T6Ni)4_vhEY-9_m4*>L=D+{8hhI68F*AcI6^!&s5{)yjA_JZFu?l%31Ufs1

delta 398
zcmYLF&r8EF82!?2TCxtE1aZQk>@1`BMs1w;b)Td(}WpQM-P{HZyFDV}n?4U-5(Yh50XCtSM8_tOYLoHn0VVJzYzA%_&N
zV&wR}OODoBfmpre!NWzVwR)|$W*YPS4z6g7j=WkG<38Hx8yy*U5ZuckKB{bVYy@+?
zAbGU*Zm>9|u4(>$npD}K<3pAY27u^!znpnK$Scb>ktCK43904wiVT#%W3Io^sM{Di
cdn!U?|GLHcuEi|#801!tT3XQ$qj6IH0lN=q*#H0l

diff --git a/tests/acpi-test-data/q35/DSDT.ipmibt 
b/tests/acpi-test-data/q35/DSDT.ipmibt
index 
3db2b0b5f96e567d42d3abc672c95e9724243aaf..0ea38e1e72977e82053b087a6bd2e4ea21373420
 100644
GIT binary patch
delta 907
zcmb7DKTq306u%=*WzL2w$V*{~VCaJ2+J)JDb`wJ!telHfT_jtb5~dRylM6^}OyL6r
z#L%_s*XY2QZv*ccgy50|Pv>_}zd!H&?mE1RMWmwqYLhYk>#z9#sg>*%>RX}DK@4{r
zbe#NI->

[Qemu-devel] [PULL 31/34] virtio-ccw: convert to ioeventfd callbacks

2016-06-23 Thread Michael S. Tsirkin

From: Cornelia Huck 

Use the new interface.

Signed-off-by: Cornelia Huck 
Reviewed-by: Fam Zheng 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/s390x/virtio-ccw.c | 133 +-
 1 file changed, 45 insertions(+), 88 deletions(-)

diff --git a/hw/s390x/virtio-ccw.c b/hw/s390x/virtio-ccw.c
index 1625e6b..8b709e3 100644
--- a/hw/s390x/virtio-ccw.c
+++ b/hw/s390x/virtio-ccw.c
@@ -69,92 +69,58 @@ VirtIODevice *virtio_ccw_get_vdev(SubchDev *sch)
 return vdev;
 }
 
-static int virtio_ccw_set_guest2host_notifier(VirtioCcwDevice *dev, int n,
-  bool assign, bool set_handler)
+static void virtio_ccw_start_ioeventfd(VirtioCcwDevice *dev)
 {
-VirtIODevice *vdev = virtio_bus_get_device(&dev->bus);
-VirtQueue *vq = virtio_get_queue(vdev, n);
-EventNotifier *notifier = virtio_queue_get_host_notifier(vq);
-int r = 0;
-SubchDev *sch = dev->sch;
-uint32_t sch_id = (css_build_subchannel_id(sch) << 16) | sch->schid;
+virtio_bus_start_ioeventfd(&dev->bus);
+}
 
-if (assign) {
-r = event_notifier_init(notifier, 1);
-if (r < 0) {
-error_report("%s: unable to init event notifier: %d", __func__, r);
-return r;
-}
-virtio_queue_set_host_notifier_fd_handler(vq, true, set_handler);
-r = s390_assign_subch_ioeventfd(notifier, sch_id, n, assign);
-if (r < 0) {
-error_report("%s: unable to assign ioeventfd: %d", __func__, r);
-virtio_queue_set_host_notifier_fd_handler(vq, false, false);
-event_notifier_cleanup(notifier);
-return r;
-}
-} else {
-virtio_queue_set_host_notifier_fd_handler(vq, false, false);
-s390_assign_subch_ioeventfd(notifier, sch_id, n, assign);
-event_notifier_cleanup(notifier);
-}
-return r;
+static void virtio_ccw_stop_ioeventfd(VirtioCcwDevice *dev)
+{
+virtio_bus_stop_ioeventfd(&dev->bus);
 }
 
-static void virtio_ccw_start_ioeventfd(VirtioCcwDevice *dev)
+static bool virtio_ccw_ioeventfd_started(DeviceState *d)
 {
-VirtIODevice *vdev;
-int n, r;
+VirtioCcwDevice *dev = VIRTIO_CCW_DEVICE(d);
 
-if (!(dev->flags & VIRTIO_CCW_FLAG_USE_IOEVENTFD) ||
-dev->ioeventfd_disabled ||
-dev->ioeventfd_started) {
-return;
-}
-vdev = virtio_bus_get_device(&dev->bus);
-for (n = 0; n < VIRTIO_CCW_QUEUE_MAX; n++) {
-if (!virtio_queue_get_num(vdev, n)) {
-continue;
-}
-r = virtio_ccw_set_guest2host_notifier(dev, n, true, true);
-if (r < 0) {
-goto assign_error;
-}
-}
-dev->ioeventfd_started = true;
-return;
+return dev->ioeventfd_started;
+}
 
-  assign_error:
-while (--n >= 0) {
-if (!virtio_queue_get_num(vdev, n)) {
-continue;
-}
-r = virtio_ccw_set_guest2host_notifier(dev, n, false, false);
-assert(r >= 0);
+static void virtio_ccw_ioeventfd_set_started(DeviceState *d, bool started,
+ bool err)
+{
+VirtioCcwDevice *dev = VIRTIO_CCW_DEVICE(d);
+
+dev->ioeventfd_started = started;
+if (err) {
+/* Disable ioeventfd for this device. */
+dev->flags &= ~VIRTIO_CCW_FLAG_USE_IOEVENTFD;
 }
-dev->ioeventfd_started = false;
-/* Disable ioeventfd for this device. */
-dev->flags &= ~VIRTIO_CCW_FLAG_USE_IOEVENTFD;
-error_report("%s: failed. Fallback to userspace (slower).", __func__);
 }
 
-static void virtio_ccw_stop_ioeventfd(VirtioCcwDevice *dev)
+static bool virtio_ccw_ioeventfd_disabled(DeviceState *d)
 {
-VirtIODevice *vdev;
-int n, r;
+VirtioCcwDevice *dev = VIRTIO_CCW_DEVICE(d);
 
-if (!dev->ioeventfd_started) {
-return;
-}
-vdev = virtio_bus_get_device(&dev->bus);
-for (n = 0; n < VIRTIO_CCW_QUEUE_MAX; n++) {
-if (!virtio_queue_get_num(vdev, n)) {
-continue;
-}
-r = virtio_ccw_set_guest2host_notifier(dev, n, false, false);
-assert(r >= 0);
-}
-dev->ioeventfd_started = false;
+return dev->ioeventfd_disabled ||
+!(dev->flags & VIRTIO_CCW_FLAG_USE_IOEVENTFD);
+}
+
+static void virtio_ccw_ioeventfd_set_disabled(DeviceState *d, bool disabled)
+{
+VirtioCcwDevice *dev = VIRTIO_CCW_DEVICE(d);
+
+dev->ioeventfd_disabled = disabled;
+}
+
+static int virtio_ccw_ioeventfd_assign(DeviceState *d, EventNotifier *notifier,
+   int n, bool assign)
+{
+VirtioCcwDevice *dev = VIRTIO_CCW_DEVICE(d);
+SubchDev *sch = dev->sch;
+uint32_t sch_id = (css_build_subchannel_id(sch) << 16) | sch->schid;
+
+return s390_assign_subch_ioeventfd(notifier, sch_id, n, assign);
 }
 
 VirtualCssBus *virtual_css_bus_init(void)
@@ -1157,19 +1123,6 @@ static bool virtio_ccw_query_guest_notifiers(DeviceState

Re: [Qemu-devel] [PATCH v3 16/22] block: Wording tweaks to write zeroes limits

2016-06-23 Thread Fam Zheng

On Thu, 06/23 16:37, Eric Blake wrote:
> Improve the documentation of the write zeroes limits, to mention
> additional constraints that drivers should observe.  Worth squashing
> into commit cf081fca, if that hadn't been pushed already :)
> 
> Signed-off-by: Eric Blake 
> 
> ---
> v3: new patch, split off from "block: Switch discard length bounds..."
> ---
>  include/block/block_int.h | 7 +--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/include/block/block_int.h b/include/block/block_int.h
> index 7d2b152..7a4a00f 100644
> --- a/include/block/block_int.h
> +++ b/include/block/block_int.h
> @@ -331,11 +331,14 @@ typedef struct BlockLimits {
>  int64_t discard_alignment;
> 
>  /* maximum number of bytes that can zeroized at once (since it is
> - * signed, it must be < 2G, if set) */
> + * signed, it must be < 2G, if set), should be multiple of
> + * pwrite_zeroes_alignment. May be 0 if no inherent 32-bit limit */

"inherent 32-bit limit"? What is special about 32-bit other than this field is
32-bit? Anyway,

Reviewed-by: Fam Zheng 

>  int32_t max_pwrite_zeroes;
> 
>  /* optimal alignment for write zeroes requests in bytes, must be
> - * power of 2, and less than max_pwrite_zeroes if that is set */
> + * power of 2, less than max_pwrite_zeroes if that is set, and
> + * multiple of bs->request_alignment. May be 0 if
> + * bs->request_alignment is good enough */
>  uint32_t pwrite_zeroes_alignment;
> 
>  /* optimal transfer length in bytes (must be power of 2, and
> -- 
> 2.5.5
>
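
The constraints spelled out in the updated comments can be sketched as a small checker. This is only an illustration under stated assumptions: check_pwrite_zeroes_limits() and is_pow2() are hypothetical helpers, not QEMU functions, and a value of 0 is treated as "unset", as the comments describe.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not part of QEMU): true for 1, 2, 4, 8, ... */
static bool is_pow2(uint32_t v)
{
    return v != 0 && (v & (v - 1)) == 0;
}

/*
 * Hypothetical checker for the documented relationships; 0 means "unset".
 *  - max_pwrite_zeroes is signed, so it must not be negative
 *  - pwrite_zeroes_alignment must be a power of 2 and a multiple of
 *    request_alignment
 *  - if both are set, the alignment must be less than the maximum and
 *    the maximum must be a multiple of the alignment
 */
static bool check_pwrite_zeroes_limits(int32_t max_pwrite_zeroes,
                                       uint32_t pwrite_zeroes_alignment,
                                       uint32_t request_alignment)
{
    if (max_pwrite_zeroes < 0) {
        return false;
    }
    if (pwrite_zeroes_alignment == 0) {
        return true; /* request_alignment is good enough */
    }
    if (!is_pow2(pwrite_zeroes_alignment)) {
        return false;
    }
    if (request_alignment && pwrite_zeroes_alignment % request_alignment) {
        return false;
    }
    if (max_pwrite_zeroes) {
        if (pwrite_zeroes_alignment >= (uint32_t)max_pwrite_zeroes) {
            return false;
        }
        if ((uint32_t)max_pwrite_zeroes % pwrite_zeroes_alignment) {
            return false;
        }
    }
    return true;
}
```

Under these assumptions a driver with a 1 MiB maximum, 4 KiB zeroing alignment and 512-byte request alignment passes, while one whose alignment equals its maximum does not.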

[Qemu-devel] [PULL 34/34] virtio-bus: remove old set_host_notifier callback

2016-06-23 Thread Michael S. Tsirkin

From: Cornelia Huck 

All users have been converted to the new ioevent callbacks.

Signed-off-by: Cornelia Huck 
Reviewed-by: Fam Zheng 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 include/hw/virtio/virtio-bus.h  |  1 -
 hw/block/dataplane/virtio-blk.c | 12 ++--
 hw/scsi/virtio-scsi-dataplane.c | 19 ---
 hw/virtio/vhost.c   | 13 +
 4 files changed, 7 insertions(+), 38 deletions(-)

diff --git a/include/hw/virtio/virtio-bus.h b/include/hw/virtio/virtio-bus.h
index 9637f80..f3e5ef3 100644
--- a/include/hw/virtio/virtio-bus.h
+++ b/include/hw/virtio/virtio-bus.h
@@ -52,7 +52,6 @@ typedef struct VirtioBusClass {
 bool (*has_extra_state)(DeviceState *d);
 bool (*query_guest_notifiers)(DeviceState *d);
 int (*set_guest_notifiers)(DeviceState *d, int nvqs, bool assign);
-int (*set_host_notifier)(DeviceState *d, int n, bool assigned);
 void (*vmstate_change)(DeviceState *d, bool running);
 /*
  * transport independent init function.
diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index fdf5fd1..2041b04 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -79,8 +79,7 @@ void virtio_blk_data_plane_create(VirtIODevice *vdev, VirtIOBlkConf *conf,
 }
 
 /* Don't try if transport does not support notifiers. */
-if (!k->set_guest_notifiers ||
-(!k->set_host_notifier && !k->ioeventfd_started)) {
+if (!k->set_guest_notifiers || !k->ioeventfd_started) {
 error_setg(errp,
"device is incompatible with dataplane "
"(transport does not support notifiers)");
@@ -159,9 +158,6 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s)
 
 /* Set up virtqueue notify */
 r = virtio_bus_set_host_notifier(VIRTIO_BUS(qbus), 0, true);
-if (r == -ENOSYS) {
-r = k->set_host_notifier(qbus->parent, 0, true);
-}
 if (r != 0) {
 fprintf(stderr, "virtio-blk failed to set host notifier (%d)\n", r);
 goto fail_host_notifier;
@@ -197,7 +193,6 @@ void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s)
 BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(s->vdev)));
 VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
 VirtIOBlock *vblk = VIRTIO_BLK(s->vdev);
-int r;
 
 if (!vblk->dataplane_started || s->stopping) {
 return;
@@ -222,10 +217,7 @@ void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s)
 
 aio_context_release(s->ctx);
 
-r = virtio_bus_set_host_notifier(VIRTIO_BUS(qbus), 0, false);
-if (r == -ENOSYS) {
-k->set_host_notifier(qbus->parent, 0, false);
-}
+virtio_bus_set_host_notifier(VIRTIO_BUS(qbus), 0, false);
 
 /* Clean up guest notifier (irq) */
 k->set_guest_notifiers(qbus->parent, 1, false);
diff --git a/hw/scsi/virtio-scsi-dataplane.c b/hw/scsi/virtio-scsi-dataplane.c
index b9a5716..18ced31 100644
--- a/hw/scsi/virtio-scsi-dataplane.c
+++ b/hw/scsi/virtio-scsi-dataplane.c
@@ -31,8 +31,7 @@ void virtio_scsi_set_iothread(VirtIOSCSI *s, IOThread *iothread)
 s->ctx = iothread_get_aio_context(vs->conf.iothread);
 
 /* Don't try if transport does not support notifiers. */
-if (!k->set_guest_notifiers ||
-(!k->set_host_notifier && !k->ioeventfd_started)) {
+if (!k->set_guest_notifiers || !k->ioeventfd_started) {
 fprintf(stderr, "virtio-scsi: Failed to set iothread "
"(transport does not support notifiers)");
 exit(1);
@@ -70,14 +69,10 @@ static int virtio_scsi_vring_init(VirtIOSCSI *s, VirtQueue *vq, int n,
   void (*fn)(VirtIODevice *vdev, VirtQueue *vq))
 {
 BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(s)));
-VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
 int rc;
 
 /* Set up virtqueue notify */
 rc = virtio_bus_set_host_notifier(VIRTIO_BUS(qbus), n, true);
-if (rc == -ENOSYS) {
-rc = k->set_host_notifier(qbus->parent, n, true);
-}
 if (rc != 0) {
 fprintf(stderr, "virtio-scsi: Failed to set host notifier (%d)\n",
 rc);
@@ -163,10 +158,7 @@ fail_vrings:
 virtio_scsi_clear_aio(s);
 aio_context_release(s->ctx);
 for (i = 0; i < vs->conf.num_queues + 2; i++) {
-rc = virtio_bus_set_host_notifier(VIRTIO_BUS(qbus), i, false);
-if (rc == -ENOSYS) {
-k->set_host_notifier(qbus->parent, i, false);
-}
+virtio_bus_set_host_notifier(VIRTIO_BUS(qbus), i, false);
 }
 k->set_guest_notifiers(qbus->parent, vs->conf.num_queues + 2, false);
 fail_guest_notifiers:
@@ -181,7 +173,7 @@ void virtio_scsi_dataplane_stop(VirtIOSCSI *s)
 BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(s)));
 VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
 VirtIOSCSICommon *vs = VIRTIO_SCSI_COMMON(s);
-int i, rc;
+int i;
 
 if (!s->dataplane_started || s->datapla

Re: [Qemu-devel] Default for phys-addr-bits? (was Re: [PATCH 4/5] x86: Allow physical address bits to be set)

2016-06-23 Thread Gerd Hoffmann

On Do, 2016-06-23 at 19:38 +0300, Michael S. Tsirkin wrote:
> On Thu, Jun 23, 2016 at 10:40:03AM +0200, Gerd Hoffmann wrote:
> >   Hi,
> > 
> > > > Well the crash of guest phys bits > host phys bits, should be easy to
> > > > reproduce by booting a 65GB guest on a 64GB RAM + 2GB swap host with
> > > > 36 host phys bits using the upstream qemu that forces the guest phys
> > > > bits to 40.
> > > 
> > > So you supply more RAM than host can address, and guest crashes?
> > 
> > Yep.  The only reason we don't see this happening in practice is that
> > it's probably next to impossible to find a machine which has (a) only 36
> > physical address lines and (b) allows plugging in that much RAM.
> > 
> > > Why are we worried about it?
> > 
> > It's more an issue with pci resources.  In theory seabios/edk2 could go
> > figure how big the physical address space is, then map 64bit pci bars as
> > high as possible, thereby making stuff like etc/reserved-memory-end in
> > fw_cfg unnecessary.
> > 
> > But with qemu saying 40 phys bits are available even if they are not
> > this approach isn't going to fly ...
> > 
> > cheers,
> >   Gerd
> 
> Nah, x86 guests really need to go by _CRS.

Yep, we can implement the "soft-phys-bits" that way.

> bios doesn't want to parse that
> so it can go by some fw cfg file instead.

firmware can't use it anyway because the firmware first maps the bars,
then loads the acpi tables (while qemu generates _CRS entries according to
the bios mappings).

> Going by phys bits won't work on old qemu so I don't believe it's
> practical.

Indeed, so I guess we'll have to stick to the current approach of
mapping 64bit bars above ram (or etc/reserved-memory-end if present).

cheers,
  Gerd
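
The arithmetic behind the 65GB example can be sketched as follows. This is a simplified illustration, not QEMU code: phys_space_bytes() and ram_fits() are hypothetical names, and the check ignores PCI holes, firmware reservations and anything else that also consumes address space.

```c
#include <stdbool.h>
#include <stdint.h>

#define GiB (UINT64_C(1) << 30)

/*
 * Size of the address space reachable with the given number of
 * physical address bits. Assumes 0 < bits < 64.
 */
static uint64_t phys_space_bytes(unsigned bits)
{
    return UINT64_C(1) << bits;
}

/* Hypothetical check: can this much RAM be addressed at all? */
static bool ram_fits(uint64_t ram_bytes, unsigned phys_bits)
{
    return ram_bytes <= phys_space_bytes(phys_bits);
}
```

With 36 physical address bits the reachable space is 64 GiB, so a 65 GiB guest does not fit on such a host even though a guest that is told it has 40 bits would consider it fine.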

[Qemu-devel] [PULL 32/34] virtio-pci: convert to ioeventfd callbacks

2016-06-23 Thread Michael S. Tsirkin

From: Cornelia Huck 

Convert to new interface.

Signed-off-by: Cornelia Huck 
Reviewed-by: Fam Zheng 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/virtio/virtio-pci.c | 124 -
 1 file changed, 41 insertions(+), 83 deletions(-)

diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index 1a02783..2b34b43 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -262,14 +262,44 @@ static int virtio_pci_load_queue(DeviceState *d, int n, QEMUFile *f)
 return 0;
 }
 
+static bool virtio_pci_ioeventfd_started(DeviceState *d)
+{
+VirtIOPCIProxy *proxy = to_virtio_pci_proxy(d);
+
+return proxy->ioeventfd_started;
+}
+
+static void virtio_pci_ioeventfd_set_started(DeviceState *d, bool started,
+ bool err)
+{
+VirtIOPCIProxy *proxy = to_virtio_pci_proxy(d);
+
+proxy->ioeventfd_started = started;
+}
+
+static bool virtio_pci_ioeventfd_disabled(DeviceState *d)
+{
+VirtIOPCIProxy *proxy = to_virtio_pci_proxy(d);
+
+return proxy->ioeventfd_disabled ||
+!(proxy->flags & VIRTIO_PCI_FLAG_USE_IOEVENTFD);
+}
+
+static void virtio_pci_ioeventfd_set_disabled(DeviceState *d, bool disabled)
+{
+VirtIOPCIProxy *proxy = to_virtio_pci_proxy(d);
+
+proxy->ioeventfd_disabled = disabled;
+}
+
 #define QEMU_VIRTIO_PCI_QUEUE_MEM_MULT 0x1000
 
-static int virtio_pci_set_host_notifier_internal(VirtIOPCIProxy *proxy,
- int n, bool assign, bool set_handler)
+static int virtio_pci_ioeventfd_assign(DeviceState *d, EventNotifier *notifier,
+   int n, bool assign)
 {
+VirtIOPCIProxy *proxy = to_virtio_pci_proxy(d);
 VirtIODevice *vdev = virtio_bus_get_device(&proxy->bus);
 VirtQueue *vq = virtio_get_queue(vdev, n);
-EventNotifier *notifier = virtio_queue_get_host_notifier(vq);
 bool legacy = !(proxy->flags & VIRTIO_PCI_FLAG_DISABLE_LEGACY);
 bool modern = !(proxy->flags & VIRTIO_PCI_FLAG_DISABLE_MODERN);
 bool fast_mmio = kvm_ioeventfd_any_length_enabled();
@@ -280,16 +310,8 @@ static int virtio_pci_set_host_notifier_internal(VirtIOPCIProxy *proxy,
 hwaddr modern_addr = QEMU_VIRTIO_PCI_QUEUE_MEM_MULT *
  virtio_get_queue_index(vq);
 hwaddr legacy_addr = VIRTIO_PCI_QUEUE_NOTIFY;
-int r = 0;
 
 if (assign) {
-r = event_notifier_init(notifier, 1);
-if (r < 0) {
-error_report("%s: unable to init event notifier: %d",
- __func__, r);
-return r;
-}
-virtio_queue_set_host_notifier_fd_handler(vq, true, set_handler);
 if (modern) {
 if (fast_mmio) {
 memory_region_add_eventfd(modern_mr, modern_addr, 0,
@@ -325,68 +347,18 @@ static int virtio_pci_set_host_notifier_internal(VirtIOPCIProxy *proxy,
 memory_region_del_eventfd(legacy_mr, legacy_addr, 2,
   true, n, notifier);
 }
-virtio_queue_set_host_notifier_fd_handler(vq, false, false);
-event_notifier_cleanup(notifier);
 }
-return r;
+return 0;
 }
 
 static void virtio_pci_start_ioeventfd(VirtIOPCIProxy *proxy)
 {
-VirtIODevice *vdev = virtio_bus_get_device(&proxy->bus);
-int n, r;
-
-if (!(proxy->flags & VIRTIO_PCI_FLAG_USE_IOEVENTFD) ||
-proxy->ioeventfd_disabled ||
-proxy->ioeventfd_started) {
-return;
-}
-
-for (n = 0; n < VIRTIO_QUEUE_MAX; n++) {
-if (!virtio_queue_get_num(vdev, n)) {
-continue;
-}
-
-r = virtio_pci_set_host_notifier_internal(proxy, n, true, true);
-if (r < 0) {
-goto assign_error;
-}
-}
-proxy->ioeventfd_started = true;
-return;
-
-assign_error:
-while (--n >= 0) {
-if (!virtio_queue_get_num(vdev, n)) {
-continue;
-}
-
-r = virtio_pci_set_host_notifier_internal(proxy, n, false, false);
-assert(r >= 0);
-}
-proxy->ioeventfd_started = false;
-error_report("%s: failed. Fallback to a userspace (slower).", __func__);
+virtio_bus_start_ioeventfd(&proxy->bus);
 }
 
 static void virtio_pci_stop_ioeventfd(VirtIOPCIProxy *proxy)
 {
-VirtIODevice *vdev = virtio_bus_get_device(&proxy->bus);
-int r;
-int n;
-
-if (!proxy->ioeventfd_started) {
-return;
-}
-
-for (n = 0; n < VIRTIO_QUEUE_MAX; n++) {
-if (!virtio_queue_get_num(vdev, n)) {
-continue;
-}
-
-r = virtio_pci_set_host_notifier_internal(proxy, n, false, false);
-assert(r >= 0);
-}
-proxy->ioeventfd_started = false;
+virtio_bus_stop_ioeventfd(&proxy->bus);
 }
 
 static void virtio_ioport_write(void *opaque, uint32_t addr, uint32_t val)
@@ -1110,24 +1082,6 @@ assign_error:
 return r;
 }
 
-static int virtio_pci_

[Qemu-devel] [PULL 28/34] pc: acpi: drop intermediate PCMachineState.node_cpu

2016-06-23 Thread Michael S. Tsirkin

From: Igor Mammedov 

PCMachineState.node_cpu was used for mapping APIC ID
to numa node id as CPU entries in SRAT used to be
built on sparse APIC ID bitmap (up to apic_id_limit).
However since commit
  5803fce pc: acpi: SRAT: create only valid processor lapic entries
CPU entries in SRAT aren't built using the APIC bitmap
but using 0..maxcpus index instead which is also used
for creating numa_info[x].node_cpu map.
So instead of doing useless intermediate conversion from
  1. node by cpu index -> node by apic id
   i.e. numa_info[x].node_cpu -> PCMachineState.node_cpu
  2. apic id -> srat entry PMX
   PCMachineState.node_cpu[apic id] -> PMX value
use numa_info[x].node_cpu map directly like ARM does and do
  1. numa_info[x].node_cpu -> PMX value using index
 in range 0..maxcpus
and drop the unnecessary PCMachineState.node_cpu and related
code.

That also removes the last (not counting legacy hotplug)
dependency of ACPI code on apic_id_limit and the need to allocate
a huge sparse PCMachineState.node_cpu array in case of 32-bit
APIC IDs.

Signed-off-by: Igor Mammedov 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 include/hw/i386/pc.h |  1 -
 hw/i386/acpi-build.c | 11 ---
 hw/i386/pc.c | 16 +---
 3 files changed, 9 insertions(+), 19 deletions(-)

diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 884224e..948ed0c 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -72,7 +72,6 @@ struct PCMachineState {
 /* NUMA information: */
 uint64_t numa_nodes;
 uint64_t *node_mem;
-uint64_t *node_cpu;
 };
 
 #define PC_MACHINE_ACPI_DEVICE_PROP "acpi-device"
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 20e5b49..5a594be 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -44,6 +44,7 @@
 #include "hw/acpi/tpm.h"
 #include "sysemu/tpm_backend.h"
 #include "hw/timer/mc146818rtc_regs.h"
+#include "sysemu/numa.h"
 
 /* Supported chipsets: */
 #include "hw/acpi/piix4.h"
@@ -2328,7 +2329,6 @@ build_srat(GArray *table_data, BIOSLinker *linker, MachineState *machine)
 AcpiSratMemoryAffinity *numamem;
 
 int i;
-uint64_t curnode;
 int srat_start, numa_start, slots;
 uint64_t mem_len, mem_base, next_base;
 MachineClass *mc = MACHINE_GET_CLASS(machine);
@@ -2344,14 +2344,19 @@ build_srat(GArray *table_data, BIOSLinker *linker, MachineState *machine)
 srat->reserved1 = cpu_to_le32(1);
 
 for (i = 0; i < apic_ids->len; i++) {
+int j;
 int apic_id = apic_ids->cpus[i].arch_id;
 
 core = acpi_data_push(table_data, sizeof *core);
 core->type = ACPI_SRAT_PROCESSOR_APIC;
 core->length = sizeof(*core);
 core->local_apic_id = apic_id;
-curnode = pcms->node_cpu[apic_id];
-core->proximity_lo = curnode;
+for (j = 0; j < nb_numa_nodes; j++) {
+if (test_bit(i, numa_info[j].node_cpu)) {
+core->proximity_lo = j;
+break;
+}
+}
 memset(core->proximity_hi, 0, 3);
 core->local_sapic_eid = 0;
 core->flags = cpu_to_le32(1);
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index dbfba5c..b8fead3 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1179,7 +1179,7 @@ void pc_machine_done(Notifier *notifier, void *data)
 
 void pc_guest_info_init(PCMachineState *pcms)
 {
-int i, j;
+int i;
 
 pcms->apic_xrupt_override = kvm_allows_irq0_override();
 pcms->numa_nodes = nb_numa_nodes;
@@ -1189,20 +1189,6 @@ void pc_guest_info_init(PCMachineState *pcms)
 pcms->node_mem[i] = numa_info[i].node_mem;
 }
 
-pcms->node_cpu = g_malloc0(pcms->apic_id_limit *
- sizeof *pcms->node_cpu);
-
-for (i = 0; i < max_cpus; i++) {
-unsigned int apic_id = x86_cpu_apic_id_from_index(i);
-assert(apic_id < pcms->apic_id_limit);
-for (j = 0; j < nb_numa_nodes; j++) {
-if (test_bit(i, numa_info[j].node_cpu)) {
-pcms->node_cpu[apic_id] = j;
-break;
-}
-}
-}
-
 pcms->machine_done.notify = pc_machine_done;
 qemu_add_machine_init_done_notifier(&pcms->machine_done);
 }
-- 
MST
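
The direct lookup this patch switches to (find the node whose CPU bitmap contains a given CPU index) can be sketched in isolation. The sketch is an assumption-laden stand-in: NodeInfo and node_for_cpu() are hypothetical, a single 64-bit word replaces the real numa_info[x].node_cpu bitmap, and "no node found" is reported as -1 here, whereas the patch simply leaves the SRAT field untouched in that case.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for numa_info[x]: one bit per CPU index, at most 64 CPUs. */
typedef struct {
    uint64_t node_cpu;
} NodeInfo;

/* Equivalent of test_bit(cpu_index, node_cpu) for this toy bitmap. */
static bool cpu_in_node(const NodeInfo *node, unsigned cpu_index)
{
    return (node->node_cpu >> cpu_index) & 1;
}

/*
 * Return the first node whose bitmap contains cpu_index, or -1 if the
 * CPU is not assigned to any node. Assumes cpu_index < 64.
 */
static int node_for_cpu(const NodeInfo *nodes, unsigned nb_nodes,
                        unsigned cpu_index)
{
    unsigned j;

    for (j = 0; j < nb_nodes; j++) {
        if (cpu_in_node(&nodes[j], cpu_index)) {
            return (int)j;
        }
    }
    return -1;
}
```

The point of the change is that the index into the bitmap is the CPU index in 0..maxcpus, so no array sized by apic_id_limit is needed.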

[Qemu-devel] [PULL 29/34] virtio-bus: common ioeventfd infrastructure

2016-06-23 Thread Michael S. Tsirkin

From: Cornelia Huck 

Introduce a set of ioeventfd callbacks on the virtio-bus level
that can be implemented by the individual transports. At the
virtio-bus level, do common handling for host notifiers (which
is actually most of it).

Two things of note:
- When setting the host notifier, we only switch from/to the
  generic ioeventfd handler. This fixes a latent bug where we
  had no ioeventfd assigned for a certain window.
- We always iterate over all possible virtio queues, even though
  ccw (currently) has a lower limit. It does not really matter
  here.

Signed-off-by: Cornelia Huck 
Reviewed-by: Fam Zheng 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 include/hw/virtio/virtio-bus.h |  30 ++
 hw/virtio/virtio-bus.c | 132 +
 2 files changed, 162 insertions(+)

diff --git a/include/hw/virtio/virtio-bus.h b/include/hw/virtio/virtio-bus.h
index 3f2c136..9637f80 100644
--- a/include/hw/virtio/virtio-bus.h
+++ b/include/hw/virtio/virtio-bus.h
@@ -71,6 +71,29 @@ typedef struct VirtioBusClass {
 void (*device_unplugged)(DeviceState *d);
 int (*query_nvectors)(DeviceState *d);
 /*
+ * ioeventfd handling: if the transport implements ioeventfd_started,
+ * it must implement the other ioeventfd callbacks as well
+ */
+/* Returns true if the ioeventfd has been started for the device. */
+bool (*ioeventfd_started)(DeviceState *d);
+/*
+ * Sets the 'ioeventfd started' state after the ioeventfd has been
+ * started/stopped for the device. err signifies whether an error
+ * had occurred.
+ */
+void (*ioeventfd_set_started)(DeviceState *d, bool started, bool err);
+/* Returns true if the ioeventfd has been disabled for the device. */
+bool (*ioeventfd_disabled)(DeviceState *d);
+/* Sets the 'ioeventfd disabled' state for the device. */
+void (*ioeventfd_set_disabled)(DeviceState *d, bool disabled);
+/*
+ * Assigns/deassigns the ioeventfd backing for the transport on
+ * the device for queue number n. Returns an error value on
+ * failure.
+ */
+int (*ioeventfd_assign)(DeviceState *d, EventNotifier *notifier,
+int n, bool assign);
+/*
  * Does the transport have variable vring alignment?
  * (ie can it ever call virtio_queue_set_align()?)
  * Note that changing this will break migration for this transport.
@@ -111,4 +134,11 @@ static inline VirtIODevice *virtio_bus_get_device(VirtioBusState *bus)
 return (VirtIODevice *)qdev;
 }
 
+/* Start the ioeventfd. */
+void virtio_bus_start_ioeventfd(VirtioBusState *bus);
+/* Stop the ioeventfd. */
+void virtio_bus_stop_ioeventfd(VirtioBusState *bus);
+/* Switch from/to the generic ioeventfd handler */
+int virtio_bus_set_host_notifier(VirtioBusState *bus, int n, bool assign);
+
 #endif /* VIRTIO_BUS_H */
diff --git a/hw/virtio/virtio-bus.c b/hw/virtio/virtio-bus.c
index 574f0e2..1313760 100644
--- a/hw/virtio/virtio-bus.c
+++ b/hw/virtio/virtio-bus.c
@@ -146,6 +146,138 @@ void virtio_bus_set_vdev_config(VirtioBusState *bus, uint8_t *config)
 }
 }
 
+/*
+ * This function handles both assigning the ioeventfd handler and
+ * registering it with the kernel.
+ * assign: register/deregister ioeventfd with the kernel
+ * set_handler: use the generic ioeventfd handler
+ */
+static int set_host_notifier_internal(DeviceState *proxy, VirtioBusState *bus,
+  int n, bool assign, bool set_handler)
+{
+VirtIODevice *vdev = virtio_bus_get_device(bus);
+VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(bus);
+VirtQueue *vq = virtio_get_queue(vdev, n);
+EventNotifier *notifier = virtio_queue_get_host_notifier(vq);
+int r = 0;
+
+if (assign) {
+r = event_notifier_init(notifier, 1);
+if (r < 0) {
+error_report("%s: unable to init event notifier: %d", __func__, r);
+return r;
+}
+virtio_queue_set_host_notifier_fd_handler(vq, true, set_handler);
+r = k->ioeventfd_assign(proxy, notifier, n, assign);
+if (r < 0) {
+error_report("%s: unable to assign ioeventfd: %d", __func__, r);
+virtio_queue_set_host_notifier_fd_handler(vq, false, false);
+event_notifier_cleanup(notifier);
+return r;
+}
+} else {
+virtio_queue_set_host_notifier_fd_handler(vq, false, false);
+k->ioeventfd_assign(proxy, notifier, n, assign);
+event_notifier_cleanup(notifier);
+}
+return r;
+}
+
+void virtio_bus_start_ioeventfd(VirtioBusState *bus)
+{
+VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(bus);
+DeviceState *proxy = DEVICE(BUS(bus)->parent);
+VirtIODevice *vdev;
+int n, r;
+
+if (!k->ioeventfd_started || k->ioeventfd_started(proxy)) {
+return;
+}
+if (k->ioeventfd_disabled(proxy)) {
+return;
+}
+vdev = virtio_bus_g
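
The per-transport start loops that this series consolidates follow one pattern: walk all queues, assign a notifier to each queue that is in use, and undo the earlier assignments if one fails. A minimal standalone sketch of that pattern is given here; ToyBackend, toy_assign() and start_all() are hypothetical names for illustration and do not correspond to QEMU functions.

```c
#include <stdbool.h>

#define QUEUE_MAX 8

/* Toy backend standing in for a transport's assign callback. */
typedef struct {
    int fail_at;              /* queue whose assignment fails, or -1 */
    bool assigned[QUEUE_MAX]; /* current assignment state per queue */
} ToyBackend;

static int toy_assign(ToyBackend *b, int n, bool assign)
{
    if (assign && n == b->fail_at) {
        return -1;
    }
    b->assigned[n] = assign;
    return 0;
}

/*
 * Assign every in-use queue. On failure, deassign the queues that
 * were already set up and return the error.
 */
static int start_all(ToyBackend *b, const bool *in_use)
{
    int n, r = 0;

    for (n = 0; n < QUEUE_MAX; n++) {
        if (!in_use[n]) {
            continue;
        }
        r = toy_assign(b, n, true);
        if (r < 0) {
            goto rollback;
        }
    }
    return 0;

rollback:
    while (--n >= 0) {
        if (!in_use[n]) {
            continue;
        }
        toy_assign(b, n, false);
    }
    return r;
}
```

Keeping the loop in one place means each transport only has to supply the assign step, which is what the ioeventfd_assign callback is for.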

[Qemu-devel] [PULL 23/34] acpi: cpuhp: implement hot-add parts of CPU hotplug interface

2016-06-23 Thread Michael S. Tsirkin

From: Igor Mammedov 

Add the hw registers needed for handling CPU hot-add and the
corresponding AML methods to handle hot-add events on the
guest side.

Signed-off-by: Igor Mammedov 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 include/hw/acpi/cpu.h |   5 +-
 hw/acpi/cpu.c | 150 +-
 hw/acpi/trace-events  |   4 ++
 3 files changed, 157 insertions(+), 2 deletions(-)

diff --git a/include/hw/acpi/cpu.h b/include/hw/acpi/cpu.h
index f345447..55c3166 100644
--- a/include/hw/acpi/cpu.h
+++ b/include/hw/acpi/cpu.h
@@ -20,11 +20,13 @@
 typedef struct AcpiCpuStatus {
 struct CPUState *cpu;
 uint64_t arch_id;
+bool is_inserting;
 } AcpiCpuStatus;
 
 typedef struct CPUHotplugState {
 MemoryRegion ctrl_reg;
 uint32_t selector;
+uint8_t command;
 uint32_t dev_count;
 AcpiCpuStatus *devs;
 } CPUHotplugState;
@@ -41,7 +43,8 @@ typedef struct CPUHotplugFeatures {
 
 void build_cpus_aml(Aml *table, MachineState *machine, CPUHotplugFeatures opts,
 hwaddr io_base,
-const char *res_root);
+const char *res_root,
+const char *event_handler_method);
 
 extern const VMStateDescription vmstate_cpu_hotplug;
 #define VMSTATE_CPU_HOTPLUG(cpuhp, state) \
diff --git a/hw/acpi/cpu.c b/hw/acpi/cpu.c
index d99002c..811be8a 100644
--- a/hw/acpi/cpu.c
+++ b/hw/acpi/cpu.c
@@ -7,6 +7,13 @@
 #define ACPI_CPU_HOTPLUG_REG_LEN 12
 #define ACPI_CPU_SELECTOR_OFFSET_WR 0
 #define ACPI_CPU_FLAGS_OFFSET_RW 4
+#define ACPI_CPU_CMD_OFFSET_WR 5
+#define ACPI_CPU_CMD_DATA_OFFSET_RW 8
+
+enum {
+CPHP_GET_NEXT_CPU_WITH_EVENT_CMD = 0,
+CPHP_CMD_MAX
+};
 
 static uint64_t cpu_hotplug_rd(void *opaque, hwaddr addr, unsigned size)
 {
@@ -22,8 +29,19 @@ static uint64_t cpu_hotplug_rd(void *opaque, hwaddr addr, unsigned size)
 switch (addr) {
 case ACPI_CPU_FLAGS_OFFSET_RW: /* pack and return is_* fields */
 val |= cdev->cpu ? 1 : 0;
+val |= cdev->is_inserting ? 2 : 0;
 trace_cpuhp_acpi_read_flags(cpu_st->selector, val);
 break;
+case ACPI_CPU_CMD_DATA_OFFSET_RW:
+switch (cpu_st->command) {
+case CPHP_GET_NEXT_CPU_WITH_EVENT_CMD:
+   val = cpu_st->selector;
+   break;
+default:
+   break;
+}
+trace_cpuhp_acpi_read_cmd_data(cpu_st->selector, val);
+break;
 default:
 break;
 }
@@ -34,6 +52,7 @@ static void cpu_hotplug_wr(void *opaque, hwaddr addr, uint64_t data,
unsigned int size)
 {
 CPUHotplugState *cpu_st = opaque;
+AcpiCpuStatus *cdev;
 
 assert(cpu_st->dev_count);
 
@@ -49,6 +68,33 @@ static void cpu_hotplug_wr(void *opaque, hwaddr addr, uint64_t data,
 cpu_st->selector = data;
 trace_cpuhp_acpi_write_idx(cpu_st->selector);
 break;
+case ACPI_CPU_FLAGS_OFFSET_RW: /* set is_* fields  */
+cdev = &cpu_st->devs[cpu_st->selector];
+if (data & 2) { /* clear insert event */
+cdev->is_inserting = false;
+trace_cpuhp_acpi_clear_inserting_evt(cpu_st->selector);
+}
+break;
+case ACPI_CPU_CMD_OFFSET_WR:
+trace_cpuhp_acpi_write_cmd(cpu_st->selector, data);
+if (data < CPHP_CMD_MAX) {
+cpu_st->command = data;
+if (cpu_st->command == CPHP_GET_NEXT_CPU_WITH_EVENT_CMD) {
+uint32_t iter = cpu_st->selector;
+
+do {
+cdev = &cpu_st->devs[iter];
+if (cdev->is_inserting) {
+cpu_st->selector = iter;
+trace_cpuhp_acpi_cpu_has_events(cpu_st->selector,
+cdev->is_inserting);
+break;
+}
+iter = iter + 1 < cpu_st->dev_count ? iter + 1 : 0;
+} while (iter != cpu_st->selector);
+}
+}
+break;
 default:
 break;
 }
@@ -111,8 +157,23 @@ void acpi_cpu_plug_cb(HotplugHandler *hotplug_dev,
 }
 
 cdev->cpu = CPU(dev);
+if (dev->hotplugged) {
+cdev->is_inserting = true;
+acpi_send_event(DEVICE(hotplug_dev), ACPI_CPU_HOTPLUG_STATUS);
+}
 }
 
+static const VMStateDescription vmstate_cpuhp_sts = {
+.name = "CPU hotplug device state",
+.version_id = 1,
+.minimum_version_id = 1,
+.minimum_version_id_old = 1,
+.fields  = (VMStateField[]) {
+VMSTATE_BOOL(is_inserting, AcpiCpuStatus),
+VMSTATE_END_OF_LIST()
+}
+};
+
 const VMStateDescription vmstate_cpu_hotplug = {
 .name = "CPU hotplug state",
 .version_id = 1,
@@ -120,6 +181,9 @@ const VMStateDescription vmstate_cpu_hotplug = {
 .minimum_version_id_old = 1,
 .fields  = (VMStateField[]) {
 VMSTATE_UINT32(selector, CPUHotplugState),
+VMSTATE_UINT8(command, CPUHotplugState),
+VMSTATE_
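
The CPHP_GET_NEXT_CPU_WITH_EVENT_CMD handling in this patch is a circular scan that starts at the current selector. A standalone sketch of the same loop follows; CpuSlot and next_cpu_with_event() are hypothetical names, and only the insert event is modelled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Reduced stand-in for AcpiCpuStatus: only the pending-insert flag. */
typedef struct {
    bool is_inserting;
} CpuSlot;

/*
 * Starting at 'selector', walk the slots circularly and return the
 * index of the first one with a pending event. If no slot has one,
 * the selector is returned unchanged.
 * Assumes dev_count > 0 and selector < dev_count.
 */
static uint32_t next_cpu_with_event(const CpuSlot *devs, uint32_t dev_count,
                                    uint32_t selector)
{
    uint32_t iter = selector;

    do {
        if (devs[iter].is_inserting) {
            return iter;
        }
        /* advance with wrap-around to slot 0 */
        iter = iter + 1 < dev_count ? iter + 1 : 0;
    } while (iter != selector);

    return selector;
}
```

Because the scan begins at the current selector rather than at 0, a slot that is examined first is the one the guest last selected, and the loop terminates after one full pass.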

[Qemu-devel] [PULL 26/34] pc: use new CPU hotplug interface since 2.7 machine type

2016-06-23 Thread Michael S. Tsirkin

From: Igor Mammedov 

For compatibility reasons PC/Q35 will start with the legacy
CPU hotplug interface by default, but with the new CPU hotplug
AML code since the 2.7 machine type. That way legacy firmware
that doesn't use QEMU-generated ACPI tables will be
able to continue using the legacy CPU hotplug interface.

A new machine type with firmware supporting QEMU-provided
ACPI tables, on the other hand, will generate the new CPU
hotplug AML, which will switch to the new CPU hotplug
interface when the guest OS executes its _INI method on
ACPI table loading.

Signed-off-by: Igor Mammedov 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 include/hw/acpi/cpu.h |  1 +
 include/hw/acpi/cpu_hotplug.h |  6 ++
 include/hw/i386/pc.h  |  2 ++
 hw/acpi/cpu.c |  9 +
 hw/acpi/cpu_hotplug.c | 21 -
 hw/acpi/ich9.c| 33 +
 hw/acpi/piix4.c   | 32 
 hw/i386/acpi-build.c  | 12 +++-
 hw/i386/pc_piix.c |  2 ++
 hw/i386/pc_q35.c  |  2 ++
 10 files changed, 118 insertions(+), 2 deletions(-)

diff --git a/include/hw/acpi/cpu.h b/include/hw/acpi/cpu.h
index 980a83c..89ce172 100644
--- a/include/hw/acpi/cpu.h
+++ b/include/hw/acpi/cpu.h
@@ -49,6 +49,7 @@ void cpu_hotplug_hw_init(MemoryRegion *as, Object *owner,
 
 typedef struct CPUHotplugFeatures {
 bool apci_1_compatible;
+bool has_legacy_cphp;
 } CPUHotplugFeatures;
 
 void build_cpus_aml(Aml *table, MachineState *machine, CPUHotplugFeatures opts,
diff --git a/include/hw/acpi/cpu_hotplug.h b/include/hw/acpi/cpu_hotplug.h
index 6fef67e..b995ef2 100644
--- a/include/hw/acpi/cpu_hotplug.h
+++ b/include/hw/acpi/cpu_hotplug.h
@@ -16,8 +16,10 @@
 #include "hw/acpi/pc-hotplug.h"
 #include "hw/acpi/aml-build.h"
 #include "hw/hotplug.h"
+#include "hw/acpi/cpu.h"
 
 typedef struct AcpiCpuHotplug {
+Object *device;
 MemoryRegion io;
 uint8_t sts[ACPI_GPE_PROC_LEN];
 } AcpiCpuHotplug;
@@ -28,6 +30,10 @@ void legacy_acpi_cpu_plug_cb(HotplugHandler *hotplug_dev,
 void legacy_acpi_cpu_hotplug_init(MemoryRegion *parent, Object *owner,
   AcpiCpuHotplug *gpe_cpu, uint16_t base);
 
+void acpi_switch_to_modern_cphp(AcpiCpuHotplug *gpe_cpu,
+CPUHotplugState *cpuhp_state,
+uint16_t io_port);
+
 void build_legacy_cpu_hotplug_aml(Aml *ctx, MachineState *machine,
   uint16_t io_base);
 #endif
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 9e23929..884224e 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -137,6 +137,8 @@ struct PCMachineClass {
 
 /* TSC rate migration: */
 bool save_tsc_khz;
+/* generate legacy CPU hotplug AML */
+bool legacy_cpu_hotplug;
 };
 
 #define TYPE_PC_MACHINE "generic-pc-machine"
diff --git a/hw/acpi/cpu.c b/hw/acpi/cpu.c
index 401ac0d..c13b65c 100644
--- a/hw/acpi/cpu.c
+++ b/hw/acpi/cpu.c
@@ -373,6 +373,15 @@ void build_cpus_aml(Aml *table, MachineState *machine, CPUHotplugFeatures opts,
 aml_append(field, aml_named_field(CPU_DATA, 32));
 aml_append(cpu_ctrl_dev, field);
 
+if (opts.has_legacy_cphp) {
+method = aml_method("_INI", 0, AML_SERIALIZED);
+/* switch off legacy CPU hotplug HW and use new one,
+ * on reboot system is in new mode and writing 0
+ * in CPU_SELECTOR selects BSP, which is NOP at
+ * the time _INI is called */
+aml_append(method, aml_store(zero, aml_name(CPU_SELECTOR)));
+aml_append(cpu_ctrl_dev, method);
+}
 }
 aml_append(sb_scope, cpu_ctrl_dev);
 
diff --git a/hw/acpi/cpu_hotplug.c b/hw/acpi/cpu_hotplug.c
index fe75bd9..e19d902 100644
--- a/hw/acpi/cpu_hotplug.c
+++ b/hw/acpi/cpu_hotplug.c
@@ -34,7 +34,15 @@ static uint64_t cpu_status_read(void *opaque, hwaddr addr, unsigned int size)
 static void cpu_status_write(void *opaque, hwaddr addr, uint64_t data,
  unsigned int size)
 {
-/* TODO: implement VCPU removal on guest signal that CPU can be removed */
+/* firmware never used to write in CPU present bitmap so use
+   this fact as means to switch QEMU into modern CPU hotplug
+   mode by writing 0 at the beginning of legacy CPU bitmap
+ */
+if (addr == 0 && data == 0) {
+AcpiCpuHotplug *cpus = opaque;
+object_property_set_bool(cpus->device, false, "cpu-hotplug-legacy",
+ &error_abort);
+}
 }
 
 static const MemoryRegionOps AcpiCpuHotplug_ops = {
@@ -83,6 +91,17 @@ void legacy_acpi_cpu_hotplug_init(MemoryRegion *parent, Object *owner,
 memory_region_init_io(&gpe_cpu->io, owner, &AcpiCpuHotplug_ops,
   gpe_cpu, "acpi-cpu-hotplug", ACPI_GPE_PROC_LEN);
 memory_region_add_subregion(parent, base, &gpe_cpu->io);
+gpe_cpu->devi

[Qemu-devel] [PULL 24/34] acpi: cpuhp: implement hot-remove parts of CPU hotplug interface

2016-06-23 Thread Michael S. Tsirkin

From: Igor Mammedov 

Add the hw registers needed for handling CPU hot-remove and the
corresponding AML methods to request and eject a CPU, with the
necessary hotplug callbacks in the pc, piix4 and ich9 code.

Signed-off-by: Igor Mammedov 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 include/hw/acpi/cpu.h |  8 +
 hw/acpi/cpu.c | 90 ---
 hw/acpi/ich9.c|  7 
 hw/acpi/piix4.c   |  6 
 hw/i386/pc.c  | 47 +++
 hw/acpi/trace-events  |  5 ++-
 6 files changed, 158 insertions(+), 5 deletions(-)

diff --git a/include/hw/acpi/cpu.h b/include/hw/acpi/cpu.h
index 55c3166..f334221 100644
--- a/include/hw/acpi/cpu.h
+++ b/include/hw/acpi/cpu.h
@@ -21,6 +21,7 @@ typedef struct AcpiCpuStatus {
 struct CPUState *cpu;
 uint64_t arch_id;
 bool is_inserting;
+bool is_removing;
 } AcpiCpuStatus;
 
 typedef struct CPUHotplugState {
@@ -34,6 +35,13 @@ typedef struct CPUHotplugState {
 void acpi_cpu_plug_cb(HotplugHandler *hotplug_dev,
   CPUHotplugState *cpu_st, DeviceState *dev, Error **errp);
 
+void acpi_cpu_unplug_request_cb(HotplugHandler *hotplug_dev,
+CPUHotplugState *cpu_st,
+DeviceState *dev, Error **errp);
+
+void acpi_cpu_unplug_cb(CPUHotplugState *cpu_st,
+DeviceState *dev, Error **errp);
+
 void cpu_hotplug_hw_init(MemoryRegion *as, Object *owner,
  CPUHotplugState *state, hwaddr base_addr);
 
diff --git a/hw/acpi/cpu.c b/hw/acpi/cpu.c
index 811be8a..483b808 100644
--- a/hw/acpi/cpu.c
+++ b/hw/acpi/cpu.c
@@ -30,6 +30,7 @@ static uint64_t cpu_hotplug_rd(void *opaque, hwaddr addr, unsigned size)
 case ACPI_CPU_FLAGS_OFFSET_RW: /* pack and return is_* fields */
 val |= cdev->cpu ? 1 : 0;
 val |= cdev->is_inserting ? 2 : 0;
+val |= cdev->is_removing  ? 4 : 0;
 trace_cpuhp_acpi_read_flags(cpu_st->selector, val);
 break;
 case ACPI_CPU_CMD_DATA_OFFSET_RW:
@@ -73,6 +74,22 @@ static void cpu_hotplug_wr(void *opaque, hwaddr addr, uint64_t data,
 if (data & 2) { /* clear insert event */
 cdev->is_inserting = false;
 trace_cpuhp_acpi_clear_inserting_evt(cpu_st->selector);
+} else if (data & 4) { /* clear remove event */
+cdev->is_removing = false;
+trace_cpuhp_acpi_clear_remove_evt(cpu_st->selector);
+} else if (data & 8) {
+DeviceState *dev = NULL;
+HotplugHandler *hotplug_ctrl = NULL;
+
+if (!cdev->cpu) {
+trace_cpuhp_acpi_ejecting_invalid_cpu(cpu_st->selector);
+break;
+}
+
+trace_cpuhp_acpi_ejecting_cpu(cpu_st->selector);
+dev = DEVICE(cdev->cpu);
+hotplug_ctrl = qdev_get_hotplug_handler(dev);
+hotplug_handler_unplug(hotplug_ctrl, dev, NULL);
 }
 break;
 case ACPI_CPU_CMD_OFFSET_WR:
@@ -84,10 +101,10 @@ static void cpu_hotplug_wr(void *opaque, hwaddr addr, uint64_t data,
 
 do {
 cdev = &cpu_st->devs[iter];
-if (cdev->is_inserting) {
+if (cdev->is_inserting || cdev->is_removing) {
 cpu_st->selector = iter;
 trace_cpuhp_acpi_cpu_has_events(cpu_st->selector,
-cdev->is_inserting);
+cdev->is_inserting, cdev->is_removing);
 break;
 }
 iter = iter + 1 < cpu_st->dev_count ? iter + 1 : 0;
@@ -163,6 +180,34 @@ void acpi_cpu_plug_cb(HotplugHandler *hotplug_dev,
 }
 }
 
+void acpi_cpu_unplug_request_cb(HotplugHandler *hotplug_dev,
+CPUHotplugState *cpu_st,
+DeviceState *dev, Error **errp)
+{
+AcpiCpuStatus *cdev;
+
+cdev = get_cpu_status(cpu_st, dev);
+if (!cdev) {
+return;
+}
+
+cdev->is_removing = true;
+acpi_send_event(DEVICE(hotplug_dev), ACPI_CPU_HOTPLUG_STATUS);
+}
+
+void acpi_cpu_unplug_cb(CPUHotplugState *cpu_st,
+DeviceState *dev, Error **errp)
+{
+AcpiCpuStatus *cdev;
+
+cdev = get_cpu_status(cpu_st, dev);
+if (!cdev) {
+return;
+}
+
+cdev->cpu = NULL;
+}
+
 static const VMStateDescription vmstate_cpuhp_sts = {
 .name = "CPU hotplug device state",
 .version_id = 1,
@@ -170,6 +215,7 @@ static const VMStateDescription vmstate_cpuhp_sts = {
 .minimum_version_id_old = 1,
 .fields  = (VMStateField[]) {
 VMSTATE_BOOL(is_inserting, AcpiCpuStatus),
+VMSTATE_BOOL(is_removing, AcpiCpuStatus),
 VMSTATE_END_OF_LIST()
 }
 };
@@ -194,12 +240,15 @@ const VMStateDescription vmstate_cpu_hotplug = {
 #define CPU_STS_METHOD"CSTA"
 #define CPU_SCAN_METHOD   "CSCN"
 #define
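
The flags register handling in the hunks above can be summarized with a small standalone sketch. It models only the bit layout visible in this patch (1 = enabled, 2 = inserting, 4 = removing on read; 2, 4 and 8 as clear-insert, clear-remove and eject on write). The `cpu_slot` struct and both helper names are hypothetical and are not QEMU code; in QEMU the eject goes through the hotplug handler, which is modeled here by simply clearing a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for AcpiCpuStatus; not the QEMU type. */
struct cpu_slot {
    bool present;       /* models cdev->cpu != NULL */
    bool is_inserting;
    bool is_removing;
};

/* Pack the slot state the same way the read handler in the patch does. */
uint64_t slot_read_flags(const struct cpu_slot *s)
{
    uint64_t val = 0;

    assert(s != NULL);
    val |= s->present      ? 1 : 0;
    val |= s->is_inserting ? 2 : 0;
    val |= s->is_removing  ? 4 : 0;
    return val;
}

/*
 * Apply a guest write to the flags register. This mirrors the
 * if / else-if chain in the patch, so only the first matching bit
 * among 2, 4 and 8 takes effect for a single write.
 * Returns true when the write ejected a CPU.
 */
bool slot_write_flags(struct cpu_slot *s, uint64_t data)
{
    assert(s != NULL);
    if (data & 2) {                 /* clear insert event */
        s->is_inserting = false;
    } else if (data & 4) {          /* clear remove event */
        s->is_removing = false;
    } else if (data & 8) {          /* eject request */
        if (!s->present) {
            return false;           /* ejecting an absent CPU is ignored */
        }
        s->present = false;         /* assumption: stands in for the unplug path */
        return true;
    }
    return false;
}
```

Because the write path is an else-if chain, a guest that wants to acknowledge an event and eject the CPU has to issue separate writes, which is what the sketch's return value makes easy to observe.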

[Qemu-devel] [PULL 25/34] acpi: cpuhp: add cpu._OST handling

2016-06-23 Thread Michael S. Tsirkin

From: Igor Mammedov 

Add the HW and AML parts of CPU_Device._OST method handling
so that OSPM can report the status of hot-(un)plug operations.
Also extend the QMP command query-acpi-ospm-status to report
CPU OST info along with the already reported PC-DIMM devices.

Signed-off-by: Igor Mammedov 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 qapi-schema.json  |  3 +-
 include/hw/acpi/cpu.h |  4 +++
 hw/acpi/cpu.c | 82 +++
 hw/acpi/ich9.c|  3 ++
 hw/acpi/piix4.c   |  3 ++
 hw/acpi/trace-events  |  2 ++
 6 files changed, 96 insertions(+), 1 deletion(-)

diff --git a/qapi-schema.json b/qapi-schema.json
index 0964eec..84b6708 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -4079,8 +4079,9 @@
 ## @ACPISlotType
 #
 # @DIMM: memory slot
+# @CPU: logical CPU slot (since 2.7)
 #
-{ 'enum': 'ACPISlotType', 'data': [ 'DIMM' ] }
+{ 'enum': 'ACPISlotType', 'data': [ 'DIMM', 'CPU' ] }
 
 ## @ACPIOSTInfo
 #
diff --git a/include/hw/acpi/cpu.h b/include/hw/acpi/cpu.h
index f334221..980a83c 100644
--- a/include/hw/acpi/cpu.h
+++ b/include/hw/acpi/cpu.h
@@ -22,6 +22,8 @@ typedef struct AcpiCpuStatus {
 uint64_t arch_id;
 bool is_inserting;
 bool is_removing;
+uint32_t ost_event;
+uint32_t ost_status;
 } AcpiCpuStatus;
 
 typedef struct CPUHotplugState {
@@ -54,6 +56,8 @@ void build_cpus_aml(Aml *table, MachineState *machine, CPUHotplugFeatures opts,
 const char *res_root,
 const char *event_handler_method);
 
+void acpi_cpu_ospm_status(CPUHotplugState *cpu_st, ACPIOSTInfoList ***list);
+
 extern const VMStateDescription vmstate_cpu_hotplug;
 #define VMSTATE_CPU_HOTPLUG(cpuhp, state) \
 VMSTATE_STRUCT(cpuhp, state, 1, \
diff --git a/hw/acpi/cpu.c b/hw/acpi/cpu.c
index 483b808..401ac0d 100644
--- a/hw/acpi/cpu.c
+++ b/hw/acpi/cpu.c
@@ -2,6 +2,7 @@
 #include "hw/boards.h"
 #include "hw/acpi/cpu.h"
 #include "qapi/error.h"
+#include "qapi-event.h"
 #include "trace.h"
 
 #define ACPI_CPU_HOTPLUG_REG_LEN 12
@@ -12,9 +13,42 @@
 
 enum {
 CPHP_GET_NEXT_CPU_WITH_EVENT_CMD = 0,
+CPHP_OST_EVENT_CMD = 1,
+CPHP_OST_STATUS_CMD = 2,
 CPHP_CMD_MAX
 };
 
+static ACPIOSTInfo *acpi_cpu_device_status(int idx, AcpiCpuStatus *cdev)
+{
+ACPIOSTInfo *info = g_new0(ACPIOSTInfo, 1);
+
+info->slot_type = ACPI_SLOT_TYPE_CPU;
+info->slot = g_strdup_printf("%d", idx);
+info->source = cdev->ost_event;
+info->status = cdev->ost_status;
+if (cdev->cpu) {
+DeviceState *dev = DEVICE(cdev->cpu);
+if (dev->id) {
+info->device = g_strdup(dev->id);
+info->has_device = true;
+}
+}
+return info;
+}
+
+void acpi_cpu_ospm_status(CPUHotplugState *cpu_st, ACPIOSTInfoList ***list)
+{
+int i;
+
+for (i = 0; i < cpu_st->dev_count; i++) {
+ACPIOSTInfoList *elem = g_new0(ACPIOSTInfoList, 1);
+elem->value = acpi_cpu_device_status(i, &cpu_st->devs[i]);
+elem->next = NULL;
+**list = elem;
+*list = &elem->next;
+}
+}
+
 static uint64_t cpu_hotplug_rd(void *opaque, hwaddr addr, unsigned size)
 {
 uint64_t val = 0;
@@ -54,6 +88,7 @@ static void cpu_hotplug_wr(void *opaque, hwaddr addr, uint64_t data,
 {
 CPUHotplugState *cpu_st = opaque;
 AcpiCpuStatus *cdev;
+ACPIOSTInfo *info;
 
 assert(cpu_st->dev_count);
 
@@ -112,6 +147,28 @@ static void cpu_hotplug_wr(void *opaque, hwaddr addr, uint64_t data,
 }
 }
 break;
+case ACPI_CPU_CMD_DATA_OFFSET_RW:
+switch (cpu_st->command) {
+case CPHP_OST_EVENT_CMD: {
+   cdev = &cpu_st->devs[cpu_st->selector];
+   cdev->ost_event = data;
+   trace_cpuhp_acpi_write_ost_ev(cpu_st->selector, cdev->ost_event);
+   break;
+}
+case CPHP_OST_STATUS_CMD: {
+   cdev = &cpu_st->devs[cpu_st->selector];
+   cdev->ost_status = data;
+   info = acpi_cpu_device_status(cpu_st->selector, cdev);
+   qapi_event_send_acpi_device_ost(info, &error_abort);
+   qapi_free_ACPIOSTInfo(info);
+   trace_cpuhp_acpi_write_ost_status(cpu_st->selector,
+ cdev->ost_status);
+   break;
+}
+default:
+   break;
+}
+break;
 default:
 break;
 }
@@ -216,6 +273,8 @@ static const VMStateDescription vmstate_cpuhp_sts = {
 .fields  = (VMStateField[]) {
 VMSTATE_BOOL(is_inserting, AcpiCpuStatus),
 VMSTATE_BOOL(is_removing, AcpiCpuStatus),
+VMSTATE_UINT32(ost_event, AcpiCpuStatus),
+VMSTATE_UINT32(ost_status, AcpiCpuStatus),
 VMSTATE_END_OF_LIST()
 }
 };
@@ -241,6 +300,7 @@ const VMStateDescription vmstate_cpu_hotplug = {
 #define CPU_SCAN_METHOD   "CSCN"
 #define CPU_NOTIFY_METHOD "CTFY"
 #define CPU_EJECT_METHOD  "CEJ0"
+#define CPU_OST_METHOD"COST"
 
 #define
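
acpi_cpu_ospm_status() above takes an `ACPIOSTInfoList ***list` and appends with `**list = elem; *list = &elem->next;`. The sketch below shows that tail-cursor idiom in isolation, under the assumption that its purpose is to let several producers (for example the PC-DIMM and CPU reporters mentioned in the commit message) append to one shared list. The `node` type and the function names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical list node; stands in for ACPIOSTInfoList. */
struct node {
    int value;
    struct node *next;
};

/*
 * Append `count` nodes through a cursor that points at the slot
 * where the next element must be linked. After the call the cursor
 * points at the new tail's `next` field, so another producer can
 * continue appending without walking the list.
 */
void append_range(struct node ***tail, int first, int count)
{
    for (int i = 0; i < count; i++) {
        struct node *elem = calloc(1, sizeof(*elem));

        assert(elem != NULL);
        elem->value = first + i;
        **tail = elem;          /* link into the current tail slot */
        *tail = &elem->next;    /* advance the cursor */
    }
}

/* Count the nodes reachable from head. */
int list_length(const struct node *head)
{
    int n = 0;

    for (; head != NULL; head = head->next) {
        n++;
    }
    return n;
}

/* Release every node in the list. */
void list_free(struct node *head)
{
    while (head != NULL) {
        struct node *next = head->next;

        free(head);
        head = next;
    }
}
```

A caller would start with `struct node *head = NULL; struct node **tail = &head;` and pass `&tail` to each producer in turn; every append is O(1) and the head pointer is filled in by the first producer automatically.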

[Qemu-devel] [PULL 16/34] nvdimm acpi: support Set Namespace Label Data function

2016-06-23 Thread Michael S. Tsirkin

From: Xiao Guangrong 

Function 6 is used to set Namespace Label Data

Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Xiao Guangrong 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/acpi/nvdimm.c | 44 +++-
 1 file changed, 43 insertions(+), 1 deletion(-)

diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
index 388d42e..e486128 100644
--- a/hw/acpi/nvdimm.c
+++ b/hw/acpi/nvdimm.c
@@ -463,6 +463,8 @@ struct NvdimmFuncSetLabelDataIn {
 uint8_t in_buf[0]; /* the data written to label data area. */
 } QEMU_PACKED;
 typedef struct NvdimmFuncSetLabelDataIn NvdimmFuncSetLabelDataIn;
+QEMU_BUILD_BUG_ON(sizeof(NvdimmFuncSetLabelDataIn) +
+  offsetof(NvdimmDsmIn, arg3) > 4096);
 
 static void
 nvdimm_dsm_function0(uint32_t supported_func, hwaddr dsm_mem_addr)
@@ -616,6 +618,39 @@ static void nvdimm_dsm_get_label_data(NVDIMMDevice *nvdimm, NvdimmDsmIn *in,
 g_free(get_label_data_out);
 }
 
+/*
+ * DSM Spec Rev1 4.6 Set Namespace Label Data (Function Index 6).
+ */
+static void nvdimm_dsm_set_label_data(NVDIMMDevice *nvdimm, NvdimmDsmIn *in,
+  hwaddr dsm_mem_addr)
+{
+NVDIMMClass *nvc = NVDIMM_GET_CLASS(nvdimm);
+NvdimmFuncSetLabelDataIn *set_label_data;
+uint32_t status;
+
+set_label_data = (NvdimmFuncSetLabelDataIn *)in->arg3;
+
+le32_to_cpus(&set_label_data->offset);
+le32_to_cpus(&set_label_data->length);
+
+nvdimm_debug("Write Label Data: offset %#x length %#x.\n",
+ set_label_data->offset, set_label_data->length);
+
+status = nvdimm_rw_label_data_check(nvdimm, set_label_data->offset,
+set_label_data->length);
+if (status != 0 /* Success */) {
+nvdimm_dsm_no_payload(status, dsm_mem_addr);
+return;
+}
+
+assert(sizeof(*in) + sizeof(*set_label_data) + set_label_data->length <=
+   4096);
+
+nvc->write_label_data(nvdimm, set_label_data->in_buf,
+  set_label_data->length, set_label_data->offset);
+nvdimm_dsm_no_payload(0 /* Success */, dsm_mem_addr);
+}
+
 static void nvdimm_dsm_device(NvdimmDsmIn *in, hwaddr dsm_mem_addr)
 {
 NVDIMMDevice *nvdimm = nvdimm_get_device_by_handle(in->handle);
@@ -629,7 +664,8 @@ static void nvdimm_dsm_device(NvdimmDsmIn *in, hwaddr dsm_mem_addr)
  support for any functions other
  than function 0. */ |
   1 << 4 /* Get Namespace Label Size */ |
-  1 << 5 /* Get Namespace Label Data */;
+  1 << 5 /* Get Namespace Label Data */ |
+  1 << 6 /* Set Namespace Label Data */;
 }
 nvdimm_dsm_function0(supported_func, dsm_mem_addr);
 return;
@@ -655,6 +691,12 @@ static void nvdimm_dsm_device(NvdimmDsmIn *in, hwaddr dsm_mem_addr)
 return;
 }
 break;
+case 0x6 /* Set Namespace Label Data */:
+if (nvdimm->label_size) {
+nvdimm_dsm_set_label_data(nvdimm, in, dsm_mem_addr);
+return;
+}
+break;
 }
 
 nvdimm_dsm_no_payload(1 /* Not Supported */, dsm_mem_addr);
-- 
MST
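
The Set function above refuses to write unless nvdimm_rw_label_data_check() returns success. As a rough standalone sketch of the three conditions that helper tests (32-bit wraparound, range beyond the label area, transfer larger than the per-call maximum), the function below takes the label size and maximum transfer size as plain parameters. Those parameters and the name `label_rw_check` are assumptions made for illustration; the status values 0 and 3 are the ones used in the patches.

```c
#include <assert.h>
#include <stdint.h>

enum {
    DSM_STATUS_SUCCESS = 0,
    DSM_STATUS_INVALID_INPUT = 3    /* "Invalid Input Parameters" */
};

/*
 * Validate a label-area access of `length` bytes at `offset`.
 * label_size: size of the namespace label area (hypothetical parameter).
 * max_xfer:   largest payload one call may carry (hypothetical parameter).
 */
uint32_t label_rw_check(uint64_t label_size, uint32_t max_xfer,
                        uint32_t offset, uint32_t length)
{
    uint32_t end = offset + length;     /* wraps modulo 2^32 on overflow */

    if (end < offset) {
        return DSM_STATUS_INVALID_INPUT;    /* offset + length overflowed */
    }
    if (label_size < end) {
        return DSM_STATUS_INVALID_INPUT;    /* runs past the label area */
    }
    if (length > max_xfer) {
        return DSM_STATUS_INVALID_INPUT;    /* does not fit in one transfer */
    }
    return DSM_STATUS_SUCCESS;
}
```

The overflow test has to come first: without it, a large offset plus a small length would wrap to a small value and slip past the label-size comparison.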

[Qemu-devel] [PULL 20/34] pc: piix4/ich9: add 'cpu-hotplug-legacy' property

2016-06-23 Thread Michael S. Tsirkin

From: Igor Mammedov 

It will be used to select which hotplug callback is called
and to switch from legacy mode to the new one.

Signed-off-by: Igor Mammedov 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 include/hw/acpi/ich9.h |  1 +
 hw/acpi/ich9.c | 23 ++-
 hw/acpi/piix4.c| 24 +++-
 3 files changed, 46 insertions(+), 2 deletions(-)

diff --git a/include/hw/acpi/ich9.h b/include/hw/acpi/ich9.h
index bbd657c..e29a856 100644
--- a/include/hw/acpi/ich9.h
+++ b/include/hw/acpi/ich9.h
@@ -48,6 +48,7 @@ typedef struct ICH9LPCPMRegs {
 uint32_t pm_io_base;
 Notifier powerdown_notifier;
 
+bool cpu_hotplug_legacy;
 AcpiCpuHotplug gpe_cpu;
 
 MemHotplugState acpi_memory_hotplug;
diff --git a/hw/acpi/ich9.c b/hw/acpi/ich9.c
index 853c9c4..ed16940 100644
--- a/hw/acpi/ich9.c
+++ b/hw/acpi/ich9.c
@@ -306,6 +306,21 @@ static void ich9_pm_set_memory_hotplug_support(Object *obj, bool value,
 s->pm.acpi_memory_hotplug.is_enabled = value;
 }
 
+static bool ich9_pm_get_cpu_hotplug_legacy(Object *obj, Error **errp)
+{
+ICH9LPCState *s = ICH9_LPC_DEVICE(obj);
+
+return s->pm.cpu_hotplug_legacy;
+}
+
+static void ich9_pm_set_cpu_hotplug_legacy(Object *obj, bool value,
+   Error **errp)
+{
+ICH9LPCState *s = ICH9_LPC_DEVICE(obj);
+
+s->pm.cpu_hotplug_legacy = value;
+}
+
 static void ich9_pm_get_disable_s3(Object *obj, Visitor *v, const char *name,
void *opaque, Error **errp)
 {
@@ -397,6 +412,7 @@ void ich9_pm_add_properties(Object *obj, ICH9LPCPMRegs *pm, Error **errp)
 {
 static const uint32_t gpe0_len = ICH9_PMIO_GPE0_LEN;
 pm->acpi_memory_hotplug.is_enabled = true;
+pm->cpu_hotplug_legacy = true;
 pm->disable_s3 = 0;
 pm->disable_s4 = 0;
 pm->s4_val = 2;
@@ -412,6 +428,10 @@ void ich9_pm_add_properties(Object *obj, ICH9LPCPMRegs *pm, Error **errp)
  ich9_pm_get_memory_hotplug_support,
  ich9_pm_set_memory_hotplug_support,
  NULL);
+object_property_add_bool(obj, "cpu-hotplug-legacy",
+ ich9_pm_get_cpu_hotplug_legacy,
+ ich9_pm_set_cpu_hotplug_legacy,
+ NULL);
 object_property_add(obj, ACPI_PM_PROP_S3_DISABLED, "uint8",
 ich9_pm_get_disable_s3,
 ich9_pm_set_disable_s3,
@@ -439,7 +459,8 @@ void ich9_pm_device_plug_cb(HotplugHandler *hotplug_dev, DeviceState *dev,
 object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
 acpi_memory_plug_cb(hotplug_dev, &lpc->pm.acpi_memory_hotplug,
 dev, errp);
-} else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
+} else if (lpc->pm.cpu_hotplug_legacy &&
+   object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
 legacy_acpi_cpu_plug_cb(hotplug_dev, &lpc->pm.gpe_cpu, dev, errp);
 } else {
 error_setg(errp, "acpi: device plug request for not supported device"
diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
index c48cb1b..9ae3964 100644
--- a/hw/acpi/piix4.c
+++ b/hw/acpi/piix4.c
@@ -86,6 +86,7 @@ typedef struct PIIX4PMState {
 uint8_t disable_s4;
 uint8_t s4_val;
 
+bool cpu_hotplug_legacy;
 AcpiCpuHotplug gpe_cpu;
 
 MemHotplugState acpi_memory_hotplug;
@@ -351,7 +352,8 @@ static void piix4_device_plug_cb(HotplugHandler *hotplug_dev,
 acpi_memory_plug_cb(hotplug_dev, &s->acpi_memory_hotplug, dev, errp);
 } else if (object_dynamic_cast(OBJECT(dev), TYPE_PCI_DEVICE)) {
 acpi_pcihp_device_plug_cb(hotplug_dev, &s->acpi_pci_hotplug, dev, errp);
-} else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
+} else if (s->cpu_hotplug_legacy &&
+   object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
 legacy_acpi_cpu_plug_cb(hotplug_dev, &s->gpe_cpu, dev, errp);
 } else {
 error_setg(errp, "acpi: device plug request for not supported device"
@@ -560,6 +562,21 @@ static const MemoryRegionOps piix4_gpe_ops = {
 .endianness = DEVICE_LITTLE_ENDIAN,
 };
 
+
+static bool piix4_get_cpu_hotplug_legacy(Object *obj, Error **errp)
+{
+PIIX4PMState *s = PIIX4_PM(obj);
+
+return s->cpu_hotplug_legacy;
+}
+
+static void piix4_set_cpu_hotplug_legacy(Object *obj, bool value, Error **errp)
+{
+PIIX4PMState *s = PIIX4_PM(obj);
+
+s->cpu_hotplug_legacy = value;
+}
+
 static void piix4_acpi_system_hot_add_init(MemoryRegion *parent,
PCIBus *bus, PIIX4PMState *s)
 {
@@ -570,6 +587,11 @@ static void piix4_acpi_system_hot_add_init(MemoryRegion *parent,
 acpi_pcihp_init(OBJECT(s), &s->acpi_pci_hotplug, bus, parent,
 s->use_acpi_pci_hotplug);
 
+s->cpu_hotplug_legacy = true;
+object_property_add_bool(OBJECT(s), "cpu-hotplug-legacy",
+

[Qemu-devel] [PULL 18/34] i386: pci-assign: Fix MSI-X table size

2016-06-23 Thread Michael S. Tsirkin

From: Ido Yariv 

The current code creates a whole-page mmio region for the MSI-X table,
regardless of the table's actual size.

However, the page containing the MSI-X table may contain other registers
not related to MSI-X. Creating an mmio region for the whole page masks
such registers and may break drivers in the guest OS.

Since the maximum number of entries is known, use that instead to deduce
the table size when setting up the mmio region.

Signed-off-by: Ido Yariv 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/i386/kvm/pci-assign.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/hw/i386/kvm/pci-assign.c b/hw/i386/kvm/pci-assign.c
index f9c9014..98997d1 100644
--- a/hw/i386/kvm/pci-assign.c
+++ b/hw/i386/kvm/pci-assign.c
@@ -36,8 +36,6 @@
 #include "kvm_i386.h"
 #include "hw/pci/pci-assign.h"
 
-#define MSIX_PAGE_SIZE 0x1000
-
 /* From linux/ioport.h */
 #define IORESOURCE_IO   0x0100  /* Resource type */
 #define IORESOURCE_MEM  0x0200
@@ -122,6 +120,7 @@ typedef struct AssignedDevice {
 int *msi_virq;
 MSIXTableEntry *msix_table;
 hwaddr msix_table_addr;
+uint16_t msix_table_size;
 uint16_t msix_max;
 MemoryRegion mmio;
 char *configfd_name;
@@ -1310,6 +1309,7 @@ static int assigned_device_pci_cap_init(PCIDevice *pci_dev, Error **errp)
 bar_nr = msix_table_entry & PCI_MSIX_FLAGS_BIRMASK;
 msix_table_entry &= ~PCI_MSIX_FLAGS_BIRMASK;
 dev->msix_table_addr = pci_region[bar_nr].base_addr + msix_table_entry;
+dev->msix_table_size = msix_max * sizeof(MSIXTableEntry);
 dev->msix_max = msix_max;
 }
 
@@ -1633,7 +1633,7 @@ static void assigned_dev_msix_reset(AssignedDevice *dev)
 return;
 }
 
-memset(dev->msix_table, 0, MSIX_PAGE_SIZE);
+memset(dev->msix_table, 0, dev->msix_table_size);
 
 for (i = 0, entry = dev->msix_table; i < dev->msix_max; i++, entry++) {
 entry->ctrl = cpu_to_le32(0x1); /* Masked */
@@ -1642,8 +1642,8 @@ static void assigned_dev_msix_reset(AssignedDevice *dev)
 
 static void assigned_dev_register_msix_mmio(AssignedDevice *dev, Error **errp)
 {
-dev->msix_table = mmap(NULL, MSIX_PAGE_SIZE, PROT_READ|PROT_WRITE,
-   MAP_ANONYMOUS|MAP_PRIVATE, 0, 0);
+dev->msix_table = mmap(NULL, dev->msix_table_size, PROT_READ | PROT_WRITE,
+   MAP_ANONYMOUS | MAP_PRIVATE, 0, 0);
 if (dev->msix_table == MAP_FAILED) {
 error_setg_errno(errp, errno, "failed to allocate msix_table");
 dev->msix_table = NULL;
@@ -1653,7 +1653,7 @@ static void assigned_dev_register_msix_mmio(AssignedDevice *dev, Error **errp)
 assigned_dev_msix_reset(dev);
 
 memory_region_init_io(&dev->mmio, OBJECT(dev), &assigned_dev_msix_mmio_ops,
-  dev, "assigned-dev-msix", MSIX_PAGE_SIZE);
+  dev, "assigned-dev-msix", dev->msix_table_size);
 }
 
 static void assigned_dev_unregister_msix_mmio(AssignedDevice *dev)
@@ -1662,7 +1662,7 @@ static void assigned_dev_unregister_msix_mmio(AssignedDevice *dev)
 return;
 }
 
-if (munmap(dev->msix_table, MSIX_PAGE_SIZE) == -1) {
+if (munmap(dev->msix_table, dev->msix_table_size) == -1) {
 error_report("error unmapping msix_table! %s", strerror(errno));
 }
 dev->msix_table = NULL;
-- 
MST
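
To make the size difference concrete, here is a small sketch of the arithmetic behind `msix_max * sizeof(MSIXTableEntry)`. It assumes the standard MSI-X table entry layout of four 32-bit words (16 bytes); the struct and function names are hypothetical and only mirror the idea of the patch, not its code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One MSI-X table entry: message address (low, high), data, vector control. */
struct msix_entry {
    uint32_t addr_lo;
    uint32_t addr_hi;
    uint32_t data;
    uint32_t ctrl;
};

/* Bytes occupied by a table of nr_entries vectors. */
size_t msix_table_bytes(uint16_t nr_entries)
{
    return (size_t)nr_entries * sizeof(struct msix_entry);
}

/*
 * Bytes of the containing page that a whole-page region would have
 * covered in addition to the table itself (0 if the table fills or
 * exceeds the page).
 */
size_t bytes_beyond_table(uint16_t nr_entries, size_t page_size)
{
    size_t table = msix_table_bytes(nr_entries);

    return table < page_size ? page_size - table : 0;
}
```

For a device with 8 vectors the table is 128 bytes, so a 4 KiB region would have shadowed 3968 bytes of unrelated register space in the same page.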

[Qemu-devel] [PULL 21/34] acpi: cpuhp: add CPU devices AML with _STA method

2016-06-23 Thread Michael S. Tsirkin

From: Igor Mammedov 

Add CPU objects with an _STA method to the DSDT, and the QEMU
side of the CPU hotplug interface initialization with registers
sufficient to handle _STA requests, including the necessary
hotplug callbacks in the piix4 and ich9 code.

The hot-(un)plug hw/acpi parts will be added by the
corresponding follow-up patches.

Signed-off-by: Igor Mammedov 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 include/hw/acpi/cpu.h  |  51 +++
 include/hw/acpi/ich9.h |   2 +
 hw/acpi/cpu.c  | 240 +
 hw/acpi/ich9.c |   9 +-
 hw/acpi/piix4.c|  11 ++-
 hw/acpi/Makefile.objs  |   1 +
 hw/acpi/trace-events   |   5 ++
 7 files changed, 313 insertions(+), 6 deletions(-)
 create mode 100644 include/hw/acpi/cpu.h
 create mode 100644 hw/acpi/cpu.c

diff --git a/include/hw/acpi/cpu.h b/include/hw/acpi/cpu.h
new file mode 100644
index 000..f345447
--- /dev/null
+++ b/include/hw/acpi/cpu.h
@@ -0,0 +1,51 @@
+/*
+ * QEMU ACPI hotplug utilities
+ *
+ * Copyright (C) 2016 Red Hat Inc
+ *
+ * Authors:
+ *   Igor Mammedov 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+#ifndef ACPI_CPU_H
+#define ACPI_CPU_H
+
+#include "hw/qdev-core.h"
+#include "hw/acpi/acpi.h"
+#include "hw/acpi/aml-build.h"
+#include "hw/hotplug.h"
+
+typedef struct AcpiCpuStatus {
+struct CPUState *cpu;
+uint64_t arch_id;
+} AcpiCpuStatus;
+
+typedef struct CPUHotplugState {
+MemoryRegion ctrl_reg;
+uint32_t selector;
+uint32_t dev_count;
+AcpiCpuStatus *devs;
+} CPUHotplugState;
+
+void acpi_cpu_plug_cb(HotplugHandler *hotplug_dev,
+  CPUHotplugState *cpu_st, DeviceState *dev, Error **errp);
+
+void cpu_hotplug_hw_init(MemoryRegion *as, Object *owner,
+ CPUHotplugState *state, hwaddr base_addr);
+
+typedef struct CPUHotplugFeatures {
+bool apci_1_compatible;
+} CPUHotplugFeatures;
+
+void build_cpus_aml(Aml *table, MachineState *machine, CPUHotplugFeatures opts,
+hwaddr io_base,
+const char *res_root);
+
+extern const VMStateDescription vmstate_cpu_hotplug;
+#define VMSTATE_CPU_HOTPLUG(cpuhp, state) \
+VMSTATE_STRUCT(cpuhp, state, 1, \
+   vmstate_cpu_hotplug, CPUHotplugState)
+
+#endif
diff --git a/include/hw/acpi/ich9.h b/include/hw/acpi/ich9.h
index e29a856..a352c94 100644
--- a/include/hw/acpi/ich9.h
+++ b/include/hw/acpi/ich9.h
@@ -23,6 +23,7 @@
 
 #include "hw/acpi/acpi.h"
 #include "hw/acpi/cpu_hotplug.h"
+#include "hw/acpi/cpu.h"
 #include "hw/acpi/memory_hotplug.h"
 #include "hw/acpi/acpi_dev_interface.h"
 #include "hw/acpi/tco.h"
@@ -50,6 +51,7 @@ typedef struct ICH9LPCPMRegs {
 
 bool cpu_hotplug_legacy;
 AcpiCpuHotplug gpe_cpu;
+CPUHotplugState cpuhp_state;
 
 MemHotplugState acpi_memory_hotplug;
 
diff --git a/hw/acpi/cpu.c b/hw/acpi/cpu.c
new file mode 100644
index 000..d99002c
--- /dev/null
+++ b/hw/acpi/cpu.c
@@ -0,0 +1,240 @@
+#include "qemu/osdep.h"
+#include "hw/boards.h"
+#include "hw/acpi/cpu.h"
+#include "qapi/error.h"
+#include "trace.h"
+
+#define ACPI_CPU_HOTPLUG_REG_LEN 12
+#define ACPI_CPU_SELECTOR_OFFSET_WR 0
+#define ACPI_CPU_FLAGS_OFFSET_RW 4
+
+static uint64_t cpu_hotplug_rd(void *opaque, hwaddr addr, unsigned size)
+{
+uint64_t val = 0;
+CPUHotplugState *cpu_st = opaque;
+AcpiCpuStatus *cdev;
+
+if (cpu_st->selector >= cpu_st->dev_count) {
+return val;
+}
+
+cdev = &cpu_st->devs[cpu_st->selector];
+switch (addr) {
+case ACPI_CPU_FLAGS_OFFSET_RW: /* pack and return is_* fields */
+val |= cdev->cpu ? 1 : 0;
+trace_cpuhp_acpi_read_flags(cpu_st->selector, val);
+break;
+default:
+break;
+}
+return val;
+}
+
+static void cpu_hotplug_wr(void *opaque, hwaddr addr, uint64_t data,
+   unsigned int size)
+{
+CPUHotplugState *cpu_st = opaque;
+
+assert(cpu_st->dev_count);
+
+if (addr) {
+if (cpu_st->selector >= cpu_st->dev_count) {
+trace_cpuhp_acpi_invalid_idx_selected(cpu_st->selector);
+return;
+}
+}
+
+switch (addr) {
+case ACPI_CPU_SELECTOR_OFFSET_WR: /* current CPU selector */
+cpu_st->selector = data;
+trace_cpuhp_acpi_write_idx(cpu_st->selector);
+break;
+default:
+break;
+}
+}
+
+static const MemoryRegionOps cpu_hotplug_ops = {
+.read = cpu_hotplug_rd,
+.write = cpu_hotplug_wr,
+.endianness = DEVICE_LITTLE_ENDIAN,
+.valid = {
+.min_access_size = 1,
+.max_access_size = 4,
+},
+};
+
+void cpu_hotplug_hw_init(MemoryRegion *as, Object *owner,
+ CPUHotplugState *state, hwaddr base_addr)
+{
+MachineState *machine = MACHINE(qdev_get_machine());
+MachineClass *mc = MACHINE_GET_CLASS(machine);
+CPUArchIdList *id_list;
+int i;
+
+

[Qemu-devel] [PULL 13/34] nvdimm acpi: check revision

2016-06-23 Thread Michael S. Tsirkin

From: Xiao Guangrong 

Currently only revision 1 is supported

Signed-off-by: Xiao Guangrong 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/acpi/nvdimm.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
index 07c95c1..8b89285 100644
--- a/hw/acpi/nvdimm.c
+++ b/hw/acpi/nvdimm.c
@@ -485,6 +485,13 @@ nvdimm_dsm_write(void *opaque, hwaddr addr, uint64_t val, unsigned size)
 nvdimm_debug("Revision %#x Handler %#x Function %#x.\n", in->revision,
  in->handle, in->function);
 
+if (in->revision != 0x1 /* Currently we only support DSM Spec Rev1. */) {
+nvdimm_debug("Revision %#x is not supported, expect %#x.\n",
+ in->revision, 0x1);
+nvdimm_dsm_no_payload(1 /* Not Supported */, dsm_mem_addr);
+goto exit;
+}
+
  /* Handle 0 is reserved for NVDIMM Root Device. */
 if (!in->handle) {
 nvdimm_dsm_root(in, dsm_mem_addr);
-- 
MST
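
The revision check above can be illustrated with a standalone sketch that decodes the start of the _DSM input page and rejects unsupported revisions. It assumes the layout described in the NVDIMM ACPI documentation patch of this series (handle at offset 0x0, revision at 0x4, function index at 0x8) and little-endian encoding; the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum {
    DSM_REVISION_SUPPORTED = 1,
    DSM_STATUS_SUCCESS = 0,
    DSM_STATUS_NOT_SUPPORTED = 1
};

/* Hypothetical decoded view of the first 12 bytes of the DSM page. */
struct dsm_header {
    uint32_t handle;
    uint32_t revision;
    uint32_t function;
};

/* Read a little-endian 32-bit value regardless of host byte order. */
static uint32_t load_le32(const uint8_t *p)
{
    return (uint32_t)p[0] |
           ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) |
           ((uint32_t)p[3] << 24);
}

/*
 * Decode the header and return the DSM status that should be
 * reported: Not Supported for any revision other than 1.
 */
uint32_t dsm_parse_header(const uint8_t *page, struct dsm_header *out)
{
    out->handle   = load_le32(page + 0x0);
    out->revision = load_le32(page + 0x4);
    out->function = load_le32(page + 0x8);

    if (out->revision != DSM_REVISION_SUPPORTED) {
        return DSM_STATUS_NOT_SUPPORTED;
    }
    return DSM_STATUS_SUCCESS;
}
```

Checking the revision before dispatching on the handle, as the patch does, means both the root device and the per-NVDIMM paths see only requests in a format they understand.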

[Qemu-devel] [PULL 17/34] docs: add NVDIMM ACPI documentation

2016-06-23 Thread Michael S. Tsirkin

From: Xiao Guangrong 

It describes the basic concepts of NVDIMM ACPI and the interfaces
between QEMU and the ACPI BIOS

Signed-off-by: Xiao Guangrong 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 docs/specs/acpi_nvdimm.txt | 132 +
 1 file changed, 132 insertions(+)
 create mode 100644 docs/specs/acpi_nvdimm.txt

diff --git a/docs/specs/acpi_nvdimm.txt b/docs/specs/acpi_nvdimm.txt
new file mode 100644
index 000..0fdd251
--- /dev/null
+++ b/docs/specs/acpi_nvdimm.txt
@@ -0,0 +1,132 @@
+QEMU<->ACPI BIOS NVDIMM interface
+---------------------------------
+
+QEMU supports NVDIMM via ACPI. This document describes the basic concepts of
+NVDIMM ACPI and the interface between QEMU and the ACPI BIOS.
+
+NVDIMM ACPI Background
+----------------------
+NVDIMM was introduced in ACPI 6.0, which defines an NVDIMM root device under
+the _SB scope with a _HID of "ACPI0012". For each NVDIMM present or intended
+to be supported by the platform, platform firmware also exposes an ACPI
+Namespace Device under the root device.
+
+The NVDIMM child devices under the NVDIMM root device are defined with _ADR
+corresponding to the NFIT device handle. The NVDIMM root device and the
+NVDIMM devices can have device specific methods (_DSM) to provide additional
+functions specific to a particular NVDIMM implementation.
+
+This is an example from ACPI 6.0, a platform contains one NVDIMM:
+
+Scope (\_SB){
+   Device (NVDR) // Root device
+   {
+  Name (_HID, "ACPI0012")
+  Method (_STA) {...}
+  Method (_FIT) {...}
+  Method (_DSM, ...) {...}
+  Device (NVD)
+  {
+ Name(_ADR, h) //where h is NFIT Device Handle for this NVDIMM
+ Method (_DSM, ...) {...}
+  }
+   }
+}
+
+Method supported on both NVDIMM root device and NVDIMM device
+_DSM (Device Specific Method)
+   It is a control method that enables devices to provide device specific
+   control functions that are consumed by the device driver.
+   The NVDIMM DSM specification can be found at:
+http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf
+
+   Arguments:
+   Arg0 – A Buffer containing a UUID (16 Bytes)
+   Arg1 – An Integer containing the Revision ID (4 Bytes)
+   Arg2 – An Integer containing the Function Index (4 Bytes)
+   Arg3 – A package containing parameters for the function specified by the
+  UUID, Revision ID, and Function Index
+
+   Return Value:
+   If Function Index = 0, a Buffer containing a function index bitfield.
+   Otherwise, the return value and type depends on the UUID, revision ID
+   and function index which are described in the DSM specification.
+
+Methods on NVDIMM ROOT Device
+_FIT(Firmware Interface Table)
+   It evaluates to a buffer returning data in the format of a series of NFIT
+   Type Structure.
+
+   Arguments: None
+
+   Return Value:
+   A Buffer containing a list of NFIT Type structure entries.
+
+   The detailed definition of the structure can be found at ACPI 6.0: 5.2.25
+   NVDIMM Firmware Interface Table (NFIT).
+
+QEMU NVDIMM Implementation
+--------------------------
+QEMU uses a 4-byte IO port starting at 0x0a18 and a RAM-based memory page
+for NVDIMM ACPI.
+
+Memory:
+   QEMU uses the BIOS Linker/loader feature to ask the BIOS to allocate a
+   memory page and dynamically patch its address into an int32 object named
+   "MEMA" in ACPI.
+
+   This page is RAM-based and is used to transfer data between the _DSM
+   method and QEMU. If ACPI has control, this page is owned by ACPI, which
+   writes _DSM input data to it; otherwise, it is owned by QEMU, which
+   emulates the _DSM access and writes the output data to it.
+
+   ACPI writes _DSM Input Data (based on the offset in the page):
+   [0x0 - 0x3]: 4 bytes, NVDIMM Device Handle, 0 is reserved for NVDIMM
+Root device.
+   [0x4 - 0x7]: 4 bytes, Revision ID, that is the Arg1 of _DSM method.
+   [0x8 - 0xB]: 4 bytes, Function Index, that is the Arg2 of _DSM method.
+   [0xC - 0xFFF]: 4084 bytes, the Arg3 of _DSM method.
+
+   QEMU Writes Output Data (based on the offset in the page):
+   [0x0 - 0x3]: 4 bytes, the length of result
+   [0x4 - 0xFFF]: 4092 bytes, the DSM result filled by QEMU
+
+IO Port 0x0a18 - 0x0a1b:
+   ACPI writes the address of the memory page allocated by BIOS to this
+   port, then QEMU takes control and fills in the result in the memory page.
+
+   Write access:
+   [0x0a18 - 0x0a1b]: 4 bytes, the address of the memory page allocated
+ by BIOS.
+
+_DSM process diagram:
+---------------------
+"MEMA" indicates the address of memory page allocated by BIOS.
+
+ +----------------------+      +-----------------------+
+ |    1. OSPM           |      |    2. OSPM            |
+ | save _DSM input data |      |  write "MEMA" to      | Exit to QEMU
+ | to the page          +----->|  IO port 0x0a18       +------------+
+ | indicated by "MEMA"  |      |                       |            |
+ +

[Qemu-devel] [PULL 22/34] pc: acpi: introduce AcpiDeviceIfClass.madt_cpu hook

2016-06-23 Thread Michael S. Tsirkin

From: Igor Mammedov 

Add madt_cpu callback to AcpiDeviceIfClass and use
it for generating LAPIC MADT entries for CPUs.

Later it will be used for generating x2APIC
entries in case of more than 255 CPUs, and it
will also be reused by the ARM target when ACPI
CPU hotplug is introduced there.

Signed-off-by: Igor Mammedov 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 include/hw/acpi/acpi_dev_interface.h |  7 ++
 include/hw/i386/pc.h |  5 
 hw/acpi/piix4.c  |  1 +
 hw/i386/acpi-build.c | 45 +---
 hw/isa/lpc_ich9.c|  1 +
 stubs/pc_madt_cpu_entry.c|  7 ++
 stubs/Makefile.objs  |  1 +
 7 files changed, 49 insertions(+), 18 deletions(-)
 create mode 100644 stubs/pc_madt_cpu_entry.c

diff --git a/include/hw/acpi/acpi_dev_interface.h b/include/hw/acpi/acpi_dev_interface.h
index a0c4a33..da4ef7f 100644
--- a/include/hw/acpi/acpi_dev_interface.h
+++ b/include/hw/acpi/acpi_dev_interface.h
@@ -3,6 +3,7 @@
 
 #include "qom/object.h"
 #include "qapi-types.h"
+#include "hw/boards.h"
 
 /* These values are part of guest ABI, and can not be changed */
 typedef enum {
@@ -37,6 +38,10 @@ void acpi_send_event(DeviceState *dev, AcpiEventStatusBits event);
  * ospm_status: returns status of ACPI device objects, reported
  *  via _OST method if device supports it.
  * send_event: inject a specified event into guest
+ * madt_cpu: fills @entry with Interrupt Controller Structure
+ *   for CPU indexed by @uid in @apic_ids array,
+ *   returned structure types are:
+ *   0 - Local APIC, 9 - Local x2APIC, 0xB - GICC
  *
  * Interface is designed for providing unified interface
  * to generic ACPI functionality that could be used without
@@ -50,5 +55,7 @@ typedef struct AcpiDeviceIfClass {
 /*  */
 void (*ospm_status)(AcpiDeviceIf *adev, ACPIOSTInfoList ***list);
 void (*send_event)(AcpiDeviceIf *adev, AcpiEventStatusBits ev);
+void (*madt_cpu)(AcpiDeviceIf *adev, int uid,
+ CPUArchIdList *apic_ids, GArray *entry);
 } AcpiDeviceIfClass;
 #endif
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 49566c8..9e23929 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -17,6 +17,7 @@
 #include "hw/compat.h"
 #include "hw/mem/pc-dimm.h"
 #include "hw/mem/nvdimm.h"
+#include "hw/acpi/acpi_dev_interface.h"
 
 #define HPET_INTCAP "hpet-intcap"
 
@@ -345,6 +346,10 @@ void pc_system_firmware_init(MemoryRegion *rom_memory,
 /* pvpanic.c */
 uint16_t pvpanic_port(void);
 
+/* acpi-build.c */
+void pc_madt_cpu_entry(AcpiDeviceIf *adev, int uid,
+   CPUArchIdList *apic_ids, GArray *entry);
+
 /* e820 types */
 #define E820_RAM1
 #define E820_RESERVED   2
diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
index 6351d2e..6d24cb5 100644
--- a/hw/acpi/piix4.c
+++ b/hw/acpi/piix4.c
@@ -658,6 +658,7 @@ static void piix4_pm_class_init(ObjectClass *klass, void *data)
 hc->unplug = piix4_device_unplug_cb;
 adevc->ospm_status = piix4_ospm_status;
 adevc->send_event = piix4_send_gpe;
+adevc->madt_cpu = pc_madt_cpu_entry;
 }
 
 static const TypeInfo piix4_pm_info = {
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index b3dc1df..e35a446 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -329,12 +329,38 @@ build_fadt(GArray *table_data, BIOSLinker *linker, AcpiPmInfo *pm,
  (void *)fadt, "FACP", sizeof(*fadt), 1, oem_id, oem_table_id);
 }
 
+void pc_madt_cpu_entry(AcpiDeviceIf *adev, int uid,
+   CPUArchIdList *apic_ids, GArray *entry)
+{
+int apic_id;
+AcpiMadtProcessorApic *apic = acpi_data_push(entry, sizeof *apic);
+
+apic_id = apic_ids->cpus[uid].arch_id;
+apic->type = ACPI_APIC_PROCESSOR;
+apic->length = sizeof(*apic);
+apic->processor_id = uid;
+apic->local_apic_id = apic_id;
+if (apic_ids->cpus[uid].cpu != NULL) {
+apic->flags = cpu_to_le32(1);
+} else {
+/* ACPI spec says that LAPIC entry for non present
+ * CPU may be omitted from MADT or it must be marked
+ * as disabled. However omitting non present CPU from
+ * MADT breaks hotplug on linux. So possible CPUs
+ * should be put in MADT but kept disabled.
+ */
+apic->flags = cpu_to_le32(0);
+}
+}
+
 static void
 build_madt(GArray *table_data, BIOSLinker *linker, PCMachineState *pcms)
 {
 MachineClass *mc = MACHINE_GET_CLASS(pcms);
 CPUArchIdList *apic_ids = mc->possible_cpu_arch_ids(MACHINE(pcms));
 int madt_start = table_data->len;
+AcpiDeviceIfClass *adevc = ACPI_DEVICE_IF_GET_CLASS(pcms->acpi_dev);
+AcpiDeviceIf *adev = ACPI_DEVICE_IF(pcms->acpi_dev);
 
 AcpiMultipleApicTable *madt;
 AcpiMadtIoApic *io_apic;
@@ -347,24 +373,7 @@ build_madt(GArray *table_data, BIOSLinker *linker, PCMachineState *pcms)
 madt->flags
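
pc_madt_cpu_entry() above emits one Processor Local APIC structure per possible CPU and marks absent CPUs as disabled instead of omitting them. The sketch below serializes that 8-byte structure by hand (type 0, length 8, processor UID, APIC ID, 32-bit little-endian flags with bit 0 meaning enabled) to show what ends up in the table. The function name and the raw byte buffer are assumptions for illustration; QEMU uses its own packed struct and acpi_data_push().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum {
    MADT_TYPE_LOCAL_APIC = 0,   /* "0 - Local APIC" in the hook's comment */
    MADT_LOCAL_APIC_LEN = 8
};

/*
 * Write one Processor Local APIC entry into buf and return the
 * number of bytes written. buf must have room for 8 bytes.
 */
size_t madt_put_lapic(uint8_t *buf, uint8_t uid, uint8_t apic_id, bool present)
{
    uint32_t flags = present ? 1 : 0;   /* bit 0: enabled */

    buf[0] = MADT_TYPE_LOCAL_APIC;
    buf[1] = MADT_LOCAL_APIC_LEN;
    buf[2] = uid;                       /* ACPI processor UID */
    buf[3] = apic_id;                   /* local APIC ID */
    buf[4] = (uint8_t)(flags & 0xff);   /* flags, little-endian */
    buf[5] = (uint8_t)((flags >> 8) & 0xff);
    buf[6] = (uint8_t)((flags >> 16) & 0xff);
    buf[7] = (uint8_t)((flags >> 24) & 0xff);
    return MADT_LOCAL_APIC_LEN;
}
```

Calling it once per possible CPU, with `present` false for empty slots, yields the "present but disabled" entries that the comment in the patch says Linux needs for hotplug to work.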

[Qemu-devel] [PULL 15/34] nvdimm acpi: support Get Namespace Label Data function

2016-06-23 Thread Michael S. Tsirkin

From: Xiao Guangrong 

Function 5 is used to get Namespace Label Data

Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Xiao Guangrong 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/acpi/nvdimm.c | 83 +++-
 1 file changed, 82 insertions(+), 1 deletion(-)

diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
index 4a25d8f..388d42e 100644
--- a/hw/acpi/nvdimm.c
+++ b/hw/acpi/nvdimm.c
@@ -440,6 +440,14 @@ struct NvdimmFuncGetLabelSizeOut {
 typedef struct NvdimmFuncGetLabelSizeOut NvdimmFuncGetLabelSizeOut;
 QEMU_BUILD_BUG_ON(sizeof(NvdimmFuncGetLabelSizeOut) > 4096);
 
+struct NvdimmFuncGetLabelDataIn {
+uint32_t offset; /* the offset in the namespace label data area. */
+uint32_t length; /* the size of data is to be read via the function. */
+} QEMU_PACKED;
+typedef struct NvdimmFuncGetLabelDataIn NvdimmFuncGetLabelDataIn;
+QEMU_BUILD_BUG_ON(sizeof(NvdimmFuncGetLabelDataIn) +
+  offsetof(NvdimmDsmIn, arg3) > 4096);
+
 struct NvdimmFuncGetLabelDataOut {
 /* the size of buffer filled by QEMU. */
 uint32_t len;
@@ -447,6 +455,7 @@ struct NvdimmFuncGetLabelDataOut {
 uint8_t out_buf[0]; /* the data got via Get Namesapce Label function. */
 } QEMU_PACKED;
 typedef struct NvdimmFuncGetLabelDataOut NvdimmFuncGetLabelDataOut;
+QEMU_BUILD_BUG_ON(sizeof(NvdimmFuncGetLabelDataOut) > 4096);
 
 struct NvdimmFuncSetLabelDataIn {
 uint32_t offset; /* the offset in the namespace label data area. */
@@ -542,6 +551,71 @@ static void nvdimm_dsm_label_size(NVDIMMDevice *nvdimm, hwaddr dsm_mem_addr)
   sizeof(label_size_out));
 }
 
+static uint32_t nvdimm_rw_label_data_check(NVDIMMDevice *nvdimm,
+   uint32_t offset, uint32_t length)
+{
+uint32_t ret = 3 /* Invalid Input Parameters */;
+
+if (offset + length < offset) {
+nvdimm_debug("offset %#x + length %#x is overflow.\n", offset,
+ length);
+return ret;
+}
+
+if (nvdimm->label_size < offset + length) {
+nvdimm_debug("position %#x is beyond label data (len = %" PRIx64 ").\n",
+ offset + length, nvdimm->label_size);
+return ret;
+}
+
+if (length > nvdimm_get_max_xfer_label_size()) {
+nvdimm_debug("length (%#x) is larger than max_xfer (%#x).\n",
+ length, nvdimm_get_max_xfer_label_size());
+return ret;
+}
+
+return 0 /* Success */;
+}
+
+/*
+ * DSM Spec Rev1 4.5 Get Namespace Label Data (Function Index 5).
+ */
+static void nvdimm_dsm_get_label_data(NVDIMMDevice *nvdimm, NvdimmDsmIn *in,
+  hwaddr dsm_mem_addr)
+{
+NVDIMMClass *nvc = NVDIMM_GET_CLASS(nvdimm);
+NvdimmFuncGetLabelDataIn *get_label_data;
+NvdimmFuncGetLabelDataOut *get_label_data_out;
+uint32_t status;
+int size;
+
+get_label_data = (NvdimmFuncGetLabelDataIn *)in->arg3;
+le32_to_cpus(&get_label_data->offset);
+le32_to_cpus(&get_label_data->length);
+
+nvdimm_debug("Read Label Data: offset %#x length %#x.\n",
+ get_label_data->offset, get_label_data->length);
+
+status = nvdimm_rw_label_data_check(nvdimm, get_label_data->offset,
+get_label_data->length);
+if (status != 0 /* Success */) {
+nvdimm_dsm_no_payload(status, dsm_mem_addr);
+return;
+}
+
+size = sizeof(*get_label_data_out) + get_label_data->length;
+assert(size <= 4096);
+get_label_data_out = g_malloc(size);
+
+get_label_data_out->len = cpu_to_le32(size);
+get_label_data_out->func_ret_status = cpu_to_le32(0 /* Success */);
+nvc->read_label_data(nvdimm, get_label_data_out->out_buf,
+ get_label_data->length, get_label_data->offset);
+
+cpu_physical_memory_write(dsm_mem_addr, get_label_data_out, size);
+g_free(get_label_data_out);
+}
+
 static void nvdimm_dsm_device(NvdimmDsmIn *in, hwaddr dsm_mem_addr)
 {
 NVDIMMDevice *nvdimm = nvdimm_get_device_by_handle(in->handle);
@@ -554,7 +628,8 @@ static void nvdimm_dsm_device(NvdimmDsmIn *in, hwaddr dsm_mem_addr)
 supported_func |= 0x1 /* Bit 0 indicates whether there is
  support for any functions other
  than function 0. */ |
-  1 << 4 /* Get Namespace Label Size */;
+  1 << 4 /* Get Namespace Label Size */ |
+  1 << 5 /* Get Namespace Label Data */;
 }
 nvdimm_dsm_function0(supported_func, dsm_mem_addr);
 return;
@@ -574,6 +649,12 @@ static void nvdimm_dsm_device(NvdimmDsmIn *in, hwaddr dsm_mem_addr)
 return;
 }
 break;
+case 5 /* Get Namespace Label Data */:
+if (nvdimm->label_size) {
+nvdimm_dsm_get_label_data(
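
The three tests in nvdimm_rw_label_data_check() can be tried in isolation. The following is only a sketch: the function name label_rw_check and the enum names are hypothetical, and label_size and max_xfer are passed in explicitly, whereas the patch reads them from the NVDIMMDevice and from nvdimm_get_max_xfer_label_size().

```c
#include <assert.h>
#include <stdint.h>

/* Status codes as named in the patch comments. */
enum {
    DSM_STATUS_SUCCESS = 0,
    DSM_STATUS_INVALID_INPUT = 3,
};

/*
 * Hypothetical stand-alone version of the range check.  The caller supplies
 * the label area size and the transfer limit directly.
 */
static uint32_t label_rw_check(uint64_t label_size, uint32_t max_xfer,
                               uint32_t offset, uint32_t length)
{
    uint32_t end = offset + length;   /* wraps modulo 2^32 on overflow */

    /* A wrapped sum is smaller than its first operand. */
    if (end < offset) {
        return DSM_STATUS_INVALID_INPUT;
    }

    /* The requested range must lie inside the label data area. */
    if (label_size < end) {
        return DSM_STATUS_INVALID_INPUT;
    }

    /* One request may not exceed what fits in the DSM memory page. */
    if (length > max_xfer) {
        return DSM_STATUS_INVALID_INPUT;
    }

    return DSM_STATUS_SUCCESS;
}
```

The overflow test has to come first: without it, a large offset plus a small length could wrap to a small value and pass the second test.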

[Qemu-devel] [PULL 04/34] bios: Add tests for the IPMI ACPI and SMBIOS entries

2016-06-23 Thread Michael S. Tsirkin

From: Corey Minyard 

Signed-off-by: Corey Minyard 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 tests/bios-tables-test.c |  60 ---
 tests/acpi-test-data/pc/DSDT.ipmikcs | Bin 0 -> 5575 bytes
 tests/acpi-test-data/q35/DSDT.ipmibt | Bin 0 -> 8340 bytes
 3 files changed, 56 insertions(+), 4 deletions(-)
 create mode 100644 tests/acpi-test-data/pc/DSDT.ipmikcs
 create mode 100644 tests/acpi-test-data/q35/DSDT.ipmibt

diff --git a/tests/bios-tables-test.c b/tests/bios-tables-test.c
index 16d11aa..92c90dd 100644
--- a/tests/bios-tables-test.c
+++ b/tests/bios-tables-test.c
@@ -49,6 +49,8 @@ typedef struct {
 GArray *tables;
 uint32_t smbios_ep_addr;
 struct smbios_21_entry_point smbios_ep_table;
+uint8_t *required_struct_types;
+int required_struct_types_len;
 } test_data;
 
 #define ACPI_READ_FIELD(field, addr)   \
@@ -334,7 +336,7 @@ static void test_acpi_tables(test_data *data)
 for (i = 0; i < tables_nr; i++) {
 AcpiSdtTable ssdt_table;
 
-memset(&ssdt_table, 0 , sizeof(ssdt_table));
+memset(&ssdt_table, 0, sizeof(ssdt_table));
 uint32_t addr = data->rsdt_tables_addr[i + 1]; /* fadt is first */
 test_dst_table(&ssdt_table, addr);
 g_array_append_val(data->tables, ssdt_table);
@@ -661,7 +663,6 @@ static void test_smbios_structs(test_data *data)
 uint32_t addr = ep_table->structure_table_address;
 int i, len, max_len = 0;
 uint8_t type, prv, crt;
-uint8_t required_struct_types[] = {0, 1, 3, 4, 16, 17, 19, 32, 127};
 
 /* walk the smbios tables */
 for (i = 0; i < ep_table->number_of_structures; i++) {
@@ -701,8 +702,8 @@ static void test_smbios_structs(test_data *data)
 g_assert_cmpuint(ep_table->max_structure_size, ==, max_len);
 
 /* required struct types must all be present */
-for (i = 0; i < ARRAY_SIZE(required_struct_types); i++) {
-g_assert(test_bit(required_struct_types[i], struct_bitmap));
+for (i = 0; i < data->required_struct_types_len; i++) {
+g_assert(test_bit(data->required_struct_types[i], struct_bitmap));
 }
 }
 
@@ -742,6 +743,10 @@ static void test_acpi_one(const char *params, test_data *data)
 g_free(args);
 }
 
+static uint8_t base_required_struct_types[] = {
+0, 1, 3, 4, 16, 17, 19, 32, 127
+};
+
 static void test_acpi_piix4_tcg(void)
 {
 test_data data;
@@ -751,6 +756,8 @@ static void test_acpi_piix4_tcg(void)
  */
 memset(&data, 0, sizeof(data));
 data.machine = MACHINE_PC;
+data.required_struct_types = base_required_struct_types;
+data.required_struct_types_len = ARRAY_SIZE(base_required_struct_types);
 test_acpi_one("-machine accel=tcg", &data);
 free_test_data(&data);
 }
@@ -762,6 +769,8 @@ static void test_acpi_piix4_tcg_bridge(void)
 memset(&data, 0, sizeof(data));
 data.machine = MACHINE_PC;
 data.variant = ".bridge";
+data.required_struct_types = base_required_struct_types;
+data.required_struct_types_len = ARRAY_SIZE(base_required_struct_types);
 test_acpi_one("-machine accel=tcg -device pci-bridge,chassis_nr=1", &data);
 free_test_data(&data);
 }
@@ -772,6 +781,8 @@ static void test_acpi_q35_tcg(void)
 
 memset(&data, 0, sizeof(data));
 data.machine = MACHINE_Q35;
+data.required_struct_types = base_required_struct_types;
+data.required_struct_types_len = ARRAY_SIZE(base_required_struct_types);
 test_acpi_one("-machine q35,accel=tcg", &data);
 free_test_data(&data);
 }
@@ -783,11 +794,50 @@ static void test_acpi_q35_tcg_bridge(void)
 memset(&data, 0, sizeof(data));
 data.machine = MACHINE_Q35;
 data.variant = ".bridge";
+data.required_struct_types = base_required_struct_types;
+data.required_struct_types_len = ARRAY_SIZE(base_required_struct_types);
 test_acpi_one("-machine q35,accel=tcg -device pci-bridge,chassis_nr=1",
   &data);
 free_test_data(&data);
 }
 
+static uint8_t ipmi_required_struct_types[] = {
+0, 1, 3, 4, 16, 17, 19, 32, 38, 127
+};
+
+static void test_acpi_q35_tcg_ipmi(void)
+{
+test_data data;
+
+memset(&data, 0, sizeof(data));
+data.machine = MACHINE_Q35;
+data.variant = ".ipmibt";
+data.required_struct_types = ipmi_required_struct_types;
+data.required_struct_types_len = ARRAY_SIZE(ipmi_required_struct_types);
+test_acpi_one("-machine q35,accel=tcg -device ipmi-bmc-sim,id=bmc0"
+  " -device isa-ipmi-bt,bmc=bmc0",
+  &data);
+free_test_data(&data);
+}
+
+static void test_acpi_piix4_tcg_ipmi(void)
+{
+test_data data;
+
+/* Supplying -machine accel argument overrides the default (qtest).
+ * This is to make guest actually run.
+ */
+memset(&data, 0, sizeof(data));
+data.machine = MACHINE_PC;
+data.variant = ".ipmikcs";
+data.required_struct_types = ipmi_required_struct_types;
+data.required_struct_types_len = ARRAY_SI
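
The per-test list of required SMBIOS structure types boils down to a bitmap lookup. Below is a minimal sketch of that check, assuming a plain unsigned long bitmap; mark_type, has_type and all_required_present are hypothetical helpers, not the functions used in tests/bios-tables-test.c.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SMBIOS_TYPES  256   /* the structure type field is 8 bits wide */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define BITMAP_WORDS  ((SMBIOS_TYPES + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* Record that a structure of the given type was seen while walking. */
static void mark_type(unsigned long *bitmap, uint8_t type)
{
    bitmap[type / BITS_PER_WORD] |= 1UL << (type % BITS_PER_WORD);
}

static bool has_type(const unsigned long *bitmap, uint8_t type)
{
    return (bitmap[type / BITS_PER_WORD] >> (type % BITS_PER_WORD)) & 1UL;
}

/* True only if every entry of required[] was seen. */
static bool all_required_present(const unsigned long *bitmap,
                                 const uint8_t *required, size_t n)
{
    assert(bitmap != NULL && required != NULL);
    for (size_t i = 0; i < n; i++) {
        if (!has_type(bitmap, required[i])) {
            return false;
        }
    }
    return true;
}
```

With the list held in test_data, the IPMI tests only have to supply a list that additionally contains type 38 (IPMI Device Information); the walking code stays unchanged.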

[Qemu-devel] [PULL 07/34] acpi: add aml_object_type

2016-06-23 Thread Michael S. Tsirkin

From: Xiao Guangrong 

Implement ObjectType, which is used by the NVDIMM _DSM method in a
later patch

Signed-off-by: Xiao Guangrong 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 include/hw/acpi/aml-build.h | 1 +
 hw/acpi/aml-build.c | 8 
 2 files changed, 9 insertions(+)

diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
index 10c09ca..7a548e1 100644
--- a/include/hw/acpi/aml-build.h
+++ b/include/hw/acpi/aml-build.h
@@ -363,6 +363,7 @@ Aml *aml_refof(Aml *arg);
 Aml *aml_derefof(Aml *arg);
 Aml *aml_sizeof(Aml *arg);
 Aml *aml_concatenate(Aml *source1, Aml *source2, Aml *target);
+Aml *aml_object_type(Aml *object);
 
 void
 build_header(BIOSLinker *linker, GArray *table_data,
diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index 874e473..c71fd16 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -1481,6 +1481,14 @@ Aml *aml_concatenate(Aml *source1, Aml *source2, Aml *target)
  target);
 }
 
+/* ACPI 1.0b: 16.2.5.4 Type 2 Opcodes Encoding: DefObjectType */
+Aml *aml_object_type(Aml *object)
+{
+Aml *var = aml_opcode(0x8E /* ObjectTypeOp */);
+aml_append(var, object);
+return var;
+}
+
 void
 build_header(BIOSLinker *linker, GArray *table_data,
  AcpiTableHeader *h, const char *sig, int len, uint8_t rev,
-- 
MST
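
For illustration, the byte stream produced by aml_object_type() is simply the ObjectTypeOp opcode followed by the encoded operand. The sketch below models that with a hypothetical fixed-size ByteBuf instead of QEMU's Aml type; buf_append and encode_object_type are made-up names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed-size byte buffer standing in for QEMU's Aml object. */
typedef struct {
    uint8_t bytes[64];
    size_t len;
} ByteBuf;

static void buf_append(ByteBuf *dst, const uint8_t *src, size_t n)
{
    assert(dst->len + n <= sizeof(dst->bytes));
    for (size_t i = 0; i < n; i++) {
        dst->bytes[dst->len++] = src[i];
    }
}

/* Emit ObjectTypeOp (0x8E) followed by the already encoded operand. */
static ByteBuf encode_object_type(const ByteBuf *object)
{
    ByteBuf out = { .len = 0 };
    const uint8_t op = 0x8E; /* ObjectTypeOp */

    buf_append(&out, &op, 1);
    buf_append(&out, object->bytes, object->len);
    return out;
}
```

Given an operand that encodes Arg0 (opcode 0x68), the result is the two bytes 0x8E 0x68.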

[Qemu-devel] [PULL 14/34] nvdimm acpi: support Get Namespace Label Size function

2016-06-23 Thread Michael S. Tsirkin

From: Xiao Guangrong 

Function 4 is used to get Namespace label size

Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Xiao Guangrong 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/acpi/nvdimm.c | 130 +--
 1 file changed, 127 insertions(+), 3 deletions(-)

diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
index 8b89285..4a25d8f 100644
--- a/hw/acpi/nvdimm.c
+++ b/hw/acpi/nvdimm.c
@@ -216,6 +216,26 @@ static uint32_t nvdimm_slot_to_dcr_index(int slot)
 return nvdimm_slot_to_spa_index(slot) + 1;
 }
 
+static NVDIMMDevice *nvdimm_get_device_by_handle(uint32_t handle)
+{
+NVDIMMDevice *nvdimm = NULL;
+GSList *list, *device_list = nvdimm_get_plugged_device_list();
+
+for (list = device_list; list; list = list->next) {
+NVDIMMDevice *nvd = list->data;
+int slot = object_property_get_int(OBJECT(nvd), PC_DIMM_SLOT_PROP,
+   NULL);
+
+if (nvdimm_slot_to_handle(slot) == handle) {
+nvdimm = nvd;
+break;
+}
+}
+
+g_slist_free(device_list);
+return nvdimm;
+}
+
 /* ACPI 6.0: 5.2.25.1 System Physical Address Range Structure */
 static void
 nvdimm_build_structure_spa(GArray *structures, DeviceState *dev)
@@ -406,6 +426,35 @@ struct NvdimmDsmFuncNoPayloadOut {
 } QEMU_PACKED;
 typedef struct NvdimmDsmFuncNoPayloadOut NvdimmDsmFuncNoPayloadOut;
 
+struct NvdimmFuncGetLabelSizeOut {
+/* the size of buffer filled by QEMU. */
+uint32_t len;
+uint32_t func_ret_status; /* return status code. */
+uint32_t label_size; /* the size of label data area. */
+/*
+ * Maximum size of the namespace label data length supported by
+ * the platform in Get/Set Namespace Label Data functions.
+ */
+uint32_t max_xfer;
+} QEMU_PACKED;
+typedef struct NvdimmFuncGetLabelSizeOut NvdimmFuncGetLabelSizeOut;
+QEMU_BUILD_BUG_ON(sizeof(NvdimmFuncGetLabelSizeOut) > 4096);
+
+struct NvdimmFuncGetLabelDataOut {
+/* the size of buffer filled by QEMU. */
+uint32_t len;
+uint32_t func_ret_status; /* return status code. */
+uint8_t out_buf[0]; /* the data got via Get Namesapce Label function. */
+} QEMU_PACKED;
+typedef struct NvdimmFuncGetLabelDataOut NvdimmFuncGetLabelDataOut;
+
+struct NvdimmFuncSetLabelDataIn {
+uint32_t offset; /* the offset in the namespace label data area. */
+uint32_t length; /* the size of data is to be written via the function. */
+uint8_t in_buf[0]; /* the data written to label data area. */
+} QEMU_PACKED;
+typedef struct NvdimmFuncSetLabelDataIn NvdimmFuncSetLabelDataIn;
+
 static void
 nvdimm_dsm_function0(uint32_t supported_func, hwaddr dsm_mem_addr)
 {
@@ -442,16 +491,91 @@ static void nvdimm_dsm_root(NvdimmDsmIn *in, hwaddr dsm_mem_addr)
 nvdimm_dsm_no_payload(1 /* Not Supported */, dsm_mem_addr);
 }
 
+/*
+ * the max transfer size is the max size transferred by both a
+ * 'Get Namespace Label Data' function and a 'Set Namespace Label Data'
+ * function.
+ */
+static uint32_t nvdimm_get_max_xfer_label_size(void)
+{
+uint32_t max_get_size, max_set_size, dsm_memory_size = 4096;
+
+/*
+ * the max data ACPI can read one time which is transferred by
+ * the response of 'Get Namespace Label Data' function.
+ */
+max_get_size = dsm_memory_size - sizeof(NvdimmFuncGetLabelDataOut);
+
+/*
+ * the max data ACPI can write one time which is transferred by
+ * 'Set Namespace Label Data' function.
+ */
+max_set_size = dsm_memory_size - offsetof(NvdimmDsmIn, arg3) -
+   sizeof(NvdimmFuncSetLabelDataIn);
+
+return MIN(max_get_size, max_set_size);
+}
+
+/*
+ * DSM Spec Rev1 4.4 Get Namespace Label Size (Function Index 4).
+ *
+ * It gets the size of Namespace Label data area and the max data size
+ * that Get/Set Namespace Label Data functions can transfer.
+ */
+static void nvdimm_dsm_label_size(NVDIMMDevice *nvdimm, hwaddr dsm_mem_addr)
+{
+NvdimmFuncGetLabelSizeOut label_size_out = {
+.len = cpu_to_le32(sizeof(label_size_out)),
+};
+uint32_t label_size, mxfer;
+
+label_size = nvdimm->label_size;
+mxfer = nvdimm_get_max_xfer_label_size();
+
+nvdimm_debug("label_size %#x, max_xfer %#x.\n", label_size, mxfer);
+
+label_size_out.func_ret_status = cpu_to_le32(0 /* Success */);
+label_size_out.label_size = cpu_to_le32(label_size);
+label_size_out.max_xfer = cpu_to_le32(mxfer);
+
+cpu_physical_memory_write(dsm_mem_addr, &label_size_out,
+  sizeof(label_size_out));
+}
+
 static void nvdimm_dsm_device(NvdimmDsmIn *in, hwaddr dsm_mem_addr)
 {
+NVDIMMDevice *nvdimm = nvdimm_get_device_by_handle(in->handle);
+
 /* See the comments in nvdimm_dsm_root(). */
 if (!in->function) {
-nvdimm_dsm_function0(0 /* No function supported other than
-  function 0 */, dsm_mem_addr);
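
The arithmetic in nvdimm_get_max_xfer_label_size() can be checked with a small stand-alone sketch. The struct DsmIn layout below (three 32-bit fields ahead of arg3) is an assumption made for the example, since NvdimmDsmIn is not shown in this patch; the other two structs mirror the ones added above but use C99 flexible array members. The packed attribute is GCC/Clang syntax.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DSM_MEMORY_SIZE 4096u

/* Assumed input layout: three 32-bit header fields, then the argument area. */
struct DsmIn {
    uint32_t handle;
    uint32_t revision;
    uint32_t function;
    uint8_t arg3[DSM_MEMORY_SIZE - 12];
} __attribute__((packed));

struct GetLabelDataOut {
    uint32_t len;
    uint32_t func_ret_status;
    uint8_t out_buf[];          /* label data follows the header */
} __attribute__((packed));

struct SetLabelDataIn {
    uint32_t offset;
    uint32_t length;
    uint8_t in_buf[];           /* label data follows the header */
} __attribute__((packed));

static uint32_t max_xfer_label_size(void)
{
    /* Reads: the whole page minus the output header. */
    uint32_t max_get = DSM_MEMORY_SIZE - sizeof(struct GetLabelDataOut);
    /* Writes: the page minus the DSM input header and the Set header. */
    uint32_t max_set = DSM_MEMORY_SIZE - offsetof(struct DsmIn, arg3) -
                       sizeof(struct SetLabelDataIn);

    assert(max_get <= DSM_MEMORY_SIZE && max_set <= DSM_MEMORY_SIZE);
    return max_get < max_set ? max_get : max_set;
}
```

Under these assumptions the read limit is 4096 - 8 = 4088 bytes and the write limit is 4096 - 12 - 8 = 4076 bytes, so the smaller write limit is what gets reported as max_xfer.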

[Qemu-devel] [PULL 19/34] docs: update ACPI CPU hotplug spec with new protocol

2016-06-23 Thread Michael S. Tsirkin

From: Igor Mammedov 

Add description of new CPU hotplug interface.

To switch from legacy mode into new mode, use the fact
that write accesses into the CPU present bitmap were never
used before and were ignored by QEMU.
So use a write as the way to switch from legacy mode.
That way pc/q35 machine starts in legacy mode and
QEMU generated ACPI tables will switch to new CPU
hotplug interface during runtime.
In case QEMU is started with legacy BIOS (that doesn't
support QEMU generated ACPI tables), legacy CPU hotplug
will remain active and could be used by BIOS built in
ACPI tables for CPU hotplug.

Signed-off-by: Igor Mammedov 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 docs/specs/acpi_cpu_hotplug.txt | 94 +++--
 1 file changed, 82 insertions(+), 12 deletions(-)

diff --git a/docs/specs/acpi_cpu_hotplug.txt b/docs/specs/acpi_cpu_hotplug.txt
index 340b751..ee219c8 100644
--- a/docs/specs/acpi_cpu_hotplug.txt
+++ b/docs/specs/acpi_cpu_hotplug.txt
@@ -4,21 +4,91 @@ QEMU<->ACPI BIOS CPU hotplug interface
 QEMU supports CPU hotplug via ACPI. This document
 describes the interface between QEMU and the ACPI BIOS.
 
-ACPI GPE block (IO ports 0xafe0-0xafe3, byte access):
--
-
-Generic ACPI GPE block. Bit 2 (GPE.2) used to notify CPU
-hot-add/remove event to ACPI BIOS, via SCI interrupt.
+ACPI BIOS GPE.2 handler is dedicated for notifying OS about CPU hot-add
+and hot-remove events.
 
+
+Legacy ACPI CPU hotplug interface registers:
+
 CPU present bitmap for:
   ICH9-LPC (IO port 0x0cd8-0xcf7, 1-byte access)
   PIIX-PM  (IO port 0xaf00-0xaf1f, 1-byte access)
+  One bit per CPU. Bit position reflects corresponding CPU APIC ID. Read-only.
+  The first DWORD in bitmap is used in write mode to switch from legacy
+  to new CPU hotplug interface, write 0 into it to do switch.
 ---
-One bit per CPU. Bit position reflects corresponding CPU APIC ID.
-Read-only.
+QEMU sets corresponding CPU bit on hot-add event and issues SCI
+with GPE.2 event set. CPU present map is read by ACPI BIOS GPE.2 handler
+to notify OS about CPU hot-add events. CPU hot-remove isn't supported.
+
+=
+ACPI CPU hotplug interface registers:
+-
+Register block base address:
+ICH9-LPC IO port 0x0cd8
+PIIX-PM  IO port 0xaf00
+Register block size:
+ACPI_CPU_HOTPLUG_REG_LEN = 12
+
+read access:
+offset:
+[0x0-0x3] reserved
+[0x4] CPU device status fields: (1 byte access)
+bits:
+   0: Device is enabled and may be used by guest
+   1: Device insert event, used to distinguish device for which
+  no device check event to OSPM was issued.
+  It's valid only when bit 0 is set.
+   2: Device remove event, used to distinguish device for which
+  no device eject request to OSPM was issued.
+   3-7: reserved and should be ignored by OSPM
+[0x5-0x7] reserved
+[0x8] Command data: (DWORD access)
+  in case of error or unsupported command reads is 0x
+  current 'Command field' value:
+  0: returns PXM value corresponding to device
+
+write access:
+offset:
+[0x0-0x3] CPU selector: (DWORD access)
+  selects active CPU device. All following accesses to other
+  registers will read/store data from/to selected CPU.
+[0x4] CPU device control fields: (1 byte access)
+bits:
+0: reserved, OSPM must clear it before writing to register.
+1: if set to 1 clears device insert event, set by OSPM
+   after it has emitted device check event for the
+   selected CPU device
+2: if set to 1 clears device remove event, set by OSPM
+   after it has emitted device eject request for the
+   selected CPU device
+3: if set to 1 initiates device eject, set by OSPM when it
+   triggers CPU device removal and calls _EJ0 method
+4-7: reserved, OSPM must clear them before writing to register
+[0x5] Command field: (1 byte access)
+  value:
+0: selects a CPU device with inserting/removing events and
+   following reads from 'Command data' register return
+   selected CPU (CPU selector value). If no CPU with events
+   found, the current CPU selector doesn't change and
+   corresponding insert/remove event flags are not set.
+1: following writes to 'Command data' register set OST event
+   register in QEMU
+2: following writes to 'Command data' register set OST status
+   register in QEMU
+other values: reserved
+[0x6-0x7] reserved
+[0x8] Command data: (DW
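
As a reading aid for the register layout above, here is a small sketch of how the status byte at offset 0x4 could be decoded and acknowledged. All identifiers (CPHP_*, CpuStatus, decode_cpu_status, ack_value) are hypothetical and chosen for this example only; they are not taken from QEMU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Offsets inside the 12-byte register block, as listed in the spec text. */
enum {
    CPHP_REG_SELECTOR = 0x0,  /* write: CPU selector (DWORD) */
    CPHP_REG_FLAGS    = 0x4,  /* read: status, write: control (1 byte) */
    CPHP_REG_CMD      = 0x5,  /* write: command field (1 byte) */
    CPHP_REG_CMD_DATA = 0x8,  /* command data (DWORD) */
};

/* Bits of the byte at offset 0x4. */
enum {
    CPHP_FLAG_ENABLED = 1u << 0,
    CPHP_FLAG_INSERT  = 1u << 1,
    CPHP_FLAG_REMOVE  = 1u << 2,
};

typedef struct {
    bool enabled;
    bool insert_pending;
    bool remove_pending;
} CpuStatus;

static CpuStatus decode_cpu_status(uint8_t flags)
{
    CpuStatus st;

    st.enabled = (flags & CPHP_FLAG_ENABLED) != 0;
    /* The insert event is valid only when bit 0 is set. */
    st.insert_pending = st.enabled && (flags & CPHP_FLAG_INSERT) != 0;
    st.remove_pending = (flags & CPHP_FLAG_REMOVE) != 0;
    /* Bits 3-7 are reserved on reads and are ignored here. */
    return st;
}

/* Control byte that clears the pending events; bit 0 and bits 3-7 stay 0. */
static uint8_t ack_value(CpuStatus st)
{
    uint8_t v = 0;

    if (st.insert_pending) {
        v |= CPHP_FLAG_INSERT;
    }
    if (st.remove_pending) {
        v |= CPHP_FLAG_REMOVE;
    }
    assert((v & ~(CPHP_FLAG_INSERT | CPHP_FLAG_REMOVE)) == 0);
    return v;
}
```

The sketch never sets bit 3 (eject) in the acknowledge value, because the text reserves that bit for the _EJ0 path.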

[Qemu-devel] [PULL 05/34] pc-dimm: introduce get_vmstate_memory_region callback

2016-06-23 Thread Michael S. Tsirkin

From: Xiao Guangrong 

This callback returns the MemoryRegion holding the dimm memory that
should be kept during live migration

An nvdimm device differs from a pc-dimm in that its memory includes not
only the MemoryRegion directly mapped into the guest's address space but
also the memory used as label data

Signed-off-by: Xiao Guangrong 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 include/hw/mem/pc-dimm.h |  5 -
 hw/mem/pc-dimm.c | 14 --
 2 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/include/hw/mem/pc-dimm.h b/include/hw/mem/pc-dimm.h
index 67e92d8..1e483f2 100644
--- a/include/hw/mem/pc-dimm.h
+++ b/include/hw/mem/pc-dimm.h
@@ -61,7 +61,9 @@ typedef struct PCDIMMDevice {
  * @realize: called after common dimm is realized so that the dimm based
  * devices get the chance to do specified operations.
  * @get_memory_region: returns #MemoryRegion associated with @dimm which
- * is directly mapped into the physical address space of guest
+ * is directly mapped into the physical address space of guest.
+ * @get_vmstate_memory_region: returns #MemoryRegion which indicates the
+ * memory of @dimm should be kept during live migration.
  */
 typedef struct PCDIMMDeviceClass {
 /* private */
@@ -70,6 +72,7 @@ typedef struct PCDIMMDeviceClass {
 /* public */
 void (*realize)(PCDIMMDevice *dimm, Error **errp);
 MemoryRegion *(*get_memory_region)(PCDIMMDevice *dimm);
+MemoryRegion *(*get_vmstate_memory_region)(PCDIMMDevice *dimm);
 } PCDIMMDeviceClass;
 
 /**
diff --git a/hw/mem/pc-dimm.c b/hw/mem/pc-dimm.c
index 6de2275..249193a 100644
--- a/hw/mem/pc-dimm.c
+++ b/hw/mem/pc-dimm.c
@@ -40,6 +40,8 @@ void pc_dimm_memory_plug(DeviceState *dev, MemoryHotplugState *hpms,
 int slot;
 MachineState *machine = MACHINE(qdev_get_machine());
 PCDIMMDevice *dimm = PC_DIMM(dev);
+PCDIMMDeviceClass *ddc = PC_DIMM_GET_CLASS(dimm);
+MemoryRegion *vmstate_mr = ddc->get_vmstate_memory_region(dimm);
 Error *local_err = NULL;
 uint64_t existing_dimms_capacity = 0;
 uint64_t addr;
@@ -105,7 +107,7 @@ void pc_dimm_memory_plug(DeviceState *dev, MemoryHotplugState *hpms,
 }
 
 memory_region_add_subregion(&hpms->mr, addr - hpms->base, mr);
-vmstate_register_ram(mr, dev);
+vmstate_register_ram(vmstate_mr, dev);
 numa_set_mem_node_id(addr, memory_region_size(mr), dimm->node);
 
 out:
@@ -116,10 +118,12 @@ void pc_dimm_memory_unplug(DeviceState *dev, MemoryHotplugState *hpms,
MemoryRegion *mr)
 {
 PCDIMMDevice *dimm = PC_DIMM(dev);
+PCDIMMDeviceClass *ddc = PC_DIMM_GET_CLASS(dimm);
+MemoryRegion *vmstate_mr = ddc->get_vmstate_memory_region(dimm);
 
 numa_unset_mem_node_id(dimm->addr, memory_region_size(mr), dimm->node);
 memory_region_del_subregion(&hpms->mr, mr);
-vmstate_unregister_ram(mr, dev);
+vmstate_unregister_ram(vmstate_mr, dev);
 }
 
 static int pc_existing_dimms_capacity_internal(Object *obj, void *opaque)
@@ -424,6 +428,11 @@ static MemoryRegion *pc_dimm_get_memory_region(PCDIMMDevice *dimm)
 return host_memory_backend_get_memory(dimm->hostmem, &error_abort);
 }
 
+static MemoryRegion *pc_dimm_get_vmstate_memory_region(PCDIMMDevice *dimm)
+{
+return host_memory_backend_get_memory(dimm->hostmem, &error_abort);
+}
+
 static void pc_dimm_class_init(ObjectClass *oc, void *data)
 {
 DeviceClass *dc = DEVICE_CLASS(oc);
@@ -434,6 +443,7 @@ static void pc_dimm_class_init(ObjectClass *oc, void *data)
 dc->desc = "DIMM memory module";
 
 ddc->get_memory_region = pc_dimm_get_memory_region;
+ddc->get_vmstate_memory_region = pc_dimm_get_vmstate_memory_region;
 }
 
 static TypeInfo pc_dimm_info = {
-- 
MST
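
The point of the second callback is that the region mapped into the guest and the region registered for migration may differ. A minimal sketch of that idea follows, with hypothetical stand-in types (Region, Dimm, DimmClass) rather than the QOM classes; the nvdimm-like override is an assumption based on the commit message, not code from this patch.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for MemoryRegion. */
typedef struct Region {
    const char *name;
} Region;

typedef struct Dimm Dimm;

typedef struct DimmClass {
    Region *(*get_memory_region)(Dimm *dimm);
    Region *(*get_vmstate_memory_region)(Dimm *dimm);
} DimmClass;

struct Dimm {
    const DimmClass *klass;
    Region backend;     /* whole backing memory */
    Region guest_view;  /* part mapped into the guest address space */
};

/* Plain DIMM: the same region is mapped and migrated. */
static Region *plain_get_region(Dimm *d)
{
    return &d->backend;
}

/* NVDIMM-like device (assumed): map a sub-region, migrate everything. */
static Region *nv_get_region(Dimm *d)
{
    return &d->guest_view;
}

static Region *nv_get_vmstate_region(Dimm *d)
{
    return &d->backend;
}

static const DimmClass plain_class = { plain_get_region, plain_get_region };
static const DimmClass nvdimm_like_class = { nv_get_region,
                                             nv_get_vmstate_region };

/* What the plug path would hand to the migration registration. */
static Region *region_to_migrate(Dimm *d)
{
    assert(d->klass->get_vmstate_memory_region != NULL);
    return d->klass->get_vmstate_memory_region(d);
}
```

For the plain class both callbacks return the same region, which matches the default added to pc_dimm_class_init() above.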

[Qemu-devel] [PULL 08/34] acpi: add aml_call5

2016-06-23 Thread Michael S. Tsirkin

From: Xiao Guangrong 

It will be used by NVDIMM ACPI

Signed-off-by: Xiao Guangrong 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 include/hw/acpi/aml-build.h |  2 ++
 hw/acpi/aml-build.c | 14 ++
 2 files changed, 16 insertions(+)

diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
index 7a548e1..e7a1a4c 100644
--- a/include/hw/acpi/aml-build.h
+++ b/include/hw/acpi/aml-build.h
@@ -277,6 +277,8 @@ Aml *aml_call1(const char *method, Aml *arg1);
 Aml *aml_call2(const char *method, Aml *arg1, Aml *arg2);
 Aml *aml_call3(const char *method, Aml *arg1, Aml *arg2, Aml *arg3);
 Aml *aml_call4(const char *method, Aml *arg1, Aml *arg2, Aml *arg3, Aml *arg4);
+Aml *aml_call5(const char *method, Aml *arg1, Aml *arg2, Aml *arg3, Aml *arg4,
+   Aml *arg5);
 Aml *aml_gpio_int(AmlConsumerAndProducer con_and_pro,
   AmlLevelAndEdge edge_level,
   AmlActiveHighAndLow active_level, AmlShared shared,
diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index c71fd16..db3e914 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -660,6 +660,20 @@ Aml *aml_call4(const char *method, Aml *arg1, Aml *arg2, Aml *arg3, Aml *arg4)
 return var;
 }
 
+/* helper to call method with 5 arguments */
+Aml *aml_call5(const char *method, Aml *arg1, Aml *arg2, Aml *arg3, Aml *arg4,
+   Aml *arg5)
+{
+Aml *var = aml_alloc();
+build_append_namestring(var->buf, "%s", method);
+aml_append(var, arg1);
+aml_append(var, arg2);
+aml_append(var, arg3);
+aml_append(var, arg4);
+aml_append(var, arg5);
+return var;
+}
+
 /*
  * ACPI 5.0: 6.4.3.8.1 GPIO Connection Descriptor
  * Type 1, Large Item Name 0xC
-- 
MST

[Qemu-devel] [PULL 11/34] nvdimm acpi: check UUID

2016-06-23 Thread Michael S. Tsirkin

From: Xiao Guangrong 

Check arg0 which indicates UUID to see if it is valid

Signed-off-by: Xiao Guangrong 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/acpi/nvdimm.c | 32 ++--
 1 file changed, 26 insertions(+), 6 deletions(-)

diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
index 95504e9..b01f2c6 100644
--- a/hw/acpi/nvdimm.c
+++ b/hw/acpi/nvdimm.c
@@ -487,19 +487,39 @@ void nvdimm_init_acpi_state(AcpiNVDIMMState *state, MemoryRegion *io,
 
 static void nvdimm_build_common_dsm(Aml *dev)
 {
-Aml *method, *ifctx, *function, *dsm_mem, *unpatched, *result_size;
+Aml *method, *ifctx, *function, *handle, *uuid, *dsm_mem, *result_size;
+Aml *elsectx, *unsupport, *unpatched, *expected_uuid, *uuid_invalid;
 Aml *pckg, *pckg_index, *pckg_buf;
 uint8_t byte_list[1];
 
 method = aml_method(NVDIMM_COMMON_DSM, 5, AML_SERIALIZED);
+uuid = aml_arg(0);
 function = aml_arg(2);
+handle = aml_arg(4);
 dsm_mem = aml_name(NVDIMM_ACPI_MEM_ADDR);
 
 /*
  * do not support any method if DSM memory address has not been
  * patched.
  */
-unpatched = aml_if(aml_equal(dsm_mem, aml_int(0x0)));
+unpatched = aml_equal(dsm_mem, aml_int(0x0));
+
+expected_uuid = aml_local(0);
+
+ifctx = aml_if(aml_equal(handle, aml_int(0x0)));
+aml_append(ifctx, aml_store(
+   aml_touuid("2F10E7A4-9E91-11E4-89D3-123B93F75CBA")
+   /* UUID for NVDIMM Root Device */, expected_uuid));
+aml_append(method, ifctx);
+elsectx = aml_else();
+aml_append(elsectx, aml_store(
+   aml_touuid("4309AC30-0D11-11E4-9191-0800200C9A66")
+   /* UUID for NVDIMM Devices */, expected_uuid));
+aml_append(method, elsectx);
+
+uuid_invalid = aml_lnot(aml_equal(uuid, expected_uuid));
+
+unsupport = aml_if(aml_or(unpatched, uuid_invalid, NULL));
 
 /*
  * function 0 is called to inquire what functions are supported by
@@ -508,19 +528,19 @@ static void nvdimm_build_common_dsm(Aml *dev)
 ifctx = aml_if(aml_equal(function, aml_int(0)));
 byte_list[0] = 0 /* No function Supported */;
 aml_append(ifctx, aml_return(aml_buffer(1, byte_list)));
-aml_append(unpatched, ifctx);
+aml_append(unsupport, ifctx);
 
 /* No function is supported yet. */
 byte_list[0] = 1 /* Not Supported */;
-aml_append(unpatched, aml_return(aml_buffer(1, byte_list)));
-aml_append(method, unpatched);
+aml_append(unsupport, aml_return(aml_buffer(1, byte_list)));
+aml_append(method, unsupport);
 
 /*
  * The HDLE indicates the DSM function is issued from which device,
  * it reserves 0 for root device and is the handle for NVDIMM devices.
  * See the comments in nvdimm_slot_to_handle().
  */
-aml_append(method, aml_store(aml_arg(4), aml_name("HDLE")));
+aml_append(method, aml_store(handle, aml_name("HDLE")));
 aml_append(method, aml_store(aml_arg(1), aml_name("REVS")));
 aml_append(method, aml_store(aml_arg(2), aml_name("FUNC")));
 
-- 
MST
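
Expressed as ordinary C rather than generated AML, the decision made at the top of the common _DSM method looks roughly like the sketch below. The function names are hypothetical, and the sketch compares UUID strings case-sensitively, whereas the AML compares the binary buffers produced by aml_touuid().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Handle 0 is the root device; any other handle is an NVDIMM device. */
static const char *expected_uuid_for(uint32_t handle)
{
    return handle == 0 ? "2F10E7A4-9E91-11E4-89D3-123B93F75CBA"
                       : "4309AC30-0D11-11E4-9191-0800200C9A66";
}

/* True when the call must be rejected before reaching QEMU. */
static bool dsm_call_unsupported(uint64_t dsm_mem_addr, const char *uuid,
                                 uint32_t handle)
{
    bool unpatched, uuid_invalid;

    assert(uuid != NULL);
    unpatched = dsm_mem_addr == 0;
    uuid_invalid = strcmp(uuid, expected_uuid_for(handle)) != 0;
    return unpatched || uuid_invalid;
}

/* One-byte reply used on the rejected path. */
static uint8_t unsupported_reply(uint32_t function)
{
    /* Function 0: "no function supported"; otherwise "Not Supported". */
    return function == 0 ? 0 : 1;
}
```

A device handle that presents the root UUID is rejected in the same way as a call made before the DSM memory address has been patched.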

[Qemu-devel] [PULL 09/34] nvdimm acpi: set HDLE properly

2016-06-23 Thread Michael S. Tsirkin

From: Xiao Guangrong 

Now we pass HDLE to QEMU properly: use 0 for the root device and the
handle for nvdimm devices

Signed-off-by: Xiao Guangrong 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/acpi/nvdimm.c | 22 --
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
index b4c2262..14355f8 100644
--- a/hw/acpi/nvdimm.c
+++ b/hw/acpi/nvdimm.c
@@ -490,7 +490,7 @@ static void nvdimm_build_common_dsm(Aml *dev)
 Aml *method, *ifctx, *function, *dsm_mem, *unpatched, *result_size;
 uint8_t byte_list[1];
 
-method = aml_method(NVDIMM_COMMON_DSM, 4, AML_SERIALIZED);
+method = aml_method(NVDIMM_COMMON_DSM, 5, AML_SERIALIZED);
 function = aml_arg(2);
 dsm_mem = aml_name(NVDIMM_ACPI_MEM_ADDR);
 
@@ -516,11 +516,10 @@ static void nvdimm_build_common_dsm(Aml *dev)
 
 /*
  * The HDLE indicates the DSM function is issued from which device,
- * it is not used at this time as no function is supported yet.
- * Currently we make it always be 0 for all the devices and will set
- * the appropriate value once real function is implemented.
+ * it reserves 0 for root device and is the handle for NVDIMM devices.
+ * See the comments in nvdimm_slot_to_handle().
  */
-aml_append(method, aml_store(aml_int(0x0), aml_name("HDLE")));
+aml_append(method, aml_store(aml_arg(4), aml_name("HDLE")));
 aml_append(method, aml_store(aml_arg(1), aml_name("REVS")));
 aml_append(method, aml_store(aml_arg(2), aml_name("FUNC")));
 
@@ -542,13 +541,14 @@ static void nvdimm_build_common_dsm(Aml *dev)
 aml_append(dev, method);
 }
 
-static void nvdimm_build_device_dsm(Aml *dev)
+static void nvdimm_build_device_dsm(Aml *dev, uint32_t handle)
 {
 Aml *method;
 
 method = aml_method("_DSM", 4, AML_NOTSERIALIZED);
-aml_append(method, aml_return(aml_call4(NVDIMM_COMMON_DSM, aml_arg(0),
-  aml_arg(1), aml_arg(2), aml_arg(3))));
+aml_append(method, aml_return(aml_call5(NVDIMM_COMMON_DSM, aml_arg(0),
+  aml_arg(1), aml_arg(2), aml_arg(3),
+  aml_int(handle))));
 aml_append(dev, method);
 }
 
@@ -573,7 +573,7 @@ static void nvdimm_build_nvdimm_devices(GSList *device_list, Aml *root_dev)
  */
 aml_append(nvdimm_dev, aml_name_decl("_ADR", aml_int(handle)));
 
-nvdimm_build_device_dsm(nvdimm_dev);
+nvdimm_build_device_dsm(nvdimm_dev, handle);
 aml_append(root_dev, nvdimm_dev);
 }
 }
@@ -665,7 +665,9 @@ static void nvdimm_build_ssdt(GSList *device_list, GArray *table_offsets,
 aml_append(dev, field);
 
 nvdimm_build_common_dsm(dev);
-nvdimm_build_device_dsm(dev);
+
+/* 0 is reserved for root device. */
+nvdimm_build_device_dsm(dev, 0);
 
 nvdimm_build_nvdimm_devices(device_list, dev);
 
-- 
MST

Re: [Qemu-devel] [PATCH v1 01/11] ppc/xics: Rename existing xics to xics_spapr

2016-06-23 Thread Nikunj A Dadhania

David Gibson  writes:

> diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
>> index 9091054..452a978 100644
>> --- a/include/hw/ppc/xics.h
>> +++ b/include/hw/ppc/xics.h
>> @@ -32,20 +32,24 @@
>>  #define TYPE_XICS_COMMON "xics-common"
>>  #define XICS_COMMON(obj) OBJECT_CHECK(XICSState, (obj), TYPE_XICS_COMMON)
>>  
>> -#define TYPE_XICS "xics"
>> -#define XICS(obj) OBJECT_CHECK(XICSState, (obj), TYPE_XICS)
>> +/*
>> + * Retain xics as the type name to be compatible for migration. Rest all the
>> + * functions, class and variables are renamed as xics_spapr.
>> + */
>> +#define TYPE_XICS_SPAPR "xics"
>> +#define XICS(obj) OBJECT_CHECK(XICSState, (obj), TYPE_XICS_SPAPR)
>
> This should change to XICS_SPAPR to match the TYPE macro.

Done.

>
>>  
>> -#define TYPE_KVM_XICS "xics-kvm"
>> -#define KVM_XICS(obj) OBJECT_CHECK(KVMXICSState, (obj), TYPE_KVM_XICS)
>> +#define TYPE_XICS_SPAPR_KVM "xics-spapr-kvm"
>> +#define KVM_XICS(obj) OBJECT_CHECK(KVMXICSState, (obj), TYPE_XICS_SPAPR_KVM)
>
> Likewise XICS_SPAPR_KVM().
>

Done.

Regards
Nikunj

[Qemu-devel] [PULL 12/34] nvdimm acpi: abstract the operations for root & nvdimm devices

2016-06-23 Thread Michael S. Tsirkin

From: Xiao Guangrong 

It separates the operations between root device and nvdimm devices
in order to introduce label function support for nvdimm devices

Signed-off-by: Xiao Guangrong 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/acpi/nvdimm.c | 74 ++--
 1 file changed, 56 insertions(+), 18 deletions(-)

diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
index b01f2c6..07c95c1 100644
--- a/hw/acpi/nvdimm.c
+++ b/hw/acpi/nvdimm.c
@@ -406,6 +406,55 @@ struct NvdimmDsmFuncNoPayloadOut {
 } QEMU_PACKED;
 typedef struct NvdimmDsmFuncNoPayloadOut NvdimmDsmFuncNoPayloadOut;
 
+static void
+nvdimm_dsm_function0(uint32_t supported_func, hwaddr dsm_mem_addr)
+{
+NvdimmDsmFunc0Out func0 = {
+.len = cpu_to_le32(sizeof(func0)),
+.supported_func = cpu_to_le32(supported_func),
+};
+cpu_physical_memory_write(dsm_mem_addr, &func0, sizeof(func0));
+}
+
+static void
+nvdimm_dsm_no_payload(uint32_t func_ret_status, hwaddr dsm_mem_addr)
+{
+NvdimmDsmFuncNoPayloadOut out = {
+.len = cpu_to_le32(sizeof(out)),
+.func_ret_status = cpu_to_le32(func_ret_status),
+};
+cpu_physical_memory_write(dsm_mem_addr, &out, sizeof(out));
+}
+
+static void nvdimm_dsm_root(NvdimmDsmIn *in, hwaddr dsm_mem_addr)
+{
+/*
+ * function 0 is called to inquire which functions are supported by
+ * OSPM
+ */
+if (!in->function) {
+nvdimm_dsm_function0(0 /* No function supported other than
+  function 0 */, dsm_mem_addr);
+return;
+}
+
+/* No function except function 0 is supported yet. */
+nvdimm_dsm_no_payload(1 /* Not Supported */, dsm_mem_addr);
+}
+
+static void nvdimm_dsm_device(NvdimmDsmIn *in, hwaddr dsm_mem_addr)
+{
+/* See the comments in nvdimm_dsm_root(). */
+if (!in->function) {
+nvdimm_dsm_function0(0 /* No function supported other than
+  function 0 */, dsm_mem_addr);
+return;
+}
+
+/* No function except function 0 is supported yet. */
+nvdimm_dsm_no_payload(1 /* Not Supported */, dsm_mem_addr);
+}
+
 static uint64_t
 nvdimm_dsm_read(void *opaque, hwaddr addr, unsigned size)
 {
@@ -436,26 +485,15 @@ nvdimm_dsm_write(void *opaque, hwaddr addr, uint64_t val, unsigned size)
 nvdimm_debug("Revision %#x Handler %#x Function %#x.\n", in->revision,
  in->handle, in->function);
 
-/*
- * function 0 is called to inquire which functions are supported by
- * OSPM
- */
-if (in->function == 0) {
-NvdimmDsmFunc0Out func0 = {
-.len = cpu_to_le32(sizeof(func0)),
- /* No function supported other than function 0 */
-.supported_func = cpu_to_le32(0),
-};
-cpu_physical_memory_write(dsm_mem_addr, &func0, sizeof func0);
-} else {
-/* No function except function 0 is supported yet. */
-NvdimmDsmFuncNoPayloadOut out = {
-.len = cpu_to_le32(sizeof(out)),
-.func_ret_status = cpu_to_le32(1)  /* Not Supported */,
-};
-cpu_physical_memory_write(dsm_mem_addr, &out, sizeof(out));
+ /* Handle 0 is reserved for NVDIMM Root Device. */
+if (!in->handle) {
+nvdimm_dsm_root(in, dsm_mem_addr);
+goto exit;
 }
 
+nvdimm_dsm_device(in, dsm_mem_addr);
+
+exit:
 g_free(in);
 }
 
-- 
MST
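
For readers skimming the patch above, the dispatch it introduces can be modelled outside QEMU. The following is only a minimal sketch under stated assumptions: dsm_page, the two reply structs, dsm_dispatch() and dsm_reply_word() are hypothetical stand-ins, a plain byte array replaces guest memory, and host byte order replaces the little-endian conversion done in the real code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the guest-visible DSM output page. */
static uint8_t dsm_page[4096];

/* Illustrative reply layouts (host byte order, for the sketch only). */
struct func0_out { uint32_t len; uint32_t supported_func; };
struct status_out { uint32_t len; uint32_t func_ret_status; };

enum { DSM_STATUS_NOT_SUPPORTED = 1 };

/* Function 0 reply: a bitmap of the supported functions. */
static void reply_function0(uint32_t supported_func)
{
    struct func0_out out = { (uint32_t)sizeof(out), supported_func };
    memcpy(dsm_page, &out, sizeof(out));
}

/* Reply that carries only a status code and no payload. */
static void reply_status(uint32_t status)
{
    struct status_out out = { (uint32_t)sizeof(out), status };
    memcpy(dsm_page, &out, sizeof(out));
}

static void dsm_root(uint32_t function)
{
    if (function == 0) {
        reply_function0(0); /* nothing besides function 0 yet */
        return;
    }
    reply_status(DSM_STATUS_NOT_SUPPORTED);
}

static void dsm_device(uint32_t function)
{
    if (function == 0) {
        reply_function0(0);
        return;
    }
    reply_status(DSM_STATUS_NOT_SUPPORTED);
}

/* Handle 0 is reserved for the root device; any other handle is an NVDIMM. */
void dsm_dispatch(uint32_t handle, uint32_t function)
{
    if (handle == 0) {
        dsm_root(function);
        return;
    }
    dsm_device(function);
}

/* Read back the index-th 32-bit word of the last reply. */
uint32_t dsm_reply_word(size_t index)
{
    uint32_t v;
    assert((index + 1) * sizeof(v) <= sizeof(dsm_page));
    memcpy(&v, dsm_page + index * sizeof(v), sizeof(v));
    return v;
}
```

The two handlers are identical at this point in the series; keeping them separate is what leaves room for per-device functions to be added to one side only.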

[Qemu-devel] [PULL 03/34] acpi: Add IPMI table entries

2016-06-23 Thread Michael S. Tsirkin

From: Corey Minyard 

Use the ACPI table construction tools to create an ACPI entry
for IPMI.  This adds a function called build_acpi_ipmi_devices
to add a DSDT entry for IPMI if IPMI is compiled in and an
IPMI device exists.  It also adds a dummy function if IPMI
is not compiled in.

This conforms to section "C3-2 Locating IPMI System Interfaces in
ACPI Name Space" in the IPMI 2.0 specification.

Signed-off-by: Corey Minyard 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 include/hw/acpi/ipmi.h |  22 +++
 hw/acpi/ipmi.c | 105 +
 hw/i386/acpi-build.c   |  12 ++
 stubs/ipmi.c   |  14 +++
 hw/acpi/Makefile.objs  |   1 +
 stubs/Makefile.objs|   1 +
 6 files changed, 155 insertions(+)
 create mode 100644 include/hw/acpi/ipmi.h
 create mode 100644 hw/acpi/ipmi.c
 create mode 100644 stubs/ipmi.c

diff --git a/include/hw/acpi/ipmi.h b/include/hw/acpi/ipmi.h
new file mode 100644
index 000..ab2bb29
--- /dev/null
+++ b/include/hw/acpi/ipmi.h
@@ -0,0 +1,22 @@
+/*
+ * QEMU IPMI ACPI handling
+ *
+ * Copyright (c) 2015,2016 Corey Minyard 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+#ifndef HW_ACPI_IPMI_H
+#define HW_ACPI_IPMI_H
+
+#include "qemu/osdep.h"
+#include "hw/acpi/aml-build.h"
+
+/*
+ * Add ACPI IPMI entries for all registered IPMI devices whose parent
+ * bus matches the given bus.  The resource is the ACPI resource that
+ * contains the IPMI device, this is required for the I2C CRS.
+ */
+void build_acpi_ipmi_devices(Aml *table, BusState *bus);
+
+#endif /* HW_ACPI_IPMI_H */
diff --git a/hw/acpi/ipmi.c b/hw/acpi/ipmi.c
new file mode 100644
index 000..7e74ce4
--- /dev/null
+++ b/hw/acpi/ipmi.c
@@ -0,0 +1,105 @@
+/*
+ * IPMI ACPI firmware handling
+ *
+ * Copyright (c) 2015,2016 Corey Minyard, MontaVista Software, LLC
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "hw/ipmi/ipmi.h"
+#include "hw/acpi/aml-build.h"
+#include "hw/acpi/acpi.h"
+#include "hw/acpi/ipmi.h"
+
+static Aml *aml_ipmi_crs(IPMIFwInfo *info)
+{
+Aml *crs = aml_resource_template();
+
+/*
+ * The base address is fixed and cannot change.  That may be different
+ * if someone does PCI, but we aren't there yet.
+ */
+switch (info->memspace) {
+case IPMI_MEMSPACE_IO:
+aml_append(crs, aml_io(AML_DECODE16, info->base_address,
+   info->base_address + info->register_length - 1,
+   info->register_spacing, info->register_length));
+break;
+case IPMI_MEMSPACE_MEM32:
+aml_append(crs,
+   aml_dword_memory(AML_POS_DECODE,
+AML_MIN_FIXED, AML_MAX_FIXED,
+AML_NON_CACHEABLE, AML_READ_WRITE,
+0x,
+info->base_address,
+info->base_address + info->register_length - 1,
+info->register_spacing, info->register_length));
+break;
+case IPMI_MEMSPACE_MEM64:
+aml_append(crs,
+   aml_qword_memory(AML_POS_DECODE,
+AML_MIN_FIXED, AML_MAX_FIXED,
+AML_NON_CACHEABLE, AML_READ_WRITE,
+0xULL,
+info->base_address,
+info->base_address + info->register_length - 1,
+info->register_spacing, info->register_length));
+break;
+case IPMI_MEMSPACE_SMBUS:
+aml_append(crs, aml_return(aml_int(info->base_address)));
+break;
+default:
+abort();
+}
+
+if (info->interrupt_number) {
+aml_append(crs, aml_irq_no_flags(info->interrupt_number));
+}
+
+return crs;
+}
+
+static Aml *aml_ipmi_device(IPMIFwInfo *info)
+{
+Aml *dev;
+uint16_t version = ((info->ipmi_spec_major_revision << 8)
+| (info->ipmi_spec_minor_revision << 4));
+
+assert(info->ipmi_spec_minor_revision <= 15);
+
+dev = aml_device("MI%d", info->uuid);
+aml_append(dev, aml_name_decl("_HID", aml_eisaid("IPI0001")));
+aml_append(dev, aml_name_decl("_STR", aml_string("ipmi_%s",
+ info->interface_name)));
+aml_append(dev, aml_name_decl("_UID", aml_int(info->uuid)));
+aml_append(dev, aml_name_decl("_CRS", aml_ipmi_crs(info)));
+aml_append(dev, aml_name_decl("_IFT", aml_int(info->interface_type)));
+aml_append(dev, aml_name_decl("_SRV", aml_int(version)));
+
+return dev;
+}
+
+void build_acpi_ipmi_devices(Aml *scope, BusState *bus)
+{
+
+BusChild *kid;
+
+QTAILQ_FOREACH(kid, &bus->children,  sibling
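
The _SRV value built in aml_ipmi_device() above packs the IPMI specification revision into 16 bits. A minimal standalone sketch of that packing follows; the function name ipmi_acpi_srv() is hypothetical and only mirrors the expression used in the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Pack an IPMI spec revision the way the patch computes "version":
 * major revision in bits 15..8, minor revision in bits 7..4.
 */
uint16_t ipmi_acpi_srv(uint8_t major, uint8_t minor)
{
    assert(minor <= 15); /* the minor revision must fit in one nibble */
    return (uint16_t)((major << 8) | (minor << 4));
}
```

With this packing, revision 2.0 becomes 0x0200 and revision 1.5 becomes 0x0150.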

[Qemu-devel] [PULL 06/34] nvdimm: support nvdimm label

2016-06-23 Thread Michael S. Tsirkin

From: Xiao Guangrong 

Introduce a parameter, 'label-size', which is the size of the nvdimm label
data area reserved at the end of backend memory. It must be at least
128KB.

Two callbacks, read_label_data() and write_label_data(), are used to
operate on the label area.

Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Xiao Guangrong 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 include/hw/mem/nvdimm.h |  55 +++-
 hw/mem/nvdimm.c | 132 
 2 files changed, 186 insertions(+), 1 deletion(-)

diff --git a/include/hw/mem/nvdimm.h b/include/hw/mem/nvdimm.h
index 60ee92b..1cfe9e0 100644
--- a/include/hw/mem/nvdimm.h
+++ b/include/hw/mem/nvdimm.h
@@ -34,7 +34,60 @@
 } \
 } while (0)
 
-#define TYPE_NVDIMM "nvdimm"
+/*
+ * The minimum label data size is required by NVDIMM Namespace
+ * specification, see the chapter 2 Namespaces:
+ *   "NVDIMMs following the NVDIMM Block Mode Specification use an area
+ *at least 128KB in size, which holds around 1000 labels."
+ */
+#define MIN_NAMESPACE_LABEL_SIZE  (128UL << 10)
+
+#define TYPE_NVDIMM  "nvdimm"
+#define NVDIMM(obj)  OBJECT_CHECK(NVDIMMDevice, (obj), TYPE_NVDIMM)
+#define NVDIMM_CLASS(oc) OBJECT_CLASS_CHECK(NVDIMMClass, (oc), TYPE_NVDIMM)
+#define NVDIMM_GET_CLASS(obj) OBJECT_GET_CLASS(NVDIMMClass, (obj), \
+   TYPE_NVDIMM)
+struct NVDIMMDevice {
+/* private */
+PCDIMMDevice parent_obj;
+
+/* public */
+
+/*
+ * the size of label data in NVDIMM device which is presented to
+ * guest via __DSM "Get Namespace Label Size" function.
+ */
+uint64_t label_size;
+
+/*
+ * the address of label data which is read by __DSM "Get Namespace
+ * Label Data" function and written by __DSM "Set Namespace Label
+ * Data" function.
+ */
+void *label_data;
+
+/*
+ * it's the PMEM region in NVDIMM device, which is presented to
+ * guest via ACPI NFIT and _FIT method if NVDIMM hotplug is supported.
+ */
+MemoryRegion nvdimm_mr;
+};
+typedef struct NVDIMMDevice NVDIMMDevice;
+
+struct NVDIMMClass {
+/* private */
+PCDIMMDeviceClass parent_class;
+
+/* public */
+
+/* read @size bytes from NVDIMM label data at @offset into @buf. */
+void (*read_label_data)(NVDIMMDevice *nvdimm, void *buf,
+uint64_t size, uint64_t offset);
+/* write @size bytes from @buf to NVDIMM label data at @offset. */
+void (*write_label_data)(NVDIMMDevice *nvdimm, const void *buf,
+ uint64_t size, uint64_t offset);
+};
+typedef struct NVDIMMClass NVDIMMClass;
 
 #define NVDIMM_DSM_MEM_FILE "etc/acpi/nvdimm-mem"
 
diff --git a/hw/mem/nvdimm.c b/hw/mem/nvdimm.c
index 0a602f2..81896c0 100644
--- a/hw/mem/nvdimm.c
+++ b/hw/mem/nvdimm.c
@@ -23,20 +23,152 @@
  */
 
 #include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "qapi/visitor.h"
 #include "hw/mem/nvdimm.h"
 
+static void nvdimm_get_label_size(Object *obj, Visitor *v, const char *name,
+  void *opaque, Error **errp)
+{
+NVDIMMDevice *nvdimm = NVDIMM(obj);
+uint64_t value = nvdimm->label_size;
+
+visit_type_size(v, name, &value, errp);
+}
+
+static void nvdimm_set_label_size(Object *obj, Visitor *v, const char *name,
+  void *opaque, Error **errp)
+{
+NVDIMMDevice *nvdimm = NVDIMM(obj);
+Error *local_err = NULL;
+uint64_t value;
+
+if (memory_region_size(&nvdimm->nvdimm_mr)) {
+error_setg(&local_err, "cannot change property value");
+goto out;
+}
+
+visit_type_size(v, name, &value, &local_err);
+if (local_err) {
+goto out;
+}
+if (value < MIN_NAMESPACE_LABEL_SIZE) {
+error_setg(&local_err, "Property '%s.%s' (0x%" PRIx64 ") is required"
+   " at least 0x%lx", object_get_typename(obj),
+   name, value, MIN_NAMESPACE_LABEL_SIZE);
+goto out;
+}
+
+nvdimm->label_size = value;
+out:
+error_propagate(errp, local_err);
+}
+
+static void nvdimm_init(Object *obj)
+{
+object_property_add(obj, "label-size", "int",
+nvdimm_get_label_size, nvdimm_set_label_size, NULL,
+NULL, NULL);
+}
+
+static MemoryRegion *nvdimm_get_memory_region(PCDIMMDevice *dimm)
+{
+NVDIMMDevice *nvdimm = NVDIMM(dimm);
+
+return &nvdimm->nvdimm_mr;
+}
+
+static void nvdimm_realize(PCDIMMDevice *dimm, Error **errp)
+{
+MemoryRegion *mr = host_memory_backend_get_memory(dimm->hostmem, errp);
+NVDIMMDevice *nvdimm = NVDIMM(dimm);
+uint64_t align, pmem_size, size = memory_region_size(mr);
+
+align = memory_region_get_alignment(mr);
+
+pmem_size = size - nvdimm->label_size;
+nvdimm->label_data = memory_region_get_r
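
The patch text is cut off above, but the commit message and the visible part of nvdimm_realize() describe a split of the backend memory into a PMEM region followed by a label area at the end. A minimal sketch of that arithmetic, under stated assumptions: nvdimm_split_backend() is a hypothetical helper, and the check that the backend is larger than the label area is an assumption that does not appear in the visible part of the patch. Unlike the property setter, which validates the minimum only when 'label-size' is set, this sketch rejects every label size below the minimum.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same value as in the patch (128KB), spelled with ULL for portability. */
#define MIN_NAMESPACE_LABEL_SIZE (128ULL << 10)

/*
 * Split a backend of backend_size bytes into a PMEM region followed by a
 * label area of label_size bytes. Returns false if the sizes are unusable.
 */
bool nvdimm_split_backend(uint64_t backend_size, uint64_t label_size,
                          uint64_t *pmem_size, uint64_t *label_offset)
{
    if (label_size < MIN_NAMESPACE_LABEL_SIZE) {
        return false; /* below the minimum label data size */
    }
    if (backend_size <= label_size) {
        return false; /* assumed check: nothing would be left for PMEM */
    }
    *pmem_size = backend_size - label_size;
    *label_offset = *pmem_size; /* label data starts where PMEM ends */
    return true;
}
```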

[Qemu-devel] [PULL 00/34] pc, pci, virtio: new features, cleanups, fixes

2016-06-23 Thread Michael S. Tsirkin

The following changes since commit c7288767523f6510cf557707d3eb5e78e519b90d:

  Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.7-20160623' into staging (2016-06-23 11:53:14 +0100)

are available in the git repository at:

  git://git.kernel.org/pub/scm/virt/kvm/mst/qemu.git tags/for_upstream

for you to fetch changes up to 21a4d96243e60a4c8eeb124a023b8a3bd9120e18:

  virtio-bus: remove old set_host_notifier callback (2016-06-24 08:47:35 +0300)


pc, pci, virtio: new features, cleanups, fixes

nvdimm label support
cpu acpi hotplug rework
virtio rework
misc cleanups and fixes

Signed-off-by: Michael S. Tsirkin 


Corey Minyard (4):
  smbios: Move table build tools into an include file.
  ipmi: Add SMBIOS table entry
  acpi: Add IPMI table entries
  bios: Add tests for the IPMI ACPI and SMBIOS entries

Cornelia Huck (6):
  virtio-bus: common ioeventfd infrastructure
  virtio-bus: have callers tolerate new host notifier api
  virtio-ccw: convert to ioeventfd callbacks
  virtio-pci: convert to ioeventfd callbacks
  virtio-mmio: convert to ioeventfd callbacks
  virtio-bus: remove old set_host_notifier callback

Ido Yariv (1):
  i386: pci-assign: Fix MSI-X table size

Igor Mammedov (9):
  docs: update ACPI CPU hotplug spec with new protocol
  pc: piix4/ich9: add 'cpu-hotplug-legacy' property
  acpi: cpuhp: add CPU devices AML with _STA method
  pc: acpi: introduce AcpiDeviceIfClass.madt_cpu hook
  acpi: cpuhp: implement hot-add parts of CPU hotplug interface
  acpi: cpuhp: implement hot-remove parts of CPU hotplug interface
  acpi: cpuhp: add cpu._OST handling
  pc: use new CPU hotplug interface since 2.7 machine type
  pc: acpi: drop intermediate PCMachineState.node_cpu

Michael S. Tsirkin (1):
  acpi-test-data: update expected

Xiao Guangrong (13):
  pc-dimm: introduce get_vmstate_memory_region callback
  nvdimm: support nvdimm label
  acpi: add aml_object_type
  acpi: add aml_call5
  nvdimm acpi: set HDLE properly
  nvdimm acpi: save arg3 of _DSM method
  nvdimm acpi: check UUID
  nvdimm acpi: abstract the operations for root & nvdimm devices
  nvdimm acpi: check revision
  nvdimm acpi: support Get Namespace Label Size function
  nvdimm acpi: support Get Namespace Label Data function
  nvdimm acpi: support Set Namespace Label Data function
  docs: add NVDIMM ACPI documentation

 qapi-schema.json |   3 +-
 hw/smbios/smbios_build.h |  87 ++
 include/hw/acpi/acpi_dev_interface.h |   7 +
 include/hw/acpi/aml-build.h  |   3 +
 include/hw/acpi/cpu.h|  67 +
 include/hw/acpi/cpu_hotplug.h|   6 +
 include/hw/acpi/ich9.h   |   3 +
 include/hw/acpi/ipmi.h   |  22 ++
 include/hw/i386/pc.h |   8 +-
 include/hw/mem/nvdimm.h  |  55 +++-
 include/hw/mem/pc-dimm.h |   5 +-
 include/hw/smbios/ipmi.h |  15 +
 include/hw/virtio/virtio-bus.h   |  31 +-
 hw/acpi/aml-build.c  |  22 ++
 hw/acpi/cpu.c| 561 +++
 hw/acpi/cpu_hotplug.c|  21 +-
 hw/acpi/ich9.c   |  69 -
 hw/acpi/ipmi.c   | 105 +++
 hw/acpi/nvdimm.c | 400 ++---
 hw/acpi/piix4.c  |  71 -
 hw/block/dataplane/virtio-blk.c  |   6 +-
 hw/i386/acpi-build.c |  80 +++--
 hw/i386/kvm/pci-assign.c |  14 +-
 hw/i386/pc.c |  63 +++-
 hw/i386/pc_piix.c|   2 +
 hw/i386/pc_q35.c |   2 +
 hw/isa/lpc_ich9.c|   1 +
 hw/mem/nvdimm.c  | 132 +
 hw/mem/pc-dimm.c |  14 +-
 hw/s390x/virtio-ccw.c| 133 +++--
 hw/scsi/virtio-scsi-dataplane.c  |   9 +-
 hw/smbios/smbios.c   |  72 +
 hw/smbios/smbios_type_38.c   | 117 
 hw/virtio/vhost.c|  13 +-
 hw/virtio/virtio-bus.c   | 132 +
 hw/virtio/virtio-mmio.c  | 128 +++-
 hw/virtio/virtio-pci.c   | 124 +++-
 stubs/ipmi.c |  14 +
 stubs/pc_madt_cpu_entry.c|   7 +
 stubs/smbios_type_38.c   |  14 +
 tests/bios-tables-test.c |  60 +++-
 docs/specs/acpi_cpu_hotplug.txt  |  94 +-
 docs/specs/acpi_nvdimm.txt   | 132 +
 hw/acpi/Makefile.objs|   2 +
 hw/acpi/trace-events |  14 +
 hw/smbios/Makefile.objs  |   1 +
 stubs/Makefile.objs  |   3 +
 tests/acpi-test-data/pc/DSDT | Bin 5503 -

[Qemu-devel] [PULL 10/34] nvdimm acpi: save arg3 of _DSM method

2016-06-23 Thread Michael S. Tsirkin

From: Xiao Guangrong 

Check whether the input Arg3 is valid, then store it into ARG3 if it is
needed.

Signed-off-by: Xiao Guangrong 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/acpi/nvdimm.c | 20 
 1 file changed, 20 insertions(+)

diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
index 14355f8..95504e9 100644
--- a/hw/acpi/nvdimm.c
+++ b/hw/acpi/nvdimm.c
@@ -488,6 +488,7 @@ void nvdimm_init_acpi_state(AcpiNVDIMMState *state, MemoryRegion *io,
 static void nvdimm_build_common_dsm(Aml *dev)
 {
 Aml *method, *ifctx, *function, *dsm_mem, *unpatched, *result_size;
+Aml *pckg, *pckg_index, *pckg_buf;
 uint8_t byte_list[1];
 
 method = aml_method(NVDIMM_COMMON_DSM, 5, AML_SERIALIZED);
@@ -524,6 +525,25 @@ static void nvdimm_build_common_dsm(Aml *dev)
 aml_append(method, aml_store(aml_arg(2), aml_name("FUNC")));
 
 /*
+ * The fourth parameter (Arg3) of _DSM is a package which contains
+ * a buffer, the layout of the buffer is specified by UUID (Arg0),
+ * Revision ID (Arg1) and Function Index (Arg2) which are documented
+ * in the DSM Spec.
+ */
+pckg = aml_arg(3);
+ifctx = aml_if(aml_and(aml_equal(aml_object_type(pckg),
+   aml_int(4 /* Package */)) /* It is a Package? */,
+   aml_equal(aml_sizeof(pckg), aml_int(1)) /* 1 element? */,
+   NULL));
+
+pckg_index = aml_local(2);
+pckg_buf = aml_local(3);
+aml_append(ifctx, aml_store(aml_index(pckg, aml_int(0)), pckg_index));
+aml_append(ifctx, aml_store(aml_derefof(pckg_index), pckg_buf));
+aml_append(ifctx, aml_store(pckg_buf, aml_name("ARG3")));
+aml_append(method, ifctx);
+
+/*
  * tell QEMU about the real address of DSM memory, then QEMU
  * gets the control and fills the result in DSM memory.
  */
-- 
MST

[Qemu-devel] [PULL 02/34] ipmi: Add SMBIOS table entry

2016-06-23 Thread Michael S. Tsirkin

From: Corey Minyard 

Add an IPMI table entry to the SMBIOS.

Signed-off-by: Corey Minyard 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 include/hw/smbios/ipmi.h   |  15 ++
 hw/smbios/smbios.c |   2 +
 hw/smbios/smbios_type_38.c | 117 +
 stubs/smbios_type_38.c |  14 ++
 hw/smbios/Makefile.objs|   1 +
 stubs/Makefile.objs|   1 +
 6 files changed, 150 insertions(+)
 create mode 100644 include/hw/smbios/ipmi.h
 create mode 100644 hw/smbios/smbios_type_38.c
 create mode 100644 stubs/smbios_type_38.c

diff --git a/include/hw/smbios/ipmi.h b/include/hw/smbios/ipmi.h
new file mode 100644
index 000..1c9aae3
--- /dev/null
+++ b/include/hw/smbios/ipmi.h
@@ -0,0 +1,15 @@
+/*
+ * IPMI SMBIOS firmware handling
+ *
+ * Copyright (c) 2015,2016 Corey Minyard, MontaVista Software, LLC
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#ifndef QEMU_SMBIOS_IPMI_H
+#define QEMU_SMBIOS_IPMI_H
+
+void smbios_build_type_38_table(void);
+
+#endif /* QEMU_SMBIOS_IPMI_H */
diff --git a/hw/smbios/smbios.c b/hw/smbios/smbios.c
index 5dc3e43..74c7102 100644
--- a/hw/smbios/smbios.c
+++ b/hw/smbios/smbios.c
@@ -25,6 +25,7 @@
 #include "hw/loader.h"
 #include "exec/cpu-common.h"
 #include "smbios_build.h"
+#include "hw/smbios/ipmi.h"
 
 /* legacy structures and constants for <= 2.0 machines */
 struct smbios_header {
@@ -848,6 +849,7 @@ void smbios_get_tables(const struct smbios_phys_mem_area *mem_array,
 }
 
 smbios_build_type_32_table();
+smbios_build_type_38_table();
 smbios_build_type_127_table();
 
 smbios_validate_table();
diff --git a/hw/smbios/smbios_type_38.c b/hw/smbios/smbios_type_38.c
new file mode 100644
index 000..56e8609
--- /dev/null
+++ b/hw/smbios/smbios_type_38.c
@@ -0,0 +1,117 @@
+/*
+ * IPMI SMBIOS firmware handling
+ *
+ * Copyright (c) 2015,2016 Corey Minyard, MontaVista Software, LLC
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "hw/ipmi/ipmi.h"
+#include "hw/smbios/ipmi.h"
+#include "hw/smbios/smbios.h"
+#include "qemu/error-report.h"
+#include "smbios_build.h"
+
+/* SMBIOS type 38 - IPMI */
+struct smbios_type_38 {
+struct smbios_structure_header header;
+uint8_t interface_type;
+uint8_t ipmi_spec_revision;
+uint8_t i2c_slave_address;
+uint8_t nv_storage_device_address;
+uint64_t base_address;
+uint8_t base_address_modifier;
+uint8_t interrupt_number;
+} QEMU_PACKED;
+
+static void smbios_build_one_type_38(IPMIFwInfo *info)
+{
+uint64_t baseaddr = info->base_address;
+SMBIOS_BUILD_TABLE_PRE(38, 0x3000, true);
+
+t->interface_type = info->interface_type;
+t->ipmi_spec_revision = ((info->ipmi_spec_major_revision << 4)
+ | info->ipmi_spec_minor_revision);
+t->i2c_slave_address = info->i2c_slave_address;
+t->nv_storage_device_address = 0;
+
+assert(info->ipmi_spec_minor_revision <= 15);
+assert(info->ipmi_spec_major_revision <= 15);
+
+/* or 1 to set it to I/O space */
+switch (info->memspace) {
+case IPMI_MEMSPACE_IO:
+baseaddr |= 1;
+break;
+case IPMI_MEMSPACE_MEM32:
+case IPMI_MEMSPACE_MEM64:
+break;
+case IPMI_MEMSPACE_SMBUS:
+baseaddr <<= 1;
+break;
+}
+
+t->base_address = cpu_to_le64(baseaddr);
+
+t->base_address_modifier = 0;
+if (info->irq_type == IPMI_LEVEL_IRQ) {
+t->base_address_modifier |= 1;
+}
+switch (info->register_spacing) {
+case 1:
+break;
+case 4:
+t->base_address_modifier |= 1 << 6;
+break;
+case 16:
+t->base_address_modifier |= 2 << 6;
+break;
+default:
+error_report("IPMI register spacing %d is not compatible with"
+ " SMBIOS, ignoring this entry.", info->register_spacing);
+return;
+}
+t->interrupt_number = info->interrupt_number;
+
+SMBIOS_BUILD_TABLE_POST;
+}
+
+static void smbios_add_ipmi_devices(BusState *bus)
+{
+BusChild *kid;
+
+QTAILQ_FOREACH(kid, &bus->children,  sibling) {
+DeviceState *dev = kid->child;
+Object *obj = object_dynamic_cast(OBJECT(dev), TYPE_IPMI_INTERFACE);
+BusState *childbus;
+
+if (obj) {
+IPMIInterface *ii;
+IPMIInterfaceClass *iic;
+IPMIFwInfo info;
+
+ii = IPMI_INTERFACE(obj);
+iic = IPMI_INTERFACE_GET_CLASS(obj);
+memset(&info, 0, sizeof(info));
+iic->get_fwinfo(ii, &info);
+smbios_build_one_type_38(&info);
+continue;
+}
+
+QLIST_FOREACH(childbus, &dev->child_bus, sibling) {
+smbios_add_ipmi_devices(childbus);
+}
+}
+}
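
Two of the fields filled in by smbios_build_one_type_38() above are small bit-packed values. The sketch below restates them as standalone helpers; both function names are hypothetical, and the bit positions are taken from the patch itself rather than checked against the SMBIOS specification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Spec revision as two nibbles: major in the high one, minor in the low. */
uint8_t smbios_ipmi_spec_revision(uint8_t major, uint8_t minor)
{
    assert(major <= 15 && minor <= 15);
    return (uint8_t)((major << 4) | minor);
}

/*
 * Base address modifier as the patch builds it: bit 0 is set for a
 * level-triggered IRQ, bits 7..6 encode the register spacing
 * (1 -> 0, 4 -> 1, 16 -> 2). Returns false for any other spacing,
 * which the patch reports as an error and skips.
 */
bool smbios_ipmi_base_address_modifier(unsigned spacing, bool level_irq,
                                       uint8_t *out)
{
    uint8_t mod = level_irq ? 1 : 0;

    switch (spacing) {
    case 1:
        break;
    case 4:
        mod |= 1 << 6;
        break;
    case 16:
        mod |= 2 << 6;
        break;
    default:
        return false;
    }
    *out = mod;
    return true;
}
```

Note that this revision packing differs from the ACPI _SRV value, which uses (major << 8) | (minor << 4).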

[Qemu-devel] [PULL 01/34] smbios: Move table build tools into an include file.

2016-06-23 Thread Michael S. Tsirkin

From: Corey Minyard 

This will let things in other files (like IPMI) build SMBIOS tables.

Signed-off-by: Corey Minyard 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/smbios/smbios_build.h | 87 
 hw/smbios/smbios.c   | 70 --
 2 files changed, 93 insertions(+), 64 deletions(-)
 create mode 100644 hw/smbios/smbios_build.h

diff --git a/hw/smbios/smbios_build.h b/hw/smbios/smbios_build.h
new file mode 100644
index 000..68b8b72
--- /dev/null
+++ b/hw/smbios/smbios_build.h
@@ -0,0 +1,87 @@
+/*
+ * SMBIOS Support
+ *
+ * Copyright (C) 2009 Hewlett-Packard Development Company, L.P.
+ * Copyright (C) 2013 Red Hat, Inc.
+ *
+ * Authors:
+ *  Alex Williamson 
+ *  Markus Armbruster 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ * Contributions after 2012-01-13 are licensed under the terms of the
+ * GNU GPL, version 2 or (at your option) any later version.
+ */
+
+#ifndef QEMU_SMBIOS_BUILD_H
+#define QEMU_SMBIOS_BUILD_H
+
+bool smbios_skip_table(uint8_t type, bool required_table);
+
+extern uint8_t *smbios_tables;
+extern size_t smbios_tables_len;
+extern unsigned smbios_table_max;
+extern unsigned smbios_table_cnt;
+
+#define SMBIOS_BUILD_TABLE_PRE(tbl_type, tbl_handle, tbl_required)\
+struct smbios_type_##tbl_type *t; \
+size_t t_off; /* table offset into smbios_tables */   \
+int str_index = 0;\
+do {  \
+/* should we skip building this table ? */\
+if (smbios_skip_table(tbl_type, tbl_required)) {  \
+return;   \
+} \
+  \
+/* use offset of table t within smbios_tables */  \
+/* (pointer must be updated after each realloc) */\
+t_off = smbios_tables_len;\
+smbios_tables_len += sizeof(*t);  \
+smbios_tables = g_realloc(smbios_tables, smbios_tables_len);  \
+t = (struct smbios_type_##tbl_type *)(smbios_tables + t_off); \
+  \
+t->header.type = tbl_type;\
+t->header.length = sizeof(*t);\
+t->header.handle = cpu_to_le16(tbl_handle);   \
+} while (0)
+
+#define SMBIOS_TABLE_SET_STR(tbl_type, field, value)  \
+do {  \
+int len = (value != NULL) ? strlen(value) + 1 : 0;\
+if (len > 1) {\
+smbios_tables = g_realloc(smbios_tables,  \
+  smbios_tables_len + len);   \
+memcpy(smbios_tables + smbios_tables_len, value, len);\
+smbios_tables_len += len; \
+/* update pointer post-realloc */ \
+t = (struct smbios_type_##tbl_type *)(smbios_tables + t_off); \
+t->field = ++str_index;   \
+} else {  \
+t->field = 0; \
+} \
+} while (0)
+
+#define SMBIOS_BUILD_TABLE_POST   \
+do {  \
+size_t term_cnt, t_size;  \
+  \
+/* add '\0' terminator (add two if no strings defined) */ \
+term_cnt = (str_index == 0) ? 2 : 1;  \
+smbios_tables = g_realloc(smbios_tables,  \
+  smbios_tables_len + term_cnt);  \
+memset(smbios_tables + smbios_tables_len, 0, term_cnt);   \
+smbios_tables_len += term_cnt;\
+  \
+/* update smbios max. element size */ \
+t_size = smbios_tables_len - t_off;   \
+
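
The macros above keep an offset (t_off) into smbios_tables and recompute the table pointer after every g_realloc(), because a reallocation may move the buffer. A minimal sketch of the same pattern in plain C, assuming realloc() in place of g_realloc(); the table_buf type and both helpers are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Growable byte buffer; callers keep offsets, never long-lived pointers. */
struct table_buf {
    uint8_t *data;
    size_t len;
};

/* Grow by n zeroed bytes and return the offset of the new region. */
size_t table_buf_grow(struct table_buf *b, size_t n)
{
    size_t off = b->len;
    uint8_t *p;

    assert(n > 0);
    p = realloc(b->data, b->len + n);
    if (!p) {
        abort(); /* mirror g_realloc(), which aborts on failure */
    }
    memset(p + off, 0, n);
    b->data = p; /* the buffer may have moved: old pointers are stale */
    b->len += n;
    return off;
}

/* Append a NUL-terminated string and return its offset. */
size_t table_buf_append_str(struct table_buf *b, const char *s)
{
    size_t n = strlen(s) + 1;
    size_t off = table_buf_grow(b, n);

    memcpy(b->data + off, s, n);
    return off;
}
```

A caller that recorded the header offset before appending strings can reach the header again through b->data + offset, which is what the "update pointer post-realloc" line in SMBIOS_TABLE_SET_STR does.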

Re: [Qemu-devel] [PATCH v2 09/10] tests: acpi: add CPU hotplug testcase

2016-06-23 Thread Michael S. Tsirkin

On Thu, Jun 23, 2016 at 03:47:36PM +0200, Igor Mammedov wrote:
> On Thu, 23 Jun 2016 16:08:38 +0300
> Marcel Apfelbaum  wrote:
> 
> > On 06/16/2016 07:55 PM, Igor Mammedov wrote:
> > > Test with:
> > >
> > >  -smp 2,cores=3,sockets=2,maxcpus=6
> > >
> > > to capture sparse APIC ID values that default
> > > AMD CPU has in above configuration.
> > >
> > > Signed-off-by: Igor Mammedov 
> > > ---
> > >   tests/bios-tables-test.c | 28 
> > >   1 file changed, 28 insertions(+)
> > >
> > > diff --git a/tests/bios-tables-test.c b/tests/bios-tables-test.c
> > > index 16d11aa..a7abe91 100644
> > > --- a/tests/bios-tables-test.c
> > > +++ b/tests/bios-tables-test.c
> > > @@ -788,6 +788,32 @@ static void test_acpi_q35_tcg_bridge(void)
> > >   free_test_data(&data);
> > >   }
> > >
> > > +static void test_acpi_piix4_tcg_cphp(void)
> > > +{
> > > +test_data data;
> > > +
> > > +memset(&data, 0, sizeof(data));
> > > +data.machine = MACHINE_PC;
> > > +data.variant = ".cphp";
> > > +test_acpi_one("-machine accel=tcg"
> > > +  " -smp 2,cores=3,sockets=2,maxcpus=6",
> > > +  &data);
> > > +free_test_data(&data);
> > > +}
> > > +
> > > +static void test_acpi_q35_tcg_cphp(void)
> > > +{
> > > +test_data data;
> > > +
> > > +memset(&data, 0, sizeof(data));
> > > +data.machine = MACHINE_Q35;
> > > +data.variant = ".cphp";
> > > +test_acpi_one("-machine q35,accel=tcg"
> > > +  " -smp 2,cores=3,sockets=2,maxcpus=6",
> > > +  &data);
> > > +free_test_data(&data);
> > > +}
> > > +
> > >   int main(int argc, char *argv[])
> > >   {
> > >   const char *arch = qtest_get_arch();
> > > @@ -804,6 +830,8 @@ int main(int argc, char *argv[])
> > >   qtest_add_func("acpi/piix4/tcg/bridge", test_acpi_piix4_tcg_bridge);
> > >   qtest_add_func("acpi/q35/tcg", test_acpi_q35_tcg);
> > >   qtest_add_func("acpi/q35/tcg/bridge", test_acpi_q35_tcg_bridge);
> > > +qtest_add_func("acpi/piix4/tcg/cpuhp", test_acpi_piix4_tcg_cphp);
> > > +qtest_add_func("acpi/q35/tcg/cpuhp", test_acpi_q35_tcg_cphp);
> > >   }
> > >   ret = g_test_run();
> > >   boot_sector_cleanup(disk);
> > >
> > 
> > It looks good, but did you miss the .cphp variant expected files on
> > purpose?
> yes, it was in separate commit and I've dropped it before publishing
> tree, per Michael's suggestion not to post ACPI tables blobs since he
> updates them himself.
> I can regenerate blob and post it any time as commit on top of this if
> needed.

You need to patch the script that updates the blob.
I can run it myself, but you should mention it in the commit log.

> > 
> > 
> > Reviewed-by: Marcel Apfelbaum 
> > Thanks,
> > Marcel
> Thanks!
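
As background for the "sparse APIC ID" remark in the quoted commit message: with cores=3 the core number needs a 2-bit field, so package 1 starts at APIC ID 4 and ID 3 is never used. The following is only a rough sketch, assuming the usual x86 layout in which each topology level occupies a power-of-two wide bit field and assuming one thread per core; both function names are hypothetical and this is not QEMU's topology code.

```c
#include <assert.h>
#include <stdint.h>

/* Number of bits needed to hold the values 0 .. count-1. */
static unsigned bits_for_count(unsigned count)
{
    unsigned bits = 0;

    assert(count >= 1);
    while ((1u << bits) < count) {
        bits++;
    }
    return bits;
}

/* Compose an APIC ID from package, core and thread indexes. */
uint32_t x86_apic_id(unsigned cores, unsigned threads,
                     unsigned pkg, unsigned core, unsigned thread)
{
    unsigned thread_bits = bits_for_count(threads);
    unsigned core_bits = bits_for_count(cores);

    assert(core < cores && thread < threads);
    return (pkg << (core_bits + thread_bits)) | (core << thread_bits) | thread;
}
```

Under these assumptions, sockets=2 and cores=3 yield the IDs 0, 1, 2, 4, 5 and 6.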

Re: [Qemu-devel] [PATCH 00/13] virtio migration: Flip outer layer to vmstate

2016-06-23 Thread Michael S. Tsirkin

On Tue, Jun 21, 2016 at 08:13:54PM +0100, Dr. David Alan Gilbert (git) wrote:
> From: "Dr. David Alan Gilbert" 
> 
> Hi,
>   This series converts the outermost layer of virtio to
> use VMState macros;  this is the easy bit, but I'm hoping that
> having done that, the next trick is to nibble away at the virtio_save/load
> functions and all of the zillions of device/bus helpers.
> 
> I think the first two patches are the most controversial;
> they remove migration support for old versions of virtio-net and virtio-serial;
> (for virtio-net versions prior to 0.11 and for virtio-serial prior to 0.13).
> I'm working on the basis that migration has bit rotted enough so
> that the streams aren't migration compatible for that long back
> on upstream - but if anyone knows otherwise please shout.

I'm ok with removing 0.11 migration compat code.
But doesn't the serial change break 0.12 as well?

> The reason for doing those is that the virtio structure makes
> it a bit tricky to pass the outer device version number down
> through VMState to the device specific code (I can do it
> as a hack if necessary using a dummy is_needed function);
> and with -net and -serial compatibility sorted I think
> every other device just supports a single version.
> 
> My main reason for doing this is to get rid of the
> calls to register_savevm ('going to disappear as soon..' since 2010)
> 
> It's lightly tested using the magic line:
> ./x86_64-softmmu/qemu-system-x86_64 -nographic -machine 
> pc-i440fx-2.6,accel=kvm -cpu qemu64 -m 2048M -drive 
> file=/home/vmimages/f20.img,if=none,id=drivea -device virtio-scsi,id=scsi 
> -device scsi-hd,drive=drivea -device virtio-rng -device virtio-serial 
> -chardev file,id=test,path=/tmp/testfile -device 
> virtconsole,chardev=test,name=foo -virtfs 
> local,path=/home,security_model=passthrough,mount_tag=host_share  -device 
> virtio-gpu  -drive file=/home/vmimages/jeos-19-64.qcow2,id=jeos,if=none 
> -device virtio-blk,drive=jeos  -device virtio-balloon
> 
> Thoughts?
> 
> Dave
> 
> Dr. David Alan Gilbert (13):
>   virtio-net: Remove old migration version support
>   virtio-serial: Remove old migration version support
>   virtio: Migration helper function and macro
>   virtio-scsi: Wrap in vmstate
>   virtio-blk: Wrap in vmstate
>   virtio-rng: Wrap in vmstate
>   virtio-balloon: Wrap in vmstate
>   virtio-net: Wrap in vmstate
>   virtio-serial: Wrap in vmstate
>   9pfs: Wrap in vmstate
>   virtio-input: Wrap in vmstate
>   virtio-gpu: Wrap in vmstate
>   virtio: Update migration docs
> 
>  docs/virtio-migration.txt   |   6 ++-
>  hw/9pfs/virtio-9p-device.c  |  14 +++---
>  hw/block/virtio-blk.c   |  16 +++
>  hw/char/virtio-serial-bus.c |  62 +--
>  hw/display/virtio-gpu.c |  17 +++-
>  hw/input/virtio-input.c |  26 +++
>  hw/net/virtio-net.c | 102 +---
>  hw/scsi/virtio-scsi.c   |  21 +++--
>  hw/virtio/virtio-balloon.c  |  19 +++--
>  hw/virtio/virtio-rng.c  |  20 +++--
>  hw/virtio/virtio.c  |   6 +++
>  include/hw/virtio/virtio.h  |  20 +
>  12 files changed, 130 insertions(+), 199 deletions(-)
> 
> -- 
> 2.7.4

Re: [Qemu-devel] [PATCH v4 0/4] enable iommu with -device

2016-06-23 Thread Michael S. Tsirkin

On Tue, Jun 14, 2016 at 10:19:32AM +0300, Marcel Apfelbaum wrote:
> Create the iommu device with '-device intel-iommu' instead of 
> '-machine,iommu=on'.
> 
> The device is part of the machine properties because we wanted
> to ensure it is created before any other PCI device.
> 
> The alternative is to skip the bus_master_enable_region at
> the time the device is created. We can create this region
> at machine_done phase. (patch 1)
> 
> Then we need to enable sysbus devices(*) for PC machines (patch 2),
> since intel-iommu is a sysbus device.
> 
> Patch 3 moves the IOMMU init process into iommu's realize function
> and allows the device creation in both ways.



This breaks make check on x86:

GTESTER check-qtest-ppc64
Broken pipe
GTester: last random seed: R02Sf26cb511ed533726cf1afef21a00bde4
/scm/qemu/tests/Makefile.include:668: recipe for target
'check-qtest-ppc64' failed
make: *** [check-qtest-ppc64] Error 1


> Finally patch 4 removes the iommu machine property.
> 
> v3 -> v4:
>   - Rebased on mst/pci tree (Michael).
> 
> v2 -> v3:
>   - Add machine_done notifier in pci_bus realize and remove it unrealize 
> (Paolo).
>   - Add comments for 'cannot_instantiate_with_device_add_yet' (Markus).
>   - Split adding the -device iommu support and removing the iommu machine 
> property (Michael).
>   - Use pci_setup_iommu as before (Peter)
>   - Mark intel-iommu as not hot-pluggable (Peter)
>   - Rebased on master
> 
> v1 -> v2:
>   - Enable bus_master also on init if the guest OS already booted to enable 
> hotplug (Paolo).
>   - Add a machine_done notifier to PCIBus instead of adding functionality
> for q35 machine_done callback. The main reason is we don't want to 
> replicate
> the code for all platforms that support PCI and is also cleaner this way.
>   - Added 'cannot_instantiate_with_device_add_yet' to sysbus devices that lead
> to crashes if added with -device.
>   - Rebased on master
> 
> 
> (*) Creates a new problem since we have now a bunch of
> new devices that can be created with -device on Q35:
>   name "q35-pcihost", bus System
>   name "sysbus-ohci", bus System, desc "OHCI USB Controller"
>   name "allwinner-ahci", bus System
>   name "cfi.pflash01", bus System
>   name "esp", bus System
>   name "SUNW,fdtwo", bus System
>   name "sysbus-ahci", bus System
>   name "sysbus-fdc", bus System
>   name "vfio-amd-xgbe", bus System, desc "VFIO AMD XGBE"
>   name "vfio-calxeda-xgmac", bus System, desc "VFIO Calxeda XGMAC"
>   name "virtio-mmio", bus System
>   name "fw_cfg", bus System
>   name "fw_cfg_io", bus System
>   name "fw_cfg_mem", bus System
>   name "generic-sdhci", bus System
>   name "hpet", bus System
>   name "i440FX-pcihost", bus System
>   name "intel-iommu", bus System
>   name "ioapic", bus System
>   name "isabus-bridge", bus System
>   name "kvm-ioapic", bus System
>   name "kvmclock", bus System
>   name "kvmvapic", bus System
>   name "pxb-host", bus System
> 
> Took care of the ones creating immediate issues (like crashes) by marking them
> as 'cannot_instantiate_with_device_add_yet'. I didn't mark them all because:
>   - libvirt will mask them anyway
>   - some of them have already a "protection" in place
>   - it is possible that some of them can be actually used with -device on 
> other platform. 
>   - those are not 'interesting' scenarios.
> If somebody spots devices in the list that cannot be added with -device on 
> any platform
> please let me know and I'll mark them.
> 
> 
> Thanks,
> Marcel
> 
> Marcel Apfelbaum (4):
>   hw/pci: delay bus_master_enable_region initialization
>   q35: allow dynamic sysbus
>   hw/iommu: enable iommu with -device
>   machine: remove iommu property
> 
>  hw/core/machine.c   | 20 --
>  hw/i386/intel_iommu.c   | 16 +++
>  hw/i386/pc_q35.c|  2 +-
>  hw/pci-bridge/pci_expander_bridge.c |  2 ++
>  hw/pci-host/piix.c  |  2 ++
>  hw/pci-host/q35.c   | 31 +++-
>  hw/pci/pci.c| 41 
> +
>  include/hw/pci-host/q35.h   |  1 -
>  include/hw/pci/pci_bus.h|  2 ++
>  include/sysemu/sysemu.h |  1 +
>  qemu-options.hx |  3 ---
>  vl.c|  5 +
>  12 files changed, 64 insertions(+), 62 deletions(-)
> 
> -- 
> 2.4.3
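
The deferral described for patch 1 (create the bus-master region at machine_done rather than at device creation) can be sketched as a small notifier list. This is only an illustration under stated assumptions: SketchNotifier, SketchPciBus and every function name here are hypothetical stand-ins, not QEMU's Notifier or PCIBus code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical minimal notifier; QEMU's real Notifier type differs. */
typedef struct SketchNotifier SketchNotifier;
struct SketchNotifier {
    void (*notify)(SketchNotifier *n);
    SketchNotifier *next;
};

static SketchNotifier *machine_done_list;

static void add_machine_done_notifier(SketchNotifier *n)
{
    n->next = machine_done_list;
    machine_done_list = n;
}

/* Called once, after all devices (including the IOMMU) exist. */
static void run_machine_done(void)
{
    for (SketchNotifier *n = machine_done_list; n; n = n->next) {
        n->notify(n);
    }
}

/* Hypothetical bus state; the notifier is the first member so the
 * callback can cast back to the containing struct. */
typedef struct SketchPciBus {
    SketchNotifier machine_done;
    bool bus_master_region_ready;
} SketchPciBus;

static void bus_machine_done(SketchNotifier *n)
{
    SketchPciBus *bus = (SketchPciBus *)n;

    /* By now the IOMMU, if any, has been created. */
    bus->bus_master_region_ready = true;
}

/* Realize only registers the callback; the region setup is deferred. */
static void bus_realize(SketchPciBus *bus)
{
    bus->bus_master_region_ready = false;
    bus->machine_done.notify = bus_machine_done;
    add_machine_done_notifier(&bus->machine_done);
}
```

With this shape, the order in which the bus and the IOMMU are created no longer matters, as long as both exist before run_machine_done() is called.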

Re: [Qemu-devel] Change of max-ram-below-4g initial value breaks Xen

2016-06-23 Thread Gerd Hoffmann

On Do, 2016-06-23 at 17:18 +0100, Anthony PERARD wrote:
> On Thu, Jun 23, 2016 at 04:57:54PM +0200, Gerd Hoffmann wrote:
> >   Hi,
> > 
> > > How could xen_ram_init() find out if the value of max-ram-below-4g is
> > > the default or if a user have set it? Is there another way we could fix
> > > this?
> > 
> > Attached patch should fix it.  Patch survived a quick smoke test on kvm
> > so far, need to do some more testing tomorrow.  Can you give it a spin
> > on xen?
> 
> Thanks. Unfortunately, it does not work :(.
> 
> In this patch, max_ram_below_4g is set before the call to xen_ram_init()
> and xen_ram_init read it back (via object_property_get_int()).  So, in
> xen_ram_init, user_lowmem is not 0.

Ah, I see.  We do the split calculation twice on xen.  That is pretty
pointless.  New patch attached.

cheers,
  Gerd

From a1bb0d4f7a94e97102e7ea72d0a65de2a17b1160 Mon Sep 17 00:00:00 2001
From: Gerd Hoffmann 
Date: Thu, 23 Jun 2016 16:49:03 +0200
Subject: [PATCH] xen: fix ram init regression

Commit "8156d48 pc: allow raising low memory via max-ram-below-4g
option" causes a regression on xen, because it uses a different
memory split.

This patch initializes max-ram-below-4g to zero and leaves the
initialization to the memory initialization functions.  That way
they can pick different default values (max-ram-below-4g is zero
still) or use the user supplied value (max-ram-below-4g is non-zero).

Also skip the whole ram split calculation on Xen.  xen_ram_init()
does its own split calculation anyway, so it is superfluous; this
way xen_ram_init() can also actually see whether max-ram-below-4g
is zero or not.

Signed-off-by: Gerd Hoffmann 
---
 hw/i386/pc.c  |  2 +-
 hw/i386/pc_piix.c | 52 +---
 hw/i386/pc_q35.c  |  3 +++
 xen-hvm.c |  3 +++
 4 files changed, 36 insertions(+), 24 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 7198ed5..66e1dae 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1886,7 +1886,7 @@ static void pc_machine_initfn(Object *obj)
 pc_machine_get_hotplug_memory_region_size,
 NULL, NULL, NULL, &error_abort);
 
-pcms->max_ram_below_4g = 0xe0000000; /* 3.5G */
+pcms->max_ram_below_4g = 0; /* use default */
 object_property_add(obj, PC_MACHINE_MAX_RAM_BELOW_4G, "size",
 pc_machine_get_max_ram_below_4g,
 pc_machine_set_max_ram_below_4g,
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 53bc968..f51fa77 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -108,37 +108,43 @@ static void pc_init1(MachineState *machine,
  *so legacy non-PAE guests can get as much memory as possible in
  *the 32bit address space below 4G.
  *
+ *  - Note that Xen has its own ram setup code in xen_ram_init(),
+ *called via xen_hvm_init().
+ *
  * Examples:
  *qemu -M pc-1.7 -m 4G(old default)-> 3584M low,  512M high
  *qemu -M pc -m 4G(new default)-> 3072M low, 1024M high
  *qemu -M pc,max-ram-below-4g=2G -m 4G -> 2048M low, 2048M high
  *qemu -M pc,max-ram-below-4g=4G -m 3968M  -> 3968M low (=4G-128M)
  */
-lowmem = pcms->max_ram_below_4g;
-if (machine->ram_size >= pcms->max_ram_below_4g) {
-if (pcmc->gigabyte_align) {
-if (lowmem > 0xc0000000) {
-lowmem = 0xc0000000;
-}
-if (lowmem & ((1ULL << 30) - 1)) {
-error_report("Warning: Large machine and max_ram_below_4g "
- "(%" PRIu64 ") not a multiple of 1G; "
- "possible bad performance.",
- pcms->max_ram_below_4g);
-}
-}
-}
-
-if (machine->ram_size >= lowmem) {
-pcms->above_4g_mem_size = machine->ram_size - lowmem;
-pcms->below_4g_mem_size = lowmem;
-} else {
-pcms->above_4g_mem_size = 0;
-pcms->below_4g_mem_size = machine->ram_size;
-}
-
 if (xen_enabled()) {
 xen_hvm_init(pcms, &ram_memory);
+} else {
+if (!pcms->max_ram_below_4g) {
+pcms->max_ram_below_4g = 0xe0000000; /* default: 3.5G */
+}
+lowmem = pcms->max_ram_below_4g;
+if (machine->ram_size >= pcms->max_ram_below_4g) {
+if (pcmc->gigabyte_align) {
+if (lowmem > 0xc0000000) {
+lowmem = 0xc0000000;
+}
+if (lowmem & ((1ULL << 30) - 1)) {
+error_report("Warning: Large machine and max_ram_below_4g "
+ "(%" PRIu64 ") not a multiple of 1G; "
+ "possible bad performance.",
+ pcms->max_ram_below_4g);
+}
+}
+}
+
+if (machine->ram_size >= lowmem) {
+pcms->above_4g_mem_size = machine->ram_size - lowmem;
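
For reference, the non-Xen split can be sketched as a standalone function, with zero meaning "the user did not set max-ram-below-4g". This is a simplified illustration, not the code from the patch: RamSplit, split_ram() and the SKETCH_* constants are hypothetical names, and the warning about values that are not a multiple of 1G is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type for this sketch. */
typedef struct RamSplit {
    uint64_t below_4g;
    uint64_t above_4g;
} RamSplit;

#define SKETCH_DEFAULT_LOWMEM 0xe0000000ULL /* 3.5G default when unset */
#define SKETCH_ALIGNED_LOWMEM 0xc0000000ULL /* 3G cap when gigabyte-aligning */

/*
 * max_ram_below_4g == 0 means "not set by the user", so the default is
 * filled in here rather than at machine init time.
 */
static RamSplit split_ram(uint64_t ram_size, uint64_t max_ram_below_4g,
                          bool gigabyte_align)
{
    RamSplit split;
    uint64_t lowmem;

    if (max_ram_below_4g == 0) {
        max_ram_below_4g = SKETCH_DEFAULT_LOWMEM;
    }
    lowmem = max_ram_below_4g;
    /* Large machines on aligned machine types get at most 3G low. */
    if (ram_size >= max_ram_below_4g && gigabyte_align &&
        lowmem > SKETCH_ALIGNED_LOWMEM) {
        lowmem = SKETCH_ALIGNED_LOWMEM;
    }
    if (ram_size >= lowmem) {
        split.below_4g = lowmem;
        split.above_4g = ram_size - lowmem;
    } else {
        split.below_4g = ram_size;
        split.above_4g = 0;
    }
    return split;
}
```

Calling split_ram(4ULL << 30, 0, true) reproduces the "qemu -M pc -m 4G" example from the comment in the patch: 3072M low and 1024M high.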

Re: [Qemu-devel] [PATCH v1 04/11] ppc/xics: Remove unused xics_set_irq_type()

2016-06-23 Thread David Gibson

On Thu, Jun 23, 2016 at 11:17:23PM +0530, Nikunj A Dadhania wrote:
> From: Benjamin Herrenschmidt 
> 
> Signed-off-by: Benjamin Herrenschmidt 
> Reviewed-by: David Gibson 
> Signed-off-by: Nikunj A Dadhania 

This stands on its own so I've applied it to ppc-for-2.7 (adjusting
for context conflicts, obviously).

> ---
>  hw/intc/xics.c| 11 ---
>  include/hw/ppc/xics.h |  1 -
>  2 files changed, 12 deletions(-)
> 
> diff --git a/hw/intc/xics.c b/hw/intc/xics.c
> index 40969ee..4f15a2d 100644
> --- a/hw/intc/xics.c
> +++ b/hw/intc/xics.c
> @@ -695,17 +695,6 @@ void ics_set_irq_type(ICSState *ics, int srcno, bool lsi)
>  lsi ? XICS_FLAGS_IRQ_LSI : XICS_FLAGS_IRQ_MSI;
>  }
>  
> -void xics_set_irq_type(XICSState *icp, int irq, bool lsi)
> -{
> -int src = xics_find_source(icp, irq);
> -ICSState *ics;
> -
> -assert(src >= 0);
> -
> -ics = &icp->ics[src];
> -ics_set_irq_type(ics, irq - ics->offset, lsi);
> -}
> -
>  static void xics_register_types(void)
>  {
>  type_register_static(&xics_common_info);
> diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
> index 32ea706..2a9b91d 100644
> --- a/include/hw/ppc/xics.h
> +++ b/include/hw/ppc/xics.h
> @@ -170,7 +170,6 @@ struct ICSIRQState {
>  #define XICS_IRQS_SPAPR   1024
>  
>  qemu_irq xics_get_qirq(XICSState *icp, int irq);
> -void xics_set_irq_type(XICSState *icp, int irq, bool lsi);
>  int xics_spapr_alloc(XICSState *icp, int src, int irq_hint, bool lsi,
>   Error **errp);
>  int xics_spapr_alloc_block(XICSState *icp, int src, int num, bool lsi,

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] [PATCH v3 09/22] blkdebug: Set request_alignment during .bdrv_refresh_limits()

2016-06-23 Thread Fam Zheng

On Thu, 06/23 16:37, Eric Blake wrote:
> We want to eventually stick request_alignment alongside other
> BlockLimits, but first, we must ensure it is populated at the
> same time as all other limits, rather than being a special case
> that is set only when a block is first opened.
> 
> Note that when the user does not provide "align", then we were
> defaulting to bs->request_alignment - but at this stage in the
> initialization, that was always 512.  We were also rejecting an
> explicit "align":0 from the user; this patch now allows that,
> as an explicit request for the default alignment (which may not
> always be 512 in the future).
> 
> qemu-iotests 77 is particularly sensitive to the fact that we
> can specify an artificial alignment override in blkdebug, and
> that override must continue to work even when limits are
> refreshed on an already open device.
> 
> Signed-off-by: Eric Blake 
> 
> ---
> v3: rework to allow user to specify "align":0 for default
> v2: new patch
> ---
>  qapi/block-core.json |  3 ++-
>  block/blkdebug.c | 19 +++
>  2 files changed, 17 insertions(+), 5 deletions(-)
> 
> diff --git a/qapi/block-core.json b/qapi/block-core.json
> index 98a20d2..ac8f5f6 100644
> --- a/qapi/block-core.json
> +++ b/qapi/block-core.json
> @@ -1961,7 +1961,8 @@
>  #
>  # @config:  #optional filename of the configuration file
>  #
> -# @align:   #optional required alignment for requests in bytes
> +# @align:   #optional required alignment for requests in bytes,
> +#   must be power of 2, or 0 for default
>  #
>  # @inject-error:#optional array of error injection descriptions
>  #
> diff --git a/block/blkdebug.c b/block/blkdebug.c
> index 20d25bd..54b6870 100644
> --- a/block/blkdebug.c
> +++ b/block/blkdebug.c
> @@ -37,6 +37,7 @@
>  typedef struct BDRVBlkdebugState {
>  int state;
>  int new_state;
> +int align;
> 
>  QLIST_HEAD(, BlkdebugRule) rules[BLKDBG__MAX];
>  QSIMPLEQ_HEAD(, BlkdebugRule) active_rules;
> @@ -382,10 +383,10 @@ static int blkdebug_open(BlockDriverState *bs, QDict 
> *options, int flags,
>  }
> 
>  /* Set request alignment */
> -align = qemu_opt_get_size(opts, "align", bs->request_alignment);
> -if (align > 0 && align < INT_MAX && !(align & (align - 1))) {
> -bs->request_alignment = align;
> -} else {
> +align = qemu_opt_get_size(opts, "align", 0);
> +if (align < INT_MAX && is_power_of_2(align)) {
> +s->align = align;
> +} else if (align) {
>  error_setg(errp, "Invalid alignment");
>  ret = -EINVAL;
>  goto fail_unref;
> @@ -720,6 +721,15 @@ static void blkdebug_refresh_filename(BlockDriverState 
> *bs, QDict *options)
>  bs->full_open_options = opts;
>  }
> 
> +static void blkdebug_refresh_limits(BlockDriverState *bs, Error **errp)
> +{
> +BDRVBlkdebugState *s = bs->opaque;
> +
> +if (s->align) {
> +bs->request_alignment = s->align;
> +}
> +}
> +
>  static int blkdebug_reopen_prepare(BDRVReopenState *reopen_state,
> BlockReopenQueue *queue, Error **errp)
>  {
> @@ -738,6 +748,7 @@ static BlockDriver bdrv_blkdebug = {
>  .bdrv_getlength = blkdebug_getlength,
>  .bdrv_truncate  = blkdebug_truncate,
>  .bdrv_refresh_filename  = blkdebug_refresh_filename,
> +.bdrv_refresh_limits= blkdebug_refresh_limits,
> 
>  .bdrv_aio_readv = blkdebug_aio_readv,
>  .bdrv_aio_writev= blkdebug_aio_writev,
> -- 
> 2.5.5
> 

Reviewed-by: Fam Zheng
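
The "align" handling in this patch (0 accepted as "use the default", any other value must be a power of 2 below INT_MAX) can be sketched in isolation as follows. The sketch_* names are hypothetical and the 512-byte default is only passed in by the caller; this is a rough illustration, not the blkdebug code itself.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Zero is deliberately not a power of two here. */
static bool sketch_is_power_of_2(uint64_t value)
{
    return value != 0 && (value & (value - 1)) == 0;
}

/*
 * Validate a user-supplied "align" value.  On success store it in
 * *stored (0 meaning "no override") and return 0; otherwise -EINVAL.
 */
static int sketch_parse_align(uint64_t align, int *stored)
{
    if (align < INT_MAX && sketch_is_power_of_2(align)) {
        *stored = (int)align;
        return 0;
    }
    if (align != 0) {
        return -EINVAL;
    }
    *stored = 0;
    return 0;
}

/* What a limits refresh would apply: the override if set, else the default. */
static int sketch_effective_alignment(int stored, int default_alignment)
{
    return stored ? stored : default_alignment;
}
```

Keeping the override in driver state and applying it on every refresh is what lets it survive when limits are recomputed on an already open device.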

Re: [Qemu-devel] [PATCH 1/3] qapi: Report support for -device cpu hotplug in query-machines

2016-06-23 Thread Peter Krempa

On Fri, Jun 24, 2016 at 14:56:51 +1000, David Gibson wrote:

[...]

> > You are correct - query-commands says whether 'query-hotpluggable-cpus'
> > exists as a command.  But that is insufficient.  See my review, or the
> > v2 patch, where the above poor wording was corrected to say what was
> > really meant: knowing whether query-hotpluggable-cpus exists is
> > insufficient to tell you whether a given cpu type can be hotplugged.  So
> > adding one more piece of witness (for every type of cpu supported, we
> > also advertise if it is hotpluggable) is enough for libvirt to
> > efficiently take advantage of the new query-hotpluggable-cpus command.
> 
> Ah, right.  Or to put it another way, the availability of
> query-hotpluggable-cpus is global across qemu, whereas actually being
> able to use it for hotplug is per machine type.
> 
> Would it be possible to do this instead by attempting to invoke
> query-hopluggable-cpus and seeing if it returns any information?

It is not strictly necessary for us to have this in the context of
usability. If the user requests using the new hotplug feature we will
try it unconditionally and call query-hotpluggable-cpus before even
starting guest execution. A failure to query the state will then result
in termination of the VM.

It is necessary though to report the availability of the feature to the
user via our domain capabilities API which some higher layer management
apps use to make decisions.

This would also be necessary if we wanted to switch by default to the
new approach, but that's not really possible as libvirt tries to
guarantee that a config valid on a certain version will still be valid
even when it was migrated to a newer version and then back.

My current plan is to start qemu with -smp cpus=1,... and then call
query-hotpluggable-cpus and then hotplug all of them until the requested
configuration is satisfied. This approach is necessary so that we can
query for the model and topology info so that we don't need to
re-implement all the numbering and naming logic from qemu.

Additionally this will require us to mark one CPU as non-hotpluggable as
-smp cpus=0,maxcpus=10 is basically translated to -smp
cpus=10,maxcpus=10.

Peter
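
The startup plan above (boot with one CPU, query the slots, then plug until the requested count is reached) can be sketched as a loop over a simplified slot list. CpuSlot and plug_until() are hypothetical; a real client would work on the actual query-hotpluggable-cpus reply and issue device_add for each slot.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of one query-hotpluggable-cpus entry. */
typedef struct CpuSlot {
    int vcpus_count;  /* vCPU threads this entry provides */
    bool present;     /* already plugged (e.g. the boot CPU) */
} CpuSlot;

/*
 * Plug free slots in list order until at least `wanted` vCPUs are
 * present.  Returns the resulting vCPU count, which stays below
 * `wanted` if the machine runs out of slots.
 */
static int plug_until(CpuSlot *slots, size_t nslots, int wanted)
{
    int online = 0;
    size_t i;

    for (i = 0; i < nslots; i++) {
        if (slots[i].present) {
            online += slots[i].vcpus_count;
        }
    }
    for (i = 0; i < nslots && online < wanted; i++) {
        if (!slots[i].present) {
            /* A real client would issue device_add for this slot here. */
            slots[i].present = true;
            online += slots[i].vcpus_count;
        }
    }
    return online;
}
```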

Re: [Qemu-devel] [PATCH v2 0/2] qapi: Fix up cpu hotplug property names and add witness for cpu hotplug support

2016-06-23 Thread David Gibson

On Fri, 24 Jun 2016 07:31:39 +0200
Igor Mammedov  wrote:

> On Fri, 24 Jun 2016 13:00:56 +1000
> David Gibson  wrote:
> 
> > On Thu, 23 Jun 2016 23:23:32 +0200
> > Peter Krempa  wrote:
> >   
> > > Version 2:
> > > - fix typos/incompetence/drowsiness based language errors in commit
> > > message
> > > - select version 1 as prefered way
> > > - add -id suffix to all members of CpuInstanceProperties
> > > - note in qapi-schema the need to keep members in sync
> > > - fix output text field names in HMP impl
> > > 
> > > Peter Krempa (2):
> > >   qapi: Report support for -device cpu hotplug in query-machines
> > >   qapi: keep names in 'CpuInstanceProperties' in sync with struct
> > > CPUCore  
> > 
> > Adding Bharata to CC.
> > 
> > Igor, should I take these through my tree?  
> Yep, please do so.
> It' would be even better if it's merged into master ASAP
> (for libvirt and for x86 device_add series that introduces these
> properties for x86 CPUs)

Yes, that's my intention.  Just want to check what the final word is on
whether we need the extra witness, or whether checking what's returned
from query-hotpluggable-cpus is sufficient.

-- 
David Gibson 
Senior Software Engineer, Virtualization, Red Hat



Re: [Qemu-devel] [PATCH v2 2/2] qapi: keep names in 'CpuInstanceProperties' in sync with struct CPUCore

2016-06-23 Thread Igor Mammedov

On Thu, 23 Jun 2016 23:23:34 +0200
Peter Krempa  wrote:

> struct CPUCore uses 'id' suffix in the property name. As docs for
> query-hotpluggable-cpus state that the cpu core properties should be
> passed back to device_add by management in case new members are added
> and thus the names for the fields should be kept in sync.
> 
> Signed-off-by: Peter Krempa 
> ---
>  hmp.c | 16 
>  hw/ppc/spapr.c|  4 ++--
>  include/hw/cpu/core.h |  3 +++
>  qapi-schema.json  | 19 ++-
>  4 files changed, 23 insertions(+), 19 deletions(-)
> 
> diff --git a/hmp.c b/hmp.c
> index 997a768..925601a 100644
> --- a/hmp.c
> +++ b/hmp.c
> @@ -2457,17 +2457,17 @@ void hmp_hotpluggable_cpus(Monitor *mon,
> const QDict *qdict)
> 
>  c = l->value->props;
>  monitor_printf(mon, "  CPUInstance Properties:\n");
> -if (c->has_node) {
> -monitor_printf(mon, "node: \"%" PRIu64 "\"\n",
> c->node);
> +if (c->has_node_id) {
> +monitor_printf(mon, "node-id: \"%" PRIu64 "\"\n",
> c->node_id); }
> -if (c->has_socket) {
> -monitor_printf(mon, "socket: \"%" PRIu64 "\"\n",
> c->socket);
> +if (c->has_socket_id) {
> +monitor_printf(mon, "socket-id: \"%" PRIu64 "\"\n",
> c->socket_id); }
> -if (c->has_core) {
> -monitor_printf(mon, "core: \"%" PRIu64 "\"\n",
> c->core);
> +if (c->has_core_id) {
> +monitor_printf(mon, "core-id: \"%" PRIu64 "\"\n",
> c->core_id); }
> -if (c->has_thread) {
> -monitor_printf(mon, "thread: \"%" PRIu64 "\"\n",
> c->thread);
> +if (c->has_thread_id) {
> +monitor_printf(mon, "thread-id: \"%" PRIu64 "\"\n",
> c->thread_id); }
> 
>  l = l->next;
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 778fa25..0b6bb9c 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -2367,8 +2367,8 @@ static HotpluggableCPUList
> *spapr_query_hotpluggable_cpus(MachineState *machine)
> 
>  cpu_item->type = spapr_get_cpu_core_type(machine->cpu_model);
>  cpu_item->vcpus_count = smp_threads;
> -cpu_props->has_core = true;
> -cpu_props->core = i * smt;
> +cpu_props->has_core_id = true;
> +cpu_props->core_id = i * smt;
>  /* TODO: add 'has_node/node' here to describe
> to which node core belongs */
> 
> diff --git a/include/hw/cpu/core.h b/include/hw/cpu/core.h
> index 4540a7d..79ac79c 100644
> --- a/include/hw/cpu/core.h
> +++ b/include/hw/cpu/core.h
> @@ -26,6 +26,9 @@ typedef struct CPUCore {
>  int nr_threads;
>  } CPUCore;
> 
> +/* Note: topology field names need to be kept in sync with
> + * 'CpuInstanceProperties' */
> +
>  #define CPU_CORE_PROP_CORE_ID "core-id"
> 
>  #endif
> diff --git a/qapi-schema.json b/qapi-schema.json
> index 24ede28..d0c4be1 100644
> --- a/qapi-schema.json
> +++ b/qapi-schema.json
> @@ -4267,20 +4267,21 @@
>  # Note: currently there are 4 properties that could be present
>  # but management should be prepared to pass through other
>  # properties with device_add command to allow for future
> -# interface extension.
> +# interface extension. This also requires the filed names to be kept
> in sync +# sync with the properties passed to -device/device_add.
>  #
> -# @node: #optional NUMA node ID the CPU belongs to
> -# @socket: #optional socket number within node/board the CPU belongs
> to -# @core: #optional core number within socket the CPU belongs to
> -# @thread: #optional thread number within core the CPU belongs to
> +# @node-id: #optional NUMA node ID the CPU belongs to
> +# @socket-id: #optional socket number within node/board the CPU
> belongs to +# @core-id: #optional core number within socket the CPU
> belongs to +# @thread-id: #optional thread number within core the CPU
> belongs to #
>  # Since: 2.7
>  ##
>  { 'struct': 'CpuInstanceProperties',
> -  'data': { '*node': 'int',
> -'*socket': 'int',
> -'*core': 'int',
> -'*thread': 'int'
> +  'data': { '*node-id': 'int',
> +'*socket-id': 'int',
> +'*core-id': 'int',
> +'*thread-id': 'int'
>}
>  }
> 

Reviewed-by: Igor Mammedov
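
A rough sketch of why the names have to match: a management tool that passes the reported properties straight back to device_add only needs to emit each present member under its reported name. SketchCpuProps, the helper names and the driver string are hypothetical, and the "driver,prop=value" output format is an assumption for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical mirror of the optional members after the rename. */
typedef struct SketchCpuProps {
    bool has_node_id, has_socket_id, has_core_id, has_thread_id;
    int64_t node_id, socket_id, core_id, thread_id;
} SketchCpuProps;

/* Append ",name=value" to buf; returns false if it would not fit. */
static bool append_prop(char *buf, size_t size, const char *name, int64_t v)
{
    size_t used = strlen(buf);
    int n = snprintf(buf + used, size - used, ",%s=%lld", name, (long long)v);

    return n >= 0 && (size_t)n < size - used;
}

/* Build "driver,prop=value,..." from whichever members are present. */
static bool format_device_add_args(const SketchCpuProps *p, const char *driver,
                                   char *buf, size_t size)
{
    int n = snprintf(buf, size, "%s", driver);

    if (n < 0 || (size_t)n >= size) {
        return false;
    }
    if (p->has_node_id && !append_prop(buf, size, "node-id", p->node_id)) {
        return false;
    }
    if (p->has_socket_id &&
        !append_prop(buf, size, "socket-id", p->socket_id)) {
        return false;
    }
    if (p->has_core_id && !append_prop(buf, size, "core-id", p->core_id)) {
        return false;
    }
    if (p->has_thread_id &&
        !append_prop(buf, size, "thread-id", p->thread_id)) {
        return false;
    }
    return true;
}
```

If the reported member were still called "core" while the device property is "core-id", this pass-through would need a per-member translation table, which is what the rename avoids.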

Re: [Qemu-devel] [PATCH v2 0/2] qapi: Fix up cpu hotplug property names and add witness for cpu hotplug support

2016-06-23 Thread Igor Mammedov

On Fri, 24 Jun 2016 13:00:56 +1000
David Gibson  wrote:

> On Thu, 23 Jun 2016 23:23:32 +0200
> Peter Krempa  wrote:
> 
> > Version 2:
> > - fix typos/incompetence/drowsiness based language errors in commit
> > message
> > - select version 1 as prefered way
> > - add -id suffix to all members of CpuInstanceProperties
> > - note in qapi-schema the need to keep members in sync
> > - fix output text field names in HMP impl
> > 
> > Peter Krempa (2):
> >   qapi: Report support for -device cpu hotplug in query-machines
> >   qapi: keep names in 'CpuInstanceProperties' in sync with struct
> > CPUCore
> 
> Adding Bharata to CC.
> 
> Igor, should I take these through my tree?
Yep, please do so.
It would be even better if it's merged into master ASAP
(for libvirt and for x86 device_add series that introduces these
properties for x86 CPUs)

Re: [Qemu-devel] [PATCH v3 08/22] block: Give nonzero result to blk_get_max_transfer_length()

2016-06-23 Thread Fam Zheng

On Thu, 06/23 16:37, Eric Blake wrote:
> Making all callers special-case 0 as unlimited is awkward,
> and we DO have a hard maximum of BDRV_REQUEST_MAX_SECTORS given
> our current block layer API limits.
> 
> In the case of scsi, this means that we now always advertise a
> limit to the guest, even in cases where the underlying layers
> previously use 0 for no inherent limit beyond the block layer.
> 
> Signed-off-by: Eric Blake 
> Reviewed-by: Kevin Wolf 
> 
> ---
> v3: rebase to scsi limits fix
> v2: new patch
> ---
>  block/block-backend.c  |  7 ---
>  hw/block/virtio-blk.c  |  3 +--
>  hw/scsi/scsi-generic.c | 12 ++--
>  3 files changed, 11 insertions(+), 11 deletions(-)
> 
> diff --git a/block/block-backend.c b/block/block-backend.c
> index 34500e6..1fb070b 100644
> --- a/block/block-backend.c
> +++ b/block/block-backend.c
> @@ -1303,15 +1303,16 @@ int blk_get_flags(BlockBackend *blk)
>  }
>  }
> 
> +/* Returns the maximum transfer length, in sectors; guaranteed nonzero */
>  int blk_get_max_transfer_length(BlockBackend *blk)
>  {
>  BlockDriverState *bs = blk_bs(blk);
> +int max = 0;
> 
>  if (bs) {
> -return bs->bl.max_transfer_length;
> -} else {
> -return 0;
> +max = bs->bl.max_transfer_length;
>  }
> +return MIN_NON_ZERO(max, BDRV_REQUEST_MAX_SECTORS);
>  }
> 
>  int blk_get_max_iov(BlockBackend *blk)
> diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
> index 284e646..1d2792e 100644
> --- a/hw/block/virtio-blk.c
> +++ b/hw/block/virtio-blk.c
> @@ -382,7 +382,7 @@ static int multireq_compare(const void *a, const void *b)
>  void virtio_blk_submit_multireq(BlockBackend *blk, MultiReqBuffer *mrb)
>  {
>  int i = 0, start = 0, num_reqs = 0, niov = 0, nb_sectors = 0;
> -int max_xfer_len = 0;
> +int max_xfer_len;
>  int64_t sector_num = 0;
> 
>  if (mrb->num_reqs == 1) {
> @@ -392,7 +392,6 @@ void virtio_blk_submit_multireq(BlockBackend *blk, 
> MultiReqBuffer *mrb)
>  }
> 
>  max_xfer_len = blk_get_max_transfer_length(mrb->reqs[0]->dev->blk);
> -max_xfer_len = MIN_NON_ZERO(max_xfer_len, BDRV_REQUEST_MAX_SECTORS);
> 
>  qsort(mrb->reqs, mrb->num_reqs, sizeof(*mrb->reqs),
>&multireq_compare);
> diff --git a/hw/scsi/scsi-generic.c b/hw/scsi/scsi-generic.c
> index 75e227d..0cb8568 100644
> --- a/hw/scsi/scsi-generic.c
> +++ b/hw/scsi/scsi-generic.c
> @@ -227,12 +227,12 @@ static void scsi_read_complete(void * opaque, int ret)
>  r->req.cmd.buf[2] == 0xb0) {
>  uint32_t max_xfer_len = blk_get_max_transfer_length(s->conf.blk) /
>  (s->blocksize / BDRV_SECTOR_SIZE);
> -if (max_xfer_len) {
> -stl_be_p(&r->buf[8], max_xfer_len);
> -/* Also take care of the opt xfer len. */
> -if (ldl_be_p(&r->buf[12]) > max_xfer_len) {
> -stl_be_p(&r->buf[12], max_xfer_len);
> -}
> +
> +assert(max_xfer_len);
> +stl_be_p(&r->buf[8], max_xfer_len);
> +/* Also take care of the opt xfer len. */
> +if (ldl_be_p(&r->buf[12]) > max_xfer_len) {
> +stl_be_p(&r->buf[12], max_xfer_len);
>  }
>  }
>  scsi_req_data(&r->req, len);
> -- 
> 2.5.5
> 

Reviewed-by: Fam Zheng
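
The effect of folding the clamp into blk_get_max_transfer_length() can be sketched with a function-style equivalent of MIN_NON_ZERO. The names below are hypothetical, and SKETCH_REQUEST_MAX_SECTORS is only an assumed stand-in for BDRV_REQUEST_MAX_SECTORS (INT_MAX bytes expressed in 512-byte sectors).

```c
#include <assert.h>
#include <limits.h>

/* Assumed stand-in for the block layer's hard per-request maximum. */
#define SKETCH_REQUEST_MAX_SECTORS (INT_MAX >> 9)

/* Minimum of two limits where 0 means "no limit". */
static int min_non_zero(int a, int b)
{
    if (a == 0) {
        return b;
    }
    if (b == 0) {
        return a;
    }
    return a < b ? a : b;
}

/* A driver limit of 0 (unlimited) becomes the hard maximum, never 0. */
static int sketch_max_transfer_length(int driver_limit)
{
    return min_non_zero(driver_limit, SKETCH_REQUEST_MAX_SECTORS);
}
```

Because the result is never zero, callers such as the scsi-generic path can divide by it or advertise it without a separate "unlimited" branch.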

Re: [Qemu-devel] [PATCH 1/3] qapi: Report support for -device cpu hotplug in query-machines

2016-06-23 Thread Igor Mammedov

On Fri, 24 Jun 2016 14:56:51 +1000
David Gibson  wrote:

> On Thu, 23 Jun 2016 21:49:25 -0600
> Eric Blake  wrote:
> 
> > On 06/23/2016 08:56 PM, David Gibson wrote:
> > > On Thu, 23 Jun 2016 22:23:23 +0200
> > > Peter Krempa  wrote:
> > >   
> > >> For management apps it's very useful to know whether the selected
> > >> machine type supports cpu hotplug via the new -device approach.
> > >> Using the presence of 'query-hotpluggable-cpus' is enough for a
> > >> withess. 
> > 
> > > 
> > > I'd been under the impression that there was a general way of
> > > detecting the availability of a particular qmp command.  Was I
> > > mistaken?  
> > 
> > You are correct - query-commands says whether
> > 'query-hotpluggable-cpus' exists as a command.  But that is
> > insufficient.  See my review, or the v2 patch, where the above poor
> > wording was corrected to say what was really meant: knowing whether
> > query-hotpluggable-cpus exists is insufficient to tell you whether
> > a given cpu type can be hotplugged.  So adding one more piece of
> > witness (for every type of cpu supported, we also advertise if it
> > is hotpluggable) is enough for libvirt to efficiently take
> > advantage of the new query-hotpluggable-cpus
> > command.
> 
> Ah, right.  Or to put it another way, the availability of
> query-hotpluggable-cpus is global across qemu, whereas actually being
> able to use it for hotplug is per machine type.
> 
> Would it be possible to do this instead by attempting to invoke
> query-hopluggable-cpus and seeing if it returns any information?
This sounds like a better way. For x86 we can set the
query-hotpluggable-cpus hook to NULL for old machine types, so that
the command returns an error saying it's not supported.

> 
>

Re: [Qemu-devel] [PATCH 06/11] target-i386: add socket/core/thread properties to X86CPU

2016-06-23 Thread Igor Mammedov

On Thu, 23 Jun 2016 18:43:53 -0300
Eduardo Habkost  wrote:

> On Thu, Jun 23, 2016 at 10:46:36PM +0200, Igor Mammedov wrote:
> > On Thu, 23 Jun 2016 17:18:46 -0300
> > Eduardo Habkost  wrote:
> [...]
> > > > 
> > > > > 
> > > > > I suggest validating the properties, and setting them in case
> > > > > they are not set:
> > > > > 
> > > > > x86_topo_ids_from_apicid(cpu->apic_id, smp_cores,
> > > > > smp_threads, &topo);
> > > > > 
> > > > > if (cpu->socket != -1 && cpu->socket != topo.socket_id) {
> > > > > error_setg(errp, "CPU socket ID mismatch: ...");
> > > > > return;
> > > > > }
> > > > > cpu->socket = topo.socket_id;
> > > > > 
> > > > > if (cpu->core != -1 && cpu->core != topo.core_id) {
> > > > > error_setg(errp, "CPU core ID mismatch: ...");
> > > > > return;
> > > > > }
> > > > > cpu->core = topo.core_id;
> > > > > 
> > > > > if (cpu->thread != -1 && cpu->thread != topo.smt_id) {
> > > > > error_setg(errp, "CPU thread ID mismatch: ...");
> > > > > return;
> > > > > }
> > > > > cpu->thread = topo.smt_id;
> > > > > 
> > > > > We could do that inside x86_cpu_realizefn(), so that
> > > > > socket/core/thread would be always set in all CPUs.
> > > > all CPUs pass through pc_cpu_pre_plug() before cpu.relizefn()
> > > > so I'd rather do it here at board level responsible for
> > > > setting apic_id or socket/core/thread info is not set.
> > > 
> > > Then *-user will be inconsistent, and will always have invalid
> > > values in socket/core/thread. If one day we add any logic using
> > > the socket/core/thread properties in cpu.c, it will break on
> > > *-user.
> > > 
> > > All those properties are X86CPU properties, meaning they are
> > > input to the X86CPU code. I don't see why we should move logic
> > > related to them outside cpu.c unless really necessary.
> > > 
> > > (The apic-id calculation at patch 07/11, for example, is more
> > > difficult to move to cpu.c, because the PC code needs the APIC ID
> > > before calling realize. We could move it to the apic-id getter,
> > > but I dislike having magic getter/setters and like that you made
> > > it a static property.)
> > set of socket/core/thread is a synonym for apic_id and it's board
> > that manages and knows valid values for them.
> 
> The CPU already knows exactly what the bits inside APIC ID mean,
> because it has to report topology information through CPUID.
> 
> > Putting above snippet into
> > cpu.relizefn() would make CPU access globals smp_cores, smp_threads
> > which are essentially machine_state and Drew working on moving them
> > there and eliminating globals. So suddenly CPU would need to poke
> > into machine object, and we return to the same state wrt *-user
> > only with hack in cpu.c.
> 
> After Drew's code is included, we can simply use
> CPUState::nr_cores and CPUState::nr_threads.
> 
> > 
> > I'd worry about -smp and *-user when it comes into that target as it
> > will probably need apic_id and maybe socket/core/thread as well.
> 
> The point is to not even have to worry about *-user later, by
> keeping both softmmu and *-user consistent. People reviewing
> patches a few years from now probably wouldn't even notice that
> code using the socket/core/thread fields will break in *-user.
> 
> I wouldn't mind about having the code in pc.c at all, if it
> didn't make *-user inconsistent, or if the CPU object didn't had
> all the required information yet. But I don't think it is
> reasonable to intentionally leave X86CPU fields inconsistent in
> *-user if we can easily fix it by moving initialization to
> realizefn.
> 
> But if you are really strongly against that, I can propose that
> as a follow-up later (after Drew's series is included).
Let's do it as a follow-up after Drew's -smp refactoring is in.
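
For readers following the discussion, the validate-or-fill pattern from the quoted snippet can be sketched independently of where it ends up living (board code or realizefn). The decomposition below assumes the usual layout with thread bits lowest, then core bits, then the package ID; SketchTopo and the helper names are hypothetical, and small core/thread counts are assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical topology triple for this sketch. */
typedef struct SketchTopo {
    unsigned pkg_id;
    unsigned core_id;
    unsigned smt_id;
} SketchTopo;

/* Number of bits needed to number `count` items (0 bits for one item). */
static unsigned bits_for_count(unsigned count)
{
    unsigned width = 0;

    while ((1u << width) < count) {
        width++;
    }
    return width;
}

/* Split an APIC ID into package/core/thread fields. */
static SketchTopo topo_from_apicid(uint32_t apicid, unsigned cores,
                                   unsigned threads)
{
    unsigned smt_bits = bits_for_count(threads);
    unsigned core_bits = bits_for_count(cores);
    SketchTopo t;

    t.smt_id = apicid & ((1u << smt_bits) - 1);
    t.core_id = (apicid >> smt_bits) & ((1u << core_bits) - 1);
    t.pkg_id = apicid >> (smt_bits + core_bits);
    return t;
}

/*
 * -1 means "not set by the user".  A set value must match what the
 * APIC ID implies; an unset one is filled in.  Returns 0 or -1.
 */
static int check_or_fill(int *field, unsigned expected)
{
    if (*field != -1 && (unsigned)*field != expected) {
        return -1;
    }
    *field = (int)expected;
    return 0;
}
```

With 4 cores and 2 threads per core, APIC ID 13 (binary 1 10 1) decomposes into package 1, core 2, thread 1.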

Re: [Qemu-devel] [PATCH v1 02/11] ppc/xics: Move SPAPR specific code to a separate file

2016-06-23 Thread David Gibson

On Thu, Jun 23, 2016 at 11:17:21PM +0530, Nikunj A Dadhania wrote:
> From: Benjamin Herrenschmidt 
> 
> Leave the core ICP/ICS logic in xics.c and move the top level
> class wrapper, hypercall and RTAS handlers to xics_spapr.c
> 
> Signed-off-by: Benjamin Herrenschmidt 
> [add cpu.h in xics_spapr.c, move set_nr_irqs and set_nr_servers to
>  xics_spapr.c]
> Signed-off-by: Nikunj A Dadhania 
> ---
>  default-configs/ppc64-softmmu.mak |   1 +
>  hw/intc/Makefile.objs |   1 +
>  hw/intc/xics.c| 418 +---
>  hw/intc/xics_spapr.c  | 432 
> ++
>  include/hw/ppc/xics.h |  21 ++
>  5 files changed, 464 insertions(+), 409 deletions(-)
>  create mode 100644 hw/intc/xics_spapr.c
> 
> diff --git a/default-configs/ppc64-softmmu.mak 
> b/default-configs/ppc64-softmmu.mak
> index bb71b23..c4be59f 100644
> --- a/default-configs/ppc64-softmmu.mak
> +++ b/default-configs/ppc64-softmmu.mak
> @@ -49,6 +49,7 @@ CONFIG_ETSEC=y
>  CONFIG_LIBDECNUMBER=y
>  # For pSeries
>  CONFIG_XICS=$(CONFIG_PSERIES)
> +CONFIG_XICS_SPAPR=$(CONFIG_PSERIES)
>  CONFIG_XICS_KVM=$(and $(CONFIG_PSERIES),$(CONFIG_KVM))
>  # For PReP
>  CONFIG_MC146818RTC=y
> diff --git a/hw/intc/Makefile.objs b/hw/intc/Makefile.objs
> index c7bbf88..530df2e 100644
> --- a/hw/intc/Makefile.objs
> +++ b/hw/intc/Makefile.objs
> @@ -30,6 +30,7 @@ obj-$(CONFIG_OPENPIC_KVM) += openpic_kvm.o
>  obj-$(CONFIG_RASPI) += bcm2835_ic.o bcm2836_control.o
>  obj-$(CONFIG_SH4) += sh_intc.o
>  obj-$(CONFIG_XICS) += xics.o
> +obj-$(CONFIG_XICS_SPAPR) += xics_spapr.o
>  obj-$(CONFIG_XICS_KVM) += xics_kvm.o
>  obj-$(CONFIG_ALLWINNER_A10_PIC) += allwinner-a10-pic.o
>  obj-$(CONFIG_S390_FLIC) += s390_flic.o
> diff --git a/hw/intc/xics.c b/hw/intc/xics.c
> index a715532..6ca391f 100644
> --- a/hw/intc/xics.c
> +++ b/hw/intc/xics.c
> @@ -32,12 +32,11 @@
>  #include "hw/hw.h"
>  #include "trace.h"
>  #include "qemu/timer.h"
> -#include "hw/ppc/spapr.h"
>  #include "hw/ppc/xics.h"
>  #include "qemu/error-report.h"
>  #include "qapi/visitor.h"
>  
> -static int get_cpu_index_by_dt_id(int cpu_dt_id)
> +int get_cpu_index_by_dt_id(int cpu_dt_id)

If this is made public it needs an xics_*() name; the current one is
too generic for a global symbol.

>  {
>  PowerPCCPU *cpu = ppc_get_vcpu_by_dt_id(cpu_dt_id);
>  
> @@ -242,7 +241,7 @@ static void icp_resend(XICSState *icp, int server)
>  ics_resend(icp->ics);
>  }
>  
> -static void icp_set_cppr(XICSState *icp, int server, uint8_t cppr)
> +void icp_set_cppr(XICSState *icp, int server, uint8_t cppr)
>  {
>  ICPState *ss = icp->ss + server;
>  uint8_t old_cppr;
> @@ -266,7 +265,7 @@ static void icp_set_cppr(XICSState *icp, int server, 
> uint8_t cppr)
>  }
>  }
>  
> -static void icp_set_mfrr(XICSState *icp, int server, uint8_t mfrr)
> +void icp_set_mfrr(XICSState *icp, int server, uint8_t mfrr)
>  {
>  ICPState *ss = icp->ss + server;
>  
> @@ -276,7 +275,7 @@ static void icp_set_mfrr(XICSState *icp, int server, 
> uint8_t mfrr)
>  }
>  }
>  
> -static uint32_t icp_accept(ICPState *ss)
> +uint32_t icp_accept(ICPState *ss)
>  {
>  uint32_t xirr = ss->xirr;
>  
> @@ -289,7 +288,7 @@ static uint32_t icp_accept(ICPState *ss)
>  return xirr;
>  }
>  
> -static void icp_eoi(XICSState *icp, int server, uint32_t xirr)
> +void icp_eoi(XICSState *icp, int server, uint32_t xirr)
>  {
>  ICPState *ss = icp->ss + server;
>  
> @@ -390,12 +389,6 @@ static const TypeInfo icp_info = {
>  /*
>   * ICS: Source layer
>   */
> -static int ics_valid_irq(ICSState *ics, uint32_t nr)
> -{
> -return (nr >= ics->offset)
> -&& (nr < (ics->offset + ics->nr_irqs));
> -}
> -
>  static void resend_msi(ICSState *ics, int srcno)
>  {
>  ICSIRQState *irq = ics->irqs + srcno;
> @@ -480,8 +473,8 @@ static void write_xive_lsi(ICSState *ics, int srcno)
>  resend_lsi(ics, srcno);
>  }
>  
> -static void ics_write_xive(ICSState *ics, int nr, int server,
> -   uint8_t priority, uint8_t saved_priority)
> +void ics_write_xive(ICSState *ics, int nr, int server,
> +uint8_t priority, uint8_t saved_priority)
>  {
>  int srcno = nr - ics->offset;
>  ICSIRQState *irq = ics->irqs + srcno;
> @@ -658,7 +651,7 @@ static const TypeInfo ics_info = {
>  /*
>   * Exported functions
>   */
> -static int xics_find_source(XICSState *icp, int irq)
> +int xics_find_source(XICSState *icp, int irq)
>  {
>  int sources = 1;
>  int src;
> @@ -686,7 +679,7 @@ qemu_irq xics_get_qirq(XICSState *icp, int irq)
>  return NULL;
>  }
>  
> -static void ics_set_irq_type(ICSState *ics, int srcno, bool lsi)
> +void ics_set_irq_type(ICSState *ics, int srcno, bool lsi)
>  {
>  assert(!(ics->irqs[srcno].flags & XICS_FLAGS_IRQ_MASK));
>  
> @@ -705,402 +698,9 @@ void xics_set_irq_type(XICSState *icp, int irq, bool 
> lsi)
>  ics_set_irq_type(ics, irq - ics->offset, lsi);
>  }
>  
> -#

Re: [Qemu-devel] [PATCH v3 07/22] scsi: Advertise limits by blocksize, not 512

2016-06-23 Thread Fam Zheng

On Thu, 06/23 16:37, Eric Blake wrote:
> s->blocksize may be larger than 512, in which case our
> tweaks to max_xfer_len and opt_xfer_len must be scaled
> appropriately.
> 
> Reported-by: Fam Zheng 
> Signed-off-by: Eric Blake 
> CC: qemu-sta...@nongnu.org
> 
> ---
> v3: new patch
> ---
>  hw/scsi/scsi-generic.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/scsi/scsi-generic.c b/hw/scsi/scsi-generic.c
> index 6a2d89a..75e227d 100644
> --- a/hw/scsi/scsi-generic.c
> +++ b/hw/scsi/scsi-generic.c
> @@ -225,7 +225,8 @@ static void scsi_read_complete(void * opaque, int ret)
>  if (s->type == TYPE_DISK &&
>  r->req.cmd.buf[0] == INQUIRY &&
>  r->req.cmd.buf[2] == 0xb0) {
> -uint32_t max_xfer_len = blk_get_max_transfer_length(s->conf.blk);
> +uint32_t max_xfer_len = blk_get_max_transfer_length(s->conf.blk) /
> +(s->blocksize / BDRV_SECTOR_SIZE);
>  if (max_xfer_len) {
>  stl_be_p(&r->buf[8], max_xfer_len);
>  /* Also take care of the opt xfer len. */
> -- 
> 2.5.5
> 

Reviewed-by: Fam Zheng
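
The scaling in the hunk above can be looked at in isolation. This is a minimal
sketch, assuming the block layer reports its limit in 512-byte sectors while
the INQUIRY page counts logical blocks of s->blocksize bytes. The names
sectors_to_blocks and SECTOR_SIZE are hypothetical and are not QEMU symbols.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SIZE 512u  /* stand-in for BDRV_SECTOR_SIZE (assumption) */

/* Convert a limit counted in 512-byte sectors into logical blocks. */
uint32_t sectors_to_blocks(uint32_t sectors, uint32_t blocksize)
{
    /* The sketch only handles block sizes that are a multiple of 512. */
    assert(blocksize >= SECTOR_SIZE && blocksize % SECTOR_SIZE == 0);
    return sectors / (blocksize / SECTOR_SIZE);
}
```

With a 4096-byte block size, a limit of 2048 sectors becomes 256 blocks. A
limit of 0 stays 0, so a check like "if (max_xfer_len)" still skips the
override in that case.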

Re: [Qemu-devel] [PATCH v1 03/11] ppc/xics: Implement H_IPOLL using an accessor

2016-06-23 Thread David Gibson

On Thu, Jun 23, 2016 at 11:17:22PM +0530, Nikunj A Dadhania wrote:
> From: Benjamin Herrenschmidt 
> 
> None of the other presenter functions directly mucks with the
> internal state, so don't do it there either.
> 
> Signed-off-by: Benjamin Herrenschmidt 
> Signed-off-by: Nikunj A Dadhania 

Reviewed-by: David Gibson 

Modulo changes that will be necessary to account for review comments
on earlier patches.

> ---
>  hw/intc/xics.c| 8 
>  hw/intc/xics_spapr.c  | 7 ---
>  include/hw/ppc/xics.h | 1 +
>  3 files changed, 13 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/intc/xics.c b/hw/intc/xics.c
> index 6ca391f..40969ee 100644
> --- a/hw/intc/xics.c
> +++ b/hw/intc/xics.c
> @@ -288,6 +288,14 @@ uint32_t icp_accept(ICPState *ss)
>  return xirr;
>  }
>  
> +uint32_t icp_ipoll(ICPState *ss, uint32_t *mfrr)
> +{
> +if (mfrr) {
> +*mfrr = ss->mfrr;
> +}
> +return ss->xirr;
> +}
> +
>  void icp_eoi(XICSState *icp, int server, uint32_t xirr)
>  {
>  ICPState *ss = icp->ss + server;
> diff --git a/hw/intc/xics_spapr.c b/hw/intc/xics_spapr.c
> index 48d458a..4d5adda 100644
> --- a/hw/intc/xics_spapr.c
> +++ b/hw/intc/xics_spapr.c
> @@ -99,10 +99,11 @@ static target_ulong h_ipoll(PowerPCCPU *cpu, 
> sPAPRMachineState *spapr,
>  target_ulong opcode, target_ulong *args)
>  {
>  CPUState *cs = CPU(cpu);
> -ICPState *ss = &spapr->icp->ss[cs->cpu_index];
> +uint32_t mfrr;
> +uint32_t xirr = icp_ipoll(spapr->icp->ss + cs->cpu_index, &mfrr);
>  
> -args[0] = ss->xirr;
> -args[1] = ss->mfrr;
> +args[0] = xirr;
> +args[1] = mfrr;
>  
>  return H_SUCCESS;
>  }
> diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
> index 76b45ef..32ea706 100644
> --- a/include/hw/ppc/xics.h
> +++ b/include/hw/ppc/xics.h
> @@ -186,6 +186,7 @@ int get_cpu_index_by_dt_id(int cpu_dt_id);
>  void icp_set_cppr(XICSState *icp, int server, uint8_t cppr);
>  void icp_set_mfrr(XICSState *icp, int server, uint8_t mfrr);
>  uint32_t icp_accept(ICPState *ss);
> +uint32_t icp_ipoll(ICPState *ss, uint32_t *mfrr);
>  void icp_eoi(XICSState *icp, int server, uint32_t xirr);
>  
>  void ics_write_xive(ICSState *ics, int nr, int server,

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson
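
For readers skimming the thread, the accessor pattern in the quoted patch can
be sketched standalone. ToyICPState and toy_icp_ipoll are hypothetical names
for illustration only; the real ICPState has more fields, and this is not the
QEMU implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, cut-down presenter state. */
typedef struct ToyICPState {
    uint32_t xirr;
    uint8_t mfrr;
} ToyICPState;

/* Return XIRR without side effects; also report MFRR if the caller asks. */
uint32_t toy_icp_ipoll(const ToyICPState *ss, uint32_t *mfrr)
{
    assert(ss != NULL);
    if (mfrr) {
        *mfrr = ss->mfrr;
    }
    return ss->xirr;
}
```

Callers that only need XIRR can pass NULL for the second argument, which keeps
the hypercall handler free of direct field accesses.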



Re: [Qemu-devel] [PATCH v1 01/11] ppc/xics: Rename existing xics to xics_spapr

2016-06-23 Thread David Gibson

On Thu, Jun 23, 2016 at 11:17:20PM +0530, Nikunj A Dadhania wrote:
> From: Benjamin Herrenschmidt 
> 
> The common class doesn't change, the KVM one is sPAPR specific. Rename
> variables and functions to xics_spapr.
> 
> Retain the type name as "xics" to preserve migration for existing sPAPR
> guests.
> 
> Signed-off-by: Benjamin Herrenschmidt 
> Signed-off-by: Nikunj A Dadhania 
> ---
>  hw/intc/xics.c| 29 +++--
>  hw/intc/xics_kvm.c|  6 +++---
>  hw/ppc/spapr.c|  7 ---
>  hw/ppc/spapr_events.c |  2 +-
>  hw/ppc/spapr_pci.c| 10 +-
>  hw/ppc/spapr_vio.c|  2 +-
>  include/hw/ppc/xics.h | 29 +
>  7 files changed, 46 insertions(+), 39 deletions(-)
> 
> diff --git a/hw/intc/xics.c b/hw/intc/xics.c
> index cce7f3d..a715532 100644
> --- a/hw/intc/xics.c
> +++ b/hw/intc/xics.c
> @@ -729,7 +729,8 @@ static int ics_find_free_block(ICSState *ics, int num, 
> int alignnum)
>  return -1;
>  }
>  
> -int xics_alloc(XICSState *icp, int src, int irq_hint, bool lsi, Error **errp)
> +int xics_spapr_alloc(XICSState *icp, int src, int irq_hint, bool lsi,
> + Error **errp)
>  {
>  ICSState *ics = &icp->ics[src];
>  int irq;
> @@ -760,8 +761,8 @@ int xics_alloc(XICSState *icp, int src, int irq_hint, 
> bool lsi, Error **errp)
>   * Allocate block of consecutive IRQs, and return the number of the first 
> IRQ in the block.
>   * If align==true, aligns the first IRQ number to num.
>   */
> -int xics_alloc_block(XICSState *icp, int src, int num, bool lsi, bool align,
> - Error **errp)
> +int xics_spapr_alloc_block(XICSState *icp, int src, int num, bool lsi,
> +   bool align, Error **errp)
>  {
>  int i, first = -1;
>  ICSState *ics = &icp->ics[src];
> @@ -810,7 +811,7 @@ static void ics_free(ICSState *ics, int srcno, int num)
>  }
>  }
>  
> -void xics_free(XICSState *icp, int irq, int num)
> +void xics_spapr_free(XICSState *icp, int irq, int num)
>  {
>  int src = xics_find_source(icp, irq);
>  
> @@ -1029,7 +1030,7 @@ static void xics_set_nr_servers(XICSState *icp, 
> uint32_t nr_servers,
>  }
>  }
>  
> -static void xics_realize(DeviceState *dev, Error **errp)
> +static void xics_spapr_realize(DeviceState *dev, Error **errp)
>  {
>  XICSState *icp = XICS(dev);
>  Error *error = NULL;
> @@ -1068,7 +1069,7 @@ static void xics_realize(DeviceState *dev, Error **errp)
>  }
>  }
>  
> -static void xics_initfn(Object *obj)
> +static void xics_spapr_initfn(Object *obj)
>  {
>  XICSState *xics = XICS(obj);
>  
> @@ -1077,29 +1078,29 @@ static void xics_initfn(Object *obj)
>  xics->ics->icp = xics;
>  }
>  
> -static void xics_class_init(ObjectClass *oc, void *data)
> +static void xics_spapr_class_init(ObjectClass *oc, void *data)
>  {
>  DeviceClass *dc = DEVICE_CLASS(oc);
> -XICSStateClass *xsc = XICS_CLASS(oc);
> +XICSStateClass *xsc = XICS_SPAPR_CLASS(oc);
>  
> -dc->realize = xics_realize;
> +dc->realize = xics_spapr_realize;
>  xsc->set_nr_irqs = xics_set_nr_irqs;
>  xsc->set_nr_servers = xics_set_nr_servers;
>  }
>  
> -static const TypeInfo xics_info = {
> -.name  = TYPE_XICS,
> +static const TypeInfo xics_spapr_info = {
> +.name  = TYPE_XICS_SPAPR,
>  .parent= TYPE_XICS_COMMON,
>  .instance_size = sizeof(XICSState),
>  .class_size = sizeof(XICSStateClass),
> -.class_init= xics_class_init,
> -.instance_init = xics_initfn,
> +.class_init= xics_spapr_class_init,
> +.instance_init = xics_spapr_initfn,
>  };
>  
>  static void xics_register_types(void)
>  {
>  type_register_static(&xics_common_info);
> -type_register_static(&xics_info);
> +type_register_static(&xics_spapr_info);
>  type_register_static(&ics_info);
>  type_register_static(&icp_info);
>  }
> diff --git a/hw/intc/xics_kvm.c b/hw/intc/xics_kvm.c
> index b17d6a9..90d657e 100644
> --- a/hw/intc/xics_kvm.c
> +++ b/hw/intc/xics_kvm.c
> @@ -495,8 +495,8 @@ static void xics_kvm_class_init(ObjectClass *oc, void 
> *data)
>  xsc->set_nr_servers = xics_kvm_set_nr_servers;
>  }
>  
> -static const TypeInfo xics_kvm_info = {
> -.name  = TYPE_KVM_XICS,
> +static const TypeInfo xics_spapr_kvm_info = {
> +.name  = TYPE_XICS_SPAPR_KVM,
>  .parent= TYPE_XICS_COMMON,
>  .instance_size = sizeof(KVMXICSState),
>  .class_init= xics_kvm_class_init,
> @@ -505,7 +505,7 @@ static const TypeInfo xics_kvm_info = {
>  
>  static void xics_kvm_register_types(void)
>  {
> -type_register_static(&xics_kvm_info);
> +type_register_static(&xics_spapr_kvm_info);
>  type_register_static(&ics_kvm_info);
>  type_register_static(&icp_kvm_info);
>  }
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 0b6bb9c..a8d497c 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -122,7 +122,8 @@ static XICSState *xics_system_

[Qemu-devel] [PATCH 4/4] Revert "mirror: Workaround for unexpected iohandler events during completion"

2016-06-23 Thread Fam Zheng

This reverts commit ab27c3b5e7408693dde0b565f050aa55c4a1bcef.

The virtio host notifiers are now covered by bdrv_drained_begin/end, so
we don't need this hacky quiescing of the iohandler context anymore.

Signed-off-by: Fam Zheng 
---
 block/mirror.c | 9 -
 1 file changed, 9 deletions(-)

diff --git a/block/mirror.c b/block/mirror.c
index a04ed9c..147c0d6 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -500,9 +500,6 @@ static void mirror_exit(BlockJob *job, void *opaque)
 block_job_completed(&s->common, data->ret);
 g_free(data);
 bdrv_drained_end(src);
-if (qemu_get_aio_context() == bdrv_get_aio_context(src)) {
-aio_enable_external(iohandler_get_aio_context());
-}
 bdrv_unref(src);
 }
 
@@ -726,12 +723,6 @@ immediate_exit:
 /* Before we switch to target in mirror_exit, make sure data doesn't
  * change. */
 bdrv_drained_begin(bs);
-if (qemu_get_aio_context() == bdrv_get_aio_context(bs)) {
-/* FIXME: virtio host notifiers run on iohandler_ctx, therefore the
- * above bdrv_drained_end isn't enough to quiesce it. This is ugly, we
- * need a block layer API change to achieve this. */
-aio_disable_external(iohandler_get_aio_context());
-}
 block_job_defer_to_main_loop(&s->common, mirror_exit, data);
 }
 
-- 
2.8.3

[Qemu-devel] [PATCH 2/4] virtio: Always use aio path to set host handler

2016-06-23 Thread Fam Zheng

Apart from the interface difference, the aio version works the same as
the non-aio one. The event notifier versus aio fd handler makes no
difference, except the former led to an ugly patch in commit
ab27c3b5e7, which won't be necessary any more.

As the first step to unify them, all callers are switched to this
renamed aio interface, and a function comment is added.

Signed-off-by: Fam Zheng 
---
 hw/block/dataplane/virtio-blk.c |  6 +++---
 hw/scsi/virtio-scsi-dataplane.c |  9 +
 hw/virtio/virtio-bus.c  | 13 +
 hw/virtio/virtio.c  | 12 +---
 include/hw/virtio/virtio.h  |  6 +++---
 5 files changed, 29 insertions(+), 17 deletions(-)

diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index 2041b04..61d65bb 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -174,8 +174,8 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s)
 
 /* Get this show started by hooking up our callbacks */
 aio_context_acquire(s->ctx);
-virtio_queue_aio_set_host_notifier_handler(s->vq, s->ctx,
-   
virtio_blk_data_plane_handle_output);
+virtio_queue_set_host_notifier_handler(s->vq, s->ctx, true,
+   
virtio_blk_data_plane_handle_output);
 aio_context_release(s->ctx);
 return;
 
@@ -210,7 +210,7 @@ void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s)
 aio_context_acquire(s->ctx);
 
 /* Stop notifications for new requests from guest */
-virtio_queue_aio_set_host_notifier_handler(s->vq, s->ctx, NULL);
+virtio_queue_set_host_notifier_handler(s->vq, s->ctx, false, NULL);
 
 /* Drain and switch bs back to the QEMU main loop */
 blk_set_aio_context(s->conf->conf.blk, qemu_get_aio_context());
diff --git a/hw/scsi/virtio-scsi-dataplane.c b/hw/scsi/virtio-scsi-dataplane.c
index 18ced31..ffabb87 100644
--- a/hw/scsi/virtio-scsi-dataplane.c
+++ b/hw/scsi/virtio-scsi-dataplane.c
@@ -80,7 +80,7 @@ static int virtio_scsi_vring_init(VirtIOSCSI *s, VirtQueue 
*vq, int n,
 return rc;
 }
 
-virtio_queue_aio_set_host_notifier_handler(vq, s->ctx, fn);
+virtio_queue_set_host_notifier_handler(vq, s->ctx, true, fn);
 return 0;
 }
 
@@ -97,10 +97,11 @@ static void virtio_scsi_clear_aio(VirtIOSCSI *s)
 VirtIOSCSICommon *vs = VIRTIO_SCSI_COMMON(s);
 int i;
 
-virtio_queue_aio_set_host_notifier_handler(vs->ctrl_vq, s->ctx, NULL);
-virtio_queue_aio_set_host_notifier_handler(vs->event_vq, s->ctx, NULL);
+virtio_queue_set_host_notifier_handler(vs->ctrl_vq, s->ctx, false, NULL);
+virtio_queue_set_host_notifier_handler(vs->event_vq, s->ctx, false, NULL);
 for (i = 0; i < vs->conf.num_queues; i++) {
-virtio_queue_aio_set_host_notifier_handler(vs->cmd_vqs[i], s->ctx, 
NULL);
+virtio_queue_set_host_notifier_handler(vs->cmd_vqs[i], s->ctx, false,
+   NULL);
 }
 }
 
diff --git a/hw/virtio/virtio-bus.c b/hw/virtio/virtio-bus.c
index f34b4fc..0f81096 100644
--- a/hw/virtio/virtio-bus.c
+++ b/hw/virtio/virtio-bus.c
@@ -166,16 +166,20 @@ static int set_host_notifier_internal(DeviceState *proxy, 
VirtioBusState *bus,
 error_report("%s: unable to init event notifier: %d", __func__, r);
 return r;
 }
-virtio_queue_set_host_notifier_fd_handler(vq, true, true);
+virtio_queue_set_host_notifier_handler(vq, qemu_get_aio_context(),
+   true, NULL);
+
 r = k->ioeventfd_assign(proxy, notifier, n, assign);
 if (r < 0) {
 error_report("%s: unable to assign ioeventfd: %d", __func__, r);
-virtio_queue_set_host_notifier_fd_handler(vq, false, false);
+virtio_queue_set_host_notifier_handler(vq, qemu_get_aio_context(),
+   false, NULL);
 event_notifier_cleanup(notifier);
 return r;
 }
 } else {
-virtio_queue_set_host_notifier_fd_handler(vq, false, false);
+virtio_queue_set_host_notifier_handler(vq, qemu_get_aio_context(),
+   false, NULL);
 k->ioeventfd_assign(proxy, notifier, n, assign);
 event_notifier_cleanup(notifier);
 }
@@ -269,7 +273,8 @@ int virtio_bus_set_host_notifier(VirtioBusState *bus, int 
n, bool assign)
  * ioeventfd and we may end up with a notification where
  * we don't expect one.
  */
-virtio_queue_set_host_notifier_fd_handler(vq, assign, !assign);
+virtio_queue_set_host_notifier_handler(vq, qemu_get_aio_context(),
+   false, NULL);
 if (!assign) {
 /* Use generic ioeventfd handler again. */
 k->ioeventfd_set_disabled(proxy, false);
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index e1e93c7..99cd0c0 100644
--- a/hw/vi

[Qemu-devel] [PATCH 3/4] virtio: Drop the unused virtio_queue_set_host_notifier_fd_handler code

2016-06-23 Thread Fam Zheng

Signed-off-by: Fam Zheng 
---
 hw/virtio/virtio.c | 24 
 include/hw/virtio/virtio.h |  2 --
 2 files changed, 26 deletions(-)

diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 99cd0c0..7a375c1 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -1815,30 +1815,6 @@ void virtio_queue_set_host_notifier_handler(VirtQueue 
*vq, AioContext *ctx,
 }
 }
 
-static void virtio_queue_host_notifier_read(EventNotifier *n)
-{
-VirtQueue *vq = container_of(n, VirtQueue, host_notifier);
-if (event_notifier_test_and_clear(n)) {
-virtio_queue_notify_vq(vq);
-}
-}
-
-void virtio_queue_set_host_notifier_fd_handler(VirtQueue *vq, bool assign,
-   bool set_handler)
-{
-if (assign && set_handler) {
-event_notifier_set_handler(&vq->host_notifier, true,
-   virtio_queue_host_notifier_read);
-} else {
-event_notifier_set_handler(&vq->host_notifier, true, NULL);
-}
-if (!assign) {
-/* Test and clear notifier before after disabling event,
- * in case poll callback didn't have time to run. */
-virtio_queue_host_notifier_read(&vq->host_notifier);
-}
-}
-
 EventNotifier *virtio_queue_get_host_notifier(VirtQueue *vq)
 {
 return &vq->host_notifier;
diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index 9a40df7..49488bf 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -248,8 +248,6 @@ EventNotifier *virtio_queue_get_guest_notifier(VirtQueue 
*vq);
 void virtio_queue_set_guest_notifier_fd_handler(VirtQueue *vq, bool assign,
 bool with_irqfd);
 EventNotifier *virtio_queue_get_host_notifier(VirtQueue *vq);
-void virtio_queue_set_host_notifier_fd_handler(VirtQueue *vq, bool assign,
-   bool set_handler);
 void virtio_queue_set_host_notifier_handler(VirtQueue *vq, AioContext *ctx,
 bool assign,
 VirtQueueHandleOutput 
handle_output);
-- 
2.8.3

[Qemu-devel] [PATCH 1/4] virtio: Add typedef for handle_output

2016-06-23 Thread Fam Zheng

The function pointer signature has been repeated a few times, using a
typedef may make coding easier.

Signed-off-by: Fam Zheng 
---
 hw/virtio/virtio.c | 9 -
 include/hw/virtio/virtio.h | 5 +++--
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 7ed06ea..e1e93c7 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -95,8 +95,8 @@ struct VirtQueue
 int inuse;
 
 uint16_t vector;
-void (*handle_output)(VirtIODevice *vdev, VirtQueue *vq);
-void (*handle_aio_output)(VirtIODevice *vdev, VirtQueue *vq);
+VirtQueueHandleOutput handle_output;
+VirtQueueHandleOutput handle_aio_output;
 VirtIODevice *vdev;
 EventNotifier guest_notifier;
 EventNotifier host_notifier;
@@ -1131,7 +1131,7 @@ void virtio_queue_set_vector(VirtIODevice *vdev, int n, 
uint16_t vector)
 }
 
 VirtQueue *virtio_add_queue(VirtIODevice *vdev, int queue_size,
-void (*handle_output)(VirtIODevice *, VirtQueue *))
+VirtQueueHandleOutput handle_output)
 {
 int i;
 
@@ -1794,8 +1794,7 @@ static void 
virtio_queue_host_notifier_aio_read(EventNotifier *n)
 }
 
 void virtio_queue_aio_set_host_notifier_handler(VirtQueue *vq, AioContext *ctx,
-void 
(*handle_output)(VirtIODevice *,
-  
VirtQueue *))
+VirtQueueHandleOutput 
handle_output)
 {
 if (handle_output) {
 vq->handle_aio_output = handle_output;
diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index 96b581d..faec22a 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -138,9 +138,10 @@ void virtio_cleanup(VirtIODevice *vdev);
 /* Set the child bus name. */
 void virtio_device_set_child_bus_name(VirtIODevice *vdev, char *bus_name);
 
+typedef void (*VirtQueueHandleOutput)(VirtIODevice *, VirtQueue *);
+
 VirtQueue *virtio_add_queue(VirtIODevice *vdev, int queue_size,
-void (*handle_output)(VirtIODevice *,
-  VirtQueue *));
+VirtQueueHandleOutput handle_output);
 
 void virtio_del_queue(VirtIODevice *vdev, int n);
 
-- 
2.8.3
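
The effect of the typedef can be seen in a standalone sketch. ToyDevice,
ToyQueue and the toy_* functions are hypothetical stand-ins for VirtIODevice,
VirtQueue and the virtio helpers; only the shape of the callback typedef
mirrors the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for VirtIODevice and VirtQueue. */
typedef struct ToyDevice { int kicks; } ToyDevice;
typedef struct ToyQueue ToyQueue;

/* One name for the callback signature, instead of repeating it. */
typedef void (*ToyHandleOutput)(ToyDevice *, ToyQueue *);

struct ToyQueue {
    ToyHandleOutput handle_output;
    ToyDevice *dev;
};

/* Attach a device and an optional handler to a queue. */
void toy_queue_init(ToyQueue *vq, ToyDevice *dev, ToyHandleOutput handle_output)
{
    assert(vq != NULL && dev != NULL);
    vq->dev = dev;
    vq->handle_output = handle_output;
}

/* Invoke the handler if one is installed. */
void toy_queue_notify(ToyQueue *vq)
{
    if (vq->handle_output) {
        vq->handle_output(vq->dev, vq);
    }
}

/* Example handler: count notifications on the device. */
void toy_count_kick(ToyDevice *dev, ToyQueue *vq)
{
    (void)vq;
    dev->kicks++;
}
```

Every declaration that takes or stores the callback now reads the same way,
which is the readability gain the commit message refers to.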

[Qemu-devel] [PATCH 0/4] virtio: Merge two host notifier handling paths

2016-06-23 Thread Fam Zheng

This series is based on top of Cornelia's

[PATCH 0/6] virtio: refactor host notifiers

The benefit is that we no longer use event_notifier_set_handler, even in the
non-dataplane case, which in turn makes virtio-blk and virtio-scsi follow block
layer aio context semantics. Specifically, I/O requests must come from
blk_get_aio_context(blk) events, rather than iohandler_get_aio_context(), so
that bdrv_drained_begin/end will work as expected.

Patch 4 reverts the hack (ab27c3b5e7) we added for 2.6. Lately, commit
b880481579 added another pair of bdrv_drained_begin/end so the crash cannot
happen even without ab27c3b5e7, but in order to avoid leaking requests, patch
two is still a must.
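
Why the originating context matters can be shown with a toy model. This is a
deliberately simplified sketch and not QEMU code: ToyContext and the toy_*
functions are hypothetical, and the assumption is only that a drained section
suppresses guest-triggered handlers on the context it is applied to.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for an event loop context. */
typedef struct ToyContext {
    int external_disable_cnt;  /* > 0 while a drained section is active */
    int dispatched;            /* guest notifications that were handled */
} ToyContext;

void toy_drained_begin(ToyContext *ctx)
{
    ctx->external_disable_cnt++;
}

void toy_drained_end(ToyContext *ctx)
{
    assert(ctx->external_disable_cnt > 0);
    ctx->external_disable_cnt--;
}

/* Deliver one guest notification on ctx; returns true if the handler ran. */
bool toy_guest_notify(ToyContext *ctx)
{
    if (ctx->external_disable_cnt > 0) {
        return false;  /* held back until the drained section ends */
    }
    ctx->dispatched++;
    return true;
}
```

In this model, a notification registered on a second context still runs while
the first one is drained. That is the kind of leak the series avoids by
registering the handlers on the block device's own context.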



Fam Zheng (4):
  virtio: Add typedef for handle_output
  virtio: Always use aio path to set host handler
  virtio: Drop the unused virtio_queue_set_host_notifier_fd_handler code
  Revert "mirror: Workaround for unexpected iohandler events during
completion"

 block/mirror.c  |  9 -
 hw/block/dataplane/virtio-blk.c |  6 +++---
 hw/scsi/virtio-scsi-dataplane.c |  9 +
 hw/virtio/virtio-bus.c  | 13 +
 hw/virtio/virtio.c  | 43 -
 include/hw/virtio/virtio.h  | 13 ++---
 6 files changed, 35 insertions(+), 58 deletions(-)

-- 
2.8.3

[Qemu-devel] [PATCH v3 3/3] palmetto-bmc: Configure the SCU's hardware strapping register

2016-06-23 Thread Andrew Jeffery

The magic constant configures the following options:

* 28:27: Configure DRAM size as 256MB
* 26:24: DDR3 SDRAM with CL = 6, CWL = 5
* 23: Configure 24/48MHz CLKIN
* 22: Disable GPIOE pass-through mode
* 21: Disable GPIOD pass-through mode
* 20: Enable LPC decode of SuperIO 0x2E/0x4E addresses
* 19: Disable ACPI
* 18: Configure 48MHz CLKIN
* 17: Disable BMC 2nd boot watchdog timer
* 16: Decode SuperIO address 0x2E
* 15: VGA Class Code
* 14: Enable LPC dedicated reset pin
* 13:12: Enable SPI Master and SPI Slave to AHB Bridge
* 11:10: Select CPU:AHB ratio = 2:1
* 9:8: Select 384MHz H-PLL
* 7: Configure MAC#2 for RMII/NCSI
* 6: Configure MAC#1 for RMII/NCSI
* 5: No VGA BIOS ROM
* 4: Boot using 32bit SPI address mode
* 3:2: Select 16MB VGA memory
* 1:0: Boot from SPI flash memory

Signed-off-by: Andrew Jeffery 
Reviewed-by: Cédric Le Goater 
Reviewed-by: Peter Maydell 
---
 hw/arm/palmetto-bmc.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/hw/arm/palmetto-bmc.c b/hw/arm/palmetto-bmc.c
index a51d960510ee..b8eed21348d8 100644
--- a/hw/arm/palmetto-bmc.c
+++ b/hw/arm/palmetto-bmc.c
@@ -44,6 +44,8 @@ static void palmetto_bmc_init(MachineState *machine)
 &bmc->ram);
 object_property_add_const_link(OBJECT(&bmc->soc), "ram", OBJECT(&bmc->ram),
&error_abort);
+object_property_set_int(OBJECT(&bmc->soc), 0x120CE416, "hw-strap1",
+&error_abort);
 object_property_set_bool(OBJECT(&bmc->soc), true, "realized",
  &error_abort);
 
-- 
2.7.4
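
The bit ranges listed in the commit message can be checked against the
constant with a small helper. This is a sketch only: strap_field is a
hypothetical name, and the code extracts raw field values without interpreting
them. The meaning of each value comes from the list above and the datasheet.

```c
#include <assert.h>
#include <stdint.h>

/* Extract bits hi..lo (inclusive) of a 32-bit register value. */
uint32_t strap_field(uint32_t value, unsigned hi, unsigned lo)
{
    assert(hi < 32 && lo <= hi);
    uint32_t width = hi - lo + 1;
    /* Avoid shifting by 32, which is undefined for a 32-bit type. */
    uint32_t mask = (width == 32) ? UINT32_MAX
                                  : ((UINT32_C(1) << width) - 1);
    return (value >> lo) & mask;
}
```

For 0x120CE416 the raw values are 2 for bits 28:27, 2 for bits 26:24, 2 for
bits 13:12, 1 for bits 11:10, 0 for bits 9:8, 1 for bits 3:2 and 2 for bits
1:0. Bits 19, 18, 15, 14 and 4 are set, and bits 23 to 20, 17, 16 and 7 to 5
are clear.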

[Qemu-devel] [PATCH v3 1/3] hw/misc: Add a model for the ASPEED System Control Unit

2016-06-23 Thread Andrew Jeffery

The SCU is a collection of chip-level control registers that manage the
various functions supported by ASPEED SoCs. Typically the bits control
interactions with clocks, external hardware or reset behaviour, and we
can largely take a hands-off approach to reads and writes.

Firmware makes heavy use of the state to determine how to boot, but the
reset values vary from SoC to SoC (eg AST2400 vs AST2500). A qdev
property is exposed so that the integrating SoC model can configure the
silicon revision, which in turn selects the appropriate reset values.
Further qdev properties are exposed so the board model can configure the
board-dependent hardware strapping.

Almost all provided AST2400 reset values are specified by the datasheet.
The notable exception is SOC_SCRATCH1, where we mark the DRAM as
successfully initialised to avoid unnecessary dark corners in the SoC's
u-boot support.

Signed-off-by: Andrew Jeffery 
---
Since v2:

* Fix mixing of offsets and register indexes
* Sanity check device property values
* Move trace event definition to hw/misc/trace-events

Since v1:

* Move reset values into SCU implementation (also make register defines private)
* Expose silicon-rev property which is used to select appropriate reset values
* Expose hw-strap1/hw-strap2 properties for board-specific SoC configuration
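
The distinction between byte offsets and register indexes mentioned under
"Since v2" can be sketched as follows. TO_REG is taken from the patch below;
ToySCU, TOY_NR_REGS and toy_scu_read are hypothetical names for this
illustration and do not reflect the actual device model.

```c
#include <assert.h>
#include <stdint.h>

#define TO_REG(offset) ((offset) >> 2)  /* byte offset -> 32-bit register index */
#define TOY_NR_REGS TO_REG(0x80)        /* hypothetical size for this sketch */

typedef struct ToySCU {
    uint32_t regs[TOY_NR_REGS];
} ToySCU;

/* MMIO-style read: callers pass a byte offset, storage is indexed by register. */
uint32_t toy_scu_read(const ToySCU *s, uint32_t offset)
{
    uint32_t reg = TO_REG(offset);
    assert(reg < TOY_NR_REGS);
    return s->regs[reg];
}
```

Byte offset 0x70 maps to register index 0x1C here. Keeping the conversion in
one macro makes it harder to index the array with a raw offset by mistake.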

 hw/misc/Makefile.objs|   1 +
 hw/misc/aspeed_scu.c | 284 +++
 hw/misc/trace-events |   3 +
 include/hw/misc/aspeed_scu.h |  34 ++
 4 files changed, 322 insertions(+)
 create mode 100644 hw/misc/aspeed_scu.c
 create mode 100644 include/hw/misc/aspeed_scu.h

diff --git a/hw/misc/Makefile.objs b/hw/misc/Makefile.objs
index ffb49c11aca6..54020aa06c00 100644
--- a/hw/misc/Makefile.objs
+++ b/hw/misc/Makefile.objs
@@ -52,3 +52,4 @@ obj-$(CONFIG_PVPANIC) += pvpanic.o
 obj-$(CONFIG_EDU) += edu.o
 obj-$(CONFIG_HYPERV_TESTDEV) += hyperv_testdev.o
 obj-$(CONFIG_AUX) += aux.o
+obj-$(CONFIG_ASPEED_SOC) += aspeed_scu.o
diff --git a/hw/misc/aspeed_scu.c b/hw/misc/aspeed_scu.c
new file mode 100644
index ..ff231dbb3c17
--- /dev/null
+++ b/hw/misc/aspeed_scu.c
@@ -0,0 +1,284 @@
+/*
+ * ASPEED System Control Unit
+ *
+ * Andrew Jeffery 
+ *
+ * Copyright 2016 IBM Corp.
+ *
+ * This code is licensed under the GPL version 2 or later.  See
+ * the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include 
+#include "hw/misc/aspeed_scu.h"
+#include "hw/qdev-properties.h"
+#include "qapi/error.h"
+#include "qapi/visitor.h"
+#include "qemu/bitops.h"
+#include "trace.h"
+
+#define TO_REG(offset) ((offset) >> 2)
+
+#define PROT_KEY TO_REG(0x00)
+#define SYS_RST_CTRL TO_REG(0x04)
+#define CLK_SEL  TO_REG(0x08)
+#define CLK_STOP_CTRLTO_REG(0x0C)
+#define FREQ_CNTR_CTRL   TO_REG(0x10)
+#define FREQ_CNTR_EVAL   TO_REG(0x14)
+#define IRQ_CTRL TO_REG(0x18)
+#define D2PLL_PARAM  TO_REG(0x1C)
+#define MPLL_PARAM   TO_REG(0x20)
+#define HPLL_PARAM   TO_REG(0x24)
+#define FREQ_CNTR_RANGE  TO_REG(0x28)
+#define MISC_CTRL1   TO_REG(0x2C)
+#define PCI_CTRL1TO_REG(0x30)
+#define PCI_CTRL2TO_REG(0x34)
+#define PCI_CTRL3TO_REG(0x38)
+#define SYS_RST_STATUS   TO_REG(0x3C)
+#define SOC_SCRATCH1 TO_REG(0x40)
+#define SOC_SCRATCH2 TO_REG(0x44)
+#define MAC_CLK_DELAYTO_REG(0x48)
+#define MISC_CTRL2   TO_REG(0x4C)
+#define VGA_SCRATCH1 TO_REG(0x50)
+#define VGA_SCRATCH2 TO_REG(0x54)
+#define VGA_SCRATCH3 TO_REG(0x58)
+#define VGA_SCRATCH4 TO_REG(0x5C)
+#define VGA_SCRATCH5 TO_REG(0x60)
+#define VGA_SCRATCH6 TO_REG(0x64)
+#define VGA_SCRATCH7 TO_REG(0x68)
+#define VGA_SCRATCH8 TO_REG(0x6C)
+#define HW_STRAP1TO_REG(0x70)
+#define RNG_CTRL TO_REG(0x74)
+#define RNG_DATA TO_REG(0x78)
+#define SILICON_REV  TO_REG(0x7C)
+#define PINMUX_CTRL1 TO_REG(0x80)
+#define PINMUX_CTRL2 TO_REG(0x84)
+#define PINMUX_CTRL3 TO_REG(0x88)
+#define PINMUX_CTRL4 TO_REG(0x8C)
+#define PINMUX_CTRL5 TO_REG(0x90)
+#define PINMUX_CTRL6 TO_REG(0x94)
+#define WDT_RST_CTRL TO_REG(0x9C)
+#define PINMUX_CTRL7 TO_REG(0xA0)
+#define PINMUX_CTRL8 TO_REG(0xA4)
+#define PINMUX_CTRL9 TO_REG(0xA8)
+#define WAKEUP_ENTO_REG(0xC0)
+#define WAKEUP_CTRL  TO_REG(0xC4)
+#define HW_STRAP2TO_REG(0xD0)
+#define FREE_CNTR4   TO_REG(0xE0)
+#define FREE_CNTR4_EXT   TO_REG(0xE4)
+#define CPU2_CTRLTO_REG(0x100)
+#define CPU2_BASE_SEG1   TO_REG(0x104)
+#define CPU2_BASE_SEG2   TO_REG(0x108)
+#define CPU2_BASE_SEG3   TO_REG(0x10C)
+#define CPU2_BASE_SEG4   TO_REG(0x110)
+#define CPU2_BASE_SEG5   TO_REG(0x114)
+#define CPU2_CACHE_CTRL  TO_REG(0x118)
+#define UART_HPLL_CLKTO_R

[Qemu-devel] [PATCH v3 2/3] ast2400: Integrate the SCU model and set silicon revision

2016-06-23 Thread Andrew Jeffery

By specifying the silicon revision we select the appropriate reset
values for the SoC.

Additionally, expose hardware strapping properties aliasing those
provided by the SCU for board-specific configuration.

Signed-off-by: Andrew Jeffery 
Reviewed-by: Cédric Le Goater 
Reviewed-by: Peter Maydell 
---
Since v2:

* Configure SoC silicon revision in the SCU via silicon-rev property

Since v1:

* Remove reset value configuration
* Alias the SCU's hardware strapping properties to expose them to boards

 hw/arm/ast2400.c | 21 +
 include/hw/arm/ast2400.h |  2 ++
 2 files changed, 23 insertions(+)

diff --git a/hw/arm/ast2400.c b/hw/arm/ast2400.c
index 4a9de0e10cbc..b14a82fcdef1 100644
--- a/hw/arm/ast2400.c
+++ b/hw/arm/ast2400.c
@@ -24,9 +24,12 @@
 #define AST2400_IOMEM_SIZE   0x0020
 #define AST2400_IOMEM_BASE   0x1E60
 #define AST2400_VIC_BASE 0x1E6C
+#define AST2400_SCU_BASE 0x1E6E2000
 #define AST2400_TIMER_BASE   0x1E782000
 #define AST2400_I2C_BASE 0x1E78A000
 
+#define AST2400_A0_SILICON_REV   0x02000303
+
 static const int uart_irqs[] = { 9, 32, 33, 34, 10 };
 static const int timer_irqs[] = { 16, 17, 18, 35, 36, 37, 38, 39, };
 
@@ -72,6 +75,16 @@ static void ast2400_init(Object *obj)
 object_initialize(&s->i2c, sizeof(s->i2c), TYPE_ASPEED_I2C);
 object_property_add_child(obj, "i2c", OBJECT(&s->i2c), NULL);
 qdev_set_parent_bus(DEVICE(&s->i2c), sysbus_get_default());
+
+object_initialize(&s->scu, sizeof(s->scu), TYPE_ASPEED_SCU);
+object_property_add_child(obj, "scu", OBJECT(&s->scu), NULL);
+qdev_set_parent_bus(DEVICE(&s->scu), sysbus_get_default());
+qdev_prop_set_uint32(DEVICE(&s->scu), "silicon-rev",
+ AST2400_A0_SILICON_REV);
+object_property_add_alias(obj, "hw-strap1", OBJECT(&s->scu),
+  "hw-strap1", &error_abort);
+object_property_add_alias(obj, "hw-strap2", OBJECT(&s->scu),
+  "hw-strap2", &error_abort);
 }
 
 static void ast2400_realize(DeviceState *dev, Error **errp)
@@ -110,6 +123,14 @@ static void ast2400_realize(DeviceState *dev, Error **errp)
 sysbus_connect_irq(SYS_BUS_DEVICE(&s->timerctrl), i, irq);
 }
 
+/* SCU */
+object_property_set_bool(OBJECT(&s->scu), true, "realized", &err);
+if (err) {
+error_propagate(errp, err);
+return;
+}
+sysbus_mmio_map(SYS_BUS_DEVICE(&s->scu), 0, AST2400_SCU_BASE);
+
 /* UART - attach an 8250 to the IO space as our UART5 */
 if (serial_hds[0]) {
 qemu_irq uart5 = qdev_get_gpio_in(DEVICE(&s->vic), uart_irqs[4]);
diff --git a/include/hw/arm/ast2400.h b/include/hw/arm/ast2400.h
index c05ed5376736..f1a64fd3893d 100644
--- a/include/hw/arm/ast2400.h
+++ b/include/hw/arm/ast2400.h
@@ -14,6 +14,7 @@
 
 #include "hw/arm/arm.h"
 #include "hw/intc/aspeed_vic.h"
+#include "hw/misc/aspeed_scu.h"
 #include "hw/timer/aspeed_timer.h"
 #include "hw/i2c/aspeed_i2c.h"
 
@@ -27,6 +28,7 @@ typedef struct AST2400State {
 AspeedVICState vic;
 AspeedTimerCtrlState timerctrl;
 AspeedI2CState i2c;
+AspeedSCUState scu;
 } AST2400State;
 
 #define TYPE_AST2400 "ast2400"
-- 
2.7.4

[Qemu-devel] [PATCH v3 0/3] Add ASPEED SCU device

2016-06-23 Thread Andrew Jeffery

Hi all,

These are three patches implementing minimal functionality for the ASPEED System
Control Unit device and integrating it into the AST2400 SoC model/palmetto-bmc
machine. The device is critical for initialisation of u-boot and the kernel as
it provides chip level control registers, influencing the configuration of the
software and the software's configuration of the SoC.

Since v2:

* Fix mixing of offsets and register indexes
* Sanity check device property values
* SoC actually initialises the silicon revision

Since v1:

* Select reset values based on silicon revision
* Expose hardware strapping values via properties

Andrew Jeffery (3):
  hw/misc: Add a model for the ASPEED System Control Unit
  ast2400: Integrate the SCU model and set silicon revision
  palmetto-bmc: Configure the SCU's hardware strapping register

 hw/arm/ast2400.c |  21 
 hw/arm/palmetto-bmc.c|   2 +
 hw/misc/Makefile.objs|   1 +
 hw/misc/aspeed_scu.c | 284 +++
 hw/misc/trace-events |   3 +
 include/hw/arm/ast2400.h |   2 +
 include/hw/misc/aspeed_scu.h |  34 ++
 7 files changed, 347 insertions(+)
 create mode 100644 hw/misc/aspeed_scu.c
 create mode 100644 include/hw/misc/aspeed_scu.h

-- 
2.7.4

Re: [Qemu-devel] Question about a qemu Aarch64 error when adding several SCSI disks

2016-06-23 Thread Kevin Zhao

Hi Peter,
 Following your advice, I have compiled QEMU v2.6.
stack@u202158:~$ kvm --version
QEMU emulator version 2.6.50 (v2.6.0-1280-g6f1d2d1-dirty), Copyright (c)
2003-2008 Fabrice Bellard
 With this newest version, I used virt-manager to create the guest; the
XML file is in the attachment. But QEMU returns an error when creating it:
 *error: internal error: process exited while connecting to monitor:
qemu-system-aarch64: -device
pci-bridge,chassis_nr=2,id=pci,bus=pci,addr=0x1: Duplicate ID 'pci' for
device*

 The guest XML file is in the attachment. The same XML worked with QEMU
v2.4.0.
  I also deleted the following items from the XML:
  -  
  -  
  -
  -
  -  
  -  
  -
  -
  -
  -  
  Using virsh create guest.xml, I got the same error:
  *error: internal error: process exited while connecting to monitor:
qemu-system-aarch64: -device
pci-bridge,chassis_nr=2,id=pci,bus=pci,addr=0x1: Duplicate ID 'pci' for
device.*
My test machine is a Softiron, with an AMD ARM64 server CPU. The libvirt
version is 1.3.1.

 I would really appreciate your help :-)
 Big Thanks~

Best Regards,
Kevin Zhao


On 22 June 2016 at 20:34, Kevin Zhao  wrote:

> Hi Peter,
>Should I use the newest version v 2.6.0 ?
>
> On 22 June 2016 at 20:04, Peter Maydell  wrote:
>
>> On 22 June 2016 at 12:51, Kevin Zhao  wrote:
>> > Hi All,
>> >Greetings from Linaro. This is Kevin from Linaro, and recently I
>> > have met a problem of qemu-system-aarch64 when I am working on
>> virt-manager.
>> >I have reported a bug here:
>> > https://bugs.launchpad.net/qemu/+bug/1594239
>> >
>> >It's mainly about adding several SCSI disks to Aarch64 guests.
>> >If you have a moment, pls kindly give some advice about
>> this. Thanks
>>
>> Can you reproduce the bug with a newer version of QEMU?
>>
>> thanks
>> -- PMM
>>
>
>
   
  generic
  3e541395-28c1-41f6-ba7a-14b648f82d84   
  1048576 
  1048576   
  1   
  
hvm
/usr/share/AAVMF/AAVMF_CODE.fd  
/var/lib/libvirt/qemu/nvram/generic_VARS.fd
  
 

 
  destroy  
  restart  
  restart
 
/usr/bin/kvm

Re: [Qemu-devel] [PATCH 1/3] qapi: Report support for -device cpu hotplug in query-machines

2016-06-23 Thread David Gibson

On Thu, 23 Jun 2016 21:49:25 -0600
Eric Blake  wrote:

> On 06/23/2016 08:56 PM, David Gibson wrote:
> > On Thu, 23 Jun 2016 22:23:23 +0200
> > Peter Krempa  wrote:
> >   
> >> For management apps it's very useful to know whether the selected
> >> machine type supports cpu hotplug via the new -device approach. Using
> >> the presence of 'query-hotpluggable-cpus' is enough for a witness.
> >>  
> 
> > 
> > I'd been under the impression that there was a general way of detecting
> > the availability of a particular qmp command.  Was I mistaken?  
> 
> You are correct - query-commands says whether 'query-hotpluggable-cpus'
> exists as a command.  But that is insufficient.  See my review, or the
> v2 patch, where the above poor wording was corrected to say what was
> really meant: knowing whether query-hotpluggable-cpus exists is
> insufficient to tell you whether a given cpu type can be hotplugged.  So
> adding one more piece of witness (for every type of cpu supported, we
> also advertise if it is hotpluggable) is enough for libvirt to
> efficiently take advantage of the new query-hotpluggable-cpus command.

Ah, right.  Or to put it another way, the availability of
query-hotpluggable-cpus is global across qemu, whereas actually being
able to use it for hotplug is per machine type.

Would it be possible to do this instead by attempting to invoke
query-hotpluggable-cpus and seeing if it returns any information?


-- 
David Gibson 
Senior Software Engineer, Virtualization, Red Hat



[Qemu-devel] [PATCH] target-sparc: Use overalignment flags for twinx and block asis

2016-06-23 Thread Richard Henderson

This allows us to enforce 16 and 64-byte alignment
without any extra overhead.

Signed-off-by: Richard Henderson 
---

This patch is dependent on my sparc improvements branch, along with
Sergey's alignment improvement patch.  A buildable tree is at

  git://github.com/rth7680/qemu.git tgt-sparc-tmp


r~
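
For readers unfamiliar with the idea, the reasoning behind checking alignment only on the first access can be modelled in plain C. This is a sketch of the arithmetic only, not of the TCG implementation; all `demo_` names are hypothetical. If the first 8-byte access of a block is 64-byte aligned, the seven that follow at +8 steps are necessarily 8-byte aligned, so no separate check is needed.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if addr is a multiple of align_bytes (which must be a power of two). */
static bool demo_is_aligned(uint64_t addr, uint64_t align_bytes)
{
    return (addr & (align_bytes - 1)) == 0;
}

/* Model of a 64-byte block transfer performed as eight 8-byte accesses.
 * Only the first access carries the stricter 64-byte requirement. */
static bool demo_block_access_ok(uint64_t addr)
{
    for (int i = 0; i < 8; i++) {
        uint64_t required = (i == 0) ? 64 : 8;

        if (!demo_is_aligned(addr, required)) {
            return false;
        }
        addr += 8;
    }
    return true;
}

/* Model of a 16-byte twin access performed as two 8-byte accesses. */
static bool demo_twin_access_ok(uint64_t addr)
{
    return demo_is_aligned(addr, 16) && demo_is_aligned(addr + 8, 8);
}
```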


 target-sparc/translate.c | 24 +++-
 1 file changed, 15 insertions(+), 9 deletions(-)

diff --git a/target-sparc/translate.c b/target-sparc/translate.c
index 28416fa..f384cbf 100644
--- a/target-sparc/translate.c
+++ b/target-sparc/translate.c
@@ -2377,6 +2377,7 @@ static void gen_ldf_asi(DisasContext *dc, TCGv addr,
 tcg_gen_qemu_ld_i64(cpu_fpr[rd / 2], addr, da.mem_idx, da.memop);
 break;
 case 16:
+/* Only 8 byte alignment required, which is automatic here.  */
 tcg_gen_qemu_ld_i64(cpu_fpr[rd / 2], addr, da.mem_idx, da.memop);
 tcg_gen_addi_tl(addr, addr, 8);
 tcg_gen_qemu_ld_i64(cpu_fpr[rd/2+1], addr, da.mem_idx, da.memop);
@@ -2389,20 +2390,23 @@ static void gen_ldf_asi(DisasContext *dc, TCGv addr,
 case GET_ASI_BLOCK:
 /* Valid for lddfa on aligned registers only.  */
 if (size == 8 && (rd & 7) == 0) {
+TCGMemOp memop;
 TCGv eight;
 int i;
 
-gen_check_align(addr, 0x3f);
 gen_address_mask(dc, addr);
 
+/* The first operation checks required alignment.  */
+memop = da.memop | MO_ALIGN_64;
 eight = tcg_const_tl(8);
 for (i = 0; ; ++i) {
 tcg_gen_qemu_ld_i64(cpu_fpr[rd / 2 + i], addr,
-da.mem_idx, da.memop);
+da.mem_idx, memop);
 if (i == 7) {
 break;
 }
 tcg_gen_add_tl(addr, addr, eight);
+memop = da.memop;
 }
 tcg_temp_free(eight);
 } else {
@@ -2445,6 +2449,7 @@ static void gen_ldf_asi(DisasContext *dc, TCGv addr,
 gen_helper_ld_asi(cpu_fpr[rd / 2], cpu_env, addr, r_asi, 
r_mop);
 break;
 case 16:
+/* Only 8 byte alignment required, which is automatic here.  */
 gen_helper_ld_asi(cpu_fpr[rd / 2], cpu_env, addr, r_asi, 
r_mop);
 tcg_gen_addi_tl(addr, addr, 8);
 gen_helper_ld_asi(cpu_fpr[rd/2+1], cpu_env, addr, r_asi, 
r_mop);
@@ -2480,6 +2485,7 @@ static void gen_stf_asi(DisasContext *dc, TCGv addr,
 tcg_gen_qemu_st_i64(cpu_fpr[rd / 2], addr, da.mem_idx, da.memop);
 break;
 case 16:
+/* Only 8 byte alignment required, which is automatic here.  */
 tcg_gen_qemu_st_i64(cpu_fpr[rd / 2], addr, da.mem_idx, da.memop);
 tcg_gen_addi_tl(addr, addr, 8);
 tcg_gen_qemu_st_i64(cpu_fpr[rd/2+1], addr, da.mem_idx, da.memop);
@@ -2492,20 +2498,23 @@ static void gen_stf_asi(DisasContext *dc, TCGv addr,
 case GET_ASI_BLOCK:
 /* Valid for stdfa on aligned registers only.  */
 if (size == 8 && (rd & 7) == 0) {
+TCGMemOp memop;
 TCGv eight;
 int i;
 
-gen_check_align(addr, 0x3f);
 gen_address_mask(dc, addr);
 
+/* The first operation checks required alignment.  */
+memop = da.memop | MO_ALIGN_64;
 eight = tcg_const_tl(8);
 for (i = 0; ; ++i) {
 tcg_gen_qemu_st_i64(cpu_fpr[rd / 2 + i], addr,
-da.mem_idx, da.memop);
+da.mem_idx, memop);
 if (i == 7) {
 break;
 }
 tcg_gen_add_tl(addr, addr, eight);
+memop = da.memop;
 }
 tcg_temp_free(eight);
 } else {
@@ -2543,9 +2552,8 @@ static void gen_ldda_asi(DisasContext *dc, TCGv addr, int 
insn, int rd)
 return;
 
 case GET_ASI_DTWINX:
-gen_check_align(addr, 15);
 gen_address_mask(dc, addr);
-tcg_gen_qemu_ld_i64(hi, addr, da.mem_idx, da.memop);
+tcg_gen_qemu_ld_i64(hi, addr, da.mem_idx, da.memop | MO_ALIGN_16);
 tcg_gen_addi_tl(addr, addr, 8);
 tcg_gen_qemu_ld_i64(lo, addr, da.mem_idx, da.memop);
 break;
@@ -2598,9 +2606,8 @@ static void gen_stda_asi(DisasContext *dc, TCGv hi, TCGv 
addr,
 break;
 
 case GET_ASI_DTWINX:
-gen_check_align(addr, 15);
 gen_address_mask(dc, addr);
-tcg_gen_qemu_st_i64(hi, addr, da.mem_idx, da.memop);
+tcg_gen_qemu_st_i64(hi, addr, da.mem_idx, da.memop | MO_ALIGN_16);
 tcg_gen_addi_tl(addr, addr, 8);
 tcg_gen_qemu_st_i64(lo, addr, da.mem_idx, da.memop);
 break;
@@ -5469,7 +5476,6 @@ static void disas_sparc_insn(DisasContext * dc, unsigned 
int insn)
 if (gen_trap_

Re: [Qemu-devel] [PATCH v4] Improve the alignment check infrastructure

2016-06-23 Thread Richard Henderson


On 06/23/2016 12:18 PM, Richard Henderson wrote:

On 06/23/2016 11:16 AM, Sergey Sorokin wrote:

+#if defined(CONFIG_SOFTMMU)
+/**
+ * get_alignment_bits
+ * @memop: TCGMemOp value
+ *
+ * Extract the alignment size from the memop.
+ *
+ * Returns: 0 in case of byte access (which is always aligned);
+ *  positive value - number of alignment bits;
+ *  negative value if unaligned access enabled
+ *  and this is not a byte access.
+ */
+static inline int get_alignment_bits(TCGMemOp memop)
+{
+int a = memop & MO_AMASK;
+int s = memop & MO_SIZE;
+
+if (a == MO_UNALN) {
+/* Negative value if unaligned access enabled,
+ * or zero value in case of byte access.
+ */
+return -s;
+} else if (a == MO_ALIGN) {
+tcg_debug_assert((TLB_FLAGS_MASK & ((1 << s) - 1)) == 0);
+/* A natural alignment: return a number of access size bits */
+return s;
+} else {
+/* Specific alignment size. It must be equal or greater
+ * than the access size.
+ */
+a >>= MO_ASHIFT;
+tcg_debug_assert(a >= s);
+tcg_debug_assert((TLB_FLAGS_MASK & ((1 << a) - 1)) == 0);
+return a;
+}
+}
+#endif  /* CONFIG_SOFTMMU */


While it's true that usermode doesn't support alignment checks at all (either
direction), I'd prefer to leave the function available and isolate the one
assert that caused your build problem.  E.g.

static inline int get_alignment_bits(TCGMemOp memop)
{
int a = memop & MO_AMASK;
int s = memop & MO_SIZE;
int r;

...
} else if (a == MO_ALIGN) {
/* A natural alignment: return a number of access size bits */
r = s;
} else {
/* Specific alignment size. It must be equal or greater
 * than the access size.
 */
r = a >> MO_ASHIFT;
tcg_debug_assert(r >= s);
}
#ifdef CONFIG_SOFTMMU
/* Make sure requested alignment doesn't overlap TLB flags.  */
tcg_debug_assert((TLB_FLAGS_MASK & ((1 << r) - 1)) == 0);
#endif
return r;
}


I'll give this a test with some target-sparc changes where this will be useful.


I've merged the patch with the above change for tcg-next.
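
To make the return convention discussed above concrete, here is a self-contained model of the function. The `DEMO_` constants are stand-in values chosen for illustration and are an assumption; they are not guaranteed to match the real TCGMemOp encoding, and the TLB-flag assertion is left out.

```c
#include <assert.h>

enum {
    DEMO_SIZE   = 3,                 /* low bits: log2 of the access size */
    DEMO_ASHIFT = 4,
    DEMO_AMASK  = 7 << DEMO_ASHIFT,  /* alignment field */
    DEMO_UNALN  = 0,                 /* assumption: unaligned access allowed */
    DEMO_ALIGN  = DEMO_AMASK         /* assumption: natural alignment */
};

/* Returns 0 for byte access, a positive number of alignment bits when
 * alignment is required, and a negative value when unaligned access is
 * allowed for a non-byte access. */
static int demo_alignment_bits(unsigned memop)
{
    unsigned a = memop & DEMO_AMASK;
    int s = (int)(memop & DEMO_SIZE);
    int r;

    if (a == DEMO_UNALN) {
        r = -s;
    } else if (a == DEMO_ALIGN) {
        r = s;                        /* natural alignment */
    } else {
        r = (int)(a >> DEMO_ASHIFT);  /* explicit alignment size */
        assert(r >= s);
    }
    return r;
}
```

With these stand-in values, an 8-byte access (size bits 3) with an explicit 64-byte alignment request yields 6, while the same access with no alignment bits set yields -3.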


r~

Re: [Qemu-devel] [PATCH 06/12] monitor: remove mhandler.cmd_new

2016-06-23 Thread Eric Blake

On 06/22/2016 06:08 PM, marcandre.lur...@redhat.com wrote:
> From: Marc-André Lureau 
> 
> This is no longer necessary, now that middle mode has been removed.
> 
> Signed-off-by: Marc-André Lureau 
> ---
>  docs/writing-qmp-commands.txt |   8 +-
>  hmp-commands-info.hx  | 118 
>  hmp-commands.hx   | 206 
> +-
>  monitor.c |  11 +--
>  4 files changed, 168 insertions(+), 175 deletions(-)

The changes to hmp-commands*.hx are mechanical; you may want to set up
git to order the diff so that more relevant files like monitor.c are
listed first.  Do that by 'git config diff.orderFile /path/to/file',
then in that file, listing globs in the order that you want to prioritize.

> +++ b/monitor.c
> @@ -130,13 +130,10 @@ typedef struct mon_cmd_t {
>  const char *args_type;
>  const char *params;
>  const char *help;
> -union {
> -void (*cmd)(Monitor *mon, const QDict *qdict);
> -void (*cmd_new)(QDict *params, QObject **ret_data, Error **errp);
> -} mhandler;
> +void (*cmd)(Monitor *mon, const QDict *qdict);
>  /* @sub_table is a list of 2nd level of commands. If it do not exist,

As long as you are touching this, fix the grammar bug: s/do/does/

> - * mhandler should be used. If it exist, sub_table[?].mhandler should be
> - * used, and mhandler of 1st level plays the role of help function.
> + * cmd should be used. If it exist, sub_table[?].cmd should be

s/exist/exists/

> + * used, and cmd of 1st level plays the role of help function.
>   */
>  struct mon_cmd_t *sub_table;
>  void (*command_completion)(ReadLineState *rs, int nb_args, const char 
> *str);
> @@ -2927,7 +2924,7 @@ static void handle_hmp_command(Monitor *mon, const char 
> *cmdline)
>  return;
>  }
>  
> -cmd->mhandler.cmd(mon, qdict);
> +cmd->cmd(mon, qdict);
>  QDECREF(qdict);
>  }
>  
> 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH 05/12] monitor: register the qapi generated commands

2016-06-23 Thread Eric Blake

On 06/22/2016 06:08 PM, marcandre.lur...@redhat.com wrote:
> From: Marc-André Lureau 
> 
> Stop using the so-called 'middle' mode. Instead, use qmp_find_command()
> from generated qapi commands registry.
> 
> Note: this commit requires a 'make clean' prior to make, since the
> generated files do not depend on Makefile (due to a cyclic rule
> introduced in 4115852bb0).

I'm not as well-versed as Paolo on Makefile issues, so I won't comment
on that part of the patch.

> 
> Signed-off-by: Marc-André Lureau 
> ---
>  Makefile|   2 +-
>  monitor.c   |  12 +++--
>  qmp-commands.hx | 143 
> 
>  vl.c|   1 +
>  4 files changed, 11 insertions(+), 147 deletions(-)
> 

> +++ b/monitor.c
> @@ -3884,6 +3884,7 @@ static void handle_qmp_command(JSONMessageParser 
> *parser, GQueue *tokens)
>  QObject *obj, *data;
>  QDict *input, *args;
>  const mon_cmd_t *cmd;
> +QmpCommand *qcmd;
>  const char *cmd_name;
>  Monitor *mon = cur_mon;
>  
> @@ -3909,7 +3910,8 @@ static void handle_qmp_command(JSONMessageParser 
> *parser, GQueue *tokens)
>  cmd_name = qdict_get_str(input, "execute");
>  trace_handle_qmp_command(mon, cmd_name);
>  cmd = qmp_find_cmd(cmd_name);
> -if (!cmd) {
> +qcmd = qmp_find_command(cmd_name);

Is it worth creating a single type that contains both mon_cmd_t and
QmpCommand information, so that we don't need two similarly-named
functions to look up two related pieces of information?  Not necessarily
in this patch.
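
As a rough illustration of what I mean (a sketch only, with made-up `Demo` names and a simplified handler signature, not a proposal for the actual layout), a single table entry could carry both the monitor metadata and the handler, so one lookup serves both purposes:

```c
#include <stddef.h>
#include <string.h>

typedef int (*DemoHandler)(int arg);

/* One entry holding both kinds of information. */
typedef struct {
    const char *name;
    const char *args_type;   /* monitor-side metadata */
    DemoHandler fn;          /* handler that would come from the generator */
} DemoCommand;

static int demo_quit(int arg) { return arg; }
static int demo_query_version(int arg) { return arg + 1; }

static const DemoCommand demo_commands[] = {
    { "quit", "", demo_quit },
    { "query-version", "", demo_query_version },
};

/* Single lookup by name; returns NULL if the command is unknown. */
static const DemoCommand *demo_find_command(const char *name)
{
    size_t n = sizeof(demo_commands) / sizeof(demo_commands[0]);

    for (size_t i = 0; i < n; i++) {
        if (strcmp(demo_commands[i].name, name) == 0) {
            return &demo_commands[i];
        }
    }
    return NULL;
}
```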

> +if (!qcmd || !cmd) {
>  error_set(&local_err, ERROR_CLASS_COMMAND_NOT_FOUND,
>"The command %s has not been found", cmd_name);
>  goto err_out;
> @@ -3931,7 +3933,7 @@ static void handle_qmp_command(JSONMessageParser 
> *parser, GQueue *tokens)
>  goto err_out;
>  }
>  
> -cmd->mhandler.cmd_new(args, &data, &local_err);
> +qcmd->fn(args, &data, &local_err);
>  
>  err_out:
>  monitor_protocol_emitter(mon, data, local_err);
> @@ -4000,9 +4002,13 @@ void monitor_resume(Monitor *mon)
>  
>  static QObject *get_qmp_greeting(void)
>  {
> +QmpCommand *cmd;
>  QObject *ver = NULL;
>  
> -qmp_marshal_query_version(NULL, &ver, NULL);
> +cmd = qmp_find_command("query-version");
> +assert(cmd && cmd->fn);
> +cmd->fn(NULL, &ver, NULL);
> +
>  return qobject_from_jsonf("{'QMP':{'version': %p,'capabilities': 
> []}}",ver);

Worth fixing the missing space after ',' while touching near here?

>  }
>  
> diff --git a/qmp-commands.hx b/qmp-commands.hx
> index ee88e48..95c1e7d 100644
> --- a/qmp-commands.hx
> +++ b/qmp-commands.hx
> @@ -63,7 +63,6 @@ EQMP
>  {
>  .name   = "quit",
>  .args_type  = "",
> -.mhandler.cmd_new = qmp_marshal_quit,

At one point, I posted an RFC patch for removing .args_type on most QMP
command listings in this file. Maybe you still do that later in your
series, but as my work definitely conflicts with yours, and your series
is older, I don't mind getting through your series first.

Overall, I like the general idea of automating things rather than having
to duplicate information in qmp-commands.hx.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org




Re: [Qemu-devel] [PATCH 04/12] monitor: remove usage of generated marshal functions

2016-06-23 Thread Eric Blake

On 06/22/2016 06:08 PM, marcandre.lur...@redhat.com wrote:
> From: Marc-André Lureau 
> 
> Once the middle mode is removed, the generated marshal functions will no
> longer be exported.
> 
> Signed-off-by: Marc-André Lureau 
> ---
>  monitor.c | 10 +-
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/monitor.c b/monitor.c
> index fc691b9..585bc1f 100644
> --- a/monitor.c
> +++ b/monitor.c
> @@ -3609,21 +3609,21 @@ static int monitor_can_read(void *opaque)
>  return (mon->suspend_cnt == 0) ? 1 : 0;
>  }
>  
> -static bool invalid_qmp_mode(const Monitor *mon, const mon_cmd_t *cmd,
> +static bool invalid_qmp_mode(const Monitor *mon, const gchar *cmd,

Why 'gchar'?  What's wrong with 'char'?  (Some of glib's typedefs make
sense, but gchar is not one of them)

> @@ -3914,7 +3914,7 @@ static void handle_qmp_command(JSONMessageParser 
> *parser, GQueue *tokens)
>"The command %s has not been found", cmd_name);
>  goto err_out;
>  }
> -if (invalid_qmp_mode(mon, cmd, &local_err)) {
> +if (invalid_qmp_mode(mon, cmd_name, &local_err)) {
>  goto err_out;

Particularly since cmd_name is const char * in the caller.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org




Re: [Qemu-devel] Qemu and heavily increased RSS usage

2016-06-23 Thread Paolo Bonzini


> > If it's 10M nothing.  If there is a 100M regression that is also caused
> > by RCU, we have to give up on it for that data structure, or mmap/munmap
> > the affected data structures.
> 
> If it was only 10MB I would agree. But if I run the VM described earlier
> in this thread it goes from ~35MB with Qemu-2.2.0 to ~130-150MB with
> current master. This is with coroutine pool disabled. With the coroutine pool
> it can grow to something like 300-350MB.
> 
> Is there an easy way to determine if RCU is the problem? I have the same
> symptoms, valgrind doesn't see the allocated memory. Is it possible
> to make rcu_call directly invoke the function - maybe with a lock around it
> that serializes the calls? Even if it's expensive it might show if we are
> searching in the right place.

Yes, you can do that.  Just make it call the function without locks, for
a quick PoC it will be okay.
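
Something along these lines, as a self-contained model rather than the real code (all `demo_` names are hypothetical, and calling the callback immediately is only acceptable for a diagnostic experiment, since readers may still hold references):

```c
#include <stddef.h>

typedef struct DemoRcuHead DemoRcuHead;
typedef void DemoRcuFunc(DemoRcuHead *head);

struct DemoRcuHead {
    DemoRcuHead *next;
    DemoRcuFunc *func;
};

static DemoRcuHead *demo_pending;   /* callbacks waiting for a grace period */
static int demo_use_immediate;      /* debugging switch for the experiment */
static int demo_freed;              /* counts callbacks that have run */

static void demo_free_cb(DemoRcuHead *head)
{
    (void)head;
    demo_freed++;
}

/* Normally the callback is queued; in the experiment it runs right away. */
static void demo_call_rcu(DemoRcuHead *head, DemoRcuFunc *func)
{
    if (demo_use_immediate) {
        func(head);
        return;
    }
    head->func = func;
    head->next = demo_pending;
    demo_pending = head;
}

/* Stand-in for the reclaim thread running after a grace period. */
static void demo_drain(void)
{
    while (demo_pending) {
        DemoRcuHead *head = demo_pending;

        demo_pending = head->next;
        head->func(head);
    }
}
```

If the memory growth disappears with the immediate path, deferred reclamation is the likely cause.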

Paolo

[Qemu-devel] [PATCH 1/2] nbd: Convert to byte-based interface

2016-06-23 Thread Eric Blake

The NBD protocol doesn't have any notion of sectors, so it is
a fairly easy conversion to use byte-based read and write.

Signed-off-by: Eric Blake 
---
 block/nbd-client.h  |  8 
 include/block/nbd.h |  1 -
 block/nbd-client.c  | 30 +-
 block/nbd.c | 12 ++--
 4 files changed, 27 insertions(+), 24 deletions(-)

diff --git a/block/nbd-client.h b/block/nbd-client.h
index 62dec33..fa9817b 100644
--- a/block/nbd-client.h
+++ b/block/nbd-client.h
@@ -46,10 +46,10 @@ void nbd_client_close(BlockDriverState *bs);

 int nbd_client_co_pdiscard(BlockDriverState *bs, int64_t offset, int count);
 int nbd_client_co_flush(BlockDriverState *bs);
-int nbd_client_co_writev(BlockDriverState *bs, int64_t sector_num,
- int nb_sectors, QEMUIOVector *qiov, int flags);
-int nbd_client_co_readv(BlockDriverState *bs, int64_t sector_num,
-int nb_sectors, QEMUIOVector *qiov);
+int nbd_client_co_pwritev(BlockDriverState *bs, uint64_t offset,
+  uint64_t bytes, QEMUIOVector *qiov, int flags);
+int nbd_client_co_preadv(BlockDriverState *bs, uint64_t offset,
+ uint64_t bytes, QEMUIOVector *qiov, int flags);

 void nbd_client_detach_aio_context(BlockDriverState *bs);
 void nbd_client_attach_aio_context(BlockDriverState *bs,
diff --git a/include/block/nbd.h b/include/block/nbd.h
index 503f514..cb91820 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -77,7 +77,6 @@ enum {

 /* Maximum size of a single READ/WRITE data buffer */
 #define NBD_MAX_BUFFER_SIZE (32 * 1024 * 1024)
-#define NBD_MAX_SECTORS (NBD_MAX_BUFFER_SIZE / BDRV_SECTOR_SIZE)

 /* Maximum size of an export name. The NBD spec requires 256 and
  * suggests that servers support up to 4096, but we stick to only the
diff --git a/block/nbd-client.c b/block/nbd-client.c
index 070bb9d..647dedd 100644
--- a/block/nbd-client.c
+++ b/block/nbd-client.c
@@ -218,17 +218,20 @@ static void nbd_coroutine_end(NbdClientSession *s,
 }
 }

-int nbd_client_co_readv(BlockDriverState *bs, int64_t sector_num,
-int nb_sectors, QEMUIOVector *qiov)
+int nbd_client_co_preadv(BlockDriverState *bs, uint64_t offset,
+ uint64_t bytes, QEMUIOVector *qiov, int flags)
 {
 NbdClientSession *client = nbd_get_client_session(bs);
-struct nbd_request request = { .type = NBD_CMD_READ };
+struct nbd_request request = {
+.type = NBD_CMD_READ,
+.from = offset,
+.len = bytes,
+};
 struct nbd_reply reply;
 ssize_t ret;

-assert(nb_sectors <= NBD_MAX_SECTORS);
-request.from = sector_num * 512;
-request.len = nb_sectors * 512;
+assert(bytes <= NBD_MAX_BUFFER_SIZE);
+assert(!flags);

 nbd_coroutine_start(client, &request);
 ret = nbd_co_send_request(bs, &request, NULL);
@@ -239,14 +242,17 @@ int nbd_client_co_readv(BlockDriverState *bs, int64_t 
sector_num,
 }
 nbd_coroutine_end(client, &request);
 return -reply.error;
-
 }

-int nbd_client_co_writev(BlockDriverState *bs, int64_t sector_num,
- int nb_sectors, QEMUIOVector *qiov, int flags)
+int nbd_client_co_pwritev(BlockDriverState *bs, uint64_t offset,
+  uint64_t bytes, QEMUIOVector *qiov, int flags)
 {
 NbdClientSession *client = nbd_get_client_session(bs);
-struct nbd_request request = { .type = NBD_CMD_WRITE };
+struct nbd_request request = {
+.type = NBD_CMD_WRITE,
+.from = offset,
+.len = bytes,
+};
 struct nbd_reply reply;
 ssize_t ret;

@@ -255,9 +261,7 @@ int nbd_client_co_writev(BlockDriverState *bs, int64_t 
sector_num,
 request.type |= NBD_CMD_FLAG_FUA;
 }

-assert(nb_sectors <= NBD_MAX_SECTORS);
-request.from = sector_num * 512;
-request.len = nb_sectors * 512;
+assert(bytes <= NBD_MAX_BUFFER_SIZE);

 nbd_coroutine_start(client, &request);
 ret = nbd_co_send_request(bs, &request, qiov);
diff --git a/block/nbd.c b/block/nbd.c
index 42cae0e..8d57220 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -438,8 +438,8 @@ static BlockDriver bdrv_nbd = {
 .instance_size  = sizeof(BDRVNBDState),
 .bdrv_parse_filename= nbd_parse_filename,
 .bdrv_file_open = nbd_open,
-.bdrv_co_readv  = nbd_client_co_readv,
-.bdrv_co_writev_flags   = nbd_client_co_writev,
+.bdrv_co_preadv = nbd_client_co_preadv,
+.bdrv_co_pwritev= nbd_client_co_pwritev,
 .bdrv_close = nbd_close,
 .bdrv_co_flush_to_os= nbd_co_flush,
 .bdrv_co_pdiscard   = nbd_client_co_pdiscard,
@@ -456,8 +456,8 @@ static BlockDriver bdrv_nbd_tcp = {
 .instance_size  = sizeof(BDRVNBDState),
 .bdrv_parse_filename= nbd_parse_filename,
 .bdrv_file_open = nbd_open,
-.bdrv_co_readv  = nbd_client_co_readv,
-

Re: [Qemu-devel] [PATCH 03/12] monitor: register gen:false commands manually

2016-06-23 Thread Eric Blake

On 06/22/2016 06:08 PM, marcandre.lur...@redhat.com wrote:
> From: Marc-André Lureau 
> 
> Since a few commands are using 'gen': false, they are not registered
> automatically by the generator. Register manually instead.
> 
> This is in preparation for removal of qapi 'middle' mode generation.
> 
> Signed-off-by: Marc-André Lureau 
> ---
>  monitor.c | 11 +++
>  1 file changed, 11 insertions(+)
> 

Reviewed-by: Eric Blake 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org




[Qemu-devel] [PATCH 0/2] Switch raw NBD to byte-based

2016-06-23 Thread Eric Blake

With these two patches, I'm finally able to run:

./qemu-nbd -f raw -x foo file
./qemu-io -f raw -t none nbd://localhost:10809/foo

and get true byte-based access over the wire for operations such
as 'r 1 1' or 'w 1 1', rather than RMW sector-aligned access.
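
To show what the sector-aligned path costs, here is a small sketch of the range a sector-based driver has to touch for a given byte request. The `Demo` names and the 512-byte constant are assumptions for illustration; this is not code from the series.

```c
#include <stdint.h>

#define DEMO_SECTOR_SIZE 512u

typedef struct {
    uint64_t offset;
    uint64_t bytes;
} DemoRange;

/* Widen a byte range to the enclosing sector-aligned range. */
static DemoRange demo_sector_aligned(uint64_t offset, uint64_t bytes)
{
    uint64_t mask = ~(uint64_t)(DEMO_SECTOR_SIZE - 1);
    uint64_t start = offset & mask;
    uint64_t end = (offset + bytes + DEMO_SECTOR_SIZE - 1) & mask;
    DemoRange r = { start, end - start };

    return r;
}
```

With a sector-based interface, 'w 1 1' becomes a read of bytes 0..511, a one-byte modification, and a write of all 512 bytes; with a byte-based interface the request can go over the wire as offset 1, length 1.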

Depends on these series:
v3 Byte-based block limits:
https://lists.gnu.org/archive/html/qemu-devel/2016-06/msg06983.html
v1 Auto-fragment large transactions at the block layer:
https://lists.gnu.org/archive/html/qemu-devel/2016-06/msg05819.html
v1 byte-based block discard:
https://lists.gnu.org/archive/html/qemu-devel/2016-06/msg06491.html

Eric Blake (2):
  nbd: Convert to byte-based interface
  raw_bsd: Convert to byte-based interface

 block/nbd-client.h  |  8 
 include/block/nbd.h |  1 -
 block/nbd-client.c  | 30 +-
 block/nbd.c | 12 ++--
 block/raw_bsd.c | 35 +--
 5 files changed, 44 insertions(+), 42 deletions(-)

-- 
2.5.5

[Qemu-devel] [PATCH v3 5/9] tcg: Reorg TCGOp chaining

2016-06-23 Thread Richard Henderson

Instead of using -1 as end of chain, use 0, and link through the 0
entry as a fully circular double-linked list.

Signed-off-by: Richard Henderson 
---
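
For reviewers, a simplified stand-alone model of the scheme: index 0 acts as a sentinel whose next/prev fields hold the first/last element, so insertion and removal need no special cases for the ends. The `Demo` names, the buffer size and the empty-list initialisation are assumptions for this sketch and differ in detail from the patch.

```c
#include <stddef.h>

#define DEMO_MAX_OPS 16

typedef struct {
    int prev;
    int next;
    int value;
} DemoOp;

typedef struct {
    DemoOp buf[DEMO_MAX_OPS];
    int next_idx;              /* next unused slot */
} DemoCtx;

/* Empty list: the sentinel links to itself. */
static void demo_init(DemoCtx *c)
{
    c->buf[0].prev = 0;
    c->buf[0].next = 0;
    c->next_idx = 1;
}

/* Append at the tail; returns the new index, or -1 if the buffer is full. */
static int demo_append(DemoCtx *c, int value)
{
    int oi, last;

    if (c->next_idx >= DEMO_MAX_OPS) {
        return -1;
    }
    oi = c->next_idx++;
    last = c->buf[0].prev;
    c->buf[oi] = (DemoOp){ .prev = last, .next = 0, .value = value };
    c->buf[last].next = oi;
    c->buf[0].prev = oi;
    return oi;
}

/* Unlink without branching on "first" or "last". */
static void demo_remove(DemoCtx *c, int oi)
{
    int next = c->buf[oi].next;
    int prev = c->buf[oi].prev;

    c->buf[next].prev = prev;
    c->buf[prev].next = next;
}

/* Walk the list; index 0 terminates the loop. */
static int demo_sum(const DemoCtx *c)
{
    int sum = 0;

    for (int oi = c->buf[0].next; oi != 0; oi = c->buf[oi].next) {
        sum += c->buf[oi].value;
    }
    return sum;
}
```
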
 include/exec/gen-icount.h |  2 +-
 tcg/optimize.c|  8 ++--
 tcg/tcg-op.c  |  2 +-
 tcg/tcg.c | 32 
 tcg/tcg.h | 20 
 5 files changed, 28 insertions(+), 36 deletions(-)

diff --git a/include/exec/gen-icount.h b/include/exec/gen-icount.h
index a011324..5f16077 100644
--- a/include/exec/gen-icount.h
+++ b/include/exec/gen-icount.h
@@ -59,7 +59,7 @@ static void gen_tb_end(TranslationBlock *tb, int num_insns)
 }
 
 /* Terminate the linked list.  */
-tcg_ctx.gen_op_buf[tcg_ctx.gen_last_op_idx].next = -1;
+tcg_ctx.gen_op_buf[tcg_ctx.gen_op_buf[0].prev].next = 0;
 }
 
 static inline void gen_io_start(void)
diff --git a/tcg/optimize.c b/tcg/optimize.c
index c0d975b..8df7fc7 100644
--- a/tcg/optimize.c
+++ b/tcg/optimize.c
@@ -103,11 +103,7 @@ static TCGOp *insert_op_before(TCGContext *s, TCGOp 
*old_op,
 .prev = prev,
 .next = next
 };
-if (prev >= 0) {
-s->gen_op_buf[prev].next = oi;
-} else {
-s->gen_first_op_idx = oi;
-}
+s->gen_op_buf[prev].next = oi;
 old_op->prev = oi;
 
 return new_op;
@@ -583,7 +579,7 @@ void tcg_optimize(TCGContext *s)
 nb_globals = s->nb_globals;
 reset_all_temps(nb_temps);
 
-for (oi = s->gen_first_op_idx; oi >= 0; oi = oi_next) {
+for (oi = s->gen_op_buf[0].next; oi != 0; oi = oi_next) {
 tcg_target_ulong mask, partmask, affected;
 int nb_oargs, nb_iargs, i;
 TCGArg tmp;
diff --git a/tcg/tcg-op.c b/tcg/tcg-op.c
index 569cdc6..62d91b4 100644
--- a/tcg/tcg-op.c
+++ b/tcg/tcg-op.c
@@ -52,7 +52,7 @@ static void tcg_emit_op(TCGContext *ctx, TCGOpcode opc, int 
args)
 int pi = oi - 1;
 
 tcg_debug_assert(oi < OPC_BUF_SIZE);
-ctx->gen_last_op_idx = oi;
+ctx->gen_op_buf[0].prev = oi;
 ctx->gen_next_op_idx = ni;
 
 ctx->gen_op_buf[oi] = (TCGOp){
diff --git a/tcg/tcg.c b/tcg/tcg.c
index 400e69c..bb1efe2 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -437,9 +437,9 @@ void tcg_func_start(TCGContext *s)
 s->goto_tb_issue_mask = 0;
 #endif
 
-s->gen_first_op_idx = 0;
-s->gen_last_op_idx = -1;
-s->gen_next_op_idx = 0;
+s->gen_op_buf[0].next = 1;
+s->gen_op_buf[0].prev = 0;
+s->gen_next_op_idx = 1;
 s->gen_next_parm_idx = 0;
 
 s->be = tcg_malloc(sizeof(TCGBackendData));
@@ -868,7 +868,7 @@ void tcg_gen_callN(TCGContext *s, void *func, TCGArg ret,
 /* Make sure the calli field didn't overflow.  */
 tcg_debug_assert(s->gen_op_buf[i].calli == real_args);
 
-s->gen_last_op_idx = i;
+s->gen_op_buf[0].prev = i;
 s->gen_next_op_idx = i + 1;
 s->gen_next_parm_idx = pi;
 
@@ -1004,7 +1004,7 @@ void tcg_dump_ops(TCGContext *s)
 TCGOp *op;
 int oi;
 
-for (oi = s->gen_first_op_idx; oi >= 0; oi = op->next) {
+for (oi = s->gen_op_buf[0].next; oi != 0; oi = op->next) {
 int i, k, nb_oargs, nb_iargs, nb_cargs;
 const TCGOpDef *def;
 const TCGArg *args;
@@ -1016,7 +1016,7 @@ void tcg_dump_ops(TCGContext *s)
 args = &s->gen_opparam_buf[op->args];
 
 if (c == INDEX_op_insn_start) {
-qemu_log("%s ", oi != s->gen_first_op_idx ? "\n" : "");
+qemu_log("%s ", oi != s->gen_op_buf[0].next ? "\n" : "");
 
 for (i = 0; i < TARGET_INSN_START_WORDS; ++i) {
 target_ulong a;
@@ -1287,18 +1287,10 @@ void tcg_op_remove(TCGContext *s, TCGOp *op)
 int next = op->next;
 int prev = op->prev;
 
-if (next >= 0) {
-s->gen_op_buf[next].prev = prev;
-} else {
-s->gen_last_op_idx = prev;
-}
-if (prev >= 0) {
-s->gen_op_buf[prev].next = next;
-} else {
-s->gen_first_op_idx = next;
-}
+s->gen_op_buf[next].prev = prev;
+s->gen_op_buf[prev].next = next;
 
-memset(op, -1, sizeof(*op));
+memset(op, 0, sizeof(*op));
 
 #ifdef CONFIG_PROFILER
 s->del_op_count++;
@@ -1344,7 +1336,7 @@ static void tcg_liveness_analysis(TCGContext *s)
 mem_temps = tcg_malloc(s->nb_temps);
 tcg_la_func_end(s, dead_temps, mem_temps);
 
-for (oi = s->gen_last_op_idx; oi >= 0; oi = oi_prev) {
+for (oi = s->gen_op_buf[0].prev; oi != 0; oi = oi_prev) {
 int i, nb_iargs, nb_oargs;
 TCGOpcode opc_new, opc_new2;
 bool have_opc_new2;
@@ -2321,7 +2313,7 @@ int tcg_gen_code(TCGContext *s, TranslationBlock *tb)
 {
 int n;
 
-n = s->gen_last_op_idx + 1;
+n = s->gen_op_buf[0].prev + 1;
 s->op_count += n;
 if (n > s->op_count_max) {
 s->op_count_max = n;
@@ -2380,7 +2372,7 @@ int tcg_gen_code(TCGContext *s, TranslationBlock *tb)
 tcg_out_tb_init(s);
 
 num_insns = -1;
-for (oi = s->gen_first_op_idx; oi >= 0; oi = oi_next) {
+for

[Qemu-devel] [PATCH 2/2] raw_bsd: Convert to byte-based interface

2016-06-23 Thread Eric Blake

Since the raw format driver is just passing things through, we can
do byte-based read and write if the underlying protocol does
likewise.

There's one tricky part - if we probed the image format, we document
that we restrict operations on the initial sector.  Rather than
trying to handle a read-modify-write on the first sector, it's
easiest to just include in our restrictions that partial writes to
the first sector are not permitted.

Signed-off-by: Eric Blake 
---
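
A stand-alone sketch of the first-sector rule described above, for illustration only: `demo_check_first_sector_write` and `DEMO_PROBE_BUF_SIZE` are hypothetical names, and the real function goes on to inspect the buffer contents, which this model omits.

```c
#include <errno.h>
#include <stdint.h>

#define DEMO_PROBE_BUF_SIZE 512u

/* Returns 0 if the write may proceed to the content check or straight
 * through, and -EINVAL for a partial write that touches the probed
 * first sector. */
static int demo_check_first_sector_write(int probed, uint64_t offset,
                                         uint64_t bytes)
{
    if (probed && offset < DEMO_PROBE_BUF_SIZE && bytes) {
        if (offset != 0 || bytes < DEMO_PROBE_BUF_SIZE) {
            return -EINVAL;
        }
    }
    return 0;
}
```
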
 block/raw_bsd.c | 35 +--
 1 file changed, 17 insertions(+), 18 deletions(-)

diff --git a/block/raw_bsd.c b/block/raw_bsd.c
index 365c38a..a417d9a 100644
--- a/block/raw_bsd.c
+++ b/block/raw_bsd.c
@@ -50,32 +50,32 @@ static int raw_reopen_prepare(BDRVReopenState *reopen_state,
 return 0;
 }

-static int coroutine_fn raw_co_readv(BlockDriverState *bs, int64_t sector_num,
- int nb_sectors, QEMUIOVector *qiov)
+static int coroutine_fn raw_co_preadv(BlockDriverState *bs, uint64_t offset,
+  uint64_t bytes, QEMUIOVector *qiov,
+  int flags)
 {
 BLKDBG_EVENT(bs->file, BLKDBG_READ_AIO);
-return bdrv_co_readv(bs->file->bs, sector_num, nb_sectors, qiov);
+return bdrv_co_preadv(bs->file->bs, offset, bytes, qiov, flags);
 }

-static int coroutine_fn
-raw_co_writev_flags(BlockDriverState *bs, int64_t sector_num, int nb_sectors,
-QEMUIOVector *qiov, int flags)
+static int coroutine_fn raw_co_pwritev(BlockDriverState *bs, uint64_t offset,
+   uint64_t bytes, QEMUIOVector *qiov,
+   int flags)
 {
 void *buf = NULL;
 BlockDriver *drv;
 QEMUIOVector local_qiov;
 int ret;

-if (bs->probed && sector_num == 0) {
-/* As long as these conditions are true, we can't get partial writes to
- * the probe buffer and can just directly check the request. */
+if (bs->probed && offset < BLOCK_PROBE_BUF_SIZE && bytes) {
+/* Handling partial writes would be a pain - so we just
+ * require that the guest cannot write to the first sector
+ * without writing the entire sector */
 QEMU_BUILD_BUG_ON(BLOCK_PROBE_BUF_SIZE != 512);
 QEMU_BUILD_BUG_ON(BDRV_SECTOR_SIZE != 512);
-
-if (nb_sectors == 0) {
-/* qemu_iovec_to_buf() would fail, but we want to return success
- * instead of -EINVAL in this case. */
-return 0;
+if (offset != 0 || bytes < BLOCK_PROBE_BUF_SIZE) {
+ret = -EINVAL;
+goto fail;
 }

 buf = qemu_try_blockalign(bs->file->bs, 512);
@@ -105,8 +105,7 @@ raw_co_writev_flags(BlockDriverState *bs, int64_t 
sector_num, int nb_sectors,
 }

 BLKDBG_EVENT(bs->file, BLKDBG_WRITE_AIO);
-ret = bdrv_co_pwritev(bs->file->bs, sector_num * BDRV_SECTOR_SIZE,
-  nb_sectors * BDRV_SECTOR_SIZE, qiov, flags);
+ret = bdrv_co_pwritev(bs->file->bs, offset, bytes, qiov, flags);

 fail:
 if (qiov == &local_qiov) {
@@ -240,8 +239,8 @@ BlockDriver bdrv_raw = {
 .bdrv_open= &raw_open,
 .bdrv_close   = &raw_close,
 .bdrv_create  = &raw_create,
-.bdrv_co_readv= &raw_co_readv,
-.bdrv_co_writev_flags = &raw_co_writev_flags,
+.bdrv_co_preadv   = &raw_co_preadv,
+.bdrv_co_pwritev  = &raw_co_pwritev,
 .bdrv_co_pwrite_zeroes = &raw_co_pwrite_zeroes,
 .bdrv_co_pdiscard = &raw_co_pdiscard,
 .bdrv_co_get_block_status = &raw_co_get_block_status,
-- 
2.5.5

Re: [Qemu-devel] [PATCH 02/12] qapi-schema: add 'device_add'

2016-06-23 Thread Eric Blake

On 06/22/2016 06:07 PM, marcandre.lur...@redhat.com wrote:
> From: Marc-André Lureau 
> 
> Even though device_add is not fully qapi'fied, we may add it to the json
> schema with 'gen': false, so registration and documentation can be
> generated.
> 
> Signed-off-by: Marc-André Lureau 
> ---
>  qapi-schema.json | 26 ++
>  1 file changed, 26 insertions(+)
> 
> diff --git a/qapi-schema.json b/qapi-schema.json
> index 73f0b6f..929f84e 100644
> --- a/qapi-schema.json
> +++ b/qapi-schema.json
> @@ -2188,6 +2188,32 @@
>  ##
>  { 'command': 'xen-set-global-dirty-log', 'data': { 'enable': 'bool' } }
>  
> +##
> +# @device_add:
> +#
> +# @driver: the name of the new device's driver
> +# @bus: #optional the device's parent bus (device tree path)
> +# @id: the device's ID, must be unique
> +# @props: #optional a dictionary of properties to be passed to the backend
> +#
> +# Add a device.
> +#
> +# Notes:
> +# 1. For detailed information about this command, please refer to the
> +#'docs/qdev-device-use.txt' file.
> +#
> +# 2. It's possible to list device properties by running QEMU with the
> +#"-device DEVICE,\?" command-line argument, where DEVICE is the device's name

I'd prefer mentioning -device DEVICE,help, rather than a shell
metacharacter, so that the advice doesn't break in the presence of an
oddly-named file in the current directory when the user forgets to quote
things in the shell.

Long line; please keep under 80 columns.

> +#
> +# Example:
> +#
> +# -> { "execute": "device_add", "arguments": { "driver": "e1000", "id": "net1" } }
> +# <- { "return": {} }
> +#
> +##

Missing a Since: designation; probably 0.13, since we've had it (and it
has been non-introspectible) since that point.

> +{ 'command': 'device_add',
> +  'data': {'driver': 'str', 'id': 'str'}, 'gen': false }
> +
>  ##
>  # @device_del:
>  #
> 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org




Re: [Qemu-devel] [PULL 5/8] target-sparc: Use global registers for the register window

2016-06-23 Thread Richard Henderson


On 06/16/2016 02:53 PM, Mark Cave-Ayland wrote:

On 16/06/16 21:26, Richard Henderson wrote:


On 06/14/2016 02:52 PM, Mark Cave-Ayland wrote:

Following up the bug report at
https://bugs.launchpad.net/qemu/+bug/1588328, I bisected the regression
down to this particular commit. I can't see anything obvious here, so
perhaps this is exposing another bug somewhere else?



Probably.  I'm downloading the solaris image now.


r~


Thanks for taking a look - otherwise I won't be able to get to this
until next week. My thinking was that since the code makes access to
regwptr direct instead of copied into a temporary, something is
accidentally clobbering a destination register...


I've been unable to find this.

Whatever happens, it happens after 10GB of logs, which is simply too much to 
sift through.  I've tried to narrow it down, but the lack of a hardware tlb 
refill means that we get hundreds of thousands of Data Access Faults that are 
simply TLB misses and not the actual Segmentation Fault in question.


It doesn't seem to affect other OSes, so I can't imagine what quirk is being 
exercised in this case.


As loath as I am to suggest it, we may have to revert the sparc indirect 
register patch for the release.


I do now ping the rest of my sparc improvements patchset.  It's completely 
independent of the use of indirect registers.



r~

[Qemu-devel] [PATCH v3 9/9] tcg: Lower indirect registers in a separate pass

2016-06-23 Thread Richard Henderson

Rather than rely on recursion during the middle of register allocation,
lower indirect registers to loads and stores off the indirect base into
plain temps.

For an x86_64 host, with sufficient registers, this results in identical
code, modulo the actual register assignments.

For an i686 host, with insufficient registers, this means that temps can
be (temporarily) spilled to the stack in order to satisfy an allocation.
This is as opposed to the earlier possibility of being unable to spill at
all, because performing the spill itself required allocating a register
for the indirect base.

Signed-off-by: Richard Henderson 
---
 include/qemu/log.h |   1 +
 tcg/optimize.c |  31 +-
 tcg/tcg.c  | 306 +++--
 tcg/tcg.h  |   4 +
 util/log.c |   5 +-
 5 files changed, 263 insertions(+), 84 deletions(-)

diff --git a/include/qemu/log.h b/include/qemu/log.h
index 8bec6b4..9c1cc38 100644
--- a/include/qemu/log.h
+++ b/include/qemu/log.h
@@ -42,6 +42,7 @@ static inline bool qemu_log_separate(void)
 #define CPU_LOG_TB_NOCHAIN (1 << 13)
 #define CPU_LOG_PAGE   (1 << 14)
 #define LOG_TRACE  (1 << 15)
+#define CPU_LOG_TB_OP_IND  (1 << 16)
 
 /* Returns true if a bit is set in the current loglevel mask
  */
diff --git a/tcg/optimize.c b/tcg/optimize.c
index 8df7fc7..cffe89b 100644
--- a/tcg/optimize.c
+++ b/tcg/optimize.c
@@ -82,33 +82,6 @@ static void init_temp_info(TCGArg temp)
 }
 }
 
-static TCGOp *insert_op_before(TCGContext *s, TCGOp *old_op,
-TCGOpcode opc, int nargs)
-{
-int oi = s->gen_next_op_idx;
-int pi = s->gen_next_parm_idx;
-int prev = old_op->prev;
-int next = old_op - s->gen_op_buf;
-TCGOp *new_op;
-
-tcg_debug_assert(oi < OPC_BUF_SIZE);
-tcg_debug_assert(pi + nargs <= OPPARAM_BUF_SIZE);
-s->gen_next_op_idx = oi + 1;
-s->gen_next_parm_idx = pi + nargs;
-
-new_op = &s->gen_op_buf[oi];
-*new_op = (TCGOp){
-.opc = opc,
-.args = pi,
-.prev = prev,
-.next = next
-};
-s->gen_op_buf[prev].next = oi;
-old_op->prev = oi;
-
-return new_op;
-}
-
 static int op_bits(TCGOpcode op)
 {
 const TCGOpDef *def = &tcg_op_defs[op];
@@ -1116,7 +1089,7 @@ void tcg_optimize(TCGContext *s)
 uint64_t a = ((uint64_t)ah << 32) | al;
 uint64_t b = ((uint64_t)bh << 32) | bl;
 TCGArg rl, rh;
-TCGOp *op2 = insert_op_before(s, op, INDEX_op_movi_i32, 2);
+TCGOp *op2 = tcg_op_insert_before(s, op, INDEX_op_movi_i32, 2);
 TCGArg *args2 = &s->gen_opparam_buf[op2->args];
 
 if (opc == INDEX_op_add2_i32) {
@@ -1142,7 +1115,7 @@ void tcg_optimize(TCGContext *s)
 uint32_t b = temps[args[3]].val;
 uint64_t r = (uint64_t)a * b;
 TCGArg rl, rh;
-TCGOp *op2 = insert_op_before(s, op, INDEX_op_movi_i32, 2);
+TCGOp *op2 = tcg_op_insert_before(s, op, INDEX_op_movi_i32, 2);
 TCGArg *args2 = &s->gen_opparam_buf[op2->args];
 
 rl = args[0];
diff --git a/tcg/tcg.c b/tcg/tcg.c
index 3e4bc99..f00500e 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -531,8 +531,12 @@ int tcg_global_mem_new_internal(TCGType type, TCGv_ptr base,
 #endif
 
 if (!base_ts->fixed_reg) {
-indirect_reg = 1;
+/* We do not support double-indirect registers.  */
+tcg_debug_assert(!base_ts->indirect_reg);
 base_ts->indirect_base = 1;
+s->nb_indirects += (TCG_TARGET_REG_BITS == 32 && type == TCG_TYPE_I64
+? 2 : 1);
+indirect_reg = 1;
 }
 
 if (TCG_TARGET_REG_BITS == 32 && type == TCG_TYPE_I64) {
@@ -1323,9 +1327,66 @@ void tcg_op_remove(TCGContext *s, TCGOp *op)
 #endif
 }
 
+TCGOp *tcg_op_insert_before(TCGContext *s, TCGOp *old_op,
+TCGOpcode opc, int nargs)
+{
+int oi = s->gen_next_op_idx;
+int pi = s->gen_next_parm_idx;
+int prev = old_op->prev;
+int next = old_op - s->gen_op_buf;
+TCGOp *new_op;
+
+tcg_debug_assert(oi < OPC_BUF_SIZE);
+tcg_debug_assert(pi + nargs <= OPPARAM_BUF_SIZE);
+s->gen_next_op_idx = oi + 1;
+s->gen_next_parm_idx = pi + nargs;
+
+new_op = &s->gen_op_buf[oi];
+*new_op = (TCGOp){
+.opc = opc,
+.args = pi,
+.prev = prev,
+.next = next
+};
+s->gen_op_buf[prev].next = oi;
+old_op->prev = oi;
+
+return new_op;
+}
+
+TCGOp *tcg_op_insert_after(TCGContext *s, TCGOp *old_op,
+   TCGOpcode opc, int nargs)
+{
+int oi = s->gen_next_op_idx;
+int pi = s->gen_next_parm_idx;
+int prev = old_op - s->gen_op_buf;
+int next = old_op->next;
+TCGOp *new_op;
+
+tcg_debug_assert(oi < OPC_BUF_SIZE);
+tcg_debug_assert(pi + nargs <= OPPARAM_BUF_SIZE);
+s->gen_next_op_idx = oi + 1;
+s->gen_next_parm_idx = pi + nargs;
+
+new_o

Re: [Qemu-devel] [PATCH 1/3] qapi: Report support for -device cpu hotplug in query-machines

2016-06-23 Thread Eric Blake

On 06/23/2016 08:56 PM, David Gibson wrote:
> On Thu, 23 Jun 2016 22:23:23 +0200
> Peter Krempa  wrote:
> 
>> For management apps it's very useful to know whether the selected
>> machine type supports cpu hotplug via the new -device approach. Using
>> the presence of 'query-hotpluggable-cpus' is enough for a witness.
>>

> 
> I'd been under the impression that there was a general way of detecting
> the availability of a particular qmp command.  Was I mistaken?

You are correct - query-commands says whether 'query-hotpluggable-cpus'
exists as a command.  But that is insufficient.  See my review, or the
v2 patch, where the above poor wording was corrected to say what was
really meant: knowing whether query-hotpluggable-cpus exists is
insufficient to tell you whether a given cpu type can be hotplugged.  So
adding one more piece of witness (for every type of cpu supported, we
also advertise if it is hotpluggable) is enough for libvirt to
efficiently take advantage of the new query-hotpluggable-cpus command.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org


[Qemu-devel] [PATCH v3 6/9] tcg: Fold life data into TCGOp

2016-06-23 Thread Richard Henderson

Reduce the size of other bitfields to make room.
This reduces the cache footprint of compilation.

Signed-off-by: Richard Henderson 
---
 tcg/tcg.c |  9 +++--
 tcg/tcg.h | 26 ++
 2 files changed, 17 insertions(+), 18 deletions(-)

diff --git a/tcg/tcg.c b/tcg/tcg.c
index bb1efe2..c41640f 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -1327,10 +1327,7 @@ static inline void tcg_la_bb_end(TCGContext *s, uint8_t *dead_temps,
 static void tcg_liveness_analysis(TCGContext *s)
 {
 uint8_t *dead_temps, *mem_temps;
-int oi, oi_prev, nb_ops;
-
-nb_ops = s->gen_next_op_idx;
-s->op_arg_life = tcg_malloc(nb_ops * sizeof(TCGLifeData));
+int oi, oi_prev;
 
 dead_temps = tcg_malloc(s->nb_temps);
 mem_temps = tcg_malloc(s->nb_temps);
@@ -1553,7 +1550,7 @@ static void tcg_liveness_analysis(TCGContext *s)
 }
 break;
 }
-s->op_arg_life[oi] = arg_life;
+op->life = arg_life;
 }
 }
 
@@ -2377,7 +2374,7 @@ int tcg_gen_code(TCGContext *s, TranslationBlock *tb)
 TCGArg * const args = &s->gen_opparam_buf[op->args];
 TCGOpcode opc = op->opc;
 const TCGOpDef *def = &tcg_op_defs[opc];
-TCGLifeData arg_life = s->op_arg_life[oi];
+TCGLifeData arg_life = op->life;
 
 oi_next = op->next;
 #ifdef CONFIG_PROFILER
diff --git a/tcg/tcg.h b/tcg/tcg.h
index 49b396d..2ff3ad2 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -513,25 +513,30 @@ typedef struct TCGTempSet {
 #define SYNC_ARG  1
 typedef uint16_t TCGLifeData;
 
+/* The layout here is designed to avoid crossing of a 32-bit boundary.
+   If we do so, gcc adds padding, expanding the size to 12.  */
 typedef struct TCGOp {
-TCGOpcode opc   : 8;
+TCGOpcode opc   : 8;/*  8 */
+
+/* Index of the prex/next op, or 0 for the end of the list.  */
+unsigned prev   : 10;   /* 18 */
+unsigned next   : 10;   /* 28 */
 
 /* The number of out and in parameter for a call.  */
-unsigned callo  : 2;
-unsigned calli  : 6;
+unsigned calli  : 4;/* 32 */
+unsigned callo  : 2;/* 34 */
 
 /* Index of the arguments for this op, or 0 for zero-operand ops.  */
-unsigned args   : 16;
+unsigned args   : 14;   /* 48 */
 
-/* Index of the prex/next op, or 0 for the end of the list.  */
-unsigned prev   : 16;
-unsigned next   : 16;
+/* Lifetime data of the operands.  */
+unsigned life   : 16;   /* 64 */
 } TCGOp;
 
 /* Make sure operands fit in the bitfields above.  */
 QEMU_BUILD_BUG_ON(NB_OPS > (1 << 8));
-QEMU_BUILD_BUG_ON(OPC_BUF_SIZE > (1 << 16));
-QEMU_BUILD_BUG_ON(OPPARAM_BUF_SIZE > (1 << 16));
+QEMU_BUILD_BUG_ON(OPC_BUF_SIZE > (1 << 10));
+QEMU_BUILD_BUG_ON(OPPARAM_BUF_SIZE > (1 << 14));
 
 /* Make sure that we don't overflow 64 bits without noticing.  */
 QEMU_BUILD_BUG_ON(sizeof(TCGOp) > 8);
@@ -549,9 +554,6 @@ struct TCGContext {
 uint16_t *tb_jmp_insn_offset; /* tb->jmp_insn_offset if USE_DIRECT_JUMP */
 uintptr_t *tb_jmp_target_addr; /* tb->jmp_target_addr if !USE_DIRECT_JUMP */
 
-/* liveness analysis */
-TCGLifeData *op_arg_life;
-
 TCGRegSet reserved_regs;
 intptr_t current_frame_offset;
 intptr_t frame_start;
-- 
2.5.5
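
The bit budget in the new TCGOp layout can be checked with a standalone struct. This is a minimal sketch, not the tcg.h definition: SketchOp is a hypothetical name, and the opcode field is declared as plain unsigned instead of the TCGOpcode enum so that it compiles on its own. Bit-field layout is implementation-defined, so the size assertion reflects common GCC/Clang ABIs only.

```c
#include <assert.h>
#include <stdint.h>

/* Same field widths and order as the patch; running totals in comments. */
typedef struct SketchOp {
    unsigned opc   : 8;   /*  8 */
    unsigned prev  : 10;  /* 18 */
    unsigned next  : 10;  /* 28 */
    unsigned calli : 4;   /* 32: first 32-bit unit is exactly full */
    unsigned callo : 2;   /* 34 */
    unsigned args  : 14;  /* 48 */
    unsigned life  : 16;  /* 64: second 32-bit unit is exactly full */
} SketchOp;

/* No field straddles a 32-bit boundary, so no padding is expected. */
static_assert(sizeof(SketchOp) == 8, "SketchOp must stay at 64 bits");

/* Largest op index representable in the 10-bit prev/next fields. */
enum { SKETCH_MAX_OP_INDEX = (1 << 10) - 1 };
```

Swapping calli and callo relative to the old layout is what lets the first four fields end exactly on the 32-bit boundary.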

[Qemu-devel] [PATCH v3 4/9] tcg: Compress liveness data to 16 bits

2016-06-23 Thread Richard Henderson

This reduces both memory usage and per-insn cacheline usage
during code generation.

Signed-off-by: Richard Henderson 
---
 tcg/tcg.c | 58 ++
 tcg/tcg.h | 16 ++--
 2 files changed, 32 insertions(+), 42 deletions(-)

diff --git a/tcg/tcg.c b/tcg/tcg.c
index 64060c6..400e69c 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -1329,7 +1329,7 @@ static inline void tcg_la_bb_end(TCGContext *s, uint8_t *dead_temps,
 }
 }
 
-/* Liveness analysis : update the opc_dead_args array to tell if a
+/* Liveness analysis : update the opc_arg_life array to tell if a
given input arguments is dead. Instructions updating dead
temporaries are removed. */
 static void tcg_liveness_analysis(TCGContext *s)
@@ -1338,9 +1338,8 @@ static void tcg_liveness_analysis(TCGContext *s)
 int oi, oi_prev, nb_ops;
 
 nb_ops = s->gen_next_op_idx;
-s->op_dead_args = tcg_malloc(nb_ops * sizeof(uint16_t));
-s->op_sync_args = tcg_malloc(nb_ops * sizeof(uint8_t));
-
+s->op_arg_life = tcg_malloc(nb_ops * sizeof(TCGLifeData));
+
 dead_temps = tcg_malloc(s->nb_temps);
 mem_temps = tcg_malloc(s->nb_temps);
 tcg_la_func_end(s, dead_temps, mem_temps);
@@ -1349,8 +1348,7 @@ static void tcg_liveness_analysis(TCGContext *s)
 int i, nb_iargs, nb_oargs;
 TCGOpcode opc_new, opc_new2;
 bool have_opc_new2;
-uint16_t dead_args;
-uint8_t sync_args;
+TCGLifeData arg_life = 0;
 TCGArg arg;
 
 TCGOp * const op = &s->gen_op_buf[oi];
@@ -1382,15 +1380,13 @@ static void tcg_liveness_analysis(TCGContext *s)
 do_not_remove_call:
 
 /* output args are dead */
-dead_args = 0;
-sync_args = 0;
 for (i = 0; i < nb_oargs; i++) {
 arg = args[i];
 if (dead_temps[arg]) {
-dead_args |= (1 << i);
+arg_life |= DEAD_ARG << i;
 }
 if (mem_temps[arg]) {
-sync_args |= (1 << i);
+arg_life |= SYNC_ARG << i;
 }
 dead_temps[arg] = 1;
 mem_temps[arg] = 0;
@@ -1411,7 +1407,7 @@ static void tcg_liveness_analysis(TCGContext *s)
 arg = args[i];
 if (arg != TCG_CALL_DUMMY_ARG) {
 if (dead_temps[arg]) {
-dead_args |= (1 << i);
+arg_life |= DEAD_ARG << i;
 }
 }
 }
@@ -1420,8 +1416,6 @@ static void tcg_liveness_analysis(TCGContext *s)
 arg = args[i];
 dead_temps[arg] = 0;
 }
-s->op_dead_args[oi] = dead_args;
-s->op_sync_args[oi] = sync_args;
 }
 }
 break;
@@ -1532,15 +1526,13 @@ static void tcg_liveness_analysis(TCGContext *s)
 } else {
 do_not_remove:
 /* output args are dead */
-dead_args = 0;
-sync_args = 0;
 for (i = 0; i < nb_oargs; i++) {
 arg = args[i];
 if (dead_temps[arg]) {
-dead_args |= (1 << i);
+arg_life |= DEAD_ARG << i;
 }
 if (mem_temps[arg]) {
-sync_args |= (1 << i);
+arg_life |= SYNC_ARG << i;
 }
 dead_temps[arg] = 1;
 mem_temps[arg] = 0;
@@ -1558,7 +1550,7 @@ static void tcg_liveness_analysis(TCGContext *s)
 for (i = nb_oargs; i < nb_oargs + nb_iargs; i++) {
 arg = args[i];
 if (dead_temps[arg]) {
-dead_args |= (1 << i);
+arg_life |= DEAD_ARG << i;
 }
 }
 /* input arguments are live for preceding opcodes */
@@ -1566,11 +1558,10 @@ static void tcg_liveness_analysis(TCGContext *s)
 arg = args[i];
 dead_temps[arg] = 0;
 }
-s->op_dead_args[oi] = dead_args;
-s->op_sync_args[oi] = sync_args;
 }
 break;
 }
+s->op_arg_life[oi] = arg_life;
 }
 }
 
@@ -1891,11 +1882,11 @@ static void tcg_reg_alloc_bb_end(TCGContext *s, TCGRegSet allocated_regs)
 save_globals(s, allocated_regs);
 }
 
-#define IS_DEAD_ARG(n) ((dead_args >> (n)) & 1)
-#define NEED_SYNC_ARG(n) ((sync_args >> (n)) & 1)
+#define IS_DEAD_ARG(n)   (arg_life & (DEAD_ARG << (n)))
+#define NEED_SYNC_ARG(n) (arg_life & (SYNC_ARG << (n)))
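
The combined encoding used by the two macros above can be tried out on its own. This is a sketch under assumptions: SYNC_ARG is 1 as shown in the tcg.h context of patch 6/9, but the value of DEAD_ARG is not visible in this message and 4 is assumed here; LifeData and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SYNC_ARG 1   /* value visible in the tcg.h context of patch 6/9 */
#define DEAD_ARG 4   /* ASSUMED value; not shown in the quoted diff */

typedef uint16_t LifeData;  /* stands in for TCGLifeData */

/* Record that argument n is dead after this op. */
LifeData mark_dead(LifeData life, int n)
{
    return (LifeData)(life | (DEAD_ARG << n));
}

/* Record that output argument n must be synced back to memory. */
LifeData mark_sync(LifeData life, int n)
{
    return (LifeData)(life | (SYNC_ARG << n));
}

/* Function forms of IS_DEAD_ARG() and NEED_SYNC_ARG(). */
int is_dead(LifeData life, int n)
{
    return (life & (DEAD_ARG << n)) != 0;
}

int needs_sync(LifeData life, int n)
{
    return (life & (SYNC_ARG << n)) != 0;
}
```

With these assumed values the sync bits occupy bits 0 and 1 and the dead bits start at bit 2, so sync information is only unambiguous for the first two arguments while 14 dead bits fit in the remaining space.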

[Qemu-devel] [PATCH v3 2/9] tcg: Optimize spills of constants

2016-06-23 Thread Richard Henderson

While we can store constants via constraints on INDEX_op_st_i32 et al,
we weren't able to spill constants to backing store.

Add a new backend interface, tcg_out_sti, which may store the constant
(and is allowed to fail).  Rearrange the temp_* helpers so that we only
attempt to directly store a constant when the temp is becoming dead/free.
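
The "allowed to fail" contract can be pictured as a caller that tries the direct store first and otherwise falls back to loading the constant into a register. This is only a sketch under assumptions: the sketch_* functions are hypothetical stand-ins that count calls instead of emitting code, and the fallback path is an assumed reading of the description, not a copy of the tcg.c logic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Call counters for the hypothetical stand-in emitters. */
int n_direct, n_movi, n_store;

/* Mimics a backend with a zero register: only 0 can be stored directly. */
bool sketch_out_sti(int64_t val)
{
    if (val == 0) {
        n_direct++;
        return true;
    }
    return false;  /* failure is permitted; the caller must cope */
}

/* Stand-in for loading a constant into a scratch register. */
void sketch_out_movi(int64_t val)
{
    (void)val;
    n_movi++;
}

/* Stand-in for storing a register to the temp's memory slot. */
void sketch_out_st(void)
{
    n_store++;
}

/* Spill a constant: try the direct store, else go through a register. */
void sketch_spill_const(int64_t val)
{
    if (sketch_out_sti(val)) {
        return;
    }
    sketch_out_movi(val);
    sketch_out_st();
}
```

The boolean return keeps backends simple: one that has no immediate store form can always return false, as the arm variant in this patch does.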

Signed-off-by: Richard Henderson 
---
 tcg/aarch64/tcg-target.inc.c |  10 +++
 tcg/arm/tcg-target.inc.c |   6 ++
 tcg/i386/tcg-target.inc.c|  21 --
 tcg/ia64/tcg-target.inc.c|  10 +++
 tcg/mips/tcg-target.inc.c|  10 +++
 tcg/ppc/tcg-target.inc.c |   6 ++
 tcg/s390/tcg-target.inc.c|   6 ++
 tcg/sparc/tcg-target.inc.c   |  10 +++
 tcg/tcg.c| 159 ---
 tcg/tci/tcg-target.inc.c |   6 ++
 10 files changed, 166 insertions(+), 78 deletions(-)

diff --git a/tcg/aarch64/tcg-target.inc.c b/tcg/aarch64/tcg-target.inc.c
index 1447f7c..5ac0091 100644
--- a/tcg/aarch64/tcg-target.inc.c
+++ b/tcg/aarch64/tcg-target.inc.c
@@ -716,6 +716,16 @@ static inline void tcg_out_st(TCGContext *s, TCGType type, TCGReg arg,
  arg, arg1, arg2);
 }
 
+static inline bool tcg_out_sti(TCGContext *s, TCGType type, TCGArg val,
+   TCGReg base, intptr_t ofs)
+{
+if (val == 0) {
+tcg_out_st(s, type, TCG_REG_XZR, base, ofs);
+return true;
+}
+return false;
+}
+
 static inline void tcg_out_bfm(TCGContext *s, TCGType ext, TCGReg rd,
TCGReg rn, unsigned int a, unsigned int b)
 {
diff --git a/tcg/arm/tcg-target.inc.c b/tcg/arm/tcg-target.inc.c
index f9f54c6..172feba 100644
--- a/tcg/arm/tcg-target.inc.c
+++ b/tcg/arm/tcg-target.inc.c
@@ -2046,6 +2046,12 @@ static inline void tcg_out_st(TCGContext *s, TCGType type, TCGReg arg,
 tcg_out_st32(s, COND_AL, arg, arg1, arg2);
 }
 
+static inline bool tcg_out_sti(TCGContext *s, TCGType type, TCGArg val,
+   TCGReg base, intptr_t ofs)
+{
+return false;
+}
+
 static inline void tcg_out_mov(TCGContext *s, TCGType type,
TCGReg ret, TCGReg arg)
 {
diff --git a/tcg/i386/tcg-target.inc.c b/tcg/i386/tcg-target.inc.c
index 317484c..bc34535 100644
--- a/tcg/i386/tcg-target.inc.c
+++ b/tcg/i386/tcg-target.inc.c
@@ -710,12 +710,19 @@ static inline void tcg_out_st(TCGContext *s, TCGType type, TCGReg arg,
 tcg_out_modrm_offset(s, opc, arg, arg1, arg2);
 }
 
-static inline void tcg_out_sti(TCGContext *s, TCGType type, TCGReg base,
-   tcg_target_long ofs, tcg_target_long val)
+static bool tcg_out_sti(TCGContext *s, TCGType type, TCGArg val,
+TCGReg base, intptr_t ofs)
 {
-int opc = OPC_MOVL_EvIz + (type == TCG_TYPE_I64 ? P_REXW : 0);
-tcg_out_modrm_offset(s, opc, 0, base, ofs);
+int rexw = 0;
+if (TCG_TARGET_REG_BITS == 64 && type == TCG_TYPE_I64) {
+if (val != (int32_t)val) {
+return false;
+}
+rexw = P_REXW;
+}
+tcg_out_modrm_offset(s, OPC_MOVL_EvIz | rexw, 0, base, ofs);
 tcg_out32(s, val);
+return true;
 }
 
 static void tcg_out_shifti(TCGContext *s, int subopc, int reg, int count)
@@ -1321,10 +1328,10 @@ static void tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
 ofs += 4;
 }
 
-tcg_out_sti(s, TCG_TYPE_I32, TCG_REG_ESP, ofs, oi);
+tcg_out_sti(s, TCG_TYPE_I32, oi, TCG_REG_ESP, ofs);
 ofs += 4;
 
-tcg_out_sti(s, TCG_TYPE_PTR, TCG_REG_ESP, ofs, (uintptr_t)l->raddr);
+tcg_out_sti(s, TCG_TYPE_PTR, (uintptr_t)l->raddr, TCG_REG_ESP, ofs);
 } else {
 tcg_out_mov(s, TCG_TYPE_PTR, tcg_target_call_iarg_regs[0], TCG_AREG0);
 /* The second argument is already loaded with addrlo.  */
@@ -1413,7 +1420,7 @@ static void tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
 ofs += 4;
 }
 
-tcg_out_sti(s, TCG_TYPE_I32, TCG_REG_ESP, ofs, oi);
+tcg_out_sti(s, TCG_TYPE_I32, oi, TCG_REG_ESP, ofs);
 ofs += 4;
 
 retaddr = TCG_REG_EAX;
diff --git a/tcg/ia64/tcg-target.inc.c b/tcg/ia64/tcg-target.inc.c
index 395223e..c91f392 100644
--- a/tcg/ia64/tcg-target.inc.c
+++ b/tcg/ia64/tcg-target.inc.c
@@ -973,6 +973,16 @@ static inline void tcg_out_st(TCGContext *s, TCGType type, TCGReg arg,
 }
 }
 
+static inline bool tcg_out_sti(TCGContext *s, TCGType type, TCGArg val,
+   TCGReg base, intptr_t ofs)
+{
+if (val == 0) {
+tcg_out_st(s, type, TCG_REG_R0, base, ofs);
+return true;
+}
+return false;
+}
+
 static inline void tcg_out_alu(TCGContext *s, uint64_t opc_a1, uint64_t opc_a3,
TCGReg ret, TCGArg arg1, int const_arg1,
TCGArg arg2, int const_arg2)
diff --git a/tcg/mips/tcg-target.inc.c b/tcg/mips/tcg-target.inc.c
index 50e98ea..2f9be48 100644
--- a/t

[Qemu-devel] [PATCH v3 0/9] Third try at fixing sparc register allocation

2016-06-23 Thread Richard Henderson

I was unhappy about the complexity of the second try.

Better to convert to normal temps, which on rare occasions
allows spilling the "globals" to the stack in order
to satisfy register allocation.

I can no longer provoke an allocation failure on i686.
Hopefully this fixes the OpenBSD case that Mark mentioned
re the second attempt.


r~


Richard Henderson (9):
  tcg: Fix name for high-half register
  tcg: Optimize spills of constants
  tcg: Require liveness analysis
  tcg: Compress liveness data to 16 bits
  tcg: Reorg TCGOp chaining
  tcg: Fold life data into TCGOp
  tcg: Compress dead_temps and mem_temps into a single array
  tcg: Include liveness info in the dumps
  tcg: Lower indirect registers in a separate pass

 include/exec/gen-icount.h|   2 +-
 include/qemu/log.h   |   1 +
 tcg/aarch64/tcg-target.inc.c |  10 +
 tcg/arm/tcg-target.inc.c |   6 +
 tcg/i386/tcg-target.inc.c|  21 +-
 tcg/ia64/tcg-target.inc.c|  10 +
 tcg/mips/tcg-target.inc.c|  10 +
 tcg/optimize.c   |  37 +--
 tcg/ppc/tcg-target.inc.c |   6 +
 tcg/s390/tcg-target.inc.c|   6 +
 tcg/sparc/tcg-target.inc.c   |  10 +
 tcg/tcg-op.c |   2 +-
 tcg/tcg.c| 690 ---
 tcg/tcg.h|  50 ++--
 tcg/tci/tcg-target.inc.c |   6 +
 util/log.c   |   5 +-
 16 files changed, 563 insertions(+), 309 deletions(-)

-- 
2.5.5
