Re: [Qemu-devel] [libvirt-users] Sluggish performance with virtio and Win10

2016-02-18 Thread Martin Kletzander

On Fri, Feb 19, 2016 at 07:16:12AM +0100, John Obaterspok wrote:

2016-02-18 15:15 GMT+01:00 Martin Kletzander :


On Thu, Feb 18, 2016 at 12:59:52PM +0100, John Obaterspok wrote:


2016-02-18 11:25 GMT+01:00 Martin Kletzander :

On Thu, Feb 18, 2016 at 10:41:42AM +0100, John Obaterspok wrote:


2016-02-18 10:13 GMT+01:00 Martin Kletzander :


On Thu, Feb 18, 2016 at 08:49:38AM +0100, John Obaterspok wrote:



Hello,



I'm using virt-manager on my F23 box to run a Windows 10 image but the
performance is so bad it's killing me.

I have "vmx" flag in /proc/cpuinfo

# lsmod |grep kvm
kvm_intel 167936  6
kvm   503808  1 kvm_intel

virtio-win-0.1.112-1.noarch

But no virtio modules loaded. Should they be loaded nowadays?


Not on the host AFAIK.


The disk format used is vmdk with no caching and native mode.

The io is 100% in windows task manager performing less than 1MB/s


Any clues?


What are the figures from the host?  What is qemu doing and what are
the


other processes and devices doing?


What is the best way to find this out?




{,a,h}top should do for the initial runs, just to see if the block layer

is busy or the CPU is busy or something else is blocking it


atop seems to indicate that sdd is busy?


DSK |  sdd |  busy 96% |  |  read1455 | write
1319 |  KiB/r  5 | KiB/w  9 |   | MBr/s   0.74 | MBw/s
 1.26  | avq 1.01 |   | avio 3.43 ms |

# mount | grep sdd
/dev/sdd2 on /vm type ext4 (rw,relatime,seclabel,data=ordered)



And it doesn't do that in any other process on the host?  It looks like
it's not related to virtualisation...



Hi,

I changed from vmdk to raw and the Write performance went from 1.6 MB/s to
~100 MB/s


Now that's interesting!  Of course raw will be faster but by 2 orders of
magnitude?  That seems unreasonably much more than it should be.


Is vmdk write performance so bad?



I can't say, I wouldn't even expect that -- that's why I didn't even
suggest it.


Result:
http://postimg.org/image/gcqe5affn/



It would be worth asking on qemu-devel, so I'm adding them Cc.  Question
is, was that something wrong with the initial setup that hindered the
performance of the machine or is it just that vmdk performace is bad?
What could we check so that we can hint users in a better direction
performance-wise?

Martin

For a full reference, here is the original thread:

 https://www.redhat.com/archives/libvirt-users/2016-February/msg00048.html


signature.asc
Description: Digital signature


Re: [Qemu-devel] [PATCH 00/13] IOMMU: Enable interrupt remapping for Intel IOMMU

2016-02-18 Thread Peter Xu
Hi, Jan,

On Fri, Feb 19, 2016 at 07:46:26AM +0100, Jan Kiszka wrote:
> Hi Peter,
> 
> On 2016-02-19 04:30, Peter Xu wrote:
> > This patchset provide very basic functionalities for interrupt
> > remapping (IR) support of the emulated Intel IOMMU device.
> 
> Interesting. Some questions:
> 
> - Were you aware of
> http://git.kiszka.org/?p=qemu.git;a=shortlog;h=refs/heads/queues/vtd-intremap,
> and can you comment on how this relates to your approach? Mine included
> HPET support, e.g.

No. I do not know we have existing works related. I'd say that I am
one of those who are not fan of duplicating works. If I know that
there is existing ones, I would have asked you first.

Actually there are several people within my working team knows that
I would be working on this, and I believe none of us do know your
work too... Or there must be someone telling me so...

Whatever, I am sorry that finally we got some duplicated work. :-(

I have not go deeper into your codes yet, I'd say that most of the
logic is alike, since lots of works are to follow the spec. And the
implementation is slightly different (actually I just got this
memory region idea from Paolo yesterday... instead of a worse one to
declare a specific func ops for MSIs).

What I was going to do is to at least support:

- vhost
- pass through devices

So what I was planning is that, this series will be the start. And
the above two is the first-step goal (and I may need kvm-ioapic as
well though).

With the above, we should be able to run DPDK applications in guests
with either pass-through or vhost-user devices (these should be the
two scenario which is most possibly to be used AFAIU). This is what
I was trying to do. HPET is not with high priority in my todo list.

> 
> - Rita Sinha is currently working on integrating my old patches with the
> split-irqchip to get KVM working (as an Outreachy project). It's
> probably a bit unfortunate to consider a different horse that late in to
> project. What do you think, how could we benefit from each other?

I'd say that I am still new to QEMU community. So this is the first
time I encountered this kind of problem... Do you have any
suggestion?

> 
> - Radim was telling me to look on this as well. How do your efforts
> correlate?

Same as above. Do you have any suggestion on how we'd move on?

One question about your tree: have you posted patches before? and
what's the relationship between your tree and current QEMU master?

> 
> > 
> > By default, IR is disabled to be better compatible with current
> > QEMU. To enable IR, we can using the following command to boot a
> > IR-supported VM with basic network (still do not support kvm-ioapic,
> > so we need to specify kernel_irqchip=off here):
> > 
> > $ qemu-system-x86_64 -M q35,iommu=on,int_remap=on,kernel_irqchip=off \
> >  -enable-kvm -m 1024 -s \
> >  -monitor telnet::,server,nowait \
> >  -netdev user,id=user.0,hostfwd=tcp::-:22 \
> >  -device virtio-net-pci,netdev=user.0 \
> >  /var/lib/libvirt/images/vm1.qcow2
> > 
> > When guest boots, we can verify whether IR enabled by grepping the
> > dmesg like:
> > 
> > [root@localhost ~]# journalctl -k | grep "DMAR-IR"
> > Feb 19 11:21:23 localhost.localdomain kernel: DMAR-IR: IOAPIC id 0 under 
> > DRHD base  0xfed9 IOMMU 0
> > Feb 19 11:21:23 localhost.localdomain kernel: DMAR-IR: Enabled IRQ 
> > remapping in xapic mode
> > 
> > Currently only two devices are supported:
> > 
> > - Emulated IOAPIC device
> > - PCI Devices
> > 
> > TODO List:
> > 
> > - kvm-ioapic support
> > - vhost support
> > - pass through device support
> > - EIM support
> > - IR fault reporting
> > - source-id validation for IRTE
> > - IRTE cache and better queued invalidation
> > - migration support (for IOMMU as general?)
> > - more?
> > 
> > Peter Xu (13):
> >   q35: add "int-remap" flag to enable intr
> >   acpi: enable INTR for DMAR report structure
> >   intel_iommu: allow queued invalidation for IR
> >   intel_iommu: set IR bit for ECAP register
> >   acpi: add DMAR scope definition for root IOAPIC
> >   intel_iommu: define interrupt remap table addr register
> >   intel_iommu: handle interrupt remap enable
> >   intel_iommu: define several structs for IOMMU IR
> >   intel_iommu: provide helper function vtd_get_iommu
> >   ioapic-common: add iommu for IOAPICCommonState
> >   intel_iommu: add IR translation faults defines
> >   intel_iommu: ioapic: IR support for emulated IOAPIC
> >   intel_iommu: Add support for PCI MSI remap
> > 
> >  hw/core/machine.c |  20 ++
> >  hw/i386/acpi-build.c  |  41 +++-
> >  hw/i386/intel_iommu.c | 397 
> > +-
> >  hw/i386/intel_iommu_internal.h|  25 +++
> >  hw/intc/ioapic.c  |  36 +++-
> >  hw/intc/ioapic_common.c   |   2 +
> >  hw/pci-host/q35.c |   6 +-
> >  include/hw/acpi/acpi-defs.h   |  15 ++
> >  include/hw/boards.h   |   1 +
> >  

Re: [Qemu-devel] [PATCH 0/8] virtio: allow migration with dataplane

2016-02-18 Thread Michael S. Tsirkin
On Sun, Feb 14, 2016 at 06:17:03PM +0100, Paolo Bonzini wrote:
> Currently, dataplane threads are shut down during migration because
> vring.c is not able to track dirty memory.  However, all the relevant
> parts of QEMU have been made thread-safe now, so we can drop vring.c
> completely.  With these patches, virtio-dataplane is now simply "virtio
> with ioeventfd in a different AioContext".

virtio bits look OK to me.
Removing flags like started and disabled can happen later.

> 
> Paolo Bonzini (8):
>   block-migration: acquire AioContext as necessary
>   vring: make vring_enable_notification return void
>   virtio: add AioContext-specific function for host notifiers
>   virtio: export vring_notify as virtio_should_notify
>   virtio-blk: fix "disabled data plane" mode
>   virtio-blk: do not use vring in dataplane
>   virtio-scsi: do not use vring in dataplane
>   vring: remove
> 
>  hw/block/dataplane/virtio-blk.c   | 130 +-
>  hw/block/dataplane/virtio-blk.h   |   1 +
>  hw/block/virtio-blk.c |  51 +--
>  hw/scsi/virtio-scsi-dataplane.c   | 196 ++---
>  hw/scsi/virtio-scsi.c |  52 +--
>  hw/virtio/Makefile.objs   |   1 -
>  hw/virtio/dataplane/Makefile.objs |   1 -
>  hw/virtio/dataplane/vring.c   | 549 
> --
>  hw/virtio/virtio.c|  20 +-
>  include/hw/virtio/dataplane/vring-accessors.h |  75 
>  include/hw/virtio/dataplane/vring.h   |  51 ---
>  include/hw/virtio/virtio-blk.h|   4 +-
>  include/hw/virtio/virtio-scsi.h   |  21 +-
>  include/hw/virtio/virtio.h|   3 +
>  migration/block.c |  61 ++-
>  trace-events  |   3 -
>  16 files changed, 134 insertions(+), 1085 deletions(-)
>  delete mode 100644 hw/virtio/dataplane/Makefile.objs
>  delete mode 100644 hw/virtio/dataplane/vring.c
>  delete mode 100644 include/hw/virtio/dataplane/vring-accessors.h
>  delete mode 100644 include/hw/virtio/dataplane/vring.h
> 
> -- 
> 1.8.3.1



Re: [Qemu-devel] [PATCH 1/8] block-migration: acquire AioContext as necessary

2016-02-18 Thread Michael S. Tsirkin
ping
Paolo - were you going to address these questions?
Or did I miss it?

On Tue, Feb 16, 2016 at 03:17:11PM +0800, Fam Zheng wrote:
> On Sun, 02/14 18:17, Paolo Bonzini wrote:
> > This is needed because dataplane will run during block migration as well.
> > 
> > The block device migration code is quite liberal in taking the iothread
> > mutex.  For simplicity, keep it the same way, even though one could
> > actually choose between the BQL (for regular BlockDriverStates) and
> > the AioContext (for dataplane BlockDriverStates).  When the block layer
> > is made fully thread safe, aio_context_acquire shall go away altogether.
> > 
> > Signed-off-by: Paolo Bonzini 
> > ---
> >  migration/block.c | 61 
> > ---
> >  1 file changed, 49 insertions(+), 12 deletions(-)
> > 
> > diff --git a/migration/block.c b/migration/block.c
> > index a444058..6dd2327 100644
> > --- a/migration/block.c
> > +++ b/migration/block.c
> > @@ -60,9 +60,15 @@ typedef struct BlkMigDevState {
> >  int64_t cur_sector;
> >  int64_t cur_dirty;
> >  
> > -/* Protected by block migration lock.  */
> > +/* Data in the aio_bitmap is protected by block migration lock.
> > + * Allocation and free happen during setup and cleanup respectively.
> > + */
> >  unsigned long *aio_bitmap;
> > +
> > +/* Protected by block migration lock.  */
> >  int64_t completed_sectors;
> > +
> > +/* Protected by iothread lock / AioContext.  */
> >  BdrvDirtyBitmap *dirty_bitmap;
> >  Error *blocker;
> >  } BlkMigDevState;
> > @@ -100,7 +106,7 @@ typedef struct BlkMigState {
> >  int prev_progress;
> >  int bulk_completed;
> >  
> > -/* Lock must be taken _inside_ the iothread lock.  */
> > +/* Lock must be taken _inside_ the iothread lock and any AioContexts.  
> > */
> >  QemuMutex lock;
> >  } BlkMigState;
> >  
> > @@ -264,11 +270,13 @@ static int mig_save_device_bulk(QEMUFile *f, 
> > BlkMigDevState *bmds)
> >  
> >  if (bmds->shared_base) {
> >  qemu_mutex_lock_iothread();
> > +aio_context_acquire(bdrv_get_aio_context(bs));
> >  while (cur_sector < total_sectors &&
> > !bdrv_is_allocated(bs, cur_sector, MAX_IS_ALLOCATED_SEARCH,
> >_sectors)) {
> >  cur_sector += nr_sectors;
> >  }
> > +aio_context_release(bdrv_get_aio_context(bs));
> >  qemu_mutex_unlock_iothread();
> >  }
> >  
> > @@ -302,11 +310,21 @@ static int mig_save_device_bulk(QEMUFile *f, 
> > BlkMigDevState *bmds)
> >  block_mig_state.submitted++;
> >  blk_mig_unlock();
> >  
> > +/* We do not know if bs is under the main thread (and thus does
> > + * not acquire the AioContext when doing AIO) or rather under
> > + * dataplane.  Thus acquire both the iothread mutex and the
> > + * AioContext.
> > + *
> > + * This is ugly and will disappear when we make bdrv_* thread-safe,
> > + * without the need to acquire the AioContext.
> > + */
> >  qemu_mutex_lock_iothread();
> > +aio_context_acquire(bdrv_get_aio_context(bmds->bs));
> >  blk->aiocb = bdrv_aio_readv(bs, cur_sector, >qiov,
> >  nr_sectors, blk_mig_read_cb, blk);
> >  
> >  bdrv_reset_dirty_bitmap(bmds->dirty_bitmap, cur_sector, nr_sectors);
> > +aio_context_release(bdrv_get_aio_context(bmds->bs));
> >  qemu_mutex_unlock_iothread();
> >  
> >  bmds->cur_sector = cur_sector + nr_sectors;
> > @@ -321,8 +339,9 @@ static int set_dirty_tracking(void)
> >  int ret;
> >  
> >  QSIMPLEQ_FOREACH(bmds, _mig_state.bmds_list, entry) {
> > +/* Creating/dropping dirty bitmaps only requires the big QEMU 
> > lock.  */
> 
> Why? I don't think it is safe today.  The BDS state is mutated and it can race
> with bdrv_set_dirty() etc. (Also the refresh_total_sectors in bdrv_nb_sectors
> can even do read/write, no?)
> 
> >  bmds->dirty_bitmap = bdrv_create_dirty_bitmap(bmds->bs, BLOCK_SIZE,
> >NULL, NULL);
> >  if (!bmds->dirty_bitmap) {
> >  ret = -errno;
> >  goto fail;
> > @@ -332,11 +352,14 @@ static int set_dirty_tracking(void)
> >  return ret;
> >  }
> >  
> > +/* Called with iothread lock taken.  */
> > +
> >  static void unset_dirty_tracking(void)
> >  {
> >  BlkMigDevState *bmds;
> >  
> >  QSIMPLEQ_FOREACH(bmds, _mig_state.bmds_list, entry) {
> > +/* Creating/dropping dirty bitmaps only requires the big QEMU 
> > lock.  */
> 
> Ditto.
> 
> >  bdrv_release_dirty_bitmap(bmds->bs, bmds->dirty_bitmap);
> >  }
> >  }
> > @@ -597,21 +627,28 @@ static void block_migration_cleanup(void *opaque)
> >  {
> >  BlkMigDevState *bmds;
> >  BlkMigBlock *blk;
> > +AioContext *ctx;
> >  
> >  bdrv_drain_all();
> >  
> >  unset_dirty_tracking();
> >  
> > -

[Qemu-devel] [PATCH v3] usb: add pid check at the first of uhci_handle_td()

2016-02-18 Thread Gonglei
pid can be gotten from uhci device memory in uhci_handle_td(),
so the guest can trigger assert qemu if we get an invalid pid.
And the uhci spec 2.1.2 tells us The Host Controller sets Host
Controller Process Error bit to 1 when it detects a fatal error
and indicates that the Host Controller suffered a consistency
check failure while processing a Transfer Descriptor. An example
of a consistency check failure would be finding an illegal PID
field while processing the packet header portion of the TD.
When this error occurs, the Host Controller clears the Run/Stop
bit in the Command register to prevent further schedule execution.

We'd better to set UHCI_STS_HCPERR and kick an interrupt, check
the pid value at the first of uhci_handle_td function.

[Also fixed BZ 1070027]

Signed-off-by: Gonglei 
---
 v3:  checking whenever the pid is valid as very first
  thing in uhci_handle_td.  (As Gerd's suggestion, thanks)

 hw/usb/hcd-uhci.c | 22 --
 1 file changed, 16 insertions(+), 6 deletions(-)

diff --git a/hw/usb/hcd-uhci.c b/hw/usb/hcd-uhci.c
index 5ccfb83..03fe599 100644
--- a/hw/usb/hcd-uhci.c
+++ b/hw/usb/hcd-uhci.c
@@ -773,8 +773,22 @@ static int uhci_handle_td(UHCIState *s, UHCIQueue *q, 
uint32_t qh_addr,
 bool spd;
 bool queuing = (q != NULL);
 uint8_t pid = td->token & 0xff;
-UHCIAsync *async = uhci_async_find_td(s, td_addr);
+UHCIAsync *async;
 
+switch(pid) {
+case USB_TOKEN_OUT:
+case USB_TOKEN_SETUP:
+case USB_TOKEN_IN:
+break;
+default:
+/* invalid pid : frame interrupted */
+s->status |= UHCI_STS_HCPERR;
+s->cmd &= ~UHCI_CMD_RS;
+uhci_update_irq(s);
+return TD_RESULT_STOP_FRAME;
+}
+
+async = uhci_async_find_td(s, td_addr);
 if (async) {
 if (uhci_queue_verify(async->queue, qh_addr, td, td_addr, queuing)) {
 assert(q == NULL || q == async->queue);
@@ -880,11 +894,7 @@ static int uhci_handle_td(UHCIState *s, UHCIQueue *q, 
uint32_t qh_addr,
 break;
 
 default:
-/* invalid pid : frame interrupted */
-uhci_async_free(async);
-s->status |= UHCI_STS_HCPERR;
-uhci_update_irq(s);
-return TD_RESULT_STOP_FRAME;
+abort(); /* Never to execute */
 }
 
 if (async->packet.status == USB_RET_ASYNC) {
-- 
1.8.5.2





Re: [Qemu-devel] [PATCH v2] scripts/kvm/kvm_stat: Fix missing right parantheses and ".format(...)"

2016-02-18 Thread Janosch Frank
On 02/19/2016 03:20 AM, Fam Zheng wrote:
> They seem to have snuck in when applying Janosch Frank
> 's previous patch.
> 
> Signed-off-by: Fam Zheng 
> 
> ---
> v2: Also fix .format. [Janosch]
> ---
>  scripts/kvm/kvm_stat | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/scripts/kvm/kvm_stat b/scripts/kvm/kvm_stat
> index 3cf1181..769d884 100755
> --- a/scripts/kvm/kvm_stat
> +++ b/scripts/kvm/kvm_stat
> @@ -796,11 +796,12 @@ def check_access(options):
>  sys.stderr.write("Please enable CONFIG_TRACING in your kernel "
>   "when using the option -t (default).\n"
>   "If it is enabled, make {0} readable by the "
> - "current user.\n")
> + "current user.\n"
> + .format(PATH_DEBUGFS_TRACING))
>  if options.tracepoints:
>  sys.exit(1)
> 
> -sys.stderr.write("Falling back to debugfs statistics!\n"
> +sys.stderr.write("Falling back to debugfs statistics!\n")
>  options.debugfs = True
>  sleep(5)
> 
Reviewed-by: Janosch Frank 
Tested-by: Janosch Frank 

Thanks




Re: [Qemu-devel] SUMMARY: Re: [RFC 1/1] nbd (specification): add NBD_CMD_WRITE_ZEROES command

2016-02-18 Thread Denis V. Lunev

On 02/18/2016 08:23 PM, Denis V. Lunev wrote:

On 02/18/2016 07:35 PM, Eric Blake wrote:

On 02/18/2016 02:18 AM, Roman Kagan wrote:

On Wed, Feb 17, 2016 at 01:58:47PM -0700, Eric Blake wrote:

On 02/17/2016 11:10 AM, Denis V. Lunev wrote:

@@ -446,6 +448,11 @@ The following request types exist:
  about the contents of the export affected by this command, 
until

  overwriting it again with `NBD_CMD_WRITE`.
  +* `NBD_CMD_WRITE_ZEROES` (6)
+
+A request to write zeroes. The command is functional 
equivalent of
+the NBD_WRITE_COMMAND but without payload sent through the 
channel.

This lets us push holes during writes. Do we have the converse
operation, that is, an easy way to query if a block of data will 
read as
all zeroes, and therefore the client can bypass reading that 
portion of
the disk (in other words, an equivalent to 
lseek(SEEK_HOLE/SEEK_DATA))?

The spec doesn't have anything like that.

OTOH, unlike the write case, where you have all the information and 
just

choose whether to send normal write or zero write, the extra round-trip
of a separate SEEK_HOLE/SEEK_DATA request may lead to actually 
degrading

the overall throughput.

Rather it may be a better idea to add something like sparse read where
the server would, instead of sending the full length of data in the
response payload, send a smarter variable-length package with a
scatter-gather list or a bitmap of used blocks in the beginning, and 
let

the client decode it and fill the gaps with zeros.

Sure, that would work too, and sounds nicer.  Either way, the point is
that we should strongly consider improving the NBD protocol to allow
more efficient handling of sparse files, in both the push and in the
pull direction.  Qemu already has a desire to use both directions of
improvements, but there are more programs, both clients and servers,
outside of qemu, that could benefit from such protocol improvements.


OK

Here is a short summary of features which seems necessary from QEMU 
point of

view:
- ability to avoid sending zeroes during write operation. The proposal 
comes in

  the thread-starter letter
- ability to request block status (allocate/not allocated) from 
server. This seems

  interesting to preserve "sparseness" of the transferring data
- ability to skip zeroes during read operation, i.e. something like 
READ2 command

  which will return vector of chunks as a reply

All 3 features seem usable for generic NBD use-cases and not only for 
QEMU.


If there are no objections I'll sum this up and come with a 
specification draft.


Den

P.S. I have added here all parties which have participated in 
conversation in

   different threads on QEMU side.


interesting point from a verbal discussion with one of my friends.
Protocol level compression could eliminate the necessity to
think about zeroes in channel either from read or from write
point of views and will also reduce the amount of data to
transfer.

Den



Re: [Qemu-devel] [PATCH] fw_cfg: unbreak migration compatibility for 2.4 and earlier machines

2016-02-18 Thread Gerd Hoffmann
On Do, 2016-02-18 at 20:31 +0100, Laszlo Ersek wrote:
> When I reviewed Marc's fw_cfg DMA patches, I completely missed that the
> way we set dma_enabled would break migration.

Patch looks fine to me ...

> Not tested:
> * actual migration
> * when board code doesn't request DMA support
> 
> Testing feedback from people who use migration would be nice.

... but some actual migration testing would be great indeed.

thanks,
  Gerd




Re: [Qemu-devel] [PATCH v8 0/5] add ACPI node for fw_cfg on pc and arm

2016-02-18 Thread Gerd Hoffmann
On Di, 2016-02-16 at 18:49 -0500, Gabriel L. Somlo wrote:
> On Thu, Feb 11, 2016 at 05:06:00PM -0500, Gabriel L. Somlo wrote:
> > Generate an ACPI DSDT node for fw_cfg on pc and arm guests.
> > 
> > New since v7:
> > 
> > - edited commit blurb on 3/5 to match updated content, i.e. that
> >   the ACPI node is now inserted into the DSDT (no longer the SSDT).
> >   (Thanks to Igor Mammedov for catching that!)
> 
> BTW, regarding Igor's question about Windows starting to search for a
> driver: Just for grins, I installed Windows 10 (with qemu git master
> *before* this series was applied). Then, after applying the series,
> DeviceManager was happy and had no unknown hardware listed.

Igor?  Could I have a review for the acpi bits?  Then I can prepare a
fw_cfg pull request early next week with this and the fw-cfg-dma
migration fix from laszlo

thanks,
  Gerd




Re: [Qemu-devel] [PATCH 00/13] IOMMU: Enable interrupt remapping for Intel IOMMU

2016-02-18 Thread Jan Kiszka
Hi Peter,

On 2016-02-19 04:30, Peter Xu wrote:
> This patchset provide very basic functionalities for interrupt
> remapping (IR) support of the emulated Intel IOMMU device.

Interesting. Some questions:

- Were you aware of
http://git.kiszka.org/?p=qemu.git;a=shortlog;h=refs/heads/queues/vtd-intremap,
and can you comment on how this relates to your approach? Mine included
HPET support, e.g.

- Rita Sinha is currently working on integrating my old patches with the
split-irqchip to get KVM working (as an Outreachy project). It's
probably a bit unfortunate to consider a different horse that late in to
project. What do you think, how could we benefit from each other?

- Radim was telling me to look on this as well. How do your efforts
correlate?

> 
> By default, IR is disabled to be better compatible with current
> QEMU. To enable IR, we can using the following command to boot a
> IR-supported VM with basic network (still do not support kvm-ioapic,
> so we need to specify kernel_irqchip=off here):
> 
> $ qemu-system-x86_64 -M q35,iommu=on,int_remap=on,kernel_irqchip=off \
>  -enable-kvm -m 1024 -s \
>  -monitor telnet::,server,nowait \
>  -netdev user,id=user.0,hostfwd=tcp::-:22 \
>  -device virtio-net-pci,netdev=user.0 \
>/var/lib/libvirt/images/vm1.qcow2
> 
> When guest boots, we can verify whether IR enabled by grepping the
> dmesg like:
> 
> [root@localhost ~]# journalctl -k | grep "DMAR-IR"
> Feb 19 11:21:23 localhost.localdomain kernel: DMAR-IR: IOAPIC id 0 under DRHD 
> base  0xfed9 IOMMU 0
> Feb 19 11:21:23 localhost.localdomain kernel: DMAR-IR: Enabled IRQ remapping 
> in xapic mode
> 
> Currently only two devices are supported:
> 
> - Emulated IOAPIC device
> - PCI Devices
> 
> TODO List:
> 
> - kvm-ioapic support
> - vhost support
> - pass through device support
> - EIM support
> - IR fault reporting
> - source-id validation for IRTE
> - IRTE cache and better queued invalidation
> - migration support (for IOMMU as general?)
> - more?
> 
> Peter Xu (13):
>   q35: add "int-remap" flag to enable intr
>   acpi: enable INTR for DMAR report structure
>   intel_iommu: allow queued invalidation for IR
>   intel_iommu: set IR bit for ECAP register
>   acpi: add DMAR scope definition for root IOAPIC
>   intel_iommu: define interrupt remap table addr register
>   intel_iommu: handle interrupt remap enable
>   intel_iommu: define several structs for IOMMU IR
>   intel_iommu: provide helper function vtd_get_iommu
>   ioapic-common: add iommu for IOAPICCommonState
>   intel_iommu: add IR translation faults defines
>   intel_iommu: ioapic: IR support for emulated IOAPIC
>   intel_iommu: Add support for PCI MSI remap
> 
>  hw/core/machine.c |  20 ++
>  hw/i386/acpi-build.c  |  41 +++-
>  hw/i386/intel_iommu.c | 397 
> +-
>  hw/i386/intel_iommu_internal.h|  25 +++
>  hw/intc/ioapic.c  |  36 +++-
>  hw/intc/ioapic_common.c   |   2 +
>  hw/pci-host/q35.c |   6 +-
>  include/hw/acpi/acpi-defs.h   |  15 ++
>  include/hw/boards.h   |   1 +
>  include/hw/i386/intel_iommu.h | 121 
>  include/hw/i386/ioapic_internal.h |   3 +
>  include/hw/pci/msi.h  |   4 +
>  12 files changed, 651 insertions(+), 20 deletions(-)
> 

Is there a git tree with your patches somewhere?

Thanks,
Jan



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH v3 0/2] Fix migration of old pseries

2016-02-18 Thread Greg Kurz
On Thu, 18 Feb 2016 21:00:38 +0100
Laurent Vivier  wrote:

> On 18/02/2016 12:32, Greg Kurz wrote:
> > QEMU 2.4 broke the migration of old pseries machine with the addition
> > of configuration sections, which are sent unconditionally.
> > 
> > We assume that QEMU 2.3 is more deployed than any newer release (based on
> > the versions currently shipped by most distros). This v3 series hence
> > reverses the logic from v2: it now fully fixes migration of old pseries
> > from/to QEMU 2.3 and provides a manual workaround for the QEMU 2.4/2.4.1/2.5
> > case.
> > 
> > With this series, I could migrate the same pseries-2.3 instance in a full
> > 2.3->2.6->2.5->2.6->2.4->2.6->2.3 cycle.
> > 
> > ---
> > 
> > Greg Kurz (2):
> >   spapr: skip configuration section during migration of older machines
> >   migration: allow machine to enforce configuration section migration
> > 
> > 
> >  hw/core/machine.c   |   21 +
> >  hw/ppc/spapr.c  |1 +
> >  include/hw/boards.h |1 +
> >  migration/savevm.c  |   10 --
> >  qemu-options.hx |3 ++-
> >  5 files changed, 33 insertions(+), 3 deletions(-)  
> 
> For the series
> Tested-by: Laurent Vivier 
> 
> Results of tests:
> 
> pseries-2.3
> 
> qemu-2.3.0 -> qemu-master: OK
> qemu-2.3.1 -> qemu-master: OK
> qemu-2.4.0 -> qemu-master: OK with qom-set
> qemu-2.4.1 -> qemu-master: OK with qom-set
> qemu-2.5.0 -> qemu-master: OK with qom-set
> qemu-master -> qemu-2.3.0: OK
> qemu-master -> qemu-2.3.1: OK
> qemu-master -> qemu-2.4.0: OK with qom-set
> qemu-master -> qemu-2.4.1: OK with qom-set
> qemu-master -> qemu-2.5.0: OK with qom-set
> qemu-master -> qemu-master: OK
> 
> pseries-2.4, pseries-2.5 and pseries-2.6 works well with the patch
> applied too.
> 
> master is "b527c9b qcow2: Write full header on image creation" + this series
> 
> Laurent
> 

Thanks for the reviewing and testing !

Cheers.

--
Greg





Re: [Qemu-devel] [PATCH v2 7/9] i.MX: Add i.MX6 SOC implementation.

2016-02-18 Thread Jean-Christophe DUBOIS

Le 18/02/2016 22:46, Peter Maydell a écrit :

On 18 February 2016 at 20:51, Jean-Christophe DUBOIS
 wrote:

Le 16/02/2016 22:57, Peter Maydell a écrit :

On 16 February 2016 at 21:47, Jean-Christophe DUBOIS
 wrote:

In QEMU, other Cortex A9 (Versatilepb.c, Exynos, Zynq ...) are also setting
has_el3 to false ...

So these generally are the "legacy" platforms which were
added before we ever had EL3 support in QEMU. For them it's hard
to turn the EL3 support on for the board even if in theory
it ought to be on, because we don't know what users are
running on it that we might break. With a new to QEMU board
we have an opportunity to get it right from the start.


OK, so is the "highbank" the only Qemu Cortex A9 board supporting
el3 yet?

Yep. We don't have many A9 boards and most of those we do have
are in the 'legacy' bucket.


-kernel I would expect to work, though, at least if the
only issue is the interrupt controller setup. It seems
worth investigating why it goes wrong.


Well, I can boot uniprocessor (-smp 1) without trouble but if I turn logs on
(guest_errors,unimp) I am getting a lot of

gic_dist_writeb: Bad offset 38x (a few at startup)
Ignoring attempt to switch CPSR_A flag from non-secure world with
SCR.AW bit clear (a lot)
Ignoring attempt to switch CPSR_F flag from non-secure world with
SCR.FW bit clear (a few)

This would only be a problem if your kernel needed to use FIQ,
I think.


It is a standard linux (4.5-rc1) so it should not use FIQ I guess.

Now why is the linux code trying to do things it is not allowed to do in 
its security level ?


Or would Linux expect the secure world to set these bits before running it?




I am not sure if this is a problem. Do you have some opinion on this?

When I turn SMP (-smp 2 or more), I am unable to complete the boot. As soon
as my secondary cpu is started QEMU will continue to boot "very slowly" but
doesn't get to the linux user prompt overnight.

Does SMP work with EL3 not enabled, or is this a different bug?


If I set has_el3 to false, I can boot the 4 cores without problem. With 
has_el3 set to true (default value) I am getting the above behavior 
(boot OK in uniprocessor mode, and misbehaving if -smp >= 2).


JC



thanks
-- PMM






Re: [Qemu-devel] flushing before updating pc.ram

2016-02-18 Thread TeLeMan
On Fri, Feb 19, 2016 at 12:53 AM, Egbert S.  wrote:
> I have here a case (over at GitHub unicorn-engine/unicorn
> tests/unit/test_tb_x86.c) that is running a stack-based Alpha-Mixed sample
> that contains an instruction that changed an operand of the next
> instruction, the one that QEMU does not detect nor execute properly (even
> with TARGET_HAS_PRECISE_SMC define compiled in).  Stack-based writeable
> MemoryRegion "pc.ram" starts at 0x6000.
>
>
> Live trace of QEMU is:
> ecx = 0x5ff8
> # Modify code location 6028 to 0x10 from 0x51
> 6021  304130  xor  byte ptr [ocx + 0x30], al
> 6024  41  inc ecx
> 6025  6b414151 imul eax, dword ptr [ecx + 0x41], 0x51
>
> In the notdirty_mem_write(), it does a tb_invalidate_phys_page_fast()
> firstly before writing directly to the "pc.ram".
>
> As a result, the newly reconstructed TB rebuilds the 'imul' micro-operation
> sequence , but still retrieving the original 0x51 immediate byte operand
> (and not the expected 0x10).
>
> Interestingly, doing a read operation of the exact 0x6028 byte retrieves
> that updated (and correct) 0x10.
>
> Could it be that in the notdirty_mem_write(), that stX_p() must be performed
> firstly before the tb_invalidate_phys_page_fast()?
>

I think it is same as the old issue:

https://lists.nongnu.org/archive/html/qemu-devel/2014-08/msg02161.html



Re: [Qemu-devel] [RFC] QMP: add query-hotpluggable-cpus

2016-02-18 Thread David Gibson
On Thu, Feb 18, 2016 at 11:37:39AM +0100, Igor Mammedov wrote:
> On Thu, 18 Feb 2016 14:39:52 +1100
> David Gibson  wrote:
> 
> > On Tue, Feb 16, 2016 at 11:36:55AM +0100, Igor Mammedov wrote:
> > > On Mon, 15 Feb 2016 20:43:41 +0100
> > > Markus Armbruster  wrote:
> > >   
> > > > Igor Mammedov  writes:
> > > >   
> > > > > it will allow mgmt to query present and possible to hotplug CPUs
> > > > > it is required from a target platform that wish to support
> > > > > command to set board specific MachineClass.possible_cpus() hook,
> > > > > which will return a list of possible CPUs with options
> > > > > that would be needed for hotplugging possible CPUs.
> > > > >
> > > > > For RFC there are:
> > > > >'arch_id': 'int' - mandatory unique CPU number,
> > > > >   for x86 it's APIC ID for ARM it's MPIDR
> > > > >'type': 'str' - CPU object type for usage with device_add
> > > > >
> > > > > and a set of optional fields that would allows mgmt tools
> > > > > to know at what granularity and where a new CPU could be
> > > > > hotplugged;
> > > > > [node],[socket],[core],[thread]
> > > > > Hopefully that should cover needs for CPU hotplug porposes for
> > > > > magor targets and we can extend structure in future adding
> > > > > more fields if it will be needed.
> > > > >
> > > > > also for present CPUs there is a 'cpu_link' field which
> > > > > would allow mgmt inspect whatever object/abstraction
> > > > > the target platform considers as CPU object.
> > > > >
> > > > > For RFC purposes implements only for x86 target so far.
> > > > 
> > > > Adding ad hoc queries as we go won't scale.  Could this be solved by a
> > > > generic introspection interface?  
> > > Do you mean generic QOM introspection?
> > > 
> > > Using QOM we could have '/cpus' container and create QOM links
> > > for exiting (populated links) and possible (empty links) CPUs.
> > > However in that case link's name will need have a special format
> > > that will convey an information necessary for mgmt to hotplug
> > > a CPU object, at least:
> > >   - where: [node],[socket],[core],[thread] options
> > >   - optionally what CPU object to use with device_add command  
> > 
> > Hmm.. is it not enough to follow the link and get the topology
> > information by examining the target?
> One can't follow a link if it's an empty one, hence
> CPU placement information should be provided somehow,
> either:

Ah, right, so the issue is determining the socket/core/thread
addresses that cpus which aren't yet present will have.

>  * by precreating cpu-package objects with properties that
>would describe it /could be inspected via OQM/

So, we could do this, but I think the natural way would be to have the
information for each potential thread in the package.  Just putting
say "core number" in the package itself assumes more than I'd like
about how packages sit in the heirarchy.  Plus, it means that
management has a bunch of cases to deal with: package has all the
information, package has just a core id, package has just a socket id,
and so forth.

It is a but clunky that when the package is plugged, this information
will have to sit parallel to the array of actual thread links.

Markus or Andreas is there a natural way to present a list of (node,
socket, core, thread) tuples in the package object?  Preferably
without having to create a whole bunch of "potential thread" objects
just for the purpose.

> or
>  * via QMP/HMP command that would provide the same information
>only without need to precreate anything. The only difference
>is that it allows to use -device/device_add for new CPUs.

I'd be ok with that option as well.  I'd be thinking it would be
implemented via a class method on the package object which returns the
addresses that its contained threads will have, whether or not they're
present right now.  Does that make sense?

> Considering that we would need to create HMP command so user could
> inspect possible CPUs from monitor, it would need to do the same as
> QMP command regardless of whether it's cpu-package objects or
> just board calculated info a runtime.
>  
> > In the design Eduardo and I have been discussing we're actually not
> > planning to allow device_add to construct CPU packages - at least, not
> > for the time being.  The idea is that the machine type will construct
> > enough packages for maxcpus, and management just toggles them on and
> > off.
> Another question is how it would work wrt migration?

I'm assuming the "present" bits would be added to the migration
stream; seems straightforward enough to me.  Is there some
consideration I'm missing?

> > We can eventually allow construction of new packages with device_add,
> > but for now that gets hidden inside the platform until we've worked
> > out more details.
> > 
> > > Another approach to do QOM introspection would be to model hierarchy 
> > > of objects like 

[Qemu-devel] [PATCH v5 5/6] s390/virtio-ccw: Add hotplug handler

2016-02-18 Thread Matthew Rosato
First consumer will be CPU hotplug, to update machine/cpu[n]
links during cpu plug.

Signed-off-by: Matthew Rosato 
---
 hw/s390x/s390-virtio-ccw.c | 40 
 1 file changed, 40 insertions(+)

diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
index b05ed8b..31c1082 100644
--- a/hw/s390x/s390-virtio-ccw.c
+++ b/hw/s390x/s390-virtio-ccw.c
@@ -156,10 +156,46 @@ static void ccw_init(MachineState *machine)
 gtod_save, gtod_load, kvm_state);
 }
 
+static void s390_cpu_plug(HotplugHandler *hotplug_dev,
+DeviceState *dev, Error **errp)
+{
+gchar *name;
+S390CPU *cpu = S390_CPU(dev);
+CPUState *cs = CPU(dev);
+
+name = g_strdup_printf("cpu[%i]", cpu->env.cpu_num);
+object_property_set_link(OBJECT(qdev_get_machine()), OBJECT(cs), name,
+ errp);
+}
+
+static void s390_machine_device_plug(HotplugHandler *hotplug_dev,
+ DeviceState *dev, Error **errp)
+{
+if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
+s390_cpu_plug(hotplug_dev, dev, errp);
+}
+}
+
+static void s390_machine_device_unplug(HotplugHandler *hotplug_dev,
+   DeviceState *dev, Error **errp)
+{
+
+}
+
+static HotplugHandler *s390_get_hotplug_handler(MachineState *machine,
+DeviceState *dev)
+{
+if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
+return HOTPLUG_HANDLER(machine);
+}
+return NULL;
+}
+
 static void ccw_machine_class_init(ObjectClass *oc, void *data)
 {
 MachineClass *mc = MACHINE_CLASS(oc);
 NMIClass *nc = NMI_CLASS(oc);
+HotplugHandlerClass *hc = HOTPLUG_HANDLER_CLASS(oc);
 
 mc->init = ccw_init;
 mc->reset = s390_machine_reset;
@@ -171,6 +207,9 @@ static void ccw_machine_class_init(ObjectClass *oc, void 
*data)
 mc->no_sdcard = 1;
 mc->use_sclp = 1;
 mc->max_cpus = 255;
+mc->get_hotplug_handler = s390_get_hotplug_handler;
+hc->plug = s390_machine_device_plug;
+hc->unplug = s390_machine_device_unplug;
 nc->nmi_monitor_handler = s390_nmi;
 }
 
@@ -232,6 +271,7 @@ static const TypeInfo ccw_machine_info = {
 .class_init= ccw_machine_class_init,
 .interfaces = (InterfaceInfo[]) {
 { TYPE_NMI },
+{ TYPE_HOTPLUG_HANDLER},
 { }
 },
 };
-- 
1.9.1




[Qemu-devel] [PATCH v5 6/6] s390x/cpu: Allow hotplug of CPUs

2016-02-18 Thread Matthew Rosato
Implement cpu hotplug routine and add the machine hook.

Signed-off-by: Matthew Rosato 
---
 hw/s390x/s390-virtio-ccw.c |  1 +
 target-s390x/cpu.c | 45 +
 target-s390x/cpu.h |  1 +
 3 files changed, 47 insertions(+)

diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
index 31c1082..3be41ee 100644
--- a/hw/s390x/s390-virtio-ccw.c
+++ b/hw/s390x/s390-virtio-ccw.c
@@ -199,6 +199,7 @@ static void ccw_machine_class_init(ObjectClass *oc, void 
*data)
 
 mc->init = ccw_init;
 mc->reset = s390_machine_reset;
+mc->hot_add_cpu = s390_hot_add_cpu;
 mc->block_default_type = IF_VIRTIO;
 mc->no_cdrom = 1;
 mc->no_floppy = 1;
diff --git a/target-s390x/cpu.c b/target-s390x/cpu.c
index 8dfd063..99394ec 100644
--- a/target-s390x/cpu.c
+++ b/target-s390x/cpu.c
@@ -32,6 +32,12 @@
 #include "trace.h"
 #ifndef CONFIG_USER_ONLY
 #include "sysemu/arch_init.h"
+#include "sysemu/sysemu.h"
+#include "hw/boards.h"
+#include "hw/s390x/sclp.h"
+#include "qom/cpu.h"
+
+#define last_cpu QTAILQ_LAST(, CPUTailQ)
 #endif
 
 #define CR0_RESET   0xE0UL
@@ -211,6 +217,12 @@ static void s390_cpu_realizefn(DeviceState *dev, Error 
**errp)
 #endif
 
 scc->parent_realize(dev, errp);
+
+#if !defined(CONFIG_USER_ONLY)
+if (dev->hotplugged) {
+raise_irq_cpu_hotplug();
+}
+#endif
 }
 
 static void s390_cpu_initfn(Object *obj)
@@ -254,6 +266,39 @@ static void s390_cpu_finalize(Object *obj)
 }
 
 #if !defined(CONFIG_USER_ONLY)
+void s390_hot_add_cpu(const int64_t id, Error **errp)
+{
+MachineState *machine = MACHINE(qdev_get_machine());
+
+if (id < 0) {
+error_setg(errp, "Invalid CPU id: %" PRIi64, id);
+return;
+}
+
+if (cpu_exists(id)) {
+error_setg(errp, "Unable to add CPU: %" PRIi64
+   ", it already exists", id);
+return;
+}
+
+if (id >= max_cpus) {
+error_setg(errp, "Unable to add CPU: %" PRIi64
+   ", max allowed: %d", id, max_cpus - 1);
+return;
+}
+
+if (id != (last_cpu->cpu_index + 1)) {
+error_setg(errp, "Unable to add CPU: %" PRIi64
+   ", The next available id is %d", id,
+   (last_cpu->cpu_index + 1));
+return;
+}
+
+cpu_s390x_init(machine->cpu_model);
+}
+#endif
+
+#if !defined(CONFIG_USER_ONLY)
 static bool disabled_wait(CPUState *cpu)
 {
 return cpu->halted && !(S390_CPU(cpu)->env.psw.mask &
diff --git a/target-s390x/cpu.h b/target-s390x/cpu.h
index 06ca60b..cf1e2bd 100644
--- a/target-s390x/cpu.h
+++ b/target-s390x/cpu.h
@@ -541,6 +541,7 @@ static inline uint8_t s390_cpu_get_state(S390CPU *cpu)
 
 void gtod_save(QEMUFile *f, void *opaque);
 int gtod_load(QEMUFile *f, void *opaque, int version_id);
+void s390_hot_add_cpu(const int64_t id, Error **errp);
 
 /* service interrupts are floating therefore we must not pass an cpustate */
 void s390_sclp_extint(uint32_t parm);
-- 
1.9.1




[Qemu-devel] [PATCH v5 4/6] s390x/cpu: Add CPU property links

2016-02-18 Thread Matthew Rosato
Link each CPUState as property machine/cpu[n] during initialization.
Additionally, maintain an array of state pointers indexed by CPU
id for fast lookup during interrupt handling.

Signed-off-by: Matthew Rosato 
---
 hw/s390x/s390-virtio.c | 26 +-
 1 file changed, 17 insertions(+), 9 deletions(-)

diff --git a/hw/s390x/s390-virtio.c b/hw/s390x/s390-virtio.c
index b3707f4..6bd9803 100644
--- a/hw/s390x/s390-virtio.c
+++ b/hw/s390x/s390-virtio.c
@@ -60,15 +60,16 @@
 #define S390_TOD_CLOCK_VALUE_MISSING0x00
 #define S390_TOD_CLOCK_VALUE_PRESENT0x01
 
-static S390CPU **ipi_states;
+static S390CPU **cpu_states;
 
 S390CPU *s390_cpu_addr2state(uint16_t cpu_addr)
 {
-if (cpu_addr >= smp_cpus) {
+if (cpu_addr >= max_cpus) {
 return NULL;
 }
 
-return ipi_states[cpu_addr];
+/* Fast lookup via CPU ID */
+return cpu_states[cpu_addr];
 }
 
 void s390_init_ipl_dev(const char *kernel_filename,
@@ -98,19 +99,26 @@ void s390_init_ipl_dev(const char *kernel_filename,
 void s390_init_cpus(MachineState *machine)
 {
 int i;
+gchar *name;
 
 if (machine->cpu_model == NULL) {
 machine->cpu_model = "host";
 }
 
-ipi_states = g_malloc(sizeof(S390CPU *) * smp_cpus);
+cpu_states = g_malloc0(sizeof(S390CPU *) * max_cpus);
 
-for (i = 0; i < smp_cpus; i++) {
-S390CPU *cpu;
-
-cpu = cpu_s390x_init(machine->cpu_model);
+for (i = 0; i < max_cpus; i++) {
+name = g_strdup_printf("cpu[%i]", i);
+object_property_add_link(OBJECT(machine), name, TYPE_S390_CPU,
+ (Object **) _states[i],
+ object_property_allow_set_link,
+ OBJ_PROP_LINK_UNREF_ON_RELEASE,
+ _abort);
+g_free(name);
+}
 
-ipi_states[i] = cpu;
+for (i = 0; i < smp_cpus; i++) {
+cpu_s390x_init(machine->cpu_model);
 }
 }
 
-- 
1.9.1




[Qemu-devel] [PATCH v5 2/6] s390x/cpu: Set initial CPU state in common routine

2016-02-18 Thread Matthew Rosato
Both initial and hotplugged CPUs need to set the same initial
state.

Signed-off-by: Matthew Rosato 
Reviewed-by: David Hildenbrand 
---
 hw/s390x/s390-virtio.c | 4 
 target-s390x/cpu.c | 2 ++
 2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/hw/s390x/s390-virtio.c b/hw/s390x/s390-virtio.c
index b576811..b3707f4 100644
--- a/hw/s390x/s390-virtio.c
+++ b/hw/s390x/s390-virtio.c
@@ -107,14 +107,10 @@ void s390_init_cpus(MachineState *machine)
 
 for (i = 0; i < smp_cpus; i++) {
 S390CPU *cpu;
-CPUState *cs;
 
 cpu = cpu_s390x_init(machine->cpu_model);
-cs = CPU(cpu);
 
 ipi_states[i] = cpu;
-cs->halted = 1;
-cs->exception_index = EXCP_HLT;
 }
 }
 
diff --git a/target-s390x/cpu.c b/target-s390x/cpu.c
index 73a910d..603c2a1 100644
--- a/target-s390x/cpu.c
+++ b/target-s390x/cpu.c
@@ -219,6 +219,8 @@ static void s390_cpu_initfn(Object *obj)
 #endif
 
 cs->env_ptr = env;
+cs->halted = 1;
+cs->exception_index = EXCP_HLT;
 cpu_exec_init(cs, _abort);
 #if !defined(CONFIG_USER_ONLY)
 qemu_register_reset(s390_cpu_machine_reset_cb, cpu);
-- 
1.9.1




[Qemu-devel] [PATCH v5 3/6] s390x/cpu: Move some CPU initialization into realize

2016-02-18 Thread Matthew Rosato
In preparation for hotplug, defer some CPU initialization
until the device is actually being realized.

Signed-off-by: Matthew Rosato 
---
 target-s390x/cpu.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/target-s390x/cpu.c b/target-s390x/cpu.c
index 603c2a1..8dfd063 100644
--- a/target-s390x/cpu.c
+++ b/target-s390x/cpu.c
@@ -195,7 +195,13 @@ static void s390_cpu_realizefn(DeviceState *dev, Error 
**errp)
 {
 CPUState *cs = CPU(dev);
 S390CPUClass *scc = S390_CPU_GET_CLASS(dev);
+S390CPU *cpu = S390_CPU(dev);
+CPUS390XState *env = >env;
 
+#if !defined(CONFIG_USER_ONLY)
+qemu_register_reset(s390_cpu_machine_reset_cb, cpu);
+#endif
+env->cpu_num = cs->cpu_index;
 s390_cpu_gdb_init(cs);
 qemu_init_vcpu(cs);
 #if !defined(CONFIG_USER_ONLY)
@@ -213,7 +219,6 @@ static void s390_cpu_initfn(Object *obj)
 S390CPU *cpu = S390_CPU(obj);
 CPUS390XState *env = >env;
 static bool inited;
-static int cpu_num = 0;
 #if !defined(CONFIG_USER_ONLY)
 struct tm tm;
 #endif
@@ -223,7 +228,6 @@ static void s390_cpu_initfn(Object *obj)
 cs->exception_index = EXCP_HLT;
 cpu_exec_init(cs, _abort);
 #if !defined(CONFIG_USER_ONLY)
-qemu_register_reset(s390_cpu_machine_reset_cb, cpu);
 qemu_get_timedate(, 0);
 env->tod_offset = TOD_UNIX_EPOCH +
   (time2tod(mktimegm()) * 10ULL);
@@ -232,7 +236,6 @@ static void s390_cpu_initfn(Object *obj)
 env->cpu_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, s390x_cpu_timer, cpu);
 s390_cpu_set_state(CPU_STATE_STOPPED, cpu);
 #endif
-env->cpu_num = cpu_num++;
 
 if (tcg_enabled() && !inited) {
 inited = true;
-- 
1.9.1




[Qemu-devel] [PATCH v5 1/6] s390x/cpu: Cleanup init in preparation for hotplug

2016-02-18 Thread Matthew Rosato
Ensure a valid cpu_model is set upfront by setting the
default value directly into the MachineState when none is
specified.  This is needed to ensure hotplugged CPUs share
the same cpu_model.

Signed-off-by: Matthew Rosato 
Reviewed-by: David Hildenbrand 
---
 hw/s390x/s390-virtio-ccw.c | 2 +-
 hw/s390x/s390-virtio.c | 8 
 hw/s390x/s390-virtio.h | 2 +-
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
index 89f5d0d..b05ed8b 100644
--- a/hw/s390x/s390-virtio-ccw.c
+++ b/hw/s390x/s390-virtio-ccw.c
@@ -136,7 +136,7 @@ static void ccw_init(MachineState *machine)
 virtio_ccw_register_hcalls();
 
 /* init CPUs */
-s390_init_cpus(machine->cpu_model);
+s390_init_cpus(machine);
 
 if (kvm_enabled()) {
 kvm_s390_enable_css_support(s390_cpu_addr2state(0));
diff --git a/hw/s390x/s390-virtio.c b/hw/s390x/s390-virtio.c
index c320878..b576811 100644
--- a/hw/s390x/s390-virtio.c
+++ b/hw/s390x/s390-virtio.c
@@ -95,12 +95,12 @@ void s390_init_ipl_dev(const char *kernel_filename,
 qdev_init_nofail(dev);
 }
 
-void s390_init_cpus(const char *cpu_model)
+void s390_init_cpus(MachineState *machine)
 {
 int i;
 
-if (cpu_model == NULL) {
-cpu_model = "host";
+if (machine->cpu_model == NULL) {
+machine->cpu_model = "host";
 }
 
 ipi_states = g_malloc(sizeof(S390CPU *) * smp_cpus);
@@ -109,7 +109,7 @@ void s390_init_cpus(const char *cpu_model)
 S390CPU *cpu;
 CPUState *cs;
 
-cpu = cpu_s390x_init(cpu_model);
+cpu = cpu_s390x_init(machine->cpu_model);
 cs = CPU(cpu);
 
 ipi_states[i] = cpu;
diff --git a/hw/s390x/s390-virtio.h b/hw/s390x/s390-virtio.h
index eebce8e..ffd014c 100644
--- a/hw/s390x/s390-virtio.h
+++ b/hw/s390x/s390-virtio.h
@@ -19,7 +19,7 @@
 typedef int (*s390_virtio_fn)(const uint64_t *args);
 void s390_register_virtio_hypercall(uint64_t code, s390_virtio_fn fn);
 
-void s390_init_cpus(const char *cpu_model);
+void s390_init_cpus(MachineState *machine);
 void s390_init_ipl_dev(const char *kernel_filename,
const char *kernel_cmdline,
const char *initrd_filename,
-- 
1.9.1




[Qemu-devel] [PATCH v5 0/6] Allow hotplug of s390 CPUs

2016-02-18 Thread Matthew Rosato
Changes from v4->v5:
* Patch #3 - Drop static next_cpu_id (Igor)
* Patch #4 - Drop register_cpustate in favor of machine hotplug handler.
  Re-write commit message as a result. (Igor)
* Patch #5 - Revive hotplug handler code from previous patch set.  Use 
  plug callback to update property links instead of register_cpustate. (Igor)
* Patch #6 - Use last_cpu instead of next_cpu_id to sanity-check hotplugged 
  CPU id (Igor)

**

As discussed in the KVM call, we will go ahead with cpu_add for 
s390x to get cpu hotplug functionality in s390x now, until
architectures that require a more robust hotplug interface
settle on a design.

To configure a guest with 2 CPUs online at 
boot and 4 maximum:

qemu -smp 2,maxcpus=4

Or, when using libvirt:
  
...
4
...
   


To subsequently hotplug a CPU:

Issue 'cpu-add ' from qemu monitor, or use virsh setvcpus --count  
, where  is the total number of desired guest CPUs.

At this point, the guest must bring the CPU online for use -- This can be 
achieved via "echo 1 > /sys/devices/system/cpu/cpuX/online" or via a management 
tool like cpuplugd.

This patch set is based on work previously done by Jason Herne.

Matthew Rosato (6):
  s390x/cpu: Cleanup init in preparation for hotplug
  s390x/cpu: Set initial CPU state in common routine
  s390x/cpu: Move some CPU initialization into realize
  s390x/cpu: Add CPU property links
  s390/virtio-ccw: Add hotplug handler
  s390x/cpu: Allow hotplug of CPUs

 hw/s390x/s390-virtio-ccw.c | 43 ++-
 hw/s390x/s390-virtio.c | 36 -
 hw/s390x/s390-virtio.h |  2 +-
 target-s390x/cpu.c | 56 +++---
 target-s390x/cpu.h |  1 +
 5 files changed, 117 insertions(+), 21 deletions(-)

-- 
1.9.1




[Qemu-devel] [PATCH 11/13] intel_iommu: add IR translation faults defines

2016-02-18 Thread Peter Xu
Adding translation fault definitions for interrupt remapping. Please
refer to VT-d spec section 7.1.

Signed-off-by: Peter Xu 
---
 hw/i386/intel_iommu_internal.h | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/hw/i386/intel_iommu_internal.h b/hw/i386/intel_iommu_internal.h
index 309833f..c66cb83 100644
--- a/hw/i386/intel_iommu_internal.h
+++ b/hw/i386/intel_iommu_internal.h
@@ -271,6 +271,21 @@ typedef enum VTDFaultReason {
  * context-entry.
  */
 VTD_FR_CONTEXT_ENTRY_TT,
+
+/*
+ * Interrupt remapping transition faults
+ */
+VTD_FR_IR_REQ_RSVD = 0x20, /* One or more IR request resved
+* fields set */
+VTD_FR_IR_INDEX_OVER = 0x21, /* Index value greater than max */
+VTD_FR_IR_ENTRY_P = 0x22,/* Present (P) not set in IRTE */
+VTD_FR_IR_ROOT_INVAL = 0x23, /* IR Root table invalid */
+VTD_FR_IR_IRTE_RSVD = 0x24,  /* IRTE Rsvd field non-zero with
+  * Present flag set */
+VTD_FR_IR_REQ_COMPAT = 0x25, /* Encountered compatible IR
+  * request while disabled */
+VTD_FR_IR_SID_ERR = 0x26,   /* Invalid Source-ID */
+
 /* This is not a normal fault reason. We use this to indicate some faults
  * that are not referenced by the VT-d specification.
  * Fault event with such reason should not be recorded.
-- 
2.4.3




[Qemu-devel] [PATCH 10/13] ioapic-common: add iommu for IOAPICCommonState

2016-02-18 Thread Peter Xu
When IR is enabled for IOMMU, each IOAPIC will belong to a specific
intel IOMMU. This pointer will store the owner of current IOAPIC, which
is always the default IOMMU device.

Signed-off-by: Peter Xu 
---
 hw/intc/ioapic_common.c   | 2 ++
 include/hw/i386/ioapic_internal.h | 3 +++
 2 files changed, 5 insertions(+)

diff --git a/hw/intc/ioapic_common.c b/hw/intc/ioapic_common.c
index 0a48de2..2c25aaa 100644
--- a/hw/intc/ioapic_common.c
+++ b/hw/intc/ioapic_common.c
@@ -136,6 +136,8 @@ static void ioapic_common_realize(DeviceState *dev, Error 
**errp)
 
 sysbus_init_mmio(SYS_BUS_DEVICE(s), >io_memory);
 ioapic_no++;
+
+s->iommu = vtd_iommu_get();
 }
 
 static const VMStateDescription vmstate_ioapic_common = {
diff --git a/include/hw/i386/ioapic_internal.h 
b/include/hw/i386/ioapic_internal.h
index 797ed47..41fc282 100644
--- a/include/hw/i386/ioapic_internal.h
+++ b/include/hw/i386/ioapic_internal.h
@@ -25,6 +25,7 @@
 #include "hw/hw.h"
 #include "exec/memory.h"
 #include "hw/sysbus.h"
+#include "hw/i386/intel_iommu.h"
 
 #define MAX_IOAPICS 1
 
@@ -101,6 +102,8 @@ struct IOAPICCommonState {
 uint8_t ioregsel;
 uint32_t irr;
 uint64_t ioredtbl[IOAPIC_NUM_PINS];
+/* IOMMU pointer that this IOAPIC belongs. */
+IntelIOMMUState *iommu;
 };
 
 void ioapic_reset_common(DeviceState *dev);
-- 
2.4.3




[Qemu-devel] [PATCH 09/13] intel_iommu: provide helper function vtd_get_iommu

2016-02-18 Thread Peter Xu
Moves acpi_get_iommu() under VT-d to make it a public function.

Signed-off-by: Peter Xu 
---
 hw/i386/acpi-build.c  | 17 ++---
 hw/i386/intel_iommu.c | 13 +
 include/hw/i386/intel_iommu.h |  2 ++
 3 files changed, 17 insertions(+), 15 deletions(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 1cefe43..3cc7886 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -2491,19 +2491,6 @@ build_mcfg_q35(GArray *table_data, GArray *linker, 
AcpiMcfgInfo *info)
 build_header(linker, table_data, (void *)mcfg, sig, len, 1, NULL, NULL);
 }
 
-static IntelIOMMUState *acpi_get_iommu(void)
-{
-bool ambiguous = false;
-Object *intel_iommu = NULL;
-
-intel_iommu = object_resolve_path_type("", TYPE_INTEL_IOMMU_DEVICE,
- );
-if (ambiguous)
-intel_iommu = NULL;
-
-return (IntelIOMMUState *)intel_iommu;
-}
-
 static void
 build_dmar_q35(GArray *table_data, GArray *linker)
 {
@@ -2512,7 +2499,7 @@ build_dmar_q35(GArray *table_data, GArray *linker)
 AcpiTableDmar *dmar;
 AcpiDmarHardwareUnit *drhd;
 uint8_t dmar_flags = 0;
-IntelIOMMUState *intel_iommu = acpi_get_iommu();
+IntelIOMMUState *intel_iommu = vtd_iommu_get();
 AcpiDmarDeviceScope *scope = NULL;
 /* Root complex IOAPIC use one path[0] only */
 uint16_t scope_size = sizeof(*scope) + sizeof(uint16_t);
@@ -2613,7 +2600,7 @@ static bool acpi_get_mcfg(AcpiMcfgInfo *mcfg)
 
 static bool acpi_has_iommu(void)
 {
-return !!acpi_get_iommu();
+return !!vtd_iommu_get();
 }
 
 static bool acpi_has_nvdimm(void)
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index f1cb574..a9cbd7d 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -2000,6 +2000,19 @@ VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, 
PCIBus *bus, int devfn)
 return vtd_dev_as;
 }
 
+IntelIOMMUState *vtd_iommu_get(void)
+{
+bool ambiguous = false;
+Object *intel_iommu = NULL;
+
+intel_iommu = object_resolve_path_type("", TYPE_INTEL_IOMMU_DEVICE,
+ );
+if (ambiguous)
+intel_iommu = NULL;
+
+return (IntelIOMMUState *)intel_iommu;
+}
+
 /* Do the initialization. It will also be called when reset, so pay
  * attention when adding new initialization stuff.
  */
diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
index 89781b4..bb94fbd 100644
--- a/include/hw/i386/intel_iommu.h
+++ b/include/hw/i386/intel_iommu.h
@@ -197,5 +197,7 @@ struct IntelIOMMUState {
  * create a new one if none exists
  */
 VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus, int devfn);
+/* Get default IOMMU object */
+IntelIOMMUState *vtd_iommu_get(void);
 
 #endif
-- 
2.4.3




[Qemu-devel] [PATCH 08/13] intel_iommu: define several structs for IOMMU IR

2016-02-18 Thread Peter Xu
Several data structs are defined to better support the rest of the
patches: IRTE to parse remapping table entries, and IOAPIC/MSI related
structure bits to parse interrupt entries to be filled in by guest
kernel.

Signed-off-by: Peter Xu 
---
 include/hw/i386/intel_iommu.h | 60 +++
 1 file changed, 60 insertions(+)

diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
index 0107488..89781b4 100644
--- a/include/hw/i386/intel_iommu.h
+++ b/include/hw/i386/intel_iommu.h
@@ -52,6 +52,9 @@ typedef struct IntelIOMMUState IntelIOMMUState;
 typedef struct VTDAddressSpace VTDAddressSpace;
 typedef struct VTDIOTLBEntry VTDIOTLBEntry;
 typedef struct VTDBus VTDBus;
+typedef union VTD_IRTE VTD_IRTE;
+typedef union VTD_IR_IOAPICEntry VTD_IR_IOAPICEntry;
+typedef union VTD_IR_MSIAddress VTD_IR_MSIAddress;
 
 /* Context-Entry */
 struct VTDContextEntry {
@@ -90,6 +93,63 @@ struct VTDIOTLBEntry {
 bool write_flags;
 };
 
+/* Interrupt Remapping Table Entry Definition */
+union VTD_IRTE {
+struct {
+uint8_t present:1;  /* Whether entry present/available */
+uint8_t fault_disable:1;/* Fault Processing Disable */
+uint8_t dest_mode:1;/* Destination Mode */
+uint8_t redir_hint:1;   /* Redirection Hint */
+uint8_t trigger_mode:1; /* Trigger Mode */
+uint8_t delivery_mode:3;/* Delivery Mode */
+uint8_t __avail:4;  /* Available spaces for software */
+uint8_t __reserved_0:3; /* Reserved 0 */
+uint8_t irte_mode:1;/* IRTE Mode */
+uint8_t vector:8;   /* Interrupt Vector */
+uint8_t __reserved_1:8; /* Reserved 1 */
+uint32_t dest_id:32;/* Destination ID */
+uint16_t source_id:16;  /* Source-ID */
+uint8_t sid_q:2;/* Source-ID Qualifier */
+uint8_t sid_vtype:2;/* Source-ID Validation Type */
+uint64_t __reserved_2:44;   /* Reserved 2 */
+} QEMU_PACKED;
+uint64_t data[2];
+};
+
+/* Programming format for IOAPIC table entries */
+union VTD_IR_IOAPICEntry {
+struct {
+uint8_t vector:8;   /* Vector */
+uint8_t __zeros:3;  /* Reserved (all zero) */
+uint8_t index_h:1;  /* Interrupt Index bit 15 */
+uint8_t status:1;   /* Deliver Status */
+uint8_t polarity:1; /* Interrupt Polarity */
+uint8_t remote_irr:1;   /* Remote IRR */
+uint8_t trigger_mode:1; /* Trigger Mode */
+uint8_t mask:1; /* Mask */
+uint32_t __reserved:31; /* Reserved (should all zero) */
+uint8_t int_mode:1; /* Interrupt Format */
+uint16_t index_l:15;/* Interrupt Index bits 14-0 */
+} QEMU_PACKED;
+uint64_t data;
+};
+
+/* Programming format for MSI/MSI-X addresses */
+union VTD_IR_MSIAddress {
+struct {
+uint8_t __not_care:2;
+uint8_t index_h:1;  /* Interrupt index bit 15 */
+uint8_t sub_valid:1;/* SHV: Sub-Handle Valid bit */
+uint8_t int_mode:1; /* Interrupt format */
+uint16_t index_l:15;/* Interrupt index bit 14-0 */
+uint16_t __head:12; /* Should always be: 0x0fee */
+} QEMU_PACKED;
+uint32_t data;
+};
+
+/* When IR is enabled, all MSI/MSI-X data bits should be zero */
+#define VTD_IR_MSI_DATA  (0)
+
 /* The iommu (DMAR) device state struct */
 struct IntelIOMMUState {
 SysBusDevice busdev;
-- 
2.4.3




[Qemu-devel] [PATCH 06/13] intel_iommu: define interrupt remap table addr register

2016-02-18 Thread Peter Xu
Defined Interrupt Remap Table Address register to store IR table
pointer. Also, do proper handling on global command register writes to
store table pointer and its size.

One more debug flag "DEBUG_IR" is added for interrupt remapping.

Signed-off-by: Peter Xu 
---
 hw/i386/intel_iommu.c  | 52 +-
 hw/i386/intel_iommu_internal.h |  4 
 include/hw/i386/intel_iommu.h  |  3 +++
 3 files changed, 58 insertions(+), 1 deletion(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 79585d2..62f0fa7 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -29,7 +29,7 @@
 #ifdef DEBUG_INTEL_IOMMU
 enum {
 DEBUG_GENERAL, DEBUG_CSR, DEBUG_INV, DEBUG_MMU, DEBUG_FLOG,
-DEBUG_CACHE,
+DEBUG_CACHE, DEBUG_IR,
 };
 #define VTD_DBGBIT(x)   (1 << DEBUG_##x)
 static int vtd_dbgflags = VTD_DBGBIT(GENERAL) | VTD_DBGBIT(CSR);
@@ -899,6 +899,19 @@ static void vtd_root_table_setup(IntelIOMMUState *s)
 (s->root_extended ? "(extended)" : ""));
 }
 
+static void vtd_interrupt_remap_table_setup(IntelIOMMUState *s)
+{
+uint64_t value = 0;
+value = vtd_get_quad_raw(s, DMAR_IRTA_REG);
+s->intr_size = 1UL << ((value & VTD_IRTA_SIZE_MASK) + 1);
+s->intr_root = value & VTD_IRTA_ADDR_MASK;
+
+/* TODO: invalidate interrupt entry cache */
+
+VTD_DPRINTF(CSR, "int remap table addr 0x%"PRIx64 " size %"PRIu32,
+s->intr_root, s->intr_size);
+}
+
 static void vtd_context_global_invalidate(IntelIOMMUState *s)
 {
 s->context_cache_gen++;
@@ -1137,6 +1150,16 @@ static void vtd_handle_gcmd_srtp(IntelIOMMUState *s)
 vtd_set_clear_mask_long(s, DMAR_GSTS_REG, 0, VTD_GSTS_RTPS);
 }
 
+/* Set Interrupt Remap Table Pointer */
+static void vtd_handle_gcmd_sirtp(IntelIOMMUState *s)
+{
+VTD_DPRINTF(CSR, "set Interrupt Remap Table Pointer");
+
+vtd_interrupt_remap_table_setup(s);
+/* Ok - report back to driver */
+vtd_set_clear_mask_long(s, DMAR_GSTS_REG, 0, VTD_GSTS_IRTPS);
+}
+
 /* Handle Translation Enable/Disable */
 static void vtd_handle_gcmd_te(IntelIOMMUState *s, bool en)
 {
@@ -1176,6 +1199,10 @@ static void vtd_handle_gcmd_write(IntelIOMMUState *s)
 /* Queued Invalidation Enable */
 vtd_handle_gcmd_qie(s, val & VTD_GCMD_QIE);
 }
+if (val & VTD_GCMD_SIRTP) {
+/* Set/update the interrupt remapping root-table pointer */
+vtd_handle_gcmd_sirtp(s);
+}
 }
 
 /* Handle write to Context Command Register */
@@ -1837,6 +1864,23 @@ static void vtd_mem_write(void *opaque, hwaddr addr,
 vtd_update_fsts_ppf(s);
 break;
 
+case DMAR_IRTA_REG:
+VTD_DPRINTF(IR, "DMAR_IRTA_REG write addr 0x%"PRIx64
+", size %d, val 0x%"PRIx64, addr, size, val);
+if (size == 4) {
+vtd_set_long(s, addr, val);
+} else {
+vtd_set_quad(s, addr, val);
+}
+break;
+
+case DMAR_IRTA_REG_HI:
+VTD_DPRINTF(IR, "DMAR_IRTA_REG_HI write addr 0x%"PRIx64
+", size %d, val 0x%"PRIx64, addr, size, val);
+assert(size == 4);
+vtd_set_long(s, addr, val);
+break;
+
 default:
 VTD_DPRINTF(GENERAL, "error: unhandled reg write addr 0x%"PRIx64
 ", size %d, val 0x%"PRIx64, addr, size, val);
@@ -2014,6 +2058,12 @@ static void vtd_init(IntelIOMMUState *s)
 /* Fault Recording Registers, 128-bit */
 vtd_define_quad(s, DMAR_FRCD_REG_0_0, 0, 0, 0);
 vtd_define_quad(s, DMAR_FRCD_REG_0_2, 0, 0, 0x8000ULL);
+
+/*
+ * Interrupt remapping registers, not support extended interrupt
+ * mode for now.
+ */
+vtd_define_quad(s, DMAR_IRTA_REG, 0, 0xf00fULL, 0);
 }
 
 /* Should not reset address_spaces when reset because devices will still use
diff --git a/hw/i386/intel_iommu_internal.h b/hw/i386/intel_iommu_internal.h
index 5b98a11..309833f 100644
--- a/hw/i386/intel_iommu_internal.h
+++ b/hw/i386/intel_iommu_internal.h
@@ -172,6 +172,10 @@
 #define VTD_RTADDR_RTT  (1ULL << 11)
 #define VTD_RTADDR_ADDR_MASK(VTD_HAW_MASK ^ 0xfffULL)
 
+/* IRTA_REG */
+#define VTD_IRTA_ADDR_MASK  (VTD_HAW_MASK ^ 0xfffULL)
+#define VTD_IRTA_SIZE_MASK  (0xfULL)
+
 /* ECAP_REG */
 /* (offset >> 4) << 8 */
 #define VTD_ECAP_IRO(DMAR_IOTLB_REG_OFFSET << 4)
diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
index 83e5a1e..0107488 100644
--- a/include/hw/i386/intel_iommu.h
+++ b/include/hw/i386/intel_iommu.h
@@ -128,6 +128,9 @@ struct IntelIOMMUState {
 
 /* interrupt remapping */
 bool intr_supported;/* Whether IR is supported */
+bool intr_enabled;  /* Whether guest enabled IR */
+dma_addr_t intr_root;   /* Interrupt remapping table pointer */
+uint32_t intr_size; /* Number of IR table entries */
 };
 
 /* Find the VTD Address space associated with 

[Qemu-devel] [PATCH 05/13] acpi: add DMAR scope definition for root IOAPIC

2016-02-18 Thread Peter Xu
To enable interrupt remapping for intel IOMMU device, each IOAPIC device
in the system reported via ACPI MADT must be explicitly enumerated under
one specific remapping hardware unit. This patch adds the root-complex
IOAPIC into the default DMAR device.

Please refer to VT-d spec 8.3.1.1 for more information.

Signed-off-by: Peter Xu 
---
 hw/i386/acpi-build.c| 23 +--
 include/hw/acpi/acpi-defs.h | 15 +++
 2 files changed, 36 insertions(+), 2 deletions(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index d9e4f91..1cefe43 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -76,6 +76,9 @@
 #define ACPI_BUILD_DPRINTF(fmt, ...)
 #endif
 
+/* Default IOAPIC ID */
+#define ACPI_BUILD_IOAPIC_ID 0x0
+
 typedef struct AcpiCpuInfo {
 DECLARE_BITMAP(found_cpus, ACPI_CPU_HOTPLUG_ID_LIMIT);
 } AcpiCpuInfo;
@@ -392,7 +395,6 @@ build_madt(GArray *table_data, GArray *linker, AcpiCpuInfo 
*cpu)
 io_apic = acpi_data_push(table_data, sizeof *io_apic);
 io_apic->type = ACPI_APIC_IO;
 io_apic->length = sizeof(*io_apic);
-#define ACPI_BUILD_IOAPIC_ID 0x0
 io_apic->io_apic_id = ACPI_BUILD_IOAPIC_ID;
 io_apic->address = cpu_to_le32(IO_APIC_DEFAULT_ADDRESS);
 io_apic->interrupt = cpu_to_le32(0);
@@ -2511,6 +2513,9 @@ build_dmar_q35(GArray *table_data, GArray *linker)
 AcpiDmarHardwareUnit *drhd;
 uint8_t dmar_flags = 0;
 IntelIOMMUState *intel_iommu = acpi_get_iommu();
+AcpiDmarDeviceScope *scope = NULL;
+/* Root complex IOAPIC use one path[0] only */
+uint16_t scope_size = sizeof(*scope) + sizeof(uint16_t);
 
 assert(intel_iommu);
 
@@ -2526,11 +2531,25 @@ build_dmar_q35(GArray *table_data, GArray *linker)
 /* DMAR Remapping Hardware Unit Definition structure */
 drhd = acpi_data_push(table_data, sizeof(*drhd));
 drhd->type = cpu_to_le16(ACPI_DMAR_TYPE_HARDWARE_UNIT);
-drhd->length = cpu_to_le16(sizeof(*drhd));   /* No device scope now */
+drhd->length = cpu_to_le16(sizeof(*drhd) + scope_size);
 drhd->flags = ACPI_DMAR_INCLUDE_PCI_ALL;
 drhd->pci_segment = cpu_to_le16(0);
 drhd->address = cpu_to_le64(Q35_HOST_BRIDGE_IOMMU_ADDR);
 
+/* Scope definition for the root-complex IOAPIC */
+scope = acpi_data_push(table_data, scope_size);
+scope->entry_type = cpu_to_le16(ACPI_DMAR_DEV_SCOPE_TYPE_IOAPIC);
+scope->length = scope_size;
+/*
+ * An arbitary but unique bus number, to be used to generate
+ * source ID for IOAPIC device in BDF format.
+ */
+#define ACPI_IOAPIC_BUS_IR (0xf0)
+#define ACPI_IOAPIC_DEVFN_IR   (0x00)
+scope->enumeration_id = cpu_to_le16(ACPI_BUILD_IOAPIC_ID);
+scope->bus = cpu_to_le16(ACPI_IOAPIC_BUS_IR);
+scope->path[0] = cpu_to_le16(ACPI_IOAPIC_DEVFN_IR);
+
 build_header(linker, table_data, (void *)(table_data->data + dmar_start),
  "DMAR", table_data->len - dmar_start, 1, NULL, NULL);
 }
diff --git a/include/hw/acpi/acpi-defs.h b/include/hw/acpi/acpi-defs.h
index c7a03d4..2430af6 100644
--- a/include/hw/acpi/acpi-defs.h
+++ b/include/hw/acpi/acpi-defs.h
@@ -556,6 +556,20 @@ enum {
 /*
  * Sub-structures for DMAR
  */
+
+#define ACPI_DMAR_DEV_SCOPE_TYPE_IOAPIC (0x03)
+
+/* Device scope structure for DRHD. */
+struct AcpiDmarDeviceScope {
+uint8_t entry_type;
+uint8_t length;
+uint16_t reserved;
+uint8_t enumeration_id;
+uint8_t bus;
+uint16_t path[0];   /* list of dev:func pairs */
+} QEMU_PACKED;
+typedef struct AcpiDmarDeviceScope AcpiDmarDeviceScope;
+
 /* Type 0: Hardware Unit Definition */
 struct AcpiDmarHardwareUnit {
 uint16_t type;
@@ -564,6 +578,7 @@ struct AcpiDmarHardwareUnit {
 uint8_t reserved;
 uint16_t pci_segment;   /* The PCI Segment associated with this unit */
 uint64_t address;   /* Base address of remapping hardware register-set */
+AcpiDmarDeviceScope scope[0];
 } QEMU_PACKED;
 typedef struct AcpiDmarHardwareUnit AcpiDmarHardwareUnit;
 
-- 
2.4.3




[Qemu-devel] [PATCH 13/13] intel_iommu: Add support for PCI MSI remap

2016-02-18 Thread Peter Xu
This patch enables interrupt remapping for PCI devices.

To play the trick, one memory region "iommu_ir" is added as child region
of the original iommu memory region, covering range 0xfeeX (which is
the address range for APIC). All the writes to this range will be taken
as MSI, and translation is carried out when IR is enabled.

The translation process tried to make full use of the helper function
from IOAPIC patch. Several more bits are added to VTDIrq as MSI specific
fields.

Signed-off-by: Peter Xu 
---
 hw/i386/intel_iommu.c  | 172 -
 hw/i386/intel_iommu_internal.h |   2 +
 include/hw/i386/intel_iommu.h  |  34 
 include/hw/pci/msi.h   |   4 +
 4 files changed, 211 insertions(+), 1 deletion(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 089dac9..34aa1fa 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -43,6 +43,10 @@ static int vtd_dbgflags = VTD_DBGBIT(GENERAL) | 
VTD_DBGBIT(CSR);
 #define VTD_DPRINTF(what, fmt, ...) do {} while (0)
 #endif
 
+static int vtd_interrupt_remap_msi(IntelIOMMUState *iommu,
+   MSIMessage *origin,
+   MSIMessage *translated);
+
 static void vtd_define_quad(IntelIOMMUState *s, hwaddr addr, uint64_t val,
 uint64_t wmask, uint64_t w1cmask)
 {
@@ -1963,12 +1967,70 @@ static const MemoryRegionOps vtd_mem_ops = {
 },
 };
 
+static uint64_t vtd_mem_ir_read(void *opaque, hwaddr addr, unsigned size)
+{
+uint64_t data = 0;
+
+addr += VTD_INTERRUPT_ADDR_FIRST;
+
+VTD_DPRINTF(IR, "read mem_ir addr 0x%"PRIx64 " size %u",
+addr, size);
+
+if (dma_memory_read(_space_memory, addr, , size)) {
+VTD_DPRINTF(GENERAL, "error: fail to access 0x%"PRIx64, addr);
+return (uint32_t) -1;
+}
+
+return data;
+}
+
+static void vtd_mem_ir_write(void *opaque, hwaddr addr,
+ uint64_t value, unsigned size)
+{
+int ret = 0;
+MSIMessage from = {0}, to = {0};
+
+from.address = (uint64_t) addr + VTD_INTERRUPT_ADDR_FIRST;
+from.data = (uint32_t) value;
+
+ret = vtd_interrupt_remap_msi(opaque, , );
+if (ret) {
+/* TODO: report error */
+VTD_DPRINTF(GENERAL, "int remap fail for addr 0x%"PRIx64
+" data 0x%"PRIx32, from.address, from.data);
+/* Drop this interrupt */
+return;
+}
+
+VTD_DPRINTF(IR, "delivering MSI 0x%"PRIx64":0x%"PRIx32,
+to.address, to.data);
+
+if (dma_memory_write(_space_memory, to.address,
+ , size)) {
+VTD_DPRINTF(GENERAL, "error: fail to write 0x%"PRIx64
+" value 0x%"PRIx32, to.address, to.data);
+}
+}
+
+static const MemoryRegionOps vtd_mem_ir_ops = {
+.read = vtd_mem_ir_read,
+.write = vtd_mem_ir_write,
+.endianness = DEVICE_LITTLE_ENDIAN,
+.impl = {
+.min_access_size = 4,
+.max_access_size = 4,
+},
+.valid = {
+.min_access_size = 4,
+.max_access_size = 4,
+},
+};
+
 static Property vtd_properties[] = {
 DEFINE_PROP_UINT32("version", IntelIOMMUState, version, 0),
 DEFINE_PROP_END_OF_LIST(),
 };
 
-
 VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus, int devfn)
 {
 uintptr_t key = (uintptr_t)bus;
@@ -1994,6 +2056,11 @@ VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, 
PCIBus *bus, int devfn)
 vtd_dev_as->context_cache_entry.context_cache_gen = 0;
 memory_region_init_iommu(_dev_as->iommu, OBJECT(s),
  >iommu_ops, "intel_iommu", UINT64_MAX);
+memory_region_init_io(_dev_as->iommu_ir, OBJECT(s),
+  _mem_ir_ops, s, "intel_iommu_ir",
+  VTD_INTERRUPT_ADDR_SIZE);
+memory_region_add_subregion(_dev_as->iommu, 
VTD_INTERRUPT_ADDR_FIRST,
+_dev_as->iommu_ir);
 address_space_init(_dev_as->as,
_dev_as->iommu, "intel_iommu");
 }
@@ -2074,6 +2141,7 @@ static int vtd_remap_irq_get(IntelIOMMUState *iommu, 
uint16_t index, VTDIrq *irq
 irq->dest = (irte.dest_id & VTD_IR_APIC_DEST_MASK) >> \
 VTD_IR_APIC_DEST_SHIFT;
 irq->dest_mode = irte.dest_mode;
+irq->redir_hint = irte.redir_hint;
 
 VTD_DPRINTF(IR, "remapping interrupt index %d: trig:%u,vec:%u,"
 "deliver:%u,dest:%u,dest_mode:%u", index,
@@ -2140,6 +2208,108 @@ int vtd_interrupt_remap_ioapic(IntelIOMMUState *iommu,
 return 0;
 }
 
+/* Generate one MSI message from VTDIrq info */
+static void vtd_generate_msi_message(VTDIrq *irq, MSIMessage *msg_out)
+{
+VTD_MSIMessage msg;
+
+bzero(, sizeof(msg));
+
+/* Generate address bits */
+msg.dest_mode = irq->dest_mode;
+msg.redir_hint = irq->redir_hint;
+msg.dest = irq->dest;
+msg.__addr_head = 0xfee;
+

[Qemu-devel] [PATCH 04/13] intel_iommu: set IR bit for ECAP register

2016-02-18 Thread Peter Xu
Enable IR in IOMMU Extended Capability register.

Signed-off-by: Peter Xu 
---
 hw/i386/intel_iommu.c  | 4 
 hw/i386/intel_iommu_internal.h | 2 ++
 2 files changed, 6 insertions(+)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 4b0558e..79585d2 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -1961,6 +1961,10 @@ static void vtd_init(IntelIOMMUState *s)
  VTD_CAP_SAGAW | VTD_CAP_MAMV | VTD_CAP_PSI | VTD_CAP_SLLPS;
 s->ecap = VTD_ECAP_QI | VTD_ECAP_IRO;
 
+if (s->intr_supported) {
+s->ecap |= VTD_ECAP_IR;
+}
+
 vtd_reset_context_cache(s);
 vtd_reset_iotlb(s);
 
diff --git a/hw/i386/intel_iommu_internal.h b/hw/i386/intel_iommu_internal.h
index b648e69..5b98a11 100644
--- a/hw/i386/intel_iommu_internal.h
+++ b/hw/i386/intel_iommu_internal.h
@@ -176,6 +176,8 @@
 /* (offset >> 4) << 8 */
 #define VTD_ECAP_IRO(DMAR_IOTLB_REG_OFFSET << 4)
 #define VTD_ECAP_QI (1ULL << 1)
+/* Interrupt Remapping support */
+#define VTD_ECAP_IR (1ULL << 3)
 
 /* CAP_REG */
 /* (offset >> 4) << 24 */
-- 
2.4.3




[Qemu-devel] [PATCH 03/13] intel_iommu: allow queued invalidation for IR

2016-02-18 Thread Peter Xu
Queued invalidation is required for IR. This patch add basic support for
interrupt cache invalidate requests. Since we currently have no IR cache
implemented yet, we can just skip all interrupt cache invalidation
requests for now.

Signed-off-by: Peter Xu 
---
 hw/i386/intel_iommu.c  | 9 +
 hw/i386/intel_iommu_internal.h | 2 ++
 2 files changed, 11 insertions(+)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 347718f..4b0558e 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -1400,6 +1400,15 @@ static bool vtd_process_inv_desc(IntelIOMMUState *s)
 }
 break;
 
+case VTD_INV_DESC_IEC:
+VTD_DPRINTF(INV, "Interrupt Entry Cache Invalidation "
+"not implemented yet");
+/*
+ * Since currently we do not cache interrupt entries, we can
+ * just mark this descriptor as "good" and move on.
+ */
+break;
+
 default:
 VTD_DPRINTF(GENERAL, "error: unkonw Invalidation Descriptor type "
 "hi 0x%"PRIx64 " lo 0x%"PRIx64 " type %"PRIu8,
diff --git a/hw/i386/intel_iommu_internal.h b/hw/i386/intel_iommu_internal.h
index e5f514c..b648e69 100644
--- a/hw/i386/intel_iommu_internal.h
+++ b/hw/i386/intel_iommu_internal.h
@@ -286,6 +286,8 @@ typedef struct VTDInvDesc VTDInvDesc;
 #define VTD_INV_DESC_TYPE   0xf
 #define VTD_INV_DESC_CC 0x1 /* Context-cache Invalidate Desc */
 #define VTD_INV_DESC_IOTLB  0x2
+#define VTD_INV_DESC_IEC0x4 /* Interrupt Entry Cache
+   Invalidate Descriptor */
 #define VTD_INV_DESC_WAIT   0x5 /* Invalidation Wait Descriptor */
 #define VTD_INV_DESC_NONE   0   /* Not an Invalidate Descriptor */
 
-- 
2.4.3




[Qemu-devel] [PATCH 12/13] intel_iommu: ioapic: IR support for emulated IOAPIC

2016-02-18 Thread Peter Xu
This patch add the first device support for Intel IOMMU interrupt
remapping, which is the default IOAPIC device created alongside with Q35
platform. This will be the first step along the way to fully enable
IOMMU IR on x86 systems.

Currently, only emluated IOAPIC is supported. This requires
"kernel_irqchip=off" parameter specified.

Originally, IOAPIC has its own table to maintain IRQ information. When
IOMMU IR is enabled, guest OS will fill in the real IRQ data into IRTE
entries of IOMMU IR root table, while in IOAPIC only the index
information is maintained (with several legacy bits which might not be
covered by VT-d IR). If so, we need to talk to IOMMU to get the real IRQ
information to deliver.

Please refer to VT-d spec 5.1.5.1 for more information.

Signed-off-by: Peter Xu 
---
 hw/i386/intel_iommu.c | 127 ++
 hw/intc/ioapic.c  |  36 +---
 include/hw/i386/intel_iommu.h |  17 ++
 3 files changed, 173 insertions(+), 7 deletions(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index a9cbd7d..089dac9 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -2013,6 +2013,133 @@ IntelIOMMUState *vtd_iommu_get(void)
 return (IntelIOMMUState *)intel_iommu;
 }
 
+/* Read IRTE entry with specific index */
+static int vtd_irte_get(IntelIOMMUState *iommu, uint16_t index,
+VTD_IRTE *entry)
+{
+dma_addr_t addr = 0x00;
+
+addr = iommu->intr_root + index * sizeof(*entry);
+if (dma_memory_read(_space_memory, addr, entry,
+sizeof(*entry))) {
+VTD_DPRINTF(GENERAL, "error: fail to access IR root at 0x%"PRIx64
+" + %"PRIu16, iommu->intr_root, index);
+return -VTD_FR_IR_ROOT_INVAL;
+}
+
+if (!entry->present) {
+VTD_DPRINTF(GENERAL, "error: present flag not set in IRTE"
+" entry index %u value 0x%"PRIx64 " 0x%"PRIx64,
+index, le64_to_cpu(entry->data[1]),
+le64_to_cpu(entry->data[0]));
+return -VTD_FR_IR_ENTRY_P;
+}
+
+if (entry->__reserved_0 || entry->__reserved_1 || \
+entry->__reserved_2) {
+VTD_DPRINTF(GENERAL, "error: IRTE entry index %"PRIu16
+" reserved fields non-zero: 0x%"PRIx64 " 0x%"PRIx64,
+index, le64_to_cpu(entry->data[1]),
+le64_to_cpu(entry->data[0]));
+return -VTD_FR_IR_IRTE_RSVD;
+}
+
+/*
+ * TODO: Check Source-ID corresponds to SVT (Source Validation
+ * Type) bits
+ */
+
+return 0;
+}
+
+/* Fetch IRQ information of specific IR index */
+static int vtd_remap_irq_get(IntelIOMMUState *iommu, uint16_t index, VTDIrq 
*irq)
+{
+VTD_IRTE irte;
+int ret = 0;
+
+bzero(, sizeof(irte));
+
+ret = vtd_irte_get(iommu, index, );
+if (ret) {
+return ret;
+}
+
+irq->trigger_mode = irte.trigger_mode;
+irq->vector = irte.vector;
+irq->delivery_mode = irte.delivery_mode;
+/* Not support EIM yet: please refer to vt-d 9.10 DST bits */
+#define  VTD_IR_APIC_DEST_MASK (0xff00ULL)
+#define  VTD_IR_APIC_DEST_SHIFT(8)
+irq->dest = (irte.dest_id & VTD_IR_APIC_DEST_MASK) >> \
+VTD_IR_APIC_DEST_SHIFT;
+irq->dest_mode = irte.dest_mode;
+
+VTD_DPRINTF(IR, "remapping interrupt index %d: trig:%u,vec:%u,"
+"deliver:%u,dest:%u,dest_mode:%u", index,
+irq->trigger_mode, irq->vector, irq->delivery_mode,
+irq->dest, irq->dest_mode);
+
+return 0;
+}
+
+/* Interrupt remapping for IOAPIC IRQ entry */
+int vtd_interrupt_remap_ioapic(IntelIOMMUState *iommu,
+   uint64_t *ioapic_entry, VTDIrq *irq)
+{
+int ret = 0;
+uint16_t index = 0;
+VTD_IR_IOAPICEntry *entry = (VTD_IR_IOAPICEntry *)ioapic_entry;
+
+assert(iommu && entry && irq);
+assert(iommu->intr_enabled);
+
+/* TODO: Currently we still do not support compatible mode */
+if (entry->int_mode != VTD_IR_INT_FORMAT_REMAP) {
+VTD_DPRINTF(GENERAL, "error: trying to remap IOAPIC entry"
+" with compatible format: 0x%"PRIx64,
+le64_to_cpu(entry->data));
+return -VTD_FR_IR_REQ_COMPAT;
+}
+
+if (entry->__zeros || entry->__reserved) {
+VTD_DPRINTF(GENERAL, "error: reserved not empty for IOAPIC"
+"entry 0x%"PRIx64, le64_to_cpu(entry->data));
+return -VTD_FR_IR_REQ_RSVD;
+}
+
+index = entry->index_h << 15 | entry->index_l;
+ret = vtd_remap_irq_get(iommu, index, irq);
+if (ret) {
+return ret;
+}
+
+/* Trigger mode should be aligned between IOAPIC entry and IRTE
+ * entry */
+if (irq->trigger_mode != entry->trigger_mode) {
+/* This is possibly guest OS bug?! */
+VTD_DPRINTF(GENERAL, "error: IOAPIC trigger mode inconsistent: "
+"0x%"PRIx64 " 

[Qemu-devel] [PATCH 07/13] intel_iommu: handle interrupt remap enable

2016-02-18 Thread Peter Xu
Handle writting to IRE bit in global command register.

Signed-off-by: Peter Xu 
---
 hw/i386/intel_iommu.c | 20 
 1 file changed, 20 insertions(+)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 62f0fa7..f1cb574 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -1179,6 +1179,22 @@ static void vtd_handle_gcmd_te(IntelIOMMUState *s, bool 
en)
 }
 }
 
+/* Handle Interrupt Remap Enable/Disable */
+static void vtd_handle_gcmd_ire(IntelIOMMUState *s, bool en)
+{
+VTD_DPRINTF(CSR, "Interrupt Remap Enable %s", (en ? "on" : "off"));
+
+if (en) {
+s->intr_enabled = true;
+/* Ok - report back to driver */
+vtd_set_clear_mask_long(s, DMAR_GSTS_REG, 0, VTD_GSTS_IRES);
+} else {
+s->intr_enabled = false;
+/* Ok - report back to driver */
+vtd_set_clear_mask_long(s, DMAR_GSTS_REG, VTD_GSTS_IRES, 0);
+}
+}
+
 /* Handle write to Global Command Register */
 static void vtd_handle_gcmd_write(IntelIOMMUState *s)
 {
@@ -1203,6 +1219,10 @@ static void vtd_handle_gcmd_write(IntelIOMMUState *s)
 /* Set/update the interrupt remapping root-table pointer */
 vtd_handle_gcmd_sirtp(s);
 }
+if (changed & VTD_GCMD_IRE) {
+/* Interrupt remap enable/disable */
+vtd_handle_gcmd_ire(s, val & VTD_GCMD_IRE);
+}
 }
 
 /* Handle write to Context Command Register */
-- 
2.4.3




[Qemu-devel] [PATCH 02/13] acpi: enable INTR for DMAR report structure

2016-02-18 Thread Peter Xu
In ACPI DMA remapping report structure, enable INTR flag when specified.

Signed-off-by: Peter Xu 
---
 hw/i386/acpi-build.c  | 31 ---
 include/hw/i386/intel_iommu.h |  2 ++
 2 files changed, 26 insertions(+), 7 deletions(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 4554eb8..d9e4f91 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -2489,6 +2489,19 @@ build_mcfg_q35(GArray *table_data, GArray *linker, 
AcpiMcfgInfo *info)
 build_header(linker, table_data, (void *)mcfg, sig, len, 1, NULL, NULL);
 }
 
+static IntelIOMMUState *acpi_get_iommu(void)
+{
+bool ambiguous = false;
+Object *intel_iommu = NULL;
+
+intel_iommu = object_resolve_path_type("", TYPE_INTEL_IOMMU_DEVICE,
+ );
+if (ambiguous)
+intel_iommu = NULL;
+
+return (IntelIOMMUState *)intel_iommu;
+}
+
 static void
 build_dmar_q35(GArray *table_data, GArray *linker)
 {
@@ -2496,10 +2509,19 @@ build_dmar_q35(GArray *table_data, GArray *linker)
 
 AcpiTableDmar *dmar;
 AcpiDmarHardwareUnit *drhd;
+uint8_t dmar_flags = 0;
+IntelIOMMUState *intel_iommu = acpi_get_iommu();
+
+assert(intel_iommu);
+
+if (intel_iommu->intr_supported) {
+/* enable INTR for the IOMMU device */
+dmar_flags |= DMAR_REPORT_F_INTR;
+}
 
 dmar = acpi_data_push(table_data, sizeof(*dmar));
 dmar->host_address_width = VTD_HOST_ADDRESS_WIDTH - 1;
-dmar->flags = 0;/* No intr_remap for now */
+dmar->flags = dmar_flags;
 
 /* DMAR Remapping Hardware Unit Definition structure */
 drhd = acpi_data_push(table_data, sizeof(*drhd));
@@ -2572,12 +2594,7 @@ static bool acpi_get_mcfg(AcpiMcfgInfo *mcfg)
 
 static bool acpi_has_iommu(void)
 {
-bool ambiguous;
-Object *intel_iommu;
-
-intel_iommu = object_resolve_path_type("", TYPE_INTEL_IOMMU_DEVICE,
-   );
-return intel_iommu && !ambiguous;
+return !!acpi_get_iommu();
 }
 
 static bool acpi_has_nvdimm(void)
diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
index 6e52c6b..83e5a1e 100644
--- a/include/hw/i386/intel_iommu.h
+++ b/include/hw/i386/intel_iommu.h
@@ -44,6 +44,8 @@
 #define VTD_HOST_ADDRESS_WIDTH  39
 #define VTD_HAW_MASK((1ULL << VTD_HOST_ADDRESS_WIDTH) - 1)
 
+#define DMAR_REPORT_F_INTR  (1)
+
 typedef struct VTDContextEntry VTDContextEntry;
 typedef struct VTDContextCacheEntry VTDContextCacheEntry;
 typedef struct IntelIOMMUState IntelIOMMUState;
-- 
2.4.3




[Qemu-devel] [PATCH 00/13] IOMMU: Enable interrupt remapping for Intel IOMMU

2016-02-18 Thread Peter Xu
This patchset provide very basic functionalities for interrupt
remapping (IR) support of the emulated Intel IOMMU device.

By default, IR is disabled to be better compatible with current
QEMU. To enable IR, we can using the following command to boot a
IR-supported VM with basic network (still do not support kvm-ioapic,
so we need to specify kernel_irqchip=off here):

$ qemu-system-x86_64 -M q35,iommu=on,int_remap=on,kernel_irqchip=off \
 -enable-kvm -m 1024 -s \
 -monitor telnet::,server,nowait \
 -netdev user,id=user.0,hostfwd=tcp::-:22 \
 -device virtio-net-pci,netdev=user.0 \
 /var/lib/libvirt/images/vm1.qcow2

When guest boots, we can verify whether IR enabled by grepping the
dmesg like:

[root@localhost ~]# journalctl -k | grep "DMAR-IR"
Feb 19 11:21:23 localhost.localdomain kernel: DMAR-IR: IOAPIC id 0 under DRHD 
base  0xfed9 IOMMU 0
Feb 19 11:21:23 localhost.localdomain kernel: DMAR-IR: Enabled IRQ remapping in 
xapic mode

Currently only two devices are supported:

- Emulated IOAPIC device
- PCI Devices

TODO List:

- kvm-ioapic support
- vhost support
- pass through device support
- EIM support
- IR fault reporting
- source-id validation for IRTE
- IRTE cache and better queued invalidation
- migration support (for IOMMU as general?)
- more?

Peter Xu (13):
  q35: add "int-remap" flag to enable intr
  acpi: enable INTR for DMAR report structure
  intel_iommu: allow queued invalidation for IR
  intel_iommu: set IR bit for ECAP register
  acpi: add DMAR scope definition for root IOAPIC
  intel_iommu: define interrupt remap table addr register
  intel_iommu: handle interrupt remap enable
  intel_iommu: define several structs for IOMMU IR
  intel_iommu: provide helper function vtd_get_iommu
  ioapic-common: add iommu for IOAPICCommonState
  intel_iommu: add IR translation faults defines
  intel_iommu: ioapic: IR support for emulated IOAPIC
  intel_iommu: Add support for PCI MSI remap

 hw/core/machine.c |  20 ++
 hw/i386/acpi-build.c  |  41 +++-
 hw/i386/intel_iommu.c | 397 +-
 hw/i386/intel_iommu_internal.h|  25 +++
 hw/intc/ioapic.c  |  36 +++-
 hw/intc/ioapic_common.c   |   2 +
 hw/pci-host/q35.c |   6 +-
 include/hw/acpi/acpi-defs.h   |  15 ++
 include/hw/boards.h   |   1 +
 include/hw/i386/intel_iommu.h | 121 
 include/hw/i386/ioapic_internal.h |   3 +
 include/hw/pci/msi.h  |   4 +
 12 files changed, 651 insertions(+), 20 deletions(-)

-- 
2.4.3




[Qemu-devel] [PATCH 01/13] q35: add "int-remap" flag to enable intr

2016-02-18 Thread Peter Xu
One flag is added to specify whether to enable INTR for emulated
IOMMU. By default, interrupt remapping is not supportted. To enable it,
we should specify something like:

$ qemu-system-x86_64 -M q35,iommu=on,int_remap=on

To be more clear, the following command:

$ qemu-system-x86_64 -M q35,iommu=on

Will enable Intel IOMMU only, without interrupt remapping support.

Signed-off-by: Peter Xu 
---
 hw/core/machine.c | 20 
 hw/pci-host/q35.c |  6 --
 include/hw/boards.h   |  1 +
 include/hw/i386/intel_iommu.h |  3 +++
 4 files changed, 28 insertions(+), 2 deletions(-)

diff --git a/hw/core/machine.c b/hw/core/machine.c
index 6d1a0d8..852cee2 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -298,6 +298,20 @@ static void machine_set_iommu(Object *obj, bool value, 
Error **errp)
 ms->iommu = value;
 }
 
+static bool machine_get_int_remap(Object *obj, Error **errp)
+{
+MachineState *ms = MACHINE(obj);
+
+return ms->int_remap;
+}
+
+static void machine_set_int_remap(Object *obj, bool value, Error **errp)
+{
+MachineState *ms = MACHINE(obj);
+
+ms->int_remap = value;
+}
+
 static void machine_set_suppress_vmdesc(Object *obj, bool value, Error **errp)
 {
 MachineState *ms = MACHINE(obj);
@@ -461,6 +475,12 @@ static void machine_initfn(Object *obj)
 object_property_set_description(obj, "iommu",
 "Set on/off to enable/disable Intel IOMMU 
(VT-d)",
 NULL);
+object_property_add_bool(obj, "int-remap",
+ machine_get_int_remap,
+ machine_set_int_remap, NULL);
+object_property_set_description(obj, "int-remap",
+"Set on/off to enable/disable Intel IOMMU 
INTR",
+NULL);
 object_property_add_bool(obj, "suppress-vmdesc",
  machine_get_suppress_vmdesc,
  machine_set_suppress_vmdesc, NULL);
diff --git a/hw/pci-host/q35.c b/hw/pci-host/q35.c
index 115fb8c..7cd4d87 100644
--- a/hw/pci-host/q35.c
+++ b/hw/pci-host/q35.c
@@ -434,13 +434,14 @@ static AddressSpace *q35_host_dma_iommu(PCIBus *bus, void 
*opaque, int devfn)
 return _as->as;
 }
 
-static void mch_init_dmar(MCHPCIState *mch)
+static void mch_init_dmar(MCHPCIState *mch, bool intr_supported)
 {
 PCIBus *pci_bus = PCI_BUS(qdev_get_parent_bus(DEVICE(mch)));
 
 mch->iommu = INTEL_IOMMU_DEVICE(qdev_create(NULL, 
TYPE_INTEL_IOMMU_DEVICE));
 object_property_add_child(OBJECT(mch), "intel-iommu",
   OBJECT(mch->iommu), NULL);
+mch->iommu->intr_supported = intr_supported;
 qdev_init_nofail(DEVICE(mch->iommu));
 sysbus_mmio_map(SYS_BUS_DEVICE(mch->iommu), 0, Q35_HOST_BRIDGE_IOMMU_ADDR);
 
@@ -507,7 +508,8 @@ static void mch_realize(PCIDevice *d, Error **errp)
 }
 /* Intel IOMMU (VT-d) */
 if (object_property_get_bool(qdev_get_machine(), "iommu", NULL)) {
-mch_init_dmar(mch);
+mch_init_dmar(mch, object_property_get_bool(qdev_get_machine(),
+"int-remap", false));
 }
 }
 
diff --git a/include/hw/boards.h b/include/hw/boards.h
index 0f30959..d488e40 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -127,6 +127,7 @@ struct MachineState {
 bool igd_gfx_passthru;
 char *firmware;
 bool iommu;
+bool int_remap;
 bool suppress_vmdesc;
 
 ram_addr_t ram_size;
diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
index b024ffa..6e52c6b 100644
--- a/include/hw/i386/intel_iommu.h
+++ b/include/hw/i386/intel_iommu.h
@@ -123,6 +123,9 @@ struct IntelIOMMUState {
 MemoryRegionIOMMUOps iommu_ops;
 GHashTable *vtd_as_by_busptr;   /* VTDBus objects indexed by PCIBus* 
reference */
 VTDBus *vtd_as_by_bus_num[VTD_PCI_BUS_MAX]; /* VTDBus objects indexed by 
bus number */
+
+/* interrupt remapping */
+bool intr_supported;/* Whether IR is supported */
 };
 
 /* Find the VTD Address space associated with the given bus pointer,
-- 
2.4.3




[Qemu-devel] [PATCH v2] scripts/kvm/kvm_stat: Fix missing right parantheses and ".format(...)"

2016-02-18 Thread Fam Zheng
They seem to have snuck in when applying Janosch Frank
's previous patch.

Signed-off-by: Fam Zheng 

---
v2: Also fix .format. [Janosch]
---
 scripts/kvm/kvm_stat | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/scripts/kvm/kvm_stat b/scripts/kvm/kvm_stat
index 3cf1181..769d884 100755
--- a/scripts/kvm/kvm_stat
+++ b/scripts/kvm/kvm_stat
@@ -796,11 +796,12 @@ def check_access(options):
 sys.stderr.write("Please enable CONFIG_TRACING in your kernel "
  "when using the option -t (default).\n"
  "If it is enabled, make {0} readable by the "
- "current user.\n")
+ "current user.\n"
+ .format(PATH_DEBUGFS_TRACING))
 if options.tracepoints:
 sys.exit(1)
 
-sys.stderr.write("Falling back to debugfs statistics!\n"
+sys.stderr.write("Falling back to debugfs statistics!\n")
 options.debugfs = True
 sleep(5)
 
-- 
2.4.3




Re: [Qemu-devel] [PATCH] scripts/kvm/kvm_stat: Fix missing right parantheses

2016-02-18 Thread Fam Zheng
On Thu, 02/18 12:42, Janosch Frank wrote:
> On 02/17/2016 07:18 AM, Fam Zheng wrote:
> > Signed-off-by: Fam Zheng 
> > ---
> >  scripts/kvm/kvm_stat | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/scripts/kvm/kvm_stat b/scripts/kvm/kvm_stat
> > index 3cf1181..517fbec 100755
> > --- a/scripts/kvm/kvm_stat
> > +++ b/scripts/kvm/kvm_stat
> > @@ -800,7 +800,7 @@ def check_access(options):
> >  if options.tracepoints:
> >  sys.exit(1)
> > 
> > -sys.stderr.write("Falling back to debugfs statistics!\n"
> > +sys.stderr.write("Falling back to debugfs statistics!\n")
> >  options.debugfs = True
> >  sleep(5)
> > 
> Thanks for fixing that.
> If you also want to fix the missing .format(PATH_DEBUGFS_TRACING) in the
> write a few lines above, go ahead.
> 
> I should have been more careful when reading the pull mail from Paolo.
> The affected lines are not my code and definitely not my v2 which
> included the "exit if -t is passed explicitly" that Paolo added to my
> commit.
> 

Yep, I can do it. Will send v2.

Fam



Re: [Qemu-devel] [PATCH v2 0/6] external backup api

2016-02-18 Thread Fam Zheng
On Thu, 02/18 16:41, Stefan Hajnoczi wrote:
> On Thu, Feb 18, 2016 at 12:11:14PM +, Daniel P. Berrange wrote:
> > On Wed, Feb 17, 2016 at 08:47:11PM +0300, Vladimir Sementsov-Ogievskiy 
> > wrote:
> > > On 16.02.2016 20:09, Stefan Hajnoczi wrote:
> > > >On Wed, Feb 10, 2016 at 10:10:04AM +, Stefan Hajnoczi wrote:
> > > >>On Tue, Feb 09, 2016 at 05:41:50PM +0300, Denis V. Lunev wrote:
> > > >>>On 02/09/2016 05:28 PM, Stefan Hajnoczi wrote:
> > > On Fri, Feb 05, 2016 at 11:28:42AM +0300, Denis V. Lunev wrote:
> > > >On 02/03/2016 11:14 AM, Fam Zheng wrote:
> > > >>On Sat, 01/30 13:56, Vladimir Sementsov-Ogievskiy wrote:
> > > >>>Hi all.
> > > >>>
> > > >>>These series which aims to add external backup api. This is needed 
> > > >>>to allow
> > > >>>backup software use our dirty bitmaps.
> > > >>>
> > > >>>Vmware and Parallels Cloud Server have this feature.
> > > >>What is the advantage of this appraoch over "drive-backup 
> > > >>sync=incremental
> > > >>..."?
> > > >This will allow third-party vendors to backup QEMU VMs into
> > > >their own formats or to the cloud etc.
> > > As an example, there is a third-party backup format called VMA from
> > > Proxmox.  A few years ago I posted a proof-of-concept external backup
> > > tool in Python:
> > > 
> > > https://lists.gnu.org/archive/html/qemu-devel/2013-03/msg01536.html
> > > 
> > > It takes a full backup using drive-backup NBD (plus RAM/device state)
> > > but the same can be done with incremental backups.
> > > 
> > > Does this NBD approach meet your requirements?
> > > 
> > > Stefan
> > > >>>for us we should somehow provide implementation of
> > > >>>calls posted by Vladimir. They are available in Parallels Server
> > > >>>version 6 and should be available in the next QEMU based
> > > >>>release using "Parallels SDK to libvirt" convertor. The problem
> > > >>>for us is that this old approach is used in the other side
> > > >>>of the product - in containers implementation while this
> > > >>>SDK is a universal access tool to both things.
> > > >>Point taken.  I think many other backup applications will expect a
> > > >>similar API, so it's pragmatic to provide something compatible.
> > > >Kevin Wolf and Daniel Berrange proposed an elegant way to avoid the
> > > >concerns about the QMP monitor:
> > > >
> > > >Previously I described incremental backup in "push" mode (already
> > > >supported today with drive-backup).  QEMU connects to a remote NBD
> > > >server and writes out the contents of all dirty blocks:
> > > >
> > > >   QEMU ---Write dirty blocks--> Backup appliance (server)
> > > >
> > > >This doesn't lend itself well to existing backup applications that
> > > >expect to iterate the dirty bitmap manually.
> > > >
> > > >Let's add a "pull" mode where the connection of the NBD connection is
> > > >reversed.  The backup application connects to QEMU's NBD server (image
> > > >fleecing).  The NBD protocol is extended to support the SCSI Get LBA
> > > >Status command for querying block provisioning information.  Now the
> > > >backup application can use Get LBA Status to fetch the dirty block
> > > >information from QEMU.
> > > >
> > > >   QEMU (server) <--Get LBA Status or Read dirty blocks-- Backup 
> > > > appliance
> > > >
> > > >The dirty block information goes over the same NBD connection used to
> > > >read the contents of the dirty blocks.  The QMP monitor is not used to
> > > >transfer dirty block information.
> > > >
> > > >It may be necessary to extend the nbd-server-add command so that a
> > > >bitmap name can be passed.  This bitmap will be used to answer Get LBA
> > > >Status queries instead of using on bdrv_co_get_block_status().  This
> > > >would be necessary if image fleecing (point in time snapshot) is used.
> > > >
> > > >Stefan
> > > 
> > > There are no such commands in nbd spec here:
> > > 
> > > https://github.com/yoe/nbd/blob/master/doc/proto.md
> > > 
> > > 
> > > So, I'm not sure, that adding something qemu-specific to this external
> > > protocol will be simple or even true way. Is Qemu already extending 
> > > original
> > > nbd?
> > 
> > No, we don't do any QEMU specific extensions. The point of the approach
> > Stefan suggests here though, is that it is *not* an inherantly QEMU-specific
> > concept, it is relevant to any NBD server implementation.
> > 
> > For example, consider you were using a regular NBD server to export a
> > sparse LVM volume. This proposed extension would be relevant to such
> > a use case. As such this proposed extension is something that is likely
> > to be acceptable for the generic NBD specification.
> 
> Yes, Get LBA Status could be useful for non-QEMU NBD users too.
> 
> NBD already supports a TRIM command so the ability to query the
> allocation status is a natural feature to add.

Is it an abuse to "Get LBA Status" to return dirty information? Because in SCSI
the command reports 

Re: [Qemu-devel] [RFC PATCH 0/2] ARM: add QMP command to query GIC version

2016-02-18 Thread Peter Xu
On Thu, Feb 18, 2016 at 06:10:21PM +0100, Andrea Bolognani wrote:
> On Thu, 2016-02-18 at 17:52 +0100, Andrew Jones wrote:
> > > Is this work on any of our todo list (or anyone has started the
> > > prototyping)?
> > > 
> > > It seems reasonable to provide such a generic interface, rather than
> > > adding a "query-gic-capability" for GIC versions only. The problem
> > > is that, I am not sure how eagerly we are wanting this GIC
> > > interface, and when will this framework be there in QOM.
> > 
> > We want it eagerly :-) This type of a rabbit hole is likely why Daniel
> > was suggesting we do more in libvirt. I'm still not sure we want to
> > probe both kvm and qemu from libvirt though, so I'm still in favor of
> > an improved qemu probing method being worked out.
> > 
> > I don't know what the policy is for deprecating QMP commands, but I
> > wonder if we can't introduce a QMP command now, and then, after working
> > out the QOM extensions, we could shift to it, deprecating this QMP
> > command and any others that would no longer be needed.
> 
> AFAIK, the current situation of libvirt passing the GIC version to
> QEMU and simply reporting in case of failure is not unprecedented
> and there are a few cases where probing in advance would simply not
> be feasible.
> 
> Any probing code added to libvirt would have to be kept around
> forever to ensure compatibility with current QEMU versions, so it
> should IMHO be seen as a last resort in case we can't live without
> GIC version probing while it's being implemented, properly, in QEMU.

If libvirt is the most possible consumer for the new command, I
think it might not be too hard to keep the compatibility of all
possible versions of QEMU. E.g., after we have got a better way to
query GIC version other than query-gic-capability, we can do
something like this in libvirt:

- try query-gic-capability
  - if supported -> [got GIC version]
  - if not supported -> try the new method
- if supported -> [got GIC version]
- if not supported -> [not support]

During the time when QEMU has both methods working (before
obsoleting the query-gic-capability QMP command), QEMU will make
sure querying in both way will get exactly the same results.

Does this work?

Thanks.
Peter



[Qemu-devel] [RFC 0/1] riscv: Add full-system emulation support for the RISC-V Instruction Set Architecture (RV64G)

2016-02-18 Thread Sagar Karandikar
The patch in this RFC adds support for the RISC-V ISA [1] as a target. It has 
been tested booting Linux and FreeBSD, passes the RISC-V assembly test suite, 
and has had the riscv-torture tester running on it for a couple of weeks now 
without any issues arising.

With this RFC, I mainly wanted to get input on the overall design of the target
implementation, as well as see if any regular contributors would be interested 
in co-mentoring RISC-V related projects for QEMU's Google Summer of Code with 
me.

In case anyone wants to test it out, there is a repo on the RISC-V GitHub
organization with a README.md containing instructions, a copy of the full 
v2.5.0 
codebase with this patch applied, and links to prebuilt RISC-V linux images [2].

Some notes/questions:
- This provides support only for the 64-bit version of the ISA and full system 
  emulation (no user-mode)
- This currently applies to the 2.5.0 release version. I will bump the
  underlying codebase, split this into multiple patches, apply style checks 
  before submitting real patches
- The code in target-riscv/fpu-custom-riscv is an updated/modified version of 
  softfloat. Is it okay to submodule this until the FPU behavior in RISC-V
  is stabilized? (and then later, presumably merge it with the version of
  softfloat included in QEMU). For current review purposes, I believe everything
  in this directory can be ignored.
- The devices in hw/riscv/htif are intended to mimic the experimental devices
  that we use with our RISC-V test chips. These will be removed and replaced 
  with "real" devices once there is better software support in the OS/bootloader
  ports.

Potentially useful references:
- RISC-V User Spec: See [3]. This spec is frozen.
- RISC-V Privileged Spec: See [4]. This spec is a draft.
- Spike, the reference simulator for the ISA: See [5].

[1] https://riscv.org
[2] https://github.com/riscv/riscv-qemu
[3] http://riscv.org/wp-content/uploads/2015/11/riscv-spec-v2.0.pdf
[4] http://riscv.org/wp-content/uploads/2015/11/riscv-privileged-spec-v1.7.pdf
[5] https://github.com/riscv/riscv-isa-sim

Sagar Karandikar (1):
  riscv: Add full-system emulation support for the RISC-V Instruction
Set Architecture (RV64G)

 arch_init.c|2 +
 configure  |   13 +
 cpus.c |6 +
 default-configs/riscv-linux-user.mak   |1 +
 default-configs/riscv-softmmu.mak  |   38 +
 disas.c|2 +
 disas/Makefile.objs|1 +
 disas/riscv.c  |   46 +
 hw/riscv/Makefile.objs |5 +
 hw/riscv/cputimer.c|  170 ++
 hw/riscv/htif/frontend.c   |  215 ++
 hw/riscv/htif/htif.c   |  459 +
 hw/riscv/riscv_board.c |  330 +++
 hw/riscv/riscv_int.c   |   84 +
 hw/riscv/softint.c |  121 ++
 include/disas/bfd.h|1 +
 include/elf.h  |2 +
 include/exec/poison.h  |1 +
 include/exec/user/thunk.h  |2 +-
 include/hw/riscv/bios.h|4 +
 include/hw/riscv/cpudevs.h |   14 +
 include/hw/riscv/cputimer.h|4 +
 include/hw/riscv/htif/frontend.h   |   30 +
 include/hw/riscv/htif/htif.h   |   76 +
 include/hw/riscv/riscv.h   |   10 +
 include/hw/riscv/softint.h |   50 +
 include/sysemu/arch_init.h |1 +
 target-riscv/Makefile.objs |  114 ++
 target-riscv/TODO  |   17 +
 target-riscv/cpu-qom.h |   86 +
 target-riscv/cpu.c |  143 ++
 target-riscv/cpu.h |  449 
 .../fpu-custom-riscv/8086/OLD-specialize.c |   40 +
 .../fpu-custom-riscv/8086/OLD-specialize.h |  379 
 target-riscv/fpu-custom-riscv/8086/platform.h  |   38 +
 .../fpu-custom-riscv/8086/s_commonNaNToF32UI.c |   17 +
 .../fpu-custom-riscv/8086/s_commonNaNToF64UI.c |   19 +
 .../fpu-custom-riscv/8086/s_f32UIToCommonNaN.c |   25 +
 .../fpu-custom-riscv/8086/s_f64UIToCommonNaN.c |   25 +
 .../fpu-custom-riscv/8086/s_isSigNaNF32UI.c|   13 +
 .../fpu-custom-riscv/8086/s_isSigNaNF64UI.c|   15 +
 .../fpu-custom-riscv/8086/s_propagateNaNF32UI.c|   55 +
 .../fpu-custom-riscv/8086/s_propagateNaNF64UI.c|   55 +
 .../fpu-custom-riscv/8086/softfloat_raiseFlags.c   |   51 +
 .../fpu-custom-riscv/8086/softfloat_types.h|   16 +
 

[Qemu-devel] [PATCH v2 1/2] generic-loader: Add a generic loader

2016-02-18 Thread Alistair Francis
Add a generic loader to QEMU which can be used to load images or set
memory values.

This only supports ARM architectures at the moment.

Signed-off-by: Alistair Francis 
---
V2:
 - Add maintainers entry
 - Perform bounds checking
 - Register and unregister the reset in the realise/unrealise
Changes since RFC:
 - Add BE support

 MAINTAINERS  |   6 ++
 default-configs/arm-softmmu.mak  |   1 +
 hw/misc/Makefile.objs|   2 +
 hw/misc/generic-loader.c | 142 +++
 include/hw/misc/generic-loader.h |  50 ++
 5 files changed, 201 insertions(+)
 create mode 100644 hw/misc/generic-loader.c
 create mode 100644 include/hw/misc/generic-loader.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 02710f8..1bc92c3 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -956,6 +956,12 @@ F: hw/acpi/nvdimm.c
 F: hw/mem/nvdimm.c
 F: include/hw/mem/nvdimm.h
 
+Generic Loader
+M: Alistair Francis 
+S: Maintained
+F: hw/misc/generic-loader.c
+F: include/hw/misc/generic-loader.h
+
 Subsystems
 --
 Audio
diff --git a/default-configs/arm-softmmu.mak b/default-configs/arm-softmmu.mak
index a9f82a1..b246b75 100644
--- a/default-configs/arm-softmmu.mak
+++ b/default-configs/arm-softmmu.mak
@@ -110,3 +110,4 @@ CONFIG_IOH3420=y
 CONFIG_I82801B11=y
 CONFIG_ACPI=y
 CONFIG_SMBIOS=y
+CONFIG_LOADER=y
diff --git a/hw/misc/Makefile.objs b/hw/misc/Makefile.objs
index ea6cd3c..9f05dcf 100644
--- a/hw/misc/Makefile.objs
+++ b/hw/misc/Makefile.objs
@@ -46,3 +46,5 @@ obj-$(CONFIG_STM32F2XX_SYSCFG) += stm32f2xx_syscfg.o
 obj-$(CONFIG_PVPANIC) += pvpanic.o
 obj-$(CONFIG_EDU) += edu.o
 obj-$(CONFIG_HYPERV_TESTDEV) += hyperv_testdev.o
+
+obj-$(CONFIG_LOADER) += generic-loader.o
diff --git a/hw/misc/generic-loader.c b/hw/misc/generic-loader.c
new file mode 100644
index 000..e551a38
--- /dev/null
+++ b/hw/misc/generic-loader.c
@@ -0,0 +1,142 @@
+/*
+ * Generic Loader
+ *
+ * Copyright (C) 2014 Li Guang
+ * Copyright (C) 2016 Xilinx Inc.
+ * Written by Li Guang 
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
+ * for more details.
+ */
+
+#include "qemu/osdep.h"
+#include "hw/sysbus.h"
+#include "sysemu/dma.h"
+#include "hw/loader.h"
+#include "hw/misc/generic-loader.h"
+
+#define CPU_NONE 0xFF
+
+static void generic_loader_reset(void *opaque)
+{
+GenericLoaderState *s = GENERIC_LOADER(opaque);
+
+if (s->cpu) {
+CPUClass *cc = CPU_GET_CLASS(s->cpu);
+cpu_reset(s->cpu);
+cc->set_pc(s->cpu, s->addr);
+}
+
+if (s->data_len) {
+assert(s->data_len < sizeof(s->data));
+dma_memory_write((s->cpu ? s->cpu : first_cpu)->as, s->addr, >data,
+ s->data_len);
+}
+}
+
+static void generic_loader_realize(DeviceState *dev, Error **errp)
+{
+GenericLoaderState *s = GENERIC_LOADER(dev);
+hwaddr entry;
+int big_endian;
+int size = 0;
+
+qemu_register_reset(generic_loader_reset, dev);
+
+if (s->cpu_nr != CPU_NONE) {
+CPUState *cs = first_cpu;
+int cpu_num = 0;
+
+CPU_FOREACH(cs) {
+if (cpu_num == s->cpu_nr) {
+s->cpu = cs;
+break;
+} else if (!CPU_NEXT(cs)) {
+error_setg(errp, "Specified boot CPU#%d is nonexistent",
+   s->cpu_nr);
+return;
+} else {
+cpu_num++;
+}
+}
+}
+
+#ifdef TARGET_WORDS_BIGENDIAN
+big_endian = 1;
+#else
+big_endian = 0;
+#endif
+
+if (s->file) {
+if (!s->force_raw) {
+size = load_elf(s->file, NULL, NULL, , NULL, NULL,
+big_endian, ELF_ARCH, 0);
+
+if (size < 0) {
+size = load_uimage(s->file, , NULL, NULL, NULL, NULL);
+}
+}
+
+if (size < 0) {
+size = load_image_targphys(s->file, s->addr, 0);
+} else {
+s->addr = entry;
+}
+
+if (size < 0) {
+error_setg(errp, "Cannot load specified image %s", s->file);
+return;
+}
+}
+
+if (s->data_len && (s->data_len > sizeof(s->data))) {
+error_setg(errp, "data-len cannot be more then the data size");
+return;
+}
+}
+
+static void generic_loader_unrealize(DeviceState *dev, Error **errp)
+{
+qemu_unregister_reset(generic_loader_reset, dev);
+}
+
+static Property generic_loader_props[] = {
+

[Qemu-devel] [PATCH v2 2/2] docs: Add a generic loader explanation document

2016-02-18 Thread Alistair Francis
Signed-off-by: Alistair Francis 
---

 docs/generic-loader.txt | 21 +
 1 file changed, 21 insertions(+)
 create mode 100644 docs/generic-loader.txt

diff --git a/docs/generic-loader.txt b/docs/generic-loader.txt
new file mode 100644
index 000..69e262d
--- /dev/null
+++ b/docs/generic-loader.txt
@@ -0,0 +1,21 @@
+Copyright (c) 2016 Xilinx Inc.
+
+This work is licensed under the terms of the GNU GPL, version 2 or later.  See
+the COPYING file in the top-level directory.
+
+
+This loader allows the user to load multiple images or values into QEMU at 
startup.
+
+Loading Memory Values
+---
+Memory values can be loaded like this:
+-device loader,addr=0xfd1a0104,data=0x800e,data-len=4
+
+Loading Images
+---
+Images can be loaded like this:
+-device loader,file=./images/boot.elf,cpu=0
+
+The limiation for arch is based off settting the ELF_ARCH macro.
+
+At the moment only the ARM arhitectures are supported
-- 
2.5.0




[Qemu-devel] [PATCH v2 0/2] Add a generic loader

2016-02-18 Thread Alistair Francis
This work is based on the original work by Li Guang with extra
features added by Peter C and myself.

The idea of this loader is to allow the user to load multiple images
or values into QEMU at startup.

Memory values can be loaded like this: -device 
loader,addr=0xfd1a0104,data=0x800e,data-len=4

Images can be loaded like this: -device loader,file=./images/u-boot.elf,cpu=0

This can be useful and we use it a lot in Xilinx to load multiple images
into a machine at creation (ATF, Kernel and DTB for example).

It can also be used to set registers.

The limiation for arch is based off settting the ELF_ARCH macro.

The reset patch is required otherwise the reset will never be registered
and the loader can't change the PC in the case of images.

V2:
 - Add an entry to the maintainers file
 - Add some documentation
 - Perform bounds checking on the data_len
 - Register and unregister the reset in the realise/unrealise
Changes since RFC:
 - Add support for BE


Alistair Francis (2):
  generic-loader: Add a generic loader
  docs: Add a generic loader explanation document

 MAINTAINERS  |   6 ++
 default-configs/arm-softmmu.mak  |   1 +
 docs/generic-loader.txt  |  21 ++
 hw/misc/Makefile.objs|   2 +
 hw/misc/generic-loader.c | 142 +++
 include/hw/misc/generic-loader.h |  50 ++
 6 files changed, 222 insertions(+)
 create mode 100644 docs/generic-loader.txt
 create mode 100644 hw/misc/generic-loader.c
 create mode 100644 include/hw/misc/generic-loader.h

-- 
2.5.0




Re: [Qemu-devel] [PATCHv7 4/9] slirp: Factorizing tcpiphdr structure with an union

2016-02-18 Thread Samuel Thibault
Hello,

Just to make sure: we have not received comments on this patch.

(that said, it's the most complex part of the series, so I'm not
surprised if it takes more time :) )

Samuel

Samuel Thibault, on Sun 14 Feb 2016 18:47:38 +0100, wrote:
> From: Guillaume Subiron 
> 
> This patch factorizes the tcpiphdr structure to put the IPv4 fields in
> an union, for addition of version 6 in further patch.
> Using some macros, retrocompatibility of the existing code is assured.
> 
> This patch also fixes the SLIRP_MSIZE and margin computation in various
> functions, and makes them compatible with the new tcpiphdr structure,
> whose size will be bigger than sizeof(struct tcphdr) + sizeof(struct ip)
> 
> Signed-off-by: Guillaume Subiron 
> Signed-off-by: Samuel Thibault 
> ---
>  slirp/if.h |  4 ++--
>  slirp/mbuf.c   |  3 ++-
>  slirp/slirp.c  | 15 ---
>  slirp/socket.c | 13 -
>  slirp/tcp_input.c  | 31 ---
>  slirp/tcp_output.c | 18 +-
>  slirp/tcp_subr.c   | 31 ++-
>  slirp/tcpip.h  | 31 +++
>  8 files changed, 102 insertions(+), 44 deletions(-)
> 
> diff --git a/slirp/if.h b/slirp/if.h
> index 3327023..c7a5c57 100644
> --- a/slirp/if.h
> +++ b/slirp/if.h
> @@ -17,7 +17,7 @@
>  #define IF_MRU 1500
>  #define  IF_COMP IF_AUTOCOMP /* Flags for compression */
>  
> -/* 2 for alignment, 14 for ethernet, 40 for TCP/IP */
> -#define IF_MAXLINKHDR (2 + 14 + 40)
> +/* 2 for alignment, 14 for ethernet */
> +#define IF_MAXLINKHDR (2 + ETH_HLEN)
>  
>  #endif
> diff --git a/slirp/mbuf.c b/slirp/mbuf.c
> index c959758..f081c69 100644
> --- a/slirp/mbuf.c
> +++ b/slirp/mbuf.c
> @@ -24,7 +24,8 @@
>   * Find a nice value for msize
>   * XXX if_maxlinkhdr already in mtu
>   */
> -#define SLIRP_MSIZE (IF_MTU + IF_MAXLINKHDR + offsetof(struct mbuf, m_dat) + 
> 6)
> +#define SLIRP_MSIZE\
> +(offsetof(struct mbuf, m_dat) + IF_MAXLINKHDR + TCPIPHDR_DELTA + IF_MTU)
>  
>  void
>  m_init(Slirp *slirp)
> diff --git a/slirp/slirp.c b/slirp/slirp.c
> index 5f42ada..c2c4597 100644
> --- a/slirp/slirp.c
> +++ b/slirp/slirp.c
> @@ -754,15 +754,16 @@ void slirp_input(Slirp *slirp, const uint8_t *pkt, int 
> pkt_len)
>  m = m_get(slirp);
>  if (!m)
>  return;
> -/* Note: we add to align the IP header */
> -if (M_FREEROOM(m) < pkt_len + 2) {
> -m_inc(m, pkt_len + 2);
> +/* Note: we add 2 to align the IP header on 4 bytes,
> + * and add the margin for the tcpiphdr overhead  */
> +if (M_FREEROOM(m) < pkt_len + TCPIPHDR_DELTA + 2) {
> +m_inc(m, pkt_len + TCPIPHDR_DELTA + 2);
>  }
> -m->m_len = pkt_len + 2;
> -memcpy(m->m_data + 2, pkt, pkt_len);
> +m->m_len = pkt_len + TCPIPHDR_DELTA + 2;
> +memcpy(m->m_data + TCPIPHDR_DELTA + 2, pkt, pkt_len);
>  
> -m->m_data += 2 + ETH_HLEN;
> -m->m_len -= 2 + ETH_HLEN;
> +m->m_data += TCPIPHDR_DELTA + 2 + ETH_HLEN;
> +m->m_len -= TCPIPHDR_DELTA + 2 + ETH_HLEN;
>  
>  if (proto == ETH_P_IP) {
>  ip_input(m);
> diff --git a/slirp/socket.c b/slirp/socket.c
> index b79ddec..d4b02c8 100644
> --- a/slirp/socket.c
> +++ b/slirp/socket.c
> @@ -483,7 +483,18 @@ sorecvfrom(struct socket *so)
> if (!m) {
> return;
> }
> -   m->m_data += IF_MAXLINKHDR;
> +   switch (so->so_ffamily) {
> +   case AF_INET:
> +   m->m_data += IF_MAXLINKHDR + sizeof(struct udpiphdr);
> +   break;
> +   case AF_INET6:
> +   m->m_data += IF_MAXLINKHDR + sizeof(struct ip6)
> +  + sizeof(struct udphdr);
> +   break;
> +   default:
> +   g_assert_not_reached();
> +   break;
> +   }
>  
> /*
>  * XXX Shouldn't FIONREAD packets destined for port 53,
> diff --git a/slirp/tcp_input.c b/slirp/tcp_input.c
> index 5f845da..26b0c8b 100644
> --- a/slirp/tcp_input.c
> +++ b/slirp/tcp_input.c
> @@ -256,11 +256,6 @@ tcp_input(struct mbuf *m, int iphlen, struct socket 
> *inso)
>   }
>   slirp = m->slirp;
>  
> - /*
> -  * Get IP and TCP header together in first mbuf.
> -  * Note: IP leaves IP header in first mbuf.
> -  */
> - ti = mtod(m, struct tcpiphdr *);
>   if (iphlen > sizeof(struct ip )) {
> ip_stripoptions(m, (struct mbuf *)0);
> iphlen=sizeof(struct ip );
> @@ -277,14 +272,28 @@ tcp_input(struct mbuf *m, int iphlen, struct socket 
> *inso)
>   save_ip.ip_len+= iphlen;
>  
>   /*
> +  * Get IP and TCP header together in first mbuf.
> +  * Note: IP leaves IP header in first mbuf.
> +  */
> + m->m_data -= sizeof(struct tcpiphdr) - (sizeof(struct ip)
> +  + sizeof(struct tcphdr));
> +

Re: [Qemu-devel] [PATCHv7 9/9] qapi-schema, qemu-options & slirp: Adding Qemu options for IPv6 addresses

2016-02-18 Thread Samuel Thibault
Thomas Huth, on Wed 17 Feb 2016 12:59:04 +0100, wrote:
> The IPv4 code seems to have a sanity check that the vhost address is
> within the same net as specified with the "net=" parameter... maybe the
> IPv6 part should have that, too? Or use the prefix from "ip6-net=" here
> if ip6-host has not been specified manually?

I'd say the latter indeed.  I have reworked this part of the code, and
then the implementation matches the documentation :)

@@ -235,6 +242,52 @@ static int net_slirp_init(NetClientState *peer, const char 
*model,
 }
 #endif
 
+
+if (!vprefix6) {
+vprefix6 = "fec0::";
+}
+if (!inet_pton(AF_INET6, vprefix6, _prefix)) {
+return -1;
+}
+
+if (!vprefix6_len) {
+vprefix6_len = 64;
+}
+if (vprefix6_len < 0 || vprefix6_len > 128) {
+return -1;
+}
+
+if (vhost6) {
+if (!inet_pton(AF_INET6, vhost6, _host)) {
+return -1;
+}
+if (!in6_equal_net(_prefix, _host, vprefix6_len)) {
+return -1;
+}
+} else {
+if (vprefix6_len > 126) {
+return -1;
+}
+ip6_host = ip6_prefix;
+ip6_host.s6_addr[15] |= 2;
+}
+
+if (vnameserver6) {
+if (!inet_pton(AF_INET6, vnameserver6, _dns)) {
+return -1;
+}
+if (!in6_equal_net(_prefix, _dns, vprefix6_len)) {
+return -1;
+}
+} else {
+if (vprefix6_len > 126) {
+return -1;
+}
+ip6_dns = ip6_prefix;
+ip6_dns.s6_addr[15] |= 3;
+}
+
+
 nc = qemu_new_net_client(_slirp_info, peer, model, name);
 
 snprintf(nc->info_str, sizeof(nc->info_str),



Re: [Qemu-devel] [PATCH v4] hw/ppc/spapr: Implement the h_page_init hypercall

2016-02-18 Thread David Gibson
On Thu, Feb 18, 2016 at 10:15:54AM +0100, Thomas Huth wrote:
> This hypercall either initializes a page with zeros, or copies
> another page.
> According to LoPAPR, the i-cache of the page should also be
> flushed if using H_ICACHE_INVALIDATE or H_ICACHE_SYNCHRONIZE,
> and the d-cache should be synchronized to the RAM if the
> H_ICACHE_SYNCHRONIZE flag is used. For this, two new functions
> are introduced, kvmppc_dcbst_range() and kvmppc_icbi()_range, which
> use the corresponding assembler instructions to flush the caches
> if running with KVM on Power. If the code runs with TCG instead,
> the code only uses tb_flush(), assuming that this will be
> enough for synchronization.
> 
> Signed-off-by: Thomas Huth 

Applied to ppc-for-2.6, thanks.

> ---
>  v4:
>  - Re-arranged the H_COPY_PAGE code block for better readability
>and to avoid a wrong compiler warning with newer versions of GCC
> 
>  hw/ppc/spapr_hcall.c | 60 
> 
>  target-ppc/kvm_ppc.h | 36 +--
>  2 files changed, 94 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
> index 6e9b6be..1733482 100644
> --- a/hw/ppc/spapr_hcall.c
> +++ b/hw/ppc/spapr_hcall.c
> @@ -386,6 +386,65 @@ static target_ulong h_set_xdabr(PowerPCCPU *cpu, 
> sPAPRMachineState *spapr,
>  return H_SUCCESS;
>  }
>  
> +static target_ulong h_page_init(PowerPCCPU *cpu, sPAPRMachineState *spapr,
> +target_ulong opcode, target_ulong *args)
> +{
> +target_ulong flags = args[0];
> +hwaddr dst = args[1];
> +hwaddr src = args[2];
> +hwaddr len = TARGET_PAGE_SIZE;
> +uint8_t *pdst, *psrc;
> +target_long ret = H_SUCCESS;
> +
> +if (flags & ~(H_ICACHE_SYNCHRONIZE | H_ICACHE_INVALIDATE
> +  | H_COPY_PAGE | H_ZERO_PAGE)) {
> +qemu_log_mask(LOG_UNIMP, "h_page_init: Bad flags (" TARGET_FMT_lx 
> "\n",
> +  flags);
> +return H_PARAMETER;
> +}
> +
> +/* Map-in destination */
> +if (!is_ram_address(spapr, dst) || (dst & ~TARGET_PAGE_MASK) != 0) {
> +return H_PARAMETER;
> +}
> +pdst = cpu_physical_memory_map(dst, , 1);
> +if (!pdst || len != TARGET_PAGE_SIZE) {
> +return H_PARAMETER;
> +}
> +
> +if (flags & H_COPY_PAGE) {
> +/* Map-in source, copy to destination, and unmap source again */
> +if (!is_ram_address(spapr, src) || (src & ~TARGET_PAGE_MASK) != 0) {
> +ret = H_PARAMETER;
> +goto unmap_out;
> +}
> +psrc = cpu_physical_memory_map(src, , 0);
> +if (!psrc || len != TARGET_PAGE_SIZE) {
> +ret = H_PARAMETER;
> +goto unmap_out;
> +}
> +memcpy(pdst, psrc, len);
> +cpu_physical_memory_unmap(psrc, len, 0, len);
> +} else if (flags & H_ZERO_PAGE) {
> +memset(pdst, 0, len);  /* Just clear the destination page */
> +}
> +
> +if (kvm_enabled() && (flags & H_ICACHE_SYNCHRONIZE) != 0) {
> +kvmppc_dcbst_range(cpu, pdst, len);
> +}
> +if (flags & (H_ICACHE_SYNCHRONIZE | H_ICACHE_INVALIDATE)) {
> +if (kvm_enabled()) {
> +kvmppc_icbi_range(cpu, pdst, len);
> +} else {
> +tb_flush(CPU(cpu));
> +}
> +}
> +
> +unmap_out:
> +cpu_physical_memory_unmap(pdst, TARGET_PAGE_SIZE, 1, len);
> +return ret;
> +}
> +
>  #define FLAGS_REGISTER_VPA 0x2000ULL
>  #define FLAGS_REGISTER_DTL 0x4000ULL
>  #define FLAGS_REGISTER_SLBSHADOW   0x6000ULL
> @@ -1045,6 +1104,7 @@ static void hypercall_register_types(void)
>  spapr_register_hypercall(H_SET_SPRG0, h_set_sprg0);
>  spapr_register_hypercall(H_SET_DABR, h_set_dabr);
>  spapr_register_hypercall(H_SET_XDABR, h_set_xdabr);
> +spapr_register_hypercall(H_PAGE_INIT, h_page_init);
>  spapr_register_hypercall(H_SET_MODE, h_set_mode);
>  
>  /* "debugger" hcalls (also used by SLOF). Note: We do -not- differenciate
> diff --git a/target-ppc/kvm_ppc.h b/target-ppc/kvm_ppc.h
> index aaa828c..fd64c44 100644
> --- a/target-ppc/kvm_ppc.h
> +++ b/target-ppc/kvm_ppc.h
> @@ -249,15 +249,47 @@ static inline int kvmppc_enable_hwrng(void)
>  #endif
>  
>  #ifndef CONFIG_KVM
> +
>  #define kvmppc_eieio() do { } while (0)
> -#else
> +
> +static inline void kvmppc_dcbst_range(PowerPCCPU *cpu, uint8_t *addr, int 
> len)
> +{
> +}
> +
> +static inline void kvmppc_icbi_range(PowerPCCPU *cpu, uint8_t *addr, int len)
> +{
> +}
> +
> +#else   /* CONFIG_KVM */
> +
>  #define kvmppc_eieio() \
>  do {  \
>  if (kvm_enabled()) {  \
>  asm volatile("eieio" : : : "memory"); \
>  } \
>  } while (0)
> -#endif
> +
> +/* Store data cache blocks back to memory */
> +static inline void kvmppc_dcbst_range(PowerPCCPU *cpu, uint8_t 

Re: [Qemu-devel] [PATCH] spapr: initialize local Error pointer

2016-02-18 Thread David Gibson
On Thu, Feb 18, 2016 at 04:02:53PM +0100, Markus Armbruster wrote:
> Greg Kurz  writes:
> 
> > This fixes a crash in the target QEMU during migration.
> >
> > Fixes: c5f54f3e31bf693f70a98d4d73ea5dbe05689857
> 
> Suggest to word that as "Broken in commit c5f54f3."
> 
> > Signed-off-by: Greg Kurz 
> > ---
> >  hw/ppc/spapr.c |2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index e214a34257b3..c119f5582429 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -1528,7 +1528,7 @@ static int htab_load(QEMUFile *f, void *opaque, int 
> > version_id)
> >  section_hdr = qemu_get_be32(f);
> >  
> >  if (section_hdr) {
> > -Error *local_err;
> > +Error *local_err = NULL;
> >  
> >  /* First section gives the htab size */
> >  spapr_reallocate_hpt(spapr, section_hdr, _err);
> 
> Easy mistake to make (I've made it myself).  I wish I had the time to
> figure out how to make Coccinelle catch it.
> 
> Reviewed-by: Markus Armbruster 

Applied to ppc-for-2.6 with Markus's suggested rewording.  Thanks.


-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [Qemu-devel] [PATCH] ppc/kvm: Use error_report() instead of cpu_abort() for user-triggerable errors

2016-02-18 Thread David Gibson
On Thu, Feb 18, 2016 at 10:01:38PM +0100, Thomas Huth wrote:
> Setting the KVM_CAP_PPC_PAPR capability can fail if either the KVM
> kernel module does not support it, or if the specified vCPU type
> is not a 64-bit Book3-S CPU type. For example, the user can trigger
> it easily with "-M pseries -cpu G2leLS" when using the kvm-pr kernel
> module. So the error should not be reported with cpu_abort() since
> this function is rather meant for reporting programming errors than
> reporting user-triggerable errors (it prints out all CPU registers
> and then calls abort() to kills the program - two things that the
> normal user does not expect here) . So let's use error_report() with
> exit(1) here instead.
> A similar problem exists in the code that sets the KVM_CAP_PPC_EPR
> capability, so while we're at it, fix that, too.
> 
> Signed-off-by: Thomas Huth 

Applied to ppc-for-2.6, thanks.

> ---
>  target-ppc/kvm.c | 7 +--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
> index 70ca296..762d6cf 100644
> --- a/target-ppc/kvm.c
> +++ b/target-ppc/kvm.c
> @@ -23,6 +23,7 @@
>  #include 
>  
>  #include "qemu-common.h"
> +#include "qemu/error-report.h"
>  #include "qemu/timer.h"
>  #include "sysemu/sysemu.h"
>  #include "sysemu/kvm.h"
> @@ -1993,7 +1994,8 @@ void kvmppc_set_papr(PowerPCCPU *cpu)
>  
>  ret = kvm_vcpu_enable_cap(cs, KVM_CAP_PPC_PAPR, 0);
>  if (ret) {
> -cpu_abort(cs, "This KVM version does not support PAPR\n");
> +error_report("This vCPU type or KVM version does not support PAPR");
> +exit(1);
>  }
>  
>  /* Update the capability flag so we sync the right information
> @@ -2013,7 +2015,8 @@ void kvmppc_set_mpic_proxy(PowerPCCPU *cpu, int 
> mpic_proxy)
>  
>  ret = kvm_vcpu_enable_cap(cs, KVM_CAP_PPC_EPR, 0, mpic_proxy);
>  if (ret && mpic_proxy) {
> -cpu_abort(cs, "This KVM version does not support EPR\n");
> +error_report("This KVM version does not support EPR");
> +exit(1);
>  }
>  }
>  

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [Qemu-devel] [PATCH] ppc/kvm: Tell the user what might be wrong when using bad CPU types with kvm-hv

2016-02-18 Thread David Gibson
On Thu, Feb 18, 2016 at 03:13:52PM -0700, Eric Blake wrote:
> On 02/18/2016 02:01 PM, Thomas Huth wrote:
> > Using a CPU type that does not match the host is not possible when using
> > the kvm-hv kernel module - the PVR is checked in the kernel function
> > kvm_arch_vcpu_ioctl_set_sregs_hv() and rejected with -EINVAL if it
> > does not match the host.
> > However, when the user tries to specify a non-matching CPU type, QEMU
> > currently only reports "kvm_init_vcpu failed: Invalid argument", and
> > this is of course not very helpful for the user to solve the problem.
> > So this patch adds a more descriptive error message that tells the
> > user to specify "-cpu host" instead.
> > 
> > Signed-off-by: Thomas Huth 
> > ---
> >  target-ppc/kvm.c | 4 
> >  1 file changed, 4 insertions(+)
> > 
> > diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
> > index 762d6cf..6545fbe 100644
> > --- a/target-ppc/kvm.c
> > +++ b/target-ppc/kvm.c
> > @@ -513,6 +513,10 @@ int kvm_arch_init_vcpu(CPUState *cs)
> >  /* Synchronize sregs with kvm */
> >  ret = kvm_arch_sync_sregs(cpu);
> >  if (ret) {
> > +if (ret == -EINVAL) {
> > +error_report("Register sync failed... If you're using 
> > kvm-hv.ko,"
> > + " only \"-cpu host\" is possible!");
> 
> No need to shout at the user; drop the trailing !

Applied to ppc-for-2.6 with ! removed.  Thanks.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [Qemu-devel] [PATCHv7 8/9] slirp: Adding IPv6 address for DNS relay

2016-02-18 Thread Samuel Thibault
Samuel Thibault, on Wed 17 Feb 2016 10:36:35 +0100, wrote:
> That's now like this in my tree:
> 
> diff --git a/slirp/ip6.h b/slirp/ip6.h
> index 9f7623f..9e4844e 100644
> --- a/slirp/ip6.h
> +++ b/slirp/ip6.h
> @@ -70,7 +70,11 @@ static inline bool in6_equal_mach(const struct in6_addr *a,
>|| (in6_equal_net(a, &(struct in6_addr)LINKLOCAL_ADDR, 64)\
>&& in6_equal_mach(a, >vhost_addr6, 64)))
>  
> -#define in6_equal_dns(a) 0
> +#define in6_equal_dns(a)\
> +((in6_equal_net(a, >vprefix_addr6, slirp->vprefix_len)\
> +  && in6_equal_mach(a, >vnameserver_addr6, slirp->vprefix_len))\
> +  || (in6_equal_net(a, &(struct in6_addr)LINKLOCAL_ADDR, 64))\
> +  && in6_equal_mach(a, >vnameserver_addr6, 64))

Oops, I meant

+#define in6_equal_dns(a)\
+((in6_equal_net(a, >vprefix_addr6, slirp->vprefix_len)\
+  && in6_equal_mach(a, >vnameserver_addr6, slirp->vprefix_len))\
+  || (in6_equal_net(a, &(struct in6_addr)LINKLOCAL_ADDR, 64)\
+  && in6_equal_mach(a, >vnameserver_addr6, 64)))



Re: [Qemu-devel] [PATCH v3 0/2] Fix migration of old pseries

2016-02-18 Thread David Gibson
On Thu, Feb 18, 2016 at 12:32:11PM +0100, Greg Kurz wrote:
> QEMU 2.4 broke the migration of old pseries machine with the addition
> of configuration sections, which are sent unconditionally.
> 
> We assume that QEMU 2.3 is more deployed than any newer release (based on
> the versions currently shipped by most distros). This v3 series hence
> reverses the logic from v2: it now fully fixes migration of old pseries
> from/to QEMU 2.3 and provides a manual workaround for the QEMU 2.4/2.4.1/2.5
> case.
> 
> With this series, I could migrate the same pseries-2.3 instance in a full
> 2.3->2.6->2.5->2.6->2.4->2.6->2.3 cycle.

Sorry, I've lost track slightly here.  Does this series apply on top
of, or instead of your earlier series that peeks for the config
section?

> ---
> 
> Greg Kurz (2):
>   spapr: skip configuration section during migration of older machines
>   migration: allow machine to enforce configuration section migration
> 
> 
>  hw/core/machine.c   |   21 +
>  hw/ppc/spapr.c  |1 +
>  include/hw/boards.h |1 +
>  migration/savevm.c  |   10 --
>  qemu-options.hx |3 ++-
>  5 files changed, 33 insertions(+), 3 deletions(-)
> 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [Qemu-devel] [PATCH v1 1/2] qdev-monitor.c: Register reset function if the device has one

2016-02-18 Thread Alistair Francis
On Thu, Feb 18, 2016 at 3:07 PM, Peter Crosthwaite
 wrote:
> On Thu, Feb 18, 2016 at 1:47 PM, Paolo Bonzini  wrote:
>>
>>
>> On 18/02/2016 10:56, Markus Armbruster wrote:
>>> Alistair Francis  writes:
>>>
 If the device being added when running qdev_device_add() has
 a reset function, register it so that it can be called.

 Signed-off-by: Alistair Francis 
 ---

  qdev-monitor.c | 2 ++
  1 file changed, 2 insertions(+)

 diff --git a/qdev-monitor.c b/qdev-monitor.c
 index 81e3ff3..0a99d01 100644
 --- a/qdev-monitor.c
 +++ b/qdev-monitor.c
 @@ -561,6 +561,8 @@ DeviceState *qdev_device_add(QemuOpts *opts, Error 
 **errp)

  if (bus) {
  qdev_set_parent_bus(dev, bus);
 +} else if (dc->reset) {
 +qemu_register_reset((void (*)(void *))dc->reset, dev);
  }

  id = qemu_opts_id(opts);
>>>
>>> This looks wrong to me.
>>>
>>> You stuff all the device reset methods into the global reset_handlers
>>> list, where they get called in some semi-random order.  This breaks when
>>> there are reset order dependencies between devices, e.g. between a
>>> device and the bus it plugs into.
>>
>> There is no bus here, see the "if" above the one that's being added.
>>
>> However, what devices have done so far is to register/unregister the
>> reset in the realize/unrealize methods, and I suggest doing the same.

Ok, I am happy to do it that way. It just seemed dodgy to me
registering the reset in the realise.

This also seemed like a feature worth having, as I thought this would
come up again in the future.

>>
>
> Does this assume the device itself knows whether it is bus-connected
> or not? This way has the advantage of catchall-ing devices that have
> no bus connected and the device may or may not know whether it is
> bus-connected (nor should it need to know). Probably doesn't have in
> tree precedent yet, but I thought we wanted to move away from
> qdev/qbus managing the device-tree. So ideally, the new else should
> become unconditional long term once we debusify (and properly QOMify)
> the reset tree (and the if goes away).

That was my general thinking as well.

Thanks,

Alistair

>
>> If you really want to add the magic qemu_register_reset, you should at
>> least do one of the following:
>>
>> * add a matching unregister (no idea where)
>>
>
> You could add a boolean flag to DeviceState that is set by this
> registration. It can then be checked at unrealize to remove reset
> handler.
>
> Regards,
> Peter
>
>> * assert that the device is not hot-unpluggable, otherwise hot-unplug
>> would leave a dangling pointer.
>>
>> Paolo
>>
>>> Propagating the reset signal to all the devices is a qdev problem.
>>> Copying Andreas for further insight.
>



Re: [Qemu-devel] [RFC PATCH] tests/pxe-test: add pxe vhost user test

2016-02-18 Thread Michael S. Tsirkin
On Thu, Feb 18, 2016 at 07:05:47PM +0200, Victor Kaplansky wrote:
> Add one more testcase to pxe-test for checking vhost-user
> interface. The test is based on spawning vhost-user-bridge
> process in addition to qemu under test.
> 
> Suggested-by: Michael S. Tsirkin 
> Signed-off-by: Victor Kaplansky 
> ---
> The patch is a quick and dirty, but working implementation of
> Michal's idea. It is based on latest vhost-user-bridge test and
> latest change in rules: "[PATCH] rules: filter out irrelevant
> files".

you asked how to fix warning about no nics being there.
I think that maybe we should just drop this warning.
It's bogus in case of a hotplug, anyway.
Or, check qtest_enabled and print if not there only.

> 
>  tests/pxe-test.c | 39 +++
>  tests/Makefile   |  4 +++-
>  2 files changed, 42 insertions(+), 1 deletion(-)
> 
> diff --git a/tests/pxe-test.c b/tests/pxe-test.c
> index fa430958..d01965f8 100644
> --- a/tests/pxe-test.c
> +++ b/tests/pxe-test.c
> @@ -48,6 +48,44 @@ static void test_pxe_virtio_pci(void)
>  test_pxe_one("-device virtio-net-pci,netdev=" NETNAME);
>  }
>  
> +static void test_pxe_vhost_user(void)
> +{
> +pid_t pid;
> +int status;
> +
> +pid = fork();
> +if (!pid) {
> +char *vubr_binary;
> +char *args;
> +
> +vubr_binary = getenv("QTEST_VUBR_BINARY");
> +assert(vubr_binary);
> +
> +args =
> +g_strdup_printf("%s -u /tmp/vubr.sock -l 127.0.0.1: "
> +"-r 127.0.0.1: 1> /dev/null 2>&1",

Don't hard-code port numbers. open sockets and pass their FDs around.


> +vubr_binary);


Pls don't use /tmp like this, this is a security problem.
Pls use g_file_open_tmp, mkdtemp or some such.
See tests/vhost-user-test.c as one example.

> +system(args);
> +free(args);
> +exit(0);
> +}
> +
> +g_usleep(1 * G_USEC_PER_SEC);
> +qtest_start(
> +"-enable-kvm "
> +"-m 1024 "

1G for a pxe test? seems way too much.

> +"-object 
> memory-backend-file,id=mem,size=1024M,mem-path=/tmp/hugepages,share=on "

again, look at vhost-user test for a clean way to do this.

> +"-numa node,memdev=mem -mem-prealloc "
> +"-device virtio-net-pci,netdev=net0 "
> +"-chardev socket,id=char0,path=/tmp/vubr.sock "
> +"-netdev type=vhost-user,id=net0,chardev=char0,vhostforce "
> +"-net user,vlan=5,tftp=./,bootfile=tests/pxe-test-disk.raw "
> +"-net socket,vlan=5,udp=localhost:,localaddr=localhost: ");
> +boot_sector_test();
> +qtest_quit(global_qtest);
> +waitpid(pid, , 0);
> +}
> +
>  int main(int argc, char *argv[])
>  {
>  int ret;
> @@ -60,6 +98,7 @@ int main(int argc, char *argv[])
>  g_test_init(, , NULL);
>  
>  if (strcmp(arch, "i386") == 0 || strcmp(arch, "x86_64") == 0) {
> +qtest_add_func("pxe/vhost-user", test_pxe_vhost_user);
>  qtest_add_func("pxe/e1000", test_pxe_e1000);
>  qtest_add_func("pxe/virtio", test_pxe_virtio_pci);
>  }
> diff --git a/tests/Makefile b/tests/Makefile
> index 839d357d..40d9d8cc 100644
> --- a/tests/Makefile
> +++ b/tests/Makefile
> @@ -521,7 +521,8 @@ tests/hd-geo-test$(EXESUF): tests/hd-geo-test.o
>  tests/boot-order-test$(EXESUF): tests/boot-order-test.o $(libqos-obj-y)
>  tests/bios-tables-test$(EXESUF): tests/bios-tables-test.o \
>   tests/boot-sector.o $(libqos-obj-y)
> -tests/pxe-test$(EXESUF): tests/pxe-test.o tests/boot-sector.o $(libqos-obj-y)
> +tests/pxe-test$(EXESUF): tests/pxe-test.o tests/boot-sector.o \
> +tests/vhost-user-bridge$(EXESUF) $(libqos-obj-y)
>  tests/tmp105-test$(EXESUF): tests/tmp105-test.o $(libqos-omap-obj-y)
>  tests/ds1338-test$(EXESUF): tests/ds1338-test.o $(libqos-imx-obj-y)
>  tests/i440fx-test$(EXESUF): tests/i440fx-test.o $(libqos-pc-obj-y)
> @@ -620,6 +621,7 @@ $(patsubst %, check-qtest-%, $(QTEST_TARGETS)): 
> check-qtest-%: $(check-qtest-y)
>   $(if $(CONFIG_GCOV),@rm -f *.gcda */*.gcda */*/*.gcda */*/*/*.gcda,)
>   $(call quiet-command,QTEST_QEMU_BINARY=$*-softmmu/qemu-system-$* \
>   QTEST_QEMU_IMG=qemu-img$(EXESUF) \
> + QTEST_VUBR_BINARY=./tests/vhost-user-bridge$(EXESUF) \
>   MALLOC_PERTURB_=$${MALLOC_PERTURB_:-$$((RANDOM % 255 + 1))} \
>   gtester $(GTESTER_OPTIONS) -m=$(SPEED) $(check-qtest-$*-y) 
> $(check-qtest-generic-y),"GTESTER $@")
>   $(if $(CONFIG_GCOV),@for f in $(gcov-files-$*-y) 
> $(gcov-files-generic-y); do \
> -- 
> Victor



Re: [Qemu-devel] [PATCH v2 1/1] hyperv: cpu hotplug fix with HyperV enabled

2016-02-18 Thread Eduardo Habkost
On Thu, Feb 18, 2016 at 07:50:59PM +0300, Denis V. Lunev wrote:
> On 02/17/2016 11:31 PM, Eduardo Habkost wrote:
> >On Sat, Feb 13, 2016 at 03:00:15PM +0300, Denis V. Lunev wrote:
> >>With Hyper-V enabled CPU hotplug stops working. The CPU appears in device
> >>manager on Windows but does not appear in peformance monitor and control
> >>panel.
> >>
> >>The root of the problem is the following. Windows checks
> >>HV_X64_CPU_DYNAMIC_PARTITIONING_AVAILABLE bit in CPUID. The presence of
> >>this bit is enough to cure the situation.
> >What about live migration? This is going to change CPUID data
> >under the guest's feet.
> >
> >>The bit should be set when CPU hotplug is allowed for HyperV VM. The check
> >>that hot_add_cpu callback is defined is enough from the protocol point
> >>of view. Though this callback is defined almost always thus there is no
> >>need to export that knowledge in the other way.
> >What would be the consequences of setting it when CPU hotplug is
> >not available? Is there any real advantage of keeping it unset in
> >pc-1.4 and older?
> >
> >If there are good reasons to keep it unset if CPU hotplug is not
> >possible, why set it when max_cpus == smp_cpus?
> 
> I have made some tests with Win2k12 and the picture matches my
> expectations. This property is read from CPUID once at system
> boot:
> - hotplug is working for VM with this property set after migration
>   to QEMU which does not support this property
> - hotplug remains not working after migration to QEMU which
>   sets this property
> No side effects detected but I have not checked that a lot.
> 
> I have discussed this thing with our local Windows experts and
> they do not know side-effects of this.
> 
> Thus I think that we could set this unconditionally.
> 

We avoid changing CPUID of a running VM, but if HyperV experts
don't see any issues I won't mind keeping it simple.

-- 
Eduardo



Re: [Qemu-devel] [PATCH v2 0/5] q35: Remove old machines and unused compat code

2016-02-18 Thread Eduardo Habkost
On Thu, Feb 18, 2016 at 01:39:39PM +0100, Markus Armbruster wrote:
> Eduardo Habkost  writes:
> 
> > This is another attempt to remove old q35 machine code. Now I am
> > also removing unused compat code to demonstrate the benefit of
> > throwing away the old code that nobody uses.
> >
> > Eduardo Habkost (5):
> >   q35: Remove old machine versions
> >   machine: Remove no_tco field
> >   ich9: Remove enable_tco arguments from init functions
> >   q35: Remove unused q35-acpi-dsdt.aml file
> >   q35: No need to check gigabyte_align
> 
> Reviewed-by: Markus Armbruster 
> 
> Followup questions:
> 
> * Can pcmc->pci_enabled be false in pc_q35_init()?

No, but I am trying to keep pc_q35_init() code closer to
pc_piix.c:pc_init1() to make future code unification easier.

> 
> * Likewise, pcmc->smbios_defaults?

Ditto.

However, if other developers think otherwise and would like to
make q35_init() simpler in the meantime, I won't object to
patches removing the "if" lines.

-- 
Eduardo



Re: [Qemu-devel] [PATCH v1 1/2] qdev-monitor.c: Register reset function if the device has one

2016-02-18 Thread Peter Crosthwaite
On Thu, Feb 18, 2016 at 1:47 PM, Paolo Bonzini  wrote:
>
>
> On 18/02/2016 10:56, Markus Armbruster wrote:
>> Alistair Francis  writes:
>>
>>> If the device being added when running qdev_device_add() has
>>> a reset function, register it so that it can be called.
>>>
>>> Signed-off-by: Alistair Francis 
>>> ---
>>>
>>>  qdev-monitor.c | 2 ++
>>>  1 file changed, 2 insertions(+)
>>>
>>> diff --git a/qdev-monitor.c b/qdev-monitor.c
>>> index 81e3ff3..0a99d01 100644
>>> --- a/qdev-monitor.c
>>> +++ b/qdev-monitor.c
>>> @@ -561,6 +561,8 @@ DeviceState *qdev_device_add(QemuOpts *opts, Error 
>>> **errp)
>>>
>>>  if (bus) {
>>>  qdev_set_parent_bus(dev, bus);
>>> +} else if (dc->reset) {
>>> +qemu_register_reset((void (*)(void *))dc->reset, dev);
>>>  }
>>>
>>>  id = qemu_opts_id(opts);
>>
>> This looks wrong to me.
>>
>> You stuff all the device reset methods into the global reset_handlers
>> list, where they get called in some semi-random order.  This breaks when
>> there are reset order dependencies between devices, e.g. between a
>> device and the bus it plugs into.
>
> There is no bus here, see the "if" above the one that's being added.
>
> However, what devices have done so far is to register/unregister the
> reset in the realize/unrealize methods, and I suggest doing the same.
>

Does this assume the device itself knows whether it is bus-connected
or not? This way has the advantage of catchall-ing devices that have
no bus connected and the device may or may not know whether it is
bus-connected (nor should it need to know). Probably doesn't have in
tree precedent yet, but I thought we wanted to move away from
qdev/qbus managing the device-tree. So ideally, the new else should
become unconditional long term once we debusify (and properly QOMify)
the reset tree (and the if goes away).

> If you really want to add the magic qemu_register_reset, you should at
> least do one of the following:
>
> * add a matching unregister (no idea where)
>

You could add a boolean flag to DeviceState that is set by this
registration. It can then be checked at unrealize to remove reset
handler.

Regards,
Peter

> * assert that the device is not hot-unpluggable, otherwise hot-unplug
> would leave a dangling pointer.
>
> Paolo
>
>> Propagating the reset signal to all the devices is a qdev problem.
>> Copying Andreas for further insight.



Re: [Qemu-devel] [PATCH] ppc/kvm: Tell the user what might be wrong when using bad CPU types with kvm-hv

2016-02-18 Thread Eric Blake
On 02/18/2016 02:01 PM, Thomas Huth wrote:
> Using a CPU type that does not match the host is not possible when using
> the kvm-hv kernel module - the PVR is checked in the kernel function
> kvm_arch_vcpu_ioctl_set_sregs_hv() and rejected with -EINVAL if it
> does not match the host.
> However, when the user tries to specify a non-matching CPU type, QEMU
> currently only reports "kvm_init_vcpu failed: Invalid argument", and
> this is of course not very helpful for the user to solve the problem.
> So this patch adds a more descriptive error message that tells the
> user to specify "-cpu host" instead.
> 
> Signed-off-by: Thomas Huth 
> ---
>  target-ppc/kvm.c | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
> index 762d6cf..6545fbe 100644
> --- a/target-ppc/kvm.c
> +++ b/target-ppc/kvm.c
> @@ -513,6 +513,10 @@ int kvm_arch_init_vcpu(CPUState *cs)
>  /* Synchronize sregs with kvm */
>  ret = kvm_arch_sync_sregs(cpu);
>  if (ret) {
> +if (ret == -EINVAL) {
> +error_report("Register sync failed... If you're using kvm-hv.ko,"
> + " only \"-cpu host\" is possible!");

No need to shout at the user; drop the trailing !

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


[Qemu-devel] [Bug 1490611] Re: Using qemu >=2.2.1 to convert raw->VHD (fixed) adds extra padding to the result file, which Microsoft Azure rejects as invalid

2016-02-18 Thread Cole Mickens
Unfortunately, VHDX is not supported in Azure.

I would be happy to help test out any patches.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1490611

Title:
  Using qemu >=2.2.1 to convert raw->VHD (fixed) adds extra padding to
  the result file, which Microsoft Azure rejects as invalid

Status in QEMU:
  Fix Released

Bug description:
  Starting with a raw disk image, using "qemu-img convert" to convert
  from raw to VHD results in the output VHD file's virtual size being
  aligned to the nearest 516096 bytes (16 heads x 63 sectors per head x
  512 bytes per sector), instead of preserving the input file's size as
  the output VHD's virtual disk size.

  Microsoft Azure requires that disk images (VHDs) submitted for upload
  have virtual sizes aligned to a megabyte boundary. (Ex. 4096MB,
  4097MB, 4098MB, etc. are OK, 4096.5MB is rejected with an error.) This
  is reflected in Microsoft's documentation: https://azure.microsoft.com
  /en-us/documentation/articles/virtual-machines-linux-create-upload-
  vhd-generic/

  This is reproducible with the following set of commands (including the
  Azure command line tools from https://github.com/Azure/azure-xplat-
  cli). For the following example, I used qemu version 2.2.1:

  $ dd if=/dev/zero of=source-disk.img bs=1M count=4096

  $ stat source-disk.img 
File: ‘source-disk.img’
Size: 4294967296  Blocks: 798656 IO Block: 4096   regular file
  Device: fc01h/64513dInode: 13247963Links: 1
  Access: (0644/-rw-r--r--)  Uid: ( 1000/  smkent)   Gid: ( 1000/  smkent)
  Access: 2015-08-18 09:48:02.613988480 -0700
  Modify: 2015-08-18 09:48:02.825985646 -0700
  Change: 2015-08-18 09:48:02.825985646 -0700
   Birth: -

  $ qemu-img convert -f raw -o subformat=fixed -O vpc source-disk.img
  dest-disk.vhd

  $ stat dest-disk.vhd 
File: ‘dest-disk.vhd’
Size: 4296499712  Blocks: 535216 IO Block: 4096   regular file
  Device: fc01h/64513dInode: 13247964Links: 1
  Access: (0644/-rw-r--r--)  Uid: ( 1000/  smkent)   Gid: ( 1000/  smkent)
  Access: 2015-08-18 09:50:22.252077624 -0700
  Modify: 2015-08-18 09:49:24.424868868 -0700
  Change: 2015-08-18 09:49:24.424868868 -0700
   Birth: -

  $ azure vm image create testimage1 dest-disk.vhd -o linux -l "West US"
  info:Executing command vm image create
  + Retrieving storage accounts 
 
  info:VHD size : 4097 MB
  info:Uploading 4195800.5 KB
  Requested:100.0% Completed:100.0% Running:   0 Time: 1m 0s Speed:  6744 KB/s 
  info:https://[redacted].blob.core.windows.net/vm-images/dest-disk.vhd was 
uploaded successfully
  error:   The VHD 
https://[redacted].blob.core.windows.net/vm-images/dest-disk.vhd has an 
unsupported virtual size of 4296499200 bytes.  The size must be a whole number 
(in MBs).
  info:Error information has been recorded to /home/smkent/.azure/azure.err
  error:   vm image create command failed

  I also ran the above commands using qemu 2.4.0, which resulted in the
  same error as the conversion behavior is the same.

  However, qemu 2.1.1 and earlier (including qemu 2.0.0 installed by
  Ubuntu 14.04) does not pad the virtual disk size during conversion.
  Using qemu-img convert from qemu versions <=2.1.1 results in a VHD
  that is exactly the size of the raw input file plus 512 bytes (for the
  VHD footer). Those qemu versions do not attempt to realign the disk.
  As a result, Azure accepts VHD files created using those versions of
  qemu-img convert for upload.

  Is there a reason why newer qemu realigns the converted VHD file? It
  would be useful if an option were added to disable this feature, as
  current versions of qemu cannot be used to create VHD files for Azure
  using Microsoft's official instructions.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1490611/+subscriptions



Re: [Qemu-devel] [PATCH v1 1/2] qdev-monitor.c: Register reset function if the device has one

2016-02-18 Thread Paolo Bonzini


On 18/02/2016 10:56, Markus Armbruster wrote:
> Alistair Francis  writes:
> 
>> If the device being added when running qdev_device_add() has
>> a reset function, register it so that it can be called.
>>
>> Signed-off-by: Alistair Francis 
>> ---
>>
>>  qdev-monitor.c | 2 ++
>>  1 file changed, 2 insertions(+)
>>
>> diff --git a/qdev-monitor.c b/qdev-monitor.c
>> index 81e3ff3..0a99d01 100644
>> --- a/qdev-monitor.c
>> +++ b/qdev-monitor.c
>> @@ -561,6 +561,8 @@ DeviceState *qdev_device_add(QemuOpts *opts, Error 
>> **errp)
>>  
>>  if (bus) {
>>  qdev_set_parent_bus(dev, bus);
>> +} else if (dc->reset) {
>> +qemu_register_reset((void (*)(void *))dc->reset, dev);
>>  }
>>  
>>  id = qemu_opts_id(opts);
> 
> This looks wrong to me.
> 
> You stuff all the device reset methods into the global reset_handlers
> list, where they get called in some semi-random order.  This breaks when
> there are reset order dependencies between devices, e.g. between a
> device and the bus it plugs into.

There is no bus here, see the "if" above the one that's being added.

However, what devices have done so far is to register/unregister the
reset in the realize/unrealize methods, and I suggest doing the same.

If you really want to add the magic qemu_register_reset, you should at
least do one of the following:

* add a matching unregister (no idea where)

* assert that the device is not hot-unpluggable, otherwise hot-unplug
would leave a dangling pointer.

Paolo

> Propagating the reset signal to all the devices is a qdev problem.
> Copying Andreas for further insight.



Re: [Qemu-devel] [PATCH v2 7/9] i.MX: Add i.MX6 SOC implementation.

2016-02-18 Thread Peter Maydell
On 18 February 2016 at 20:51, Jean-Christophe DUBOIS
 wrote:
> Le 16/02/2016 22:57, Peter Maydell a écrit :
>
> On 16 February 2016 at 21:47, Jean-Christophe DUBOIS
>  wrote:
>
> In QEMU, other Cortex A9 (Versatilepb.c, Exynos, Zynq ...) are also setting
> has_el3 to false ...
>
> So these generally are the "legacy" platforms which were
> added before we ever had EL3 support in QEMU. For them it's hard
> to turn the EL3 support on for the board even if in theory
> it ought to be on, because we don't know what users are
> running on it that we might break. With a new to QEMU board
> we have an opportunity to get it right from the start.
>
>
> OK, so is the "highbank" the only Qemu Cortex A9 board supporting
> el3 yet?

Yep. We don't have many A9 boards and most of those we do have
are in the 'legacy' bucket.

> -kernel I would expect to work, though, at least if the
> only issue is the interrupt controller setup. It seems
> worth investigating why it goes wrong.
>
>
> Well, I can boot uniprocessor (-smp 1) without trouble but if I turn logs on
> (guest_errors,unimp) I am getting a lot of
>
> gic_dist_writeb: Bad offset 38x (a few at startup)
> Ignoring attempt to switch CPSR_A flag from non-secure world with
> SCR.AW bit clear (a lot)
> Ignoring attempt to switch CPSR_F flag from non-secure world with
> SCR.FW bit clear (a few)

This would only be a problem if your kernel needed to use FIQ,
I think.

> I am not sure if this is a problem. Do you have some opinion on this?
>
> When I turn SMP (-smp 2 or more), I am unable to complete the boot. As soon
> as my secondary cpu is started QEMU will continue to boot "very slowly" but
> doesn't get to the linux user prompt overnight.

Does SMP work with EL3 not enabled, or is this a different bug?

thanks
-- PMM



[Qemu-devel] [PATCH v2 0/3] add missing virtio aliases

2016-02-18 Thread Sascha Silbe
Split out from the series fixing qemu-iotest 140 on s390x [1].

This series makes crafting qemu command lines that work across
multiple (virtio-supporting) platforms at lot easier.

Changes since the original series:
- included patch to improve the error message when the target device
  class of an alias doesn't exist
- added comment about keeping alias table sorted
- covering all non-abstract virtio device classes rather than just
  those already implemented on s390x

Sascha Silbe (3):
  qdev-monitor: improve error message when alias device is unavailable
  qdev-monitor: sort alias table by typename
  qdev-monitor: add missing aliases for virtio device classes

 qdev-monitor.c | 42 ++
 1 file changed, 34 insertions(+), 8 deletions(-)

[1] mid:1455023713-104799-1-git-send-email-si...@linux.vnet.ibm.com
"[PATCH 0/3] qemu-iotests: fix 140 on s390x" by Sascha Silbe
 on 2016-02-09
-- 
2.1.4




[Qemu-devel] [PATCH v2 2/3] qdev-monitor: sort alias table by typename

2016-02-18 Thread Sascha Silbe
Sort the alias table by typename so it's easier to see which aliases
exist.

Signed-off-by: Sascha Silbe 
---
v1->v2: Added comment asking devs to keep table sorted
---
 qdev-monitor.c | 15 ---
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/qdev-monitor.c b/qdev-monitor.c
index e5136d7..4e3681c 100644
--- a/qdev-monitor.c
+++ b/qdev-monitor.c
@@ -39,19 +39,20 @@ typedef struct QDevAlias
 uint32_t arch_mask;
 } QDevAlias;
 
+/* Please keep this table sorted by typename. */
 static const QDevAlias qdev_alias_table[] = {
-{ "virtio-blk-pci", "virtio-blk", QEMU_ARCH_ALL & ~QEMU_ARCH_S390X },
-{ "virtio-net-pci", "virtio-net", QEMU_ARCH_ALL & ~QEMU_ARCH_S390X },
-{ "virtio-serial-pci", "virtio-serial", QEMU_ARCH_ALL & ~QEMU_ARCH_S390X },
+{ "e1000", "e1000-82540em" },
+{ "ich9-ahci", "ahci" },
+{ "kvm-pci-assign", "pci-assign" },
+{ "lsi53c895a", "lsi" },
 { "virtio-balloon-pci", "virtio-balloon",
 QEMU_ARCH_ALL & ~QEMU_ARCH_S390X },
 { "virtio-blk-ccw", "virtio-blk", QEMU_ARCH_S390X },
+{ "virtio-blk-pci", "virtio-blk", QEMU_ARCH_ALL & ~QEMU_ARCH_S390X },
 { "virtio-net-ccw", "virtio-net", QEMU_ARCH_S390X },
+{ "virtio-net-pci", "virtio-net", QEMU_ARCH_ALL & ~QEMU_ARCH_S390X },
 { "virtio-serial-ccw", "virtio-serial", QEMU_ARCH_S390X },
-{ "lsi53c895a", "lsi" },
-{ "ich9-ahci", "ahci" },
-{ "kvm-pci-assign", "pci-assign" },
-{ "e1000", "e1000-82540em" },
+{ "virtio-serial-pci", "virtio-serial", QEMU_ARCH_ALL & ~QEMU_ARCH_S390X },
 { }
 };
 
-- 
2.1.4




[Qemu-devel] [PATCH v2 3/3] qdev-monitor: add missing aliases for virtio device classes

2016-02-18 Thread Sascha Silbe
virtio-{blk,balloon,net,serial} are aliases for their actual,
architecture-dependent implementations (*-ccw on s390x, *-pci on other
architectures supporting virtio). This makes it a lot easier to craft
qemu invocations that work on all supported architectures. Complete
the set to cover all existing non-abstract virtio device classes.

For virtio-balloon, only the CCW implementation was missing.

Signed-off-by: Sascha Silbe 
---
v1->v2:
- Completely wired up all (non-abstract) virtio device classes, even
  those that still lack a CCW implementation.

Markus, does your Reviewed-by apply to this version as well? (Wasn't
sure so I didn't include it)
---
 qdev-monitor.c | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/qdev-monitor.c b/qdev-monitor.c
index 4e3681c..be6a07e 100644
--- a/qdev-monitor.c
+++ b/qdev-monitor.c
@@ -45,14 +45,33 @@ static const QDevAlias qdev_alias_table[] = {
 { "ich9-ahci", "ahci" },
 { "kvm-pci-assign", "pci-assign" },
 { "lsi53c895a", "lsi" },
+{ "virtio-9p-ccw", "virtio-9p", QEMU_ARCH_S390X },
+{ "virtio-9p-pci", "virtio-9p", QEMU_ARCH_ALL & ~QEMU_ARCH_S390X },
+{ "virtio-balloon-ccw", "virtio-balloon", QEMU_ARCH_S390X },
 { "virtio-balloon-pci", "virtio-balloon",
 QEMU_ARCH_ALL & ~QEMU_ARCH_S390X },
 { "virtio-blk-ccw", "virtio-blk", QEMU_ARCH_S390X },
 { "virtio-blk-pci", "virtio-blk", QEMU_ARCH_ALL & ~QEMU_ARCH_S390X },
+{ "virtio-gpu-ccw", "virtio-gpu", QEMU_ARCH_S390X },
+{ "virtio-gpu-pci", "virtio-gpu", QEMU_ARCH_ALL & ~QEMU_ARCH_S390X },
+{ "virtio-input-host-ccw", "virtio-input-host", QEMU_ARCH_S390X },
+{ "virtio-input-host-pci", "virtio-input-host",
+QEMU_ARCH_ALL & ~QEMU_ARCH_S390X },
+{ "virtio-keyboard-ccw", "virtio-keyboard", QEMU_ARCH_S390X },
+{ "virtio-keyboard-pci", "virtio-keyboard",
+QEMU_ARCH_ALL & ~QEMU_ARCH_S390X },
+{ "virtio-mouse-ccw", "virtio-mouse", QEMU_ARCH_S390X },
+{ "virtio-mouse-pci", "virtio-mouse", QEMU_ARCH_ALL & ~QEMU_ARCH_S390X },
 { "virtio-net-ccw", "virtio-net", QEMU_ARCH_S390X },
 { "virtio-net-pci", "virtio-net", QEMU_ARCH_ALL & ~QEMU_ARCH_S390X },
+{ "virtio-rng-ccw", "virtio-rng", QEMU_ARCH_S390X },
+{ "virtio-rng-pci", "virtio-rng", QEMU_ARCH_ALL & ~QEMU_ARCH_S390X },
+{ "virtio-scsi-ccw", "virtio-scsi", QEMU_ARCH_S390X },
+{ "virtio-scsi-pci", "virtio-scsi", QEMU_ARCH_ALL & ~QEMU_ARCH_S390X },
 { "virtio-serial-ccw", "virtio-serial", QEMU_ARCH_S390X },
 { "virtio-serial-pci", "virtio-serial", QEMU_ARCH_ALL & ~QEMU_ARCH_S390X },
+{ "virtio-tablet-ccw", "virtio-tablet", QEMU_ARCH_S390X },
+{ "virtio-tablet-pci", "virtio-tablet", QEMU_ARCH_ALL & ~QEMU_ARCH_S390X },
 { }
 };
 
-- 
2.1.4




[Qemu-devel] [PATCH v2 1/3] qdev-monitor: improve error message when alias device is unavailable

2016-02-18 Thread Sascha Silbe
When trying to instantiate an alias that points to a device class that
doesn't exist, the error message looks like qemu misunderstood the
request:

$ s390x-softmmu/qemu-system-s390x -device virtio-gpu
qemu-system-s390x: -device virtio-gpu: 'virtio-gpu-ccw' is not a valid
device model name

Special-case the error message to make it explicit that alias
expansion is going on:

$ s390x-softmmu/qemu-system-s390x -device virtio-gpu
qemu-system-s390x: -device virtio-gpu: 'virtio-gpu' (alias
'virtio-gpu-ccw') is not a valid device model name

Suggested-By: Cornelia Huck 
Signed-off-by: Sascha Silbe 
---
v1->v2: new patch
---
 qdev-monitor.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/qdev-monitor.c b/qdev-monitor.c
index 81e3ff3..e5136d7 100644
--- a/qdev-monitor.c
+++ b/qdev-monitor.c
@@ -188,6 +188,7 @@ static DeviceClass *qdev_get_device_class(const char 
**driver, Error **errp)
 {
 ObjectClass *oc;
 DeviceClass *dc;
+const char *original_name = *driver;
 
 oc = object_class_by_name(*driver);
 if (!oc) {
@@ -200,7 +201,12 @@ static DeviceClass *qdev_get_device_class(const char 
**driver, Error **errp)
 }
 
 if (!object_class_dynamic_cast(oc, TYPE_DEVICE)) {
-error_setg(errp, "'%s' is not a valid device model name", *driver);
+if (*driver != original_name) {
+error_setg(errp, "'%s' (alias '%s') is not a valid device model"
+   " name", original_name, *driver);
+} else {
+error_setg(errp, "'%s' is not a valid device model name", *driver);
+}
 return NULL;
 }
 
-- 
2.1.4




Re: [Qemu-devel] [PATCH 2/3] qdev-monitor: add missing aliases for virtio-{9p, balloon, rng, scsi}

2016-02-18 Thread Sascha Silbe
Dear Conny,

Cornelia Huck  writes:

[virtio-{gpu,input,input-hid,input-host,keyboard,mouse,tablet}]
> [For laughs and giggles, I have wired up all of these devices for ccw.
> Excluding input-host (for which I did not have a suitable evdev), I can
> specify the various devices on the commandline and get some devices in
> the guest which do... nothing :) I won't pursue this further for now,
> as I currently don't have a convincing use case beyond "because we
> can".]

Care to share the patch?


> qemu-system-s390x: -device virtio-gpu: 'virtio-gpu' (alias 'virtio-gpu-ccw') 
> is not a valid device model name
>
> would make it obvious that some alias expansion had been going on. I
> think that would be useful.

Good idea. Will include a patch for this in the next version.

Sascha
-- 
Softwareentwicklung Sascha Silbe, Niederhofenstraße 5/1, 71229 Leonberg
https://se-silbe.de/
USt-IdNr. DE281696641




[Qemu-devel] [PATCH] ppc/kvm: Tell the user what might be wrong when using bad CPU types with kvm-hv

2016-02-18 Thread Thomas Huth
Using a CPU type that does not match the host is not possible when using
the kvm-hv kernel module - the PVR is checked in the kernel function
kvm_arch_vcpu_ioctl_set_sregs_hv() and rejected with -EINVAL if it
does not match the host.
However, when the user tries to specify a non-matching CPU type, QEMU
currently only reports "kvm_init_vcpu failed: Invalid argument", and
this is of course not very helpful for the user to solve the problem.
So this patch adds a more descriptive error message that tells the
user to specify "-cpu host" instead.

Signed-off-by: Thomas Huth 
---
 target-ppc/kvm.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
index 762d6cf..6545fbe 100644
--- a/target-ppc/kvm.c
+++ b/target-ppc/kvm.c
@@ -513,6 +513,10 @@ int kvm_arch_init_vcpu(CPUState *cs)
 /* Synchronize sregs with kvm */
 ret = kvm_arch_sync_sregs(cpu);
 if (ret) {
+if (ret == -EINVAL) {
+error_report("Register sync failed... If you're using kvm-hv.ko,"
+ " only \"-cpu host\" is possible!");
+}
 return ret;
 }
 
-- 
1.8.3.1




[Qemu-devel] [PATCH] ppc/kvm: Use error_report() instead of cpu_abort() for user-triggerable errors

2016-02-18 Thread Thomas Huth
Setting the KVM_CAP_PPC_PAPR capability can fail if either the KVM
kernel module does not support it, or if the specified vCPU type
is not a 64-bit Book3-S CPU type. For example, the user can trigger
it easily with "-M pseries -cpu G2leLS" when using the kvm-pr kernel
module. So the error should not be reported with cpu_abort() since
this function is rather meant for reporting programming errors than
reporting user-triggerable errors (it prints out all CPU registers
and then calls abort() to kills the program - two things that the
normal user does not expect here) . So let's use error_report() with
exit(1) here instead.
A similar problem exists in the code that sets the KVM_CAP_PPC_EPR
capability, so while we're at it, fix that, too.

Signed-off-by: Thomas Huth 
---
 target-ppc/kvm.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
index 70ca296..762d6cf 100644
--- a/target-ppc/kvm.c
+++ b/target-ppc/kvm.c
@@ -23,6 +23,7 @@
 #include 
 
 #include "qemu-common.h"
+#include "qemu/error-report.h"
 #include "qemu/timer.h"
 #include "sysemu/sysemu.h"
 #include "sysemu/kvm.h"
@@ -1993,7 +1994,8 @@ void kvmppc_set_papr(PowerPCCPU *cpu)
 
 ret = kvm_vcpu_enable_cap(cs, KVM_CAP_PPC_PAPR, 0);
 if (ret) {
-cpu_abort(cs, "This KVM version does not support PAPR\n");
+error_report("This vCPU type or KVM version does not support PAPR");
+exit(1);
 }
 
 /* Update the capability flag so we sync the right information
@@ -2013,7 +2015,8 @@ void kvmppc_set_mpic_proxy(PowerPCCPU *cpu, int 
mpic_proxy)
 
 ret = kvm_vcpu_enable_cap(cs, KVM_CAP_PPC_EPR, 0, mpic_proxy);
 if (ret && mpic_proxy) {
-cpu_abort(cs, "This KVM version does not support EPR\n");
+error_report("This KVM version does not support EPR");
+exit(1);
 }
 }
 
-- 
1.8.3.1




Re: [Qemu-devel] [PATCH v2 7/9] i.MX: Add i.MX6 SOC implementation.

2016-02-18 Thread Jean-Christophe DUBOIS

Le 16/02/2016 22:57, Peter Maydell a écrit :

On 16 February 2016 at 21:47, Jean-Christophe DUBOIS
 wrote:

In QEMU, other Cortex A9 (Versatilepb.c, Exynos, Zynq ...) are also setting
has_el3 to false ...

So these generally are the "legacy" platforms which were
added before we ever had EL3 support in QEMU. For them it's hard
to turn the EL3 support on for the board even if in theory
it ought to be on, because we don't know what users are
running on it that we might break. With a new to QEMU board
we have an opportunity to get it right from the start.


OK, so is the "highbank" the only Qemu Cortex A9 board supporting el3 yet?



-kernel I would expect to work, though, at least if the
only issue is the interrupt controller setup. It seems
worth investigating why it goes wrong.


Well, I can boot uniprocessor (-smp 1) without trouble but if I turn 
logs on (guest_errors,unimp) I am getting a lot of


 * gic_dist_writeb: Bad offset 38x (a few at startup)
 * Ignoring attempt to switch CPSR_A flag from non-secure world with
   SCR.AW bit clear (a lot)
 * Ignoring attempt to switch CPSR_F flag from non-secure world with
   SCR.FW bit clear (a few)

I am not sure if this is a problem. Do you have some opinion on this?

When I turn SMP (-smp 2 or more), I am unable to complete the boot. As 
soon as my secondary cpu is started QEMU will continue to boot "very 
slowly" but doesn't get to the linux user prompt overnight.


JC



thanks
-- PMM





[Qemu-devel] [PATCH v2 2/2] qemu-iotests: 140: make description slightly more verbose

2016-02-18 Thread Sascha Silbe
Describe in a little more detail what the test is supposed to achieve.

Signed-off-by: Sascha Silbe 
---
Max, does this reflect your intentions well enough? I took the liberty
of re-using some of your review comments to extend the description.

v1->v2: new patch

 tests/qemu-iotests/140 | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/140 b/tests/qemu-iotests/140
index baaf64e..05e4506 100755
--- a/tests/qemu-iotests/140
+++ b/tests/qemu-iotests/140
@@ -1,6 +1,10 @@
 #!/bin/bash
 #
-# Test case for ejecting a BB with an NBD server attached to it
+# Test case for ejecting a BlockBackend with an NBD server attached to it
+#
+# Verify that the NBD server stops offering the drive when ejecting a
+# BlockDriverState tree from a BlockBackend (that is, a medium from a
+# drive) exposed via an NBD server.
 #
 # Copyright (C) 2016 Red Hat, Inc.
 #
-- 
2.1.4




[Qemu-devel] [PATCH v2 0/2] qemu-iotests: fix 140 on s390x

2016-02-18 Thread Sascha Silbe
Yet another IDE-using test crept in (commit 16dee418). Fix it by
disabling the implicit drive.

v1->v2:
- don't use any drive at all instead of replacing IDE with virtio-scsi
- split off virtio alias patch into separate series (no longer needed
  in this one)
- add patch to improve description of qemu-iotest 140


Sascha Silbe (2):
  qemu-iotests: 140: don't use IDE drive
  qemu-iotests: 140: make description slightly more verbose

 tests/qemu-iotests/140 | 8 ++--
 tests/qemu-iotests/140.out | 1 -
 2 files changed, 6 insertions(+), 3 deletions(-)

-- 
2.1.4




[Qemu-devel] [PATCH v2 1/2] qemu-iotests: 140: don't use IDE device

2016-02-18 Thread Sascha Silbe
IDE is only implemented by very few architectures (mostly PC). The
test doesn't actually need a block device attached to the
BlockBackend, so just drop it and adjust the reference output
accordingly.

Fixes: 16dee418 ("iotests: Add test for eject under NBD server")
Signed-off-by: Sascha Silbe 
---
v1->v2:
  - don't use any drive at all instead of replacing IDE with virtio-scsi

 tests/qemu-iotests/140 | 2 +-
 tests/qemu-iotests/140.out | 1 -
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/140 b/tests/qemu-iotests/140
index f78c317..baaf64e 100755
--- a/tests/qemu-iotests/140
+++ b/tests/qemu-iotests/140
@@ -49,7 +49,7 @@ _make_test_img 64k
 $QEMU_IO -c 'write -P 42 0 64k' "$TEST_IMG" | _filter_qemu_io
 
 keep_stderr=y \
-_launch_qemu -drive if=ide,media=cdrom,id=drv,file="$TEST_IMG",format=$IMGFMT \
+_launch_qemu -drive if=none,media=cdrom,id=drv,file="$TEST_IMG",format=$IMGFMT 
\
 2> >(_filter_nbd)
 
 _send_qemu_cmd $QEMU_HANDLE \
diff --git a/tests/qemu-iotests/140.out b/tests/qemu-iotests/140.out
index 72f1b4c..0409cd0 100644
--- a/tests/qemu-iotests/140.out
+++ b/tests/qemu-iotests/140.out
@@ -7,7 +7,6 @@ wrote 65536/65536 bytes at offset 0
 {"return": {}}
 read 65536/65536 bytes at offset 0
 64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
-{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": 
"DEVICE_TRAY_MOVED", "data": {"device": "drv", "tray-open": true}}
 {"return": {}}
 can't open device nbd+unix:///drv?socket=TEST_DIR/nbd: No export with name 
'drv' available
 no file open, try 'help open'
-- 
2.1.4




Re: [Qemu-devel] [PATCH v11 00/15] prune some QAPI visitor cruft (was qapi cleanups subset E)

2016-02-18 Thread Eric Blake
On 02/18/2016 01:08 PM, Markus Armbruster wrote:
> Eric Blake  writes:
> 
>> I'm still working on my documentation patches for QAPI visitors
>> (https://lists.gnu.org/archive/html/qemu-devel/2016-01/msg03504.html),
>> but am finding it easier to nuke bad code up front than to document
>> that it is bad only to later nuke it. So this pulls bits and pieces
>> of other patches that Markus I have previously posted, along with
>> some new glue, to get rid of some of the worst of the cruft.
>>

> 
> Applied to qapi-next with minor tweaks.  Diff between the final
> appended.  I'd appreciate a look-over.

Looks correct to me.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH v4 4/5] s390x/cpu: Add functions to register CPU state

2016-02-18 Thread Matthew Rosato
On 02/18/2016 04:36 AM, Igor Mammedov wrote:
> On Wed, 17 Feb 2016 15:12:34 -0500
> Matthew Rosato  wrote:
> 
>> Introduce s390_register_cpustate, which will set the
>> machine/cpu[n] link with the current CPU state.  Additionally,
>> maintain an array of state pointers indexed by CPU id for fast lookup
>> during interrupt handling.
>>
> cosmetic nit,
> you call qdev_get_machine() several time withing single function
> or when function already has machine argument. You probably shouldn't
> do it or do it once and use return value throughout function.
> 

Oops, will fix that.

>> Signed-off-by: Matthew Rosato 
>> ---
>>  hw/s390x/s390-virtio.c | 45 -
>>  target-s390x/cpu.c | 14 +-
>>  target-s390x/cpu.h |  1 +
>>  3 files changed, 50 insertions(+), 10 deletions(-)
>>
>> diff --git a/hw/s390x/s390-virtio.c b/hw/s390x/s390-virtio.c
>> index b3707f4..6a1e3f6 100644
>> --- a/hw/s390x/s390-virtio.c
>> +++ b/hw/s390x/s390-virtio.c
>> @@ -60,15 +60,35 @@
>>  #define S390_TOD_CLOCK_VALUE_MISSING0x00
>>  #define S390_TOD_CLOCK_VALUE_PRESENT0x01
>>  
>> -static S390CPU **ipi_states;
>> +static S390CPU **cpu_states;
>>  
>>  S390CPU *s390_cpu_addr2state(uint16_t cpu_addr)
>>  {
>> -if (cpu_addr >= smp_cpus) {
>> +if (cpu_addr >= max_cpus) {
>>  return NULL;
>>  }
>>  
>> -return ipi_states[cpu_addr];
>> +/* Fast lookup via CPU ID */
>> +return cpu_states[cpu_addr];
>> +}
>> +
>> +int s390_register_cpustate(uint16_t cpu_addr, S390CPU *state)
> isn't it the board responsibility to link CPU somewhere?
> I'd suggest to use for it generic device hotplug hooks
> see how get_hotplug_handler() and pc_machine_device_plug_cb() are used,
> it's executed after CPU's realize is successfully completed.
> 

OK, I'll have a look at this.

>> +{
>> +gchar *name;
>> +int r = 0;
>> +
>> +name = g_strdup_printf("cpu[%i]", cpu_addr);
>> +if (object_property_get_link(qdev_get_machine(), name, NULL)) {
>> +r = -EEXIST;
>> +goto out;
>> +}
>> +
>> +object_property_set_link(qdev_get_machine(), OBJECT(state), name,\
>> + _abort);
>> +
>> +out:
>> +g_free(name);
>> +return r;
>>  }
>>  
>>  void s390_init_ipl_dev(const char *kernel_filename,
>> @@ -98,19 +118,26 @@ void s390_init_ipl_dev(const char *kernel_filename,
>>  void s390_init_cpus(MachineState *machine)
>>  {
>>  int i;
>> +gchar *name;
>>  
>>  if (machine->cpu_model == NULL) {
>>  machine->cpu_model = "host";
>>  }
>>  
>> -ipi_states = g_malloc(sizeof(S390CPU *) * smp_cpus);
>> -
>> -for (i = 0; i < smp_cpus; i++) {
>> -S390CPU *cpu;
>> +cpu_states = g_malloc0(sizeof(S390CPU *) * max_cpus);
>>  
>> -cpu = cpu_s390x_init(machine->cpu_model);
>> +for (i = 0; i < max_cpus; i++) {
>> +name = g_strdup_printf("cpu[%i]", i);
>> +object_property_add_link(qdev_get_machine(), name, TYPE_S390_CPU,
>> + (Object **) _states[i],
>> + object_property_allow_set_link,
>> + OBJ_PROP_LINK_UNREF_ON_RELEASE,
>> + _abort);
>> +g_free(name);
>> +}
>>  
>> -ipi_states[i] = cpu;
>> +for (i = 0; i < smp_cpus; i++) {
>> +cpu_s390x_init(machine->cpu_model);
>>  }
>>  }
>>  
>> diff --git a/target-s390x/cpu.c b/target-s390x/cpu.c
>> index 5f6411f..620b2e3 100644
>> --- a/target-s390x/cpu.c
>> +++ b/target-s390x/cpu.c
>> @@ -32,6 +32,7 @@
>>  #include "trace.h"
>>  #ifndef CONFIG_USER_ONLY
>>  #include "sysemu/arch_init.h"
>> +#include "sysemu/sysemu.h"
>>  #endif
>>  
>>  #define CR0_RESET   0xE0UL
>> @@ -202,9 +203,20 @@ static void s390_cpu_realizefn(DeviceState *dev, Error 
>> **errp)
>>  CPUS390XState *env = >env;
>>  
>>  #if !defined(CONFIG_USER_ONLY)
>> +if (s390_register_cpustate(next_cpu_id, cpu) < 0) {
>> +error_setg(errp, "Cannot have more than %d CPUs", max_cpus);
>> +return;
>> +}
>>  qemu_register_reset(s390_cpu_machine_reset_cb, cpu);
>> -#endif
>> +env->cpu_num = next_cpu_id;
>> +while (next_cpu_id < max_cpus - 1) {
>> +if (!cpu_exists(++next_cpu_id)) {
>> +break;
>> +}
>> +}
> maybe it's possible to reuse cpu_get_free_index() here?
> 

Hmm, but based on your other comment, there's no need to track
next_cpu_id in this fashion afterall.  To bring these 2 points together,
how about the following:

1) In Patch 4 (this patch), drop next_cpu_id.  Since we don't allow
unplug and only do sequential cpu adds, rely on env->cpu_num =
cs->cpu_index here in realizefn.

2) For hotplug in patch 5, check against QTAILQ_LAST(, CPUTailQ) as
you suggest.

I tried this out quick, seems to work fine.

Matt




Re: [Qemu-devel] [PATCH v11 03/15] qapi: Forbid 'any' inside an alternate

2016-02-18 Thread Eric Blake
On 02/18/2016 10:03 AM, Markus Armbruster wrote:

>>> Could use a test for alternate member of alternate type.
>>
>> One step ahead of you: commit 3d0c4829 added the test
>> alternate-nested.json, and commits 44bd1276 and dd883c6f fixed the
>> parser to reject it (first by a hard-coded check, then via allow_metas[]
>> excluding alternates).  'any' is the only value that could sneak
>> through, because it is a subset of 'built-in' which allow_metas[]
>> whitelisted.
> 
> Then find_alternate_member_qtype()'s final return None is unreachable,
> correct?

Indeed, the testsuite still passes with:

diff --git i/scripts/qapi.py w/scripts/qapi.py
index 849..81d435f 100644
--- i/scripts/qapi.py
+++ w/scripts/qapi.py
@@ -345,7 +345,7 @@ def find_alternate_member_qtype(qapi_type):
 return "QTYPE_QSTRING"
 elif find_union(qapi_type):
 return "QTYPE_QDICT"
-return None
+assert False


 # Return the discriminator enum define if discriminator is specified as an


That said, even though we currently filter out unknown types before
deciding to call find_alternate_member_qtype, it's not out of the
question that future work to move ad hoc front-end tests into formal
QAPISchema .check() methods may cause us to call
find_alternate_member_qtype('unknown').  Leaving it as return None
instead of asserting would make the error message added in this patch
nicer.  Then again, refactoring would move the error message of this
patch to the .check() methods.  So I won't worry about it for now.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH 7/8] all: Clean up includes

2016-02-18 Thread Peter Maydell
On 18 February 2016 at 19:16, Eric Blake  wrote:
> On 02/18/2016 11:05 AM, Peter Maydell wrote:
>> Clean up includes so that osdep.h is included first and headers
>> which it implies are not included manually.
>>
>> This commit was created with scripts/clean-includes.
>>
>> Signed-off-by: Peter Maydell 
>> ---
>
>>  56 files changed, 2 insertions(+), 100 deletions(-)
>>
>
>> +++ b/io/channel-util.c
>> @@ -18,6 +18,7 @@
>>   *
>>   */
>>
>> +#include "qemu/osdep.h"
>>  #include "io/channel-util.h"
>>  #include "io/channel-file.h"
>>  #include "io/channel-socket.h"
>
> Ah, so the 2 insertions are due to recent file additions, after your
> last round of cleanups.
>
> Any way to automate this into checkpatch.pl for new file creation?  Then
> again, not all developers have Coccinelle installed.  But even checking
> whether the string 'include.*qemu/osdep.h' is present in a new file may
> help, even if it doesn't detect it being included out-of-order.

Looking at this is on my todo list, but TBH once this patch is in
master I don't expect much backsliding, because forgetting osdep.h
will result in your new file not compiling at all. It's only in
this transitional stage where qemu-common.h still pulls in osdep.h
that it's possible for a new file to slip in without the include.

> Leftover dead checks of HAVE_UNISTD_H and so forth; this file could use
> further manual cleanups.  For that matter, do we even need HAVE_UNISTD_H
> in slirp/slirp_config.h any more?  There's probably quite a bit of
> pruning of cruft we could do.  But as this patch was completely
> automated, I'm fine if that cleanup is done as followups.

OK. Will put it on my list to fix up later.

thanks
-- PMM



Re: [Qemu-devel] [PATCH 4/8] scripts/clean-includes: Enhance to handle header files

2016-02-18 Thread Peter Maydell
On 18 February 2016 at 19:04, Eric Blake  wrote:
> On 02/18/2016 11:05 AM, Peter Maydell wrote:
>> Enhance clean-includes to handle header files as well as .c source
>> files. For headers we merely remove all the redundant #include
>> lines, including any includes of qemu/osdep.h itself.
>>
>> There is a simple mollyguard on the include file processing to
>> skip a few key headers like osdep.h itself, to avoid producing
>> bad patches if the script is run on every file in include/.
>>
>> Signed-off-by: Peter Maydell 
>> Reviewed-by: Eric Blake 
>> ---
>>  scripts/clean-includes | 50 
>> ++
>>  1 file changed, 42 insertions(+), 8 deletions(-)
>
> I already saw your followup explaining about the spurious R-b, so let's
> make it real this time :)
>
>>
>> diff --git a/scripts/clean-includes b/scripts/clean-includes
>> index 1af1f82..737a5ce 100755
>> --- a/scripts/clean-includes
>> +++ b/scripts/clean-includes
>> @@ -1,7 +1,8 @@
>>  #!/bin/sh -e
>>  #
>>  # Clean up QEMU #include lines by ensuring that qemu/osdep.h
>> -# is the first include listed.
>> +# is the first include listed, and no headers provided by
>> +# osdep.h itself are redundantly included.
>
> Do you want to mention 'is the first include listed in .c files', now
> that this cleanup also scrubs .h files?  Or, since you go into details
> below and this is just the summary, maybe 'is the first include
> encountered'?

OK.

>>  #
>>  # Copyright (c) 2015 Linaro Limited
>
> Want to add 2016?

Most of the Copyright lines in QEMU source files are probably
out of date; I'm not convinced that touching the Copyright line
for the first patch to each source file each year really gains us
anything.

>>  #
>> @@ -22,6 +23,11 @@
>>
>>  # This script requires Coccinelle to be installed.
>>
>> +# .c files will have the osdep.h included added, and redundant
>> +# includes removed.
>> +# .h files will have redundant includes (including includes of osdep.h)
>> +# removed.
>
> Maybe a note here that "this is because any other .h file will not be
> included by a .c file until after osdep.h" ?  Or is that too verbose?

Feels a touch over-verbose to me.

>> +  case "$f" in
>> +*.c)
>> +  MODE=c
>> +  ;;
>> +
>> *include/qemu/osdep.h|include/qemu/compiler.h|include/config.h|include/standard-headers/)
>
> Spaces around | may make this more legible, and doesn't affect correctness.

OK.

>> +
>> +  if [ "$MODE" = "c" ]; then
>> +# First, use coccinelle to add qemu/osdep.h before the first existing 
>> include
>
> Should the tool name be capitalized as Coccinelle?

OK.

>> +# (this will add two lines if the file uses both "..." and <...> 
>> #includes,
>> +# but we will remove the extras in the next step)
>> +spatch  --in-place --no-show-diff --cocci-file "$COCCIFILE" "$f"
>> +
>> +# Now remove any duplicate osdep.h includes
>> +perl -n -i  -e 'print if !/#include "qemu\/osdep.h"/ || !$n++;' "$f"
>
> Why two spaces before -e?

Accidental.

> I can understand the use of perl here (detecting duplicates lines is
> doable, but a lot harder, in sed),...
>
>> +  else
>> +# Remove includes of osdep.h itself
>> +perl -n -i -e 'print if !/\s*#\s*include\s*(["<][^>"]*[">])/ ||
>> +! (grep { $_ eq $1 } qw ("qemu/osdep.h"))' "$f"
>
> ...but this could be shortened, if you wanted:
>
> sed -i '/#[[:space:]]*include[[:space:]]*["<]qemu/osdep.h[">]/d' "$f"

The next perl line (only partially visible in the diff context)
is doing basically the same job for a larger set of header files,
so I preferred to retain the same code to do the same thing,
even if in this case there's only one header in the list.

> My findings are minor; functionally, your patch is sane, so the
> accidental R-b can now be treated as real, if you don't want to respin.

Thanks. I'll fix the minor space/comment issues above, but won't
resend unless I need to redo the series for some other reason.

-- PMM



[Qemu-devel] [PULL 10/13] device_tree: qemu_fdt_getprop_cell converted to use the error API

2016-02-18 Thread Alex Williamson
From: Eric Auger 

This patch aligns the prototype with qemu_fdt_getprop. The caller
can choose whether the function self-asserts on error (passing
_fatal as Error ** argument, corresponding to the legacy behavior),
or behaves differently such as simply output a message.

In this later case the caller can use the new lenp parameter to interpret
the error if any.

Signed-off-by: Eric Auger 
Reviewed-by: Peter Crosthwaite 
Signed-off-by: Alex Williamson 
---
 device_tree.c|   21 ++---
 hw/arm/boot.c|6 --
 hw/arm/vexpress.c|6 --
 include/sysemu/device_tree.h |   14 +-
 4 files changed, 35 insertions(+), 12 deletions(-)

diff --git a/device_tree.c b/device_tree.c
index c93a615..2a40711 100644
--- a/device_tree.c
+++ b/device_tree.c
@@ -347,15 +347,22 @@ const void *qemu_fdt_getprop(void *fdt, const char 
*node_path,
 }
 
 uint32_t qemu_fdt_getprop_cell(void *fdt, const char *node_path,
-   const char *property)
+   const char *property, int *lenp, Error **errp)
 {
 int len;
-const uint32_t *p = qemu_fdt_getprop(fdt, node_path, property, ,
- _fatal);
-if (len != 4) {
-error_report("%s: %s/%s not 4 bytes long (not a cell?)",
- __func__, node_path, property);
-exit(1);
+const uint32_t *p;
+
+if (!lenp) {
+lenp = 
+}
+p = qemu_fdt_getprop(fdt, node_path, property, lenp, errp);
+if (!p) {
+return 0;
+} else if (*lenp != 4) {
+error_setg(errp, "%s: %s/%s not 4 bytes long (not a cell?)",
+   __func__, node_path, property);
+*lenp = -EINVAL;
+return 0;
 }
 return be32_to_cpu(*p);
 }
diff --git a/hw/arm/boot.c b/hw/arm/boot.c
index cce8c7c..0a56d34c 100644
--- a/hw/arm/boot.c
+++ b/hw/arm/boot.c
@@ -437,8 +437,10 @@ static int load_dtb(hwaddr addr, const struct 
arm_boot_info *binfo,
 return 0;
 }
 
-acells = qemu_fdt_getprop_cell(fdt, "/", "#address-cells");
-scells = qemu_fdt_getprop_cell(fdt, "/", "#size-cells");
+acells = qemu_fdt_getprop_cell(fdt, "/", "#address-cells",
+   NULL, _fatal);
+scells = qemu_fdt_getprop_cell(fdt, "/", "#size-cells",
+   NULL, _fatal);
 if (acells == 0 || scells == 0) {
 fprintf(stderr, "dtb file invalid (#address-cells or #size-cells 
0)\n");
 goto fail;
diff --git a/hw/arm/vexpress.c b/hw/arm/vexpress.c
index 3154aea..726c4e0 100644
--- a/hw/arm/vexpress.c
+++ b/hw/arm/vexpress.c
@@ -478,8 +478,10 @@ static void vexpress_modify_dtb(const struct arm_boot_info 
*info, void *fdt)
 uint32_t acells, scells, intc;
 const VEDBoardInfo *daughterboard = (const VEDBoardInfo *)info;
 
-acells = qemu_fdt_getprop_cell(fdt, "/", "#address-cells");
-scells = qemu_fdt_getprop_cell(fdt, "/", "#size-cells");
+acells = qemu_fdt_getprop_cell(fdt, "/", "#address-cells",
+   NULL, _fatal);
+scells = qemu_fdt_getprop_cell(fdt, "/", "#size-cells",
+   NULL, _fatal);
 intc = find_int_controller(fdt);
 if (!intc) {
 /* Not fatal, we just won't provide virtio. This will
diff --git a/include/sysemu/device_tree.h b/include/sysemu/device_tree.h
index 48bf3b5..705650a 100644
--- a/include/sysemu/device_tree.h
+++ b/include/sysemu/device_tree.h
@@ -67,8 +67,20 @@ int qemu_fdt_setprop_phandle(void *fdt, const char 
*node_path,
 const void *qemu_fdt_getprop(void *fdt, const char *node_path,
  const char *property, int *lenp,
  Error **errp);
+/**
+ * qemu_fdt_getprop_cell: retrieve the value of a given 4 byte property
+ * @fdt: pointer to the device tree blob
+ * @node_path: node path
+ * @property: name of the property to find
+ * @lenp: fdt error if any or -EINVAL if the property size is different from
+ *4 bytes, or 4 (expected length of the property) upon success.
+ * @errp: handle to an error object
+ *
+ * returns the property value on success
+ */
 uint32_t qemu_fdt_getprop_cell(void *fdt, const char *node_path,
-   const char *property);
+   const char *property, int *lenp,
+   Error **errp);
 uint32_t qemu_fdt_get_phandle(void *fdt, const char *path);
 uint32_t qemu_fdt_alloc_phandle(void *fdt);
 int qemu_fdt_nop_node(void *fdt, const char *node_path);




Re: [Qemu-devel] [PATCH v11 00/15] prune some QAPI visitor cruft (was qapi cleanups subset E)

2016-02-18 Thread Markus Armbruster
Eric Blake  writes:

> I'm still working on my documentation patches for QAPI visitors
> (https://lists.gnu.org/archive/html/qemu-devel/2016-01/msg03504.html),
> but am finding it easier to nuke bad code up front than to document
> that it is bad only to later nuke it. So this pulls bits and pieces
> of other patches that Markus I have previously posted, along with
> some new glue, to get rid of some of the worst of the cruft.
>
> v10 was here:
> https://lists.gnu.org/archive/html/qemu-devel/2016-02/msg03282.html
>
> Since then, I've folded in fixes for Markus' review comments,
> including rearranging some hunks and retitling some patches,
> and dropping the attempt to minimize generated gotos. The biggest
> changes are moving where variants get visited (during
> visit_type_FOO_fields() rather than visit_type_FOO()), dropping
> visit_type_alternate_FOO() (open-coding it during
> gen_visit_alternate()), and fixing bugs (properly handle
> visit_start_union() so that there are no bisection points where
> we are handling 'void *data' incorrectly).
>
> 001/15:[] [--] 'qapi: Simplify excess input reporting in input visitors'
> 002/15:[] [--] 'qapi: Forbid empty unions and useless alternates'
> 003/15:[down] 'qapi: Forbid 'any' inside an alternate'
> 004/15:[down] 'qapi: Add tests of complex objects within alternate'
> 005/15:[] [--] 'qapi-visit: Simplify how we visit common union members'
> 006/15:[down] 'qapi: Visit variants in visit_type_FOO_fields()'
> 007/15:[0051] [FC] 'qapi-visit: Unify struct and union visit'
> 008/15:[0012] [FC] 'qapi-visit: Less indirection in visit_type_Foo_fields()'
> 009/15:[0002] [FC] 'qapi: Adjust layout of FooList types'
> 010/15:[0004] [FC] 'qapi: Emit structs used as variants in topological order'
> 011/15:[down] 'qapi-visit: Use common idiom in gen_visit_fields_decl()'
> 012/15:[0148] [FC] 'qapi: Don't box struct branch of alternate'
> 013/15:[0074] [FC] 'qapi: Don't box branches of flat unions'
> 014/15:[down] 'qapi: Delete visit_start_union(), gen_visit_implicit_struct()'
> 015/15:[0001] [FC] 'qapi: Change visit_start_implicit_struct to 
> visit_start_alternate'

Applied to qapi-next with minor tweaks.  Diff between the final
appended.  I'd appreciate a look-over.

Thanks!

--- a/scripts/qapi-visit.py
+++ b/scripts/qapi-visit.py
@@ -41,7 +41,7 @@ static void visit_type_%(c_type)s_fields(Visitor *v, 
%(c_type)s *obj, Error **er
  c_type=typ.c_name())
 
 
-def gen_visit_struct_fields(name, base, members, variants=None):
+def gen_visit_struct_fields(name, base, members, variants):
 ret = ''
 
 if base:



Re: [Qemu-devel] [PATCH 8/8] include: Clean up includes

2016-02-18 Thread Peter Maydell
On 18 February 2016 at 19:54, Eric Blake  wrote:
> On 02/18/2016 11:05 AM, Peter Maydell wrote:
>> Clean up includes so that osdep.h is included first and headers
>> which it implies are not included manually.
>>
>> This commit was created with scripts/clean-includes.
>>
>> NB: If this commit breaks compilation for your out-of-tree
>> patchseries or fork, then you need to make sure you add
>> #include "qemu/osdep.h" to any new .c files that you have.
>>
>> Signed-off-by: Peter Maydell 
>> ---
>
>>  97 files changed, 156 deletions(-)
>
>>
>> +++ b/include/migration/qemu-file.h
>> @@ -25,7 +25,6 @@
>>  #define QEMU_FILE_H 1
>>  #include "exec/cpu-common.h"
>>
>> -#include 
>>
>>  /* This function writes a chunk of data to a file at the given position.
>
> Cleanups like this change the resulting whitespace in the file.  I don't
> personally care, but I'm pointing it out in case someone does.

Oh, you mean we end up with two blank lines in a row? Yeah,
I don't care about that either :-)

thanks
-- PMM



[Qemu-devel] [PULL 09/13] device_tree: qemu_fdt_getprop converted to use the error API

2016-02-18 Thread Alex Williamson
From: Eric Auger 

Current qemu_fdt_getprop exits if the property is not found. It is
sometimes needed to read an optional property, in which case we do
not wish to exit but simply returns a null value.

This patch converts qemu_fdt_getprop to accept an Error **, and existing
users are converted to pass _fatal. This preserves the existing
behaviour. Then to use the API with your optional semantic a null
parameter can be conveyed.

Signed-off-by: Eric Auger 
Reviewed-by: Peter Crosthwaite 
Signed-off-by: Alex Williamson 
---
 device_tree.c|   11 ++-
 include/sysemu/device_tree.h |   13 -
 2 files changed, 18 insertions(+), 6 deletions(-)

diff --git a/device_tree.c b/device_tree.c
index 2587257..c93a615 100644
--- a/device_tree.c
+++ b/device_tree.c
@@ -330,18 +330,18 @@ int qemu_fdt_setprop_string(void *fdt, const char 
*node_path,
 }
 
 const void *qemu_fdt_getprop(void *fdt, const char *node_path,
- const char *property, int *lenp)
+ const char *property, int *lenp, Error **errp)
 {
 int len;
 const void *r;
+
 if (!lenp) {
 lenp = 
 }
 r = fdt_getprop(fdt, findnode_nofail(fdt, node_path), property, lenp);
 if (!r) {
-error_report("%s: Couldn't get %s/%s: %s", __func__,
- node_path, property, fdt_strerror(*lenp));
-exit(1);
+error_setg(errp, "%s: Couldn't get %s/%s: %s", __func__,
+  node_path, property, fdt_strerror(*lenp));
 }
 return r;
 }
@@ -350,7 +350,8 @@ uint32_t qemu_fdt_getprop_cell(void *fdt, const char 
*node_path,
const char *property)
 {
 int len;
-const uint32_t *p = qemu_fdt_getprop(fdt, node_path, property, );
+const uint32_t *p = qemu_fdt_getprop(fdt, node_path, property, ,
+ _fatal);
 if (len != 4) {
 error_report("%s: %s/%s not 4 bytes long (not a cell?)",
  __func__, node_path, property);
diff --git a/include/sysemu/device_tree.h b/include/sysemu/device_tree.h
index 552df21..48bf3b5 100644
--- a/include/sysemu/device_tree.h
+++ b/include/sysemu/device_tree.h
@@ -54,8 +54,19 @@ int qemu_fdt_setprop_string(void *fdt, const char *node_path,
 int qemu_fdt_setprop_phandle(void *fdt, const char *node_path,
  const char *property,
  const char *target_node_path);
+/**
+ * qemu_fdt_getprop: retrieve the value of a given property
+ * @fdt: pointer to the device tree blob
+ * @node_path: node path
+ * @property: name of the property to find
+ * @lenp: fdt error if any or length of the property on success
+ * @errp: handle to an error object
+ *
+ * returns a pointer to the property on success and NULL on failure
+ */
 const void *qemu_fdt_getprop(void *fdt, const char *node_path,
- const char *property, int *lenp);
+ const char *property, int *lenp,
+ Error **errp);
 uint32_t qemu_fdt_getprop_cell(void *fdt, const char *node_path,
const char *property);
 uint32_t qemu_fdt_get_phandle(void *fdt, const char *path);




[Qemu-devel] [PULL 08/13] device_tree: introduce qemu_fdt_node_path

2016-02-18 Thread Alex Williamson
From: Eric Auger 

This new helper routine returns a NULL terminated array of
node paths matching a node name and a compat string.

Signed-off-by: Eric Auger 
Reviewed-by: Peter Maydell 
Signed-off-by: Alex Williamson 
---
 device_tree.c|   51 ++
 include/sysemu/device_tree.h |   18 +++
 2 files changed, 69 insertions(+)

diff --git a/device_tree.c b/device_tree.c
index 9e77c69..2587257 100644
--- a/device_tree.c
+++ b/device_tree.c
@@ -226,6 +226,57 @@ static int findnode_nofail(void *fdt, const char 
*node_path)
 return offset;
 }
 
+char **qemu_fdt_node_path(void *fdt, const char *name, char *compat,
+  Error **errp)
+{
+int offset, len, ret;
+const char *iter_name;
+unsigned int path_len = 16, n = 0;
+GSList *path_list = NULL, *iter;
+char **path_array;
+
+offset = fdt_node_offset_by_compatible(fdt, -1, compat);
+
+while (offset >= 0) {
+iter_name = fdt_get_name(fdt, offset, );
+if (!iter_name) {
+offset = len;
+break;
+}
+if (!strcmp(iter_name, name)) {
+char *path;
+
+path = g_malloc(path_len);
+while ((ret = fdt_get_path(fdt, offset, path, path_len))
+  == -FDT_ERR_NOSPACE) {
+path_len += 16;
+path = g_realloc(path, path_len);
+}
+path_list = g_slist_prepend(path_list, path);
+n++;
+}
+offset = fdt_node_offset_by_compatible(fdt, offset, compat);
+}
+
+if (offset < 0 && offset != -FDT_ERR_NOTFOUND) {
+error_setg(errp, "%s: abort parsing dt for %s/%s: %s",
+   __func__, name, compat, fdt_strerror(offset));
+g_slist_free_full(path_list, g_free);
+return NULL;
+}
+
+path_array = g_new(char *, n + 1);
+path_array[n--] = NULL;
+
+for (iter = path_list; iter; iter = iter->next) {
+path_array[n--] = iter->data;
+}
+
+g_slist_free(path_list);
+
+return path_array;
+}
+
 int qemu_fdt_setprop(void *fdt, const char *node_path,
  const char *property, const void *val, int size)
 {
diff --git a/include/sysemu/device_tree.h b/include/sysemu/device_tree.h
index 62093ba..552df21 100644
--- a/include/sysemu/device_tree.h
+++ b/include/sysemu/device_tree.h
@@ -25,6 +25,24 @@ void *load_device_tree(const char *filename_path, int 
*sizep);
 void *load_device_tree_from_sysfs(void);
 #endif
 
+/**
+ * qemu_fdt_node_path: return the paths of nodes matching a given
+ * name and compat string
+ * @fdt: pointer to the dt blob
+ * @name: node name
+ * @compat: compatibility string
+ * @errp: handle to an error object
+ *
+ * returns a newly allocated NULL-terminated array of node paths.
+ * Use g_strfreev() to free it. If one or more nodes were found, the
+ * array contains the path of each node and the last element equals to
+ * NULL. If there is no error but no matching node was found, the
+ * returned array contains a single element equal to NULL. If an error
+ * was encountered when parsing the blob, the function returns NULL
+ */
+char **qemu_fdt_node_path(void *fdt, const char *name, char *compat,
+  Error **errp);
+
 int qemu_fdt_setprop(void *fdt, const char *node_path,
  const char *property, const void *val, int size);
 int qemu_fdt_setprop_cell(void *fdt, const char *node_path,




[Qemu-devel] [PULL 12/13] hw/arm/sysbus-fdt: enable amd-xgbe dynamic instantiation

2016-02-18 Thread Alex Williamson
From: Eric Auger 

This patch allows the instantiation of the vfio-amd-xgbe device
from the QEMU command line (-device vfio-amd-xgbe,host="").

The guest is exposed with a device tree node that combines the description
of both XGBE and PHY (representation supported from 4.2 onwards kernel):
Documentation/devicetree/bindings/net/amd-xgbe.txt.

There are 5 register regions, 6 interrupts including 4 optional
edge-sensitive per-channel interrupts.

Some property values are inherited from host device tree. Host device tree
must feature a combined XGBE/PHY representation (>= 4.2 host kernel).

2 clock nodes (dma and ptp) also are created. It is checked those clocks
are fixed on host side.

AMD XGBE node creation function has a dependency on vfio Linux header and
more generally node creation function for VFIO platform devices only make
sense with CONFIG_LINUX so let's protect this code with #ifdef CONFIG_LINUX.

Signed-off-by: Eric Auger 
Reviewed-by: Peter Maydell 
Signed-off-by: Alex Williamson 
---
 hw/arm/sysbus-fdt.c |  194 +--
 1 file changed, 188 insertions(+), 6 deletions(-)

diff --git a/hw/arm/sysbus-fdt.c b/hw/arm/sysbus-fdt.c
index 8575cfe..9920388 100644
--- a/hw/arm/sysbus-fdt.c
+++ b/hw/arm/sysbus-fdt.c
@@ -23,6 +23,10 @@
 
 #include "qemu/osdep.h"
 #include 
+#include "qemu-common.h"
+#ifdef CONFIG_LINUX
+#include 
+#endif
 #include "hw/arm/sysbus-fdt.h"
 #include "qemu/error-report.h"
 #include "sysemu/device_tree.h"
@@ -30,6 +34,7 @@
 #include "sysemu/sysemu.h"
 #include "hw/vfio/vfio-platform.h"
 #include "hw/vfio/vfio-calxeda-xgmac.h"
+#include "hw/vfio/vfio-amd-xgbe.h"
 #include "hw/arm/fdt.h"
 
 /*
@@ -65,6 +70,8 @@ typedef struct HostProperty {
 bool optional;
 } HostProperty;
 
+#ifdef CONFIG_LINUX
+
 /**
  * copy_properties_from_host
  *
@@ -127,12 +134,9 @@ static HostProperty clock_copied_properties[] = {
  * @host_phandle: phandle of the clock in host device tree
  * @guest_phandle: phandle to assign to the guest node
  */
-void fdt_build_clock_node(void *host_fdt, void *guest_fdt,
- uint32_t host_phandle,
- uint32_t guest_phandle);
-void fdt_build_clock_node(void *host_fdt, void *guest_fdt,
- uint32_t host_phandle,
- uint32_t guest_phandle)
+static void fdt_build_clock_node(void *host_fdt, void *guest_fdt,
+uint32_t host_phandle,
+uint32_t guest_phandle)
 {
 char *node_path = NULL;
 char *nodename;
@@ -177,6 +181,28 @@ void fdt_build_clock_node(void *host_fdt, void *guest_fdt,
 g_free(node_path);
 }
 
+/**
+ * sysfs_to_dt_name: convert the name found in sysfs into the node name
+ * for instance e090.xgmac is converted into xgmac@e090
+ * @sysfs_name: directory name in sysfs
+ *
+ * returns the device tree name upon success or NULL in case the sysfs name
+ * does not match the expected format
+ */
+static char *sysfs_to_dt_name(const char *sysfs_name)
+{
+gchar **substrings =  g_strsplit(sysfs_name, ".", 2);
+char *dt_name = NULL;
+
+if (!substrings || !substrings[0] || !substrings[1]) {
+goto out;
+}
+dt_name = g_strdup_printf("%s@%s", substrings[1], substrings[0]);
+out:
+g_strfreev(substrings);
+return dt_name;
+}
+
 /* Device Specific Code */
 
 /**
@@ -244,9 +270,165 @@ fail_reg:
 return ret;
 }
 
+/* AMD xgbe properties whose values are copied/pasted from host */
+static HostProperty amd_xgbe_copied_properties[] = {
+{"compatible", false},
+{"dma-coherent", true},
+{"amd,per-channel-interrupt", true},
+{"phy-mode", false},
+{"mac-address", true},
+{"amd,speed-set", false},
+{"amd,serdes-blwc", true},
+{"amd,serdes-cdr-rate", true},
+{"amd,serdes-pq-skew", true},
+{"amd,serdes-tx-amp", true},
+{"amd,serdes-dfe-tap-config", true},
+{"amd,serdes-dfe-tap-enable", true},
+{"clock-names", false},
+};
+
+/**
+ * add_amd_xgbe_fdt_node
+ *
+ * Generates the combined xgbe/phy node following kernel >=4.2
+ * binding documentation:
+ * Documentation/devicetree/bindings/net/amd-xgbe.txt:
+ * Also 2 clock nodes are created (dma and ptp)
+ *
+ * Asserts in case of error
+ */
+static int add_amd_xgbe_fdt_node(SysBusDevice *sbdev, void *opaque)
+{
+PlatformBusFDTData *data = opaque;
+PlatformBusDevice *pbus = data->pbus;
+VFIOPlatformDevice *vdev = VFIO_PLATFORM_DEVICE(sbdev);
+VFIODevice *vbasedev = >vbasedev;
+VFIOINTp *intp;
+const char *parent_node = data->pbus_node_name;
+char **node_path, *nodename, *dt_name;
+void *guest_fdt = data->fdt, *host_fdt;
+const void *r;
+int i, prop_len;
+uint32_t *irq_attr, *reg_attr, *host_clock_phandles;
+uint64_t mmio_base, irq_number;
+uint32_t guest_clock_phandles[2];
+
+host_fdt = 

[Qemu-devel] [PULL 11/13] hw/arm/sysbus-fdt: helpers for clock node generation

2016-02-18 Thread Alex Williamson
From: Eric Auger 

Some passthrough'ed devices depend on clock nodes. Those need to be
generated in the guest device tree. This patch introduces some helpers
to build a clock node from information retrieved in the host device tree.

- copy_properties_from_host copies properties from a host device tree
  node to a guest device tree node
- fdt_build_clock_node builds a guest clock node and checks the host
  fellow clock is a fixed one.

fdt_build_clock_node will become static as soon as it gets used. A
dummy pre-declaration is needed for compilation of this patch.

Signed-off-by: Eric Auger 
Reviewed-by: Peter Maydell 
Signed-off-by: Alex Williamson 
---
 hw/arm/sysbus-fdt.c |  120 +++
 1 file changed, 120 insertions(+)

diff --git a/hw/arm/sysbus-fdt.c b/hw/arm/sysbus-fdt.c
index 68a3de5..8575cfe 100644
--- a/hw/arm/sysbus-fdt.c
+++ b/hw/arm/sysbus-fdt.c
@@ -22,6 +22,7 @@
  */
 
 #include "qemu/osdep.h"
+#include 
 #include "hw/arm/sysbus-fdt.h"
 #include "qemu/error-report.h"
 #include "sysemu/device_tree.h"
@@ -57,6 +58,125 @@ typedef struct NodeCreationPair {
 int (*add_fdt_node_fn)(SysBusDevice *sbdev, void *opaque);
 } NodeCreationPair;
 
+/* helpers */
+
+typedef struct HostProperty {
+const char *name;
+bool optional;
+} HostProperty;
+
+/**
+ * copy_properties_from_host
+ *
+ * copies properties listed in an array from host device tree to
+ * guest device tree. If a non optional property is not found, the
+ * function asserts. An optional property is ignored if not found
+ * in the host device tree.
+ * @props: array of HostProperty to copy
+ * @nb_props: number of properties in the array
+ * @host_dt: host device tree blob
+ * @guest_dt: guest device tree blob
+ * @node_path: host dt node path where the property is supposed to be
+  found
+ * @nodename: guest node name the properties should be added to
+ */
+static void copy_properties_from_host(HostProperty *props, int nb_props,
+  void *host_fdt, void *guest_fdt,
+  char *node_path, char *nodename)
+{
+int i, prop_len;
+const void *r;
+Error *err = NULL;
+
+for (i = 0; i < nb_props; i++) {
+r = qemu_fdt_getprop(host_fdt, node_path,
+ props[i].name,
+ _len,
+ props[i].optional ?  : _fatal);
+if (r) {
+qemu_fdt_setprop(guest_fdt, nodename,
+ props[i].name, r, prop_len);
+} else {
+if (prop_len != -FDT_ERR_NOTFOUND) {
+/* optional property not returned although property exists */
+error_report_err(err);
+} else {
+error_free(err);
+}
+}
+}
+}
+
+/* clock properties whose values are copied/pasted from host */
+static HostProperty clock_copied_properties[] = {
+{"compatible", false},
+{"#clock-cells", false},
+{"clock-frequency", true},
+{"clock-output-names", true},
+};
+
+/**
+ * fdt_build_clock_node
+ *
+ * Build a guest clock node, used as a dependency from a passthrough'ed
+ * device. Most information are retrieved from the host clock node.
+ * Also check the host clock is a fixed one.
+ *
+ * @host_fdt: host device tree blob from which info are retrieved
+ * @guest_fdt: guest device tree blob where the clock node is added
+ * @host_phandle: phandle of the clock in host device tree
+ * @guest_phandle: phandle to assign to the guest node
+ */
+void fdt_build_clock_node(void *host_fdt, void *guest_fdt,
+ uint32_t host_phandle,
+ uint32_t guest_phandle);
+void fdt_build_clock_node(void *host_fdt, void *guest_fdt,
+ uint32_t host_phandle,
+ uint32_t guest_phandle)
+{
+char *node_path = NULL;
+char *nodename;
+const void *r;
+int ret, node_offset, prop_len, path_len = 16;
+
+node_offset = fdt_node_offset_by_phandle(host_fdt, host_phandle);
+if (node_offset <= 0) {
+error_setg(_fatal,
+   "not able to locate clock handle %d in host device tree",
+   host_phandle);
+}
+node_path = g_malloc(path_len);
+while ((ret = fdt_get_path(host_fdt, node_offset, node_path, path_len))
+== -FDT_ERR_NOSPACE) {
+path_len += 16;
+node_path = g_realloc(node_path, path_len);
+}
+if (ret < 0) {
+error_setg(_fatal,
+   "not able to retrieve node path for clock handle %d",
+   host_phandle);
+}
+
+r = qemu_fdt_getprop(host_fdt, node_path, "compatible", _len,
+ _fatal);
+if (strcmp(r, "fixed-clock")) {
+error_setg(_fatal,
+   "clock handle %d is not a fixed 

[Qemu-devel] [PULL 07/13] device_tree: introduce load_device_tree_from_sysfs

2016-02-18 Thread Alex Williamson
From: Eric Auger 

This function returns the host device tree blob from sysfs
(/proc/device-tree). It uses a recursive function inspired
from dtc read_fstree.

Signed-off-by: Eric Auger 
Reviewed-by: Peter Maydell 
Signed-off-by: Alex Williamson 
---
 device_tree.c|  100 ++
 include/sysemu/device_tree.h |8 +++
 2 files changed, 108 insertions(+)

diff --git a/device_tree.c b/device_tree.c
index b1ad836..9e77c69 100644
--- a/device_tree.c
+++ b/device_tree.c
@@ -13,6 +13,10 @@
 
 #include "qemu/osdep.h"
 
+#ifdef CONFIG_LINUX
+#include 
+#endif
+
 #include "qemu-common.h"
 #include "qemu/error-report.h"
 #include "sysemu/device_tree.h"
@@ -112,6 +116,102 @@ fail:
 return NULL;
 }
 
+#ifdef CONFIG_LINUX
+
+#define SYSFS_DT_BASEDIR "/proc/device-tree"
+
+/**
+ * read_fstree: this function is inspired from dtc read_fstree
+ * @fdt: preallocated fdt blob buffer, to be populated
+ * @dirname: directory to scan under SYSFS_DT_BASEDIR
+ * the search is recursive and the tree is searched down to the
+ * leaves (property files).
+ *
+ * the function asserts in case of error
+ */
+static void read_fstree(void *fdt, const char *dirname)
+{
+DIR *d;
+struct dirent *de;
+struct stat st;
+const char *root_dir = SYSFS_DT_BASEDIR;
+const char *parent_node;
+
+if (strstr(dirname, root_dir) != dirname) {
+error_setg(_fatal, "%s: %s must be searched within %s",
+   __func__, dirname, root_dir);
+}
+parent_node = [strlen(SYSFS_DT_BASEDIR)];
+
+d = opendir(dirname);
+if (!d) {
+error_setg(_fatal, "%s cannot open %s", __func__, dirname);
+}
+
+while ((de = readdir(d)) != NULL) {
+char *tmpnam;
+
+if (!g_strcmp0(de->d_name, ".")
+|| !g_strcmp0(de->d_name, "..")) {
+continue;
+}
+
+tmpnam = g_strdup_printf("%s/%s", dirname, de->d_name);
+
+if (lstat(tmpnam, ) < 0) {
+error_setg(_fatal, "%s cannot lstat %s", __func__, tmpnam);
+}
+
+if (S_ISREG(st.st_mode)) {
+gchar *val;
+gsize len;
+
+if (!g_file_get_contents(tmpnam, , , NULL)) {
+error_setg(_fatal, "%s not able to extract info from %s",
+   __func__, tmpnam);
+}
+
+if (strlen(parent_node) > 0) {
+qemu_fdt_setprop(fdt, parent_node,
+ de->d_name, val, len);
+} else {
+qemu_fdt_setprop(fdt, "/", de->d_name, val, len);
+}
+g_free(val);
+} else if (S_ISDIR(st.st_mode)) {
+char *node_name;
+
+node_name = g_strdup_printf("%s/%s",
+parent_node, de->d_name);
+qemu_fdt_add_subnode(fdt, node_name);
+g_free(node_name);
+read_fstree(fdt, tmpnam);
+}
+
+g_free(tmpnam);
+}
+
+closedir(d);
+}
+
+/* load_device_tree_from_sysfs: extract the dt blob from host sysfs */
+void *load_device_tree_from_sysfs(void)
+{
+void *host_fdt;
+int host_fdt_size;
+
+host_fdt = create_device_tree(_fdt_size);
+read_fstree(host_fdt, SYSFS_DT_BASEDIR);
+if (fdt_check_header(host_fdt)) {
+error_setg(_fatal,
+   "%s host device tree extracted into memory is invalid",
+   __func__);
+}
+return host_fdt;
+}
+
+#endif /* CONFIG_LINUX */
+
 static int findnode_nofail(void *fdt, const char *node_path)
 {
 int offset;
diff --git a/include/sysemu/device_tree.h b/include/sysemu/device_tree.h
index 359e143..62093ba 100644
--- a/include/sysemu/device_tree.h
+++ b/include/sysemu/device_tree.h
@@ -16,6 +16,14 @@
 
 void *create_device_tree(int *sizep);
 void *load_device_tree(const char *filename_path, int *sizep);
+#ifdef CONFIG_LINUX
+/**
+ * load_device_tree_from_sysfs: reads the device tree information in the
+ * /proc/device-tree directory and return the corresponding binary blob
+ * buffer pointer. Asserts in case of error.
+ */
+void *load_device_tree_from_sysfs(void);
+#endif
 
 int qemu_fdt_setprop(void *fdt, const char *node_path,
  const char *property, const void *val, int size);




[Qemu-devel] [PULL 06/13] hw/vfio/platform: amd-xgbe device

2016-02-18 Thread Alex Williamson
From: Eric Auger 

This patch introduces the amd-xgbe VFIO platform device. It
allows the guest to do passthrough on a device exposing an
"amd,xgbe-seattle-v1a" compat string.

Signed-off-by: Eric Auger 
Reviewed-by: Alex Bennée 
Signed-off-by: Alex Williamson 
---
 hw/vfio/Makefile.objs   |1 +
 hw/vfio/amd-xgbe.c  |   55 +++
 include/hw/vfio/vfio-amd-xgbe.h |   51 
 3 files changed, 107 insertions(+)
 create mode 100644 hw/vfio/amd-xgbe.c
 create mode 100644 include/hw/vfio/vfio-amd-xgbe.h

diff --git a/hw/vfio/Makefile.objs b/hw/vfio/Makefile.objs
index d324863..ceddbb8 100644
--- a/hw/vfio/Makefile.objs
+++ b/hw/vfio/Makefile.objs
@@ -3,4 +3,5 @@ obj-$(CONFIG_SOFTMMU) += common.o
 obj-$(CONFIG_PCI) += pci.o pci-quirks.o
 obj-$(CONFIG_SOFTMMU) += platform.o
 obj-$(CONFIG_SOFTMMU) += calxeda-xgmac.o
+obj-$(CONFIG_SOFTMMU) += amd-xgbe.o
 endif
diff --git a/hw/vfio/amd-xgbe.c b/hw/vfio/amd-xgbe.c
new file mode 100644
index 000..53451eb
--- /dev/null
+++ b/hw/vfio/amd-xgbe.c
@@ -0,0 +1,55 @@
+/*
+ * AMD XGBE VFIO device
+ *
+ * Copyright Linaro Limited, 2015
+ *
+ * Authors:
+ *  Eric Auger 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#include "hw/vfio/vfio-amd-xgbe.h"
+
+static void amd_xgbe_realize(DeviceState *dev, Error **errp)
+{
+VFIOPlatformDevice *vdev = VFIO_PLATFORM_DEVICE(dev);
+VFIOAmdXgbeDeviceClass *k = VFIO_AMD_XGBE_DEVICE_GET_CLASS(dev);
+
+vdev->compat = g_strdup("amd,xgbe-seattle-v1a");
+
+k->parent_realize(dev, errp);
+}
+
+static const VMStateDescription vfio_platform_amd_xgbe_vmstate = {
+.name = TYPE_VFIO_AMD_XGBE,
+.unmigratable = 1,
+};
+
+static void vfio_amd_xgbe_class_init(ObjectClass *klass, void *data)
+{
+DeviceClass *dc = DEVICE_CLASS(klass);
+VFIOAmdXgbeDeviceClass *vcxc =
+VFIO_AMD_XGBE_DEVICE_CLASS(klass);
+vcxc->parent_realize = dc->realize;
+dc->realize = amd_xgbe_realize;
+dc->desc = "VFIO AMD XGBE";
+dc->vmsd = _platform_amd_xgbe_vmstate;
+}
+
+static const TypeInfo vfio_amd_xgbe_dev_info = {
+.name = TYPE_VFIO_AMD_XGBE,
+.parent = TYPE_VFIO_PLATFORM,
+.instance_size = sizeof(VFIOAmdXgbeDevice),
+.class_init = vfio_amd_xgbe_class_init,
+.class_size = sizeof(VFIOAmdXgbeDeviceClass),
+};
+
+static void register_amd_xgbe_dev_type(void)
+{
+type_register_static(_amd_xgbe_dev_info);
+}
+
+type_init(register_amd_xgbe_dev_type)
diff --git a/include/hw/vfio/vfio-amd-xgbe.h b/include/hw/vfio/vfio-amd-xgbe.h
new file mode 100644
index 000..9fff65e
--- /dev/null
+++ b/include/hw/vfio/vfio-amd-xgbe.h
@@ -0,0 +1,51 @@
+/*
+ * VFIO AMD XGBE device
+ *
+ * Copyright Linaro Limited, 2015
+ *
+ * Authors:
+ *  Eric Auger 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#ifndef HW_VFIO_VFIO_AMD_XGBE_H
+#define HW_VFIO_VFIO_AMD_XGBE_H
+
+#include "hw/vfio/vfio-platform.h"
+
+#define TYPE_VFIO_AMD_XGBE "vfio-amd-xgbe"
+
+/**
+ * This device exposes:
+ * - 5 MMIO regions: MAC, PCS, SerDes Rx/Tx regs,
+ SerDes Integration Registers 1/2 & 2/2
+ * - 2 level sensitive IRQs and optional DMA channel IRQs
+ */
+struct VFIOAmdXgbeDevice {
+VFIOPlatformDevice vdev;
+};
+
+typedef struct VFIOAmdXgbeDevice VFIOAmdXgbeDevice;
+
+struct VFIOAmdXgbeDeviceClass {
+/*< private >*/
+VFIOPlatformDeviceClass parent_class;
+/*< public >*/
+DeviceRealize parent_realize;
+};
+
+typedef struct VFIOAmdXgbeDeviceClass VFIOAmdXgbeDeviceClass;
+
+#define VFIO_AMD_XGBE_DEVICE(obj) \
+ OBJECT_CHECK(VFIOAmdXgbeDevice, (obj), TYPE_VFIO_AMD_XGBE)
+#define VFIO_AMD_XGBE_DEVICE_CLASS(klass) \
+ OBJECT_CLASS_CHECK(VFIOAmdXgbeDeviceClass, (klass), \
+TYPE_VFIO_AMD_XGBE)
+#define VFIO_AMD_XGBE_DEVICE_GET_CLASS(obj) \
+ OBJECT_GET_CLASS(VFIOAmdXgbeDeviceClass, (obj), \
+  TYPE_VFIO_AMD_XGBE)
+
+#endif




[Qemu-devel] [PULL 04/13] pcie_aer: expose pcie_aer_msg() interface

2016-02-18 Thread Alex Williamson
From: Chen Fan 

For vfio device, we need to propagate the aer error to
Guest OS. we use the pcie_aer_msg() to send aer error
to guest.

Signed-off-by: Chen Fan 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Alex Williamson 
---
 hw/pci/pcie_aer.c |2 +-
 include/hw/pci/pcie_aer.h |1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/hw/pci/pcie_aer.c b/hw/pci/pcie_aer.c
index 8043020..e2d4e68 100644
--- a/hw/pci/pcie_aer.c
+++ b/hw/pci/pcie_aer.c
@@ -371,7 +371,7 @@ static void pcie_aer_msg_root_port(PCIDevice *dev, const 
PCIEAERMsg *msg)
  *
  * Walk up the bus tree from the device, propagate the error message.
  */
-static void pcie_aer_msg(PCIDevice *dev, const PCIEAERMsg *msg)
+void pcie_aer_msg(PCIDevice *dev, const PCIEAERMsg *msg)
 {
 uint8_t type;
 
diff --git a/include/hw/pci/pcie_aer.h b/include/hw/pci/pcie_aer.h
index 156acb0..c2ee4e2 100644
--- a/include/hw/pci/pcie_aer.h
+++ b/include/hw/pci/pcie_aer.h
@@ -102,5 +102,6 @@ void pcie_aer_root_write_config(PCIDevice *dev,
 
 /* error injection */
 int pcie_aer_inject_error(PCIDevice *dev, const PCIEAERErr *err);
+void pcie_aer_msg(PCIDevice *dev, const PCIEAERMsg *msg);
 
 #endif /* QEMU_PCIE_AER_H */




[Qemu-devel] [PULL 03/13] aer: impove pcie_aer_init to support vfio device

2016-02-18 Thread Alex Williamson
From: Chen Fan 

pcie_aer_init was used to emulate an aer capability for pcie device,
but for vfio device, the aer config space size is mutable and is not
always equal to PCI_ERR_SIZEOF(0x48). it depends on where the TLP Prefix
register required, so here we add a size argument.

Signed-off-by: Chen Fan 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Alex Williamson 
---
 hw/pci-bridge/ioh3420.c|2 +-
 hw/pci-bridge/xio3130_downstream.c |2 +-
 hw/pci-bridge/xio3130_upstream.c   |2 +-
 hw/pci/pcie_aer.c  |4 ++--
 include/hw/pci/pcie_aer.h  |2 +-
 5 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/hw/pci-bridge/ioh3420.c b/hw/pci-bridge/ioh3420.c
index 8ac4240..e62eefb 100644
--- a/hw/pci-bridge/ioh3420.c
+++ b/hw/pci-bridge/ioh3420.c
@@ -130,7 +130,7 @@ static int ioh3420_initfn(PCIDevice *d)
 goto err_pcie_cap;
 }
 pcie_cap_root_init(d);
-rc = pcie_aer_init(d, IOH_EP_AER_OFFSET);
+rc = pcie_aer_init(d, IOH_EP_AER_OFFSET, PCI_ERR_SIZEOF);
 if (rc < 0) {
 goto err;
 }
diff --git a/hw/pci-bridge/xio3130_downstream.c 
b/hw/pci-bridge/xio3130_downstream.c
index 9eb3d88..8458790 100644
--- a/hw/pci-bridge/xio3130_downstream.c
+++ b/hw/pci-bridge/xio3130_downstream.c
@@ -93,7 +93,7 @@ static int xio3130_downstream_initfn(PCIDevice *d)
 goto err_pcie_cap;
 }
 pcie_cap_arifwd_init(d);
-rc = pcie_aer_init(d, XIO3130_AER_OFFSET);
+rc = pcie_aer_init(d, XIO3130_AER_OFFSET, PCI_ERR_SIZEOF);
 if (rc < 0) {
 goto err;
 }
diff --git a/hw/pci-bridge/xio3130_upstream.c b/hw/pci-bridge/xio3130_upstream.c
index 7d255a6..c7fd397 100644
--- a/hw/pci-bridge/xio3130_upstream.c
+++ b/hw/pci-bridge/xio3130_upstream.c
@@ -82,7 +82,7 @@ static int xio3130_upstream_initfn(PCIDevice *d)
 }
 pcie_cap_flr_init(d);
 pcie_cap_deverr_init(d);
-rc = pcie_aer_init(d, XIO3130_AER_OFFSET);
+rc = pcie_aer_init(d, XIO3130_AER_OFFSET, PCI_ERR_SIZEOF);
 if (rc < 0) {
 goto err;
 }
diff --git a/hw/pci/pcie_aer.c b/hw/pci/pcie_aer.c
index a9d9d06..8043020 100644
--- a/hw/pci/pcie_aer.c
+++ b/hw/pci/pcie_aer.c
@@ -95,12 +95,12 @@ static void aer_log_clear_all_err(PCIEAERLog *aer_log)
 aer_log->log_num = 0;
 }
 
-int pcie_aer_init(PCIDevice *dev, uint16_t offset)
+int pcie_aer_init(PCIDevice *dev, uint16_t offset, uint16_t size)
 {
 PCIExpressDevice *exp;
 
 pcie_add_capability(dev, PCI_EXT_CAP_ID_ERR, PCI_ERR_VER,
-offset, PCI_ERR_SIZEOF);
+offset, size);
 exp = >exp;
 exp->aer_cap = offset;
 
diff --git a/include/hw/pci/pcie_aer.h b/include/hw/pci/pcie_aer.h
index 2fb8388..156acb0 100644
--- a/include/hw/pci/pcie_aer.h
+++ b/include/hw/pci/pcie_aer.h
@@ -87,7 +87,7 @@ struct PCIEAERErr {
 
 extern const VMStateDescription vmstate_pcie_aer_log;
 
-int pcie_aer_init(PCIDevice *dev, uint16_t offset);
+int pcie_aer_init(PCIDevice *dev, uint16_t offset, uint16_t size);
 void pcie_aer_exit(PCIDevice *dev);
 void pcie_aer_write_config(PCIDevice *dev,
uint32_t addr, uint32_t val, int len);




[Qemu-devel] [PULL 05/13] vfio/pci: replace 1 with PCI_CAP_LIST_NEXT to make code self-explain

2016-02-18 Thread Alex Williamson
From: Wei Yang 

Use the macro PCI_CAP_LIST_NEXT instead of 1, so that the code would be
more self-explain.

This patch makes this change and also fixs one typo in comment.

Signed-off-by: Wei Yang 
Signed-off-by: Alex Williamson 
---
 hw/vfio/pci.c |8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index e671506..dc5fa9f 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -1509,7 +1509,7 @@ static uint8_t vfio_std_cap_max_size(PCIDevice *pdev, 
uint8_t pos)
 uint16_t next = PCI_CONFIG_SPACE_SIZE;
 
 for (tmp = pdev->config[PCI_CAPABILITY_LIST]; tmp;
- tmp = pdev->config[tmp + 1]) {
+ tmp = pdev->config[tmp + PCI_CAP_LIST_NEXT]) {
 if (tmp > pos && tmp < next) {
 next = tmp;
 }
@@ -1698,7 +1698,7 @@ static int vfio_add_std_cap(VFIOPCIDevice *vdev, uint8_t 
pos)
 int ret;
 
 cap_id = pdev->config[pos];
-next = pdev->config[pos + 1];
+next = pdev->config[pos + PCI_CAP_LIST_NEXT];
 
 /*
  * If it becomes important to configure capabilities to their actual
@@ -1712,7 +1712,7 @@ static int vfio_add_std_cap(VFIOPCIDevice *vdev, uint8_t 
pos)
  * pci_add_capability always inserts the new capability at the head
  * of the chain.  Therefore to end up with a chain that matches the
  * physical device, we insert from the end by making this recursive.
- * This is also why we pre-caclulate size above as cached config space
+ * This is also why we pre-calculate size above as cached config space
  * will be changed as we unwind the stack.
  */
 if (next) {
@@ -1728,7 +1728,7 @@ static int vfio_add_std_cap(VFIOPCIDevice *vdev, uint8_t 
pos)
 }
 
 /* Use emulated next pointer to allow dropping caps */
-pci_set_byte(vdev->emulated_config_bits + pos + 1, 0xff);
+pci_set_byte(vdev->emulated_config_bits + pos + PCI_CAP_LIST_NEXT, 0xff);
 
 switch (cap_id) {
 case PCI_CAP_ID_MSI:




[Qemu-devel] [PULL 13/13] hw/arm/sysbus-fdt: remove qemu_fdt_setprop returned value check

2016-02-18 Thread Alex Williamson
From: Eric Auger 

qemu_fdt_setprop asserts in case of error hence no need to check
the returned value.

Signed-off-by: Eric Auger 
Reviewed-by: Peter Maydell 
Signed-off-by: Alex Williamson 
---
 hw/arm/sysbus-fdt.c |   19 +--
 1 file changed, 5 insertions(+), 14 deletions(-)

diff --git a/hw/arm/sysbus-fdt.c b/hw/arm/sysbus-fdt.c
index 9920388..04afeae 100644
--- a/hw/arm/sysbus-fdt.c
+++ b/hw/arm/sysbus-fdt.c
@@ -217,7 +217,7 @@ static int add_calxeda_midway_xgmac_fdt_node(SysBusDevice 
*sbdev, void *opaque)
 PlatformBusDevice *pbus = data->pbus;
 void *fdt = data->fdt;
 const char *parent_node = data->pbus_node_name;
-int compat_str_len, i, ret = -1;
+int compat_str_len, i;
 char *nodename;
 uint32_t *irq_attr, *reg_attr;
 uint64_t mmio_base, irq_number;
@@ -242,12 +242,8 @@ static int add_calxeda_midway_xgmac_fdt_node(SysBusDevice 
*sbdev, void *opaque)
 reg_attr[2 * i + 1] = cpu_to_be32(
 memory_region_size(>regions[i]->mem));
 }
-ret = qemu_fdt_setprop(fdt, nodename, "reg", reg_attr,
-   vbasedev->num_regions * 2 * sizeof(uint32_t));
-if (ret) {
-error_report("could not set reg property of node %s", nodename);
-goto fail_reg;
-}
+qemu_fdt_setprop(fdt, nodename, "reg", reg_attr,
+ vbasedev->num_regions * 2 * sizeof(uint32_t));
 
 irq_attr = g_new(uint32_t, vbasedev->num_irqs * 3);
 for (i = 0; i < vbasedev->num_irqs; i++) {
@@ -257,17 +253,12 @@ static int add_calxeda_midway_xgmac_fdt_node(SysBusDevice 
*sbdev, void *opaque)
 irq_attr[3 * i + 1] = cpu_to_be32(irq_number);
 irq_attr[3 * i + 2] = cpu_to_be32(GIC_FDT_IRQ_FLAGS_LEVEL_HI);
 }
-ret = qemu_fdt_setprop(fdt, nodename, "interrupts",
+qemu_fdt_setprop(fdt, nodename, "interrupts",
  irq_attr, vbasedev->num_irqs * 3 * sizeof(uint32_t));
-if (ret) {
-error_report("could not set interrupts property of node %s",
- nodename);
-}
 g_free(irq_attr);
-fail_reg:
 g_free(reg_attr);
 g_free(nodename);
-return ret;
+return 0;
 }
 
 /* AMD xgbe properties whose values are copied/pasted from host */




[Qemu-devel] [PULL 02/13] vfio: make the 4 bytes aligned for capability size

2016-02-18 Thread Alex Williamson
From: Chen Fan 

this function search the capability from the end, the last
size should 0x100 - pos, not 0xff - pos.

Signed-off-by: Chen Fan 
Reviewed-by: Marcel Apfelbaum 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Alex Williamson 
---
 hw/vfio/pci.c |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 49f3d2d..e671506 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -1505,7 +1505,8 @@ static void vfio_unmap_bars(VFIOPCIDevice *vdev)
  */
 static uint8_t vfio_std_cap_max_size(PCIDevice *pdev, uint8_t pos)
 {
-uint8_t tmp, next = 0xff;
+uint8_t tmp;
+uint16_t next = PCI_CONFIG_SPACE_SIZE;
 
 for (tmp = pdev->config[PCI_CAPABILITY_LIST]; tmp;
  tmp = pdev->config[tmp + 1]) {




[Qemu-devel] [PULL 00/13] VFIO updates 2016-02-18

2016-02-18 Thread Alex Williamson
The following changes since commit dd5e38b19d7cb07d317e1285941d8245c01da540:

  Merge remote-tracking branch 
'remotes/pmaydell/tags/pull-target-arm-20160218-1' into staging (2016-02-18 
15:20:35 +)

are available in the git repository at:


  git://github.com/awilliam/qemu-vfio.git tags/vfio-update-20160218.0

for you to fetch changes up to 87889319d955f346a68ade46905a17fefe9d:

  hw/arm/sysbus-fdt: remove qemu_fdt_setprop returned value check (2016-02-18 
12:43:13 -0700)

VFIO updates 2016-02-18

 - AER pre-enable and misc fixes (Cao jin and Chen Fan)
 - PCI_CAP_LIST_NEXT cleanup (Wei Yang)
 - AMD XGBE KVM platform passthrough (Eric Auger)


Chen Fan (4):
  pcie: modify the capability size assert
  vfio: make the 4 bytes aligned for capability size
  aer: impove pcie_aer_init to support vfio device
  pcie_aer: expose pcie_aer_msg() interface

Eric Auger (8):
  hw/vfio/platform: amd-xgbe device
  device_tree: introduce load_device_tree_from_sysfs
  device_tree: introduce qemu_fdt_node_path
  device_tree: qemu_fdt_getprop converted to use the error API
  device_tree: qemu_fdt_getprop_cell converted to use the error API
  hw/arm/sysbus-fdt: helpers for clock node generation
  hw/arm/sysbus-fdt: enable amd-xgbe dynamic instantiation
  hw/arm/sysbus-fdt: remove qemu_fdt_setprop returned value check

Wei Yang (1):
  vfio/pci: replace 1 with PCI_CAP_LIST_NEXT to make code self-explain

 device_tree.c  | 179 +++--
 hw/arm/boot.c  |   6 +-
 hw/arm/sysbus-fdt.c| 319 +++--
 hw/arm/vexpress.c  |   6 +-
 hw/pci-bridge/ioh3420.c|   2 +-
 hw/pci-bridge/xio3130_downstream.c |   2 +-
 hw/pci-bridge/xio3130_upstream.c   |   2 +-
 hw/pci/pcie.c  |   2 +-
 hw/pci/pcie_aer.c  |   6 +-
 hw/vfio/Makefile.objs  |   1 +
 hw/vfio/amd-xgbe.c |  55 +++
 hw/vfio/pci.c  |  11 +-
 include/hw/pci/pcie_aer.h  |   3 +-
 include/hw/vfio/vfio-amd-xgbe.h|  51 ++
 include/sysemu/device_tree.h   |  53 +-
 15 files changed, 656 insertions(+), 42 deletions(-)
 create mode 100644 hw/vfio/amd-xgbe.c
 create mode 100644 include/hw/vfio/vfio-amd-xgbe.h



[Qemu-devel] [PULL 01/13] pcie: modify the capability size assert

2016-02-18 Thread Alex Williamson
From: Chen Fan 

Device's Offset and size can reach PCIE_CONFIG_SPACE_SIZE,
fix the corresponding assert.

Signed-off-by: Chen Fan 
Reviewed-by: Marcel Apfelbaum 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Alex Williamson 
---
 hw/pci/pcie.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c
index 435a6cf..4aca0c5 100644
--- a/hw/pci/pcie.c
+++ b/hw/pci/pcie.c
@@ -608,7 +608,7 @@ void pcie_add_capability(PCIDevice *dev,
 
 assert(offset >= PCI_CONFIG_SPACE_SIZE);
 assert(offset < offset + size);
-assert(offset + size < PCIE_CONFIG_SPACE_SIZE);
+assert(offset + size <= PCIE_CONFIG_SPACE_SIZE);
 assert(size >= 8);
 assert(pci_is_express(dev));
 




Re: [Qemu-devel] [PATCH 0/8] more include cleaning

2016-02-18 Thread Eric Blake
On 02/18/2016 11:05 AM, Peter Maydell wrote:
> This patchset makes more progress with with cleaning our
> include use and is hopefully the last of the big ones
> (there are some other minor fixups that can be done after).
> 
> Patch 1 is already in Andreas' QOM tree but I include it
> for convenience since without it things will fail to compile
> later in the series.
> Patch 2 fixes a problem with including osdep.h first in
> arm-a64.cc if the compiler doesn't support C++11
> Patch 3 is the previously sent arm-a64.cc change
> Patches 4 and 5 add support to the clean-includes script
> for working on .h files, and for saying "run on everything
> in the source tree"
> Patch 6 fixes something I had forgotten about -- osdep.h
> needs to include config-target.h if the object file being
> built is a per-target one!
> Patch 7 is the results of cleaning everything except include/.
> Patch 8 cleans include/.
> 
> In particular, since patch 8 removes the osdep.h include from
> qemu-common.h it is where you will see compile failures if anything
> isn't including osdep.h (this builds fine against current master but
> might break in-flight patches and other out of tree code if it does
> not include osdep.h everywhere it should).
> 
> I could split patches 7 and 8 up into smaller chunks if that seems
> helpful to people, but I figured they weren't too terrible as-is.
> 
> thanks
> -- PMM
> 
> 
> Peter Maydell (8):

Tested-by: Eric Blake 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH v1 0/2] Add a generic loader

2016-02-18 Thread Peter Maydell
On 18 February 2016 at 19:02, Alistair Francis
 wrote:
> On Thu, Feb 18, 2016 at 10:27 AM, Hollis Blanchard
>  wrote:
>> I understand the part about loading multiple files, but why would I want to
>> load a kernel with this loader, instead of the -kernel option?
>
> If you just want to load a kernel, you should use the -kernel option.
> The advantage with this is when you need multiple images. For example:
>  - Load ATF and let it run
>  - Hands off to a simple loader that sets some registers and drops EL
>  - Start Linux
>
> To be able to perform the steps above all three images and the DTB
> need to already be loaded. This allows us to boot Linux on ATF without
> u-boot. That is just one example, there are others. Mostly it comes
> down to using ATF and something.

FWIW, the ATF stack we have for the 'virt' board uses the semihosting
API to load the later stage blobs. (That's not saying we don't need
this series.)

thanks
-- PMM



Re: [Qemu-devel] [PATCH v11 12/15] qapi: Don't box struct branch of alternate

2016-02-18 Thread Eric Blake
On 02/18/2016 11:56 AM, Markus Armbruster wrote:

> Better, but the sentence is long enough to confuse even a German.  What
> about:

lol

> 
> The corresponding spot in qapi-visit.c calls visit_type_FOO(), which
> first calls visit_start_struct() to allocate or deallocate the member
> and handle a layer of {} from the JSON stream, then visits the
> members.  To peel off the indirection and the memory management that
> comes with it, we inline this call, then suppress allocation /
> deallocation by passing NULL to visit_start_struct(), and adjust the
> member visit:

Works for me.  [Technically, visit_start_struct() never deallocates, it
merely records a pointer for later deallocation during
visit_end_struct(); but I don't think we need to worry about the
imprecision here]

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH v3 0/2] Fix migration of old pseries

2016-02-18 Thread Laurent Vivier


On 18/02/2016 12:32, Greg Kurz wrote:
> QEMU 2.4 broke the migration of old pseries machine with the addition
> of configuration sections, which are sent unconditionally.
> 
> We assume that QEMU 2.3 is more deployed than any newer release (based on
> the versions currently shipped by most distros). This v3 series hence
> reverses the logic from v2: it now fully fixes migration of old pseries
> from/to QEMU 2.3 and provides a manual workaround for the QEMU 2.4/2.4.1/2.5
> case.
> 
> With this series, I could migrate the same pseries-2.3 instance in a full
> 2.3->2.6->2.5->2.6->2.4->2.6->2.3 cycle.
> 
> ---
> 
> Greg Kurz (2):
>   spapr: skip configuration section during migration of older machines
>   migration: allow machine to enforce configuration section migration
> 
> 
>  hw/core/machine.c   |   21 +
>  hw/ppc/spapr.c  |1 +
>  include/hw/boards.h |1 +
>  migration/savevm.c  |   10 --
>  qemu-options.hx |3 ++-
>  5 files changed, 33 insertions(+), 3 deletions(-)

For the series
Tested-by: Laurent Vivier 

Results of tests:

pseries-2.3

qemu-2.3.0 -> qemu-master: OK
qemu-2.3.1 -> qemu-master: OK
qemu-2.4.0 -> qemu-master: OK with qom-set
qemu-2.4.1 -> qemu-master: OK with qom-set
qemu-2.5.0 -> qemu-master: OK with qom-set
qemu-master -> qemu-2.3.0: OK
qemu-master -> qemu-2.3.1: OK
qemu-master -> qemu-2.4.0: OK with qom-set
qemu-master -> qemu-2.4.1: OK with qom-set
qemu-master -> qemu-2.5.0: OK with qom-set
qemu-master -> qemu-master: OK

pseries-2.4, pseries-2.5 and pseries-2.6 works well with the patch
applied too.

master is "b527c9b qcow2: Write full header on image creation" + this series

Laurent



Re: [Qemu-devel] [PATCH v3 2/2] migration: allow machine to enforce configuration section migration

2016-02-18 Thread Laurent Vivier


On 18/02/2016 12:32, Greg Kurz wrote:
> Migration of pseries-2.3 doesn't have configuration section. Unfortunately,
> QEMU 2.4/2.4.1/2.5 are buggy and always stream and expect the configuration
> section, and break migration both ways.
> 
> This patch introduces a property which allows to enforce a configuration
> section for machines who don't have one.
> 
> It can be set at startup:
> 
> -machine enforce-config-section=on
> 
> or later from the QEMU monitor:
> 
> qom-set /machine enforce-config-section on
> 
> It is up to the tooling to set or unset this property according to the
> version of the QEMU at the other end of the pipe.
> 
> Signed-off-by: Greg Kurz 

Reviewed-by: Laurent Vivier 

> ---
>  hw/core/machine.c   |   21 +
>  include/hw/boards.h |1 +
>  migration/savevm.c  |   10 --
>  qemu-options.hx |3 ++-
>  4 files changed, 32 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/core/machine.c b/hw/core/machine.c
> index 6d1a0d8eebc4..a8c4680b0c47 100644
> --- a/hw/core/machine.c
> +++ b/hw/core/machine.c
> @@ -312,6 +312,21 @@ static bool machine_get_suppress_vmdesc(Object *obj, 
> Error **errp)
>  return ms->suppress_vmdesc;
>  }
>  
> +static void machine_set_enforce_config_section(Object *obj, bool value,
> + Error **errp)
> +{
> +MachineState *ms = MACHINE(obj);
> +
> +ms->enforce_config_section = value;
> +}
> +
> +static bool machine_get_enforce_config_section(Object *obj, Error **errp)
> +{
> +MachineState *ms = MACHINE(obj);
> +
> +return ms->enforce_config_section;
> +}
> +
>  static int error_on_sysbus_device(SysBusDevice *sbdev, void *opaque)
>  {
>  error_report("Option '-device %s' cannot be handled by this machine",
> @@ -467,6 +482,12 @@ static void machine_initfn(Object *obj)
>  object_property_set_description(obj, "suppress-vmdesc",
>  "Set on to disable self-describing 
> migration",
>  NULL);
> +object_property_add_bool(obj, "enforce-config-section",
> + machine_get_enforce_config_section,
> + machine_set_enforce_config_section, NULL);
> +object_property_set_description(obj, "enforce-config-section",
> +"Set on to enforce configuration section 
> migration",
> +NULL);
>  
>  /* Register notifier when init is done for sysbus sanity checks */
>  ms->sysbus_notifier.notify = machine_init_notify;
> diff --git a/include/hw/boards.h b/include/hw/boards.h
> index 0f30959e2e3b..cfdf8d10cfcd 100644
> --- a/include/hw/boards.h
> +++ b/include/hw/boards.h
> @@ -128,6 +128,7 @@ struct MachineState {
>  char *firmware;
>  bool iommu;
>  bool suppress_vmdesc;
> +bool enforce_config_section;
>  
>  ram_addr_t ram_size;
>  ram_addr_t maxram_size;
> diff --git a/migration/savevm.c b/migration/savevm.c
> index 94f2894243ce..2fc57363674f 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -878,13 +878,19 @@ bool qemu_savevm_state_blocked(Error **errp)
>  return false;
>  }
>  
> +static bool enforce_config_section(void)
> +{
> +MachineState *machine = MACHINE(qdev_get_machine());
> +return machine->enforce_config_section;
> +}
> +
>  void qemu_savevm_state_header(QEMUFile *f)
>  {
>  trace_savevm_state_header();
>  qemu_put_be32(f, QEMU_VM_FILE_MAGIC);
>  qemu_put_be32(f, QEMU_VM_FILE_VERSION);
>  
> -if (!savevm_state.skip_configuration) {
> +if (!savevm_state.skip_configuration || enforce_config_section()) {
>  qemu_put_byte(f, QEMU_VM_CONFIGURATION);
>  vmstate_save_state(f, _configuration, _state, 0);
>  }
> @@ -1875,7 +1881,7 @@ int qemu_loadvm_state(QEMUFile *f)
>  return -ENOTSUP;
>  }
>  
> -if (!savevm_state.skip_configuration) {
> +if (!savevm_state.skip_configuration || enforce_config_section()) {
>  if (qemu_get_byte(f) != QEMU_VM_CONFIGURATION) {
>  error_report("Configuration section missing");
>  return -EINVAL;
> diff --git a/qemu-options.hx b/qemu-options.hx
> index 2f0465eeb1d1..8d81e310f2bf 100644
> --- a/qemu-options.hx
> +++ b/qemu-options.hx
> @@ -43,7 +43,8 @@ DEF("machine", HAS_ARG, QEMU_OPTION_machine, \
>  "aes-key-wrap=on|off controls support for AES key 
> wrapping (default=on)\n"
>  "dea-key-wrap=on|off controls support for DEA key 
> wrapping (default=on)\n"
>  "suppress-vmdesc=on|off disables self-describing 
> migration (default=off)\n"
> -"nvdimm=on|off controls NVDIMM support (default=off)\n",
> +"nvdimm=on|off controls NVDIMM support (default=off)\n"
> +"enforce-config-section=on|off enforce configuration 
> section migration 

Re: [Qemu-devel] [PATCH v3 1/2] spapr: skip configuration section during migration of older machines

2016-02-18 Thread Laurent Vivier


On 18/02/2016 12:32, Greg Kurz wrote:
> Since QEMU 2.4, we have a configuration section in the migration stream.
> This must be skipped for older machines, like it is already done for x86.
> 
> This patch fixes the migration of pseries-2.3 from/to QEMU 2.3, but it
> breaks migration of the same machine from/to QEMU 2.4/2.4.1/2.5. We do
> that anyway because QEMU 2.3 is likely to be more widely deployed than
> newer QEMU versions.
> 
> Fixes: 61964c23e5ddd5a33f15699e45ce126f879e3e33
> Signed-off-by: Greg Kurz 

Reviewed-by: Laurent Vivier 

> ---
>  hw/ppc/spapr.c |1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 5bd8fd3ef842..bca7cb8a5d27 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -2446,6 +2446,7 @@ static void 
> spapr_machine_2_3_instance_options(MachineState *machine)
>  spapr_machine_2_4_instance_options(machine);
>  savevm_skip_section_footers();
>  global_state_set_optional();
> +savevm_skip_configuration();
>  }
>  
>  static void spapr_machine_2_3_class_options(MachineClass *mc)
> 



Re: [Qemu-devel] [PATCH 8/8] include: Clean up includes

2016-02-18 Thread Eric Blake
On 02/18/2016 11:05 AM, Peter Maydell wrote:
> Clean up includes so that osdep.h is included first and headers
> which it implies are not included manually.
> 
> This commit was created with scripts/clean-includes.
> 
> NB: If this commit breaks compilation for your out-of-tree
> patchseries or fork, then you need to make sure you add
> #include "qemu/osdep.h" to any new .c files that you have.
> 
> Signed-off-by: Peter Maydell 
> ---

>  97 files changed, 156 deletions(-)

> 
> +++ b/include/migration/qemu-file.h
> @@ -25,7 +25,6 @@
>  #define QEMU_FILE_H 1
>  #include "exec/cpu-common.h"
>  
> -#include 
>  
>  /* This function writes a chunk of data to a file at the given position.

Cleanups like this change the resulting whitespace in the file.  I don't
personally care, but I'm pointing it out in case someone does.

Reviewed-by: Eric Blake 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH] target-mips: Stop using uint_fast*_t types in r4k_tlb_t struct

2016-02-18 Thread Leon Alrae
On 18/02/16 11:51, Peter Maydell wrote:
> On 2 February 2016 at 10:49, Leon Alrae  wrote:
>> On 25/01/16 17:40, Peter Maydell wrote:
>>> The r4k_tlb_t structure uses the uint_fast*_t types. Most of these
>>> uses are in bitfields and are thus pointless, because the bitfield
>>> itself specifies the width of the type; just use 'unsigned int'
>>> instead. (On glibc uint_fast16_t is defined as either 32 or 64 bits,
>>> so we know the code is not reliant on it being exactly 16 bits.)
>>> There is also one use of uint_fast8_t, which we replace with uint8_t,
>>> because both are exactly 8 bits on glibc and this is the only
>>> place outside the softfloat code which uses an int_fast*_t type.
>>>
>>> Signed-off-by: Peter Maydell 
>>> ---
>>> I'm going to have a go at getting rid of the int_fast16_t usage
>>> in the softfloat code too, but in the meantime this is an
>>> independent cleanup.
>>>
>>>  target-mips/cpu.h | 26 +-
>>>  1 file changed, 13 insertions(+), 13 deletions(-)
>>
>> Applied to target-mips tree, thanks.
> 
> Hi -- is this going to appear in master soon? I have another patch
> pending that depends on it...

I'm out of office this week and will be able to send a pull request
early next week. If you are planning to apply that another patch
earlier, then please feel free to include this one in your pull request.

Thanks,
Leon




  1   2   3   4   >