date:20181211

Re: [Qemu-devel] [PATCH for-4.0 0/6] vhost-user-blk: Add support for backend reconnecting

2018-12-11 Thread Jason Wang




On 2018/12/12 下午2:41, Yongji Xie wrote:

On Wed, 12 Dec 2018 at 12:07, Jason Wang  wrote:


On 2018/12/12 上午11:21, Yongji Xie wrote:

On Wed, 12 Dec 2018 at 11:00, Jason Wang  wrote:

On 2018/12/12 上午10:48, Yongji Xie wrote:

On Mon, 10 Dec 2018 at 17:32, Jason Wang  wrote:

On 2018/12/6 下午9:59, Michael S. Tsirkin wrote:

On Thu, Dec 06, 2018 at 09:57:22PM +0800, Jason Wang wrote:

On 2018/12/6 下午2:35,elohi...@gmail.com  wrote:

From: Xie Yongji

This patchset is aimed at supporting qemu to reconnect
vhost-user-blk backend after vhost-user-blk backend crash or
restart.

The patch 1 tries to implenment the sync connection for
"reconnect socket".

The patch 2 introduces a new message VHOST_USER_SET_VRING_INFLIGHT
to support offering shared memory to backend to record
its inflight I/O.

The patch 3,4 are the corresponding libvhost-user patches of
patch 2. Make libvhost-user support VHOST_USER_SET_VRING_INFLIGHT.

The patch 5 supports vhost-user-blk to reconnect backend when
connection closed.

The patch 6 tells qemu that we support reconnecting now.

To use it, we could start qemu with:

qemu-system-x86_64 \
 -chardev socket,id=char0,path=/path/vhost.socket,reconnect=1,wait \
 -device vhost-user-blk-pci,chardev=char0 \

and start vhost-user-blk backend with:

vhost-user-blk -b /path/file -s /path/vhost.socket

Then we can restart vhost-user-blk at any time during VM running.

I wonder whether or not it's better to handle this at the level of virtio
protocol itself instead of vhost-user level. E.g expose last_avail_idx to
driver might be sufficient?

Another possible issue is, looks like you need to deal with different kinds
of ring layouts e.g packed virtqueues.

Thanks

I'm not sure I understand your comments here.
All these would be guest-visible extensions.

Looks not, it only introduces a shared memory between qemu and
vhost-user backend?



Possible for sure but how is this related to
a patch supporting transparent reconnects?

I might miss something. My understanding is that we support transparent
reconnects, but we can't deduce an accurate last_avail_idx and this is
what capability this series try to add. To me, this series is functional
equivalent to expose last_avail_idx (or avail_idx_cons) in available
ring. So the information is inside guest memory, vhost-user backend can
access it and update it directly. I believe this is some modern NIC did
as well (but index is in MMIO area of course).


Hi Jason,

If my understand is correct, it might be not enough to only expose
last_avail_idx.
Because we would not process descriptors in the same order in which they have
been made available sometimes. If so, we can't get correct inflight
I/O information
from available ring.

You can get this with the help of the both used ring and last_avail_idx
I believe. Or maybe you can give us an example?


A simple example, we assume ring size is 8:

1. guest fill avail ring

avail ring: 0 1 2 3 4 5 6 7
used ring:

2. vhost-user backend complete 4,5,6,7 and fill used ring

avail ring: 0 1 2 3 4 5 6 7
used ring: 4 5 6 7

3. guest fill avail ring again

avail ring: 4 5 6 7 4 5 6 7
used ring: 4 5 6 7

4. vhost-user backend crash

The inflight descriptors 0, 1, 2, 3 lost.

Thanks,
Yongji


Ok, then we can simply forbid increasing the avail_idx in this case?

Basically, it's a question of whether or not it's better to done it in
the level of virtio instead of vhost. I'm pretty sure if we expose
sufficient information, it could be done without touching vhost-user.
And we won't deal with e.g migration and other cases.


OK, I get your point. That's indeed an alternative way. But this feature seems
to be only useful to vhost-user backend.



I admit I could not think of a use case other than vhost-user.



  I'm not sure whether it make sense to
touch virtio protocol for this feature.



Some possible advantages:

- Feature could be determined and noticed by user or management layer.

- There's no need to invent ring layout specific protocol to record in 
flight descriptors. E.g if my understanding is correct, for this series 
and for the example above, it still can not work for packed virtqueue 
since descriptor id is not sufficient (descriptor could be overwritten 
by used one). You probably need to have a (partial) copy of descriptor 
ring for this.


- No need to deal with migration, all information was in guest memory.

Thanks



Thanks,
Yongji

Re: [Qemu-devel] [PATCH v7 09/19] spapr: add device tree support for the XIVE exploitation mode

2018-12-11 Thread Cédric Le Goater

On 12/12/18 1:19 AM, David Gibson wrote:
> On Tue, Dec 11, 2018 at 10:06:46AM +0100, Cédric Le Goater wrote:
>> On 12/11/18 1:38 AM, David Gibson wrote:
>>> On Mon, Dec 10, 2018 at 08:53:17AM +0100, Cédric Le Goater wrote:
 On 12/10/18 7:39 AM, David Gibson wrote:
> On Sun, Dec 09, 2018 at 08:46:00PM +0100, Cédric Le Goater wrote:
>> The XIVE interface for the guest is described in the device tree under
>> the "interrupt-controller" node. A couple of new properties are
>> specific to XIVE :
>>
>>  - "reg"
>>
>>contains the base address and size of the thread interrupt
>>managnement areas (TIMA), for the User level and for the Guest OS
>>level. Only the Guest OS level is taken into account today.
>>
>>  - "ibm,xive-eq-sizes"
>>
>>the size of the event queues. One cell per size supported, contains
>>log2 of size, in ascending order.
>>
>>  - "ibm,xive-lisn-ranges"
>>
>>the IRQ interrupt number ranges assigned to the guest for the IPIs.
>>
>> and also under the root node :
>>
>>  - "ibm,plat-res-int-priorities"
>>
>>contains a list of priorities that the hypervisor has reserved for
>>its own use. OPAL uses the priority 7 queue to automatically
>>escalate interrupts for all other queues (DD2.X POWER9). So only
>>priorities [0..6] are allowed for the guest.
>>
>> Extend the sPAPR IRQ backend with a new handler to populate the DT
>> with the appropriate "interrupt-controller" node.
>>
>> Signed-off-by: Cédric Le Goater 
>> ---
>>  include/hw/ppc/spapr_irq.h  |  2 ++
>>  include/hw/ppc/spapr_xive.h |  2 ++
>>  include/hw/ppc/xics.h   |  4 +--
>>  hw/intc/spapr_xive.c| 64 +
>>  hw/intc/xics_spapr.c|  3 +-
>>  hw/ppc/spapr.c  |  3 +-
>>  hw/ppc/spapr_irq.c  |  3 ++
>>  7 files changed, 77 insertions(+), 4 deletions(-)
>>
>> diff --git a/include/hw/ppc/spapr_irq.h b/include/hw/ppc/spapr_irq.h
>> index 23cdb51b879e..e51e9f052f63 100644
>> --- a/include/hw/ppc/spapr_irq.h
>> +++ b/include/hw/ppc/spapr_irq.h
>> @@ -39,6 +39,8 @@ typedef struct sPAPRIrq {
>>  void (*free)(sPAPRMachineState *spapr, int irq, int num);
>>  qemu_irq (*qirq)(sPAPRMachineState *spapr, int irq);
>>  void (*print_info)(sPAPRMachineState *spapr, Monitor *mon);
>> +void (*dt_populate)(sPAPRMachineState *spapr, uint32_t nr_servers,
>> +void *fdt, uint32_t phandle);
>>  } sPAPRIrq;
>>  
>>  extern sPAPRIrq spapr_irq_xics;
>> diff --git a/include/hw/ppc/spapr_xive.h b/include/hw/ppc/spapr_xive.h
>> index 9506a8f4d10a..728a5e8dc163 100644
>> --- a/include/hw/ppc/spapr_xive.h
>> +++ b/include/hw/ppc/spapr_xive.h
>> @@ -45,5 +45,7 @@ qemu_irq spapr_xive_qirq(sPAPRXive *xive, uint32_t 
>> lisn);
>>  typedef struct sPAPRMachineState sPAPRMachineState;
>>  
>>  void spapr_xive_hcall_init(sPAPRMachineState *spapr);
>> +void spapr_dt_xive(sPAPRMachineState *spapr, uint32_t nr_servers, void 
>> *fdt,
>> +   uint32_t phandle);
>>  
>>  #endif /* PPC_SPAPR_XIVE_H */
>> diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
>> index 9958443d1984..14afda198cdb 100644
>> --- a/include/hw/ppc/xics.h
>> +++ b/include/hw/ppc/xics.h
>> @@ -181,8 +181,6 @@ typedef struct XICSFabricClass {
>>  ICPState *(*icp_get)(XICSFabric *xi, int server);
>>  } XICSFabricClass;
>>  
>> -void spapr_dt_xics(int nr_servers, void *fdt, uint32_t phandle);
>> -
>>  ICPState *xics_icp_get(XICSFabric *xi, int server);
>>  
>>  /* Internal XICS interfaces */
>> @@ -204,6 +202,8 @@ void icp_resend(ICPState *ss);
>>  
>>  typedef struct sPAPRMachineState sPAPRMachineState;
>>  
>> +void spapr_dt_xics(sPAPRMachineState *spapr, uint32_t nr_servers, void 
>> *fdt,
>> +   uint32_t phandle);
>>  int xics_kvm_init(sPAPRMachineState *spapr, Error **errp);
>>  void xics_spapr_init(sPAPRMachineState *spapr);
>>  
>> diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
>> index 982ac6e17051..a6d854b07690 100644
>> --- a/hw/intc/spapr_xive.c
>> +++ b/hw/intc/spapr_xive.c
>> @@ -14,6 +14,7 @@
>>  #include "target/ppc/cpu.h"
>>  #include "sysemu/cpus.h"
>>  #include "monitor/monitor.h"
>> +#include "hw/ppc/fdt.h"
>>  #include "hw/ppc/spapr.h"
>>  #include "hw/ppc/spapr_xive.h"
>>  #include "hw/ppc/xive.h"
>> @@ -1381,3 +1382,66 @@ void spapr_xive_hcall_init(sPAPRMachineState 
>> *spapr)
>>  spapr_register_hypercall(H_INT_SYNC, h_int_sync);
>>  spapr_register_hypercall(H_INT_RESET, h_int_reset);
>>  }
>> +
>> +void spapr_dt_xive(sPAPRMachineState *spapr,

Re: [Qemu-devel] [PATCH v7 18/19] spapr: add a 'pseries-4.0-xive' machine type

2018-12-11 Thread Cédric Le Goater

[ ... ]

>>> So, instead I think we want a machine option which can be set to
>>> xics/xive/dual, with xics being the default for earlier machine types
>>> and dual the default for 4.0 onwards.
>>
>> I will revive an old patch doing just that. 
>>
>> The question now is how to link the sPAPRMachineState instance to 
>> the selected sPAPR IRQ backend. 
>>
>> I don't think we can move 'smc->irq' to sPAPRMachineState.
> 
> I think you could..
> 
>> So we will
>> need an helper returning the appropriate backend depending on the machine 
>> option and 'smc->irq' should disappear.
> 
> ..but this approach might be easier.

I proposed the first approach in v8. We can add the missing wrappers 
in a second round and move then under spapr_irq.h. These are :
 
ops :

   spapr->irq->dt_populate   spapr->irq->print_info
   spapr->irq->cpu_intc_create (this name is too long)

constants :

   spapr->irq->ov5
   spapr->irq->nr_msis

C.

Re: [Qemu-devel] [PATCH v2 1/4] hostmem-memfd: disable for systems wihtout sealing support

2018-12-11 Thread Gerd Hoffmann

On Tue, Dec 11, 2018 at 02:09:11PM +0300, Ilya Maximets wrote:
> On 11.12.2018 13:53, Daniel P. Berrangé wrote:
> >>
> >> Let's restrict memfd backend to systems with sealing support.
> > 
> > I don't think we need todo that - sealing is optional in the QEMU code,
> > we simply have it set to the wrong default when sealing is not available.
> 
> That was literally what I've fixed in v1:
> https://lists.nongnu.org/archive/html/qemu-devel/2018-11/msg05483.html
> 
> but 2 people suggested me to disable memfd entirely for this case.
> Do you think I need to get patch from v1 back ?
> 
> Gerd, Marc-André, what do you think?

I still think it makes sense to require sealing support.  Sealing is
very useful, and there are only a few kernel versions with memfd but
without sealing.  So finding such kernels in the wild will become more
rare over time.  I wouldn't worry too much about them.

cheers,
  Gerd

Re: [Qemu-devel] [PATCH for-4.0 v7 01/27] qapi: make sure osdep.h is included in type headers

2018-12-11 Thread Markus Armbruster

Daniel P. Berrangé  writes:

> On Mon, Dec 10, 2018 at 02:28:26PM +0100, Markus Armbruster wrote:
[...]
>> checkpatch.pl could flag patches adding .c files that don't include
>> "qemu/osdep.h" first.  The "first" part might be a bit annoying to code.
>
> You can get this logic from GNULIBs syntax-check make rules. It uses
> this perl code to check that config.h is included first:
>
> while (<>) {
> if (/^# *include\b/) {
> if (not m{^# *include }) {
> print "$ARGV\n";
> $$ret = 1;
> }
> close ARGV;
> }
> }

Thanks!

Re: [Qemu-devel] [PATCH for-4.0 v7 01/27] qapi: make sure osdep.h is included in type headers

2018-12-11 Thread Markus Armbruster

Marc-André Lureau  writes:

> Hi
>
> On Mon, Dec 10, 2018 at 5:28 PM Markus Armbruster  wrote:
>>
>> Marc-André Lureau  writes:
>>
>> > Hi
>> >
>> > On Mon, Dec 10, 2018 at 1:52 PM Markus Armbruster  
>> > wrote:
>> >>
>> >> Marc-André Lureau  writes:
>> >>
>> >> > Now that the schema can be configured, it is crucial that all types
>> >> > are configured the same. Make sure config-host.h is included, by
>> >> > checking osdep.h inclusion. The build-sys tracks the dependency and
>> >> > rebuilds the types if the configuration changed.
>> >> >
>> >> > Signed-off-by: Marc-André Lureau 
>> >> > ---
>> >> >  scripts/qapi/types.py | 3 +++
>> >> >  1 file changed, 3 insertions(+)
>> >> >
>> >> > diff --git a/scripts/qapi/types.py b/scripts/qapi/types.py
>> >> > index fd7808103c..3bb9cb6d47 100644
>> >> > --- a/scripts/qapi/types.py
>> >> > +++ b/scripts/qapi/types.py
>> >> > @@ -201,6 +201,9 @@ class 
>> >> > QAPISchemaGenTypeVisitor(QAPISchemaModularCVisitor):
>> >> >  ''',
>> >> >types=types, visit=visit))
>> >> >  self._genh.preamble_add(mcgen('''
>> >> > +#ifndef QEMU_OSDEP_H
>> >> > +#error Please include osdep.h first!
>> >> > +#endif
>> >> >  #include "qapi/qapi-builtin-types.h"
>> >> >  '''))
>> >>
>> >> I understand why you propose this patch.  The QAPI-generated headers use
>> >> #ifdef CONFIG_FOO.  The configuration header "qemu/osdep.h" must be
>> >> consistently included before the generated headers, or else you end up
>> >> with nasty bugs, such as the same enum having different values in
>> >> different .o, or the same struct having a different layout.
>> >>
>> >> But this applies to *all* headers that use #ifdef.  Why check it here,
>> >> but not there?  What makes the QAPI-generated headers special?
>> >>
>> >
>> > It's the discussion about #if in headers (and enums in particular)
>> > that started this. We want to make sure all compilation units share
>> > the same data structure/ABI. I proposed to include osdep.h in qapi
>> > headers, but that was rejected.
>> > The warning is a different approach. I agree it could apply to all
>> > headers. Do you think I should update all headers as well?
>>
>> That would replace the rule 'all .c files must include "qemu/osdep.h"
>> first' by 'all .h files must check "qemu/osdep.h" has been included
>> already', which is not an improvement.
>
> That would be an improvement since headers may also be impacted by osdep.h
>
> Alternatively, why not use -include ?

Requiring .c to include the configuration header first follows
established Autoconf practice.  The "first" part covers all headers.

I've seen plenty of autoconfiscated code, yet none where the headers
double-check the configuration header has been included already.  Such a
check is neither sufficient nor really necessary.  Checking the .c is
both.

I have no strong feelings about using -include instead.  I'd prefer to
avoid the difference to what other projects do, though.

Circling back to the issue that made you propose this patch.

I think I'm (belatedly) coming around to your initial view that the use
of conditionals in generated QAPI code is safe enough.

I shouldn't be worried about #if defined(CONFIG_HOST_THING) where
CONFIG_HOST_THING belongs to config-host.h.  That's always pulled in via
qemu/osdep.h.

I also shouldn't be worried about #if defined(CONFIG_TARGET_THING) where
CONFIG_TARGET_THING belongs to config-target.h.  qemu/osdep.h pulls in
either that or exec/poison.h.

I even shouldn't be worried about #if defined(RANDOM_MACRO) as long as
qemu/osdep.h pulls in its owner or poisons it.  I might legitimately be
worried about the "as long as" part.

We could perhaps extract the ifconds and look for macros that belong
neither to config-host.h nor config-target.h.  But I'm not sure it's
worth the bother.

Re: [Qemu-devel] [PATCH for-4.0 0/6] vhost-user-blk: Add support for backend reconnecting

2018-12-11 Thread Yongji Xie

On Wed, 12 Dec 2018 at 12:07, Jason Wang  wrote:
>
>
> On 2018/12/12 上午11:21, Yongji Xie wrote:
> > On Wed, 12 Dec 2018 at 11:00, Jason Wang  wrote:
> >>
> >> On 2018/12/12 上午10:48, Yongji Xie wrote:
> >>> On Mon, 10 Dec 2018 at 17:32, Jason Wang  wrote:
>  On 2018/12/6 下午9:59, Michael S. Tsirkin wrote:
> > On Thu, Dec 06, 2018 at 09:57:22PM +0800, Jason Wang wrote:
> >> On 2018/12/6 下午2:35,elohi...@gmail.com  wrote:
> >>> From: Xie Yongji
> >>>
> >>> This patchset is aimed at supporting qemu to reconnect
> >>> vhost-user-blk backend after vhost-user-blk backend crash or
> >>> restart.
> >>>
> >>> The patch 1 tries to implenment the sync connection for
> >>> "reconnect socket".
> >>>
> >>> The patch 2 introduces a new message VHOST_USER_SET_VRING_INFLIGHT
> >>> to support offering shared memory to backend to record
> >>> its inflight I/O.
> >>>
> >>> The patch 3,4 are the corresponding libvhost-user patches of
> >>> patch 2. Make libvhost-user support VHOST_USER_SET_VRING_INFLIGHT.
> >>>
> >>> The patch 5 supports vhost-user-blk to reconnect backend when
> >>> connection closed.
> >>>
> >>> The patch 6 tells qemu that we support reconnecting now.
> >>>
> >>> To use it, we could start qemu with:
> >>>
> >>> qemu-system-x86_64 \
> >>> -chardev 
> >>> socket,id=char0,path=/path/vhost.socket,reconnect=1,wait \
> >>> -device vhost-user-blk-pci,chardev=char0 \
> >>>
> >>> and start vhost-user-blk backend with:
> >>>
> >>> vhost-user-blk -b /path/file -s /path/vhost.socket
> >>>
> >>> Then we can restart vhost-user-blk at any time during VM running.
> >> I wonder whether or not it's better to handle this at the level of 
> >> virtio
> >> protocol itself instead of vhost-user level. E.g expose last_avail_idx 
> >> to
> >> driver might be sufficient?
> >>
> >> Another possible issue is, looks like you need to deal with different 
> >> kinds
> >> of ring layouts e.g packed virtqueues.
> >>
> >> Thanks
> > I'm not sure I understand your comments here.
> > All these would be guest-visible extensions.
>  Looks not, it only introduces a shared memory between qemu and
>  vhost-user backend?
> 
> 
> > Possible for sure but how is this related to
> > a patch supporting transparent reconnects?
>  I might miss something. My understanding is that we support transparent
>  reconnects, but we can't deduce an accurate last_avail_idx and this is
>  what capability this series try to add. To me, this series is functional
>  equivalent to expose last_avail_idx (or avail_idx_cons) in available
>  ring. So the information is inside guest memory, vhost-user backend can
>  access it and update it directly. I believe this is some modern NIC did
>  as well (but index is in MMIO area of course).
> 
> >>> Hi Jason,
> >>>
> >>> If my understand is correct, it might be not enough to only expose
> >>> last_avail_idx.
> >>> Because we would not process descriptors in the same order in which they 
> >>> have
> >>> been made available sometimes. If so, we can't get correct inflight
> >>> I/O information
> >>> from available ring.
> >>
> >> You can get this with the help of the both used ring and last_avail_idx
> >> I believe. Or maybe you can give us an example?
> >>
> > A simple example, we assume ring size is 8:
> >
> > 1. guest fill avail ring
> >
> > avail ring: 0 1 2 3 4 5 6 7
> > used ring:
> >
> > 2. vhost-user backend complete 4,5,6,7 and fill used ring
> >
> > avail ring: 0 1 2 3 4 5 6 7
> > used ring: 4 5 6 7
> >
> > 3. guest fill avail ring again
> >
> > avail ring: 4 5 6 7 4 5 6 7
> > used ring: 4 5 6 7
> >
> > 4. vhost-user backend crash
> >
> > The inflight descriptors 0, 1, 2, 3 lost.
> >
> > Thanks,
> > Yongji
>
>
> Ok, then we can simply forbid increasing the avail_idx in this case?
>
> Basically, it's a question of whether or not it's better to done it in
> the level of virtio instead of vhost. I'm pretty sure if we expose
> sufficient information, it could be done without touching vhost-user.
> And we won't deal with e.g migration and other cases.
>

OK, I get your point. That's indeed an alternative way. But this feature seems
to be only useful to vhost-user backend. I'm not sure whether it make sense to
touch virtio protocol for this feature.

Thanks,
Yongji

Re: [Qemu-devel] [PATCH v5 0/7] nvdimm: support MAP_SYNC for memory-backend-file

2018-12-11 Thread Yi Zhang

On 2018-12-05 at 09:59:23 +, Stefan Hajnoczi wrote:
> On Mon, Nov 05, 2018 at 04:07:50PM +0800, Zhang Yi wrote:
> > Linux 4.15 introduces a new mmap flag MAP_SYNC, which can be used to
> > guarantee the write persistence to mmap'ed files supporting DAX (e.g.,
> > files on ext4/xfs file system mounted with '-o dax').
> > 
> 
> Hi,
> There are a lot of people on CC so probably no one felt responsible to
> review this series quickly.  Feel free to send a "ping" email if your
> patches aren't receiving attention.
> 
> Overall this feature looks fine.  I left a few small comments.
> 
> Thanks,
> Stefan

Thanks your review, Stefan, I will post the next version soon.

> 
> > A description of MAP_SYNC and MAP_SHARED_VALIDATE can be found at
> > https://patchwork.kernel.org/patch/10028151/
> > 
> > In order to make sure that the file metadata is in sync after a fault 
> > while we are writing a shared DAX supporting backend files, this
> > patch-set enables QEMU to use MAP_SYNC flag for memory-backend-dax-file.
> > 
> > As the DAX vs DMA truncated issue was solved, we refined the code and
> > send out this feature for the v5 version.
> > 
> > A new auto on/off option 'sync' is added to memory-backend-file:
> >  - on:  try to pass MAP_SYNC to mmap(2); if MAP_SYNC is not supported or
> > 'share=off', QEMU will abort
> >  - off: never pass MAP_SYNC to mmap(2)
> >  - auto (default): if MAP_SYNC is supported and 'share=on', work as if
> > 'sync=on'; otherwise, work as if 'sync=off'
> > 
> > Changes in v5:
> >  * Add patch 1 to fix a memory leak issue.
> >  * Refine the patch 4-6
> >  * Remove the patch 3 as we already change the parameter from "shared" to
> >"flags"
> > 
> > Changes in v4:
> >  * Add patch 1-3 to switch some functions to a single 'flags'
> >parameters. (Michael S. Tsirkin)
> >  * v3 patch 1-3 become v4 patch 4-6.
> >  * Patch 4: move definitions of MAP_SYNC and MAP_SHARED_VALIDATE to a
> >new header file under include/standard-headers/linux/. (Michael S. 
> > Tsirkin)
> >  * Patch 6: refine the description of the 'sync' option. (Michael S. 
> > Tsirkin)
> > 
> > Changes in v3:
> >  * Patch 1: add MAP_SHARED_VALIDATE in both sync=on and sync=auto
> >cases, and add back the retry mechanism. MAP_SYNC will be ignored
> >by Linux kernel 4.15 if MAP_SHARED_VALIDATE is missed.
> >  * Patch 1: define MAP_SYNC and MAP_SHARED_VALIDATE as 0 on non-Linux
> >platforms in order to make qemu_ram_mmap() compile on those platforms.
> >  * Patch 2&3: include more information in error messages of
> >memory-backend in hope to help user to identify the error.
> >(Dr. David Alan Gilbert)
> >  * Patch 3: fix typo in the commit message. (Dr. David Alan Gilbert)
> > 
> > Changes in v2:
> >  * Add 'sync' option to control the use of MAP_SYNC. (Eduardo Habkost)
> >  * Remove the unnecessary set of MAP_SHARED_VALIDATE in some cases and
> >the retry mechanism in qemu_ram_mmap(). (Michael S. Tsirkin)
> >  * Move OS dependent definitions of MAP_SYNC and MAP_SHARED_VALIDATE
> >to osdep.h. (Michael S. Tsirkin)
> > 
> > Zhang Yi (7):
> >   numa: Fixed the memory leak of numa error message
> >   util/mmap-alloc: switch qemu_ram_mmap() to 'flags' parameter
> >   exec: switch qemu_ram_alloc_from_{file, fd} to the 'flags' parameter
> >   util/mmap-alloc: support MAP_SYNC in qemu_ram_mmap()
> >   util/mmap-alloc: Switch the RAM_SYNC flags to OnOffAuto
> >   hostmem: add more information in error messages
> >   hostmem-file: add 'sync' option
> > 
> >  backends/hostmem-file.c   | 45 
> > +--
> >  backends/hostmem.c|  8 ---
> >  docs/nvdimm.txt   | 20 +++-
> >  exec.c|  9 +++
> >  include/exec/memory.h | 18 ++
> >  include/exec/ram_addr.h   |  1 +
> >  include/qemu/mmap-alloc.h | 20 +++-
> >  include/standard-headers/linux/mman.h | 44 
> > ++
> >  numa.c|  1 +
> >  qemu-options.hx   | 22 -
> >  util/mmap-alloc.c | 26 
> >  util/oslib-posix.c|  4 +++-
> >  12 files changed, 200 insertions(+), 18 deletions(-)
> >  create mode 100644 include/standard-headers/linux/mman.h
> > 
> > -- 
> > 2.7.4
> > 
> >

Re: [Qemu-devel] [PATCH for-4.0 0/6] vhost-user-blk: Add support for backend reconnecting

2018-12-11 Thread Jason Wang




On 2018/12/12 上午11:21, Yongji Xie wrote:

On Wed, 12 Dec 2018 at 11:00, Jason Wang  wrote:


On 2018/12/12 上午10:48, Yongji Xie wrote:

On Mon, 10 Dec 2018 at 17:32, Jason Wang  wrote:

On 2018/12/6 下午9:59, Michael S. Tsirkin wrote:

On Thu, Dec 06, 2018 at 09:57:22PM +0800, Jason Wang wrote:

On 2018/12/6 下午2:35,elohi...@gmail.com  wrote:

From: Xie Yongji

This patchset is aimed at supporting qemu to reconnect
vhost-user-blk backend after vhost-user-blk backend crash or
restart.

The patch 1 tries to implenment the sync connection for
"reconnect socket".

The patch 2 introduces a new message VHOST_USER_SET_VRING_INFLIGHT
to support offering shared memory to backend to record
its inflight I/O.

The patch 3,4 are the corresponding libvhost-user patches of
patch 2. Make libvhost-user support VHOST_USER_SET_VRING_INFLIGHT.

The patch 5 supports vhost-user-blk to reconnect backend when
connection closed.

The patch 6 tells qemu that we support reconnecting now.

To use it, we could start qemu with:

qemu-system-x86_64 \
-chardev socket,id=char0,path=/path/vhost.socket,reconnect=1,wait \
-device vhost-user-blk-pci,chardev=char0 \

and start vhost-user-blk backend with:

vhost-user-blk -b /path/file -s /path/vhost.socket

Then we can restart vhost-user-blk at any time during VM running.

I wonder whether or not it's better to handle this at the level of virtio
protocol itself instead of vhost-user level. E.g expose last_avail_idx to
driver might be sufficient?

Another possible issue is, looks like you need to deal with different kinds
of ring layouts e.g packed virtqueues.

Thanks

I'm not sure I understand your comments here.
All these would be guest-visible extensions.

Looks not, it only introduces a shared memory between qemu and
vhost-user backend?



Possible for sure but how is this related to
a patch supporting transparent reconnects?

I might miss something. My understanding is that we support transparent
reconnects, but we can't deduce an accurate last_avail_idx and this is
what capability this series try to add. To me, this series is functional
equivalent to expose last_avail_idx (or avail_idx_cons) in available
ring. So the information is inside guest memory, vhost-user backend can
access it and update it directly. I believe this is some modern NIC did
as well (but index is in MMIO area of course).


Hi Jason,

If my understand is correct, it might be not enough to only expose
last_avail_idx.
Because we would not process descriptors in the same order in which they have
been made available sometimes. If so, we can't get correct inflight
I/O information
from available ring.


You can get this with the help of the both used ring and last_avail_idx
I believe. Or maybe you can give us an example?


A simple example, we assume ring size is 8:

1. guest fill avail ring

avail ring: 0 1 2 3 4 5 6 7
used ring:

2. vhost-user backend complete 4,5,6,7 and fill used ring

avail ring: 0 1 2 3 4 5 6 7
used ring: 4 5 6 7

3. guest fill avail ring again

avail ring: 4 5 6 7 4 5 6 7
used ring: 4 5 6 7

4. vhost-user backend crash

The inflight descriptors 0, 1, 2, 3 lost.

Thanks,
Yongji



Ok, then we can simply forbid increasing the avail_idx in this case?

Basically, it's a question of whether or not it's better to done it in 
the level of virtio instead of vhost. I'm pretty sure if we expose 
sufficient information, it could be done without touching vhost-user. 
And we won't deal with e.g migration and other cases.


Thanks

Re: [Qemu-devel] [PATCH for-4.0 0/6] vhost-user-blk: Add support for backend reconnecting

2018-12-11 Thread Yongji Xie

On Wed, 12 Dec 2018 at 11:00, Jason Wang  wrote:
>
>
> On 2018/12/12 上午10:48, Yongji Xie wrote:
> > On Mon, 10 Dec 2018 at 17:32, Jason Wang  wrote:
> >>
> >> On 2018/12/6 下午9:59, Michael S. Tsirkin wrote:
> >>> On Thu, Dec 06, 2018 at 09:57:22PM +0800, Jason Wang wrote:
>  On 2018/12/6 下午2:35,elohi...@gmail.com  wrote:
> > From: Xie Yongji
> >
> > This patchset is aimed at supporting qemu to reconnect
> > vhost-user-blk backend after vhost-user-blk backend crash or
> > restart.
> >
> > The patch 1 tries to implenment the sync connection for
> > "reconnect socket".
> >
> > The patch 2 introduces a new message VHOST_USER_SET_VRING_INFLIGHT
> > to support offering shared memory to backend to record
> > its inflight I/O.
> >
> > The patch 3,4 are the corresponding libvhost-user patches of
> > patch 2. Make libvhost-user support VHOST_USER_SET_VRING_INFLIGHT.
> >
> > The patch 5 supports vhost-user-blk to reconnect backend when
> > connection closed.
> >
> > The patch 6 tells qemu that we support reconnecting now.
> >
> > To use it, we could start qemu with:
> >
> > qemu-system-x86_64 \
> >-chardev 
> > socket,id=char0,path=/path/vhost.socket,reconnect=1,wait \
> >-device vhost-user-blk-pci,chardev=char0 \
> >
> > and start vhost-user-blk backend with:
> >
> > vhost-user-blk -b /path/file -s /path/vhost.socket
> >
> > Then we can restart vhost-user-blk at any time during VM running.
>  I wonder whether or not it's better to handle this at the level of virtio
>  protocol itself instead of vhost-user level. E.g expose last_avail_idx to
>  driver might be sufficient?
> 
>  Another possible issue is, looks like you need to deal with different 
>  kinds
>  of ring layouts e.g packed virtqueues.
> 
>  Thanks
> >>> I'm not sure I understand your comments here.
> >>> All these would be guest-visible extensions.
> >>
> >> Looks not, it only introduces a shared memory between qemu and
> >> vhost-user backend?
> >>
> >>
> >>> Possible for sure but how is this related to
> >>> a patch supporting transparent reconnects?
> >>
> >> I might miss something. My understanding is that we support transparent
> >> reconnects, but we can't deduce an accurate last_avail_idx and this is
> >> what capability this series try to add. To me, this series is functional
> >> equivalent to expose last_avail_idx (or avail_idx_cons) in available
> >> ring. So the information is inside guest memory, vhost-user backend can
> >> access it and update it directly. I believe this is some modern NIC did
> >> as well (but index is in MMIO area of course).
> >>
> > Hi Jason,
> >
> > If my understand is correct, it might be not enough to only expose
> > last_avail_idx.
> > Because we would not process descriptors in the same order in which they 
> > have
> > been made available sometimes. If so, we can't get correct inflight
> > I/O information
> > from available ring.
>
>
> You can get this with the help of the both used ring and last_avail_idx
> I believe. Or maybe you can give us an example?
>

A simple example, we assume ring size is 8:

1. guest fill avail ring

avail ring: 0 1 2 3 4 5 6 7
used ring:

2. vhost-user backend complete 4,5,6,7 and fill used ring

avail ring: 0 1 2 3 4 5 6 7
used ring: 4 5 6 7

3. guest fill avail ring again

avail ring: 4 5 6 7 4 5 6 7
used ring: 4 5 6 7

4. vhost-user backend crash

The inflight descriptors 0, 1, 2, 3 lost.

Thanks,
Yongji

Re: [Qemu-devel] [PATCH for-4.0 0/6] vhost-user-blk: Add support for backend reconnecting

2018-12-11 Thread Jason Wang




On 2018/12/12 上午10:48, Yongji Xie wrote:

On Mon, 10 Dec 2018 at 17:32, Jason Wang  wrote:


On 2018/12/6 下午9:59, Michael S. Tsirkin wrote:

On Thu, Dec 06, 2018 at 09:57:22PM +0800, Jason Wang wrote:

On 2018/12/6 下午2:35,elohi...@gmail.com  wrote:

From: Xie Yongji

This patchset is aimed at supporting qemu to reconnect
vhost-user-blk backend after vhost-user-blk backend crash or
restart.

The patch 1 tries to implenment the sync connection for
"reconnect socket".

The patch 2 introduces a new message VHOST_USER_SET_VRING_INFLIGHT
to support offering shared memory to backend to record
its inflight I/O.

The patch 3,4 are the corresponding libvhost-user patches of
patch 2. Make libvhost-user support VHOST_USER_SET_VRING_INFLIGHT.

The patch 5 supports vhost-user-blk to reconnect backend when
connection closed.

The patch 6 tells qemu that we support reconnecting now.

To use it, we could start qemu with:

qemu-system-x86_64 \
   -chardev socket,id=char0,path=/path/vhost.socket,reconnect=1,wait \
   -device vhost-user-blk-pci,chardev=char0 \

and start vhost-user-blk backend with:

vhost-user-blk -b /path/file -s /path/vhost.socket

Then we can restart vhost-user-blk at any time during VM running.

I wonder whether or not it's better to handle this at the level of virtio
protocol itself instead of vhost-user level. E.g expose last_avail_idx to
driver might be sufficient?

Another possible issue is, looks like you need to deal with different kinds
of ring layouts e.g packed virtqueues.

Thanks

I'm not sure I understand your comments here.
All these would be guest-visible extensions.


Looks not, it only introduces a shared memory between qemu and
vhost-user backend?



Possible for sure but how is this related to
a patch supporting transparent reconnects?


I might miss something. My understanding is that we support transparent
reconnects, but we can't deduce an accurate last_avail_idx and this is
what capability this series try to add. To me, this series is functional
equivalent to expose last_avail_idx (or avail_idx_cons) in available
ring. So the information is inside guest memory, vhost-user backend can
access it and update it directly. I believe this is some modern NIC did
as well (but index is in MMIO area of course).


Hi Jason,

If my understand is correct, it might be not enough to only expose
last_avail_idx.
Because we would not process descriptors in the same order in which they have
been made available sometimes. If so, we can't get correct inflight
I/O information
from available ring.



You can get this with the help of the both used ring and last_avail_idx 
I believe. Or maybe you can give us an example?


Thanks




Thanks,
Yongji

Re: [Qemu-devel] [RFC v3 15/24] riscv: tcg-target: Add the add2 and sub2 instructions

2018-12-11 Thread Richard Henderson

On 12/7/18 6:48 PM, Alistair Francis wrote:
> +static void tcg_out_addsub2(TCGContext *s,
> +TCGReg rl, TCGReg rh,
> +TCGReg al, TCGReg ah,
> +TCGReg bl, TCGReg bh,
> +bool cbl, bool cbh, bool is_sub, bool is32bit)

bl and bh must be TCGArg, since they may be constants, as indicated by cbl+cbh.


r~

Re: [Qemu-devel] [PATCH for-4.0 0/6] vhost-user-blk: Add support for backend reconnecting

2018-12-11 Thread Yongji Xie

On Mon, 10 Dec 2018 at 17:32, Jason Wang  wrote:
>
>
> On 2018/12/6 下午9:59, Michael S. Tsirkin wrote:
> > On Thu, Dec 06, 2018 at 09:57:22PM +0800, Jason Wang wrote:
> >> On 2018/12/6 下午2:35,elohi...@gmail.com  wrote:
> >>> From: Xie Yongji
> >>>
> >>> This patchset is aimed at supporting qemu to reconnect
> >>> vhost-user-blk backend after vhost-user-blk backend crash or
> >>> restart.
> >>>
> >>> The patch 1 tries to implenment the sync connection for
> >>> "reconnect socket".
> >>>
> >>> The patch 2 introduces a new message VHOST_USER_SET_VRING_INFLIGHT
> >>> to support offering shared memory to backend to record
> >>> its inflight I/O.
> >>>
> >>> The patch 3,4 are the corresponding libvhost-user patches of
> >>> patch 2. Make libvhost-user support VHOST_USER_SET_VRING_INFLIGHT.
> >>>
> >>> The patch 5 supports vhost-user-blk to reconnect backend when
> >>> connection closed.
> >>>
> >>> The patch 6 tells qemu that we support reconnecting now.
> >>>
> >>> To use it, we could start qemu with:
> >>>
> >>> qemu-system-x86_64 \
> >>>   -chardev 
> >>> socket,id=char0,path=/path/vhost.socket,reconnect=1,wait \
> >>>   -device vhost-user-blk-pci,chardev=char0 \
> >>>
> >>> and start vhost-user-blk backend with:
> >>>
> >>> vhost-user-blk -b /path/file -s /path/vhost.socket
> >>>
> >>> Then we can restart vhost-user-blk at any time during VM running.
> >> I wonder whether or not it's better to handle this at the level of virtio
> >> protocol itself instead of vhost-user level. E.g expose last_avail_idx to
> >> driver might be sufficient?
> >>
> >> Another possible issue is, looks like you need to deal with different kinds
> >> of ring layouts e.g packed virtqueues.
> >>
> >> Thanks
> > I'm not sure I understand your comments here.
> > All these would be guest-visible extensions.
>
>
> Looks not, it only introduces a shared memory between qemu and
> vhost-user backend?
>
>
> > Possible for sure but how is this related to
> > a patch supporting transparent reconnects?
>
>
> I might miss something. My understanding is that we support transparent
> reconnects, but we can't deduce an accurate last_avail_idx and this is
> what capability this series try to add. To me, this series is functional
> equivalent to expose last_avail_idx (or avail_idx_cons) in available
> ring. So the information is inside guest memory, vhost-user backend can
> access it and update it directly. I believe this is some modern NIC did
> as well (but index is in MMIO area of course).
>

Hi Jason,

If my understand is correct, it might be not enough to only expose
last_avail_idx.
Because we would not process descriptors in the same order in which they have
been made available sometimes. If so, we can't get correct inflight
I/O information
from available ring.

Thanks,
Yongji

Re: [Qemu-devel] [RFC v3 08/24] riscv: tcg-target: Add support for the constraints

2018-12-11 Thread Richard Henderson

On 12/7/18 6:47 PM, Alistair Francis wrote:
> +if ((ct & TCG_CT_CONST_N12) && val == sextreg(-val, 0, 12)) {

 -val == sextreg(-val, 0, 12)

What's here will of course always fail.


r~

Re: [Qemu-devel] [PATCH V12 0/5] add pvpanic mmio support

2018-12-11 Thread peng.hao2

>> v11 --> v12
>>  realize pvpanic as a pci device and use the mmio of pci device.
>
>Do you have a pointer to the kernel patches?
>
>Thanks,
>>drew
>
I'm still sorting out the code for the kernel part, and I haven't submitted a 
patch yet.

Re: [Qemu-devel] [PATCH 2/5] pvrdma: add uar_read routine

2018-12-11 Thread 李强




At 2018-12-11 23:22:32, "Yuval Shaia"  wrote:
>On Tue, Dec 11, 2018 at 06:56:39PM +0530, P J P wrote:
>> From: Prasad J Pandit 
>> 
>> Define skeleton 'uar_read' routine. Avoid NULL dereference.
>> 
>> Reported-by: Li Qiang 
>> Signed-off-by: Prasad J Pandit 
>> ---
>>  hw/rdma/vmw/pvrdma_main.c | 6 ++
>>  1 file changed, 6 insertions(+)
>> 
>> diff --git a/hw/rdma/vmw/pvrdma_main.c b/hw/rdma/vmw/pvrdma_main.c
>> index ca5fa8d981..a6211d416d 100644
>> --- a/hw/rdma/vmw/pvrdma_main.c
>> +++ b/hw/rdma/vmw/pvrdma_main.c
>> @@ -455,6 +455,11 @@ static const MemoryRegionOps regs_ops = {
>>  },
>>  };
>>  
>> +static uint64_t uar_read(void *opaque, hwaddr addr, unsigned size)
>> +{
>> +return 0;
>> +}
>> +
>>  static void uar_write(void *opaque, hwaddr addr, uint64_t val, unsigned 
>> size)
>>  {
>>  PVRDMADev *dev = opaque;
>> @@ -496,6 +501,7 @@ static void uar_write(void *opaque, hwaddr addr, 
>> uint64_t val, unsigned size)
>>  }
>>  
>>  static const MemoryRegionOps uar_ops = {
>> +.read = uar_read,
>

>Are you sure it is needed?


I'm quite sure this.
The issue here is that in memory_region_dispatch_read1
if there is no mr's read callback, the 'memory_region_read_with_attrs_accessor' 
will be called, but in that the 'mr->ops->raed_with_attrs' has no check.


In fact, I have send out a patch for the framework:
-->https://lists.gnu.org/archive/html/qemu-devel/2018-11/msg02265.html


But no more response.


>Looking at memory_region_dispatch_read1 i can see that there is a check but 
>>not sure this is the right place. Anyways, if it is not, i believe this
>should be framework responsibility.


Reference Peter's answer here:
-->https://lists.gnu.org/archive/html/qemu-devel/2018-09/msg01404.html


"Currently our semantics are "you must provide both read and write, even
if one of them just always returns 0 / does nothing / returns an error".
We could probably reasonably assert this at the point when the
MemoryRegionOps is registered."




Thanks,
Li Qiang


> >> .write = uar_write, >> .endianness = DEVICE_LITTLE_ENDIAN, >> .impl = { >> 
> >> -- >> 2.19.2 >>

Re: [Qemu-devel] [PATCH v7 16/19] spapr: introduce a new sPAPR IRQ backend supporting XIVE and XICS

2018-12-11 Thread David Gibson

On Tue, Dec 11, 2018 at 11:19:01AM +0100, Cédric Le Goater wrote:
> On 12/11/18 3:03 AM, David Gibson wrote:
> > On Sun, Dec 09, 2018 at 08:46:07PM +0100, Cédric Le Goater wrote:
> >> The interrupt mode is chosen by the CAS negotiation process and
> >> activated after a reset to take into account the required changes in
> >> the machine. These impact the device tree layout, the interrupt
> >> presenter object and the exposed MMIO regions in the case of XIVE.
> >>
> >> This default interrupt mode for the machine is XICS.
> >>
> >> Signed-off-by: Cédric Le Goater 
> >> ---
> >>  include/hw/ppc/spapr_irq.h |   1 +
> >>  hw/ppc/spapr.c |   3 +-
> >>  hw/ppc/spapr_hcall.c   |  13 
> >>  hw/ppc/spapr_irq.c | 143 +
> >>  4 files changed, 159 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/include/hw/ppc/spapr_irq.h b/include/hw/ppc/spapr_irq.h
> >> index b34d5a00381b..29936498dbc8 100644
> >> --- a/include/hw/ppc/spapr_irq.h
> >> +++ b/include/hw/ppc/spapr_irq.h
> >> @@ -51,6 +51,7 @@ typedef struct sPAPRIrq {
> >>  extern sPAPRIrq spapr_irq_xics;
> >>  extern sPAPRIrq spapr_irq_xics_legacy;
> >>  extern sPAPRIrq spapr_irq_xive;
> >> +extern sPAPRIrq spapr_irq_dual;
> >>  
> >>  void spapr_irq_init(sPAPRMachineState *spapr, Error **errp);
> >>  int spapr_irq_claim(sPAPRMachineState *spapr, int irq, bool lsi, Error 
> >> **errp);
> >> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> >> index 5ef87a00f68b..fa41927d95dd 100644
> >> --- a/hw/ppc/spapr.c
> >> +++ b/hw/ppc/spapr.c
> >> @@ -2631,7 +2631,8 @@ static void spapr_machine_init(MachineState *machine)
> >>  spapr_ovec_set(spapr->ov5, OV5_DRMEM_V2);
> >>  
> >>  /* advertise XIVE */
> >> -if (smc->irq->ov5 == SPAPR_OV5_XIVE_EXPLOIT) {
> >> +if (smc->irq->ov5 == SPAPR_OV5_XIVE_EXPLOIT ||
> >> +smc->irq->ov5 == SPAPR_OV5_XIVE_BOTH) {
> >>  spapr_ovec_set(spapr->ov5, OV5_XIVE_EXPLOIT);
> >>  }
> >>  
> >> diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
> >> index ae913d070f50..186b6a65543f 100644
> >> --- a/hw/ppc/spapr_hcall.c
> >> +++ b/hw/ppc/spapr_hcall.c
> >> @@ -1654,6 +1654,19 @@ static target_ulong 
> >> h_client_architecture_support(PowerPCCPU *cpu,
> >>  (spapr_h_cas_compose_response(spapr, args[1], args[2],
> >>ov5_updates) != 0);
> >>  }
> >> +
> >> +/*
> >> + * Generate a machine reset when we have an update of the
> >> + * interrupt mode. Only required on the machine supporting both
> >> + * mode.
> >> + */
> >> +if (!spapr->cas_reboot) {
> >> +sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
> >> +
> >> +spapr->cas_reboot = spapr_ovec_test(ov5_updates, OV5_XIVE_EXPLOIT)
> >> +&& smc->irq->ov5 == SPAPR_OV5_XIVE_BOTH;
> >> +}
> >> +
> >>  spapr_ovec_cleanup(ov5_updates);
> >>  
> >>  if (spapr->cas_reboot) {
> >> diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
> >> index a8e50725397c..7c34939f774a 100644
> >> --- a/hw/ppc/spapr_irq.c
> >> +++ b/hw/ppc/spapr_irq.c
> >> @@ -392,6 +392,149 @@ sPAPRIrq spapr_irq_xive = {
> >>  .reset   = spapr_irq_reset_xive,
> >>  };
> >>  
> >> +/*
> >> + * Dual XIVE and XICS IRQ backend.
> >> + *
> >> + * Both interrupt mode, XIVE and XICS, objects are created but the
> >> + * machine starts in legacy interrupt mode (XICS). It can be changed
> >> + * by the CAS negotiation process and, in that case, the new mode is
> >> + * activated after extra machine reset.
> >> + */
> >> +
> >> +/*
> >> + * Returns the sPAPR IRQ backend negotiated by CAS. XICS is the
> >> + * default.
> >> + */
> >> +static sPAPRIrq *spapr_irq_current(sPAPRMachineState *spapr)
> >> +{
> >> +return spapr_ovec_test(spapr->ov5_cas, OV5_XIVE_EXPLOIT) ?
> >> +_irq_xive : _irq_xics;
> >> +}
> >> +
> >> +static void spapr_irq_init_dual(sPAPRMachineState *spapr, Error **errp)
> >> +{
> >> +MachineState *machine = MACHINE(spapr);
> >> +Error *local_err = NULL;
> >> +
> >> +if (kvm_enabled() && machine_kernel_irqchip_allowed(machine)) {
> >> +error_setg(errp, "No KVM support for the 'dual' machine");
> >> +return;
> >> +}
> >> +
> >> +spapr_irq_xics.init(spapr, _err);
> >> +if (local_err) {
> >> +error_propagate(errp, local_err);
> >> +return;
> >> +}
> >> +
> >> +spapr_irq_xive.init(spapr, _err);
> >> +if (local_err) {
> >> +error_propagate(errp, local_err);
> >> +return;
> >> +}
> >> +}
> >> +
> >> +static int spapr_irq_claim_dual(sPAPRMachineState *spapr, int irq, bool 
> >> lsi,
> >> +Error **errp)
> >> +{
> >> +int ret;
> >> +Error *local_err = NULL;
> >> +
> >> +ret = spapr_irq_xive.claim(spapr, irq, lsi, _err);
> >> +if (local_err) {
> >> +error_propagate(errp, local_err);
> >> +return ret;
> >> +}
> >> +
> >> +ret =

Re: [Qemu-devel] [PATCH v2] target/i386: Fixes to the check missing features routine

2018-12-11 Thread Eduardo Habkost

On Tue, Dec 11, 2018 at 11:28:46AM -0500, Wainer dos Santos Moschetta wrote:
> The x86_cpu_class_check_missing_features() returns a list
> of unavailable features compared to the host CPU. Currently it may
> return empty strings for unnamed features as well as duplicated
> names.
> 
> For example, the qmp "query-cpu-definitions" below shows one empty
> string and repeated "mpx" entries:
> 
> (...)
> {"execute": "query-cpu-definitions"}
> (...)
> {
> "name": "Cascadelake-Server",
> "typename": "Cascadelake-Server-x86_64-cpu",
> "unavailable-features": [
> "hle",
> "rtm",
> "mpx",
> "avx512f",
> "avx512dq",
> "rdseed",
> "adx",
> "smap",
> "clflushopt",
> "clwb",
> "intel-pt",
> "avx512cd",
> "avx512bw",
> "avx512vl",
> "pku",
> "",

I just noticed one thing: we probably want to find out the cause
of this empty entry, instead of ignoring it in the code.  Named
CPU models must only refer to named CPU features.

I think this is caused by CPUID_7_0_ECX_OSPKE, I will
investigate.  But this doesn't make your patch incorrect.


> "avx512vnni",
> "spec-ctrl",
> "ssbd",
> "3dnowprefetch",
> "xsavec",
> "xgetbv1",
> "mpx",
> "mpx",
> "avx512f",
> "avx512f",
> "avx512f",
> "pku"
> ],
> (...)
> 
> Signed-off-by: Wainer dos Santos Moschetta 
> Reviewed-by: Eduardo Habkost 
> Reviewed-by: Eric Blake 
> Reviewed-by: Caio Carrara 
> ---
> v2:
>  * Fixed typos. [eblake]
>  * Removed unwanted manual test case. [ccarrara, ehabkost]
>  * Not passing 'accel=kvm' on test's VM. [ehabkost]
>  * Removed unneeded g_strdup() call. [ehabkost]
>  * Formatted comment according to QEMU's coding style. [ehabkost]
> 
> v1: https://www.mail-archive.com/qemu-devel@nongnu.org/msg579404.html
> ---
>  target/i386/cpu.c   | 11 -
>  tests/acceptance/cpu_definitions.py | 35 +
>  2 files changed, 45 insertions(+), 1 deletion(-)
>  create mode 100644 tests/acceptance/cpu_definitions.py
> 
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index f81d35e1f9..014b91e608 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -3615,19 +3615,28 @@ static void 
> x86_cpu_class_check_missing_features(X86CPUClass *xcc,
>  
>  x86_cpu_filter_features(xc);
>  
> +/* Auxiliary dictionary to avoid duplicate entries in the list. */
> +QDict *unique_feats_dict = qdict_new();
> +
>  for (w = 0; w < FEATURE_WORDS; w++) {
>  uint32_t filtered = xc->filtered_features[w];
>  int i;
>  for (i = 0; i < 32; i++) {
>  if (filtered & (1UL << i)) {
> +const char *fname = x86_cpu_feature_name(w, i);
> +if (!fname || qdict_haskey(unique_feats_dict, fname)) {
> +continue;
> +}
>  strList *new = g_new0(strList, 1);

I like mixed declarations, but unfortunately they are not allowed
by CODING_STYLE:

5. Declarations

Mixed declarations (interleaving statements and declarations within
blocks) are generally not allowed; declarations should be at the 
beginning
of blocks.


The logic now looks good, though.

> -new->value = g_strdup(x86_cpu_feature_name(w, i));
> +new->value = g_strdup(fname);
>  *next = new;
>  next = >next;
> +qdict_put_null(unique_feats_dict, new->value);
>  }
>  }
>  }
>  
> +g_free(unique_feats_dict);
>  object_unref(OBJECT(xc));
>  }
>  
> diff --git a/tests/acceptance/cpu_definitions.py 
> b/tests/acceptance/cpu_definitions.py
> new file mode 100644
> index 00..4edad86799
> --- /dev/null
> +++ b/tests/acceptance/cpu_definitions.py
> @@ -0,0 +1,35 @@
> +# CPU definitions tests.
> +#
> +# Copyright (c) 2018 Red Hat, Inc.
> +#
> +# Author:
> +#  Wainer dos Santos Moschetta 
> +#
> +# This work is licensed under the terms of the GNU GPL, version 2 or
> +# later.  See the COPYING file in the top-level directory.
> +
> +from avocado import skip
> +from avocado_qemu import Test
> +
> +
> +class CPUDefinitions(Test):
> +"""
> +Tests for the CPU definitions.
> +
> +:avocado: enable
> +:avocado: tags=x86_64
> +"""
> +def test_unavailable_features(self):
> +self.vm.add_args("-machine", "q35")

I thought the explicit -machine option was here only because of
the old accel=kvm option, and the whole line would be removed.
Why do you need to explicitly ask for a Q35 machine to test this?


> +

Re: [Qemu-devel] [PATCH for-4.0 v4 0/2] virtio: Provide version-specific variants of virtio PCI devices

2018-12-11 Thread Michael S. Tsirkin

Nothing, I'm packing up the 1st pull request.

On Tue, Dec 11, 2018 at 11:18:51PM -0200, Eduardo Habkost wrote:
> Friendly ping.  3.1.0 is tagged now, so there's anything else
> blocking this series?
> 
> 
> On Wed, Dec 05, 2018 at 05:57:02PM -0200, Eduardo Habkost wrote:
> > Existing modern-only device types are not being touched by v3, as
> > they don't need separate variants.  However, I plan to implement
> > separate cleanups in the code that calls virtio_pci_force_virtio_1(),
> > first, and then propose additional changes (e.g. deprecating
> > disable-legacy and disable-modern in those device types).
> > 
> > Changes v3 -> v4:
> > * Trivial comment fixes (Cornelia Huck)
> > * Test code update (Caio Carrara)
> > 
> > Changes v2 -> v3:
> > * Split into two separate patches (type registration helper
> >   and introduction of new types)
> > * Rewrote virtio_pci_types_register() completely:
> >   * Replaced magic generation of type names with explicit fields in
> > VirtioPCIDeviceTypeInfo
> >   * Removed modern_only field (not necessary anymore)
> >   * Don't register a separate base type unless necessary
> > 
> > Changes v1 -> v2:
> > * Removed *-0.9 devices.  Nobody will want to use them, if
> >   transitional devices work with legacy drivers
> >   (Gerd Hoffmann, Michael S. Tsirkin)
> > * Drop virtio version from name: rename -1.0-transitional to
> >   -transitional (Michael S. Tsirkin)
> > * Renamed -1.0 to -non-transitional
> > * Don't add any extra variants to modern-only device types
> >   (they don't need it)
> > * Fix typo on TYPE_VIRTIO_INPUT_HOST_PCI (crash reported by
> >   Cornelia Huck)
> > * No need to change cast macros for modern-only devices
> > * Rename virtio_register_types() to virtio_pci_types_register()
> > 
> > Original patch description:
> > 
> > Many of the current virtio-*-pci device types actually represent
> > 3 different types of devices:
> > * virtio 1.0 non-transitional devices
> > * virtio 1.0 transitional devices
> > * virtio 0.9 ("legacy device" in virtio 1.0 terminology)
> > 
> > That would be just an annoyance if it didn't break our device/bus
> > compatibility QMP interfaces.  With this multi-purpose device
> > type, there's no way to tell management software that
> > transitional devices and legacy devices require a Conventional
> > PCI bus.
> > 
> > The multi-purpose device types would also prevent us from telling
> > management software what's the PCI vendor/device ID for them,
> > because their PCI IDs change at runtime depending on the bus
> > where they were plugged.
> > 
> > This patch adds separate device types for each of those virtio
> > device flavors:
> > 
> > * virtio-*-pci: the existing multi-purpose device types
> > * virtio-*-pci-transitional: virtio-1.0 device supporting legacy drivers
> > * virtio-*-pci-non-transitional: modern-only
> > 
> > Reference to previous discussion that originated this idea:
> >   https://www.mail-archive.com/qemu-devel@nongnu.org/msg558389.html
> > 
> > Eduardo Habkost (2):
> >   virtio: Helper for registering virtio device types
> >   virtio: Provide version-specific variants of virtio PCI devices
> > 
> >  hw/virtio/virtio-pci.h |  78 +++--
> >  hw/display/virtio-gpu-pci.c|   7 +-
> >  hw/display/virtio-vga.c|   7 +-
> >  hw/virtio/virtio-crypto-pci.c  |   7 +-
> >  hw/virtio/virtio-pci.c | 267 ++---
> >  tests/acceptance/virtio_version.py | 176 +++
> >  6 files changed, 452 insertions(+), 90 deletions(-)
> >  create mode 100644 tests/acceptance/virtio_version.py
> > 
> > -- 
> > 2.18.0.rc1.1.g3f1ff2140
> > 
> 
> -- 
> Eduardo

Re: [Qemu-devel] [PATCH for-4.0 v4 0/2] virtio: Provide version-specific variants of virtio PCI devices

2018-12-11 Thread Eduardo Habkost

Friendly ping.  3.1.0 is tagged now, so there's anything else
blocking this series?


On Wed, Dec 05, 2018 at 05:57:02PM -0200, Eduardo Habkost wrote:
> Existing modern-only device types are not being touched by v3, as
> they don't need separate variants.  However, I plan to implement
> separate cleanups in the code that calls virtio_pci_force_virtio_1(),
> first, and then propose additional changes (e.g. deprecating
> disable-legacy and disable-modern in those device types).
> 
> Changes v3 -> v4:
> * Trivial comment fixes (Cornelia Huck)
> * Test code update (Caio Carrara)
> 
> Changes v2 -> v3:
> * Split into two separate patches (type registration helper
>   and introduction of new types)
> * Rewrote virtio_pci_types_register() completely:
>   * Replaced magic generation of type names with explicit fields in
> VirtioPCIDeviceTypeInfo
>   * Removed modern_only field (not necessary anymore)
>   * Don't register a separate base type unless necessary
> 
> Changes v1 -> v2:
> * Removed *-0.9 devices.  Nobody will want to use them, if
>   transitional devices work with legacy drivers
>   (Gerd Hoffmann, Michael S. Tsirkin)
> * Drop virtio version from name: rename -1.0-transitional to
>   -transitional (Michael S. Tsirkin)
> * Renamed -1.0 to -non-transitional
> * Don't add any extra variants to modern-only device types
>   (they don't need it)
> * Fix typo on TYPE_VIRTIO_INPUT_HOST_PCI (crash reported by
>   Cornelia Huck)
> * No need to change cast macros for modern-only devices
> * Rename virtio_register_types() to virtio_pci_types_register()
> 
> Original patch description:
> 
> Many of the current virtio-*-pci device types actually represent
> 3 different types of devices:
> * virtio 1.0 non-transitional devices
> * virtio 1.0 transitional devices
> * virtio 0.9 ("legacy device" in virtio 1.0 terminology)
> 
> That would be just an annoyance if it didn't break our device/bus
> compatibility QMP interfaces.  With this multi-purpose device
> type, there's no way to tell management software that
> transitional devices and legacy devices require a Conventional
> PCI bus.
> 
> The multi-purpose device types would also prevent us from telling
> management software what's the PCI vendor/device ID for them,
> because their PCI IDs change at runtime depending on the bus
> where they were plugged.
> 
> This patch adds separate device types for each of those virtio
> device flavors:
> 
> * virtio-*-pci: the existing multi-purpose device types
> * virtio-*-pci-transitional: virtio-1.0 device supporting legacy drivers
> * virtio-*-pci-non-transitional: modern-only
> 
> Reference to previous discussion that originated this idea:
>   https://www.mail-archive.com/qemu-devel@nongnu.org/msg558389.html
> 
> Eduardo Habkost (2):
>   virtio: Helper for registering virtio device types
>   virtio: Provide version-specific variants of virtio PCI devices
> 
>  hw/virtio/virtio-pci.h |  78 +++--
>  hw/display/virtio-gpu-pci.c|   7 +-
>  hw/display/virtio-vga.c|   7 +-
>  hw/virtio/virtio-crypto-pci.c  |   7 +-
>  hw/virtio/virtio-pci.c | 267 ++---
>  tests/acceptance/virtio_version.py | 176 +++
>  6 files changed, 452 insertions(+), 90 deletions(-)
>  create mode 100644 tests/acceptance/virtio_version.py
> 
> -- 
> 2.18.0.rc1.1.g3f1ff2140
> 

-- 
Eduardo

Re: [Qemu-devel] [PATCH v7 09/19] spapr: add device tree support for the XIVE exploitation mode

2018-12-11 Thread David Gibson

On Tue, Dec 11, 2018 at 10:06:46AM +0100, Cédric Le Goater wrote:
> On 12/11/18 1:38 AM, David Gibson wrote:
> > On Mon, Dec 10, 2018 at 08:53:17AM +0100, Cédric Le Goater wrote:
> >> On 12/10/18 7:39 AM, David Gibson wrote:
> >>> On Sun, Dec 09, 2018 at 08:46:00PM +0100, Cédric Le Goater wrote:
>  The XIVE interface for the guest is described in the device tree under
>  the "interrupt-controller" node. A couple of new properties are
>  specific to XIVE :
> 
>   - "reg"
> 
> contains the base address and size of the thread interrupt
> managnement areas (TIMA), for the User level and for the Guest OS
> level. Only the Guest OS level is taken into account today.
> 
>   - "ibm,xive-eq-sizes"
> 
> the size of the event queues. One cell per size supported, contains
> log2 of size, in ascending order.
> 
>   - "ibm,xive-lisn-ranges"
> 
> the IRQ interrupt number ranges assigned to the guest for the IPIs.
> 
>  and also under the root node :
> 
>   - "ibm,plat-res-int-priorities"
> 
> contains a list of priorities that the hypervisor has reserved for
> its own use. OPAL uses the priority 7 queue to automatically
> escalate interrupts for all other queues (DD2.X POWER9). So only
> priorities [0..6] are allowed for the guest.
> 
>  Extend the sPAPR IRQ backend with a new handler to populate the DT
>  with the appropriate "interrupt-controller" node.
> 
>  Signed-off-by: Cédric Le Goater 
>  ---
>   include/hw/ppc/spapr_irq.h  |  2 ++
>   include/hw/ppc/spapr_xive.h |  2 ++
>   include/hw/ppc/xics.h   |  4 +--
>   hw/intc/spapr_xive.c| 64 +
>   hw/intc/xics_spapr.c|  3 +-
>   hw/ppc/spapr.c  |  3 +-
>   hw/ppc/spapr_irq.c  |  3 ++
>   7 files changed, 77 insertions(+), 4 deletions(-)
> 
>  diff --git a/include/hw/ppc/spapr_irq.h b/include/hw/ppc/spapr_irq.h
>  index 23cdb51b879e..e51e9f052f63 100644
>  --- a/include/hw/ppc/spapr_irq.h
>  +++ b/include/hw/ppc/spapr_irq.h
>  @@ -39,6 +39,8 @@ typedef struct sPAPRIrq {
>   void (*free)(sPAPRMachineState *spapr, int irq, int num);
>   qemu_irq (*qirq)(sPAPRMachineState *spapr, int irq);
>   void (*print_info)(sPAPRMachineState *spapr, Monitor *mon);
>  +void (*dt_populate)(sPAPRMachineState *spapr, uint32_t nr_servers,
>  +void *fdt, uint32_t phandle);
>   } sPAPRIrq;
>   
>   extern sPAPRIrq spapr_irq_xics;
>  diff --git a/include/hw/ppc/spapr_xive.h b/include/hw/ppc/spapr_xive.h
>  index 9506a8f4d10a..728a5e8dc163 100644
>  --- a/include/hw/ppc/spapr_xive.h
>  +++ b/include/hw/ppc/spapr_xive.h
>  @@ -45,5 +45,7 @@ qemu_irq spapr_xive_qirq(sPAPRXive *xive, uint32_t 
>  lisn);
>   typedef struct sPAPRMachineState sPAPRMachineState;
>   
>   void spapr_xive_hcall_init(sPAPRMachineState *spapr);
>  +void spapr_dt_xive(sPAPRMachineState *spapr, uint32_t nr_servers, void 
>  *fdt,
>  +   uint32_t phandle);
>   
>   #endif /* PPC_SPAPR_XIVE_H */
>  diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
>  index 9958443d1984..14afda198cdb 100644
>  --- a/include/hw/ppc/xics.h
>  +++ b/include/hw/ppc/xics.h
>  @@ -181,8 +181,6 @@ typedef struct XICSFabricClass {
>   ICPState *(*icp_get)(XICSFabric *xi, int server);
>   } XICSFabricClass;
>   
>  -void spapr_dt_xics(int nr_servers, void *fdt, uint32_t phandle);
>  -
>   ICPState *xics_icp_get(XICSFabric *xi, int server);
>   
>   /* Internal XICS interfaces */
>  @@ -204,6 +202,8 @@ void icp_resend(ICPState *ss);
>   
>   typedef struct sPAPRMachineState sPAPRMachineState;
>   
>  +void spapr_dt_xics(sPAPRMachineState *spapr, uint32_t nr_servers, void 
>  *fdt,
>  +   uint32_t phandle);
>   int xics_kvm_init(sPAPRMachineState *spapr, Error **errp);
>   void xics_spapr_init(sPAPRMachineState *spapr);
>   
>  diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
>  index 982ac6e17051..a6d854b07690 100644
>  --- a/hw/intc/spapr_xive.c
>  +++ b/hw/intc/spapr_xive.c
>  @@ -14,6 +14,7 @@
>   #include "target/ppc/cpu.h"
>   #include "sysemu/cpus.h"
>   #include "monitor/monitor.h"
>  +#include "hw/ppc/fdt.h"
>   #include "hw/ppc/spapr.h"
>   #include "hw/ppc/spapr_xive.h"
>   #include "hw/ppc/xive.h"
>  @@ -1381,3 +1382,66 @@ void spapr_xive_hcall_init(sPAPRMachineState 
>  *spapr)
>   spapr_register_hypercall(H_INT_SYNC, h_int_sync);
>   spapr_register_hypercall(H_INT_RESET, h_int_reset);
>   }
>  +
>  +void spapr_dt_xive(sPAPRMachineState *spapr, uint32_t nr_servers, void 
>  *fdt,
>

Re: [Qemu-devel] [Qemu-ppc] [PATCH qemu] ppc/spapr: Receive and store device tree blob from SLOF

2018-12-11 Thread David Gibson

On Tue, Dec 11, 2018 at 02:36:09PM +1100, Alexey Kardashevskiy wrote:
> 
> 
> On 10/12/2018 17:20, David Gibson wrote:
> > On Mon, Nov 12, 2018 at 03:12:26PM +1100, Alexey Kardashevskiy wrote:
> >>
> >>
> >> On 12/11/2018 05:10, Greg Kurz wrote:
> >>> Hi Alexey,
> >>>
> >>> Just a few remarks. See below.
> >>>
> >>> On Thu,  8 Nov 2018 12:44:06 +1100
> >>> Alexey Kardashevskiy  wrote:
> >>>
>  SLOF receives a device tree and updates it with various properties
>  before switching to the guest kernel and QEMU is not aware of any changes
>  made by SLOF. Since there is no real RTAS (QEMU implements it), it makes
>  sense to pass the SLOF final device tree to QEMU to let it implement
>  RTAS related tasks better, such as PCI host bus adapter hotplug.
> 
>  Specifially, now QEMU can find out the actual XICS phandle (for PHB
>  hotplug) and the RTAS linux,rtas-entry/base properties (for firmware
>  assisted NMI - FWNMI).
> 
>  This stores the initial DT blob in the sPAPR machine and replaces it
>  in the KVMPPC_H_UPDATE_DT (new private hypercall) handler.
> 
>  This adds an @update_dt_enabled machine property to allow backward
>  migration.
> 
>  SLOF already has a hypercall since
>  https://github.com/aik/SLOF/commit/e6fc84652c9c0073f9183
> 
>  Signed-off-by: Alexey Kardashevskiy 
>  ---
>   include/hw/ppc/spapr.h |  7 ++-
>   hw/ppc/spapr.c | 29 -
>   hw/ppc/spapr_hcall.c   | 32 
>   hw/ppc/trace-events|  2 ++
>   4 files changed, 68 insertions(+), 2 deletions(-)
> 
>  diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
>  index ad4d7cfd97..f5dcaf44cb 100644
>  --- a/include/hw/ppc/spapr.h
>  +++ b/include/hw/ppc/spapr.h
>  @@ -100,6 +100,7 @@ struct sPAPRMachineClass {
>   
>   /*< public >*/
>   bool dr_lmb_enabled;   /* enable dynamic-reconfig/hotplug of 
>  LMBs */
>  +bool update_dt_enabled;/* enable KVMPPC_H_UPDATE_DT */
>   bool use_ohci_by_default;  /* use USB-OHCI instead of XHCI */
>   bool pre_2_10_has_unused_icps;
>   bool legacy_irq_allocation;
>  @@ -136,6 +137,9 @@ struct sPAPRMachineState {
>   int vrma_adjust;
>   ssize_t rtas_size;
>   void *rtas_blob;
>  +uint32_t fdt_size;
>  +uint32_t fdt_initial_size;
> >>>
> >>> I don't quite see the purpose of fdt_initial_size... it seems to be only
> >>> used to print a trace.
> >>
> >>
> >> Ah, lost in rebase. The purpose was to test if the new device tree has
> >> not grown too much.
> >>
> >>
> >>
> >>>
>  +void *fdt_blob;
>   long kernel_size;
>   bool kernel_le;
>   uint32_t initrd_base;
>  @@ -462,7 +466,8 @@ struct sPAPRMachineState {
>   #define KVMPPC_H_LOGICAL_MEMOP  (KVMPPC_HCALL_BASE + 0x1)
>   /* Client Architecture support */
>   #define KVMPPC_H_CAS(KVMPPC_HCALL_BASE + 0x2)
>  -#define KVMPPC_HCALL_MAXKVMPPC_H_CAS
>  +#define KVMPPC_H_UPDATE_DT  (KVMPPC_HCALL_BASE + 0x3)
>  +#define KVMPPC_HCALL_MAXKVMPPC_H_UPDATE_DT
>   
>   typedef struct sPAPRDeviceTreeUpdateHeader {
>   uint32_t version_id;
>  diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>  index c08130facb..5e2d4d211c 100644
>  --- a/hw/ppc/spapr.c
>  +++ b/hw/ppc/spapr.c
>  @@ -1633,7 +1633,10 @@ static void spapr_machine_reset(void)
>   /* Load the fdt */
>   qemu_fdt_dumpdtb(fdt, fdt_totalsize(fdt));
>   cpu_physical_memory_write(fdt_addr, fdt, fdt_totalsize(fdt));
>  -g_free(fdt);
>  +g_free(spapr->fdt_blob);
>  +spapr->fdt_size = fdt_totalsize(fdt);
>  +spapr->fdt_initial_size = spapr->fdt_size;
>  +spapr->fdt_blob = fdt;
> >>>
> >>> Hmm... It looks weird to store state in a reset handler. I'd rather zeroe
> >>> both fdt_blob and fdt_size here.
> >>
> >> The device tree is built from the reset handler and the idea is that we
> >> want to always have some tree in the machine.
> > 
> > Yes, I think the approach here is fine.  Otherwise when we want to
> > look up the current fdt state in RTAS calls or whatever we'd always
> > have to do
> > if (fdt_blob)
> > look up that
> > else
> > look up qemu created fdt.
> > 
> > Incidentally 'fdt' and 'fdt_blob' names do a terrible job of
> > distinguishing what the difference is.  Renaming fdt to fdt_initial
> > (to match fdt_initial_size) and fdt_blob to fdt should make that
> > clearer.
> 
> There is just one fdt in sPAPRMachineState - it is fdt_blob as it is
> flattened. The "fdt" symbol above is local to spapr_machine_reset() and
> when the tree is built - it is stored in fdt_blob.

Uh, sorry, I misread.  I'll look more carefully at the next spin.

-- 
David Gibson| I'll have my

Re: [Qemu-devel] [PATCH v7 18/19] spapr: add a 'pseries-4.0-xive' machine type

2018-12-11 Thread David Gibson

On Tue, Dec 11, 2018 at 11:42:03AM +0100, Cédric Le Goater wrote:
> On 12/11/18 3:06 AM, David Gibson wrote:
> > On Mon, Dec 10, 2018 at 11:17:33PM +0100, Cédric Le Goater wrote:
> >> On 12/9/18 8:46 PM, Cédric Le Goater wrote:
> >>> This pseries machine makes use of a new sPAPR IRQ backend supporting
> >>> the XIVE interrupt mode.
> >>>
> >>> The guest OS is required to have support for the XIVE exploitation
> >>> mode of the POWER9 interrupt controller.
> >>>
> >>> Signed-off-by: Cédric Le Goater 
> >>> ---
> >>>  hw/ppc/spapr.c | 15 +++
> >>>  1 file changed, 15 insertions(+)
> >>>
> >>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> >>> index 4012ebd794a4..3cc134a0b673 100644
> >>> --- a/hw/ppc/spapr.c
> >>> +++ b/hw/ppc/spapr.c
> >>> @@ -3985,6 +3985,21 @@ static void 
> >>> spapr_machine_4_0_class_options(MachineClass *mc)
> >>>  
> >>>  DEFINE_SPAPR_MACHINE(4_0, "4.0", true);
> >>>  
> >>> +static void spapr_machine_4_0_xive_instance_options(MachineState 
> >>> *machine)
> >>> +{
> >>> +spapr_machine_4_0_instance_options(machine);
> >>> +}
> >>> +
> >>> +static void spapr_machine_4_0_xive_class_options(MachineClass *mc)
> >>> +{
> >>> +sPAPRMachineClass *smc = SPAPR_MACHINE_CLASS(mc);
> >>> +
> >>> +spapr_machine_4_0_class_options(mc);> +smc->irq = 
> >>> _irq_xive;
> >>
> >> I have been adding checks on the CPU model to export the XIVE capability 
> >> only on POWER9 processors but it breaks some of the tests.
> >>
> >> I was wondering if we could add a default POWER9 CPU to the -xive machine 
> >> : 
> >>
> >>   + mc->default_cpu_type = POWERPC_CPU_TYPE_NAME("power9_v2.0");
> >>
> >> and if we could change tests/cpu-plug-test.c with :
> >>
> >>   @@ -198,8 +198,13 @@ static void add_pseries_test_case(const
> >>}
> >>data = g_new(PlugTestData, 1);
> >>data->machine = g_strdup(mname);
> >>   -data->cpu_model = "power8_v2.0";
> >>   -data->device_model = g_strdup("power8_v2.0-spapr-cpu-core");
> >>   +if (g_str_has_suffix(mname, "xive")) {
> >>   +data->cpu_model = "power9_v2.0";
> >>   +data->device_model = g_strdup("power9_v2.0-spapr-cpu-core");
> >>   +} else {
> >>   +data->cpu_model = "power8_v2.0";
> >>   +data->device_model = g_strdup("power8_v2.0-spapr-cpu-core");
> >>   +}
> >>data->sockets = 2;
> >>data->cores = 3;
> >>data->threads = 1;
> >>
> >> or if there is a better way ?
> > 
> > So, I'd actually prefer a machine option, rather than wholly separate
> > machine types to select xics/xive/dual.  Machine types was fine while
> > prototyping this, but I don't think we want to actually merge new
> > machine types for it.
> 
> I agree. 
> 
> > So, instead I think we want a machine option which can be set to
> > xics/xive/dual, with xics being the default for earlier machine types
> > and dual the default for 4.0 onwards.
> 
> I will revive an old patch doing just that. 
> 
> The question now is how to link the sPAPRMachineState instance to 
> the selected sPAPR IRQ backend. 
> 
> I don't think we can move 'smc->irq' to sPAPRMachineState.

I think you could..

> So we will
> need an helper returning the appropriate backend depending on the machine 
> option and 'smc->irq' should disappear.

..but this approach might be easier.

> 
> > We can make POWER9 the default cpu for 4.0 onwards as well, if you want.
> 
> OK.
> 
> C.
> 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature

Re: [Qemu-devel] [Qemu-ppc] [PATCH qemu] ppc/spapr: Receive and store device tree blob from SLOF

2018-12-11 Thread David Gibson

On Tue, Dec 11, 2018 at 10:55:59AM +0100, Greg Kurz wrote:
> On Tue, 11 Dec 2018 14:53:32 +1100
> Alexey Kardashevskiy  wrote:
> 
> > On 10/12/2018 20:30, Greg Kurz wrote:
> > > On Mon, 10 Dec 2018 17:20:43 +1100
> > > David Gibson  wrote:
> > >   
> > >> On Mon, Nov 12, 2018 at 03:12:26PM +1100, Alexey Kardashevskiy wrote:  
> > >>>
> > >>>
> > >>> On 12/11/2018 05:10, Greg Kurz wrote:
> >  Hi Alexey,
> > 
> >  Just a few remarks. See below.
> > 
> >  On Thu,  8 Nov 2018 12:44:06 +1100
> >  Alexey Kardashevskiy  wrote:
> >  
> > > SLOF receives a device tree and updates it with various properties
> > > before switching to the guest kernel and QEMU is not aware of any 
> > > changes
> > > made by SLOF. Since there is no real RTAS (QEMU implements it), it 
> > > makes
> > > sense to pass the SLOF final device tree to QEMU to let it implement
> > > RTAS related tasks better, such as PCI host bus adapter hotplug.
> > >
> > > Specifially, now QEMU can find out the actual XICS phandle (for PHB
> > > hotplug) and the RTAS linux,rtas-entry/base properties (for firmware
> > > assisted NMI - FWNMI).
> > >
> > > This stores the initial DT blob in the sPAPR machine and replaces it
> > > in the KVMPPC_H_UPDATE_DT (new private hypercall) handler.
> > >
> > > This adds an @update_dt_enabled machine property to allow backward
> > > migration.
> > >
> > > SLOF already has a hypercall since
> > > https://github.com/aik/SLOF/commit/e6fc84652c9c0073f9183
> > >
> > > Signed-off-by: Alexey Kardashevskiy 
> > > ---
> > >  include/hw/ppc/spapr.h |  7 ++-
> > >  hw/ppc/spapr.c | 29 -
> > >  hw/ppc/spapr_hcall.c   | 32 
> > >  hw/ppc/trace-events|  2 ++
> > >  4 files changed, 68 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> > > index ad4d7cfd97..f5dcaf44cb 100644
> > > --- a/include/hw/ppc/spapr.h
> > > +++ b/include/hw/ppc/spapr.h
> > > @@ -100,6 +100,7 @@ struct sPAPRMachineClass {
> > >  
> > >  /*< public >*/
> > >  bool dr_lmb_enabled;   /* enable dynamic-reconfig/hotplug of 
> > > LMBs */
> > > +bool update_dt_enabled;/* enable KVMPPC_H_UPDATE_DT */
> > >  bool use_ohci_by_default;  /* use USB-OHCI instead of XHCI */
> > >  bool pre_2_10_has_unused_icps;
> > >  bool legacy_irq_allocation;
> > > @@ -136,6 +137,9 @@ struct sPAPRMachineState {
> > >  int vrma_adjust;
> > >  ssize_t rtas_size;
> > >  void *rtas_blob;
> > > +uint32_t fdt_size;
> > > +uint32_t fdt_initial_size;
> > 
> >  I don't quite see the purpose of fdt_initial_size... it seems to be 
> >  only
> >  used to print a trace.
> > >>>
> > >>>
> > >>> Ah, lost in rebase. The purpose was to test if the new device tree has
> > >>> not grown too much.
> > >>>
> > >>>
> > >>> 
> >  
> > > +void *fdt_blob;
> > >  long kernel_size;
> > >  bool kernel_le;
> > >  uint32_t initrd_base;
> > > @@ -462,7 +466,8 @@ struct sPAPRMachineState {
> > >  #define KVMPPC_H_LOGICAL_MEMOP  (KVMPPC_HCALL_BASE + 0x1)
> > >  /* Client Architecture support */
> > >  #define KVMPPC_H_CAS(KVMPPC_HCALL_BASE + 0x2)
> > > -#define KVMPPC_HCALL_MAXKVMPPC_H_CAS
> > > +#define KVMPPC_H_UPDATE_DT  (KVMPPC_HCALL_BASE + 0x3)
> > > +#define KVMPPC_HCALL_MAXKVMPPC_H_UPDATE_DT
> > >  
> > >  typedef struct sPAPRDeviceTreeUpdateHeader {
> > >  uint32_t version_id;
> > > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > > index c08130facb..5e2d4d211c 100644
> > > --- a/hw/ppc/spapr.c
> > > +++ b/hw/ppc/spapr.c
> > > @@ -1633,7 +1633,10 @@ static void spapr_machine_reset(void)
> > >  /* Load the fdt */
> > >  qemu_fdt_dumpdtb(fdt, fdt_totalsize(fdt));
> > >  cpu_physical_memory_write(fdt_addr, fdt, fdt_totalsize(fdt));
> > > -g_free(fdt);
> > > +g_free(spapr->fdt_blob);
> > > +spapr->fdt_size = fdt_totalsize(fdt);
> > > +spapr->fdt_initial_size = spapr->fdt_size;
> > > +spapr->fdt_blob = fdt;
> > 
> >  Hmm... It looks weird to store state in a reset handler. I'd rather 
> >  zeroe
> >  both fdt_blob and fdt_size here.
> > >>>
> > >>> The device tree is built from the reset handler and the idea is that we
> > >>> want to always have some tree in the machine.
> > >>
> > >> Yes, I think the approach here is fine.  Otherwise when we want to
> > >> look up the current fdt state in RTAS calls or whatever we'd always
> > >> have to do
> > >>  if (fdt_blob)
> > >>  look up that
> > >>  else
> > >>  look up qemu created fdt.
> > >>  
> > > 
>

Re: [Qemu-devel] [PATCH v7 15/19] spapr/xive: enable XIVE MMIOs at reset

2018-12-11 Thread David Gibson

On Tue, Dec 11, 2018 at 11:14:41AM +0100, Cédric Le Goater wrote:
> On 12/11/18 2:47 AM, David Gibson wrote:
> > On Sun, Dec 09, 2018 at 08:46:06PM +0100, Cédric Le Goater wrote:
> >> Depending on the interrupt mode chosen, enable or disable the XIVE
> >> MMIOs.
> >>
> >> Signed-off-by: Cédric Le Goater 
> >> ---
> >>  include/hw/ppc/spapr_xive.h | 1 +
> >>  hw/intc/spapr_xive.c| 9 +
> >>  hw/ppc/spapr_irq.c  | 8 
> >>  3 files changed, 18 insertions(+)
> >>
> >> diff --git a/include/hw/ppc/spapr_xive.h b/include/hw/ppc/spapr_xive.h
> >> index 7244a6231ce6..308afb61a666 100644
> >> --- a/include/hw/ppc/spapr_xive.h
> >> +++ b/include/hw/ppc/spapr_xive.h
> >> @@ -48,5 +48,6 @@ void spapr_xive_hcall_init(sPAPRMachineState *spapr);
> >>  void spapr_dt_xive(sPAPRMachineState *spapr, uint32_t nr_servers, void 
> >> *fdt,
> >> uint32_t phandle);
> >>  void spapr_xive_reset_tctx(sPAPRXive *xive);
> >> +void spapr_xive_enable_mmio(sPAPRXive *xive, bool enable);
> >>  
> >>  #endif /* PPC_SPAPR_XIVE_H */
> >> diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
> >> index 560d8d031f74..c6dbb2e8cfc7 100644
> >> --- a/hw/intc/spapr_xive.c
> >> +++ b/hw/intc/spapr_xive.c
> >> @@ -179,6 +179,15 @@ static void spapr_xive_map_mmio(sPAPRXive *xive)
> >>  sysbus_mmio_map(SYS_BUS_DEVICE(xive), 2, xive->tm_base);
> >>  }
> >>  
> >> +void spapr_xive_enable_mmio(sPAPRXive *xive, bool enable)
> > 
> > The logic looks fine, but I dislike this name - it's called
> > ..._enable() when it can also be used to disable.
> 
> ok. Let's call it spapr_xive_mmio_set_enabled() then

Sounds good.

> 
> C.
>  
> >> +{
> >> +memory_region_set_enabled(>source.esb_mmio, enable);
> >> +memory_region_set_enabled(>tm_mmio, enable);
> >> +
> >> +/* Disable the END ESBs until a guest OS makes use of them */
> >> +memory_region_set_enabled(>end_source.esb_mmio, false);
> >> +}
> >> +
> >>  /*
> >>   * When a Virtual Processor is scheduled to run on a HW thread, the
> >>   * hypervisor pushes its identifier in the OS CAM line. Emulate the
> >> diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
> >> index b423cee30e2c..a8e50725397c 100644
> >> --- a/hw/ppc/spapr_irq.c
> >> +++ b/hw/ppc/spapr_irq.c
> >> @@ -217,6 +217,11 @@ static void spapr_irq_reset_xics(sPAPRMachineState 
> >> *spapr, Error **errp)
> >>  CPU_FOREACH(cs) {
> >>  spapr_cpu_core_set_intc(POWERPC_CPU(cs), spapr->icp_type);
> >>  }
> >> +
> >> +/* Deactivate the XIVE MMIOs */
> >> +if (spapr->xive) {
> >> +spapr_xive_enable_mmio(spapr->xive, false);
> >> +}
> >>  }
> >>  
> >>  #define SPAPR_IRQ_XICS_NR_IRQS 0x1000
> >> @@ -358,6 +363,9 @@ static void spapr_irq_reset_xive(sPAPRMachineState 
> >> *spapr, Error **errp)
> >>   * to come after the XiveTCTX reset handlers.
> >>   */
> >>  spapr_xive_reset_tctx(spapr->xive);
> >> +
> >> +/* Activate the XIVE MMIOs */
> >> +spapr_xive_enable_mmio(spapr->xive, true);
> >>  }
> >>  
> >>  /*
> > 
> 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature

[Qemu-devel] [PATCH] virt: Fix broken indentation

2018-12-11 Thread Eduardo Habkost

I introduced indentation using tabs instead of spaces in another
commit.  Peter reported the problem, and I failed to fix that
before sending my pull request.

Reported-by: Peter Maydell 
Fixes: 951597607696 ("virt: Eliminate separate instance_init functions")
Signed-off-by: Eduardo Habkost 
---
 hw/arm/virt.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 17f1b49d11..5b678237b7 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -1854,7 +1854,7 @@ static const TypeInfo virt_machine_info = {
 .instance_size = sizeof(VirtMachineState),
 .class_size= sizeof(VirtMachineClass),
 .class_init= virt_machine_class_init,
-   .instance_init = virt_instance_init,
+.instance_init = virt_instance_init,
 .interfaces = (InterfaceInfo[]) {
  { TYPE_HOTPLUG_HANDLER },
  { }
-- 
2.18.0.rc1.1.g3f1ff2140

Re: [Qemu-devel] [RFC v3 04/24] linux-user: riscv: Fix compile failure on riscv32 hosts

2018-12-11 Thread Alistair Francis

On Mon, Dec 10, 2018 at 9:04 AM Richard Henderson
 wrote:
>
> On 12/7/18 6:46 PM, Alistair Francis wrote:
> > When cross compilling for riscv32 hosts using GCC 8.2 this error is seen:
> > error: '__NR__llseek' undeclared (first use in this function); did you 
> > mean '_llseek'?
> >
> > To avoid the error let's ensure that __NR__llseek is defined.
> >
> > Signed-off-by: Alistair Francis 
> > ---
> >  linux-user/riscv/target_syscall.h | 5 +
> >  1 file changed, 5 insertions(+)
> >
> > diff --git a/linux-user/riscv/target_syscall.h 
> > b/linux-user/riscv/target_syscall.h
> > index ee81d8bc88..af73f71839 100644
> > --- a/linux-user/riscv/target_syscall.h
> > +++ b/linux-user/riscv/target_syscall.h
> > @@ -47,6 +47,11 @@ struct target_pt_regs {
> >  #endif
> >  #define UNAME_MINIMUM_RELEASE "4.15.0"
> >
> > +/* This is sometimes needed to compile riscv32 Linux user mode */
> > +#if !defined(__NR__llseek) && !defined(__NR_lseek)
> > +#define __NR__llseek __NR3264_lseek
> > +#endif
>
> I'm not quite sure how this could be.  Upstream I see
>
> #if __BITS_PER_LONG == 64 && !defined(__SYSCALL_COMPAT)
> ...
> #define __NR_lseek __NR3264_lseek
> ...
> #else
> ...
> #define __NR_llseek __NR3264_lseek
>
> so this define should already exist.
>
> Do you have the wrong headers installed for the cross-build?

I'm using Yocto for the cross build, so I don't see how I can get the
wrong headers. I agree though that I probably shouldn't need this
patch. Maybe it's something to do with the glibc fork I'm using. I'll
drop this patch and see if I can figure out what's going on.

Alistair

>
>
> r~

Re: [Qemu-devel] [PULL 00/24] Machine queue post-3.1.0 (including 4.0 machine-types)

2018-12-11 Thread no-reply

Patchew URL: https://patchew.org/QEMU/20181211180129.7661-1-ehabk...@redhat.com/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Message-id: 20181211180129.7661-1-ehabk...@redhat.com
Subject: [Qemu-devel] [PULL 00/24] Machine queue post-3.1.0 (including 4.0 
machine-types)
Type: series

=== TEST SCRIPT BEGIN ===
#!/bin/bash

BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0

git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram

commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
failed=1
echo
fi
n=$((n+1))
done

exit $failed
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
Switched to a new branch 'test'
67604b2 qom: remove unimplemented class_finalize
2378b00 qdev: move qdev_prop_register_global_list() to tests
4a805fe accel: register global_props like machine globals
9f66e9c qom: make user_creatable_complete() specific to UserCreatable
69661de qom: make interface types abstract
f122c95 tests: qdev_prop_check_globals() doesn't return "all_used"
1850057 pc: Use default_machine_opts to set suppress_vmdesc
2d1e974 spapr: Delete instance_options functions
b112dc8 spapr: Use default_machine_opts to set suppress_vmdesc
5bec42a spapr: Use default_machine_opts to set use_hotplug_event_source
de8eced virt: Eliminate separate instance_init functions
0502fc4 q35/440fx/arm/spapr: Add QEMU 4.0 machine type
cfbdcf0 hostmem: Validate host-nodes before setting bitmap
d61e4e16 numa: Match struct to typedef name
45bfc96 i386: Rename bools in PCMachineState to end in _enabled
9413306 move ObjectClass to typedefs.h
9e54c44 memory-device: avoid overflows on very huge devices
d18efc0 memory-device: use QEMU_IS_ALIGNED
6f10012 range: pass const pointer where possible
243a721 Deprecate HMP `cpu-add`
1c26ebd Deprecate QMP `cpu-add`
77c2b38 docs: Document vCPU hotplug procedure
56f06c1 hw/timer/sun4v-rtc: Fix tracing at sun4v_rtc_write()
d53badd hostmem-file: remove object id from pmem error message

=== OUTPUT BEGIN ===
Checking PATCH 1/24: hostmem-file: remove object id from pmem error message...
Checking PATCH 2/24: hw/timer/sun4v-rtc: Fix tracing at sun4v_rtc_write()...
Checking PATCH 3/24: docs: Document vCPU hotplug procedure...
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#13: 
new file mode 100644

total: 0 errors, 1 warnings, 142 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
Checking PATCH 4/24: Deprecate QMP `cpu-add`...
Checking PATCH 5/24: Deprecate HMP `cpu-add`...
Checking PATCH 6/24: range: pass const pointer where possible...
Checking PATCH 7/24: memory-device: use QEMU_IS_ALIGNED...
Checking PATCH 8/24: memory-device: avoid overflows on very huge devices...
Checking PATCH 9/24: move ObjectClass to typedefs.h...
Checking PATCH 10/24: i386: Rename bools in PCMachineState to end in _enabled...
Checking PATCH 11/24: numa: Match struct to typedef name...
Checking PATCH 12/24: hostmem: Validate host-nodes before setting bitmap...
Checking PATCH 13/24: q35/440fx/arm/spapr: Add QEMU 4.0 machine type...
Checking PATCH 14/24: virt: Eliminate separate instance_init functions...
ERROR: code indent should never use tabs
#67: FILE: hw/arm/virt.c:1857:
+^I.instance_init = virt_instance_init,$

total: 1 errors, 0 warnings, 159 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Checking PATCH 15/24: spapr: Use default_machine_opts to set 
use_hotplug_event_source...
Checking PATCH 16/24: spapr: Use default_machine_opts to set suppress_vmdesc...
Checking PATCH 17/24: spapr: Delete instance_options functions...
Checking PATCH 18/24: pc: Use default_machine_opts to set suppress_vmdesc...
Checking PATCH 19/24: tests: qdev_prop_check_globals() doesn't return 
"all_used"...
Checking PATCH 20/24: qom: make interface types abstract...
Checking PATCH 21/24: qom: make user_creatable_complete() specific to 
UserCreatable...
Checking PATCH 22/24: accel: register global_props like machine globals...
Checking PATCH 23/24: qdev: move qdev_prop_register_global_list() to tests...
Checking PATCH 24/24: qom: remove unimplemented class_finalize...
=== OUTPUT END ===

Test command exited with code: 1


The full log is available at
http://patchew.org/logs/20181211180129.7661-1-ehabk...@redhat.com/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-de...@redhat.com

Re: [Qemu-devel] [RFC v3 19/24] riscv: tcg-target: Add the out op decoder

2018-12-11 Thread Alistair Francis

On Tue, Dec 11, 2018 at 2:45 PM Richard Henderson
 wrote:
>
> On 12/10/18 11:56 AM, Richard Henderson wrote:
> > On 12/7/18 6:49 PM, Alistair Francis wrote:
> >> +case INDEX_op_neg_i64:
> >> +tcg_out_opc_imm(s, OPC_SUB, a0, TCG_REG_ZERO, a1);
> >
> > tcg_out_opc_reg.
> >
> >> +case INDEX_op_mulsh_i32:
> >> +case INDEX_op_mulsh_i64:
> >> +tcg_out_opc_imm(s, OPC_MULH, a0, a1, a2);
> >> +break;
> >> +
> >> +case INDEX_op_muluh_i32:
> >> +case INDEX_op_muluh_i64:
> >> +tcg_out_opc_imm(s, OPC_MULHU, a0, a1, a2);
> >> +break;
> >
> > Likewise.
>
> Incidentally, catching these sorts of errors is why tcg/s390 puts the format 
> of
> the insn into the enum.  E.g.
>
> /* Emit an opcode with "type-checking" of the format.  */
> #define tcg_out_insn(S, FMT, OP, ...) \
> glue(tcg_out_insn_,FMT)(S, glue(glue(FMT,_),OP), ## __VA_ARGS__)
>
> tcg_out_insn(s, RR, AR, a0, a2);
> tcg_out_insn(s, RRE, AGR, TCG_TMP0, index);
>
> /* All of the following instructions are prefixed with their instruction
>format...  */
> typedef enum S390Opcode {
> ...
> RR_AR   = 0x1a,
> RRE_AGR = 0xb908,

That is really cool.

>
>
> I do something similar for tcg/aarch64, although that's more complicated due 
> to
> the wide variety of formats.
>
> Whether you go back and retro-fit this scheme to your existing patch set is up
> to you.  But I think it could be worthwhile.

It looks like I'm ready to send a patch series (instead of an RFC) so
I might leave it for now. There are a range of improvements we can
work on in the future, this can be yet another one.

Alistair

>
>
> r~

Re: [Qemu-devel] [Qemu-ppc] [PATCH qemu 2/3] ppc/spapr: Receive and store device tree blob from SLOF

2018-12-11 Thread Alexey Kardashevskiy




On 12/12/2018 03:35, Greg Kurz wrote:
> On Tue, 11 Dec 2018 16:49:25 +1100
> Alexey Kardashevskiy  wrote:
> 
>> SLOF receives a device tree and updates it with various properties
>> before switching to the guest kernel and QEMU is not aware of any changes
>> made by SLOF. Since there is no real RTAS (QEMU implements it), it makes
>> sense to pass the SLOF final device tree to QEMU to let it implement
>> RTAS related tasks better, such as PCI host bus adapter hotplug.
>>
>> Specifially, now QEMU can find out the actual XICS phandle (for PHB
>> hotplug) and the RTAS linux,rtas-entry/base properties (for firmware
>> assisted NMI - FWNMI).
>>
>> This stores the initial DT blob in the sPAPR machine and replaces it
>> in the KVMPPC_H_UPDATE_DT (new private hypercall) handler.
>>
>> This adds an @update_dt_enabled machine property to allow backward
>> migration.
>>
>> SLOF already has a hypercall since
>> https://github.com/aik/SLOF/commit/e6fc84652c9c0073f9183
>>
>> Signed-off-by: Alexey Kardashevskiy 
>> ---
>>  include/hw/ppc/spapr.h |  7 ++-
>>  hw/ppc/spapr.c | 31 ++-
>>  hw/ppc/spapr_hcall.c   | 42 ++
>>  hw/ppc/trace-events|  3 +++
>>  4 files changed, 81 insertions(+), 2 deletions(-)
>>
>> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
>> index 1987640..86c90df 100644
>> --- a/include/hw/ppc/spapr.h
>> +++ b/include/hw/ppc/spapr.h
>> @@ -102,6 +102,7 @@ struct sPAPRMachineClass {
>>  
>>  /*< public >*/
>>  bool dr_lmb_enabled;   /* enable dynamic-reconfig/hotplug of LMBs */
>> +bool update_dt_enabled;/* enable KVMPPC_H_UPDATE_DT */
>>  bool use_ohci_by_default;  /* use USB-OHCI instead of XHCI */
>>  bool pre_2_10_has_unused_icps;
>>  bool legacy_irq_allocation;
>> @@ -138,6 +139,9 @@ struct sPAPRMachineState {
>>  int vrma_adjust;
>>  ssize_t rtas_size;
>>  void *rtas_blob;
>> +uint32_t fdt_size;
>> +uint32_t fdt_initial_size;
>> +void *fdt_blob;
>>  long kernel_size;
>>  bool kernel_le;
>>  uint32_t initrd_base;
>> @@ -464,7 +468,8 @@ struct sPAPRMachineState {
>>  #define KVMPPC_H_LOGICAL_MEMOP  (KVMPPC_HCALL_BASE + 0x1)
>>  /* Client Architecture support */
>>  #define KVMPPC_H_CAS(KVMPPC_HCALL_BASE + 0x2)
>> -#define KVMPPC_HCALL_MAXKVMPPC_H_CAS
>> +#define KVMPPC_H_UPDATE_DT  (KVMPPC_HCALL_BASE + 0x3)
>> +#define KVMPPC_HCALL_MAXKVMPPC_H_UPDATE_DT
>>  
>>  typedef struct sPAPRDeviceTreeUpdateHeader {
>>  uint32_t version_id;
>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>> index 8a18250..984bf32 100644
>> --- a/hw/ppc/spapr.c
>> +++ b/hw/ppc/spapr.c
>> @@ -1654,7 +1654,10 @@ static void spapr_machine_reset(void)
>>  /* Load the fdt */
>>  qemu_fdt_dumpdtb(fdt, fdt_totalsize(fdt));
>>  cpu_physical_memory_write(fdt_addr, fdt, fdt_totalsize(fdt));
>> -g_free(fdt);
>> +g_free(spapr->fdt_blob);
>> +spapr->fdt_size = fdt_totalsize(fdt);
>> +spapr->fdt_initial_size = spapr->fdt_size;
>> +spapr->fdt_blob = fdt;
>>  
>>  /* Set up the entry state */
>>  spapr_cpu_set_entry_state(first_ppc_cpu, SPAPR_ENTRY_POINT, fdt_addr);
>> @@ -1908,6 +1911,27 @@ static const VMStateDescription vmstate_spapr_irq_map 
>> = {
>>  },
>>  };
>>  
>> +static bool spapr_dtb_needed(void *opaque)
>> +{
>> +sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(opaque);
>> +
>> +return smc->update_dt_enabled;
>> +}
>> +
>> +static const VMStateDescription vmstate_spapr_dtb = {
>> +.name = "spapr_dtb",
>> +.version_id = 1,
>> +.minimum_version_id = 1,
>> +.needed = spapr_dtb_needed,
>> +.fields = (VMStateField[]) {
>> +VMSTATE_UINT32(fdt_initial_size, sPAPRMachineState),
>> +VMSTATE_UINT32(fdt_size, sPAPRMachineState),
>> +VMSTATE_VBUFFER_ALLOC_UINT32(fdt_blob, sPAPRMachineState, 0, NULL,
>> + fdt_size),
> 
> Unless I'm missing something, it looks like the initial spapr->fdt_blob in the
> destination might be leaked...

ah true.


> 
>> +VMSTATE_END_OF_LIST()
>> +},
>> +};
>> +
>>  static const VMStateDescription vmstate_spapr = {
>>  .name = "spapr",
>>  .version_id = 3,
>> @@ -1937,6 +1961,7 @@ static const VMStateDescription vmstate_spapr = {
>>  _spapr_cap_ibs,
>>  _spapr_irq_map,
>>  _spapr_cap_nested_kvm_hv,
>> +_spapr_dtb,
>>  NULL
>>  }
>>  };
>> @@ -3871,6 +3896,7 @@ static void spapr_machine_class_init(ObjectClass *oc, 
>> void *data)
>>  hc->unplug = spapr_machine_device_unplug;
>>  
>>  smc->dr_lmb_enabled = true;
>> +smc->update_dt_enabled = true;
>>  mc->default_cpu_type = POWERPC_CPU_TYPE_NAME("power8_v2.0");
>>  mc->has_hotpluggable_cpus = true;
>>  smc->resize_hpt_default = SPAPR_RESIZE_HPT_ENABLED;
>> @@ -3981,8 +4007,11 @@ static void 
>> spapr_machine_3_1_instance_options(MachineState *machine)
>>

Re: [Qemu-devel] [PATCH 0/3] bitmaps: remove x- prefix from QMP api

2018-12-11 Thread John Snow




On 12/6/18 7:08 PM, no-re...@patchew.org wrote:
> Patchew URL: https://patchew.org/QEMU/20181206192544.3987-1-js...@redhat.com/
> 
> 
> 
> Hi,
> 
> This series seems to have some coding style problems. See output below for
> more information:
> 
> Type: series
> Subject: [Qemu-devel] [PATCH 0/3] bitmaps: remove x- prefix from QMP api
> Message-id: 20181206192544.3987-1-js...@redhat.com
> 
> === TEST SCRIPT BEGIN ===
> #!/bin/bash
> 
> BASE=base
> n=1
> total=$(git log --oneline $BASE.. | wc -l)
> failed=0
> 
> git config --local diff.renamelimit 0
> git config --local diff.renames True
> git config --local diff.algorithm histogram
> 
> commits="$(git log --format=%H --reverse $BASE..)"
> for c in $commits; do
> echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
> if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; 
> then
> failed=1
> echo
> fi
> n=$((n+1))
> done
> 
> exit $failed
> === TEST SCRIPT END ===
> 
> Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
> Switched to a new branch 'test'
> c972673 block: remove 'x' prefix from experimental bitmap APIs
> ebd1e6c blockdev: n-ary bitmap merge
> 68d3c0c blockdev: abort transactions in reverse order
> 
> === OUTPUT BEGIN ===
> Checking PATCH 1/3: blockdev: abort transactions in reverse order...
> Checking PATCH 2/3: blockdev: n-ary bitmap merge...
> ERROR: externs should be avoided in .c files
> #23: FILE: blockdev.c:2125:
> +void do_block_dirty_bitmap_merge(const char *node, const char *target,
> 

Oh, I guess I'm missing `static` here for this forward declaration.

Also, we need to edit patchew to not CC f...@redhat.com anymore :(

> total: 1 errors, 0 warnings, 147 lines checked
> 
> Your patch has style problems, please review.  If any of these errors
> are false positives report them to the maintainer, see
> CHECKPATCH in MAINTAINERS.
> 
> Checking PATCH 3/3: block: remove 'x' prefix from experimental bitmap APIs...
> === OUTPUT END ===
> 
> Test command exited with code: 1
> 
> 
> The full log is available at
> http://patchew.org/logs/20181206192544.3987-1-js...@redhat.com/testing.checkpatch/?type=message.
> ---
> Email generated automatically by Patchew [http://patchew.org/].
> Please send your feedback to patchew-de...@redhat.com
>

Re: [Qemu-devel] [PATCH 2/3] blockdev: n-ary bitmap merge

2018-12-11 Thread John Snow




On 12/7/18 11:25 AM, Eric Blake wrote:
> On 12/6/18 1:25 PM, John Snow wrote:
>> Especially outside of transactions, it is helpful to provide
>> all-or-nothing semantics for bitmap merges. This facilitates
>> the coalescing of multiple bitmaps into a single target for
>> the "checkpoint" interpretation when assembling bitmaps that
>> represent arbitrary points in time from component bitmaps.
>>
>> Signed-off-by: John Snow 
>> ---
>>   blockdev.c   | 64 +---
>>   qapi/block-core.json | 22 +++
>>   2 files changed, 53 insertions(+), 33 deletions(-)
>>
> 
>> +++ b/qapi/block-core.json
>> @@ -1818,14 +1818,14 @@
>>   #
>>   # @node: name of device/node which the bitmap is tracking
>>   #
>> -# @dst_name: name of the destination dirty bitmap
>> +# @target: name of the destination dirty bitmap
>>   #
>> -# @src_name: name of the source dirty bitmap
>> +# @bitmaps: name(s) of the source dirty bitmap(s)
>>   #
>>   # Since: 3.0
>>   ##
>>   { 'struct': 'BlockDirtyBitmapMerge',
>> -  'data': { 'node': 'str', 'dst_name': 'str', 'src_name': 'str' } }
>> +  'data': { 'node': 'str', 'target': 'str', 'bitmaps': ['str'] } }
> 
> Definitely worthwhile!
> 
> I'll update my pending libvirt patches to use this.
> 
>>     ##
>>   # @block-dirty-bitmap-add:
>> @@ -1940,23 +1940,23 @@
>>   ##
>>   # @x-block-dirty-bitmap-merge:
>>   #
>> -# FIXME: Rename @src_name and @dst_name to src-name and dst-name.
>> -#
>> -# Merge @src_name dirty bitmap to @dst_name dirty bitmap. @src_name
>> dirty
>> -# bitmap is unchanged. On error, @dst_name is unchanged.
>> +# Merge dirty bitmaps listed in @bitmaps to the @target dirty bitmap.
>> +# The @bitmaps dirty bitmaps are unchanged.
> 
> Well, except in the corner case of when @bitmaps also lists the
> destination (I presume that merging a bitmap into itself silently
> succeeds with no further changes, but therefore the inclusion of the
> destination in the list of sources means that that particular source is
> changing due to merging in the other sources).  Not worth rewording this
>  sentence, but does make you want to consider ensuring that the
> testsuite covers merging a bitmap into itself.
> 

OK, noted.

> Reviewed-by: Eric Blake 
> 
Thanks!

Re: [Qemu-devel] [PATCH v8 0/4] Connect a PCIe host and graphics support to RISC-V

2018-12-11 Thread Palmer Dabbelt


On Tue, 11 Dec 2018 14:37:07 PST (-0800), Alistair Francis wrote:

This series is now ready to be merged, all of the patches are reviewed
and tested.

Palmer can you take this with all the other RISC-V patches sent during
the freeze?


Yep, I'll be collecting everything this week.

Thanks!



V8:
 - Drop SiFive U support
 - Drop legacy -nic support
 - Small other review changes
V7:
 - Fix the GPEX memory mapping thanks to Bin Meng
 - Fix the interrupt mapping thanks to Logan Gunthorpe
V6:
 - Fix the interrupt issue for the GPEX device
V5:
 - Rebase
 - Include pci.mak in the default configs
V4:
 - Fix the spike device tree
 - Don't use stdvga device
V3:
 - Remove Makefile config changes
 - Connect a network adapter to the virt device
V2:
 - Use the gpex PCIe host for virt
 - Add support for SiFive U PCIe


Alistair Francis (4):
  hw/riscv/virt: Increase the number of interrupts
  hw/riscv/virt: Adjust memory layout spacing
  hw/riscv/virt: Connect the gpex PCIe
  riscv: Enable VGA and PCIE_VGA

 default-configs/riscv32-softmmu.mak |   8 +-
 default-configs/riscv64-softmmu.mak |   8 +-
 hw/riscv/virt.c | 147 ++--
 include/hw/riscv/virt.h |  15 ++-
 4 files changed, 165 insertions(+), 13 deletions(-)

--
2.19.1

[Qemu-devel] [PATCH v8 09/12] spapr: set the interrupt presenter at reset

2018-12-11 Thread Cédric Le Goater

Currently, the interrupt presenter of the vCPU is set at realize
time. Setting it at reset will become necessary when the new machine
supporting both interrupt modes is introduced. In this machine, the
interrupt mode is chosen at CAS time and activated after a reset.

Signed-off-by: Cédric Le Goater 
---

 Changes since v7:

 - introduced spapr_irq_reset_xics(). 

 include/hw/ppc/spapr_cpu_core.h |  2 ++
 hw/ppc/spapr_cpu_core.c | 26 ++
 hw/ppc/spapr_irq.c  | 13 +
 3 files changed, 41 insertions(+)

diff --git a/include/hw/ppc/spapr_cpu_core.h b/include/hw/ppc/spapr_cpu_core.h
index 9e2821e4b31f..fc8ea9021656 100644
--- a/include/hw/ppc/spapr_cpu_core.h
+++ b/include/hw/ppc/spapr_cpu_core.h
@@ -53,4 +53,6 @@ static inline sPAPRCPUState *spapr_cpu_state(PowerPCCPU *cpu)
 return (sPAPRCPUState *)cpu->machine_data;
 }
 
+void spapr_cpu_core_set_intc(PowerPCCPU *cpu, const char *intc_type);
+
 #endif
diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
index 82666436e9b4..afc5d4f4e739 100644
--- a/hw/ppc/spapr_cpu_core.c
+++ b/hw/ppc/spapr_cpu_core.c
@@ -397,3 +397,29 @@ static const TypeInfo spapr_cpu_core_type_infos[] = {
 };
 
 DEFINE_TYPES(spapr_cpu_core_type_infos)
+
+typedef struct ForeachFindIntCArgs {
+const char *intc_type;
+Object *intc;
+} ForeachFindIntCArgs;
+
+static int spapr_cpu_core_find_intc(Object *child, void *opaque)
+{
+ForeachFindIntCArgs *args = opaque;
+
+if (object_dynamic_cast(child, args->intc_type)) {
+args->intc = child;
+}
+
+return args->intc != NULL;
+}
+
+void spapr_cpu_core_set_intc(PowerPCCPU *cpu, const char *intc_type)
+{
+ForeachFindIntCArgs args = { intc_type, NULL };
+
+object_child_foreach(OBJECT(cpu), spapr_cpu_core_find_intc, );
+g_assert(args.intc);
+
+cpu->intc = args.intc;
+}
diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
index 0999a2b2d69c..814500f22d34 100644
--- a/hw/ppc/spapr_irq.c
+++ b/hw/ppc/spapr_irq.c
@@ -12,6 +12,7 @@
 #include "qemu/error-report.h"
 #include "qapi/error.h"
 #include "hw/ppc/spapr.h"
+#include "hw/ppc/spapr_cpu_core.h"
 #include "hw/ppc/spapr_xive.h"
 #include "hw/ppc/xics.h"
 #include "sysemu/kvm.h"
@@ -208,6 +209,15 @@ static int spapr_irq_post_load_xics(sPAPRMachineState 
*spapr, int version_id)
 return 0;
 }
 
+static void spapr_irq_reset_xics(sPAPRMachineState *spapr, Error **errp)
+{
+   CPUState *cs;
+
+CPU_FOREACH(cs) {
+spapr_cpu_core_set_intc(POWERPC_CPU(cs), spapr->icp_type);
+}
+}
+
 #define SPAPR_IRQ_XICS_NR_IRQS 0x1000
 #define SPAPR_IRQ_XICS_NR_MSIS \
 (XICS_IRQ_BASE + SPAPR_IRQ_XICS_NR_IRQS - SPAPR_IRQ_MSI)
@@ -225,6 +235,7 @@ sPAPRIrq spapr_irq_xics = {
 .dt_populate = spapr_dt_xics,
 .cpu_intc_create = spapr_irq_cpu_intc_create_xics,
 .post_load   = spapr_irq_post_load_xics,
+.reset   = spapr_irq_reset_xics,
 };
 
 /*
@@ -325,6 +336,8 @@ static void spapr_irq_reset_xive(sPAPRMachineState *spapr, 
Error **errp)
 CPU_FOREACH(cs) {
 PowerPCCPU *cpu = POWERPC_CPU(cs);
 
+spapr_cpu_core_set_intc(cpu, TYPE_XIVE_TCTX);
+
 /* (TCG) Set the OS CAM line of the thread interrupt context. */
 spapr_xive_set_tctx_os_cam(XIVE_TCTX(cpu->intc));
 }
-- 
2.17.2

Re: [Qemu-devel] [PATCH 3/3] block: remove 'x' prefix from experimental bitmap APIs

2018-12-11 Thread John Snow




On 12/7/18 11:28 AM, Eric Blake wrote:
> On 12/6/18 1:25 PM, John Snow wrote:
>> The 'x' prefix was added because we were uncertain of the direction we'd
>> take for the libvirt API. With the general approach solidified, I feel
>> comfortable committing to this API for 4.0.
>>
>> Signed-off-by: John Snow 
>> ---
> 
>> +++ b/tests/qemu-iotests/223
>> @@ -112,9 +112,9 @@ _send_qemu_cmd $QEMU_HANDLE
>> '{"execute":"qmp_capabilities"}' "return"
>>   _send_qemu_cmd $QEMU_HANDLE '{"execute":"blockdev-add",
>>     "arguments":{"driver":"qcow2", "node-name":"n",
>>   "file":{"driver":"file", "filename":"'"$TEST_IMG"'"}}}' "return"
>> -_send_qemu_cmd $QEMU_HANDLE '{"execute":"x-block-dirty-bitmap-disable",
>> +_send_qemu_cmd $QEMU_HANDLE '{"execute":"block-dirty-bitmap-disable",
>>     "arguments":{"node":"n", "name":"b"}}' "return"
>> -_send_qemu_cmd $QEMU_HANDLE '{"execute":"x-block-dirty-bitmap-disable",
>> +_send_qemu_cmd $QEMU_HANDLE '{"execute":"block-dirty-bitmap-disable",
>>     "arguments":{"node":"n", "name":"b2"}}' "return"
>>   _send_qemu_cmd $QEMU_HANDLE '{"execute":"nbd-server-start",
>>     "arguments":{"addr":{"type":"unix",
> 
> No iotests coverage of block-dirty-bitmap-merge.  We should fix that as
> part of this series; separate patch is fine.
> 

you're not wrong :(

> I'm glad you remembered to renumber all the 'since' tags to 4.0 (as the
> new spelling is indeed new to 4.0, not when we introduced the older x-
> variant).
> 
> Reviewed-by: Eric Blake 
> 

Thanks!

[Qemu-devel] [PATCH v8 07/12] spapr: add an extra OV5 field to the sPAPR IRQ backend

2018-12-11 Thread Cédric Le Goater

The interrupt modes supported by the hypervisor are advertised to the
guest with new bits definitions of the option vector 5 of property
"ibm,arch-vec-5-platform-support. The byte 23 bits 0-1 of the OV5 are
defined as follow :

  0b00   PAPR 2.7 and earlier (Legacy systems)
  0b01   XIVE Exploitation mode only
  0b10   Either available

If the client/guest selects the XIVE interrupt mode, it informs the
hypervisor by returning the value 0b01 in byte 23 bits 0-1. A 0b00
value indicates the use of the XICS interrupt mode (Legacy systems).

The sPAPR IRQ backend is extended with these definitions and the
values are directly used to populate the "ibm,arch-vec-5-platform-support"
property. The interrupt mode is advertised under TCG and under KVM.
Although a KVM XIVE device is not yet available, the machine can still
operate with kernel_irqchip=off. However, we apply a restriction on
the CPU which is required to be a POWER9 when a XIVE interrupt
controller is in use.

Signed-off-by: Cédric Le Goater 
---

 Changes since v7:

 - improved commit log
 - introduced a check on CPU type.

 include/hw/ppc/spapr.h |  6 ++
 include/hw/ppc/spapr_irq.h |  1 +
 hw/ppc/spapr.c | 36 ++--
 hw/ppc/spapr_irq.c |  3 +++
 4 files changed, 40 insertions(+), 6 deletions(-)

diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 6bf028a02fe2..06765b4e9d79 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -824,5 +824,11 @@ int spapr_caps_post_migration(sPAPRMachineState *spapr);
 
 void spapr_check_pagesize(sPAPRMachineState *spapr, hwaddr pagesize,
   Error **errp);
+/*
+ * XIVE definitions
+ */
+#define SPAPR_OV5_XIVE_LEGACY   0x0
+#define SPAPR_OV5_XIVE_EXPLOIT  0x40
+#define SPAPR_OV5_XIVE_BOTH 0x80 /* Only to advertise on the platform */
 
 #endif /* HW_SPAPR_H */
diff --git a/include/hw/ppc/spapr_irq.h b/include/hw/ppc/spapr_irq.h
index 63061a009b4c..b34d5a00381b 100644
--- a/include/hw/ppc/spapr_irq.h
+++ b/include/hw/ppc/spapr_irq.h
@@ -33,6 +33,7 @@ void spapr_irq_msi_reset(sPAPRMachineState *spapr);
 typedef struct sPAPRIrq {
 uint32_tnr_irqs;
 uint32_tnr_msis;
+uint8_t ov5;
 
 void (*init)(sPAPRMachineState *spapr, Error **errp);
 int (*claim)(sPAPRMachineState *spapr, int irq, bool lsi, Error **errp);
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 57c0066edd56..ff138f224a87 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1095,12 +1095,14 @@ static void spapr_dt_rtas(sPAPRMachineState *spapr, 
void *fdt)
 spapr_dt_rtas_tokens(fdt, rtas);
 }
 
-/* Prepare ibm,arch-vec-5-platform-support, which indicates the MMU features
- * that the guest may request and thus the valid values for bytes 24..26 of
- * option vector 5: */
-static void spapr_dt_ov5_platform_support(void *fdt, int chosen)
+/* Prepare ibm,arch-vec-5-platform-support, which indicates the MMU
+ * and the XIVE features that the guest may request and thus the valid
+ * values for bytes 23..26 of option vector 5: */
+static void spapr_dt_ov5_platform_support(sPAPRMachineState *spapr, void *fdt,
+  int chosen)
 {
 PowerPCCPU *first_ppc_cpu = POWERPC_CPU(first_cpu);
+sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
 
 char val[2 * 4] = {
 23, 0x00, /* Xive mode, filled in below. */
@@ -,9 +1113,17 @@ static void spapr_dt_ov5_platform_support(void *fdt, int 
chosen)
 
 if (!ppc_check_compat(first_ppc_cpu, CPU_POWERPC_LOGICAL_3_00, 0,
   first_ppc_cpu->compat_pvr)) {
-/* If we're in a pre POWER9 compat mode then the guest should do hash 
*/
+/* If we're in a pre POWER9 compat mode then the guest should do hash
+ * and use the legacy interrupt mode
+ */
+val[1] = 0x00; /* XICS */
 val[3] = 0x00; /* Hash */
 } else if (kvm_enabled()) {
+/* If the KVM XIVE device is not available, the machine can
+ * still operate with kernel_irqchip=off
+ */
+val[1] = smc->irq->ov5;
+
 if (kvmppc_has_cap_mmu_radix() && kvmppc_has_cap_mmu_hash_v3()) {
 val[3] = 0x80; /* OV5_MMU_BOTH */
 } else if (kvmppc_has_cap_mmu_radix()) {
@@ -1122,6 +1132,9 @@ static void spapr_dt_ov5_platform_support(void *fdt, int 
chosen)
 val[3] = 0x00; /* Hash */
 }
 } else {
+/* In TCG, the interrupt mode is defined by the pseries machine */
+val[1] = smc->irq->ov5;
+
 /* V3 MMU supports both hash and radix in tcg (with dynamic switching) 
*/
 val[3] = 0xC0;
 }
@@ -1189,7 +1202,7 @@ static void spapr_dt_chosen(sPAPRMachineState *spapr, 
void *fdt)
 _FDT(fdt_setprop_string(fdt, chosen, "stdout-path", stdout_path));
 }
 
-spapr_dt_ov5_platform_support(fdt, chosen);
+spapr_dt_ov5_platform_support(spapr, fdt, chosen);
 
 g_free(stdout_path);
 g_free(bootlist);
@@

[Qemu-devel] [PATCH v8 06/12] spapr: add a 'reset' method to the sPAPR IRQ backend

2018-12-11 Thread Cédric Le Goater

For the time being, the XIVE reset handler updates the OS CAM line of
the vCPU as it is done under a real hypervisor when a vCPU is
scheduled to run on a HW thread. This will let the XIVE presenter
engine find a match among the NVTs dispatched on the HW threads.

This handler will become even more useful when we introduce the
machine supporting both interrupt modes, XIVE and XICS. In this
machine, the interrupt mode is chosen by the CAS negotiation process
and activated after a reset.

Signed-off-by: Cédric Le Goater 
---

 Changes since v7:

 - removed spapr_irq_reset_xics(). Will be introduce later.
 - replaced spapr_xive_reset_tctx() by spapr_xive_set_tctx_os_cam() to
   also cover hotplugged CPUs. The XiveTCTX and the CPU models lack a
   reset in case of hotplugged. 
 
 include/hw/ppc/spapr_irq.h  |  2 ++
 include/hw/ppc/spapr_xive.h |  1 +
 hw/intc/spapr_xive.c| 17 +
 hw/ppc/spapr.c  |  5 +
 hw/ppc/spapr_irq.c  | 30 +-
 5 files changed, 54 insertions(+), 1 deletion(-)

diff --git a/include/hw/ppc/spapr_irq.h b/include/hw/ppc/spapr_irq.h
index 84a25ffb6c65..63061a009b4c 100644
--- a/include/hw/ppc/spapr_irq.h
+++ b/include/hw/ppc/spapr_irq.h
@@ -44,6 +44,7 @@ typedef struct sPAPRIrq {
 Object *(*cpu_intc_create)(sPAPRMachineState *spapr, Object *cpu,
Error **errp);
 int (*post_load)(sPAPRMachineState *spapr, int version_id);
+void (*reset)(sPAPRMachineState *spapr, Error **errp);
 } sPAPRIrq;
 
 extern sPAPRIrq spapr_irq_xics;
@@ -55,6 +56,7 @@ int spapr_irq_claim(sPAPRMachineState *spapr, int irq, bool 
lsi, Error **errp);
 void spapr_irq_free(sPAPRMachineState *spapr, int irq, int num);
 qemu_irq spapr_qirq(sPAPRMachineState *spapr, int irq);
 int spapr_irq_post_load(sPAPRMachineState *spapr, int version_id);
+void spapr_irq_reset(sPAPRMachineState *spapr, Error **errp);
 
 /*
  * XICS legacy routines
diff --git a/include/hw/ppc/spapr_xive.h b/include/hw/ppc/spapr_xive.h
index 728a5e8dc163..728735dbcfbe 100644
--- a/include/hw/ppc/spapr_xive.h
+++ b/include/hw/ppc/spapr_xive.h
@@ -47,5 +47,6 @@ typedef struct sPAPRMachineState sPAPRMachineState;
 void spapr_xive_hcall_init(sPAPRMachineState *spapr);
 void spapr_dt_xive(sPAPRMachineState *spapr, uint32_t nr_servers, void *fdt,
uint32_t phandle);
+void spapr_xive_set_tctx_os_cam(XiveTCTX *tctx);
 
 #endif /* PPC_SPAPR_XIVE_H */
diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
index 9e5b2db2439a..aaa5865c4080 100644
--- a/hw/intc/spapr_xive.c
+++ b/hw/intc/spapr_xive.c
@@ -179,6 +179,23 @@ static void spapr_xive_map_mmio(sPAPRXive *xive)
 sysbus_mmio_map(SYS_BUS_DEVICE(xive), 2, xive->tm_base);
 }
 
+/*
+ * When a Virtual Processor is scheduled to run on a HW thread, the
+ * hypervisor pushes its identifier in the OS CAM line. Emulate the
+ * same behavior under QEMU.
+ */
+void spapr_xive_set_tctx_os_cam(XiveTCTX *tctx)
+{
+uint8_t  nvt_blk;
+uint32_t nvt_idx;
+uint32_t nvt_cam;
+
+spapr_xive_cpu_to_nvt(POWERPC_CPU(tctx->cs), _blk, _idx);
+
+nvt_cam = cpu_to_be32(TM_QW1W2_VO | xive_nvt_cam_line(nvt_blk, nvt_idx));
+memcpy(>regs[TM_QW1_OS + TM_WORD2], _cam, 4);
+}
+
 static void spapr_xive_end_reset(XiveEND *end)
 {
 memset(end, 0, sizeof(*end));
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 1f41ea2f3c6c..57c0066edd56 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1619,6 +1619,11 @@ static void spapr_machine_reset(void)
 
 qemu_devices_reset();
 
+/* This is fixing some of the default configuration of the XIVE
+ * devices. To be called after the reset of the machine devices.
+ */
+spapr_irq_reset(spapr, _fatal);
+
 /* DRC reset may cause a device to be unplugged. This will cause troubles
  * if this device is used by another device (eg, a running vhost backend
  * will crash QEMU if the DIMM holding the vring goes away). To avoid such
diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
index 292c448a15fa..f835ea939cbf 100644
--- a/hw/ppc/spapr_irq.c
+++ b/hw/ppc/spapr_irq.c
@@ -305,7 +305,13 @@ static void spapr_irq_print_info_xive(sPAPRMachineState 
*spapr,
 static Object *spapr_irq_cpu_intc_create_xive(sPAPRMachineState *spapr,
   Object *cpu, Error **errp)
 {
-return xive_tctx_create(cpu, XIVE_ROUTER(spapr->xive), errp);
+Object *obj = xive_tctx_create(cpu, XIVE_ROUTER(spapr->xive), errp);
+
+/* (TCG) Early setting the OS CAM line for hotplugged CPUs as they
+ * don't benificiate from the reset of the XIVE IRQ backend
+ */
+spapr_xive_set_tctx_os_cam(XIVE_TCTX(obj));
+return obj;
 }
 
 static int spapr_irq_post_load_xive(sPAPRMachineState *spapr, int version_id)
@@ -313,6 +319,18 @@ static int spapr_irq_post_load_xive(sPAPRMachineState 
*spapr, int version_id)
 return 0;
 }
 
+static void spapr_irq_reset_xive(sPAPRMachineState *spapr, Error **errp)
+{
+

[Qemu-devel] [PATCH v8 12/12] spapr: change default CPU type to POWER9

2018-12-11 Thread Cédric Le Goater

Signed-off-by: Cédric Le Goater 
---
 hw/ppc/spapr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 97a5e3c9929f..705d3f4057b0 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -3938,7 +3938,7 @@ static void spapr_machine_class_init(ObjectClass *oc, 
void *data)
 hc->unplug = spapr_machine_device_unplug;
 
 smc->dr_lmb_enabled = true;
-mc->default_cpu_type = POWERPC_CPU_TYPE_NAME("power8_v2.0");
+mc->default_cpu_type = POWERPC_CPU_TYPE_NAME("power9_v2.0");
 mc->has_hotpluggable_cpus = true;
 smc->resize_hpt_default = SPAPR_RESIZE_HPT_ENABLED;
 fwc->get_dev_path = spapr_get_fw_dev_path;
-- 
2.17.2

[Qemu-devel] [PATCH v8 00/12] ppc: support for the XIVE interrupt controller (POWER9)

2018-12-11 Thread Cédric Le Goater

Hello,

Here is the version 8 of the QEMU models adding support for the XIVE
interrupt controller to the sPAPR machine, under TCG only. KVM support
will be proposed in an other patchset, along with the KVM XIVE device
patchset, so will PowerNV.

The most important changes for sPAPR are the introduction of a 'ic-mode'
machine option to select the interrupt mode of the machine and the fix
for CPU hotplug.

Thanks,

C.

Changes in v8 (sPAPR only) :

 - reworked spapr_irq_init_xive() to use qdev_create()
 - changed the EQ sizes list property to contain 64K only
 - fixed OS CAM line of hotplugged CPUs (TCG)
 - introduced a check on CPU type when activating XIVE.
 - introduced an 'ic-mode' machine option
 - renamed spapr_xive_enable_mmio() to spapr_xive_mmio_set_enabled()
 - changed default CPU type to POWER9
 
Changes in v7 :

 https://lists.gnu.org/archive/html/qemu-devel/2018-12/msg01760.html

Changes in v6 :

 https://lists.gnu.org/archive/html/qemu-devel/2018-12/msg00965.html

Changes in v5 :

 https://lists.gnu.org/archive/html/qemu-devel/2018-11/msg03218.html

Changes in v4 :

 https://lists.gnu.org/archive/html/qemu-devel/2018-06/msg01672.html


= XIVE =


The POWER9 processor comes with a new interrupt controller, called
XIVE as "eXternal Interrupt Virtualization Engine".


* Overall architecture


 XIVE Interrupt Controller
 ++  IPIs
 | +-+ +-+ ++ |+---+
 | |VC   | |CQ   | |PC  |> | CORES |
 | | esb | | | ||> |   |
 | | eas | |  Bridge | |   tctx |> |   |
 | |SC   end | | | |nvt | ||   |
 +--+| +-+ +++ ++ |+-+-+-+-+
 | RAM  |+--|-+  | | |
 |  |   || | |
 |  |   || | |
 |  |  +vv-v-v--+other
 |  <--+ Power Bus  +--> chips
 |  esb |  +-+---+--+
 |  eas ||   |
 |  end | +--|--+|
 |  nvt |   +++ |   +++
 +--+   |SC   | |   |SC   |
| | |   | |
| PQ-bits | |   | PQ-bits |
| local   |-+   |  in VC  |
+-+ +-+
   PCIe NX,NPU,CAPI

  SC: Source Controller (aka. IVSE)
  VC: Virtualization Controller (aka. IVRE)
  PC: Presentation Controller (aka. IVPE)
  CQ: Common Queue (Bridge)

 PQ-bits: 2 bits source state machine (P:pending Q:queued)
 esb: Event State Buffer (Array of PQ bits in an IVSE)
 eas: Event Assignment Structure
 end: Event Notification Descriptor
 nvt: Notification Virtual Target
tctx: Thread interrupt Context


The XIVE IC is composed of three sub-engines :

  - Interrupt Virtualization Source Engine (IVSE), or Source
Controller (SC). These are found in PCI PHBs, in the PSI host
bridge controller, but also inside the main controller for the
core IPIs and other sub-chips (NX, CAP, NPU) of the
chip/processor. They are configured to feed the IVRE with events.

  - Interrupt Virtualization Routing Engine (IVRE) or Virtualization
Controller (VC). Its job is to match an event source with an Event
Notification Descriptor (END).

  - Interrupt Virtualization Presentation Engine (IVPE) or Presentation
Controller (PC). It maintains the interrupt context state of each
thread and handles the delivery of the external exception to the
thread.


* XIVE internal tables

Each of the sub-engines uses a set of tables to redirect exceptions
from event sources to CPU threads.

  +---+
  User or OS  |  EQ   |
  or  +-->|entries|
  Hypervisor  |   |  ..   |
Memory|   +---+
  |   ^
  |   |
 +-+
  |   |
  Hypervisor  +--++---+--++---+--+   +--+
Memory| ESB  || EAT  || ENDT |   | NVTT |
   (skiboot)  ++-+++-+++-+   +--+
^  |^  |^  |   ^
|  ||  ||  |   |

[Qemu-devel] [PATCH v8 08/12] spapr: introduce an 'ic-mode' machine option

2018-12-11 Thread Cédric Le Goater

This option is used to select the interrupt controller mode (XICS or
XIVE) with which the machine will operate. XICS being the default
mode for now.

When running a machine with the XIVE interrupt mode backend, the guest
OS is required to have support for the XIVE exploitation mode. In the
case of legacy OS, the mode selected by CAS should be XICS and the OS
should fail to boot. However, QEMU could possibly detect it, terminate
the boot process and reset to stop in the SLOF firmware. This is not
yet handled.

Signed-off-by: Cédric Le Goater 
---
 include/hw/ppc/spapr.h  |  1 +
 hw/ppc/spapr.c  | 52 ++---
 hw/ppc/spapr_cpu_core.c |  3 +--
 hw/ppc/spapr_irq.c  | 34 ---
 4 files changed, 56 insertions(+), 34 deletions(-)

diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 06765b4e9d79..2c77a8ba8810 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -177,6 +177,7 @@ struct sPAPRMachineState {
 int32_t irq_map_nr;
 unsigned long *irq_map;
 sPAPRXive  *xive;
+sPAPRIrq *irq;
 
 bool cmd_line_caps[SPAPR_CAP_NUM];
 sPAPRCapabilities def, eff, mig;
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index ff138f224a87..5d985e38a598 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1102,7 +1102,6 @@ static void 
spapr_dt_ov5_platform_support(sPAPRMachineState *spapr, void *fdt,
   int chosen)
 {
 PowerPCCPU *first_ppc_cpu = POWERPC_CPU(first_cpu);
-sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
 
 char val[2 * 4] = {
 23, 0x00, /* Xive mode, filled in below. */
@@ -1122,7 +1121,7 @@ static void 
spapr_dt_ov5_platform_support(sPAPRMachineState *spapr, void *fdt,
 /* If the KVM XIVE device is not available, the machine can
  * still operate with kernel_irqchip=off
  */
-val[1] = smc->irq->ov5;
+val[1] = spapr->irq->ov5;
 
 if (kvmppc_has_cap_mmu_radix() && kvmppc_has_cap_mmu_hash_v3()) {
 val[3] = 0x80; /* OV5_MMU_BOTH */
@@ -1133,7 +1132,7 @@ static void 
spapr_dt_ov5_platform_support(sPAPRMachineState *spapr, void *fdt,
 }
 } else {
 /* In TCG, the interrupt mode is defined by the pseries machine */
-val[1] = smc->irq->ov5;
+val[1] = spapr->irq->ov5;
 
 /* V3 MMU supports both hash and radix in tcg (with dynamic switching) 
*/
 val[3] = 0xC0;
@@ -1281,7 +1280,7 @@ static void *spapr_build_fdt(sPAPRMachineState *spapr,
 _FDT(fdt_setprop_cell(fdt, 0, "#size-cells", 2));
 
 /* /interrupt controller */
-smc->irq->dt_populate(spapr, spapr_max_server_number(spapr), fdt,
+spapr->irq->dt_populate(spapr, spapr_max_server_number(spapr), fdt,
   PHANDLE_XICP);
 
 ret = spapr_populate_memory(spapr, fdt);
@@ -1302,7 +1301,8 @@ static void *spapr_build_fdt(sPAPRMachineState *spapr,
 }
 
 QLIST_FOREACH(phb, >phbs, list) {
-ret = spapr_populate_pci_dt(phb, PHANDLE_XICP, fdt, smc->irq->nr_msis);
+ret = spapr_populate_pci_dt(phb, PHANDLE_XICP, fdt,
+spapr->irq->nr_msis);
 if (ret < 0) {
 error_report("couldn't setup PCI devices in fdt");
 exit(1);
@@ -2636,7 +2636,7 @@ static void spapr_machine_init(MachineState *machine)
 spapr_ovec_set(spapr->ov5, OV5_DRMEM_V2);
 
 /* advertise XIVE on POWER9 machines */
-if (smc->irq->ov5 & SPAPR_OV5_XIVE_EXPLOIT) {
+if (spapr->irq->ov5 & SPAPR_OV5_XIVE_EXPLOIT) {
 if (ppc_type_check_compat(machine->cpu_type, CPU_POWERPC_LOGICAL_3_00,
   0, spapr->max_compat_pvr)) {
 spapr_ovec_set(spapr->ov5, OV5_XIVE_EXPLOIT);
@@ -3056,9 +3056,38 @@ static void spapr_set_vsmt(Object *obj, Visitor *v, 
const char *name,
 visit_type_uint32(v, name, (uint32_t *)opaque, errp);
 }
 
+static char *spapr_get_ic_mode(Object *obj, Error **errp)
+{
+sPAPRMachineState *spapr = SPAPR_MACHINE(obj);
+
+if (spapr->irq == _irq_xics_legacy) {
+return g_strdup("legacy");
+} else if (spapr->irq == _irq_xics) {
+return g_strdup("xics");
+} else if (spapr->irq == _irq_xive) {
+return g_strdup("xive");
+}
+g_assert_not_reached();
+}
+
+static void spapr_set_ic_mode(Object *obj, const char *value, Error **errp)
+{
+sPAPRMachineState *spapr = SPAPR_MACHINE(obj);
+
+/* The legacy IRQ backend can not be set */
+if (strcmp(value, "xics") == 0) {
+spapr->irq = _irq_xics;
+} else if (strcmp(value, "xive") == 0) {
+spapr->irq = _irq_xive;
+} else {
+error_setg(errp, "Bad value for \"ic-mode\" property");
+}
+}
+
 static void spapr_instance_init(Object *obj)
 {
 sPAPRMachineState *spapr = SPAPR_MACHINE(obj);
+sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
 
 spapr->htab_fd = -1;
 spapr->use_hotplug_event_source =

[Qemu-devel] [PATCH v8 02/12] spapr: add hcalls support for the XIVE exploitation interrupt mode

2018-12-11 Thread Cédric Le Goater

The different XIVE virtualization structures (sources and event queues)
are configured with a set of Hypervisor calls :

 - H_INT_GET_SOURCE_INFO

   used to obtain the address of the MMIO page of the Event State
   Buffer (ESB) entry associated with the source.

 - H_INT_SET_SOURCE_CONFIG

   assigns a source to a "target".

 - H_INT_GET_SOURCE_CONFIG

   determines which "target" and "priority" is assigned to a source

 - H_INT_GET_QUEUE_INFO

   returns the address of the notification management page associated
   with the specified "target" and "priority".

 - H_INT_SET_QUEUE_CONFIG

   sets or resets the event queue for a given "target" and "priority".
   It is also used to set the notification configuration associated
   with the queue, only unconditional notification is supported for
   the moment. Reset is performed with a queue size of 0 and queueing
   is disabled in that case.

 - H_INT_GET_QUEUE_CONFIG

   returns the queue settings for a given "target" and "priority".

 - H_INT_RESET

   resets all of the guest's internal interrupt structures to their
   initial state, losing all configuration set via the hcalls
   H_INT_SET_SOURCE_CONFIG and H_INT_SET_QUEUE_CONFIG.

 - H_INT_SYNC

   issue a synchronisation on a source to make sure all notifications
   have reached their queue.

Calls that still need to be addressed :

   H_INT_SET_OS_REPORTING_LINE
   H_INT_GET_OS_REPORTING_LINE

See the code for more documentation on each hcall.

Signed-off-by: Cédric Le Goater 
Reviewed-by: David Gibson 
---
 include/hw/ppc/spapr.h  |  15 +-
 include/hw/ppc/spapr_xive.h |   4 +
 hw/intc/spapr_xive.c| 963 
 hw/ppc/spapr_irq.c  |   2 +
 4 files changed, 983 insertions(+), 1 deletion(-)

diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index cb3082d319af..6bf028a02fe2 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -452,7 +452,20 @@ struct sPAPRMachineState {
 #define H_INVALIDATE_PID0x378
 #define H_REGISTER_PROC_TBL 0x37C
 #define H_SIGNAL_SYS_RESET  0x380
-#define MAX_HCALL_OPCODEH_SIGNAL_SYS_RESET
+
+#define H_INT_GET_SOURCE_INFO   0x3A8
+#define H_INT_SET_SOURCE_CONFIG 0x3AC
+#define H_INT_GET_SOURCE_CONFIG 0x3B0
+#define H_INT_GET_QUEUE_INFO0x3B4
+#define H_INT_SET_QUEUE_CONFIG  0x3B8
+#define H_INT_GET_QUEUE_CONFIG  0x3BC
+#define H_INT_SET_OS_REPORTING_LINE 0x3C0
+#define H_INT_GET_OS_REPORTING_LINE 0x3C4
+#define H_INT_ESB   0x3C8
+#define H_INT_SYNC  0x3CC
+#define H_INT_RESET 0x3D0
+
+#define MAX_HCALL_OPCODEH_INT_RESET
 
 /* The hcalls above are standardized in PAPR and implemented by pHyp
  * as well.
diff --git a/include/hw/ppc/spapr_xive.h b/include/hw/ppc/spapr_xive.h
index f087959b9924..9506a8f4d10a 100644
--- a/include/hw/ppc/spapr_xive.h
+++ b/include/hw/ppc/spapr_xive.h
@@ -42,4 +42,8 @@ bool spapr_xive_irq_free(sPAPRXive *xive, uint32_t lisn);
 void spapr_xive_pic_print_info(sPAPRXive *xive, Monitor *mon);
 qemu_irq spapr_xive_qirq(sPAPRXive *xive, uint32_t lisn);
 
+typedef struct sPAPRMachineState sPAPRMachineState;
+
+void spapr_xive_hcall_init(sPAPRMachineState *spapr);
+
 #endif /* PPC_SPAPR_XIVE_H */
diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
index 3ade419fdbb1..982ac6e17051 100644
--- a/hw/intc/spapr_xive.c
+++ b/hw/intc/spapr_xive.c
@@ -38,6 +38,13 @@
 
 #define SPAPR_XIVE_NVT_BASE 0x400
 
+/*
+ * The sPAPR machine has a unique XIVE IC device. Assign a fixed value
+ * to the controller block id value. It can nevertheless be changed
+ * for testing purpose.
+ */
+#define SPAPR_XIVE_BLOCK_ID 0x0
+
 /*
  * sPAPR NVT and END indexing helpers
  */
@@ -46,6 +53,64 @@ static uint32_t spapr_xive_nvt_to_target(uint8_t nvt_blk, 
uint32_t nvt_idx)
 return nvt_idx - SPAPR_XIVE_NVT_BASE;
 }
 
+static void spapr_xive_cpu_to_nvt(PowerPCCPU *cpu,
+  uint8_t *out_nvt_blk, uint32_t *out_nvt_idx)
+{
+assert(cpu);
+
+if (out_nvt_blk) {
+*out_nvt_blk = SPAPR_XIVE_BLOCK_ID;
+}
+
+if (out_nvt_blk) {
+*out_nvt_idx = SPAPR_XIVE_NVT_BASE + cpu->vcpu_id;
+}
+}
+
+static int spapr_xive_target_to_nvt(uint32_t target,
+uint8_t *out_nvt_blk, uint32_t 
*out_nvt_idx)
+{
+PowerPCCPU *cpu = spapr_find_cpu(target);
+
+if (!cpu) {
+return -1;
+}
+
+spapr_xive_cpu_to_nvt(cpu, out_nvt_blk, out_nvt_idx);
+return 0;
+}
+
+/*
+ * sPAPR END indexing uses a simple mapping of the CPU vcpu_id, 8
+ * priorities per CPU
+ */
+static void spapr_xive_cpu_to_end(PowerPCCPU *cpu, uint8_t prio,
+  uint8_t *out_end_blk, uint32_t *out_end_idx)
+{
+assert(cpu);
+
+if (out_end_blk) {
+*out_end_blk = SPAPR_XIVE_BLOCK_ID;
+}
+
+if (out_end_idx) {
+*out_end_idx = (cpu->vcpu_id << 3) + prio;
+}
+}
+
+static int spapr_xive_target_to_end(uint32_t target, uint8_t prio,
+

Re: [Qemu-devel] [RFC v3 19/24] riscv: tcg-target: Add the out op decoder

2018-12-11 Thread Richard Henderson

On 12/10/18 11:56 AM, Richard Henderson wrote:
> On 12/7/18 6:49 PM, Alistair Francis wrote:
>> +case INDEX_op_neg_i64:
>> +tcg_out_opc_imm(s, OPC_SUB, a0, TCG_REG_ZERO, a1);
> 
> tcg_out_opc_reg.
> 
>> +case INDEX_op_mulsh_i32:
>> +case INDEX_op_mulsh_i64:
>> +tcg_out_opc_imm(s, OPC_MULH, a0, a1, a2);
>> +break;
>> +
>> +case INDEX_op_muluh_i32:
>> +case INDEX_op_muluh_i64:
>> +tcg_out_opc_imm(s, OPC_MULHU, a0, a1, a2);
>> +break;
> 
> Likewise.

Incidentally, catching these sorts of errors is why tcg/s390 puts the format of
the insn into the enum.  E.g.

/* Emit an opcode with "type-checking" of the format.  */
#define tcg_out_insn(S, FMT, OP, ...) \
glue(tcg_out_insn_,FMT)(S, glue(glue(FMT,_),OP), ## __VA_ARGS__)

tcg_out_insn(s, RR, AR, a0, a2);
tcg_out_insn(s, RRE, AGR, TCG_TMP0, index);

/* All of the following instructions are prefixed with their instruction
   format...  */
typedef enum S390Opcode {
...
RR_AR   = 0x1a,
RRE_AGR = 0xb908,

I do something similar for tcg/aarch64, although that's more complicated due to
the wide variety of formats.

Whether you go back and retro-fit this scheme to your existing patch set is up
to you.  But I think it could be worthwhile.

r~

[Qemu-devel] [PATCH v8 3/4] hw/riscv/virt: Connect the gpex PCIe

2018-12-11 Thread Alistair Francis

Connect the gpex PCIe device based on the device tree included in the
HiFive Unleashed ROM.

Signed-off-by: Alistair Francis 
Signed-off-by: Logan Gunthorpe 
Reviewed-by: Logan Gunthorpe 
Tested-by: Guenter Roeck 
Tested-by: Andrea Bolognani 
---
 default-configs/riscv32-softmmu.mak |   5 +-
 default-configs/riscv64-softmmu.mak |   5 +-
 hw/riscv/virt.c | 131 +++-
 include/hw/riscv/virt.h |  13 ++-
 4 files changed, 150 insertions(+), 4 deletions(-)

diff --git a/default-configs/riscv32-softmmu.mak 
b/default-configs/riscv32-softmmu.mak
index 7937c69e22..c5ea36cba5 100644
--- a/default-configs/riscv32-softmmu.mak
+++ b/default-configs/riscv32-softmmu.mak
@@ -1,7 +1,10 @@
 # Default configuration for riscv-softmmu
 
+include pci.mak
+
 CONFIG_SERIAL=y
 CONFIG_VIRTIO_MMIO=y
-include virtio.mak
 
 CONFIG_CADENCE=y
+
+CONFIG_PCI_GENERIC=y
diff --git a/default-configs/riscv64-softmmu.mak 
b/default-configs/riscv64-softmmu.mak
index 7937c69e22..c5ea36cba5 100644
--- a/default-configs/riscv64-softmmu.mak
+++ b/default-configs/riscv64-softmmu.mak
@@ -1,7 +1,10 @@
 # Default configuration for riscv-softmmu
 
+include pci.mak
+
 CONFIG_SERIAL=y
 CONFIG_VIRTIO_MMIO=y
-include virtio.mak
 
 CONFIG_CADENCE=y
+
+CONFIG_PCI_GENERIC=y
diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
index 6b6fa39aaa..e7f0716fb6 100644
--- a/hw/riscv/virt.c
+++ b/hw/riscv/virt.c
@@ -39,6 +39,8 @@
 #include "sysemu/arch_init.h"
 #include "sysemu/device_tree.h"
 #include "exec/address-spaces.h"
+#include "hw/pci/pci.h"
+#include "hw/pci-host/gpex.h"
 #include "elf.h"
 
 #include 
@@ -55,6 +57,9 @@ static const struct MemmapEntry {
 [VIRT_UART0] =   { 0x1000, 0x100 },
 [VIRT_VIRTIO] =  { 0x10001000,0x1000 },
 [VIRT_DRAM] ={ 0x8000,   0x0 },
+[VIRT_PCIE_MMIO] =   { 0x4000,0x4000 },
+[VIRT_PCIE_PIO] ={ 0x0300,0x0001 },
+[VIRT_PCIE_ECAM] =   { 0x3000,0x1000 },
 };
 
 static uint64_t load_kernel(const char *kernel_filename)
@@ -98,6 +103,51 @@ static hwaddr load_initrd(const char *filename, uint64_t 
mem_size,
 return *start + size;
 }
 
+static void create_pcie_irq_map(void *fdt, char *nodename,
+uint32_t plic_phandle)
+{
+int pin, dev;
+uint32_t
+full_irq_map[GPEX_NUM_IRQS * GPEX_NUM_IRQS * FDT_INT_MAP_WIDTH] = {};
+uint32_t *irq_map = full_irq_map;
+
+/* This code creates a standard swizzle of interrupts such that
+ * each device's first interrupt is based on it's PCI_SLOT number.
+ * (See pci_swizzle_map_irq_fn())
+ *
+ * We only need one entry per interrupt in the table (not one per
+ * possible slot) seeing the interrupt-map-mask will allow the table
+ * to wrap to any number of devices.
+ */
+for (dev = 0; dev < GPEX_NUM_IRQS; dev++) {
+int devfn = dev * 0x8;
+
+for (pin = 0; pin < GPEX_NUM_IRQS; pin++) {
+int irq_nr = PCIE_IRQ + ((pin + PCI_SLOT(devfn)) % GPEX_NUM_IRQS);
+int i = 0;
+
+irq_map[i] = cpu_to_be32(devfn << 8);
+
+i += FDT_PCI_ADDR_CELLS;
+irq_map[i] = cpu_to_be32(pin + 1);
+
+i += FDT_PCI_INT_CELLS;
+irq_map[i++] = cpu_to_be32(plic_phandle);
+
+i += FDT_PLIC_ADDR_CELLS;
+irq_map[i] = cpu_to_be32(irq_nr);
+
+irq_map += FDT_INT_MAP_WIDTH;
+}
+}
+
+qemu_fdt_setprop(fdt, nodename, "interrupt-map",
+ full_irq_map, sizeof(full_irq_map));
+
+qemu_fdt_setprop_cells(fdt, nodename, "interrupt-map-mask",
+   0x1800, 0, 0, 0x7);
+}
+
 static void *create_fdt(RISCVVirtState *s, const struct MemmapEntry *memmap,
 uint64_t mem_size, const char *cmdline)
 {
@@ -203,7 +253,10 @@ static void *create_fdt(RISCVVirtState *s, const struct 
MemmapEntry *memmap,
 nodename = g_strdup_printf("/soc/interrupt-controller@%lx",
 (long)memmap[VIRT_PLIC].base);
 qemu_fdt_add_subnode(fdt, nodename);
-qemu_fdt_setprop_cell(fdt, nodename, "#interrupt-cells", 1);
+qemu_fdt_setprop_cells(fdt, nodename, "#address-cells",
+   FDT_PLIC_ADDR_CELLS);
+qemu_fdt_setprop_cell(fdt, nodename, "#interrupt-cells",
+  FDT_PLIC_INT_CELLS);
 qemu_fdt_setprop_string(fdt, nodename, "compatible", "riscv,plic0");
 qemu_fdt_setprop(fdt, nodename, "interrupt-controller", NULL, 0);
 qemu_fdt_setprop(fdt, nodename, "interrupts-extended",
@@ -233,6 +286,33 @@ static void *create_fdt(RISCVVirtState *s, const struct 
MemmapEntry *memmap,
 g_free(nodename);
 }
 
+nodename = g_strdup_printf("/soc/pci@%lx",
+(long) memmap[VIRT_PCIE_ECAM].base);
+qemu_fdt_add_subnode(fdt, nodename);
+qemu_fdt_setprop_cells(fdt, nodename, "#address-cells",
+   FDT_PCI_ADDR_CELLS);
+qemu_fdt_setprop_cells(fdt,

[Qemu-devel] [PATCH v8 10/12] spapr: enable XIVE MMIOs at reset

2018-12-11 Thread Cédric Le Goater

Depending on the interrupt mode of the machine, enable or disable the
XIVE MMIOs.

Signed-off-by: Cédric Le Goater 
---

 Changes since v7:

 - renamed spapr_xive_enable_mmio() to spapr_xive_mmio_set_enabled()

 include/hw/ppc/spapr_xive.h | 1 +
 hw/intc/spapr_xive.c| 9 +
 hw/ppc/spapr_irq.c  | 8 
 3 files changed, 18 insertions(+)

diff --git a/include/hw/ppc/spapr_xive.h b/include/hw/ppc/spapr_xive.h
index 728735dbcfbe..9b49871bdb1a 100644
--- a/include/hw/ppc/spapr_xive.h
+++ b/include/hw/ppc/spapr_xive.h
@@ -48,5 +48,6 @@ void spapr_xive_hcall_init(sPAPRMachineState *spapr);
 void spapr_dt_xive(sPAPRMachineState *spapr, uint32_t nr_servers, void *fdt,
uint32_t phandle);
 void spapr_xive_set_tctx_os_cam(XiveTCTX *tctx);
+void spapr_xive_mmio_set_enabled(sPAPRXive *xive, bool enable);
 
 #endif /* PPC_SPAPR_XIVE_H */
diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
index aaa5865c4080..cd1b2c06f88b 100644
--- a/hw/intc/spapr_xive.c
+++ b/hw/intc/spapr_xive.c
@@ -179,6 +179,15 @@ static void spapr_xive_map_mmio(sPAPRXive *xive)
 sysbus_mmio_map(SYS_BUS_DEVICE(xive), 2, xive->tm_base);
 }
 
+void spapr_xive_mmio_set_enabled(sPAPRXive *xive, bool enable)
+{
+memory_region_set_enabled(>source.esb_mmio, enable);
+memory_region_set_enabled(>tm_mmio, enable);
+
+/* Disable the END ESBs until a guest OS makes use of them */
+memory_region_set_enabled(>end_source.esb_mmio, false);
+}
+
 /*
  * When a Virtual Processor is scheduled to run on a HW thread, the
  * hypervisor pushes its identifier in the OS CAM line. Emulate the
diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
index 814500f22d34..b1319905327f 100644
--- a/hw/ppc/spapr_irq.c
+++ b/hw/ppc/spapr_irq.c
@@ -216,6 +216,11 @@ static void spapr_irq_reset_xics(sPAPRMachineState *spapr, 
Error **errp)
 CPU_FOREACH(cs) {
 spapr_cpu_core_set_intc(POWERPC_CPU(cs), spapr->icp_type);
 }
+
+/* Deactivate the XIVE MMIOs */
+if (spapr->xive) {
+spapr_xive_mmio_set_enabled(spapr->xive, false);
+}
 }
 
 #define SPAPR_IRQ_XICS_NR_IRQS 0x1000
@@ -341,6 +346,9 @@ static void spapr_irq_reset_xive(sPAPRMachineState *spapr, 
Error **errp)
 /* (TCG) Set the OS CAM line of the thread interrupt context. */
 spapr_xive_set_tctx_os_cam(XIVE_TCTX(cpu->intc));
 }
+
+/* Activate the XIVE MMIOs */
+spapr_xive_mmio_set_enabled(spapr->xive, true);
 }
 
 /*
-- 
2.17.2

[Qemu-devel] [PATCH v8 11/12] spapr: introduce a new sPAPR IRQ backend supporting XIVE and XICS

2018-12-11 Thread Cédric Le Goater

The 'dual' sPAPR IRQ backend supports both interrupt mode, XIVE
exploitation mode and the legacy compatibility mode (XICS). both modes
are not supported at the same time.

The machine starts with the legacy mode and a new interrupt mode can
then be negotiated by the CAS process. In this case, the new mode is
activated after a reset to take into account the required changes in
the machine. These impact the device tree layout, the interrupt
presenter object and the exposed MMIO regions in the case of XIVE.

Signed-off-by: Cédric Le Goater 
---

 Changes since v7:

 - usage of 'ic-mode' machine option

 include/hw/ppc/spapr_irq.h |   1 +
 hw/ppc/spapr.c |  10 ++-
 hw/ppc/spapr_hcall.c   |  11 +++
 hw/ppc/spapr_irq.c | 143 +
 4 files changed, 162 insertions(+), 3 deletions(-)

diff --git a/include/hw/ppc/spapr_irq.h b/include/hw/ppc/spapr_irq.h
index b34d5a00381b..29936498dbc8 100644
--- a/include/hw/ppc/spapr_irq.h
+++ b/include/hw/ppc/spapr_irq.h
@@ -51,6 +51,7 @@ typedef struct sPAPRIrq {
 extern sPAPRIrq spapr_irq_xics;
 extern sPAPRIrq spapr_irq_xics_legacy;
 extern sPAPRIrq spapr_irq_xive;
+extern sPAPRIrq spapr_irq_dual;
 
 void spapr_irq_init(sPAPRMachineState *spapr, Error **errp);
 int spapr_irq_claim(sPAPRMachineState *spapr, int irq, bool lsi, Error **errp);
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 5d985e38a598..97a5e3c9929f 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -2636,11 +2636,11 @@ static void spapr_machine_init(MachineState *machine)
 spapr_ovec_set(spapr->ov5, OV5_DRMEM_V2);
 
 /* advertise XIVE on POWER9 machines */
-if (spapr->irq->ov5 & SPAPR_OV5_XIVE_EXPLOIT) {
+if (spapr->irq->ov5 & (SPAPR_OV5_XIVE_EXPLOIT | SPAPR_OV5_XIVE_BOTH)) {
 if (ppc_type_check_compat(machine->cpu_type, CPU_POWERPC_LOGICAL_3_00,
   0, spapr->max_compat_pvr)) {
 spapr_ovec_set(spapr->ov5, OV5_XIVE_EXPLOIT);
-} else {
+} else if (spapr->irq->ov5 & SPAPR_OV5_XIVE_EXPLOIT) {
 error_report("XIVE-only machines require a POWER9 CPU");
 exit(1);
 }
@@ -3066,6 +3066,8 @@ static char *spapr_get_ic_mode(Object *obj, Error **errp)
 return g_strdup("xics");
 } else if (spapr->irq == _irq_xive) {
 return g_strdup("xive");
+} else if (spapr->irq == _irq_dual) {
+return g_strdup("dual");
 }
 g_assert_not_reached();
 }
@@ -3079,6 +3081,8 @@ static void spapr_set_ic_mode(Object *obj, const char 
*value, Error **errp)
 spapr->irq = _irq_xics;
 } else if (strcmp(value, "xive") == 0) {
 spapr->irq = _irq_xive;
+} else if (strcmp(value, "dual") == 0) {
+spapr->irq = _irq_dual;
 } else {
 error_setg(errp, "Bad value for \"ic-mode\" property");
 }
@@ -3127,7 +3131,7 @@ static void spapr_instance_init(Object *obj)
 object_property_add_str(obj, "ic-mode", spapr_get_ic_mode,
 spapr_set_ic_mode, NULL);
 object_property_set_description(obj, "ic-mode",
- "Specifies the interrupt controller mode (xics, xive)",
+ "Specifies the interrupt controller mode (xics, xive, dual)",
  NULL);
 }
 
diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
index ae913d070f50..09386458f267 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -1654,6 +1654,17 @@ static target_ulong 
h_client_architecture_support(PowerPCCPU *cpu,
 (spapr_h_cas_compose_response(spapr, args[1], args[2],
   ov5_updates) != 0);
 }
+
+/*
+ * Generate a machine reset when we have an update of the
+ * interrupt mode. Only required on the machine supporting both
+ * mode.
+ */
+if (!spapr->cas_reboot) {
+spapr->cas_reboot = spapr_ovec_test(ov5_updates, OV5_XIVE_EXPLOIT)
+&& spapr->irq->ov5 & SPAPR_OV5_XIVE_BOTH;
+}
+
 spapr_ovec_cleanup(ov5_updates);
 
 if (spapr->cas_reboot) {
diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
index b1319905327f..31d195b08952 100644
--- a/hw/ppc/spapr_irq.c
+++ b/hw/ppc/spapr_irq.c
@@ -375,6 +375,149 @@ sPAPRIrq spapr_irq_xive = {
 .reset   = spapr_irq_reset_xive,
 };
 
+/*
+ * Dual XIVE and XICS IRQ backend.
+ *
+ * Both interrupt mode, XIVE and XICS, objects are created but the
+ * machine starts in legacy interrupt mode (XICS). It can be changed
+ * by the CAS negotiation process and, in that case, the new mode is
+ * activated after extra machine reset.
+ */
+
+/*
+ * Returns the sPAPR IRQ backend negotiated by CAS. XICS is the
+ * default.
+ */
+static sPAPRIrq *spapr_irq_current(sPAPRMachineState *spapr)
+{
+return spapr_ovec_test(spapr->ov5_cas, OV5_XIVE_EXPLOIT) ?
+_irq_xive : _irq_xics;
+}
+
+static void spapr_irq_init_dual(sPAPRMachineState *spapr, Error **errp)
+{
+MachineState *machine = MACHINE(spapr);
+Error *local_err =

[Qemu-devel] [PATCH v8 4/4] riscv: Enable VGA and PCIE_VGA

2018-12-11 Thread Alistair Francis

Enable compile support for VGA devices. This allows the user to conenct
a display by adding '-device bochs-display -display sdl' to their
command line argument.

Signed-off-by: Alistair Francis 
Reviewed-by: Logan Gunthorpe 
Tested-by: Andrea Bolognani 
---
 default-configs/riscv32-softmmu.mak | 3 +++
 default-configs/riscv64-softmmu.mak | 3 +++
 2 files changed, 6 insertions(+)

diff --git a/default-configs/riscv32-softmmu.mak 
b/default-configs/riscv32-softmmu.mak
index c5ea36cba5..dbc9398284 100644
--- a/default-configs/riscv32-softmmu.mak
+++ b/default-configs/riscv32-softmmu.mak
@@ -8,3 +8,6 @@ CONFIG_VIRTIO_MMIO=y
 CONFIG_CADENCE=y
 
 CONFIG_PCI_GENERIC=y
+
+CONFIG_VGA=y
+CONFIG_VGA_PCI=y
diff --git a/default-configs/riscv64-softmmu.mak 
b/default-configs/riscv64-softmmu.mak
index c5ea36cba5..dbc9398284 100644
--- a/default-configs/riscv64-softmmu.mak
+++ b/default-configs/riscv64-softmmu.mak
@@ -8,3 +8,6 @@ CONFIG_VIRTIO_MMIO=y
 CONFIG_CADENCE=y
 
 CONFIG_PCI_GENERIC=y
+
+CONFIG_VGA=y
+CONFIG_VGA_PCI=y
-- 
2.19.1

[Qemu-devel] [PATCH v8 04/12] spapr: allocate the interrupt thread context under the CPU core

2018-12-11 Thread Cédric Le Goater

Each interrupt mode has its own specific interrupt presenter object,
that we store under the CPU object, one for XICS and one for XIVE.

Extend the sPAPR IRQ backend with a new handler to support them both.

Signed-off-by: Cédric Le Goater 
Reviewed-by: David Gibson 
---
 include/hw/ppc/spapr_irq.h |  2 ++
 include/hw/ppc/xive.h  |  1 +
 hw/intc/xive.c | 22 ++
 hw/ppc/spapr_cpu_core.c|  5 ++---
 hw/ppc/spapr_irq.c | 15 +++
 5 files changed, 42 insertions(+), 3 deletions(-)

diff --git a/include/hw/ppc/spapr_irq.h b/include/hw/ppc/spapr_irq.h
index e51e9f052f63..13db0428ab51 100644
--- a/include/hw/ppc/spapr_irq.h
+++ b/include/hw/ppc/spapr_irq.h
@@ -41,6 +41,8 @@ typedef struct sPAPRIrq {
 void (*print_info)(sPAPRMachineState *spapr, Monitor *mon);
 void (*dt_populate)(sPAPRMachineState *spapr, uint32_t nr_servers,
 void *fdt, uint32_t phandle);
+Object *(*cpu_intc_create)(sPAPRMachineState *spapr, Object *cpu,
+   Error **errp);
 } sPAPRIrq;
 
 extern sPAPRIrq spapr_irq_xics;
diff --git a/include/hw/ppc/xive.h b/include/hw/ppc/xive.h
index 19309d1d65d1..18cd114eb244 100644
--- a/include/hw/ppc/xive.h
+++ b/include/hw/ppc/xive.h
@@ -419,6 +419,7 @@ typedef struct XiveTCTX {
 extern const MemoryRegionOps xive_tm_ops;
 
 void xive_tctx_pic_print_info(XiveTCTX *tctx, Monitor *mon);
+Object *xive_tctx_create(Object *cpu, XiveRouter *xrtr, Error **errp);
 
 static inline uint32_t xive_nvt_cam_line(uint8_t nvt_blk, uint32_t nvt_idx)
 {
diff --git a/hw/intc/xive.c b/hw/intc/xive.c
index ea5385ff7784..53d2f191e8a3 100644
--- a/hw/intc/xive.c
+++ b/hw/intc/xive.c
@@ -526,6 +526,28 @@ static const TypeInfo xive_tctx_info = {
 .class_init= xive_tctx_class_init,
 };
 
+Object *xive_tctx_create(Object *cpu, XiveRouter *xrtr, Error **errp)
+{
+Error *local_err = NULL;
+Object *obj;
+
+obj = object_new(TYPE_XIVE_TCTX);
+object_property_add_child(cpu, TYPE_XIVE_TCTX, obj, _abort);
+object_unref(obj);
+object_property_add_const_link(obj, "cpu", cpu, _abort);
+object_property_set_bool(obj, true, "realized", _err);
+if (local_err) {
+goto error;
+}
+
+return obj;
+
+error:
+object_unparent(obj);
+error_propagate(errp, local_err);
+return NULL;
+}
+
 /*
  * XIVE ESB helpers
  */
diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
index 2398ce62c0e7..1811cd48db90 100644
--- a/hw/ppc/spapr_cpu_core.c
+++ b/hw/ppc/spapr_cpu_core.c
@@ -11,7 +11,6 @@
 #include "hw/ppc/spapr_cpu_core.h"
 #include "target/ppc/cpu.h"
 #include "hw/ppc/spapr.h"
-#include "hw/ppc/xics.h" /* for icp_create() - to be removed */
 #include "hw/boards.h"
 #include "qapi/error.h"
 #include "sysemu/cpus.h"
@@ -215,6 +214,7 @@ static void spapr_cpu_core_unrealize(DeviceState *dev, 
Error **errp)
 static void spapr_realize_vcpu(PowerPCCPU *cpu, sPAPRMachineState *spapr,
sPAPRCPUCore *sc, Error **errp)
 {
+sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
 CPUPPCState *env = >env;
 CPUState *cs = CPU(cpu);
 Error *local_err = NULL;
@@ -233,8 +233,7 @@ static void spapr_realize_vcpu(PowerPCCPU *cpu, 
sPAPRMachineState *spapr,
 qemu_register_reset(spapr_cpu_reset, cpu);
 spapr_cpu_reset(cpu);
 
-cpu->intc = icp_create(OBJECT(cpu), spapr->icp_type, XICS_FABRIC(spapr),
-   _err);
+cpu->intc = smc->irq->cpu_intc_create(spapr, OBJECT(cpu), _err);
 if (local_err) {
 goto error_unregister;
 }
diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
index 975954dc2712..fdcc7795e446 100644
--- a/hw/ppc/spapr_irq.c
+++ b/hw/ppc/spapr_irq.c
@@ -191,6 +191,12 @@ static void spapr_irq_print_info_xics(sPAPRMachineState 
*spapr, Monitor *mon)
 ics_pic_print_info(spapr->ics, mon);
 }
 
+static Object *spapr_irq_cpu_intc_create_xics(sPAPRMachineState *spapr,
+  Object *cpu, Error **errp)
+{
+return icp_create(cpu, spapr->icp_type, XICS_FABRIC(spapr), errp);
+}
+
 #define SPAPR_IRQ_XICS_NR_IRQS 0x1000
 #define SPAPR_IRQ_XICS_NR_MSIS \
 (XICS_IRQ_BASE + SPAPR_IRQ_XICS_NR_IRQS - SPAPR_IRQ_MSI)
@@ -205,6 +211,7 @@ sPAPRIrq spapr_irq_xics = {
 .qirq= spapr_qirq_xics,
 .print_info  = spapr_irq_print_info_xics,
 .dt_populate = spapr_dt_xics,
+.cpu_intc_create = spapr_irq_cpu_intc_create_xics,
 };
 
 /*
@@ -282,6 +289,12 @@ static void spapr_irq_print_info_xive(sPAPRMachineState 
*spapr,
 spapr_xive_pic_print_info(spapr->xive, mon);
 }
 
+static Object *spapr_irq_cpu_intc_create_xive(sPAPRMachineState *spapr,
+  Object *cpu, Error **errp)
+{
+return xive_tctx_create(cpu, XIVE_ROUTER(spapr->xive), errp);
+}
+
 /*
  * XIVE uses the full IRQ number space. Set it to 8K to be compatible
  * with XICS.
@@ -300,6 +313,7 @@ sPAPRIrq spapr_irq_xive = {

[Qemu-devel] [PATCH v8 05/12] spapr: extend the sPAPR IRQ backend for XICS migration

2018-12-11 Thread Cédric Le Goater

Introduce a new sPAPR IRQ handler to handle resend after migration
when the machine is using a KVM XICS interrupt controller model.

Signed-off-by: Cédric Le Goater 
Reviewed-by: David Gibson 
---
 include/hw/ppc/spapr_irq.h |  2 ++
 hw/ppc/spapr.c | 13 +
 hw/ppc/spapr_irq.c | 27 +++
 3 files changed, 34 insertions(+), 8 deletions(-)

diff --git a/include/hw/ppc/spapr_irq.h b/include/hw/ppc/spapr_irq.h
index 13db0428ab51..84a25ffb6c65 100644
--- a/include/hw/ppc/spapr_irq.h
+++ b/include/hw/ppc/spapr_irq.h
@@ -43,6 +43,7 @@ typedef struct sPAPRIrq {
 void *fdt, uint32_t phandle);
 Object *(*cpu_intc_create)(sPAPRMachineState *spapr, Object *cpu,
Error **errp);
+int (*post_load)(sPAPRMachineState *spapr, int version_id);
 } sPAPRIrq;
 
 extern sPAPRIrq spapr_irq_xics;
@@ -53,6 +54,7 @@ void spapr_irq_init(sPAPRMachineState *spapr, Error **errp);
 int spapr_irq_claim(sPAPRMachineState *spapr, int irq, bool lsi, Error **errp);
 void spapr_irq_free(sPAPRMachineState *spapr, int irq, int num);
 qemu_irq spapr_qirq(sPAPRMachineState *spapr, int irq);
+int spapr_irq_post_load(sPAPRMachineState *spapr, int version_id);
 
 /*
  * XICS legacy routines
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index ab46460ddb8b..1f41ea2f3c6c 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1730,14 +1730,6 @@ static int spapr_post_load(void *opaque, int version_id)
 return err;
 }
 
-if (!object_dynamic_cast(OBJECT(spapr->ics), TYPE_ICS_KVM)) {
-CPUState *cs;
-CPU_FOREACH(cs) {
-PowerPCCPU *cpu = POWERPC_CPU(cs);
-icp_resend(ICP(cpu->intc));
-}
-}
-
 /* In earlier versions, there was no separate qdev for the PAPR
  * RTC, so the RTC offset was stored directly in sPAPREnvironment.
  * So when migrating from those versions, poke the incoming offset
@@ -1758,6 +1750,11 @@ static int spapr_post_load(void *opaque, int version_id)
 }
 }
 
+err = spapr_irq_post_load(spapr, version_id);
+if (err) {
+return err;
+}
+
 return err;
 }
 
diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
index fdcc7795e446..292c448a15fa 100644
--- a/hw/ppc/spapr_irq.c
+++ b/hw/ppc/spapr_irq.c
@@ -197,6 +197,18 @@ static Object 
*spapr_irq_cpu_intc_create_xics(sPAPRMachineState *spapr,
 return icp_create(cpu, spapr->icp_type, XICS_FABRIC(spapr), errp);
 }
 
+static int spapr_irq_post_load_xics(sPAPRMachineState *spapr, int version_id)
+{
+if (!object_dynamic_cast(OBJECT(spapr->ics), TYPE_ICS_KVM)) {
+CPUState *cs;
+CPU_FOREACH(cs) {
+PowerPCCPU *cpu = POWERPC_CPU(cs);
+icp_resend(ICP(cpu->intc));
+}
+}
+return 0;
+}
+
 #define SPAPR_IRQ_XICS_NR_IRQS 0x1000
 #define SPAPR_IRQ_XICS_NR_MSIS \
 (XICS_IRQ_BASE + SPAPR_IRQ_XICS_NR_IRQS - SPAPR_IRQ_MSI)
@@ -212,6 +224,7 @@ sPAPRIrq spapr_irq_xics = {
 .print_info  = spapr_irq_print_info_xics,
 .dt_populate = spapr_dt_xics,
 .cpu_intc_create = spapr_irq_cpu_intc_create_xics,
+.post_load   = spapr_irq_post_load_xics,
 };
 
 /*
@@ -295,6 +308,11 @@ static Object 
*spapr_irq_cpu_intc_create_xive(sPAPRMachineState *spapr,
 return xive_tctx_create(cpu, XIVE_ROUTER(spapr->xive), errp);
 }
 
+static int spapr_irq_post_load_xive(sPAPRMachineState *spapr, int version_id)
+{
+return 0;
+}
+
 /*
  * XIVE uses the full IRQ number space. Set it to 8K to be compatible
  * with XICS.
@@ -314,6 +332,7 @@ sPAPRIrq spapr_irq_xive = {
 .print_info  = spapr_irq_print_info_xive,
 .dt_populate = spapr_dt_xive,
 .cpu_intc_create = spapr_irq_cpu_intc_create_xive,
+.post_load   = spapr_irq_post_load_xive,
 };
 
 /*
@@ -352,6 +371,13 @@ qemu_irq spapr_qirq(sPAPRMachineState *spapr, int irq)
 return smc->irq->qirq(spapr, irq);
 }
 
+int spapr_irq_post_load(sPAPRMachineState *spapr, int version_id)
+{
+sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
+
+return smc->irq->post_load(spapr, version_id);
+}
+
 /*
  * XICS legacy routines - to deprecate one day
  */
@@ -420,4 +446,5 @@ sPAPRIrq spapr_irq_xics_legacy = {
 .print_info  = spapr_irq_print_info_xics,
 .dt_populate = spapr_dt_xics,
 .cpu_intc_create = spapr_irq_cpu_intc_create_xics,
+.post_load   = spapr_irq_post_load_xics,
 };
-- 
2.17.2

[Qemu-devel] [PATCH v8 01/12] spapr: introduce a new machine IRQ backend for XIVE

2018-12-11 Thread Cédric Le Goater

The XIVE IRQ backend uses the same layout as the new XICS backend but
covers the full range of the IRQ number space. The IRQ numbers for the
CPU IPIs are allocated at the bottom of this space, below 4K, to
preserve compatibility with XICS which does not use that range.

This should be enough given that the maximum number of CPUs is 1024
for the sPAPR machine under QEMU. For the record, the biggest POWER8
or POWER9 system has a maximum of 1536 HW threads (16 sockets, 192
cores, SMT8).

Signed-off-by: Cédric Le Goater 
Reviewed-by: David Gibson 
---

 Changes since v7:

 - removed spapr_xive_create()
 - reworked spapr_irq_init_xive() to use qdev_create()
 
 include/hw/ppc/spapr.h |  2 +
 include/hw/ppc/spapr_irq.h |  2 +
 hw/ppc/spapr_irq.c | 93 ++
 3 files changed, 97 insertions(+)

diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 198764066dc9..cb3082d319af 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -16,6 +16,7 @@ typedef struct sPAPREventLogEntry sPAPREventLogEntry;
 typedef struct sPAPREventSource sPAPREventSource;
 typedef struct sPAPRPendingHPT sPAPRPendingHPT;
 typedef struct ICSState ICSState;
+typedef struct sPAPRXive sPAPRXive;
 
 #define HPTE64_V_HPTE_DIRTY 0x0040ULL
 #define SPAPR_ENTRY_POINT   0x100
@@ -175,6 +176,7 @@ struct sPAPRMachineState {
 const char *icp_type;
 int32_t irq_map_nr;
 unsigned long *irq_map;
+sPAPRXive  *xive;
 
 bool cmd_line_caps[SPAPR_CAP_NUM];
 sPAPRCapabilities def, eff, mig;
diff --git a/include/hw/ppc/spapr_irq.h b/include/hw/ppc/spapr_irq.h
index bd7301e6d9c6..23cdb51b879e 100644
--- a/include/hw/ppc/spapr_irq.h
+++ b/include/hw/ppc/spapr_irq.h
@@ -13,6 +13,7 @@
 /*
  * IRQ range offsets per device type
  */
+#define SPAPR_IRQ_IPI0x0
 #define SPAPR_IRQ_EPOW   0x1000  /* XICS_IRQ_BASE offset */
 #define SPAPR_IRQ_HOTPLUG0x1001
 #define SPAPR_IRQ_VIO0x1100  /* 256 VIO devices */
@@ -42,6 +43,7 @@ typedef struct sPAPRIrq {
 
 extern sPAPRIrq spapr_irq_xics;
 extern sPAPRIrq spapr_irq_xics_legacy;
+extern sPAPRIrq spapr_irq_xive;
 
 void spapr_irq_init(sPAPRMachineState *spapr, Error **errp);
 int spapr_irq_claim(sPAPRMachineState *spapr, int irq, bool lsi, Error **errp);
diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
index f8b651de0ec9..1f5aac55d370 100644
--- a/hw/ppc/spapr_irq.c
+++ b/hw/ppc/spapr_irq.c
@@ -12,6 +12,7 @@
 #include "qemu/error-report.h"
 #include "qapi/error.h"
 #include "hw/ppc/spapr.h"
+#include "hw/ppc/spapr_xive.h"
 #include "hw/ppc/xics.h"
 #include "sysemu/kvm.h"
 
@@ -205,6 +206,98 @@ sPAPRIrq spapr_irq_xics = {
 .print_info  = spapr_irq_print_info_xics,
 };
 
+/*
+ * XIVE IRQ backend.
+ */
+static void spapr_irq_init_xive(sPAPRMachineState *spapr, Error **errp)
+{
+MachineState *machine = MACHINE(spapr);
+sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
+uint32_t nr_servers = spapr_max_server_number(spapr);
+DeviceState *dev;
+int i;
+
+/* KVM XIVE device not yet available */
+if (kvm_enabled()) {
+if (machine_kernel_irqchip_required(machine)) {
+error_setg(errp, "kernel_irqchip requested. no KVM XIVE support");
+return;
+}
+}
+
+dev = qdev_create(NULL, TYPE_SPAPR_XIVE);
+qdev_prop_set_uint32(dev, "nr-irqs", smc->irq->nr_irqs);
+/*
+ * 8 XIVE END structures per CPU. One for each available priority
+ */
+qdev_prop_set_uint32(dev, "nr-ends", nr_servers << 3);
+qdev_init_nofail(dev);
+
+spapr->xive = SPAPR_XIVE(dev);
+
+/* Enable the CPU IPIs */
+for (i = 0; i < nr_servers; ++i) {
+spapr_xive_irq_claim(spapr->xive, SPAPR_IRQ_IPI + i, false);
+}
+}
+
+static int spapr_irq_claim_xive(sPAPRMachineState *spapr, int irq, bool lsi,
+Error **errp)
+{
+if (!spapr_xive_irq_claim(spapr->xive, irq, lsi)) {
+error_setg(errp, "IRQ %d is invalid", irq);
+return -1;
+}
+return 0;
+}
+
+static void spapr_irq_free_xive(sPAPRMachineState *spapr, int irq, int num)
+{
+int i;
+
+for (i = irq; i < irq + num; ++i) {
+spapr_xive_irq_free(spapr->xive, i);
+}
+}
+
+static qemu_irq spapr_qirq_xive(sPAPRMachineState *spapr, int irq)
+{
+return spapr_xive_qirq(spapr->xive, irq);
+}
+
+static void spapr_irq_print_info_xive(sPAPRMachineState *spapr,
+  Monitor *mon)
+{
+CPUState *cs;
+
+CPU_FOREACH(cs) {
+PowerPCCPU *cpu = POWERPC_CPU(cs);
+
+xive_tctx_pic_print_info(XIVE_TCTX(cpu->intc), mon);
+}
+
+spapr_xive_pic_print_info(spapr->xive, mon);
+}
+
+/*
+ * XIVE uses the full IRQ number space. Set it to 8K to be compatible
+ * with XICS.
+ */
+
+#define SPAPR_IRQ_XIVE_NR_IRQS 0x2000
+#define SPAPR_IRQ_XIVE_NR_MSIS (SPAPR_IRQ_XIVE_NR_IRQS - SPAPR_IRQ_MSI)
+
+sPAPRIrq spapr_irq_xive = {
+.nr_irqs =

[Qemu-devel] [PATCH v8 03/12] spapr: add device tree support for the XIVE exploitation mode

2018-12-11 Thread Cédric Le Goater

The XIVE interface for the guest is described in the device tree under
the "interrupt-controller" node. A couple of new properties are
specific to XIVE :

 - "reg"

   contains the base address and size of the thread interrupt
   managnement areas (TIMA), for the User level and for the Guest OS
   level. Only the Guest OS level is taken into account today.

 - "ibm,xive-eq-sizes"

   the size of the event queues. One cell per size supported, contains
   log2 of size, in ascending order.

 - "ibm,xive-lisn-ranges"

   the IRQ interrupt number ranges assigned to the guest for the IPIs.

and also under the root node :

 - "ibm,plat-res-int-priorities"

   contains a list of priorities that the hypervisor has reserved for
   its own use. OPAL uses the priority 7 queue to automatically
   escalate interrupts for all other queues (DD2.X POWER9). So only
   priorities [0..6] are allowed for the guest.

Extend the sPAPR IRQ backend with a new handler to populate the DT
with the appropriate "interrupt-controller" node.

Signed-off-by: Cédric Le Goater 
---

 Changes since v7:

 - changed the EQ sizes list property to contain 64K only
 
 include/hw/ppc/spapr_irq.h  |  2 ++
 include/hw/ppc/spapr_xive.h |  2 ++
 include/hw/ppc/xics.h   |  4 +--
 hw/intc/spapr_xive.c| 64 +
 hw/intc/xics_spapr.c|  3 +-
 hw/ppc/spapr.c  |  3 +-
 hw/ppc/spapr_irq.c  |  3 ++
 7 files changed, 77 insertions(+), 4 deletions(-)

diff --git a/include/hw/ppc/spapr_irq.h b/include/hw/ppc/spapr_irq.h
index 23cdb51b879e..e51e9f052f63 100644
--- a/include/hw/ppc/spapr_irq.h
+++ b/include/hw/ppc/spapr_irq.h
@@ -39,6 +39,8 @@ typedef struct sPAPRIrq {
 void (*free)(sPAPRMachineState *spapr, int irq, int num);
 qemu_irq (*qirq)(sPAPRMachineState *spapr, int irq);
 void (*print_info)(sPAPRMachineState *spapr, Monitor *mon);
+void (*dt_populate)(sPAPRMachineState *spapr, uint32_t nr_servers,
+void *fdt, uint32_t phandle);
 } sPAPRIrq;
 
 extern sPAPRIrq spapr_irq_xics;
diff --git a/include/hw/ppc/spapr_xive.h b/include/hw/ppc/spapr_xive.h
index 9506a8f4d10a..728a5e8dc163 100644
--- a/include/hw/ppc/spapr_xive.h
+++ b/include/hw/ppc/spapr_xive.h
@@ -45,5 +45,7 @@ qemu_irq spapr_xive_qirq(sPAPRXive *xive, uint32_t lisn);
 typedef struct sPAPRMachineState sPAPRMachineState;
 
 void spapr_xive_hcall_init(sPAPRMachineState *spapr);
+void spapr_dt_xive(sPAPRMachineState *spapr, uint32_t nr_servers, void *fdt,
+   uint32_t phandle);
 
 #endif /* PPC_SPAPR_XIVE_H */
diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
index 9958443d1984..14afda198cdb 100644
--- a/include/hw/ppc/xics.h
+++ b/include/hw/ppc/xics.h
@@ -181,8 +181,6 @@ typedef struct XICSFabricClass {
 ICPState *(*icp_get)(XICSFabric *xi, int server);
 } XICSFabricClass;
 
-void spapr_dt_xics(int nr_servers, void *fdt, uint32_t phandle);
-
 ICPState *xics_icp_get(XICSFabric *xi, int server);
 
 /* Internal XICS interfaces */
@@ -204,6 +202,8 @@ void icp_resend(ICPState *ss);
 
 typedef struct sPAPRMachineState sPAPRMachineState;
 
+void spapr_dt_xics(sPAPRMachineState *spapr, uint32_t nr_servers, void *fdt,
+   uint32_t phandle);
 int xics_kvm_init(sPAPRMachineState *spapr, Error **errp);
 void xics_spapr_init(sPAPRMachineState *spapr);
 
diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
index 982ac6e17051..9e5b2db2439a 100644
--- a/hw/intc/spapr_xive.c
+++ b/hw/intc/spapr_xive.c
@@ -14,6 +14,7 @@
 #include "target/ppc/cpu.h"
 #include "sysemu/cpus.h"
 #include "monitor/monitor.h"
+#include "hw/ppc/fdt.h"
 #include "hw/ppc/spapr.h"
 #include "hw/ppc/spapr_xive.h"
 #include "hw/ppc/xive.h"
@@ -1381,3 +1382,66 @@ void spapr_xive_hcall_init(sPAPRMachineState *spapr)
 spapr_register_hypercall(H_INT_SYNC, h_int_sync);
 spapr_register_hypercall(H_INT_RESET, h_int_reset);
 }
+
+void spapr_dt_xive(sPAPRMachineState *spapr, uint32_t nr_servers, void *fdt,
+   uint32_t phandle)
+{
+sPAPRXive *xive = spapr->xive;
+int node;
+uint64_t timas[2 * 2];
+/* Interrupt number ranges for the IPIs */
+uint32_t lisn_ranges[] = {
+cpu_to_be32(0),
+cpu_to_be32(nr_servers),
+};
+/* EQ size - the sizes of pages supported by the system 4K, 64K,
+ * 2M, 16M. We only advertise 64K for the moment.
+ */
+uint32_t eq_sizes[] = {
+cpu_to_be32(16), /* 64K */
+};
+/* The following array is in sync with the reserved priorities
+ * defined by the 'spapr_xive_priority_is_reserved' routine.
+ */
+uint32_t plat_res_int_priorities[] = {
+cpu_to_be32(7),/* start */
+cpu_to_be32(0xf8), /* count */
+};
+gchar *nodename;
+
+/* Thread Interrupt Management Area : User (ring 3) and OS (ring 2) */
+timas[0] = cpu_to_be64(xive->tm_base +
+   XIVE_TM_USER_PAGE * (1ull << TM_SHIFT));
+timas[1] = cpu_to_be64(1ull <<

[Qemu-devel] [PATCH v8 2/4] hw/riscv/virt: Adjust memory layout spacing

2018-12-11 Thread Alistair Francis

Signed-off-by: Alistair Francis 
Reviewed-by: Logan Gunthorpe 
Tested-by: Guenter Roeck 
Tested-by: Andrea Bolognani 
---
 hw/riscv/virt.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
index 2b38f89070..6b6fa39aaa 100644
--- a/hw/riscv/virt.c
+++ b/hw/riscv/virt.c
@@ -47,14 +47,14 @@ static const struct MemmapEntry {
 hwaddr base;
 hwaddr size;
 } virt_memmap[] = {
-[VIRT_DEBUG] ={0x0,  0x100 },
-[VIRT_MROM] = { 0x1000,0x11000 },
-[VIRT_TEST] = {   0x10, 0x1000 },
-[VIRT_CLINT] ={  0x200,0x1 },
-[VIRT_PLIC] = {  0xc00,  0x400 },
-[VIRT_UART0] ={ 0x1000,  0x100 },
-[VIRT_VIRTIO] =   { 0x10001000, 0x1000 },
-[VIRT_DRAM] = { 0x8000,0x0 },
+[VIRT_DEBUG] =   {0x0, 0x100 },
+[VIRT_MROM] ={ 0x1000,   0x11000 },
+[VIRT_TEST] ={   0x10,0x1000 },
+[VIRT_CLINT] =   {  0x200,   0x1 },
+[VIRT_PLIC] ={  0xc00, 0x400 },
+[VIRT_UART0] =   { 0x1000, 0x100 },
+[VIRT_VIRTIO] =  { 0x10001000,0x1000 },
+[VIRT_DRAM] ={ 0x8000,   0x0 },
 };
 
 static uint64_t load_kernel(const char *kernel_filename)
-- 
2.19.1

[Qemu-devel] [PATCH v8 1/4] hw/riscv/virt: Increase the number of interrupts

2018-12-11 Thread Alistair Francis

Increase the number of interrupts to match the HiFive Unleashed board.

Signed-off-by: Alistair Francis 
Tested-by: Guenter Roeck 
Tested-by: Andrea Bolognani 
---
 include/hw/riscv/virt.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/hw/riscv/virt.h b/include/hw/riscv/virt.h
index 91163d6cbf..2b2e6dd4ea 100644
--- a/include/hw/riscv/virt.h
+++ b/include/hw/riscv/virt.h
@@ -45,7 +45,7 @@ enum {
 UART0_IRQ = 10,
 VIRTIO_IRQ = 1, /* 1 to 8 */
 VIRTIO_COUNT = 8,
-VIRTIO_NDEV = 10
+VIRTIO_NDEV = 0x35 /* Arbitrary maximum number of interrupts */
 };
 
 enum {
-- 
2.19.1

[Qemu-devel] [PATCH v8 0/4] Connect a PCIe host and graphics support to RISC-V

2018-12-11 Thread Alistair Francis

This series is now ready to be merged, all of the patches are reviewed
and tested.

Palmer can you take this with all the other RISC-V patches sent during
the freeze?

V8:
 - Drop SiFive U support
 - Drop legacy -nic support
 - Small other review changes
V7:
 - Fix the GPEX memory mapping thanks to Bin Meng
 - Fix the interrupt mapping thanks to Logan Gunthorpe
V6:
 - Fix the interrupt issue for the GPEX device
V5:
 - Rebase
 - Include pci.mak in the default configs
V4:
 - Fix the spike device tree
 - Don't use stdvga device
V3:
 - Remove Makefile config changes
 - Connect a network adapter to the virt device
V2:
 - Use the gpex PCIe host for virt
 - Add support for SiFive U PCIe


Alistair Francis (4):
  hw/riscv/virt: Increase the number of interrupts
  hw/riscv/virt: Adjust memory layout spacing
  hw/riscv/virt: Connect the gpex PCIe
  riscv: Enable VGA and PCIE_VGA

 default-configs/riscv32-softmmu.mak |   8 +-
 default-configs/riscv64-softmmu.mak |   8 +-
 hw/riscv/virt.c | 147 ++--
 include/hw/riscv/virt.h |  15 ++-
 4 files changed, 165 insertions(+), 13 deletions(-)

-- 
2.19.1

Re: [Qemu-devel] [PULL 00/24] Machine queue post-3.1.0 (including 4.0 machine-types)

2018-12-11 Thread Peter Maydell

On Tue, 11 Dec 2018 at 18:01, Eduardo Habkost  wrote:
>
> The following changes since commit 32a1a94dd324d33578dca1dc96d7896a0244d768:
>
>   Update version for v3.1.0 release (2018-12-11 17:18:37 +)
>
> are available in the Git repository at:
>
>   git://github.com/ehabkost/qemu.git tags/machine-next-pull-request
>
> for you to fetch changes up to 37fdb2c56c603378b85466d1dd64fb4c95f9deb7:
>
>   qom: remove unimplemented class_finalize (2018-12-11 15:45:23 -0200)
>
> 
> Machine queue post-3.1.0 (including 4.0 machine-types)

Applied, thanks.

-- PMM

Re: [Qemu-devel] [Qemu-ppc] [RFC PATCH 0/6] target/ppc: convert VMX instructions to use TCG vector operations

2018-12-11 Thread Richard Henderson

On 12/11/18 1:35 PM, Mark Cave-Ayland wrote:
> Looking at your profiles above, the primary hotspot appears to be
> helper_lookup_tb_ptr(). However as someone quite new to the TCG parts of 
> QEMU, I
> couldn't tell you whether or not this is to be expected.
> 
> Perhaps a question for Richard: what could we consider to be a "normal" 
> backend when
> looking at profiles in terms of recommended features to implement, and to get 
> an idea
> of what a typical profile should look like?
> 
> It would be interesting to compare with a similar workload profile on e.g. 
> ARM to see
> if there is anything obvious that stands out for PPC.

For Alpha, which has relatively sane tlb management, the top entry in the
profile is helper_lookup_tb_ptr at about 8%.  Otherwise, somewhere in the top 2
or 3 functions will be helper_le_ldq_mmu at about 2%.

That's probably a best-case scenario.


r~

Re: [Qemu-devel] [RFC PATCH 4/6] target/ppc: switch FPR, VMX and VSX helpers to access data directly from cpu_env

2018-12-11 Thread Richard Henderson

On 12/11/18 1:21 PM, Mark Cave-Ayland wrote:
>> Note however, that there are other steps that you must add here before using
>> vector operations in the next patch:
>>
>> (1a) The fpr and vsr arrays must be merged, since fpr[n] == vsrh[n].
>>  If this isn't done, then you simply cannot apply one operation
>>  to two disjoint memory blocks.
>>
>> (1b) The vsr and avr arrays should be merged, since vsr[32+n] == avr[n].
>>  This is simply tidiness, matching the layout to the architecture.
>>
>> These steps will modify gdbstub.c, machine.c, and linux-user/.
> 
> The reason I didn't touch the VSR arrays was because I was hoping that this 
> could be
> done as a follow up later; my thought was that since I'd only introduced 
> vector
> operations into the VMX instructions then currently no vector operations 
> could be
> done across the 2 separate memory blocks?

True, until you convert the VSX insns you can delay this.
Though honestly I would consider doing both at once.

>> (2) The vsr array needs to be QEMU_ALIGN(16).  See target/arm/cpu.h.
>> We assert that the host addresses are 16 byte aligned, so that we
>> can eventually use Altivec/VSX in tcg/ppc/.
> 
> That's a good observation. Presumably being on Intel the unaligned accesses 
> would
> still work but just be slower? I've certainly seen the new vector ops being 
> emitted
> in the generated code.

Yes, currently I generate unaligned loads.  It made sense when considering AVX2
and ARM SVE, since I do not increase the alignment requirements to 32-bytes
when using 256-bit vectors.

I do wonder if I should go back and generate aligned loads, just to raise
SIGBUS when one has forgotten the QEMU_ALIGN marker, as a portability aid.


r~

Re: [Qemu-devel] [qemu-s390x] [PATCH v2 2/3] s390: cpu feature for diagnose 318 andlimit max VCPUs to 247

2018-12-11 Thread Collin Walling

On 12/11/18 11:47 AM, Collin Walling wrote:
> On 12/7/18 7:08 AM, Cornelia Huck wrote:
>> On Thu,  6 Dec 2018 17:24:17 -0500
>> Collin Walling  wrote:
>>
>>> Diagnose 318 is a new z14.2 CPU feature. Since we are able to emulate
>>> it entirely via KVM, we can add guest support for earlier models. A
>>> new CPU feature for diagnose 318 (shortened to diag318) will be made
>>> available to guests starting with the zEC12-full CPU model.
>>>
>>> The z14.2 adds a new read SCP info byte (let's call it byte 134) to
>>> detect the availability of diag318. Because of this, we have room for
>>> one less VCPU and thus limit the max VPUs supported in a configuration
>>> to 247 (down from 248).
>>>
>>> Signed-off-by: Collin Walling .
>>> ---
>>>  hw/s390x/sclp.c | 2 ++
>>>  include/hw/s390x/sclp.h | 2 ++
>>>  target/s390x/cpu.h  | 2 +-
>>>  target/s390x/cpu_features.c | 3 +++
>>>  target/s390x/cpu_features.h | 1 +
>>>  target/s390x/cpu_features_def.h | 3 +++
>>>  target/s390x/gen-features.c | 1 +
>>>  target/s390x/kvm.c  | 1 +
>>>  8 files changed, 14 insertions(+), 1 deletion(-)
>>>
>>
>>> diff --git a/target/s390x/cpu.h b/target/s390x/cpu.h
>>> index 8c2320e..594b4a4 100644
>>> --- a/target/s390x/cpu.h
>>> +++ b/target/s390x/cpu.h
>>> @@ -52,7 +52,7 @@
>>>  
>>>  #define MMU_USER_IDX 0
>>>  
>>> -#define S390_MAX_CPUS 248
>>> +#define S390_MAX_CPUS 247
>>
>> Isn't that already problematic if you try to migrate from an older QEMU
>> with all possible vcpus defined? IOW, don't you really need a way that
>> older machines can still run with one more vcpu?
>>
> 
> Good call. I'll run some tests on this and see what happens. I'll report
> here on those results.
> 

Migrating to a machine that supports less vCPUs will report

error: unsupported configuration: Maximum CPUs greater than specified machine 
type limit

I revisited the code to see if there's a way to dynamically set the max vcpu 
count based 
on the read scp info size, but it gets really tricky and code looks very 
complicated.
(Having a packed struct contain the CPU entries whose maximum is determined by 
hardware
limitations makes things difficult -- but who said s390 is easy? :) )

In reality, do we often have guests running with 248 or even 247 vcpus? If so, 
I imagine
the performance isn't too significant?

>>>  
>>>  typedef struct PSW {
>>>  uint64_t mask;
>>
> 
> 


-- 
Respectfully,
- Collin Walling

[Qemu-devel] [PULL v2 2/3] tpm: Make sure new locality passed to tpm_tis_prep_abort() is valid

2018-12-11 Thread Stefan Berger

Make sure that the new locality passed to tpm_tis_prep_abort()
is valid.

Add a comment to aborting_locty that it may be any locality, including
TPM_TIS_NO_LOCALITY.

Signed-off-by: Stefan Berger 
Reviewed-by: Marc-André Lureau 
---
 hw/tpm/tpm_tis.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/hw/tpm/tpm_tis.c b/hw/tpm/tpm_tis.c
index 176d424ed9..04e4ad9212 100644
--- a/hw/tpm/tpm_tis.c
+++ b/hw/tpm/tpm_tis.c
@@ -263,7 +263,9 @@ static void tpm_tis_prep_abort(TPMState *s, uint8_t locty, 
uint8_t newlocty)
 {
 uint8_t busy_locty;
 
-s->aborting_locty = locty;
+assert(TPM_TIS_IS_VALID_LOCTY(newlocty));
+
+s->aborting_locty = locty; /* may also be TPM_TIS_NO_LOCALITY */
 s->next_locty = newlocty;  /* locality after successful abort */
 
 /*
-- 
2.17.1

[Qemu-devel] [PULL 2/4] x86/cpu: Enable MOVDIR64B cpu feature

2018-12-11 Thread Eduardo Habkost

From: Liu Jingqi 

MOVDIR64B moves 64-bytes as direct-store with 64-bytes write atomicity.
Direct store is implemented by using write combining (WC) for writing
data directly into memory without caching the data.

The bit definition:
CPUID.(EAX=7,ECX=0):ECX[bit 28] MOVDIR64B

The release document ref below link:
https://software.intel.com/sites/default/files/managed/c5/15/\
architecture-instruction-set-extensions-programming-reference.pdf

Cc: Xu Tao 
Signed-off-by: Liu Jingqi 
Message-Id: <1541488407-17045-3-git-send-email-jingqi@intel.com>
Signed-off-by: Eduardo Habkost 
---
 target/i386/cpu.h | 1 +
 target/i386/cpu.c | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index b4f03ffd74..ef41a033c5 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -688,6 +688,7 @@ typedef uint32_t FeatureWordArray[FEATURE_WORDS];
 #define CPUID_7_0_ECX_RDPID(1U << 22)
 #define CPUID_7_0_ECX_CLDEMOTE (1U << 25)  /* CLDEMOTE Instruction */
 #define CPUID_7_0_ECX_MOVDIRI  (1U << 27)  /* MOVDIRI Instruction */
+#define CPUID_7_0_ECX_MOVDIR64B (1U << 28) /* MOVDIR64B Instruction */
 
 #define CPUID_7_0_EDX_AVX512_4VNNIW (1U << 2) /* AVX512 Neural Network 
Instructions */
 #define CPUID_7_0_EDX_AVX512_4FMAPS (1U << 3) /* AVX512 Multiply Accumulation 
Single Precision */
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 227baea337..86a934d450 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -1024,7 +1024,7 @@ static FeatureWordInfo feature_word_info[FEATURE_WORDS] = 
{
 "la57", NULL, NULL, NULL,
 NULL, NULL, "rdpid", NULL,
 NULL, "cldemote", NULL, "movdiri",
-NULL, NULL, NULL, NULL,
+"movdir64b", NULL, NULL, NULL,
 },
 .cpuid = {
 .eax = 7,
-- 
2.18.0.rc1.1.g3f1ff2140

[Qemu-devel] [PULL v2 0/3] Merge tpm 2018/12/04 v1

2018-12-11 Thread Stefan Berger

This series of patches removes an unnecessary parameter from tpm_tis_abort()
and adds a locality range check (using assert()) to tpm_tis_prep_abort() and
tpm_tis_request_completed().

   Stefan

The following changes since commit 83ea23cd207a03c5736be0231acbf7f8b05dbf52:

  i386: hvf: Fix overrun of _decode_tbl1 (2018-12-03 15:09:55 +)

are available in the Git repository at:

  git://github.com/stefanberger/qemu-tpm.git tags/pull-tpm-2018-12-04-1

for you to fetch changes up to a639f96111eadb3b8e3021fd3f27e2948ad1c640:

  tpm: Make sure the locality received from backend is valid (2018-12-04 
10:21:25 -0500)


Stefan Berger (3):
  tpm: Remove unused locty parameter from tpm_tis_abort()
  tpm: Make sure new locality passed to tpm_tis_prep_abort() is valid
  tpm: Make sure the locality received from backend is valid

 hw/tpm/tpm_tis.c | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

-- 
2.17.1

[Qemu-devel] [PULL 3/4] target/i386/kvm.c: Don't mark cpuid_data as QEMU_PACKED

2018-12-11 Thread Eduardo Habkost

From: Peter Maydell 

clang complains about taking the address of a packed
member of a struct:

target/i386/kvm.c:1245:27: warning: taking address of packed member 'cpuid' of 
class or structure '' may result in an unaligned pointer value 
[-Waddress-of-packed-member]
c = cpuid_find_entry(_data.cpuid, 1, 0);
  ^~~~
target/i386/kvm.c:1297:31: warning: taking address of packed member 'cpuid' of 
class or structure '' may result in an unaligned pointer value 
[-Waddress-of-packed-member]
c = cpuid_find_entry(_data.cpuid, kvm_base, 0);
  ^~~~

The kernel's definitions of struct kvm_cpuid2 and struct
kvm_cpuid_entry2 are carefully set up with padding fields
so that there is no between-struct padding anyway, so
the QEMU_PACKED annotation is unnecessary and might result
in the compiler generating worse code. Drop it, and instead
assert at build time that there is no stray padding.

Signed-off-by: Peter Maydell 
Message-Id: <20181210114654.31433-1-peter.mayd...@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Richard Henderson 
Signed-off-by: Eduardo Habkost 
---
 target/i386/kvm.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/target/i386/kvm.c b/target/i386/kvm.c
index b2401d13ea..739cf8c8ea 100644
--- a/target/i386/kvm.c
+++ b/target/i386/kvm.c
@@ -864,7 +864,15 @@ int kvm_arch_init_vcpu(CPUState *cs)
 struct {
 struct kvm_cpuid2 cpuid;
 struct kvm_cpuid_entry2 entries[KVM_MAX_CPUID_ENTRIES];
-} QEMU_PACKED cpuid_data;
+} cpuid_data;
+/*
+ * The kernel defines these structs with padding fields so there
+ * should be no extra padding in our cpuid_data struct.
+ */
+QEMU_BUILD_BUG_ON(sizeof(cpuid_data) !=
+  sizeof(struct kvm_cpuid2) +
+  sizeof(struct kvm_cpuid_entry2) * KVM_MAX_CPUID_ENTRIES);
+
 X86CPU *cpu = X86_CPU(cs);
 CPUX86State *env = >env;
 uint32_t limit, i, j, cpuid_i;
-- 
2.18.0.rc1.1.g3f1ff2140

[Qemu-devel] [PULL v2 1/3] tpm: Remove unused locty parameter from tpm_tis_abort()

2018-12-11 Thread Stefan Berger

Remove the unused locty parameter from tpm_tis_abort() function.

Signed-off-by: Stefan Berger 
Reviewed-by: Philippe Mathieu-Daudé 
---
 hw/tpm/tpm_tis.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/hw/tpm/tpm_tis.c b/hw/tpm/tpm_tis.c
index d9322692ee..176d424ed9 100644
--- a/hw/tpm/tpm_tis.c
+++ b/hw/tpm/tpm_tis.c
@@ -233,7 +233,7 @@ static void tpm_tis_new_active_locality(TPMState *s, 
uint8_t new_active_locty)
 }
 
 /* abort -- this function switches the locality */
-static void tpm_tis_abort(TPMState *s, uint8_t locty)
+static void tpm_tis_abort(TPMState *s)
 {
 s->rw_offset = 0;
 
@@ -281,7 +281,7 @@ static void tpm_tis_prep_abort(TPMState *s, uint8_t locty, 
uint8_t newlocty)
 }
 }
 
-tpm_tis_abort(s, locty);
+tpm_tis_abort(s);
 }
 
 /*
@@ -311,7 +311,7 @@ static void tpm_tis_request_completed(TPMIf *ti, int ret)
 }
 
 if (TPM_TIS_IS_VALID_LOCTY(s->next_locty)) {
-tpm_tis_abort(s, locty);
+tpm_tis_abort(s);
 }
 
 tpm_tis_raise_irq(s, locty,
-- 
2.17.1

[Qemu-devel] [PULL v2 3/3] tpm: Make sure the locality received from backend is valid

2018-12-11 Thread Stefan Berger

Make sure that the locality passed from the backend to
tpm_tis_request_completed() is valid.

Signed-off-by: Stefan Berger 
Reviewed-by: Marc-André Lureau 
---
 hw/tpm/tpm_tis.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/hw/tpm/tpm_tis.c b/hw/tpm/tpm_tis.c
index 04e4ad9212..2563d7501f 100644
--- a/hw/tpm/tpm_tis.c
+++ b/hw/tpm/tpm_tis.c
@@ -295,6 +295,8 @@ static void tpm_tis_request_completed(TPMIf *ti, int ret)
 uint8_t locty = s->cmd.locty;
 uint8_t l;
 
+assert(TPM_TIS_IS_VALID_LOCTY(locty));
+
 if (s->cmd.selftest_done) {
 for (l = 0; l < TPM_TIS_NUM_LOCALITIES; l++) {
 s->loc[l].sts |= TPM_TIS_STS_SELFTEST_DONE;
-- 
2.17.1

[Qemu-devel] [PULL 1/4] x86/cpu: Enable MOVDIRI cpu feature

2018-12-11 Thread Eduardo Habkost

From: Liu Jingqi 

MOVDIRI moves doubleword or quadword from register to memory through
direct store which is implemented by using write combining (WC) for
writing data directly into memory without caching the data.

The bit definition:
CPUID.(EAX=7,ECX=0):ECX[bit 27] MOVDIRI

The release document ref below link:
https://software.intel.com/sites/default/files/managed/c5/15/\
architecture-instruction-set-extensions-programming-reference.pdf

Cc: Xu Tao 
Signed-off-by: Liu Jingqi 
Message-Id: <1541488407-17045-2-git-send-email-jingqi@intel.com>
Signed-off-by: Eduardo Habkost 
---
 target/i386/cpu.h | 1 +
 target/i386/cpu.c | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 9c52d0cbeb..b4f03ffd74 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -687,6 +687,7 @@ typedef uint32_t FeatureWordArray[FEATURE_WORDS];
 #define CPUID_7_0_ECX_LA57 (1U << 16)
 #define CPUID_7_0_ECX_RDPID(1U << 22)
 #define CPUID_7_0_ECX_CLDEMOTE (1U << 25)  /* CLDEMOTE Instruction */
+#define CPUID_7_0_ECX_MOVDIRI  (1U << 27)  /* MOVDIRI Instruction */
 
 #define CPUID_7_0_EDX_AVX512_4VNNIW (1U << 2) /* AVX512 Neural Network 
Instructions */
 #define CPUID_7_0_EDX_AVX512_4FMAPS (1U << 3) /* AVX512 Multiply Accumulation 
Single Precision */
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index f81d35e1f9..227baea337 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -1023,7 +1023,7 @@ static FeatureWordInfo feature_word_info[FEATURE_WORDS] = 
{
 "avx512bitalg", NULL, "avx512-vpopcntdq", NULL,
 "la57", NULL, NULL, NULL,
 NULL, NULL, "rdpid", NULL,
-NULL, "cldemote", NULL, NULL,
+NULL, "cldemote", NULL, "movdiri",
 NULL, NULL, NULL, NULL,
 },
 .cpuid = {
-- 
2.18.0.rc1.1.g3f1ff2140

[Qemu-devel] [PULL 4/4] i386: Add "stibp" flag name

2018-12-11 Thread Eduardo Habkost

The STIBP flag may be supported by the host KVM module, so QEMU
can allow it to be configured manually, and it can be exposed to
guests when using "-cpu host".

No additional migration code is required because the whole
contents of spec_ctrl is already migrated in the "cpu/spec_ctrl"
section.

Corresponding KVM patch was submitted at:
https://lore.kernel.org/lkml/20181205191956.31480-1-ehabk...@redhat.com/

Signed-off-by: Eduardo Habkost 
Message-Id: <20181210180250.31299-1-ehabk...@redhat.com>
Signed-off-by: Eduardo Habkost 
---
 target/i386/cpu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 86a934d450..12f559b6af 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -1042,7 +1042,7 @@ static FeatureWordInfo feature_word_info[FEATURE_WORDS] = 
{
 NULL, NULL, NULL, NULL,
 NULL, NULL, "pconfig", NULL,
 NULL, NULL, NULL, NULL,
-NULL, NULL, "spec-ctrl", NULL,
+NULL, NULL, "spec-ctrl", "stibp",
 NULL, "arch-capabilities", NULL, "ssbd",
 },
 .cpuid = {
-- 
2.18.0.rc1.1.g3f1ff2140

[Qemu-devel] [PULL 0/4] x86 queue, 2018-12-11

2018-12-11 Thread Eduardo Habkost

The following changes since commit 32a1a94dd324d33578dca1dc96d7896a0244d768:

  Update version for v3.1.0 release (2018-12-11 17:18:37 +)

are available in the Git repository at:

  git://github.com/ehabkost/qemu.git tags/x86-next-pull-request

for you to fetch changes up to 0e8916582991b9fd0b94850a8444b8b80d0a0955:

  i386: Add "stibp" flag name (2018-12-11 18:50:48 -0200)


x86 queue, 2018-12-11

* New CPU features: MOVDIRI, MOVDIR64B (Liu Jingqi);
  STIBP (Eduardo Habkost)
* Fix clang build warning (Peter Maydell)



Eduardo Habkost (1):
  i386: Add "stibp" flag name

Liu Jingqi (2):
  x86/cpu: Enable MOVDIRI cpu feature
  x86/cpu: Enable MOVDIR64B cpu feature

Peter Maydell (1):
  target/i386/kvm.c: Don't mark cpuid_data as QEMU_PACKED

 target/i386/cpu.h |  2 ++
 target/i386/cpu.c |  6 +++---
 target/i386/kvm.c | 10 +-
 3 files changed, 14 insertions(+), 4 deletions(-)

-- 
2.18.0.rc1.1.g3f1ff2140

Re: [Qemu-devel] [Qemu-trivial] [PATCH trivial] docs/devel/build-system: fix 'softmu' typo

2018-12-11 Thread Laurent Vivier

On 28/11/2018 16:34, Emilio G. Cota wrote:
> Signed-off-by: Emilio G. Cota 
> ---
>  docs/devel/build-system.txt | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/docs/devel/build-system.txt b/docs/devel/build-system.txt
> index 52501f2ad9..f9fd27fab0 100644
> --- a/docs/devel/build-system.txt
> +++ b/docs/devel/build-system.txt
> @@ -393,7 +393,7 @@ all use $(obj) as a prefix to the target, e.g.
>  This file provides the entry point used to build each individual system
>  or userspace emulator target. Each enabled target has its own
>  subdirectory. For example if configure is run with the argument
> -'--target-list=x86_64-softmmu', then a sub-directory 'x86_64-softmu'
> +'--target-list=x86_64-softmmu', then a sub-directory 'x86_64-softmmu'
>  will be created, containing a 'Makefile' which symlinks back to
>  Makefile.target
>  
> 

Applied to my trivial-patches branch.

Thanks,
Laurent

Re: [Qemu-devel] [PATCH 4/5] pvrdma: release ring object in case of an error

2018-12-11 Thread P J P

  Hello Yuval,

+-- On Tue, 11 Dec 2018, Yuval Shaia wrote --+
| > Ditto, here send rind and recv rings stays mapped.
| > Look at how QP's ring is destroyed in destroy_qp.
| > 
| > For both case suggesting to define a new static function that destroy rings
| > and call it from both error flow of create_* and from destroy_*
| > 

I see, okay.

Similar resource clean-up required in pvrdma_realize(). In case of an error it 
does: 'goto out;', but nothing is free'd there.

Is another destroy_ routine required there?
 
| Also, can you rebase this patch on top of the patchset i posted last week:
| https://patchwork.kernel.org/patch/10705439/

Okay, I'll send revised patch set. Thanks so much for the prompt review.

Thank you.
--
Prasad J Pandit / Red Hat Product Security Team
47AF CE69 3A90 54AA 9045 1053 DD13 3D32 FE5B 041F

[Qemu-devel] [Bug 1735049] Re: Need MTTCG support for x86 guests

2018-12-11 Thread Emilio G. Cota

This feature is in QEMU v3.1, which was released today.

** Changed in: qemu
   Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1735049

Title:
  Need MTTCG support for x86 guests

Status in QEMU:
  Fix Released

Bug description:
  MTTCG support is notably absent for x86_64 guests.  The last Wiki
  update on MTTCG was back in 2015, and I am having some difficulty
  determining the current status of the underlying requirements to
  enable this feature on x86 hosts.

  For instance, has support for strong-on-weak memory consistency been
  added into QEMU GIT at this point?

  Thanks!

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1735049/+subscriptions

Re: [Qemu-devel] [PATCH v2] target/i386: Fixes to the check missing features routine

2018-12-11 Thread Eric Blake


On 12/11/18 1:47 PM, Wainer dos Santos Moschetta wrote:



Yes, it helped a lot, thanks. And I apologize for my mistake, I'm gonna 
send a v3 fixing it.


You may want to wait a day or so for any other comments on v2, to 
minimize the resend churn. A maintainer can fix up tags, particularly 
when they are aware it is from a newer contributor still learning how 
things work.




Another doubt that I have: is it advisable to CC everyone that reviewed 
(with or without R-by) the previous version of my patch?


CC'ing previous reviewers is generally a reasonable idea, since they are 
then more likely to double-check that the things they pointed out in the 
first version are indeed fixed in the respin (and since without the cc, 
it's a lot easier to miss that a respin is even available on-list for 
followup review, thanks to the list traffic volume).  You might not get 
a reply from everyone cc'd, but that's not fatal to acceptance of the patch.


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

Re: [Qemu-devel] [PATCH v2] target/i386: Fixes to the check missing features routine

2018-12-11 Thread Wainer dos Santos Moschetta




On 12/11/2018 03:15 PM, Eric Blake wrote:

On 12/11/18 10:28 AM, Wainer dos Santos Moschetta wrote:

The x86_cpu_class_check_missing_features() returns a list
of unavailable features compared to the host CPU. Currently it may
return empty strings for unnamed features as well as duplicated
names.

For example, the qmp "query-cpu-definitions" below shows one empty
string and repeated "mpx" entries:




Signed-off-by: Wainer dos Santos Moschetta 
Reviewed-by: Eduardo Habkost 
Reviewed-by: Eric Blake 


Careful. While I spotted typos in v1,...


Reviewed-by: Caio Carrara 
---
v2:
  * Fixed typos. [eblake]


...and you indeed addressed them, me pointing out typos does not imply 
that I reviewed the patch for correctness.  In fact, I specifically 
did NOT give my R-by: tag to v1, because I'm not (yet?) familiar 
enough with the tests/acceptance/ framework to state that I have fully 
reviewed the patch for correctness; instead, I'm comfortable relying 
on the reviews of others (and I am again intentionally not giving R-by 
to v2).


Also, when posting a v2, you should include the R-by actually given to 
v1 only if the patch is roughly the same as the original.  Fixing 
minor issues that a reviewer pointed out, or doing obvious rebasing to 
changes applied earlier in the series or on upstream git, but where 
the algorithm of the patch itself did not change, is okay for keeping 
R-b (so if I _had_ given R-b, and your spelling changes were the only 
difference, then keeping my R-b would make sense); but where the patch 
is fundamentally different, such as:



 * Removed unwanted manual test case. [ccarrara, ehabkost]
 * Not passing 'accel=kvm' on test's VM. [ehabkost]


then omitting ALL R-by tags, in order to ensure that reviewers check 
that the new patch is still correct, is a wiser course of action.  
Yes, this is more of a rule of thumb, and there are cases where 
keeping or dropping R-b is more of an art form than an exact science; 
but hopefully this helps you understand how the tag can be useful for 
iterative reviews.




Hi Eric,

Yes, it helped a lot, thanks. And I apologize for my mistake, I'm gonna 
send a v3 fixing it.


Another doubt that I have: is it advisable to CC everyone that reviewed 
(with or without R-by) the previous version of my patch?


Thanks!

- Wainer

Re: [Qemu-devel] [PATCH] qdev/core: Can not replug device on bus that allows one device

2018-12-11 Thread Tony Krowiak


On 12/11/18 10:23 AM, Igor Mammedov wrote:

On Mon, 10 Dec 2018 14:14:14 -0500
Tony Krowiak  wrote:


If the maximum number of devices allowed on a bus is 1 and a device
which is plugged into the bus is subsequently unplugged, attempting to replug
the device fails with error "Bus 'xxx' does not support hotplugging".
The "error" is detected in the qbus_is_full(BusState *bus) function
(qdev_monitor.c) because bus->max_index >= bus_class->max_dev. The
root of the problem is that the bus->max_index is not decremented when a device
is unplugged from the bus. This patch fixes that problem.

Signed-off-by: Tony Krowiak 
---
  hw/core/qdev.c | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/hw/core/qdev.c b/hw/core/qdev.c
index 6b3cc55b27c2..b35b0bf27925 100644
--- a/hw/core/qdev.c
+++ b/hw/core/qdev.c
@@ -59,6 +59,8 @@ static void bus_remove_child(BusState *bus, DeviceState 
*child)
  snprintf(name, sizeof(name), "child[%d]", kid->index);
  QTAILQ_REMOVE(>children, kid, sibling);
  
+bus->max_index--;

that might cause problem when bus is allowed to has more than 1 device
and unplugged device is not the last one.
In that case bus_add_child() might create a child with duplicate name.


I see what you are saying. The max_index is assigned to the child device
index in the bus_add_child() function. The child device index is used to
generate a unique name for the child device. The generated name is then
used to link the child as a property to the bus. When the child is
removed from the bus, the name is regenerated from the child device
index and the named property is . It would therefore appear that the
real purpose of the max_index is to generate indexes for children of
the bus thus ensuring each child has a unique index. In other words,
the max_index value is only tangentially connected to indexing the list
of children.

This results in a disconnect between the usage of the max_index value
when adding and removing a child from the bus, and the check in the
qbus_is_full() function which compares the max_index to the maximum
number of devices allowed on the bus. If a child has been removed from
the bus, the max_index value does not indicate whether the bus is
full, it only specifies the index to be assigned to the next child to be
added to the bus.

To resolve this problem, I propose the following:

Add the following field to struct BusState (include/hw/qdev_core.h):

struct BusState {
Object obj;
DeviceState *parent;
char *name;
HotplugHandler *hotplug_handler;
int max_index;
bool realized;
+int num_children;
QTAILQ_HEAD(ChildrenHead, BusChild) children;
QLIST_ENTRY(BusState) sibling;
};

Add the following lines of code to the add/remove child functions in
hw/core/qdev.c:

static void bus_add_child(BusState *bus, DeviceState *child)
{
char name[32];
BusChild *kid = g_malloc0(sizeof(*kid));

kid->index = bus->max_index++;
kid->child = child;
object_ref(OBJECT(kid->child));

QTAILQ_INSERT_HEAD(>children, kid, sibling);

/* This transfers ownership of kid->child to the property.  */
snprintf(name, sizeof(name), "child[%d]", kid->index);
object_property_add_link(OBJECT(bus), name,
 object_get_typename(OBJECT(child)),
 (Object **)>child,
 NULL, /* read-only property */
 0, /* return ownership on prop deletion */
 NULL);

+bus->num_children++;
}

static void bus_remove_child(BusState *bus, DeviceState *child)
{
BusChild *kid;

QTAILQ_FOREACH(kid, >children, sibling) {
if (kid->child == child) {
char name[32];

snprintf(name, sizeof(name), "child[%d]", kid->index);
QTAILQ_REMOVE(>children, kid, sibling);

/* This gives back ownership of kid->child back to us.  */
object_property_del(OBJECT(bus), name, NULL);
object_unref(OBJECT(kid->child));
g_free(kid);

+   bus->num_children--;

return;
}
}
}

Change the line of code in the qbus_is_full() function in
qdev_monitor.c:


static inline bool qbus_is_full(BusState *bus)
{
BusClass *bus_class = BUS_GET_CLASS(bus);
-return bus_class->max_dev &&
-   bus->max_index >= bus_class->max_dev;
+return bus_class->max_dev &&
+   bus->num_children >= bus_class->max_dev;
}






+
  /* This gives back ownership of kid->child back to us.  */
  object_property_del(OBJECT(bus), name, NULL);
  object_unref(OBJECT(kid->child));

Re: [Qemu-devel] [Qemu-ppc] [RFC PATCH 0/6] target/ppc: convert VMX instructions to use TCG vector operations

2018-12-11 Thread Mark Cave-Ayland

On 11/12/2018 01:20, David Gibson wrote:

> On Mon, Dec 10, 2018 at 09:54:51PM +0100, BALATON Zoltan wrote:
>> On Mon, 10 Dec 2018, David Gibson wrote:
>>> On Mon, Dec 10, 2018 at 01:33:53AM +0100, BALATON Zoltan wrote:
 On Fri, 7 Dec 2018, Mark Cave-Ayland wrote:
> This patchset is an attempt at trying to improve the VMX (Altivec) 
> instruction
> performance by making use of the new TCG vector operations where possible.

 This is very welcome, thanks for doing this.

> In order to use TCG vector operations, the registers must be accessible 
> from cpu_env
> whilst currently they are accessed via arrays of static TCG globals. 
> Patches 1-3
> are therefore mechanical patches which introduce access helpers for FPR, 
> AVR and VSR
> registers using the supplied TCGv_i64 parameter.

 Have you tried some benchmarks or tests to measure the impact of these
 changes? I've tried the (very unscientific) benchmarks I've written about
 before here:

 http://lists.nongnu.org/archive/html/qemu-ppc/2018-07/msg00261.html

 (which seem to use AltiVec/VMX instructions but not sure which) on mac99
 with MorphOS and I could not see any performance increase. I haven't run
 enough tests but results with or without this series on master were mostly
 the same within a few percents, and sometimes even seen lower performance
 with these patches than without. I haven't tried to find out why (no time
 for that now) so can't really draw any conclusions from this. I'm also not
 sure if I've actually tested what you've changed or these use instructions
 that your patches don't optimise yet, or the changes I've seen were just
 normal changes between runs; but I wonder if the increased number of
 temporaries could result in lower performance in some cases?
>>>
>>> What was your host machine.  IIUC this change will only improve
>>> performance if the host tcg backend is able to implement TCG vector
>>> ops in terms of vector ops on the host.
>>
>> Tried it on i5 650 which has: sse sse2 ssse3 sse4_1 sse4_2. I assume x86_64
>> should be supported but not sure what are the CPU requirements.
>>
>>> In addition, this series only converts a subset of the integer and
>>> logical vector instructions.  If your testcase is mostly floating
>>> point (vectored or otherwise), it will still be softfloat and so not
>>> see any speedup.
>>
>> Yes, I don't really know what these tests use but I think "lame" test is
>> mostly floating point but tried with "lame_vmx" which should at least use
>> some vector ops and "mplayer -benchmark" test is more vmx dependent based on
>> my previous profiling and testing with hardfloat but I'm not sure. (When
>> testing these with hardfloat I've found that lame was benefiting from
>> hardfloat but mplayer wasn't and more VMX related functions showed up with
>> mplayer so I assumed it's more VMX bound.)
> 
> I should clarify here.  When I say "floating point" above, I'm not
> meaning things using the regular FPU instead of the vector unit.  I'm
> saying *anything* involving floating point calculations whether
> they're done in the FPU or the vector unit.
> 
> The patches here don't convert all VMX instructions to use vector TCG
> ops - they only convert a few, and those few are about using the
> vector unit for integer (and logical) operations.  VMX instructions
> involving floating point calculations are unaffected and will still
> use soft-float.

Right. As I mentioned in an earlier email, this is hopefully laying the 
groundwork
for future evolution of the TCG vector operations and making use of the existing
functions first.

Certainly I'd be interested at looking at the hardfloat patches after this, 
since FP
performance is something that can offer enormous benefit to MacOS emulation.

>> I've tried to do some profiling again to find out what's used but I can't
>> get good results with the tools I have (oprofile stopped working since I've
>> updated my machine and Linux perf provides results that are hard to
>> interpret for me, haven't tried if gprof would work now it didn't before)
>> but I've seen some vector related helpers in the profile so at least some
>> vector ops are used. The "helper_vperm" came up top at about 11th (not sure
>> where is it called from), other vector helpers were lower.
>>
>> I don't remember details now but previously when testing hardfloat I've
>> written this: "I've looked at vperm which came out top in one of the
>> profiles I've taken and on little endian hosts it has the loop backwards and
>> also accesses vector elements from end to front which I wonder may be enough
>> for the compiler to not be able to optimise it? But I haven't checked
>> assembly. The altivec dependent mplayer video decoding test did not change
>> much with hardfloat, it took 98% compared to master so likely altivec is
>> dominating here." (Although this was with the PPC specific vector

Re: [Qemu-devel] [RFC 0/3] target/m68k: convert to transaction_failed hook

2018-12-11 Thread Peter Maydell

On Tue, 11 Dec 2018 at 19:13, Mark Cave-Ayland
 wrote:
> On 10/12/2018 16:56, Peter Maydell wrote:
> > Anyway, I send it out as a skeleton for comments, because
> > it would be nice to get rid of the old unassigned_access
> > hook, which is fundamentally broken (it's still used by m68k,
> > microblaze, mips and sparc).
>
> Laurent is really the expert here (my work on the q800 was purely on the 
> device
> side), however is this also a nudge to see if the unassigned_access hook can 
> be
> eliminated from sparc too? ;)

It would certainly be great to convert sparc too;
it and mips are a little more complicated than these
ones, but the principle is the same:
 * helper functions in target/sparc which call
   cpu_unassigned_access() should be changed to call
   some sparc-internal function to raise the right
   exception
 * callsites in target/sparc which do loads or stores
   by physical address should be checked to ensure they
   do the right thing when a bus error is detected;
   this usually means changing them to use address_space_*
   functions and check they return MEMTX_OK. (With the
   old unassigned_access hook these would result in calls
   to the hook, which was often the wrong thing anyway.
   The transaction_failed hook is called only for accesses
   via the TCG MMU.) The docs/devel/loads-stores.rst docs
   have some handy regexes for use with 'git grep'; for sparc
   these catch everything:
 git grep '\' target/sparc/
 git grep '\' target/sparc/
 * convert the hook itself: this requires a little fiddling
   of parameters, and the addition of the cpu_restore_state()
   call

(MIPS has some odd board-specific handling on top of that
which will need to be fixed too.)

thanks
-- PMM

[Qemu-devel] [PATCH] x86: host-phys-bits-limit option

2018-12-11 Thread Eduardo Habkost

Some downstream distributions of QEMU set host-phys-bits=on by
default.  This worked very well for most use cases, because
phys-bits really didn't have huge consequences. The only
difference was on the CPUID data seen by guests, and on the
handling of reserved bits.

This changed in KVM commit 855feb673640 ("KVM: MMU: Add 5 level
EPT & Shadow page table support").  Now choosing a large
phys-bits value for a VM has bigger impact: it will make KVM use
5-level EPT even when it's not really necessary.  This means
using the host phys-bits value may not be the best choice.

Management software could address this problem by manually
configuring phys-bits depending on the size of the VM and the
amount of MMIO address space required for hotplug.  But this is
not trivial to implement.

However, there's another workaround that would work for most
cases: keep using the host phys-bits value, but only if it's
smaller than 48.  This patch makes this possible by introducing a
new "-cpu" option: "host-phys-bits-limit".  Management software
or users can make sure they will always use 4-level EPT using:
"host-phys-bits=on,host-phys-bits-limit=48".

This behavior is still not enabled by default because QEMU
doesn't enable host-phys-bits=on by default.  But users,
management software, or downstream distributions may choose to
change their defaults using the new option.

Signed-off-by: Eduardo Habkost 
---
 target/i386/cpu.h |  3 ++
 target/i386/cpu.c |  5 ++
 tests/acceptance/x86-phys-bits.py | 81 +++
 3 files changed, 89 insertions(+)
 create mode 100644 tests/acceptance/x86-phys-bits.py

diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 9c52d0cbeb..a545b77c2c 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1457,6 +1457,9 @@ struct X86CPU {
 /* if true override the phys_bits value with a value read from the host */
 bool host_phys_bits;
 
+/* if set, limit maximum value for phys_bits when host_phys_bits is true */
+uint8_t host_phys_bits_limit;
+
 /* Stop SMI delivery for migration compatibility with old machines */
 bool kvm_no_smi_migration;
 
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index f81d35e1f9..df200754a2 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -5152,6 +5152,10 @@ static void x86_cpu_realizefn(DeviceState *dev, Error 
**errp)
 if (cpu->host_phys_bits) {
 /* The user asked for us to use the host physical bits */
 cpu->phys_bits = host_phys_bits;
+if (cpu->host_phys_bits_limit &&
+cpu->phys_bits > cpu->host_phys_bits_limit) {
+cpu->phys_bits = cpu->host_phys_bits_limit;
+}
 }
 
 /* Print a warning if the user set it to a value that's not the
@@ -5739,6 +5743,7 @@ static Property x86_cpu_properties[] = {
 DEFINE_PROP_BOOL("kvm", X86CPU, expose_kvm, true),
 DEFINE_PROP_UINT32("phys-bits", X86CPU, phys_bits, 0),
 DEFINE_PROP_BOOL("host-phys-bits", X86CPU, host_phys_bits, false),
+DEFINE_PROP_UINT8("host-phys-bits-limit", X86CPU, host_phys_bits_limit, 0),
 DEFINE_PROP_BOOL("fill-mtrr-mask", X86CPU, fill_mtrr_mask, true),
 DEFINE_PROP_UINT32("level", X86CPU, env.cpuid_level, UINT32_MAX),
 DEFINE_PROP_UINT32("xlevel", X86CPU, env.cpuid_xlevel, UINT32_MAX),
diff --git a/tests/acceptance/x86-phys-bits.py 
b/tests/acceptance/x86-phys-bits.py
new file mode 100644
index 00..ae85d7e4e7
--- /dev/null
+++ b/tests/acceptance/x86-phys-bits.py
@@ -0,0 +1,81 @@
+# Test for host-phys-bits-limit option
+#
+# Copyright (c) 2018 Red Hat, Inc.
+#
+# Author:
+#  Eduardo Habkost 
+#
+# This work is licensed under the terms of the GNU GPL, version 2 or
+# later.  See the COPYING file in the top-level directory.
+import re
+
+from avocado_qemu import Test
+
+class HostPhysbits(Test):
+"""
+Check if `host-phys-bits` and `host-phys-bits` options work.
+
+:avocado: enable
+:avocado: tags=x86_64
+"""
+
+def cpu_qom_get(self, prop):
+qom_path = self.vm.command('query-cpus')[0]['qom_path']
+return self.vm.command('qom-get', path=qom_path, property=prop)
+
+def cpu_phys_bits(self):
+return self.cpu_qom_get('phys-bits')
+
+def host_phys_bits(self):
+cpuinfo = open('/proc/cpuinfo', 'rb').read()
+m = re.search(b'([0-9]+) bits physical', cpuinfo)
+if m is None:
+self.cancel("Couldn't read phys-bits from /proc/cpuinfo")
+return int(m.group(1))
+
+def setUp(self):
+super(HostPhysbits, self).setUp()
+self.vm.add_args('-S', '-machine', 'accel=kvm:tcg')
+self.vm.launch()
+if not self.vm.command('query-kvm')['enabled']:
+self.cancel("Test case requires KVM")
+self.vm.shutdown()
+
+
+def test_no_host_phys_bits(self):
+# default should be phys-bits=40 if host-phys-bits=off
+

Re: [Qemu-devel] [RFC PATCH 3/6] target/ppc: introduce get_cpu_vsr{l, h}() and set_cpu_vsr{l, h}() helpers for VSR register access

2018-12-11 Thread Mark Cave-Ayland

On 10/12/2018 19:16, Richard Henderson wrote:

> On 12/7/18 2:56 AM, Mark Cave-Ayland wrote:
>> +static inline void get_vsr(TCGv_i64 dst, int n)
>> +{
>> +tcg_gen_ld_i64(dst, cpu_env, offsetof(CPUPPCState, vsr[n]));
>> +}
>> +
>> +static inline void set_vsr(int n, TCGv_i64 src)
>> +{
>> +tcg_gen_st_i64(src, cpu_env, offsetof(CPUPPCState, vsr[n]));
>> +}
> 
> Why isn't this helper still using cpu_vsr[n]?

Gah this is just a mistake from squashing patches from my working branch to 
tidy them
up for submission. I'll fix this in the next iteration.


ATB,

Mark.

Re: [Qemu-devel] [RFC PATCH 4/6] target/ppc: switch FPR, VMX and VSX helpers to access data directly from cpu_env

2018-12-11 Thread Mark Cave-Ayland

On 10/12/2018 19:05, Richard Henderson wrote:

> On 12/7/18 2:56 AM, Mark Cave-Ayland wrote:
>> Instead of accessing the FPR, VMX and VSX registers through static arrays of
>> TCGv_i64 globals, remove them and change the helpers to load/store data 
>> directly
>> within cpu_env.
>>
>> Signed-off-by: Mark Cave-Ayland 
>> ---
>>  target/ppc/translate.c | 59 
>> ++
>>  1 file changed, 16 insertions(+), 43 deletions(-)
> 
> Reviewed-by: Richard Henderson 
> 
> Note however, that there are other steps that you must add here before using
> vector operations in the next patch:
> 
> (1a) The fpr and vsr arrays must be merged, since fpr[n] == vsrh[n].
>  If this isn't done, then you simply cannot apply one operation
>  to two disjoint memory blocks.
> 
> (1b) The vsr and avr arrays should be merged, since vsr[32+n] == avr[n].
>  This is simply tidiness, matching the layout to the architecture.
> 
> These steps will modify gdbstub.c, machine.c, and linux-user/.

The reason I didn't touch the VSR arrays was because I was hoping that this 
could be
done as a follow up later; my thought was that since I'd only introduced vector
operations into the VMX instructions then currently no vector operations could 
be
done across the 2 separate memory blocks?

> (2) The vsr array needs to be QEMU_ALIGN(16).  See target/arm/cpu.h.
> We assert that the host addresses are 16 byte aligned, so that we
> can eventually use Altivec/VSX in tcg/ppc/.

That's a good observation. Presumably being on Intel the unaligned accesses 
would
still work but just be slower? I've certainly seen the new vector ops being 
emitted
in the generated code.


ATB,

Mark.

Re: [Qemu-devel] [RFC PATCH 2/6] target/ppc: introduce get_avr64() and set_avr64() helpers for VMX register access

2018-12-11 Thread Mark Cave-Ayland

On 10/12/2018 18:49, Richard Henderson wrote:

> On 12/7/18 2:56 AM, Mark Cave-Ayland wrote:
>> +avr = tcg_temp_new_i64();   
>>   \
>>  EA = tcg_temp_new();
>>   \
>>  gen_addr_reg_index(ctx, EA);
>>   \
>>  tcg_gen_andi_tl(EA, EA, ~0xf);  
>>   \
>>  /* We only need to swap high and low halves. gen_qemu_ld64_i64 does 
>>   \
>> necessary 64-bit byteswap already. */
>>   \
>>  if (ctx->le_mode) { 
>>   \
>> -gen_qemu_ld64_i64(ctx, cpu_avrl[rD(ctx->opcode)], EA);  
>>   \
>> +gen_qemu_ld64_i64(ctx, avr, EA);
>>   \
>> +set_avr64(rD(ctx->opcode), avr, false); 
>>   \
>>  tcg_gen_addi_tl(EA, EA, 8); 
>>   \
>> -gen_qemu_ld64_i64(ctx, cpu_avrh[rD(ctx->opcode)], EA);  
>>   \
>> +gen_qemu_ld64_i64(ctx, avr, EA);
>>   \
>> +set_avr64(rD(ctx->opcode), avr, true);  
> 
> An accurate conversion, but I'm going to call this an existing bug:
> 
> The writeback to both avr{l,h} should be delayed until all exceptions have 
> been
> raised.  Thus you should perform the two gen_qemu_ld64_i64 into two 
> temporaries
> and only then write them both back via set_avr64.

Thanks for the pointer. I'll add this to the list of changes for the next 
revision of
the patchset.

> Otherwise,
> Reviewed-by: Richard Henderson 


ATB,

Mark.

Re: [Qemu-devel] [RFC PATCH 1/6] target/ppc: introduce get_fpr() and set_fpr() helpers for FP register access

2018-12-11 Thread Mark Cave-Ayland

On 10/12/2018 18:43, Richard Henderson wrote:

> On 12/7/18 2:56 AM, Mark Cave-Ayland wrote:
>> -gen_helper_f##op(cpu_fpr[rD(ctx->opcode)], cpu_env, 
>>   \
>> - cpu_fpr[rA(ctx->opcode)],  
>>   \
>> - cpu_fpr[rC(ctx->opcode)], cpu_fpr[rB(ctx->opcode)]);   
>>   \
>> +get_fpr(t0, rA(ctx->opcode));   
>>   \
>> +get_fpr(t1, rC(ctx->opcode));   
>>   \
>> +get_fpr(t2, rB(ctx->opcode));   
>>   \
>> +gen_helper_f##op(t3, cpu_env, t0, t1, t2);  
>>   \
>> +set_fpr(rD(ctx->opcode), t3);   
>>   \
>>  if (isfloat) {  
>>   \
>> -gen_helper_frsp(cpu_fpr[rD(ctx->opcode)], cpu_env,  
>>   \
>> -cpu_fpr[rD(ctx->opcode)]);  
>>   \
>> +get_fpr(t0, rD(ctx->opcode));   
>>   \
>> +gen_helper_frsp(t3, cpu_env, t0);   
>>   \
>> +set_fpr(rD(ctx->opcode), t3);   
>>   \
>>  }   
>>   \
> 
> This is an accurate conversion, but the writeback to the rD register need not
> happen until after helper_frsp.  Just move it below the isfloat block.

Okay - I'll fix this up in the next iteration.

> I do see that helper_frsp can raise an exception for invalid_op for SNaN.  If
> that code were actually reachable, this would have been an existing bug, in
> that we should not have written back to rD after the first operation.  
> However,
> any SNaN will already have been eliminated by the first operation (via
> squashing to QNaN or by exiting via exception).
> 
> Similarly in GEN_FLOAT_AB.

Noted.

>> +get_fpr(t0, rB(ctx->opcode));
>> +gen_helper_frsqrte(t1, cpu_env, t0);
>> +set_fpr(rD(ctx->opcode), t1);
>> +gen_helper_frsp(t1, cpu_env, t1);
>> +gen_compute_fprf_float64(t1);
> 
> gen_frsqrtes has the set_fpr in the wrong place.  Likewise gen_fsqrts.

Ooops. I remember having a think a bit harder about these two functions, so 
I'll go
and take another look.


ATB,

Mark.

Re: [Qemu-devel] [RFC 0/3] target/m68k: convert to transaction_failed hook

2018-12-11 Thread Mark Cave-Ayland

On 10/12/2018 16:56, Peter Maydell wrote:

> This patchset converts the m68k target from the deprecated
> unassigned_access hook to the new transaction_failed hook.
> It's RFC for a couple of reasons:
>  * it's untested, since I don't have an m68k test image
>  * the second patch just makes "bus error while trying to
>read page tables" be treated as a page fault, when it
>should probably cause a fault reporting it as a bus error
>of some kind
>  * I don't understand why the old unassigned_access hook
>set the ATC bit in the MMU SSW, since the docs I have say
>this should be set if the fault happened during a table
>search, but cleared if it's just an ordinary bus-errored
>data or insn access. Probably this is a pre-existing bug?
> 
> Anyway, I send it out as a skeleton for comments, because
> it would be nice to get rid of the old unassigned_access
> hook, which is fundamentally broken (it's still used by m68k,
> microblaze, mips and sparc).

Laurent is really the expert here (my work on the q800 was purely on the device
side), however is this also a nudge to see if the unassigned_access hook can be
eliminated from sparc too? ;)


ATB,

Mark.

Re: [Qemu-devel] [RFC PATCH 0/6] target/ppc: convert VMX instructions to use TCG vector operations

2018-12-11 Thread Mark Cave-Ayland

On 10/12/2018 13:04, Aleksandar Markovic wrote:

> On Dec 7, 2018 9:59 AM, "Mark Cave-Ayland" 
> wrote:
>>
>> This patchset is an attempt at trying to improve the VMX (Altivec)
> instruction
>> performance by making use of the new TCG vector operations where possible.
>>
> 
> Hello, Mark.
> 
> I just want to say that I support these efforts. Very interesting, it can
> bring significant improvements for multimedia intensive emulations. But,
> even more important, we can see new tcg vector interface in action,
> possibly suggest improvements, extensions.
> 
> I would just like to ask you to add some performance comparisons, if
> possible. Very simple tests would be sufficient.
> 
> Thanks again for this series!

Thanks Aleksander! In my local tests I haven't done much in the way of 
benchmarking,
other than to check the output of the disassembler to confirm that the x86 
vector
opcodes are being used. Given that only a small number of TCG vector operations 
are
currently suitable for use with the VMX code, then I would expect certain 
artificial
benchmarks to show improvements e.g. with the logical ops but there was 
certainly
nothing noticeable in normal usage with my OpenBIOS test images.

I agree with you that having a second implementation other than ARM will help 
provide
ideas as to how the vector interfaces can be evolved to support more operations 
in
future.

ATB,

Mark.

Re: [Qemu-devel] [RFC PATCH 1/6] target/ppc: introduce get_fpr() and set_fpr() helpers for FP register access

2018-12-11 Thread Mark Cave-Ayland

On 10/12/2018 05:17, David Gibson wrote:

> On Fri, Dec 07, 2018 at 08:56:30AM +, Mark Cave-Ayland wrote:
>> These helpers allow us to move FP register values to/from the specified 
>> TCGv_i64
>> argument.
>>
>> To prevent FP helpers accessing the cpu_fpr array directly, add extra TCG
>> temporaries as required.
> 
> It's not obvious to me why that's a desirable thing.  I'm assuming
> it's somehow necessary for the stuff later in the series, but I think
> we need a brief rationale here to explain why this isn't just adding
> extra reg copies for the sake of it.

Ah okay. It's because the VSX registers extend the set of VMX and FPR registers 
(see
the cpu_vsrh() and cpu_vsrl() helpers at the top of
target/ppc/translate/vsx-impl.inc.c). So in order to remove the static TCGv_i64
arrays then these helpers must also be converted.

ATB,

Mark.

[Qemu-devel] [ANNOUNCE] QEMU 3.1.0 is now available

2018-12-11 Thread Michael Roth

Hello,

On behalf of the QEMU Team, I'd like to announce the availability of
the QEMU 3.1.0 release. This release contains 1900+ commits from 189
authors.

You can grab the tarball from our download page here:

  https://www.qemu.org/download/#source

The full list of changes are available at:

  https://wiki.qemu.org/ChangeLog/3.1

Highlights include:

 * ARM: emulation support for microbit and Xilinx Versal machine models
 * ARM: support for ARMv6M architecture and Cortex-M0 CPU model
 * ARM: support for Cortex-A72 CPU model
 * ARM: virt/xlnx-zynqmp: virtualization extensions for GICv2 interrupt
   controller
 * ARM: emulation of AArch32 virtualization/hypervisor mode now supported
   for Cortex-A7 and Corex-A15
 * MIPS: emulation support for nanoMIPS I7200
 * MIPS: emulation support for MXU SIMD instructions for MIPS32
 * PowerPC: pseries: enablement of nested virtualization via KVM-HV
 * PowerPC: prep: deprecated in favor of 40p machine model
 * Powerpc: 40p: IRQ routing fixes, switch from Open HackWare to OpenBIOS
 * PowerPC: g3beige/mac99: support for booting from virtio-blk-pci
 * s390: VFIO passthrough support for crypto devices (vfio-ap)
 * s390: KVM support for backing guests with huge pages
 * SPARC: sun4u: support for booting from virtio-blk-pci
 * x86: multi-threaded TCG support
 * x86: KVM support for Enlightened VMCS (improved perf for Hyper-V on KVM)
 * x86: KVM support for Hyper-V IPI enlightenments
 * Xtensa: support for input from chardev consoles

 * Support for AMD IOMMU interrupt remapping and guest virtual APIC mode
 * XTS cipher mode is now ~2x faster
 * stdvga and bocks-display devices can expose EDID information to guest,
   (for use with xres/yres resolution options)
 * qemu-img tool can now generate LUKS-encrypted files through 'convert' command

 * and lots more...

Thank you to everyone involved!

Re: [Qemu-devel] [PATCH 2/3] mac_newworld: enable access to EDID data for the VGA device

2018-12-11 Thread Mark Cave-Ayland

On 10/12/2018 03:46, David Gibson wrote:

> On Fri, Dec 07, 2018 at 04:08:05PM +, Mark Cave-Ayland wrote:
>> This is in preparation for some upcoming QEMU NDRV driver changes that pass
>> display information from the host to the guest.
>>
>> Signed-off-by: Mark Cave-Ayland 
> 
> This looks fine by my limited knowledge of this area.  I'm slightly
> perturbed I can't see any existing examples in the tree of setting the
> edid property from the machine.

Certainly both patches work in that the corresponding MemoryRegion appears in 
both
machines when applied, and indeed now also I have guest code that can read from 
it too.


ATB,

Mark.

[Qemu-devel] [PULL 12/30] MAINTAINERS: Add missing entries for the Xilinx ZynqMP machine

2018-12-11 Thread Laurent Vivier

From: Philippe Mathieu-Daudé 

Signed-off-by: Philippe Mathieu-Daudé 
Reviewed-by: Alistair Francis 
Message-Id: <20181125205000.10324-5-phi...@redhat.com>
Signed-off-by: Laurent Vivier 
---
 MAINTAINERS | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index edc88164c3..b32cd139a7 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -741,6 +741,9 @@ L: qemu-...@nongnu.org
 S: Maintained
 F: hw/*/xlnx*.c
 F: include/hw/*/xlnx*.h
+F: include/hw/ssi/xilinx_spips.h
+F: hw/display/dpcd.c
+F: include/hw/display/dpcd.h
 
 ARM ACPI Subsystem
 M: Shannon Zhao 
-- 
2.19.2

[Qemu-devel] [PULL 15/30] MAINTAINERS: Add a missing entry for the Old World machines

2018-12-11 Thread Laurent Vivier

From: Philippe Mathieu-Daudé 

Signed-off-by: Philippe Mathieu-Daudé 
Acked-by: David Gibson 
Reviewed-by: Mark Cave-Ayland 
Message-Id: <20181125205000.10324-9-phi...@redhat.com>
Signed-off-by: Laurent Vivier 
---
 MAINTAINERS | 1 +
 1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 4b66aca933..581832b257 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -967,6 +967,7 @@ F: hw/ppc/mac_oldworld.c
 F: hw/pci-host/grackle.c
 F: hw/misc/macio/
 F: hw/intc/heathrow_pic.c
+F: include/hw/intc/heathrow_pic.h
 
 PReP
 M: Hervé Poussineau 
-- 
2.19.2

Re: [Qemu-devel] [PATCH for-3.2 01/11] vhost-user: define conventions for vhost-user backends

2018-12-11 Thread Michael S. Tsirkin

On Tue, Dec 11, 2018 at 09:29:44AM +, Daniel P. Berrangé wrote:
> On Tue, Dec 11, 2018 at 08:42:41AM +0100, Hoffmann, Gerd wrote:
> >   Hi,
> > 
> > > Right. The main issue is that we need to make sure only
> > > in-tree devices are supported.
> > 
> > Well, that is under debate right now, see:
> > https://lists.gnu.org/archive/html/qemu-devel/2018-11/msg04912.html
> 
> I've previously been against the idea of external plugins for QEMU,
> however, that was when the plugin was something that would be dlopen'd
> by QEMU. That would cause our internal ABI to be exposed to 3rd parties
> which is highly undesirable, even if they were open source to comply
> with the license needs.
> 
> When the plugin is a completely isolated process communicating with a
> well defined protocol, it is not placing a significant burden on the
> QEMU developers' ongoing maintainence, nor has problems with license
> compliance. The main problem would come from debugging the combined
> system as the external process is essentially a black box from QEMU's
> POV. Downstream OS vendors are free to place restrictions on which
> backend processes they'd be willing to support with QEMU, and upstream
> is under no obligation to debug stuff beyond the QEMU boundary.
> 
> We have already accepted that tradeoff with networking by supporting
> vhost-user and have externals impls like DPDK, so I don't see a
> compelling reason to try to restrict it for other vhost-user backends.

OK seems to be more or less a rough concensus then.

I wonder what's the approach wrt migration though.

Even the compatibility story about vhost-user isn't
great, I would like to see something solid before
we allow that.

Are we happy to just block live migration?
For sure that's already the case with VFIO.


> > > vhost-user by design
> > > is for out of tree users. It needn't be hard,
> > > maybe it's enough to just make qemu launch these processes
> > > as opposed to connecting to them on command line.
> > 
> > Not sure this is a good idea, with security being one of the motivating
> > factors to move device emulation to other processes.  When libvirt
> > launches the processes it can place them in separate sandboxes ...
> 
> Yep, libvirt already turns on seccomp policies which forbid QEMU from
> forking/execing anything, and we have no desire to go backwards here.
> Any external processes have to be launched by libvirt ahead of time.
> 
> 
> Regards,
> Daniel
> -- 
> |: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org -o-https://fstop138.berrange.com :|
> |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

[Qemu-devel] [PULL 21/30] MAINTAINERS: Add a missing entry to SPICE

2018-12-11 Thread Laurent Vivier

From: Philippe Mathieu-Daudé 

Signed-off-by: Philippe Mathieu-Daudé 
Reviewed-by: Gerd Hoffmann 
Message-Id: <20181125205000.10324-18-phi...@redhat.com>
Signed-off-by: Laurent Vivier 
---
 MAINTAINERS | 1 +
 1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index b325a5af87..20096dad05 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1772,6 +1772,7 @@ F: ui/spice-*.c
 F: audio/spiceaudio.c
 F: hw/display/qxl*
 F: qapi/ui.json
+F: docs/spice-port-fqdn.txt
 
 Graphics
 M: Gerd Hoffmann 
-- 
2.19.2

[Qemu-devel] [PULL 01/30] hw: qdev: fix error in comment

2018-12-11 Thread Laurent Vivier

From: Li Qiang 

Cc: qemu-triv...@nongnu.org

Signed-off-by: Li Qiang 
Reviewed-by: Laurent Vivier 
Message-Id: <20181030151637.37207-1-liq...@163.com>
Signed-off-by: Laurent Vivier 
---
 include/hw/qdev-core.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
index a24d0dd566..92851e55df 100644
--- a/include/hw/qdev-core.h
+++ b/include/hw/qdev-core.h
@@ -197,7 +197,7 @@ typedef struct BusChild {
 
 /**
  * BusState:
- * @hotplug_device: link to a hotplug device associated with bus.
+ * @hotplug_handler: link to a hotplug handler associated with bus.
  */
 struct BusState {
 Object obj;
-- 
2.19.2

[Qemu-devel] [PULL 28/30] MAINTAINERS: Update email address for Fam Zheng

2018-12-11 Thread Laurent Vivier

From: Fam Zheng 

Since I am about to change company, update the email address in
MAINTAINERS to my personal one. Depending on responsibility changes I
may eventually fade out in some of the maintained areas, but that will
be figured out afterward, or maybe I'll use the work email later. For
now, just do a search and replace.

Signed-off-by: Fam Zheng 
Signed-off-by: Fam Zheng 
Reviewed-by: Marc-André Lureau 
Reviewed-by: John Snow 
Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé 
Message-Id: <20181121153036.2941-1-f...@redhat.com>
Signed-off-by: Laurent Vivier 
---
 MAINTAINERS | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 0e5c9cf52a..02acc1cc8a 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1301,7 +1301,7 @@ T: git https://github.com/jasowang/qemu.git net
 
 SCSI
 M: Paolo Bonzini 
-R: Fam Zheng 
+R: Fam Zheng 
 S: Supported
 F: include/hw/scsi/*
 F: hw/scsi/*
@@ -1629,7 +1629,7 @@ T: git https://repo.or.cz/qemu/kevin.git block
 
 Block I/O path
 M: Stefan Hajnoczi 
-M: Fam Zheng 
+M: Fam Zheng 
 L: qemu-bl...@nongnu.org
 S: Supported
 F: util/async.c
@@ -1643,7 +1643,7 @@ T: git https://github.com/stefanha/qemu.git block
 
 Block SCSI subsystem
 M: Paolo Bonzini 
-R: Fam Zheng 
+R: Fam Zheng 
 L: qemu-bl...@nongnu.org
 S: Supported
 F: include/scsi/*
@@ -1675,7 +1675,7 @@ F: qapi/transaction.json
 T: git https://repo.or.cz/qemu/armbru.git block-next
 
 Dirty Bitmaps
-M: Fam Zheng 
+M: Fam Zheng 
 M: John Snow 
 L: qemu-bl...@nongnu.org
 S: Supported
@@ -2051,7 +2051,7 @@ F: tests/test-throttle.c
 L: qemu-bl...@nongnu.org
 
 UUID
-M: Fam Zheng 
+M: Fam Zheng 
 S: Supported
 F: util/uuid.c
 F: include/qemu/uuid.h
@@ -2182,7 +2182,7 @@ F: disas/tci.c
 Block drivers
 -
 VMDK
-M: Fam Zheng 
+M: Fam Zheng 
 L: qemu-bl...@nongnu.org
 S: Supported
 F: block/vmdk.c
@@ -2268,13 +2268,13 @@ F: block/gluster.c
 T: git https://github.com/codyprime/qemu-kvm-jtc.git block
 
 Null Block Driver
-M: Fam Zheng 
+M: Fam Zheng 
 L: qemu-bl...@nongnu.org
 S: Supported
 F: block/null.c
 
 NVMe Block Driver
-M: Fam Zheng 
+M: Fam Zheng 
 L: qemu-bl...@nongnu.org
 S: Supported
 F: block/nvme*
@@ -2405,7 +2405,7 @@ Build and test automation
 -
 Build and test automation
 M: Alex Bennée 
-M: Fam Zheng 
+M: Fam Zheng 
 R: Philippe Mathieu-Daudé 
 L: qemu-devel@nongnu.org
 S: Maintained
-- 
2.19.2

[Qemu-devel] [PULL 30/30] Fixes i386 xchgq test

2018-12-11 Thread Laurent Vivier

From: "fabrice.descl...@cea.fr" 

As "xchg" reads and writes both operands, the "+m" is required to avoid
undefined behavior on -O2 compilation.

Signed-off-by: Fabrice Desclaux 
Reviewed-by: Richard Henderson 
Message-Id: <03506cf0-a204-f619-8ee4-4990a5e69...@cea.fr>
Signed-off-by: Laurent Vivier 
---
 tests/tcg/i386/test-i386.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/tcg/i386/test-i386.c b/tests/tcg/i386/test-i386.c
index a29b41e764..18d5609665 100644
--- a/tests/tcg/i386/test-i386.c
+++ b/tests/tcg/i386/test-i386.c
@@ -1137,7 +1137,7 @@ void test_xchg(void)
 TEST_XCHG(xchgb, "b", "+q");
 
 #if defined(__x86_64__)
-TEST_XCHG(xchgq, "", "=m");
+TEST_XCHG(xchgq, "", "+m");
 #endif
 TEST_XCHG(xchgl, "k", "+m");
 TEST_XCHG(xchgw, "w", "+m");
-- 
2.19.2

[Qemu-devel] [PULL 27/30] cutils: Assert in-range base for string-to-integer conversions

2018-12-11 Thread Laurent Vivier

From: Eric Blake 

POSIX states that the value of endptr is unspecified if strtol()
fails with EINVAL due to an invalid base argument.  Since none of
the callers to check_strtox_error() initialized endptr, we could
end up propagating uninitialized data back to a caller on error.
However, passing an out-of-range base is already a sign of poor
programming, so let's just assert that base is in range, at which
point check_strtox_error() can be tightened to assert that it is
receiving an initialized ep that points somewhere within the
caller's original string, regardless of whether strto*() succeeded
or failed with ERANGE.

Reported-by: Vladimir Sementsov-Ogievskiy 
Signed-off-by: Eric Blake 
Reviewed-by: Markus Armbruster 
Reviewed-by: Philippe Mathieu-Daudé 
Message-Id: <20181206151856.77503-1-ebl...@redhat.com>
Signed-off-by: Laurent Vivier 
---
 util/cutils.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/util/cutils.c b/util/cutils.c
index 698bd315bd..0621565930 100644
--- a/util/cutils.c
+++ b/util/cutils.c
@@ -280,6 +280,7 @@ int qemu_strtosz_metric(const char *nptr, char **end, 
uint64_t *result)
 static int check_strtox_error(const char *nptr, char *ep,
   const char **endptr, int libc_errno)
 {
+assert(ep >= nptr);
 if (endptr) {
 *endptr = ep;
 }
@@ -327,6 +328,7 @@ int qemu_strtoi(const char *nptr, const char **endptr, int 
base,
 char *ep;
 long long lresult;
 
+assert((unsigned) base <= 36 && base != 1);
 if (!nptr) {
 if (endptr) {
 *endptr = nptr;
@@ -379,6 +381,7 @@ int qemu_strtoui(const char *nptr, const char **endptr, int 
base,
 char *ep;
 long long lresult;
 
+assert((unsigned) base <= 36 && base != 1);
 if (!nptr) {
 if (endptr) {
 *endptr = nptr;
@@ -435,6 +438,7 @@ int qemu_strtol(const char *nptr, const char **endptr, int 
base,
 {
 char *ep;
 
+assert((unsigned) base <= 36 && base != 1);
 if (!nptr) {
 if (endptr) {
 *endptr = nptr;
@@ -477,6 +481,7 @@ int qemu_strtoul(const char *nptr, const char **endptr, int 
base,
 {
 char *ep;
 
+assert((unsigned) base <= 36 && base != 1);
 if (!nptr) {
 if (endptr) {
 *endptr = nptr;
@@ -504,6 +509,7 @@ int qemu_strtoi64(const char *nptr, const char **endptr, 
int base,
 {
 char *ep;
 
+assert((unsigned) base <= 36 && base != 1);
 if (!nptr) {
 if (endptr) {
 *endptr = nptr;
@@ -527,6 +533,7 @@ int qemu_strtou64(const char *nptr, const char **endptr, 
int base,
 {
 char *ep;
 
+assert((unsigned) base <= 36 && base != 1);
 if (!nptr) {
 if (endptr) {
 *endptr = nptr;
@@ -594,6 +601,7 @@ int parse_uint(const char *s, unsigned long long *value, 
char **endptr,
 char *endp = (char *)s;
 unsigned long long val = 0;
 
+assert((unsigned) base <= 36 && base != 1);
 if (!s) {
 r = -EINVAL;
 goto out;
-- 
2.19.2

Re: [Qemu-devel] [PATCH 18/26] target/arm: Export aa64_va_parameters to internals.h

2018-12-11 Thread Richard Henderson

On 12/11/18 10:53 AM, Peter Maydell wrote:
> On Fri, 7 Dec 2018 at 10:37, Richard Henderson
>  wrote:
>>
>> We need to reuse this from helper-a64.c.  Provide a stub
>> definition for CONFIG_USER_ONLY.  This matches the stub
>> definitions that we removed for arm_regime_tbi{0,1} before.
>>
>> Signed-off-by: Richard Henderson 
>> ---
>>  target/arm/internals.h | 29 +
>>  target/arm/helper.c| 16 ++--
>>  2 files changed, 31 insertions(+), 14 deletions(-)
> 
> Reviewed-by: Peter Maydell 

FWIW, I'll smoosh this into the creation of the helper.
At some point we may fix tbi for user-only.


r~

[Qemu-devel] [PULL 19/30] MAINTAINERS: Add missing entries for the Canon DIGIC machine

2018-12-11 Thread Laurent Vivier

From: Philippe Mathieu-Daudé 

This pattern now also matches:
- include/hw/timer/digic-timer.h
- include/hw/char/digic-uart.h

Signed-off-by: Philippe Mathieu-Daudé 
Reviewed-by: Peter Maydell 
Message-Id: <20181125205000.10324-16-phi...@redhat.com>
Signed-off-by: Laurent Vivier 
---
 MAINTAINERS | 1 +
 1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 381c39b6d8..68ff5ee413 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -541,6 +541,7 @@ L: qemu-...@nongnu.org
 S: Odd Fixes
 F: include/hw/arm/digic.h
 F: hw/*/digic*
+F: include/hw/*/digic*
 
 Gumstix
 M: Peter Maydell 
-- 
2.19.2

[Qemu-devel] [PULL 05/30] qapi: Reduce Makefile boilerplate

2018-12-11 Thread Laurent Vivier

From: Eric Blake 

Adding a new qapi module had some rather tedious repetition to
wire it into Makefile, Makefile.objs, and .gitignore (for example,
see commit bf42508f and its followup b61acdec). For make, add some
indirection by taking advantage of GNU Make string processing to
expand a list of module names into all the required artifacts, so
that future additions of a new module need only touch the list of
module names.  And for gitignore, use globs to cover all generated
file names.

The list has to live in Makefile.objs, due to the way that
our unnest-vars macro slirps in that file without remembering
any definition of $(QAPI_MODULES) from Makefile.

Signed-off-by: Eric Blake 
Reviewed-by: Daniel P. Berrangé 
Reviewed-by: Markus Armbruster 
Tested-by: Yuval Shaia 
Message-Id: <20181116200016.2080785-1-ebl...@redhat.com>
Signed-off-by: Laurent Vivier 
---
 .gitignore|  72 ++-
 Makefile  | 194 +++---
 Makefile.objs |  75 ++-
 3 files changed, 35 insertions(+), 306 deletions(-)

diff --git a/.gitignore b/.gitignore
index 64efdfd929..0430257313 100644
--- a/.gitignore
+++ b/.gitignore
@@ -30,78 +30,14 @@
 /qapi-gen-timestamp
 /qapi/qapi-builtin-types.[ch]
 /qapi/qapi-builtin-visit.[ch]
-/qapi/qapi-commands-block-core.[ch]
-/qapi/qapi-commands-block.[ch]
-/qapi/qapi-commands-char.[ch]
-/qapi/qapi-commands-common.[ch]
-/qapi/qapi-commands-crypto.[ch]
-/qapi/qapi-commands-introspect.[ch]
-/qapi/qapi-commands-job.[ch]
-/qapi/qapi-commands-migration.[ch]
-/qapi/qapi-commands-misc.[ch]
-/qapi/qapi-commands-net.[ch]
-/qapi/qapi-commands-rocker.[ch]
-/qapi/qapi-commands-run-state.[ch]
-/qapi/qapi-commands-sockets.[ch]
-/qapi/qapi-commands-tpm.[ch]
-/qapi/qapi-commands-trace.[ch]
-/qapi/qapi-commands-transaction.[ch]
-/qapi/qapi-commands-ui.[ch]
+/qapi/qapi-commands-*.[ch]
 /qapi/qapi-commands.[ch]
-/qapi/qapi-events-block-core.[ch]
-/qapi/qapi-events-block.[ch]
-/qapi/qapi-events-char.[ch]
-/qapi/qapi-events-common.[ch]
-/qapi/qapi-events-crypto.[ch]
-/qapi/qapi-events-introspect.[ch]
-/qapi/qapi-events-job.[ch]
-/qapi/qapi-events-migration.[ch]
-/qapi/qapi-events-misc.[ch]
-/qapi/qapi-events-net.[ch]
-/qapi/qapi-events-rocker.[ch]
-/qapi/qapi-events-run-state.[ch]
-/qapi/qapi-events-sockets.[ch]
-/qapi/qapi-events-tpm.[ch]
-/qapi/qapi-events-trace.[ch]
-/qapi/qapi-events-transaction.[ch]
-/qapi/qapi-events-ui.[ch]
+/qapi/qapi-events-*.[ch]
 /qapi/qapi-events.[ch]
 /qapi/qapi-introspect.[ch]
-/qapi/qapi-types-block-core.[ch]
-/qapi/qapi-types-block.[ch]
-/qapi/qapi-types-char.[ch]
-/qapi/qapi-types-common.[ch]
-/qapi/qapi-types-crypto.[ch]
-/qapi/qapi-types-introspect.[ch]
-/qapi/qapi-types-job.[ch]
-/qapi/qapi-types-migration.[ch]
-/qapi/qapi-types-misc.[ch]
-/qapi/qapi-types-net.[ch]
-/qapi/qapi-types-rocker.[ch]
-/qapi/qapi-types-run-state.[ch]
-/qapi/qapi-types-sockets.[ch]
-/qapi/qapi-types-tpm.[ch]
-/qapi/qapi-types-trace.[ch]
-/qapi/qapi-types-transaction.[ch]
-/qapi/qapi-types-ui.[ch]
+/qapi/qapi-types-*.[ch]
 /qapi/qapi-types.[ch]
-/qapi/qapi-visit-block-core.[ch]
-/qapi/qapi-visit-block.[ch]
-/qapi/qapi-visit-char.[ch]
-/qapi/qapi-visit-common.[ch]
-/qapi/qapi-visit-crypto.[ch]
-/qapi/qapi-visit-introspect.[ch]
-/qapi/qapi-visit-job.[ch]
-/qapi/qapi-visit-migration.[ch]
-/qapi/qapi-visit-misc.[ch]
-/qapi/qapi-visit-net.[ch]
-/qapi/qapi-visit-rocker.[ch]
-/qapi/qapi-visit-run-state.[ch]
-/qapi/qapi-visit-sockets.[ch]
-/qapi/qapi-visit-tpm.[ch]
-/qapi/qapi-visit-trace.[ch]
-/qapi/qapi-visit-transaction.[ch]
-/qapi/qapi-visit-ui.[ch]
+/qapi/qapi-visit-*.[ch]
 /qapi/qapi-visit.[ch]
 /qapi/qapi-doc.texi
 /qemu-doc.html
diff --git a/Makefile b/Makefile
index f2947186a4..c8b9efdad4 100644
--- a/Makefile
+++ b/Makefile
@@ -88,82 +88,26 @@ endif
 include $(SRC_PATH)/rules.mak
 
 GENERATED_FILES = qemu-version.h config-host.h qemu-options.def
-GENERATED_FILES += qapi/qapi-builtin-types.h qapi/qapi-builtin-types.c
-GENERATED_FILES += qapi/qapi-types.h qapi/qapi-types.c
-GENERATED_FILES += qapi/qapi-types-block-core.h qapi/qapi-types-block-core.c
-GENERATED_FILES += qapi/qapi-types-block.h qapi/qapi-types-block.c
-GENERATED_FILES += qapi/qapi-types-char.h qapi/qapi-types-char.c
-GENERATED_FILES += qapi/qapi-types-common.h qapi/qapi-types-common.c
-GENERATED_FILES += qapi/qapi-types-crypto.h qapi/qapi-types-crypto.c
-GENERATED_FILES += qapi/qapi-types-introspect.h qapi/qapi-types-introspect.c
-GENERATED_FILES += qapi/qapi-types-job.h qapi/qapi-types-job.c
-GENERATED_FILES += qapi/qapi-types-migration.h qapi/qapi-types-migration.c
-GENERATED_FILES += qapi/qapi-types-misc.h qapi/qapi-types-misc.c
-GENERATED_FILES += qapi/qapi-types-net.h qapi/qapi-types-net.c
-GENERATED_FILES += qapi/qapi-types-rocker.h qapi/qapi-types-rocker.c
-GENERATED_FILES += qapi/qapi-types-run-state.h qapi/qapi-types-run-state.c
-GENERATED_FILES += qapi/qapi-types-sockets.h qapi/qapi-types-sockets.c
-GENERATED_FILES += qapi/qapi-types-tpm.h

Re: [Qemu-devel] [PATCH 17/26] target/arm: Reuse aa64_va_parameters for setting tbflags

2018-12-11 Thread Richard Henderson

On 12/11/18 10:52 AM, Peter Maydell wrote:
> This has lost the bit of the old functions that converted
> the stage 1+2 MMU index into a stage 1 MMU index. The call
> to regime_el() in aa64_va_parameters() will assert if it is
> passed ARMMMUIdx_S12NSE0 or ARMMMUIdx_S12NSE1. (In the code
> paths in the get_phys_addr() functions, this is handled by
> the top level get_phys_addr() code, so get_phys_addr_lpae()
> never sees a stage 1+2 MMU index.)

Yes, I've got a fixup patch on my branch for this.
It showed up quite early booting bios.  ;-)


r~

[Qemu-devel] [PULL 08/30] MAINTAINERS: Add nios2-related files to the Nios2 section

2018-12-11 Thread Laurent Vivier

From: Thomas Huth 

nios2_iic.c and the default-configs/nios2-softmmu.mak file are
currently "unmaintained" according to the get_maintainers.pl script.
Move them to the Nios2 section where they obviously belong to.

Signed-off-by: Thomas Huth 
Reviewed-by: Laurent Vivier 
Message-Id: <1542899500-23346-1-git-send-email-th...@redhat.com>
Signed-off-by: Laurent Vivier 
---
 MAINTAINERS | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 059c775a06..f5ef20d359 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -233,7 +233,9 @@ M: Marek Vasut 
 S: Maintained
 F: target/nios2/
 F: hw/nios2/
+F: hw/intc/nios2_iic.c
 F: disas/nios2.c
+F: default-configs/nios2-softmmu.mak
 
 OpenRISC
 M: Stafford Horne 
-- 
2.19.2

Re: [Qemu-devel] [RFC 3/3] pvh: Boot uncompressed kernel using direct boot ABI

2018-12-11 Thread Maran Wilson


On 12/11/2018 9:11 AM, Stefano Garzarella wrote:

Hi Liam,
in order to support PVH also with SeaBIOS, I'm going to work on a new
option rom (like linuxboot/multiboot) that can be used in this case.


That is awesome. Yes, please keep us posted when you have something working.

Just FYI, before switching over to using Qemu+qboot, we had been using a 
Qemu only solution (but not using an option rom) internally that worked 
very well using no FW at all. We had Qemu simply parse the ELF file and 
jump to the PVH entry point if one is found. The only gotcha was that we 
had to include a pair of patches that were originally written by folks 
at Intel as part of the clear containers work. Specifically, in order to 
be able to skip firmware entirely, we had to do 2 additional things: (1) 
ACPI tables generated by Qemu are usually patched up by FW. Since we 
were running no FW, we needed to do that patching up of the ACPI tables 
in Qemu when it was detected that we were going to enter the OS via the 
PVH entry point. (2) We also needed to add a patch to Qemu to enable a 
few PM registers -- something typically done by FW.


But if SeaBIOS is involved in the solution you are working on, I guess 
you won't really need those extra patches. Just figured I'd mention it 
so you have the full picture.


Thanks,
-Maran


I'll keep you updated on it!

Cheers,
Stefano
On Wed, Dec 5, 2018 at 11:38 PM Liam Merwick  wrote:

These changes (along with corresponding qboot and Linux kernel changes)
enable a guest to be booted using the x86/HVM direct boot ABI.

This commit adds a load_elfboot() routine to pass the size and
location of the kernel entry point to qboot (which will fill in
the start_info struct information needed to to boot the guest).
Having loaded the ELF binary, load_linux() will run qboot
which continues the boot.

The address for the kernel entry point has already been read
from an ELF Note in the uncompressed kernel binary earlier
in pc_memory_init().

Signed-off-by: George Kennedy 
Signed-off-by: Liam Merwick 
---
  hw/i386/pc.c | 72 
  1 file changed, 72 insertions(+)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 056aa46d99b9..d3012cbd8597 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -54,6 +54,7 @@
  #include "sysemu/qtest.h"
  #include "kvm_i386.h"
  #include "hw/xen/xen.h"
+#include "hw/xen/start_info.h"
  #include "ui/qemu-spice.h"
  #include "exec/memory.h"
  #include "exec/address-spaces.h"
@@ -1098,6 +1099,50 @@ done:
  return pvh_start_addr != 0;
  }

+static bool load_elfboot(const char *kernel_filename,
+   int kernel_file_size,
+   uint8_t *header,
+   size_t pvh_xen_start_addr,
+   FWCfgState *fw_cfg)
+{
+uint32_t flags = 0;
+uint32_t mh_load_addr = 0;
+uint32_t elf_kernel_size = 0;
+uint64_t elf_entry;
+uint64_t elf_low, elf_high;
+int kernel_size;
+
+if (ldl_p(header) != 0x464c457f) {
+return false; /* no elfboot */
+}
+
+bool elf_is64 = header[EI_CLASS] == ELFCLASS64;
+flags = elf_is64 ?
+((Elf64_Ehdr *)header)->e_flags : ((Elf32_Ehdr *)header)->e_flags;
+
+if (flags & 0x00010004) { /* LOAD_ELF_HEADER_HAS_ADDR */
+error_report("elfboot unsupported flags = %x", flags);
+exit(1);
+}
+
+kernel_size = load_elf(kernel_filename, NULL, NULL, _entry,
+   _low, _high, 0, I386_ELF_MACHINE,
+   0, 0);
+
+if (kernel_size < 0) {
+error_report("Error while loading elf kernel");
+exit(1);
+}
+mh_load_addr = elf_low;
+elf_kernel_size = elf_high - elf_low;
+
+fw_cfg_add_i32(fw_cfg, FW_CFG_KERNEL_ENTRY, pvh_xen_start_addr);
+fw_cfg_add_i32(fw_cfg, FW_CFG_KERNEL_ADDR, mh_load_addr);
+fw_cfg_add_i32(fw_cfg, FW_CFG_KERNEL_SIZE, elf_kernel_size);
+
+return true;
+}
+
  static void load_linux(PCMachineState *pcms,
 FWCfgState *fw_cfg)
  {
@@ -1138,6 +1183,33 @@ static void load_linux(PCMachineState *pcms,
  if (ldl_p(header+0x202) == 0x53726448) {
  protocol = lduw_p(header+0x206);
  } else {
+/* If the kernel address for using the x86/HVM direct boot ABI has
+ * been saved then proceed with booting the uncompressed kernel */
+if (pvh_start_addr) {
+if (load_elfboot(kernel_filename, kernel_size,
+ header, pvh_start_addr, fw_cfg)) {
+struct hvm_modlist_entry ramdisk_mod = { 0 };
+
+fclose(f);
+
+fw_cfg_add_i32(fw_cfg, FW_CFG_CMDLINE_SIZE,
+strlen(kernel_cmdline) + 1);
+fw_cfg_add_string(fw_cfg, FW_CFG_CMDLINE_DATA, kernel_cmdline);
+
+assert(machine->device_memory != NULL);
+ramdisk_mod.paddr = machine->device_memory->base;
+ramdisk_mod.size =
+

[Qemu-devel] [PULL 18/30] MAINTAINERS: Add missing entries to the vhost section

2018-12-11 Thread Laurent Vivier

From: Philippe Mathieu-Daudé 

Signed-off-by: Philippe Mathieu-Daudé 
Reviewed-by: Thomas Huth 
Message-Id: <20181125205000.10324-15-phi...@redhat.com>
Signed-off-by: Laurent Vivier 
---
 MAINTAINERS | 1 +
 1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index a7bf9c42a9..381c39b6d8 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1382,6 +1382,7 @@ M: Michael S. Tsirkin 
 S: Supported
 F: hw/*/*vhost*
 F: docs/interop/vhost-user.txt
+F: contrib/vhost-user-*/
 
 virtio
 M: Michael S. Tsirkin 
-- 
2.19.2

Re: [Qemu-devel] [PATCH 24/26] target/arm: Enable PAuth for user-only -cpu max

2018-12-11 Thread Richard Henderson

On 12/11/18 9:45 AM, Peter Maydell wrote:
> On Fri, 7 Dec 2018 at 10:37, Richard Henderson
>  wrote:
>>
>> Signed-off-by: Richard Henderson 
>> ---
>>  target/arm/cpu64.c | 4 
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
>> index 1d57be0c91..84f70b2a24 100644
>> --- a/target/arm/cpu64.c
>> +++ b/target/arm/cpu64.c
>> @@ -316,6 +316,10 @@ static void aarch64_max_initfn(Object *obj)
>>
>>  t = cpu->isar.id_aa64isar1;
>>  t = FIELD_DP64(t, ID_AA64ISAR1, FCMA, 1);
>> +t = FIELD_DP64(t, ID_AA64ISAR1, APA, 1); /* PAuth, architected only 
>> */
>> +t = FIELD_DP64(t, ID_AA64ISAR1, API, 0);
>> +t = FIELD_DP64(t, ID_AA64ISAR1, GPA, 1);
>> +t = FIELD_DP64(t, ID_AA64ISAR1, GPI, 0);
>>  cpu->isar.id_aa64isar1 = t;
>>
>>  t = cpu->isar.id_aa64pfr0;
> 
> I don't see why this is enabling for user-only and
> not also for system: what am I missing ?

Err.. more brain fluff.  This does enable for system;
it's the next patch that's for user-only.


r~

1 2 3 4 >

1 - 100 of 387 matches

Mail list logo