Re: [Qemu-devel] [PATCH v7 09/19] spapr: add device tree support for the XIVE exploitation mode

2018-12-09 Thread Cédric Le Goater
On 12/10/18 7:39 AM, David Gibson wrote:
> On Sun, Dec 09, 2018 at 08:46:00PM +0100, Cédric Le Goater wrote:
>> The XIVE interface for the guest is described in the device tree under
>> the "interrupt-controller" node. A couple of new properties are
>> specific to XIVE :
>>
>>  - "reg"
>>
>>contains the base address and size of the thread interrupt
>>management areas (TIMA), for the User level and for the Guest OS
>>level. Only the Guest OS level is taken into account today.
>>
>>  - "ibm,xive-eq-sizes"
>>
>>the size of the event queues. One cell per size supported, contains
>>log2 of size, in ascending order.
>>
>>  - "ibm,xive-lisn-ranges"
>>
>>the IRQ interrupt number ranges assigned to the guest for the IPIs.
>>
>> and also under the root node :
>>
>>  - "ibm,plat-res-int-priorities"
>>
>>contains a list of priorities that the hypervisor has reserved for
>>its own use. OPAL uses the priority 7 queue to automatically
>>escalate interrupts for all other queues (DD2.X POWER9). So only
>>priorities [0..6] are allowed for the guest.
>>
>> Extend the sPAPR IRQ backend with a new handler to populate the DT
>> with the appropriate "interrupt-controller" node.
>>
>> Signed-off-by: Cédric Le Goater 
>> ---
>>  include/hw/ppc/spapr_irq.h  |  2 ++
>>  include/hw/ppc/spapr_xive.h |  2 ++
>>  include/hw/ppc/xics.h   |  4 +--
>>  hw/intc/spapr_xive.c| 64 +
>>  hw/intc/xics_spapr.c|  3 +-
>>  hw/ppc/spapr.c  |  3 +-
>>  hw/ppc/spapr_irq.c  |  3 ++
>>  7 files changed, 77 insertions(+), 4 deletions(-)
>>
>> diff --git a/include/hw/ppc/spapr_irq.h b/include/hw/ppc/spapr_irq.h
>> index 23cdb51b879e..e51e9f052f63 100644
>> --- a/include/hw/ppc/spapr_irq.h
>> +++ b/include/hw/ppc/spapr_irq.h
>> @@ -39,6 +39,8 @@ typedef struct sPAPRIrq {
>>  void (*free)(sPAPRMachineState *spapr, int irq, int num);
>>  qemu_irq (*qirq)(sPAPRMachineState *spapr, int irq);
>>  void (*print_info)(sPAPRMachineState *spapr, Monitor *mon);
>> +void (*dt_populate)(sPAPRMachineState *spapr, uint32_t nr_servers,
>> +void *fdt, uint32_t phandle);
>>  } sPAPRIrq;
>>  
>>  extern sPAPRIrq spapr_irq_xics;
>> diff --git a/include/hw/ppc/spapr_xive.h b/include/hw/ppc/spapr_xive.h
>> index 9506a8f4d10a..728a5e8dc163 100644
>> --- a/include/hw/ppc/spapr_xive.h
>> +++ b/include/hw/ppc/spapr_xive.h
>> @@ -45,5 +45,7 @@ qemu_irq spapr_xive_qirq(sPAPRXive *xive, uint32_t lisn);
>>  typedef struct sPAPRMachineState sPAPRMachineState;
>>  
>>  void spapr_xive_hcall_init(sPAPRMachineState *spapr);
>> +void spapr_dt_xive(sPAPRMachineState *spapr, uint32_t nr_servers, void *fdt,
>> +   uint32_t phandle);
>>  
>>  #endif /* PPC_SPAPR_XIVE_H */
>> diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
>> index 9958443d1984..14afda198cdb 100644
>> --- a/include/hw/ppc/xics.h
>> +++ b/include/hw/ppc/xics.h
>> @@ -181,8 +181,6 @@ typedef struct XICSFabricClass {
>>  ICPState *(*icp_get)(XICSFabric *xi, int server);
>>  } XICSFabricClass;
>>  
>> -void spapr_dt_xics(int nr_servers, void *fdt, uint32_t phandle);
>> -
>>  ICPState *xics_icp_get(XICSFabric *xi, int server);
>>  
>>  /* Internal XICS interfaces */
>> @@ -204,6 +202,8 @@ void icp_resend(ICPState *ss);
>>  
>>  typedef struct sPAPRMachineState sPAPRMachineState;
>>  
>> +void spapr_dt_xics(sPAPRMachineState *spapr, uint32_t nr_servers, void *fdt,
>> +   uint32_t phandle);
>>  int xics_kvm_init(sPAPRMachineState *spapr, Error **errp);
>>  void xics_spapr_init(sPAPRMachineState *spapr);
>>  
>> diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
>> index 982ac6e17051..a6d854b07690 100644
>> --- a/hw/intc/spapr_xive.c
>> +++ b/hw/intc/spapr_xive.c
>> @@ -14,6 +14,7 @@
>>  #include "target/ppc/cpu.h"
>>  #include "sysemu/cpus.h"
>>  #include "monitor/monitor.h"
>> +#include "hw/ppc/fdt.h"
>>  #include "hw/ppc/spapr.h"
>>  #include "hw/ppc/spapr_xive.h"
>>  #include "hw/ppc/xive.h"
>> @@ -1381,3 +1382,66 @@ void spapr_xive_hcall_init(sPAPRMachineState *spapr)
>>  spapr_register_hypercall(H_INT_SYNC, h_int_sync);
>>  spapr_register_hypercall(H_INT_RESET, h_int_reset);
>>  }
>> +
>> +void spapr_dt_xive(sPAPRMachineState *spapr, uint32_t nr_servers, void *fdt,
>> +   uint32_t phandle)
>> +{
>> +sPAPRXive *xive = spapr->xive;
>> +int node;
>> +uint64_t timas[2 * 2];
>> +/* Interrupt number ranges for the IPIs */
>> +uint32_t lisn_ranges[] = {
>> +cpu_to_be32(0),
>> +cpu_to_be32(nr_servers),
>> +};
>> +uint32_t eq_sizes[] = {
>> +cpu_to_be32(12), /* 4K */
>> +cpu_to_be32(16), /* 64K */
>> +cpu_to_be32(21), /* 2M */
>> +cpu_to_be32(24), /* 16M */
> 
> For KVM, are we going to need to clamp this list based on the
> pagesizes the guest can use?

I would say so. Is there a KVM service for that ?
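
To make the clamping question concrete, here is a minimal sketch of filtering the list of log2 queue sizes against a maximum page shift. The helper name clamp_eq_sizes() and the max_page_shift parameter are hypothetical; they are not part of the patch, and no particular KVM interface is implied. Values are kept in host byte order here; the cpu_to_be32() conversion used in the patch would be applied afterwards.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: copy into 'out' only the log2 queue sizes that do
 * not exceed 'max_page_shift'. Returns the number of entries kept.
 * The relative (ascending) order of 'sizes' is preserved.
 */
static size_t clamp_eq_sizes(const uint32_t *sizes, size_t n,
                             uint32_t max_page_shift, uint32_t *out)
{
    size_t kept = 0;

    assert(sizes != NULL && out != NULL);

    for (size_t i = 0; i < n; i++) {
        if (sizes[i] <= max_page_shift) {
            out[kept++] = sizes[i];
        }
    }
    return kept;
}
```

With the four sizes from the patch (12, 16, 21, 24) and a guest limited to 64K pages (shift 16), only the first two entries would be advertised.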

Today, the OS scans the list 

Re: [Qemu-devel] [PATCH] cpus.c: Fix race condition in cpu_stop_current()

2018-12-09 Thread Jaap Crezee
Hello all,

On 12/7/18 4:59 PM, Peter Maydell wrote:
> Jaap: could you test whether this patch fixes the issue you
> were seeing, please?


My test is going very well. With the patch applied, I have no longer been able 
to freeze/hang the VM. Currently at 7024 reboots and counting over
runtime 1 day 23 hours. I will start testing on my production environment as 
well.

Tested-by: Jaap Crezee 


regards,


Jaap



Re: [Qemu-devel] [PATCH v11 0/3] wakeup-from-suspend and system_wakeup changes

2018-12-09 Thread Markus Armbruster
Queued, thanks!



Re: [Qemu-devel] [PATCH v7 12/19] spapr: add a 'reset' method to the sPAPR IRQ backend

2018-12-09 Thread Cédric Le Goater
On 12/10/18 7:42 AM, David Gibson wrote:
> On Sun, Dec 09, 2018 at 08:46:03PM +0100, Cédric Le Goater wrote:
>> For the time being, the XIVE reset handler updates the OS CAM line of
>> the vCPU as it is done under a real hypervisor when a vCPU is
>> scheduled to run on a HW thread.
>>
>> This handler will become even more useful when we introduce the
>> machine supporting both interrupt modes, XIVE and XICS. In this
>> machine, the interrupt mode is chosen by the CAS negotiation process
>> and activated after a reset.
>>
>> Signed-off-by: Cédric Le Goater 
>> ---
>>  include/hw/ppc/spapr_irq.h  |  2 ++
>>  include/hw/ppc/spapr_xive.h |  1 +
>>  hw/intc/spapr_xive.c| 24 
>>  hw/ppc/spapr.c  |  5 +
>>  hw/ppc/spapr_irq.c  | 24 
>>  5 files changed, 56 insertions(+)
>>
>> diff --git a/include/hw/ppc/spapr_irq.h b/include/hw/ppc/spapr_irq.h
>> index 84a25ffb6c65..63061a009b4c 100644
>> --- a/include/hw/ppc/spapr_irq.h
>> +++ b/include/hw/ppc/spapr_irq.h
>> @@ -44,6 +44,7 @@ typedef struct sPAPRIrq {
>>  Object *(*cpu_intc_create)(sPAPRMachineState *spapr, Object *cpu,
>> Error **errp);
>>  int (*post_load)(sPAPRMachineState *spapr, int version_id);
>> +void (*reset)(sPAPRMachineState *spapr, Error **errp);
>>  } sPAPRIrq;
>>  
>>  extern sPAPRIrq spapr_irq_xics;
>> @@ -55,6 +56,7 @@ int spapr_irq_claim(sPAPRMachineState *spapr, int irq, 
>> bool lsi, Error **errp);
>>  void spapr_irq_free(sPAPRMachineState *spapr, int irq, int num);
>>  qemu_irq spapr_qirq(sPAPRMachineState *spapr, int irq);
>>  int spapr_irq_post_load(sPAPRMachineState *spapr, int version_id);
>> +void spapr_irq_reset(sPAPRMachineState *spapr, Error **errp);
>>  
>>  /*
>>   * XICS legacy routines
>> diff --git a/include/hw/ppc/spapr_xive.h b/include/hw/ppc/spapr_xive.h
>> index 728a5e8dc163..7244a6231ce6 100644
>> --- a/include/hw/ppc/spapr_xive.h
>> +++ b/include/hw/ppc/spapr_xive.h
>> @@ -47,5 +47,6 @@ typedef struct sPAPRMachineState sPAPRMachineState;
>>  void spapr_xive_hcall_init(sPAPRMachineState *spapr);
>>  void spapr_dt_xive(sPAPRMachineState *spapr, uint32_t nr_servers, void *fdt,
>> uint32_t phandle);
>> +void spapr_xive_reset_tctx(sPAPRXive *xive);
>>  
>>  #endif /* PPC_SPAPR_XIVE_H */
>> diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
>> index a6d854b07690..560d8d031f74 100644
>> --- a/hw/intc/spapr_xive.c
>> +++ b/hw/intc/spapr_xive.c
>> @@ -179,6 +179,30 @@ static void spapr_xive_map_mmio(sPAPRXive *xive)
>>  sysbus_mmio_map(SYS_BUS_DEVICE(xive), 2, xive->tm_base);
>>  }
>>  
>> +/*
>> + * When a Virtual Processor is scheduled to run on a HW thread, the
>> + * hypervisor pushes its identifier in the OS CAM line. Emulate the
>> + * same behavior under QEMU.
>> + */
>> +void spapr_xive_reset_tctx(sPAPRXive *xive)
>> +{
>> +CPUState *cs;
>> +uint8_t  nvt_blk;
>> +uint32_t nvt_idx;
>> +uint32_t nvt_cam;
>> +
>> +CPU_FOREACH(cs) {
>> +PowerPCCPU *cpu = POWERPC_CPU(cs);
>> +XiveTCTX *tctx = XIVE_TCTX(cpu->intc);
>> +
>> +spapr_xive_cpu_to_nvt(cpu, &nvt_blk, &nvt_idx);
>> +
>> +nvt_cam = cpu_to_be32(TM_QW1W2_VO |
>> +  xive_nvt_cam_line(nvt_blk, nvt_idx));
>> +memcpy(&tctx->regs[TM_QW1_OS + TM_WORD2], &nvt_cam, 4);
>> +}
>> +}
>> +
>>  static void spapr_xive_end_reset(XiveEND *end)
>>  {
>>  memset(end, 0, sizeof(*end));
>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>> index 8cea4cad1732..98d69f09e080 100644
>> --- a/hw/ppc/spapr.c
>> +++ b/hw/ppc/spapr.c
>> @@ -1619,6 +1619,11 @@ static void spapr_machine_reset(void)
>>  
>>  qemu_devices_reset();
>>  
>> +/* This is fixing some of the default configuration of the XIVE
>> + * devices. To be called after the reset of the machine devices.
>> + */
>> +spapr_irq_reset(spapr, &error_fatal);
>> +
>>  /* DRC reset may cause a device to be unplugged. This will cause 
>> troubles
>>   * if this device is used by another device (eg, a running vhost backend
>>   * will crash QEMU if the DIMM holding the vring goes away). To avoid 
>> such
>> diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
>> index 35a067cad3f8..04f5c9665550 100644
>> --- a/hw/ppc/spapr_irq.c
>> +++ b/hw/ppc/spapr_irq.c
>> @@ -209,6 +209,10 @@ static int spapr_irq_post_load_xics(sPAPRMachineState 
>> *spapr, int version_id)
>>  return 0;
>>  }
>>  
>> +static void spapr_irq_reset_xics(sPAPRMachineState *spapr, Error **errp)
>> +{
>> +}
> 
> You already have a check for a NULL reset hook in spapr_irq_reset() so
> you could omit this empty function.

It's being used in patch 14 and 15. But I can add the XICS reset handler
at that time.

C.
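
For readers following the NULL-hook remark, the pattern can be sketched as below. This is only an illustration under assumptions: the body of spapr_irq_reset() is not shown in this thread, and the IrqBackend type, the counter argument, and all function names here are hypothetical stand-ins for the sPAPRIrq backend.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced stand-in for the sPAPRIrq backend structure. */
typedef struct IrqBackend {
    void (*reset)(int *counter);   /* optional hook, may be NULL */
} IrqBackend;

/* Example hook: just records that it ran. */
static void xive_reset(int *counter)
{
    (*counter)++;
}

/* Dispatcher: call the hook only if the backend provides one. */
static void irq_backend_reset(const IrqBackend *be, int *counter)
{
    assert(be != NULL);

    if (be->reset) {
        be->reset(counter);
    }
}

/* A backend with nothing to do on reset simply leaves the hook NULL. */
static const IrqBackend backend_xics = { .reset = NULL };
static const IrqBackend backend_xive = { .reset = xive_reset };
```

With the check in the dispatcher, a backend needs no empty placeholder function until it has real work to do on reset.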

>> +
>>  #define SPAPR_IRQ_XICS_NR_IRQS 0x1000
>>  #define SPAPR_IRQ_XICS_NR_MSIS \
>>  (XICS_IRQ_BASE + SPAPR_IRQ_XICS_NR_IRQS - SPAPR_IRQ_MSI)
>> @@ -225,6 +229,7 @@ sPAPRIrq spapr_irq_xics = {
>>  

Re: [Qemu-devel] [PATCH v6 21/27] qapi: add #if conditions to generated code members

2018-12-09 Thread Markus Armbruster
Marc-André Lureau  writes:

> Hi
> On Thu, Dec 6, 2018 at 9:42 PM Markus Armbruster  wrote:
>>
>> Marc-André Lureau  writes:
>>
>> > Wrap generated enum/struct members and code with #if/#endif, using the
>>
>> enum and struct members
>
> ok
>
>>
>> > .ifcond members added in the previous patches.
>> >
>> > Some types generate both enum and struct members for example, so a
>> > step-by-step is unnecessarily complicated to deal with (it would
>> > easily generate invalid intermediary code).
>>
>> Can you give an example of a schema definition that would lead to
>> complications?
>>
>
> Honestly, I don't remember well (it's been a while since I wrote that code).

I know...

> It must be related to implicit enums, such as union kind... If there
> is no strong need to split this patch, I would rather not do that
> extra work.

I'm not looking for reasons to split this patch, I'm looking for
stronger reasons to keep it just like it is :)

Your hunch that complications would arise for simple unions is plausible:
there the same conditional needs to be applied both to the C enum's
member and the C union member.

For the generated C code to compile, each union tag enum member
conditional must imply the corresponding variant conditional.

For flat unions, the two are separate.  The QAPI generator makes no
effort to check the enum member's if condition implies the union
variant's if condition; if you mess them up in the schema, you get to
deal with the C compilation errors.

For simple unions, the two are one.

If we separate the generator updates for enums and for union members,
and do enum members first, then unions with conditional tag members
can't compile.  Corollary: simple unions with conditional variants
can't compile.

What if we do union members first?

Again, I'm not asking for patch splitting here, I'm just trying to
arrive at a clearer understanding to avoid making insufficiently
supported claims in the commit message.  The combined patch looks small
and clean enough to keep it combined.
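
To illustrate the constraint, here is a hand-written sketch of what conditional tag and variant members could look like in C. It is not actual QAPI generator output; the type names, the CONFIG_FOO symbol, and example_value() are all hypothetical.

```c
#include <assert.h>

/* Hypothetical configuration symbol; remove it to drop the member. */
#define CONFIG_FOO 1

typedef enum ExampleKind {
    EXAMPLE_KIND_BAR,
#if defined(CONFIG_FOO)
    EXAMPLE_KIND_FOO,
#endif
    EXAMPLE_KIND__MAX,
} ExampleKind;

typedef struct Example {
    ExampleKind type;
    union {
        int bar;
#if defined(CONFIG_FOO)
        long foo;              /* same condition as the tag value */
#endif
    } u;
} Example;

/* Code touching a variant sits under the same condition as its tag. */
static long example_value(const Example *e)
{
    switch (e->type) {
    case EXAMPLE_KIND_BAR:
        return e->u.bar;
#if defined(CONFIG_FOO)
    case EXAMPLE_KIND_FOO:
        return e->u.foo;
#endif
    default:
        assert(0 && "unknown kind");
        return -1;
    }
}
```

If only the enum member were wrapped in #if, the EXAMPLE_KIND_FOO case label would fail to compile whenever CONFIG_FOO is undefined. If only the union member were wrapped, the access to e->u.foo would fail instead.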

[...]




Re: [Qemu-devel] [PATCH v7 03/19] ppc/xive: introduce a simplified XIVE presenter

2018-12-09 Thread Cédric Le Goater
On 12/10/18 5:27 AM, David Gibson wrote:
> On Sun, Dec 09, 2018 at 08:45:54PM +0100, Cédric Le Goater wrote:
>> The last sub-engine of the XIVE architecture is the Interrupt
>> Virtualization Presentation Engine (IVPE). On HW, the IVRE and the
>> IVPE share elements, the Power Bus interface (CQ), the routing table
>> descriptors, and they can be combined in the same HW logic. We do the
>> same in QEMU and combine both engines in the XiveRouter for
>> simplicity.
>>
>> When the IVRE has completed its job of matching an event source with a
>> Notification Virtual Target (NVT) to notify, it forwards the event
>> notification to the IVPE sub-engine. The IVPE scans the thread
>> interrupt contexts of the Notification Virtual Targets (NVT)
>> dispatched on the HW processor threads and if a match is found, it
>> signals the thread. If not, the IVPE escalates the notification to
>> some other targets and records the notification in a backlog queue.
>>
>> The IVPE maintains the thread interrupt context state for each of its
>> NVTs not dispatched on HW processor threads in the Notification
>> Virtual Target table (NVTT).
>>
>> The model currently only supports single NVT notifications.
>>
>> Signed-off-by: Cédric Le Goater 
> 
> Applied.
> 
> I think the tctx_word2() should have the byteswap, rather than having
> it in the callers, but that can be fixed later.

I thought it was better to explicitly show in the code where the 
byteswaps were needed. Anyway, this is very localized, so, yes, 
we can change it later on.

C.
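
The two placements of the byteswap can be sketched as follows. This is a standalone illustration, not the code from the patch: WORD2_OFFSET, ring_word2(), load_be32() and store_be32() are hypothetical names, with the last two standing in for be32_to_cpu()/cpu_to_be32(). Only the CAM line formula is taken from xive_nvt_cam_line() in the quoted diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical byte offset of word 2 inside a 16-byte register ring. */
#define WORD2_OFFSET 8

/* Portable big-endian load, independent of host byte order. */
static uint32_t load_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Portable big-endian store, independent of host byte order. */
static void store_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* Same formula as xive_nvt_cam_line() in the quoted patch. */
static uint32_t nvt_cam_line(uint8_t nvt_blk, uint32_t nvt_idx)
{
    return ((uint32_t)nvt_blk << 19) | nvt_idx;
}

/*
 * Variant suggested in the review: the helper does the byteswap itself
 * and hands callers a host-endian value.
 */
static uint32_t ring_word2(const uint8_t *ring)
{
    assert(ring != NULL);
    return load_be32(ring + WORD2_OFFSET);
}
```

With the swap inside the helper, callers compare plain host-endian values. With the swap in the callers, each call site shows explicitly that the register content is big-endian.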

> 
>> ---
>>
>>  Changes since v6 :
>>
>>  - removed HW CAM line setting and use as it is only useful for PowerNV
>>  - made use of xive_tctx_word2() helper
>>  - made use of GETFIELD_BE32() to compare CAM lines
>>  - fixed initialization of XiveTCTXMatch
>>
>>  include/hw/ppc/xive.h  |  14 +++
>>  include/hw/ppc/xive_regs.h |  24 +
>>  hw/intc/xive.c | 185 +
>>  3 files changed, 223 insertions(+)
>>
>> diff --git a/include/hw/ppc/xive.h b/include/hw/ppc/xive.h
>> index 1e823a4c64e9..19309d1d65d1 100644
>> --- a/include/hw/ppc/xive.h
>> +++ b/include/hw/ppc/xive.h
>> @@ -325,6 +325,10 @@ typedef struct XiveRouterClass {
>> XiveEND *end);
>>  int (*write_end)(XiveRouter *xrtr, uint8_t end_blk, uint32_t end_idx,
>>   XiveEND *end, uint8_t word_number);
>> +int (*get_nvt)(XiveRouter *xrtr, uint8_t nvt_blk, uint32_t nvt_idx,
>> +   XiveNVT *nvt);
>> +int (*write_nvt)(XiveRouter *xrtr, uint8_t nvt_blk, uint32_t nvt_idx,
>> + XiveNVT *nvt, uint8_t word_number);
>>  } XiveRouterClass;
>>  
>>  void xive_eas_pic_print_info(XiveEAS *eas, uint32_t lisn, Monitor *mon);
>> @@ -335,6 +339,11 @@ int xive_router_get_end(XiveRouter *xrtr, uint8_t 
>> end_blk, uint32_t end_idx,
>>  XiveEND *end);
>>  int xive_router_write_end(XiveRouter *xrtr, uint8_t end_blk, uint32_t 
>> end_idx,
>>XiveEND *end, uint8_t word_number);
>> +int xive_router_get_nvt(XiveRouter *xrtr, uint8_t nvt_blk, uint32_t nvt_idx,
>> +XiveNVT *nvt);
>> +int xive_router_write_nvt(XiveRouter *xrtr, uint8_t nvt_blk, uint32_t 
>> nvt_idx,
>> +  XiveNVT *nvt, uint8_t word_number);
>> +
>>  
>>  /*
>>   * XIVE END ESBs
>> @@ -411,4 +420,9 @@ extern const MemoryRegionOps xive_tm_ops;
>>  
>>  void xive_tctx_pic_print_info(XiveTCTX *tctx, Monitor *mon);
>>  
>> +static inline uint32_t xive_nvt_cam_line(uint8_t nvt_blk, uint32_t nvt_idx)
>> +{
>> +return (nvt_blk << 19) | nvt_idx;
>> +}
>> +
>>  #endif /* PPC_XIVE_H */
>> diff --git a/include/hw/ppc/xive_regs.h b/include/hw/ppc/xive_regs.h
>> index ede3d04c5eda..85557e730cd8 100644
>> --- a/include/hw/ppc/xive_regs.h
>> +++ b/include/hw/ppc/xive_regs.h
>> @@ -186,4 +186,28 @@ typedef struct XiveEND {
>>  #define GETFIELD_BE32(m, v)   GETFIELD(m, be32_to_cpu(v))
>>  #define SETFIELD_BE32(m, v, val)  cpu_to_be32(SETFIELD(m, be32_to_cpu(v), 
>> val))
>>  
>> +/* Notification Virtual Target (NVT) */
>> +typedef struct XiveNVT {
>> +uint32_tw0;
>> +#define NVT_W0_VALID PPC_BIT32(0)
>> +uint32_tw1;
>> +uint32_tw2;
>> +uint32_tw3;
>> +uint32_tw4;
>> +uint32_tw5;
>> +uint32_tw6;
>> +uint32_tw7;
>> +uint32_tw8;
>> +#define NVT_W8_GRP_VALID PPC_BIT32(0)
>> +uint32_tw9;
>> +uint32_twa;
>> +uint32_twb;
>> +uint32_twc;
>> +uint32_twd;
>> +uint32_twe;
>> +uint32_twf;
>> +} XiveNVT;
>> +
>> +#define xive_nvt_is_valid(nvt)(be32_to_cpu((nvt)->w0) & NVT_W0_VALID)
>> +
>>  #endif /* PPC_XIVE_REGS_H */
>> diff --git a/hw/intc/xive.c b/hw/intc/xive.c
>> index 2615d16b7437..3eecffe99b3a 100644
>> --- 

Re: [Qemu-devel] [PATCH v7 01/19] ppc/xive: add support for the END Event State Buffers

2018-12-09 Thread Cédric Le Goater
On 12/10/18 5:16 AM, David Gibson wrote:
> On Sun, Dec 09, 2018 at 08:45:52PM +0100, Cédric Le Goater wrote:
>> The Event Notification Descriptor (END) XIVE structure also contains
>> two Event State Buffers providing further coalescing of interrupts,
>> one for the notification event (ESn) and one for the escalation events
>> (ESe). A MMIO page is assigned for each to control the EOI through
>> loads only. Stores are not allowed.
>>
>> The END ESBs are modeled through an object resembling the 'XiveSource'.
>> It is stateless as the END state bits are backed into the XiveEND
>> structure under the XiveRouter and the MMIO accesses follow the same
>> rules as for the XiveSource ESBs.
>>
>> END ESBs are not supported by the Linux drivers, either on OPAL or on
>> sPAPR. Nevertheless, this provides a means to study the question in the
>> future and validates the XIVE model a bit more.
>>
>> Signed-off-by: Cédric Le Goater 
>> ---
>>
>>  Changes since v6:
>>
>>  - removed the 'chip-id' field from XiveRouter
>>  - introduced a 'block-id' field in XiveENDSource to lookup the XIVE
>>END structure when doing a load in the MMIO ESB
>>  - removed reset XiveENDSource handler
>>
>>  include/hw/ppc/xive.h |  21 ++
>>  hw/intc/xive.c| 160 +-
>>  2 files changed, 179 insertions(+), 2 deletions(-)
> 
> Applied to ppc-for-4.0.
> 
> I had some thoughts about maybe-nicer arrangements of things here, but
> nothing important enough to delay this (the things I'm mulling over
> wouldn't break migration, so it's fixable later).

OK. No problem for me to do it afterwards. 

It's a bit of a pain to maintain a pile of 30/40 patches while changing stuff
in the first ones.

C.



Re: [Qemu-devel] [PATCH v7 17/19] spapr: Add a pseries-4.0 machine type

2018-12-09 Thread Cédric Le Goater
On 12/10/18 4:41 AM, David Gibson wrote:
> On Mon, Dec 10, 2018 at 09:05:06AM +1100, Benjamin Herrenschmidt wrote:
>> On Sun, 2018-12-09 at 20:46 +0100, Cédric Le Goater wrote:
>>> Signed-off-by: Cédric Le Goater 
>>> ---
>>
>> If you're going to do that, can we include large decrementer in there
>> too ? (patches from Suraj in my tree but they might need a bit of
>> massaging).
> 
> We don't need to worry about that here.  The machine type's not
> considered finalized until the release, so as long as you get the
> large dec stuff in before the 4.0 release, it's fine.

Are we talking about these 5 patches ? 

  target/ppc: Implement large decrementer support for TCG 
  
https://github.com/legoater/qemu/commit/9b3131ae25aa1ee630c48a0489d7194b3046031a

  target/ppc: Implement large decrementer support for KVM 
  
https://github.com/legoater/qemu/commit/eceb9fe2c77ba40230621af56dd20090a282e2f1

  target/ppc: Implement migration support for large decrementer 
  
https://github.com/legoater/qemu/commit/8da02805dfa39b888df530a6f00a59e6b2fbe34b
 
  target/ppc: Enable the large decrementer for TCG and KVM guests 
  
https://github.com/legoater/qemu/commit/0cff350c80e19553c35a3fc8a9859533d606c3e8

  target/ppc: Add cmd line option to disable the large decrementer 
  
https://github.com/legoater/qemu/commit/7136bfa944d8dc405150d0bc281c3df5cab98ab1

The PowerNV POWER9 will need the TCG part. 

> Looks like Eduardo and others are probably doing a big batch machine
> type update via the machine tree.  That will probably conflict, but it
> should be a fairly easy one for me to sort out when the time comes.

I think you can just drop this patch if someone adds the 4.0 machine
first, or just drop the include/hw/compat.h changes.

C.

>>
>>>  include/hw/compat.h |  3 +++
>>>  hw/ppc/spapr.c  | 25 ++---
>>>  2 files changed, 25 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/include/hw/compat.h b/include/hw/compat.h
>>> index 6f4d5fc64704..70958328fe7a 100644
>>> --- a/include/hw/compat.h
>>> +++ b/include/hw/compat.h
>>> @@ -1,6 +1,9 @@
>>>  #ifndef HW_COMPAT_H
>>>  #define HW_COMPAT_H
>>>  
>>> +#define HW_COMPAT_3_1 \
>>> +/* empty */
>>> +
>>>  #define HW_COMPAT_3_0 \
>>>  /* empty */
>>>  
>>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>>> index fa41927d95dd..4012ebd794a4 100644
>>> --- a/hw/ppc/spapr.c
>>> +++ b/hw/ppc/spapr.c
>>> @@ -3971,19 +3971,38 @@ static const TypeInfo spapr_machine_info = {
>>>  }\
>>>  type_init(spapr_machine_register_##suffix)
>>>  
>>> - /*
>>> +/*
>>> + * pseries-4.0
>>> + */
>>> +static void spapr_machine_4_0_instance_options(MachineState *machine)
>>> +{
>>> +}
>>> +
>>> +static void spapr_machine_4_0_class_options(MachineClass *mc)
>>> +{
>>> +/* Defaults for the latest behaviour inherited from the base class */
>>> +}
>>> +
>>> +DEFINE_SPAPR_MACHINE(4_0, "4.0", true);
>>> +
>>> +/*
>>>   * pseries-3.1
>>>   */
>>> +#define SPAPR_COMPAT_3_1  \
>>> +HW_COMPAT_3_1
>>> +
>>>  static void spapr_machine_3_1_instance_options(MachineState *machine)
>>>  {
>>> +spapr_machine_4_0_instance_options(machine);
>>>  }
>>>  
>>>  static void spapr_machine_3_1_class_options(MachineClass *mc)
>>>  {
>>> -/* Defaults for the latest behaviour inherited from the base class */
>>> +spapr_machine_4_0_class_options(mc);
>>> +SET_MACHINE_COMPAT(mc, SPAPR_COMPAT_3_1);
>>>  }
>>>  
>>> -DEFINE_SPAPR_MACHINE(3_1, "3.1", true);
>>> +DEFINE_SPAPR_MACHINE(3_1, "3.1", false);
>>>  
>>>  /*
>>>   * pseries-3.0
>>
> 




Re: [Qemu-devel] [PATCH v7 12/19] spapr: add a 'reset' method to the sPAPR IRQ backend

2018-12-09 Thread David Gibson
On Sun, Dec 09, 2018 at 08:46:03PM +0100, Cédric Le Goater wrote:
> For the time being, the XIVE reset handler updates the OS CAM line of
> the vCPU as it is done under a real hypervisor when a vCPU is
> scheduled to run on a HW thread.
> 
> This handler will become even more useful when we introduce the
> machine supporting both interrupt modes, XIVE and XICS. In this
> machine, the interrupt mode is chosen by the CAS negotiation process
> and activated after a reset.
> 
> Signed-off-by: Cédric Le Goater 
> ---
>  include/hw/ppc/spapr_irq.h  |  2 ++
>  include/hw/ppc/spapr_xive.h |  1 +
>  hw/intc/spapr_xive.c| 24 
>  hw/ppc/spapr.c  |  5 +
>  hw/ppc/spapr_irq.c  | 24 
>  5 files changed, 56 insertions(+)
> 
> diff --git a/include/hw/ppc/spapr_irq.h b/include/hw/ppc/spapr_irq.h
> index 84a25ffb6c65..63061a009b4c 100644
> --- a/include/hw/ppc/spapr_irq.h
> +++ b/include/hw/ppc/spapr_irq.h
> @@ -44,6 +44,7 @@ typedef struct sPAPRIrq {
>  Object *(*cpu_intc_create)(sPAPRMachineState *spapr, Object *cpu,
> Error **errp);
>  int (*post_load)(sPAPRMachineState *spapr, int version_id);
> +void (*reset)(sPAPRMachineState *spapr, Error **errp);
>  } sPAPRIrq;
>  
>  extern sPAPRIrq spapr_irq_xics;
> @@ -55,6 +56,7 @@ int spapr_irq_claim(sPAPRMachineState *spapr, int irq, bool 
> lsi, Error **errp);
>  void spapr_irq_free(sPAPRMachineState *spapr, int irq, int num);
>  qemu_irq spapr_qirq(sPAPRMachineState *spapr, int irq);
>  int spapr_irq_post_load(sPAPRMachineState *spapr, int version_id);
> +void spapr_irq_reset(sPAPRMachineState *spapr, Error **errp);
>  
>  /*
>   * XICS legacy routines
> diff --git a/include/hw/ppc/spapr_xive.h b/include/hw/ppc/spapr_xive.h
> index 728a5e8dc163..7244a6231ce6 100644
> --- a/include/hw/ppc/spapr_xive.h
> +++ b/include/hw/ppc/spapr_xive.h
> @@ -47,5 +47,6 @@ typedef struct sPAPRMachineState sPAPRMachineState;
>  void spapr_xive_hcall_init(sPAPRMachineState *spapr);
>  void spapr_dt_xive(sPAPRMachineState *spapr, uint32_t nr_servers, void *fdt,
> uint32_t phandle);
> +void spapr_xive_reset_tctx(sPAPRXive *xive);
>  
>  #endif /* PPC_SPAPR_XIVE_H */
> diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
> index a6d854b07690..560d8d031f74 100644
> --- a/hw/intc/spapr_xive.c
> +++ b/hw/intc/spapr_xive.c
> @@ -179,6 +179,30 @@ static void spapr_xive_map_mmio(sPAPRXive *xive)
>  sysbus_mmio_map(SYS_BUS_DEVICE(xive), 2, xive->tm_base);
>  }
>  
> +/*
> + * When a Virtual Processor is scheduled to run on a HW thread, the
> + * hypervisor pushes its identifier in the OS CAM line. Emulate the
> + * same behavior under QEMU.
> + */
> +void spapr_xive_reset_tctx(sPAPRXive *xive)
> +{
> +CPUState *cs;
> +uint8_t  nvt_blk;
> +uint32_t nvt_idx;
> +uint32_t nvt_cam;
> +
> +CPU_FOREACH(cs) {
> +PowerPCCPU *cpu = POWERPC_CPU(cs);
> +XiveTCTX *tctx = XIVE_TCTX(cpu->intc);
> +
> +spapr_xive_cpu_to_nvt(cpu, &nvt_blk, &nvt_idx);
> +
> +nvt_cam = cpu_to_be32(TM_QW1W2_VO |
> +  xive_nvt_cam_line(nvt_blk, nvt_idx));
> +memcpy(&tctx->regs[TM_QW1_OS + TM_WORD2], &nvt_cam, 4);
> +}
> +}
> +
>  static void spapr_xive_end_reset(XiveEND *end)
>  {
>  memset(end, 0, sizeof(*end));
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 8cea4cad1732..98d69f09e080 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -1619,6 +1619,11 @@ static void spapr_machine_reset(void)
>  
>  qemu_devices_reset();
>  
> +/* This is fixing some of the default configuration of the XIVE
> + * devices. To be called after the reset of the machine devices.
> + */
> +spapr_irq_reset(spapr, &error_fatal);
> +
>  /* DRC reset may cause a device to be unplugged. This will cause troubles
>   * if this device is used by another device (eg, a running vhost backend
>   * will crash QEMU if the DIMM holding the vring goes away). To avoid 
> such
> diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
> index 35a067cad3f8..04f5c9665550 100644
> --- a/hw/ppc/spapr_irq.c
> +++ b/hw/ppc/spapr_irq.c
> @@ -209,6 +209,10 @@ static int spapr_irq_post_load_xics(sPAPRMachineState 
> *spapr, int version_id)
>  return 0;
>  }
>  
> +static void spapr_irq_reset_xics(sPAPRMachineState *spapr, Error **errp)
> +{
> +}

You already have a check for a NULL reset hook in spapr_irq_reset() so
you could omit this empty function.

> +
>  #define SPAPR_IRQ_XICS_NR_IRQS 0x1000
>  #define SPAPR_IRQ_XICS_NR_MSIS \
>  (XICS_IRQ_BASE + SPAPR_IRQ_XICS_NR_IRQS - SPAPR_IRQ_MSI)
> @@ -225,6 +229,7 @@ sPAPRIrq spapr_irq_xics = {
>  .dt_populate = spapr_dt_xics,
>  .cpu_intc_create = spapr_irq_cpu_intc_create_xics,
>  .post_load   = spapr_irq_post_load_xics,
> +.reset   = spapr_irq_reset_xics,
>  };
>  
>  /*
> @@ -333,6 +338,15 @@ static int 

Re: [Qemu-devel] [PATCH v7 17/19] spapr: Add a pseries-4.0 machine type

2018-12-09 Thread David Gibson
On Sun, Dec 09, 2018 at 08:46:08PM +0100, Cédric Le Goater wrote:
> Signed-off-by: Cédric Le Goater 

Applied, since we'll need something like this sooner or later anyway.
I may have conflicts to resolve since I think a patch including a
similar change is in someone else's tree, but it shouldn't be too hard
to deal with.
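
The chaining used by the patch below (the older machine type calls the newer one's options, then applies its own compat settings) can be sketched in isolation. This is a simplified model under assumptions: MachineOpts and its fields are hypothetical, and the real HW_COMPAT_3_1 list in the patch is still empty.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced machine class with two tunables. */
typedef struct MachineOpts {
    int max_cpus;      /* inherited unchanged by older versions */
    int new_feature;   /* introduced in 4.0, disabled for 3.1 */
} MachineOpts;

/* Newest version: sets the current defaults. */
static void machine_4_0_options(MachineOpts *mo)
{
    assert(mo != NULL);
    mo->max_cpus = 1024;
    mo->new_feature = 1;
}

/* Older version: start from the newest defaults, then apply compat. */
static void machine_3_1_options(MachineOpts *mo)
{
    machine_4_0_options(mo);
    mo->new_feature = 0;   /* stands in for the 3.1 compat properties */
}
```

Each older version only has to describe how it differs from the next newer one, so adding a new machine type touches just the top of the chain.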

> ---
>  include/hw/compat.h |  3 +++
>  hw/ppc/spapr.c  | 25 ++---
>  2 files changed, 25 insertions(+), 3 deletions(-)
> 
> diff --git a/include/hw/compat.h b/include/hw/compat.h
> index 6f4d5fc64704..70958328fe7a 100644
> --- a/include/hw/compat.h
> +++ b/include/hw/compat.h
> @@ -1,6 +1,9 @@
>  #ifndef HW_COMPAT_H
>  #define HW_COMPAT_H
>  
> +#define HW_COMPAT_3_1 \
> +/* empty */
> +
>  #define HW_COMPAT_3_0 \
>  /* empty */
>  
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index fa41927d95dd..4012ebd794a4 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -3971,19 +3971,38 @@ static const TypeInfo spapr_machine_info = {
>  }\
>  type_init(spapr_machine_register_##suffix)
>  
> - /*
> +/*
> + * pseries-4.0
> + */
> +static void spapr_machine_4_0_instance_options(MachineState *machine)
> +{
> +}
> +
> +static void spapr_machine_4_0_class_options(MachineClass *mc)
> +{
> +/* Defaults for the latest behaviour inherited from the base class */
> +}
> +
> +DEFINE_SPAPR_MACHINE(4_0, "4.0", true);
> +
> +/*
>   * pseries-3.1
>   */
> +#define SPAPR_COMPAT_3_1  \
> +HW_COMPAT_3_1
> +
>  static void spapr_machine_3_1_instance_options(MachineState *machine)
>  {
> +spapr_machine_4_0_instance_options(machine);
>  }
>  
>  static void spapr_machine_3_1_class_options(MachineClass *mc)
>  {
> -/* Defaults for the latest behaviour inherited from the base class */
> +spapr_machine_4_0_class_options(mc);
> +SET_MACHINE_COMPAT(mc, SPAPR_COMPAT_3_1);
>  }
>  
> -DEFINE_SPAPR_MACHINE(3_1, "3.1", true);
> +DEFINE_SPAPR_MACHINE(3_1, "3.1", false);
>  
>  /*
>   * pseries-3.0

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson




Re: [Qemu-devel] [PATCH v7 08/19] spapr: add hcalls support for the XIVE exploitation interrupt mode

2018-12-09 Thread David Gibson
On Sun, Dec 09, 2018 at 08:45:59PM +0100, Cédric Le Goater wrote:
> The different XIVE virtualization structures (sources and event queues)
> are configured with a set of Hypervisor calls :
> 
>  - H_INT_GET_SOURCE_INFO
> 
>used to obtain the address of the MMIO page of the Event State
>Buffer (ESB) entry associated with the source.
> 
>  - H_INT_SET_SOURCE_CONFIG
> 
>assigns a source to a "target".
> 
>  - H_INT_GET_SOURCE_CONFIG
> 
>determines which "target" and "priority" is assigned to a source
> 
>  - H_INT_GET_QUEUE_INFO
> 
>returns the address of the notification management page associated
>with the specified "target" and "priority".
> 
>  - H_INT_SET_QUEUE_CONFIG
> 
>sets or resets the event queue for a given "target" and "priority".
>It is also used to set the notification configuration associated
>with the queue, only unconditional notification is supported for
>the moment. Reset is performed with a queue size of 0 and queueing
>is disabled in that case.
> 
>  - H_INT_GET_QUEUE_CONFIG
> 
>returns the queue settings for a given "target" and "priority".
> 
>  - H_INT_RESET
> 
>resets all of the guest's internal interrupt structures to their
>initial state, losing all configuration set via the hcalls
>H_INT_SET_SOURCE_CONFIG and H_INT_SET_QUEUE_CONFIG.
> 
>  - H_INT_SYNC
> 
>issue a synchronisation on a source to make sure all notifications
>have reached their queue.
> 
> Calls that still need to be addressed :
> 
>H_INT_SET_OS_REPORTING_LINE
>H_INT_GET_OS_REPORTING_LINE
> 
> See the code for more documentation on each hcall.
> 
> Signed-off-by: Cédric Le Goater 

Reviewed-by: David Gibson 
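
As background for the MAX_HCALL_OPCODE change in the patch below, here is a minimal sketch of an opcode-indexed hypercall table. It assumes, as the listed opcodes suggest, that opcodes are multiples of 4. The names hcall_table, register_hcall(), dispatch_hcall() and demo_int_reset(), and the -1/-2 return codes, are hypothetical and do not reproduce QEMU's implementation.

```c
#include <assert.h>
#include <stddef.h>

#define H_INT_GET_SOURCE_INFO 0x3A8
#define H_INT_RESET           0x3D0
#define MAX_HCALL_OPCODE      H_INT_RESET

typedef long (*hcall_fn)(unsigned long arg);

/* Opcodes are multiples of 4, so opcode / 4 gives a dense table index. */
static hcall_fn hcall_table[(MAX_HCALL_OPCODE / 4) + 1];

/* Hypothetical registration: rejects unaligned or out-of-range opcodes. */
static int register_hcall(unsigned int opcode, hcall_fn fn)
{
    if ((opcode & 0x3) || opcode > MAX_HCALL_OPCODE) {
        return -1;
    }
    hcall_table[opcode / 4] = fn;
    return 0;
}

/* Hypothetical dispatch: -2 stands in for "function not supported". */
static long dispatch_hcall(unsigned int opcode, unsigned long arg)
{
    if ((opcode & 0x3) || opcode > MAX_HCALL_OPCODE ||
        hcall_table[opcode / 4] == NULL) {
        return -2;
    }
    return hcall_table[opcode / 4](arg);
}

/* Dummy handler used only to exercise the table. */
static long demo_int_reset(unsigned long arg)
{
    return (long)arg + 1;
}
```

In a table of this shape, any opcode above MAX_HCALL_OPCODE cannot be registered, which is one plausible reason the limit has to move up to H_INT_RESET when the new H_INT_* opcodes are added.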

> ---
> 
>  Changes since v6:
> 
>  - simplified the prototypes of helpers
>  - introduced a fixed value for the controller block id value.
>  
>  include/hw/ppc/spapr.h  |  15 +-
>  include/hw/ppc/spapr_xive.h |   4 +
>  hw/intc/spapr_xive.c| 963 
>  hw/ppc/spapr_irq.c  |   2 +
>  4 files changed, 983 insertions(+), 1 deletion(-)
> 
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index cb3082d319af..6bf028a02fe2 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -452,7 +452,20 @@ struct sPAPRMachineState {
>  #define H_INVALIDATE_PID            0x378
>  #define H_REGISTER_PROC_TBL         0x37C
>  #define H_SIGNAL_SYS_RESET          0x380
> -#define MAX_HCALL_OPCODE            H_SIGNAL_SYS_RESET
> +
> +#define H_INT_GET_SOURCE_INFO       0x3A8
> +#define H_INT_SET_SOURCE_CONFIG     0x3AC
> +#define H_INT_GET_SOURCE_CONFIG     0x3B0
> +#define H_INT_GET_QUEUE_INFO        0x3B4
> +#define H_INT_SET_QUEUE_CONFIG      0x3B8
> +#define H_INT_GET_QUEUE_CONFIG      0x3BC
> +#define H_INT_SET_OS_REPORTING_LINE 0x3C0
> +#define H_INT_GET_OS_REPORTING_LINE 0x3C4
> +#define H_INT_ESB                   0x3C8
> +#define H_INT_SYNC                  0x3CC
> +#define H_INT_RESET                 0x3D0
> +
> +#define MAX_HCALL_OPCODE            H_INT_RESET
>  
>  /* The hcalls above are standardized in PAPR and implemented by pHyp
>   * as well.
> diff --git a/include/hw/ppc/spapr_xive.h b/include/hw/ppc/spapr_xive.h
> index f087959b9924..9506a8f4d10a 100644
> --- a/include/hw/ppc/spapr_xive.h
> +++ b/include/hw/ppc/spapr_xive.h
> @@ -42,4 +42,8 @@ bool spapr_xive_irq_free(sPAPRXive *xive, uint32_t lisn);
>  void spapr_xive_pic_print_info(sPAPRXive *xive, Monitor *mon);
>  qemu_irq spapr_xive_qirq(sPAPRXive *xive, uint32_t lisn);
>  
> +typedef struct sPAPRMachineState sPAPRMachineState;
> +
> +void spapr_xive_hcall_init(sPAPRMachineState *spapr);
> +
>  #endif /* PPC_SPAPR_XIVE_H */
> diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
> index 3ade419fdbb1..982ac6e17051 100644
> --- a/hw/intc/spapr_xive.c
> +++ b/hw/intc/spapr_xive.c
> @@ -38,6 +38,13 @@
>  
>  #define SPAPR_XIVE_NVT_BASE 0x400
>  
> +/*
> + * The sPAPR machine has a unique XIVE IC device. Assign a fixed value
> + * to the controller block id value. It can nevertheless be changed
> + * for testing purpose.
> + */
> +#define SPAPR_XIVE_BLOCK_ID 0x0
> +
>  /*
>   * sPAPR NVT and END indexing helpers
>   */
> @@ -46,6 +53,64 @@ static uint32_t spapr_xive_nvt_to_target(uint8_t nvt_blk, uint32_t nvt_idx)
>  return nvt_idx - SPAPR_XIVE_NVT_BASE;
>  }
>  
> +static void spapr_xive_cpu_to_nvt(PowerPCCPU *cpu,
> +  uint8_t *out_nvt_blk, uint32_t *out_nvt_idx)
> +{
> +assert(cpu);
> +
> +if (out_nvt_blk) {
> +*out_nvt_blk = SPAPR_XIVE_BLOCK_ID;
> +}
> +
> +if (out_nvt_idx) {
> +*out_nvt_idx = SPAPR_XIVE_NVT_BASE + cpu->vcpu_id;
> +}
> +}
> +
> +static int spapr_xive_target_to_nvt(uint32_t target,
> +uint8_t *out_nvt_blk, uint32_t *out_nvt_idx)
> +{
> +PowerPCCPU *cpu = spapr_find_cpu(target);
> +
> +if (!cpu) {
> +return -1;
> +}
> +
> +spapr_xive_cpu_to_nvt(cpu, out_nvt_blk, out_nvt_idx);
> +return 0;
> +}
> 

Re: [Qemu-devel] [PATCH v7 09/19] spapr: add device tree support for the XIVE exploitation mode

2018-12-09 Thread David Gibson
On Sun, Dec 09, 2018 at 08:46:00PM +0100, Cédric Le Goater wrote:
> The XIVE interface for the guest is described in the device tree under
> the "interrupt-controller" node. A couple of new properties are
> specific to XIVE :
> 
>  - "reg"
> 
>contains the base address and size of the thread interrupt
>management areas (TIMA), for the User level and for the Guest OS
>level. Only the Guest OS level is taken into account today.
> 
>  - "ibm,xive-eq-sizes"
> 
>the size of the event queues. One cell per size supported, contains
>log2 of size, in ascending order.
> 
>  - "ibm,xive-lisn-ranges"
> 
>the IRQ interrupt number ranges assigned to the guest for the IPIs.
> 
> and also under the root node :
> 
>  - "ibm,plat-res-int-priorities"
> 
>contains a list of priorities that the hypervisor has reserved for
>its own use. OPAL uses the priority 7 queue to automatically
>escalate interrupts for all other queues (DD2.X POWER9). So only
>priorities [0..6] are allowed for the guest.
> 
> Extend the sPAPR IRQ backend with a new handler to populate the DT
> with the appropriate "interrupt-controller" node.
> 
> Signed-off-by: Cédric Le Goater 
> ---
>  include/hw/ppc/spapr_irq.h  |  2 ++
>  include/hw/ppc/spapr_xive.h |  2 ++
>  include/hw/ppc/xics.h   |  4 +--
>  hw/intc/spapr_xive.c| 64 +
>  hw/intc/xics_spapr.c|  3 +-
>  hw/ppc/spapr.c  |  3 +-
>  hw/ppc/spapr_irq.c  |  3 ++
>  7 files changed, 77 insertions(+), 4 deletions(-)
> 
> diff --git a/include/hw/ppc/spapr_irq.h b/include/hw/ppc/spapr_irq.h
> index 23cdb51b879e..e51e9f052f63 100644
> --- a/include/hw/ppc/spapr_irq.h
> +++ b/include/hw/ppc/spapr_irq.h
> @@ -39,6 +39,8 @@ typedef struct sPAPRIrq {
>  void (*free)(sPAPRMachineState *spapr, int irq, int num);
>  qemu_irq (*qirq)(sPAPRMachineState *spapr, int irq);
>  void (*print_info)(sPAPRMachineState *spapr, Monitor *mon);
> +void (*dt_populate)(sPAPRMachineState *spapr, uint32_t nr_servers,
> +void *fdt, uint32_t phandle);
>  } sPAPRIrq;
>  
>  extern sPAPRIrq spapr_irq_xics;
> diff --git a/include/hw/ppc/spapr_xive.h b/include/hw/ppc/spapr_xive.h
> index 9506a8f4d10a..728a5e8dc163 100644
> --- a/include/hw/ppc/spapr_xive.h
> +++ b/include/hw/ppc/spapr_xive.h
> @@ -45,5 +45,7 @@ qemu_irq spapr_xive_qirq(sPAPRXive *xive, uint32_t lisn);
>  typedef struct sPAPRMachineState sPAPRMachineState;
>  
>  void spapr_xive_hcall_init(sPAPRMachineState *spapr);
> +void spapr_dt_xive(sPAPRMachineState *spapr, uint32_t nr_servers, void *fdt,
> +   uint32_t phandle);
>  
>  #endif /* PPC_SPAPR_XIVE_H */
> diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
> index 9958443d1984..14afda198cdb 100644
> --- a/include/hw/ppc/xics.h
> +++ b/include/hw/ppc/xics.h
> @@ -181,8 +181,6 @@ typedef struct XICSFabricClass {
>  ICPState *(*icp_get)(XICSFabric *xi, int server);
>  } XICSFabricClass;
>  
> -void spapr_dt_xics(int nr_servers, void *fdt, uint32_t phandle);
> -
>  ICPState *xics_icp_get(XICSFabric *xi, int server);
>  
>  /* Internal XICS interfaces */
> @@ -204,6 +202,8 @@ void icp_resend(ICPState *ss);
>  
>  typedef struct sPAPRMachineState sPAPRMachineState;
>  
> +void spapr_dt_xics(sPAPRMachineState *spapr, uint32_t nr_servers, void *fdt,
> +   uint32_t phandle);
>  int xics_kvm_init(sPAPRMachineState *spapr, Error **errp);
>  void xics_spapr_init(sPAPRMachineState *spapr);
>  
> diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
> index 982ac6e17051..a6d854b07690 100644
> --- a/hw/intc/spapr_xive.c
> +++ b/hw/intc/spapr_xive.c
> @@ -14,6 +14,7 @@
>  #include "target/ppc/cpu.h"
>  #include "sysemu/cpus.h"
>  #include "monitor/monitor.h"
> +#include "hw/ppc/fdt.h"
>  #include "hw/ppc/spapr.h"
>  #include "hw/ppc/spapr_xive.h"
>  #include "hw/ppc/xive.h"
> @@ -1381,3 +1382,66 @@ void spapr_xive_hcall_init(sPAPRMachineState *spapr)
>  spapr_register_hypercall(H_INT_SYNC, h_int_sync);
>  spapr_register_hypercall(H_INT_RESET, h_int_reset);
>  }
> +
> +void spapr_dt_xive(sPAPRMachineState *spapr, uint32_t nr_servers, void *fdt,
> +   uint32_t phandle)
> +{
> +sPAPRXive *xive = spapr->xive;
> +int node;
> +uint64_t timas[2 * 2];
> +/* Interrupt number ranges for the IPIs */
> +uint32_t lisn_ranges[] = {
> +cpu_to_be32(0),
> +cpu_to_be32(nr_servers),
> +};
> +uint32_t eq_sizes[] = {
> +cpu_to_be32(12), /* 4K */
> +cpu_to_be32(16), /* 64K */
> +cpu_to_be32(21), /* 2M */
> +cpu_to_be32(24), /* 16M */

For KVM, are we going to need to clamp this list based on the
pagesizes the guest can use?
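
If clamping does turn out to be needed, it could look roughly like the
sketch below. This is only an illustration: clamp_eq_sizes() and
max_page_shift are hypothetical names, not part of the patch, and how
KVM would report the largest usable page size is left open here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch: copy into 'out' only the
 * log2 queue sizes that do not exceed 'max_page_shift', preserving the
 * ascending order of 'in'. Returns the number of entries kept.
 */
static size_t clamp_eq_sizes(const uint32_t *in, size_t n,
                             uint32_t max_page_shift, uint32_t *out)
{
    size_t kept = 0;

    assert(in && out);

    for (size_t i = 0; i < n; i++) {
        if (in[i] <= max_page_shift) {
            out[kept++] = in[i];
        }
    }
    return kept;
}
```

With the list from the patch (12, 16, 21, 24) and a guest limited to
64K pages, only the 4K and 64K entries would be advertised.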

> +};
> +/* The following array is in sync with the reserved priorities
> + * defined by the 'spapr_xive_priority_is_reserved' routine.
> + */
> +uint32_t plat_res_int_priorities[] = {
> +cpu_to_be32(7),/* 

Re: [Qemu-devel] [PATCH v11 0/3] wakeup-from-suspend and system_wakeup changes

2018-12-09 Thread Markus Armbruster
Eduardo Habkost  writes:

> On Thu, Dec 06, 2018 at 07:59:02AM +0100, Markus Armbruster wrote:
>> Daniel Henrique Barboza  writes:
>> 
>> > changes in v11:
>> > - fixed typos, changed version to 4.0 in patches 1 and 3
>> > - changed text in patch 2 to be less alarming
>> > - patch 3: changed error handling
>> > - previous version link:
>> > http://lists.nongnu.org/archive/html/qemu-devel/2018-11/msg01774.html
>> 
>> Looks ready to me.  Who's going to merge it?
>
> Do you mind merging it through the QMP tree?
>
> Acked-by: Eduardo Habkost 

Can do.  Thanks!



Re: [Qemu-devel] Guests are crashing on startup, seem related to usb-audio

2018-12-09 Thread kra...@redhat.com
  Hi,

> #3  0x701be412 in __GI___assert_fail (assertion=0x55fb8738
> "p->actual_length + bytes <= iov->size", file=0x55fb8456
> "hw/usb/core.c", line=592, function=0x55fb8980
> <__PRETTY_FUNCTION__.26351> "usb_packet_copy") at assert.c:101
> #4  0x55bd5ed7 in usb_packet_copy (p=0x7fffc4722ea8,
> ptr=0x7fffbc053ee0, bytes=192) at hw/usb/core.c:592

Can you "print *p" here?

thanks,
  Gerd




Re: [Qemu-devel] [Qemu-ppc] [PATCH qemu] ppc/spapr: Receive and store device tree blob from SLOF

2018-12-09 Thread David Gibson
On Mon, Nov 12, 2018 at 03:12:26PM +1100, Alexey Kardashevskiy wrote:
> 
> 
> On 12/11/2018 05:10, Greg Kurz wrote:
> > Hi Alexey,
> > 
> > Just a few remarks. See below.
> > 
> > On Thu,  8 Nov 2018 12:44:06 +1100
> > Alexey Kardashevskiy  wrote:
> > 
> >> SLOF receives a device tree and updates it with various properties
> >> before switching to the guest kernel and QEMU is not aware of any changes
> >> made by SLOF. Since there is no real RTAS (QEMU implements it), it makes
> >> sense to pass the SLOF final device tree to QEMU to let it implement
> >> RTAS related tasks better, such as PCI host bus adapter hotplug.
> >>
> >> Specifically, now QEMU can find out the actual XICS phandle (for PHB
> >> hotplug) and the RTAS linux,rtas-entry/base properties (for firmware
> >> assisted NMI - FWNMI).
> >>
> >> This stores the initial DT blob in the sPAPR machine and replaces it
> >> in the KVMPPC_H_UPDATE_DT (new private hypercall) handler.
> >>
> >> This adds an @update_dt_enabled machine property to allow backward
> >> migration.
> >>
> >> SLOF already has a hypercall since
> >> https://github.com/aik/SLOF/commit/e6fc84652c9c0073f9183
> >>
> >> Signed-off-by: Alexey Kardashevskiy 
> >> ---
> >>  include/hw/ppc/spapr.h |  7 ++-
> >>  hw/ppc/spapr.c | 29 -
> >>  hw/ppc/spapr_hcall.c   | 32 
> >>  hw/ppc/trace-events|  2 ++
> >>  4 files changed, 68 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> >> index ad4d7cfd97..f5dcaf44cb 100644
> >> --- a/include/hw/ppc/spapr.h
> >> +++ b/include/hw/ppc/spapr.h
> >> @@ -100,6 +100,7 @@ struct sPAPRMachineClass {
> >>  
> >>  /*< public >*/
> >>  bool dr_lmb_enabled;   /* enable dynamic-reconfig/hotplug of LMBs */
> >> +bool update_dt_enabled;/* enable KVMPPC_H_UPDATE_DT */
> >>  bool use_ohci_by_default;  /* use USB-OHCI instead of XHCI */
> >>  bool pre_2_10_has_unused_icps;
> >>  bool legacy_irq_allocation;
> >> @@ -136,6 +137,9 @@ struct sPAPRMachineState {
> >>  int vrma_adjust;
> >>  ssize_t rtas_size;
> >>  void *rtas_blob;
> >> +uint32_t fdt_size;
> >> +uint32_t fdt_initial_size;
> > 
> > I don't quite see the purpose of fdt_initial_size... it seems to be only
> > used to print a trace.
> 
> 
> Ah, lost in rebase. The purpose was to test if the new device tree has
> not grown too much.
> 
> 
> 
> > 
> >> +void *fdt_blob;
> >>  long kernel_size;
> >>  bool kernel_le;
> >>  uint32_t initrd_base;
> >> @@ -462,7 +466,8 @@ struct sPAPRMachineState {
> >>  #define KVMPPC_H_LOGICAL_MEMOP  (KVMPPC_HCALL_BASE + 0x1)
> >>  /* Client Architecture support */
> >>  #define KVMPPC_H_CAS            (KVMPPC_HCALL_BASE + 0x2)
> >> -#define KVMPPC_HCALL_MAX        KVMPPC_H_CAS
> >> +#define KVMPPC_H_UPDATE_DT      (KVMPPC_HCALL_BASE + 0x3)
> >> +#define KVMPPC_HCALL_MAX        KVMPPC_H_UPDATE_DT
> >>  
> >>  typedef struct sPAPRDeviceTreeUpdateHeader {
> >>  uint32_t version_id;
> >> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> >> index c08130facb..5e2d4d211c 100644
> >> --- a/hw/ppc/spapr.c
> >> +++ b/hw/ppc/spapr.c
> >> @@ -1633,7 +1633,10 @@ static void spapr_machine_reset(void)
> >>  /* Load the fdt */
> >>  qemu_fdt_dumpdtb(fdt, fdt_totalsize(fdt));
> >>  cpu_physical_memory_write(fdt_addr, fdt, fdt_totalsize(fdt));
> >> -g_free(fdt);
> >> +g_free(spapr->fdt_blob);
> >> +spapr->fdt_size = fdt_totalsize(fdt);
> >> +spapr->fdt_initial_size = spapr->fdt_size;
> >> +spapr->fdt_blob = fdt;
> > 
> > > Hmm... It looks weird to store state in a reset handler. I'd rather zero
> > both fdt_blob and fdt_size here.
> 
> The device tree is built from the reset handler and the idea is that we
> want to always have some tree in the machine.

Yes, I think the approach here is fine.  Otherwise when we want to
look up the current fdt state in RTAS calls or whatever we'd always
have to do
    if (fdt_blob)
        look up that
    else
        look up qemu created fdt.

Incidentally 'fdt' and 'fdt_blob' names do a terrible job of
distinguishing what the difference is.  Renaming fdt to fdt_initial
(to match fdt_initial_size) and fdt_blob to fdt should make that
clearer.
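
To make the point concrete, here is a minimal standalone sketch of the
"always have a tree" approach. FdtState, fdt_state_reset() and
fdt_state_update() are hypothetical stand-ins for the machine state,
the reset handler and the hypercall handler; they are not the code in
the patch, and plain malloc/free is used only to keep it self-contained.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the fdt fields of the machine state. */
typedef struct FdtState {
    void *fdt;                /* current tree, never NULL after reset */
    size_t fdt_size;          /* size of the current tree */
    size_t fdt_initial_size;  /* size of the tree built at reset */
} FdtState;

/* Reset path: the freshly built tree becomes the current tree. */
static void fdt_state_reset(FdtState *s, const void *blob, size_t size)
{
    void *copy = malloc(size);

    assert(copy);
    memcpy(copy, blob, size);
    free(s->fdt);
    s->fdt = copy;
    s->fdt_size = size;
    s->fdt_initial_size = size;
}

/* Update path: the firmware's tree replaces the current one. */
static void fdt_state_update(FdtState *s, const void *blob, size_t size)
{
    void *copy = malloc(size);

    assert(copy);
    memcpy(copy, blob, size);
    free(s->fdt);
    s->fdt = copy;
    s->fdt_size = size;
}
```

Any later lookup can then use s->fdt unconditionally, with no fallback
branch.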

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [Qemu-devel] [PATCH v7 05/19] spapr/xive: introduce a XIVE interrupt controller

2018-12-09 Thread David Gibson
On Sun, Dec 09, 2018 at 08:45:56PM +0100, Cédric Le Goater wrote:
> sPAPRXive models the XIVE interrupt controller of the sPAPR machine.
> It inherits from the XiveRouter and provisions storage for the routing
> tables :
> 
>   - Event Assignment Structure (EAS)
>   - Event Notification Descriptor (END)
> 
> The sPAPRXive model incorporates an internal XiveSource for the IPIs
> and for the interrupts of the virtual devices of the guest. This model
> is consistent with XIVE architecture which also incorporates an
> internal IVSE for IPIs and accelerator interrupts in the IVRE
> sub-engine.
> 
> The sPAPRXive model exports two memory regions, one for the ESB
> trigger and management pages used to control the sources and one for
> the TIMA pages. They are mapped by default at the addresses found on
> chip 0 of a baremetal system. This is also consistent with the XIVE
> architecture which defines a Virtualization Controller BAR for the
> internal IVSE ESB pages and a Thread Management BAR for the TIMA.
> 
> Signed-off-by: Cédric Le Goater 
> Reviewed-by: David Gibson 

Applied.

> ---
>  default-configs/ppc64-softmmu.mak |   1 +
>  include/hw/ppc/spapr_xive.h   |  45 
>  hw/intc/spapr_xive.c  | 366 ++
>  hw/intc/Makefile.objs |   1 +
>  4 files changed, 413 insertions(+)
>  create mode 100644 include/hw/ppc/spapr_xive.h
>  create mode 100644 hw/intc/spapr_xive.c
> 
> diff --git a/default-configs/ppc64-softmmu.mak b/default-configs/ppc64-softmmu.mak
> index 2d1e7c5c4668..7f34ad0528ed 100644
> --- a/default-configs/ppc64-softmmu.mak
> +++ b/default-configs/ppc64-softmmu.mak
> @@ -17,6 +17,7 @@ CONFIG_XICS=$(CONFIG_PSERIES)
>  CONFIG_XICS_SPAPR=$(CONFIG_PSERIES)
>  CONFIG_XICS_KVM=$(call land,$(CONFIG_PSERIES),$(CONFIG_KVM))
>  CONFIG_XIVE=$(CONFIG_PSERIES)
> +CONFIG_XIVE_SPAPR=$(CONFIG_PSERIES)
>  CONFIG_MEM_DEVICE=y
>  CONFIG_DIMM=y
>  CONFIG_SPAPR_RNG=y
> diff --git a/include/hw/ppc/spapr_xive.h b/include/hw/ppc/spapr_xive.h
> new file mode 100644
> index 000000000000..f087959b9924
> --- /dev/null
> +++ b/include/hw/ppc/spapr_xive.h
> @@ -0,0 +1,45 @@
> +/*
> + * QEMU PowerPC sPAPR XIVE interrupt controller model
> + *
> + * Copyright (c) 2017-2018, IBM Corporation.
> + *
> + * This code is licensed under the GPL version 2 or later. See the
> + * COPYING file in the top-level directory.
> + */
> +
> +#ifndef PPC_SPAPR_XIVE_H
> +#define PPC_SPAPR_XIVE_H
> +
> +#include "hw/ppc/xive.h"
> +
> +#define TYPE_SPAPR_XIVE "spapr-xive"
> +#define SPAPR_XIVE(obj) OBJECT_CHECK(sPAPRXive, (obj), TYPE_SPAPR_XIVE)
> +
> +typedef struct sPAPRXive {
> +    XiveRouter    parent;
> +
> +    /* Internal interrupt source for IPIs and virtual devices */
> +    XiveSource    source;
> +    hwaddr        vc_base;
> +
> +    /* END ESB MMIOs */
> +    XiveENDSource end_source;
> +    hwaddr        end_base;
> +
> +    /* Routing table */
> +    XiveEAS       *eat;
> +    uint32_t      nr_irqs;
> +    XiveEND       *endt;
> +    uint32_t      nr_ends;
> +
> +    /* TIMA mapping address */
> +    hwaddr        tm_base;
> +    MemoryRegion  tm_mmio;
> +} sPAPRXive;
> +
> +bool spapr_xive_irq_claim(sPAPRXive *xive, uint32_t lisn, bool lsi);
> +bool spapr_xive_irq_free(sPAPRXive *xive, uint32_t lisn);
> +void spapr_xive_pic_print_info(sPAPRXive *xive, Monitor *mon);
> +qemu_irq spapr_xive_qirq(sPAPRXive *xive, uint32_t lisn);
> +
> +#endif /* PPC_SPAPR_XIVE_H */
> diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
> new file mode 100644
> index 000000000000..eef5830d45c6
> --- /dev/null
> +++ b/hw/intc/spapr_xive.c
> @@ -0,0 +1,366 @@
> +/*
> + * QEMU PowerPC sPAPR XIVE interrupt controller model
> + *
> + * Copyright (c) 2017-2018, IBM Corporation.
> + *
> + * This code is licensed under the GPL version 2 or later. See the
> + * COPYING file in the top-level directory.
> + */
> +
> +#include "qemu/osdep.h"
> +#include "qemu/log.h"
> +#include "qapi/error.h"
> +#include "qemu/error-report.h"
> +#include "target/ppc/cpu.h"
> +#include "sysemu/cpus.h"
> +#include "monitor/monitor.h"
> +#include "hw/ppc/spapr.h"
> +#include "hw/ppc/spapr_xive.h"
> +#include "hw/ppc/xive.h"
> +#include "hw/ppc/xive_regs.h"
> +
> +/*
> + * XIVE Virtualization Controller BAR and Thread Management BAR that we
> + * use for the ESB pages and the TIMA pages
> + */
> +#define SPAPR_XIVE_VC_BASE   0x0006010000000000ull
> +#define SPAPR_XIVE_TM_BASE   0x0006030203180000ull
> +
> +/*
> + * On sPAPR machines, use a simplified output for the XIVE END
> + * structure dumping only the information related to the OS EQ.
> + */
> +static void spapr_xive_end_pic_print_info(sPAPRXive *xive, XiveEND *end,
> +  Monitor *mon)
> +{
> +uint32_t qindex = GETFIELD_BE32(END_W1_PAGE_OFF, end->w1);
> +uint32_t qgen = GETFIELD_BE32(END_W1_GENERATION, end->w1);
> +uint32_t qsize = GETFIELD_BE32(END_W0_QSIZE, end->w0);
> +uint32_t qentries = 1 << (qsize + 10);
> +

Re: [Qemu-devel] [PATCH v7 07/19] spapr: introduce a new machine IRQ backend for XIVE

2018-12-09 Thread David Gibson
On Sun, Dec 09, 2018 at 08:45:58PM +0100, Cédric Le Goater wrote:
> The XIVE IRQ backend uses the same layout as the new XICS backend but
> covers the full range of the IRQ number space. The IRQ numbers for the
> CPU IPIs are allocated at the bottom of this space, below 4K, to
> preserve compatibility with XICS which does not use that range.
> 
> This should be enough given that the maximum number of CPUs is 1024
> for the sPAPR machine under QEMU. For the record, the biggest POWER8
> or POWER9 system has a maximum of 1536 HW threads (16 sockets, 192
> cores, SMT8).
> 
> Signed-off-by: Cédric Le Goater 

Reviewed-by: David Gibson 

With the exception of the TODO noted below.
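
As a quick sanity check on the numbering described in the commit
message, the layout can be sketched as below. The macro and function
names are hypothetical; only the values 0x0 and 0x1000 come from the
SPAPR_IRQ_IPI and SPAPR_IRQ_EPOW definitions in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define IRQ_IPI_BASE   0x0     /* SPAPR_IRQ_IPI in the patch */
#define IRQ_EPOW_BASE  0x1000  /* SPAPR_IRQ_EPOW, first non-IPI number */

/* True if 'nr_servers' IPIs fit below the first device IRQ number. */
static bool ipi_range_fits(uint32_t nr_servers)
{
    return IRQ_IPI_BASE + nr_servers <= IRQ_EPOW_BASE;
}

/* IRQ number of the IPI for a server index; asserts it stays below 4K. */
static uint32_t ipi_irq(uint32_t server)
{
    assert(IRQ_IPI_BASE + server < IRQ_EPOW_BASE);
    return IRQ_IPI_BASE + server;
}
```

Both the 1024 CPU limit of the sPAPR machine and the 1536 HW threads
quoted for the biggest systems fit in the 4096 numbers below 0x1000.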

> ---
>  include/hw/ppc/spapr.h |   2 +
>  include/hw/ppc/spapr_irq.h |   2 +
>  hw/ppc/spapr_irq.c | 113 +
>  3 files changed, 117 insertions(+)
> 
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index 198764066dc9..cb3082d319af 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -16,6 +16,7 @@ typedef struct sPAPREventLogEntry sPAPREventLogEntry;
>  typedef struct sPAPREventSource sPAPREventSource;
>  typedef struct sPAPRPendingHPT sPAPRPendingHPT;
>  typedef struct ICSState ICSState;
> +typedef struct sPAPRXive sPAPRXive;
>  
>  #define HPTE64_V_HPTE_DIRTY 0x0000000000000040ULL
>  #define SPAPR_ENTRY_POINT   0x100
> @@ -175,6 +176,7 @@ struct sPAPRMachineState {
>  const char *icp_type;
>  int32_t irq_map_nr;
>  unsigned long *irq_map;
> +sPAPRXive  *xive;
>  
>  bool cmd_line_caps[SPAPR_CAP_NUM];
>  sPAPRCapabilities def, eff, mig;
> diff --git a/include/hw/ppc/spapr_irq.h b/include/hw/ppc/spapr_irq.h
> index bd7301e6d9c6..23cdb51b879e 100644
> --- a/include/hw/ppc/spapr_irq.h
> +++ b/include/hw/ppc/spapr_irq.h
> @@ -13,6 +13,7 @@
>  /*
>   * IRQ range offsets per device type
>   */
> +#define SPAPR_IRQ_IPI        0x0
>  #define SPAPR_IRQ_EPOW       0x1000  /* XICS_IRQ_BASE offset */
>  #define SPAPR_IRQ_HOTPLUG    0x1001
>  #define SPAPR_IRQ_VIO        0x1100  /* 256 VIO devices */
> @@ -42,6 +43,7 @@ typedef struct sPAPRIrq {
>  
>  extern sPAPRIrq spapr_irq_xics;
>  extern sPAPRIrq spapr_irq_xics_legacy;
> +extern sPAPRIrq spapr_irq_xive;
>  
>  void spapr_irq_init(sPAPRMachineState *spapr, Error **errp);
>  int spapr_irq_claim(sPAPRMachineState *spapr, int irq, bool lsi, Error **errp);
> diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
> index f8b651de0ec9..0bf47ff9fa26 100644
> --- a/hw/ppc/spapr_irq.c
> +++ b/hw/ppc/spapr_irq.c
> @@ -12,6 +12,7 @@
>  #include "qemu/error-report.h"
>  #include "qapi/error.h"
>  #include "hw/ppc/spapr.h"
> +#include "hw/ppc/spapr_xive.h"
>  #include "hw/ppc/xics.h"
>  #include "sysemu/kvm.h"
>  
> @@ -205,6 +206,118 @@ sPAPRIrq spapr_irq_xics = {
>  .print_info  = spapr_irq_print_info_xics,
>  };
>  
> +/*
> + * XIVE IRQ backend.
> + */
> +static sPAPRXive *spapr_xive_create(sPAPRMachineState *spapr, int nr_irqs,
> +int nr_servers, Error **errp)
> +{
> +sPAPRXive *xive;
> +Error *local_err = NULL;
> +Object *obj;
> +uint32_t nr_ends = nr_servers << 3; /* 8 priority ENDs per CPU */
> +int i;
> +
> +/* TODO : use qdev_create() ? */

Ok, still waiting on this todo.

> +obj = object_new(TYPE_SPAPR_XIVE);
> +object_property_set_int(obj, nr_irqs, "nr-irqs", &error_abort);
> +object_property_set_int(obj, nr_ends, "nr-ends", &error_abort);
> +object_property_set_bool(obj, true, "realized", &local_err);
> +if (local_err) {
> +error_propagate(errp, local_err);
> +return NULL;
> +}
> +qdev_set_parent_bus(DEVICE(obj), sysbus_get_default());
> +xive = SPAPR_XIVE(obj);
> +
> +/* Enable the CPU IPIs */
> +for (i = 0; i < nr_servers; ++i) {
> +spapr_xive_irq_claim(xive, SPAPR_IRQ_IPI + i, false);
> +}
> +
> +return xive;
> +}
> +
> +static void spapr_irq_init_xive(sPAPRMachineState *spapr, Error **errp)
> +{
> +MachineState *machine = MACHINE(spapr);
> +sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
> +int nr_irqs = smc->irq->nr_irqs;
> +Error *local_err = NULL;
> +
> +/* KVM XIVE device not yet available */
> +if (kvm_enabled()) {
> +if (machine_kernel_irqchip_required(machine)) {
> +error_setg(errp, "kernel_irqchip requested. no KVM XIVE support");
> +return;
> +}
> +}
> +
> +spapr->xive = spapr_xive_create(spapr, nr_irqs,
> +spapr_max_server_number(spapr), &local_err);
> +if (local_err) {
> +error_propagate(errp, local_err);
> +return;
> +}
> +}
> +
> +static int spapr_irq_claim_xive(sPAPRMachineState *spapr, int irq, bool lsi,
> +Error **errp)
> +{
> +if (!spapr_xive_irq_claim(spapr->xive, irq, lsi)) {
> +error_setg(errp, "IRQ %d is invalid", irq);
> +return -1;
> +}
> + 

Re: [Qemu-devel] [PATCH v7 01/19] ppc/xive: add support for the END Event State Buffers

2018-12-09 Thread David Gibson
On Sun, Dec 09, 2018 at 08:45:52PM +0100, Cédric Le Goater wrote:
> The Event Notification Descriptor (END) XIVE structure also contains
> two Event State Buffers providing further coalescing of interrupts,
> one for the notification event (ESn) and one for the escalation events
> (ESe). A MMIO page is assigned for each to control the EOI through
> loads only. Stores are not allowed.
> 
> The END ESBs are modeled through an object resembling the 'XiveSource'.
> It is stateless as the END state bits are backed into the XiveEND
> structure under the XiveRouter and the MMIO accesses follow the same
> rules as for the XiveSource ESBs.
> 
> END ESBs are not supported by the Linux drivers, either on OPAL or on
> sPAPR. Nevertheless, the model provides a means to study the question
> in the future and validates the XIVE model a bit more.
> 
> Signed-off-by: Cédric Le Goater 
> ---
> 
>  Changes since v6:
> 
>  - removed the 'chip-id' field from XiveRouter
>  - introduced a 'block-id' field in XiveENDSource to lookup the XIVE
>END structure when doing a load in the MMIO ESB
>  - removed reset XiveENDSource handler
> 
>  include/hw/ppc/xive.h |  21 ++
>  hw/intc/xive.c| 160 +-
>  2 files changed, 179 insertions(+), 2 deletions(-)

Applied to ppc-for-4.0.

I had some thoughts about maybe-nicer arrangements of things here, but
nothing important enough to delay this (the things I'm mulling over
wouldn't break migration, so it's fixable later).

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson




Re: [Qemu-devel] [PATCH 2/3] mac_newworld: enable access to EDID data for the VGA device

2018-12-09 Thread David Gibson
On Fri, Dec 07, 2018 at 04:08:05PM +, Mark Cave-Ayland wrote:
> This is in preparation for some upcoming QEMU NDRV driver changes that pass
> display information from the host to the guest.
> 
> Signed-off-by: Mark Cave-Ayland 

This looks fine by my limited knowledge of this area.  I'm slightly
perturbed I can't see any existing examples in the tree of setting the
edid property from the machine.

> ---
>  hw/ppc/mac_newworld.c | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/ppc/mac_newworld.c b/hw/ppc/mac_newworld.c
> index 14273a123e..df0a2f03ff 100644
> --- a/hw/ppc/mac_newworld.c
> +++ b/hw/ppc/mac_newworld.c
> @@ -430,7 +430,10 @@ static void ppc_core99_init(MachineState *machine)
>  }
>  }
>  
> -pci_vga_init(pci_bus);
> +dev = qdev_create(BUS(pci_bus), "VGA");
> +qdev_prop_set_int32(dev, "addr", -1);
> +qdev_prop_set_bit(dev, "edid", true);
> +qdev_init_nofail(dev);
>  
>  if (graphic_depth != 15 && graphic_depth != 32 && graphic_depth != 8) {
>  graphic_depth = 15;

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson




Re: [Qemu-devel] [RFC PATCH 1/6] target/ppc: introduce get_fpr() and set_fpr() helpers for FP register access

2018-12-09 Thread David Gibson
On Fri, Dec 07, 2018 at 08:56:30AM +, Mark Cave-Ayland wrote:
> These helpers allow us to move FP register values to/from the specified
> TCGv_i64 argument.
> 
> To prevent FP helpers accessing the cpu_fpr array directly, add extra TCG
> temporaries as required.

It's not obvious to me why that's a desirable thing.  I'm assuming
it's somehow necessary for the stuff later in the series, but I think
we need a brief rationale here to explain why this isn't just adding
extra reg copies for the sake of it.

> 
> Signed-off-by: Mark Cave-Ayland 
> ---
>  target/ppc/translate.c |  10 +
>  target/ppc/translate/fp-impl.inc.c | 492 
> -
>  2 files changed, 392 insertions(+), 110 deletions(-)
> 
> diff --git a/target/ppc/translate.c b/target/ppc/translate.c
> index 2b37910248..1d4bf624a3 100644
> --- a/target/ppc/translate.c
> +++ b/target/ppc/translate.c
> @@ -6694,6 +6694,16 @@ static inline void gen_##name(DisasContext *ctx) \
>  GEN_TM_PRIV_NOOP(treclaim);
>  GEN_TM_PRIV_NOOP(trechkpt);
>  
> +static inline void get_fpr(TCGv_i64 dst, int regno)
> +{
> +tcg_gen_mov_i64(dst, cpu_fpr[regno]);
> +}
> +
> +static inline void set_fpr(int regno, TCGv_i64 src)
> +{
> +tcg_gen_mov_i64(cpu_fpr[regno], src);
> +}
> +
>  #include "translate/fp-impl.inc.c"
>  
>  #include "translate/vmx-impl.inc.c"
> diff --git a/target/ppc/translate/fp-impl.inc.c b/target/ppc/translate/fp-impl.inc.c
> index 08770ba9f5..923fb7550f 100644
> --- a/target/ppc/translate/fp-impl.inc.c
> +++ b/target/ppc/translate/fp-impl.inc.c
> @@ -34,24 +34,39 @@ static void gen_set_cr1_from_fpscr(DisasContext *ctx)
>  #define _GEN_FLOAT_ACB(name, op, op1, op2, isfloat, set_fprf, type)          \
>  static void gen_f##name(DisasContext *ctx)                                   \
>  {                                                                            \
> +    TCGv_i64 t0;                                                             \
> +    TCGv_i64 t1;                                                             \
> +    TCGv_i64 t2;                                                             \
> +    TCGv_i64 t3;                                                             \
>      if (unlikely(!ctx->fpu_enabled)) {                                       \
>          gen_exception(ctx, POWERPC_EXCP_FPU);                                \
>          return;                                                              \
>      }                                                                        \
> +    t0 = tcg_temp_new_i64();                                                 \
> +    t1 = tcg_temp_new_i64();                                                 \
> +    t2 = tcg_temp_new_i64();                                                 \
> +    t3 = tcg_temp_new_i64();                                                 \
>      gen_reset_fpstatus();                                                    \
> -    gen_helper_f##op(cpu_fpr[rD(ctx->opcode)], cpu_env,                      \
> -                     cpu_fpr[rA(ctx->opcode)],                               \
> -                     cpu_fpr[rC(ctx->opcode)], cpu_fpr[rB(ctx->opcode)]);    \
> +    get_fpr(t0, rA(ctx->opcode));                                            \
> +    get_fpr(t1, rC(ctx->opcode));                                            \
> +    get_fpr(t2, rB(ctx->opcode));                                            \
> +    gen_helper_f##op(t3, cpu_env, t0, t1, t2);                               \
> +    set_fpr(rD(ctx->opcode), t3);                                            \
>      if (isfloat) {                                                           \
> -        gen_helper_frsp(cpu_fpr[rD(ctx->opcode)], cpu_env,                   \
> -                        cpu_fpr[rD(ctx->opcode)]);                           \
> +        get_fpr(t0, rD(ctx->opcode));                                        \
> +        gen_helper_frsp(t3, cpu_env, t0);                                    \
> +        set_fpr(rD(ctx->opcode), t3);                                        \
>      }                                                                        \
>      if (set_fprf) {                                                          \
> -        gen_compute_fprf_float64(cpu_fpr[rD(ctx->opcode)]);                  \
> +        gen_compute_fprf_float64(t3);                                        \
>      }                                                                        \
>      if (unlikely(Rc(ctx->opcode) != 0)) {                                    \
>          gen_set_cr1_from_fpscr(ctx);                                         \
>      }                                                                        \
> +    tcg_temp_free_i64(t0);

Re: [Qemu-devel] [PATCH v7 06/19] spapr/xive: use the VCPU id as a NVT identifier

2018-12-09 Thread David Gibson
On Sun, Dec 09, 2018 at 08:45:57PM +0100, Cédric Le Goater wrote:
> The IVPE scans the O/S CAM line of the XIVE thread interrupt contexts
> to find a matching Notification Virtual Target (NVT) among the NVTs
> dispatched on the HW processor threads.
> 
> On a real system, the thread interrupt contexts are updated by the
> hypervisor when a Virtual Processor is scheduled to run on a HW
> thread. Under QEMU, the model will emulate the same behavior by
> hardwiring the NVT identifier in the thread context registers at
> reset.
> 
> The NVT identifier used by the sPAPRXive model is the VCPU id. The END
> identifier is also derived from the VCPU id. A set of helpers doing
> the conversion between identifiers are provided for the hcalls
> configuring the sources and the ENDs.
> 
> The model does not need a NVT table but the XiveRouter NVT operations
> are provided to perform some extra checks in the routing algorithm.
> 
> Signed-off-by: Cédric Le Goater 

Applied.
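
For readers following along, the identifier scheme described above can
be sketched as standalone C. The function names are hypothetical and
this is not the code from the patch. NVT_BASE mirrors
SPAPR_XIVE_NVT_BASE, and the END layout (8 consecutive ENDs per vCPU)
is an assumption based on the "8 priority ENDs per CPU" sizing used
elsewhere in the series.

```c
#include <assert.h>
#include <stdint.h>

#define NVT_BASE  0x400  /* same value as SPAPR_XIVE_NVT_BASE */
#define NR_PRIO   8      /* assumed: 8 priority ENDs per vCPU */

/* vCPU id -> NVT index, as hardwired in the thread context at reset. */
static uint32_t vcpu_to_nvt_idx(uint32_t vcpu_id)
{
    return NVT_BASE + vcpu_id;
}

/* NVT index -> vCPU id ("target"), the inverse mapping. */
static uint32_t nvt_idx_to_vcpu(uint32_t nvt_idx)
{
    assert(nvt_idx >= NVT_BASE);
    return nvt_idx - NVT_BASE;
}

/* Assumed END layout: NR_PRIO consecutive ENDs for each vCPU. */
static uint32_t vcpu_prio_to_end_idx(uint32_t vcpu_id, uint8_t prio)
{
    assert(prio < NR_PRIO);
    return vcpu_id * NR_PRIO + prio;
}
```

Since the mapping is a fixed offset, the two NVT conversions are exact
inverses and no table is needed to go from one to the other.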

> ---
> 
>  Changes since v6:
> 
>  - simplified the prototypes of helpers
>  - introduced an assert in set_nvt() method
> 
>  hw/intc/spapr_xive.c | 56 +++-
>  1 file changed, 55 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
> index eef5830d45c6..3ade419fdbb1 100644
> --- a/hw/intc/spapr_xive.c
> +++ b/hw/intc/spapr_xive.c
> @@ -26,6 +26,26 @@
>  #define SPAPR_XIVE_VC_BASE   0x0006010000000000ull
>  #define SPAPR_XIVE_TM_BASE   0x0006030203180000ull
>  
> +/*
> + * The allocation of VP blocks is a complex operation in OPAL and the
> + * VP identifiers have a relation with the number of HW chips, the
> + * size of the VP blocks, VP grouping, etc. The QEMU sPAPR XIVE
> + * controller model does not have the same constraints and can use a
> + * simple mapping scheme of the CPU vcpu_id
> + *
> + * These identifiers are never returned to the OS.
> + */
> +
> +#define SPAPR_XIVE_NVT_BASE 0x400
> +
> +/*
> + * sPAPR NVT and END indexing helpers
> + */
> +static uint32_t spapr_xive_nvt_to_target(uint8_t nvt_blk, uint32_t nvt_idx)
> +{
> +return nvt_idx - SPAPR_XIVE_NVT_BASE;
> +}
> +
>  /*
>   * On sPAPR machines, use a simplified output for the XIVE END
>   * structure dumping only the information related to the OS EQ.
> @@ -40,7 +60,8 @@ static void spapr_xive_end_pic_print_info(sPAPRXive *xive, XiveEND *end,
>  uint32_t nvt = GETFIELD_BE32(END_W6_NVT_INDEX, end->w6);
>  uint8_t priority = GETFIELD_BE32(END_W7_F0_PRIORITY, end->w7);
>  
> -monitor_printf(mon, "%3d/%d % 6d/%5d ^%d", nvt,
> +monitor_printf(mon, "%3d/%d % 6d/%5d ^%d",
> +   spapr_xive_nvt_to_target(0, nvt),
> priority, qindex, qentries, qgen);
>  
>  xive_end_queue_pic_print_info(end, 6, mon);
> @@ -246,6 +267,37 @@ static int spapr_xive_write_end(XiveRouter *xrtr, uint8_t end_blk,
>  return 0;
>  }
>  
> +static int spapr_xive_get_nvt(XiveRouter *xrtr,
> +  uint8_t nvt_blk, uint32_t nvt_idx, XiveNVT *nvt)
> +{
> +uint32_t vcpu_id = spapr_xive_nvt_to_target(nvt_blk, nvt_idx);
> +PowerPCCPU *cpu = spapr_find_cpu(vcpu_id);
> +
> +if (!cpu) {
> +/* TODO: should we assert() if we can find a NVT ? */
> +return -1;
> +}
> +
> +/*
> + * sPAPR does not maintain a NVT table. Return that the NVT is
> + * valid if we have found a matching CPU
> + */
> +nvt->w0 = cpu_to_be32(NVT_W0_VALID);
> +return 0;
> +}
> +
> +static int spapr_xive_write_nvt(XiveRouter *xrtr, uint8_t nvt_blk,
> +uint32_t nvt_idx, XiveNVT *nvt,
> +uint8_t word_number)
> +{
> +/*
> + * We don't need to write back to the NVTs because the sPAPR
> + * machine should never hit a non-scheduled NVT. It should never
> + * get called.
> + */
> +g_assert_not_reached();
> +}
> +
>  static const VMStateDescription vmstate_spapr_xive_end = {
>  .name = TYPE_SPAPR_XIVE "/end",
>  .version_id = 1,
> @@ -308,6 +360,8 @@ static void spapr_xive_class_init(ObjectClass *klass, 
> void *data)
>  xrc->get_eas = spapr_xive_get_eas;
>  xrc->get_end = spapr_xive_get_end;
>  xrc->write_end = spapr_xive_write_end;
> +xrc->get_nvt = spapr_xive_get_nvt;
> +xrc->write_nvt = spapr_xive_write_nvt;
>  }
>  
>  static const TypeInfo spapr_xive_info = {
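
For readers following the NVT indexing in the quoted hunk: the mapping is a fixed offset, so the conversion and its inverse can be sketched as below. This is a minimal standalone sketch, not part of the patch. Only SPAPR_XIVE_NVT_BASE and the subtraction come from the quoted code; nvt_from_vcpu() and nvt_to_vcpu_checked() are hypothetical helpers, and the range check is an assumption added to show where an index below the base would otherwise wrap around.

```c
#include <stdbool.h>
#include <stdint.h>

#define SPAPR_XIVE_NVT_BASE 0x400 /* value taken from the quoted patch */

/* Hypothetical inverse of spapr_xive_nvt_to_target(): vcpu_id -> NVT index. */
static uint32_t nvt_from_vcpu(uint32_t vcpu_id)
{
    return SPAPR_XIVE_NVT_BASE + vcpu_id;
}

/*
 * Same subtraction as the quoted helper, with an explicit guard: uint32_t
 * arithmetic wraps, so an index below the base would otherwise turn into
 * a very large vcpu_id instead of an error.
 */
static bool nvt_to_vcpu_checked(uint32_t nvt_idx, uint32_t *vcpu_id)
{
    if (nvt_idx < SPAPR_XIVE_NVT_BASE) {
        return false;
    }
    *vcpu_id = nvt_idx - SPAPR_XIVE_NVT_BASE;
    return true;
}
```

In the quoted code the failed CPU lookup plays the role of this guard, since a wrapped vcpu_id is unlikely to match any CPU.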

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson




Re: [Qemu-devel] [PATCH v7 03/19] ppc/xive: introduce a simplified XIVE presenter

2018-12-09 Thread David Gibson
On Sun, Dec 09, 2018 at 08:45:54PM +0100, Cédric Le Goater wrote:
> The last sub-engine of the XIVE architecture is the Interrupt
> Virtualization Presentation Engine (IVPE). On HW, the IVRE and the
> IVPE share elements, the Power Bus interface (CQ), the routing table
> descriptors, and they can be combined in the same HW logic. We do the
> same in QEMU and combine both engines in the XiveRouter for
> simplicity.
> 
> When the IVRE has completed its job of matching an event source with a
> Notification Virtual Target (NVT) to notify, it forwards the event
> notification to the IVPE sub-engine. The IVPE scans the thread
> interrupt contexts of the Notification Virtual Targets (NVT)
> dispatched on the HW processor threads and if a match is found, it
> signals the thread. If not, the IVPE escalates the notification to
> some other targets and records the notification in a backlog queue.
> 
> The IVPE maintains the thread interrupt context state for each of its
> NVTs not dispatched on HW processor threads in the Notification
> Virtual Target table (NVTT).
> 
> The model currently only supports single NVT notifications.
> 
> Signed-off-by: Cédric Le Goater 

Applied.

I think the tctx_word2() should have the byteswap, rather than having
it in the callers, but that can be fixed later.
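
To illustrate the byteswap point, a helper that returns WORD2 already converted to CPU byte order could look like the sketch below. It is standalone and hedged: tctx_word2_cpu() and load_be32() are hypothetical names, and TM_WORD2 = 0x8 is an assumption (the quoted header excerpt only shows TM_WORD0). The byte-wise load also avoids casting a uint8_t array to uint32_t *.

```c
#include <stdint.h>

#define TM_QW1_OS 0x010 /* from the quoted xive_regs.h */
#define TM_WORD2  0x8   /* assumed offset, not shown in the excerpt */

/* Assemble a big-endian 32-bit value byte by byte (host-endian neutral). */
static uint32_t load_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Hypothetical helper: WORD2 of a ring, already in CPU byte order. */
static uint32_t tctx_word2_cpu(const uint8_t *ring)
{
    return load_be32(&ring[TM_WORD2]);
}
```

With the swap inside the helper, callers can compare the result against a CAM line directly and no longer need their own be32_to_cpu().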

> ---
> 
>  Changes since v6 :
> 
>  - removed HW CAM line setting and use as it is only useful for PowerNV
>  - made use of xive_tctx_word2() helper
>  - made use of GETFIELD_BE32() to compare CAM lines
>  - fixed initialization of XiveTCTXMatch
> 
>  include/hw/ppc/xive.h  |  14 +++
>  include/hw/ppc/xive_regs.h |  24 +
>  hw/intc/xive.c | 185 +
>  3 files changed, 223 insertions(+)
> 
> diff --git a/include/hw/ppc/xive.h b/include/hw/ppc/xive.h
> index 1e823a4c64e9..19309d1d65d1 100644
> --- a/include/hw/ppc/xive.h
> +++ b/include/hw/ppc/xive.h
> @@ -325,6 +325,10 @@ typedef struct XiveRouterClass {
>                     XiveEND *end);
>      int (*write_end)(XiveRouter *xrtr, uint8_t end_blk, uint32_t end_idx,
>                       XiveEND *end, uint8_t word_number);
> +    int (*get_nvt)(XiveRouter *xrtr, uint8_t nvt_blk, uint32_t nvt_idx,
> +                   XiveNVT *nvt);
> +    int (*write_nvt)(XiveRouter *xrtr, uint8_t nvt_blk, uint32_t nvt_idx,
> +                     XiveNVT *nvt, uint8_t word_number);
>  } XiveRouterClass;
>  
>  void xive_eas_pic_print_info(XiveEAS *eas, uint32_t lisn, Monitor *mon);
> @@ -335,6 +339,11 @@ int xive_router_get_end(XiveRouter *xrtr, uint8_t end_blk, uint32_t end_idx,
>                          XiveEND *end);
>  int xive_router_write_end(XiveRouter *xrtr, uint8_t end_blk, uint32_t end_idx,
>                            XiveEND *end, uint8_t word_number);
> +int xive_router_get_nvt(XiveRouter *xrtr, uint8_t nvt_blk, uint32_t nvt_idx,
> +                        XiveNVT *nvt);
> +int xive_router_write_nvt(XiveRouter *xrtr, uint8_t nvt_blk, uint32_t nvt_idx,
> +                          XiveNVT *nvt, uint8_t word_number);
> +
>  
>  /*
>   * XIVE END ESBs
> @@ -411,4 +420,9 @@ extern const MemoryRegionOps xive_tm_ops;
>  
>  void xive_tctx_pic_print_info(XiveTCTX *tctx, Monitor *mon);
>  
> +static inline uint32_t xive_nvt_cam_line(uint8_t nvt_blk, uint32_t nvt_idx)
> +{
> +    return (nvt_blk << 19) | nvt_idx;
> +}
> +
>  #endif /* PPC_XIVE_H */
> diff --git a/include/hw/ppc/xive_regs.h b/include/hw/ppc/xive_regs.h
> index ede3d04c5eda..85557e730cd8 100644
> --- a/include/hw/ppc/xive_regs.h
> +++ b/include/hw/ppc/xive_regs.h
> @@ -186,4 +186,28 @@ typedef struct XiveEND {
>  #define GETFIELD_BE32(m, v)       GETFIELD(m, be32_to_cpu(v))
>  #define SETFIELD_BE32(m, v, val)  cpu_to_be32(SETFIELD(m, be32_to_cpu(v), val))
>  
> +/* Notification Virtual Target (NVT) */
> +typedef struct XiveNVT {
> +        uint32_t        w0;
> +#define NVT_W0_VALID             PPC_BIT32(0)
> +        uint32_t        w1;
> +        uint32_t        w2;
> +        uint32_t        w3;
> +        uint32_t        w4;
> +        uint32_t        w5;
> +        uint32_t        w6;
> +        uint32_t        w7;
> +        uint32_t        w8;
> +#define NVT_W8_GRP_VALID         PPC_BIT32(0)
> +        uint32_t        w9;
> +        uint32_t        wa;
> +        uint32_t        wb;
> +        uint32_t        wc;
> +        uint32_t        wd;
> +        uint32_t        we;
> +        uint32_t        wf;
> +} XiveNVT;
> +
> +#define xive_nvt_is_valid(nvt)    (be32_to_cpu((nvt)->w0) & NVT_W0_VALID)
> +
>  #endif /* PPC_XIVE_REGS_H */
> diff --git a/hw/intc/xive.c b/hw/intc/xive.c
> index 2615d16b7437..3eecffe99b3a 100644
> --- a/hw/intc/xive.c
> +++ b/hw/intc/xive.c
> @@ -983,6 +983,183 @@ int xive_router_write_end(XiveRouter *xrtr, uint8_t 
> end_blk, uint32_t end_idx,
> return xrc->write_end(xrtr, end_blk, end_idx, end, word_number);
>  }
>  
> +int xive_router_get_nvt(XiveRouter *xrtr, uint8_t nvt_blk, uint32_t nvt_idx,
> +

Re: [Qemu-devel] [PATCH v7 02/19] ppc/xive: introduce the XIVE interrupt thread context

2018-12-09 Thread David Gibson
On Sun, Dec 09, 2018 at 08:45:53PM +0100, Cédric Le Goater wrote:
> Each POWER9 processor chip has a XIVE presenter that can generate four
> different exceptions to its threads:
> 
>   - hypervisor exception,
>   - O/S exception
>   - Event-Based Branch (EBB)
>   - msgsnd (doorbell).
> 
> Each exception has a state independent from the others called a Thread
> Interrupt Management context. This context is a set of registers which
> lets the thread handle priority management and interrupt acknowledgment
> among other things. The most important ones being :
> 
>   - Interrupt Priority Register  (PIPR)
>   - Interrupt Pending Buffer (IPB)
>   - Current Processor Priority   (CPPR)
>   - Notification Source Register (NSR)
> 
> These registers are accessible through a specific MMIO region, called
> the Thread Interrupt Management Area (TIMA), four aligned pages, each
> exposing a different view of the registers. First page (page address
> ending in 0b00) gives access to the entire context and is reserved for
> the ring 0 view for the physical thread context. The second (page
> address ending in 0b01) is for the hypervisor, ring 1 view. The third
> (page address ending in 0b10) is for the operating system, ring 2
> view. The fourth (page address ending in 0b11) is for user level, ring
> 3 view.
> 
> The thread interrupt context is modeled with a XiveTCTX object
> containing the values of the different exception registers. The TIMA
> region is mapped at the same address for each CPU.
> 
> Signed-off-by: Cédric Le Goater 
> Reviewed-by: David Gibson 

Applied.
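
As a side note on the four TIMA views described in the commit message: with one 64K page per view, the view is selected by the two address bits just above the page offset. A minimal sketch, assuming 64K pages (TM_SHIFT is 16 in the quoted header). tima_page() and tima_can_access() are hypothetical names, and the access rule only encodes the "All rings / Ring 0..2 / Ring 0..1" comments from the quoted register offsets, not the finer per-register checks a real model would need.

```c
#include <stdbool.h>
#include <stdint.h>

#define TM_SHIFT            16   /* from the quoted xive_regs.h */
#define XIVE_TM_HW_PAGE     0x0
#define XIVE_TM_HV_PAGE     0x1
#define XIVE_TM_OS_PAGE     0x2
#define XIVE_TM_USER_PAGE   0x3

/* Which of the four views an offset into the TIMA region falls in. */
static unsigned tima_page(uint64_t offset)
{
    return (offset >> TM_SHIFT) & 0x3;
}

/*
 * Coarse privilege rule per quad-word (QW0..QW3): the highest ring
 * number still allowed to touch it. Lower page number means more
 * privileged, so a view may access a QW if page <= max_ring[qw].
 */
static bool tima_can_access(unsigned page, unsigned reg_offset)
{
    static const unsigned max_ring[4] = { 3, 2, 1, 1 };
    unsigned qw = (reg_offset >> 4) & 0x3; /* 0x10 bytes per ring */

    return page <= max_ring[qw];
}
```

For example, an access at TIMA offset 0x20010 lands in the OS view and targets the OS quad-word, which the rule allows; the same register offset through the user view is rejected.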

> ---
> 
>  Changes since v6
> 
>  - introduced a xive_tctx_word2() helper to extract TM_WORD2 of a ring.
> 
>  include/hw/ppc/xive.h  |  44 
>  include/hw/ppc/xive_regs.h |  82 +++
>  hw/intc/xive.c | 424 +
>  3 files changed, 550 insertions(+)
> 
> diff --git a/include/hw/ppc/xive.h b/include/hw/ppc/xive.h
> index 014f64aa98f6..1e823a4c64e9 100644
> --- a/include/hw/ppc/xive.h
> +++ b/include/hw/ppc/xive.h
> @@ -367,4 +367,48 @@ typedef struct XiveENDSource {
>  void xive_end_pic_print_info(XiveEND *end, uint32_t end_idx, Monitor *mon);
>  void xive_end_queue_pic_print_info(XiveEND *end, uint32_t width, Monitor *mon);
>  
> +/*
> + * XIVE Thread interrupt Management (TM) context
> + */
> +
> +#define TYPE_XIVE_TCTX "xive-tctx"
> +#define XIVE_TCTX(obj) OBJECT_CHECK(XiveTCTX, (obj), TYPE_XIVE_TCTX)
> +
> +/*
> + * XIVE Thread interrupt Management register rings :
> + *
> + *   QW-0  User       event-based exception state
> + *   QW-1  O/S        OS context for priority management, interrupt acks
> + *   QW-2  Pool       hypervisor pool context for virtual processors dispatched
> + *   QW-3  Physical   physical thread context and security context
> + */
> +#define XIVE_TM_RING_COUNT      4
> +#define XIVE_TM_RING_SIZE       0x10
> +
> +typedef struct XiveTCTX {
> +    DeviceState parent_obj;
> +
> +    CPUState    *cs;
> +    qemu_irq    output;
> +
> +    uint8_t     regs[XIVE_TM_RING_COUNT * XIVE_TM_RING_SIZE];
> +} XiveTCTX;
> +
> +/*
> + * XIVE Thread Interrupt Management Area (TIMA)
> + *
> + * This region gives access to the registers of the thread interrupt
> + * management context. It is four pages wide, each page providing a
> + * different view of the registers. The page with the lower offset is
> + * the most privileged and gives access to the entire context.
> + */
> +#define XIVE_TM_HW_PAGE         0x0
> +#define XIVE_TM_HV_PAGE         0x1
> +#define XIVE_TM_OS_PAGE         0x2
> +#define XIVE_TM_USER_PAGE       0x3
> +extern const MemoryRegionOps xive_tm_ops;
> +
> +void xive_tctx_pic_print_info(XiveTCTX *tctx, Monitor *mon);
> +
>  #endif /* PPC_XIVE_H */
> diff --git a/include/hw/ppc/xive_regs.h b/include/hw/ppc/xive_regs.h
> index 3c0ebad18b69..ede3d04c5eda 100644
> --- a/include/hw/ppc/xive_regs.h
> +++ b/include/hw/ppc/xive_regs.h
> @@ -23,6 +23,88 @@
>  #define XIVE_SRCNO_INDEX(srcno) ((srcno) & 0x0fffffff)
>  #define XIVE_SRCNO(blk, idx)    ((uint32_t)(blk) << 28 | (idx))
>  
> +#define TM_SHIFT                16
> +
> +/* TM register offsets */
> +#define TM_QW0_USER             0x000 /* All rings */
> +#define TM_QW1_OS               0x010 /* Ring 0..2 */
> +#define TM_QW2_HV_POOL          0x020 /* Ring 0..1 */
> +#define TM_QW3_HV_PHYS          0x030 /* Ring 0..1 */
> +
> +/* Byte offsets inside a QW             QW0 QW1 QW2 QW3 */
> +#define TM_NSR                  0x0  /*  +   +   -   +  */
> +#define TM_CPPR                 0x1  /*  -   +   -   +  */
> +#define TM_IPB                  0x2  /*  -   +   +   +  */
> +#define TM_LSMFB                0x3  /*  -   +   +   +  */
> +#define TM_ACK_CNT              0x4  /*  -   +   -   -  */
> +#define TM_INC                  0x5  /*  -   +   -   +  */
> +#define TM_AGE                  0x6  /*  -   +   -   +  */
> +#define TM_PIPR                 0x7  /*  -   +   -   +  */
> +
> +#define TM_WORD0                0x0
> 

Re: [Qemu-devel] [PATCH v7 17/19] spapr: Add a pseries-4.0 machine type

2018-12-09 Thread David Gibson
On Mon, Dec 10, 2018 at 09:05:06AM +1100, Benjamin Herrenschmidt wrote:
> On Sun, 2018-12-09 at 20:46 +0100, Cédric Le Goater wrote:
> > Signed-off-by: Cédric Le Goater 
> > ---
> 
> If you're going to do that, can we include large decrementer in there
> too ? (patches from Suraj in my tree but they might need a bit of
> massaging).

We don't need to worry about that here.  The machine type's not
considered finalized until the release, so as long as you get the
large dec stuff in before the 4.0 release, it's fine.

Looks like Eduardo and others are probably doing a big batch machine
type update via the machine tree.  That will probably conflict, but it
should be a fairly easy one for me to sort out when the time comes.

> 
> >  include/hw/compat.h |  3 +++
> >  hw/ppc/spapr.c  | 25 ++---
> >  2 files changed, 25 insertions(+), 3 deletions(-)
> > 
> > diff --git a/include/hw/compat.h b/include/hw/compat.h
> > index 6f4d5fc64704..70958328fe7a 100644
> > --- a/include/hw/compat.h
> > +++ b/include/hw/compat.h
> > @@ -1,6 +1,9 @@
> >  #ifndef HW_COMPAT_H
> >  #define HW_COMPAT_H
> >  
> > > +#define HW_COMPAT_3_1 \
> > > +    /* empty */
> > > +
> > >  #define HW_COMPAT_3_0 \
> > >      /* empty */
> >  
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index fa41927d95dd..4012ebd794a4 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -3971,19 +3971,38 @@ static const TypeInfo spapr_machine_info = {
> >  }\
> >  type_init(spapr_machine_register_##suffix)
> >  
> > - /*
> > +/*
> > + * pseries-4.0
> > + */
> > > +static void spapr_machine_4_0_instance_options(MachineState *machine)
> > > +{
> > > +}
> > > +
> > > +static void spapr_machine_4_0_class_options(MachineClass *mc)
> > > +{
> > > +    /* Defaults for the latest behaviour inherited from the base class */
> > > +}
> > > +
> > > +DEFINE_SPAPR_MACHINE(4_0, "4.0", true);
> > > +
> > > +/*
> > >   * pseries-3.1
> > >   */
> > > +#define SPAPR_COMPAT_3_1  \
> > > +    HW_COMPAT_3_1
> > > +
> > >  static void spapr_machine_3_1_instance_options(MachineState *machine)
> > >  {
> > > +    spapr_machine_4_0_instance_options(machine);
> > >  }
> > >  
> > >  static void spapr_machine_3_1_class_options(MachineClass *mc)
> > >  {
> > > -    /* Defaults for the latest behaviour inherited from the base class */
> > > +    spapr_machine_4_0_class_options(mc);
> > > +    SET_MACHINE_COMPAT(mc, SPAPR_COMPAT_3_1);
> > >  }
> > >  
> > > -DEFINE_SPAPR_MACHINE(3_1, "3.1", true);
> > > +DEFINE_SPAPR_MACHINE(3_1, "3.1", false);
> >  
> >  /*
> >   * pseries-3.0
> 





[Qemu-devel] possible bug hw/adc/stm32f2xx_adc.c

2018-12-09 Thread Seth K
Thank you all for help with my last patch. I found one more entry in my
notes that could be a bug, or could be a misunderstanding on my part.

The memory map in DocID15818 (Rev 15) datasheet says:
ADC1 - ADC2 - ADC3:  0x40012000-0x400123FF

That suggests a size of 0x400 (they share that range?)

Lines 279/280 of hw/adc/stm32f2xx_adc.c seem to use 0xFF:

    memory_region_init_io(&s->mmio, obj, &stm32f2xx_adc_ops, s,
                          TYPE_STM32F2XX_ADC, 0xFF);

Probably just confusion on my part, but I thought I would mention it just in case.
Thanks,
Seth

PS: Sorry if you are all the wrong people to email about this ADC...
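
A quick arithmetic check of the numbers above, as a sketch only: an inclusive range start..end spans end - start + 1 bytes, and a region created with size 0xFF covers offsets 0x00..0xFE. range_size() and last_offset() are hypothetical helpers; whether the three ADCs share the 0x400 range or split it is not something the quoted datasheet line settles.

```c
#include <stdint.h>

/* Size in bytes of an inclusive address range [start, end]. */
static uint64_t range_size(uint64_t start, uint64_t end)
{
    return end - start + 1;
}

/* Last valid offset inside a region of the given size. */
static uint64_t last_offset(uint64_t size)
{
    return size - 1;
}
```

So the datasheet range works out to 0x400 bytes, while a size argument of 0xFF stops one byte short of a 0x100 block.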


Re: [Qemu-devel] [PATCH v3 2/5] util: introduce threaded workqueue

2018-12-09 Thread Xiao Guangrong




On 12/5/18 1:16 AM, Paolo Bonzini wrote:
> On 04/12/18 16:49, Christophe de Dinechin wrote:
>>> Linux and QEMU's own qht work just fine with compile-time directives.
>>
>> Wouldn’t it work fine without any compile-time directive at all?
>
> Yes, that's what I meant.  Though there are certainly cases in which the
> difference without proper cacheline alignment is an order of magnitude
> less throughput or something like that; it would certainly be noticeable.
>
>>> I don't think lock-free lists are easier.  Bitmaps smaller than 64
>>> elements are both faster and easier to manage.
>>
>> I believe that this is only true if you use a linked list for both freelist
>> management and for thread notification (i.e. to replace the bitmaps).
>> However, if you use an atomic list only for the free list, and keep
>> bitmaps for signaling, then performance is at least equal, often better.
>> Plus you get the added benefit of having a thread-safe API, i.e.
>> something that is truly lock-free.
>>
>> I did a small experiment to test / prove this. Last commit on branch:
>> https://github.com/c3d/recorder/commits/181122-xiao_guangdong_introduce-threaded-workqueue
>> Take with a grain of salt, microbenchmarks are always suspect ;-)
>>
>> The code in “thread_test.c” includes Xiao’s code with two variations,
>> plus some testing code lifted from the flight recorder library.
>> 1. The FREE_LIST variation (sl_test) is what I would like to propose.
>> 2. The BITMAP variation (bm_test) is the baseline
>> 3. The DOUBLE_LIST variation (ll_test) is the slow double-list approach
>>
>> To run it, you need to do “make opt-test”, then run “test_script”
>> which outputs a CSV file. The summary of my findings testing on
>> a ThinkPad, a Xeon machine and a MacBook is here:
>> https://imgur.com/a/4HmbB9K
>>
>> Overall, the proposed approach:
>>
>> - makes the API thread safe and lock free, addressing the one
>> drawback that Xiao was mentioning.
>>
>> - delivers up to 30% more requests on the Macbook, while being
>> “within noise” (sometimes marginally better) for the other two.
>> I suspect an optimization opportunity found by clang, because
>> the Macbook delivers really high numbers.
>>
>> - spends less time blocking when all threads are busy, which
>> accounts for the higher number of client loops.
>>
>> If you think that makes sense, then either Xiao can adapt the code
>> from the branch above, or I can send a follow-up patch.
>
> Having a follow-up patch would be best I think.  Thanks for
> experimenting with this, it's always fun stuff. :)


Yup, Christophe, please post the follow-up patches and add yourself
to the author list if you like. I am looking forward to it. :)

Thanks!
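
For context, the sub-64-entry bitmap approach discussed in this thread can be sketched in C11 as follows. This is a minimal illustration with hypothetical names (slot_bitmap, slot_alloc, slot_free); it is neither the code from the patch nor from Christophe's branch, and __builtin_ctzll assumes a GCC or Clang compiler.

```c
#include <stdatomic.h>
#include <stdint.h>

/* One bit per slot; a set bit means the slot is free. */
typedef struct {
    _Atomic uint64_t free_bits;
} slot_bitmap;

static void slot_bitmap_init(slot_bitmap *b, unsigned nslots)
{
    uint64_t mask = nslots >= 64 ? UINT64_MAX
                                 : ((UINT64_C(1) << nslots) - 1);
    atomic_init(&b->free_bits, mask);
}

/* Claim the lowest free slot, or return -1 if none is free. */
static int slot_alloc(slot_bitmap *b)
{
    uint64_t old = atomic_load(&b->free_bits);

    while (old != 0) {
        int idx = __builtin_ctzll(old);
        uint64_t desired = old & ~(UINT64_C(1) << idx);

        /* On failure, 'old' is refreshed and the loop retries. */
        if (atomic_compare_exchange_weak(&b->free_bits, &old, desired)) {
            return idx;
        }
    }
    return -1;
}

/* Give a slot back; the caller must own it. */
static void slot_free(slot_bitmap *b, int idx)
{
    atomic_fetch_or(&b->free_bits, UINT64_C(1) << idx);
}
```

The whole state fits in one 64-bit word, which is why this variant is simple to reason about; a free list removes the 64-slot ceiling at the cost of more delicate memory management.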






Re: [Qemu-devel] [BUG]Unassigned mem write during pci device hot-plug

2018-12-09 Thread xuyandong
On Sat, Dec 08, 2018 at 11:58:59AM +, xuyandong wrote:
> > Hi all,
> >
> >
> >
> > In our test, we configured a VM with several pci-bridges and a
> > virtio-net NIC attached to bus 4.
> >
> > After the VM starts up, we ping this NIC from the host to check that it is
> > working normally. Then we hot-add PCI devices to this VM on bus 0.
> >
> > We found that the virtio-net NIC on bus 4 occasionally stops working (cannot
> > connect), because kicking the virtio backend fails with the error below:
> >
> > Unassigned mem write fc803004 = 0x1
> >
> >
> >
> > memory-region: pci_bridge_pci
> >
> >   - (prio 0, RW): pci_bridge_pci
> >
> > fc80-fc803fff (prio 1, RW): virtio-pci
> >
> >   fc80-fc800fff (prio 0, RW):
> > virtio-pci-common
> >
> >   fc801000-fc801fff (prio 0, RW): virtio-pci-isr
> >
> >   fc802000-fc802fff (prio 0, RW):
> > virtio-pci-device
> >
> >   fc803000-fc803fff (prio 0, RW):
> > virtio-pci-notify  <- io mem unassigned
> >
> >   …
> >
> >
> >
> > We caught an unexpected address change when this problem happened,
> > as shown below:
> >
> > Before pci_bridge_update_mappings:
> >
> >   fc00-fc1f (prio 1, RW): alias
> > pci_bridge_pref_mem @pci_bridge_pci fc00-fc1f
> >
> >   fc20-fc3f (prio 1, RW): alias
> > pci_bridge_pref_mem @pci_bridge_pci fc20-fc3f
> >
> >   fc40-fc5f (prio 1, RW): alias
> > pci_bridge_pref_mem @pci_bridge_pci fc40-fc5f
> >
> >   fc60-fc7f (prio 1, RW): alias
> > pci_bridge_pref_mem @pci_bridge_pci fc60-fc7f
> >
> >   fc80-fc9f (prio 1, RW): alias
> > pci_bridge_pref_mem @pci_bridge_pci fc80-fc9f
> > <- correct Address Space
> >
> >   fca0-fcbf (prio 1, RW): alias
> > pci_bridge_pref_mem @pci_bridge_pci fca0-fcbf
> >
> >   fcc0-fcdf (prio 1, RW): alias
> > pci_bridge_pref_mem @pci_bridge_pci fcc0-fcdf
> >
> >   fce0-fcff (prio 1, RW): alias
> > pci_bridge_pref_mem @pci_bridge_pci fce0-fcff
> >
> >
> >
> > After pci_bridge_update_mappings:
> >
> >   fda0-fdbf (prio 1, RW): alias
> > pci_bridge_mem @pci_bridge_pci fda0-fdbf
> >
> >   fdc0-fddf (prio 1, RW): alias
> > pci_bridge_mem @pci_bridge_pci fdc0-fddf
> >
> >   fde0-fdff (prio 1, RW): alias
> > pci_bridge_mem @pci_bridge_pci fde0-fdff
> >
> >   fe00-fe1f (prio 1, RW): alias
> > pci_bridge_mem @pci_bridge_pci fe00-fe1f
> >
> >   fe20-fe3f (prio 1, RW): alias
> > pci_bridge_mem @pci_bridge_pci fe20-fe3f
> >
> >   fe40-fe5f (prio 1, RW): alias
> > pci_bridge_mem @pci_bridge_pci fe40-fe5f
> >
> >   fe60-fe7f (prio 1, RW): alias
> > pci_bridge_mem @pci_bridge_pci fe60-fe7f
> >
> >   fe80-fe9f (prio 1, RW): alias
> > pci_bridge_mem @pci_bridge_pci fe80-fe9f
> >
> >   fc80-fc80 (prio 1, RW): alias 
> > pci_bridge_pref_mem
> > @pci_bridge_pci fc80-fc80   <- Exceptional Address Space
> 
> This one is empty though right?
> 
> >
> >
> > We have figured out why the address takes this value. According to the
> > PCI spec, a PCI driver can determine the size of a BAR by first writing
> > 0xffffffff to the register and then reading the value back from it.
> 
> 
> OK, however, as you show below, the BAR being sized is the BAR of a bridge. Are
> you then adding a bridge device by hotplug?

No, I simply hot-plugged a VFIO device on bus 0. Another interesting
observation is that if I hot-plug the device on another bus, this does
not happen.
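
The sizing handshake described in the quoted report can be sketched with a toy register model. This is an illustration under stated assumptions, not QEMU code: toy_bar, bar_write(), bar_read() and bar_probe_size() are hypothetical, and the model is a plain 32-bit memory BAR whose low four bits are read-only flags.

```c
#include <stdint.h>

/* Toy 32-bit memory BAR: address bits below 'size' read back as zero. */
typedef struct {
    uint32_t value;  /* current register contents */
    uint32_t size;   /* region size, a power of two >= 16 */
    uint32_t flags;  /* read-only low bits (type, prefetchable) */
} toy_bar;

static void bar_write(toy_bar *bar, uint32_t v)
{
    bar->value = (v & ~(bar->size - 1)) | bar->flags;
}

static uint32_t bar_read(const toy_bar *bar)
{
    return bar->value;
}

/* Driver side: save, write all ones, read back, restore, decode. */
static uint32_t bar_probe_size(toy_bar *bar)
{
    uint32_t saved = bar_read(bar);
    uint32_t probed;

    bar_write(bar, 0xffffffffu);
    probed = bar_read(bar);
    bar_write(bar, saved);          /* put the real address back */

    return ~(probed & ~0xfu) + 1;   /* mask flag bits, two's complement */
}
```

Note the window between the all-ones write and the restore: the register briefly holds a value that is not a real address, so a device model that recomputes its mappings on every config write would observe that transient value, which appears to be the situation the report describes.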
 
> 
> 
> > We don't handle this value specially when processing the PCI config write
> > in QEMU. The function call stack is:
> >
> > Pci_bridge_dev_write_config
> >
> > -> pci_bridge_write_config
> >
> > -> pci_default_write_config (we update the config[address] value here
> > -> to
> > fc80, which should be 0xfc80 )
> >
> > -> pci_bridge_update_mappings
> >
> > ->pci_bridge_region_del(br, br->windows);
> >
> > -> pci_bridge_region_init
> >
> > ->
> > pci_bridge_init_alias (here pci_bridge_get_base, we use the wrong
> > value
> > fc80)
> >
> > 

Re: [Qemu-devel] [PATCH v6 08/37] ppc/xive: introduce a simplified XIVE presenter

2018-12-09 Thread David Gibson
On Fri, Dec 07, 2018 at 09:49:29AM +0100, Cédric Le Goater wrote:
> On 12/7/18 4:10 AM, David Gibson wrote:
> > On Thu, Dec 06, 2018 at 12:22:22AM +0100, Cédric Le Goater wrote:
> >> The last sub-engine of the XIVE architecture is the Interrupt
> >> Virtualization Presentation Engine (IVPE). On HW, the IVRE and the
> >> IVPE share elements, the Power Bus interface (CQ), the routing table
> >> descriptors, and they can be combined in the same HW logic. We do the
> >> same in QEMU and combine both engines in the XiveRouter for
> >> simplicity.
> >>
> >> When the IVRE has completed its job of matching an event source with a
> >> Notification Virtual Target (NVT) to notify, it forwards the event
> >> notification to the IVPE sub-engine. The IVPE scans the thread
> >> interrupt contexts of the Notification Virtual Targets (NVT)
> >> dispatched on the HW processor threads and if a match is found, it
> >> signals the thread. If not, the IVPE escalates the notification to
> >> some other targets and records the notification in a backlog queue.
> >>
> >> The IVPE maintains the thread interrupt context state for each of its
> >> NVTs not dispatched on HW processor threads in the Notification
> >> Virtual Target table (NVTT).
> >>
> >> The model currently only supports single NVT notifications.
> >>
> >> Signed-off-by: Cédric Le Goater 
> >> ---
> >>  include/hw/ppc/xive.h  |  15 +++
> >>  include/hw/ppc/xive_regs.h |  24 
> >>  hw/intc/xive.c | 227 +
> >>  3 files changed, 266 insertions(+)
> >>
> >> diff --git a/include/hw/ppc/xive.h b/include/hw/ppc/xive.h
> >> index 74b547707b17..e9b06e75fc1c 100644
> >> --- a/include/hw/ppc/xive.h
> >> +++ b/include/hw/ppc/xive.h
> >> @@ -327,6 +327,10 @@ typedef struct XiveRouterClass {
> >>                     XiveEND *end);
> >>      int (*write_end)(XiveRouter *xrtr, uint8_t end_blk, uint32_t end_idx,
> >>                       XiveEND *end, uint8_t word_number);
> >> +    int (*get_nvt)(XiveRouter *xrtr, uint8_t nvt_blk, uint32_t nvt_idx,
> >> +                   XiveNVT *nvt);
> >> +    int (*write_nvt)(XiveRouter *xrtr, uint8_t nvt_blk, uint32_t nvt_idx,
> >> +                     XiveNVT *nvt, uint8_t word_number);
> >>  } XiveRouterClass;
> >>  
> >>  void xive_eas_pic_print_info(XiveEAS *eas, uint32_t lisn, Monitor *mon);
> >> @@ -337,6 +341,11 @@ int xive_router_get_end(XiveRouter *xrtr, uint8_t end_blk, uint32_t end_idx,
> >>                          XiveEND *end);
> >>  int xive_router_write_end(XiveRouter *xrtr, uint8_t end_blk, uint32_t end_idx,
> >>                            XiveEND *end, uint8_t word_number);
> >> +int xive_router_get_nvt(XiveRouter *xrtr, uint8_t nvt_blk, uint32_t nvt_idx,
> >> +                        XiveNVT *nvt);
> >> +int xive_router_write_nvt(XiveRouter *xrtr, uint8_t nvt_blk, uint32_t nvt_idx,
> >> +                          XiveNVT *nvt, uint8_t word_number);
> >> +
> >>  
> >>  /*
> >>   * XIVE END ESBs
> >> @@ -393,6 +402,7 @@ typedef struct XiveTCTX {
> >>      qemu_irq    output;
> >>  
> >>      uint8_t     regs[XIVE_TM_RING_COUNT * XIVE_TM_RING_SIZE];
> >> +    uint32_t    hw_cam;
> > 
> > I don't love having this as a separate field.  Since it also appears
> > within the register space, it's kind of redundant. 
> 
> yes.
> 
> > On the other hand,
> > I see that wiring up the property directly to the register space
> > doesn't really work.  Not sure how to deal with that one.
> 
> We could use get/set properties for "hw-cam" to assign WORD2 of the 
> physical ring and exclude it from reset, which makes some sense. The
> test on the PHYS ring in xive_presenter_tctx_match() would also look 
> like the other tests. I think this is better.

Ok sounds good.

> On a related topic, WORD2 of the OS ring is assigned by the hypervisor. 
> For the sPAPR machine, this is done when the sPAPR IRQ backend is 
> reset. See patch 21 in v6.

Yes, I figured.

[snip]
> >> +/*
> >> + * The thread context register words are in big-endian format.
> >> + */
> >> +static int xive_presenter_tctx_match(XiveTCTX *tctx, uint8_t format,
> >> +                                     uint8_t nvt_blk, uint32_t nvt_idx,
> >> +                                     bool cam_ignore, uint32_t logic_serv)
> >> +{
> >> +    uint32_t cam = xive_nvt_cam_line(nvt_blk, nvt_idx);
> >> +    uint8_t *regs;
> >> +    uint32_t qw3w2;
> >> +    uint32_t qw2w2;
> >> +    uint32_t qw1w2;
> >> +    uint32_t qw0w2;
> >> +
> >> +    /* TODO (PowerNV): ignore low order bits of nvt id */
> >> +
> >> +    regs = &tctx->regs[TM_QW3_HV_PHYS];
> >> +    qw3w2 = be32_to_cpu(*((uint32_t *) &regs[TM_WORD2]));
> > 
> > This is one of the main places we access regs and we have to do
> > horrible casting.  Would it make more sense for it to be a uint32_t
> > array?  Or at least for the local *regs to be.
> 
> The register array is accessed by byte (patch 9) for the first two 
> words and by word for WORD2. I don't see any good 

Re: [Qemu-devel] [PATCH v6 04/37] ppc/xive: introduce the XiveRouter model

2018-12-09 Thread David Gibson
On Fri, Dec 07, 2018 at 08:49:21AM +0100, Cédric Le Goater wrote:
> On 12/7/18 2:57 AM, David Gibson wrote:
> > On Thu, Dec 06, 2018 at 07:22:54AM +0100, Cédric Le Goater wrote:
> >> On 12/6/18 4:41 AM, David Gibson wrote:
> >>> On Thu, Dec 06, 2018 at 12:22:18AM +0100, Cédric Le Goater wrote:
> >>>> The XiveRouter models the second sub-engine of the XIVE architecture :
> >>>> the Interrupt Virtualization Routing Engine (IVRE).
> >>>>
> >>>> The IVRE handles event notifications of the IVSE and performs the
> >>>> interrupt routing process. For this purpose, it uses a set of tables
> >>>> stored in system memory, the first of which being the Event Assignment
> >>>> Structure (EAS) table.
> >>>>
> >>>> The EAT associates an interrupt source number with an Event Notification
> >>>> Descriptor (END) which will be used in a second phase of the routing
> >>>> process to identify a Notification Virtual Target.
> >>>>
> >>>> The XiveRouter is an abstract class which needs to be inherited from
> >>>> to define a storage for the EAT, and other upcoming tables.
> >>>>
> >>>> Signed-off-by: Cédric Le Goater 
> >>>> ---
> >>>>  include/hw/ppc/xive.h  | 31 
> >>>>  include/hw/ppc/xive_regs.h | 50 +
> >>>>  hw/intc/xive.c | 76 ++
> >>>>  3 files changed, 157 insertions(+)
> >>>>  create mode 100644 include/hw/ppc/xive_regs.h
> >>>>
> >>>> diff --git a/include/hw/ppc/xive.h b/include/hw/ppc/xive.h
> >>>> index 6770cffec67d..57ec9f84f527 100644
> >>>> --- a/include/hw/ppc/xive.h
> >>>> +++ b/include/hw/ppc/xive.h
> >>>> @@ -141,6 +141,8 @@
> >>>>  #define PPC_XIVE_H
> >>>>  
> >>>>  #include "hw/qdev-core.h"
> >>>> +#include "hw/sysbus.h"
> >>>> +#include "hw/ppc/xive_regs.h"
> >>>>  
> >>>>  /*
> >>>>   * XIVE Fabric (Interface between Source and Router)
> >>>> @@ -297,4 +299,33 @@ static inline void xive_source_irq_set(XiveSource *xsrc, uint32_t srcno,
> >>>>      }
> >>>>  }
> >>>>  
> >>>> +/*
> >>>> + * XIVE Router
> >>>> + */
> >>>> +
> >>>> +typedef struct XiveRouter {
> >>>> +    SysBusDevice    parent;
> >>>
> >>> I thought the plan was to make XiveRouter as well as XiveSource a
> >>> TYPE_DEVICE descendent rather than a SysBusDevice?
> >>
> >> We started talking about that, indeed, but then:
> >>
> >>https://lists.gnu.org/archive/html/qemu-devel/2018-11/msg06407.html
> >>
> >> I thought we concluded that it was going to get too complex.
> >>
> >> Also, sPAPRXive is a direct descendant of XiveRouter and we want sPAPRXive 
> >> on SysBus.
> > 
> > Ah, good point.  So, to clarify my thinking here - I think from a
> > theoretical point of view, having XiveRouter not be sysbus and
> > including it by composition is probably the "correct" approach.
> 
> One possible solution would be to transform the XiveRouter into a QOM 
> interface, this will be possible when I have removed the chip_id field,
> and define the VST accessors as we do today. I am not sure how QOM 
> interfaces are considered, but I think they are more in the composition 
> pattern than inheritance. That way, we could have sPAPRXive directly 
> inherit from SysBusDevice.
> 
> I can give it a try for v7, and you could merge the small XiveRouter 
> changes in the current XiveRouter patch.
> 
> > But I can also see that that will be a bit of a pain in practice.  So
> > yes, keeping it as a SysBusDevice is ok, at least as long as any
> > migration stuff is in the "outermost" / most specific type, which I
> > believe it is.
> 
> By this sentence, you mean that we don't rely on the XiveRouter model 
> to capture the sPAPRXive state ?

Yes.  Basically we should only have VMStateDescriptions registered by
the spapr specific objects, not the internal parts / superclasses
they're composed of.





Re: [Qemu-devel] [Qemu-ppc] [RFC PATCH 0/6] target/ppc: convert VMX instructions to use TCG vector operations

2018-12-09 Thread David Gibson
On Mon, Dec 10, 2018 at 01:33:53AM +0100, BALATON Zoltan wrote:
> On Fri, 7 Dec 2018, Mark Cave-Ayland wrote:
> > This patchset is an attempt at trying to improve the VMX (Altivec) 
> > instruction
> > performance by making use of the new TCG vector operations where possible.
> 
> This is very welcome, thanks for doing this.
> 
> > In order to use TCG vector operations, the registers must be accessible 
> > from cpu_env
> > whilst currently they are accessed via arrays of static TCG globals. 
> > Patches 1-3
> > are therefore mechanical patches which introduce access helpers for FPR, 
> > AVR and VSR
> > registers using the supplied TCGv_i64 parameter.
> 
> Have you tried some benchmarks or tests to measure the impact of these
> changes? I've tried the (very unscientific) benchmarks I've written about
> before here:
> 
> http://lists.nongnu.org/archive/html/qemu-ppc/2018-07/msg00261.html
> 
> (which seem to use AltiVec/VMX instructions but not sure which) on mac99
> with MorphOS and I could not see any performance increase. I haven't run
> enough tests but results with or without this series on master were mostly
> the same within a few percents, and sometimes even seen lower performance
> with these patches than without. I haven't tried to find out why (no time
> for that now) so can't really draw any conclusions from this. I'm also not
> sure if I've actually tested what you've changed or these use instructions
> that your patches don't optimise yet, or the changes I've seen were just
> normal changes between runs; but I wonder if the increased number of
> temporaries could result in lower performance in some cases?

What was your host machine?  IIUC this change will only improve
performance if the host tcg backend is able to implement TCG vector
ops in terms of vector ops on the host.

In addition, this series only converts a subset of the integer and
logical vector instructions.  If your testcase is mostly floating
point (vectored or otherwise), it will still be softfloat and so not
see any speedup.





Re: [Qemu-devel] [PATCH 1/1] Changes requirement for "vsubsbs" instruction

2018-12-09 Thread David Gibson
On Fri, Dec 07, 2018 at 03:13:14PM -0200, Leonardo Bras wrote:
> From: "Paul A. Clarke" 
> 
> Changes requirement for "vsubsbs" instruction, which has been supported
> since ISA 2.03. (Please see section 5.9.1.2 of ISA 2.03)
> 
> Reported-by: Paul A. Clarke 
> Signed-off-by: Paul A. Clarke 
> Signed-off-by: Leonardo Bras 

Those instruction generating macros are super-confusing, but I think
this is right.  vsubsbs has been there for ages with altivec, bcdtrunc
is new in ISA 3.0.

Applied to ppc-for-4.0.
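
Since these macros are hard to read, here is a rough standalone model of what the changed table entry expresses, as I understand it: the two trailing arguments are feature masks for the shared vsubsbs/bcdtrunc opcode slot, and the slot is usable on a CPU when either mask overlaps that CPU's feature words. Everything below is hypothetical (flag values, op_entry, cpu_model, op_enabled()); the "either word matches" rule is an assumption about the registration logic, not a copy of it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature bits, for illustration only. */
#define F_NONE     0x0u
#define F_ALTIVEC  0x1u /* stands in for PPC_ALTIVEC (first flag word) */
#define F2_ISA300  0x1u /* stands in for PPC2_ISA300 (second flag word) */

typedef struct {
    uint32_t type;   /* mask checked against the first flag word */
    uint32_t type2;  /* mask checked against the second flag word */
} op_entry;

typedef struct {
    uint32_t insns_flags;
    uint32_t insns_flags2;
} cpu_model;

/* Assumed rule: an entry is usable when either flag word overlaps. */
static bool op_enabled(const op_entry *op, const cpu_model *cpu)
{
    return (op->type & cpu->insns_flags) != 0 ||
           (op->type2 & cpu->insns_flags2) != 0;
}
```

Under that assumption, the old entry (no first-word mask, ISA 3.0 in the second) leaves the slot disabled on an Altivec CPU that predates ISA 3.0, so vsubsbs would be missing there, while the new entry enables it on any Altivec CPU.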

> ---
>  target/ppc/translate/vmx-ops.inc.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/target/ppc/translate/vmx-ops.inc.c b/target/ppc/translate/vmx-ops.inc.c
> index 139f80cb24..84e05fb827 100644
> --- a/target/ppc/translate/vmx-ops.inc.c
> +++ b/target/ppc/translate/vmx-ops.inc.c
> @@ -143,7 +143,7 @@ GEN_VXFORM(vaddsws, 0, 14),
>  GEN_VXFORM_DUAL(vsububs, bcdadd, 0, 24, PPC_ALTIVEC, PPC_NONE),
>  GEN_VXFORM_DUAL(vsubuhs, bcdsub, 0, 25, PPC_ALTIVEC, PPC_NONE),
>  GEN_VXFORM(vsubuws, 0, 26),
> -GEN_VXFORM_DUAL(vsubsbs, bcdtrunc, 0, 28, PPC_NONE, PPC2_ISA300),
> +GEN_VXFORM_DUAL(vsubsbs, bcdtrunc, 0, 28, PPC_ALTIVEC, PPC2_ISA300),
>  GEN_VXFORM(vsubshs, 0, 29),
>  GEN_VXFORM_DUAL(vsubsws, xpnd04_2, 0, 30, PPC_ALTIVEC, PPC_NONE),
>  GEN_VXFORM_207(vadduqm, 0, 4),

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson




Re: [Qemu-devel] [PATCH 1/3] MAINTAINERS: add myself as maintainer for Mac Old World and New World machines

2018-12-09 Thread David Gibson
On Fri, Dec 07, 2018 at 04:08:04PM +, Mark Cave-Ayland wrote:
> I've unofficially been doing most of the work on the Mac machines for a while
> now, so update MAINTAINERS to reflect this. David is still happy to be listed
> as a reviewer as per our discussion at KVM forum.
> 
> Signed-off-by: Mark Cave-Ayland 

Acked-by: David Gibson 

> ---
>  MAINTAINERS | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 63effdc473..64bffaecca 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -935,7 +935,8 @@ F: hw/ppc/mpc8544ds.c
>  F: hw/ppc/mpc8544_guts.c
>  
>  New World
> -M: David Gibson 
> +M: Mark Cave-Ayland 
> +R: David Gibson 
>  L: qemu-...@nongnu.org
>  S: Odd Fixes
>  F: hw/ppc/mac_newworld.c
> @@ -949,7 +950,8 @@ F: include/hw/misc/mos6522.h
>  F: include/hw/ppc/mac_dbdma.h
>  
>  Old World
> -M: David Gibson 
> +M: Mark Cave-Ayland 
> +R: David Gibson 
>  L: qemu-...@nongnu.org
>  S: Odd Fixes
>  F: hw/ppc/mac_oldworld.c

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson




Re: [Qemu-devel] [BUG]Unassigned mem write during pci device hot-plug

2018-12-09 Thread Michael S. Tsirkin
On Sat, Dec 08, 2018 at 11:58:59AM +, xuyandong wrote:
> Hi all,
> 
>  
> 
> In our test, we configured VM with several pci-bridges and a virtio-net nic
> been attached with bus 4,
> 
> After VM is startup, We ping this nic from host to judge if it is working
> normally. Then, we hot add pci devices to this VM with bus 0.
> 
> We  found the virtio-net NIC in bus 4 is not working (can not connect)
> occasionally, as it kick virtio backend failure with error below:
> 
> Unassigned mem write fc803004 = 0x1
> 
>  
> 
> memory-region: pci_bridge_pci
> 
>   - (prio 0, RW): pci_bridge_pci
> 
> fc80-fc803fff (prio 1, RW): virtio-pci
> 
>   fc80-fc800fff (prio 0, RW): virtio-pci-common
> 
>   fc801000-fc801fff (prio 0, RW): virtio-pci-isr
> 
>   fc802000-fc802fff (prio 0, RW): virtio-pci-device
> 
>   fc803000-fc803fff (prio 0, RW): virtio-pci-notify  <- io
> mem unassigned
> 
>   …
> 
>  
> 
> We caught an exceptional address changing while this problem happened, show as
> follow:
> 
> Before pci_bridge_update_mappings:
> 
>   fc00-fc1f (prio 1, RW): alias 
> pci_bridge_pref_mem
> @pci_bridge_pci fc00-fc1f
> 
>   fc20-fc3f (prio 1, RW): alias 
> pci_bridge_pref_mem
> @pci_bridge_pci fc20-fc3f
> 
>   fc40-fc5f (prio 1, RW): alias 
> pci_bridge_pref_mem
> @pci_bridge_pci fc40-fc5f
> 
>   fc60-fc7f (prio 1, RW): alias 
> pci_bridge_pref_mem
> @pci_bridge_pci fc60-fc7f
> 
>   fc80-fc9f (prio 1, RW): alias 
> pci_bridge_pref_mem
> @pci_bridge_pci fc80-fc9f <- correct Adress Spce
> 
>   fca0-fcbf (prio 1, RW): alias 
> pci_bridge_pref_mem
> @pci_bridge_pci fca0-fcbf
> 
>   fcc0-fcdf (prio 1, RW): alias 
> pci_bridge_pref_mem
> @pci_bridge_pci fcc0-fcdf
> 
>   fce0-fcff (prio 1, RW): alias 
> pci_bridge_pref_mem
> @pci_bridge_pci fce0-fcff
> 
>  
> 
> After pci_bridge_update_mappings:
> 
>   fda0-fdbf (prio 1, RW): alias pci_bridge_mem
> @pci_bridge_pci fda0-fdbf
> 
>   fdc0-fddf (prio 1, RW): alias pci_bridge_mem
> @pci_bridge_pci fdc0-fddf
> 
>   fde0-fdff (prio 1, RW): alias pci_bridge_mem
> @pci_bridge_pci fde0-fdff
> 
>   fe00-fe1f (prio 1, RW): alias pci_bridge_mem
> @pci_bridge_pci fe00-fe1f
> 
>   fe20-fe3f (prio 1, RW): alias pci_bridge_mem
> @pci_bridge_pci fe20-fe3f
> 
>   fe40-fe5f (prio 1, RW): alias pci_bridge_mem
> @pci_bridge_pci fe40-fe5f
> 
>   fe60-fe7f (prio 1, RW): alias pci_bridge_mem
> @pci_bridge_pci fe60-fe7f
> 
>   fe80-fe9f (prio 1, RW): alias pci_bridge_mem
> @pci_bridge_pci fe80-fe9f
> 
>   fc80-fc80 (prio 1, RW): alias 
> pci_bridge_pref_mem
> @pci_bridge_pci fc80-fc80   <- Exceptional Adress 
> Space

This one is empty though right?

>  
> 
> We have figured out why this address becomes this value,  according to pci
> spec,  pci driver can get BAR address size by writing 0x to
> 
> the pci register firstly, and then read back the value from this register.


OK, however as you show below, the BAR being sized is the BAR
of a bridge. Are you then adding a bridge device by hotplug?
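
For reference, the generic sizing sequence described in the quoted report
(write all ones, read back, restore) can be sketched as below for a plain
32-bit memory BAR. This is an illustration under that assumption, not QEMU
or Linux code, and mem_bar_size is a made-up name. Between the all-ones
write and the restore the register holds a transient value, which is the
window the report is about.

```c
#include <stdint.h>

/*
 * Decode the size of a 32-bit memory BAR from the value read back after
 * the driver wrote all ones to it. The low four bits of a memory BAR are
 * type and prefetch flags, so they are masked off before decoding.
 */
static uint32_t mem_bar_size(uint32_t readback)
{
    uint32_t mask = readback & ~UINT32_C(0xf);

    if (mask == 0) {
        return 0; /* BAR not implemented */
    }
    /* The writable address bits read back as ones; the rest give the size. */
    return ~mask + 1;
}
```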



> We didn't handle this value  specially while process pci write in qemu, the
> function call stack is:
> 
> Pci_bridge_dev_write_config
> 
> -> pci_bridge_write_config
> 
> -> pci_default_write_config (we update the config[address] value here to
> fc80, which should be 0xfc80 )
>
> -> pci_bridge_update_mappings
> 
> ->pci_bridge_region_del(br, br->windows);
> 
> -> pci_bridge_region_init
> 
> ->
> pci_bridge_init_alias (here pci_bridge_get_base, we use the wrong value
> fc80)
> 
> ->
> memory_region_transaction_commit
> 
>  
> 
> So, as we can see, we use the wrong base address in qemu to update the memory
> regions, though, we update the base address to
> 
> The correct value after pci driver in VM write the original value back, the
> virtio NIC in bus 4 may still sends net packets concurrently with
> 
> The wrong memory region 

Re: [Qemu-devel] [BUG]Unassigned mem write during pci device hot-plug

2018-12-09 Thread xuyandong
On Sat, Dec 08, 2018 at 11:58:59AM +, xuyandong wrote:
> > Hi all,
> >
> >
> >
> > In our test, we configured VM with several pci-bridges and a
> > virtio-net nic been attached with bus 4,
> >
> > After VM is startup, We ping this nic from host to judge if it is
> > working normally. Then, we hot add pci devices to this VM with bus 0.
> >
> > We  found the virtio-net NIC in bus 4 is not working (can not connect)
> > occasionally, as it kick virtio backend failure with error below:
> >
> > Unassigned mem write fc803004 = 0x1
> 
> Thanks for the report. Which guest was used to produce this problem?
> 
> --
> MST

I saw this problem when I hotplugged a VFIO device into a CentOS 7.4 guest.
After that I compiled the latest Linux kernel, and it shows the same problem.

Thanks,
Xu




Re: [Qemu-devel] [PATCH] target/i386: Fixes to the check missing features routine

2018-12-09 Thread Caio Carrara
On Fri, Dec 07, 2018 at 05:14:17PM -0500, Wainer dos Santos Moschetta wrote:
> The x86_cpu_class_check_missing_features() returns a list
> of unavailable features compared to the host CPU. Currently it may
> return empty strings for unamed features as well as duplicated
> names.
> 
> For example, the qmp "query-cpu-definitions" below shows one empty
> string and repeated "mpx" entries:
> 
> (...)
> {"execute": "query-cpu-definitions"}
> (...)
> {
> "name": "Cascadelake-Server",
> "typename": "Cascadelake-Server-x86_64-cpu",
> "unavailable-features": [
> "hle",
> "rtm",
> "mpx",
> "avx512f",
> "avx512dq",
> "rdseed",
> "adx",
> "smap",
> "clflushopt",
> "clwb",
> "intel-pt",
> "avx512cd",
> "avx512bw",
> "avx512vl",
> "pku",
> "",
> "avx512vnni",
> "spec-ctrl",
> "ssbd",
> "3dnowprefetch",
> "xsavec",
> "xgetbv1",
> "mpx",
> "mpx",
> "avx512f",
> "avx512f",
> "avx512f",
> "pku"
> ],
> (...)
> 
> Signed-off-by: Wainer dos Santos Moschetta 
> ---
> Note: the skipped testcase was used to test fix in my system so it has
> assumptions about the host CPU. It's impracticial to change it to allow
> running on any system though. Therefore, I am okay on either leave or remove
> it. Opinions?

I disagree with this test. An always-skipped test tends to turn into
meaningless dead code. If the real tests that are not skipped have proper
coverage, that should be enough.
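
As a side note, the filtering the patch description asks for (no empty
names, no duplicates) can be sketched in plain C without a QDict. This is
only an illustration of the idea, not a proposal for target/i386/cpu.c;
unique_names is a made-up helper.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy pointers from names[] into out[], skipping NULL or empty names and
 * names already present in out[]. Returns the number of entries written.
 * out[] must have room for n entries. The scan is quadratic, which is
 * acceptable for a few dozen feature names.
 */
static size_t unique_names(const char *const *names, size_t n,
                           const char **out)
{
    size_t count = 0;

    for (size_t i = 0; i < n; i++) {
        const char *name = names[i];
        int seen = 0;

        if (!name || name[0] == '\0') {
            continue;
        }
        for (size_t j = 0; j < count; j++) {
            if (strcmp(out[j], name) == 0) {
                seen = 1;
                break;
            }
        }
        if (!seen) {
            out[count++] = name;
        }
    }
    return count;
}
```

The first occurrence of each name wins, so the order of the resulting list
stays stable.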

> ---
>  target/i386/cpu.c   | 12 +-
>  tests/acceptance/cpu_definitions.py | 61 +
>  2 files changed, 72 insertions(+), 1 deletion(-)
>  create mode 100644 tests/acceptance/cpu_definitions.py
> 
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index f81d35e1f9..2502a3adda 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -3615,19 +3615,29 @@ static void x86_cpu_class_check_missing_features(X86CPUClass *xcc,
>  
>  x86_cpu_filter_features(xc);
>  
> +/* Uses an auxiliar dictionary to ensure the list of features has not
> +   repeated name. */
> +QDict *unique_feats_dict = qdict_new();
> +
>  for (w = 0; w < FEATURE_WORDS; w++) {
>  uint32_t filtered = xc->filtered_features[w];
>  int i;
>  for (i = 0; i < 32; i++) {
>  if (filtered & (1UL << i)) {
> +const char *fname = g_strdup(x86_cpu_feature_name(w, i));
> +if (!fname || qdict_haskey(unique_feats_dict, fname)) {
> +continue;
> +}
>  strList *new = g_new0(strList, 1);
> -new->value = g_strdup(x86_cpu_feature_name(w, i));
> +new->value = g_strdup(fname);
>  *next = new;
>  next = &new->next;
> +qdict_put_null(unique_feats_dict, new->value);
>  }
>  }
>  }
>  
> +g_free(unique_feats_dict);
>  object_unref(OBJECT(xc));
>  }
>  
> diff --git a/tests/acceptance/cpu_definitions.py b/tests/acceptance/cpu_definitions.py
> new file mode 100644
> index 00..65cea0427e
> --- /dev/null
> +++ b/tests/acceptance/cpu_definitions.py
> @@ -0,0 +1,61 @@
> +# CPU definitions tests.
> +#
> +# Copyright (c) 2018 Red Hat, Inc.
> +#
> +# Author:
> +#  Wainer dos Santos Moschetta 
> +#
> +# This work is licensed under the terms of the GNU GPL, version 2 or
> +# later.  See the COPYING file in the top-level directory.
> +
> +from avocado import skip
> +from avocado_qemu import Test
> +
> +
> +class CPUDefinitions(Test):
> +"""
> +Tests for the CPU definitions.
> +
> +:avocado: enable
> +:avocado: tags=x86_64
> +"""
> +def test_unavailable_features(self):
> +self.vm.add_args("-machine", "q35,accel=kvm")
> +self.vm.launch()
> +cpu_definitions = self.vm.command('query-cpu-definitions')
> +self.assertTrue(len(cpu_definitions) > 0)
> +for cpu_model in cpu_definitions:
> +name = cpu_model.get('name')
> +unavailable_features = cpu_model.get('unavailable-features')
> +
> +self.assertNotIn("", unavailable_features,
> + name + " has unamed feature")
> +self.assertEqual(len(unavailable_features),
> + len(set(unavailable_features)),
> + name + " has duplicate feature")
> +
> +@skip("Have assumptions about the host CPU")
> +def test_unavailable_features_manual(self):
> +"""
> +This 

Re: [Qemu-devel] [Qemu-ppc] [RFC PATCH 0/6] target/ppc: convert VMX instructions to use TCG vector operations

2018-12-09 Thread BALATON Zoltan

On Fri, 7 Dec 2018, Mark Cave-Ayland wrote:

This patchset is an attempt at trying to improve the VMX (Altivec) instruction
performance by making use of the new TCG vector operations where possible.


This is very welcome, thanks for doing this.


In order to use TCG vector operations, the registers must be accessible from
cpu_env whilst currently they are accessed via arrays of static TCG globals.
Patches 1-3 are therefore mechanical patches which introduce access helpers
for FPR, AVR and VSR registers using the supplied TCGv_i64 parameter.


Have you tried some benchmarks or tests to measure the impact of these 
changes? I've tried the (very unscientific) benchmarks I've written about 
before here:


http://lists.nongnu.org/archive/html/qemu-ppc/2018-07/msg00261.html

(which seem to use AltiVec/VMX instructions but not sure which) on mac99 
with MorphOS and I could not see any performance increase. I haven't run 
enough tests but results with or without this series on master were mostly 
the same within a few percents, and sometimes even seen lower performance 
with these patches than without. I haven't tried to find out why (no time 
for that now) so can't really draw any conclusions from this. I'm also not 
sure if I've actually tested what you've changed or these use instructions 
that your patches don't optimise yet, or the changes I've seen were just 
normal changes between runs; but I wonder if the increased number of 
temporaries could result in lower performance in some cases?


Regards,
BALATON Zoltan



Re: [Qemu-devel] Help needed: test-qht-par hangs on Travis

2018-12-09 Thread Emilio G. Cota
On Fri, Dec 07, 2018 at 18:41:07 -0200, Eduardo Habkost wrote:
> I've noticed QEMU Travis builds are failing recently, and they
> seem to happen only on the --enable-gprof jobs.  I have enabled
> V=1 and noticed that the jobs are hanging inside test-qht-par.
> 
> Example here (look for "/qht/parallel/2threads-0%updates-1s"):
> 
> https://travis-ci.org/ehabkost/qemu-hacks/jobs/465081311
> 
> Does anybody have any idea why?

So if I read that output correctly, it seems that the second
test in qht-par never completes.

Enabling gprof and gcov (as in that build) should just lower
the throughput of the benchmark (test-qht-par invokes qht-bench),
but the duration should be the same (1 second per test, so no need
to wait for 10 minutes).

Can you try re-running the test, after applying the appended patch?
(It disables the "resize" thread.)

Also, does it reliably hang on Travis, or are these hangs
intermittent?

Thanks,

Emilio
---
diff --git a/tests/test-qht-par.c b/tests/test-qht-par.c
index d8a83caf5c..83ac92e430 100644
--- a/tests/test-qht-par.c
+++ b/tests/test-qht-par.c
@@ -6,7 +6,7 @@
  */
 #include "qemu/osdep.h"

-#define TEST_QHT_STRING "tests/qht-bench 1>/dev/null 2>&1 -R -S0.1 -D1 -N1 "
+#define TEST_QHT_STRING "tests/qht-bench 1>/dev/null 2>&1 -R "

 static void test_qht(int n_threads, int update_rate, int duration)
 {




Re: [Qemu-devel] [PATCH v7 17/19] spapr: Add a pseries-4.0 machine type

2018-12-09 Thread Benjamin Herrenschmidt
On Sun, 2018-12-09 at 20:46 +0100, Cédric Le Goater wrote:
> Signed-off-by: Cédric Le Goater 
> ---

If you're going to do that, can we include large decrementer in there
too ? (patches from Suraj in my tree but they might need a bit of
massaging).

>  include/hw/compat.h |  3 +++
>  hw/ppc/spapr.c  | 25 ++---
>  2 files changed, 25 insertions(+), 3 deletions(-)
> 
> diff --git a/include/hw/compat.h b/include/hw/compat.h
> index 6f4d5fc64704..70958328fe7a 100644
> --- a/include/hw/compat.h
> +++ b/include/hw/compat.h
> @@ -1,6 +1,9 @@
>  #ifndef HW_COMPAT_H
>  #define HW_COMPAT_H
>  
> +#define HW_COMPAT_3_1 \
> +/* empty */
> +
>  #define HW_COMPAT_3_0 \
>  /* empty */
>  
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index fa41927d95dd..4012ebd794a4 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -3971,19 +3971,38 @@ static const TypeInfo spapr_machine_info = {
>  }\
>  type_init(spapr_machine_register_##suffix)
>  
> - /*
> +/*
> + * pseries-4.0
> + */
> +static void spapr_machine_4_0_instance_options(MachineState *machine)
> +{
> +}
> +
> +static void spapr_machine_4_0_class_options(MachineClass *mc)
> +{
> +/* Defaults for the latest behaviour inherited from the base class */
> +}
> +
> +DEFINE_SPAPR_MACHINE(4_0, "4.0", true);
> +
> +/*
>   * pseries-3.1
>   */
> +#define SPAPR_COMPAT_3_1  \
> +HW_COMPAT_3_1
> +
>  static void spapr_machine_3_1_instance_options(MachineState *machine)
>  {
> +spapr_machine_4_0_instance_options(machine);
>  }
>  
>  static void spapr_machine_3_1_class_options(MachineClass *mc)
>  {
> -/* Defaults for the latest behaviour inherited from the base class */
> +spapr_machine_4_0_class_options(mc);
> +SET_MACHINE_COMPAT(mc, SPAPR_COMPAT_3_1);
>  }
>  
> -DEFINE_SPAPR_MACHINE(3_1, "3.1", true);
> +DEFINE_SPAPR_MACHINE(3_1, "3.1", false);
>  
>  /*
>   * pseries-3.0




Re: [Qemu-devel] [PATCH v9 08/14] target/arm: Make PMCEID[01]_EL0 64 bit registers, add PMCEID[23]

2018-12-09 Thread Peter Maydell
On Fri, 7 Dec 2018 at 18:00, Richard Henderson
 wrote:
>
> On 12/5/18 9:32 AM, Aaron Lindsay wrote:
> > On Dec 05 08:43, Aaron Lindsay wrote:
> >> Signed-off-by: Aaron Lindsay 
> >> +if (FIELD_EX32(cpu->id_dfr0, ID_DFR0, PERFMON) >= 4) {
> >
> > After further discussion on my last version, this should be
> >
> > if (FIELD_EX32(cpu->id_dfr0, ID_DFR0, PERFMON) >= 4 &&
> >   FIELD_EX32(cpu->id_dfr0, ID_DFR0, PERFMON) != 0xf) {
> >
> > to guard against defining these registers for implementation-defined
> > PMUs.
>
> When id fields define values like 0b, that is a hint that the field should
> be interpreted as signed, and you should still use a >= comparison.  (See
> D12.1.4, Principles of the ID scheme for fields in ID registers.)

That section calls out these PMU ID registers as exceptions
which do not follow the scheme and specifically notes that
the "not 0xf and greater than or equal to 4" is the kind of
comparison required here...

thanks
-- PMM
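
The comparison described above can be sketched as a standalone predicate.
This is a minimal illustration, assuming the PerfMon field sits in bits
[27:24] of ID_DFR0; both function names are hypothetical and not part of
the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Extract ID_DFR0.PerfMon, assumed here to be bits [27:24]. */
static unsigned id_dfr0_perfmon(uint32_t id_dfr0)
{
    return (id_dfr0 >> 24) & 0xf;
}

/*
 * Hypothetical predicate: true when PerfMon is at least 4 and is not
 * 0xf, the value used for an implementation-defined PMU.
 */
static bool perfmon_ge4_not_impdef(uint32_t id_dfr0)
{
    unsigned perfmon = id_dfr0_perfmon(id_dfr0);

    return perfmon >= 4 && perfmon != 0xf;
}
```

Treating the field as unsigned and excluding 0xf explicitly avoids the
signed interpretation that applies to most other ID fields.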



[Qemu-devel] [Bug 1174654] Re: qemu-system-x86_64 takes 100% CPU after host machine resumed from suspend to ram

2018-12-09 Thread Mark A. Hershberger
I was seeing this problem when my Debian laptop suspended.  The CentOS
guest would begin consuming a lot of cpu and only a hard-reset would fix
it.

Changing the rtc line to

  

seems to have fixed it, though I haven't done extensive testing yet.

Thanks!

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1174654

Title:
  qemu-system-x86_64 takes 100% CPU after host machine resumed from
  suspend to ram

Status in QEMU:
  Confirmed
Status in qemu package in Ubuntu:
  Invalid

Bug description:
  I have Windows XP SP3 inside a qemu VM. All works fine in 12.10. But
  after upgrading to 13.04 I have to restart the VM each time I
  resume my host machine, because the qemu process starts to take CPU
  cycles and the OS inside the VM is very slow and sluggish. However it's
  still controllable and can be shut down by itself.

  According to the taskmgr any active process takes 99% CPU. It's not
  stuck on some single process.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1174654/+subscriptions



[Qemu-devel] [PATCH v7 19/19] spapr: add a 'pseries-4.0-dual' machine type

2018-12-09 Thread Cédric Le Goater
This pseries machine makes use of a new sPAPR IRQ backend supporting
both interrupt modes : XIVE and XICS, the default being XICS.

Signed-off-by: Cédric Le Goater 
---
 hw/ppc/spapr.c | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 3cc134a0b673..d9fd4851824e 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -4000,6 +4000,21 @@ static void spapr_machine_4_0_xive_class_options(MachineClass *mc)
 
 DEFINE_SPAPR_MACHINE(4_0_xive, "4.0-xive", false);
 
+static void spapr_machine_4_0_dual_instance_options(MachineState *machine)
+{
+spapr_machine_4_0_instance_options(machine);
+}
+
+static void spapr_machine_4_0_dual_class_options(MachineClass *mc)
+{
+sPAPRMachineClass *smc = SPAPR_MACHINE_CLASS(mc);
+
+spapr_machine_4_0_class_options(mc);
+smc->irq = &spapr_irq_dual;
+}
+
+DEFINE_SPAPR_MACHINE(4_0_dual, "4.0-dual", false);
+
 /*
  * pseries-3.1
  */
-- 
2.17.2




Re: [Qemu-devel] [PATCH] target/i386: Generate #UD when applying LOCK to a register

2018-12-09 Thread Philippe Mathieu-Daudé
Cc'ing Alberto

On 12/7/18 6:09 PM, Richard Henderson wrote:
> This covers inc, dec, and the bit test instructions.
> 
> I believe we've finally covered all of the cases for
> which we have an atomic path that would use the cpu_A0
> temp, which is only initialized for address sources.
> 

Reported-by: Alberto Ortega 

> Fixes: https://bugs.launchpad.net/qemu/+bug/1803160/comments/4
> Signed-off-by: Richard Henderson 
> ---
>  target/i386/translate.c | 11 +++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/target/i386/translate.c b/target/i386/translate.c
> index 0dd5fbe45c..eb52322a47 100644
> --- a/target/i386/translate.c
> +++ b/target/i386/translate.c
> @@ -1398,6 +1398,11 @@ static void gen_op(DisasContext *s1, int op, TCGMemOp ot, int d)
>  static void gen_inc(DisasContext *s1, TCGMemOp ot, int d, int c)
>  {
>  if (s1->prefix & PREFIX_LOCK) {
> +if (d != OR_TMP0) {
> +/* Lock prefix when destination is not memory.  */
> +gen_illegal_opcode(s1);
> +return;
> +}
>  tcg_gen_movi_tl(s1->T0, c > 0 ? 1 : -1);
>  tcg_gen_atomic_add_fetch_tl(s1->T0, s1->A0, s1->T0,
>  s1->mem_index, ot | MO_LE);
> @@ -6764,6 +6769,9 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
>  gen_op_ld_v(s, ot, s->T0, s->A0);
>  }
>  } else {
> +if (s->prefix & PREFIX_LOCK) {
> +goto illegal_op;
> +}
>  gen_op_mov_v_reg(s, ot, s->T0, rm);
>  }
>  /* load shift */
> @@ -6803,6 +6811,9 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
>  gen_op_ld_v(s, ot, s->T0, s->A0);
>  }
>  } else {
> +if (s->prefix & PREFIX_LOCK) {
> +goto illegal_op;
> +}
>  gen_op_mov_v_reg(s, ot, s->T0, rm);
>  }
>  bt_op:
> 
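
For anyone following along, the rule the patch enforces can be sketched
outside the translator: for an instruction that accepts LOCK at all (inc,
dec, the bit test family), the prefix is only legal when the ModRM byte
selects a memory operand. The sketch below is an illustration, not code
from target/i386/translate.c, and both function names are made up.

```c
#include <stdbool.h>
#include <stdint.h>

/* ModRM.mod is bits [7:6]; the value 3 selects a register operand. */
static bool modrm_is_register(uint8_t modrm)
{
    return (modrm >> 6) == 3;
}

/*
 * Hypothetical check for a lockable instruction: a LOCK prefix combined
 * with a register destination must raise #UD.
 */
static bool lock_raises_ud(bool has_lock_prefix, uint8_t modrm)
{
    return has_lock_prefix && modrm_is_register(modrm);
}
```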



Re: [Qemu-devel] [Bug 1803160] Re: qemu-3.1.0-rc0: tcg.c crash in temp_load

2018-12-09 Thread Philippe Mathieu-Daudé
Hi Alberto,

Can you open another ticket for your new bug?

Thanks.

On Fri, Dec 7, 2018 at 6:22 PM Richard Henderson  wrote:
>
> This second crash is of course a different bug.
>
> --
> You received this bug notification because you are a member of qemu-
> devel-ml, which is subscribed to QEMU.
> https://bugs.launchpad.net/bugs/1803160
>
> Title:
>   qemu-3.1.0-rc0: tcg.c crash in temp_load
>
> Status in QEMU:
>   Fix Committed
>
> Bug description:
>   QEMU version:
>   -
>
>   qemu-3.1.0-rc0 compiled from sources (earlier versions also affected)
>
>   Summary:
>   
>
>   TCG crashes in i386 and x86_64 when it tries to execute some specific
>   illegal instructions. When running full OS emulation, both the guest
>   system and QEMU crash.
>
>   The issue has been reproduced in two scenarios:
>
>   Ubuntu x64 host running Debian x86 guest with the following command
>   line: qemu-system-x86_64 -m 4G debian.qcow
>
>   When the attached ELF file is executed inside the guest, QEMU crashes.
>
>   It can also be reproduced from the command line:
>
>   $ qemu-i386 tcg_crash.elf
>   /home/alberto/Documents/qemu-3.1.0-rc0/tcg/tcg.c:2863: tcg fatal error
>   qemu: uncaught target signal 11 (Segmentation fault) - core dumped
>   zsh: segmentation fault (core dumped)  
> ../qemu-3.1.0-rc0/build/i386-linux-user/qemu-i386 tcg_crash.elf
>
>   GDB backtrace:
>
>   (gdb) bt
>   #0  0x60206488 in raise ()
>   #1  0x60206b8a in abort ()
>   #2  0x60007016 in temp_load (s=s@entry=0x607a2780 , 
> ts=ts@entry=0x607a3178 , desired_regs=, 
> allocated_regs=allocated_regs@entry=16400)
>   at /home/alberto/Documents/qemu-3.1.0-rc0/tcg/tcg.c:2863
>   #3  0x6000a4d9 in tcg_reg_alloc_op (op=0x62808c20, s= out>) at /home/alberto/Documents/qemu-3.1.0-rc0/tcg/tcg.c:3070
>   #4  tcg_gen_code (s=, tb=tb@entry=0x607ac040 
> ) at 
> /home/alberto/Documents/qemu-3.1.0-rc0/tcg/tcg.c:3598
>   #5  0x6003ef9a in tb_gen_code (cpu=cpu@entry=0x627e0010, 
> pc=pc@entry=134512724, cs_base=cs_base@entry=0, flags=flags@entry=4194483, 
> cflags=cflags@entry=0)
>   at /home/alberto/Documents/qemu-3.1.0-rc0/accel/tcg/translate-all.c:1752
>   #6  0x6003d979 in tb_find (cf_mask=0, tb_exit=0, last_tb=0x0, 
> cpu=0x0) at /home/alberto/Documents/qemu-3.1.0-rc0/accel/tcg/cpu-exec.c:404
>   #7  cpu_exec (cpu=cpu@entry=0x627e0010) at 
> /home/alberto/Documents/qemu-3.1.0-rc0/accel/tcg/cpu-exec.c:724
>   #8  0x6006e1a0 in cpu_loop (env=env@entry=0x627e82c0) at 
> /home/alberto/Documents/qemu-3.1.0-rc0/linux-user/i386/cpu_loop.c:93
>   #9  0x600037c5 in main (argc=2, argv=0x7fffdd28, 
> envp=) at 
> /home/alberto/Documents/qemu-3.1.0-rc0/linux-user/main.c:819
>   (gdb)
>
>   Testcase:
>   -
>
>   Find ELF file attached.
>
> To manage notifications about this bug go to:
> https://bugs.launchpad.net/qemu/+bug/1803160/+subscriptions
>



[Qemu-devel] [PATCH v7 15/19] spapr/xive: enable XIVE MMIOs at reset

2018-12-09 Thread Cédric Le Goater
Depending on the interrupt mode chosen, enable or disable the XIVE
MMIOs.

Signed-off-by: Cédric Le Goater 
---
 include/hw/ppc/spapr_xive.h | 1 +
 hw/intc/spapr_xive.c| 9 +
 hw/ppc/spapr_irq.c  | 8 
 3 files changed, 18 insertions(+)

diff --git a/include/hw/ppc/spapr_xive.h b/include/hw/ppc/spapr_xive.h
index 7244a6231ce6..308afb61a666 100644
--- a/include/hw/ppc/spapr_xive.h
+++ b/include/hw/ppc/spapr_xive.h
@@ -48,5 +48,6 @@ void spapr_xive_hcall_init(sPAPRMachineState *spapr);
 void spapr_dt_xive(sPAPRMachineState *spapr, uint32_t nr_servers, void *fdt,
uint32_t phandle);
 void spapr_xive_reset_tctx(sPAPRXive *xive);
+void spapr_xive_enable_mmio(sPAPRXive *xive, bool enable);
 
 #endif /* PPC_SPAPR_XIVE_H */
diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
index 560d8d031f74..c6dbb2e8cfc7 100644
--- a/hw/intc/spapr_xive.c
+++ b/hw/intc/spapr_xive.c
@@ -179,6 +179,15 @@ static void spapr_xive_map_mmio(sPAPRXive *xive)
 sysbus_mmio_map(SYS_BUS_DEVICE(xive), 2, xive->tm_base);
 }
 
+void spapr_xive_enable_mmio(sPAPRXive *xive, bool enable)
+{
+memory_region_set_enabled(&xive->source.esb_mmio, enable);
+memory_region_set_enabled(&xive->tm_mmio, enable);
+
+/* Disable the END ESBs until a guest OS makes use of them */
+memory_region_set_enabled(&xive->end_source.esb_mmio, false);
+}
+
 /*
  * When a Virtual Processor is scheduled to run on a HW thread, the
  * hypervisor pushes its identifier in the OS CAM line. Emulate the
diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
index b423cee30e2c..a8e50725397c 100644
--- a/hw/ppc/spapr_irq.c
+++ b/hw/ppc/spapr_irq.c
@@ -217,6 +217,11 @@ static void spapr_irq_reset_xics(sPAPRMachineState *spapr, Error **errp)
 CPU_FOREACH(cs) {
 spapr_cpu_core_set_intc(POWERPC_CPU(cs), spapr->icp_type);
 }
+
+/* Deactivate the XIVE MMIOs */
+if (spapr->xive) {
+spapr_xive_enable_mmio(spapr->xive, false);
+}
 }
 
 #define SPAPR_IRQ_XICS_NR_IRQS 0x1000
@@ -358,6 +363,9 @@ static void spapr_irq_reset_xive(sPAPRMachineState *spapr, Error **errp)
  * to come after the XiveTCTX reset handlers.
  */
 spapr_xive_reset_tctx(spapr->xive);
+
+/* Activate the XIVE MMIOs */
+spapr_xive_enable_mmio(spapr->xive, true);
 }
 
 /*
-- 
2.17.2




[Qemu-devel] [PATCH v7 17/19] spapr: Add a pseries-4.0 machine type

2018-12-09 Thread Cédric Le Goater
Signed-off-by: Cédric Le Goater 
---
 include/hw/compat.h |  3 +++
 hw/ppc/spapr.c  | 25 ++---
 2 files changed, 25 insertions(+), 3 deletions(-)

diff --git a/include/hw/compat.h b/include/hw/compat.h
index 6f4d5fc64704..70958328fe7a 100644
--- a/include/hw/compat.h
+++ b/include/hw/compat.h
@@ -1,6 +1,9 @@
 #ifndef HW_COMPAT_H
 #define HW_COMPAT_H
 
+#define HW_COMPAT_3_1 \
+/* empty */
+
 #define HW_COMPAT_3_0 \
 /* empty */
 
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index fa41927d95dd..4012ebd794a4 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -3971,19 +3971,38 @@ static const TypeInfo spapr_machine_info = {
 }\
 type_init(spapr_machine_register_##suffix)
 
- /*
+/*
+ * pseries-4.0
+ */
+static void spapr_machine_4_0_instance_options(MachineState *machine)
+{
+}
+
+static void spapr_machine_4_0_class_options(MachineClass *mc)
+{
+/* Defaults for the latest behaviour inherited from the base class */
+}
+
+DEFINE_SPAPR_MACHINE(4_0, "4.0", true);
+
+/*
  * pseries-3.1
  */
+#define SPAPR_COMPAT_3_1  \
+HW_COMPAT_3_1
+
 static void spapr_machine_3_1_instance_options(MachineState *machine)
 {
+spapr_machine_4_0_instance_options(machine);
 }
 
 static void spapr_machine_3_1_class_options(MachineClass *mc)
 {
-/* Defaults for the latest behaviour inherited from the base class */
+spapr_machine_4_0_class_options(mc);
+SET_MACHINE_COMPAT(mc, SPAPR_COMPAT_3_1);
 }
 
-DEFINE_SPAPR_MACHINE(3_1, "3.1", true);
+DEFINE_SPAPR_MACHINE(3_1, "3.1", false);
 
 /*
  * pseries-3.0
-- 
2.17.2




[Qemu-devel] [PATCH v7 08/19] spapr: add hcalls support for the XIVE exploitation interrupt mode

2018-12-09 Thread Cédric Le Goater
The different XIVE virtualization structures (sources and event queues)
are configured with a set of Hypervisor calls :

 - H_INT_GET_SOURCE_INFO

   used to obtain the address of the MMIO page of the Event State
   Buffer (ESB) entry associated with the source.

 - H_INT_SET_SOURCE_CONFIG

   assigns a source to a "target".

 - H_INT_GET_SOURCE_CONFIG

   determines which "target" and "priority" is assigned to a source

 - H_INT_GET_QUEUE_INFO

   returns the address of the notification management page associated
   with the specified "target" and "priority".

 - H_INT_SET_QUEUE_CONFIG

   sets or resets the event queue for a given "target" and "priority".
   It is also used to set the notification configuration associated
   with the queue. Only unconditional notification is supported for
   the moment. Reset is performed with a queue size of 0 and queueing
   is disabled in that case.

 - H_INT_GET_QUEUE_CONFIG

   returns the queue settings for a given "target" and "priority".

 - H_INT_RESET

   resets all of the guest's internal interrupt structures to their
   initial state, losing all configuration set via the hcalls
   H_INT_SET_SOURCE_CONFIG and H_INT_SET_QUEUE_CONFIG.

 - H_INT_SYNC

   issue a synchronisation on a source to make sure all notifications
   have reached their queue.

Calls that still need to be addressed :

   H_INT_SET_OS_REPORTING_LINE
   H_INT_GET_OS_REPORTING_LINE

See the code for more documentation on each hcall.
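
The opcodes added to spapr.h by this patch are contiguous, from
H_INT_GET_SOURCE_INFO (0x3A8) to H_INT_RESET (0x3D0) in steps of 4. A
small lookup table built on that property could look like the sketch
below. It only illustrates the numbering; the table and the function
xive_hcall_name are hypothetical and not part of the patch.

```c
#include <stddef.h>
#include <stdint.h>

#define H_INT_GET_SOURCE_INFO 0x3A8
#define H_INT_RESET           0x3D0

/* Names indexed by (opcode - H_INT_GET_SOURCE_INFO) / 4. */
static const char *const xive_hcall_names[] = {
    "H_INT_GET_SOURCE_INFO",
    "H_INT_SET_SOURCE_CONFIG",
    "H_INT_GET_SOURCE_CONFIG",
    "H_INT_GET_QUEUE_INFO",
    "H_INT_SET_QUEUE_CONFIG",
    "H_INT_GET_QUEUE_CONFIG",
    "H_INT_SET_OS_REPORTING_LINE",
    "H_INT_GET_OS_REPORTING_LINE",
    "H_INT_ESB",
    "H_INT_SYNC",
    "H_INT_RESET",
};

/* Return the hcall name, or NULL if the opcode is not a XIVE hcall. */
static const char *xive_hcall_name(uint32_t opcode)
{
    if (opcode < H_INT_GET_SOURCE_INFO || opcode > H_INT_RESET ||
        (opcode & 3) != 0) {
        return NULL;
    }
    return xive_hcall_names[(opcode - H_INT_GET_SOURCE_INFO) / 4];
}
```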

Signed-off-by: Cédric Le Goater 
---

 Changes since v6:

 - simplified the prototypes of helpers
 - introduced a fixed value for the controller block id value.
 
 include/hw/ppc/spapr.h  |  15 +-
 include/hw/ppc/spapr_xive.h |   4 +
 hw/intc/spapr_xive.c| 963 
 hw/ppc/spapr_irq.c  |   2 +
 4 files changed, 983 insertions(+), 1 deletion(-)

diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index cb3082d319af..6bf028a02fe2 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -452,7 +452,20 @@ struct sPAPRMachineState {
 #define H_INVALIDATE_PID0x378
 #define H_REGISTER_PROC_TBL 0x37C
 #define H_SIGNAL_SYS_RESET  0x380
-#define MAX_HCALL_OPCODEH_SIGNAL_SYS_RESET
+
+#define H_INT_GET_SOURCE_INFO   0x3A8
+#define H_INT_SET_SOURCE_CONFIG 0x3AC
+#define H_INT_GET_SOURCE_CONFIG 0x3B0
+#define H_INT_GET_QUEUE_INFO0x3B4
+#define H_INT_SET_QUEUE_CONFIG  0x3B8
+#define H_INT_GET_QUEUE_CONFIG  0x3BC
+#define H_INT_SET_OS_REPORTING_LINE 0x3C0
+#define H_INT_GET_OS_REPORTING_LINE 0x3C4
+#define H_INT_ESB   0x3C8
+#define H_INT_SYNC  0x3CC
+#define H_INT_RESET 0x3D0
+
+#define MAX_HCALL_OPCODEH_INT_RESET
 
 /* The hcalls above are standardized in PAPR and implemented by pHyp
  * as well.
diff --git a/include/hw/ppc/spapr_xive.h b/include/hw/ppc/spapr_xive.h
index f087959b9924..9506a8f4d10a 100644
--- a/include/hw/ppc/spapr_xive.h
+++ b/include/hw/ppc/spapr_xive.h
@@ -42,4 +42,8 @@ bool spapr_xive_irq_free(sPAPRXive *xive, uint32_t lisn);
 void spapr_xive_pic_print_info(sPAPRXive *xive, Monitor *mon);
 qemu_irq spapr_xive_qirq(sPAPRXive *xive, uint32_t lisn);
 
+typedef struct sPAPRMachineState sPAPRMachineState;
+
+void spapr_xive_hcall_init(sPAPRMachineState *spapr);
+
 #endif /* PPC_SPAPR_XIVE_H */
diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
index 3ade419fdbb1..982ac6e17051 100644
--- a/hw/intc/spapr_xive.c
+++ b/hw/intc/spapr_xive.c
@@ -38,6 +38,13 @@
 
 #define SPAPR_XIVE_NVT_BASE 0x400
 
+/*
+ * The sPAPR machine has a unique XIVE IC device. Assign a fixed value
+ * to the controller block id value. It can nevertheless be changed
+ * for testing purpose.
+ */
+#define SPAPR_XIVE_BLOCK_ID 0x0
+
 /*
  * sPAPR NVT and END indexing helpers
  */
@@ -46,6 +53,64 @@ static uint32_t spapr_xive_nvt_to_target(uint8_t nvt_blk, uint32_t nvt_idx)
 return nvt_idx - SPAPR_XIVE_NVT_BASE;
 }
 
+static void spapr_xive_cpu_to_nvt(PowerPCCPU *cpu,
+  uint8_t *out_nvt_blk, uint32_t *out_nvt_idx)
+{
+assert(cpu);
+
+if (out_nvt_blk) {
+*out_nvt_blk = SPAPR_XIVE_BLOCK_ID;
+}
+
+if (out_nvt_idx) {
+*out_nvt_idx = SPAPR_XIVE_NVT_BASE + cpu->vcpu_id;
+}
+}
+
+static int spapr_xive_target_to_nvt(uint32_t target,
+uint8_t *out_nvt_blk, uint32_t *out_nvt_idx)
+{
+PowerPCCPU *cpu = spapr_find_cpu(target);
+
+if (!cpu) {
+return -1;
+}
+
+spapr_xive_cpu_to_nvt(cpu, out_nvt_blk, out_nvt_idx);
+return 0;
+}
+
+/*
+ * sPAPR END indexing uses a simple mapping of the CPU vcpu_id, 8
+ * priorities per CPU
+ */
+static void spapr_xive_cpu_to_end(PowerPCCPU *cpu, uint8_t prio,
+  uint8_t *out_end_blk, uint32_t *out_end_idx)
+{
+assert(cpu);
+
+if (out_end_blk) {
+*out_end_blk = SPAPR_XIVE_BLOCK_ID;
+}
+
+if (out_end_idx) {
+*out_end_idx = 

[Qemu-devel] [PATCH v7 16/19] spapr: introduce a new sPAPR IRQ backend supporting XIVE and XICS

2018-12-09 Thread Cédric Le Goater
The interrupt mode is chosen by the CAS negotiation process and
activated after a reset to take into account the required changes in
the machine. These impact the device tree layout, the interrupt
presenter object and the exposed MMIO regions in the case of XIVE.

The default interrupt mode for the machine is XICS.
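
The mode switch described above can be sketched in isolation. This is
only an illustration, not the code of the patch: the irq_backend and
machine_state types, the xive_negotiated/xive_active flags and the
helper names are hypothetical stand-ins for sPAPRIrq, sPAPRMachineState
and the XIVE exploitation bit negotiated by CAS. The sketch assumes the
negotiated mode only becomes active at the next reset.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for an sPAPR IRQ backend descriptor. */
typedef struct irq_backend {
    const char *name;
} irq_backend;

static const irq_backend backend_xics = { "xics" };
static const irq_backend backend_xive = { "xive" };

/* Hypothetical stand-in for the machine state. */
typedef struct machine_state {
    bool xive_negotiated;  /* mode requested by the guest during CAS */
    bool xive_active;      /* mode in use since the last reset */
    bool reset_pending;    /* a reset was requested by CAS */
} machine_state;

/* XICS is the default; XIVE is used only once it has been activated. */
static const irq_backend *current_backend(const machine_state *m)
{
    return m->xive_active ? &backend_xive : &backend_xics;
}

/* Record the mode requested by the guest; ask for a reset on change. */
static void cas_negotiate(machine_state *m, bool guest_wants_xive)
{
    if (m->xive_negotiated != guest_wants_xive) {
        m->xive_negotiated = guest_wants_xive;
        m->reset_pending = true;
    }
}

/* The negotiated mode takes effect at reset time. */
static void machine_reset(machine_state *m)
{
    m->xive_active = m->xive_negotiated;
    m->reset_pending = false;
}
```

A guest that negotiates XIVE keeps running on XICS until machine_reset()
is called; a guest that does not ask for XIVE never triggers the extra
reset.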

Signed-off-by: Cédric Le Goater 
---
 include/hw/ppc/spapr_irq.h |   1 +
 hw/ppc/spapr.c |   3 +-
 hw/ppc/spapr_hcall.c   |  13 
 hw/ppc/spapr_irq.c | 143 +
 4 files changed, 159 insertions(+), 1 deletion(-)

diff --git a/include/hw/ppc/spapr_irq.h b/include/hw/ppc/spapr_irq.h
index b34d5a00381b..29936498dbc8 100644
--- a/include/hw/ppc/spapr_irq.h
+++ b/include/hw/ppc/spapr_irq.h
@@ -51,6 +51,7 @@ typedef struct sPAPRIrq {
 extern sPAPRIrq spapr_irq_xics;
 extern sPAPRIrq spapr_irq_xics_legacy;
 extern sPAPRIrq spapr_irq_xive;
+extern sPAPRIrq spapr_irq_dual;
 
 void spapr_irq_init(sPAPRMachineState *spapr, Error **errp);
 int spapr_irq_claim(sPAPRMachineState *spapr, int irq, bool lsi, Error **errp);
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 5ef87a00f68b..fa41927d95dd 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -2631,7 +2631,8 @@ static void spapr_machine_init(MachineState *machine)
 spapr_ovec_set(spapr->ov5, OV5_DRMEM_V2);
 
 /* advertise XIVE */
-if (smc->irq->ov5 == SPAPR_OV5_XIVE_EXPLOIT) {
+if (smc->irq->ov5 == SPAPR_OV5_XIVE_EXPLOIT ||
+smc->irq->ov5 == SPAPR_OV5_XIVE_BOTH) {
 spapr_ovec_set(spapr->ov5, OV5_XIVE_EXPLOIT);
 }
 
diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
index ae913d070f50..186b6a65543f 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -1654,6 +1654,19 @@ static target_ulong h_client_architecture_support(PowerPCCPU *cpu,
 (spapr_h_cas_compose_response(spapr, args[1], args[2],
   ov5_updates) != 0);
 }
+
+/*
+ * Generate a machine reset when we have an update of the
+ * interrupt mode. Only required on the machine supporting both
+ * modes.
+ */
+if (!spapr->cas_reboot) {
+sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
+
+spapr->cas_reboot = spapr_ovec_test(ov5_updates, OV5_XIVE_EXPLOIT)
+&& smc->irq->ov5 == SPAPR_OV5_XIVE_BOTH;
+}
+
 spapr_ovec_cleanup(ov5_updates);
 
 if (spapr->cas_reboot) {
diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
index a8e50725397c..7c34939f774a 100644
--- a/hw/ppc/spapr_irq.c
+++ b/hw/ppc/spapr_irq.c
@@ -392,6 +392,149 @@ sPAPRIrq spapr_irq_xive = {
 .reset   = spapr_irq_reset_xive,
 };
 
+/*
+ * Dual XIVE and XICS IRQ backend.
+ *
+ * The objects for both interrupt modes, XIVE and XICS, are created but
+ * the machine starts in legacy interrupt mode (XICS). The mode can be
+ * changed by the CAS negotiation process and, in that case, the new
+ * mode is activated after an extra machine reset.
+ */
+
+/*
+ * Returns the sPAPR IRQ backend negotiated by CAS. XICS is the
+ * default.
+ */
+static sPAPRIrq *spapr_irq_current(sPAPRMachineState *spapr)
+{
+return spapr_ovec_test(spapr->ov5_cas, OV5_XIVE_EXPLOIT) ?
+&spapr_irq_xive : &spapr_irq_xics;
+}
+
+static void spapr_irq_init_dual(sPAPRMachineState *spapr, Error **errp)
+{
+MachineState *machine = MACHINE(spapr);
+Error *local_err = NULL;
+
+if (kvm_enabled() && machine_kernel_irqchip_allowed(machine)) {
+error_setg(errp, "No KVM support for the 'dual' machine");
+return;
+}
+
+spapr_irq_xics.init(spapr, &local_err);
+if (local_err) {
+error_propagate(errp, local_err);
+return;
+}
+
+spapr_irq_xive.init(spapr, &local_err);
+if (local_err) {
+error_propagate(errp, local_err);
+return;
+}
+}
+
+static int spapr_irq_claim_dual(sPAPRMachineState *spapr, int irq, bool lsi,
+Error **errp)
+{
+int ret;
+Error *local_err = NULL;
+
+ret = spapr_irq_xive.claim(spapr, irq, lsi, &local_err);
+if (local_err) {
+error_propagate(errp, local_err);
+return ret;
+}
+
+ret = spapr_irq_xics.claim(spapr, irq, lsi, &local_err);
+if (local_err) {
+error_propagate(errp, local_err);
+}
+
+return ret;
+}
+
+static void spapr_irq_free_dual(sPAPRMachineState *spapr, int irq, int num)
+{
+spapr_irq_xive.free(spapr, irq, num);
+spapr_irq_xics.free(spapr, irq, num);
+}
+
+static qemu_irq spapr_qirq_dual(sPAPRMachineState *spapr, int irq)
+{
+return spapr_irq_current(spapr)->qirq(spapr, irq);
+}
+
+static void spapr_irq_print_info_dual(sPAPRMachineState *spapr, Monitor *mon)
+{
+spapr_irq_current(spapr)->print_info(spapr, mon);
+}
+
+static void spapr_irq_dt_populate_dual(sPAPRMachineState *spapr,
+   uint32_t nr_servers, void *fdt,
+   uint32_t phandle)
+{
+spapr_irq_current(spapr)->dt_populate(spapr, nr_servers, 

[Qemu-devel] [PATCH v7 12/19] spapr: add a 'reset' method to the sPAPR IRQ backend

2018-12-09 Thread Cédric Le Goater
For the time being, the XIVE reset handler updates the OS CAM line of
the vCPU as it is done under a real hypervisor when a vCPU is
scheduled to run on a HW thread.

This handler will become even more useful when we introduce the
machine supporting both interrupt modes, XIVE and XICS. In this
machine, the interrupt mode is chosen by the CAS negotiation process
and activated after a reset.
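
The OS CAM line update mentioned above can be sketched as follows. The
CAM line layout, (nvt_blk << 19) | nvt_idx, and the NVT base of 0x400
come from this series. The value of the valid bit (standing in for
TM_QW1W2_VO) and the register offset (standing in for TM_QW1_OS +
TM_WORD2) are assumptions made for the illustration, and reset_os_cam()
is a hypothetical helper, not a QEMU function.

```c
#include <assert.h>
#include <stdint.h>

#define NVT_BASE      0x400u        /* SPAPR_XIVE_NVT_BASE in this series */
#define CAM_VALID     0x80000000u   /* assumed value of TM_QW1W2_VO */
#define OS_WORD2_OFF  (0x10 + 0x8)  /* assumed TM_QW1_OS + TM_WORD2 */

/* Same layout as xive_nvt_cam_line(): block in the high bits. */
static uint32_t nvt_cam_line(uint8_t nvt_blk, uint32_t nvt_idx)
{
    return ((uint32_t)nvt_blk << 19) | nvt_idx;
}

/*
 * Push the vCPU identifier in the OS CAM line of a thread context.
 * The register file is big-endian, so the word is stored byte by byte.
 */
static void reset_os_cam(uint8_t *regs, uint8_t nvt_blk, uint32_t vcpu_id)
{
    uint32_t cam = CAM_VALID | nvt_cam_line(nvt_blk, NVT_BASE + vcpu_id);

    regs[OS_WORD2_OFF + 0] = (uint8_t)(cam >> 24);
    regs[OS_WORD2_OFF + 1] = (uint8_t)(cam >> 16);
    regs[OS_WORD2_OFF + 2] = (uint8_t)(cam >> 8);
    regs[OS_WORD2_OFF + 3] = (uint8_t)cam;
}
```

For vCPU 3 in block 0, the word stored is the valid bit combined with
NVT index 0x403.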

Signed-off-by: Cédric Le Goater 
---
 include/hw/ppc/spapr_irq.h  |  2 ++
 include/hw/ppc/spapr_xive.h |  1 +
 hw/intc/spapr_xive.c| 24 
 hw/ppc/spapr.c  |  5 +
 hw/ppc/spapr_irq.c  | 24 
 5 files changed, 56 insertions(+)

diff --git a/include/hw/ppc/spapr_irq.h b/include/hw/ppc/spapr_irq.h
index 84a25ffb6c65..63061a009b4c 100644
--- a/include/hw/ppc/spapr_irq.h
+++ b/include/hw/ppc/spapr_irq.h
@@ -44,6 +44,7 @@ typedef struct sPAPRIrq {
 Object *(*cpu_intc_create)(sPAPRMachineState *spapr, Object *cpu,
Error **errp);
 int (*post_load)(sPAPRMachineState *spapr, int version_id);
+void (*reset)(sPAPRMachineState *spapr, Error **errp);
 } sPAPRIrq;
 
 extern sPAPRIrq spapr_irq_xics;
@@ -55,6 +56,7 @@ int spapr_irq_claim(sPAPRMachineState *spapr, int irq, bool lsi, Error **errp);
 void spapr_irq_free(sPAPRMachineState *spapr, int irq, int num);
 qemu_irq spapr_qirq(sPAPRMachineState *spapr, int irq);
 int spapr_irq_post_load(sPAPRMachineState *spapr, int version_id);
+void spapr_irq_reset(sPAPRMachineState *spapr, Error **errp);
 
 /*
  * XICS legacy routines
diff --git a/include/hw/ppc/spapr_xive.h b/include/hw/ppc/spapr_xive.h
index 728a5e8dc163..7244a6231ce6 100644
--- a/include/hw/ppc/spapr_xive.h
+++ b/include/hw/ppc/spapr_xive.h
@@ -47,5 +47,6 @@ typedef struct sPAPRMachineState sPAPRMachineState;
 void spapr_xive_hcall_init(sPAPRMachineState *spapr);
 void spapr_dt_xive(sPAPRMachineState *spapr, uint32_t nr_servers, void *fdt,
uint32_t phandle);
+void spapr_xive_reset_tctx(sPAPRXive *xive);
 
 #endif /* PPC_SPAPR_XIVE_H */
diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
index a6d854b07690..560d8d031f74 100644
--- a/hw/intc/spapr_xive.c
+++ b/hw/intc/spapr_xive.c
@@ -179,6 +179,30 @@ static void spapr_xive_map_mmio(sPAPRXive *xive)
 sysbus_mmio_map(SYS_BUS_DEVICE(xive), 2, xive->tm_base);
 }
 
+/*
+ * When a Virtual Processor is scheduled to run on a HW thread, the
+ * hypervisor pushes its identifier in the OS CAM line. Emulate the
+ * same behavior under QEMU.
+ */
+void spapr_xive_reset_tctx(sPAPRXive *xive)
+{
+CPUState *cs;
+uint8_t  nvt_blk;
+uint32_t nvt_idx;
+uint32_t nvt_cam;
+
+CPU_FOREACH(cs) {
+PowerPCCPU *cpu = POWERPC_CPU(cs);
+XiveTCTX *tctx = XIVE_TCTX(cpu->intc);
+
+spapr_xive_cpu_to_nvt(cpu, &nvt_blk, &nvt_idx);
+
+nvt_cam = cpu_to_be32(TM_QW1W2_VO |
+  xive_nvt_cam_line(nvt_blk, nvt_idx));
+memcpy(&tctx->regs[TM_QW1_OS + TM_WORD2], &nvt_cam, 4);
+}
+}
+
 static void spapr_xive_end_reset(XiveEND *end)
 {
 memset(end, 0, sizeof(*end));
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 8cea4cad1732..98d69f09e080 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1619,6 +1619,11 @@ static void spapr_machine_reset(void)
 
 qemu_devices_reset();
 
+/* This is fixing some of the default configuration of the XIVE
+ * devices. To be called after the reset of the machine devices.
+ */
+spapr_irq_reset(spapr, &error_fatal);
+
 /* DRC reset may cause a device to be unplugged. This will cause troubles
  * if this device is used by another device (eg, a running vhost backend
  * will crash QEMU if the DIMM holding the vring goes away). To avoid such
diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
index 35a067cad3f8..04f5c9665550 100644
--- a/hw/ppc/spapr_irq.c
+++ b/hw/ppc/spapr_irq.c
@@ -209,6 +209,10 @@ static int spapr_irq_post_load_xics(sPAPRMachineState *spapr, int version_id)
 return 0;
 }
 
+static void spapr_irq_reset_xics(sPAPRMachineState *spapr, Error **errp)
+{
+}
+
 #define SPAPR_IRQ_XICS_NR_IRQS 0x1000
 #define SPAPR_IRQ_XICS_NR_MSIS \
 (XICS_IRQ_BASE + SPAPR_IRQ_XICS_NR_IRQS - SPAPR_IRQ_MSI)
@@ -225,6 +229,7 @@ sPAPRIrq spapr_irq_xics = {
 .dt_populate = spapr_dt_xics,
 .cpu_intc_create = spapr_irq_cpu_intc_create_xics,
 .post_load   = spapr_irq_post_load_xics,
+.reset   = spapr_irq_reset_xics,
 };
 
 /*
@@ -333,6 +338,15 @@ static int spapr_irq_post_load_xive(sPAPRMachineState *spapr, int version_id)
 return 0;
 }
 
+static void spapr_irq_reset_xive(sPAPRMachineState *spapr, Error **errp)
+{
+/*
+ * Set the OS CAM line of the cpu interrupt thread context. Needs
+ * to come after the XiveTCTX reset handlers.
+ */
+spapr_xive_reset_tctx(spapr->xive);
+}
+
 /*
  * XIVE uses the full IRQ number space. Set it to 8K to be compatible
  * with XICS.
@@ -353,6 +367,7 @@ sPAPRIrq 

[Qemu-devel] [PATCH v7 10/19] spapr: allocate the interrupt thread context under the CPU core

2018-12-09 Thread Cédric Le Goater
Each interrupt mode has its own specific interrupt presenter object,
that we store under the CPU object, one for XICS and one for XIVE.

Extend the sPAPR IRQ backend with a new handler to support them both.
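
The dispatch through a per-backend handler can be sketched as below.
The presenter and irq_backend types and the create_icp()/create_tctx()
helpers are hypothetical and only stand in for the real objects (the
XICS ICP and the XIVE thread context); the point is that the vCPU
realize path no longer needs to know which interrupt mode is in use.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for an interrupt presenter object. */
typedef struct presenter {
    const char *type;
    int cpu_index;
} presenter;

/* Hypothetical stand-in for the backend descriptor and its handler. */
typedef struct irq_backend {
    presenter (*cpu_intc_create)(int cpu_index);
} irq_backend;

static presenter create_icp(int cpu_index)
{
    presenter p = { "icp", cpu_index };
    return p;
}

static presenter create_tctx(int cpu_index)
{
    presenter p = { "xive-tctx", cpu_index };
    return p;
}

static const irq_backend backend_xics = { create_icp };
static const irq_backend backend_xive = { create_tctx };

/* The caller only goes through the handler of the selected backend. */
static presenter realize_vcpu(const irq_backend *backend, int cpu_index)
{
    return backend->cpu_intc_create(cpu_index);
}
```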

Signed-off-by: Cédric Le Goater 
Reviewed-by: David Gibson 
---

 Changes since v6:

 - removed the hardwiring of the HW CAM line. Back to v5 state.

 include/hw/ppc/spapr_irq.h |  2 ++
 include/hw/ppc/xive.h  |  1 +
 hw/intc/xive.c | 22 ++
 hw/ppc/spapr_cpu_core.c|  5 ++---
 hw/ppc/spapr_irq.c | 15 +++
 5 files changed, 42 insertions(+), 3 deletions(-)

diff --git a/include/hw/ppc/spapr_irq.h b/include/hw/ppc/spapr_irq.h
index e51e9f052f63..13db0428ab51 100644
--- a/include/hw/ppc/spapr_irq.h
+++ b/include/hw/ppc/spapr_irq.h
@@ -41,6 +41,8 @@ typedef struct sPAPRIrq {
 void (*print_info)(sPAPRMachineState *spapr, Monitor *mon);
 void (*dt_populate)(sPAPRMachineState *spapr, uint32_t nr_servers,
 void *fdt, uint32_t phandle);
+Object *(*cpu_intc_create)(sPAPRMachineState *spapr, Object *cpu,
+   Error **errp);
 } sPAPRIrq;
 
 extern sPAPRIrq spapr_irq_xics;
diff --git a/include/hw/ppc/xive.h b/include/hw/ppc/xive.h
index 19309d1d65d1..18cd114eb244 100644
--- a/include/hw/ppc/xive.h
+++ b/include/hw/ppc/xive.h
@@ -419,6 +419,7 @@ typedef struct XiveTCTX {
 extern const MemoryRegionOps xive_tm_ops;
 
 void xive_tctx_pic_print_info(XiveTCTX *tctx, Monitor *mon);
+Object *xive_tctx_create(Object *cpu, XiveRouter *xrtr, Error **errp);
 
 static inline uint32_t xive_nvt_cam_line(uint8_t nvt_blk, uint32_t nvt_idx)
 {
diff --git a/hw/intc/xive.c b/hw/intc/xive.c
index ea5385ff7784..53d2f191e8a3 100644
--- a/hw/intc/xive.c
+++ b/hw/intc/xive.c
@@ -526,6 +526,28 @@ static const TypeInfo xive_tctx_info = {
 .class_init= xive_tctx_class_init,
 };
 
+Object *xive_tctx_create(Object *cpu, XiveRouter *xrtr, Error **errp)
+{
+Error *local_err = NULL;
+Object *obj;
+
+obj = object_new(TYPE_XIVE_TCTX);
+object_property_add_child(cpu, TYPE_XIVE_TCTX, obj, &error_abort);
+object_unref(obj);
+object_property_add_const_link(obj, "cpu", cpu, &error_abort);
+object_property_set_bool(obj, true, "realized", &local_err);
+if (local_err) {
+goto error;
+}
+
+return obj;
+
+error:
+object_unparent(obj);
+error_propagate(errp, local_err);
+return NULL;
+}
+
 /*
  * XIVE ESB helpers
  */
diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
index 2398ce62c0e7..1811cd48db90 100644
--- a/hw/ppc/spapr_cpu_core.c
+++ b/hw/ppc/spapr_cpu_core.c
@@ -11,7 +11,6 @@
 #include "hw/ppc/spapr_cpu_core.h"
 #include "target/ppc/cpu.h"
 #include "hw/ppc/spapr.h"
-#include "hw/ppc/xics.h" /* for icp_create() - to be removed */
 #include "hw/boards.h"
 #include "qapi/error.h"
 #include "sysemu/cpus.h"
@@ -215,6 +214,7 @@ static void spapr_cpu_core_unrealize(DeviceState *dev, Error **errp)
 static void spapr_realize_vcpu(PowerPCCPU *cpu, sPAPRMachineState *spapr,
sPAPRCPUCore *sc, Error **errp)
 {
+sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
 CPUPPCState *env = &cpu->env;
 CPUState *cs = CPU(cpu);
 Error *local_err = NULL;
@@ -233,8 +233,7 @@ static void spapr_realize_vcpu(PowerPCCPU *cpu, sPAPRMachineState *spapr,
 qemu_register_reset(spapr_cpu_reset, cpu);
 spapr_cpu_reset(cpu);
 
-cpu->intc = icp_create(OBJECT(cpu), spapr->icp_type, XICS_FABRIC(spapr),
-   &local_err);
+cpu->intc = smc->irq->cpu_intc_create(spapr, OBJECT(cpu), &local_err);
 if (local_err) {
 goto error_unregister;
 }
diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
index 38ea2da7a094..5efe33826967 100644
--- a/hw/ppc/spapr_irq.c
+++ b/hw/ppc/spapr_irq.c
@@ -191,6 +191,12 @@ static void spapr_irq_print_info_xics(sPAPRMachineState *spapr, Monitor *mon)
 ics_pic_print_info(spapr->ics, mon);
 }
 
+static Object *spapr_irq_cpu_intc_create_xics(sPAPRMachineState *spapr,
+  Object *cpu, Error **errp)
+{
+return icp_create(cpu, spapr->icp_type, XICS_FABRIC(spapr), errp);
+}
+
 #define SPAPR_IRQ_XICS_NR_IRQS 0x1000
 #define SPAPR_IRQ_XICS_NR_MSIS \
 (XICS_IRQ_BASE + SPAPR_IRQ_XICS_NR_IRQS - SPAPR_IRQ_MSI)
@@ -205,6 +211,7 @@ sPAPRIrq spapr_irq_xics = {
 .qirq= spapr_qirq_xics,
 .print_info  = spapr_irq_print_info_xics,
 .dt_populate = spapr_dt_xics,
+.cpu_intc_create = spapr_irq_cpu_intc_create_xics,
 };
 
 /*
@@ -302,6 +309,12 @@ static void spapr_irq_print_info_xive(sPAPRMachineState *spapr,
 spapr_xive_pic_print_info(spapr->xive, mon);
 }
 
+static Object *spapr_irq_cpu_intc_create_xive(sPAPRMachineState *spapr,
+  Object *cpu, Error **errp)
+{
+return xive_tctx_create(cpu, XIVE_ROUTER(spapr->xive), errp);
+}
+
 /*
  * XIVE uses the full IRQ number space. Set it to 

[Qemu-devel] [PATCH v7 11/19] spapr: extend the sPAPR IRQ backend for XICS migration

2018-12-09 Thread Cédric Le Goater
Introduce a new sPAPR IRQ handler to handle resend after migration
when the machine is using a KVM XICS interrupt controller model.

Signed-off-by: Cédric Le Goater 
Reviewed-by: David Gibson 
---
 include/hw/ppc/spapr_irq.h |  2 ++
 hw/ppc/spapr.c | 13 +
 hw/ppc/spapr_irq.c | 27 +++
 3 files changed, 34 insertions(+), 8 deletions(-)

diff --git a/include/hw/ppc/spapr_irq.h b/include/hw/ppc/spapr_irq.h
index 13db0428ab51..84a25ffb6c65 100644
--- a/include/hw/ppc/spapr_irq.h
+++ b/include/hw/ppc/spapr_irq.h
@@ -43,6 +43,7 @@ typedef struct sPAPRIrq {
 void *fdt, uint32_t phandle);
 Object *(*cpu_intc_create)(sPAPRMachineState *spapr, Object *cpu,
Error **errp);
+int (*post_load)(sPAPRMachineState *spapr, int version_id);
 } sPAPRIrq;
 
 extern sPAPRIrq spapr_irq_xics;
@@ -53,6 +54,7 @@ void spapr_irq_init(sPAPRMachineState *spapr, Error **errp);
 int spapr_irq_claim(sPAPRMachineState *spapr, int irq, bool lsi, Error **errp);
 void spapr_irq_free(sPAPRMachineState *spapr, int irq, int num);
 qemu_irq spapr_qirq(sPAPRMachineState *spapr, int irq);
+int spapr_irq_post_load(sPAPRMachineState *spapr, int version_id);
 
 /*
  * XICS legacy routines
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 8ff22cdb79d8..8cea4cad1732 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1730,14 +1730,6 @@ static int spapr_post_load(void *opaque, int version_id)
 return err;
 }
 
-if (!object_dynamic_cast(OBJECT(spapr->ics), TYPE_ICS_KVM)) {
-CPUState *cs;
-CPU_FOREACH(cs) {
-PowerPCCPU *cpu = POWERPC_CPU(cs);
-icp_resend(ICP(cpu->intc));
-}
-}
-
 /* In earlier versions, there was no separate qdev for the PAPR
  * RTC, so the RTC offset was stored directly in sPAPREnvironment.
  * So when migrating from those versions, poke the incoming offset
@@ -1758,6 +1750,11 @@ static int spapr_post_load(void *opaque, int version_id)
 }
 }
 
+err = spapr_irq_post_load(spapr, version_id);
+if (err) {
+return err;
+}
+
 return err;
 }
 
diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
index 5efe33826967..35a067cad3f8 100644
--- a/hw/ppc/spapr_irq.c
+++ b/hw/ppc/spapr_irq.c
@@ -197,6 +197,18 @@ static Object *spapr_irq_cpu_intc_create_xics(sPAPRMachineState *spapr,
 return icp_create(cpu, spapr->icp_type, XICS_FABRIC(spapr), errp);
 }
 
+static int spapr_irq_post_load_xics(sPAPRMachineState *spapr, int version_id)
+{
+if (!object_dynamic_cast(OBJECT(spapr->ics), TYPE_ICS_KVM)) {
+CPUState *cs;
+CPU_FOREACH(cs) {
+PowerPCCPU *cpu = POWERPC_CPU(cs);
+icp_resend(ICP(cpu->intc));
+}
+}
+return 0;
+}
+
 #define SPAPR_IRQ_XICS_NR_IRQS 0x1000
 #define SPAPR_IRQ_XICS_NR_MSIS \
 (XICS_IRQ_BASE + SPAPR_IRQ_XICS_NR_IRQS - SPAPR_IRQ_MSI)
@@ -212,6 +224,7 @@ sPAPRIrq spapr_irq_xics = {
 .print_info  = spapr_irq_print_info_xics,
 .dt_populate = spapr_dt_xics,
 .cpu_intc_create = spapr_irq_cpu_intc_create_xics,
+.post_load   = spapr_irq_post_load_xics,
 };
 
 /*
@@ -315,6 +328,11 @@ static Object *spapr_irq_cpu_intc_create_xive(sPAPRMachineState *spapr,
 return xive_tctx_create(cpu, XIVE_ROUTER(spapr->xive), errp);
 }
 
+static int spapr_irq_post_load_xive(sPAPRMachineState *spapr, int version_id)
+{
+return 0;
+}
+
 /*
  * XIVE uses the full IRQ number space. Set it to 8K to be compatible
  * with XICS.
@@ -334,6 +352,7 @@ sPAPRIrq spapr_irq_xive = {
 .print_info  = spapr_irq_print_info_xive,
 .dt_populate = spapr_dt_xive,
 .cpu_intc_create = spapr_irq_cpu_intc_create_xive,
+.post_load   = spapr_irq_post_load_xive,
 };
 
 /*
@@ -372,6 +391,13 @@ qemu_irq spapr_qirq(sPAPRMachineState *spapr, int irq)
 return smc->irq->qirq(spapr, irq);
 }
 
+int spapr_irq_post_load(sPAPRMachineState *spapr, int version_id)
+{
+sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
+
+return smc->irq->post_load(spapr, version_id);
+}
+
 /*
  * XICS legacy routines - to deprecate one day
  */
@@ -440,4 +466,5 @@ sPAPRIrq spapr_irq_xics_legacy = {
 .print_info  = spapr_irq_print_info_xics,
 .dt_populate = spapr_dt_xics,
 .cpu_intc_create = spapr_irq_cpu_intc_create_xics,
+.post_load   = spapr_irq_post_load_xics,
 };
-- 
2.17.2




[Qemu-devel] [PATCH v7 06/19] spapr/xive: use the VCPU id as a NVT identifier

2018-12-09 Thread Cédric Le Goater
The IVPE scans the O/S CAM line of the XIVE thread interrupt contexts
to find a matching Notification Virtual Target (NVT) among the NVTs
dispatched on the HW processor threads.

On a real system, the thread interrupt contexts are updated by the
hypervisor when a Virtual Processor is scheduled to run on a HW
thread. Under QEMU, the model will emulate the same behavior by
hardwiring the NVT identifier in the thread context registers at
reset.

The NVT identifier used by the sPAPRXive model is the VCPU id. The END
identifier is also derived from the VCPU id. A set of helpers doing
the conversion between identifiers are provided for the hcalls
configuring the sources and the ENDs.

The model does not need a NVT table but the XiveRouter NVT operations
are provided to perform some extra checks in the routing algorithm.
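
The identifier conversions can be sketched as follows. The NVT mapping
(a base of 0x400 plus the VCPU id) is the one used in this series. The
END layout, the VCPU id multiplied by 8 plus the priority, is an
assumption made for the illustration: the series only states that there
are 8 priorities per CPU. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define NVT_BASE       0x400u  /* SPAPR_XIVE_NVT_BASE in this series */
#define NR_PRIORITIES  8u      /* 8 priorities per CPU */

/* NVT identifier <-> VCPU id */
static uint32_t vcpu_to_nvt_idx(uint32_t vcpu_id)
{
    return NVT_BASE + vcpu_id;
}

static uint32_t nvt_idx_to_vcpu(uint32_t nvt_idx)
{
    return nvt_idx - NVT_BASE;
}

/* END identifier <-> (VCPU id, priority); assumed layout */
static uint32_t vcpu_to_end_idx(uint32_t vcpu_id, uint8_t prio)
{
    return vcpu_id * NR_PRIORITIES + prio;
}

static uint32_t end_idx_to_vcpu(uint32_t end_idx)
{
    return end_idx / NR_PRIORITIES;
}

static uint8_t end_idx_to_prio(uint32_t end_idx)
{
    return (uint8_t)(end_idx % NR_PRIORITIES);
}
```

Both mappings are reversible, which is what lets the hcalls translate
between the target/priority pair seen by the guest and the identifiers
used by the router.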

Signed-off-by: Cédric Le Goater 
---

 Changes since v6:

 - simplified the prototypes of helpers
 - introduced an assert in set_nvt() method

 hw/intc/spapr_xive.c | 56 +++-
 1 file changed, 55 insertions(+), 1 deletion(-)

diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
index eef5830d45c6..3ade419fdbb1 100644
--- a/hw/intc/spapr_xive.c
+++ b/hw/intc/spapr_xive.c
@@ -26,6 +26,26 @@
 #define SPAPR_XIVE_VC_BASE   0x00060100ull
 #define SPAPR_XIVE_TM_BASE   0x000603020318ull
 
+/*
+ * The allocation of VP blocks is a complex operation in OPAL and the
+ * VP identifiers have a relation with the number of HW chips, the
+ * size of the VP blocks, VP grouping, etc. The QEMU sPAPR XIVE
+ * controller model does not have the same constraints and can use a
+ * simple mapping scheme of the CPU vcpu_id
+ *
+ * These identifiers are never returned to the OS.
+ */
+
+#define SPAPR_XIVE_NVT_BASE 0x400
+
+/*
+ * sPAPR NVT and END indexing helpers
+ */
+static uint32_t spapr_xive_nvt_to_target(uint8_t nvt_blk, uint32_t nvt_idx)
+{
+return nvt_idx - SPAPR_XIVE_NVT_BASE;
+}
+
 /*
  * On sPAPR machines, use a simplified output for the XIVE END
  * structure dumping only the information related to the OS EQ.
@@ -40,7 +60,8 @@ static void spapr_xive_end_pic_print_info(sPAPRXive *xive, XiveEND *end,
 uint32_t nvt = GETFIELD_BE32(END_W6_NVT_INDEX, end->w6);
 uint8_t priority = GETFIELD_BE32(END_W7_F0_PRIORITY, end->w7);
 
-monitor_printf(mon, "%3d/%d % 6d/%5d ^%d", nvt,
+monitor_printf(mon, "%3d/%d % 6d/%5d ^%d",
+   spapr_xive_nvt_to_target(0, nvt),
priority, qindex, qentries, qgen);
 
 xive_end_queue_pic_print_info(end, 6, mon);
@@ -246,6 +267,37 @@ static int spapr_xive_write_end(XiveRouter *xrtr, uint8_t end_blk,
 return 0;
 }
 
+static int spapr_xive_get_nvt(XiveRouter *xrtr,
+  uint8_t nvt_blk, uint32_t nvt_idx, XiveNVT *nvt)
+{
+uint32_t vcpu_id = spapr_xive_nvt_to_target(nvt_blk, nvt_idx);
+PowerPCCPU *cpu = spapr_find_cpu(vcpu_id);
+
+if (!cpu) {
+/* TODO: should we assert() if we cannot find an NVT? */
+return -1;
+}
+
+/*
+ * sPAPR does not maintain a NVT table. Return that the NVT is
+ * valid if we have found a matching CPU
+ */
+nvt->w0 = cpu_to_be32(NVT_W0_VALID);
+return 0;
+}
+
+static int spapr_xive_write_nvt(XiveRouter *xrtr, uint8_t nvt_blk,
+uint32_t nvt_idx, XiveNVT *nvt,
+uint8_t word_number)
+{
+/*
+ * We don't need to write back to the NVTs because the sPAPR
+ * machine should never hit a non-scheduled NVT. It should never
+ * get called.
+ */
+g_assert_not_reached();
+}
+
 static const VMStateDescription vmstate_spapr_xive_end = {
 .name = TYPE_SPAPR_XIVE "/end",
 .version_id = 1,
@@ -308,6 +360,8 @@ static void spapr_xive_class_init(ObjectClass *klass, void *data)
 xrc->get_eas = spapr_xive_get_eas;
 xrc->get_end = spapr_xive_get_end;
 xrc->write_end = spapr_xive_write_end;
+xrc->get_nvt = spapr_xive_get_nvt;
+xrc->write_nvt = spapr_xive_write_nvt;
 }
 
 static const TypeInfo spapr_xive_info = {
-- 
2.17.2




[Qemu-devel] [PATCH v7 03/19] ppc/xive: introduce a simplified XIVE presenter

2018-12-09 Thread Cédric Le Goater
The last sub-engine of the XIVE architecture is the Interrupt
Virtualization Presentation Engine (IVPE). On HW, the IVRE and the
IVPE share elements, the Power Bus interface (CQ), the routing table
descriptors, and they can be combined in the same HW logic. We do the
same in QEMU and combine both engines in the XiveRouter for
simplicity.

When the IVRE has completed its job of matching an event source with a
Notification Virtual Target (NVT) to notify, it forwards the event
notification to the IVPE sub-engine. The IVPE scans the thread
interrupt contexts of the Notification Virtual Targets (NVT)
dispatched on the HW processor threads and if a match is found, it
signals the thread. If not, the IVPE escalates the notification to
some other targets and records the notification in a backlog queue.

The IVPE maintains the thread interrupt context state for each of its
NVTs not dispatched on HW processor threads in the Notification
Virtual Target table (NVTT).

The model currently only supports single NVT notifications.
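
The scan performed by the presenter can be sketched as below, for the
single NVT case only. The thread_ctx type and presenter_match() are
hypothetical, the contexts are kept in host byte order for simplicity,
and the valid bit value is an assumption. Escalation and the backlog
queue are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CAM_VALID 0x80000000u  /* assumed valid bit of the OS CAM word */

/* Hypothetical thread interrupt context: only the OS CAM word is kept. */
typedef struct thread_ctx {
    uint32_t os_cam;  /* valid bit | CAM line, in host byte order */
} thread_ctx;

static uint32_t nvt_cam_line(uint8_t nvt_blk, uint32_t nvt_idx)
{
    return ((uint32_t)nvt_blk << 19) | nvt_idx;
}

/*
 * Scan the contexts of the threads for an NVT currently dispatched.
 * Returns the index of the matching thread, or -1 if the NVT is not
 * dispatched on any thread.
 */
static int presenter_match(const thread_ctx *tctx, size_t nr_threads,
                           uint8_t nvt_blk, uint32_t nvt_idx)
{
    uint32_t cam = nvt_cam_line(nvt_blk, nvt_idx);
    size_t i;

    for (i = 0; i < nr_threads; i++) {
        if ((tctx[i].os_cam & CAM_VALID) &&
            (tctx[i].os_cam & ~CAM_VALID) == cam) {
            return (int)i;
        }
    }
    return -1;
}
```

A context whose valid bit is clear never matches, even if its CAM line
is the one being looked for.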

Signed-off-by: Cédric Le Goater 
---

 Changes since v6 :

 - removed HW CAM line setting and use as it is only useful for PowerNV
 - made use of xive_tctx_word2() helper
 - made use of GETFIELD_BE32() to compare CAM lines
 - fixed initialization of XiveTCTXMatch

 include/hw/ppc/xive.h  |  14 +++
 include/hw/ppc/xive_regs.h |  24 +
 hw/intc/xive.c | 185 +
 3 files changed, 223 insertions(+)

diff --git a/include/hw/ppc/xive.h b/include/hw/ppc/xive.h
index 1e823a4c64e9..19309d1d65d1 100644
--- a/include/hw/ppc/xive.h
+++ b/include/hw/ppc/xive.h
@@ -325,6 +325,10 @@ typedef struct XiveRouterClass {
XiveEND *end);
 int (*write_end)(XiveRouter *xrtr, uint8_t end_blk, uint32_t end_idx,
  XiveEND *end, uint8_t word_number);
+int (*get_nvt)(XiveRouter *xrtr, uint8_t nvt_blk, uint32_t nvt_idx,
+   XiveNVT *nvt);
+int (*write_nvt)(XiveRouter *xrtr, uint8_t nvt_blk, uint32_t nvt_idx,
+ XiveNVT *nvt, uint8_t word_number);
 } XiveRouterClass;
 
 void xive_eas_pic_print_info(XiveEAS *eas, uint32_t lisn, Monitor *mon);
@@ -335,6 +339,11 @@ int xive_router_get_end(XiveRouter *xrtr, uint8_t end_blk, uint32_t end_idx,
 XiveEND *end);
 int xive_router_write_end(XiveRouter *xrtr, uint8_t end_blk, uint32_t end_idx,
   XiveEND *end, uint8_t word_number);
+int xive_router_get_nvt(XiveRouter *xrtr, uint8_t nvt_blk, uint32_t nvt_idx,
+XiveNVT *nvt);
+int xive_router_write_nvt(XiveRouter *xrtr, uint8_t nvt_blk, uint32_t nvt_idx,
+  XiveNVT *nvt, uint8_t word_number);
+
 
 /*
  * XIVE END ESBs
@@ -411,4 +420,9 @@ extern const MemoryRegionOps xive_tm_ops;
 
 void xive_tctx_pic_print_info(XiveTCTX *tctx, Monitor *mon);
 
+static inline uint32_t xive_nvt_cam_line(uint8_t nvt_blk, uint32_t nvt_idx)
+{
+return (nvt_blk << 19) | nvt_idx;
+}
+
 #endif /* PPC_XIVE_H */
diff --git a/include/hw/ppc/xive_regs.h b/include/hw/ppc/xive_regs.h
index ede3d04c5eda..85557e730cd8 100644
--- a/include/hw/ppc/xive_regs.h
+++ b/include/hw/ppc/xive_regs.h
@@ -186,4 +186,28 @@ typedef struct XiveEND {
 #define GETFIELD_BE32(m, v)   GETFIELD(m, be32_to_cpu(v))
 #define SETFIELD_BE32(m, v, val)  cpu_to_be32(SETFIELD(m, be32_to_cpu(v), val))
 
+/* Notification Virtual Target (NVT) */
+typedef struct XiveNVT {
+uint32_t w0;
+#define NVT_W0_VALID PPC_BIT32(0)
+uint32_t w1;
+uint32_t w2;
+uint32_t w3;
+uint32_t w4;
+uint32_t w5;
+uint32_t w6;
+uint32_t w7;
+uint32_t w8;
+#define NVT_W8_GRP_VALID PPC_BIT32(0)
+uint32_t w9;
+uint32_t wa;
+uint32_t wb;
+uint32_t wc;
+uint32_t wd;
+uint32_t we;
+uint32_t wf;
+} XiveNVT;
+
+#define xive_nvt_is_valid(nvt) (be32_to_cpu((nvt)->w0) & NVT_W0_VALID)
+
 #endif /* PPC_XIVE_REGS_H */
diff --git a/hw/intc/xive.c b/hw/intc/xive.c
index 2615d16b7437..3eecffe99b3a 100644
--- a/hw/intc/xive.c
+++ b/hw/intc/xive.c
@@ -983,6 +983,183 @@ int xive_router_write_end(XiveRouter *xrtr, uint8_t end_blk, uint32_t end_idx,
return xrc->write_end(xrtr, end_blk, end_idx, end, word_number);
 }
 
+int xive_router_get_nvt(XiveRouter *xrtr, uint8_t nvt_blk, uint32_t nvt_idx,
+XiveNVT *nvt)
+{
+   XiveRouterClass *xrc = XIVE_ROUTER_GET_CLASS(xrtr);
+
+   return xrc->get_nvt(xrtr, nvt_blk, nvt_idx, nvt);
+}
+
+int xive_router_write_nvt(XiveRouter *xrtr, uint8_t nvt_blk, uint32_t nvt_idx,
+XiveNVT *nvt, uint8_t word_number)
+{
+   XiveRouterClass *xrc = XIVE_ROUTER_GET_CLASS(xrtr);
+
+   return xrc->write_nvt(xrtr, nvt_blk, nvt_idx, nvt, word_number);
+}
+
+/*
+ * The thread context 

[Qemu-devel] [PATCH v7 09/19] spapr: add device tree support for the XIVE exploitation mode

2018-12-09 Thread Cédric Le Goater
The XIVE interface for the guest is described in the device tree under
the "interrupt-controller" node. A couple of new properties are
specific to XIVE :

 - "reg"

   contains the base address and size of the thread interrupt
   management areas (TIMA), for the User level and for the Guest OS
   level. Only the Guest OS level is taken into account today.

 - "ibm,xive-eq-sizes"

   the size of the event queues. One cell per size supported, contains
   log2 of size, in ascending order.

 - "ibm,xive-lisn-ranges"

   the IRQ interrupt number ranges assigned to the guest for the IPIs.

and also under the root node :

 - "ibm,plat-res-int-priorities"

   contains a list of priorities that the hypervisor has reserved for
   its own use. OPAL uses the priority 7 queue to automatically
   escalate interrupts for all other queues (DD2.X POWER9). So only
   priorities [0..6] are allowed for the guest.
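
The values carried by these properties can be checked with a short
sketch. The log2 sizes (12, 16, 21, 24) are the ones advertised by this
patch and the reserved priority is the one described above; the helper
names are hypothetical and the device tree encoding itself (big-endian
cells) is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* "ibm,xive-eq-sizes": log2 of each supported size, ascending order */
static const uint32_t eq_size_log2[] = { 12, 16, 21, 24 };
#define NR_EQ_SIZES (sizeof(eq_size_log2) / sizeof(eq_size_log2[0]))

/* Size in bytes of an event queue advertised as log2_size. */
static uint64_t eq_size_bytes(uint32_t log2_size)
{
    return UINT64_C(1) << log2_size;
}

/* True if the guest may configure a queue of this size. */
static bool eq_size_is_supported(uint32_t log2_size)
{
    size_t i;

    for (i = 0; i < NR_EQ_SIZES; i++) {
        if (eq_size_log2[i] == log2_size) {
            return true;
        }
    }
    return false;
}

/* Priority 7 and above are kept by the hypervisor; guests use 0..6. */
static bool priority_is_reserved(uint8_t prio)
{
    return prio > 6;
}
```

The four cells therefore stand for queues of 4K, 64K, 2M and 16M.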

Extend the sPAPR IRQ backend with a new handler to populate the DT
with the appropriate "interrupt-controller" node.

Signed-off-by: Cédric Le Goater 
---
 include/hw/ppc/spapr_irq.h  |  2 ++
 include/hw/ppc/spapr_xive.h |  2 ++
 include/hw/ppc/xics.h   |  4 +--
 hw/intc/spapr_xive.c| 64 +
 hw/intc/xics_spapr.c|  3 +-
 hw/ppc/spapr.c  |  3 +-
 hw/ppc/spapr_irq.c  |  3 ++
 7 files changed, 77 insertions(+), 4 deletions(-)

diff --git a/include/hw/ppc/spapr_irq.h b/include/hw/ppc/spapr_irq.h
index 23cdb51b879e..e51e9f052f63 100644
--- a/include/hw/ppc/spapr_irq.h
+++ b/include/hw/ppc/spapr_irq.h
@@ -39,6 +39,8 @@ typedef struct sPAPRIrq {
 void (*free)(sPAPRMachineState *spapr, int irq, int num);
 qemu_irq (*qirq)(sPAPRMachineState *spapr, int irq);
 void (*print_info)(sPAPRMachineState *spapr, Monitor *mon);
+void (*dt_populate)(sPAPRMachineState *spapr, uint32_t nr_servers,
+void *fdt, uint32_t phandle);
 } sPAPRIrq;
 
 extern sPAPRIrq spapr_irq_xics;
diff --git a/include/hw/ppc/spapr_xive.h b/include/hw/ppc/spapr_xive.h
index 9506a8f4d10a..728a5e8dc163 100644
--- a/include/hw/ppc/spapr_xive.h
+++ b/include/hw/ppc/spapr_xive.h
@@ -45,5 +45,7 @@ qemu_irq spapr_xive_qirq(sPAPRXive *xive, uint32_t lisn);
 typedef struct sPAPRMachineState sPAPRMachineState;
 
 void spapr_xive_hcall_init(sPAPRMachineState *spapr);
+void spapr_dt_xive(sPAPRMachineState *spapr, uint32_t nr_servers, void *fdt,
+   uint32_t phandle);
 
 #endif /* PPC_SPAPR_XIVE_H */
diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
index 9958443d1984..14afda198cdb 100644
--- a/include/hw/ppc/xics.h
+++ b/include/hw/ppc/xics.h
@@ -181,8 +181,6 @@ typedef struct XICSFabricClass {
 ICPState *(*icp_get)(XICSFabric *xi, int server);
 } XICSFabricClass;
 
-void spapr_dt_xics(int nr_servers, void *fdt, uint32_t phandle);
-
 ICPState *xics_icp_get(XICSFabric *xi, int server);
 
 /* Internal XICS interfaces */
@@ -204,6 +202,8 @@ void icp_resend(ICPState *ss);
 
 typedef struct sPAPRMachineState sPAPRMachineState;
 
+void spapr_dt_xics(sPAPRMachineState *spapr, uint32_t nr_servers, void *fdt,
+   uint32_t phandle);
 int xics_kvm_init(sPAPRMachineState *spapr, Error **errp);
 void xics_spapr_init(sPAPRMachineState *spapr);
 
diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
index 982ac6e17051..a6d854b07690 100644
--- a/hw/intc/spapr_xive.c
+++ b/hw/intc/spapr_xive.c
@@ -14,6 +14,7 @@
 #include "target/ppc/cpu.h"
 #include "sysemu/cpus.h"
 #include "monitor/monitor.h"
+#include "hw/ppc/fdt.h"
 #include "hw/ppc/spapr.h"
 #include "hw/ppc/spapr_xive.h"
 #include "hw/ppc/xive.h"
@@ -1381,3 +1382,66 @@ void spapr_xive_hcall_init(sPAPRMachineState *spapr)
 spapr_register_hypercall(H_INT_SYNC, h_int_sync);
 spapr_register_hypercall(H_INT_RESET, h_int_reset);
 }
+
+void spapr_dt_xive(sPAPRMachineState *spapr, uint32_t nr_servers, void *fdt,
+   uint32_t phandle)
+{
+sPAPRXive *xive = spapr->xive;
+int node;
+uint64_t timas[2 * 2];
+/* Interrupt number ranges for the IPIs */
+uint32_t lisn_ranges[] = {
+cpu_to_be32(0),
+cpu_to_be32(nr_servers),
+};
+uint32_t eq_sizes[] = {
+cpu_to_be32(12), /* 4K */
+cpu_to_be32(16), /* 64K */
+cpu_to_be32(21), /* 2M */
+cpu_to_be32(24), /* 16M */
+};
+/* The following array is in sync with the reserved priorities
+ * defined by the 'spapr_xive_priority_is_reserved' routine.
+ */
+uint32_t plat_res_int_priorities[] = {
+cpu_to_be32(7),/* start */
+cpu_to_be32(0xf8), /* count */
+};
+gchar *nodename;
+
+/* Thread Interrupt Management Area : User (ring 3) and OS (ring 2) */
+timas[0] = cpu_to_be64(xive->tm_base +
+   XIVE_TM_USER_PAGE * (1ull << TM_SHIFT));
+timas[1] = cpu_to_be64(1ull << TM_SHIFT);
+timas[2] = cpu_to_be64(xive->tm_base +
+   XIVE_TM_OS_PAGE * (1ull << 

[Qemu-devel] [PATCH v7 05/19] spapr/xive: introduce a XIVE interrupt controller

2018-12-09 Thread Cédric Le Goater
sPAPRXive models the XIVE interrupt controller of the sPAPR machine.
It inherits from the XiveRouter and provisions storage for the routing
tables :

  - Event Assignment Structure (EAS)
  - Event Notification Descriptor (END)

The sPAPRXive model incorporates an internal XiveSource for the IPIs
and for the interrupts of the virtual devices of the guest. This model
is consistent with XIVE architecture which also incorporates an
internal IVSE for IPIs and accelerator interrupts in the IVRE
sub-engine.

The sPAPRXive model exports two memory regions, one for the ESB
trigger and management pages used to control the sources and one for
the TIMA pages. They are mapped by default at the addresses found on
chip 0 of a baremetal system. This is also consistent with the XIVE
architecture which defines a Virtualization Controller BAR for the
internal IVSE ESB pages and a Thread Management BAR for the TIMA.

Signed-off-by: Cédric Le Goater 
Reviewed-by: David Gibson 
---
 default-configs/ppc64-softmmu.mak |   1 +
 include/hw/ppc/spapr_xive.h   |  45 
 hw/intc/spapr_xive.c  | 366 ++
 hw/intc/Makefile.objs |   1 +
 4 files changed, 413 insertions(+)
 create mode 100644 include/hw/ppc/spapr_xive.h
 create mode 100644 hw/intc/spapr_xive.c

diff --git a/default-configs/ppc64-softmmu.mak b/default-configs/ppc64-softmmu.mak
index 2d1e7c5c4668..7f34ad0528ed 100644
--- a/default-configs/ppc64-softmmu.mak
+++ b/default-configs/ppc64-softmmu.mak
@@ -17,6 +17,7 @@ CONFIG_XICS=$(CONFIG_PSERIES)
 CONFIG_XICS_SPAPR=$(CONFIG_PSERIES)
 CONFIG_XICS_KVM=$(call land,$(CONFIG_PSERIES),$(CONFIG_KVM))
 CONFIG_XIVE=$(CONFIG_PSERIES)
+CONFIG_XIVE_SPAPR=$(CONFIG_PSERIES)
 CONFIG_MEM_DEVICE=y
 CONFIG_DIMM=y
 CONFIG_SPAPR_RNG=y
diff --git a/include/hw/ppc/spapr_xive.h b/include/hw/ppc/spapr_xive.h
new file mode 100644
index ..f087959b9924
--- /dev/null
+++ b/include/hw/ppc/spapr_xive.h
@@ -0,0 +1,45 @@
+/*
+ * QEMU PowerPC sPAPR XIVE interrupt controller model
+ *
+ * Copyright (c) 2017-2018, IBM Corporation.
+ *
+ * This code is licensed under the GPL version 2 or later. See the
+ * COPYING file in the top-level directory.
+ */
+
+#ifndef PPC_SPAPR_XIVE_H
+#define PPC_SPAPR_XIVE_H
+
+#include "hw/ppc/xive.h"
+
+#define TYPE_SPAPR_XIVE "spapr-xive"
+#define SPAPR_XIVE(obj) OBJECT_CHECK(sPAPRXive, (obj), TYPE_SPAPR_XIVE)
+
+typedef struct sPAPRXive {
+XiveRouterparent;
+
+/* Internal interrupt source for IPIs and virtual devices */
+XiveSourcesource;
+hwaddrvc_base;
+
+/* END ESB MMIOs */
+XiveENDSource end_source;
+hwaddrend_base;
+
+/* Routing table */
+XiveEAS   *eat;
+uint32_t  nr_irqs;
+XiveEND   *endt;
+uint32_t  nr_ends;
+
+/* TIMA mapping address */
+hwaddrtm_base;
+MemoryRegion  tm_mmio;
+} sPAPRXive;
+
+bool spapr_xive_irq_claim(sPAPRXive *xive, uint32_t lisn, bool lsi);
+bool spapr_xive_irq_free(sPAPRXive *xive, uint32_t lisn);
+void spapr_xive_pic_print_info(sPAPRXive *xive, Monitor *mon);
+qemu_irq spapr_xive_qirq(sPAPRXive *xive, uint32_t lisn);
+
+#endif /* PPC_SPAPR_XIVE_H */
diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
new file mode 100644
index ..eef5830d45c6
--- /dev/null
+++ b/hw/intc/spapr_xive.c
@@ -0,0 +1,366 @@
+/*
+ * QEMU PowerPC sPAPR XIVE interrupt controller model
+ *
+ * Copyright (c) 2017-2018, IBM Corporation.
+ *
+ * This code is licensed under the GPL version 2 or later. See the
+ * COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/log.h"
+#include "qapi/error.h"
+#include "qemu/error-report.h"
+#include "target/ppc/cpu.h"
+#include "sysemu/cpus.h"
+#include "monitor/monitor.h"
+#include "hw/ppc/spapr.h"
+#include "hw/ppc/spapr_xive.h"
+#include "hw/ppc/xive.h"
+#include "hw/ppc/xive_regs.h"
+
+/*
+ * XIVE Virtualization Controller BAR and Thread Managment BAR that we
+ * use for the ESB pages and the TIMA pages
+ */
+#define SPAPR_XIVE_VC_BASE   0x00060100ull
+#define SPAPR_XIVE_TM_BASE   0x000603020318ull
+
+/*
+ * On sPAPR machines, use a simplified output for the XIVE END
+ * structure dumping only the information related to the OS EQ.
+ */
+static void spapr_xive_end_pic_print_info(sPAPRXive *xive, XiveEND *end,
+  Monitor *mon)
+{
+uint32_t qindex = GETFIELD_BE32(END_W1_PAGE_OFF, end->w1);
+uint32_t qgen = GETFIELD_BE32(END_W1_GENERATION, end->w1);
+uint32_t qsize = GETFIELD_BE32(END_W0_QSIZE, end->w0);
+uint32_t qentries = 1 << (qsize + 10);
+uint32_t nvt = GETFIELD_BE32(END_W6_NVT_INDEX, end->w6);
+uint8_t priority = GETFIELD_BE32(END_W7_F0_PRIORITY, end->w7);
+
+monitor_printf(mon, "%3d/%d % 6d/%5d ^%d", nvt,
+   priority, qindex, qentries, qgen);
+
+xive_end_queue_pic_print_info(end, 6, mon);
+monitor_printf(mon, "]");
+}
+
+void 

[Qemu-devel] [PATCH v7 07/19] spapr: introduce a new machine IRQ backend for XIVE

2018-12-09 Thread Cédric Le Goater
The XIVE IRQ backend uses the same layout as the new XICS backend but
covers the full range of the IRQ number space. The IRQ numbers for the
CPU IPIs are allocated at the bottom of this space, below 4K, to
preserve compatibility with XICS which does not use that range.

This should be enough given that the maximum number of CPUs is 1024
for the sPAPR machine under QEMU. For the record, the biggest POWER8
or POWER9 system has a maximum of 1536 HW threads (16 sockets, 192
cores, SMT8).
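
To illustrate the layout described above, here is a minimal sketch (not
code from this patch). SPAPR_IRQ_IPI and SPAPR_IRQ_EPOW are the offsets
defined in spapr_irq.h; the sketch_* helpers and SKETCH_MAX_CPUS are
hypothetical names used only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SPAPR_IRQ_IPI    0x0     /* IPIs start at the bottom of the space */
#define SPAPR_IRQ_EPOW   0x1000  /* first number also used by XICS */
#define SKETCH_MAX_CPUS  1024    /* hypothetical: sPAPR CPU limit under QEMU */

/* IRQ number of the IPI assigned to a given server (vCPU) */
static uint32_t sketch_ipi_lisn(uint32_t server)
{
    assert(server < SKETCH_MAX_CPUS);
    return SPAPR_IRQ_IPI + server;
}

/* True if the IRQ number falls in the range below 4K reserved for IPIs */
static bool sketch_lisn_is_ipi(uint32_t lisn)
{
    return lisn < SPAPR_IRQ_EPOW;
}
```

With 1024 CPUs at most, the highest IPI number is 1023, well below the
4K boundary where the XICS-compatible ranges start.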

Signed-off-by: Cédric Le Goater 
---
 include/hw/ppc/spapr.h |   2 +
 include/hw/ppc/spapr_irq.h |   2 +
 hw/ppc/spapr_irq.c | 113 +
 3 files changed, 117 insertions(+)

diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 198764066dc9..cb3082d319af 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -16,6 +16,7 @@ typedef struct sPAPREventLogEntry sPAPREventLogEntry;
 typedef struct sPAPREventSource sPAPREventSource;
 typedef struct sPAPRPendingHPT sPAPRPendingHPT;
 typedef struct ICSState ICSState;
+typedef struct sPAPRXive sPAPRXive;
 
 #define HPTE64_V_HPTE_DIRTY 0x0040ULL
 #define SPAPR_ENTRY_POINT   0x100
@@ -175,6 +176,7 @@ struct sPAPRMachineState {
 const char *icp_type;
 int32_t irq_map_nr;
 unsigned long *irq_map;
+sPAPRXive  *xive;
 
 bool cmd_line_caps[SPAPR_CAP_NUM];
 sPAPRCapabilities def, eff, mig;
diff --git a/include/hw/ppc/spapr_irq.h b/include/hw/ppc/spapr_irq.h
index bd7301e6d9c6..23cdb51b879e 100644
--- a/include/hw/ppc/spapr_irq.h
+++ b/include/hw/ppc/spapr_irq.h
@@ -13,6 +13,7 @@
 /*
  * IRQ range offsets per device type
  */
+#define SPAPR_IRQ_IPI0x0
 #define SPAPR_IRQ_EPOW   0x1000  /* XICS_IRQ_BASE offset */
 #define SPAPR_IRQ_HOTPLUG0x1001
 #define SPAPR_IRQ_VIO0x1100  /* 256 VIO devices */
@@ -42,6 +43,7 @@ typedef struct sPAPRIrq {
 
 extern sPAPRIrq spapr_irq_xics;
 extern sPAPRIrq spapr_irq_xics_legacy;
+extern sPAPRIrq spapr_irq_xive;
 
 void spapr_irq_init(sPAPRMachineState *spapr, Error **errp);
 int spapr_irq_claim(sPAPRMachineState *spapr, int irq, bool lsi, Error **errp);
diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
index f8b651de0ec9..0bf47ff9fa26 100644
--- a/hw/ppc/spapr_irq.c
+++ b/hw/ppc/spapr_irq.c
@@ -12,6 +12,7 @@
 #include "qemu/error-report.h"
 #include "qapi/error.h"
 #include "hw/ppc/spapr.h"
+#include "hw/ppc/spapr_xive.h"
 #include "hw/ppc/xics.h"
 #include "sysemu/kvm.h"
 
@@ -205,6 +206,118 @@ sPAPRIrq spapr_irq_xics = {
 .print_info  = spapr_irq_print_info_xics,
 };
 
+/*
+ * XIVE IRQ backend.
+ */
+static sPAPRXive *spapr_xive_create(sPAPRMachineState *spapr, int nr_irqs,
+int nr_servers, Error **errp)
+{
+sPAPRXive *xive;
+Error *local_err = NULL;
+Object *obj;
+uint32_t nr_ends = nr_servers << 3; /* 8 priority ENDs per CPU */
+int i;
+
+/* TODO : use qdev_create() ? */
+obj = object_new(TYPE_SPAPR_XIVE);
+object_property_set_int(obj, nr_irqs, "nr-irqs", &error_abort);
+object_property_set_int(obj, nr_ends, "nr-ends", &error_abort);
+object_property_set_bool(obj, true, "realized", &local_err);
+if (local_err) {
+error_propagate(errp, local_err);
+return NULL;
+}
+qdev_set_parent_bus(DEVICE(obj), sysbus_get_default());
+xive = SPAPR_XIVE(obj);
+
+/* Enable the CPU IPIs */
+for (i = 0; i < nr_servers; ++i) {
+spapr_xive_irq_claim(xive, SPAPR_IRQ_IPI + i, false);
+}
+
+return xive;
+}
+
+static void spapr_irq_init_xive(sPAPRMachineState *spapr, Error **errp)
+{
+MachineState *machine = MACHINE(spapr);
+sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
+int nr_irqs = smc->irq->nr_irqs;
+Error *local_err = NULL;
+
+/* KVM XIVE device not yet available */
+if (kvm_enabled()) {
+if (machine_kernel_irqchip_required(machine)) {
+error_setg(errp, "kernel_irqchip requested. no KVM XIVE support");
+return;
+}
+}
+
+spapr->xive = spapr_xive_create(spapr, nr_irqs,
+spapr_max_server_number(spapr), &local_err);
+if (local_err) {
+error_propagate(errp, local_err);
+return;
+}
+}
+
+static int spapr_irq_claim_xive(sPAPRMachineState *spapr, int irq, bool lsi,
+Error **errp)
+{
+if (!spapr_xive_irq_claim(spapr->xive, irq, lsi)) {
+error_setg(errp, "IRQ %d is invalid", irq);
+return -1;
+}
+return 0;
+}
+
+static void spapr_irq_free_xive(sPAPRMachineState *spapr, int irq, int num)
+{
+int i;
+
+for (i = irq; i < irq + num; ++i) {
+spapr_xive_irq_free(spapr->xive, i);
+}
+}
+
+static qemu_irq spapr_qirq_xive(sPAPRMachineState *spapr, int irq)
+{
+return spapr_xive_qirq(spapr->xive, irq);
+}
+
+static void spapr_irq_print_info_xive(sPAPRMachineState *spapr,
+  Monitor *mon)

[Qemu-devel] [PATCH v7 01/19] ppc/xive: add support for the END Event State Buffers

2018-12-09 Thread Cédric Le Goater
The Event Notification Descriptor (END) XIVE structure also contains
two Event State Buffers providing further coalescing of interrupts,
one for the notification event (ESn) and one for the escalation events
(ESe). A MMIO page is assigned for each to control the EOI through
loads only. Stores are not allowed.

The END ESBs are modeled through an object resembling the 'XiveSource'.
It is stateless as the END state bits are backed into the XiveEND
structure under the XiveRouter and the MMIO accesses follow the same
rules as for the XiveSource ESBs.
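
To illustrate how a load address in the END ESB region selects an END
and one of its two ESBs, here is a minimal sketch mirroring the
arithmetic of xive_end_source_read() in the diff of this patch. The
sketch_* names are hypothetical, and the even/odd page test is an
assumption about what addr_is_even() does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Each END owns two consecutive ESB pages of (1 << esb_shift) bytes */
static uint32_t sketch_end_index(uint64_t addr, uint32_t esb_shift)
{
    assert(esb_shift > 0 && esb_shift < 32);
    return (uint32_t)(addr >> (esb_shift + 1));
}

/* Assumed: the even page is the notification ESB (ESn), the odd page
 * is the escalation ESB (ESe) */
static bool sketch_is_notification_esb(uint64_t addr, uint32_t esb_shift)
{
    assert(esb_shift > 0 && esb_shift < 32);
    return ((addr >> esb_shift) & 1) == 0;
}
```

For example, with 64K pages (esb_shift = 16), offsets 0x20000 and
0x30000 both belong to END 1: the first is its ESn page and the second
its ESe page.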

END ESBs are not supported by the Linux drivers, on either OPAL or
sPAPR. Nevertheless, the model provides a means to study the question
in the future and further validates the XIVE model.

Signed-off-by: Cédric Le Goater 
---

 Changes since v6:

 - removed the 'chip-id' field from XiveRouter
 - introduced a 'block-id' field in XiveENDSource to lookup the XIVE
   END structure when doing a load in the MMIO ESB
 - removed reset XiveENDSource handler

 include/hw/ppc/xive.h |  21 ++
 hw/intc/xive.c| 160 +-
 2 files changed, 179 insertions(+), 2 deletions(-)

diff --git a/include/hw/ppc/xive.h b/include/hw/ppc/xive.h
index 4851d3b3a41f..014f64aa98f6 100644
--- a/include/hw/ppc/xive.h
+++ b/include/hw/ppc/xive.h
@@ -336,6 +336,27 @@ int xive_router_get_end(XiveRouter *xrtr, uint8_t end_blk, uint32_t end_idx,
 int xive_router_write_end(XiveRouter *xrtr, uint8_t end_blk, uint32_t end_idx,
   XiveEND *end, uint8_t word_number);
 
+/*
+ * XIVE END ESBs
+ */
+
+#define TYPE_XIVE_END_SOURCE "xive-end-source"
+#define XIVE_END_SOURCE(obj) \
+OBJECT_CHECK(XiveENDSource, (obj), TYPE_XIVE_END_SOURCE)
+
+typedef struct XiveENDSource {
+DeviceState parent;
+
+uint32_tnr_ends;
+uint8_t block_id;
+
+/* ESB memory region */
+uint32_tesb_shift;
+MemoryRegionesb_mmio;
+
+XiveRouter  *xrtr;
+} XiveENDSource;
+
 /*
  * For legacy compatibility, the exceptions define up to 256 different
  * priorities. P9 implements only 9 levels : 8 active levels [0 - 7]
diff --git a/hw/intc/xive.c b/hw/intc/xive.c
index 41d8ba1540d0..2196ce8de059 100644
--- a/hw/intc/xive.c
+++ b/hw/intc/xive.c
@@ -612,8 +612,18 @@ static void xive_router_end_notify(XiveRouter *xrtr, uint8_t end_blk,
  * even futher coalescing in the Router
  */
 if (!xive_end_is_notify(&end)) {
-qemu_log_mask(LOG_UNIMP, "XIVE: !UCOND_NOTIFY not implemented\n");
-return;
+uint8_t pq = GETFIELD_BE32(END_W1_ESn, end.w1);
+bool notify = xive_esb_trigger(&pq);
+
+if (pq != GETFIELD_BE32(END_W1_ESn, end.w1)) {
+end.w1 = SETFIELD_BE32(END_W1_ESn, end.w1, pq);
+xive_router_write_end(xrtr, end_blk, end_idx, &end, 1);
+}
+
+/* ESn[Q]=1 : end of notification */
+if (!notify) {
+return;
+}
 }
 
 /*
@@ -692,6 +702,151 @@ void xive_eas_pic_print_info(XiveEAS *eas, uint32_t lisn, Monitor *mon)
(uint32_t) GETFIELD_BE64(EAS_END_DATA, eas->w));
 }
 
+/*
+ * END ESB MMIO loads
+ */
+static uint64_t xive_end_source_read(void *opaque, hwaddr addr, unsigned size)
+{
+XiveENDSource *xsrc = XIVE_END_SOURCE(opaque);
+uint32_t offset = addr & 0xFFF;
+uint8_t end_blk;
+uint32_t end_idx;
+XiveEND end;
+uint32_t end_esmask;
+uint8_t pq;
+uint64_t ret = -1;
+
+end_blk = xsrc->block_id;
+end_idx = addr >> (xsrc->esb_shift + 1);
+
+if (xive_router_get_end(xsrc->xrtr, end_blk, end_idx, &end)) {
+qemu_log_mask(LOG_GUEST_ERROR, "XIVE: No END %x/%x\n", end_blk,
+  end_idx);
+return -1;
+}
+
+if (!xive_end_is_valid(&end)) {
+qemu_log_mask(LOG_GUEST_ERROR, "XIVE: END %x/%x is invalid\n",
+  end_blk, end_idx);
+return -1;
+}
+
+end_esmask = addr_is_even(addr, xsrc->esb_shift) ? END_W1_ESn : END_W1_ESe;
+pq = GETFIELD_BE32(end_esmask, end.w1);
+
+switch (offset) {
+case XIVE_ESB_LOAD_EOI ... XIVE_ESB_LOAD_EOI + 0x7FF:
+ret = xive_esb_eoi(&pq);
+
+/* Forward the source event notification for routing ?? */
+break;
+
+case XIVE_ESB_GET ... XIVE_ESB_GET + 0x3FF:
+ret = pq;
+break;
+
+case XIVE_ESB_SET_PQ_00 ... XIVE_ESB_SET_PQ_00 + 0x0FF:
+case XIVE_ESB_SET_PQ_01 ... XIVE_ESB_SET_PQ_01 + 0x0FF:
+case XIVE_ESB_SET_PQ_10 ... XIVE_ESB_SET_PQ_10 + 0x0FF:
+case XIVE_ESB_SET_PQ_11 ... XIVE_ESB_SET_PQ_11 + 0x0FF:
+ret = xive_esb_set(&pq, (offset >> 8) & 0x3);
+break;
+default:
+qemu_log_mask(LOG_GUEST_ERROR, "XIVE: invalid END ESB load addr %d\n",
+  offset);
+return -1;
+}
+
+if (pq != GETFIELD_BE32(end_esmask, end.w1)) {
+end.w1 = SETFIELD_BE32(end_esmask, end.w1, pq);
+xive_router_write_end(xsrc->xrtr, end_blk, end_idx, &end, 1);
+}
+
+

[Qemu-devel] [PATCH v7 18/19] spapr: add a 'pseries-4.0-xive' machine type

2018-12-09 Thread Cédric Le Goater
This pseries machine makes use of a new sPAPR IRQ backend supporting
the XIVE interrupt mode.

The guest OS is required to have support for the XIVE exploitation
mode of the POWER9 interrupt controller.

Signed-off-by: Cédric Le Goater 
---
 hw/ppc/spapr.c | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 4012ebd794a4..3cc134a0b673 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -3985,6 +3985,21 @@ static void spapr_machine_4_0_class_options(MachineClass *mc)
 
 DEFINE_SPAPR_MACHINE(4_0, "4.0", true);
 
+static void spapr_machine_4_0_xive_instance_options(MachineState *machine)
+{
+spapr_machine_4_0_instance_options(machine);
+}
+
+static void spapr_machine_4_0_xive_class_options(MachineClass *mc)
+{
+sPAPRMachineClass *smc = SPAPR_MACHINE_CLASS(mc);
+
+spapr_machine_4_0_class_options(mc);
+smc->irq = &spapr_irq_xive;
+}
+
+DEFINE_SPAPR_MACHINE(4_0_xive, "4.0-xive", false);
+
 /*
  * pseries-3.1
  */
-- 
2.17.2




[Qemu-devel] [PATCH v7 00/19] ppc: support for the XIVE interrupt controller (POWER9)

2018-12-09 Thread Cédric Le Goater
Hello,

Here is version 7 of the QEMU models adding support for the XIVE
interrupt controller to the sPAPR machine, under TCG only this
time. KVM support will be proposed in another patchset, along with
the KVM XIVE device patchset, and so will PowerNV.

The most important change for sPAPR is the introduction of the 4.0
machines. The sPAPRXive model still inherits from the XiveRouter. It
is possible to change the class inheritance tree but it does not bring
much for now.

I am not sure how we should handle the machine definitions, so I
proposed both, XIVE only and dual interrupt mode. The impact on the
XICS machine is limited with TCG but KVM support of the 'dual' machine
will change things. Let me know how you want to proceed.

Thanks,

C.

Changes in v7 :

 Common XIVE models :
 
 - removed the 'chip-id' field from XiveRouter
 - introduced a 'block-id' field in XiveENDSource to lookup the XIVE
   END structure when doing a load in the MMIO ESB
 - removed reset XiveENDSource handler
 - introduced a xive_tctx_word2() helper to extract TM_WORD2 of a ring.
 - removed HW CAM line setting and use as it is only useful for PowerNV
 - made use of xive_tctx_word2() helper
 - made use of GETFIELD_BE32() to compare CAM lines
 - fixed initialization of XiveTCTXMatch

 sPAPR models :

 - simplified the prototypes of helpers
 - introduced an assert in set_nvt() method
 - introduced a fixed value for the controller block id value.
 - removed the hardwiring of the HW CAM line. Back to v5 state.
 - removed patch "spapr: modify the irq backend 'init' method". It did
   not bring much
 - split the 'xive' sPAPR IRQ backend from the 'xive' machine
 - split the 'dual' sPAPR IRQ backend from the 'dual' machine
 - introduced 4.0* machines
 
 KVM :

 - hardly any changes
 - will come later in a KVM patchset

 PowerNV:

 - will come later in a PowerNV patchset

Changes in v6 :

 https://lists.gnu.org/archive/html/qemu-devel/2018-12/msg00965.html

Changes in v5 :

 https://lists.gnu.org/archive/html/qemu-devel/2018-11/msg03218.html

Changes in v4 :

 https://lists.gnu.org/archive/html/qemu-devel/2018-06/msg01672.html


= XIVE =


The POWER9 processor comes with a new interrupt controller, called
XIVE, for "eXternal Interrupt Virtualization Engine".


* Overall architecture


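              XIVE Interrupt Controller
              +------------------------------------+      IPIs
              | +---------+ +---------+ +--------+ |    +-------+
              | |VC       | |CQ       | |PC      |----> | CORES |
              | |     esb | |         | |        |----> |       |
              | |     eas | |  Bridge | |   tctx |----> |       |
              | |SC   end | |         | |    nvt | |    |       |
  +------+    | +---------+ +----+----+ +--------+ |    +-+-+-+-+
  | RAM  |    +------------------|-----------------+      | | |
  |      |                       |                        | | |
  |      |                       |                        | | |
  |      |  +--------------------v------------------------v-v-v--+    other
  |      <--+                     Power Bus                      +--> chips
  |  esb |  +---------+-----------------------+------------------+
  |  eas |            |                       |
  |  end |         +--|------+                |
  |  nvt |       +----+----+ |           +----+----+
  +------+       |SC       | |           |SC       |
                 |         | |           |         |
                 | PQ-bits | |           | PQ-bits |
                 | local   |-+           |  in VC  |
                 +---------+             +---------+
                    PCIe                 NX,NPU,CAPI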

  SC: Source Controller (aka. IVSE)
  VC: Virtualization Controller (aka. IVRE)
  PC: Presentation Controller (aka. IVPE)
  CQ: Common Queue (Bridge)

 PQ-bits: 2 bits source state machine (P:pending Q:queued)
 esb: Event State Buffer (Array of PQ bits in an IVSE)
 eas: Event Assignment Structure
 end: Event Notification Descriptor
 nvt: Notification Virtual Target
tctx: Thread interrupt Context
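
To make the 2-bit PQ state machine of the legend more concrete, here is
a minimal sketch of how a trigger and an EOI could update the PQ bits.
It is not taken from this series: the SKETCH_* names and sketch_*
helpers are hypothetical, and the encoding (P in bit 1, Q in bit 0,
with 01 meaning "off") is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed encoding: P is bit 1, Q is bit 0 */
enum {
    SKETCH_PQ_RESET   = 0x0, /* P=0 Q=0: idle, next trigger is forwarded */
    SKETCH_PQ_OFF     = 0x1, /* P=0 Q=1: source is masked */
    SKETCH_PQ_PENDING = 0x2, /* P=1 Q=0: event forwarded, waiting for EOI */
    SKETCH_PQ_QUEUED  = 0x3, /* P=1 Q=1: a further event was coalesced */
};

/* Returns true when the event should be forwarded for routing */
static bool sketch_esb_trigger(uint8_t *pq)
{
    assert(*pq <= SKETCH_PQ_QUEUED);
    switch (*pq) {
    case SKETCH_PQ_RESET:
        *pq = SKETCH_PQ_PENDING;
        return true;
    case SKETCH_PQ_PENDING:
    case SKETCH_PQ_QUEUED:
        *pq = SKETCH_PQ_QUEUED;   /* coalesce until the EOI */
        return false;
    default:                      /* OFF: event is dropped */
        return false;
    }
}

/* Returns true when a coalesced event must be replayed after the EOI */
static bool sketch_esb_eoi(uint8_t *pq)
{
    assert(*pq <= SKETCH_PQ_QUEUED);
    switch (*pq) {
    case SKETCH_PQ_QUEUED:
        *pq = SKETCH_PQ_PENDING;
        return true;
    case SKETCH_PQ_PENDING:
        *pq = SKETCH_PQ_RESET;
        return false;
    default:                      /* RESET and OFF are left unchanged */
        return false;
    }
}
```

Under these assumptions, a second trigger arriving before the EOI only
sets the Q bit, and the EOI then reports that the event has to be
forwarded again.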


The XIVE IC is composed of three sub-engines :

  - Interrupt Virtualization Source Engine (IVSE), or Source
    Controller (SC). These are found in PCI PHBs, in the PSI host
    bridge controller, but also inside the main controller for the
    core IPIs and other sub-chips (NX, CAP, NPU) of the
    chip/processor. They are configured to feed the IVRE with events.

  - Interrupt Virtualization Routing Engine (IVRE) or Virtualization
    Controller (VC). Its job is to match an event source with an Event
    Notification Descriptor (END).

  - Interrupt Virtualization Presentation Engine (IVPE) or Presentation
    Controller (PC). It maintains the interrupt context state of each
    thread and handles the delivery of the 

[Qemu-devel] [RFC v2 38/38] tests/plugin: add sample plugins

2018-12-09 Thread Emilio G. Cota
Pass arguments with -plugin=libfoo.so,arg=bar,arg=baz

Signed-off-by: Emilio G. Cota 
---
 configure |  4 +-
 tests/plugin/bb.c | 66 ++
 tests/plugin/empty.c  | 30 ++
 tests/plugin/insn.c   | 63 +
 tests/plugin/mem.c| 93 +++
 tests/plugin/Makefile | 28 +
 6 files changed, 282 insertions(+), 2 deletions(-)
 create mode 100644 tests/plugin/bb.c
 create mode 100644 tests/plugin/empty.c
 create mode 100644 tests/plugin/insn.c
 create mode 100644 tests/plugin/mem.c
 create mode 100644 tests/plugin/Makefile

diff --git a/configure b/configure
index 91f9c08ae2..1440b5e688 100755
--- a/configure
+++ b/configure
@@ -7450,14 +7450,14 @@ fi
 
 # build tree in object directory in case the source is not in the current directory
 DIRS="tests tests/tcg tests/tcg/cris tests/tcg/lm32 tests/libqos tests/qapi-schema tests/tcg/xtensa tests/qemu-iotests tests/vm"
-DIRS="$DIRS tests/fp"
+DIRS="$DIRS tests/fp tests/plugin"
 DIRS="$DIRS docs docs/interop fsdev scsi"
 DIRS="$DIRS pc-bios/optionrom pc-bios/spapr-rtas pc-bios/s390-ccw"
 DIRS="$DIRS roms/seabios roms/vgabios"
 FILES="Makefile tests/tcg/Makefile qdict-test-data.txt"
 FILES="$FILES tests/tcg/cris/Makefile tests/tcg/cris/.gdbinit"
 FILES="$FILES tests/tcg/lm32/Makefile tests/tcg/xtensa/Makefile po/Makefile"
-FILES="$FILES tests/fp/Makefile"
+FILES="$FILES tests/fp/Makefile tests/plugin/Makefile"
 FILES="$FILES pc-bios/optionrom/Makefile pc-bios/keymaps"
 FILES="$FILES pc-bios/spapr-rtas/Makefile"
 FILES="$FILES pc-bios/s390-ccw/Makefile"
diff --git a/tests/plugin/bb.c b/tests/plugin/bb.c
new file mode 100644
index 00..bb868599a9
--- /dev/null
+++ b/tests/plugin/bb.c
@@ -0,0 +1,66 @@
+/*
+ * Copyright (C) 2018, Emilio G. Cota 
+ *
+ * License: GNU GPL, version 2 or later.
+ *   See the COPYING file in the top-level directory.
+ */
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+static uint64_t bb_count;
+static uint64_t insn_count;
+static int stdout_fd;
+static bool do_inline;
+
+static void plugin_exit(qemu_plugin_id_t id, void *p)
+{
+dprintf(stdout_fd, "bb's: %" PRIu64", insns: %" PRIu64 "\n",
+bb_count, insn_count);
+}
+
+static void vcpu_tb_exec(unsigned int cpu_index, void *udata)
+{
+unsigned long n_insns = (unsigned long)udata;
+
+insn_count += n_insns;
+bb_count++;
+}
+
+static void vcpu_tb_trans(qemu_plugin_id_t id, unsigned int cpu_index,
+  struct qemu_plugin_tb *tb)
+{
+unsigned long n_insns = qemu_plugin_tb_n_insns(tb);
+
+if (do_inline) {
+qemu_plugin_register_vcpu_tb_exec_inline(tb, QEMU_PLUGIN_INLINE_ADD_U64,
+ &bb_count, 1);
+qemu_plugin_register_vcpu_tb_exec_inline(tb, QEMU_PLUGIN_INLINE_ADD_U64,
+ &insn_count, n_insns);
+} else {
+qemu_plugin_register_vcpu_tb_exec_cb(tb, vcpu_tb_exec,
+ QEMU_PLUGIN_CB_NO_REGS,
+ (void *)n_insns);
+}
+}
+
+QEMU_PLUGIN_EXPORT int qemu_plugin_install(qemu_plugin_id_t id, int argc,
+   char **argv)
+{
+if (argc && strcmp(argv[0], "inline") == 0) {
+do_inline = true;
+}
+
+/* to be used when in the exit hook */
+stdout_fd = dup(STDOUT_FILENO);
+assert(stdout_fd);
+
+qemu_plugin_register_vcpu_tb_trans_cb(id, vcpu_tb_trans);
+qemu_plugin_register_atexit_cb(id, plugin_exit, NULL);
+return 0;
+}
diff --git a/tests/plugin/empty.c b/tests/plugin/empty.c
new file mode 100644
index 00..b2e30bddb2
--- /dev/null
+++ b/tests/plugin/empty.c
@@ -0,0 +1,30 @@
+/*
+ * Copyright (C) 2018, Emilio G. Cota 
+ *
+ * License: GNU GPL, version 2 or later.
+ *   See the COPYING file in the top-level directory.
+ */
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+/*
+ * Empty TB translation callback.
+ * This allows us to measure the overhead of injecting and then
+ * removing empty instrumentation.
+ */
+static void vcpu_tb_trans(qemu_plugin_id_t id, unsigned int cpu_index,
+  struct qemu_plugin_tb *tb)
+{ }
+
+QEMU_PLUGIN_EXPORT int qemu_plugin_install(qemu_plugin_id_t id, int argc,
+   char **argv)
+{
+qemu_plugin_register_vcpu_tb_trans_cb(id, vcpu_tb_trans);
+return 0;
+}
diff --git a/tests/plugin/insn.c b/tests/plugin/insn.c
new file mode 100644
index 00..11afe0e8f1
--- /dev/null
+++ b/tests/plugin/insn.c
@@ -0,0 +1,63 @@
+/*
+ * Copyright (C) 2018, Emilio G. Cota 
+ *
+ * License: GNU GPL, version 2 or later.
+ *   See the COPYING file in the top-level directory.
+ */
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+static int stdout_fd;
+static uint64_t 

[Qemu-devel] [RFC v2 31/38] target/xtensa: fetch code with translator_ld

2018-12-09 Thread Emilio G. Cota
Signed-off-by: Emilio G. Cota 
---
 target/xtensa/translate.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/target/xtensa/translate.c b/target/xtensa/translate.c
index 46e1338448..c140742562 100644
--- a/target/xtensa/translate.c
+++ b/target/xtensa/translate.c
@@ -882,7 +882,7 @@ static inline unsigned xtensa_op0_insn_len(DisasContext *dc, uint8_t op0)
 static void disas_xtensa_insn(CPUXtensaState *env, DisasContext *dc)
 {
 xtensa_isa isa = dc->config->isa;
-unsigned char b[MAX_INSN_LENGTH] = {cpu_ldub_code(env, dc->pc)};
+unsigned char b[MAX_INSN_LENGTH] = {translator_ldub(env, dc->pc)};
 unsigned len = xtensa_op0_insn_len(dc, b[0]);
 xtensa_format fmt;
 int slot, slots;
@@ -914,7 +914,7 @@ static void disas_xtensa_insn(CPUXtensaState *env, DisasContext *dc)
   dc->pc);
 }
 for (i = 1; i < len; ++i) {
-b[i] = cpu_ldub_code(env, dc->pc + i);
+b[i] = translator_ldub(env, dc->pc + i);
 }
 xtensa_insnbuf_from_chars(isa, dc->insnbuf, b, len);
 fmt = xtensa_format_decode(isa, dc->insnbuf);
-- 
2.17.1




[Qemu-devel] [RFC v2 25/38] target/i386: fetch code with translator_ld

2018-12-09 Thread Emilio G. Cota
Signed-off-by: Emilio G. Cota 
---
 target/i386/translate.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index 83c1ebe491..6ea784da54 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -1900,28 +1900,28 @@ static uint64_t advance_pc(CPUX86State *env, DisasContext *s, int num_bytes)
 
 static inline uint8_t x86_ldub_code(CPUX86State *env, DisasContext *s)
 {
-return cpu_ldub_code(env, advance_pc(env, s, 1));
+return translator_ldub(env, advance_pc(env, s, 1));
 }
 
 static inline int16_t x86_ldsw_code(CPUX86State *env, DisasContext *s)
 {
-return cpu_ldsw_code(env, advance_pc(env, s, 2));
+return translator_ldsw(env, advance_pc(env, s, 2));
 }
 
 static inline uint16_t x86_lduw_code(CPUX86State *env, DisasContext *s)
 {
-return cpu_lduw_code(env, advance_pc(env, s, 2));
+return translator_lduw(env, advance_pc(env, s, 2));
 }
 
 static inline uint32_t x86_ldl_code(CPUX86State *env, DisasContext *s)
 {
-return cpu_ldl_code(env, advance_pc(env, s, 4));
+return translator_ldl(env, advance_pc(env, s, 4));
 }
 
 #ifdef TARGET_X86_64
 static inline uint64_t x86_ldq_code(CPUX86State *env, DisasContext *s)
 {
-return cpu_ldq_code(env, advance_pc(env, s, 8));
+return translator_ldq(env, advance_pc(env, s, 8));
 }
 #endif
 
-- 
2.17.1




[Qemu-devel] [PATCH v7 02/19] ppc/xive: introduce the XIVE interrupt thread context

2018-12-09 Thread Cédric Le Goater
Each POWER9 processor chip has a XIVE presenter that can generate four
different exceptions to its threads:

  - hypervisor exception,
  - O/S exception
  - Event-Based Branch (EBB)
  - msgsnd (doorbell).

Each exception has a state independent from the others called a Thread
Interrupt Management context. This context is a set of registers which
lets the thread handle priority management and interrupt acknowledgment
among other things. The most important ones being :

  - Interrupt Priority Register  (PIPR)
  - Interrupt Pending Buffer (IPB)
  - Current Processor Priority   (CPPR)
  - Notification Source Register (NSR)

These registers are accessible through a specific MMIO region, called
the Thread Interrupt Management Area (TIMA), four aligned pages, each
exposing a different view of the registers. First page (page address
ending in 0b00) gives access to the entire context and is reserved for
the ring 0 view for the physical thread context. The second (page
address ending in 0b01) is for the hypervisor, ring 1 view. The third
(page address ending in 0b10) is for the operating system, ring 2
view. The fourth (page address ending in 0b11) is for user level, ring
3 view.

The thread interrupt context is modeled with a XiveTCTX object
containing the values of the different exception registers. The TIMA
region is mapped at the same address for each CPU.
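
As an illustration of the four TIMA views, here is a minimal sketch of
the address arithmetic. TM_SHIFT and the XIVE_TM_*_PAGE values are the
ones defined in this patch; the sketch_* helpers are hypothetical, and
the sketch assumes the TIMA base is aligned on four pages so that the
low two bits of the page number give the ring.

```c
#include <assert.h>
#include <stdint.h>

#define TM_SHIFT            16   /* 64K pages */
#define XIVE_TM_HW_PAGE     0x0  /* ring 0: physical thread context */
#define XIVE_TM_HV_PAGE     0x1  /* ring 1: hypervisor */
#define XIVE_TM_OS_PAGE     0x2  /* ring 2: operating system */
#define XIVE_TM_USER_PAGE   0x3  /* ring 3: user level */

/* Address of one of the four views, given the (aligned) TIMA base */
static uint64_t sketch_tima_page_addr(uint64_t tm_base, unsigned page)
{
    assert(page <= XIVE_TM_USER_PAGE);
    return tm_base + (uint64_t)page * (1ull << TM_SHIFT);
}

/* Ring selected by an access: the low two bits of the page number */
static unsigned sketch_tima_ring(uint64_t addr)
{
    return (unsigned)((addr >> TM_SHIFT) & 0x3);
}
```

For instance, with a hypothetical base of 0x40000000, the OS view
starts at 0x40020000 and any access inside that page is a ring 2
access.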

Signed-off-by: Cédric Le Goater 
Reviewed-by: David Gibson 
---

 Changes since v6

 - introduced a xive_tctx_word2() helper to extract TM_WORD2 of a ring.

 include/hw/ppc/xive.h  |  44 
 include/hw/ppc/xive_regs.h |  82 +++
 hw/intc/xive.c | 424 +
 3 files changed, 550 insertions(+)

diff --git a/include/hw/ppc/xive.h b/include/hw/ppc/xive.h
index 014f64aa98f6..1e823a4c64e9 100644
--- a/include/hw/ppc/xive.h
+++ b/include/hw/ppc/xive.h
@@ -367,4 +367,48 @@ typedef struct XiveENDSource {
 void xive_end_pic_print_info(XiveEND *end, uint32_t end_idx, Monitor *mon);
 void xive_end_queue_pic_print_info(XiveEND *end, uint32_t width, Monitor *mon);
 
+/*
+ * XIVE Thread interrupt Management (TM) context
+ */
+
+#define TYPE_XIVE_TCTX "xive-tctx"
+#define XIVE_TCTX(obj) OBJECT_CHECK(XiveTCTX, (obj), TYPE_XIVE_TCTX)
+
+/*
+ * XIVE Thread interrupt Management register rings :
+ *
+ *   QW-0  User   event-based exception state
+ *   QW-1  O/SOS context for priority management, interrupt acks
+ *   QW-2  Pool   hypervisor pool context for virtual processors dispatched
+ *   QW-3  Physical   physical thread context and security context
+ */
+#define XIVE_TM_RING_COUNT  4
+#define XIVE_TM_RING_SIZE   0x10
+
+typedef struct XiveTCTX {
+DeviceState parent_obj;
+
+CPUState*cs;
+qemu_irqoutput;
+
+uint8_t regs[XIVE_TM_RING_COUNT * XIVE_TM_RING_SIZE];
+} XiveTCTX;
+
+/*
+ * XIVE Thread Interrupt Management Area (TIMA)
+ *
+ * This region gives access to the registers of the thread interrupt
+ * management context. It is four pages wide, each page providing a
+ * different view of the registers. The page with the lower offset is
+ * the most privileged and gives access to the entire context.
+ */
+#define XIVE_TM_HW_PAGE 0x0
+#define XIVE_TM_HV_PAGE 0x1
+#define XIVE_TM_OS_PAGE 0x2
+#define XIVE_TM_USER_PAGE   0x3
+
+extern const MemoryRegionOps xive_tm_ops;
+
+void xive_tctx_pic_print_info(XiveTCTX *tctx, Monitor *mon);
+
 #endif /* PPC_XIVE_H */
diff --git a/include/hw/ppc/xive_regs.h b/include/hw/ppc/xive_regs.h
index 3c0ebad18b69..ede3d04c5eda 100644
--- a/include/hw/ppc/xive_regs.h
+++ b/include/hw/ppc/xive_regs.h
@@ -23,6 +23,88 @@
 #define XIVE_SRCNO_INDEX(srcno) ((srcno) & 0x0fff)
 #define XIVE_SRCNO(blk, idx)((uint32_t)(blk) << 28 | (idx))
 
+#define TM_SHIFT16
+
+/* TM register offsets */
+#define TM_QW0_USER 0x000 /* All rings */
+#define TM_QW1_OS   0x010 /* Ring 0..2 */
+#define TM_QW2_HV_POOL  0x020 /* Ring 0..1 */
+#define TM_QW3_HV_PHYS  0x030 /* Ring 0..1 */
+
+/* Byte offsets inside a QW QW0 QW1 QW2 QW3 */
+#define TM_NSR  0x0  /*  +   +   -   +  */
+#define TM_CPPR 0x1  /*  -   +   -   +  */
+#define TM_IPB  0x2  /*  -   +   +   +  */
+#define TM_LSMFB0x3  /*  -   +   +   +  */
+#define TM_ACK_CNT  0x4  /*  -   +   -   -  */
+#define TM_INC  0x5  /*  -   +   -   +  */
+#define TM_AGE  0x6  /*  -   +   -   +  */
+#define TM_PIPR 0x7  /*  -   +   -   +  */
+
+#define TM_WORD00x0
+#define TM_WORD10x4
+
+/*
+ * QW word 2 contains the valid bit at the top and other fields
+ * depending on the QW.
+ */
+#define TM_WORD20x8
+#define   TM_QW0W2_VU   PPC_BIT32(0)
+#define   TM_QW0W2_LOGIC_SERV   PPC_BITMASK32(1, 31) /* XX 2,31 ? */
+#define   TM_QW1W2_VO   

[Qemu-devel] [PATCH v7 14/19] spapr: set the interrupt presenter at reset

2018-12-09 Thread Cédric Le Goater
Currently, the interrupt presenter of the vCPU is set at realize
time. Setting it at reset will become useful when the new machine
supporting both interrupt modes is introduced. In this machine, the
interrupt mode is chosen at CAS time and activated after a reset.

Signed-off-by: Cédric Le Goater 
---
 include/hw/ppc/spapr_cpu_core.h |  2 ++
 hw/ppc/spapr_cpu_core.c | 26 ++
 hw/ppc/spapr_irq.c  | 12 
 3 files changed, 40 insertions(+)

diff --git a/include/hw/ppc/spapr_cpu_core.h b/include/hw/ppc/spapr_cpu_core.h
index 9e2821e4b31f..fc8ea9021656 100644
--- a/include/hw/ppc/spapr_cpu_core.h
+++ b/include/hw/ppc/spapr_cpu_core.h
@@ -53,4 +53,6 @@ static inline sPAPRCPUState *spapr_cpu_state(PowerPCCPU *cpu)
 return (sPAPRCPUState *)cpu->machine_data;
 }
 
+void spapr_cpu_core_set_intc(PowerPCCPU *cpu, const char *intc_type);
+
 #endif
diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
index 1811cd48db90..529de0b6b9c8 100644
--- a/hw/ppc/spapr_cpu_core.c
+++ b/hw/ppc/spapr_cpu_core.c
@@ -398,3 +398,29 @@ static const TypeInfo spapr_cpu_core_type_infos[] = {
 };
 
 DEFINE_TYPES(spapr_cpu_core_type_infos)
+
+typedef struct ForeachFindIntCArgs {
+const char *intc_type;
+Object *intc;
+} ForeachFindIntCArgs;
+
+static int spapr_cpu_core_find_intc(Object *child, void *opaque)
+{
+ForeachFindIntCArgs *args = opaque;
+
+if (object_dynamic_cast(child, args->intc_type)) {
+args->intc = child;
+}
+
+return args->intc != NULL;
+}
+
+void spapr_cpu_core_set_intc(PowerPCCPU *cpu, const char *intc_type)
+{
+ForeachFindIntCArgs args = { intc_type, NULL };
+
+object_child_foreach(OBJECT(cpu), spapr_cpu_core_find_intc, &args);
+g_assert(args.intc);
+
+cpu->intc = args.intc;
+}
diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
index 7a0d4f529763..b423cee30e2c 100644
--- a/hw/ppc/spapr_irq.c
+++ b/hw/ppc/spapr_irq.c
@@ -12,6 +12,7 @@
 #include "qemu/error-report.h"
 #include "qapi/error.h"
 #include "hw/ppc/spapr.h"
+#include "hw/ppc/spapr_cpu_core.h"
 #include "hw/ppc/spapr_xive.h"
 #include "hw/ppc/xics.h"
 #include "sysemu/kvm.h"
@@ -211,6 +212,11 @@ static int spapr_irq_post_load_xics(sPAPRMachineState 
*spapr, int version_id)
 
 static void spapr_irq_reset_xics(sPAPRMachineState *spapr, Error **errp)
 {
+CPUState *cs;
+
+CPU_FOREACH(cs) {
+spapr_cpu_core_set_intc(POWERPC_CPU(cs), spapr->icp_type);
+}
 }
 
 #define SPAPR_IRQ_XICS_NR_IRQS 0x1000
@@ -341,6 +347,12 @@ static int spapr_irq_post_load_xive(sPAPRMachineState 
*spapr, int version_id)
 
 static void spapr_irq_reset_xive(sPAPRMachineState *spapr, Error **errp)
 {
+CPUState *cs;
+
+CPU_FOREACH(cs) {
+spapr_cpu_core_set_intc(POWERPC_CPU(cs), TYPE_XIVE_TCTX);
+}
+
 /*
  * Set the OS CAM line of the cpu interrupt thread context. Needs
  * to come after the XiveTCTX reset handlers.
-- 
2.17.2




[Qemu-devel] [RFC v2 27/38] target/m68k: fetch code with translator_ld

2018-12-09 Thread Emilio G. Cota
Signed-off-by: Emilio G. Cota 
---
 target/m68k/translate.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/m68k/translate.c b/target/m68k/translate.c
index d55e707cf6..71263f8b37 100644
--- a/target/m68k/translate.c
+++ b/target/m68k/translate.c
@@ -373,7 +373,7 @@ static TCGv gen_ldst(DisasContext *s, int opsize, TCGv 
addr, TCGv val,
 static inline uint16_t read_im16(CPUM68KState *env, DisasContext *s)
 {
 uint16_t im;
-im = cpu_lduw_code(env, s->pc);
+im = translator_lduw(env, s->pc);
 s->pc += 2;
 return im;
 }
-- 
2.17.1




[Qemu-devel] [PATCH v7 13/19] spapr: add an extra OV5 field to the sPAPR IRQ backend

2018-12-09 Thread Cédric Le Goater
This field defines the interrupt modes supported by the hypervisor in
the "ibm,arch-vec-5-platform-support" property. The CAS negotiation
process will select which mode to use.
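A minimal sketch of how such a field could feed the property, assuming
the property is a list of (byte index, value) pairs as in the existing
val[] array. The helper names below are hypothetical; only the three
mode values and the byte index 23 are taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Values taken from the SPAPR_OV5_XIVE_* definitions in the patch. */
#define OV5_XIVE_LEGACY  0x00
#define OV5_XIVE_EXPLOIT 0x40
#define OV5_XIVE_BOTH    0x80

/* Byte of option vector 5 that carries the interrupt mode. */
#define OV5_XIVE_BYTE    23

/* Hypothetical helper: is this one of the three defined modes? */
static int ov5_xive_mode_valid(uint8_t mode)
{
    return mode == OV5_XIVE_LEGACY ||
           mode == OV5_XIVE_EXPLOIT ||
           mode == OV5_XIVE_BOTH;
}

/* Hypothetical helper: fill one (index, value) pair of the property. */
static void platform_support_set_xive(uint8_t pair[2], uint8_t mode)
{
    assert(ov5_xive_mode_valid(mode));
    pair[0] = OV5_XIVE_BYTE;
    pair[1] = mode;
}
```

Keeping the value in the IRQ backend means the device tree code only
copies a byte and does not need to know which backend is active.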

Signed-off-by: Cédric Le Goater 
---
 include/hw/ppc/spapr.h |  6 ++
 include/hw/ppc/spapr_irq.h |  1 +
 hw/ppc/spapr.c | 23 ++-
 hw/ppc/spapr_irq.c |  3 +++
 4 files changed, 28 insertions(+), 5 deletions(-)

diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 6bf028a02fe2..daced428a42c 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -824,5 +824,11 @@ int spapr_caps_post_migration(sPAPRMachineState *spapr);
 
 void spapr_check_pagesize(sPAPRMachineState *spapr, hwaddr pagesize,
   Error **errp);
+/*
+ * XIVE definitions
+ */
+#define SPAPR_OV5_XIVE_LEGACY   0x0
+#define SPAPR_OV5_XIVE_EXPLOIT  0x40
+#define SPAPR_OV5_XIVE_BOTH 0x80
 
 #endif /* HW_SPAPR_H */
diff --git a/include/hw/ppc/spapr_irq.h b/include/hw/ppc/spapr_irq.h
index 63061a009b4c..b34d5a00381b 100644
--- a/include/hw/ppc/spapr_irq.h
+++ b/include/hw/ppc/spapr_irq.h
@@ -33,6 +33,7 @@ void spapr_irq_msi_reset(sPAPRMachineState *spapr);
 typedef struct sPAPRIrq {
 uint32_t nr_irqs;
 uint32_t nr_msis;
+uint8_t ov5;
 
 void (*init)(sPAPRMachineState *spapr, Error **errp);
 int (*claim)(sPAPRMachineState *spapr, int irq, bool lsi, Error **errp);
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 98d69f09e080..5ef87a00f68b 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1095,12 +1095,14 @@ static void spapr_dt_rtas(sPAPRMachineState *spapr, 
void *fdt)
 spapr_dt_rtas_tokens(fdt, rtas);
 }
 
-/* Prepare ibm,arch-vec-5-platform-support, which indicates the MMU features
- * that the guest may request and thus the valid values for bytes 24..26 of
- * option vector 5: */
-static void spapr_dt_ov5_platform_support(void *fdt, int chosen)
+/* Prepare ibm,arch-vec-5-platform-support, which indicates the MMU
+ * and the XIVE features that the guest may request and thus the valid
+ * values for bytes 23..26 of option vector 5: */
+static void spapr_dt_ov5_platform_support(sPAPRMachineState *spapr, void *fdt,
+  int chosen)
 {
 PowerPCCPU *first_ppc_cpu = POWERPC_CPU(first_cpu);
+sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
 
 char val[2 * 4] = {
 23, 0x00, /* Xive mode, filled in below. */
@@ -1121,7 +1123,13 @@ static void spapr_dt_ov5_platform_support(void *fdt, int 
chosen)
 } else {
 val[3] = 0x00; /* Hash */
 }
+/* If the KVM XIVE device is not available, the machine can
+ * still operate with kernel_irqchip=off
+ */
+val[1] = smc->irq->ov5;
 } else {
+val[1] = smc->irq->ov5;
+
 /* V3 MMU supports both hash and radix in tcg (with dynamic switching) 
*/
 val[3] = 0xC0;
 }
@@ -1189,7 +1197,7 @@ static void spapr_dt_chosen(sPAPRMachineState *spapr, 
void *fdt)
 _FDT(fdt_setprop_string(fdt, chosen, "stdout-path", stdout_path));
 }
 
-spapr_dt_ov5_platform_support(fdt, chosen);
+spapr_dt_ov5_platform_support(spapr, fdt, chosen);
 
 g_free(stdout_path);
 g_free(bootlist);
@@ -2622,6 +2630,11 @@ static void spapr_machine_init(MachineState *machine)
 /* advertise support for ibm,dyamic-memory-v2 */
 spapr_ovec_set(spapr->ov5, OV5_DRMEM_V2);
 
+/* advertise XIVE */
+if (smc->irq->ov5 == SPAPR_OV5_XIVE_EXPLOIT) {
+spapr_ovec_set(spapr->ov5, OV5_XIVE_EXPLOIT);
+}
+
 /* init CPUs */
 spapr_init_cpus(spapr);
 
diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
index 04f5c9665550..7a0d4f529763 100644
--- a/hw/ppc/spapr_irq.c
+++ b/hw/ppc/spapr_irq.c
@@ -220,6 +220,7 @@ static void spapr_irq_reset_xics(sPAPRMachineState *spapr, 
Error **errp)
 sPAPRIrq spapr_irq_xics = {
 .nr_irqs = SPAPR_IRQ_XICS_NR_IRQS,
 .nr_msis = SPAPR_IRQ_XICS_NR_MSIS,
+.ov5 = SPAPR_OV5_XIVE_LEGACY,
 
 .init= spapr_irq_init_xics,
 .claim   = spapr_irq_claim_xics,
@@ -358,6 +359,7 @@ static void spapr_irq_reset_xive(sPAPRMachineState *spapr, 
Error **errp)
 sPAPRIrq spapr_irq_xive = {
 .nr_irqs = SPAPR_IRQ_XIVE_NR_IRQS,
 .nr_msis = SPAPR_IRQ_XIVE_NR_MSIS,
+.ov5 = SPAPR_OV5_XIVE_EXPLOIT,
 
 .init= spapr_irq_init_xive,
 .claim   = spapr_irq_claim_xive,
@@ -482,6 +484,7 @@ int spapr_irq_find(sPAPRMachineState *spapr, int num, bool 
align, Error **errp)
 sPAPRIrq spapr_irq_xics_legacy = {
 .nr_irqs = SPAPR_IRQ_XICS_LEGACY_NR_IRQS,
 .nr_msis = SPAPR_IRQ_XICS_LEGACY_NR_IRQS,
+.ov5 = SPAPR_OV5_XIVE_LEGACY,
 
 .init= spapr_irq_init_xics,
 .claim   = spapr_irq_claim_xics,
-- 
2.17.2




[Qemu-devel] [RFC v2 32/38] target/openrisc: fetch code with translator_ld

2018-12-09 Thread Emilio G. Cota
Signed-off-by: Emilio G. Cota 
---
 target/openrisc/translate.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/openrisc/translate.c b/target/openrisc/translate.c
index a271cd3903..6b5efc0155 100644
--- a/target/openrisc/translate.c
+++ b/target/openrisc/translate.c
@@ -1305,7 +1305,7 @@ static void openrisc_tr_translate_insn(DisasContextBase 
*dcbase, CPUState *cs)
 {
 DisasContext *dc = container_of(dcbase, DisasContext, base);
 OpenRISCCPU *cpu = OPENRISC_CPU(cs);
-uint32_t insn = cpu_ldl_code(&cpu->env, dc->base.pc_next);
+uint32_t insn = translator_ldl(&cpu->env, dc->base.pc_next);
 
 if (!decode(dc, insn)) {
 gen_illegal_exception(dc);
-- 
2.17.1




[Qemu-devel] [RFC v2 33/38] translator: inject instrumentation from plugins

2018-12-09 Thread Emilio G. Cota
Signed-off-by: Emilio G. Cota 
---
 accel/tcg/translator.c | 16 
 1 file changed, 16 insertions(+)

diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c
index afd0a49ea6..68174a2986 100644
--- a/accel/tcg/translator.c
+++ b/accel/tcg/translator.c
@@ -17,6 +17,7 @@
 #include "exec/gen-icount.h"
 #include "exec/log.h"
 #include "exec/translator.h"
+#include "exec/plugin-gen.h"
 
 /* Pairs with tcg_clear_temp_count.
To be called by #TranslatorOps.{translate_insn,tb_stop} if
@@ -35,6 +36,7 @@ void translator_loop(const TranslatorOps *ops, 
DisasContextBase *db,
  CPUState *cpu, TranslationBlock *tb)
 {
 int bp_insn = 0;
+bool plugin_enabled;
 
 /* Initialize DisasContext */
 db->tb = tb;
@@ -67,11 +69,17 @@ void translator_loop(const TranslatorOps *ops, 
DisasContextBase *db,
 ops->tb_start(db, cpu);
 tcg_debug_assert(db->is_jmp == DISAS_NEXT);  /* no early exit */
 
+plugin_enabled = plugin_gen_tb_start(cpu, tb);
+
 while (true) {
 db->num_insns++;
 ops->insn_start(db, cpu);
 tcg_debug_assert(db->is_jmp == DISAS_NEXT);  /* no early exit */
 
+if (plugin_enabled) {
+plugin_gen_insn_start(cpu, db);
+}
+
 /* Pass breakpoint hits to target for further processing */
 if (!db->singlestep_enabled
 && unlikely(!QTAILQ_EMPTY(&cpu->breakpoints))) {
@@ -107,6 +115,10 @@ void translator_loop(const TranslatorOps *ops, 
DisasContextBase *db,
 ops->translate_insn(db, cpu);
 }
 
+if (plugin_enabled) {
+plugin_gen_insn_end();
+}
+
 /* Stop translation if translate_insn so indicated.  */
 if (db->is_jmp != DISAS_NEXT) {
 break;
@@ -124,6 +136,10 @@ void translator_loop(const TranslatorOps *ops, 
DisasContextBase *db,
 ops->tb_stop(db, cpu);
 gen_tb_end(db->tb, db->num_insns - bp_insn);
 
+if (plugin_enabled) {
+plugin_gen_tb_end(cpu);
+}
+
 /* The disas_log hook may use these values rather than recompute.  */
 db->tb->size = db->pc_next - db->pc_first;
 db->tb->icount = db->num_insns;
-- 
2.17.1




[Qemu-devel] [RFC v2 35/38] configure: add --enable-plugins

2018-12-09 Thread Emilio G. Cota
Add support for both ld (using --dynamic-list) and MacOSX's ld64
(-exported_symbols_list).

Reviewed-by: Alex Bennée 
Signed-off-by: Emilio G. Cota 
---
 configure   | 82 +
 Makefile|  1 +
 Makefile.target | 18 ++-
 .gitignore  |  2 ++
 4 files changed, 102 insertions(+), 1 deletion(-)

diff --git a/configure b/configure
index 1c473ce95b..91f9c08ae2 100755
--- a/configure
+++ b/configure
@@ -30,6 +30,7 @@ TMPO="${TMPDIR1}/${TMPB}.o"
 TMPCXX="${TMPDIR1}/${TMPB}.cxx"
 TMPE="${TMPDIR1}/${TMPB}.exe"
 TMPMO="${TMPDIR1}/${TMPB}.mo"
+TMPTXT="${TMPDIR1}/${TMPB}.txt"
 
 rm -f config.log
 
@@ -474,6 +475,7 @@ libxml2=""
 docker="no"
 debug_mutex="no"
 libpmem=""
+plugins="no"
 
 # cross compilers defaults, can be overridden with --cross-cc-ARCH
 cross_cc_aarch64="aarch64-linux-gnu-gcc"
@@ -1444,6 +1446,10 @@ for opt do
   ;;
   --disable-libpmem) libpmem=no
   ;;
+  --enable-plugins) plugins="yes"
+  ;;
+  --disable-plugins) plugins="no"
+  ;;
   *)
   echo "ERROR: unknown option $opt"
   echo "Try '$0 --help' for more information"
@@ -1634,6 +1640,8 @@ Advanced options (experts only):
xen pv domain builder
   --enable-debug-stack-usage
track the maximum stack usage of stacks created by 
qemu_alloc_stack
+  --enable-plugins
+   enable plugins via shared library loading
 
 Optional features, enabled with --enable-FEATURE and
 disabled with --disable-FEATURE, default is enabled if available:
@@ -5143,6 +5151,58 @@ if compile_prog "" "" ; then
   atomic64=yes
 fi
 
+#
+# See if --dynamic-list is supported by the linker
+
+cat > $TMPTXT < $TMPC <
+void foo(void);
+
+void foo(void)
+{
+  printf("foo\n");
+}
+
+int main(void)
+{
+  foo();
+  return 0;
+}
+EOF
+
+ld_dynamic_list="no"
+if compile_prog "" "-Wl,--dynamic-list=$TMPTXT" ; then
+  ld_dynamic_list="yes"
+fi
+
+#
+# See if -exported_symbols_list is supported by the linker
+
+cat > $TMPTXT < Your SDL version is too old - please upgrade to have SDL support"
@@ -6779,6 +6840,27 @@ if test "$libpmem" = "yes" ; then
   echo "CONFIG_LIBPMEM=y" >> $config_host_mak
 fi
 
+if test "$plugins" = "yes" ; then
+echo "CONFIG_PLUGIN=y" >> $config_host_mak
+LIBS="-ldl $LIBS"
+# Copy the export object list to the build dir
+if test "$ld_dynamic_list" = "yes" ; then
+   echo "CONFIG_HAS_LD_DYNAMIC_LIST=yes" >> $config_host_mak
+   ld_symbols=qemu-plugins-ld.symbols
+   cp "$source_path/qemu-plugins.symbols" $ld_symbols
+elif test "$ld_exported_symbols_list" = "yes" ; then
+   echo "CONFIG_HAS_LD_EXPORTED_SYMBOLS_LIST=yes" >> $config_host_mak
+   ld64_symbols=qemu-plugins-ld64.symbols
+   echo "# Automatically generated by configure - do not modify" > 
$ld64_symbols
+   grep 'qemu_' "$source_path/qemu-plugins.symbols" | sed 's/;//g' | \
+   sed -E 's/^[[:space:]]*(.*)/_\1/' >> $ld64_symbols
+else
+   error_exit \
+   "If \$plugins=yes, either \$ld_dynamic_list or " \
+   "\$ld_exported_symbols_list should have been set to 'yes'."
+fi
+fi
+
 if test "$tcg_interpreter" = "yes"; then
   QEMU_INCLUDES="-iquote \$(SRC_PATH)/tcg/tci $QEMU_INCLUDES"
 elif test "$ARCH" = "sparc64" ; then
diff --git a/Makefile b/Makefile
index 9cb3076d84..6fd15b296f 100644
--- a/Makefile
+++ b/Makefile
@@ -778,6 +778,7 @@ distclean: clean
rm -f qemu-doc.fn qemu-doc.fns qemu-doc.info qemu-doc.ky qemu-doc.kys
rm -f qemu-doc.log qemu-doc.pdf qemu-doc.pg qemu-doc.toc qemu-doc.tp
rm -f qemu-doc.vr qemu-doc.txt
+   rm -f qemu-plugins-ld.symbols qemu-plugins-ld64.symbols
rm -f config.log
rm -f linux-headers/asm
rm -f docs/version.texi
diff --git a/Makefile.target b/Makefile.target
index 75637c285c..7dada1f368 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -107,7 +107,23 @@ obj-y += target/$(TARGET_BASE_ARCH)/
 obj-y += disas.o
 obj-$(call notempty,$(TARGET_XML_FILES)) += gdbstub-xml.o
 
-obj-$(CONFIG_PLUGINS) += plugin.o
+ifdef CONFIG_PLUGIN
+obj-y += plugin.o
+# Abuse -libs suffix to only link with --dynamic-list/-exported_symbols_list
+# when the final binary includes the plugin object.
+#
+# Note that simply setting LDFLAGS is not enough: we build binaries that
+# never link plugin.o, and the linker might fail (at least ld64 does)
+# if the symbols in the list are not in the output binary.
+ ifdef CONFIG_HAS_LD_DYNAMIC_LIST
+ plugin.o-libs := -Wl,--dynamic-list=$(BUILD_DIR)/qemu-plugins-ld.symbols
+ else
+  ifdef CONFIG_HAS_LD_EXPORTED_SYMBOLS_LIST
+  plugin.o-libs := \
+   -Wl,-exported_symbols_list,$(BUILD_DIR)/qemu-plugins-ld64.symbols
+  endif
+ endif
+endif
 
 #
 # Linux user emulator target
diff --git a/.gitignore b/.gitignore
index 64efdfd929..29f6446508 100644
--- 

[Qemu-devel] [RFC v2 37/38] linux-user: support -plugin option

2018-12-09 Thread Emilio G. Cota
From: Lluís Vilanova 

Signed-off-by: Lluís Vilanova 
[ cota: s/instrument/plugin ]
Signed-off-by: Emilio G. Cota 
---
 linux-user/main.c | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/linux-user/main.c b/linux-user/main.c
index 923cbb753a..482766f0f4 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -28,6 +28,7 @@
 #include "qemu/config-file.h"
 #include "qemu/cutils.h"
 #include "qemu/help_option.h"
+#include "qemu/plugin.h"
 #include "cpu.h"
 #include "exec/exec-all.h"
 #include "tcg.h"
@@ -385,6 +386,15 @@ static void handle_arg_trace(const char *arg)
 trace_file = trace_opt_parse(arg);
 }
 
+static struct qemu_plugin_list plugins = QTAILQ_HEAD_INITIALIZER(plugins);
+
+#ifdef CONFIG_PLUGIN
+static void handle_arg_plugin(const char *arg)
+{
+qemu_plugin_opt_parse(arg, &plugins);
+}
+#endif
+
 struct qemu_argument {
 const char *argv;
 const char *env;
@@ -436,6 +446,10 @@ static const struct qemu_argument arg_table[] = {
  "",   "Seed for pseudo-random number generator"},
 {"trace",  "QEMU_TRACE",   true,  handle_arg_trace,
  "",   "[[enable=]][,events=][,file=]"},
+#ifdef CONFIG_PLUGIN
+{"plugin", "QEMU_PLUGIN",  true,  handle_arg_plugin,
+ "",   "[file=][,arg=]"},
+#endif
 {"version","QEMU_VERSION", false, handle_arg_version,
  "",   "display version information and exit"},
 {NULL, NULL, false, NULL, NULL, NULL}
@@ -627,6 +641,7 @@ int main(int argc, char **argv, char **envp)
 srand(time(NULL));
 
 qemu_add_opts(&qemu_trace_opts);
+qemu_plugin_add_opts();
 
 optind = parse_args(argc, argv);
 
@@ -634,6 +649,9 @@ int main(int argc, char **argv, char **envp)
 exit(1);
 }
 trace_init_file(trace_file);
+if (qemu_plugin_load_list(&plugins)) {
+exit(1);
+}
 
 /* Zero out regs */
 memset(regs, 0, sizeof(struct target_pt_regs));
-- 
2.17.1




[Qemu-devel] [PATCH v7 04/19] ppc/xive: notify the CPU when the interrupt priority is more privileged

2018-12-09 Thread Cédric Le Goater
After the event data has been enqueued in the O/S Event Queue, the IVPE
raises the bit corresponding to the priority of the pending interrupt
in the Interrupt Pending Buffer (IPB) register to indicate that an
event is pending in one of the 8 priority queues. The Pending Interrupt
Priority Register (PIPR) is also updated using the IPB. This register
represents the priority of the most favored pending notification.

The PIPR is then compared to the Current Processor Priority
Register (CPPR). If it is more favored (numerically less than), the
CPU interrupt line is raised and the EO bit of the Notification Source
Register (NSR) is updated to notify the presence of an exception for
the O/S. The check needs to be done whenever the PIPR or the CPPR is
changed.

The O/S acknowledges the interrupt with a special load in the Thread
Interrupt Management Area. If the EO bit of the NSR is set, the CPPR
takes the value of the PIPR. The bit in the IPB corresponding to
the priority of the pending interrupt is reset, and so is the EO bit
of the NSR.
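The sequence described above can be sketched as a small standalone
model. This is only an illustration under stated assumptions: the Ring
struct, the REG_* indices, the irq_raised flag and the NSR_EO value
(0x80) are made up for the sketch, and the leading-zero count is done
with a loop instead of clz32().

```c
#include <stdint.h>

#define PRIORITY_MAX 7

/* Illustrative register indices, not the real TIMA layout. */
enum { REG_NSR, REG_CPPR, REG_IPB, REG_PIPR, REG_COUNT };

#define NSR_EO 0x80  /* assumed position of the EO bit */

typedef struct Ring {
    uint8_t regs[REG_COUNT];
    int irq_raised;  /* models the CPU interrupt line */
} Ring;

/* Priority 0 (most favored) maps to the top bit of the IPB. */
static uint8_t priority_to_ipb(uint8_t priority)
{
    return priority > PRIORITY_MAX ?
        0 : (uint8_t)(1u << (PRIORITY_MAX - priority));
}

/* The PIPR is the number of the most favored priority set in the IPB. */
static uint8_t ipb_to_pipr(uint8_t ipb)
{
    uint8_t prio = 0;

    if (!ipb) {
        return 0xff;  /* nothing pending */
    }
    while (!(ipb & 0x80)) {
        ipb = (uint8_t)(ipb << 1);
        prio++;
    }
    return prio;
}

/* Raise the line if the pending priority is more favored than the CPPR. */
static void ring_notify(Ring *r)
{
    if (r->regs[REG_PIPR] < r->regs[REG_CPPR]) {
        r->regs[REG_NSR] |= NSR_EO;
        r->irq_raised = 1;
    }
}

/* Record a pending event at the given priority, then re-check. */
static void ring_set_pending(Ring *r, uint8_t priority)
{
    r->regs[REG_IPB] |= priority_to_ipb(priority);
    r->regs[REG_PIPR] = ipb_to_pipr(r->regs[REG_IPB]);
    ring_notify(r);
}

/* O/S acknowledge: returns the NSR (before the ack) and the new CPPR. */
static uint16_t ring_accept(Ring *r)
{
    uint8_t nsr = r->regs[REG_NSR];

    r->irq_raised = 0;
    if (nsr & NSR_EO) {
        uint8_t cppr = r->regs[REG_PIPR];

        r->regs[REG_CPPR] = cppr;
        r->regs[REG_IPB] &= (uint8_t)~priority_to_ipb(cppr);
        r->regs[REG_PIPR] = ipb_to_pipr(r->regs[REG_IPB]);
        r->regs[REG_NSR] &= (uint8_t)~NSR_EO;
    }
    return (uint16_t)((nsr << 8) | r->regs[REG_CPPR]);
}
```

In this model, with priorities 5 and 2 both pending and the CPPR at
0xff, the acknowledge hands out priority 2 and the PIPR falls back to 5.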

Signed-off-by: Cédric Le Goater 
Reviewed-by: David Gibson 
---
 hw/intc/xive.c | 94 +-
 1 file changed, 93 insertions(+), 1 deletion(-)

diff --git a/hw/intc/xive.c b/hw/intc/xive.c
index 3eecffe99b3a..ea5385ff7784 100644
--- a/hw/intc/xive.c
+++ b/hw/intc/xive.c
@@ -22,9 +22,73 @@
  * XIVE Thread Interrupt Management context
  */
 
+/* Convert a priority number to an Interrupt Pending Buffer (IPB)
+ * register, which indicates a pending interrupt at the priority
+ * corresponding to the bit number
+ */
+static uint8_t priority_to_ipb(uint8_t priority)
+{
+return priority > XIVE_PRIORITY_MAX ?
+0 : 1 << (XIVE_PRIORITY_MAX - priority);
+}
+
+/* Convert an Interrupt Pending Buffer (IPB) register to a Pending
+ * Interrupt Priority Register (PIPR), which contains the priority of
+ * the most favored pending notification.
+ */
+static uint8_t ipb_to_pipr(uint8_t ibp)
+{
+return ibp ? clz32((uint32_t)ibp << 24) : 0xff;
+}
+
+static void ipb_update(uint8_t *regs, uint8_t priority)
+{
+regs[TM_IPB] |= priority_to_ipb(priority);
+regs[TM_PIPR] = ipb_to_pipr(regs[TM_IPB]);
+}
+
+static uint8_t exception_mask(uint8_t ring)
+{
+switch (ring) {
+case TM_QW1_OS:
+return TM_QW1_NSR_EO;
+default:
+g_assert_not_reached();
+}
+}
+
 static uint64_t xive_tctx_accept(XiveTCTX *tctx, uint8_t ring)
 {
-return 0;
+uint8_t *regs = &tctx->regs[ring];
+uint8_t nsr = regs[TM_NSR];
+uint8_t mask = exception_mask(ring);
+
+qemu_irq_lower(tctx->output);
+
+if (regs[TM_NSR] & mask) {
+uint8_t cppr = regs[TM_PIPR];
+
+regs[TM_CPPR] = cppr;
+
+/* Reset the pending buffer bit */
+regs[TM_IPB] &= ~priority_to_ipb(cppr);
+regs[TM_PIPR] = ipb_to_pipr(regs[TM_IPB]);
+
+/* Drop Exception bit */
+regs[TM_NSR] &= ~mask;
+}
+
+return (nsr << 8) | regs[TM_CPPR];
+}
+
+static void xive_tctx_notify(XiveTCTX *tctx, uint8_t ring)
+{
+uint8_t *regs = &tctx->regs[ring];
+
+if (regs[TM_PIPR] < regs[TM_CPPR]) {
+regs[TM_NSR] |= exception_mask(ring);
+qemu_irq_raise(tctx->output);
+}
 }
 
 static void xive_tctx_set_cppr(XiveTCTX *tctx, uint8_t ring, uint8_t cppr)
@@ -34,6 +98,9 @@ static void xive_tctx_set_cppr(XiveTCTX *tctx, uint8_t ring, 
uint8_t cppr)
 }
 
 tctx->regs[ring + TM_CPPR] = cppr;
+
+/* CPPR has changed, check if we need to raise a pending exception */
+xive_tctx_notify(tctx, ring);
 }
 
 /*
@@ -189,6 +256,17 @@ static void xive_tm_set_os_cppr(XiveTCTX *tctx, hwaddr 
offset,
 xive_tctx_set_cppr(tctx, TM_QW1_OS, value & 0xff);
 }
 
+/*
+ * Adjust the IPB to allow a CPU to process event queues of other
+ * priorities during one physical interrupt cycle.
+ */
+static void xive_tm_set_os_pending(XiveTCTX *tctx, hwaddr offset,
+   uint64_t value, unsigned size)
+{
+ipb_update(&tctx->regs[TM_QW1_OS], value & 0xff);
+xive_tctx_notify(tctx, TM_QW1_OS);
+}
+
 /*
  * Define a mapping of "special" operations depending on the TIMA page
  * offset and the size of the operation.
@@ -211,6 +289,7 @@ static const XiveTmOp xive_tm_operations[] = {
 
 /* MMIOs above 2K : special operations with side effects */
 { XIVE_TM_OS_PAGE, TM_SPC_ACK_OS_REG, 2, NULL, xive_tm_ack_os_reg },
+{ XIVE_TM_OS_PAGE, TM_SPC_SET_OS_PENDING, 1, xive_tm_set_os_pending, NULL 
},
 };
 
 static const XiveTmOp *xive_tm_find_op(hwaddr offset, unsigned size, bool 
write)
@@ -373,6 +452,13 @@ static void xive_tctx_reset(void *dev)
 tctx->regs[TM_QW1_OS + TM_LSMFB] = 0xFF;
 tctx->regs[TM_QW1_OS + TM_ACK_CNT] = 0xFF;
 tctx->regs[TM_QW1_OS + TM_AGE] = 0xFF;
+
+/*
+ * Initialize PIPR to 0xFF to avoid phantom interrupts when the
+ * CPPR is first set.
+ */
+tctx->regs[TM_QW1_OS + TM_PIPR] =
+ipb_to_pipr(tctx->regs[TM_QW1_OS + TM_IPB]);
 }
 
 

[Qemu-devel] [RFC v2 10/38] plugin-gen: add module for TCG-related code

2018-12-09 Thread Emilio G. Cota
We first inject empty instrumentation from translator_loop.
After translation, we go through the plugins to see what
they want to register for, filling in the empty instrumentation.
If it turns out that some instrumentation remains unused, we
remove it.

This approach supports the following features:

- Inlining TCG code for simple operations. Note that we do not
  export TCG ops to plugins. Instead, we give them a C API to
  insert inlined ops. So far we only support adding an immediate
  to a u64, e.g. to count events.

- "Direct" callbacks. These are callbacks that do not go via
  a helper. Instead, the helper is defined at run-time, so that
  the plugin code is directly called from TCG. This makes direct
  callbacks as efficient as possible; they are therefore used
  for very frequent events, e.g. memory callbacks.

- Passing the host address to memory callbacks. Most of this
  is implemented in a later patch though.

- Instrumentation of memory accesses performed from helpers.
  See the corresponding comment, as well as a later patch.
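To make the three-step scheme (inject empty, fill in, remove unused)
more concrete, here is a toy model. It is only a sketch: OpStream, Slot
and the stream_* helpers are hypothetical and stand in for the TCG op
stream, which the real code manipulates directly.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*plugin_cb)(void *udata);

/* One placeholder in the (toy) op stream. */
typedef struct Slot {
    plugin_cb cb;    /* NULL while the placeholder is still empty */
    void *udata;
} Slot;

#define MAX_SLOTS 8

typedef struct OpStream {
    Slot slots[MAX_SLOTS];
    size_t n;
} OpStream;

/* Step 1, during translation: reserve an empty placeholder. */
static size_t stream_inject_empty(OpStream *s)
{
    assert(s->n < MAX_SLOTS);
    s->slots[s->n].cb = NULL;
    s->slots[s->n].udata = NULL;
    return s->n++;
}

/* Step 2, after translation: a plugin registered for this placeholder. */
static void stream_fill(OpStream *s, size_t idx, plugin_cb cb, void *udata)
{
    assert(idx < s->n);
    s->slots[idx].cb = cb;
    s->slots[idx].udata = udata;
}

/* Step 3: drop placeholders nobody filled; returns how many were removed. */
static size_t stream_prune(OpStream *s)
{
    size_t kept = 0;

    for (size_t i = 0; i < s->n; i++) {
        if (s->slots[i].cb) {
            s->slots[kept++] = s->slots[i];
        }
    }
    size_t removed = s->n - kept;
    s->n = kept;
    return removed;
}

/* Example callback: count events in a 64-bit counter. */
static void count_event(void *udata)
{
    *(unsigned long long *)udata += 1;
}
```

The point of reserving placeholders up front is that the translator
does not need to know what plugins want while it is emitting code;
whatever is left unfilled costs nothing once pruned.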

Signed-off-by: Emilio G. Cota 
---
 accel/tcg/plugin-helpers.h  |6 +
 include/exec/helper-gen.h   |1 +
 include/exec/helper-proto.h |1 +
 include/exec/helper-tcg.h   |1 +
 include/exec/plugin-gen.h   |   66 +++
 tcg/tcg-op.h|   11 +
 tcg/tcg-opc.h   |3 +
 tcg/tcg.h   |   20 +
 accel/tcg/plugin-gen.c  | 1077 +++
 tcg/tcg.c   |   17 +
 accel/tcg/Makefile.objs |1 +
 11 files changed, 1204 insertions(+)
 create mode 100644 accel/tcg/plugin-helpers.h
 create mode 100644 include/exec/plugin-gen.h
 create mode 100644 accel/tcg/plugin-gen.c

diff --git a/accel/tcg/plugin-helpers.h b/accel/tcg/plugin-helpers.h
new file mode 100644
index 00..5457a3a577
--- /dev/null
+++ b/accel/tcg/plugin-helpers.h
@@ -0,0 +1,6 @@
+#ifdef CONFIG_PLUGIN
+/* Note: no TCG flags because those are overwritten later */
+DEF_HELPER_2(plugin_vcpu_udata_cb, void, i32, ptr)
+DEF_HELPER_4(plugin_vcpu_mem_cb, void, i32, i32, i64, ptr)
+DEF_HELPER_5(plugin_vcpu_mem_haddr_cb, void, i32, i32, i64, ptr, ptr)
+#endif
diff --git a/include/exec/helper-gen.h b/include/exec/helper-gen.h
index 22381a1708..236ff40524 100644
--- a/include/exec/helper-gen.h
+++ b/include/exec/helper-gen.h
@@ -70,6 +70,7 @@ static inline void glue(gen_helper_, 
name)(dh_retvar_decl(ret)  \
 #include "trace/generated-helpers.h"
 #include "trace/generated-helpers-wrappers.h"
 #include "tcg-runtime.h"
+#include "plugin-helpers.h"
 
 #undef DEF_HELPER_FLAGS_0
 #undef DEF_HELPER_FLAGS_1
diff --git a/include/exec/helper-proto.h b/include/exec/helper-proto.h
index 74943edb13..1c4ba9bc78 100644
--- a/include/exec/helper-proto.h
+++ b/include/exec/helper-proto.h
@@ -33,6 +33,7 @@ dh_ctype(ret) HELPER(name) (dh_ctype(t1), dh_ctype(t2), 
dh_ctype(t3), \
 #include "helper.h"
 #include "trace/generated-helpers.h"
 #include "tcg-runtime.h"
+#include "plugin-helpers.h"
 
 #undef DEF_HELPER_FLAGS_0
 #undef DEF_HELPER_FLAGS_1
diff --git a/include/exec/helper-tcg.h b/include/exec/helper-tcg.h
index b3bdb0c399..3977b4c606 100644
--- a/include/exec/helper-tcg.h
+++ b/include/exec/helper-tcg.h
@@ -48,6 +48,7 @@
 #include "helper.h"
 #include "trace/generated-helpers.h"
 #include "tcg-runtime.h"
+#include "plugin-helpers.h"
 
 #undef str
 #undef DEF_HELPER_FLAGS_0
diff --git a/include/exec/plugin-gen.h b/include/exec/plugin-gen.h
new file mode 100644
index 00..449ea16034
--- /dev/null
+++ b/include/exec/plugin-gen.h
@@ -0,0 +1,66 @@
+/*
+ * Copyright (C) 2017, Emilio G. Cota 
+ *
+ * License: GNU GPL, version 2 or later.
+ *   See the COPYING file in the top-level directory.
+ *
+ * plugin-gen.h - TCG-dependent definitions for generating plugin code
+ *
+ * This header should be included only from plugin.c and C files that emit
+ * TCG code.
+ */
+#ifndef QEMU_PLUGIN_GEN_H
+#define QEMU_PLUGIN_GEN_H
+
+#include "qemu/plugin.h"
+#include "tcg/tcg.h"
+
+/* used by plugin_callback_start and plugin_callback_end TCG ops */
+enum plugin_gen_from {
+PLUGIN_GEN_FROM_TB,
+PLUGIN_GEN_FROM_INSN,
+PLUGIN_GEN_FROM_MEM,
+PLUGIN_GEN_AFTER_INSN,
+PLUGIN_GEN_N_FROMS,
+};
+
+struct DisasContextBase;
+
+#ifdef CONFIG_PLUGIN
+
+bool plugin_gen_tb_start(CPUState *cpu, const TranslationBlock *tb);
+void plugin_gen_tb_end(CPUState *cpu);
+void plugin_gen_insn_start(CPUState *cpu, const struct DisasContextBase *db);
+void plugin_gen_insn_end(void);
+
+void plugin_gen_disable_mem_helpers(void);
+void plugin_gen_empty_mem_callback(TCGv addr, uint8_t info);
+
+#else /* !CONFIG_PLUGIN */
+
+static inline
+bool plugin_gen_tb_start(CPUState *cpu, const TranslationBlock *tb)
+{
+return false;
+}
+
+static inline
+void plugin_gen_insn_start(CPUState *cpu, const struct DisasContextBase *db)
+{ }
+
+static inline void plugin_gen_insn_end(void)
+{ }
+
+static inline void plugin_gen_tb_end(CPUState *cpu)
+{ }
+
+static inline 

[Qemu-devel] [RFC v2 26/38] target/hppa: fetch code with translator_ld

2018-12-09 Thread Emilio G. Cota
Signed-off-by: Emilio G. Cota 
---
 target/hppa/translate.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/hppa/translate.c b/target/hppa/translate.c
index df9179e70f..806dbda51f 100644
--- a/target/hppa/translate.c
+++ b/target/hppa/translate.c
@@ -4754,7 +4754,7 @@ static void hppa_tr_translate_insn(DisasContextBase 
*dcbase, CPUState *cs)
 {
 /* Always fetch the insn, even if nullified, so that we check
the page permissions for execute.  */
-uint32_t insn = cpu_ldl_code(env, ctx->base.pc_next);
+uint32_t insn = translator_ldl(env, ctx->base.pc_next);
 
 /* Set up the IA queue for the next insn.
This will be overwritten by a branch.  */
-- 
2.17.1




[Qemu-devel] [RFC v2 36/38] vl: support -plugin option

2018-12-09 Thread Emilio G. Cota
From: Lluís Vilanova 

Signed-off-by: Lluís Vilanova 
[ cota: s/instrument/plugin ]
Signed-off-by: Emilio G. Cota 
---
 vl.c| 11 +++
 qemu-options.hx | 17 +
 2 files changed, 28 insertions(+)

diff --git a/vl.c b/vl.c
index 1fcacc5caa..a1d6b76315 100644
--- a/vl.c
+++ b/vl.c
@@ -111,6 +111,7 @@ int main(int argc, char **argv)
 
 #include "trace-root.h"
 #include "trace/control.h"
+#include "qemu/plugin.h"
 #include "qemu/queue.h"
 #include "sysemu/arch_init.h"
 
@@ -2998,6 +2999,7 @@ int main(int argc, char **argv, char **envp)
 } BlockdevOptions_queue;
 QSIMPLEQ_HEAD(, BlockdevOptions_queue) bdo_queue
 = QSIMPLEQ_HEAD_INITIALIZER(bdo_queue);
+struct qemu_plugin_list plugin_list = QTAILQ_HEAD_INITIALIZER(plugin_list);
 
 module_call_init(MODULE_INIT_TRACE);
 
@@ -3026,6 +3028,7 @@ int main(int argc, char **argv, char **envp)
 qemu_add_opts(&qemu_global_opts);
 qemu_add_opts(&qemu_mon_opts);
 qemu_add_opts(&qemu_trace_opts);
+qemu_plugin_add_opts();
 qemu_add_opts(&qemu_option_rom_opts);
 qemu_add_opts(&qemu_machine_opts);
 qemu_add_opts(&qemu_accel_opts);
@@ -3840,6 +3843,9 @@ int main(int argc, char **argv, char **envp)
 g_free(trace_file);
 trace_file = trace_opt_parse(optarg);
 break;
+case QEMU_OPTION_plugin:
+qemu_plugin_opt_parse(optarg, &plugin_list);
+break;
 case QEMU_OPTION_readconfig:
 {
 int ret = qemu_read_config_file(optarg);
@@ -4137,6 +4143,11 @@ int main(int argc, char **argv, char **envp)
machine_class->default_machine_opts, 0);
 }
 
+/* process plugin before CPUs are created, but once -smp has been parsed */
+if (qemu_plugin_load_list(&plugin_list)) {
+exit(1);
+}
+
 qemu_opts_foreach(qemu_find_opts("device"),
   default_driver_check, NULL, NULL);
 qemu_opts_foreach(qemu_find_opts("global"),
diff --git a/qemu-options.hx b/qemu-options.hx
index 08f8516a9a..f3549eb2ec 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -3847,6 +3847,23 @@ HXCOMM HX does not support conditional compilation of 
text.
 @findex -trace
 @include qemu-option-trace.texi
 ETEXI
+DEF("plugin", HAS_ARG, QEMU_OPTION_plugin,
+"-plugin [file=][,arg=]\n"
+"load a plugin\n",
+QEMU_ARCH_ALL)
+STEXI
+@item -plugin file=@var{file}[,arg=@var{string}]
+@findex -plugin
+
+Load a plugin.
+
+@table @option
+@item file=@var{file}
+Load the given plugin from a shared library file.
+@item arg=@var{string}
+Argument string passed to the plugin. (Can be given multiple times.)
+@end table
+ETEXI
 
 HXCOMM Internal use
 DEF("qtest", HAS_ARG, QEMU_OPTION_qtest, "", QEMU_ARCH_ALL)
-- 
2.17.1




[Qemu-devel] [RFC v2 30/38] target/sparc: fetch code with translator_ld

2018-12-09 Thread Emilio G. Cota
Signed-off-by: Emilio G. Cota 
---
 target/sparc/translate.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/sparc/translate.c b/target/sparc/translate.c
index 74315cdf09..2c754b6163 100644
--- a/target/sparc/translate.c
+++ b/target/sparc/translate.c
@@ -5900,7 +5900,7 @@ static void sparc_tr_translate_insn(DisasContextBase 
*dcbase, CPUState *cs)
 CPUSPARCState *env = cs->env_ptr;
 unsigned int insn;
 
-insn = cpu_ldl_code(env, dc->pc);
+insn = translator_ldl(env, dc->pc);
 dc->base.pc_next += 4;
 disas_sparc_insn(dc, insn);
 
-- 
2.17.1




[Qemu-devel] [RFC v2 17/38] *-user: notify plugin of exit

2018-12-09 Thread Emilio G. Cota
Reviewed-by: Alex Bennée 
Signed-off-by: Emilio G. Cota 
---
 bsd-user/syscall.c | 3 +++
 linux-user/exit.c  | 1 +
 2 files changed, 4 insertions(+)

diff --git a/bsd-user/syscall.c b/bsd-user/syscall.c
index 66492aaf5d..b7818af450 100644
--- a/bsd-user/syscall.c
+++ b/bsd-user/syscall.c
@@ -332,6 +332,7 @@ abi_long do_freebsd_syscall(void *cpu_env, int num, 
abi_long arg1,
 _mcleanup();
 #endif
 gdb_exit(cpu_env, arg1);
+qemu_plugin_atexit_cb();
 /* XXX: should free thread stack and CPU env */
 _exit(arg1);
 ret = 0; /* avoid warning */
@@ -430,6 +431,7 @@ abi_long do_netbsd_syscall(void *cpu_env, int num, abi_long 
arg1,
 _mcleanup();
 #endif
 gdb_exit(cpu_env, arg1);
+qemu_plugin_atexit_cb();
 /* XXX: should free thread stack and CPU env */
 _exit(arg1);
 ret = 0; /* avoid warning */
@@ -505,6 +507,7 @@ abi_long do_openbsd_syscall(void *cpu_env, int num, 
abi_long arg1,
 _mcleanup();
 #endif
 gdb_exit(cpu_env, arg1);
+qemu_plugin_atexit_cb();
 /* XXX: should free thread stack and CPU env */
 _exit(arg1);
 ret = 0; /* avoid warning */
diff --git a/linux-user/exit.c b/linux-user/exit.c
index 14e94e28fa..768856483a 100644
--- a/linux-user/exit.c
+++ b/linux-user/exit.c
@@ -32,4 +32,5 @@ void preexit_cleanup(CPUArchState *env, int code)
 __gcov_dump();
 #endif
 gdb_exit(env, code);
+qemu_plugin_atexit_cb();
 }
-- 
2.17.1




[Qemu-devel] [RFC v2 15/38] tcg: let plugins instrument memory accesses

2018-12-09 Thread Emilio G. Cota
XXX: store hostaddr from non-i386 TCG backends
XXX: what hostaddr to return for I/O accesses?
XXX: what hostaddr to return for cross-page accesses?

Here the trickiest feature is passing the host address to
memory callbacks that request it. Perhaps it would be more
appropriate to pass a "physical" address to plugins, but since
in QEMU host addr ~= guest physical, I'm going with that for
simplicity.

To keep the implementation simple we piggy-back on the TLB fast path,
and thus can only provide the host address _after_ memory accesses
have occurred. For the slow path, it's a bit tedious because there
are many places to update, but it's fairly simple.

However, note that cross-page accesses are tricky, since the
access might be to non-contiguous host addresses. So I'm punting
on that and just passing NULL.
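
To make the cross-page caveat concrete, here is a minimal standalone
sketch. It is not QEMU code: the 4 KiB page size and the names
sketch_crosses_page() and sketch_hostaddr() are assumptions made up for
illustration. It only shows the policy described above: report the host
address when the access stays within one guest page, and report NULL
when it straddles two pages, since the two halves may not be contiguous
in host memory.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u /* hypothetical guest page size */

/* Nonzero if the access [addr, addr + size) touches two guest pages. */
static int sketch_crosses_page(uint64_t addr, size_t size)
{
    uint64_t first = addr / SKETCH_PAGE_SIZE;
    uint64_t last = (addr + size - 1) / SKETCH_PAGE_SIZE;

    return first != last;
}

/*
 * Host address to hand to a memory callback.
 * page_host is the host mapping of the guest page that contains addr.
 * Cross-page (and zero-sized) accesses yield NULL, mirroring the
 * "punt and pass NULL" choice in the commit message.
 */
static void *sketch_hostaddr(uint8_t *page_host, uint64_t addr, size_t size)
{
    if (size == 0 || sketch_crosses_page(addr, size)) {
        return NULL;
    }
    return page_host + (addr % SKETCH_PAGE_SIZE);
}
```

A callback that receives NULL from such a scheme has to treat the host
address as unavailable rather than dereference it.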

Signed-off-by: Emilio G. Cota 
---
 accel/tcg/atomic_template.h   |  5 +++
 accel/tcg/softmmu_template.h  | 43 +-
 include/exec/cpu-defs.h   |  9 +
 include/exec/cpu_ldst.h   |  9 +
 include/exec/cpu_ldst_template.h  | 43 ++
 include/exec/cpu_ldst_useronly_template.h | 42 +++---
 tcg/tcg.h |  1 +
 accel/tcg/cpu-exec.c  |  2 ++
 accel/tcg/cputlb.c|  9 +
 tcg/i386/tcg-target.inc.c |  7 
 tcg/tcg-op.c  | 44 ++-
 11 files changed, 169 insertions(+), 45 deletions(-)

diff --git a/accel/tcg/atomic_template.h b/accel/tcg/atomic_template.h
index 2f7d5ee02a..5619c4b4b9 100644
--- a/accel/tcg/atomic_template.h
+++ b/accel/tcg/atomic_template.h
@@ -18,6 +18,7 @@
  * License along with this library; if not, see <http://www.gnu.org/licenses/>.
  */
 
+#include "qemu/plugin.h"
 #include "trace/mem.h"
 
 #if DATA_SIZE == 16
@@ -73,6 +74,8 @@ void atomic_trace_rmw_pre(CPUArchState *env, target_ulong addr, uint8_t info)
 static inline void atomic_trace_rmw_post(CPUArchState *env, target_ulong addr,
  void *haddr, uint8_t info)
 {
+qemu_plugin_vcpu_mem_cb(ENV_GET_CPU(env), addr, haddr, info);
+qemu_plugin_vcpu_mem_cb(ENV_GET_CPU(env), addr, haddr, info | TRACE_MEM_ST);
 }
 
 static inline
@@ -84,6 +87,7 @@ void atomic_trace_ld_pre(CPUArchState *env, target_ulong addr, uint8_t info)
 static inline void atomic_trace_ld_post(CPUArchState *env, target_ulong addr,
 void *haddr, uint8_t info)
 {
+qemu_plugin_vcpu_mem_cb(ENV_GET_CPU(env), addr, haddr, info);
 }
 
 static inline
@@ -95,6 +99,7 @@ void atomic_trace_st_pre(CPUArchState *env, target_ulong addr, uint8_t info)
 static inline void atomic_trace_st_post(CPUArchState *env, target_ulong addr,
 void *haddr, uint8_t info)
 {
+qemu_plugin_vcpu_mem_cb(ENV_GET_CPU(env), addr, haddr, info);
 }
 #endif /* ATOMIC_TEMPLATE_COMMON */
 
diff --git a/accel/tcg/softmmu_template.h b/accel/tcg/softmmu_template.h
index b0adea045e..79109e25a1 100644
--- a/accel/tcg/softmmu_template.h
+++ b/accel/tcg/softmmu_template.h
@@ -45,7 +45,6 @@
 #error unsupported data size
 #endif
 
-
 /* For the benefit of TCG generated code, we want to avoid the complication
of ABI-specific return type promotion and always return a value extended
to the register size of the host.  This is tcg_target_long, except in the
@@ -99,10 +98,15 @@ static inline DATA_TYPE glue(io_read, SUFFIX)(CPUArchState *env,
   size_t mmu_idx, size_t index,
   target_ulong addr,
   uintptr_t retaddr,
+  TCGMemOp mo,
   bool recheck,
   MMUAccessType access_type)
 {
 CPUIOTLBEntry *iotlbentry = &env->iotlb[mmu_idx][index];
+
+/* XXX Any sensible choice other than NULL? */
+set_hostaddr(env, mo, NULL);
+
 return io_readx(env, iotlbentry, mmu_idx, addr, retaddr, recheck,
 access_type, DATA_SIZE);
 }
@@ -115,7 +119,8 @@ WORD_TYPE helper_le_ld_name(CPUArchState *env, target_ulong addr,
 uintptr_t index = tlb_index(env, mmu_idx, addr);
 CPUTLBEntry *entry = tlb_entry(env, mmu_idx, addr);
 target_ulong tlb_addr = entry->ADDR_READ;
-unsigned a_bits = get_alignment_bits(get_memop(oi));
+TCGMemOp mo = get_memop(oi);
+unsigned a_bits = get_alignment_bits(mo);
 uintptr_t haddr;
 DATA_TYPE res;
 
@@ -141,7 +146,7 @@ WORD_TYPE helper_le_ld_name(CPUArchState *env, target_ulong addr,
 
 /* ??? Note that the io helpers always read data in the target
byte ordering.  We should push the LE/BE request down into io.  */
-res = glue(io_read, SUFFIX)(env, mmu_idx, index, addr, retaddr,
+ 

[Qemu-devel] [RFC v2 34/38] plugin: add API symbols to qemu-plugins.symbols

2018-12-09 Thread Emilio G. Cota
Signed-off-by: Emilio G. Cota 
---
 qemu-plugins.symbols | 34 ++
 1 file changed, 34 insertions(+)
 create mode 100644 qemu-plugins.symbols

diff --git a/qemu-plugins.symbols b/qemu-plugins.symbols
new file mode 100644
index 00..2a5b18862a
--- /dev/null
+++ b/qemu-plugins.symbols
@@ -0,0 +1,34 @@
+{
+  qemu_plugin_uninstall;
+  qemu_plugin_register_vcpu_init_cb;
+  qemu_plugin_register_vcpu_exit_cb;
+  qemu_plugin_register_vcpu_idle_cb;
+  qemu_plugin_register_vcpu_resume_cb;
+  qemu_plugin_register_vcpu_insn_exec_cb;
+  qemu_plugin_register_vcpu_insn_exec_inline;
+  qemu_plugin_register_vcpu_mem_cb;
+  qemu_plugin_register_vcpu_mem_haddr_cb;
+  qemu_plugin_register_vcpu_mem_inline;
+  qemu_plugin_ram_addr_from_host;
+  qemu_plugin_register_vcpu_tb_trans_cb;
+  qemu_plugin_register_vcpu_tb_exec_cb;
+  qemu_plugin_register_vcpu_tb_exec_inline;
+  qemu_plugin_register_flush_cb;
+  qemu_plugin_register_vcpu_syscall_cb;
+  qemu_plugin_register_vcpu_syscall_ret_cb;
+  qemu_plugin_register_atexit_cb;
+  qemu_plugin_tb_n_insns;
+  qemu_plugin_tb_get_insn;
+  qemu_plugin_tb_vaddr;
+  qemu_plugin_insn_data;
+  qemu_plugin_insn_size;
+  qemu_plugin_insn_vaddr;
+  qemu_plugin_insn_haddr;
+  qemu_plugin_mem_size_shift;
+  qemu_plugin_mem_is_sign_extended;
+  qemu_plugin_mem_is_big_endian;
+  qemu_plugin_mem_is_store;
+  qemu_plugin_vcpu_for_each;
+  qemu_plugin_n_vcpus;
+  qemu_plugin_n_max_vcpus;
+};
-- 
2.17.1




[Qemu-devel] [RFC v2 22/38] target/arm: call qemu_plugin_insn_append

2018-12-09 Thread Emilio G. Cota
I considered using translator_ld* from arm_ldl_code
et al. However, note that there's a helper that also calls
arm_ldl_code, so we'd have to change that caller.

In thumb's case I'm also calling plugin_insn_append directly,
since we can't assume that all instructions are 16 bits long.
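
As a rough standalone illustration of the Thumb case (not the QEMU
implementation): the buffer type and the sketch_* names below are
hypothetical, and the length test assumes a Thumb-2 capable core, where
a first halfword whose top five bits are 0b11101, 0b11110 or 0b11111
starts a 32-bit instruction. The point is only that the number of bytes
recorded for the plugin depends on the decoded length.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the per-instruction byte buffer. */
struct sketch_insn_buf {
    uint8_t data[16];
    size_t len;
};

/* Append size bytes to the buffer; silently drop them on overflow. */
static void sketch_insn_append(struct sketch_insn_buf *b,
                               const void *from, size_t size)
{
    if (size > sizeof(b->data) - b->len) {
        return;
    }
    memcpy(b->data + b->len, from, size);
    b->len += size;
}

/* Thumb-2 rule: bits [15:11] below 0b11101 mean a 16-bit instruction. */
static int sketch_thumb2_is_16bit(uint16_t hw)
{
    return (hw >> 11) < 0x1d;
}

/* Record one instruction; returns the number of bytes recorded. */
static size_t sketch_record_thumb(struct sketch_insn_buf *b,
                                  uint16_t hw1, uint16_t hw2)
{
    if (sketch_thumb2_is_16bit(hw1)) {
        sketch_insn_append(b, &hw1, sizeof(hw1));
        return sizeof(hw1);
    } else {
        uint32_t insn = (uint32_t)hw1 << 16 | hw2;

        sketch_insn_append(b, &insn, sizeof(insn));
        return sizeof(insn);
    }
}
```

The second halfword is ignored for 16-bit instructions, which is why the
append cannot simply be folded into a fixed-width load helper.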

Signed-off-by: Emilio G. Cota 
---
 target/arm/translate-a64.c | 2 ++
 target/arm/translate.c | 8 +++-
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 88195ab949..db95161c16 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -38,6 +38,7 @@
 #include "trace-tcg.h"
 #include "translate-a64.h"
 #include "qemu/atomic128.h"
+#include "qemu/plugin.h"
 
 static TCGv_i64 cpu_X[32];
 static TCGv_i64 cpu_pc;
@@ -13321,6 +13322,7 @@ static void disas_a64_insn(CPUARMState *env, DisasContext *s)
 uint32_t insn;
 
 insn = arm_ldl_code(env, s->pc, s->sctlr_b);
+plugin_insn_append(&insn, sizeof(insn));
 s->insn = insn;
 s->pc += 4;
 
diff --git a/target/arm/translate.c b/target/arm/translate.c
index 7c4675ffd8..d5171f54f6 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -13234,6 +13234,7 @@ static void arm_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
 }
 
 insn = arm_ldl_code(env, dc->pc, dc->sctlr_b);
+plugin_insn_append(&insn, sizeof(insn));
 dc->insn = insn;
 dc->pc += 4;
 disas_arm_insn(dc, insn);
@@ -13304,11 +13305,16 @@ static void thumb_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
 insn = arm_lduw_code(env, dc->pc, dc->sctlr_b);
 is_16bit = thumb_insn_is_16bit(dc, insn);
 dc->pc += 2;
-if (!is_16bit) {
+if (is_16bit) {
+uint16_t insn16 = insn;
+
+plugin_insn_append(&insn16, sizeof(insn16));
+} else {
 uint32_t insn2 = arm_lduw_code(env, dc->pc, dc->sctlr_b);
 
 insn = insn << 16 | insn2;
 dc->pc += 2;
+plugin_insn_append(&insn, sizeof(insn));
 }
 dc->insn = insn;
 
-- 
2.17.1




[Qemu-devel] [RFC v2 29/38] target/riscv: fetch code with translator_ld

2018-12-09 Thread Emilio G. Cota
Signed-off-by: Emilio G. Cota 
---
 target/riscv/translate.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index 18d7b6d147..fa96f45a69 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -1848,7 +1848,7 @@ static void riscv_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
 DisasContext *ctx = container_of(dcbase, DisasContext, base);
 CPURISCVState *env = cpu->env_ptr;
 
-ctx->opcode = cpu_ldl_code(env, ctx->base.pc_next);
+ctx->opcode = translator_ldl(env, ctx->base.pc_next);
 decode_opc(env, ctx);
 ctx->base.pc_next = ctx->pc_succ_insn;
 
-- 
2.17.1




[Qemu-devel] [RFC v2 18/38] *-user: plugin syscalls

2018-12-09 Thread Emilio G. Cota
Signed-off-by: Emilio G. Cota 
---
 bsd-user/syscall.c   | 9 +
 linux-user/syscall.c | 3 +++
 2 files changed, 12 insertions(+)

diff --git a/bsd-user/syscall.c b/bsd-user/syscall.c
index b7818af450..4993f81b2b 100644
--- a/bsd-user/syscall.c
+++ b/bsd-user/syscall.c
@@ -323,6 +323,8 @@ abi_long do_freebsd_syscall(void *cpu_env, int num, abi_long arg1,
 gemu_log("freebsd syscall %d\n", num);
 #endif
 trace_guest_user_syscall(cpu, num, arg1, arg2, arg3, arg4, arg5, arg6, arg7, arg8);
+qemu_plugin_vcpu_syscall(cpu, num, arg1, arg2, arg3, arg4, arg5, arg6, arg7,
+ arg8);
 if(do_strace)
 print_freebsd_syscall(num, arg1, arg2, arg3, arg4, arg5, arg6);
 
@@ -404,6 +406,7 @@ abi_long do_freebsd_syscall(void *cpu_env, int num, abi_long arg1,
 if (do_strace)
 print_freebsd_syscall_ret(num, ret);
 trace_guest_user_syscall_ret(cpu, num, ret);
+qemu_plugin_vcpu_syscall_ret(cpu, num, ret);
 return ret;
  efault:
 ret = -TARGET_EFAULT;
@@ -422,6 +425,8 @@ abi_long do_netbsd_syscall(void *cpu_env, int num, abi_long arg1,
 gemu_log("netbsd syscall %d\n", num);
 #endif
 trace_guest_user_syscall(cpu, num, arg1, arg2, arg3, arg4, arg5, arg6, 0, 0);
+qemu_plugin_vcpu_syscall(cpu, num, arg1, arg2, arg3, arg4, arg5, arg6, 0,
+ 0);
 if(do_strace)
 print_netbsd_syscall(num, arg1, arg2, arg3, arg4, arg5, arg6);
 
@@ -480,6 +485,7 @@ abi_long do_netbsd_syscall(void *cpu_env, int num, abi_long arg1,
 if (do_strace)
 print_netbsd_syscall_ret(num, ret);
 trace_guest_user_syscall_ret(cpu, num, ret);
+qemu_plugin_vcpu_syscall_ret(cpu, num, ret);
 return ret;
  efault:
 ret = -TARGET_EFAULT;
@@ -498,6 +504,8 @@ abi_long do_openbsd_syscall(void *cpu_env, int num, abi_long arg1,
 gemu_log("openbsd syscall %d\n", num);
 #endif
 trace_guest_user_syscall(cpu, num, arg1, arg2, arg3, arg4, arg5, arg6, 0, 0);
+qemu_plugin_vcpu_syscall(cpu, num, arg1, arg2, arg3, arg4, arg5, arg6, 0,
+ 0);
 if(do_strace)
 print_openbsd_syscall(num, arg1, arg2, arg3, arg4, arg5, arg6);
 
@@ -556,6 +564,7 @@ abi_long do_openbsd_syscall(void *cpu_env, int num, abi_long arg1,
 if (do_strace)
 print_openbsd_syscall_ret(num, ret);
 trace_guest_user_syscall_ret(cpu, num, ret);
+qemu_plugin_vcpu_syscall_ret(cpu, num, ret);
 return ret;
  efault:
 ret = -TARGET_EFAULT;
diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 15b03e17b9..9f6457768c 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -11422,6 +11422,8 @@ abi_long do_syscall(void *cpu_env, int num, abi_long arg1,
 
 trace_guest_user_syscall(cpu, num, arg1, arg2, arg3, arg4,
  arg5, arg6, arg7, arg8);
+qemu_plugin_vcpu_syscall(cpu, num, arg1, arg2, arg3, arg4, arg5, arg6, arg7,
+ arg8);
 
 if (unlikely(do_strace)) {
 print_syscall(num, arg1, arg2, arg3, arg4, arg5, arg6);
@@ -11434,5 +11436,6 @@ abi_long do_syscall(void *cpu_env, int num, abi_long arg1,
 }
 
 trace_guest_user_syscall_ret(cpu, num, ret);
+qemu_plugin_vcpu_syscall_ret(cpu, num, ret);
 return ret;
 }
-- 
2.17.1




[Qemu-devel] [RFC v2 24/38] target/sh4: fetch code with translator_ld (WIP)

2018-12-09 Thread Emilio G. Cota
XXX: cleanly get the gUSA instructions

Signed-off-by: Emilio G. Cota 
---
 target/sh4/translate.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/sh4/translate.c b/target/sh4/translate.c
index ab254b0e8d..1704ce8dae 100644
--- a/target/sh4/translate.c
+++ b/target/sh4/translate.c
@@ -2331,7 +2331,7 @@ static void sh4_tr_translate_insn(DisasContextBase *dcbase, CPUState *cs)
 }
 #endif
 
-ctx->opcode = cpu_lduw_code(env, ctx->base.pc_next);
+ctx->opcode = translator_lduw(env, ctx->base.pc_next);
 decode_opc(ctx);
 ctx->base.pc_next += 2;
 }
-- 
2.17.1




[Qemu-devel] [RFC v2 14/38] atomic_template: add inline trace/plugin helpers

2018-12-09 Thread Emilio G. Cota
In preparation for plugin support.

Signed-off-by: Emilio G. Cota 
---
 accel/tcg/atomic_template.h | 110 
 1 file changed, 75 insertions(+), 35 deletions(-)

diff --git a/accel/tcg/atomic_template.h b/accel/tcg/atomic_template.h
index 8d177fefef..2f7d5ee02a 100644
--- a/accel/tcg/atomic_template.h
+++ b/accel/tcg/atomic_template.h
@@ -59,25 +59,44 @@
 # define ABI_TYPE  uint32_t
 #endif
 
-#define ATOMIC_TRACE_RMW do {   \
-uint8_t info = glue(trace_mem_build_info_no_se, MEND)(SHIFT, false); \
-\
-trace_guest_mem_before_exec(ENV_GET_CPU(env), addr, info);  \
-trace_guest_mem_before_exec(ENV_GET_CPU(env), addr, \
-info | TRACE_MEM_ST);   \
-} while (0)
-
-#define ATOMIC_TRACE_LD do {\
-uint8_t info = glue(trace_mem_build_info_no_se, MEND)(SHIFT, false); \
-\
-trace_guest_mem_before_exec(ENV_GET_CPU(env), addr, info);  \
-} while (0)
-
-# define ATOMIC_TRACE_ST do {   \
-uint8_t info = glue(trace_mem_build_info_no_se, MEND)(SHIFT, true); \
-\
-trace_guest_mem_before_exec(ENV_GET_CPU(env), addr, info);  \
-} while (0)
+#ifndef ATOMIC_TEMPLATE_COMMON
+#define ATOMIC_TEMPLATE_COMMON
+static inline
+void atomic_trace_rmw_pre(CPUArchState *env, target_ulong addr, uint8_t info)
+{
+CPUState *cpu = ENV_GET_CPU(env);
+
+trace_guest_mem_before_exec(cpu, addr, info);
+trace_guest_mem_before_exec(cpu, addr, info | TRACE_MEM_ST);
+}
+
+static inline void atomic_trace_rmw_post(CPUArchState *env, target_ulong addr,
+ void *haddr, uint8_t info)
+{
+}
+
+static inline
+void atomic_trace_ld_pre(CPUArchState *env, target_ulong addr, uint8_t info)
+{
+trace_guest_mem_before_exec(ENV_GET_CPU(env), addr, info);
+}
+
+static inline void atomic_trace_ld_post(CPUArchState *env, target_ulong addr,
+void *haddr, uint8_t info)
+{
+}
+
+static inline
+void atomic_trace_st_pre(CPUArchState *env, target_ulong addr, uint8_t info)
+{
+trace_guest_mem_before_exec(ENV_GET_CPU(env), addr, info);
+}
+
+static inline void atomic_trace_st_post(CPUArchState *env, target_ulong addr,
+void *haddr, uint8_t info)
+{
+}
+#endif /* ATOMIC_TEMPLATE_COMMON */
 
 /* Define host-endian atomic operations.  Note that END is used within
the ATOMIC_NAME macro, and redefined below.  */
@@ -98,14 +117,16 @@ ABI_TYPE ATOMIC_NAME(cmpxchg)(CPUArchState *env, target_ulong addr,
 ATOMIC_MMU_DECLS;
 DATA_TYPE *haddr = ATOMIC_MMU_LOOKUP;
 DATA_TYPE ret;
+uint8_t info = glue(trace_mem_build_info_no_se, MEND)(SHIFT, false);
 
-ATOMIC_TRACE_RMW;
+atomic_trace_rmw_pre(env, addr, info);
 #if DATA_SIZE == 16
 ret = atomic16_cmpxchg(haddr, cmpv, newv);
 #else
 ret = atomic_cmpxchg__nocheck(haddr, cmpv, newv);
 #endif
 ATOMIC_MMU_CLEANUP;
+atomic_trace_rmw_post(env, addr, haddr, info);
 return ret;
 }
 
@@ -115,10 +136,12 @@ ABI_TYPE ATOMIC_NAME(ld)(CPUArchState *env, target_ulong addr EXTRA_ARGS)
 {
 ATOMIC_MMU_DECLS;
 DATA_TYPE val, *haddr = ATOMIC_MMU_LOOKUP;
+uint8_t info = glue(trace_mem_build_info_no_se, MEND)(SHIFT, false);
 
-ATOMIC_TRACE_LD;
+atomic_trace_ld_pre(env, addr, info);
 val = atomic16_read(haddr);
 ATOMIC_MMU_CLEANUP;
+atomic_trace_ld_post(env, addr, haddr, info);
 return val;
 }
 
@@ -127,10 +150,12 @@ void ATOMIC_NAME(st)(CPUArchState *env, target_ulong addr,
 {
 ATOMIC_MMU_DECLS;
 DATA_TYPE *haddr = ATOMIC_MMU_LOOKUP;
+uint8_t info = glue(trace_mem_build_info_no_se, MEND)(SHIFT, true);
 
-ATOMIC_TRACE_ST;
+atomic_trace_st_pre(env, addr, info);
 atomic16_set(haddr, val);
 ATOMIC_MMU_CLEANUP;
+atomic_trace_st_post(env, addr, haddr, info);
 }
 #endif
 #else
@@ -140,10 +165,12 @@ ABI_TYPE ATOMIC_NAME(xchg)(CPUArchState *env, target_ulong addr,
 ATOMIC_MMU_DECLS;
 DATA_TYPE *haddr = ATOMIC_MMU_LOOKUP;
 DATA_TYPE ret;
+uint8_t info = glue(trace_mem_build_info_no_se, MEND)(SHIFT, false);
 
-ATOMIC_TRACE_RMW;
+atomic_trace_rmw_pre(env, addr, info);
 ret = atomic_xchg__nocheck(haddr, val);
 ATOMIC_MMU_CLEANUP;
+atomic_trace_rmw_post(env, addr, haddr, info);
 return ret;
 }
 
@@ -154,10 +181,12 @@ ABI_TYPE ATOMIC_NAME(X)(CPUArchState *env, target_ulong addr,   \
 ATOMIC_MMU_DECLS;   \
 DATA_TYPE *haddr = ATOMIC_MMU_LOOKUP;   \
 DATA_TYPE ret;   

[Qemu-devel] [RFC v2 28/38] target/alpha: fetch code with translator_ld

2018-12-09 Thread Emilio G. Cota
Signed-off-by: Emilio G. Cota 
---
 target/alpha/translate.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/alpha/translate.c b/target/alpha/translate.c
index 25cd95931d..f8d194994a 100644
--- a/target/alpha/translate.c
+++ b/target/alpha/translate.c
@@ -2987,7 +2987,7 @@ static void alpha_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
 {
 DisasContext *ctx = container_of(dcbase, DisasContext, base);
 CPUAlphaState *env = cpu->env_ptr;
-uint32_t insn = cpu_ldl_code(env, ctx->base.pc_next);
+uint32_t insn = translator_ldl(env, ctx->base.pc_next);
 
 ctx->base.pc_next += 4;
 ctx->base.is_jmp = translate_one(ctx, insn);
-- 
2.17.1




[Qemu-devel] [RFC v2 16/38] translate-all: notify plugin code of tb_flush

2018-12-09 Thread Emilio G. Cota
Plugins might allocate per-TB data that is then passed back to them
each time a TB is executed (via the *userdata pointer).

Notify plugin code every time a code cache flush occurs, so
that plugins can then reclaim the memory of the per-TB data.
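
For illustration, a plugin-side sketch of that lifecycle, written as
plain C with no QEMU dependencies. All sketch_* names are hypothetical
and stand in for whatever the plugin registers as its translation,
execution and flush callbacks; the list-based bookkeeping is just one
possible way to track the allocations.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-TB record owned by the plugin. */
struct sketch_tb_data {
    uint64_t exec_count;
    struct sketch_tb_data *next;
};

/* All live per-TB records, so that a flush can find and free them. */
static struct sketch_tb_data *sketch_tb_list;

/* Translation time: allocate the record later passed as userdata. */
static struct sketch_tb_data *sketch_on_tb_translate(void)
{
    struct sketch_tb_data *d = calloc(1, sizeof(*d));

    if (d) {
        d->next = sketch_tb_list;
        sketch_tb_list = d;
    }
    return d;
}

/* Execution time: the record comes back through the userdata pointer. */
static void sketch_on_tb_exec(void *userdata)
{
    struct sketch_tb_data *d = userdata;

    d->exec_count++;
}

/*
 * Flush time: every TB is gone, so no record will be passed back again
 * and all of them can be freed. Returns how many were reclaimed.
 */
static size_t sketch_on_flush(void)
{
    size_t n = 0;

    while (sketch_tb_list) {
        struct sketch_tb_data *next = sketch_tb_list->next;

        free(sketch_tb_list);
        sketch_tb_list = next;
        n++;
    }
    return n;
}
```

Without a flush notification the plugin has no safe point at which to
free these records, so they would accumulate across code cache flushes.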

Reviewed-by: Alex Bennée 
Signed-off-by: Emilio G. Cota 
---
 accel/tcg/translate-all.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index 62d5e13185..aaa8193ceb 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -1232,6 +1232,8 @@ static gboolean tb_host_size_iter(gpointer key, gpointer value, gpointer data)
 /* flush all the translation blocks */
 static void do_tb_flush(CPUState *cpu, run_on_cpu_data tb_flush_count)
 {
+bool did_flush = false;
+
 mmap_lock();
 /* If it is already been done on request of another CPU,
  * just retry.
@@ -1239,6 +1241,7 @@ static void do_tb_flush(CPUState *cpu, run_on_cpu_data tb_flush_count)
 if (tb_ctx.tb_flush_count != tb_flush_count.host_int) {
 goto done;
 }
+did_flush = true;
 
 if (DEBUG_TB_FLUSH_GATE) {
 size_t nb_tbs = tcg_nb_tbs();
@@ -1263,6 +1266,9 @@ static void do_tb_flush(CPUState *cpu, run_on_cpu_data tb_flush_count)
 
 done:
 mmap_unlock();
+if (did_flush) {
+qemu_plugin_flush_cb();
+}
 }
 
 void tb_flush(CPUState *cpu)
-- 
2.17.1




[Qemu-devel] [RFC v2 19/38] cpu: hook plugin vcpu events

2018-12-09 Thread Emilio G. Cota
Signed-off-by: Emilio G. Cota 
---
 cpus.c| 10 ++
 exec.c|  2 ++
 qom/cpu.c |  2 ++
 3 files changed, 14 insertions(+)

diff --git a/cpus.c b/cpus.c
index c9acef73e4..e3844c69c8 100644
--- a/cpus.c
+++ b/cpus.c
@@ -43,6 +43,7 @@
 #include "exec/exec-all.h"
 
 #include "qemu/thread.h"
+#include "qemu/plugin.h"
 #include "sysemu/cpus.h"
 #include "sysemu/qtest.h"
 #include "qemu/main-loop.h"
@@ -1322,12 +1323,21 @@ static void qemu_tcg_rr_wait_io_event(CPUState *cpu)
 
 static void qemu_wait_io_event(CPUState *cpu)
 {
+bool slept = false;
+
 g_assert(cpu_mutex_locked(cpu));
 g_assert(!qemu_mutex_iothread_locked());
 
 while (cpu_thread_is_idle(cpu)) {
+if (!slept) {
+slept = true;
+qemu_plugin_vcpu_idle_cb(cpu);
+}
 qemu_cond_wait(&cpu->halt_cond, &cpu->lock);
 }
+if (slept) {
+qemu_plugin_vcpu_resume_cb(cpu);
+}
 
 #ifdef _WIN32
 /* Eat dummy APC queued by qemu_cpu_kick_thread.  */
diff --git a/exec.c b/exec.c
index 04d505500b..bdafb0e71a 100644
--- a/exec.c
+++ b/exec.c
@@ -967,6 +967,8 @@ void cpu_exec_realizefn(CPUState *cpu, Error **errp)
 }
 tlb_init(cpu);
 
+qemu_plugin_vcpu_init_hook(cpu);
+
 #ifndef CONFIG_USER_ONLY
 if (qdev_get_vmsd(DEVICE(cpu)) == NULL) {
 vmstate_register(NULL, cpu->cpu_index, &vmstate_cpu_common, cpu);
diff --git a/qom/cpu.c b/qom/cpu.c
index b33d182c4c..6233a98a84 100644
--- a/qom/cpu.c
+++ b/qom/cpu.c
@@ -32,6 +32,7 @@
 #include "hw/boards.h"
 #include "hw/qdev-properties.h"
 #include "trace-root.h"
+#include "qemu/plugin.h"
 
 CPUInterruptHandler cpu_interrupt_handler;
 
@@ -352,6 +353,7 @@ static void cpu_common_unrealizefn(DeviceState *dev, Error **errp)
 CPUState *cpu = CPU(dev);
 /* NOTE: latest generic point before the cpu is fully unrealized */
 trace_fini_vcpu(cpu);
+qemu_plugin_vcpu_exit_hook(cpu);
 cpu_exec_unrealizefn(cpu);
 }
 
-- 
2.17.1




[Qemu-devel] [RFC v2 13/38] atomic_template: fix indentation in GEN_ATOMIC_HELPER

2018-12-09 Thread Emilio G. Cota
Reviewed-by: Alex Bennée 
Signed-off-by: Emilio G. Cota 
---
 accel/tcg/atomic_template.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/accel/tcg/atomic_template.h b/accel/tcg/atomic_template.h
index efde12fdb2..8d177fefef 100644
--- a/accel/tcg/atomic_template.h
+++ b/accel/tcg/atomic_template.h
@@ -284,7 +284,7 @@ ABI_TYPE ATOMIC_NAME(xchg)(CPUArchState *env, target_ulong addr,
 
 #define GEN_ATOMIC_HELPER(X)\
 ABI_TYPE ATOMIC_NAME(X)(CPUArchState *env, target_ulong addr,   \
- ABI_TYPE val EXTRA_ARGS)   \
+ABI_TYPE val EXTRA_ARGS)\
 {   \
 ATOMIC_MMU_DECLS;   \
 DATA_TYPE *haddr = ATOMIC_MMU_LOOKUP;   \
-- 
2.17.1




[Qemu-devel] [RFC v2 08/38] tcg: drop nargs from tcg_op_insert_{before, after}

2018-12-09 Thread Emilio G. Cota
It's unused.

Signed-off-by: Emilio G. Cota 
---
 tcg/tcg.h  |  4 ++--
 tcg/optimize.c |  4 ++--
 tcg/tcg.c  | 10 --
 3 files changed, 8 insertions(+), 10 deletions(-)

diff --git a/tcg/tcg.h b/tcg/tcg.h
index f4efbaa680..a745e926bb 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -1073,8 +1073,8 @@ void tcg_gen_callN(void *func, TCGTemp *ret, int nargs, TCGTemp **args);
 
 TCGOp *tcg_emit_op(TCGOpcode opc);
 void tcg_op_remove(TCGContext *s, TCGOp *op);
-TCGOp *tcg_op_insert_before(TCGContext *s, TCGOp *op, TCGOpcode opc, int narg);
-TCGOp *tcg_op_insert_after(TCGContext *s, TCGOp *op, TCGOpcode opc, int narg);
+TCGOp *tcg_op_insert_before(TCGContext *s, TCGOp *op, TCGOpcode opc);
+TCGOp *tcg_op_insert_after(TCGContext *s, TCGOp *op, TCGOpcode opc);
 
 void tcg_optimize(TCGContext *s);
 
diff --git a/tcg/optimize.c b/tcg/optimize.c
index 5dbe11c3c8..a2247f6de3 100644
--- a/tcg/optimize.c
+++ b/tcg/optimize.c
@@ -1249,7 +1249,7 @@ void tcg_optimize(TCGContext *s)
 uint64_t a = ((uint64_t)ah << 32) | al;
 uint64_t b = ((uint64_t)bh << 32) | bl;
 TCGArg rl, rh;
-TCGOp *op2 = tcg_op_insert_before(s, op, INDEX_op_movi_i32, 2);
+TCGOp *op2 = tcg_op_insert_before(s, op, INDEX_op_movi_i32);
 
 if (opc == INDEX_op_add2_i32) {
 a += b;
@@ -1271,7 +1271,7 @@ void tcg_optimize(TCGContext *s)
 uint32_t b = arg_info(op->args[3])->val;
 uint64_t r = (uint64_t)a * b;
 TCGArg rl, rh;
-TCGOp *op2 = tcg_op_insert_before(s, op, INDEX_op_movi_i32, 2);
+TCGOp *op2 = tcg_op_insert_before(s, op, INDEX_op_movi_i32);
 
 rl = op->args[0];
 rh = op->args[1];
diff --git a/tcg/tcg.c b/tcg/tcg.c
index e85133ef05..e87c662a18 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -2203,16 +2203,14 @@ TCGOp *tcg_emit_op(TCGOpcode opc)
 return op;
 }
 
-TCGOp *tcg_op_insert_before(TCGContext *s, TCGOp *old_op,
-TCGOpcode opc, int nargs)
+TCGOp *tcg_op_insert_before(TCGContext *s, TCGOp *old_op, TCGOpcode opc)
 {
 TCGOp *new_op = tcg_op_alloc(opc);
 QTAILQ_INSERT_BEFORE(old_op, new_op, link);
 return new_op;
 }
 
-TCGOp *tcg_op_insert_after(TCGContext *s, TCGOp *old_op,
-   TCGOpcode opc, int nargs)
+TCGOp *tcg_op_insert_after(TCGContext *s, TCGOp *old_op, TCGOpcode opc)
 {
 TCGOp *new_op = tcg_op_alloc(opc);
 QTAILQ_INSERT_AFTER(&s->ops, old_op, new_op, link);
@@ -2550,7 +2548,7 @@ static bool liveness_pass_2(TCGContext *s)
 TCGOpcode lopc = (arg_ts->type == TCG_TYPE_I32
   ? INDEX_op_ld_i32
   : INDEX_op_ld_i64);
-TCGOp *lop = tcg_op_insert_before(s, op, lopc, 3);
+TCGOp *lop = tcg_op_insert_before(s, op, lopc);
 
 lop->args[0] = temp_arg(dir_ts);
 lop->args[1] = temp_arg(arg_ts->mem_base);
@@ -2619,7 +2617,7 @@ static bool liveness_pass_2(TCGContext *s)
 TCGOpcode sopc = (arg_ts->type == TCG_TYPE_I32
   ? INDEX_op_st_i32
   : INDEX_op_st_i64);
-TCGOp *sop = tcg_op_insert_after(s, op, sopc, 3);
+TCGOp *sop = tcg_op_insert_after(s, op, sopc);
 
 sop->args[0] = temp_arg(dir_ts);
 sop->args[1] = temp_arg(arg_ts->mem_base);
-- 
2.17.1




[Qemu-devel] [RFC v2 21/38] translator: add translator_ld{ub, sw, uw, l, q}

2018-12-09 Thread Emilio G. Cota
Suggested-by: Richard Henderson 
Signed-off-by: Emilio G. Cota 
---
 include/exec/translator.h | 28 
 1 file changed, 28 insertions(+)

diff --git a/include/exec/translator.h b/include/exec/translator.h
index 71e7b2c347..39f6f514a7 100644
--- a/include/exec/translator.h
+++ b/include/exec/translator.h
@@ -19,7 +19,10 @@
  */
 
 
+#include "qemu/bswap.h"
 #include "exec/exec-all.h"
+#include "exec/cpu_ldst.h"
+#include "exec/plugin-gen.h"
 #include "tcg/tcg.h"
 
 
@@ -141,4 +144,29 @@ void translator_loop(const TranslatorOps *ops, DisasContextBase *db,
 
 void translator_loop_temp_check(DisasContextBase *db);
 
+#define GEN_TRANSLATOR_LD(fullname, name, type, swap_fn)\
+static inline type  \
+fullname ## _swap(CPUArchState *env, abi_ptr pc, bool do_swap)  \
+{   \
+type ret = cpu_ ## name ## _code(env, pc);  \
+\
+if (do_swap) {  \
+ret = swap_fn(ret); \
+}   \
+plugin_insn_append(&ret, sizeof(ret));  \
+return ret; \
+}   \
+\
+static inline type fullname(CPUArchState *env, abi_ptr pc)  \
+{   \
+return fullname ## _swap(env, pc, false);   \
+}
+
+GEN_TRANSLATOR_LD(translator_ldub, ldub, uint8_t, /* no swap needed */)
+GEN_TRANSLATOR_LD(translator_ldsw, ldsw, int16_t, bswap16)
+GEN_TRANSLATOR_LD(translator_lduw, lduw, uint16_t, bswap16)
+GEN_TRANSLATOR_LD(translator_ldl, ldl, uint32_t, bswap32)
+GEN_TRANSLATOR_LD(translator_ldq, ldq, uint64_t, bswap64)
+#undef GEN_TRANSLATOR_LD
+
 #endif  /* EXEC__TRANSLATOR_H */
-- 
2.17.1




[Qemu-devel] [RFC v2 05/38] plugin: add user-facing API

2018-12-09 Thread Emilio G. Cota
Add the API first to ease review.

Signed-off-by: Emilio G. Cota 
---
 include/qemu/qemu-plugin.h | 241 +
 1 file changed, 241 insertions(+)
 create mode 100644 include/qemu/qemu-plugin.h

diff --git a/include/qemu/qemu-plugin.h b/include/qemu/qemu-plugin.h
new file mode 100644
index 00..6c67211900
--- /dev/null
+++ b/include/qemu/qemu-plugin.h
@@ -0,0 +1,241 @@
+/*
+ * Copyright (C) 2017, Emilio G. Cota 
+ *
+ * License: GNU GPL, version 2 or later.
+ *   See the COPYING file in the top-level directory.
+ */
+#ifndef QEMU_PLUGIN_API_H
+#define QEMU_PLUGIN_API_H
+
+#include 
+#include 
+
+/*
+ * For best performance, build the plugin with -fvisibility=hidden so that
+ * QEMU_PLUGIN_LOCAL is implicit. Then, just mark qemu_plugin_install with
+ * QEMU_PLUGIN_EXPORT. For more info, see
+ *   https://gcc.gnu.org/wiki/Visibility
+ */
+#if defined _WIN32 || defined __CYGWIN__
+  #ifdef BUILDING_DLL
+#define QEMU_PLUGIN_EXPORT __declspec(dllexport)
+  #else
+#define QEMU_PLUGIN_EXPORT __declspec(dllimport)
+  #endif
+  #define QEMU_PLUGIN_LOCAL
+#else
+  #if __GNUC__ >= 4
+#define QEMU_PLUGIN_EXPORT __attribute__((visibility("default")))
+#define QEMU_PLUGIN_LOCAL  __attribute__((visibility("hidden")))
+  #else
+#define QEMU_PLUGIN_EXPORT
+#define QEMU_PLUGIN_LOCAL
+  #endif
+#endif
+
+typedef uint64_t qemu_plugin_id_t;
+
+/**
+ * qemu_plugin_install - Install a plugin
+ * @id: this plugin's opaque ID
+ * @argc: number of arguments
+ * @argv: array of arguments (@argc elements)
+ *
+ * All plugins must export this symbol.
+ *
+ * Note: @argv is freed after this function returns.
+ */
+QEMU_PLUGIN_EXPORT int qemu_plugin_install(qemu_plugin_id_t id, int argc,
+   char **argv);
+
+typedef void (*qemu_plugin_uninstall_cb_t)(qemu_plugin_id_t id);
+
+/**
+ * qemu_plugin_uninstall - Uninstall a plugin
+ * @id: this plugin's opaque ID
+ * @cb: callback to be called once the plugin has been removed
+ *
+ * Do NOT assume that the plugin has been uninstalled once this
+ * function returns. Plugins are uninstalled asynchronously,
+ * and therefore the given plugin might still receive callbacks
+ * from prior subscriptions _until_ @cb is called.
+ */
+void qemu_plugin_uninstall(qemu_plugin_id_t id, qemu_plugin_uninstall_cb_t cb);
+
+typedef void (*qemu_plugin_simple_cb_t)(qemu_plugin_id_t id);
+
+typedef void (*qemu_plugin_udata_cb_t)(qemu_plugin_id_t id, void *userdata);
+
+typedef void (*qemu_plugin_vcpu_simple_cb_t)(qemu_plugin_id_t id,
+ unsigned int vcpu_index);
+
+typedef void (*qemu_plugin_vcpu_udata_cb_t)(unsigned int vcpu_index,
+void *userdata);
+
+/**
+ * qemu_plugin_register_vcpu_init_cb - register a vCPU initialization callback
+ * @id: plugin ID
+ * @cb: callback function
+ *
+ * The @cb function is called every time a vCPU is initialized.
+ *
+ * See also: qemu_plugin_register_vcpu_exit_cb()
+ */
+void qemu_plugin_register_vcpu_init_cb(qemu_plugin_id_t id,
+   qemu_plugin_vcpu_simple_cb_t cb);
+
+/**
+ * qemu_plugin_register_vcpu_exit_cb - register a vCPU exit callback
+ * @id: plugin ID
+ * @cb: callback function
+ *
+ * The @cb function is called every time a vCPU exits.
+ *
+ * See also: qemu_plugin_register_vcpu_init_cb()
+ */
+void qemu_plugin_register_vcpu_exit_cb(qemu_plugin_id_t id,
+   qemu_plugin_vcpu_simple_cb_t cb);
+
+void qemu_plugin_register_vcpu_idle_cb(qemu_plugin_id_t id,
+   qemu_plugin_vcpu_simple_cb_t cb);
+
+void qemu_plugin_register_vcpu_resume_cb(qemu_plugin_id_t id,
+ qemu_plugin_vcpu_simple_cb_t cb);
+
+struct qemu_plugin_tb;
+struct qemu_plugin_insn;
+
+enum qemu_plugin_cb_flags {
+QEMU_PLUGIN_CB_NO_REGS, /* callback does not access the CPU's regs */
+QEMU_PLUGIN_CB_R_REGS,  /* callback reads the CPU's regs */
+QEMU_PLUGIN_CB_RW_REGS, /* callback reads and writes the CPU's regs */
+};
+
+enum qemu_plugin_mem_rw {
+QEMU_PLUGIN_MEM_R = 1,
+QEMU_PLUGIN_MEM_W,
+QEMU_PLUGIN_MEM_RW,
+};
+
+typedef void (*qemu_plugin_vcpu_tb_trans_cb_t)(qemu_plugin_id_t id,
+   unsigned int vcpu_index,
+   struct qemu_plugin_tb *tb);
+
+void qemu_plugin_register_vcpu_tb_trans_cb(qemu_plugin_id_t id,
+   qemu_plugin_vcpu_tb_trans_cb_t cb);
+
+/* can only call from tb_trans_cb callback */
+void qemu_plugin_register_vcpu_tb_exec_cb(struct qemu_plugin_tb *tb,
+  qemu_plugin_vcpu_udata_cb_t cb,
+  enum qemu_plugin_cb_flags flags,
+  void *userdata);
+
+enum qemu_plugin_op {
+QEMU_PLUGIN_INLINE_ADD_U64,

[Qemu-devel] [RFC v2 03/38] cpu: introduce cpu_in_exclusive_work_context()

2018-12-09 Thread Emilio G. Cota
Suggested-by: Alex Bennée 
Signed-off-by: Emilio G. Cota 
---
 include/qom/cpu.h | 13 +
 cpus-common.c |  2 ++
 2 files changed, 15 insertions(+)

diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index 772cc960fe..fab18089db 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -349,6 +349,7 @@ struct CPUState {
 bool thread_kicked;
 bool crash_occurred;
 bool exit_request;
+bool in_exclusive_work_context;
 uint32_t cflags_next_tb;
 /* updates protected by BQL */
 uint32_t interrupt_request;
@@ -913,6 +914,18 @@ void async_run_on_cpu_no_bql(CPUState *cpu, run_on_cpu_func func,
  */
 void async_safe_run_on_cpu(CPUState *cpu, run_on_cpu_func func, run_on_cpu_data data);
 
+/**
+ * cpu_in_exclusive_work_context()
+ * @cpu: The vCPU to check
+ *
+ * Returns true if @cpu is in an exclusive work context, i.e. it is running
+ * work that was previously queued via async_safe_run_on_cpu().
+ */
+static inline bool cpu_in_exclusive_work_context(const CPUState *cpu)
+{
+return cpu->in_exclusive_work_context;
+}
+
 /**
  * qemu_get_cpu:
  * @index: The CPUState@cpu_index value of the CPU to obtain.
diff --git a/cpus-common.c b/cpus-common.c
index 232cb12c46..d6ea42c80c 100644
--- a/cpus-common.c
+++ b/cpus-common.c
@@ -370,7 +370,9 @@ static void process_queued_cpu_work_locked(CPUState *cpu)
 qemu_mutex_unlock_iothread();
 }
 start_exclusive();
+cpu->in_exclusive_work_context = true;
 wi->func(cpu, wi->data);
+cpu->in_exclusive_work_context = false;
 end_exclusive();
 if (has_bql) {
 qemu_mutex_lock_iothread();
-- 
2.17.1




[Qemu-devel] [RFC v2 23/38] target/ppc: fetch code with translator_ld

2018-12-09 Thread Emilio G. Cota
Signed-off-by: Emilio G. Cota 
---
 target/ppc/translate.c | 8 +++-
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/target/ppc/translate.c b/target/ppc/translate.c
index 2d31b5f7a1..7a7c8a9a88 100644
--- a/target/ppc/translate.c
+++ b/target/ppc/translate.c
@@ -7555,11 +7555,9 @@ static void ppc_tr_translate_insn(DisasContextBase 
*dcbase, CPUState *cs)
 LOG_DISAS("nip=" TARGET_FMT_lx " super=%d ir=%d\n",
   ctx->base.pc_next, ctx->mem_idx, (int)msr_ir);
 
-if (unlikely(need_byteswap(ctx))) {
-ctx->opcode = bswap32(cpu_ldl_code(env, ctx->base.pc_next));
-} else {
-ctx->opcode = cpu_ldl_code(env, ctx->base.pc_next);
-}
+ctx->opcode = translator_ldl_swap(env, ctx->base.pc_next,
+  need_byteswap(ctx));
+
 LOG_DISAS("translate opcode %08x (%02x %02x %02x %02x) (%s)\n",
   ctx->opcode, opc1(ctx->opcode), opc2(ctx->opcode),
   opc3(ctx->opcode), opc4(ctx->opcode),
-- 
2.17.1




[Qemu-devel] [RFC v2 11/38] tcg: add tcg_gen_st_ptr

2018-12-09 Thread Emilio G. Cota
Will gain a user soon.

Signed-off-by: Emilio G. Cota 
---
 tcg/tcg-op.h | 5 +
 1 file changed, 5 insertions(+)

diff --git a/tcg/tcg-op.h b/tcg/tcg-op.h
index e2948b10a2..d3c79a6cb2 100644
--- a/tcg/tcg-op.h
+++ b/tcg/tcg-op.h
@@ -1219,6 +1219,11 @@ static inline void tcg_gen_ld_ptr(TCGv_ptr r, TCGv_ptr 
a, intptr_t o)
 glue(tcg_gen_ld_,PTR)((NAT)r, a, o);
 }
 
+static inline void tcg_gen_st_ptr(TCGv_ptr r, TCGv_ptr a, intptr_t o)
+{
+glue(tcg_gen_st_, PTR)((NAT)r, a, o);
+}
+
 static inline void tcg_gen_discard_ptr(TCGv_ptr a)
 {
 glue(tcg_gen_discard_,PTR)((NAT)a);
-- 
2.17.1




[Qemu-devel] [RFC v2 00/38] Plugin support

2018-12-09 Thread Emilio G. Cota
v1: https://lists.gnu.org/archive/html/qemu-devel/2018-10/msg05682.html

Changes since v1:

- Drop the 2-pass translation. Instead, empty instrumentation
  is injected during translation. If it turns out that this
  empty instrumentation is not needed, it is removed from
  the output. For this, add 2 TCG ops that mark the beginning
  and end of this empty instrumentation.

  This is cleaner than 2-pass translation, although it
  ends up being quite a bit more code, since we have
  to copy backend TCG ops, which is tedious. Performance-wise,
  it is at worst ~9% slower (~1.3% avg) than 2-pass for SPEC06int:

https://imgur.com/a/bUNox3H

  This is for an "empty" plugin (also added to tests/plugin/empty.c).
  That is, it subscribes to TB translation events and does nothing
  with them (i.e. no execution-time subscriptions).
  This means the empty instrumentation has to be injected and then
  removed, which is the worst-case scenario since all the injection
  work is wasted.

- Add QTAILQ_REMOVE_SEVERAL, which helps speed up the removal
  of empty instrumentation.

- Drop the "TCG runtime helper" support. We do not need it
  for empty instrumentation; we just replace the function pointer
  in the copied "call" op directly.
  + To detect when an instruction uses helpers, just strncmp
  the helper's name against "plugin_".

- Drop tb->plugin_mask. Instead, read cpu->plugin_mask from
  translator_loop.

- Drop the xxhash patches, since I submitted those as a separate
  series.

- Move a lot of plugin-related code from translator.c to
  plugin-gen.c, leaving only a few function calls in translator.c.

- Add support for only subscribing to an instruction's reads or
  writes. This is implemented via a flag added to the memory
  registration functions of the public API.

- Disentangle callbacks into separate arrays. Instead of just
  having 3 arrays (tb, insn and mem callbacks), have 5 arrays
  (tb, insn, virt. mem, hostaddr mem) of 2 arrays each (udata_cb
  and inline). This takes a bit more space per TB, but note that
  this struct is allocated only once in each TCGContext. OTOH,
  it makes the code much simpler. The union in struct dyn_cb
  remains, since for instrumenting memory accesses from helpers
  we still coalesce all types of memory callbacks into a single
  array.

- Add get_page_addr_code_hostp to get the host address of code
  from common code. Use this to export the host address of
  instructions (qemu_plugin_insn_haddr() added to the public API).

- Define TCGMemOp MO_HADDR. If set, the TCG backend copies on
  a TLB hit the corresponding host address to env->hostaddr.
  This allows us to only do this copy when needed.

- Use helpers for reading and setting env->hostaddr, so that
  we minimize the use of #ifdef CONFIG_PLUGIN.

- Only define env->hostaddr if CONFIG_PLUGIN.

- Drop the trailing 'S' in CONFIG_PLUGINS: it is now CONFIG_PLUGIN.

- Drop a few optional features from the RFC:
  + lockstep execution
  + plugin-chan + guest hooks
  + virtual clock control

- Define translator_ld* helpers and use them, as suggested
  by Alex and rth. All target ISAs that use translator_loop
  have been converted, except s390x and mips.

- Do not bloat TCGContext if !CONFIG_PLUGIN.

- Define TCGContext.plugin_tb as a pointer, instead of the
  whole struct.

- Test on 32-bit and 64-bit hosts (i386, x86_64, ppc64, aarch64).

- Add cpu_in_exclusive_work_context() and use it in tb_flush(),
  as suggested by Alex.

- configure fixes, including MacOSX builds thanks to Roman's help.

- Remove macros in atomic_template.h, as suggested by Alex.
  Turns out they aren't needed, inlines are enough.

- Fixed a bug by which cpu->plugin_mem was not being cleared
  if the instruction that used helpers was the last one in
  a TB (e.g. an exception). Fix it by adding checks (1) when
  returning from longjmp, and (2) when finishing a TB from
  tcg, so that we're sure to leave cpu->plugin_mem
  in a good state. (I noticed the bug by uninstalling a plugin
  that had registered memory callbacks, which resulted in
  callbacks to the uninstalled [dlclose'd] plugin.)

- Make sure tcg_ctx->plugin_mem_cb is always NULL after finishing
  the translation of a TB. This fixes a bug on uninstall.

- Do not abort when qemu_plugin_uninstall is called more than
  once. This is actually quite common, so just silently return
  on subsequent calls to uninstall.

- Drop the "qemu"/QEMU from some overly long function/macro
  names. This applies to qemu-internal files, of course.

- Keep the plugin's argument array in memory until the plugin
  is uninstalled, so that plugins don't have to strdup their
  arguments.

- Drop nargs argument from tcg_op_insert_before/after; it's
  unused.

- Rename plugin-api.h to qemu-plugin.h, which is the same name
  it gets in the final destination (after `make install').

- Add insn_inline function to the API.

- Add some sample plugins to tests/plugin.
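
As a rough illustration of the helper-name check mentioned in the list above, here is a minimal sketch. The function names are hypothetical and not taken from the series; the only detail taken from the text is that helper names are compared against the "plugin_" prefix with strncmp.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Prefix named in the cover letter; everything else here is illustrative. */
#define PLUGIN_HELPER_PREFIX "plugin_"

/* Returns true if @name starts with the plugin helper prefix. */
static bool is_plugin_helper(const char *name)
{
    assert(name != NULL);
    return strncmp(name, PLUGIN_HELPER_PREFIX,
                   strlen(PLUGIN_HELPER_PREFIX)) == 0;
}

/*
 * Hypothetical scan: given the names of the helpers called by one
 * instruction, report whether any of them is a non-plugin helper.
 */
static bool insn_calls_guest_helper(const char *const *names, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (!is_plugin_helper(names[i])) {
            return true;
        }
    }
    return false;
}
```

The prefix comparison keeps the check independent of any helper registry, at the cost of relying on a naming convention.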

You can fetch this series from:
  

[Qemu-devel] [RFC v2 20/38] plugin-gen: add plugin_insn_append

2018-12-09 Thread Emilio G. Cota
By adding it to plugin-gen's header file, we can export it as
an inline, since tcg.h is included in the header (we need tcg_ctx).

Signed-off-by: Emilio G. Cota 
---
 include/exec/plugin-gen.h | 27 ++-
 accel/tcg/plugin-gen.c| 10 +-
 2 files changed, 27 insertions(+), 10 deletions(-)

diff --git a/include/exec/plugin-gen.h b/include/exec/plugin-gen.h
index 449ea16034..b09c16b720 100644
--- a/include/exec/plugin-gen.h
+++ b/include/exec/plugin-gen.h
@@ -15,15 +15,6 @@
 #include "qemu/plugin.h"
 #include "tcg/tcg.h"
 
-/* used by plugin_callback_start and plugin_callback_end TCG ops */
-enum plugin_gen_from {
-PLUGIN_GEN_FROM_TB,
-PLUGIN_GEN_FROM_INSN,
-PLUGIN_GEN_FROM_MEM,
-PLUGIN_GEN_AFTER_INSN,
-PLUGIN_GEN_N_FROMS,
-};
-
 struct DisasContextBase;
 
 #ifdef CONFIG_PLUGIN
@@ -36,6 +27,21 @@ void plugin_gen_insn_end(void);
 void plugin_gen_disable_mem_helpers(void);
 void plugin_gen_empty_mem_callback(TCGv addr, uint8_t info);
 
+static inline void plugin_insn_append(const void *from, size_t size)
+{
+struct qemu_plugin_insn *insn = tcg_ctx->plugin_insn;
+
+if (insn == NULL) {
+return;
+}
+if (unlikely(insn->size + size > insn->capacity)) {
+insn->data = g_realloc(insn->data, insn->size + size);
+insn->capacity = insn->size + size;
+}
+memcpy(insn->data + insn->size, from, size);
+insn->size += size;
+}
+
 #else /* !CONFIG_PLUGIN */
 
 static inline
@@ -60,6 +66,9 @@ static inline void plugin_gen_disable_mem_helpers(void)
 static inline void plugin_gen_empty_mem_callback(TCGv addr, uint8_t info)
 { }
 
+static inline void plugin_insn_append(const void *from, size_t size)
+{ }
+
 #endif /* CONFIG_PLUGIN */
 
 #endif /* QEMU_PLUGIN_GEN_H */
diff --git a/accel/tcg/plugin-gen.c b/accel/tcg/plugin-gen.c
index 06ec23e9f5..e6dd79e4d8 100644
--- a/accel/tcg/plugin-gen.c
+++ b/accel/tcg/plugin-gen.c
@@ -60,9 +60,17 @@
 /*
  * plugin_cb_start TCG op args[]:
  * 0: enum plugin_gen_from
- * 1: enum plugin_gen_cb (defined below)
+ * 1: enum plugin_gen_cb
  * 2: set to 1 if it's a mem callback and it's a write, 0 otherwise.
  */
+enum plugin_gen_from {
+PLUGIN_GEN_FROM_TB,
+PLUGIN_GEN_FROM_INSN,
+PLUGIN_GEN_FROM_MEM,
+PLUGIN_GEN_AFTER_INSN,
+PLUGIN_GEN_N_FROMS,
+};
+
 enum plugin_gen_cb {
 PLUGIN_GEN_CB_UDATA,
 PLUGIN_GEN_CB_INLINE,
-- 
2.17.1




[Qemu-devel] [RFC v2 12/38] tcg: add MO_HADDR to TCGMemOp

2018-12-09 Thread Emilio G. Cota
We will use this from plugins to mark mem accesses so that
we can later obtain their host address.
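
A small sketch of how the new flag sits next to the alignment bits. MO_HSHIFT and MO_HADDR are as defined in the patch below; the values of MO_ASHIFT and MO_AMASK are assumptions about tcg.h at the time, and the three memop_* helpers are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

enum {
    MO_ASHIFT = 4,               /* assumption: first bit of the alignment field */
    MO_AMASK  = 7 << MO_ASHIFT,  /* assumption: 3-bit alignment field */
    MO_HSHIFT = MO_ASHIFT + 3,   /* as in the patch */
    MO_HADDR  = 1 << MO_HSHIFT,  /* as in the patch */
};

/* The new flag must not collide with the alignment field. */
static_assert((MO_AMASK & MO_HADDR) == 0, "MO_HADDR overlaps MO_AMASK");

/* Hypothetical helper: mark a memop so its host address gets recorded. */
static inline unsigned memop_request_haddr(unsigned memop)
{
    return memop | MO_HADDR;
}

/* Hypothetical helper: does this memop ask for the host address? */
static inline bool memop_wants_haddr(unsigned memop)
{
    return (memop & MO_HADDR) != 0;
}

/* Hypothetical helper: drop the flag, leaving the other bits untouched. */
static inline unsigned memop_strip_haddr(unsigned memop)
{
    return memop & ~(unsigned)MO_HADDR;
}
```

Under these assumptions MO_HADDR is the bit just above the alignment field, so memops that do not set it are unaffected.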

Signed-off-by: Emilio G. Cota 
---
 tcg/tcg.h | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/tcg/tcg.h b/tcg/tcg.h
index 6fd525023b..a376f83ab6 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -359,6 +359,13 @@ typedef enum TCGMemOp {
 MO_ALIGN_32 = 5 << MO_ASHIFT,
 MO_ALIGN_64 = 6 << MO_ASHIFT,
 
+/*
+ * SoftMMU-only: if set, the TCG backend puts the corresponding host 
address
+ * in CPUArchState.hostaddr.
+ */
+MO_HSHIFT = MO_ASHIFT + 3,
+MO_HADDR = 1 << MO_HSHIFT,
+
 /* Combinations of the above, for ease of use.  */
 MO_UB= MO_8,
 MO_UW= MO_16,
-- 
2.17.1




[Qemu-devel] [RFC v2 06/38] plugin: add core code

2018-12-09 Thread Emilio G. Cota
Some design requirements/goals:

- Make sure we cannot deadlock, particularly under MTTCG. For this,
  we acquire a lock when called from plugin code, and keep
  RCU lists of callbacks so that we do not have to hold the lock
  when calling the callbacks. This is also for performance, since
  some callbacks (e.g. memory access callbacks) might be called very
  frequently.
  * A consequence of this is that we keep our own list of CPUs, so that
we do not have to worry about locking order wrt cpu_list_lock.
  * Use a recursive lock, since we can get registration calls from
callbacks.

- Support as many plugins as the user wants (e.g. -plugin foo -plugin bar),
  just like other tools (e.g. dynamorio) do.

- Support the uninstallation of a plugin at any time (e.g. from plugin
  callbacks).

- Prevent malicious plugins from abusing the API. This is done by:
  * Adding a qemu_plugin_id_t that all calls need to use. This is a unique
id per plugin.
  * Hiding CPUState * under cpu_index. Plugin code can keep per-vcpu
data by using said index (say to index an array).
  * Only exporting the relevant qemu_plugin symbols to the plugins by
passing --dynamic-list to the linker (when supported), instead of
exporting all symbols with -rdynamic.

- Performance: registering/unregistering callbacks is "slow", since
  it takes a lock. But this is very infrequent; we want performance when
  calling (or not calling) callbacks, not when registering them.
  Using RCU is great for this. The only difficulty is when uninstalling
  a plugin, where some callbacks might still be called after the
  uninstall returns. An alternative would be to use r/w locks, but that
  would complicate code quite a bit for very little gain. In any case,
  I suspect most plugins will just run until QEMU exits.

Some design decisions:
- I considered registering callbacks per-vcpu, but really I don't see the
  use case for it (would complicate the API and 99% of plugins won't care, so
  I'd rather make that 1% slower by letting them discard unwanted callbacks).

- Last, 'plugin' vs. 'instrumentation' naming: I think instrumentation is a
  subset of the functionality that plugins can provide. IOW, in the future
  not all plugins might be considered instrumentation, even if currently
  my goal is to use them for that purpose.
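
To make the "per-vcpu data indexed by cpu_index" point above concrete, here is a plugin-side sketch. It does not include the real plugin header: the callback shape (vCPU index plus user data) is assumed, and MAX_VCPUS and all function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_VCPUS 64   /* hypothetical bound chosen for this sketch */

/* One slot per vCPU; each vCPU only ever touches its own slot. */
static uint64_t exec_count[MAX_VCPUS];

/* Assumed callback shape: the plugin sees an index, never a CPUState *. */
static void vcpu_tb_exec(unsigned int vcpu_index, void *userdata)
{
    (void)userdata;
    assert(vcpu_index < MAX_VCPUS);
    exec_count[vcpu_index]++;
}

/* Sum the per-vCPU counters, e.g. when reporting at exit. */
static uint64_t total_exec_count(void)
{
    uint64_t total = 0;
    size_t i;

    for (i = 0; i < MAX_VCPUS; i++) {
        total += exec_count[i];
    }
    return total;
}
```

Since every vCPU writes only its own slot, the hot path needs no locking; the sum is only meaningful once the vCPUs have stopped updating their counters.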

Signed-off-by: Emilio G. Cota 
---
 Makefile  |7 +-
 Makefile.target   |2 +
 include/qemu/plugin.h |  253 ++
 include/qom/cpu.h |6 +
 plugin.c  | 1030 +
 5 files changed, 1297 insertions(+), 1 deletion(-)
 create mode 100644 include/qemu/plugin.h
 create mode 100644 plugin.c

diff --git a/Makefile b/Makefile
index f2947186a4..9cb3076d84 100644
--- a/Makefile
+++ b/Makefile
@@ -862,8 +862,10 @@ ifneq (,$(findstring qemu-ga,$(TOOLS)))
 endif
 endif
 
+install-includedir:
+   $(INSTALL_DIR) "$(DESTDIR)$(includedir)"
 
-install: all $(if $(BUILD_DOCS),install-doc) install-datadir 
install-localstatedir
+install: all $(if $(BUILD_DOCS),install-doc) install-datadir 
install-localstatedir install-includedir
 ifneq ($(TOOLS),)
$(call install-prog,$(subst 
qemu-ga,qemu-ga$(EXESUF),$(TOOLS)),$(DESTDIR)$(bindir))
 endif
@@ -885,6 +887,9 @@ ifneq ($(BLOBS),)
 endif
 ifdef CONFIG_GTK
$(MAKE) -C po $@
+endif
+ifeq ($(CONFIG_PLUGIN),y)
+   $(INSTALL_DATA) $(SRC_PATH)/include/qemu/qemu-plugin.h 
"$(DESTDIR)$(includedir)/qemu-plugin.h"
 endif
$(INSTALL_DIR) "$(DESTDIR)$(qemu_datadir)/keymaps"
set -e; for x in $(KEYMAPS); do \
diff --git a/Makefile.target b/Makefile.target
index 4d56298bbf..75637c285c 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -107,6 +107,8 @@ obj-y += target/$(TARGET_BASE_ARCH)/
 obj-y += disas.o
 obj-$(call notempty,$(TARGET_XML_FILES)) += gdbstub-xml.o
 
+obj-$(CONFIG_PLUGIN) += plugin.o
+
 #
 # Linux user emulator target
 
diff --git a/include/qemu/plugin.h b/include/qemu/plugin.h
new file mode 100644
index 00..e483d0eaa8
--- /dev/null
+++ b/include/qemu/plugin.h
@@ -0,0 +1,253 @@
+/*
+ * Copyright (C) 2017, Emilio G. Cota 
+ *
+ * License: GNU GPL, version 2 or later.
+ *   See the COPYING file in the top-level directory.
+ */
+#ifndef QEMU_PLUGIN_H
+#define QEMU_PLUGIN_H
+
+#include "qemu/config-file.h"
+#include "qemu/qemu-plugin.h"
+#include "qemu/error-report.h"
+#include "qemu/queue.h"
+#include "qemu/option.h"
+
+/*
+ * Option parsing/processing.
+ * Note that we can load an arbitrary number of plugins.
+ */
+struct qemu_plugin_desc;
+QTAILQ_HEAD(qemu_plugin_list, qemu_plugin_desc);
+
+#ifdef CONFIG_PLUGIN
+extern QemuOptsList qemu_plugin_opts;
+
+static inline void qemu_plugin_add_opts(void)
+{
+qemu_add_opts(&qemu_plugin_opts);
+}
+
+void qemu_plugin_opt_parse(const char *optarg, struct qemu_plugin_list *head);
+int qemu_plugin_load_list(struct qemu_plugin_list *head);
+#else /* !CONFIG_PLUGIN */

[Qemu-devel] [RFC v2 09/38] cputlb: introduce get_page_addr_code_hostp

2018-12-09 Thread Emilio G. Cota
This will be used by plugins to get the host address
of instructions.
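
The pattern used here (an optional out-parameter for the host pointer, with the old function kept as a thin wrapper) can be sketched on a toy lookup. None of the toy_* names or types are QEMU's; they are stand-ins so the example is self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t toy_page_addr_t;   /* stand-in for tb_page_addr_t */

static uint8_t toy_ram[256];         /* stand-in for guest RAM */

/*
 * Look up @addr. If @hostp is non-NULL, also report the host address,
 * or NULL when the address is not backed by RAM.
 */
static toy_page_addr_t toy_lookup_hostp(size_t addr, void **hostp)
{
    if (addr >= sizeof(toy_ram)) {
        if (hostp) {
            *hostp = NULL;
        }
        return (toy_page_addr_t)-1;
    }
    if (hostp) {
        *hostp = &toy_ram[addr];
    }
    return (toy_page_addr_t)addr;
}

/* Existing callers keep the old signature and pass NULL. */
static toy_page_addr_t toy_lookup(size_t addr)
{
    return toy_lookup_hostp(addr, NULL);
}

/* Usage: read one byte through the host pointer. */
static uint8_t toy_fetch_byte(size_t addr)
{
    void *host;

    toy_lookup_hostp(addr, &host);
    assert(host != NULL);   /* caller must pass a RAM-backed address */
    return *(uint8_t *)host;
}
```

Keeping the wrapper means no existing call site has to change, and callers that do not need the host pointer pay only for a NULL check.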

Signed-off-by: Emilio G. Cota 
---
 include/exec/exec-all.h | 13 +
 accel/tcg/cputlb.c  | 14 +-
 2 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index 815e5b1e83..afcc01e0e3 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -22,6 +22,7 @@
 
 #include "qemu-common.h"
 #include "exec/tb-context.h"
+#include "exec/cpu_ldst.h"
 #include "sysemu/cpus.h"
 
 /* allow to see translation results - the slowdown should be negligible, so we 
leave it */
@@ -487,12 +488,24 @@ static inline tb_page_addr_t 
get_page_addr_code(CPUArchState *env1, target_ulong
 {
 return addr;
 }
+
+static inline tb_page_addr_t get_page_addr_code_hostp(CPUArchState *env1,
+  target_ulong addr,
+  void **hostp)
+{
+if (hostp) {
+*hostp = g2h(addr);
+}
+return addr;
+}
 #else
 static inline void mmap_lock(void) {}
 static inline void mmap_unlock(void) {}
 
 /* cputlb.c */
 tb_page_addr_t get_page_addr_code(CPUArchState *env1, target_ulong addr);
+tb_page_addr_t get_page_addr_code_hostp(CPUArchState *env1, target_ulong addr,
+void **hostp);
 
 void tlb_reset_dirty(CPUState *cpu, ram_addr_t start1, ram_addr_t length);
 void tlb_set_dirty(CPUState *cpu, target_ulong vaddr);
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index e3582f2f1d..5c61908084 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -1069,7 +1069,8 @@ static bool victim_tlb_hit(CPUArchState *env, size_t 
mmu_idx, size_t index,
  * is actually a ram_addr_t (in system mode; the user mode emulation
  * version of this function returns a guest virtual address).
  */
-tb_page_addr_t get_page_addr_code(CPUArchState *env, target_ulong addr)
+tb_page_addr_t get_page_addr_code_hostp(CPUArchState *env, target_ulong addr,
+void **hostp)
 {
 uintptr_t mmu_idx = cpu_mmu_index(env, true);
 uintptr_t index = tlb_index(env, mmu_idx, addr);
@@ -1092,13 +1093,24 @@ tb_page_addr_t get_page_addr_code(CPUArchState *env, 
target_ulong addr)
  *than a target page, so we must redo the MMU check every insn
  *  - TLB_MMIO: region is not backed by RAM
  */
+if (hostp) {
+*hostp = NULL;
+}
 return -1;
 }
 
 p = (void *)((uintptr_t)addr + entry->addend);
+if (hostp) {
+*hostp = p;
+}
 return qemu_ram_addr_from_host_nofail(p);
 }
 
+tb_page_addr_t get_page_addr_code(CPUArchState *env, target_ulong addr)
+{
+return get_page_addr_code_hostp(env, addr, NULL);
+}
+
 /* Probe for whether the specified guest write access is permitted.
  * If it is not permitted then an exception will be taken in the same
  * way as if this were a real write access (and we will not return).
-- 
2.17.1




[Qemu-devel] [RFC v2 02/38] tcg/README: fix typo s/afterwise/afterwards/

2018-12-09 Thread Emilio G. Cota
Afterwise is "wise after the fact", as in "hindsight".
Here we meant "afterwards" (as in "subsequently"). Fix it.

Reviewed-by: Alex Bennée 
Signed-off-by: Emilio G. Cota 
---
 tcg/README | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tcg/README b/tcg/README
index d22ee084b8..3fa8a7059f 100644
--- a/tcg/README
+++ b/tcg/README
@@ -101,7 +101,7 @@ This can be overridden using the following function 
modifiers:
   canonical locations before calling the helper.
 - TCG_CALL_NO_WRITE_GLOBALS means that the helper does not modify any globals.
   They will only be saved to their canonical location before calling helpers,
-  but they won't be reloaded afterwise.
+  but they won't be reloaded afterwards.
 - TCG_CALL_NO_SIDE_EFFECTS means that the call to the function is removed if
   the return value is not used.
 
-- 
2.17.1




[Qemu-devel] [RFC v2 01/38] trace: expand mem_info:size_shift to 3 bits

2018-12-09 Thread Emilio G. Cota
This will allow us to trace 16B-long memory accesses.

While at it, add some defines for the mem_info bits and simplify
trace_mem_get_info by making it a wrapper around trace_mem_build_info.
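
A worked example of why the third bit is needed: with a 2-bit size_shift the largest encodable access is 1 << 3 = 8 bytes, while a 16-byte access needs size_shift = 4. The sketch below packs the fields by hand; the macro and function names are hypothetical, and the bit positions assume the fields of struct mem_info are laid out starting from the least significant bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout, least significant bit first. */
#define MEM_INFO_SIZE_SHIFT_MASK 0x7       /* bits 0-2: log2 of access size */
#define MEM_INFO_SIGN_EXTEND     (1 << 3)  /* bit 3: sign-extended */
#define MEM_INFO_BIG_ENDIAN      (1 << 4)  /* bit 4: 0 little, 1 big */
#define MEM_INFO_STORE           (1 << 5)  /* bit 5: store operation */

/* Pack the access description into a single byte. */
static uint8_t mem_build_info(unsigned size_shift, bool sign_extend,
                              bool big_endian, bool store)
{
    uint8_t info;

    assert(size_shift <= MEM_INFO_SIZE_SHIFT_MASK);
    info = size_shift & MEM_INFO_SIZE_SHIFT_MASK;
    if (sign_extend) {
        info |= MEM_INFO_SIGN_EXTEND;
    }
    if (big_endian) {
        info |= MEM_INFO_BIG_ENDIAN;
    }
    if (store) {
        info |= MEM_INFO_STORE;
    }
    return info;
}

/* Access size in bytes, i.e. 1 << size_shift. */
static unsigned mem_info_bytes(uint8_t info)
{
    return 1u << (info & MEM_INFO_SIZE_SHIFT_MASK);
}
```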

Signed-off-by: Emilio G. Cota 
---
 trace-events | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/trace-events b/trace-events
index 4fd2cb4b97..9d65d472d2 100644
--- a/trace-events
+++ b/trace-events
@@ -151,7 +151,7 @@ vcpu guest_cpu_reset(void)
 # Access information can be parsed as:
 #
 # struct mem_info {
-# uint8_t size_shift : 2; /* interpreted as "1 << size_shift" bytes */
+# uint8_t size_shift : 3; /* interpreted as "1 << size_shift" bytes */
 # bool    sign_extend: 1; /* sign-extended */
 # uint8_t endianness : 1; /* 0: little, 1: big */
 # bool    store      : 1; /* wheter it's a store operation */
-- 
2.17.1




[Qemu-devel] [RFC v2 07/38] queue: add QTAILQ_REMOVE_SEVERAL

2018-12-09 Thread Emilio G. Cota
This is faster than removing elements one by one.

Will gain a user soon.
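
For readers who want to see the pointer updates in isolation, here is a self-contained sketch of the same range removal on a hand-rolled tail queue. The struct and function names are hypothetical; the layout (a "prev" field holding the address of the previous element's next pointer) is meant to mirror how QTAILQ links its elements.

```c
#include <assert.h>
#include <stddef.h>

struct node {
    int val;
    struct node *next;
    struct node **prev;   /* address of the previous element's next pointer */
};

struct list {
    struct node *first;
    struct node **last;   /* address of the last element's next pointer */
};

static void list_init(struct list *l)
{
    l->first = NULL;
    l->last = &l->first;
}

static void list_append(struct list *l, struct node *n)
{
    n->next = NULL;
    n->prev = l->last;
    *l->last = n;
    l->last = &n->next;
}

/* Unlink @left, @right and everything in between with three stores. */
static void list_remove_several(struct list *l, struct node *left,
                                struct node *right)
{
    assert(left->prev != NULL);
    if (right->next != NULL) {
        right->next->prev = left->prev;
    } else {
        l->last = left->prev;
    }
    *left->prev = right->next;
}

/* Build 1..5, remove 2..4 in one step, return the sum of what remains. */
static int demo_remove_middle(void)
{
    struct list l;
    struct node n[5];
    struct node *it;
    int i, sum = 0;

    list_init(&l);
    for (i = 0; i < 5; i++) {
        n[i].val = i + 1;
        list_append(&l, &n[i]);
    }
    list_remove_several(&l, &n[1], &n[3]);
    for (it = l.first; it != NULL; it = it->next) {
        sum += it->val;
    }
    return sum;
}
```

The cost is constant regardless of how many elements lie between the two ends, which is where the gain over removing elements one by one comes from. As in the macro, the removed elements keep their stale links.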

Signed-off-by: Emilio G. Cota 
---
 include/qemu/queue.h | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/include/qemu/queue.h b/include/qemu/queue.h
index ac418efc43..0283c2dd7d 100644
--- a/include/qemu/queue.h
+++ b/include/qemu/queue.h
@@ -419,6 +419,16 @@ struct {   
 \
 (elm)->field.tqe_prev = NULL;   \
 } while (/*CONSTCOND*/0)
 
+/* remove @left, @right and all elements in between from @head */
+#define QTAILQ_REMOVE_SEVERAL(head, left, right, field) do {\
+if (((right)->field.tqe_next) != NULL)  \
+(right)->field.tqe_next->field.tqe_prev =   \
+(left)->field.tqe_prev; \
+else\
+(head)->tqh_last = (left)->field.tqe_prev;  \
+*(left)->field.tqe_prev = (right)->field.tqe_next;  \
+} while (/*CONSTCOND*/0)
+
 #define QTAILQ_FOREACH(var, head, field)\
 for ((var) = ((head)->tqh_first);   \
 (var);  \
-- 
2.17.1



